Prediction of potential genes in microbial genomes
Time: Wed May 11 23:51:10 2011
Seq name: gi|226332319|gb|ACIC01000001.1| Bacteroides sp. 1_1_6 cont1.1, whole genome shotgun sequence
Length of sequence - 41382 bp
Number of predicted genes - 39, with homology - 39
Number of transcription units - 15, operones - 9 average op.length - 3.7

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Tu 1 . + CDS 163 - 399 201 ## BDI_3881 glycoside hydrolase family protein
    + Term 407 - 459 5.2
2 2 Op 1 . - CDS 476 - 1399 781 ## gi|253567576|ref|ZP_04844987.1| conserved hypothetical protein
3 2 Op 2 . - CDS 1457 - 1747 276 ## BVU_1580 hypothetical protein
    - Prom 1912 - 1971 1.8
    - Term 1762 - 1795 1.9
4 3 Op 1 . - CDS 1991 - 2509 319 ## BT_2586 hypothetical protein
5 3 Op 2 . - CDS 2521 - 3231 455 ## BDI_3871 hypothetical protein
6 3 Op 3 . - CDS 3264 - 4286 796 ## BVU_1584 hypothetical protein
7 3 Op 4 . - CDS 4299 - 4517 300 ## gi|253567581|ref|ZP_04844992.1| conserved hypothetical protein
8 3 Op 5 . - CDS 4510 - 5127 532 ## gi|253567582|ref|ZP_04844993.1| conserved hypothetical protein
9 3 Op 6 . - CDS 5124 - 5717 356 ## BDI_3867 hypothetical protein
10 3 Op 7 . - CDS 5755 - 6684 525 ## gi|253567584|ref|ZP_04844995.1| conserved hypothetical protein
    - Term 6728 - 6766 6.2
11 4 Op 1 . - CDS 6775 - 7062 126 ## gi|253567586|ref|ZP_04844997.1| conserved hypothetical protein
12 4 Op 2 . - CDS 7080 - 7448 237 ## gi|253567587|ref|ZP_04844998.1| predicted protein
13 4 Op 3 . - CDS 7465 - 7779 286 ## gi|253567588|ref|ZP_04844999.1| conserved hypothetical protein
14 4 Op 4 . - CDS 7793 - 8290 482 ## BVU_2473 putative anti-restriction protein
    - Prom 8321 - 8380 2.2
15 5 Tu 1 . - CDS 8516 - 11518 2328 ## COG0827 Adenine-specific DNA methylase
    - Term 11538 - 11577 8.2
16 6 Op 1 . - CDS 11598 - 12539 758 ## gi|253567591|ref|ZP_04845002.1| conserved hypothetical protein
17 6 Op 2 . - CDS 12559 - 13377 643 ## BDI_3903 hypothetical protein
    + Prom 13325 - 13384 2.5
18 7 Tu 1 . + CDS 13446 - 13628 75 ## gi|253567593|ref|ZP_04845004.1| predicted protein
    + Prom 13738 - 13797 4.3
19 8 Op 1 . + CDS 13861 - 14160 455 ## BT_0185 hypothetical protein
20 8 Op 2 . + CDS 14176 - 16230 2196 ## COG4232 Thiol:disulfide interchange protein
    + Term 16254 - 16299 14.4
    - Term 16245 - 16284 6.0
21 9 Tu 1 . - CDS 16326 - 17030 685 ## COG1741 Pirin-related protein
    - Prom 17158 - 17217 6.2
    + Prom 17134 - 17193 10.4
22 10 Op 1 6/0.000 + CDS 17223 - 17825 542 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
23 10 Op 2 . + CDS 17913 - 18896 783 ## COG3712 Fe2+-dicitrate sensor, membrane component
    + Prom 19024 - 19083 1.5
24 11 Op 1 . + CDS 19109 - 22549 3128 ## BT_0190 hypothetical protein
25 11 Op 2 . + CDS 22573 - 23964 1157 ## BT_0191 hypothetical protein
26 11 Op 3 . + CDS 23992 - 25959 1297 ## COG1520 FOG: WD40-like repeat
27 11 Op 4 . + CDS 25966 - 26847 539 ## COG1082 Sugar phosphate isomerases/epimerases
28 11 Op 5 . + CDS 26849 - 27907 806 ## BT_0194 hypothetical protein
29 11 Op 6 6/0.000 + CDS 27928 - 28827 756 ## COG0584 Glycerophosphoryl diester phosphodiesterase
30 11 Op 7 . + CDS 28840 - 30186 931 ## COG2271 Sugar phosphate permease
    + Term 30263 - 30321 3.4
    + Prom 30286 - 30345 6.4
31 12 Tu 1 . + CDS 30391 - 31044 446 ## BT_0197 hypothetical protein
    + Term 31198 - 31237 3.3
    + Prom 31132 - 31191 7.4
32 13 Op 1 . + CDS 31354 - 33258 1133 ## BT_0198 hypothetical protein
33 13 Op 2 . + CDS 33334 - 33825 360 ## BT_0199 hypothetical protein
    + Term 34000 - 34039 8.4
    + Prom 34000 - 34059 10.4
34 14 Op 1 18/0.000 + CDS 34079 - 34930 908 ## COG0040 ATP phosphoribosyltransferase
35 14 Op 2 19/0.000 + CDS 34943 - 36226 1041 ## COG0141 Histidinol dehydrogenase
36 14 Op 3 13/0.000 + CDS 36223 - 37263 874 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase
37 14 Op 4 . + CDS 37267 - 38391 1015 ## COG0131 Imidazoleglycerol-phosphate dehydratase
38 14 Op 5 .
+ CDS 38431 - 38991 274 ## COG2365 Protein tyrosine/serine phosphatase + Term 39009 - 39050 -1.0 39 15 Tu 1 . - CDS 39106 - 41031 1376 ## COG0171 NAD synthase Predicted protein(s) >gi|226332319|gb|ACIC01000001.1| GENE 1 163 - 399 201 78 aa, chain + ## HITS:1 COG:no KEGG:BDI_3881 NR:ns ## KEGG: BDI_3881 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: P.distasonis # Pathway: not_defined # 4 76 85 157 159 111 69.0 7e-24 MCSRFGKDALLVATLSYNVGYYRVVGYGKIPKSRLIQKLEAGDRDIYNEYVSFRCYKGKV VPSIERRRKVEYMLLFKK >gi|226332319|gb|ACIC01000001.1| GENE 2 476 - 1399 781 307 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253567576|ref|ZP_04844987.1| ## NR: gi|253567576|ref|ZP_04844987.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 307 1 307 307 612 100.0 1e-174 MANLAVLERQQQFDFQQNGIEVMDFETLQRTYKENDIYNNPVQGIYHYQVIQRMMDICEK HNLDYEVEEIFAAQNRNKTQPGVSILPQVEQIHGEKAVEAHILRRIFTTIRIKDWETDEL TTTLVVAYHQDGVQAAIGPCVKICHNQCILSPERSVCNYGKKKVTTEELFDTVDGWMANF EVNMNEDIERIQRLKRRVIPLEEIYMYIGLLTALRVSHDSADKNLSSTVETYPLNQGQIS VFTEEVLKLVMTKGQITAWDLYNVATEIYKPGKTDFPALIPQNGAMAELLLSRLPEEAEI MDAIPVG >gi|226332319|gb|ACIC01000001.1| GENE 3 1457 - 1747 276 96 aa, chain - ## HITS:1 COG:no KEGG:BVU_1580 NR:ns ## KEGG: BVU_1580 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 96 1 96 96 187 91.0 1e-46 MKILNEEHFENVKRYAESIGDTSFQKCLDRLESWEKNPDHPNEISLYYDHAPYSFGFTQR YPDGRTGIVGGLLYHGIPDRSFAVTLEPFHGWQIHT >gi|226332319|gb|ACIC01000001.1| GENE 4 1991 - 2509 319 172 aa, chain - ## HITS:1 COG:no KEGG:BT_2586 NR:ns ## KEGG: BT_2586 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 171 1 166 167 131 46.0 1e-29 MAQERMDDWMEYARELARAERELRVERWVFISIECKDEAGNPIRLHSYDLPRELHKRYRW VVRWREARLQCLYPKRQINTYYSYYDRRTGLRTNFNSSLSRLSAAKAQISIAERKEREYI QYQRTNNLFFDENTDEQLVGFREKLRMKKEKYTALEHEIRSEMESMQKLNRI >gi|226332319|gb|ACIC01000001.1| GENE 5 2521 - 3231 455 236 aa, chain - ## HITS:1 COG:no KEGG:BDI_3871 NR:ns ## 
KEGG: BDI_3871 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 230 1 230 237 285 66.0 7e-76 MKAKVLKYKFDGNTVVAPYMELEAYAENIYLSLSDKNEYGNENYDYFHVVCKVENIYFSC GQYSREMLGREEQKEKLVKYCKNWIANMLQDAENGNHVSLLSIRVFEELGLDTVPLLQAR EAYRKKQEQRRQEQKEQEEEKRRLEEAKWQQELDEEKQKFLNGEYIPANMFLEITKRDGF EIHIRTKGTLNRHVCGLNKSGSIRFYKKRGCRTPDFSGCHKAIAAYLTFLETITES >gi|226332319|gb|ACIC01000001.1| GENE 6 3264 - 4286 796 340 aa, chain - ## HITS:1 COG:no KEGG:BVU_1584 NR:ns ## KEGG: BVU_1584 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 336 1 334 339 511 75.0 1e-143 MTYFQNIHSLADLKKEYRRLALQHHPDKGGDTAVMQQVNVEFEKLYDVWKDSTNMSADTT GYEYDYSGATAKEYAEYVYNEYRWKGRNYKGQHAPEIVELVRTWLKETYPRYRFSVRREN YNSIYIKLMSADFEAFTKESGKVQDTINHYNIEWNPDLTDRAKEVMMNVCDFVMSYNFDD SNAMTDYFHTNFYLTLGIGSYRKPYKVELPKLACKGKEKQEVFKHPEGTAHKALRQALGT ARFGFIEHRRHIGEMILGEDHYGSQGEHYFWPKEYSSAKTAQKRIDKLEQAGMRCRLTGY NGGYIQFLGYTPETKTLLEQEREEVIIAHQAWQARRIQTN >gi|226332319|gb|ACIC01000001.1| GENE 7 4299 - 4517 300 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253567581|ref|ZP_04844992.1| ## NR: gi|253567581|ref|ZP_04844992.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 72 28 99 99 133 100.0 5e-30 MNNMGENIIDLICNSCSCDKTEAQEYLDSELQYLHELQDVDDLREGDIELACSNLGLDLD YQEYFINRLAGA >gi|226332319|gb|ACIC01000001.1| GENE 8 4510 - 5127 532 205 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253567582|ref|ZP_04844993.1| ## NR: gi|253567582|ref|ZP_04844993.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 16 205 1 190 190 350 100.0 2e-95 MKYHAENAVSSFFYYMWNAWSKEECKVVFGGDYLHFWEKWNAQAENSIYGAAERFYTELS ECSRTLLVERAVSLYDGKAFRKEPDDSKVLVCEECGSRQVETQAWIDANTEMYICDTAHD CDGKWCEECEENVDFCSLEEFKQIMQSWWTGNDIRTLEGITGLKETDYLSNNSSQTFAGA TDKWWYNLDYDGKRNVYNKHTSNNE >gi|226332319|gb|ACIC01000001.1| GENE 9 5124 - 5717 356 197 aa, chain - ## HITS:1 COG:no KEGG:BDI_3867 NR:ns ## KEGG: BDI_3867 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 196 1 198 199 281 69.0 8e-75 MAKYDVRVRYAFEGTYTVVAEDHDEARQMVSEDCGLVLGGNIHTTLDDEDVDWNFGVHPD TQILSVAQRNGKGTSASIDFSGRIEELRADIIEAIRQLLQAHCMTEIRFPEDDGYDPVWV IWFSKNGDPYECMVTGLRVTENSLTVFAEEKESGYEVECHSPFELGTRNIDWLHEMYDVV WRQLEGKENVEPETEKS >gi|226332319|gb|ACIC01000001.1| GENE 10 5755 - 6684 525 309 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253567584|ref|ZP_04844995.1| ## NR: gi|253567584|ref|ZP_04844995.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 309 1 309 309 608 100.0 1e-172 MEEQRTLTLDFVKSLMEPSYTLVWTDYNDNLDNHLDIIRKCLDRRNCDCLWEKAGEWYGD AEYEAVHGIMEKLKEECFVFNDFDEHEVDAFFDEHEDAIRDEIYSRNDSDVVKDLIRYTD DIPIRVEMLSDYDCINSNWFESQGGYSYEESYFGDMVDSLNLNPAQVKKLLTSHGYKVYG RFPNRKSRNGKEQVSYEQFYEELINSCCGANLLTYIGKVSLKELYDADFSLKEVIIPKGN CCGLFSSTYGGGSLLEMELKQDVKLKLEVKGCNGFRFRLDDERSKYDYSIQHVYGVDDSF FNNPVSIVS >gi|226332319|gb|ACIC01000001.1| GENE 11 6775 - 7062 126 95 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253567586|ref|ZP_04844997.1| ## NR: gi|253567586|ref|ZP_04844997.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 95 8 102 102 186 100.0 4e-46 MEELKISNRQIAMMAFDRLRKENKKDSALRLARCLLQGTSISLGIGDIDWDIDTAIRQCG GEPSTGYRYTAYFHFNRKTEMVKERYDEIVKELYG >gi|226332319|gb|ACIC01000001.1| GENE 12 7080 - 7448 237 122 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253567587|ref|ZP_04844998.1| ## NR: gi|253567587|ref|ZP_04844998.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 122 1 122 122 237 100.0 2e-61 MSYQIITKMAYNAKNKQIETWQHSNNVWPKTDHFYDLDVKTDKQMFEFIKLVASGSWQVR KWRKAFNILFEEYPELVMSSYEHELEGRPWKEYCAICRKHEGLAESKCNEIVARFKQLAG IV >gi|226332319|gb|ACIC01000001.1| GENE 13 7465 - 7779 286 104 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253567588|ref|ZP_04844999.1| ## NR: gi|253567588|ref|ZP_04844999.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 104 1 104 104 198 100.0 9e-50 MNINFLGPVLPTDSFTQMAFVEILNIILTSDNIVDVNRMLIGRNVNPTFGSLSGHFRWSY ANDHFTLWQRMEYNSPICFGQRIFSIHFGMLASRDRKRNNMIMN >gi|226332319|gb|ACIC01000001.1| GENE 14 7793 - 8290 482 165 aa, chain - ## HITS:1 COG:no KEGG:BVU_2473 NR:ns ## KEGG: BVU_2473 # Name: not_defined # Def: putative anti-restriction protein # Organism: B.vulgatus # Pathway: not_defined # 3 116 6 119 177 75 39.0 8e-13 MDLNQAEVAVTTQHLIDIRQERDSLLRMSDFGDMGEFLCTCSELFPEEETPEYRYTKWEE IPDPLINREWLCPNFFDIREAMEQLEEPDKDFFFDWCDRYGHDISTEDPHLLVAHYLELF GNVTYIDDDSCPDSGDDSLLYYPGISGNYFDNCFPRFEVFDDNYD >gi|226332319|gb|ACIC01000001.1| GENE 15 8516 - 11518 2328 1000 aa, chain - ## HITS:1 COG:pli0004 KEGG:ns NR:ns ## COG: pli0004 COG0827 # Protein_GI_number: 18450290 # Func_class: L Replication, recombination and repair # Function: Adenine-specific DNA methylase # Organism: Listeria innocua # 41 241 451 675 756 65 25.0 4e-10 MYAIIPQQIPQDRRAEVNEKILFAIDSGKDLIPAESIYNCYTGIGGLHNLKQSDFANYNE YAEAKKEYEMGQFFTPHEVCRDIVDMLSPASSEMILDMCCGMGNFFNHLPNHHNTYGFDI DGKAVAVARYLYPDAHIEKCDIRQYYPEQRFDIIIGNPPFNLKFDYRLSQEYYMEKAYDV LNPAGILMVIVPSSFMQSEFWEKTRITGINSRFSFIGQVKLNPNAFASTGVHNFNTKVMV FLRKSLHIEMQAYNAEEFISMSELKERIREARLMKHKIRFDLMRETNRIDKEELEVFEYK LAKYMYELKAHTKLNKHIDKAEALVTKFRNQKPPENATREQVNQWEKNKLTTTKVLGIIR RYITSQNTVPRKEVALVKTSYGFKLKQYAPRLLDKVSHKAASINDLVLGRSELPLPELPT EKNMRQIRAAEKLIRKKRRQYENQNRQFTEMQPDPFLQEYLDRTTFRNKDGEICEFTSLQ KHDLNLVLQKRYALLNWQQGSGKTAAVYHRAKYLLKFHKVRNAVILAPAIATNMTWIPFL TINREKFRIVRSNADLETVPEGVLLILSTSILSKLKRGLSKFVRRTSGKLCLVFDESDEI TNPSSQRTRLVLSIFRRLKYKILDTGTTTRNNIAELYSQFELLYNNSVNMVCWCDRIYHE 
NRDKEIEEESNRHYGEPFPAFRGHVLFRSCHCPGKSTVFGIEKQNQDVYNKEELAELIGK TIITRKFRDFAGEKYKIRTHTVSPAEGEHEVYRVIIEEFCRICELYYNSTGDAKKDAGLR LMRQIKLLIKACSVPHLIDGYFGDGIPNKTRYIEKLIRKIPGKVAVGCTSIAAFDLYENH LRECFPNRPVFVVKGDVTFKKRQSIVTEFDSTVNGILVCTQQSLSSSVNIPTCNDVILES LQWNIPKMEQFYFRFIRLDSKELKDVHYVTYKDSVEQNLMALVLTKERLNEFIKSGEVKE QSDIFEEFDVTMSVIESLLVRERDREGKIHISWGSQRITG >gi|226332319|gb|ACIC01000001.1| GENE 16 11598 - 12539 758 313 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253567591|ref|ZP_04845002.1| ## NR: gi|253567591|ref|ZP_04845002.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 313 1 313 313 614 100.0 1e-174 MQTTTSFKPGNAPDLLSGILSVQVRNEDKITEQDRLFCQNQQDMLYKTLDRIDRWYTIFK EEAEQYRTERDFSYDDNGKISMRDLYSLRNGKDDYSHNEFKPFDVLNDLVDKNHNANSNF ANRIIAYFNRTYNVSVPEQRINEKTLRMGFRPVYGTYVDAVIEHLGGKSFRETAVEELLA RVAKVVKPSCWSKVKTELKKDKIVFPEIIRFDDYYIQYHQRCKISYNYSGELETLCAGIA YGADDVLCGNSKMIIRFDDNDVSVNDWYDLTTTNAEQIRFYKNGRIDVKFKDSAAAESCF KRLRLDEITLPEN >gi|226332319|gb|ACIC01000001.1| GENE 17 12559 - 13377 643 272 aa, chain - ## HITS:1 COG:no KEGG:BDI_3903 NR:ns ## KEGG: BDI_3903 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 265 62 319 320 159 36.0 1e-37 MLQATKNKYGIETLKTLNVLYDHEHWLTQEDVDMANRYVELIEQTRSEITPQTGDRLIYL SRHGDYYGNALIDSMDEKKGLLSICEQPYVPFVWQSADNIRLSVSGGAFHHVKTDDLKFN GWTEGAFKDWGHCGSCAHGSVTFTAKVPQWIYREPEPLYGDFTTETYRRFYLHKDLEARN LYQSLDIAFHNEEDFRQFLQDYEGTVFKGNWKNQIVVWCFRREYVFLPLSEWEKIDVPAV ERRLNFHPEQVKIVKDMEKHITYFHRIQSQDF >gi|226332319|gb|ACIC01000001.1| GENE 18 13446 - 13628 75 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567593|ref|ZP_04845004.1| ## NR: gi|253567593|ref|ZP_04845004.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 60 1 60 60 115 100.0 7e-25 MFRTCKVGGKNTEALRGEDDFSTPPGGLDLACTIRTRVTFACELRELGYGTDVILDLDYF >gi|226332319|gb|ACIC01000001.1| GENE 19 13861 - 14160 455 99 aa, chain + ## HITS:1 COG:no KEGG:BT_0185 NR:ns ## KEGG: BT_0185 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 99 1 99 99 181 100.0 9e-45 MVKHIVLFKLKDEVPEAEKLVVMNKFKEAIEALPAKISVIRKIEVGLNMNPGEAWHIALY SEFDTLEDVKFYATHPDHVAAGKIIAEAKDSRACVDYEL >gi|226332319|gb|ACIC01000001.1| GENE 20 14176 - 16230 2196 684 aa, chain + ## HITS:1 COG:HI0885 KEGG:ns NR:ns ## COG: HI0885 COG4232 # Protein_GI_number: 16272825 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol:disulfide interchange protein # Organism: Haemophilus influenzae # 239 597 182 521 579 97 27.0 6e-20 MRKLISFLLLSFVAYTLQAQIKDPVKFKTELNSLSDTEAEIVFTAAIDKGWHVYSTDLGD GGPISATFNVEKKSGVELVGKLKPVGKEIATFDKLFEMKVRYFENTAKFVQKVKLTGGAY AIEGYLEYGACDDESCLPPTEVPFKFSGEAKAGTTAAPVAKADKPETKEAEEVTAPTASK DSAAMMELVPASATDANADIQPAVASGDLWRPVISELQALGEEHTQGDMSWIYIFVTGFL GGLLALFTPCVWPIIPMTVSFFLKRSKDKKKGIRDAWTYGASIVVIYVALGLAITLIFGA SALNALSTNAVFNILFFLMLVIFAASFFGAFEIRLPSKWGNAVDSKAESTTGLLSIFLMA FTLSLVSFSCTGPIIGFLLVQVSTTGSVVAPAIGMLGFAIALALPFTLFALFPSWLKSMP KSGGWMNVIKVTLGFLELAFALKFLSVADLAYGWRLLDRETFLALWIVIFALLGFYLLGK IKFPHDDDDNKVGVTRFFMALISLAFAVYMVPGLWGAPLKAVSAFAPPMHTQDFNLYKNE VHAKFDDYDLGMEYARLNGKPVMLDFTGYGCVNCRKMEAAVWTDPKVSDLINNDYVLITL YVDNKTPLTEPVKIVENGKERTLRTVGDKWSYLQRVKFGANAQPFYVLLDNQGSPLNKSY AYDEDIPKYIEFLQTGLENYRKEK >gi|226332319|gb|ACIC01000001.1| GENE 21 16326 - 17030 685 234 aa, chain - ## HITS:1 COG:RSc2208 KEGG:ns NR:ns ## COG: RSc2208 COG1741 # Protein_GI_number: 17546927 # Func_class: R General function prediction only # Function: Pirin-related protein # Organism: Ralstonia solanacearum # 5 233 4 232 232 169 36.0 5e-42 MKKVIDRASSRGYFNHGWLKTHHTFSFANYYNPERIHFGALRVLNDDSVDPSMGFDTHPH KNMEVISIPLKGYLRHGDSVQNTKTITPGDIQVMSTGSGIYHSEYNDSKEEQLEFLQIWV 
FPRIENTKPEYNNFDIRPLLKPNELSLFISPNGKTPASIKQDAWFSMGDFDTERTIEYCM HQEGNGAYLFVIEGEISVADEHLAKRDGIGIWDTKSFSIRATKGTKLLVMEVPM >gi|226332319|gb|ACIC01000001.1| GENE 22 17223 - 17825 542 200 aa, chain + ## HITS:1 COG:RSc2361 KEGG:ns NR:ns ## COG: RSc2361 COG1595 # Protein_GI_number: 17547080 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Ralstonia solanacearum # 13 181 36 210 213 68 28.0 8e-12 MYATDTLMDERELVLRLIDGDEDAFCELYAAYKNRLLYFAMKFVKSREFAEDIFQDAFTV VWQSRRFINPDASFSSYLYTIVRNRILNQIRDMANEDKLKEHILSHAVDSANETNNKILF DDLKDVLSRALEQLTPRQREVFNMSRDLQMSHKEIAEALGVSVNTVQEHISVSLKVIRAY LTKYSGTSADILLILLCLNL >gi|226332319|gb|ACIC01000001.1| GENE 23 17913 - 18896 783 327 aa, chain + ## HITS:1 COG:CC2708 KEGG:ns NR:ns ## COG: CC2708 COG3712 # Protein_GI_number: 16126941 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Caulobacter vibrioides # 123 313 139 339 357 71 27.0 2e-12 MKDTKTEKNKLRRYLDDMYIREEASQLLESMRDADHKDILDELSAEVWEESVSQQPVTDL EREKYKKEARQLLKHIEHKKRTWFRRVMTIAASVAAVIAIVTGSISYFRYMSEQQITFAE ISTSFGEKKRVELPDGTILVLNSCSQVRYPDSFQGDIRKVELEGEGYFRVAHNEDMPFIV QTKRLDVRVLGTRFDVKSYSTDEIVSVSVESGKVQVDLPEAMMRLTAKEQVLINTVSGEY SKKKEERGVAVWIKGSLRFNSTPIRDVAKELERVYNCQITFASGQEFDNLITGEHDNKSL ESVLKSIEFISDINYKKEGRNILLYKK >gi|226332319|gb|ACIC01000001.1| GENE 24 19109 - 22549 3128 1146 aa, chain + ## HITS:1 COG:no KEGG:BT_0190 NR:ns ## KEGG: BT_0190 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1146 1 1146 1146 2190 99.0 0 MNNLLICKGKSRNSCSFLAFLLFTFLLVAPTAAQTLKENNITLRVQNEPVENVFNKISEQ TNFKFIYDQETVNNAPHVSFDVKNATLKQILGVITTQSKLYFNRTDNTIAVSKQPLKEES AQRTRTVQGVVVDDKGEPVIGASVQIKGEGSGTITDMDGRYSLMNVPESATLTISYIGYK TVSLSAKDKNTAKITLTEDSKMIDEVVVVGYGVQRKRDVSTSISSVKAEQIAEVSASDFR QALAGKMPGVQVTQPSGDPEGSVSIRVRGISTVNAGSDPLYIIDGVPVERGFANLNNNDV 
ESVEVLKDASSAAIYGSRGSNGVIIITTKQGQSEKMKVQYDGYYGIQSVSKKLPMLNAYQ FAEFAKDGHDNSYLDANPGGSPNDPNGMRPNSWERIPAELFPYLNGDQGLTDTDWQDAIF RSAATTSHNVSISGRGKTVGYFISANYYDKEGIIINSDFKKYSMRMNLDGKYKRLKFGLN FSPSYSTSNRVDASGSNGIVQSALMMPPVWPVYNSDGSYNYQGNGYWRIGNDYQHNAVLN PVAMANLQSDVVDRMAIVGKVFAELELFKGLTYNISFGGDYYGSHNDQYRSSELPLLGQK YYDIKSNPTAYSSSGFYFNWLIENKINYNTVINEDHSINAVLVQSAQKETYKGDNVTATD FPNDYIQTISGGTVIKGASDKTQWSIASYLARVQYSYKGKYMASGAIRADGSSRFGKNNR WGYFPSASLAWRVSGEDFFTKAKFLSFVDDLKIRTSYGVTGNFQIGNYDHLSLMALDNYI LGTGNGQLVQGYKPNTIKNDDLSWEKNAMVNVGVDLQMFKGLLGITVDYYNTNTSNMLLN VPVPHLTGYSTALMNIGKVNNRGWEIALTSQKNFTKDFGYSFNANYATNTNEVKALGPGN APIISTGSVDHAYYITKVGEPIGCYYLLVQDGIFSNEEELKKYPHFSNTQPGDFRFVDVD GDGVMDLDKDRTIVGNYMPDFTYGFGGKVWYKGIDLDFNFQGVYGNEILNLNRRYIDNLE GNTNGTTIALNRWKSADNPGNGQVNRANRKSKGYNGRTSTWHLEDGSYLRLQNVTLGYTL PQNLTRRFFVEKLRVYVSGQNLWTSTNYGGYNPEVNARPSNSLSPGEDYGTYPLAKTFLF GLNITL >gi|226332319|gb|ACIC01000001.1| GENE 25 22573 - 23964 1157 463 aa, chain + ## HITS:1 COG:no KEGG:BT_0191 NR:ns ## KEGG: BT_0191 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 463 1 463 463 944 100.0 0 MKIYRQILFLATAFVSVACTDSFFDLEPSSSVPTDKVYKTADDFNVAVIGCYSKLQGQVS YFTECCEYRSDNLSLSAPTAGTQDRYDIDQFADKASNGILESAWANFNNGVYRCNLVLDK IDAADFDATLKKQYKGEALFVRALTYFNMYRLWGGIPMTDKVVTVAEALKIGRSSEQQVF DFLTGDLNQVINENMLPSSYTGTDTGRATSGAAMALLGKIYLTFHKWTEARNVLSQLIGR YSLMPTPDKVFDVDNKMNDEIIFAVRFNKDVEGEGHGYWFSIINLTDDTNQTKALKECYK DGDKRKDLITYVKVEDKVCVMNKFKDLKSATYNTVGNDQIILRYADVLLMYAEALNEISY SNSQTSDAMVALNAVHTRAGLSPVQITELADQDSFRKAIMLERQQEFPYEGHRWFDLVRM GGAKEAMKAEGHTIQDFQFLYPIPKTELERINNTELLWQNPGY >gi|226332319|gb|ACIC01000001.1| GENE 26 23992 - 25959 1297 655 aa, chain + ## HITS:1 COG:MA0850_3 KEGG:ns NR:ns ## COG: MA0850_3 COG1520 # Protein_GI_number: 20089734 # Func_class: S Function unknown # Function: FOG: WD40-like repeat # Organism: Methanosarcina acetivorans str.C2A # 340 615 26 295 314 102 30.0 2e-21 MKKIIYSLLFVITAVCAACDEELPKASFDLYELKTLTATAGDMNVALSWEAYPDARPDEY 
LISWTSGSSGAEGGQMTVEAEASGATVSDLVNDVAYTFSVQPRYTGGLASKRSTSCTPKN ARYPVTGLTASAGDKKVRLRWTKPASERFTNYQITVSPGNQVIKLDDTTLEEYMVEGLTN DQEYKFDVVCIYPMGNSVSAEAVATPGIIYPVAVNTELVVWESAVFAYNDMYFTGGNVKS VSWDFGDGTASAETTPSHAFAATGTYTVTITVTYTNNTTESGSVEINVVEYKWDSLNLNF NGLTGYVKTSNPVFSPDGKTMYIPTSTPAGHLFALDVVSGEFKWVFAIDKITYGGGALVG SDGTIYQCVRDASIKNVYAINPNGTQKWSLQLDGPIGAFPALSADGVLYCLTNKSTLYAL DVANGAIKWQQSLDGTTGSAVAIDRNGHIYAGTSEAIYAFSANKEELWKLSGVNVTEQGT FALNGNTLYATLKSKAGLVAVDITNGTKKWTYPTTGGDAYFPIVDKKGVVYFTEKGSQTV YAVDADGSKVWVKKVNNNLNYSGAALSTDGVLYIGTQSNNKIFGLNTADGSTVYEESVGQ QVMASVSIGPDKRLYCGTIGASNIGSIKAFDINKTLETDSWSIRGGDIQGTNRQK >gi|226332319|gb|ACIC01000001.1| GENE 27 25966 - 26847 539 293 aa, chain + ## HITS:1 COG:MJ1311 KEGG:ns NR:ns ## COG: MJ1311 COG1082 # Protein_GI_number: 15669501 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Methanococcus jannaschii # 69 276 61 269 293 75 28.0 2e-13 MRKLIFIAFMVMSVCGYAQKYEVGTTTAVWKAPAAADFLHAKAIGVKYVEVAFNQCYRGV PVDEVIPRIKEMKAKIDSADIEVWSIHLPFSRTLDISVLDDRLRKENVDFMAEMIELCAM FQPTCLVLHPSSEPIADSIRAQRIANASESIAYLKKYADRIGAQLCIENLPRTCLGNTPE ELMRIVGDIPDVKVCFDTNHYTKGTTEHFVAIVGSRIGTIHASDFDLVNECHWLPTQGNI KWGKLMQDLEKTGYKGVFMYEATKDHENNNVRPALERIAETFDKITDDYKTLK >gi|226332319|gb|ACIC01000001.1| GENE 28 26849 - 27907 806 352 aa, chain + ## HITS:1 COG:no KEGG:BT_0194 NR:ns ## KEGG: BT_0194 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 352 1 352 352 735 100.0 0 MNNIRRIQYFLFCLLAIGLASCSDDDNNDKKTGYEGILTELAAEVDATTQQLWGTSPSIV NTERADALSTIQGYADKCLDDYFISFLNGFDQASMSMEKSEPILYYYRSAFDRVMDGIEN SKVENGTAEIWLLYNMGYIVKTPSGCFAIDISHRWAKELAPYIDFLCVTHKHSDHYNTDL IQAMFDLDKPVLSNYLKDTTYPYTAKGDKDYEIGKFKIRTCITDHNNSGLSNFVTIFQID CGDDTGNFVFMHVGDSNFKTEQYTNIAPHVNVLIPRYAPNALTENNILGTGAGQVQPDYV LLSHILEMAHAGVDASRWSLDMALERASKINCDQTYVPMWGEKMVWKNGKLN >gi|226332319|gb|ACIC01000001.1| GENE 29 27928 - 28827 756 299 aa, chain + ## HITS:1 COG:CC3172 KEGG:ns NR:ns ## COG: CC3172 COG0584 # 
Protein_GI_number: 16127402 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Caulobacter vibrioides # 49 299 20 271 295 139 33.0 7e-33 MKRIYNFILLVCLVQLAVAQNRIAEIRENLLGNYSDRVLVVSHRADWRNAPENSLQGIRN CIDMGVDMVEIDLKKTKDGHLVVMHDKTINRTMTGKGNAEDYTLAELKAMRLKNGAGCKT RHQIPTLEEVMLLCKGKIMVNIDKGYDYFQEAYAVLEKTGTIDHCVIKAGLPYERVKAEN GDVLDKVIFMPIVNLHKEGAEKIINDYQSHMKPAAYELVFNDDNEEMLRLIRKVRDSGAK LFINSLWPELCGGHDDDRAVELHQPDESWGWILNQGAKLIQTDRPALLLEYLRKKKLHD >gi|226332319|gb|ACIC01000001.1| GENE 30 28840 - 30186 931 448 aa, chain + ## HITS:1 COG:VCA0707 KEGG:ns NR:ns ## COG: VCA0707 COG2271 # Protein_GI_number: 15601463 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Vibrio cholerae # 5 447 1 441 459 372 43.0 1e-102 MLKHLINFYKVSSPGPCNEDVMSSSGKRRLKYLQWSTFLSATFGYGMYYVCRLSLNVVKK PIVDEGIFSETELGIIGSVLFFTYAVGKFTNGFLADRSNINRFMTTGLLVTALINLCLGF SHSFILFAVLWGVSGWFQSMGAASCVVGLSRWFTDKERGSYYGFWSASHNIGEALTFIIV ASIVSVCGWRYGFFGAGMVGLLGALIVWRFFHDSPESKGFPPVNVPKEKKTMSASETTDF NKAQRQVLTMPAIWILALSSAFMYISRYAVNSWGVFYLEAQKGYSTLDASFIISISSVCG IIGTMFSGVISDKLFGGRRNVPALIFGLTNVLALSLFLLVPGVHFWLDAVAMILFGLGIG VLICFLGGLMAVDIAPRNASGAALGVVGIASYIGAGLQDVMSGVLIEGHKTIRSGVEVYD FTYINWFWIGSAILSVFFALWVWNAKQK >gi|226332319|gb|ACIC01000001.1| GENE 31 30391 - 31044 446 217 aa, chain + ## HITS:1 COG:no KEGG:BT_0197 NR:ns ## KEGG: BT_0197 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 217 1 199 199 386 100.0 1e-106 MSRIFAITKILLFVLLAGMTSTVFAQSDDFTTWTKFKVNYKIDPRFSLLSNLEFRTKDDV SRMDRLGLAIGGEYRAYSFLKLEAGYETHFRNLGDSEWKFRHRYNFGATASFQYQWLKIA LRERFQQTFDRGDSKARLRSRLKLAYAPEKGIVSPYFSIEIYQSLDDAPFWRADRMRYRP GVEIALAKRWALDIFYCYQYESPKGKHIAGVEIGYSF >gi|226332319|gb|ACIC01000001.1| GENE 32 31354 - 33258 1133 634 aa, chain + ## HITS:1 COG:no KEGG:BT_0198 NR:ns ## KEGG: BT_0198 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined 
# 1 634 1 634 634 1288 100.0 0 MNILFRMEKGIAVCLFSSFFVLACVNHIPGEKAAVSNGDIPLKFTADIHEVTYTRVAGNG FEENDEVGVFALVGSSTMKEERYADNLRFFRTSEKVFESSEPIYYPDDEVTLRLISYYPY QEEGVNMGESTMSVAVETDQNLSAGYFCSDFLVASQNLQSAPKGAIALHYDHKFFKLKIA LVGKEGEELQGILDADPKLSVCGFYTKAIYDFQKETYSAFSNEKSITPAGKWEIQGDRLS GKEVILIPQETTLGYQYIVLDVGGKIYTSPLPSSLQLQSGKQSELAISFIPTEDVLISKV EGEINSWEDGNTGQSGSEVFHNYVDVSKLVFDDSKVYKVLSAGQQVAEICKEYLVTPDFS SQAIVVYPMKDNHAVDLSKGTVVQLVGHADKVHGGTVAWDLKEHSLKYTPGNQAARSKVY ILPDGQIALSMPEEMLPVLAMKDIMRDVRGSSIHNYPIVKIGTQYWMGSNLKTSLYIDGE EIPKLEIMNAGATGYLLSQTDKSYYFYSFGTIVSNKLLPVGWNMPDWNDWNMLKTYLSND ASLLKTGTWKAISTGAGAIVGTATNLTGFDGYPVGMYAGKFQSAYEGKYLSYWTLNEAGT DADEKIFLLRSDKNEIAEGNTGLDKAYAIRCIRK >gi|226332319|gb|ACIC01000001.1| GENE 33 33334 - 33825 360 163 aa, chain + ## HITS:1 COG:no KEGG:BT_0199 NR:ns ## KEGG: BT_0199 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 163 1 163 163 338 100.0 4e-92 MKKIINPWKGLEGYNCFGCAPNNEAGIKMEFYEDGDEVVSIWKPRPEYQGWINTLHGGIQ AVLMDEICAWVILRKLQTTGVTSKMETRYRKPVSTTDSHIVLRASIKEVKRNIVIIEAKL YNKDGEVCTEAVCTYFTFSHEKSKDEMHFDKCDVEPEEILPLI >gi|226332319|gb|ACIC01000001.1| GENE 34 34079 - 34930 908 283 aa, chain + ## HITS:1 COG:PM1195 KEGG:ns NR:ns ## COG: PM1195 COG0040 # Protein_GI_number: 15603060 # Func_class: E Amino acid transport and metabolism # Function: ATP phosphoribosyltransferase # Organism: Pasteurella multocida # 2 282 7 298 299 248 46.0 1e-65 MLRIAVQAKGRLFEETMALLGESDIKISTTKRTLLVQSSNFPIEVLFLRDDDIPQTVATG VADLGIVGENEFMEKEEDAEIIKRLGFSKCRLSLAMPKDIEYPGLSWFNGKKIATSYPVI LRNFLKKNGVNAEIHVITGSVEVSPGIGLADAIFDIVSSGSTLVSNRLKEVEVVMKSEAL LIGNKNMSDEKKEVLEELLFRMNAVKTAEDKKYVLMNAPKDKLEEIIAVLPGMKSPTIMP LAQEGWCSVHTVLDEKRFWEIIGKLKGLGAEGILVLPIEKMIV >gi|226332319|gb|ACIC01000001.1| GENE 35 34943 - 36226 1041 427 aa, chain + ## HITS:1 COG:hisD KEGG:ns NR:ns ## COG: hisD COG0141 # Protein_GI_number: 16129961 # Func_class: E Amino acid transport and metabolism # Function: Histidinol dehydrogenase # Organism: Escherichia coli K12 
# 8 427 13 431 434 424 55.0 1e-118 MKLIKYPSKEQWAELLKRPALNTESLFDTVRTIINKVRAEGDKAVLEYEAAFDKVTLSAL TVTSEEIQKAEGLISDELKSAITLAKRNIETFHSSQRFVGKKVETMEGVTCWQKAVGIEK VGLYIPGGTAPLFSTVLMLAVPAKIAGCREIVLCTPPDKNGNIHPAILFAAQLAGVSKIF KAGGVQAIAAMAYGTESVPKVYKIFGPGNQYVTAAKQLVSLRDVAIDMPAGPSEVEVLAD ASANPVFVAADLLSQAEHGVDSQAMLITTSEKLQAEVMEEVNRQLAKLPRREIAAKSLEN SKLILVKDMDEALELTNAYAPEHLIVETENYLEVAERVINAGSVFLGSLTPESAGDYASG TNHTLPTNGYAKAYSGVSLDSFIRKITFQEILPQGMKVIGPAIEEMAANELLDAHKNAVT VRLNTLK >gi|226332319|gb|ACIC01000001.1| GENE 36 36223 - 37263 874 346 aa, chain + ## HITS:1 COG:YIL116w KEGG:ns NR:ns ## COG: YIL116w COG0079 # Protein_GI_number: 6322075 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Saccharomyces cerevisiae # 4 343 5 377 385 221 37.0 2e-57 MKTLQELTRPNIWKLKPYSSARDEYKGAVASVFLDANENPYNLPHNRYPDPMQWELKTLL SKIKKVSPQHIFLGNGSDEAIDLVFRAFCEPEKDNVVAIDPTYGMYQVCADVNNVEYRKV LLDENFQFSAEKLLAATDERTKLIFLCSPNNPTGNDLLRSEIEKILREFEGLVILDEAYN DFSEAPSFLEELDKYPNLVVFQTFSKAWGCAAIRLGMAFASEAIIGILSKIKYPYNVNQL TQQQAIAMLHKYYEIERWIKTLKEERDYLEEEFAKLSCTVRMYPSDSNFFLAKVTDAVKI YNYLVGEGIIVRNRHSISLCCNCLRVTVGTRVENNTLLAALKKYQG >gi|226332319|gb|ACIC01000001.1| GENE 37 37267 - 38391 1015 374 aa, chain + ## HITS:1 COG:VC1135_2 KEGG:ns NR:ns ## COG: VC1135_2 COG0131 # Protein_GI_number: 15641148 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate dehydratase # Organism: Vibrio cholerae # 184 374 12 200 200 221 54.0 1e-57 MKKKILFIDRDGTLVIEPPIDYQLDSLEKLEFYPRVFRNLGFIRSKLDFEFVMVTNQDGL GTSSFPEDTFWPAHNLMLKTLAGEGITFDDILIDRSFPEDNAPTRKPRTGMLTKYIDNPE YDLAESFVIGDRPTDVELAKNLGCRAIYLQEATDDLKEKGLEEVCALATTDWDQVAEFLF AGERKAEVRRTTKETDIYVSLNLDGNGGCDISTGLGFFDHMLEQIGKHSGMDLTIRVKGD LEVDEHHTIEDTAIALGECIYQALGSKRGIERYGYALPMDDCLCQVCLDFGGRPWLVWDA EFNREKIGEMPTEMFLHFFKSLSDAAKMNLNIKAEGQNEHHKIEGIFKALARALKMALKR DIYHFELPSSKGVL >gi|226332319|gb|ACIC01000001.1| GENE 38 38431 - 38991 274 186 aa, chain + ## HITS:1 
COG:PA3885 KEGG:ns NR:ns ## COG: PA3885 COG2365 # Protein_GI_number: 15599080 # Func_class: T Signal transduction mechanisms # Function: Protein tyrosine/serine phosphatase # Organism: Pseudomonas aeruginosa # 41 181 48 189 218 89 32.0 3e-18 MIINNKRIWLGLFIGIVMSFSVYGQNINAEKITVPDSKLTNLYQIDSGVYRSEQPSDADF KALEKYGIREVLNLRNRHSDDDEAAGTKIKLYRLKMKAHSVSEDQLINALRIIKNRKGPI VFHCHHGSDRTGAVCAMYRIVFQGVSKQKAIQEMTEGGFGFHRIYKNIIRTIEKADIERI KREVLQ >gi|226332319|gb|ACIC01000001.1| GENE 39 39106 - 41031 1376 641 aa, chain - ## HITS:1 COG:CAC1050_2 KEGG:ns NR:ns ## COG: CAC1050_2 COG0171 # Protein_GI_number: 15894337 # Func_class: H Coenzyme transport and metabolism # Function: NAD synthase # Organism: Clostridium acetobutylicum # 326 634 1 309 310 460 67.0 1e-129 MNYGFVKVAAAVPHVKVADCKFNVERIESQIAIAEGKGVQIIVFPEMSITGYTCGDLFGQ QILLEEAEMGLMQILNNTRQLDIISIVGMPVVVNSTVINAAVVIQKGKVLGVAAKTYLPN YKEFYEQRWFTSALQLTEDTVRLCGQIVPIGANLLFETSDTTFGIEICEDLWATIPPSSS LALQGAEIIFNMSADNEGIGKHHYLCSLISQQSARCIAGYVFSSCGFGESTTDVVFAGNG LIYENGSLLARSKRFCMEEQLIISEIDVERIRAERRINTTFAASQGNPGDKKAISVATEF INSKELTLTRDFNSHPFVPQGAELDEHCEEVFSIQIAGLAQRLVHTKAKTAVVGISGGLD STLALLVCVKTFDKLGLPRKDILGVTMPGFGTTDRTYNNAIDLMKSLGISIREISIQDAC IQHFKDIDHDINVHDVTYENSQARERTQILMDIANQTWGMVVGTGDLSELALGWATYNGD HMSMYGVNGSIPKTLVKYLVQWVAENDMDEDAKATLLDIVDTPISPELIPADENGEIKQK TEDLVGPYELHDFFLYYFLRFGFRPSKIYYLANIAFKDVYDKETIKKWLSTFFRRFFNQQ FKRSCLPDGPKVGSISISPRGDWRMPSDASSTIWLKEIEDL Prediction of potential genes in microbial genomes Time: Wed May 11 23:53:28 2011 Seq name: gi|226332318|gb|ACIC01000002.1| Bacteroides sp. 1_1_6 cont1.2, whole genome shotgun sequence Length of sequence - 17871 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 2, operones - 2 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 208 - 267 5.7 1 1 Op 1 . + CDS 359 - 3241 2329 ## BT_0206 hypothetical protein 2 1 Op 2 . + CDS 3259 - 4794 1118 ## BT_0207 hypothetical protein 3 1 Op 3 . 
+ CDS 4806 - 5711 672 ## BT_0208 hypothetical protein 4 1 Op 4 . + CDS 5729 - 7681 1268 ## BT_0209 hypothetical protein 5 1 Op 5 . + CDS 7727 - 10441 1914 ## COG4886 Leucine-rich repeat (LRR) protein 6 1 Op 6 . + CDS 10459 - 12441 1293 ## BT_0211 hypothetical protein 7 1 Op 7 . + CDS 12492 - 14579 1515 ## COG1404 Subtilisin-like serine proteases 8 1 Op 8 . + CDS 14592 - 15407 469 ## BT_0213 hypothetical protein 9 1 Op 9 . + CDS 15429 - 15905 421 ## BT_0214 hypothetical protein + Term 15935 - 15968 3.1 + Prom 16006 - 16065 9.2 10 2 Op 1 . + CDS 16094 - 16522 265 ## COG0735 Fe2+/Zn2+ uptake regulation proteins 11 2 Op 2 1/0.000 + CDS 16545 - 17105 758 ## COG1592 Rubrerythrin + Term 17132 - 17171 3.5 + Prom 17122 - 17181 3.8 12 2 Op 3 . + CDS 17204 - 17839 580 ## COG0778 Nitroreductase Predicted protein(s) >gi|226332318|gb|ACIC01000002.1| GENE 1 359 - 3241 2329 960 aa, chain + ## HITS:1 COG:no KEGG:BT_0206 NR:ns ## KEGG: BT_0206 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 960 22 980 980 1710 89.0 0 MSQSGYKVKGHVVSAEDNEPMVGVSILEKGTTNGVITDIDGNYTLEIKGTASATLLFSYI GMQSQAHAVSAKTGTLNVRLVSDAALIDEVVVVAYGTRKKGTIAGAVSTVKAEKMENVPA AGFDQSLQGQTPGLSVISNSGEPSKAAVFQIRGTNSINSGTAPLFILDGVPISSADFNTI SPGDIESISVLKDASSTSIYGARAANGVVVITSKRGLAMDKAKVTLRGQWGFSQLASNDN WMMMNTPERIQFEKEIGQDTGKDYNLLSRTNINWLDEVFNDRAPLQSYELSVNRATDRLN YYVSGGFYDQDGIAQSSTFRRYNMRANAEVKASNWLKIGTNTMMAYEEIAQAEEGDMALY TPISGSRFMLPYWNPYNADGSLASENDGTWKGTGQNPIEWMANNPVEHKKYKLLSTVFAD ITPVKNLTVRAQFGADYAHSTSFMKSYPSYIINNNSGRAGRSSSDILNLTETLTANYRWA LNEDHSFNFMLGQEGIDYRSTGFQVVTRGQTIDRLTNVTAGTRASSWQDANTAHSFMSFF FRTEYNYKDLYYAEVAARTDASSRFGKDHRWGAFWSLGFMWNIKNEAFLKDVEWLTGAQI KLSTGTSGNSTIPDYDHLALVSGNANYLDQAGLFPLQSGNEELGWEQTWANNIGLSVGLF NRVNVNFDFYHKKTTNMLMFVPQSYAISGESGHWDNIGAMMNRGVEVAVDGDVIRTRDFT WNLSANFSYNKNKLLELYNGVEEYVNSTTGLKYVVGHSVHEYFMNRYAGVNPANGDALWY TADGELTTEFREEDKVMTGKTFDSPWAGGFGTTLMWKGLSLSAQFSWMAKRYVMNNDRFF EESNGIYSAYNQSKRLLYDRWKKPGDITDIPRYDVVARLDDRFLENTSFLRLKNLTLAYA 
LPQPWLKKTNFFSAARVYLQGQNLLTWTGFTGLDPEVASNIYRAQYPASRQFTLGIEVTF >gi|226332318|gb|ACIC01000002.1| GENE 2 3259 - 4794 1118 511 aa, chain + ## HITS:1 COG:no KEGG:BT_0207 NR:ns ## KEGG: BT_0207 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 511 1 511 511 889 85.0 0 MIKKFKLYILLAAVVLSTTSCLDKMPEDSIPFDEAIQTVDDVNLAVIGIYDAFKSSSLYS GSLTLLPDLQTDLVYGVNGNTNIYGEIWRWKDILATNTNIEGVYAALYNVINRCNFLLDR VDNVRKNTTNDDDLDLIDQCCGEAYFARAIAYSELVKLFCKAYESDEDAANQLGVILTRH YKGNEEMKRASLKDSYQFILDDLDLAAELLALDKDFQPTGKDALYNTAAFFNEYTVYALR ARVALYMRKWDEAIKYSSKVIDSNYYLLSSCTNYVSENVSYYKYMWTSDLSTEAIWKVGF TVNSYGGSLGQIFFNYDYSSYRPDYVPATWAINSYDSNDLRVSSFFQTYTTGYSHGLSWP LLIKYLGNEEFTNAQILHVSMPKVLRLSEQYLIRAEAYVQQAQPDYGRAGKDITTLRTAR YSTYGGSTALSASNAMEVIEAERVKELYMEGFRLHDLKRWHKGFERKPQDQSLANGSSLK VEADDPLFVWPIPQHELDAPGSQVQPNESNK >gi|226332318|gb|ACIC01000002.1| GENE 3 4806 - 5711 672 301 aa, chain + ## HITS:1 COG:no KEGG:BT_0208 NR:ns ## KEGG: BT_0208 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 301 1 299 299 560 99.0 1e-158 MIMKKLLLGCIAAVAALMALSGCDQDKVAYNGPSYLMFSDTLYTYAVQETNEIFNVPISA TVAADHDRTFAVEVIDRESNAVEGKHYKILSNTVTIKAGERSTNLEVQGIYDNLEINDSL GFALRLVIPETEQWGLYGTEAKVVMQKIRPFDIKNFTGYCVVSSTYFASYLNNLELRLVT SDIVEGKENTIAIHGLYYDGYDTEITFNREDVQEPLVEMDEQLCASTATAFGTIYGDGKL LMNQPTAYTSYFSTNENFVLQYVTLSVNNRDGSSYGTVGTFVNVIEWISEAEAEKLKEQG Y >gi|226332318|gb|ACIC01000002.1| GENE 4 5729 - 7681 1268 650 aa, chain + ## HITS:1 COG:no KEGG:BT_0209 NR:ns ## KEGG: BT_0209 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 650 1 650 650 1263 100.0 0 MDMKKFSLKPLGDSCMKMLCKSCFFIGLLMTFCLTACSDDDDDTVTPIFPEKQNIVCNAG EKKEFTFTANTNWSLASSTIWCKFQNNDTEEFVVSGTAGTQTVTIMATNENQNNDNASVA KLELTMGGQTIVIGEVTRSAAGYELKIFDEEGVEINELKVGYQHFSKFSVQANFRFAATN LPGWVELEGGSLVGPVNQKVSGGLKIIEDDNREKYPVPASDKNVITFSDEEGKAFYSFNV SYDGMTPGVLEMTLPSSNKYDWAVSLDGKTFTQSAGGVAGTGGSTTTLKNRVPFTVKTLN 
DDYEVVFVEKGSNNNDLYIMDASYNEWMRCEREGGKATLIVDEYTPASYEPAERVGYVLV FSRAQYEDIKDNLEATIIDGEDLVYKYEQSNLVLQFTQKETKGGGDELAITAVDGQTYNP IDCTSYTGGDADYFKSEYGVTGIFEIKQPASVATVAKMPFNWSNSVCYYFEDEQEANGVT EPISETDITIYTEAANGKDVFLIVSDESGNKLMLIVRISNAGSGGGGDVPFTVTTSQLAP VSCTAYDGSMGGANYFITQYEVTDISDIKNPPIGESIFVTLNTSSIVDFKCYDVDEKIVD ASSDIEISEDWSTGKQNLNVWLGNGSSLTKTVFLIITGEDGSKHMLVINI >gi|226332318|gb|ACIC01000002.1| GENE 5 7727 - 10441 1914 904 aa, chain + ## HITS:1 COG:alr0124_1 KEGG:ns NR:ns ## COG: alr0124_1 COG4886 # Protein_GI_number: 17227620 # Func_class: S Function unknown # Function: Leucine-rich repeat (LRR) protein # Organism: Nostoc sp. PCC 7120 # 456 791 129 435 461 72 26.0 3e-12 MNLHKIYSIFFLPFIVGLLTMLVVTGCSDDDEELQSQYGYVQFKLYKSTSFEKGTTTRAV GKLESMSSAQKIKVVMTHNGTTVSQTLLLNAYNANNAEYGLRSDKLQLLAGTYKIVGYYL YDGLDEVLLAGPAGDDNELTVVSGGLLEKALTVDAVPHGTVTFKLSKEGISTRAAGEYLF SNIRYVDVTVMNSFNRVTTELKGMKVTYKEDSKEHQNPDNANDKYMDIGVATCDSAVWLP AGTYQVVAYTTYSQSGIKRSELETQSVRGESFTVIDNKLTKDANVPIQLKETAEYIKDYK ALKAIWEALDGKNWRYYSGTINNTIHSLNWNFNKELDMWGDQPGVDLDNNGRVTGLSLAG FGAKGRVPDAIGQLTELKVLSFGTHSETVSGRLFGDEELTPDMSEERKHRIRMHYKKMFL DYDQRLNLSDLLQDAINRNPEMKPIKKDSRISLKDTQIGNLTNRITFISKAIQRLTKLQI IYFANSPFTYDNIAVDWEDANSDYAKQYENEELSWSNLKDLTDVELYNCPNMTQLPDFLY DLPELQSLNIACNRGISAAQLKADWTRLADDEDTGPKIQIFYMGYNNLEEFPASASLQKM VKLGLLDCVHNKVRHLEAFGTNVKLTDLKLDYNQIEEIPEDFCAFTDQVEGLGFSHNKLK YIPNIFNAKSVYVMGSVDFSYNKIGSEGRNISCSMDDYKGINASTVTLSYNEIQKFPTEL FATGSPISTIILSNNLMTSIPENSLKPKDGNYKNTYLLTTIDLRFNKLTSLSDDFRATTL PYLSNMDVSYNCFSSFPTQPLNSSQLKAFGIRHQRDAEGNRILRQWPTGITTCPSLIQLQ IGSNDIRKVDEKLTPQLYILDIADNPNISIDVTSVCPYIEAGMYVLLYDKTQDIRGCDAL GIER >gi|226332318|gb|ACIC01000002.1| GENE 6 10459 - 12441 1293 660 aa, chain + ## HITS:1 COG:no KEGG:BT_0211 NR:ns ## KEGG: BT_0211 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 660 1 660 660 1303 100.0 0 MKYFSKITLGLILCMGIISSCKDDDETRIDGISVDKEEIAIGAEGGTEKIAVSSNDQWVV RVSKPWIAVSPANGFGSAVCELTIDSTLTNVARTAQISFTMNGREPSLVTVTQFGFGKQI 
LVKEPEVEIPSSDAFDNRHFKSIISTNVNFKIGSVDYSFAEEATMTEEEKREVEGERSGW VTLPKDKDLAVNLDKGARPRTLKVDFRWGMNVAPYTRVAKIHLVPVDENDQLVDNNGNKI DAVVLTVTQKAAMKIEDNRAGDSLAIITINSKLQSMMSFDTSESMMNWSFVTLWEATDKE IKDGIVPTEAVGRVRSVSYAMIDLQDGETFPKEIRHLKYLESFSVQSNANRQIRTISLGE EICELEYLKDLTIFSFGINALPENFIKLGKKLENLDLASNNFQSLSVVTDVVNEKNFPHL RYLTLTGCRATETLKDMSLIDGNNQYNGRDVGLHVDISQGQPEREAFLKLLTWDKLISLQ MSYNFLEGELPTDAEVRAALRAADKPETYQSDDFFSKDELTAKPSIFMDKISRDTCQWLL TTDNQVRYRKQTPVSGQDIPRVLPFARTVHINLNFLTGALPNWLLFHPYFVYWGPESMIF NQQEDGRNSKGNTVGFNNVDIVNYDFSYYYGSTDPGTNQIVSGVAYPLYFRKFVLSGTTD >gi|226332318|gb|ACIC01000002.1| GENE 7 12492 - 14579 1515 695 aa, chain + ## HITS:1 COG:alr1615_2 KEGG:ns NR:ns ## COG: alr1615_2 COG1404 # Protein_GI_number: 17229107 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Nostoc sp. PCC 7120 # 220 584 45 357 416 127 33.0 9e-29 MKKNFLYTALFALMLASCSEQEVIEQPSTPTGGTEVQLPADVTSGELLIKFDPAMTEILD QALTVATRSGGAMTRSGIPSTDEVLDILGTYHFERIFPVDTKNEERTRTSGLHLWYRVKF DENTDLKEAAGRLAKLGEVAKVQANSHIQRAYRVDGYRSYVSESALRQKAATRTVTTGST FSDPGLAYQWHYNNSGNNPFDNQNVLKNGSRPGCDVGCMEAWKKCTGDPSIIVAVLDEGV MYTHPDLKGNMWINEKEELYADKDADGNGYKDDKYGYNFVSNSGIISWMDAVDTGHGTHV AGTIAAMNNNGEGVCGIAGGDGSKNSGVKIMSCQVFAGEAGVTLDAEARAIKYAADNGAV ILQCSWGYNSSLANLIEGYTPGPGSEEEWEKLYPLEKDALDYFINNAGSPNGVIDGGLAI FASGNEYAGMAAFPAAYSKCISVSAVAADFTPASYSNYGKEVTISAPGGDTEYYNPVGQD DPEGWEEGIHSGSILSTWIQNGNATYGFMDGTSMACPHVSGVAALGLSYAVKQRRHFKAS EFVELLKSSTKPLDSWYNTGEVKAYYRNHISSGASATRVELSKYVGKMGAGLLDAGMLLN NIEGNGSDMVVPNMYVAEGAASTLNLACYFVNGENLTYTCTSGDTTVASVSVNGTFMTVS GVKTGATRITVKVSNGSEQSITVTVRKKANDNGWM >gi|226332318|gb|ACIC01000002.1| GENE 8 14592 - 15407 469 271 aa, chain + ## HITS:1 COG:no KEGG:BT_0213 NR:ns ## KEGG: BT_0213 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 271 1 271 271 548 100.0 1e-155 MKPVTWIRFFLLCVVIGQFRAVEARNVHFTPQEKTDTLINPPLIKGAEHILRFDKTVLNI 
GTLTEDDAPKMYRFTCTNVSGKAINLTRVRTTCGCAVADVRTGEISPGETRVVVLTYHPK NHPGTIDTNAFVYLSSSDKMPVARLTLIGNVLPGADEWGRYPYKMGKLRLKQNRIEFREV NAGKRPSERILCGNSGNQSLRLSAAVIPEFATFRTEPEVISPGSEADVVITIDASLIPVE KGRSFTFPIIIEGVEGQPSDRTLNIKVNCIK >gi|226332318|gb|ACIC01000002.1| GENE 9 15429 - 15905 421 158 aa, chain + ## HITS:1 COG:no KEGG:BT_0214 NR:ns ## KEGG: BT_0214 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 158 1 158 158 315 100.0 3e-85 MKNIFKLMALFAFVLCFASCDDDEKVEIPALPVTAANLNGTWQLSEWNGQALAEGTYCYI TFNRRELTFEMYQKFDSMYARYITGSFNIENDPYLGYVISGEYDFGNGDWNNDYIVTDLL ESGSMIWTVKDDDSDVNKYVRCEKVPESIIEEAKTNKN >gi|226332318|gb|ACIC01000002.1| GENE 10 16094 - 16522 265 142 aa, chain + ## HITS:1 COG:FN2045 KEGG:ns NR:ns ## COG: FN2045 COG0735 # Protein_GI_number: 19705335 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Fusobacterium nucleatum # 7 132 14 137 142 115 47.0 2e-26 MKPYERLLEHNIKPSMQRIAIMEYLMNHPIHPSADDIYTALSPSMPTLSKTTVYNTLRLF SEQGAALMLTIDEKNTNFDADTSVHSHFLCRYCGHIYDLKSPEAVKKVESLEMDGHQVTE VHYYYKGICKNCLRKDKETRID >gi|226332318|gb|ACIC01000002.1| GENE 11 16545 - 17105 758 186 aa, chain + ## HITS:1 COG:CAC3597 KEGG:ns NR:ns ## COG: CAC3597 COG1592 # Protein_GI_number: 15896831 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 1 182 1 180 181 234 71.0 6e-62 MKKFRCTVCGYVYEGDAAPEKCPLCKAPASKFVEVVEVEGGALSFADEHVIGVAKGCDEE MIKDLNNHFMGECTEVGMYLAMSRQADREGYPEVAEAFKRYAWEEAEHASKFAELLGDCV WDTKTNLQKRKDAEQGACEDKKRIATRAKALNLDAIHDTVHEMCKDEARHGKGFEGLYNR YFGDKK >gi|226332318|gb|ACIC01000002.1| GENE 12 17204 - 17839 580 211 aa, chain + ## HITS:1 COG:PAE2336 KEGG:ns NR:ns ## COG: PAE2336 COG0778 # Protein_GI_number: 18313271 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Pyrobaculum aerophilum # 24 208 52 252 274 108 34.0 6e-24 MRKVQLLFVCLMLSAAAFAADKVVKLPKPNLNRTGTVMKALSERQSTREYASKALTLADL 
SDLLWAANGINRSDAGKRTAPSAMNKQDVDVYVILSEGSYLYDAKNHQLNLIAEGDYRGA VAGGQAFVKTAPVSLVLISDVSRFGDAQKIQNQLMGAMDAGIVSQNISIFCSAAKLATVP RASMDAAQLKKVLKLKDSQIPMMNHPVGYFK Prediction of potential genes in microbial genomes Time: Wed May 11 23:54:21 2011 Seq name: gi|226332317|gb|ACIC01000003.1| Bacteroides sp. 1_1_6 cont1.3, whole genome shotgun sequence Length of sequence - 51081 bp Number of predicted genes - 39, with homology - 38 Number of transcription units - 21, operones - 11 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 - CDS 61 - 525 627 ## COG0526 Thiol-disulfide isomerase and thioredoxins 2 1 Op 2 . - CDS 461 - 841 260 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 1001 - 1060 8.0 + Prom 851 - 910 4.5 3 2 Op 1 . + CDS 1037 - 2164 588 ## COG1819 Glycosyl transferases, related to UDP-glucuronosyltransferase 4 2 Op 2 . + CDS 2161 - 2343 256 ## BF3012 hypothetical protein + Term 2456 - 2490 1.1 5 3 Op 1 . - CDS 2433 - 4589 2109 ## BT_0236 hypothetical protein 6 3 Op 2 . - CDS 4590 - 6749 2141 ## BT_0237 hypothetical protein 7 3 Op 3 . - CDS 6781 - 8025 818 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase) - Prom 8068 - 8127 7.3 + Prom 8076 - 8135 4.2 8 4 Op 1 . + CDS 8183 - 8917 515 ## BT_0239 hypothetical protein 9 4 Op 2 . + CDS 8961 - 10715 837 ## BT_0240 hypothetical protein + Prom 10757 - 10816 5.8 10 5 Op 1 . + CDS 10881 - 11264 352 ## BT_0241 hypothetical protein 11 5 Op 2 . + CDS 11294 - 11770 282 ## BT_0242 putative polysaccharide deacetylase + Term 11814 - 11852 7.1 12 6 Tu 1 . - CDS 11849 - 12607 531 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase - Prom 12627 - 12686 6.9 + Prom 12605 - 12664 4.7 13 7 Op 1 28/0.000 + CDS 12727 - 13977 1084 ## COG0420 DNA repair exonuclease 14 7 Op 2 . + CDS 13974 - 16838 2283 ## COG0419 ATPase involved in DNA repair + Term 16980 - 17039 6.5 - Term 16621 - 16661 1.6 15 8 Op 1 . - CDS 16822 - 17988 739 ## COG1408 Predicted phosphohydrolases 16 8 Op 2 . 
- CDS 18042 - 18620 615 ## BT_0247 hypothetical protein 17 8 Op 3 . - CDS 18627 - 19136 584 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 18 8 Op 4 . - CDS 19206 - 20108 865 ## COG1410 Methionine synthase I, cobalamin-binding domain 19 8 Op 5 . - CDS 20134 - 21471 1012 ## COG0044 Dihydroorotase and related cyclic amidohydrolases 20 8 Op 6 . - CDS 21468 - 22211 709 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 22438 - 22497 3.9 21 9 Tu 1 . + CDS 22456 - 25839 2963 ## COG1197 Transcription-repair coupling factor (superfamily II helicase) + Prom 25841 - 25900 5.2 22 10 Tu 1 . + CDS 25953 - 26798 468 ## BT_0253 hypothetical protein + Term 26975 - 27011 -1.0 23 11 Tu 1 . - CDS 26807 - 26953 83 ## - Prom 26976 - 27035 6.3 - Term 27046 - 27093 13.4 24 12 Op 1 . - CDS 27240 - 29102 1560 ## COG1032 Fe-S oxidoreductase - Term 29117 - 29159 6.3 25 12 Op 2 . - CDS 29187 - 29411 237 ## BT_0255 hypothetical protein 26 12 Op 3 . - CDS 29411 - 29728 422 ## BT_0256 hypothetical protein - Prom 29756 - 29815 4.9 - Term 29782 - 29823 9.7 27 13 Op 1 . - CDS 29853 - 30884 849 ## BT_0257 xanthan lyase 28 13 Op 2 . - CDS 30739 - 32766 875 ## BT_0257 xanthan lyase - Prom 32926 - 32985 5.1 - Term 32905 - 32959 7.0 29 14 Tu 1 . - CDS 32995 - 34986 2051 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase - Prom 35037 - 35096 3.8 - Term 35043 - 35082 -0.3 30 15 Tu 1 . - CDS 35126 - 35911 486 ## COG0388 Predicted amidohydrolase - Prom 35956 - 36015 3.4 31 16 Op 1 . - CDS 36034 - 36588 354 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase 32 16 Op 2 . - CDS 36597 - 36866 334 ## BT_0261 hypothetical protein + Prom 37145 - 37204 7.7 33 17 Tu 1 . + CDS 37224 - 37859 281 ## BT_0262 hypothetical protein 34 18 Tu 1 . + CDS 37964 - 39367 743 ## BT_0263 hypothetical protein + Term 39437 - 39477 0.3 - Term 39420 - 39471 12.1 35 19 Op 1 . - CDS 39493 - 41241 891 ## BT_0264 hypothetical protein 36 19 Op 2 . 
- CDS 41283 - 42764 1007 ## BT_0265 hypothetical protein 37 19 Op 3 . - CDS 42777 - 44237 919 ## BT_0266 hypothetical protein - Prom 44262 - 44321 6.1 38 20 Tu 1 . + CDS 44520 - 48632 1472 ## COG0642 Signal transduction histidine kinase + Term 48664 - 48702 6.2 + Prom 48715 - 48774 5.6 39 21 Tu 1 . + CDS 48907 - 51079 1238 ## BT_0268 hypothetical protein Predicted protein(s) >gi|226332317|gb|ACIC01000003.1| GENE 1 61 - 525 627 154 aa, chain - ## HITS:1 COG:BB0061 KEGG:ns NR:ns ## COG: BB0061 COG0526 # Protein_GI_number: 15594407 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Borrelia burgdorferi # 40 152 3 115 117 105 38.0 2e-23 MKKVLAMVAFVMLSVIVYAFNDKPDANQGKKEVTGNGEVVVMDKEMFLKDVFDYEKSKEW KYKGDKPAIIDLYADWCGPCRQTAPIMKELAKEYAGKITIYKVNVDKQKELAALFNATSI PLFVFIPMKGDPQLFRGAADKATYKKAIDEFLLK >gi|226332317|gb|ACIC01000003.1| GENE 2 461 - 841 260 126 aa, chain - ## HITS:1 COG:all1893 KEGG:ns NR:ns ## COG: all1893 COG0526 # Protein_GI_number: 17229385 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Nostoc sp. 
PCC 7120 # 4 90 10 97 105 96 46.0 1e-20 MEKFEDLIQSPIPVLVDFFAEWCGPCKAMKPVLEELKTMVGDKARIVKIDVDQHEDLATK YRIQAVPTFILFKNGEAVWRHSGVIHSSELKGIIETKLHIKRKIANEESISNGSFCHAER YRIRLQ >gi|226332317|gb|ACIC01000003.1| GENE 3 1037 - 2164 588 375 aa, chain + ## HITS:1 COG:MTH884_2 KEGG:ns NR:ns ## COG: MTH884_2 COG1819 # Protein_GI_number: 15678904 # Func_class: G Carbohydrate transport and metabolism; C Energy production and conversion # Function: Glycosyl transferases, related to UDP-glucuronosyltransferase # Organism: Methanothermobacter thermautotrophicus # 2 355 1 339 348 66 21.0 7e-11 MKFLFIVQGEGRGHLTQAITLEEMLLRNGHEVVEVLVGESSSRILPGFFNRNIQAPVKRF ISPNFLPAADNKRANLKKSFTYNLLRIPEYFRSMCYINQRIKETGAEVVINFYELLTGLT YALFRPSVPYVCVGHQYLFLHQNFEFPDKNSFELRMLRFFTKMTAVRSSKKLALSFNDME PDRGQQVVVVPPLIRQEVTAIQPEEGNYIHGYMVNSGFADSVEEFHTIYPEVPLRFFWDK ADAEEVTRIDETLSFYQIDDVKFLNGMAGCRAYATTAGFESICEAMYLGKPVLMVPAHIE QDCNAHDAMRAGAGIISDSFDLKPLLRFAGTYSPNRTFVRWVRSGERRIILELEKLAASQ SAITSIPTFTNYLPI >gi|226332317|gb|ACIC01000003.1| GENE 4 2161 - 2343 256 60 aa, chain + ## HITS:1 COG:no KEGG:BF3012 NR:ns ## KEGG: BF3012 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 60 1 60 60 89 78.0 3e-17 MKLNKTDYVLERASDGGYYAWLTVNMQCNAYGESPEEAVLNLQETMNEMIDEMYMVEEFI >gi|226332317|gb|ACIC01000003.1| GENE 5 2433 - 4589 2109 718 aa, chain - ## HITS:1 COG:no KEGG:BT_0236 NR:ns ## KEGG: BT_0236 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 37 718 1 682 682 1417 100.0 0 MRKQIIFAIFSLATLSIHADEGMWMLTDLKAQNAVAMRELGLEIPVEEVYNANSISLKDA VVHFGGGCTGEVISSEGLVLTNHHCGYGAIQQHSNVEHDYLTEGFWAMNRDAELPTPGLT VTFIDRILDVTSYVNDQLKKDEDPNGTNYLSPSYLSKVAERFAKDENIEVTPATKLELKA FYGGNKYYMFVKTVYSDIRMVGAPPSSIGKFGADTDNWMWPRHTGDFSLFRIYADKNGKP AAYSRDNVPLQVKKHLKISIAGVQEGDFTFVMGFPGRNWRYMISDEVEERMQTTNFMRQH VRGARQKVLMEQMLKDPAVRIHYASKYASSANYWKNAIGMNEGLVRLKVLDTKRKQQEEL LARGREKGDDSYQKAFDEIRSIVAHRRDALYHQQALNEALVTALDFMRLPSTTEMVTALK 
SKNKEQIKTATENLKQAGEKYFASVPFPEVERMVAKTMLQTYASYIPEEQRINIFEIINS RFKGNIDSFVDACFEYSIFGNPKNFEKFIKKPSLYKIGHDWMVLFKYSITDGILKTAIAM KEANQNYDAAHKVWVKGMMDMRQEKGMPIYPDANSTLRLTYGQVLSYEPADGVVYDAHTT LKGVMEKEDPGNWEFVVPQKLKELYKARDYGRYGKDGEMPVCFIVNTDNTGGNSGSPVFN SKGQLVGTAFDRNYEGLTGDIAFRPSSQRAACVDIRYTLFIIDKYAGASHIIDELSIE >gi|226332317|gb|ACIC01000003.1| GENE 6 4590 - 6749 2141 719 aa, chain - ## HITS:1 COG:no KEGG:BT_0237 NR:ns ## KEGG: BT_0237 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 25 719 1 695 695 1419 100.0 0 MNRLRLYLLALTALLVCTVKADEGMWLLQLMQQQHSIDMMKKQGLKLEAQDLYNPNGVSL KDAVGIFGGGCTGEIISPEGLILTNHHCGYSSIQQHSSVEHDYLTDGFWATSRDQELPTP GLKFTFIERIEDITDIVNAKIAAKEITESQSFTGAFLEGLAKELYEKSDLKDKKGIVPQA LPFYAGNKFYLFYKKIYPDVRMVAAPPSSVGKFGGETDNWMWPRHTGDFSMFRIYADANG EPAEYSASNTPLKTKKHLSISIKGLKEGDYAMIMGFPGRTSRYLTVSEVKERMESTNEPR IRIRGARLAVLKEVMNASDKIRIQYANKYAGSSNYWKNSIGMNKAIIDNDVLGTKAEQEA KFAEYAKAQNNTEYANVVKKIDDLVAQTAPLNYQLTCLTEVFFGAIEFGNSMLTKTREAL VDKNDSLIKVRLEGLKENFKSIHNKDYDHEVDRKVAKALLPLYAEMIPANQRPAIYKVIE QKYKGDYNKFVDDMYDKSIFANQANFDKFLKKPTVKAIDEDLALQYAQSKYDQYGNLLDQ LKELDKELALLHKTYIRGLGEMKLPVPSYPDANFTIRLTYGNVKPYDPKDGVHYNYYTTT KGILEKENPEDREFVVPAKLKELIEKKDYGRYALPNGDMPVCFLSTNDITGGNSGSPVLN ENGELIGCAFDGNWESLSGDINFDNNLQRCINLDIRYVLFILEKLGNCGHLINEMTIVE >gi|226332317|gb|ACIC01000003.1| GENE 7 6781 - 8025 818 414 aa, chain - ## HITS:1 COG:MA2647 KEGG:ns NR:ns ## COG: MA2647 COG0641 # Protein_GI_number: 20091470 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Methanosarcina acetivorans str.C2A # 13 405 9 396 446 404 47.0 1e-112 MKATTYAPFAKPLYVMVKPVGAVCNLACEYCYYLEKANLYKENPKHVMSDELLEKFIDEY INSQTMPQVLFTWHGGETLMRPLSFYKKAMELQKKYARGRTIDNCIQTNGTLLTDEWCEF FRENNWLVGVSIDGPQEFHDEYRKNKMGKPSFVKVMQGINLLKKHGVEWNAMAVVNDFNA EYPLDFYNFFKEIDCHYIQFAPIVERIVSHQDGRHLASLAEGKEGALADFSVSPEQWGNF LCTIFDEWVKEDVGKFFIQIFDSTLANWMGEQPGVCTMAKHCGHAGVMEFNGDVYSCDHF 
VFPEYKLGNIYSQTLVEMMHSERQHNFGTMKYQSLPTQCKECDFLFACNGECPKNRFSRT ADGEPGLNYLCKGYYQYFQHVAPYMDFMKKELMNQQAPANIMKALKDGSLKIEY >gi|226332317|gb|ACIC01000003.1| GENE 8 8183 - 8917 515 244 aa, chain + ## HITS:1 COG:no KEGG:BT_0239 NR:ns ## KEGG: BT_0239 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 244 1 244 244 440 100.0 1e-122 MNVNKILPFVLLLPFLASCTHKYKIEGTSSVNGLDGKMLYLKTLRDGEWTKLDSAEVVHG SFSMKGKIDSVQMTTLYMDDESVMPVVLESGKIVITISNTDLKAVGTPLNTALYDFIAKK NAMEESIGELERKETRMVMDGADLEEVHEQLLAEGDSLMKAMNQYVKTFISDNYENVLGP NVFIMLCSSLPYPIMTPQIDDIIKDAPYSFKSNKMVREFLTKAKENMQLIEEHQRMQQNV GSKK >gi|226332317|gb|ACIC01000003.1| GENE 9 8961 - 10715 837 584 aa, chain + ## HITS:1 COG:no KEGG:BT_0240 NR:ns ## KEGG: BT_0240 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 584 1 584 584 1068 100.0 0 MKTSFSIRIAFFFILSFLAVSCHRDTDALNMTFSKVEKCMDLCPDSALNLLKGIHDPEKL WGESQATYALLMTQAMDKNYMKFSSDSLIALALNYYTITQTSPIMYAKALFYHGRVMLEL DKEEEALKSFLAAKDVYERTKDHKMLALIAEEVGMINRKQDLYDDALTNFREALTTYKQL KDSLSVISASLNIARVYLFKSEWDSCSLYYNNALEIAVQKNYLSEITILHELGILYRSMQ NLSEAERYFLAAYEKETDEEKKYMECLSLGYLYMQMGQTENARKYLIMSANSSKAYTQIS AYDCLYFLEKDIDNFEEAIVYHELADSITNAMEELNSRELIASLQKKYENEKLRNDNLQM KVRYTNFILWGTIAFLFVVACMCYYYYKNRNNKKKIAEIELQIQENEEEIERYQQEIEDI QISKDQVLKENLMLEENRTKVGELNGKIVLLTMQNKTLSEHLKELGGELNVGISSGSFIH AFRLLLAIKEGTLRGKLSNEERQKLFSLFDLIYWNYVSRLLERAPTLTKHDLEICCFLKF GLSHEELSCIFHTTSDSVTRAKGRLKGRLGISPQDDLDLFLKEF >gi|226332317|gb|ACIC01000003.1| GENE 10 10881 - 11264 352 127 aa, chain + ## HITS:1 COG:no KEGG:BT_0241 NR:ns ## KEGG: BT_0241 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 127 1 127 127 226 100.0 2e-58 MKVRFILSFLAALFIAGNMIAIESTERVERKLPLVLKGEVPTTTSRSIPVIPISASVSTD DNIVEIIFTVPLGEVKILVDGQVQEVCQVTAPGQTTSFSIEGWAPGVYKLEFKVAGGGYV YGELVIE >gi|226332317|gb|ACIC01000003.1| GENE 11 11294 - 11770 282 158 aa, chain + ## HITS:1 COG:no KEGG:BT_0242 NR:ns ## KEGG: 
BT_0242 # Name: not_defined # Def: putative polysaccharide deacetylase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 158 1 158 158 309 100.0 2e-83 MNSHHTVAIGTWEGDKQQSIEYNSSIAVGIDCPRAKLRVSILLDTIQEYQVRYIFSDEEI SEAVNEAGLIIGARGIAYEGVLQRKPVIVVGEYGFGGLVTPDTLHEQYNNYFRGKINGIK AEYFSLERLEEEIRRGFALTFQELQMMSNQTIRFLHNI >gi|226332317|gb|ACIC01000003.1| GENE 12 11849 - 12607 531 252 aa, chain - ## HITS:1 COG:TM1693 KEGG:ns NR:ns ## COG: TM1693 COG0204 # Protein_GI_number: 15644441 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Thermotoga maritima # 55 220 59 223 247 107 35.0 2e-23 MKILYYIYQICIALPILLVLTILTAIVTIVGSLLGGAHFWGYYPGKIWSQLICLFLLIPV KIHGREKLHGKTSYIFVPNHQGSFDIFLIYGFIGRNFKWMMKKSLRKIPFVGKACESAGH IFVDRSGPKKVLETIRQAKDSLKDGVSLVVFPEGARTFTGHMGYFKKGAFQLADDLQLAV VPVTIDGSFEILPRTGKWIHRHRMILTIHDPIPPKGKGMENIKATMAEAYAAVESALPEK HKGMITNEDQDR >gi|226332317|gb|ACIC01000003.1| GENE 13 12727 - 13977 1084 416 aa, chain + ## HITS:1 COG:PA4281 KEGG:ns NR:ns ## COG: PA4281 COG0420 # Protein_GI_number: 15599477 # Func_class: L Replication, recombination and repair # Function: DNA repair exonuclease # Organism: Pseudomonas aeruginosa # 2 413 1 405 409 279 37.0 7e-75 MIRILHTADWHLGQTFFGYDRAEEHKAFLDWLAEEIRQNEIDALVIAGDVFDVSNPSAAS QRIYYEFIYRVTAENPKLQIVIVAGNHDSAARLEAPLPLLQAMRTEVRGVVRKLEGGEID YDHLTIELKNREGEVEVLCMAVPFLRQGDYPVVETEGNPYMEGVRELYARLLQRLWARRK TNQAILAVGHLQATGSEIAEKDYSERTVIGGLECVSPDTFSEKIAYTALGHIHKAQRVSG RENVRYAGSPIPMSFAEKHYHHGVVMVILDEGCAVDIRRIECPQSIPLISVPGGEAASPE KIIEILRDLPEVDGEAPYLEVKVLLEEPEPMLRQEIEEALAGKKYRLARIVSAYRQEERV EKEVDGWKKGLQEMSPLQIAQSAFEKVYQAEMPADLTDLFQEAYISATRKEEEEGE >gi|226332317|gb|ACIC01000003.1| GENE 14 13974 - 16838 2283 954 aa, chain + ## HITS:1 COG:PA4282 KEGG:ns NR:ns ## COG: PA4282 COG0419 # Protein_GI_number: 15599478 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Pseudomonas aeruginosa # 1 385 1 396 1211 189 39.0 3e-47 
MKIIAIRLKNLTSIEGSVEIDFTMEPLSSAGIFAISGATGAGKSTLLDALCLALYDKAPR FAASVESINLADVGDNQINQSDVRNLLRRGTSDGYAEVDFQGVDGHRYRSRWSVRRTRNK ASGSLQPQVLEVKDLDTEKEFQGTKKELLAQLVELVGLTYEQFTRTVLLAQNDFATFLKS RGAAKAELLEKLTGTGVYSRISQEIFARNKAAQEEVTMIQSKMSVIELLPEEELLTLQNE KEQLVEKRAAGIKLLAELNAQLNVVRSLKMQEALWVKKQQEEQEELNKQKNLQDALVVQE EGLIHFKAQWEAIQPDLKKARQLDVQIQSQQAGYIQSQQILQAARQQVADQEKKSASAIE QLRISYHSLSRLLNRTDVESLQLEQIEVILNEEKDTLETWTKVNEERLGRLNSFGYPSLV DEQGKIQKELTRLQNIKKLTEEQSKSKQDIEKLEKEVTVCTQQLAEQDTVVKALQRLYEN ARMAVGKDVKALRQQLQEGEACPVCGSTTHPYHREQEVVDTLYRNMEQEYNTAVSAYQQI NNRSIALQRDLTHQRATEVQIKEQLSVLQQEGLPAGEEDQIQNRLNELAERISAYQHLYA EWQQNDEKIKKLRTYCDALRENVSQCRLAAQKVSAAKEQSAILQKAAIDEQQRFEVIAKA LDALRQERSLLLKGKSADEAEAAVARREKELNEALEKARKQVEGVQNHLSGLQGEMKQLS VVIEELREQQKGIELPDQLPQTIAKQQEDNLNTERALSIAEARLLQQAKNKTIFEQITKE LTEKQAVAVRWAKLNKLIGSADGAKFKVIAQSYTLNLLLLHANKHLAYLSKRYKLQQVPD TLALQVIDCDMCDEIRTVYSLSGGESFLISLALALGLSSLSSNNLKVESLFIDEGFGSLD ADSLRTAMEALEQLQMQGRKIGVISHVQEMSERISVQVQVHKKINGKSVLTVVG >gi|226332317|gb|ACIC01000003.1| GENE 15 16822 - 17988 739 388 aa, chain - ## HITS:1 COG:BS_ykuE KEGG:ns NR:ns ## COG: BS_ykuE COG1408 # Protein_GI_number: 16078469 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Bacillus subtilis # 123 387 30 284 287 109 31.0 9e-24 MLQRLFIFLLIFLILPDIYLYLRFIVHLTAKRWLRILYWLPAFLLSAGLLYLVYFSNNAF AERHTQAIGWFSIFFFLFTAPKLLLSLCTIIGVPFHKWLRWPRTPFICTGLTLAVISIVM IIYGSFIGRTQFDVKEVTYSSPKLPSAFDGYRIVQLSDIHIGSWQGNASAIQKLVKLVNE QQADLIVFTGDLVNHRAVELNDFQDILAGLKAKDGVYSILGNHDYGPYFHWKSKDEQDDN LNDLLQRQAAMGWKLLNNSHTILIQGSDSIALIGVENEGEPPFSQYADLPEAMRGTEGMF QILLSHNPTHWRREVLPQTYIDLMLAGHTHAMQLKFGNHSPSSYIYPEWSGMYLEGTRGL YVNEGIGYVGLPFRFGAWPEITVLTLRQ >gi|226332317|gb|ACIC01000003.1| GENE 16 18042 - 18620 615 192 aa, chain - ## HITS:1 COG:no KEGG:BT_0247 NR:ns ## KEGG: BT_0247 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 192 1 192 192 288 100.0 1e-76 
MELEELKKSWNALDEHLKDKELIKEEELGKLIGHADKGIHAIARLNIKLILISLPVLVLF LLEVLLHGRLNPIYIIIILSWIPALYWDITTTRFLQQTKVDEMPLIEVISRVNRIHRWMI RERLIATAFLLILAVLSFIYWQIWQYGIAIILFFILLWGAGLGLILWIYRKKFLNRIQEI KKNLDELKIIDS >gi|226332317|gb|ACIC01000003.1| GENE 17 18627 - 19136 584 169 aa, chain - ## HITS:1 COG:CC3310 KEGG:ns NR:ns ## COG: CC3310 COG1595 # Protein_GI_number: 16127540 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Caulobacter vibrioides # 42 166 33 158 166 78 34.0 5e-15 MRERTTEATNPIEQEFLSVIREYERVIYKVCYLYTNPNAPLNDLYQDVILNLWKAYPKFR RECKMSTWIYRIALNTCISFFRKEKNVPEIVTLTREADWIIEEHDPIHEMLRQLYQMINQ LGQLDKSIILLYLEDKSYEDIAEITGLTVTNVATKLSRIKDKLKKMKKE >gi|226332317|gb|ACIC01000003.1| GENE 18 19206 - 20108 865 300 aa, chain - ## HITS:1 COG:AGc3907_2 KEGG:ns NR:ns ## COG: AGc3907_2 COG1410 # Protein_GI_number: 15889436 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I, cobalamin-binding domain # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 299 616 904 919 203 38.0 4e-52 MILSYKIHNVAPYINWIYFFHAWGFQPRFAAIANIHGCDVCRASWLTTFPEEDRNKASEA MQLFKEANRMLDLLDKDYEVRTIFKLCKANSDGDNLVIEKETDQFLVFPLLRQQTPKRDG SPFFCLSDFIRPLSSGIPDTIGAFASCIDADMEGLYEQDPYKHLLVQTLSDRLAEAATEK MHEYVRKEAWGYAKDEILSMPDLLVEKYQGIRPAVGYPSLPDQSINFLLDELLDMKQIGI TLTENGAMYPHASVCGLMFAHPAAEYFSVGKIGEDQLEDYARRRGKSTEEMRKFLAANLQ >gi|226332317|gb|ACIC01000003.1| GENE 19 20134 - 21471 1012 445 aa, chain - ## HITS:1 COG:XF0988 KEGG:ns NR:ns ## COG: XF0988 COG0044 # Protein_GI_number: 15837590 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Xylella fastidiosa 9a5c # 1 444 1 447 449 407 47.0 1e-113 MKRILIKNAVIVNEGRKVPGSVVIEGEKIAEILVDGQEAAMPCDETIDASGCYLLPGAID EHVHFRDPGLTHKADITTESHAAAAGGVTSIMDMPNTNPQTTTLEALNDKLALLAEKSSV NFSCYFGATNNNYPLFSQLDKNRVCGVKLFMGSSTGNMLVDRMASLRNIFGGTDLLIAAH CEDQGIIKENTDKYKKEYGDDVPLALHPVIRSEEACYRSSALAVQLARETNARLHIMHIS 
TAKELSLFSKAPLAEKRITAEACVSHLIFTEEDYQTLGARIKCNPAIKTAEDRKALQEAV NSGLIDAIATDHAPHSLSEKEGGALKVMSGMPTIQFSLVSMLELADKGAFSIEKVVEKMS HAPAQMYEIRNRGFIRKGYQADLVLVRPDSEWTVTTDCILSKCKWSPLEGHTFHWKVEKT FVNGHLLYNNGTIDENYRGQELRFR >gi|226332317|gb|ACIC01000003.1| GENE 20 21468 - 22211 709 247 aa, chain - ## HITS:1 COG:Rv2051c_2 KEGG:ns NR:ns ## COG: Rv2051c_2 COG0463 # Protein_GI_number: 15609188 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Mycobacterium tuberculosis H37Rv # 7 236 3 231 264 214 46.0 1e-55 MQLSDSIVIIPTYNERENIENIIRAVFGLDKVFHILVIEDGSPDGTASIVKTLQQEFPER LFMIERKGKLGLGTAYIAGFKWALEHAYEYIFEMDADFSHNPNDLPRLYAACAKEGGDVS VGSRYVSGVNVVNWPMGRVLMSYFASKYVRFITGIPVHDTTAGFVCYRRQVLETIDLDHV RFKGYAFQIEMKFTAYKCGFKIIEVPVIFINRELGTSKMNSSIFGEAIFGVIKLKINSWF HKFPQKK >gi|226332317|gb|ACIC01000003.1| GENE 21 22456 - 25839 2963 1127 aa, chain + ## HITS:1 COG:BS_mfd KEGG:ns NR:ns ## COG: BS_mfd COG1197 # Protein_GI_number: 16077123 # Func_class: L Replication, recombination and repair; K Transcription # Function: Transcription-repair coupling factor (superfamily II helicase) # Organism: Bacillus subtilis # 34 1048 31 1097 1177 619 34.0 1e-176 MTITELQQQYAAHPNTAVMERLLKDASVQTIFCGGLCASSASLFSSVLVKQDVCPFVFVL GDLEEAGYFYHDLTQVLGTEKVFFFPSSFRRSIKYGQKDAANEILRTEVLSRLQKGEEGL CIVTYPDALAEKVVSRKELSDKTLKLNVGEKLDTTFITDVLHSYGFEYVDYVYEPGQYAV RGSIIDVFSFASEYPYRIDFFGDEVESIRTFEVESQLSRERKEGVAIVPDLAVTGDVTTS FLDFIPKETVLAMRDFLWLRERIQVVHDEALTPQAIAVQEAEENGGITLEGKLIDGSEFT VRALDFRRMEFGNKPTGTPDAKISFHTSVQPIFHKNFDLVAGSFKDYLEQGYSLYICSDS TKQTDRIRAIFEDRGDKIQFTPVVRTIHEGFVDHTLHLCIFTDHQLFDRFHKYNLKSDKA RSGKVALSLKELNQFTPGDYVVHTDHGIGRFSGLVRIPNGDTTQEVLKLVFQNEDVVFVS IHSLHKVSKYKGKDGEAPRLNKLGTGAWEKLKDRTKAKIKDIARDLIKLYSQRRQEKGFS YSPDSFLQRELEASFIYEDTPDQSKATMEVKADMESDRPMDRLVCGDVGFGKTEVAIRAA FKAVADNKQVAVLVPTTVLAYQHFQTFRERLKGLPCRVEYLSRARTAAQSKAVLKGLKEG EVSILIGTHRILGKDVQFKDLGLLIIDEEQKFGVSVKEKLRQLKVNVDTLTMTATPIPRT 
LQFSLMGARDLSVISTPPPNRYPIQTEVHTFNEEVITDAINFEMSRNGQVFFVNNRIANL PELKVMIERHIPDCRVAIGHGQMEPTQLEQIIFDFVNYDYDVLLATTIIESGIDIPNANT IIINQAQNFGLSDLHQMRGRVGRSNKKAFCYLLAPPLSSLTAEGKRRLQAIENFSDLGSG IHIAMQDLDIRGAGNLLGAEQSGFIADLGYETYQKILAEAVHELRNDEFAELYADEIKGE GQISGEEFVDECQIESDLELLLPATYVTGSSERMLLYRELDGLTLDKDVDAFRFRLEDRF GPVPPETQELLRIVPLRRLAARLGVEKVFLKGGRMTLFFVSNADSPFYQSQAFGKMIDYM MKYTRRCDLREQNGRRSMLIKDVTNVETAVSVLQEIVALPVKELDDH >gi|226332317|gb|ACIC01000003.1| GENE 22 25953 - 26798 468 281 aa, chain + ## HITS:1 COG:no KEGG:BT_0253 NR:ns ## KEGG: BT_0253 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 281 1 281 281 551 100.0 1e-156 MKNMLTLIVLLSSWLTVYSQESYTEAELKQKCDSIVEEANTLYRYETAAWNFTDMIFAKP GLIETIQNILTYHQGDSIKCVAIDKQSYCIYEATFLNESAPCSEVTTRRNLTEQEILLAK IKEKIRSELADHEKYPIYRYDNYPLNWDLIPFADGYKFYAISGVSKGRAIPFGNDYLFIA NKEGEIQSWKKFHSRLIPVEATEQMPMIAFPVHSHLKYEPFISATDICTFRLYYNKTGST KFAVYSTALSMYFIYELATNTITPTKDINFNSAPLSEKVAP >gi|226332317|gb|ACIC01000003.1| GENE 23 26807 - 26953 83 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRVYKATCIRKRKQKSQTTEEKHQTQIYSHIYKEQYTFTNAFLFSFQS >gi|226332317|gb|ACIC01000003.1| GENE 24 27240 - 29102 1560 620 aa, chain - ## HITS:1 COG:PA4928 KEGG:ns NR:ns ## COG: PA4928 COG1032 # Protein_GI_number: 15600121 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Pseudomonas aeruginosa # 9 611 23 667 747 527 42.0 1e-149 MKEYRLTDWLPTTKKEVELRGWDELDVILFSGDAYVDHPSFGAAVIGRILEAEGLKVAII PQPNWRDDLRDFRKLGRPRLFFGISAGSMDSMVNKYTANKRLRSEDAYTPDGRPDMRPEY PSIVYSQILKRLYPDVPVILGSIEASLRRLSHYDYWQDKVQKSILCDSGADLLIYGMGEK PIVELTRKMKELLPAEDASLTAGELKKIAGTIPQTAYLCRATEWTPAADDIQLYSHEECL ADKKKQASNFRHIEEESNKYAASRITQEVDNKVVVVNPPYPPMSQEELDHSYDLPYTRLP HPKYKGKRIPAYDMIKFSVNIHRGCFGGCAFCTISAHQGKFIVSRSKKSILNEVKEVMQL PDFKGYLSDLGGPSANMYQMKGKDEAICKKCKRPSCIHPKVCPNLNSDHRPLLDIYKAVD AIPGIKKSFIGSGVRYDLLLHQSKDAATNRSTAEYTRELIVNHVSGRLKVAPEHTSDRVL SIMRKPSFEQFETFKKIFDRINREENLRQQLIPYFISSHPGCKEEDMAELAVITKRLDFH 
LEQVQDFTPTPMTVATEAWYSGFHPYTLEPVFSAKTQREKLAQRQFFFWYKPEERKNILN ELRRIGRQDLIDKLYGKRNK >gi|226332317|gb|ACIC01000003.1| GENE 25 29187 - 29411 237 74 aa, chain - ## HITS:1 COG:no KEGG:BT_0255 NR:ns ## KEGG: BT_0255 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 74 1 74 74 142 100.0 3e-33 MITVDTCGTTAYSPLIPAIKAICEAPFGETLEIIMNHADAFQDLKEYLSEQSIGFREIYD GEQMTLQFTINEKF >gi|226332317|gb|ACIC01000003.1| GENE 26 29411 - 29728 422 105 aa, chain - ## HITS:1 COG:no KEGG:BT_0256 NR:ns ## KEGG: BT_0256 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 105 1 105 105 186 100.0 2e-46 MYTIQANPSGTRSIEVSSENLKTIEKYALFRHLIDSTGIVDEPVLDKLKLNIRSLIASQE EDSKELLDLCIDVIYHNNMKAFGLQQLIKLYLTWLSTQDAEEEEE >gi|226332317|gb|ACIC01000003.1| GENE 27 29853 - 30884 849 343 aa, chain - ## HITS:1 COG:no KEGG:BT_0257 NR:ns ## KEGG: BT_0257 # Name: not_defined # Def: xanthan lyase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 343 628 970 970 689 100.0 0 MVYTRIGYGGFDNGTLVSKPYYSVKVEPGLVYSFKVTAVNRGGESFPSEILSAYKAKRER ERILIINGFDRISGPAVINTPDKAGFDLEQDPGVPYLSNISFCGAQSGFNRSQAGKEGEG SLGHSGRELEGMEIAGNTFDYPFIHGKAIQAAGKYSFVSCSDEAVENGIVTLEDYPIVDY ILGLEKEDPIAKAYYKTFSSPMQRLITSYCQSGGHLFVSGAYVGSDMSGTQGNREFTEKV LKYGYQNSLTDKSSGQINGLGRSITIPRLPNETSYAVTAPDCIVPVAPAFPVFTYARGNQ SAGIAYKGADYRTFVLGFPFESIQSETDRASIMAGILGFFTQK >gi|226332317|gb|ACIC01000003.1| GENE 28 30739 - 32766 875 675 aa, chain - ## HITS:1 COG:no KEGG:BT_0257 NR:ns ## KEGG: BT_0257 # Name: not_defined # Def: xanthan lyase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 604 1 604 970 1224 99.0 0 MKKIIVFLLCLTMGTSLLFAQDIERNIKERLTDYFAKYTATAKISTPKLNSFDIDYDRRT IQIYASESFAYQPFLPETVETIYNQMKELLPGPVNYYQITIYADGKPIEELVPNLYRSKK KDKERMALNTEYKGAPWVKNTSRPNEISRGLQDRHIALWQSHGNYYKNDKGEWGWQRPRL FCTTEDMFTQSFILPYVIPMLENAGAIVYTPRERDTQKNEIIVDNDTPNASLYLEVGSKK ARWTTTSVKGFAQKKAIYKDGENPFTDGTSRYIQTEKKKKKNKDQAFAEWVPTLPATGKY 
AVYVSYQTLPNSVSDAKYLVFHNGGVTEFKVNQKIGGGTWVYLGTFEFDKGNNDYGMVVL SNESSEHGVVCADAVRFGGGMGNISRGGKISGLPRYLEGARYSSQWAGMPYDVYAGRKGE NDYTDDINTRSNTINYLSGGSVYNPGQTGLGVPLEMTMALHSDAGCSKDDEIIGSLGIYT TDFNNGKLNSGMDRYASRDLADILLTQIQKDIRTNYNLPWTRRSMWNRNYSETRLPATPS TIIELLSHQNFADMQLGHNPNFKFTVGRAIYKGILQFINSQHGKDYVVQPLPVSNFAIHL RKKEKHTGTYLERGRRSAGTDSPSTGIYGIYTHRIRRIRQRNSGQQTLLQCQSRTGTCLF VQSDSRKPGRRKFSF >gi|226332317|gb|ACIC01000003.1| GENE 29 32995 - 34986 2051 663 aa, chain - ## HITS:1 COG:BS_nagB KEGG:ns NR:ns ## COG: BS_nagB COG0363 # Protein_GI_number: 16080555 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Bacillus subtilis # 42 279 7 238 242 168 38.0 4e-41 MKTNLSSQITLHRVSPRYYRPENAFEKSVLTRLEKIPTDIYESVEEGANYIACEIAQTIR EKQKAGRFCVLALPGGNSPSHVYSELIRMHKEEGLSFRNVIVFNMYEYYPLTADAINSNF NALKEMLLDHVDIDKQNIFTPDGTIAKDTIFEYCRLYEQRIESFGGIDIALLGIGRVGNI AFNEPGSRLNSTTRLILLDNASRNEASKIFGTIENTPISSITMGVSTILGAKKVYLLAWG ENKAAMIKECVEGPISDTIPASYLQTHNNAHVAIDLSASMNLTRIQRPWLVTSCEWNDKL IRSAIVWLCQLTGKPILKLTNKDYNENGLSELLALFGSAYNVNIKIFNDLQHTITGWPGG KPNADDTYRPERAKPYPKRVVIFSPHPDDDVISMGGTLRRLVEQKHEVHVAYETSGNIAV GDEEVVRFMHFINGFNQIFNNSEDLVISEKYAEIRKFLKEKKDGDMDSRDILTIKGLIRR GEARTASSYNNIPLDRVHFLDLPFYETGKIQKNPISEADVEIVRNLLREIKPHQIFVAGD LADPHGTHRVCTDAVFAAVDLEKEEGAKWLKDCRIWMYRGAWAEWEIENIEMAVPISPEE LRAKRNSILKHQSQMESAPFLGNDERLFWQRSEDRNRGTATLYDQLGLASYEAMEAFVEY IPL >gi|226332317|gb|ACIC01000003.1| GENE 30 35126 - 35911 486 261 aa, chain - ## HITS:1 COG:STM0308 KEGG:ns NR:ns ## COG: STM0308 COG0388 # Protein_GI_number: 16763691 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Salmonella typhimurium LT2 # 1 258 1 255 255 231 46.0 9e-61 MDSIRISIVQTDIVWENKQENLRLLHEKLQSLCGTTEIVVLPEMFSTGFSMQSDMLAEAN SGETITTLKQWASLFQVAICGSYITVDNGRYYNRAFFLTPEGEEFYYDKRHLFRMGREAE HFSAGDERLIIPYRGWNICLLVCYDLRFPVWSRNVANQYDLLIYVANWPIPRRLAWDTLL RARALENQCYVCGVNRVGIDGYRLKYNGGSKIYSALGEEAASVPDETEGIATATLHLTAL 
HQFREKFPVWKDADEFQLRHS >gi|226332317|gb|ACIC01000003.1| GENE 31 36034 - 36588 354 184 aa, chain - ## HITS:1 COG:CC1900 KEGG:ns NR:ns ## COG: CC1900 COG0204 # Protein_GI_number: 16126143 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Caulobacter vibrioides # 11 177 16 182 196 129 42.0 2e-30 MKKAIYSFIYYRLLGWKTNVTVPNYDKCVICAAPHTTNLDLFIGKLFYGALGRKTSFMMK KDWFFFPLGIIFKAVGGIPVDRSRKTSLVDQMVHHFAECKKFHLAITPEGTRKANPNWKK GFYYIALKAQVPIVLIGIDYSTKTITSTKAIMPTGDLNKDMREIKLYFKDFKGKHPENFA LGEL >gi|226332317|gb|ACIC01000003.1| GENE 32 36597 - 36866 334 89 aa, chain - ## HITS:1 COG:no KEGG:BT_0261 NR:ns ## KEGG: BT_0261 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 89 1 89 89 168 100.0 5e-41 MASRRELKKNVNYIAGELFTECLINSMFIPGTDKAKADELMAEVLRMQDEFVSRISHTEP GNVKGFYKKFRVDFNAKVNEIIEAIGKLN >gi|226332317|gb|ACIC01000003.1| GENE 33 37224 - 37859 281 211 aa, chain + ## HITS:1 COG:no KEGG:BT_0262 NR:ns ## KEGG: BT_0262 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 211 1 211 211 435 100.0 1e-121 MDLDLSKRSKPCFKKNRMMSMGKVKNPFLKLCGIGLLTVICISVKAQKVNFTVNSKTGAI QSMNIDNDKQNMNWLIATDGSQYPWIKENYGWGLGYFTEVRRNQKNKLFWNLPASIKQDG REVTYRVGDICILVERSMRGEDLIEEYTFQNDGTEEILLSDIGIYTPFNDNYPGAQTCIN MRANAHIWEGDNAAYVNAIRMGDMLHIWDWF >gi|226332317|gb|ACIC01000003.1| GENE 34 37964 - 39367 743 467 aa, chain + ## HITS:1 COG:no KEGG:BT_0263 NR:ns ## KEGG: BT_0263 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 467 4 470 470 954 100.0 0 MPGDEQVFSWYIFSHKGEDDFRQKLLERESVWVSCNKYVFEKGETALVKISGGQMVKGCM LKKNDVTIPMKKQGTAWYAEVVMDQLGEVRFDILYGTGKKTHANCLVISSVDDLIKKRVE FIVANQQMKSSNTRRDAYMVYDNEKNEIYLNNTHNCNPVDRDEGAERVGMGVLLAKYYQL HPVAEVKASLLRYASFLRNRLQDADYKTFSSVDQKGRNRAYNYVWVADFYFQMYKITNDK QYAKHGYMTLRSMFKQFGHGFYAIGIPVRLGLQTLKNADMQREYQELENDYIAVGDTFLK 
NGLNYPASEVNYEQAIVAPSVMFLLQLYMETGRQKYLDGAKIQMPVLEAFNGKQPSYHLN EIAVRHWDGYWFGKREMWGDTFPHYWSTLSGAAFYLYSQCTGDHSYKERAENIVRNNLCL FFEDGKASCAYIYPNKVNGVKGGFYDPYANDQDWALVYYLLVQNGIY >gi|226332317|gb|ACIC01000003.1| GENE 35 39493 - 41241 891 582 aa, chain - ## HITS:1 COG:no KEGG:BT_0264 NR:ns ## KEGG: BT_0264 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 582 1 582 582 1162 100.0 0 MNIKSQQAKNSFLFIFIYLCSLSLQPAFATQSKEVITTDIWAATATVFKLVGEKAVNNVK LNWMQREDADLYRIYRDGNCIGEAKGNTYDDYNLKADKTFTYHVEAFKEGKKIATSASQQ ATTFTPNGETKVYDNLNGKYITKESAQKPQGMKIGELYFSYKMENVEKEVDGQTLKGWLA TESFSPTGLNGSWSTSRELAFYPNVKFEGIAFRYNAKTGKVVLSAHYEDQSGYVAAKIYL AQITPKGELEVGTMERPLGYDSRDQSLFIDDDGTAYLLSATNMNRDINIYKLDPSWTKPV LLVNTICKGLHRETPAIIKKDGEYYFFSSKASGWYPSQTMYTSAADLGGEWTPMREIGNN STFDAQFNRISTVGKTCGVWSYHWGAQRKYKTPAGNFPRISIAAFNKGYASMDYYRYLEF SDKYGIIPVQNGKNLTLNVPVTAAVPGARGIKADCITDGACTESSTYFQKSSNAATGSPY MFTIDMQKEAVISEINLSTRLVNGSEAAYKYTIEGSRDGKSYKMLVDGKLNWQVGFLILN IEDPSLYRYLRLRVYGVVNVHKNNSAMWADGIYEFAAYGKPQ >gi|226332317|gb|ACIC01000003.1| GENE 36 41283 - 42764 1007 493 aa, chain - ## HITS:1 COG:no KEGG:BT_0265 NR:ns ## KEGG: BT_0265 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 493 1 493 493 998 100.0 0 MRITLFICFMFSFTSIFSQNTQISPGVLWNDIDGEQINAHGGCVVYEKGTYYWFGEDRTG FKSNGVSCYQSKDLYNWKRLGLSMKTTGEAREDMNDISQGRLFERPKVIYNPQTKKWVMW SHWESGDGYGAARVCVATSDKIMGPYVLYKTFRPNKNESRDQTLFVDTDGKAYHFCSTDM NTNMNIALLRDDYLEPTPTETKILKGLKYEAPAIFKVGDMYFGLFSGCTGWEPNPGRSAY STDILGNWTTGNNFAVDKLKQVTYNSQSCYVFKVEGKEKAYIYMGDRWNSKDVGKSHHVW LPISMRSGYPVVKWYDQWDLTVFNSMYRYKRAAEIIPGNIYSLLEKTSDRLVSKPANGFS IADDDDDINLSLEFIKTNIPNVYKIKDTKTGKFLESLFGTLRLNPEKKDDAQCWVFNLQE DGYYQIQNLKDKKYVTVSGSNTFAGSNLYLTELSKKLMQDFAVYFDSNKYKYKEADIFSD AYKANNLKQMKAQ >gi|226332317|gb|ACIC01000003.1| GENE 37 42777 - 44237 919 486 aa, chain - ## HITS:1 COG:no KEGG:BT_0266 NR:ns ## KEGG: BT_0266 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # 
Pathway: not_defined # 1 486 1 486 486 935 100.0 0 MKKKYVKIMFLSLTLLTISTTATYAVQPKDDKKENEALQTSTEEHLPLLTINGTNASSSF IVNSEALLGKEITITAPNGFTVTPTVIAANSGKQKVKVTLNSTKRLTEGKIILRSGDVRS YLKVKGYGTALPVKDIAASPAYKKGNDKEFTKAFTPNSKGYTIEFKIKTDDSEKSFYPYF VNEKGYGFKAYITSNEIGLFNAYKKEITNPATNGKAGGKGKFYNNDGQAHIYRFAITPDN RAFIYRDGIPVDSVRIIDYAPQPNFAEKVGKPVENLLKNPNFEGEFDINPETKLVSRVEG WDVVISDRWNSEQQILPEEIDNNQDLDNHVFEIRPYKWAAGWSDGILMQVVDVVPNENYT LSALLKGGIAKKEGKLTGKMIIEEVQDPEKKVITEIASDNWETYSMDYTTSAECNQIRVS FTVGRGGWGNDIGAVRVDNAKLTGTSRTYSPKFGFIDNTADVEYFTIDESGAYAPAQPEI TINIED >gi|226332317|gb|ACIC01000003.1| GENE 38 44520 - 48632 1472 1370 aa, chain + ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 840 1102 5 263 294 140 36.0 2e-32 MNSDSYLYFNIAQEIRLYINSTMKHQLKFYIVLFFIQLFPCAFAQQMSVSTLPLNNYLPS STVLRVHCDREGFLWLGTKDGLCRYDGYRLLVFRSGLKSPDLLTNNEITCIAENKNGYLF IGTKKGINILDKRTYQIIPVTHNDLKDQEIRTMIVDSDGWVWVGTLTSVFRCSADFSYCK RYDSTLPVTSVNSIYEDADKNIWVTLWERGLHRYNSKTDSFIAMPQIGDLNNPFKVFQDN KKQHWILTWESGIFLLYPEEKDDLMYSHVDIKSNEHWDKNGCFSITQDDKYGYIWIVSTQ GLYALQKRPGNIINTVDISHISSKLNNIFSEIVKDKSGNLWIAAFNEGVSMIDLNKPLVQ NYSFPVIREKTGFVTNIKNIYEDKEGDLWIDQNRWGIGIYNPDSNKLLFYTDIPSLKNIA NLKNVSCITSVPLLNEIWLGSEYYPEIYKVKKDKKGVELLGTLKLTDYVDNSGFPRLFYT DSNHNLWVGTTKGILVKPAKEKILQDTKFPFVDIIGIREGKDGSLWISTRKQGVYNAKIS SDLTLEEKNLRNLKTHAEGVISDNIGAICVDDNGLVWMGSQDGDVFTYDPQTNKVENLSD MFDMLEEGIFNIITDQLGHIWISTNKRVIEYDPKNGGIMDYSTMTDVMVNSFMPNSYYKT RSGKILYGGNKGISVFTPYDHLSDNPRRIRTMVSDVKIDGVSSLLEKNNQRFNLRSQIIS LNAGDKNIEIDFSSLNYAFPDKIKYAYKMDGVDDDWVYVRGDRQFAFYNQLPKGKRTFYL KTTDVNGLWSNYIAEVQVFKQPAFYETLWAYLFYIVFTLLCLYLFYHRMKRRIQLRHELR IAQIDKEKSEELVQTKLRYFTNISHDLLTPLTIITCLIDDAEMTNGSRISQLTMIRSNVN KLRRLLQQILDFRKVESGNMKLSVSKSDVISFIDDVCKIHFTPLMRKKNQTFTFLTEDRH LMAYFDRDKLDKIVSNLLSNAYKYTANGGNIKLIVDSYWESENHHLRIQVVDTGEGIAPA DLENVFKRFYTINKGDESESNGIGLSLTKDLVELHHGTINVESELGKGSTFTVDLPINKD 
SYQEDELISEHISANGINTDLILEKEALADSQVGEDTQIADVHLLLVEDNEELLFLMEKI LSKHYHVLIAKDGLEALNVIKDNEIDIIISDVMMPEMDGLEFCRALKSNLETSHIPIILL TAKNTVEDRIECYNAGADGYISKPFELKILEARINNFIMHKKNKQEEFRSNVEVNIDSLE PSSIDKEFLDKVISVINSNMSEGDFDVVQLADALAVSKSSLYRKMKLATGLSPIEFIRNI RLKHGSQLLKDKSISVAEVAYECGFSNPKYFATCFKEEFGVTPKEYQKSC >gi|226332317|gb|ACIC01000003.1| GENE 39 48907 - 51079 1238 724 aa, chain + ## HITS:1 COG:no KEGG:BT_0268 NR:ns ## KEGG: BT_0268 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 724 1 724 977 1354 100.0 0 MVIILPNFFLQKPVCRLLAVVGFLGFSGAIASQSSSPANIDSVKTYMKKNSFEKNVGSRF VTNRSIDSFGVSDTVDIKMLQRSQFLSIQQLLKGNVPGVYVQENNGEPGTIQSMLVRGLS SPVFSNKDVSSVQPTVYLNGVPLMLENSYVYDIKQFDINPIGAAANMLAGLDISSIESIE IIKDPLQLAKLGPLAANGAIWITTKDGYYGGENVSIGVSAGMAFAPSSVRMTNGSYEKAF RQRFYDTYSLTPGKQPYYLMDTRDVRYFGKPDWADDYYQSSLLYNVNASIGGGSKKANYV FTLATTKDAGAADNTSYTKYNIGFALNMVPLDGLNVSTIINAAKIDRVRNRNFRDRFAEM EYMVDFSTPLAPAGNLYNDFLISNDLTKDDNYNNLLNGLLALSYTKQRFNATASIKLDYT TNVRRAFWPMALLESVNFVSNYSGYNQRIIGETSASYLLPLADIHKLNVQWNGSITSDLY HYNYTRAYDGDDDMKPTTSTGNFKQYRYVDRLENRWVSTSIALDYKYKNLLNVGLLARYD GNSAIQSDHRWMFTPAASAEWNLKNHFFTGSTALSGLSLRASYARIAKSFQSDRYELGPQ YLATSITWSGEPLLSSANGFATITRPYASGWVGYDLKLPYSDKMELALKGSFFDNRIIAE ISLYKNYEKNLLTYLPVTQEMGYEYKLASGMDISNQGLELNLSASILKNTPLKWDLSFNA SYNK Prediction of potential genes in microbial genomes Time: Wed May 11 23:56:11 2011 Seq name: gi|226332316|gb|ACIC01000004.1| Bacteroides sp. 1_1_6 cont1.4, whole genome shotgun sequence Length of sequence - 52287 bp Number of predicted genes - 33, with homology - 32 Number of transcription units - 12, operones - 5 average op.length - 5.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 78 - 701 375 ## BT_0268 hypothetical protein 2 1 Op 2 . + CDS 735 - 2273 744 ## BT_0269 hypothetical protein 3 1 Op 3 . + CDS 2287 - 3081 557 ## BT_0270 hypothetical protein 4 1 Op 4 . + CDS 3059 - 4099 527 ## BT_0271 hypothetical protein 5 1 Op 5 . 
+ CDS 4123 - 7293 1927 ## BT_0272 hypothetical protein 6 1 Op 6 . + CDS 7339 - 8829 856 ## BT_0273 hypothetical protein 7 1 Op 7 . + CDS 8856 - 9578 528 ## BT_0274 hypothetical protein 8 1 Op 8 . + CDS 9599 - 11284 1010 ## BT_0275 hypothetical protein 9 1 Op 9 . + CDS 11301 - 13130 882 ## BT_0276 hypothetical protein 10 1 Op 10 . + CDS 13144 - 15261 1341 ## BT_0277 hypothetical protein 11 1 Op 11 . + CDS 15273 - 17045 924 ## BT_0278 hypothetical protein + Term 17091 - 17133 7.4 + Prom 17102 - 17161 5.4 12 2 Tu 1 . + CDS 17303 - 17944 206 ## BT_0279 hypothetical protein + Term 17946 - 17979 -0.5 + Prom 17957 - 18016 1.9 13 3 Tu 1 . + CDS 18058 - 19275 796 ## COG3328 Transposase and inactivated derivatives 14 4 Tu 1 . - CDS 19161 - 19376 70 ## - Prom 19414 - 19473 7.2 - Term 19459 - 19495 -0.2 15 5 Tu 1 . - CDS 19525 - 20166 475 ## BT_0281 putative DNA-binding protein + Prom 20381 - 20440 8.8 16 6 Op 1 . + CDS 20468 - 21763 784 ## BT_0282 hypothetical protein + Prom 21880 - 21939 7.0 17 6 Op 2 . + CDS 22010 - 23461 409 ## COG4886 Leucine-rich repeat (LRR) protein + Term 23577 - 23614 1.2 + Prom 23512 - 23571 7.0 18 7 Op 1 . + CDS 23692 - 25110 1181 ## COG0811 Biopolymer transport proteins 19 7 Op 2 . + CDS 25094 - 25540 265 ## BT_0286 hypothetical protein 20 7 Op 3 . + CDS 25545 - 25970 327 ## COG0848 Biopolymer transport protein 21 7 Op 4 . + CDS 25945 - 27468 1054 ## BT_0288 hypothetical protein + Term 27679 - 27722 7.1 - Term 27662 - 27715 11.8 22 8 Tu 1 . - CDS 27726 - 30416 1080 ## BT_0289 hypothetical protein + Prom 30449 - 30508 8.1 23 9 Tu 1 . + CDS 30734 - 33073 1811 ## COG1874 Beta-galactosidase + Term 33134 - 33187 17.1 + Prom 33209 - 33268 5.3 24 10 Tu 1 . + CDS 33313 - 34242 363 ## BT_0291 integrase + Term 34379 - 34430 -0.9 25 11 Op 1 . + CDS 34634 - 35383 423 ## BT_0292 hypothetical protein 26 11 Op 2 . + CDS 35388 - 36620 745 ## BT_0293 hypothetical protein 27 11 Op 3 . 
+ CDS 36659 - 38089 1277 ## BT_0294 hypothetical protein + Term 38173 - 38224 12.6 + Prom 38091 - 38150 11.5 28 12 Op 1 . + CDS 38357 - 39250 739 ## COG1131 ABC-type multidrug transport system, ATPase component 29 12 Op 2 . + CDS 39266 - 42571 2061 ## BT_0296 putative xanthan lyase XalB precursor 30 12 Op 3 13/0.000 + CDS 42592 - 44088 1475 ## COG1538 Outer membrane protein 31 12 Op 4 27/0.000 + CDS 44096 - 45157 1058 ## COG0845 Membrane-fusion protein + Term 45164 - 45197 -0.7 32 12 Op 5 10/0.000 + CDS 45206 - 48322 2800 ## COG0841 Cation/multidrug efflux pump 33 12 Op 6 . + CDS 48361 - 52285 3157 ## COG0841 Cation/multidrug efflux pump Predicted protein(s) >gi|226332316|gb|ACIC01000004.1| GENE 1 78 - 701 375 207 aa, chain + ## HITS:1 COG:no KEGG:BT_0268 NR:ns ## KEGG: BT_0268 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 207 771 977 977 414 100.0 1e-114 MNGKPLNINGVPFKAGDPVWTDVDGNNQINDNDRVLTGHAIPPYTGGLTNQFAYKGFDFS FNLFFALGHSALNLRDQQRYDFATLDNIQSLQSVKEIFFWQNTNQRDDYPIYNPQSSVHP YRAEQDLFLEKLSYLKLRNITVGYTLPIKKVGSNIPKSLYFYLTGSNLLSFSNFSAGDPE LVNFNGTYDGYSLPIPRSISLGLRFKF >gi|226332316|gb|ACIC01000004.1| GENE 2 735 - 2273 744 512 aa, chain + ## HITS:1 COG:no KEGG:BT_0269 NR:ns ## KEGG: BT_0269 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 512 1 512 512 1022 100.0 0 MKNNMKKLRLALFSALVCLFLQSCNDFLDVDPKHAASETQQWKTLEDTRSALMGVYGLTR AALADNNTHWICGDLRKGDFTVYKRSDLQAVSDNELNKPYDLLKKVSNWRRFYAVINAAS VFMEKAPRTVELDRSYSEQNLKYDIAQVRALRAFAYFYMVRIWGDVPLVTYSYDNGTFPS MPRTDAQTVLSYAKAELLTAIEDLPYQYGTQTNLYYGSYGAQWQGKLFNKLSAYSVLAHI CAWQGNYAEAETYSAFIIDHASEINAKYTSIADLTSETGLFYSNASVKGSRILGFNFAHN DNEATQSGHLEQLTLAYPLVQKSYPEIYISKDSLFSIFTNFDDLRFGIIDTIKYSSYYVQ NLNEETPVFSKIKIIQDGSAKDNDFGVFGSSIVFTRLEDITLLRAEALCALNRSTEAVSY LNMIRTNRGLREVSFKKDFGNNRESLIAEIFEERRRELMGEGWRWYDLVRRQKLMKDNEA FLRLISSGGIYWPVSEDIITANSQIEQNEFWK >gi|226332316|gb|ACIC01000004.1| GENE 3 2287 - 3081 557 264 aa, chain + ## HITS:1 
COG:no KEGG:BT_0270 NR:ns ## KEGG: BT_0270 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 264 1 264 264 512 100.0 1e-144 MNKILFILCLTSASLWMLSCTDYEVASYPEENETEVFNGTVLDYLSTGNERLNLKFDSMM VLVNNIPDFIQQMEQTDVQYTVFAIPDACIRSSLAQLNEYRKQKELGEAIYLSDLLIEPF VVKDTIVNVITPTLNDTIINEYHYDYRADLERMLCKYIIKGSYDTDNILANEGNNSLNSL KYNYQMNIECSRKPASGFVGGGVKQLIFSDMKNSQVKDNWNRVSTVWNDVYTNNGIIHIL SPQHSFGFDEFIYVFNNYGNEYKK >gi|226332316|gb|ACIC01000004.1| GENE 4 3059 - 4099 527 346 aa, chain + ## HITS:1 COG:no KEGG:BT_0271 NR:ns ## KEGG: BT_0271 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 56 346 1 291 291 577 100.0 1e-163 MAMNIRSKNIALLFSCVLLSISCVDKYLPDSLDAFDRDVNFTTKLYRPQLGKNTLMSDNF SSGNSTLPLTFEISRIVRADGSPAPELTEYFPVKVWKTPYMGTEKSIEEIEAKREIEYRT LFQVKKHSGEFMMWSNAESSFVQCAPSDGYIFDVLVKNSGGYKTFTDMQLIPVRESDYEP SIYDPETGLVQGQDYVTPNSLTLFQTESGDYMFPEDVHIYFRENQDNDDDVKSLTFRFYG PDYTPISPSSFNQTDWANLIHGFNMEKTDEYVKYDVVYPMPLVEMKSKYTNKDGNRINVN FLYDRITASGYRMTSTMSFEFAIYKEAHWEIIVVFTAGAPLFEDGK >gi|226332316|gb|ACIC01000004.1| GENE 5 4123 - 7293 1927 1056 aa, chain + ## HITS:1 COG:no KEGG:BT_0272 NR:ns ## KEGG: BT_0272 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1056 1 1056 1056 2040 100.0 0 MRNIFKYGLLAMAIAVTSVGVRAQDIPVSGVVKDKSNNSTLPYASVIVKNESGKTVPQLS TTTDDNGRFSTKVKSGYRLVFSFLAFDSLSVKVTKPSERMQIYLSPTENMLDETVVVGFK RVSKAAVTASVTVIKAEDLVNTPVANPMELLQGRVPGLNIQMNNGTPGGLPSFSIRGVSD ISVQSSGDGEFMMGLTPPLFVVDGIPQEDVTGYDAAGLLSGATVSPLAMIPLEDIANIQV LKDAAATSLYGSKGAYGVILIETKRGETAKPKVSYSANFVVKTPPRLRDVLAGGAERALR IMQILGNDTSAYHGYNEIHKLQALSDSLNPYYNNNTDWQGVFYQTTYNQTHNLSFSGKPN EKFDYKINANYYTEKGIVKNTDFNRYGIRAAVGYKPNDRFHLDLNLATTLTRNSNGSGNA FSQSGVAAGSAASSLLPPPSMYTASNSALSVFSVQADNQNVVYDASMNIMYLLPFDIRWR STLGYKYGTVENEKFTPGILNANRATWNNGSEYSYNMYVRSLLSRTVKLGIVNFDLQGGF ELSSEKYSGNFIMLNGLASDHIWSSGMPSMAAGRSNFSDKKNTFALIIDPQLSLPGGKYV FTPNIRPEINSSYGSQAKWAINPSLGFRWNFSRESFAKKWKFLDAGALRVTWGRSTTYKA 
SIYDIWGSYNLSKDTYNGVSIIPIDKNAMPNPDLKPVTSTSWNLGTDLSFLNNKIMFVAE AYYKQIDNQLSSIELANHNAFNSVRSTKTSLVNYGLEFSLNVRPLSRQSNWDLNVATSLA INKDVIAKLPNEVRQIINSDAEVVNKLGSNAMGNYLYVYKGVYATDEDVPVNPLTGERLR MGGNTSTQAYFKAGDPIWVDVNGDYIIDEKDKVIVGNSQPRMTGGISINLRYKAFSINTN CSFTLRRDIINKALADRFRAYGTPVAGKVNLTGSGALTPIEAYNFWTEDNIYAQYPNPFD YTRSSIIQPFRYDQTLFMEDGSYFKINGISVAYTIPKKMLDFFRISRCQLNFSMNNIYTF SKYSGINPENVNNLGYDTSGGYPNGRTVTFGVSMDF >gi|226332316|gb|ACIC01000004.1| GENE 6 7339 - 8829 856 496 aa, chain + ## HITS:1 COG:no KEGG:BT_0273 NR:ns ## KEGG: BT_0273 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 496 8 503 503 979 100.0 0 MTLIVSCLFAGCDSLANKSEDKLSGDDFWAQGNETNAEAFLLSIYNSFRNATMSQRPFLT YSGDMRCAPITAYSTGDKYVAYLANNDMGELRNTYPDDARGGLIMQWDVFYTAIQDANIL LAEIDKVPGMDELKRSRFKAEAIFMRSLSYFFIVRAFGDVPYYTNAYNEAPLPRTNMVIV LQNCLADLQPLLDDDPGAEVLPWSYSSYSSKGIRASRGSVIALMMHINLWLVQFDAQNKE QYYRNVVSLGEELERNNGAYSLLDINRSSVIFAGGSDEGLFEIAQNINFNEIFMMNAKFS DNVSYSCLNKSMPLFCYSGDYLMTLFPMYEDDARKELWFDEKIYSTSVSSSAPKEIKKFW NIDTYGNGTITSNSGNQIVFRYAGALLLYAEALAALGTNDTKACELLNRVRNRAHASEIN TSGSELMDAIFWERCRELIGEGHYYYDLVRTGKVYNRNYCMNPMTRTNFNVGAWTWPIHR NALKNNTQIGLNLFWE >gi|226332316|gb|ACIC01000004.1| GENE 7 8856 - 9578 528 240 aa, chain + ## HITS:1 COG:no KEGG:BT_0274 NR:ns ## KEGG: BT_0274 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 240 1 240 240 488 100.0 1e-137 MKNNIKLWKKIVGCCLLAVLMYNCADDSYLIDGGKANPYYNGNMMQYLESRPDYFKDLVE VIRLSGMEDVFEDEQITFFAPTDWSIRGSFNYLNRIWYRMGHDSIKSFSQIKPEVWREML SMYIVKDKYLLKDIPQIDTTAIAAYPGQAFLSYGQQPMNMGVVYYDANNVKYAGARQIIY SYVYDFTIGDMKNAYVATSDIQPTNGVVHVLRLTDHAFGFEPYLFATKAINATIETEPDN >gi|226332316|gb|ACIC01000004.1| GENE 8 9599 - 11284 1010 561 aa, chain + ## HITS:1 COG:no KEGG:BT_0275 NR:ns ## KEGG: BT_0275 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 561 1 561 561 1114 100.0 0 MKHITIYSLMIVCSVFTSLLVTSCNSDLNSNIKDDPYSGGKAPLGIGLLAESPSPESAYP 
NDTVVFKAKGMLNWCDPQSGRYDFKFYISDEETKIVTATDTTITVKVPGNLSSGTAYIVL KEQVFYGPRLTVLGNIKIDQSYGFKGTSGPIYDCAEHYSKARVYYPVGDYMQAYYNENSS QSFSCISMVLTDGSVSGKWVTDFKLDPGQGAGIDLANPGVTDIECYLNSFTYFPSDHRVL LSGKFSEYGWDKLPVNNITIATNEVASYYKTVALPSKKNNTSINCKIPVFNGGTLEAPVR TFITSNEKVVAVGNITNYCRINTEKSYAESMVLDYSKVASVLRMSRTGELDDSYRRDAEG VIGQILDACMVESDGIVIVGTFSSFDGQSVKNIVKLNAEGTLDETFMKNIGTGANGSITK IRYNKNKKKILITGEFSEFNGIPAQSVVMLNDDGTRDEIFKIGKMEGGLANFACLLDNDN IVVSGAFTKYDGVTRRGFLILGRDGKALQQFNVPGIFQGELYKVIETRTSTNSNGLLLLG DFSRFDGNMVRNAIMIEVDYE >gi|226332316|gb|ACIC01000004.1| GENE 9 11301 - 13130 882 609 aa, chain + ## HITS:1 COG:no KEGG:BT_0276 NR:ns ## KEGG: BT_0276 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 609 1 601 601 1209 100.0 0 MEKLIKNIMLCLFAVFAFSCNEEMERLLTDDYPESGTEYTTGHVLMVVIDGASGIAVNEA YNTRKAPVIRSMTDKSLFTFYGLADNEEGVSKERGWANLLTGTTKNGLDVNQGIDELETP SFLQRLKEADENLKVSFYSSDTEFFGAFGSVANTKRKTSADSETADALIAEINGETISDI VVAQFGGVQQTGEQYGFWSNETTPTSEVIDAIYNVDAFIGKIMKALEARSRYVQENWLVV ITSSYGGVYEGNVTPASLYDDPRLNSFMMLYNSRFASKLLQRPGSDELQYKFYSPYFVGP GQKATTNAVMTGNTEFFNMGKRWSSNPDEEVDKSGYTIQFKMFDKHGDPWGNRNIISKRY QKAGTGWQLMFSNNTQVEFAANFKEGSSVFRLPSTRRDGSWHSYTLVIKEVDDTQKGDSI LFYLDGKYQMGCKIDGSKDMWTDAPLAIGHTFSPDRPNQANLYINDLQFYNVALPADIIA EYHCTTKLDLLGELYPYWNNLVGYYPNDREDDIGLSYLEDYSKYTSKDKRLYFNKPVGTF APTDDYVTRRQESIVTDKVCPLIDDSYYKTVLNTVDLSYQIFQWFGEPVDSSWNFDGEGW ALDYVSLNN >gi|226332316|gb|ACIC01000004.1| GENE 10 13144 - 15261 1341 705 aa, chain + ## HITS:1 COG:no KEGG:BT_0277 NR:ns ## KEGG: BT_0277 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 705 1 705 705 1406 100.0 0 MKCFNKLKADYAVLLSFISCMFFMGSCNIAYQYDIESGTEDDTADSTNVTIVTGEGIDVS MYESARVFPGLVDTLVDNTVNTLLALDLSKRYIPAYDLDVQQVPRPIYSTGLYAGAGELI TITINDNTMGLTVIIGSHLDDLTDISPYLRLPVVTTSKQLFPGKNTIRNPLGGMIWIEKS KDVNGSADFVMEINGAYRSPDFIVGSTDVTAWVEQLRTTTVPWLELRGRHVAFSVQRERL LDMINDDPIIAEKMPNTLEAWDNAVETYYYNYYSLQVGAQDFSMRAPDFPERVVLDVELL 
DNLYIRNADYGVVALNTNYLLNELASYQTLKSGNSVAIFNALYRNYSFRDIKSPWWSEVS DAVKAIPLYRMAEKGLREDGYPMGPIFPEEGSSIAEQFPKALAYADTDSSRWFVSDIKSE VRPTYALASLVQLANYKDDDWAFYIELNRMIKDKISIDHSTSTYFFKALCDYFKEDFSPF FEHWGYSLTDEARSYASKYPLMDKAIWKYSPLAENPSSGVVDFPSNGYHYRHNRADWEIY ALDANGSDNFNSGQSPDRLLDSDKSTGWTSYYNNDTSTPPLPYYLFIDMKETTDVNGVFL AGNNSDFAATKLIIETPVNQDIINDPLDENIVWREIVQIDSIAHPELPNIKSEGFYDFPS KEQVRYLRIKLPEPNRNDNWNETDMSRFRFRVQSFTEFGTFYYKP >gi|226332316|gb|ACIC01000004.1| GENE 11 15273 - 17045 924 590 aa, chain + ## HITS:1 COG:no KEGG:BT_0278 NR:ns ## KEGG: BT_0278 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 590 1 590 590 1174 100.0 0 MNKLLNILTYSLLVVCVLAFNYACTEYDTPAEIADDGTIDTGISTKIHRKVLWINIDGAV GEVVKNSLPADGAIAKMLKNSKYSWTGVSDNRTLSVERNEDPVTWATMLTGVIPEKHSIT DESYTANVEYNPNNPNEKVIHYQNIISYISNNDVNMLSLCVTPWAKLNKNMLNNAKTTIT SENDVQTRDVVLNHIANEDYTFILADFSGMLEAGKSGGFKADNAAYVSALKTIDGYIGEF LSAIDARENAFYEDWLIVVTSNHGGSADGRYGGTSEVERNTFGLFYYNHYTEKQLNGNRL YGAYFDSQNEYKAVVFDSIGKYYSLGMDAFSMEIIMRMVPRQDGTYNGNNWDRILGKAGW GLYRQRGTVSMRTNPKEGPALEQAITGYNDSKWHHYGVSISSAASGRRNWLITLDGKLQG SGTTDTQGLAPDSSLLAIGGYSVPTSYYISEVRLWKKDLAEREFLQLSGEIDIEPSGDML GYWKFKPTEQLEKLEEDTLVIKNQIQGGLNMYYIKDKTSSSPDIEQKASAMFANTLPTNV NAERICVENTLIVPQILYWLGISVPSTLDGYVFINNYALSEEWREEVDTE >gi|226332316|gb|ACIC01000004.1| GENE 12 17303 - 17944 206 213 aa, chain + ## HITS:1 COG:no KEGG:BT_0279 NR:ns ## KEGG: BT_0279 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 213 38 250 250 425 100.0 1e-118 MITQAISQLTPQNKLLVEEINKNVQDTILERLNMNIAGRWNDSSLSLTLMKSTIKFVPVI DSPSRLYLVNDSEKVFRIPEKFACFYGQNDNGETIYFYAIYHAENFMKDTNPKSYYQGYV EVFGKEAADKMVESSIKRTEAERWEIMSFTPQKNETKKFEYAREHSDDGTFFILTRENTY PHICFFKDKKPYYCWGANQDELSMEPLENYLKP >gi|226332316|gb|ACIC01000004.1| GENE 13 18058 - 19275 796 405 aa, chain + ## HITS:1 COG:YPO0011 KEGG:ns NR:ns ## COG: YPO0011 COG3328 # Protein_GI_number: 16120364 # Func_class: L Replication, recombination and repair # Function: Transposase 
and inactivated derivatives # Organism: Yersinia pestis # 14 401 7 398 402 342 42.0 6e-94 MDRNSEEYQQMRDKALSQLRSGESLTGKDGAFGPLLKEFLEAALDGEMSSHLDDSERLKG NKRNGRGSKRVKTMAGEIDIVTPQDRHSSFSPEILKKRETILADNMSSKIISLYGMGLSL RDISSHIEEMYDVEISHNTLSEIIERIVPKVKEWQSRPLESMYTIVWLDAMHYKVKDGGR TESRAVYNVLAVNKDGHKELIGMYVSESEGANFWLSVLTDLKARGMKDVLIACIDNLTGF AEAISTIFPEVIIQNCVIHQIRNSLKYIASKDQKEFMADLKTVYQAPNKDLAELNLDKLQ DKWGKKYPVVLESWRRNWDNLSAYFAYDEHIRRLIYTTNAVEGFHRQVRKVTKTKGVFPN DMALMKLIYLAVMNISKKWTQPLQNWALTAGQLRIKFGERMPLAI >gi|226332316|gb|ACIC01000004.1| GENE 14 19161 - 19376 70 71 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCARAAKNEPFIYPCSHPGTTFSLRLKSKKGEMGYIARGILSPNLIRSCPAVNAQFWSGC VHFLEMFITAR >gi|226332316|gb|ACIC01000004.1| GENE 15 19525 - 20166 475 213 aa, chain - ## HITS:1 COG:no KEGG:BT_0281 NR:ns ## KEGG: BT_0281 # Name: not_defined # Def: putative DNA-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 213 1 213 213 391 100.0 1e-108 MAVQFELYKTPMPKEKKNKTRYHARPVSFETVNTEKLAYRIHDRSTLRVSDIISTLEELK NEVAQCLLEGKKVHVDGLGFFQVTLSCEEEIRNPKDKRVHRVKLKAIKFKADKELKGELC HMKFQRSKIRPHSANLSEVEIDMKLTEYFAENQIITRKDFQYLCGMTQITAYRHIKRLMA EKKLQNKGTIYQPIYTPVPGNYRVSVDLKYKEQ >gi|226332316|gb|ACIC01000004.1| GENE 16 20468 - 21763 784 431 aa, chain + ## HITS:1 COG:no KEGG:BT_0282 NR:ns ## KEGG: BT_0282 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 431 1 431 431 868 100.0 0 MEQKNGWFAFWQKLMKCLSKQDETGVGDTKSVAPVNTIIQSIKGVSSSKEKDSSKVKSLT ESVDTFLKSRYDFRYNVLTEETEYRLLERMEEGFAPVNQRVLNTFCLEAHESGISCWDRD LSRCIFSTRIAEYHPFKLYLEELPVWDGVDRLAALARRVSRDQLWVKEFHIWMRGMTAQW MGVTGSHANSVAPLLISTEQGYLKSTFCKSLLPPALQRYYMDKVDLTSQGHIERRLAEMG LLNLDEFDKYAPTKMPLLKNLMQMASLSLCKAYQKNYRSLPRIASFIGTSNRKDLLTDPT GSRRFICVLVEYPIDCEGIDYAQLYAQLKAEILAGEHYWFTKEEETELQQSNMTFYRQGP VEDVLRSCYRAAEKGEGCELLSSADIFQRLKKINPAAMRGANPASLAQILIAVGIERKHT KFGNVYRVVSV >gi|226332316|gb|ACIC01000004.1| GENE 17 22010 - 23461 409 483 aa, chain + ## HITS:1 COG:lin0354_1 KEGG:ns NR:ns ## COG: lin0354_1 
COG4886 # Protein_GI_number: 16799431 # Func_class: S Function unknown # Function: Leucine-rich repeat (LRR) protein # Organism: Listeria innocua # 92 289 115 291 292 75 32.0 2e-13 MKNIILIVSLLLMGVSSRADILNSSDNKNVADKFDPMLKAALFQMKIIADTSSILLSDID YVEDLELKTSLWINNDVNRAKTTRHEITSLKGLEYFSSLRRLKLVGNRTDTLECIPDFSR FKELRELEIIQIYLPKLDISGCKNLTNLSCRSCDLNTIDLKKNIQLRTLDCTYNRLIELD LSHNKQLRTVIVKSQRNKGLLSEGGRAGILSKLILPDNRSNTEGISILECSDNNLEKLDI SRSPHLRKINCSWNKLKALETSHNQELRELNCEANYIGELDFSANLKMEFLTCGLQGYEW IRQEGNDYRLLKKLILPEQKENVEGGSLRKLSIKEISEEAIPVFSDYPYLKELNCADNRL KTIDLSANIYLEVLDCSSNAISQLDLHSNVNLYSLNILNCPLQVLDLSQTKVKLIMCDFY DRREKGVKNGYISGLYLKLIVPKGYHAETINFRTPTLETASFENGSIYNYLPPQYVRMVV SGK >gi|226332316|gb|ACIC01000004.1| GENE 18 23692 - 25110 1181 472 aa, chain + ## HITS:1 COG:FN1312 KEGG:ns NR:ns ## COG: FN1312 COG0811 # Protein_GI_number: 19704647 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Fusobacterium nucleatum # 256 449 5 196 202 97 30.0 5e-20 MNCTKMKKLFRKGTLSILLLGVITFFTYPLSAQTNSRSGKGLYEIFSTKIPDGAQWKAQD NKFVRDGGYSKKSENYIYVGASKSPAELTGLSIPIRENPGPGEFRYITFAWVKWGGDQIG MKFHVSEKSANQKGKKYDFTYIAGKSKDLINPMEGLNLGELPGHWMVMTRDLWKDFGNIT LTGISFICPERRDAGFDEIFLAKTQDDIKNAPKVLPSEIATPVPVDGEEEGLVYEEGTTN EDEQPQGVQIDWAAQIKAGGFMMYPLYLLGLLALVIALQRLLTSREGRLAPKGLRKTVSE CLAQGDLKGAIAACDKYPSTLGNSLRFIFEHVKAGREAVSQTAGDMAARDIRTHLSRIYP LSVIASLSPLIGLLGTVVGMIEAFGLVALYGDEGGASILSDSISKALITTAAGLIIAAPA IALYFIIKNRIMKLASLVEVEVENVITTLYLEGDSTDMNTPGSKEEKHETKL >gi|226332316|gb|ACIC01000004.1| GENE 19 25094 - 25540 265 148 aa, chain + ## HITS:1 COG:no KEGG:BT_0286 NR:ns ## KEGG: BT_0286 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 148 1 148 148 289 100.0 2e-77 MRLNYDEEDKVEVQMSPLIDCVFLLLIFFLVTTMMKKWEMQIPLSLPSMTSSLSTTRAGE EAVIIAVDEEKNVYQVVGHDAYTGENHYVPISDLNTFLTELRNSEGTEIAIDVAAYRTVP VNTVIEIFDQCQIQGFTRTRVRLGSKPY >gi|226332316|gb|ACIC01000004.1| GENE 20 25545 - 25970 327 141 aa, chain + ## 
HITS:1 COG:alr0644 KEGG:ns NR:ns ## COG: alr0644 COG0848 # Protein_GI_number: 17228140 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport protein # Organism: Nostoc sp. PCC 7120 # 1 119 1 121 217 58 26.0 3e-09 MRLSLFNNEDEPEVSMSPLIDCVFLLLIFFLVSTMTKVKNRDISVDLPTSESAIKLKPDD KQAIVGLDAEGNFYWDGQPCSTNFMMEQLRETCISDPGRRIRIDMDKNTPFGRFVEVMDA CQFYNLTNIGIRTYDENYNRE >gi|226332316|gb|ACIC01000004.1| GENE 21 25945 - 27468 1054 507 aa, chain + ## HITS:1 COG:no KEGG:BT_0288 NR:ns ## KEGG: BT_0288 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 507 1 507 507 937 100.0 0 MTKTTTGSSRLTVILSWVIALVCLVGGALFIWLMPSPGQGAIDLQAKKQKSPKSISDKIK QEQAQRELPANYSEQLVKQSEVLVKKNLEDQLRKFQEMAKKMRQRKNELLTKVEQRKLPR TAPDDANDTSKARNIPQAGKLSANPSVDDMYALLREYEAEIQQNHLAANAAKQALSKGLS FPEVYSSLKTGSSRMPSFDELIRMQTKGGEWATSAGSNASAGLEIKSTADLNNYRGLLGQ ATRQAGLAQSRLENLFGVVKQVGKPGGGFGIGNGQPGGGEGSGNGNGFGGMGEAAQGSGN RRPMNAYAGPRLNQEMVKAQALPGRRFSKSADRKGWLYINTWYMIGPWESFGREDFSIVH PPEISVDFDAVYTDGQVGTGVMETDSHPIKVIGEEVYLDGTLRWKFMQSESMHNTVPVTT GHSTYYAYTELYFDEATTMLVAIGTDDSGRVWINGKDVWRDYGTSWYNIDEHITPFQFRQ GWNRILVRLENGGGGACGFSFLIIPQE >gi|226332316|gb|ACIC01000004.1| GENE 22 27726 - 30416 1080 896 aa, chain - ## HITS:1 COG:no KEGG:BT_0289 NR:ns ## KEGG: BT_0289 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 896 1 896 896 1578 100.0 0 MNKTTFILLITLIISYINTAYIQAQTQRVTVQGSILEKDTQFPVEQATIQLLSLPDSNYV KGISSLEQGKFSLTTLPGRFLLKISFIGLATEYKPLQINTTQTIVQLGKIELKPDAVLLN ETVIVAQAPPVVVTGDTTAYNTSAYRVSEGAALEELVKKLPGAEINEEGKLIINGQEVKK IMMNGEEFFLNDPNTVLKELPADMIDQLKTYQKKSDMARITGVDDGKEEMVLDLRVKPSM RKGWNGNFEAGYGNDDHYRAKGMANRFKNKMNLSINGTANDNGINSNQRIGANYSQNTQK LKYGGSIEVRENKRDSWSKRHTESFLSDNTSQFSLQNNQSDGKTSAISANFRFEWKLDSL TTIMYRPTFNFSKGNSNSGNFSETLNNESTPINRKEATNHQTNDRFSTNGSLQLNRKLNS KGRNIALRLYYDLDDGNSDRYSLSNTYYLKYGDSIKTLNQWIEKLDKNNKYQVQITYMEP 
VFTNRFIEINYSYQHRSSLSEKYAYDWDKQEDTYSQYPDTAHSDCYKNKYSTHQTGIFFR TIRTNYFYNIGIEVEAQKTTNKSYMRDTTFEYLSRNAINYSPTINFKYTFSKQTTLKINY RGRTAQPGMTDLYRKKDISDPLNIRLANPELKPSFNNNISATFNTYFTETSRSLNANLSY NNTMNSTTRIVTYDENTGGQTTQPTNVNGNWRINGSLTFSTPLSNKKFTVNTHTNTNYGN SVGFTVLNKESDAVKSNTTNLFLSERLKGSYRSDLIDFELSAEVRYRKSEHSIKKENRRE TFDYTFGTEANLNLPWDIKISSNINCRLKRGYGGKNDTNRTLWNAQISKSFLKKKKATVR FHLYDILRENESSERSISENSITDRESSTSNVYFMAYIAYRFNTFGQKRKRQSVQN >gi|226332316|gb|ACIC01000004.1| GENE 23 30734 - 33073 1811 779 aa, chain + ## HITS:1 COG:XF0840 KEGG:ns NR:ns ## COG: XF0840 COG1874 # Protein_GI_number: 15837442 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Xylella fastidiosa 9a5c # 1 606 1 602 612 422 36.0 1e-117 MKKPLLYLLILVVAVLGSSCSQSSEGTFEVGKNTFLLNGEPFVVKAAEIHYPRIPKEYWE HRIKMCKALGMNTICLYVFWNFHEPEEGRYDFAGQKDIAAFCRLAQENGMYVIVRPGPYV CAEWEMGGLPWWLLKKKDIKLREQDPYYMERVKLFLNEVGKQLADLQISKGGNIIMVQVE NEYGAFGIDKPYISEIRDMVKQAGFTGVPLFQCDWNSNFENNALDDLLWTINFGTGANID EQFKRLKELRPDTPLMCSEFWSGWFDHWGAKHETRSAEELVKGMKEMLDRNISFSLYMTH GGTSFGHWGGANFPNFSPTCTSYDYDAPINESGKVTPKYLEVRNLLGNYLPEGETLPEIP DSIPTIAIPTIKMTEMAVLFDNLPHPKESEDIRTMEAFDQGWGSILYRTSLSASDKEQTL LITEAHDWAQVFLNGKKLATLSRLKGEGVVKLPPLKEGDRLDILVEAMGRMNFGKGIYDW KGITEKVELQSDKGVELVKDWQVYTIPVDYSFARDKQYKQQENAENQPAYYRSTFNLNEL GDTFLNMMNWSKGMVWVNGHAIGRYWEIGPQQTLYVPGCWLKKGENEIIILDMAGPSKAE TEGLRQPILDVQRGNGAYAHRKMGENLDLTNETPVYQGIFKSGNGWQHVKFGKKVETRFF CLEALNAHDGKDFAAIAELELLGEDGKPVSRQHWKVIYADSEETDAANNIATNVFDLQES TFWHTNYSSSKPAFPHQIVIDLGEDKVITGFSYLPRAEAGKTGMIKDYKVYLKMQPFKI >gi|226332316|gb|ACIC01000004.1| GENE 24 33313 - 34242 363 309 aa, chain + ## HITS:1 COG:no KEGG:BT_0291 NR:ns ## KEGG: BT_0291 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 309 13 321 321 608 100.0 1e-173 MYIGRLRKEGRYSTAHVYENALLSFRGFCGTPTVSFGQVTREHLRCYGQYLYERGLKLNT VSTYMRMLRSIYNRGVESGRAPYVHRLFHGVYTGVDVRQKKALPATELHKLLYGDPKSDI LRRTQAIAALMFQFCGMSFADLAHLEKSSLDRNVLHYNRVKTKTPMSVEILDTAKDMIYQ 
LQNRKPALQDCPDYLFSILNGNKKRKDESAYREYQSALRNFNNHLRGLAKALHLTSPVTS YTIRHSWATTAKYRGVPIEMISESLGHKSIKTTQIYLKGFALKERTEVNRMNLSYVKNCL EKSVNNIKY >gi|226332316|gb|ACIC01000004.1| GENE 25 34634 - 35383 423 249 aa, chain + ## HITS:1 COG:no KEGG:BT_0292 NR:ns ## KEGG: BT_0292 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 249 1 249 249 518 100.0 1e-145 MKQTEAILGDMRFIPLKGVVLILILILTTSRVHAQYALGATGQMMIPTAEMQETGTFMGS ANFLPEEVTPNKFNYPTMNYSVDMSLFSFVELTYRMTLLKMRTYNGRVGYHNQDRSNTIR IRPLKESRYFPAVVIGADDLFTEKATQYWGDYYGVLTKTFGFCKGHQLAVTAGWYFHQGN QPAFKKGPFGGIRYTPAFCRELKLMAEYDTNGWNLGGAIRFWKRLSVHVFTREFTCVSAG LRYECTLIH >gi|226332316|gb|ACIC01000004.1| GENE 26 35388 - 36620 745 410 aa, chain + ## HITS:1 COG:no KEGG:BT_0293 NR:ns ## KEGG: BT_0293 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 410 1 410 410 859 100.0 0 MKRVFSILYIICCCTLESVDAQSAGISEVLKELQMENISVAQKQDTITAAFETSVYRGSY NGIGIAIRRLITLPEVSTLQLVILDNALPQLCITLPAKLIEDYQSGKCDLDGVYCSMQMT TSTKTAMEALKGTKRESTTFGKVDVVVYPGVMLVNNVTYKLYKAAFELQPAVEMQLWKGA SLRLQVCLPIVNNEPGKWDCIRLGYLTLRQEFRLDNHWKGYLTGGNFSDDRQGLAAGIGY FSSDGRWTVEGEGGITGSSHLYGNDWGMSKWKRVNGQLSVGYFIPQVNTQLKVSGGRFIY GDYGVCGILSRYFGEYVVGLYGMYTDGETNAGFHFSIPLPGKKRSRHAVRVMLPDYFAFQ YDMRSGNEFARRALGVSYRTEPKSAENSRFWQPDYIRYCLIRTNEKTKLK >gi|226332316|gb|ACIC01000004.1| GENE 27 36659 - 38089 1277 476 aa, chain + ## HITS:1 COG:no KEGG:BT_0294 NR:ns ## KEGG: BT_0294 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 476 1 476 476 774 100.0 0 MRNLKWLYACSLAIAFGVLSFVTVSCHDDDDEPKQEPGEVIETPAPVVEYYIMGTVTDAG KGVSGVDVKIGTETIKTDKDGKFSVTEKNVGKYSVEVAPKGYLAQSTSVEIAANAENRSV VTVAVALTKQSEPTTVEVGETGNTEEVKVEDKSASNEEVKEPGTVEPEDVKEDLPLVTPE LEIPAGAIQTEGNEDVLEGGNAEVSVTTYVPAPETVTTEVKKEEENKEVEKTIPLAAAHF EPSGLTFAKDHYPTISIPNPIPGLTFADDMVLTYLENGKWVPQTEDINKVVYDEKSGSYT TKVKHFSSYAMENKVTSKVSTENVVKSEILGQASCDNSENPKAVTGIALTYKEKSGWECK 
DADIKAAVEKALSGAKPETVNAMVAFFKTRLYSLMGSASGITETTRTYNTVNVNGYTTMT YTCYAKTRTTTLSTTVKYNNKDVKISVTATRYTGTDHQYKTVTTNPTHSGGKGGSN >gi|226332316|gb|ACIC01000004.1| GENE 28 38357 - 39250 739 297 aa, chain + ## HITS:1 COG:all2672 KEGG:ns NR:ns ## COG: all2672 COG1131 # Protein_GI_number: 17230164 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Nostoc sp. PCC 7120 # 3 283 2 278 293 256 45.0 4e-68 MSIIIKNLNKIYPNGNHALKDVNLEIPAGMFGLLGPNGAGKSTLMRILVALMEPTSGQVE ICGYDLMKQRKEIRGILGYLPQDFRFFAKYKTYEFLDYAARLSGMTHSRQRKQAVDEMLE NVGLFDVRERYANKLSGGMKRRLGIAQALIHHPKVIIVDEPTTGLDPEERIRFRNLLSEV SENDVTIILSTHIVGDISSTCNNMALMNRGQVSFNGSPQDMLKLAEGKVWRIRAAGDQLH EIDKKYPVISTIPSGTIWEVQVVADEVEGYEAEPFPPNLEHAYVYYMENQLNLWTND >gi|226332316|gb|ACIC01000004.1| GENE 29 39266 - 42571 2061 1101 aa, chain + ## HITS:1 COG:no KEGG:BT_0296 NR:ns ## KEGG: BT_0296 # Name: not_defined # Def: putative xanthan lyase XalB precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1101 1 1101 1101 2214 100.0 0 MTFISNIQSVAKYESKLLIRSWFFRVFTVLAVTIITFFNFQLFVSEDSGGFWIATAIPSN IPYLILLLLNTGQAVIAIFLASDFLKRDKKLDTSEVFYVRPLSNAEYVIGKIWGNLRVFL LLNLIIMAITAAFNLTLGEVDWMAYLLYFLLISVPTLIFIIGLAIFLMLVLKNQALTFVL LLGYIGLTVFYIEDKFYYLFDYMAYSLPLVKSSIVGFSNWEVILNHRGIYFLAGLAFVFF TISLFRRLPNSSHSNYPWLVISFCTLMLAFVCGYWHIHSILYQSDIRATYTKINNQYVST PKMFIHEYDLSVEQHLEDFLSEVTVKGVALDSSAVFTFCLNPGLTIHSVHSAGQQLKFKR DKQIVLVDFGTKHAKGDTISATFRYDGQIDNSFCYLDIPPEVLQASNKKFLFNIDKQYSF QTKNYLMLTPETYWYPRAGTSYSDKNPDWQQTYFSNYRLKVKPLPGLVPLSQGEGKCDEE GVYSFKGDYPSQTLSLVIGDYQQKSAYADSILYSVWHLRDHDYFTASFDSIHDTIPWLIR NVKEQLARQYKLDYPFKRFSIVEVPVQFASYERAWSQAQETVQPEMVLFPEKGAIFWELD VKRQVKNHIRWSGNNEISLQEAQMRTFSNFAWMFLQTEGDYNFSSGSRGRGNLSSTANPY FLFPQIYNFRYNIYSSEWPVANRAIELYLQKKSENEGWERQINGLSNNEKANLLLEKHTF KELLADVEQRNLQNNIVGLEASRLFARSEINMGVDAFRDSLYAMLKRNTFLNIKFENMLD TLGEMSRTDLYSYLKEWEQLTPLPFYSIGEPELTKVVNKGGEEFFVLKVLVSNNSDYDGI IQMNVLQNGWWYQPMEDPRARKKVEIAAHTSKEFVSVWEEQIRDVEINTMVSGNLPYMIR 
QSIGNVKQERNKIVKDTIYTLPESSLEMPGEVIVDNEDSTLFVLSTPAVVGLLPKWLDKV EDTSFKYSGISWWRAPLQWTATTNAKYYGKYIRSAYVIKSGDGSQTATWKIPVPEAGQYE LYYHVFKDDELRWNDRLQGEYHFRVAYDSEMEDAYINLRKANEGWEQLGTYYFSADTVRV VLTDECKLRSVTADAVKIVKR >gi|226332316|gb|ACIC01000004.1| GENE 30 42592 - 44088 1475 498 aa, chain + ## HITS:1 COG:PA4592 KEGG:ns NR:ns ## COG: PA4592 COG1538 # Protein_GI_number: 15599788 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Pseudomonas aeruginosa # 23 467 41 468 493 73 21.0 6e-13 MQSDLLLLKRMLLTIVTFTGIALTSQGQIPLTIDKAMEIAQENSPSLRRSYMNLERYKQN LLAQKASLKSRFSLNLNPVDYNKNRRFDNRLSQWYTNETFNTSGTFQVDQPILWTDGTIS LINRFGWQDNSSIIEGDKTSNRAFSNDLYLQLTQPIFTYNKRKMELKQIEYDYENANISY ALQQLNTEKSITDQFYSVYMAQSNLEISREELVNAQQSYDIIKNKVEADLAAKDELFQAE LNLATARSSVDESQVNLENAKDKLKQTLGMRLDEDILVFAEVEIKPIQVNLDQAISHGLG SRLELRQREIESKELEFDMIKTKALNEFKGDVSISFGLIGDNSHLNKVYNNPTQNPRVSI SFAVPIFDWGEKKARIKAQKMAQRINELEFQEDKVDIELNIRQVWRNLENLRTQIKIAEQ NVQNAQLTYDLNQTRYREGDITGMEMSQFQTQLSNKKITYTQALINYRIELLNLKILSLY DFDKNIPLVPMKDIIDKK >gi|226332316|gb|ACIC01000004.1| GENE 31 44096 - 45157 1058 353 aa, chain + ## HITS:1 COG:VC0165 KEGG:ns NR:ns ## COG: VC0165 COG0845 # Protein_GI_number: 15640195 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 35 351 43 353 368 89 26.0 9e-18 MKQYQIFIASSLLTTLLLTGCGNKPQNAPNNIATPVSVVELKKGSISKLINTTGTVQPTY GASQNAEMSGFYKLQTNPRTGKPFKMGDRVSKGELIIRLEDKEYENGIAIDAKKLSLEIA EQEQSKQKALYEKGGVTMSEMRNTEVKVTNARYDYENAKLNLEKMKVKAPFDGVIVDLPH YTADTRVEQGKAMVSLMAYDKMYMDINLPESSIRYVKEAQPVYITHYTIPGDTLKARVGE LSPAISTETRTFKGKILIDNDQLKLRPGMFAKADIIVDRADSSIIIPKDVILSNRRRKYV YIVEKNTAKIRNLQTGLEDEYNIEILSGLNVNDNLVVKGFETLKEDAKVKIQK >gi|226332316|gb|ACIC01000004.1| GENE 32 45206 - 48322 2800 1038 aa, chain + ## HITS:1 COG:BH3816 KEGG:ns NR:ns ## COG: BH3816 COG0841 # Protein_GI_number: 15616378 # Func_class: V Defense mechanisms # Function: 
Cation/multidrug efflux pump # Organism: Bacillus halodurans # 4 1033 1 1014 1093 427 31.0 1e-119 MKNIIKFAVGNPVTICMVVFALLLLGKVSYDQLSVDLLPDLNNPRLFIELKAGERPPEEI EKQFVKNMESMAIRQSDVTQVSSVIKAGTARITVEYTWTKDMDEAFLDLQKAMNPFAQNK DITELKITQHDANLSPVVLVGMSHQSITDMAELRKIAESYIRNELIRLEGVAEVTLSGEE VSTLTIQTDPYKLDAFQLKIEDIASRIESNNQSISGGRVSELGLQYLVKSSSLFASEDDF ENLIVGYKAIQQEEASGNNASGTATANSNKAPVFLKEVATVQFLNARPENIVRINGKRSI GLSIYKEMRFNTVKVVDEVTRQLAVIENALPGYHFQVISNQGTFIKSAIGEVKSSAVLGV ILAIVVLFVFLRRMGTTLIVSLAIPISIVATFNLMFFNGLSLNIMTLGGLALGAGMLVDN AIVVIESIFRNQEKGLSIKEAAINGTAEVANAVIASTLTTIVVFLPIVYLHGASGELFKD QAWTVTFSLVSSLFVAILVIPMLYIQLSGKKVKLEEVKSIRITGYSRVLRKLIQRRWLVI GMAVLLLIVTGLLTPFIGTEFMPRAESKTFTAVIKMPEGTQMERTSAAVGNLEELLYAIV GGDSLCTVYSHIGEGSGSENAIFEGENTAMMKVILSPECTLSPEKVIASFVESAKNPDGL ELTIQQDENSLSSLLGSEGAPIVVEVKGEELDEIAQLTEEVKERMIGVNGIYDVITSVED GAPEVVISIDRTIAGINNLSVAMVIEQLKQQLSGKEVGKMEYRGEMRDIVIKVPDIPLSS LGALVIKSGTQEFVLQEIVTITHGQAPKEILRRNQSRIGKVMANMDASKSLDKVAAEVRE TVKGIELPANYSITVTGEEEKRQESMNSLLFALMLSVVLVYMVMASQFESLLHPFTILLT IPLAVVGAILLFFITGTTINMMGVIGIVMLGGIAVNNSIILVDRINQLSQAGMELTDAIV EAGQQRIRPIIMTTLTTILAMLPMTFSFGEGASLRSPMAIAVIGGLITSTLMSLMVIPCV YYVLENIKRRINRRTKNS >gi|226332316|gb|ACIC01000004.1| GENE 33 48361 - 52285 3157 1308 aa, chain + ## HITS:1 COG:FN1275 KEGG:ns NR:ns ## COG: FN1275 COG0841 # Protein_GI_number: 19704610 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 1015 16 992 1020 296 24.0 1e-79 MLFIALSLLGYVSYKQLPVELLPNAELPVLFVQVSSQQDMDPSYVESEVIIPLEGAVSAI GGVDKLQSYIDRRQSSIQIDFKKNINFKMTSLKLQEKVNEIAASLPTGFTVQVQKVDIAQ MNNNFMVLQVRGTGGTDRVRSLVDEDVRSDLENIDGVASVNIYGGRQKAIEVRLNPEACK ALNLTASKISGLLSQNTQEKTFVGFADEPDSKNFVHVNAMYTKVSDLENIVVAPGPILLK DLATVFFDLKDETTYSRVNGKEAVSVALINDSQANLIELSHRVSDVIEKLNEKLAPLELE IVIQENKAETMENNINQIINLALVGGLLAVFVLWLFLKNMRLVFFIALSIPISVYTAFNF FYAFGITINSLTLVGMALAIGMLLDNSVVVLENIYRLSGNGYTPERSVTQGTKEVWRSIV AATLTTITVFLPFVFSDNFLIKLIGHHIGVSIISTLTISLFVALLFIPMITYLLLKSKSG 
NSVVYEKVSIVQRPMQVYLVLLKTCMRNPGVTVFGAVILLFATLILSLTLNVQQMKEVAS DRFNISVVMPTGSTLENTDKIVKVLEERLEDFPEKKDLICRIREKEATLTLVLQEDYTKI AKRKIADIKADVQTKVANINGAEIYVSDAMGGGGDNSALSSLGGFMRLLGIGDNSERVVI RGSDFEMMQMVAEEVRYLLDEQDFVQHTHVSYTPRQAEINLNFDPILLTAYDINRGNITT GLTALNNEYSSEATFKVGEDIYDIIIRDEIPEKETEEEAQEKAQKEKTVDDLRAIRIENA NGGMHNLEDISSLNYGRGRSRIIRVNQDKQIEVYYAFPRDVQSSKDLLGGYRSDIDQLIA SYNLPSGVAVQVFHEEDEFGDFKFLILAAFLLIFMILASVFESVVTPFVLLFTIPLAAIG SLLALLLSSNSLMNANTLTGFLILLGVVVNNGIILIDYANILRKRGYRRNRALMTAGMSR IRPILITSITTIAAMLPLAMGDTEYAGAIGAPFAITVIGGLFFSAMLTLILIPTVCMGLE NVLQWYRSLSRKLWTIHLILFVSGVICIWLYTDGMLWQSIYLVALIAGIPGMTYFAQTSL RRAKAEVINPDEEIRISVRNLVKIYDWPGHISRQWNSGLQLRKRLGLSNEYHSLKDFINV LWQFGILLFAIYFTYFFIHNRLWIFLFSFAIYAAVLYLWRKVRSYLYYRYGDNRVTKIVN RVIFWSLPPLILFQLFRKLDNNGLVIMIGLLWLVGIAIYVTSQYLYDHDVNIERVTGRFA GLRRSYFRMVKSVPMIGKRRKPFKALRGVSFEIQTGMFGLLGPNGAGK Prediction of potential genes in microbial genomes Time: Wed May 11 23:58:37 2011 Seq name: gi|226332315|gb|ACIC01000005.1| Bacteroides sp. 1_1_6 cont1.5, whole genome shotgun sequence Length of sequence - 59610 bp Number of predicted genes - 49, with homology - 48 Number of transcription units - 22, operones - 15 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 721 728 ## COG1131 ABC-type multidrug transport system, ATPase component 2 1 Op 2 . + CDS 737 - 1438 407 ## BT_0302 hypothetical protein + Term 1493 - 1545 10.0 - Term 1480 - 1532 10.0 3 2 Tu 1 . - CDS 1588 - 3804 2054 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily - Prom 3872 - 3931 8.7 + Prom 3780 - 3839 4.8 4 3 Op 1 27/0.000 + CDS 4076 - 5311 1265 ## COG0845 Membrane-fusion protein 5 3 Op 2 9/0.000 + CDS 5315 - 8419 3041 ## COG0841 Cation/multidrug efflux pump 6 3 Op 3 . + CDS 8419 - 9792 436 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 + Prom 9838 - 9897 5.1 7 4 Tu 1 . + CDS 9921 - 11567 1796 ## COG0205 6-phosphofructokinase + Prom 11606 - 11665 3.7 8 5 Op 1 . 
+ CDS 11725 - 12618 510 ## COG3757 Lyzozyme M1 (1,4-beta-N-acetylmuramidase) 9 5 Op 2 3/0.000 + CDS 12693 - 14036 699 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 10 5 Op 3 . + CDS 14020 - 14739 245 ## COG0095 Lipoate-protein ligase A + Prom 14742 - 14801 4.2 11 6 Op 1 24/0.000 + CDS 14826 - 16196 1223 ## COG0508 Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes 12 6 Op 2 . + CDS 16234 - 18270 1880 ## COG0022 Pyruvate/2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, eukaryotic type, beta subunit 13 6 Op 3 . + CDS 18295 - 18810 514 ## COG0716 Flavodoxins + Term 18999 - 19035 0.1 14 7 Tu 1 . - CDS 19061 - 19903 1490 ## PROTEIN SUPPORTED gi|29345724|ref|NP_809227.1| ribosomal protein L11 methyltransferase - Prom 20043 - 20102 5.4 15 8 Op 1 . + CDS 20055 - 21398 867 ## BT_0315 hypothetical protein 16 8 Op 2 . + CDS 21432 - 23036 1211 ## BT_0316 putative hemin receptor + Term 23079 - 23133 12.4 + Prom 23145 - 23204 7.2 17 9 Op 1 . + CDS 23349 - 26537 2833 ## BT_0317 hypothetical protein 18 9 Op 2 . + CDS 26557 - 28155 1374 ## BT_0318 hypothetical protein + Term 28163 - 28223 15.1 - Term 28214 - 28262 13.6 19 10 Op 1 . - CDS 28302 - 29105 629 ## BT_0320 hypothetical protein 20 10 Op 2 . - CDS 29143 - 29613 525 ## COG0590 Cytosine/adenosine deaminases - Prom 29658 - 29717 8.6 21 11 Tu 1 . - CDS 29735 - 30721 965 ## COG0673 Predicted dehydrogenases and related proteins - Prom 30750 - 30809 4.5 22 12 Op 1 . - CDS 30835 - 31737 844 ## BT_0323 hypothetical protein 23 12 Op 2 . - CDS 31762 - 32211 513 ## BT_0324 hypothetical protein 24 12 Op 3 . - CDS 32232 - 32693 492 ## BT_0325 hypothetical protein 25 12 Op 4 . - CDS 32690 - 33211 371 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 33389 - 33448 6.3 + Prom 33166 - 33225 5.6 26 13 Tu 1 . + CDS 33431 - 33568 63 ## - Term 33519 - 33562 7.1 27 14 Op 1 . 
- CDS 33654 - 34493 906 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain 28 14 Op 2 . - CDS 34511 - 36565 1812 ## BT_0328 hypothetical protein - Prom 36601 - 36660 3.1 29 15 Op 1 22/0.000 - CDS 36674 - 37216 586 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit 30 15 Op 2 . - CDS 37236 - 38000 734 ## COG1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit 31 15 Op 3 . - CDS 38013 - 38189 212 ## BF1647 hypothetical protein 32 15 Op 4 . - CDS 38200 - 39282 1310 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit 33 15 Op 5 . - CDS 39295 - 39522 297 ## BT_0333 hypothetical protein 34 15 Op 6 . - CDS 39557 - 39838 204 ## BT_0334 hypothetical protein - Prom 39879 - 39938 6.7 + Prom 39906 - 39965 7.6 35 16 Op 1 7/0.000 + CDS 40131 - 41684 1369 ## COG0714 MoxR-like ATPases 36 16 Op 2 . + CDS 41674 - 43116 1102 ## COG2425 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain 37 16 Op 3 . + CDS 43194 - 43676 405 ## COG2839 Uncharacterized protein conserved in bacteria + Prom 44324 - 44383 3.7 38 17 Tu 1 . + CDS 44446 - 46296 1346 ## BT_0338 hypothetical protein + Prom 46337 - 46396 2.2 39 18 Op 1 . + CDS 46418 - 48460 1663 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases 40 18 Op 2 . + CDS 48457 - 48663 153 ## BT_0339 alpha-glucosidase 41 18 Op 3 . + CDS 48673 - 50481 1740 ## COG5012 Predicted cobalamin binding protein 42 18 Op 4 . + CDS 50510 - 52081 1218 ## COG4146 Predicted symporter + Term 52160 - 52205 3.2 43 19 Op 1 . - CDS 52295 - 53068 291 ## BT_0342 hypothetical protein 44 19 Op 2 . - CDS 53097 - 54113 762 ## COG0407 Uroporphyrinogen-III decarboxylase - Prom 54193 - 54252 10.5 45 20 Op 1 . + CDS 54506 - 55351 410 ## BT_0344 hypothetical protein 46 20 Op 2 . 
+ CDS 55404 - 56114 398 ## BT_0345 hypothetical protein + Term 56242 - 56305 20.4 - Term 56238 - 56283 7.5 47 21 Op 1 . - CDS 56304 - 56738 472 ## COG0698 Ribose 5-phosphate isomerase RpiB 48 21 Op 2 . - CDS 56738 - 58747 2209 ## COG0021 Transketolase - Prom 58814 - 58873 2.4 - Term 58813 - 58861 -0.2 49 22 Tu 1 . - CDS 58924 - 59565 632 ## COG3534 Alpha-L-arabinofuranosidase Predicted protein(s) >gi|226332315|gb|ACIC01000005.1| GENE 1 2 - 721 728 239 aa, chain + ## HITS:1 COG:all2672 KEGG:ns NR:ns ## COG: all2672 COG1131 # Protein_GI_number: 17230164 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Nostoc sp. PCC 7120 # 6 235 56 282 293 184 40.0 2e-46 FEQSYGSIWINGLDTRIYREELQSLIGFLPQEFGTYENMTSWEFLDYQAILKGIVDGDLR RERLDYVLKAVHMYERKDEKIGSFSGGMKQRIGIALILLHLPRILVVDEPTAGLDPRERI RFRNLLVELSKDRIVIFSTHIIEDISSSCNQVVVINKGELKYFGDPSDMVEMANGKVWQF NIDKTEFEKVLDKSLVIHHIQEGDTIRVRYLSVGQPYEGAVEVEANLEDAYLCLLKNMN >gi|226332315|gb|ACIC01000005.1| GENE 2 737 - 1438 407 233 aa, chain + ## HITS:1 COG:no KEGG:BT_0302 NR:ns ## KEGG: BT_0302 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 233 1 233 233 370 100.0 1e-101 MTFKQLIQLLPKVVRYNLKIIFAGKFFWFLLAAFGFFALFMFQNAWKREEINEGLIYSIL MFPSMLLIFYPAVFGIQNDEDSRILEILFGIPDYKYKVWGVRLLMIYVAIFVILIIFSYI AILLLYPVNPLEMALQLMFPILFFGNMAFMLSTITRSGNGTAVFMIIIGIALMFLGKPDS FWNVLLNPFRVPDNVHPIIWEGILIKNRIFLLSASLVWMMIGLLYLQKRENFV >gi|226332315|gb|ACIC01000005.1| GENE 3 1588 - 3804 2054 738 aa, chain - ## HITS:1 COG:PA3339_1 KEGG:ns NR:ns ## COG: PA3339_1 COG1752 # Protein_GI_number: 15598535 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Pseudomonas aeruginosa # 27 297 22 299 308 201 40.0 4e-51 MKKQIFSTLVFAMCLILPFSVYSQEQRKKVGVVLSGGGAKGVAHIKALKVIEEAGIPIDY IVGTSMGSIVGGLYAIGYTPEQLDSMVRKQDWTFLLSDRIKRSAMSLTERERSAKYIVSL 
PFTKNPKAAMSGGIIKGQNLANLFSDLTVGYHDSINFNKLPIPFACVSANVVNGDQIVFH DGVLSTAMRASMAIPGVFTPVRKDSMVLVDGGIVNNYPADVAKAMGADIIIGVDVQNALK SADKLNSAPDILGQIVDLTCQTNHEKNVELTDTYIKVNVDGFSSASFTPAAIDSLMRRGE EAARAQWNSLIALKKEIGIPDNYVPKQHGPYSSLSNSRTVYVTDISFSGVEADDKKWLMK KCNLKENSNITTQQIEQAVYQLRGSHSYSSASYTLTDTPEGYHLNFLLEEKYEKRINLGI RFDSEEIASLLINATADLKTHIPSRLSLTGRLGKRYAARIDYTLEPMQQRNFNFSYMFQY NDINIYDEGERAYNTTYKYHLAEFGFSDVWYKNLRFGLGLRFEYYKYKDFLFKKPELTGL DVESEHFLSYFAQVHYSTFDKGYFPSKGTEFKAAYSLYTDNMAQYNDHAPFSALSGSWSS VIPVTRRFSIIPSIYGRVLIGKDIAYPLQNAIGGEVYGLYIPQQLPFAGVTNMELMDNSI MIASVKLRQRMGSIHYLTLTGNYGLTDSHFFEILKGKQLFGISAGYGMDSMFGPLEITFG YSNQTDKGSCYVNLGYYF >gi|226332315|gb|ACIC01000005.1| GENE 4 4076 - 5311 1265 411 aa, chain + ## HITS:1 COG:mll6731 KEGG:ns NR:ns ## COG: mll6731 COG0845 # Protein_GI_number: 13475614 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Mesorhizobium loti # 38 385 53 394 402 204 36.0 4e-52 MRLFFSKHELKLRRKRNIAAVVCVVIVLGVYWILTRPQKAAPEMPTVIVEPVIKDDVEIF GEYVGRIRAQQFVEVRARVEGYLENMLFAEGTYVNKNQVLFVINQDQYRAKADKARAQLK KDEAQALKAERDLKRIRPLYEQNAASQLDLDNAEAAYESAVATVAMSEADLAQAELELGY TIVRSPLAGHISERNVDLGTLVGPGGKSLLATVVKSDTVLVDFSMTALDYLKSKERNINL GQQDSTRSWQPNITITLADNTVYPYKGYVDFAEPQVDPQTGTFSVRAEMPNPKQVLLPGQ FTKVKLLLDVREGAILVPMKAVTIEKGGAYIYTMRKDDTVEKRFIELGPEVGNNVVVERG LAVGETIVVEGFHKLTPGMKVRISQPEAEVKDTTMVAGDSTTGMKDNAKGE >gi|226332315|gb|ACIC01000005.1| GENE 5 5315 - 8419 3041 1034 aa, chain + ## HITS:1 COG:SMa1662 KEGG:ns NR:ns ## COG: SMa1662 COG0841 # Protein_GI_number: 16263363 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Sinorhizobium meliloti # 6 1028 7 1030 1044 840 43.0 0 MKVSFFIDRPVFSAVISIVIVIVGIIGLTMLPVDQYPQITPPVVKISASYPGASALTVSQ AVATPIEQEINGTPGMLYMESNSSNSGGFSATVTFDVSADPDLAAVEIQNRVKLAESRLP AEVIQNGISVEKQAPSQLMTLCLTSTDPKFDEIYLSNFATINVLDVIRRIPGVGRVSNIG SRYYAMQIWAEPDKLANFGLTVQDLQNALKDQNRESAAGVLGQQPVQGLDITIPITTQGR LSTVGQFEDIVVRANANGSIIRLKDVARVSLEASSYNTESGINGENAAVLGIYMLPGANA 
MEVAENVKKAMEEISENFPEGLSYEIPFDMTTYISESIHEVYKTLFEALVLVVLVVYLSL QSWRATLIPVVAVPISLIGTFGFMLIFGFSINILTLLGLILAIGIVVDDAIVVVEGVEHI METEKLSPYEATKKAMNGLSSALIATSLVLAAVFVPVSFLSGITGQLYRQFTVTIVVSVL ISTVVALTLSPVMCSLILKPDSGKKKNIVFRKINEWLGIGSNKYVAAVTRTIKHPRRVLS AFGMVLIAILLIHRIIPTSFLPVEDQGYFKIELELPEGATLERTRVVTERAIAYLEKNPY IEYVQNVTGSSPRVGSNQGRAELTVILKPWEERKSTSIEKIMDTVEKHLREYPECKVYLS TPPVIPGLGSSGGFEMQLEARGEATFDNLVDAADTLMYYASKRKELTGLSSSLQSEIPQL YFDVDRDKVKMLGVPLADVFSTMKAYTGSVYVNDFNMFNRIYKVYIQAEAPYREHKDNIN LFFVKASNGAMVPLTSLGNASYTTGPGSIKRFNMFTTAVIRGAAAQGYSSGQAMEIMEQI ARDHLPDNIGLEWSGLSYQEKQAGGQTGMVMALVFLFVFLFLAAQYESWTVPIAVLLSLP VAALGAYLGVWVCGLENDVYFQIGLVMLVGLAAKNAILIVEFAKVQVDRGEDLIQSAIYA AKLRFRPILMTSLAFVLGMLPMVLASGPGSASRQAIGTGVFFGMIFAIVFGIILVPFFFV MVYKTKSKILKHKK >gi|226332315|gb|ACIC01000005.1| GENE 6 8419 - 9792 436 457 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 10 456 7 457 460 172 27 4e-42 MKLNKLYIAILLFAGILTSCKVGKSYVRPDLHLPDSLAQHQDTVSFGDQDWEEIYTDSTL RSLIDRALDHNKDMLIAAARIKEMAAQKRISTAALLPDIKGKVTAERELENHGGDAFKKS DTFEAQFLVSWELDLWGNLRWARSASIAEYLQSVEAQRALQMTIVAEVAQAYYELVALDT ELDIVKQTLKAREEGVRLARIRFEGGLTSETSYRQAQVELARTATLVPDLERKISLKEND IAFLAGEYPNRIARSRLLQEFNFPETLPIGMPSTLLERRPDIRQAEQKLIAANAKVGVAY TNMFPRLALTGGFGSESTSLSELLKSPYAVMEGALLTPIFGWGKNRAALKGKKAAYEAEI HSYEKTVLTAFKETRNAIVNFNKIKEVYALRANLERSAKSYMDLAQLQYINGVINYLDVL DAQRGYFDAQIGLSNAIRDELITVVQLYKALGGGWKQ >gi|226332315|gb|ACIC01000005.1| GENE 7 9921 - 11567 1796 548 aa, chain + ## HITS:1 COG:TP0542 KEGG:ns NR:ns ## COG: TP0542 COG0205 # Protein_GI_number: 15639531 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Treponema pallidum # 1 546 1 559 573 635 54.0 0 MTKSALQIARAAYQPKLPKALATGAVKAVAGAATQSVADQEAIKALFPNTYGMPLITFEA GEAVQLPAMNVGVILSGGQAPGGHNVISGLFDGIKKLNPESKLYGFILGPGGLVDHNYME LTADIIDEYRNTGGFDIIGSGRTKLETEAQFEKGFEIIKELGIKALVIIGGDDSNTNACV LAEYYAAKKYGVQVIGCPKTIDGDLKNDMIETSFGFDTACKTYAEVIGNIQRDCNSARKY 
WHFIKLMGRSASHIALECALQVQPNVCIVSEEVEEKDMSLDDVVTSIAKVVADRAAQGNN FGTVLIPEGLVEFIPAMKRLIAELNDFLAANAEEFSQIKKSHQRDYIIRKLSPENSAIYA SLPEGVARQLTLDRDPHGNVQVSLIETEKLLSEMVATKLAAWKEDGKYVGKFAAQHHFFG YEGRCAAPSNFDADYCYSLGYTASMLIADGKTGYMSSVRNTTAPAAEWIAGGVPITMMMN MERRHGEMKPVIQKALVKLDGAPFKAFAAQRDRWAVETDYVYPGPIQYFGPTEVCDQATK TLQLEQAK >gi|226332315|gb|ACIC01000005.1| GENE 8 11725 - 12618 510 297 aa, chain + ## HITS:1 COG:yegX KEGG:ns NR:ns ## COG: yegX COG3757 # Protein_GI_number: 16130040 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lyzozyme M1 (1,4-beta-N-acetylmuramidase) # Organism: Escherichia coli K12 # 70 292 48 265 275 186 43.0 5e-47 MSSRNNPISAVQKKRTASTTPKKRTVSSSRPSRAPKKAPVKQHYSMPGWLRNILAVAIVG CFSMVFYYFFIRPYAYRWKPCNGLKEYGVCIPSGYDIHGIDISHYQGKIDWGRLLQNKQT ATPLHFIFMKATEGGDHNDTTFQTNFANARNHGFIRGAYHFYIPGTDALKQADFFIRTVK LDSGDLPPVLDVEVTGRKEKKELQQGIKRWLDRIESHYGVKPILYTSYKFKTRYLDDSIF NAYPYWIAHYYVDSVKYQGKWDFWQHTDVGNVPGIKEDVDLNVFNGTLEELKKLTIK >gi|226332315|gb|ACIC01000005.1| GENE 9 12693 - 14036 699 447 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 3 447 4 449 458 273 33 1e-72 MNFDIAIIGGGPAGYTAAERAGANGLRAVLFEKKAIGGVCLNEGCIPTKTLLYSAKILDS IKSASKYGISAESPSFDLTKIMSRKEKTVKMLTGGVKMTVNSYGVTIVEKEAFIEGEENG LIRIICDGERYEVKYLLVCTGSDTVIPPIPGLSEVSYWTSKEALEIKELPETLVVIGGGV IGMEFASFFNSMGVKVHVVEMMPEILGVMDKETSSMLRMEYAKRGVTFYLNTKVIEVKPD GVVIEKDGKASTIKTEKILLSVGRKANITNVGLDKLNIELHRNGVKVDEYLQTSHPGVYA CGDITGYSLLAHTAIREAEVAINHILGVEDRMNYNSVPGVVYTNPEVAGVGKTEEELTKQ GIPYRVTKLPMAYSGRFVAENEQVNGICKLILDEADHIIGCHMLGNPASELIVIAGIAIQ KGYTVEEFQKNVFPHPTVGEIYHEVLS >gi|226332315|gb|ACIC01000005.1| GENE 10 14020 - 14739 245 239 aa, chain + ## HITS:1 COG:SP1160 KEGG:ns NR:ns ## COG: SP1160 COG0095 # Protein_GI_number: 15901025 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate-protein ligase A # Organism: Streptococcus pneumoniae TIGR4 # 1 239 1 242 329 149 35.0 4e-36 
MRCFHNTFTDIYFHLAAEEYLLKQETDSVFMLWQDTPSVVMGKHQSVQLEVNREWAEEQQ IQIARRFSGGGAVYHDLGNVNLTFIETVSRLPDFSLYLHRILDFLKLIGLPAKGDERLGI YLDGLKISGSAQCVHKNRVLYHCTLLYDTNLAALNKVLNPERDIETGVALPVYAVPSVRS EVTNISRYLPMETVDHFKAILFEYFSQKGCADTFSEKELEAIHKLRTEKYICEDWIFSR >gi|226332315|gb|ACIC01000005.1| GENE 11 14826 - 16196 1223 456 aa, chain + ## HITS:1 COG:BH2761 KEGG:ns NR:ns ## COG: BH2761 COG0508 # Protein_GI_number: 15615324 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes # Organism: Bacillus halodurans # 5 453 4 417 426 283 41.0 4e-76 MSRFEIKMPKLGESITEGTIVSWSVKVGDVIQEDDVLFEVNTAKVSAEIPSPVAGKVVEI LFKEGDTVAVGTVVAVVDMGGEEASDEETASGKETPESKENASSDAEKVSSQVAKAEERW YSPVVIQLARGANIPKEELDSIQGTGYEGRLSKKDIKDYIEKKKRGISEVSKAAIPTGDA LTASMPSSTGGAGSMTTSPVAASVQTPVAAPSAPSKQAPAAANIPGVEVKEMDRVRRIIA DHMVMSKKVSPHVTNVVEVDVTKLVRWREKNKDAFFRREGVKLTYMPVITEAVAKALAAY PQVNVSVDGYNILFKKHINIGIAVSLNDGNLIVPVVHDADHLNLNGLAVAIDSLALKARD NKLMPDDIDGGTFTITNFGTFKSLFGTPVINQPQVAILGVGYIEKKPAVIETPEGDTIAI RHKMYLSLSYDHRVVDGMLGGNFLHFIADYLENWKG >gi|226332315|gb|ACIC01000005.1| GENE 12 16234 - 18270 1880 678 aa, chain + ## HITS:1 COG:CT340_2 KEGG:ns NR:ns ## COG: CT340_2 COG0022 # Protein_GI_number: 15605063 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, eukaryotic type, beta subunit # Organism: Chlamydia trachomatis # 354 678 5 328 328 258 44.0 4e-68 MKKKYDIKTTDVETLKKWYHLMTLGRALDEKAPSYQLQSLGWSYHAPYAGHDGIQLAVGQ VFTLGEDFLFPYYRDMLTVLSAGMTAEEIILNGISKATDPGSGGRHMSNHFAKPEWHIEN ISSATGTHDLHAAGVARAMVYYGHKGVAITSHGESATSEGFVYEAINGASLERLPVIFVI QDNGYGISVPKSEQTANRKVAENFSGFKNLKIIYCNGKDVFDSMNAMTEAREYAISTRNP VIVQANCVRIGSHSNSDKHTLYRDENELEYVKEADPLMKFRRMLLRYKRLTEEELLQIEA ESKKELSAANRKALAAPEPDPKSIYDFVMPEPYQPQKYKEGTHQEEGEKTFLVNAINETL KAEFRHNPDTFIWGQDVANREKGGVFNVTKGMQQEFGEARVFSAPIAEDYIVGTANGMSR 
FDPKIHVVIEGAEFADYFWPAVEQYVECTHEYWRSNGKFAPNITLRLASGGYIGGGLYHS QNIEGALTTLPGARIVCPSFADDAAGLLRTSMRSKGFTLFLEPKALYNSVEAAAVVPEDF EVPFGKARIRREGTDLSIITYGNTTHFCLHVAEQLEKESGWKVEVIDIRSLIPLDKEAIF ESVKKTSKALVVHEDKVFSGFGAELAAMIGTDMFRYLDGPVQRVGSTFTPVGFNPILEKE ILPDEAKIYEAAKKLLEY >gi|226332315|gb|ACIC01000005.1| GENE 13 18295 - 18810 514 171 aa, chain + ## HITS:1 COG:alr2405 KEGG:ns NR:ns ## COG: alr2405 COG0716 # Protein_GI_number: 17229897 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Nostoc sp. PCC 7120 # 2 167 3 168 170 140 46.0 9e-34 MKKIGLFYATKADKTTWVAEKIQKEFGKDRIEVVPIEQAWQNDFAAYDCLIVGASTWFDG ELPTYWDELLPELRTLDLKGKKVAVFGLGDQIRYPENFADGIGLLAEVFEGDGATLVGFT SSERYTFERSRALRGDRWCGLVVDLDNQSEQAEKRIREWCEQVKNEFDSKP >gi|226332315|gb|ACIC01000005.1| GENE 14 19061 - 19903 1490 280 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|29345724|ref|NP_809227.1| ribosomal protein L11 methyltransferase [Bacteroides thetaiotaomicron VPI-5482] # 1 280 1 280 280 578 100 1e-164 MKYFEFTFHTSPCTETVNDVLAAVLGEAGFESFVESEGGLTAYIQQALCDENTIKNAITE FPLPDTEITYTYVEAEDKDWNEEWEKNFFQPIVIDNRCVIHSTFHKDVPQATYDIVINPQ MAFGTGHHETTSLIIGELLDNELKDKSLLDMGCGTSILAILARMRGARPCIAIDIDEWCV RNSIENIELNHVDDIAVSQGDASSLVGKGPFDIIIANINRNILLNDMKQYVACMHPGSEL YMSGFYVDDIPFIRREAEKNGLTFVHHKEKNRWAAVKFTY >gi|226332315|gb|ACIC01000005.1| GENE 15 20055 - 21398 867 447 aa, chain + ## HITS:1 COG:no KEGG:BT_0315 NR:ns ## KEGG: BT_0315 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 364 1 346 429 393 100.0 1e-108 MKKIVYFLLIALCLPIGLMAQSVDDDLYFVPSKDKKEKKETTPEKKEPRKQVTTNIYTSP GTTVVVQDRKGKTRDVDEYNRRYDARENEFVMENDTLYIKEKAQPDLEGEWVTGEFDGSQ EDYEYAERIIRFRNPRFAISISSPLYWDVVYGPNSWNWNVYTDGMYAYAFPTFSNPLWWD WRYNSYGWGWNYGWGWNRPYYGWGYYPGSWGGWYGGYWGGFYGGGYWGHHHHWGGGPSWG WGGGHRDMLYTNRRSPNRSVNRTTNGTRYTSNRYQSSRPSSVRTSSNRTTSGRVVGTREN STRTGVSTRTSSSRRDTYTRPSSTRTSTSSGMNRSSSTRTSSSRSEGTYNRGSSSRSSGT YSRGSSSYTPSRSSSSRTYTPSSSNRSSSSSRTYTPSSSSRSSSSSSYSPSSSSSRSSGS SYGGGSSSSSRGSSSGGGGSARGGSRR 
>gi|226332315|gb|ACIC01000005.1| GENE 16 21432 - 23036 1211 534 aa, chain + ## HITS:1 COG:no KEGG:BT_0316 NR:ns ## KEGG: BT_0316 # Name: not_defined # Def: putative hemin receptor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 534 1 534 534 953 100.0 0 MKKKINIAALALIMVASASAQTIYDGAKFTQKDLNGTARFVGMGGAMGALGGDISTMGTN PAGIGIYRSNDIMGTFGFSSFGSESKYEGSKFNVDKTRGSFDNIGFVFSSKIGNQTALRY VNFGFSYTRSKSFDKYMTMEGLINLGPGGTILSQTNQMTNQANTMIKANGDFISKLADDK INLFTDSQTGWLAAIGWNGYLYNENEDKNGYIGYLPQPYSWYDGHEKGGINQYDFNVAFN ISDRVYLGLTIGAYDVNYSKISTYGEEYGEYIVENVNYGTPSYEMTTENKIDGSGVDFKF GAIVRPFEDSPFRIGVSVHTPTFYNLKIKTNVRAVTYTPDFDSKKLSEAVVDSYDFTNGV DYGYDFRFRTPWKYNLSLGYTYGSSFAIGAEYEYQDYSAMHFSYSDGEAMGWQNPTAKEM LKGVNTFRIGAEWKVIPQFAFRLGYNYMSAAFKKTAFKDLSYNSINTDTDFANAKANNNY TLGIGYRGSTFYADLAYMYSTYKEDFYPFDDGALKKTDVTNSRSKVMMTLGLRF >gi|226332315|gb|ACIC01000005.1| GENE 17 23349 - 26537 2833 1062 aa, chain + ## HITS:1 COG:no KEGG:BT_0317 NR:ns ## KEGG: BT_0317 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1062 1 1062 1062 1971 100.0 0 MFKPLKSVSVLLLLFAMPAAISYAASENGATSVSVTQQNGTCKGVVKDKAGEAVIGASVV VKGTTNGVITDLDGNFVLSNVPDGATIQISYVGYTPQEVKFTGKPLDITLQEDTQTLDEV VVTALGIKRQKRSLGYSTTTVGGKDFTEARTTNIGNALSGKIAGVSVSGNATGPGGSSRV VIRGNASLTGNNQPLYVIDGVPFDNTNIGSAGQWGGKDMGDGLSSINPDDIADIQVLKGA AASALYGYRGGNGAILITTKSGQKGKPVSVEFNNNLAFDVIYDYRDFQNVYGQGTQGNRP LSADVAKATETSSWGEVMDGKKAVNFLGNEYAYSPVDNWKNFYRTGINNTTTLAVSGASD KISYRFGVSNMAVKGILPNSSISQQGINMNTTYDISSKVHLMVNANYVFDKNKGRSNLSD GNSNTNAALLYHANSFDIRWMERENPDCDWGTGADGKELLGGTNGYFNNPYWLQYRVTNE TNRNRLTGGMTLKYDIFDWLYIQGAVTRDGYNLEYSEVKPIGSASPEDPRGYIKEYTQNF SEMNLNYLIGFNKTFGDWSVGATFGGNRQRNITKKYGLDDKASSFFVPDFYSSSNTAKHV YKKEYTEYRVNSIYGTADLGYKNQVFLNLTGRNDWFSTLDPDNNHYFYPSIGMSWVFSDT FKTPDWFTFGKVRASYAAASNGTKAYQNLLTYKVDNYQSNGQPVVTINNSTVPNKGLKPV QISEWEIGLNLSFLDNRLSLDAAYYVKTTKDDIVQVTTSGASGFESAIQNVGEIRNNGVE VMVNAVPVHTKDFNWNSTFNIAYNSSDVKYLGIDGTGEKIKRLTLDGANSRVGSVSVQNI LGHPYGELVGYEYKRTSDGQVIFENGLPVHSDEVQVLGNGVYKVTGGWRNEFTYKNITLS 
FLIDFKAGAKMFSGTNLSLYSNGLHKNTLQGRGADGKGTMVGNGVMSDGKGGYVKNTVAV SAQDYWQAITSQNIAEEFVYNASFIKLRELSVGYTLPQAWLNKQTLIKGVTLSLVGRNLW TILKHTDNIDPESAYNNGNAQGLELNGYPATRNVGFNVNVKF >gi|226332315|gb|ACIC01000005.1| GENE 18 26557 - 28155 1374 532 aa, chain + ## HITS:1 COG:no KEGG:BT_0318 NR:ns ## KEGG: BT_0318 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 401 1 400 404 787 99.0 0 MKKFTLYILTAFGLMGLATSCNDFGDVNNDPMNLNPGVVDYKMEFTQVEAQICGSDWDVW RNGCIYTANMIQHTASVDWAYGVFYTWNDQYSGAYWGGFYSGGRAAIRNIIDVMNNWEGN PAYANEYQMCRILKAYMFQNMTDLYGDVPYSEAGQGYSTNPIPYPKYDTQEAIYNDLLKE LDEAQAALSTSAGNTIGAADVIYNGDAAKWKKFANSLMLRVAMRLTKVAPDKAKTWVAKA VSNGLFVSNDDNAIVRHTNGSPSDDSAEPYGKVFSSLDTQAFYISETFINILQDDPRLPL IATVCTRDPKPGWGDADFDLGDNTAAKQKGMPVGYDVSGGDWDLSKAPSYQVDWRKAYSV PNRTVYARPDSPSMLVTHAGNLLLLADAVKRGLYTGDAEKFYRDGVTAAMKQFSYYEKAS ATITDDAITAYLNANPLDADTEKALEQINTEYYIHTFCNEYEAFANWRRTGYPKLTPARN AASYPNNVTNGEIPRRFIYPTSEITANPVNYNDAVKRLSGGDKMTSRVWWDK >gi|226332315|gb|ACIC01000005.1| GENE 19 28302 - 29105 629 267 aa, chain - ## HITS:1 COG:no KEGG:BT_0320 NR:ns ## KEGG: BT_0320 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 267 1 267 267 504 100.0 1e-142 MKRKNFLILLAAGCLFSCQQQDELQETSKEATNFSISIDDALSDPLTRTSNDLFPARNSI TTGEVISMAASGQDYTPFIVGKDSRAWNEIGTATGTVTFYAHYPALTDEAATRSGGNKRY LKGGQEHLFGTAEAAPGSQNVSLKFKRMTVPVIILDENDRPYEGEAKVELSLKNEGTQDL LNGTIEINENALSENIEVKKVSEGVTTNVLPQKINAGEEIGTITVGGVTQKISAVEDLDL KAGSTLSVRLSKKFGGGIIDGNVPLYR >gi|226332315|gb|ACIC01000005.1| GENE 20 29143 - 29613 525 156 aa, chain - ## HITS:1 COG:MA3407 KEGG:ns NR:ns ## COG: MA3407 COG0590 # Protein_GI_number: 20092219 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Methanosarcina acetivorans str.C2A # 6 156 13 162 162 183 60.0 1e-46 MTKEELMRKAIELSKENVANGGGPFGAVIATKEGEIIATGVNRVTSSCDPTAHAEVSAIR 
AAAAKLGTFNLSGYEIYTSCEPCPMCLGAIYWARLDKMYYGNNKTDAKNIGFDDSFIYDE LALKPADRKLPSEVLLHNEAITAFEAWMSKEDKVEY >gi|226332315|gb|ACIC01000005.1| GENE 21 29735 - 30721 965 328 aa, chain - ## HITS:1 COG:BH1248 KEGG:ns NR:ns ## COG: BH1248 COG0673 # Protein_GI_number: 15613811 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Bacillus halodurans # 4 244 2 244 340 112 27.0 1e-24 MSEKMIKWGFIGCGEVTKTKSGPAFQKVEHSEVVAVMSRDGAKAKAYAKERGIRKWYDDA QELIDDPEVNAIYIATPPSSHATYAIMAMKAGKPAYIEKPMAVTYEECTRINRISKETGV PCFVAYYRRYLPYFQKVKEMVENGSIGNVINVQVRFAQPPRDLDYNRENLPWRVQADIAG GGYFYDLAPHQIDLLQDMFGCILEASGYKSNRGGLYPTEDTLSACFQFDNGLVGSGSWCF VAHDSAREDRIEIIGDKGMICFSVFTYEPIGLHTERGREEICIGNPEHVQQPLIQAVVDH LLGKSVCSCDGESATLTNWVMDKILGKL >gi|226332315|gb|ACIC01000005.1| GENE 22 30835 - 31737 844 300 aa, chain - ## HITS:1 COG:no KEGG:BT_0323 NR:ns ## KEGG: BT_0323 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 300 1 300 300 579 99.0 1e-164 MKKLLLLWLMIAPMSGMAQNESTPQDTTLYVNGRKIVIKEQGDKIKVKLYESASGGDTIT NAQIFEGVYLNGQSTESRTVLSALPFSKKNNKRNRFEPHAGGFYIGYTRLSNDFLSFNPS NGADLNTSHSWEIGGNLFTGSHAFAPTYNWGITIGLGWGYRSMRLDGNYAFREIDGVTEI YSGMEAEEPTEYSKSRLRYFYFRIPLSIEWQTRLNGKGPLFFAVGPEAEIRHGFRSKAKV NGSKKTIDKGLNGRPLGINLLAQAGYADLGVYMRYSTYGLFEKNKGPELYPFSFGVCWYW >gi|226332315|gb|ACIC01000005.1| GENE 23 31762 - 32211 513 149 aa, chain - ## HITS:1 COG:no KEGG:BT_0324 NR:ns ## KEGG: BT_0324 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 149 1 149 149 241 100.0 4e-63 MMNKIIRTTLIALLLTFAAHAAAQQGLQIATVFQKYGKQKGVTMVELSNEMLETYQMTLY KSLVFKDASDALPTILRCLEADKKKAKKVKEVVSNGQIKSGYYQLPQLKEDVNRFILFKT GKKDSATLIYIEGELDADDLITMLFMKKE >gi|226332315|gb|ACIC01000005.1| GENE 24 32232 - 32693 492 153 aa, chain - ## HITS:1 COG:no KEGG:BT_0325 NR:ns ## KEGG: BT_0325 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 153 1 153 153 
256 98.0 2e-67 MKDIETLLNKYFEGETTCEEERRLRRFFAEGLVPEHLEVYRPMFAFFEAEQKELPEISGI GNAVEMPELAPFEKKTKTIRQYLTYSLSAAAATILLLVGISGIYRHLSPAPANYVIIDGK EYTDVHLIREQAMVAFRDVSLSEEEVFATLFDE >gi|226332315|gb|ACIC01000005.1| GENE 25 32690 - 33211 371 173 aa, chain - ## HITS:1 COG:mll1867 KEGG:ns NR:ns ## COG: mll1867 COG1595 # Protein_GI_number: 13471781 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 1 164 16 177 298 63 26.0 1e-10 MELKQFKITVVPLRDKLLNYARRMTDDPSDAEDAVQEVMLKLWNLRQKLDEYRSIEAVAM TMTHHLCMDMWRAKRPDTLSLDRVQAPTPSATPERLLEEKDEFRLMREIIDSLPTLQRTI IRMKDIEEYETEEIAEITGCNAEAIRSNLSRARKKVREVYLQTIQERKRRNKA >gi|226332315|gb|ACIC01000005.1| GENE 26 33431 - 33568 63 45 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLSMVTNEVKRTKLGGKMSKKLGKRLIFIHCNKKKGGNCDSLLYV >gi|226332315|gb|ACIC01000005.1| GENE 27 33654 - 34493 906 279 aa, chain - ## HITS:1 COG:AF0231 KEGG:ns NR:ns ## COG: AF0231 COG0834 # Protein_GI_number: 11497847 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Archaeoglobus fulgidus # 70 278 62 264 264 80 29.0 4e-15 MKTRPKLRLFRYLLPVVIVLALIFSVRYCGKQEKPLGHPRDYAAIAKEGILHVATEYNSI SFYVDSDTVSGFHYELIEAFARDKGLKTEITPVMSFDERLRGLSDGRYDVIAYGILATSE LKDSLLLTSPIILNKQVLVQRKANGENDSLYIRNQLDLAQKTLHVVKGSPSILRIQNLGN EIGDTIYIKEIEKYGSEQLISLVAHGDIDYAVCDESIARAVADSLPQVDINTAISFTQFY SWAVSKQSPALLDSLNTWLDKFQKEKEYQKIYRKYYGDK >gi|226332315|gb|ACIC01000005.1| GENE 28 34511 - 36565 1812 684 aa, chain - ## HITS:1 COG:no KEGG:BT_0328 NR:ns ## KEGG: BT_0328 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 684 1 680 680 1360 99.0 0 MPPIMRRILLLYILFSVVGLSAIHAQLNPARQQIDENGRDQYGNQVDPAMIPDKLDSANV EVQGLPPTLYMWRIKNQLGDRTMIPADTTYHHFQNTNLTEGITGHYNYLANLGSPRLSRI FFERRYPEPTIFMEPFSSFFIRPTEFNFTNSNVPYTNLTYHKAGNKQNGEERFKSYFSVN 
VNKKLAFGFNIDYLYGRGYYNNQNTSYFNAALFGNYIGDRYQVQGIYSNNYLKTNENGGI TDDRYITAPEEMAEGRKEYESVNIPTVLNASANRNHDFYVFLTQRYNLGFHRDIPQAEND TMPAKQEFVPVTSFIHTIQVERSRHNFRSDDNVENYYKQTLLDKENTFVRDSTVYIGVKN TIGIALLEGFNKYAKAGLTAFASHKLSKYSLMSLDPLKQDKYNETEIYIGGELTKKQGNF LHYHAIGEVGMAGKAIGQFDVKGDIDLNIPLWKDTVSVIARGEISNKLAPFYMRHYHSKH FWWDDDMDKEFRTRIEGELSIANWGTRLRAGVENIKNYTYFNQQALPEQKGGSLQVVSAC FNQDFKVGIFHLDNEVIWQKSSDQAVLPLPELSLYHNFYMQFKLAKKVLSVQLGADVRYF TKYDAPAYMPATQQFYLQPEEGKVEIGGYPIVNVYANLHLKRTRFYVMMYHVNQGMSKPN YFLAPHYPINPRVLKFGLSWNFYD >gi|226332315|gb|ACIC01000005.1| GENE 29 36674 - 37216 586 180 aa, chain - ## HITS:1 COG:MA2909_2 KEGG:ns NR:ns ## COG: MA2909_2 COG1014 # Protein_GI_number: 20091730 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Methanosarcina acetivorans str.C2A # 7 138 12 142 186 115 46.0 5e-26 MKEEIIIAGFGGQGVLSMGKILAYSGLMEGKEVSWMPAYGPEQRGGTANVTVIVSDDKIS SPILSKYDTAIILNQPSLEKFESRVKPGGILIYDGYGIINPPTRKDIKVYRIDAMDAANE MNNAKAFNMIVLGGLLQLRPIVTLENVIKGLKKTLPERHHHLIPMNEEAIKKGMELIHEV >gi|226332315|gb|ACIC01000005.1| GENE 30 37236 - 38000 734 254 aa, chain - ## HITS:1 COG:MA2909_1 KEGG:ns NR:ns ## COG: MA2909_1 COG1013 # Protein_GI_number: 20091730 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit # Organism: Methanosarcina acetivorans str.C2A # 5 252 6 262 296 254 46.0 8e-68 MTKEEIIKPENLVYKKPTLMNDNAMHYCPGCSHGVVHKLVAEVIEEMGMEEKTVGVSPVG CAVFAYNYLDIDWQEAAHGRAPAVATAIKRLWPDRLVFTYQGDGDLACIGTAETIHALNR GENITIIFINNAIYGMTGGQMAPTTLVGMKSSTCPYGRDVELHGYPLKITEIAAQLEGTA YVTRQSVQSVPAIRKAKKAIRKAFENSMNGKGSNLVEIVSTCSSGWKMTPEKSNKWMEEH MFPFYPLGDLKDKE >gi|226332315|gb|ACIC01000005.1| GENE 31 38013 - 38189 212 58 aa, chain - ## HITS:1 COG:no KEGG:BF1647 NR:ns ## KEGG: BF1647 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 58 1 57 
57 73 91.0 3e-12 MNPDKIRNVLNILFMILALAAIIVYFVVGKEDFKMFIYVCGAAIFVKLMEFFIRFTNR >gi|226332315|gb|ACIC01000005.1| GENE 32 38200 - 39282 1310 360 aa, chain - ## HITS:1 COG:TM1759 KEGG:ns NR:ns ## COG: TM1759 COG0674 # Protein_GI_number: 15644505 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Thermotoga maritima # 6 356 7 351 356 379 54.0 1e-105 MAEEVVLMKGNEAIAHAAIRCGADGYFGYPITPQSEVLETLAELKPWETTGMVVLQAESE VAAINMVYGGAGSGKMVLTSSSSPGVSLKQEGISYIAGAELPCLIVNVMRGGPGLGTIQP SQADYFQTVKGGGHGDYRLIALAPASVQEMADFVALGFELAFKYRNPAIILADGVIGQMM EKVVLPAQKPRRTDAEVIEQCPWAATGKAKGRKPNIITSLELKPEAMEINNIRFQAKYKQ IEENEVRFEEINCEDAEYLIVAFGSMARIGQKAMELAREKGIKVGILRPITLWPFPSKAI AAYADKVKGMLVTELNAGQMIEDVRLAVNGKVKVEHFGRLGGIVPDPDEIVTALEEKIIK >gi|226332315|gb|ACIC01000005.1| GENE 33 39295 - 39522 297 75 aa, chain - ## HITS:1 COG:no KEGG:BT_0333 NR:ns ## KEGG: BT_0333 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: Citrate cycle (TCA cycle) [PATH:bth00020]; Metabolic pathways [PATH:bth01100] # 1 75 1 75 75 120 100.0 2e-26 MAKIKGAIVVDTERCKGCNLCVVACPLDVIALNKEVNMKGYNYAWQVKEDTCNGCSSCAM VCPDGCISVYKVKVE >gi|226332315|gb|ACIC01000005.1| GENE 34 39557 - 39838 204 93 aa, chain - ## HITS:1 COG:no KEGG:BT_0334 NR:ns ## KEGG: BT_0334 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 93 1 93 93 146 100.0 2e-34 MEQIDNLKELINQGDVDIAIKQLDQLLQDDSVEKDKDMLYYLRGNAYRKKGDWKQALDNY QYAIDLNPDSPAVQARTMAIDILNFYHKDMYNQ >gi|226332315|gb|ACIC01000005.1| GENE 35 40131 - 41684 1369 517 aa, chain + ## HITS:1 COG:ECs4688 KEGG:ns NR:ns ## COG: ECs4688 COG0714 # Protein_GI_number: 15833942 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Escherichia coli O157:H7 # 3 298 15 304 506 294 52.0 3e-79 MEIKERIQLLLSEMNRGVYEKEAEIGLSLLAALAGESILLLGPPGVAKSMVARRLKRAFV 
DARAFEYLMSRFSTPDEIFGPVSISRLKESDKYERAVDGYLPTADVVFLDEIWKAGPAIQ NTLLTVINEKLFRNGDTELKLPLKLLVAASNELPTQGEGLEALWDRFLIRIISTCVKQEE AFYQMLLDDDDEEEQEACRWQISDEEYAGWQKEIRQIKVSAEVLSCITEIRKNLEKVEIQ GSEVHRNVYVSDRRWKNIVKLLKASAFIHGRKEVCLTDLLPVYHCLWNEPEECADIRQIV VRALFVPYVKEIAAINLSLKSDLKVSRVREALEKARQKGDRRDDDLMIIDHFYYQIENHG TGNTYIFIVDYKNLKEYSPKDVPATGVMYADPLNPKRTIIRTFSDATKMKEQGAERVTLY RDEKNLYINGVRFPMRRLKRGEQQQLWLGNMTLTDRDYETELETAHGRIEKLVKDLSENI FISEEDKKGVAQYVATLRKEIAWARVDVRKLRYGDES >gi|226332315|gb|ACIC01000005.1| GENE 36 41674 - 43116 1102 480 aa, chain + ## HITS:1 COG:VCA0762 KEGG:ns NR:ns ## COG: VCA0762 COG2425 # Protein_GI_number: 15601517 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain # Organism: Vibrio cholerae # 78 477 67 472 481 103 25.0 8e-22 MSLRYDAAFYVQVLDKYVATGECAEADNEPLYAYLLSVMNDPMIKIQVLSDELCARIFYD AMSQFIRLNLEKQKYNMQKSQSEQQGMELVLEWSETKRKDGWQALLQEVADKHEGNGFDK AFYQGQFGNEGKYADEEVWERMVDDWKEAFQRDLQEQKEKEIEQRKDDFERRLHANLRNI PGYIRQNNIEKDEFYQAWGLMSGMWNTVDFERIRKIVRIQKEYPEVVKVANKMGRMADDE GQEQLHVAQGNVYKMEHSSKSDILGITVGNDLNALLPTELAHCADDELEDLFVYKYVTRK LQTFRYKSEIMQPARRLEIKPATQKGPMIVCLDTSGSMVGKPEKIAYSLLIKILEIADRQ RRDCFLIAFSVSINPIDVRRERARLLEFFSTTSCGDTDATRMLQAIFRLLQSKKEYMNAD VLWISDFKIPLSVPELTDKMLEYRKVDTHFYGLQLGIAENEWSPFFDRIYRIDYTPSRRY >gi|226332315|gb|ACIC01000005.1| GENE 37 43194 - 43676 405 160 aa, chain + ## HITS:1 COG:NMB0932 KEGG:ns NR:ns ## COG: NMB0932 COG2839 # Protein_GI_number: 15676826 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Neisseria meningitidis MC58 # 2 158 1 158 162 63 31.0 1e-10 MLDIILIILGVLCLITGLMGCILPFLPGPPVAYLGLVLLHFTDKVQYTTTQLIVWLLIVL VVQVLDYFTPMLGSKYSGGSRWGNWGCIIGTLIGLLFLPWGVIFGPFLGAVIGELLGNKE FSQALRSGVGSLIGFLLGTFLKFVVCGYFCYQFIVGFIRS >gi|226332315|gb|ACIC01000005.1| GENE 38 44446 - 46296 1346 616 aa, chain + ## HITS:1 COG:no KEGG:BT_0338 NR:ns ## KEGG: BT_0338 # Name: not_defined # Def: hypothetical protein # 
Organism: B.thetaiotaomicron # Pathway: not_defined # 1 616 1 616 616 1289 99.0 0 MKKTIVVLLFMASLGLFSVLVAGEVYVSPHGSDRNAGTKEAPYLTLNRAIKQAREWRRLN RPEAAGGICICLEDGVYAQSAPLFIRPEDSGTPDSPTLIRAVENAHPVISGGVAVTGWKK GCDDPRITKELRSKIWVAKAPSFGNRIVETRQMWVDGNKAQRAAQFPDGVMERMIDFNPE EQTIIIPASQIGNLLNARRLEMIVHQRWAIAILRVKSIDVRGEQAVIRFHEPESHLEFAH PWPQPVIGGEKGNSSFCLTNALELLDQPGEWFQEYPSGTIYYYPRSEEDMETAEVIVPAL ETLMIVDGTLERPVRHIRVEGITFAHTSWMRPSYQGHVTLQGGFPLLDAYKLHEPGLPEK AELENQAWIARPETAIRVRGTEHLTFSRCRFRHLASTGLDYEWAVSSSGIENCVFSDIGG TGILIGAFPDGGFETHVPFIPPEERNLCTDITIKNNLITDVTNEDWGCVGIGAGYVSGID ISHNEVCHLNYSGICVGWGWTSLESGMKNNRIEANYVHHFARRLYDAGGLYTLSNQPGSV MRNNRIEHLEEAPYATNDRAFYIYLDEATDGYTIENNWCPTERFDSNRPGNRNVWKNNGP QVTESIKNKAGRIKPE >gi|226332315|gb|ACIC01000005.1| GENE 39 46418 - 48460 1663 680 aa, chain + ## HITS:1 COG:BH1905 KEGG:ns NR:ns ## COG: BH1905 COG1501 # Protein_GI_number: 15614468 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Bacillus halodurans # 191 676 150 630 773 353 38.0 9e-97 MKPTNYHLFDFLDFDTELLRDESLWKACKPTAVYEKDGDICVTVPFQKQLLANDMVADTA VPREEYTLIIRQYNIGITRLFLGFGEYELTDQSEMLQFSERIRRVPLSVEKQGGKWILFT QDGTKRAVINVEEPALDRWSELLPDPQETLDITLYPDGKREIRLAAYDHFSPPRYDGLPI AFCKWTGKKERATLSFESRPDECFAGTGERFFKMDLSGQTLFLKNQDGQGVNNRRTYKNI PFYLSSRMYGTFYHTCAHSKLSLAGHSTRSVQFLSDQAMLDAFVIAGDTMEEILRGYRDL TGYPSMPPLWSFGVWMSRMTYFSADEVNEICDRMRAEHYPCDVIHLDTGWFRTDWLCEWK FNEERFPDPKGFIQRLKKNGYRVSLWQLPYVAEDAEQIEEAKANEYIAPLTKQQDTDGSN FSALDYAGTIDFTYPKATEWYKGLLKQLLDMGVTCIKTDFGENIHMDAVYKGMKPELLNN LYALLYQKAAYEITKEVTGDGIVWARAAWAGCQRYPLHWGGDSCSSWDGMAGSLKGGLHF GLSGFAFWSHDVPGFHTLPNFMNSIVAEDVYMRWTQFGVFTSHIRYHGTNKREPWHYPAI APLVKKWWKLRYSLIPYIIEQSKLAVESGWPLLQALILHHPEDKLCWHIDDEYYFGNDFL VAPVMNSENRRDIYLPKDSG >gi|226332315|gb|ACIC01000005.1| GENE 40 48457 - 48663 153 68 aa, chain + ## HITS:1 COG:no KEGG:BT_0339 NR:ns ## KEGG: BT_0339 # Name: not_defined # Def: alpha-glucosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 68 681 748 748 
147 98.0 1e-34 MNFFTGERLQGGRWLKEVYVPLEEMPVYVRENAVIPIYPEEVNCTDEMDLGKSIALRIDH NYKGFWTK >gi|226332315|gb|ACIC01000005.1| GENE 41 48673 - 50481 1740 602 aa, chain + ## HITS:1 COG:mlr1231 KEGG:ns NR:ns ## COG: mlr1231 COG5012 # Protein_GI_number: 13471298 # Func_class: R General function prediction only # Function: Predicted cobalamin binding protein # Organism: Mesorhizobium loti # 389 601 19 231 238 180 42.0 6e-45 MKTWKTNLEETKQRYINWWNHKGIILSMWEHFQEGVKPHADIAPPAPARDLSQKWFDPQW RAEYLDWYVAHSSLMADMLPVANTQLGPGSLAAILGGVFEGGEDTIWIHPDPNFNDEIVF NPEHPNWLLHKELLKACKAKANGNYFVGMPDLMEGLDVLAALKGTDMVLLDTVMQSEVLE QQMQQINDIYFKVFDELYDIIREGDEMAFCYFSSWAPGKMSKLQSDISTMISQDDYRRFV QPFIREQCQKIDYTLYHLDGVGAMHHLPALLEIEELNAIQWTPGVGEPQGGSPKWYDLYK KILAGGKSVMACWVTLEELKPLLDHIGADGVHLEMDFHNEREVEQAMRIIEEYTGSSSSV SADVHTNQPVGEQDNAQESIRIREEKSQEEDRMKPLYDAIVAGKLEPAVEVTKDAIAAGV LPQDIINGYMITAMGEVGQRFQDGKAFVPQLLMAGRAMKGALELLKPLLAGNASATIGKI VIGTVKGDLHDIGKNLVASMLEGCGFEVINIGIDVTCDKFVEAVKENKADILCMSALLTT TMTYMQDVIRALEEAGIRDQVKVMIGGAPVSQGFADEIGADGYSDNANTAVAVAKVLMGK RD >gi|226332315|gb|ACIC01000005.1| GENE 42 50510 - 52081 1218 523 aa, chain + ## HITS:1 COG:BH2222 KEGG:ns NR:ns ## COG: BH2222 COG4146 # Protein_GI_number: 15614785 # Func_class: R General function prediction only # Function: Predicted symporter # Organism: Bacillus halodurans # 13 521 4 523 580 177 28.0 5e-44 MHAKFLDTLDWGILIAYFLILIAIGIWASSKRKKGSSLFLAEHSLRWHHIGFSMWGTNVG PSMLIASASAGFTTGIVSGNYAWYAFVFICLLAFVFAPRYLGSRVSTLPEFMGKRFGQST RNILAWYTIVTILISWLALTLFAGGVLIRQVFDIPMWQSALILLIISAFFTMLGGLKAVA YTNVYQMILLILVSAALAIVGIYKVGGISALTDAVPADFWNLFRPNDDTAFPWLPIILGY PVMGVWFWCTDQSMVQPVLAAKSLKEGQLGTNFTGWLKILDVPLYILPGIICLALFPQLE NPDEAYMTMVTHLFPVGMVGLVLAVLTAALVSTIGSALNALSTVFTMDIYVKKIRPQAKQ KEIIRVGQVVTVAGALISVIITIAIDSIHGLNLFNVFQSVLGFIAPPMAAVFLFGVFWKR TTTLAANAALTVGTVFSIGVGVLYLWVFPADQYSAWPHFMLLSFYLFVIIGIGMVVVGLL DKTPQTAILNMEKIEEKPARIVLILWGLLIVTMIGLYIFFNGH >gi|226332315|gb|ACIC01000005.1| GENE 43 52295 - 53068 291 257 aa, chain - ## HITS:1 COG:no KEGG:BT_0342 NR:ns ## KEGG: BT_0342 
# Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 257 1 257 257 520 99.0 1e-146 MIEQELTYQDLRILSSEIYEAMGYKDSIPDQMVIDETNTLLKRVIPLLRPRFAFFFMEGT LDIEKETLSVQSIHPPTVNPIQDTIPTRSPEQNTFSTQSKSTVFSVGKTIARQLRGSEAF VFFAATAGTEFEVFQHTLQQEEDMVKIYIADSLGSIIAEKTADCMEVALAEYIREKNWKH TNRYSPGYCGWHVSEQQKLFPLFPIAAPCGIRLTDSSLMVPIKSVSGVIGTGTNVRKLEY TCGLCTYENCYRKRKHR >gi|226332315|gb|ACIC01000005.1| GENE 44 53097 - 54113 762 338 aa, chain - ## HITS:1 COG:MA0146 KEGG:ns NR:ns ## COG: MA0146 COG0407 # Protein_GI_number: 20089044 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III decarboxylase # Organism: Methanosarcina acetivorans str.C2A # 62 323 68 327 339 107 30.0 2e-23 MGKLNMKDWIGQTILNKKVISIPIMTHPGIELIGKTVHDAVTNGQVHYEAIKALCDKYPA AAATVIMDLTVEAEAFGAEIIFPENEVPSVSGRLLADEAAIDALEIPALNKGRIAQYLKA NMLTAKNITDRPVLAGCIGPYSLAGRLYDMSEIMMLIYINPEAANTLLRKCTDFIIRYCM ALKATGVNGVVMAEPAAGLLSNEDCLQYSSVFVKEIVEKVQDDHFTVVLHNCGNTGNCTQ AMVYTGAAAYHFGNKINMEEALKEVPADALAMGNLDPVSLFKMSSPEEMKKATLQLLEAT AAYPNFVLSSGCDIPPYTPLANINAFYTALEEFNETRN >gi|226332315|gb|ACIC01000005.1| GENE 45 54506 - 55351 410 281 aa, chain + ## HITS:1 COG:no KEGG:BT_0344 NR:ns ## KEGG: BT_0344 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 281 1 281 281 560 100.0 1e-158 MKTIKQIVFCVSLLLVMNCFYSCSNSNDVVINTQQNQEKTQDVEIQELIDFIETLNANKI QTRGPFWNRIKRFLVGDAWGYGWGVNKGLTPRGGLITAVVFSLICAASDDDLPRNWWHLS SNWKVYDAPLRPYEIIGNDHNKTIYNMMREDPVIANGTFSNYYLYNSTNKKLKSYGYTEE MPLLLQTRLLEIMDLVKKSTSVDQLMNLMKQEFPQRVSEFQLVESYINGLVNMDDKSTVR DYTKQVYAQIDASSLDVSAVSRLKTMIAIAENSKFLWVETK >gi|226332315|gb|ACIC01000005.1| GENE 46 55404 - 56114 398 236 aa, chain + ## HITS:1 COG:no KEGG:BT_0345 NR:ns ## KEGG: BT_0345 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 236 1 236 236 443 100.0 1e-123 MKQNKCLLVGLISLMFSINSLYAQKDSLNVSTNNESYKRVYILDVGLGIGFTGIQKNGSS 
SWYIGNNKNDVPFVSSIQLMTYTPYSNIGCGLFYYDYSNGTKKHDGVLSSGINEKKSFYY VAPQISFIKRQAGFLDGIMYVSAGIGYVNYRSKGTISQLENYNTTCSTVGGNMGIAYEYA FDSRLGIRLGVNCLYAKIKGLHKNTSIGELSIRPREKFHLIVPSLEIGLSYYIVQW >gi|226332315|gb|ACIC01000005.1| GENE 47 56304 - 56738 472 144 aa, chain - ## HITS:1 COG:TM1080 KEGG:ns NR:ns ## COG: TM1080 COG0698 # Protein_GI_number: 15643838 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Thermotoga maritima # 4 141 3 140 143 145 52.0 3e-35 MKTIGLACDHAGFELKEYVRGWLEVKGWAYKDFGTNSAASVDYPDYAHPLALAVESGECY PGIAICGSGNGINMTLNKHQGIRAALCWNAEIAHLARQHNDANILVMPGRFISTEEADMI LTEFFSTQFEGGRHQNRIDKIPVK >gi|226332315|gb|ACIC01000005.1| GENE 48 56738 - 58747 2209 669 aa, chain - ## HITS:1 COG:BH2352 KEGG:ns NR:ns ## COG: BH2352 COG0021 # Protein_GI_number: 15614915 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase # Organism: Bacillus halodurans # 10 669 9 663 666 463 41.0 1e-130 MNDNKLMNRAADNIRILAASMVEKANSGHPGGAMGGADFVNVLFSEFLVYDPENPRWEGR DRFFLDPGHMSPMLYSTLALTGKFTLDELKEFRQWGSPTPGHPEVDIMRGIENTSGPLGQ GHTFAVGAAIAAKFLKARFNEVMNQTIYAYISDGGIQEEISQGAGRIAGALGLDNLIMFY DSNDIQLSTETKDVTVEDTAMKYEAWGWNVLSINGNDPDEIRAAIKEAQTEKERPTLIIG KTVMGKGARKADGSSYEANCATHGAPLGGDAYVNTIKNLGGDPVNPFVIFPEVAELYAKR AAELKKIVAERYAKKAKWTKANPELAAKLEAFFSGKAPKVDWAAIEQKAGTATRAASATV LGALAMQVENMIVASADLSNSDKTDGFLKKTHSFKKGDFSGAFFQAGVSELSMACICIGM SLHGGVIAACGTFFVFSDYMKPAVRMAALMEQPVKFIWTHDAFRVGEDGPTHEPVEQEAQ IRLMEKLKNHKGHNSMLVLRPADAEETTIAWKLAMENMSTPTGLIFSRQNIANLPAGTDY EQAAKGAYIVAGSDENPDVILVASGSEVSTLVAGTELLRKDGVKVRIVSAPSEGLFRSQS KEYQESVLPADAKIFGLTAGLPVTLQGLVGCHGKVWGLESFGFSAPYTVLDEKLGFTAEN VYNQVKAMI >gi|226332315|gb|ACIC01000005.1| GENE 49 58924 - 59565 632 213 aa, chain - ## HITS:1 COG:BH1874 KEGG:ns NR:ns ## COG: BH1874 COG3534 # Protein_GI_number: 15614437 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Bacillus halodurans # 1 213 282 497 498 209 44.0 3e-54 
MDKYDKDKKIALLLDEWGTWWDEEPGTVRGHLYQQNTLRDAFVASLSLDVFHKYTDRLKM ANIAQIVNVLQSMILTKDKDMVLTPTYYVFKMYKVHQDATYLPLDLTCEKMNVRDNRTVP MVSATASKNKNGVIHISLSNVDADNAQEITVNLPDVNAKKAIGEILTSANLTDYNSFEKP NIVKPASFKEVKINKGIMKVKLPAKSIVTLELQ Prediction of potential genes in microbial genomes Time: Thu May 12 00:00:09 2011 Seq name: gi|226332314|gb|ACIC01000006.1| Bacteroides sp. 1_1_6 cont1.6, whole genome shotgun sequence Length of sequence - 11414 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 3, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 3 - 879 735 ## COG3534 Alpha-L-arabinofuranosidase 2 1 Op 2 . - CDS 949 - 3357 1560 ## COG3533 Uncharacterized protein conserved in bacteria - Prom 3395 - 3454 4.0 3 2 Tu 1 . - CDS 3462 - 5057 1136 ## COG1070 Sugar (pentulose and hexulose) kinases - Prom 5188 - 5247 5.3 4 3 Op 1 5/0.000 - CDS 5396 - 6925 1587 ## COG2160 L-arabinose isomerase 5 3 Op 2 . - CDS 6962 - 7645 768 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 6 3 Op 3 . - CDS 7651 - 8328 851 ## COG1051 ADP-ribose pyrophosphatase 7 3 Op 4 . - CDS 8350 - 10044 1784 ## COG4146 Predicted symporter 8 3 Op 5 . 
- CDS 10070 - 11209 399 ## PROTEIN SUPPORTED gi|15900011|ref|NP_344615.1| aldose 1-epimerase - Prom 11250 - 11309 4.8 Predicted protein(s) >gi|226332314|gb|ACIC01000006.1| GENE 1 3 - 879 735 292 aa, chain - ## HITS:1 COG:BH1874 KEGG:ns NR:ns ## COG: BH1874 COG3534 # Protein_GI_number: 15614437 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Bacillus halodurans # 25 292 4 272 498 370 62.0 1e-102 MKAKLLVSTAFLAASVSLSAQKSATITVHADQGKEIIPKEIYGQFAEHLGSCIYGGLWVG ENSDIPNIKGYRTDVFNALKDLSVPVLRWPGGCFADEYHWMDGIGPKENRPKMVNNNWGG TIEDNSFGTHEFLNLCEMLGCEPYISGNVGSGTVEELAKWVEYMTSDGDSPMANLRRKNG RDKAWKVKYLGVGNESWGCGGSMRPEYYADLYRRYSTYCRNYDGNHLFKIASGASDYDYN WTDVLMNRVGHRMQGLSLHYYTVTGWSGSKGAATQFNKDDYYWTMGKCLEME >gi|226332314|gb|ACIC01000006.1| GENE 2 949 - 3357 1560 802 aa, chain - ## HITS:1 COG:BH1877 KEGG:ns NR:ns ## COG: BH1877 COG3533 # Protein_GI_number: 15614440 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 29 801 3 756 758 486 38.0 1e-137 MKTTSFILALIISISIGKAQTNHQVSYFSLQDVKLLSSPFLQAQQTDLHYILALDPDRLS APFLREAGLTPKAPSYTNWENTGLDGHIGGHYLSALSMMYAATGDTAIYHRLNYMLNELH RAQQAVGTGFIGGTPGSLQLWKEIKAGDIRAGGFSLNGKWVPLYNIHKTYAGLRDAYLYA HSDLARQMLIDLTDWMIDITSGLSDNQMQDMLRSEHGGLNETFADVAEITGDKKYLKLAR RFFHKVILDPLIKNEDRLNGMHANTQIPKVIGYKRVAEVSKDDKDWNHAAEWDHAARFFW NTVVNHRSVCIGGNSVREHFHPSDNFTSMLNDVQGPETCNTYNMLRLTKMLYQNSGDVDN SNKPDPRYVDYYERALYNHILSSQEPDKGGFVYFTPMRPGHYRVYSQPETSMWCCVGSGL ENHTKYGEFIYAHQQDTLYVNLFIPSQLNWKEQGVTLTQETLFPDDEKVTLRIDKAAKKN LTLMIRIPEWAGNSKGYEITINGKKHLSDIQTGASTYLPIRRKWKKGDMITFHLPMKVSL EQIPDKKDYYAFLYGPIVLATSTGTENLDGIYADDSRGGHIAHGRQTPLQEIPMLIGNPD SIRHSLHKLSGSKLAFSYDGNVYPTQKSKSLELIPFFRLHNSRYAVYFRQASEEQFKTIQ EEMATAEQKATDLANRTVDLVFPGEQQPESDHGILYEASETGTHKDRHFRRAKGWFSYNL KVKEEASQLMITVRQEDRNKAVILLNNEKLTVHPTVSKADKDGFIRLCYLLPRKLKAGSC EILFKPDGTEWTSAVYEVRLLK >gi|226332314|gb|ACIC01000006.1| GENE 3 3462 - 5057 1136 531 aa, chain - ## HITS:1 COG:CAC1344 KEGG:ns NR:ns ## 
COG: CAC1344 COG1070 # Protein_GI_number: 15894623 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Clostridium acetobutylicum # 1 529 1 531 534 650 60.0 0 MKLDAKSTIETGKAILGIELGSTRIKAVLIDQENKPIAQGSHTWENQLVNGLWTYSIDAI WSGLQDCYADLRSNVKKLYDTEIETLAAIGVSAMMHGYMPFNEKEEILVPFRTWRNTNTG RAAAELSELFVYNIPLRWSISHLYQAILDNEAHVKDIKFLTTLAGYVHWQITGEKVLGIG DASGMLPIDPTTNNYSAEMVAKFNNLIASKEYSWKLEDILPKVLSAGENAGVLTPEGCKK LDASGHLKAGIPVCPPEGDAGTGMVATNAVKQRTGNVSAGTSSFSMIVLEKELSKPYEMI DMVTTPDGSLVAMVHCNNCTSDLNAWVNLFKEYQELLGIPVDMDELYGKLYNIALTGDTD CGGLLSYNYISGEPVTGLAEGRPLFVRSANDKFNLANFMRAHLYASVGVLKIGNDILFNE EKIKVDRITGHGGLFRTKGVGQRVLAAAINSPISVMETAGEGGAWGIALLGSYLVNNKKG QSLADFLDESVFVSDAGVEVSPTPEDVAGFNTYIESYKAGLPIEEAAVKFK >gi|226332314|gb|ACIC01000006.1| GENE 4 5396 - 6925 1587 509 aa, chain - ## HITS:1 COG:TM0276 KEGG:ns NR:ns ## COG: TM0276 COG2160 # Protein_GI_number: 15643046 # Func_class: G Carbohydrate transport and metabolism # Function: L-arabinose isomerase # Organism: Thermotoga maritima # 7 506 6 495 496 502 50.0 1e-142 MNNVFDQYEVWFVTGAQLLYGGDAVIAVDAHSNEMVNGLNESGKLPVKVVYKGTANSSKE VEAVFKAANNDDKCVGVITWMHTFSPAKMWIHGLQQLKKPLLHLHTQFNKEIPWDTMDMD FMNLNQSAHGDREFGHICTRMRIRRKVVVGYWKEEETLHKIAVWMRVCAGWADSQDMLII RFGDQMNNVAVTDGDKVEAEQRMGYHVDYCPASELMEYHKDIKNADVDALVATYFNDYDH DASLEDKSTEAYQKVWNAAKAELALRAILKAKGAKGFTTNFDDLGQTDGSYFDQIPGLAS QRLMAEGYGFGAEGDWKSAALYRTVWVMNQGLPKGCSFLEDYTLNFDGANSSILQSHMLE ICPLIAANKPRLEVHFLGIGIRKSQTARLVFTSKTGTGCTATVVDMGNRFRLIVNDVECI EPKPLPKLPVASALWIPMPNLEVGAGAWILAGGTHHSCFSYDLTAEYWEDYAEIAGIEMV HINKDTTISCFKKELRMNEVYYMLNKALC >gi|226332314|gb|ACIC01000006.1| GENE 5 6962 - 7645 768 227 aa, chain - ## HITS:1 COG:ECs5174 KEGG:ns NR:ns ## COG: ECs5174 COG0235 # Protein_GI_number: 15834428 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli O157:H7 # 2 227 1 227 228 314 65.0 7e-86 
MLEELKEKVFHANLELVKHGLVIFTWGNVSAIDRETELVVIKPSGVSYDDMKAEDMVVVD LDGKVVEGRLKPSSDTPTHVVLYKAFPEIGGVVHTHSTYATAWAQAGCDIPNIGTTHADY FHDAIPCTADMTEAEVKGAYELETGNVIVKRFEGLNPVHTPGVLVKNHGPFSWGKDAHDA VHNAVVMEQVAKMASIAYAVNPNLTMNPLLVEKHFSRKHGPNAYYGQ >gi|226332314|gb|ACIC01000006.1| GENE 6 7651 - 8328 851 225 aa, chain - ## HITS:1 COG:alr2484 KEGG:ns NR:ns ## COG: alr2484 COG1051 # Protein_GI_number: 17229976 # Func_class: F Nucleotide transport and metabolism # Function: ADP-ribose pyrophosphatase # Organism: Nostoc sp. PCC 7120 # 11 217 21 237 248 132 34.0 5e-31 MKNYYSSNPTFYLGIDCIIFGFNEGEISLLLLKRNFEPAMGEWSLMGGFVQKDESVDDAA KRVLAELTGLENVYMEQVGAFGAIDRDPGERVVSIAYYALININEYDRELVQKHNAYWVN INELPALIFDHPEMVDKAREMMKQKASVEPIGFNLLPKLFTLSQLQSLYEAIYGEPMDKR NFRKRVAEMDFIEKTDKIDKLGSKRGAALYKFNGKAYRKDPKFKL >gi|226332314|gb|ACIC01000006.1| GENE 7 8350 - 10044 1784 564 aa, chain - ## HITS:1 COG:BH2222 KEGG:ns NR:ns ## COG: BH2222 COG4146 # Protein_GI_number: 15614785 # Func_class: R General function prediction only # Function: Predicted symporter # Organism: Bacillus halodurans # 6 449 3 436 580 186 30.0 8e-47 MEALDWLVIGVFFLALIGIIVWVVRQKQNDSADYFLGGRDATWLAIGASIFASNIGSEHL IGLAGAGASSGMAMAHWEIQGWMILILGWVFVPFYSRSMVYTMPEFLERRYNPQSRTILS VISLVSYVLTKVAVTVYAGGLVFQQVFGIKELWGIDFFWIAAIGLVVLTALYTIFGGMKS VLYTSVLQTPILLLGSLIILVLGFKELGGWDEMMRVCGAVTVNDYGDTMTNLIRSNDDAN FPWLGALIGSAIIGFWYWCTDQFIVQRVLSGKNEKEARRGTIFGAYLKLLPVFLFLIPGM IAFALHQKYIGAGGEGFLPMLANGTANADAAFPTLVAKLLPAGVKGLVVCGILAALMSSL ASLFNSSAMLFTIDFYKRFRPETPEKKLVGIGQIATVVIVILGILWIPIMRSVGDVLYTY LQDVQSVLAPGIAAAFLLGICWKRTSAQGGMWGLIAGMIIGLTRLGAKVYYSNAGEVADS TFKYLFYDMNWLFFCGWMFLFCIIVVIVVSLATEAPTAEKIQGLVFGTATKEQKAATRAS WDHWDIIHTVIILAITGAFYWYFW >gi|226332314|gb|ACIC01000006.1| GENE 8 10070 - 11209 399 379 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900011|ref|NP_344615.1| aldose 1-epimerase [Streptococcus pneumoniae TIGR4] # 40 370 12 337 345 158 30 2e-38 MKKHFLLAGIAALMLAACNNKPASELTLSGLDPAKFQTEVNNAQTALYTLKNKAGMEVCI TNFGGRIVSIMVPDKNGKMQDVVLGFDSIADYINVPSDFGASIGRYANRINQGRFVLDGD 
TIQLPQNNFGHCLHGGPKGWQYQVYEANLIDPTTLELTLISPDGDANFPGNVTAKVTYQL TDDNAIDIKYSATTDKKTIINMTNHSYFNLAGDPSKTSTDNIMYVNADYYTPVDSTFMTT GEIAPVKDTPMDFTTPKAVGKDINNYDFVQLKNGNGYDHNWVLNTKGDISQVAARLTSPE TGITLEVYTNEPGIQVYTGNFLDGTVSGKKGIVYNQRASVCLETQHYPDSPNKADWPSVV LEPGQTYNSECIFKFSTEK Prediction of potential genes in microbial genomes Time: Thu May 12 00:00:17 2011 Seq name: gi|226332313|gb|ACIC01000007.1| Bacteroides sp. 1_1_6 cont1.7, whole genome shotgun sequence Length of sequence - 32496 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 8, operones - 4 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 236 - 284 -0.9 1 1 Op 1 . - CDS 306 - 2231 1496 ## COG3507 Beta-xylosidase 2 1 Op 2 . - CDS 2266 - 4071 1729 ## BT_0361 hypothetical protein 3 1 Op 3 . - CDS 4097 - 7240 3133 ## BT_0362 hypothetical protein 4 1 Op 4 . - CDS 7251 - 9026 1935 ## BT_0363 putative outer membrane protein 5 1 Op 5 . - CDS 9039 - 12122 3276 ## BT_0364 hypothetical protein 6 1 Op 6 . - CDS 12142 - 14451 1752 ## BT_0365 hypothetical protein - Prom 14583 - 14642 7.2 - Term 14628 - 14671 7.7 7 2 Tu 1 . - CDS 14681 - 18943 2912 ## COG0642 Signal transduction histidine kinase + Prom 18918 - 18977 4.9 8 3 Op 1 . + CDS 19099 - 20640 1048 ## COG3507 Beta-xylosidase 9 3 Op 2 . + CDS 20666 - 22648 1929 ## COG3534 Alpha-L-arabinofuranosidase 10 3 Op 3 . + CDS 22671 - 23615 851 ## BT_0369 endo-1,4-beta-xylanase D precursor + Term 23677 - 23719 3.8 - Term 23663 - 23705 9.2 11 4 Op 1 . - CDS 23734 - 24888 1278 ## COG0153 Galactokinase 12 4 Op 2 . - CDS 24932 - 26242 1482 ## COG0738 Fucose permease 13 4 Op 3 . - CDS 26296 - 27393 374 ## PROTEIN SUPPORTED gi|15900011|ref|NP_344615.1| aldose 1-epimerase - Prom 27580 - 27639 7.0 14 5 Tu 1 . + CDS 27622 - 28593 922 ## COG1482 Phosphomannose isomerase + Term 28630 - 28668 10.2 + Prom 28653 - 28712 3.7 15 6 Tu 1 . 
+ CDS 28742 - 30292 1229 ## BT_0374 hypothetical protein + Term 30316 - 30359 -0.9 + Prom 30295 - 30354 5.8 16 7 Tu 1 . + CDS 30404 - 31351 552 ## BT_0291 integrase + Term 31384 - 31426 6.4 + Prom 31437 - 31496 5.9 17 8 Op 1 . + CDS 31703 - 32281 426 ## BT_0376 putative transcriptional regulator 18 8 Op 2 . + CDS 32289 - 32496 77 ## BT_0377 hypothetical protein Predicted protein(s) >gi|226332313|gb|ACIC01000007.1| GENE 1 306 - 2231 1496 641 aa, chain - ## HITS:1 COG:BH1878 KEGG:ns NR:ns ## COG: BH1878 COG3507 # Protein_GI_number: 15614441 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Bacillus halodurans # 160 586 40 428 781 98 25.0 4e-20 MKYLKTIYLLLCCTLALAACSDDDENSASGALSITTPAYTNVGYNKATLSANISGTEGVN IVKRGFCYGTASHPDIYDTTSEVRGSEISTTLTGLTPQTRYYVRAFVTLYNEEPRYSEET SFTTPAETLSDELAAYEAPTYVDDYTSFSAWSNRYDWNLANVHDPTVMKADDGYYYMYQT DASYGNAHSGNGHFHARRSKDLVNWEYLGATMSETPPTWIKEKLNAYRQEMGLEPIDNPS YGYWAPVARQVSNGKYRMYYSIVITNYIQTGKPEIENNGNFDGSWTERAFIGLMETSDPA SNIWEDKGFVVCSASDKGKTDYGRSSINDWEGYFKINAIDPTYIITENGEHWLIYGSWHS GIAALQVNPEDGKPLNALGNPWDITGEDNSGYGKIIATRGTSRWQASEGPEVIYRDGYYY LFLAYGSLSVEYNTRVCRSKNIDGPYVDIHGNSAMGSAQLYPILTAPYRFDNSYGWVGIS HCGIFDDGAGNWFYTSQGRFPVNVGGNEYSNAIMMGHVRSIRWDANGWPLVMPERYGAVP QAPITENEIAGDWEHLALTTSTGTQRTSETMTYDLGTHKITSGSWKNATWTFDAATQTIT TSAGVVLYLQREVDWEASPRTHTIVYAAQGNQKTYWGKKLQ >gi|226332313|gb|ACIC01000007.1| GENE 2 2266 - 4071 1729 601 aa, chain - ## HITS:1 COG:no KEGG:BT_0361 NR:ns ## KEGG: BT_0361 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 601 1 601 601 1109 99.0 0 MKRISTYIATLVLALAGSSCSDYLDKEYDASLSEKKVFNNQNLTREFLANIYTNLPDGLA PLSDDQFTGASRDCMTDNAVTCWGLHYYTKIGSDGYTAGDHPLLGFWNTDLYGIRKCNMF LKNAKASVVGNTEKDGDDNRLYDRYCAEARLLRAIFHFDLICWFGAAPVIAEDESGEPII FDLSDPSAMNMFRTPAAEALEWVADQCDQIKNQLPFRYSDEASNWGRVNGAAAYALKARA LLYRASKLNNPDGNTAYWANAAQAAADFITQNNKQSSPYRLYNTGNPENDYYECFTNNPV YNNEIILARSVWNTNQVEKVFLPVGFTGSFSGNGRTNPTQNLVDAYEMNNGKRIDENGST 
YDAANPYKDRDPRLAQTIFYQGMMWGRADKEERRAIDVRYNSDADKGVDYTSAMGGTYTG YYLKKFVNNISCKEPATYPHAWMIFRYGEILLNAAEAYNEAEGPAKAYSYINEVRARAGM PAYADMSQSELRERIRNERRIELAFEDHRFFDVRRWKLYDNVTPTGETGKPRYNQLLNLY GVKVTGSADTPSYTFGLAETVNSRTFVNPKSYYFPIPANEVKRAPNLGQNPGWDTGSASN E >gi|226332313|gb|ACIC01000007.1| GENE 3 4097 - 7240 3133 1047 aa, chain - ## HITS:1 COG:no KEGG:BT_0362 NR:ns ## KEGG: BT_0362 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1047 3 1049 1049 2063 99.0 0 MQPMKLNNIQTGFKRILLTVAAVFMLGNGWAQDAKVLKGRIVNAEGEPIAGAVVNVVEAS RIALSDKDGFFTLKNVKPADELYVSSVGYLPTTAIADFEENFKIVMDADLDEYAHTTPLP FNRKPKKFVTESTSIVTGEELEKHPVTVLQNAFTSTVTGVETYEAQSEPGWSETAMYIRG IRTMNASARSPLIIVDNVERDLSFLDAYPIESITILKDAAATAIYGMRGANGAVLVTTKR GEAGKTKINFTQEVGFQTIAGIPESQNSYNYALSRNQARYLDGLSPEYSDEDLEYYRRVC NGEQLEGMAKYKYFNTNWHDTMLRDAAPQYRTNLSVSGGNARARYYVSFSYLRQEGLFDT KWTEWNEGYSTQEVLNRYNLRSNIDIDVNKFLNVSMDLGGRIDNISQPGIDVWNLFTWGA GENLPVYPVFCPNGEFFMPTSSDSKNGAAQIAGRGVEQNRRRNLYTTVTATGNLDALVPG LKAKMTFSFDSYETFQKVQQADVNVYYYNYMADVNDPSEYTYQRMRTYKALPNATTSPRD YYYNLNMNGGLAYEHTFGKHAVNAQAFIRTYQNVVRGQESSNRYLSYNAQATYVYNNRYI LSGNISRMGSDNYADGERFGTFPGGSVGWVLSEESWLKNSFVNLLKLRASYGRAGQAVTG VSRYPYQGTFTEGGGYNFGTSQSYTEGVYESTAGNKNIKWEISNMANFGVDFDLWNKKIY GSVDFFKEWRSNILVSRSTVPSLFGVNAPQDSYGKAETKGFEITLGHSNRIGDFEYYIDG MLTFNTNKITEMDELTPDYAYQARTGNRIDQSQLLIWKQWASNPDLIPESYEDVVANPQK YPWNATGKYKLGNAVFQDTNGDRKIDSYDKVPTGYTNIPELIPTIRLGFSWKGFDARAVL TAYLNRTVPCRENMDYGFGWGGTSTHEITNTWGYYTDDPTDPRNINAKYPRLSTSFSDLD RNYPYNESTIWVVNGDFLSLRNVEVGYSLPARLISKVNMTKCRLYFSGYNLCNWSHLPKG FDPENPTNYIWAYPKTRSFTFGVNIGF >gi|226332313|gb|ACIC01000007.1| GENE 4 7251 - 9026 1935 591 aa, chain - ## HITS:1 COG:no KEGG:BT_0363 NR:ns ## KEGG: BT_0363 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 591 1 591 591 1186 99.0 0 MKKTIIYTAFWLATAGIALTSCEDIFGGFLDKQPSNELTEEEVFSQWTTTREFHFDTYNF LRHGACRINNSWMDAATDLAETSYASGGTRTSFNIGNYYASGAATELTGTWEHYYRGIRK 
CNMLLKNIDDVPKATDDSEEAHATYVKQYKAEAHFLRAYFYWEMFLRYGPVPLVTDVLDP DGDLLSNYTTRPSLKEYVVDFILKELKDCEEGLMDKATSAESGNPGRISQPMARALYSRV MLYMASDRFRSESGISWQQAADAAQSFMTDYGTLYGLYTTDTDPKTCYTNAILKNAHDEK NNETIFWRNDVAVGWGAIYNDTPVGEGGNGGLCPSQNLVDMYDMANGQSPFSSYDETGAP VYNGTATPAINNASGYKSNDPYSNRDPRLAATVLYNGVNWGNGIINVLKGQRDNPQGNAN ATPTGYYTRKYIPEVILNNNHTGSNYRNWIIIRYAEILLNYAEALNEAGGSRSDVLNAIQ PLRNRVGMTAKLTDRSDLQTIADRRNFIRKERTVELAFEDHRAWDIRRWNVAEKALARPI YGMEITKENGKFVYTRKVAQNRVFTEKMYLYPIPEGEVWKTNIENNPGWNN >gi|226332313|gb|ACIC01000007.1| GENE 5 9039 - 12122 3276 1027 aa, chain - ## HITS:1 COG:no KEGG:BT_0364 NR:ns ## KEGG: BT_0364 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1027 1 1027 1027 2023 99.0 0 MKNFILILCLATFSQWVYAQNKQISGRVVDTKGEAAIGASILEKGTTNGTITDFDGNFKL TVGPKAVLQISYIGYKTQEIPVANKTKLNITMEEDTEVLDEVVVVGYGAQKKESVVGAIS QVSSKELLKSPAANISQAIAGKISGVITSQTSGAPGADDMKIYIRGRASFAGDNQPLILV DGVEREFSQIAPDDIESISTLKDASATAVYGVRGANGVMLITTKRGKEQKPTVSLTANWQ IQSPTRQDTYLDSYNSVVLLEEALANDGLPSQYSASDIEMYKRASAGQLSGIDALLYPNV DWYDTVLRNSAPAQRYNVNIQGGTKRMRYFTSAEYYNQQGLFKEFSQDEYGNKSNSSFKR FAFRANLDFLMTKDLTLSVNFGTRFEERRGPNSNESRDGTYSQAFYEMNHTPGWLFPVSY TVGEGEDQKTLYSGSSQYQNNIVARFAKAGFYRSTNTINETNFIVDYKMDWLTKGLAAKG MVSFDYDAYYRRAFSADFATYELNDRTNYNSIDAYTQFNTDTELAYLGNDQTTTYKLYME FQLNWARKFGKHDITAMALYNQNDYRYQADLAERYQGLVGRATYGYDDRYLAEVNFGYNG SENFMRGRRFGFFPSFSVGWRISNEAFMKGTEDWLNNLKIRASYGEVGNDVYKVNGVKQR FLYQAVWTQIANDYHFGTTGYTGIYESQYPNYAVTWERAHKYNLGLEFGLWNGLLNGNVD VFYEKRNDILTPYLTRPQWVGVNMAAGNLGETNNKGFEIELKHANHIGKDFTYNASLTFS HARNEIRNMDEPATKTAYRKQEGHPINQYFGLVCDGFVTQADLNNPDFPVSTFGNVQVGD LKYRDMNGDGFIDSRDETFIGYSDVPENTYALSLGCEYKGIGFSIMFQGVDHVSRYYDAE AMYAFVNGGKVKEHHLNRWNPNQSEAYNLTHASYPLLHYDSYGDHNQRKNSFFLKNGNFL RLKNIELSYSLPARWIRKVAMNECRVYVNANNLITWDKLDGLCDPESEGSNRYPIMKTVN FGVNIKF >gi|226332313|gb|ACIC01000007.1| GENE 6 12142 - 14451 1752 769 aa, chain - ## HITS:1 COG:no KEGG:BT_0365 NR:ns ## KEGG: BT_0365 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # 
Pathway: not_defined # 11 769 1 759 759 1467 98.0 0 MKTRKWPLLLMAILTVLSMSSCQDWGEADPPARNQKLPAKPDTSAKLIAEFTFDEDFNST ATEGETPVSGEGFAYSADQEIPEIVQDDARESGVMHANGGYLRIPNPLIGADIQTGVSFS FFVKSTKTDHTGALISFSDGTDKLYFTANAFLSYTGTGGYLDVNDPEVSVTNAMPYGTWH YVALTFTEKGYAIYIDGTKKYDTNNHASISSGKTRAVSVGDFDYSNMIDLLSSATYIYLG YGSDTETAEAYYDDLKIWTNTFSDSDAQGPNIGGGIHVPDPVYKATFETTTGLQIVGGGS FVTDDNSAFGTVFKNIPGGMRQNYLLLPEDVMSHSAETKEMSIGMWVNAKDAGISSTYMW SPLFSAHGQAPGIAGTANTDNWPMFALYVRGTLQLNNAGWCNFDDAQNVNGTNALYHDAT DWLADHGWHYYTVTLTATSAKVYMDGELKNEWQVAGTGDNNTIEGAFIYGANYKYITLGG NQAFDWGDPDPGFMFDDFAVYNKELSPEQIKQIMEDKTLALPTPVYINTFEEGAGDAKIV GSGKIVSVEDKGFGQIFQNVAGSKGTNYLMLPQDALSHSVESQEVSISVWVNAKNAGASD DYRYSPLFTAYGNEPSAGSENIWPLFVLQSRGLAQINCNGWTDFTAAQNEKGVNTVYCSE YAADGIVCEEDWLKDHEWHLYTAVLTSTTCKIYIDGEVANSWTVSGSGDGNTISGIFTNG ADLKYISLGGNQAWTWADPDPGFMFDDIAIYNKALTIDEIQAIMKKKKN >gi|226332313|gb|ACIC01000007.1| GENE 7 14681 - 18943 2912 1420 aa, chain - ## HITS:1 COG:SMc04212 KEGG:ns NR:ns ## COG: SMc04212 COG0642 # Protein_GI_number: 15965635 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Sinorhizobium meliloti # 865 1134 217 489 511 145 31.0 5e-34 MKKHLLIYLIIYSICSLAGQVRAQERFADRYNITYVTMNEGLPHNFIDDLYKDSRGFMWI STAGGGLSRYDGYEFVNYTPNNHQCKLKSNFIRNVCEDAFQRLWIVSEGGTDIIDLSTLK PIIPHDPKGIFPKILELPATRIMKDTQGCIWLHCDNALHRIEFNDKGEVQILSTLSPVYL NGPDIALKDIDEDGKIWMGNNGEIRKVALSPQKRLVTTPIADCLKFEAGTYISDFLSKEN EVWISSDRGLFRYNKTGNVIKRYEHDSGNPYSLSQNYLTNLAITDDKQLIVATLRGINIY NPMTDNFERIACDLPNGGTNLLNSNFINCILTDGEHIWFGTETGGINLLNPRQLSIRSYR HDKENPSSLSYNPVNAIYEDVYGTLWVGTVEGGLNRKERNSEDFTHYTREHGGLSHNSVS ALTADTDNHLWIGTWGGGLNLLDLKAPRQILEVISSQTGGGFPINFIGVLNYDPINEGIW IGANQGLYFYDPVKKEITAPLPDKAAENIHGCIGSIIDKEGKLWVGCLEGVYIIDLHSRS SKGEFQYRHLNYKLDDPSSRLIEKITCFYQGKDGTLWLGSNGYGIYQRKIDAQGKEQFIS YSTAQGLPNNSIRGILEDNNGNLWISTNNGLSCYHQAENRFINYTIQDGLIDTQFYWNAS CGSSHDLLYFGSVGGLVAIESNRPAMSLPAAKVRFTRLRIGNEEILPGSEYLSEDIAITT ELKLHEKEKSFSLEFAALNFESNNTAIYSYRLVGFDDKWVQVPGNRRFASYTNLPPGSYT 
LQVRYTPDGENEGENITELNITIVPYFYKTVWFILFIIVLLSVSVWQFYQWRIRNLKRQK EYLHRTVEERTHELEQQKHLLENQTDELSRQNQMLIQQNEKITRQKAQLIRMSRKVQELT LDKISFFTNITHEFRTPITLIIGPIERALKLSYNPQVIEQLNFVERNSKYLLSLVNQLMD FRKVESGKLEIVKTRGNFLKFIDSLITPFEVFAQERNIVLKRYYRMEMPEILYDEEAMRK VVTNLLSNAIKFTPNGGTVSLYLSALSAKDGEKETLYICVKDSGSGIPEEDLNRIFNRFY QSQNQVKYPVYGQAGTGIGLYLCKRIVQMHGGEIKAFNNRHAGCSFRILLPLQREEGTDE KTIIIDHNDSSAIPVQDSGAPKEKEALSMLVVEDNADMRGYIRSILREQYHVLEAANGEE ALHILNSNPIDFIISDLMMPVMDGIELSRRVKETFAISHIPFLMLTAKTSQEARLESYRM GVDEYLLKPFDETLLLTRIQNILENRKRYQRKFTLDMDVDVLNMEEESGDKKFLNQVMEV IKENYKNSYFEVSDFCEAVGVSKSLLNKKLQNLIGQSAGQFIRNYRLNIARELILKNRET KNMNIAEVAYEVGFNDPKYFTRCFTKHFNVTPSALLNNEE >gi|226332313|gb|ACIC01000007.1| GENE 8 19099 - 20640 1048 513 aa, chain + ## HITS:1 COG:BH1878 KEGG:ns NR:ns ## COG: BH1878 COG3507 # Protein_GI_number: 15614441 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Bacillus halodurans # 46 513 24 479 781 133 25.0 9e-31 MNEYTDIMKNLFLPVLLVAGSFFVSCTSSVFTPTPSANPWDDNYLSVAKMEDYRQWGTYN VHDPSCRKLGDYYYMYSTDAIFGENRKEAREKGVPLGFIQMRRSKDLVHWEFLGWAFPEI PEESVQWVQFHAGGQGATNIWAPYIIPYKDKYRLYYCVSAFGRKTSYIGLAESTSPEGPW TQKGCIVKTDDSTAMNAIDPSVIADENTGKWWMHYGSFFGGLYCVELNPETGLALNEGDL GHLVARRANYRKDNLEAPEIIYHPELKQYYLFTSYDPLMTTYNVRVSRSDAAEGPFTDYF GKAVKDTTNNFPILTAPYRFENHTGWAGTAHCAVFSDGEGNYFMAHQGRLSPQNQLMVLH IRQLFFTPEGWPVASPERYAGTASRQFSKEDLVGEWEIIRVQEPAYERQLEAGQILWGEG ELKEKEWNLSTRMTLVKDGTCKGQMTDNEWNIVQMNGKWSFLTEKHLLMVDLNSEKIENL VIFAGHDWENETETILFTGLDSRGRSVWGKRIK >gi|226332313|gb|ACIC01000007.1| GENE 9 20666 - 22648 1929 660 aa, chain + ## HITS:1 COG:CAC3436 KEGG:ns NR:ns ## COG: CAC3436 COG3534 # Protein_GI_number: 15896677 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Clostridium acetobutylicum # 2 621 3 627 835 425 39.0 1e-118 MRRYTELLAALALSTGMALHAQTNEMVIQTKKLGAEIQPTMYGLFFEDINYAADGGLYAE LVKNRSFEFPQRLMGWKTYGKVTLQDDGPFERNPHYVRLDNSGHAHKHTGLDNEGFFGIG VKQGEEYRFSVWARLPHGGTGEKIRVELVDTKSMGEHQAFASQTLTIDSKDWKKYQVILK 
AGVTNPKATLRIFLASQGTVDLEHISLFPVDTWKGHENGLRKDLAQALADIHPGVFRFPG GCIVEGTDLETRYDWKKSVGPVENRPLNENRWQYTFTHRFYPDYYQSYGLGFYEYFLLSE EMGAAPLPILNCGLACQYQNNEEKAHVAVCDLDSYIQDALDLIEFANGDVNTTWGKVRAD MGHPAPFNLKYLGIGNEQWGKEYPERLEPFIKALRKAHPEIMIVGSSGPNSEGKEFDYLW PEMKRLKADLVDEHFYRPESWFLSQGARYDNYDRKGPKVFAGEYACHGKGKKWNHFHAAL LEAAFMTGLERNADIVHMATYAPLFAHVEGWQWRPDMIWFDNLNSVRTVSYYVQQLYAQN KGTNVLPLTMNKKPVTGAEGQNGLFASAVYDKDKNELIVKVANTSDKTQPVSLTFEGLKK QDVLSEGRCITLSSLDQDKDNTLEQPFAITPQETPVTINGHALTTELGPNTFAVYKFTKK >gi|226332313|gb|ACIC01000007.1| GENE 10 22671 - 23615 851 314 aa, chain + ## HITS:1 COG:no KEGG:BT_0369 NR:ns ## KEGG: BT_0369 # Name: not_defined # Def: endo-1,4-beta-xylanase D precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 314 8 321 321 646 98.0 0 MISCVLAAFSGLSAQNDTTFVANGNPIIKYKYTADPGAMVHDGKVYIYAGHDECPPPKEH YQLNEWCVFSSPDMKTWTEHPVPLKAKDFSWAKGEAWASQVIERDGKFYWYVTVEHGTIH GKSIGVAVSDSPVGPFVDARGSALITNDMTTEFTKISWEDIDPTVFIDDDGQAYLYWGNT QCYYVKLKKNMIELDGPIVPVHLPRYTEAPWIHKRGDWYYLSYASEFPEKICYAMSRSIT GPWEYKGILNEIAGNSNTNHQAIIEFKGDWYFIYHNGSINTAGGSFRRSVCIDRLYYNED GTMKRIQMTTEGVQ >gi|226332313|gb|ACIC01000007.1| GENE 11 23734 - 24888 1278 384 aa, chain - ## HITS:1 COG:CAC2959 KEGG:ns NR:ns ## COG: CAC2959 COG0153 # Protein_GI_number: 15896212 # Func_class: G Carbohydrate transport and metabolism # Function: Galactokinase # Organism: Clostridium acetobutylicum # 4 383 9 388 389 249 39.0 5e-66 MDIEHVRSRFIKHFDGTTGFIYASPGRINLIGEHTDYNGGFVFPGAVDKGMLAEIKPNGT DKVRAYSIDLKDYVEFGLNEEDAPRASWARYIFGVCREMIKRGVDVKGFNTAFSGDVPLG AGMSSSAALESTYAFALNELFGEGKIDKFELAKVGQATEHNYCGVNCGIMDQFASVFGKA GSLIRLDCRSLEYQYFPFHPEGYRLVLMDSVVKHELASSAYNKRRQSCEAAVAAIQKKHP HVEFLRDCTMDMLEEAKADISAEDYMRAEYVIEEIQRVLDVCDALEKDDYETVGQKMYET HHGMSKLYEVSCEELDFLNDCAKEYGVTGSRVMGGGFGGCTINLVKNELYDNFVEKTKEA FKAKFGRSPKVYDVVIGDGSRRVE >gi|226332313|gb|ACIC01000007.1| GENE 12 24932 - 26242 1482 436 aa, chain - ## HITS:1 COG:BMEII1053 KEGG:ns NR:ns ## COG: BMEII1053 COG0738 # Protein_GI_number: 17989398 # Func_class: G Carbohydrate transport 
and metabolism # Function: Fucose permease # Organism: Brucella melitensis # 12 428 24 412 412 150 30.0 7e-36 MTQEKKNGKIIAIITMIFLFGMISFVTNLAAPMGIVLKNQFDVSNALGMLGNFGNFIAYA VMGIPSGILLQRVGYKKTALIAVAIGFIGVGIQFLSGHSSPEMAFAVYLIGAFVAGFSMC LLNTVVNPMLNKLGGEGNKGNQLIQVGGSFNSVMATITPMFVGILIAGSIEKATISQIFP VMYTAMAVFAFAFFVLLFVRIPEPNAAATTEPISTLMKGALKFRHFVLGAIAIFVYVGIE VGVPGTLNLFLTDPVEKGGAGIASTISGFVVGTYWFLMLIGRLAGASLGAKVSSKAMLTF TSALGLILVFLAIFSSTGTLVNLPVLQQSATGGLSFGFAEVPINAMYLVLVGLCTSIMWG GIFNLAVEGLGKYLAAASGLFMVLVCGGGILPVIQGWVADVAGFMASYWVIIAALAYLLY YGLVGCKNVNKDIPVE >gi|226332313|gb|ACIC01000007.1| GENE 13 26296 - 27393 374 365 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900011|ref|NP_344615.1| aldose 1-epimerase [Streptococcus pneumoniae TIGR4] # 38 364 27 345 345 148 31 4e-35 MNNTFPTEGNLSGLSRKDFQKEINNKETDLFILKNKKGMEVAVTNYGCAILAIMVPDKDG KYANVVLGHDSIDHVVNSPEPFLSTTIGRYGNRIAKGKFTLYGEEHELTINNGPNSLHGG PTGFHARVWDADQLAENIIQFNYVSADGEEGFPGNLEVEMVYRLEEEENALVIEYRATTD KATVVNLTNHGFFNLAGISNPTPTIENNIVTINANFYTPIDEVSIPTGEVAKVEGTPMDF TTPHTVGERINDKFQQLINGAGYDHCYVLNKIETGSLDLAATCFEPNSGRTMEVYTTEAG VQLYTGNWLNGFEGAHGATFPARSAICFEAQCFPDTPNKPHFPSATLLPGDEYQQVTIYK FGVEK >gi|226332313|gb|ACIC01000007.1| GENE 14 27622 - 28593 922 323 aa, chain + ## HITS:1 COG:CAC2918 KEGG:ns NR:ns ## COG: CAC2918 COG1482 # Protein_GI_number: 15896171 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Clostridium acetobutylicum # 1 323 1 310 326 200 38.0 3e-51 MYPLKFEPILKQTLWGGDKIIPFKHLNSDLKGVGESWEISGVEDNESVVANGPDKGLTLA DMVRKYREELVGEANYARFGNKFPLLIKFIDAKQDLSIQVHPADDLAKKRHNSMGKTEMW YVVDADKGAKLRSGFSEQITPKEYKERVLNNTITDVLQEYEIKPGDVFFLPAGRVHSIGA GAFIAEIQQTSDITYRIYDFNRKDANGKTRELHTDLAREAINYEVLDDYRTKYEAVKDEP VELVACPYFTTSVYDMTEEISCDYSELDSFVIFICMEGACKIKDNEGNELKVGAGESILL PATTQDVTITPEAGNVKLLETYV >gi|226332313|gb|ACIC01000007.1| GENE 15 28742 - 30292 1229 516 aa, chain + ## HITS:1 COG:no KEGG:BT_0374 NR:ns ## KEGG: BT_0374 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 516 1 516 516 1018 100.0 0 MNSKIYPIGVQNFESLRKDDYFYIDKTALVYQLARTGRYYFLSRPRRFGKSLLISTLEAY FQGKKELFEGLAIETLEKDWIKHPILHLDLNIEKYDVPESLDNILEKSLVAWEKLYGAEP SERSFSLRFAGIIQRACEKTGQRVVILVDEYDKPMLQAIGDDELQKYYRNTLKPFYGALK SKDGYIKFAMLTGVTKFGKVSIFSDLNNLKDISMDERFIEICGITEKEIHDNLEKELHEL ARNQKMSYDEVCKELKACYDGYHFVEDSVGIYNPFSLLNTFDQMKFGDYWFETGTPTYLV ELLKSSHYDLRRVVNVETDSDVLNSIDSTSKNPIPVIYQSGYLTIKGYDRRFKIYRLGFP NREVEEGFMKYLLPFYANVDQIDSPFQITKFIHEVEQGDCDAFFHRLQSFFADTPYELAR DLELHYQNVLFIVFKLIGFYVKVEYHTSEGRIDLVLQTDKFIYVMEFKLDGTAEEAIKQI NEKHYALPFEADGRRLFKIGVNFSSETRNIEKWIVE >gi|226332313|gb|ACIC01000007.1| GENE 16 30404 - 31351 552 315 aa, chain + ## HITS:1 COG:no KEGG:BT_0291 NR:ns ## KEGG: BT_0291 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 310 2 311 321 508 83.0 1e-142 MNKNGFSRCAESYIGRLRKEGRHSTAHVYTNALFSFTKFCGTTHVAFRQVTRERLRCYGQ YLYSSGLKPNTVSTYMRMLRSIYNRGVESGRAPYVHRLFHDVYTGVDICQKKALPVGELN RLLYEDPKSERLRRTQAIAALMFQFCGMSFADLAHLEKSSLERNIIRYNRIKTKTPMSVE VLDTAQDIISRLRNCQPSHPDCPDYLFSILLGDKKREDESAYREYQSALRRFNNRLKRLA KALRLTSPVTSYTIRHSWATTAKYRGVPIEMISESLGHKSIKTTQIYLKGFGLQERTEVN RMNLSYVKNCRIGRV >gi|226332313|gb|ACIC01000007.1| GENE 17 31703 - 32281 426 192 aa, chain + ## HITS:1 COG:no KEGG:BT_0376 NR:ns ## KEGG: BT_0376 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 192 1 192 192 384 99.0 1e-106 MILTKEKSAILGTKNGTGEGVACLKRWYVAHVRIHHEKKVAEYLGKMGIETFVPVQQEIH QWSDRRKLVETVLLPMMVFVHADPKERMAALTLATVSRYMVLRGEGKPAVIPDDQMARFR FMLDYSEEAICMNYSPLARGKKVRVIKGPLTGLVGELVALDGKSKIAVRLDMLGCACVDM PIGYVEQIGERN >gi|226332313|gb|ACIC01000007.1| GENE 18 32289 - 32496 77 69 aa, chain + ## HITS:1 COG:no KEGG:BT_0377 NR:ns ## KEGG: BT_0377 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 69 1 69 119 127 98.0 2e-28 MGTFNNSIQEKIEKLQKTVDTLLHMGENMDCICVDDLSLLNKEIHEQINDLYPCHGKTAE QEAALCLSL Prediction of potential 
genes in microbial genomes Time: Thu May 12 00:01:23 2011 Seq name: gi|226332312|gb|ACIC01000008.1| Bacteroides sp. 1_1_6 cont1.8, whole genome shotgun sequence Length of sequence - 3625 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 937 841 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases 2 1 Op 2 8/0.000 + CDS 972 - 2288 1133 ## COG1004 Predicted UDP-glucose 6-dehydrogenase 3 1 Op 3 . + CDS 2291 - 3355 613 ## COG0451 Nucleoside-diphosphate-sugar epimerases Predicted protein(s) >gi|226332312|gb|ACIC01000008.1| GENE 1 2 - 937 841 311 aa, chain + ## HITS:1 COG:FN1696 KEGG:ns NR:ns ## COG: FN1696 COG1086 # Protein_GI_number: 19705017 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Fusobacterium nucleatum # 1 301 314 605 607 259 46.0 5e-69 ESPLHNVRLEFEKNYPDLDFVPVIGDVRVKERLRMVFETYQPQIIFHAAAYKHVPLMEEN PCEAVLVNVVGSRQVADMAVEYGAEKMIMVSTDKAVNPTNVMGCSKRLAEIYVQSLGCAI REGKVKGHTKFITTRFGNVLGSNGSVIPRFKEQIENGGPVTVTHPDIIRFFMTIPEACRL VMEAATMGEGNEIFVFEMGKAVKIVDLATRMIELAGYRPGEDIEIEFTGLRPGEKLYEEV LSDKENTLPTENKKIMIAKVRHYEYTDILETYGVFENLSRTVKIMDTVKLMKRVVPEFKS KNSPRFEVLDK >gi|226332312|gb|ACIC01000008.1| GENE 2 972 - 2288 1133 438 aa, chain + ## HITS:1 COG:STM2080 KEGG:ns NR:ns ## COG: STM2080 COG1004 # Protein_GI_number: 16765410 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Salmonella typhimurium LT2 # 6 438 1 388 388 472 55.0 1e-133 MDTKELKIAVAGTGYVGLSIATLLSQHHQVTAVDVIPEKVDMLNRKQSPIQDEYIEKYLS EKSLNLTATLDGAKAYSDADFVVIAAPTNYDPVKNYFDTHHIEDVIDLVLSVNPDAVMVI KSTIPVGYCRGLYLKYACKGVKKLNLLFSPEFLRESMALYDNLYPSRIIVGYPKLIDSEQ FDEENEAIKSVADIPGLEKAARTFAALLQEGAIKEDIPTLFMGIKEAEAVKLFANTYLAL RVSYFNELDTYAEMRGLDSQSIIQGVGLDPRIGTHYNNPSFGYGGYCLPKDTKQLLANYQ 
DVPQNMMSAIVESNKTRKDYIADAVLHKAGYYTENGQWAASKEHTCVIGVYRLTMKSNSD NFRQSAIQGIMKRVKAKGAEVIIYEPTLEDGTTFFGSKVVNNIDEFKSQSKAIIANRFDT CLKDVEDKVYTRDIFRRD >gi|226332312|gb|ACIC01000008.1| GENE 3 2291 - 3355 613 354 aa, chain + ## HITS:1 COG:BH3709 KEGG:ns NR:ns ## COG: BH3709 COG0451 # Protein_GI_number: 15616271 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Bacillus halodurans # 13 350 3 333 343 358 51.0 6e-99 MLQYNVDLEGKTVLVTGAAGFIGSNLVKRLFHDVKNIKIIGIDSITDYYDVNIKYERLKE IESLNRDWIFVHASIADKDTVEEIFTENNVAIVVNLAAQAGVRYSITNPDSYIQSNLVGF YNILEACRHHEVEHLVYASSSSVYGSNKKVPYSTDDKVDNPVSLYAATKKSNELMAHAYS KLYNIPSTGLRFFTVYGPAGRPDMAYFGFTNKLREGKTIQIFNYGNCKRDFTYIDDIVEG VVRVMQHAPEKENGEDGLLIPPYKVYNIGNNNPENLLDFVTILQDELIRAKVLPLDYDFE VHKELVPMQPGDVPVTFADTELLEQDFGFKPNTTLREGLRSFAEWYAKYYGTNY Prediction of potential genes in microbial genomes Time: Thu May 12 00:01:24 2011 Seq name: gi|226332311|gb|ACIC01000009.1| Bacteroides sp. 1_1_6 cont1.9, whole genome shotgun sequence Length of sequence - 2224 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 107 - 352 202 ## BF0746 putative transposase_11 DDE family protein 2 1 Op 2 . + CDS 285 - 746 123 ## BF0746 putative transposase_11 DDE family protein + Prom 761 - 820 1.7 3 2 Tu 1 . 
+ CDS 854 - 2113 429 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid Predicted protein(s) >gi|226332311|gb|ACIC01000009.1| GENE 1 107 - 352 202 81 aa, chain + ## HITS:1 COG:no KEGG:BF0746 NR:ns ## KEGG: BF0746 # Name: not_defined # Def: putative transposase_11 DDE family protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 53 92 144 302 85 75.0 6e-16 MPEAISELVRQLIHSFKPQDCDSMKSLVDSMPIITCAGKNKVGKVAAEITSNGLLFDKEH VLLRTKTPCSGFPQKRNYTFS >gi|226332311|gb|ACIC01000009.1| GENE 2 285 - 746 123 153 aa, chain + ## HITS:1 COG:no KEGG:BF0746 NR:ns ## KEGG: BF0746 # Name: not_defined # Def: putative transposase_11 DDE family protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 152 151 302 302 193 65.0 1e-48 MYYFGLKLHAVAFRRKGTIPFPKMLILSATDENDSTVFKREYVGNLNNREIYADKIYSDI PFYKETQECKKTLAIYSCKSYQRRIPEVTKREKAARDLFSTTVSKVRQPIEALFNWLNEK TDIQRAMKVRFTSGLLVHTMEKIAIALITLIFN >gi|226332311|gb|ACIC01000009.1| GENE 3 854 - 2113 429 419 aa, chain + ## HITS:1 COG:MTH347 KEGG:ns NR:ns ## COG: MTH347 COG2244 # Protein_GI_number: 15678375 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Methanothermobacter thermautotrophicus # 12 356 7 359 420 163 29.0 5e-40 MSINVKRYLYSLKNPEGKNVMKNLSYLSLLQVANYIFPLITLPYLARVVGVDAFGLLAVG TSVIAYFQTLIDYGFNYTTVKEIARNKNDKPLINMIVMETMLARIILLVLSFTLIFLGIA FVPYLYENRIIILSTSTILVGYAFMLDWYFQAIEDMKFITLVNLVSNLVFTLSVFIFIHS PKDYFIQPLLTSAGTLVASFLCWIIIFRKYGLNLSIPSFYLAMARIKKGFNMFISLFLPT IYTNLNVLLLGAYNGTCATGIYSGGAKFTGIAYRITMLFSRVFYPFLSRRMDKHSVYAIV SILIGFLISIFFFFGSDLLVQLFLGDAFSETITVLRIVSFTPLAISIFNVYGTNYLAIKC KDSILRNIVIGTTILGLFFGVFGAIYYSYIGVAVCSLLTQSVRALAAFIFAKIEMKKSL Prediction of potential genes in microbial genomes Time: Thu May 12 00:01:29 2011 Seq name: gi|226332310|gb|ACIC01000010.1| Bacteroides sp. 
1_1_6 cont1.10, whole genome shotgun sequence Length of sequence - 3698 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 819 109 ## gi|253567779|ref|ZP_04845190.1| predicted protein + Prom 1699 - 1758 5.2 2 2 Op 1 . + CDS 1813 - 2154 108 ## gi|253567781|ref|ZP_04845192.1| predicted protein 3 2 Op 2 . + CDS 2144 - 3208 466 ## COG0438 Glycosyltransferase 4 2 Op 3 . + CDS 3213 - 3696 139 ## gi|291544013|emb|CBL17122.1| Acetyltransferases Predicted protein(s) >gi|226332310|gb|ACIC01000010.1| GENE 1 1 - 819 109 272 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567779|ref|ZP_04845190.1| ## NR: gi|253567779|ref|ZP_04845190.1| predicted protein [Bacteroides sp. 1_1_6] # 7 272 1 266 266 521 99.0 1e-146 KKQIEKLNYSNVTIIENSFSDCPQMTPRISQCLRWVLWDLSFLQYDYLYIVDIDMLYVRE PKPLHEQHIEHMRITGLCFDNMRRIHKRNPFKLSSIGQRIKYAGTKAILKYLFGSRIEYR ATGLHFIKVSEYYSAIPQDHLSSIRKQIYSGDWLNRVMYPNDEVFLYKILENHNLHPEKM AIQSNSCKSLDFNSYLTPEFRPHHGIHLGIFRNPVPLEERISDLKILQSDAYIYYKTIVI NSYFADKEFLSILSLAPESIKNSYNRLLEYYK >gi|226332310|gb|ACIC01000010.1| GENE 2 1813 - 2154 108 113 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567781|ref|ZP_04845192.1| ## NR: gi|253567781|ref|ZP_04845192.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 113 57 169 169 198 100.0 8e-50 MSSYLSSADCLLGGFADYCTKYIGQHNTFLDVITRVGLIGFPLFIAYFIKTFLSFFKEVK FYNPDITSLLAISGLIFIAYSQTHSSGIQSGITYFWELVAMFELSKKFNKYEI >gi|226332310|gb|ACIC01000010.1| GENE 3 2144 - 3208 466 354 aa, chain + ## HITS:1 COG:BH3663 KEGG:ns NR:ns ## COG: BH3663 COG0438 # Protein_GI_number: 15616225 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Bacillus halodurans # 5 325 5 325 373 110 27.0 5e-24 MRYNSVAFLLPDLSAGGAERVTITIARLLRKEGFDVEFVVLGPNKGEMLTWIEPEFNMTC LGFSRVLNSVPKLCSFMKEHNHSIFFSSREHVSIVGLLAARLTNQQIVVRVPNMPKNKLS KGLTGVKMSIIKTINRWLLKSAKIIIAQNAEMRTQLLDYYSLPQKKVVAINNPVDIEYIR SSAENSHNPFKENEVNFLNVCNIAYSKGIDILLEAWNKVKEAIPNAHMYIVGRNASEYAR EIVEKSKAYEDFEFLGFQSNPYVYLKYCDVFVLPSRMEGFPNVVLEAQCFNRPVVSTTCV EVIKDIVQQGCNGYYCEIENSGALADCMINASKLKKIENNYSMFDKELLLNCFR >gi|226332310|gb|ACIC01000010.1| GENE 4 3213 - 3696 139 161 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291544013|emb|CBL17122.1| ## NR: gi|291544013|emb|CBL17122.1| Acetyltransferases [Ruminococcus sp. 18P13] # 2 160 3 152 204 81 38.0 1e-14 MIRFAKWNDLEGVARVHAKCFPNSYVTQLSKVNWLGKDLLPVFYQTFFNDNPELFVVAED DNHKIVGFCMGYYMDKDDQIARFMVENSFWISIKTLLLLLIFNKYAWNKLLSHFLYKSSQ SDWTIVNDKYECYSNDQRGDLLSVCVLPECRGKQYAQGMME Prediction of potential genes in microbial genomes Time: Thu May 12 00:01:56 2011 Seq name: gi|226332309|gb|ACIC01000011.1| Bacteroides sp. 1_1_6 cont1.11, whole genome shotgun sequence Length of sequence - 9362 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 3, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 137 - 1054 343 ## TTE0659 hypothetical protein 2 1 Op 2 . + CDS 1051 - 2022 338 ## Tter_2795 glycosyl transferase group 1 3 1 Op 3 . + CDS 2024 - 2869 139 ## Coch_1776 acyltransferase 3 + Prom 2873 - 2932 7.0 4 2 Tu 1 . 
+ CDS 2987 - 3730 312 ## PRU_1335 family 11 glycosyl transferase + Term 3894 - 3935 -0.8 + Prom 3882 - 3941 6.4 5 3 Op 1 5/0.000 + CDS 4185 - 4970 610 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Term 4977 - 5033 6.0 + Prom 4987 - 5046 2.6 6 3 Op 2 5/0.000 + CDS 5073 - 6356 882 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis 7 3 Op 3 1/0.000 + CDS 6356 - 6949 388 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis + Prom 7092 - 7151 8.4 8 3 Op 4 14/0.000 + CDS 7316 - 8410 875 ## COG1089 GDP-D-mannose dehydratase 9 3 Op 5 . + CDS 8413 - 9361 591 ## COG0451 Nucleoside-diphosphate-sugar epimerases Predicted protein(s) >gi|226332309|gb|ACIC01000011.1| GENE 1 137 - 1054 343 305 aa, chain + ## HITS:1 COG:no KEGG:TTE0659 NR:ns ## KEGG: TTE0659 # Name: not_defined # Def: hypothetical protein # Organism: T.tengcongensis # Pathway: not_defined # 8 304 7 304 312 188 39.0 3e-46 MALSLSNIKRWYLMLTGQSIYHVNQSIGEYFQKDKICGYYNNMMEKVQKAPQYVDNEDMP SLDLGNGKQFFFPVGIFQFAFGLLDLYYKFHEEKYKAKFRQCADWALAHQMETGAWDNFS YYYSNNPYGAMAQGEGASLLIRAYVQFKEEKYLSAAKKAIDFMLLSNTEGGCTEYSGEKN VLLLEYPHRKAVLNGFIFSWWGLYDYVLVTGEGGVYKKVLKESLDTLIGLLSQFRGSYWS VYDLEGRMASPFYHNLHIAQMQAMYELTGESIFDEYAKCWERQQKNPICKGLAFIKKSIQ KILEK >gi|226332309|gb|ACIC01000011.1| GENE 2 1051 - 2022 338 323 aa, chain + ## HITS:1 COG:no KEGG:Tter_2795 NR:ns ## KEGG: Tter_2795 # Name: not_defined # Def: glycosyl transferase group 1 # Organism: T.terrenum # Pathway: not_defined # 53 321 60 307 325 84 26.0 8e-15 MRILLISNQHPNKQGIGNPVMCRMKNALSEDSRIEKVEFLPFYNSLSSLKIIRKVSKQYD LVHIHFGGLYALIIWFLLIGVHCKKFITFHGTDIHAKALKTAKDWKERLKIRLNQKASFL SIKLFDKCGFVAKEMMTYVPSCLAKQMAGKAFIQLLGVDYSTFRIINKEMAQEYLGLEHK KYVLFSDVSNTSIKRRDIAENIVRELGPDYNLLIMCGVKPDEVPYYINACEFALLTSDEE GSPNIIREVLSLNKSFFSVEVGDAAKQLEGLHNSCIISRNPQEAARQICNVIINEYSDNT RLSLQYCLDFSKISRRVIDLYLS >gi|226332309|gb|ACIC01000011.1| GENE 3 2024 - 2869 139 281 aa, chain + ## HITS:1 COG:no KEGG:Coch_1776 
NR:ns ## KEGG: Coch_1776 # Name: not_defined # Def: acyltransferase 3 # Organism: C.ochracea # Pathway: not_defined # 11 181 9 183 318 65 31.0 2e-09 MEKTKRLIWADALKGYLILTVILGHAIQYCLPDGLCEVNYWWKLIYSFHMAAFIAISGFV NYKSKTGCCKRRYYQLFIPFVLWLLIYWRTKGGDFATLANIFFRPDGYLWFLWVLFVISL LFVNAIWISKKLRVKLEYVVFFLCLLLIGIMIVTDFRLFGYQFIAYYFLFYAIGYFANRY KEYLVSNNWYLISLFCIWMILASFWSMHDLPFFLSNIPFIPASLLQYGYRFITAFVAIIF LLSFGTKYLNSDKNQRMIVCCGNMSLGFTFMSIYHVYHTKY >gi|226332309|gb|ACIC01000011.1| GENE 4 2987 - 3730 312 247 aa, chain + ## HITS:1 COG:no KEGG:PRU_1335 NR:ns ## KEGG: PRU_1335 # Name: not_defined # Def: family 11 glycosyl transferase # Organism: P.ruminicola # Pathway: not_defined # 1 223 1 224 289 129 33.0 7e-29 MKIVNITGGLGNQMFQYAFAMALKYRNPQEEVFVDIQHYNTIFFKKFKGINLHNGYEIDK VFPKAKLPVAGVRQLMKFSYWIPNYILSRLGRKFLPIRKKEYIPPYSMNYSYDEKALNWK GDGYFEGYWQSYNHFGDIKEELQKVYAHPKPNQYNAALISNLESCNSVGIHVRRGDYLAE PEFRGICGLDYYEKGIKEILSDEKNMCSSSLAMICNGVKRILLLLWVIIALYLYQAIKVR IAVGICS >gi|226332309|gb|ACIC01000011.1| GENE 5 4185 - 4970 610 261 aa, chain + ## HITS:1 COG:BH3661 KEGG:ns NR:ns ## COG: BH3661 COG0463 # Protein_GI_number: 15616223 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus halodurans # 6 250 10 253 257 169 41.0 4e-42 MQNYGLVSIITPTWACAKFICETIYSIQAQTYQNWELLIQDDCSTDNTYLVVEPLANLDP RIKYQCNEKRSGAAITRNNALKRANGRWIAFLDSDDLWLPEKLERQLKFMVQNNYAFTYH EYIEMSEEGNDLGVYVSGKKRVNELDMYICCWPGCLTVMYDMEKIGLIQIKDVCKNNDTA MWLKVVQKSPCYLLQENLARYRRRAVSITPKPLYKRIWAHYPLFHVAEEMHPVRATYWVL MNVLGNTFKKIFYVKHYDVKG >gi|226332309|gb|ACIC01000011.1| GENE 6 5073 - 6356 882 427 aa, chain + ## HITS:1 COG:SP1837 KEGG:ns NR:ns ## COG: SP1837 COG0399 # Protein_GI_number: 15901666 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Streptococcus pneumoniae TIGR4 # 7 420 5 395 408 476 54.0 1e-134 
MNIQLRNVPFSPPDMTEAEANEVRDAILSGWITTGPRTKKLEKKISEYVHTQKTVCLNSA TAAMEMVLHLLGIGPGDEVIVPAYTYTASASVVAHVGATIVMVDAQSECVEMDYNKLADA ITEKTKVIVPVDLAGICVNLDKIYEVINSPEIKAKFTPSNDIQRKIGRIIVSNDCAHSFG ANRHGKMAGEIADFSSFSFHAVKNFTTAEGGALTWNLPFGQEKVERQQVYCGSPVPFIEG ETWNEFLYRLSQLFSLHGQNKDALAKTKFGAWEYDIIGPWYKCNMTDIMAAMGLIQFERY PTMLKRRRAIIEKMDVAFADLNVTTLKHYGDDWEGSGHLYMVRLSGHSRKEVNIVIEKMA ERGIATNVHYKPLPMMTAYKALGFDIVNFPNAYHLFENEITLPLNTRMTDEDVDYVIENF VEIVKSL >gi|226332309|gb|ACIC01000011.1| GENE 7 6356 - 6949 388 197 aa, chain + ## HITS:1 COG:SP1838 KEGG:ns NR:ns ## COG: SP1838 COG2148 # Protein_GI_number: 15901667 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Streptococcus pneumoniae TIGR4 # 18 193 49 229 230 166 48.0 2e-41 MKRLFDILVSGCGLIFLSPLFLVLAIWIKLDSKGPVFYCQVRVGKRNKDFRIFKFRSMRV GADKGSLVTIGGRDPRVTKSGYIIRKYKFDELPQLINVFIGDMSLVGPRPEVRHYVDYWT PEQMHVLDVRPGITDPASIKFRNENELMEKAVDPESYYINVIMQEKIKLYLEYVQNVSFV YDIKLIFQTFKVIAIER >gi|226332309|gb|ACIC01000011.1| GENE 8 7316 - 8410 875 364 aa, chain + ## HITS:1 COG:STM2109 KEGG:ns NR:ns ## COG: STM2109 COG1089 # Protein_GI_number: 16765439 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: GDP-D-mannose dehydratase # Organism: Salmonella typhimurium LT2 # 4 353 1 356 373 479 65.0 1e-135 MNKMKKVALISGITGQDGSFLAEFLIEKGYEVHGILRRSSSFNTGRIEHLYLDEWVRDMK KERLVNLHYGDMTDSSSLIRIIQQVQPDEIYNLAAQSHVKVSFDVPEYTAEADAIGTLRM LEAVRILGMKKTRIYQASTSELFGLVQEVPQKETTPFYPRSPYGVAKQYGFWITKNYRES YGMYAVNGILFNHESERRGETFVTRKITLAAARIAQGLQDKLYLGNLNSLRDWGYAKDYV ECMWLILQHDTPEDFVIATGEYHTVREFTTLAFKEVGIELRWDGEGVDEKGIDVSTGKVL VEVDSKYFRPAEVEQLLGDPTKARTLLGWNPTKTSFSELVKIMVSHDMKFVKKLHMKELM NREE >gi|226332309|gb|ACIC01000011.1| GENE 9 8413 - 9361 591 316 aa, chain + ## HITS:1 COG:ECs2857 KEGG:ns NR:ns ## COG: ECs2857 COG0451 # Protein_GI_number: 15832111 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar 
epimerases # Organism: Escherichia coli O157:H7 # 6 316 5 274 321 287 47.0 2e-77 MDKNTKIYVAGHHGLVGSAIWNNLQRRGYTNLVGRSHKELDLLDGVAVKQFFDEEQPDAV VLAAAHVGGILANLQYRADFIYQNLQIQQNVIGESFRHNVQKLLFLGSTCIYPRDAAQPM KEDALLTSPLEYTNEPYAIAKIAGLKMCESFNLQYGTNYIAVMPTNLYGPNDNFHLENSH VLPAMIRKIHLAKCLNEDDWKAVRRDLDLRPVEGVTGSESDAGILDKLAKFGITPESVTL WGTGTPMREFLWSEEMADASVHVLLNVDFKDTYVTGSRDIRNCHINVGTGKEVSIREVAE IIMKEIGFKGILLWDS Prediction of potential genes in microbial genomes Time: Thu May 12 00:02:12 2011 Seq name: gi|226332308|gb|ACIC01000012.1| Bacteroides sp. 1_1_6 cont1.12, whole genome shotgun sequence Length of sequence - 2509 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 50 - 109 3.0 1 1 Op 1 7/0.000 + CDS 143 - 940 676 ## COG1596 Periplasmic protein involved in polysaccharide export 2 1 Op 2 . + CDS 953 - 2507 1369 ## COG3206 Uncharacterized protein involved in exopolysaccharide biosynthesis Predicted protein(s) >gi|226332308|gb|ACIC01000012.1| GENE 1 143 - 940 676 265 aa, chain + ## HITS:1 COG:PM1016 KEGG:ns NR:ns ## COG: PM1016 COG1596 # Protein_GI_number: 15602881 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Pasteurella multocida # 41 232 77 253 387 62 29.0 8e-10 MEKLNGRNVLILLLILLLTACQSYKKVPYLQDAEVVLYSTQNEQLYDAKIMPKDLLTIVV SCTSPELAAPFNLTVATQSNAALNYTTTQPVLQQYLVDNEGNINFPVLGELHVGGLTKKA TEQMIVEKLKPYITETPIVTVRMVNYKISVIGEVARPGTFTISNEKVNILEALAMAGDMT VYGLRDDVKLIRENANGKQEIIPLDLNKAETILSPYYYLQQNDIIYVTPNKAKARNSDIG TSTSLWFSATSILVSIASLLFNILK >gi|226332308|gb|ACIC01000012.1| GENE 2 953 - 2507 1369 518 aa, chain + ## HITS:1 COG:PM1018_1 KEGG:ns NR:ns ## COG: PM1018_1 COG3206 # Protein_GI_number: 15602883 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized protein involved in exopolysaccharide biosynthesis # Organism: Pasteurella multocida # 7 517 2 
424 482 66 20.0 1e-10 MKEENKNGKGPEMTEDQIDFRALLFKYIIHWPWFVGAVLLCFVGAWFYLHWATPIYNISA TVLIKDEKKGGGAGLSSELEDMGLSGLMTSSKNIDNELEVLRSKTLVKEVVNQLGLYITY KDEDEFPAKGLYKTSPVQVSLTPQEAEKLNAPMMVEMTLQPEGSMDVNVTVGEKGYQKHF EKLPAIFPTDEGTLAFFQEVDSVTLQNGTKAPRIEKNVRHITATINKPMRVAKGYCNSLS IAPTSKTTSVAVISLKNSSLQRGQDFINQLLEMYNRNTNNDKNEIAQKTAEFIDERIGII SKELGSTEANLETFKRDAGITDLTSEAQIALAGNAEYEKKSVENRTQISLVNDLRKYLRG NEYEVLPSNVGLQDAALIGAIERYNEMLMERKRLLRTSTENNPTIVNLDTSIRAMKANVQ ATLEGTLQGLMITKESLDREASRYSRRISNAPGQERAYVSIARQQEIKAGLYLMLLQKRE ENAIALAATANNAKIIDEAIADDIPVSPKRSMIYLIAL Prediction of potential genes in microbial genomes Time: Thu May 12 00:02:16 2011 Seq name: gi|226332307|gb|ACIC01000013.1| Bacteroides sp. 1_1_6 cont1.13, whole genome shotgun sequence Length of sequence - 10801 bp Number of predicted genes - 17, with homology - 15 Number of transcription units - 8, operones - 5 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 199 192 ## + Prom 205 - 264 5.0 2 1 Op 2 . + CDS 288 - 521 202 ## gi|253567796|ref|ZP_04845207.1| conserved hypothetical protein + Term 554 - 599 2.6 - Term 542 - 587 7.2 3 2 Op 1 . - CDS 593 - 1036 174 ## COG3023 Negative regulator of beta-lactamase expression 4 2 Op 2 . - CDS 1046 - 1147 63 ## 5 2 Op 3 . - CDS 1182 - 1679 659 ## BT_0403 hypothetical protein - Prom 1722 - 1781 5.8 - Term 1790 - 1849 10.3 6 3 Tu 1 . - CDS 1938 - 4265 1159 ## BT_0404 hypothetical protein - Prom 4391 - 4450 8.5 + Prom 4339 - 4398 7.8 7 4 Op 1 . + CDS 4419 - 4544 102 ## BT_0405 hypothetical protein 8 4 Op 2 . + CDS 4550 - 4777 151 ## BT_0405 hypothetical protein + Term 4971 - 5011 -0.9 9 5 Op 1 . - CDS 4801 - 5058 236 ## BT_0406 hypothetical protein 10 5 Op 2 . - CDS 5067 - 5426 352 ## BT_0407 hypothetical protein 11 5 Op 3 . - CDS 5455 - 5664 428 ## BT_0408 hypothetical protein - Prom 5692 - 5751 6.6 + Prom 5735 - 5794 5.3 12 6 Tu 1 . 
+ CDS 5867 - 6637 860 ## COG4221 Short-chain alcohol dehydrogenase of unknown specificity + Prom 6639 - 6698 3.7 13 7 Tu 1 . + CDS 6749 - 7249 549 ## BT_0410 hypothetical protein + Term 7259 - 7296 -0.4 + Prom 7341 - 7400 5.3 14 8 Op 1 . + CDS 7500 - 8324 754 ## COG1218 3'-Phosphoadenosine 5'-phosphosulfate (PAPS) 3'-phosphatase 15 8 Op 2 1/0.000 + CDS 8336 - 9889 1377 ## COG0471 Di- and tricarboxylate transporters 16 8 Op 3 8/0.000 + CDS 9906 - 10511 685 ## COG0529 Adenylylsulfate kinase and related kinases 17 8 Op 4 . + CDS 10532 - 10799 202 ## COG0175 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes Predicted protein(s) >gi|226332307|gb|ACIC01000013.1| GENE 1 2 - 199 192 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no EYTLINELSFEKKLPNLCTVINGVDLKKRKYGYYYGYGKYGKHYGYGKRYGYGYGYGQTK SERKD >gi|226332307|gb|ACIC01000013.1| GENE 2 288 - 521 202 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567796|ref|ZP_04845207.1| ## NR: gi|253567796|ref|ZP_04845207.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 77 1 77 77 156 100.0 5e-37 MNQPDFVVKCYNKQDLAQMYFPDITVRASVNKLRRWMRRCQPLMDEILSTDFHPKTKAFS VREVRLITYYLGKPGDL >gi|226332307|gb|ACIC01000013.1| GENE 3 593 - 1036 174 147 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 55 139 2 97 116 68 39.0 3e-12 MESSKDEYLPREIKLLVIHCSATRCNVSFPVERLRECHLQRGFRDIGYHFYITQDGVLHH CRPVSEIGAHVRGFNRHSIGICYEGGLDENGRPADTRTTAQRFALLDLLTILKHQYPEAQ ILGHYQLSATIHKACPCYDPRKEYLGL >gi|226332307|gb|ACIC01000013.1| GENE 4 1046 - 1147 63 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKQVWKNILQFIVTIATSIISAIGVTSCTAHL >gi|226332307|gb|ACIC01000013.1| GENE 5 1182 - 1679 659 165 aa, chain - ## HITS:1 COG:no KEGG:BT_0403 NR:ns ## KEGG: BT_0403 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 165 1 165 165 287 99.0 1e-76 MAQNYVVMARKNLLKPSEAPKYYAVARSGRKVTVKEVCKRITERSSYSKGELEGCIGEFL LEIVNVLDEGNIVQMGDLGNFRMSLKTATATATEKEFKASCIEKGKVLFYPGSDLRKLCK TLDFALYKSDGKPDSGDEPLPDDGKGDQPGGGSGDDGEEAPDPTV >gi|226332307|gb|ACIC01000013.1| GENE 6 1938 - 4265 1159 775 aa, chain - ## HITS:1 COG:no KEGG:BT_0404 NR:ns ## KEGG: BT_0404 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 775 1 775 775 1534 98.0 0 MQVLEEIKVSVYENVYSKKPKIMSFLEVIFMCIHPVYASIIQSIRRYHQEGDHEAAQKLK SQLPCFTPAGTFDGAHAIRNFQLPSHIIGLDYDHVPNRLEIIRLCAADPHTVAALESPTD GVKIFAYVEGIEGRHREGQLLVSRYYDQLTGLTSDPACKDESRLCYFTYSPDGYVASLYQ SFVLEAAVETQPFQPTAENLPSPPLPAKASETSENFSEEEVSLFLSSYIFLNPLTAGQRH TNLFKLACEACRRRYSQESILRGITVYFEHSDFPAQEIRSILQSGYQKVTSTPSVSASPL CSSPHKDKMTKATYSPSENVYEADEAYWQGEEFRKETPCFPKSVYKYLPDLLNECILEEE GDREQDLSFLSNLTALSSVLPATFGIYNHKKYSPHFYSFGIAPAGSNKSIAQTGRYLLEE VHDWILSNSELQQKTYNHKYTQWKLDCTYKKKAHEECPEEPEKPAYKMLFLPATTSYSRM QIQMRDNGPQGSIIFDTEAQTLATANHLDCGNFDDMLRKAFEHENIDSAFKINGLAPIYI 
RFPMLAMFLTGTPSQMASLIETSEKGLPSRIMLYTFRSIPKWKPMGDDSISLEESFKPLA HRVFELYHFCKNHPVLFHFSRSQWDYLNHTFSKLLAEVVLEGNDDLQAVVKRYACLVMRI SMIQARIRQFEANDGAPDIYCEDVDFDRSLQIVLCCYEHSRLLLSSMPSSQLHPLKDPNS TRKFISELPETFTTEDAMQIGVKYDFSRRKISRLIKSFIGVKINKISHGKYQKIS >gi|226332307|gb|ACIC01000013.1| GENE 7 4419 - 4544 102 41 aa, chain + ## HITS:1 COG:no KEGG:BT_0405 NR:ns ## KEGG: BT_0405 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 30 1 30 119 65 100.0 4e-10 MTKENHGPYEFLGSMVVNLMNIDKSISYSFWKMSCISPEGM >gi|226332307|gb|ACIC01000013.1| GENE 8 4550 - 4777 151 75 aa, chain + ## HITS:1 COG:no KEGG:BT_0405 NR:ns ## KEGG: BT_0405 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 75 45 119 119 155 100.0 4e-37 MRQGKDLHVYQYVRVVSCIMDEIHLEILLNALLKELKNVFAAHCDLVIGTIPHRLYGNGQ PAEWVVVMRWDGVGV >gi|226332307|gb|ACIC01000013.1| GENE 9 4801 - 5058 236 85 aa, chain - ## HITS:1 COG:no KEGG:BT_0406 NR:ns ## KEGG: BT_0406 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 85 1 85 85 131 98.0 8e-30 MSAQTPAKITLEEITQRKKKLLNEIQAQKRAMTATTREIFSPLAPTANKADSLMRSFNTG MAIFDGVVLGIKIMKKMRTYFRRLR >gi|226332307|gb|ACIC01000013.1| GENE 10 5067 - 5426 352 119 aa, chain - ## HITS:1 COG:no KEGG:BT_0407 NR:ns ## KEGG: BT_0407 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 119 1 119 119 148 100.0 7e-35 MFADDKSIENFQQLFFEFKKYLELQKEYTKLELTEKLTILLSTLIMIVILIILGMVALFY LLFALAYVLEPLVGGLMASFAIIAGINILIMALVIIFRKQLIISPMVNFLANLFLTDSK >gi|226332307|gb|ACIC01000013.1| GENE 11 5455 - 5664 428 69 aa, chain - ## HITS:1 COG:no KEGG:BT_0408 NR:ns ## KEGG: BT_0408 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 69 1 69 69 81 100.0 8e-15 MKGLNVLAAFLGGAAVGAALGILFAPEKGEDTRNKIAEILRKKGIKLNRNEMENLVDEIA AEIKGEIGE >gi|226332307|gb|ACIC01000013.1| GENE 12 5867 - 6637 860 256 aa, chain 
+ ## HITS:1 COG:all0475 KEGG:ns NR:ns ## COG: all0475 COG4221 # Protein_GI_number: 17227971 # Func_class: R General function prediction only # Function: Short-chain alcohol dehydrogenase of unknown specificity # Organism: Nostoc sp. PCC 7120 # 1 253 4 256 257 294 53.0 1e-79 MKAKIVFITGASSGIGEGCARKFAKEGWNLILNARTVSKLEELKAELEKEYGIQVCVLPF DVRDRKQAAAALEALPEEWKSIDVLINNAGLVIGVDKEFEGSLDEWDIMIDTNIRGLLAM TRLVVPGMVERGCGHIINIGSIAGDAAYPGGSVYCATKAAVKALSDGLRIDLVDTPLRVT NIKPGMVETNFTVVRYRGDKQAADNFYKGIRPLTGDDIAETVYYAASAPAHIQIAEVLLM PTYQATGTISYKKKAE >gi|226332307|gb|ACIC01000013.1| GENE 13 6749 - 7249 549 166 aa, chain + ## HITS:1 COG:no KEGG:BT_0410 NR:ns ## KEGG: BT_0410 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 166 1 164 164 294 100.0 9e-79 MDMKKLVVLGMGVCMVLAFASCKSSESAYKKAYEKAKQQELAEPQVEAPVEVTPVVAAPV TTPKATDTTGVRQEKVTVVSGTDGLKDYSVVVGSFGVKANAEGLKDWLDGQGYNATIAFN AEKAMYRVIVSSFADKAAAVDARDAFKAKYPSRTDFQGAWLLYRIY >gi|226332307|gb|ACIC01000013.1| GENE 14 7500 - 8324 754 274 aa, chain + ## HITS:1 COG:aq_337 KEGG:ns NR:ns ## COG: aq_337 COG1218 # Protein_GI_number: 15605852 # Func_class: P Inorganic ion transport and metabolism # Function: 3'-Phosphoadenosine 5'-phosphosulfate (PAPS) 3'-phosphatase # Organism: Aquifex aeolicus # 10 267 6 249 268 259 52.0 4e-69 MEQKYVMAAIDAALKAGERILSIYEDPRSDFEIERKADNSPLTIADRKAHEAIVAILNET PFPVLSEEGKHMDYAVRRGWDTLWIVDPLDGTKEFIKRNGEFTVNIALVQNAVPVMGVIY VPVKKELYFAVEGTGAYKCSGIVSLEDEGVTLQQMIEKSERMPLADTRDHFIAVASRSHL TPETETYIADLKKKHGSVELISSGSSIKICLVAEGKADVYPRFAPTMEWDTAAGHAIARA AGMEVYQAGKEEPLRYNKEDLLNPWFIVEAKREH >gi|226332307|gb|ACIC01000013.1| GENE 15 8336 - 9889 1377 517 aa, chain + ## HITS:1 COG:BH3384 KEGG:ns NR:ns ## COG: BH3384 COG0471 # Protein_GI_number: 15615946 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Bacillus halodurans # 1 305 2 300 589 160 32.0 8e-39 MTFEIVFVLLSLLGMMAALIADKMRPGMVLFSVVVLFLCVGILTPKEMLEGFSNKGMITV 
ALLFLVSEGIRQSGTLGQVIKKLLPQGKTTVFKAQLRILPSVAFISAFLNNTPVVVIFAP IIKHWAKSVKLPATKFLIPLSYVTILGGICTLIGTSTNLVVHGMILEAGYEGFTMFELGK VGIFIAIAGIIYLFIFSKRLLPDARPDTAVPDEEVEEGEKLHRVEAVLGARFPGINKKLS DFNFQRHYGAEVKEIKTRNGQRYVDHLEDVVLREGDTLVVMADDTFIPTWGESSVFVLLT NGNDPDTSGKKKRWLALILLVLMIVGATVGELPATKEMFPDIKLDMFFFVCITTIIMAWT NIFPARKYTKYISWDILITIACAFAISKAMVNSGVADCVAGFIIGLSDDYGPHVLLAILF VITNLFTELITNNAAAALAFPLALSISAQLGVSPTPFFVVICMAASASFSTPIGYQTNLI VQGIGNYKFTDFVRIGLPLNIITFLISVILIPLIWNF >gi|226332307|gb|ACIC01000013.1| GENE 16 9906 - 10511 685 201 aa, chain + ## HITS:1 COG:BH3385 KEGG:ns NR:ns ## COG: BH3385 COG0529 # Protein_GI_number: 15615947 # Func_class: P Inorganic ion transport and metabolism # Function: Adenylylsulfate kinase and related kinases # Organism: Bacillus halodurans # 13 201 16 204 208 202 52.0 5e-52 MEEKNHIYPIFDRMMTREDKEELLGQHSVMIWFTGLSGSGKSTIAIALERELHKRGLLCR ILDGDNIRSGINNNLGFSETDRVENIRRIAEVSKLFLDSGIITIAAFISPNNDIREMAAN IIGKDDFLEVFVSTPLEECEKRDVKGLYAKARRGEIQNFTGISAPFEVPEHPALSLDTSK LSLEESVNRLLEMVLPKIEKK >gi|226332307|gb|ACIC01000013.1| GENE 17 10532 - 10799 202 89 aa, chain + ## HITS:1 COG:VC2560 KEGG:ns NR:ns ## COG: VC2560 COG0175 # Protein_GI_number: 15642555 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes # Organism: Vibrio cholerae # 1 89 14 102 315 142 71.0 1e-34 MEEYKLSHLKELEAESIHIIREVAAEFENPVMLYSIGKDSSVMVRLAEKAFYPGKVPFPL MHIDSKWKFKEMIQFRDEYAKKHGWNLIV Prediction of potential genes in microbial genomes Time: Thu May 12 00:02:55 2011 Seq name: gi|226332306|gb|ACIC01000014.1| Bacteroides sp. 
1_1_6 cont1.14, whole genome shotgun sequence Length of sequence - 10577 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 2, operones - 2 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 18/0.000 + CDS 3 - 539 653 ## COG0175 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes 2 1 Op 2 . + CDS 579 - 2036 1655 ## COG2895 GTPases - Sulfate adenylate transferase subunit 1 3 1 Op 3 . + CDS 2048 - 3154 1196 ## BT_0416 hypothetical protein 4 1 Op 4 . + CDS 3173 - 4141 870 ## BT_0417 hypothetical protein + Term 4160 - 4217 10.1 5 2 Op 1 . + CDS 4460 - 5605 1288 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins + Term 5627 - 5677 11.3 6 2 Op 2 . + CDS 5682 - 6098 363 ## COG0816 Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) 7 2 Op 3 . + CDS 6143 - 6697 629 ## COG0242 N-formylmethionyl-tRNA deformylase 8 2 Op 4 . + CDS 6774 - 8825 2213 ## COG0457 FOG: TPR repeat 9 2 Op 5 . 
+ CDS 8908 - 10576 1616 ## COG0441 Threonyl-tRNA synthetase Predicted protein(s) >gi|226332306|gb|ACIC01000014.1| GENE 1 3 - 539 653 178 aa, chain + ## HITS:1 COG:CC1483 KEGG:ns NR:ns ## COG: CC1483 COG0175 # Protein_GI_number: 16125730 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes # Organism: Caulobacter vibrioides # 1 178 134 311 311 267 70.0 8e-72 KYKFDAAFGGARRDEEKSRAKERIFSFRDKFHQWDPKNQRPELWDIYNARVHKGESIRVF PISNWTELDIWQYIRLENIPIVPLYYAKERPVINLDGNIIMADDDRLPEKYRDQIEMKMV RFRTLGCWPLTGAVESGAATIEEIVEEMMTTTKSERTTRVIDFDQEGSMEQKKREGYF >gi|226332306|gb|ACIC01000014.1| GENE 2 579 - 2036 1655 485 aa, chain + ## HITS:1 COG:PA4442_1 KEGG:ns NR:ns ## COG: PA4442_1 COG2895 # Protein_GI_number: 15599638 # Func_class: P Inorganic ion transport and metabolism # Function: GTPases - Sulfate adenylate transferase subunit 1 # Organism: Pseudomonas aeruginosa # 6 435 11 433 451 516 60.0 1e-146 MDNKLDIKAFLDKDEQKDLLRLLTAGSVDDGKSTLIGRLLFDSKKLYEDQLDALERDSKR VGNAGEHIDYALLLDGLKAEREQGITIDVAYRYFSTNGRKFIIADTPGHEQYTRNMITGG STANLAIILVDARTGVITQTRRHTFLVSLLGIKHVVLAVNKMDLVDFSEERFDEIVSEYK KFVEPLGIPDVNCIPLSALDGDNVVDKSERTPWYKGISLLDFLETVHIDNDHNFTDFRFP VQYVLRPNLDFRGFCGKVASGIVRKGDTVMALPSGKTSKVKSIVTYDGELDYAFPPQSVT LTLEDEIDVSRGEMLVHPDNLPIVDRNFEAMMVWMDEEPMDINKSFFIKQTTNLSRTRID AIKYKVDVNTMEHLSIDNGQLTKDNLPLQLNQIARVVLTTAKELFFDPYKKNKSCGSFIL IDPITNNTSAVGMIIDRVEMKDMAATDDIPVLDLSKLDIAPEHHAAIEKAVKELERQGLS IKVIK >gi|226332306|gb|ACIC01000014.1| GENE 3 2048 - 3154 1196 368 aa, chain + ## HITS:1 COG:no KEGG:BT_0416 NR:ns ## KEGG: BT_0416 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 368 1 368 368 696 100.0 0 MGLLEFNKLPINTLVGADWKTFKAITAGRDIDAAYKGKYRLTKAVCRLLSPLAALQDKRY EKLLANQPLEHDPVFILGHWRSGTTFVHNVFSCDKHFGYNTTYQTVFPHLMMWGQPFFKK NMSWLMPDKRPTDNMELAVDLPQEEEFALSNMMPYTYYNFWFLPKYQQEYADKYLLFDDI 
TDKELKVFEEVFTKLIKISLWNTKGTQFLSKNPPHTGRVKELVKMFPNAKFIYLMRNPYT VFESTRSFFTNTIQPLKLEDVSNEQLEENILSIYAKLYHKYESDKKFIPEGNLIEVKFED FEADAMGMTENIYKSLSIPGFAEAQGDIEKYVGGKKGYKKNKYKYDDRTIRLVEENWDFA LKQWNYNL >gi|226332306|gb|ACIC01000014.1| GENE 4 3173 - 4141 870 322 aa, chain + ## HITS:1 COG:no KEGG:BT_0417 NR:ns ## KEGG: BT_0417 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 322 1 322 322 656 99.0 0 MIKKMVHRLFLTGFVAFFSSLSLMAQHKVEMVPFGDMDQWVDRQIKESGIIGGNTKNVYE IGPTTVIKGDQVYKNMGGSPWATSNVMAKVAGITKTNTSVFPEKRGDGYCARLDTRMESV KVLGLVNITVLAAGSIFTGSVHEPIKGTKNPQKMLQTGIPFTKKPVALQFDYKVKMSDRE KRIRATGFSKITDVAGKDYPAAILFLQKRWEDANGNVYAKRIGTLVTYYYHSTDWKNNAT YEIMYGDITSRPEYKPHMMRLQVTENYTVNSKGESVPIHEVAWGDENDEPTHLYLQFTSS HGGAYVGSPGNSLWIDNVKLVY >gi|226332306|gb|ACIC01000014.1| GENE 5 4460 - 5605 1288 381 aa, chain + ## HITS:1 COG:PA1777 KEGG:ns NR:ns ## COG: PA1777 COG2885 # Protein_GI_number: 15596974 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Pseudomonas aeruginosa # 149 355 110 315 350 61 27.0 3e-09 MKKGLLFILMVGASVCLSAQEKGKTEGKAYTVETNRFGANWFISGGVGGQMYFGDNDGKA DFGKRIAPALDIAVGKWFTPGIGLRVAYNGLQAKGAAPSANDLYTKGGKYSNGYYKQKWN MANFHGDVLFNLSNMFCGYNEDRLYSFIPYVGAGVALSWSEPHQQNLGINAGLINRFRVS DAWDINLELRGLLMKDAFGGTSKEGMAGVTVGVTYKFKKRGFNAVPTVPMVPESQLDDMR NRLNSLKGENESLKRDLVEARNKKPEVIVKKEGGFVPRLVVVFNIGKSNISKREYMNIEA MAKGIKANPEKTFTVTGYADKGTGSAEYNMKLSKKRAEAVRDLMVNEFGVPASQLKVDYK GGVGNMFYDDAKLSRVAIVEE >gi|226332306|gb|ACIC01000014.1| GENE 6 5682 - 6098 363 138 aa, chain + ## HITS:1 COG:CAC1680 KEGG:ns NR:ns ## COG: CAC1680 COG0816 # Protein_GI_number: 15894957 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. 
subtilis) # Organism: Clostridium acetobutylicum # 3 135 2 134 135 83 39.0 1e-16 MSRIVAIDYGRKRTGIAVSDTLQLIANGLTTVPTHELLNFIGGYVAKEPVERIIIGLPKQ MNNEASENMKNIEPFVRSLKKRFPELPVEYVDERFTSVLAHRTMLEAGLKKKDRQNKALV DEISATIILQTYLESKRF >gi|226332306|gb|ACIC01000014.1| GENE 7 6143 - 6697 629 184 aa, chain + ## HITS:1 COG:TM1661 KEGG:ns NR:ns ## COG: TM1661 COG0242 # Protein_GI_number: 15644409 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Thermotoga maritima # 5 166 4 155 164 126 45.0 2e-29 MILPIYVYGQPVLRKVAEDITPEYPNLKELIANMFETMVHADGVGLAAPQIGLPIRVVTI TLDPLSEDYPEFKDFNKAYINPHIIEVGGEEVSMEEGCLSLPGIHESVKRGNKIRVKYMD ENFVEHDEVVEGYLARVMQHEFDHLDGKMFIDHLSPLRKQMIRGKLNTMLKGKARSSYKM KQVK >gi|226332306|gb|ACIC01000014.1| GENE 8 6774 - 8825 2213 683 aa, chain + ## HITS:1 COG:all0889 KEGG:ns NR:ns ## COG: all0889 COG0457 # Protein_GI_number: 17228384 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. 
PCC 7120 # 18 563 36 574 605 159 25.0 2e-38 MIKRILTVLLLFPTLVCAQINTERVMTIARNALYFEDYVLSIQYFNQVINAKPYLYEPYF FRALAKVNLEDYQGAEADCDEAIKRNPFVVGAYQVRGLARIRQSKFDGAIEDYQKALHYD PENITLWHNLTLSHIQKKDYDAAKEDLESLLKVSPRYTRAYLMRGEVSLQQKDTVAALKD FNTALELDKYDPDAWSARAIVRLQQSNYAEAESDLDRAIHLSAKNAGNYINRALARFHQN NLRGAMSDYDLALDIDPNNFIGHYNRGLLRARVGDDNRAIEDFDFVIQMEPDNMMAIFNR ALLRAQTGDYRGAIKDYTTVIDQYPNFLAGYYHRAEARKKIGDKKGAEQDDFKLLKAQLD KQNGGTNKDVAQNQNKDKENQSGENGDESEEGKTRKKSDKNMNNYRKIVIADDSEADQRY TSDYRGRVQDKNVNIKLEPMFALTYYEKSSEVKRSVNFHKFIEDLNLSNVLPKRIRITNM EAPLTEEQVKFHFALIDAHTSAIVDDDKSAPKRFARAIDFYLVQDFSSAVADLTQAILLD GDFFPAYFMRALIRCKQLEYQKAEQAAEADMPSGAAVKEVSAVDYEVVRKDLDKVITLAP DFVYAYYNRANVAAMLKDYRAAIVDYDKAIQLSPDFADAYFNRGLTHIFLGNNKAGISDL SKAGELGVVSAYNVIKRFTDQTE >gi|226332306|gb|ACIC01000014.1| GENE 9 8908 - 10576 1616 556 aa, chain + ## HITS:1 COG:DR2081 KEGG:ns NR:ns ## COG: DR2081 COG0441 # Protein_GI_number: 15807075 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Threonyl-tRNA synthetase # Organism: Deinococcus radiodurans # 2 556 1 559 649 587 53.0 1e-167 MIKITFPDGSVREYNEGVNGLQIAESISSRLAQDVLACGVNGETYDLGRPINEDADFVLY KWDDEEGKHAFWHTSAHLLAEALQELYPGIQFGIGPAIENGFYYDVDPGEAVIKESDLPA IEAKMLELSAKKEAVVRESISKTDALKMFGDRGETYKCELISELEDGHITTYTQGAFTDL CRGPHLMTTAPIKAIKLTTVAGAYWRGHEDRKMLTRIYGITFPKKKMLDEYLVLLEEAKK RDHRKIGKEMQLFMFSETVGKGLPMWLPKGTALRLRLQEFLRRIQTRYDYQEVITPPIGN KLLYVTSGHYAKYGKDAFQPIHTPEEGEEYFLKPMNCPHHCEIYKNFPRSYKDLPLRIAE FGTVCRYEQSGELHGLTRVRSFTQDDAHIFCRPDQVKDEFLRVMDIISIVFRSMNFQNFE AQISLRDKVNREKYIGSDDNWEKAEQAIIEACEEKGLPAKIEYGEAAFYGPKLDFMVKDA IGRRWQLGTIQVDYNLPERFELEYMGSDNQKHRPVMIHRAPFGSMERFVAVLIEHTAGKF PLWLTPDQVAILPISE Prediction of potential genes in microbial genomes Time: Thu May 12 00:03:09 2011 Seq name: gi|226332305|gb|ACIC01000015.1| Bacteroides sp. 
1_1_6 cont1.15, whole genome shotgun sequence Length of sequence - 11650 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 7, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 16/0.000 + CDS 1 - 264 286 ## COG0441 Threonyl-tRNA synthetase 2 1 Op 2 . + CDS 336 - 941 654 ## COG0290 Translation initiation factor 3 (IF-3) + Term 949 - 999 4.1 3 2 Op 1 . + CDS 1015 - 1212 333 ## PROTEIN SUPPORTED gi|29345834|ref|NP_809337.1| 50S ribosomal protein L35 4 2 Op 2 . + CDS 1312 - 1662 595 ## PROTEIN SUPPORTED gi|29345835|ref|NP_809338.1| 50S ribosomal protein L20 + Term 1688 - 1729 10.1 + Prom 1762 - 1821 5.5 5 3 Tu 1 . + CDS 2021 - 2791 431 ## COG1145 Ferredoxin + Term 2805 - 2866 2.0 6 4 Op 1 . - CDS 2785 - 3357 479 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins - Prom 3378 - 3437 4.5 7 4 Op 2 1/0.000 - CDS 3454 - 4761 1269 ## COG1541 Coenzyme F390 synthetase 8 4 Op 3 11/0.000 - CDS 4773 - 5357 735 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit 9 4 Op 4 . - CDS 5361 - 6953 1464 ## COG4231 Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits 10 4 Op 5 . - CDS 7034 - 8071 794 ## COG1559 Predicted periplasmic solute-binding protein - Prom 8105 - 8164 2.5 11 5 Tu 1 . - CDS 8176 - 9054 648 ## COG1864 DNA/RNA endonuclease G, NUC1 - Prom 9214 - 9273 3.1 - TRNA 9121 - 9193 81.4 # Thr CGT 0 0 - Term 9071 - 9108 6.0 12 6 Tu 1 . - CDS 9323 - 10531 335 ## PROTEIN SUPPORTED gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 - Prom 10779 - 10838 8.2 13 7 Tu 1 . 
+ CDS 10874 - 11648 527 ## BT_0434 hypothetical protein Predicted protein(s) >gi|226332305|gb|ACIC01000015.1| GENE 1 1 - 264 286 87 aa, chain + ## HITS:1 COG:BS_thrS KEGG:ns NR:ns ## COG: BS_thrS COG0441 # Protein_GI_number: 16079947 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Threonyl-tRNA synthetase # Organism: Bacillus subtilis # 1 74 562 635 643 73 50.0 8e-14 EYAEQVKMYLKMHEIRAIVDDRNEKIGRKIRDNEMKRIPYMLIVGEKEAENGEVSVRRQG EGDKGTMKYEEFAKILNEEVQNMINKW >gi|226332305|gb|ACIC01000015.1| GENE 2 336 - 941 654 201 aa, chain + ## HITS:1 COG:BH3140 KEGG:ns NR:ns ## COG: BH3140 COG0290 # Protein_GI_number: 15615702 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 3 (IF-3) # Organism: Bacillus halodurans # 12 172 26 187 190 145 48.0 5e-35 MKNDTLKGQYRINEQIRAKEVRIVGDEIEPKVYPIFQALKMAEEKELDLVEISPNAQPPV CRIIDYSKFLYQLKKRQKEQKAKQVKVNVKEIRFGPQTDDHDYNFKLKHARGFLEDGDKV KAYVFFKGRSILFKEQGEVLLLRFANDLEDFAKVDQMPVLEGKRMTIQLSPKKKEMPKKP AAPKPVTQGQKAEKPEAGEEE >gi|226332305|gb|ACIC01000015.1| GENE 3 1015 - 1212 333 65 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29345834|ref|NP_809337.1| 50S ribosomal protein L35 [Bacteroides thetaiotaomicron VPI-5482] # 1 65 1 65 65 132 100 9e-31 MPKMKTNSGSKKRFTLTGTGKIKRKHAFHSHILTKKSKKRKRNLCYSTTVDTTNVSQVKE LLAMK >gi|226332305|gb|ACIC01000015.1| GENE 4 1312 - 1662 595 116 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29345835|ref|NP_809338.1| 50S ribosomal protein L20 [Bacteroides thetaiotaomicron VPI-5482] # 1 116 1 116 116 233 100 4e-61 MPRSVNHVASKARRKKILKLTRGYFGARKNVWTVAKNTWEKGLTYAFRDRRNKKRNFRAL WIQRINAAARLEGMSYSKLMGGLHKAGIEINRKVLADLAMNHPEAFKAVVAKAKAA >gi|226332305|gb|ACIC01000015.1| GENE 5 2021 - 2791 431 256 aa, chain + ## HITS:1 COG:MA4170 KEGG:ns NR:ns ## COG: MA4170 COG1145 # Protein_GI_number: 20092963 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Methanosarcina acetivorans str.C2A # 2 240 12 247 294 74 26.0 1e-13 
MIFYFSGTGNSKWIANQLSKEQQEDLFFIPDAWKNDTWEFSLREDEKIGFVFPVYSWAPP VIVQEFIRKLVLKGYQQQYLFFVCSCGDDTGLTRQVMEKALAARGWRCNAGFSVIMPNNY VLFPGFDVDSKELEKKKLAEAVPVLEKVSRLITGRETVFACKEGSLPFIKTRIINPLFVR YQMSPKPFHATSECIGCKRCEKSCPVGNITMKERRPVWGKNCTACLACYHVCPQHAVQYG KKTKGKGHYFNPDSAY >gi|226332305|gb|ACIC01000015.1| GENE 6 2785 - 3357 479 190 aa, chain - ## HITS:1 COG:CAC0873 KEGG:ns NR:ns ## COG: CAC0873 COG0503 # Protein_GI_number: 15894160 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Clostridium acetobutylicum # 1 180 1 181 189 177 50.0 9e-45 MQLLKKRILQDGKCYEGGILKVDGFINHQMDPVLMKSIGVEFVRRFAATNVNKIMTIEAS GIAPAIMTGYLMDLPVVFAKKKSPKTIQNALSTTVHSFTKDRDYEVVISADFLTPNDNVL FVDDFLAYGNAALGILDLIEQSGAKLVGMGFIIEKAFQNGRKILEEKGVRVESLAIIEDL SNCCIKIKDQ >gi|226332305|gb|ACIC01000015.1| GENE 7 3454 - 4761 1269 435 aa, chain - ## HITS:1 COG:AF2013 KEGG:ns NR:ns ## COG: AF2013 COG1541 # Protein_GI_number: 11499595 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Archaeoglobus fulgidus # 2 433 8 438 440 449 48.0 1e-126 MSTQYWEEEIEIMSREKLQELQLQRLKKTINIAANSPYYKEVFSKNGITGDSIQSLDDIR KIPFTTKSDMRANYPFGLVAGDMKRDGVRIHSSSGTTGNPTVIVHSQHDLDSWANLVARC LYMVGIRKTDVFQNSSGYGMFTGGLGFQYGAERLGCLTVPAAAGNSKRQIKFISDFKTTA LHAIPSYAIRLAEVFQEEGIDPRETTLKTLVIGAEPHTDEQRRKIERMLNVKAYNSFGMT EMNGPGVAFECQEQNGMHFWEDCYLVEIIDPETGEPVPEGEIGELVLTTLDREMMPLIRY RTRDLTRILPGKCPCGRTHLRIDRIKGRSDDMFIIKGVNIFPMQVEKILVQFPELGSNYL ITLETVNNQDEMIVEVELSDLSTDNYIELEKIRRDIIRQLKDEILVTPKVKLVKKGSLPQ SEGKAVRVKDLRDNK >gi|226332305|gb|ACIC01000015.1| GENE 8 4773 - 5357 735 194 aa, chain - ## HITS:1 COG:PH0764 KEGG:ns NR:ns ## COG: PH0764 COG1014 # Protein_GI_number: 14590633 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Pyrococcus horikoshii # 4 193 5 200 202 108 36.0 4e-24 
MKKDIILSGVGGQGILSIATVIGKAALKEGLYMKQAEVHGMSQRGGDVQSNLRISDKPIA SDLIPSGKCDLIISLEPMEGLRYLPYLSPEGWLVTNETPFVNIPNYPEEDKVMAEINKLP HKIVLNVDKVAKEVGSARVANIVLLGATIPFLGIDYEKVQDSIREIFLRKGEAIVEMNLK ALAAGKEIAEKLMQ >gi|226332305|gb|ACIC01000015.1| GENE 9 5361 - 6953 1464 530 aa, chain - ## HITS:1 COG:CAC2001 KEGG:ns NR:ns ## COG: CAC2001 COG4231 # Protein_GI_number: 15895271 # Func_class: C Energy production and conversion # Function: Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits # Organism: Clostridium acetobutylicum # 3 529 2 521 584 356 38.0 5e-98 MSKQLLLGDEAIAQAALDAGLSGVYAYPGTPSTEITEYIQMAPITTEQNIHNRWCANEKT AMEAALGMSFVGKRALVCMKHVGMNVAADCFVNSAITGVKGGLIVIAADDPSMHSSQNEQ DSRFYGDFSLIPMYEPSNQQEAYDMVYNGFEFSEKTGEPILMRMVTRLAHSRSGVERKEQ KPQNSISFSEDPRQFILLPGNARKRYKVLLARQDEFIKASEESPYNKYIDGPNKKLGIVA CGIGYNYLMENYPEGCEYPVLKIGQYPLPKKQLLQLVESCDEILVLEDGQPFVEKHLKGY LGIGIKVKGRLDGTLSQDGELNPDSVARAVGKENKSEFGVPSVVEMRPPALCEGCGHRDM YITLTEVLKEEYPSHKVFSDIGCYTLGANAPFNAINSCVDMGASITMAKGAADGGLHPSV AVIGDSTFTHSGMTGLLDCVNENANVVIVISDNETTAMTGGQDSAGTGRIESICIGLGVD PAHIRVVVPLKKNHEEMRQIIREEIEYNGVSVIIPRRECIQTLARKKRSK >gi|226332305|gb|ACIC01000015.1| GENE 10 7034 - 8071 794 345 aa, chain - ## HITS:1 COG:XF0675 KEGG:ns NR:ns ## COG: XF0675 COG1559 # Protein_GI_number: 15837277 # Func_class: R General function prediction only # Function: Predicted periplasmic solute-binding protein # Organism: Xylella fastidiosa 9a5c # 28 342 29 343 350 124 28.0 3e-28 MKKKTRNILLSVLTGALLLCAIAGGTVYYYLFAPQFHPSKTVYVYVDRDDTADSIYNKIR QTGHVNKFTGFQWMAKYRKFDQNIHTGRYAIRPNENVYHVFSRFFRGYQEPMNLTIGSIR TLDRLARSIGKQLMIDSAEIARQLFDSAFQTEMGYTPVTMPCLFIPETYQVYWDMSVDDF FKRMQTEHKRFWNDERLAQATAIGMTPEEVCTLASIVEEETNNNEEKPMVAGLYINRLHT GMPLQADPTIKFALQDFGLRRITNEHLKVNSPYNTYINSGLPPGPIRIPSKKGLDSVLNY TKHNYIYMCAKEDFSGTHNFASNYADHMANARKYWKALNERKIFK >gi|226332305|gb|ACIC01000015.1| GENE 11 8176 - 9054 648 292 aa, chain - ## HITS:1 COG:BB0411 KEGG:ns NR:ns ## COG: BB0411 COG1864 # Protein_GI_number: 15594756 # Func_class: F Nucleotide transport and 
metabolism # Function: DNA/RNA endonuclease G, NUC1 # Organism: Borrelia burgdorferi # 101 273 2 175 195 120 38.0 3e-27 MKRGKSRRLFKKKKSHSNQRLGCIIAIIVLIPICYGLYLYYQQYSVQHSSKPQIETSVSL PTPSGKDLEIPISLIPRQEQIIRHKGYTVSYNKDLKIPNWVSYELTRQETKGKEKRGDNF IADPLVKGAIATNADYARSGYDKGHMAPAADMKWSPDVMKESFYFSNMCPQHPQLNRRGW KNLEEKIRDWAVADSAIIIICGPIIDKPSKTIGKNKVAVPERFFKVILSPFVKPARGIGF LFNNRQAVEPLSTYAVSIDSIEKLTHMDFFSPLPDELENAVEANADYYQWPR >gi|226332305|gb|ACIC01000015.1| GENE 12 9323 - 10531 335 402 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 [Bacillus selenitireducens MLS10] # 87 399 5 317 323 133 27 5e-31 MNQQFLKEIEMGSKNALVKKRIITHYIYNGSSTIPDLSKELDLSVPTVTKFIGEMCEDGY INDYGKLETSGGRHPNLYGLNPESGYFIGVDIKRFAVNIGLINFKGDMVELKMNIPYKFE NSIEGMNELCKLILNFIKKLPINKEKILNINVNVSGRVNPESGYSFSQFNFEERPLADVL SEKLGHKVTIDNDTRAMTYGEYMQGCVKGEKDIIFVNVSWGVGIGIIIDGKVYTGKSGFS GEFGHVNAYDNEIICHCGKKGCLETEASGSALHRILLERIKSGESSILSTRISGEEDPIT LDEIITAVNKEDLLCIEIVEEIGQKLGKQIAGLINIFNPELVIIGGTLSLTGDYITQPIK TAVRKYSLNLVNKDSAIITSKLKDKAGIVGACMLARSRMFES >gi|226332305|gb|ACIC01000015.1| GENE 13 10874 - 11648 527 258 aa, chain + ## HITS:1 COG:no KEGG:BT_0434 NR:ns ## KEGG: BT_0434 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 258 2 259 627 543 100.0 1e-153 MKNLFQILCGLFLVVSYSSCMHSEEDKVDVLIIGGGASGVTSGIQSARMGANTLILEEST WLGGMLTSAGVSAVDGNYRLPAGLWGEFKNRLSDYYGGLDSLKTGWVSNVLFEPSVGNKI LHEMVAAENNLKVWKQACLEEVKRTGDEWVAKVKVEGQGVKTVRAKVMIDATELGDVAKM CGVKYDIGMESRDDTHEDIAPEKKNNIVQDITYVAILKDYGKDVTIPEPEGYDPKEFACA CASPVCITPKEPDRVWSK Prediction of potential genes in microbial genomes Time: Thu May 12 00:03:16 2011 Seq name: gi|226332304|gb|ACIC01000016.1| Bacteroides sp. 1_1_6 cont1.16, whole genome shotgun sequence Length of sequence - 3160 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
+ CDS 44 - 1108 976 ## BT_0434 hypothetical protein 2 1 Op 2 . + CDS 1147 - 2004 771 ## BT_0435 hypothetical protein 3 1 Op 3 . + CDS 2028 - 2249 199 ## BT_0435 hypothetical protein 4 1 Op 4 . + CDS 2246 - 3158 534 ## COG0477 Permeases of the major facilitator superfamily Predicted protein(s) >gi|226332304|gb|ACIC01000016.1| GENE 1 44 - 1108 976 354 aa, chain + ## HITS:1 COG:no KEGG:BT_0434 NR:ns ## KEGG: BT_0434 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 354 274 627 627 722 100.0 0 MINWPIEGNDYYINLIEMTPEERLKALEYAKHYTMCFVYFLQHELGYNTLGLADDEYPTE DKLPFIPYHRESRRIHGLVRFNLNHALNPYTQDEKLYRTCIAVGDYPVDHHHTRYHGYEE LPNLYFHPIPSYGLPLGTLIPQEVDGLIVAEKSISVSNIMNGTTRLQPVVLQIGQAAGAL AALAVKNNQKISDVSVRDVQNAILDAKGYLLPYLDVPVSDVKFAPYQRIGSTGILKGVGK NVDWSNQTWLRADTVLLASELDGLFEVYPAAKDEFKKTGADKLSIEEAVQLVQKIAEMEK LHTTVDPQALWKAYGMQEPDLKRNILRGEMAVMIDQVLDPFNKKQIDITGKYIQ >gi|226332304|gb|ACIC01000016.1| GENE 2 1147 - 2004 771 285 aa, chain + ## HITS:1 COG:no KEGG:BT_0435 NR:ns ## KEGG: BT_0435 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 284 1 284 366 584 100.0 1e-165 METNNRRDFLKKAALAGASALVAPSLFAADGEGRGELSPKIKAGDLESKLIVPKNNGLKI TGTFLDEISHDIPHQNWGEKEWDLDFQHMKRIGIDTVIMIRSGYRKFMTYPSPYLLKKGC YMPSVDLVDMYLRLAEKYNMKFYFGLYDSGRYWDTGDLSWEIEDNKYVIDEVWKMYGEKY KSFGGWYISGEISRATKGAIDAFRAMGKQCKDISNGLPTFISPWIDGKKAIMGTGKLTRE DAVSVQQHEKEWNEIFDGIHEVVDACAFQDGHIDYDELDAFFTVK >gi|226332304|gb|ACIC01000016.1| GENE 3 2028 - 2249 199 73 aa, chain + ## HITS:1 COG:no KEGG:BT_0435 NR:ns ## KEGG: BT_0435 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 73 294 366 366 154 100.0 1e-36 MQCWTNAESFDRDMPIRFLPIKFDKLRMKLEAAKRAGYDKAITFEFSHFMSPQSAYLQAG HLYDRYREYFEIK >gi|226332304|gb|ACIC01000016.1| GENE 4 2246 - 3158 534 304 aa, chain + ## HITS:1 COG:STM4418 KEGG:ns NR:ns ## COG: STM4418 COG0477 # Protein_GI_number: 16767664 # Func_class: G Carbohydrate transport 
and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Salmonella typhimurium LT2 # 1 296 1 287 477 202 38.0 5e-52 MSDMKSTINFGYLIFLSVVAALGGFLFGYDTAVISGTIAQVTQLFQLDTLQQGWYVGCAL IGSIVGVLFAGILSDKLGRKMTMIISATLFSTSALGCAISADFTQLVVYRIIGGVGIGVV SIVSPLYISEVAVAQYRGRLVSLYQLAVTVGFLGAYLINYQLLAYAESGNQLSMDWLNKI FVTEVWRGMLGMETLPAVLFFIIIFFIPESPRWLIVRGKEEKAVNILERIYNSVSEAASQ LKETKSVLTSETKSEWAMLMKPGIFKAVIIGVCIAILGQFMGVNAVLYYGPSIFENAGLS GGDS Prediction of potential genes in microbial genomes Time: Thu May 12 00:03:31 2011 Seq name: gi|226332303|gb|ACIC01000017.1| Bacteroides sp. 1_1_6 cont1.17, whole genome shotgun sequence Length of sequence - 14997 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 3, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 52 - 408 291 ## COG2942 N-acyl-D-glucosamine 2-epimerase 2 1 Op 2 . + CDS 463 - 3687 2503 ## BT_0452 hypothetical protein 3 1 Op 3 . + CDS 3706 - 5367 1576 ## BT_0451 hypothetical protein 4 1 Op 4 . + CDS 5387 - 7084 1117 ## BT_0450 hypothetical protein 5 1 Op 5 . + CDS 7131 - 8534 1213 ## BT_0449 putative S-layer related protein precursor + Term 8598 - 8637 7.7 6 2 Op 1 . + CDS 8670 - 10103 985 ## BT_0448 hypothetical protein 7 2 Op 2 . + CDS 10115 - 12769 1945 ## COG1649 Uncharacterized protein conserved in bacteria 8 2 Op 3 . + CDS 12766 - 14175 751 ## BT_0446 hypothetical protein - Term 14770 - 14812 9.1 9 3 Tu 1 . - CDS 14832 - 14996 116 ## BT_0445 endoglucanase E precursor (EGE) Predicted protein(s) >gi|226332303|gb|ACIC01000017.1| GENE 1 52 - 408 291 118 aa, chain + ## HITS:1 COG:all3695 KEGG:ns NR:ns ## COG: all3695 COG2942 # Protein_GI_number: 17231187 # Func_class: G Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Nostoc sp. 
PCC 7120 # 1 118 270 387 388 187 70.0 3e-48 MLDYGWDKQYGGIYYFMDRNGCPPQQLEWDQKLWWVHIESLISLLKGYQLTGDRKCLEWF EKVHDYTWSHFKDPEYPEWYGYLNRRGEVLLPLKGGKWKGCFHVPRGLFQCWKVLEPL >gi|226332303|gb|ACIC01000017.1| GENE 2 463 - 3687 2503 1074 aa, chain + ## HITS:1 COG:no KEGG:BT_0452 NR:ns ## KEGG: BT_0452 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1074 1 1074 1074 2080 100.0 0 MTEFNSKIKQNSRILKIFLFSICALFACAINAHAQSLVQGTVTDETGESLPGVSVVVKGT TNGTVTDVNGKYSIQATSKDILSFTYVGMAEQDIKVAGQKTINVVMKDDVASLDEVIVVG YGTAKKQSLTGAVSAVKGDELLKAPATNVSSLLGGRLPGISSVQVSGEPGDDQAVLRVRG SIYNVTYIVDGMPRSINDIDPNDIESVSVLKDGASAAVYGLKGAGGVIIVTTKRGQEGKS RITYNGSIGASMNANFPQFMNGPQFAYYYNMADMMDKLANGSITDMKQYTPVFTKANVEA MLNGDPTDGWDNVNYIDKVFGTGINQKHNITVQGGSDKMRYFTSVGYLGQDGNIDNFTYR RYNLRTNIETQLAKNFQLSLGVAGNVGKRETPGYASGGTDSNSTLGEQGWLSVAHQTIMM HPYLPEKYDGLYTATTQNNTSLPNSPLAAIYESGYKHTNSFDLQTNLSLQYNLPWVKGLS VKVTGAYDYKTSHNKNLNTPYSTYINKMPTSTTDWTWTKADDPRGTANGINLGEGQFSSR QMVGQGSINYANSFGKHNVEAMVLAEIRDYKENSFSGYNKNMSFAELPELSFGQPADSPI SGYSDANRSIGYVFRLKYDYDNKYLAEFTGRYDGSYKFAGNVSGKRWAFFPSASVAWRIS KENFMSDLTFLDDMKIRASIGLLGNDDVSPYAFLSTYNLMDGNKNAYQTILNGKVMQALR ASVIANPTLTWENTLTYNAGFDFTMWNGLLGMEFDAFYNYTYDILTAMGGDYPPSMGGYY YTYANYNKVDSKGVEITVSHRNKLNLAGKPFSYGASFNLTFARNRYLRYPDSPNTVEWRK RTGRSVDASVVWVAEGLFRSEEEIDNSAWYGTRPNVGDIKYKDLNGDGKIDEDDRTRVGR GNRPELTYGLNLNCAWNGFDLSAQFTGGALFDVSLTGTYYNGYDDNTVWTQAFKEGANSP LYLVQNAYTTENPNGTFPRLTLGNQGHGGDNGLASTFWLRNGRYLRLKSAQLGYTFPKKW MSPVGIQNLRIYVEGSNLFTISGLPDGIDPESPGVNNGYYPQQRTIMGGITLTF >gi|226332303|gb|ACIC01000017.1| GENE 3 3706 - 5367 1576 553 aa, chain + ## HITS:1 COG:no KEGG:BT_0451 NR:ns ## KEGG: BT_0451 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 553 1 553 553 1094 100.0 0 MKTKFISIILLGASLGLGSCSDYLDMTPTDKASDKLVWSKLEYAEQAVNDFYRYIDYLGV YRDGQCLAGMTEALTDQLKYGSYNYMSKCQIPSEIAYGGSVLTVSYVSTYLGNWGTMYEY VRRVNEGLNKLKKYASFGQTEQERLEAEMRFFRGYLYTDLVKRYKEVIIYDEDLDNIKKD 
MPVSSEAEGWNFVYEDLKFAGEHLPVDKTPNGRLTSGAAYAMLSRAMLYAERWDDARIAA EEVMGMGYKLEEKENYSNAFKAGSQEAILQYCYDKSSTVTHDFDGYMAPGGDKALDGNSM TGGFGTPTQEMVESYEYADGSGFPDWSTWHTAEGTTATPPYDKLEPRFKATILYNGATWK GRTIASYVNGTDGWAAWKTDAKPEGRSTTGYYLRKLVDETHSFTSVQASTQPWTEIRYAE VLLNHAEACYHLSGYADKANKDIKDIRNRVGLPYSDKSGEDLMKAIRQERKVELAYEGHY YWDMRRWKLSATAFTGIRLHGLKIEQNANGSFNYIYVECDDKDRNFPTKMYRFPMPQAEL DNNGAVTQYAEWQ >gi|226332303|gb|ACIC01000017.1| GENE 4 5387 - 7084 1117 565 aa, chain + ## HITS:1 COG:no KEGG:BT_0450 NR:ns ## KEGG: BT_0450 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 565 1 565 565 1059 99.0 0 MKKNIQYLLLGVLALMSSCQDPEYVLPTAERQGITSLTALFTSGPYVDKEAVVYTIADAS VDKYVIPIPWFYPEDSENETAEYMKTMRVQAKLAPNCTINPVLTILDLTKENYFTYTDAQ GYKKQICITGERVKSNKCQLLSFSIPAEDISGIIDEEHKTVSLISAEDLSSCLAEYSLSS HATMSPDPKTEALDFNSPVELTVIAHDGVTKQTYTVQKAVPDKIPYGYRKGSETELFKLD MGVIGLPWTSANSPSLAVNGNNLVVCLGDGSTAPTYYNASTGSKIGNMTLGSVGVASLGC VTSDSKGNILLAIKAANGKSFSIYKTSSVTTVPTLLISYTNNTSLDMGTKVSVQGDINTN AAIIATCDGTASSGSNTFVRWIITDGVLGSPEVVSVNGVGYWGSPVSNTKVVTKGTTAQS DYFLSYYDPNVLYWVNGTSNNASKSLSNSDNGNSWAMNNNCLDARTFNNAQYLVLVSTAH FPQWGGTPYLYMYDVTSDGSFTGTVSTSDALTFNPSLSSFGSADGVAATGDVLFAPTTDG YKLRLYYVDNNCKIIGGYEFDCIDK >gi|226332303|gb|ACIC01000017.1| GENE 5 7131 - 8534 1213 467 aa, chain + ## HITS:1 COG:no KEGG:BT_0449 NR:ns ## KEGG: BT_0449 # Name: not_defined # Def: putative S-layer related protein precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 15 467 1 453 453 876 100.0 0 MRNLFKILLLTFCGMVAITSCSDDSDGIPGWPWNDNSTEGPENPDVTEIKPRYVWIDAAA NFPDYANNKENISTDMVKVKNAGFTDIIVDVRPTTGDVLFNTTAVDQVKRMDVWGSSGYV YYERTETWDYLQAFIDAARSQGLKVHASVNTFVGGYLCPYNLGSDGILFRDDSKKEWASV ANLADGLTNTMDLLNDETDYGAKFLNPANDDVQNFVLQLLGDLAKYDLDGIILDRCRYDD YGLESDFSAVSKQKFEEYIGETVAHFPEDIMAPGTDEIPSDQPAYFKKWLEFRVKVIHDF IVKAREKVKSVNSEINFGVYVGAWYSSYYTSGVNWASPKYDTSAYYPKWATADYKNYGYA DHLDYIFLGAYASVDKIYGSGEWTMEGFCKNGRELLKGDVPFSGGPDIGNGTGWTNGGQA AKIPDAIDACINNSDGFFAFDLCHIKMYDYWNAFKTGFDNYLESVQK 
>gi|226332303|gb|ACIC01000017.1| GENE 6 8670 - 10103 985 477 aa, chain + ## HITS:1 COG:no KEGG:BT_0448 NR:ns ## KEGG: BT_0448 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 477 1 477 477 989 100.0 0 MKIYLLQLVILMSLSLNLQAQNLSGTIRGKVTCHGKGLQGVVVTDGIDCVLTNKSGEYVL APQRDARFVYLSVPSGYQPKTEKTIPLFHQQIEMDKQDGYNFELMKNPQNDANHLFLVQA DVQVTSEDDVKAYGRFLQDIKEYVQPYSGKRDIFGIDCGDIVGDTPSLYPSYINTVSTLD FPVYRAIGNHDMTYGGRTFEYSYRTFESYFGPIYYSFNKGKAHYIVLDNCFYVNRDYQYI GYIDERTFAWLEKDLSYVSRDKLVFVVMHIPSSLQKKLRYNTLDQDETVNTAALYKLLEG YNAHIISGHTHFNTNVCFNDSLMEHNTAAVCGTWWRADINVDGTPRGYGIYEVNGDQLKW IYKSAGYPLEHQFHAYPAGSSDEYPSDIIANVWNWDDLWKVEWYENGQRMGEMQKYKGYD PMAKAICSDKEKVKYDWISPVLTEHLFHATPHDPNARIEIKVTDRFGNVYTQTIENK >gi|226332303|gb|ACIC01000017.1| GENE 7 10115 - 12769 1945 884 aa, chain + ## HITS:1 COG:all1210_2 KEGG:ns NR:ns ## COG: all1210_2 COG1649 # Protein_GI_number: 17228705 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. 
PCC 7120 # 520 711 47 227 489 63 25.0 2e-09 MFRNRYFWLLISLFFVLKTEAKVTLTSIWGDNMVLQQQAEVTFRGKATPGKRVYASASWN HRKIYADCDRNGQWTLALPTPEAGGPYTITFSDGEELTLKNILIGEVWFCSGQSNMEMPV KGFRGQPVYGSQPYIVSANPKRPLRLYTVKNNWSTTLKEEGIDGQWSEASSEEVADFSAT AYFFGNLLQQSLDVPVGLIHCSWSMSKIEAWMDKETLSHFSEVTLPDTNQDKFEWAAGTP TLLWNAMVNPWEGFPVKGVIWYQGEANTPDPTLYKKLFPAMVSQWRNFFHNAEMPFYYVQ IAPWKSEGNDKLDWAWFRQCQLELMKEVPGVGMVTTGDAGSELFIHSPYKIKVGERLAYW ALAQTYGRKGFQYSGPVYKTYRIQGNAVEIDFEYGEEGLTPENQNVKGFEIVGSDGIFRP AKAEIINGTSTVKVWNDSVSDPTEVRYCFRNYVQGELCNNAGLPASPFRIVIKKKPALMW IDAEANFERFSHKDSIDYYLEKIKSLGFTHAVVDIRPITGEVLYKSEYAPQMKEWKGAKA GDFDYLGYFIKKGHELGLEIHASLNVFCAGHNYFDRGMVYSGHPEWASMVYTPDKGIIPI TEEKHKYGAMINPLNEEYRTHILNVLKEVVTKYPDLDGLMLDRVRYDGITADFSSLSRKK FEEYIGKKVANFPEDIFRWTKNADGKYTTQPGKYFRKWLEWRTKNITDFMALARKEVKAA NPDVSFGTYTGAWYPSYYEVGVNFASKEYDPGKDFSWATPEYKNYGYAELIDLYATGNYY TDITIEEYRKTNRSIWNETDSQAQSGTWYCVEGSCQHLRQILKGNKFMGGILVDQFYDNP AKLSETIEMNLRRSDGLMVFDIVHIITKNLWKEVEEGMKRGGSL >gi|226332303|gb|ACIC01000017.1| GENE 8 12766 - 14175 751 469 aa, chain + ## HITS:1 COG:no KEGG:BT_0446 NR:ns ## KEGG: BT_0446 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 469 1 469 469 806 100.0 0 MNNRAYALDALRGYAIITMVLSATIVTQVLPGWMSHAQTPPPDHIFNPSLPGITWVDLVF PFFLFAMGAAFPFSIGKRAEKGDSKLKLVYEAVKRGVQLTFFAIFIQHFYPYVLSSPQDI RAWLLAILCFAVLFPMFMRIPLKMPDWAHTGIKIAAYGIAVIMMLTTSYADGRTFSLYFS NVIILLLANMAIFGSALYIFTMHNRWLRLGVLLLLMAVILGSTVEGSWTQVIFNYTPLPW MYRFDYLKYLFIVIPGSIAGEYLMEWMKGRKDTEQPASPQYRKIGIALLILTMVIIVFNL YGLYTRMVTVNLLATLVLLLIGKLVFLRPTEGIALLWKKLFNAGAYLLLLGLCFEPFQEG IKKDPATFSYFFVTSGLAFLALLFLSIVCDYFRCVRSSRFLVMSGQNPMIAYVVGDLFIM PLANILGLASLLSYFQQNAWLGFLQGVIITSLAVLVTMFFTKIKWFWRT >gi|226332303|gb|ACIC01000017.1| GENE 9 14832 - 14996 116 54 aa, chain - ## HITS:1 COG:no KEGG:BT_0445 NR:ns ## KEGG: BT_0445 # Name: not_defined # Def: endoglucanase E precursor (EGE) # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 54 313 366 366 121 100.0 9e-27 
KVYIAVLPEGLCDSTTDLGAVWHPNYKGQMKMAMSLIPYMSTITGWPLKKESFY Prediction of potential genes in microbial genomes Time: Thu May 12 00:04:21 2011 Seq name: gi|226332302|gb|ACIC01000018.1| Bacteroides sp. 1_1_6 cont1.18, whole genome shotgun sequence Length of sequence - 13410 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 3, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 917 670 ## BT_0445 endoglucanase E precursor (EGE) 2 1 Op 2 . - CDS 931 - 2187 734 ## BT_0444 hypothetical protein 3 1 Op 3 . - CDS 2199 - 2516 265 ## COG1472 Beta-glucosidase-related glycosidases 4 1 Op 4 . - CDS 2506 - 2658 80 ## gi|253567845|ref|ZP_04845256.1| predicted protein - Prom 2679 - 2738 5.3 - Term 2683 - 2729 11.0 5 2 Op 1 . - CDS 2742 - 4517 1116 ## BT_0442 glycerophosphoryl diester phosphodiesterase 6 2 Op 2 . - CDS 4535 - 5911 843 ## BT_0441 hypothetical protein 7 2 Op 3 . - CDS 5933 - 7816 1207 ## BT_0440 hypothetical protein 8 2 Op 4 . - CDS 7831 - 10950 2237 ## BT_0439 hypothetical protein - Prom 11023 - 11082 9.4 9 3 Tu 1 . 
- CDS 11085 - 13115 925 ## BT_0438 alpha-N-acetylglucosaminidase precursor - Prom 13245 - 13304 6.5 Predicted protein(s) >gi|226332302|gb|ACIC01000018.1| GENE 1 2 - 917 670 305 aa, chain - ## HITS:1 COG:no KEGG:BT_0445 NR:ns ## KEGG: BT_0445 # Name: not_defined # Def: endoglucanase E precursor (EGE) # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 305 1 305 366 604 100.0 1e-171 MWKNVLVLVSLCLLFEDVNAQTTKNRYYKANDENVSYVGRTEIQSDGSVSFDWTGTYLRT RLSGGELSMKISDTGINYYNVFVDSLLHKTVKVSGKDTLINFISGMDKGIHHVLIQKRTE GEWGKTTIHQFVLHNEGELMKETECPSRHIEFIGNSLTCGYGVEGKDRSEPYKAETENCN LSYATIIARYFNADYTLIAHSGRGVVRNYGDSVRISAVTMKDRMLNTFDMNLDKKWDFKT YKPDLVVVNLGTNDFSTNIYPLEDEFIHAYKLLISRLRTNYGDVPILCISPAIAQRQIVQ YMERM >gi|226332302|gb|ACIC01000018.1| GENE 2 931 - 2187 734 418 aa, chain - ## HITS:1 COG:no KEGG:BT_0444 NR:ns ## KEGG: BT_0444 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 418 1 418 418 837 100.0 0 MTKRLILAFVFLSWHLVACSNGDSGISSGLPDKPSAEEDSPGTETISEEMRVKITGEDLY ISAKLDAKTDILYWFKKCMFNELFTFYRVGTIDNNMPAPATLPDASPTSLLNLAYSDNIG PFNITGYGWCGGNHPYLDEVTQTARNIGYRIYIDGKEIKTDLSTLTKEIKIEVINEIMNP SKADTSSGKTLLNEILCIETVNYSIYRNNIQVLVTHEFQNESPVTVQMYYGMQSMFDKET HTLTANGKYVDWTVQKDVSTFTKKEYPSFRRFIEKSANVYQSSYLFADGLGNHEEISDDD IIFIGNSSNKTYHKIIADREHKKGDIIHWSGVYTWFKSPVSDDDDLLCYEGVIDNKKVLF IDCKKNVDRTIQLPVDYTKKSFTIIEKSKSITITGNKVTAKGLRIVSDGSGSCVITFD >gi|226332302|gb|ACIC01000018.1| GENE 3 2199 - 2516 265 105 aa, chain - ## HITS:1 COG:STM2166 KEGG:ns NR:ns ## COG: STM2166 COG1472 # Protein_GI_number: 16765496 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Salmonella typhimurium LT2 # 1 97 663 758 765 91 44.0 3e-19 MSVDETITFSLAVKNTGNREGQEIVQLYIRDIKSSLPRPVKELKRFKKVNLMPGEEKIVS LSIDKEALSFFDAERHEWICEPGKFEAIIAASAMDIKDVLKFELK >gi|226332302|gb|ACIC01000018.1| GENE 4 2506 - 2658 80 50 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253567845|ref|ZP_04845256.1| ## NR: gi|253567845|ref|ZP_04845256.1| 
predicted protein [Bacteroides sp. 1_1_6] # 1 50 1 50 50 97 100.0 3e-19 MKLRNIIAIMPFLASSFMTQAPVYQDVNHPIGDRVVGVWKTNSKQESHVS >gi|226332302|gb|ACIC01000018.1| GENE 5 2742 - 4517 1116 591 aa, chain - ## HITS:1 COG:no KEGG:BT_0442 NR:ns ## KEGG: BT_0442 # Name: not_defined # Def: glycerophosphoryl diester phosphodiesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 591 1 591 591 1232 100.0 0 MNYFSLYLKGLICFCLLFITMDSHALSIDKGYRQNKIKDLALIYQGGVHRIDWTSDQFLP YVVHQFADGHKDWLFDGFLFLEFTDGKGCGFATRYSDKNARKKEWLWLLDRLFEDGKALS ALDRCIGTQIKEIGKPDFTHQIVLCLPEVLPGQKDWGEVDGEPMDFSRQEDQVKATRWYI DELMKRFKQAKYKHLKLSGFYWLAEDIDFTKDLSVPLSKYIHSKNKTFCWIPYWQAKGYN QWRELGFDIAYQQPNHFFKASIPDKRLEEACQSAATLNMGMELEFDERALFDAKDSFYNR LVAYIDHFERQQAFRTSAMAYYSGNHAVLDMYKSTNPKDHEVMDRLANLIVSRRGKQKKE SHQTKVIAHRGFWNTPGSAQNSLAALVKADSIGCYGSEFDVWLTADDELMLNHDGWHEGY SLEKTSSDVLRTLKLSNGENMPTLEQFLDKAKKLKIHLILEMKPLSTPERETKAVEMILK LVEEKKLTHRVEYISFSRHVMCEFIRLAPPKTPILYLGQDISPKELKQLGATGADYNYWA YHKNPDWLKDLKNSGMTSNVWTVNIPSEMQWCIDNGFDLITTDAPTAFLGL >gi|226332302|gb|ACIC01000018.1| GENE 6 4535 - 5911 843 458 aa, chain - ## HITS:1 COG:no KEGG:BT_0441 NR:ns ## KEGG: BT_0441 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 458 5 462 462 940 99.0 0 MKKHKTIQFIAAALFIASACTDDRDNLMVNDQIGLLHSTYTETEIFRGMDTPYQLFVIKS GKGKQETEVSISVDETVLQSYNTDNGTSIQLLPSDCYTILQPALRLNDSDYRKAFDIKWN ADRLSDLLSTGKEYGLPIQMSVVQNFISADDERLKTIIVPTIKEPYLQMDVPGMSPVTDF MPTFGDLNERELFVKIRTNYSNDEDISFQIEVDPQLVIDYNTAHGTAYKMLPESSFTVDN KGVILTGMKETFIKIKYHKKGLIPEEGTYLFGDYVLPLRITSVSKYGISPDAASKLYTVK FQPGPIDKKGWSVIDFNNCCTQDGGWYLNMGWGVESLIDNNPGTQWLCRWDVKEPLPYYF VFDMGKEYTLFRFGFANPVAPAAHVWAGTSKAGYVEASIDNENWVKLKDWTSPKIGEPNV NMDVPATQARYIRFVITDTYPTYDGLRVSLGEVYAWGM >gi|226332302|gb|ACIC01000018.1| GENE 7 5933 - 7816 1207 627 aa, chain - ## HITS:1 COG:no KEGG:BT_0440 NR:ns ## KEGG: BT_0440 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 627 1 627 627 1266 99.0 0 
MNKKIRILGIFLYFTVAFSFVSCNDYLDVTLLDQLTQDETFSKRVSTEQYLAHVYSYLPR EYEYLEEGSAVPRSDEAMFSWYQWVNYLSFNNGSWGPSTPNYNIWKAKYTGIKQASIFMN HVDECLEIDSETRRIMKAEARFLRAYYYFELFRQYGPVYIWGDIESDELIKPETIDRHTV DENVNFIAEEYDKAIAELPAEISDFTKWAGRITKGAAMAAKARLMLYAASPLYNGCDLYK GQMKNLYGDFLFPQSPDPQKWEKAAKAAKDVIDMNIYELYKDQTEADPLLRAIKSYQGVL FEEWNKETIWGAWGRNPTNTGLGAIGFYLYSRCMPPRVCELGVGGFCPSLKLVDTYPMAK SGRYPVTGYDNNGNPVIDKQSGYTNTGFTENFKQPGASFAPGFKAHNSCVGRDARFYASV FPNGFYWINQYKGIKQVTFFTGGSSSYLESGDCVKVGYLWRRPLDPALNTESGSWGTVFW PYFRLAEIYLDYAEACNEKPERNETEALKYINLVRERSGLNALETAYPEVKGNKDLLRML IQKERMVELAFEGHRYYDARRWMIAQQEFPGPEYSLNLLATSYEDSWSRTSNVWKGGPRV FEPKHYFFPINQEQLSEMKNITQNYGW >gi|226332302|gb|ACIC01000018.1| GENE 8 7831 - 10950 2237 1039 aa, chain - ## HITS:1 COG:no KEGG:BT_0439 NR:ns ## KEGG: BT_0439 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1039 1 1039 1039 2029 99.0 0 MMSDLRKNIFSIKIWTVTTFCLLSLVNMNASNTLAHSLQKDEIKEVKDNPGKKVTGKVLD EFGEPIPSASILVEGRSRGVITDLDGTFVIEVLPTDKLVVSFLGMETQTILVGKQTNIVV KMQPQTAELDEVTVVAYGKQRKASVIGAINTVSMNELKMPVAKLSSGLAGQLAGIVVMQR SGEPGAGADFWIRGINTFGAGNKPLVLVDGVERSMDLVDVEDIASFSILKDATATALYGV RGANGIVLITTKRGTESAPKINARVEYGFTNPVRMPELANTEQWINYYNDITLESSGRLA IQPEEKAKYLNNTDPDLYPNVDWMQTLFKDFSNTTRVNINVTGGSPKIRYYVGGSFYSEG SVFNISDKDRYDASMRYNKFSFRSNIDINVTSSTELSLSLSNQYETKNRPYGSLSDLYAY MIRTSPIATPTIYSDGTLAKPSTGTNPYNSLNNSGYSQDFFNNAQSLISLTQDFSKMVTP GLKANVKFSWDAYNAHTLNRIVTPSTYYATGRDEEGNLDFVRSTEGSDYMKLTRTNSGTR TINFEASANYDRLFGEKHRVGALFLFSMRNYTDNFPGNYVDAFAHKNIGIAGRATYSYYD RYFAEFNFGYNGSENFAPDKRFGFFPSFAIGYMLSNEAFWDPLRSTIHLLKIKGSYGEIG NDQIGGNRRFAFNSTMKEGQYGYIFGTNPGSGSIVGISTGIPGNLNVSWEKAQKANVGLE LGLFDQLKFQLDYFYEKRTGIYIQQESVPSIVGLNETQYVNLGEMKNQGFDGSMEYEQRF NDWYVSARANFTYNRNKKLYDDKPTPIWSYQSEAGFAYRQQRGLIALGLFESEEDIANSP KQNFGNVRPGDIKYKDINGDNVIDAYDKVAIGYTDVPEINYGFGISLGWKGFDASLFMQG VGNVTRIIGGDALYGASDNIERLGQIYADVAEKRWTTSNPNPNATYPRLSMSKVENNLQP STYWQRNMSFLRLKNVEIGYTLPKHIYKRLGMSSIRVYAQGVNLLTFSKFKLWDPELDSN YGNIYPQMRTITLGLNLNF >gi|226332302|gb|ACIC01000018.1| GENE 9 
11085 - 13115 925 676 aa, chain - ## HITS:1 COG:no KEGG:BT_0438 NR:ns ## KEGG: BT_0438 # Name: not_defined # Def: alpha-N-acetylglucosaminidase precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 676 55 730 730 1372 99.0 0 MDRFILESAHKKICIKGNNNNSLAVGLNHYLKYYCHTNVSWYASDSIDMPEELPVIPQPI SIRSKCDNRFFLNYCTFGYTMPYWKWSDWERLIDWMALNGITMPLAISGQETVWYKVWSK LGLNDEQIRSYFTGPAHLPWHRMSNVDYWQSPLPKSWLEQQEVLQKQILKRERDFNMTPV LPAFSGHVPKELKAIYPDAKIHEMSQWGGYDSKYRSHFIEPMDSLFNIIQKMYLEEQTAI YGTDHIYGIDPFNEVDSPNWNEDFLAKVSKKIYESIYQVDAEAKWLQMTWMFYHDQKKWT QPRIRSFLEAVPDDKLILLDYYCDSTEIWRNTEMYYGKPYMWCYLGNFGGNSMMVGNLDD VDVKIEKLFVEGGENVYGLGATLEGFDVNPFMYEFVFDQAWDYPLTTDQWIQNWAKCRGG NQDRHILKAWDSLHKKIYKKYATAGQAVLMNARPMLVGTDSWNTYPDITYNNRDLWDIWT EMLKASHINNTGYRFDVINVGRQVLGNLFSSFRDHFTQCYSEKDIDGMKKWADQMDSLLI DTDRLLSCETNFSIGKWIDDARSFGKTEAEKEYYEENARCILTVWGQKATQLNDYANRGW GGLTYSYYRERWKRFTTEVITASLSGQKFDEKQFYQSITDFEYEWTLSKEHHPIISGENP ILLAKTLSEKYMQYFY Prediction of potential genes in microbial genomes Time: Thu May 12 00:05:17 2011 Seq name: gi|226332301|gb|ACIC01000019.1| Bacteroides sp. 1_1_6 cont1.19, whole genome shotgun sequence Length of sequence - 17111 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 4, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 452 - 1600 673 ## BVU_3656 hypothetical protein 2 1 Op 2 . + CDS 1597 - 2280 235 ## gi|253567852|ref|ZP_04845263.1| conserved hypothetical protein + Term 2370 - 2422 -0.1 - Term 2292 - 2353 7.7 3 2 Tu 1 . - CDS 2392 - 3186 537 ## BT_3593 hypothetical protein - Prom 3245 - 3304 2.4 - Term 3372 - 3404 -0.4 4 3 Op 1 . - CDS 3472 - 4503 792 ## BT_3594 hypothetical protein 5 3 Op 2 . - CDS 4507 - 5925 951 ## BT_0446 hypothetical protein 6 3 Op 3 . - CDS 5939 - 7360 742 ## BT_0447 S-layer related protein precursor, sialic acid-specific 9-O-acetylesterase 7 3 Op 4 . - CDS 7367 - 10042 2013 ## BT_0448 hypothetical protein 8 3 Op 5 . 
- CDS 10060 - 11382 1400 ## BT_0447 S-layer related protein precursor, sialic acid-specific 9-O-acetylesterase 9 3 Op 6 . - CDS 11416 - 12759 1281 ## BT_0449 putative S-layer related protein precursor - Prom 12780 - 12839 3.2 - Term 12807 - 12849 11.1 10 4 Op 1 . - CDS 12867 - 14534 948 ## BT_0450 hypothetical protein 11 4 Op 2 . - CDS 14551 - 16182 1303 ## BT_0451 hypothetical protein 12 4 Op 3 . - CDS 16196 - 17110 621 ## BT_0452 hypothetical protein Predicted protein(s) >gi|226332301|gb|ACIC01000019.1| GENE 1 452 - 1600 673 382 aa, chain + ## HITS:1 COG:no KEGG:BVU_3656 NR:ns ## KEGG: BVU_3656 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 4 380 2 389 389 206 35.0 1e-51 MEQKEYLSIKSFGPIKDVKLDNIKPFTFFIGESGSGKSTILKVLAMMRHMCKQINLRSYL KLGNVIDKTIDLSISEYLRNGGMTDYVKNDTEIVYSKGDCNITYTPQKGLKGTRKIISSE NLSLEKISFFSDKRGAIAPLLANLSDGAALGFYFTETFQDFKKATEVIKELEMPYLGVRY YEKKAQNGSRQFFISNVNDTYKIHFEDASSGIQTMTPLAVIAEYFSKHFDLVHGFNSSIV TLLGKNDSLSSFRHDMNIGDIANRSIHLMIEEPELSMFPTAQRSSLNMLIDKCLNGNKYM TLTLATHSPYIINHLNLLLKAFDKGVKIENAALDYHKTEVYVVEHGMVKSIKVQNAHLIN TDYLSDDINEIYNEYDKLDEIQ >gi|226332301|gb|ACIC01000019.1| GENE 2 1597 - 2280 235 227 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567852|ref|ZP_04845263.1| ## NR: gi|253567852|ref|ZP_04845263.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 227 1 227 227 442 100.0 1e-123 MIDLLKRELPSHYKLHRTIVDVPYEITAEEHFDLTDRGLTSIVDYKTGNISFDHSNRTEV QVINYEAYLIGLKGTKFETGRKRCDFILHETGGSCDSFYILNEQTSTKKNIENLSKPILD KDKNVIYPGGKYEKAEIQLLETLRTIRAVPSISAFINRYKRKVCLMSYLIKPDQDSKVSS AQKSFGRYKLIESRETGENGALLEHPSINAEGFEFRRISHEYSFRLS >gi|226332301|gb|ACIC01000019.1| GENE 3 2392 - 3186 537 264 aa, chain - ## HITS:1 COG:no KEGG:BT_3593 NR:ns ## KEGG: BT_3593 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 264 97 358 358 418 71.0 1e-115 MTDVGIKGLDLYCWEENGQWRFVNSARPNGKTNQATVIANMQPKEREYMLYLPLYDGLVS LSIGVDSLSSIDQPQINYPIREKPIVFYGTSILQGGCASRPGMAHTNIISRRLNRECINL GFSGNGQLDLEVARVMAEVDAGVFVLDFVPNASVEQMKERMETFYRIIRNKHPKTPIVFI EDPIFTHTLFDQRVAHEVTRKNQTLNEIFNSLKKKGEKDIYLIHSEKMIGEDGEATVDGI HFTDLGMMRYANLITPFIKKMIKR >gi|226332301|gb|ACIC01000019.1| GENE 4 3472 - 4503 792 343 aa, chain - ## HITS:1 COG:no KEGG:BT_3594 NR:ns ## KEGG: BT_3594 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 34 341 219 531 536 87 26.0 6e-16 MLIKRLYFITTALLAFSICSCGESYPDHPWEWDQETEEEESNPDIVSQGWTTADNFGSLP DYIHIYKSPENLAGKKAIAYIAVADMSKAKFEVLGDIAFSQEANGYGGKSIHTPSEFYES SKAPVVINGGLFFYSAGFYYSQNLVIREGQLLAPNQNYYSKDWVTMWYPTLGAFCQMKDG TFQTTWTYQASDGINYCYPAPADNDINKDPLQAPSSTFPNGAKALEATTGIGGVTVLLRA GEIKNTYVEEMLDISAASNQPRTAIGITTNKKMIIFVCEGRNMTEGVAGLTTANVAKVMK DLGCTEALNLDGGGSSCMLVNGKETIKGSDGKQRKVLTAVGIK >gi|226332301|gb|ACIC01000019.1| GENE 5 4507 - 5925 951 472 aa, chain - ## HITS:1 COG:no KEGG:BT_0446 NR:ns ## KEGG: BT_0446 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 472 1 469 469 459 56.0 1e-128 MNNNSTRASSLDALRGYAILTMVLSGSVAWGVLPGWMYHAQVGPRSNFVFDGSIYGITWV DLVFPFFLFAMGAAFPFSIGNKYRKGSSRRRIIYDSLLRGFRLTFFAIFIQHIYPWVVSS PQDVRSWLIALGGFALMFPMFMRIPFKMPEYVRMLIKILAYATGVFMLLNINYADGRTFS LGYSNIIILVLANMAIFGSLIYTLTINKPWIRIAILPFIMAVFLGNENAESWNHSLMAFS PLPWMYKCSYLKYLFIVIPGSIAGEYLHEWLHCKNTMPVNDNKDERKRMPWILLLTIGLI 
VFNLYGLYMRYLILNLVGTIAILCVLYILLQVEGKNTGYWYKLFRAGTYLILLGLAFEAY EGGIRKDPSTYSYYFLASGLAFMAMIAFSIMCDIYSWSRLTRPLEYAGQNPMIAYVSTQL VVLPLLNLAGLGTYLSYLDQNAWLGFLRGVIITSLALLITILFTKLKCFWRT >gi|226332301|gb|ACIC01000019.1| GENE 6 5939 - 7360 742 473 aa, chain - ## HITS:1 COG:no KEGG:BT_0447 NR:ns ## KEGG: BT_0447 # Name: not_defined # Def: S-layer related protein precursor, sialic acid-specific 9-O-acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 472 6 472 884 725 72.0 0 MKYIIYHLWIISLLFPLGMQAKVRLTSIWGDNMVLQQQSEVTFHGKASANSKITIEASWN HRKLTTRSDQHGNWNIQLPTPAAGGPYTITFSDGEALTLNNILLGEVWFCSGQSNMEMPV RGFRGQPVYGSQPYIVAADSKRELRLFTVKRDWSTTPKEEGITGYWSELSPKEVGDFSAT AYFFGDLLQRSLDIPVGLIHCSWSASKIETWMDKETLSQFPEIQLPDIEQTKFDWPAGTP TLLWNAMVNPWKGFPVKGVIWYQGESNSADPALYKKLFPAMVAQWREFFHNPAMPVYYVQ ITPWQAEGKDRLDRAWFRQCQLELMQEVPNVGMVTTTDAGSEKFIHPPYKIKVGQRLAYW ALAKTYGKEGFLYAGPLYKSHQLKENVVEITFEHGSDGLIPENQKLRGFELVDEKGNIVP AEAEIINGSARVKVWNDSVPHPVEVRYCFRNYMEGDLFNNAEIPASPFRVVIH >gi|226332301|gb|ACIC01000019.1| GENE 7 7367 - 10042 2013 891 aa, chain - ## HITS:1 COG:no KEGG:BT_0448 NR:ns ## KEGG: BT_0448 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 438 889 27 474 477 530 54.0 1e-148 MNMKRFSIISLLLWIGILAFASGKPKVMWLDCSANFQRFSYPDSIRFYVDKCHEAGITHL VLDIKDNTGEVLYPSKYAAQKKNWKNFDRPDFDFIGTFIEAAHARNMIIFAGMNIFADGQ NIVKRGAIFDKHKKWQAINYVPRKGLLPVTEIEGKPTMFLNPALKEVQKYEIDVIKEVVR NYAFDGIMLDRARYDCIDSDFSPESKKMFEKFIGKKVEKFPEDIFEWRPNAEGGIDRVGG PYYHQWLTWRASVIYNFIKDVRTSIKKIKPECMLAAYTGAWYPTYFEVGVNWASRNYDVS KDFSWATPDYKNYGFAELLDFYTNGNYYWNVTLDDYYKSSGKFKNETDSEFSTGEYLCVE GGCKYSKYLLKDAVPVCGGLYVEDYKRDVNQFQKAVRMNLKESDGVMIFDIVHIIRNGWW DELKEALDETKPDEARMIKGTVTCDGKGIANVVVTDGQRCVTTDKNGIYHLPNLGNTRFV YITTPAGYLTDCEQTIPRFYQEIDLNETNEYNFRLKKNPKDDSKHLFVLEADVQAGLKEH WDLYAPIVDDYKQLIDQYSDRDVFGLNCGDIFWDTPATFFPPYIDKAKKLDIPIYRAIGN HDMDCNGATHETSYRTFEGYFGPTHYSFNKGNAHYIVINNNFYVGREYFYIGYVDETTFK WLEEDLSYVPKGTLVFFITHIPTRITEQKRPFNYDYAMLAGETINAEAVHQLLDGYETHF 
LTGHLHSNSNIVFNDHQMEHNTAAVCGIWWHADVCIDGTPQGYGVYEVDGNQVKWYYKSA GHPKDYQFRSYAAGTSKEFPKDIIANVWNWDKNWKVEWLENGKVMGTMTQYTGVDPYAHQ VCTDKKRTMQSWISAASTGHMFRATPRNPKAKIEIRVTDRFGKVYQQNVSK >gi|226332301|gb|ACIC01000019.1| GENE 8 10060 - 11382 1400 440 aa, chain - ## HITS:1 COG:no KEGG:BT_0447 NR:ns ## KEGG: BT_0447 # Name: not_defined # Def: S-layer related protein precursor, sialic acid-specific 9-O-acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 31 439 475 884 884 672 77.0 0 MRKITIVLLSSLLIIALGACKNATPQVENGKPALMWFDAEANFERFSNPDSIDYYLTKIK SLGFTHAIVDVRPITGEVLFDTEFAPKMREWHGYERKDFDYLGHFIKKAHELGIEVHASL NVFVAGHNYFDRGLVYSTHPEWASTVYTPEGITSITNEKKKYSAMVNPVNEEFQTHILNV LKDLVKRYPDLDGLMLDRVRYDGITADFSDLSRQKFEAYIGQKVEKFPEDIFEWKKDEND KYYPERGKHFLKWIEWRTKNIYDFMARARNEVKKVNPDISFGTYTGAWYPSYYEVGVNFA SKKYDPSEDFDWATSEYKNYGYAELMDLYTTGNYYTDITIEESLKNKKSVWNETDSQAQS GTWYSVEGSCQKLRNIMKDNKFMGGILVDQFYDNPAKLSATIEMNLKASDGLMVFDIVHI INKGLWKEVENGMRAGGALK >gi|226332301|gb|ACIC01000019.1| GENE 9 11416 - 12759 1281 447 aa, chain - ## HITS:1 COG:no KEGG:BT_0449 NR:ns ## KEGG: BT_0449 # Name: not_defined # Def: putative S-layer related protein precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 21 447 19 449 453 434 52.0 1e-120 MGGIAACLCMACGGNDSKDYWGDTSGGEDEEPTENPNASKPRYIWIDAAANFPDFANSKE NIARDLALAKDAGFTDIVVDVRPTTGDVLFKTNLVDQVKFMYAWIGSNYTKVERTATWDY LQAFVDEARKQGLRIHAAINTFVGGNQIDGGTGLLYRDQSKAEWATQMNMQVGITSVMNT SESTKFFNPAHPEVQTFLCDLLKDLAGYDLDGIFLDRGRFLNLQADFSEESRKQFEEYMG GIRIQNYPNDILAPGASSLPATYPKYLTKWLEFRAKVIYDFMQKARTAVKGVKPGIKFGV YVGGWYSTYYDVGVNWAASTYDTSRYYNWATSKYKNYGYAACMDQILIGAYASPLKVHGT TEWTMEGFCSLAKDKIKGECPIVAGGPDVGNWDTNNQATQEQENQAIVQSVKACMNVCDG YFLFDMIHLKKADQWQYAKEGIKLAIE >gi|226332301|gb|ACIC01000019.1| GENE 10 12867 - 14534 948 555 aa, chain - ## HITS:1 COG:no KEGG:BT_0450 NR:ns ## KEGG: BT_0450 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 554 1 564 565 249 33.0 3e-64 
MKNKILSIISVIAFLMGISACHSPEEFNPTVERNIINNLTASFPDDDSDDNSFTSEIDYT NHVITVVFPYNYPKLSDNVLQLSDLKKMRVIADLDNNVYVYPTLLFMDMTKDNFITVKDQ TGASTEYKVVAEIRKSNECAITKFNITSLGLSGIINESTKTISLISLESIGEVLADISIS HGATISPDPRTVALNYDQDQKITVTAQNGTTKSTYTVKKEIPEKIAAGLRANSAKLIWAK KLTDIGISSSDMTTGIAVTNDYVVINERAKNPIYLNAKNGEKAGTMNISFAGSLTNFYTT SDKDGNILFTNLTPNAGPAFTVWKASGVDQAPEKYIEFTTSLAMGRKLSVSGSLNNNAII TAPIYATSGQFARWQVENGVLKSQTPDIITAQGIGAWGTNADVIYSDPTDLTSDYFTTFY AEPRNLCWMDGKTNSIKNKGVALDGNYVPNAVDHAVFNKVEYVASNLVNPWSWGTADNIY LFDLSSGSLGTQAIDFGSTGLGINGNYGSKILGNTNGNYCGDVVLKVSDDGYYMYIYFMF CNGYVGCVQCDCIDM >gi|226332301|gb|ACIC01000019.1| GENE 11 14551 - 16182 1303 543 aa, chain - ## HITS:1 COG:no KEGG:BT_0451 NR:ns ## KEGG: BT_0451 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 543 1 553 553 392 43.0 1e-107 MKKILLNISLALALGFSISSCNDWLDGVEQTSTVSDEIVWQDEAYVDKYVNAFYTYLNKY GQFGEAQFSGSLTESLTETFKYGSVALGDRAGHPNNYVTNPDVISPDGCRYNIWDKDLAY GNIRQMNQFLVMQKKYSTFPDDKNLKWEAQVRFFRAYVYFQLAKRHGGVILYDDLPTSND KARSTAAETWQFIADDLDFAATNLPKEWDAANKGRVTKGAAYALKSRAMLYAERWQDAYD AADEVEKLQLYDLVDNYADAWKGNNKEAILEFDYNKDSGPNHTFDQYYVPQCDGYDFGAL GTPTQEMVESYEDKNGNKVDWSEWHGTTTKEPPYDQLEPRFAATIIYRGCTWKGRVMDCS VGGTNGAFMAYREQSYSYGKTTTGYFLRKLLDEKLIDVKGTKSSQAWVEIRFAEVLLNKA EAAYRLNKTTEAQSLMNRVRGRQGVNLPGKSSSGEAWFNDYRNERKIELAYEGHLFWDMR RWRLAHIEYNNYRCHGLKITNGTYEYIDCDGQDRKFPQKLYVLPVPTSEIKNNALIEQYD EWK >gi|226332301|gb|ACIC01000019.1| GENE 12 16196 - 17110 621 304 aa, chain - ## HITS:1 COG:no KEGG:BT_0452 NR:ns ## KEGG: BT_0452 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 304 774 1074 1074 251 48.0 3e-65 APSRGGYYFSSANVNKTDYKGFDVTLTHYNHINKFNYGAKLIWSYAYGRWLKYVGDSDNA PEYQRLTGKQIGSKYGFIANGLFQSEEEIANSATVKGYKALPGYIKYVDRNGDGIITAAQ DQGYVGKSATPKHTGSLNLFGNWKGFDFDLLFSWGLGNVVALTGQYTASGSEGTQDNTSF TKPFYHGGNSPTFLVENSWTPDHTNAEFPRLEITGPSNNNGYSSTFWYRNGNYMRLKTAQ IGYNFPKKWLNPAGIEGLRLYVEGYNLLTFSGLTKYNIDPESPAVNNGYYPQQRTYTLGV KLTF Prediction of potential genes in 
microbial genomes Time: Thu May 12 00:06:31 2011 Seq name: gi|226332300|gb|ACIC01000020.1| Bacteroides sp. 1_1_6 cont1.20, whole genome shotgun sequence Length of sequence - 3435 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 1421 1063 ## BT_0452 hypothetical protein 2 1 Op 2 . - CDS 1468 - 2259 497 ## BT_0452 hypothetical protein 3 1 Op 3 . - CDS 2282 - 3355 916 ## BT_0435 hypothetical protein Predicted protein(s) >gi|226332300|gb|ACIC01000020.1| GENE 1 2 - 1421 1063 473 aa, chain - ## HITS:1 COG:no KEGG:BT_0452 NR:ns ## KEGG: BT_0452 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 473 305 780 1074 339 42.0 1e-91 MVQKMRDGVDGWGNTNWYDKVYGTGTRTHHNISATGGSDRVRFFASIGYLNEDGNIDNYG YNRLNLRSNIDAKLTKSLTFTLGVSGRIEKRDSPRYSADPNAWHNVPQQIIRALPYVPDT YEYEGKTYNVSTPTASSPVAPVASINESGYSKSNYSYIQSNFSLKYDAPWLKGLSFKFQG AYDLTYNFSKILSNPYEVMIMNLPTIESTSLTYYKGTDASGNSISLSESASRGYTFTTQS SVTYDNKFGKHTVGAMFLAETRENKSNALSATGYGLDFIQLDELNQITNKTGNGAEKFPS IGGYSGHTRVAGFVGRVNYNYDDKYYLEASLRHDGSYLFGGMNKRWVTLPGVSAGWRLNN ESWFNAPWVNNLKLRAGIGKTATSGVSAFQWRNTMGISKNAVVIGGSSQTMMYASVLGNP NLSWAQCLNYNVGVDATMWNGLLGVEFDVFYKYEYDKLATVTGAYAPSRGGYY >gi|226332300|gb|ACIC01000020.1| GENE 2 1468 - 2259 497 263 aa, chain - ## HITS:1 COG:no KEGG:BT_0452 NR:ns ## KEGG: BT_0452 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 233 7 239 1074 241 53.0 1e-62 MKKRIKWNPRFITSLFIAFMCTLNCMAQNVIKGSVNDGSGEPLPGVSVAVKGTTSGTITD LDGKFSINARNNDVLIFSYVGMNTQDIKVNNQRFISVTMKDDVASLDEVVVVGYGTQKRG SITAAISTVSDKELLKAPTMSISNVVGARIAGVAAVQSSGQPGSDNATLTMRGQTGIVYV IDGIRRTSADFNGLDPNEIESVSILKDASAVAVYGLDANGVFIVTTKQGKAEKIVYHLYR FCRFQPECGTTGMAGWTRLCLLV >gi|226332300|gb|ACIC01000020.1| GENE 3 2282 - 3355 916 357 aa, chain - ## HITS:1 COG:no KEGG:BT_0435 NR:ns ## KEGG: BT_0435 # Name: not_defined # Def: 
hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 355 4 363 366 367 53.0 1e-100 MDGRRNFLKTAFATCSLLAGSKLLTACTPIKRAASEEEETFETFATPNALKGMPIKATFL DEISWDIPHQNWGTKEWDKDFKAMKKMGINTVVLIRAGLGKWIASPFQCLLGKEDVYYPP VDLVEMFLTLADKYDMAFYFGMYDSGKYWQEGLFQKEIDLNMALIDEVWKKYGHHKSFQG WYLSQEISRRTKNMSKIYAEVGKHAKSVSGNLTTLVSPYIHGVKTDQVMAGDKALSVKEH EEEWDEILDNVKGAVDIMAFQDGQVDYHELYDYLVINKKLADKYNMKCWTNIESFDRDMP IRFLPIKWEKLLLKLDAARRAGMENAITFEFSHFMSPNSAYLQAGHLYERYCEHFIL Prediction of potential genes in microbial genomes Time: Thu May 12 00:06:47 2011 Seq name: gi|226332299|gb|ACIC01000021.1| Bacteroides sp. 1_1_6 cont1.21, whole genome shotgun sequence Length of sequence - 4567 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 3 - 62 3.0 1 1 Op 1 . + CDS 84 - 1718 1040 ## COG4409 Neuraminidase (sialidase) 2 1 Op 2 . + CDS 1715 - 3727 1274 ## COG3525 N-acetyl-beta-hexosaminidase 3 1 Op 3 . 
+ CDS 3742 - 4567 516 ## BT_0457 sialic acid-specific 9-O-acetylesterase Predicted protein(s) >gi|226332299|gb|ACIC01000021.1| GENE 1 84 - 1718 1040 544 aa, chain + ## HITS:1 COG:STM0928 KEGG:ns NR:ns ## COG: STM0928 COG4409 # Protein_GI_number: 16764290 # Func_class: G Carbohydrate transport and metabolism # Function: Neuraminidase (sialidase) # Organism: Salmonella typhimurium LT2 # 195 544 57 403 412 120 30.0 6e-27 MKRNHYLFTLILLLGCSIFVKASDTVFVHQTQIPILIERQDNVLFYFRLDAKESRMMDEI VLDFGKSVNLSDVQAVKLYYGGTEALQDKGKKRFAPVDYISSHRPGNTLAAIPSYSIKCA EALQPSAKVVLKSHYKLFPGINFFWISLQMKPETSLFTKISSELQSVKIDGKEAICEERS PKDIIHRMAVGVRHAGDDGSASFRIPGLVTSNKGTLLGVYDVRYNSSVDLQEYVDVGLSR STDGGKTWEKMRLPLSFGEYDGLPAAQNGVGDPSILVDTQTNTIWVVAAWTHGMGNQRAW WSSHPGMDLYQTAQLVMAKSTDDGKTWSKPINITEQVKDPSWYFLLQGPGRGITMSDGTL VFPTQFIDSTRVPNAGIMYSKDRGKTWKMHNMARTNTTEAQVVETEPGVLMLNMRDNRGG SRAVAITKDLGKTWTEHPSSRKALQEPVCMASLIHVEAEDNVLDKDILLFSNPNTTRGRN HITIKASLDDGLTWLPEHQLMLDEGEGWGYSCLTMIDRETIGILYESSAAHMTFQAVKLK DLIR >gi|226332299|gb|ACIC01000021.1| GENE 2 1715 - 3727 1274 670 aa, chain + ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 22 429 26 510 757 134 24.0 6e-31 MKSLRIFLGIAFLFHTLYILGADHLRLLPQPQQCVLAKGYFIVGKMQLSTPVLSQEWKQF VTEMGGTLTDQSASSINIKLVDAIDNVSVNKEEAYRLTITPKAITVEAVAERGVYWAMQT LYQLKEEKGKKIRLQCATITDWPAFRIRGFMQDVGRSYLSLEELKREIAILSRFKINTFH WHLTENQAWRLESKIFPMLNDSTNMTRMAGKYYTLEEARELTEFCKAHQVLLIPEIDMPG HSAAFIRTFRHDMQSPEGMKILKLLLDEICETFDVPYLHIGTDEVHFTNPQFVPEMVAYV RDKGKKVISWNPGWKYKAGEIDMMQLWSYRGKAQQGIPAIDSRFHYLNHFDTFGDIIALY NSRIYNADMGSDDLAGVIMGIWNDRLIDKEWNMVLENNFYPNMLAIAERSWRGGGTEYFD KQGTILPVDENSEVFRNFEDFESRMLWYKEHLFKGYPFAYVKQTHVKWNITDAFPNEGDL TKVFPPEEELKDSYTYEGKQYGVRPAIGAGIYLRHVWGKIVPAFYKDPQENHTAYAYTYV YSSKTQEVGLWAEFQNYGRSENDLPPLPGKWDYKESRIWINDQEILPPVWSATHLVKSSE TALGNENCVARPPLRVHLHKGWNKVLLKLPVGKFSTNEVRLVKWMFTAVFVTPDGDKAVE GLIYSPEKKM 
>gi|226332299|gb|ACIC01000021.1| GENE 3 3742 - 4567 516 275 aa, chain + ## HITS:1 COG:no KEGG:BT_0457 NR:ns ## KEGG: BT_0457 # Name: not_defined # Def: sialic acid-specific 9-O-acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 275 2 276 692 578 100.0 1e-164 MKNVLLIVVSILFITAASAQENRIKVACIGNSITYGYGLPDRTTQSYPVQLQKMLGESYQ VENFGKSGATLLNKGHRPYMQQDEYRRAIDFGGDIVVIHLGINDTDPRDWSDYRDFFVKD YIELIDSFRAANSKVRIMIARLTPIADRHPRFLSGTRDWHGEIQLAIENVARYTGVQLID FHEPLYPYPFILTDAVHPDPEGAFIMAQTVYSAITGDYGGLKMSLLYTDNMVLQRDVPLT VQGIANAGDRVTVSIADRQMKTKAGLNGKWSVTLP Prediction of potential genes in microbial genomes Time: Thu May 12 00:07:02 2011 Seq name: gi|226332298|gb|ACIC01000022.1| Bacteroides sp. 1_1_6 cont1.22, whole genome shotgun sequence Length of sequence - 42345 bp Number of predicted genes - 34, with homology - 34 Number of transcription units - 11, operones - 8 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 1245 519 ## BT_0457 sialic acid-specific 9-O-acetylesterase 2 1 Op 2 1/0.000 + CDS 1323 - 3917 1391 ## COG3250 Beta-galactosidase/beta-glucuronidase 3 1 Op 3 . + CDS 3956 - 6280 2106 ## COG3525 N-acetyl-beta-hexosaminidase 4 1 Op 4 1/0.000 + CDS 6280 - 8355 1328 ## COG3525 N-acetyl-beta-hexosaminidase 5 1 Op 5 . + CDS 8371 - 10890 1678 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 10915 - 10965 10.0 + Prom 10964 - 11023 7.4 6 2 Tu 1 . + CDS 11218 - 11778 352 ## BT_0462 putative transcriptional regulator + Term 11814 - 11847 0.2 + Prom 11834 - 11893 7.2 7 3 Op 1 13/0.000 + CDS 11953 - 12840 842 ## COG1209 dTDP-glucose pyrophosphorylase 8 3 Op 2 9/0.000 + CDS 12871 - 13440 395 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes 9 3 Op 3 11/0.000 + CDS 13444 - 14304 603 ## COG1091 dTDP-4-dehydrorhamnose reductase 10 3 Op 4 . + CDS 14307 - 15419 590 ## COG1088 dTDP-D-glucose 4,6-dehydratase + Prom 15429 - 15488 7.8 11 4 Op 1 . 
+ CDS 15559 - 17091 239 ## BT_0467 hypothetical protein 12 4 Op 2 . + CDS 17118 - 18284 341 ## BT_0468 putative F420H2-dehydrogenase 40 kDa subunit + Term 18397 - 18434 -0.9 + Prom 18712 - 18771 8.8 13 5 Op 1 . + CDS 18803 - 19585 357 ## BT_0469 hypothetical protein 14 5 Op 2 . + CDS 19585 - 20814 471 ## BT_0470 hypothetical protein 15 5 Op 3 8/0.000 + CDS 20821 - 22035 671 ## COG0438 Glycosyltransferase 16 5 Op 4 . + CDS 22043 - 22801 225 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 17 5 Op 5 . + CDS 22811 - 23752 337 ## BT_0473 putative glycosyltransferase 18 5 Op 6 3/0.000 + CDS 23783 - 24829 753 ## COG2605 Predicted kinase related to galactokinase and mevalonate kinase 19 5 Op 7 1/0.000 + CDS 24843 - 25436 398 ## COG0279 Phosphoheptose isomerase 20 5 Op 8 . + CDS 25441 - 26148 460 ## COG1208 Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) 21 5 Op 9 1/0.000 + CDS 26164 - 26634 280 ## COG0241 Histidinol phosphatase and related phosphatases 22 5 Op 10 6/0.000 + CDS 26631 - 27737 309 ## COG0438 Glycosyltransferase + Prom 27794 - 27853 4.0 23 5 Op 11 . + CDS 27894 - 28718 161 ## COG1216 Predicted glycosyltransferases + Term 28815 - 28847 -0.8 + Prom 28785 - 28844 4.3 24 6 Op 1 2/0.000 + CDS 29010 - 30182 511 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 25 6 Op 2 2/0.000 + CDS 30236 - 31036 494 ## COG1596 Periplasmic protein involved in polysaccharide export 26 6 Op 3 . + CDS 31049 - 33487 1796 ## COG0489 ATPases involved in chromosome partitioning + Term 33577 - 33632 10.6 27 7 Tu 1 . - CDS 33495 - 33746 74 ## gi|298384779|ref|ZP_06994339.1| hypothetical protein HMPREF9007_01416 - Prom 33860 - 33919 2.8 + Prom 33658 - 33717 5.0 28 8 Op 1 . + CDS 33738 - 36482 2209 ## BT_0483 hypothetical protein 29 8 Op 2 . 
+ CDS 36501 - 38234 1382 ## BT_0484 hypothetical protein + Term 38242 - 38284 3.1 + Prom 38242 - 38301 3.5 30 9 Op 1 . + CDS 38321 - 38512 195 ## gi|253567896|ref|ZP_04845307.1| conserved hypothetical protein 31 9 Op 2 . + CDS 38548 - 38940 461 ## Coch_0296 hypothetical protein + Term 38961 - 39005 2.1 - Term 38943 - 38998 12.5 32 10 Op 1 5/0.000 - CDS 39050 - 40240 1522 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain 33 10 Op 2 . - CDS 40268 - 40849 816 ## COG0576 Molecular chaperone GrpE (heat shock protein) - Prom 41036 - 41095 5.8 + Prom 40813 - 40872 10.4 34 11 Tu 1 . + CDS 41053 - 42343 1528 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains Predicted protein(s) >gi|226332298|gb|ACIC01000022.1| GENE 1 1 - 1245 519 414 aa, chain + ## HITS:1 COG:no KEGG:BT_0457 NR:ns ## KEGG: BT_0457 # Name: not_defined # Def: sialic acid-specific 9-O-acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 414 279 692 692 870 99.0 0 KAGGPYTLKISTDETGFQYQNVLAGEVWLCSGQSNMEFMLKQASTARADIPRAVDQQLRL YDMKARWRTNAVEWEANVLDSLNHLQYYKDTEWKNCTPATASDFSAIAYYFGKMLRDSLN VPVGLICNAVGGSPTEAWVDRASLEYQFPAILKDWTKNDFIQEWVRGRAALNIKKSANSQ QRHPYEPCYLYESGIRPLEQYPIRGVIWYQGESNAHNWEAHEKLFKLLVNSWRKNWNDAC LPFYYVQLSSLNRPSWPWFRESQRRMLNEISHIGMAVSSDHGDSLDVHPTCKKPVGERLA RWALNKTYQKNVIPSGPLFRGANVRGGKVFLSFDYGKGMRSSDGKPLQCFEVAEYDGIYY PATAEVVGDQVKVYSKEVPNPRYVRYGWQPFTRANLINREGLPASTFRAEFSMK >gi|226332298|gb|ACIC01000022.1| GENE 2 1323 - 3917 1391 864 aa, chain + ## HITS:1 COG:XF0846 KEGG:ns NR:ns ## COG: XF0846 COG3250 # Protein_GI_number: 15837448 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Xylella fastidiosa 9a5c # 50 843 59 860 891 541 37.0 1e-153 MILNLTGKIAPIACGLLCCCSMVYAQGNDTSEVMLLDTGWEFSQSGTEKWMPATVPGTVH QDLISHELLPNPFYGMNEKKIQWVENEDWEYRTSFIVSEEQLNRDGIQLIFEGLDTYADV YLNGSLLLKADNMFVGYTLPVKSVLRKGENHLYIYFHSPIRQTLPQYASNGFNYPADNDH 
HEKHLSVFSRKAPYSYGWDWGIRMVTSGVWRPVTLRFYDIATISDYHVRQLLLTDENARL SNELIVNQIVPQKIPAEVRVNVSLNGTTVTEVKQQVTLQPGINHITLPAEVTNPVRWMPN GWGTPTLYDFSAQIACGDRIVAEQSHRIGLRTIRVVNEKDKDGESFYFEVNGIPMFAKGA NYIPQDALLPNVTTERYQTLFRDMKEANMNMVRIWGGGTYENNLFYDLADENGILVWQDF MFACTPYPSDPTFLKRVEAEAVYNIRRLRNHASLAMWCGNNEILEALKYWGFEKKFTPEV YQGLMHGYDKLFRELLPSMVKEFDSDRFYVHSSPYLANWGRPESWGTGDSHNWGVWYGKK PFESLDTDLPRFMSEFGFQSFPEMKTIAAFAASEDYQIESEVMNAHQKSSIGNSLIRTYM ERDYIIPESFEDFVYVGLVLQGQGMRHGLEAHRRNRPYCMGTLYWQLNDSWPVVSWSSID YYGNWKALHYQAKRAFAPVLINPIQQNDSLSVYLISDRLDTMEQMTLEMKVVDFDGKTLG KKIQVHSLEVPANTSKCVYRAKLDGWLTPEDCRRSFLKLILKDKSGHQVAESVHFFRKTK DLQLPPISVSYQMKQTDGKCELTLFSSMLAKDIFIETPLQGARYSDNFFDLLPGERKKVI ITSPRIKKGEELPVNIKHIRETYK >gi|226332298|gb|ACIC01000022.1| GENE 3 3956 - 6280 2106 774 aa, chain + ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 31 608 31 596 757 430 42.0 1e-120 MKQLLKLTGCVALAGLMSSCGSVQEEANYQIIPLPQEIVTSQVNPFILKSGVKILYPEGN EKMQRNAQFLADYLKTATGKDFSIEAGTEGKNAIVLALGSEVENPESYQLKVTDQGVTIT APTEAGVFYGIQTLRKSLPIALGADVALPAVEIKDAPRFGYRGAHFDVSRHFFTIDEVKT YIDMLALHNMNRLHWHITDDQGWRLEIKKYPKLTEIGSQRSGTVIGRNSGEYDNTPYGGF YTQEQAKEIVDYAAERYITVVPEIDLPGHMLAALAAYPELGCTGGPYEVWRQWGVADDVL CAGNDQVLKFLEDVYGELIEIFPSEYIHVGGDECPKVRWEKCPKCQARIKALGLKSDKNH SKEERLQSFVINHIEKFLNDHGRQIIGWDEILEGGLAPNATVMSWRGESGGIEAAKQKHD VIMTPNTYLYFDYYQAKDTENEPFGIGGYLPMERVYSYEPMPASLTPEEQQYIKGVQANL WTEYIATFSHAQYMVLPRWAALCEVQWSTPDKKNYEDFLSRLPRLIKWYDAEGYNYAKHV FDVKAEFTPNPADGTLDITLTTIDNAPIHYTLDGTEPTSTSPVYDGALKIKENADFSAIA IRPTGNSRVVSEKIDFSKSSMKPIVANQPVNKQYEFKGVSTLVDGLKGNGNYKTGRWIAF RGNDMDVTIDLKQPTEISSVAISTCVEKGDWVFDTRGLSVEVSEDGTNFTKVASEAYPAM KETDKNGVYDHKLTFTPVTAQYVKVIASPEKSIPEWHGGKSYPGFLFVDEITIN >gi|226332298|gb|ACIC01000022.1| GENE 4 6280 - 8355 1328 691 aa, chain + ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate 
transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 27 510 27 506 757 367 39.0 1e-101 MNIRKRYTKVCLFLWVIGMCLCAHPINAQSVIPVPLKMEQGTGSFLLSEKTKLYTNLQGE EAILLGDYLKTALPVQFKEGKKKDKQNVLSLLITEKNPQLVSPESYILSVTPEHILIQAS SGAGLFYGIQTLLQLSQPSGTGYSIVSVEVQDTPRFAYRGMMLDVSRHFFSKEFVKKQID ALAFYKLNRLHLHLTDAAGWRLEIKKYPLLTEFAAWRTDANWKKWWNGGRKYLRFDEPGA SGGYYTQDDMKEIIAYAQQHYITIIPEIEMPAHSEEVLAAYPQLSCSGEPYKNADFCVGN EETFTFLENVLTEVMELFPSEYIHVGGDEAGKAAWKTCPKCQKRMQDEHLSNVDELQSYL IHRIELFLNAHGRKLLGWDEILQGGLAPNATVMSWRGEEGGIAAVRSGHQAIMTPGQYCY LDSYQDAPYSQPEAIGGYLPLEKVYSYNPVSDSLTVEQAKLVYGVQANLWAEYIPTPEHM EYMIYPRILALAEVAWSASERKSWTDFHNRALKAVDDLQAKGYHTFDLKNEIGSRPESLK PISHLAVGKKVIYNAPYSPHYPAQGNTALTDGIRGDWTYGDGSWQGFIDKKRLDVTIDMG TETAVHSVSADFMQVVGAEVFLPESVTISISDDGANFTELKQYTFEVNRKEAIKFISISW EGSAEGRYIRYQARAGEEFGGWIFTDEIVVK >gi|226332298|gb|ACIC01000022.1| GENE 5 8371 - 10890 1678 839 aa, chain + ## HITS:1 COG:SP0648_2 KEGG:ns NR:ns ## COG: SP0648_2 COG3250 # Protein_GI_number: 15900551 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pneumoniae TIGR4 # 30 819 59 871 871 431 33.0 1e-120 MTINIRYLLGIALLFSFSVISAQVRTSYTFEKGWKFTREDNADFIQPDYNDIKWQSVVVP HDWAIYGPFGIDNDRQLTAIAQDGQKEAMEHAGRTGGLPFVGVGWYRLNFDAPSFVKGKK ATLIFDGAMSHARVYINGQEAGYWPYGYNTFYLDVTPYLKEGTKNTLAVRLENETESSRW YPGAGLYRNVHLVVTEDAHIPTWGTQLTTPVVKDDYAKVNIKTTLVVPFGKQFDAYRIVT ELKDKDGKVVTTDEKQGSRFDDNIFSQELVVSRPALWSPETPVLYQAVSKVYEGNILKDE YTTSFGIRTIEVIPNKGFFLNGKRTSFKGVCNHHDLGPLGGAVNDAAIRRQIRILKDMGC NAIRTSHNMPAPELIRACDEMGMMVMAESFDEWKAMKVKNGYRHVFDEWAVKDLTNLLRH YRNNPSVVMWCIGNEVPEQWDGNKGPKMSYFLQELCHREDPTRPVTQGMDAPDAVVNNNM AAVMDIAGFNYRPHKYQENYKKLPQQIVLGSETASTVSSRGVYKFPVTRQWMKKYDDHQS SSYDVESCSWSNLPEDDFIQHEDLPYCIGEFVWTGFDYLGEPTPYYTDWPSHSSLFGIID LAGLPKDRYYLYRSHWNKEQETLHILPHWNWKGREGEITPVFVYTNYPSAELFINGKSQG KRTKDLSVTIDNSADSVSIMNLKRQSRYRLMWMDTKYEPGTVKVVAYNADGKAVAEKELH TAGKPDHIELVADRNVIKADGKDLSFVTVRVVDRDGNLCPDASHEISFKVKGEGSYRAGA 
NGNAASLESFQRPKMKVFSGMMTAIVSSTEQPGKITLEATGKGLKKGVLIIESRQEAKK >gi|226332298|gb|ACIC01000022.1| GENE 6 11218 - 11778 352 186 aa, chain + ## HITS:1 COG:no KEGG:BT_0462 NR:ns ## KEGG: BT_0462 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 186 1 184 184 372 100.0 1e-102 MIMNDLDKERQTEGRKAWYAVQTFYCKEEHLGKYLEKKGVNYFIPMRYIEHETLDGKKHR KLTPAVHNLLFIEKEFTEKELLERVKDCTIPFLLVRDRSTRRCYEIPDCEMLEFRAVCDP NYKGTLYVDTVTAEARPGQAVRVIRGPFAGLEGKLTQYKKSYYVVVTLATIGVMLHIPKW YCEKIN >gi|226332298|gb|ACIC01000022.1| GENE 7 11953 - 12840 842 295 aa, chain + ## HITS:1 COG:NMB0062 KEGG:ns NR:ns ## COG: NMB0062 COG1209 # Protein_GI_number: 15675999 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Neisseria meningitidis MC58 # 1 291 1 288 288 409 65.0 1e-114 MKGIVLAGGSGTRLYPITKGVSKQLLPIFDKPMIYYPISVLMLAGIREILIISTPYDLPG FKRLLGDGSDYGVRFEYAEQPSPDGLAQAFIIGEEFIGNDSVCLVLGDNIFYGQGFTHML KEAVHTAEEENKATVFGYWVADSERYGVAEFDKIGNVLSIEEKPVQPKSNYAVVGLYFYP NKVVEVAKNITPSLRGELEITTVNQQFLNDQELKVQLLGRGFAWLDTGTHDSLSEASTFI EVIEKRQGLKVACLEGIALRHGWITTDKMRELAQPMLKNQYGQYLLKVIDELTQK >gi|226332298|gb|ACIC01000022.1| GENE 8 12871 - 13440 395 189 aa, chain + ## HITS:1 COG:MA3780 KEGG:ns NR:ns ## COG: MA3780 COG1898 # Protein_GI_number: 20092576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 1 169 1 170 183 206 57.0 2e-53 MEVIKTAIEGVFIIEPRLFKDDRGYFFESFSQREFNEKVRKVNFVQDNESKSSYGVLRGL HFQKPPYAQSKLVRVIKGAVLDVAVDIRKGSPTFGKYVSVELTEDNHRQFFIPRGFAHGF SVLTDEVIFQYKCDNFYAPQSEGALAWDDPDLGIDWRVPANEIVLSEKDNHHELLKDASW LFNYNESLY >gi|226332298|gb|ACIC01000022.1| GENE 9 13444 - 14304 603 286 aa, chain + ## HITS:1 COG:CAC2315 KEGG:ns NR:ns ## COG: CAC2315 COG1091 # Protein_GI_number: 15895582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: 
Clostridium acetobutylicum # 1 283 1 279 280 231 45.0 1e-60 MNILVTGANGQLGNEMRCIAATSLNNYIFTDVAELDITDLDAIRNMIHLDNIKVIVNCAA YTNVDKAEDDYDMADLLNNKAVENLAIAAKEVNATLIHISTDYVFQGDKNMPCGEDCETN PLGIYGKTKRAGEQSIQRVGCNYLIFRTAWLYSQFGKNFVKTMRQLTADKDKLRVVFDQI GTPTYAKDLADIIYRVIEGDQYHKQGIYHFSNEGVCSWYDFAKEICELSGNSCDIQPCHS DEFPSKVKRPHFSVLDKTKLKLAFGVEIPYWKDSLVKCINELKENN >gi|226332298|gb|ACIC01000022.1| GENE 10 14307 - 15419 590 370 aa, chain + ## HITS:1 COG:ECs4721 KEGG:ns NR:ns ## COG: ECs4721 COG1088 # Protein_GI_number: 15833975 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Escherichia coli O157:H7 # 5 360 2 347 355 389 53.0 1e-108 MKFSRNIMITGGAGFIGSHVVRLFVNKYPSYRIINLDKLTYAGNLANLKDIEDKPNYVFV KADICDFEKMIELFSEYKIDGVIHLAAESHVDRSINDPFTFARTNVMGTLSLLQAAKLTW ESLSEGYEGKRFYHISTDEVYGALDLTNPSGNKDGRGPYGHDFFKETTRYSPHSPYSASK ASSDHFVRAFHDTYNMPTIVTNCSNNYGPYQFPEKLIPLFINNIRCRKPLPVYGKGENVR DWLYVIDHVRGIDLIFHKGKIAETYNIGGFNEWKNIDIIKVLIKTVDRLLGYPEGYSLDL ITYVSDRKGHDLRYAIDSAKLKQELGWEPSLQFEEGLERTVRWYLDNEVWMDNVTSGDYQ EYYDSIYRNE >gi|226332298|gb|ACIC01000022.1| GENE 11 15559 - 17091 239 510 aa, chain + ## HITS:1 COG:no KEGG:BT_0467 NR:ns ## KEGG: BT_0467 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 22 510 1 489 489 811 99.0 0 MITTNYNNKRIAKNTFLLYFRMLFTMMVSLYTSRVVLATLGVEDFGIYNVVGGVVTMFAI ISTSLSSAISRFITIELGKGDLRRLKTVFSTAVVIQFLMALVVVILAELIGVWFLNNKMN IPEVRMNAANWVLQFSIFTFGVNLICIPFNALIIAHERMSAFAYVSILEVSLKLLIVYLL VLMSVDKLIVYGALLLVVAIVITFAYFIYCRRHFKECKCFCRIDRFLLKRMLSFSGWNFI GASSAILRDQGVNIAINLFCGPTVNAARGMAVQVNHAINSFAQNFMTALNPQITKSFAVG DSKYMFTLIFQGARLSFYMLLFLSLPVLMSTEYILSIWLKIVPDYTIIFVRLVLIFAMCE SISNPLITAMLATGSIRNYQIVVGGMQMMNFPISYILLERGYAPEITLYVAIGISLCCLA LRLFMLRSMIQLPVKSYMKNVFFNICAVTIFSVAIPYLISLRLQSSFYSFLILSFSCIIC SFMAIFFVGCKASERVFVVDQVRKLIKNNL >gi|226332298|gb|ACIC01000022.1| GENE 12 17118 - 18284 341 388 aa, chain + ## HITS:1 COG:no KEGG:BT_0468 NR:ns ## KEGG: BT_0468 # Name: not_defined # Def: 
putative F420H2-dehydrogenase 40 kDa subunit # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 388 1 388 388 818 100.0 0 MIKIKKSEDCCGCSACLSICNHNAIVFKEDVEGFKYPMIEEGKCINCGLCEMACPILYRK VKDIRPTPKAYFAARHKNIVTLRNSSSGGAFTAIAGSVIELGGVVCGVEYDNDGVVRHSF SETMEGIRSFMGSKYVQSEIQGVYNQIKTYLKSDRWVLFTGSPCQVEGLKLFLRKPYENL LTVDLVCHAVPSPLIYKQYINMCSTKLGQKVMSINMRYKQTYGWSHRFSYCFYFENGQKA VDPVWVVNWGKIFFSQMINRPSCHSCKYSNLDRPGDFTIADFWDDKKNRPDLYSCEGTSL LMVNTNMGLDLINKLYDSIDLWKITKDEAMQPCLIRPTCSNQRRQEFWEFYLLRGFDAAY KKYFGDSKYVITKKIVKQIIKRVLRKLQ >gi|226332298|gb|ACIC01000022.1| GENE 13 18803 - 19585 357 260 aa, chain + ## HITS:1 COG:no KEGG:BT_0469 NR:ns ## KEGG: BT_0469 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 260 161 420 420 394 100.0 1e-108 MGARNVVILIAAYGLLVIKTHRKLLKIFLVTSLCFPVYMFTAYASRAVMIMTFFFLVFIF VFLSVFMNVGLKKKIVSYLILILVPISSAFILISNSRFGNLATYMFYRYLGESFNNYNTH FFYELKGNTWGEAYFVFFRKLMGISSNFKTTREKWEWLDNITGVDTHVFYTFVGGLNIEF GFVGTIVIGLLLSFFMVKKMRPYNVLTLPKFIALGMLAYTLINGVFFFVLQGDWGNLEIL FTLFFCFLFSKYRTRKYINK >gi|226332298|gb|ACIC01000022.1| GENE 14 19585 - 20814 471 409 aa, chain + ## HITS:1 COG:no KEGG:BT_0470 NR:ns ## KEGG: BT_0470 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 409 1 409 409 848 99.0 0 MKIGILTFANVPNFGANLQALSTISYLQNHGYNPILIKWEPEDFEARFTSIKTQKQPQEH FHFVEKYLPQTKICRNDDDICQVIKDEKIEGIIIGSDAVLQCSSFWGRLDFPTKTVVRVN KTTSEREYPNAFWGTFYDKLGSKIPMVIMSASSQNAKYRLIARSVKHKMSNNLSHFEYIS VRDIWTRDMITYITKGAIIPNITPDPVFAFNYNCAKFIPSKEDILLKYHLPENYVLVSMK CRMLDNMLWMDKLKSEMKKLYLECVALPMPTGIGFKHNFDFSITTPLPPLDWYALIKYAK GYIGENMHPIVVALHNAVPCFCFDTYGVLKYARCVCVEESSKIYDIMSRFSLLDNRINAY SRFWKCPPVDIVINRILHFDVIACQKTALNYYNNYEQMMSSILEVFKSK >gi|226332298|gb|ACIC01000022.1| GENE 15 20821 - 22035 671 404 aa, chain + ## HITS:1 COG:MA1061 KEGG:ns NR:ns ## COG: MA1061 COG0438 # Protein_GI_number: 20089931 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: 
Methanosarcina acetivorans str.C2A # 107 402 103 412 419 97 24.0 5e-20 MKYLITLPWMPYPATDGGKQGSFNMLMELQNMIDITLIYPVFFKKQMACQYLLEKQLPKV KVYPFEYYKNKDGIKSQYSLFRLHRIITRKFLKQPYLATPIYKDAYIDFVNEIIKKERID IVQNEYFEQLYMVYAIPNTVKKVFIQHEIQYIAKERLIQQREYPSSVRYLATMQRIQEIN ALNEYDQVITMTDIDKNILMCDGVRAPISASPSFIPLPDNIAYKECERSSICFIGGSGHN PNLNGVTWFLDNVWSLILKENPNFTFKIIGKWDEKIKTEYQKKYRNLFFCGFVDNLAMVI SECIMVIPILIGSGIRMKILESVNFYSPFVTTTVGVEGLDFINGKECIIADEPQAFATGV LKIATDKVLQHNLTKAAHEKLMEMYSPEASVQRRLNIYNEMLKR >gi|226332298|gb|ACIC01000022.1| GENE 16 22043 - 22801 225 252 aa, chain + ## HITS:1 COG:MJ1064 KEGG:ns NR:ns ## COG: MJ1064 COG0110 # Protein_GI_number: 15669253 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Methanococcus jannaschii # 125 236 102 213 214 88 45.0 1e-17 MNLSVVLFILIVVFIKYWLLLFTSPLLMAYCWLHRRKNMANPEGEVSEIRDSGITSSSCS KKLFKRIDKRGLINIVDGYFRWMIKIISDIPSHHIRDFFYKYIFLVKMEKNSVLYYGSEI RAPWMLMIGKGSVVGDNSILDARRGGIYIGENVNIASNVSLWTGGHDYNDPYFRSMKTNR GPIYIKNRVWIGPNVTILHSVTIGEGAVIAAGAVVTKDIPPFTICGGIPAKVLAQRSIDL RYTLGGTYLHFL >gi|226332298|gb|ACIC01000022.1| GENE 17 22811 - 23752 337 313 aa, chain + ## HITS:1 COG:no KEGG:BT_0473 NR:ns ## KEGG: BT_0473 # Name: not_defined # Def: putative glycosyltransferase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 313 1 313 313 618 100.0 1e-175 MKKLAIVIPAYKVDFFETVLFSLAQQTCKDFTVYIGEDCSRDDFKSLIEQYSKQLDIVYR RFEVNFGGHDLVAQWNRCIQLTQNEPWLWLFSDDDIMGPRCVEYFFNTIAIDNGAFDIYH FDVKIVDCENQIVRIPTVYPSVIDSETFYRRKASARLDSFVVEYIFLRDIYNCTGGFQHF DLAWGSDIATWVKIGADKGIKTISGDYVYWRKSKKNITPNMDNKMVLRKLTADIEFTHWI NEFFHKSSVYRFTKYAFFRLVVHYSQAISKSQIRLLLDKAVGKNIISFNDAFIINWTYRF IQLAKRVKDVLYF >gi|226332298|gb|ACIC01000022.1| GENE 18 23783 - 24829 753 348 aa, chain + ## HITS:1 COG:Cj1425c KEGG:ns NR:ns ## COG: Cj1425c COG2605 # Protein_GI_number: 15792743 # Func_class: R General function prediction only # Function: Predicted kinase related to galactokinase and mevalonate kinase # Organism: Campylobacter jejuni 
# 3 340 4 339 339 327 48.0 3e-89 MIVRSKAPLRLGLAGGGSDVSPYSDIYGGLILNATINLYAYCTIEETNSGRIEINAYDAQ CCKSYLSMSQLEIDGEASLIKGVYNRIIRDYRLEPKSFKITTYNDAPAGSGLGTSSTMVV CILKAFIEWLSLPLGDYETSRLAYEIERKDLGLSGGKQDQYAAAFGGFNYMEFLQNDLVI VNPLKMKRWIVDELESSMVLYFTGRSRSSAAIINEQKKNTSEGNQTAIEAMHKIKQSAID TKLALLKGDVGEFARILGEGWENKKKMAGAITNPMIQEAFDVATGAGAMAGKVSGAGGGG FIMFVVEPTRKEEVVRALNNLNGFVMPFQFIDDGAHGWKIYSTDKVQK >gi|226332298|gb|ACIC01000022.1| GENE 19 24843 - 25436 398 197 aa, chain + ## HITS:1 COG:Cj1424c KEGG:ns NR:ns ## COG: Cj1424c COG0279 # Protein_GI_number: 15792742 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoheptose isomerase # Organism: Campylobacter jejuni # 1 196 1 198 201 231 60.0 4e-61 MESIDIVRKQVAESERVKAELSKNEEVIAAIAKAADVCTEAYRRGNKTMFAGNGGSAADA QHLVGEFVSKFYFDRPGIPSIALTTDTSVITAIGNDYGYDKIFARQLQAQGVAGDVFIGI STSGNSKNIVDALPICKEKGITTIALTGMKSCKMDDFDIVIKVPSAETPRIQECQTLIGH IICCIVEENIFGEEYNK >gi|226332298|gb|ACIC01000022.1| GENE 20 25441 - 26148 460 235 aa, chain + ## HITS:1 COG:Cj1423c KEGG:ns NR:ns ## COG: Cj1423c COG1208 # Protein_GI_number: 15792741 # Func_class: M Cell wall/membrane/envelope biogenesis; J Translation, ribosomal structure and biogenesis # Function: Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) # Organism: Campylobacter jejuni # 1 223 1 214 221 166 46.0 2e-41 MEVIILAGGFGTRLRSVVNEVPKCMAPIANKPFLWYLLKYLTKFDVSKVILSLGYLRGVI IDWIDECKDEFPFAFEYAVEDEPLGTGGGIKLALKRTSKPNIIVLNGDTFFDVNLNELYE WHCLYPSSITLALKPMENFDRYGNVQICEDTNQIRRFDEKKYCEKGLINGGIYIINTLEP IFNRLPQRFSFETGVLQPQCLLGKLYGVVQNGYFIDIGIPEDYDKANAEFSDLLF >gi|226332298|gb|ACIC01000022.1| GENE 21 26164 - 26634 280 156 aa, chain + ## HITS:1 COG:MT0122_2 KEGG:ns NR:ns ## COG: MT0122_2 COG0241 # Protein_GI_number: 15839494 # Func_class: E Amino acid transport and metabolism # Function: Histidinol phosphatase and related phosphatases # Organism: Mycobacterium tuberculosis CDC1551 
# 14 147 39 173 217 94 39.0 5e-20 MRLQDIDVTGFETLLLDRDGVVNRLRPDDYVKKWEEFEFLPGVLEILKVWNTHFKYIFIV TNQRGVGKEIMSEEDLKHIHERMISEVKNYGGRIDRIYYCTALTDSDINRKPGIGMFLQI LRDYPDIDKAKCLMIGDSDSDIKFAKNCGIVGIKVI >gi|226332298|gb|ACIC01000022.1| GENE 22 26631 - 27737 309 368 aa, chain + ## HITS:1 COG:CAC3057 KEGG:ns NR:ns ## COG: CAC3057 COG0438 # Protein_GI_number: 15896308 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Clostridium acetobutylicum # 205 314 244 355 420 70 31.0 6e-12 MKEKILFLTAYVPNKAAAGEKNTMIMLNDLASCYDVDLIYFKYDWENDYVPERGNVNTKL TIRNSIMTKLKNVCNFPFVHPTFSIRFNWLILRRVRDLLKMNHYKVVIFNHSNMFLYGRY IDKRIPKILLCHDVIAQRVLRSSSYLMQKICIFSERMSLNLPNAHIFSFSQKDCDLIKQI YHKETNLCLDYIDEQIINKSPDKVEDYFTMFGDWRRKENAEGALWFINTVGPLLRKKVHI KIIGRGFPNKDVKNIVPNLNLDILGFVDDPYKILSQSKALISPLFTGAGIKVKVIEALAC GIPVIGTDIAFEGLPPLFNSFMLIAETPKDYIRCMECLNMNIDERIKTKEMFIKTYQSDS ITNYIKRI >gi|226332298|gb|ACIC01000022.1| GENE 23 27894 - 28718 161 274 aa, chain + ## HITS:1 COG:alr4493 KEGG:ns NR:ns ## COG: alr4493 COG1216 # Protein_GI_number: 17231985 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Nostoc sp. 
PCC 7120 # 4 254 9 255 295 120 32.0 2e-27 MIDISIITVGMNHLLYLKVLLRSLYKENIPQASIEMIYVDNCSSDGTVEYIRQNYPQVRI VENQKPLGFGENNNKGVLASIGKYIAIINPDIVLHKGSLDFLYEYVEKHPIIGITVPKLL NPDGTVQYSVRSFITLKVLFSRFLSKGNDQTTSKIVNKYLCKNMDTTKIQPVDWAIGAAL FMRRDFYAFLGGFDQNYFLYMEDEDICLRSWKCNRPVVYFPESIMTHNHLRGSSKIGRKA LLHLQSMFVFFKKHGCSIGSFCSKMQACSKLEYY >gi|226332298|gb|ACIC01000022.1| GENE 24 29010 - 30182 511 390 aa, chain + ## HITS:1 COG:ECs2852 KEGG:ns NR:ns ## COG: ECs2852 COG2148 # Protein_GI_number: 15832106 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Escherichia coli O157:H7 # 32 380 116 455 464 246 38.0 5e-65 MIRVLRNMVPFILFSICFLSLFHFEFFRSRLFGLFYVALLIILICYRLAFRYFLELYREK GGNVRMVVLVGSHENMQELYHSMTDDPTSGYRVMGYFEDSPSDYYPEEVAYLGQPKEAIE YLERNSGKIAQLYCSLPSVRSAEIVPIINYCENHLVRFFSVPNVRNYLKRRMYFDLLGNV PVLSIRREPLELRENRVLKRFFDIIFSLIFLCTIFPFVYIIIGIAIKISSPGPIFFKQKR SGEDGREFWCYKFRSMRVNAQCDVLQATANDPRKTRIGEFIRKTSIDELPQFINVLMGDM SVVGPRPHMLKHTEEYAQVINKFMVRHFVKPGITGWAQVTGYRGETKELWQMEGRVSRDI WYIEHWTFMLDLYIIYKTVYNAIHGEKEAY >gi|226332298|gb|ACIC01000022.1| GENE 25 30236 - 31036 494 266 aa, chain + ## HITS:1 COG:PM1016 KEGG:ns NR:ns ## COG: PM1016 COG1596 # Protein_GI_number: 15602881 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Pasteurella multocida # 38 233 75 253 387 65 28.0 1e-10 MRILNKLIFVVLLPFLFTACQSYKKVPYLQDTEVVNQIEQKENLYDAKIMPKDLLTIVVS CTSPELAAPFNLTVASPTNLSVTNILATTQPVLQTYLVDNEGKIFFPVLGELKLGGLTKK QAEQMVVEKLKPYMKETPIVTVRMVNYKISVIGEVTRPGTFTISNEKVNLLEALAMAGDM TVYGLRDNVKLIREDATGKQEIITLDLNKSETILSPYYWLQQNDVVYVTPNKAKARNSDI GNSTSLWFSATSILVSLASLLFNILK >gi|226332298|gb|ACIC01000022.1| GENE 26 31049 - 33487 1796 812 aa, chain + ## HITS:1 COG:alr2856_2 KEGG:ns NR:ns ## COG: alr2856_2 COG0489 # Protein_GI_number: 17230348 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome 
partitioning # Organism: Nostoc sp. PCC 7120 # 542 790 3 253 275 119 28.0 2e-26 MKEEIVNERQCETEDEKIDIQQLLFKYIIHWPWFVGAVLVCLIGAWIYLRMATPVYNISA TVLIKDDKKGGNTGSMVGLEELGLSGLISSSQNIDNELEVLRSKTLVKEVINLLNLYVSY TDEDGFPSKNMYKTSPVLVSLTPQEAEKLTDPMVVEMALYGEGGLEVNVTVGDKEYQKLF EKLPAVFPMDEGTLAFFQSPDSLSLKKDTMEASSNIRHITAKIKSPMKVARAYCENLKIE PTSKTTSVAVISLKNSSLQRGQDFINQLLEMYNRNTNNDKNEIAQKTAEFIDERINIISK ELGSTEANLENFKRNAGITDLTSEAQIALTGNAEYEKKRVENRTQISLIEDLRKYIRGNE YEVLPGNIGLQDPGLVATIERYNEMLVERKRLLRTSTENNPTIINLDTSIRAMKSNVQAT LDGSLKGLLITKADLEREASRFSRRISDAPGQERQFVSIARQQEIKAGLYLMLLQKREEN AIALAATANNAKIIDEAIADDIPVSPKRRMIYLIALVLGIGIPVGIIYLIGLTKFKLEGR ADVEKLTTIPIVGDIPLTDEKNEKDGSIAVFENQNNLMSETFRNIRTNLQFMLQNDKKVI LVTSTVSGEGKSFISANLAISLSLLGKKVVIVGLDIRKPGLNKVFRLSTKEKGITLYLAN PETDLMSLVQPSDINQNLYILPGGTVPPNPTELLARDGLDKAIEILKKSFDYVVLDTAPV GMVTDTLLIGRVADLSVYVCRADYTHKVEYTLINELAEEKKLPNLCTVINGVDLKRRKYG YYYGYGKYGKYYGYGKRYGYGYGYGQEKGAKS >gi|226332298|gb|ACIC01000022.1| GENE 27 33495 - 33746 74 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|298384779|ref|ZP_06994339.1| ## NR: gi|298384779|ref|ZP_06994339.1| hypothetical protein HMPREF9007_01416 [Bacteroides sp. 
1_1_14] # 22 83 1 62 62 111 96.0 1e-23 MFHNYHFLRDNTNITKKAGIWLSGYPLFFDMGYKNYIVNVDEYNKPLDWGKSSGRLYLTD YILGDILWSIRLKRAIVYSQVYS >gi|226332298|gb|ACIC01000022.1| GENE 28 33738 - 36482 2209 914 aa, chain + ## HITS:1 COG:no KEGG:BT_0483 NR:ns ## KEGG: BT_0483 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 914 1 914 914 1791 99.0 0 MKQHYRLTIWSLFFLITSSGSAFAQRKTITPIDSLITVGYATGSLKTISGSVEKITETQM NKDQITNPLEAIRGRVPGLTIQRGTNGPAALDAVRLRGTTSLTSGNDPLIIVDGVFGDLS MLTSIYPTDIESFTILKDASETAQYGSRGASGVIEITTKKGMSGKTQVSYNGIFGISTVY KNLKMLSADDFRWVASERGISILDKGNNTDFQKEIEQTGLQQNHHIAFYGGSNESSYRVS LGFMDRQGVILNEDMKNFTSNMNMTQRMFDGFLNCELGMFGSIQKNHNLVDLQKTFYSAA TFNPTYPNHKDPDSGSWDGITTASQITNPLAWMEVQDHDATSHISTHARLTFNLMDELKL VLFGAYTYNIVENSQYLPTSVWAHGQAYKGTKKMESLLGNMMLTYKKGWKKHYFDALALA ELQKETYTGFYTTVTNFNTDKFGYHNLQAGALRPWEGTNSYYEQPHLASFMGRFNYTYAD RYSLTMTARTDASSKFGANHKWGFFPSVSAAWVISEEKFMKRLPVIDNLKFRIGYGLAGN QSGIDSYTTLSLVKPNGVIPVGNSAAVSLGDLRNTNPDLKWEVKHTFNTGIDLAMFGNRL LLSANYYNSKTTDMLYLYNVSVPPFTYNTLLANIGSMRNSGLEIAIGITPLKTQDMELNI NANITFQQNKLLSLSGMYNGELISAPEYKSLASLDGAGFHGGYNHIVYQVVGQPLGVFYL PRANGLVSDGNGGYTYDIADLNGGGVSLEDGEDRYVAGQAVPKTILGSNLSFRYKRFDIS VQINGAFGHKIYNGTSLTYMNMNIFPDYNVMQEAPARNIKDQTATDYWLENGAYVNFDYV TLGWNVPIEKVQKLKKYVRSLRLAFTVNNLATISGYSGLSPMINSSTVNSTLGLDDKRGY PLARTYTLGLSINF >gi|226332298|gb|ACIC01000022.1| GENE 29 36501 - 38234 1382 577 aa, chain + ## HITS:1 COG:no KEGG:BT_0484 NR:ns ## KEGG: BT_0484 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 577 1 577 577 1169 98.0 0 MKKIKRYLQQQGWYLESKSLYDGKATLLGGNLLEGKMNLKKTVVFLIVCITFFSCDKFLE ESPRDKLPEDEVYNNISDVYLNAVASLYTYIGGYSDSQGLQGTGRGVYDLNTFTTDEAII PTRGGDWYDGGFWQGLFLHQWGIENDAIQATWEYLYKVVMLSNKSLERIDKFSATHADAE LPAYRAEVRAMRAMYYYYLMDLFGRVPLVLTSSPSMKDVVQSERKTIFDFVFKELQESAP LLAEVHSNQSGPYYGRITRPVVVFLLAKLALNAEVYTDNDWTDGQRPDGKNLFFEVDGNR LNAWQTVIAYCDQLQEMGYRLEPDYKTNFAVFNEPSVENIFTIPMNKTLYTNQMQYLFRS 
RHYNHAKAYGLSGENGPSATIEALETFGYETAQQDPRFDICYFAGVVYDLKGNIVRLDDG TVLEYLPWKVELDISNTPYEQTAGARMKKYEVDETATKDGKLMENDIVLFRYADVLLMKS EAKVRNGEDGYAELNEVRGRVNASSRTATLENILAERQLELAWEGWRRQDLIRFGAFTRA YSSRPQLPAENNGYTTVFPIPEKIRVMNTKLEQNPGY >gi|226332298|gb|ACIC01000022.1| GENE 30 38321 - 38512 195 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567896|ref|ZP_04845307.1| ## NR: gi|253567896|ref|ZP_04845307.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 63 1 63 63 105 100.0 8e-22 MSYKSIKDVVTMLQENGFVLKSQKGSHMKFEKDDKIVIVPNHNSKGVEKGTYYSILRQAG LKK >gi|226332298|gb|ACIC01000022.1| GENE 31 38548 - 38940 461 130 aa, chain + ## HITS:1 COG:no KEGG:Coch_0296 NR:ns ## KEGG: Coch_0296 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 1 129 1 126 126 72 33.0 3e-12 MKTVEVIVEHAGKNLSAYIEGAPVITVGNDMREIEENMKEAIELYLEDNPNPCELLSGEL ELKFKIDAATFINYYSNIFTKAALSRITGINERQLWHYAAGVHKPRRQQLEKIQKGIQAL TKELSSIHLL >gi|226332298|gb|ACIC01000022.1| GENE 32 39050 - 40240 1522 396 aa, chain - ## HITS:1 COG:STM0013 KEGG:ns NR:ns ## COG: STM0013 COG0484 # Protein_GI_number: 16763403 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Salmonella typhimurium LT2 # 4 396 3 368 379 278 43.0 1e-74 MAEKRDYYEVLEVTKTATVEEIKKAYRKKAIQYHPDKNPGDKEAEEKFKEAAEAYDVLSN PDKRSRYDQFGHAGVSGAAGNGGPFGGFGGEGMSMDDIFSMFGDIFGGRGGGFSGGFGGF SGFGGGGGGSQQRRYRGSDLRVKVKMTLKEISTGVEKKFKLKKYVPCNHCHGTGAEGDGG SETCPTCKGSGSVIRNQQTILGTMQTRTTCPTCNGEGKIIKNKCKECGGDGIVYGEEVVT VKIPAGVAEGMQLSMGGKGNAGKHNGVPGDLLILVEEEPHPDLIRDENDLIYNLLLSFPT AALGGAVEIPTIDGKVKVKIDSGTQPGKVLRLRGKGLPNVNGYGTGDLLVNISIYVPEAL NKEEKSTLEKMEASDNFKPSTSVKEKIFKKFKSFFD >gi|226332298|gb|ACIC01000022.1| GENE 33 40268 - 40849 816 193 aa, chain - ## HITS:1 COG:STM2681 KEGG:ns NR:ns ## COG: STM2681 COG0576 # Protein_GI_number: 16765996 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular 
chaperone GrpE (heat shock protein) # Organism: Salmonella typhimurium LT2 # 2 193 5 194 196 77 34.0 2e-14 MDPKEKEKMAEELNVEETKDTAEEQPQNDQAEEAAPLTHEEQLEKELEDAQAVIEEQKDK YLRLSAEFDNYRKRTMKEKAELILNGGEKSISSILPVIDDFERAIKTMETAKDVKAVKEG VELIYNKFMAVMAQNGVKVIETKDQPLDTDYHEAIAVIPAPSEEQKGKILDCVQTGYTLN DKVIRHAKVVVGE >gi|226332298|gb|ACIC01000022.1| GENE 34 41053 - 42343 1528 430 aa, chain + ## HITS:1 COG:BS_ykpA KEGG:ns NR:ns ## COG: BS_ykpA COG0488 # Protein_GI_number: 16078507 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Bacillus subtilis # 1 430 1 432 540 528 57.0 1e-150 MITVSNVSVQFGKRVLFNDVNLKFTSGNCYGIIGANGAGKSTFLRTIYGDLDPTTGTIAL GPGERLSVLSQDHFKWDAYTVMDTVMMGHTVLWDIMKQREVLYAKEDFTDEDGLKVSELE EKFAELDGWNAESDAAMLLSGLGIKEDKHYTLMGELSGKEKVRVMLAQALYGNPDNLLLD EPTNDLDMETVTWLEEYLSNFEHTVLVVSHDRHFLDSVCTHTVDIDYGKINMFAGNYSFW YESSQLALRQQQNQKAKAEEKKKELEEFIRRFSANVAKSKQTTSRKKMLEKLNVEEIKPS SRKYPGIIFTPEREPGNQILEVSGLSKKTEEGVVLFSDVNFNIEKGDKVVFLSRNPRAMT AFFEIINGNMKPDAGQFNWGVTITTAYLPLDNTDFFNTDLNLVDWLSQFGEGNEVYMKGF LGRMLFSGEE Prediction of potential genes in microbial genomes Time: Thu May 12 00:08:07 2011 Seq name: gi|226332297|gb|ACIC01000023.1| Bacteroides sp. 1_1_6 cont1.23, whole genome shotgun sequence Length of sequence - 13091 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 9, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 20 - 310 297 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Term 1322 - 1377 3.3 2 2 Tu 1 . - CDS 1443 - 2492 1025 ## BT_1240 hypothetical protein - Prom 2559 - 2618 5.1 + Prom 2478 - 2537 5.6 3 3 Tu 1 . + CDS 2596 - 3492 825 ## COG1266 Predicted metal-dependent membrane protease 4 4 Tu 1 . - CDS 3464 - 5596 263 ## PROTEIN SUPPORTED gi|87310993|ref|ZP_01093118.1| ribosomal protein S1-like RNA-binding domain protein - Prom 5657 - 5716 5.1 5 5 Op 1 . 
+ CDS 5714 - 7639 1394 ## COG0642 Signal transduction histidine kinase 6 5 Op 2 . + CDS 7636 - 9378 1562 ## COG0737 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases + Prom 9396 - 9455 2.1 7 5 Op 3 . + CDS 9475 - 10377 672 ## COG2207 AraC-type DNA-binding domain-containing proteins 8 6 Tu 1 . - CDS 10387 - 10980 519 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Prom 11151 - 11210 3.9 9 7 Tu 1 . + CDS 11105 - 11815 665 ## BT_1233 hypothetical protein + Term 11846 - 11897 5.1 10 8 Tu 1 . - CDS 11985 - 12191 123 ## BT_1232 hypothetical protein - Prom 12225 - 12284 4.0 - TRNA 12259 - 12330 56.8 # Glu CTC 0 0 - TRNA 12394 - 12483 59.9 # Ser GCT 0 0 + Prom 12496 - 12555 2.8 11 9 Tu 1 . + CDS 12579 - 12782 78 ## BT_1231 hypothetical protein + Term 12914 - 12960 3.2 Predicted protein(s) >gi|226332297|gb|ACIC01000023.1| GENE 1 20 - 310 297 96 aa, chain + ## HITS:1 COG:BB0742 KEGG:ns NR:ns ## COG: BB0742 COG0488 # Protein_GI_number: 15595087 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Borrelia burgdorferi # 1 95 465 559 565 124 58.0 3e-29 MRCMIARMQLRNANCLILDTPTNHLDLESIQAFNNNLKTYKGNILFSSHDHEFIQTVANR IIELTPNGIIDKMMEYDEYITSDHIKELRAKMYGDK >gi|226332297|gb|ACIC01000023.1| GENE 2 1443 - 2492 1025 349 aa, chain - ## HITS:1 COG:no KEGG:BT_1240 NR:ns ## KEGG: BT_1240 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 349 1 349 349 690 99.0 0 MKIRQLLISFLLAASTLGATAQVSKTYYVSKPGTLISMMTEEEANSITHLTLTGKLNAED FRHLRDEFSSLKVLDISNAEIKMYSGKAGTYPNGKFYIYMANFVPAYAFSNVVNGVTKGK QTLEKVILSEKIKNIEDAAFKGCDNLKICQIRKKTAPNLLPEALADSVTAIFIPLGSSDV YRFKNRWEHFAFIEGEPLETTIQVGAMGKLEDEIMKAGLQPRDINFLTIEGKLDNADFKL IRDYMPNLVSLDISKTNATTIPDFTFAQKKYLLKIKLPHNLKTIGQRVFSNCGRLAGTLE LPASVTAIEFGAFMGCDNLRYVLATGDKITTLGDELFGNGVPSKLIYKK >gi|226332297|gb|ACIC01000023.1| GENE 3 2596 - 3492 
825 298 aa, chain + ## HITS:1 COG:FN0640 KEGG:ns NR:ns ## COG: FN0640 COG1266 # Protein_GI_number: 19703975 # Func_class: R General function prediction only # Function: Predicted metal-dependent membrane protease # Organism: Fusobacterium nucleatum # 95 288 98 288 293 82 29.0 9e-16 METEEIREKKEPKRLPVWACILLFALGLFVTFGLYSTIGFGVLSLILGDEARHPGLLGHM VAEAGMLLAVLTSAVILLYFERRPFSDLGLTLKGHARGLWYGLLAAILLYLIGFGLSLAL GEVEVTGFQFDPLNLLATFVFCLLVALAEEIMMRGYILGRLLHTRMNKFLSLFVSALLFA LLHLFNPNLAFLPMLNLLLAGMFIGASYLYTRNLCFPISLHLFWNWIQGPILGYQVSGND FGTSLLTLHLPEENVLNGGAFGFEGSLICTVLMIIFTILIVWWGEKREAISLAVPRSC >gi|226332297|gb|ACIC01000023.1| GENE 4 3464 - 5596 263 710 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|87310993|ref|ZP_01093118.1| ribosomal protein S1-like RNA-binding domain protein [Blastopirellula marina DSM 3645] # 595 705 833 943 1043 105 45 1e-22 MELFHQMISGLLGIPERQISSTLHLLDEGATIPFISRYRKEATGGLDEVQIENIKEQHDK LCDIAKRKETILSTINEQGKLTPELEKRINATWNPTELEDIYLPYKPKRKTRAEAARQKG LEPLAMIMMLQREPNLTAKAATFVKGDVKDTEDALKGARDIIAEQVNENECARNAIRNQF TRQAEITAKVVKGKEEEAAKYRDYFDFSESLKRCTSHRLLAIRRAESEGLLKVSISPDDE ACLERLDRQFVHGNNECSHQVKEATADAYKRLLKPSIETEFAAQSKEKADDEAIRVFTEN LRQLLLSPPLGQKRVLAIDPGFRTGCKVVCLDAQGNLLHNENIYPHPPVNKTGEAASKLR KMIEAYEIEAISIGNGTASRETEDFINHQTFDRQIPVFVVSEQGASIYSASKIARDEFPD YDVTVRGAVSIGRRLMDPLAELVKIDPKSIGVGQYQHDVDQTKLKKALDQTVENCVNLVG VNLNTASSHLLTYISGLGPQLAQNIVNFRAENGAFSSRKELMKVPRMGAKAFEQCAGFLR IPGAKNPLDNTAVHPESYHIVEQMAKDLKCSVDELIANKELRQKIKISDYITPTVGLPTL QDIMQELDKPGRDPRKAIKVFEFDKNVRTIADLREGMILPGIVGNITNFGAFVDIGIKEN GLVHLSQLAERYISDPTEIVSIHQHVMVRVMNVDTDRKRIQLSMIGVQQD >gi|226332297|gb|ACIC01000023.1| GENE 5 5714 - 7639 1394 641 aa, chain + ## HITS:1 COG:MA4377_3 KEGG:ns NR:ns ## COG: MA4377_3 COG0642 # Protein_GI_number: 20093164 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 400 636 13 255 311 165 41.0 2e-40 MMRYIGCLFICLLWLFRATELFAVSNDQKPILIICSYNPAAHQTSVTISDYMEEYNKLGG 
KRDIIIENMNCKSFSEAPLWSGMMTQILSKYQGEKHPAQIILLGQEAWAAYLSQRDSIQV KVPVMCSLVSSNIVILPEDTVAGLDTWMPESVDLFTDHMNIPELKSGFINQYNIEDNVRM IKAFYPKTEHIAFISDNTYGGVTMQALVRKEMKKFPEIDLILMDGRRHTIYTIVEELRHL PENTVIMVGTWRVDMNEGYFMRNATYAMMEVTPTIPTFTPSSVSLGYWAIGGVLPDYRKL GMDMALASVQIDQYPADEQQHLSVIGSKAVLDSRKVKEWGLDPDILPFDVQIINQTVSFY QQYTYQIWSACALVVILVLGLCISLFYYFRTKRLKDDLLKSEKDLRVAKDRAEESNRLKS AFLANMSHEIRTPLNSIVGFSDVLAVGGNTEEEQQTYYQIIKTNSDLLLRLINDILDLSR LEANRVTLTWEECDVVQLCRQVVASVSVSRQSGNQFLFVSDYESFRMTTDIQRMQQVIIN LMSNADKFTRKGQITLEFSVNEETEMAVFSVTDTGCGIPKEKQKLVFERFEKLNEYAQGT GLGLSICKLIVHKWKGDIWIDSEYTGGARFIFSHPLKIEKE >gi|226332297|gb|ACIC01000023.1| GENE 6 7636 - 9378 1562 580 aa, chain + ## HITS:1 COG:CAC0353 KEGG:ns NR:ns ## COG: CAC0353 COG0737 # Protein_GI_number: 15893644 # Func_class: F Nucleotide transport and metabolism # Function: 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases # Organism: Clostridium acetobutylicum # 27 568 542 1093 1193 239 31.0 1e-62 MKRFTCVYVWLLCLVLSVAAQEKVVKLKIVQTSDVHGNYYPYNFITRKDWQGSLARIYAF VEKEREQYKENLILLDNGDILQGQPTAYYYNYIDTVSPHLCAEMMNYMKYDAGNMGNHDV ETGRAVFDRWINTCDFPVLGANIIDTSTGEPHLPPYKVMERDGVKIVILGMITPAIPAWL SENLWKGLRFDDMEETARKWMKIIREKENPDLMIGLFHAGQEAFKMSGKYNENASLSVAK NVPGFDIVLMGHDHARECKKVMNVAGDSVLVIDPASNGIVLSNIDVTLKLKDGKVRSKDI KGVLTETKDYGISEDFMKHFAPQYDTVRNFVSKKIGTFTESISTRPSFFGPSAFIDLIHT LQLDITGAEISFAAPLSFDAKINKGDVFVSDMFNLYKYENMLYMMTLSGKEIKDYLEMSY FMWTNRMKSPDDHLLWFKEKRREGAEDRASFQNFSFNFDSASGIIYTVDVTKPQGEKITI VSMADGSPFFMDKIYKVALNSYRGNGGGELLTKGSGIPQEKLKERIIFSTDKDLRFYLMN YIENKGTMDPKALNQWKFVPEKWTVPAAERDYKFLFGDSQ >gi|226332297|gb|ACIC01000023.1| GENE 7 9475 - 10377 672 300 aa, chain + ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 196 298 185 286 288 74 33.0 3e-13 MDIQVVPKIGISSVVHSKHIDPDSIDVVGNDIALFDTESVTSLYNGPSKLEVVSIGLCLE GSTRFNISLREFELIPGRMVIALPNQIIEHRQFSSNFRGIFFAVSKSLLESLPKVGNVLS 
FFFFLKDYPCFDLNLHEQEMIKEYHAFIRKRLKNKDDKYRREVVMGLMQGFFFELCNIFN SYAPDASAVVKSKSRKEYIFERFYESLVQSYQSERSVKFYADQLCLTPKHLSGVVKEVSG KTVGEWIDELVILEAKALLNSSSMNIQEIADRLNFANQSFFGKYFKHYTGMSPKEYRKSR >gi|226332297|gb|ACIC01000023.1| GENE 8 10387 - 10980 519 197 aa, chain - ## HITS:1 COG:CAC3336 KEGG:ns NR:ns ## COG: CAC3336 COG0664 # Protein_GI_number: 15896579 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 39 190 40 192 199 73 28.0 3e-13 MDTLLRETVNTVVNSRFPEMSIEGRRQIESILIREEYPKGVIALNEGEVAHELVFVGKGM LRQYYYKNGKDVTEHFSYEGCILMCIESLLKQVPTRLIIETLEPAVIYLFPYDKMMQLTK QNWEINMFYRKILEYSLIVSQTKADSWRFESARERYNLLLETHPEIIKRAPLAHIASYLL MTPETLSRVRSGVLLVF >gi|226332297|gb|ACIC01000023.1| GENE 9 11105 - 11815 665 236 aa, chain + ## HITS:1 COG:no KEGG:BT_1233 NR:ns ## KEGG: BT_1233 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 236 9 244 244 478 100.0 1e-134 MAALLLCLTGCVRDNDAIYYPVGDVDIERGGPALEVGEEDVLVARSFNEEDYVLDTIAQY PNDPTLGKLTFMIDLKNQQKDQNVADFNGVGKSKLTMSLGYKDGNYPSESQVPIYTSQDV TAKYAVKLRLKGELLVSGDEWMIDYVYAQLASLFQPYPPANFPEVFMCKGGMKLGTFDSF RRTCTFDITYDRSDLSFSQLYFNLFINLAGQKRENRVRLRIDKESYFELYEQSEEM >gi|226332297|gb|ACIC01000023.1| GENE 10 11985 - 12191 123 68 aa, chain - ## HITS:1 COG:no KEGG:BT_1232 NR:ns ## KEGG: BT_1232 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 68 1 68 68 105 100.0 5e-22 MEYNVKELKKVLIEQCKEEGIYYALIAINKQTKEIVLPQSLDNALNNPDYCVFKCRKVKD EYKVEEVK >gi|226332297|gb|ACIC01000023.1| GENE 11 12579 - 12782 78 67 aa, chain + ## HITS:1 COG:no KEGG:BT_1231 NR:ns ## KEGG: BT_1231 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 67 1 67 67 109 100.0 3e-23 MKEITCYVSNVKRIIFFENIFSHYSWEKTITLLIIIEACGVIRFFTSVFAILRLSPWLLE KFIKKRA Prediction of potential genes in microbial genomes 
Time: Thu May 12 00:08:40 2011 Seq name: gi|226332296|gb|ACIC01000024.1| Bacteroides sp. 1_1_6 cont1.24, whole genome shotgun sequence Length of sequence - 79603 bp Number of predicted genes - 94, with homology - 93 Number of transcription units - 33, operones - 23 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 16 - 420 360 ## BF1816 hypothetical protein + Prom 566 - 625 5.3 2 2 Tu 1 . + CDS 736 - 2403 1856 ## COG2985 Predicted permease + Prom 2405 - 2464 6.1 3 3 Tu 1 . + CDS 2518 - 4512 2024 ## COG3855 Uncharacterized protein conserved in bacteria - Term 4611 - 4657 2.6 4 4 Op 1 . - CDS 4836 - 5984 1121 ## BT_1227 hypothetical protein 5 4 Op 2 . - CDS 6048 - 7706 1681 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) - Prom 7731 - 7790 5.0 - Term 7850 - 7889 6.0 6 5 Op 1 14/0.000 - CDS 8129 - 9199 1055 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Prom 9219 - 9278 4.4 7 5 Op 2 . - CDS 9303 - 10373 1244 ## COG1089 GDP-D-mannose dehydratase 8 5 Op 3 . - CDS 10450 - 11619 1188 ## COG1301 Na+/H+-dicarboxylate symporters - Prom 11703 - 11762 7.0 + Prom 11650 - 11709 3.6 9 6 Tu 1 . + CDS 11729 - 13204 1601 ## COG0362 6-phosphogluconate dehydrogenase + Term 13452 - 13509 -0.8 10 7 Op 1 15/0.000 + CDS 13549 - 15045 1171 ## COG0364 Glucose-6-phosphate 1-dehydrogenase 11 7 Op 2 . + CDS 15042 - 15758 195 ## PROTEIN SUPPORTED gi|163781723|ref|ZP_02176723.1| 50S ribosomal protein L13 + Term 15837 - 15898 -0.7 + Prom 15763 - 15822 2.3 12 7 Op 3 . + CDS 15943 - 16893 868 ## COG2837 Predicted iron-dependent peroxidase + Term 16922 - 16985 15.4 - Term 17102 - 17161 2.2 13 8 Tu 1 . - CDS 17179 - 17877 534 ## COG2135 Uncharacterized conserved protein - Prom 18009 - 18068 7.7 + Prom 18220 - 18279 9.2 14 9 Tu 1 . + CDS 18349 - 18687 187 ## gi|160885865|ref|ZP_02066868.1| hypothetical protein BACOVA_03869 15 10 Op 1 . - CDS 18684 - 19151 374 ## gi|253567929|ref|ZP_04845340.1| conserved hypothetical protein 16 10 Op 2 . 
- CDS 19171 - 19557 146 ## gi|253567930|ref|ZP_04845341.1| predicted protein 17 10 Op 3 . - CDS 19570 - 19854 201 ## gi|253567931|ref|ZP_04845342.1| hypothetical protein BSIG_00357 - Prom 20002 - 20061 6.7 + Prom 19984 - 20043 9.8 18 11 Op 1 . + CDS 20128 - 20421 277 ## gi|237716168|ref|ZP_04546649.1| conserved hypothetical protein 19 11 Op 2 . + CDS 20437 - 20673 157 ## gi|160885871|ref|ZP_02066874.1| hypothetical protein BACOVA_03875 + Prom 20805 - 20864 3.4 20 12 Op 1 . + CDS 21108 - 21314 232 ## gi|160885873|ref|ZP_02066876.1| hypothetical protein BACOVA_03878 21 12 Op 2 . + CDS 21332 - 21589 110 ## gi|260170657|ref|ZP_05757069.1| hypothetical protein BacD2_02235 22 12 Op 3 . + CDS 21586 - 21777 163 ## gi|160885875|ref|ZP_02066878.1| hypothetical protein BACOVA_03880 + TRNA 21840 - 21913 65.6 # Undet ??? 0 0 + TRNA 22003 - 22100 22.3 # Pseudo GAA 0 0 + TRNA 22328 - 22397 27.8 # Pseudo ??? 0 0 + Prom 22676 - 22735 4.6 23 13 Op 1 . + CDS 22815 - 23597 497 ## COG4712 Uncharacterized protein conserved in bacteria 24 13 Op 2 . + CDS 23603 - 24517 443 ## BVU_2860 hypothetical protein 25 13 Op 3 . + CDS 24514 - 24963 69 ## COG0629 Single-stranded DNA-binding protein 26 14 Op 1 . + CDS 25114 - 25422 202 ## BVU_2858 hypothetical protein 27 14 Op 2 . + CDS 25441 - 26109 709 ## BVU_2857 hypothetical protein 28 14 Op 3 . + CDS 26042 - 26377 90 ## BVU_2856 hypothetical protein 29 14 Op 4 . + CDS 26374 - 27405 548 ## COG1061 DNA or RNA helicases of superfamily II + Prom 27442 - 27501 4.8 30 15 Op 1 . + CDS 27537 - 28034 220 ## BDI_0857 putative recombination protein 31 15 Op 2 . + CDS 28038 - 28850 339 ## Coch_0881 hypothetical protein 32 15 Op 3 . + CDS 28747 - 29394 455 ## gi|253567946|ref|ZP_04845357.1| conserved hypothetical protein 33 15 Op 4 . + CDS 29351 - 29572 220 ## gi|260170645|ref|ZP_05757057.1| hypothetical protein BacD2_02175 + Prom 29590 - 29649 2.1 34 16 Tu 1 . 
+ CDS 29674 - 31143 969 ## COG1475 Predicted transcriptional regulators + Term 31380 - 31414 4.0 + Prom 31168 - 31227 2.7 35 17 Op 1 . + CDS 31419 - 32183 399 ## PA14_58970 hypothetical protein 36 17 Op 2 . + CDS 32180 - 32620 331 ## gi|253567951|ref|ZP_04845362.1| conserved hypothetical protein + Prom 32632 - 32691 1.8 37 18 Op 1 . + CDS 32727 - 33020 170 ## BVU_2843 hypothetical protein 38 18 Op 2 . + CDS 33081 - 33737 451 ## BVU_2842 hypothetical protein 39 18 Op 3 . + CDS 33706 - 34458 314 ## BVU_2841 hypothetical protein + Prom 34474 - 34533 3.9 40 19 Op 1 . + CDS 34570 - 35145 263 ## BVU_2840 hypothetical protein 41 19 Op 2 . + CDS 35194 - 35841 427 ## HAPS_0636 hypothetical protein 42 19 Op 3 . + CDS 35870 - 36334 201 ## gi|160885897|ref|ZP_02066900.1| hypothetical protein BACOVA_03902 + Term 36369 - 36408 -1.0 + Prom 36503 - 36562 2.6 43 20 Op 1 . + CDS 36679 - 37386 270 ## Dalk_4616 hypothetical protein 44 20 Op 2 . + CDS 37398 - 37949 410 ## gi|253567960|ref|ZP_04845371.1| predicted protein 45 20 Op 3 . + CDS 37951 - 38193 180 ## gi|295084508|emb|CBK66031.1| hypothetical protein + Prom 38198 - 38257 2.0 46 21 Op 1 . + CDS 38282 - 38527 193 ## gi|253567961|ref|ZP_04845372.1| conserved hypothetical protein 47 21 Op 2 . + CDS 38551 - 39057 361 ## gi|253567962|ref|ZP_04845373.1| conserved hypothetical protein 48 21 Op 3 . + CDS 39069 - 39677 301 ## gi|295084505|emb|CBK66028.1| hypothetical protein 49 21 Op 4 . + CDS 39694 - 40800 691 ## COG0582 Integrase 50 21 Op 5 . + CDS 40757 - 41749 633 ## BVU_2845 putative type I restriction-modification system methyltransferase subunit + Prom 41918 - 41977 1.9 51 22 Op 1 . + CDS 42067 - 42282 135 ## gi|253567966|ref|ZP_04845377.1| predicted protein 52 22 Op 2 . + CDS 42292 - 42465 75 ## - 5S_RRNA 42366 - 42452 95.0 # CP000139 [R:343154..343304] # 5S ribosomal RNA # Bacteroides vulgatus ATCC 8482 # Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; Bacteroidaceae; Bacteroides. + Prom 42552 - 42611 6.9 53 23 Tu 1 . 
+ CDS 42695 - 43237 271 ## PRU_0933 hypothetical protein 54 24 Tu 1 . - CDS 43268 - 44389 244 ## gi|253567968|ref|ZP_04845379.1| conserved hypothetical protein - Prom 44498 - 44557 9.0 55 25 Op 1 . + CDS 44493 - 45281 288 ## COG0451 Nucleoside-diphosphate-sugar epimerases 56 25 Op 2 . + CDS 45250 - 45813 173 ## PRU_0949 hypothetical protein 57 25 Op 3 . + CDS 45815 - 46360 467 ## BVU_2837 hypothetical protein 58 25 Op 4 . + CDS 46422 - 48437 890 ## COG1783 Phage terminase large subunit + Prom 48461 - 48520 3.7 59 26 Op 1 . + CDS 48564 - 50009 549 ## BVU_2835 hypothetical protein 60 26 Op 2 . + CDS 49963 - 51024 601 ## BVU_2834 hypothetical protein 61 26 Op 3 . + CDS 51042 - 52619 1406 ## BVU_2833 hypothetical protein 62 26 Op 4 . + CDS 52624 - 53046 471 ## BVU_2832 hypothetical protein 63 26 Op 5 . + CDS 53055 - 53597 337 ## BVU_2831 hypothetical protein 64 26 Op 6 . + CDS 53594 - 54166 240 ## BVU_2830 hypothetical protein 65 26 Op 7 . + CDS 54163 - 55002 591 ## BVU_2829 hypothetical protein 66 26 Op 8 . + CDS 54999 - 55154 93 ## gi|160885928|ref|ZP_02066931.1| hypothetical protein BACOVA_03933 + Term 55157 - 55202 8.8 67 27 Op 1 . + CDS 55279 - 55725 242 ## BVU_2828 hypothetical protein 68 27 Op 2 . + CDS 55756 - 55917 198 ## gi|160885930|ref|ZP_02066933.1| hypothetical protein BACOVA_03935 69 27 Op 3 . + CDS 55920 - 59825 3166 ## COG5283 Phage-related tail protein 70 27 Op 4 . + CDS 59828 - 60409 312 ## BVU_2826 hypothetical protein + Term 60416 - 60454 -0.8 + Prom 60425 - 60484 8.0 71 28 Op 1 . + CDS 60517 - 60804 103 ## gi|253567985|ref|ZP_04845396.1| conserved hypothetical protein 72 28 Op 2 . + CDS 60794 - 61267 142 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Term 61311 - 61355 -0.7 + Prom 61315 - 61374 4.5 73 29 Op 1 . + CDS 61394 - 61726 247 ## BVU_2823 hypothetical protein 74 29 Op 2 . + CDS 61728 - 61982 212 ## gi|253567988|ref|ZP_04845399.1| predicted protein 75 29 Op 3 . 
+ CDS 61987 - 62262 181 ## gi|253567989|ref|ZP_04845400.1| predicted protein 76 29 Op 4 . + CDS 62282 - 62443 182 ## gi|253567990|ref|ZP_04845401.1| predicted protein 77 29 Op 5 . + CDS 62491 - 62637 165 ## gi|295084479|emb|CBK66002.1| hypothetical protein 78 29 Op 6 . + CDS 62648 - 62935 56 ## gi|288799795|ref|ZP_06405254.1| hypothetical protein HMPREF0669_00194 79 29 Op 7 . + CDS 62971 - 63570 -18 ## gi|253567991|ref|ZP_04845402.1| predicted protein 80 29 Op 8 . + CDS 63611 - 63844 67 ## gi|295084476|emb|CBK65999.1| hypothetical protein 81 29 Op 9 . + CDS 63866 - 64048 119 ## gi|253567992|ref|ZP_04845403.1| predicted protein 82 29 Op 10 . + CDS 64048 - 64512 220 ## gi|253567993|ref|ZP_04845404.1| predicted protein + Prom 64528 - 64587 8.8 83 30 Tu 1 . + CDS 64800 - 64988 159 ## gi|253567994|ref|ZP_04845405.1| predicted protein + Term 65029 - 65062 4.5 + Prom 65012 - 65071 1.6 84 31 Op 1 . + CDS 65093 - 67084 1191 ## BVU_2822 hypothetical protein 85 31 Op 2 . + CDS 67089 - 68402 559 ## BVU_2821 hypothetical protein 86 31 Op 3 . + CDS 68414 - 74551 3851 ## gi|253567997|ref|ZP_04845408.1| conserved hypothetical protein 87 31 Op 4 . + CDS 74554 - 75342 610 ## gi|253567998|ref|ZP_04845409.1| conserved hypothetical protein 88 31 Op 5 . + CDS 75368 - 76177 436 ## gi|253567999|ref|ZP_04845410.1| conserved hypothetical protein + Term 76206 - 76242 1.6 + Prom 76201 - 76260 10.4 89 32 Op 1 . + CDS 76324 - 76731 237 ## gi|253568000|ref|ZP_04845411.1| conserved hypothetical protein 90 32 Op 2 . + CDS 76746 - 78167 589 ## COG3344 Retron-type reverse transcriptase 91 32 Op 3 . + CDS 78130 - 78408 246 ## gi|253568002|ref|ZP_04845413.1| conserved hypothetical protein + Term 78433 - 78465 4.0 + Prom 78434 - 78493 8.2 92 33 Op 1 . + CDS 78607 - 78900 195 ## gi|295084465|emb|CBK65988.1| hypothetical protein 93 33 Op 2 . + CDS 78909 - 79424 348 ## gi|262407839|ref|ZP_06084387.1| conserved hypothetical protein 94 33 Op 3 . 
+ CDS 79408 - 79603 162 ## BT_4443 N-acetylmuramoyl alanine amidase Predicted protein(s) >gi|226332296|gb|ACIC01000024.1| GENE 1 16 - 420 360 134 aa, chain + ## HITS:1 COG:no KEGG:BF1816 NR:ns ## KEGG: BF1816 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 133 14 146 148 160 64.0 1e-38 MAVLTLSLTSCEVEIDSFYDDDNIGGGYYNRSSDLCSRTWVSFYRDVDGNRCRQELDFYL DRTGVDFIRVEYPNGHVETFEYYFRWNWENYAQTSIRMDYGRNDVSYLDDVYIGGNRLSG YLDGRNNFVEYTGR >gi|226332296|gb|ACIC01000024.1| GENE 2 736 - 2403 1856 555 aa, chain + ## HITS:1 COG:PM1071 KEGG:ns NR:ns ## COG: PM1071 COG2985 # Protein_GI_number: 15602936 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Pasteurella multocida # 17 547 8 543 553 337 38.0 5e-92 MEWLYNLFLEHSALQAVVVLSLISAIGLGLGRVHFWGVSLGVTFVFFAGILAGHFGLSVD PQMLNYAESFGLVIFVYSLGLQVGPGFFSSFRKGGVTLNMLALAVVLLGTLLTVVASYAT GVSLPDMVGILCGATTNTPALGAAQQTLKQMGIESSTPALGCAVAYPMGVIGVILAVLLI RKFLVHKEDLEIKEKDDANKTFIAAFQVHNPAIFNKSIKDIAQMSYPKFVISRLWRDGHV SIPTSDKVLKEGDRLLVITAEKNVLALTVLFGEQEENTDWNKEDIDWNAIDSELISQRIV VTRPELNGKKLGSLRLRNHYGINISRVYRSGVQLLATPELILQLGDRLTVVGEAAAIQNV EKVLGNAVKSLKEPNLVVIFIGIVLGLALGAIPFSIPGISTPVKLGLAGGPIIVGILLGT FGPRIHMITYTTRSANLMLRALGLSMYLACLGLDAGAHFFDTVFRPEGLLWIALGAGLTI IPTVLVGFVAFKIMKIDFGSVSGMLCGSMANPMALNYANDTIPGDNPSVAYATVYPLCMF LRVIIAQVLLMFLLG >gi|226332296|gb|ACIC01000024.1| GENE 3 2518 - 4512 2024 664 aa, chain + ## HITS:1 COG:CAC1572 KEGG:ns NR:ns ## COG: CAC1572 COG3855 # Protein_GI_number: 15894850 # Func_class: G Carbohydrate transport and metabolism # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 664 1 663 665 796 60.0 0 MTAQSNITPESIVGDLRYLQLLSRSFPTIADASTEIINLEAILNLPKGTEHFLTDIHGEY EAFQHVLKNASGAVKRKVNEIFGNTLREAEKKELCTLIYYPEEKIQLVKAREKDLDDWYL ITLNQLVKVCQNVSSKYTRSKVRKSLPAEFSYIIQELLHETSVEPNKHAYINVIISTIIS TKRADDFIIAMCNLIQRLTIDSLHIVGDIYDRGPGAHIIMDTLCNYHNFDIQWGNHDILW MGAASGNDSCIANVIRMSMRYGNLGTLEDGYGINLLPLATFAMDTYADDPCTIFMPKMNF 
ADTNYNEKTLRLITQMHKAITIIQFKLEAEIIDRRPEFGMSNRKLLEKIDFERGVFVYEG KEYALRDTNFPTVDPADPYRLTDEERELVEKIHYSFMNSEKLKKHMRCLFTYGGMYLVSN SNLLYHASVPLNEDGSFKHVKIRGKEYWGRKLLDKADQLIRTAYFDEEGEDDKEFAMDYI WYMWCGPEAPLFDKDKMATFERYFLEDKEIQKEKKGYYYTLRNREDICDQILDEFGALGP HSHIINGHVPVKTIQGEQPMKANGKLFVIDGGFSKAYQPETGIAGYTLVYHSHGMQLVQH EPFQSRQKAIEEGLDIKSTNFVLEFNSQRMMVKDTDKGKELVTQIQDLKKLLVAYRIGLI KEKV >gi|226332296|gb|ACIC01000024.1| GENE 4 4836 - 5984 1121 382 aa, chain - ## HITS:1 COG:no KEGG:BT_1227 NR:ns ## KEGG: BT_1227 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 382 1 382 382 649 98.0 0 MKIIKTIFTAAVLMAAVCLPAQNKSAGINLSFWKDICTQPYDSTQTTYVNLGLLSTLNRL NGVGINALGSVIHGDMNGVQITGLANLAGGTMRGVQIAGVSNISGDNTVGLSTAGLVNIT GDGSKGVIISGLTGIGGDNNSGVMISGLMNVTGNMASGFQFSGVSNITGQSFNGMMTSGL LNVVGENMNGLQIAGIANITAKDLNGAQIGLCNYATKAHGFQIGLVNYYREDMKGLQLGL VNANPDTRIQMMVYGGNVTPANIGVRFKNQLFYTILGVGAFDQNLNDKFSISTSYRAGLA VPVYKGLSISGDLGFQHIETCSNKDEVIPRRLYGLQARANLEYQISKKFGIFATGGYGLA RYYNKSGNFDKGAIIEAGIVLF >gi|226332296|gb|ACIC01000024.1| GENE 5 6048 - 7706 1681 552 aa, chain - ## HITS:1 COG:aq_999_1 KEGG:ns NR:ns ## COG: aq_999_1 COG1022 # Protein_GI_number: 15606303 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Aquifex aeolicus # 25 546 14 499 600 219 29.0 1e-56 MEKSFIAYIENSIKNNWDLDALTDYKGATLQYKDVARKIEKLHIIFEESGIRKGDKIAVC GRNSSHWGVTFLATLTYGAVIVPILHEFKADNVHNIVNHSEAKLLFVGDMVWENLNESAM PLLEGILMMNDFTLLVSRSERLTHAREHLNEMFGKKYPKNFRKEHIEYHKDEPEELAVIN YTSGTTSYSKGVMLPYRSLWSNTKFAYEVLELKAGDKIVSMLPMAHMYGLAFEFLYEFSV GCHIYFLTRMPSPKIIFQAFEEVKPNLIVAVPLIIEKIIKKSVLPKLETPTMKILLKVPI INDKIKATIREEMIKAFGGNFREIIVGGAAFNQEVEQFLKMIDFPYTVGYGMTECGPIIC YEDWTRFKLGSCGKAAPRMDVRVLSSDPENIVGEIVCKGPNVMLGYYKNNEATEAVIDKD GWLHTGDLALMDAEGNITIKGRSKNMLLSASGQNIYPEEIEDKLNNLPYVSESIIVQQNE KLVGLVYPDFDEAFAHGLKNEDIERVMEENRVALNAMLPAYSQILKMKIYPEEFEKTPKK SIKRYLYQEAKG >gi|226332296|gb|ACIC01000024.1| GENE 6 8129 - 9199 1055 356 aa, chain - ## HITS:1 
COG:Cj1428c KEGG:ns NR:ns ## COG: Cj1428c COG0451 # Protein_GI_number: 15792746 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Campylobacter jejuni # 1 354 1 342 346 351 48.0 1e-96 MEKNAKIYVAGHHGLVGSAIWKNLQDKGYTNLIGRTHQELDLLDGVAVRRFFDEEQPEYV FLAAAFVGGIMANSIYRADFIYKNLQIQQNVIGESFRHNVRKLLFLGSTCIYPRDAEQPM KEEVLLTSPLEYTNEPYAIAKIAGLKMCESFNLQYGTNYIAVMPTNLYGPNDNFDLERSH VLPAMIRKIHLAHCLKEGDWEAVRKDMNQRPVEGVNGDSPKADILNILRKYGISETEVTL WGTGTPLREFLWSEEMADASVFVMEHVNFKDTYKEGDKDIRNCHINIGTGKEITIRQLAE QIVETVGYQGKLTFDSSKPDGTMRKLTDPSKLHSLGWHHKIEIEEGVRRMYEWYLK >gi|226332296|gb|ACIC01000024.1| GENE 7 9303 - 10373 1244 356 aa, chain - ## HITS:1 COG:BMEI1413 KEGG:ns NR:ns ## COG: BMEI1413 COG1089 # Protein_GI_number: 17987696 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: GDP-D-mannose dehydratase # Organism: Brucella melitensis # 2 350 3 346 362 479 67.0 1e-135 MKTALISGITGQDGSYLAEFLLQKGYEVHGILRRSSSFNTGRIEHLYFDEWVRDMKQKRT INLHYGDMTDSSSLIRIIQQVQPDEIYNLAAQSHVKVSFDVPEYTAEADAIGTLRMLEAV RILGLEKKTRIYQASTSELFGKVQEVPQKETTPFYPRSPYGVAKQYGFWITKNYRESYGM FAVNGILFNHESERRGETFVTRKISLAAARIAQGEQDKLYLGNLDARRDWGYAKDYVECM WLILQHDVPEDFVIATGEMHTVREFATLAFKEAGIELRWEGEGVDEKGIDVATGKVLVEV DPKYFRPAEVEQLLGDPTKARTLLGWNPRKTSFGELVSIMVRHDMEKVKRMIATKH >gi|226332296|gb|ACIC01000024.1| GENE 8 10450 - 11619 1188 389 aa, chain - ## HITS:1 COG:Cgl2969 KEGG:ns NR:ns ## COG: Cgl2969 COG1301 # Protein_GI_number: 19554219 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Corynebacterium glutamicum # 8 381 5 379 412 324 52.0 2e-88 MKKIKIGLLPRIIIAIILGIAIGTIFPAPLVRIFVTFNGIFSEFLNFSIPLIIVGLVTVA IADIGKGAGKMLLVTALIAYGATLFSGFLSYFTGVTVFPSLIEVGAPLEEVSEAHGILPY FSVSIPPLMNVMTALILAFTLGLGLAALHSDALKSVARDFQEIIVRMISAVILPLLPLYI FGIFLNMTHSGQVYAILMVFIKIIGVIFLLHIFLLVFQYSIAALFVRKNPFRLLGRMMPA YFTALGTQSSAATIPVTLEQTKKNGVSADIAGFVIPLCATIHLSGSTLKIVACALALMMM 
QGMPFDFPLFAGFIFMLGITMVAAPGVPGGAIMAALGILQSMLGFDESAQALMIALYIAM DSFGTACNVTGDGAIALIIDKVMGKGTGK >gi|226332296|gb|ACIC01000024.1| GENE 9 11729 - 13204 1601 491 aa, chain + ## HITS:1 COG:TP0331 KEGG:ns NR:ns ## COG: TP0331 COG0362 # Protein_GI_number: 15639322 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconate dehydrogenase # Organism: Treponema pallidum # 8 491 4 488 488 576 56.0 1e-164 MANQNKTDIGLIGLAVMGENLALNMESKGWHVSVYNRTVPGVEEGVVDRFMNGRAKGKNI EGFTDIKAFVESIATPRKIMMMVRAGSPVDELMDQLFPLLSPGDILIDGGNSNYEDTNRR VKLAESKGFLFVGSGVSGGEEGALNGASIMPGGSEKAWPEVKPVLQSIAAKAPDGTPCCQ WVGPAGSGHFVKMIHNGIEYGDMQLIAEAYWVMKNLIDLTNEEMADVFARWNEGKLRSYL IEITSNILRHKDKSGGYLIDKILDAAGQKGTGKWSVINAMELGMPLGLIATAVFERSLSA QKELRHLASKQFLCKHTLPVYNKAELVKEIFSALYASKLVSYAQGFAVLQRASDAFGWNL DLASIARMWRGGCIIRSIFLNDIAAAFEAADKPKHLLLAPYFREEIKSLLPGWKNLVAEA MKEELPVPAFSSALNYFYSLTSDNLPANLVQAQRDYFGAHTFERKDELRGQFFHENWTGH GGETKSGTYNV >gi|226332296|gb|ACIC01000024.1| GENE 10 13549 - 15045 1171 498 aa, chain + ## HITS:1 COG:VCA0896 KEGG:ns NR:ns ## COG: VCA0896 COG0364 # Protein_GI_number: 15601650 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate 1-dehydrogenase # Organism: Vibrio cholerae # 5 498 9 501 501 564 55.0 1e-160 MDKFAMIIFGASGDLTKRKLMPALYSLYRDKRLTGSFTVLGIGRTVYSDEDYRSYILGEL QQFVKAEEQNLELMSSFVSHLYYLPMDPAKVEGYSQLRERLVELTKEVDPDNLLFYLATP PSLYGVVPLHLKAAGLNTPHSRIIVEKPFGYDLESALELNKIYSSVFDEHQIYRIDHFLG KETAQNVLAFRFANGIFEPLWNRNYIDYVEITAVENLGIEQRGGFYETAGALRDMVQNHL IQLVALTAMEPPAVFNADNFRNEVVKVYESLTPLTETDLNEHIVRGQYTASGNKKGYREE KGVAPDSRTETYIAMKLGISNWRWSGVPFYIRTGKQMPTKVTEIVVHFRETPHQMFRCSG GNCPRANKLILRLQPNEGIVLKIGMKVPGAGFEVRQVTMDFSYAQLGGVPSGDAYARLID DCIQGDPTLFTRSDAVEASWNFFDPVLRYWKDNPDAPLYGYPAGTWGPLESEAMMHEHGA DWTNPCKNLTNTDQYCEL >gi|226332296|gb|ACIC01000024.1| GENE 11 15042 - 15758 195 238 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163781723|ref|ZP_02176723.1| 50S ribosomal protein L13 [Hydrogenivirga sp. 
128-5-R1-1] # 34 228 37 212 228 79 33 4e-14 MKLSVFPSSMETARSLIFHLVDIMNAEPDKTFNIAVSGGSTPALMFDLWANEYADITPWK RMKLYWVDERCVPPEDSDSNYGMMRSLLLGIVPIPYENVYRIRGEEKPAKEAARYSELVS RQVPKKNGWPEFDIVLLGAGDDGHTSSIFPGQEALLSSDQIYVTSTHPRNGQKRIAMTGF PILTARLVIFLITGKAKAEVVEEICHSGDTGPAAYIAHHAENVELFMDAGAAMYVRND >gi|226332296|gb|ACIC01000024.1| GENE 12 15943 - 16893 868 316 aa, chain + ## HITS:1 COG:MT0820 KEGG:ns NR:ns ## COG: MT0820 COG2837 # Protein_GI_number: 15840211 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted iron-dependent peroxidase # Organism: Mycobacterium tuberculosis CDC1551 # 13 311 8 306 335 290 49.0 3e-78 MNPFQNSFGGHIPQDVAGKQGENVIFIVYNLTDSPDMVDKVKDVCANFSAMIRSMRNRFP DMQFSCTMGFGADAWTRLFPDKGKPKELSTFSEIKGEKYTAVSTPGDLLFHIRAKQMGLC FEFASILDEKLKGAVVSVDETHGFRYMDGKAIIGFVDGTENPAVDENPYHFAVIGEEDAD FAGGSYVFVQKYIHDMVAWNALPVEQQEKVIGRHKFNDVELSDEEKPENAHNAVTNIGDD LKIVRANMPFANTSKGEYGTYFIGYASTFSTTRRMLENMFIGSPAGNTDRLLDFSTAITG TLFFVPSYDLLGELGE >gi|226332296|gb|ACIC01000024.1| GENE 13 17179 - 17877 534 232 aa, chain - ## HITS:1 COG:BS_yoqW KEGG:ns NR:ns ## COG: BS_yoqW COG2135 # Protein_GI_number: 16079108 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 23 230 13 222 224 127 36.0 2e-29 MCFHNSMSAKAIKVAARYGRQSDVVEIYQSILDEQYHVNAFTFPRYPIITSSDEVQVFNW GLIPFWVRSEEDATEIRKMTLNARADTIFEKPSFREPIMKKRCIVPSTGYFEWRHEGANK IPYYIYVKDEPIFSMAGIYDRWLDKDTGEEHETFSIITTDTNSLTDYIDNTKHRMPAILT QEEEEKWLNPSLSKAEIASLLKPFDTEKMDAYVIRNDFLKKSPNDPTIVQRA >gi|226332296|gb|ACIC01000024.1| GENE 14 18349 - 18687 187 112 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160885865|ref|ZP_02066868.1| ## NR: gi|160885865|ref|ZP_02066868.1| hypothetical protein BACOVA_03869 [Bacteroides ovatus ATCC 8483] # 1 112 1 112 112 185 100.0 6e-46 MKSLLKNVLRRISKKQSSKEDNATAFYPQCCAKVDDSARMRIKMSYDQNVKETISSLKTL ANDMSSGFVTFKKFQTRRYQYNPDADATLYASRLLRAASILEFLLTDPDNKS >gi|226332296|gb|ACIC01000024.1| GENE 15 18684 - 19151 374 155 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|253567929|ref|ZP_04845340.1| ## NR: gi|253567929|ref|ZP_04845340.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 155 1 155 155 277 100.0 1e-73 MRPIRTVPSKDEREYPLVITAEEKDKVLNYILVVANGKRTAKLNYKDIPDLRISKEQYEI VLEEFKKRRFIDYKGYGIEYLTLNFEIFNFAEKGGFTVERDLYILSFDTFQMQLERLEKE LSPDTAAKVDDVVGKAKNITELLIGLSALAEKMNL >gi|226332296|gb|ACIC01000024.1| GENE 16 19171 - 19557 146 128 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253567930|ref|ZP_04845341.1| ## NR: gi|253567930|ref|ZP_04845341.1| predicted protein [Bacteroides sp. 1_1_6] # 1 128 1 128 128 234 100.0 2e-60 MKKKILILSFLFVLIFNSCSDDSINLAGTTWTSVKDWYGKTRLSFEEGTPYLRSFFAISF DLKSFTIYNVADDNEDLEYEWKETVSGKYSINDNIVNLIVEKDNLTIPCEIEKDIMYYSN TRMKLYKQ >gi|226332296|gb|ACIC01000024.1| GENE 17 19570 - 19854 201 94 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253567931|ref|ZP_04845342.1| ## NR: gi|253567931|ref|ZP_04845342.1| hypothetical protein BSIG_00357 [Bacteroides sp. 1_1_6] # 1 94 39 132 132 156 100.0 5e-37 MFQRGTEPSAKVLTSILLTYEDISAEWLLRGKGQMLLSEVTPDPNIEQMKRLVDTITTLQ GIITEQTKTNQLLTEELKKAKGELTMLKNERNVG >gi|226332296|gb|ACIC01000024.1| GENE 18 20128 - 20421 277 97 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716168|ref|ZP_04546649.1| ## NR: gi|237716168|ref|ZP_04546649.1| conserved hypothetical protein [Bacteroides sp. 
D1] # 1 97 1 97 97 155 100.0 5e-37 MSYNLSQIMKSAHRNYKKGGKTFSECLKSAWSFAKLQESFSPEAVKSRTDKFLAERHEAM SKTAKATPSKEYNNLNIPASAYYNPNSTHYGAHYVGD >gi|226332296|gb|ACIC01000024.1| GENE 19 20437 - 20673 157 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160885871|ref|ZP_02066874.1| ## NR: gi|160885871|ref|ZP_02066874.1| hypothetical protein BACOVA_03875 [Bacteroides ovatus ATCC 8483] # 1 78 1 78 78 105 100.0 1e-21 MDKRTELEIQRDKYEAVIEERDALISSLRGENEKLKRDLESERGFYREKVSQCDDLKKFI ESQRNLMDIVLKNNQSIL >gi|226332296|gb|ACIC01000024.1| GENE 20 21108 - 21314 232 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160885873|ref|ZP_02066876.1| ## NR: gi|160885873|ref|ZP_02066876.1| hypothetical protein BACOVA_03878 [Bacteroides ovatus ATCC 8483] # 1 68 9 76 76 122 100.0 9e-27 MEKDIQRRNVIDVLRSMDVGAIEVFPIVQKPSVTNTLNARLYKEKAEGMAWKTKSDVKNM QFIVTRIA >gi|226332296|gb|ACIC01000024.1| GENE 21 21332 - 21589 110 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170657|ref|ZP_05757069.1| ## NR: gi|260170657|ref|ZP_05757069.1| hypothetical protein BacD2_02235 [Bacteroides sp. 
D2] # 1 75 5 79 89 153 100.0 3e-36 MIRGEMAEILLDNILRLFSTETFGKDKSAYYVGGEKKLMNLIEAGKIESDKPTNVQNGKW HCNAAQVLLHCRCAGRKVKSKKRKK >gi|226332296|gb|ACIC01000024.1| GENE 22 21586 - 21777 163 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160885875|ref|ZP_02066878.1| ## NR: gi|160885875|ref|ZP_02066878.1| hypothetical protein BACOVA_03880 [Bacteroides ovatus ATCC 8483] # 1 63 1 63 63 98 100.0 2e-19 MKKIKVIQYAMMFIALWTTLYLIDSIEVSKKEFIAAFVLVTVVSVNYICFRYYEDRKQNK DSL >gi|226332296|gb|ACIC01000024.1| GENE 23 22815 - 23597 497 260 aa, chain + ## HITS:1 COG:CAC1936 KEGG:ns NR:ns ## COG: CAC1936 COG4712 # Protein_GI_number: 15895209 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 20 187 3 169 229 201 61.0 1e-51 MTARKNTVSTVQNEEKKKNSIRPLLASEIECRVGTMKPDGSGCSLLLYKDARVDMRILDE VFGEMNWKRHHDVVNGNLFCTLSIWDNEKKEWVSKQDVGTESSTEKEKGQASDAFKRAGF NWGIGRELYTGPFIWVPLEKNEVYQSKTGSPALYTKFSVKEIGYNEQKEIILLVIVDNKN RVRFAYGNTKEKVYAPNVSASNASGKVYTGVDLDRAIKQMTGVKSREELERVWAEHPELH NNKEFRNITIDMQKTYPPRN >gi|226332296|gb|ACIC01000024.1| GENE 24 23603 - 24517 443 304 aa, chain + ## HITS:1 COG:no KEGG:BVU_2860 NR:ns ## KEGG: BVU_2860 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 303 1 301 306 395 66.0 1e-108 MIELVKSSVVFNEENHTYMLGEKQLQGITGMISRQLFPDKYKDVPDFVLKRAAEKGSLIH AQCQFADATGLPPESIEAENYIRMRVNAGYKALANEYTVSDNEYFASNIDCVWEKVGRIS LVDIKTTLHLDKEYLSWQLSIYAYLFELQNPLLKVDKLFGIWVRGDKHELVVIPRKPDKE VKKLMECEKKGEQYLSDLPVPAPDDDKLLIPMQLVNTIIGIEEELADLTKIQKDYKAKLK TAMRENGVKSWDAGRLRVSYTPASTSDNFDTKKFQADHPELYSKYIKTVPKADSIRVTIR EDKS >gi|226332296|gb|ACIC01000024.1| GENE 25 24514 - 24963 69 149 aa, chain + ## HITS:1 COG:XFa0061 KEGG:ns NR:ns ## COG: XFa0061 COG0629 # Protein_GI_number: 10956771 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Xylella fastidiosa 9a5c # 2 111 3 111 136 102 44.0 2e-22 
MSLNKLMLIGHVGKDPDIRILEAGSKVVTFSFATTEKGYTLANGTQVPERTEWHNIVVWR GLADVVEKYVHKGDKLYLEGKIRTRSYDDSRGIKRYITELFVDNMEMLSVKPQQAPPPPP LPEHTNNQTRSAVNECPPPLPPTKDDLPF >gi|226332296|gb|ACIC01000024.1| GENE 26 25114 - 25422 202 102 aa, chain + ## HITS:1 COG:no KEGG:BVU_2858 NR:ns ## KEGG: BVU_2858 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 102 49 150 150 169 74.0 3e-41 MWKWFQCIGACLREYTGEEYWSTAAGVQDIHDLYCKKFLVKQVHVNGKVETIVRGTSKLN TLEMHNFMESVKIDAATEFGITLPLPEDQHYLDFIHEYQNRY >gi|226332296|gb|ACIC01000024.1| GENE 27 25441 - 26109 709 222 aa, chain + ## HITS:1 COG:no KEGG:BVU_2857 NR:ns ## KEGG: BVU_2857 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 221 1 221 233 175 49.0 9e-43 MIANLRNYEPETIEFVVPDSIREKFPPVLFQGSTNVDELIKLVNEHFNATFPESEVTQRL LDEFEISEIREEYCIKQENEVPKRERELLEAIERAKKIKSDAQDRLASIKTEIKDLAAEV KKGTREYHLSSKNTIRFALDGYFLYYSWVNGEFKLVKAEKIPDWDKRSLWAQEDRNRKAM LDLFGIEYPEVERPIDDTEDYGDKFEEDLSDKLPEEEPEDDE >gi|226332296|gb|ACIC01000024.1| GENE 28 26042 - 26377 90 111 aa, chain + ## HITS:1 COG:no KEGG:BVU_2856 NR:ns ## KEGG: BVU_2856 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 21 111 1 91 91 112 64.0 3e-24 MGTSSKKTCPINFLKKNRKTMSRLQHKKGRKSNYVKRLVNNPDWEEAKRKVRIRDGHKCQ MCGKDFNLEIHHKTYRVNGKSIVGHELEHLDCLVTLCGDCHSKVHKYHIKL >gi|226332296|gb|ACIC01000024.1| GENE 29 26374 - 27405 548 343 aa, chain + ## HITS:1 COG:VC1636_1 KEGG:ns NR:ns ## COG: VC1636_1 COG1061 # Protein_GI_number: 15641641 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA or RNA helicases of superfamily II # Organism: Vibrio cholerae # 3 341 65 410 420 201 37.0 2e-51 MTYQLRDYQKSASDAAVSVFKSKEKKNYVIVLPTGAGKSLVIANIAARIDGPLIVFQPSK EILEQNFAKLQSYGIFDCGVYSASAGRKDINRITFAMIGSVMKHMSFFKHFKHVLIDECH LVNPEKGMYKEFFEDEQRKVIGLTATPYRLCSGRGGAMLKFITRTRPKVFTDVIYHCQVS ELLAKGFLASLKYYDITKLDLSRVRTNSTGADYDEKSLLQEFERVDIYKDIVGWTKRLLN 
PKSGIPRKGILIFTRFIREAEKLASEIPNCAIVSGSTPKEERARILKGFKDGRIKVVANV GVLTTGFDYPELDTIVLARPTKSLSLYYQMVGRVIRPCQGKEG >gi|226332296|gb|ACIC01000024.1| GENE 30 27537 - 28034 220 165 aa, chain + ## HITS:1 COG:no KEGG:BDI_0857 NR:ns ## KEGG: BDI_0857 # Name: not_defined # Def: putative recombination protein # Organism: P.distasonis # Pathway: not_defined # 14 147 17 150 157 194 70.0 1e-48 MWRNYKKKEKKKPLFEVEGVKVKKKPDLVDKLDRIFSLFIRYRDTMPNGYFQCISCGKIK PFNKADCGHYINRQHMSTRFDEMNCNAQCSHCNRFMEGNIQDYRRRLVAKYGERNVLILE AKKNVTKQFSDFQLEKLITHYKEEDEKTEGSKRSVSFITNRSIIP >gi|226332296|gb|ACIC01000024.1| GENE 31 28038 - 28850 339 270 aa, chain + ## HITS:1 COG:no KEGG:Coch_0881 NR:ns ## KEGG: Coch_0881 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 1 109 1 110 298 105 50.0 2e-21 MERNSFIFYKGWREAIKDLPDDVRLEIYESIIEYATTGNLRGLKPMANIAFNFIKIDIDR DTEKYMSIVERNKSNGSKGGRPKSENPKEPKEPTKLTGLFGNPKEPTKPDNDNEYDNDYV DDNDSHLKKKETSPKGESKKDELSLFPEEKIDWGGLMDYFNSTFKGKLPAIKSIDAKRKK AIKARVAQYGKQAIFDVFQLVLDSPFLLGQNDKNWRCTFDWIFLPTKFTNILEGNYNGKR TDTAATRRESVSSLTDLAEELLQSSMSKEG >gi|226332296|gb|ACIC01000024.1| GENE 32 28747 - 29394 455 215 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567946|ref|ZP_04845357.1| ## NR: gi|253567946|ref|ZP_04845357.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 60 215 1 156 156 295 99.0 1e-78 MENELILRPQEENRLAVLRTSPKNYCKALCPKKVEDVFQSDEPSIGTIIRKFGEPQARAV LVILIADALEFFNVSNTMSATQVATTVDLIIEEYPYMKTDDFKLCFKNAMKMKYGENYNR IDGSIIMGWLREYNKERCAVADNQSWNTHKAKLSGETSFTSGLSYEEYRNELKLRVEQGD EEAAKALSLSNEIISYLNKRENGKQEAEGDNLLEH >gi|226332296|gb|ACIC01000024.1| GENE 33 29351 - 29572 220 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260170645|ref|ZP_05757057.1| ## NR: gi|260170645|ref|ZP_05757057.1| hypothetical protein BacD2_02175 [Bacteroides sp. 
D2] # 1 73 1 73 73 125 100.0 6e-28 MANKKQKVTIYWNTRHIKLEDIPEVKRRIRERFGIPNHTTVNGETDCYIREEDMELLRET EKRGFIQIRNKPA >gi|226332296|gb|ACIC01000024.1| GENE 34 29674 - 31143 969 489 aa, chain + ## HITS:1 COG:SMc02801 KEGG:ns NR:ns ## COG: SMc02801 COG1475 # Protein_GI_number: 15967087 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Sinorhizobium meliloti # 4 197 39 198 296 103 34.0 8e-22 MEVQNIRIDLISPSPLNPRKTFDEAALEELASNIEKQGLLQPITVRVAKSEEMTNLETGD VTPLPYTYEIVCGERRFRAVSLLKAKEDEANVAKIKAHRKKSEKFQTISCIVREMTDDEA FEAMITENLQRKDVDPIEEAFAFAQLAEKGRTLEDIALKIGKSTRFVFDRIKLNSLIPEL KERVRNGDIPLSGAMILSKLDEDTQKEFHEEEEEQCTTAMIREFVSNSFMELGNAPWIKD DSDNWENTDIKSCSQCENNTCNHGCLFYEMNSKDARCINAACYEKKQIAYVTRKIQLEYE HLVKVGEPLSFGKTIIIARRPDTYWGEDRKVFYEKTLEAVKQLGFEIVDPDEIFRCKCWY SEDDERTLKMLEDGEVYRCLSFFGHYSPEFNVSFYYVRKETASSTSAVADLKEIEREKIN AQLKRAKDIVKEKSAEEMRKWAQEKTYYQRTKEFSENEQLVFDVLVLSGCSSTYLEKLNL KNGMVRVIL >gi|226332296|gb|ACIC01000024.1| GENE 35 31419 - 32183 399 254 aa, chain + ## HITS:1 COG:no KEGG:PA14_58970 NR:ns ## KEGG: PA14_58970 # Name: not_defined # Def: hypothetical protein # Organism: P.aeruginosa_PA14 # Pathway: not_defined # 32 253 6 234 235 108 34.0 2e-22 MKTWTDEQLAILDSEYPTADLKELARRLDKTLSAVKTKALIRKLRRSPRISFWNSERLDK LKKLYPNHTNEEIAQILGITYSAVNGIAFKLRLFKSKEFKFQCASKSFFPKGHQPMNKGR KQTEYMSEEQLAKTKATQFKKGHIPKNHKPVGYERITRDGYIEVKTAEPNVFELKHRLVW IEHNGEIPPGYNIQFKDGNRQNVSIENLYMISRSEQLKKENSLYARYPEDVQYLIKLKGA LNRQINKATKKNES >gi|226332296|gb|ACIC01000024.1| GENE 36 32180 - 32620 331 146 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567951|ref|ZP_04845362.1| ## NR: gi|253567951|ref|ZP_04845362.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 146 1 146 146 259 100.0 5e-68 MTDGAIDRLKEMVNKPFLYQNEEVVILNYCDGTGDDGTEVEIYLNNGKVLVFSMFDLASK LNRFRPITNTVVVLANERLNKVSTVNPTILQDLRNLVLQQIKDVKEDPSKVSQAKQVFQG VNTVINLAKTELEYRKYLDTTDPQNK >gi|226332296|gb|ACIC01000024.1| GENE 37 32727 - 33020 170 97 aa, chain + ## HITS:1 COG:no KEGG:BVU_2843 NR:ns ## KEGG: BVU_2843 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 97 1 97 97 97 51.0 1e-19 MITLNRFAQRCLNIMRKRFKMNEHSSRKAFSIRIEAVWRKFDIASKYRSDNLPKYSEDEE LAAEMIIYLVAYLKRFGCEDIEQLIKDKIEFDDRKND >gi|226332296|gb|ACIC01000024.1| GENE 38 33081 - 33737 451 218 aa, chain + ## HITS:1 COG:no KEGG:BVU_2842 NR:ns ## KEGG: BVU_2842 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 217 1 216 217 369 85.0 1e-101 MTEIIQVCLLDFNKGQLTGLPKNPRFFRDYRFEAMKKSIQDSPEMLELRELIVFPYNDGR YIVVCGNLRLRACKELGYKELPCKILAPDTPVKKLREYATKDNVNFGENDLDVMENEWNK AELQDWGIEFAPEKKEDEFKERFDAITDDTAIYPLIPKYDEKHELFIITSSNEVDSNWLR ERLDMQHMKSYKTGKISKSNVIDIKDVRHALQDSNTKS >gi|226332296|gb|ACIC01000024.1| GENE 39 33706 - 34458 314 250 aa, chain + ## HITS:1 COG:no KEGG:BVU_2841 NR:ns ## KEGG: BVU_2841 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 250 1 250 250 476 88.0 1e-133 MPCKIVIPSHKRHDRVFAKKLVNDPIICVAESQADLYQQFNPECEIVTHPDDVMGLIPKR NWMAKHFGELFMLDDDVHACKPIYVEKGEPSRIKDKDKITNIIQSLFEMASMMDVHLFGF TARISPVMYDESAFLSLSKMITGCSYGVIYNKNTWWNEEIRLKEDFWISCYMKYKERKVL TDLRYNFEQKNTFVNAGGLASIRNQEEERKSILFIKKNFGDSILLKSATTNGKDKTKQLV QYNISCKFKF >gi|226332296|gb|ACIC01000024.1| GENE 40 34570 - 35145 263 191 aa, chain + ## HITS:1 COG:no KEGG:BVU_2840 NR:ns ## KEGG: BVU_2840 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 190 1 189 190 315 85.0 6e-85 MIIRTVCGYDFFEVSSAMQKAIRRADTGVAGFFALELWASGYRDYVWKRLFTISAEDCFG IITKEIEALWQGHELVNKTATEPKGRIFVSKAVILLCECRKNRDADHLQNFIYDRKDIDI EKWINDVRRYPIPIPDYTFDVHTRKGKKHGRTKEEFFREEYKALQPRVPGLFDDLVQSSQ PKLFNDETTAK 
>gi|226332296|gb|ACIC01000024.1| GENE 41 35194 - 35841 427 215 aa, chain + ## HITS:1 COG:no KEGG:HAPS_0636 NR:ns ## KEGG: HAPS_0636 # Name: not_defined # Def: hypothetical protein # Organism: H.parasuis # Pathway: not_defined # 1 213 1 213 213 189 50.0 4e-47 MNTYYKFAPNVFLAKCDEKHEKGETIEVTTKYGKENESIVFNLIFEKDGFYYYSIVRADG FNAQEWAKRRAERRRKWAASAVQRSNEYYNKSNKDKDFLSLGEPIKVGHHSEKRHRKAID DAWNNMGKSVQFDEKAAEHESKAEYWDKRANTINLSMPESIDFYEHKLEVAKEYHAGVKS GKYPRMHSYTLTYAKKDVNEAQKNYDLAVKLWGDV >gi|226332296|gb|ACIC01000024.1| GENE 42 35870 - 36334 201 154 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160885897|ref|ZP_02066900.1| ## NR: gi|160885897|ref|ZP_02066900.1| hypothetical protein BACOVA_03902 [Bacteroides ovatus ATCC 8483] # 1 154 1 154 154 327 100.0 1e-88 MRELSKETSLQRVMRASGRVPVQCSCSVCKQQCHTPCLGTPDDIERIIDAGYADRLALTN WAAGIFLGVINIAIPMIQPVAGKEYCAFFENGLCILHDKGLKPTEGRLSHHTVRKDNFNP AMSIAWNVAKEWLMPENEDVLSRVVNKFLNARKP >gi|226332296|gb|ACIC01000024.1| GENE 43 36679 - 37386 270 235 aa, chain + ## HITS:1 COG:no KEGG:Dalk_4616 NR:ns ## KEGG: Dalk_4616 # Name: not_defined # Def: hypothetical protein # Organism: D.alkenivorans # Pathway: not_defined # 16 233 1 219 227 126 35.0 6e-28 MKYYASVSFGKDSLAMLFMLIDKGYQLDEVVFYDTGMEFQAIYNTRDAVLPILKKLGIKY TELHPEQPFLWTMFERPVKKRGTNIIHKKGYSWCGGTCRWGTSEKLRALKAHTKDGIDYV GIAADEMHRFEKEKRPNRVLPLRDWGITEADALQYCYTKGFVWHEDGVRLYELLDRVSCW CCGNKNLKELKNMYLYLPWYWKKLKELQLNTDRPYRRNSGETIFDLEERFKREMQ >gi|226332296|gb|ACIC01000024.1| GENE 44 37398 - 37949 410 183 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567960|ref|ZP_04845371.1| ## NR: gi|253567960|ref|ZP_04845371.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 183 1 183 183 344 100.0 1e-93 MIPLCINGKDYYDREEALAAWFEEWLMKQDFEQDLIDRELELEYRKTHLDWNTPYVMYGV RKKHKCIQKNEIAVFYDLLPRQKRARTAETHWYKVLYKRKATPEEVESLKAGEYTRRYLV YSLFIEKKMTLDKALSLIVADDKLLGIADNTISEIVTAFETFFNRKFRIYKPEFTTQLNL FTD >gi|226332296|gb|ACIC01000024.1| GENE 45 37951 - 38193 180 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|295084508|emb|CBK66031.1| ## NR: gi|295084508|emb|CBK66031.1| hypothetical protein [Bacteroides xylanisolvens XB1A] # 1 80 1 80 80 143 100.0 4e-33 MKTTIISCVILFVFLLYVGHFSITIKPFTVQLPYWHRSLGLFLLILSFIVYNAGEHAKGY LDGLKEGERIIFDLLKKKTG >gi|226332296|gb|ACIC01000024.1| GENE 46 38282 - 38527 193 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567961|ref|ZP_04845372.1| ## NR: gi|253567961|ref|ZP_04845372.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 81 1 81 81 151 100.0 1e-35 MKRNEKIEKLERLGIFNQWKYNTERANETFNIECPDFSMTNEERMNNLLDVDCCFHRFLA ISFPFYNTPEGAVFWENIAKK >gi|226332296|gb|ACIC01000024.1| GENE 47 38551 - 39057 361 168 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567962|ref|ZP_04845373.1| ## NR: gi|253567962|ref|ZP_04845373.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 168 1 168 168 314 100.0 2e-84 MSKKDLIEQNITRVQEYVRELIEDAKWNNGVSETLESTSIIVGNSDDIYDFAILFASNSE CVYCEFINGKIEYIDCELDCEICQFEGRLIFQYINGSFHNPTGQIIELSKLLMKGELKDT KSIFCSMVLRLMDTEEYSNNYCKSLDLVLRLFPEIDGELLEKELDRYI >gi|226332296|gb|ACIC01000024.1| GENE 48 39069 - 39677 301 202 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|295084505|emb|CBK66028.1| ## NR: gi|295084505|emb|CBK66028.1| hypothetical protein [Bacteroides xylanisolvens XB1A] # 1 202 1 202 202 391 100.0 1e-107 MSKMNLNELRDKAYKTACEHGFHDQELSNNHFLCLVISELMEAVEADRKGRRANVDRYNK KIANSRICQGLDSDIPKERGYEVAYNETIKGSIEEELADAVIRLLDLAGLRGINLELANG DIDDCIEDMAEACKDETFTESIYSISTLPVRYDGIFDLPTAVNDMILSIFGLAKHLDIDL LWHIEQKMKYNELREKMHGKKY >gi|226332296|gb|ACIC01000024.1| GENE 49 39694 - 40800 691 368 aa, chain + ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 131 368 26 265 265 125 34.0 2e-28 MDDKRKQILVDYISYLYTTGRSYDSIGKYIKYVTDFLENSEEINRRGYLKYKHKNADVMV RHSFMCAAVCDLLSYLKIGYGRREKAVKPLEKLEVISEKNKKLLNDFIIWLTDNNDYSLH TVDIYHTSLKQYFEYANELNMDNCRRFIKSLEEEKLSPATIRLRITAIEKFSKWVKKPIE LKRPKMKRKLDVNNVPTEEEYNRLLEYLKTKLNKDYYFFIKVLGTTGARLSEFQQFTWED IAIGEVVLKGKGNKYRRFFFQKQLQREVKDYIKETGKSGTLAVGRFGPLTQRGLSQHLKV WGKHCGIDSKKMHAHAFRHFFAKMFLKKTKDVIQLADLLGHGSVDTTRIYLQKSYDEQQR DFNKNVTW >gi|226332296|gb|ACIC01000024.1| GENE 50 40757 - 41749 633 330 aa, chain + ## HITS:1 COG:no KEGG:BVU_2845 NR:ns ## KEGG: BVU_2845 # Name: not_defined # Def: putative type I restriction-modification system methyltransferase subunit # Organism: B.vulgatus # Pathway: not_defined # 1 330 1 330 330 500 73.0 1e-140 MMNNKETLIKTLRGSVAQLNELSDMTEGIDVYDAAGYVDTEFLMEALSCVNTFMDASNMV ITKISSLLAPDAPVDERKSQADEGKKWNVEEILKHCTLEDSVLKLPKVQFNKKSYAEAKK WIEEAGGSWQGGKIQGFTFPFNPERVFSILKEGKRCDLQKDFQFFETPADIADWLVMLAG GINEVDTVLEPSAGRGALIKAIHRSCPSVTVECYELMPENREFLHTLDNVILLDEDFTKD SVGHYTKIIANPPFSGNQDIDHVRLMYERLEEGGTLAAITSRHWKFASEKKCVEFREWLE 
EVHGEVFEIGAGEFKESGTTVSTMAVVIKK >gi|226332296|gb|ACIC01000024.1| GENE 51 42067 - 42282 135 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567966|ref|ZP_04845377.1| ## NR: gi|253567966|ref|ZP_04845377.1| predicted protein [Bacteroides sp. 1_1_6] # 1 71 28 98 98 134 100.0 3e-30 MQKFLNSLDSYSAFRLLDEIGIDYYGEFHTPFLEVVGDAVVVHLDDKHEPKDENFIEITK KEAFDLLGLTK >gi|226332296|gb|ACIC01000024.1| GENE 52 42292 - 42465 75 57 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNQIEFKIKALPSIKTTGLERKWQRPTLPLLRSTIGATRLNFSVRNGKRWNPCAITT >gi|226332296|gb|ACIC01000024.1| GENE 53 42695 - 43237 271 180 aa, chain + ## HITS:1 COG:no KEGG:PRU_0933 NR:ns ## KEGG: PRU_0933 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 140 1 130 399 69 34.0 4e-11 MKAITIKQPWASLIVHGIKDIENRTWACPWKYIGHRVLIHASGKPVEMRNPNSVFTKAQW DSLPVEFQRKIICAEGIVNSAIIGSVEIIGCSINHPSKWAEKSDDSKGYYENPIYNWVLA NPILFPEPIPAKGKLSFWEYPNINSEDDICLCNLVVNERNQVVSYGEYDRCVYCGSKWSK >gi|226332296|gb|ACIC01000024.1| GENE 54 43268 - 44389 244 373 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253567968|ref|ZP_04845379.1| ## NR: gi|253567968|ref|ZP_04845379.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 373 1 373 373 699 100.0 0 MNKDRKRVLIIGNGFDLCLGRKTSYKDFCQSEFCPKDYPSPLIKHLNDKWNDNLDAVKWY DLENELYNYYIRIKNNNGQIIDLYNDKERNVLEQIQANGPVTDSYECIKSNVDIVNNLLK NGILILPRFSCYISFSHEDILNPPIERDQKALQLIKNGLIQYLIKVQQETINENSIAAIV ARAFMQNKSNDQIVIYSFNYTSFSEVAPNSSFAMEFNDTINYVHGCILDRNIILGTKDEK IIHNYDFIQKSFDSQYNPPAMVYDLMDADDITIFGHSLGINDSQYFKAFFERQSSSTNPQ KKNITIFTKDAKSEIEIKRSLQEMTNWNLTSLYGLNNLQIIKTDECVNNPTLLRKYIKMY VDNEEDIDSIIHI >gi|226332296|gb|ACIC01000024.1| GENE 55 44493 - 45281 288 262 aa, chain + ## HITS:1 COG:XF2279 KEGG:ns NR:ns ## COG: XF2279 COG0451 # Protein_GI_number: 15838870 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Xylella fastidiosa 9a5c # 4 214 22 296 342 92 28.0 1e-18 MRKIIVTGSEGFIGKALCRELTKRGVEVIGLDRKSGTEATKVCELLKNGGIDCVFHLAAQ TSVFNGNLEQIRKDNIDTFMRIADVCNQNRVKLVYASSSTANPENTTSMYGISKYFDEQY ASIYCKAATGCRLHNVYGPNPRKRTLLWFLMEKENVSLYNCGQNIRCFTYIDDVIEGLIY SVGCNRQLINICNVQPVTTMYFASLVKYYKPLEIELINEKRDFDNLEQSVNRDIYLVPLS YTSVEDGVKKIFDERKGKDMSY >gi|226332296|gb|ACIC01000024.1| GENE 56 45250 - 45813 173 187 aa, chain + ## HITS:1 COG:no KEGG:PRU_0949 NR:ns ## KEGG: PRU_0949 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 20 186 42 211 213 103 36.0 3e-21 MKGKGKICRIDDWDKPEAVKYKSWSHQERLCDLKEKVSLHKKGDIYYISQFTRSKTGTSF SEIKQSEELASFFAERACEFLHRFIVGGCEGWCIVTTPRRRHYEGFHFATSICTKIAGAV KIPFYENAIQCLTKDRLNPEFFLLRPIREKKIIVYDDILTTGSTLLATYELLKDREQLLF LIGINNN >gi|226332296|gb|ACIC01000024.1| GENE 57 45815 - 46360 467 181 aa, chain + ## HITS:1 COG:no KEGG:BVU_2837 NR:ns ## KEGG: BVU_2837 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 3 181 4 181 181 223 64.0 3e-57 MGKREEPLTFKQEKFCKYYVDTEGNASEAYRMSYNTSNMKPETIWSAASRLLANSKVSTR INEIKAQRAKESEVERKTVERVLMDIVLANPDDLHFVDPATGKTKMRTPSQLPKRARNAL KKIQNKRGEVTYEFNGKTEAARILGAWNGWEADKNVNIKGGDGNKVGELRIGFEDNENSE E >gi|226332296|gb|ACIC01000024.1| GENE 58 
46422 - 48437 890 671 aa, chain + ## HITS:1 COG:BS_yqaT KEGG:ns NR:ns ## COG: BS_yqaT COG1783 # Protein_GI_number: 16079672 # Func_class: R General function prediction only # Function: Phage terminase large subunit # Organism: Bacillus subtilis # 6 173 3 169 431 83 30.0 1e-15 MVINYKKLNPNGFYLLKYLNDETIRFIILYGGSSSGKSYSVAQTILIQTLQDGENTLVMR KVGASILKTIYEDYKVAAAGLGISHLFKFQQNTIKCLVNGAKIDFSGLDDPEKIKGISNY KRVQLEEWSEFEHPDFKQLRKRLRGKKGQQIICTFNPISESHWIKKEFIDKDKWHDVPMT VTIAGKELPKELTKVKSVKKNAPRQILNLRTKQIEEQAPNTVIIQSTYLNNFWVVGSPDG AYGFYDEQCVADFEYDRVHDPDYYNVYALGEWGVIRTGSEFFGSFNRGKHSGEHKYVPDL PIHISVDNNVLPYISVSYWQVDFTTGTKVWQFHETCAESPNNTVKKASKLVAKYLKSIQY SDRLYVHGDASTKAANSIDDEKRSWMDLFIDTLQKEGFEIEDKVGNKNPSVAMTGEFINA IFDCTVPGIEIYIDESCSVSIEDYMSVQKDANGAILKTKVKNKTTLQTYEEHGHLSDTFR YVVVDLCSEQYIEFSNRRKRNLYACNGTINFFNPDTECKYTKKILYVMPNVNGKFVLIQA FRCGNKWHIVDVVFMDTTSTEDIRSSILSHESDSCVIECTDAYFPFIRELRSSTSKEIRV MKEFPDVDKRIAATSDYVKNSILFSASKIESDTEYVAFMNNLMDYNKDSETKEASAVLSG LVQFVVKLGLN >gi|226332296|gb|ACIC01000024.1| GENE 59 48564 - 50009 549 481 aa, chain + ## HITS:1 COG:no KEGG:BVU_2835 NR:ns ## KEGG: BVU_2835 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 5 481 4 475 475 473 49.0 1e-131 MNIFFDNLFGKKSKTKGEVEIVTSSENKDIDTQSGKAEKWSVAYIEDLTSPIVAGSNYLT LFSTIPEVFFPIDYIASRIAGANFQLKKTKDDSIVWANKRMNGILSRPNCLMRWKELIYQ HHIYKLCTGNSFIRAAMPDVFSTAEKWRYCDNYWVLPSDKTIVEPVYGNMPLFGIAQTED IIRSYRLEYGWNGSLEIPPYQIWHDRDGSAEFYSGAMFLKSKSRLASQNKPMSNLIAVYE ARNVIYVKRGGLGFIVSKKTDATGSIALTDDEKEQLLKQNFEKYGVRKGQVPYGISDADI DFVRTNLSIAELQPFEETLADAINIAGAYGIPAVLVPRKDQSTFSNQATAEKSVYCSTVI PMAKQFCKDFTAFLGLEGGGYYLDCDFSDVDCLQEGLKESEDVKTNINKRCREQFSCGLI TLNDWRAQIGESMIENPLFDKLKFDMSDEELDKVNRVFNTKSGDEKDGRENQKPSVQDKG K >gi|226332296|gb|ACIC01000024.1| GENE 60 49963 - 51024 601 353 aa, chain + ## HITS:1 COG:no KEGG:BVU_2834 NR:ns ## KEGG: BVU_2834 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 329 3 336 351 322 52.0 1e-86 
MEEKIKSLQYKTKANDVDEKGIVTVAVNGIGVKDSQNDISMPGSFNKTLKENIGRMRWFL NHRTDQLLGVPLSGKETEGNLVMVGQLNLEKQIGRDTLADYKLFAENGRTLEHSIGVKAI KRDSVDPCKVLEWRMMEYSTLTSWGSNPQTFLVNIKSATADQVKEAVDFVRKAFLQHGYS DERLKGYDMELSLLLKSLNGGAVVSCPHCGHQFDYDAETEHTFAQQVLDYAADYQRWITQ DIVREEMEKLTPEIRIQVISLIDSVKSEKKEFTQKGLQDLMNYVRCPHCWGKVYRSNAIL QNTSEDTTGKNEPSVDTQEKNDGENGNDEVTSKAADNGTLFDFKSLNSCFENK >gi|226332296|gb|ACIC01000024.1| GENE 61 51042 - 52619 1406 525 aa, chain + ## HITS:1 COG:no KEGG:BVU_2833 NR:ns ## KEGG: BVU_2833 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 21 525 59 561 562 363 40.0 1e-98 MPIRKFTVSDFNLKTDGLPAEQKAFMENIVGMMCEVVNKSLEGIASPDEVSKQFDDINKL LKSYDNEKFQQLVKDNEELVAQVKTLGESIEKMKQKGLSMNAINKFDEKLNEMLDSEKFR DFAEGKTRKSGEFDGFSLKDVVSMTDNYTGDLLITQQQKRVVTQVANKKLHMRDVLTTLT ADPAYPQLAYAQVYAFNRNARFVTENGRLPESSIKVKEIQTGTKRLGTHIRISKRMLKSR VYIRSYILNMLPEAVWMAEDWNILFGDGNGENLLGIINNTGVTSVEKIISTAIVTGAAGA VKAITGYNGDKDVIVEFAEPQDLILDGMSITFAGAAVLTELNKTHALVKMEDGRILIPGV AFSGAETATDKMTFSVHEAGFKNIEEPNSEDVVKTAFAAMTYAQYFPNAIILNPMTVNGM ESEKDTTGRNLGIVKMVDGVKYIAGRPIIEYGGILPGKYLLGDFNQAANLVDYTTLTLEW AEDVETKLCNEVVLMAQEEVIFPIYMPWAFAYGDLAALKTAITKA >gi|226332296|gb|ACIC01000024.1| GENE 62 52624 - 53046 471 140 aa, chain + ## HITS:1 COG:no KEGG:BVU_2832 NR:ns ## KEGG: BVU_2832 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 129 1 116 140 62 37.0 4e-09 MDYILRGNDKDVTNVLKEQRIRINRGMIQLIPISECGLVTEEDARKTLECMLAEKNEEIG RLTASIAEKDKTIVELTEERETMKARIAELEVQVPSDEKNLPVADSKDLQEEDAKEVTVT DDKAVSVEDEKKTGKGKTSK >gi|226332296|gb|ACIC01000024.1| GENE 63 53055 - 53597 337 180 aa, chain + ## HITS:1 COG:no KEGG:BVU_2831 NR:ns ## KEGG: BVU_2831 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 180 3 173 173 217 57.0 2e-55 MLIDVSYFMSGPRHIENVSVAEMPSPQSLAVNEVINGYIKAFQPEFLRNVVGVTLSQAIT DYLELIEREKEDSSNEVDISEEKEEPQSGYAILCEKLCEPFADYVFYHILRDANTQATIT GLVRLKCANEYVAPLKRQVSTWNSMVEKNKQFVEWAMSNDCPFDVKITKNLLTPINAFNL 
>gi|226332296|gb|ACIC01000024.1| GENE 64 53594 - 54166 240 190 aa, chain + ## HITS:1 COG:no KEGG:BVU_2830 NR:ns ## KEGG: BVU_2830 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 2 189 1 188 188 233 62.0 2e-60 MIDLDITELFEEIVKELPEGLEILYPNGKGGTKVVKSPRLNYIFGSSQYIKDILDEYSKS SAQSERKFPLVALFTPISEDRGDADYFSKAKVSLIIACSSCKEWSNEMRRITSFKNILRP IYKRLLEVLYEDSRFDCDYDEKVKHSYSENYSYGRYGAYTDSGEAVSEPIDAINIRSMEI KINNLNCRRK >gi|226332296|gb|ACIC01000024.1| GENE 65 54163 - 55002 591 279 aa, chain + ## HITS:1 COG:no KEGG:BVU_2829 NR:ns ## KEGG: BVU_2829 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 278 1 277 277 406 68.0 1e-112 MRKIRTCKGSRMNTGSSACSIDWKKVKGAILTEHGVKLPADITGEKLLELCHADRPGRIY PILPFLEYAKNGGEPQVNAVGYGASEYNGLSAQTDTFTLKKFDEVLNAQLLKCANKGWDV YFWNQDNMLIGYNDDTDILAGIPMSTVYPTVTQYPTSSAKSAMTVSFSHEDVEDSQLHFD YVQLDFNPKNFVKGLVDVVFQKLEAENTYKIVEVVGGYDRTEEFGSLIADGAAEVMNNVT SATYSDGIITIVPKAGAVPSLKAPSVLYEKGIRGIEQVS >gi|226332296|gb|ACIC01000024.1| GENE 66 54999 - 55154 93 51 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160885928|ref|ZP_02066931.1| ## NR: gi|160885928|ref|ZP_02066931.1| hypothetical protein BACOVA_03933 [Bacteroides ovatus ATCC 8483] # 1 51 1 51 51 72 100.0 8e-12 MKVDNVTFVEVAVKGMTKEEFINAHIKVVWQELKEADRKKKLSEVYDAITK >gi|226332296|gb|ACIC01000024.1| GENE 67 55279 - 55725 242 148 aa, chain + ## HITS:1 COG:no KEGG:BVU_2828 NR:ns ## KEGG: BVU_2828 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 148 26 173 173 208 62.0 4e-53 MEEHKNVLVDCIQEQLYSGLDGTEHLLNPDYDTDTYFNEPGPWQNRAEQYKRWKERITPP LRSEMLYLPPRPVEVPNLFITGTFYDSITADRIDSGLRFSTKGFTDGSSIEKKYGEQILG IGDTAKEYFNIMYLRPWMERFFSECGYR >gi|226332296|gb|ACIC01000024.1| GENE 68 55756 - 55917 198 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160885930|ref|ZP_02066933.1| ## NR: gi|160885930|ref|ZP_02066933.1| hypothetical protein BACOVA_03935 [Bacteroides ovatus ATCC 8483] # 1 53 10 62 62 95 100.0 1e-18 
MQSELERISDLAKKAAVLDGCMYVVYQKEDGTYAFDKLGVEIKGKIVEYRHYL >gi|226332296|gb|ACIC01000024.1| GENE 69 55920 - 59825 3166 1301 aa, chain + ## HITS:1 COG:ECs2641 KEGG:ns NR:ns ## COG: ECs2641 COG5283 # Protein_GI_number: 15831895 # Func_class: S Function unknown # Function: Phage-related tail protein # Organism: Escherichia coli O157:H7 # 328 608 225 520 696 83 25.0 3e-15 MADLKLKDFVDESDLQKLVELDNTIERVRADYANAAKELAKGLKLNVEGVADLEKLSNLY NTQAKTAGSASAELTEALRRQSEITQTVSKKIEEKLNVEKLSAAELKKLTKANSDNAVSL EKAAKAEANLTKAQNAGNATRKKAVLSEEERLKLIRTAIILTNQEVHSRSQAKEMNKQLQ KAVDVLKDTDENYIRTLARLNSTIGINTDYIKRNSDRYSQQKMTIGAYREEVKAAWVEIQ NGNKSMQNMGIIARNAGRMLKTEMAPGLSQVSAGLKGWAAGYIGAQAVVGGIVKMFTQLR EGVGSIVEFEFANSKLAAILGTTADNIKELTTDARQLGATTKYTAAQATELQIELAKLGF TRREILDSTGAILRFAQATGAELSDAAALSGAALRMFNASTKETERYVSAMAVATSKSAL SFSYLATALPIVGPVAKAFNFQIEDTLALLGKLADAGFDASMSATATRNILLNLADGNGK LAKALGEPVKTLPELVAGLKKLKEQGVDLNTTLELTDKRSVAAFNAFLTASDKIVPLRDQ ITGVDKELADMADTMSNNVKGSIAGLSSAWEAFMLSFYDSKGIMKDVLDFLARGLRNVAT QLKGYSELQDEADNKAVAFAQKEMMKSDILEKNARNMQRLYKEYINSGMSADEAAKKAKE DYIETLKSRLEYENSDYQLAIDNRKKLEGELKDRGFFTILTSWRRTNNVIKDEIDVATKA AAGKKAISSITESLIEQLDTIDLKENGGTKGNSVKVLTDKEKREQEKALKEKLKIHETYQ ESELALMDEGLEKELAKIGVAYSKKIAAVKGNSKEEIATRQNLAKEMQEKLDEFTIKYNS DREKKDVENALAVVKKGSQEELDLKLHQLELQREAEIDAAEKTGEDVFLIDDKYAKKKQE LYERHASDQVLLIAENAAHEQEIRDAAYVMDTLALKKQLASKEITQQEYAELEYQLKLDY VRKTTEAAIDALELELRNENLSAEDRAKIAEQLQKLKADLSQQEAEAEIDAINKVTKADE KAQKERQRNLKKWLQTASQAVGAIGDLVSTIYDGQIQKIEEEQEANDEKYDKDVERIQNL ADSGAISEEEAEARKRAAKERTEAKNAELEKQKQEMARKQAIWEKATSVAQAGIATALAI TEALPNIPLSIVIGAMGAIQVATILATPIPSYADGTQGNDRHPGGAALVGDAGKHEVIMY SGKAWITPDTPTLVDIPKGAQVFPDVDKVDISNFDMPDWDFPTFSPTYFASSSGDTIVFN DYSRLEKRVDRTNFLLMKSLKMQRQDASNREFELYKLSKLK >gi|226332296|gb|ACIC01000024.1| GENE 70 59828 - 60409 312 193 aa, chain + ## HITS:1 COG:no KEGG:BVU_2826 NR:ns ## KEGG: BVU_2826 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 192 1 188 189 144 43.0 1e-33 
MIERLNQITLNDFIELSCGNYVCLLSDCKSMSESTLKEIASKLLVEYRSIVNPSNMKAMV MDKEDMLKERAKLLSLRICQALVSLGFYDDVRQVLGQLNVDTRNMSDEQVISKIDYLLHS AIFEQKRNEERRSEEHKGSKATPEQIRSSFDAEIAFLMTFFKMSIDSRVINAAVYANIVH QADVEISIRKRST >gi|226332296|gb|ACIC01000024.1| GENE 71 60517 - 60804 103 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567985|ref|ZP_04845396.1| ## NR: gi|253567985|ref|ZP_04845396.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 95 1 95 95 165 100.0 8e-40 MNRKNSIHCINRRLYNVLLSELRTLETKCNRITAEVSEVKKMIALLPPDIGTLISSIERS AKEMHEQSIMHRKYVERCINGEPKIHLIRRADNGL >gi|226332296|gb|ACIC01000024.1| GENE 72 60794 - 61267 142 157 aa, chain + ## HITS:1 COG:CC2883 KEGG:ns NR:ns ## COG: CC2883 COG1595 # Protein_GI_number: 16127115 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Caulobacter vibrioides # 3 153 9 159 189 68 27.0 3e-12 MDFEKELSEIYPWILKVARKFCCSMQDAEDLAGDTVYKLLVNRDKFDCSKPLQPWCLIIM RNTYIIRYNRNSLIHFTGLDMVDGSAISNCTAHSILFDDLVSTIQRCAKKSRCIDSVMYY ASGYSYDEISEILNIPVGTVRSRISSARKFILQEISY >gi|226332296|gb|ACIC01000024.1| GENE 73 61394 - 61726 247 110 aa, chain + ## HITS:1 COG:no KEGG:BVU_2823 NR:ns ## KEGG: BVU_2823 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 110 1 110 110 150 71.0 2e-35 METKSNFRARVMKYAHHLLSTTKKSWKYCLLKAWELYRLAKRMRSGEVKFAYEKVNGSIR YAIGTLKNVPAGATNKGKRMTKPSYKTFSYFDVDKQEFRSFKIENLVTVY >gi|226332296|gb|ACIC01000024.1| GENE 74 61728 - 61982 212 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567988|ref|ZP_04845399.1| ## NR: gi|253567988|ref|ZP_04845399.1| predicted protein [Bacteroides sp. 1_1_6] # 1 84 1 84 84 121 100.0 1e-26 MTSLEYYSKRKEDSRQELATLIAQANQFIGDTHNSLNTHTNQGSNIANIKMLSQQLQQLT SRIELEKQKGDMLESICLTLTTEG >gi|226332296|gb|ACIC01000024.1| GENE 75 61987 - 62262 181 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567989|ref|ZP_04845400.1| ## NR: gi|253567989|ref|ZP_04845400.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 91 1 91 91 174 100.0 1e-42 MKATLLKVTGETVEISPVNGNCFTLNEAQSLVNGYVQVIDICPNKIMIMNEEGKFHFELN VEATRIALMNSAIFPDDYIAGDAIVCDDTMF >gi|226332296|gb|ACIC01000024.1| GENE 76 62282 - 62443 182 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567990|ref|ZP_04845401.1| ## NR: gi|253567990|ref|ZP_04845401.1| predicted protein [Bacteroides sp. 1_1_6] # 1 53 1 53 53 72 100.0 9e-12 MKTIYRVESPTGEVRVLEVSRNETGYNVYIDDSNICESITEEELTEALENPNF >gi|226332296|gb|ACIC01000024.1| GENE 77 62491 - 62637 165 48 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|295084479|emb|CBK66002.1| ## NR: gi|295084479|emb|CBK66002.1| hypothetical protein [Bacteroides xylanisolvens XB1A] # 1 48 1 48 48 70 100.0 4e-11 MSNSIAANDIIQNIDDLLAEYPVDECISILQEVVKQIDVRIKDFSEHI >gi|226332296|gb|ACIC01000024.1| GENE 78 62648 - 62935 56 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288799795|ref|ZP_06405254.1| ## NR: gi|288799795|ref|ZP_06405254.1| hypothetical protein HMPREF0669_00194 [Prevotella sp. oral taxon 299 str. F0039] # 1 94 1 95 96 99 55.0 7e-20 MNNIFTICYSEEEANEIGHFIMRKGYEGVQNDSYRYCREAIWWAFKETKRHHSCFIYVGV RDCQMIVSRTKRGLRRNGLKYIEKKRMFYNLLSRY >gi|226332296|gb|ACIC01000024.1| GENE 79 62971 - 63570 -18 199 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567991|ref|ZP_04845402.1| ## NR: gi|253567991|ref|ZP_04845402.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 199 1 199 199 383 100.0 1e-105 MTRRKKSCTDCEFCNDAEPVDGLRYFCDKKNIYFDPYKTITCRLFRQVSSSVLRIRARWA AMQLAKKKAAEEKERQLALMLTLPYWLHVGAEFIETCSGYEGVITDIDPTREDGIIYRPT NRPGWDGIDGYDTADSIIERTERGMLIFRNYTPEPLKDGFRWSDIEWDSGQIIYSEQRPD GRTDEYLKERYEVVKPEWI >gi|226332296|gb|ACIC01000024.1| GENE 80 63611 - 63844 67 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|295084476|emb|CBK65999.1| ## NR: gi|295084476|emb|CBK65999.1| hypothetical protein [Bacteroides xylanisolvens XB1A] # 1 77 1 77 77 146 100.0 4e-34 MNKKVILPCPFCGEIPVLERQYLPASLCLSCKNDNCYVNPAIEITVFCKKNSDGFTFAPQ FEQHENEIIEKWNKRNV >gi|226332296|gb|ACIC01000024.1| GENE 81 63866 - 64048 119 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567992|ref|ZP_04845403.1| ## NR: gi|253567992|ref|ZP_04845403.1| predicted protein [Bacteroides sp. 1_1_6] # 1 60 1 60 60 97 100.0 2e-19 MIFIYRIIADNSIVIMPGISSVDAHSKLTTALNMVDSDFYLVGMLFQGIIIKGDFQTQYL >gi|226332296|gb|ACIC01000024.1| GENE 82 64048 - 64512 220 154 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567993|ref|ZP_04845404.1| ## NR: gi|253567993|ref|ZP_04845404.1| predicted protein [Bacteroides sp. 1_1_6] # 1 154 2 155 155 289 100.0 3e-77 MKTFAERYKESIVNLSKEELIQQRDIILNHIEAQRERLHIVSNEKKVHDIRVAIKRANIK LREIDSLLKNLCSSEDSSIYHLLDSRISRFINEIVEDPNFVIPNWSKYILLSGTAEEVCE SANNGEYGELCVVADSKGQIMWEWNGDNGWCMSD >gi|226332296|gb|ACIC01000024.1| GENE 83 64800 - 64988 159 62 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567994|ref|ZP_04845405.1| ## NR: gi|253567994|ref|ZP_04845405.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 62 1 62 62 104 100.0 2e-21 MFREAILEALKKRGITQVELANHLGINKSPLNAFLKGKGKISMENIEKSFLFLGIDIVLK NR >gi|226332296|gb|ACIC01000024.1| GENE 84 65093 - 67084 1191 663 aa, chain + ## HITS:1 COG:no KEGG:BVU_2822 NR:ns ## KEGG: BVU_2822 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 400 1 423 892 145 29.0 6e-33 MLCKYVLTVDSISYDIPKSCIQNWDEIKFSRKRSGLEGITRTFTSKFQFVGEAYDLILEE YLSKYLASNASITVYTITNSHTYEEFFSCRLDFGSLTYDGNTVSINSIDDSVANIIKANK GTQYEYSVDEIKDVYQLYYDSVSMNYSQPHTLGGNTVENDASLQYIVIDKGIYVEAITYS LPLYISGGELPSRDSPLEFYDAPQESKDDPNVFVKALSDIDIVLNFSFEYYISYSDAYTT KAEIVLGGRYEDGRLVELKRWGYNKGDVTPSNLNESIKIHLTKGQALFFDLKVTFNRVNA STGNIYFRNFKFETRFTSRANPIYVDAIRPIDVLNRLLKSMNGGNEGIYGEIASGVDERL DNCVILAAESIRGIPQAKLYTSYTKFKNWMETVFGFVPVINGVTVFFKHRDKLFSDNNVK DLNSSFSSFEYKVDSSRIYSLVRVGYDKQDYESMNGRDEFRFTTEYTTGIDITDNVLELI SPYRADVYGIEFLSQKRGQDTTDSESDNDVFFVCASTTLHDNGGVQTYKEYRLIRSGWEI SGVLDPETMFNTMYWQGGILQANAGYIGMFTKKLSYSSSDGNSDVVVNGIGMKDDFNVES GIITCGDVSFTTYNEDIPPTDDETIKILKDDLVYEGYIKEVSSTVERNEGVKYDLFVRSI TKA >gi|226332296|gb|ACIC01000024.1| GENE 85 67089 - 68402 559 437 aa, chain + ## HITS:1 COG:no KEGG:BVU_2821 NR:ns ## KEGG: BVU_2821 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 2 266 3 266 275 273 48.0 9e-72 MIISPFTPLFFSPSTDKFGAKSKYVQLFARTDRIFVELILTAKEQEPIVYINNLLSNIST PVSLSSWKMNDDKILYFYNISLLPCGYYTVTVNGNTSEIFKVTDDECELSETSLIQYSMK DNKQRLDAVWWIDGMQYFFDFRVPGGFKDNGWTFGVDNEQFVTSDEDIVELFSHEYTTVL FTLGNGMGCPVWFAELLNRVLCCNYVYFDGVRYTRKESNVPELNQQIEGLKSFVFNQMLQ KVRTMNPVLEWNNQLAMRCVQSGAYRIADDEGMRSIKYGSESEVAEVGAYINMTKAIPNT GVSINSDTMVTVNSIHHPGVDENSYWDLIAIKTTDIDNKYIGRRGYGKLTVNGLDRLKND LDNGSINLRAVLYKGDSYTNLIEGNVISRDGVCVLKGINGGDIGALKEFQLYLDNVYDCD IDNLGMTIELVWVYEND >gi|226332296|gb|ACIC01000024.1| GENE 86 68414 - 74551 3851 2045 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567997|ref|ZP_04845408.1| ## NR: gi|253567997|ref|ZP_04845408.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 2045 1 2045 2045 4106 100.0 0 MTETEKQQIISLVLQALKTNSLTIEQLTDTTELSKDMYVEVSGGRKISIDLLSSTIAKMV NGDFDALVENVNKIAKDLSDGDAELLKRITGVSDKSNPLTDPFKSIGSFTTIGSFKDKLK TMYSGDSSIGNYRCILSVDSSKIPVNIQIERLELNKVCQSFTSCIQLATMSDNAEGVYLG TVCTNSRIGIVSNESVTWGKWTSVINDFEERIGKANGIAPLNEESKVPSECLPEPLSLGE GEEEAFPGNRGKSLEDTMKNIPSDIIKPGSFSVLSDASYLNVYFKKVSKTTGKETDDSFR LPSATLEQAGLLSAEDKQALEDMKSGTPADDVTHPIVIVDEIRPLKDGYYTLETAIAAVV SSQQETGINYERTGLIITYKTGEYEMETRQFQGAVSDFNEVALWKNFGGEGSKVELGDAP EEGGDKALSTGGAYDCIPVDFSLDTETEGVVKIQMVNAKGEGVGEEKQFLVGTGGGGGGG GTIVAIAFESSPVYGAYGSPIKGRAAVRSVTSGGGIETENSIETLEIVDRDSGLTVWAER VNRPSSGDLTDYTFELDFTSFFTAAGSRKFKLVATDDSGNTGSKNISVTAVDITCTCVQV LQYSPDTPVTPTTGSVTIPLYKFANNQSDKGISVRVDIKINGEWHLLATTAVNDSFTHSI TLHPSELGLSHGSYPLRIQGTDIASGAKGNTIYTAVMVVEEGNETPIVSLRYDDTTGGTV RLYDTLKLDVAVYVPGKSQSHVAIFANGIQFTQLLALNTRSYSVSQQIKGYADGTAVTYN AIVSAVSSDNIIVTVDGSAIDAELTSGTIYDFDFSGRSNDEADHSITSNGYELKLAGANF TSNGFGTFLGKNCLRIAENVTGQLNHYMFGSSMLEATGGAIQFTFATKNVKDKNAKLMEC YDESSGAGFYVTGSKVGIYCKNGIRSREERSYEQGKEITAAIVVEPTSIYIERGGIKYSM ICLYLDGERVAALGYVGGTGNLFQDRNIKFNGERGDLYLYNLCAWNTYFEWAQAHKNYLV RLTDTEIMVKEYEFENVLVSQTAEGTTMLRPSAAELYARGIPYIVEVASDESFNEFDNGV STSDNFTVDLYYYDPVHPWRSFVARGVRKRRQGTTSAKRCKKNPRYYLGKAKEIVPLFPD YTNADALLTYALFKQKKVRVGENTIPVDIITVKIDFSDSSGVNDCGTCDMMNYTYRSLGG DYLTPAQRFFDGTYDLGDIHIEGLEMNHSTANHPVCVFRSTSDTLQNVYFEARGNWKEDK GEQTALGFMNTPGYNLGCLNYQDASFVEFFGRAEETLDQIEERFKATDGLDTGMLYLLSL YCGRDYRFMRYVDGAWKDTTGSMYQEDGKWLIEGDVLNPVEGFELLVYQGMCWWRGVSSV EDMMKPSSMKSSWVQKLIDKGEISGDTFPAWTYYFECMVDNDQLAIDYALGKKVPYQLYN MLRFCDTCNKDNDAQWQENWRNNLRLHANPKSVMSYYGFTDYACGKDQQAKNMQPMWFLE SGASVTKGVYSPNALIMYLNKIYDADGVNDKDNDGGCDTDPEVDPGKPSTDTYTNPFAGW NSILWVCCREQQEVLLVDGNTIDLRTVIAAMRSCQIEVDGQMMKPFSPDGAIYFYCTKRQ LVWPKVVSSYDGYRKYIQYTATSDAIYFYALQGLGLTSLPAFIRTRWRIRDGYYQTGDFF SGVLSGRIACGADATITIMAAATGYFGVGNDASGNLSESCYLEAGQSYTFTNFAKDEGAL LYIYQADRMSSIDLSALTLSDNFDFSVMSLVETLVTGGENHVERSMGYNKLAAYMLGDLP FLTTLDIRNTGAKSLDASKCPRIEHIHTEGSVLENLTLAETSPINDISLPPTMTSLRFVG LPELTYTGLSAPSGLQIESMPNVQRLRLETSPKLDAIQMLRDVLASQTESRKLSMLRISN 
MTLKADGSELLAILEYGVAGMDEDGNRQDKPVVNGTYELTVIRETDEIESLESGIDGLVI LTVIDAYIDLINWFNNESYGGEPYYDNVTLDNINEVLEYYNGETYEEYLERFAEDNMDIN DLINK >gi|226332296|gb|ACIC01000024.1| GENE 87 74554 - 75342 610 262 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567998|ref|ZP_04845409.1| ## NR: gi|253567998|ref|ZP_04845409.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 262 1 262 262 474 100.0 1e-132 MTNEQSATLLRLNKQAQVVALNAVGFSDITENSRASEFGQRIKWAAGLLDLNLACNRISD NSKWYFTREEWDSFTVTNKQLFIKRGLRIRAHGHSFVISAQECYNADMTTTFYWGGQGKA IDGLNQKGLGAMYDCFTGEEDTDLIIATLKDQNNSGVIGAPAAEAARAYRAYTLESDGIE DESNWFLPSSGQMLLMYRYRDKINEMMRTFWSSDSMLMTDKYYWSSTIWDTNSAWAFELN TGRITNQNKNSALLHVRAVASE >gi|226332296|gb|ACIC01000024.1| GENE 88 75368 - 76177 436 269 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253567999|ref|ZP_04845410.1| ## NR: gi|253567999|ref|ZP_04845410.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 269 1 269 269 521 100.0 1e-146 MDKNIASAMLLRLNKQDQIEALKSIGFTTVNENTPASDIAKYMQWSGTLLDLSLATLRIE DGEQVFFTASEWNSMSANNRSKYIRIGIRLRAECHQFIIAKSDCVDAGGNKTFKWGGYGT DLRGLKNYGSGNQGLYDTFDGKENTDVIIETLAGVKDTQGTVGAPAAEVARAYKACTLES DGIEDTTVWNLPALGELMLMAKYKTEINELITSMFGNQNIFTNDWYWSSTEYDASSSWYV LFSSGNVYTGSRQTAYRVRPLAAINTLSL >gi|226332296|gb|ACIC01000024.1| GENE 89 76324 - 76731 237 135 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253568000|ref|ZP_04845411.1| ## NR: gi|253568000|ref|ZP_04845411.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 135 1 135 135 263 100.0 2e-69 MALTQDLPISNSMYKLLNLIIDARQQFPKAFRYEFGTELMMLAVHCCEYIRYANTDMNLE HRADYLMKFLCEFDALKLLLRVCEERHLTSLTQTAEICLLAESIGKQSTGWYKKTVADLQ RQNANGSQQVAKPES >gi|226332296|gb|ACIC01000024.1| GENE 90 76746 - 78167 589 473 aa, chain + ## HITS:1 COG:alr3497 KEGG:ns NR:ns ## COG: alr3497 COG3344 # Protein_GI_number: 17230989 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Nostoc sp. 
PCC 7120 # 58 426 5 347 352 98 26.0 3e-20 MSEQLELFIGHPPGDEPGKTKIADATASSSWNVNFNNGNVNTNNRQNANRVRPLAATGNI IYDILLSSIFEASEDCARQKRTSTDCVEFYNDYQSALVRLWYSIIYGEYVPDFSKVFIRT YPVYREVFAAAFIDRVVHHWIALRIEPILEERFREQGNVSKNCRKGEGCLSAVHYLNNMI VEVSENYTADAYIFKDDLFSFFMSISKSLVWEMLNIFVRDNYKGDDIECLLYLLAVTIFH CPQNKCIRRSPVSMWDRLPSNKSLFHNDPDRGVAIGNLPSQLIANFLASVYDYFVMEILG FMYYVRFVDDFCIVVKSPEEILSKVHLLDGFLKEQLLLRLHPRKLYLQHYKKGVLFVGAF ILPGRIYVSNRVVGNTYNAVRKFNRIAENGFAEAYVEKFVSTMNSYYGLMKHFATYNIRR KIAAMLLPEWWEYVYIEGHFEKFVLKNKYNHRKQLIKHIKKHGSKKYLTAWDC >gi|226332296|gb|ACIC01000024.1| GENE 91 78130 - 78408 246 92 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253568002|ref|ZP_04845413.1| ## NR: gi|253568002|ref|ZP_04845413.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 92 1 92 92 184 100.0 2e-45 MDQKNILPRGIAKPIEQQPDGTWVVRHHFRVVGTNENGEELVTFASSEYPEKPTLQQIQR SIDRYRVCLTMYGDTISDEIEKVDLSVYMFTD >gi|226332296|gb|ACIC01000024.1| GENE 92 78607 - 78900 195 97 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|295084465|emb|CBK65988.1| ## NR: gi|295084465|emb|CBK65988.1| hypothetical protein [Bacteroides xylanisolvens XB1A] # 1 97 1 97 97 147 100.0 3e-34 MSMGIKVLYDWILQSNRPAHVKAGMFVFVVMLVFCFLLLGIDFCKSAIVSLTTTAIAAIV VEYIQKKCGFIFDWLDALATVLLPGLITVFSILVVTL >gi|226332296|gb|ACIC01000024.1| GENE 93 78909 - 79424 348 171 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|262407839|ref|ZP_06084387.1| ## NR: gi|262407839|ref|ZP_06084387.1| conserved hypothetical protein [Bacteroides sp. 
2_1_22] # 1 171 1 171 171 316 99.0 4e-85 MRWLYELFNVDQIRIIFVSMFSSLLAYLTPTKGFLIALVIMFGFNIWCGMRADGVSIIRC KNFKWDKFKNALVELLLYLIIIEVVFSFMSLIGDGENSLLVIKTITYVFSYVYLQNAFKN LIIAYPRNKGFRIIYHVIRFEFKRATPTHVQGIIDRIENELDKEERYENID >gi|226332296|gb|ACIC01000024.1| GENE 94 79408 - 79603 162 65 aa, chain + ## HITS:1 COG:no KEGG:BT_4443 NR:ns ## KEGG: BT_4443 # Name: not_defined # Def: N-acetylmuramoyl alanine amidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 65 23 87 218 108 73.0 8e-23 MKILIDNGHGSNTPGKCSPDGRLREYSYTREIAGRVVFELRKLGIDAELVVKEEIDVPLS ERCRR Prediction of potential genes in microbial genomes Time: Thu May 12 00:15:27 2011 Seq name: gi|226332295|gb|ACIC01000025.1| Bacteroides sp. 1_1_6 cont1.25, whole genome shotgun sequence Length of sequence - 15818 bp Number of predicted genes - 17, with homology - 16 Number of transcription units - 8, operones - 4 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 60 - 368 228 ## Plut_0820 cell wall hydrolase/autolysin 2 1 Op 2 . + CDS 365 - 835 261 ## BF2444 hypothetical protein + Prom 882 - 941 10.1 3 2 Tu 1 . + CDS 994 - 1707 248 ## gi|253568007|ref|ZP_04845418.1| conserved hypothetical protein + Term 1729 - 1759 2.7 - Term 1955 - 1991 7.3 4 3 Op 1 . - CDS 2009 - 2458 292 ## BF2302 hypothetical protein 5 3 Op 2 . - CDS 2500 - 2670 251 ## - Prom 2707 - 2766 5.4 6 4 Tu 1 . - CDS 2768 - 3952 403 ## BVU_2806 hypothetical protein - Term 4308 - 4344 1.8 7 5 Op 1 9/0.000 - CDS 4360 - 5058 844 ## COG3279 Response regulator of the LytR/AlgR family 8 5 Op 2 . - CDS 5129 - 6202 824 ## COG3275 Putative regulator of cell autolysis 9 5 Op 3 36/0.000 - CDS 6221 - 7441 416 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 10 5 Op 4 24/0.000 - CDS 7517 - 8260 261 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 11 5 Op 5 13/0.000 - CDS 8267 - 9496 1450 ## COG0845 Membrane-fusion protein 12 5 Op 6 . 
- CDS 9548 - 10939 1417 ## COG1538 Outer membrane protein - Prom 11109 - 11168 9.5 + Prom 11082 - 11141 8.5 13 6 Op 1 . + CDS 11165 - 11401 174 ## BT_1211 hypothetical protein 14 6 Op 2 31/0.000 + CDS 11422 - 12984 1674 ## COG1271 Cytochrome bd-type quinol oxidase, subunit 1 + Prom 13059 - 13118 4.1 15 6 Op 3 . + CDS 13139 - 14281 1012 ## COG1294 Cytochrome bd-type quinol oxidase, subunit 2 + Term 14342 - 14405 10.1 - Term 14329 - 14391 6.1 16 7 Tu 1 . - CDS 14455 - 14835 236 ## BT_1208 hypothetical protein - Prom 14886 - 14945 6.8 17 8 Tu 1 . - CDS 14963 - 15742 780 ## COG1052 Lactate dehydrogenase and related dehydrogenases Predicted protein(s) >gi|226332295|gb|ACIC01000025.1| GENE 1 60 - 368 228 102 aa, chain + ## HITS:1 COG:no KEGG:Plut_0820 NR:ns ## KEGG: Plut_0820 # Name: not_defined # Def: cell wall hydrolase/autolysin # Organism: P.luteolum # Pathway: not_defined # 5 99 111 206 215 94 48.0 1e-18 MQARGWEAWTSVGQTKADKLADCLYATAEECLFGMKIRKDMADGDPDKESSFYILKHTKC PAVLTENLFQDNKEDVDFLLSEEGKRTIVSLHVKGICKYLGI >gi|226332295|gb|ACIC01000025.1| GENE 2 365 - 835 261 156 aa, chain + ## HITS:1 COG:no KEGG:BF2444 NR:ns ## KEGG: BF2444 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 156 1 149 149 135 45.0 4e-31 MKFLPWILVCLLLGVIVWMQCNPHDPSTVYIKGDTVRIRDTIRDTIPKPAKRTPKRIDTV YLPILIDATTDRTVEGDSIPVLVPIVSKEYKTDNYRAIVSGYKPSLDFMEVYRDKEIITL SPLQKKKRWGLGLQTGYSYPGGWYVGVGISCNLIMW >gi|226332295|gb|ACIC01000025.1| GENE 3 994 - 1707 248 237 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253568007|ref|ZP_04845418.1| ## NR: gi|253568007|ref|ZP_04845418.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 237 1 237 237 427 100.0 1e-118 MKYICLFLFACISIISKAQTLILSENDSTVMTEYNDGNLWAYRNANGFIVGLTTYETKDD YGKYYRIDVFIKNQCDSSVIFTPDDVTSHLLTNRGDNYQLMVYTNEAFQKKIRKSQNWAM ALYGFSSGLSAGSAGYSTSYSTSYSSNGTAYTTVTNHYDANAAFQANMASSYQLQTLGKM MDNDREIKRQGYLKKTTVHPNEGIIGYMNIKRKKGKILTINIPISGYVYSFDWDVSK >gi|226332295|gb|ACIC01000025.1| GENE 4 2009 - 2458 292 149 aa, chain - ## HITS:1 COG:no KEGG:BF2302 NR:ns ## KEGG: BF2302 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 18 148 14 139 140 77 39.0 1e-13 MIDWNDCLPTKEMQADFERFKELKTTEEKEAFKKEMQDKYNKLPEAQKEAYKKASEAGLK ATVNACNDYIERAEEAILRDKLGELPEAISFSYIAKKYFGKSRNWLYQRINGNIVNGKKA RFTDNELKTFLNALNDVSEMIHQTSLKIS >gi|226332295|gb|ACIC01000025.1| GENE 5 2500 - 2670 251 56 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKDEEKKELEQEYENLKLLASFHEAYGVPENAKEREALINDILDRMNEIQEKLKKL >gi|226332295|gb|ACIC01000025.1| GENE 6 2768 - 3952 403 394 aa, chain - ## HITS:1 COG:no KEGG:BVU_2806 NR:ns ## KEGG: BVU_2806 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 392 1 393 393 518 66.0 1e-145 MATLTLVIVPAKRLSDGTHKIRIRVAHNSETRFITTDIVVRENEFKNGKIVHRPDKDFLN TKLQQLYNLYFKRYMELDYPDSLTCTQLVKMITNPLNGEKHRKFEDIVDEYLSQIDEEER TKTYKLYRLATNKFMQFIGNGSLMEHITPIRMNQYISWLKKTKLSSTTINIYITLLKVII NYAIKMRYVTYDIDPFITARIPSAQKRETQITVEELKTIRDANLEHYNLNVTRDIFMLTY YLAGMNLVDILAYDFRTDEINYIRKKTKNTKEGDSLISFSIPEEAKPIIKKYMKKNTGKI IFGKYKNYTSCYNLLARKISQLGKVAGIRHKFTLYSARKSFVQHGYDLGIPLSTLEYCIG QSMKEDRPIFNYVTIMRKHADKAIREILDNLKNE >gi|226332295|gb|ACIC01000025.1| GENE 7 4360 - 5058 844 232 aa, chain - ## HITS:1 COG:FN0219 KEGG:ns NR:ns ## COG: FN0219 COG3279 # Protein_GI_number: 19703564 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Fusobacterium nucleatum # 2 204 1 207 240 93 30.0 3e-19 MILRCAIVDDEPLALGLLESYVNKTPFLQLTGKYSSAVQAMKELPGEEVDLLFLDIQMPE LNGLEFSQMVDSRTRIVFTTAFGQYAIDGYRVNALDYLLKPISYVDFLQAANKALQWFEL 
VQKPEEIDSIFVKSDYKLVQVELKKIMYIEGLKDYIKIYTEDDTKPILSLMSMKAMEELL PSSRFIRVHRSFIVQKDKIRVIDRGRIVFDKTYIPISDSYKQVFQAFLDERS >gi|226332295|gb|ACIC01000025.1| GENE 8 5129 - 6202 824 357 aa, chain - ## HITS:1 COG:SMb21546 KEGG:ns NR:ns ## COG: SMb21546 COG3275 # Protein_GI_number: 16264735 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Sinorhizobium meliloti # 144 327 167 356 383 95 33.0 1e-19 MKQSLTSARRPLEALIHIIGWGIMFGFPFFFVERENGNINWMAYLRHSAVPLSFMIAFYV NYFLLVPRYLFQSQTKRYITYNILLLCVIGLMLHLWRSLTFDPSFVPKPHRSGGPPGWLF FVRDMLSLVFTIGLSAAIRMSARWTQAEAARKEAERNRSEAELKNLRNQLNPHFLLNTLN NIYALIAFDTDKAQQAVQELSKLLRYVLYDNQQTYVPLGKETDFIRNYIELMRIRLSSNV QMTTQIDILPDSRTLIAPLIFISLIENAFKHGISPTERSFIHIHLAENETEVICEISNSN HPKNIMDKSGSGIGLEQVNRRLEILYPGQYTWQKGISEDGKEYRSRLSIRVRESINK >gi|226332295|gb|ACIC01000025.1| GENE 9 6221 - 7441 416 406 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 8 406 10 413 413 164 30 3e-40 MNGTNLFKIALRALANNKLRAFLTMLGIIIGVASVITMLAIGQGSKRSIQQQISEMGSNM IMIHPGADRRGGVRQDPSAMQTLKLTDYEALRDETNFLSAISPNVSSSGQLIAGNNNYPS SVSGVGLEYLDIRQLSVESGDMFTEADIQSSAKVCVIGKTIVDNLFPDGSDPIGKIIRFN KIPFRVVGVLKAKGYNSMGQDQDNIVLAPYTTVMKRLLAVTYLQGVFASALSEDMTEYAT DEITEILRRNHKLKASDEDDFTIRTQQELSTMLNSTTDLMTTLLACIAGISLVVGGIGIM NIMYVSVTERTREIGLRMSVGARGVDILSQFLIEAIMISITGGLIGVIIGCGASWIVKSV AHWPIFIQPWSVFLSFAVCTVTGVFFGWYPAKKAADLDPIEAIRYE >gi|226332295|gb|ACIC01000025.1| GENE 10 7517 - 8260 261 247 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 224 1 221 223 105 31 3e-22 MKKVIELQNIKRDFQVGEETVHALRGVSFTINEGEFVTIMGTSGSGKSTLLNTLGCLDTP TSGEYLLDDISVRTMSKPQRAVLRNRKIGFVFQSYNLLPKTTAVENVELPLMYNSSVSAS ERRRRAIEALQAVGLGDRLEHKSNQMSGGQMQRVAIARALVNNPAVILADEATGNLDTRT SFDILVLFQKLHAEGRTIIFVTHNPEIAQYSSRNIRLRDGHVIEDTVNSHVLSAAEALAA LPKSDED >gi|226332295|gb|ACIC01000025.1| GENE 11 8267 - 
9496 1450 409 aa, chain - ## HITS:1 COG:AGc3332 KEGG:ns NR:ns ## COG: AGc3332 COG0845 # Protein_GI_number: 15889118 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 2 372 37 432 437 199 33.0 7e-51 MKTKKIILIAVAVAVVVGGGIWFFTGSPAKHKVMYASATVSKGDISNSVTATGTIEPVIE VEVGTQVSGIIDKIYVDYNSVVTKGQLIAEMDRVTLQSELASQQATYDGAKAEFEYQKKN YERSKGLHDKSLISDTDYEQALYNYQKAKSAFDSSKASLAKAERNLSYATITSPIDGVVI SRDVEAGQTVASGFETPTLFTIAADLTQMQVVADVDEADIGGIEEGQRASFTVDAYPNDV FDGVVTQIRLGDASSSSSASSSTSTVVTYEVVISAPNPDLKLKPRLTANVTIFILDKKDV LSVPNRALRFTPEAPLIGKNDIVKDCEGEHKVWTREGTTFTAHPVEIGISNGISTEIISG VAEGTKVVTEATIGAMPDENMNREPGQGNGERSPFMPGPPGNNKKKSNK >gi|226332295|gb|ACIC01000025.1| GENE 12 9548 - 10939 1417 463 aa, chain - ## HITS:1 COG:BMEI1029 KEGG:ns NR:ns ## COG: BMEI1029 COG1538 # Protein_GI_number: 17987312 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Brucella melitensis # 94 453 70 415 456 80 23.0 6e-15 MINVRRLTVIVFLGAGLLPGISAQESSVQTAAQSEMNLPAQWDLQSCIDYALQQNITIRK NRVNAESTQIDVKTAKATLFPSLSFSTSQNMVNRPYQESSRTISGSEVIETTSKTSYNGN YGLNASWTIYNGSKRLNTIKQEKLNNQAANLDVETSENSIQESIAQIYIQILYAAESVKV NENTLQVSIAQRDRGQELLNAGSIAKSDLAQLEAQVSTDRYQLVTAQATLQDYKLQLKQL LELDGENEMNVYLPALSDENVLLPLPTKKDVYMSALTLRPEIEASKLSVEASELGIDIAK AGYLPTISLSAGIGTNHTSGSDFTFGEQVKNGWNNSIGVSVSVPIFNNRQTKSAVQKAKL QRETSLLSLLDEQKTLYKTIEGLWLDANSAQQRYAAANEKLKSTQISYDLISEQFNLGMK NTVELLTEKNNLLQAQQEQLQAKYMAILNTQLLKFYQGDKLAL >gi|226332295|gb|ACIC01000025.1| GENE 13 11165 - 11401 174 78 aa, chain + ## HITS:1 COG:no KEGG:BT_1211 NR:ns ## KEGG: BT_1211 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 78 1 78 78 139 100.0 4e-32 MKNTLLAIWNFYLEGFRSMTLGRTLWLIILVKLFIMFFILKIFFFPDFLGDHPTDADKGT YVGNELIQRALPEKTTDF >gi|226332295|gb|ACIC01000025.1| GENE 14 11422 - 12984 1674 520 aa, chain + ## HITS:1 
COG:Cj0081 KEGG:ns NR:ns ## COG: Cj0081 COG1271 # Protein_GI_number: 15791471 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 1 # Organism: Campylobacter jejuni # 6 512 3 504 520 553 55.0 1e-157 MIESIDTSLIDWSRAQFAMTAMYHWIFVPLTLGLAVVMGIMETLYYKTGNEFWKRTAKFW MKLFGINFAIGVATGLILEFEFGTNWSNYSWFVGDIFGAPLAIEGILAFFMEATFIAVMF FGWDKVSKRFHLASTWLTGLGATISAWWILVANAWMQHPVGMEFNPDTVRNEMVDFWAVA TSPVAVNKFFHTVLSGWVLGGVFVVGISCWFLLKKRNREFALASIKIGAIFGLVSSLLAV WTGDGSGYQIAQTQPMKLAAVEGLYEGGTNVGLVGIGMLNPEKKTYNDGKDPFLFRIEIP SMLSFLAERDVDGYVPGITNIIEGGYQLKDGTTALSAAEKIERGKTAIGALAAYRAAKSA GHEEDAAVAYQVLQENMPYFGYGYIKDVNQLVPNVPLNFYAFRIMVILGGYFILFFIVVL FFIYKKDLSKMRWMHWIALLTIPLAYIAGQAGWVVAECGRQPWAIRDMLPTSAAISKLDV GSVQATFFIFLILFTVMLIAEIGIMVKAIKKGPEEEISNR >gi|226332295|gb|ACIC01000025.1| GENE 15 13139 - 14281 1012 380 aa, chain + ## HITS:1 COG:Cj0082 KEGG:ns NR:ns ## COG: Cj0082 COG1294 # Protein_GI_number: 15791472 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 2 # Organism: Campylobacter jejuni # 5 380 10 374 374 305 50.0 1e-82 MYIFLQQYWWLVVSLLGAILVFLLFVQGGNSLLFCLGKTEEHRKMMVNSTGRKWEFTFTT LVTFGGAFFASFPLFYSTSFGGAYWLWMIILFSFVLQAVSYEFQSKAGNLLGKKTYQTFL VINGVVGPLLLGGAVATFFTGSDFYINKANMTDTIMPVISHWGNGWHGLDALTNIWNVIL GLAVFFLARVLGSLYFINSIADKELTDKCRRAVLNNTIFFLVFFLVFVIRTLVSDGFAVN PDTLEVYMQPYKYFINFIEMPVVLIIFLIGVVLVLFGIGKTVLKKTFDKGIWFAGIGTVL TVLALLLVAGYNNTAYYPSYTDLQSSLTLANSCSSEFTLKTMAYVSILVPFVIAYIFYAW RSIDRHKITEKEMDEGGHSY >gi|226332295|gb|ACIC01000025.1| GENE 16 14455 - 14835 236 126 aa, chain - ## HITS:1 COG:no KEGG:BT_1208 NR:ns ## KEGG: BT_1208 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 126 1 126 126 200 100.0 1e-50 MFELKFNLKGTINRKEFLLSMILSIINFIVCLAISMIDLFIHFGGDYKTGTTPSLSFWGG IILFITTVMILITLWYFLAQCMKRLRDISLSGLWLLVLIVPIANIFLLVYLLFAKSTDNT QKDISA >gi|226332295|gb|ACIC01000025.1| GENE 17 14963 - 15742 780 259 aa, chain - ## HITS:1 COG:CAC2945 
KEGG:ns NR:ns ## COG: CAC2945 COG1052 # Protein_GI_number: 15896198 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Clostridium acetobutylicum # 4 259 68 324 324 281 54.0 6e-76 MTALPELKYIGVLATGYNVVDTAVAKERGIIVTNIPAYSTASVAQMVFAHILNICQQVQH HSEEVHKGRWTNNKDFCFWDTPLIELRDKKIGLVGLGNTGYTTARVAIGFGMQVYALTSK SHFQLPPEIKKMDLDQLFNECDIISLHCPLTPETHELVNARRLALMKPNAILINTGRGPL VNEQDLADALNSGKIYAAGVDVLSSEPPRADNPLLTAKNCYITPHIAWASTEARERLMNI AISNLQAYISGTPENVVNK Prediction of potential genes in microbial genomes Time: Thu May 12 00:16:03 2011 Seq name: gi|226332294|gb|ACIC01000026.1| Bacteroides sp. 1_1_6 cont1.26, whole genome shotgun sequence Length of sequence - 20124 bp Number of predicted genes - 18, with homology - 17 Number of transcription units - 12, operones - 5 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 469 - 1737 1190 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase 2 1 Op 2 . - CDS 1742 - 4165 2201 ## BT_1204 putative outer membrane protein 3 2 Tu 1 . - CDS 4302 - 4391 65 ## - Prom 4614 - 4673 2.0 - Term 4605 - 4646 5.1 4 3 Tu 1 . - CDS 4736 - 5821 1002 ## COG0381 UDP-N-acetylglucosamine 2-epimerase - Prom 5883 - 5942 5.3 + Prom 5766 - 5825 3.4 5 4 Tu 1 . + CDS 5904 - 6533 659 ## COG2860 Predicted membrane protein 6 5 Op 1 . - CDS 6539 - 7033 496 ## BT_1200 hypothetical protein - Prom 7064 - 7123 8.0 7 5 Op 2 3/0.000 - CDS 7129 - 8091 887 ## COG0501 Zn-dependent protease with chaperone function 8 5 Op 3 . - CDS 8104 - 8664 712 ## COG1704 Uncharacterized conserved protein - Prom 8875 - 8934 4.7 + Prom 8731 - 8790 5.0 9 6 Tu 1 . + CDS 8907 - 9410 519 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Term 9450 - 9490 7.2 + Prom 9518 - 9577 4.8 10 7 Tu 1 . 
+ CDS 9620 - 11395 1795 ## COG5016 Pyruvate/oxaloacetate carboxyltransferase + Term 11426 - 11486 0.2 - Term 11401 - 11435 0.2 11 8 Op 1 . - CDS 11480 - 12226 377 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) 12 8 Op 2 . - CDS 12204 - 12920 922 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins - Prom 12944 - 13003 5.0 13 9 Tu 1 . - CDS 13066 - 14421 1292 ## COG0534 Na+-driven multidrug efflux pump - Prom 14444 - 14503 4.5 + Prom 14374 - 14433 3.9 14 10 Op 1 . + CDS 14513 - 15904 1413 ## COG0657 Esterase/lipase + Prom 15906 - 15965 3.0 15 10 Op 2 . + CDS 15986 - 16756 262 ## BT_1191 hypothetical protein + Term 16919 - 16957 3.5 + Prom 16987 - 17046 6.2 16 11 Tu 1 . + CDS 17124 - 18107 558 ## COG1162 Predicted GTPases + Term 18210 - 18245 -1.0 - Term 18045 - 18080 1.1 17 12 Op 1 . - CDS 18220 - 18690 425 ## BT_1189 hypothetical protein 18 12 Op 2 . - CDS 18711 - 19844 448 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 19872 - 19931 6.7 Predicted protein(s) >gi|226332294|gb|ACIC01000026.1| GENE 1 469 - 1737 1190 422 aa, chain - ## HITS:1 COG:CAC0326 KEGG:ns NR:ns ## COG: CAC0326 COG2256 # Protein_GI_number: 15893618 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Clostridium acetobutylicum # 3 416 16 432 443 397 48.0 1e-110 MQPLAERLRPKTLDDYIGQKHLVGPGAILRKMIDAGRISSFILWGPPGVGKTTLAQIIAN KLETPFYTLSAVTSGVKDVREVIERAKSNRFFSQSSPILFIDEIHRFSKSQQDSLLGAVE NGTVTLIGATTENPSFEVIRPLLSRCQLYVLKSLEKEDLLELLQRAITTDTILKERKIEL KETTAMLRFSGGDARKLLNILELVVESDTEETVVITDDMVTERLQQNPLAYDKDGEMHYD IISAFIKSIRGSDPDGAIYWLARMVEGGEDPAFIARRLVISAAEDIGLANPNALLLANAC FETLMKIGWPEGRIPLAETTIYLATSPKSNSAYSAINDALELVRSTGNLPVPLHLRNAPT KLMKQLGYGQEYKYAHSYEGNFVKQQFLPDELKEKRIWQPQNNAAEQKHAERMQQLWGER FK >gi|226332294|gb|ACIC01000026.1| GENE 2 1742 - 4165 2201 807 aa, chain - ## HITS:1 COG:no KEGG:BT_1204 NR:ns ## KEGG: BT_1204 # Name: not_defined # 
Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 807 52 858 858 1621 99.0 0 MGTVTNIDGEYKLDARRNGGTLVFSSIGYVTKTVKVGSGNQTVNVKLSPDDVMLTEVVVK PQKEKYSRKNNPAVEFMKKVIEHKKAQVLEVNDYYQYDKYEKMKMSINDLTPEKLEKGIY KKYSFLKDQVEVSETTNKLILPISVQETASQTIFRKDPESKKTIIKGKNSNGIEEFFSTG DMLGTVLKDVFADINIYDDDIRLLQQRFVSPIGNNAISFYKYYLMDTLMVNKRECVHLTF VPQNSQDFGFTGHLYVLKDSTYAVQKCTMNLPKKSGVNFVNRMDIVQQYEQLPNGNWVLA DDDMTVDLSWSSNKTSGGLQVERTTKYSNYKFDPIEQRLFRLKGPVIKEADMLSKSDEYW ASVRQVPLTRKESNMDVFVNRLEQIPGFKYIIFGAKALIENFVETGSKEHKSKVDIGPIN TMISSNYIDGTRFRLSGMTTAHFDKHWFLSGYGAYGLKDEKWKYSGTLTYSFNKRDYVVW EFPKHFISATYSYDVMSPMDKFLFTDKDNIFLSMKTTTVDQMSYMRDATINYELETLTGF GVKAMLRHRNDEPTGKLEYLRNDAAQTRVHDITTSEASVTLRYAPGESFVNSKQRRVPVS LDAPIFTLTHAMGFKGVLGGEYNFNRTEASIWKRFWLPASWGKIDCSVKAGAEWNVVPFP LLILPEANLSYITQRETFSLINNMEFLNDRYASMSLSYDMNGKLFNRIPLIKNLKWREMF RIRALWGTLTDKNNPFKSSNPDLFRFPTRDGKFTSFVMDPKVPYIEGSIGIYNIFKLLHV EYVHRFTYRDNPGINKNGIRFMVLMVF >gi|226332294|gb|ACIC01000026.1| GENE 3 4302 - 4391 65 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGKKSYILYFTQTYLLTLQPYYERYDTTI >gi|226332294|gb|ACIC01000026.1| GENE 4 4736 - 5821 1002 361 aa, chain - ## HITS:1 COG:MJ1504 KEGG:ns NR:ns ## COG: MJ1504 COG0381 # Protein_GI_number: 15669698 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine 2-epimerase # Organism: Methanococcus jannaschii # 1 359 1 362 366 195 33.0 1e-49 MKITIVAGARPNFMKIAPIARAIEAARGLGKSISYRLVYTGRRDDTSLDASLFSDLDMKA PDVYLGVESSNPTSLAAGIMIAFEQELTENPAHVVLVVDDLTATMSCAIVAKKQGIKVAH LVAGTRSFDMKMPKEVNRMITDGVSDYLFTAGMVANRNLNQTGTESENVYYVGNILVDTV RFNRNRLLKPVWFSVLGLQEGNYLLLTLNRRVLLNNKENLRKLMQTMIEKAAGMPIVAPM HTYVRNAIKNLGIEAPNLHIMPPQNYLFFGYLINKAKGIITDSGNVAEEATFLGIPCITL NTYAEHPETWRVGTNELVGEDPAALAQAMDTLMKGEWKKGELPERWDGRTAERIVQILTS K >gi|226332294|gb|ACIC01000026.1| GENE 5 5904 - 6533 659 209 aa, chain + ## HITS:1 COG:VC2382 KEGG:ns NR:ns ## COG: VC2382 COG2860 # Protein_GI_number: 15642379 # Func_class: S Function unknown # Function: Predicted membrane 
protein # Organism: Vibrio cholerae # 3 197 37 233 239 119 39.0 4e-27 MPTFVQILDFIGTFAFAISGIRLASAKRFDWFGAYVVGLATAIGGGTIRDVLLDVTPGWM TNPIYLICTGLALIWVICFGRWLIRLNNTFFIFDTIGLALFTVVGVGKSIALGYPFWVAI IMGSITGAAGGVIRDVFINEIPLIFRKEIYAMACVVGGIAYWLCDAAGMESYVCQLVGGS AVFVTRILAVKYHICLPILKGSEETDGDS >gi|226332294|gb|ACIC01000026.1| GENE 6 6539 - 7033 496 164 aa, chain - ## HITS:1 COG:no KEGG:BT_1200 NR:ns ## KEGG: BT_1200 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 164 1 164 164 293 100.0 2e-78 MKTRKQILMLLLVMLAGIPTLSAQTKKEKKEQKKEAVRQLIVSEDYKIDVNTAMPMRGRS IPLTSQYSLQIRNDSVFSYLPYYGRAYSVPYGGGSGLIFNAPLKEYTMDMDKKGNAVIKF SARSPEDFFEFRAKVFPDGTTSIDVTMQNRQAISYRGEVDLDKK >gi|226332294|gb|ACIC01000026.1| GENE 7 7129 - 8091 887 320 aa, chain - ## HITS:1 COG:lin0962 KEGG:ns NR:ns ## COG: lin0962 COG0501 # Protein_GI_number: 16800031 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Listeria innocua # 60 318 39 302 304 152 33.0 6e-37 MQYVGIQTQQSRNNRRSVLLLFLFPCLVAALLYLACYLLVAFGHGESMEVSILELANPLF INALPYTMGVVLIWFLIAFWANTSIIKAATGAKPLDRRENKRVYNLVENLCMANGMKAPK INIIDDDSLNAFASGINDRTYTVTLSKGIIQKLNDEELEAVIAHELTHIRNRDVRLLIVS IVFVGIFSMLTQITFYTITHTRIRSNGKNGGGAILIILIALVIAAVGYFFASLMRFAISR KREYMADAGSAEMTKNPLALASALRKISDDPDIEAVKRSDVAQLFIQNPGDQSKSALSGL SGLFATHPPIEKRISILEQF >gi|226332294|gb|ACIC01000026.1| GENE 8 8104 - 8664 712 186 aa, chain - ## HITS:1 COG:lin0961 KEGG:ns NR:ns ## COG: lin0961 COG1704 # Protein_GI_number: 16800030 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 4 186 5 185 185 154 45.0 1e-37 MTILIIVGIIVLLGIIFASMYNSLVRLRNNRENAFADIDVQLKQRHDLIPQLVDTVKGYA AHEKDTLERVIQARNGAVSAKTIDEKIAAENQLSSALSGLKITLEAYPDLKANQNFLQLQ EEVSDVENKLAAVRRYFNSATKEYNNAVQTFPSNIIAGMAGFRREIMFDLGTNERANLEQ APKISF >gi|226332294|gb|ACIC01000026.1| GENE 9 8907 - 9410 519 167 aa, chain + ## HITS:1 COG:mll3697 KEGG:ns NR:ns 
## COG: mll3697 COG1595 # Protein_GI_number: 13473184 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 3 164 5 161 183 100 37.0 1e-21 MKSLSFRKDLIGVQEELLRFAYKLTANREEANDLLQETSLKALDNEEKYVPDTNFKGWMY TIMRNIFINNYRKIVRDQTFVDQTDNLYHLNLPQDSGFESTEGAYDLKEMHRIVNALPKE YKVPFSMHVSGFKYREIAEKLDLPLGTVKSRIFFTRQRLQQELKDFV >gi|226332294|gb|ACIC01000026.1| GENE 10 9620 - 11395 1795 591 aa, chain + ## HITS:1 COG:AF1252m KEGG:ns NR:ns ## COG: AF1252m COG5016 # Protein_GI_number: 18677784 # Func_class: C Energy production and conversion # Function: Pyruvate/oxaloacetate carboxyltransferase # Organism: Archaeoglobus fulgidus # 1 456 1 431 480 184 31.0 6e-46 MKKEIKFSLVYRDMWQSSGKYQPRADQLVRIAPLIIEMGCFSRVETNGGAFEQVNLLYGE NPNKAVRAFTKPFHEASIQTHMLDRGLNGLRMYPVPADVRRLMYKVKHAQGVDITRIFCG LNEVRNIIPSIHYALEAGMIPQATLCITFSPVHTVEYYTAIAERLIEAGAPEICLKDMAG VGRPEMLGRLTKAIKERHPEIIIQYHGHSGPGLSMASILEVCENGADIIDVAMEPISWGK VHPDVISVQAMLKDAGFQVPEINMKAYMKARAMTQEFIDDFLGYFMDPTNKHMSSLLLKC GLPGGMMGSMMADLKGVHSGINLILRGKNEPELSIDDLLVMLFDEVEYVWPKLGYPPLVT PFSQYVKNVALMNVMSLIKGEERWTMIDNHTWDMILGKSGRLPGALAPEIIALAKEKGYE FTDEDPQKNYPDQLDEYRKEMTENGWDFGQDDEELFELAMHDRQYRDYKSGIAKKRFEDD LQRAKDAALAKQGFSEEEVKRMKRAKAEPVTAMEKGQIIWEIDVESPSMPPEVGHKYDPD DVFCYIATPWHTYDKVLANFSGRVIEVCAKQGALVDKGEPLAYIERCEEPA >gi|226332294|gb|ACIC01000026.1| GENE 11 11480 - 12226 377 248 aa, chain - ## HITS:1 COG:CC0325 KEGG:ns NR:ns ## COG: CC0325 COG0744 # Protein_GI_number: 16124580 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Caulobacter vibrioides # 26 219 25 212 229 214 55.0 1e-55 MHRPLPIKKILRYARNLLIFFFASTILAVIVYRFMPVYVTPLMVIRSVQQLASGDKPTWK HTWVSFDKISPHLPMAVIASEDNRFAEHNGFDFIEIEKAMKENEKRKRKRGASTISQQTA KNVFLWPQSSWVRKGFEVYFTFLIELFWSKERIMEVYLNSIEMGKGIYGAQATAKYKFNT TAAKLSSGQCALIAATLPNPIRFNSAKPSAYLLKRQKQILRLMNLVPKFPPVEKKAVDKK DTRKKKKK >gi|226332294|gb|ACIC01000026.1| GENE 12 12204 - 
12920 922 238 aa, chain - ## HITS:1 COG:FN1265 KEGG:ns NR:ns ## COG: FN1265 COG2885 # Protein_GI_number: 19704600 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Fusobacterium nucleatum # 47 227 37 202 202 80 37.0 2e-15 MNKMKWMVLFMSITMIFSGCASMNNTGKGAAIGGGSGAALGAILGGVIGKGKGAAIGAAI GTAVGAGTGALIGKKMDKAAAEAKQIEGAQVEQITDNNGLQAVKVTFDSGILFTTGNATL SAAAKSALSKFANNVLNQNRDMDVSIYGYTDNQGWRNSTAEQSRQKNLNLSQERAQSVAS YLLNCGVSSNQIKGVQGMGESDPIASNDTAAGREQNRRVEVYMYASEQMIKDAQAASN >gi|226332294|gb|ACIC01000026.1| GENE 13 13066 - 14421 1292 451 aa, chain - ## HITS:1 COG:TM0815 KEGG:ns NR:ns ## COG: TM0815 COG0534 # Protein_GI_number: 15643578 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Thermotoga maritima # 12 435 18 444 464 135 25.0 2e-31 MQGIKNLTHGPINRQLFNLAMPIMATSFIQMAYSLTDMAWVGRLGSEAVAAIGSVGILTW MSGSISLLNKVGSEVSVGQSIGAQNQEDARQFASHNITIALIISICWGGLLFMFASPIIR IYELEEHITANAIEYLRIISTALPFIFLSAAFTGIYNAAGRSKIPFYISGTGLIMNIILD PLFIFGFGLGTNGAAYATWLSQATVFAIFIYQLRFRKDALLGGFPFFSRLKKKYTRRILK LGLPVATLNTLFAFVNMFLCRTASEQGGHIGLMTFTTGGQIEAITWNTSQGFSTALSAFI AQNFAAGRIERVLKAWHTTLWMTGIFGTFCTLLFVFYGNEVFSIFVPEQAAYEAGGVFLR IDGYSQLFMMLEITMQGVFYGIGRTVPPAIVSIVCNYMRIPLAILFVRMGMGVEGIWWAV CVTTIAKGLILLGWFTMIRKKCLNLPSVSTR >gi|226332294|gb|ACIC01000026.1| GENE 14 14513 - 15904 1413 463 aa, chain + ## HITS:1 COG:CC2313 KEGG:ns NR:ns ## COG: CC2313 COG0657 # Protein_GI_number: 16126552 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Caulobacter vibrioides # 35 244 82 305 328 144 38.0 4e-34 MKQKFSSLFVLLVLLLTGLQVSAQSTPKPFDIEQPSLRVFLPAPELATGRAVVACPGGGY SGLAVNHEGYDWAPYFNKQGIALIVLKYRMPKGDRTLPISDAEAAMKMVRDSADVWNLNP NDIGIMGSSAGGHLASTIATHAPEALRPNFQILFYPVITMDKSFTHMGSHDNLLGKDASA DLEKEFSNEKQVTKETPRAFIVYSDDDKVVPPANGVNYYLALNKKGVPSVLHIYPTGGHG WGIREDFLYKSEMQNELTSWLRSFKAPRKDAVRVACIGNSITFGAGIRNRSRDSYPSVLA 
RMLGDSYWVKNFGVSARTMLNKGDHPYMNEPAYKNALAFNPNIVVIKLGTNDSKSFNWKY KADFMKDAQTMIDAFKGLPSQPKIYLCYPSKAYLTGDGINDDIISKEIIPMIKKLAKKND LSVIDLHTAMDGMPELFPDRIHPNEKGAQVMAKAVYQSISTLK >gi|226332294|gb|ACIC01000026.1| GENE 15 15986 - 16756 262 256 aa, chain + ## HITS:1 COG:no KEGG:BT_1191 NR:ns ## KEGG: BT_1191 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 256 1 256 256 511 100.0 1e-143 MIRTVAVFVFVSFACVATAFSQTIEKENKIDVQKKNSSAEVSGRQLPEAKNIEPKLPDIE AKKLATDDKSMQTYGLTPSSPASGLSPYSYDFPALLESNPTLTDFYNFRQHSMNGQLSFI GSAEQKTYLGLGQYVAVNAGIRWIPSQRFFVEAGGDLSRQFYLALPIARQDIAGIYTRMR YSLVRNVRLNVWGQYFMGGNTPPPAYNPLFPHTGVGASISVDVKKNTEVSVGAEYQYDKK SREWKMETSGRVSIGF >gi|226332294|gb|ACIC01000026.1| GENE 16 17124 - 18107 558 327 aa, chain + ## HITS:1 COG:MA3445 KEGG:ns NR:ns ## COG: MA3445 COG1162 # Protein_GI_number: 20092257 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Methanosarcina acetivorans str.C2A # 4 324 43 363 369 202 38.0 8e-52 MSHGRIIVTHKTCYEVVAEDGVYLCELTGNMIYGRVPDEYPCTGDWVIFQPFDANKGIIV DILPRERALYRKKNGRVADRQAIASYVDKAFIVQSLDDNFNVRRAERFIAQVMEEKIKPV LVLNKADLGCDRQKIDEAIKHIARQFPVFITSIRQPQTILRLRESITKGETVVFVGSSGV GKSSLVNALCGKSVLNTSDISLSTGKGRHTSTRREMVLMDGSGVLIDTPGVREFGLAIDN PDSLTEMFEISDYAESCRFSDCKHIDEPGCAVLEAVHNGTLDHKVYESYLKLKREAWHFS ASEHEKRKKEKSFTKLVEEVKKRKANF >gi|226332294|gb|ACIC01000026.1| GENE 17 18220 - 18690 425 156 aa, chain - ## HITS:1 COG:no KEGG:BT_1189 NR:ns ## KEGG: BT_1189 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 156 1 156 156 293 98.0 2e-78 MAKISEIMLLQQPEQPALAIEVQTNMKGMSQAIGENFVRIDSLFKKQGEVTTDIPFVEYP DFESLTEDRIEMIIGLKSSKPLQGDEKIQSVILPARRIVVCLHRGNYNELAQLYNEMTEW IKTNGYKASGTSIEYYYSNPDVPEEEHVTRVEMPLL >gi|226332294|gb|ACIC01000026.1| GENE 18 18711 - 19844 448 377 aa, chain - ## HITS:1 COG:BH3506_1 KEGG:ns NR:ns ## COG: BH3506_1 COG2207 # Protein_GI_number: 15616068 # Func_class: K Transcription # Function: AraC-type 
DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 109 213 7 111 130 68 34.0 2e-11 MVESEINKRYCQSCGMPLRFDVEEYLGTNSDGSRSDEFCYYCLKDGKYIVDISMWEMIDI WIKYTDKYNEYADTDYSPKELREILDKRLPTLNRWRQKQETSSLHHKMIQNIIVYINGHL TEVLNTDTLSSMSGLSKFHFRRVFRTATGENIGSYIQRLRMEHVAHLLISTDYTLKQIIE QTSYQTKYSIAKAFKKHFGISTSQYREKHRPNGENPATNIKPEIKVISPIKIFCIEVGEA YKNKLKYRLLWNKLLHHAEQYEADPRYDKFISLSMDDPSITPTEKCRFYLGITLRDDTKA RPVPGIMQIPGGRYAVFRHTGSYSSLHKVYRSIYEEWFPKSKYHPQSTFSFEMYMNHPST TETSELLTEIYIPVIRK Prediction of potential genes in microbial genomes Time: Thu May 12 00:16:31 2011 Seq name: gi|226332293|gb|ACIC01000027.1| Bacteroides sp. 1_1_6 cont1.27, whole genome shotgun sequence Length of sequence - 26404 bp Number of predicted genes - 22, with homology - 21 Number of transcription units - 9, operones - 3 average op.length - 5.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 16 - 75 3.4 1 1 Tu 1 . + CDS 115 - 1803 1914 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Term 1850 - 1893 7.5 + Prom 1882 - 1941 5.7 2 2 Tu 1 . + CDS 2020 - 5292 2059 ## BT_1185 OmpA-related protein + Term 5334 - 5380 12.2 3 3 Op 1 . - CDS 5407 - 6519 702 ## COG3021 Uncharacterized protein conserved in bacteria 4 3 Op 2 . - CDS 6521 - 9259 2180 ## COG0642 Signal transduction histidine kinase 5 3 Op 3 . - CDS 9327 - 10532 1233 ## COG1215 Glycosyltransferases, probably involved in cell wall biogenesis 6 3 Op 4 . - CDS 10570 - 12072 757 ## BT_1182 hypothetical protein 7 3 Op 5 26/0.000 - CDS 12097 - 12948 445 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 8 3 Op 6 4/0.000 - CDS 12960 - 14114 799 ## COG0438 Glycosyltransferase 9 3 Op 7 . - CDS 14118 - 15029 730 ## COG1442 Lipopolysaccharide biosynthesis proteins, LPS:glycosyltransferases 10 3 Op 8 . - CDS 15042 - 16172 615 ## BT_1178 hypothetical protein 11 3 Op 9 8/0.000 - CDS 16169 - 17644 818 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 12 3 Op 10 . 
- CDS 17668 - 18612 319 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 13 3 Op 11 . - CDS 18659 - 19777 683 ## BT_1175 hypothetical protein - Prom 19814 - 19873 1.8 14 4 Tu 1 . - CDS 19914 - 20219 315 ## BT_1174 hypothetical protein - Prom 20354 - 20413 4.2 + Prom 20170 - 20229 3.4 15 5 Tu 1 . + CDS 20371 - 21357 626 ## BT_1173 hypothetical protein + Term 21387 - 21425 -0.8 + Prom 21369 - 21428 1.6 16 6 Tu 1 . + CDS 21616 - 22059 506 ## COG0346 Lactoylglutathione lyase and related lyases 17 7 Tu 1 . - CDS 22350 - 22607 230 ## BT_1170 hypothetical protein - Prom 22636 - 22695 2.9 18 8 Op 1 . + CDS 23013 - 23561 576 ## BT_1517 hypothetical protein 19 8 Op 2 . + CDS 23631 - 23732 85 ## 20 8 Op 3 . + CDS 23719 - 24162 337 ## COG3023 Negative regulator of beta-lactamase expression + Term 24238 - 24283 2.3 21 9 Op 1 . - CDS 24318 - 25505 1262 ## COG1215 Glycosyltransferases, probably involved in cell wall biogenesis 22 9 Op 2 . - CDS 25525 - 26403 487 ## COG1216 Predicted glycosyltransferases Predicted protein(s) >gi|226332293|gb|ACIC01000027.1| GENE 1 115 - 1803 1914 562 aa, chain + ## HITS:1 COG:RSc2913 KEGG:ns NR:ns ## COG: RSc2913 COG0488 # Protein_GI_number: 17547632 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Ralstonia solanacearum # 8 557 5 551 555 635 57.0 0 MADDKKIIFSMVGVSKAFQPNKNVLKDIYLSFFYGAKIGIIGLNGSGKSTLLKIIAGLEK SYQGEVVFSPGYSVGYLAQEPYLDDTKTVKEVVMEGVQPIVDALAEYEEINQKFGLPEYY EDQDKMDILFARQGELQDIIDATDAWNLDSKLERAMDALRCPPEDQPVVNLSGGERRRVA LCRLLLQKPDILLLDEPTNHLDAESIDWLEQHLQQYEGTVIAVTHDRYFLDHVAGWILEL DRGEGIPWKGNYSSWLEQKTKRMEMEEKTASKRRKTLERELEWVRMAPKARQAKGKARLN SYDKLLNEDLKEKEEKLEIFIPNGPRLGNKVIEAKHVAKAYGDKLLFDDLNFMLPPNGIV GVIGPNGAGKTTLFRLIMGLETVDKGEFEVGETVKVSYVDQQHKDIDPNKSVYQVISGGN DLIRMGGRDINARAYLSRFNFAGADQEKLCGVLSGGERNRLHLAIALKEEGNVLLLDEPT NDIDVNTLRALEEGLEDFAGCAVVISHDRWFLDRICTHILAFEGDSNVFYFEGSYSEYEE NKLKRLGNEEPKRVRYRNLMTD 
>gi|226332293|gb|ACIC01000027.1| GENE 2 2020 - 5292 2059 1090 aa, chain + ## HITS:1 COG:no KEGG:BT_1185 NR:ns ## KEGG: BT_1185 # Name: not_defined # Def: OmpA-related protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1090 1 1090 1090 2087 100.0 0 MLKRMRSFLVVVMLLITATMSAQVTTASMSGKVTAQDEPIIGATIVAIHEPSGTRYGTVT NVSGQFNLQGMRTGGPYKVEITYVGYQSAIYKGIQLSLGENYVLNVSLKESSELLDEVVV TAPKTIEKIGTVTNVSERQMNTLPTINRSITDFTKLSPYAGGSSSFAGRDGRYNNITIDG ASFNNSFGLSSNNLPGGDSQPISLDAIDEITVNVSPYDVKYSNFTGASINAVTKSGTNSF SGTAYTYHKPRGFAGSSIDGEDILKAKEYNSHSYGVTFGGPIIKNKLFFFLNAEIENKET PGVTWLPNQDQNGTGDSGKRISRTWVGDMRTISDFVKTKYDYDPGQYENFNNFQSDNWKL MARIDWNIHQNHKLTVRFNSVSSEDDREVSAKSTIITSTNSNRYGLDAFSFGNSNYKMKN VVTSITGELNSRFSNNVQNKLLATYTHISDTREQKGSDFPFVDIYKDGKQYITLGTEVFT PNNQVINNVFSLVDNVSISLGKHELLAGVSFERQYFKNSYLRGPYGYYRYDSMDDFMQGK VPTIYGITYGYNGNDAPGSELTFGMAGVYAQDVWSLTPNFRLTYGLRLDMPIYMNSLDSN KSIEELAFVNNTHVDISKWPKTQVLFSPRVGFNWDVKGDRSVIVSGGTGIFTGLLPFVWF TNQPSNSGLYQNMVEYNTQKNPGSLPADFGFNPNYRETLQKYPSLFPSTPSEQAPDVIAY VDPKFKMPQVWRSNLNVDIQLPYDFMLSVGAMYTRDIYNVAQINMNEAEPTGVYNEQPDR IYWASKKYEYNDYTNGKNVVVKLSNGEDKGYQYSFNAILTKKYDFGFTGSIGYTYTMAKD LTANPGSAPNSAWQNNVAVNSLNDPGVSYSLFSTPHRIIANASYEINYAKCLKTTFSLFY SGYHTGRYSYTYYNDMNGDGNYSDLIYVPNSQDEMTFVDITDKSGAITYSAVDQAKDFWD FVNNDSYLKDRKGKYVERNGSLTPWINRFDFKIAQDFYATLGGRKYGIQVSLDMLNVGNM INSKWGAYQSCGLKSYDNVQLLKTASKVGEPLTYQINANSREAFKERSSWQYTNTIGSAW SMQLGVKVTF >gi|226332293|gb|ACIC01000027.1| GENE 3 5407 - 6519 702 370 aa, chain - ## HITS:1 COG:DR0632 KEGG:ns NR:ns ## COG: DR0632 COG3021 # Protein_GI_number: 15805659 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Deinococcus radiodurans # 187 368 159 325 329 63 28.0 9e-10 MKALYRLLYYLSILLTSILAGFTLAGAFAGDASPGGFKLMPFIGLVLPMLLLANVVSAIY WTVRWRCWVFIPLIAILGNLGYLTRVLQSPLFDSASQPNTESLKKNERATESALTIATYN VNSFNHEHTGFSCKEIAAYMKELGVDIFCFQEFGINHEFGTDSLRTVLSEWPYYYVPSSP AGESLLQLAVFSRYPIKEKQLVTYPNSNNCSLWCDIDINGQTIRLFNNHLQTTEVSRNKR KLEKELRADDTDRAERAALTLADGLHENFKKRAAQAEHINQLISDSPYPTLVCGDFNSLP 
SSYVYQTVKGEKMNDGFQTCGHGYMYTFRYFKHLLRIDYILHSSEFQGVDYFSPDLEYSD HNPVVMRMRL >gi|226332293|gb|ACIC01000027.1| GENE 4 6521 - 9259 2180 912 aa, chain - ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 486 732 61 319 328 169 39.0 2e-41 MNRTLHEADNAQRILKLTADTMLLVDRNGICIDIDIHSNLWFLQEERLLGKNIFERMPEH TREKVYSTFQTVLREQRSISKNYKLELEDNTYYFKCIMYPYDDMVLCQYRDITQRSNVKR QLEQVNRNLREIQKVAQIGQWMYNTRDNVFHYMGYTGVLCEESSRSISLEKYQEFIVEED RLSFTSWCRKNEEGLNEESISYRVRVAGEIYYMRLQAYLRDELPNGSCNIEGYIQNITDI QRRRNDINTLTHAINNAKESIFAAKEDGTLIFANRQFLHNHGIPESEEIGKLKIYEIAGD MNTQEDWHERCKDILHGGSRSFVAHHPSKVSKNILAYEGIMYNVTNDSGEESYWSFSHDI SERLHYEAQIKRLNLIMDTTIDNLPAGIVVKEINNDFRYIYRNRESYNRNLCIGESIGKN DFDYYPPEVAEKKREEDTLVATTGRGLHWTTEGKDKNGNEIILDKRKIRVDGDELSSPII VSIEWDITELEKIKRELQTSKEKAEMSDKLKSAFLANMSHEIRTPLNAIVGFSHLIAESE NKEERKTFYEIVEANNERLLQLINEILDLSKIESGIIEFTSAPININSLCKEVHDAHVFR TPQGVQLIYEPSENGLVIETDKNRVFQVFSNLIGNAFKFTKAGSISYGYKLVGNQIVFHV TDTGTGIEPEKIGRVFERFAKLNNYAQGTGLGLSICKTIVERLGGEISVSSVVGQGTTFT FTLPYESEQSTENTKEEKRQEGNANENPTGTGSMKIDPASTETPAETSVKEDSEKTSGST DDDSSGSISANASPSQRTILIAEDEDSNFDLLKAILGRKYRLIRARDGMEAVTSYDEAKP DLILMDIKMPNLDGLEATKIIRELSHTVPIIAQSAYAYQQDQIAAMDAGCSDFIAKPISQ AKLKDMINKWLP >gi|226332293|gb|ACIC01000027.1| GENE 5 9327 - 10532 1233 401 aa, chain - ## HITS:1 COG:PAE0419 KEGG:ns NR:ns ## COG: PAE0419 COG1215 # Protein_GI_number: 18311929 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases, probably involved in cell wall biogenesis # Organism: Pyrobaculum aerophilum # 54 307 42 295 365 81 27.0 3e-15 MNETLILEIIFWSALFMVFYTYLGYGIVLYALVKLKELFVKPVKRVLPPDSDLPEVTLFI TAFNEEDVVDEKMENSLALDYPADKLRIVWVTDGSDDGTNDRLQTRWNGKATVYFQPKRQ GKTAAMTRGMTLVRTPLVVFTDANTMVNSEAIHEIVLAFQNPKVGCVAGEKRIAVQTKDG AAAGGEGIYWKYESTLKALDSRLYSAVGAAGELFAVRRELFEPMEPDTLLDDFILSLRIA MKGYTIAYCTNAYAIESGSADMGEEEKRKVRIAAGGLQSIWRLRPLLNPFHYGILSFQYT 
SHRVLRWSVTPFLLFALLPLNIVVLLLGESPLFYGTLLGLQILFYGMGYWGYYLSTRQIK NKVLFIPYYFLFMNVNVLKGIGYLKRKKGSGAWEKAKRAEK >gi|226332293|gb|ACIC01000027.1| GENE 6 10570 - 12072 757 500 aa, chain - ## HITS:1 COG:no KEGG:BT_1182 NR:ns ## KEGG: BT_1182 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 500 1 500 500 1001 100.0 0 MAKYIVNIILVTACLFTYTSCIRKMNLYDENRNTGQDNNEKDTDSTPVYMYPFDKETANV TAEIIIHTKKEIQTKDINVEIPHLKYNKTWMLMLTQDDCMQAAFCRTWAAINGKPISSSI PYPTPTPTDQNMKHQLYFDIKHLQKGDLPPTIISANQSLGCTDGAGNEVRFAITTTLAPE EKWMDAETNVLPGFTANYYRFYMKSGLIWDNIREMLNYGTGIALHDVMTKDANDPIQILE HFDIAQNIIVEKLTGRGCKFLAEPNGNKTYVTAALQCPEIQTMTAQAGAITLYPYQVSET LQKSLIEREFNDSPAYFKQQITDLLKLPKEERKAICIGVHGTDNNWIDFLLWLNSNYGKD GDDSLWFPSQEEYYEYNYYRLNSHISIAQIDASSFKLTVNLPGEKFFYYPSTTINLPGIS MYDIVSIEGNDALTGLSYADYKDGIMLNIDCRKYLFEHAENFVKRYEANPSDASNKADAL YFVNILKESAKKEALKKRLQ >gi|226332293|gb|ACIC01000027.1| GENE 7 12097 - 12948 445 283 aa, chain - ## HITS:1 COG:TVN0223 KEGG:ns NR:ns ## COG: TVN0223 COG0463 # Protein_GI_number: 13541054 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Thermoplasma volcanium # 41 144 9 110 226 72 42.0 1e-12 MVKWYKTYLEVYGKSFDEISDDSIRKVKRQLKERQNDSPLVSVVLIAHNEERHLISCIWS LSENHCHFPIEIIAVNNNSTDKTEEILKSLGVTYYNETKKGPGYARQCGLDHAKGKYHIC IDADTLYPPHYIETHVKYLMNPKVACTFSLWSFMVDERHSRFGLWCYEGLRDIYLRLQAI QRPELCVRGMAFSFNTEIGRRFGFRTDIIRGEDGSLALAMKPYGKIVFITSRKARVLTSN GTLNSEGSLIDNIWIRVIKALKEITGLFIKKTAYKDKDSNLIK >gi|226332293|gb|ACIC01000027.1| GENE 8 12960 - 14114 799 384 aa, chain - ## HITS:1 COG:SMb21250 KEGG:ns NR:ns ## COG: SMb21250 COG0438 # Protein_GI_number: 16264502 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Sinorhizobium meliloti # 2 384 50 413 427 179 32.0 6e-45 MKIAYYLPSLQAPGGIERIVTFKANYFAEHFEGYDITIITSEQMGKAPHFPLSPKVKHID LGVSFDLPYSQSIVSKVLKYPFRYYRFKQRLSNLLNELKPDITISTMRRELNFITKLKDG 
SLKIGEFHVSRYAYGAEALKRKSPIVNMIKKRWANRFVKNLSKLSRVIILTSEGAKDWPE LTNISIIPNPISTPVEGKQTDILSHNAIAVGRYAPQKGFDMLIPAWSIVAQRHPDWKLHI YGEGDLKEKFTKLIDELQLNNNCLLHHTVSNIAEKYCMSSIFVLSSRYEGFGLVLAEAMS CGIPCVSFDCPHGPSDIIKDHEDGLLVEKENIKELADKICYLIENENVRIKMGHKARENV KRFLPENVMPQWKNLFESLTYSSK >gi|226332293|gb|ACIC01000027.1| GENE 9 14118 - 15029 730 303 aa, chain - ## HITS:1 COG:SP1767 KEGG:ns NR:ns ## COG: SP1767 COG1442 # Protein_GI_number: 15901598 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipopolysaccharide biosynthesis proteins, LPS:glycosyltransferases # Organism: Streptococcus pneumoniae TIGR4 # 39 302 549 812 814 201 40.0 1e-51 MNIQPLINAIRYKLRYGITVEWYNFWYMALRCHQKITPQVASIDETIRIIIEDQCSVSRF GDGEMLLTNPDKEIGFQKGNVLLAQRLKEVLTSHEEGHLVCISDTFENIYRYNRKARRFW RTHFFLYGSWWDRLLLPDRLYYNTFITRPYMDFASKEDCPRWFHQMKAIWKDRDVVFIEG EKSRLGVGNDLFDNVKSIRRILCPPRDAFDRYEEILNEAQKLEKTALFLIALGPTATVLA YDLFKSGYQAIDAGHVDVEYEWWRMGAKRKVKLERKYVNEAANGKQVSDAGEFYRKQIIA KIY >gi|226332293|gb|ACIC01000027.1| GENE 10 15042 - 16172 615 376 aa, chain - ## HITS:1 COG:no KEGG:BT_1178 NR:ns ## KEGG: BT_1178 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 376 1 376 376 765 100.0 0 MKGLFLIFHGFEAFNGISKKIRYQVKALKECGLEMHTCWLDDTDNHKRRMVDESIIADYG FGIKGKILKRIEFDSIVHYVQKENIDFIYVRYVHNASPFSIRLMKLLKKTGARIVMEIPT YPYDQEYKGLPFVYQRILFIDKCFRQHLARYVDKIVTFSDYDIIWNRPTIRISNGIDFSE IPLRGPKNDTAHSLQLIAVATMHPWHGFDRVIAGMANYYRTHTDTHDYKVYLNIVGFGVP ELVDSYKKDVAKHQLEKYIIFHSALYGKELDAVFEQSDMGIGSLARHRSGIDKIKTLKNR EYAARGIPFVYSETDDDFEHQPYILKAAPDDSPLDIEKVIRFYQSLKTTPLQIRMSIEQS LSWKAQMQIVINETFE >gi|226332293|gb|ACIC01000027.1| GENE 11 16169 - 17644 818 491 aa, chain - ## HITS:1 COG:L13324 KEGG:ns NR:ns ## COG: L13324 COG2244 # Protein_GI_number: 15672194 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Lactococcus lactis # 5 481 1 474 475 217 30.0 3e-56 
MANNIKHQLFSGVFYTALAKYSGIIISLVVAGILARLLSPDDFGVVAIATVIIAFFNLFT DMGISPAIVQHKSLTKEELSDIFSFTVWTGIGISILFFAASWLIADYYESGILRTLCQLL SVNLFFASANIVPGALFYRNKEFKFIAVRSFIIQIAGGAGAITAALCGAGLYALIINPIL SSILIFVISYQRYPQRLRFTLGLKVLRKIFSYSAYQFLFNVINYFSRNLDKLLIGKYMSM SDLGYYEKSYRLMMLPLQNITQVITPVMHPIFSDFQNDKEKLATSYERIVRFLSFIGLPL SVLFFFTAEEVTLIIFGDQWLPSVPVFRILSLSVGIQIILSSSGSIFQAAGDTRSLFVCG LFSSILNVTGMLTGIFYFGTLTAVAICIVITFTINFVQCYWMMYRVTFQRNAWIFIRQLL SPLLVSLLITAFLTPIYYMMENMNIFVTLIAKSFVSFIIFGTYMQVTHTYDITGKVKTLV RNRKLVQKNKQ >gi|226332293|gb|ACIC01000027.1| GENE 12 17668 - 18612 319 314 aa, chain - ## HITS:1 COG:YPO0187 KEGG:ns NR:ns ## COG: YPO0187 COG0463 # Protein_GI_number: 16120528 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Yersinia pestis # 6 251 4 250 329 135 33.0 8e-32 MENEEIKVSVVIPVYNTEKYVREAVESIMNQTLRELEIIIINDGSTDNSLQVVEELAAAD SRIQVYSQSNQGLSMARNAGITHAHGRYIYFMDSDDLLEKDAMELCYSKCEEKELDFVFF DAQSFFEEDIRNAPVLNYQRTDKLKDKVYTGPEAINIQLQHKTYTPSACLNVIRTAFLKE RQLLFYPDIVHEDQLFTTLLYLQARRTSCIQRSFFHRRIRKNSIMTCDFSLKNLKGYLTV TQEIVKFKQQTSENKIKDTIDLYLSQMLNAVIWQAHVLPYPYRLRLAWRCLFRYKKLISI RSIGVLLFKSIIHK >gi|226332293|gb|ACIC01000027.1| GENE 13 18659 - 19777 683 372 aa, chain - ## HITS:1 COG:no KEGG:BT_1175 NR:ns ## KEGG: BT_1175 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 372 1 372 372 724 95.0 0 MNDNEIGNPPIKALIVFSENSDTDNLYVPILCDAIRTTGIDVRCSTNEFWDSDTPYDIIH FQWLEEVIGWNCDDPDRIRRLEERIAFFRSRGARFVYTRHNVRPHDANEVIGRAYDIIEG QSDVVVHMGRYSLDEFAAKHADSRNVIIPHHIYQYTYKENISVERARQYLNLPQEAFIVT SFGKFRNREERRMVTGAFRKWDEAKKFLLAPRLYPFSRFNRYGSNFFKRWASRAGYYLLI PLLNRLRRMHAGASDEPIDNCDLPYYMAASDVIFIQRKDILNSASIPLAFLFHKVVIGPN IGNIGEILQDTGNPVFHPDNKFDIIRALEAARQLSARRKGEDNYAYAIENMNSGKIGKEY AELYRNLANLSL >gi|226332293|gb|ACIC01000027.1| GENE 14 19914 - 20219 315 101 aa, chain - ## HITS:1 COG:no KEGG:BT_1174 NR:ns ## KEGG: BT_1174 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 101 47 147 147 183 100.0 1e-45 MKGYERATKEEINDRLRIEENCHAQVERIIYIRHLCNLNLEEAADVTNLSISTLSRYENE VTKCSVQSLITIYYHYQKYLYEQHIPFDKNLFLIDMNSFHN >gi|226332293|gb|ACIC01000027.1| GENE 15 20371 - 21357 626 328 aa, chain + ## HITS:1 COG:no KEGG:BT_1173 NR:ns ## KEGG: BT_1173 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 328 1 328 328 638 100.0 0 MKKDQYFNLEVNFLSNDNIVRMMMELDAAQSLGIYMMLLTHLRKADNYEASCCPLCMGAF ARMYDLDAGVLQRVLREFDLFVLDEERQMFRSPYLDRVMARLEERRMINVANGKKGGRPK RMESTPETPMDKGEKPNQNQKSREEERRVTTVVKDNNSSNEEKTEKENSAAASDRLQIVS PPEDRGQLPLQPVLPWEMLVDQMAESESYMESMAMHSCMGKLFVERQEEVIAAFKEHIRR FGREDGLLFLRDVKLYFGNYLSPGGRPAQELKKRLLGSCRQSAEEELYRFEQLIGGQRTY LGHPIPKDAPPRPDRSAVWDEVHWKWGN >gi|226332293|gb|ACIC01000027.1| GENE 16 21616 - 22059 506 147 aa, chain + ## HITS:1 COG:alr2922 KEGG:ns NR:ns ## COG: alr2922 COG0346 # Protein_GI_number: 17230414 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Nostoc sp. 
PCC 7120 # 1 131 3 132 135 113 39.0 1e-25 MRFSNVRLLVKDFAGCFKFYTEQLGLEPAWGDENSGYASFKVADGIEGLALFVSDWMAPS AGNADKQQPVGMREKLMISFSVDNVDETFVALKAKGVTFISEPTDMPDWGMRTLYLRDPE ENLIELFTPLAPEKFSQELIEEDQKFH >gi|226332293|gb|ACIC01000027.1| GENE 17 22350 - 22607 230 85 aa, chain - ## HITS:1 COG:no KEGG:BT_1170 NR:ns ## KEGG: BT_1170 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 85 1 85 85 152 100.0 4e-36 MKANTDTLQKQEEEKSFQYRSYGKGELAMLYIPNVQQQSAVDRFNEWIEAAPGLKERLLS TGMNPRSRHYTPAQVRLIVEVLQEP >gi|226332293|gb|ACIC01000027.1| GENE 18 23013 - 23561 576 182 aa, chain + ## HITS:1 COG:no KEGG:BT_1517 NR:ns ## KEGG: BT_1517 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 142 3 140 176 125 46.0 5e-28 MATVSVVRYQRKKKIGDDKSPMVYVLKPKPGESKLYSIESLAREIESIGSLSVEDVEHVM QSFVRSMKKVLVAGNKVKVDGLGIFYTTLTCPGVEQEKDCTVRNITRVNLRFKVDNSLRL ANDSTATTRGGDNNMVFELISKNAPASGGSGDGGGSGSGGGSGGGSGSGGDGGEEEAPDP TV >gi|226332293|gb|ACIC01000027.1| GENE 19 23631 - 23732 85 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIDKILEIALEILFFFRRKRKGKRTKGQVDEED >gi|226332293|gb|ACIC01000027.1| GENE 20 23719 - 24162 337 147 aa, chain + ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 46 141 2 97 116 105 50.0 4e-23 MRKIDLIVIHCSASRADRDFTETDLEVCHRRRGFNGIGYHFYIRKNGDIKTTRPIERIGA HAKGFNQTSIGICYEGGLDCHGRPADTRTEWQIHSMHVLVLTLLRDYPGCRVCGHRDLSP DLNGNGEIEPEEWVKACPCFDAEEEWA >gi|226332293|gb|ACIC01000027.1| GENE 21 24318 - 25505 1262 395 aa, chain - ## HITS:1 COG:CAC1691 KEGG:ns NR:ns ## COG: CAC1691 COG1215 # Protein_GI_number: 15894968 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases, probably involved in cell wall biogenesis # Organism: Clostridium acetobutylicum # 42 262 47 271 425 77 26.0 3e-14 
MYTFIDWILFILLALCVGYLLFYAIASKFYRPRKLSEARILRRFLVLFPAYKEDRVIVST IRNFLEQEYPKEMYDVLVISDQMQPDTNAALQTLPICLQVADYTNSSKAKALTLAMNVTA NESYDVVVIMDADNVTTPNFLAEINRAFESGLHAVQAHRTGKNMNTDIAVLDAISEEINN GFFRSGHNAIGLSAGLAGSGMAFDVHWFRRNVGHLQTSGEDKELEALLLKQRIHIEYLEQ LPVYDEKTQKKEGIKNQRKRWIAAQYGALRASLPDFPKALIQGNIDYCDKILQWMLPPRL IQLAGVFGFTLLFTAIGLLMSLQSGGYEWYTAIKWWILSAAQVAAMIIPVPGKLLNRKLG KAILQIPTLALAMIANMFRLKGTNKKFIHTEHGEE >gi|226332293|gb|ACIC01000027.1| GENE 22 25525 - 26403 487 292 aa, chain - ## HITS:1 COG:CAC3069 KEGG:ns NR:ns ## COG: CAC3069 COG1216 # Protein_GI_number: 15896320 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Clostridium acetobutylicum # 3 252 7 258 299 174 36.0 1e-43 SFITVCYNGFKDTCELIESLQTHVHSVSYEIIVVDNASREDEATKIKELYSDIVTLRSES NLGFSGGNNLGIRVAKGAYIFLINNDTYVESDGFHYLTERLESQKNIGAVSPKIRFAFPP QNIQFAGFTSLSAITIRNEMTGFGCPDDGTFDTPHTTPYLHGAALMVKREVIEKVGEMPE IFFLYYEEMDWCTQMTNAGYELWYEPRCTVFHKESQSTGQFSKLRTFYMTRNRLLYTRRN RKGAQRLLSILYQSTIAAGKNSLQFAVKGRFDLFAAVWKGVVRSFSVTSPIR Prediction of potential genes in microbial genomes Time: Thu May 12 00:17:25 2011 Seq name: gi|226332292|gb|ACIC01000028.1| Bacteroides sp. 1_1_6 cont1.28, whole genome shotgun sequence Length of sequence - 34411 bp Number of predicted genes - 29, with homology - 29 Number of transcription units - 13, operones - 7 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 41 - 1522 680 ## BT_1165 hypothetical protein 2 1 Op 2 . - CDS 1512 - 3740 1325 ## BT_1164 hypothetical protein 3 1 Op 3 . - CDS 3751 - 4500 523 ## BT_1163 hypothetical protein - Prom 4550 - 4609 6.3 4 2 Tu 1 . - CDS 4626 - 5852 1206 ## BT_1162 hypothetical protein - Prom 5908 - 5967 5.6 + Prom 5867 - 5926 5.7 5 3 Tu 1 . 
+ CDS 6057 - 7457 1219 ## COG3579 Aminopeptidase C + Term 7541 - 7581 7.1 + Prom 7547 - 7606 6.1 6 4 Op 1 7/0.000 + CDS 7646 - 8995 1348 ## COG1726 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrA 7 4 Op 2 9/0.000 + CDS 9044 - 10216 1237 ## COG1805 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrB 8 4 Op 3 9/0.000 + CDS 10223 - 10903 909 ## COG2869 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrC 9 4 Op 4 9/0.000 + CDS 10924 - 11565 761 ## COG1347 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrD + Prom 11567 - 11626 3.1 10 4 Op 5 7/0.000 + CDS 11649 - 12275 820 ## COG2209 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrE 11 4 Op 6 . + CDS 12294 - 13568 1538 ## COG2871 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrF 12 5 Tu 1 . + CDS 13646 - 15058 1014 ## COG0513 Superfamily II DNA and RNA helicases + Term 15248 - 15297 -0.4 13 6 Op 1 6/0.000 + CDS 15633 - 16700 1073 ## COG1932 Phosphoserine aminotransferase + Prom 16709 - 16768 5.3 14 6 Op 2 2/0.000 + CDS 16815 - 17735 1170 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases + Term 17834 - 17868 4.2 + Prom 17760 - 17819 3.2 15 6 Op 3 . + CDS 17893 - 19140 1241 ## COG4198 Uncharacterized conserved protein + Term 19163 - 19226 18.4 + Prom 19175 - 19234 3.7 16 7 Op 1 . + CDS 19396 - 20004 435 ## COG1739 Uncharacterized conserved protein 17 7 Op 2 . + CDS 20026 - 21111 1048 ## BT_1149 type II restriction enzyme HpaII 18 7 Op 3 . + CDS 21154 - 21723 506 ## COG0778 Nitroreductase + Term 21853 - 21901 16.0 - Term 21841 - 21889 16.0 19 8 Op 1 . - CDS 21926 - 24775 2821 ## COG1003 Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain - Prom 24830 - 24889 3.0 20 8 Op 2 . - CDS 24898 - 25509 426 ## COG0491 Zn-dependent hydrolases, including glyoxylases 21 8 Op 3 . 
- CDS 25558 - 26178 639 ## COG0357 Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division - Prom 26200 - 26259 2.1 22 9 Tu 1 . - CDS 26292 - 27143 753 ## BT_1144 hypothetical protein - Prom 27187 - 27246 6.3 + Prom 27262 - 27321 4.8 23 10 Tu 1 . + CDS 27368 - 27601 250 ## BT_1143 hypothetical protein + Prom 27622 - 27681 3.6 24 11 Tu 1 . + CDS 27795 - 28301 404 ## BT_1142 hypothetical protein + Term 28529 - 28567 5.8 - Term 28142 - 28179 -0.1 25 12 Op 1 . - CDS 28311 - 29234 550 ## COG0053 Predicted Co/Zn/Cd cation transporters 26 12 Op 2 . - CDS 29239 - 29418 122 ## gi|253568089|ref|ZP_04845500.1| conserved hypothetical protein - Prom 29456 - 29515 4.7 + Prom 29924 - 29983 2.4 27 13 Op 1 . + CDS 30093 - 32297 2004 ## BF2084 putative TonB-dependent outer membrane receptor protein 28 13 Op 2 . + CDS 32321 - 32680 484 ## BT_1092 putative heavy-metal binding protein 29 13 Op 3 . + CDS 32744 - 34409 1732 ## COG2217 Cation transport ATPase Predicted protein(s) >gi|226332292|gb|ACIC01000028.1| GENE 1 41 - 1522 680 493 aa, chain - ## HITS:1 COG:no KEGG:BT_1165 NR:ns ## KEGG: BT_1165 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 39 493 1 455 455 873 100.0 0 MTNDKGKYSQYLTVAGNYFADKGWVHLLLAGGLYILYKMFFRTSLPFFLTISSLPLMTVI FLLIIKYSKQSFYTLFILQFLLAAAYAVTDIPLGVATLMCTLFAVVLLFAYGMHEKIDWS ESRNGMLMLFLIWGVYCILEIANPNNVQAAWNISITHYLIYPIVCAVIVPLAIRNIKGIQ WLLIIWSLFILLAAAKGYWQKNCGFNEREQYFLYVLGGARTHIIWSGIRYFSFFSDAANF GVHMAMGVSLFGISLFYIKGVWLKIYFIIVIIAAIYGMGISGTRAAIALPIGALGSFIIL SRNRKACITGISVLALLFLFFVCTNIGDDNQYIRKMRSAFRPSQDASYQLRVDNRKKMRE LMIHKPFGYGIGLSKGDRFYPKERMLYPPDSWLVSVWVETGIIGLVLYLAVHGVLFAWCG WILMFKIMNKRLRGLLTAWLCTAAGFYLAAYANDVMQYPNSIPIYTAFALCFAGPYIDKR MQQSKETELKNEQ >gi|226332292|gb|ACIC01000028.1| GENE 2 1512 - 3740 1325 742 aa, chain - ## HITS:1 COG:no KEGG:BT_1164 NR:ns ## KEGG: BT_1164 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 
742 1 742 742 1383 100.0 0 MEYILYISRFLYRIRWWLLIGTAIITFAVYYFGKRMIGKTYNVEATLYTGAASGYNLEGG NNKVDWATTQNAMDNLMNIIKAESTLKRVSIRLYARSLIKGNPKEDNEFIKASNYNRIYE HLKNSPNGKEILSLIDKNSEDKTVANFFNYLRPTQANYLYGVFYYNLPYYSYNDLRAIRV ARKGASDLIEISYTASDPGIAYNTIDILTKEFVNEYSAIRYGETDKVIEYFKSELQRIGK ELRLKEDSLTQYNVEKRVINYYDETKEIAAINKEFELREQNVLIAYNSAKAMLNELEKQM GSNAKHVINNLQFLDKLNEAASLTGKISEMETISSDHQNQEASLQEYKNRLNQTRKELSE LSNKYVENKYTKEGIAKENIIEQWLEVTMQYEKAKSDLQIIQQSRIDLNEKYRQFAPVGS TIKRQERNITFIEQNYLSVLKSYNDALMRRKNLEMTSAALKVLNAPAYPISAMPTPLKKM VMAACAGTFLFILGFFLILELLDRTLRDSIRTRRLIGLPMLGAFPKDSILEYHNHVEESK SIATKQLSNSILRFCQQKKEGLPYIINFISTEGGEGKSYVIEALKKYWNSIGLKTKVITW KSDFRIDSREYNLAKSITDLYTSEEEDILIVEYPNLREASISLELLQEANLNILVARADR GWKETDKLLSEKLSQQVGKTPLYVYLTHASRNVVEDYTGMLPPYTLWRKIVYRLSQLALT ESIFTFTKRKKQPATSGDEDDE >gi|226332292|gb|ACIC01000028.1| GENE 3 3751 - 4500 523 249 aa, chain - ## HITS:1 COG:no KEGG:BT_1163 NR:ns ## KEGG: BT_1163 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 249 1 249 249 449 100.0 1e-125 MKQIRYLFGGLLFLFASIGSVTAQEVSDYSKLQPSDYSSITLPPLDLLFENAKSAPTYEL ATVKEQIERKLLAKEKRAFLGFFSLRGSYQYGMFGNESTYTDVAIAPYLTYATQAQNGYT VGAGVNIPLDGLFDLTARVKRQKLNVRTAQLEREVKFEEMKKEIILLYATATSQLNILKL NAEALMLANVQYSIAEKDFSNGAIDSGTLSSEKSRQSDAQEKFENSKFELSKSLMILELV THTPILRNK >gi|226332292|gb|ACIC01000028.1| GENE 4 4626 - 5852 1206 408 aa, chain - ## HITS:1 COG:no KEGG:BT_1162 NR:ns ## KEGG: BT_1162 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 408 2 409 409 818 100.0 0 MFLFINISLAQTADTAVKSDELQTTTTSVDEVGVAPADPTPTLSKKELRRQRVAKRNLHY NILGGPSYTPDFGLLIGGSALMTFRMNPSDTTQQRSVVPMAIALMFEGGLNLFTKPQLFF KGDRFRIFGVFAYKNTLENFYGIGYSTNKDYPRGEDTSEYRYSGVQVNPWFLFRLGKSNF FAGPQIDFNYDKITKPAAGMIEQPSYIAAGGTDHGYSNLSSGLGFLLTYDTRDVPANAYR GTYLDFRGMMYNKAFGSDNNFYRLEIDYRQYKTLGKRKVLAWTVQTKNVFGNVPLTKYAL SGTPFDLRGYYMGQFRDKSSHVMMAEYRQMINTDKSNWVKKMLNHVGYVAWGGCGFMGPT PGKIEGVLPNLGLGLRIEVQPRMNVRLDFGRDMVNKQNLFYFNMTEAF 
>gi|226332292|gb|ACIC01000028.1| GENE 5 6057 - 7457 1219 466 aa, chain + ## HITS:1 COG:SPy1651 KEGG:ns NR:ns ## COG: SPy1651 COG3579 # Protein_GI_number: 15675522 # Func_class: E Amino acid transport and metabolism # Function: Aminopeptidase C # Organism: Streptococcus pyogenes M1 GAS # 26 463 3 441 445 279 33.0 1e-74 MNKRILSVFVCCALYYSAQAQDAKGGISTDMMQQIKQSYQGTSTDKAIRNAISNNDIRKL ALNQDNMKGMDTHFSVKVNSKGITDQKSSGRCWLFTGLNVMRAKAIDKYHLGSFEFSQTY PFFFDQLEKANLFLQGVIDTSDKPMNDKMVEWLFRNPLSDGGTFTGVADIVSKYGLVPKD VMPETNSSENTARMANLIALKLREQGLQLRDMASKKAKPAAIENAKVEMLSTIYRMLVLN LGVPPTEFTWTQYNVKGEPVETATYTPLSFLKKYGDEKLIDNYVMLMNDPSREYYKCYEI DYDRHRYDGKNWTYVNLPIEDIKEMAIASLKDSTMMYYSCDVGKFLNSDRGLLDVKNYDY DSLMGTTFSMDKKQRIQSFASSSSHAMTLMAVDLDKNGKPTKWMVENSWGAGAGYQGHLI MTDEWFNEYTFRLVVETKYVTPKALEVLKQKPIRLPAWDPMFAGEE >gi|226332292|gb|ACIC01000028.1| GENE 6 7646 - 8995 1348 449 aa, chain + ## HITS:1 COG:YPO3240 KEGG:ns NR:ns ## COG: YPO3240 COG1726 # Protein_GI_number: 16123399 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrA # Organism: Yersinia pestis # 4 447 1 446 447 286 34.0 8e-77 MANVIKLRKGLDINLKGKAAETYTTVKEPGFYALVPDDFPGVTPKVVVKEQEYVMAGGPL FIDKYHPEVKFVSPVSGVVTSVERGARRKVLNIVVEAAAEQDYEEFGKKNVSSMDADAVK SALLEAGLFAFIKQRPYDIIADPTVTPKGIFVSAFDTNPLAPDFEFALKGEEANFQTGLD ALAKMAKTYLSISVKQKAAALTQAKNVTITAFDGPNPAGNVGVQINHLDPVSKGETVWTI DPQAVIFIGRLFNTGRVNFTRTVAVTGSEVLKPAYCKLQVGALLTNVFAGNVTTDKDLRY ISGNVLSGKQVSPNGFLGAFHSQLTVIPEGDDVHEMFGWIMPRFNQFSANHSYFSWLMGK KEYTLDARIKGGERHMIMSGEYDKVFPMDIMPEYLIKAIIAGDIDRMEALGIYEVAPEDF ALCEFVCSSKMELQRIVRVGLDMLRAEMA >gi|226332292|gb|ACIC01000028.1| GENE 7 9044 - 10216 1237 390 aa, chain + ## HITS:1 COG:PA2998 KEGG:ns NR:ns ## COG: PA2998 COG1805 # Protein_GI_number: 15598194 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrB # Organism: Pseudomonas aeruginosa # 3 384 2 401 403 325 44.0 1e-88 
MKALRNYLDKIKPNFEEGGKFHAFQSVFDGFETFLFVPSKTAKTGTHIHDAIDSKRIMSI VVISLVPALLFGMYNVGYQHFTHTGATGSFIEMFIYGFLAVLPKIIVSYVVGLGIEFVVA QWKKEEIQEGFLVSGILIPMIVPVDCPLWILAVATAFSVIFAKEVFGGTGMNVFNVALIT RAFLFFAYPTKMSGDAVWVSGDTIFGMGQAVDGLTVATPLGQAATSGTVPAFNMDMITGL IPGSIGETSVIAILIGAVILLWTGVASWKTMLSVFVGGAFMAWVFNAIGMENNTMAQMPW YEHLVLGGFCFGAVFMATDPVTSARTERGKYIFGFLIGVMAIVIRVLNPGYPEGMMLAIL LMNIFAPLIDYCVVQSNISRREKRAVKSNQ >gi|226332292|gb|ACIC01000028.1| GENE 8 10223 - 10903 909 226 aa, chain + ## HITS:1 COG:VC2293 KEGG:ns NR:ns ## COG: VC2293 COG2869 # Protein_GI_number: 15642291 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrC # Organism: Vibrio cholerae # 1 226 1 250 257 89 28.0 6e-18 MKMNTNSNSYTIIYASVMVVIVAFLLAFVSSSLKATQDKNVQLDTKKQILAALNIKNVED ADAEYQKYVKGDMLMNVDGTLTENTGEFATNYEKEVKEHQRLHVFVCDVDGQTKYVVPVY GAGLWGGIWGYVALNEDKDTVYGVYFSHASETPGLGAEIATDWFQHEFAGKKTLENGAVA LGVVKNGKVEKADYQVDGISGGTITSVGVDAMLKACLNNYISFLTK >gi|226332292|gb|ACIC01000028.1| GENE 9 10924 - 11565 761 213 aa, chain + ## HITS:1 COG:HI0168 KEGG:ns NR:ns ## COG: HI0168 COG1347 # Protein_GI_number: 16272134 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrD # Organism: Haemophilus influenzae # 10 211 8 208 208 213 57.0 2e-55 MGQLFSKKNKEVFSAPLGIDNPVTVQVLGICSALAVTAKLEPAIVMGLSVTVITAFANVV ISLLRKTIPNRIRIIVQLVVVAALVTIVSEILKAFAYDVSVQLSVYVGLIITNCILMGRL EAFAMQNGPWESFLDGVGNGLGYAKILVIVAFFRELLGSGTLLGFNILNYEPIQNIGYVN NGLMLMPPMALIIVACIIWYQRARHKELQEESN >gi|226332292|gb|ACIC01000028.1| GENE 10 11649 - 12275 820 208 aa, chain + ## HITS:1 COG:HI0170 KEGG:ns NR:ns ## COG: HI0170 COG2209 # Protein_GI_number: 16272135 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrE # Organism: Haemophilus influenzae # 1 208 1 198 198 207 60.0 1e-53 MEHLLSLFVRSIFVDNMIFAFFLGMCSYLAVSKNVKTAVGLGIAVTFVLLVTLPVNYLLQ 
TKVLAANAIIEGVDLSFLSFILFIAVIAGIVQLVEMVVERFSPSLYASLGIFLPLIAVNC AIMGASLFMQQRINLGASDPKYIGDVWDALSYALGSGIGWLLAIVGLAAIREKMAYSDVP APLKGLGITFITVGLMAIAFMCFSGLNI >gi|226332292|gb|ACIC01000028.1| GENE 11 12294 - 13568 1538 424 aa, chain + ## HITS:1 COG:PA2994 KEGG:ns NR:ns ## COG: PA2994 COG2871 # Protein_GI_number: 15598190 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrF # Organism: Pseudomonas aeruginosa # 6 424 6 407 407 429 51.0 1e-120 MDMNLILSSIGVFLVVILLLVVILLVAKKFLVPSGNVKLTINGETGLEVESGSTLLNTLA VNGVFLSSACGGKGSCGQCKCQVLEGGGEILPSEVPHFSRKQQQDHWRLGCQVKVKSDMA IKIDESVLGVKEWECEVISNKNVATFIKEFIVALPKGEHMDFIPGSYAQIKIPKFSMDYD KDIDKSLIGEEYLPAWEKFGLLGLKCKNEEETIRAYSMANYPAEGDRIMLTVRIATPPFK PKEQGPGFMDVMPGIASSYIFTLKPGDKVIMSGPYGDFHPIFDSNKEMMWIGGGAGMAPL RAQIMHLTKTLHTTDRKMSYFYGARALNEVFYLEDFLQIEKDFPNFTFHLALDRPDPAAD AAGVKYTPGFVHNVIYETYLKNHEAPEDIEYYMCGPGPMSKAVEKMLDDLGVPAQNLMFD NFGG >gi|226332292|gb|ACIC01000028.1| GENE 12 13646 - 15058 1014 470 aa, chain + ## HITS:1 COG:lin1214 KEGG:ns NR:ns ## COG: lin1214 COG0513 # Protein_GI_number: 16800283 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Listeria innocua # 28 470 4 468 470 224 30.0 2e-58 MVLFRFPAFFMHYLTESFRPFCIFVPMLEKNEIIQSALQNLKIESLNPMQEAALEQGTGR KDVILLSPTGSGKTLAYLLPLLLTLKPNDDSVQVLILVPSRELALQIDTVFRSMGTSWKT CCCYGGHPIAEEKKSIAGNHPAIILGTPGRITDHLSKGNFDPETIETLIIDEFDKSLEFG FHDEMAGIITQLPGLKKRMLLSATDAEEIPQFTGLNRTVKLDFLPEATEEQENRLKLMKV LSPSKDKIDTLYNLLCSLGSSSSIVFCNHRDAVDRVHKLLEDKKLLAERFHGGMEQPDRE RALYKFRNGSCHVLISTDLAARGLDIPEIEHIIHYHLPVNEEAFTHRNGRTARWDATGTS YLILHAEEKLPEYIPEDIETMELLENPSRPPKSVWTTIYIGKGKKEKLSRMDIAGFLYKK GNLTREDVGAIDVKEHYAFVAVRRAKVKQLLNLIQGEKIKGMKTIIEEAK >gi|226332292|gb|ACIC01000028.1| GENE 13 15633 - 16700 1073 355 aa, chain + ## HITS:1 COG:BS_serC KEGG:ns NR:ns ## COG: BS_serC COG1932 # Protein_GI_number: 16078066 # Func_class: H 
Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoserine aminotransferase # Organism: Bacillus subtilis # 5 353 6 356 359 335 48.0 6e-92 MKKHNFNAGPSILPREVIEDTAKAILDFNGSGLSLMEISHRAKDFQPVVDEAEALFKELL NIPEGYSVLFLGGGASMEFCMVPFNFLEKKAAYLNTGVWAKKAMKEAKGFGEVVEVASSA EATYTYIPKDYTIPADADYFHITTNNTIYGTELKEDLNSPVPMVADMSSDIFSRPIDVSK YICIYGGAQKNLAPAGVTFVIVKNDAVGKVSRYIPSMLNYQTHIDNGSMFNTPPVVPIYA ALLNLRWIKAQGGVKEMERRAIEKADMLYAEIDRNKLFVGTAAKEDRSRMNICFVMAPEY KDLEADFMKFATEKGMVGIKGHRSVGGFRASCYNALPKESVQALIDCMQEFEKLH >gi|226332292|gb|ACIC01000028.1| GENE 14 16815 - 17735 1170 306 aa, chain + ## HITS:1 COG:aq_1905 KEGG:ns NR:ns ## COG: aq_1905 COG0111 # Protein_GI_number: 15606928 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Aquifex aeolicus # 2 305 3 320 533 159 34.0 6e-39 MKVLVATEKPFAKVAVDGIRKEIEAAGYELALLEKYTDKAQLLDAVKDANAIIIRSDIID AEVLDAAKELKIVVRAGAGYDNVDLAAATAHNVCVMNTPGQNSNAVAELALGMMVYAVRN FYNGTSGTELMGKKLGIHAYGNVGRNVARVAKGFGMEVYAYDAFCPKEVIEKDGVKALDS AEELYKTCQVVSLHIPATAETKNSINYALLKDMPKGAMLVNTARKEVINEAELIKLMEDR ADFKYMTDIMPAANAEFAEKFAGRYFSTPKKMGAQTAEANINAGIAAAQQIVGFLKDGCE KFRVNK >gi|226332292|gb|ACIC01000028.1| GENE 15 17893 - 19140 1241 415 aa, chain + ## HITS:1 COG:CAC0016 KEGG:ns NR:ns ## COG: CAC0016 COG4198 # Protein_GI_number: 15893314 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 414 1 413 414 406 48.0 1e-113 MAIIKPFKGVRPPQDLVEQVASRPYDVLNSEEARTEAAGNEKSLYHIIKPEIDFPVGTDE HDERVYEKAAENFRLFQDKGWLVQDAKENYYIYAQTMNGKTQYGLVVGAYVPDYMNGIIK KHELTRRDKEEDRMKHVRVNNANIEPVFFAYPDNAKLDTIIRKYTAEKPVYDFIAPGDGF GHTFWIVDQDEDIAAITAEFAKMPALYIADGHHRSAAAALVGAEKAKQNANHRGDEEYNY FMAVCFPANQLTIIDYNRVVKDLNGLTPEQFLAALDKNFVVEEKGTDIYKPSGLHNFSLY LGDKWYSLTAKAGTYNDNDPIGVLDVTISSNLILDEILGIKDLRSDKRIDFVGGIRGLGE LKKRVDSGEMKVALALYPVSMKQLMDIADTGNIMPPKTTWFEPKLRSGLVIHKLD 
>gi|226332292|gb|ACIC01000028.1| GENE 16 19396 - 20004 435 202 aa, chain + ## HITS:1 COG:NMA0240 KEGG:ns NR:ns ## COG: NMA0240 COG1739 # Protein_GI_number: 15793258 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Neisseria meningitidis Z2491 # 6 178 4 176 203 182 47.0 4e-46 MTAEDTYKTIVEPSEGIYTEKRSKFIAIALPVRTLDEIKAHLETYQKKYYDARHVCYAYM LGAARKDFRANDNGEPSGTAGKPILGQINSNELTDILIIVVRYFGGIKLGTSGLIVAYKA AAAEAISAATIIEKTVDEEVTVMFEYPFMNDIMRIVKEEEPEILSQSYDMDCSMTLCIRR SMMPKLRARLEKVETARILDEE >gi|226332292|gb|ACIC01000028.1| GENE 17 20026 - 21111 1048 361 aa, chain + ## HITS:1 COG:no KEGG:BT_1149 NR:ns ## KEGG: BT_1149 # Name: not_defined # Def: type II restriction enzyme HpaII # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 361 1 361 361 721 100.0 0 MAFEATKREWCELYTFFRLLTEGKVTLGTAKAKKGEIDWPIAMIQREEHDGTRCYYIEKE MVRIKGENSEKFVPREDFGIVADLILQAVKSSSEDEVTSPDGVEEFLDEVAIFDLEAKTE DRTDFYIAFWHPEAPLSGFSVRSRLGAMNPLLDGGRAANLKLEQSGVKFATPTVNKINAL PEAPNEVAERMLLIERLGGVLKYSDVADRVFRSNLLMIDLHFPRVLTEMVRIMHLDDITR ISELTEVIKQMNPLKIKDELVNKHGFYEFKVKQFLMALALGMRPAKIYTGQDSAVEGILL MDGSGEVLCYHKSEKPVMEDFLFLNTRLEKGSLDKDKYGFLERENGTYYFKLNAKIGLVK R >gi|226332292|gb|ACIC01000028.1| GENE 18 21154 - 21723 506 189 aa, chain + ## HITS:1 COG:MA1774 KEGG:ns NR:ns ## COG: MA1774 COG0778 # Protein_GI_number: 20090624 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Methanosarcina acetivorans str.C2A # 3 168 31 199 220 109 36.0 2e-24 MKTNEVLENIKARRSVRAYTDRQVSEEDLQAILEAATFAPSGMHLETWHFTAIQNADKLA ELNERIKGAFAKSDEPKLQERGHSKAYCCYYHAPTLVIVSNEPTQWWAGMDCACAIENMF LAAHSLGIGSCWINQLGTTCDDPEVREFISSLGVPENHKVYGCVALGYADPKISLREKMV KAGTVTVVR >gi|226332292|gb|ACIC01000028.1| GENE 19 21926 - 24775 2821 949 aa, chain - ## HITS:1 COG:YPO0905_2 KEGG:ns NR:ns ## COG: YPO0905_2 COG1003 # Protein_GI_number: 16121210 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain # Organism: 
Yersinia pestis # 462 941 1 484 494 578 59.0 1e-164 MKTDLLASRHIGINEEDTAVMLRKIGVDSLDELINKTIPANIRLKEPLALAKPLTEYEFG KHIADLASKNKLYTTYIGLGWYNTITPAVIQRNVFENPVWYTSYTPYQTEVSQGRLEALM NFQTAVCDLTAMPLANCSLLDEATAAAEAVTMMYALRSRTQQKAGANVVFVDENIFPQTL AVMTTRAIPQGIELRVGKYKEFEPSPEIFACILQYPNSSGNVEDYADFTKKAHEADCKVA VAADILSLALLTPPGEWGADIVFGTTQRLGTPMFYGGPSAGYFATRDEYKRNMPGRIIGW SKDKYGKLCYRMALQTREQHIKREKATSNICTAQALLATMAGFYAVYHGQEGIKTIASRI HSITVFLDKQLKKFGYTQVNAQYFDTLRFELPEHVSAQQIRTIALSKEVNLRYYENGDVG FSIDETTDIAATNVLLSIFAIAAGKDYQKVEDVPEKSNIDKALKRTTPFLTHEVFSNYHT ETEMMRYIKRLDRKDISLAQSMISLGSCTMKLNAAAEMLPLSRPEFMSMHPLVPEDQAEG YRELISNLSEDLKVITGFAGVSLQPNSGAAGEYAGLRVIRAYLESIGQGHRNKILIPASA HGTNPASAIQAGFETVTCACDEQGNVDMGDLRAKAEENKEALAALMITYPSTHGIFETEI KEICEIIHACGAQVYMDGANMNAQVGLTNPGFIGADVCHLNLHKTFASPHGGGGPGVGPI CVAEHLVPFLPGHSIFGSTQNQVSAAPFGSAGILPITYGYIRMMGTEGLTQATKIAILNA NYLAACLKDTYGIVYRGATGFVGHEMILECRKVHEETGISENDIAKRLMDYGYHAPTLSF PVHGTLMIEPTESESLAELDNFVDVMLNIWKEIQEVKNEEADKNDNVLINAPHPEYEIVN DNWEHSYTREKAAYPIESVRENKFWVNVARVDNTLGDRKLLPTRYGTFE >gi|226332292|gb|ACIC01000028.1| GENE 20 24898 - 25509 426 203 aa, chain - ## HITS:1 COG:VC1270 KEGG:ns NR:ns ## COG: VC1270 COG0491 # Protein_GI_number: 15641283 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Vibrio cholerae # 2 201 13 210 218 152 40.0 3e-37 MFPVNCYVLWDDTKEAVVIDPGCFYEEEKQALKKFILTNELTVKHLLNTHLHLDHIFGNP FMLKEFGLSAEANKADEFWIDEAPKQSRMFGFQLQEEPVPLGKYLHDGDIITFGHTKLEA IHVPGHSPGSLVYYCKEENCMFSGDVLFQGSIGRADLSGGNFDELIDHICSRLFVLPNET VVYPGHGAPTTIGMEKAENPFFR >gi|226332292|gb|ACIC01000028.1| GENE 21 25558 - 26178 639 206 aa, chain - ## HITS:1 COG:SA2499 KEGG:ns NR:ns ## COG: SA2499 COG0357 # Protein_GI_number: 15928295 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division # Organism: Staphylococcus aureus N315 # 10 149 16 164 239 97 35.0 2e-20 
MEIILKYFPDLTEEQRKQFAALYDLYIDWNAKINVISRKDIENLYEHHVLHSLGIAKVIQ FRPGTKVMDLGTGGGFPGIPLAILFPETKFHLVDSIGKKVRVATEVANAIGLKNVTFRHA RAEEEKQLFDFVVSRAVMPLADLIKIIKKNISPKQQNAMPNGLICLKGGELEHETMPFKH KTVIHSLSENFEEEFFETKKVVYSQI >gi|226332292|gb|ACIC01000028.1| GENE 22 26292 - 27143 753 283 aa, chain - ## HITS:1 COG:no KEGG:BT_1144 NR:ns ## KEGG: BT_1144 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 283 1 283 283 555 100.0 1e-156 MKKQYVSLLAVILSVSGFLFSCHDKMNKNTGALQFDSIQVNETAHLFNDTAKPACNIIIN FAYPVKSSDDMLKDSLNAYFISVCFGDKYIGEKPEAVVKQYAKNYIDEYRHDQEPMYAED EKNKESDAVVGAWYSYYKSIESHVQLYEKDLLVYRVDYNEYTGGAHGMYMSTFLNMDLTL MRPLRLDDIFVGDFQEALTDLIWNQLMADNKVTTHEALEDMGYASTGDIAPTENFYLSKE GITFYYNIYDITPYSMGPVKVTIPYSMMEHLLGSNPILGDLKD >gi|226332292|gb|ACIC01000028.1| GENE 23 27368 - 27601 250 77 aa, chain + ## HITS:1 COG:no KEGG:BT_1143 NR:ns ## KEGG: BT_1143 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 77 1 66 66 129 100.0 4e-29 MEQLHAHEVLHMMEGNSYTEASLRAAIVEKFGEHQRFYTCSADNMEVDELIGFLKRKGKF MPAGEEFTVDISKVCSH >gi|226332292|gb|ACIC01000028.1| GENE 24 27795 - 28301 404 168 aa, chain + ## HITS:1 COG:no KEGG:BT_1142 NR:ns ## KEGG: BT_1142 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 168 12 161 161 247 100.0 9e-65 MKKSQILMSLLLVILLSSCYASRITYENLSIIQRGMSSKEVIAIMGKPSYRSFDEESEML EFRTSESSIAGVVNIWFVDDRVTKMKSYYTNGCMDRNRAIEDKEEEESAKKEKKEKEETS ARIRVTTDGKHIIQTGSIIVTPDGKHETVVSDCGGVIITASGEHILAP >gi|226332292|gb|ACIC01000028.1| GENE 25 28311 - 29234 550 307 aa, chain - ## HITS:1 COG:TM0876 KEGG:ns NR:ns ## COG: TM0876 COG0053 # Protein_GI_number: 15643638 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Thermotoga maritima # 4 304 2 303 306 228 37.0 8e-60 MKAKREQILIRTSWISTIGNAILSTSKIIVGLWAGSLAVLGDGIDSATDVIISIVMIFTA RIISQPPSKKYVFGYEKAEGIATKILSLVIFYAGVQMLISSIGSIFSEEAKEIPSAIAIY 
VTIFSIIGKLLLALYQYKQGKKIDSSMLTANAINMRNDVIISSGVLLGLIFTFIFKLPIL DSITGLIISLFIIKSSIGIFIDSNVELMDGVKDVNVYNKIFEAVETVPGASNPHRVRSRM IGNLYMITLDIEVNPQITITQAHQIADAVEKSIKSSIDNVYDILVHVEPAGECQTDEKFG IDKGMVN >gi|226332292|gb|ACIC01000028.1| GENE 26 29239 - 29418 122 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253568089|ref|ZP_04845500.1| ## NR: gi|253568089|ref|ZP_04845500.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 59 1 59 59 104 100.0 2e-21 MEYHRISFIHNDTEYSFIKAINGNLTGYALMSACRIEVVNYMKENNLKGSYILTGMSKA >gi|226332292|gb|ACIC01000028.1| GENE 27 30093 - 32297 2004 734 aa, chain + ## HITS:1 COG:no KEGG:BF2084 NR:ns ## KEGG: BF2084 # Name: not_defined # Def: putative TonB-dependent outer membrane receptor protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 12 733 14 735 735 1197 79.0 0 MGLFSLLAVTTQAQVRGIVKDNTGEPVPGANVFWVNTGQGVTTKEDGSFSISKPSKSHML VISFIGFQNDTVHVSKKNQELDITLREGVELNEVNVVSRRMGTMKLRSSVMNEEMISSAE LSRAACCNLGESFVTNPSVDVSYSDAATGAKQIKLLGLSGTYVQMLTENIPNYRGAASPY GLGYVPGPWMQSIQVSKGISSVKNGYEAITGQINVEFKKPQLPEADWVSANLFASSTNRY EANADATLKLSKRWSTSLLAHYENETKAHDGNDDGFVDIPQVEQYNVWNRWAYMGDHYVF QAGFKALSESRTSGQAHHTDMQTGDLYKVGIDTERYEFFTKNAYIFNKEKNTNLALILSA SWHNQDAMYGRKLYNVDQTNVYASLMFETEFNKQNSFSAGLSFNYDAYDQHYRLENQTEL PLIKAFAKEAVPGAYAQYTLNLNDQWMLMAGLRGDYSNEHGFFVTPRAHLKYNPNEYVHF RLSAGKGYRTNHVLAENNYLLSSSRKVEIAKGLDMEEAWNYGASVSTYVPVFGKTLNINA EYYYTDFLKQVVVDMDSNPHEVAFYNLDGRSYSHVFQVEASYPFFKGFTLTGAYRLTDAQ TTYKGQRMEKPLTSKYKGLLTASYQTPLGIWQFDATLQLNGGGRMPSPYELADGQLSWER RYGSFEQLSAQITRYFRRWSVYIGGENLTNFKQKNPIIDAANPWGGNFDSTMIWGPVHGA KAYIGIRFNLARNE >gi|226332292|gb|ACIC01000028.1| GENE 28 32321 - 32680 484 119 aa, chain + ## HITS:1 COG:no KEGG:BT_1092 NR:ns ## KEGG: BT_1092 # Name: not_defined # Def: putative heavy-metal binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 109 1 109 119 159 100.0 4e-38 MRTKRWMATCVVALLSVAAVLAKDIRVVIFKVSQMHCENCEKKVKNNMRFEKGVKELSTE LKNKTVSITYDAEKTDVKKLQAGFKKFNYEAEFVKETKQDGKKDGKQESKKEDKKADKK >gi|226332292|gb|ACIC01000028.1| GENE 29 
32744 - 34409 1732 555 aa, chain + ## HITS:1 COG:alr7635 KEGG:ns NR:ns ## COG: alr7635 COG2217 # Protein_GI_number: 17158771 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Nostoc sp. PCC 7120 # 14 555 11 566 753 422 41.0 1e-118 MGNIAKKAFPVLNMHCAGCANNVEKTVKKLPGVVDASVNFASNTLTISYEDDKLTPGEIR AAVLAAGYDLIVEEANKEERREEEQHKRYLRLKRKVIGAWIFVVPLLIFSMLLMHMPYSN EIQMLLAIPVMVLFGGSFFTGAWKQAKIGRSNMDTLVALSTSIAFLFSLFNTFFPQFWTD RGLEPHVYYEASAVIIAFVLTGKMMEERAKGNTSDAIRKLMGMQPKVARVLRGGVEEEIQ IDLLQVGDMVVVRPGEQIPVDGQLSEGDSYVDESMISGEPIPVEKKKGDKVLAGTINQRG SFIIRASQVGSDTVLARIIHMVQEAQGSKAPVQRIVDRITGIFVPVVLGIAILTFVLWVA IGGSEYISYGILSAVSVLVIACPCALGLATPTALMVGIGKAAGQHILIKDAVALEQMRKV DVVVLDKTGTLTEGHPTATGWLWAQPQEEHFKDVLLAAELKSEHPLAGAIVAVLQDEEKI TPAPLSSFESITGKGIKVSYQGKTYWVGSHKLLKDFSATVSDVMADMLVHYESDGNGIIY FGRENELLAIIAVSD Prediction of potential genes in microbial genomes Time: Thu May 12 00:18:22 2011 Seq name: gi|226332291|gb|ACIC01000029.1| Bacteroides sp. 1_1_6 cont1.29, whole genome shotgun sequence Length of sequence - 24592 bp Number of predicted genes - 27, with homology - 26 Number of transcription units - 15, operones - 8 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 2 2 Tu 1 . - CDS 778 - 1614 646 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 1820 - 1879 4.9 + Prom 1622 - 1681 8.5 3 3 Tu 1 . + CDS 1704 - 2387 555 ## COG0321 Lipoate-protein ligase B 4 4 Op 1 . - CDS 2446 - 3378 842 ## BT_1088 hypothetical protein 5 4 Op 2 . - CDS 3390 - 4904 1077 ## COG0657 Esterase/lipase 6 4 Op 3 . - CDS 4939 - 6927 2553 ## COG1297 Predicted membrane protein - Prom 6958 - 7017 5.2 - Term 6965 - 7005 4.2 7 5 Tu 1 . - CDS 7023 - 7418 353 ## BT_1085 hypothetical protein - Prom 7464 - 7523 3.5 + Prom 7696 - 7755 7.0 8 6 Tu 1 . + CDS 7777 - 7917 96 ## + Term 8073 - 8109 -0.6 - Term 7874 - 7925 1.2 9 7 Op 1 . - CDS 8114 - 8713 683 ## BT_1084 hypothetical protein 10 7 Op 2 . 
- CDS 8720 - 9328 593 ## COG0218 Predicted GTPase - Prom 9377 - 9436 6.4 11 8 Tu 1 . - CDS 9452 - 10906 1080 ## COG0591 Na+/proline symporter + Prom 10836 - 10895 3.5 12 9 Op 1 . + CDS 10995 - 11612 553 ## COG0353 Recombinational DNA repair protein (RecF pathway) 13 9 Op 2 . + CDS 11649 - 12098 344 ## BT_1080 hypothetical protein 14 9 Op 3 . + CDS 12105 - 12644 353 ## PROTEIN SUPPORTED gi|229254479|ref|ZP_04378409.1| acetyltransferase, ribosomal protein N-acetylase 15 10 Op 1 . - CDS 12641 - 13231 459 ## COG1678 Putative transcriptional regulator - Prom 13272 - 13331 7.8 16 10 Op 2 . - CDS 13381 - 14697 1249 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 14872 - 14931 77.0 + TRNA 14855 - 14928 85.5 # Asp GTC 0 0 - Term 14841 - 14908 31.1 17 11 Op 1 9/0.000 - CDS 15066 - 15845 540 ## COG3279 Response regulator of the LytR/AlgR family 18 11 Op 2 . - CDS 15838 - 16875 448 ## COG3275 Putative regulator of cell autolysis - Prom 16895 - 16954 9.7 19 12 Op 1 . - CDS 16987 - 17622 370 ## BT_1074 hypothetical protein 20 12 Op 2 . - CDS 17634 - 18278 497 ## BT_1073 hypothetical protein - Prom 18308 - 18367 4.6 21 13 Tu 1 . - CDS 18381 - 18935 410 ## BT_1072 hypothetical protein - Prom 18985 - 19044 2.8 22 14 Op 1 . + CDS 19617 - 20024 275 ## BT_1071 hypothetical protein 23 14 Op 2 8/0.000 + CDS 20028 - 21332 1131 ## COG3969 Predicted phosphoadenosine phosphosulfate sulfotransferase 24 14 Op 3 . + CDS 21329 - 21871 610 ## COG1475 Predicted transcriptional regulators + Term 21886 - 21953 17.4 + Prom 21967 - 22026 9.0 25 15 Op 1 . + CDS 22231 - 22575 223 ## BT_1068 hypothetical protein 26 15 Op 2 . + CDS 22592 - 24415 885 ## BT_1067 hypothetical protein 27 15 Op 3 . 
+ CDS 24430 - 24592 194 ## BT_1066 hypothetical protein Predicted protein(s) >gi|226332291|gb|ACIC01000029.1| GENE 1 1 - 480 477 159 aa, chain + ## HITS:1 COG:CAC3655 KEGG:ns NR:ns ## COG: CAC3655 COG2217 # Protein_GI_number: 15896888 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Clostridium acetobutylicum # 1 158 656 813 818 177 57.0 6e-45 GDGQRTALAVSSRLGIERFVADALPDDKAEFVRELQMQGKKVAMVGDGINDSQALALADV SIAMGKGTDIAMDVAMVTLMTSDLLLLPKAFELSRQTVRLIHQNLFWAFIYNLIGIPIAA GVLFPINGLLLNPMLASAAMAFSSVSVVLNSLSLGRRKI >gi|226332291|gb|ACIC01000029.1| GENE 2 778 - 1614 646 278 aa, chain - ## HITS:1 COG:BMEII0641 KEGG:ns NR:ns ## COG: BMEII0641 COG2207 # Protein_GI_number: 17988986 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Brucella melitensis # 171 276 187 291 307 79 36.0 6e-15 MKMEVPQVDLPLEVLAWTKVTEDILNIYKQSCRLQACIFAICTEGTMKVSINLLEYEVHP NDLITLLPGTIIQFREKAEKVSLCFAGFSAKCIGHINLLTTIGNAYPKLIEQPIIPLSEN IADYLKEYFALLSHASCDESFKMDSKLADLSLQTILTSIQLMYRNYSGNNHSNRKKEICR KLIQLITEHYKDQRCAQFYADQLGISLQHLSTTVKQVTGKSVLDTIAYIVIIDAKAQLKG TDMTIQEIAYSLNFPNPSFFCKFFRRHVGMSPLEFRNS >gi|226332291|gb|ACIC01000029.1| GENE 3 1704 - 2387 555 227 aa, chain + ## HITS:1 COG:MT2274 KEGG:ns NR:ns ## COG: MT2274 COG0321 # Protein_GI_number: 15841708 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate-protein ligase B # Organism: Mycobacterium tuberculosis CDC1551 # 11 205 30 208 240 152 44.0 4e-37 MKTVLVDWNLIPYAEAWQRQTEWFDTLVRAKAQGEAYENRIIMCEHPHVYTLGRSGKENN MLLNDDQLKAIQATLFHIDRGGDITYHGPGQLVCYPILNLEEFHLGLKEYVHLLEEAVIR VCASYGIEAGRLEKATGVWLEGSTPRARKICAIGVRSSHYVTMHGLALNVNTDLRYFSYI HPCGFIDKGVTSLRQELKHEVPMDEVKQRLECELGRLLNEKKQQIAQ >gi|226332291|gb|ACIC01000029.1| GENE 4 2446 - 3378 842 310 aa, chain - ## HITS:1 COG:no KEGG:BT_1088 NR:ns ## KEGG: BT_1088 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 310 1 310 310 608 100.0 1e-173 
MKRVKHLLLASCLLPTFLGAQGVASASFLEITPDAQSAGMAGTGLAITDNGSTAIFHNAS TIAFSQDVMGASYSYADINKDFGMHSASLFYRIGREGKHGFSVGFRHYKDADFHWQDGAE DHARAWNVEAAYFRNVAKNLSLSLTLRYLQAKAYKDADSKGSVSVDLGASYHRSMGILDE MASWTIGFQAANLGSKLDGQKLPARLGLGGAIDLPFSMDNRLQVALDFNYLLPSEYRHLQ AGIGAEYNFLKYGIIRGGYHFGDNDKGTGNYGTLGCGINFWPIRADFSYALADKDCFMHK TWQLGIGIVF >gi|226332291|gb|ACIC01000029.1| GENE 5 3390 - 4904 1077 504 aa, chain - ## HITS:1 COG:CC2313 KEGG:ns NR:ns ## COG: CC2313 COG0657 # Protein_GI_number: 16126552 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Caulobacter vibrioides # 15 250 45 305 328 174 37.0 3e-43 MTAMTFAQQPVELPLWPDGAPNSNGLTGGEKEVSPHRISNVTDPTITVYRAPQPNGMAVI MCPGGGYSRLAMDHEGHDMAPWFCGQGITYVVLKYRMPNGHCEVPLSDAERAIRIVREHA GEWNIHPRKIGIMGASAGGHLASTLATHYSAASRPDFQILLYPVVTMTQSTHGGSRKELL GGNPTAEQKVLFSNELQVTSDTPQAFIVLSSDDGAVPPSNGVNYYLALQKNNVPASLHVY PTGGHGWGFRDNFKYKQQWTQELEKWLREGVVFPKETAPMLRIGKTYLGTKYVANTLDQG TEEKLIILPQTVDCLTFVEYTLAQAMGSSFADNLQKIRYRDGVIDGYTSRLHYTSDWIEN GVRQGFLEDVTAQNSTQTTKLSLSYMSTHPQKYRQLADSPENVKRMAEHEKALSGKKVHW LPKGKLPDAGLPWIMDGDIIAITTNLPGLDVAHVGMAEYINGKLHLLHASSTLGKVVVSE EPLSQMLRNNKSWSGIRVVRMSHP >gi|226332291|gb|ACIC01000029.1| GENE 6 4939 - 6927 2553 662 aa, chain - ## HITS:1 COG:PH0361 KEGG:ns NR:ns ## COG: PH0361 COG1297 # Protein_GI_number: 14590271 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Pyrococcus horikoshii # 25 626 3 595 626 239 30.0 2e-62 MKQEEDKFTGLPENAFRELKAGEVYNPLMSPSKNYPEVNFWSVTWGIAMAILFSAAAAYL GLKVGQVFEAAIPIAIIAVGVSGAAKRKNALGENVIIQSIGACSGVIVAGAIFTLPALYI LQVKYPEMTVSFMQVFISSLLGGVLGILFLIPFRKYFVSDMHGKYPFPEATATTQVLVSG EKSGSQAKPLLMAGMIGGLYDFIVATFGWWNENFTTRVCGAGEMLAEKAKLVFKVNTGAA VLGLGYIVGLKYASIICAGSLAVWWIIIPGMSAIWGDSVLNAWNPEITSTVGMMSPEEIF KYYAKSIGIGGIAMAGVIGIIRSWGIIKSAVGLAAKEMGGKGNVEKNIVRTQRDLSMKII AIGSILTLILIVLFFYFDIMQGNIVHTLVAIALVAGISFLFTTVAANAIAIVGTNPVSGM TLMTLILASVVMVAVGLKGPSGMVAALVMGGVVCTALSMAGGFITDLKIGYWLGSTPAKQ ETWKFLGTIVSAATVGGVMIILNKTYGFTSGALAAPQANAMAAVIEPLMSGVGAPWLLYG 
IGAILAIVLTFFKIPALAFALGMFIPLELNVPLVVGGAVNWYVTSRSKDTALNTERGEKG TLLASGFIAGGALMGVVSAAMRFGGINLVNEAWLNNTWSEVLALGAYALLILYFIKASMK VK >gi|226332291|gb|ACIC01000029.1| GENE 7 7023 - 7418 353 131 aa, chain - ## HITS:1 COG:no KEGG:BT_1085 NR:ns ## KEGG: BT_1085 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 131 1 131 131 254 100.0 5e-67 MKKILFLMTLLVVGVSFAFAQTNADIKFDKTTHDFGKFSENSPVVSCVFTFTNIGDAPLV IHQAVASCGCTVPEYTKEPIQPGKKGTIKVTYNGTGKYPGHFKKSITLRTNAKTEMVRLY IEGDMTPKDEK >gi|226332291|gb|ACIC01000029.1| GENE 8 7777 - 7917 96 46 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKTKLDFYHMTPFYELNYFIAFIVKLYKRIQQRREQREHRIKDELV >gi|226332291|gb|ACIC01000029.1| GENE 9 8114 - 8713 683 199 aa, chain - ## HITS:1 COG:no KEGG:BT_1084 NR:ns ## KEGG: BT_1084 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 199 1 199 199 307 100.0 1e-82 MKKNILFIAWVALFSLFASNSQAQSLKDLLNKDNISKVVNAITGTPETIDMTGTWTYSGS AVEFESDNLLMKAGGAAAATMAENKLNEQLSKVGIKEGQMSFTFNADSTFTSTVGKKKLS GTYSYNASTKQVDLKYLKLLNLHAKVNCTSNSMDLLFNSDKLLKLMTFLGSKTNSTALKT VSSLAENYDGMMLGFGLKK >gi|226332291|gb|ACIC01000029.1| GENE 10 8720 - 9328 593 202 aa, chain - ## HITS:1 COG:FN2013 KEGG:ns NR:ns ## COG: FN2013 COG0218 # Protein_GI_number: 19705309 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Fusobacterium nucleatum # 1 192 1 189 194 135 44.0 4e-32 MEITSAEFVISNTDVKKCPTGVFPEYAFIGRSNVGKSSLINMLTARKGLAMTSATPGKTM LINHFLINQSWYLVDLPGYGYARRGQKGKDQIRTIIEDYILEREQMTNLFVLIDSRLEPQ KIDLEFMEWLGENGIPFSIIFTKADKLKGGRLKMNINNYLRELSKEWEELPPYFISSSEN RTGRTEILDYIENISKEVYKNK >gi|226332291|gb|ACIC01000029.1| GENE 11 9452 - 10906 1080 484 aa, chain - ## HITS:1 COG:sll1087 KEGG:ns NR:ns ## COG: sll1087 COG0591 # Protein_GI_number: 16330938 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Synechocystis # 1 422 3 423 512 114 27.0 5e-25 
MMILVIIICYFAGLLLIAHITGHKGGSNAAFFKGENKSPWYIVSFGMIGATISGVTFVSV PGMVRGMDMTYMQTVFGFFFGYMIVAHILLPLYYRLNLTSIYGYLGTRIGVNAYRTGSFF FLLSRMLGTAAKLYLVCLILHTHVFQDMHVPFWVIAVGSVALVWIYTHKSGIKTIVWTDT LQTFCLIAALLFIIYFTIQKLDVGFDGIVQTIRHSEHSRIFVFDDWMSRQNFFKQFFSGI FIVIVMTGLDQDMMQKNLSCRNLHEAKKNMYCYGFSFIPLNFLFLCLGILLIALAGQMQL ELPAMNDDILPMFATQGYLGQSVLILFTIGIIAAAFSNSDSALTAMTTSVCIDLLNTDKD TEETARRKRKKVHLSLSILLAFFICLVEMLNNKSVIDAIYIIASYTYGPLLGMFAFGLFT RRRTKDRLVPLIAIASPVLCYALDWWINKETGYKFGYELLMLNGSLTFAGLMLLSGKEKT VETP >gi|226332291|gb|ACIC01000029.1| GENE 12 10995 - 11612 553 205 aa, chain + ## HITS:1 COG:DR0198 KEGG:ns NR:ns ## COG: DR0198 COG0353 # Protein_GI_number: 15805234 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Deinococcus radiodurans # 4 198 2 193 220 198 49.0 5e-51 MNQQYPSVLLEKAVGEFSKLPGIGRKTAMRLVLHLLRQDTATVEAFGNSIITLKREVKYC KVCHNISDTETCQICANPQRDASTVCVVENIRDVMAVEATQQYRGLYHVLGGVISPMDGV GPSDLQIESLVQRVSEGGIKEVILALSTTMEGDTTNFYIYRKLEKMGVKLSVIARGISVG DELEYADEITLGRSIVNRTLFTGTV >gi|226332291|gb|ACIC01000029.1| GENE 13 11649 - 12098 344 149 aa, chain + ## HITS:1 COG:no KEGG:BT_1080 NR:ns ## KEGG: BT_1080 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 149 1 149 149 259 98.0 3e-68 MEEQIKRAVRNLNISYVFFWVLPAFLLGAGEFDLLPVGALVDNPQATYYMETAGILLTAL CVPLALKLFSLVLKKKIDLLTIPLALKRYVQWSMVRLGLLEVAIVLNVLCYYLTLSNTGN LCMLIGLTASLFCLPSEKRLRNELHINKE >gi|226332291|gb|ACIC01000029.1| GENE 14 12105 - 12644 353 179 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229254479|ref|ZP_04378409.1| acetyltransferase, ribosomal protein N-acetylase [Capnocytophaga ochracea DSM 7271] # 11 172 2 162 166 140 41 8e-33 MKQSFLSNDRIYLRAVEPEDMDVMYEMENDPSMWDISNFTVPYSRYVLRQYIEGSQCDVF ADKQLRLMIMSKSDHRILGTIDITDFVPLHSRGEVGIAVHKDYRRQGYASDALQLLCEYA FDFLSLTQLYAHVATDNEVCMKLFTSCGFVQCGLLKDWLQVGGCYKDAAIFQYLNPKRH >gi|226332291|gb|ACIC01000029.1| GENE 15 12641 - 13231 459 196 aa, chain - ## HITS:1 COG:CPn0139 
KEGG:ns NR:ns ## COG: CPn0139 COG1678 # Protein_GI_number: 15618063 # Func_class: K Transcription # Function: Putative transcriptional regulator # Organism: Chlamydophila pneumoniae CWL029 # 19 190 10 182 188 104 32.0 1e-22 MNIDSDIFKIQSNNVLPSRGKILISEPFLRDATFGRSVVLLIDHTEEGSMGLIINKQLPI FVNDIIKEFKYIENIPLYKGGPIATDTLFYLHTLADIPGAIPISKGLYLNGDFDEIKKYI LQGNKVDRYIRFFLGYSGWESEQLSTELKENTWLVSKEENAYLMNGDTKDMWKQALEKLG SKYETWSRFPQVPTFN >gi|226332291|gb|ACIC01000029.1| GENE 16 13381 - 14697 1249 438 aa, chain - ## HITS:1 COG:L0098 KEGG:ns NR:ns ## COG: L0098 COG0436 # Protein_GI_number: 15673812 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Lactococcus lactis # 33 271 18 251 393 77 26.0 7e-14 MKNTPIERNVIDETINEFQIVDFSKATIREVKAIASKAEAVSGVEFIKMEMGVPGLPASA VGVKAEIEALQNGIASLYPDINGLPELKKEASDFIKAFINVDLSPEGCVPVTGSMQGTFA SFLTCSQCDEKKDTILFIDPGFPVQKQQLVVMGQKYETFDVYDYRGDKLKEKLESYLRKG NISAIIYSNPNNPSWICLKDEELRIIGELATQYDVIVLEDLAYFAMDFRQDLSTPYQAPF QPSVAHYTDNYVLLISGSKAFSYAGQRIGVSCISDKLYHRSYPGLTKRYGGGTFGTVFIH RVLYALSSGTSHSAQYAMAAMLKAANEGQYNFLSEVKVYGDRAQKLKEIFLRHGFYLVYD NDLGDPIADGFYFTIGYPGMTSGELAKELMYYGVSAISLVTTGSHQEGLRACTSFIKDHQ YAQLDERMKLFAEHHPIA >gi|226332291|gb|ACIC01000029.1| GENE 17 15066 - 15845 540 259 aa, chain - ## HITS:1 COG:VCA0850 KEGG:ns NR:ns ## COG: VCA0850 COG3279 # Protein_GI_number: 15601605 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Vibrio cholerae # 1 235 2 243 261 102 32.0 8e-22 MNKITAVIIEDEIPAARRLNNTLNELRPEWQITVLPGSVKKSVEWFAENPHPDLVFLDIQ LTDGISFTFIEQAQPESTIIFTTAYDEYAIRAFTVNSIDYLLKPIDNIRLEEAIVKFEHL TTKYLNQGQKPIDLMEILQNITQPGKKYRTRFLISGDDKLFTLQVEDIAYFYSENKVTFA VTKQNREFIIDLSLDKLMEQLDPDVFFRSNRQTVVSINAIVKVESYFLGKAILHVKPPFK DKIIVSRDKIAPLKLWLNY >gi|226332291|gb|ACIC01000029.1| GENE 18 15838 - 16875 448 345 aa, chain - ## HITS:1 COG:FN0220 KEGG:ns NR:ns ## COG: FN0220 COG3275 # Protein_GI_number: 19703565 # Func_class: T 
Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Fusobacterium nucleatum # 140 330 315 522 541 84 32.0 3e-16 MRQKFNYLFIGLLFSGLAFFSYLFLLLYSDFTPQHLEVLISFKSFILILAAFNLVGFGVL MIHNWQKRSFQFLVKRKERLIIDCILTAIILFLMNYLVLSIVKAIFEVPTPFTLKGSGLR MIALVWLVEMVITNLTLTINFYRQLVLLHERTEQVEENSIKAQYAALQNQLNPHFLFNSL NTLISEIEYAPKNAILFTQRLSDVYRYILQSQQQRLVTIESELSFIDSYIFLHKVRLGDC IRIENRIEADNYDLKLPSLTLQLLVENVIKHNIINMDMPMNILIDYDSENGRILVTNKIR IKPNVVSTGMGLKNLSARYLLICNQDITIENNTNYFTVKIPVLNE >gi|226332291|gb|ACIC01000029.1| GENE 19 16987 - 17622 370 211 aa, chain - ## HITS:1 COG:no KEGG:BT_1074 NR:ns ## KEGG: BT_1074 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 211 1 211 211 418 99.0 1e-116 MKTIKFIPKKKYILAFMISGLFYFSSYAQDISDRLHFNVDWQMNAPLNTNFADKISGWGM NLEGGYFLTPHWSLGAFLDFHTNHKYVPRQTITEGTASLTTDRQESAFQLPFGLAASYRF ISNGCLKPYVGTKIGTMYAQNTTYLNTISLTDKPWGFYVSPEIGVNIYPFFQSRFGFHLA AYYSYATNKSELLTGCQDGNNNIGFRLGVCF >gi|226332291|gb|ACIC01000029.1| GENE 20 17634 - 18278 497 214 aa, chain - ## HITS:1 COG:no KEGG:BT_1073 NR:ns ## KEGG: BT_1073 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 214 1 214 214 371 99.0 1e-102 MKKYLFIGLMAVLATSCEKDPDLSKLDNNFTVYTNYDSKTNFNDFKTYCLPDSILLIGQG MKAEYWKDENAQEIIKQVADEMDTRGYTRVKVIKNANIGLQLSFTRQTTQIIGTGGWYGG GWYNGWWGPGYWGPYWNDWYYPYPVTYSYNTGTLIMEMVNLTDHPEDTSQKVKLPVIWHS YATGLLFENSKYNMQLTLDAVNQAFDQSPYIKKS >gi|226332291|gb|ACIC01000029.1| GENE 21 18381 - 18935 410 184 aa, chain - ## HITS:1 COG:no KEGG:BT_1072 NR:ns ## KEGG: BT_1072 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 184 1 184 184 306 99.0 2e-82 MKKLSILSMLLLFISMGNIQAQTVKEESRKQKKAEQELLDQMFFDEAKQAIEMKNFILEA DHVMFKYGTTAFVSPNTNFVAVKGNKAVVQVAFNIPISGPNGLGGITVNGNISGYKQTTD KKGNISVSMNVMGVGISAQVNIRLNKGSNNASVDISPNFNSNNFSLTGSLLPMAKANVFK GNSL >gi|226332291|gb|ACIC01000029.1| GENE 22 19617 - 20024 275 135 aa, chain + ## HITS:1 
COG:no KEGG:BT_1071 NR:ns ## KEGG: BT_1071 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 135 1 135 135 261 100.0 7e-69 MIQVIQLKGKDKHLYQLLAPLVMDPDVIRANNNYPFKTSEDFVWYIAIDNRDVIGFIPVE QKSGKKAVINNYYVAAVDEKRKEILSLLLSSVVTAFIPAGWTLNSVTLIQDQETFEKFEF ASMDKKWTRYVKMNR >gi|226332291|gb|ACIC01000029.1| GENE 23 20028 - 21332 1131 434 aa, chain + ## HITS:1 COG:lin1347 KEGG:ns NR:ns ## COG: lin1347 COG3969 # Protein_GI_number: 16800415 # Func_class: R General function prediction only # Function: Predicted phosphoadenosine phosphosulfate sulfotransferase # Organism: Listeria innocua # 9 433 3 434 434 437 49.0 1e-122 MGKKTITGTKNVYELAQERLKVIFNEFDNIYVSFSGGKDSGVLLNMCIDYIRRNNLKIRL GVFHMDYEIQYKMTIDYVDRILEANKDILDVYRVCVPFRVSTCTSMYQSFWRPWEDNKKD IWVRSMPKKAMKKEDFPFYNTTMWDYEFQMRFAQWLHQKKDAVRTCCLIGIRTQESFNRW RCIYMSRKFQMYHKYRWTSKVGNDIYNAYPIFDWKTTDVWTANGKFQWDYNVLYDLYYRA GVNLERQRVASPFINEAQESLALYRVLDPNTWGKMIGRVNGVNFTGMYGGTHAMGWQMVK LPEGYTWREFMYFLLSTLPERARKGYLRKLSVSVHFWRTKGGCLSDSTIQKLIDAKVPII VMDNSNYKTYKKPVRMEYQDDIDIPEFREIPTYKRMCICILKNDHACKYMGFSPTKEEMS KRSQVMEQYRIIVS >gi|226332291|gb|ACIC01000029.1| GENE 24 21329 - 21871 610 180 aa, chain + ## HITS:1 COG:L69383 KEGG:ns NR:ns ## COG: L69383 COG1475 # Protein_GI_number: 15673430 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Lactococcus lactis # 6 165 7 166 180 221 63.0 6e-58 MSVDKSPVYEVKAVPVEKVLANDYNPNVVAPPEMKLLELSIWEDGFTMPCVCYYNKEEDN YILVDGYHRYTVLKTSKRIYKRENGLLPIVVIDKDLSNRMSSTIRHNRARGMHNIELMCN IVAELDRAGMSDQWIMKNIGMDRDELLRLKQISGLADLFANREFSIPDEVAPTETERKTL >gi|226332291|gb|ACIC01000029.1| GENE 25 22231 - 22575 223 114 aa, chain + ## HITS:1 COG:no KEGG:BT_1068 NR:ns ## KEGG: BT_1068 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 114 1 114 114 199 97.0 2e-50 MRLEGFKVNIGLFNLLLLVAILSCMDDTFTNNTTAKGTFISIRGIDVQSCTHLVNPETYD FVFFANEPYYIPVINQLNYITAYDNLNDIVYSERYSLSEQLIPMIQEVNYQYVA 
>gi|226332291|gb|ACIC01000029.1| GENE 26 22592 - 24415 885 607 aa, chain + ## HITS:1 COG:no KEGG:BT_1067 NR:ns ## KEGG: BT_1067 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 607 2 608 608 1177 99.0 0 MKYIMKIFGLVIILLISACEHETLLTGEPDDPGDISEDKVRLEIFARANSYRLPSTRALG DENTVEMTPWVLVFRGSGTGAIFVEAVQAFEFAGKRYVLLTKRTDGSKYQLLILANPMAQ FYYGNASTGYVFNESQLTEKLIEGTTTLATACTNLLTEPASTSPARIPFSGVDETIPMSY LLEVDKVDNTTKIEKSDGSSLMLLRAVAKIVVVNEASNFELEGVSAVANVPRQGQLHNLI PASIMGNAGNLTNYQYDASCSLPLVSAETITNKQSTENSPIYIYETNTQTYSTYLIIQGI YETKSYYYKMAIVDKDLNYMNIYRNHAYTFTIKKVHGRGYDTVADAKLSKPSNTDLDFEI TVDDSNSYEIIANNDYYLGVSNSVFIAYSPDDASTYEAFKVITDCEKAFPDARTITDNMA EADNSFALSSPADGKLPIVASTSPRVTPVSVLIQSWLMYSEVGQSFEGRDKVNAYITLKL GNLEKQVHIRRRAAIDAGGTILKYMPANSYPFPTTGEVDYHCLTGEVEDGTDNPKEWIKL LPSSMVDRNDTDKIIVEDGHIYIKIMPNTASTTRNGIVYLTTIMPGSLYTGESTVKRIKI DITQKGN >gi|226332291|gb|ACIC01000029.1| GENE 27 24430 - 24592 194 54 aa, chain + ## HITS:1 COG:no KEGG:BT_1066 NR:ns ## KEGG: BT_1066 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 54 1 54 189 112 100.0 3e-24 MKKVSILIVLVLGCFIESKAQKVALKSNLLYDATTTMNLGLEFGLARKWTLDVP Prediction of potential genes in microbial genomes Time: Thu May 12 00:19:04 2011 Seq name: gi|226332290|gb|ACIC01000030.1| Bacteroides sp. 1_1_6 cont1.30, whole genome shotgun sequence Length of sequence - 17708 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 6, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 137 - 367 159 ## BT_1066 hypothetical protein 2 1 Op 2 . + CDS 379 - 1893 1535 ## BT_1064 hypothetical protein 3 1 Op 3 . + CDS 1940 - 3640 1487 ## BT_1063 hypothetical protein + Term 3679 - 3715 8.2 + Prom 3738 - 3797 5.5 4 2 Tu 1 . + CDS 3860 - 4813 757 ## BT_1062 hypothetical protein + Term 4883 - 4923 9.1 - Term 4871 - 4910 4.3 5 3 Tu 1 . 
- CDS 4919 - 5878 646 ## BT_1061 hypothetical protein - Prom 5996 - 6055 4.1 6 4 Tu 1 . - CDS 6120 - 7079 729 ## BT_1060 hypothetical protein - Prom 7146 - 7205 5.9 + Prom 7142 - 7201 7.5 7 5 Op 1 . + CDS 7284 - 8213 628 ## COG0451 Nucleoside-diphosphate-sugar epimerases 8 5 Op 2 . + CDS 8237 - 9370 1039 ## COG0642 Signal transduction histidine kinase 9 6 Op 1 . - CDS 9495 - 12359 2275 ## BT_1057 hypothetical protein 10 6 Op 2 . - CDS 12395 - 13204 686 ## BT_1056 hypothetical protein 11 6 Op 3 . - CDS 13201 - 13827 284 ## COG1180 Pyruvate-formate lyase-activating enzyme 12 6 Op 4 . - CDS 13827 - 17000 2743 ## COG1074 ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) 13 6 Op 5 . - CDS 17020 - 17628 488 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 17648 - 17707 4.1 Predicted protein(s) >gi|226332290|gb|ACIC01000030.1| GENE 1 137 - 367 159 76 aa, chain + ## HITS:1 COG:no KEGG:BT_1066 NR:ns ## KEGG: BT_1066 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 76 114 189 189 153 100.0 2e-36 MQNSRYQGYLYGGGVSVGHSWILKKRWSIEASVGVGYAHIVYDKYPCRACGTKLKDSSKN YFGPTKASVSLIYVIK >gi|226332290|gb|ACIC01000030.1| GENE 2 379 - 1893 1535 504 aa, chain + ## HITS:1 COG:no KEGG:BT_1064 NR:ns ## KEGG: BT_1064 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 118 500 4 386 392 730 99.0 0 MMNRSIQFIVCVLAILCQNINSAVAQTRSYDGVISVVPVQLEQRGKSVYINIDFVLEDVK VKSAHGVDFIPQLVAPAHTYNLPKVSIKGHNEYLAYERWLSLMSAKEKDSYEKPYVVEKG SKKRNDTIRYRYILPYESWMDDARVDVQRDECGCGEIQLMDVEPLGDIELERILVPYVVT PFFAYLQPKAEEVKSRDIQAECFLDFEVNKINIRPEYMNNPKELAKIRAMIDELKSDPSI KVNKLDIVGYASPEGSLANNKRLSEGRAMALRDYLASRYDFSRNQYFIIFGGENWDGLVK ALDTIDFEYKDEALNIINDIPVEKGREAKLMQLRGGVPYRYMLKYIFPSLRVAICKVNYE IKNFNLDEAKEIIKTRPQNLSLNEMFMVANSYPKGSQEFIDVFETAVRMYPKDEIASINA AAAALSRNDLVSAERYLNMVNVNKQLPEYSNAMGVLMLLKGEYEHAEEYLKAAAKSGLQA AGQNLEELAKKKTNAAEIEKKKRK 
>gi|226332290|gb|ACIC01000030.1| GENE 3 1940 - 3640 1487 566 aa, chain + ## HITS:1 COG:no KEGG:BT_1063 NR:ns ## KEGG: BT_1063 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 566 1 566 566 992 99.0 0 MKKKNFLMLAFAAVAFAACSNEDLVPTGNPGDDNQLVTDPTGDAWVALAVKTPVQTRGLH NPNQEAATEDESKITTVRAIFFTANANEDAATVTADLTLDNDEAGLDANGIPSGAAGKAF KVPATSKRILIIANPSPKFEAKTASETDQKWPVGTTYATVNAALDDTSASMTTAGKFMMS NAKGSLEPSDEDEASGTFGAPVDLALYATETAATNNPISIRIDRVVSKVRVYVTADSDAK ATISDAGWVLNATNKTFFPVSKRTLTWNETPANKPLGARGTCITPFDQYKIGSYRVDPNY STQALANYNAYTSTSAPSIWNDPTTTVNKVAEYCLENTQTATHNVHAYTTHVLLKAKFIP SEYGRPAPNDGTPSTAQETDPKLVGDWMLINGGFYTFATLMEWIEAELKAKYADKEPTTF PTARTTAYNRYLAAVGVGEISLPATADAAAITSLVDDFKATKANVGNVAAKDRAITVTGL TYYVGAVSYYKIMIKHDDTSAVTNELGEFGVVRNSVYDINVKKFNNPGYPAIPDPDEGTP DESDEGWLSIEIIPNPWTWYTQEEIM >gi|226332290|gb|ACIC01000030.1| GENE 4 3860 - 4813 757 317 aa, chain + ## HITS:1 COG:no KEGG:BT_1062 NR:ns ## KEGG: BT_1062 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 317 1 317 317 633 100.0 1e-180 MNLKSLICYVALGILLMSSGIVASCDSFNEDLPECRLSVKFKYDYNMEFADAFHAQVDKV ELYVFDKNGKYLFKQAEEGSALSTGNYLMEVELPVGQYQFMAWAGARDSYDITSLTPGVS TLTDLKLKLKREASLIINKRMETLWYGEVINVNFDGTVHQTETINLIRDTKIVRFGFQSY TGSWTLDMNDYDYEIIESNGHLGHDNSLLDDDVLSFRPYYMEQKDPATAYVDMNTMRLME DRKTRLVLTEKASGKRVFDINLIDYLAMTNAEGKNLSTQEYLDRQSNYHIIFFLSESWLA VQIVVNGWVHRIQEENQ >gi|226332290|gb|ACIC01000030.1| GENE 5 4919 - 5878 646 319 aa, chain - ## HITS:1 COG:no KEGG:BT_1061 NR:ns ## KEGG: BT_1061 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 319 1 319 319 600 100.0 1e-170 MKNGIVNELITAMKERIPKGQNLANSLADILYMGKEAVYRRLRGEVAFTIDEVALLSKKL GISIDQIIGNHLSNRVTFDMNLLHSPNTLESYYEIISRYQQIFDYVKGDDTTEVYTASNL LPFTLYSSYEYMSKFRLCRWIYQNGQMKTPNSLAEMQVEDRIVKAHNKLSESVKQCRKTY FIWDSNIFLSFVKEIKYFASLNLITEDDVAHLKDELYQLLSVMETLSVKGEFSEGRKVSF YLSNIDFEATYTYIEKQDYQVSLLRVYSINSMDSQSPQICQIQRNWIQSLKRHSTLISES 
GEAQRIEFLQKQRTVIDTL >gi|226332290|gb|ACIC01000030.1| GENE 6 6120 - 7079 729 319 aa, chain - ## HITS:1 COG:no KEGG:BT_1060 NR:ns ## KEGG: BT_1060 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 319 1 317 317 597 100.0 1e-169 MIMNELNTNLIEAAKEKFPAKGQLANILMDTLYMGKEAIYRRLRGEVPFTFQEAAIISKE LGISLDRIAGVSFSNNAMFDINVVDHGNPFETYYDFLNKHVKLFHTLREDPNASLGTASN VIPQTLYLKHELLAKFRFFKWMYQNEHIKCKHFEELEIPQKIINVQQDFAKLSHHIHSTD YIWDSMVFLHLINDIQYFSDIHLISDEMKQNLKEELLILTDELERLAIKGKTDFGNDVHI YVSQINFEATYSYLETSTLQLSLIRVYSINSLTTQDVQMFQSLKEWIQSLKKFSTLISES GEMQRIQFFNQQREIIDAL >gi|226332290|gb|ACIC01000030.1| GENE 7 7284 - 8213 628 309 aa, chain + ## HITS:1 COG:XF0611 KEGG:ns NR:ns ## COG: XF0611 COG0451 # Protein_GI_number: 15837213 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Xylella fastidiosa 9a5c # 3 307 22 326 329 492 71.0 1e-139 MKRILVSGGAGFIGSHLCTRLVNEGHDVICLDNFFTGSKDNIKHLMGNHHFEVVRHDVTY PYSAEVDEIYNLACPASPIHYQHDPIQTAKTSVMGAINMLGLAMRLDAKILQASTSEVYG DPIIHPQPESYWGNVNPVGYRSCYDEGKRCAETLFMDYYRQNQTRIKIIRIFNTYGPRML PNDGRVVSNFIIQALNNEDITIYGDGKQTRSFQYIDDLIEGMVRMMDTEDDFTGPINIGN PNEFPVLELAERVIRMTGSTSKIVFKPLPTDDPKQRQPDIKLAKEKLGWQPTVELEDGLK RMIEYLKNV >gi|226332290|gb|ACIC01000030.1| GENE 8 8237 - 9370 1039 377 aa, chain + ## HITS:1 COG:CAC3391 KEGG:ns NR:ns ## COG: CAC3391 COG0642 # Protein_GI_number: 15896632 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 130 373 342 576 579 123 36.0 5e-28 MNVEINPSEYKVLIVDDVISNVLLLKVLLTNEKFNIVTAGNGTQALEQVKKEKPDLVLLD VMMPDISGFEVAQQMKADEEMSEIPVIFLTALNSTADIVKGFQVGGNDFISKPFNKEELI IRVTHQISLVAAKRIIIAQTEELRKTIIGRDKLYSVIAHDLRSPMGSIKMVLNMLILNLP SETIGSEMYELLTMANQTTEDVFSLLDNLLKWTKSQIGKLKVVYQNIDMVEVVEGVSEIF TMVAGLKNIHIAVEIPEDRMEVRADIDMIKTVIRNLISNAIKFSNEGSEVLVSLREEDSM 
AIVSVKDSGCGIDEENQKKLLHTDTHFSTFGTNNEEGSGLGLLLCKDFVTKNGGELWFTS KKGEGSTFSFSIPLLES >gi|226332290|gb|ACIC01000030.1| GENE 9 9495 - 12359 2275 954 aa, chain - ## HITS:1 COG:no KEGG:BT_1057 NR:ns ## KEGG: BT_1057 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 954 1 954 954 1859 99.0 0 MQSFLQLVAHDLYTKIGNDLSRTVLVFPNKRANLFFNEYLAGESDQPIWSPAAMSISDLF RKLSVQKAGDPIRLVCELYKVFREETESQETLDDFYFWGELLISDFDDVDKNLVDADKLF SNLQDLKNLMDDYEFLDKEQEEAIQQFFQNFSIERRTVLKEKFISLWDKLGTIYHRYREN LAELGIAYEGMLYRNVIEQLDTTRLKYDKYIFVGFNVLNKVETKFFQLLQDADKAMFYWD YDLFYTKQIAKHEAGEFINRNLKRFPNELPESCFDTLRKPKNIRYISASTENAQARFLPE WVQSTLSTANEEKENAVVLCNEALLLPVLHSIPQEVKNVNITMGFPLAQTPVYSFINAAM ELQTNGYRTATGRFTYETVSAILKHPYTRQLSINAEPLERELTKTNRFYPLPSELKMDDF LIKLFTPRSGIKELCDYLTELIKDISLLYREESEYNDIFNQLYRESLFKSYTTINRLYSL IETGELNVRTDTLRRLVSKVLTASNIPFHGEPAIGMQVMGVLETRNLDFRNIILLSLNEG QLPKSGGDSSFIPYNLRKAFGMTTIEHKNAVYAYYFYRLIQRAENITLLYNTSSNGLNRG EESRFMLQLLVEGPHKITREYLEAGQSPQGTTDISIEKTPEVFERLRCSYDCSNPQSYIL SPSALNAYLDCRLKFYYRYVARLKAPDEVSAEIDSALFGTIFHLSAQLAYTDLTANRKII QKEELERLLRNEVKLQNYVDLAFKQELFKVPADEKPEYNGVQLINSKVIVSYLKQLLRND LQYAPFEMVAMEKPVAEKITIQTGQGPITFRLGGTIDRMDAKDTTLRIVDYKTGGSPKTP ANIEQLFTPSETRPNYIFQTFLYAAIMSRQQSLKVAPSLLYIHRAASESYSPVIEMGEPR QPKIPVNNFAFFEDEFRERLQRLLKEIFDENEPFTQTEDTKKCAYCDFKAICKR >gi|226332290|gb|ACIC01000030.1| GENE 10 12395 - 13204 686 269 aa, chain - ## HITS:1 COG:no KEGG:BT_1056 NR:ns ## KEGG: BT_1056 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 269 1 269 269 484 92.0 1e-135 MKRGKQTCRILKDIRRQIAEANDIEFITSECQYQGDCLGTCPKCEAEVRYLEQQLERKRM AGKAITILGISAGLVAMAPMTSCSSSPNKGTSQETISDSTNASVMFGEVCPTPVEDTIPM VEKDTVNKQELPELPQDMGLIEITPISGEIITVKDSLTEKDSLTDVLEVAAVMPEFPGGA QELMKFISANIEYPEIAQGDMGQGRVIVRFIVDKEGNIIQPKVVRSVDPYLDKEALRIVG LMPKWKPGELDDGTKVAVRFTIPVMFRQQ >gi|226332290|gb|ACIC01000030.1| GENE 11 13201 - 13827 284 208 aa, chain - ## HITS:1 COG:AF1450 KEGG:ns NR:ns ## COG: AF1450 COG1180 # Protein_GI_number: 
11499045 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Archaeoglobus fulgidus # 17 202 18 256 302 101 26.0 7e-22 MKAAPLIGISRHRLSTDGEGVTTLVAFHGCPLRCKYCLNPQSLHSEDIWKHYDCGQLYEE VKQDELYFLATHGGITFGGGEPCLQSDFIDEFRQLCGQEWQLSVETSLNVAQENIEKLVP VVDSYIIDIKDINNAIYQKYTGKDNEKVLHNLQYLIDHGKNEQIIVRTPVIPAYNTENDV DYSIRLLKEMGITQFDRFTYKTPNNCQP >gi|226332290|gb|ACIC01000030.1| GENE 12 13827 - 17000 2743 1057 aa, chain - ## HITS:1 COG:jhp1446 KEGG:ns NR:ns ## COG: jhp1446 COG1074 # Protein_GI_number: 15612511 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) # Organism: Helicobacter pylori J99 # 3 818 6 728 946 132 26.0 4e-30 MSELLVYKASAGSGKTFTLAVEYIKLLVLNPRAYRQILAVTFTNKATAEMKERILSQLYG IQIGDKDSEAYLNRIKEETKRTEQEIREAAGIALGYMLHDYSRFRVETIDSFFQSVMRNL ARELELSPNLNIELNNAEVLSDAVDSMIEKLGPTSPVLAWLLDYINERIADDKRWNVSDE VKNFGRNIFDEGYIEKGEGLRQRLRNPDTIKEYRKQLKALETEILEQMKGFYDQFEGELE GHALTADELKNGSRGIGSYFRKLNNGILSNDIRNATVEKCLEDAKHWATKTSPRFADIID LAKSSLMQILEDAEKLRSKNNLLLNSCRLSLQHLNKVQLLANIDEEVRQLNHDNNRFLLS DTNALLHQLVKDGDSSFVFEKIGTNIRNVMIDEFQDTSRMQWGNFKLLLLEGLSQGADSL IVGDVKQSIYRWRNGDWGILNGLNDRIEHFPIKVKTLATNRRSETNIIRFNNQIFTAAVN YLNEVYKKQLGKDCDDLQKAYADVVQESPRSVQKGYVKATFLEPDEAHDYTDQTLISLGE EVEHLLSSGVRLNDIAILVRKNKSIPRIADYFDKELHYKIVSDEAFRLDASLAICMMIDA LRFLSDESNKIARAQLAIAYQNEVLQKNLDWNTLLLLPIENYLPPAFLEKQKELRLMPLY ELLEELFSIFEMSHIEEQDAYLFAFFDAVTDYLQSNSSELDGFIRYWDETLCSKTIPSGE VEGIRIFSIHKSKGLEFHTVLLPFCDWKLENETNNQLVWCAPQEAPFNAMDILPINYSTQ MAESIYGNDYLLERLQLWVDNLNLLYVAFTRAGKNLIIWSKKGQKGTMSELLANTLPVVA LKEGLDWEEDCYEQGELCPSEEEKAKTSTNKLTQKPDKLPVHMESMRHEIEFRQSNRSAD FIQGIEEEESDDRFIHRGRMLHTLFSAIETAEDIDPAIERLIFEGVIASSEKAEEIREVA RKAFSSPEIQDWYSGEWTLFNECAIIYKEKGVLQTRRPDRVMMKDGQVVVVDFKFGKENL QYNKQVKGYMQLLTKMGYKNITGYLWYVDEELIVKVK >gi|226332290|gb|ACIC01000030.1| GENE 13 17020 - 17628 488 202 aa, chain - ## HITS:1 COG:XF2239 KEGG:ns 
NR:ns ## COG: XF2239 COG1595 # Protein_GI_number: 15838830 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Xylella fastidiosa 9a5c # 5 189 7 196 206 64 24.0 1e-10 MQQPPIYLDINNNKSIINALKAGEEYVFDAVYRHYFRRLCAFCSQYVSEQEEIEEIVQDT MMWLWENRCNLMEELTLKTLLFTIVKNKALNRISHFEIRRKVHQEITEKFEKEFDNPDFY LENELFRLYENALKQLPKDYREAYEMNRNHRMTHKEIAEKLNVSPQTVNYRIGQALKLLR IALKDYLPLFILYFGLDIFKQS Prediction of potential genes in microbial genomes Time: Thu May 12 00:19:46 2011 Seq name: gi|226332289|gb|ACIC01000031.1| Bacteroides sp. 1_1_6 cont1.31, whole genome shotgun sequence Length of sequence - 6332 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 78 - 1070 706 ## COG3712 Fe2+-dicitrate sensor, membrane component + Term 1088 - 1138 10.4 - Term 1084 - 1119 7.1 2 2 Tu 1 . - CDS 1130 - 3451 1709 ## COG3525 N-acetyl-beta-hexosaminidase - Prom 3474 - 3533 5.4 - Term 3487 - 3543 9.1 3 3 Op 1 . - CDS 3579 - 3776 102 ## BT_1050 hypothetical protein 4 3 Op 2 . - CDS 3742 - 4452 520 ## BT_1050 hypothetical protein 5 3 Op 3 . - CDS 4518 - 5699 886 ## BT_1049 putative patatin-like protein 6 3 Op 4 . 
- CDS 5706 - 6332 560 ## BT_1048 putative secreted endoglycosidase Predicted protein(s) >gi|226332289|gb|ACIC01000031.1| GENE 1 78 - 1070 706 330 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 114 302 116 299 331 79 31.0 7e-15 MDESILMNYLRGGCNDEECQQVELWCEEAPENRQILEQLYYTLFVGDRMAVMDAVDTEAS LTKFKSLVREKEKKAKRRSISVRWGRYASAAAAFLAGLVFAGGIAWGLLSNKLSDYTVMT TGGQRAQTVLPDGSKVWLNASTKLIYRNSFWSTDRQIDLSGEAYFEVARDKHAPFIVNTK HIKTCVLGTKFNVRAREEEDRVVTTLLQGSVRMESPRTVNNGYLLKPGQTLNINTTTYQA ELIEYAEPTEVLLWINGKLKFKQHSLLEITNIMEKLYDLKFVYDDESLKTERFTGEFSTD NLPDEILNVLMHTNHFSYKKEGRIIRLSKK >gi|226332289|gb|ACIC01000031.1| GENE 2 1130 - 3451 1709 773 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 31 599 31 585 757 433 41.0 1e-121 MKKHHLLGVILLCSGLFSCSSGVIQQANYEVIPLPQEIKITTGNFVLNDRTSIVYPKDNK EMQQNANLLAEYIHQMSGKKLKVTDEPVTSNAIILATGLNADNAEAYQLKVTQDNVTITG TSEAGTFYGIQTLRKSLPITNKGDISLPAAEINDYPRFSYRGVHLDVSRHFFPADSVKHF IDMMALHNINRLHWHLTDDQGWRIEIKKRPELTTIGSKRSETVIGHNSGEYDGIPYSGFY TQDEAREIVKYAKERHITVIPEIDLPGHMQAALAAYPELGCTGGLYEVWKMWGVSEDVLC AGNDKTLTFIEDVLNEIVDIFPSEYIHVGGDECPKVRWKKCPKCQARIKELGLKADKGHT AEQRLQSYIINYAEQFLNGKGRQIIGWEEILEGGLAPNATVMSWRGIEGGIEAVKHKHDA IMTPSSFLYFDYYQTMDTDNEPPAIGGYVPLEKVYSYEPVPQILTPEEAKHIVGVQANLW TEYIPTYSQVEYMELPRMAALSEIQWTMPKKKEYADFLKRLPGLIAVYDINQYNYAKHIF QVKSQYIPDTEANVLNVVLSTIDNTPVYYTLDGSEPTASSNIYTDTLKIGQSCTLKAITI RPNGSSAVLKEDIKFNKATMKPITMQQPINEKYKFEGKNTLIDGLAGSRNYRTGRWIAFY QNDLEAVIDLQQETPISKAWVRTYVEIGEEILDLRELSVAVSNDGKEYKEVKSEVYPAVS KEDKNGIYTHELSFDTVQARYVKIAARPEYNIPAWHWGKGRPAFIFVDEIGLE >gi|226332289|gb|ACIC01000031.1| GENE 3 3579 - 3776 102 65 aa, chain - ## HITS:1 COG:no 
KEGG:BT_1050 NR:ns ## KEGG: BT_1050 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 64 240 301 301 104 82.0 1e-21 MLPTKVEIFTSEDGEKWVSIGQWTTKGGTQTITFEESVKTRYLKYQMITVPGRVDITRFY IYSWE >gi|226332289|gb|ACIC01000031.1| GENE 4 3742 - 4452 520 236 aa, chain - ## HITS:1 COG:no KEGG:BT_1050 NR:ns ## KEGG: BT_1050 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 227 13 239 301 348 80.0 8e-95 MSAMIPFFSSCSDDDSVNEWDMSYVSLLPADYLRPTPSFTLKHVEEEGIEGSVEFQFMAT TQKAVAQDINVNIDATCDGISADKINLTSKTAVIKANATASEPITLSITDWDELESIKGE ANYTLRIKITGIESTATDVANAAYYQEIVLKISKSAERKKENVLLTNAKDWIFTFMEGVE NPESNSVAGTGSSDVATNGIPFWLTVDLKTVKTLTGIQTRHWGAGYASYESRDIHI >gi|226332289|gb|ACIC01000031.1| GENE 5 4518 - 5699 886 393 aa, chain - ## HITS:1 COG:no KEGG:BT_1049 NR:ns ## KEGG: BT_1049 # Name: not_defined # Def: putative patatin-like protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 393 1 393 393 773 99.0 0 MKQIKNIRNFLLISLIAGGTWIGCDDANYSTLDTHVFFEEALTATSTKVTVMGSGETNVT LNAHISNTQQKDNSYSLAIDQAALDAYNKANGTNYIALPETHYTLPDNITIKAGAYNADP ISIHIKAFSEEMNASGESYALPVRLVAKQGSIDVMPVTGSLVILANSIMEFSAPQFVGST ELVAQKFSEAPETYNEFTVEVRFQVSNTANRNRAVFSTSGSDGKSLLLRFEDPQSDNSDH KAHSLVQIQTHETYLNPTYSFEPNKWQHLAVTYNGSKYRIYINGKDGGSKDVAGGPCIFG NMSWFSGGSWWSGCKILVSEARIWSVCRSEIQIQNNMTITSPKSPGLEAYWRFNEGKGNV FEDTTGKGHTITTTVTPVWIDKILSTDEATPWK >gi|226332289|gb|ACIC01000031.1| GENE 6 5706 - 6332 560 208 aa, chain - ## HITS:1 COG:no KEGG:BT_1048 NR:ns ## KEGG: BT_1048 # Name: not_defined # Def: putative secreted endoglycosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 208 168 375 375 414 99.0 1e-114 EYANAIADTINKYGYDGFDIDYEPNYGNKGNIVDDDDRMFIFVDELGKHFGPKSGTGKLL IIDGEPQSIKNRPDVGPYFDYFIIQAYKPGNDNNLDKRLIDGGVAGPGLVQTYGSVMSEE QITKMTIMTENFEAVDAAMDGGYPYTDRYGNSMKSLEGMARWQPKNGFRKGGVGSYHMEA EYPTNPEYKNLRKAIQIMNPSTKPLIKY Prediction of potential genes in microbial genomes Time: Thu May 12 00:20:01 2011 Seq name: 
gi|226332288|gb|ACIC01000032.1| Bacteroides sp. 1_1_6 cont1.32, whole genome shotgun sequence Length of sequence - 4676 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 446 233 ## BT_1048 putative secreted endoglycosidase 2 1 Op 2 . - CDS 471 - 2015 1373 ## BT_1047 hypothetical protein 3 1 Op 3 . - CDS 2034 - 4664 2155 ## BT_1046 hypothetical protein Predicted protein(s) >gi|226332288|gb|ACIC01000032.1| GENE 1 2 - 446 233 148 aa, chain - ## HITS:1 COG:no KEGG:BT_1048 NR:ns ## KEGG: BT_1048 # Name: not_defined # Def: putative secreted endoglycosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 148 1 148 375 283 100.0 1e-75 MKKNYLLYFLLLALSTPIWVSCNDWTESEAKDYFEGPSEEYYAALRAYKKSDHPKAFGWF GNWTGEGASLVNSMAGIPDSVDVVSIWGNWSNITEAQKKDLKFCQEVKGTRFTMCFIITS VGTQITPQHIYDNWESMGFASQQEAVND >gi|226332288|gb|ACIC01000032.1| GENE 2 471 - 2015 1373 514 aa, chain - ## HITS:1 COG:no KEGG:BT_1047 NR:ns ## KEGG: BT_1047 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 514 1 514 514 1001 99.0 0 MKKYLRNITVGTALVLVAGSCTGNFEEYNTNQFQIHEADPSTLMKSMIETIVNIQQNDSQ MMDQMVGQLGGYLTCANVWSGTNFGTFNQSDAWNAGPWNTNFEKIYGNFFQIQEATNESG HYYAFARMIRAITMLRVADCYGPMPYSQVKKGNFYVAYDTEEQVYKNIMEDFASAANVLY NYYKDTNGNAPLASNDPIFSGNYANWARLANSMRLRVAIRISTAYPEMAKENAEAAASHE AGLIEANNDNAMMDCGSQTNPYQLAAVSWGDLRVNANIVDYMNGYGDPRMSKYFNHSTFT GHTQEYVGMRSGEDGFAKSDVAGYSIPAITGTSKLLVFCAAETAFLRAEGKLKNWSVGNK SAKEYYEEGIKLSMEQYGTTMPDNYLTNTEKPTVSHNNDPRGHAYTISNTVAVAWNDNNE EENLQRIITQKWIANYPLGLEAWAEYRRTGYPELYPCIDNLSPSGVDSKRGMRRLRFPYT EVQNNHSNYQQGVSYLKNGGPDNEATDLSWGKTN >gi|226332288|gb|ACIC01000032.1| GENE 3 2034 - 4664 2155 876 aa, chain - ## HITS:1 COG:no KEGG:BT_1046 NR:ns ## KEGG: BT_1046 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 876 65 940 940 1632 99.0 0 
MKDANFINSLAGKVAGVQISSGATGAGGATRVVMRGMKSLTKNNNALYVIDGVPMFNTGS SGGDGQYGSMGGSDAVADLNPDDIESISMMTGPSAAALYGSAAANGVVLVNTKKGKKEKI SLTVSNSTTFSKAYIMPEMQNRYGTSSGLFSWGDLTDKRYDPKDFFNTGSNIINSITLST GSDKNQTYFSASTTNSDGILPNNSYNRYNFTARNTTNFLNDKLTLDIGAQYIIQNNTNMV SQGQYYNPLPALYLFPRGDNFDEIRLYERYDTNYGFMKQYWPYGDGGLSLQNPYWTQNRI RRTSDKKRYMLNASLKWKVTDWLNITGRVNLDNSDYRNKTEKYASTLATFCSVNGGFEDA MRQERSLYTDLMATVDKTFGDFRLNTNIGMSLYHTSMQTIGFAGDLIIPNFFALNNINYA ANYKPLPDGYDDEVQSIFANVELGWRSQLYLTLTGRNDWDSKLAFSNQSSYFYPSVGLSA VLTEMFALPKIISYAKVRGSYTVVASSFDRYLTNPGYEYNSQTHNWANPTVYPMDNLKPE KTKSWEIGLNLKFWENRFNLDATYYRSNTLNQTFKVDIPSSSGYNKAIVQTGNVQNQGIE LALGFTDEWAGFRWASNATFTLNRNKVKRLAGGSTNPVTGELIDMPNMPVGWLGKENVAP RVILTEGGTMTDIYVYNELTKDNNGNIKVDAQTGGLSMHTAETPTKVGNLNANYNLGWTN NFTYKGINLGVVLSARVGGLAYSATQGVLDYYGVSSITANARDNKGVPMNNGKVDTQKYY QTIGTGEGGYGRYYLYSATNVRLQELSLSYTLPKKWLNNVANVTFGLVGRNLWMIYCKAP FDPELSASTSSNYYMNVDYFMQPSMRNIGFNVKVQF Prediction of potential genes in microbial genomes Time: Thu May 12 00:20:22 2011 Seq name: gi|226332287|gb|ACIC01000033.1| Bacteroides sp. 1_1_6 cont1.33, whole genome shotgun sequence Length of sequence - 9863 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 707 - 747 1.8 1 1 Tu 1 . - CDS 755 - 1321 352 ## BT_1041 integrase - Prom 1446 - 1505 2.9 2 2 Tu 1 . - CDS 1724 - 1996 135 ## BT_1041 integrase - Prom 2021 - 2080 4.7 + Prom 2254 - 2313 4.5 3 3 Op 1 . + CDS 2365 - 5658 2349 ## BT_1042 hypothetical protein 4 3 Op 2 . + CDS 5679 - 7319 867 ## BT_1043 hypothetical protein 5 3 Op 3 . + CDS 7351 - 8445 788 ## BT_1044 hypothetical protein 6 3 Op 4 . 
+ CDS 8464 - 9645 803 ## BT_1045 hypothetical protein + Term 9674 - 9720 6.0 Predicted protein(s) >gi|226332287|gb|ACIC01000033.1| GENE 1 755 - 1321 352 188 aa, chain - ## HITS:1 COG:no KEGG:BT_1041 NR:ns ## KEGG: BT_1041 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 188 226 413 413 377 100.0 1e-104 MSFQDSTNELEEFRDLWFFSYLCNGINFMDLLFLQYSNIVDGEICFIRSKTSRTTKHSKE IRATITPEMWDIIHKWGNPNISPQTYIFKYAKGNEDAFEKIKLVRRIVTKCNRKLKKIAQ GTGIAQLTTYTARHSFATVLKRGGAKTSYISESLGHSNLTVTENYLACFEKEERIRNARL LTNFDNKM >gi|226332287|gb|ACIC01000033.1| GENE 2 1724 - 1996 135 90 aa, chain - ## HITS:1 COG:no KEGG:BT_1041 NR:ns ## KEGG: BT_1041 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 87 1 87 413 160 100.0 1e-38 MFKYSKDGVSVLTILDTRRAKINGLFPVKVQVVFRRKQKYYSTGKELSKEDWDRLLKAKS QLLMEVRTDIESSFSIIKQQVSELIQKRRI >gi|226332287|gb|ACIC01000033.1| GENE 3 2365 - 5658 2349 1097 aa, chain + ## HITS:1 COG:no KEGG:BT_1042 NR:ns ## KEGG: BT_1042 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1097 1 1097 1097 2032 99.0 0 MKEIIKHKFPYLLFFLLLFSSFASAYGQERMITLNLSKVPLNTALKEIEKQTSMSVVYNT NDVDINRVISIKVSKESLTNVMGQLFKGTNISYSIVDKHIVLSTKKEVEQQKKTPIVATG TVTDAQGEPLIGVSILVKGTATGAITDMDGNFKIQAAKGDVLEISYIGYASQAITLTNAQ PLKITMGEDTQKLDEVVVTALGIKRSEKALSYNVQKVNNDALTSVKDANFVNSLNGKVAG VNIQRSASGVGGSTRVTMRGNKSISGDNNVLYVVDGVPIGNQADRTGDGTGFGSGRTSGE GIANFNPDDIESVSVLTGPSAAALYGASAANGVILINTKKGEAGKMRIDVSSSVEFMTPL TMPKFQNRYGISGNYYSWGDKLENQSSYDPKDFFELGATFNNSFNLSTGNDKNQTYFSIA AVNSDGIVPNNKYHRYNVTLRNTAKFLNDKLTLDASASYIREYYNNMISYGTYFNPIVGA YLYPRGEDFEKEKYFERYNSELGYSQQQWSPGDFGMDIQNPYWIAYRNIRPEVKDRYMLY ASLKYDITNYLNVAGRVRLDNTYTEKEDKRYASTINTYASTNGRYTYSNETFRQKYADIM VNFDKQFAEIYHATINAGTSFEEYDTKGRGYGGQLLLVPNKFVYSNIDPTQSTASQTGGD SRRRNFAVFASAELSWNSALYLTLTGRADKPSQLVNSDDPWIFYPSVGLSAIVTELLPNN IKEKLEPTLGFLKVRASYTEVGSPIPYTGLTPGTLTHELEGGTFKPFKYYPISNLKAERT RSYEVGLDSKWFNNTITFGVTYYHSNTYNQLLKATLGSNFEYMFVQAGSVQNRGLELSLG 
FDKKFGDFNYNTTFTATTNKNKIIQLARDVLNPVTNTLIDLTDIQVGRFRLREGGEIGTV YANEWLKRDGDNYIDYQPGQALTTETTSPYKLGSVNPKWNLGWQHGFSYKGFNLNLLFTA RIGGIVISKTQAMLDRYGVSEASADARDAGGVSLGYFKVDPKTYYDAVSNLDAYYTYSAT NVRLQEARLTYTFPNKWFKGTVNNLSLSVYGTNLWMIYNKAPYDPELTASTGTFGQGYDY FMLPSQSTYGLSLKFGF >gi|226332287|gb|ACIC01000033.1| GENE 4 5679 - 7319 867 546 aa, chain + ## HITS:1 COG:no KEGG:BT_1043 NR:ns ## KEGG: BT_1043 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 546 1 546 546 1046 99.0 0 MKKYIILLLALCSMSACDYEAVNTNPYGVSDGELGPLKYGARFMNMQQRVIPIGSPSLTT GPGNDLQNTDLISSGNYIGYFGNNNNWGFNNEANWNFTDSRMNYAYQNFYSQIFLPWNEI YEIAKDSDSPSEQAILEIANIVRNIAWLRATDVFGPIAYNSAGDGSIAPKFDSQEVVYRS MLADLSKSVELLNTISYSVMAQYDLIYNGNVQNWVKLANSLMLRIAVRVHFIDETLAKEY ITKALDPKNGGVIEDISSEAKIKSSDKMPLLNSMLASVNEYNETRMGATIWGYLDGYKDP RLSAYFTEGTYGSGSWAQTGYFPVAPTNSKSKSETSYSAKFASRPKVDSNSPLYWFRASE TYFLKAEAALYNLIGGDPKTFYEQGINISFQEQGVSGVATYLSGTGKPTGLTGSNYKYGT YNHDLSIGNTSPKWDDYTGNLSKQEEQLQKIITQKYLALYPNAVEAWTEYRRTGFPYLMK PMDEVAPGRIGASIEDCRVPERFRFAPTAYNSNPNMAEIPTLLGGGDIGATKLWWVRSNR PKQPNQ >gi|226332287|gb|ACIC01000033.1| GENE 5 7351 - 8445 788 364 aa, chain + ## HITS:1 COG:no KEGG:BT_1044 NR:ns ## KEGG: BT_1044 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 364 1 364 364 749 100.0 0 MKLLKYLCIGISALSILSCSDWTSEEREVFENQEGMHRLIPLIEAQTEEDLTPTMREYFA QIREYRKTPHVKGFGWFGNWTGKGNNAQNYLKMLPDSVDFVSLWGTRGYLSDEQKADLKF FQEVKGGKALLCWIIQDLGDQLTPKGLNATQYWVEEKGQGNFIEGVKAYANAICDSIEKY NLDGFDIDYEPGYGHSGTLANYQTISPSGNNKMQVFIETLSARLRPAGRMLVMDGQPDLL STETSKLVDHYIYQAYWESSTSSVIYKINKPNLDDWERKTIITVEFEQGWKTGGITYYTS VRPELNSMEGNQILDYATLDLPSGKRIGGIGTYHMEYDYPNDPPYKWLRKALYFGNQVYP GKFD >gi|226332287|gb|ACIC01000033.1| GENE 6 8464 - 9645 803 393 aa, chain + ## HITS:1 COG:no KEGG:BT_1045 NR:ns ## KEGG: BT_1045 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 393 4 396 396 765 99.0 0 
MKKNIYLAVLLFLTGCQNELYKNPLDDFQSEQGVYIANNSVLQSFVEEGKDVEIKGLQVA LAVQDTKEIKVTLEAGNQAQLDAYNAKNGTNYIVLPKEMYEISQKITFMPLYAMMDVPIS LKNVKFSLEGNYALPIRITGGDVNIIRGQEETLLVLEQRVNTKALRISTSGSGSGSEDDK MFPNDFKVDQWTMEVMVNRASYKSNNRSICGTKLVANAGPMDEIYTRFGDVTINPNQLQI KTGSSQIDVPADKFSAQPDTWYMLAFVYDGQKNYVYVNGVLVADREIRTGPYGLIGFWIG GSNELIREVRFWKTARTSQEIASNIWKMVNPDDDNLLLYYPLNGKKRDSATGEITEDETL LWDWSKNGKNLPKPSSYSFDDNKGNGFIFPPQE Prediction of potential genes in microbial genomes Time: Thu May 12 00:20:55 2011 Seq name: gi|226332286|gb|ACIC01000034.1| Bacteroides sp. 1_1_6 cont1.34, whole genome shotgun sequence Length of sequence - 3160 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 2659 1550 ## BT_1040 hypothetical protein 2 1 Op 2 . + CDS 2668 - 3160 350 ## BT_1039 hypothetical protein Predicted protein(s) >gi|226332286|gb|ACIC01000034.1| GENE 1 2 - 2659 1550 885 aa, chain + ## HITS:1 COG:no KEGG:BT_1040 NR:ns ## KEGG: BT_1040 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 885 36 920 920 1628 100.0 0 ELTNIKDANFINALSGKVAGVTINASSAGAGAAARVVMRGTKSLEKNDNALYVIDGIPMF NVNSGDNAGGTMNKQPGSNSVADINPEDIESMTILTGPSAAALYGSDASNGVILITTKKG TVGKVQISYSNSTSFSSPMMMPKFQNIYGNREGELGSWGSLMDTPSNFDPSDFFNTGMTE MNGFTLTTGTEQNQTYASVSTTNSTGILPNNAYNRYNFSIRNTAKFCDNKLSLDLGAQYI IQNNKNMVGSGQYFNPLVSLYLFPRGENFQEVQMYERYSEARNLMVQYWPESIFGTDLDM QNPYWIMNRMQNELSKRRYMFNASLKWDITDWVNVTGRVRVDNSDSDSYEKYYASTRGTF TESSSKGYYGHTKQNDRSVYADVMASINKNFFDDKLSLNATVGASINDIQEDAMYLKGGL EQIPNFFHYGNINVNTSKRNESKWHDQVQSVFASAELGWNHQLYLTVTGRNDWASQLAFT SKGSYFYPSVGLSWLVSESVKLPKAISYLKVRGSWAEVASSPNRYLTQMQYTYNEQTNTY EYPASHYNTNLKPENTKSWELGVNAKFLGNRINLDMTFYRSNTFNQTFYVDASASSGYKN NIVQTGNIQNQGIELALGYSDTFDKVKVSTNFTYTLNQNKIVSLANGAINPETGEAIEME YYSKGTLGTSGGPTLRLYENGSMGDIYINQRLRQSPNGYIWKDPSNGSLAIENTEYRKVG SILPKYYLGWNGSVAWKGWNLGFAFTGRFGGLVVSDTQAMIDKYGVSESSAVARQAGGVW 
VGDSKVDAEEYYSKVTTAVGTYYTYSATNIRLSELSLSYQLPKTWFKNHVNMTLGLTGKN LWMIYCKAPYDPESTSSVTSNFYQGVDYFQQPSLRSLGFNVKLTF >gi|226332286|gb|ACIC01000034.1| GENE 2 2668 - 3160 350 164 aa, chain + ## HITS:1 COG:no KEGG:BT_1039 NR:ns ## KEGG: BT_1039 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 164 1 164 527 341 100.0 5e-93 MKKNKIIKKFAKLSVYTLLVALGTGCTDKFEEYNTNPFGPKPDQMLGDNAITGSLIKSMI PALVQGQQNNSQMLDQMIGSEYGGEITCIAQWGNGGNYYTYNPRVGWYGNMFDTTMPQIY TGYFQIRDLSDGKGLAYQWAQILRVAASLKISDCYGPIPYSQIT Prediction of potential genes in microbial genomes Time: Thu May 12 00:21:10 2011 Seq name: gi|226332285|gb|ACIC01000035.1| Bacteroides sp. 1_1_6 cont1.35, whole genome shotgun sequence Length of sequence - 9921 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 2, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 4 - 1065 845 ## BT_1039 hypothetical protein 2 1 Op 2 . + CDS 1092 - 2114 731 ## BT_1038 hypothetical protein 3 1 Op 3 . + CDS 2131 - 3345 628 ## BT_1037 hypothetical protein 4 1 Op 4 . + CDS 3362 - 4711 886 ## BT_1036 hypothetical protein + Term 4716 - 4761 9.8 + Prom 4799 - 4858 8.1 5 2 Op 1 . + CDS 4878 - 7082 1467 ## BT_1035 hypothetical protein + Term 7093 - 7128 2.1 + Prom 7100 - 7159 4.0 6 2 Op 2 . + CDS 7181 - 8512 1012 ## COG0477 Permeases of the major facilitator superfamily 7 2 Op 3 . + CDS 8565 - 9533 861 ## COG2152 Predicted glycosylase 8 2 Op 4 . 
+ CDS 9568 - 9919 221 ## BT_1032 hypothetical protein Predicted protein(s) >gi|226332285|gb|ACIC01000035.1| GENE 1 4 - 1065 845 353 aa, chain + ## HITS:1 COG:no KEGG:BT_1039 NR:ns ## KEGG: BT_1039 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 353 175 527 527 678 99.0 0 MEDLYKHMFDDLDEAISAFKVAVLGNEDMSSLSEYDLVFNGDFNKWVKFANSLKLRMAMR ISSVSPALAKEKAEEAVSDVIGVMTSASDAAYSKYNDGMNPYYRVTSSWNEIRISANITS YLSGYNDPRLAKYASEADASIGGGQIGVRNGIYQSATTQANYTKFSRLNIGIDDELLIMS ASEAYFLRAEAALMYKWNMGGTPKELYDTGVTVSMEERKATIGDYLKSEKVPAAYSDPSD SGKDIAAVSTVTPKYNESASVTENLERILIQKWIANFPNGWETWADIRRTGYPKLFPIVN NLNTDGVTVQRGMRRLPFPQSEYNTNNANVKAAVSMLGGADNSATDLWWAKKN >gi|226332285|gb|ACIC01000035.1| GENE 2 1092 - 2114 731 340 aa, chain + ## HITS:1 COG:no KEGG:BT_1038 NR:ns ## KEGG: BT_1038 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 340 1 340 340 709 100.0 0 MKTSKLKKIIYGIGISLFFSQGFLNIACSDWTDIEAKDYYEPPTQGYENNLKDYFNSPHK IMFGWFGNWAGKGGSSMQYALCGLPDSTDFVSLWLCWGNLTVEQQADLKDFQAKGSRAVL CWRAGDIGDNLTPGGNDDAVKEAFWGFDPKDEQSCIEAAKKYALAIVDTCNKYNIDGFDY DIEDWGTLMNSSMPSVPNAFMKTLREEFDKTGKMLVADIPGGAGWLSFYEVLSEETVLSL DYIAWQTYELGHSGLDSFFEAVRRSHPNVFKEAMGKSIVTATFERAVDKHYFTEQQSYHP ACGIEHAGMGAYHIEYDYPGNPDYPTVRAAIAAQNPPINN >gi|226332285|gb|ACIC01000035.1| GENE 3 2131 - 3345 628 404 aa, chain + ## HITS:1 COG:no KEGG:BT_1037 NR:ns ## KEGG: BT_1037 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 404 1 395 395 772 99.0 0 MKRISLLKIMAFGVLALVHTACENAKYDVIDNLVYINEASTSKTKELTLTEGTTRTSLTV RLVNIINQTIKANLFIDESVLDAYNKKNETNYKMPPKEYISFPESVTIEAGSVSADPVNV DIKSFEAEGGAQYAIPISIKNVEGGIEKSESSSSFLLVLVKPLKQAVPKLTWYNGMHAAP EGAWGLALPNYTLEWWSKVTSKSGNGGYTKNNQAIINSGGSGTELYIRFGDLIYSENGSY KNNFLQVKTMGSQFDTGDPTKGKGLEAQKWYHFAVSYDAASGTTLLYQNGTVVASLSSSI GQPMNIDQFQMISSGEEYFPDFCEMCQVRFWKVTRTVNQIKKSMYTEVDYTDKNLLLYLP MNEGEGATILHDVTGNGHDVEIGNSNLGNENRQKVTWETYSFAQ >gi|226332285|gb|ACIC01000035.1| GENE 4 
3362 - 4711 886 449 aa, chain + ## HITS:1 COG:no KEGG:BT_1036 NR:ns ## KEGG: BT_1036 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 449 1 449 449 819 100.0 0 MKALFKYILYSTFFAMSLPLVTACDDDEATDPYDINYVYIYKPSETNATLEYKGDGTFLK EIATEHILSPVRCTKPAPQDLTIQLAIDRSLVDAYNSEYGTSYVALQNAELENATLYIKQ GEYISVDTLKVHYTNMEEFKNGSENYILPIAITSINGTGVSASESNSKIFLTFESTYKPN HVTLTSSKEYLLEYQYSQGFTNLSSEWKLDGILKSAWAADGDTKVTLAINPALVGTYNAL HGTDYSLLTSASLKQSTIMIKQGKTTPEESVALSFSDDMASVQLGTNYVIPVTITNVDGV GADIGENKVIYIVYKTALVGFLSCVDKPVGSVINDHSGWSFTLNGQSSDLYDYVPDGSIL DIDLGKQENIKTIALHFYEWFYSSESASIAISNDGEKYEDLGVASGFANKTSYILLLVAK QAQYIRVTFHGALYYSPYINTVGIYTETE >gi|226332285|gb|ACIC01000035.1| GENE 5 4878 - 7082 1467 734 aa, chain + ## HITS:1 COG:no KEGG:BT_1035 NR:ns ## KEGG: BT_1035 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 734 1 734 734 1566 99.0 0 MRIIRRIIFQLLCLLGACCYIPATAQVVLVDNGKTKSRIILSENDQINQISANLFQLFLQ RISGCTFPIVKGQNAKKGDIIISSKTPAGVTEDGFSLSTKDGILRISGSGNGTVYGVVTL LEQYLGVDYWGENEYSFNPAKTIELPLIDKIDNPAFRYRQTQCYAIRTDSIYKWWNRLEE PNEVFAAGYWVHTFDKLLPASVYGKDHPEYYSYFKGKRHPGKASQWCLSNPEVFEIVAQR IDSIFKANPDKHIMSVSQNDGNYTNCTCDACKAIDDYEGALSGSIITFLNKLAARFPDKE FSTLAYLYTMNPPKHVKPLPNVNIMLCDIDCDREVTLTENASGKEFVKAMEGWAAITNNI FVWDYGINFDNYLAPFPNFHILQDNIRLFKKNHATMHFSQIAGSRGGDFAELRAYLVSKL MWNPEANVDSLMQHFLHGYYGEAAPYLYQYIKVMEGALIGSGQRLWIYDSPVSHKYGMLK PQLIRRYNQLFDDAEKAVAEDNKFLKRVQRARLPIQYSELEIARTETGTDMNEISPKLAL FEERVKEFNVPTLNERSNSPVEYCQLYRERYMPRAEKSVAIGAKVTYLIPPTGKYAEIGK TALVDGLFGGSTFVESWIGWEGTDGAFVIDLGKEKEIHSIDTDFLHQIGAWILFPLKVVY SYSEDGENYTHWGTHDMPENRSGKVEFRGVKTESEMPVHARYVKVEVTGTKECPEWHYGV GHPSWFFIDEVTIK >gi|226332285|gb|ACIC01000035.1| GENE 6 7181 - 8512 1012 443 aa, chain + ## HITS:1 COG:YPO3162 KEGG:ns NR:ns ## COG: YPO3162 COG0477 # Protein_GI_number: 16123324 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function 
prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 56 441 53 411 492 98 23.0 3e-20 MQQNKTASQGKTINPIYWVPTAYFAMGLPFIAINLVSVFMFKDLGISDTQITFWTSLIMM PWTLKFLWSPFLEMYRTKKFFVLVTELLSGILFGVVAFSLFFDYFFAISISTMAVIAFSG ATHDIACDGVYMAELNKEDQAKYIGVQGAFYNVAKLVANGGLVALAGMLAEHFGAIEGAS IDANKGAYSSAWMIIFAVIAAVMVLLGLYHIKMLPSTQVPATGKKTTSEIMQDLVNVIGN FFTKKHIVYYIFFIILYRLAEGFIMKVAPLFLRASREVGGLGLSLTEIGTLNGVFGSAAF VIGSLLGGIFVARYGLKKTLFMLCCVFNLPFLAYTFLAVVQPTSLYLIGTCITMEYFGYG FGFVGLTLFMMQQIAPGKHQMSHYAFASGIMNLGVMLPGMISGYFSDLLGYRSFFIYVLI ATIPAFLMTYFIPFTYDDSKKKK >gi|226332285|gb|ACIC01000035.1| GENE 7 8565 - 9533 861 322 aa, chain + ## HITS:1 COG:TM1225 KEGG:ns NR:ns ## COG: TM1225 COG2152 # Protein_GI_number: 15643981 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Thermotoga maritima # 6 321 11 326 326 374 58.0 1e-103 MNKIQIPWEERPVGCTDVMWRYSQNPVIGRYHIPSSNSIFNSAVVPFKDGFAGVFRCDNK AVQMNIFTGFSKDGIHWDISHEPIQFKAGNTEMIESEYKYDPRVTWIEDRYWITWCNGYH GPTIGIAYTFDFIDFFQCENAFLPFNRNGVLFPQKIDGKYAMLSRPSDNGHTPFGDIYIS YSPDMKYWGEHRCVMKVTPFPESAWQCTKIGAGSVPFLTDEGWLLFYHGVITTCNGFRYA MGSAILDKDHPEKVLYRTREYLIGPAAPYELQGDVPNVVFPCAALQDGERVAVYYGAADT VVGMAFGYIQEIIDFTKRTSII >gi|226332285|gb|ACIC01000035.1| GENE 8 9568 - 9919 221 117 aa, chain + ## HITS:1 COG:no KEGG:BT_1032 NR:ns ## KEGG: BT_1032 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 117 10 126 756 246 97.0 3e-64 MAFTLLPVFGGEGKAPPMNKEKSLLIDYVDPFIGTTNFGTTNPGAVCPNGMMSVVPFNVM GSADNTYDKDARWWSTPYEYTNCFFTGYSHVNLSGVGCPELGSLLLMPTTGELNVDY Prediction of potential genes in microbial genomes Time: Thu May 12 00:21:40 2011 Seq name: gi|226332284|gb|ACIC01000036.1| Bacteroides sp. 
1_1_6 cont1.36, whole genome shotgun sequence Length of sequence - 6804 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 1843 1575 ## COG3537 Putative alpha-1,2-mannosidase + Term 1882 - 1925 8.1 + Prom 2015 - 2074 6.0 2 2 Tu 1 . + CDS 2143 - 3036 631 ## BT_1031 hypothetical protein + Term 3102 - 3137 6.0 + Prom 3483 - 3542 9.4 3 3 Op 1 . + CDS 3565 - 5040 723 ## BT_1030 hypothetical protein 4 3 Op 2 . + CDS 5070 - 6804 1274 ## BT_1029 hypothetical protein Predicted protein(s) >gi|226332284|gb|ACIC01000036.1| GENE 1 2 - 1843 1575 613 aa, chain + ## HITS:1 COG:CC0533 KEGG:ns NR:ns ## COG: CC0533 COG3537 # Protein_GI_number: 16124788 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Caulobacter vibrioides # 9 605 146 745 770 312 33.0 1e-84 SNYLTKYNIKTEVSATPRTGIARFTFPRGKSHILLNLGEGLTNESGAMLRRVSDSEIEGM KLLGTFCYNPQAVFPIYFVMRVNKVPTTTGYWKKQRPMTGVEAEWDRDQGKYKLYTRYGK EIAGDDVGAYFTFETEEGEQVEVQMGVSFVSIENARLNLDKEQSGKNFEQVLSDARAQWN DDLSRITVEGGTDAQKTVFYTALYHLLIHPNVLQDVNGEYPAMESDQILTTKGTRYTVFS LWDTYRNVHQLLTLVYPERQMEMVRTMLDMYREHGWFPKWELYGRETLTMEGDPSIPVIV DTWMKGLRDFDVDLAYEAMYKSATLPGAENLMRPDNDDYMSKGYVPLREQYDNSVSHALE YYIADFALSRFADALGKKKDAEMFYKRSLGYKHYYSKEFGTFRPILPDGTFYSPFNPRQG ENFEPNPGFHEGNSWNYTFYVPHDVYGLAKLMGGKKPFVNKLQMVFDEGLYDPANEPDIA YAHLFSYFKGEEWRTQKETQRLLDKYFTTKPDGIPGNDDTGTMSAWAIFNMIGFYPDCPG LPEYTLTTPVFNKVTIRLDPKWYKENELVIESNRTGSETLYINKVLLDGKKFNKYRITHD ELVHGKRLIFDLK >gi|226332284|gb|ACIC01000036.1| GENE 2 2143 - 3036 631 297 aa, chain + ## HITS:1 COG:no KEGG:BT_1031 NR:ns ## KEGG: BT_1031 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 297 1 297 297 592 99.0 1e-168 MNRKKLFQHILWILIVAECFPMLAVAASKQKEQRYKIAVCDWMILKRQKIGSFQLVHELK GDGVELDMGSLGKREMFDNKLREPHFQQLFRETAQNYNLEVPSIAMSGFYGQSFLDRANY 
KELVRDCLDAMKVMGAKVAFLPLGGVEAGWQETPELRTALIKRLKEVGDMAASERVVIGI ETQLDAREEVKLLKEINSPGIKIYFKFQNALENGRDLCKELKILGKNRICQIHCTDTDGV TLPFNERLDMNKVKKTLDKMGWRGWLVVERSRDKDDVRNVKKNYGTNIEYLKKVFQE >gi|226332284|gb|ACIC01000036.1| GENE 3 3565 - 5040 723 491 aa, chain + ## HITS:1 COG:no KEGG:BT_1030 NR:ns ## KEGG: BT_1030 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 491 1 491 491 966 99.0 0 MKSKHLFLRVKRDSLAFAAAILLFSSCVDGYKDDWTFSSEVQGVTLESPSTEGWTFTRNV DITSLEIKWPVVLGAGGYQVTFYNIDDPDNPVVIGEENEVVDKCTITRDITEDTKYKVAV RALGNQKYNNADAVAATEMTYNTLVRVRETIPTGTNLTEYFTANPIEPLAEGEEEIAYEL VAGGVYEMDGNIDLGTTTLTIRGDKVNHAKLTMKRNASFINRGAGLKIKFIDFDFDADTY SASNSRGVVMFNSTEAGIVQQPYVFQSCTIKDLPVPLYYCNNGYALSSLSITDCLVSINT ASTIFIAFNGQGWIKDLSFSNSTIYYTAPGSAYFVQMRGRTPSNFSGSGWSTSLRKMSNC TFYQIGTNNRFFNNVINSNSAVFFLEMQNTIFADCVVSSATGTEGVFRRICNAGNYGNVN YTLGYNTYYYSAVPGGFLDYDTSDENGRDHSGTAIKVEPKFVNAANGDFTLSSSEHIANR CGDPRWLPTTE >gi|226332284|gb|ACIC01000036.1| GENE 4 5070 - 6804 1274 578 aa, chain + ## HITS:1 COG:no KEGG:BT_1029 NR:ns ## KEGG: BT_1029 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 578 1 578 1102 1113 100.0 0 MSKRYFILGLCMLFIQAFAQIVYAQNARTVTGVVVDEFGDPIIGAAIKIVDSTVGTISDI DGKFSLPVPEGGRLAVSFVGYVSQTITNLNNPKIVLKEDVANLDEVVIVGYGTQKMKNIT GAIETITPDEIKDLSVGNLGDALSGMMSGLHVNSGGGRPGSTPSLQIRQSNINTSITPNS TRGGDADPSPLYVIDDFISTEDAFNNLDVSEVESITVLKDASAAVYGARAAYGVILVKTK RGKVGTPSISYTGQFGFTDALKKPKMLSAYDYGRIYNAARVAGTSTGESESDNRRTQYFQ ADELEAMRGLNYDLLDDEWSAAWTQRHSFSINGGTEKATYFAGASYYNQEGNMGRLDYDR WNFRAGVNANIGKWIKASLQFSGDMGEQNNSRNGITSGGTDADFNSLMTHLPFVPGYVGG RPVIYTGMENVSSGLSAVRLFHFGAVQDSPDNTQNQTNNMSINGSLEYDFGWSKWLKGLK VKGSYSRSIINNKSNNIGTKMNVYRLLERGGSGNHLYTGDDINIDDSNFGTFTLDNGNLL SRAMNKTDNYQMNLTVSYARQFGLHNVNGLFSIEKAES Prediction of potential genes in microbial genomes Time: Thu May 12 00:22:00 2011 Seq name: gi|226332283|gb|ACIC01000037.1| Bacteroides sp. 
1_1_6 cont1.37, whole genome shotgun sequence Length of sequence - 17316 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 4, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 1470 1291 ## BT_1029 hypothetical protein 2 1 Op 2 . + CDS 1485 - 3698 1941 ## BT_1028 hypothetical protein + Term 3745 - 3801 17.1 + Prom 3918 - 3977 6.5 3 2 Op 1 . + CDS 4069 - 4254 73 ## BT_1027 hypothetical protein + Prom 4263 - 4322 4.8 4 2 Op 2 . + CDS 4347 - 5792 706 ## BT_1026 hypothetical protein 5 2 Op 3 . + CDS 5825 - 9133 2239 ## BT_1025 hypothetical protein 6 2 Op 4 . + CDS 9147 - 11417 1304 ## BT_1024 hypothetical protein + Term 11546 - 11589 -0.4 + Prom 11515 - 11574 8.5 7 3 Op 1 . + CDS 11644 - 13383 1536 ## BT_1023 hypothetical protein + Term 13391 - 13431 4.3 + Prom 13396 - 13455 2.0 8 3 Op 2 . + CDS 13475 - 14125 835 ## BT_1022 hypothetical protein + Term 14224 - 14273 9.5 + Prom 14139 - 14198 3.5 9 4 Op 1 . + CDS 14369 - 15298 811 ## BT_1021 arabinosidase 10 4 Op 2 . 
+ CDS 15350 - 17315 1698 ## BT_1020 hypothetical protein Predicted protein(s) >gi|226332283|gb|ACIC01000037.1| GENE 1 1 - 1470 1291 489 aa, chain + ## HITS:1 COG:no KEGG:BT_1029 NR:ns ## KEGG: BT_1029 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 489 614 1102 1102 976 100.0 0 TESGMLSYIGRLNYSYADKYLFEFLLRSDASTKFAPSNYWGMFPSWSAGWVISEESWFNK EKLGIDFLKIRGSFGILGRDNIQPWLWTQLYSRNADGGPIFGTATNTYSGATFQMPQRGV NADVHWDKTYKTNLGIDVRMLDSRLGITLDAYYDMGRELFTTFTGTSFYPTTVGTQATPE NFGEVDTYGLELTLNWKDKIGKDFSYWVKMTTGYSDNKIKEAGFQATPGFDDIVRGERSD RGVWGYECIGMFRTYQDIEEYFAVNKITSYLGNTKDNIHPGMLIYRDVRGQRNADGTYGE ADGVIDQNDYVKISNRANNPYGLTFNFGASYKNFSFSAQFGASWGTYALVQTDLRQEKYD DLEYKNVSAMWKDMYVYEDVLDAGGNVVAPMNRSAKYPNLRYSAINGQASTFWKVSAASV RLRNLTVAYSLPKEWTNRIGISSCRLNLTAQNLLNFYNPYPDGVWSTWGGTYGRYPNLRK VTLGVNVSF >gi|226332283|gb|ACIC01000037.1| GENE 2 1485 - 3698 1941 737 aa, chain + ## HITS:1 COG:no KEGG:BT_1028 NR:ns ## KEGG: BT_1028 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 737 1 724 724 1467 100.0 0 MKEKINKICLTASMVTGLICFTGCSDDFLKDKKVYGSYDASVVYENYETAKSRVDFLYQS LLPSSTGGSNALTDITSAGIDDDFSKCTEEYGDYSIFNDPTASLTIQTVPDYFYVINSER SPWGRIRECNNVIEGVTGSATLSQTEKEQLLGQAYFFRAWRYYLLVKMYGGVPIVDHVQN PVIGDGNGENLVIPRSSTKACIDFICDDLELAASYLPARWQNEGQDYGRITAGAALALKG RTLLLYASPLFNRADDASRWKDAYDANFAAITKLNEGNFGLAYEGNGGEDNAKNWARMFA TYTGGSEAVFVTLYNNVSPIASQNINRYNLWEQGIRPGNINGGGGKTPTAEIIDIFPMID GKKPMESGVHYDPKKFFLNRDPRFYRTFAFPGVEWKFNSGNVDFSGATMSGLCPTRYTSG ANYELWNYCWYTTADERNNPSRSGFAADMLGTKNRGVYVRKRSDDFDLGTSLYVFTDNSS GDQQGFRRSAAPYMEIRYAEVLLNLAESACGAGGTYHAEGVKALKAVRGRVGYTSANNYG LDAAIETDRAKLFEAILYERQVELAFEGKRAYDMRRWMLFDGGVGQGALNASWALSGFGG NTCTYLGVTPMNERGKRYRIEINVEGTGSANNDSDPIKNVERPAALTLNEKIATEADGET ILDPAVSSICTFYDTYFSRKDISLDGNVLDINPYFQPRYYFFGLRQSAQQTNATLYQTIG WEDYAHGGMGTFDPLAE >gi|226332283|gb|ACIC01000037.1| GENE 3 4069 - 4254 73 61 aa, chain + ## HITS:1 COG:no KEGG:BT_1027 NR:ns ## KEGG: BT_1027 # Name: not_defined # Def: 
hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 61 1 61 61 92 100.0 4e-18 MNIAKVKVLEKGKASLMVSVKVLCHNIFLIKRFILVLGIKPNKEINRKLWKEAISFAAIT A >gi|226332283|gb|ACIC01000037.1| GENE 4 4347 - 5792 706 481 aa, chain + ## HITS:1 COG:no KEGG:BT_1026 NR:ns ## KEGG: BT_1026 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 481 1 481 481 934 100.0 0 MKSNRLFLRRRTRGLLFATLLFLLSSCTIGYHEDESFESDVKNATLESPQLENVKVELDA TGENATIEWPVVHGAEGYEFSWYVVDDPENPIAVVEGEFIDGCSVELEVEEDTKYKFLIK TIGNKQFNNKDAEKACEISFSTLLETYASIPSGVDLTQWFIDNPLPETDMEPNEDGTLKE LAYELEANGEYTISGPIDFGARKVTIRGNKINHSKITFGQSGRILTQNGLKIKFMDFYCN AMEKGSSDASLIGLSKTPNEQLKVSSGEYVIKDPIVIQFCNVYDLNRHLLYDSGKKYCVE NFTVKASFIRCNQSNTIVYFNKGSFINITFKESTLCSTVQNGSYFAQVNGNRPNKITGYS NGTFNFYNCTVYNLAYSKDFTNWNSYRGQACLTLNFSRTIFVDCGKGDMTNKIMGNANMS RNFEYNTYWYNGSQSNDKYDTNTLDTDPGFINAANGDFTVTGASQLEKRTGDPRWLPLVE E >gi|226332283|gb|ACIC01000037.1| GENE 5 5825 - 9133 2239 1102 aa, chain + ## HITS:1 COG:no KEGG:BT_1025 NR:ns ## KEGG: BT_1025 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1102 1 1102 1102 2122 100.0 0 MNKKNIILSLFIGTLLTFVMPLCAQNQTVTGLVVDVSGEPIIGATVMVVNGTVGTVTDID GKFNIKVAPKSKLKVSFVGYTSQIISDLKNPRIVLLEDQLKLDEVVVLGDYGSQKLRNAT GAIETISTEELKDLSVGSLGDALAGKINGLHVSLSGGRPGSTPSLQIRQSSVNTTITPSS DLGGNASPTPLYVIDGFIADEGAFNNLDINEVENVTVLKDAAAAVYGARAAYGVVLVKTK QGKVGAPKISYSGQFGYTDALMLPKMLNAYDYGRIYNAARAANTATKDQESDDLRVQLFQ SDELEAMKGMNYNLLDKEWSAALTQRHSVNISGGTEKATYFGGVSYYQQEGNIGRLDYNR WNYRAGVNANISKWMKASLQVSGNYGETNKPKNVKGGGSDGDFESLMLHVPYVPDQVNGY YIFHSGMENITNPSDQQKYNFAAVQNASDNVENQNQNFSLNGSLEYDFGWSKYLRGLKVK ASYSKNISTGKTNTIGTKIDVYRLISRGGSGGHLYVGDEIEYNANTLGLYELNNGNSLGR SMNRSDSYQMNLTVSYARQFGKHYVSGLFSIEKAESEWEDLNGSLTDPLPFTDGQSSSVD SNAEGFAQTVTFNRSESGMLSYVGRVNYSYDDKYLFEFMLRSDASAKFAPQNYWGMFPSW SAGWVISDEKWFDKDKTKIDFLKIRGSFGILGKDNVNAWLWTQLYTRNPDKGPIFGTNSS TNTSATIRMPKQGVNPDVHWDKTYKTNFGIDMAFLKNRMSANLDFYYDMGREMFASHQGT 
SYYPNTVGIQPAPENFGEVDTYGVELSLGWKDKIGKDMSYWVKLTTGYNDNKIRETGWKA AYDFDKLVRNERSDRGLWGYECIGMFRSYQEINEYFDKYSITKYLGNTKENVHPGMLIYK DVRGQYDESTGTFGPADGIVDEQDYIKISHRASNPYGGTLNFGFSYKDFSISAQFGASWG AYSLVPTTMRKESYSEYSNVSAMWKDMFVYEDIYDANNIVTVAQNRNAKYPNIRYSSVNG APSTFWKVSAATVQLRNMTVAYALPKEWLKIIGISSCRFNLTCQNVFNFLNPYPEGAWAS WAGSYGYYPNLRKFTLGVNVSF >gi|226332283|gb|ACIC01000037.1| GENE 6 9147 - 11417 1304 756 aa, chain + ## HITS:1 COG:no KEGG:BT_1024 NR:ns ## KEGG: BT_1024 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 756 10 765 765 1540 100.0 0 MMKKINKYGSIALLTITAAIFTGCSDQFLEDKKLYGSFNGTTIYENYESADNRIAYLYYI MLPSATGGSDLTGSMGDMPSTGTSDQYSKCTEEYGGISGLMNPNSPLDYETVTDYFNLDN NYSPWVRIRECNDMIEGVIGSESLTERQKQLLLGQAFFFRAWRYYLMVMMYGGVPIIDNV QNPIIGDSEGIHLVVPRSSTKECIDFICKDLQTAADYLPARWESENKNFGRITSGAALAL KGRVELLYASPLFNRADDVTRWETAYQTNLAAVKKLEEGNFGLAYENNSGKNAAGWGKIF SDYIGSEGGGGTVSEAVLVTLYNNVSPAEHLQLEKWNGWEHSLRPANAGGGGGMTPTAEM VDLFPMADGKKPNDMNGSYDYNQELFFLNRDPRFYRTFAFPGVEWKFDSDDLKSFGEKGQ LPYGINGYTTGNNYKLWNYCWYDNVDDRNSNSKSGYAADCLGTKNTGIYLRKRTDDLALN SNPLYIFEKTSGNGFRQSAAPYMEIRFAEVLLNYAEAACGANHFDEAVNALKRIRKRVGY TDNCGLDPAIFNDRAKLFEAILYERQIELAYEGKRFHDVRRWMLFDGGVGQEALKSTWKL TGFNGNTCNYLGVEPLNGQRRHRIEVYAKNFIAAKSNNVYNEDKTIKETYDELWNKRPTA LSLTEDITCTESADDEITYADAKVKALAQFYQEHFKRKNLSTDGNDETILPTFKPYYYIL GFKKNAMQNNVSLEQNIGWGDYWRGGADGTFDPLAE >gi|226332283|gb|ACIC01000037.1| GENE 7 11644 - 13383 1536 579 aa, chain + ## HITS:1 COG:no KEGG:BT_1023 NR:ns ## KEGG: BT_1023 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 579 1 579 579 1164 100.0 0 MKKKHVLLVAFAAAMLTPTVVWAQYPQITDEAKANYTKMMTEERKRSDEAWEKALPIVLK EAKEGRPYISWAGRPYDLPQARIPSFPGAEGGGMYSFGGRGGKVITVTNLNDRGPGSFRE ACETGGARIIVFNVAGIIRLESPIIVRAPYVTIAGQTAPGDGVCIAGESFWVDTHDVVVR HMRFRRGETKVWHRDDSFGGNPIGNIMIDHCSCTWGLDENISFYRHMYDPSEGQYESKDL KLPTVNVTIQNTISAKALDTYNHAFGSTLGGENCAFMRNLWASNSGRNPSVGWNGVFNFV NNVVFNWVHRSSDGGDYTAMFNMINNYYKPGPATPKNNTVGCRILKPEAGRSKLNYKVYG 
RVYADGNVMEGYPEITKDNWNGGIQIESQPNTAGYTENMRSYEPFAMPYIKITSANDAYD YVTKHVGANIPCRDIVDERIIEEVKTGIPYYDKKLAKDANGDLTGLSPKSMGEDGQFKYR RLPKDSYKIGIITDIRQMGGFPEYKGTPYVDTDGDGMPDEWEIANGLNPNDPSDANKDCT GDGYTNIEKYINGISTKHKVDWRDLKNNYDTLAEKGKLM >gi|226332283|gb|ACIC01000037.1| GENE 8 13475 - 14125 835 216 aa, chain + ## HITS:1 COG:no KEGG:BT_1022 NR:ns ## KEGG: BT_1022 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 216 7 222 222 398 100.0 1e-110 MSLLFLCGWISAGAVDLNKENRDPKYVESIVNRSQKIVDKLGLTDAKVAEDVCNVIANRY FELNDIYEIRDAKVKAVKESGLTGDAKNEALKAAENEKDAALYRSHFAFPASLSLFLNEE QIEAVKDGMTYGVVKVTYEATLDMIPSLKEEEKVQIYAWLVEAREFAMDAENSNKKHAAF GKYKGRINNYLAKRGYNLTKEREEWAKRVKARGGTL >gi|226332283|gb|ACIC01000037.1| GENE 9 14369 - 15298 811 309 aa, chain + ## HITS:1 COG:no KEGG:BT_1021 NR:ns ## KEGG: BT_1021 # Name: not_defined # Def: arabinosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 27 309 1 283 283 576 99.0 1e-163 MGKLRNTLAAISILLLASCGGNKDYYMFTSFHEPADEGLRYLYSEDGMHWDSIPGVWLKP ELGQHRLMRDPSMVRMPDGTYHLVWTTSWKGDLGFGYANSKDLIHWSEQQMIPVMADEPT TVNVWAPEIFYDDESEQFMVVWASCVPGRFEKGIEEEENNHRLYYITTKDFKTVSKAKVL YDPGFSSIDAVIVKRAKNDYVMVLKDNTRPERNLKIAFSDSLTGPYSPSSQPFTESFVEG PTVEKLGDDYLIYFDVYKKKIYGAMRTRDFKNFTDITESVSVPVGHKHGTIFRASESDVK TLLEESKKK >gi|226332283|gb|ACIC01000037.1| GENE 10 15350 - 17315 1698 655 aa, chain + ## HITS:1 COG:no KEGG:BT_1020 NR:ns ## KEGG: BT_1020 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 655 1 655 1109 1354 100.0 0 MNNSTKILLVLAAGWMATTIQAQTPQDRIHYTGKELSNPTYHDGQLSPVVGVHNIQLVRA NREHPEASNGNGWTYNHQPMLAYWNGQFYYQYLADPSDEHVPPSQTFLMTSKDGYQWTNP EIVFPPYKVPDGYTKESRPGMQAKDLIAIMHQRVGFYVSKSGRLITMGNYGVALDKKDDP NDGNGIGRVVREIKKDGSFGPIYFIYYNHGFNEKNTDYPYFKKSKDREFVKACQEILDNP LYMMQWVEEADREDPIIPLKKGYKAFNCYTLPDGRIASLWKHALTSISEDGGHTWAEPVL RAKGFVNSNAKIWGQRLSDGTYATVYNPSEFRWPLAISLSKDGLEYTTLNLVHGEITPMR YGGNYKSYGPQYPRGIQEGNGVPADGDLWVSYSVNKEDMWISRIPVPVQINASAHADDDF 
SKSGSIAELTNWNIYSPVWAPVSLEGEWLKLQDKDPFDYAKVERKIPASKELKVSFDLSA GQNDKGILQIDFLDENSIACSRLELTPDGIFRMKGGSRFANMMNYEAGKTYHVEAVLSTA DRNIQVYVDGKRVGLRMFYAPVATIERIVFRTGEMRTFPTVDTPADQTYDLPDAGGQEPL AEYRIANVKTSSTDKDASSAFLKYADFSHYAESFNGMEDENIVQAIPNAKASEWM Prediction of potential genes in microbial genomes Time: Thu May 12 00:23:05 2011 Seq name: gi|226332282|gb|ACIC01000038.1| Bacteroides sp. 1_1_6 cont1.38, whole genome shotgun sequence Length of sequence - 16817 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 5, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 6 - 1259 1078 ## BT_1020 hypothetical protein 2 1 Op 2 . + CDS 1292 - 4153 1994 ## BT_1019 putative secreted hydrolase 3 1 Op 3 . + CDS 4190 - 5527 1191 ## COG5434 Endopolygalacturonase 4 1 Op 4 . + CDS 5565 - 7526 1563 ## BT_1017 hypothetical protein + Term 7564 - 7609 6.3 + Prom 7616 - 7675 4.2 5 2 Tu 1 . + CDS 7697 - 8479 654 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily + Term 8615 - 8652 -0.8 6 3 Tu 1 . - CDS 8467 - 9354 551 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) + Prom 9621 - 9680 6.7 7 4 Tu 1 . + CDS 9906 - 11426 744 ## BT_1014 hypothetical protein - Term 11421 - 11465 5.3 8 5 Op 1 . - CDS 11505 - 13814 1804 ## COG4692 Predicted neuraminidase (sialidase) 9 5 Op 2 . - CDS 13748 - 15373 1208 ## BT_1013 putative alpha-rhamnosidase 10 5 Op 3 . 
- CDS 15403 - 16815 1477 ## BT_1012 hypothetical protein Predicted protein(s) >gi|226332282|gb|ACIC01000038.1| GENE 1 6 - 1259 1078 417 aa, chain + ## HITS:1 COG:no KEGG:BT_1020 NR:ns ## KEGG: BT_1020 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 417 693 1109 1109 877 100.0 0 MTEFLVQRSYSDKYNLIACAIGHHIYESRWLRDPKYLDQIIHTWYRGNDGGPMKKMDKFS SWNADAVLARYMVDGDKDFMLDMTKDLETEYQRWERTNRLKNGLYWQGDVQDGMEESISG GRNKKYARPTINSYMYGNAKALSIMGILSGDEGMAMRYGMRADTLKSLVENDLWNTRHQF FETMRTDSSANVREAIGYIPWYFNLPDTTKKYEVAWKEIMDEKGFSAPYGLTTAERRHPE FRTRGVGKCEWDGAIWPFASAQTLTAMANFMNNYPQTVLSDSVYFRQMELYVESQYHRGR PYIGEYLDEVTGYWLKGDQERSRYYNHSTFNDLMITGLIGLRPRLDDTIEINPLIPADKW DWFCLDNVLYHGHNLTILWDKNGDRYHCGKGLRIFVNGKEAGHADTLTRLVCENALK >gi|226332282|gb|ACIC01000038.1| GENE 2 1292 - 4153 1994 953 aa, chain + ## HITS:1 COG:no KEGG:BT_1019 NR:ns ## KEGG: BT_1019 # Name: not_defined # Def: putative secreted hydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 31 953 1 923 923 1920 100.0 0 MTFKKYLLLLLCLPCFVQAKNITISRLTCEMQEGLVVVESCPRLGWAMESPENGTRQTAY EIEIREAFTGRSVWNSGKVTSSQSQLVPTEGADICLNNPFNYSWRVRVWDETDTPSEWSQ EAKFRLASDDLSSGKWIGAITRKDSHLPEGRKFHGGELKKPEVKAAWEAVDTLAKKSICL RRTFQTGEAKGKNANRKSGKKIVEATAYVCGLGFYEFSLNGKKIGDSEFAPLWSDYDKSV YYNTYDVTEQLQDGENVVGILLGNGFYNVQGGRYRKLQISFGPPTLLFELVINYEDGTRE TIRSDHEWKYDLSPITFNCIYGGEDYDARREQKGWNQVGFNDSHWRPVVVQEAPKGVLRS QMAPPVKIMERYDIQKVTKLNSEQVMAASKSTKRTVDPSAVVLDMGQNLAGFPEITVRGK RGQKVTLIVAEALMDEGACNQRQTGRQHYYEYTLKGEGDEIWHPRFSYYGFRYIQVEGAV LKGQKNPQKLPVLKNIQSCFIYNSAKKVSTFESSNRIFNAAHRLIEKAVRSNMQSVFTDC PHREKLGWLEQVHLNGPGLLYNYDLTAYAPQIMQNMADAQHPNGAMPTTAPEYVVFEGPG MDAFAESPEWGGSLVIFPFMYYETYGDDSLIKKYYQHMRRYTDYLKTRADNGILSFGLGD WYDYGDFRAGFSRNTPVPLVATAHYYMTVMYLIKAARMVGNEFDVRYYSSLAYDIMVAFN KRFLDSNTAQYGTGSQASNALPIFLQMIQSTGDKGDYTPDASLAAKVLTNLIKDVESHGN RLTTGDVGNRYLIQTLARNGQHELIYKMFNHEEAPGYGFQLKFGATTLTEQWDPRQGSSW NHFMMGQIDEWFFNSLIGIRPSTTLSASVNEKDEPMQGYQKFVIAPKPVGDLKYVKSTYE TLYGTILVDWTRQSGTFTLNVSVPVNTTAIVYLPGEKEPKEVQSGTYKFVAAE 
>gi|226332282|gb|ACIC01000038.1| GENE 3 4190 - 5527 1191 445 aa, chain + ## HITS:1 COG:TM0437 KEGG:ns NR:ns ## COG: TM0437 COG5434 # Protein_GI_number: 15643203 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Thermotoga maritima # 24 443 28 437 448 283 37.0 7e-76 MNLRITLLAFLCLCATAILRAERVDMLKAGAKANGKALNTKLINSTIDRLNRGGGGTLFF PAGTYLTGSIHLKSNITLELEAGATLLFSDNFDDYLPFVEVRHEGVMMKSFQPLIYAVDA ENITIKGEGTLDGQGKKWWMEFFRVMIDLKDNGMRDINKYQPMWDAQNDTTAIYAETNKD YVSTLQRRFFRPPFIQPVRCKKVKIEGVKIVNSPFWTVNPEFCDNVTIKGITINNVPSPN TDGINPESCRNVHISDCHISVGDDCITIKSGRDAQARRLGVPCENITITNCTMLSGHGGV VIGSEMSGSVRKVTISNCVFDGTDRGIRIKSTRGRGGVVEDIRVSNIVMSNIKQEAVVLN LKYSQMPAEAKSERTPIFRNVHISGMTVTDVKTPIKIVGLEEAPIFDIVLRDIHIQGARQ KCIFEDCERITMDNVIINGEEIKLK >gi|226332282|gb|ACIC01000038.1| GENE 4 5565 - 7526 1563 653 aa, chain + ## HITS:1 COG:no KEGG:BT_1017 NR:ns ## KEGG: BT_1017 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 653 1 653 653 1284 99.0 0 MKRILFSLLLAASLSAEAQTQTYETEFARPLNEVLTDIQNRFGIRLKYDIDTVGKILPYA DFRIRPYSVEESLTNVLSPFDYKFVRQSGNLYKLKAYEYPRRTDADGEKMLAYLNTLYAD KQAFELRADSLRKEVRQRLGIDTLLAQCVNSTPILSKIRKFDGYTVQNFALETLPGLYVC GAVYTPQSKGKHALIICPNGHFGGGRYREDQQQRMGTLARMGAVCVDYDLFGWGESILQV GSAAHRSSAAHTIQAMNGLLILDYMLASRKDIDTKRIGANGGSGGGTHTVLLTTLDDRFT ASAPVVSLASHFDGGCPCESGMPIQLSAGGTCNAELAATFAPRPQLVVSDGGDWTASVPA LEFPYLQRIYGFYDAKDNVTNVHLPKEKHDFGPNKRNAVYDFFAEVFDLDKKMLDESKVT IEPESAMYSFGEKGELLPENAIHSFDKVAAYFDKKAFAKLKSDASLEKKAMEWVASLNLD DEKKSGFAVTTIYNHLRQVRDWHNDHPYTTIPAGINPTTGKPLTQLEREIIADSAMPKEV HERLMKGLCRVLTEEQVEQILDKYTVGKVAFTMKGYQEIVPDMTEEETAFILEQLKLARE QAVDYKSMKQISAIFKAYKTKIELYFYEHGRNWRQMYKDYAEKRKAEKAKEGK >gi|226332282|gb|ACIC01000038.1| GENE 5 7697 - 8479 654 260 aa, chain + ## HITS:1 COG:PA1640 KEGG:ns NR:ns ## COG: PA1640 COG1752 # Protein_GI_number: 15596837 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Pseudomonas 
aeruginosa # 1 182 1 180 345 173 48.0 3e-43 MDKQKVALVLSMGGARGIAHIGVIEELLRHNFEITSIAGSSMGAMVGAMYASGKLEECKE WLYSWDKRKMWELADLTLSRDGLVKGDRFIKELKQIIPVMNIEELPIPYVAMATDIVCDQ EVRFDSGNLYDAIRATISIPMLFRPLRKDGMVLIDGGILNPLPLNQVHRTEEDILIAVDV NAPIDCGKKKKMSPYNLLTESSRMMMQQITRYQIERCQPDILIQMSGNAYDMLEFHHAAS IVETGIEITRDALNTELYSV >gi|226332282|gb|ACIC01000038.1| GENE 6 8467 - 9354 551 295 aa, chain - ## HITS:1 COG:BH1510 KEGG:ns NR:ns ## COG: BH1510 COG1028 # Protein_GI_number: 15614073 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Bacillus halodurans # 53 293 3 241 243 204 47.0 1e-52 MADNYIERQQEQYEARKAAWKQAQKYGKKKTATSSQVKTEVAGEIVSKPDSLKKRVFVTG GAEGIGRAIVEAFCKDRHQVAFCDINAVSGQQTARDTGAIFHPVDVSDKEALESCMQQIL DEWKDIDILINNVGISKFSSITETSVEDFDKILSVNLRPVFITSRLFAIHRKGQSDSNPY GRIINICSTRYLMSEPGSEGYAASKGGIYSLTHALALSLSEWNITVNSIAPGWIQTQDYD QLRPEDHSQHPSRRVGKPEDIARMCLFLCRDENDFINGENITIDGGMTKKMIYTE >gi|226332282|gb|ACIC01000038.1| GENE 7 9906 - 11426 744 506 aa, chain + ## HITS:1 COG:no KEGG:BT_1014 NR:ns ## KEGG: BT_1014 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 506 1 506 506 985 99.0 0 MSMKKNLKYLLLLSLLSFMACNGDEKIDLGMVQQDGEVLQLNVRVGDFAINDVSSIRATD SGSTGTFENGDRIGIIVLDADNNVLSDNIPYRYDGNIWSFDSNNDEGKTTIYYDNKATVY LAYFPYSKEADNVINIDDLKGKFLPQGDQRSKDAYRASDLLVWSDTSGRPLKKLDIVFEH AYSLLSLSPSIKCKINGRRDFTYVPSSISDVSFNVGTEPLFPYQMNDGSYQIIISPKRAK VRWLYEYNKEMYSGAMPDTDLSANTCYTFTPVLKDIGDYTLDKAQMGDFYCKDESNNGYL IPGDVTALSADMNCLGIVLKSGKDSEGEWVDYCKYKQKYGITEMHTVHGYVLALYDANGG NACTWGLWSLTDINRGEPSTGFYGYKNTQGIISFAENENRTLKNGFPPVYWVTDYEYSHP APANSSGWFLPSLGQCWYWMYNKAYLLSAVNKATGDNDYGWKLGYWSSSEDDYDPFLNAY YADTYVGAMDWDYKNSKHCVRACLAF >gi|226332282|gb|ACIC01000038.1| GENE 8 11505 - 13814 1804 769 aa, chain - ## HITS:1 COG:STM1252 KEGG:ns NR:ns ## COG: STM1252 COG4692 # 
Protein_GI_number: 16764604 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted neuraminidase (sialidase) # Organism: Salmonella typhimurium LT2 # 406 763 21 344 347 148 33.0 3e-35 MYNKWARDIREAQREDGCIPDVAPAYWNYYSDNVTWPATLPLVCDMLFTNYGDIRPIEDN YPAIKKWVSHIREYYMTKDYIITKDKYGDWCVPPESLEMIHSQDPARKTDGALIATAYYL KVLQLMHRFASLQGLTTDAKEWEDLEHKMKDAFNAHFLHIKEGTSLVPGHTLYPDSVFYG NNTVTANILPLAFGLVPKAHIKEVAKNAVTTIITTNKGHISTGVIGTQWLMRELSRRGHT DVAYLLATNDTYPSWGYMAAQGATTIWELWNGDTANPGMNSGNHVMLLGDLLPWCFNNLA GIRADRWKSGYKHIVFQPAFEIQELSNVDASYMSIYGKITSRWKKTPMHLEWDIELPANT TGEVHLPDGRKEKIGSGKYHFSVDIPTRNAAIISDEFLYEKASFPECHGATIVELKNGDL VASFFGGTKERNPDCCIWVCRKPKGAKEWTAPKLAADGVFSLKDSQAALAGIDSTCTPVV DAKGKLTARRKACWNPVLFQIPGGDLILFYKIGLNVGDWTGWLVRSKDGGKTWGKREALP EGFLGPIKNKPEYINGRIICPSSREGKGGWRIHFEYSDDKGKTWKTTESVPAELSVPTQN RKKGGINVDDQEAGEAIQGEGAQPILAIQPSILKHKDGRLQVLCRTRNAKIATSWSSDNG ETWSKVTLSNVPNNNSGTDAVTMSDGRHILIYNHFSTLPGTPKGPRTPLCIAISEDGINW KPILTLEDSPISQYSYPSIIQGKDGKLHAIYTWRRQRIKYTEIDLSKFK >gi|226332282|gb|ACIC01000038.1| GENE 9 13748 - 15373 1208 541 aa, chain - ## HITS:1 COG:no KEGG:BT_1013 NR:ns ## KEGG: BT_1013 # Name: not_defined # Def: putative alpha-rhamnosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 515 1 516 1290 1045 97.0 0 MNKKILILSFLSVLFLAAGPIRAAIGVTNLRTEQLKNPSGIDTHQPRLGWRIESNEQNVM QTAYHILVASSPDLLAQGKGDMWDSGKVETDASQWITYQGENLKRNAPYFWKVKVYTNKG ESDWSSPAFWSMGLFNEADWQGQWIGLDRAAPGDSETQWSRLAARYLRKEFALKKEIKRA TVHIAGMGLYELFINGQRIGDQVLAPAPTDYRKTILYNTYDVTSQLQQENAIGVTLGNGR FYTMRQNYKPYKIPTFGYPKLRLNLIVEYADGSKETIATNTSWKLTTGPVRSNNEYDGEE YDARKELGNWTQTGYDDTKWMPAERVSIPSGTLRAQMMPGMKVTETLKPVSIQKMGDKYI MDTGQNLAGWVRFRIKGQAGDSIRLHFAERLESDGEIFTKNLRDAHCTDIYVVSGREPQG ATWAPRFVYHGFRYVEISGYPDAKTEDFVVEVVEDEMDHTGSFSCSNETLNQVIRNAFWG IRSNYKGMPVDCPQRNERQPWLGDHAMGCWGKVCFSTIMPCTTNGHVISAKHSAKTAVFL M >gi|226332282|gb|ACIC01000038.1| GENE 10 15403 - 16815 1477 470 aa, chain - ## HITS:1 COG:no KEGG:BT_1012 NR:ns ## KEGG: BT_1012 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 6 458 23 475 483 915 99.0 0 TSAAAQKKTQKTYIPWSNGKLVVSEEGRYLKHENGTPFFWLGETGWLLPERLNRDEAEYY LEQCKRRGYNVIQVQTLNNVPSMNIYGQYSMTDGYNFKNINQKGVYGYWDHMDYIIRTAA KKGLYIGMVCIWGSPVSHGEMNVDQAKAYGKFLAERYKDEPNIIWFIGGDIRGDVKTAEW EALATSIKAIDKNHLMTFHPRGRTTSATWFNNAPWLDFNMFQSGHRRYGQRFGDGDYPIE ENTEEDNWRFVERSMAMKPMKPVIDGEPIYEEIPHGLHDENELLWKDYDVRRYAYWSVFA GSFGHTYGHNSIMQFIKPSVGGVYGAKKPWYDALNDPGYNQMKYLKNLMLTFPFFERVPD QSVITGQNGERYDRAIATRGNDYLMVYNYTGRPMEVDFSKISGAKKNAWWYTTKDGKLEY IGEFDNGVHKFQHDSGYSSGNDHILIVVDSSKDYVKKDWTELPDQQGAGA Prediction of potential genes in microbial genomes Time: Thu May 12 00:24:08 2011 Seq name: gi|226332281|gb|ACIC01000039.1| Bacteroides sp. 1_1_6 cont1.39, whole genome shotgun sequence Length of sequence - 87714 bp Number of predicted genes - 54, with homology - 54 Number of transcription units - 29, operones - 15 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 70 - 1272 1006 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins 2 1 Op 2 . - CDS 1286 - 3760 1900 ## BT_1010 hypothetical protein - Prom 3808 - 3867 8.3 + Prom 3725 - 3784 7.4 3 2 Op 1 . + CDS 3983 - 4981 598 ## COG0167 Dihydroorotate dehydrogenase 4 2 Op 2 . + CDS 4987 - 5775 588 ## COG0657 Esterase/lipase 5 2 Op 3 . + CDS 5842 - 6633 807 ## BT_1007 hypothetical protein + Term 6654 - 6712 14.6 + Prom 6724 - 6783 5.5 6 3 Op 1 . + CDS 6868 - 7596 451 ## COG0778 Nitroreductase 7 3 Op 2 . + CDS 7517 - 7795 171 ## BF2880 hypothetical protein + Term 7845 - 7881 8.2 + Prom 8196 - 8255 1.6 8 4 Tu 1 . + CDS 8336 - 9070 413 ## COG4422 Bacteriophage protein gp37 + Term 9093 - 9139 15.2 + Prom 9101 - 9160 1.8 9 5 Op 1 . + CDS 9206 - 11302 1724 ## COG3533 Uncharacterized protein conserved in bacteria 10 5 Op 2 . + CDS 11322 - 13181 1339 ## BT_1002 hypothetical protein 11 5 Op 3 . 
+ CDS 13203 - 15392 1756 ## BT_1001 putative alpha-rhamnosidase + Term 15415 - 15468 11.0 + Prom 15486 - 15545 2.8 12 6 Tu 1 . + CDS 15598 - 18006 1118 ## BT_1000 hypothetical protein + Term 18160 - 18193 0.4 13 7 Op 1 1/0.200 - CDS 18488 - 19033 198 ## PROTEIN SUPPORTED gi|124009622|ref|ZP_01694295.1| nucleotidyltransferase plus glutamate rich protein grpb plus ribosomal protein alanine acetyltransferase 14 7 Op 2 . - CDS 19073 - 21505 1668 ## COG0642 Signal transduction histidine kinase - Prom 21525 - 21584 5.7 + Prom 21484 - 21543 9.1 15 8 Op 1 . + CDS 21725 - 24424 2511 ## BT_0997 hypothetical protein + Term 24450 - 24504 3.3 + Prom 24439 - 24498 2.2 16 8 Op 2 . + CDS 24559 - 28818 3503 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 28864 - 28916 16.6 + Prom 28933 - 28992 7.3 17 9 Tu 1 . + CDS 29038 - 29883 610 ## COG2207 AraC-type DNA-binding domain-containing proteins 18 10 Tu 1 . - CDS 29880 - 31379 914 ## COG0642 Signal transduction histidine kinase - Prom 31558 - 31617 7.8 + Prom 31528 - 31587 8.5 19 11 Op 1 . + CDS 31802 - 35080 2313 ## COG3250 Beta-galactosidase/beta-glucuronidase 20 11 Op 2 . + CDS 35132 - 37984 2339 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 37999 - 38038 2.7 - Term 37924 - 37964 0.5 21 12 Tu 1 . - CDS 38006 - 38422 349 ## COG3602 Uncharacterized protein conserved in bacteria - Prom 38450 - 38509 5.5 + Prom 38444 - 38503 6.3 22 13 Op 1 40/0.000 + CDS 38545 - 39231 525 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 23 13 Op 2 1/0.200 + CDS 39228 - 40595 1077 ## COG0642 Signal transduction histidine kinase 24 14 Tu 1 . + CDS 40716 - 43367 1894 ## COG0474 Cation transport ATPase + Term 43445 - 43500 8.6 + Prom 43452 - 43511 5.6 25 15 Tu 1 . + CDS 43531 - 45045 987 ## BT_0987 putative cytochrome c-type biogenesis protein + Prom 45089 - 45148 1.8 26 16 Op 1 . + CDS 45169 - 48486 2359 ## BT_0986 putative DNA-binding protein 27 16 Op 2 . 
+ CDS 48489 - 49922 1226 ## BT_0985 putative sialic acid-specific acetylesterase II 28 16 Op 3 . + CDS 49932 - 52343 1400 ## BT_0984 hypothetical protein 29 16 Op 4 . + CDS 52372 - 55020 2055 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 55110 - 55173 0.5 + Prom 55107 - 55166 4.4 30 17 Tu 1 . + CDS 55250 - 55501 101 ## BT_0982 hypothetical protein + Prom 55638 - 55697 4.9 31 18 Op 1 . + CDS 55759 - 60177 4032 ## COG0642 Signal transduction histidine kinase 32 18 Op 2 . + CDS 60241 - 61923 1600 ## BT_0980 hypothetical protein 33 18 Op 3 . + CDS 61991 - 65086 2170 ## BT_0979 hypothetical protein + Term 65096 - 65132 6.5 + Prom 65112 - 65171 7.4 34 19 Op 1 . + CDS 65263 - 65814 561 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 35 19 Op 2 . + CDS 65897 - 66148 291 ## BT_0977 hypothetical protein + Term 66194 - 66258 20.4 36 20 Tu 1 . - CDS 66255 - 67508 943 ## COG0477 Permeases of the major facilitator superfamily - Prom 67534 - 67593 4.2 37 21 Op 1 12/0.000 + CDS 67614 - 68381 530 ## COG2966 Uncharacterized conserved protein 38 21 Op 2 . + CDS 68383 - 68877 250 ## COG3610 Uncharacterized conserved protein 39 21 Op 3 . + CDS 68891 - 69619 608 ## BT_0973 hypothetical protein + Term 69688 - 69740 1.2 - Term 69676 - 69726 1.6 40 22 Tu 1 . - CDS 69733 - 70542 206 ## PROTEIN SUPPORTED gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 - Prom 70671 - 70730 6.7 + Prom 70513 - 70572 2.8 41 23 Tu 1 . + CDS 70689 - 71486 497 ## BT_0971 hypothetical protein 42 24 Tu 1 . - CDS 71559 - 72179 512 ## COG1011 Predicted hydrolase (HAD superfamily) - Prom 72199 - 72258 6.3 + Prom 72371 - 72430 7.8 43 25 Tu 1 . + CDS 72480 - 73568 726 ## BT_0969 hypothetical protein + Prom 73631 - 73690 2.7 44 26 Op 1 . + CDS 73710 - 75071 801 ## BT_0968 hypothetical protein 45 26 Op 2 . + CDS 75025 - 76845 1294 ## BT_0968 hypothetical protein 46 26 Op 3 . 
+ CDS 76889 - 77110 170 ## BT_0967 hypothetical protein + Prom 77125 - 77184 5.6 47 27 Op 1 6/0.000 + CDS 77205 - 77756 441 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 48 27 Op 2 . + CDS 77820 - 78794 522 ## COG3712 Fe2+-dicitrate sensor, membrane component + Term 78823 - 78862 -0.0 + Prom 78807 - 78866 1.5 49 28 Tu 1 . + CDS 78919 - 81357 1113 ## BT_0964 hypothetical protein + Term 81586 - 81648 1.5 + Prom 81685 - 81744 6.3 50 29 Op 1 . + CDS 81854 - 83362 1025 ## COG1145 Ferredoxin 51 29 Op 2 . + CDS 83382 - 84785 1069 ## COG1453 Predicted oxidoreductases of the aldo/keto reductase family 52 29 Op 3 . + CDS 84785 - 85219 291 ## BT_0961 hypothetical protein 53 29 Op 4 . + CDS 85236 - 87134 1547 ## COG4206 Outer membrane cobalamin receptor protein 54 29 Op 5 . + CDS 87147 - 87656 421 ## BT_0959 hypothetical protein Predicted protein(s) >gi|226332281|gb|ACIC01000039.1| GENE 1 70 - 1272 1006 400 aa, chain - ## HITS:1 COG:CAC0359 KEGG:ns NR:ns ## COG: CAC0359 COG4225 # Protein_GI_number: 15893650 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Clostridium acetobutylicum # 57 397 23 361 361 304 41.0 2e-82 MNRNLLKLIVACGAVCFIACTAPQKAETEKWSERMACSEMKRFPEPWMIEKAKVPRWGYT HGLVVKSMLEEWKHTGDSTYYEYAKIYADSLIDTDGHIKTMKYLSFNIDNVNGGKILFDL YAQSGDERYKIAMDTLRKQMAEQPRTSEGGFWHKLRYPHQMWLDGIFMASPYLVQYGSTF QEPALYDEAVKQILLIARKTYDPTTGLYYHGWDESREQKWANPETGCSPNFWSRSIGWYG AALVDVLDYLPQETTGRDSVMQILQRLAKTLVKYQDPQSGTWYQVTDQGAREGNYLESSA TALFIYTLAKAVNKGYIGKEYIQPTRKAFDGMVKTFTRLEEDGSYTITNCCAVAGLGGDS KRYRDGSFEYYISEPVIENDPKSVGSFILAAIEYEKMTDK >gi|226332281|gb|ACIC01000039.1| GENE 2 1286 - 3760 1900 824 aa, chain - ## HITS:1 COG:no KEGG:BT_1010 NR:ns ## KEGG: BT_1010 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 824 1 824 824 1701 98.0 0 
MKMNKKRLIGYLFVMVSAGCIHAQEKKVSAQEYKLWYDRPAQVWTEALPLGNGRLGAMVY GTPGIEQIQLNEETIWAGRPNNNANPNALEYIPKVRELIFAGKYLEAQTLATEKVMAKTN SGMPYQSFGDLRIAFPGHTRYSDYYRELSLDSARAIVRYEVDGVQYQRETITSFTDQVVM VRLTANRPGQITFNAQLTSPHQDAMINSEEGNCVTLSGVSSLHEGLKGKVEFQGRLTARN QGGKIACTDGVLSVEGADEATIYVSIATNFNNYLDITGNQTERAKSYLSEALVHPFAEAK KNHVEFYRQYLTRVSLDLGEDQYKNVTTDKRVENFKDTHDAHLVATYFQFGRYLLICSSQ PGGQPANLQGIWNDKLFPSWDSKYTCNINLEMNYWPSEVTNLSDLNEPLFRLIKEVSESG KETARIMYGANGWVLHHNTDIWRITGALDKAPSGMWPSGGAWLCRHLWERYLYTGDTEFL RSVYPILKGSGLFFDEIMVKEPVHNWLVVCPSNSPENVHSGNDGKATTAAGCTMDNQLIF DLWTAIISASRILDTDKEFAAHLEQRLKEMAPMQVGHWGQLQEWMFDWDDPNDVHRHVSH LYGLFPSNQISPYRTPELFDAARTSLIHRGDPSTGWSMGWKVCLWARLLDGDHAYKLITD QLTLVRNEKKKGGTYPNLFDAHPPFQIDGNFGCAAGIAEMLMQSYDGFIYLLPALPTLWK DGSVTGIIARGGFELDLSWKNGKVNRLVVKSHKGGNCRLRSLNPLTGKGLKRAKGENPNP LYAVPTIPQPLVNEKASLNKVEIARTYLYDLPTKAGQEYILIGK >gi|226332281|gb|ACIC01000039.1| GENE 3 3983 - 4981 598 332 aa, chain + ## HITS:1 COG:SA2375 KEGG:ns NR:ns ## COG: SA2375 COG0167 # Protein_GI_number: 15928168 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Staphylococcus aureus N315 # 1 328 1 332 354 208 36.0 2e-53 MYNLIVRPLLFLIDPEKVHRMLLNWLKIYRYLLPVRACVRGHYQVDEPFSYGSLRFKNRI GLSAGFDKNAETFDELADFGFGFLEVGTVTPDYQPGNPSPRIFRLVEDESLISRTGFNNC GVDVALEHIRKYRKHSYRLGVNINKNPLSAGEQIIKDFELVFTRLYESVDYFTLNWGSID KESFAKVLENLTTFRQAVNKRCSIFIKLPADIDEKALDLVISLAYKYSIEGFIATGPTMD RSELHHTKTKQLEKIGNGGVSGRGIGKKSQKVVRYLSGHTDHHFLIIGAGGIMTAEDAAD MISQGADMIQIYSAFIYSGPSVVHKIGKKLKH >gi|226332281|gb|ACIC01000039.1| GENE 4 4987 - 5775 588 262 aa, chain + ## HITS:1 COG:AGc2981 KEGG:ns NR:ns ## COG: AGc2981 COG0657 # Protein_GI_number: 15888930 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 25 261 50 305 310 122 28.0 1e-27 MKKLICISLYEDLSMTTYDLSEVDNVELIGIVENASEGTLFVFTCDRPNGSSVIMCPGGG FLKTNLENEGIDFAEWFTKLGITYIVFKYRMPRGNPDVPEQDIRLALKVVREKFPEFCDK 
LGVMGASIGGYLATFSATLLPDDEKPDFQILMYPVVSVDDRLTHFPCRERMFGHSYSPDK MEQYSPIEHITCGTPAAFIVAAADDAVVSPLNGIMYAARLQKADIPISLHIYPAGGHGFG YNDSFVYKQEWLQELGEWLAKL >gi|226332281|gb|ACIC01000039.1| GENE 5 5842 - 6633 807 263 aa, chain + ## HITS:1 COG:no KEGG:BT_1007 NR:ns ## KEGG: BT_1007 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 263 1 263 263 410 98.0 1e-113 MRNSKVTSMLVVLMFLLVTGIQAQTVTPSKKYITKELNNVSNFSSIRVLGSPDVEYRQSN GSKTTVSIYGSDNLVDLLEVSTVNGVLQVNIKKGVKILSGERRLKVIASSPSLNQVDIKG SADVYLKGTIKGNDLNLNIAGSGDIEAENLQYANIFALVKGSGDIDLKNVKATTVMSEVN GSGDINMKGSAQKATLTVNGSGDISAEKLAATNVVATVAGSGDIVCYASRQLDARVSGSG DIEYKGSPSVVNKQGKKNSITGK >gi|226332281|gb|ACIC01000039.1| GENE 6 6868 - 7596 451 242 aa, chain + ## HITS:1 COG:PM0161 KEGG:ns NR:ns ## COG: PM0161 COG0778 # Protein_GI_number: 15602026 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Pasteurella multocida # 1 241 1 241 242 327 62.0 2e-89 MNLDEVLNYRRSVRVYDKEKEIDTEKVKHCLELATLAPNSSDMQLWEFYHITQPELMAKV SRACLDQKATSTASQIVVFVTRRDLYRKRAKFVLDFETGNIRRNSPIDRQEKRIKDRKLY YGILMPFVYARFCGILGLFRVLLANIISIFRPMMLEVSEGDVRVVVHKSCALAAQTFMIA MANEGYDTCPLEGLDSRRLKRLLKLPHGAEINMVVSCGIRNGNKGIWGERCRVPFEEVYR KI >gi|226332281|gb|ACIC01000039.1| GENE 7 7517 - 7795 171 92 aa, chain + ## HITS:1 COG:no KEGG:BF2880 NR:ns ## KEGG: BF2880 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 30 92 461 523 525 103 76.0 2e-21 MESGMEIKEFGENGAGYRLRKFIVRFDGTDAVVKTPKYIYVFEFKLNGTAEQALQQIDEK GYLIPYQTDGREVVKVGVEFSAEKRNIGRWLL >gi|226332281|gb|ACIC01000039.1| GENE 8 8336 - 9070 413 244 aa, chain + ## HITS:1 COG:MT2803.2 KEGG:ns NR:ns ## COG: MT2803.2 COG4422 # Protein_GI_number: 15842273 # Func_class: S Function unknown # Function: Bacteriophage protein gp37 # Organism: Mycobacterium tuberculosis CDC1551 # 5 212 11 223 284 80 30.0 4e-15 MGGKASMWNLWHGCHKWSEGCRHCYVYRTDGKYGKDSSVVTKTEKFGLPLQKKKNGEYKI 
PSGNLVYTCFTSDFLIEDADEWRAEAWEMMRIRQDLHFMFITKRIERLQQCLPPDWGDGY NNVTICCTMENQDRVDYRLPIYKESPIKHKIIICEPLLSQIDFRGELEGWVEQVVAGGES GKEARVCNYDWVLDIRRQCVEAHVSFWFKQTGSYLLKDGHEYKVARQFQHSQARKAELNY TPEK >gi|226332281|gb|ACIC01000039.1| GENE 9 9206 - 11302 1724 698 aa, chain + ## HITS:1 COG:TM0280 KEGG:ns NR:ns ## COG: TM0280 COG3533 # Protein_GI_number: 15643049 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Thermotoga maritima # 25 694 4 618 620 390 35.0 1e-108 MKQIKLLFLLASASVTGAFAQSNGLTDMSQSRYAKMANTGIDAVHWTNGFWGERFNVFSG TSLQSMWNTWNTPEVSHGFRNFEIAAGVCKGEHWGPPFHDGDMYKWMEGVASVYAVNKDP ELDKLMDNFIACVVKAQRADGYIHTPVVIEELNKGIDSHTQADSQQQTVIGTKVGSEDEK GAFANRLNFETYNLGHLMMAGIVHHRATGKTTLFDAAVKATDFLCHFYETASAELARNAI CPSHYMGVVEMYRATGNPRYLELSKNLIDIRGMVESGTDDNQDRIPFRDQYRAMGHAVRA NYLYAGVADVYAETGEQQLMKNLTSIWNDIVTRKMYVTGACGALYDGTSPDGTCYEPDSI QKVHQSYGRPYQLPNSTAHNETCANIGNMLFNWRMLEVTGDAKYAELVETCLYNSVLSGI SLDGKKYFYTNPLRISADLPYTLRWPKERTEYISCFCCPPNTLRTLCQAQNYAYTLSPEG IYCNLYGANTLTTNWKDKGELALVQETDYPWEGNIRVTLDKVPRKAGAFSLFFRIPEWCG KAALIVNGQPVSMNAKANTYAEVNRTWKKGDVVELVMDMPVCLLEAHPLAEEIRNQVVVK RGPLVYCLESMDIANGEKIDNILIPADIKLTPKKTTIEGSSIVALEGKARLASSESWEGV LYRPVAQAEKTVDIRLIPYYAWGNRGKGEMTVWMPLAR >gi|226332281|gb|ACIC01000039.1| GENE 10 11322 - 13181 1339 619 aa, chain + ## HITS:1 COG:no KEGG:BT_1002 NR:ns ## KEGG: BT_1002 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 619 1 619 619 1289 98.0 0 MKKLLVTTICLLGAHLLSAGEIWISPQGNDFNDGTRPSPKATLTSALRQAREWRRTDDER GRGGITICMEGGTYALYEPVFIRPEDSGTEDSPTVIRPVADEKVVLSGGIRIGGWKKQGK LWVADVPMFNGRPLDFRQLWVNGKKAVRARDVEDFEKMNRICSVDEKNEILYVPAVAIRR LVDGKGALKAKYAEIVLHQMWCVANLRIRSVELAGDSAAIRFHQPESRIQFEHPWPRPMV TTDGHNSAFYLTNARELLDVAGEWYHDIDARKVYYYPREGEKMQDAGTEVIVPAIETLVQ VKGTIDRPVSHIRFEKITFSHTTWMRPSEKGHVPLQAGMYLTDGYRIDPKMERDYLNHPL DNQGWLGRPAAAVSVATANQIDFERCRFEHLGSTGLDYEEAVQGGVVRGCLFRDIAGNGL VVGSFSPAAHETHLPYDPADLREVCAHQQISNCYFTEVGNEDWGCLAILAGYVKDINIEH 
NEICEVPYSGISLGWGWTQTVNCMRNNRVHANLIHHYAKHMYDVAGIYTLGSQPKSYVTE NCVHSIYKPGYVHDPNHWFYLYTDEGSSFITVRDNWTEGEKYLQNANGPGNVWENNGPQV DTVIRERAGLEAEYRDLKK >gi|226332281|gb|ACIC01000039.1| GENE 11 13203 - 15392 1756 729 aa, chain + ## HITS:1 COG:no KEGG:BT_1001 NR:ns ## KEGG: BT_1001 # Name: not_defined # Def: putative alpha-rhamnosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 729 1 729 729 1488 99.0 0 MILLGALSLASSTFAQTWIWYPGDYEIWLGNQMNNRRTERGAFFPPFWKTDSHYVVVEFS KVLNLSEPEEVFIAAEGTYNVKLDGKLQFGMPETLLLPAGKHSLNIKVWNQATPPTIYVK GKTVNSDSSWRVTYEDKEWIDESGKASDTSATIYMDAGCWNFDDATQRPSQFSLMREPQQ PVAKTEQPEGGILYDFGKETFGFITLKNLSGKGKIDLYYGESPEEAKDKAYCETLDKLLL EPGQITDLAILSTSPLHHSDNEYTLENSKAFRYVYITHEPEVQIGEVSMQYEYLPEEYRG NFRCNDEELNRIWEVGAYTMHLTTREFFIDGIKRDRWVWSGDAIQSYLMNYYLFFDSESV KRTIWLLRGKDPVTSHSNTIMDYTFYWFLSVYDYYMYSGDRHFVNQLYPRMQTMMDYVLG RTNKNGMVVGMSGDWVFVDWADGYLDKKGELSFEQVLFCRSLETMALCADLVGDKDGQQK YEKLASALKAKLEPTFWNNQKQAFVHNCVDGRQSDAVTRYANMFSVFFDYLNADKQQAIK QSVLLNDEILKITTPYMRFYELEAFCALGEQETVMKEMKAYWGGMLKAGATSFWEKYNPE ESGTQHLAMYGRPYGKSLCHAWGASPIYLLGKYYLGVKPTKEGYKEFAVSPVLGGLKWME GTVPTPNGDIHVYMDNKTIKVKATEGKGYLTIQSRRQPKANMGTVEKVSEGVWRLWIDSP EERIVTYRL >gi|226332281|gb|ACIC01000039.1| GENE 12 15598 - 18006 1118 802 aa, chain + ## HITS:1 COG:no KEGG:BT_1000 NR:ns ## KEGG: BT_1000 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 802 1 802 802 1582 99.0 0 MKTLKYAWRFLVRSKSYTVINIVGLSLSLACTIILVRYIHQETRVNTHCIDAENVYIPLR DIDGNVFAGSVSDGYSGADTVFYSPEAVKEKARFVTFSNDNVAINGKPYTVQLLAADTAF FHFFTYPLSGTPMKAPNDVLVTRQFAERVFGKENPIGEKLSYSGGHLLTVCGVLDEPTCK SSLTFDIVLNLELKNGERGWGKMLVELIRFMPGVDVDAINAISNIYRQADGGYRIRYDFI PVSELYWNKILAAKADAPEMWHYSSRFHLWVLSCVCLLLLLAGILNFVNIYLVFMLKRSK EYGIRKVFGMRGRTLFFQLWTENVLMVLIALFFAWFFIEMFSGYANRLLESNVMYTAFDW QLSLAIFILLPLLTTIYPYLKYNYLPPVVSISTIGTSRQSVKTRTLFLFVQYSITLLLII LSLYFSNHLHFLLNTPPGFRTEGILYADLMPKLPNQWWEDSQEIQNKRWHDREVMEQKLN ECPYIEHWFAGDTGREGILSAGSMSSMINDKGGKLNMAMMWVTVDFFKLYGLHIVDGSLP 
DEVNGHADYLVAMNKAALKAFGYTRREDAFVKGESSLWSSISNGKIVEGGLSLMPVQAII EDYYSGHLTAGKKPIIYMISSAGINAQCQISCVPGKEKQLVDYLKKCREDIYNSNEFDYR WLKDDIRALYQDDRRIVTVFTLFATISIFVSALGLFGLSLFDIRQRYREVAIRKVNGARL RNLYSILFRKYIWIIGGATLLTMPLSYYLIYIYTSDFVVKTPVSISIFVIALLVVVAISA GTLFWQVNKAARINPAEIIKTE >gi|226332281|gb|ACIC01000039.1| GENE 13 18488 - 19033 198 181 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|124009622|ref|ZP_01694295.1| nucleotidyltransferase plus glutamate rich protein grpb plus ribosomal protein alanine acetyltransferase [Microscilla marina ATCC 23134] # 1 149 1 149 174 80 32 2e-14 MQTYIETSRLILRDWKEEDIPAFARMNADPHVMEFFLHPLTPEESLAFYHRIQNEFQTCG FGLYAVERKEDHAFMGYTGLHQITFDVDFAPGIEIGWRLAHEYWGRGYAPEAATACLEYA RESLDIKELFSFTSLPNQRSERVMQKIGMERVREFGHPLVPAEYPLHRHVLYHIDLRKHS S >gi|226332281|gb|ACIC01000039.1| GENE 14 19073 - 21505 1668 810 aa, chain - ## HITS:1 COG:mlr3786_1 KEGG:ns NR:ns ## COG: mlr3786_1 COG0642 # Protein_GI_number: 13473249 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 583 805 231 462 478 171 39.0 4e-42 MEGQLNLDCERFQQITQMAKMGWWESDVKNQQYICSDFIVDLLGLESHRISFQEFHHRIR EDHRTRLRNEFISHSNLETYEQMFPIQAKNGEIWVYSKISFRKPDKEGYRDIFGYLQYVD KPAENRCGNIDFFHVNSLLYQQNSISYSLLAFLQSDNIAEIINQILGDLLKQFGGDRTYI VEIDRKKRVQNCMYEATAEGVSEERDNLQDILWDESFWWNRQISNRKSIILNTLDDMPQE AQEYRKLLESQNIKSLMAVPLISKDKIWGYMGIDMVKAYRSWSNVDLQWFSSLANIISIY IELRKSELQAKEDRLALDHREKILRNIYKNLPAGIELYDKDGYLIDINDKELEIFGLTDK NEVLGINLFDNPNIPQEIKEKLRARNDVDFSINYDFSKVNKYIETQRKGVINLTTKVTTL YDSQNQFINYLFINIDTTETTNAYTKIQEFENLFLLIGDYANVGFAHFNILTRDGYAQDT WYRNLGEKDGTPMTEVIGVYSHVHPEDQAVLKSFVRGVKEGKATSLRKEVRVSRENGKYT WTSINVMVRDYRPEEGIIEMLCINYDITTLKETEQKLIIARDKAEELDRLKSAFLANMSH EIRTPLNAIVGFSSLLAETESKEECQDYIKIVQDNNDLLLRLISDILDLAKIEAGTFNFV YADLDVNEVCSEIIKSTGMKVKGNIKLLFEKQIPECHIYTDKNRFMQVITNFINNALKFT HEGTIALGYEQISSQKIRFYVRDTGVGIPASKINSVFERFVKLNNFVQGTGLGLSICKSI ITQMEGEIGVDSTEGAGSCFWFTLPYHTNE >gi|226332281|gb|ACIC01000039.1| GENE 15 21725 - 24424 2511 899 aa, chain + ## HITS:1 
COG:no KEGG:BT_0997 NR:ns ## KEGG: BT_0997 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 27 899 1 873 873 1786 99.0 0 MKNTKLFMFAACTLFLAACGKQTVKIMTPPDASNRVLFSAEQLQSTLDKAGYQVMMQQGD TTFSDPEIKTILLTEVNDTTLKKEGFHITTTGNLTRVSGRDGSGVIYGSRELIDRVNDSD GKLDFPEELKDGPEMVLRGACVGLQKMTYLPGHGVYEYPYTPESFPWFYDKEQWIKYLDM LVANRMNSLYLWNGHPFASLVKLEDYPFALEVDEETFKKNEEMFSFLTEEADKRGIFVIQ MFYNIILSKPFAEHYGLKTQDRNRPITPLIADYTRKSIAAFIEKYPNVGLLVCLGEAMCT VEDDVEWFTKTIIPGVKDGLQALGRTDEPPLLLRAHDTDCKLVMDAALPIYKNLYTMHKY NGESLTTYEPRGPWSKIHTDLSSLGSIHISNVHILANLEPFRWGSPDFVQKAVQAMHNVH GANALHLYPQSSYWDWPYTADKLPNGEREFQLDRDWIWYQTWGRYAWNSHRDRADEIGYW NHQLGQFYGTSDENAGNIRIAYEESGEIAPKLLRRFGITEGNRQTLLLGMFMSQLVNPYK YTIYPGFYESCGPEGEKLIEFVEKEWKNQPHVGEMPLDIVAQVIEHGDKAIAAIDKAAGS ISSNKEEFARLQNDMHCYREFAYAFNLKVKAAKLVLDYQWGKDMKNLEEAIPLMEQSLEH YRKLVELTDEHYLYANSMQTAQRRIPIGGDDGHNKTWKELLVHYEKELENFKVNLAMLKE KQNGNAVTETVEIAAWAPADVNLISNYPTVKLNEGTSLFTDLPGKIEAIAPELKGMKAFR FNGNEQREKGTSITFETNAPVKLLVAYFKDDQKKYAKAPKLEIDASANDYGQAEPVLTNA IHINGMPLANVHAYSFPAGKHTLMLPKGYLQVLGFTAADMKTRNAGLAGDEETMDWLFY >gi|226332281|gb|ACIC01000039.1| GENE 16 24559 - 28818 3503 1419 aa, chain + ## HITS:1 COG:TM1062 KEGG:ns NR:ns ## COG: TM1062 COG3250 # Protein_GI_number: 15643820 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 377 960 3 557 563 117 25.0 1e-25 MIDKSIMKKLLLTFLSIAAVCCSLYAQREVTQERMEQIYEEVKTPYKYGLAVAPTDNYHK IDCPTVFRQGDKWLMTYVVYNGKTGTDGRGYETWIAESDNLLDWRTLGRVLSYRDGKWDC NQRGGFPALPDMEWGGSYELQTYKGRHWMTYIGGEGTGYEAVKAPLFVGLASTKGDISTA HEWESLDKPILSIHDKDAQWWEKLTQYKSTVYWDKDKTLGAPFVMFYNAGGRHPETDLKG ERVGIALSKDMKTWKRYPGNPVFAHEADGTITGDAHIQKMGDVYVMFYFSAFEPSRKYKA FNTFAASYDLVNWTDWHGADLIIPSKNYDELFAHKSYVIKHDGVIYHFYCAVNDAEQRGI AVATSKPMGRSAVRFPKPETKNRRMITELNEGWKTWLIDNSQLTIDNSKSNHQSPIINCQ IPHNWDDYYGYRQLTHGNLHGTAMYVKDFSLDNCPLSTVNSQLKKRYFLRFEGVGTYATI KVNGHDFGRYPVGRTTLTLDVTDALKQGTNRLEVKAEHPEMIADMPWVCGGCSSEWGFSE 
GSQPLGIFRPVVLEATDEIRIEPFGVHIWNDEKAANVFVETEVKNYGKTTETIEVVNKLS NADGKQVFRLVEKVTLAPGEMKVIRQQSPVENPVLWDTENPYLYKLASMIKRDTKTTDEI STPFGIRTISWPVKRNDEDGRFYLNGKPVFINGVCEYEHQFGQSHAFSREQVAARVKQMR AAGFNAFRDAHQPHHLDYQKYWDEEGVLFWTQLSAHVWYDTPEFRENFKKLLRQWVKERR NSPSVVIWGLQNESTLPREFAQECSEIIREMDPTARTMRIITTCNGGEGTDWNVIQNWSG TYGGDVTKYGKELSQKNQLLNGEYGAWRSIDLHTEPGEFEVNGVWSESRMCQLMETKIRL AEQAKDSVCGQFQWIFSSHDNPGRRQPDEAFRKIDKVGPFNYKGLVTPWEEPLDVYYMYR ANYVPAAKDPMVYLVSHTWADRFEKGRRRATIEAYSNCDSVLLYNDMINDKVTYLGRKKN NGTGTHFMWENRDIRYNVLRAVGYYKGKPVAEDIIVLNGLEQAPHFDVLYQNAKPVLKGE DGYNYLYRINCGGDDYTDSFGQLWMQDNTHYSRSWAANFKELNPYLASQRTTNDPIRGSR DWKLFQHFRFGRHQLEYNFPVADGTYRIELYFTEPWHGTGGSASTDCEGLRIFDVAVNDS VVLDDLDIWAESGHDGVCKKVVYATVKGGVLKIHFPEVKAGQGLISGIAIASVDSNLQPT VFSASDWSWEKAGKEVMEKTPKELLPEDKNARVSVAYEAETATLKGKFRKKEHRKQTGVF FDKGKGNSIEWNISTGLAQVYALRFKYMNTTGKPMPVLMKFIDSKGVVLKEDILTFPETP DKWKMMSTTTGTFINAGHYKVLLSAENMDGLAFDALDIQ >gi|226332281|gb|ACIC01000039.1| GENE 17 29038 - 29883 610 281 aa, chain + ## HITS:1 COG:BMEII0641 KEGG:ns NR:ns ## COG: BMEII0641 COG2207 # Protein_GI_number: 17988986 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Brucella melitensis # 146 279 162 291 307 69 28.0 6e-12 MGSDIPMYDLEVDYIVGEAFEANLLNGYKNYLSKTESGFFILCVKGTIQATINGSLYNIG ENALLTLPPNHIMEIQEFSPDIHIYYAGFSPLLIEGINLMKATQHLLLTIMENPVVILSP LQACSYKMFYESLILSYTSPIAQANKEITKATLTMFLQGSTEIYKLQNKWYLSSQSRKYE IYQEFLQLVMKYYTVHHGASFYADQLGLSLPHFCSTIKKAAGNTPLEVIASVILMDAKSR LKSGNEPVKNIALSLGFNNISFFNKFFKQHTGITPHEYRGH >gi|226332281|gb|ACIC01000039.1| GENE 18 29880 - 31379 914 499 aa, chain - ## HITS:1 COG:MA4377_3 KEGG:ns NR:ns ## COG: MA4377_3 COG0642 # Protein_GI_number: 20093164 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 247 498 6 260 311 150 38.0 7e-36 MEENKQTLYNDELLNILPDGVIIFNINGEVIQLNHQALTELHVHSSVNEMMPLSTNDLFK LLNNKEDILSVILEKIRQGEQTYSLPEHTFMQEQIDYTQFPIRGDFATIKEEGHLKKILF 
FFRNITVELTQEYILNTALQRTKIYPWYYDINRSEFTLDNRYFEHLGISAGENNTLTMEE YVNMIHPDDRQAMADAFIVQLSGMETFDKSVPFRLHRGDGTWEWFEGQSTYIANISGHPY RLVGICMSIQEYKDIENTLIEARKKAEESDRLKMAFLANMSHEIRTPLNAIVGFSDVIGS TYDELSEEERADFVRLISINSEHLVRLIDDVLDLSKIESNTIKFTFTDCSLRSLMIDVEK EQTMKPISEIEIRASLPNEDVYINTDATRLKQVISNFINNARKFTEKGYIHFGYAVDSVD ASSVHVFVEDTGSGIPKECQKEIFDRFYKVDTFKQGTGLGLSICKTIVEHLQGDIFVESE TGKGSRFTVTLPFERQSED >gi|226332281|gb|ACIC01000039.1| GENE 19 31802 - 35080 2313 1092 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 22 1092 6 988 1087 602 35.0 1e-171 MKIYTLVFGALIACPMQAQTMHDWENHHVLQINREPARAAFTPFSVRKGDCSMSLDGIWK FRWTPVPGERIIDFYQTDFNDKDWKDFPVPANWEVNGYGTPIYVSAGYPFKIDPPRVMGE PKADYTTYKERNPVGQYRRTFVLPVGWEADGQTFLRFEGVMSAFYVWINGERVGYSQGSM EPSEFNITDYLKTGENQIALEVYRYSDGSYLEDQDFWRFGGIHRSIHLIHTPDVRMRDYA IRTLPASAGNYKDFILQIDPQFSVYRGMTGKGYTLQGVLKDASGKEIVKLQGEVEEILDL EHKASRMNEWYPQRGPRKTGRLSAIIKSPERWTAETPYLYKLHLTLQNAEGKVIEQAEQS VGFRTVEINKGQLLINGNPVRFRGVNRHEHDPRTARVMSEERMLQDILLMKQANINAVRT SHYPNVSRWYELCDSLGLYVMDEADIEEHGLRGTLASTPDWYAAFMDRAVRMAERDKNYP SIVMWSMGNESGYGPNFAAISAWLHDFDPTRPVHYEGAQGVDGNPDPKTVDVISRFYTRV KQEYLNPGIAEGEDKERAENARWERLLEIAERTNDDRPVMTSEYAHSMGNALGNFKEYWD EIYSNPRMLGGFIWDWVDQGIYKELPDGRIMVAYGGDFGDKPNLKAFCFNGLLMSDRETT PKYWEVKKVYAPVQLAVNNGQLTVTNRNHHIDLSQYRCLWTLTIDGKQKEQGEITLPEVA PGENETITLPAFRSLSDKKALNRKGNNSNSTNTLSDCQLKVSIVLKSDALWAKAGHEVTW EQFCLQQGELLSADLINKGALQVKEDDKSLSVSGRDFSVQWEKKTVGSITSLMYNGKEIL TQNHFPVQPVTQAFRAPTDNDKSFGNWLAKDWQLHGMDHPLISLESFDHEVRADGAVIVR IRTTNLYKEGNVTTTSVYTISSDGVIDLKTTFLPQGVFPELPRLGLAFCLAPAYNTFTWY GRGPQDNYPDRKTSAATGLWKGTVAEQYVHYPRPQDSGNKEEVQFLTLTDKQNKGIRVDA VEDVFSASALHYTAQDLYKETHDCNLKPRPEIILSMDAAVLGLGNSSCGPGVLKKYAIEK KEHTLHIRISKQ >gi|226332281|gb|ACIC01000039.1| GENE 20 35132 - 37984 2339 950 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 
# Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 42 706 63 745 1087 234 28.0 7e-61 MRKVTTLLSTLALATTLAAQTLPQTERQYLSGHGCDDMVEWDFFCTDGRNSGKWTKIGVP SCWELQGFGTYQYGITFYGKAFPEGIADEKGMYKYEFEVPEKFRGQQVSLVFEASMTDTE VKVNGRKVGSKHQGAFYRFSYNVTDFLKYGKKNLLEVTVSKESENASVNLAERRADYWNF GGIFRPVFLEVKPAINLRHIAIDAQMDGSFRANCYTNLSNDGMSIRAQILDNKGKELTET TVPVKAGGDWTSLQLNVSDPALWTAETPNLYKAQFSLLDKGGKVLHNETETFGFRTIEVR ESDGLYVNGVRIMVRGVNRHSFRPESGRTLSKAKNIEDVLLMKDMNMNSVRLSHYPADPE FLEACDSLGLYVMDELGGWHGKYDTPTGVRLIKGMIERDVNHPSIIWWSNGNEKGWNTEL DGEFHKYDPQKRPVIHPQGNFSGFETMHYRSYGESQNYMRLPEIFMPTEFLHGLYDGGHG AGLYDYWEMMRKHPRCIGGFLWVLADEGVKRVDMDGFIDNQGNFGADGIVGPHHEKEGSF YTIKQLWSPVQIMNTSVDKQFDGKFSVENRYDYLNLNTCRFLWKQVKFPTATDASNTTAQ VLKQGEVQGSDVAAHSAGVLDIKTTILPDADALFLTAIDRYGHELWRWTFPVNKLNQANE VLSPLSNKVTSTETENELTVKAKNHTFIFSKKDGQLKGVSVNNRKISFANGPRFIGARRA DRSLDQFYNHDDEKAKEKDRTYSEFPDAAVFTKLDVKEDGGNLIVTANYKLGNLDKAQWT INPSGEAVLDYTYNFSGVVDLMGICFDYPEDQVISKRWLGAGPYRVWQNRIHGTQYDVWE NDYNDPIPGETFTYPEFKGYFGSVSWMNIRTKEGTISLTNETPDAYIGVYQPRDGRDRLL YTLPESGISVLNVIPPVRNKVNSTDLCGPSSQPKWVNGPQSGRIIFRFME >gi|226332281|gb|ACIC01000039.1| GENE 21 38006 - 38422 349 138 aa, chain - ## HITS:1 COG:VC0802 KEGG:ns NR:ns ## COG: VC0802 COG3602 # Protein_GI_number: 15640820 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Vibrio cholerae # 1 132 1 131 132 131 53.0 5e-31 MSGIKELELLLSNMEPVLDERDFVFCSFPTLDWDQMRELNPIGIFHEKEGVTFILDTKNA VDKKIDYLSVYRLITLNVHSSLDAVGLTAAFSAKLAEKNISANVVAAYYHDHIFVPEEKA EQALEAILELQKKVPNAM >gi|226332281|gb|ACIC01000039.1| GENE 22 38545 - 39231 525 228 aa, chain + ## HITS:1 COG:ECs0609 KEGG:ns NR:ns ## COG: ECs0609 COG0745 # Protein_GI_number: 15829863 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 4 224 3 221 227 176 42.0 
3e-44 MYTILIIEDEPRVASLLMNGLEENGYQTMVAYDGLMGLRLFQAHTFDLVISDIVLPKMDG FELTKEIRKSNPRIPILMLTALGSTNDKLDGFDAGADDYMVKPFDFRELNARIKVLLKRV SGNVPELPQELVYADLRIDLQRKDVERNNVSIKLSPKEYNLLLYMVENAERVLSRVEIAE KVWNTHFDTGTNFIDVYINYLRKKIDRNFEPKLIHTKAGMGFILTDKL >gi|226332281|gb|ACIC01000039.1| GENE 23 39228 - 40595 1077 455 aa, chain + ## HITS:1 COG:RSp1043 KEGG:ns NR:ns ## COG: RSp1043 COG0642 # Protein_GI_number: 17549264 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 169 453 180 463 466 144 30.0 3e-34 MKIRTTLTLQYAGITAAVFFAFVIAVYYVSEHSRSNAFFRNLQSEAITKAHLFLNNQVDA KTMQSIYLNNQKFINEVEVAVYTTDFKILYHDALQNDIVKETPEMVKRILKRKNISFYVD EYQAIGLVYPFEGQYYIVTAAAYDGYGYANRDALRNMLILLFIGGLSVLAIVGYILSRST LKPIRNIVREAEKITASHIDKRLPVKNEQDELGELSITFNALLERLEKSFNSQKMFVSNV SHELRTPMAALTAELDLALLKERTSGQYQAAIGNALQDSRRIINLIDGLLNLAKADYQSE QITMEEVRLDELLLDARELVLKAHPDYHIELVFEQEAEEDNVLTVIGNSYLLTTAFVNLV ENNCKYSSNRTSSVLIAYWEQWAIIRLSDNGVGMSDADKENLFKLFYRGENKNIAPGNGI GMVLTQKIIHLHKGELTVSSHKDEGTTFVVKLPHI >gi|226332281|gb|ACIC01000039.1| GENE 24 40716 - 43367 1894 883 aa, chain + ## HITS:1 COG:PA4825 KEGG:ns NR:ns ## COG: PA4825 COG0474 # Protein_GI_number: 15600018 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Pseudomonas aeruginosa # 40 883 45 903 903 961 54.0 0 MVWKKKLRSSQYTFNSERVFLVATQPGKSIYSCLQTTKLGLTLGEVQERQSIYGRNEVIH EQKKNPFILFIRTFINPFIGVLTGLAIISLFLDVLMAEPGEQEWTGVIIISSMVLFSAVL RFWQEWRASEATDSLMRMVKNTCLAKRAGEQEEEIDITELVPGDIVYLAAGDMVPADIRI IDSKDLFISQASLTGESEPIEKFPEVQGQQFRKGSVIELDNICYMGSNVISGAAKGIVFE TGNKTYLGTIAKSLVGHRATTAFDKGISKVSFLLIRFMLVMVPFVFFVNGFTKGDWFEAF IFAVSVAVGLTPEMLPMIVTANLSKGAIAMSKKKTIVKNLNAIQNFGAMDILCTDKTGTL TCDKIVLEKYINADGSDDNSKRILRHAYFNSYFQTGLRNLMDKAILSHVRELNLEHLKDA YTKVDEIPFDFTRRRMSVVIEDRQGKRQIITKGAVEEILDVCSYAEFDREIHPLTDSLKI KAQKISEEMNRQGMRVLAVSQKSFIEKDCNFAIEDEKEMVLIGYLAFLDPPKPSAAEAIE QLYMHGVAVKILSGDNDTVVKAIARQVGIDTGHSLTGIEMEEMDETTLKETVKDTTLFSK 
LTPLQKTQIISLLQEQGNTVGFLGDGINDAGALRQSDIGISVDSAVDIAKESADIILLEK DLMVLEDGVLEGRKTFGNINKYIKMTASSNFGNMFSVMFASAFLPFLPMMPIHLLIQNLL YDISQTTIPFDRMDPEFLKKPRKWDASDLSRFMIYIGPISSIFDIATYLVMWYVFSCNSP EHQTLFQTGWFVEGLLSQTLIVHMIRTRKIPFIQSRATWPVMGLTFLIMAVGILIPFTAF GRSIGLTALPLGYFPWLIGILLSYCILTQLVKNWYIRKFVRWL >gi|226332281|gb|ACIC01000039.1| GENE 25 43531 - 45045 987 504 aa, chain + ## HITS:1 COG:no KEGG:BT_0987 NR:ns ## KEGG: BT_0987 # Name: not_defined # Def: putative cytochrome c-type biogenesis protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 504 1 504 504 996 100.0 0 MKHSLYFLIIITLLFAGCRQKAQSNFLSESSAQTEESNPSLEEYAEEAVITGKVLNRDFY PQEKELTLIIPFFRDMGNQYRSPIQKDGSFSFRFPVYAKLREVSIRNYAEHLYIHPGDSI HVEIDFKDLFHPKVTGDAEKLNQEILAFTESAYYYIQNYNMKPESDVKDFEAELKKDYNF RLERRNEYLVKYKPMEDVVLFTEELLKQDYYYALLFNGMSYLFKTRKEMDRYHTLLPEIN KLYTKGILSARLYDVADEAERYIAYGIAFRDKKNPSIEEIMATMGESEMNQYLYTKLIAG SLCTNDTLAFHEKRTQFDSIVKMSHLRAQVMQIYNQTKSYLKNPQPVSDNLLYGEFHENS KHTTRMPYMKPVYDVLEKNRGKVIYFDFWARWCPPCLAEMEPLKQLRSKFSTDDLIIYSI CVSEPKEQWEECLNEYSLKNRGIECVHVTDYLGIDNYQKICKQWKIDRMPYYVLINRKGQ IIDFGTAARPSNPQLVSRIEEAVK >gi|226332281|gb|ACIC01000039.1| GENE 26 45169 - 48486 2359 1105 aa, chain + ## HITS:1 COG:no KEGG:BT_0986 NR:ns ## KEGG: BT_0986 # Name: not_defined # Def: putative DNA-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1105 1 1105 1105 2283 100.0 0 MKSRLKQQIFAISLLACTAISPANALQTHLLREQFQNPSDEAKPWTFWYWMFGAVSKEGI TADLEAMKRAGLGGTYLMPIKGIKEGPQYNGKAQQLTPEWWEMVRFSMEEADRLGLKLGM HICDGFALAGGPWMTPKESMQKIVWSDTIVDGGKIKGLHLPQPEAYEGFYEDISLFALPV KEEAADVMPAQITCANIATGNHIDIKKTVNMDDAGVIRSSYPCYIQYEYEQPFTCRNIEI ILSGNNYQAHRLKVMASDDGVNYRLVKQLVPARQGWQNTDENSTHAIPATTARYFRFYWT PEGSEPGSEDMDAAKWKPNLKIKELRLHREARLDQWEGKAGLVWRVASSTKKEEIGEQDC YALSQIINLTDPFKNSPSDNFKERTLTATLPKGKWKLLRMGHTATGHTNATAGGGKGLEC DKFNPKAVRKQFDNWFAQAFVKTNPDVARRVLKYMHVDSWECGSQNWSDTFAAEFRKRRG YDLMPYLPLLAGIPMESAERSEKILRDVRTTIGELVVDVFYQVLADCAKEYDCQFSAECV APTMVSDGLLHYQKVDLPMGEFWLNSPTHDKPNDMLDAISGAHIYGKNIIQAEGFTEVRG 
TWNEHPGILKALLDRNYALGINRLFFHVYVHNPWLDRKPGMTLDGIGLFFQRDQTWWNKG AKAFCEYITRCQSLLQYGHPVADIAVFTGEEMPRRSILPERLVPSLPGIFGAERVESERI RLANEGQPLRVRPVGVTHSANMSDPEKWVNPLRGYAYDSFNKDALLRLAKAENGRMTLPG GASYKVLVLPLPRPMNPDPAALSPEVKQKINELKEAGILIPSLPYKEDDFSSYGLERDLI VPENIAWTHRQGEQGDIYFIANQLEETRTFTASMRIDGRKPECWNPVTGEINADIPYEQK SHRTEITLTLAPNESVFIVYPAEEDDKETSEKERKEKKDSVKEASETGLEATEYTVTFTA NGKTIQRQELFDWSKEEDEQIRYYSGTAVYKTTFRWKSKVKEDQQVYLNLGKVCDLATVR VNGIDCGTIWTAPYRADITAALKKGVNELEIEVTNTWANALKGADEGKAPFDGIWTNAKY RRAENTLLPAGLLGPLNFDVANKNK >gi|226332281|gb|ACIC01000039.1| GENE 27 48489 - 49922 1226 477 aa, chain + ## HITS:1 COG:no KEGG:BT_0985 NR:ns ## KEGG: BT_0985 # Name: not_defined # Def: putative sialic acid-specific acetylesterase II # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 477 1 477 477 978 100.0 0 MKSSGIKTSTVLASFLLAACLSMRAEVKLPAIFSDGMVMQQQTNANLWGTATPNKKVTVT TGWNGKQYAVTADKNGSWKLSVSTPEAGGPYTITFDDGTQKTLNNILIGELWLCSGQSNM EMPMKGFKNQPVENANMDILRSKNLHIRLFTVKRTSTITPQNDVVGSWKEAAPVSVRDFS ATAYYFGRLVNEILEVPVGLVVAAWGGSACEAWMTADWLKAFPDAKIPQSEADIKSKNRT PTVLYNGMLHPLIGMTMKGVIWYQGEDNWNRAHTYADMFTTLINGWRAEWKQGDFPFYYC QIAPYDYGIITEKGKEVINSAYLREAQAKVEHRVSNSGMAVLLDAGMEKGIHPAKKQAAG ERLALLALTKTYGIEGVNGESPYYKSIEIKNDTIVVSFERANMWISGKNCFESKNFQVAG EDKVFYPAKAWIERSKMLVKSDKVPHPVAVRYGFENYVEGDVYCDGLPLGSFRSDDW >gi|226332281|gb|ACIC01000039.1| GENE 28 49932 - 52343 1400 803 aa, chain + ## HITS:1 COG:no KEGG:BT_0984 NR:ns ## KEGG: BT_0984 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 803 1 803 803 1667 99.0 0 MKRFYLAIAMIFVGVSVLWSQHANVVWDTPSRNSSESMPCGGGDIGMNIWVEEGDILFYL SRSGTFDENNCQLKQGRFRLRLSPNPFEDAKDFRQELKLIDGYVEISAEGTQVQLWADVF HPVVHIEVINDRPLQAEIFYENWRYQDRLIRKGEGQQCSYKWAPPKGTMTHADFISLENS SEKDSKRLLFYHRNAEETVFDVAVAQQGMNEVKSQMMNPLKNLTFGGYLSGENLEYIGTS DSVYAGTDYRAWGFRSLKASKKHHFSVVLHTEQTETVTQWEQGLKTAWQRIAPQGKISSK VVSQDKKQTRLWWNAFWQRSFIETIGETENKENSKVEKETGNKEKSDAKDALKEITRNYT LFRYMLGCNAYGSVPTKFNGGLFTFDPCHIDEKQAFTPDYRKWGGGTMTAQNQRLVYWPM 
LKSGDFDMMPSQFNFYNRMLKNAELRSHVYWQHEGACFCEQIENFGLPNPAEYGFKRPAW FDKGLEYNAWLEYEWDTILEFCQMILETKNYAGADITPYLPLIESSLTFFDEHYRLLASR RGRKALDGDGHLILFPGSACETYKMTNNASSTIAALRTVLETYIKVCNNEKWQKMLETIP PVPLRYIEVKDSLNLQASTMTPAWKQTISPAKSWERINNIETPQLYPVFPWRIYGVGKEN LEIARDTYFYDPDALKFRSHTGWKQDNIWAACLGLTEEAESLSLAKLSDGPHRFPAFWGP GYDWTPDHNWGGSGMIGLQEMLLQTNGTQILLFPAWPKEWNVHFKLHAPGNTTVEATLKD GKVTILKVSPESRKKDIVIMIEK >gi|226332281|gb|ACIC01000039.1| GENE 29 52372 - 55020 2055 882 aa, chain + ## HITS:1 COG:SMb21655 KEGG:ns NR:ns ## COG: SMb21655 COG3250 # Protein_GI_number: 16263752 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sinorhizobium meliloti # 43 799 3 731 755 239 28.0 1e-62 MNKRPFLILSLFLLLFSFDGQAADTYRPEISVAGFIPLPDCGRQVYNFNPGWRFFKGDIR GAEAVNFDDRSWEVVSTPHTVELMPAEGSGCRNYQGPAWYRKHFVVPTATKGQRVLLHFE AAMGKQVLYLNGKRVQEHLGGYLPFTLDLTANGVQAGDSCLLAVYTDNSNDKSYPPGKPQ YTLDFAYHGGIYRDVWMIAKSPVAITDAIDSRTVGGGGVFVHFDKISEKSAQVYVETEIQ NDNTRSESVTIETTLTDAEGNVIKRTSGKLSLNSGEKKSIRQQMEVRNPKLWSPDAPYLY RVQSRIKKGNRSIDGGTTRIGIRLAEFRGKEGFWLNGKPFGQLVGANRHQDFAYVGNALP NSQQWRDAKRLRDAGCTIIRVAHYPQDPSFMDACDELGLFVIVATPGWQYWNKDPKFGEL VHQNTREMIRRDRNHPSVLMWEPILNETRYPLDFALKALEITKEEYPYPGRPIAAADVHS AGVKEHYDVVYGWPGDDEKEDKPEQCIFTREFGENVDDWYAHNNNNRASRSWGERPLLIQ ALSLAKSYDEMYRTTGQFIGGTQWHPFDHQRGYHPDPYWGGIYDAFRQKKYAYEMFRSQS PASLQHPLAECGPMIFLAHEMSQFSDKDVVVFSNCDSIRLSIYDGTKTWTKPVVHAKGHM PNAPVIFENVWDFWEARGYSYTQKNWQKVNMVAEGIIDGKVVCTQKKMPSRRSTKLRLYA DTQKVNLVADGSDFIVVVAEVTDDSGNVRRLAKENIVFTVEGEGEIIGDARIGANPRAVE FGSAPLLIRSTRKAGKIKVKAHVQFEGTQAPTAAEMEFESVPAELPFCYDEKYCISQDTP SLTVKATLSENATDGKVQLTEEERQRVLDEVERQQTEFGIEK >gi|226332281|gb|ACIC01000039.1| GENE 30 55250 - 55501 101 83 aa, chain + ## HITS:1 COG:no KEGG:BT_0982 NR:ns ## KEGG: BT_0982 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 83 1 83 83 149 100.0 2e-35 MKKISSHGKGIMVIYIYINLKQALFLYPIMKYFPNIVGTGSKNRVSKKANRRYKAYRCVF IGLFTPFVLFEKPNTGFLDFTGI 
>gi|226332281|gb|ACIC01000039.1| GENE 31 55759 - 60177 4032 1472 aa, chain + ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 940 1172 8 235 294 157 37.0 1e-37 MKKHFFFIFMFLLMTRICAFAYPNMIVEHYTAERGLPNNIVNCTLKGQDGFVWFGTWYGL CSFDGTKFRSYDNHDGFYSADIPPRKIQRIVEDRNGYLWIKTIDRKLYLFDKKHESFHAV YDDVKEYSENIQIIKIQTTDDGDVLLLTKDKSLLRANTDQNGKITMKQLHDSRPNVNVYD LRLKHNVFCETAEFINWIGMDYQILSLRKGEALKDKPVDFIQKKVSANPDQFTCASYSSK FLWLGDKNGHIYSVDPQNGVVNRYEIPEIDQPISNILVTESGLMYITTVDGAYEYNIGYK QLTKLPFSIAGKKVGMIFNDKYDKIWFQEGNEALTYYDPLNKSNHRFPFPKQNAIGNFEK QDAGEQGMFFLTPGGEILLFDREKLEMTRINQLKPFSDDLPDQLFFHLLLDKDGILWLAS TGSGVYKVNFPKKQFQLLTEVSPVPVDTERATSWNQGIRALYQAKNGDIWVGTRWQALYR LDSNGQIKQVFTDQNYLIGAVYHIMEDKEGNLWFSTKGNGLVKAEPDMNSPHGLRFTRYK NDPKNPNSISNNDVYFTYQDSQERIWVGLLGGGLNLISEENGTLVFKHKYNGLKQYPAYG LYMEVRTMTEDEDGRIWVGTMDGLMSFDGHFTTPEQIQFETYRQISDRSNVADNDIYVLY KDSDSQIWVSVFGGGLNKLVRYDKEKREPIFKSYGIREGMNNDVVKSIVEDKNGNLWFTT EIGLSCFNKATEQFRNYDKYDGFLNVELEEGSALRALNGDLWLGSRQGILTFSPDKLETQ HTNYDTRIVDFKVSNRNLRSLNDCPIQESITYTDALQLKYNQSMFTIEFAALNFYNQNRV SYRYILEGYEKEWHYNGKNRIASYTNVPPGDYVFRVETMDEANPELVSTCTLAVTILPPW WLSWWATLIYVILGIAALYFGLRLAFFMIKMKNDIYIEQKVSEMKIKFFTNISHELRTPL TLIKGPIQELKEREELSPKGLQYVDLMEKNTNQMLQLVNQILDFRKIQNGKMRLHVSLID FNEMIASFQKEFRVLSEENEISFTFQLPDEPIMVWADKDKLSIVIRNVISNAFKFTPSGG SIYVTTGLTDDGKSCYVRVEDNGVGIPQNKLSEIFERFSQGENANNPHYQGTGIGLALSK EIVNLHHGIIRAESPEGQGAVFIVELLLDKDHYRPSEVDFYVSDTETAPAATDKKMVIDS EKEPEEELEIDASLPTLLLVEDNKDLCQLIKLQLEDKFNIHIANNGVEGLKKVHLYHPDI VVTDQMMPEMDGLEMLQSIRKDFQISHIPVIILTAKNDEGAKTKAITLGANAYITKPFSK EYLLARIDQLLGERKLFRERIRQQMENQTTTEEDSYEQYLVKKDVQFLEKIHQVIEENMD DSDFNIDTIASGIGLSRSAFFKKLKSLTGLAPVDLVKEIRLNKSIELIKNTDLSVSEVAF AVGFKDSGYYSKCFRKKYNQTPREYMNEWRKG >gi|226332281|gb|ACIC01000039.1| GENE 32 60241 - 61923 1600 560 aa, chain + ## HITS:1 COG:no KEGG:BT_0980 NR:ns ## KEGG: BT_0980 # Name: not_defined # Def: 
hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 560 1 560 560 1110 100.0 0 MNKIKKVLSAWMLVACVLPVTAQYPVIPDSVKERGAKQEAEFEQKSNAAWEKALPTVLEE ARLGRPYKPWASKPEDLIKSNIPAFPGAEGGGMYTPGGRGGKVIVVTSLEDSGPGTFREA CETGGARVIVFNVSGIIRLKSPISVRAPYVTIAGQTAPGDGICVTGHSFLIDTHDVVIRH MRFRRGAQDVAFRDDAVGGNAVGNIIIDHCSASWGLDENMSIYRHVYNRDETGHGLKLPT VNITIQNSMFSEALDTYNHAFGATIGGHNSMFCRNLFASNISRNSSVGMDGDFNFVNNVV FNWWNRSVDGGDNKSFYNIINNYFKPGPITPLDKPISYRILKPEAGRDKSKPMSFGKAYV NGNIVHGNAKVTKDNWDGGVQLASEVDEGKFLPQIRVDEAFKMSPVTIMDTKKAYNFVLD NVGATLPKRDAVDARVIKTVQTGKAIYAKDAPEFVSPYVKRRLPADSYKQGIITDIRQVG GLPEYKGEAYLDSDGDGMPDAWEIANGLNPNDPTDAKEDCNGDGYTNIEKYINGLDTKKK VDWTDLKNNCDTLSKRKSLF >gi|226332281|gb|ACIC01000039.1| GENE 33 61991 - 65086 2170 1031 aa, chain + ## HITS:1 COG:no KEGG:BT_0979 NR:ns ## KEGG: BT_0979 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1031 1 1031 1031 2183 100.0 0 MNIQLRTILLGLLSIGFAQGYAQTFALQVKDDRITYLNDEQGNRILDFSYCGYKGSEQDI PSVRNAVFVPWTAGDNTSRIQRAIDYVASLVPDASGFRGAVLLDQGEFSLSGSLRINASG IVLRGVDKEKTILLKKGVDRGALIYMEGTDDMKVQDTLQVLSKYVPVNARTLEVASGTSL RKGDRILVDRPSGKEWIASLGCDIFGGGISALGWKEGDMDLTWDRTVTEVNGNQITLDAP LTVALDAKYGTSSVITYQWNGRIRECGVENMTLISDYDKRYPKDEDHCWTGISIEDAENC WVRRVNFKHFAGSAILVQRTGSQITVEDCISREPVSEIGGMRRCTFYTLGQLTLFQRCYS EQGIHDFAAGYCAAGPNAFVQCDSYESFGFSGSIDAWACGLLFDVVNIDGHNLSFKNLGQ DKNGAGWNTANSLFWQCTAAEIECYAPAKDAMNRAYGCWAQFSGDGEWAQSNNHVQPRSI FYAQLEDRLQKKCAERARILPRNTSATSSPTVEVAMELAKEAYNPRLTLEHWIEQGAFAP SIAMSGVKSIDDMKEKKITPNKTSAQPEVTIANGRIQMDGTLLVGGSHTTPWWNGKLKTS FLKKASPAITRFVPGREGLGLTDRIDSVIGYMKQKNILVFDQNYGLWYDRRRDDHERVRR RDGDVWGPFYEQPFGRSGQGTAWEGLSKYDLNRPNAWYWSRLKEFAEKGSKDGLLLFHEN YFQHNILEAGAHWVDCPWRSTNNINQTGFPEPAPFAGDKRIFVADMFYDITHPVRRELHK QYIRQCLNNFADNPNVIQLTSAEFTGPLHFVQFWLDVIAEWETETGKKAKVALSTTKDVQ DAILADPKRAAVVDIIDIRYWHYKTDGVFAPEGGKNMAPRQHMRKMKVGKVTFTEAYKAV HEYRQKFPEKAVTFYAQNYPAMGWAVFMAGGSCPVIPCTDKAFLKDAAAMEVEETDTDNY KKMVKSDIGSIIYSKSGTEIPVQLSSGKYILKYIHPGSGKMIVIDKSLKINGVYKLKVPD KKEGIYWFHKL 
>gi|226332281|gb|ACIC01000039.1| GENE 34 65263 - 65814 561 183 aa, chain + ## HITS:1 COG:BH3216 KEGG:ns NR:ns ## COG: BH3216 COG1595 # Protein_GI_number: 15615778 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus halodurans # 5 183 3 178 193 66 26.0 3e-11 MMEQTNNSTDTLLASFQAGNMAAFSQLYDLHINILFNYGLKLTIDKELLKDCIHDIFVKL YTKKDELGTIDNLKSYLFISLKNKLCDELRKRMYMSDTAIEDVNAVAPTDVEDDYMEEEQ RKNEFSLVKRLLDQLSPRQREALTLYYIEEKKYEDICEIMNMNYQSVRNLMHRGLTKLRS LAS >gi|226332281|gb|ACIC01000039.1| GENE 35 65897 - 66148 291 83 aa, chain + ## HITS:1 COG:no KEGG:BT_0977 NR:ns ## KEGG: BT_0977 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 83 1 83 83 151 100.0 9e-36 MCSKVKDFLTDDDFINYALGVTPEAASQWETYFREHPEQIADAEEAKAVLLAPADVACDF SLVENQDLKDRIVSSIKDFSNIL >gi|226332281|gb|ACIC01000039.1| GENE 36 66255 - 67508 943 417 aa, chain - ## HITS:1 COG:BMEI0267 KEGG:ns NR:ns ## COG: BMEI0267 COG0477 # Protein_GI_number: 17986551 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Brucella melitensis # 13 381 22 365 397 123 27.0 8e-28 MNNQFTYPKEKIRFAILTFFFAQGLCMASWASRIPDFKDVFAANYAFYWGLILFMIPVGK FVAIPLAGYLVSKLGSRTMVQVSILGYAMSLLCVGLAHEVYLLGFLLFCFGVFWNLCDIS FNTQGIEVERLYGKTIMATFHGGWSLGACTGALIGFVMILTGTSPVWHYMLIAAIILIIV LAGRKYLQETTLPICEHPKPGFETTDKAADKAPNGFRLLFQKPEILLLQLGLVGLFALIV ESAMFDWSAVYFESVVHVPKSLQIGFLVFMVMMATGRFLTNYAYQLWGKKKVLQLAGSLI CIGFFISALLGNAFDSLGMKVIINSLGFMLVGLGISCIVPTLYSFVGAKSKTPVSIALTI LSSISFIGSLVAPLLIGAITQALDIRIAYMIIGILGGCIVLIVSFCNAFDIREENKL >gi|226332281|gb|ACIC01000039.1| GENE 37 67614 - 68381 530 255 aa, chain + ## HITS:1 COG:Cj1166c KEGG:ns NR:ns ## COG: Cj1166c COG2966 # Protein_GI_number: 15792490 # Func_class: S Function unknown # Function: Uncharacterized conserved 
protein # Organism: Campylobacter jejuni # 13 247 12 249 258 141 35.0 1e-33 MKTSESLLSIGKFIAEYSAHLMGAGVHTSRVVRNSKRIGEAFGLDVKLGVFHKNIILTII DKETSEACNEVIDIPAHPISFEHNSELSALSWEAVDKQLSLEELKEKYKKIVSAPMIHPL FVLILVGFANASFCKLFGGDLISMGIVFSATITGLYLKQQMQKKKMNHYIIFIVSAFVAS LCASTALIFDTTSEIALATSVLYLVPGVPLINGVIDVVEGYVLTGFARLTEASLLIVSIA IGLSFTLLMVKNSLI >gi|226332281|gb|ACIC01000039.1| GENE 38 68383 - 68877 250 164 aa, chain + ## HITS:1 COG:Cj1165c KEGG:ns NR:ns ## COG: Cj1165c COG3610 # Protein_GI_number: 15792489 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Campylobacter jejuni # 6 164 5 160 164 82 33.0 3e-16 MVALDILSDGFFAAIAGIGFGAISDPPLRAFKMIAILAALGHACRFCLMTYLGVDIATAS LFGGLVIGFGSLWLGKRVYCPMTVLYIPALLPMIPGKFAYNMVFSLIMCLQTANKPELHD KYIKMQEMFFSNTLVASTVIFMLAVGATFPMFLFPHRAFSLTRH >gi|226332281|gb|ACIC01000039.1| GENE 39 68891 - 69619 608 242 aa, chain + ## HITS:1 COG:no KEGG:BT_0973 NR:ns ## KEGG: BT_0973 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 242 1 242 242 394 100.0 1e-108 MEIFWRTIAYYNSATWLLQIIVILIGIVLTALLISKPRPWVKMGMKIYMIGLYSWISIVY YFIYCEERSYNGVMAMFWGVMAIIWIWDTITGYTTFERTHKYDLLSYVLLAMPFIYPMVS LARGLSFPEMTSPVMPCSVVVFTIGLLLLFAHKVNMFLVLFLCHWSLIGLSKTYFFQIPE DFLLASATIPGLYLFFREYFLNNLHADTKPKAKYINWLLITVCIGLAILLTTTMFLELVP KN >gi|226332281|gb|ACIC01000039.1| GENE 40 69733 - 70542 206 269 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 [alpha proteobacterium BAL199] # 3 223 4 233 259 84 30 2e-15 MQPQIILITGASSGFGKITAQMLSEQGHIVYGTSRKPSENIGKVRMLVVDVTNSISVRQA VEQIISEQGRMDVLINNAGMGIGGALELATEEEVSMQMNTNFFGVVNMCKAVLPYMRKAR RGKIINISSIGGVMGIPYQGFYSASKFAVEGYSEALALEVHPFHIKVCLVQPGDFNTGFT DNRNISELTGQNEDYADSFLRSLKIIEKEERNGCHPRKLGAAICKIVARKNPPFRTKVGP LVQVLFAKSKSWLPDNMMQYALRIFYAIR >gi|226332281|gb|ACIC01000039.1| GENE 41 70689 - 71486 497 265 aa, chain + ## HITS:1 COG:no KEGG:BT_0971 NR:ns ## KEGG: BT_0971 # Name: not_defined # Def: hypothetical protein # 
Organism: B.thetaiotaomicron # Pathway: not_defined # 1 265 1 265 265 548 100.0 1e-155 MKRILIFSIVCLTSIIVSLGQSVESTILPPTGYTREVCDEHSFAAYLRQLPLLPKGSKVL LYNGQEKRNQAAAFAVVDMEIGNRDLQQCADAVMRLRAEYLWAQKRYGEIKFNFTNGFPA EYKKWAEGNRIKVTGNKVEWYAAGGKDYSYKTFRKYLNMVFMYAGTASLSKELRTVPYTS LQAGDVFIKGGSPGHVVIVVDVAIHPKTKKKVFLLAQSYMPAQQIHILVNPANRNLSPWY ELTDTDSGRLYTPEWTFEKKELKRF >gi|226332281|gb|ACIC01000039.1| GENE 42 71559 - 72179 512 206 aa, chain - ## HITS:1 COG:L111950 KEGG:ns NR:ns ## COG: L111950 COG1011 # Protein_GI_number: 15672092 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Lactococcus lactis # 2 189 3 190 207 97 30.0 1e-20 MIKNIVFDFGGVIVDIDRDKAVQAFIKLGLADADTRLDKYHQTGIFQELEEGKLSADEFR KQLGDLCGRELTMEETKQAWLGFFNEVDLRKLDYILGLRKSYHVYLLSNTNPFVMSWACS PEFSSEGKPLNDYCDKLYLSYQLGHTKPAPEIFDFMIKDSHVIPSETLFVDDGSSNIHIG KELGFETFQPENGADWRQELTVILNS >gi|226332281|gb|ACIC01000039.1| GENE 43 72480 - 73568 726 362 aa, chain + ## HITS:1 COG:no KEGG:BT_0969 NR:ns ## KEGG: BT_0969 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 362 1 362 362 651 97.0 0 MNNLRILLYVVVLSLAACHPEGTNVKQGLDKAAQLMGQDPDTASIILENIQINQMNEAQL AEYNLLCTQLNEDKNIAHTSDKQIRQAVSYFDQHGDELQKSKAYFYLACVESDLNQKKEA ETHFKEAIKLAALTEKYDYVAKVCRRCSLYYQKYGHFDEALEMERKAYASQLMVNDQAGD SKLILSSALGVFGVMSLLLGLLWKKNKFVYSQLTTFKDEMLRKEAESDKLMLQCNYLEEK YQSLQQHIYESSPVVSKVRQIKERSVSPSKVVSFSEKDWNELLRLQEGVYGLVSKLKEIS PKLTEEDLRVCAFLREGIQPTCFADLMKLTVETLTRRISRIKTEKLMLVSSKESLEDYIK SL >gi|226332281|gb|ACIC01000039.1| GENE 44 73710 - 75071 801 453 aa, chain + ## HITS:1 COG:no KEGG:BT_0968 NR:ns ## KEGG: BT_0968 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 426 1 426 1044 867 100.0 0 MNYLLFKRWNDFSGWLVFLIAAFVYGMTIEPTASFWDCPEFISCAEKLQVGHPPGAPFYM LVGNLFTQFASDPSQVSRTVNFLNALLSAGCILFLFWSITRLVRSLLKEEKENLSVTDVI IILGSGLVGALAYTFSDTFWFSAVEGEVYAFSSFLTALVFWMILRWQDESDTVYGDRWII 
LIAYIIGLSIGVHLLNLLCIPAIVLVFYYQKYRTISLKGAIAAIALSGLLIVLILFVYIP GMADIGGWFELLFVNALGFPFQTGLIGFLAITFFLLIAGIYRFKKRLINTALWCILMLTV GYTTYAVILIRANANTPLNENAPDNIFALKSYLNREQYESAPLLYGKTYASEPEYVPEGD YYKVKMKTGGAVYRPNREEGKYKVIRNKEEVCYTQNMLFSRMWNDRMASSYKSWSEGTDG APTQKEESYLLLRLSAQLYVLALFSLEFCRASK >gi|226332281|gb|ACIC01000039.1| GENE 45 75025 - 76845 1294 606 aa, chain + ## HITS:1 COG:no KEGG:BT_0968 NR:ns ## KEGG: BT_0968 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 606 439 1044 1044 1253 100.0 0 MYWRYFLWNFVGRQNDIQGHGEPEHGNWITGIPLLDNLRLGDQELLPESLRQNKGHNVFY ALPLLLGLIGIYWQWMRGKKGMQQFSVVFFLFFMTGLAIVLYLNQTPGQPRERDYAYAGS FYAFAIWIGMGAAGLCDALRKKKSSVLQVGLLMFLCLLIPVQMASQTWDDHDRSNRFTCR DFGANYLMTLPDKGNPIIFSNGDNDTFPLWYNQDTEGVRRDARICNLSYAQTDWYIYQQQ CPLYDAPGLPITWSKDQYQEGKNEYVAIRPELKKQIEELYQKHPEEARDSFGNDPFEVKN ILKYWALSEKQDFHVIPTDTISISIDKDAVLRSGIMLPDSIRHLKGKELKNAIPDKIYIS LKDMRILTKVDMLMLEMLANCNWERPLYMAISVGEVSKLKFDHYFVQEGLAFRFTPFDYK KWGNVKGDNNYAIDVERLYENVMNRYKYGGLDTPGLYLDETTLRTCYYHRRLFAQLAKEL IRQGDNTRARKVLAYAEQAVPAYNVPEIYESGSFDIAKAYAALGEKTKAMPLLKNLTAES EDYINWAFSLGDNRINMVQRDCLYKFWQWNQYNELVKEIDEEEYRLSNQRFEEKYALFSQ IVGLRN >gi|226332281|gb|ACIC01000039.1| GENE 46 76889 - 77110 170 73 aa, chain + ## HITS:1 COG:no KEGG:BT_0967 NR:ns ## KEGG: BT_0967 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 73 1 73 73 108 100.0 4e-23 MRNALSKINYIILSIGSILIITGYVLMSGEGSTPAAYNPDIFSRLRICFAPIICLLGYLL NVVGIIYITKKNN >gi|226332281|gb|ACIC01000039.1| GENE 47 77205 - 77756 441 183 aa, chain + ## HITS:1 COG:XF2239 KEGG:ns NR:ns ## COG: XF2239 COG1595 # Protein_GI_number: 15838830 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Xylella fastidiosa 9a5c # 4 183 9 194 206 67 26.0 2e-11 MSNNIDIKTLEAFQDGNHKAFETVFIAYYNKTKAFIDGYIKSESDAEELTEDLFVNLWIN HDSIDASKSFSSYLHTIARNSAINFLKHKYVHDTYLNSSQEIEYSSTSEEDLIAKELGLL 
IDDIVGKMSEQRKMIYTLSRQEGLSNAEIAIQLNTTKRNVESQLSLALKEIRKAISCFLL SFL >gi|226332281|gb|ACIC01000039.1| GENE 48 77820 - 78794 522 324 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 101 273 108 275 331 85 33.0 1e-16 MEKSNIITRIIKKFLSGRFSVETEEKVQRWIMKEENAEEKEKASLEYWNELDTEADSKTY SALERVNLRTGYNKEHLENIVLYHKFVRIVAVVIPICLLAGGLLYYIPSENEQIEISTAY GEQKRLVLPDSSEVWLNAGSTITYPKTFTKENRVVTLDGEAYFSVRKDDAKPFIVETSQL SVKVLGTKFNVKAYANDANITTTLTSGKVEVSTQSRPPQTLKPNEQLTYDKSTSDIEIST VDTVDTNSWVKGKIIFTNATAEEIFRTLERRYDTVIDHSTDFSASRRYTVKFLKDENLDE MLNILGDIIGFSYRQSGNKIIITK >gi|226332281|gb|ACIC01000039.1| GENE 49 78919 - 81357 1113 812 aa, chain + ## HITS:1 COG:no KEGG:BT_0964 NR:ns ## KEGG: BT_0964 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 812 1 800 800 1573 100.0 0 MNRINANKKPDKMIILLFFSILSLTAISQNAEKKITIQNKNISLKEAFEQIELQTDYSIA YEQSNLNLKKRQALSLKSASIEKALTQILKGTGYIYKIKGYHIIISLPTPKTETIGEKTQ KLTQTIRGIVVDSKTNAPIEYASVYVLEEPSLNSSTDSLGNFRISNVPIGRYNIQASFTG YHISIISEVSVTSSKEVYVEIPMDENIQYLAEVLVKPEIKKNRTINPMAITGGRMISMEE AGRFANGFDDPARLSAAFAGVAGDIGTNAVAIRGNSPQFTQWRLEGIEIPNPTHFADLSG LGGGFLSALSTQVIGNSDFYNGAFPAEYNNVLSGVFDMHLRNGNNQKYEHAFQVGLMGID LSSEGPISKKRGSSYLVNYRFSTTSLATGNDLNLKYQDLAFKLNFPTSKAGTFSIWGLGL IDRNKAPIEERSKWETLGDRQAGENRLEKMVGGLAHKYVMNENTYIRSSLSATYSKDHTV VDQQADDKLIRVGDIRNSRWDFVFNSYLNTKFSPRHTNRTGVTITNLHYDLDYQVSRYFG LNRPMEQISKGDGESTVFSAYSSSVINLNNNWDTSLGVTAQYFTLNNNLSIEPRVALKWK INPKHSLALAYGLHSRREKLDYYFVEKVVNGKNQSNRYLDFSRAHHVGMTYDWNINQNLH LKIEPYFQYLFRIPVEKNSSFSVINHEEFYLDRILTNTGQGRNYGIDITLEQYMKNGFYY MITGSLFKSKYRGGDNVWRNTRLDKNYLLNLLAGKEWMVGKLKQNVLSLNGRLFFQGGER YTPVNEERSQTEHDIFFDETKAYSKRFNPSLP >gi|226332281|gb|ACIC01000039.1| GENE 50 81854 - 83362 1025 502 aa, chain + ## HITS:1 COG:MTH401 KEGG:ns NR:ns ## COG: MTH401 COG1145 # 
Protein_GI_number: 15678429 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Methanothermobacter thermautotrophicus # 194 484 21 323 337 77 25.0 8e-14 MLRTIRLTAAIVCFTLITLLFLDFTGTLHTWFGWLAKIQFLPAVLALNIGVVLFLIVLTL LFGRIYCSVICPLGVFQDAVSWFSGKQKKNRFRYSPALKWLRYGVLAVFILALVAGLNAF VVLLAPYSAYGRMVSSLLAPVWQWGNNLLAYFAERAESYAFYEVDVWMKSLSTLIIAVIT LIVLFVLAWRNGRTYCNTICPVGTVLGFISRYSIFKPVIDTSKCNSCGLCARNCKASCIN AKAHEIDYSRCVACMDCLGKCKQGAIKYTRRIQKKEGVNTEAVKIKSVTSEQIDNARRSF LSASTIFATSTILKAQEKKVDGGLAAIEDKKIPNRENPIYPPGSLNARNFAQHCTACQLC VSVCPTQVLRPSGNLATLMQPEMSYERGYCRPECAKCAEVCPTDAIHLTSLAEKSAIQIG HAVWIKENCVPLTDGMECGNCARHCPAAAIQMVASDPEIADSPKIPAINVERCIGCGACE NLCPSRPFSAIYVTGHQMHRIV >gi|226332281|gb|ACIC01000039.1| GENE 51 83382 - 84785 1069 467 aa, chain + ## HITS:1 COG:MA0422 KEGG:ns NR:ns ## COG: MA0422 COG1453 # Protein_GI_number: 20089314 # Func_class: R General function prediction only # Function: Predicted oxidoreductases of the aldo/keto reductase family # Organism: Methanosarcina acetivorans str.C2A # 54 467 1 385 400 227 36.0 3e-59 MEEKNKKDINRRDFIKIVGISAATSTAILYGCSSKGTSSSSSASAEGEIPTDKMTYRTSP TTGDRVSLLGYGCMRWPLKPAPDGKGEVIDQDAVNGLIDYAIAHGVNYFDTSPAYVQGFS EKSTGIALSRYPRDKYFIATKLSNFSPDTWSREASLKMYHKSFADLQVDYIDYMLLHGIG MGGMDALKGRYLDNGMLDFLIKEREAGRIRNLGFSYHGDIEVYDYLLSRHDEIKWDFVQI QLNYVDWKHAKETNARNTDAEYLYGELHKRGIPSIIMEPLLGGRLSKLNDNLVARLKQRR PESSVASWAFRFAGTFPDILTVLSGMTYMEHLQDNLRTYSPLEPLTDEEMDFLEDTAQLM LKYPTIPCNDCKYCMPCPYGLDIPAVLLHYNRCVNEGNVPKNGQDENYLKARRAFLVGYD RSVPKLRQASHCIGCNQCVPHCPQNIDIPKELHRIDQFVEQLKQGTL >gi|226332281|gb|ACIC01000039.1| GENE 52 84785 - 85219 291 144 aa, chain + ## HITS:1 COG:no KEGG:BT_0961 NR:ns ## KEGG: BT_0961 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 144 1 144 144 295 100.0 3e-79 MEELINLLHSGGYSCVIANGDNIRTFTQRGVADLYDLLTQEPDFLKGASIADKVVGKGAA ALMILGGIRELHTDIISSKALDLLRSSDIKVHFVQEVPFIWNRDHTGWCPVETMCSEEES AEAILPLIRDFLEKIRSGKKLQEK >gi|226332281|gb|ACIC01000039.1| GENE 53 85236 - 
87134 1547 632 aa, chain + ## HITS:1 COG:PA1271 KEGG:ns NR:ns ## COG: PA1271 COG4206 # Protein_GI_number: 15596468 # Func_class: H Coenzyme transport and metabolism # Function: Outer membrane cobalamin receptor protein # Organism: Pseudomonas aeruginosa # 14 532 5 510 616 100 26.0 7e-21 MKTNTVCLFLFGSLLSISGMAQPGKDSTQVKRNYTIDEVVVTGTRNETDIRHLPMTISVV GRQQIEKRYEPSLLPLLTEQVPGFFTTSRGIMGYGVSTGAAGGMSLRGIGGSPTAGLLVL IDGHPQYMGLMGHPIADAYQSMLTEKVEVLRGPASVLYGSNAMGGVINIVTRKQQDEGVR TNMQVGYGSYNTLQTEFSNRVKKGRFSSIVTGSYNRTDGHRPDMEFEQYGGYAKLGYDFS SSWKLWGDVNVTHFNASNPGTIQVPLIDNDSRITRGMTSLALENHYEKTSGALSLFYNWG RHKINDGYKTGEQPQTSHFNSKDKMLGISWYQSATFFTGNRLTVGFDYQHFGGESWNKVL ATGERKPGVDKQMDEFAGYVDFRQDINSWLSLDAGVRVDHHSHVGTEWIPQGGLAFHLPK SAELKAMVSKGYRNPTIREMYMFAPANPKLNPEKLVSYELSYSQRLLEDALYYGLNLYYI NGDNVIMSNGLTPPLNVNSGEIENWGIEANIGYRFNSHWNVNANYSWLHMENPVLAAPEH KLYVGADYTAGRWTVSTGMQYIKGLYTSVIVNNKGTEQQDSFVLWNLRGNYRVCNFANIF VKGENLLAQRYEINAGYPMPKATFMGGINLNF >gi|226332281|gb|ACIC01000039.1| GENE 54 87147 - 87656 421 169 aa, chain + ## HITS:1 COG:no KEGG:BT_0959 NR:ns ## KEGG: BT_0959 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 169 1 169 169 283 100.0 2e-75 METTVKLYSLNYSNVKTYLAASLFIVGNILFPRLFHLIPLGGLTWLPIYFFTLIGAYKYG WKVGLLTAILSPVINSLLFGMPVPAVLPAILLKSVLLAIAAGWAAQRFGRISIPVLAGVV LFYQIIGTLGEWMYIGNFYNAVQDFRIGIPGMCLQVLGGYLFIKYLIRK Prediction of potential genes in microbial genomes Time: Thu May 12 00:26:38 2011 Seq name: gi|226332280|gb|ACIC01000040.1| Bacteroides sp. 1_1_6 cont1.40, whole genome shotgun sequence Length of sequence - 27806 bp Number of predicted genes - 22, with homology - 21 Number of transcription units - 11, operones - 6 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 19 - 4008 2549 ## COG0642 Signal transduction histidine kinase - Prom 4077 - 4136 9.4 2 2 Op 1 22/0.000 - CDS 4201 - 5370 891 ## COG0842 ABC-type multidrug transport system, permease component 3 2 Op 2 9/0.000 - CDS 5354 - 6538 680 ## COG0842 ABC-type multidrug transport system, permease component 4 2 Op 3 13/0.000 - CDS 6546 - 7532 974 ## COG0845 Membrane-fusion protein 5 2 Op 4 . - CDS 7539 - 8951 1236 ## COG1538 Outer membrane protein - Prom 9161 - 9220 7.8 + TRNA 9693 - 9765 75.0 # Gly CCC 0 0 + Prom 9998 - 10057 3.2 6 3 Tu 1 . + CDS 10078 - 11571 1725 ## COG0442 Prolyl-tRNA synthetase + Term 11591 - 11652 11.4 + Prom 11603 - 11662 5.3 7 4 Op 1 40/0.000 + CDS 11708 - 12385 175 ## PROTEIN SUPPORTED gi|149011191|ref|ZP_01832496.1| 30S ribosomal protein S9 8 4 Op 2 . + CDS 12408 - 13691 1060 ## COG0642 Signal transduction histidine kinase + Term 13708 - 13758 10.1 - Term 13695 - 13745 7.1 9 5 Op 1 24/0.000 - CDS 13809 - 14765 787 ## COG1277 ABC-type transport system involved in multi-copper enzyme maturation, permease component 10 5 Op 2 5/0.000 - CDS 14758 - 15501 229 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 11 5 Op 3 . - CDS 15505 - 16656 960 ## COG1470 Predicted membrane protein 12 5 Op 4 . - CDS 16653 - 16832 74 ## - Prom 16852 - 16911 8.6 + Prom 16710 - 16769 4.7 13 6 Op 1 . + CDS 16887 - 17324 577 ## BT_0923 putative periplasmic protein 14 6 Op 2 . + CDS 17348 - 18229 798 ## BT_0922 hypothetical protein + Term 18232 - 18286 9.1 - Term 18220 - 18274 9.1 15 7 Tu 1 . - CDS 18278 - 22732 3459 ## BT_0921 hypothetical protein - Prom 22856 - 22915 5.4 + Prom 22820 - 22879 6.6 16 8 Tu 1 . + CDS 22933 - 23952 644 ## PROTEIN SUPPORTED gi|227425790|ref|ZP_03908856.1| SSU ribosomal protein S18P alanine acetyltransferase + Prom 24289 - 24348 7.2 17 9 Tu 1 . + CDS 24376 - 25608 1002 ## COG1058 Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA + Prom 25652 - 25711 4.6 18 10 Op 1 . 
+ CDS 25731 - 25991 456 ## PROTEIN SUPPORTED gi|29346326|ref|NP_809829.1| 50S ribosomal protein L28 19 10 Op 2 . + CDS 26013 - 26201 319 ## PROTEIN SUPPORTED gi|29346325|ref|NP_809828.1| 50S ribosomal protein L33 20 10 Op 3 . + CDS 26216 - 26374 219 ## PRU_0750 hypothetical protein + Term 26407 - 26446 7.1 + Prom 26429 - 26488 10.4 21 11 Op 1 3/0.000 + CDS 26516 - 27475 736 ## PROTEIN SUPPORTED gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 22 11 Op 2 . + CDS 27472 - 27804 304 ## PROTEIN SUPPORTED gi|229870452|ref|ZP_04490046.1| SSU ribosomal protein S12P methylthiotransferase Predicted protein(s) >gi|226332280|gb|ACIC01000040.1| GENE 1 19 - 4008 2549 1329 aa, chain - ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 785 1027 3 236 294 132 37.0 6e-30 MRNIILFLLLLPLMAYGQTYKHIGVEDGLSNRRIYNIQKDHQGYMWFLTNEGMDRYNGKD IKHYKLIEESKQLSSEIHLGWLYADKEGGIWVIGKKGRIFQYEEKYDRFTMVYKSPKVTD AISYGYLDRNNNLWLCGKDSILLYNTKTTQVMRLPNLLEDNITVVEQADDTHFFIATEMG VRYTQFENEALKVIPLDVLDHVYTQINGLYFHPQLHRLFVGTFSDGTFAYDMSAHQIIKS DTKLGDVSITHIRPLNQKELLVATEGMGVHKIDVNTCNTSPFIVANYESYNEMNGNNITD IYIDEEKRIWLANYPAGITVVDFRYKNYRWIKHSIGNKQSIVNDQVHAVIEDSEGDLWFG TSNGISFQDSKTGQWHSFLSSYDKQLKDKNHIFITLCEVSPGVIWAGGYTSGIYKINKKT LSVEYFSPFLLSSVNIRPDKYIRDMIKDSKGYIWSGGYYNLKCIDPRKNTVRLYPGVNTV TAIEEKDTNFMWVGTATGLFLLNRNSGEYQYILMPVESTYINTLYQANDGLLYIGTNGSG VLIYDPGNMSFEHYYADNSALVSNSIYTILPEMNGQIMMSTENGITSFSIEKRSFHNWSR EQGMMSACFNATSGALRKNNNFVFGSVDGAVEFPAGIKLPDYEYSKMIFSDFHISYQPIY PGEKDSPLEKDIDKTDVLRLKYGENTFSFRVSSINYDFPSHSVYYWRLEGSSYNEWVQLS GNSLIRFTNLSPGKYKLYVRAISKEEPSIVFEERQVGIIISQPVWLNGWAIFIYMLLIVF VFSIAFRIMILKKQKKISDEKTHFFINTAHDIRTPLTLIKAPLEEFVEEETLTEIGSQRI NTALRNVNALLRLSTNLINFERADVYSSELRISEYELNTYMNEICNSFRSYANIKHIDFT YKSDFNYLNVWFDKDKMDSILKNLISNALKYTPENGIVSVSISETKDSWKLEVKDTGIGI PANEQNKLLKMHFRGTNAINAKITGSGIGLKLVDKLVHLHSGKINIESVEQQGTTITVVF 
PKGNKHFRHSNLIDPEKTGRQEAVLDAPVISEAAEKANDENLQRILIVEDNDELRAYLVN SLSPMYNVQACGNGKEALIIVKEFWPELILSDIMMPEMRGDELCSTIKNDIETSHIPVLL LTALGEEQNILDGLDIGADEYIVKPFSIRILKASIANLLANRALLRSKYANLEIDIEVKT PLAKCTNSLDWQFLSAVKKTIEDNINNPAFSVEVLNDLHNMSRTRFYYKLKAITGLSPQE LIKTTRLKRATQLLKEGEYNITEISEMCGFSESKYFREVFKKEYKMSPSQYAKKYSISSS GIIEDDSDE >gi|226332280|gb|ACIC01000040.1| GENE 2 4201 - 5370 891 389 aa, chain - ## HITS:1 COG:jhp1379 KEGG:ns NR:ns ## COG: jhp1379 COG0842 # Protein_GI_number: 15612444 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Helicobacter pylori J99 # 13 378 6 360 376 126 27.0 9e-29 MKKLNKLQQLSFIIRREYLAISTSYAVLLVLMGGIFVYGLLYNYMYAPNIVTDVPVAVVD NSHSELSRNFIRWLDATPQAKIYDQAMDYHEAKEWMKAGKVQGILYLPHDFEERVFRGDE AVFSLYATTDAFLYFEALQGASSRVMLAINDKYRPDGAVFLPPQGLLAVAMAQPVNVAGT ALYNYTEGYGSYLIPAVMVIIIFQTLLMVIGMVTGEEHMTKGILAYTPFGKSWAVAIRIV SGKTFVYCSLYAVFSFFLLGLLPHFFSIPNIGNGLYIVLMMIPYLLATSFLGLAASRYFT DSEAPLLMIAFFSVGLIFLSGVSYPMELMPWYWKAAHYIFPAAPATLAFVKLNSMGASMA DIRPEYITLWIQAVIYFILSAWVYKKKLS >gi|226332280|gb|ACIC01000040.1| GENE 3 5354 - 6538 680 394 aa, chain - ## HITS:1 COG:VC1608 KEGG:ns NR:ns ## COG: VC1608 COG0842 # Protein_GI_number: 15641616 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Vibrio cholerae # 16 362 12 357 387 106 23.0 8e-23 MSSSQTYSPFRSVLLREWRRMTSRRLYFGVCIILPLFTLFFMATIFGNGQMENIPIGIVD QDNTASSRAIARNISAVPTFKVTKHFVNEEAARKAVQKKEIYGYLSIPPRFEQDAISGKN ATLSYYYHYALLSVGGELMAAFETSLAPVSLSPIVMQAVALGVEQDKITTFLLPVQASNH PIYNPSLDYSVYLSQPFFFVLFQVLILLITVYAVGIEIKFRTADEWLSVAKGNMVTAVLG KLLPYTIIFILIGWLANYVMFGILHIPFQGSWVLMNMLTALFVVATQALGLFLFSLFPAI SLIISIVSMVGSLGATLSGVTFPVPNMYPLVRDASYLFPVRHFTEMMQNMLYGGSGFIHL WPSAVILCIFPLLAFVLLPHLKRAIESHKYEKIK >gi|226332280|gb|ACIC01000040.1| GENE 4 6546 - 7532 974 328 aa, chain - ## HITS:1 COG:HP1488 KEGG:ns NR:ns ## COG: HP1488 COG0845 # Protein_GI_number: 15646097 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
Membrane-fusion protein # Organism: Helicobacter pylori 26695 # 36 327 37 327 329 214 37.0 2e-55 MKPTSRTLSWAFVIIMLAVGIFTALGIILMHKQPLVLQGQAEATEIRISGKLPGRIDTFL VQEGDWVRQGDTLVVINSPEIHAKYQQVNALEQVAVQQNKKIDAGTRRQIVATALQLWNK TKSDLTLAKTTYNRILTLYKDSVVTSQRKDEVEAMYKAAVAAERAAYEQYQMAVDGAQKE DKESAASMVNAARSTVEEVSALLVDARLTAPENGQIATIFPKRGELVAPGTPIMNLVVMD DIHVVLNVREDLMPQFKIGGTFVADVPAIDKKGIEFKIYYISPLGSFATWKSTKQTGSYD LRTFEIHARPIQKVDDLRPGMSVLLTLN >gi|226332280|gb|ACIC01000040.1| GENE 5 7539 - 8951 1236 470 aa, chain - ## HITS:1 COG:VC1606 KEGG:ns NR:ns ## COG: VC1606 COG1538 # Protein_GI_number: 15641614 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Vibrio cholerae # 12 467 9 460 476 177 28.0 5e-44 MKKNKILFVLWALCVFVIQTNAQTPLSFEESLHLLNQGNQSLKIADKGIEIAKAERDKLN AFWYPTLQSTGTFVHMSEKIEVKQPLSQFTDPAKDFVHSIIPDDQIISAILDKIGANTLV FPLTPRNLTAVDLSAEWVLFSGGKRFRATNIGRTMVDMARENRAQTAASQQNLLAESYYG LRLAQQVVAVREETYKGLQKHYENALKLEAAGMIDKAGRLFAQVNMDEAKRALEAARKEE TVVQSALKVLLNKKDTDENIVPTSPLFMNDSLPPKMLFDMSVNSGNYMLNQLQLQEHIAK QEVRIAQSGYLPNIALFGKQTLYSHGIQSNLLPRTMVGIGFTWNLFDGLDREKKIRQSKL TQQTLALGKMKARDDLAVGVDKLYTQLEKAQDNVKALNATIELSEELVRIRKKSFTEGMA TSTEVIDAETMLANVKVARLAAYYEYDVALMNLLSLCGTPEQFVNYQPKP >gi|226332280|gb|ACIC01000040.1| GENE 6 10078 - 11571 1725 497 aa, chain + ## HITS:1 COG:BB0402 KEGG:ns NR:ns ## COG: BB0402 COG0442 # Protein_GI_number: 15594747 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Prolyl-tRNA synthetase # Organism: Borrelia burgdorferi # 8 497 5 488 488 473 49.0 1e-133 MAKELKDLTKRSENYSQWYNDLVVKADLAEQSAVRGCMVIKPYGYAIWEKMQRQLDDMFK ETGHVNAYFPLLIPKSFLSREAEHVEGFAKECAVVTHYRLKNAEDGSGVVVDPAAKLEEE LIIRPTSETIIWNTYKNWIQSYRDLPILCNQWANVFRWEMRTRLFLRTAEFLWQEGHTAH ATRKEAEEEAIRMLNVYGEFAEKYMAVPVVKGVKSANERFAGALDTYTIEAMMQDGKALQ SGTSHFLGQNFAKAFDVQFVNKENKMEYVWATSWGVSTRLMGALIMTHSDDNGLVLPPHL APIQVVIVPIYKNDEQLKQIDAKVEGIVAKLKALGISVKYDNADNKRPGFKFADYELKGV 
PVRLVMGGRDLENNTMEVMRRDTLEKETVTCDGIETYVQKLLEEIQANIYKKALDYRNSK ITIVDTYEEFKEKIEEGGFILAHWDGTTETEEKIKEDTKATIRCIPFESYIAGDKEPGKC MVTGKPSACRVVFARSY >gi|226332280|gb|ACIC01000040.1| GENE 7 11708 - 12385 175 225 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149011191|ref|ZP_01832496.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP19-BS75] # 1 183 1 190 226 72 29 4e-12 MKILIVEDEPSLRELIQRSLEKERYVVETAGDFNSALYKIEDYDYDCVLLDIMLPDGSGL DLLERLKALHKRENVIIISAKDSLEDKVLGLELGADDYLPKPFHLAELNARIKSVIRRNQ HDGEIDIRQGNVRIEPDKYRVFVNEQELELNRKEYDILLYFINRPGRLVNKNTLAESVWG DHIDQVDNFDFIYAQIKNLRKKLKDAEANIEIKAVYGFGYKLIVE >gi|226332280|gb|ACIC01000040.1| GENE 8 12408 - 13691 1060 427 aa, chain + ## HITS:1 COG:mll7952 KEGG:ns NR:ns ## COG: mll7952 COG0642 # Protein_GI_number: 13476585 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 149 400 164 423 452 91 25.0 3e-18 MKLIYYIILRITLALTLILTVWAVFFYVTMIDEVNDEVDDSLEDYSETIIIRALAGEELP SKNNGSNNQYYLKEITKEYAKSQEDIQYKDSMVYIVEKEETEPARILTTIFKDDEGRYYE LTVSTPSIEKDDLKSAIQVWIIFLYVALLLCIIIISVWVFYRNMRPLYVLLHWLDSYQTG KKNKPLKNDTRITEFRKLNDAAARYVERTEQMFEQQKQFIGNASHEIQTPLAICRNRLEM LMEDDSLSEKQLEELMKTHQTLEYITKLNKSLLLLSKIDNGQFTDTKEVDLNVLLKQYLE DYKEVYDYKNIEVSITEQANFHITMNESLAIALLTNLLKNAFVHNVDGGHIRIIITKHSI TFRNSGEKQPLDETHIFERFYQGHKKEGSTGLGLAITDSICRLQQLNLRYYFEQGEHCFE ISSKNQR >gi|226332280|gb|ACIC01000040.1| GENE 9 13809 - 14765 787 318 aa, chain - ## HITS:1 COG:BH3213 KEGG:ns NR:ns ## COG: BH3213 COG1277 # Protein_GI_number: 15615775 # Func_class: R General function prediction only # Function: ABC-type transport system involved in multi-copper enzyme maturation, permease component # Organism: Bacillus halodurans # 8 317 31 344 345 318 57.0 1e-86 MSKVNHPFWVIVHKEISDHVKSWRFLILIGIIALTCMGSLYTALTNIGAAIKPDDPDSSF LFLKLFTASDGTLPSFVLFINFLGPLLGIALGFDAVNSEQNKGTLSRMLSQPIHRDCIIN AKFVAALIVIGIMLFVLGFLVMGFGLIAIGIPPTAEEFWRIVFFLITSIFYVAFWLNLAI LFSLRFRQAATSALASVAVWLFFSVFYTMIVNLVAKGLSPSQMASPYQIISYQKFILGLM 
RLAPSELFNEATTTLLMPSVRSIGPLTMEQVQGAIPSPLPLGQSLLVVWPQLTGLIAATV ICFAISYIMFMRREIRSR >gi|226332280|gb|ACIC01000040.1| GENE 10 14758 - 15501 229 247 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 5 216 1 218 245 92 27 2e-18 MGEQVIVLTELTKQYGKFTAVDHIRLSIRKGEIFGLLGPNGAGKSTTILMMLGLTEPTSG IVEICGINSTTHPIEVKRKIGYLPEDVGFYDDMTGPENLMYTARLNGIPDKEAKVKALEL MKRVGLEDQLKKKTGKYSRGMRQRLGLADVLIKNPEIIILDEPTSGIDPAGVQEFIELIR WLSKEEGLTVLFSSHHLDQAQKVCDRVGLFSNGKILALIDMAELKEKKQELSDIYNHYFE EGGERHE >gi|226332280|gb|ACIC01000040.1| GENE 11 15505 - 16656 960 383 aa, chain - ## HITS:1 COG:BH3215 KEGG:ns NR:ns ## COG: BH3215 COG1470 # Protein_GI_number: 15615777 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus halodurans # 8 383 9 385 385 284 40.0 2e-76 MTMRTNCFLLLTVLLGILPMSNTHANDSIPKSVILYTPYTKISVSPGASIDYSIDLINNT DKLVNANLSVSGLSSSWKHEMKSGGWNLSQLAVLPKEKKTFNLKVDVPLKVSRGSYHFVV SAGEAQLPLNVVVAQQGTYQTEFTTDQPNMQGNSKSTFTFNATLKNQTADQQLYALMAKA PRGWNVIFKPNYKQATSAQVEANSSQNVSIDITPPANVEAGSYKIPVRAATGNTSAELEL EVVVTGSYQMELTTPRGLLSSDVTAGDTKRIELEIKNTGSSLLKDIQLSANKPVDWDVAF EPSKIEMLKAGETATAVAILKASKKALPGDYVTTMMAKTPEVNADAQFRIAVKTPMIWGW VGVLIILATIGVVYYLFRKYGRR >gi|226332280|gb|ACIC01000040.1| GENE 12 16653 - 16832 74 59 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRGESEKIWEFWAKSEKNKGNLSYCILLFLLTATNKFIAYFCRHKFDNIHIKSTIRTFV >gi|226332280|gb|ACIC01000040.1| GENE 13 16887 - 17324 577 145 aa, chain + ## HITS:1 COG:no KEGG:BT_0923 NR:ns ## KEGG: BT_0923 # Name: not_defined # Def: putative periplasmic protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 145 1 145 145 264 100.0 8e-70 MKKLVFLLVCLFTLQTVARADDDKPIQVTQMPQLAQQFIKQHFSDSKVALAKMESDFLYK SYEVIFTNGNKVEFDKKGNWEEVDCKHTSVPVAIIPAAIQKYVTTNYPDAKVLKIERDKK DYEVKLSNRTELKFDLKFNLIDIDN >gi|226332280|gb|ACIC01000040.1| GENE 14 17348 - 18229 798 293 aa, chain + ## HITS:1 COG:no KEGG:BT_0922 NR:ns ## KEGG: BT_0922 # Name: not_defined # Def: hypothetical protein # 
Organism: B.thetaiotaomicron # Pathway: not_defined # 1 293 1 293 293 537 96.0 1e-151 MKLRIYTLLIAFCAAWSLHSCDNDDDESIAVPTALQEAFSSKYPNVKNVKWETKAGYYVA DFYDGYEASAWFTTDGNWHMTETDIPYTALPDPVKSAFEVGDYKTWKRDDVDKLERQGIE TVYVIEVENQNQEVDLYYSADGVLIKSIADTDDHENDHLPTTQLPAAMKTFIDGKYAGAR IVEVDVEDDKNDWDFGFTEVDIIHFDSGLNRNVSKEVLFDKGGEWYSTSWEVRRNELPAA VTNAISTEYAGYQMDDAEYFEMATETSYYQIELEGNNSPDIDIKVTADGTILK >gi|226332280|gb|ACIC01000040.1| GENE 15 18278 - 22732 3459 1484 aa, chain - ## HITS:1 COG:no KEGG:BT_0921 NR:ns ## KEGG: BT_0921 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1484 1 1484 1484 2880 99.0 0 MSMFIAKELSEVLNTRLTIGKINIGLLNRIIIDDVLLDDQSGKEMLKVTRLSAKFDILPF FKGKISISSVQLFGFTINLNKETPDSPPNFKFVLDAFASKDTIKKESSLDLRINSVLIRR GRMAYHVLSEEKTPGKFNAKHVQLQNIIANISLKAMNRDSLNLGIKRLSFDEKASGFSLK KMSLKLVANDKRTNIENFAIELPETSLKMDTIHLVYDSLKAFDQFSEKVHFSFRTLPSQI TLKDISPFVPVLAHFKEPITLDMQVKGTVDQLTCSHLEITADDRQFRLSGDVSLQELSSP QDAYVYGKLSELSANSRGVGFLVRNLSSNYNGVPPLLERLGDIDFRGEISGYFTDLVTYG QLRTGLGNVKTDLKLSSDKAKGLFAYSGAVKTEEFELGKLLNNEKLGEITFNLDVHGRHI EKQLPAVELKGLIASVEYSRYKYENITLDGEYKQGGFNGKIALDDPNGSIYLNGDVNVAS KVPTFNFLAVVDKFRPNDLNLTPDYQDAEFSLKVKANFTGGSIDEMIGEINVDSLEFRAP DKEYFMKNMNVRATRQDNENQLKLTSEFLTASIAGKFQYHTLPASIFNIMRRYVPSLILP PKKPIETNNNFAFDVHIYNTDILSTIFDVPLTVYTHSTLKGYFNDALQRLRVEGYFPRLQ YKNNFIESGMILCENPSDHISAKVRLTSLKKNGAVNLSLEAQAKEDKVSTTLNWGNNAIA TYSGKLAAVAQFLRTAGEKPLLKAMVDVKQTDVILNDTLWQIHPSQVVVDSGKVDVNNFY FSHHDRYVRINGRLSDNPADTVKVDLKDINMGYVFDIANISDDVNFEGDATGTAYASGVF KKPVMNTRLFVKNFSLNQGRLGDLNVYGAWDNEKRGIYLDASIQDISASPSRVTGMIYPL KPESGLDLNIDAHELNLKFLEFYMKSIAQDIKGRGTGKVHFYGKFKGLNLDGAVMTDASM KFDILNTHFAVRDTIHLAPTGLKFNNMHIADMEGHSGTLNGYLRFQHFKNINYRFEIQAN NMLLMNTKESTDMPFFGTVYATGNALLAGNAIQGLDVNLAMTTNRNTVFTYINGNVASAA SNQFIKFVDKTPRRSIQDSIRVNSYFEQMQQKRQADEEEHQTDIRLNILVDATPDATMKI IMDPIAGDYISAKGTGNIRTEFYNKGDVKMFGNYRISQGIYKFSLQEVIRKDFIIKDGST ITFNGAPLDANMDIQASYTVNSASLNDLNPDVSSIAQQTNVRVNCIMNLSGILLRPTIKL GIELPNERDEIQTLVRNYISTEEQMNMQILYLLGIGKFYTEDAARNNQNSNVMSSVLSST 
LSGQLNNALSQVFETNNWNIGTNLSTGDKGWTDMEVEGILSGQLLNNRLLVNGNFGYRDN PMANTNFIGDFEAEWLINRSGDIRLKAYNETNDRYYTKTNLTTQGVGIMYKKDFNKWSDL FFWNKWKLRSKRKREEAEKVQQQQTDSIADDKAKSELKRKRAQE >gi|226332280|gb|ACIC01000040.1| GENE 16 22933 - 23952 644 339 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227425790|ref|ZP_03908856.1| SSU ribosomal protein S18P alanine acetyltransferase [Atopobium parvulum DSM 20469] # 4 333 479 811 832 252 42 2e-66 MSAIILGIESSCDDTSAAVIKDGYLLSNVVSSQAVHEAYGGVVPELASRAHQQNIVPVVH EALKRAGVTKEELSAVAFTRGPGLMGSLLVGVSFAKGFARSLGIPLIDVNHLTGHVLAHF IKAEGEENIQPKFPFLCLLVSGGNSQIILVKAYNEMEILGQTIDDAAGEAIDKCSKVMGL GYPGGPIIDKLARQGNPKAFTFSKPHIPGLDYSFSGLKTSFLYSLRDWMKDDPDFIEHHK VDLAASLEATVVDILMDKLRKAAKEYKIKEVAVAGGVSANNGLRNSFREHAEKYGWNIFI PKFSYTTDNAAMIAITGYFKYLDKDFCSIDLPAYSRVTL >gi|226332280|gb|ACIC01000040.1| GENE 17 24376 - 25608 1002 410 aa, chain + ## HITS:1 COG:alr4808_1 KEGG:ns NR:ns ## COG: alr4808_1 COG1058 # Protein_GI_number: 17232300 # Func_class: R General function prediction only # Function: Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA # Organism: Nostoc sp. 
PCC 7120 # 1 251 1 252 252 134 30.0 3e-31 MFAEIITIGDELLIGQVVDTNSAWMGQELNKIGIEVLRIVSIRDREEEIMEAVDNAMKRV NIVLVTGGLGPTKDDITKQTLCKYFHTELIFSEEVFENVKRVLAGKIPMNALNKSQAMVP KDCTVINNPVGSASVSWFEKDNKVLVSMPGVPQEMTAVMAESVLPKLREKFQTDVIMHRT FLVQHYPESILAEKLEPWETALPESIKLAYLPKLGIIRLRLTGRGQNKIAVESALNGEQA KLEAILGDDIFSEEDIPLEVIVGELLKKKNLTVSTAESCTGGSIAARLTSIAGSSEYFNG GIVAYSNEVKMNLLHVSPETLEAYGAVSEQTVIEMVKGAMKALKTDCAVATSGIAGPGGG TPEKPVGTVWIAAGYKNEIRTYRQKTNRGRAMNIERAGNNALLILRDLLK >gi|226332280|gb|ACIC01000040.1| GENE 18 25731 - 25991 456 86 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29346326|ref|NP_809829.1| 50S ribosomal protein L28 [Bacteroides thetaiotaomicron VPI-5482] # 1 86 1 86 86 180 100 1e-44 MSKICQITGKKAMIGNNVSHSKRRTKRTFDLNLFNKKFYYVEQDCWISLSLCANGLRIIN KKGLDAALNDAVAKGFCDWKSIKVIG >gi|226332280|gb|ACIC01000040.1| GENE 19 26013 - 26201 319 62 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29346325|ref|NP_809828.1| 50S ribosomal protein L33 [Bacteroides thetaiotaomicron VPI-5482] # 1 62 1 62 62 127 100 8e-29 MAKKAKGNRVQVILECTEHKESGMPGTSRYITTKNRKNTTERLELKKYNPILKRVTVHKE IK >gi|226332280|gb|ACIC01000040.1| GENE 20 26216 - 26374 219 52 aa, chain + ## HITS:1 COG:no KEGG:PRU_0750 NR:ns ## KEGG: PRU_0750 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 51 1 51 52 76 88.0 3e-13 MAKKTVASLHEGSKEGRAYTKVIKMVKSPKTGAYVFDEQMVANEKVQDFFKK >gi|226332280|gb|ACIC01000040.1| GENE 21 26516 - 27475 736 319 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 [Bacillus selenitireducens MLS10] # 5 317 2 322 336 288 45 3e-77 MGFFSFFSKEKKETLDKGLSKTKESVFSKIARAVAGKSKVDDEVLDNLEEVLITSDVGVE TTLNIIKRIEKRAAADKYVNTQELNHILRDEIAALLTENNSDDVADFDVPIEKKPYVMMV VGVNGVGKTTTIGKLAYQFKKAGKSVYLGAADTFRAAAVEQLMIWGERVGVPVVKQKMGA DPASVAFDTLSSAVANNADVVIIDTAGRLHNKVGLMNELTKIKNVMKKVVPNAPDEVLLV LDGSTGQNAFEQAKQFTLATEVTAMAITKLDGTAKGGVVIGISDQFKIPVKYIGLGEGME DLQVFRKNEFVDSLFGENA >gi|226332280|gb|ACIC01000040.1| GENE 22 27472 - 27804 304 111 aa, chain + ## 
PROTEIN SUPPORTED ## NR: gi|229870452|ref|ZP_04490046.1| SSU ribosomal protein S12P methylthiotransferase [Spirosoma linguale DSM 74] # 1 106 6 111 437 121 53 4e-27 MKRKRIDIITLGCSKNLVDSEQLMRQLEEAGYSVTHDTENPEGEIAVINTCGFIGDAKEE SINMILEFAERKEEGDLKKLFVMGCLSERYLQELAIEIPQVDKFYGKFNWK Prediction of potential genes in microbial genomes Time: Thu May 12 00:27:10 2011 Seq name: gi|226332279|gb|ACIC01000041.1| Bacteroides sp. 1_1_6 cont1.41, whole genome shotgun sequence Length of sequence - 15023 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 4, operones - 4 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 953 472 ## PROTEIN SUPPORTED gi|229861541|ref|ZP_04481155.1| SSU ribosomal protein S12P methylthiotransferase 2 1 Op 2 . + CDS 946 - 1218 261 ## COG0776 Bacterial nucleoid DNA-binding protein 3 1 Op 3 . + CDS 1234 - 2697 1154 ## BT_0911 putative integration host factor IHF alpha subunit + Term 2714 - 2762 13.1 + Prom 2801 - 2860 5.8 4 2 Op 1 23/0.000 + CDS 2937 - 3932 1263 ## COG0714 MoxR-like ATPases 5 2 Op 2 . + CDS 4020 - 4889 741 ## COG1721 Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) 6 2 Op 3 . + CDS 4901 - 5977 856 ## BT_0908 hypothetical protein 7 2 Op 4 5/0.000 + CDS 6022 - 7005 969 ## COG2304 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain 8 2 Op 5 . + CDS 7050 - 8078 988 ## COG2304 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain 9 2 Op 6 . + CDS 8082 - 8810 714 ## BT_0905 hypothetical protein 10 2 Op 7 . + CDS 8845 - 10671 1346 ## BT_0904 hypothetical protein 11 2 Op 8 . + CDS 10690 - 11523 595 ## BT_0903 hypothetical protein + Term 11551 - 11602 10.3 + Prom 11549 - 11608 8.3 12 3 Op 1 . + CDS 11658 - 11948 340 ## BT_0902 hypothetical protein + Term 11961 - 12002 6.6 13 3 Op 2 . 
+ CDS 12022 - 13143 573 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins + Term 13314 - 13355 6.5 - Term 13296 - 13350 11.5 14 4 Op 1 . - CDS 13369 - 14586 1214 ## BT_0900 TPR repeat-containing protein 15 4 Op 2 . - CDS 14619 - 15023 335 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit Predicted protein(s) >gi|226332279|gb|ACIC01000041.1| GENE 1 3 - 953 472 316 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229861541|ref|ZP_04481155.1| SSU ribosomal protein S12P methylthiotransferase [Stackebrandtia nassauensis DSM 44728] # 5 310 154 458 464 186 33 8e-47 YHEELYIERTLTTPKHYAYLKISEGCDRKCSYCAIPIITGRHISKSMEEILDEVRYLVSQ GVKEFQVIAQELTYYGVDLYKKQMLPELIERISEIPGVEWIRLHYAYPAHFPTDLFRVMR ERDNVCKYMDIALQHISDNMLKLMRRQVSKEDTYKLIEQFRKEVPGIHLRTTLMVGHPGE TEEDFEELKEFVRKARFDRMGAFAYSEEEGTYAAQQYEDSIPQEVKQARLDELMDIQQGI SAELSAAKIGQQMKVIIDRIEGDYYIGRTEFDSPEVDPEVLISVSREELEVGQFYQVEVT DADDFDLYAKILNKYE >gi|226332279|gb|ACIC01000041.1| GENE 2 946 - 1218 261 90 aa, chain + ## HITS:1 COG:NMA1868 KEGG:ns NR:ns ## COG: NMA1868 COG0776 # Protein_GI_number: 15794756 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Neisseria meningitidis Z2491 # 1 87 1 87 91 57 32.0 9e-09 MNNKEFTSELADRLGYTIKDTSELISSLLSDMTQELEDGNAIAIQGFGSFEVKKKAERIS INPSTKQRMLVPPKLVLTYRPSNTLKDKFK >gi|226332279|gb|ACIC01000041.1| GENE 3 1234 - 2697 1154 487 aa, chain + ## HITS:1 COG:no KEGG:BT_0911 NR:ns ## KEGG: BT_0911 # Name: not_defined # Def: putative integration host factor IHF alpha subunit # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 487 1 487 487 672 100.0 0 MNERLTIQDLIDLLAAKHSMTKKDAEAFVKEFFLLIEQALENEKTVKIKGLGTFKLVDVD SRESVNVNTGERFQIKGHTKVSFTPDTNLRDTINKPFAHFETVVLNEGTVLEDTPMEESD EEEGAVSDTETEMIDSEIAGNENAGGDIQEEIQLEEPVIEEQPTAEAVAVAEVPETDANE TGQPETEPELIATEEPEAEELPEVKIEPKPESVSEEAPGEEPVEESSETVLEPDFSPVTE EPAKEQSEETIIEEQKPETETTEEEEEKTVITEKAEVTAEQIIAQELHKANMEPVTLQEQ 
PEIEQPKATYTDKDSDKKEKSAIPYLIATIIIVLLLCGGAILFIYYPDLFSSSSDKNVVD IPEVTQPVQPEAQLSDTIAHKDTVVEAVQPQPVVKKEPTAEPVKAESKPAQQQPTASAYS DSASYKITGTKTKYTIKEGETLTKVSLRFYGTKAMWPYIVKHNPKVIKNPDNVPYGTTIE IPELTKE >gi|226332279|gb|ACIC01000041.1| GENE 4 2937 - 3932 1263 331 aa, chain + ## HITS:1 COG:Rv1479 KEGG:ns NR:ns ## COG: Rv1479 COG0714 # Protein_GI_number: 15608617 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Mycobacterium tuberculosis H37Rv # 30 331 52 352 377 314 52.0 2e-85 MAESIDIRELNERIERQSAFVTNLTTGMDQIIVGQKHLVESLLIGLLSDGHVLLEGVPGL AKTLAIKTLASLIDAKYSRIQFTPDLLPADVIGTMVYSQKDESFKVQRGPIFANFVLADE INRAPAKVQSALLEAMQERQVTIGKETFMLPEPFLVLATQNPIEQEGTYPLPEAQVDRFM LKVVIDYPKLEEEKLIIRQNINGEKFNVKPILKAEEIIEARKVVRQVYLDEKIERYIVDI VFATRFPEKYDLKELKDMIGFGGSPRASINLALAARTYAFIKRRGYVIPEDVRAVAHDVL RHRIGLTYEAEANNMTSDEIISKILNKVEVP >gi|226332279|gb|ACIC01000041.1| GENE 5 4020 - 4889 741 289 aa, chain + ## HITS:1 COG:BB0175 KEGG:ns NR:ns ## COG: BB0175 COG1721 # Protein_GI_number: 15594520 # Func_class: R General function prediction only # Function: Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) # Organism: Borrelia burgdorferi # 3 276 8 278 291 140 31.0 3e-33 METSEILKKVRQIEIKTRGLSNNIFAGQYHSAFKGRGMSFSEVREYQFGDDIRDIDWNVT ARFNKPYVKVFEEERELTVMLMVDVSGSLEFGTVKQLKKDMVTEIAATLAFSAIQNNDKI GVIFFSDRIEKFIPPKKGRKHILYIIRELIDFQPESRRTNIRLALEYLTNVMKRRCTAFI LSDFIDQDNFKNALTIANRKHDVVALQVYDRRVSDLPPVGLMRIKDAETGHEQWIDTSSK AVRRAHRDWWINKQTELNDTFTKSNVDSVSVRTDQDYVKALLNLFAKRN >gi|226332279|gb|ACIC01000041.1| GENE 6 4901 - 5977 856 358 aa, chain + ## HITS:1 COG:no KEGG:BT_0908 NR:ns ## KEGG: BT_0908 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 358 5 362 362 636 100.0 0 MNRNIILIALLCLLSIGRTVAQSVTVEAKIDSLQILIGEQAKVQLQVAMDAKQRAVFPAY TDTLVRGVEIIETAKPDTQFLNDRQRMLITQEYIVTSFDSALYYLPPMPVTVDGKEYRSK ALALKVYSMPVDTLHPDQFFGQKTVMKAPFAWEDWYGLIGCSFLALPLLGLLIYLIIRIR 
DNKPIIRKIKVEPKLPPHQAAMKEIERIKNEKVWQKGQPKAYYTELTDTLRTYIKDRFGF NALEMTSSEIIDKLLEMNDKEAISDLKELFQTADLVKFAKHDPQMNENDANLINAIDFIN ETKQPEEENQKPQPTEITIIEKRSLRTKILLICGIVFLSAALIGTFVYIGMQLYDLFA >gi|226332279|gb|ACIC01000041.1| GENE 7 6022 - 7005 969 327 aa, chain + ## HITS:1 COG:VCA0172 KEGG:ns NR:ns ## COG: VCA0172 COG2304 # Protein_GI_number: 15600942 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain # Organism: Vibrio cholerae # 3 318 4 313 318 157 32.0 2e-38 MVFANIEYLFLLLLLIPYIVWYILKRKKTEATLQISDARVYAHTPKSYKNYLLHAPFILR VIALALIIVVLARPQSTNKWQNSEIEGIDIMLAIDVSTSMLAEDLKPNRLEAAKDVAAEF INGRPNDNIGITLFAGETFTQCPLTVDHAVLLDMIHNIKCGLIEDGTAVGMGVANAVTRL KDSKAKSKVIILLTDGTNNKGDISPLTAAEIAKSFGIRVYTIGVGTNGMAPYPYPVGNTV QYINMPVEIDEKTLTQIAGTTDGNYFRATSNSKLKEVYEEIDKLEKTKLNVKEYSKREED YRWFALAAFLCVLLEVLLRNSILKKIP >gi|226332279|gb|ACIC01000041.1| GENE 8 7050 - 8078 988 342 aa, chain + ## HITS:1 COG:VCA0172 KEGG:ns NR:ns ## COG: VCA0172 COG2304 # Protein_GI_number: 15600942 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain # Organism: Vibrio cholerae # 9 328 8 318 318 94 26.0 2e-19 MFRFEEPAYLYLLLLLPFLAAFYLYSNYRRRRNIRRFGDPELLAQLMPDVSKYRPDVKFW LIFAAIGLFSVLLARPQFGSKQETVKRKGVEVIIALDISNSMLAQDVQPSRLEKAKRLIS RLVDELDNDKVGMIVFAGDAFTQLPITSDYISAKMFLESISPSLISKQGTAIGEAINLAT RSFTPQEGVGRAIIVITDGENHEGGAVEAAKAAAEKGIQVSVLGVGMPEGAPIPVEGTND YRRDREGNVIVTRLNEGMCQEIAKDGKGIYVRVDNSNSAQKAISQEISKMAKSDVESKIY TDFNEQFQAIAWIILLLLLAEMLILDRKNPLFKNVHLFSNKK >gi|226332279|gb|ACIC01000041.1| GENE 9 8082 - 8810 714 242 aa, chain + ## HITS:1 COG:no KEGG:BT_0905 NR:ns ## KEGG: BT_0905 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 242 1 242 242 285 99.0 1e-75 MRMLKMKYILFAAFLLSAAGVSAQKAERDYIRKGNRLFNDSVFVDAEVNYRKALEVNPKS AVSMYNLGNTLSQQQKFQEAMEQYDSASKIEKDKMKLAHIYHNMGVLFQAGKDYAKAVDA 
YKMSLRNNPADHETRYNLALAQKMLKDQQNQQDQDQNQDQNKDQQQKQDQKQDQNKDKQN DQKKDDQKDQQQPPKPEKQDNQMSKENAEQLLNSVMQDEKDVQDKVKKQQKVIQGGRLEK DW >gi|226332279|gb|ACIC01000041.1| GENE 10 8845 - 10671 1346 608 aa, chain + ## HITS:1 COG:no KEGG:BT_0904 NR:ns ## KEGG: BT_0904 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 608 1 608 608 1144 100.0 0 MKKLIVILMALIAYSTQALADKVSFTASAPDVVVVGDQFRLSYTVTTQKVKDFQAPAIKG FDVLMGPSRSQQSSTQIMNGNVTSTSSITFTYILMANTAGEYTIPGASIVADGNQMVSNS VKVKVLPQDQASNGGQNDGSSSARSSSGTSVSNQDLFITATASKTNVYEQEAFVLTYKIY TREHDLQLNNAKLPDFKGFHSQEIEMNANARWTQEHYKGRNYNTTIYRQFVLFPQQPGKL FIEPAQFQMTVGKAVQSDDPFDAFFNGGSNVVKVPKSIVTPKIAINVNPLPAGKPANFSG GVGEFNISSSINSKELKTNDAITIKLVISGTGNLKLISNPEIKFPDDFEVYDPKVDNQVR LTQEGLTGNKVIEYLAIPRHAGNYKIPGASFSYFDIRSKSYKTLKTEDYVINVEKGAGNA DQVIANFTNKEDLKVLGEDIRYIKQNEVTLQPKGSFFYGSMTYWLFYIIPALAFIIFFII YRKQAATNANVAKMKTKKANKVATKRMKLAGKLLSENKKDAFYDEVLKALWGYISDKLSI PVSRLSKDNIEEKLRNHGVSEELIKEFLNALNDCEFARFAPGDENQAMDKVYSSSIEVIS RMENSIKH >gi|226332279|gb|ACIC01000041.1| GENE 11 10690 - 11523 595 277 aa, chain + ## HITS:1 COG:no KEGG:BT_0903 NR:ns ## KEGG: BT_0903 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 277 1 277 277 498 100.0 1e-140 MKKILFFILLSMSVTCFGQGTQSIDSIQITEADSIHAGSHTFSDTKLEDVTKAEGDSAYI KDDYATAIQIYESLLKNGESADVYYNLGNSYYKAGEIAKAVLNYERALLMKPGNSDIRAN LEVARAKTIDKVEPVPEVFFVSWTKALINSMSVDAWATWGIVSFILFIIAFYFFIFSKQI ILKKTGFISGIVFLIVVICSNLFASEQKERLVNRSEAIVMNPSVTVRSTPSESGTSLFIL HEGRKVSIKDNSMKEWKEIRLEDGKVGWVPASAIEVI >gi|226332279|gb|ACIC01000041.1| GENE 12 11658 - 11948 340 96 aa, chain + ## HITS:1 COG:no KEGG:BT_0902 NR:ns ## KEGG: BT_0902 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 96 1 96 96 183 100.0 1e-45 MRTITFNELRKIKDSLPSGSMHRIADELGLHVDTVRNFFGGHNFKEGKSVGIHLEPGPDG GLVMLDDTTVLDRALKILDEINMSMQKEQATEPMQI >gi|226332279|gb|ACIC01000041.1| GENE 13 12022 - 13143 573 373 aa, chain + ## HITS:1 
COG:MA2866 KEGG:ns NR:ns ## COG: MA2866 COG0589 # Protein_GI_number: 20091690 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and related nucleotide-binding proteins # Organism: Methanosarcina acetivorans str.C2A # 86 243 8 151 152 62 29.0 2e-09 MEDKLVTLAILTYTKAQILKNVLENEGIETYIHNVNQIQPVVSSGVRLRIKESDLPRALK ITESSTWLSESIVGEKTPKVDSKSNKILIPVDFSNYSMKACEFGFNLAKTENSEVILLHV YFTPIYASSLPYGDVFNYQIGDEESVKTIIQQVHSDLNALSDKIKEKVASGEFPDIKYSC ILREGIPEEEILRYAKEQRPKVIIMGTRGKSQKDIDLIGSVTAEIIDRSRTAVLAIPENT PFKEFNEVKRIAFLTNFDQRDLIAFEAFFNTWKSFHFSVSLIHLAESKDTWNEIKLAGIK DYFQKQYPGLEIHYDVVMNDNLLKGLDKYIKDNQIDIITLTSYKRNIFARLFNPSIARKM IFHSDTPLLVING >gi|226332279|gb|ACIC01000041.1| GENE 14 13369 - 14586 1214 405 aa, chain - ## HITS:1 COG:no KEGG:BT_0900 NR:ns ## KEGG: BT_0900 # Name: not_defined # Def: TPR repeat-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 405 1 398 398 705 100.0 0 MKRVLFSMVLLMAVSLSFAQMKNVKEAKSIASDVKPNFNQAEKLINEAIKNPETKDLPDT WDVAGFIQKRFIEEERDKKGVLKQPFDTLKAYNSILKMFEYYTKCDDLAQVPNEKGKIKN KYRKANASTILVERPNLINGGIQFFNLDKNKEALQFFATYVESASYPMLAEQNLAKTDTL LAQIAYYATLAADRVGDKDAIIKYAPAALDDKDGGKFAMQLMADAYKAKGDTAAWVKSLE EGILKFPGNDYFFANLVDYYTSSNQASKAMEFADRMLSTDANNKLYLYVKAYLYHNMKEY DNAIEFYKKAIAADPEYAEAYSNVGLVYLMKAQDYADKATTDINDPKYAEAQATVKKFYE EAKPFYEKARALKPDQKDLWLQGLYRVYYNLNMGPEFEEIDKMMK >gi|226332279|gb|ACIC01000041.1| GENE 15 14619 - 15023 335 134 aa, chain - ## HITS:1 COG:SA0006 KEGG:ns NR:ns ## COG: SA0006 COG0188 # Protein_GI_number: 15925711 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Staphylococcus aureus N315 # 1 114 724 835 889 91 49.0 3e-19 GKRSEIEDYRKTNRGGKGVKTMNITEKTGKLVTIKSVTDENDLMIINKSGITIRLKVADV RIMGRATQGVRLINLEKRNDQIGSVCKVMTESLEDEVPAEEAEGTIVSDLTTDQDVDNAD TATDVNENNNEIEE Prediction of potential genes in microbial genomes Time: Thu May 12 00:27:47 2011 Seq name: gi|226332278|gb|ACIC01000042.1| Bacteroides 
sp. 1_1_6 cont1.42, whole genome shotgun sequence Length of sequence - 9936 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 2193 1852 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit + Prom 2466 - 2525 4.3 2 2 Op 1 1/0.000 + CDS 2601 - 5126 1955 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 + Term 5167 - 5206 6.8 + Prom 5177 - 5236 6.5 3 2 Op 2 . + CDS 5266 - 7269 1692 ## COG0326 Molecular chaperone, HSP90 family + Term 7293 - 7331 -0.4 + Prom 7357 - 7416 5.5 4 3 Tu 1 . + CDS 7482 - 9782 1189 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily + TRNA 9862 - 9935 54.1 # Arg ACG 0 0 Predicted protein(s) >gi|226332278|gb|ACIC01000042.1| GENE 1 3 - 2193 1852 730 aa, chain - ## HITS:1 COG:BH0007 KEGG:ns NR:ns ## COG: BH0007 COG0188 # Protein_GI_number: 15612570 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Bacillus halodurans # 1 730 1 723 833 788 55.0 0 MLEQDRIIKINIEEEMKSSYIDYSMSVIVSRALPDVRDGFKPVHRRILYGMMELGNTSDK PYKKSARIVGEVLGKYHPHGDSSVYYAMVRMAQEWAMRYPLVDGQGNFGSVDGDSPAAMR YTEARLNKLGEAMMDDLYKETVDFEPNFDNTLTEPKVMPTRIPNLLVNGASGIAVGMATN MPPHNLSEVIDACEAYIDNQEITVEELMTYVKAPDFPTGGYIYGISGVREGYLTGRGRVV MRAKAEIETGQTHDKIVVTEIPYNVNKAELIKYIADLVNDKRIEGISNANDESDRDGMRI VIDVKRDANASVVLNKLYKMTALQTSFGVNNVALVHGRPKTLNLRDMIKYFVEHRHEVVI RRTQFDLRKAKERAHILEGLIIASDNIDEVIRIIRAAKTPNDAIAGLIERFNLTEIQSRA IVEMRLRQLTGLMQDQLHAEYEEIMKQIAYLESILADDEVCRKVMKDELLEVKAKYGDER RSEIVYSSEEFNPEDFYADDQMIITISHMGYIKRTPLTEFRAQNRGGVGSKGTETRDADF VEHIYPATMHNTMMFFTQKGKCYWLKVYEIPEGTKNSKGRAIQNLLNIDSDDSVTAYLRV KSLDDTEYINSHYVLFCTKKGVIKKTLLEQYSRPRQNGVNAITIREDDSVIEVRMTNGNN EIIIANRNGRAIRFHEAAVRVMGRTATGVRGITLDNDGQDEVIGMICIKDLETESVMVVS EQGYGKRSEI >gi|226332278|gb|ACIC01000042.1| GENE 2 2601 - 5126 1955 841 
aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 833 2 806 815 757 47 0.0 MNNQFSQRVSDIIVYSKEEANRLRSRHIGPEHLLLGMLRDGEGKAIEILSKLNTNLAAVK QQIEARLKTEADDMLLPDAEIPLSNDAAKILKMCILEARGMKSNIADTEHVLLAILRERN NMAASVLEANDVNYVKVLEQATLQPDVNSGMGFPEDDDDEEMSSPRSGGSGSEERQQQAQ TASKKPANDTPVLDNFGTDMTKAAEEGRLDPVVGREKEIERLAQILSRRKKNNPILIGEP GVGKSAIVEGLALRIIQKKVSRILFDKRVVALDMTAVVAGTKYRGQFEERIRSILNELQR NPNVILFIDEIHTIVGAGSAAGSMDAANMLKPALARGEIQCIGATTLDEYRKNIEKDGAL ERRFQKVMVEPTTAAETLQILRNIKDKYEDHHNVNYTDEALEACVKLTDRYITDRNFPDK AIDALDEAGSRVHLTNVNVPKEIEEQEKLIEEAKSKKNEAVKSQNFELAASFRDKEKELT LQLDEMKREWENSLKENRQTVDAEEIANVISMMSGIPVQRMAQAEGIKLAGMKEDLQSKV IAQDPAIEKLVKAILRSRVGLKDPNKPIGTFMFLGPTGVGKTHLAKELAKYMFGSSDALI RIDMSEFMEKFTVSRLVGAPPGYVGYEEGGQLTEKVRRKPYSIVLLDEIEKAHPDVFNLL LQVMDEGRLTDSYGRMVDFKNTVIIMTSNIGTRQLKEFGRGVGFATQSRLDDKEFSRSVI QKALNKSFAPEFINRVDEIITFDQLSLEAITKIIDIELKGLYDRIESIGYKLVIDDEAKR FVAEKGYDVQFGARPLKRAIQTHLEDGLSELIITSSLKEKDVIQVSLNKEKGELEMKVIA S >gi|226332278|gb|ACIC01000042.1| GENE 3 5266 - 7269 1692 667 aa, chain + ## HITS:1 COG:alr2323 KEGG:ns NR:ns ## COG: alr2323 COG0326 # Protein_GI_number: 17229815 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone, HSP90 family # Organism: Nostoc sp. 
PCC 7120 # 1 581 2 601 658 453 41.0 1e-127 MQKGNIGVTTENIFPIIKKFLYSDHEIFLRELVSNAVDATQKLNTLASIGEFKGELGDLT IHVELGKDTITISDRGIGLTAEEIEKYINQIAFSGANDFLEKYKDDANAIIGHFGLGFYS AFMVAKKVEIITKSYRDEAQAIKWTCDGSPEFTIEEVDKADRGSDIILYIDDDCKEFLEE ARVSELLKKYCSFLPVPIAFGKKKEWKDGKQIETTEDNIINDTTPLWTRKPSELSDEDYK SFYTKLYPMSDEPLFWIHLNVDYPFHLTGILYFPKVKSNIELNKNKIQLYCNQVYVTDSV EGIVPDFLTLLHGVIDSPDIPLNVSRSYLQSDSNVKKISTYITKKVSDRLQSIFKNDRKQ FEEKWNDLKIFINYGMLTQEDFYEKAQKFALFTDTDNKHYTFEEYQTLIKDNQTDKDGNL IYLYANNKDEQFSYIEAATNKGYNVLLMDGQLDVAMVSMLEQKFEKSRFTRVDSDVIDNL IIKEDKKNETLEGEKQEAITTAFKSQLPKMDKVEFNVMTQALGDNSAPVMITQSEYMRRM KEMANIQAGMSFYGEMPDMFNLVLNSDHKLIKQVLDEEEAACHPEVAPIQTEMNSVSKRR NELKDSQKDKKEEDIPTAEKDELNELDKKWDELKNKKERHLCRICQQQQGNPSTHRPGFI AKQYAKR >gi|226332278|gb|ACIC01000042.1| GENE 4 7482 - 9782 1189 766 aa, chain + ## HITS:1 COG:PA3339_1 KEGG:ns NR:ns ## COG: PA3339_1 COG1752 # Protein_GI_number: 15598535 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Pseudomonas aeruginosa # 22 299 24 301 308 164 36.0 7e-40 MRKILLVLIAIWLFMPTVCAQKVGLVLSGGGAKGLTHIGIIRALEENNIPIDYITGTSMG AIIGSLYAMGYSPDDMEELLKSEDFKRWYSGQIEEKYVYHFKKNVPTPEFFNIRFSFKDS LKNFKPQFLPTSVVNPIQMNLVFVDLYARATASCKGDFDKLFVPFRCIASDVYNKKQLVM RNGDLGDAVRASMSFPFMFKPIEIDNVLAYDGGIYNNFPTDVMRDDFHPDIIIGSVVSTN PTKPKENDLMSQIENMVMQKTDYSIPDSMGILMTFKYDNVSLMDFQRIDELHDIGYNRTI SMMDSIKSRIQRRVNLDNIRLRRMVYRSNYPELRFKNIIIDGANPQQQAYIKKEFHSSDN KEFTYEDLKEGYFRLLSDNMISEIIPHAVYNPKDETYDLHLKVKLENNFAVRLGGNISTS NSNQIYLGLSYQDLNYYAKEFLFDGQLGKVYNNAQFMAKIDFSTAIPTSYRFIASITTFD YFKKDKLFSRNDKPAFNQKDERFLKLQVGLPFLLSKRAEFGIGIARIEDKYFQRNIIDFE KDRFDRSRYDLFGGSISFNGSTLNSRQYPTQGYKEALVAQIFMGRERFYPGEETKGVQIN KEHHSWLQLSYMKEKYHTMSEHWVLGWYLKALYASKNFSENYTATMMQAGEFSPTLHSKL TYNEAFRANQFVGAGIRPIYRLNQMFHLRGEFYGFMPIYPIERNSLNKAYYGKAFSKFEY LGEISVVCQLPFGDISAYVNHYSSPRREWNVGLSIGFQLFNYRFIE Prediction of potential genes in microbial genomes Time: Thu May 12 00:28:09 2011 Seq name: gi|226332277|gb|ACIC01000043.1| Bacteroides sp. 
1_1_6 cont1.43, whole genome shotgun sequence Length of sequence - 86443 bp Number of predicted genes - 65, with homology - 65 Number of transcription units - 31, operones - 18 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 509 - 541 -0.8 1 1 Op 1 . - CDS 542 - 1435 796 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 2 1 Op 2 . - CDS 1495 - 3495 1526 ## COG0272 NAD-dependent DNA ligase (contains BRCT domain type II) - Prom 3577 - 3636 8.4 + Prom 3428 - 3487 5.8 3 2 Tu 1 . + CDS 3572 - 4249 661 ## COG0336 tRNA-(guanine-N1)-methyltransferase + Term 4260 - 4324 19.5 - Term 4246 - 4312 20.8 4 3 Op 1 13/0.000 - CDS 4468 - 5379 886 ## COG0167 Dihydroorotate dehydrogenase 5 3 Op 2 . - CDS 5367 - 6143 588 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases 6 3 Op 3 . - CDS 6219 - 6689 252 ## BT_0890 hypothetical protein - Prom 6734 - 6793 3.9 - Term 7840 - 7890 3.2 7 4 Op 1 . - CDS 7919 - 8938 724 ## COG1466 DNA polymerase III, delta subunit 8 4 Op 2 . - CDS 8964 - 9740 732 ## COG0775 Nucleoside phosphorylase - Prom 9790 - 9849 5.0 + Prom 9691 - 9750 4.6 9 5 Tu 1 . + CDS 9804 - 10253 344 ## BT_0887 hypothetical protein - Term 10260 - 10319 14.4 10 6 Op 1 27/0.000 - CDS 10377 - 13523 2468 ## COG0841 Cation/multidrug efflux pump - Prom 13544 - 13603 1.8 - Term 13551 - 13589 -0.8 11 6 Op 2 13/0.000 - CDS 13606 - 14715 1017 ## COG0845 Membrane-fusion protein 12 6 Op 3 . - CDS 14749 - 16146 506 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 - Prom 16181 - 16240 8.6 - Term 16220 - 16259 7.5 13 7 Op 1 3/0.000 - CDS 16284 - 17114 946 ## COG1579 Zn-ribbon protein, possibly nucleic acid-binding 14 7 Op 2 . - CDS 17120 - 18214 820 ## COG3323 Uncharacterized protein conserved in bacteria - Prom 18303 - 18362 4.7 + Prom 18240 - 18299 8.0 15 8 Tu 1 . + CDS 18321 - 18737 252 ## BT_0881 hypothetical protein + Term 18917 - 18956 -0.9 - Term 18710 - 18742 0.1 16 9 Op 1 . 
- CDS 18772 - 19050 215 ## BT_0880 hypothetical protein 17 9 Op 2 . - CDS 19047 - 19841 505 ## COG0390 ABC-type uncharacterized transport system, permease component 18 9 Op 3 . - CDS 19876 - 20496 211 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Prom 20576 - 20635 5.8 + Prom 20351 - 20410 2.9 19 10 Op 1 . + CDS 20583 - 21125 501 ## COG4739 Uncharacterized protein containing a ferredoxin domain 20 10 Op 2 5/0.000 + CDS 21196 - 22311 1175 ## COG2957 Peptidylarginine deiminase and related enzymes 21 10 Op 3 . + CDS 22348 - 23232 833 ## COG0388 Predicted amidohydrolase + Term 23273 - 23330 -0.4 - Term 23260 - 23318 3.2 22 11 Op 1 . - CDS 23334 - 23678 248 ## BT_0874 hypothetical protein 23 11 Op 2 . - CDS 23720 - 25594 1306 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily 24 11 Op 3 . - CDS 25661 - 27424 1858 ## COG0173 Aspartyl-tRNA synthetase 25 11 Op 4 . - CDS 27500 - 28513 828 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase - Prom 28533 - 28592 5.5 + Prom 28635 - 28694 7.6 26 12 Tu 1 . + CDS 28746 - 29930 1113 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes + Term 29959 - 30006 4.2 - Term 29937 - 30001 10.3 27 13 Op 1 . - CDS 30020 - 30367 296 ## BT_0869 hypothetical protein 28 13 Op 2 . - CDS 30393 - 30776 324 ## BT_0868 hypothetical protein - Prom 30921 - 30980 9.8 29 14 Op 1 . + CDS 31504 - 34716 2590 ## BT_0867 hypothetical protein 30 14 Op 2 . + CDS 34737 - 36635 1686 ## BT_0866 hypothetical protein 31 14 Op 3 . + CDS 36666 - 37988 981 ## BT_0865 putative chitobiase + Term 38050 - 38091 6.5 32 15 Op 1 . - CDS 38207 - 40564 1150 ## BT_0864 putative permease 33 15 Op 2 . - CDS 40592 - 43000 1566 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 34 15 Op 3 . - CDS 43022 - 45328 993 ## BF2455 putative ABC transport system, membrane protein 35 15 Op 4 . 
- CDS 45347 - 47662 941 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 36 15 Op 5 . - CDS 47678 - 50023 1443 ## BT_0861 putative ABC transporter permease 37 15 Op 6 . - CDS 50041 - 50715 338 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 38 15 Op 7 . - CDS 50743 - 53145 852 ## BVU_3170 hypothetical protein 39 15 Op 8 13/0.000 - CDS 53176 - 54426 1204 ## COG0845 Membrane-fusion protein 40 15 Op 9 . - CDS 54480 - 55952 1445 ## COG1538 Outer membrane protein - Prom 55999 - 56058 5.5 + Prom 55961 - 56020 7.5 41 16 Op 1 8/0.000 + CDS 56167 - 57543 1332 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 42 16 Op 2 . + CDS 57540 - 58817 821 ## COG5000 Signal transduction histidine kinase involved in nitrogen fixation and metabolism regulation 43 17 Tu 1 . - CDS 59447 - 59641 282 ## BT_0854 hypothetical protein + Prom 59606 - 59665 10.2 44 18 Tu 1 . + CDS 59788 - 60525 545 ## PROTEIN SUPPORTED gi|239830964|ref|ZP_04679293.1| Ribosomal protein L11 methyltransferase + TRNA 60653 - 60730 82.4 # Pro TGG 0 0 + TRNA 60762 - 60839 82.4 # Pro TGG 0 0 - Term 60747 - 60815 31.3 45 19 Tu 1 . - CDS 60886 - 61611 671 ## COG4123 Predicted O-methyltransferase - Prom 61641 - 61700 6.3 + Prom 61617 - 61676 11.2 46 20 Op 1 . + CDS 61814 - 64279 2447 ## COG0466 ATP-dependent Lon protease, bacterial type 47 20 Op 2 1/0.000 + CDS 64276 - 65406 907 ## COG0343 Queuine/archaeosine tRNA-ribosyltransferase 48 20 Op 3 . + CDS 65410 - 66507 1022 ## COG0795 Predicted permeases + Term 66544 - 66600 11.3 - Term 66534 - 66585 13.1 49 21 Tu 1 . - CDS 66684 - 67190 351 ## BT_0833 hypothetical protein - Prom 67301 - 67360 6.8 + Prom 67288 - 67347 5.0 50 22 Op 1 . + CDS 67459 - 68688 1406 ## COG0560 Phosphoserine phosphatase 51 22 Op 2 . + CDS 68736 - 70004 1140 ## COG0513 Superfamily II DNA and RNA helicases - Term 69948 - 69991 5.4 52 23 Op 1 . 
- CDS 70017 - 70814 589 ## BT_0830 hypothetical protein 53 23 Op 2 . - CDS 70852 - 72165 1427 ## COG1004 Predicted UDP-glucose 6-dehydrogenase 54 23 Op 3 . - CDS 72185 - 72733 526 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes - Prom 72764 - 72823 3.0 + Prom 72744 - 72803 3.8 55 24 Op 1 . + CDS 72823 - 73917 1023 ## BT_0827 hypothetical protein 56 24 Op 2 . + CDS 73933 - 74466 509 ## BT_0826 hypothetical protein + Term 74529 - 74575 8.1 - Term 74505 - 74572 23.0 57 25 Op 1 . - CDS 74580 - 76019 1532 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases 58 25 Op 2 . - CDS 76046 - 76981 846 ## BT_0824 LacI family transcription regulator - Prom 77125 - 77184 7.8 + Prom 77221 - 77280 4.6 59 26 Tu 1 . + CDS 77314 - 78720 1420 ## COG1904 Glucuronate isomerase + Term 78743 - 78790 12.2 - TRNA 79509 - 79584 84.1 # Lys CTT 0 0 - TRNA 79613 - 79688 84.1 # Lys CTT 0 0 - Term 79645 - 79679 -0.5 60 27 Op 1 . - CDS 79778 - 80581 794 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I - Prom 80603 - 80662 2.1 - Term 80599 - 80641 6.1 61 27 Op 2 . - CDS 80665 - 82053 1267 ## BT_0821 putative permease - Prom 82141 - 82200 4.0 + Prom 82026 - 82085 7.3 62 28 Tu 1 . + CDS 82159 - 82578 364 ## BT_0820 hypothetical protein + Term 82625 - 82667 5.0 - Term 82613 - 82654 4.8 63 29 Tu 1 . - CDS 82687 - 82938 253 ## BT_0819 hypothetical protein - Prom 83141 - 83200 10.3 + Prom 83064 - 83123 5.6 64 30 Tu 1 . + CDS 83241 - 85967 1869 ## COG0642 Signal transduction histidine kinase + Prom 86033 - 86092 6.7 65 31 Tu 1 . 
+ CDS 86141 - 86441 252 ## COG3015 Uncharacterized lipoprotein NlpE involved in copper resistance Predicted protein(s) >gi|226332277|gb|ACIC01000043.1| GENE 1 542 - 1435 796 297 aa, chain - ## HITS:1 COG:BH1742 KEGG:ns NR:ns ## COG: BH1742 COG0329 # Protein_GI_number: 15614305 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Bacillus halodurans # 12 277 9 273 295 255 48.0 6e-68 MIQTKLKGMGVALITPFKEDESVDYDALMRLVDYLLQNNADFLCVLGTTAETPTLTEEEK KTIKKMVIDRVNGRIPILLGVGGNNTRAIVETLKNDDFTGVDAILSVVPYYNKPSQEGIY QHYKAISEATELPIVLYNVPGRTGVNMTAETTLRIARNFSNVIAIKEASGNITQMDDIIK NKPDNFNVISGDDGITFPLITLGAVGVISVIGNAFPREFSRMTRLALQGDFANALTIHHR FTELFDLLFVDGNPAGVKSMLNAMGMIENKLRLPLVPTRITTFEAIRKVLNELNIKC >gi|226332277|gb|ACIC01000043.1| GENE 2 1495 - 3495 1526 666 aa, chain - ## HITS:1 COG:BH0649 KEGG:ns NR:ns ## COG: BH0649 COG0272 # Protein_GI_number: 15613212 # Func_class: L Replication, recombination and repair # Function: NAD-dependent DNA ligase (contains BRCT domain type II) # Organism: Bacillus halodurans # 2 666 5 668 669 557 45.0 1e-158 MDIKEKIEELRAELHRHNYNYYVLNAPEISDKEFDDKMRELQDLELAHPEYKDENSPTMR VGSDINKNFTQVAHKYPMLSLANTYSEGEVTDFYERVRKALNEDFEICCEMKYDGTSISL TYEDGKLVRAVTRGDGEKGDDVTDNVKTIRSIPLVLHGDNYPSSFEIRGEILMPWEVFEE LNREKEAREEPLFANPRNAASGTLKLQNSSIVASRKLDAYLYYLLGDNLPCDGHYENLQE AAKWGFKISDLTRKCQTLEEVFEFINYWDVERKNLPVATDGIVLKVNSLRQQKNLGFTAK SPRWAIAYKFQAERALTRLNMVTYQVGRTGAVTPVANLDAVQLSGTVVKRASLHNADIIE GLDLHIGDMVYVEKGGEIIPKITGVDKDARSFMLGEKVRFITNCPECGSKLIRYEGEAAH YCPNETACPPQIKGKIEHFISRKAMNIDGLGPETVDMFYRLGLIHNTADLYELKADDIKG LDRMGEKSAENIITGIEQSKTVPFERVIFALGIRFVGETVAKKIAKSFGDIDELRQADLE KLISIDEIGEKIARSILLYFSNESNRELVGRLKEAGLQLYRTEEDMSGYTDKLAGQSIVI SGVFTHHSRDEYKDLIEKNGGKNVGSISAKTSFILAGDNMGPAKLEKAKKLGVTILSEDE FLKLIS >gi|226332277|gb|ACIC01000043.1| GENE 3 3572 - 4249 661 225 aa, chain + ## HITS:1 COG:BH2479 KEGG:ns NR:ns ## COG: BH2479 COG0336 # Protein_GI_number: 15615042 # 
Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-(guanine-N1)-methyltransferase # Organism: Bacillus halodurans # 1 224 1 225 246 249 51.0 3e-66 MRIDIITVLPEMIEGFFNCSIMKRAQNKGLAEIHIHNLRDYTEDKYRRVDDYPFGGFAGM VMKIEPIERCINALKAERDYDEVIFTTPDGEQFNQPMANSLSLAQNLIILCGHFKGIDYR IREHLITKEISIGDYVLTGGELAAAVMADAIVRIIPGVISDEQSALSDSFQDNLLAAPVY TRPADYKGWKVPDILLSGHEAKIKEWELQQSLERTKKLRPDLLED >gi|226332277|gb|ACIC01000043.1| GENE 4 4468 - 5379 886 303 aa, chain - ## HITS:1 COG:aq_046 KEGG:ns NR:ns ## COG: aq_046 COG0167 # Protein_GI_number: 15605646 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Aquifex aeolicus # 3 299 2 300 306 303 50.0 3e-82 MADLSVNIGELQMKNPVMTASGTFGYGEEFSDFIDIARIGGIIVKGTTLHKREGNPYPRM AETPSGMLNAVGLQNKGVDYFVEQIYPRIKDIQTNMIVNVSGSAIEDYVKTAEIINELDK IPAIELNISCPNVKQGGMAFGVSAKGASEVVKAVRAAYKKTLIVKLSPNVTDITEIARAA EESGADSVSLINTLLGMAIDAERKRPILSTVTGGMSGAAVKPIALRMVWQVAKAVNIPVI GLGGIMNWKDAVEFMLAGASAIQIGTANFIDPAVTIKVEDGINNYLERHGCKSVKEIIGA LEV >gi|226332277|gb|ACIC01000043.1| GENE 5 5367 - 6143 588 258 aa, chain - ## HITS:1 COG:FN0423 KEGG:ns NR:ns ## COG: FN0423 COG0543 # Protein_GI_number: 19703765 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Fusobacterium nucleatum # 5 249 3 251 259 167 36.0 3e-41 MKKFILDLTVTENIRLNANYVLLKLTSQSLLPEMLPGQFAELRVDGSSTTFLRRPISINF VDKRQNEVWFLIQLIGDGTRRLAEVNSGDVINVVLPLGNGYTMPQKPSDKLLLVGGGVGT APMLYLGEQLAKNGHKPTFLLGARSNKDLLQLEEFAKYGEVYTTTEDGSHGEKGYVTQHS ILNQVHFEQIYTCGPKPMMMAVAKYAKSNQIECEVSLENTMACGIGACLCCVENTTEGHL CVCKEGPVFNINKLLWQI >gi|226332277|gb|ACIC01000043.1| GENE 6 6219 - 6689 252 156 aa, chain - ## HITS:1 COG:no KEGG:BT_0890 NR:ns ## KEGG: BT_0890 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 156 1 156 156 275 100.0 3e-73 
MEIKDRIRMIMEREKVPPRVFAETIGVQQSTLSHILNDRNKPSLEVVMKVHQTYSYVNLE WLLYGKGEMITSAEDASTVSSNGDYQPSLFDENPVNPSKETINPENRKEMALRTAENAPK EIVKQEIRYIEKPARKITEIRIFFDDNTYETFRPEK >gi|226332277|gb|ACIC01000043.1| GENE 7 7919 - 8938 724 339 aa, chain - ## HITS:1 COG:BS_yqeN KEGG:ns NR:ns ## COG: BS_yqeN COG1466 # Protein_GI_number: 16079610 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, delta subunit # Organism: Bacillus subtilis # 10 274 4 272 347 81 25.0 3e-15 MAKQELTCDDILKELRAKQYRPIYYLMGEESYYIDLIADYITDNVLNETEKEFNLTVVYG ADVDVATIINAAKRYPMMSEHQVVVVKEAQAVRNIEELSYYLQKPLHSTILVICHKHGTL DRRKKLAAEVEKTGVLFESKKIKDAQLPAFIASYMKRKGIDMEPKATAMLADFVGSDLSR LTGELEKLIITLPTGQKRVTPEQIEKNIGISKDYNNFELRSALVEKDILKANKIIKYFEE NPKTNPIQMTLSLLFSFYSNLMLAYYAPDKSEQGIANMLGLRTTWQARDYAIAMRKYSGV KTMQIVGEIRYADAKSKGVKNSSMSDGDILRELVFKILH >gi|226332277|gb|ACIC01000043.1| GENE 8 8964 - 9740 732 258 aa, chain - ## HITS:1 COG:CPn0894 KEGG:ns NR:ns ## COG: CPn0894 COG0775 # Protein_GI_number: 15618803 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside phosphorylase # Organism: Chlamydophila pneumoniae CWL029 # 3 255 6 263 293 202 38.0 6e-52 MKTKEEIVANWLPRYTKRNLEDFGEYILLTNFNKYVEIFANQFDVPILGRDANMISATAE GITMINFGMGSPNAAIIMDLLGAIHPKACLFLGKCGGIDKKNQIGDLILPIAAIRGEGTS NDYFPPEVPALPAFMLQRAVSSSIRDYGRDYWTGTVYTTNRRIWEHDEAFKEYLKKTRAM AVDMETATLFSCGFANHIPTGALLLVSDQPMTPDGVKTDKSDNLVTKNYVEEHVEIGIAS LRMIIDAKKTVKHLKFDW >gi|226332277|gb|ACIC01000043.1| GENE 9 9804 - 10253 344 149 aa, chain + ## HITS:1 COG:no KEGG:BT_0887 NR:ns ## KEGG: BT_0887 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 149 1 149 149 297 100.0 9e-80 MLSLNLPVFDTKINVRNGKNVIFDVIRKRYVALTPEEWVRQHFVHFLIAHKGYPNALLAN EVMVKLNGTTKRCDTVLYRRDLSARMIVEYKAPHIEITQAVFDQITRYNMVLKVDYLIVS NGMQHYCCRMDYAHQSYTFLQDIPDYNAL >gi|226332277|gb|ACIC01000043.1| GENE 10 10377 - 13523 2468 1048 aa, chain - ## HITS:1 COG:BMEI1629 KEGG:ns NR:ns ## COG: BMEI1629 COG0841 # Protein_GI_number: 
17987912 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Brucella melitensis # 3 1041 2 1036 1051 871 45.0 0 MFSKFFINRPIFATVLALLIVVAGLVALNILPVAQFPEITPPTVQVSAVYPGANAETVAQ TVGLPIEQQVNGVDGMLYMSSNSSSSGAYSLTITFAVGTDIDMATVQVQNRVSVAQSSLP EPVVVQGVTVQKQSSNIVMFLTMTSKDSVYNSLYLTNYAKLNLVDQLSRVPGVGAVNVMG AGDYSMRIWLDPEAMRIRNISPAQVYQSIQSQNMEVSAGYIGQPIGQDNKNAFQYTLNVQ GRLISPEQFGNIIIRREQDGAMLRLKDIARIDLGSASYSVVSRLKGMPTAAIAIYQQPGS NSLDVSKGVKAKMEELATSFPTGVAYNVTLDTTDVIHASIDEVMVTFFETTLLVVLVIFL FLQNWRAVIIPCITIPVSLIGTLAVMAAFGFSINTLTLFGLILAVAIVVDDAIVVVENAS RLLETGQYSTREAVTKAMGEITGPIVGVVLVLLAVFIPTMMISGISGQLYKQFALTIAAS TVLSGFNSLTLTPALCALFLKKSKPSNFFIYKGFNKAYDKTQGVYDSTVKWLLQRPLISF VSFAILTVIAVILFMRWPSTFIPDEDDGYFIAVVQLPPAASLERTQAVGDKINAILDSYP EVENYIGITGFSVMGGGEQSNSATYFVVLKNWDERKGKEHTAAAVVDRFNGEAYMEIQAG QAFAMVPPAIPGLGASGGLQLQLEDRRNLGPTEMQQAIDALLASSHSKPALASVSSQYQA NVPQYFLNIDRDKVQFMGIALNDVFSTLSYYMGAAYVNDFVEFGHIYQVKIEARDQAQRV IDDVLKLSVANSAGEMVPLSSFTKVEEQLGQDQINRYNMYSTAALTCNVAPGSSSGQAIQ EMETLFKEQLGDEFGYEWTSVAYQETQAGNTTTIVLVMALIVAFLVLAAQYESWTSPVAA VIGLPVALLGAMIGCMIMGTPVSVYTQIGIILLVALSAKNGILIVEFARDFRAEGNSIRE AAYEAGHVRLRPILMTSFAFVLGVMPLLFATGAGAQSRIALGTAVVFGMAMNTLLATVYI PNFYELMQKLQEKFSRKKDDDVKKDTNM >gi|226332277|gb|ACIC01000043.1| GENE 11 13606 - 14715 1017 369 aa, chain - ## HITS:1 COG:BMEII0914 KEGG:ns NR:ns ## COG: BMEII0914 COG0845 # Protein_GI_number: 17989259 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Brucella melitensis # 2 367 74 434 451 157 32.0 4e-38 MKKLMYIFLVLSVLTGCKEKKDAGAMKGMPTLAISVAKPIVKDITLTKDYPGYLTTEKTV NLVARVNGTLQSVSYAPGGRVKKGQLLFVIEPTLYNDKVAQAEAELKTAQAQLEYARNNY SRMKEAVKSDAVSQIQVLQSESSVTEGVAAVSNAEAALSTARTNLGYCYVRAPFDGTISK STVDIGSYVGGSLQPVTLATIYKDDQMYAYFNVADNQWLEMSMNNQQPTKDLPKKIMVQL GKEGTESYPATLDYLSPNVDLNTGTLMVRANFDNPQGVLKSGLYVSITLPYGEADHAILV KEASIGTDQLGKFLYAVNDSDIVHYRHIEIGQLINDTLRQVLGGLSPQERYVTEALMKVR DGMKIKPIP >gi|226332277|gb|ACIC01000043.1| GENE 12 14749 - 16146 506 465 aa, chain - ## 
PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 45 464 37 457 460 199 29 4e-50 MNVRAWSVSLLLFLSTVTAVGQTVNRYLNTPLPKEWEESGEVFQQTLPVDDHWWKSFQDT RLDSLIALAVDRNYSVAMAINRIAAARANLWAERGNFFPSIGLNAGWTRQETSGNTSTLP QSTDHYYDAALSMSWEIDVFGSIRKRVKAQKENFAASKEEYAAVMVSLASEVASAYINLR ELQQELEVVNKNVASQEEVLKITEVRYNTGLVAKLDVAQAKSVLYSTKASIPQLEAGINQ YITTLAVLLGMYPQEIRPVLETTGTLPDYMEPIGVGMPVDLLLRRPDVRSAERSVNAQAA LLGASKSDWLPKIFLKGSFGYAARDLNDLVKSKSMTYEIAPSLSWTIFSGGQLVNATRLA KAQLDESINQFNQTVLTAVQETDNAMNSYRNSIKQIVALREVRNQGVETLKLSLELYKQG LSPFQNVLDAQRSLLSYENQLVQAQGSSLLQLISLYKALGGGWRE >gi|226332277|gb|ACIC01000043.1| GENE 13 16284 - 17114 946 276 aa, chain - ## HITS:1 COG:TP0494 KEGG:ns NR:ns ## COG: TP0494 COG1579 # Protein_GI_number: 15639485 # Func_class: R General function prediction only # Function: Zn-ribbon protein, possibly nucleic acid-binding # Organism: Treponema pallidum # 18 244 7 232 273 63 21.0 4e-10 MAREAKKDPNELTVEQKLKTLFQLQTMLSKIDEIKTLRGELPLEVQDLEDEIAGLSTRID KIKAEVDELKSAIAGKRVEIETAKASVEKYKSQQDNVRNNREYDFLTKEIEFQSLEIELC EKRIKEFSADREEKEAEVVKNEQILSERQKDLEQKKGELDEIISETKQEEEKLRDKAKDL ETKIEPRLLQSFKRIRKNSRNGLGVVYVQRDACGGCFNKIPPQRQLDIRSRKKVIVCEYC GRIMIDPELAGVQIEHKVEEAPATTTKRAIRRKTAE >gi|226332277|gb|ACIC01000043.1| GENE 14 17120 - 18214 820 364 aa, chain - ## HITS:1 COG:BH1380_2 KEGG:ns NR:ns ## COG: BH1380_2 COG3323 # Protein_GI_number: 15613943 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 131 234 6 109 113 114 54.0 3e-25 MKIKEIVSALERFAPLPLQDGFDNAGLQIGLTDAEATGALLCLDVTEAILDEAIALGYNL VISHHPLIFKGYKSITGKDYVERCMLKAIKNDIVIYSAHTNLDNAQGGVNYKIAEKIGLK HLKVLEPKENSLMKLVTFVPIAQADIVRNALFAAGCGNIGNYDSCSYNLEGEGTFRAKEG THPFCGVIGELHREGEVRIETILPAFKKSEVVRALLSVHPYEEPAFDLYPLQNEWAQAGS GIVGELEEPETEMEFLKRIKKTFEVECLRHNKLTGREIQTVALCGGAGAFLLPQAIRSGA DVFITGEIKYHDYFGHEGDILMAEIGHYESEQYTKEIFYSIIRDLFPNFTLQLSKINTNP IKYL >gi|226332277|gb|ACIC01000043.1| GENE 15 18321 - 18737 252 138 
aa, chain + ## HITS:1 COG:no KEGG:BT_0881 NR:ns ## KEGG: BT_0881 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 138 1 138 138 254 100.0 8e-67 MKTILSCCLVIILAVSCQMKQNQNSDESIDENMSVDNTETCMVDTVKATAIFWIDKAEAK HCKDSGLRTIKAKVFIHENGKVDFLSFTKKQSSGVEKYILHHLDKFQISKRMLEGGYIQT GEQFVQLRCMREKLQAMQ >gi|226332277|gb|ACIC01000043.1| GENE 16 18772 - 19050 215 92 aa, chain - ## HITS:1 COG:no KEGG:BT_0880 NR:ns ## KEGG: BT_0880 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 92 1 92 92 174 100.0 1e-42 MTVTDRIFQNVAELSVPNFFITVEFSVMGNEMPEHIESFVLEKYQAILHGANGRKFVYTE GGWRLFFTFFPTDKVVDERYALKNKVQMKFHK >gi|226332277|gb|ACIC01000043.1| GENE 17 19047 - 19841 505 264 aa, chain - ## HITS:1 COG:RSc0748 KEGG:ns NR:ns ## COG: RSc0748 COG0390 # Protein_GI_number: 17545467 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Ralstonia solanacearum # 31 256 35 260 272 93 30.0 4e-19 MGTIDISYYNLFIGLLLLAIPFFYLWKFKTGLLKPAVIGTLRMIIQLFFIGIYLKYLFLW NNPWINFLWVIIMIFVAGQTALVRTQLKRSILLIPISIGFLCSVVVVGLYFIGVVLQLDN IFSAQYFIPIFGILMGNMLSSNVIALNTYYSGLKREQQLYRYLLGNGATRQEAQAPFIKQ AIIKSFSPLIANIAVMGLVALPGTMIGQILGGSSPNVAIKYQMMIMVITFTASMLSLMIT ISLASRRSFDAYGKLLEVTKETKK >gi|226332277|gb|ACIC01000043.1| GENE 18 19876 - 20496 211 206 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 197 1 202 245 85 30 6e-16 MLHINNACIAFGTEVLFSGLNLNLERGKTACIVGQSGCGKTSLLNAVMGFVPLKEGTIKV GDTLLDKSTIDIIRRQIAWIPQELALPFEWVKEMVALPFDLKVNRSVPFSEERLYEYFDD LGLEHDLYTKRVNEVSGGQRQRIMLAVAAMLNKPLIIIDEPTSALDAGSTDKVLAFFRRQ AEKGAAVLAVSHDKDFASGCHYLIEL >gi|226332277|gb|ACIC01000043.1| GENE 19 20583 - 21125 501 180 aa, chain + ## HITS:1 COG:AF2201 KEGG:ns NR:ns ## COG: AF2201 COG4739 # Protein_GI_number: 11499783 # Func_class: S Function unknown # Function: Uncharacterized protein containing a 
ferredoxin domain # Organism: Archaeoglobus fulgidus # 1 176 1 184 184 127 39.0 1e-29 MILNERDARHEHILQVARQMMTAARTAPKGKGIDIIEVALITDEDIKLLSDKMIAMVEEH GMKFFLRDADNILSAECVVLIGTREQAQGLNCGHCGFATCAGRTEDVPCALNSIDVGIAI GSACATAADMRVDTRVMFSAGLAAQRLNWLKDCKMVMAIPVSASSKNPFFDRKPKQETKA >gi|226332277|gb|ACIC01000043.1| GENE 20 21196 - 22311 1175 371 aa, chain + ## HITS:1 COG:HP0049 KEGG:ns NR:ns ## COG: HP0049 COG2957 # Protein_GI_number: 15644680 # Func_class: E Amino acid transport and metabolism # Function: Peptidylarginine deiminase and related enzymes # Organism: Helicobacter pylori 26695 # 38 364 6 328 330 306 48.0 3e-83 MGIMVGLPSPSGSEKDLQLNFGKNMTVQVEMRAPHLPAEWHLQSGIQLTWPHAGTDWAYM LKEVQECFVNIAREIAKRELLLIVTPEPEEVKKQIVATVNMDNVRFLRCETNDTWARDHG AITMIDTGNPSLLDFTFNGWGLKFASELDNLITGQAVKAGALKGQYIDCLDFVLEGGSIE SDGMGTLLTTTECLLSPHRNGKLNQVEIEEYLKSTFHLQKVLWLDHGYLAGDDTDSHIDT LARFCSTDTIAYVKCDNTEDEHYEALHAMEEQLKTFRTLAGEPYRLLALPMADKVEEDGE RLPATYANFLIMNDAILYPTYQQPDNDRKAGEVLQQAFPKHQVIGIDCRALIKQHGSLHC VTMQYPSGVIQ >gi|226332277|gb|ACIC01000043.1| GENE 21 22348 - 23232 833 294 aa, chain + ## HITS:1 COG:XF2443 KEGG:ns NR:ns ## COG: XF2443 COG0388 # Protein_GI_number: 15839034 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Xylella fastidiosa 9a5c # 4 294 6 295 295 414 65.0 1e-116 MKKIKVGLIQQSNTADIRVNLMNLAKSIEACAAHGAQLIVLQELHNSLYFCQTENTNLFD LAEPIPGPSTGFYSELAAANKVVLVASLFEKRAPGLYHNTAVVFDRDGSIAGKYRKMHIP DDPAYYEKFYFTPGDIGFEPIQTSLGKLGVLVCWDQWYPEAARLMALKGAELLIYPTAIG WESSDTDDEKARQLNAWIISQRAHAVANGLPVISVNRVGHEPDPSGQTNGIQFWGNSFVA GPQGEFLAQASNDHPENMVVEIDMERSEDVRRWWPFLRDRRIDEYDGLTKRFLD >gi|226332277|gb|ACIC01000043.1| GENE 22 23334 - 23678 248 114 aa, chain - ## HITS:1 COG:no KEGG:BT_0874 NR:ns ## KEGG: BT_0874 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 114 1 114 114 194 100.0 1e-48 MNALIMALVVWLMMREMSFEGDYMVANVTAYIIAQIHNFIWCKYWIFPVEKRKNNVWKQI LFFCSAFGVAYTAQFLFLILLVEGLDVNEYLAQFLGLFIYGAANFIANKKITFQ 
>gi|226332277|gb|ACIC01000043.1| GENE 23 23720 - 25594 1306 624 aa, chain - ## HITS:1 COG:PA1689 KEGG:ns NR:ns ## COG: PA1689 COG1368 # Protein_GI_number: 15596886 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Pseudomonas aeruginosa # 4 610 38 679 700 137 25.0 9e-32 MGCFILCRVIFLIENLKAFSDLSAGRIWRMFEGGFLFDTSAILYTNLLYIFLMLIPLHYK ENVIYQKITKGIFVTTNLIVIIMNLMDTVYFQYTHRRTTASVFSEFKNEGNLGGIIGTEL LNHWYLTLFAIAFGYALFKLYRKPKPVKVERMPVYYAVHLLTLGLGVYLCIGGMRGGFTG MVRPITISNANKYVDRPMETGIVLNTPFSVFRTFGKTSFAIPQYFDKEKMEALYTPVHMP ADSVQFRPLNVVVFILESFSKENSGFLNEELDNGTYKGYMPFLDSLMAEGLTFKYSFSNG MKSIDGMPSVLSGIPMFIEPFFLTPSSLNTVSSIGGELGKKGYYTAFFHGADNGSMGFEA FARTAGYTNYFGRTEYNEANPGNKDFDGHWGIWDEKFFQFFGNTLSGFQQPFAAALFSLS SHHPFAVPAEYEGVFPKGNKAIRQCIGYTDHALKLFFEKVSKEPWYKNTLFVFTADHTNS PERPEYETESGLFSVPVVFYQPGSNLKGFRKDVIAQQIDIMPTVLGYLGYDKPFVSFGCD LLNTPAEDTYAVNYINGIYQFFKGDYLLQFDGRNAVAVYAFKTDKLLEHNLLGQIDVQQS MEDELKAIIQQYMIRMNNDELVVK >gi|226332277|gb|ACIC01000043.1| GENE 24 25661 - 27424 1858 587 aa, chain - ## HITS:1 COG:BH1252 KEGG:ns NR:ns ## COG: BH1252 COG0173 # Protein_GI_number: 15613815 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl-tRNA synthetase # Organism: Bacillus halodurans # 3 581 4 585 595 597 51.0 1e-170 MFRTHTCGELRISDVNKQITLSGWVQRSRKMGGMTFIDLRDRYGITQLVFNEEINAELCD RANKLGREFVIQITGTVNERFSKNANIPTGDIEIIVSELNVLNTAMTPPFTIEDNTDGGD DIRMKYRYLDLRRNAVRSNLELRHKMTIEVRKYLDSLGFIEVETPVLIGSTPEGARDFVV PSRMNPGQFYALPQSPQTLKQLLMVSGFDRYFQIAKCFRDEDLRADRQPEFTQIDCEMSF VEQEDIISTFEGMAKHLFKTLRGVELTEPFQRMPWADAMKYYGSDKPDLRFGMKFVELMD IMKGHGFSVFDNAAYVGGICAEGAATYTRKQLDALTEFVKKPQIGAKGMVYARVEADGTV KSSVDKFYTQEVLQQMKEAFGAKPGDLILILSGDDVMKTRKQLCELRLEMGSQLGLRDKN KFVCLWVIDFPMFEWSEEEGRLMAMHHPFTHPKEEDIPLLDTDPAAVRADAYDMVVNGVE VGGGSIRIHDAQLQARMFEILGFTPEKAQAQFGFLMNAFKYGAPPHGGLAYGLDRWVSLF AGLDSIRDCIAFPKNNSGRDVMLDAPSEIDQTQLDELNLIVDIKENK >gi|226332277|gb|ACIC01000043.1| GENE 25 27500 - 
28513 828 337 aa, chain - ## HITS:1 COG:lin0768 KEGG:ns NR:ns ## COG: lin0768 COG1597 # Protein_GI_number: 16799842 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Listeria innocua # 1 295 1 299 309 118 29.0 2e-26 MKRIIFVVNPISGTQSKELILNLLDEKIDKARYTWEVVYTERAGHAVEIAAKAAEEKADI VVAIGGDGTINEIARSLVHTDTALGIIPCGSGNGLARHLHIPMEPKKALEVLNEGCLDTI DYGKINGTDFFCTCGVGFDAFVSLKFAHAGKRGLLTYLEKTLQESLKYQPETYELETENG VSKYKAFLIACGNASQYGNNAYIAPQATLTDGLLDVTILEPFTVLDVPSLAFQLFNKTID QNSRIKTFRCKKLCIRRAVPGVVHFDGDPMETEADVNIELIKSGLRVVVPKTEEKDAANV LQRAQEYVNGIKLINEAIVDNITDKNKKILKKLTKKV >gi|226332277|gb|ACIC01000043.1| GENE 26 28746 - 29930 1113 394 aa, chain + ## HITS:1 COG:BS_kbl KEGG:ns NR:ns ## COG: BS_kbl COG0156 # Protein_GI_number: 16078763 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Bacillus subtilis # 27 394 25 392 392 296 39.0 6e-80 MGLLQEKLAKYDLPQKFMAQGVYPYFREIEGKQGTEVEMGGQHVLMFGSNAYTGLTGDER VIEAGIKAMRKYGSGCAGSRFLNGTLDLHVQLEKELAAFVGKDEALCFSTGFTVNSGVIS CLTDRNDYIICDDRDHASIVDGRRLSFSQQLKYKHNDMADLEKQLQKCNPDSVKLIIVDG VFSMEGDLANLPEIVRLKHKYNATIMVDEAHGLGVFGKQGRGVCDHFGLTHEVDLIMGTF SKSLASIGGFIAADSSIINWLRHNARTYIFSASNTPAATAAALEALHIIQNEPERLNALW EATNYALKRFREAGFEIGATESPIIPLYVRDTEKTFMVTKLAFDEGVFINPVIPPACAPQ DTLVRVALMATHTKEQIDSAVEKLVKAFKALDLL >gi|226332277|gb|ACIC01000043.1| GENE 27 30020 - 30367 296 115 aa, chain - ## HITS:1 COG:no KEGG:BT_0869 NR:ns ## KEGG: BT_0869 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 115 1 115 115 199 99.0 2e-50 MKKKMLAATLLMGLLAACGGNSASIVGSWVEPVPGMEGQVQGIKMEEGGVASSVNMATLV YESWKQEGTKLILTGKSIGNGQTIEFVDTMDIKRLTADSLVLDNQGMEIRYAKQK >gi|226332277|gb|ACIC01000043.1| GENE 28 30393 - 30776 324 127 aa, chain - ## HITS:1 COG:no KEGG:BT_0868 NR:ns ## KEGG: BT_0868 # Name: not_defined # Def: hypothetical protein # 
Organism: B.thetaiotaomicron # Pathway: not_defined # 1 127 1 127 127 209 96.0 2e-53 MNKIRVFLLFAVLVGVTSACQSKSGQTAGQEQDAMDDSLAVAPTEDEKTIFGVVEDASMN NFMLVTSKGDTVFVSTMDQEPSEVGGFELGDTVQVNYIEEEEEPGSNTIPTAKKVIVVGK KANRTND >gi|226332277|gb|ACIC01000043.1| GENE 29 31504 - 34716 2590 1070 aa, chain + ## HITS:1 COG:no KEGG:BT_0867 NR:ns ## KEGG: BT_0867 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1070 1 1070 1070 2082 99.0 0 MKNILFTSNKVHTTARIGVLFFSLCFSPLINAGNAEKPMNGIESSETTQQNKIKITGVVT DSKGESIIGANVVVKGQSNVGAITDVEGRYQIMVPSDNAILQVSYLGYVTEEIKVKGRRN INVMLNEDSKALDEVVVVGYGQQKKESVVVSMSSIKPKDIVVPSRSLNNSLAGQVAGLIA VQRSGEPGYDNAEFWIRGVSTFAGGTSPLVLVDGVPRSMSDIEPDEIETFSVLKDAAATA IYGAEGANGVVLVTTKRGRVEKAKISFKTEHTISSPTRLPEFVGSADYLDLFNEALHNDG ELPQFSDDLIAHYRNNDDPDLYPNTNWIDKMLRKNTFTHRYTLNVRGGTEKAKYFVSAAY YNESGIFKGSPTKQYNTNIGIDRINLRSNIDMNVSSTTTVGVDLALQYLVNNYPGTGTAN IFRSMLITPPYTFPAVYSDGTVSTYSQERDANMRNPYNLLMNSGYAKEYRTAIQSKVYVD QKLDFITKGLSVRLNASYDYDSEMIVRREYNPTRYHATGRDENGNLLFATVVSGNPELQD PKNSATSATKKIYIDAAINYKRTFDQHEVGAMLLYMQKETQYHNVALPYRKQGLVGRLTY GYGGRYFIEGNFGYTGSEAFASGNRFGFFPAVGLAYYISNESFYPEVIKKYVPKLKLRAS VGRAGNDNTGGTRFLYRPTYKMDAGGFTQGYNDTGGGLNGIGNGIIEGRFAAPYLGWEVE DKQNYGFDIGLFDNRLEVIFDYFDSTRSQILLQRQTVPQLGGLRDDPWQNFGKVRNRGVD MSVNAHQNIGKVKLSARGTFTFTRNKILEYDELPPKYDYQAITGKRVSDKDNLYYIAERL YTEDDFTVSTNANGLKTYKLRSEIPQPTLGGLLGPGDIKYTDVNGDGIIDSYDRVRGIGH PETPEIIYGFGLNAEYKGIYASIFFQGAGNTSVLLGGKTSEGWYPFSWGVDQSNYRTFAL NRWTENNPSQDVIIPRLHKSNANNANNRVASTWWLRDGSFLRLKNIEIGYQIPKKFLSKI GFEAARIYLMGYNLAVWDHIKYFDPEAGNANAGLNYPLPRTYTIGLDFTF >gi|226332277|gb|ACIC01000043.1| GENE 30 34737 - 36635 1686 632 aa, chain + ## HITS:1 COG:no KEGG:BT_0866 NR:ns ## KEGG: BT_0866 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 632 1 632 632 1276 99.0 0 MKSKSHIYMLLMGLFFSVSSCDYLAVSDQMSGGLQNTDQIFENVAYTKRWYANVFAGIPD YSGINSLNVGAFKNPWAAICDELVVGYGNAAKANNSDKNAATAGFHRYGDCYKYIRQANI 
FLEKAHVITTSGTQGDRLEEDELNEMRANVRFMRAFYNYLLLEQYGPIILVKDKVYEATE TQDVPRNTVDEVIEYIDQELREVANELPQEPMHENESYRAWPTKGVALAVRAKLWLYAAS PLLNGGYREALSLTNPDGTRLFPDRDDNKWNTALNACKDFIDYAETGNRYELYKEYTTSS TGEQILDVDASVYNLFQKYNKEIIWGTANNDWGGLDGDAFDRRIVPRCEKNGLGSTGVTQ ELVDAFYMNDGLPIKETDYLPKSTLYKEDGYGTYKDKNDGKYSKNYTNVTVSNRYLNREA RFYNTVFFNGRQWPVTCKQVQFYNGGNAGVQEGQATTTGYMLFKRFNRSISKTSPGVASQ NRPSIIFRLADFYLIYAEVANEVNPSDSRVLTYLNLVRERAGLPKVEVLNPGIVGNKELQ RAAIQRERQIELATEGQRYFDVRRWMIADKDGEGRQNGYAHGMNVRGTINDTEEFNRVVE TEKIVFNRKMYLQPIPDHEMRKTQNLVQNPGW >gi|226332277|gb|ACIC01000043.1| GENE 31 36666 - 37988 981 440 aa, chain + ## HITS:1 COG:no KEGG:BT_0865 NR:ns ## KEGG: BT_0865 # Name: not_defined # Def: putative chitobiase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 440 1 440 440 860 99.0 0 MRKISYILISGCITLSIIGCDETFDTSNTSEGILSLSENGLTILQSYNVGEKYNADLWIQ HGGLELNNSTVTFTVDKSLLDSLNTADGTSYQILPEDCYQMTNTTVNVSAGDRLAKGTIV YDPTKIHALCGYDNVRYILPLKASTTGEKLNPDRSTLLLGFKVSEPIVTIINDGIQEINV DNVKELPITIGVPFSNKWNINCTLVNNQSVVDTYNSANQTYFSLLPAECYTKPESPSLSQ GVDQTTVSYKLKDNILPGNYMLPVQIGEVTSDATIRADKDTYTGFCIIKEGTKISKSGWE VLSFTTQEASGEGAGNGLAKCLIDGDTETFWHAKWQGGSDPLPYDIVIDMKQNIQIAQVE LLPRGRGSNNPIKVVEFAASEDNVNWTPIGRFGFTNQDAALEYYVKSIKARYIRLTIPDD GGNSTVAAIRELDVKGTIIN >gi|226332277|gb|ACIC01000043.1| GENE 32 38207 - 40564 1150 785 aa, chain - ## HITS:1 COG:no KEGG:BT_0864 NR:ns ## KEGG: BT_0864 # Name: not_defined # Def: putative permease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 785 1 785 785 1495 96.0 0 MERTAMENFILLKLIFRNWWRNKLFAVISLVSLVVGISCTNLLISFVIHEYTIEGENPKK DRILRLTQQLPSMQSTGQVSFVYGGSVAGTIAPFPEIESYLRLTEKEATHIEIENQRFPQ QVIVEADSSFCRFFPYQILSGNMKEALIKPDCIALSEEKAMQYFGKLDCIGRELSLVYGD KVEMKRVSAVYQFYAQSALRIDLLGNSQNLQEEGTSCMILLKERTDVSAFRERFESTELP TVTGPGYYKLQTLQESYFDTKLADSVKTFSHRQIALLSIGLLSAFLVLFIACFNYVNLSF SRLLKQVNMIHVETLMGASHSYICRQLFIDTFLTVFIAFILSILVMGDILSLFNYFFDAR LTFGFILSWKVFPFILLFVSLLAIIPATYMAKKLNRISISGYRQFFQGRKRRSLVAILVG VQMVISIGLMTAFMMIRSQLSTIEGQGERFKEVIILGNEGIALTIPLYNDIKKLPGIQTA 
LLSKGRITSPFNIIIPLNRNGQEVMVNKVLLTENYELLDLFNIELLEPERTKDLFSKVAH PVVVNEAFVQFFVPAGEDPVGQTLSKYEQDNAGEGTIIGVCKDFRLTSFNSAITPMQIIL QEIPVDQATYLSCRIDEGRRNEIVKKLRLLWEKHEPGHSFVYIDAYQQYISNNTDMVNFS RLLLAYAIISLFLTLFGIFGITWYAVEQRKREIAIRKVHGASSRQILLLLNRPFFYYILI ADVIAVPIVYVLMKRWREQFVYPANWGIEVFIVPVVLLMLVTFVTVVLNGYHTALSNPAE TIKTE >gi|226332277|gb|ACIC01000043.1| GENE 33 40592 - 43000 1566 802 aa, chain - ## HITS:1 COG:TM0351 KEGG:ns NR:ns ## COG: TM0351 COG0577 # Protein_GI_number: 15643119 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Thermotoga maritima # 112 395 115 402 404 67 25.0 1e-10 MKNQLSAIRTLSRFRTYTMINIIGLAVSVAATLIIVRYIHQEITVDHFCKDLDRLYLLTI QRSNGSISIADNTDRNNDPNFIDPLKNPEVEAYSYSVSFDDDYIVVDNHRYRANVLVTDS LLLQLMDYPIASGIKTIQKPDDAIITRKYARHILGDEDPIGKQLTSSSGSVLTIRGIVDE PSTKASLQFDLIAPVNQGKYADWSRVGFCMVRLANGTDLKKYNEKISKPQSLICYSHSPI QYGLTPLKGLYFNKIVEPSGTSFLRGNINHVTILTVVACMLLLVGVFNFINIYTVIVLKR AREFGVKKVYGASGFQIFTQIYVENIYMVAAALLIIWMLIETSAGIFASVYAIPVKPDIA FDLSLSLILLFVLPLVTSVFPFLRYNYSSPITSLRSVNAGGNSIVSRVVFLFIQYVITFS LIVVALFFVRQLYTMLNADLGYQAKNIISCQFLSFETQNRRYANDEEWQKERDLERHKVQ VIKQKMDACPLFISWGYGEAPINLEPYVNVESNGEKHKMAIMYVNNNYMDMFGLKLKEGR TWNDKDQFAQYKMMINETARKLFRIKDINEASLQTESRLWWSVGTDESKNPAYQVVGVIE DFRTGHLSMGDAPLAILYDEGENPTEPLMAKIAEGKRKEAIQFLKELHNEVMGEGEFNYS FVEDEVEKLYDDDKRTTRIYVTFAGLAICVSCLGLFGLSLYDIRQRYREIALRKVNGATG KQITLLLVRKYLSILGAAFVVAIPLVYYIINDYTKDFAVKAPIGAGIFVTGFILTLLISL GTLLWQVRKAVRINPALIMKNE >gi|226332277|gb|ACIC01000043.1| GENE 34 43022 - 45328 993 768 aa, chain - ## HITS:1 COG:no KEGG:BF2455 NR:ns ## KEGG: BF2455 # Name: not_defined # Def: putative ABC transport system, membrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 768 1 775 775 744 49.0 0 MIRHYLKIAFRNLIKYKTQTIISIIGLSVGFTCFALSTFWIHYEMTYDAFLKGSERTYIL YKESVLENSGFDTSSPYPLSTYLKRDFPEIEAACALTRWQNVEISVEGQTPLEGIEIGAD SCFMNMFDISVVSGNTDFMHSDEKIALTKDMAMRLFGSTDILGQKVNLRGSSQTVCAILD 
GLEDHSNLSFGFWTEGAYFSSWTEDWQNSGFMTILRFHKKVDLQLFQKKLNEYAQEKEKE NIFIGKYRLMPITGYYYSALNTERTVEFYYLILFAVVGGLVILCSLFNYFSLFFARMRMR VRELELRKVCGSSHMGLFILLSVEYLLILLLSGLLGMTFIELFLSVFKEVSNISGNIYVE SFFYLGGVAITSLILFLPFILKKNYHSQKTVGKALYRKISVVLQLIISILFIFSVSVIMR QLFFLNNVDLGWERKNIAVFQYLYPNDSFSAIADKVEQMPCTREVLKGNYGLLLQGVRIS WRITDWDGKQSSTDQTNLECLAENEEIMRFYHLKLLKGEMLKKDDVDKIVINETAVKSLD MHDPIGKKIYQGKDRHAYIIVGVIKDFHTSPPTIPVKPVAFIGQNNELSDRKQIIVKYHE GQWQELKQKTEDLFVKEYPGVEYVLTNIEEHYNEYLKSEYILLKLLGFVSIVCVIISILG IYSLVTLTCEQRSKEIAIRKVNGAIIRDILILFLKEYLILLIAASLIAFPIGYIAMKHWL ESYVERTEISAWMYGAIFIIVGLIIFFSIIGRVWKAARQNPAEVIKSE >gi|226332277|gb|ACIC01000043.1| GENE 35 45347 - 47662 941 771 aa, chain - ## HITS:1 COG:AF1017 KEGG:ns NR:ns ## COG: AF1017 COG0577 # Protein_GI_number: 11498622 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Archaeoglobus fulgidus # 562 771 159 377 377 67 28.0 1e-10 MIYHYLKIAFRNLIKYKTQSIVSIVGLAIGFICFALSALWIRYEMTYDTFHEGAERIYLA GNKFELQGDGFSYYSSGLLADYMMKNCPEVEKACHLSKEGIKPIEYEKNIYHTWQLEVDS NFISVFNITVIDGDNRLQLGDKQIAITDKIAKQIFGKESPIGKSLLFPNRGNTEMMIVAV VKSWEGHSLYPFDILLPYKDEDPNWGRQQCNTLFRMYPNSDVKALEKRLTEYEIQQDSHK QLITTPIALLSTLRSTHPREDVNVKLNHVRLFACIGALVIICGLCNYLTMLVTRIRMRKR ELALRKVNGSSNTSLLLLLLSELILLLLISSGIGIMLMELILPIFISLSQISENTSFFYN EVLVYILSLIILTVGVATILIYYINKQALLENIRHKSNLHLSGWFYKGSILFQLFVSLSF IFCTLVMMKQLNFLLNSRELGLERHNVGVIVYASGCEKATLAQIVKQTPDIVEHMEEFYT PIPKFLYSTFRVSDWEGKSDEQQTFNMEDETINQNFADFFHIEVLEGSMLGDKDEKGAVV INEAAVKAFGWNAPIGKEINGEYRVKGVIKDIYYNAPIHPVSPAMFSLQKNNDRGHLIFK VKEGTWNKVSQNLQEEVHKVNPDAILDLINMEEAYNDYMKSEDTLSKLLSVVSLICIVIA VFGIFSLVTLSCEQRQKEIAIRKVNGASVKVILNLFFKEYLLLLVIASFIAFPLGYAIMK HWLEGYVKQTSINIWIYAGIFVAMLLIIFISIIWRVWRAARQNPAEVIKSE >gi|226332277|gb|ACIC01000043.1| GENE 36 47678 - 50023 1443 781 aa, chain - ## HITS:1 COG:no KEGG:BT_0861 NR:ns ## KEGG: BT_0861 # Name: not_defined # Def: putative ABC transporter permease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 781 1 781 781 1457 
96.0 0 MIKHYLKVAFRNLVKYKTQSLVSIIGLAVGFTCFALSVLWIRYEMTYDTFHNGADRIYMV RAHYTDSPSSTSNVTPYPLANYLQEKMPEIEATASTSIYRDMKFRLGDNEQEVVMATADS VFMNFFNIQLLKGTVNFLKENDPEIAITEEFARKLFGSEEDALGKEVELSGRLRKIGAIV SGWSRHSNIPYSVLSSARRYPRWSSKGEQLFIRVRAGVDRDTFQKKMEDIDILDIPKESE LVKLLITPISALRYSDYVSKVDVVISFSYIFYFSLAGGLVIICSLFNYLTLYISRLCMRS REMALRKVNGASNTSLSVQFAIELLLILCIALLCGLLMIEVSLSGFLNFTQIETGSYYGE IIAYLLIVIILSFLLAQMPLSYFRRRTLQEAIKGKEVTTRPYLFRKIGIIMQLVVSLAFI FCAVVMMKQLYFLKNTDLGMERHNVGNVALWNGDIKQWAGKIAALPMVTETLPPSYFPII PTGPMMYAEINNWDDSPKAVEEPLTVGMMPAKEEFFEFYKLNLLEGEFISDKNQENEVVI DENTCRKFGWKRALGKSFYHEYNGQRSLYKVVGVVRSFSYRSPTSEPGLIAFQHPKAQDY LLNRASILFKFREGTWNECRATIEKLYREEFPNAYMRLFNEEEEYNKYLRSEDALMKLLS FVSLVCVIISVFGVFSLVTLSCEQRQKEIAIRKVNGAQIHHILQMFFREYLLLLIIAAVI AFPAGYVVMRRWIDSYVRQTSIDSWVYISIFVVIAIILLFSIIWRVWKAARQNPAEIIKS E >gi|226332277|gb|ACIC01000043.1| GENE 37 50041 - 50715 338 224 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 199 4 199 223 134 37 1e-30 MIKTINLQKIFKTEEVETWALNNVSIEVKQGEFVAIMGPSGCGKSTLLNILGLLDNPTGG EYYLNGIEVSKYTESQRTSLRKGVIGFVFQSFNLIDELNVYENIELPLLYMGISASERKK RVETAMERMAITHRSKHFPQQLSGGQQQRVAIARAVVANPKLILADEPTGNLDSKNGKEV MGLLSELNKEGTTIVMVTHSQHDAGYADRIINLFDGQVVTEVSM >gi|226332277|gb|ACIC01000043.1| GENE 38 50743 - 53145 852 800 aa, chain - ## HITS:1 COG:no KEGG:BVU_3170 NR:ns ## KEGG: BVU_3170 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 2 800 1 799 799 832 52.0 0 MMIKHYLKVAFRNFLKYKTQNFISIIGLAVGLLCFSICFYCSRYMQSMDQCFDNYHRLVD IRLSGDGEPYAGTPAVLTENLIKEHGQLVEGYSRVSYIRDRPFNVYLNDEKKLPYTFDCI EVDSLFDRLFTPDIVAGSWTTAAHTFNAIVLTESTARKVFPYLQDAIGKRMVITSRLYTS PETTPRDGGITYTIQAVIKDIPTNVSMSFMRKIDALVLNDSEGLFRMKGNNDQTGTYTYG LLWKGQTAKDLENQYKKVKFTHRMFGQDCDVLVSPIGKETDSSKIAMIFSLTTSIVGILI LLVSLLNFFHFQIGSMINRQREFSIRKVLGNGTIQLSMMLFTQLFLVIILASIFMFGFIE VLASGMQISFFDLTMSVERDSLMNQAGEYIVLLLLVTLCICFLVSYYIRKVSVQTSIRSH 
HIRGKNHFRNIMLGIQFFICWLFVSMTVALYLQTEKTTSTLFNTLTKTEKSEILSIELDY AFMKNEEKLALVDRIRQHSGIKDILLADIGYLKGVSGTGMQIEKDNPDKYMEANVMNVSP NFISFMQIPLLAGRNMESDSDMFVDDVFRDKNGKDILGMSLYNYQNGYTVCGVLSSFNPS VYSRGAEQPGYVFLPSEFDYYVGHCYIKCYSGKEEEVTQWVYQILRDVLPESIDPEVSTF LEDINEEQALETKLKDIILFFSVVSLIITLLGVYAAITLDTERRQKEVAIRKVNGAGMKQ IIMLFARLYMLLLIITAILAFPVICIILQMWKEMYQVFFSYGILYWSGIFAGVTLLTAIT VLFRILKIARLNPAEVIKNE >gi|226332277|gb|ACIC01000043.1| GENE 39 53176 - 54426 1204 416 aa, chain - ## HITS:1 COG:YPO1498 KEGG:ns NR:ns ## COG: YPO1498 COG0845 # Protein_GI_number: 16121771 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Yersinia pestis # 58 406 57 410 420 88 23.0 3e-17 MDIKLEKKPWYIRYRYYLIGGLLFAAFLIYVIVLSLGPRKLRIDAENIQIAEVKEDNFME YVDVEGLIQPILTIKINTREAGSVESIIGEEGSLLQQGDTIIVLSNPDLLRSIEDQRDEW EKQMITYQEQEIEMEQKSLNLKQQALTNNYELERLKKSIALDREEFQMGVKSKAQLQVAE DEYNYKLKNAALQQESLRHDSAVTMIRKELIRNDRERERKKYERTHERLNNLVVTAPIKG QLSFVRVTPGQQVSSGESIAEIKVLDQYKIHTSLSEYYIDRITTGLPATVNYQGKKYPLK ITKVVPEVKDRMFDVDLVFTGDMPDNVRVGKSFRVQIELGQPEQALIIPRGNFYQSTGGQ WIYKVNASKTKAIRVPLSIGRQNPQQYEITEGLQTGDWVITTGYDTFGDAEELILK >gi|226332277|gb|ACIC01000043.1| GENE 40 54480 - 55952 1445 490 aa, chain - ## HITS:1 COG:VC1565 KEGG:ns NR:ns ## COG: VC1565 COG1538 # Protein_GI_number: 15641573 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Vibrio cholerae # 185 471 125 403 419 63 23.0 6e-10 MNLKILYITLGILSTELLAAQNQTMEMSLEQVVKIARLESPDAQTARHSFRSAYWNYKYY RANYLPSLSLSSDPNLNRAINKITMGDGSVKFVEQNLLNTDLTLNLSQNIPWTGGSLFLE TSAQRMDLFSEHKYSWQTSPVMIGYRQSLFGYNSLKWDKRIEPVRYQEAKKSYVETLELV SANAINKFFALATAQSNYDIASFNYANADTLYRYAQGRYNIGTITENEMLQLELNRLTEE TNRMNARIEMDNCMQELRSYLGIQEDRELRVRINSQVPDFSVNLDEALALAYENSPDIQT MKRRKLESESAVARARANAGLKADIYLRFGLTQTAEKLPDAYRNLLDQQYVSLSIALPIL DWGRGKGQVRVARSNRDLVYTQVEQNRTDFELNVRKLVKQFNLQTQRVRIAARTDETAQR RNEVARKLYILGKSTILDLNASIAEKDSARRNYVSALYNYWSLYYTLRSMTLYDFERSTL 
LTEDYNLLIE >gi|226332277|gb|ACIC01000043.1| GENE 41 56167 - 57543 1332 458 aa, chain + ## HITS:1 COG:STM4174 KEGG:ns NR:ns ## COG: STM4174 COG2204 # Protein_GI_number: 16767428 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Salmonella typhimurium LT2 # 10 455 8 441 441 295 37.0 2e-79 MDEVNKLGKILIVDDNEDVLFALNLLLEPYTEKIKVATTPDRIEHFMTTFQPDLILLDMN FSRDAISGQEGFESLKQILQIDPQAIVIFMTAYADTDKAVRAIKAGATDFIPKPWEKEKL LATLTSGMRLRQSQREVNILKEQVEALSGQSSPEGDIIGESPVMQEVFATINKLSSTDAN ILILGENGTGKDVIARLLYRCSPRYGKPFVTIDLGSIPEQLFESELFGFEKGAFTDAKKS KAGRMEVATNGTLFLDEIGNLSLPMQSKLLTAIEKRQISRLGSTQAVPIDVRLICATNAD IRRMVDEGSFRQDLLYRINTIEIHIPPLRERGNDIILLAEYFLDRYARKYKKEMRGLTRE AKNKLLKYTWPGNVRELQHTMERVVILGDGSLLKPENFQFHVTPKQKKEEEIVLNLEQLE RQTIEKAMKLSEGNISRAADYLGITRFALYRKLEKLGL >gi|226332277|gb|ACIC01000043.1| GENE 42 57540 - 58817 821 425 aa, chain + ## HITS:1 COG:NMA0160 KEGG:ns NR:ns ## COG: NMA0160 COG5000 # Protein_GI_number: 15793187 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase involved in nitrogen fixation and metabolism regulation # Organism: Neisseria meningitidis Z2491 # 21 425 267 700 706 116 25.0 8e-26 MRRFAFRVILHILLIVLFSIGSYLLFQKQLWFSTTISLILLITTAIHLYRMQFKQIALLR RLTDGLRYNDMMQTFHPPFKNKIMNEWAEELSDTLKDFRGRLLAEEIKHQYYENLLKKVD TAVLVADKAGHIEWMNQAAIIHLGQISQLPETLLKASVAHDTPVIRIEQNSTVLEMAISR TTFATQGREQQLISLKNIHSVLERNEMEAWQKLIRVLTHEIMNSITPIISLSETLSERGI PSQLGEKEYSVMLQAMQTIHRRSKGLLEFVENYRRLTRIPAPIRTQISIAELFTDLKKLF PEEEFQFEVPSPELKLNVDRTQIEQILINLLKNAREACSRKSDKKIQVKARKLSAGNTTL TISDNGEGILPDVLDKIFVPFFTTKTSGSGIGLSLCKQIMTLHEGSINVKSEVGKGSSFI LTFPK >gi|226332277|gb|ACIC01000043.1| GENE 43 59447 - 59641 282 64 aa, chain - ## HITS:1 COG:no KEGG:BT_0854 NR:ns ## KEGG: BT_0854 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 64 1 64 64 82 100.0 3e-15 
MKTEKQKETNEKEVHLLEQKGYERMVNEIVPVQQAETYRKPTQKAVKDAVKELNPDTNSL GSRG >gi|226332277|gb|ACIC01000043.1| GENE 44 59788 - 60525 545 245 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|239830964|ref|ZP_04679293.1| Ribosomal protein L11 methyltransferase [Ochrobactrum intermedium LMG 3301] # 1 245 1 244 245 214 42 1e-54 MKENKYDDDRFFSQYAQMSRSVEGLQGAGEWHILQKMLPDFTDKRVLDLGCGFGWHCIYA IEHGAKCVTGIDISGKMLEEAQKRNSSPLIEYKCMAIEDFDFQPDTYDIVISSLTFHYLE SFINICRKVNSCLTAGGSFVFSVEHPIFTAYGNQDWHYDQDGKRAHWPVDRYFSEGKRTA VFLGEEVVKYHKTLTTYINSLLQTGFEICELIEPQPSEMMLDTIPEMQDELRRPMMLLIS AKKKS >gi|226332277|gb|ACIC01000043.1| GENE 45 60886 - 61611 671 241 aa, chain - ## HITS:1 COG:YPO2709 KEGG:ns NR:ns ## COG: YPO2709 COG4123 # Protein_GI_number: 16122913 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Yersinia pestis # 10 239 20 251 252 200 42.0 2e-51 MANKMANPFFQFKQFTVWHDKCAMKVGTDGVLLGAWASVQGAYRILDIGTGTGLVALMLA QRSLPDANIVALEIDEAAAGQAKENVARSPWKDRIEVVKQDFLSYQSPDKFDVIVSNPPY FVDSLSCPDQQRSMARHNDSLTYEKLLKGVADLLKKEGTFTIVIPTDVADRVKTAASEYY LYATRQLNVITKPGGTPKRTLITFSFDNGGCAVEELLTEVARHQYSEEYKALTREYYLHL K >gi|226332277|gb|ACIC01000043.1| GENE 46 61814 - 64279 2447 821 aa, chain + ## HITS:1 COG:ECs0493 KEGG:ns NR:ns ## COG: ECs0493 COG0466 # Protein_GI_number: 15829747 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Escherichia coli O157:H7 # 40 805 11 774 784 662 45.0 0 MKERYLLEEEVGENGFSLITDYDGNDEHVFDVNIKSGDILPVLPLRNMVLFPGVFLPITV GRKASLKLVREAEKKHKDIAVICQRSAHTEDPKLEDLHNVGTVGRIVRVLEMPDQTTTVI LQGMKRLRLKDIVDTHPYLKGEVELLEEDVPNKDDKEFQALVETCKDLTMRYIKSSEMHQ DSSFAIKNISNPMFLINFICANLPFKKDEKMDLLSINSLRERTYHLLEVLNREVQLAEIK ASIQMRAREDIDQQQREYFLQQQIKTIQDELGGSGQEQEIEEMRLKAVKMHWNAEVRDTF LKELAKLERTHPQSPDFSVQLNYLQTMLNLPWGVYTTDNLNLKNAEKTLNKDHYGLEKVK ERILEHLAVLKLKGDMKSPIICLYGPPGVGKTSLGKSIASALKRKYVRMSLGGVHDEAEI RGHRKTYIGAMPGRIIKSLIKAGASNPVFILDEIDKVSADRQGDPSSALLEVLDPEQNTS 
FHDNFLDVDYDLSKVLFIATANNLNTIPGPLLDRMELIEVSGYITEEKVEIARKHLLPKE LEANGLKKTDIKIPKDTLEAIIESYTRESGVRELEKKIGKILRKSARQYATDGYFAKSEI KPTDLYEFLGAPEYTRDKYQGNDYAGVVTGLAWTAVGGEILFVETSLSRGKGARLTLTGN LGDVMKESAMLALEYIKAHASILHLNEDIFDNWNIHIHVPEGAIPKDGPSAGITMATSLA SALTQRKVKANLAMTGEITLRGKVLPVGGIKEKILAAKRAGIKEIIMSAENKKNIDEIQD IYLKGLQFHYVNDIKEVFAIALTDEKVADAIDLSVKKPSQE >gi|226332277|gb|ACIC01000043.1| GENE 47 64276 - 65406 907 376 aa, chain + ## HITS:1 COG:aq_1308 KEGG:ns NR:ns ## COG: aq_1308 COG0343 # Protein_GI_number: 15606515 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Queuine/archaeosine tRNA-ribosyltransferase # Organism: Aquifex aeolicus # 3 376 2 378 378 371 50.0 1e-102 MTFELQYTDTKSNARAGLITTDHGQIQTPIFMPVGTLGTVKGVHVTELKEDIQAQIILGN TYHLYLRPGLDVIEKAGGLHRFNGFDRPMLTDSGGFQVFSLAGIRKLREEGAEFRSHIDG SKHIFTPEKVMDIERTIGADIMMAFDECPPGDSDYAYAKKSLGLTHRWLDRCIQRFNETE PKYGYNQSLFPIVQGCVYQDLRKQSAEFVASKGADGNAIGGLAVGEPVDKMYEMIEVVNE ILPKDKPRYLMGVGTPVNILEGIERGVDMFDCVMPTRNGRNGMLFTKDGIINMRNKKWET DFSPIEADGASAVDTLYSKAYLRHLFHAQELLAMQIASIHNLAFYLWLVGEARKHIIAGD FSTWKPMMVKRVSTRL >gi|226332277|gb|ACIC01000043.1| GENE 48 65410 - 66507 1022 365 aa, chain + ## HITS:1 COG:FN1030 KEGG:ns NR:ns ## COG: FN1030 COG0795 # Protein_GI_number: 19704365 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Fusobacterium nucleatum # 6 363 1 361 363 103 27.0 6e-22 MKSNRLIKRLDWYIIKKFLGTYVFAIALIISIAVVFDFNEKMDKLMEHEAPWDKIIFEYY MNFIPYFSNLFSPLFVFIAVIFFTSKLAENSEIIAMFSTGMSFKRMMRPYMISAAIIAAA TFMMSSFIIPKGSVTRLNFEDKYIKPKKVNSVRNVQLEVDSGVIAYIDNYNDGMKTGNRF SLDKFVDKKLVSHLTARRITYDTTTVNKWTIHDYMVRELDGLKEKITKGDRIDSIINMDP SDFLIMKNQQEMLTSPELSEYIEKQKRRGFANIKEFEIEYHKRIAMSFASFILTIIGVSL SSRKTKGGMGLHLGIGLGLSFSYILFQTITSTFAINGNVPPAIAVWIPNILYAGIAFYLY QKAPK >gi|226332277|gb|ACIC01000043.1| GENE 49 66684 - 67190 351 168 aa, chain - ## HITS:1 COG:no KEGG:BT_0833 NR:ns ## KEGG: BT_0833 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 
168 1 168 168 325 98.0 4e-88 MNSYVKWIFVVLLICSLTAFVEKDKPTGGLSEGDVAPDFKIESASNGQPAFKLGNLKGKY VLLSFWASYDAQSRMQNVSLSNVLRSSRNVEMVSVSFDEYQSIFKETVRKDQIVTPTCFV ETEGEDSGLFKKYRLNRGFTNYLLDGNGVIIAKNISAADLSAYVNEIG >gi|226332277|gb|ACIC01000043.1| GENE 50 67459 - 68688 1406 409 aa, chain + ## HITS:1 COG:PA4960_2 KEGG:ns NR:ns ## COG: PA4960_2 COG0560 # Protein_GI_number: 15600153 # Func_class: E Amino acid transport and metabolism # Function: Phosphoserine phosphatase # Organism: Pseudomonas aeruginosa # 193 404 1 212 217 258 65.0 2e-68 MQPSNTELILIRVTGEDRPGLTASVTEILAKYDATILDIGQADIHNTLSLGILFRSEERH SGFIMKELLFKASSLGVTIRFEPIGTDQYENWVGMQGKNRYILTVLGRKLSARQISAATR VLAEQDLNIDAIKRLTGRIPLDEDKTDTRTRACIEFSVRGTPKDRIAMQERLMQLASEQE MDFSFQQDNMYRRMRRLICFDMDSTLIETEVIDELAIRAGVGDEVKAITESAMRGEIDFT ESFTRRVALLKGLDESVMQEIAESLPITEGVDRLMYVLKKYGYKIAILSGGFTYFGQYLQ KKYGIDYVYANELEIVDGKLTGRYLGDVVDGKRKAELLRLIAQVEKVDIAQTIAVGDGAN DLPMLGIAGLGIAFHAKPKVVANAKQSINTIGLDGVLYFLGFKDSYLNM >gi|226332277|gb|ACIC01000043.1| GENE 51 68736 - 70004 1140 422 aa, chain + ## HITS:1 COG:lin0859 KEGG:ns NR:ns ## COG: lin0859 COG0513 # Protein_GI_number: 16799933 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Listeria innocua # 2 373 3 369 516 286 39.0 5e-77 MKFSELKLNANVLDALDAMRFDECTPIQEQAIPIILEGKDLIAVAQTGTGKTAAFLLPIL NKLSEGGHPEDAINCVIMSPTRELAQQIDQQMEGFSYFMPASSVAVYGGNDGILFEQQKK GLTLGADVVIATPGRLIAHLSLGYVDLSKVSYFILDEADRMLDMGFYEDIMQIVKYLPKE RQTIMFSATMPAKIQQLAKTILNNPAEIKLAVSRPADKIIQAAYVCYENQKLGIIRSLFT DEVPERVIIFASSKIKVKEVTKALKMMKLNVGEMHSDLEQAQREVVMHEFKAGRINILVA TDIVARGIDIDDIRLVINFDVPHDSEDYVHRIGRTARANNDGVALTFISEKEQSNFKSIE NFLEKEIYKIPVPAELGEAPEYKPRSYNNKRGDSSKRRSFGGKRNNNNGKKGNNRPSSPA RN >gi|226332277|gb|ACIC01000043.1| GENE 52 70017 - 70814 589 265 aa, chain - ## HITS:1 COG:no KEGG:BT_0830 NR:ns ## KEGG: BT_0830 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 1 265 1 265 265 502 98.0 1e-141 MTKYVAISLVAILLAACNSSKNPHSSSGQEEDLSAKELLQGIWLDDETESPLMRIEGDTI YYADSQSAPITFKIIRDTLYTYGNDTTYYKIDKQGEHIFWFHSITDNMIRLHKSEDPNDS LSFVGQEMIIPTYTEVTKRDSIVNYDGSRYRAYVYINPSKMRVIKTIYTEDGISMDNVYY DNVMHICVYEGKKSLFASDITKQMFESVVPADFLVQAILSDTKFVKVDRNGFHYQAVLSI PESSIYSIANLTVSFSGKLAITPTK >gi|226332277|gb|ACIC01000043.1| GENE 53 70852 - 72165 1427 437 aa, chain - ## HITS:1 COG:XF1606 KEGG:ns NR:ns ## COG: XF1606 COG1004 # Protein_GI_number: 15838207 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Xylella fastidiosa 9a5c # 1 437 1 444 450 518 57.0 1e-146 MKIAIVGTGYVGLVTGTCFAEIGVNVTCVDTNSEKIESLQKGVIPIYENGLEEMVLRNVK AKRLKFTTSLESCLNDVEVIFSAVGTPPDEDGSADLSYVLEVARTIGRNMNQYKLVVTKS TVPVGTARRVRAAIQEELDKRGVNIEFDVASNPEFLKEGNAISDFMSPDRVVVGVESVRA EKLMSKLYKPFLLNNFRVIFMDIPSAEMTKYAANSMLATRISFMNDIANLCELVGADVNM VRSGIGSDTRIGRKFLYPGIGYGGSCFPKDVKALIKTAEQNGYTMRVLRAVEDVNEAQKG VLFEKLMKQFNGDLKGKTIALWGLAFKPETDDMREAPGLVLIDKLLKAGCQIRAYDPAAM NECKRRIGDVIYYARDMYDAVLDADVLMLITEWKEFRLPSWAVIKKTMNQQIVLDGRNIY DKKEMEELGFIYSCIGK >gi|226332277|gb|ACIC01000043.1| GENE 54 72185 - 72733 526 182 aa, chain - ## HITS:1 COG:MA3780 KEGG:ns NR:ns ## COG: MA3780 COG1898 # Protein_GI_number: 20092576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 1 169 1 170 183 202 60.0 3e-52 MNYIQTEIDGVWLIEPNVYSDERGYFMEAFKKEEFEAKIGPVDFVQDNESKSSFGVLRGL HYQKGEYSQAKLVRVLKGKVLDVAVDLRRSSPTFGKHVSALLSEENKRQFFIPRGFAHGF VVLSEEAVFMYKVDNKYAPQAEASIVYNDETLGIDWMLGESQMLLSPKDKEGMAFREAVY FE >gi|226332277|gb|ACIC01000043.1| GENE 55 72823 - 73917 1023 364 aa, chain + ## HITS:1 COG:no KEGG:BT_0827 NR:ns ## KEGG: BT_0827 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 364 1 365 365 624 97.0 1e-177 MIELAQHIEALLLENDCVIVPNFGGFVAHYAPATYVKEENLFLPPTRIIGFNSQLKLNDG 
VLVQSYMSAHDTSFADATRMVEKEVNAFVEILHEEGKADLENVGEIRYNIYGNYEFTPYD HKITTPSLYGLDSFEMRELSALQRKERILVPASLTKEKKTYEISISRTLLRNAAAMIAVI VMFFAFSTPVENTYVEKNNYAQLLPAELFEQIEKQSVAVTPVSVETKHNQKNATGPKKVT ADKSRTSRPIAVKEVKVAKQETTPPAPTATIPATPTVTAPATPAVQPQANHPFHIIVAGG IGLKDAEAMAEQLKAKGFAEAKALNSDGKVRVSIRSFGNREEATKQLLELRKNETYKNAW LLAK >gi|226332277|gb|ACIC01000043.1| GENE 56 73933 - 74466 509 177 aa, chain + ## HITS:1 COG:no KEGG:BT_0826 NR:ns ## KEGG: BT_0826 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 177 1 177 177 328 99.0 5e-89 MKRVLCPKCENYLFFDETKYSEGQSLVFECEHCGKQFSIRLGKSKVKALRKEENLEEEAE AHKEEFGYITVIENVFGFKQVLPLQEGDNVIGRRCVGTVINTPIESGDMSMDRRHCIINV KRNKQGKLVYTLRDAPSLTGTFLMNEILGDKDRVRMEDGAIVTIGATTFILHTAEAE >gi|226332277|gb|ACIC01000043.1| GENE 57 74580 - 76019 1532 479 aa, chain - ## HITS:1 COG:CAC0695 KEGG:ns NR:ns ## COG: CAC0695 COG0246 # Protein_GI_number: 15893983 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Clostridium acetobutylicum # 4 479 3 482 482 498 52.0 1e-140 MKALNKETAPKVQRPERIIQFGEGNFLRAFVDWIIYNMNQKTDFNSSVVVVQPIDKGMVD MLNAQDDLYHVNLQGLDKGEVVNSLTMIDVISRALNPYTQNDEFMKLAEQPEMRFVISNT TEAGIAFDPTCKLEDAPASSYPGKLTQLLYHRFKTFNGDKTKGLIIFPCELIFLNGHKLK ETIYQYIDLWNLGNEFKTWFEEACGVYATLVDRIVPGFPRKDIAAIKEKIQYDDNLVVQA EIFHLWVIEAPQEVAKEFPADKAGLNVLFVPSEAPYHERKVTLLNGPHTVLSPVAYLSGV NIVRDACQHEVIGKYIHKVMFDELMETLNLPKEELKKFAEDVLERFNNPFVDHAVTSIML NSFPKYETRDLPGLKTYLERKGELPKGLVLGLAAIITYYKGGVRADGAEIVPNDAPEIMN LLKELWATGCTKKVTEGVLGAEFIWGEDLNKIPGLAAAVKADLDSIQEKGMLETVKGIL >gi|226332277|gb|ACIC01000043.1| GENE 58 76046 - 76981 846 311 aa, chain - ## HITS:1 COG:no KEGG:BT_0824 NR:ns ## KEGG: BT_0824 # Name: not_defined # Def: LacI family transcription regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 25 311 56 342 342 575 98.0 1e-163 MKFTTSLMYLLSDWLPRRNILSFRLIPYYIEHDYWHSVVGGIERARQELRPFNVNVDYLC YHQGDKESYQEVCRRVEESNVDAVLIAPNFRNETLALTDYLQAKKIAYAFIDFNMEEANA 
LTYIGQDSYKSGYIAAKILMRNYQAGEGQELVLFLSNNKNSPAEIQMQRRLAGFMSFITE EYEKLVIHEVVLNKSNQENNQKTLDEFFRLHPKATLGVVFNSRVYQLGEYLRSAGRSMKG LIGYDLLKANVELLKSGDVHYLIGQRPGLQGYCGVKALCDHVVFKKSVEPVKYMPIDIMI KENIDFYFEFV >gi|226332277|gb|ACIC01000043.1| GENE 59 77314 - 78720 1420 468 aa, chain + ## HITS:1 COG:uxaC KEGG:ns NR:ns ## COG: uxaC COG1904 # Protein_GI_number: 16130987 # Func_class: G Carbohydrate transport and metabolism # Function: Glucuronate isomerase # Organism: Escherichia coli K12 # 1 466 1 465 470 548 56.0 1e-156 MKNFMDENFLLQTETAQKLYHEHAAKMPIIDYHCHLIPQMVADDYKFKSLTEIWLGGDHY KWRAMRTNGVDERYCTGKDTTDWEKFEKWAETVPYTFRNPLYHWTHLELKTAFGIDKILS PKTAREIYDECNEKLAQPEYSARGMMRRYHVEVVCTTDDPIDSLEYHIQTRESGFEIKML PTWRPDKAMAVEVPADFRAYVEKLSAVSGVTISNFDDMIAALRKRHDFFAEQGCRLSDHG IEEFYAEDYTDAEIKAIFNKVYGGAELTKEEILKFKSAMLVIFGEMDWEKGWTQQFHYGA IRNNNTKMFKLLGPDTGFDSIGEFTTAKAMSKFLDRLNVNGKLTKTILYNLNPCANEVIA TMLGNFQDGSIAGKIQFGSGWWFLDQKDGMEKQMNALSVLGLLSRFVGMLTDSRSFLSYP RHEYFRRTLCNLVGRDVENGEIPVSEMDRVNQMIEDISYNNAKNFFKF >gi|226332277|gb|ACIC01000043.1| GENE 60 79778 - 80581 794 267 aa, chain - ## HITS:1 COG:CAC3538 KEGG:ns NR:ns ## COG: CAC3538 COG1235 # Protein_GI_number: 15896774 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Clostridium acetobutylicum # 3 266 1 261 261 174 38.0 2e-43 MKVRFISLASGSSGNCYYLGTDTYGILIDAGIGIRTIKKTLKDFNILMESIRAVFVTHDH ADHIKAVGHLGEKMNIPVYTTARIHEGINRSYCMTEKLSTSVRYLEKQVPMMLEDFRIES FEVPHDGTDNVGYCIEIDGKVFSFLTDLGEITPTAADYISKANYLILEANYDEEMLKMGP YPQYLKERIMSRTGHMSNSATAEFLAENITEHLRYIWLCHLSKDNNHPELAYKTVEWKLK NKGVIVGKDVQLLALKRNTPSELYVFE >gi|226332277|gb|ACIC01000043.1| GENE 61 80665 - 82053 1267 462 aa, chain - ## HITS:1 COG:no KEGG:BT_0821 NR:ns ## KEGG: BT_0821 # Name: not_defined # Def: putative permease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 462 1 462 462 804 99.0 0 MNNSPQRAAKGFTRAFWVSNTVELFERMAYYAVFIVLTIYLSSILGFNDFEASMISGLFS GGLYLLPIFSGAYADKIGFRKSMIIAFSLLSIGYLGLGVFPTLLEAAGLVSYGATTRFNG 
LPDSSSRWIIVPILFILMIGGSFIKSIISASVAKETTEATRARGYSIFYMMVNVGAFTGK TIIDPLRNVIGEQAYIYINYFSGAMTVIALLAVILLYKSTHTAGEGKSLREIGQGFMRIM TNWRLLILILIVTGFWMVQQQLYATMPKYVIRLAGETAKPGWIANVNPFVVVCCVSFITR LMAKRSAITSMNVGMFLIPFSALLMACGNILGNDIISGMSNITLMMVAGIVVQALAECFI SPRYLEYFSLQAPKGEEGMYLGFSHLHSFLSSIFGFGLAGILLTKYCPDPALFETRAAWE AASANAHYIWYYFAAIGLIAAIALLLFAKITESIDKKKKSSR >gi|226332277|gb|ACIC01000043.1| GENE 62 82159 - 82578 364 139 aa, chain + ## HITS:1 COG:no KEGG:BT_0820 NR:ns ## KEGG: BT_0820 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 139 1 139 139 266 100.0 1e-70 MTIAVDFDGTIVEHRYPRIGEEIPFAVETLKLLQQEKHRLILWSVREGELLDEAIEWCRA RGLEFYAANKDYPEEERDHQGFSRKLKADLFIDDRNVGGIPDWGIIYEMIKEKKTFADIY SQQREENTSQKKKRKWLPF >gi|226332277|gb|ACIC01000043.1| GENE 63 82687 - 82938 253 83 aa, chain - ## HITS:1 COG:no KEGG:BT_0819 NR:ns ## KEGG: BT_0819 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 82 1 82 82 138 98.0 7e-32 MAQHTYDEESVQELLGWAKKMLETKNYPTEKYQVNACTSIIDGKLYLESLISMISKNWEN PTFHPTIEQLWEYREKWEGGKEE >gi|226332277|gb|ACIC01000043.1| GENE 64 83241 - 85967 1869 908 aa, chain + ## HITS:1 COG:XF1114_1 KEGG:ns NR:ns ## COG: XF1114_1 COG0642 # Protein_GI_number: 15837716 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Xylella fastidiosa 9a5c # 536 764 122 360 360 157 36.0 7e-38 MNPSETEPTQNAMSSSVPETFPLTKDQLLFAESIYACLPMSIEIYDTNGILRSINEHALQ MYGVTDKTTVVNIVNLFNSPYVDETLKNRIQRGEDIVLEFEYDFDRIKDNAYFSTQNKNT IIYEAKVVPLRSKEGEIIGHILLSNDVTAIKESEFRTDESKKNLEMAMEAASMSSWVYDV YKKTFNPLHGDPVAKNNTTLDQILNILHPQDHEPLIQLLSQLTNKEILQGNITLRFYNEE EKQYRYYESMMRLSFKHFGKLQIIGTQLDITERMQMVKKTQELIAKRELAMKVSNIIHWD FDVRNQEFEAYNDPINDYASDKLVTIQRYMDVIHPEDRSSSYDAIQSMLSGKEFTINFTC RMQTKYDDSWQYCNIIGVPFEKDEYGNNVRYTGFRQNISKLHQLNEELEERNYKMQLTFK TVGMSYWDFDAKSKQFRAFNDPVNDFHSENVITLEEYLHAVHPEDIDMVRENINYMINRT TKEVNFKFRSKTKWDEEWQTLIVTGIPVERNKKGNVIRYTGIKVNNTKWEKMAQQLKELK 
DKAELSDRLKSAFLANMSHEIRTPLNAIVGFSELMVYSEDPAEKEEYMSIIQSNNELLLR LINDILDLSKIESGILERKRETFNLAKVCGELYTMIQPKITNPDVEFQMANSGPDCWIFL DRNRLKQVWMNYLTNAIKCTKSGHIKMGYGIERGGIRIYVEDSGVGIPIELQERVFGRFQ KLNEFAQGTGLGLAISRAIIEGAGGEVGFTSTPDVGSTFWAWIPCEISIQEEDSPTVSQP SKQQTSLNEIDKKELKILIAEDNDSNYTLVQHILNNYNLTHVQNGVEAVNKVREEEFDLI LMDMKMPVMGGLEATRKIREFNQKIPIIALTANAFDSDRVSALDAGCNEFLAKPIKKSQL LELFSKKW >gi|226332277|gb|ACIC01000043.1| GENE 65 86141 - 86441 252 100 aa, chain + ## HITS:1 COG:VC1962 KEGG:ns NR:ns ## COG: VC1962 COG3015 # Protein_GI_number: 15641964 # Func_class: M Cell wall/membrane/envelope biogenesis; P Inorganic ion transport and metabolism # Function: Uncharacterized lipoprotein NlpE involved in copper resistance # Organism: Vibrio cholerae # 40 95 56 112 163 62 50.0 2e-10 MRKIIILVCSCFLLAACGNSAKTNNSTSADSTATEVVDIHNAETSLDYEGTYKGVFPAAD CPGIETTLTLNPDKTFTLHSVYIDRDSSFDEKGTYTLEGN Prediction of potential genes in microbial genomes Time: Thu May 12 00:30:04 2011 Seq name: gi|226332276|gb|ACIC01000044.1| Bacteroides sp. 1_1_6 cont1.44, whole genome shotgun sequence Length of sequence - 22223 bp Number of predicted genes - 23, with homology - 23 Number of transcription units - 13, operones - 4 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 124 89 ## BT_0817 lipoprotein, function unknown + Term 151 - 208 15.1 + Prom 201 - 260 8.4 2 2 Op 1 . + CDS 292 - 729 245 ## COG2259 Predicted membrane protein + Prom 756 - 815 7.0 3 2 Op 2 . + CDS 835 - 1386 479 ## BT_0815 hypothetical protein + Term 1410 - 1453 6.1 - Term 1392 - 1448 13.4 4 3 Tu 1 . - CDS 1463 - 3781 1836 ## BT_0814 putative outer membrane protein - Prom 3844 - 3903 4.9 - Term 3859 - 3894 -1.0 5 4 Tu 1 . - CDS 4000 - 5091 558 ## BT_0813 TonB 6 5 Op 1 . - CDS 5165 - 5323 174 ## BT_0813 TonB 7 5 Op 2 . - CDS 5336 - 5698 322 ## BT_0812 putative transcriptional regulator - Prom 5905 - 5964 6.5 + Prom 5746 - 5805 6.7 8 6 Tu 1 . 
+ CDS 5890 - 6648 819 ## COG0566 rRNA methylases + Term 6763 - 6813 13.1 - Term 6842 - 6881 -0.9 9 7 Tu 1 . - CDS 6981 - 7958 782 ## BT_0809 hypothetical protein - Prom 7986 - 8045 3.4 10 8 Tu 1 . - CDS 8053 - 8709 598 ## BF2277 lipoprotein signal peptidase - Term 8721 - 8782 12.1 11 9 Op 1 . - CDS 8807 - 9187 524 ## BT_0807 DnaK suppressor protein, putative 12 9 Op 2 . - CDS 9224 - 12598 3288 ## COG0060 Isoleucyl-tRNA synthetase 13 9 Op 3 . - CDS 12555 - 12710 152 ## BT_0806 isoleucyl-tRNA synthetase (EC:6.1.1.5) - Prom 12731 - 12790 11.2 + Prom 12695 - 12754 6.5 14 10 Tu 1 . + CDS 12922 - 13971 710 ## BT_0805 hypothetical protein 15 11 Tu 1 . - CDS 14128 - 15666 1302 ## COG0388 Predicted amidohydrolase - Prom 15691 - 15750 5.4 16 12 Op 1 1/0.000 - CDS 15810 - 16871 570 ## COG0798 Arsenite efflux pump ACR3 and related permeases 17 12 Op 2 . - CDS 16856 - 18568 1166 ## COG0003 Oxyanion-translocating ATPase 18 12 Op 3 . - CDS 18597 - 18923 358 ## BT_0801 arsenical resistance operon trans-acting repressor 19 12 Op 4 . - CDS 18954 - 19661 341 ## BT_0800 hypothetical protein 20 12 Op 5 . - CDS 19663 - 20106 384 ## BT_0799 hypothetical protein 21 12 Op 6 . - CDS 20117 - 20350 276 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Term 20392 - 20427 -0.5 22 12 Op 7 . - CDS 20428 - 20754 252 ## COG0640 Predicted transcriptional regulators - Prom 20835 - 20894 5.5 23 13 Tu 1 . 
- CDS 20902 - 22221 963 ## BT_0796 hypothetical protein Predicted protein(s) >gi|226332276|gb|ACIC01000044.1| GENE 1 2 - 124 89 40 aa, chain + ## HITS:1 COG:no KEGG:BT_0817 NR:ns ## KEGG: BT_0817 # Name: not_defined # Def: lipoprotein, function unknown # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 40 105 144 144 74 97.0 9e-13 KEEGGEISYYKVEENKVRLLNDDKQEITGALAEHYILSKE >gi|226332276|gb|ACIC01000044.1| GENE 2 292 - 729 245 145 aa, chain + ## HITS:1 COG:RSc0240 KEGG:ns NR:ns ## COG: RSc0240 COG2259 # Protein_GI_number: 17544959 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Ralstonia solanacearum # 28 131 38 141 145 60 36.0 1e-09 MIYNFLFPTKPNTTKVSLFLLAARIIFGILLMNHGIQKWSNFQELSTAFPDPIGLGSSIS LGLAIFGELVCSMGFIFGFLYRLAMIPMIFTMIVAFFVIHANDVFAVKELAFIYLVVFVL MYIAGPGKFSIDHFIGNKLAHNKRK >gi|226332276|gb|ACIC01000044.1| GENE 3 835 - 1386 479 183 aa, chain + ## HITS:1 COG:no KEGG:BT_0815 NR:ns ## KEGG: BT_0815 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 183 1 183 183 325 100.0 5e-88 MKKSKLTLVAVVLSGSLLFSSCVGSFGLFNRLSSWNQGVGNKFVNELVFLAFNIVPVYGV AYLADALVINSIEFWSGSNPMANVGDVKKVKGENGNYMVKTLENGYSITKEGETASMDLI YNKEANTWNVVANGESSELLKMNNDGTADLFLPNGEKMNVTLDAQGMMAARQATMSNLMF AAR >gi|226332276|gb|ACIC01000044.1| GENE 4 1463 - 3781 1836 772 aa, chain - ## HITS:1 COG:no KEGG:BT_0814 NR:ns ## KEGG: BT_0814 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 772 1 772 772 1503 99.0 0 MRKGFLYLFTFAIIILSLASCTATKYVPDGSYLLDEVKIHTDQKNVRPSSLRMYVRQNPN AKWFSLIKTQLYVYNLSGRDSTKWGNKFLRRIGDAPVIYSETEAQRSQDEITKAMRNMGY MAATVKRSTKIKKKKIKLYYEVTAGDPYIVNSLKYDIRDPKIADFLKQDSAKSLLKEGMI FDVNVLDAERQRITDKLLRNGYYKFNKDYVGYTADTVRGTYQVDLTLHLHAYRAHVSDSV KAHQQYWINKINFITDYDVLQSSALNSMDINDSLHFKGYPIYYKDKLYLRPKMLTDNLRF APGDLFDEQDVQQTYSNFGRLSALKYTNIRFLENQVGDTAKLDCYVMLTKSKHKSVAFEL EGTNSAGDLGAAASVSFQNRNLFRGSETFMIKFRGAYEVISGLQAGYSNNNYTEYGVETS 
INFPNFLFPFLSSDYKRKIRATTEFGLQYNYQMRPEFLRTMASASWSYKWTQRQKIQHRI DLINIGFLYLPRISERFKDDFINKGQSHILQYNYQDRLIINMGYSYHYNSIGGAIINNTI ASNSYSIRFNFESAGNIMYALSKATNIRKNSDGEYAILGIPYAQYVKGEFDFSKNIRIDH RNSFAFHIGMGIAVPYGNAKTIPFEKQYFSGGANSVRGWTVRDLGPGSFVRDENTNLLDQ SGDIKLDASIEYRSKLFWKFQGAIFVDAGNIWTIRDYDNQPGGVFKFDKFYKQIAVAYGL GLRLDLDFFILRFDGGMKALNPVYEKGKDRYPIIHPKFSRDFAFHFAVGYPF >gi|226332276|gb|ACIC01000044.1| GENE 5 4000 - 5091 558 363 aa, chain - ## HITS:1 COG:no KEGG:BT_0813 NR:ns ## KEGG: BT_0813 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 363 78 441 443 697 93.0 0 MLVTTAVPVTTTEVEGTTWQAGTWLWIIYGLGIGMLLLRNLLEVSKIHHSLACSRRYSLK GVPVYQSEDVGEPCSFFHWIFINPMQYSDKEINEILIHEQTHVRELHSLDIILVQLVILL CWFNPFSWLIRSEIRMNHEYLADKQVVTSGYDKKSYQYHLLGIKHTSLAAANFYNNFSVL PLKKRIKMLNRKRTHNIMVGKYLMFIPVAALLLLFSNCANKKADKAQSDTEKADTIVAAE PDKTAEPQVEVAITEAKGDSIYSVVETMPDFPGGQKELLSFLSRNIKYPTKAIENKIQGR VIVQFVVNKDGSISGAKVVRSVDPDLDKEALRVINSMPQWKPGMNKGEIVSVKYTIPVMF RLQ >gi|226332276|gb|ACIC01000044.1| GENE 6 5165 - 5323 174 52 aa, chain - ## HITS:1 COG:no KEGG:BT_0813 NR:ns ## KEGG: BT_0813 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 52 1 52 443 95 96.0 4e-19 MESFTIYLIKVNVALIVLYAFYKLSFSKDTFFRLRRIMLLLICVTSLIYPLI >gi|226332276|gb|ACIC01000044.1| GENE 7 5336 - 5698 322 120 aa, chain - ## HITS:1 COG:no KEGG:BT_0812 NR:ns ## KEGG: BT_0812 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 120 1 120 120 210 100.0 1e-53 MERLTQQEEEVMIYFWKMGPSFIREIVNEMPEPKPPYTSVASVVRNLEKKKFLSPFKLGN SIQYHVLVKESDYKRSFMNGVVSNYFTGSYKEMVSFFVRDRKISKKELEDLINIIEDEES >gi|226332276|gb|ACIC01000044.1| GENE 8 5890 - 6648 819 252 aa, chain + ## HITS:1 COG:VC0803 KEGG:ns NR:ns ## COG: VC0803 COG0566 # Protein_GI_number: 15640821 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Vibrio cholerae # 1 248 14 255 257 165 38.0 7e-41 
MLSKNKIKYIHSLELKKTRKEEQVFLAEGPKLTGDLLGHFPCRFLAATSSWLQAHPDIQA NEIAEVSQEELSRASLQKTPQQVLAVFEQPQYTLSPDFVRQSLCLALDDIQDPGNLGTII RLADWFGIEHIICSQNTVDVYNPKTIQATMGGIARVKVHYTSLPEFIHSLGDAPVFGTFL DGENMYEQPLSTNGLIVMGNEGNGIGKEVEALINRKLYIPNYPQGKETSESLNVAIATAV ICAEFRRQAAWK >gi|226332276|gb|ACIC01000044.1| GENE 9 6981 - 7958 782 325 aa, chain - ## HITS:1 COG:no KEGG:BT_0809 NR:ns ## KEGG: BT_0809 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 325 36 360 360 577 97.0 1e-163 MENLLYDYHLAKSMGDNLPYNENYKKTLYLDAVFKKYGTTEAVFDSSLVWYTRNTEVLSK IYEKVSKRLKAQQNEINHLIALRDNKPKMSEPGDSIDVWPWKRLIRLTGEMMNDQYAFVL PADTNYKDRDTLVWEVRYHFLEPMLADSLRAVTMAMQVIYEKDTLNRLETITDPGVHQLR LYADTLGVMKEIKGFIYYPGVQGEKAGALLADRFSLTRYHCSDSLSVAERDSINQLEALK TDSLKKVSDQKDNKISKDSLQKVDDLKDNQRISPEEMNRRRTVTPEKKPEQIEVEQHIQQ EKIEQRRERQLNQRRNQQRRQSPSR >gi|226332276|gb|ACIC01000044.1| GENE 10 8053 - 8709 598 218 aa, chain - ## HITS:1 COG:no KEGG:BF2277 NR:ns ## KEGG: BF2277 # Name: not_defined # Def: lipoprotein signal peptidase # Organism: B.fragilis # Pathway: Protein export [PATH:bfr03060] # 1 204 1 204 210 359 92.0 5e-98 MKKLFTKGRIALLVIFSVLIIDQIIKIWIKTHMYWHESIRVTDWFYIYFTENNGMAFGME IFGKLFLTTFRIVAVGLIGYYLYKIIKKGFKTGYIVCVALILTGALGNIIDSVFYGVFFN ESTHSQIASFMPEGGGYSTWFYGKVVDMFYFPIIDTNWPTWMPFVGGEHFIFFSPIFNFA DAAISCGIIALLLFYSKYLNESYHSLDKNKKESVDHEE >gi|226332276|gb|ACIC01000044.1| GENE 11 8807 - 9187 524 126 aa, chain - ## HITS:1 COG:no KEGG:BT_0807 NR:ns ## KEGG: BT_0807 # Name: not_defined # Def: DnaK suppressor protein, putative # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 126 1 126 126 216 100.0 3e-55 MAEKTRYSDAELEEFRAIIMEKLELAQRDYDQLKMSLMGLDGNDTDDTSPTYKVLEEGAN TLSKEETTRLAQRQLKFIQGLQAALVRIENKTYGICRETGKLIPAERLRAVPHATLSIEA KNSGKK >gi|226332276|gb|ACIC01000044.1| GENE 12 9224 - 12598 3288 1124 aa, chain - ## HITS:1 COG:CAC3038 KEGG:ns NR:ns ## COG: CAC3038 COG0060 # Protein_GI_number: 15896289 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 
Isoleucyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 8 1124 44 1034 1035 811 40.0 0 MAALRLCFQGPPSANGMPGIHHVMARTIKDIFCRYKTMKGYQVKRKAGWDTHGLPVELSV EKALGITKEDIGKKISVADYNAACRKDVMKYTKEWEDLTHQMGYWVDMKHPYITYDNRYI ETLWWLLKQLHKKGLLYKGYTIQPYSPAAGTGLSSHELNQPGCYRDVKDTTAVAQFKMKN PKPEMAEWGTPYFLAWTTTPWTLPSNTALCVGPKIDYVAVQTYNAYTGEPITVVLAKALL NTHFNPKAADLKLEDYKAGDKLVPFKVIAEYKGADLIGMEYEQLIPWVKPVEVSEDGSWK VSDKGFRVIPGDYVTTEDGTGIVHIAPTFGADDANVARAAGIPSLFMINKKGETRPMVDL TGKFYILDELDENFVKECVDVDKYKEYQGAWVKNAYDPQFMVDGKYDEKAAQTAESLDVA LCMMMKANNQAFKIEKHVHNYPHCWRTDKPVLYYPLDSWFIRSTACKERMMELNKTINWK PESTGTGRFGKWLENLNDWNLSRSRYWGTPLPIWRTEDGTSEICIESVEELYNEIEKSVA AGFMKSNPYKDKGFVPGEYTEENYDKIDLHRPYVDDVILVSEDGQPMKRESDLIDVWFDS GAMPYAQIHYPFENKDILDNREVYPADFIAEGVDQTRGWFFTLHAIATMVFDSVSYKAVI SNGLVLDKNGNKMSKRLNNAVDPFTTIEKYGSDPLRWYMITNSSPWDNLKFDVEGIEEVR RKFFGTLYNTYSFFALYANVDGFEYKEADVPMAERPEIDRWILSVLNTLIKEVDTCYNEY EPTKAGRLISDFVNDNLSNWYVRLNRKRFWGGEFTQDKLSAYQTLYTCLETVAKLMAPIS PFYSDRLYTDLITATGRDNVVSIHLAEFPKYQEEMIDKELEARMQMAQDVTSMVLALRRK VNIKVRQPLQCIMIPVVDEEQKVHIEAVKALIMNEVNVKDIKFVDGDAGVLVKKVKCDFK KLGPKFGKQMKAVAAAVAEMSQEAIAELEKNGKYTLNLDGAEAVIEAADVEIFSEDIPGW LVANEGKLTVALEVTVTEELRREGIARELVNRIQNIRKSSGFEITDKIKITISKNTQTDD AVNEYNTYICNQVLGTSLDLADEVKDGTELNFDDFSLFVNVIKD >gi|226332276|gb|ACIC01000044.1| GENE 13 12555 - 12710 152 51 aa, chain - ## HITS:1 COG:no KEGG:BT_0806 NR:ns ## KEGG: BT_0806 # Name: ileS # Def: isoleucyl-tRNA synthetase (EC:6.1.1.5) # Organism: B.thetaiotaomicron # Pathway: Valine, leucine and isoleucine biosynthesis [PATH:bth00290]; Aminoacyl-tRNA biosynthesis [PATH:bth00970] # 1 45 1 45 1162 99 100.0 4e-20 MGKRFTEYSQFDLSQVNKDVLKKWDENQVFAKSMTERDGCPSFVFSRTSLG >gi|226332276|gb|ACIC01000044.1| GENE 14 12922 - 13971 710 349 aa, chain + ## HITS:1 COG:no KEGG:BT_0805 NR:ns ## KEGG: BT_0805 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 349 1 349 349 601 98.0 1e-170 MNYGISILFRAIPLAMAVFCFGYGAFIYGYGDAGNRLVAGPVVFSLGMICIALFCTAATI 
IRQIIHTYNEATKYVLPVVGYLAAIITIIGGICIFNAATTTSAFVAGHVITGVGFITACV ATAATSSTRFSLIPANAKATGNEVPEGAFSIGQRRAMIFLAIVISCIAWIWAFILLSNSH SHPAYFVAGHVMVGLACICTSLIALVATIARQVRNDYSERERNKWPKLVLLMGSISFVWG IFVILADSGSANGTTGYIMLGLGLVCYSISSKVILLAKIWRREFKLANRIPMIPVLTALT CLFLAAFVFELATVNADYFIPARVLVGLGAICFTLFSIVSILESGTSSK >gi|226332276|gb|ACIC01000044.1| GENE 15 14128 - 15666 1302 512 aa, chain - ## HITS:1 COG:BH1089_2 KEGG:ns NR:ns ## COG: BH1089_2 COG0388 # Protein_GI_number: 15613652 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Bacillus halodurans # 202 512 3 313 313 323 48.0 4e-88 MEQHPIKINKVQIRNLQIEDYAQLSQSFTRVYSDGSDVFWTHAQIKKLISIFPEGQIVTV VDDKIVGCALSIIVDYDKVKNDHTYAQVTGKETFNTHNPEGNILYGIEVFIHPGYRGLRL ARRMYEYRKELCETLNLKAIMFGGRIPNYHKYADKMRPKEYIARVRQREIYDPVLTFQLS NDFHVRKVMTNYLPNDEESKHYACLLQWDNIYYQPPTQEYINPKTTVRVGLVQWQMRSYK TLDDLFEQVEFFVDAVSDYKSDFVLFPEYFNAPLMSKFNDKGESQAIRGLAKYTDEIRDR FINLAISYNINIITGSMPYVKEDGLLYNVGFLCRRDGTYEMYEKMHVTPDEIKSWGLSGG KLLQTFDTDCAKVGVLICYDVEFPELSRLMADQGMQILFVPFLTDTQNAYSRVRVCAQAR AIENECFVVIAGCVGNLPRVHNMDIQYAQSGVFTPCDFAFPTDGKRAEATPNTEMILVSD VDLDLLNALHTYGSVRNLKDRRNDVYEVRFKK >gi|226332276|gb|ACIC01000044.1| GENE 16 15810 - 16871 570 353 aa, chain - ## HITS:1 COG:pli0038 KEGG:ns NR:ns ## COG: pli0038 COG0798 # Protein_GI_number: 18450320 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenite efflux pump ACR3 and related permeases # Organism: Listeria innocua # 2 344 4 346 352 421 72.0 1e-118 MEKYIGIGFFERYLTVWVTLCIVVGIAIGQWFPAISQTLSKLEYANVSIPVAILIWLMIY PMMLKVDFQSIKNVGKRPKGIIITCITNWLIKPFTMFGIAYLFFYVVFKSLISAELAEEY LAGAVLLGTAPCTAMVFVWSYLTKGDAAYTLVQVAVNDLIILVAFAPIVAFLLGIGGVTI PWDTLLLSVVLFVVIPLSAGIITRILVIQRKGIEYFNTVFIRKFDNYTIGGLLLTLIILF SFQGETILNNPLHIVLIAVPLVLQTVLIFFVAYSWAKWWKLPHSIAAPAGMIGASNFFEL SVAVAISLFGLQSGAALATVVGVLVEVPVMLMLVRVANNTCSWFTIKSSMSNK >gi|226332276|gb|ACIC01000044.1| GENE 17 16856 - 18568 1166 570 aa, chain - ## HITS:1 COG:pli0037 KEGG:ns NR:ns ## COG: pli0037 COG0003 # Protein_GI_number: 
18450319 # Func_class: P Inorganic ion transport and metabolism # Function: Oxyanion-translocating ATPase # Organism: Listeria innocua # 1 567 1 565 580 645 60.0 0 MKAFNLSDIALTKYLFFTGKGGVGKTSIACATAVGLADMGKKILLISTDPASNLQDVFDQ SLNGHGTAISEVPGLTVVNLDPEQAAAEYRESVIAPFRGKLPESVIQNMEEQLSGSCTVE IAAFNEFSDFITDAVKAKEYDHIIFDTAPIGHTLRMLQLPSAWSTFISESAHGASCLGQL SGLEERKGIYKQAVETLSDTSATRLVLVSRPEIAPLKEAARSSHELQLLGIKNQLLVING ILQQLNEADDVSRQLHNRQQKALQGMPAELSEYPMYSVPLRSYNLSDIANIRRMLYSDSL TDDICYQPISDAKSIDDLVNDLYTSGKRVVFTMGKGGVGKTTLATEIALKLIKLGAKVHL TTTDPANHLNYDIAIKSGITVSHIDEAEVLENYKNEVRSKAAETMTAEDMEYIEEDLRSP CTQEIAVFKAFAEIVDKADNEIVVIDTAPTGHTLLLLDATQSYHREVERTQGAVTGAVAN LLPRLRNPKETEVVIVTLPEATPVFEAERLQMDLQRAGINNKWWAVNACLSMTNTENTFL QAKAQNEVNWIEKVEQLSKGNAALIGWKNI >gi|226332276|gb|ACIC01000044.1| GENE 18 18597 - 18923 358 108 aa, chain - ## HITS:1 COG:no KEGG:BT_0801 NR:ns ## KEGG: BT_0801 # Name: not_defined # Def: arsenical resistance operon trans-acting repressor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 108 1 108 108 208 98.0 4e-53 MKKIEIFDPAMCCPTGLCGTNINPELMRIAVVVETLKRQGVIVIRHNLRDEPQVYVSNKT VNEYLQKNGAEALPITLVDGEIAVSKVYPTTKQMSEWTGVNLDLMPAK >gi|226332276|gb|ACIC01000044.1| GENE 19 18954 - 19661 341 235 aa, chain - ## HITS:1 COG:no KEGG:BT_0800 NR:ns ## KEGG: BT_0800 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 235 1 235 235 381 98.0 1e-104 MDFLQSLLDNSSVPVITAFILGILTAISPCPLATNITAIGFIGKDIENRHRIFINGLLYT FGRIVTYTVLGFILIPVLREGASMYMVQKVVSKYGEMLIAPVLIIIGIFMLDIIKLNIPK INIGGEGLKKNIKGSWGALLLGILFALAFCPTSGVFYFGILMPLAAAETGGYFLPVIYAI ATGVPVILVAWILAYSVAGLGRFYNSVQIFEKWFRKIVAILFIVIGIYYAVVFYF >gi|226332276|gb|ACIC01000044.1| GENE 20 19663 - 20106 384 147 aa, chain - ## HITS:1 COG:no KEGG:BT_0799 NR:ns ## KEGG: BT_0799 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 147 1 147 147 279 98.0 2e-74 MKKFFYLLFSTIILISCGNGAKAKTEAQNTEEKLPDHIEVLYFHGAQRCITCRAIETNIV 
ALLDSLYSKEQADDRIIYKVIDISKKENGQIADKYEVTWSSLFVNGWKDGKENVNNMTEF SFSNARKSPDKFKEGIKNKVDELLKQL >gi|226332276|gb|ACIC01000044.1| GENE 21 20117 - 20350 276 77 aa, chain - ## HITS:1 COG:MA3938 KEGG:ns NR:ns ## COG: MA3938 COG0526 # Protein_GI_number: 20092734 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Methanosarcina acetivorans str.C2A # 1 77 1 77 77 63 46.0 6e-11 MEIKVLGTGCAGCKALYETTKQAIFELGCDVVLIKEEDLLKIMEYNVLSLPALVIDGKIV SAGKRLSLAEVKELITQ >gi|226332276|gb|ACIC01000044.1| GENE 22 20428 - 20754 252 108 aa, chain - ## HITS:1 COG:mlr7816 KEGG:ns NR:ns ## COG: mlr7816 COG0640 # Protein_GI_number: 13476484 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Mesorhizobium loti # 14 103 27 114 115 57 36.0 7e-09 MKEKQYTVEQEQIARFAKAMGHPARMAILSFLAKQESCFFGDIHEELPIAKATVSQHLKE LKDAGLIQGEIETPKVRYCINKENWEVARKLFAAFLGDCKCTGTSCCG >gi|226332276|gb|ACIC01000044.1| GENE 23 20902 - 22221 963 439 aa, chain - ## HITS:1 COG:no KEGG:BT_0796 NR:ns ## KEGG: BT_0796 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 439 6 444 444 872 98.0 0 REMVETIKQLFAQKQQMQEPVVANYEVSEIVEQKTLKKKEESEKTGEKTGKSNKWSQSLT LQLKEYLNDNFDFRYNNLTGATEYREKSGKNCFRPIDEREMNGMIVDARLEGIPCWRGDV PTMILSNKVGSYNPFHLYVKELPGWDGVDRVTPLLLRVSDNEIWLRGGRCWLRAMLSQWS GEERLHANVLTPVLISGKQGLSKSTFCRLLMPDSLRHYFIDNLNLTAGSSPEKKLVKNGL INLDEFDKIKESQQATLKNLLQMVNVPVFRGKRLGWVNESRLASFIATTNVYQVLIDSTG SRRFLCVEVLKPISEEPLEHKQIYAQLKEELKSGAPDYLNKEEEKALQKHNKAYYRQSLL EDVFHCCFRHPLETEKGIWLTTAEIFQVMRKFNSTALKEVSARQLGLKLSAMGYMAKHTS FGNRYYVFNLSAEKKDVSL Prediction of potential genes in microbial genomes Time: Thu May 12 00:31:04 2011 Seq name: gi|226332275|gb|ACIC01000045.1| Bacteroides sp. 
1_1_6 cont1.45, whole genome shotgun sequence Length of sequence - 30737 bp Number of predicted genes - 32, with homology - 31 Number of transcription units - 20, operones - 10 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 217 - 276 6.5 1 1 Tu 1 . + CDS 315 - 935 466 ## BT_0795 hypothetical protein - Term 1012 - 1043 1.5 2 2 Tu 1 . - CDS 1064 - 2518 1389 ## COG0477 Permeases of the major facilitator superfamily - Prom 2548 - 2607 2.0 - Term 2554 - 2607 18.1 3 3 Op 1 11/0.000 - CDS 2621 - 3937 1743 ## COG2115 Xylose isomerase 4 3 Op 2 . - CDS 3950 - 5443 1368 ## COG1070 Sugar (pentulose and hexulose) kinases - Prom 5464 - 5523 5.4 - Term 5460 - 5496 2.7 5 4 Tu 1 . - CDS 5532 - 6263 732 ## COG1051 ADP-ribose pyrophosphatase + Prom 6505 - 6564 4.9 6 5 Op 1 . + CDS 6604 - 7440 679 ## COG0351 Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase 7 5 Op 2 . + CDS 7491 - 8378 997 ## COG0331 (acyl-carrier-protein) S-malonyltransferase + Term 8415 - 8456 3.0 + Prom 8430 - 8489 6.1 8 6 Op 1 39/0.000 + CDS 8578 - 9708 878 ## COG0045 Succinyl-CoA synthetase, beta subunit 9 6 Op 2 . + CDS 9738 - 10598 811 ## COG0074 Succinyl-CoA synthetase, alpha subunit + Term 10602 - 10664 20.1 - Term 10590 - 10651 10.5 10 7 Tu 1 . - CDS 10701 - 11423 528 ## BT_0786 putative integral membrane protein - Prom 11543 - 11602 6.8 + Prom 11482 - 11541 7.1 11 8 Tu 1 . + CDS 11562 - 11735 272 ## gi|160884539|ref|ZP_02065542.1| hypothetical protein BACOVA_02524 + Term 11737 - 11790 18.5 - Term 11721 - 11783 20.3 12 9 Op 1 . - CDS 11814 - 13316 1332 ## COG0174 Glutamine synthetase - Prom 13349 - 13408 4.4 - Term 13425 - 13468 12.2 13 9 Op 2 . - CDS 13498 - 13743 251 ## COG0724 RNA-binding proteins (RRM domain) - Prom 13932 - 13991 5.6 - Term 14059 - 14097 -0.7 14 10 Tu 1 . - CDS 14159 - 14362 247 ## BT_0782 hypothetical protein - Prom 14448 - 14507 9.2 + Prom 14408 - 14467 7.7 15 11 Op 1 . + CDS 14490 - 15212 537 ## BT_0781 hypothetical protein 16 11 Op 2 . 
+ CDS 15217 - 15723 212 ## PROTEIN SUPPORTED gi|229255399|ref|ZP_04379326.1| acetyltransferase, ribosomal protein N-acetylase + Term 15884 - 15920 0.0 - Term 15772 - 15827 5.2 17 12 Tu 1 . - CDS 15849 - 16172 467 ## BT_0779 hypothetical protein + Prom 16287 - 16346 6.9 18 13 Op 1 . + CDS 16445 - 17332 853 ## COG2326 Uncharacterized conserved protein 19 13 Op 2 . + CDS 17208 - 17441 103 ## + Term 17604 - 17650 10.5 20 14 Op 1 . + CDS 18002 - 18517 412 ## BT_0777 hypothetical protein 21 14 Op 2 . + CDS 18532 - 19467 1084 ## COG2006 Uncharacterized conserved protein 22 14 Op 3 . + CDS 19474 - 21039 838 ## COG1145 Ferredoxin + Prom 21057 - 21116 4.9 23 15 Op 1 . + CDS 21168 - 21974 853 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily 24 15 Op 2 . + CDS 22026 - 23723 1505 ## COG0366 Glycosidases + Term 23752 - 23822 7.2 + Prom 23754 - 23813 2.8 25 16 Tu 1 . + CDS 23919 - 24395 433 ## BT_0772 hypothetical protein + Term 24401 - 24445 5.4 - Term 24386 - 24437 14.5 26 17 Op 1 . - CDS 24457 - 26469 1903 ## COG0296 1,4-alpha-glucan branching enzyme 27 17 Op 2 . - CDS 26488 - 26934 652 ## COG2731 Beta-galactosidase, beta subunit 28 17 Op 3 . - CDS 26974 - 27645 533 ## COG5587 Uncharacterized conserved protein - Prom 27857 - 27916 10.2 - Term 27823 - 27881 -0.2 29 18 Op 1 9/0.000 - CDS 27926 - 28522 380 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase 30 18 Op 2 . - CDS 28506 - 29498 642 ## COG0147 Anthranilate/para-aminobenzoate synthases component I - Prom 29553 - 29612 7.2 - Term 29552 - 29599 10.0 31 19 Tu 1 . - CDS 29617 - 30351 793 ## BT_0766 hypothetical protein - Prom 30381 - 30440 5.2 - Term 30500 - 30547 8.3 32 20 Tu 1 . 
- CDS 30572 - 30736 114 ## BT_0765 putative protease Predicted protein(s) >gi|226332275|gb|ACIC01000045.1| GENE 1 315 - 935 466 206 aa, chain + ## HITS:1 COG:no KEGG:BT_0795 NR:ns ## KEGG: BT_0795 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 206 1 206 206 407 100.0 1e-112 MNALYDFFLTPQPKDSNKTRYHARLVVRDTITLEDIAETIESRSSLRQSDVIGSFIEFAN VFKQELSNGNSIHIPGVGSFRIKAESPEVRSPKEIRAENIHCSGIVFTPEKDLLRELKAT TFEKVSETRRSQELSDIEIDGRLAEFFKDNSCITTQQLCSLCGLRKATALRRLQKRVDEG KLTHPGYIRSPFYFPVPGWFGVSRNR >gi|226332275|gb|ACIC01000045.1| GENE 2 1064 - 2518 1389 484 aa, chain - ## HITS:1 COG:ECs5014 KEGG:ns NR:ns ## COG: ECs5014 COG0477 # Protein_GI_number: 15834268 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 12 481 9 479 491 426 49.0 1e-119 MNNTTNEGSKLYLYSITSVAILGGLLFGYDTAVISGAEKGLEAFFLSASDFQYNKVMHGI TSSSALIGCVLGGALSGVFASRLGRRNSLRLAAVLFFLSALGSYYPEVLFFEYGKPNMDL LIAFNLYRVLGGIGVGLASAVCPMYIAEIAPSNIRGTLVSCNQFAIIFGMLVVYFVNYLI MGDHQNPIILKDAAGVLSVSAESDMWTVQEGWRYMFGSEAFPAAFFGMLLFFVPKTPRYL VLVQQEEKAYTILEKINGKKKAQEILNDIKATAQEKTEKLFTYGVTVIVIGILLSVFQQA IGINAVLYYAPRIFENAGAEGGGMMQTVIMGIVNIIFTLVAIFTVDRFGRKPLLIIGSIG MAVGAFAVAMCDSMAIKGVLPVLSIIVYAAFFMMSWGPICWVLISEIFPNTIRGKAVAIA VAFQWIFNYIVSSTFPALYDFSPMFAYSLYGIICVAAAIFVWRWVPETKGKTLEDMSKLW KKNK >gi|226332275|gb|ACIC01000045.1| GENE 3 2621 - 3937 1743 438 aa, chain - ## HITS:1 COG:HI1112 KEGG:ns NR:ns ## COG: HI1112 COG2115 # Protein_GI_number: 16273037 # Func_class: G Carbohydrate transport and metabolism # Function: Xylose isomerase # Organism: Haemophilus influenzae # 6 435 4 434 439 461 52.0 1e-129 MATKEFFPGIEKIKFEGKDSKNPMAFRYYDAEKVINGKKMKDWLRFAMAWWHTLCAEGGD QFGGGTKQFPWNGNADAIQAAKDKMDAGFEFMQKMGIEYYCFHDVDLVSGGASVEEYEAN LKEIVAYAKQKQAETGIKLLWGTANVFGHARYMNGAATNPDFDVVARAAVQIKNAIDATI 
ELGGENYVFWGGREGYMSLLNTDQKREKEHLAQMLTIARDYARARGFKGTFLIEPKPMEP TKHQYDVDTETVIGFLKAHGLDKDFKVNIEVNHATLAGHTFEHELAVAVDNGMLGSIDAN RGDYQNGWDTDQFPIDNYELTQAMMQIIRNGGLGTGGTNFDAKTRRNSTDLEDIFIAHIA GMDAMARALESAAALLDESPYKKMLADRYASFDGGKGKEFEDGKLTLEDVVAYAKTKGEP KQTSGKQELYEAILNMYC >gi|226332275|gb|ACIC01000045.1| GENE 4 3950 - 5443 1368 497 aa, chain - ## HITS:1 COG:CAC2612 KEGG:ns NR:ns ## COG: CAC2612 COG1070 # Protein_GI_number: 15895870 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Clostridium acetobutylicum # 2 483 3 487 500 213 32.0 5e-55 MFLLGYDIGSSSVKASLVNAETGKCVSSAFFPKTEAKIIAVNPGWAEQDPESWWENLKLS TQAIMAESGVSAAEIKAIGISYQMHGLVCVDKNQQVLRPAIIWCDSRAVPFGQQAFETIG EERCLSHLLNSPGNFTASKLAWIKQNEPAVYEQIYKIMLPGDYIAMKLSGDICTTVSGLS EGMFWDFKNNRVADFLMDYYGFDSSLIADIKPTFAEQGRVNAVAANELGLKEGTPITYRA GDQPNNALSLNVFNPGEIASTAGTSGVVYGVNGEVNYDPQSRVNTFAHVNHTMEQTRLGV LLCINGTGILNSWVKRNIAPEGISYNEMNVLASKAPIGSAGISILPFGNGAERMLNNKEI GCSIRGVDFNAHGKHHIIRAAQEGIVFSFKYGIDIMEQMGIPVKKIHAGHANMFLSSIFR DTLAGVTGATIELYDTDGSVGAAKGAGIGAGIYKDNNEAFATLDKLDVIEPNIAKRQEYA DAYARWKYNINNDIITF >gi|226332275|gb|ACIC01000045.1| GENE 5 5532 - 6263 732 243 aa, chain - ## HITS:1 COG:Cgl1046 KEGG:ns NR:ns ## COG: Cgl1046 COG1051 # Protein_GI_number: 19552296 # Func_class: F Nucleotide transport and metabolism # Function: ADP-ribose pyrophosphatase # Organism: Corynebacterium glutamicum # 50 235 38 206 206 90 30.0 3e-18 MQSIQKKAPLANNHISVDCVVIGFDGEQLKVLLVKRAGEDNGEVYHDMKLPGSLIYMDED LDEAAQRVLYELTGLRNVNLMQFKAFGSKNRTSNPKDVRWLERAMQSKVERIVTIAYLSM VKIDRALDKNLDDHQACWIALKDVMTLAFDHNLIIKEAMTYIRQFVEFNPSMLFELLPRK FTAAQLRTLFELVYDKPVDVRNFHKKIAMMEYVVPLEEKQQGVAHRAARYYKFDKKIYNK VRR >gi|226332275|gb|ACIC01000045.1| GENE 6 6604 - 7440 679 278 aa, chain + ## HITS:1 COG:CAC3095 KEGG:ns NR:ns ## COG: CAC3095 COG0351 # Protein_GI_number: 15896346 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase # Organism: Clostridium 
acetobutylicum # 7 262 7 258 265 251 54.0 8e-67 MERHPVILSIAGSDCSGGAGIQADIKTISALGGYAASAITAVTVQNTLGVRAVHAIPPEI VCGQIEAVMEDLQPVAIKIGMVNDIQIVHVIADCIRKYSPEHIIYDPVMVSTSGRKLMTN EAIEEIKKELLPLVTLVTPNIDEAKVLTGKSIQNTQDMLEAAKQLSDSYQIHILIKGGHL EGDQMCDLLYSPDHTYYIYEEKKIESKNLHGTGCTLSSAIASYLAKGYSMKESIRHAKEY ITHAIIAGKDLNIGHGNGPLWHFPDSIAQMCTFCAVVS >gi|226332275|gb|ACIC01000045.1| GENE 7 7491 - 8378 997 295 aa, chain + ## HITS:1 COG:CAC3575 KEGG:ns NR:ns ## COG: CAC3575 COG0331 # Protein_GI_number: 15896809 # Func_class: I Lipid transport and metabolism # Function: (acyl-carrier-protein) S-malonyltransferase # Organism: Clostridium acetobutylicum # 3 294 5 297 308 243 45.0 3e-64 MKAFVFPGQGAQFVGMGKDLYENSALAKELFEKANDILGYRITDIMFNGTDEDLRQTKVT QPAVFLHSVISALCMGDDFKPEMTAGHSLGEFSALVAAGALSFEDGLKLVYARAMAMQKA CEATPSTMAAIIALPDEKVEEICASVNAEGEVCVPANYNCPGQIVISGSVPGIEKACELM KAAGAKRALPLKVGGAFHSPLMDPAKVELEAAINATEIHTPKCPVYQNVDALPHTDPAEI KKNLVAQLTASVRWTQSVKNMVADGATDFTECGPGAVLQGLIKKIDGTVSAHGIA >gi|226332275|gb|ACIC01000045.1| GENE 8 8578 - 9708 878 376 aa, chain + ## HITS:1 COG:SA1088 KEGG:ns NR:ns ## COG: SA1088 COG0045 # Protein_GI_number: 15926828 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, beta subunit # Organism: Staphylococcus aureus N315 # 1 375 1 383 388 340 48.0 2e-93 MKIHEYQAKEIFSKYGIPVERHTLCRTAAGAVAAYKRMGSDRVVIKAQVLTGGRGKAGGV KLVDNTEDTYQEAKNILGMSIKGLPVNQILVSEAVDIAAEYYVSFTIDRNTRSVILMMSA SGGMDIEEVARQSPEKIIRYAIDPFIGLPDYLARRFAFSLFPHIEQAGRMAAILQALYKI FMENDASLVEVNPLALTAKGILMAIDAKIVFDDNALYRHPDVLSLFDPTEEEKVEADAKN KGFSYVHMDGNIGCMVNGAGLAMATMDMIKLHGGNPANFLDIGGSSNPVKVVEAMKLLLQ DEKVKVVLINIFGGITRCDDVAIGLIQAFDQIKSDIPVIVRLTGTNEHLGRDLLRNHSRF QIATTMQEAALMAIKS >gi|226332275|gb|ACIC01000045.1| GENE 9 9738 - 10598 811 286 aa, chain + ## HITS:1 COG:BS_sucD KEGG:ns NR:ns ## COG: BS_sucD COG0074 # Protein_GI_number: 16078673 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, alpha subunit # Organism: Bacillus subtilis # 1 278 1 278 300 342 
61.0 7e-94 MSILIDKSTRLIVQGITGRDGLFHAKKMKEYGTNVVGGTSPGKGGTEVDGIPVFNTMHEA VKQTQANTSIIFVPARFAADTIMEAADAGIRLIVCIAEGIPTLDVIKAHRFAEQRGAMLV GPNCPGLISPGESMVGILPGQVFQKGNVGVISRSGTLTYEIVYHLTTNGMGQSTAIGIGG DPVVGLHFLKLLEMFQNDPETKAIVLIGEIGGNAEEQAAEYIRSHVNKPVVAFIAGQSAP PGKQMGHAGAIISGGSGSAKDKIEALEAAGIRVAQEPSDIPKLLKR >gi|226332275|gb|ACIC01000045.1| GENE 10 10701 - 11423 528 240 aa, chain - ## HITS:1 COG:no KEGG:BT_0786 NR:ns ## KEGG: BT_0786 # Name: not_defined # Def: putative integral membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 240 1 240 240 422 100.0 1e-117 MLNLLAVFLVCSGIGIAVHIAIDLTHRPQSMKIMNAVWILTALWGSYLALWAYNKFGQGA PMKMGGGEMKMDMSGMKDMNMDMSMGDMSMKHPHWQSVALSALHCGAGCTLADIIGEWFT NYIPVTVAGSQLIGNWVLDFILALIIGVYFQFYAIREMERISVGKVLSRAFKADFFSLLS WQIGMYGWMAIVYFVLFVNEPLPKDTWIFWFMMQLAMLFGFFCAYPMNALLIKLGVKKGM >gi|226332275|gb|ACIC01000045.1| GENE 11 11562 - 11735 272 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160884539|ref|ZP_02065542.1| ## NR: gi|160884539|ref|ZP_02065542.1| hypothetical protein BACOVA_02524 [Bacteroides ovatus ATCC 8483] # 1 57 68 124 124 94 92.0 2e-18 MSRRSQLEHEVSVAQERIKKAAKDTPKDIIKLWEQDLVDLELELNNLVDDEEDNNED >gi|226332275|gb|ACIC01000045.1| GENE 12 11814 - 13316 1332 500 aa, chain - ## HITS:1 COG:MA3382 KEGG:ns NR:ns ## COG: MA3382 COG0174 # Protein_GI_number: 20092196 # Func_class: E Amino acid transport and metabolism # Function: Glutamine synthetase # Organism: Methanosarcina acetivorans str.C2A # 7 498 8 504 506 569 55.0 1e-162 MNQELLMNPNCLVAALEKPAAEFTKADIISFIQKNNIRMVNFMYPAADGRLKTLNFVINN AAYLDAILTCGERVDGSSLFPFIEAGSSDLYVVPRFRTAFVDPFAEIPTLSMLCSFFNKD GEPLESSPEHTLHKACKAFTDVTGMEFQAMGELEYYVISPDTGMFQATDQRGYHESAPYA KFNDFRTQCMSYIAQTGGQIKYGHSEVGNFTLDGMIYEQNEIEFLPVHAEDAADQLMIAK WVIRNLGYRYGYNVTFAPKITAGKAGSGLHVHMRIVKDGQNQMLKDGVLSETARKAIAGM MELAPSITAFGNTNPTSYFRLVPHQEAPTNVCWGDRNRSVLVRVPLGWAAKTDMCTLANP LESESHFDTSQKQTVEMRSPDGSADLYQLLAGLAVACRHGFEIEQALDIAKRTYVNVNIH QKENEDKLKALAQLPDSCAASAECLQKQRAVFEQYNVFSPAMIDGIIRKLRSYEDKTLRA DMEGKPEEMLELVHKYFHCG 
>gi|226332275|gb|ACIC01000045.1| GENE 13 13498 - 13743 251 81 aa, chain - ## HITS:1 COG:alr2311 KEGG:ns NR:ns ## COG: alr2311 COG0724 # Protein_GI_number: 17229803 # Func_class: R General function prediction only # Function: RNA-binding proteins (RRM domain) # Organism: Nostoc sp. PCC 7120 # 1 80 1 80 105 82 52.0 2e-16 MNLYIGNLNYNVKESDLRNVMEEYGAVASVKLITDRETRRSKGFAFIEMPDDTEAANAIK ELNGAEYVGRPMVVKEALPKN >gi|226332275|gb|ACIC01000045.1| GENE 14 14159 - 14362 247 67 aa, chain - ## HITS:1 COG:no KEGG:BT_0782 NR:ns ## KEGG: BT_0782 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 67 1 67 67 116 100.0 2e-25 MMELSAGMNLNKYLKLSLIAEMGGLSCSRIAYYTTRTLKAFFEDMSRDTDPYFKPAVYVS MGMKYKF >gi|226332275|gb|ACIC01000045.1| GENE 15 14490 - 15212 537 240 aa, chain + ## HITS:1 COG:no KEGG:BT_0781 NR:ns ## KEGG: BT_0781 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 240 1 240 240 489 100.0 1e-137 MEIKKFIENYREAFGEYAELPIVFWYSDILENETGKVNGCFFKSMSKVRGGNTISLNAET IGCGGGKFYTGFTDMPEHVPTFVSLKERYKQTPEMVKSFIEQLGVPRAEKEYLHFARIDK VETFDHLEGILFLANPDILSGLTTWAFFDNNKEDTVISTFGSGCSSVVTQTILENRKGGY RTFIGFFDPSVRPYFEADILSYTIPMSRFKVMYETMRSSCLFDTHAWGKIRERNHNSYKE >gi|226332275|gb|ACIC01000045.1| GENE 16 15217 - 15723 212 168 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229255399|ref|ZP_04379326.1| acetyltransferase, ribosomal protein N-acetylase [Capnocytophaga ochracea DSM 7271] # 1 140 7 147 175 86 34 2e-16 MDFILRPWKVSDAKALARHLNNKKIWDNCRDGLPFPYTETDADAFIKYASEQVEQNEYCI EANHEAIGNIGFVRGTDVERFNAEVGYWISETYWNKGLATAALERAIEHYFQHTDVIRLY ATVYEHNAASMRVLEKAGFQKAGIHRKACFKNGQFIDAHYYELLKNKS >gi|226332275|gb|ACIC01000045.1| GENE 17 15849 - 16172 467 107 aa, chain - ## HITS:1 COG:no KEGG:BT_0779 NR:ns ## KEGG: BT_0779 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 107 1 107 107 135 100.0 4e-31 MKKVLVAVALVLGLGSSVAFAQEVSNTPAVETQTQAPQDEYTKIEVKDLPEAVTQAITKS 
YEDATIKEAYVAEKETGKVYKVIITTKDAQEVTVLLNEKGEEVKEIK >gi|226332275|gb|ACIC01000045.1| GENE 18 16445 - 17332 853 295 aa, chain + ## HITS:1 COG:all2088 KEGG:ns NR:ns ## COG: all2088 COG2326 # Protein_GI_number: 17229580 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. PCC 7120 # 21 287 19 282 289 315 57.0 6e-86 MKKEVLKELTAKPGKEHHVSDFNPSFTADMSKQDAKEQLAQNIEKLSELQSMLYAQDRYS VLVIFQAMDAAGKDGTIKHVMSGINPQGCQVYSFKQPSAEELDHDYLWRINRSLPERGRI GIFNRSQYEDVLIAKVHPEILLSNKLPGILKAKDIDNEFWKRRYRQINDFERYLTENGTV IIKFFLNVSKAEQKKRFMERLNDESKNWKFSSADVKERQYWDDYMKAYSDVLTETSTEIA PWYVIPADNKWFMRYAVGQIICERMEKLDLHYPQLSKEALERLEECKKNVSDINF >gi|226332275|gb|ACIC01000045.1| GENE 19 17208 - 17441 103 77 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPSGKLSVNAWKNWTYIIHSYPRRRWKDLKNARKTYPILISKQDTFSEFNGLFLSGLDKA LLMLYICKQNNNKPIKP >gi|226332275|gb|ACIC01000045.1| GENE 20 18002 - 18517 412 171 aa, chain + ## HITS:1 COG:no KEGG:BT_0777 NR:ns ## KEGG: BT_0777 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 171 1 171 171 271 100.0 6e-72 MKMETSAKLYSLNYSNVKTYLFALLFVAGNIALPQLCHLVPYGGPTLLPIYFFTLIAAYK YGFRVGLLTAVLSPVINHVLFAMPSEAVLPIILIKSTLLAGASALAARTVKTVSLLAVLG VILSYQLIGTAFEWMIEGDFYIAVQDFRIGIPGMLIQWFGGYALLKAIAKL >gi|226332275|gb|ACIC01000045.1| GENE 21 18532 - 19467 1084 311 aa, chain + ## HITS:1 COG:MA1031_1 KEGG:ns NR:ns ## COG: MA1031_1 COG2006 # Protein_GI_number: 20089906 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 49 273 15 245 295 81 28.0 2e-15 MDRRDFFKTVAITGAAMTIQRSEAMEILTQTINKANGDKPDLVAVMGGEPEAMFRRAISE LGGMKQFIKPGQKVVVKPNIGWDKVPELAGNTNPKLITEIIKQCFAAGAKEVTVFDHTCD DWQKCYKNSGIEAAAKEAGAKVMPAHLESYYKPVDLPKGQKMKKAKIHEAILNCDVWINV PILKNHGGANLTISMKNHMGIVWDRGFFHQNDLQQCIADICTLEKKAVLNVVDAYRIMKT NGPRGRSASDVVLAKGLFISPDIVAVDTAAAKFFNQIREMPLDTVGHLAKGEALKVGTMD IDKLNVKRIKM >gi|226332275|gb|ACIC01000045.1| GENE 22 19474 - 21039 838 521 aa, chain 
+ ## HITS:1 COG:ECs3097 KEGG:ns NR:ns ## COG: ECs3097 COG1145 # Protein_GI_number: 15832351 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Escherichia coli O157:H7 # 336 497 17 164 164 82 29.0 2e-15 MLRKIRIGISVVLFALITFYFLDFANILPNSFHRLAHIQFVPAVLSLSIGILIVLIALTL LFGRIYCSTICPMGIWQDVIARISKSVGKKKKRYRYSPAKNMLRWTVLGVTVIASVCGFS VVLGLLDPYSAYGRIVVHIFKPVYMLGNNLLESVFSRFDNYTFYQVDTSVLSLSSLLIAI LTFAAILILAWKHGRTWCNTICPVGTVLGFLSRYSLFKVRIDTAKCNGCGLCATKCKAAC IHSKEHAIDYSRCVDCFDCLGACKQKALVYAPASKGQQRSSATEKQPVATESSSVAADSS KRRFLLAGLATAGAAPTLLTKAQESIATMEGKKAYKKENPITPPGSVSQKRFQQHCTSCH LCVSKCPSHVLKPAFMEYGLGGVMQPTVSFEKGFCNFDCTVCSDVCPNGAIPPLTVEQKH LTQMGYVVFIEENCIVLTDGTSCGACSEHCPTQAIAMVPYKDGLTLPHVNTEICVGCGGC EYVCPARPFRAVYIEGNPVQKEAKPFKESEEHKVEIDDFGF >gi|226332275|gb|ACIC01000045.1| GENE 23 21168 - 21974 853 268 aa, chain + ## HITS:1 COG:aq_1386 KEGG:ns NR:ns ## COG: aq_1386 COG1752 # Protein_GI_number: 15606577 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Aquifex aeolicus # 11 250 9 247 259 163 37.0 3e-40 MEVFTNNWNSRKYQIGYALSGGFIKGFAHLGVMQALLEHDIKPDIISGVSAGALAGVFYA DGNEPHQVIEYFSGHKFQDLTKLVIPKKGLFDLCEFIDFLHTNLKAKNLEELQIPLIITA TDLDHGRMVHFHRGSIAERVAASCCMPVMFAPVNIDGTNYVDGGLMMNLPVSTLRRICDK VVAVNVSPIMAQDYKMNIVSIAMRSFHFMFRANTFPEREKCDLLIEPYNLYGYSNTELEK AEEIFEQGYKIANDLLDQTLAEKGKIWK >gi|226332275|gb|ACIC01000045.1| GENE 24 22026 - 23723 1505 565 aa, chain + ## HITS:1 COG:TM1650 KEGG:ns NR:ns ## COG: TM1650 COG0366 # Protein_GI_number: 15644398 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Thermotoga maritima # 3 360 2 262 422 84 23.0 8e-16 MNNENKMIIYQVFTRLFGNNNNHCVYNGDITANGCGKMADFTLKALGEIRKLGATHIWYT GIIEHATETDYRRFNIRPDHPAIVKGKAGSPYAIKDYYDVDPDLATDVPERMKEFENLVH RTHRSGLKVIIDFVPNHVARQYHSDAQPDGTTELGANDDPNYAFSPYNNFYYIPQSELRA QFDMKEGAAEPYHEYPAKATGNNRFDATPNINDWYETIKLNYGVDYLNGGTCHFSPTPDT WIKMLDILLFWASKGIDGFRCDMAEMVPVEFWEWAIPQVKEAHPEILFIAEVYNPNEYRN 
YLFRGKFDYLYDKVGLYDTLRNVACGYESAASITHCWQSLNGIEKQMLNFLENHDEQRIA SDFFAGDPRKGIPALIVSACMNTNPMMIYFGQEFGELGMDSEGFSGRDGRTTIFDYWSVD TIRRWRNGGKFDGKMLTEEHKRLYSIYQRVLTLCNEETSIAKGVFFDLMYANKNGWRFDE HKQYTFMRKYKNELLFIIVNFDSQLVDVAINVPSHAFDFLQIPQMEKYQATDLLTGAKEE ICLLPYKATEVSVGAYNGKILKITF >gi|226332275|gb|ACIC01000045.1| GENE 25 23919 - 24395 433 158 aa, chain + ## HITS:1 COG:no KEGG:BT_0772 NR:ns ## KEGG: BT_0772 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 158 1 158 158 304 100.0 8e-82 MKKISFLAVILLLSVSAFAILPPGNIQAAFKKMYPKANGIAWSQDDGYYCANFVMNGFTK NVWFNTQAQWQMTQTDLVSLDRVTPTVYNAFVSGPYASWVVNDVTMVEFPKWQAIIVIKV GQDNVDIKYQLFYSPQGMLLKTRNVSYMYDILGPGTFL >gi|226332275|gb|ACIC01000045.1| GENE 26 24457 - 26469 1903 670 aa, chain - ## HITS:1 COG:YEL011w KEGG:ns NR:ns ## COG: YEL011w COG0296 # Protein_GI_number: 6320826 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Saccharomyces cerevisiae # 8 670 12 704 704 570 46.0 1e-162 MKETLNIIKNDPWLEPFADAITGRHQYALNKEAELTNKGKQTLSDFASGYLYFGLHRTPK GWTFREWAPNATHIYMVGTFNNWEEKAAYKLKKLKNGNWEINLPADAIQHGDLYKLNVYW DGGQGERIPAWATRVVQDEQTKIFSAQVWAPEKPYKFKKKTFKPTTNPLLIYECHIGMAQ QEEKVGTYNEFREKTLPRIAQEGYNCIQIMAIQEHPYYGSFGYHVSSFFAASSRFGTPDE LKALIDAAHEMGIAVIMDIVHSHAVKNEVEGLGNFAGDPNQYFYPAPRREHPAWDSLCFD YGKNEVIHFLLSNCKYWLEEYKFDGFRFDGVTSMLYYSHGLGEAFCNYGDYFNGHQDGNA ICYLTLANELIHQVNPKAISIAEEVSGMPGLAAKVEDGGYGFDYRMAMNIPDYWIKTIKE KIDEDWKPSSMFWEVTNRRQDEKTISYAESHDQALVGDKTIIFRLIDADMYWHMQKGDEN YVVNRGIALHKMIRLLTSSTINGGYLNFMGNEFGHPEWIDFPREGNGWSCKYARRQWDLV DNKNLAYHYMGDFDAEMLKVIKSVKDFQATPVQEIWHNDGDQVLAYGRKDLIFVFNFNPK QSFVDYGFLVSPGAYEVILNTDNVAFGGNGLADDSVVHFTIADPLYKKEKKEWLKLYIPA RTAVVLRKKK >gi|226332275|gb|ACIC01000045.1| GENE 27 26488 - 26934 652 148 aa, chain - ## HITS:1 COG:CAC0836 KEGG:ns NR:ns ## COG: CAC0836 COG2731 # Protein_GI_number: 15894123 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: 
Clostridium acetobutylicum # 1 147 1 150 152 97 35.0 9e-21 MVVDTLENLEKYASLNPLFAQAIEFLKSHDLQAMEIGKTELKGKDLLVNIAQTKPKTKEE AKLETHNEFIDIQIPLSGTEVMGYTAAKDCVPADTPYNVEKDITFFEGLAETYVAVKPGM FAIFFPQDGHAPGITPDGVKKVIVKVKA >gi|226332275|gb|ACIC01000045.1| GENE 28 26974 - 27645 533 223 aa, chain - ## HITS:1 COG:XF2023 KEGG:ns NR:ns ## COG: XF2023 COG5587 # Protein_GI_number: 15838617 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Xylella fastidiosa 9a5c # 9 220 19 234 237 100 27.0 2e-21 MKMNIPVIFQFLKDLSANNNREWFNEHRSEYEVARAEFDNFLATVIARISLFDETIRGIQ PKDCTYRIYRDTRFSADKTPYKIHFGGYINAKGKKSDHCGYYVHLQPDGSMLAGGSLCLP PNVLKAVRQAIYDNVEEYISIVEDPEFKKYFPVVGEDFMKTAPKGFPKDFKYIDYLKCRQ FVCSYLVPDDFFTRADTMEQIEKAFRQFKRFADFINYTIDDFE >gi|226332275|gb|ACIC01000045.1| GENE 29 27926 - 28522 380 198 aa, chain - ## HITS:1 COG:HI1169 KEGG:ns NR:ns ## COG: HI1169 COG0115 # Protein_GI_number: 16273093 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Haemophilus influenzae # 1 181 1 185 188 127 37.0 2e-29 MCQFIETIRIEDGQVYNLSYHTARMNRTRAAFWKEAAPIDLSGFISPPSLSGIWKCRIVY GKEIEEVGYTPYQMRMVSSLRPVASDTIDYSYKSTNREELNDLFARRGKADDILIVKDGY LTDTSIANIALYDGHTWYTPAHPLLQGTKRAELLDNRFIVEKDIRQAQLGDYSHIMLFNA MIDWKRLIIPVNEKYFIL >gi|226332275|gb|ACIC01000045.1| GENE 30 28506 - 29498 642 330 aa, chain - ## HITS:1 COG:PM1464 KEGG:ns NR:ns ## COG: PM1464 COG0147 # Protein_GI_number: 15603329 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component I # Organism: Pasteurella multocida # 10 329 7 323 324 310 48.0 3e-84 MQLYTREETIRKINHLGQSCRPFIFIINYLQDASYIEEVASVDPSEVVYNLNGFTNQSSS KDVLSCFEEPVHWNSYPESFDSYRRSFDIVQRNIFAGNSFLTNLTCRTPIETNLSLKDIF FRSKAMYKLWVRDQFTVFSPEIFVRIQQGKISSYPMKGTLDASLPSAVRQLMDDPKEAAE HATIVDLIRNDLSIVADRVSVSRYRYIDKLQTNRGTILQTSSEIQGTLPDNYRENLGHIL 
FKLLPAGSITGAPKKKTMQIISEAETYERGFYTGVMGYFDGSSLDSAVMIRFVEQEGDRM YFKSGGGITCRSEVESEYHEMKQKVYVPIY >gi|226332275|gb|ACIC01000045.1| GENE 31 29617 - 30351 793 244 aa, chain - ## HITS:1 COG:no KEGG:BT_0766 NR:ns ## KEGG: BT_0766 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 244 24 267 267 422 99.0 1e-117 MKKLIFLFAFCLTVTNIFAQTDPSQLKKEGSDAFNAKNYPVAYAKFSEYLKQTNNQDSAI AYYCGMAADEVKKYAEAVTFFDIAIQKKFNIGNAYARKALALDAMKKTDEYVATLEEGLK VDPDNKTMIKNYGLHYLKAGIAAQKAGKAEEAEECFKKVIPLDHKTYKTNALYSLGVLCY NDGANILKKAAPLANSDADKYAAEKEKADGRFKEAIGYLEEATKVSPEDTKSKTMLSQVQ AAMK >gi|226332275|gb|ACIC01000045.1| GENE 32 30572 - 30736 114 54 aa, chain - ## HITS:1 COG:no KEGG:BT_0765 NR:ns ## KEGG: BT_0765 # Name: not_defined # Def: putative protease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 54 307 360 360 109 100.0 3e-23 YLSYRVHISDIFSDYSENVKTNLTHLAFKADDLDNYLVDTFGCEFSEYANKNAI Prediction of potential genes in microbial genomes Time: Thu May 12 00:31:52 2011 Seq name: gi|226332274|gb|ACIC01000046.1| Bacteroides sp. 1_1_6 cont1.46, whole genome shotgun sequence Length of sequence - 52851 bp Number of predicted genes - 34, with homology - 33 Number of transcription units - 17, operones - 8 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 950 871 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain 2 1 Op 2 . - CDS 962 - 2296 1149 ## BT_0764 hypothetical protein 3 1 Op 3 . - CDS 2346 - 4031 1128 ## BT_0763 hypothetical protein - Prom 4132 - 4191 5.8 - TRNA 4324 - 4397 75.7 # Arg TCT 0 0 + Prom 4453 - 4512 7.3 4 2 Tu 1 . + CDS 4617 - 4718 70 ## + Term 4866 - 4895 -0.3 5 3 Tu 1 . - CDS 4751 - 6799 1634 ## BT_0761 hypothetical protein - Prom 6822 - 6881 6.3 + Prom 7498 - 7557 6.7 6 4 Tu 1 . + CDS 7769 - 9454 489 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Prom 9736 - 9795 6.9 7 5 Tu 1 . 
+ CDS 10024 - 10209 137 ## BT_0759 hypothetical protein + Term 10372 - 10414 -0.6 - Term 10209 - 10248 8.1 8 6 Tu 1 . - CDS 10341 - 10913 345 ## COG3153 Predicted acetyltransferase - Prom 10978 - 11037 9.2 - Term 11034 - 11064 0.4 9 7 Op 1 . - CDS 11137 - 13170 1784 ## COG3250 Beta-galactosidase/beta-glucuronidase 10 7 Op 2 . - CDS 13187 - 14635 1076 ## COG3119 Arylsulfatase A and related enzymes 11 7 Op 3 . - CDS 14709 - 16514 1704 ## BT_0755 hypothetical protein 12 7 Op 4 . - CDS 16544 - 19960 3417 ## BT_0754 hypothetical protein - Prom 19989 - 20048 4.5 13 8 Op 1 . - CDS 20056 - 21081 845 ## COG3712 Fe2+-dicitrate sensor, membrane component 14 8 Op 2 . - CDS 21159 - 21746 416 ## BT_0752 putative RNA polymerase ECF-type sigma factor - Prom 21769 - 21828 4.5 - Term 21787 - 21831 10.1 15 9 Op 1 . - CDS 21840 - 23294 1032 ## COG2978 Putative p-aminobenzoyl-glutamate transporter 16 9 Op 2 . - CDS 23284 - 24012 1075 ## BF2233 two-component system response regulator 17 9 Op 3 . - CDS 24048 - 28445 3140 ## COG0642 Signal transduction histidine kinase - Prom 28564 - 28623 5.9 + Prom 28309 - 28368 5.7 18 10 Tu 1 . + CDS 28602 - 29540 923 ## COG0462 Phosphoribosylpyrophosphate synthetase + Term 29566 - 29622 15.4 - Term 29549 - 29614 19.5 19 11 Op 1 . - CDS 29629 - 30798 1219 ## COG4642 Uncharacterized protein conserved in bacteria - Prom 30821 - 30880 2.0 20 11 Op 2 . - CDS 30882 - 32165 1144 ## COG0612 Predicted Zn-dependent peptidases 21 11 Op 3 . - CDS 32169 - 32921 802 ## COG1212 CMP-2-keto-3-deoxyoctulosonic acid synthetase 22 11 Op 4 . - CDS 32918 - 33298 340 ## BT_0744 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase 23 11 Op 5 . - CDS 33309 - 35648 2435 ## COG5009 Membrane carboxypeptidase/penicillin-binding protein - Prom 35781 - 35840 8.3 + Prom 35764 - 35823 7.7 24 12 Op 1 19/0.000 + CDS 35855 - 36796 1000 ## COG0540 Aspartate carbamoyltransferase, catalytic chain 25 12 Op 2 . 
+ CDS 36793 - 37254 509 ## COG1781 Aspartate carbamoyltransferase, regulatory subunit 26 12 Op 3 . + CDS 37289 - 37855 479 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family 27 12 Op 4 . + CDS 37872 - 38609 655 ## BT_0739 hypothetical protein + Prom 38736 - 38795 5.6 28 13 Tu 1 . + CDS 38830 - 40110 1522 ## COG0112 Glycine/serine hydroxymethyltransferase + Term 40203 - 40242 3.0 - Term 40391 - 40444 11.2 29 14 Tu 1 . - CDS 40497 - 42164 1833 ## COG2759 Formyltetrahydrofolate synthetase - Prom 42192 - 42251 8.3 + Prom 42265 - 42324 5.8 30 15 Op 1 . + CDS 42371 - 44065 2018 ## COG2985 Predicted permease 31 15 Op 2 . + CDS 44126 - 45754 1707 ## BT_0735 aspartate aminotransferase (EC:2.6.1.1) + Term 45770 - 45806 3.1 32 16 Tu 1 . - CDS 45761 - 47869 2114 ## BT_0734 hypothetical protein - Prom 47959 - 48018 6.6 - Term 49002 - 49049 3.1 33 17 Op 1 . - CDS 49199 - 50431 358 ## SUN_2263 hypothetical protein 34 17 Op 2 . - CDS 50418 - 52508 668 ## FP1531 hypothetical protein - Prom 52554 - 52613 8.1 Predicted protein(s) >gi|226332274|gb|ACIC01000046.1| GENE 1 2 - 950 871 316 aa, chain - ## HITS:1 COG:RSp1552 KEGG:ns NR:ns ## COG: RSp1552 COG0265 # Protein_GI_number: 17549771 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Ralstonia solanacearum # 17 178 120 285 490 115 37.0 1e-25 MEQNIYNAIYKVNHAGGCGSCFYLKQYDLFVTNYHVVEGFRQVAIQDNDKNSFLARVVMV NSVLDIALLSVEGDFSALPDIRLSELKEVAIGQKINVAGYPFGMPFTVTEGTVSSPRQLI NNSYYIQTDAAVNPGNSGGPMFNERGELIAITASKITDADNMGFGIPVDTLKKVLENIES LDRSVFNVQCNSCDEFISEEDEYCPCCGDKLPEEIFKERGLTDLAVFCEEAIEAMGINPV LARVGYEGWKFYKGSSEIRMFVYDRSYLFATSPLNILPKKNLEPVLTYLLQTEIAPYRFG LDGNQIYLSYRVHISD >gi|226332274|gb|ACIC01000046.1| GENE 2 962 - 2296 1149 444 aa, chain - ## HITS:1 COG:no KEGG:BT_0764 NR:ns ## KEGG: BT_0764 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 433 1 433 444 783 98.0 0 MKQTLSFHSPIIASTQSKVDVEAFEKSTDAFEKGEHLQSFYLLLDYIDPELRSKFGNAEG TEFAVPHGSIMVYLKLEGDQLNITAPFLLLPEKGRIPLLRQVAALNLTELTLACISLQND KLFFDFHCPVALCNPYKIYYVLEEICRIGDKYDDEFVTKFKAERIYEPKITPYDDETVDK VYEALQQTCKECMDAVKYFETERKYGYAWNILSSTLLKICYFVCPQGQLLNDLDKAITDH GREDMPLPEIVSRTKSVVTRLQEMTKEQLAADLYFVETFISDRRRSVLKNIQDNFEDTYE KAGSALESGDFIVCCLLIINKFYELYYYNNVQNDVNAVVAKAMKATSAMSWEEAAPILYE AMDNIMEGELDVDDDEEGGDEDGEIMGFDMSQLQQMQQEMMQQAMQNIDMEQIQKMQQLA MENMQKMMAGMYGNNNNTEEDTNK >gi|226332274|gb|ACIC01000046.1| GENE 3 2346 - 4031 1128 561 aa, chain - ## HITS:1 COG:no KEGG:BT_0763 NR:ns ## KEGG: BT_0763 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 561 1 561 561 1039 99.0 0 MKHLRQTVLFILLGFLLTACHHTADRLLSIEQLIKLKPDSALSLLRQIQYPEKLSDSNRA LYALLMTQVINQSSDEEHKSDSLISVAIDYYKGTTDSVHAALAYYNAGLVAMDNEDSEAS LHNFLKTIDWLGESDNDELQFMVRYKMSRLFNLRLIPDEELRLGKAALPYAERTGNPLYI CALLPYITHGFMQTNQLDSAYKYSVQAIQLAEKENLVRALSHIYSQHAHLCMAMKDYKQA LMYRDKDLAILFELFGEENEYIYDHYINKANTLTELHQYDSAFYYINKAVEDTTDIQYTT RKSLAKAEIYAATGQMDSAYYYMNRHADLRDRLIEERDKEGLLTLHRNYHNDELKKANTR LWQQAVERKLQLYVVTIVCCLAILLGGCIYFFLYKRKQQEVLRQKGQIAHQQELLKYREI EKLQAEKDLSEAKEREAQIREKEALLKVEFFKRLNETCIPVIENPKTQQNIMLKNEDWKV IFKNANSIFLNFTERLKNQYPALNEEDLRYCCMVKMQFSQTDIAKIMHLEKDSVKKRLKR IRTEKMGIAQETTLEAVLRDF >gi|226332274|gb|ACIC01000046.1| GENE 4 4617 - 4718 70 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLSRIIVLVIAGVAVVYIVRFIDNFLSQHRRKY >gi|226332274|gb|ACIC01000046.1| GENE 5 4751 - 6799 1634 682 aa, chain - ## HITS:1 COG:no KEGG:BT_0761 NR:ns ## KEGG: BT_0761 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 682 1 682 682 1294 100.0 0 MDELNNGEQGQYVEPGKNEKKNNTRKIVKRTLITAGLALAVYMVYSIVYLFISPDRNIQQ IYLVPEDAAFIIQSSAPIEDWEKFSGSETWQCLRKAKSFEEVAKSVEKLDSVVKSNKVLL SLVGQRDLLISLHKTRATDWDFLLIVDMQKASKMDLLKDQVETVLAMSGFTVTNRMHNGI NILEMRDPDTRDVFYTAFVDNHLVGSYTSGLVESAINSRNKPKIGLDQSFIETEKLVSGK 
GLVRVFINYARIPQFMSIYLGARNEYVDLFSNSMNFAGLYLNMDKDRMEVKGYTLKKDSV DPYITALLNSGKHKMKAHEILSGRTALYTNIGFDSPVTFVKELENALSVHDKLLYDSYQS SRKKIESLFGISLEENFLSWMSGEFAITQSEPGLLGHDPEVILAIRAKSIKDARKNMEFI EKKIKRRSPVKIKSVNYKDFEINYVEMKGFFRLFFGKLFDKFEKPYYTYVDDYVVFSNKP ASLLSFVEDYEQKNLLKNNAGFKDAYSYMKSSSTIFLYTDMHKFYSQLKPMMNPATWNEI QSNKDILYSFPYWTMQVIGDNNQASLQYVMDYRPYEPEEVVAVVSDEEDQDMDEEAVTEK EQMSELKRFYIEKFEGNVLREFYPEGALKSEAEVKEGKRHGRYREYHENGKLKLRGKYSN NQPKGTWKYYTEEGEFERKEKF >gi|226332274|gb|ACIC01000046.1| GENE 6 7769 - 9454 489 561 aa, chain + ## HITS:1 COG:BH1123 KEGG:ns NR:ns ## COG: BH1123 COG4753 # Protein_GI_number: 15613686 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 456 559 421 524 526 69 35.0 1e-11 MYLKTISIAASFLIATIFENQSYLVLLLIVISFYLYWHLFHYLSIQKERNFLKEENQYFI DSFQSIRNPITLVHVPLRTICSDTCPENMKNDLLLAIRNIECLNENLDRLSGLKYLHNHP EAMDITEHELGSYVRSRIQSLQGYATNKHIKLIINTNFTYASVWFDKSKVSPIIDKFITN AIDCSKPKTNITLLISISQEHWEISVPDSGDGKLTKLYNQNKHRLMRQTTEFECDFAKSI LYKKLTNLCNGKIIVNSTYHTVTLKFPVKCSCKNQLNHSALSITNHTEIEKIDRLLSKAP CKRSSHKPTVVLADSNDDFRSYLTSCLTEEFTIKSFRDGIEAMAYIKNEYPDLVICDTEL QNMDGVELSSRLKTSCETSIIPIILYSSHIDINQHNRRKTSLADTFLQQPFSVTDLKLEM SILIKNTRFLRKSFLQRLFGEEFLETKASEILQDGKHPLISKVTKIILENLNNEKLTIDS IAKELGISRTSLYNKWTQLTGEALNKFILKIRMEKAHEMLKSGKYRVNEVPEKIGMKDMD NFREKYKKYFGKTPVDTIKNV >gi|226332274|gb|ACIC01000046.1| GENE 7 10024 - 10209 137 61 aa, chain + ## HITS:1 COG:no KEGG:BT_0759 NR:ns ## KEGG: BT_0759 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 61 1 59 59 88 100.0 7e-17 MEMRAELYEFLLENKFKNGIMFKRSIELFVEHYNMEGTVKEDSLMRAFKRWRKAMKDNRK Y >gi|226332274|gb|ACIC01000046.1| GENE 8 10341 - 10913 345 190 aa, chain - ## HITS:1 COG:yjhQ KEGG:ns NR:ns ## COG: yjhQ COG3153 # Protein_GI_number: 16132128 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Escherichia coli K12 # 
12 175 11 173 181 179 55.0 2e-45 MINSINIQIRETNTDDFDSIMTVEKQAFGYDKEAQLVADLLADKTAEPMVSLLAFYKGEA VGHILFTRAYFDGQGALQMMHILAPLAVKPEYQRQGIGGMLIRAGIERLQEKGSCLVFVL GHKEYYPEYGFIPDAARLGYPAPYPILEQFSDYWMVQAISPKGFDVDKGKIRCSDELNKP EHWRDDESDR >gi|226332274|gb|ACIC01000046.1| GENE 9 11137 - 13170 1784 677 aa, chain - ## HITS:1 COG:SSO3036 KEGG:ns NR:ns ## COG: SSO3036 COG3250 # Protein_GI_number: 15899743 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sulfolobus solfataricus # 23 578 6 552 570 171 28.0 6e-42 MSCCRLMAAVLFLFCSMPLFAQRQDILLNDNWKFRFSHQVQKGSEIRVDLPHTWNAQDAL SGKIDYKRGIGNYEKKLFIRPEWQGKRLFLRFEGVNSIADVFINRRHIGEHRGGYGAFVF EITGEVNYGKENSILVRVNNGEQLDVMPLVGDFNFYGGIYRDVHLLITDEACISPLNYAS PGVRLIQDSVSHQYAKVRAVVDLANGNNAGQEVELGVRLLDGQKVVAQQKQTLTLAGNAA LQQELTFEINNPHLWNGRQDPFLYQAEVSLSRGGQLVDCVTQPLGLRYYRIDPDKGFFLN GKHLPLHGVCRHQDRSEVGNALRPQHHEEDVALMLEMGVNAVRLAHYPQATYFYDLMDKN GIIVWAEIPFIGPGGYDDKGFVNLPSFRANGKEQLKELIRQHFNHPSICVWGLFNELLEF GDNPVEYIKELNELAHQEDTTRPTTSASNQMGDLNFITDAIAWNRYDGWYGGTPADLGRW LDGMHRDHPEIRIAISEYGAGASIYHQQDSLVKSVPNSWWHPENWQTYYHIENWKAISAR PYIWGSFVWNMFDFGAAHRTEGDRPGINDKGLVTFDRKVRKDAFYFYKANWNKEEPVLYL AGKRNTVRSRRLQTITAFTNQPGAELFVNGKSYGKAMPDQYAILEWKNVELQPGENEIKV VSINKKVKLMDSFRCKL >gi|226332274|gb|ACIC01000046.1| GENE 10 13187 - 14635 1076 482 aa, chain - ## HITS:1 COG:PA0031 KEGG:ns NR:ns ## COG: PA0031 COG3119 # Protein_GI_number: 15595229 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 36 469 6 433 503 145 29.0 2e-34 MKNAISVKYTLLCGTACCMASAVQAQRNVSDVSRMNVLFLMADDMRPELGCYGVEAVKTP NMDRLASSGVLFQNAYCNVPVSGASRASLLTGVYPHYPDRFVNFSAYASKDCPEAIPLSG WFTKNGYHTVSDGKVFHHMSDHAASWSEPPYRNHPDGYDVYWAEYNKWELWMNSESGKTI NPKTMRGPFCESADVPDTAYDDGKLAERAIRDLRRMKEMNKPFFLACGFWKPHLPFNAPK KYWDLYKREEIPLAPNRFRPEGLPEQVRNSSEIYAYARVSDTSDADFQREVKHGYYACLS YVDAQIGKVLDALDELGLAENTIVVLLGDHGWNLGEHDFVGKHNLMDRSTHVPLIIRVPG 
RKKGKTRSMVEFVDLYPTLCELCQIPQPAEQLDGQSFAKVFSNLKAKTKDEVYIQWEGGD NAVDQRFSYAEWMKGDVKKASMLFDHRIDKEENKNRVNEKKYKSKVESLSSFIKVKKSSL KK >gi|226332274|gb|ACIC01000046.1| GENE 11 14709 - 16514 1704 601 aa, chain - ## HITS:1 COG:no KEGG:BT_0755 NR:ns ## KEGG: BT_0755 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 601 1 601 601 1221 99.0 0 MKKYILGTGISLCLLMSSCLNDSFLEVYPKGQQTEATVFTTYDNFKTYAWGLYNVFFGYT YDTGQTDEIFRGDFESDNMIKGLSGYEGQWAYQKAKATDEAKEWDYDYIRRVNLMLDNID GSQMNDTEKEHWRSVGYFFRSYKYFQMLSRFGDIPWVENALKEDSPELYGKRDNRDLVAS NILSNLKYAETHIGADDGKNTIGISVVQALISRFALFEGTWRKYHGLSDAETYLKECVRA SEEVLKVYPNVHPQYDELFNSESLDGVTGIILYKAYETGQLMHGLTRMVRTGESYIEATK DAVDSYLCTDGRPVSTTTSRYGGDKEIYGQFRDRDYRLYLTICPPYMVKKENGPSTADWK YTDDAQDREFIDLLATISGETYHRLPSSNFKGFTVQGQPHFKNQNWGQGWNASQMGFWVW KYYNTHTVATNANGVNTTDAPLFRVGEVMVNYAEAMCELGKFDQTAADKSINKLRARGHV AKMTVEDITDDFDTARDPSVPALLWEVRRERRVELMGEGFRLDDLRRWKKGDYVNKRPLG VYVTGASAKNLKVTGGPSNDEGYVYFFDAPLGWQEHYYLYPLPLKQLALNTNLEQNPVWT K >gi|226332274|gb|ACIC01000046.1| GENE 12 16544 - 19960 3417 1138 aa, chain - ## HITS:1 COG:no KEGG:BT_0754 NR:ns ## KEGG: BT_0754 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1138 1 1138 1138 2181 99.0 0 MRLIIRKTLLMFLLLSLSNVLTGFQLSAQEQNKKFSVKADNVTLKEAIEVVRKQGNYSFL IRNNDIDLNKKVSVNVDKGTINDLMAQLLTGTGISYEVNGNRVVIFHAAVPEKEQGKAFV LKGKVTDPSGEGVIGANVKVLNSTEGTITDMDGNFSLSVTPNACLSVSYIGYATQEVVVK NQTPLHIALKEDSRLIDEVVVVGYGVQKKANLTGAVSSVKMDEVLGDRPVVSVSDALKGA MPGLQITGNSGRPGEEMSFNIRGVNSLDKNGKPLVLVDNVEMDINMLDPNDIESVTVLKD AASSAIYGARAAFGVILITTKKGSDSTRLSINYSNNFSFSRPANMPRKATPLQTVQAYKD MGTINYQSGQNVDTWLDLLKEYNANPSAYPDGYAMVDGLRYSLAETNLFDDMMETGFQQT HNLSVGGGTKDISYRFSFGMVDENGVLASDKDAYKRYNVSSYIRSDVFSWITPELDIKYT NSKSELPETSGGYGIWGAAVAFPSYFPTGTMNIDGEELPINTPRNLINLAYPTTIQKNNI RIFGKVTITPLKNVKLVGEYTYNHLSNEKTKFEKKFYYAHGGNFVKETSTANSKYENSNG ITDYNALNFYGNYNNTWGKHEVTVMGGFNQESSDYRYAEMSRMNMINEDLPSISQATGDY FAKDKFERYTVRGLFYRINYSFAGKYLIETNGRYDGSSKFPKDSRFGFFPSVSAGWRVSE 
EAFMKPLTSVLSNFKLRASWGNIGNQSITPYAYIPGMDAEQAYWTVSGIKVTTLKPAALV SNSFTWEKVTTVDVGFDLGLLNNRLNLVFDWYRRDTKGMLAPGAELPAVLGASAPLQNTA DLRSKGWEITVDWNDQIGKVKYNLGFNLYDAKTKITKYNNETGLFGKDKNDKDTYRVGME LGEIWGYVTDRLYTVDDFDADGKLKAGIPKVEGYNPNPGDILYKDLDDNDIINGGTSTTK DPGDRKIIGNSTRRYQYGIHGGASWKGFSLSFLLQGVGKRDLWIMNDLFYPHYDAWTTVY DTQLNYWTPERTDSYFPRLYEKAAGNTAANTRIQTRYLQDGSYLSIRNITLSYNFPSKWM NKIGVNNLAVFFSGENLYTFDHLPKGLDPERSVTDDLGQRGFTYPYMRQYSFGINLSF >gi|226332274|gb|ACIC01000046.1| GENE 13 20056 - 21081 845 341 aa, chain - ## HITS:1 COG:RSc2919 KEGG:ns NR:ns ## COG: RSc2919 COG3712 # Protein_GI_number: 17547638 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Ralstonia solanacearum # 131 307 70 241 274 80 32.0 6e-15 MIEYYDKEELELQISNYLSGNSTDEEKETLLAFLASNEAAARTFREMSAVWALSSVPSFA EIENSNLVRIKERMTAPASSKPVRKLIPVWLKVAAAVILLIGCNYFWYTYTENLTEVYTN ADSPYEIKVPAGSRTNIVLPDGTEVSLNAGSVLRYHRGFGIRERNVTLDGEGYFKVAKNA EVPFFVKTNDVQVQVVGTVFNVRAYDDDNYVMVSLLEGRVNLSVSANSVMKLFPNEQALY NKNTGRMEKLKTNASKACDWLDGGLTFENASFADIAHRLERKFQVKISIESERLKAEHFS GSFDSNQNIYDILHEINVEKQYTWKVSGDTIFITDKRKGVR >gi|226332274|gb|ACIC01000046.1| GENE 14 21159 - 21746 416 195 aa, chain - ## HITS:1 COG:no KEGG:BT_0752 NR:ns ## KEGG: BT_0752 # Name: not_defined # Def: putative RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 195 1 195 195 367 100.0 1e-100 MNLPEVSEKLIEQLNHGSTKAFDKIYHTYYLYLCAIAVYYVHDKRVAGEIVNDVFVSFWQ NRHHITYPALPYLRRAIQNASISYLRSSAFNERIMTEQMEEIWAFLENHILSSDNPLQAL ESSEMNEIILRKVEELPAKCRAVFKASLYEGKSYSEIAEEQNINVATVRVQMKIALTKLR ESLGTPYMIAILMFL >gi|226332274|gb|ACIC01000046.1| GENE 15 21840 - 23294 1032 484 aa, chain - ## HITS:1 COG:FN0470 KEGG:ns NR:ns ## COG: FN0470 COG2978 # Protein_GI_number: 19703805 # Func_class: H Coenzyme transport and metabolism # Function: Putative p-aminobenzoyl-glutamate transporter # Organism: Fusobacterium nucleatum # 6 484 22 503 512 307 37.0 4e-83 
MKSKWRMPHPATMFLLLTMAVVFLSWICDIYGLKVTLPQSGEDIRVQSLLSPEGIRWWLR NAIKNFTGFAPLGMVIIAMFGLGVAQHSGFIDACIRLGVGNRKEKRKVILWVIVLGLLSN VIGDGGYIILLPIAAMLFQWVGLHPIAGIVTAYVSVACGYSANIVLSTMDPLLAHTTQEA ALTLMGYQGNTEPLCNYFFMSASTVVITGIVYWVTQKWLLPNLGKYEGSVKVEAYRPLSR KERRALMVAVTVAGIYVALILWLTFSSYGILRGVNGGLMHSPFIAGILFLLSLGAGFTGM AYGFSSGRYRSDNDVIEGLTQPIKLLGVYFVIAFFAAQMFACFEYSHLDKCLAIMGADLL SSFEPAPLSALILFILFTAFINLIMVSATSKWAFMSFIFIPMFAQMGISPDVTQCAFRIG DSSTNAITPFLFYMPLVLTYMRQYDKQITYGSLLKYTWRYSLCILAAWTLLFIVWYLLKI PMGL >gi|226332274|gb|ACIC01000046.1| GENE 16 23284 - 24012 1075 242 aa, chain - ## HITS:1 COG:no KEGG:BF2233 NR:ns ## KEGG: BF2233 # Name: not_defined # Def: two-component system response regulator # Organism: B.fragilis # Pathway: not_defined # 1 242 1 242 242 439 95.0 1e-122 MEEQKFKVIIVEDVKLELKGTEEIFRHEIPNAEVIGTAMTESEFWPLMEAQLPDLVLLDL GLGGSTTIGVDICRNIFKRYKGVRVLIFTGEILNEKLWVDVLNAGADGIILKTGELLTKT DVQAVMDGKKLVFNYPILEKIVDRFKKSVANDAKRQEAVINYDIDEYDERFLRHLALGYT KDMIANLKGMPFGVKSLEKRQNDLIGRLFPNGERVGVNATRLAVRALELRIIDLDNLEAD EE >gi|226332274|gb|ACIC01000046.1| GENE 17 24048 - 28445 3140 1465 aa, chain - ## HITS:1 COG:MA2553_2 KEGG:ns NR:ns ## COG: MA2553_2 COG0642 # Protein_GI_number: 20091380 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 657 843 252 422 428 66 32.0 5e-10 MKNYGRKTENEERRMKDSYRLSFYSPFSIFHSPLYIIGVLLLSSLFSCTDMVPTKEVRLI DSLNGKAYAYRYRNLDSSYKYAYKAYRQVNLYKSGKAEASNNLGFCAFMNMDFDRAEAYH KEVYKLTKNELELLIADIGLMKICQRTALNKEFYDYRNSALRRMKRIREESDLFADRHEA LRLDYAFTEFFLVSSIYYYYLQQRQEAITSIDNIQEDEALSDTNQLLYYHYLKGSASLVA ANTPEERKLREFDELYFTWRTAVKSKHPYFEGNGMQGLANLMAAPSNFEFFKTRRTHALD QFDFPVDSLFPLRLAQLALEKFREYNDLYQIAGAYVSIGKYLNAHGRYQEALDTLSKALN CVNHHHMLYYHNEVDTLDKLYTFAEGDTTYTGVPWIGQEKVKTVPEWISRIREQLSVSYA GLGMKDASDYNRNIYLDILNFTRQDKELESRKLSLEAGSRQMTLVLFLVIVGLILVIILW WFFNKRSKIRNQVDVERLQRILSLCRDITSSIPMNIPLIQQGIDQLFGKGRLTLEIPEEG KAALVPSSHRLNRDEKALVHVLEPYIVWAAENEQMVAALSDERIQLEKQRYIYEQHIAGN 
KRQNLIKKACMAIVSGINPYIDRILNEVHKLTEKGYIDDAKIKKEKYQYIDELVTTINEY NDILALWIKMKQGTLSLNIETFSLNELFDLLGKGRRAFEMKKQKLEIESTTLMVKADRAL TLFMINTLAENARKYTPGGGMVKVYAHATDAYVEISVEDNGRGLSAEDIAHIIGEKVYDS RAIGMKDAADPEELKENKGSGFGLMNCKGIIEKYKKTNELFRVCVFDIESEPGKGSRFYF RLPSGVRKVIGVLLCLLLPVGFGSCLQTPTSPILQNKDSIVIITDSIYEELLDIASYYAD TAYSANVHGRYELALQYIDTAMMFLNEHYEKYAHSNQPHRHMKLVGEGIPAEIWWWNEYF DSDYHVILDIRNEAAVAFLALKELDAYSYNNAAYTDLYKVQGEDQNLEGYCRQLERSNTN KTVGIILCFVLLIVSLIGYYLLYMRKMLQNRLNLEQVLEINQKVFAASLVRPQTQENEEA LQREENTLKEIPQRIVNEAFGSVNELLTISLMGIAVYNETTHRLEYASCPGQEMPEMVQQ CFETGEYLSEQNLQAIPLMVEAGGEHQCVGVLYLERREGTQQETDHLLFELVARYVSIVV FNAVVKLATQYRDIESAHEETRRASWEDSMLHVQNMVLDNCLSTIKHETIYYPNKIKQII GRLNTQNLPEKDEREAVETITELIEYYKGIFTILSSCAARQLEEVTFRRNTIPVQELFDA AGKYFRKVTKNRTEKVELIIEPMDAKVIGDVNQLRFLFENLIDEALAVREEGVIRLQARK DNEYIRFLFTDTRREKSVTELNQLFYPNLARMTSGEKGELRGTEYLICKQIIRDHDEFAG RRGCRINAEPAEGGGFTVYFTIPRR >gi|226332274|gb|ACIC01000046.1| GENE 18 28602 - 29540 923 312 aa, chain + ## HITS:1 COG:Cj0918c KEGG:ns NR:ns ## COG: Cj0918c COG0462 # Protein_GI_number: 15792247 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Campylobacter jejuni # 7 311 4 309 309 312 51.0 6e-85 MSEKAPFMVFSGTNSRYLAEKICASLNCPLGNMNITHFADGEFAVSYEESIRGAHVFLVQ STFPNSDNLMELLLMIDAAKRASAKSVVAVIPYFGWARQDRKDKPRVSIGAKLVADLLSV AGIDRLITMDLHADQIQGFFNIPVDHLYASAVFLPYIQSLKLEDLVIATPDVGGSKRAST FSKYLGVPLVLCNKSREKANEVASMQIIGDVKDKNVVLIDDIVDTAGTITKAANIMMEAG AKSIRAIASHCVMSDPASFRVQESGLTEMVFTDSIPYAKKCAKVKQLSIADMFAETIKRV MNNESISSQYII >gi|226332274|gb|ACIC01000046.1| GENE 19 29629 - 30798 1219 389 aa, chain - ## HITS:1 COG:slr1485 KEGG:ns NR:ns ## COG: slr1485 COG4642 # Protein_GI_number: 16329198 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Synechocystis # 46 360 27 341 349 166 34.0 7e-41 MRKYLYTTLFLTLLAQGGVMAQDNNTSKGGFFGKLKDTFSTEIKIGNYTFKDGSVYTGEM 
KGRKPNGKGKTVFKNGDVYEGEYIKGKREGYGIYSFPDGEKYEGQWFQDQQHGKGIYYFM NNNRYDGMWFQDYQHGAGTMYYHNGDLYVGNWANDKREGEGTYTWANGAKYSGHWKNDKK NGKGTMNWDDGCKYDGDWKDDVRHGKGVFEYTNGDKYDGDWADDIQHGKGTYFFHTGDRY EGSYLLGERTGEGVYYHANGDKYVGNFKDGMQDGKGTFTWANGAVYEGDWKNNKREGKGI YKWSNGDVYEGDWKNNRPHGQGSLKTVAGMQYKGGFVDGLEDGQGVQIDKDGNRFEGFFK QGKKDGPFVETDKDGKVIKKGTYKMGRLL >gi|226332274|gb|ACIC01000046.1| GENE 20 30882 - 32165 1144 427 aa, chain - ## HITS:1 COG:XF0816 KEGG:ns NR:ns ## COG: XF0816 COG0612 # Protein_GI_number: 15837418 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Xylella fastidiosa 9a5c # 3 423 531 960 990 107 23.0 5e-23 MDRTIQPEIQALKNFHILSPVRTTLPNGVPLTVVNAGEQEVVRMDILFAGGRWQQSQKLQ ALFTNRMLREGTQKYTAATIAEKLDYYGSWLELSSSSEYAYITVYSLNKYLAKTLEVVES MIKEPVFPEKELHTILDTNIQQYLVNTSKVDFLAHRGLLQALYGTQHPCGQIVVEEDYHA ITPEVLRDFYGRHYHSGNCSIFLSGKVTEDIIRRVTGAFGTPFGQYQLKASKPIFSFVAV PEKRIFIEREDALQSAVKMGCTTITRQNPDYLKLRVLMTLFGGYFGSRLMSNIREEKGYT YGISAGIMFYPDSGLQGISTETDNEYVEPLIQEVYNEIDKLHREPVPMEELTMVRNYMLG EMCRSYESPFSLADAWIFIATSGLDDQYFSRSLQAVNEVTPQEIQELAQRYLCKETLKEV IAGKKLS >gi|226332274|gb|ACIC01000046.1| GENE 21 32169 - 32921 802 250 aa, chain - ## HITS:1 COG:FN0807 KEGG:ns NR:ns ## COG: FN0807 COG1212 # Protein_GI_number: 19704142 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: CMP-2-keto-3-deoxyoctulosonic acid synthetase # Organism: Fusobacterium nucleatum # 1 247 1 239 245 192 44.0 4e-49 MKFLGIIPARYASTRFPAKPLAMLGGKTVIQRVYEQVAGILDDAYVATDDERIEAAVKAF GGKVVMTSIDHKSGTDRCYEACTKIGGDFDVVVNIQGDEPFIQPSQLNAVKACFEDPTTQ IATLVKPFTADEPFAVLENVNSPKVVLNKNWNALYFSRSIIPFQRNADKEDWLKGHTYYK HIGLYAYRTEVLKEITALPQSSLELAESLEQLRWLENGYKIKVGISDVETIGIDTPQDLK HAEEFLKNRS >gi|226332274|gb|ACIC01000046.1| GENE 22 32918 - 33298 340 126 aa, chain - ## HITS:1 COG:no KEGG:BT_0744 NR:ns ## KEGG: BT_0744 # Name: not_defined # Def: 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 126 48 173 173 221 
99.0 5e-57 MHKCIICIGSNYNRKENLLLARRRLVDLFPTIRFTSEQETRPLFFRSPALFSNQVAMFFS EAEEERVRKELKAIEQSAGRRPEDKKEEKVSLDIDLLSFDDRVLKPEDLKREYVVKGLEE LKYNQI >gi|226332274|gb|ACIC01000046.1| GENE 23 33309 - 35648 2435 779 aa, chain - ## HITS:1 COG:aq_624 KEGG:ns NR:ns ## COG: aq_624 COG5009 # Protein_GI_number: 15606057 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase/penicillin-binding protein # Organism: Aquifex aeolicus # 1 752 1 679 726 280 30.0 7e-75 MIRKIINVLWVLLAVALIAIVVVFVSISKGWIGYMPPVEELENPNYKFATEVFSEDGKVL GTFSMEKNNRVYSSYADLSPNIIHALIATEDVRFAEHSGIDAKALFRAIVKRGLLLQKSA GGGSTISQQLAKQLFTEKVASNTMQRLLQKPIEWVIAVKLERYYTKEEILTMYLNKFDFL NNAVGIKTAASTYFGCEPKDLKIEQAAMLVGMCQNPSRYNPVSRNPKIRENALGRRNVVL RQMEKAGYISDAECDSLQALPLKLAYTRVDHKEGLATYFREYLRGVMTAKKPVKSEYRGW QMQKYYEDSLSWETNPLYGWCAKNKKKDGSNYNIYTDGLKIYTTINSHMQQYAEDAIKEH LGDFLQPLFFKEKQGSKNAPYARALPQARVEELLTRAMKQTERYNVMKSAGASEQEIRKA FDTPQEMSVFTWAGEKDTIMTPMDSIRYYKHFLRTGFMSMDPMTGYVKAYVGGPNYTYFQ YDMAMVGRRQVGSTIKPYLYTLAMENGFSPCDQVRHVEQTLITETGEAWTPRNANNKRYG EMVTLKWGLANSDNWISAYLMGKLNPYNLVRLIHSFGVRNKAIDPVVSLCLGPCEISVGE MVSAYTAFANKGIRVAPLFVTRIEDSDGNVLSTFAPQMEEVISVSSAYKMLVMLRSVINE GTGGRVRRYGITADMGGKTGTTNDNSDSWFMGFTPSLVSGCWVGGDERDIHFGTMTYGQG AAAALPIWATYMKKVYDDPTLGYSQTETFKLPEGFDPCAGSETPDGEVFEETGLDDLFN >gi|226332274|gb|ACIC01000046.1| GENE 24 35855 - 36796 1000 313 aa, chain + ## HITS:1 COG:VC2510 KEGG:ns NR:ns ## COG: VC2510 COG0540 # Protein_GI_number: 15642506 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, catalytic chain # Organism: Vibrio cholerae # 4 304 29 330 330 312 53.0 5e-85 MENRSLVTIAEHSKEKILYMLEMAKQFEMNPNRRLLQGKVVATLFFEPSTRTRLSFETAA NRLGARVIGFTDPKATSSSKGETLKDTIMMVSSYADIIVMRHYLEGAARYASEVAPVPIV NAGDGANQHPSQTMLDLYSIYKTQGTLENLNIFLVGDLKYGRTVHSLLMAMRHFNPTFHF IAPDELKMPEEYKLYCKEHQIKYIEHTEFTEEIIADADILYMTRVQRERFTDLMEYERVK NVYILRNKMLENTRPNLRILHPLPRVNEIAYDVDNNPKAYYFQQAQNGLYAREAILCDVL GITLEDVKNDILL >gi|226332274|gb|ACIC01000046.1| GENE 25 36793 - 37254 509 153 
aa, chain + ## HITS:1 COG:PH0721 KEGG:ns NR:ns ## COG: PH0721 COG1781 # Protein_GI_number: 14590598 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, regulatory subunit # Organism: Pyrococcus horikoshii # 8 150 4 148 152 137 48.0 8e-33 MSENKQALQVAALKNGTVIDHIPSEKLFTVVQLLGVEQMKCNITIGFNLDSKKLGKKGII KIADKFFCDEEINRISVVAPYVKLNIIRDYEVVEKKEVRMPDELHGIVKCANPKCITNNE PMPTLFHVIDKDNCIVKCHYCEKEQKREEITIL >gi|226332274|gb|ACIC01000046.1| GENE 26 37289 - 37855 479 188 aa, chain + ## HITS:1 COG:FN1468 KEGG:ns NR:ns ## COG: FN1468 COG1853 # Protein_GI_number: 19704800 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Fusobacterium nucleatum # 19 184 20 185 197 157 47.0 1e-38 MKQDWKPGTMIYPLPAILVSCGKDESEYNIITVAWTGTICTNPPMCYISVRPERHSYDII KKNMEFVINLTTKDMAFPTDWCGVRSGRNYRKFEEMKLTPGRCTVVSAPLIEESPLCIEC RVKEIVSLGSHDMFIADVVNVRADDRNLNPETGKFELAEANPLVYVHGGYYDLGEKIGKF GWSVEKKK >gi|226332274|gb|ACIC01000046.1| GENE 27 37872 - 38609 655 245 aa, chain + ## HITS:1 COG:no KEGG:BT_0739 NR:ns ## KEGG: BT_0739 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 245 1 245 245 479 100.0 1e-134 MNILRNIILFGWMAIACTAFAQDRNADKVERPNTKKINFGIKAGFNSSMFMVSELKIKDV TIDEVQNNYKIGYFGAIFMRFNIKKHFIQPEASYNVSKGEITFDKLGSQHPAIEPDYASV QSVLHSIDFPILYGYNVVKKGPYGMSIFAGPKLRYLWGKQNEITFTNFDQKGIHEKLYPL NVSVVIGVGVNISRIFFDFRYEQGIGNISKSIVYDNINSDGSTGVSPIIFRRRDSALSFS FGFIL >gi|226332274|gb|ACIC01000046.1| GENE 28 38830 - 40110 1522 426 aa, chain + ## HITS:1 COG:aq_479 KEGG:ns NR:ns ## COG: aq_479 COG0112 # Protein_GI_number: 15605959 # Func_class: E Amino acid transport and metabolism # Function: Glycine/serine hydroxymethyltransferase # Organism: Aquifex aeolicus # 1 426 5 412 428 479 57.0 1e-135 MKRDDLIFDIIEKEHQRQLKGIELIASENFVSDQVMQAMGSCLTNKYAEGYPGKRYYGGC EVVDQSEQIAIDRLKEIFGAEWANVQPHSGAQANAAVFLAVLNPGDKFMGLNLAHGGHLS 
HGSLVNTSGIIYTPCEYNLNQETGRVDYDQMEEVALREKPKMIIGGGSAYSREWDYKRMR EIADKVGAILMIDMAHPAGLIAAGLLENPVKYAHIVTSTTHKTLRGPRGGVIMMGKDFPN PWGKKTPKGEIKMMSQLLDSAVFPGVQGGPLEHVIAAKAVAFGEILQPEYKEYAKQVQKN AAVLAQALIDRGFTIVSGGTDNHSMLVDLRSKYPDLTGKVAEKALVSADITVNKNMVPFD SRSAFQTSGIRLGTPAITTRGAKEDLMIEIAEMIETVLSNVENEEVIAQVRARVNETMKK YPLFAD >gi|226332274|gb|ACIC01000046.1| GENE 29 40497 - 42164 1833 555 aa, chain - ## HITS:1 COG:SP1229 KEGG:ns NR:ns ## COG: SP1229 COG2759 # Protein_GI_number: 15901091 # Func_class: F Nucleotide transport and metabolism # Function: Formyltetrahydrofolate synthetase # Organism: Streptococcus pneumoniae TIGR4 # 1 554 1 555 556 563 53.0 1e-160 MKSDIEIARSVELKKIKQVAESIGIPREEVENYGRYIAKIPEQLIDEEKVKKSNLVLVTA ITATKAGIGKTTVSIGLALGLNKIGKKAIVALREPSLGPCFGMKGGAAGGGYAQVLPMEK INLHFTGDFHAITSAHNMISALLDNYLYQNQAKGFGLKEILWRRVLDVNDRSLRSIVVGL GPKSNGITQESGFDITPASEIMAILCLSKDVEDLRRRIENILLGFTYDDQPFTVKDLGVA GAITVLLKDAIHPNLVQTTEGTAAFVHGGPFANIAHGCNSILATKLAMSFGDYVITEAGF GADLGAEKFYNIKCRKSGLQPKLTVIVATAQGLKMHGGVSLDRIKEPNMEGLKEGLRNLD KHIRNLRSFGQTVVVAFNKFATDTDEEMEMLREHCEQLGVGYAINNAFSEGGEGAVDMAR LVVDTIENNPSEPLRYTYKEEDSIQQKIEKVATNLYGASVITYSSIARNRIKLIEKMGIT HYPVCIAKTQYSFSADPKIYGAVNNFEFHIKDIVINNGAEMIVAIAGEILRMPGLPKEPQ ALHIDIVDGEIEGLS >gi|226332274|gb|ACIC01000046.1| GENE 30 42371 - 44065 2018 564 aa, chain + ## HITS:1 COG:STM0870 KEGG:ns NR:ns ## COG: STM0870 COG2985 # Protein_GI_number: 16764232 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Salmonella typhimurium LT2 # 14 563 15 556 561 253 30.0 9e-67 MEWIINQLRDHPELAIFLTLFAGFWLGRFKIGKFSLGTVTSVLLVGVLVGQLNISVDGPM KAVFFLLFLFAVGYKVGPQFFRGLKKDGLPQVGFAVLMCIVSLIAPWILAKIMGYHVGEA AGLLAGSQTISAVIGVASDTINQLSISDAQKATFINAIPVAYAVTYIFGTAGSAWILASL GPKMLGGLEKVKADCKELEAKMGTSEADEPGFYTALRPVVFRAYKIDNEWFGKGKTVSEL ENYLVENDKRLFVERVRQKGVIEEVTPDMLLQPGDEVVLSGRREYAIGEEDWIGPEVIDA QLLDFPAETLPVMVTHRTFAGEKVSTIRALKFMHGVSIRSIKRAGINVPVLAQTVIDSGD ILELTGTKLEVETAAKQMGYIDRPTNQTDMIFVGLGILLGGLVGALAIHLGGIPISLSTS 
GGALIAGLVFGWLRSKHPTFGGIPEPSLWVLNNVGLNMFIAVVGIAAGPSFVTGFKEVGF SLFIVGALATAIPLLSGLLMGRYLFKFHPALTLGCTAGARTTTAALGAIQDALESDTPAL GYTVTYAVGNTLLIIWGVAIVLLM >gi|226332274|gb|ACIC01000046.1| GENE 31 44126 - 45754 1707 542 aa, chain + ## HITS:1 COG:no KEGG:BT_0735 NR:ns ## KEGG: BT_0735 # Name: not_defined # Def: aspartate aminotransferase (EC:2.6.1.1) # Organism: B.thetaiotaomicron # Pathway: Alanine, aspartate and glutamate metabolism [PATH:bth00250]; Cysteine and methionine metabolism [PATH:bth00270]; Metabolic pathways [PATH:bth01100] # 1 542 16 557 557 1122 99.0 0 MEKKTTGTAITKNFAKKMETISPFELKNKLIEMADESIKKMAHTMLNAGRGNPNWIATEP REAFFLLGKFGLCECRRVQSLEEGIAGIPQQEGIAARFEAFLKENEKEAGARLLKETYNY MLMEHAADPDRLVHEWAESVIGDQYPVPDRILHFTELIVQDYLAQEMCDRRPPKGTFDLF ATEGGTAAMCYVFDSLQENFLLNQGDSIALMIPVFTPYIEIPELRRYQFDVTEISADQMT PDGLHTWQYKDEDIDKLKNPQIKALFITNPSNPPSYALSPETAARIVNIVKNDNPNLMII TDDVYGTFIPHFRSLMAELPHNTLCVYSFSKYFGATGWRNAVIALHEDNIYDKMIARLSE EQTAILNKRYASLSLHPEKMKFIDRMVADSRQIALNHTAGLSLPQQMQMSLFAAFSLLDK EDRYKAKMQEIIHRRLHALWDSTGFTLIEDPLRAGYYSEIDMLVWAKKFYGDEFADYLQK TYNPLDVVFRLANETSLVLLNGGGFAGPKWSVRVSLANLNEADYVKIGQSIKRVLDEYAE NR >gi|226332274|gb|ACIC01000046.1| GENE 32 45761 - 47869 2114 702 aa, chain - ## HITS:1 COG:no KEGG:BT_0734 NR:ns ## KEGG: BT_0734 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 702 1 702 702 1352 99.0 0 MKRTYLLLLLLFMYWMYASAQQVQTAQEPADSVKNVAYIDSLYRELPEVMITGERPVVKA EQGKLVYDVPRLVGSLPVDNAYDAVKNLPGVVSMNDALTLGGQPVTVVINGKVTTLSVEQ LSNLLKSMPVSRIEKAEVMYSAPARYQVRGPMINLILTSGMGKEPSLQGELYTAYSQLHY ESLVERASLLYSGRKFSADLLYSYSYSRERRETDKEALHTLADGSVHPMNMYDITTSRHN NHQIRLGMDYAFTDKHLLSLVYTTAFTDVKPYATVTGAQNSVTDSHSEGQLHNAKLDYQT PFGLKAGAEFTYYHAPGSQLLHSTLGEETLNFLSKDNQRINQWRFYAGQEHTLGADWGLN YGVAYTTALDNSYQMYYDPETEALLPDNNMQSRRREQTLNFYAGLSKSFGEKLSADVSLA AEQYHTDMWNEWSLYPVANLTYLPAPGHILQFSLSSDKEYPEYWSMQNSTSYMGAYSEIQ GNPFLKPATNYEANISYILKGKYVLTAYYSRTKNKEMQTLYQSPERLVEIYKCFNFDFSE QAGLVMVVPFKVKKWLDSRITAIGFRYRQKDSDFWDIPFDRKLYTFVLTMNNTFTLSTKP 
DLKFTLSGFYQNKAIQGIFDLPRSGNLDAALRYTFAKGKAQLTLKCDDIFNTSTISTNVR FGQQNVKNHYMKTTRAFGISFNYKFGGYKEKKREEVDTSRFK >gi|226332274|gb|ACIC01000046.1| GENE 33 49199 - 50431 358 410 aa, chain - ## HITS:1 COG:no KEGG:SUN_2263 NR:ns ## KEGG: SUN_2263 # Name: not_defined # Def: hypothetical protein # Organism: Sulfurovum_NBC37-1 # Pathway: not_defined # 216 404 119 293 300 81 30.0 5e-14 MRRIEYRPEDIATLKKEYLQIFDKERSEMQKLWEPLRDKLRKISSKPDLYNDNIDDILIA DFDFLAKLYCDFTKHVETTKISSEECRDLEAIFHYSKTTKKKSGEEEAGEDEITSNIRRF QPDISAFFMRRADKLKIHTCHYCEISYINAYGYKMLFQNVEELLDKASLEDLAYFIPKAD GQPISLKSAKTVMKNRPYSGNLGKFDALSIWQSSVIKKSQQVENHKRNHYDLDHVLSKSR CPIVGLSLFNFVPSCPICNERLKGTGVLGKDATEMSSLSPTSSTYNFDNSVEISVTPKDT FAILNSQARPDDFKITFHYNNPLYQKSVDMFHLRERYNYHKGEALRLFDLLQDYSPAQIK MLHNVFAKSDPDTCYTEEKIKEDLFGEKFMSDNNRCFAKMKKDIIKGFKK >gi|226332274|gb|ACIC01000046.1| GENE 34 50418 - 52508 668 696 aa, chain - ## HITS:1 COG:no KEGG:FP1531 NR:ns ## KEGG: FP1531 # Name: not_defined # Def: hypothetical protein # Organism: F.psychrophilum # Pathway: not_defined # 61 683 66 663 664 216 30.0 2e-54 MHIIGIRLDGGKENVIKNLQPGWYPFGNFVEPVWKNNYEWRKPGSMAENLYQTCDPLPRT ISVSCVVGANGSGKSTLLEMLFRIINNFASELLPNSEDYKGRKLTFAEGLDAQLYFETDG TVGSILNKDDHVHYFYGINKDGQPNEVRGPLRRSYNILDSFFYTISTNYSIYSLNDDDYD SLDIQTGKNSSNEGKWLYGLFHKNDGYLAPITMTPYRQEGGIIDVTNEKELARQRLMTLF LLFESQNKAFINGYRPLELEYCFNQNYKNDTLKIYKNKVRKKLSSSNVDSLINSFSNCWK DIINKEFKCAHNNNPIVEETILFYLGYKTLKICMTYESFGRIINIKLFSDASSISKEQVN MIIDKIRSLEEDNHITLKINQCLTFLKRGYLAVENPITVNAKEFIKYNLDFDNQFLSSED KKKMKYDTYDQVFKLLPPPFFFCDMTLHRRVQVSETKKKIKEEIRLSRLSSGEKQMLFNL SYVIYHIKNIQSVKKDEYRVPYRHINLIFDEAELYFHPEFQRSFVSELLKMISWCHIDRR KIRSINIIIVTHSPFVLSDVPLENILYLEDGRHVKKMQQTFSANIHEILQSHFFMKYPMG EIAREALDRIITLYGKRDEDRERVLKDLSDNKDYYCYVENIIADPYLKSTIHRMLNELYS LEETPLQKLHREREETESKLKNIDNQIAELEKDEKN Prediction of potential genes in microbial genomes Time: Thu May 12 00:33:20 2011 Seq name: gi|226332273|gb|ACIC01000047.1| Bacteroides sp. 
1_1_6 cont1.47, whole genome shotgun sequence Length of sequence - 8733 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 5, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 119 - 856 458 ## gi|253568430|ref|ZP_04845841.1| conserved hypothetical protein - Prom 965 - 1024 8.0 2 2 Tu 1 . - CDS 1376 - 1669 254 ## BVU_1718 putative transposase - Prom 1708 - 1767 5.0 + Prom 2365 - 2424 3.7 3 3 Op 1 40/0.000 + CDS 2463 - 3977 1063 ## COG0642 Signal transduction histidine kinase 4 3 Op 2 . + CDS 3990 - 4685 689 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Term 4717 - 4761 0.9 + Prom 4742 - 4801 3.0 5 3 Op 3 . + CDS 4888 - 5424 455 ## BT_0731 hypothetical protein + Term 5500 - 5560 11.8 - Term 5499 - 5535 5.9 6 4 Tu 1 . - CDS 5596 - 6612 777 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Prom 6638 - 6697 6.2 + Prom 6580 - 6639 4.7 7 5 Op 1 . + CDS 6861 - 7763 760 ## BT_0729 transcriptional regulator 8 5 Op 2 . + CDS 7798 - 8697 602 ## COG2207 AraC-type DNA-binding domain-containing proteins Predicted protein(s) >gi|226332273|gb|ACIC01000047.1| GENE 1 119 - 856 458 245 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253568430|ref|ZP_04845841.1| ## NR: gi|253568430|ref|ZP_04845841.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 245 1 245 245 489 100.0 1e-137 MNTQLIEFLLETSLGQFMLAGEEGNLPLAKKSGLKLVFPLKSNDDERDRRRISEQEIRFL FVRVIDNAKDFDGYYAVEVPTVRKYRFKDIQPIVTGENKMDGYCSASIDVCLYNSKLERT NLIEFKAHNAPLKHIEKDILKLIKENGDNYFVHVLKNIDRGTLPVVIDKYKDSIRNVWSS DKCKNNCSRLTFYICVKSHSLIFKKSIDLNRDSECAQVVQELDGKFERHDGDITIGDWEK IKYAP >gi|226332273|gb|ACIC01000047.1| GENE 2 1376 - 1669 254 97 aa, chain - ## HITS:1 COG:no KEGG:BVU_1718 NR:ns ## KEGG: BVU_1718 # Name: not_defined # Def: putative transposase # Organism: B.vulgatus # Pathway: not_defined # 1 97 119 216 307 92 50.0 5e-18 MGMENNHTLSGEAEIDEVFMGRKNKNRHKDKKVEKCQRRSYKEKVPVFGILEKVVKLLLK LFNIHGDTFLSVINEYVTKGNVIYTDGVGYSDINIDY >gi|226332273|gb|ACIC01000047.1| GENE 3 2463 - 3977 1063 504 aa, chain + ## HITS:1 COG:alr1171 KEGG:ns NR:ns ## COG: alr1171 COG0642 # Protein_GI_number: 17228666 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 252 504 192 445 448 135 33.0 1e-31 MKLPLKHIIILVICSLTGIFVYQTYWLTGLYRTMKQEMNNNIKDAMRTSDFNEIVLRVNE LQKDNVEHGSVTVSAGYGADGKSLVTSQTVSYTDSTYKDTLHTRTETAVDTLAVNDNDPD ASAVASSESGLDVLLKKQDSMKELILSVQQGMHSGVDTYIDINLQKYDSLLTDVLKAHNI DVPHHTLYIYSGATQDSSQTFIDTLGIAGDSTYIPSPKAIRYDYEFNRHHSQRYQLIMEP ITSLVWKQMTGILVTSFVIFLILGFSFWFLIRTLLKQKTLEEMKSNFTNNITHELKTPIA VAYAANDALLNFNQAEEKSKRDQYLRVSQEQLQRLSGLVEQILSMSMESRKTFRLHPEEI CLKELITSLIEQHQLKADIPVHITLETEPEALTIVADRTHFSNIISNLIDNAVKYSKQEA EITIQCRQTEETVTITVSDHGIGIPLDKQKHIFDKFYRVPTGNLHNVKGYGLGLFYVKSM VEKHGGTITVKSESGKGSTFTITI >gi|226332273|gb|ACIC01000047.1| GENE 4 3990 - 4685 689 231 aa, chain + ## HITS:1 COG:MT1062 KEGG:ns NR:ns ## COG: MT1062 COG0745 # Protein_GI_number: 15840463 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Mycobacterium tuberculosis CDC1551 # 6 228 51 272 276 152 36.0 6e-37 MNKNDITVLLVEDELTLAMIIKDTLEENGFTIHTASDGEEGLHLFFELRPDVLVADVMMP 
KMDGFEMVRRIRQTDKQTPVLFLTARSAINDVVEGFELGANDYLKKPFGMQELIIRIKAL MGKAFSFTETKVSSRFEIGSYLFDPVAQTLLHTGVRQELSHRESEILKRLCENRNQVVNT QDVLLELWGDDSFFNSRSLHVFITKLRHKLSQDEQIRIVNVRGIGYKLIAN >gi|226332273|gb|ACIC01000047.1| GENE 5 4888 - 5424 455 178 aa, chain + ## HITS:1 COG:no KEGG:BT_0731 NR:ns ## KEGG: BT_0731 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 178 1 178 178 319 100.0 3e-86 MRKLILFLSLILSVGFASAQSEVPSDSIRRAPSSNIKEFGGFLLDMGLMNVAPPKLPKFS LDVPDVSKDYNQIFRLNTDASYTQGFTDAFSSPFSGFGYGYGWGLSSSPQFMQMGTFKLK NGWKINTYGDYDKDGWKVPNRSAMPWEKNNFRGAFELKSSNGNFGIRIEVQQGRNGLY >gi|226332273|gb|ACIC01000047.1| GENE 6 5596 - 6612 777 338 aa, chain - ## HITS:1 COG:alr4831 KEGG:ns NR:ns ## COG: alr4831 COG0451 # Protein_GI_number: 17232323 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Nostoc sp. PCC 7120 # 1 247 1 236 311 103 29.0 4e-22 MKALFIGGTGTISTDVVELAQQRGWEITLLNRGSKKLPEGVGSIIADIHDEEAVAKAIAD ESYDVVAQFIAYTAEDVERDIRLFRNKTKQYIFISSASAYQKPLADYRITESTPLVNPYW QYSRHKIAAEEVLMTAYRTTGFPITIVRPSHTYNGTKPPVSLHGNKGNWQILKRILDGKP VIIPGDGSSLWTLTHSKDFAKGYVGLMANPHAIGNAFHITTDESMTWNQIYQTIADALGK PLNALHVASDFLARHGGNYDFRGELLGDKAATVVFDNSKIKRLVPDFICTTSMADGLRQS VQYMLSHPETQTPDPEFDSWCDRVADAMAAADRAFEAE >gi|226332273|gb|ACIC01000047.1| GENE 7 6861 - 7763 760 300 aa, chain + ## HITS:1 COG:no KEGG:BT_0729 NR:ns ## KEGG: BT_0729 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 300 1 300 300 583 100.0 1e-165 MEEITKIESIDQYDQLFGLETLHPLVNVIDFSKATKTVDYYRMNIGFYSLFLKDAKCGDL RYGRKYYDYQEGTVVCMAPGQVIGIDNRKQTPIRTKSIGLLFHPDLIRGTSLGQNIKHYS FFSYEVNEALHLSEQEREIVTDCIHKIQIELEHAIDKHSKQLIVRNIELLLDYCMRFYER QFITRNQANKDIIVKFEQLLDEYFQNQTALTEGLPSVKYFADKVCLSSNYFGDLIKKETG KTAQEYIQHRIIELAKERILEGSQTVSQIAYELGFQYPQHFSRLFKKNVGCTPNEYKQQN >gi|226332273|gb|ACIC01000047.1| GENE 8 7798 - 8697 602 299 aa, chain + ## HITS:1 
COG:BS_ytdP KEGG:ns NR:ns ## COG: BS_ytdP COG2207 # Protein_GI_number: 16080067 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus subtilis # 174 296 642 766 772 71 29.0 2e-12 MGDKVLKIGSVHQCNCCMGSKTLHPLVSVIDLSKADLSSHTDFKFGFYTILLSECKCEAY RYGHQYYDFSDGTLLCLTPGESISMKDSDNTRPSKGWILAFHPDLICGTALGANIRDYTF FSYRPEEALHVSLREKQVILELLDKINQELQRCIDCYSQKIISKYIELLLDYCERFYERQ FITRNEANKKIIERFDRLLDDHFNTIQSQSSDLPTIEECVRVLRLSPAYFNDLLEYETGK SYKEYLQSKRFETAKEWLINTDKTLSQIAQELGFRNPQYFSRLFKKITGCLPNDFKMPN Prediction of potential genes in microbial genomes Time: Thu May 12 00:33:50 2011 Seq name: gi|226332272|gb|ACIC01000048.1| Bacteroides sp. 1_1_6 cont1.48, whole genome shotgun sequence Length of sequence - 42774 bp Number of predicted genes - 39, with homology - 39 Number of transcription units - 19, operones - 9 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 46 - 105 7.1 1 1 Tu 1 . + CDS 170 - 1351 773 ## BT_0727 hypothetical protein + Term 1523 - 1552 1.4 - Term 1509 - 1540 1.8 2 2 Tu 1 . - CDS 1577 - 1705 91 ## BT_0726 hypothetical protein - Prom 1802 - 1861 5.5 3 3 Op 1 . + CDS 1698 - 1874 77 ## gi|253568439|ref|ZP_04845850.1| predicted protein 4 3 Op 2 . + CDS 1898 - 2806 637 ## COG1091 dTDP-4-dehydrorhamnose reductase + Term 2997 - 3046 2.2 + Prom 2863 - 2922 9.3 5 4 Op 1 . + CDS 3109 - 3321 197 ## BT_0724 hypothetical protein 6 4 Op 2 . + CDS 3335 - 3613 289 ## BT_0723 hypothetical protein 7 4 Op 3 . + CDS 3610 - 4230 532 ## COG2431 Predicted membrane protein + Term 4251 - 4318 16.0 - Term 4242 - 4303 13.7 8 5 Op 1 . - CDS 4329 - 6887 2384 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member - Prom 6915 - 6974 2.9 - Term 7014 - 7062 -0.5 9 5 Op 2 . 
- CDS 7121 - 9088 1533 ## BT_0720 hypothetical protein
- Prom 9161 - 9220 5.7
- Term 9207 - 9248 -0.6
10 6 Op 1 42/0.000 - CDS 9254 - 10150 990 ## COG0224 F0F1-type ATP synthase, gamma subunit
11 6 Op 2 41/0.000 - CDS 10240 - 11823 1894 ## COG0056 F0F1-type ATP synthase, alpha subunit
12 6 Op 3 38/0.000 - CDS 11823 - 12383 443 ## COG0712 F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein)
13 6 Op 4 . - CDS 12389 - 12892 540 ## COG0711 F0F1-type ATP synthase, subunit b
14 6 Op 5 . - CDS 12905 - 13162 474 ## BT_0715 ATP synthase C subunit
15 6 Op 6 . - CDS 13221 - 14327 1118 ## COG0356 F0F1-type ATP synthase, subunit a
16 6 Op 7 . - CDS 14311 - 14745 391 ## BT_0713 hypothetical protein
17 6 Op 8 42/0.000 - CDS 14788 - 15036 287 ## COG0355 F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit)
18 6 Op 9 . - CDS 15052 - 16569 1759 ## COG0055 F0F1-type ATP synthase, beta subunit
- Prom 16655 - 16714 4.6
+ Prom 17056 - 17115 4.0
19 7 Tu 1 . + CDS 17236 - 17601 344 ## BT_0710 hypothetical protein
+ Term 17642 - 17693 3.8
20 8 Tu 1 . + CDS 17713 - 20076 1714 ## BT_0709 hypothetical protein
+ Term 20195 - 20251 5.1
- Term 20182 - 20238 5.1
21 9 Tu 1 . - CDS 20301 - 20546 223 ## BT_0708 hypothetical protein
- Prom 20570 - 20629 5.7
+ Prom 20568 - 20627 6.5
22 10 Op 1 . + CDS 20825 - 21334 532 ## BT_0707 hypothetical protein
23 10 Op 2 . + CDS 21399 - 21848 275 ## COG3023 Negative regulator of beta-lactamase expression
+ Term 22049 - 22100 8.8
24 11 Tu 1 . - CDS 22150 - 23085 832 ## BT_0705 hypothetical protein
- Prom 23108 - 23167 4.0
- Term 23114 - 23165 5.0
25 12 Op 1 . - CDS 23183 - 24349 1499 ## COG0027 Formate-dependent phosphoribosylglycinamide formyltransferase (GAR transformylase)
26 12 Op 2 . - CDS 24400 - 25623 1095 ## BT_0703 hypothetical protein
27 12 Op 3 . - CDS 25660 - 26283 478 ## COG4845 Chloramphenicol O-acetyltransferase
- Prom 26424 - 26483 5.9
- Term 26500 - 26532 -1.0
28 13 Tu 1 .
- CDS 26658 - 28175 1782 ## BT_0701 hypothetical protein
- Prom 28197 - 28256 5.8
+ Prom 28123 - 28182 3.8
29 14 Tu 1 . + CDS 28282 - 30498 2306 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases
+ Term 30539 - 30596 13.7
- Term 30587 - 30627 -0.6
30 15 Op 1 . - CDS 30629 - 31786 868 ## COG0477 Permeases of the major facilitator superfamily
31 15 Op 2 . - CDS 31795 - 32616 935 ## COG0413 Ketopantoate hydroxymethyltransferase
- Prom 32672 - 32731 5.4
- Term 32708 - 32766 10.1
32 16 Tu 1 . - CDS 32793 - 33440 767 ## COG0637 Predicted phosphatase/phosphohexomutase
- Prom 33545 - 33604 4.2
+ Prom 33455 - 33514 5.0
33 17 Tu 1 . + CDS 33592 - 34962 1219 ## COG0534 Na+-driven multidrug efflux pump
- Term 35009 - 35058 -0.4
34 18 Op 1 . - CDS 35099 - 37414 1728 ## BT_0695 ABC transporter, permease protein
35 18 Op 2 . - CDS 37427 - 38110 358 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc)
36 18 Op 3 . - CDS 38142 - 40451 1511 ## BT_0693 hypothetical protein
- Prom 40561 - 40620 5.4
37 19 Op 1 . + CDS 40747 - 41550 859 ## BT_0692 calcineurin superfamily phosphohydrolase
38 19 Op 2 . + CDS 41531 - 42334 564 ## BT_0691 hypothetical protein
39 19 Op 3 .
+ CDS 42403 - 42772 301 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains Predicted protein(s) >gi|226332272|gb|ACIC01000048.1| GENE 1 170 - 1351 773 393 aa, chain + ## HITS:1 COG:no KEGG:BT_0727 NR:ns ## KEGG: BT_0727 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 393 1 393 393 782 100.0 0 MKTERKYHHLIKSVIIVLFLSFGMTSCEKEEPVPVPTEQTVFMYLPWSDNLTSNFYQNIS DLESVVEKNILKDERIIIFMCTTATKATLFELAYENGKSVHKTLKNYTDPAYTTAEGITS ILNDVQRYSPTKRYSMVIGCHGMGWIPVSNSKSRSGLRTKMHWEYENVPMTRYFGGLNAQ YQTDITTLAKGISNAGLKMEYILFDDCYMSSIEVAYALKDVTDYLIGSTSEVMAYGMPYA EIGQYLIGKVDYAGICDGFYSFYSTYSTPCGTIAVTDCSELDNLATIMKEINHRYTFDPS LTSSLQRLDGYYPVIFFDYGDYVSKLCPDETLVARFNEQLNRTVPFKRNTEYFYSMSRGE VKINTFSGITISDPSTHSLASKKEETAWYAATH >gi|226332272|gb|ACIC01000048.1| GENE 2 1577 - 1705 91 42 aa, chain - ## HITS:1 COG:no KEGG:BT_0726 NR:ns ## KEGG: BT_0726 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 42 1 42 42 68 100.0 5e-11 MRMPDGLDEVPGAYKDIAQVIANERDLVKPLVELAPMAVIKG >gi|226332272|gb|ACIC01000048.1| GENE 3 1698 - 1874 77 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253568439|ref|ZP_04845850.1| ## NR: gi|253568439|ref|ZP_04845850.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 58 1 58 58 100 100.0 4e-20 MRIGPDNDMLKFAKEQKKKGTTKSICVTKIIFYFCENKYRKAISEPKLTHVTEKNSYD >gi|226332272|gb|ACIC01000048.1| GENE 4 1898 - 2806 637 302 aa, chain + ## HITS:1 COG:APE1179 KEGG:ns NR:ns ## COG: APE1179 COG1091 # Protein_GI_number: 14601229 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Aeropyrum pernix # 4 285 8 281 305 130 29.0 2e-30 MKNIIIIGANGFTGRQILNDLSSKAQYKVTGCSLHPDILPKNGGNYHFITTDIRDEAVVK QLFKDVQPDVVINCSALSVPDYCETHHEEAWLTNVTAVEQLAHLCESYKSRFIHLSTDFV FDGKIDEKSGQLYTEKSLPAPVNYYGFTKWKGEEKVTEICSNYAIARVEIVYGKALPGQH GNIVQLVMNRLNAGQEIRVVSDQWRTPTYVGDVSAGVQHLVENAANGIFHICGEECLTIA EIAFQVADYMKLDRSLVHPATTEEMQEATPRPRFSGMSIAKARTILGYQPRKLKDILMEW KH >gi|226332272|gb|ACIC01000048.1| GENE 5 3109 - 3321 197 70 aa, chain + ## HITS:1 COG:no KEGG:BT_0724 NR:ns ## KEGG: BT_0724 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 70 1 70 70 121 100.0 7e-27 MRKFPISARHVACMYIGSADSSKFPLLILTPNCLLTYFVTHRQYFHLIVNKMYVFLGIIY AIYELFNLTN >gi|226332272|gb|ACIC01000048.1| GENE 6 3335 - 3613 289 92 aa, chain + ## HITS:1 COG:no KEGG:BT_0723 NR:ns ## KEGG: BT_0723 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 92 1 92 92 120 100.0 1e-26 MFSIISTMFLGIGIGYGLRNWSILQKTEKTISLTIFLLLFILGVSIGSNSLIVNNLGKFG WQAIILAVSGVLGSLLAARLVLQLFFKKGGKQ >gi|226332272|gb|ACIC01000048.1| GENE 7 3610 - 4230 532 206 aa, chain + ## HITS:1 COG:FN1083 KEGG:ns NR:ns ## COG: FN1083 COG2431 # Protein_GI_number: 19704418 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 5 206 2 195 198 94 34.0 1e-19 MKGSLIVIVFFCVGCIMGVFNKFQFDTHTVSMYILYALMLQVGISIGSNKNLKAIISHLH PKMLLIPLGTITGTLLFSALASILLSQWSVFDCMAVGSGFAYYSLSSILITQFKEPTIGI QLATELGTIALLTNIFREMMALLGTPLIKKYFGKLAPISAAGVNSMDVLLPSISRYSGKE MIPIAILHGILIDISVPVFVSFFCNL >gi|226332272|gb|ACIC01000048.1| GENE 8 4329 - 6887 2384 852 aa, chain - ## 
HITS:1 COG:SPBC887.14c KEGG:ns NR:ns ## COG: SPBC887.14c COG0507 # Protein_GI_number: 19113280 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Schizosaccharomyces pombe # 21 406 328 758 805 162 30.0 3e-39 MENNPELQLAWQFIENTGTHLFLTGKAGTGKTTFLKRLREHTPKRMVVLAPTGIAAINAG GVTIHSFFQLSFAPFVPETTFNSAKMHYRFSKEKRNIIRSMDLLVIDEISMVRADLLDAV DAALRRYRDREKPFGGVQLLMIGDLQQLAPVVKDNEWELLKNHYETPYFFASRALKETVY MTIELKKVYRQSDTFFLSLLNKIRENKADDEVLNELNRRYQPGFRPRKEEGYIRLTTHNY QAQQVNDRELASLSGRAYSFRAEIEGNFPEYSYPADEVLTVKEGAQIMFLKNDVSSEKRY YNGMIGEVMSVDEKGMSVRGKDSSDVFQLLPEEWGNYKYVLNEETKEITEEIEGVFRQYP IRLAWAITIHKSQGLTFERAIIDARNSFAHGQTYVALSRCKTLEGMVLETPLRREAIISD SIVDNFTRDVEQNKPGSKQLNDMQKAYFYDLLNDLFNFYSLEQAYKRLLRMMDEELYKLY PKLLAEYKELAPHLKEKVVEVSNRFRNQYTRLINENENYAGNEELQQRIHSGAAYFCEAL LPVRVLFDKTSMPLDNKELRKQLNERLQALDDALWIKESLLDAMRTEKFTVTDYLKRKAK VMLSLEGETASSGSSTKAPKEKKERKERTRSNSGKVKIEVPTDILHPELYRALSEWRTAK TREVNMPAYVIMQQKALMGIVNLLPDTPVALEAIPYFGAKGVEKYGLEILGIVRKYMKEN KLERPEIFTSVSGGEDVREQKLDAQHHEDKKKQKEKKKDSKLVSYEMFCQGMTIEEIAKT RDLVTGTIAAHLEHYLREDKIKLEQVVTADKIKKVRTYLETHEFMGVVAIKAALGDEVSY ADIRFVLATAGH >gi|226332272|gb|ACIC01000048.1| GENE 9 7121 - 9088 1533 655 aa, chain - ## HITS:1 COG:no KEGG:BT_0720 NR:ns ## KEGG: BT_0720 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 655 1 655 655 1281 99.0 0 MTQKRLFGISGAIATAALIGGLLAGCMEKDLYDPNYGKEPLPDPSEYFGFETRGDVKLSV NYDVPGLVALLEVYDEDPMEVVDNVPVKKEGVEALFKIFTDGNGKYEGKMNIPTSVETVY LYTSSWGVPRCVKLDIKDGMASFDMSKKDSSTANTKAVTRSYDFSKGNVPYLINSGANLY SLCKWGEGGNLTYIYNSQTGKPDPINNGYVTPQKQVGNELIGNLVERLNSFFKPSGGSVD NAYLYKGSEITNINVTQEGTTLDLVFLDRDASFNNSFGYYYYKTSDGVSNSNMRKYIVFP NVAFSVYNGALSILKCGDKVRLLFWGEDGKPSEKFPKGYTVGWFIYADGYHQNENEIDIT KPLITSNGYSNGGFVSVKDEKSGKTIIGVEDGANRSFCDLLFYVDASPESSIDNPNRPNI PSDDKPIEKPDAQENLKGTLAFEDIWPDGGDYDMNDVIVEYNRAIYFNTENMINKIVDTF TPTHDGAMYDNAFAYQIDGGQFGKVTSDKDIKVESETSSIIVFPSVKQAVKGKVGTCTIT 
RTFEKATFNKENLKIYNPYIIVKYAAGQQNRTEVHLPKYSPTSYADKSLIGSSKDVYYID RDGAYPFAIDIPMLNFIPVTETHNIDTEYPYFKNWADSWGSKSTDWYKRYQSPKN >gi|226332272|gb|ACIC01000048.1| GENE 10 9254 - 10150 990 298 aa, chain - ## HITS:1 COG:BS_atpG KEGG:ns NR:ns ## COG: BS_atpG COG0224 # Protein_GI_number: 16080735 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, gamma subunit # Organism: Bacillus subtilis # 1 296 1 283 287 170 35.0 4e-42 MASLKEVKTRINSVKSTRKITSAMKMVASAKLHKAQGAIENMLPYERKLNKILTNFLSAD LPVESPYIKAREVKRVAIVAFSSNTSLCGAFNANVIKMLLQTVGEFRTLGQDNILIFPVG KKVDEAVKRLGFEPQETSPTLSDKPSYQEASELAHRLMEMYVSGEIDRVELIYHHFKSMG VQILLRETYLPIDLTRVVDEEEKQKEEEVQGGEIANDYIIEPSAEELIANLIPTVLSQKL FTAAVDSNASEHAARTLAMQVATDNANELIQDLTKQYNKSRQQAITNELLDIVGGSMQ >gi|226332272|gb|ACIC01000048.1| GENE 11 10240 - 11823 1894 527 aa, chain - ## HITS:1 COG:TM1612 KEGG:ns NR:ns ## COG: TM1612 COG0056 # Protein_GI_number: 15644360 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, alpha subunit # Organism: Thermotoga maritima # 5 517 3 496 503 575 58.0 1e-164 MSENIRVSEVSDILRKQLEGINTNVQLDEIGTVLQVSDGVVRIYGLRNAEANELLEFDNG IKAIVMNLEEDNVGAVLLGPTDRIKEGFIVKRTKRIASIRVGESMLGRVIDPLGEPLDGK GLIGGELYEMPLERKAPGVIYRQPVNQPLQTGLKAVDAMIPIGRGQRELIIGDRQTGKTS IAIDTIINQRNNFLAGDPVYCIYVAIGQKGSTVASIVNTLREYGAMDYTIVVAATAGDPA ALQYYAPFAGAAIGEYFRDTGRHALVVYDDLSKQAVAYREVSLILRRPSGREAYPGDIFY LHSRLLERAAKIIKEEEVAREMNDLPESLKGIVKGGGSLTALPIIETQAGDVSAYIPTNV ISITDGQIFLETDLFNQGVRPAINVGISVSRVGGNAQIKAMKKVAGTLKMDQAQYRELEA FSKFSSDMDPITAMTIDKGRKNAQLLIQPQYSPMPVEQQIAILYCGTHGLLHDVPLENVQ DFERSFIESLQLNHQEDVLDVLRTGVIDDNVTKAIEETAAMVAKQYL >gi|226332272|gb|ACIC01000048.1| GENE 12 11823 - 12383 443 186 aa, chain - ## HITS:1 COG:sll1325 KEGG:ns NR:ns ## COG: sll1325 COG0712 # Protein_GI_number: 16329328 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) # Organism: Synechocystis # 10 174 14 177 185 64 25.0 1e-10 
MEVGVLSMRYAKAMIEYAQEKGVEDRLYNEFFTLSHSFRVQPGLREVLDNPVVSVKDKLA LICTAADGNGESSREFVRFITLVLRNRREGYLQFISLMYLDLYRKLKHIGVGKLITAVPV DKETENRIRSAAAHILHAQMELDTVIDPSIEGGFIFDINDYRLDASIATQLKRVKQQFID KNRRIV >gi|226332272|gb|ACIC01000048.1| GENE 13 12389 - 12892 540 167 aa, chain - ## HITS:1 COG:TM1614 KEGG:ns NR:ns ## COG: TM1614 COG0711 # Protein_GI_number: 15644362 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit b # Organism: Thermotoga maritima # 14 160 13 159 164 59 27.0 2e-09 MSLLLPDSGLLFWMFVAFGVVFVILAKYGFPIIIKMVEGRKTYIDQSLEVAREANAQLAH LKEEGEALVAAANKEQGRILKEAMEERDKIVHEARKQAEIAAQKELDAVKQQIQIEKDEA IRDIRRQVAVLSVDIAEKVLRKNLQDKESQMGMIDRMLDEVLTPNKN >gi|226332272|gb|ACIC01000048.1| GENE 14 12905 - 13162 474 85 aa, chain - ## HITS:1 COG:no KEGG:BT_0715 NR:ns ## KEGG: BT_0715 # Name: not_defined # Def: ATP synthase C subunit # Organism: B.thetaiotaomicron # Pathway: Oxidative phosphorylation [PATH:bth00190]; Metabolic pathways [PATH:bth01100] # 1 85 1 85 85 74 100.0 1e-12 MLLSVLLQATAAAVGVSKLGAAIGAGLAVIGAGLGIGKIGGSAMEAIARQPEASGDIRMN MIIAAALIEGVALLAVVVCLLVFFL >gi|226332272|gb|ACIC01000048.1| GENE 15 13221 - 14327 1118 368 aa, chain - ## HITS:1 COG:BMEI1546 KEGG:ns NR:ns ## COG: BMEI1546 COG0356 # Protein_GI_number: 17987829 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit a # Organism: Brucella melitensis # 128 356 51 269 277 89 31.0 1e-17 MKQLRNIVAPIFLLCLMLVAGLPVFAQTQEDVTPQEQKENTVDVKEIVFGHIGDSYEWHI TTWGKTHITIPLPIIVYSSTTGWHTFLSSRLAENGGTYEGFSVAPEGSKYEGKLVEYNAA GEQVRPWDISITKVTFALLFNSVLLLVIVLSVAQWYRKRPQGALAPGGFIGFMEMFIMMV NDDIIKSCVGPNYRKFAPYLLTAFFFIFINNIMGLIPFFPGGANVTGNIAITMVLAICTF LAVNIFGTKHYWKDIFWPDVPWWLKVPVPMMPFIEFFGIFTKPFALMIRLFANMLAGHMA MLVLTCLIFISASMGPALNGTLTVASVLFNIFMNALELLVAFIQAYVFTMLSAVFIGLAQ EGAKVKAE >gi|226332272|gb|ACIC01000048.1| GENE 16 14311 - 14745 391 144 aa, chain - ## HITS:1 COG:no KEGG:BT_0713 NR:ns ## KEGG: BT_0713 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 144 1 144 144 198 100.0 6e-50 MVNISKTKRNFIALNTLFAILSGAAGAVILHIALPGHYFGGYPFIPVYFYIFGLFSIYMF DACRRHAPQKLSLLYLAMKMIKMILSLILLLIYCLAVREEAKAFLLTFISFYLLYLIYET WFFFSFEVNLKRKKKNKNKNETVA >gi|226332272|gb|ACIC01000048.1| GENE 17 14788 - 15036 287 82 aa, chain - ## HITS:1 COG:HI0478 KEGG:ns NR:ns ## COG: HI0478 COG0355 # Protein_GI_number: 16272425 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) # Organism: Haemophilus influenzae # 2 79 1 78 142 69 38.0 1e-12 MMKELHLSIVSPEKSVFDGEVKIVTLPGTVGSFSILPGHAPIVSSLKAGTLGYTTMDGDE HTLDIQGGFVEMSDGTASVCVS >gi|226332272|gb|ACIC01000048.1| GENE 18 15052 - 16569 1759 505 aa, chain - ## HITS:1 COG:SPAC222.12c KEGG:ns NR:ns ## COG: SPAC222.12c COG0055 # Protein_GI_number: 19114063 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, beta subunit # Organism: Schizosaccharomyces pombe # 6 502 55 522 525 607 64.0 1e-173 MSQIIGHISQVIGPVVDVYFEGTDAELMLPSIHDALEIKRPNGKILVVEVQQHIGENTVR TVAMDSTDGLQRGMKVYPTGGPITMPIGEQIKGRLMNVVGDSIDGMKSLDRTGAYSIHRD PPKFEDLTTVQEVLFTGIKVIDLLEPYAKGGKIGLFGGAGVGKTVLIQELINNIAKKHNG FSVFAGVGERTREGNDLLREMIESGVIRYGEAFKESMEKGDWDLSKVDYNELEKSQVSLI FGQMNEPPGARASVALSGLTVAESFRDAGKEGEKRDILFFIDNIFRFTQAGSEVSALLGR MPSAVGYQPTLATEMGAMQERITSTRKGSITSVQAVYVPADDLTDPAPATTFSHLDATTV LDRKITELGIYPAVDPLASTSRILDPHIVGQEHYDVAQRVKQILQRNKELQDIISILGME ELSEEDKLVVNRARRVQRFLSQPFAVAEQFTGVPGVMVGIEDTIKGFRMILDGEVDNLPE QAFLNVGTIEEAIEKGKKLLEQAKK >gi|226332272|gb|ACIC01000048.1| GENE 19 17236 - 17601 344 121 aa, chain + ## HITS:1 COG:no KEGG:BT_0710 NR:ns ## KEGG: BT_0710 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 121 1 121 121 237 100.0 9e-62 MVDKAGNPLTVDDISQCHLPGFPILFTQNELLEILAKEIFKQQFLWKSVQDKDIKECNKE ERIIKQIPLMPRCPDVLKGLFHCIEIRDERSPKHAIAYKSCLGKLEEMVKDQWDDWNRNQ S >gi|226332272|gb|ACIC01000048.1| GENE 20 17713 - 20076 1714 787 aa, chain + ## HITS:1 
COG:no KEGG:BT_0709 NR:ns ## KEGG: BT_0709 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 787 5 791 791 1600 99.0 0 MYQVSLFINTWSKTPTIITFEDFFAMVRNGHWKVPTEGHRSCLAKDRKHDAQTIKDSMAC VIPAGICKNGHAKNNLTSLSLALCIDIDHTDEQTKDIFVRACLLEYVLGAFISISGRGVK LFIRIDIDGVNDYPAIYEATAKLVSTVLGVENDGKCKDVTHPCFGSYDPDAYYNGDAKAV NDFLPNGALTPRVTYAPVTSAPRTTCTPTSFSNGHPPAMPDNRISAAPFVQSYLTLYPAA TGDRNGTVFRLSCAACKRGIDRNELTTELVRTLEEDNFREPEIARTVKSAYQNVMTAAET ESFENPPSNSSKVQKSVIGFSENENPAEKEDEIIGEELREQTPTFPDEIYDLIPAIFKEC VSIARDERERDGLLLSCITTISSMLPSVSIRYNRRNYQPNIYCIVVASSGANKNLIGYGI RLHKHYCEYWEKQSLKIEKEYKKALRDYTLSLQAIRKKNTAATNLPDEPEEPKLAFPLIP ADISRAQLITHLINNQSCSSLLTSTEASSVSTARNQDYGHFDDILCKAFEHELISSSYKI NGRHPLKVEYPSLSAFLTGTPSSLILFIPTMETGLYNRFLINTFRLPAAWQDVFAEEKVQ ADDLFNELSMRFAQMALFLKDSPTEVKLTDTQKKEFNRVFTQLLKDTDQLGNDDLLGVVK RYGVITARICCIFSAIDKGTMRMETPEVYCSDAHFKAALAIVLCCFEHSKLVSTSVKSSA EDTKPLQSPDNLSRLLTKLKKSFSAKEACALGEELNLSRSTVYYLLNKGIKAGRISKLGH GSYILKI >gi|226332272|gb|ACIC01000048.1| GENE 21 20301 - 20546 223 81 aa, chain - ## HITS:1 COG:no KEGG:BT_0708 NR:ns ## KEGG: BT_0708 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 81 1 81 81 141 100.0 7e-33 MKTEKEEAEEEPFVIRPYLKSELAHKYNPHLPLIYAMQKLRGWIRNNKELYDAMYSGGEG KNDHAYSSRQVRLIVKFLDEP >gi|226332272|gb|ACIC01000048.1| GENE 22 20825 - 21334 532 169 aa, chain + ## HITS:1 COG:no KEGG:BT_0707 NR:ns ## KEGG: BT_0707 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 169 1 162 162 277 100.0 8e-74 MAIPFKRMGRKDPRKIDGVLKYHPQLVTMGQSVDLDGIAYIMKDKSSLSLGDIQSVLTNY VEAMRAALFDGKSVNIRDFGVFSLSAHTLGATTKEECTVKNIKSIQINFRPSSSVRPNLT STRAGDKIEFLDLEAPKKKKGDGSGEDNTPGGGGNEGGGGDEEAPDPTV >gi|226332272|gb|ACIC01000048.1| GENE 23 21399 - 21848 275 149 aa, chain + ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of 
beta-lactamase expression # Organism: Haemophilus influenzae # 46 142 2 98 116 100 47.0 9e-22 MRKIHLIVIHCSATRADKELTAFDLDTMHRRRGFNGTGYHYYIRKDGTTHLTRPVERIGA HAKGFNTESIGICYEGGLDCRGRPADTRTPEQRAALRLLVHQLQERFPGCRVCGHRDLSP DRNGNGEIEPEEWIKACPCFEVKEESYRE >gi|226332272|gb|ACIC01000048.1| GENE 24 22150 - 23085 832 311 aa, chain - ## HITS:1 COG:no KEGG:BT_0705 NR:ns ## KEGG: BT_0705 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 311 4 315 315 499 80.0 1e-140 MKKTKLLAATAMLLLTSCSPKVVTNIVRTYPDIIPADSVYVIGLGEKVPNTAETIGRVSV VDRGTSTNCRYDQVLHLAQEATGKIGGNGLAITNHQKPSFWGSSCHQISGLMLRLSDKSV DTLKTNPVQDVIDLDLVIAKDYAESRKAPANTFEASFGYGWVISKIYDINGRALDSKGGV EWKLAYEHVWNGGMGIGLQYAGFKKSFSGGDMMFSYIAPEWVSRTKFDRWILKSGVGVGV FLYHDPFYNAAGFGLHATLGFEYMLSSNWGLGLTLNGIGGSMPKQDWMLLKEDERCGITR FNLLGGLRYYF >gi|226332272|gb|ACIC01000048.1| GENE 25 23183 - 24349 1499 388 aa, chain - ## HITS:1 COG:alr1299 KEGG:ns NR:ns ## COG: alr1299 COG0027 # Protein_GI_number: 17228794 # Func_class: F Nucleotide transport and metabolism # Function: Formate-dependent phosphoribosylglycinamide formyltransferase (GAR transformylase) # Organism: Nostoc sp. 
PCC 7120 # 2 387 9 388 391 418 57.0 1e-117 MKKILLLGSGELGKEFVISAQRKGQHVIACDSYAGAPAMQVADEFEVFDMLDGEALERVV EKHHPDIIVPEIEAIRTERLYDFEKEGIQVVPSARAVNFTMNRKAIRDLAAKELGLKTAN YFYAKTLDELKEAAAKIGFPCVVKPLMSSSGKGQSLVKSADELEHAWEYGCSGSRGDIRE LIIEEFIKFDSEITLLTVTQKNGPTLFCPPIGHVQKGGDYRESFQPAHIDPAHLKEAEEM AEKVTRALTGAGLWGVEFFLSHENGVYFSELSPRPHDTGMVTLAGTQNLNEFELHLRAVL GLPIPGIKQERIGASAVILSPIASQERPQYRGMEEVTKEEDTYLRIFGKPFTRVNRRMGV VLCYAPLNSDLDALRDKAKRIAEKVEVY >gi|226332272|gb|ACIC01000048.1| GENE 26 24400 - 25623 1095 407 aa, chain - ## HITS:1 COG:no KEGG:BT_0703 NR:ns ## KEGG: BT_0703 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 407 1 407 407 813 100.0 0 MRNKTLRWKTIALCSIMFCLPKTPLAGQESFGFIHRLGMEARPQYVFPTNPFLQGENERW KPIQTSFAAHLKYSFKFRPNTCADRIYGGAYQGIGVSLTTFGDKKQLGDPFSFYVFQGAR IARFSPRASLNYEWNFGLSAGWKPYDNYYNSYNGAVGSRMNAYINAGVYINWAFSRYFDL IVGGDFTHFSNGNTKFPNAGINTAGAKIGLVYNFNRTEEDLSKSLYQPVTTRFPRHISYD VVLFGSWRRKGVWVGEKQIASPNAYPVAGFNFAPMYNLGYKFRVGASLDGVYDGSANVYR EDVIVEYGSSSSKREFLKPGIQHQLALGLSGRAEYVMPYFTIGVGMGANVLSRGDMRGFY QILALKIAVTRSSFLHIGYNLQNFQTPNYLMLGLGFRFHNKYPKVRH >gi|226332272|gb|ACIC01000048.1| GENE 27 25660 - 26283 478 207 aa, chain - ## HITS:1 COG:CAC0235 KEGG:ns NR:ns ## COG: CAC0235 COG4845 # Protein_GI_number: 15893527 # Func_class: V Defense mechanisms # Function: Chloramphenicol O-acetyltransferase # Organism: Clostridium acetobutylicum # 6 205 2 200 212 124 30.0 1e-28 MNQIEKIIDIATWNRKEHFEHFSAFDDPFFGVTVHVDCTRSYQEAKDKGVSFSLLLLHRI ITAASKVEEFRYRIEGDKVVCYDSLLPEATVGRADHTFSFAAFEYDPDELTFIRKAKTEM ERLQATTGLNKGGTFHPNAIHYSAVPWLAFTDMKHPSNMRSGDSVPKISTGKYFREGEKL MLPISVTCHHGLMDGYHVAKFIETLDL >gi|226332272|gb|ACIC01000048.1| GENE 28 26658 - 28175 1782 505 aa, chain - ## HITS:1 COG:no KEGG:BT_0701 NR:ns ## KEGG: BT_0701 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 505 1 505 505 1009 99.0 0 MITTQDKELLAKKGITEAQIAEQLACFQTGFPYLKLDAAASIEKGILAPDAEEKNAYLAA 
WDAYKNTDKIIVKFVPASGAASRMFKNLFEFLSAEYDKPTTKFEQAFFDGIRDFAFFDDL NVACQRTAGKDIPGLMEEGNYKAVVAALLETAGLNYGALPKGLLKFHKYPEGSRTPLEEH LAEGAMYAAGKSGKVNVHFTVSTEHRELFKKLVEEKAEAFGKRYGVDYYITFSEQKPNTD TIAADMDNQPFRDNGKLLFRPGGHGALIENLNDLDADIIFIKNIDNVVPDKLKADTVTYK KLIAGVLVSLQKKAFEYLELLDSGKYTHEQIMEILQFLQKQLFCKNPETKNLEDAELVIY LKNKLNRPMRVCGMVKNVGEPGGGPFLAYNSDGTISLQILESSQIDMNDPAKKEMFEKGT HFNPVDLVCAVRDYKGHKFDLVKYVDKATGFISYKSKNGKDLKALELPGLWNGAMSDWNT VFVEVPLTTFNPVKTVNDLLREQHQ >gi|226332272|gb|ACIC01000048.1| GENE 29 28282 - 30498 2306 738 aa, chain + ## HITS:1 COG:lin1558 KEGG:ns NR:ns ## COG: lin1558 COG0317 # Protein_GI_number: 16800626 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Listeria innocua # 59 737 51 735 738 399 33.0 1e-110 MDDFFTSEEKKELFSLYRHLLQSAGDTIFWRDCQKLKKHLIKAAQCNGLQRNNFGMNPVI RDLQTAVIVAEEIGMKGSCLIGIMLHEIVKAHILSIEEVNAEYGEDVGSIIKGLVKTNEL YSKSPAIESENFRNLLLSFAEDMRVILIMIADRVNIMRQIKDTGNEEDRVKVANEAAYLY APLAHKLGLYKLKSELEDLSLKYTQRETYYYIKEKLNETKASRDKYIASFIDPIQKKVKE AGLKFDIKGRTKSIHSIWNKIQKQKTPFEGIYDLFAIRIILDSEPDPAKEKQECWQVYSI VTDMYQPNPKRLRDWLSIPKSNGYESLHITVMGPEGRWVEVQIRTRRMDEIAERGLAAHW RYKGIKGETGLDEWLTSVREALENADNDSLKVMDQFKMDLYEDEVFVFTPKGDLFKLAKG ATVLDFAFHIHSKLGCKCIGAKVNGKNVQLKQKLNSGDQVEIMTSNTQTPKQDWLNIVTT SKARTKIRQALKEMVARQHDFAKETLERKFKNRKMEYDEATMMRLIKRLGFKNVTEFYQK IADETLDVNDILDKYLEQQRRDNDTHDEIVYRSAEGYNLQATQEEPSSKEDVLVIDQNLK GLEFKLAKCCNPIYGDDVFGFVTVSGGIKIHRSDCPNANQMRERFGYRIVKARWAGKSVG TQYPITLRVVGHDDIGIVTNITSIISKENGISLRSIGIDSNDGLFSGTLTVMVGDTGRLE ALIKKLRTVKGVKQVSRN >gi|226332272|gb|ACIC01000048.1| GENE 30 30629 - 31786 868 385 aa, chain - ## HITS:1 COG:STM2280 KEGG:ns NR:ns ## COG: STM2280 COG0477 # Protein_GI_number: 16765607 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Salmonella typhimurium 
LT2 # 1 186 1 186 396 70 26.0 6e-12 MAVKLWTVHFMRICVANLLLFISLYILFPVLSVEMADRLGVPVAQTGVIFLFFTLGMFLI GPFHAYLVDAYKRKYVCMFSFATMVAATAGYAFVTNITELILLSTVQGLAFGIATTAGIT LAIDITNATLRSAGNVSFSWMARLGMIIGIVLGVWLYQSYSFKNLLSVSVITGAAGVLMV SGVYVPFRAPIVTKLYSFDRFLLLRGWVPAINMILITFVPGLLIPLVHRFLNDSVLGSSG IPIPFFVGTGIGYLASLLLARLFILKEKTLRLVLVGIILEIVAITLLASGFPVGVPSVLL GLGLGLVMPEFLVMFVKLSHHCQRGTANTTHLLASEVGFASGIAVACYFDLEADKMLYTG QVVAVIALIFFILVTYPYYKRKKVR >gi|226332272|gb|ACIC01000048.1| GENE 31 31795 - 32616 935 273 aa, chain - ## HITS:1 COG:Cgl0115 KEGG:ns NR:ns ## COG: Cgl0115 COG0413 # Protein_GI_number: 19551365 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate hydroxymethyltransferase # Organism: Corynebacterium glutamicum # 8 273 5 269 269 249 50.0 4e-66 MAGYISDDTRKVTTHRLVEMKQRGERISMLTSYDYTMAQIVDGAGMDVILVGDSASNVMA GNVTTLPITLDQMIYHAKSVVRGVKRAMVVVDMPFGSYQGNEMEGLASAIRIMKESHADA LKLEGGEEIIDTVKRIISAGIPVMGHLGLMPQSINKYGTYTVRAKDDSEAEKLIRDAHLL EEAGCFAIVLEKIPATLAERVASELTIPIIGIGAGGHVDGQVLVIQDMLGMNNGFRPRFL RRYADLYTVMTDAISRYVSDVKNCDFPNEKEQY >gi|226332272|gb|ACIC01000048.1| GENE 32 32793 - 33440 767 215 aa, chain - ## HITS:1 COG:all1058_2 KEGG:ns NR:ns ## COG: all1058_2 COG0637 # Protein_GI_number: 17228553 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Nostoc sp. 
PCC 7120 # 10 210 10 213 223 79 29.0 5e-15 MDTTKTIAALFDFDGVIMDTETQYTVFWDEQGRKYLNEEDFGRRIKGQTLTQIYEKYFSD LPEAQQEITAGLNIYEKSMSYEYIPGVEAFIADLRKHGAKIAVVTSSNEEKMQNVYNAHP EFKGMVDRILTGEMFARSKPAPDCFLLGMEIFGATPENSYVFEDSFHGLQAGMTSGATVI GLATTNTREAITGKAHYIIDDFSEMTFEHLLALHR >gi|226332272|gb|ACIC01000048.1| GENE 33 33592 - 34962 1219 456 aa, chain + ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 4 430 2 429 448 332 41.0 1e-90 MSKNDPHILGKESIGKLLLQYSIPAIIGMTITSIYNIIDSIFIGHGVGAMAIAGLAVSFP LMNLVVAFCTLVSAGGSTIASIRLGQKDMKGAMEILSHTLMLCITNSFFFGILSFIFLDD ILTFFGASPDTLPYARSFMQVILLGTPITYTMIGLNNVMRATGYPKKAMLTSMVTVVANI ILAPIFIFHFEWGMRGAATATVISQLIGMVWVVSHFCKKDSTVHFEGKVWKMKGRIVESI FAIGMSPFLMNVCACAIVIVINNSLQEHGGDMAIGAYGIINRLLTLYVMIVLGLTMGMQP IVGYNFGAQKIDRVKQTLKLGIISGVVITSSGFLICEFFPHAVSAVFTDSDELIGLAVDG LRLTVLMFPFVGAQIVIGNFFQSIGKVKISIFLSLTRQLLYLLPCLLLFPHWWGLKGIWI SLPVSDSLAFITAVISLMIYIKKVSKEHPIADKVAE >gi|226332272|gb|ACIC01000048.1| GENE 34 35099 - 37414 1728 771 aa, chain - ## HITS:1 COG:no KEGG:BT_0695 NR:ns ## KEGG: BT_0695 # Name: not_defined # Def: ABC transporter, permease protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 771 1 771 771 1503 99.0 0 MRQIYYSIRTLLRERGTNIIRVISLSLGLTIGILLFSQIAFELSYERCYPEPERLAIARC LTTNLSTGETMGDDGNNYDYTLFDVVASTLAQDMPEEIEFASCVLTGQGMSIYYEDKLLS DVNYIYADTCFFQTFGIPVLKGNPKDMIMPGSVFVSEHFARETFGDADPIGKILKADRQN DFTIRGVYKDMPENTVLTHDFVVSIHRDGGYQGGYGWKGNDVFYTFLRLRNAADIDKVND NIQRVIEKYTPAQYDDWKMNFSVIPLVKYHLISSDVQKRLIIYGFLGFAIFFVAIMNYML ISIATLSRRAKGVGVHKCSGASSGNIFGMFLAETGILVIFSVLLSLLLIVNAHEIIEDLL SVRLSSLFAWETLWVPLLTIVILFLLAGGIPGRLFSRIPVTQVFRRYSDGKTGWKRSLLF IQFTGVSFVLGLLLVTLLQYNHLMSRDMGINVPGLVQAGTWLPKETVEHVTDELRRQPMV EGVAVATSGVIGQYWTRGLMSNDGKRIATLNFNYCSYNYPEVMGIKIIEGSTLKKQDDLL VNEELVRLMKWTDGAVGKKLNAIQGTIVGVFRDVRNYSFFSTQAPIVLIGSENANHVFDV RLKEPYDENLKRLNEFADKTFPNVALHFSSVDGMIKDIYKSVYRFRNSVWITSSFILLIV 
IMGLIGYVNDETQRRSKEIAIRKVNGAEASHILRLLIREILYVSASSILIGTIVSYFVGK AWLDQFAEQIYMNPLLFVGTATFVLLLIVVCVVLKAWHIANENPVKSIKSE >gi|226332272|gb|ACIC01000048.1| GENE 35 37427 - 38110 358 227 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 202 1 199 223 142 38 3e-33 MNKMIQIENISKVFRTTEVETVALNHVNLEVKEGEFVAIMGPSGCGKSTLLNILGLLDNP TEGSYRLLGEEVAGLKEKERTGVRKGKLGFVFQSFNLIDELNVFENVELPLTYLGIKSSE RRRMVEDILKRMNISHRAKHFPQQLSGGQQQRVAIARAVVTNPKLILADEPTGNLDSKNG AEVMNLLTELNKEGTTIIMVTHSQHDASFAHRTVHLFDGSVVASVTA >gi|226332272|gb|ACIC01000048.1| GENE 36 38142 - 40451 1511 769 aa, chain - ## HITS:1 COG:no KEGG:BT_0693 NR:ns ## KEGG: BT_0693 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 769 25 793 793 1550 99.0 0 MRQIYYVIRTLLRGRSSNVVKVISLGLGLTMSILLFARVAFEQSFDKCFKDYENLYQVFS IFTMNGEQLEPQDQNCGPVAGAILENFPKEVEAATSYCLWVNDPLYHGSVRFEVRKLAAD SLFFQTMGIEVLSGTPVKDLAQKDVVYLSDRLARKMFDTENPIGKVISYSKEIELTVKGT YADIPENATVRPEAVISLPSIWSRKWGNYSWRGGDSWLAFIRFRPGADKSVVNARIDAMI QKYRPAEDQKVVGYTAFVKPIRDVYRDNKDVRRMRNIMTILGITILFIASLNYVLISVSS LSYRAKAVGVHKCNGAGNMTILSMFLWETAIIILFALILMGLILLNFQDFFEDTAAVKLS ALFSLDRIWVPLLTVAVLFVVGGVLPGRLFARIPVTQVFRRYTEGKKGWKRPLLFIQFAG VAFICGLMYVVMMQYYYILNKDLGYNPKRVVVANASFGEEEKGNYALNVFRNMPYVESVS SATYHPVYGYSGTIINDESGQSLFSSRYCSVPDDYPKMMGMVLKEGRMPRGDDETVVNET FAEWMHWGNNILNRTVYNSGYVCKVVGVMKDYRIGSFTSPQQPLLLMHTKRFGDCIHVRL KEPFAENLLKLNKEMESAFPDKTIEFRSMQQMIKENYNSVRVFSNATMLAAITMFFVMLM GLIGYTTDEVRRRSKEIAIHKVNGSEATGILELLVKDVLYVAVVAVLIGVVAAWYVNGMW MDLFAEHVPLSWAAYLLIAIANLAVIVACVLWKSWKIANENPVNSIKSE >gi|226332272|gb|ACIC01000048.1| GENE 37 40747 - 41550 859 267 aa, chain + ## HITS:1 COG:no KEGG:BT_0692 NR:ns ## KEGG: BT_0692 # Name: not_defined # Def: calcineurin superfamily phosphohydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 267 1 267 267 520 99.0 1e-146 MVNKNLYTLLVCLLLAGCDLIDYHPYDVRISGETDVNAHNIEQIEAKCKGKTTIRFITMG 
DSQRWYDETEDFVKEVNKRDDIDFVIHGGDMSDFGLTKEFLWQRDIMNGLKVPYVVLIGN HDCLGTGAETYQAVFGPTNFSFIAGNVKFVCLNTNALEYDYSEPIPNFGFMEQELTDRAD EFEKTVVSMHAHPFADVFNDNVAKPFQYYITQYPELQFCTAAHNHGFDDRILFDDDVHYI VSDCMKSRSYLVFTITPEKYEYELVEY >gi|226332272|gb|ACIC01000048.1| GENE 38 41531 - 42334 564 267 aa, chain + ## HITS:1 COG:no KEGG:BT_0691 NR:ns ## KEGG: BT_0691 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 267 1 267 267 545 100.0 1e-154 MNWLNTKYMIRAGVLTMLLLFPSLLSATCKPDTIIRNYKPDSLKLAIKADSIIAFHEGDS AFTIVKADTVLPYIAEPVLRTGYDRRVHRFRKNWERIIPTHSKIQYAGNMGLLSFGTGWD YGKRNQWETDLLLGFIPKYSSKKAKVTMTLKQNYMPWSINLGKGFSTEPLSCGLYLNTVF GNQFWVSEPERYPKGYYGFSSKVRIHVFMGQRLTYDIDPNRRFLAKSVTFFYEISTCDLY LISGVTNSYLRPRDYLSLSFGLKFQWL >gi|226332272|gb|ACIC01000048.1| GENE 39 42403 - 42772 301 123 aa, chain + ## HITS:1 COG:mlr5842 KEGG:ns NR:ns ## COG: mlr5842 COG2204 # Protein_GI_number: 13474866 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Mesorhizobium loti # 2 122 3 118 455 66 34.0 9e-12 MSKSGTIIIVDDNKGVLTAVQILLKNYFSKVVTLSSPVTLTTAIREETPEVVLLDMNFTS GINTGNEGLFWLHEIKKVRRELPVVLFTAYADIDLAIRGIKEGASDFIVKPWNNQKLVET LQA Prediction of potential genes in microbial genomes Time: Thu May 12 00:35:18 2011 Seq name: gi|226332271|gb|ACIC01000049.1| Bacteroides sp. 1_1_6 cont1.49, whole genome shotgun sequence Length of sequence - 15942 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 8, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 8/0.000 + CDS 2 - 940 717 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 2 1 Op 2 . + CDS 937 - 2253 1034 ## COG5000 Signal transduction histidine kinase involved in nitrogen fixation and metabolism regulation 3 1 Op 3 . 
+ CDS 2265 - 2933 535 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases
+ Prom 3003 - 3062 8.0
4 2 Tu 1 . + CDS 3133 - 4764 1734 ## COG1151 6Fe-6S prismane cluster-containing protein
+ Term 4828 - 4883 14.0
- Term 4866 - 4904 -0.4
5 3 Tu 1 . - CDS 4912 - 6096 829 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18
- Prom 6135 - 6194 5.2
+ Prom 6060 - 6119 5.9
6 4 Tu 1 . + CDS 6265 - 7737 1328 ## COG3263 NhaP-type Na+/H+ and K+/H+ antiporters with a unique C-terminal domain
+ Prom 7942 - 8001 5.8
7 5 Tu 1 . + CDS 8056 - 9720 1519 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming)
+ Term 9758 - 9800 1.4
- Term 9745 - 9788 6.1
8 6 Tu 1 . - CDS 9908 - 11923 1749 ## BT_0683 alpha-glucosidase
- Prom 12042 - 12101 3.8
+ Prom 11897 - 11956 4.9
9 7 Op 1 40/0.000 + CDS 12043 - 12717 742 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
10 7 Op 2 . + CDS 12741 - 14072 1043 ## COG0642 Signal transduction histidine kinase
- Term 14113 - 14165 -0.8
11 8 Tu 1 .
- CDS 14170 - 15885 1452 ## COG3696 Putative silver efflux pump Predicted protein(s) >gi|226332271|gb|ACIC01000049.1| GENE 1 2 - 940 717 312 aa, chain + ## HITS:1 COG:STM4174 KEGG:ns NR:ns ## COG: STM4174 COG2204 # Protein_GI_number: 16767428 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Salmonella typhimurium LT2 # 14 309 143 441 441 245 46.0 7e-65 NKKEPVNTPALYWGESKSMQQLRTLIEKVASTDANILITGENGTGKEMLAREIHALSNRR QQEMIAVDMGAITESLFESELFGHVKGSFTDAHTDRTGKFEAADHSTLFLDEIGNLPYHL QAKLLTAIQRRSIVRVGSNTPIPIDIRLICATNRNLQEMVDREEFREDLLYRINTIHIEI PPLRERKEDIVPLAERFIIRFCKQYDKEPMKFSPAAKEKLTAHPWYGNIRELEHAIEKVV IINDGLQIPAELFQLSSRKTETSEKNISTLEEMEVQMIRKALDACAGNLSAVAAQLGITR QTLYNKMKKFGL >gi|226332271|gb|ACIC01000049.1| GENE 2 937 - 2253 1034 438 aa, chain + ## HITS:1 COG:CC1742 KEGG:ns NR:ns ## COG: CC1742 COG5000 # Protein_GI_number: 16125986 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase involved in nitrogen fixation and metabolism regulation # Organism: Caulobacter vibrioides # 78 437 322 698 716 113 26.0 6e-25 MNRILQQVANQYWFRVSLTVICCLATICLGVHQSYVWFSVSACLLILSIIWQVRLYRLHI KQVLFMIDALENNDSAFRFPEESGTPESKQINRALNRVGHILYNVKAETAQQEKYYELIL DFVSTGLLVLNDNGAVYQKNKEALRLLGLNVFTHVRQLSKVDATLMEKIENCCSGDKLQV VFHNERGTVNLSIRVSEINVHKEHLRILALNDINIELDEKEIDSWIRLTRVLTHEIMNSV TPITSLSDTLLSMVEDKDEEISHGLQTISTTGKGLLAFVESYRKFTRIPTPEPSLFYLKA FIERMVELARHQNPCDHITFHTKIDPADLILHADENLISQVVINLLKNAIQAIGDQPDGQ IDILVYCNETEEVLIEIKNNGPVIPPEIAEHIFIPFFTTKEGGSGIGLSISRQIMRLSGG SLTLLPDHKETKFILKFK >gi|226332271|gb|ACIC01000049.1| GENE 3 2265 - 2933 535 222 aa, chain + ## HITS:1 COG:CAC0884 KEGG:ns NR:ns ## COG: CAC0884 COG0664 # Protein_GI_number: 15894171 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 10 212 14 216 
229 127 33.0 1e-29 MTPALTNNPLFRGITPEKLSANLEEISFHTRSYRKGEILARQGDVCNRLVILTKGSVRGE MIDYSGRLIKVEDIAAPRALAPLFLFGEENRFPVEVTANEPTEVIEIPKASVLELFRRNE QFLENYMNLSANYARTLSDKLFFMSFKTIRQKIASYLLRLYKQQQQLQITLDRSQQELSD YFGVSRPSLARELSHMQEDGMIIADRKQITILQKEWLIRLIQ >gi|226332271|gb|ACIC01000049.1| GENE 4 3133 - 4764 1734 543 aa, chain + ## HITS:1 COG:CAC2750 KEGG:ns NR:ns ## COG: CAC2750 COG1151 # Protein_GI_number: 15896007 # Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: Clostridium acetobutylicum # 1 541 1 527 530 734 65.0 0 MSMFCYQCQETAMGTGCTLKGVCGKTSEVANLQDLLLFVVRGIAVYNEHLRREGHPSEEA DKFIYDALFITITNANFDKAAIIRKIKEGLQLKNELASKVTIANAPDECLWDGNEDEFEE KARTVGVLRTPDEDIRSLKELVHYGLKGMAAYVEHAHNLGYQSPEIFAFMQSALAELTRN DITMEELVQLTLATGKHGVSAMAQLDAANTNSYGNPEISEVNLGVRNNPGILISGHDLKD LEELLEQTEGTGIDIYTHSEMLPAHYYPQLKKYKHLAGNYGNAWWKQKEEFESFNGPILF TSNCIVPPRANASYKDRIYITGACGLEGAHYIPERKDGKPKDFSALIAHAKQCQPPTAIE SGTLIGGFAHAQVVALADKVVEAVKSGAIRKFFVMAGCDGRMKSREYYTEFAEKLPKDTV ILTAGCAKYRYNKLALGDINGIPRVLDAGQCNDSYSLAVIALKLKEVFGLDDVNKLPIVY NIAWYEQKAVIVLLALLALGVKNIHLGPTLPAFLSPNVKNVLIEQFGIGGISTADEDIMK FLS >gi|226332271|gb|ACIC01000049.1| GENE 5 4912 - 6096 829 394 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 1 393 5 421 447 323 41 4e-88 MDSNNLTPLRKGVVGVQFLFVAFGATVLVPLLVGLDPSTALFTAGIGTLIFHAVTRGKVP IFLGSSFAFIAPIIKATELYGLAGTLSGMVGVALVYFVMSALVKWQGVRVIERLFPPVVI GPVIILIGLSLAGTGVNMAKENWVLALLSLVTAVVVSMKAKGLLKLIPIFCGIIVGYLAA WLFYDLDLSGVRDAAWIGLPQFVFPKFSWEPILFMIPVAIAPVIEHIGDVYVVNTVTGKD FVKDPGLHRTLLGDGLACCFAGLVGGPPVTTYSEVTGAMSLTKITNPQVIRIAAISAILF SVVGKISALLKSIPSAVLGGIMLLLFGTIACAGIGNLVNNCIDLSRTRNIIIVSLTLTVG IGGAVLTWGDFSLSGIGLAALVGVLLNLILPKDD >gi|226332271|gb|ACIC01000049.1| GENE 6 6265 - 7737 1328 490 aa, chain + ## HITS:1 COG:BH4038 KEGG:ns NR:ns ## COG: BH4038 COG3263 # Protein_GI_number: 15616600 # Func_class: P Inorganic ion transport and metabolism # Function: NhaP-type Na+/H+ and 
K+/H+ antiporters with a unique C-terminal domain # Organism: Bacillus halodurans # 1 484 1 480 490 298 37.0 2e-80 MIFTAENTLLIGSILLFVSIVVGKTGYRFGVPTLLLFLVVGMLFGSDGLGLQFHDAKDAQ FIGMVALSIILFSGGMDTKFREIKPILGPGIVLSTVGVLLTALFTGLFIWWLSGMSWTNI YLPITTSLLLASTMSSTDSASVFAILRSQKMNLKHNLRPMLELESGSNDPMAYMLTIVLI QFIQSAGMGVGAIAASFIIQFIVGAAAGYVLGKLAIRMLNKLNIDNQALYPILLLAFVFF TFSITDLLKGNGYLAVYIAGIMVGNNKIMHRKDIYTFMDGLTWLFQIIMFLCLGLLVNPH EMLEVAAVALLIGVFMIVIGRPLSVFLCLLPFRKITMKSRLFVSWVGLRGAVPIIFATYP VVAGVEGSNIIFNVVFFITIVSLVVQGTTVSFVARLLHLSKPLEKTGNDFGVELPEEIDT DLKDMTITQEMLEEADTLKDMNLPKGTLVMIVKRGDEFLIPNGTLKLHVNDKLLLISEKP KEDEEANSES >gi|226332271|gb|ACIC01000049.1| GENE 7 8056 - 9720 1519 554 aa, chain + ## HITS:1 COG:aq_999_1 KEGG:ns NR:ns ## COG: aq_999_1 COG1022 # Protein_GI_number: 15606303 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Aquifex aeolicus # 16 548 4 499 600 224 28.0 3e-58 MEQEHQFIDYIEQSIIKNWDKDALTDYKGITLQYKDVARKIAKFHIVLESAGIQPGDKIA VCGRNSAHWGVTFLATITYGAVIVPILHEFKADNIHNIVNHSEAKLLFVGDQAWENLNED AMPLLEGIASLTDFTALVSRNEKLTYAFEHRNAIYGQRYPKNFRPEHICYRKDRPEELAI INYTSGTTGYSKGVMLPYRSIWSNVAYCFEMLPVKAGDHIVSMLPMGHVFGMVYDFLYGF SAGAHIYFLTRMPSPKIIAQSFAEIKPRVISCVPLIVEKIIKKDILPRVDSKIGKLLLKV PIVNDKIKSLARQAAMEIFGGNFDEIIIGGAPFNAEVEAFLKKIGFPYTIAYGMTECGPI ICSSRWETLKLASCGKATTRMEVRIDSPDPKTHAGEIVCRGMNMMLGYYKNPEATAQIID ANGWLHTGDLGTIDDEGYVTVRGRSKNLLLTSSGQNIYPEEIESKLNNMPYVAESLIVLQ HDKLVALIYPDFDDAFAHGLQQADIIKVMEANRVELNQQLPNYSQISKVKIHFEEFEKTA KKSIKRFMYQEAKG >gi|226332271|gb|ACIC01000049.1| GENE 8 9908 - 11923 1749 671 aa, chain - ## HITS:1 COG:no KEGG:BT_0683 NR:ns ## KEGG: BT_0683 # Name: not_defined # Def: alpha-glucosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 671 1 671 671 1400 100.0 0 MKVKHLVGAFLCVLGCYACSSPKTEVQSPDGHIKMTLTVDENGKPFYNVSVDDSLLIENS AMGFTAANGIILAEGFQVKNTTFSSEDETWTQPWGENKKNRNHYNEMAVSLTNEDQVQLT LRFRVFDDGIGFRYEYTVPTVDSLLITDELTTFRFHRDGTSWSIPASAETYELLYQQQPI 
SEVETANTPFTFKTADGVYGSIHEAALYDFPEMTLKQIGHYTLKAELAPWPDGIKVRKGN HFTTSWRTIQIAPEAVGLINSSLILNLNDPCVLETTDWIRPLKYVGVWWGMHLGVETWKM DERHGATTANAKKYIDFAAANHIEAVLFEGWNEGWESWGGMQSFDFTKPYADFDIDEIVR YAKEKGVEIMGHHETGGNIPNYERQMDHAMQWYTDHGIHLLKTGYAGAFPDGYLHHSQYG VNHYQRVVETAARHQMTLDAHEPIKDTGIRRTWPNMMTREGARGMEWNAWSEGNPPSHHV MLPFTRLLSGPMDYTPGTFDILFLKTKDSPRRQKWNDQDKGNSRVNTTLAKQLANWVILY SPLQMASDMIEHYEGHPAFQFFRDFDPDCDESKALAGEPGEFIAVVRKAKGNYFLGAATN ESPRTLEVKLDFLEPGKQYKAVIYADGEKADWKTNPTDYKITEQIVTSENTLSIRMAPGG GQAVSFIKSNR >gi|226332271|gb|ACIC01000049.1| GENE 9 12043 - 12717 742 224 aa, chain + ## HITS:1 COG:alr1194 KEGG:ns NR:ns ## COG: alr1194 COG0745 # Protein_GI_number: 17228689 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Nostoc sp. PCC 7120 # 4 222 5 222 228 190 45.0 2e-48 MAKILLAEDEVNIASFIERGLKEFGHSVTVCHDGNTGWRILQEEPFDLVILDIIMPKING LELCHLYRQMFGYQAPVIMLTALGTTEDIVKGLDAGADDYLVKPFSFQELEARIKALLRR NKETVTNQLTCDNLVLDCNTRRAKRGDADIDLTVKEYRLLEYFMTHQGVALSRITLLKDV WDKNFDTNTNIVDVYVNYLRVKIDRDFDKKLIHTVVGLGYIMNA >gi|226332271|gb|ACIC01000049.1| GENE 10 12741 - 14072 1043 443 aa, chain + ## HITS:1 COG:RSp1043 KEGG:ns NR:ns ## COG: RSp1043 COG0642 # Protein_GI_number: 17549264 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 163 431 167 450 466 165 32.0 2e-40 MKIGSKIALFYTLLSVLTTIIIIAVFYLFSTQFINKLYASYLREKAYLTAQKHWEKDEID EQSYQIIQRKYDELLPEAHEILLNMDSLSEVRDTLNKYLTQHQQALLIAGEDSVPFSFKY KDQLGAALYYPDNEGNFIVLVMSRNVYGAEIKEHLLLLSIFLVLFSSILIYLVGKIYSGR ILIPLQHILKELKRIRANSLNRRLKTTGNNDELEDMIETLNSMLDRLDSAFKAEKSFVSH ASHELNNPITAIQGECEISLLKERSTGEYIEALQRISSESKRISNLIRHLLFLSRQDEEL IKSNMEAMSLPDMLNDLIKMNERIRLHYQETGKAATVKANPYLLKIALKNIIDNACKYSE KEVDITLSQKDQHLVLEIKDQGIGIPPEEIEHIFQSFYRGSNTHDYAGQGIGLSLTQKIV SAYNARLEISSEIEKGTKVRVIF >gi|226332271|gb|ACIC01000049.1| GENE 11 14170 - 15885 1452 571 aa, chain - ## 
HITS:1 COG:RSp1040 KEGG:ns NR:ns ## COG: RSp1040 COG3696 # Protein_GI_number: 17549261 # Func_class: P Inorganic ion transport and metabolism # Function: Putative silver efflux pump # Organism: Ralstonia solanacearum # 1 564 460 1025 1038 406 36.0 1e-113 MFSPLAYTLGFALLGALIFTLTLVPVMSSMLLKKNVREKNNRFVHFINQKCTALFDTFYA HRKLTIGLATVVAGVGLWLFSFLGTEFLPQLNEGSIYIRATLPQSISLDESVTLANKMRK KLLTFPEVRQVLSQTGRPNDGTDATGFYNIEFHVDIYPEKDWESKLTKLQLIDKMQEDLS IYPGIDFNFSQPITDNVEEAASGVKGSIAVKVFGKDLYESEKLAVQIDKILNTVQGIEDL GVIRNIGQPELRIELNERQLARYGVAKEDVQSIIEMAIGGKSASLLYEDERKFNIMVRYS EEFRQNEEEIGKILVPAMDGTMVPIKELADITTITGPLLIFRDNHARFCAVKFSVRGRDM GTAVAEAQKKVNASVHLPAGYSLKWTGDFENQQRASKRLAQVVPISIAIIFIILFILFSN ARDAGLVLLNVPFAAVGGIAALLITGFNFSISAGIGFIALFGICIQNGVIMISDIKANLK LGSPLQEATKEGVRSRIRPVIMTAAMAAIGLLPAAMSHGIGSESQRPLAIVIIGGLIGAT FFALFIFPLIVEVVYERMLYDKNGKLLQRRI Prediction of potential genes in microbial genomes Time: Thu May 12 00:35:29 2011 Seq name: gi|226332270|gb|ACIC01000050.1| Bacteroides sp. 1_1_6 cont1.50, whole genome shotgun sequence Length of sequence - 7099 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 11/0.000 - CDS 1 - 1363 1203 ## COG3696 Putative silver efflux pump 2 1 Op 2 13/0.000 - CDS 1379 - 2473 979 ## COG0845 Membrane-fusion protein 3 1 Op 3 . - CDS 2529 - 3791 1216 ## COG1538 Outer membrane protein - Prom 3879 - 3938 7.5 + Prom 3722 - 3781 5.6 4 2 Tu 1 . + CDS 4025 - 4384 296 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Term 4356 - 4396 -0.1 5 3 Op 1 1/0.000 - CDS 4506 - 5678 1210 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase 6 3 Op 2 . 
- CDS 5697 - 6863 1212 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase - Prom 6955 - 7014 2.2 - TRNA 6989 - 7064 85.2 # Gly GCC 0 0 Predicted protein(s) >gi|226332270|gb|ACIC01000050.1| GENE 1 1 - 1363 1203 454 aa, chain - ## HITS:1 COG:RSp1040 KEGG:ns NR:ns ## COG: RSp1040 COG3696 # Protein_GI_number: 17549261 # Func_class: P Inorganic ion transport and metabolism # Function: Putative silver efflux pump # Organism: Ralstonia solanacearum # 5 454 2 443 1038 315 38.0 1e-85 MHKFIDNIVAFSLKNKFFIFFCTVIAVIAGAVSFKHTPIDAFPDVTNTKVTIITQWAGRS AEEVEKFITIPIEIAMNPVQKKTDIRSTTLFGLSVINVMFEDRVDDFTARQQVYNLLNDA DLPEGVTPEVQPLYGPTGEIYRYTLRSDRRSVRDLKTIQDWVIERNLRSVSGVADIVSFG GEVKTFEVSVNPHQLINYGITSLELYDAIAKSNINVGGDVITKSSQAYVVRGIGLINDLD ELRNIVVKNINGTPILVKNLADVHESCLPRLGQVGRMDENDVVQGIVVMRKGENPGEVIA NLKDKIEELNQNVLPEDVRIIPFYDREDLVNLAVKTVTHNLIEGILLVTFIVLIFMADWR TTVIVAVIIPLALLFAFICLRVMGMSANLLSMGAIDFGIIIDGAVVMVEGVFVALDKKAK EVGMPAFNVMSKMGLIRHTAKDKAKAVFFSKLII >gi|226332270|gb|ACIC01000050.1| GENE 2 1379 - 2473 979 364 aa, chain - ## HITS:1 COG:PA2521 KEGG:ns NR:ns ## COG: PA2521 COG0845 # Protein_GI_number: 15597717 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Pseudomonas aeruginosa # 43 306 158 429 484 115 27.0 1e-25 MNWNKFLPCILFGSLLGACSGKGEQPKVEPTTLCLTDSLLRIVSVDTVHVCEVVDELTLN GRVTFNQDQVANVYPMFGGTVTELRAEIGDFVHKGEVLAVIRSGEVADYEKQLKEAEQQL LLARRNMDATQDMYTSGMASDKDVLQAKQELATAEAEERRIKEIFSIYHFSGNAFYQLKS PVSGFIVEKQISRDMQLRPDQGDALFTISGLSDVWVMADVYESDISKVSEGADVRISTLA YPERTFTGTIDKAYHLLDSESKTMNVRIKLKNEDYLLKPGMFTNVSVKCKAEDTSMPRID SHALVFEGGKNYVVTVEPDQRLKIKEVDVYKQLSKECYVRSGLSEGDRVLNNNVLLVYNA LNAD >gi|226332270|gb|ACIC01000050.1| GENE 3 2529 - 3791 1216 420 aa, chain - ## HITS:1 COG:PA2522 KEGG:ns NR:ns ## COG: PA2522 COG1538 # Protein_GI_number: 15597718 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Pseudomonas aeruginosa # 25 383 
40 389 428 69 23.0 1e-11 MNRAFLFLLFLLLAGKMCAQQVAGTLTLKEAEQRFLERNLSLIAERYNIDMAQAQVLQAK LFENPVISLEQNVYNRLNGKYFDFGKEGEAVVEIEQVIHLAGQRNKQVRLEKINKEIAEY QFEEVMRTLRQELNEKFVEVYFLSKSIAIYEKEVNSLQALLGGMKIQQEKGNISLMEISR LESMLFSLRKEKNERENDLLTTRGELNLLLNLPEDTQVQLSLDEEVLQQLDLSQLSFADL KAIINERPDQKIARSTVNASRANLKLQKSMAFPEFSVKGNYDRVGNFINDYFAIGVSLSV PIFNRNQGNIKAARFSIQQAGVQQEYAANRADMELFTAYTSLEKATQLYQSTNMDLERNF EKLITGVNENFTRKNISLLEFIDYYDSYKETCIQLYEIKKNVFLAMENLNTVVGQNVLNY >gi|226332270|gb|ACIC01000050.1| GENE 4 4025 - 4384 296 119 aa, chain + ## HITS:1 COG:BB0061 KEGG:ns NR:ns ## COG: BB0061 COG0526 # Protein_GI_number: 15594407 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Borrelia burgdorferi # 4 114 3 112 117 105 40.0 2e-23 MKIIDLTKESFVEKIADYQSYPDNWEYKGDKPCLVDFHAPWCGYCKALSPILDQLAEEYK GRLDIYKVDVDQEEELESAFKIRTIPNLLLCPVNGKPIMKLGTMNKVQLKEFIETTLFS >gi|226332270|gb|ACIC01000050.1| GENE 5 4506 - 5678 1210 390 aa, chain - ## HITS:1 COG:lin2213 KEGG:ns NR:ns ## COG: lin2213 COG1820 # Protein_GI_number: 16801278 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Listeria innocua # 44 379 46 378 380 185 31.0 1e-46 MLTQIINGRILTPQGWLKDGSVLICDGKILEVTNSDLAVIGATVIDARGMTIVPGFVSMH AHGGGGHDFTEATEEAFRIAATAHLKHGATGIFPTLSSTSFERIYQAVDVCEKLMKEPES PILGLHIEGPYLNPKMAGSQYDGFLKIPDANEYMPLLEHTSCIKRWDISPELHGAHDFAK YTRSKGIMTAVTHTEAEYDEIKAAYAVGFSHAAHFYNAMPGFHKRREYKYEGTVESVYLT DGMTVEVIADGIHLPATILKLVYKLKGVENTCLVTDALAYAANEGNEPIDPRYIIEDGVC KMADHSALAGSLATMDVLVRTMVKKASIPLEDAVRMASETPARLIGVSDRKGALSKGKDA DIVILDKELNVRCVWSMGKVVPGTDRLLHK >gi|226332270|gb|ACIC01000050.1| GENE 6 5697 - 6863 1212 388 aa, chain - ## HITS:1 COG:lin2213 KEGG:ns NR:ns ## COG: lin2213 COG1820 # Protein_GI_number: 16801278 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Listeria 
innocua # 43 379 45 378 380 203 34.0 5e-52 MLTQIINARILTPQGWLKDGSVLIRDHKILEVTNCDLAVIGAKLIDAKGMYIVPGGVEIH VHGGGGRDFMEGTEDAFRTAIKAHMQHGTTSIFPTLSSSTIPMIRAAAETTEKMMAEPDS PVLGLHLEGHYFNMEMAGGQIPENIKNPDPEEYIPLLEETHCIKRWDAAPELPGAMQFGK YITAKGVLASVGHTQAEFEDIQTAYEAGYTHATHFYNAMPGFHKRKEYKYEGTVESIYLI DDMTVEVVADGIHVPPTILRLVYKIKGVERTCLITDALACAASDSKTAFDPRVIIEDGVC KLADRSALAGSVATMDRLIRTMVQNAEIPLADAVRMASETPARIMGVLDRKGTLERGKDA DIIALDRDLNVRAVWAMGQLVEGTNKLF Prediction of potential genes in microbial genomes Time: Thu May 12 00:35:39 2011 Seq name: gi|226332269|gb|ACIC01000051.1| Bacteroides sp. 1_1_6 cont1.51, whole genome shotgun sequence Length of sequence - 38426 bp Number of predicted genes - 29, with homology - 29 Number of transcription units - 14, operones - 7 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 131 - 1270 926 ## COG0019 Diaminopimelate decarboxylase - Prom 1344 - 1403 6.5 + Prom 1342 - 1401 7.6 2 2 Tu 1 . + CDS 1447 - 2001 443 ## COG2249 Putative NADPH-quinone reductase (modulator of drug activity B) + Term 2047 - 2094 7.1 - Term 2031 - 2086 11.4 3 3 Op 1 9/0.000 - CDS 2092 - 5337 2475 ## COG0841 Cation/multidrug efflux pump 4 3 Op 2 9/0.000 - CDS 5369 - 6868 1404 ## COG1538 Outer membrane protein 5 3 Op 3 27/0.000 - CDS 6865 - 9972 2793 ## COG0841 Cation/multidrug efflux pump 6 3 Op 4 . - CDS 9999 - 11087 1122 ## COG0845 Membrane-fusion protein - Prom 11107 - 11166 4.6 7 4 Op 1 . - CDS 11229 - 12845 736 ## BF2988 hypothetical protein 8 4 Op 2 . - CDS 12855 - 13682 448 ## COG0681 Signal peptidase I 9 4 Op 3 . - CDS 13723 - 15603 683 ## BF0297 putative exported glutaminase 10 4 Op 4 . - CDS 15594 - 17606 765 ## BF3158 hypothetical protein 11 4 Op 5 . - CDS 17616 - 18479 167 ## BF3160 hypothetical protein 12 4 Op 6 . - CDS 18490 - 19377 510 ## BDI_3440 hypothetical protein - Prom 19400 - 19459 5.2 - Term 19566 - 19610 2.1 13 5 Tu 1 . 
- CDS 19621 - 19881 170 ## gi|253568508|ref|ZP_04845919.1| predicted protein - Prom 20001 - 20060 2.5 14 6 Op 1 . - CDS 20095 - 21090 465 ## BF3003 putative lipoprotein 15 6 Op 2 . - CDS 21134 - 21370 170 ## BF0032 hypothetical protein - Prom 21427 - 21486 3.6 16 7 Tu 1 . - CDS 21523 - 22674 504 ## BF0205 putative transmembrane protein - Prom 22699 - 22758 3.2 17 8 Tu 1 . + CDS 23157 - 24308 717 ## COG1672 Predicted ATPase (AAA+ superfamily) + Prom 24341 - 24400 6.9 18 9 Tu 1 . + CDS 24497 - 25534 296 ## BT_0659 hypothetical protein - Term 25527 - 25587 16.8 19 10 Op 1 . - CDS 25624 - 26319 696 ## BT_0658 hypothetical protein 20 10 Op 2 . - CDS 26343 - 28709 2148 ## COG0210 Superfamily I DNA and RNA helicases - Prom 28772 - 28831 8.7 + Prom 28647 - 28706 5.0 21 11 Tu 1 . + CDS 28857 - 30599 1340 ## BT_0656 hypothetical protein + Prom 30614 - 30673 4.0 22 12 Op 1 1/0.000 + CDS 30706 - 31293 431 ## PROTEIN SUPPORTED gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent + Term 31334 - 31370 6.4 + Prom 31367 - 31426 5.8 23 12 Op 2 . + CDS 31450 - 33432 1484 ## COG0642 Signal transduction histidine kinase 24 13 Op 1 . + CDS 33645 - 33845 306 ## BT_0653 ThiS protein, involved in thiamine biosynthesis 25 13 Op 2 3/0.000 + CDS 33850 - 34479 563 ## COG0352 Thiamine monophosphate synthase 26 13 Op 3 . + CDS 34492 - 35265 873 ## COG2022 Uncharacterized enzyme of thiazole biosynthesis + Term 35299 - 35333 0.5 + Prom 35267 - 35326 1.6 27 14 Op 1 . + CDS 35346 - 37043 1887 ## COG0422 Thiamine biosynthesis protein ThiC 28 14 Op 2 . + CDS 37115 - 38239 1068 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes 29 14 Op 3 . 
+ CDS 38244 - 38424 194 ## COG0476 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 Predicted protein(s) >gi|226332269|gb|ACIC01000051.1| GENE 1 131 - 1270 926 379 aa, chain - ## HITS:1 COG:sll0873 KEGG:ns NR:ns ## COG: sll0873 COG0019 # Protein_GI_number: 16330194 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Synechocystis # 4 378 14 386 387 426 49.0 1e-119 MIDFTQFPSPCYIMEEELLRKNLCLIKSVADRAGVEIILAFKSFAMWRSFPIFREYVEHS TASSVYEARLALEEFGSKAHTYSPAYTEQDFPEIMRCSSHITFNSMAQFRRFYPMTVAEG SGISCGIRVNPEYSEVETELYNPCAPGTRFGMTADLLPDTLPKGIEGFHCHCHCESSSFE LERTLQHLEEKFSRWFPQIKWLNLGGGHLMTRKDYDTEHLIKLLQDLKARYPHLQIILEP GSAFTWQTGVLTSEVVDIVESRGIRTAILNVSFTCHMPDCLEMPYQPAVRGAEMGNEGEF IYRLGGNSCLSGDYMGVWSFDHELQIGERIVFEDMIHYTMVKTNMFNGIHHPAIALWTKE GKAEIYKQFSYEDYRDRMS >gi|226332269|gb|ACIC01000051.1| GENE 2 1447 - 2001 443 184 aa, chain + ## HITS:1 COG:BS_ywrO KEGG:ns NR:ns ## COG: BS_ywrO COG2249 # Protein_GI_number: 16080652 # Func_class: R General function prediction only # Function: Putative NADPH-quinone reductase (modulator of drug activity B) # Organism: Bacillus subtilis # 7 172 2 171 175 110 34.0 1e-24 MNKDLRKVVILLAHPNMKESQANKALIDAVSDIEGVAVFNLYDQQGASFDVDEWSKIISD ASALIYQFPFHWMAAPSLLKKWQDEVFTFLSKTPAVAGKPLTVVTTTGSEYEAYRSGGRN RFTVDELLRPYQVSAIHSGMSWQTPVVVYGMGTADAGKNIAEGANLYKQRVEMLIGSSNA GNNW >gi|226332269|gb|ACIC01000051.1| GENE 3 2092 - 5337 2475 1081 aa, chain - ## HITS:1 COG:FN1275 KEGG:ns NR:ns ## COG: FN1275 COG0841 # Protein_GI_number: 19704610 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Fusobacterium nucleatum # 21 1069 16 992 1020 167 20.0 9e-41 MQDIGSEIKKSSKASAFTLIVAFISVALVGLALVPLLPVKLNPSRTLPGFSVWFGMGGTS ARVVEMTATSKLEAMLARVKGIQSISSTSGNGWGSINVSLDKHADAAVARFEASTIIRQT WPELPAGVSYPYIQMASPEQKSQGPFIAFTINAPATPSLIQKYAEEHIKTRLAQLPGIYK INVSGATPMEWRLEYDYEQLRALGVSTDEISQAVGLHYQKEFLGTYDVEQAASGKEWIRL 
VLMPEHDNREFDAGQIQVKVKDGRIIRLDELVKVVRMEEQPQSYYRINGLNSIYLSVVAE EAANQLELSKKVQACMDDIRLSLPAGYEIHTSYDTTEFIHDELNKIYLRTGVTVAILLLF VLLITFSPKYLFLIVTSLTVNMAIAVIFYYVFGLEMQLYSLAGITVSLNLVIDNTIVMSD HYLRCKNRKAFMSVLAATLTTMGALVIIFFLDERIRLNLQDFAAVVMINLGVSLLVALFF VPSLIDKIGLKRRRKSSLTGVKRKWKMGNTRFLGWLRSKMRRGPVYFSHFYRWLIQRLCR WRVAVCLLLLLAFGLPVFLLPEKMDGDGKWAEIYNKTLGTPTYKEKVKPIVDKALGGSLR LFVQKVYEGSYFTRNEEVVLYANANLPNGSTLEQMNTLIKRMETYLSEFKEIKQFQTSVE SARRASISIRFTKENQKSGFPYTLKANMISKALQLGGGDWSIYGLQDQGFSNSVRENAGS FRVKMYGYNYDELYSWATKLKEVLLSHRRIREVTVGSNFSWWKDDYQEFYFELDKQRMIG AGIGAGELFAAIRPIYGRNQEIGSVVTEDGTEKIKLSSRQSDQRDIWAMQYYPFRVGDKE YKLAELAKVEKGQMPQEVAKENQQYRLCLQYEYIGASEQGNKLLKKDLEEFNELLPMGYK AEAESNNWSWGGGANKQYRLLLIVIAIIFFITSILFNSLKQPLAIIFVIPISYIGVFLTF YWFKLNFDQGGFASFILLCGITVNASIYILNEYNAIRKRFPCLSPLRAYVKAWNTKVIPI FLTVVSTILGFIPFMVGAEKEGFWFPLAAGTIGGLVMSVIGVFIFLPVLTLNKKKMMVKK P >gi|226332269|gb|ACIC01000051.1| GENE 4 5369 - 6868 1404 499 aa, chain - ## HITS:1 COG:VC1565 KEGG:ns NR:ns ## COG: VC1565 COG1538 # Protein_GI_number: 15641573 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Vibrio cholerae # 193 476 128 403 419 62 21.0 2e-09 MKKSSVLFVLMAASGICFSSVYAQKRLVLDLNQTIALANDSSLESFRTQNMYLSGYWEYR TYRANRLPSLTMDLTPAQYNRDITKRYDSGQDLDVYRTQQSYYAYGGLSVRQNLDLTGGT FYLESNLAYMRNFGDNSATQFTSVPIRLGYSQSLVGYNPFKWDRKIEPLKYERVKKEFLY NVEKVSETATNYFFSLAMAQAEYKLAKENLASTDTLYRIGQQRHRIAAISQADLLTLKLD KVNAQNTLQNRASALKRAMFSLASFLNLDKNTQIELELPSRPSMMEIPVDEALRWGRSNN PQLLELKQNVLEAERNVDRTKKESRFNASVNASIGFNQVAENFGDVYHKPMQQDLVAVSV SIPLLDWGVRKGKYNMARNNLNVVKISARQDEISIEEEVIMTVSDFNIQQQLIASAEEAL DLAILAYNETRQRFIIGKADINSLTLSLNRQQEAQRNYISALQNYWLNYYKIRRLTLFDF ATKLSLSDRFDFNGGRLIR >gi|226332269|gb|ACIC01000051.1| GENE 5 6865 - 9972 2793 1035 aa, chain - ## HITS:1 COG:aq_786 KEGG:ns NR:ns ## COG: aq_786 COG0841 # Protein_GI_number: 15606161 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: 
Aquifex aeolicus # 1 1015 1 992 1000 309 25.0 2e-83 MIKFLIQRPIAVLMAFTACFIIGLVTYSTLPISLLPNIAIPEITVQVASANTSARELENT IVKPLRGQLIQVSTLKDIHSESRDGAGIIRLSFEYGTNTDLAFIEVNEKIDAAMNYLPKE AERPKVIKASATDIPVFYLNLTLKNDSAYGATEQHSFLELCEFAESVIKRRIEQLEEVAM VDVTGILERQVQIVPDKDKLAMMGLSIGDIESALSSNNVEPGSMTVRDGYYEYNIKFSTL LRTAEDVENIYLRKGDRIVQLKDFCKVSVVPVKEKGVSLSNGKRAVSLAVIKQADENMDK MKQALVGTMTYFQLVYPEIDFSISRNQTELLDYTISNLQQNLSLGFLFICLVAVLFLGDV KSPLVIGLSMVVSIVISFVFFYLCNMSLNIISLAGLILALGMMIDSSIIVTENISQYRER GYSLRRACVTGTSEVVTPMLSSSLTTIAVFAPLIFMSGIAGAIFFDQAFSVTVGLMVSYL TGIMLLPVLYLLVYRTGVRTRKWKWLSFKFNNPIKDHTLDRFYDAGVDWVFRHKTFSVLF CVISIPLCVFFFFFIDKERMPDIEENELIARIEWNENIHVDENKRRVDELFKELKDEVLE QTASIGRQDFILNRERELSSSEAELYFRTETSDAIAPLEQAVYRKLKERYPLSVISFSPP ETVFEKLFVTGEPDVVAEFYTRNKAEAPKAEAIRSMEQELGRKTGINPTGIAFENQLNLS ISKEKLLLYRVSYNELYRVLKTAFRENSVTMLHSYQQYLPINIAGDEKTVNEVLQETLVQ TQPDNRGNVDFIPLRELINVAPAEDLKSITSGRNGEYVPFDFYGVQDANRLMREVKQVAE ETGDWDTGFSGSIFSNKEMLDELVVILLISLLLMYFILAAQFESFLQPLLVLAEIPIDVA FALLLLWICGHTLNLMSAIGLIVTCGIVINDSILKLDAINELRKAGVPLLEAIHEAGRRR LRPIIMTSLTTIFAMVPLLFSSDMGSELQKPLSIAMIGTMSIGTAVSLFIIPLLYWFIYR GKSGNHKPNEKHVEL >gi|226332269|gb|ACIC01000051.1| GENE 6 9999 - 11087 1122 362 aa, chain - ## HITS:1 COG:FN1274 KEGG:ns NR:ns ## COG: FN1274 COG0845 # Protein_GI_number: 19704609 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Fusobacterium nucleatum # 45 362 46 365 370 78 24.0 2e-14 MKNNSTWILLCLACMMTLSCSDKKNTADESEKGVSIVLPDTKNEVAVQILKKRDFHHELV SNGKINARGKADLRFETGEVIAHIYVKNGDRVQKGQKLADLDKFRLEQKLSQSEDALLKA ELELKDVLIGQGYSPDDFSKVPAETMKIAKVKSGYEQSKSQYELAKRDMEHATLIAPFDG VVANLFSKPYNPANTSEAFCTIIDSKGMEADFTVLENELAFIRMGDKVVITPYAGGDAFE GSVSEINPLVDANGMVKVKALVNAGGKLFSGMNVRVSVRRSLGERLVIPKSAVVLRSGKQ VVFTLKDGKAMWNYVNTGLENATECVVSDKSQKGIEDGLLEGDTVIVTGNLNLAHEAEVY VK >gi|226332269|gb|ACIC01000051.1| GENE 7 11229 - 12845 736 538 aa, chain - ## HITS:1 COG:no KEGG:BF2988 NR:ns ## KEGG: BF2988 # Name: not_defined # Def: hypothetical protein # Organism: 
B.fragilis_NCTC9343 # Pathway: not_defined # 13 537 16 545 562 558 53.0 1e-157 MLLQNKYIVHLLMLFGVIALLSVVTVSSYTIAVGDMAYQKLYLGHMSIVFVICTLMSLCL DCKKNAFYLSVVWSLIILGGVEAVWGLRQIYGLAVSNHSLYALTGSFYNPGPYSGYLAMV FPICLSEWLNLKKVKKRTWIEQSKYCVALGVLLLILCVLPAGMSRSAWMAVAISGIWVYA TYRSWGTSLRKIGRKYKKRVFPAIIAGGMVLIIVGYALFQLKVDSANGRLLIWKVSVMAI VEKPFLGHGTGNFASAYGMAQEKYFSQKEFTSTEELVAGSPEYAFNEYLQIAVEYGVLFL LVVLLIIVFCLWIGITEKRLSACAGLISVLVFAFSSYPMQIPGFAIAFYFLLAACVVGSS RLQILFFIIMIALLGSYYWKYNQYNACEEWFRYKMHYNIGAFRLAKEGYEKIYPELNDRG AFLFEYGHSLHKLKEYNSSTAILKEAMAHSCDPMILNIIGKNYQAIGKYEKAEEYFIRST HRLPGRIYPYYLLAKLYAEPEYRHPEKLKQAIQIVLTKEPKVQSTAIREMREEVKLLK >gi|226332269|gb|ACIC01000051.1| GENE 8 12855 - 13682 448 275 aa, chain - ## HITS:1 COG:PA0768 KEGG:ns NR:ns ## COG: PA0768 COG0681 # Protein_GI_number: 15595965 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Pseudomonas aeruginosa # 3 257 67 265 284 85 30.0 8e-17 MIFCMIVILFLVQLFCFTSFKIPSDSMGPALKDGDRILVNKMIKGARLFNVFAALDNKDV TIYRMPGLGHFKRNDVLVFNFPYQMNRWDSIRMNVMQYYVKRCIALPGDTLEIRGGFYKV RGCREQLGNYEAQQYLAGLQQPEKQGIVVGTFPYDQSLGWNIREFGPLPIPQKDQIVIMN HTTYLLYRQLIAWEQKKKIELKQEQVFIGDSLIHQYCFKKNYYFVSGDNMANSQDSRYWG MLPEEYIVGKATRIWNSKEKYTDQIRWERILMQIK >gi|226332269|gb|ACIC01000051.1| GENE 9 13723 - 15603 683 626 aa, chain - ## HITS:1 COG:no KEGG:BF0297 NR:ns ## KEGG: BF0297 # Name: not_defined # Def: putative exported glutaminase # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 5 586 12 636 837 289 29.0 3e-76 MEIRLKIKLVVLWKTIILLGIIVLLSACFHSKKEAVSSIAIIPDEHTLLAITPEFSIKMA GNTLNKHYLRLKSGKIFPMTGILQVDGRTYRFMGEDSLRISPIASLANDSVSWSGKYSYL FPGKGWEQKEYDDSFWKEGNAAFGPVKGNSKIRTAWGAENIYVRRHIKIDNKDALEGHKI YVCYICDDQIKLFWNGDFLFEKGITYQTKCGRLTDEAIAHIGNGTNVLAAYGCNTGGPAL LDFGLYIENKTYSEADTALLKQIDVQATQTHYVFQCGDVELQIDFVSPSLSEKWNMTGWP VGFLSYQVRSEDKKEHTVEILFDVDMEWMFGRRKIDSWVEQEWRFVKSDSLYLGLTADGT SFSCDDSHVVLSQKLCAENGNKGILLIGYGEGQRIQYIGERLQPLWNDNGKGEVKELMKA 
VGNRYQKLKEECNKLDNLWNNKAFQVGDKTFAEQILPSYRNFISSHRFVLSSDNKSFCFG DTLGNVRDAYRSFPALLFFNRVDWMKSLLNPIFEYCEDEHWKRKYPPYDIGLYPVASKQI KVDNCAEEAAANMLVMTVAIVEAEQDFSYAELHWSHLCLWANYLEEKMKGEVLPSIELLD ANDKRVKCVLGLMAYHKLIQLKDAYE >gi|226332269|gb|ACIC01000051.1| GENE 10 15594 - 17606 765 670 aa, chain - ## HITS:1 COG:no KEGG:BF3158 NR:ns ## KEGG: BF3158 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 23 669 21 656 656 484 41.0 1e-135 MAKCFRDLSYICIILFIVLGWHIMDNIVSEKEVDITSLEVALRSAGDNRGELEKVLCYYK RKNTDSLKYKAACYLIENMPFHRYSVGEQLENYKSYYAWLKKSKGKEPQQVADSVKKVYG PMKEPNKIRDIMEIDSAYLCHNIDWAFKVWQEQPWGKSISFDVFCEYLLPYRIGDEPLVY WREMYYAKYNSLLDTLRMSDSLDIEDPVVAANYLINKLPDKWHYYTSTTPYSFGHIGPEY VQYLSGTCREVTDFAVYLFRALGIPCAIDFVPVRSYVNAGHFWLVAWDKTGEAYMANFPE NLGMVRKNWWYRWDDSPKVYRYTFDINKELYEQMAKYNEKVYSFWSLPKFMDVTYEYAYS FEKELVIPLEKLYRNQCSGRIAYLCVTNRDNWIPVDWTEFDAQHLAFRNVRRGTLMRVAT YENGTLNFLTDPFYVDKQKKEQHYFSIEGNTQDVVLYAKCNIEGENMFRDRMIGGVFEGS NQLDFAVSDTLFIIQCKPDRLNTTVRSSSNKEYRYIRYVGPPGGLCNVAEVAFYEKNDTL PLSGKIIGTPGCYQHDGTHEYTNVFDGKTWTSFDYFKFSGGWAGLDLGRKVQIDRIVYTP RNRDNYIRPGDIYELYYCDRYWKSAGRIKSTVDSLVYRGIPQNVLLFLRNHTRGVDERVF VYEKGEQLWK >gi|226332269|gb|ACIC01000051.1| GENE 11 17616 - 18479 167 287 aa, chain - ## HITS:1 COG:no KEGG:BF3160 NR:ns ## KEGG: BF3160 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 287 1 288 288 277 47.0 3e-73 MKNICLMLLPLFLMSCLNKDNNIVSLLKEWEQKEILFPKEMYFTTMLRDTTYYNLQSEYK ILAYVDSIGCTSCKLQLSAWSMLINEIDSLYAGRVKFIFAFSPNRIKDIYHAILTANFTY PVYVDKQDSIAKLNRFPLESNLHTFLLDKNNRVIAVGNPVYNPKIKKLYFNIISGKKPFL PCNAEQRTNASLDKRFLDMGSFNWKQEQISDIILSNSGEELLAIEDVSVSCGCITVEYLK EPIRPGQKLNLRIKYKAEHPEHFDKTVIIYCNAEGSPFRLKISGNAK >gi|226332269|gb|ACIC01000051.1| GENE 12 18490 - 19377 510 295 aa, chain - ## HITS:1 COG:no KEGG:BDI_3440 NR:ns ## KEGG: BDI_3440 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 2 294 59 350 350 197 37.0 4e-49 
MRIYVCDTILITMNPMEKKLFHLFSLKSKKEIGKRISLGQGPDEMIRPSIIKFNENQILI FDVATFTLFTYDTEDFISNENTIPLERKTIELQAYGEIGLLSDNLIGSTYNPKNQFIKFD NNGKKVGEFGYYPVVADLSYTDDEKLEAFKSSFVTNMKDRIAVCYKWTDLIDIYDQNGQL KKRIHGPEHSYAHFKEFRNGNAIAATSDKDNIDAYFFPYNVGDEFFVLFNGKPWDPDDKE VELPDRIFVFDWNGIPQKIYTLDQGIINFAVDKVQKKIYGISSSPEFHVVEFSYK >gi|226332269|gb|ACIC01000051.1| GENE 13 19621 - 19881 170 86 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253568508|ref|ZP_04845919.1| ## NR: gi|253568508|ref|ZP_04845919.1| predicted protein [Bacteroides sp. 1_1_6] # 1 86 1 86 86 144 100.0 2e-33 MKKKLMGIVTIIAIAAGAGYNVYASRSNVKLSDLALANVEALADSSEGSQSDCNTYCMVS YFDTCIIRYTNNTFTTCYDSRIRRSN >gi|226332269|gb|ACIC01000051.1| GENE 14 20095 - 21090 465 331 aa, chain - ## HITS:1 COG:no KEGG:BF3003 NR:ns ## KEGG: BF3003 # Name: not_defined # Def: putative lipoprotein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 10 331 28 353 353 430 62.0 1e-119 MEYAENVFPYSEFPQEKELKGEVIELDTALFRCPFRIRVEGDKAIVMDLHGIDYYAHLFK YPGFQYLSSFGRRGDSPTEMLSMDNVRFYNHKVWTLDANKRELTRLGFSSSGDSLLRDEA VILDEDILRPLDFAIYNDTTFIIPDYSGENRLCWLNDDGELVKKIGAIPSINNQTLRKAR PALAQAWRSFIDYNSHNGVLAMVTQLGEVLEVYNLKDSTEVIRIGENGEPKFKIFEGYGI PSGIMGFSDVQVTDNAIYAVFHGTTFKEIVKHNGHLPDGGKYIYVFSLKGEPLYKYVLDH YVYGFWVDEATKTIIATDINNDQPIVRFQCG >gi|226332269|gb|ACIC01000051.1| GENE 15 21134 - 21370 170 78 aa, chain - ## HITS:1 COG:no KEGG:BF0032 NR:ns ## KEGG: BF0032 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 74 26 100 102 62 48.0 4e-09 MKKNLLLLLCIVAILSTKYLASHEKRQSALLLYNIEALAANSEHADIGHCYGTGTLDCPV SHDKVKYIAGGYSLEGLY >gi|226332269|gb|ACIC01000051.1| GENE 16 21523 - 22674 504 383 aa, chain - ## HITS:1 COG:no KEGG:BF0205 NR:ns ## KEGG: BF0205 # Name: not_defined # Def: putative transmembrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 26 362 27 338 351 106 24.0 2e-21 MRPKIIILLSAIACIGGIVFFLACKYTYEFKVTELKEKAKETFTEALDKELKNRNINGGL SFYFNSESAISTEMPDSVYWEDASGRHLYRMDSKKNSMNITHNSNLRALHTFIFYEKPLQ 
PDSLNLIWDEQLNKADIYLESALRISLTHEDGSVESQKTYQSEWGNSSNLVFTFYIGYAC EIEVMGYLHYTLWGMIYKEVLLYLLLYIIVGYGFYTFFIILSRRINSLRSKEIVEIIKET PVEILKEVPVEIIKEVQVEKQIIKEVQRVDIIPLHSYILGESLIFYADQNIIEVNDVKYN IQGQSSLLLELFFQEKDNGYTLKEDFIIEKLWPDNSGNDERMHKAIGRLRSFLRKIDPSL SIINKNGTYRLIISEKSLVKRNS >gi|226332269|gb|ACIC01000051.1| GENE 17 23157 - 24308 717 383 aa, chain + ## HITS:1 COG:MA1854 KEGG:ns NR:ns ## COG: MA1854 COG1672 # Protein_GI_number: 20090704 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Methanosarcina acetivorans str.C2A # 8 377 7 389 390 79 24.0 1e-14 MAKEINPFIVTGKIEPEYFCDRVKESERLVKSLTNGNNLVIISPRRMGKTGLIRFCYEKP EIKEEYYTFFIDILHTSSLREFTYLLGKEIYETLLPRSKKMASLFIQTLKSINGKFGFDP LTNLPTFNLELGDIDRPEYTLEEIFQCLIAADKPCIVAIDEFQQIAKYPEKNIEALLRTH IQKIENSNFIFAGSERHMMQEMFLSSSRPFYHSADILELKAIAPEIYIPFIADHFQKRKR HIEIENIEKVYRLFKGHTFYIQKTFNEAFADTPKGKECTLEIIKTAIDNLIAYNDTIFRE ILSNIPEKQKELLYAIAKVGEAENVTSAKFIKRYSLTSASSVQSATKKLLEKDIITEINK VYSVTDKFFGMWINTIYGNKYTL >gi|226332269|gb|ACIC01000051.1| GENE 18 24497 - 25534 296 345 aa, chain + ## HITS:1 COG:no KEGG:BT_0659 NR:ns ## KEGG: BT_0659 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 345 1 345 345 702 97.0 0 MKSKNIILLFTLALPLFSCSESQKRVSFSESTTAKSTALKEAFISIPMDIKKKGDILYAG DFKGDSLLYCYSLSEQRFVNQMLPQGQGPDEFLSPVEFFLSDSSAFIHNRWHFTAQNYTF NAKDFSIRRQGELIHLPMSIDRVYPMSESRFIASGVFEDCRFLILDNDGNVISKSGDFPN YQSGEETIPNTAKGMFHQSQFGYNTDRKRLACATSNVLELWDYKPETLTLHKRLLLAPYH YQFNSSPDGVYAESDNPDAELGARGIAVSNNYVYVLYNPNTNRMHEEQKETLNSEIWVFD WEGKPIRKILTDTHIECFCVDETDTSFYCVMTAPDYCIGIVSPSH >gi|226332269|gb|ACIC01000051.1| GENE 19 25624 - 26319 696 231 aa, chain - ## HITS:1 COG:no KEGG:BT_0658 NR:ns ## KEGG: BT_0658 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 231 1 231 231 385 99.0 1e-106 MRKQLTVILLSALMLSGCASGRMGNPGAIVGGAAIGGSLGSSIGGLIGDNNRGWRGGYRG SAIGNIVGTIAGAAIGNALTAPKPDPIEEYAYVPEVRQGQSHSKFKPRTQTQQQLTQLKL 
RKIRFIDDNRNHAIDGGESSKIIFEIMNEGRNPVYDVVPIVETVGKVKHIGISPTVMIEE ILPGEGIRYTATVYAGSKLKDGEVTFRVAVSDENGVICDSQEFTLPTQRAE >gi|226332269|gb|ACIC01000051.1| GENE 20 26343 - 28709 2148 788 aa, chain - ## HITS:1 COG:SPy1267 KEGG:ns NR:ns ## COG: SPy1267 COG0210 # Protein_GI_number: 15675225 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Streptococcus pyogenes M1 GAS # 9 782 8 767 772 509 39.0 1e-144 MNTNYIEELNESQCAAVTYNDGPSLVIAGAGSGKTRVLTYKIAYLLENGYQPWNILALTF TNKAAREMKERIARQVGMERARYLWMGTFHSVFSRILRAEAKYIGFTSQFTIYDSADSKS LIRSIIKEMGLDEKTYKPGMVQARISSAKNHLVSPTGYAANKEAYEGDMAAKVPAVREIY TRYWERCRQAGAMDFDDLLVYTYILFRDFPEVLARYREQFRYVLVDEYQDTNYAQHSIVL QLTKENQRVCVVGDDAQSIYSFRGADIDNILYFTKVYPNTKVFKLEQNYRSTQTIVCAAN SLIEKNERQIRKEVFSEKERGEPIGVFQAYSDVEEGDIVANKIAELRREHSYGYADFAIL YRTNAQSRVFEEALRKRSMPYKIYGGLSFYQRKEIKDVIAYFRLVVNPNDEEAFKRIINY PARGIGDTTVGKIISAATENGVSLWAALCEPLSYGLNINKGTHTKLQGFRELIEGFMTDQ VDKSAYEIGTDIIRRSGIINDVCQDTSPENLSRKENIEELVNGMNDFCALRQEEGNPNIS LTDFLSEIALLTDQDSDKADDGEKITLMTVHSAKGLEFKNVFVVGLEENLFPSGMVGDSP RALEEERRLFYVAITRAEEHCYISFAKTRFRYGKMEFGSPSRFLRDIDVNYLRLPHEAGV SRSVDEGAGRFRREIEGGFTRAASPTRTPYGTKFSDRERPKAQVIAPTVPKNLKKVSAVS GSSIRSASAGSASVTGVQAGQMIEHERFGLGEVIRVEGTGDNAKATIHFKNAGDKQLLLR FARFKVIE >gi|226332269|gb|ACIC01000051.1| GENE 21 28857 - 30599 1340 580 aa, chain + ## HITS:1 COG:no KEGG:BT_0656 NR:ns ## KEGG: BT_0656 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 580 1 580 580 1120 99.0 0 MKKSIFLSFILCLCLIACIPQQVMAQKQSRMEKLLRYLNDNDADKWQKNREKLDDETKAY YAEDLSLMDVLNDLWNGQSEQAATLYFGCYEKAAQSNFPGICEGEKIPLSQIRDKADQSI INLLEASKDKIPFSRALLDSIHATEYPVDSAMLQRLQNIREVALLEGMLKTPTPIIYQTY VKEYPNGKFIAQVNASENVRLYQLVKTAPTPANFKAFFEDPEMQKYYQDRGPRPYLAEVR TLYDDFLFQRIDSLKKEGNATAIRQIIDDYKNTPYLSTGARTHLNDLEYLSEKADFELLK PAIVNSESLGLLQEFLKTHKYKEFRDQAKNLRAPFILQAIVSTPTTVKYYTQGRLIKCCE TDSTGNITTSYTYNDKGQLTTTLSVTEKNGQPINEVQTSRLYDPQGHCIFEVKTNPKTKT 
DFYRRARRIGIDGSIESDSLKYMDGRYTVSSYNKQGLLTETKEYNKNGEMEGYTVNKYDD KGRMTESQHQNMLFVNSPNQILSQKELYEYDKYGYLTRIVYQRIFGNSQKTSGCLTCLYD EYGNRIDGDSYYEYDNTGQWICRTSYDNPQQVERIQYIYK >gi|226332269|gb|ACIC01000051.1| GENE 22 30706 - 31293 431 195 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent [Streptococcus pneumoniae TIGR4] # 3 193 1 195 201 170 43 1e-41 MTMTYEMPKLPYANNALEPVISQQTIDFHYGKHLQTYVNNLNSLVPGTEYEGKTVEEIVA AAPDGAIFNNAGQVLNHNLYFLQFAPKPSKKEPAGKLGEAIKRDFGSFENFKKEFNAAAV GLFGSGWAWLSVDKNGKLHITKEGNGSNPVRAGLKPLLGFDVWEHSYYLDYQNRRADHVN ALWNIIDWDVVEKRM >gi|226332269|gb|ACIC01000051.1| GENE 23 31450 - 33432 1484 660 aa, chain + ## HITS:1 COG:VC2453_1 KEGG:ns NR:ns ## COG: VC2453_1 COG0642 # Protein_GI_number: 15642449 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 379 649 238 513 516 165 35.0 2e-40 MTLQQKNKEWRIFFVAFFLLLSFISNAATPEELEEEILFVTSYNSDTKYTYDNINTFVET YRQLGGKYSTIVENMNVTDLNQSRKWKRTLTNILDKHPNAKLVILLGGEAWSSFLHLEDE KYRQLPVFCAMASRNGIRIPEDSIDIQTYEPQSIDLTERMSEYNIIYCNTYEYDVDKDIE MMRSFYPDMEHLVFISDNTYNGLAEQAWVKKNMKRYPEISTTYIDGRIHTLDAAVKQLRD IPKNSVALLGIWRIDNRGITYMNNSVYAFSKANPELPVFSLTATAIGYWAIGGYIPQYDG IGRSMGEQAYQFLDKGKNNVGHIHLLPNRYKFDANKLYEWGFQDKKLPFNSLIINQQVPF FQAYRTEVQFILFTFLVLIGGLFISLYYYYRTKILKNHLEKTTAQLREDKKKLELSEIAL RHAKERAEEANQLKSAFVSNMSHEIRTPLNAIVGFSSLLIDTVEATDEQKEYADIIQTNS ELLLQLISDVLDVSRLESGKLLFKYEWCELVNHCQNMITLTRKNRTRDIDVRLQMPKEPY MLYTDPLRLQQVIINLLNNALKFTPDGGSITLDYEIDEEDQCMLFSVTDTGSGIPEDKQE LVFQRFEKLNEFVQGTGLGLAICKLTIQRMGGDIWIDKNYKNGARFVFSHPIKKRESEEK >gi|226332269|gb|ACIC01000051.1| GENE 24 33645 - 33845 306 66 aa, chain + ## HITS:1 COG:no KEGG:BT_0653 NR:ns ## KEGG: BT_0653 # Name: not_defined # Def: ThiS protein, involved in thiamine biosynthesis # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 66 1 66 66 111 100.0 8e-24 MKVQVNNKEVEMTPASTLTQLAAQLELPVQGIAIAVNNKMIPRSEWEKFALQENDNLVVI KAACGG >gi|226332269|gb|ACIC01000051.1| GENE 25 
33850 - 34479 563 209 aa, chain + ## HITS:1 COG:PAB1645 KEGG:ns NR:ns ## COG: PAB1645 COG0352 # Protein_GI_number: 14521295 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate synthase # Organism: Pyrococcus abyssi # 18 202 22 202 207 116 38.0 3e-26 MVSLQFITHQTDRYTYFESALMALEGGCKWIQLRMKEAPCEEVEAVALQLKPLCKEKEAI LLLDDHVELAKKLEVDGVHLGKKDMPIDQARQLLGEAFIIGGTANTFEDVVQHYRAGADY LGIGPFRFTTTKKNLSPVLGLEGYAAILSQMKEANIELPVVAIGGITCEDIPAILETGVN GIALSGTILRAEDPAAETRKILNMKCIIK >gi|226332269|gb|ACIC01000051.1| GENE 26 34492 - 35265 873 257 aa, chain + ## HITS:1 COG:YPO3742 KEGG:ns NR:ns ## COG: YPO3742 COG2022 # Protein_GI_number: 16123879 # Func_class: H Coenzyme transport and metabolism # Function: Uncharacterized enzyme of thiazole biosynthesis # Organism: Yersinia pestis # 2 256 62 324 333 314 62.0 9e-86 MEKLVIAGREFDSRLFLGTGKFNSNEVMEQAILTSGTEMVTVAMKRIDMDDKEDDMLKHI IHPNIQLLPNTSGVRDAEEAVFAAQMAREAFGTNWLKLEIHPDPRYLLPDSIETLKATEQ LVKLGFIVLPYCQADPVLCKRLEEAGAATVMPLGAPIGTNKGLQTKEFLQIIIEQAGIPV VVDAGIGAPSHAAEAMELGASAVLVNTAIAVAGNPVEMALAFKAATEAGRRAYEAGLGLQ ADNFIAEASSPLTAFLE >gi|226332269|gb|ACIC01000051.1| GENE 27 35346 - 37043 1887 565 aa, chain + ## HITS:1 COG:PA4973 KEGG:ns NR:ns ## COG: PA4973 COG0422 # Protein_GI_number: 15600166 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine biosynthesis protein ThiC # Organism: Pseudomonas aeruginosa # 7 564 23 596 627 778 63.0 0 MEQKIKFSRSQKVYLPGKLYPNIRVAMRKVEQVPSVSFEGEEKIATPNPEVYVYDTSGPF SDPSMSIDLKKGLPRLREEWIVGRGDVEQLPEITSEYGQMRRDDKSLDHLRFEHIALPYR AKKGEAITQMAYAKRGIITPEMEYVAIRENMNCEELGIETHITPEFVRKEIAEGRAVLPA NINHPEAEPMIIGRNFLVKINTNIGNSATTSSIDEEVEKALWSCKWGGDTLMDLSTGENI HETREWIIRNCPVPVGTVPIYQALEKVNGVVEDLNWEIYRDTLIEQCEQGVDYFTIHAGI RRHNVHLADKRLCGIVSRGGSIMSKWCLVHDQESFLYEHFDDICDILAQYDVAVSLGDGL RPGSIHDANDEAQFAELDTMGELVLRAWEKNVQAFIEGPGHVPMHKIKENMERQIEKCHD APFYTLGPLVTDIAPGYDHITSAIGAAQIGWLGTAMLCYVTPKEHLALPDKEDVRVGVIT YKIAAHAADLAKGHPGAQVRDNALSKARYEFRWKDQFDLSLDPERAQGYFRAGHHIDGEY CTMCGPNFCAMRLSRDLKKNAKSDK 
>gi|226332269|gb|ACIC01000051.1| GENE 28 37115 - 38239 1068 374 aa, chain + ## HITS:1 COG:VC0066 KEGG:ns NR:ns ## COG: VC0066 COG1060 # Protein_GI_number: 15640098 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Vibrio cholerae # 2 370 3 368 370 366 48.0 1e-101 MFSDELEKISWEETTKAIYSKTDADVRRALGKKEHLDVNDFMALISPAATPYLEVMARLS QKYTMERFGKTISMFVPLYITNSCTNSCVYCGFHISNPMKRTILTEEEIINEYKAIKRLA PFENLLLVTGENPAAAGVPYIARALDLAKPYFSNLQIEVMPLKAEEYKELTHHGLNGVIC FQETYNKANYKKYHPRGMKSKFEWRVDGFDRMGQAGVHKIGMGVLIGLEEWRTDVTMMAY HLRYLQKHYWKTKYSVNFPRMRPSENDGFQPNVIMNDRELAQLTFAMRIFDHDVDISYST RESAEIRNHMATLGVTTMSAESKTEPGGYYSYPQTLEQFHVSDERKAVEVERDLRKLGRE PVWKDWDQSFDFKR >gi|226332269|gb|ACIC01000051.1| GENE 29 38244 - 38424 194 60 aa, chain + ## HITS:1 COG:mll5577 KEGG:ns NR:ns ## COG: mll5577 COG0476 # Protein_GI_number: 13474645 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 # Organism: Mesorhizobium loti # 2 60 12 70 250 81 62.0 4e-16 MRYDRQMILPEIGEDGQQKLKQAKVLIVGVGGLGSPIALYLTGAGVGCIGLVDDDVVSIS Prediction of potential genes in microbial genomes Time: Thu May 12 00:36:52 2011 Seq name: gi|226332268|gb|ACIC01000052.1| Bacteroides sp. 1_1_6 cont1.52, whole genome shotgun sequence Length of sequence - 37232 bp Number of predicted genes - 34, with homology - 33 Number of transcription units - 20, operones - 11 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 465 275 ## COG0476 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 2 1 Op 2 . + CDS 527 - 1135 570 ## BT_0647 thiamine phosphate pyrophosphorylase + Term 1175 - 1223 8.2 - Term 1162 - 1209 7.2 3 2 Tu 1 . - CDS 1228 - 1800 544 ## BT_0646 hypothetical protein - Prom 1850 - 1909 7.5 - Term 1918 - 1964 12.0 4 3 Tu 1 . 
- CDS 1990 - 2505 586 ## BT_0645 hypothetical protein - Prom 2750 - 2809 3.6 5 4 Op 1 . + CDS 2796 - 4028 767 ## COG0582 Integrase 6 4 Op 2 . + CDS 4046 - 4438 280 ## BDI_2139 hypothetical protein + Term 4604 - 4635 0.4 + Prom 4946 - 5005 2.2 7 5 Tu 1 . + CDS 5114 - 6481 906 ## PG1109 mobilization protein + Term 6664 - 6707 4.2 + Prom 6723 - 6782 3.3 8 6 Op 1 . + CDS 6817 - 7554 404 ## BF1974 hypothetical protein 9 6 Op 2 . + CDS 7600 - 8058 314 ## gi|253568533|ref|ZP_04845944.1| conserved hypothetical protein 10 7 Op 1 . + CDS 8164 - 8904 491 ## BT_1755 hypothetical protein 11 7 Op 2 . + CDS 8974 - 9492 168 ## BF1789 hypothetical protein - Term 9857 - 9903 10.2 12 8 Tu 1 . - CDS 9943 - 12663 3109 ## COG0574 Phosphoenolpyruvate synthase/pyruvate phosphate dikinase - Prom 12830 - 12889 8.3 + Prom 12628 - 12687 5.4 13 9 Op 1 . + CDS 12863 - 14281 1279 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase 14 9 Op 2 . + CDS 14317 - 15237 212 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit 15 10 Tu 1 . - CDS 15357 - 15758 411 ## BT_0641 hypothetical protein - Prom 15797 - 15856 7.9 + Prom 15733 - 15792 8.4 16 11 Tu 1 . + CDS 15870 - 17513 1045 ## COG1032 Fe-S oxidoreductase + Prom 18078 - 18137 3.7 17 12 Op 1 . + CDS 18263 - 18496 163 ## BT_0639 hypothetical protein 18 12 Op 2 . + CDS 18578 - 19432 849 ## COG0024 Methionine aminopeptidase 19 12 Op 3 . + CDS 19432 - 20661 1140 ## COG1322 Uncharacterized protein conserved in bacteria + Term 20702 - 20760 5.3 20 13 Op 1 . - CDS 20776 - 22419 825 ## BT_0636 putative transcriptional regulator 21 13 Op 2 . - CDS 22490 - 22867 389 ## COG3682 Predicted transcriptional regulator - Prom 22893 - 22952 9.2 - Term 23012 - 23055 1.1 22 14 Op 1 . - CDS 23077 - 24396 825 ## COG3004 Na+/H+ antiporter 23 14 Op 2 . - CDS 24442 - 25620 774 ## BT_0633 putative Na+/H+ exchange protein - Prom 25650 - 25709 2.0 - Term 25695 - 25753 15.1 24 15 Op 1 . 
- CDS 25762 - 27543 2043 ## COG0481 Membrane GTPase LepA - Prom 27593 - 27652 8.0 - Term 27598 - 27643 5.6 25 15 Op 2 . - CDS 27673 - 27873 333 ## BF2557 hypothetical protein - Prom 27969 - 28028 5.9 26 16 Op 1 . - CDS 28030 - 28488 570 ## BT_0631 hypothetical protein 27 16 Op 2 . - CDS 28503 - 29264 776 ## COG0708 Exonuclease III 28 16 Op 3 . - CDS 29295 - 30548 1097 ## COG1914 Mn2+ and Fe2+ transporters of the NRAMP family - Prom 30639 - 30698 4.6 - Term 30644 - 30681 7.0 29 17 Op 1 . - CDS 30713 - 30958 187 ## BT_0628 hypothetical protein 30 17 Op 2 . - CDS 30966 - 31709 1012 ## COG0217 Uncharacterized conserved protein 31 17 Op 3 . - CDS 31745 - 34207 2848 ## COG0072 Phenylalanyl-tRNA synthetase beta subunit - Prom 34407 - 34466 6.4 + Prom 34160 - 34219 6.8 32 18 Tu 1 . + CDS 34347 - 34541 148 ## - Term 34437 - 34480 4.6 33 19 Tu 1 . - CDS 34530 - 36083 1627 ## COG0305 Replicative DNA helicase - Prom 36104 - 36163 7.3 + Prom 36061 - 36120 6.7 34 20 Tu 1 . + CDS 36282 - 37106 585 ## COG1947 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase Predicted protein(s) >gi|226332268|gb|ACIC01000052.1| GENE 1 1 - 465 275 154 aa, chain + ## HITS:1 COG:all2906_1 KEGG:ns NR:ns ## COG: all2906_1 COG0476 # Protein_GI_number: 17230398 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 # Organism: Nostoc sp. 
PCC 7120 # 5 139 97 231 262 121 48.0 6e-28 KAICAAERLSALNSEITIRTYPIRLTEENAQEIISQYDIVVDGCDNFSTRYLINDICAEM GKVYVYGAICGFEGQVSVFHYGEEKKSYRDLYPDEEEMRRMPPPPKGVMGITPAVTGSIE ATEVLKIICGFGEVLSGKLWTIDLRTLQSNKFSL >gi|226332268|gb|ACIC01000052.1| GENE 2 527 - 1135 570 202 aa, chain + ## HITS:1 COG:no KEGG:BT_0647 NR:ns ## KEGG: BT_0647 # Name: not_defined # Def: thiamine phosphate pyrophosphorylase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 202 1 202 202 403 99.0 1e-111 MKLIVVTTPTFFVEEDKIITALFEEGLDILHLRKPETPAMYSERLLTLIPEKYHRRIVTH EHFYLKEEFNLMGIHLNARNPSEPHDYAGHVSCSCHSVEEVKNRKHFYDYIFMSPIYDSI SKVNYYSTYTAEELREAQKAKIIDSKVMALGGINEDNLLEIKDFGFGGAVVLGDLWNKFD ACLDQNYLAVIEHFKKLKKLAD >gi|226332268|gb|ACIC01000052.1| GENE 3 1228 - 1800 544 190 aa, chain - ## HITS:1 COG:no KEGG:BT_0646 NR:ns ## KEGG: BT_0646 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 190 1 190 190 348 100.0 5e-95 MKKSALFILFALISVAGFSQITGWNAKVGMNISNYTGDADLNAKIGFKLGGGFEYAFDNT WSLQPSLFLSTKGAKKDGNSINAMYLELPVMAAARFNVADNTNIVVNAGPYLACGIAGKT KIDLGNDTERKYDTFGDDALKRFDAGLGVGVALELGKVIVGLDGQFGLVDVEKIGNPKNT NFSIVLGYKF >gi|226332268|gb|ACIC01000052.1| GENE 4 1990 - 2505 586 171 aa, chain - ## HITS:1 COG:no KEGG:BT_0645 NR:ns ## KEGG: BT_0645 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 171 1 171 171 330 100.0 1e-89 MKKGLLLVVVMLATIAVKAQDIYVGGSLNVWRNSTGNTTSFKVAPEIGYNFNETWALGAE LNYSHEYVKQVTANSVTVAPYIRWSYYQNDAVRLFLDGTAAIGFVKIKDGDTTKAGQIGL RPGIAVKLNDHFSFIAKYGFLGYRKNVNTSGDSFGLMLTSEDLSIGFHYNF >gi|226332268|gb|ACIC01000052.1| GENE 5 2796 - 4028 767 410 aa, chain + ## HITS:1 COG:TM0967 KEGG:ns NR:ns ## COG: TM0967 COG0582 # Protein_GI_number: 15643727 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Thermotoga maritima # 123 388 9 247 253 65 25.0 1e-10 MRSTFKVLFYVKKGSEKPNGNLPLMCRITVDGEIKQFSCKMDVPPRLWDVKNSRASGKSV EAQRINLAVDKIRVEINRRYQELMQTDGYVTAAKLKDAYLGIGIKQETLLKLFEQHNAEF 
EKKVGHSRAQGTFTRYRTVCNHIREFLPHTYKREDIPLKELNLTFINDFEYFLRTEKKCR TNTVWGYMIALKHIVSIARNDGRLPFNPFAGYINSPESVDRGYLTQTEIQTLMDAPMKNA THELVRDLFVFSVFTGLAYSDVKNLTADRLQTFFDGNLWIITRRKKTNTESNIRLLDVPK HIIEKYKGLARDGHVFPVPSNGSCNKILKEIGRQCGFKVRLTYHVARHTNATTVLLSHGV PIETVSRLLGHTNIKTTQIYAKITAQKISQDMETLSHKLEDMEKNICRAI >gi|226332268|gb|ACIC01000052.1| GENE 6 4046 - 4438 280 130 aa, chain + ## HITS:1 COG:no KEGG:BDI_2139 NR:ns ## KEGG: BDI_2139 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 4 128 2 125 127 149 58.0 4e-35 MKEERNIITMDEQGNIFLPSDIGATAMTEWEICELFGVIAPTVRAGIKALCKSGVLSVYD IKRIIRLSDRYSAEVYNLETIAALAFRVESFGAAKVRKALLERVMHVRKEKTTVFVPLFT GCIPEKHWQA >gi|226332268|gb|ACIC01000052.1| GENE 7 5114 - 6481 906 455 aa, chain + ## HITS:1 COG:no KEGG:PG1109 NR:ns ## KEGG: PG1109 # Name: not_defined # Def: mobilization protein # Organism: P.gingivalis # Pathway: not_defined # 1 455 1 455 455 380 48.0 1e-104 MGFVVLHMEKAHGSDSGTTAHIERFIIPKNADPTRTHLNRRLIEYPDGVKDRSAAVQRRL EEAGLTRKIGSNQVRAIRINVSGTHEDMKRIEEEGRLDEWCADNLKYFADTFGEENIVAA HLHRDEETPHIHVTLVPIVKGERKRRKREEQTKKRYRKKPTDTVRLCADDIMTRLKLKSY QDTYAEAMAKYGLQRGIDGSKARHKSTQQYYREIKRQTEELKAEVVDLQEQKETAREELR QAKKEIQTEKLKGAATTAAANIAESVGSLFGSNRVKTLERENTALHREVADHEETIEALQ DRIQTMQADHSREIREMQQKHGREIADKDTRHKQEISFLKTVIARAAAWFPYFREMLRIE NLCRLVGFDERQTATLVKGKTLEYAGELYSEEHGRKFTTEKAGFQVMKDTNNKTKLVLAI DRKPIAEWFKEQFEKLRQNIRQPIQQQRKSRGMKL >gi|226332268|gb|ACIC01000052.1| GENE 8 6817 - 7554 404 245 aa, chain + ## HITS:1 COG:no KEGG:BF1974 NR:ns ## KEGG: BF1974 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 6 245 24 262 262 151 34.0 2e-35 MDMTSLWGRLAKLQSFFQDGLNVDENSHLPEADLRKISLGNLYVYQQQGVLNTFETGVTP SVRKVILGEYFGITDRDSAIETLNWLSQAPSQTMFHYAYTAFLQGGGNISRKWLNENEEL KEHTDFRNDCLEKLETMEEKYPDIEQAGIVVSKEEMGKLGVLAWDAGRLNFISRLCLEQE YIVKEECMQCINAAYEMTKEVYTNWKDYAYSYVLGRTLSMGTTNMIGLAEDLLTDTKSPW TYIKW >gi|226332268|gb|ACIC01000052.1| GENE 9 7600 - 8058 314 152 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|253568533|ref|ZP_04845944.1| ## NR: gi|253568533|ref|ZP_04845944.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 152 1 152 152 283 100.0 3e-75 MKTFNYAPASPIKDNITMLAVSVGMVVVPLVYPFGIRIGSTRILGPTSTAIVFIIGGLVL LVITLNKVRLARALAANGGKIVVDADSVTYPIIKKGEKTDKIFKISDIKHLKYDDEEGEL EIFLTDDTQITLHAGFFESFERYEEFFALLKK >gi|226332268|gb|ACIC01000052.1| GENE 10 8164 - 8904 491 246 aa, chain + ## HITS:1 COG:no KEGG:BT_1755 NR:ns ## KEGG: BT_1755 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 226 51 267 408 83 28.0 7e-15 MNRILALYIFLLLCGTVSAQQIVEWSDLQTITDNSRRTVYYKKGRKQPLRGKYRIIRGLD EECVKFSDGMINGDYRRYRDGVLRESGIYAKGKRNSTFTEYYQDGVTPRKETPMQQGKID GTVKTYFRNGKIEAEKEYKQSVESGRERRFDSKTGEQIFESHYIDGKKDGEEWEIFEDAS TLRSRIIRHYRNGKLDGSYRVESTRDGKPYITIEGQYTDGEKSGQWIEHNYDNNTQTCTW HGEGGA >gi|226332268|gb|ACIC01000052.1| GENE 11 8974 - 9492 168 172 aa, chain + ## HITS:1 COG:no KEGG:BF1789 NR:ns ## KEGG: BF1789 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 172 23 196 196 272 79.0 4e-72 MPPTDEEMIRHFATHEAAFDKIRKIMAESSEGSFHYPPLSPCDILILDSAGQISYQPNQV QDTPVHGLSRSDRIQLDSLLSEIGCGLVLVDRREQETADSVYVSLFMLYYSHGIVDAGTS KSFVYDLELRSRRDIRITEHGDLNKIYRRTYNDTTLYKPVKEGWYIELDHSR >gi|226332268|gb|ACIC01000052.1| GENE 12 9943 - 12663 3109 906 aa, chain - ## HITS:1 COG:mlr7532 KEGG:ns NR:ns ## COG: mlr7532 COG0574 # Protein_GI_number: 13476256 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate synthase/pyruvate phosphate dikinase # Organism: Mesorhizobium loti # 4 897 3 877 892 1008 57.0 0 MDKKRVYTFGNGQAEGKADMKNLLGGKGANLAEMNLIGIPVPPGFTITTEVCTEYNTLGR DKVVELLKDEVVKAIARVEELMKSKFGDIENPLLVSVRSGARASMPGMMDTILNLGLNDE VVEGIIRKTGNARFAWDSYRRFVQMYGDVVLGMKPTNKEDIDPFEAIIEEVKESKGVKLD NELEVADLQELVKKFKAAVKEQTGKDFPTCAYEQLWGAICAVFDSWMNERAILYRKMEGI PDEWGTAVNVQAMVFGNMGDTSATGVCFSRDAGTGEDLFNGEYLINAQGEDVVAGIRTPQ QITKVGSQRWAVLAGVTEDIRAAKFPSMEEAMPEIYKELDALQTKLENHYKDMQDMEFTV 
QEGKLWFLQTRNGKRTGAAMVKIAMDLLRQGMIDEKTALMRVEPNKLDELLHPVFDKSAL KQAKVLTRGLPASPGAATGQIVFFADDAAEWHAAGKKVVMVRIETSPEDLAGMAVAEGIL TARGGMTSHAAVVARGMGKCCVSGAGALNIDYKTRTVEIDGIVLKEGDYISLNGSTGEVY NGKVETQAAELSGDFADLMKLSDKYTRLQVRTNADTPHDAEVARNFGAVGIGLCRTEHMF FEGEKIKAMREMILAENAEGRRKALAKILPYQQADFKGIFKAMVGCPVTVRLLDPPLHEF VPHDTKGQQEMADTMGVSLQYIQQRVESLCEHNPMLGHRGCRLGNTYPEITQMQTRAILG AALELKKEGVETHPEIMVPLTGILYEFKEQEKVIREEAAKLFEEVGDSIDFKVGTMIEIP RAALTADRIASSAEFFSFGTNDLTQMTFGYSRDDIASFLPVYLEKKILKVDPFQVLDQNG VGQLVRMATEKGRAIRPDLKCGICGEHGGEPSSVKFCHRVGLNYVSCSPFRVPIARLAAA QAAIEG >gi|226332268|gb|ACIC01000052.1| GENE 13 12863 - 14281 1279 472 aa, chain + ## HITS:1 COG:BH0687 KEGG:ns NR:ns ## COG: BH0687 COG2265 # Protein_GI_number: 15613250 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Bacillus halodurans # 13 472 15 457 458 287 36.0 3e-77 MARKRKELPLLEKVTITDVAAEGKAIAKVNDLVIFVPYVVPGDVVDLQIKRKKNKYAEAE AVKFHELSPVRAVPFCQHYGVCGGCKWQVLPYSEQIRYKQKQVEDNLRRIGKIELPEISP ILGSAKTEFYRNKLEFTFSNKRWLTNDEVRQDVKYDQMNAVGFHIPGAFDKVLAIEKCWL QDDISNRIRNAVRDYAYEHDYSFINLRTQEGMLRNMIIRTSSTGELMVIVICKITEDHEM ELFKQLLQFIADSFPEITSLLYIINNKCNDTINDLDVHVFKGKDHMFEEMEGLRFKVGPK SFYQTNSEQAYNLYKIAREFAGLTGKELVYDLYTGTGTIANFVSRQARQVIGIEYVPEAI EDAKVNAEINEIKNALFYAGDMKDMLTQEFINQHGRPDVIITDPPRAGMHQDVVDVILFA EPKRIVYVSCNPATQARDLQLLDGKYKVKAVQPVDMFPHTHHVENVVLLELR >gi|226332268|gb|ACIC01000052.1| GENE 14 14317 - 15237 212 306 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 63 297 50 282 285 86 30 3e-16 MEKRPRRTPAEKARAQYTNYAVKEPMELMEFLAAKMPDASRTKLKSLLSKRVVLVDNVIT TQFNFPLQPGMKVQISKEKGKKEFHNRLLKIVYEDAYIIVVEKMQGLLSVNTERQKERTA YTILNEYVQRSGRQHRVFIVHRLDRDTSGLMMFAKDEKTQRTLRDNWHEIVTDRRYVAVV EGTMEKDYDTVVSWLTDKTLYVSSSDYDDGGSKSITHYKTIKRANGYSLLELDLETGRKN QIRVHMQDLGHPIIGDGRYGGEESLNPIGRLALHAFKLCFYHPVTGDLMEFETPYPGEFK KLFLKK 
>gi|226332268|gb|ACIC01000052.1| GENE 15 15357 - 15758 411 133 aa, chain - ## HITS:1 COG:no KEGG:BT_0641 NR:ns ## KEGG: BT_0641 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 133 1 133 133 250 100.0 1e-65 MKNLKKYVGVLMLLTLLVGFTSCEDDETIFDHIVGRTWVGDLGFWVDNSPVESGITFKSN GFGVDEQWYYDWDRVAAVLDVRWWIEDGDLFLDYGNGYPLLILADVYVEGRYLTGRLYSN NIDQGTVTLEMSN >gi|226332268|gb|ACIC01000052.1| GENE 16 15870 - 17513 1045 547 aa, chain + ## HITS:1 COG:CAC1021 KEGG:ns NR:ns ## COG: CAC1021 COG1032 # Protein_GI_number: 15894308 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 2 425 3 424 548 169 29.0 2e-41 MKLLWLDLNSSYAHSSLALPALHAQIADDTTIEWCTVSATINENTGMVVNQIYHHQPDII AATNWLFNHEQLLHIVSRAKALLPHCCIVLGGPEFLGNNEAFLRKNKFVNGVFRGEGEEV FPIWLKVWNHPVNEWNQISGLCYLDESDQYQDNGIARVMNFSQLVSPEKSRFFNWSKPFV QLETTRGCFNTCAFCVSGGEKPVRTISLEAIRERLNVIHQHGIKNVRVLDRTFNYNNKRA KELLHLFREYAPDICFHLEIHPALLSEELKEELAILPNGLLHLEAGIQSLREPVLEKSRR IGKLSDALQGLKYLCSLKNMETHADLIAGLPLYHLSEIFEDVRTLAEYGAGEIQLESLKL LPGTEMRRRAEELGIQYSPLPPYEVLQTKEISVEELQTAHYLSRLLDGFYNTPTWRSLTR TLILDNPHFLHELLDHLIQTDVIDTPLSLEKRGLILYNFCKNHYPDYVTQVSIAWIEAGM SLKKIPAEKVKTKRQVPPDNWEIIYGSYRENLRLCFLPIDEAGHGYWFGFESEIQKIEPV FKAKRLS >gi|226332268|gb|ACIC01000052.1| GENE 17 18263 - 18496 163 77 aa, chain + ## HITS:1 COG:no KEGG:BT_0639 NR:ns ## KEGG: BT_0639 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 77 132 208 208 152 100.0 4e-36 MDYKLSKKWFLSIYGKQNLDTRRYRGLSSEAIPTTLGSDIVLKVGKGWKLKTGVQYQYNT IQKRWEWVPQICISYEW >gi|226332268|gb|ACIC01000052.1| GENE 18 18578 - 19432 849 284 aa, chain + ## HITS:1 COG:alr4150 KEGG:ns NR:ns ## COG: alr4150 COG0024 # Protein_GI_number: 17231642 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Nostoc sp. 
PCC 7120 # 38 283 7 251 256 264 50.0 1e-70 MKNFIKGFRFTPSNYPAEVEAKIQKYRKQGYKLPPRKVLRTPEQLEGIRESAKINTALLD YISENIREGMSTEEIDVMVYDFTTKHGAIPAPLNYEGFPKSVCTSINDVVCHGIPSKTEI LQSGDIINVDISTIYKGYFSDASRMFMIGDVSPEMRKLVQVTKECMEIGIAAAQPWKQLG DVGAAIQEHAEKNGFNVVRDLCGHGVGMQFHEAPDVEHFGRRGTGMMIVPGMTFTIEPMI NMGTYEVFVDEADGWTVCTDDGLPSAQWENMILITETGNEILTY >gi|226332268|gb|ACIC01000052.1| GENE 19 19432 - 20661 1140 409 aa, chain + ## HITS:1 COG:XF0413 KEGG:ns NR:ns ## COG: XF0413 COG1322 # Protein_GI_number: 15837015 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 33 400 72 438 456 301 47.0 2e-81 MELILLIVIAALAVISIVLSLAKGNNHAQAEQLQAALRQQMQENREELNRSIRELRMEMT QTLNQNMQQLQDVLHKNMMTNGELQRQKFDTMARQQEVLIKSTEKRLDDMRLMVEEKLQK TLNERIGQSFEIVRSQLENVQKGLGEMKSLAQDVGGLKKVLSNVKMRGTFGEVQLGALLE QMMSPEQYDANVKTKKSGTEFVEFAIKLPGKDDANSTVYLPIDAKFPKDVYEQYYDAFEA GDTALMESSGKQLETTIKKMAKDIHDKYVDPPFTTDFAIMFLPFESIYAEVIRRTSLVET LQKEYKIVVTGPTTLGAILNSLQMGFRTLAIQKRTGEVWTVLGAVKTEFSKFGGLLEKVQ KNLQSAGDQLEEVMGKRTRAIERKLRQVEQLPHEESQRILPIADDGDDE >gi|226332268|gb|ACIC01000052.1| GENE 20 20776 - 22419 825 547 aa, chain - ## HITS:1 COG:no KEGG:BT_0636 NR:ns ## KEGG: BT_0636 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 547 1 548 548 1063 98.0 0 MGAFLFYLLKSGCCLVVFYIFFKLLMSRSTFFRFNRITLLVGLLGCTLLPLIELTTTEET FLHTPLYAIHEILQSTESVILNPEQMEDPILISEKNPEINSLNWIPVTLAFIYGVGALVT LIWLSLSTCRLIQLIRTSEKKQFGNYVLVIPQQPTASFSWGKYIVISAADYSQQSEEILL HETMHLRNHHTLDLLFMQIFLLVYWFNPVVWLLKRELQEVHEFEADNGVINTGIDATKYQ LLLVKKAVGTRLYSMANGFNHSKLKKRITMMLKERTNRWARLKLLLAVPVMAGALYVFAQ PEVKEVPRQIQSELQQKEADDYSSLMFFFKEGGERYSKLVNGAYPPSKIKARQVHSLFVN KQNRVMFDNDVCSVDELKSTIVKNLMKSWEESKRKEYQVISFQVDRGSEIAALTTILKEV KGAFEQIRADLSITLTDKSEEALDRLFPVLLSEGAARNYGLKELSMEEKISGIVVTIHTS EGKEVMKDFTLTELKQKVTAARAKQADPESLVIGLKIEKGCKMGYVTDTKQVLRECSALK INYSTDN >gi|226332268|gb|ACIC01000052.1| GENE 21 22490 - 22867 389 125 aa, chain - ## HITS:1 
COG:Cgl0019 KEGG:ns NR:ns ## COG: Cgl0019 COG3682 # Protein_GI_number: 19551269 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Corynebacterium glutamicum # 11 125 14 123 123 62 31.0 2e-10 MKQMKRLTVKEEEIMRIFWEHGPMFVRELLSFYDEPKPHYNTVSTLVRGLEEKGFVGYKA YGNTYQYYALVSEKEYKSSALKEVVSQYYNNSYINVVSSFIEEEGMSVDELKSLIEYIEQ SKKKK >gi|226332268|gb|ACIC01000052.1| GENE 22 23077 - 24396 825 439 aa, chain - ## HITS:1 COG:jhp1447 KEGG:ns NR:ns ## COG: jhp1447 COG3004 # Protein_GI_number: 15612512 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/H+ antiporter # Organism: Helicobacter pylori J99 # 6 429 13 427 438 264 40.0 2e-70 MTILRTMRNFSSMNIAASILLFVAAIAAAIIANSPVAPVYQEFLLHELHLQIGNFNLLSH GGENLRMIEFINDGLMTIFFLLVGLEIKRELLVGELSSFRKAALPFIAACGGMLFPVIVY MSICPPGSAGSQGLAIPMATDIAFSLGVLSLLGSRVPLSLKIFLTAFAVVDDIGGILVIA LFYSSHVSYGYILIAALLYVLLYFIGKRGTTNKIFFLVIGVVIWYLFLQSGIHSTISGVI LAFVIPAKPRLYVGKYIEHIRHTIAGFPVVESGSIVLTNEQIAKLKEVESASDRVISPLQ SLEDNLHGAVNYLILPLFAFVNAGVVFSGGGELVGSVGMAVAAGLLFGKFAGIYFFTWLA IKIKLTPMPPGMTWKNLSGIALLGGIGFTVSLFIANLSFGANYPVLLNQAKFGVLSGTIL SGLLGYIVLRIVLPVRKRK >gi|226332268|gb|ACIC01000052.1| GENE 23 24442 - 25620 774 392 aa, chain - ## HITS:1 COG:no KEGG:BT_0633 NR:ns ## KEGG: BT_0633 # Name: not_defined # Def: putative Na+/H+ exchange protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 392 1 381 381 681 99.0 0 MRKVLSFSGFLMLGLVISQFLPMIAGEGYGTVKVLSNILLYVCLSFIMINVGREFVLDKV RWKSYAEDYFIAMATAALPWFMIAIYYVFVLLPPDYWNSWEAWKENLLLSRFAAPTSAGI LFTMLAAIGLKSSWIYKKIQVLAIFDDLDTILLMIPLQIMMIGLRWQLIIVVVIVFMLLS IGWKQLNKYNWRQDWKAILFYSVIIFLATQILYLGSKELYGEEGSIHIEVLLPAFVLGMI MKHKEHDTPVERKVSTGISFLFMFLVGMSMPHFIGVNFAETHTGTHSVTGSQEMMSWGMI LFHVVIVSFLSNIGKLCPMFFYRDRKLSERLALSIGMFTRGEVGAGIIFIALGYNLGGPA LVISVLTLVLNLILTGIFVLWVKNLALRSYND >gi|226332268|gb|ACIC01000052.1| GENE 24 25762 - 27543 2043 593 aa, chain - ## HITS:1 COG:BS_lepA KEGG:ns NR:ns ## COG: BS_lepA COG0481 # Protein_GI_number: 16079605 # Func_class: M Cell 
wall/membrane/envelope biogenesis # Function: Membrane GTPase LepA # Organism: Bacillus subtilis # 4 593 14 606 612 732 58.0 0 MKNIRNFCIIAHIDHGKSTLADRLLEFTNTIQVTNGQMLDDMDLEKERGITIKSHAIQME YNYKGEKYILNLIDTPGHVDFSYEVSRSIAACEGALLIVDASQGVQAQTISNLYMAIEHD LEIIPIINKCDMASAMPEEVEDEIVELLGCKRDEIIRASGKTGMGVEEILAAVIERIPHP EGDEEAPLQALIFDSVFNSFRGIIAYFKIENGVIRKGDKVKFFNTGKEYDADEVGVLKME LVPRNELRTGDVGYIISGIKTSKEVKVGDTITHVARPCDKAIAGFEEVKPMVFAGVYPIE AEDFEDLRASLEKLQLNDASLTFQPESSLALGFGFRCGFLGLLHMEIVQERLDREFDMNV ITTVPNVSYNIYDKQGNMKEVHNPGGMPDPTLIDHIEEPYIKASIITTTDYIGPIMTLCL GKRGELLKQEYISGNRVEIYYNMPLGEIVIDFYDRLKSISKGYASFDYHPNGFRPSKLVK LDIMLNGEPVDALSTLIHFDNAYDMGRRMCEKLKELIPRQQFEIAIQAAIGAKIIARETI KAVRKDVTAKCYGGDVSRKRKLLEKQKKGKKRMKQIGNVEVPQKAFLAVLKLD >gi|226332268|gb|ACIC01000052.1| GENE 25 27673 - 27873 333 66 aa, chain - ## HITS:1 COG:no KEGG:BF2557 NR:ns ## KEGG: BF2557 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 65 1 65 66 92 83.0 5e-18 MLKEKAGVIAGNIWNALNETEGMTAKQLKKATKLVDKDLFLGLGWLLREDKVSAEEVEGE LFIKLI >gi|226332268|gb|ACIC01000052.1| GENE 26 28030 - 28488 570 152 aa, chain - ## HITS:1 COG:no KEGG:BT_0631 NR:ns ## KEGG: BT_0631 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 152 1 152 152 285 100.0 4e-76 MEDRIQKAVELFKSGYNCSQSVVAAFADMYGFTQEQAVRMSASFGGGIGRMRETCGAACG LFLIAGLETGATEATDREGKAANYAVVQELAAEFKKRNGSLICGELLGLKKKEPISTVPE ERNTQYYSKRPCAKMVEEAARIWVEYLEKHPK >gi|226332268|gb|ACIC01000052.1| GENE 27 28503 - 29264 776 253 aa, chain - ## HITS:1 COG:MA2077 KEGG:ns NR:ns ## COG: MA2077 COG0708 # Protein_GI_number: 20090923 # Func_class: L Replication, recombination and repair # Function: Exonuclease III # Organism: Methanosarcina acetivorans str.C2A # 3 253 7 257 260 251 45.0 1e-66 MKIITYNVNGLRAAASKGLPEWLVQENPDILCLQETKLQPDQYPAEVFEALGYKAYLYSA QKKGYSGVAILTKQEPDHVEYGMGIEAYDNEGRFIRADFGDLSVVSVYHPSGTSGDERQA FKMVWLEDFQKYVTELRKTRPNLILCGDYNICHEPIDIHDPVRNATNSGFLPEEREWMTR 
FLSAGFIDSFRLLYPEKQEYTWWSYRFNSRAKNKGWRIDYCMTSEPVRPMLKSASILNDA VHSDHCPMALEIE >gi|226332268|gb|ACIC01000052.1| GENE 28 29295 - 30548 1097 417 aa, chain - ## HITS:1 COG:CAC0628 KEGG:ns NR:ns ## COG: CAC0628 COG1914 # Protein_GI_number: 15893916 # Func_class: P Inorganic ion transport and metabolism # Function: Mn2+ and Fe2+ transporters of the NRAMP family # Organism: Clostridium acetobutylicum # 13 417 11 415 417 459 63.0 1e-129 MKNIFQDLKRKDHKRYLGGLDVFKYIGPGLLVTVGFIDPGNWASNFAAGSEFGYSLLWVV TLSTIMLIVLQHNVAHLGIVTGLCLSEAATKYTPKWVSRPILGTAVLASISTSLAEILGG AIALEMLFDMPIMWGAVLTTLFVSIMLFTNSYKKIERSIIAFVSVIGLSFIYELFLVEID WPAATMGWVTPAFPKGSMLIIMSVLGAVVMPHNLFLHSEVIQSHEYNKKDDASIKKVLKY ELFDTLFSMIVGWAINSAMILLAAATFFKSGIQVEELQQAKSLLEPLLGSNAAIVFALAL LMAGISSTITSGMAAGSIFAGIFGESYHIKDSHSQVGVILSLGIALLLIFFIGDPFKGLL ISQMILSIQLPFTVFLQVGLTSSRKVMGNYVNSRWSTFVLYAIAVIVSVLNIMLLFS >gi|226332268|gb|ACIC01000052.1| GENE 29 30713 - 30958 187 81 aa, chain - ## HITS:1 COG:no KEGG:BT_0628 NR:ns ## KEGG: BT_0628 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 81 1 81 81 160 100.0 1e-38 MEYVYRTQGTCSTNIELNVEDGVVKEVAFWGGCNGNLQGLSRLVKGMKVEEVIKKLEGVR CSGRPTSCPDQLCHALHEMGY >gi|226332268|gb|ACIC01000052.1| GENE 30 30966 - 31709 1012 247 aa, chain - ## HITS:1 COG:Cj1172c KEGG:ns NR:ns ## COG: Cj1172c COG0217 # Protein_GI_number: 15792496 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Campylobacter jejuni # 1 238 1 234 235 175 45.0 6e-44 MGRAFEYRKAAKLKRWGHMAKTFTRLGKQIAIAVKAGGPEPENNPTLRSVIATCKRENMP KDNIERAIKNAMGKDQSDYKSMTYEGYGPHGIAVFVDTLTDNTTRTVADVRSVFNKFGGN LGTMGSLAFLFDHKCMFTFKIKDGMDMEELILDLIDYDVEDEYEQDDEEGTITIYGDPKS YAAIQKHLEECGFEDVGGDFTYIPNDLKEVTPEQRETLDKMIERLEEFDDVQTVYTNMKP EEGGNEE >gi|226332268|gb|ACIC01000052.1| GENE 31 31745 - 34207 2848 820 aa, chain - ## HITS:1 COG:FN2122_2 KEGG:ns NR:ns ## COG: FN2122_2 COG0072 # Protein_GI_number: 19705412 # Func_class: J Translation, ribosomal structure and biogenesis # 
Function: Phenylalanyl-tRNA synthetase beta subunit # Organism: Fusobacterium nucleatum # 154 820 3 652 653 379 33.0 1e-104 MNISYNWLKEYVNFDLTPDETAAALTSIGLETGGVEEVQTIKGGLEGLVIGEVLTCEEHP NSDHLHITTVNLGDGEPVQIVCGAPNVAAGQKVVVATLGTKLYDGDECFTIKKSKIRGVE STGMICAEDEIGIGTDHAGIIVLPAEAVPGTLAKDYYNIKSDYVLEVDITPNRADACSHY GVARDLYAYLIQNGKQATLQRPSVDGFKVENHDLDIEVVVENSEACPHYAGVTVKGVTVK ESPEWLQNKLRLIGVRPINNVVDITNYIVHAFGQPLHCFDAGKIKGNEVIVKTLPEGTPF VTLDEVERKLNERDLMICNKEEAMCIAGVFGGLDSGSTEATTDVFIESAYFHPTWVRKTA RRHGLNTDASFRFERGIDPNGVIYCLKLAALMVKELAGGTISSEIKDVCVAVPQDFMVEL SYEKVNSLIGKVIPVETIKSIVTSLEMKIMNETVDGLTLAVPPYRVDVQRDCDVIEDILR IYGYNNVEIPTTLNSSLTTKGEHDKSNKLQNLVAEQLVGCGFNEILNNSLTRAAYYDGLE NYSSNHLVMLLNPLSADLNCMRQTLLFGGLESIAHNANRKNADLKFFEFGNCYYFDADKK NPEKVLATYSEDYHLGLWVTGKKVANSWAHPDENSSVYELKAYVENILKRLGLDLHNLVV GNLTDDIFATALSVHTKGGKRLASFGVVTKKLLKAFDIDNEVYYADLNWKELMKAIRSVK ISYKEISKFPAVKRDLALLLDKNVQFAEIEKIAYDTEKKLLKEVELFDVYEGKNIEAGKK SYAVSFLLQDETQTLNDKMIDKIMSKLVKNLEDKLNAKLR >gi|226332268|gb|ACIC01000052.1| GENE 32 34347 - 34541 148 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVGDKVAKNKGVFFVIFATYPYQNTTRLYSIKNNRPLDWSKSSGYFHLTKRSFLFNPNER FLER >gi|226332268|gb|ACIC01000052.1| GENE 33 34530 - 36083 1627 517 aa, chain - ## HITS:1 COG:lin0047 KEGG:ns NR:ns ## COG: lin0047 COG0305 # Protein_GI_number: 16799126 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Listeria innocua # 24 466 8 442 450 372 45.0 1e-103 MAEQRRNSRNTKSTKVQPVNDYGRIQPQAPELEEAVLGALMIEKDAYSLVSEILRPESFY EHRHQLIYAAITDLAVNQKPVDILTVKEQLSKRGELEEVGGPFYITQLSSKVASSAHIEY HARIIAQKSLARELITFTSNIQSKAFDETLDVDDLMQEAEGKLFEISQQNMKKDYTQINP VIDEAYKLIQKAAARTDGLSGLESGFTKLDKMTSGWQNSDLIIIAARPAMGKTAFVLSMA KNIAVDYRNPVALFSLEMSNVQLVNRLITNVCEIPSEKIKSGQLASYEWQQLDYKLKDLL DAPLYVDDTPSLSVFELRTKARRLVREHGVRIIIIDYLQLMNASGMAFGSRQEEVSTISR SLKGLAKELNIPIIALSQLNRGVESREGIDGKRPQLSDLRESGAIEQDADMVCFIHRPEY YKIYQDDRGNDLRGMAEIVIAKHRNGAVGEVLLRFKGEFTRFSNPEDDMVIPMPGEPAGA MLGSKLNAGAMPPPPPEPDFVPQGNNPFGATEGPLPF 
>gi|226332268|gb|ACIC01000052.1| GENE 34 36282 - 37106 585 274 aa, chain + ## HITS:1 COG:NMA1092 KEGG:ns NR:ns ## COG: NMA1092 COG1947 # Protein_GI_number: 15794040 # Func_class: I Lipid transport and metabolism # Function: 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase # Organism: Neisseria meningitidis Z2491 # 4 249 10 245 281 135 38.0 6e-32 MITFPNAKINLGLSITEKRPDGYHNLETVFYPVALEDALEIRTSPEADKKFSLHQYGMEI SGNPEDNLVVKAYLLLDKEFHLCPIEIHLYKHIPSGAGLGGGSSDAAFMLKLLNEHFQLN LSEDQLEIYAATLGADCAFFIRNAPTFAEGIGNIFSPIPLSLKGYQILIIKPDVFVSTRE AFANIHPHHPEYSIKEAIKRPVNEWKEILINDFEDSVFPQHPVIGEIKAELYRQGAVYAS MSGSGSSVYGLFEPEGTLPETDWGTNVFCFKGRL Prediction of potential genes in microbial genomes Time: Thu May 12 00:37:56 2011 Seq name: gi|226332267|gb|ACIC01000053.1| Bacteroides sp. 1_1_6 cont1.53, whole genome shotgun sequence Length of sequence - 3401 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 49 - 80 1.1 1 1 Tu 1 . - CDS 116 - 1150 986 ## COG1087 UDP-glucose 4-epimerase - Prom 1284 - 1343 4.3 - Term 1334 - 1370 2.3 2 2 Op 1 3/0.000 - CDS 1409 - 1981 691 ## COG4657 Predicted NADH:ubiquinone oxidoreductase, subunit RnfA 3 2 Op 2 13/0.000 - CDS 2004 - 2588 727 ## COG4660 Predicted NADH:ubiquinone oxidoreductase, subunit RnfE 4 2 Op 3 . 
- CDS 2606 - 3289 893 ## COG4659 Predicted NADH:ubiquinone oxidoreductase, subunit RnfG Predicted protein(s) >gi|226332267|gb|ACIC01000053.1| GENE 1 116 - 1150 986 344 aa, chain - ## HITS:1 COG:SP1828 KEGG:ns NR:ns ## COG: SP1828 COG1087 # Protein_GI_number: 15901657 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Streptococcus pneumoniae TIGR4 # 5 337 3 330 336 338 51.0 8e-93 MKERILVTGGTGYIGSHTVVELQNSGYEVIIIDNLSNSSADVVDNIEKVSGIRPAFEKLD CLDFAGLDAVFAKYKGIKAIIHFAASKAVGESVEKPLLYYRNNLVSLINLLELMPKHGVE GIVFSSSCTVYGQPDELPVTEKAPIKKAESPYGNTKQINEEIIRDTVASGAPINAILLRY FNPIGAHPTALLGELPNGVPQNLIPYLTQTAMGIREKLSVFGDDYDTPDGSCIRDFINVV DLAKAHVIAIRRILEQKQKEKVEVFNIGTGRGVSVLELINGFEKATGVKLNYQIVGRRAG DIEKVWANPDYANQELGWKAVETLEDTLRSAWNWQLKLRERGIQ >gi|226332267|gb|ACIC01000053.1| GENE 2 1409 - 1981 691 190 aa, chain - ## HITS:1 COG:FN1592 KEGG:ns NR:ns ## COG: FN1592 COG4657 # Protein_GI_number: 19704913 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfA # Organism: Fusobacterium nucleatum # 18 189 21 192 194 174 60.0 1e-43 MEYILIFISAIFVNNIVLSQFLGICPFLGVSKKVETAMGMSAAVAFVLTIATIVTFLIQK FVLDAFGLGYLQTITFILVIAGLVQMVEIILKKVSPSLYQALGVFLPLITTNCCILGVAI LVIQKDFDLLTGVVYAFSTALGFGLALVLFAGLREQMTLVKIPKGMQGTPIALITAGLLA MAFMGFSGVV >gi|226332267|gb|ACIC01000053.1| GENE 3 2004 - 2588 727 194 aa, chain - ## HITS:1 COG:FN1593 KEGG:ns NR:ns ## COG: FN1593 COG4660 # Protein_GI_number: 19704914 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfE # Organism: Fusobacterium nucleatum # 1 189 1 190 205 192 60.0 4e-49 MNNFKVMMNGIIKENPIFVLLLGMCPTLGTTSSAINGMGMGLATMFVLICSNVVISSIKN LIPDMVRIPSFIVVIASFVTLLQMIMQAYVPDLYATLGLFIPLIVVNCIVLGRAEAFAAK NNPLASMFDGIGMGFGFTIALTLLGAVREFLGTGKIFNLTILPEEYGMLIFVLAPGAFIA LGYLIAIINSLKKA >gi|226332267|gb|ACIC01000053.1| GENE 4 2606 - 3289 893 227 aa, chain - ## HITS:1 COG:STM1455 KEGG:ns NR:ns ## COG: STM1455 
COG4659 # Protein_GI_number: 16764803 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfG # Organism: Salmonella typhimurium LT2 # 3 191 10 187 206 67 31.0 2e-11 MLLVLTGVTAVSVALLAYVNELTKGPIAEANAKTLNEALKKVLPEYTNNPVAESDTVFSE KDGKKIVDFIIYPAKNGEKWVGSAVEAKSMGFGGELKVLVGFDAEGKIYNYSLLAHAETP GLGSKADKWFGAYDPAKNEKPVSHEESAKSILGMNPGETPLAVTKDMNGQVDAITASTIT SRAFLNAVNAAYQAYKGGEVDTNSGASQKNEPAAAPEGEASANDTNN Prediction of potential genes in microbial genomes Time: Thu May 12 00:37:58 2011 Seq name: gi|226332266|gb|ACIC01000054.1| Bacteroides sp. 1_1_6 cont1.54, whole genome shotgun sequence Length of sequence - 4731 bp Number of predicted genes - 7, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 12/0.000 - CDS 3 - 825 886 ## COG4658 Predicted NADH:ubiquinone oxidoreductase, subunit RnfD 2 1 Op 2 10/0.000 - CDS 832 - 2169 1233 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC 3 1 Op 3 . - CDS 2194 - 3075 995 ## COG2878 Predicted NADH:ubiquinone oxidoreductase, subunit RnfB 4 1 Op 4 . - CDS 3082 - 3507 268 ## BT_0616 hypothetical protein - Prom 3645 - 3704 4.8 - Term 3658 - 3698 5.0 5 2 Op 1 . - CDS 3738 - 3944 122 ## 6 2 Op 2 . - CDS 3865 - 4386 588 ## BT_0615 hypothetical protein - Prom 4411 - 4470 5.9 - Term 4463 - 4509 8.6 7 3 Tu 1 . 
- CDS 4542 - 4730 137 ## Predicted protein(s) >gi|226332266|gb|ACIC01000054.1| GENE 1 3 - 825 886 274 aa, chain - ## HITS:1 COG:TM0245 KEGG:ns NR:ns ## COG: TM0245 COG4658 # Protein_GI_number: 15643017 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfD # Organism: Thermotoga maritima # 4 274 2 264 318 201 47.0 1e-51 MENKLIVSLSPHVHGGDSVQKNMYGVLIALIPAFLVSLYFFGLGALIVTATSVAACLFFE WAIVKFLMKKPATTICDGSAVITGVLLAFNLPSNLPIWIIILGALFAIGVGKMSFGGLGC NPFNPALAGRVFLLLSFPVQMTTWPVVGQLTSYMDATTGATPLALMKQAIHGDASALSQI PDALTLFIGQNGGCIGEVSALALLLGLVYMLWKKIITWHIPVSIIVTVFVFAGIMHMVDP EKYVSPVLQLLSGGLMLGAVFMATDYVTSPMSKK >gi|226332266|gb|ACIC01000054.1| GENE 2 832 - 2169 1233 445 aa, chain - ## HITS:1 COG:TM0244 KEGG:ns NR:ns ## COG: TM0244 COG4656 # Protein_GI_number: 15643016 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Thermotoga maritima # 8 442 22 451 451 371 47.0 1e-102 MLKTFSIGGVHPHENKLSAHQPIITAEVPAKAVILLGQHIGAPAKPIVAKGDMVKVGTKI AEPAGFVSAAIHSSVSGKVAKIDTVVDASGYAKPAIFIDVEGDEWEETIDRSKTLVKECE LSSEEIVKKIADAGIVGLGGACFPTQVKLCPPPSFKAECVIINAVECEPYLTADHQLMLE HAEEVMVGVSILMKAVKVNKAFIGIENNKPDAIELMTKVASSYAGIEVVPLKVKYPQGGE KQLIDAITKRQVASGALPISTGAVVQNVGTAFAVYEAVQKNKPLFERVITVTGKSVTKPS NFLARIGTPMKQLIDACGGLPEDTGKVIGGGPMMGKALVNIEVPTAKGSSGILIMNQKEA KRGEAQTCIRCAKCVSACPMGLEPYLLGALSENGDFETMEKERIMDCIECGSCQFTCPAN RPLLDYCRLGKGKVGTMIRARQAKK >gi|226332266|gb|ACIC01000054.1| GENE 3 2194 - 3075 995 293 aa, chain - ## HITS:1 COG:MA0664 KEGG:ns NR:ns ## COG: MA0664 COG2878 # Protein_GI_number: 20089551 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfB # Organism: Methanosarcina acetivorans str.C2A # 1 265 1 261 264 175 41.0 1e-43 MNLILIAVISLGAIALVLAAILYVASKKFAVYEDPRIAQVGEVLPQANCGGCGYPGCSGF ADACVKAGSLDGKFCPVGGQPVMAQIADILGLAATEAEPMVAVVRCNGSCANRPRINQYD GAKSCAIAASLYGGETGCSYGCLGCGDCVAACQFDAIHMNPETGLPEVDEAKCTACGACV 
KACPKAIIEIRPQGKKSRRVYISCVNKDKGAVARKACTVSCIGCGKCVKTCPFEAITLEN NLAYIDPNKCKSCRKCVEVCPQNTIIELNFPPRKPKAEEAPKTVAAEAPKVTE >gi|226332266|gb|ACIC01000054.1| GENE 4 3082 - 3507 268 141 aa, chain - ## HITS:1 COG:no KEGG:BT_0616 NR:ns ## KEGG: BT_0616 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 141 1 141 141 263 100.0 2e-69 MTNNTIKHLGIVENIQGSHLSVRIVQTSACAACSAKGHCSSADSKDKIIDITDVTAASYQ VGERVMVIGETSMGMMAVTLAFIIPFVLLIVSLFAWMALIGNELYAALLSLAVLIPYYFV LWLNKTRMKQHFSFTIKPINN >gi|226332266|gb|ACIC01000054.1| GENE 5 3738 - 3944 122 68 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIRMMVKMAVVEMKEEETRKPRIRLYNLKSLKRKEEKSMKKTWSVVLKVIIAVAGAIAGV LGVQAASL >gi|226332266|gb|ACIC01000054.1| GENE 6 3865 - 4386 588 173 aa, chain - ## HITS:1 COG:no KEGG:BT_0615 NR:ns ## KEGG: BT_0615 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 173 1 172 172 283 95.0 1e-75 MSVIYKVVTRPSDPRVPNSPKKYYPHLITLGQSVNLKYIAQKMQGYSSLSIGDIKSVIQN FVEKMKEQLLEGKSVNIEGLGVFMLTACSKGADAAKDVNAKSVDSVRIFFQANKELRITK TATRADEKLDLISLDEYLKKLNASVTPNDPDDGENGGGGNEGGGDEEAPDPTV >gi|226332266|gb|ACIC01000054.1| GENE 7 4542 - 4730 137 62 aa, chain - ## HITS:0 COG:no KEGG:no NR:no EYTLINELSFEKKLPNLCTVINGVDLKKRKYGYYYGYGKYGKHYGYGKRYGYGYGYGQDT KR Prediction of potential genes in microbial genomes Time: Thu May 12 00:38:19 2011 Seq name: gi|226332265|gb|ACIC01000055.1| Bacteroides sp. 1_1_6 cont1.55, whole genome shotgun sequence Length of sequence - 23576 bp Number of predicted genes - 23, with homology - 23 Number of transcription units - 4, operones - 3 average op.length - 7.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 1557 1327 ## BT_0614 putative tyrosine-protein kinase in cps region 2 1 Op 2 . - CDS 1569 - 2363 822 ## COG1596 Periplasmic protein involved in polysaccharide export 3 1 Op 3 . 
- CDS 2407 - 2853 283 ## BT_0397 hypothetical protein 4 1 Op 4 1/0.000 - CDS 2859 - 3845 608 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 5 1 Op 5 . - CDS 3927 - 4685 644 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 6 2 Op 1 . - CDS 4797 - 6002 562 ## COG1035 Coenzyme F420-reducing hydrogenase, beta subunit 7 2 Op 2 . - CDS 5992 - 7128 592 ## Amet_0211 hypothetical protein 8 2 Op 3 . - CDS 7137 - 7964 202 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 9 2 Op 4 . - CDS 8012 - 9235 540 ## Slip_0167 glycosyl transferase group 1 10 2 Op 5 . - CDS 9243 - 10388 402 ## gi|253568579|ref|ZP_04845990.1| predicted protein 11 2 Op 6 . - CDS 10397 - 11899 294 ## BDI_1846 putative transmembrane protein 12 2 Op 7 . - CDS 11899 - 12774 552 ## BDI_0439 putative glycosyltransferase 13 2 Op 8 . - CDS 12761 - 13894 192 ## COG2843 Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) 14 2 Op 9 . - CDS 13898 - 14506 315 ## COG0726 Predicted xylanase/chitin deacetylase 15 2 Op 10 . - CDS 14605 - 15699 311 ## BDI_3823 putative glycosyltransferase - Prom 15719 - 15778 4.1 16 3 Tu 1 . - CDS 15780 - 16694 267 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 17 4 Op 1 3/0.000 - CDS 17171 - 18361 557 ## COG0381 UDP-N-acetylglucosamine 2-epimerase 18 4 Op 2 3/0.000 - CDS 18364 - 19569 839 ## COG0451 Nucleoside-diphosphate-sugar epimerases 19 4 Op 3 3/0.000 - CDS 19577 - 20626 862 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases 20 4 Op 4 5/0.000 - CDS 20661 - 20957 167 ## COG0451 Nucleoside-diphosphate-sugar epimerases 21 4 Op 5 8/0.000 - CDS 20989 - 21285 310 ## COG0451 Nucleoside-diphosphate-sugar epimerases 22 4 Op 6 . - CDS 21290 - 22606 1198 ## COG1004 Predicted UDP-glucose 6-dehydrogenase 23 4 Op 7 . 
- CDS 22641 - 23567 869 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases Predicted protein(s) >gi|226332265|gb|ACIC01000055.1| GENE 1 3 - 1557 1327 518 aa, chain - ## HITS:1 COG:no KEGG:BT_0614 NR:ns ## KEGG: BT_0614 # Name: not_defined # Def: putative tyrosine-protein kinase in cps region # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 518 1 518 812 960 96.0 0 MKEENQNVKEREIAEDQIDFRALLFKYIIHWPWFIGAVLLCFVGAWFYLHWATPIYNISA TVLIKDEKKGGGAGLSSELEDMGLSGLMTSSKNIDNELEVLRSKTLVKEVVNQLGLYITY KDEDEFPAKGLYKTSPVQVSLTPQEAEKLKTSMVVEMTLQPKGSMDVNVTVGEKGYQKHF EKLPAIFPTDEGTLAFFQEVDSVTLQDGTKVPRIEKNVRHITATINKPMRVAKGYCNSLS IAPTSKTTSVAVISLKNSSLQRGQDFINQLLEMYNRNTNNDKNEIAQKTAEFIDERIGII SKELGSTEANLETFKRDAGITDLTSEAQIALAGNAEYEKKSVENRTQISLVNDLRKYLRG NEYEVLPSNVGLQDAALIGAIERYNEMLMERKRLLRTSTENNPTIVNLDTSIRAMKANVQ ATLEGTLQGLMITKESLDREASRYSRRISNAPGQERAYVSIARQQEIKAGLYLMLLQKRE ENAIALAATANNAKIIDEAIADDIPVSPKRSMIYLIAL >gi|226332265|gb|ACIC01000055.1| GENE 2 1569 - 2363 822 264 aa, chain - ## HITS:1 COG:PM1016 KEGG:ns NR:ns ## COG: PM1016 COG1596 # Protein_GI_number: 15602881 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Pasteurella multocida # 14 231 51 253 387 64 28.0 2e-10 MRRMKQLGYYVLAAFLLTACQSYKKVPYLQDAEVVLYSTQDAQLYDAKIMPKDLLTIVVS CTSPELAAPFNLTVATQSNAALSYTTTQPVLQQYLVDNDGNINFPVLGELHVGGLTKKAT EQMIVDKLKPYITETPIVTVRMMNYKISVIGEVARPGTFTISNEKVNILEALAMAGDMTV YGLRDDVKLIRENANGKQEIIPLDLNKAETILSPYYYLQQNDIIYVTPNKAKARNSDIGT STSLWFSATSILVSIASLLFNILK >gi|226332265|gb|ACIC01000055.1| GENE 3 2407 - 2853 283 148 aa, chain - ## HITS:1 COG:no KEGG:BT_0397 NR:ns ## KEGG: BT_0397 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 148 1 148 148 259 86.0 2e-68 MEFDKDFLGKLFEQAVTNPRLRQNFDLRTSPADTSQRMLNALLPGTKVPIHRHENTTETV ICLVGKLEEIIYEEVNEYIYEATSCCDDVIRQKKIKEISRQILSPAEGKFGIQIPAGTWH TINVIEPSVIFEAKDGAYNESGNTTDKE >gi|226332265|gb|ACIC01000055.1| GENE 4 2859 - 3845 608 
328 aa, chain - ## HITS:1 COG:PA3145 KEGG:ns NR:ns ## COG: PA3145 COG0472 # Protein_GI_number: 15598341 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Pseudomonas aeruginosa # 4 283 9 311 339 76 29.0 8e-14 MIYLIILVLLFIAELIYFCIANQHNIIDKPNERSSHSTIVLRGGGIIFLIGAWIWSLFFG FQYPWFLAGLTLVAGVSFIDDIHSLPDSVRLVAQFAAAAMAFYQLGILHWSMWWIILLAL IVYVGATNVINFMDGINGITAGYSLAVLIPLVLLNRDMDFVEQSLIILTILASLVFCIFN FRPKGKAKCFAGDVGSIGIAYIMLFLLGNVIIKTGDITWLIFLLVYGVDGCLTIAHRIML HEDLGEAHRKHAYQIMANELKIGHVKVASLYMVVQFVISLGFIYFCPNTIFAHWLYLIGT LAVVAVAYILFMKKYYYLHEEYLTSLKK >gi|226332265|gb|ACIC01000055.1| GENE 5 3927 - 4685 644 252 aa, chain - ## HITS:1 COG:BH3661 KEGG:ns NR:ns ## COG: BH3661 COG0463 # Protein_GI_number: 15616223 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus halodurans # 3 221 11 229 257 142 35.0 5e-34 MQVSIIMPYYNAAAYICETVEAIIAQTYKDWELIIVDDCSPAPETTAVLKQIEEMDSRIK VMRAEKNGGAGLARNIGIKTAQGQYIGFCDSDDWWYPTKLEEQLKFMQENGYELTCTWYE DANEQLKPYYTVKQAPRQSYKSMIAGCNVGTPGVLFDTRRVGKKYMPPLRRAEDWGLWMN ILKDVDYIYTYPKALWKYRHIPGSETSNKWLMLKAVVKMYKTVLGMNSLEAWFIALFIFL PDNILKKLKKIV >gi|226332265|gb|ACIC01000055.1| GENE 6 4797 - 6002 562 401 aa, chain - ## HITS:1 COG:MTH341 KEGG:ns NR:ns ## COG: MTH341 COG1035 # Protein_GI_number: 15678369 # Func_class: C Energy production and conversion # Function: Coenzyme F420-reducing hydrogenase, beta subunit # Organism: Methanothermobacter thermautotrophicus # 13 303 10 308 406 82 23.0 1e-15 MKSNVCDGEELRLCTGCGVCVSVCGTKAITIKLDNEGYYKPVVDEDKCVECNLCKKSCYK YDENPVQSDKYEMCYAAVNKNEKQLKASSSGAVSRVLMEECIARDYKVMGCAYDTKENIA KSIVASTIDELDQFYGSKYFQSLTTEGFDEVLKDRTEKKYAIFGTPCQIYGFSQTSKYKR KPDNYLLVDIFCHGCPSMKLWKSYLKYMSKRTGCSEYDKVVFRSKTHAWHEYCFDFISKS KQYSSKKSKDPFYDIFFGMDIMNEACYDCNSRSSMAYGDIRLGDYWGEKYDTNTKGVSAV VVKTPKGMEIFEAVKNRLNVEEVTLNHILAAQSYGKTHFYNRQRRRFLLQNMSGDSDLKE 
VYTTYMKMFPLKSRIKKQLKAIVKCCPKSVYFPIKKLIHSI >gi|226332265|gb|ACIC01000055.1| GENE 7 5992 - 7128 592 378 aa, chain - ## HITS:1 COG:no KEGG:Amet_0211 NR:ns ## KEGG: Amet_0211 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 2 372 6 368 369 150 31.0 1e-34 MKNRILLTTVFSAFNYGSCLQAFAGKQIIRKVGYECDLVKLKSLVKGRDVRFRKLMTILF RLLFLNKNNALKTYGKSYEKALVDGTERKFFAFTDEYLKPVEVSWRRLKSIAKEAVACFS GSDQIWNSSTLYVDPLYYLRFAPKQKRVALSPSFGRNFIADYNKNKMKKWINDYPYLSVR EDSGVKLIKELTGLEAQHLLDPTLIINSEEWRKNLHIEEKTKNYILAYFLDEPSSHAKKA LKALKEQLNCKVIALPYKFEDMDYCDEAVAAGPKEFVELVANAKVVCTDSFHGTVFALNL HTPFFTFEREYGFANKQSERVLSILRKVDMLERYQPVNVVEKLDNLDFKHSEEVLNTERE KAYKYVNNAINSIKKNEK >gi|226332265|gb|ACIC01000055.1| GENE 8 7137 - 7964 202 275 aa, chain - ## HITS:1 COG:SP1771_1 KEGG:ns NR:ns ## COG: SP1771_1 COG0463 # Protein_GI_number: 15901601 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Streptococcus pneumoniae TIGR4 # 2 218 6 223 259 79 30.0 7e-15 MKISVIVPTYKPQTYLWECLDSIYNQTFPKTDYELVLVLNGCNEPYNTQIKEWLSKHEDL QVQYFQIDEGGVSNARNIALNNAKGKYVTFIDDDDLISFSFLEELYDKVAPDTVSLCYPY AFKDKESNRQLNYTVTDAYNYAVENRCSTLSSKVRKYFSGPCMKLIPMSFIQERRYDKRF KIGEDSLFMFLISDKIHKVSLTSKNAVYYRRYREESASFKKRSAKEQITNNIKLIWEYIK IYVHGGYSSYIFTTRIASRIREIMFYPITNFKKNS >gi|226332265|gb|ACIC01000055.1| GENE 9 8012 - 9235 540 407 aa, chain - ## HITS:1 COG:no KEGG:Slip_0167 NR:ns ## KEGG: Slip_0167 # Name: not_defined # Def: glycosyl transferase group 1 # Organism: S.lipocalidus # Pathway: not_defined # 1 403 1 404 412 194 30.0 5e-48 MNILYIGKFFPQNMLRTINEDSKGLMGMSNHNFEMSLLNGLCQQENINLKCITCPGVFSY PQYNKKMFIEPESYAYKNTYIESVGFCNLVGVKEGSAKRATASNILKAIEGFEGDTIHII INTPDIRLLDAIKAAKRKTAKKITQTVVIPDIPSIVFGMDRQNPVKAYLLAKRNKKITEA INNSNGLVLLTEAMMDFYQKDLKHIVMEGIVDVGTMGKSDVEPTTDKKVVLYTGTLRKIF GIMNLVEAFKMVKDRDVELWICGSGDSKEAINEAARIDSRIKFFGLVDSETALEMQHKAT ILVNPRTSEGEYTKYSFPSKTMEYLLAGRSVIINHLSGIPEEYYDYVYTPKDESVEALAE 
CISSVIHLDIKEREERAKKGRQFIIEKKNSKVQMERVIKMIESYANI >gi|226332265|gb|ACIC01000055.1| GENE 10 9243 - 10388 402 381 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253568579|ref|ZP_04845990.1| ## NR: gi|253568579|ref|ZP_04845990.1| predicted protein [Bacteroides sp. 1_1_6] # 1 381 1 381 381 616 100.0 1e-175 MPFKYKLLVLLFFSTLITSLWSDNIYLLFLFSLATWCILPFNKWWDGIGVTLIFFSFAYC AMQYMHDQTGSGFIYLSMIIAPVAYYRFGSWAICWLEDDKKKLQFLFISILLYLIPVLLL TIQDILIVGFVNESRYLLSDIGKEESTLSATIYGLMSSAGIAFVSIIFAESLKLKEKAVF FSLAAIALVIVVHLINRTGIVLLIVSIILSFGYSTGMRLSKVVSSLLLLLLVLVIIIKSG LVSQEIIDAYIERESGTTDNATELGGRSGKWQAAISDLMSSPFGWKRVGYVHNLWLDLAA VAGWITTIPFVVATINVLRTTFRMIKRHVTPFRLVVVTMVVAMFLNCMVEPVIEASLLFF VLMMFVWGMLKAESEEMVLNH >gi|226332265|gb|ACIC01000055.1| GENE 11 10397 - 11899 294 500 aa, chain - ## HITS:1 COG:no KEGG:BDI_1846 NR:ns ## KEGG: BDI_1846 # Name: not_defined # Def: putative transmembrane protein # Organism: P.distasonis # Pathway: not_defined # 6 495 8 502 509 229 29.0 2e-58 MQENTNKAIFLNSIILYAKLIITAIAGLLTTRFALQALGVDDFGLFSVVGGVISFIAIVN TIMISTSNRFIATAIGKGDMKLINNTFSVNVVIHVLIAVITLAVAYPLGDWYILNFINYG GDIQTVMTVFHITIIGSVISFIGVPFNGLLIAKERFLVFSTAEIIASITKVIVSYSLIYF FVNKLLVYSLTICLTTAFPTFVFIVYCRKLFPQIVAFHFVKEWTPYREVLAFSVWVGYGA IATVSKNQGAALIVNRFFNTTMNTALGLANSVNSIILTFANSISKSISPQIVKSYAAGDK ERSETLVVMASKYSFLILLLVASPFLVAPEWIFSLWLGAVPANLIIFSNLIIVDALIGVF NAGIPDLIFASGKIKWYQIIVNTMFLLSVVAAYFVLQAGAPAYYLQVTYIIFSLIILVLR QIALNRIVKFNNWNLIRKSYIPCIYVAILFTLVFTIKGIVVPFMLMIISIVYLLILYFTI ALTKEERLFVINRIKGLIKK >gi|226332265|gb|ACIC01000055.1| GENE 12 11899 - 12774 552 291 aa, chain - ## HITS:1 COG:no KEGG:BDI_0439 NR:ns ## KEGG: BDI_0439 # Name: not_defined # Def: putative glycosyltransferase # Organism: P.distasonis # Pathway: not_defined # 61 281 62 277 291 165 38.0 2e-39 MGKNKSFGRRILNLLPWQVRYFVDYSRNKHVIPNLFSPRNYSEYIFRDNILGCHKKHAYL ADKYEVRKYVEERGLGHTLTKLYGVWDNADTINFDELPEQFAIKCNHSCAMNIIVFDKSK LDVEATKKLLNKWLKTKHPIDFEVHYQCIKPLIICEELIKDNEDGSFPMDYKIHCANGKP IFIQCCFDRTEQSVGKRVIYNPEWKNLHYIINDYHYSGDECDKPKHLKEMLEYASILSKG 
LDYARVDFYDTDERPLFGEITLTPMGGWLTYFTPEALDVMGREIRVNKNKK >gi|226332265|gb|ACIC01000055.1| GENE 13 12761 - 13894 192 377 aa, chain - ## HITS:1 COG:BS_ywtB KEGG:ns NR:ns ## COG: BS_ywtB COG2843 # Protein_GI_number: 16080641 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) # Organism: Bacillus subtilis # 29 261 92 328 380 76 26.0 8e-14 MSRLFVCGDICNSMSEMKKSFISDELVEVIKSVDYSVCNLEGVEVDHNTTHVDYPHQQFG TVSYLKSCGFNMCLLANNHITDGGPDRLNYTIDAIDNVRLDYIGAGFTEEQVYAARIKQI GNYKFGIINLCEAQEGQFSNSTCKYGYAWIGHFSIAKTISELRTKVDFILCFCHYGLEHY EVPLDCVRQYYYLLIDQGVDCIVGTHPHIAQGYEYYNDKLIVYSLGNFFFPRRTGRYDDE NHSYSVVLEFEKGRKVQVTPIFHKLEEGAVCTDNSGRIDLKALNLLLSNNYKYNEEETIR KAFDGVVGNLFYNSMCSMSDQRRSPKERLKDIFRYTMFRSKYINSTAKIRASIISRLFKN EIYRYIIIKIHNDYGEE >gi|226332265|gb|ACIC01000055.1| GENE 14 13898 - 14506 315 202 aa, chain - ## HITS:1 COG:L106031 KEGG:ns NR:ns ## COG: L106031 COG0726 # Protein_GI_number: 15672087 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Lactococcus lactis # 1 198 86 291 318 77 30.0 1e-14 MYHHISDVRIDELDTCQHTLEQFKKSLECYINQSYIFVNIDTALKMIREKSEQKFAVVTF DDVPLDAYEKAVPVLLSMQIPFTFFITTGYIGTKGFMTKAQLKKLDNQELCTIGAHTKSH PMLRKVSNSFEELKESKEILERLLGHPVDYLAYPFGRQSSISHKVMKQAEKAGYKCAFGT IDATINDLSSKSLYYLPRIVRK >gi|226332265|gb|ACIC01000055.1| GENE 15 14605 - 15699 311 364 aa, chain - ## HITS:1 COG:no KEGG:BDI_3823 NR:ns ## KEGG: BDI_3823 # Name: not_defined # Def: putative glycosyltransferase # Organism: P.distasonis # Pathway: not_defined # 1 350 29 378 395 280 41.0 5e-74 MKRLLGSLLKAVYPSKDIPLVISVDCSGDTELYEYVEEFEWPFGQKYVNIQERRLGLKDH IYQCGELTGQFKAIILLEDDLFVSPFFYSYVLKTLDKYGNDSRIAQISLYKNERNGYVGL PFVNIQNGSDVFLMQDVSTWGECWTESMWSEFRQWRDTHSEEDIQKVDMPSEIKGWIQAW SKYYNAYVVDSNKFVIYPNIPVTTNFSDAGEHGGDNNSLVQVNLLQQDYDYRLYDVDKLA RYDIYFNNVCLYEKLGIPENDLCLDIYGFHSNEKGCKYILSTKVLPYKIVKSFALNMRPI ELNVMYDIFGNGLYLYDTTDSNGTTQGSYHKNVVPYFLEGFNVRLLLKYVISHYRNSIKQ VLKK 
>gi|226332265|gb|ACIC01000055.1| GENE 16 15780 - 16694 267 304 aa, chain - ## HITS:1 COG:PAB0772 KEGG:ns NR:ns ## COG: PAB0772 COG0463 # Protein_GI_number: 14521365 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Pyrococcus abyssi # 6 275 4 259 298 117 30.0 3e-26 MDKKLPLVTIITNTKNRARLISRCIESIQNQTYQNYEHIVADGGTDNTKKIVESYNDPHI IYISIPEGGPVAQTKEAFKLSKGEFITFLDDDDEYTPEKLEKQLDLIQSLSDDYGFIYGA MTYYDNNTKEQLNIHGAEIEGGSEILPIAISKPTICGTPTFMFRRKAFESIGGTWISGIG NDMSDWALGCRALKQGWKVAALKESYLRIYVNHQATRMSDADFYKDNSQRYIKFHNHFLS EYADIIAKNPKVGTFHYDNLVHFYVAAGRTGDAFKTWYKLIKTRPNFRSLVSFPYYFFRQ LMKK >gi|226332265|gb|ACIC01000055.1| GENE 17 17171 - 18361 557 396 aa, chain - ## HITS:1 COG:SP0360 KEGG:ns NR:ns ## COG: SP0360 COG0381 # Protein_GI_number: 15900289 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine 2-epimerase # Organism: Streptococcus pneumoniae TIGR4 # 6 395 4 393 394 679 80.0 0 MEKQPKFDYSDIKFKNNGKLKLIIVVGTRPEIIRLAAVINKCRQYFDVILAHTGQNYDYN LNGIFFKDLKLAEPEVYMDAVGDDLGATCGNIINCSYKLFAQTQPDGVLVLGDTNSCLSV ISAKRLHIPIFHMEAGNRCKDECLPEETNRRIVDIISDVNMAYSEHARRYLADCGLPKEH TYVTGSPMAEVLHNNFVEIEVSDVHQRLGLEKGKYILLSAHREENIDTEKNFLSLFTAIN KMAEKYDIPILYSCHPRSRNRLAASGFKLDSRVIQHEPLGFHDYNCLQMNAFAVVSDSGT LPEESSFFTSVGHPFPAICIRTSTERPEAIDKGDFIIAGIDEKSLLQAVDTAVELCRDDQ HGIPVPDYVDENVSTKVVKIVQSYVGIVNKMVWRKF >gi|226332265|gb|ACIC01000055.1| GENE 18 18364 - 19569 839 401 aa, chain - ## HITS:1 COG:SP0359_1 KEGG:ns NR:ns ## COG: SP0359_1 COG0451 # Protein_GI_number: 15900288 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Streptococcus pneumoniae TIGR4 # 2 279 4 281 281 384 64.0 1e-106 MNILVTGAKGFVGRNLCAQLNNIKNGKARCYGDLKIDEVFEYDLDSTSEQLDVWCQKAGF VFNLAGVNRPKDNDEFMKGNFGFASILLDALKKYHNTCPVMLSSSAQASLTGRFGNSEYG RSKKAGEDLFLQYGKDTGAKVLVYRFPNLYGKWCRPNYNSAVATFCNNIANDLPIQVNDP 
CVELELLYIDDLVDEMIYALKGKEHRCEFEGLDVFPLVDGRYCYCPVTHKVTLGEIVDLL HQFAEMPKTLMIPEISADSFAKRLYSTYLSYLPKEKAIFDLKMNVDQRGSFTELVHTLNC GQVSINISKPGVTKGEHWHNTKWEQFIVVSGHGLIQLRKEGTDEVLNYEVSGDKIQSVIM LPGYTHNIINLSDTEDLVTVMYCNEIFNPDKPDTYFDKVKK >gi|226332265|gb|ACIC01000055.1| GENE 19 19577 - 20626 862 349 aa, chain - ## HITS:1 COG:SP0358 KEGG:ns NR:ns ## COG: SP0358 COG1086 # Protein_GI_number: 15900287 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Streptococcus pneumoniae TIGR4 # 1 338 1 339 351 517 73.0 1e-146 MSIFAGKTLMITGGTGSFGNAVLNRFLRTDIGEIRIFSRDEKKQDDMRHEYQVKYSDVAH KIKFFIGDVRNLQSCKNAMPGVDYIFHAAALKQVPSCEFFPMEAVKTNVIGTDNVLTAAI EAGVGAVICLSTDKAAYPINAMGITKAVEEKIAVAKSRYSGKTKICCTRYGNVMCSRGSV IPLWIEQIRNGNPVTLTEPTMTRFIMSLEEAVDLVLFAFEHGQNGDILVQKAPACTIQTQ AEAICELFGGKKEDIKVIGIRHGEKMYETLLTNEECAKAEDMGNFYRVPADNRGLNYDKF FKEGETERNTLTEFNSNNTRILNVAETKAKIAALDYIKKELSGESNFVQ >gi|226332265|gb|ACIC01000055.1| GENE 20 20661 - 20957 167 98 aa, chain - ## HITS:1 COG:BH3709 KEGG:ns NR:ns ## COG: BH3709 COG0451 # Protein_GI_number: 15616271 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Bacillus halodurans # 5 91 257 333 343 84 48.0 5e-17 MPILKVYNIGNNSPENLLDFVIVLQDELIRAGVLPNDYDFESHKELIPMQPEDVPVTYAD TTPLEQDFGFKPSTSLREGLRKFAGWYAKYYGTFKYHP >gi|226332265|gb|ACIC01000055.1| GENE 21 20989 - 21285 310 98 aa, chain - ## HITS:1 COG:mlr7549 KEGG:ns NR:ns ## COG: mlr7549 COG0451 # Protein_GI_number: 13476270 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Mesorhizobium loti # 13 98 3 90 342 62 41.0 3e-10 MVTYNVSLENKLVLVTGAAGFIGANLVKRLQNEFDSVKVIGIDSITEYYDVRLKYERLQE LPAYVDRFVFIKDSIANKKIVKSIFTNYHPQVVVNLTA >gi|226332265|gb|ACIC01000055.1| GENE 22 21290 - 22606 1198 438 aa, 
chain - ## HITS:1 COG:STM2080 KEGG:ns NR:ns ## COG: STM2080 COG1004 # Protein_GI_number: 16765410 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Salmonella typhimurium LT2 # 7 438 2 388 388 460 55.0 1e-129 MDTKEFKIAVAGTGYVGLSIATLLSQHHQVTAVDVIPEKVDMLNRKQSPIQDEYIEKFLS EKELNLTATLDGTKAYSDADFVVIAAPTNYDPVKHYFDTHHIEDVIDLVLSVNPDAVLVI KSTIPVGYCRGLYMKYARKGVKKLNLLFSPEFLRESMALYDNLYPSRIIVGYPKLIDGEQ FDEENEAIKAIADVPVLEEAAHTFAALLQEGAIKEDIPTLFMGIKEAEAVKLFANTYLAL RVSYFNELDTYAEMKGLDSQSIIQGVGLDPRIGIHYNNPSFGYGGYCLPKDTKQLLANYQ DVPQNMMTAIVESNRTRKDFIADQVLRKAGYYTASCSWDAQKERKTTIGVYRLTMKSNSD NFRQSAIQGIMKRIKAKGATIVIFEPAMQDGETFFGSQVVNNLIKFKEISQAIIANRYDA CLDDVKDKVYTRDIFQRD >gi|226332265|gb|ACIC01000055.1| GENE 23 22641 - 23567 869 308 aa, chain - ## HITS:1 COG:SA0147 KEGG:ns NR:ns ## COG: SA0147 COG1086 # Protein_GI_number: 15925856 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Staphylococcus aureus N315 # 4 274 326 590 607 261 50.0 2e-69 MHNVRLEFEKNYPGLDFVPVIGDVRVKERLRMVFETYQPQIIFHAAAYKHVPLMEENPCE AVLVNVVGSRQVADMAVEYGAEKMIMVSTDKAVNPTNVMGCSKRLAEIYVQSLGCAIREG KVKGRTKFITTRFGNVLGSNGSVIPRFKEQIENGGPVTVTHPDIIRFFMTIPEACRLVME AATMGEGNEIFVFEMGKAVKIVDLATRMIELAGYRPGEDIEIEFTGLRPGEKLYEEVLSD KENTIPTENKKIMIAKVRHYEYTDILDTYGEFEKLSRTVKIMDTVKLMKKVVPEFKSKNS PRFEVLDK Prediction of potential genes in microbial genomes Time: Thu May 12 00:39:14 2011 Seq name: gi|226332264|gb|ACIC01000056.1| Bacteroides sp. 1_1_6 cont1.56, whole genome shotgun sequence Length of sequence - 32396 bp Number of predicted genes - 29, with homology - 28 Number of transcription units - 19, operones - 9 average op.length - 2.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 187 90 ## 2 2 Tu 1 . - CDS 210 - 788 333 ## BT_0596 putative transcriptional regulator - Prom 1032 - 1091 4.0 - Term 1083 - 1124 2.2 3 3 Tu 1 . 
- CDS 1138 - 2094 537 ## BT_0595 integrase - Prom 2154 - 2213 9.1 + Prom 2064 - 2123 6.3 4 4 Tu 1 . + CDS 2182 - 2580 267 ## BT_0594 hypothetical protein + Prom 2620 - 2679 2.1 5 5 Op 1 . + CDS 2808 - 3173 314 ## BT_0593 hypothetical protein 6 5 Op 2 . + CDS 3200 - 3985 696 ## BT_0592 hypothetical protein + Term 4119 - 4165 -1.0 7 6 Tu 1 . - CDS 4081 - 5532 1205 ## BT_0591 hypothetical protein - Prom 5692 - 5751 5.8 + Prom 5604 - 5663 13.5 8 7 Op 1 . + CDS 5717 - 7330 2111 ## COG0504 CTP synthase (UTP-ammonia lyase) 9 7 Op 2 . + CDS 7393 - 9249 1732 ## COG0706 Preprotein translocase subunit YidC + Term 9275 - 9323 11.2 10 8 Op 1 . + CDS 9335 - 10666 1021 ## COG0534 Na+-driven multidrug efflux pump 11 8 Op 2 . + CDS 10714 - 12813 2059 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases + Term 12840 - 12885 2.8 - Term 12825 - 12876 7.4 12 9 Tu 1 . - CDS 12959 - 13663 446 ## BT_0586 hypothetical protein - Prom 13706 - 13765 10.2 - Term 13706 - 13756 -0.1 13 10 Op 1 . - CDS 13768 - 14277 516 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family 14 10 Op 2 . - CDS 14297 - 15295 855 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold - Prom 15317 - 15376 5.3 - Term 15351 - 15404 12.2 15 11 Op 1 . - CDS 15449 - 16543 943 ## BT_0583 hypothetical protein 16 11 Op 2 . - CDS 16565 - 16891 392 ## COG1695 Predicted transcriptional regulators - Prom 16992 - 17051 7.6 + Prom 16950 - 17009 8.2 17 12 Tu 1 . + CDS 17059 - 17556 179 ## PROTEIN SUPPORTED gi|229884790|ref|ZP_04504247.1| acetyltransferase, ribosomal protein N-acetylase + Term 17581 - 17635 12.3 - Term 17566 - 17624 11.7 18 13 Op 1 . - CDS 17647 - 19161 1469 ## COG3104 Dipeptide/tripeptide permease 19 13 Op 2 . - CDS 19237 - 19716 460 ## COG2606 Uncharacterized conserved protein 20 13 Op 3 . - CDS 19759 - 22599 2571 ## COG0178 Excinuclease ATPase subunit - Prom 22771 - 22830 7.4 + Prom 22561 - 22620 5.0 21 14 Op 1 . 
+ CDS 22742 - 24508 1453 ## COG1388 FOG: LysM repeat 22 14 Op 2 . + CDS 24516 - 25238 634 ## BT_0576 hypothetical protein - Term 25071 - 25118 -0.7 23 15 Tu 1 . - CDS 25366 - 25809 390 ## BT_0575 hypothetical protein - Prom 25831 - 25890 5.2 - Term 25830 - 25891 3.2 24 16 Op 1 . - CDS 25916 - 26251 429 ## BT_0574 hypothetical protein 25 16 Op 2 . - CDS 26266 - 27069 822 ## COG4105 DNA uptake lipoprotein - Prom 27103 - 27162 5.2 - Term 27221 - 27259 -0.3 26 17 Op 1 3/0.000 - CDS 27302 - 27727 563 ## COG4747 ACT domain-containing protein - Prom 27772 - 27831 3.5 27 17 Op 2 . - CDS 27846 - 29144 1301 ## COG1541 Coenzyme F390 synthetase - Prom 29235 - 29294 5.4 + Prom 29172 - 29231 5.1 28 18 Tu 1 . + CDS 29280 - 31313 1964 ## COG0556 Helicase subunit of the DNA excision repair complex + Term 31472 - 31501 1.4 - Term 31526 - 31564 3.5 29 19 Tu 1 . - CDS 31590 - 32294 672 ## COG2755 Lysophospholipase L1 and related esterases - Prom 32326 - 32385 5.2 Predicted protein(s) >gi|226332264|gb|ACIC01000056.1| GENE 1 2 - 187 90 61 aa, chain + ## HITS:0 COG:no KEGG:no NR:no KRQTQSSFLLSSLTVTRIQIIDLFMDFVVQSCKIINIHTRHIITFMEQSMNSILKFIDFF L >gi|226332264|gb|ACIC01000056.1| GENE 2 210 - 788 333 192 aa, chain - ## HITS:1 COG:no KEGG:BT_0596 NR:ns ## KEGG: BT_0596 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 192 1 192 192 389 100.0 1e-107 MILTKEKSLNAGPIYGTGEGVAHSKRWYVALVRMHHEKKVSEYLNKVGIENFVPVQKEIH QWSDRRKLVESVLLPMMVFVHADPKERMEVLNFTTVSRYMVMRGESSPAVIPDDQMARFR FMLDYSDETVCMNSSPLARGEKVQVIKGPLQGLVGELVNVDGKSKIAVRLNMLGCACVDM PIGYVEPIGEKN >gi|226332264|gb|ACIC01000056.1| GENE 3 1138 - 2094 537 318 aa, chain - ## HITS:1 COG:no KEGG:BT_0595 NR:ns ## KEGG: BT_0595 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 318 1 318 318 627 100.0 1e-178 MQIMNKNGFSRCGELYIGLLRKEGRFSTAHVYQNALFSFNKFCGNASVSFRQVTRDRLRR YGEHLYDSGLKPNTVSTYMRMLRSIYNRGVELGSAPYVHGLFRDVYTGVDIRQKRALPVS 
ELRRLLYEDPKTDCLRRTQSIAALMFQFCGMSFADLAHLEKSALDCNVLRYNRIKTRTPM SVEVLDTAKEMMNQLRNSQSSRPGCPDYLFSILSGDKKRKDERAYREYQSALRRFNNHLK SLARALHLTSPVTSYTIRHSWATTAKFRGVPIEMISESLGHKSIKTTQIYLKGFELKERT EVNRKNLHYVKTCPLGRI >gi|226332264|gb|ACIC01000056.1| GENE 4 2182 - 2580 267 132 aa, chain + ## HITS:1 COG:no KEGG:BT_0594 NR:ns ## KEGG: BT_0594 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 132 1 132 132 183 99.0 1e-45 MTTTKKCLKNATGTQKKRSLETDYQDDDEIIDIDEVRKRNGITGEKKHYFIACLRTYYED DGFNSYVKSLTGLAKYIADWRIFYVNTHHGYRLMKFRSILSEISSLSKGIGIISKHEKEE KEQYKSKYHSRK >gi|226332264|gb|ACIC01000056.1| GENE 5 2808 - 3173 314 121 aa, chain + ## HITS:1 COG:no KEGG:BT_0593 NR:ns ## KEGG: BT_0593 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 121 1 121 121 224 100.0 8e-58 MTKEKRLIATNKTEEERDRLTEVWKKRIVSEKGYSETIAERIAEKVKSLADYMQYGHAII AYYRQNGSFQLVTGTLISYSKDFHHSYDMKQVHSTFIFWCMEEKGWRTFQIENFLDWKPI V >gi|226332264|gb|ACIC01000056.1| GENE 6 3200 - 3985 696 261 aa, chain + ## HITS:1 COG:no KEGG:BT_0592 NR:ns ## KEGG: BT_0592 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 261 1 261 261 505 100.0 1e-142 MSTTYLSVDYFPLTVNFFDRDAIELAEAKYGIRVDGAVCKLLCKIFKEGYYIPWGEEQSL IFARKLGGELKGKEIDGIIQILLDKGFFDKESYEKFQILTSLEIQHIWIDATCRRKRKLE DLPYLLVNDTQKQEKENNTKDANNSSVQGELKLENENISPEIADISGQSKGEKRKAEEEE DNTASPSLEIPGYAYNKSTHNVKGLLDSLAQHGVTNKKEITAILKLSDYGKKGTQIWITL STTKWGKIDNPGRYIISTLRT >gi|226332264|gb|ACIC01000056.1| GENE 7 4081 - 5532 1205 483 aa, chain - ## HITS:1 COG:no KEGG:BT_0591 NR:ns ## KEGG: BT_0591 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 483 1 483 483 931 100.0 0 MRRELLSFVVFTSTLSLSLSAQTATQPSTVVAKAKIVKADTLSSELQKYLMLKLNLSGET PKLDTVSILYNKYIGQLDYLNDPSVPERYIPSDPDYFRLFTPLAYYYAPMAQYSKVDWKP VQFDSLPDNLKKLTAELLPYDTLAYTKSERANQSVNKALMALYLNHPELVVTTENRIMSR 
DVFRRDVKPKISPKASVVHLFQPEEMSNDVGKAKMKISRPNWWVTGGNGSLQISQNHLSE NWYKGGESNFAGLATLQLYANYNDNEKVLFENQLEAKLGMTSTPSDKFHDYLINTDQFRL YNKLGLRALKNWYYTISSEFKTQFCHGYKANSEELVSAFLSPADLAVSIGMDYKLNKQKF NLSVFIGPLTYNLRYIDNSEVDETKFGLEKGKCSKNDFGSQLQSTFNWKIISAVSLESRL NYLTNYEWVRVEWENTFNFVLNRYLSTKLYVHARFDDSSKPTVENGSFFQLKELLSFGIN YKW >gi|226332264|gb|ACIC01000056.1| GENE 8 5717 - 7330 2111 537 aa, chain + ## HITS:1 COG:BS_ctrA KEGG:ns NR:ns ## COG: BS_ctrA COG0504 # Protein_GI_number: 16080768 # Func_class: F Nucleotide transport and metabolism # Function: CTP synthase (UTP-ammonia lyase) # Organism: Bacillus subtilis # 4 533 2 530 535 597 53.0 1e-170 MGETKYIFVTGGVASSLGKGIISSSIGKLLQARGYNVTIQKFDPYINIDPGTLNPYEHGE CYVTVDGHEADLDLGHYERFLGIQTTKANNITTGRIYKSVIDKERRGDYLGKTIQVIPHI TDEIKRNVKLLGNKYKFDFVITEIGGTVGDIESLPYLESIRQLKWELGKNALCVHLTYVP YLAAAGELKTKPTQHSVKELQSVGIQPDVLVLRAEHPLSDGLRKKVAQFCNVDDKAVVQS IDAETIYEVPLLMQAQGLDSTILEKMGLPVGETPGLGPWRKFLERRHAAETKEPINIALV GKYDLQDAYKSIREALSQAGTYNDRKVEVHFVNSEKLTDENVAEALKGMAGVMIGPGFGQ RGIDGKFVAIKYTRTHDIPTFGICLGMQCIAIEFARNVLGYADADSREMDEKTPHNVIDI MEEQKAITNMGGTMRLGAYECVLQKGSKAYLAYGEEHIQERHRHRYEFNNDYKAQYEAAG MKCVGINPESDLVEIVEIPALKWFIGTQFHPEYSSTVLNPHPLFVAFVKAAIENEKN >gi|226332264|gb|ACIC01000056.1| GENE 9 7393 - 9249 1732 618 aa, chain + ## HITS:1 COG:HI1001 KEGG:ns NR:ns ## COG: HI1001 COG0706 # Protein_GI_number: 16272937 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YidC # Organism: Haemophilus influenzae # 53 577 35 536 541 135 26.0 2e-31 MDKNTITGLVLIGILLVGFSYLSRPSEEQIAAQKKYYDSIAVVQQQQEALKAKTEAALAN ENKGAAAAADSSALFFNAMHGTDSKVSIQNSVAEITFTTKGGRVYSAMLKEYKGQDKTNP VVLFDGDDATMSFNFYNKQGAIQTKDYYFEAVNKTDSSVTMRLAADNASYIDFIYTLKPN SYLMNFEIKATGMEGKLASTEYVDIDWTQRARQLEKGFTYENRLSELTYKVKGDNVDNLS AAKDDEKDLGNTAIDWVAFKNQFFSSVFIADQDFNKVSVKSRMEQQGSGYIKDYSAEMST FFDPSGKQPTEMYFYFGPNHFKTLKALDKGRTEKWELNRLVYLGWPLIRWINQFITINVF DWLSGWGLSMGIVLLILTIMVKVVVYPATWKTYMSSAKMRVLKPKIDEINKKYPKQEDAM 
KKQQEVMSLYSQYGVSPMGGCLPMLLQFPILMALFMFVPSAIELRQQSFLWADDLSTYDA FITFPFHIPFLGNHLSLFCLLMTVTNILNTKYTMTMQDTGAQPQMAAMKWMMYLMPIMFL FVLNDYPSGLNYYYFVSTLISVGTMILLRKTTDETKLLAILEAKKKDPKQMKKTGFAARL EAMQKQQEQLQQQKQNKR >gi|226332264|gb|ACIC01000056.1| GENE 10 9335 - 10666 1021 443 aa, chain + ## HITS:1 COG:TM0815 KEGG:ns NR:ns ## COG: TM0815 COG0534 # Protein_GI_number: 15643578 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Thermotoga maritima # 8 443 20 454 464 129 25.0 2e-29 MKTKYTYKQIWTIAYPILISLIMEQLIGMTDTAFLGRVGEIELGASAIAGVYYLAIFMMA FGFSVGAQILIARRNGEGNYKEIGPIFYQGIYFLVVVAAILFTLSIVFSPFILKNIISSP HIYDAAESYIHWRVYGFFFSFVGVMFRAFFVGTTQTKTLTLNSIVMVLSNVVFNYILIFG KFGFPQLGIAGAAIGSSLAELVSVIFFIIYTWKRIDCKKYALNILPKFQSRTLKRILNVS VWTMIQNFVSLSTWFMFFLFVEHLGERSLAIANIIRNVSGIPFMIAMAFAATCGSLVSNL IGAGEKDCVRGTINQHIRIGYVFVLPILIFFCLFPDLILRIYTDMPDLRDASIPSLWVLC AAYVVLVPANVYFQSVSGTGNTRTALAMELCVLTIYVAYATYFILYLKMDIAFAWTTESV YGIFILLFCYWYMKKGNWQKKQI >gi|226332264|gb|ACIC01000056.1| GENE 11 10714 - 12813 2059 699 aa, chain + ## HITS:1 COG:CC1986 KEGG:ns NR:ns ## COG: CC1986 COG1506 # Protein_GI_number: 16126229 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Caulobacter vibrioides # 45 693 21 675 683 332 30.0 1e-90 MRQANLFMMSAAMLLAACGGTKDAGKTDQVLIEKSDIKIEGKRMTPEALWAMGRIGGLAV SPDGKKIAYTVAYYSVPENKSNREVFVMNADGSDNRQITRTPYQENEVTWIKGGTKLAFL SNDNGSSQLYEMNPDGSGRKQLTNYDGDIEGYSISPDGKKLLFISQVKTKESTADKYPDL PKATGIIVTDLMYKHWDEWVTTAPHPFVADFDGNGISNIVDILEGEPYESPMKPWGGIEQ LAWNTTSDKVAYTCRKKTGLEYAVSTNSDIYVYDLNTKKTDNITEENKGYDTNPQYSPDG KYIAWQSMERDGYEADLNRLFIMNLETGEKRFVSKAFESNVDAFVWGNDAKTIYFTGVWH GETQIYSLDLTNDSVKAITSGMYDYEGVALFGDKLIAKRHSMSMGDEIYAVALDGSATQL TQENKEIYDQLEMGKVEGRWMKTTDGKDMLTWVIYPPQFDPNKKYPTLLFCEGGPQSPVS QFWSYRWNFQIMAANDYIIVAPNRRGLPGFGVEWNEQISGDYGGQCMKDYFTAIDEMAKE SYVDKDRLGCVGASFGGFSVYWLAGHHDKRFKAFIAHDGIFNMEMQYLETEEKWFANWDM GGAYWEKQNPIAQRTFANSPHLFVEKWDTPILCIHGEKDFRILANQAMAAFDAAVMRGVP 
AELLIYPDENHWVLKPQNGVLWQRTFFEWLDKWLKAPAK >gi|226332264|gb|ACIC01000056.1| GENE 12 12959 - 13663 446 234 aa, chain - ## HITS:1 COG:no KEGG:BT_0586 NR:ns ## KEGG: BT_0586 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 234 1 230 230 427 98.0 1e-118 MKTKMFFLAGICAALAACSSDSDDVSSSPSNAPAILEVVSYKFVQEETDVVERVEYPVVV LQHKVNDKDEPLPMIYAWDVEEEENSLFVLTEGSLPVNAENLADLKIPVPFIDAGGKLFI DGTGAKTPLIFGETLKVKNGSRSIGNVKYEIPPYSIYELTKQECGYRCTLTFYLVLKAVN KGEEYHLKGRWTGEQLREQKMGLIDLSDEKGAEKTVLMEAPIELFEKDYETGLD >gi|226332264|gb|ACIC01000056.1| GENE 13 13768 - 14277 516 169 aa, chain - ## HITS:1 COG:FN0320 KEGG:ns NR:ns ## COG: FN0320 COG1853 # Protein_GI_number: 19703665 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Fusobacterium nucleatum # 2 140 4 139 180 75 34.0 3e-14 MKKLEVKDLKENFFEAIGKEWMLVTAGTKEKFNTMTASWGGIGWLWNKPVAFVFVRPERY TYEFIEKSDYLTLSFLGEANKKIHAVCGSKSGRDTDKVKATGLKPVFTEQGNVLFEQARL SLECKKLYTDTIKPECFLDKESLEKWYDDAHGGFHKMYIVEIENIWEGD >gi|226332264|gb|ACIC01000056.1| GENE 14 14297 - 15295 855 332 aa, chain - ## HITS:1 COG:YPO1228 KEGG:ns NR:ns ## COG: YPO1228 COG2220 # Protein_GI_number: 16121515 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Yersinia pestis # 79 304 84 314 342 172 37.0 9e-43 MGEHICFRRNERLATVNPYWRGNPMVRGRFFNRQHRFRPGMGSVLKWRLSPNPQRKEKKT VKWDPKVCYLRSLDAVVGDSLIWLGHNSFFLQLAGKRIMFDPVFGNIPFVKRQSEFPANP DIFTEIDYLLISHDHFDHLDKQSITRLLNNNPQMKLFCGLGTGELIKSWFPEMRVIEAGW YQQLEDEGLKITFLPAQHWSKRSVRDGGQRLWGAFMLQGNGVSLYYSGDTGYSTHFREIP DLFGAPDYALLGIGAYKPRWFMRPNHISPYESLTASEEMHAGITIPMHYGTFDLSDEPLH DPPKVFAAEAKKRKIRVEIPYLGEIVKLSKRK >gi|226332264|gb|ACIC01000056.1| GENE 15 15449 - 16543 943 364 aa, chain - ## HITS:1 COG:no KEGG:BT_0583 NR:ns ## KEGG: BT_0583 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # 
Pathway: not_defined # 1 364 1 364 364 670 100.0 0 MKKTLTVNLGGTVYHIDDDAYRLLDDYLSNLKHFFRKQEGAEEIVNDIEIRIAELFAEKV SAGKQVITIADVEEIIARVGKPEDFGVSDDESEPHKKEQTASSGQGYTRTTTARRLFRDP DNKLLGGVASGLAAYFDWDITLVRILMIVLLFVPYCPMIILYIIGWIIIPEARTAAEKLS MRGEAVTIENIGKTVTDGFERVADGVNNFVNSDKPRTFLQKVGDVFVTIAAIILKIFLVA LVIICCPVLFVLAVVIVALVFAVIAALVGGGALLYEMLPAIDWTPIATISPVMTLLGTIS GIALIAIPLGAFLYTIMRQLFHWSPMGTGLKWSLFILWVLGLVIVIINLSALGWQLPLYG LHCF >gi|226332264|gb|ACIC01000056.1| GENE 16 16565 - 16891 392 108 aa, chain - ## HITS:1 COG:BH0406 KEGG:ns NR:ns ## COG: BH0406 COG1695 # Protein_GI_number: 15612969 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 5 108 67 170 174 79 37.0 2e-15 MNVDNVKSQMRKGMLEYCIMLLLHKEPAYASDIIQKLKEAQLIVVEGTLYPLLTRLKNDD LLSYEWVESTQGPPRKYYKLTEKGEAFLGELEISWKELNETVNHIASR >gi|226332264|gb|ACIC01000056.1| GENE 17 17059 - 17556 179 165 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229884790|ref|ZP_04504247.1| acetyltransferase, ribosomal protein N-acetylase [Sebaldella termitidis ATCC 33386] # 4 161 5 160 169 73 31 2e-12 MFTIRKARIDDCGLINQMAGEVFPATYQEILSPGQLDYMMDWMYSPENIRKQMEEEGHVY FIAYEGDEPCGYVSVQQQEENVFHLQKIYVLPHFQGAHCGSFLFREAVRYIKEVHPEPCL MELNVNRNNKALLFYERMGMRKLREGDFPIGNGYYMNDYIMGLDI >gi|226332264|gb|ACIC01000056.1| GENE 18 17647 - 19161 1469 504 aa, chain - ## HITS:1 COG:XF1891 KEGG:ns NR:ns ## COG: XF1891 COG3104 # Protein_GI_number: 15838489 # Func_class: E Amino acid transport and metabolism # Function: Dipeptide/tripeptide permease # Organism: Xylella fastidiosa 9a5c # 5 461 29 458 510 150 27.0 8e-36 MFDKHPKGLIAAALANLGERFGFYTMMAILVLFLQAKFGMNGKEAGFIYSTFYFSIYILA LVGGIIADKTRNYKGTIFTGIVLMAVGYLLLAIPSKTPVDNQTFYLVITCASLFVIAFGN GLFKGNLQALVGQMYDNPQYASMRDSGFSLFYMFINIGAIFAPFAAVGVRNWWLSTFGYN YDADLPALCHGHLAGTLSPEATETYHTLVEKASNAPVQDYTAFASDYLNVFTTGFHYAFG VAIIAMIISLTIYLLNKRNFPDPSKKAAASSASSATVEMSTQEVKQRMYALFAVFGVVIF FWFSFHQNGLTLTYFAKEYTDLNLFGMPISAELFQSLNPFFVVFLTPVIMAIFASQRRRG 
KEPSTPKKIAIGMGIAALAFIVMAVGSYFANLPLHKDIIAVGTSPVKVTPFLLMLTYLIL TVAELYISPLGISFVSKVAPPKYQGIMQGGWLGATALGNQLLVIGAILYESIPIWMTWTV FVVACTISMFTMIFMLKWLERVAK >gi|226332264|gb|ACIC01000056.1| GENE 19 19237 - 19716 460 159 aa, chain - ## HITS:1 COG:lin0783 KEGG:ns NR:ns ## COG: lin0783 COG2606 # Protein_GI_number: 16799857 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 5 159 4 158 158 171 57.0 5e-43 MKINKTNAVRLLDKAKIAYELIPYEVDENDLSAIHVAASLGENIEQVFKTLVLHGDKSGY FVCVIPGEHEVDLKLAAKASGNKKCDLIPVKELLPLTGYIRGGCSPIGMKKHFPTYIHET CRQFPYIYVSAGVRGLQIKLAPGDLIRESRAEICRLFEE >gi|226332264|gb|ACIC01000056.1| GENE 20 19759 - 22599 2571 946 aa, chain - ## HITS:1 COG:BH3594 KEGG:ns NR:ns ## COG: BH3594 COG0178 # Protein_GI_number: 15616156 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Bacillus halodurans # 12 941 6 935 957 1019 56.0 0 MKDMNMQETEYINVYGARVHNLKDIDAEIPRNSLTVITGLSGSGKSSLAFDTIFAEGQRR YIETFSAYARNFLGNLERPDVDKITGLSPVISIEQKTTNKNPRSTVGTTTEIYDYLRLLY ARAGIAYSYLSGEEMVKYTEEQILDLILKDYKGKKIYLLAPLVRSRKGHYKELFEQVRKK GYLYVRIDGELREVTHGMKLDRYKNHDIEVVIDKLIVAEKDDKRLKQSVATAMRQGDGLL MILDAQTESVRHYSKRLMCPVTGLSYREPAPHNFSFNSPQGACPKCKGLGVVNQIDVDKV IPDRELSIYEGAIAPLGKYKNAMIFWQIGALLEKYEATLKTPVKELSDDAVEEILYGSDD RIKIKSSLIGTSSDYFVTYEGVVKYIQMLQEKDASATAQKWAEQFARTTVCPECKGARLN KEALHFRIHDKNINDLANMDINELYDWLMNVDQFLSDKQKKIAAEILKEIRTRLKFLLDV GLDYLALNRSSVSLSGGESQRIRLATQIGSQLVNVLYILDEPSIGLHQRDNLRLIRSLKE LRDMGNSVIVVEHDKDMMLAADYVIDMGPKAGRLGGEVVFSGTPSEMLQTETMTSQYLNG EMKIEVPAKRRKGNGKSIWLKGAKGNNLKNVDVEFPLGKLICVTGVSGSGKSTLINETLQ PILSQKFYRSLQDPLEYDSIEGLENIDKVVDVDQSPIGRTPRSNPATYTGVFSDIRNLFV SLPEAKIRGYKPGRFSFNVSGGRCEACTGNGYKTIEMNFLPDVYVPCEVCHGKRYNRETL EVRFKGKSIADVLDMTINRAVEFFENVPQILNKIKVLQDVGLGYIKLGQSSTTLSGGESQ RVKLATELSKRDTGKTLYILDEPTTGLHFEDIRVLMGVLNKLVDKGNTVIVIEHNLDVIK MADYIIDMGPEGGKGGGVLLSYGTPEEVAKSQKGYTPKFLREELGL >gi|226332264|gb|ACIC01000056.1| GENE 21 22742 - 24508 1453 588 aa, chain + ## HITS:1 COG:CAC1232_1 
KEGG:ns NR:ns ## COG: CAC1232_1 COG1388 # Protein_GI_number: 15894515 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: FOG: LysM repeat # Organism: Clostridium acetobutylicum # 4 143 2 128 133 60 29.0 7e-09 MKPISRIFLFLLFISASYAISYAQENQSYFLHTIEKGQSLYSIAKMYNVTTNDIIRLNPG CDEKIYAGQAIKIPKGKESQKGETFHTIQAGETLYKLTTIYNISAKAICEANPGLSAENF RIGQVILIPLEQEQETEAAQTPAEKPAIQGPIQSRCKDMHKVKRKETVFSVSREYGISEQ ELIAANPELKKGMKKGQYLCIPYPSATTMQPTTPKEDPYAIPPSNNELFRKSKEAPQAIS TIKAALLLPFQEDKRMVEYYEGFLMAVDSLKRTGTSIDLYVYDCGKDVSTLNTILAKNEM KNMNVIFGPMHQQQIKPLSTFAEKNDIRLVIPFSSKGEEVFNNPAIYQINTPQSYLYSEV YEHFTRQFPNAHVIFIEPTSEDKEKAEFISGMKQELKSKGMSMKTVNENATKDMLKEALR SDKDNIFIPTSGKNVMLIKILPQLILLVRDTPEQNIHLFGYPEWQTYTRDHLESFFELDT YFYSSFYTNTLFPAAIQFTNNYHKWYSKDLVSKFPSYGMLGFDTGFFFLKGLSRYGSELE NNLPKMNLTPIQTGFKFERVNNWGGFINKKVFFIHFTKNFELVKLDFE >gi|226332264|gb|ACIC01000056.1| GENE 22 24516 - 25238 634 240 aa, chain + ## HITS:1 COG:no KEGG:BT_0576 NR:ns ## KEGG: BT_0576 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 240 1 240 240 471 100.0 1e-131 MTKIKSLLIAASLMMGLALPATAQIGEPRQNFSVGFNGGVNLNSASFTPKIKQNSLMGIT GGLTARYISEKYFAMICGAQVELNISQRGWDELFEVPGENGEPVEDPTRKYTRKMTYLDI PFLAHLAFGNEKGLQFFINAGPQIGLLIGESESMENIDMNALTDTQKAVYGDKIQNKFDY GIAGGGGIEFRTKKAGSFLLEGRYYFALSDFYSTTKKDYFSRAAHGTITVKLTYLFDLKK >gi|226332264|gb|ACIC01000056.1| GENE 23 25366 - 25809 390 147 aa, chain - ## HITS:1 COG:no KEGG:BT_0575 NR:ns ## KEGG: BT_0575 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 147 1 147 147 248 99.0 5e-65 MIQRIQSFYLLIVTGLLITAMCLPMGYFIDTTGEHPFKALGINVNGIFQSTWGIFGILML SALVAFATIFLYKNRMLQIRMSIFNSLLLVGYYIAFIAFYFALKSDANLFRIGWALCLPL VSIVLNVLAIRAIGRDEVMVKAADRLR >gi|226332264|gb|ACIC01000056.1| GENE 24 25916 - 26251 429 111 aa, chain - ## HITS:1 COG:no KEGG:BT_0574 NR:ns ## KEGG: BT_0574 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 111 1 111 111 186 
100.0 3e-46 MDYKKTNAPATTVTRDMMELCADTGNVYETVAIIGKRANQISVEIKNDLSKKLAEFASYN DNLEEVFENREQIEISRYYEKLPKPDLIATQEYIEGKIYYRNPAKEKEKLQ >gi|226332264|gb|ACIC01000056.1| GENE 25 26266 - 27069 822 267 aa, chain - ## HITS:1 COG:BMEI0587 KEGG:ns NR:ns ## COG: BMEI0587 COG4105 # Protein_GI_number: 17986870 # Func_class: R General function prediction only # Function: DNA uptake lipoprotein # Organism: Brucella melitensis # 8 258 40 282 309 59 22.0 8e-09 MKKNIIITLLAAASLTSCGEYNKLLKSTDYEYKYEAAKNYFAKGQYNRSATLLNELITIL KGTDKAEESLYMLGMSYYNQKDYQTAAQTFITYFNTYPRGTFTELARFHAGKSLFLDTPE PRLDQSSTYQAIQQLQMFMEYFPNSTKKQEAQDMIFALQDKLVLKELYSAKLYYNLGNYL GNNYESCVITAQNALKDYPYTDYREELSILILRARHEMAIYSVEDKKMDRYRETIDEYYA FKNEFPESKYLKEAEKIFNESQKVIKD >gi|226332264|gb|ACIC01000056.1| GENE 26 27302 - 27727 563 141 aa, chain - ## HITS:1 COG:MTH1854 KEGG:ns NR:ns ## COG: MTH1854 COG4747 # Protein_GI_number: 15679842 # Func_class: R General function prediction only # Function: ACT domain-containing protein # Organism: Methanothermobacter thermautotrophicus # 1 141 1 143 143 108 39.0 3e-24 MVAKQLSIFLENKSGRLTEVTEVLAKENINLSALCIAENADFGILRGIVSDPDKAYKALK DNHFAVNITDVVGISCPNVPGALAKVLGFLSAEGVFIEYMYSFANNNVANVVIRPSNMDK CIEVLKEKKVDLLAASDLYKL >gi|226332264|gb|ACIC01000056.1| GENE 27 27846 - 29144 1301 432 aa, chain - ## HITS:1 COG:MTH1855 KEGG:ns NR:ns ## COG: MTH1855 COG1541 # Protein_GI_number: 15679843 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Methanothermobacter thermautotrophicus # 1 431 1 431 433 501 55.0 1e-142 MIWNETIECMDRESLRKIQSIRLKKIVDYVYHNTPFYRKKMQEMGITPDDINTIDDIVKL PFTTKHDLRDNYPFGLCAVPMSQIVRIHASSGTTGKPTVVGYTRKDLSSWAECISRAFTA YGAGRSDIFQVSYGYGLFTGGLGAHAGAENIGASVIPMSSGNTEKQITLMHDFGSTVLCC TPSYALYLADAIKDSGYPREEFQLKVGALGAEPWTENMRHEIEEKLGIKAYDIYGLSEIA GPGVGYECECQHGTHLNEDHYFPEIIDPNTLQPVEPGQTGELVFTHLTKEGMPLLRYRTR DLTALHYDKCSCGRTLVRMDRILGRSDDMLIIRGVNVFPTQIESVILEMEEFEPHYLLIV GRENNTDTMELQVEVRPDFYSDEINRMLALKKKLAGRLQSVLGLGVNVKLVEPRSIERSV GKAKRVIDNRKI 
>gi|226332264|gb|ACIC01000056.1| GENE 28 29280 - 31313 1964 677 aa, chain + ## HITS:1 COG:BS_uvrB KEGG:ns NR:ns ## COG: BS_uvrB COG0556 # Protein_GI_number: 16080570 # Func_class: L Replication, recombination and repair # Function: Helicase subunit of the DNA excision repair complex # Organism: Bacillus subtilis # 3 669 5 658 661 728 56.0 0 MNFELTSAYKPTGDQPEAIAQLTEGVLQGVPAQTLLGVTGSGKTFTIANVIANINKPTLI LSHNKTLAAQLYSEFKGFFPNNAVEYYVSYYDYYQPEAYLPNSDTYIEKDLAINDEIDKL RLAATSSLLSGRKDVVVVSSVSCIYGMGNPSDFYKNVIEIERGRMLDRNVFLRRLVDSLY VRNDIDLNRGNFRVKGDTVDIFLAYSDTLLRVTFWGDEIDGIEEVDPITGVTTAPFEAYK IYPANLFMTTKEATLRAIHEIEDDLTKQVAFFESIGKEYEAKRLYERVTYDMEMIRELGH CSGIENYSRYFDGRAAGTRPYCLLDFFPDDFLLVIDESHVSVPQVRAMYGGDRARKINLV EYGFRLPAAMDNRPLKFEEFEEMTKQVIYVSATPAEYELIQSEGIVVEQVIRPTGLLDPV IEVRPSLNQIDDLMEEIQLRIEKEERVLVTTLTKRMAEELAEYLLNNNVRCNYIHSDVDT LERVKIMDDLRQGVYDVLIGVNLLREGLDLPEVSLVAILDADKEGFLRSHRSLTQTAGRA ARNVNGKVIMYADKITDSMRLTIDETNRRREKQLAYNEANGITPQQIKKARNLSVFGSPS SEADELLKEKHAYVEPSSPNIAADPIVQYMSKAQMEKSIERTRKLMQEAAKKLEFIEAAQ YRNELLKLEDLMKEKWG >gi|226332264|gb|ACIC01000056.1| GENE 29 31590 - 32294 672 234 aa, chain - ## HITS:1 COG:CAC3448 KEGG:ns NR:ns ## COG: CAC3448 COG2755 # Protein_GI_number: 15896689 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Clostridium acetobutylicum # 58 222 15 178 190 61 32.0 1e-09 MKKKRLGQWMLLAIICLSFSVGETYAQKHEFANYKRYATENAVLAQPVKKEKRVVFMGNS ITEGWVRTHPDFFKTNGYIGRGISGQTSYQFLLRFREDVINLSPVLVVINAGTNDVAENT GAYNEDYTFGNIVSMTELAKANKIKVILTSVLPAAEFPWRREIKDAPQKIQSLNARIEAY AKANKIPFVNYYQPMVVGDNKALNPQYTKDGVHPTGEGYDIMEALIKQAIEKAL Prediction of potential genes in microbial genomes Time: Thu May 12 00:40:09 2011 Seq name: gi|226332263|gb|ACIC01000057.1| Bacteroides sp. 
1_1_6 cont1.57, whole genome shotgun sequence
Length of sequence - 27875 bp
Number of predicted genes - 19, with homology - 19
Number of transcription units - 9, operones - 4 average op.length - 3.5

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Op 1 . + CDS 51 - 695 469 ## COG2949 Uncharacterized membrane protein
2 1 Op 2 . + CDS 731 - 1207 291 ## COG0013 Alanyl-tRNA synthetase
    + Term 1256 - 1296 -0.8
    + Prom 1228 - 1287 4.1
3 2 Tu 1 . + CDS 1475 - 1903 427 ## COG0071 Molecular chaperone (small heat shock protein)
    + Term 1950 - 1996 4.0
    - Term 1928 - 1991 6.1
4 3 Op 1 22/0.000 - CDS 2007 - 3122 916 ## COG0842 ABC-type multidrug transport system, permease component
5 3 Op 2 45/0.000 - CDS 3134 - 4237 755 ## COG0842 ABC-type multidrug transport system, permease component
6 3 Op 3 10/0.000 - CDS 4240 - 5709 340 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein
7 3 Op 4 . - CDS 5688 - 6602 979 ## COG0845 Membrane-fusion protein
8 3 Op 5 . - CDS 6619 - 6861 227 ## BT_0560 outer membrane efflux protein
9 3 Op 6 . - CDS 6809 - 7894 1073 ## BT_0560 outer membrane efflux protein
    - Prom 7981 - 8040 4.2
10 4 Tu 1 . + CDS 8068 - 8943 432 ## COG2207 AraC-type DNA-binding domain-containing proteins
    - Term 8759 - 8797 7.1
11 5 Tu 1 . - CDS 8947 - 9999 881 ## COG0836 Mannose-1-phosphate guanylyltransferase
    - Prom 10033 - 10092 8.1
    - Term 10177 - 10224 9.9
12 6 Op 1 24/0.000 - CDS 10254 - 13481 3458 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ)
13 6 Op 2 . - CDS 13491 - 14657 910 ## COG0505 Carbamoylphosphate synthase small subunit
14 6 Op 3 . - CDS 14685 - 16568 1818 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase
15 6 Op 4 .
- CDS 16607 - 18451 1704 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains - Prom 18573 - 18632 10.5 + Prom 18731 - 18790 3.2 16 7 Op 1 21/0.000 + CDS 18814 - 23424 4275 ## COG0069 Glutamate synthase domain 2 + Prom 23443 - 23502 7.4 17 7 Op 2 . + CDS 23553 - 24893 993 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases + Prom 24929 - 24988 2.8 18 8 Tu 1 . + CDS 25100 - 26779 1542 ## COG0367 Asparagine synthase (glutamine-hydrolyzing) + Term 26898 - 26941 4.3 + Prom 26893 - 26952 10.0 19 9 Tu 1 . + CDS 26979 - 27740 830 ## COG0584 Glycerophosphoryl diester phosphodiesterase + Term 27804 - 27852 8.4 Predicted protein(s) >gi|226332263|gb|ACIC01000057.1| GENE 1 51 - 695 469 214 aa, chain + ## HITS:1 COG:DR0187 KEGG:ns NR:ns ## COG: DR0187 COG2949 # Protein_GI_number: 15805223 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Deinococcus radiodurans # 41 211 51 221 222 195 55.0 5e-50 MKKKLFYITLIIAILCVISIFICDWTIKKNAASCIYTEISDIPANNVGLLLGTSSKLKSG NNNLYFDYRIDAAVELYKAGKINYILISGDNRKEDYNEPEEMKKALMQKGVPEKSIYLDY AGFRTLDSVVRAKEVFGQTRLTIISQRFHNERAIYLAEKNGITAIGFNARDVDVYAGLKT NIRELFARVKMFIDLAIDKQPHFLGEKYKLPSGK >gi|226332263|gb|ACIC01000057.1| GENE 2 731 - 1207 291 158 aa, chain + ## HITS:1 COG:aq_1293 KEGG:ns NR:ns ## COG: aq_1293 COG0013 # Protein_GI_number: 15606507 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Alanyl-tRNA synthetase # Organism: Aquifex aeolicus # 19 138 559 679 867 68 33.0 3e-12 MEQQPQLNDHNKQEYPPMHTAEHLLNATMVKTFGCPRSRNAHIERKKSKCDYILPTCPTD AEIQTIEDKVNEIISQNLPVTVEFMTHEQAKDIVDLSKLPADASETLRIIRIGDYDACAC IGLHVSNTSEVGTFKIISHDYDEERQTLRLRFKLIDKK >gi|226332263|gb|ACIC01000057.1| GENE 3 1475 - 1903 427 142 aa, chain + ## HITS:1 COG:TM0374 KEGG:ns NR:ns ## COG: TM0374 COG0071 # Protein_GI_number: 15643142 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone 
(small heat shock protein) # Organism: Thermotoga maritima # 3 142 12 146 147 71 37.0 4e-13 MMPVRRTQSWLPSIFNDFFDNDWMVKANATAPAINVFETEKEYKVELAAPGMTKEDFNVR IDEDNNLVISMEKKTENKEEKKEGRYLRREFSYSKFQQTMILPDNVDKEKISAAVENGVL SVQLPKISEEEVKKAEKQIEVK >gi|226332263|gb|ACIC01000057.1| GENE 4 2007 - 3122 916 371 aa, chain - ## HITS:1 COG:SMb21204 KEGG:ns NR:ns ## COG: SMb21204 COG0842 # Protein_GI_number: 16264618 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Sinorhizobium meliloti # 2 357 6 355 370 171 32.0 2e-42 MIKFLIEKEFKQLLRNSFLPRLILIFPCMIMLLMPWAVNMEIKNIQLNIVDNDHSVISQR LVNKIAASTYFRLTEVPASYEDGLRNIELGTADIIMEIPRHLERDWMNKGEAHILIAANS VNGTKGGLGSSYLGTIINNYAAELRAEYPDAVSASGSAPSIRIDMQGLFNPNLNYKLYMI PALMGMLLTLICGFLPALNVVSEKEVGTIEQINVTPVPKFTFILAKLLPYWITGFIVLTL CFLLAWLLYGITPVGHFIVIYFLAILFVFVMSGFGLVISNYSATMQQSMFVMFFFMLILM LMSGLFTPVSSMPEWAQVITYFNPLKYFMEGMRMVYLKGSSLLELLPEIGVLFLFALGFN TWAVISYRKNQ >gi|226332263|gb|ACIC01000057.1| GENE 5 3134 - 4237 755 367 aa, chain - ## HITS:1 COG:CAC3268 KEGG:ns NR:ns ## COG: CAC3268 COG0842 # Protein_GI_number: 15896513 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Clostridium acetobutylicum # 2 365 4 374 378 211 32.0 2e-54 MKQFIAFVKKEFYHIFRDRRTMLILLGMPVVEIILFGFAISTEVKNVRLAVLDPSNDMVT RKIIDRLDASEYFTVTTRFHSPQEMETAFRKNKIDMAIVFGERFADGLYTGDARVQLIVD ATDPNMSTSQSNYAASIVSSAGQEMLPPNVSVSRLTPDVKLLYNPQMKSAYNFVPGVMGL ILMLICAMMTSISIVREKETGTMEVLLVSPVKPLFIVLAKAVPYFVLSFVNLTTILLLSV FVLDVPVVGSLFWLVMVSLLFIFVSLALGLLISSVTSTQVAAMLASGIILMMPTMVLSGM IFPVESMPVILRAISDIIPARWYIQAVRKLMIEGVPVVLVLKEIGILLLMAVGLITISIR KFKYRLE >gi|226332263|gb|ACIC01000057.1| GENE 6 4240 - 5709 340 489 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 250 456 7 215 311 135 33 3e-31 MEKGEPIVVVKEISKSYGKVEALKEISFAVEQGEIFGLIGPDGAGKSTLFRILTTLLLAD 
KGTATVNGLDVVTDYKQIRTKVGYMPGRFSLYQDLSVEENLEFFATVFHTSIQENYDLIK DIYQQIEPFKKRRAGALSGGMKQKLALSCSLIHKPDILFLDEPTTGVDPVSRKEFWQMLR NLRKQGITIIVSTPIMDEARQCDRIAFINHGQVHGIDTPDRILQKFASILCPPPLEREEA QQMAVPVIEVEQLTKSFGHFTAVDHISFQVQRGEIFGFLGANGAGKTTAMRMLCGLSRPT SGVGKVAGYDIFREAEQVKRHIGYMSQKFSLYEDLKVWENIRLFAGIYGMKEMEIEEKTD ELLERLGFADERDTLVKSLPLGWKQKLAFSVSIFHEPKIVFLDEPTGGVDPATRRQFWEL IYQAADRGITVFVTTHYMDEAEYCNRISIMVDGQIKALDTPARLKEQFGVETMDDVFQQL ARHAVRKAD >gi|226332263|gb|ACIC01000057.1| GENE 7 5688 - 6602 979 304 aa, chain - ## HITS:1 COG:PA5232 KEGG:ns NR:ns ## COG: PA5232 COG0845 # Protein_GI_number: 15600425 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Pseudomonas aeruginosa # 11 280 22 314 357 92 28.0 7e-19 MKRMTKIGIYGMAVWLLSACGNGTPDYDATGTFEATEVIVSAEAAGKLLQLEVEEGTRLK AGEEVGLVDTVQLYLKKLQLEASMKSVESQRPDLAKQIAATKQQITTAQREKKRVENLLA AGAANQKQLDDWDAQVTLLQRQLIAQESSLMKSTNSLTEQGNSVGIQVAQVEDQLSKCHI QSPIEGTVLVKYAEAGELASVGKPLFKVGEVDRMYLRAYVTSEQLSQVKLGDEVTVYSDY GNSEQKAYPGVVTWISDRSEFTPKTILTKNERANLVYAVKIAVKNDGSLKIGMYGGVVWK KENP >gi|226332263|gb|ACIC01000057.1| GENE 8 6619 - 6861 227 80 aa, chain - ## HITS:1 COG:no KEGG:BT_0560 NR:ns ## KEGG: BT_0560 # Name: not_defined # Def: outer membrane efflux protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 80 330 409 409 139 100.0 4e-32 MTQQDQSVRSLEKQMQDDDEIIRLRTNIRRAAEAKVANGTLTVTEMLRELTNESLARQAK AMHEIQRLMGIYQLKYTTNH >gi|226332263|gb|ACIC01000057.1| GENE 9 6809 - 7894 1073 361 aa, chain - ## HITS:1 COG:no KEGG:BT_0560 NR:ns ## KEGG: BT_0560 # Name: not_defined # Def: outer membrane efflux protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 340 1 325 409 620 98.0 1e-176 MRMKRIIFSLSLLLFMAGMYGQTGHITLEECQQKTQDNYPLVRQYDLVEKTKEYNLENAA RGYLPQFALSAKASYQSDVTELPITIPGVDIKGMAKDQYQVMLELQQQIWDGGGIRMQKK KVTAEAEIDREKLNVDMYALNGPVNDLYFGILMLDEQLAQNALLQDELGRNFRQITAYVE NGIANQADLDAVKVEQLNTRQKRVELTSSRMAYLKMLSLLMGEVLSPETVLEKPVPQNEL SAVSEIRRPELSWFDAQGAGLQVQEKALNVRHLPHFGLFVQGAYGNPGLNMLKNEFSPYY 
IAGVRLSWNFGSLYTLKNDRRVIENKRKQLGSNRDVFLFNYETGNDAARSVCPFVGEADA G >gi|226332263|gb|ACIC01000057.1| GENE 10 8068 - 8943 432 291 aa, chain + ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 179 290 176 286 288 75 33.0 1e-13 MELGLQLLKDINIVRHNEFRFINAEFGFVTSFSKMETTIFSVGQPYRLQEGRIAIVTNGS ARVLINLIEYIFLPDHISLIAPNSIIQIIEVSHDFDAHMMAVDLNFLPISGKEEFFTHFL QQKKNLLFPLSTEEQIQIENFVTVMWSVLQEPIFRKEVIQHLLVGLLYNIEYIAKSKGQA ELSPLTHQNDIFQRFISLVNTYSKTERNVSFYADKLCLTPRYLNTVIRQTSQQTVMDWIN QSIILEAKVLLKHSNRLVYQISDELNFPNPSFFSKFFKRMTGMTPQEYQKN >gi|226332263|gb|ACIC01000057.1| GENE 11 8947 - 9999 881 350 aa, chain - ## HITS:1 COG:CAC3058 KEGG:ns NR:ns ## COG: CAC3058 COG0836 # Protein_GI_number: 15896309 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannose-1-phosphate guanylyltransferase # Organism: Clostridium acetobutylicum # 8 344 5 340 350 286 42.0 4e-77 MSMDNHIVIMAGGIGSRFWPMSTPECPKQFIDVMGCGRTLIQLTADRFDGVCPRENVWVV TSEKYIDIVREQLPEIPESNILAEPCARNTAPCIAYACWKIKKKHPNANVVVTPSDALVI NTGEFRRVVEKALRFTDNSSAIVTLGIKPTRPETGYGYIAAGDQIMTDKEIFTVDAFKEK PDRETADRYLAEGNYFWNAGIFVWNVRTITSVMRVYAPGIAQIFDRIFPDFYTEKENETI KKLFPTCEAISIDYAVMEKAQEIYVLPASFGWSDLGTWGALRGLLPQDKSGNATVGADVR LYESKNCIVHTSEEKRVVIQGLDGYIIAEKDNTLLICKLEEEQRIKEFSK >gi|226332263|gb|ACIC01000057.1| GENE 12 10254 - 13481 3458 1075 aa, chain - ## HITS:1 COG:SPAC22G7.06c_2 KEGG:ns NR:ns ## COG: SPAC22G7.06c_2 COG0458 # Protein_GI_number: 19113967 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Schizosaccharomyces pombe # 6 1059 1 1044 1064 1225 57.0 0 MKENIKKVLLLGSGALKIGEAGEFDYSGSQALKALKEEGIETILINPNIATVQTSEGVAD QIYFLPVTPYFVEKVIQKEKPEGIMLAFGGQTALNCGVALYKEGILEKYNVKVLGTPVQA IMDTEDRELFVQKLNEINVKTIKSEAVENAEDARCAAKELGYPVIVRAAYALGGLGSGFC 
DNEEQLDVLVEKAFSFSPQVLVEKSLRGWKEVEYEVVRDRFDNCITVCNMENFDPLGIHT GESIVIAPSQTLTNSEYHKLRELAIRIIRHIGIVGECNVQYAFDPESEDYRVIEVNARLS RSSALASKATGYPLAFVAAKLGLGYGLFDLKNSVTKTTSAFFEPALDYVVCKIPRWDLGK FHGVDKELGSSMKSVGEVMAIGRTFEEAIQKGLRMIGQGMHGFVENKELVIPDIDKALRE PTDKRIFVISKAFRAGYTIDQVHELTKIDKWFLQKLMNIMKTSEEMHEWGNNHKQIADLP VELLRKAKVQGFSDFQIARAIGYEGDMENGSLYVRKYRKAAGILPVVKQIDTLAAEYPAQ TNYLYLTYSGVANDVHYLGDHKSIVVLGSGAYRIGSSVEFDWCGVQALNTIRKEGWRSVM INYNPETVSTDYDMCDRLYFDELTFERVMDILELENPHGVIVSTGGQIPNNLALRLDAQN IHILGTSAQSIDNAEDREKFSAMLDRIGVDQPRWRELTSLEDINEFVDEVGFPVLVRPSY VLSGAAMNVCSNQEELERFLKLAANVSKKHPVVVSQFIEHAKEVEMDAVAQNGEIIAYAI SEHIEFAGVHSGDATIQFPPQKLYVETVRRIKRISREIAKALNISGPFNIQYLAKDNDIK VIECNLRASRSFPFVSKVLKINFIELATKVMLGLPVEKPEKNLFELDYVGIKASQFSFNR LQKADPVLGVDMASTGEVGCIGSDTSCAVLKAMLSVGYRIPKKNILLSTGTMKQKADMMD AARMLVNKGYKLFATGGTHKTFAENGIESTLVYWPSEEGHPQALEMLHNKEIDMVVNIPK NLTAGELDNGYKIRRAAIDLNVPLITNARLASAFINAFCTMTVDDIAIKSWEEYK >gi|226332263|gb|ACIC01000057.1| GENE 13 13491 - 14657 910 388 aa, chain - ## HITS:1 COG:YJL130c_1 KEGG:ns NR:ns ## COG: YJL130c_1 COG0505 # Protein_GI_number: 6322331 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase small subunit # Organism: Saccharomyces cerevisiae # 2 388 20 408 433 370 46.0 1e-102 MRNVTLILDDGSRFSGKSFGYEKPVAGEVVFNTAMTGYPESLTDPSYAGQLMTLTYPLVG NYGVPPFTIEPNGLATFMESEKIHAEAIIVSDYSSEYSHWNAVESLGDWLKREKVPGITG IDTRELTKILREHGVMMGRIVFNDEIVGEIDNGQLPMDNYAAVNYVDRVSCKEIISYLPD GTSRSFPLTMPVAQLNCQLSTVNCQLKKVVLVDCGVKTNIIRCLLKRGVEVIRVPWDYDF NGLEFDGLFISNGPGDPDTCDAAVQNIRKAMANEKLPIFGICMGNQLLSKAGGAKIYKLK YGHRSHNQPVRMVGTERCFITSQNHGYAVDNNTLGADWEPLFINMNDGSNEGIKHKKNPW FSAQFHPEAASGPTDTEFLFDEFVNLLK >gi|226332263|gb|ACIC01000057.1| GENE 14 14685 - 16568 1818 627 aa, chain - ## HITS:1 COG:YPO2772 KEGG:ns NR:ns ## COG: YPO2772 COG0034 # Protein_GI_number: 16122976 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Yersinia 
pestis # 39 535 21 453 505 122 25.0 3e-27 MEQLKHECGVAMIRLLQPLEYYEKKYGTWMYGLNKLYLLMEKQHNRGQEGAGLACVKLEA NPGEEYMFRERALGSGAITEIFEAVQSNFKDLTPEQLHDAAFAKRTLPFAGEVYMGHLRY STTGKSGISYVHPFLRRNNWRAKNLALCGNFNMTNVDEIFARITAIGQHPRKYADTYIML EQVGHRLDREVERLFNLAEAEGLTGMGVTHYIEEHIELANVLRTSSREWDGGYVICGLTG SGESFAIRDPWGIRPAFWYQDDEIAVLASERPVIQTALNVPVEKIKELQPGQALLISKEG RLRTSQINRPRERHACSFERIYFSRGSDVDIYKERKLLGEKLVPNILKAINNDLDHTVFS FIPNTAEVAFYGMLQGLDDYLNEEKVRQIAALGHHPNMEELEVILSRRIRSEKVAIKDIK LRTFIAEGNSRNDLAAHVYDITYGSLVSGVDNLVIIDDSIVRGTTLKQSIIGILDRLGPK KIVIVSSSPQVRYPDYYGIDMAKMSEFIAFKAAIELLKERDMKDVIAAAYRKSKDQVGLP KEQMVNYVKDIYAPFTDEEISAKMVELLTPKGTKAKVEIVYQPLEGLHEACPNHQGDWYF SGNYPTPGGVKMVNQAFINYIEQMYQF >gi|226332263|gb|ACIC01000057.1| GENE 15 16607 - 18451 1704 614 aa, chain - ## HITS:1 COG:DR0302 KEGG:ns NR:ns ## COG: DR0302 COG0449 # Protein_GI_number: 15805332 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Deinococcus radiodurans # 1 614 37 642 642 556 48.0 1e-158 MCGIVGYIGKRKAYPILIKGLKRLEYRGYDSAGVALISDNQQLNVYKTKGKVSELENFVT QKDISGTVGIAHTRWATHGEPCSVNAHPHYSSSEKLALIHNGIIENYAVLKEKLQAKGYV FKSSTDTEVLVQLIEYMKVTNRVDLLTAVQLALNEVIGAYAIAILDKEHPEEIIAARKSS PLVVGIGEDEFFLASDATPIVEYTDKVVYLEDGEIAVINRGKELKVVDLSNVEMTPEVKK VELKLGQLEKGGYPHFMLKEIFEQPDCIHDCMRGRINVEANNVVLSAVIDYKEKLLNAKR FIIVACGTSWHAGLIGKHLIESFCRIPVEVEYASEFRYRDPVIDEHDVVIAISQSGETAD TLAAVELAKSRGAFIYGICNAIGSSIPRATHTGSYIHVGPEIGVASTKAFTGQVTVLTML ALTLAREKGTIDETQYLNIVRELNSIPGKMKEVLKLNDKLAELSKTFTYAHNFIYLGRGY SYPVALEGALKLKEISYIHAEGYPAAEMKHGPIALIDAEMPVVVIATQNGLYEKVLSNIQ EIKARKGKVIAFVTKGDTVISKIADCSIELPETIECLDPLITTVPLQLLAYHIAVCKGMD VDQPRNLAKSVTVE >gi|226332263|gb|ACIC01000057.1| GENE 16 18814 - 23424 4275 1536 aa, chain + ## HITS:1 COG:CAC1673_2 KEGG:ns NR:ns ## COG: CAC1673_2 COG0069 # Protein_GI_number: 15894950 # Func_class: E Amino acid transport and metabolism # Function: Glutamate synthase domain 2 # Organism: Clostridium 
acetobutylicum # 402 1221 1 804 804 877 54.0 0 MKKQELFNNATGKFPYQRQPGQMGLYDAAYEHDACGVGMLVNIHGEKSHDIVESALKVLE NMRHRGAEGADNKTGDGAGIMLQIPHEFILLQGIPVPEKGRYGTGLLFLPKNEKDQAAIF SIIIEEIEKEGLTLMHLRNVPTCPEILGEAALANEPDIKQVFITGFTETETADRKLYLIR KRIENKVRLSAIPTRNDFYVVSLSTKSIIYKGMLSSLQLRNYYPDLTNSYFTSGLALVHS RFSTNTFPTWGLAQPFRLLAHNGEINTIRGNRGWMEARESVLSTPMLGDIKEIRPIIQPG MSDSASLDNVLEFLVMSGLSLPHAMAMLVPESFNEKNPISEDLKAFYEYHSILMEPWDGP AALLFSDGRFAGGMLDRNGLRPARYLITKNDTMVVASEVGVMDFEPGDIKEKGRLQPGKI LLIDTEKGEIYYDGELKKQLAEAKPYRTWLSTNRIELDELKSGRKVPHHVENYDRMLRTF GYSKEDIERLIMPMASAGAEPIHSMGNDTPLAVLSDKPQLLYNYFRQQFAQVTNPPIDPL REELVMSLTEYIGAVGMNILTPSESHCKMVRLNHPILSNTQLDILCNIRYKGFKTVKLPM LFEVSKGKAGLQESLNNLCKMAEESVTDGANYIVLTDRDVDATHAVIPSLLAVSAVHHHL ISVGKRVQTALIVESGEMREVMHAALLLGFGASALNPYMAFAILDKLVKEKDIQLDYATA EKNYIKSICKGLFKIMSKMGISTIRSYRGAKIFEAVGLSEELSKAYFGGLGSPIGGIRLE EVARDAIAFHDEGVEGMENGELKMENEAPHGNSQFSTFNFPLLKNNGLYAFRKDGEKHAW NPETISTLQLATRLGSYKKFKEFTHLVDNKEKPIFLRDFLGFRRNPISIEQVEPIENILR RFVTGAMSFGSISKEAHEAMAIAMNTIHGRSNTGEGGEDASRFHPLPDGTSMRSAIKQVA SGRFGVTAEYLVNADEIQIKIAQGAKPGEGGQLPGFKVNDVIAKTRHSIPGISLISPPPH HDIYSIEDLAQLIFDLKNVNPQAKISVKLVAESGVGTIAAGVAKAKADLIVISGAEGGTG ASPASSIRYAGISPELGLSETQQTLVLNGLRGQVVLQADGQLKTGRDIIIMALMGAEEYG FATSALIVLGCVMMRKCHQNTCPVGVATQNEELRKRFHGRSEYLINFFTFLAQEVREYLA EMGFTKMDDIIGRTDLIERKSDENDPNPKHALIDFTKLLARVDNSAAIRHVIDQDHGIST VKDVAIIDAAQEAIEHEKEVSLEYTIANTDRATGTMLSGVIAKKHGEKGLPEHTLNVKFK GSAGQSFGAFLVPGVNFKLEGEANDYLGKGLSGGRIAVLPPIRSNFEAEKNTIAGNTLLY GATSGEVYINGRVGERFAVRNSGAVAVVEGVGDHCCEYMTGGRVVVLGQTGRNFAAGMSG GVAYVWNKDGNFDYFCNMEMVELSLIEEAGYRKELHELIRQHYLYTGSKLARTMLDDWNH YVDQFIQIVPIEYKKVLQEEQMRKLQQKIADMQRDY >gi|226332263|gb|ACIC01000057.1| GENE 17 23553 - 24893 993 446 aa, chain + ## HITS:1 COG:VC2374 KEGG:ns NR:ns ## COG: VC2374 COG0493 # Protein_GI_number: 15642371 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Vibrio cholerae # 1 445 1 473 489 428 
47.0 1e-119 MGDPKAFLNIPRQEAGYRPVSERITDYSQVEQTLNTNSRKLQASRCMDCGVPFCHWACPI GNKQPEWQDALFKGKWKEAYEVLSSTCDFPEFTGRICPALCEKSCVLKLSCDQPVTIREN EAAIVEAAFREGYIQVQHPQRNGKKVAVVGAGPAGLVVANQLNLKGYTVTLFDKDEAAGG LLRFGIPNFKLDKNVIDRRMKILAAEGIQFEMGVEIDVNHLPEGFDAYCICTGTPTARDL SIPGRELKGIHFALEMLAQQNRILEGQTFPKDKLVNAKGKKVLVIGGGDTGSDCIGTSVR QGATSVTQIEIMPKPPVGHNPSTPWPQWPVVFKTTSSHEEGCTRRWCLTSNQFLGKNGKV TGVEVEEVEWTPATDGGRPTMNLTGKKEVIEADMVLLAMGFLKPEQPKFAENVFLAGDAA TGASLVVRAMAGGRKAAADIDHYLVK >gi|226332263|gb|ACIC01000057.1| GENE 18 25100 - 26779 1542 559 aa, chain + ## HITS:1 COG:VC0991 KEGG:ns NR:ns ## COG: VC0991 COG0367 # Protein_GI_number: 15641006 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthase (glutamine-hydrolyzing) # Organism: Vibrio cholerae # 1 554 1 554 554 794 68.0 0 MCGIAGILNIKTQTKELRDKALKMAQKIRHRGPDWSGIYVGGSAILAHERLSIVDPQSGG QPLYSPDRKQVLAVNGEIYNHRDIRARYAGQYDFQTGSDCEVILALYKDKGIHFLEEISG IFAFVLYDEEKDEFLIARDPIGVIPLYIGKDKDGKIYFGSELKALEGFCDEYEPFLPGHY FYSKEGKMKRWYTREWTDYESVKDNVAKVSDVKEALEDAVHRQLMSDVPYGVLLSGGLDS SVISAIAKKYAAKRIETDGASDAWWPQLHSFAIGLKGAPDLIKAREVAEYIGTVHHEINY TVQEGLDAIRDVIYFIETYDVTTVRASTPMYLLARVIKSMGIKMVLSGEGADEVFGGYLY FHKAPTPKDFHEETVRKLSKLHMYDCLRANKSLSAWGVEGRVPFLDKEFLDVAMALNPKA KMCPGKEIEKRIVREAFADMLPESVAWRQKEQFSDGVGYSWIDTLREITAAAVSDEQMEH AAERFPINTPLNKEEYYYRSIFEEHFPSESAARSVPSVPSVACSTAEALAWDVAFRNLNE PSGRAVRGVHEEAYAEEVK >gi|226332263|gb|ACIC01000057.1| GENE 19 26979 - 27740 830 253 aa, chain + ## HITS:1 COG:SA0220_2 KEGG:ns NR:ns ## COG: SA0220_2 COG0584 # Protein_GI_number: 15925931 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Staphylococcus aureus N315 # 29 251 1 223 242 76 27.0 6e-14 MKIKRIILMVALIISVASVQAQKSQVIAHRGYWKTAGSAQNSITALQKADSIHCYGSEFD VWLTKDNKLVINHDPVYKMKYMEYSKGDALTGLKLSNGENLPSLEQYLEAGKKCNTKLIL ELKALNSKKRETKAVQEILALVTKLGLENRMEYITFSLHAMKEFIRLAPAGTPVFYLNGE LSPKELKDLGAAGLDYHMGVIKKHPEWIKEAHDLGLKVNVWTVDKAEDMKWLIDQKVDFI TTNEPTVAQEILK Prediction of potential genes in 
microbial genomes Time: Thu May 12 00:40:18 2011 Seq name: gi|226332262|gb|ACIC01000058.1| Bacteroides sp. 1_1_6 cont1.58, whole genome shotgun sequence Length of sequence - 2855 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 241 247 ## BT_0549 putative thioredoxin - Prom 419 - 478 7.6 + Prom 32 - 91 3.0 2 2 Tu 1 . + CDS 197 - 322 58 ## + Prom 439 - 498 4.9 3 3 Op 1 . + CDS 651 - 1454 637 ## COG0253 Diaminopimelate epimerase 4 3 Op 2 . + CDS 1512 - 2744 1187 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase Predicted protein(s) >gi|226332262|gb|ACIC01000058.1| GENE 1 1 - 241 247 80 aa, chain - ## HITS:1 COG:no KEGG:BT_0549 NR:ns ## KEGG: BT_0549 # Name: not_defined # Def: putative thioredoxin # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 80 1 80 98 154 100.0 1e-36 MDIEKQLEVAAQSDHLVLIVFYADWSPHYEWIGPVLRTYERRVIELINVNIEENKTVADS HNIDTVPAFLLLHKGHELWR >gi|226332262|gb|ACIC01000058.1| GENE 2 197 - 322 58 41 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIALRRNFQLFLYIHIATLLYLLVTKQCYNKKVLQNCKQAI >gi|226332262|gb|ACIC01000058.1| GENE 3 651 - 1454 637 267 aa, chain + ## HITS:1 COG:slr1665 KEGG:ns NR:ns ## COG: slr1665 COG0253 # Protein_GI_number: 16332245 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Synechocystis # 5 266 3 278 279 219 42.0 3e-57 MTTKIKFTKMHGAGNDYIYVDTTRYPIAAPEKKAIEWSKFHTGIGSDGLILIGSSDKADF SMRIFNADGSEAMMCGNGSRCVGKYVYEYGLTAKKEITLDTRSGIKVLKLHVEGGKVTAV TVDMGSPLETEAVDFGDQFPFQSTRVSMGNPHLVTFVEDITQINLPEIGPQLENYHLFPD RTNVEFAQIVGKDTIRMRVWERGSGITQACGTGACATAVAAVLHGLAGRKCDIIMDGGTV TIEWEEATGHILMTGPATKVFDGEMEG >gi|226332262|gb|ACIC01000058.1| GENE 4 1512 - 2744 1187 410 aa, chain + ## HITS:1 COG:MTH52 KEGG:ns NR:ns ## COG: MTH52 COG0436 # Protein_GI_number: 15678081 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic 
aminotransferase # Organism: Methanothermobacter thermautotrophicus # 1 406 1 405 410 541 60.0 1e-154 MALVNEHFLKLPGSYLFSDIAKKVNTFKITHPKQDIIRLGIGDVTQPLPKACIEAMHKAV EELASKDTFRGYGPEQGYDFLIEAIIKNDFAPRGIHFSPSEIFVNDGAKSDTGNIGDILR HDNSVGVTDPIYPVYIDSNVMCGRAGVLEEGTGKWSNVTYMPCTSENDFIPEIPDKRIDI VYLCYPNNPTGTTLTKPELKKWVDYALANDTLILFDAAYEAYIQDADVPHSIYEIKGAKK CAIEFRSFSKTAGFTGVRCGYTVVPKELTAATLEGDRIPLNKLWNRRQCTKFNGTSYITQ RAAEAVYSTEGKAQIKETINYYMSNAKIMKEGLEATGLKVYGGVNAPYLWVKTPNGLSSW RFFEQMLYEANVVGTPGVGFGPSGEGYIRLTAFGDHNDCMEAMRRIKNWL Prediction of potential genes in microbial genomes Time: Thu May 12 00:40:27 2011 Seq name: gi|226332261|gb|ACIC01000059.1| Bacteroides sp. 1_1_6 cont1.59, whole genome shotgun sequence Length of sequence - 14567 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 7, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 78 - 725 704 ## BT_0546 hypothetical protein + Prom 760 - 819 1.6 2 2 Op 1 24/0.000 + CDS 875 - 1231 493 ## COG0347 Nitrogen regulatory protein PII + Prom 1235 - 1294 3.6 3 2 Op 2 . + CDS 1314 - 2771 1337 ## COG0004 Ammonia permease 4 2 Op 3 . + CDS 2827 - 5016 2452 ## COG3968 Uncharacterized protein related to glutamine synthetase + Term 5102 - 5144 5.4 - Term 5013 - 5053 5.4 5 3 Tu 1 . - CDS 5138 - 5716 637 ## BT_0542 hypothetical protein - Prom 5829 - 5888 5.7 + Prom 5676 - 5735 7.7 6 4 Op 1 11/0.000 + CDS 5889 - 8006 2013 ## COG0855 Polyphosphate kinase 7 4 Op 2 . + CDS 8003 - 8929 978 ## COG0248 Exopolyphosphatase + Prom 9681 - 9740 3.9 8 5 Tu 1 . + CDS 9809 - 10561 359 ## COG0500 SAM-dependent methyltransferases + Term 10597 - 10660 18.0 - Term 10589 - 10641 18.5 9 6 Tu 1 . - CDS 10673 - 12532 1660 ## COG0471 Di- and tricarboxylate transporters - Prom 12589 - 12648 8.6 + Prom 12928 - 12987 9.7 10 7 Tu 1 . 
+ CDS 13073 - 14500 1177 ## COG0642 Signal transduction histidine kinase Predicted protein(s) >gi|226332261|gb|ACIC01000059.1| GENE 1 78 - 725 704 215 aa, chain + ## HITS:1 COG:no KEGG:BT_0546 NR:ns ## KEGG: BT_0546 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 215 27 241 241 400 100.0 1e-110 MAQDKVEASVGADLVSGYIWRGQDLGGVSIQPSLGISYKGLSLGAWGSVGFESSDTKEFD LTLGYSTGGFSISVTDYWFNTQVPTNKYFKYGAHSTAHVFEAQVGYDFGPLAINWYTNFA GADGVKENGKRAYSSYLALSAPFSLGGLDWTVDVGMVPWETTFYNGYTSGFCVSDISLGA VKEIRVTDSFSVPAFAKVSVNPRTEGAYFTFGLSF >gi|226332261|gb|ACIC01000059.1| GENE 2 875 - 1231 493 118 aa, chain + ## HITS:1 COG:aq_109 KEGG:ns NR:ns ## COG: aq_109 COG0347 # Protein_GI_number: 15605696 # Func_class: E Amino acid transport and metabolism # Function: Nitrogen regulatory protein PII # Organism: Aquifex aeolicus # 1 111 1 112 112 93 48.0 8e-20 MKKIEAIIRKTKFEDVKDVLLEADIEWFSYYDVRGIGKARQGRIYRGVVYDTSTIERILV SIVVRDKNAEKTVQAIIKAAQTGEIGDGRIFVIPIEDAIRIRTAERGDIALYNAEQER >gi|226332261|gb|ACIC01000059.1| GENE 3 1314 - 2771 1337 485 aa, chain + ## HITS:1 COG:sll0108 KEGG:ns NR:ns ## COG: sll0108 COG0004 # Protein_GI_number: 16331833 # Func_class: P Inorganic ion transport and metabolism # Function: Ammonia permease # Organism: Synechocystis # 58 484 68 486 507 356 48.0 7e-98 MDKTYKRHSFTKLWIATAMFIFCCFATSAFAQEAVDADTFMTTTETITEVKTTPEIAETA TASAETTPDTIGELALGLNTVWMLLAAMLVFFMQPGFALVEAGFTRVKNTANILMKNFVD FMFGSLLYWFIGFGLMFGAGGLIGMPHFFDLSFLDSDLPREGFLVFQTVFCATAATIVSG AMAERTKFSMYLVYTIFISVLIYPISGHWTWGGGWLMNGEEGSFMMNLFGTTFHDFAGST IVHSVGGWIALVGAAILGPRIGKYGKDGKSRAIPGHNLTVAALGVFILWFGWFGFNPGSQ LAASTEADAMAISHVFLTTNLAACAGGFFALAMSWMKYGKPSLSLTLNGVLAGLVGITAG CDAVAPSGAVLIGAICGVVMIFAVDFIDKVLKIDDPVGASSVHGVCGFLGTVLTGLFSTS EGLFYGHGAGFLGAQLFGAVVVGVWAAGMGFIIFKVLDKIHGLRVPARVEEEGLDIYEHG ESAYN >gi|226332261|gb|ACIC01000059.1| GENE 4 2827 - 5016 2452 729 aa, chain + ## HITS:1 COG:DR2033 KEGG:ns NR:ns ## COG: DR2033 COG3968 # Protein_GI_number: 15807027 # Func_class: R General 
function prediction only # Function: Uncharacterized protein related to glutamine synthetase # Organism: Deinococcus radiodurans # 1 729 69 787 787 595 43.0 1e-170 MSKLRFRVVETAFKKKAVEVATPAERPSEYFGKYVFNKEKMFKYLPSKVYNALIDAIDNG APLDRSIADEVAAGMKKWAVEMGVTHYTHWFAPLTEGTAEKHDAFVEHDGKGGMMEEFTG KLLVQQEPDASSFPNGGIRNTFEARGYSAWDPSSPAFIVDDTLCIPTVFIAYTGEALDYK APLLKALRAVDKAATAVCHYFNPEVKKVVAYLGWEQEYFLVDEGLYAARPDLLMTGRTLM GHDSAKNQQLEDHYFGAIPTRVAAFMKDLEIEALKLGIPVKTRHNEVAPNQFELAPIFEE CNLANDHNLLIMSLMRKVSRRHGFRVLLHEKPFKGVNGSGKHNNWSLGTDTGILLMGPGK TPEDNLRFVTFVVNTLMAVYRHNGLLKASISSATNAHRLGANEAPPAIISSFLGKQLSQV LDHIENSTKDDLINLSGKQGMKLDIPQIPELLIDNTDRNRTSPFAFTGNRFEFRAVGSEA NCASAMIALNAAVAEQLMKFKKDVDALIEKGEPKVSAILEVIRGYIKECKAIHFDGNGYS DEWKVEAARRGLDCETSVPVIFDNYLKPETIAMFEAIGVMTKKELEARNEVKWETYTKKI QIEARVLGDLAMNHIIPVATQYQTDLINNVYKMQSLFPADKAARLSAKNLELIEEIADRT AFIKEHVDAMIEARKVANKIESEREKAIAYHDTIVPALEEIRYHIDKLELIVDNQMWTLP KYRELLFVR >gi|226332261|gb|ACIC01000059.1| GENE 5 5138 - 5716 637 192 aa, chain - ## HITS:1 COG:no KEGG:BT_0542 NR:ns ## KEGG: BT_0542 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 192 1 192 192 335 100.0 6e-91 MKTMNYSLIRILFALVIGLVLVLWPNTAASYIVITIGVAFLIPGVISLFSFFGRKRSEHE PAPRFPIEGIGSLLFGLWLIVMPEFFADVLMFLLGFILIMGGVQQIASLSMARRWTPVPG IFYLVPSLILIAGVIALFNPTGVRNTAFIIIGISSLVYSLSELINWFKFARRRPKTPVTP HDEDIEDAKIIE >gi|226332261|gb|ACIC01000059.1| GENE 6 5889 - 8006 2013 705 aa, chain + ## HITS:1 COG:ECs3363 KEGG:ns NR:ns ## COG: ECs3363 COG0855 # Protein_GI_number: 15832617 # Func_class: P Inorganic ion transport and metabolism # Function: Polyphosphate kinase # Organism: Escherichia coli O157:H7 # 12 669 7 668 688 500 39.0 1e-141 MKSSETKKKCPYVERDISWMYFNQRILLEAARPEVPLLERLTFLGIYSNNLDEFFRVRVA TLNRIIEYADKNIQKEQETATRTLKQISKLHNHFYEQFEETFASIMEELKKENICVIKDT EMTDDQKTFITSFYRNKLNGSTTPLFLNGARPLDDQTDEDIYLAIRLLRKDETGKIKEKD YAVIELPTEEYGRFIQLPDSEGKTYLMFLDDVIRYCLPLIFVGMKYTDYEAYTFKFTKDA EMEIDSDLRTGVLQKISKGVKSRKRGEPIRFVYDEQIPKDLLKRLVDRLNVDKNDTRVAG 
GRYHNFKDLMKFPVCGRYDLKYPVWEPVFKPELNGKESLLTLIRQKDRSLHYPFHSFDTF IRVLREAAISKEVKSIKMTLYRLAKESKVIKALICAAKNGKKVTVVIELLARFDEASNIN WSKRMQDAGIHVIFGVEGLKIHSKLVHIGTRHGDIACISTGNFHEGNARMYTDYTIMTAH RPLVREVNAVFDFIEKPYTPVNFKELLVSPNDMRKRLIALISKEIKNKEQGKEAYILAKV NHITDRALIEKLYEASTAGVQVELVVRGNCSLVTGIAGISENIHINGIIDRYLEHSRIFI FANGGEEKYYIGSADWMPRNLDNRIEVLAPVYDKEIQADLKRIVCYGYRDTAKGRIVDGT GENRPWENRPSTLCKDFTPALAEAETATVVFRSQEELYRKYKNTL >gi|226332261|gb|ACIC01000059.1| GENE 7 8003 - 8929 978 308 aa, chain + ## HITS:1 COG:BH1393 KEGG:ns NR:ns ## COG: BH1393 COG0248 # Protein_GI_number: 15613956 # Func_class: F Nucleotide transport and metabolism; P Inorganic ion transport and metabolism # Function: Exopolyphosphatase # Organism: Bacillus halodurans # 4 307 1 316 518 106 26.0 6e-23 MISMKKVNYAAIDIGSNAVRLLIKCVNEENAPELMSKVQLIRIPLRLGEDAFTMGVISAE KEKKLIRLMKAYKQLMKIYDVVDYRACATSAMRDARNGKDITRQIARKTGIRVDIIDGQE EAHIVYDNHIEQLFASGQNYLYVDVGGGSTEINLISNGELKNSRSYNIGTVRMLSGMVKE EEKEALRTDLIGLAAEYAPISIIGSGGNINKLFRLADKKDKKASLLPVESLREIYEALQA LPAEQRIEQYKLKPDRADVIVPAAEIFLEVATNVKATGIIVPTIGLSDGIIDSLYTQKMN TPVTPQIQ >gi|226332261|gb|ACIC01000059.1| GENE 8 9809 - 10561 359 250 aa, chain + ## HITS:1 COG:BH3955 KEGG:ns NR:ns ## COG: BH3955 COG0500 # Protein_GI_number: 15616517 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Bacillus halodurans # 13 250 11 252 255 118 30.0 1e-26 MNSDYKVGDLIYDANIYDGMNTDLSDLQFYKRWLPQSKGARILELCCGTGRLTLPIARDG YNISGVDYTPSMLAQARLKASEAGLEINFIEADIRTLNLQEKYDLIFIPFNSIHHLYKNE DLFKVFNVVKNHLKDGGLFLLDCFNPNIQYIVEGEKEQKEIAAYTTDDGREVLIKQRMRY ENKTQINRIEWHYYINGKFNSIQNLDMRMFFPQELDSYLKWNGFNIIHKYGGFEEEAFKD NSEKQVFVCQ >gi|226332261|gb|ACIC01000059.1| GENE 9 10673 - 12532 1660 619 aa, chain - ## HITS:1 COG:ECs3176 KEGG:ns NR:ns ## COG: ECs3176 COG0471 # Protein_GI_number: 15832430 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: 
Escherichia coli O157:H7 # 1 619 5 610 610 376 37.0 1e-104 MLITIIILVLSAIFFVNGKIRSDLVALCALVALLIFQILTPDEALSGFSNSVVIMMIGLF VVGGAIFQTGLAKMISSRILKLAGNSEIRLFLLVMLVTSAIGAFVSNTGTVALMLPIVVS LAMSAKMNPSRLLMPLAFASSMGGMMTLIGTPPNLVIQNTLTSAGFEPLSFFTFLPVGLV SVVVGTLVLMPLSKWFLSKKGQKDDNSRSGKSLKQLVNEYGLSSNLFRMQVIKDSRLLGK TILDLDIRRKYGLNIMEVRRGDASQHRFLKTITQKLAEPDTMLEAEDILYVTGEFDKVQQ FAEDYLLDILDDHATEETRSTTNSLDFYDIGIAEIVLMPASNLVNQTIKESGFRDKFNVN VLGIRRKKEYLLQDLGNERIHSGDVLLVQGTWSNIARLSKEDSDWVVLGQPLAEAAKVTL DYKAPVAAAIMVLMVAMMVFDFIPVAPVTAVMIAGILMVLTGCFRNVEAAYKTINWESIV LIAAMLPMSLALEKTGASEYISNTLVSGLGSYGPVALMAGIYFTTSLMTMFISNTATAVL LAPIALQSAMQIGVSPVPFLFAVTVGASMCFASPFSTPPNALVMPAGQYTFMDYIKVGLP LQIIMGIVMILVLPLIFPF >gi|226332261|gb|ACIC01000059.1| GENE 10 13073 - 14500 1177 475 aa, chain + ## HITS:1 COG:slr2098_3 KEGG:ns NR:ns ## COG: slr2098_3 COG0642 # Protein_GI_number: 16330584 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 212 474 9 276 280 183 39.0 5e-46 MVKDIQDSFRYYYWNKESELQSGIKREEAVGCTDYEIYGEERGRRYRDVDESLVQAGKVY RAEESYSTVDGIVHDTIAVKSIIKWKEKKKWLLVTRWDITRLKTYERELIAAKEELEKAL NKQKLALKSVNFGLIYIDKNYLVQWEETTQIASLVKGRHYTPGQICYKTSALRDTPCGQC AFQKAIEQGKIIRHTIRIDQVDFEVTATPVYDNEGTEIIGGLLRFEDITEKVKMDKMLQE AKEKAEESNRLKSAFLANMSHEIRTPLNAIIGFSDMICQTGEEEEKQEYMKIVSSNNELL LQLIDDILDLSKIEAGTMEFTLAPTDINDLMEGICRQMQEKNTSPDVAITFTEKAEECIL NTDRVRLSQVIINFTNNAMKFTPKGSIQMGYRIDEAKDEIYFYVKDTGIGIPADKINEVF ERFVKLNTFAKGTGLGLAICRVIVERLGGTIGADSKEGEGSCFWFRLPITQKSNN Prediction of potential genes in microbial genomes Time: Thu May 12 00:40:39 2011 Seq name: gi|226332260|gb|ACIC01000060.1| Bacteroides sp. 1_1_6 cont1.60, whole genome shotgun sequence Length of sequence - 14570 bp Number of predicted genes - 15, with homology - 13 Number of transcription units - 4, operones - 4 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 26 - 1171 1146 ## COG1979 Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family 2 1 Op 2 . - CDS 1249 - 1389 176 ## + Prom 1668 - 1727 9.3 3 2 Op 1 . + CDS 1748 - 2932 1363 ## COG0133 Tryptophan synthase beta chain 4 2 Op 2 35/0.000 + CDS 2977 - 4383 1321 ## COG0147 Anthranilate/para-aminobenzoate synthases component I 5 2 Op 3 13/0.000 + CDS 4452 - 5018 641 ## COG0512 Anthranilate/para-aminobenzoate synthases component II 6 2 Op 4 21/0.000 + CDS 5023 - 6021 1085 ## COG0547 Anthranilate phosphoribosyltransferase + Term 6048 - 6090 3.8 7 2 Op 5 9/0.000 + CDS 6100 - 6882 795 ## COG0134 Indole-3-glycerol phosphate synthase 8 2 Op 6 . + CDS 6913 - 7536 482 ## COG0135 Phosphoribosylanthranilate isomerase 9 2 Op 7 . + CDS 7555 - 8325 338 ## PROTEIN SUPPORTED gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc + Prom 8416 - 8475 7.8 10 3 Op 1 . + CDS 8591 - 9631 718 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D + Term 9652 - 9711 0.1 + Prom 9692 - 9751 5.1 11 3 Op 2 . + CDS 9844 - 12306 1557 ## BT_0525 outer membrane protein, function unknown + Term 12320 - 12363 8.2 + Prom 12329 - 12388 12.0 12 4 Op 1 . + CDS 12411 - 12551 117 ## 13 4 Op 2 . + CDS 12548 - 12913 269 ## COG3947 Response regulator containing CheY-like receiver and SARP domains 14 4 Op 3 . + CDS 12926 - 14086 748 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 15 4 Op 4 . + CDS 14102 - 14570 62 ## BT_0522 hypothetical protein Predicted protein(s) >gi|226332260|gb|ACIC01000060.1| GENE 1 26 - 1171 1146 381 aa, chain - ## HITS:1 COG:alr4566 KEGG:ns NR:ns ## COG: alr4566 COG1979 # Protein_GI_number: 17232058 # Func_class: C Energy production and conversion # Function: Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family # Organism: Nostoc sp. 
PCC 7120 # 1 377 1 380 384 416 53.0 1e-116 MENFIFQNPVKLIMGHGMIARLSKEIPSDKRIMITFGGGSVKKNGVYDQVKEALKDHFTI EFWGIEPNPAIETLRKAIALGKEQKVDYLLAVGGGSVIDGTKLISAGLLYDGDAWDLVLA GRPVTKTVPLSTVLTLPATGSEMNNGAVISRHETKEKYPFYSNFPLFSILDPEVTFTLPP HQVACGLADTFVHVMEQYMTVTGQSRVMDRWAEGILQTLVEIAPKIRENQHDYQLMADFM LSATMALNGFIAMGVSQDWATHMIGHEITALHGLTHGHTLVIILPATLRVLREAKGDKLV QYGERVWGITSGTKEERIDEAIDRTEEFFRSLGLTTRLHEENIGEETVLEIERRFNERGA KYGENENVTGAVARKILEAAL >gi|226332260|gb|ACIC01000060.1| GENE 2 1249 - 1389 176 46 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKQRKVLVGIAIAIFIILLLYWLLVAEDMKPWLSAMVPSALQQFIA >gi|226332260|gb|ACIC01000060.1| GENE 3 1748 - 2932 1363 394 aa, chain + ## HITS:1 COG:TM0138 KEGG:ns NR:ns ## COG: TM0138 COG0133 # Protein_GI_number: 15642912 # Func_class: E Amino acid transport and metabolism # Function: Tryptophan synthase beta chain # Organism: Thermotoga maritima # 10 390 3 380 389 456 61.0 1e-128 MKSFLVDQDGYYGEFGGAYVPEILHKCVEELKNTYLEVLESEDFKKEFDQLLRDYVGRPS PLYLARRLSEKYGCKMYLKREDLNHTGAHKINNTIGQILLARRMGKKRIIAETGAGQHGV ATATVCALMDMECIVYMGKTDVERQHINVEKMKMLGATVIPVTSGNMTLKDATNEAIRDW CCHPADTYYIIGSTVGPHPYPDMVARLQSVISEEIKKQLLEKEGRDYPDYLIACVGGGSN AAGTIYHYINDERVGIILAEAGGKGIETGMTAATIQLGKMGIIHGARTYVIQNEDGQIEE PYSISAGLDYPGIGPIHANLAAQRRATVLAVNDDEAIEAAYELTKLEGIIPALESAHALG ALKKLKFKPEDVVVLTVSGRGDKDIETYLSFNEK >gi|226332260|gb|ACIC01000060.1| GENE 4 2977 - 4383 1321 468 aa, chain + ## HITS:1 COG:TM0142 KEGG:ns NR:ns ## COG: TM0142 COG0147 # Protein_GI_number: 15642916 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component I # Organism: Thermotoga maritima # 7 467 4 457 461 273 36.0 5e-73 MKTFSYTAHSKQVLGDMHTPVSIYLKVRDMYPQSALMESSDYHAGENSLSFIALCPLASI GVNSGIVTASYPDNSRKEEPLTQSFTVEKAMNQFISQFQVTGENKNVCGLYGYTTFNAVK YFEHIPVKESHDEQNDAPDLLYILYKYIIVFNHFKNELTLVEMLGEGEESGLPELEAAIE NRNYASYNFSVTGPVSSPISDEEHKANVRKGIAHCMRGDVFQIVLSRRFVQPYAGDDFKV YRALRSINPSPYLFYFDFGGYRIFGSSPETHCKIEDGRAYIDPIAGTTRRTGDTVKDREL 
AEALLADPKENAEHVMLVDLARNDLSRNCHDVRVLFYKEPQYYSHVIHLVSRVSGQLNEG ADKIKTFIDTFPAGTLSGAPKVRAMQLISEIEPHNRGAYGGCIGFIGLNGELNQAITIRT FVSRNNELWFQAGGGIVARSQDEYELQEVNNKLGALKKAIDLAVNLKN >gi|226332260|gb|ACIC01000060.1| GENE 5 4452 - 5018 641 188 aa, chain + ## HITS:1 COG:PA0649 KEGG:ns NR:ns ## COG: PA0649 COG0512 # Protein_GI_number: 15595846 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component II # Organism: Pseudomonas aeruginosa # 3 188 2 192 201 189 50.0 3e-48 MKILLLDNYDSFTYNLLHAVKELGATDVEVVRNDQINLDEVERFDKIILSPGPGIPEEAG LLLPIIKRYAPTKSILGVCLGHQAIGEAFGARLENLKEVYHGVQTPISILQKDVLFEGLG KEIPVGRYHSWVVSREGFPECLEITAESQEGQIMALRHKTYDVHGIQFHPESVLTPQGKE IIKNFLNE >gi|226332260|gb|ACIC01000060.1| GENE 6 5023 - 6021 1085 332 aa, chain + ## HITS:1 COG:MJ0234 KEGG:ns NR:ns ## COG: MJ0234 COG0547 # Protein_GI_number: 15668409 # Func_class: E Amino acid transport and metabolism # Function: Anthranilate phosphoribosyltransferase # Organism: Methanococcus jannaschii # 1 329 2 332 336 210 35.0 3e-54 MKQILYKLFEHQYLGRDEARTILQNIAQGKYNDVQVASLITVFLMRNISVEELCGFRDAL LEMRVPVDLSEFAPIDIVGTGGDGKNTFNISTAACFTVAGAGIPVVKHGNYGATSVSGAS NVMEQHGVKFTSDVDQMRRSMEQCNIAYLHAPLFNPALKAVAPIRKGLAVRTFFNMLGPL VNPVLPTYQLLGVYNLPLLRLYTYTYQESKTKFAVVHSLDGYDEISLTNEFKVATCGNEK IYTPEGLGFARYQDTDLDGGQTPEDAAKIFDNIMNNTATEAQKNVVVINAAFAIQVVRPE KTIEECIALAKESLESGRALATLKKFIELNNK >gi|226332260|gb|ACIC01000060.1| GENE 7 6100 - 6882 795 260 aa, chain + ## HITS:1 COG:XF0213 KEGG:ns NR:ns ## COG: XF0213 COG0134 # Protein_GI_number: 15836818 # Func_class: E Amino acid transport and metabolism # Function: Indole-3-glycerol phosphate synthase # Organism: Xylella fastidiosa 9a5c # 1 256 1 261 264 197 42.0 2e-50 MKDILSEIIANKRFEVDLQKQAISIEQLQEGINEVPASRSMKRALASSDSGIIAEFKRRS PSKGWIKQEARPEEIVPSYLAAGASALSILTDEKFFGGSLKDIRTARPLVDVPIIRKDFI IDEYQLYQAKIVGADAVLLIAAALKQEKCQELAEQAHELGLEVLLEIHSAEELPYINSKI DMIGINNRNLGTFFTDVENSFRLAGQLPQDAVLVSESGISDPEVVKRLRTAGFRGFLIGE 
TFMKTPQPGETLQNFLKAIQ >gi|226332260|gb|ACIC01000060.1| GENE 8 6913 - 7536 482 207 aa, chain + ## HITS:1 COG:all5288 KEGG:ns NR:ns ## COG: all5288 COG0135 # Protein_GI_number: 17232780 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylanthranilate isomerase # Organism: Nostoc sp. PCC 7120 # 7 207 3 208 217 117 33.0 1e-26 MINGKIIKVCGMREAQNIRDVESLQRVDMMGFIFYPKSPRYIYELPAYLPVHARRVGVFV NEDKDVITMYADRFGLEYIQLHGKESPEYCQSLRTSGLKIIKAFSVARPKDLNHVSEYEK TCNLFLFDTKCEQYGGSGNQFDWNILHTYNGQVPFLLSGGINSHSANALKAFDHPRLAGY DLNSRFELKPGEKDPERIRIFLNELKS >gi|226332260|gb|ACIC01000060.1| GENE 9 7555 - 8325 338 256 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc [Roseobacter sp. AzwK-3b] # 1 240 1 242 263 134 36 3e-31 MNRINQLFNSNKKDILSIYFCAGNPTLDGTVNVIRTLEKHGVSMIEVGIPFSDPMADGIV IQNAATQALRNGMSLKILFEQLRNIRQEVSIPLVFMGYLNPIMQFGFENFCRKCVECGID GVIIPDLPFRDYQDHYRIIAERYGIKVIMLITPETSEERVREIDAHTDGFIYMVSSAATT GAQQDFNEQKRAYFKKIEDMNLRNPLMVGFGISNKATFQAACEHASGAIIGSRFVTLLEE EKDPEKAILKLKDALK >gi|226332260|gb|ACIC01000060.1| GENE 10 8591 - 9631 718 346 aa, chain + ## HITS:1 COG:YPO2161 KEGG:ns NR:ns ## COG: YPO2161 COG0252 # Protein_GI_number: 16122393 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Yersinia pestis # 3 344 1 335 338 295 46.0 8e-80 MRVETPSVLLIYTGGTIGMIENPETGALENFNFDHLLKHVPELKRFNYRISSYQFDPPLD SSDMEPAYWAKLVKIINYNYDYFDGFVILHGTDTMAYTASALSFMLENLSKPVILTGSQL PIGTLRTDGKENLITAIEIAAAKNPDGTAIVPEVCIFFENHLMRGNRTTKINAENFNAFR SFNYPPLARVGIHIKYEPNLIRKPDPDKPLKPHYLFDTNVVILTLFPGIQEEIIHSLLHV PGLKAVVMKTFGSGNAPQKEWFIRELKEATDRGIIIVNITQCASGAVEMGRYETGMHLLE AGVISGYDSTPECAVTKLMFLLGHGLSNKDIRYKMNSCLIGEITKP >gi|226332260|gb|ACIC01000060.1| GENE 11 9844 - 12306 1557 820 aa, chain + ## HITS:1 COG:no KEGG:BT_0525 NR:ns ## KEGG: BT_0525 # Name: not_defined # Def: outer membrane protein, 
function unknown # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 820 1 820 820 1481 100.0 0 MMKKTILLTSIIAIAIVSMLSSCVDSEKDLYDPSYQTANPMGDGFAAPDGFDWNMTTTSI LNIEIDDELYNQIEILDANPFSTSDYHILAKGVAKKGQAFSQEINYTEGTNYLYIRKTDS RSRVSISTWDVSKNKEFVGSRTTRVAKATIGSYNIPEKYPEETYDTTGAIELTGNTNWNQ SNHHLEAGKSYIIKNKFNGEINHTSGYLNGGRFTIFVEGEWTPSQNQIQSADIIILKGGK INTDSFTSFLIADNSILTIQSGGSLIGNNINLAAIGVLLKNFGTISVNSMKDLNTTSILY NAPKATINVTGKSVASWEQSVFTKGAIYNFGELTIQEGALKFNSQDATCYFYNGTEATIN TPTFIIGGIGVNDGTVNAQKISNDNGGNPTFTNNCSLYAQNSFEFGGTSGTIIMNKGILA GGVENGTFIAIPSFKCGNSGSTFELNNGSMIKAEIMDIPNVTFKAAGTRSLIKSTKSIST GWTTKFNGNLDIECPEGEFAKGVPANNPNYIMENSVELYIPNGSKTIITSCGELSEIPDP TPDPEDPEFPIEVEDNKDYTYLFEDQWPLYGDYDMNDIVLTIQKRKIYTNKENKVTKFEL SIDLSAAGATKSIGAAIMLDNVPANAITQSVEFSDNTLAKNFNLNNNNIESGQDYAVIPL FDDAHKVLGRDRYEQINTVSDYAGNTKPKNISFSITFNNPTISADAFNVNNLNVFIIVDG NRNPRKEIHVAGYQPTKLANTDLFGGNNDNSHHGSKKYYISKENLAWGIMVPSNFKWPLE YVNIKTAYSQFSDWVTSGGTENEKWWNDFDVNKVFQTNKN >gi|226332260|gb|ACIC01000060.1| GENE 12 12411 - 12551 117 46 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNFTSANWKFILKLYIYLHEVKQRQEIQFNLYKSSSKASDRKVLFV >gi|226332260|gb|ACIC01000060.1| GENE 13 12548 - 12913 269 121 aa, chain + ## HITS:1 COG:DR2556 KEGG:ns NR:ns ## COG: DR2556 COG3947 # Protein_GI_number: 15807540 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver and SARP domains # Organism: Deinococcus radiodurans # 4 109 3 108 344 79 40.0 1e-15 MKKKILLVDDKATIGKVASIYLGKDYDVMYLENPIKAIDWLNEGNVPDLIISDIRMPLMR GDEFLRYMKANELFKSIPIVMLSSEESTTERIKLLEEGAEDYILKPFNPLELKIRIKKII D >gi|226332260|gb|ACIC01000060.1| GENE 14 12926 - 14086 748 386 aa, chain + ## HITS:1 COG:all4420 KEGG:ns NR:ns ## COG: all4420 COG2148 # Protein_GI_number: 17231912 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Nostoc sp. 
PCC 7120 # 142 383 250 441 445 142 33.0 1e-33 MQYFVYIGRHNKTIELFSKLSTGVFYAAPNCYKATKVLDKIREKYDAALFIEQVELSKDI ADIRSIRKMYPGLYMVLVIDSITKEEATEYLKAGVNNTIKYETNSEVLKDLSTFLIRRKE QKIEALQLKTQNLNAFRLPSWKRIFDIFFSGIALLFLSPLLIATSIAIRLESKGPIIYKS KRVGSNYQIFDFLKFRSMYTDADKHLKDFNALNQYQQEEQEEEDIWGEESNVSEEADEEE ILLISDDFVISEEDYIHKKSKEKSNAFVKLENDPRITKIGRFIRKYSIDELPQLINILKG DMSIVGNRPLPLYEAELLTSDEHIDRFMGPAGLTGLWQVEKRGEAGKLSAEERKQLDIQY AKNFSFGLDIKIILKTVTAFVQKENV >gi|226332260|gb|ACIC01000060.1| GENE 15 14102 - 14570 62 156 aa, chain + ## HITS:1 COG:no KEGG:BT_0522 NR:ns ## KEGG: BT_0522 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 156 1 156 304 292 100.0 2e-78 MDNFIYIADDFFFLCIGLLLLYLFILAVASHFKHSNYPKARKQYQCAILVPYGSLLPSIY QQEAYEFITYKDLLEAIDSLDKERYELVILLSDKASALSPHFLEKIYNAYDAGIQAMQLH TIINNRKGVRKRFHALSEEINNSLFRSGNTQIGFSS Prediction of potential genes in microbial genomes Time: Thu May 12 00:41:28 2011 Seq name: gi|226332259|gb|ACIC01000061.1| Bacteroides sp. 1_1_6 cont1.61, whole genome shotgun sequence Length of sequence - 115178 bp Number of predicted genes - 99, with homology - 97 Number of transcription units - 45, operones - 22 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 13 - 441 146 ## BT_0522 hypothetical protein 2 1 Op 2 . + CDS 423 - 1046 574 ## COG0110 Acetyltransferase (isoleucine patch superfamily) + Term 1201 - 1247 6.4 3 2 Op 1 . - CDS 1630 - 2241 452 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 4 2 Op 2 . - CDS 2231 - 2935 579 ## BT_0519 hypothetical protein - Prom 3127 - 3186 3.6 + Prom 3030 - 3089 6.7 5 3 Op 1 . + CDS 3139 - 4314 1075 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) 6 3 Op 2 . + CDS 4238 - 4432 93 ## 7 3 Op 3 . + CDS 4498 - 5004 635 ## COG0716 Flavodoxins 8 3 Op 4 . 
+ CDS 5038 - 5400 423 ## BT_0516 hypothetical protein + Term 5471 - 5518 1.1 + Prom 5410 - 5469 7.1 9 4 Tu 1 . + CDS 5554 - 6579 783 ## BT_0515 terminal quinol oxidase, subunit, putative (DoxD-like) 10 5 Tu 1 . - CDS 6976 - 7107 58 ## gi|253568677|ref|ZP_04846088.1| predicted protein - Prom 7281 - 7340 8.6 11 6 Tu 1 . + CDS 7282 - 8142 322 ## BT_0514 hypothetical protein + Prom 8189 - 8248 4.2 12 7 Op 1 . + CDS 8343 - 8774 187 ## BT_0513 hypothetical protein 13 7 Op 2 . + CDS 8819 - 9388 340 ## BT_0512 hypothetical protein 14 7 Op 3 . + CDS 9427 - 9888 452 ## BT_0511 hypothetical protein 15 7 Op 4 . + CDS 9891 - 10991 730 ## COG0535 Predicted Fe-S oxidoreductases 16 8 Tu 1 . - CDS 11589 - 12245 562 ## COG1132 ABC-type multidrug transport system, ATPase and permease components - Prom 12282 - 12341 2.0 17 9 Tu 1 . - CDS 12397 - 12741 291 ## BT_0507 TetR/AcrR family transcriptional regulator - Prom 12804 - 12863 4.0 - Term 12832 - 12873 8.1 18 10 Tu 1 . - CDS 12930 - 15233 2189 ## COG3525 N-acetyl-beta-hexosaminidase - Prom 15280 - 15339 6.5 19 11 Tu 1 . - CDS 15412 - 15981 334 ## COG4430 Uncharacterized protein conserved in bacteria - Prom 16049 - 16108 4.4 - Term 16176 - 16225 9.1 20 12 Op 1 . - CDS 16262 - 18601 2262 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins - Prom 18627 - 18686 2.8 21 12 Op 2 . - CDS 18698 - 19015 195 ## BT_0503 hypothetical protein - Prom 19078 - 19137 3.5 - Term 19077 - 19119 6.6 22 13 Op 1 . - CDS 19161 - 21488 1997 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins - Term 21525 - 21574 4.1 23 13 Op 2 . - CDS 21578 - 21910 113 ## BT_0501 hypothetical protein - Prom 21948 - 22007 4.1 24 14 Op 1 . - CDS 22010 - 23608 1469 ## COG2985 Predicted permease 25 14 Op 2 . - CDS 23657 - 24556 875 ## COG1230 Co/Zn/Cd efflux system component 26 14 Op 3 . - CDS 24606 - 24857 59 ## BF2104 hypothetical protein - Prom 25025 - 25084 8.4 + Prom 24965 - 25024 6.0 27 15 Op 1 . 
+ CDS 25061 - 25945 852 ## BT_0498 hypothetical protein 28 15 Op 2 . + CDS 25962 - 26624 681 ## BT_0497 hypothetical protein 29 15 Op 3 1/0.222 + CDS 26642 - 28789 1719 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins 30 15 Op 4 . + CDS 28832 - 33223 4094 ## COG1429 Cobalamin biosynthesis protein CobN and related Mg-chelatases 31 15 Op 5 . + CDS 33275 - 33976 550 ## BT_0493 hypothetical protein 32 15 Op 6 4/0.000 + CDS 34010 - 34627 655 ## COG0811 Biopolymer transport proteins 33 15 Op 7 . + CDS 34624 - 34950 455 ## COG4744 Uncharacterized conserved protein + Term 34994 - 35040 2.6 + Prom 35071 - 35130 5.6 34 16 Tu 1 . + CDS 35286 - 35552 230 ## BT_0490 hypothetical protein + Term 35582 - 35623 0.5 - Term 35565 - 35616 12.8 35 17 Op 1 8/0.000 - CDS 35751 - 36422 990 ## COG0800 2-keto-3-deoxy-6-phosphogluconate aldolase 36 17 Op 2 . - CDS 36487 - 37512 1267 ## COG0524 Sugar kinases, ribokinase family - Prom 37644 - 37703 7.2 + Prom 37514 - 37573 7.1 37 18 Op 1 . + CDS 37765 - 38829 1087 ## COG1879 ABC-type sugar transport system, periplasmic component 38 18 Op 2 . + CDS 38855 - 40345 1547 ## COG2721 Altronate dehydratase + Term 40380 - 40417 4.2 - Term 40132 - 40180 -1.0 39 19 Tu 1 . - CDS 40414 - 43005 1678 ## BT_1247 hypothetical protein - Prom 43026 - 43085 3.0 40 20 Op 1 . - CDS 43138 - 44733 863 ## BT_1248 putative transport protein 41 20 Op 2 4/0.000 - CDS 44739 - 45806 929 ## COG1566 Multidrug resistance efflux pump 42 20 Op 3 2/0.000 - CDS 45819 - 47138 1007 ## COG1538 Outer membrane protein 43 20 Op 4 . - CDS 47119 - 47562 425 ## COG1846 Transcriptional regulators - Prom 47724 - 47783 5.8 + Prom 47606 - 47665 5.9 44 21 Op 1 . + CDS 47762 - 48361 458 ## BT_1252 hypothetical protein 45 21 Op 2 . + CDS 48362 - 48658 222 ## COG1846 Transcriptional regulators 46 21 Op 3 . + CDS 48683 - 50038 1349 ## COG4452 Inner membrane protein involved in colicin E2 resistance + Term 50119 - 50163 9.3 - Term 50105 - 50153 11.0 47 22 Op 1 . 
- CDS 50163 - 50747 449 ## BT_1255 hypothetical protein 48 22 Op 2 . - CDS 50747 - 51121 255 ## COG3877 Uncharacterized protein conserved in bacteria - Prom 51145 - 51204 3.1 + Prom 51087 - 51146 11.1 49 23 Tu 1 . + CDS 51232 - 51699 448 ## BT_1257 hypothetical protein + Term 51868 - 51921 11.0 - Term 52048 - 52095 -0.9 50 24 Tu 1 . - CDS 52237 - 52680 366 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Prom 52747 - 52806 5.2 + Prom 52638 - 52697 5.9 51 25 Tu 1 . + CDS 52933 - 53913 828 ## COG3049 Penicillin V acylase and related amidases + Prom 53930 - 53989 4.0 52 26 Tu 1 . + CDS 54212 - 56446 1671 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases - Term 56427 - 56479 10.4 53 27 Tu 1 . - CDS 56492 - 56839 352 ## COG1733 Predicted transcriptional regulators - Prom 56965 - 57024 4.9 + Prom 56924 - 56983 6.3 54 28 Tu 1 . + CDS 57007 - 57534 500 ## BT_1263 putative protease I + Term 57626 - 57665 1.1 + Prom 57914 - 57973 5.1 55 29 Tu 1 . + CDS 58004 - 61123 1917 ## COG0642 Signal transduction histidine kinase + Term 61227 - 61262 -0.7 - Term 61051 - 61092 5.5 56 30 Op 1 9/0.000 - CDS 61160 - 61960 607 ## COG3279 Response regulator of the LytR/AlgR family 57 30 Op 2 . - CDS 61953 - 63008 707 ## COG3275 Putative regulator of cell autolysis 58 30 Op 3 9/0.000 - CDS 63025 - 64329 1246 ## COG1538 Outer membrane protein 59 30 Op 4 27/0.000 - CDS 64332 - 67364 2109 ## COG0841 Cation/multidrug efflux pump 60 30 Op 5 . - CDS 67375 - 68445 970 ## COG0845 Membrane-fusion protein - Prom 68503 - 68562 5.6 + Prom 68553 - 68612 4.0 61 31 Tu 1 . + CDS 68641 - 69942 1195 ## COG1757 Na+/H+ antiporter - Term 69779 - 69817 1.8 62 32 Tu 1 . - CDS 69873 - 70016 95 ## - Prom 70038 - 70097 5.2 + Prom 69968 - 70027 4.7 63 33 Tu 1 . + CDS 70051 - 70605 913 ## PROTEIN SUPPORTED gi|29346681|ref|NP_810184.1| 30S ribosomal protein S16 + Term 70649 - 70686 6.2 + Prom 70654 - 70713 9.0 64 34 Op 1 . 
+ CDS 70781 - 71776 859 ## COG1609 Transcriptional regulators 65 34 Op 2 . + CDS 71849 - 73624 1911 ## COG2407 L-fucose isomerase and related proteins + Prom 73655 - 73714 2.5 66 35 Op 1 3/0.000 + CDS 73734 - 74372 480 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 67 35 Op 2 . + CDS 74388 - 75818 1065 ## COG1070 Sugar (pentulose and hexulose) kinases 68 35 Op 3 . + CDS 75826 - 76218 360 ## BT_1276 hypothetical protein 69 35 Op 4 . + CDS 76236 - 77552 1220 ## COG0738 Fucose permease + Term 77612 - 77656 13.2 + Prom 77636 - 77695 8.5 70 36 Op 1 6/0.000 + CDS 77739 - 78290 404 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 71 36 Op 2 . + CDS 78350 - 79336 554 ## COG3712 Fe2+-dicitrate sensor, membrane component + Prom 79455 - 79514 5.7 72 37 Op 1 . + CDS 79534 - 82866 2928 ## BT_1280 hypothetical protein 73 37 Op 2 . + CDS 82880 - 84475 1214 ## BT_1281 hypothetical protein 74 37 Op 3 . + CDS 84499 - 85482 848 ## BT_1282 hypothetical protein 75 37 Op 4 . + CDS 85501 - 86823 857 ## BT_1283 hypothetical protein 76 37 Op 5 . + CDS 86851 - 88377 914 ## BT_1284 putative endo-beta-N-acetylglucosaminidase F1 precursor (mannosyl-glycoprotein endo-beta-N-acetyl-glucosaminidase F1) + Prom 88390 - 88449 4.6 77 38 Tu 1 . + CDS 88574 - 89494 584 ## BT_1285 endo-beta-N-acetylglucosaminidase + Term 89538 - 89582 9.9 78 39 Tu 1 . + CDS 91257 - 92624 595 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 92683 - 92719 -0.1 + Prom 92697 - 92756 4.7 79 40 Tu 1 . 
+ CDS 92802 - 93458 687 ## BT_1287 hypothetical protein + Term 93496 - 93539 6.0 - Term 93568 - 93636 8.0 80 41 Op 1 25/0.000 - CDS 93659 - 95035 1329 ## COG0687 Spermidine/putrescine-binding periplasmic protein 81 41 Op 2 36/0.000 - CDS 95032 - 95829 789 ## COG1177 ABC-type spermidine/putrescine transport system, permease component II 82 41 Op 3 30/0.000 - CDS 95823 - 96623 566 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I 83 41 Op 4 . - CDS 96635 - 98020 1374 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components - Prom 98146 - 98205 9.9 + Prom 98033 - 98092 3.7 84 42 Tu 1 . + CDS 98166 - 99266 1189 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Term 99280 - 99333 14.3 - Term 99267 - 99321 13.6 85 43 Op 1 . - CDS 99376 - 101094 1792 ## COG0058 Glucan phosphorylase 86 43 Op 2 . - CDS 101111 - 101941 957 ## COG0058 Glucan phosphorylase 87 43 Op 3 . - CDS 101979 - 103640 1470 ## COG0438 Glycosyltransferase - Prom 103694 - 103753 5.2 - Term 103759 - 103806 14.1 88 44 Op 1 16/0.000 - CDS 103846 - 104307 649 ## COG0636 F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K 89 44 Op 2 4/0.000 - CDS 104360 - 106183 1990 ## COG1269 Archaeal/vacuolar-type H+-ATPase subunit I 90 44 Op 3 16/0.000 - CDS 106180 - 106785 508 ## COG1394 Archaeal/vacuolar-type H+-ATPase subunit D 91 44 Op 4 16/0.000 - CDS 106826 - 108151 1537 ## COG1156 Archaeal/vacuolar-type H+-ATPase subunit B 92 44 Op 5 . - CDS 108185 - 109942 1928 ## COG1155 Archaeal/vacuolar-type H+-ATPase subunit A 93 44 Op 6 . - CDS 109960 - 110802 769 ## BT_1300 hypothetical protein 94 44 Op 7 . - CDS 110814 - 111404 621 ## BT_1301 ATP synthase subunit E - Prom 111450 - 111509 6.0 95 45 Op 1 . - CDS 111632 - 112078 251 ## BT_1302 hypothetical protein 96 45 Op 2 . - CDS 112096 - 112896 541 ## BT_1303 hypothetical protein 97 45 Op 3 . - CDS 112910 - 113281 303 ## COG1725 Predicted transcriptional regulators 98 45 Op 4 . 
- CDS 113297 - 114112 399 ## BT_1305 hypothetical protein 99 45 Op 5 . - CDS 114116 - 114964 677 ## COG1131 ABC-type multidrug transport system, ATPase component - Prom 115013 - 115072 6.7 Predicted protein(s) >gi|226332259|gb|ACIC01000061.1| GENE 1 13 - 441 146 142 aa, chain + ## HITS:1 COG:no KEGG:BT_0522 NR:ns ## KEGG: BT_0522 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 142 163 304 304 291 100.0 4e-78 MAFDLEWLQKNQRTAKTNLERKLFKQNIYIEYLPDVIVHCDSAPVYPYRRRIRKTLSYLL PSLLEGNWNFCNRIAQQLIPSPMKLCTFVSIWTLLITGYHWTSSLKWWILLLGLAITYSL AIPNYLVEDKKKQKYSIWRKGH >gi|226332259|gb|ACIC01000061.1| GENE 2 423 - 1046 574 207 aa, chain + ## HITS:1 COG:SMb21427 KEGG:ns NR:ns ## COG: SMb21427 COG0110 # Protein_GI_number: 16265003 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Sinorhizobium meliloti # 68 191 27 156 162 87 38.0 2e-17 MEKRTLKQRIKQNPALKQAVHRFIMHPVKTRPNWWIRLFYFVYLKRGKGSVIYRSVRKDL PPFNQFSLGRYSVVEDFSCLNNAVGDLIIGDYTRIGLGNTIIGPVRIGNHVNLAQNITVT GLNHNYQDAEKSIDEQGVSTQPVTIEDDVWVGANSVILPGVTLGKHCVVAAGSVVSRSIP AYSICAGCPAKVIKSYDFATKEWKKVK >gi|226332259|gb|ACIC01000061.1| GENE 3 1630 - 2241 452 203 aa, chain - ## HITS:1 COG:BMEI1582 KEGG:ns NR:ns ## COG: BMEI1582 COG2197 # Protein_GI_number: 17987865 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Brucella melitensis # 133 194 147 208 213 61 56.0 1e-09 MKNNEAIRIAIAETSVIIRGGLTAALKRLPNVKVQPIELLSVEALSDCLRTQYPEMLIVN PTFGDYFDVNKFREETAGKKIRLIALVTSFVDASLLTKYDESISIFDDLDILSKKIHGLL NLASEEEELDNQETLSQREKEIVVCVVKGMTNKEIAENLFLSIHTVITHRRNISKKLQIH SAAGLTIYAIVNKLVTLNDVKDL >gi|226332259|gb|ACIC01000061.1| GENE 4 2231 - 2935 579 234 aa, chain - ## HITS:1 COG:no KEGG:BT_0519 NR:ns ## KEGG: BT_0519 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 
1 234 1 234 234 464 100.0 1e-130 MDNLQKYKPTDKMIDLISDNYSLLQVMSRFGLSLGFGDKTVKEVCELNGVDCRTFLIVVN FMSEGFSRMDGSSEDISIPALIDYLRQAHIYFLDFSLPAIRRKLIEAIDCSQDDVAFLIL KFFDEYTREVRKHMDYEEKTVFKYVDALINGNAPRNYQISTFSKHHDQVGEKLTELKNII IKYCPAKANENLLNAALFDIYACEAGLESHCKVEDYIFVPAILKLERRIRENEK >gi|226332259|gb|ACIC01000061.1| GENE 5 3139 - 4314 1075 391 aa, chain + ## HITS:1 COG:CAC2445 KEGG:ns NR:ns ## COG: CAC2445 COG0138 # Protein_GI_number: 15895710 # Func_class: F Nucleotide transport and metabolism # Function: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Clostridium acetobutylicum # 4 391 5 391 391 538 65.0 1e-153 MANELELKYGCNPNQKPARIFMKEGELPIEVLNGRPGYINLLDAFNSWQLVKELKEATGL PAAASFKHVSPAGAAVAVEMSDTLKKIYFVDDVKLSPLATAYARARGADRMSSYGDFIAL SDTCDEETARIINREVSDGVIAPDYTPEALEILKNKRKGTYNVIKIDPAYRPAPIEHKDV FGVTFEQGRNELKIDESLLKEMPTQNKEIPAEAKRDLIISLITLKYTQSNSVCYAKDGQA IGIGAGQQSRIHCTRLAGNKADIWYLRQHPKVMNLPWIEKIRRADRDNTIDVYISEDHDD VLADGVWQQFFTEKPEVLTREEKRAWLDTMTGVALGSDAFFPFGDNIERAHKSGVSYIAQ PGGSVRDDHVIGTCDKYNMAMAFTGIRLFHH >gi|226332259|gb|ACIC01000061.1| GENE 6 4238 - 4432 93 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIMLSVHATSTIWQWHSQAFACSIINRFLLYFKIYIRFVGFNLFIDPKRYAIPIYNENKA IIPL >gi|226332259|gb|ACIC01000061.1| GENE 7 4498 - 5004 635 168 aa, chain + ## HITS:1 COG:Cj1382c KEGG:ns NR:ns ## COG: Cj1382c COG0716 # Protein_GI_number: 15792705 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Campylobacter jejuni # 4 164 3 159 163 140 50.0 8e-34 MNKIGVFYGSTTGTTEDLARRIAEKLDVPSAHIYDVSKLTEALVGEYDVLVLGSSTWGAG ELQDDWYDGIKVLKKCDLSHKYVALFGCGDSDSYSDTFCDAIGILYEELKDTRCKFCGAV DTAGYTFDSSIAVVNGKFVGLPLDEVNEDGQTDERISAWVEQVKQEIS >gi|226332259|gb|ACIC01000061.1| GENE 8 5038 - 5400 423 120 aa, chain + ## HITS:1 COG:no KEGG:BT_0516 NR:ns ## KEGG: BT_0516 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 120 1 120 120 230 100.0 1e-59 
MATERIIPGEIRIFLNHIYEFKKGVRNMVLYTMSREHEEFAIRRLKNQKISYMIQEVGTN KINLFFGKPECMEAMRHIIIRPLNQLSAEEDFILGAMLGYDLCQQCKRYCNKKEGIKIAV >gi|226332259|gb|ACIC01000061.1| GENE 9 5554 - 6579 783 341 aa, chain + ## HITS:1 COG:no KEGG:BT_0515 NR:ns ## KEGG: BT_0515 # Name: not_defined # Def: terminal quinol oxidase, subunit, putative (DoxD-like) # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 341 1 341 341 650 100.0 0 MLTQQQEPSKAYVLAGIFTLSLRLIVGWTYFSAFWRRLVLENKLIPDATGYIGEKFNHFL PNSIGIKPIIEYLVSTPDLLWWAMVIFTLVEGIVGLLYMLGFFTRLMSIGVFSLAFGILL GSGWLGTTCLDEWQIGILGVSAGFTIFLSGGGKYSLDYLLLPKLSKNKWLVWLTSGELPL SIKQFSKVAISGAVLLFILTLYTNQVFHNGIWGPLHNKSVKPELKISNAKIQEDILTFKV YRIEGADVYGSFLIGITLKDENGKTILQKNGEELARFPLTRIKNDYVAKVAPGKHSLIIP LGSKATLTIRSDVFMDLPKSDYELILTDISGITWKEKITVN >gi|226332259|gb|ACIC01000061.1| GENE 10 6976 - 7107 58 43 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253568677|ref|ZP_04846088.1| ## NR: gi|253568677|ref|ZP_04846088.1| predicted protein [Bacteroides sp. 1_1_6] # 1 43 1 43 43 86 100.0 4e-16 MKKHPLTVNNDLLHDMFDQEKVKSGLVFIFTMLGWYLSTGCAN >gi|226332259|gb|ACIC01000061.1| GENE 11 7282 - 8142 322 286 aa, chain + ## HITS:1 COG:no KEGG:BT_0514 NR:ns ## KEGG: BT_0514 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 285 1 285 334 565 100.0 1e-160 MNRIFEIEELNELEVFLKSQNDIDKLRDSLFAEFLKYADYKNVEEWNNAVRVCESLAIIG WGSNEALEALRGSFFNGNPMTCFVNKHREPRFVEAIWSRRINGFTMEAGRTSYHFSPDDP FQRQSIAWEYKTKEDVQGIELRSQRNWIPKNPIWIERTIGNCYENSKVVIESIENDLQSK LNKQMRPELYGQAVNKIILKCSFSYYDHVCCKCNYVIADEKLKLRQKELYPKLLTMFTKQ EIEKNGYYLRNRFEFGPFRTDTGKVKAVITLEKEFSELNHSEQKKD >gi|226332259|gb|ACIC01000061.1| GENE 12 8343 - 8774 187 143 aa, chain + ## HITS:1 COG:no KEGG:BT_0513 NR:ns ## KEGG: BT_0513 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 143 1 143 143 262 100.0 3e-69 MKGYRIAMKLINQTKIFKFIVPIIVFILLYAVSTIRNNNVRKDGIYSIVTLIKYGSAYRG QSAKYEFIYNKTLYKGSFFISFNESKNTPIGTRYFVTFLAKAPNRHRILDTVPSWFTLKA 
PDKGWKTLPTQKQLRIMMKDSLI >gi|226332259|gb|ACIC01000061.1| GENE 13 8819 - 9388 340 189 aa, chain + ## HITS:1 COG:no KEGG:BT_0512 NR:ns ## KEGG: BT_0512 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 189 1 189 189 335 100.0 4e-91 MKDKSLEKNTFWIYSTNKFVWAGVAICSIFSAFMIYFLCVSNLSENKVLFIFLLPISLSI LLIYWALPSKILLCEDRIEERSLFGKRSIRVDEINSWGVVQMYKPYRCKEGIYSYIPAKS FKPSRQIEENMIFSYSLFLSNIPHYDGKKKDSFQTDKTIFVSYRKEVYIALEKHLKKIKR TMENNQFKD >gi|226332259|gb|ACIC01000061.1| GENE 14 9427 - 9888 452 153 aa, chain + ## HITS:1 COG:no KEGG:BT_0511 NR:ns ## KEGG: BT_0511 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 142 1 142 150 273 100.0 1e-72 MKNKYLLSLLAALGFSGCGNNNEIWDEPCYYGPGPVWTPDIIISSQVQNEEKETINGIRV VSSYMNQEDSLITDTAYTESREIEHYTVSGIAINTFKFKEYPPKDTEMYLEYTDVDGEKN GAYQTKKIRIQDLGKEGKVTLLKEEKKEEEKED >gi|226332259|gb|ACIC01000061.1| GENE 15 9891 - 10991 730 366 aa, chain + ## HITS:1 COG:RSc1728 KEGG:ns NR:ns ## COG: RSc1728 COG0535 # Protein_GI_number: 17546447 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductases # Organism: Ralstonia solanacearum # 9 337 8 337 491 120 28.0 6e-27 MKRIIPLTPRRWLALEIFRRMKHERVEEHPLRQLFWECTLRCNVRCRHCGSDCKSSPATP DMPLQDFLKVLDSIATHTNPHDVFVIISGGEPTVREDLETCGKEISRRGYPWGMVCNGLC LTRERLHNLIRSGMRSISISLDGLKDVHNWMRRHPESFDCAVNAIREITAIPELTFDVIT CVTRRSLPQLPAMKELLISLGVKRWRVSTIFPVGRAAEEPEFRMNGEELKQVLDFIRDTR KEGIIRASYGCEGFLGNYEGEVRNTPFFCSAGISVGSVLIDGSISACSSIRSNYHQGNIY QDDFWEIWQTKFQPYRDRSWMKKDECARCKYFRYCQGNGMHLRDNEGKLLVCHLNRLTGN SANVPE >gi|226332259|gb|ACIC01000061.1| GENE 16 11589 - 12245 562 218 aa, chain - ## HITS:1 COG:FN0615 KEGG:ns NR:ns ## COG: FN0615 COG1132 # Protein_GI_number: 19703950 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Fusobacterium nucleatum # 3 217 199 416 574 132 36.0 3e-31 
MEEYLQGIRVMKAYNLLGGKFERLKMAYSDLRRACIRQEALLGPFILFSVTLIRAGLTFM ILCGTYLLIGGELSLLTFVMFLVVGSRVFDPLTSALTNFAEFRYFSIAGGRILTLMNESE MKGNRDVPESGDITFDHVSFGYQEKEILHDISVTLRKGTLTALVGPSGSGKSTMLKLCAR FYDPHKGSVRFNGMDMKELELESLMKHCSMVFQDVYLF >gi|226332259|gb|ACIC01000061.1| GENE 17 12397 - 12741 291 114 aa, chain - ## HITS:1 COG:no KEGG:BT_0507 NR:ns ## KEGG: BT_0507 # Name: not_defined # Def: TetR/AcrR family transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 114 1 114 202 201 98.0 8e-51 MQYLKEDIQEKILHIAEEVFSEKGYKDASMREIASRTGITVSNIYHYFTNKDEIFRTILK PVLNDLYVKIYRHDANQMSIEVFTNSDYQQESVQEYIDLVSEHRARLRMLLFQA >gi|226332259|gb|ACIC01000061.1| GENE 18 12930 - 15233 2189 767 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 28 607 29 591 757 322 33.0 1e-87 MFKQLSTSLLIASACILSSCTPTVKQEIAILPTPVSLTEQSGAFVLKDGMKIGVSDQSLF PAAGYLQDILRNVVSTSVEVTEADKADIYLQLGQDNGKPGSYKLQATPKSVQVEAGDYSG IVSAIASLHQLLPAGIEVQGTKQTFSIPAVQIEDSPRFEWRGFMLDASRHFWNKDEVKHV LDLMSLYKLNKFHWHLTDDQGWRIEIEKYPLLTEKGAWRKFNKHDRGCMERAVEEDNTDF LIPENKIRIVEGDTLYGGYYTHEDIKEIVDYAAQRGIDVIPEIDMPGHFLAAITQYPDLA CDGLIGWGETFSSPICPGKDTTLEFCQDVFKEIFDLFPYEYVHMGGDEVEKNNWKKCPRC QKRIRTEGLKSVEDLQAWFVRDMEKFFLANGKKLIGWDEVVADGLTSDAAITWWRSWSKE AVPMATSQGQRVIACPNEYFYFDYAQDKNSVKKILAYDPYADDRLSPEQKECFWGVQANL WAEWIPSMKRIEYLILPRMVALSEIAWVQPEAKPDLKEFYRQLVPHFKRMDILGLNYRVP DLEGFYKVNAFLDEASVDLTCPLPGIEVRYTTDGSMPTKQSTLYEGNLKVTETTDFTFRT FRPDGTPSDVARTRYVKAPYAEATAAPASLNSGLKAVWHKFRGNLCADIDAAPVNGEYVV ESVSIPEEVKGDIGLILTGYLEVPADGIYTFALLSDDGSTLKLDGELLGDNDGAHSPVEI IVQKALKAGLHPIEVRYFDCNGGVLQMELVNEKGEKEVLPKEWLKHE >gi|226332259|gb|ACIC01000061.1| GENE 19 15412 - 15981 334 189 aa, chain - ## HITS:1 COG:alr0739 KEGG:ns NR:ns ## COG: alr0739 COG4430 # Protein_GI_number: 17228234 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: 
Nostoc sp. PCC 7120 # 4 178 5 190 193 108 36.0 8e-24 MATENKIKYFENREDWRKWLMDNFETEDEIWFVFPLKSSGEKSIAYNDAVEEALCFEWID STIKPLDKEHRIQRFTPRKPKSTYSQANKERLKWLLANKMIHPEFEDKIRTVLSDPFIFP NDIMDRLKEDEIAWRNYRHFSDAYKRIRIAYIEAARKRPEEFEKRLNNFICKVKENKMIT GFGGIEKYY >gi|226332259|gb|ACIC01000061.1| GENE 20 16262 - 18601 2262 779 aa, chain - ## HITS:1 COG:STM2199 KEGG:ns NR:ns ## COG: STM2199 COG4771 # Protein_GI_number: 16765529 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferrienterochelin and colicins # Organism: Salmonella typhimurium LT2 # 120 563 33 462 663 101 27.0 5e-21 MKKYLLVLVSLCCALLPALAEHPEYPELKKSDANIIGHVLDKKTSEHLPYITIALKGTTI GTVTDATGHYFLKNLPEGNFMMEVSSVGYKTVTRSVTLKKGKTLEENFELEEDAIALDGV VVSANRSETTRRLAPTLVNVVDLKLFETTNSSTLSQGLNFQPGVRVETNCQNCGFQQVRI NGLDGPYTQILIDSRPVFSALSGVYGLEQIPASMIERVEVMRGGGSALFGSSAIAGTINI ITKEPLRNSGQLSHTITSIGGSSAFDNNTSLNASLVTDDHRAGLYIFGQNRHRSGYDYDG DGFTELPKLKNQTVGFRSYLKTSTYSKLTFEYHHMQEFRRGGDMLDRPPHEAHIAEQLQH SIDGGSLKYDYFSPNEKNRLSVFASAANTDRDSYYGPGNDPLKAYGKTTDLTAMGGVQYV HTFDKCLFMPSDLTAGLEYNRDRLKDNMWGYDRHTDQTVKVYSAFLQNEWKNKHWGILIG GRLDKHNMVDDVIFSPRANLRFNPTDNINLRLSYSSGFRAPQAFDEDMHIENVGGTVAMI ERAKDLKEEKSQSFSASADMYHRFGAFQTNLLVEGFYTRLTDVFVLGGPFDRGDGVLVKI RSNGPGAKVMGLTLEGKIAYLSLLQIQAGLTLQRSRYDEPHKWHDDAPAERKIFRTPDIY GYFTATYTPVKPLSIALSGTYTGSMLVQRMDISAENVAMGDMPERKAEAIRTPRFFDMGV KIAYDFKLYKTVDLQLSGGVQNIFEAYQKDFDRGANRDSNYIYGPSLPRSFFAGVKISY >gi|226332259|gb|ACIC01000061.1| GENE 21 18698 - 19015 195 105 aa, chain - ## HITS:1 COG:no KEGG:BT_0503 NR:ns ## KEGG: BT_0503 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 105 9 113 113 196 100.0 2e-49 MKWFLPVLFISYMAGITLFTHSHVVNGVTIVHSHPFKKGGEHSHTTVEFQLIHILDHTLV TDNGLTPLFVTSVLSLLCIFLASPQRMRYCRPCPGAISLRAPPVV >gi|226332259|gb|ACIC01000061.1| GENE 22 19161 - 21488 1997 775 aa, chain - ## HITS:1 COG:STM2199 KEGG:ns NR:ns ## COG: STM2199 COG4771 # Protein_GI_number: 16765529 # Func_class: P Inorganic ion transport and metabolism # 
Function: Outer membrane receptor for ferrienterochelin and colicins # Organism: Salmonella typhimurium LT2 # 126 558 33 454 663 89 26.0 2e-17 MKKYIFSLVCLCCALLPALEGQAALHEHPEYPDLRKSDANIVGHILDKNTKEHLPYITVA LKGTTIGTVTDATGHYFLKNLPEGNFVLEVSSVGYKTVRRNVTLKKGRTLEEDFEIEEDA VALDGVVVSANRNETTRRLAPTLVNVVDMKMFETTNSTTLAQGLSFQPGVRVESNCQNCG FQQVRINGLDGPYTQILLDSRPIFSALSGVYGIEQIPASMIERVEVMRGGGSALFGSSAI AGTINIITKEPIRNSGMLSHTITGIGDGDAFDNNTALNASLVTDDQRAGLYIFGQNRHRS AYDHDGDGYSEIPKIHGQTIGFRSFLKTTTYSKLTFEYHHMEEFRRGGDLLNRPPHEANV AEQTEHSINGGGLKYDYFSPNEKHRFNVFASAQHINRDSYYGGGQDLNAYGNTTDLNWMA GSQYVYSFGKCIFMPSDLTAGIEFNQDKLEDNMWGYHRTVDQKVNIGSAFLQNEWKNDHW GFLLGGRLDKHNLIDHIIFSPRANLRFNPTQNINLRLSYSSGFRAPQAFDEDLHVENVGG NVAMVELAKDLKEERSQSLSASADIYHRFGAFQINFLVEGFYTKLSDVFALTDGEVKDGI LTRTRYNAPGARVMGLTLEGKMAYLNKFQIQAGATLQQSHYSEPHVWNKEAPAVKKMMRT PNTYGYFTATYTPVKPLSIALSGTYTGSMLIPHEPVPGFLDKPITVNTEDFFDIALKAAY DFKLYKSMNLQVNAGIQNIFNAYQNDFDKGADRDSGYIYGPSLPRSFFAGVKISY >gi|226332259|gb|ACIC01000061.1| GENE 23 21578 - 21910 113 110 aa, chain - ## HITS:1 COG:no KEGG:BT_0501 NR:ns ## KEGG: BT_0501 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 110 1 110 110 169 100.0 2e-41 MRCFLPVLFVSYLVSITFFAHIHVVNGVTIVHSHPFKKGAAHKHSTVELLLVHFLSHLTT DGAAVVFALSLFIPFLLWRLHGITQHTHYYSPYHGVVALRAPPAIRFSVI >gi|226332259|gb|ACIC01000061.1| GENE 24 22010 - 23608 1469 532 aa, chain - ## HITS:1 COG:STM3807 KEGG:ns NR:ns ## COG: STM3807 COG2985 # Protein_GI_number: 16767092 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Salmonella typhimurium LT2 # 20 528 18 546 553 272 33.0 1e-72 MFTDLLHSSYFSLFLIVALGFMLGRIKIKGLSLDVSAVIFIALLFGHFGVIIPKELGNFG LVLFIFTIGIQAGPGFFDSFRSKGKTLILITMLIICSACLTAVGLKYAFDIDTPSVVGLI AGALTSTPGLAVAIDSTHSPLASIAYGIAYPFGVIGVILFVKLLPKIMRVNLDQEARRLE IERRGQFPELGTCIYRVTNASVFNRSLMQINARAMTGAVISRLKHKDEISIPTAHTVLHE GDYIQAVGSEESLNQLSVLIGEREEGELPLDKTQEIESLLLTKKDMINKQLGDLNLQKNF GCTVTRVRRSGIDLSPSPDLALKFGDKLMVVGEKEGIKGVARLLGNNAKKLSDTDFFPIA 
MGIVLGVLFGKLNISFSDTLSFSPGLTGGVLMVALVLSAVGKTGPIIWSMSGPANQLLRQ LGLLLFLAEVGTSAGKNLVATFQESGLLMFGVGAAITVVPMLVAVIVGRLVFKINILDLL GTITGGMTSTPGLAAADSMVDSNIPSVAYATVYPIAMVFLILFIQVISSAVY >gi|226332259|gb|ACIC01000061.1| GENE 25 23657 - 24556 875 299 aa, chain - ## HITS:1 COG:CC0303 KEGG:ns NR:ns ## COG: CC0303 COG1230 # Protein_GI_number: 16124558 # Func_class: P Inorganic ion transport and metabolism # Function: Co/Zn/Cd efflux system component # Organism: Caulobacter vibrioides # 17 286 75 345 361 241 49.0 1e-63 MEPHHHEHNHRVTSLNKAFIIGIVLNISFVIVEFGVGFYYDSLGLLSDAGHNLGDVASLI LAMLAFRLEKVHPNSRYTYGYKKSTILVSLLNAVILLVAVGIIIAESIDKLFHPVSVDGS AIAWTAGVGVVVNALTAWLFMKDKDKDLNVKGAYLHMAADALVSVGVVASGIIIMYTGWS IIDPIIGLGIAVIIIVSTWGLLHDSLRLSLDGVPVGIDTQQIQQLIVEQPGVESCHHLHI WAISTTETALTAHVVVDDIAKMEEIKHRIKEALEAAGIHHATLEIEGEEVTCSTECCED >gi|226332259|gb|ACIC01000061.1| GENE 26 24606 - 24857 59 83 aa, chain - ## HITS:1 COG:no KEGG:BF2104 NR:ns ## KEGG: BF2104 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 36 82 183 236 306 64 59.0 1e-09 MATIEKCIDRDYLFPDCIVYYVFRACIDDFYGARASIVTFMELNLIMWTSYLVLLFCYDE NFIGEHSPVTALVAFLMFEKRVN >gi|226332259|gb|ACIC01000061.1| GENE 27 25061 - 25945 852 294 aa, chain + ## HITS:1 COG:no KEGG:BT_0498 NR:ns ## KEGG: BT_0498 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 294 1 294 294 476 100.0 1e-133 MKILNLWMVALMALSFSFTFVSCSDDNDDTDNTEQAKAIAGTYKGYSIGESRMFSDYLMG DDASATITANTDGTVNLVYKSGSGDFILNNLTVSSKSFKGEGEVALAMEGSEPANYEYTL EGSVSDAKVLTLKANVPVPMGGIDINFIQAETPIAYYVAATYKYNTNLAISVMGSSYGST EDCKAVIKRASETTVDITLNGFGNLTGGGNMNLGDFTISGVNVEKADDGYTLSLGAFESA AESASGTTPITGESLEGTVTANGKANITVAFKPGAMPMAITAVFTGNVKKTSAQ >gi|226332259|gb|ACIC01000061.1| GENE 28 25962 - 26624 681 220 aa, chain + ## HITS:1 COG:no KEGG:BT_0497 NR:ns ## KEGG: BT_0497 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 220 1 220 220 419 100.0 1e-116 
MRRSSNAIFKNIMILGAAMSLSACNGMFEGIYDDPIEAEMEIKESSFSQINATEYTNWVY IDLSERKATTVEIGEEHKSEIPAKWDLAIHRYDIKTNEGAAFQTTYTSIDDLKASGKLPA EENFVKDEWTTDKIAIDMSGMMEGNIKYTEDYRNDILSGWLNVDTSSMPPIYTMSNQVYL IQLKDNTYAAIRFTNYTNARGIKGYIDFDFLYPLDFEENN >gi|226332259|gb|ACIC01000061.1| GENE 29 26642 - 28789 1719 715 aa, chain + ## HITS:1 COG:STM2199 KEGG:ns NR:ns ## COG: STM2199 COG4771 # Protein_GI_number: 16765529 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferrienterochelin and colicins # Organism: Salmonella typhimurium LT2 # 107 683 29 638 663 144 23.0 7e-34 MKPTFIASILIFSFIVTCSTFAQQSTEHISGRVTDTNGEPLPGATISIKGEETGAVTSVD GTYTLQLPGIGSYIITASYIGYQTEQKRITTEKGKKVNFSLSEDQFDLGTVVVTGTRTPK LLKDAPIITRVITSDDIKKVNANNVADLLKTELPGIEFSFSMDQQTAINMQGFGGNSVLF LVDGERLAGETLNNVDYERLNLDNVERIEIVKGAASTLYGSSAIGGVINIITRESDDPWN LNLNSRFSEHNDQRYGGTAGFNAGKFNSLTNVQYNNVDTYAVDNPGDFSTVFGSRVWNFK ERLIYRPLETLKLTGRVGYYFRERNKPGDTQDRYRGFSGGMKANYTFNIMSNLEVGYTFD QYDKSDYQSLYKNDVRDYSNVQHNVHALYNYTFNGKHTLTVGADYLRDYLMSYQFTDNAD HTMHTTDAFGQFDWNPTERLNVIAGLRFDYFSDSNVRHLSPHLGMMYKIGNCSLRGSYSQ GFRSPTLKEMYMVFNMANMMMIYGNPDLKSETSHNFSVSAEYTKSRYNFSITGYYNLVHN RINTVYSEAPKGQIYTNTDKMDIAGIDANVSAKYPCGLGYRLSYTYIHEFMRDGQKKFTD TRPHTATVRIDWGKTLNKVDFNYSLNGRLLSKVKTNIYNDSFNNPSAGSQAVEYPGYMIW DFTFSLGIIKGINMNLAVNNLFDYVPDYYYFNSPTTTGTNLTLGLSLDIDRLFKK >gi|226332259|gb|ACIC01000061.1| GENE 30 28832 - 33223 4094 1463 aa, chain + ## HITS:1 COG:MA0383 KEGG:ns NR:ns ## COG: MA0383 COG1429 # Protein_GI_number: 20089280 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CobN and related Mg-chelatases # Organism: Methanosarcina acetivorans str.C2A # 934 1426 965 1465 1541 313 35.0 2e-84 MKKKSKIILGGCIVVAALIGLSAWNAWFSATKIAFVNFQTIQQGSISKANDNSFIKLSEV SLDNLDRLTSYDMVFINGMGLRIVEEQRQQIQQAADKGIPVYTSMATNPANNICNLDSIQ QNLIRGYLSNGGKTNYRNMLNYIRKAIDGKASVVPEVEDPTERPSDMLYHAGISNPDDEQ EFLTVADYEKFMQENNLYKEGARKIMITGQMADATDLIKALENAGYNVYPVQSMTRFMSF IEEVQPDAVINMAHGRMGDKMVDYLKARNILLFAPLTINSLVDEWENDPMGMSGGFMSQS 
IVTPEIDGAIRPFALFAQYEDKEGLRHSYAVPERLKTFVSTIDNYLNLKTKPNFEKKVAI YYYKGPGQNALTAAGMEVVPSLYNLLLRMKQEGYNISGLPVNAQELGKMIQAQGAVFNAY AEGAFNDFMQNGHPELITKEQYESWVKESLRPEKYQEVVDAFGEFPGNYMVTPDGKLGIA RLQFGNVVLLPQNAAGSGDNSFQVVHGTDMAPPHTYIASYLWMQHGFKADALIHFGTHGS LEFTPRKQVALCSNDWPDRLVGAVPHYYLYSIGNVGEGMMAKRRSYATLQSYLTPPFLES SVRGIYRELMEKIKIYNNSQKANKDQESLAVKTLTVKMGIHRDLGLDSMANKPYTEDEIA RVENFAEELATEKITGQLYTMGVPYEPERITSSVYAMATEPIAYSLFALDKQRGKATESA EKHRSVFTQQYLMPARLLVERLMANPSLATDELICHTAGITPQELAKARQIEAERNAPKG MMAMMMAAAAKKDQADNEPSGNGHPASAKMEKGPHGKMPAGMKEAMKKMGANMDPEKAME MAKSMGASPEALKKMEASMKANKDTSTDASGKPAMAGKTEKPQGMSAMMAAMGKAPKEYS KEEVEFALAVAEVERTIKNVGNYKNALLTSPEEELSSLMNALKGGYTAPTPGGDPIANPN TLPTGRNMYAINAEATPTESAWEKGIALAKQTIDRYKQRHNDSIPRKVSYTLWSSEFIET GGATIAQVLYMLGVEPVRDAFGRVSDLKLIPSTELGRPRIDVVVQTSGQLRDLAASRLFL INRAVEMAAAAKDDKYENQVASSVIEAERVLTEKGLSPKDAREISTFRVFGGANGMYGTG IQEMVESGDRWENESEIADTYLNNMGAYYGSEKNWEVFQKFAFEAALTRTDVVVQPRQSN TWGALSLDHVYEFMGGMNLAVRNVTGKDPDAYLSDYRNRNHMKMQELKEAVGVESRTTIL NPTYIKEKMKGGASSASEFAEVITNTYGWNVMKPAAIDKELWDNIYNVYVKDELNLGVKQ YFEQQNPAALEEMTAVMLESARKGLWQASEEQVAELSKLHTEIVNTYRPSCSGFVCDNAK LRDFIASKADAQTATQYKENISKIREAKASGSNKGVVMKKEEMNQTAENQTNTLSNVAVG IAVIIVILALILFVRKRRKSSQM >gi|226332259|gb|ACIC01000061.1| GENE 31 33275 - 33976 550 233 aa, chain + ## HITS:1 COG:no KEGG:BT_0493 NR:ns ## KEGG: BT_0493 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 233 1 233 233 405 99.0 1e-112 MDTVVLVLMLLTAFNFLLKQTFWKLIAVGIIAAICAVFAGLMWPYAIEQSKTQIANWLSN QPLMLDTSVLLSVEVCIQMAYAMLAVHVANDYPVKPRMILMYRFLRWFPGLLIFPVLFSG LVYLIFAFPGTSFQTIAWSYAAFILAAIPCGRFILLHLLPEKELRLELFFLTNALVAVLG IVATVNGKTSAAGVSEIDWKALTGVIVIALAGGILGLLWWNIRNRNKEKSVKQ >gi|226332259|gb|ACIC01000061.1| GENE 32 34010 - 34627 655 205 aa, chain + ## HITS:1 COG:MA4426 KEGG:ns NR:ns ## COG: MA4426 COG0811 # Protein_GI_number: 20093212 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Methanosarcina 
acetivorans str.C2A # 7 195 10 205 273 132 38.0 6e-31 MNFISDILYWISTGLLVPVIVLLIILFCRAILLAGSFYGQYLSIRKTETLLRNKLSKLTP DTVEELSSKLPENSRSLVITYMRQVLDARDTPAQVQRLLANFEIAADKDLAISKTLTKLG PILGLMGTLIPMGPALAGLASGDIASMAYNMQIAFATTVVGLVAGAVGFLTQQVKQRWYL QDMTNLEFISELLNDKRSIHKTSEK >gi|226332259|gb|ACIC01000061.1| GENE 33 34624 - 34950 455 108 aa, chain + ## HITS:1 COG:MA4642 KEGG:ns NR:ns ## COG: MA4642 COG4744 # Protein_GI_number: 20093421 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 5 108 12 114 127 85 47.0 2e-17 MKHNLLRKEEDSDPISVVSNLFDIAMVFAVALMVALVSRYNMTEVFSQEDYTMVKNPGKE NMEIITKEGNKINRYTPSEDQDKKEGKRGKKVGIAYELDNGEIIYVPE >gi|226332259|gb|ACIC01000061.1| GENE 34 35286 - 35552 230 88 aa, chain + ## HITS:1 COG:no KEGG:BT_0490 NR:ns ## KEGG: BT_0490 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 88 1 88 88 127 98.0 1e-28 MSTEILVDTVTTTTDSCQMVATETTLQVTAATKSDNQTTANEAYTAVPPRRTFNEQKPFV FSIVGFAVVYVAALVIGRIIKMRMPHHS >gi|226332259|gb|ACIC01000061.1| GENE 35 35751 - 36422 990 223 aa, chain - ## HITS:1 COG:CC1495 KEGG:ns NR:ns ## COG: CC1495 COG0800 # Protein_GI_number: 16125742 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Caulobacter vibrioides # 5 223 4 223 224 229 48.0 3e-60 MAKFDKIAVLNKIGSTGMVPVFYHKDAEVAKKVVKACYDGGVRAFEFTNRGDFAQEVFAE IVKYAAKECPEMAIGIGSIVDPATAAMYLQLGANFVVGPLFNPEIAKICNRRLVAYTPGC GSVSEVGFAQEVGCDLCKVFPGDVYGTNFVKGLMAPMPWSKLMVTGGVEPSKENLTAWIK AGVFCVGMGSKLFPNDKVAAEDWAYVTAKCEEALGYIAEARKK >gi|226332259|gb|ACIC01000061.1| GENE 36 36487 - 37512 1267 341 aa, chain - ## HITS:1 COG:TM0067 KEGG:ns NR:ns ## COG: TM0067 COG0524 # Protein_GI_number: 15642842 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Thermotoga maritima # 4 341 2 339 339 364 54.0 1e-100 
MGKKVVTLGEIMLRLSTPGNTRFVQSDSFDVVYGGGEANVAVSCANYGHEAYFVTKLPKH EIGQSAVNALRKYGVKTDFIARGGDRVGIYYLETGASMRPSKVIYDRAHSAIAEADAADF DFDAIMEGADWFHWSGITPAISDKAAELTRLACEAAKRHGVTVSVDLNFRKKLWTKEKAQ SIMKPLMKYVDVCIGNEEDAELCLGFKPDADVEAGHTDAEGYKGIFQQMMKEFGFKYVVS TLRESFSATHNGWKAMIYNGEEFYTSKRYDIDPIIDRVGGGDSFSGGIIHGLMTKPNQGA ALEFAVAASALKHTINGDFNLVSVEEVEALAGGDASGRVQR >gi|226332259|gb|ACIC01000061.1| GENE 37 37765 - 38829 1087 354 aa, chain + ## HITS:1 COG:mll7623 KEGG:ns NR:ns ## COG: mll7623 COG1879 # Protein_GI_number: 13476333 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mesorhizobium loti # 6 309 5 305 345 84 25.0 3e-16 MNKLPERIRIKDIARLADVSVGTVDRVIHGRSGVSEASKKRVEEILKQLDYQPNMYASAL ASNKKYTFICLLPQHLEGEYWTAVELGIHEAIATYSDFNTSVKINYYDPYDYHSFVDASE AILTQQPDGVMVAPTAPQYTKGFTDQLQTLDIPYIYIDSNIKDVPPLAFFGQNSRQSGYF AARMLMLLARDEKEIVIFRKIHEGIVGSNQQENREIGFRQYMEEHHPSCTILELDLHAER NDEDNEMLDEFFRSHPNVKNGITFNSKVYIIGEYLQSRGKKDFNLIGYDLLERNVTCLKE GSVSFLIAQQPELQGLNGIKALCDHLIFKKDVTCINYMPIDLLTVETIDYYHSK >gi|226332259|gb|ACIC01000061.1| GENE 38 38855 - 40345 1547 496 aa, chain + ## HITS:1 COG:CAC0696 KEGG:ns NR:ns ## COG: CAC0696 COG2721 # Protein_GI_number: 15893984 # Func_class: G Carbohydrate transport and metabolism # Function: Altronate dehydratase # Organism: Clostridium acetobutylicum # 6 496 5 492 492 609 60.0 1e-174 METKYLRINPADNVAVAIVNLPAGEHLSVDGIEITLNEDIPAGHKFALKDFAEGENVIKY GYPIGHARMAKKQGDWMNETNIKTNLAGLLEYTYHPTQVTLDIPHKNLTFKGYRRKNGDV GVRNEIWIIPTVGCVNGIIGQLAEGLRRETEGKGVDAIVAFPHNYGCSQLGDDHENTKKI LRDMVLHPNAGAVLIVGLGCENNQPDVFREFLGDYDQDRVKFMVTQKVGDEYEEGMDILR DLYAKASKDERTEVPLSELRVGLKCGGSDGFSGITANPLLGMFSDFLIAQGGTSVLTEVP EMFGAETILMNRCENKDLFEQTVHLINDFKEYFLSHGEPVGENPSPGNKAGGISTLEEKA LGCTQKCGKSYVSGVMPYGERLQKKGLNLLSAPGNDLVAATTLAASGCHMVLFTTGRGTP FGTFVPTMKISTNSTLAKNKPGWIDFNAGVIVENEPMEKTCERFIEYVIKVASGEFVNNE KKGYREIAIFKTGVTL >gi|226332259|gb|ACIC01000061.1| GENE 39 40414 - 43005 1678 863 aa, chain - ## HITS:1 COG:no KEGG:BT_1247 
NR:ns ## KEGG: BT_1247 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 863 1 863 863 1738 99.0 0 MMNRAILVLLFLLSALPFSLQAQMPDSIVRKLSGYIEAVQNFGEMLPQEKVYLHFDNTSY YQGDPVWFQCYVVTSELNRPSDLSKTLYVELLNPGGEVIAKRVLPVRNGRCHGSFELTQL PFYSGFYEVRAYTKYMLNFGEEVIFSRVLPVFDKPRAAGDFKEKNIQKYAVYKYPQVRKK TAKEKKVNLKFFPEGGNLVEGIPSRVAFEATDAYGNPIALSGTVINREKKEIASFATVHE GRGTFVYTPTDGENHKAEVELNGKKYRFDLPEARLQGYALQVDNLSSEDSIAVSVQKNPH TPESVLGLAVLGHGKLLNYCLLNIRKNTPVSFKLDKTEWKAGVAQIVLFDTAGQIIADRL IFMRKPEPLTISVRKNKEDYQPFDRVLLDFSVRDTAGKPVSAPISVSVRDGWEEVESRHS MLADLLLMSEIKGYVHRPLYYFEADDKEHRTALDHLLLVQGWRRYAWKYESGVEPFELKY MPEQGIEVHGQVVSFVRGKPKPNIQVSSFLAKRGDEETSPGALSLNLFETDSVGRFAFIT DIIGKWNLILSVMEKGKRKDHRIVLDRVFSPQPTQYPLGEMQISLSGRDKPGAEQELQTD TVIGDDIDYNRFMDAYEDSLSKRGMREKNHRLGEVVVKAKKWSREKDIYENRSKSVAYYD VPSEMDDILDKGGYVGEDIHELLVRMNPNFIRRYSGGEEWLQYKGRSPLYVINYKRTEAT EIDQTRYRTLRLVAIKSIYINEELSARCRYADIRMSPMDVDEIYGCAVFIETYPEGEIPV KGGRGVRKTWLDGYSEPVEFYHPDYSVLPKEEDYRRTLYWNPEVMPDEAGNAQVRFYNNS RCRRLKITAEMISADGAIGMFNE >gi|226332259|gb|ACIC01000061.1| GENE 40 43138 - 44733 863 531 aa, chain - ## HITS:1 COG:no KEGG:BT_1248 NR:ns ## KEGG: BT_1248 # Name: not_defined # Def: putative transport protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 531 1 531 531 917 99.0 0 MKELGVPVRSWVPDWLGFCCIFIVILPVTMLNGSYTGSMLEVSNTLGTNSEDITMGYYAA SAGMAIAYPIIPKVLAAVSVKFLLLIDLLLQFFLSWICARSQNADILIACSFAIGFLKGF LMLWFIRYAQKIFSAKNIRSEFYSYFYPLVYGGGQASMLVTAQLAYHYNWKYMYYFMMLL ILVSVLFVIICFRHNRPIKSVPLSDLHIREMFIISVGLLMLIYVINYGKVLDWMASAKLC AYIVISPILIALFIWIQHHSKNPYVSLAPLFQPKAIIGYFYMMLVMFFSTSTTLLTNYLS IILKVDSTHTYSLYIFLLPGYVIGAFICFWWFRWQRWRFRFLIAGGMSCFVIFFGILYFG IAPNSTYEMLYLPIFFRGLGMLTLIIAFALFAVEDLNPKFLLSNAFFLIIFRSVLAPIMA TSFYSNMLYRLQQKYIYSLSETITTADPLAASRYTQSLNNALAQGHRYDEAVQIATHSLY GTLQEQSLLLALKEILGYLLVISIIIAVISRFIPFHKTIRVTFAKTGDDMV >gi|226332259|gb|ACIC01000061.1| GENE 41 44739 - 45806 929 355 aa, chain - ## HITS:1 COG:mll0995 KEGG:ns NR:ns ## COG: mll0995 COG1566 # Protein_GI_number: 
13471111 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Mesorhizobium loti # 8 349 47 381 417 160 33.0 4e-39 MATLKEKKRKLKKMRARNIILNMVCVCLAVSGLWWTITYFWRYINYEVTNDAFVDQYVAP LNIRASGYIKDIRFKEHQYVRQGDTLLVLDNREYQIKVKEAEAALLDAHGLQDVLHSGIE TSHTNIAVQDANIAEAKAKLWQLEQDYHRFERLLKEESVPEQQYEQTKAAYEAAEARYQA LVAQKQAALSQYAETSKKTTGVQAGILRKEADLDLAKLNLSYTVLTAPYDGYMGRRTLEP GQYVQTGQTISYLVRNKDKWITANYKETQIANIYIGQQVRVKVDALPGKIFNGEVTAISE ATGSKYSLVPTDNSAGNFVKVQQRIPVRIELENISTEEMAQLRAGMMVETEALRK >gi|226332259|gb|ACIC01000061.1| GENE 42 45819 - 47138 1007 439 aa, chain - ## HITS:1 COG:alr2887_2 KEGG:ns NR:ns ## COG: alr2887_2 COG1538 # Protein_GI_number: 17230379 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Nostoc sp. PCC 7120 # 32 435 27 433 443 75 22.0 1e-13 MFLKTYRVYLMGPLCLMFAGQPAKAQTDSLFLSVDQLFERGVQHSLQLQADALKEAMAQE RTRTARTSSLPDLQVGLKGGFVGQPVVWERGLSGPTYPDIPDWSQNYAIDFSQPLYQGGK IRRTIHKAEMEKQVAELQTLTDQAEIKLGLLNQYMNLFSLFKQHEILMRNIEESELRLRD IRRMKKEGVITNNDVLRSEMQLTNDRLSLQETENSIVLVSQQLDILLGQDENLLLKPDTT LLHQAVALEAYDDYITLAYTNDPAMKLLRKQTELARNEIRLAQSLSLPSISLYASNTLAR PVSRTLTDMYNNNWNVGLSVSYPLSSIYKNSHKIKESKLMVSLRKNDEEQKMQRIRMDVR TAFLRHQEALQRVEALQLSVRQAQENYRIMQNRYLNQLAILTDLLDANSVRLNVELQLVT ARTRVIYTYYQLQKACGRL >gi|226332259|gb|ACIC01000061.1| GENE 43 47119 - 47562 425 147 aa, chain - ## HITS:1 COG:MA1122 KEGG:ns NR:ns ## COG: MA1122 COG1846 # Protein_GI_number: 20089988 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Methanosarcina acetivorans str.C2A # 12 107 17 107 157 57 34.0 8e-09 MIREGAEFRELMLQVFRTRMAFRRSMQRTLRKNNAGITFEMLQVLSCLWHEQGISQQILA ERIAKDKACLTNLMNNLEKKGYVHRKEDPADRRNKQVYLTPEGEEFKEQIRPILDQVYVY AEQVIGIESIELMLSELKGVYDVLENV >gi|226332259|gb|ACIC01000061.1| GENE 44 47762 - 48361 458 199 aa, chain + ## HITS:1 COG:no KEGG:BT_1252 NR:ns ## KEGG: BT_1252 # Name: not_defined # Def: hypothetical protein # 
Organism: B.thetaiotaomicron # Pathway: not_defined # 17 199 1 183 183 309 98.0 5e-83 MNKDKALESVNEIKELMEKSSKFISISGLAAIMAGVYALAGTYIATCLVTPDTTLIVALK LMIMVAILVLAAAAVTAGILSYNKSKKLRQKFFSKLTYRALWNFSLPMLTGGALCISLLL HGYYDILSSVMLLFYGLTLVNVSKFTYANIAWLGYAFICLGVIDSFWEGHALLFWTIGFG GFHILYGILFYLHYERKQS >gi|226332259|gb|ACIC01000061.1| GENE 45 48362 - 48658 222 98 aa, chain + ## HITS:1 COG:CC2206 KEGG:ns NR:ns ## COG: CC2206 COG1846 # Protein_GI_number: 16126445 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Caulobacter vibrioides # 8 91 10 93 103 80 48.0 7e-16 MIEAFQNINKAFESKVRLGIMAVLMVNEEADFNFLKEQLSLTDGNLASHTRALEELGYIE CSKGFVGRKPRTVFRATKQGREAFKSHIEALENFLKST >gi|226332259|gb|ACIC01000061.1| GENE 46 48683 - 50038 1349 451 aa, chain + ## HITS:1 COG:PA0465 KEGG:ns NR:ns ## COG: PA0465 COG4452 # Protein_GI_number: 15595662 # Func_class: V Defense mechanisms # Function: Inner membrane protein involved in colicin E2 resistance # Organism: Pseudomonas aeruginosa # 48 449 27 438 452 226 34.0 1e-58 MDGFNEKSTEQVQEQQPLGCLRRFSKTIKVIIIGGLIILLMIPMFMIEDLISERGRTQEE AINEVSEKWSLAQTITGPYLNIQYPVTTENNGEKKVSIKDLFLFPDELLVNGQLKTEILK RSIYEVNVYQSELTLKGLFSPEELIKSRVDMEQLQFDRAAICLNLTDMRGISEQISITLG DSVYVFEPGMDNRGIGATGVHAITDLSDLKKNKKLPYEIKIKLKGSQSINFIPLGKTTRV DLKANWNTPSFTRNYLPNNREITEKEFSAQWQVLNLNRNYSQVLVNYNNTNIKDIENSSF GVNFKIPVEQYQQSMRSAKYAILIILLTFGVIFFTEIMNKTRIHALQYLLVGLALCLFYS LLLSFSEHIGFNPAYLLSAALTIILVGGYMLGITKKKKPSLIMSGLLSVLYLYIFVLIQL ETFALLAGSLGLFIILAMVMYFSKKIDWFNE >gi|226332259|gb|ACIC01000061.1| GENE 47 50163 - 50747 449 194 aa, chain - ## HITS:1 COG:no KEGG:BT_1255 NR:ns ## KEGG: BT_1255 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 194 1 194 194 285 100.0 8e-76 MKNKVKLIVNPFIRIAGGQALIWGFLGLIVSTLLCWISGYHYHGLLHFGPAPNPAWWCYL AEHLIVWLIPALLFYLGGLFLSHSRIRVIDVLGTVLFAQLPLLGMNLISLLPAMRMMSQM NMNMSPEEMLAQPYFILAMILTLLGLPFLILTLIWMFNALKVSCNLKQWKLWTVALIGII GGDVLCRFLIGWLY >gi|226332259|gb|ACIC01000061.1| GENE 48 50747 - 51121 
255 124 aa, chain - ## HITS:1 COG:CAC3001 KEGG:ns NR:ns ## COG: CAC3001 COG3877 # Protein_GI_number: 15896253 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 42 119 8 85 131 60 38.0 6e-10 MNKIKFYINIIDVIVCFIRIYLYFCIINEDDMNEVKKRLPLQCPSCDAPLKVGRLFCEEC NTEVCGNFELPLLARLSEKEQQFVLDFVKSSGSLKDMAKNIGVSYPTVRNMLDDIIDKLT KMDM >gi|226332259|gb|ACIC01000061.1| GENE 49 51232 - 51699 448 155 aa, chain + ## HITS:1 COG:no KEGG:BT_1257 NR:ns ## KEGG: BT_1257 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 155 1 155 155 309 98.0 2e-83 MEILYIKGDATAPIGSGVKVITHICNDIGGWGKGFVLALSKKWKMPEEAYRQWYKSQEEF TLGAVQFVNVENELYVANMIGQHGIYKDSKGLPPIRYDAVRQCLKEVALFAIDHKASVHM PRIGCGLAGGKWELMEQIIKEELITKEIAVTVYDL >gi|226332259|gb|ACIC01000061.1| GENE 50 52237 - 52680 366 147 aa, chain - ## HITS:1 COG:CAC3445 KEGG:ns NR:ns ## COG: CAC3445 COG0454 # Protein_GI_number: 15896686 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 1 144 1 144 147 184 61.0 5e-47 MEIKKVISDKKEFLELLLLADEQESMIDRYLERGDMFVLYDNGLKACCVVTREGEGIYEI KNIATISFFQRQGYGKRLIQFLFDHYRGKCSELLVGTGDVPSTLSFYEHCGFTISHRLKN FFTDNYDHPMYEDGKQLVDMVYLKKTF >gi|226332259|gb|ACIC01000061.1| GENE 51 52933 - 53913 828 326 aa, chain + ## HITS:1 COG:mlr8141 KEGG:ns NR:ns ## COG: mlr8141 COG3049 # Protein_GI_number: 13476735 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Penicillin V acylase and related amidases # Organism: Mesorhizobium loti # 2 324 25 350 350 400 59.0 1e-111 MCTRVVYSGSNGMVATGRSMDWKTDMHSNLWVFPRGMKRNGETGENSLEWTSRYGSVVTS AFEIASTDGMNEKGLVANLLWLPETEYPVRDKNKPGLAITAWVQYMLDNFATVEEAVAYI DEDTFQVVSDMMPDGSRLATLHLSISDATGDCAIFEYIGGKLTVYHSKEYKVMTNSPTYN KQLALSEYWKSIGGLSFLPGTNRAADRFARASFYINALPETDDERIAVASVFSVVRNASV PYGISTPESPEISTTQWRTVSESKNLRYFFESSLTPNTFWVNLKDFDLSEGAPVFKLSIA 
NGEMYHGNTAKNFKTALPFKFMGVKG >gi|226332259|gb|ACIC01000061.1| GENE 52 54212 - 56446 1671 744 aa, chain + ## HITS:1 COG:CC2154 KEGG:ns NR:ns ## COG: CC2154 COG1506 # Protein_GI_number: 16126393 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Caulobacter vibrioides # 153 732 146 735 738 148 25.0 3e-35 MKNLLLAAMLFCTNIAIAQGTIEDYRRAYSAGEKFSANKVFYSNVNPEWIGKTHYFWYVR NTPDGRIYVLVDADTQNRKDLFDHKLVAAALSEASGRKIEATSLYLDRLSVNNGLDTLHF VFNNHRWMYVIDKNQLTDEGALPTPRKQRHWMETDDEKTAAPVTSPDKKYTAFIKNHNIY VKETATGKEKQLSLDGTLGNYYSAYIRWSPDSKKVASCKIRPVEKRYVYYVESSPSDQLQ PKLHKQEYAKPGDELPFKIPCIYDVETGHSVIPSTDLFSQQYYITAPEWNSDSQAITFEY NQRGHQVYRVLEFSAATGKVRPLIEETSDKYVNYSRRFRHDLQNGKRMIWMSERDNWNHL YMYDRTTAQPIRQITKGEWYVRDILRVDETNQLIYFSANGVQAKEDPYLIRYYRIGFDGK NLTCLTPTEGMHNAWFSKDMQYLVDVYSMVDKAPVAVLRDAKNGKVIMPLETADISRLEA EGWKAPEVFTAKGRDGKTDMWGIIIRPSNFDPDKKYPVLEYIYQGPGNQYVPKTFIPYNY FMSGIAELGFIVVMVDGMGTSFRSRTFENVCYKNLKDAGLPDHIVWIKSAAQKYPYMDID RVGIYGCSAGGQESMTAVLFHPEFYKAAYSACGCHDNRMDKIWWNEQWLGYPVGEQYKEG SNVENAHLLSRPLMLVVGEMDDNVDPASTMQVANALIKANKDFELVVIPGAHHTMGESFG EHKRYDFFVRNLLNTTPPKWNEIK >gi|226332259|gb|ACIC01000061.1| GENE 53 56492 - 56839 352 115 aa, chain - ## HITS:1 COG:AGc3635 KEGG:ns NR:ns ## COG: AGc3635 COG1733 # Protein_GI_number: 15889290 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 10 112 36 137 147 108 50.0 3e-24 MKNFHPTGNCPVRDVISRLGDKWSMLVLITLNANGTMRFSDIHKTIEDISQRMLTVTLRT LESDGLIERKIYAEVPPRVEYYLTDTGGTLIPHIEGLVGWALENMDTILEHRKSN >gi|226332259|gb|ACIC01000061.1| GENE 54 57007 - 57534 500 175 aa, chain + ## HITS:1 COG:no KEGG:BT_1263 NR:ns ## KEGG: BT_1263 # Name: not_defined # Def: putative protease I # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 175 1 175 175 346 100.0 2e-94 MAKKVAVLAVNPVNGCGLFQYLEAFFENGISYKVFAVSDTKEIKTNSGMVLIVDDVIANL KGHEDEFDALVFSCGDAVPVFQQYANQPYNVDLMEVIKTFGEKGKMMIGHCAGAMMFDFT 
GITKGKKVAVHPLAKPAIQNGIATDEKSEIDGNFFTAQDENTIWTMLPKVIEALK >gi|226332259|gb|ACIC01000061.1| GENE 55 58004 - 61123 1917 1039 aa, chain + ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 606 908 20 326 328 187 36.0 9e-47 MNMKKYVLAFFLCFISITPILAQRDHIDISNYILCINSYAESSPWSNRMISTVSEYVQKN PKLALYAEHMNTLMIDNDTILGEFKNMISQKYEHHRPRLLILLGNPSLLLRDEYRELWGD IPIVLCSEEKFIGPKDTYIYKQPVTQAERVPISQLADPYNLVLLYSNLYLRENIQLISHI IPDMKKFIFIGDEREVNQTNNQEIQTKLKTINPNIEYQYITPQKMTTNQLLDSLYHVDPN TTGILFASWFYKTTFAGNTSLVTNAHKLIVTTTAPIFTLNMADITEENGGMIGGYTYDQK HYNQQLIHTISEILTGKPAREIPFYMPSDGAPIINYTILLRKGFSPSMCPPNTHFLNKPL GFWKQNKYFIMGTLSFMILLAIVFFYRIHSLNSIKKAQQKEIDAMTNYKNLVNNMPILYM QEEVLADKNGIPVELIYRNVNAHFEKNFFRKEEVVGKKASEIFPESMPDFLHFTQIALSE NKVITFPYYFKKIDTFYDIVLKANRQNNMIDVFCLNSTELHKAQQKLSTINNKLAMSLDV ANIVPWKWDLRSKTILCDINKPIELSTQGKDVAEEQLAVPDYHYFSKIFKEDRKRVEQAY QNLIDGHSEKVKEEYRVVSTQKGFHRIEWVEAQAAVETRDENGKPLTLVGSSLVITERKK MEMELINAKNRAEESNRLKSAFLANMSHEIRTPLNAIVGFSGILASTEEEEEKQEYVSII ENNNTLLLQLISDILDLSKIEAGTLDLHYSNVEINDLMKDLENMCQLKLKSDAVKLEFVA PEEPCFAHIEKNRLSQLIINLVTNAIKFTIQGSIRFGYKRQNNELYFYVADTGCGIPQDK QKSIFGRFVKLNSFAQGTGLGLSICQTLVEHMGGKIGVESEEGNGSTFWFTLPYKQAETV KKSLPKDIQPIAIEKDKLVILIAEDNESNYKLFESILKYDYHLLHAWDGQEAVNMFKEHN PQIILMDINMPVLDGYEATKEIRKYSAKVPIIAITAFAYASDEQRVMESGFDGYMPKPIN ARQLKAQLTDIMQKRIVLL >gi|226332259|gb|ACIC01000061.1| GENE 56 61160 - 61960 607 266 aa, chain - ## HITS:1 COG:VCA0850 KEGG:ns NR:ns ## COG: VCA0850 COG3279 # Protein_GI_number: 15601605 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Vibrio cholerae # 1 242 2 243 261 106 30.0 5e-23 MNKIKAAIIEDEIPAARLLRDTLLSLRPDWEVQLLPGNIEEAVEWFGQHLHPDILFLDIQ LTDGNSFLFIEQAHPESMIVFTTAYDEYAVRAFSVNSIDYLLKPIHEDRLMQTIQRFEGL TKNYIHDFNQESRMLEILQQLSAAQETSVSVQKKYRTRFLISSGEKLFTLQVSDIAYFYS 
ENKLTFAVTHKNREYLIDLALDRLSEQLDPDHFFRTNRQTLVCIDAIQRIESYFLGKAIV HVQPPFKDKIMISKDKMASFRMWLNY >gi|226332259|gb|ACIC01000061.1| GENE 57 61953 - 63008 707 351 aa, chain - ## HITS:1 COG:VC0694 KEGG:ns NR:ns ## COG: VC0694 COG3275 # Protein_GI_number: 15640713 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Vibrio cholerae # 155 284 357 484 558 89 39.0 1e-17 MKRANCMNVKLNTLLYIALFSGLGIFSFLLLINYATFSDRVADMLHSVSTLGFFILAFNV LGYTTIRLSSWIDNQYALNLHRRWKLVSVYIIVMGMFLLLNYGLMVTAKLLAGASYPFTF PNGGWRILITVWLVELVILGLLLANRSMRNTLRLQQKAAALQKENNTARYTALQNQLNPH FLFNSLNTLISEIRYNPANAELFTQHLSDVYRYILQCQNQRLTTLREELDFLNSYIFLHR VRLGDCIHIDNRIPKTCMEAQLPPLTIQLLAENVIKHNVIHTGKPITIELFYMEKERELI VRNRIQTKKTVITSGMGLKNLSARYMLLCNRDIVVENDRKEFTVKIPLLYE >gi|226332259|gb|ACIC01000061.1| GENE 58 63025 - 64329 1246 434 aa, chain - ## HITS:1 COG:RSc0009 KEGG:ns NR:ns ## COG: RSc0009 COG1538 # Protein_GI_number: 17544728 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Ralstonia solanacearum # 26 430 78 479 514 62 22.0 2e-09 MKIRIYSLLLCGFALGSLSALAQQSPLLEKYRTMALDYNHDLKAAEKNIAASMEVEKSAR ADLKPKLSGTANFQYTENPLELTLDVPSLGLSRTVEGKQLNYGGSLSILQPVYTGGRVLE SIRMAQHQQSFAANQAKALNDAVCYQTDIQYWSAVARQEIVTVAEDFRNSIAALVKTIKE RVEVGLVDPQDLLMAEVKLNEAEYQLLQAQSNFETGRMALNSMIGVRLEHHTELDSQIPV VVVDDSTWLSTGMARPEIQMAYDRIRMAESTKKLNDSQFKPQFYVGIDGSYSSPGYNFKK DLDPNYAVYAKVSVPIFEWGKRKSEKRASSFRVGMAEDNLNKVMDQVELEIGVARKALSQ AMERVRLSESSLAKAEENEAKAIERYNEGKVSVVEVIDAQTYRQTSQVNYVQAKAAAQGH YSELIKALHGYDCR >gi|226332259|gb|ACIC01000061.1| GENE 59 64332 - 67364 2109 1010 aa, chain - ## HITS:1 COG:VC1757 KEGG:ns NR:ns ## COG: VC1757 COG0841 # Protein_GI_number: 15641761 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Vibrio cholerae # 1 1008 1 1012 1016 639 35.0 0 MKLVKYFLQKKSVTILLLILVLGGGLFSYIKMGKLEDAPFTIKQALVMTPYPGASPSEVQ 
SQVTDVLEESIQSLGELYYLKTENRAGLSKITVYVKKEIRADEMQQLWDKLRRKVNDVQS KLPAGAGPSVVNDDFGDVLGVFYGLTGEGYSYRELENQAKLIKNVLLRVKDVAKVEIYGV QSPTIDVILNPSVMARSGITTTDISRAFDAQNRVVDAGGIDACVNRIRIESTGNFYSLDD IRNMTIVSRTGEHFRLADIAEIEESYQTPPSNKMRIDGKPAVGIAISTIPTGNVVDMAEA VKQEIDHFAETMPEGFELQTIYDQGYESAVANQGFILNLIISVVTVVAILLFFIGFKNGI LIGSGLVFSIFATLIVMLSQGIALQRMSLAAIIIAMGMLVDNAIVVSDSALVNMQRGMRK RVAILRACSSTSLPLLAATVIAILTFLPIYYSPHITGELLSSLVVVIGVSLMFSWVFALT QTPFFIQEFVRRPRPNELKAALFAGKYYDKFRSALRWVIRHRYATIGCMVIMLVLSAWSF KFIPKVFVPALDKQYFTLDMWLPEGTRIEETDKMVMDMAEYIRGQEETEMVSTYIGRTPP RYYLSNVSFGPQSNYAQILMKCKTSKLSRQLHARLQDSVSLRFPEPLIKVNKFELSPLTE AVIEARFLGPDPAVLDSLVGQAIEIMRQNPKVSDARNEWGNMSMVIRPVYDPVKAGALGI TKASMMESVKSINDGLPVGVYRDNEKKVPVLLKSGNVDITDAHSLGDFSIWNGERSAPLS QVTERIETTWEFPQVRTYNRQLSMAAMCGVKPGHTMAEVHGEIRKEIENIQLPEGYTFFW DAQYKDQGEAMQAIAKYFPLAFLALVVILVGLFGNFREPVIILCVLPLSLIGIAVGMLLT GFDFGFFPIAGWLGLLGMIIKNVIVLIDEINVQHRSGIDLYTSIVEATVSRTRPVLMAAT TTIFGMVPLLFDVAFGGMAATIIFGLTFATGLTLFVTPALYSMFYKVKGK >gi|226332259|gb|ACIC01000061.1| GENE 60 67375 - 68445 970 356 aa, chain - ## HITS:1 COG:VC1675 KEGG:ns NR:ns ## COG: VC1675 COG0845 # Protein_GI_number: 15641679 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 5 341 2 334 343 120 27.0 3e-27 MTTTNFSLCIFCLCLLTSCGKGEKKTNEPVRVKVAEAEMLVPSAEREFSFIAKPFKETEL SFRVGGPIDHFEVYAGNYYHRGDIIAEIDSRDFRIRKERAEAIYHQAKAEFERIKVLYEK NNLSASAYEKARADYTSAKTAYETAVNELEDTKLIAPFDGYIGEVYIEKYQDVKATQSVV SFIDITQLKIEAYVTQEIAFQAKEIKEVGIRFDVRPEAVYPAKVVEVSKSTTRNNLSFLM TALLPNKQGEWPAGISGKMLLDLPATSSVPMVTVPQTALNHRPTEGDYVWMVDQTTGQVV KRKVILGELLPNGKVEVKDGLQAGDKVAVSKLRFLSDGMPVDIISQKEKRAVTAQK >gi|226332259|gb|ACIC01000061.1| GENE 61 68641 - 69942 1195 433 aa, chain + ## HITS:1 COG:SA2117 KEGG:ns NR:ns ## COG: SA2117 COG1757 # Protein_GI_number: 15927906 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Staphylococcus aureus N315 # 4 430 21 448 459 336 45.0 5e-92 
MTEEKIHIERHGSWWALSPLLVFLCLYLVTSILVNDFYKVPITVAFLVSSCYAIAITRGL KLDQRIYQFSVGAANKNILLMIWIFILAGAFAQSAKQMGAIDATVNLTLHILPDNLLLAG IFIAACFISLSIGTSVGTIVALTPVAVGLAEKTEIALPFMVAVVVGGSFFGDNLSFISDT TIASTKTQECVMRDKFRVNSMIVVPAAIIVLGIYIFQGLSITAPAQVQTIEWIKVIPYII VLGTAIAGMNVMLVLIIGILTSGIIGIATGSFDVFDWFGAMGTGITGMGELIIITLLAGG MLETIRYNGGIDFIIKKLTRHVNSKRGAELSIAALVSIANLCTANNTIAIITTGPIAKDI AKRFHLDRRKTASILDTFSCLIQGIIPYGAQMLIAAGLANISPISIIGNLYYPFTMGVCA LLAILFRYPRRYS >gi|226332259|gb|ACIC01000061.1| GENE 62 69873 - 70016 95 47 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLLTRNSGRKDTAFLLIDKYFLSFSYEYLLGYRNRIASRAQTPIVNG >gi|226332259|gb|ACIC01000061.1| GENE 63 70051 - 70605 913 184 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29346681|ref|NP_810184.1| 30S ribosomal protein S16 [Bacteroides thetaiotaomicron VPI-5482] # 1 184 1 184 184 356 100 3e-97 MATRIRLQRHGRKSYAFYSIVIADSRAPRDGKFIEKIGTYNPNTNPATVDLNFDAALAWV LKGAQPSDTVRNILSREGVYMKKHLLGGVAKGAFGEAEAEAKFEAWKNNKQSGLAALKAK QDEEKKAEAKARLEAEKKINEVKAKALAEKKAAEAAEKAAAEAPAEEATEAPAEEAAATE AAAE >gi|226332259|gb|ACIC01000061.1| GENE 64 70781 - 71776 859 331 aa, chain + ## HITS:1 COG:L0146 KEGG:ns NR:ns ## COG: L0146 COG1609 # Protein_GI_number: 15673482 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Lactococcus lactis # 19 313 8 318 345 67 25.0 3e-11 MRRTDKKKTFGQQSSKVTQLADTLSQAISMKKFREGDSLPSINQLSAEYGVSRDTVFKAF LDLRERGLIDSTPGKGYYVTSQVTNVLLLLDQYTPFKEALYNSFVKHLPINYKVDLLFHQ YNERLFNTIIRESVGKYNKYIVMNFDNEKFSMVLNKIHPTKLLLLDFGKFEKEKYSYICQ DFDEAFYQALLMLKERLRHYPQLVLLFSKNLKHPQSSKEYFTRFCEEQGFLCEIQEDIEN LTVRKGVVYIAIKQQDVVKVVKQGRLEGLKCGKDFGLLAYNDIPSYEVIDEGITSLSIDW EMMGNEAANFVLNNVPIQKYLPTEVRLRKSL >gi|226332259|gb|ACIC01000061.1| GENE 65 71849 - 73624 1911 591 aa, chain + ## HITS:1 COG:SP2158 KEGG:ns NR:ns ## COG: SP2158 COG2407 # Protein_GI_number: 15901968 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Streptococcus pneumoniae TIGR4 # 1 591 1 588 588 840 67.0 0 
MKKYPKIGIRPTIDGRQGGVRESLEEKTMNLAKAVAELISNNLKNGDGSPVECVIADSTI GRVAESAACAEKFEREGVGSTITVTSCWCYGAETMDMNPHYPKAVWGFNGTERPGAVYLA AVLAGHAQKGLPAFGIYGRDVQDLDDNTIPEDVAEKILRFARAAQAVATMRGKSYLSMGS VSMGIAGSIVNPDFFQEYLGMRNESIDLTEIIRRMEEGIYDHEEYAKAMAWTEKYCKVNE GDDFKNRPEKRKNREQKDADWEFVVKMMIIMRDLMTGNPKLKEMGFKEEALGHNAIAAGF QGQRQWTDFYPNGDYPEALLNTSFDWNGIREAFVVATENDACNGVAMLFGHLLTNRAQIF SDVRTYWSPEAVKRVTGKELTGLAANGIIHLINSGATTLDGSGQSLDAEGNPVMKEPWNL TDADVENCLKATTWYPADRDYFRGGGFSSNFLSKGGMPVTMMRLNLIKGLGPVLQIAEGW TVEIDPEIHQKLNMRTDPTWPTTWFVPRLCDKSAFKDVYSVMNNWGANHGAISYGHIGQD LITLASMLRIPVCMHNVDENEIFRPTAWNAFGMDKEGADYRACTTYGPIYK >gi|226332259|gb|ACIC01000061.1| GENE 66 73734 - 74372 480 212 aa, chain + ## HITS:1 COG:HI1012 KEGG:ns NR:ns ## COG: HI1012 COG0235 # Protein_GI_number: 16272947 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Haemophilus influenzae # 8 203 8 206 210 105 33.0 6e-23 MITDEHIELFLAQAHRYGDAKLMLCSSGNLSWRIGEEALISGTGSWVPTLAKEKVSICNI ASGTPTNGVKPSMESTFHLGVLRERPDVNVVLHFQSEYATAISCMKNKPTNFNVTAEIPC HVGSEIPVIPYYRPGSPELAKAVVEAMLKHNSVLLTNHGQVVCGKDFDQVYERATFFEMA CRIIVQSGGDYSVLTPEEIEDLEIYVLGKKTK >gi|226332259|gb|ACIC01000061.1| GENE 67 74388 - 75818 1065 476 aa, chain + ## HITS:1 COG:BH1551 KEGG:ns NR:ns ## COG: BH1551 COG1070 # Protein_GI_number: 15614114 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Bacillus halodurans # 7 472 2 461 467 345 39.0 1e-94 MKDTTRNTYLAVDFGGGSGRVIAGSLLQGKLELEEIHRFTNRQVKLGNHVYWDFPALFED MKTGLKLAAQKGYHVKGIGIDTWGVDFGLIDKKGNLLGNPVCYRDARTDGMPDKVFQILD AQKHYACTGIQVMPINTLFQLYSMQQNQDVLLEVAQRLLFMPDLFSYYLTGVANNEYCIA STSELLDARQRNWSMDTIRALGLPEHLFGEIILPGTVRGTLKEEIGRETGLGPVDIIAVG SHDTASAVAAVPATEGQVAFLSSGTWSLLGVEVDEPILTEEARLAQFTNEGGVGGHIRFL QNITGLWILQRLMSEWKLRGEEQSYDTILPQAADAEIDTIIPVDDAEFMNPENMETALLN YCRNHSLKVPGNKAEMVKCVLQSLAFKYREAVAQLNRCLPSPIHRLNIIGGGSQNKLLNQ 
LTANALGIPVYAGPVEATAMGNILTQAMAKGEISSLREIREVVSHSVTPQVYYPEK >gi|226332259|gb|ACIC01000061.1| GENE 68 75826 - 76218 360 130 aa, chain + ## HITS:1 COG:no KEGG:BT_1276 NR:ns ## KEGG: BT_1276 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 130 1 130 130 249 100.0 2e-65 METPQTGYQVKTYNVPVKRYCQTLDLRDSPELIAEYRKRHSQAEVWPEILAGIREVGILE MEIYILGTRLFMIVETPVDFDWDTAMARLNTLPRQQEWEEYMSILQQAAPGMSSAEKWIP MERMFHLYNT >gi|226332259|gb|ACIC01000061.1| GENE 69 76236 - 77552 1220 438 aa, chain + ## HITS:1 COG:HI0610 KEGG:ns NR:ns ## COG: HI0610 COG0738 # Protein_GI_number: 16272552 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Haemophilus influenzae # 16 435 10 425 428 398 51.0 1e-110 MKHTKQSIISKDGVSYLIPFILITSCFALWGFANDITNPMVKAFSKIFRMSATDGALVQV AFYGGYFAMAFPAAMFIRKFSYKAGVLLGLGLYAFGAFLFFPAKMTGEYYPFLIAYFILT CGLSFLETSCNPYILSMGTEETATRRLNLAQSFNPMGSLLGMYVAMQFIQAKLHPMGTDE RALLNDSEFQAIKESDLAVLIAPYLIIGLVILAMLLLIRFVKMPKNGDQNHKIDFFPTLK RIFTQTRYREGVIAQFFYVGVQIMCWTFIIQYGTRLFMSPEYGMDEKSAEVLSQQYNIVA MVIFCISRFICTFILRYLNAGKLLMILAIFGGIFTLGTIFLQNIFGLYCLVAVSACMSLM FPTIYGIALKGMGDDAKFGAAGLIMAILGGSVLPPLQASIIDMKEIASMPAVNVSFILPL TCFLVIIGYGYRTVKRNW >gi|226332259|gb|ACIC01000061.1| GENE 70 77739 - 78290 404 183 aa, chain + ## HITS:1 COG:all2193 KEGG:ns NR:ns ## COG: all2193 COG1595 # Protein_GI_number: 17229685 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Nostoc sp. 
PCC 7120 # 14 175 33 193 201 61 25.0 8e-10 MTPNQDERLLAQQFETIFTKYYSVVKYFALMLLKSEEDAKDITQDVFTKLWTKPELWTEV PNPTPYIYTLTKSTTLNFIKHKKVELAYQEKIIEKSLIDELFQSEDTLNPIYYKEAQLII KLVLERLPEQRRMIFEMSRFKHMSNLEIAEKLNISKRTVEHHIYLTLLEMKKIIFFAFFL LFP >gi|226332259|gb|ACIC01000061.1| GENE 71 78350 - 79336 554 328 aa, chain + ## HITS:1 COG:AGl2289 KEGG:ns NR:ns ## COG: AGl2289 COG3712 # Protein_GI_number: 15891252 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 16 274 32 272 323 84 27.0 2e-16 MKNYFQKIITLFTGNDYPESTQQDFYKWLVDEEHTSEKDEALQKLWDEAHKQRTATDMQE AYELLKKNAGIPPIQRKRTIRPIHIWQTVAAVLFIVAASSVYLSTIGKDAEENLIQQYIP TAEIRTLTLPDGTQVQLNSQSTLLYPQNFTGKDRSVFLIGEANFKVKPDKKHPFIVKSND FQVTALGTEFNVSAYPENPVLAATLISGSVLVEYNDLKSQVILKPNEQLAYNKNTHYHSL DHPDMKEVTAWQRGELVFREMSVKDIITILERKYPYTFEYQLKTLKDDRYSFRFKDQAPL SEVMDVIVNVVGQMNYKIKGDRCYLIPK >gi|226332259|gb|ACIC01000061.1| GENE 72 79534 - 82866 2928 1110 aa, chain + ## HITS:1 COG:no KEGG:BT_1280 NR:ns ## KEGG: BT_1280 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1110 1 1110 1110 2160 100.0 0 MLQIYEFIVDQTAKGRSFLLFFCLLMMQTPSFAQNSAKITIQKKNISVIEALKEIEKQSD YSVGYNDSQLKNKPVLNLDLKAATLEYALSQILRGSGFTYQFKDKYIMIIPDKKLKETPT KKVSGIVIDENNEPLIGVNIKVEGSSEGAITDIDGNFNIMAPQGSTLSFTYVGYTPQTVK ITDKNIYEIRLASDTKQLNEVVVTALGIKREQKALSYNVQQVKSDQLTTIKDANFVNSLS GKVAGVIINSSSSGVGGASKVVMRGTKSIEKSNNALYVIDGIPMYNFGGGGSTEFGSKGE TESIADLNPDDIESLSVLTGAAAAALYGSNAANGAIIVTTKKGKIGKLQASVSTGIDWLK PFVMPKFQNRYGTGSNGKSDGSTVWSWGPKMGNSPGYDPNDFLKTGAVYNNSVTLSTGTE KNQTFFSAAALNSDGMIPNNRYNRYNFTFRNTTSFLNDKMKLDVGASYILQNDRNMTNQG QYSNPLVPAYLYPRGDNFSTVKVFERYNEARKINEQFWPSGEGDLRMQNPYWIAYRNLRE NSKKRYMLSAGLTYDVLDWLSLSGRIRIDNSNNTYEQKYYASTITTLTEGSEQGYYGIEK SGNSQTYADFLVNINKRVGDFTIVANIGTSLSDNSYDMLGYNGPIQEKGIPNVFNVFDLD NTKKRATQEGWEEMTQSIFASAEVGWKSMLYLTLTGRNDWASQLLGSSQTSFFYPSVGLS 
GVISEMVKLPEWIDYLKVRGSFSSVGMPYPRFLTIPTYKYDTTILGWLPKTHYPIGKLYP ERTDSWEAGLDATFFKDLRLSASFYYANTYNQTFDPKISASFGYSKFYVQTGYVRNMGME GMLSYGHRWNEFGWNSNFTFSWNKNKIIELVKDFHHPETGKILNIPELEMNGLGYSRFIL KEGGTLGDLYSKADLVRNDKGYIEMDANGALAKDANVEPIKLGSVLPKANFAFSNEFTYK GVSAGFLLSCRLGGIVYSATQAALDQYGVSEASAKARDAGGVVINGRTMINAQQWYETIG SSSGLPQYYTYSATNIRLQEAHIGYTIPRRWLKNICDINVSLVGRNLWMIYCKAPFDPES VATTSNYYQGIDYFMMPSTRNIGFNVKFNF >gi|226332259|gb|ACIC01000061.1| GENE 73 82880 - 84475 1214 531 aa, chain + ## HITS:1 COG:no KEGG:BT_1281 NR:ns ## KEGG: BT_1281 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 531 1 531 531 1089 100.0 0 MKNLRYMLIAACSACLLLPLGSCMDTDMNRNKYEVDSEEIGRENYDLGSTIRGLQGLVIP AQEHLYQFMEAMCGGSYAGYFGETRTGWLEKYSTYNPKTDWLKAPFTDVISETYPKYYAV LQHEDAPVALALAKLLRVTIMQRVTDIYGPIPYSKVLVSGEGSESDGLNAAYDSQKDVYM RMFQELEEADQALEDNMTEGNSGFEKLDDVYYGKLQQWRLFLHSLQLRMAMRLCYTDMAA EAQSIAEKAVTAGVIEKNDDNALFHVAENRSALCFNDWKDYRVGADIICYMNGYADPRRD KYFTKVKNNDQEGYYGMRIGINSPFSDDDMITSYSNRLMTASDPYVWMTASEVAFLRAEG ALRKWNMGGEAKDFYETGVKLSFEEHGASGAEDYLNSIASPSGYTDPLGSYSTGSPANIT VKWNEMGEQAFEENLERIITQKWIALFPNGIESWSEHRRTGYPKLLPVVVNKGRNVSTEA GMRRLMYPNEEYTQNSFHLNNAINVLIKESSNNQGGDTGGTHVWWDRKANK >gi|226332259|gb|ACIC01000061.1| GENE 74 84499 - 85482 848 327 aa, chain + ## HITS:1 COG:no KEGG:BT_1282 NR:ns ## KEGG: BT_1282 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 327 1 327 327 642 99.0 0 MKNKNLIFSFLTVVTLTIGAGCLWTSCDDWTETEQVDYGVTTPDGQNPELYARYTQAVRN YKSRKHYAVCVRFDNGHSGDGEKDFLRSMPDSIDAVILENAATLNSADLEDIPVLQTNFA TKVLFSFNLTSIKENAESSGQEIKTLLAPALEQMVSAITDNGLDGASISYTGDIGLGNNA AVNASITEMRQLLLDKITPLAKNGKIFFLESNPLFIPEANRDVFTRYVLNTTSSKNASQL RLLINEAIYYAGIPSDKLLITGDPELTTTDNNDGLVSQVPFFAIQVIDCGPIGGLMIQNV AADYSHANITYKETRGAIQTLNPSPLK >gi|226332259|gb|ACIC01000061.1| GENE 75 85501 - 86823 857 440 aa, chain + ## HITS:1 COG:no KEGG:BT_1283 NR:ns ## KEGG: BT_1283 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # 
Pathway: not_defined # 1 440 1 440 440 882 99.0 0 MKKQKIRFAIFMLLLAVTACDNADYSKSAPFDNGVYLSVAESKTSETMTFNKTIIQREKS FTARLAYPAGTDVTVNVTVDPSQVEAYNSRNNTSYDILPAKHYRLESNEMTIRAGKINSA PLHIYFENLTELEIDKAYLCPVSLNSAQGVGLLDGSTTYWYIVKRSSAITTAVDLRYCYV EVPGFYVPKWGTEPAGNAAHLNNLKAVTFEIIIRISNFDEANTDISSLMGIEQYFCFRAG DAGFPRQQLQIQTPAGKFPEANKAKLLKENEWYHLALTYDIATKTIIFYVNGKEQSRSTN YGNSEFSEIKLANRKQESFPGASDGDWLFYIGRSYNDRWLIDRQLNGNVCEARIWDVART QQEIWKNMYDIEDPENEAHLAAYWKFNEGSSNDIKDYSKYHNDAHIVRYWKDFNSKEEYD VKNEELWPSGIEVPQVNRED >gi|226332259|gb|ACIC01000061.1| GENE 76 86851 - 88377 914 508 aa, chain + ## HITS:1 COG:no KEGG:BT_1284 NR:ns ## KEGG: BT_1284 # Name: not_defined # Def: putative endo-beta-N-acetylglucosaminidase F1 precursor (mannosyl-glycoprotein endo-beta-N-acetyl-glucosaminidase F1) # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 508 1 508 508 1010 100.0 0 MKQILKHILILAFAAGFIGTACENNDLNIDAGTFPETGGIGLSMGILQSDNYAMENPQIN MDHASLSDQFHISLTEPASQTGNYTVKVDESKVLDFNSKHGTSYPLYPTEYIDLGNSGKM TIEKGEQQSNSVSIAFKYDEAIEDSVIYVLPLTVEENNSSPAMSSERKTLYYIINVWGMA PAEYNAIKKNFIQIAGVDPEFTNPLLLNKLYFESMSLSSPEVDYYNPFDIINLQFATVKA DDNLLPSLYLKDDLAYVLKKREKYIVPLQQLDHKVCLAIKGAGEGIGFSNLGEKEMMIFV ERIKQMIDIYHLDGVNLYDANFSYEESNENINYSNNLCKFVASLRDKLGNKIITYTQTSE SPEGITNDANLKLGELLDYAWCDQLNTIIDPWSTPEKWTRPIAGLNKEKWGALNTDIHMS SEQANILDQVIEMFTQPSLMITAGINHVFVVNRVDYVSAGTESYAPTYMAYGAICNLCDM EKEYFVTGINSPNNQYLNIHDLLMPKDY >gi|226332259|gb|ACIC01000061.1| GENE 77 88574 - 89494 584 306 aa, chain + ## HITS:1 COG:no KEGG:BT_1285 NR:ns ## KEGG: BT_1285 # Name: not_defined # Def: endo-beta-N-acetylglucosaminidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 306 1 306 306 569 100.0 1e-161 MKNLYKILASSIAIFAFCACSQEEMPVNQSDNNQSEVVTRSATGIKNIVYIEVNDINPLN AGSYIMDDAPFFDYVILFAANIRGVGSDATLYNNPNVQYILDHKDTLIKPLQDKGIKVLL GLLGDHTGLGFANMNSAQTEQFATAVANAVSQYGLDGVDFDDEWAEYGRNGYPSGSTGSF SNLITALHNKMPGKTITVFNYGYTSELTGVNSYIDYGIYAFFNSPSWSTGFGMPNSKFAP YTINLNSAPSAASAQLYSGQVASKGYGAIGYYDLRANNIVSVLNGVAKGAFKSTCTYDGN SYPKNY >gi|226332259|gb|ACIC01000061.1| 
GENE 78 91257 - 92624 595 455 aa, chain + ## HITS:1 COG:lin0157 KEGG:ns NR:ns ## COG: lin0157 COG2207 # Protein_GI_number: 16799234 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 319 454 136 275 277 60 26.0 6e-09 MDLKQIAAHSLKMNITEHELGNFLKNKISSLKDHAANRHIKLEIQTEFKYASVWIDQSKI SPVIEKFIKNIIDHTEYSKKVIIHASLCNKYWIIKASYTGNENLMKCYKCSGKHLIKQKT EPKYYFAKSALCKKLTEICNGEILINNSCHTITLRFPVKPSYENVSERSLVHIEKNIEEE KIDALFHKSSQKRNSSRPVVIMVDSNEEFRSYLETCLSEEYIVRGFENGADALGYIKNEH PDLVICDTVLNGMGGDELSSRLKTSKETSIIPVILYGSHIDADQRSKREASLADTFMHTP FHVEDLKIEIAVLIRNSRLLRKAFLHKVFGEHFLQVQPTEMIKEATEGSKSLLINEVKEY ILERIEKEDLTIDAIASHLKMSRTKFYNKWKSVTGEAPTVFIEAVRMEKAHELLESGKYP VGTIPEMIGLKDVKNFRNRYKEYFGITPKESIKKK >gi|226332259|gb|ACIC01000061.1| GENE 79 92802 - 93458 687 218 aa, chain + ## HITS:1 COG:no KEGG:BT_1287 NR:ns ## KEGG: BT_1287 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 218 1 212 212 382 100.0 1e-105 MKTLTKMKKVCCLSRFVLLVMALGMTSIFTSCSSDDDENIPVYSLKDVEGTYSGKMQTTS IAPLENAENEAEDGVTVNAELKDNHILISKFPVEDLIKSIIEDPDAAEAIIKAVGDVEYK IPYQAAFNDNKDNILLQMSPEPLVLEIAMEPQPIRKTEGEEETPKLTVKVTIKADEKGSY AYEGQKLKFAIQATEVTVNEERFDKFPATTFSFDMTKK >gi|226332259|gb|ACIC01000061.1| GENE 80 93659 - 95035 1329 458 aa, chain - ## HITS:1 COG:lin0800 KEGG:ns NR:ns ## COG: lin0800 COG0687 # Protein_GI_number: 16799874 # Func_class: E Amino acid transport and metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Listeria innocua # 13 359 11 325 357 187 32.0 3e-47 MKMCCYVNKMKRIVSVNLFLLVLAGFLSGCYNSGESRERVLKIYNWADYIGEGVLEDFQS YYKEQTGEDIHIVYQTFDINEIMLTKIEKGHEDFDVVCPSEYIIERMLKKHLLLPIDTTF AHSPNYMNNVSPFIREQINKLSQPGEEASRYAVCYMWGTAGILYNRAYVPDSDAFSWECL WDKKYAGKILMKDSYRDAYGTAVIYAHAKELEEGTVTVEDLMNDYSPRAMEVAEKYLKAM KPNIAGWEADFGKEMMTKNKAWLNMTWSGDAIWAIEEANAVGVDLDYVVPKEGSNIWYDG WVIPKYAKNPVAASYFINFMCRPDIALRNMDFCGYVSSIATPEILEEKVDTTLDYYADLS 
YFFDPDADSIQIDKIQYPDRKVVERCAMIRDFGDKTKEVLDIWSRIKGDNLGVGITILIF VVVALMSGWMIYKRWQRYSRRKQQNRRSRRKRKKNVKR >gi|226332259|gb|ACIC01000061.1| GENE 81 95032 - 95829 789 265 aa, chain - ## HITS:1 COG:CAC0838 KEGG:ns NR:ns ## COG: CAC0838 COG1177 # Protein_GI_number: 15894125 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component II # Organism: Clostridium acetobutylicum # 1 257 1 252 260 184 42.0 1e-46 MVKKIFAQTYLWVLLLLLYSPIVIIVIYSFTEAKVLGNWTGFSTKLYSSLFNTGAHHSLM NALINTITIALLAATASTLLGSIAAIGIFNLKARSRKAIGFVNSIPILNGDIITGISLFL LFVSLGITQGYTTVVLAHITFCTPYVVLSVLPRLKQMNPNIYEAALDLGATPMQALRKVI IPEIRPGMISGFMLALTLSIDDFAVTVFTIGNQGLETLSTYIYADARKGGLTPELRPLSA IIFVVVLALLIVINYRAGKTKKKES >gi|226332259|gb|ACIC01000061.1| GENE 82 95823 - 96623 566 266 aa, chain - ## HITS:1 COG:CAC0839 KEGG:ns NR:ns ## COG: CAC0839 COG1176 # Protein_GI_number: 15894126 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Clostridium acetobutylicum # 18 266 12 277 277 166 38.0 6e-41 MNKKFIVFLSSRKSWTLPYIIFSAIFVVIPLVLIVVYAFTDDNGHLTLANFQKFFEHPEA INTFVYSIGIAIITTLICILLGYPAAWILSNSKLNRSKTMVVLFILPMWVNILVRTLATV ALFDFFSVPLGEGALIFGMVYNFIPFMIYPIYNTLQKMDHSYIEAAQDLGANPVQVFFKA VLPLSMPGVMSGIMMVFMPTISTFAIAELLTMNNIKLFGTTIQENINNSMWNYGAALSLI MLLLIAATSLFSTDDKDNTNEGGGLW >gi|226332259|gb|ACIC01000061.1| GENE 83 96635 - 98020 1374 461 aa, chain - ## HITS:1 COG:BB0642 KEGG:ns NR:ns ## COG: BB0642 COG3842 # Protein_GI_number: 15594987 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Borrelia burgdorferi # 4 348 2 347 347 402 56.0 1e-112 MQEDKSIIEVSHVSKFFGDKTALDDVTLNVKKGEFVTILGPSGCGKTTLLRLIAGFQTAS EGEIRISGKEITQTPPHKRPVNTVFQKYALFPHLNVYDNIAFGLKLKKTPKQTIGKKVKA ALKMVGMTDYEYRDVDSLSGGQQQRVAIARAIVNEPEVLLLDEPLAALDLKMRKDMQMEL KEMHKSLGITFVYVTHDQEEALTLSDTIVVMSEGKIQQIGTPIDIYNEPINSFVADFIGE 
SNILNGTMIHDKLVRFCGTEFECVDEGFGENTPVDVVIRPEDLYIFPVSEMAQLTGVVQT SIFKGVHYEMTVLCGGYEFLVQDYHHFEVGAEVGLLVKPFDIHIMKKERVCNTFEGKLQD ATHVEFLGCTFECASVEGLESGTDVKVEVDFDKVILQDNEEDGTLTGEVKFILYKGDHYH LTVWSDWDENVFVDTNDVWDDGDRVGITIPPDAIRVIKITD >gi|226332259|gb|ACIC01000061.1| GENE 84 98166 - 99266 1189 366 aa, chain + ## HITS:1 COG:BH1577 KEGG:ns NR:ns ## COG: BH1577 COG0526 # Protein_GI_number: 15614140 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Bacillus halodurans # 205 363 27 166 176 75 25.0 1e-13 MKKLTYLVIATAALGMVACTGNKAGYVITGTVEGAADGDTVYLQEATGRNLTKLDTAVIS KGTFTFEGTQDSAVNRYVTCEVNGEPLMIDFFLENGKINIALTKDNDSATGTSNNDAYQA IRTQINDISQKMNAIYEAMGDSSLSDEQKEAKQKEGAQLEEEYDKVIKEGVQKNITNPVG VFLFKQTFYNNSTEENAALLEQIPANFQNDETIVKVKEQTEKQKKTAVGTKFIDFEMQTP EGKTVKLSDYAGKGKVVLVDFWASWCGPCRREMPNLVEAYAQYKGKNFEIVGVSLDQDAA AWKESIKKLNMTWPQMSDLKFWQSEGAQLYAVNSIPHTVLIDGDGTIIARGLHGEGLQAK LTEVVK >gi|226332259|gb|ACIC01000061.1| GENE 85 99376 - 101094 1792 572 aa, chain - ## HITS:1 COG:PH1512 KEGG:ns NR:ns ## COG: PH1512 COG0058 # Protein_GI_number: 14591294 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Pyrococcus horikoshii # 3 477 287 741 837 394 43.0 1e-109 MINVQRICDYVATGLTFDQAIELVRASSLYTVHTPVPAGHDYFDEGLFGKYMGGYPARMG ISWDDLMDLGRNNPGDKGERFCMSVFACNTAQEVNGVSWLHGKVSQEMFSTIWKGYFPEE MHVGYVTNGVHFPTWSATEWKELYFKYFNENFWFDQSNPKIWEAIYNVPDEEIWKTRMTM KNKLVDYIRKSFRDTWLKNQGDPSRIVSLMDKINPNALLIGFGRRFATYKRAHLLFTDLE RLSKIVNNPDYPVQFLFTGKAHPHDGAGQGLIKRIIEISRRPEFLGKIIFLENYDMQLAR RLVSGVDIWLNTPTRPLEASGTSGEKALMNGVVNFSVLDGWWLEGYREGAGWALTEKRTY QNQEHQDQLDAATIYSILETEILPLYYARNKKGYSEGWIKVVKNSIAQIAPHYTMKRQLD DYYNKFYNKLAKRFHMLSANDNAKAKEIAAWKEEVVAKWDSIEIVSCDKLEDLKAGDIES GKEYTITYVIDEKGLNDAIGLELVTTYTTADGKQHVYSVEPFSVIKKEGDLYTFQVKHSL SNAGSFKVSYRMFPKNPELPHRQDFCYVRWFI >gi|226332259|gb|ACIC01000061.1| GENE 86 101111 - 101941 957 276 aa, chain - ## HITS:1 COG:all2806 
KEGG:ns NR:ns ## COG: all2806 COG0058 # Protein_GI_number: 17230298 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Nostoc sp. PCC 7120 # 9 270 4 274 737 226 40.0 5e-59 MKIKVSNVNTPNWKEVTVKSRIPAELEKLSEISRNIWWAWNFEATELFRDLDPELWKECG QNPVLLLERMSYEKLEALAKDKVILRRMNDVYTKFRDYLDVKPDENRPSVAYFSMEYGLS SVLKIYSGGLGVLAGDYLKEASDSNVDLCAVGFLYRYGYFTQTLSMDGQQIANYEAQNFG QLPIDRVMDANGQPLVVDVPYLDYYVHANVWRVNVGRISLYLLDTDNEMNSEFDRPITHQ LYGGDWENRLKQEILLGIGGILTLKALGIQKRRLSL >gi|226332259|gb|ACIC01000061.1| GENE 87 101979 - 103640 1470 553 aa, chain - ## HITS:1 COG:YLR258w KEGG:ns NR:ns ## COG: YLR258w COG0438 # Protein_GI_number: 6323287 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Saccharomyces cerevisiae # 11 552 10 622 705 211 27.0 2e-54 MVKDLLTPDYIFESSWEVCNKVGGIYTVLSTRANTLQEKFRDRIFFIGPDVWQGKENPLF IESDNLCAAWKKHALENDHLSVRIGRWNIPGEPIVILVDFQPFFEKKNDIYTEMWNRFQV DSLHAYGDYDEASMFSYAAGRVVESFYRYNLTETDKVVYQAHEWMTGMGALYVQEAVPEV ATIFTTHATSIGRSIAGNHKPLYDYLFAYNGDQMAQELNMQSKHSIEKQTAHHVDCFTTV SEITNNECKELLDKPADVVLMNGFEDDFVPKGNTFTGKRKRARSLMLNVANKLLGTNLDD DTLIIGTSGRYEFKNKGIDVFLESLNRLNRDKNLHKNVLAFINVPGWVGDPREDLQVRLK SKEKFDTPLEVPFITHWLHNMTHDQVLDMLKYLGMGNRPEDKVKVIFVPCYLDGRDGIMN KEYYDILLGQDLSVYASYYEPWGYTPLESVAFRVPTITTDLAGFGLWVNSLKNQHGINDG VEVLHRSDYNDSEVADGIKDTITLFADKSEKEVKEIRKRAADVAEQALWKHFIQYYYEAY DIALRNALKRQLS >gi|226332259|gb|ACIC01000061.1| GENE 88 103846 - 104307 649 153 aa, chain - ## HITS:1 COG:SPy0149 KEGG:ns NR:ns ## COG: SPy0149 COG0636 # Protein_GI_number: 15674359 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K # Organism: Streptococcus pyogenes M1 GAS # 5 148 14 154 159 75 38.0 3e-14 MEMNLFIAYIGIAVMVGLSGIGSAYGVTIAGNAAIGALKKNDSAFGNFLVLTALPGTQGL YGFAGYFMFQTIFGILTPEITAIQASAVLGAGIALGLVALFSAIRQGQVCANGIAAIGQG HNVFSNTLILAVFPELYAIVALAATFLIGSALA >gi|226332259|gb|ACIC01000061.1| GENE 89 104360 - 106183 1990 607 aa, 
chain - ## HITS:1 COG:BB0091 KEGG:ns NR:ns ## COG: BB0091 COG1269 # Protein_GI_number: 15594437 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit I # Organism: Borrelia burgdorferi # 1 607 1 607 608 180 25.0 9e-45 MITKMKKLTFLVYHKEYEEFLNSLRELGVVHIVEKQQGAADNTELQENIRLSNRLTATLK LLQNQKHEKDAVIATEGGTAARGLQVLDEVDTLQTEHGKLSQQLQGYAKEREALQAWGNF DPASVRKLKDAGYVIGFYSCSEGNYQEEWETEYNAMIINRISSKVFFVTVTKAGQEVDLD VEQAKLPAYSLAHLETLYNTTEQAIEENEKKLVALSETDVPSLKVALKELQGQIEFSKVV LSSEQAAGDKLMLIEGWAPAYSKVEIEAYLNDAHVYYEITDPMPGDNVPIRLNNKGFFAW FEPICKLYMLPKYNELDLTPFFAPFFMIFFGLCLGDSGYGLFLFLGATAYRLLAKKVTPS MKSILSLIQVLAVSTFFCGLLTGTFFGANIYDLDWPIVQRLKHAVLMDNNDMFQLSLILG AIQILFGMVLKAVNQTIQFGFKYAVATIGWIILLVSMAVSALLPNVLPMGGTVHLVILGI SGAMIFLYNSPGKNIFMNIGLGLWDSYNMATGLLGDVLSYVRLFALGLSGGILAGVFNSL AVGMSPDNVIAGPIVMVLIFVIGHAINIFMNVLGAMVHPMRLTFVEFFKNSGYEGGGKEY KPFRNLE >gi|226332259|gb|ACIC01000061.1| GENE 90 106180 - 106785 508 201 aa, chain - ## HITS:1 COG:TP0428 KEGG:ns NR:ns ## COG: TP0428 COG1394 # Protein_GI_number: 15639419 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit D # Organism: Treponema pallidum # 1 179 1 180 206 87 30.0 1e-17 MAIKFQYNKTSLQQLEKQLKVRVRTLPIIKNKESALRMEVKRCKTEAADLEDRLEQQIQA YEAMFALWNEFDASLIKVNDVHLGVKKIAGVRVPLLENIDFEIRPYSMFNAPKWYADGIH LLEELAHTAIEREFMLAKLNLLEHARKKTTQKVNLFEKVQIPGYQDALRKIKRFMEDEEN LSKSSQKIMKSHQEKRKEEEA >gi|226332259|gb|ACIC01000061.1| GENE 91 106826 - 108151 1537 441 aa, chain - ## HITS:1 COG:TP0427 KEGG:ns NR:ns ## COG: TP0427 COG1156 # Protein_GI_number: 15639418 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit B # Organism: Treponema pallidum # 8 438 3 429 430 397 47.0 1e-110 MATKAFQKIYTKITQITKATCSLKATGVGYDELATVNGKLAQVVKIAGDDVTLQVFEGTE GIPTNAEVVFLGKSPTLKVSEQLAGRFFNAFGDPIDGGPEIEGQEVEIGGPSVNPVRRKQ PSELIATGIAGIDLNNTLVSGQKIPFFADPDQPFNQVMANVALRAETDKIILGGMGMTND DYLYFKNVFSNAGALDRIVSFMNTTENPPVERLLIPDMALTAAEYFAVNNNEKVLVLLTD 
MTSYADALAIVSNRMDQIPSKDSMPGSLYSDLAKIYEKAVQFPSGGSITIIAVTTLSGGD ITHAVPDNTGYITEGQLFLRRDSDIGKVIVDPFRSLSRLKQLVTGKKTRKDHPQVMNAAV RLYADAANAKTKMENGFDLTNYDERTLAFAKDYSNQLLAIDVNLDTTEMLDVAWGLFGEY FRPEEVNIKKELVDQYWPKGE >gi|226332259|gb|ACIC01000061.1| GENE 92 108185 - 109942 1928 585 aa, chain - ## HITS:1 COG:TP0426 KEGG:ns NR:ns ## COG: TP0426 COG1155 # Protein_GI_number: 15639417 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit A # Organism: Treponema pallidum # 3 585 4 576 589 521 45.0 1e-147 MATKGTVSGVIANMVTLVVDGPVAQNEICYISTGGDKLMAEVIKVVGANVYVQVFESTRG LKVGAEAEFTGHMLEVTLGPGMLSKNYDGLQNDLDKMDGVFLKRGQYTYPLDKERIWHFV PMVKAGDKVVASAWLGQVDENFQPLKIMAPFTMNGTATVKTIMPEGDYKIEDTIAILTDE EGNDIPVTMIQKWPVKRAMTNYKEKPRPFKLLETGVRVIDTLNPIVEGGTGFIPGPFGTG KTVLQHAISKQAEADIVIIAACGERANEVVEIFTEFPELVDPHTGRKLMERTIIIANTSN MPVAAREASVYTAMSLAEYYRSMGLKVLLMADSTSRWAQALREMSNRMEELPGPDAFPMD ISAIISNFYGRAGYVKLGNDETGSITFIGTVSPAGGNLKEPVTENTKKVARCFYALEQDR ADKKRYPAVNPIDSYSKYIEYPEFEEYIKEHINDEWIGKVNELKTRLQRGKEIAEQINIL GDDGVPVEYHVIFWKSELIDFVILQQDAFDEVDSVTPMERQEAILNMIIDICHTEFEFDN FNEVMDYFKRMINVCKQMNYSKFKSEQYEGFQQQLKELIAERSVK >gi|226332259|gb|ACIC01000061.1| GENE 93 109960 - 110802 769 280 aa, chain - ## HITS:1 COG:no KEGG:BT_1300 NR:ns ## KEGG: BT_1300 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 280 1 280 280 514 97.0 1e-144 MGSKYYYLVAGLPELTLEDSKLSYTVADFKTEIYSGLSASDQKLIDLFYLKFDNANVLKL LKDKEAEIDKRGNYSAAELTEYISILREGGEISPKEFPSYLSVFISDYLNTPAESMVLQE DHLAALYYEYAMKCGNKFVSSWFEFNLNINNILVAFTARKFKWDIAAHIVGNTEVCEALR TSSARDFGLSGEVDVFESLVKISEITELVEREKKLDALRWNWMEDAIFFDYFTVERIFAF LLKLEMIERWISLDKERGNQLFRSIIESLKNEVQIPAEFR >gi|226332259|gb|ACIC01000061.1| GENE 94 110814 - 111404 621 196 aa, chain - ## HITS:1 COG:no KEGG:BT_1301 NR:ns ## KEGG: BT_1301 # Name: not_defined # Def: ATP synthase subunit E # Organism: B.thetaiotaomicron # Pathway: Oxidative phosphorylation [PATH:bth00190]; Metabolic pathways [PATH:bth01100] # 1 196 1 196 
196 320 98.0 2e-86 MENKIQELTDKIYREGVEKGNEEAQRLIANAQDEAKKIIEDARKEAESIVAASRKSADEL ADNTKSELKLFSGQAVNALKSEIATMVTDKLITASVKDFAQDKDYLNAFIVALASKWSVD EPIVISTADAESLKKYFAAHAKALLDKGVKIEQVNGIKTLFTVSPADGSYKVNFGEEEFM NYFKAFLRPQLVEMLF >gi|226332259|gb|ACIC01000061.1| GENE 95 111632 - 112078 251 148 aa, chain - ## HITS:1 COG:no KEGG:BT_1302 NR:ns ## KEGG: BT_1302 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 148 1 148 148 242 85.0 3e-63 MKTIKYFAVLSLLVGLSGCNSKPDLDGFREVEISNNMLDTIPENRLNNYKETIEKTYLLG CLVLGTAGSSSESSPAQLKEWKESLNMPDTIKPIYRERLWDFVFYPQITYEVESRYLGIY QRRKTVIKGIGLGMITGKEEEYIKSRQK >gi|226332259|gb|ACIC01000061.1| GENE 96 112096 - 112896 541 266 aa, chain - ## HITS:1 COG:no KEGG:BT_1303 NR:ns ## KEGG: BT_1303 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 266 1 266 266 392 74.0 1e-108 MKRTTYIIIGLFVLGLLLIPILLRTPAAEKTAHRLSMEGERTEIEMAGIRHVKIVEKLGE IDKDDIGVLGKLDVQASSFSDKSILVYPKSKYLNVSQEGDCLLVEINLDDKGLSQITGEI EKGARLVLGGLNMELKTDSTLLSIENRVSRVNINVKKMNLDSLYLFNDEWQDISLESCSF RSLDIRGKKLSLKANQVKAINYYEELGTIRSSEISGFSVDNLYLSGKDSHRYRTGINCKR IFWNPLAKDAELELTLRQKGEIILNR >gi|226332259|gb|ACIC01000061.1| GENE 97 112910 - 113281 303 123 aa, chain - ## HITS:1 COG:BH3492 KEGG:ns NR:ns ## COG: BH3492 COG1725 # Protein_GI_number: 15616054 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 3 120 5 118 129 66 30.0 1e-11 MNFKESRAIYLQIADRICDDILLGQYEEEGRISSVREYASIVEVNANTVMRSYEYLQSQE VIYNKRGIGFFVASGAKALIHSLRKEQFLKEEVDGFFRQLYTLGISIKEIEKMYYEFIQR QNQ >gi|226332259|gb|ACIC01000061.1| GENE 98 113297 - 114112 399 271 aa, chain - ## HITS:1 COG:no KEGG:BT_1305 NR:ns ## KEGG: BT_1305 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 271 1 271 271 412 83.0 1e-114 MIKDTFFSFSRFGKLLHKEVSENWKLYMLRTLMIYGVLTIALIWAGYFEYKSYNPLFSFD DANVHVTTLITWMWVWWGFTCLGASFAFESMKSKTRRITNLMVPATTFEKYFLHWLIYVL 
ILPLLIYFFIILADYTRVFVCAMIYSEVPSIEATQFKYFVDQGDGKYSLCHNWSEGTLLI LLNLYIQSLFLLGSIFWSRNAFIKTLIILVVIGFVYYWSGVVAVHLFKDCDIINLNSFIF KWGHILFPVLTIINWVLTYFRLKESEVIHRM >gi|226332259|gb|ACIC01000061.1| GENE 99 114116 - 114964 677 282 aa, chain - ## HITS:1 COG:BB0573 KEGG:ns NR:ns ## COG: BB0573 COG1131 # Protein_GI_number: 15594918 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Borrelia burgdorferi # 2 215 5 214 270 166 41.0 5e-41 MIKAENLSFIYPKSKRMVLHDFSFSLEAGRVYGLLGRNGAGKSTLLYLIAGLLTPKKGKI VFHDINVRSRLPMTLQDMFLVPEEFELPSIPLKKYIELNAPFYPNFSKEDMHKYLHCFEM DMDVNLGALSMGQKKKVFMSFALATNTSLLLMDEPTNGLDIPGKSQFRKFMASGMTENKT IVISTHQIRDIDKMLDSVMIMDESRVLLNESIVRICEKLCFKESDDRSLMDKALFAVPSL QGNSLLLLNEYGEDSDINLELLFNATLAQPQKITKLFHAQYE Prediction of potential genes in microbial genomes Time: Thu May 12 00:44:17 2011 Seq name: gi|226332258|gb|ACIC01000062.1| Bacteroides sp. 1_1_6 cont1.62, whole genome shotgun sequence Length of sequence - 78325 bp Number of predicted genes - 60, with homology - 59 Number of transcription units - 27, operones - 13 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 14 - 364 265 ## BT_3477 glutaminase A 2 1 Op 2 . + CDS 406 - 1779 785 ## BT_3476 hypothetical protein 3 1 Op 3 . + CDS 1804 - 4923 2045 ## BT_3475 hypothetical protein 4 1 Op 4 . + CDS 4940 - 6985 1341 ## BT_3474 hypothetical protein 5 1 Op 5 . + CDS 7005 - 8006 654 ## BT_3473 hypothetical protein + Term 8020 - 8075 11.1 + Prom 8015 - 8074 7.3 6 2 Op 1 . + CDS 8108 - 9400 605 ## COG0673 Predicted dehydrogenases and related proteins 7 2 Op 2 . + CDS 9397 - 9639 138 ## BT_3471 hypothetical protein 8 2 Op 3 . + CDS 9697 - 11115 1226 ## COG0673 Predicted dehydrogenases and related proteins 9 2 Op 4 . + CDS 11174 - 12523 1032 ## BT_3469 hypothetical protein + Term 12568 - 12605 5.2 10 3 Tu 1 . 
+ CDS 12615 - 13478 209 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis + Term 13547 - 13577 0.4 - Term 13277 - 13334 -0.5 11 4 Tu 1 . - CDS 13447 - 14544 355 ## PROTEIN SUPPORTED gi|90020424|ref|YP_526251.1| ribosomal protein L11 methyltransferase - Prom 14706 - 14765 7.5 - Term 14883 - 14939 12.1 12 5 Op 1 . - CDS 15134 - 19171 3053 ## COG0642 Signal transduction histidine kinase 13 5 Op 2 . - CDS 19195 - 20502 1181 ## COG4942 Membrane-bound metallopeptidase 14 5 Op 3 . - CDS 20499 - 21080 553 ## BT_3463 hypothetical protein 15 5 Op 4 . - CDS 21077 - 22831 1905 ## COG0457 FOG: TPR repeat 16 5 Op 5 . - CDS 22850 - 23284 413 ## COG0756 dUTPase - Prom 23336 - 23395 6.0 + Prom 23333 - 23392 6.7 17 6 Tu 1 . + CDS 23412 - 24761 924 ## COG0232 dGTP triphosphohydrolase - Term 24634 - 24674 7.3 18 7 Op 1 . - CDS 24733 - 25746 955 ## COG3176 Putative hemolysin 19 7 Op 2 . - CDS 25771 - 26589 699 ## COG3176 Putative hemolysin - Prom 26609 - 26668 6.7 + Prom 26567 - 26626 8.3 20 8 Tu 1 . + CDS 26848 - 27351 454 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Term 27424 - 27462 3.5 + Prom 27682 - 27741 2.9 21 9 Op 1 29/0.000 + CDS 27806 - 28276 310 ## COG2001 Uncharacterized protein conserved in bacteria 22 9 Op 2 . + CDS 28248 - 29177 942 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis 23 9 Op 3 . 
+ CDS 29186 - 29542 308 ## BT_3454 hypothetical protein 24 9 Op 4 26/0.000 + CDS 29603 - 31729 1871 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 25 9 Op 5 4/0.000 + CDS 31753 - 33201 1546 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase + Prom 33210 - 33269 5.6 26 9 Op 6 28/0.000 + CDS 33299 - 34567 1265 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase + Prom 34589 - 34648 2.8 27 9 Op 7 25/0.000 + CDS 34736 - 36070 1465 ## COG0771 UDP-N-acetylmuramoylalanine-D-glutamate ligase 28 9 Op 8 31/0.000 + CDS 36072 - 37388 1270 ## COG0772 Bacterial cell division membrane protein 29 9 Op 9 26/0.000 + CDS 37418 - 38536 1067 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase + Prom 38551 - 38610 2.4 30 9 Op 10 . + CDS 38740 - 40143 1365 ## COG0773 UDP-N-acetylmuramate-alanine ligase + Prom 40149 - 40208 3.9 31 10 Op 1 . + CDS 40228 - 40974 256 ## PROTEIN SUPPORTED gi|163752975|ref|ZP_02160099.1| 30S ribosomal protein S12 32 10 Op 2 35/0.000 + CDS 41044 - 42495 1407 ## COG0849 Actin-like ATPase involved in cell division 33 10 Op 3 . + CDS 42513 - 43820 1578 ## COG0206 Cell division GTPase 34 10 Op 4 . + CDS 43876 - 44325 265 ## PROTEIN SUPPORTED gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21 + Term 44418 - 44468 1.1 + Prom 44797 - 44856 7.9 35 11 Tu 1 . + CDS 44962 - 46653 675 ## BT_3442 hypothetical protein + Prom 46712 - 46771 2.0 36 12 Tu 1 . + CDS 46813 - 47187 285 ## BT_3441 hypothetical protein + Prom 47205 - 47264 2.7 37 13 Op 1 . + CDS 47299 - 48612 470 ## BT_2451 putative pyrogenic exotoxin B 38 13 Op 2 . + CDS 48623 - 49096 310 ## gi|253568805|ref|ZP_04846215.1| predicted protein + Prom 49098 - 49157 3.6 39 14 Op 1 . + CDS 49180 - 49413 173 ## gi|253568806|ref|ZP_04846216.1| predicted protein 40 14 Op 2 . + CDS 49406 - 50467 339 ## PRU_1227 putative thiol protease/hemagglutinin PrtT 41 14 Op 3 . 
+ CDS 50469 - 50888 91 ## gi|253568808|ref|ZP_04846218.1| predicted protein + Prom 50893 - 50952 5.1 42 15 Op 1 . + CDS 51008 - 52357 558 ## GFO_2059 M12A family peptidase (EC:3.4.-.-) + Prom 52364 - 52423 5.0 43 15 Op 2 . + CDS 52456 - 52722 344 ## gi|253568810|ref|ZP_04846220.1| predicted protein + Term 52771 - 52821 7.8 44 16 Op 1 . - CDS 53594 - 54829 903 ## BT_3434 hypothetical protein 45 16 Op 2 . - CDS 54826 - 55761 493 ## BT_3433 hypothetical protein - Prom 55849 - 55908 6.8 + Prom 55732 - 55791 3.8 46 17 Tu 1 . + CDS 55899 - 57638 1260 ## BT_3432 hypothetical protein - Term 57364 - 57405 0.7 47 18 Tu 1 . - CDS 57643 - 58371 514 ## BT_3431 DNA repair protein - Prom 58491 - 58550 7.0 + TRNA 58977 - 59051 50.0 # Glu TTC 0 0 + Prom 58979 - 59038 78.1 48 19 Op 1 . + CDS 59091 - 59345 423 ## PROTEIN SUPPORTED gi|29348839|ref|NP_812342.1| 30S ribosomal protein S20 + Term 59379 - 59413 4.0 + Prom 59412 - 59471 4.6 49 19 Op 2 . + CDS 59545 - 61506 2128 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit + Term 61550 - 61611 2.1 + Prom 61591 - 61650 1.7 50 20 Tu 1 . + CDS 61676 - 61876 88 ## gi|237716666|ref|ZP_04547147.1| conserved hypothetical protein - Term 61616 - 61660 5.0 51 21 Tu 1 . - CDS 61885 - 62466 518 ## BT_3427 hypothetical protein - Prom 62511 - 62570 5.7 + Prom 62475 - 62534 1.7 52 22 Tu 1 . + CDS 62565 - 62738 191 ## - Term 62671 - 62706 -0.5 53 23 Op 1 . - CDS 62771 - 64381 427 ## BT_3425 hypothetical protein 54 23 Op 2 . - CDS 64433 - 67171 861 ## BT_3424 hypothetical protein - Prom 67282 - 67341 5.7 55 24 Tu 1 . - CDS 67358 - 68050 246 ## BT_3423 hypothetical protein - Prom 68080 - 68139 5.5 - Term 68115 - 68149 2.1 56 25 Op 1 . - CDS 68189 - 70906 1065 ## BT_3422 hypothetical protein 57 25 Op 2 . - CDS 70919 - 72205 587 ## BT_3421 hypothetical protein 58 25 Op 3 . - CDS 72230 - 73756 637 ## BT_3420 hypothetical protein - Prom 73873 - 73932 4.9 - Term 73962 - 74006 8.9 59 26 Tu 1 . 
- CDS 74028 - 75542 1938 ## COG0696 Phosphoglyceromutase - Prom 75707 - 75766 4.4 + Prom 75579 - 75638 4.1 60 27 Tu 1 . + CDS 75717 - 78188 1977 ## BT_3418 putative thiol:disulfide interchange protein Predicted protein(s) >gi|226332258|gb|ACIC01000062.1| GENE 1 14 - 364 265 116 aa, chain + ## HITS:1 COG:no KEGG:BT_3477 NR:ns ## KEGG: BT_3477 # Name: not_defined # Def: glutaminase A # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 116 725 840 840 238 100.0 4e-62 MIWDKMWNLNLFPNNVIDKEVSYYLTKQNPYGLPLDSRKEYTKSDWIMWTATMSPDQATF EKFINPLYKYINETTSRVPISDWHDTKTGKMTGFKARSVIGGYWMKVLVNKNSNNL >gi|226332258|gb|ACIC01000062.1| GENE 2 406 - 1779 785 457 aa, chain + ## HITS:1 COG:no KEGG:BT_3476 NR:ns ## KEGG: BT_3476 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 457 1 457 457 958 100.0 0 MKMKSKTKSCCQYTQWVVLALLTLIFASCKDNDDATGDFDPNKPVVISEFSPKEGGLGTR MLLYGENFGSDISKIKVTIGGQDSKVVGAKGKSLYCVVPAKAYDGDIKLSILNDEGEEIA NTEANEKFVYQKKMLVTTFLGTMYDGNTKYDLKDGPFDDCGGFGGAVWLSFDPKNHNHLY LVGEQHPTRLIDFEKEYVSTVYSGLSKVRTICWTHEADSMIITNDQNNNDRPNNYILTRE SGFKVITELTKGQNCNGAETHPINGELYFNSWNAGQVFRYDFTTQETTPLFTIQDSGWEF HIQFHPSGNYAYIVVVNQHYILRSDYDWKTKRLTTPYIVCGQQGAKDWVDGVGKKARMHA PRQGTFVKNPAYKGSSDEYDFYFCDRENHCIRILTPQGRVTTFAGRGSNGTSGYNDGDLR QEARFNHPEGIVYDEERECFFIGDRENRRIRKIGYEE >gi|226332258|gb|ACIC01000062.1| GENE 3 1804 - 4923 2045 1039 aa, chain + ## HITS:1 COG:no KEGG:BT_3475 NR:ns ## KEGG: BT_3475 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1039 1 1039 1039 2062 100.0 0 MMKNFLFICMLLLGLSTSALAQESIVVTGVVTDTNKEPMIGVNVSISNIPGLGAITDLNG KYSIKMPPYHKLVFTYIGFEKVEVLVKEQRTVNVTMKEASAREIDEVVITGTGAQKKLTV TGAITNVDVDVLKANPSGSMANALAGNVPGILAMQTSGKPGSVSEFWIRGISTFGASNSA LVLVDGFERSLDEINVEDVESFSVLKDASATAIYGSKGANGVVLITTKHGKAGKINISAK AETFYNMLTQVPDFVDGYTYASMANEAKITRNLEPLYKADELEIFRLGLDPDLYPNVNWI DELLRKGSWSTRATLSMNGGGNTARYYVSGSYLDQQGMYKVDKALKDYNTNANFRRWNYR 
MNVDIDITKSTLLKVGVSGSLQKANDSGVGSDAIWTALMGYNAIMVPKLYSNGYVPAYGN DNGDRFNPWVQATMTGYRENWKNNIQTNVTLEQKLDFITKGLRFVGRFGYDTENNNWINR RKWPEQWKAKRFRATDGTLDYDRVAEERKMFQESGSDGLRNEFFEAELHYSRGFKHHHLG GTLKYNQSSKIKTVGLGDDLKQGIARRNQGLAGRFTYNWNYRYFIDFNFGYTGSENFAAG HRFGFFPAISGAWNIAEESLIKKHLKWMNMFKIRYSYGKVGNDNLGNTRFPYLYDIETMT KKDGDKTVDTGGYNFGDYTFDRYYGGMRYSSLSSPNVTWEIATKHDLGIDFSFFNDKLSG SVDYFNEKREGIYMLREYLPGIVGLESNPSANVGKVTSEGFDGHFTFRQKLGAVGLTIRS NITYSKNEIVDRDEENNYYWYKMQKGHRVNQARGLISLGLFKDYDDIRNSPVQDFDGYKV MPGDIKYKDVNGDGKIDGNDQVAIGATTKPNLIYGFGIAANWKGLDVNLHFQGAGKSTYF IDGSTVHMFKLGDGWGNVLSEMANSNRWISADISGDPATENPNAEYPRLSYGPNSNNYQQ STYWLRNGSYLRLKTVEVGYTLPTQLVNKVHFNTVRIFFVGTNLLTWSAFKLWDPEMGST DGKRYPLSKNLSLGISVNL >gi|226332258|gb|ACIC01000062.1| GENE 4 4940 - 6985 1341 681 aa, chain + ## HITS:1 COG:no KEGG:BT_3474 NR:ns ## KEGG: BT_3474 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 681 1 681 681 1395 100.0 0 MKKIYIILSFIIGIGFTLSSCSDYLDSDYLFDERMSIEDVFTSRAYSDKWLARAYFFLGD NHLLDVASKGYVPFCFADDMYFGDRDDRYKAWKNGEYNEGGLAGESAEIWNKCYKGIRQA SIYLNNIDMNTEYTAEEIADNKAQAHFLKAYFYWIMLRLFGPVPIVPDQGIDYMEDYDAV AQPRNTYDECVTYITNELVLAAQALPLDRAIQEIARPTRGAALALRAKVLLYAASPLMNG QTPADIASELVDDQGNRLLPEAYDESKWAKAAAAAKDVIELNRYTLFVSYATDKGDIAFP ATIAPPYHPEFSEQAWPNGWKDIDPFESYRAIFNGTVSAFENKELIFTRGQNTGDQSINN MVIHQLPRVAKGWNTHGLTQKQCDAYYMNDGTDCPGKDKEINRGDGSERMSGYVTKEDVE AGRYKPLSEGVSLQYANREPRFYASVAYNGDVWNLLNSNKNAGEPQNIQVFYYRGDGNGY TNSMFWLRTGIGVKKFVHPDDMGKGDNNEELIKKKVEPAIRYAEVLLIYAEALNELNGQY DIPSWDGNKTHIIKRDINEMKKGIRPIRIRAGVPDYTQEEYDDPIEFRKKLKRERQIELM GEGHRYFDLRRWLDAPVEESTPIYGCNTLATKEMADVFHTPVAVPSLPTTFSRKMWFWPI NHTELKRNKRLTQNPGWTYPE >gi|226332258|gb|ACIC01000062.1| GENE 5 7005 - 8006 654 333 aa, chain + ## HITS:1 COG:no KEGG:BT_3473 NR:ns ## KEGG: BT_3473 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 333 1 333 333 650 100.0 0 MKKIYYLLLAMTVLIVCASCNNEWESEQYVQMVSFKAPVNAQGVTPIYIRYKPEGKVTYQ 
LPLIVSGSTMNSKDLNVHVGLDLDTLDVLNVEHFGSRKELFFKPLENKYYEFPETVQIPA GECTSLLPIDFALKDLDQVDKWVLPLAIQSNDPSHNYQANPRKFYRRAMLRISPFNDYSG QYSATAYKVYFKGNEKEAIVPEYHTSYVIDDKTIFFYAGLIDIDRLDRKYYKIKFEFTEE KIDLQTRKLKISSDNEDINLVVKGQPSYTVEEEMDATRPYLKHIYVTINNLEYEFTDYTS VPGYPLEYVVKGSLTMERKINTQIPDEDQAIEW >gi|226332258|gb|ACIC01000062.1| GENE 6 8108 - 9400 605 430 aa, chain + ## HITS:1 COG:SA0210 KEGG:ns NR:ns ## COG: SA0210 COG0673 # Protein_GI_number: 15925921 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Staphylococcus aureus N315 # 45 249 5 204 359 95 32.0 3e-19 MKHTQISRREFLKNIGMTGAGVLLAASPWLSAFSEAVNTSNEKCRLGVIGPGSRGRFLMS FLVQNPKVDIVALCDIYQPSIDEALKLAPKAKVYGDYRELLENNNIDAILVATPLSSHCK IVMDAFDAGKHVFCEKTIGFTMEECFRMYNKHRSTGKIFFTGQQRLFDPRYIKAMEMIHA GTFGEINAIRTFWYRNGDWRRPVPSANLERQINWRLYKEYSKGLMTELACHQLQIGSWAL NKLPEKVMGHGAITYWKDGREVYDNVSCLYVFDDGVKMTFDSVISNQFYGLEEQIMGNLG TVEPEKGKYYFESIPPAPGFLQMINDWENKLFDTLPFAGTSWAPETANENKGELILGERP KSDGTSLLLEAFVEAVITRKQPARIAEEGYYASMLCLLGHQALQEEKTLYFPDEYKINYL NHQSKTSEAV >gi|226332258|gb|ACIC01000062.1| GENE 7 9397 - 9639 138 80 aa, chain + ## HITS:1 COG:no KEGG:BT_3471 NR:ns ## KEGG: BT_3471 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 80 1 80 80 106 100.0 3e-22 MKDTILTARRKKTELITLFLCFIIANLVNLYSIFTYNTPLTEMITSFFYVLIFSVALYII WSLLRILFYGITALFKKKKE >gi|226332258|gb|ACIC01000062.1| GENE 8 9697 - 11115 1226 472 aa, chain + ## HITS:1 COG:STM4425 KEGG:ns NR:ns ## COG: STM4425 COG0673 # Protein_GI_number: 16767671 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Salmonella typhimurium LT2 # 41 227 3 184 336 66 28.0 9e-11 MVTRRDFLKTMTMASAGLAAGAGEILNAQTLSSSKNKNEKVKIAYVGIGNRGQQIIEDFA RTGMVEVVALCDVDMGAKHTQKVIGMYPKAKQFRDFRQMFDKAGKEFDAVAIATPDHSHF PISMLALASGKHVYVEKPLARTFYEAELLMQAALKRPNLATQVGNQGHSEANYFQFKAWM DAGIIKDVTAVTAHMNNVRRWHKWDTNIYKLPSGQQLPKDMDWDTWLGAVPYHEYNEKYH 
QGEWRCWYDFGMGALGDWGAHILDTVHEFLELGLPYEVTLLHEKGHNDYFFPYSSTILFR FPQRKGMPPVDITWYDGLDNLPPIPAGYGVSGLDPNIPTTNQGDVPAAKLNPGKIIYTKD LTFKGGSHGSTLSIIPEEKAKEMADKLPQIPKSPSNHFENFLLACNGIEKTRSPFEINGV LSQVFSLGVMAQRLNTQLFFDSRTKQITNNEFANAMLTGIPPRKGWDEFYKL >gi|226332258|gb|ACIC01000062.1| GENE 9 11174 - 12523 1032 449 aa, chain + ## HITS:1 COG:no KEGG:BT_3469 NR:ns ## KEGG: BT_3469 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 449 1 449 449 888 100.0 0 MKKNFLFTLLLFCAASLSAQTWEPLFNGKNLKGWKKLNGKAEYKIVDGAIVGISKMGTPN TFLATTKNYGDFILEFDFKIDDGLNSGVQLRSESKKDYQNGRVHGYQFEIDPSKRAWSGG IYDEARRNWLYPLTLNPAAKTAFKNNAWNKARIEAIGNSIRTWINGVPCANIWDDMTPSG FIALQVHAIGNASEEGKTVSWKDIRICTTDVERYQTPETEEAPERNMIANTISPREAKEG WALLWDGKTNNGWRGAKLNAFPEKGWKMEDGILKVMKSGGAESANGGDIVTTRKYKNFIL TVDFKITEGANSGVKYFVNPDLNKGEGSAIGCEFQILDDDKHPDAKLGVKGNRKLGSLYD LIPAPEKKPFNKKDFNTATIIVQDNHVEHWLNGVKLIEYTRNTDMWNALVAYSKYKNWPN FGNSAEGNILLQDHGDEVWFKNVKIKELK >gi|226332258|gb|ACIC01000062.1| GENE 10 12615 - 13478 209 287 aa, chain + ## HITS:1 COG:aq_1258 KEGG:ns NR:ns ## COG: aq_1258 COG1477 # Protein_GI_number: 15606483 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Aquifex aeolicus # 48 273 37 258 288 92 28.0 1e-18 MQHTIQHLYKTHGDDGLLYAWFLSMHTRVDIILCCQKSENELMLVVNSIYDTLRQLERIA NYYDPSSELSQVNQRASTAPVMISQSLYRMISLCTEYHKKTLGCFDVTIHSDNYNQDTIH SIHLYPEAQSVFFQQAGTTINLSGFLKGYALDKIREILKVHIIANALINMGNSSVLALGN HPVGTGWKVSFDDQASTTKNHKTQSILLNNECLTTSGNNSDDRKHIISPSSGKPLEGVRQ VTVVTDDGTTGEILSTSLFVANQKQRELIMSEFLPKRVIDLELCQMV >gi|226332258|gb|ACIC01000062.1| GENE 11 13447 - 14544 355 365 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020424|ref|YP_526251.1| ribosomal protein L11 methyltransferase [Saccharophagus degradans 2-40] # 36 348 5 310 314 141 31 1e-32 MRINLCLLLFLVLLNISCNSKRAGKKEDVEKDTYANPLFMEGANSSAIYHNGKYYYTHET NEQIYLWVTTDITDMAHSTCKEVWVPKDPSNSHNLWNPEIRNINGKWYIYFAADDGNTDN 
HQIYVIENDSPDPMQGEFRMKGAIMTNPDWNWGIHPSTFEHKGELYLLWSGWPNRRIGSE TQCIYIARMENPWTLATERVLISKPEYEWERQWVNPDGGRTAYPIYVNESPQYFHSKDNR TVIVYYAASGCWSPYYCTGMLTADADSDLLNPASWKKSPVPVFQQRPEDGVYGPANLSFL PSPDGTEWYILYQARSVPSGNTGESESRNPRLQKIGWDADGMPDLGVPLPVNTPLPKPSG TVPNQ >gi|226332258|gb|ACIC01000062.1| GENE 12 15134 - 19171 3053 1345 aa, chain - ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 808 1074 46 316 318 131 30.0 1e-29 MKLTKILILLLAFWLEPLSASPYFSFKKYQVEDGLSHNTVWCAIQDSYGFIWLGTSDGLN RYDGRGNKVYRNVLNDKFSLENNFVEALIEEDQNIWVGTNSGLYIYDRATDRFSYFDKTT QYGVYVSSEIKKIVKTENGLLWIATLGQGLFIYDPKTEVLTQNSIQTSFVWDVCQSYDKK KVYVSSLQEGLLCFDENGKFLQSYKVSSDVNSSDSYKINCVLDVEGEVWLGAGNNLLGRL NRQTGTVENYMAPSLNFGAVRSLLKYTENELLVGTDNGLYLFNRESKTFLRADNPADPRS LSDQTINGMMWDAEGALWVLTNLGGINYMSKQTKRFDYYSPAYLSGIAGAGEVVGPFCEN KDGNIWIGTQSGLYFFNTVTRELSEYHIGGVKSQKYDIRSLMLDGDCLWIGTYAEGIRVL NLRTGSIKEYTHSRGIPNTICSNDVLCIYKDRKGEIFVGTSWGLCRYNPDADNFMTITSI GSMISVGDIYEDMYNNLWIATTNSGVFSYNTLSGHWKHFQHEREDSTTITSNSVITVFED NKGVMWFGTNGGGLCSFDPKTEAFIEFDSALPNKVIYSIEQDQTGDFWISSNAGIFKINP ISKAHFRQFTINDGLQGNQFMARSSLKSSEGKLYFGGINGFNVFQPERFVDNSYIPPVYV TDIRLSYLNDEQEVKKLLQLGKPIYMADKITLSYENNSFTIRFVALSYEDPARNRYSYIL KGVDKEWITNSENNSASYTNLPPGEYEFEVRGSNNDHQWNEKTTTLRVVITPPWWRSSFA YFIYILLLMGWIVWIAWRWNLRVKHKYKRRMEKYQIAKEQEVYKSKIGFFINLVHEIRTP LSLIRLPLEKLQEIEHEGKEAKYLSVIDKNVNYLLGITNQLLDFQKMENGALQLNLVSCD IKEIVNDVYSQFTSPAELKGIELVLTLPEQELVSMVDREKLSKILVNLMGNAIKYARTRI DLKLVTTDAGYEIYVSDDGRGVPDAQKGKIFEAFYQMPDDKVATATGTGIGLAFAKSLAE AHQGSLRLEDNEPQGSSFILSLPLNEKKVEEAADIVEVHSENEGSAENIPSEFSGKKFTV LLVEDNVELLNLTRDSLVAWFRVLKAPNGRAALEILEQESVDVIVSDVMMPEMNGLELCS KVKSEIDYSHIPVILLTAKTTLESKVEGLECGADVYIEKPFSIKQLHKQIENLLRLRQSF HKLMASLSGNANQASAELAMTQRDCEFVAKIQEVIADQLADENFSIDTLAEQMNMSRSNF YRKIKALSGMSPNDYLKALRMNKAAELIQGGTRISEVAAQVGFTSSSYFAKCFKAQYGVL PKEYVNQLSASDTAVESTPDADSTI >gi|226332258|gb|ACIC01000062.1| GENE 
13 19195 - 20502 1181 435 aa, chain - ## HITS:1 COG:STM3705 KEGG:ns NR:ns ## COG: STM3705 COG4942 # Protein_GI_number: 16766990 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Membrane-bound metallopeptidase # Organism: Salmonella typhimurium LT2 # 36 435 50 427 427 83 25.0 7e-16 MIKRLFVILVSSLWLAIPLSAQSNKLIRELEGKRGALQKQIAETESILQNTKKDVGSQLN GLAALTGQIEERKRYILAINNDVETIERELVSLNRQLNSLEKDLKEKKKKYEASVQYLYK NKSIEEKLMFIFSAKNLRQTYRRMRYVREYATYQRLQGEEILKKQEQIRKKKAERQQVKA AKEGLLKERENEKAKLEAQEKEKRTLVANLQKKQKGLQSEVNKKRREANQLNARIDRLIA EEIERARKRAAEEARREAAARKKADTNDKGASETGTAAKPKSEPLDAFTMSKADRELSGS FVANRGKLPMPISGPYIITSRYGQYSVEGLRNVKLDNKGIDIQGKPGAQARAIFDGKVAA VFQLNGLFNVLIRHGNYISVYCNLSSASVKSGDTVSTKQSIGQIFSDGADNGRTVLHFQL RREKEKLNPEPWLNR >gi|226332258|gb|ACIC01000062.1| GENE 14 20499 - 21080 553 193 aa, chain - ## HITS:1 COG:no KEGG:BT_3463 NR:ns ## KEGG: BT_3463 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 193 1 193 193 315 99.0 5e-85 MKRIAYFLLIAVVLVGCKSSKRLTATKVPEVAASEAAIPSYLASRLQLTIPGKGGSMSVG GTMKMKSRERVQISLLMPILRTELARIEVTPTEVLFVDRMNKRFVRATKNELKEILSKNV EFSQLEKLLTDASKPGGKTELSGKDLGIPKLEKAKVQLYDFSTKELSITPTEVTSRYRQV SLEELMKMLVALL >gi|226332258|gb|ACIC01000062.1| GENE 15 21077 - 22831 1905 584 aa, chain - ## HITS:1 COG:aq_854 KEGG:ns NR:ns ## COG: aq_854 COG0457 # Protein_GI_number: 15606205 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Aquifex aeolicus # 94 570 63 528 545 97 22.0 6e-20 MNKKNSIWLLVAVWTLVSCGTVKSTREKPAVALAQSSLTPEQQRKYDYFFLEAMRLKEKK DYASAFGLLQHCLDIHPNAASALYEVSQYYMFLRQVPQGQEALEKAVANAPDNYWYSQGL ASLYQQQNELDKAVTLLEQMVVRFPAKQDPLFNLLDLYGRQEKYDEVISTLNRLEKRMGK NEQLSMEKFRIYLQMKDDKKAFQEIESLVQEYPMDMRYQVILGDVYLQNGKKQEAYDVYQ KVLAAEPDNPMAIFSMASYYKQTGQEELYQQQLDTLLLNKKVTPDTKVGVMRQMIVENEQ ADKDSTQIIALFDRIMKQEQDDPQIPMLYAQYLLSKNMEAESVPVLEQVVDLDPTNNAAR MMLIGAAVKKEDYKQIIKVCEPGIEATPDALEFYYYLAVAYNQAEKPDSVISICKRALEH 
KTADGKKEIVSEFYSILGDMYHTQKQMKEAYAAYDSALVYNPSNIGALNNYAYYLSVERR DLDKAEEMSYKTVKAEPNNATYLDTYAWILFEKGNYAEARIYIDNAMKSEGGDKSDVIVE HCGDIYYMTGDVDGALTYWKKALEMGSESKTLKQKIEKKKYIAE >gi|226332258|gb|ACIC01000062.1| GENE 16 22850 - 23284 413 144 aa, chain - ## HITS:1 COG:FN1028 KEGG:ns NR:ns ## COG: FN1028 COG0756 # Protein_GI_number: 19704363 # Func_class: F Nucleotide transport and metabolism # Function: dUTPase # Organism: Fusobacterium nucleatum # 1 143 4 146 146 171 60.0 5e-43 MNVQVINKSKHPLPAYATELSAGMDIRANLSEPITLAPLQRCLVPTGIYIALPQGFEAQV RPRSGLAIKKGITVLNSPGTIDADYRGEVCIILVNLSSEPFVIEDGERIAQMVIARHEQA VWQEVEVLDETERGAGGFGHTGRG >gi|226332258|gb|ACIC01000062.1| GENE 17 23412 - 24761 924 449 aa, chain + ## HITS:1 COG:sll0398 KEGG:ns NR:ns ## COG: sll0398 COG0232 # Protein_GI_number: 16331575 # Func_class: F Nucleotide transport and metabolism # Function: dGTP triphosphohydrolase # Organism: Synechocystis # 1 446 1 440 440 288 39.0 2e-77 MNWKQLISAKRFGMEEFHEERQENRSEFQRDYDRLIFSAPFRRLQNKTQVFPLPGSIFVH NRLTHSLEVSCVGRSLGNDVAKAILERQPELESSFLPEIGSIVSAACLAHDLGNPPFGHS GERAISTFFSEGKGQRLQEKQPDGEQLSPMEWEDLTHFEGNANAFRILTHQFEGRRRGGF VLTYSTLASIVKYPFSSSLAGKKSKFGFFVSEEESYRKIATELGLIQLSEQPLKYARHPL VYLVEAADDICYQMMDIEDAHKLKIITTDETKELLMAFFSEDRQSRLRSTFQIVNDINEQ IAYLRSSVIGLLVRECTRVFLEHEQEILAGTFEGALIKHIAEGPAAAYNHCAEISLKRIY RSQDVLDIELAGFRIISTLLELMVDAVTLPGKEKAYSELLTNRVSDQYNIKSPVLYERIQ AVLDYISGMTDVFALDLYRKINGNSLPAV >gi|226332258|gb|ACIC01000062.1| GENE 18 24733 - 25746 955 337 aa, chain - ## HITS:1 COG:VCA0646 KEGG:ns NR:ns ## COG: VCA0646 COG3176 # Protein_GI_number: 15601404 # Func_class: R General function prediction only # Function: Putative hemolysin # Organism: Vibrio cholerae # 4 218 304 510 605 127 36.0 4e-29 MEEIIKPVSKELLKAELTEDRRLRMTNKSNNQIYIITHQNAPNVMREIGRLREIAFRAAG GGTGLSMDIDEYDTMEHPYKQLIVWNPEAEEILGGYRYLLGTDVRFDEAGAPILATSHMF HFSDAFIKEYLPQTIELGRSFVTLEYQSTRAGSKGLFALDNLWDGLGALTVVMPNVKYFF GKVTMYPSYHRRGRDMILYFLKKHFNDREELVTPMEPLILETSDEELRTLFCKDTFKEDY 
KILNTEIRKLGYNIPPLVNAYMSLSPTMRMFGTAINYEFGDVEETGILIAVDEILEDKRI RHIQTFIESHPDALKMPCESEGVFTPKVVTPQEGCCH >gi|226332258|gb|ACIC01000062.1| GENE 19 25771 - 26589 699 272 aa, chain - ## HITS:1 COG:VCA0646 KEGG:ns NR:ns ## COG: VCA0646 COG3176 # Protein_GI_number: 15601404 # Func_class: R General function prediction only # Function: Putative hemolysin # Organism: Vibrio cholerae # 55 267 68 273 605 80 29.0 2e-15 MTDDSLFLIDVDKILRTKAPKQYKYIPKFVVSYLKKIVHQDEINVFLNESKDKLGVDFLE ACMEFLDAKVEVKGIENLPKEGLYTFVSNHPLGGQDGVALGYVLGRHYDGKVKYLVNDLL MNLRGLAPLCVPINKTGKQAKDFPKMVEAGFQSDDQMIMFPAGLCSRRQNGVIRDLEWKK TFIIKSIQAKRDVVPVHFGGRNSDFFYNLANVCKALGIKFNIAMLYLADEMFKNRHKTFT VTFGKPIPWQTFDKSKTPAQWAEYVKDIVYKL >gi|226332258|gb|ACIC01000062.1| GENE 20 26848 - 27351 454 167 aa, chain + ## HITS:1 COG:mll3697 KEGG:ns NR:ns ## COG: mll3697 COG1595 # Protein_GI_number: 13473184 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 3 164 5 161 183 97 36.0 1e-20 MKSLSFRKDLVGVQDELLRFAYKLTTDREEANDLLQETSLKALDNEDKYMPDTNFKGWMY TIMRNIFINNYRKVVRDQTFIDQTDNLYHLNLPQDTGFESTERTYDLKEMHRVVNALPKE YRVPFAMHVSGFKYREIAEKLNLPLGTVKSRIFFTRQKLQEELKDFR >gi|226332258|gb|ACIC01000062.1| GENE 21 27806 - 28276 310 156 aa, chain + ## HITS:1 COG:CAC2133 KEGG:ns NR:ns ## COG: CAC2133 COG2001 # Protein_GI_number: 15895402 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 4 127 2 119 142 64 32.0 7e-11 MIRFLGNIEVRADAKGRVFIPATFRKQLQAASEERLIMRKDVFQDCLTLYPESVWNEELN ELRSRLNKWNSKHQLIFRQFVSDVEVVTPDSNGRILIPKRYLQICNIRGDIRFIGIDNKI EIWAKERAEQPFMSPEEFGAALEEIMNDENRQDGER >gi|226332258|gb|ACIC01000062.1| GENE 22 28248 - 29177 942 309 aa, chain + ## HITS:1 COG:BS_ylxA KEGG:ns NR:ns ## COG: BS_ylxA COG0275 # Protein_GI_number: 16078578 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope 
biogenesis # Organism: Bacillus subtilis # 14 309 4 310 311 241 44.0 2e-63 MKIDKMEKDELTYHVPVLLKESVDGMNIHPDGTYVDVTFGGAGHSREILSRLGEGGRLLG FDQDEDAERNIVNDPHFTFVRSNFRYLQNFLRYHDIEQVDAILADLGVSSHHFDDSERGF SFRFDGALDMRMNKRAGLTAADIVNTYEEERLANIFYLYGELKNSRKLASAIVKARNGQS IRTIGEFLEIIKPLFGREREKKELAKVFQALRIEVNQEMEALKEMLAAATEALKPGGRLV VITYHSLEDRMVKNIMKTGNVEGKAETDFFGNLQTPFRLVNNKVIVPDEAEIERNPRSRS AKLRIAEKK >gi|226332258|gb|ACIC01000062.1| GENE 23 29186 - 29542 308 118 aa, chain + ## HITS:1 COG:no KEGG:BT_3454 NR:ns ## KEGG: BT_3454 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 118 1 118 118 175 100.0 5e-43 MEEEAVNKKAEEGKKKKRTSLKSILGGDILATDFFRRQTKLLVLIMVFIIFYIHNRYASQ QQQIEIDRLKKELIDIKYDALTRSSELMEKSRQSRIEEYISNKESDLQTSTNPPYLIK >gi|226332258|gb|ACIC01000062.1| GENE 24 29603 - 31729 1871 708 aa, chain + ## HITS:1 COG:CAC2130 KEGG:ns NR:ns ## COG: CAC2130 COG0768 # Protein_GI_number: 15895399 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 3 707 15 723 729 178 26.0 4e-44 MTRYFFVILIMALIGVAIVVKAGVTMFAERQYWQDVADRFVKENVTVKPNRGNIISSDGK LMASSLPEYRIYMDFMSGEKDEKRRKKDQARRDSILKANMDSICIGLHKIFPDRSAAQFK AHLQKGRQAKSRNYLIYPKRISYIQYKEVKRLPVFCLNRYKGGFKELAYNQRKKPFGSLA ARTLGDVYADTAQGAKNGIELAFDTILKGRDGLTHRQKVMNKYLNIVDVAPIDGCDLITT IDVGMQDICEKALVDKLKELNATVGVVVLMEVATGEVKAIVNMTQGRDGEYYEVRNNAIS DMLEPGSTFKTASIMVALEDGKITPDYIVDTGNGQKPMHGRVMKDHNWHRGGYGKLTVTE ILGVSSNVGTSSIIENFYGSNPQKFVDGLKRMSIDQPLHLQISGEGKPNIRGPKERYFSK TTLPWMSIGYETQVPPINILTFYNAIANNGVSIRPKFVKAAVRDGEVIKEYPTEIINPKI CSDKTLAQIREILRKVVGEGLAKPAGSKQFHVSGKTGTAQVSQGAAGYKTGRTNYLVSFC GYFPSEEPKYSMIVSIQKPGLPASGGLMAGSVFSKIAERVYAKDLRLPLTSAIDTNTVVI PNVKAGEMRETQRVLEELNIKIQGKITDSGKEVWGSTHPAPQAVVLESRGIMQNFVPSVV GMGAKDAVYLLESKGLKVNLVGVGKVKSQSIANGSVARKGQTITLTLH >gi|226332258|gb|ACIC01000062.1| GENE 25 31753 - 33201 1546 482 aa, chain + ## HITS:1 COG:CAC2129 KEGG:ns NR:ns ## COG: CAC2129 COG0769 # 
Protein_GI_number: 15895398 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Clostridium acetobutylicum # 1 479 1 476 482 335 39.0 1e-91 MLLNELLKAIQPVQVTGDSRIEITGVNIDSRLVEAGQLFMAMRGTQTDGHAYIPAAIQKG ATAILCEDIPEEPVAGITYIQVKDSEDAVGKIATTFYGNPTSQFELVGVTGTNGKTTIAT LLYNTFRYFGYKVGLISTVCNYIDDQPIPTEHTTPDPITLNRLLGQMADEGCKYVFMEVS SHSIAQKRISGLKFAGGIFTNLTRDHLDYHKTVENYLKAKKKFFDDMPKNAFSLTNLDDK NGLVMTQNTRSKVYTYSLRSLSDFKGRVIESHFEGMLLDFNNHELAVQFIGKFNASNLLA VFGAAVLLGKKEEDVLLALSTLHPVAGRFDAIRSPKGVTAIVDYAHTPDALVNVLNAIHG VLEGKGKVITVVGAGGNRDKGKRPIMAKEAAKASDRVIITSDNPRFEEPQEIINDMLAGL DAEDMRKTLSIADRKEAIRTACMLAEKGDVILVAGKGHENYQEIKGVKHHFDDKEVLKEI FN >gi|226332258|gb|ACIC01000062.1| GENE 26 33299 - 34567 1265 422 aa, chain + ## HITS:1 COG:YPO0552 KEGG:ns NR:ns ## COG: YPO0552 COG0472 # Protein_GI_number: 16120880 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Yersinia pestis # 1 422 1 360 360 216 33.0 9e-56 MLYYLFEWLHKLNFPGAGMFGYTSFRALMAVILALLISSIWGDKFINLLKKKQITETQRD AKTDPFGVNKVGVPSMGGVIIIVAILIPCLLLGKLDNIYMILMLITTVWLGSLGFADDYI KIFKKDKEGLHGKFKIIGQVGLGLIVGLTLYLSPDVVIRENIEVHTPGQEMEVIHGTNDL KSTQTTIPFFKSNNLDYADLVGFMGEHAQTAGWFLFVIITIFVVTAVSNGANLNDGMDGM AAGNSAIIGATLGILAYVSSHIEFASYLNIMYIPGSEELVIYICAFIGALIGFLWYNAYP AQVFMGDTGSLTIGGIIAVFAIIIHKELLIPILCGVFLVENLSVILQRFYYKIGKRKGVK QRLFKRTPIHDHFRTSMSLVEPGCTVKFTKPDQLFHESKITVRFWIVTIVLAAITIITLK IR >gi|226332258|gb|ACIC01000062.1| GENE 27 34736 - 36070 1465 444 aa, chain + ## HITS:1 COG:BS_murD KEGG:ns NR:ns ## COG: BS_murD COG0771 # Protein_GI_number: 16078584 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramoylalanine-D-glutamate ligase # Organism: Bacillus subtilis # 2 444 10 450 451 238 36.0 1e-62 MKRIVILGAGESGAGAAVLAKVKGFETFVSDMSAIKDKYKDLLDSHHIAWEEGHHTEELI LNADEVIKSPGIPNDAPLILKLKAQGTPVISEIEFAGRYTDAKMICITGSNGKTTTTSLI 
YHIFKSADLNVGLAGNIGKSLALQVAEEHHDYYIIELSSFQLDNMYNFRANIAVLMNITP DHLDRYDHCMQNYIDAKFRITQNQTTDDAFIFWNDDPIIKQELAKHGLKAHLYPFAAVKE DGAIAYVEDHEVKITEPIAFNMEQEELALTGQHNLYNSLAAGISANLAGITKENIRKALS DFKGVEHRLEKVARVRGIDFINDSKATNVNSCWYALQSMTTKTVLILGGKDKGNDYTEIE DLVREKCSALVYLGLHNEKLHAFFDRFGLPVADVQTGMKDAVEAAYKLAKKGETVLLSPC CASFDLFKSYEDRGDQFKKYVREL >gi|226332258|gb|ACIC01000062.1| GENE 28 36072 - 37388 1270 438 aa, chain + ## HITS:1 COG:BH3275 KEGG:ns NR:ns ## COG: BH3275 COG0772 # Protein_GI_number: 15615837 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Bacillus halodurans # 13 398 15 398 398 139 29.0 1e-32 MDLLKSIFKGDKVIWIIFLCLCLISIIEVFSAASTLTYKSGDHWGPITQHSIILMVGAVV VVFLHNVPYKWFQVFPVFLYPISLVLLAFVTLMGIITGDRVNGAARWMTFMGLQFQPSEL AKMAVIIAVSFILSKRQDEEGANPKAFKYIMILTGLVLLLIAPENLSTAMLLFGVVFMMM FIGRVAAKKLLLLAGGLVLIVALGVGTVVAIPAKTLHNTPGLHRLETWQNRIKGFFDKDE VPAAKFDIDKDAQIAHARIAIATSHVVGKGPGNSIQRDFLSQAFSDFIFAIVIEEMGLIG GIFVVFLYLCLLMRAGRIAQKCERTFPAFLVMGIALLLVSQAILNMMVAVGLFPVTGQPL PLVSKGGTSTLINCAYIGMILSVSRYTAHLEEQKAHDAQIQMQIEAGTATSEAQAAAEPT AQILNSDAKFEDEHDSSK >gi|226332258|gb|ACIC01000062.1| GENE 29 37418 - 38536 1067 372 aa, chain + ## HITS:1 COG:BH2565 KEGG:ns NR:ns ## COG: BH2565 COG0707 # Protein_GI_number: 15615128 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase # Organism: Bacillus halodurans # 5 366 1 363 363 246 36.0 4e-65 MEKELRIIISGGGTGGHIFPAISIANAIIELRPDAKILFVGAEGRMEMQRVPDAGYKIIG LPIAGFDRKHLWKNVSVLIKLARSQWKARSIIKNFRPQVAVGVGGYASGPTLKTAGMMGV PTLIQEQNSYAGVTNKLLAQKAKTICVAYDGMEKFFPADKIIMTGNPVRQNLTKDMPEKG AALRSFNLQPDKKTILIVGGSLGARTINNTLTAALATIKENNDIQFIWQTGKYYYPQVTE AVRAAGELPNLYVTDFIKDMAAAYAASDLVISRAGAGSISEFCLLHKPVVLVPSPNVAED HQTKNALALVDKQAAIYVKDSEAEAKLMDVALNTVADDRKLKELSENIAKLALPDSARII AQEVIKLAEAEN >gi|226332258|gb|ACIC01000062.1| GENE 30 38740 - 40143 1365 467 aa, chain + ## HITS:1 COG:CAC3225 KEGG:ns NR:ns ## COG: CAC3225 
COG0773 # Protein_GI_number: 15896472 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate-alanine ligase # Organism: Clostridium acetobutylicum # 7 444 11 448 458 236 34.0 1e-61 MNIETIKSVYFVGAGGIGMSALVRYFLSKGKVVAGYDRTPTPLTENLIAEGAQIHYEENV RLIPEACKDKESTLVVLTPAVPQEHEELVYFRNNGFEIQKRAQVLGTITHSSKGLCVAGT HGKTTTSTMTAHLFHQSHIECTAFLGGISKNYGTNLLLSQKSPYTVIEADEFDRSFHWLS PYMSVITATDPDHLDIYGTEKAYLESFEHYTTLIQPGGALIIRKGISLQPKVQAGVRVYT YSRDEGDFHAENIRIGNGEIFIDFVAPDTRINNIQLGVPVSINIENGVATMALAHLNGVT DEEIRQGMNSFRGVDRRFDFKIKNDQVVFLSDYAHHPSEIKQSVLSIRELYKDKKITAIF QPHLYTRTRDFYQDFADSLSLLDEVILVDIYPARETPIPGVTSKLIYDNLRPGIEKSMCK KEEILNILKDKNIEVLITLGAGDIDNYVPEIKKELEKMKNEERRTKN >gi|226332258|gb|ACIC01000062.1| GENE 31 40228 - 40974 256 248 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163752975|ref|ZP_02160099.1| 30S ribosomal protein S12 [Kordia algicida OT-1] # 49 244 47 239 239 103 30 4e-21 MVKRILLSIVMLMLIAYLGIAITAFNRKPADQTCRDVELVIKDTTYAGFITKEEVKGILQ HKGIYPIGKKMERISTKSLERELSKHPLIDEAECYKTPSGKVCVEVTQRIPILRIMSANG ENYYLDNKGTVMPPEAKCVAHRAIVTGNVEKSFAMKDLYKFGVFLQNNKFWDAQIEQIHV LPDRNIELVPRVGDHLVYLGKLENFENKLARLKEFYQKGLNQVGWNKYSRISLEFSNQII CTKRESNR >gi|226332258|gb|ACIC01000062.1| GENE 32 41044 - 42495 1407 483 aa, chain + ## HITS:1 COG:PA4408 KEGG:ns NR:ns ## COG: PA4408 COG0849 # Protein_GI_number: 15599604 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell division # Organism: Pseudomonas aeruginosa # 5 334 8 340 417 141 28.0 2e-33 MATTEFIAAIELGSSKITGVAGRKNSDGSMQVLAYAQEDSSTFIRKGVIFNLDKTAQSLT SIINRLEGELKNSIAKVYVGIGGQSLRTVRNVVSRDLEEEAIISEELVNAIGDENVAIPV VDMDILDVAPQEYKVGNNLQANPVGLVGSHIEGRFLNIVARSSVRKNLEHCFQQAKIDIA DQLIAPLVTANAVLTESERRSGCALIDFGADTTTISVYKNNILRFLTVLPLGGNSITRDI TTLQMEEEEAERLKKTYGDALYEEDESEEPATCKLEDESRTVELSKLNNIIEARAEEIIA NVWNQIQISGYEDKLLAGIILTGGAANLKNLEEMLRRRSKIEKMRMAKLPRNTVHAPSNI LKKDGSQNTLFGLLFEGNQNCCLIETAPQIPVTPATPRPEPEIHKTVDMFEDDQELKEQA 
RLARLKKEEEEKEARAAAKEAERLRKQKEKEEKERRKREAGPSWIQRKIDSLTKEIFSDD DMK >gi|226332258|gb|ACIC01000062.1| GENE 33 42513 - 43820 1578 435 aa, chain + ## HITS:1 COG:BB0299 KEGG:ns NR:ns ## COG: BB0299 COG0206 # Protein_GI_number: 15594644 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division GTPase # Organism: Borrelia burgdorferi # 11 321 22 331 404 221 46.0 2e-57 MDEIVQFDFPTDSPKIIKVIGVGGGGGNAVNHMYREGIHDVTFVLCNTDNQALAESPVPV KLQLGRSITQGLGAGNRPERARDAAEESIDDIKEQLNDGTKMVFITAGMGGGTGTGAAPV IARIAKEMDILTVGIVTIPFIFEGEKKIIQALDGVERIAQHVDALLVINNERLREIYADL TFMNAFGKADDTLSIAAKSIAEIITMRGTVNLDFADVKTILKDGGVAIMSTGFGEGENRV TKAIDDALHSPLLNNNDIFNAKKVMLNVSFCPTSELMMEEMNEIHEFMSKFREGVEVIWG VAVDNSLDTKVKITVLATGFGVEDVPGMDTLHEARSQEEEERQLQLEEEKEKNKERIRKA YGESAGIGKKSLRSRRHIYIFNTEDLDNDDIIAMIEESPTYTRDKTKLLKIKTKAALEEE VAMEEATDNDGVITF >gi|226332258|gb|ACIC01000062.1| GENE 34 43876 - 44325 265 149 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21 [Lactobacillus johnsonii NCC 533] # 1 148 1 146 147 106 41 3e-22 MDLFDKVSEDIKNAMKAKDKVALETLRNIKKFFIEAKTAPGANDTLTDEAALKIIQKLVK QGKDSAEIYIGQGRQDLADGELAQVAVMEAYLPKQMTAEELEAALKEIIAETGATSGKDM GKVMGVASKKLAGLAEGRAISAKVKELLG >gi|226332258|gb|ACIC01000062.1| GENE 35 44962 - 46653 675 563 aa, chain + ## HITS:1 COG:no KEGG:BT_3442 NR:ns ## KEGG: BT_3442 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 562 1 562 563 853 84.0 0 MKSSTLITILSTFLLVGCSKQQSTLDKKLDEAENIIEVNPDSASSILENIASPEQLDDKT FARWCILSGKVTDEIFNTLLPSYQFERANTWYSSYGKSNEQVQILIYLGRSYANDGDYDK AMSIYTKALEIGEKNKLYNLVGYTYSYMGDLYTAKAMRTEAIKKYEMAANNFKKENNTDS YACALRDMGREYACIDSISRALEILIMADSIAETSENKNVKTSIENTLGNIYVMQNKYDK AKKFFYKALEGRNKMPNYMALIDLYIASDSINQAKGFLQKIPQDDPTYTYSIKYLYYQIY KSEKKYEQALAYLEECTDLVDSIVYADNQSKILNIESKYNHLKISKEVNNLKMKQQSYII VSVICIAALLLMIIGYLLYRKQAKEKILKQQKELDKMKLNLYTLSLELEKKRSLLNTFKE KDENYDKMQKEINHLFTNYRKLQNKILVDSPLYKELVSLTNKNMPRITAPLISKDQWKLI 
VNEITSIYPNLYNYLFSLCPTLSDEDFQYCCFYLYGFDTNAEAKLLNIATGSVRRKHNRL KEKLNITLPSNSTLYEYLMKNMG >gi|226332258|gb|ACIC01000062.1| GENE 36 46813 - 47187 285 124 aa, chain + ## HITS:1 COG:no KEGG:BT_3441 NR:ns ## KEGG: BT_3441 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 18 124 1 107 107 135 65.0 5e-31 MRKLSITLLLIGISFMTMEANNYSTDHKTSIYFKKWDDRQKSVFFLPIEATIEENNIEVQ FFEKSNEPATFQVKDKNGNIVFQDVVIPDKLEIYKIDLDGFKAGSYELFYIEESIAFVGE FQIE >gi|226332258|gb|ACIC01000062.1| GENE 37 47299 - 48612 470 437 aa, chain + ## HITS:1 COG:no KEGG:BT_2451 NR:ns ## KEGG: BT_2451 # Name: not_defined # Def: putative pyrogenic exotoxin B # Organism: B.thetaiotaomicron # Pathway: not_defined # 97 420 91 417 426 107 27.0 1e-21 MKKQTLINLFILFFLFSCQSYELENIEINDTPIENNPYKVSIDEAKDIALNFMKTFQKTP AASTRSAGISIENVEVVKVNKAATRSAGIENGMDTLLYAVNFSNNNGYILVGADRRTDPI FGIIDNGSFSDESITTNPNFAFFLDLALGKAVYDILKNDTVKIVNTRSINDYDDVYGTAY NLSSKWGQGAPYNQYCPGPYTGCVATATAQILSFFPTIGSVNWQNGSSYGSSVLHWKQIA TDCFRYKGILNTVTTPQSANEVAHLMRYLGVALKAEYKSDGTSIESKNAINWLNDYASLK ASGLKKYDADQVFMAANRFSDKIVYMRGNRGKKSFVGITITYTGGHAWIADGSFTATRNS DGQRVNLFHFNWGWDGRSNGYFPAAVFDSSNPAINDSELPGILSLDNDGTSGNNYKYNVE YSIVENPNRPESGGTIK >gi|226332258|gb|ACIC01000062.1| GENE 38 48623 - 49096 310 157 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253568805|ref|ZP_04846215.1| ## NR: gi|253568805|ref|ZP_04846215.1| predicted protein [Bacteroides sp. 1_1_6] # 35 157 1 123 123 218 99.0 7e-56 MKLHHIKLLSFLLAVFLTSCKESTDTPPFVFRCLVKLVNEDGSLPLEENKDKTSEITINL IKPTDNNAQIENMLYLDYDGNLNIQILDRKATIKNDGNYDQEYRFEIQYPEEIRRRKDTI RIKYRFKDYRPSIIEAFYNDKKPQYMGTDVTYFEVEN >gi|226332258|gb|ACIC01000062.1| GENE 39 49180 - 49413 173 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253568806|ref|ZP_04846216.1| ## NR: gi|253568806|ref|ZP_04846216.1| predicted protein [Bacteroides sp. 
1_1_6] # 14 77 1 64 64 93 100.0 5e-18 MRKLLIIVSLFTMMSCQENLDMIDSENLPETSSSTNPEFKVNEEDIQSTIDEFYGVSSLS PTRVENSVSNKKNNFID >gi|226332258|gb|ACIC01000062.1| GENE 40 49406 - 50467 339 353 aa, chain + ## HITS:1 COG:no KEGG:PRU_1227 NR:ns ## KEGG: PRU_1227 # Name: not_defined # Def: putative thiol protease/hemagglutinin PrtT # Organism: P.ruminicola # Pathway: not_defined # 35 317 47 316 777 67 25.0 7e-10 MTKSKTRNIGSDSDIDVNDILYIVKLSNGNTAFVAADKRAKSLYGIADGNVELDNTDLIN SNIPGGLVAILSNAIADIKYSIKYNTEVNDDWKTVNVSTRADSDIVGGKEMFTMEPRCPV SWHQFAPFNNACPLYTNSNGDKVHSAAGCVAVAAAQALLCLWDRSKTTFYYYTLITSWDR LSEIKHDNQFIAGSDEETDVANLIHEIGQAVGMKYGPSSSANTEDAVKAVCLLSQGYLKY EKSPFNETIENTLIQKSGIVWLSARNSNDEGHSMLIDALRFVFTTGACGTCIDATKYVKR YFHINYGWGESYNGYYLYIPQSDNYDQDLRWEGTSRTFPYQMKAFSVWQTKNE >gi|226332258|gb|ACIC01000062.1| GENE 41 50469 - 50888 91 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253568808|ref|ZP_04846218.1| ## NR: gi|253568808|ref|ZP_04846218.1| predicted protein [Bacteroides sp. 1_1_6] # 1 139 4 142 142 269 100.0 5e-71 MKHKLTIVVLYMIISSCNNTDSNEFAFSNYSWECPKEGGKEIFTSNKEEWEIDQLILDGI IIKEYSPEYIDKNQSFPPDKFEFDFFSFKIEGKKVTICILENTTGKSRSIKFICRHVNTY HAITVVQLPDINIIEHSGN >gi|226332258|gb|ACIC01000062.1| GENE 42 51008 - 52357 558 449 aa, chain + ## HITS:1 COG:no KEGG:GFO_2059 NR:ns ## KEGG: GFO_2059 # Name: not_defined # Def: M12A family peptidase (EC:3.4.-.-) # Organism: G.forsetii # Pathway: Lysine degradation [PATH:gfo00310]; Biotin metabolism [PATH:gfo00780]; Metabolic pathways [PATH:gfo01100] # 22 284 13 299 347 113 31.0 1e-23 MKTTKSNILSGLTYALSVIGCLSVASCADNGNYIEKDSSIKSESDNSKYDENNYSFNTIS LWGSEAAVVDKGDYYIFQGDIHLLKEDLIQTRGAARIDRRWPNNKVYYSVDGIPAIYTKY IYRAIEWIENGSYIDMVPRYNEGDYIQFNFLDQNEWSAYSDYVGRKGGTQTINLSREAII SVGTIAHEICHAIGMYHEMSRTDRDNYIRINFEGMDEDTRYQYKTYAEMNQNGVNVGSFD FGSIMMYSSGGVMYKLDNSYIRSQRDSLSNTDMLTLATLQPIGDQFKFFDPYGYNEPYDS DYEYRRTKIIQCPEGAKIDFKFQYNYKPTSSNLGGYTVDDFNVRTIISIINRNTNQEVYS KEIPLGVTTTWTDIYLRGINIPQGGFTTKVTLIGSVKGTSTSDKLNALKRVMYNPSVYLH LDKVEIKGVNKHIPNEFGNDDRIFTFISI 
>gi|226332258|gb|ACIC01000062.1| GENE 43 52456 - 52722 344 88 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253568810|ref|ZP_04846220.1| ## NR: gi|253568810|ref|ZP_04846220.1| predicted protein [Bacteroides sp. 1_1_6] # 1 88 35 122 122 129 100.0 4e-29 MTLKVAAYRDYYVAPEHGITTNILGYVVTDENNKTYVIETIKGFEDEYEEGYEFVVKVKA VNKNQDEPVQDLLGYYYYLLEILNKEKV >gi|226332258|gb|ACIC01000062.1| GENE 44 53594 - 54829 903 411 aa, chain - ## HITS:1 COG:no KEGG:BT_3434 NR:ns ## KEGG: BT_3434 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 411 1 411 411 819 98.0 0 MSRNMNTRRSLYIICLLLLSAVGCKLDAQTVVKPVRQENDSTNHALIHQIGFDVRPGYVA PTNSFLEGDNAQRQKIDRSLSLHLKYAFQFSKDSYLGRLYPHAYQGIGVSHNTFYNSAEL GNPVAVYAFQGAPIVRLSSRLSLDYEWNFGASFGWKQYDEHSNWYNDAIGSKINAYINLG FVLNWQFYPQWKLAAGVDFTHFSNGNTRYPNGGLNTIGGRVGIVRTFNAEDKASGAIAPK RLYIKPHVSYDLLVYGATRKRGFVKEGIPTLVPGSFGIVGLNFAPMYNFNNYFRAGLSVD AQFDESANIKDYKLVGTYGDDIKFHRPPFCEQFAVGVSLRAELVMPIFSINVGIGRNMIY SGDDTEGFYQILALKTYVTRHLFLHVGYQLSKFKDPNNLMLGIGYRFHDKR >gi|226332258|gb|ACIC01000062.1| GENE 45 54826 - 55761 493 311 aa, chain - ## HITS:1 COG:no KEGG:BT_3433 NR:ns ## KEGG: BT_3433 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 311 1 315 315 461 76.0 1e-128 MKTIVNLTAYILILLGLYSCNDDVFVDDFRTSDTEFTLDGNGDEVTIRFASSNWDLVWMY TYDSDFTHQYEVYDADDNLIMRNQDAYLKGLGKIVCNEKLTDFIIQRSNPKELKITVGEN VRNDYFRFMLVVSNEYESQEIYVQITPSDKYVFDHISYSLNAYYSERKIEEEVNIVRSNF ENTPSPLHLIVQGIRHKVTFRSDTSEAFQLLGDGNLTVEIPSIENGSLVMNGVQAQYTSM EQEFPFPDIYRDIFIPANTTQHITLMQEYEWFETEYTLYAIHPKTKKQRVITGMFQSTIP TDNYVKQENIK >gi|226332258|gb|ACIC01000062.1| GENE 46 55899 - 57638 1260 579 aa, chain + ## HITS:1 COG:no KEGG:BT_3432 NR:ns ## KEGG: BT_3432 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 579 1 579 579 1067 95.0 0 MDAVIKKIPYGMTDFERIILENYYYVDKTQYIAKVEKVTSFFFFVRPRRFGKSLFLNMLG LYYDINQKDKFEKIFGNLYIGKHPTPDRNKYLVLTLNFSSVAANMDRLEETFNTYCKIVM 
DGFAERNAHLLGKEAVEKLHELKTGDALLGSLCQSAQNKGQKIYLILDEYDNFANNILVD YGNERYRSITHGSGFFRSFLKVVKDYSSSVIERIFLTGVSPVTMDDLTSGFNIADNYSSN SAFNNMIGFNEYEVRALIDYYKSCIELPHSTDELIDIMKPWYDNYCFSIDSLDEPPMYNS DMVLYFMNRYLLNKRIPNNMLDANIRTDYNKLRHLIHVDKTFGENASVVQEIVEKGSTTG IIIDSFPAEDIVKTRNFKSLLYYYGMLTISGMEMGEPILSVPNWAVREQLYGYMADIYKD SADLYLETDKLVDRMKRMAYKGEWENCFTYIADRLNAQSSVRDFIEGEAYVKTFILAYLG LTHYYIARPEYESNKGYADIFLQPRLLQLPDMVYSYCIEVKYAKRDASDTEIEKLLSNAK IQLKQYAASEWIHQDKGTTELKSIALVFQGWKLVRVEEV >gi|226332258|gb|ACIC01000062.1| GENE 47 57643 - 58371 514 242 aa, chain - ## HITS:1 COG:no KEGG:BT_3431 NR:ns ## KEGG: BT_3431 # Name: not_defined # Def: DNA repair protein # Organism: B.thetaiotaomicron # Pathway: Homologous recombination [PATH:bth03440] # 1 242 1 242 242 466 99.0 1e-130 MLQKTKGIVLHTLKYNDTSIIVDMYTELSGRTSFLVPVPRSRKAAVKSVLFQPLSFIEFE ADFRPNATLYRVKEAKSFYPFTSIPYDPYKSSMALFLAEFLYRAIREEAENRPLFAYLQH SVIWLDECREGFANFHLVFLMRLSRFLGLYPNLEDYHTGDYFDLLNACFTSTRPQLHSSY INPEEAGRLRQLMRMNYETMHLFGMSRAERTRCLTIMNDYYRLHLPDFPALKSLDVLKEL FD >gi|226332258|gb|ACIC01000062.1| GENE 48 59091 - 59345 423 84 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348839|ref|NP_812342.1| 30S ribosomal protein S20 [Bacteroides thetaiotaomicron VPI-5482] # 1 84 1 84 84 167 100 2e-40 MANHKSSLKRIRQEETRRLHNRYYGKTMRNAVRKLRATTDKKEAVAMYPGITKMLDKLAK TNVIHKNKANNLKSKLALYINKLA >gi|226332258|gb|ACIC01000062.1| GENE 49 59545 - 61506 2128 653 aa, chain + ## HITS:1 COG:CAC0006 KEGG:ns NR:ns ## COG: CAC0006 COG0187 # Protein_GI_number: 15893304 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Clostridium acetobutylicum # 6 647 1 630 637 689 55.0 0 MSEEQITPNNGSYSADSIQVLEGLEAVRKRPAMYIGDISIKGLHHLVYEIVDNSIDEALA GYCDHIEVTINEDNSITVQDNGRGIPVDYHEKEKKSALEVAMTVLHAGGKFDKGSYKVSG GLHGVGMSCVNALSTHMTTQVFRNGKIYQQEYEIGKPLYSVKEVGVSDITGTRQQFWPDD TIFTETVYDYKILASRLRELAYLNAGLRISLTDRRVVNEEDGSFKSEVFYSEEGLREFVR FIESSREHLINDVIYLNSEKQGIPIEIAIMYNTGFSENVHSYVNNINTIEGGTHLAGFRR 
ALTRTLKKYAEDSKMLEKVKVEISGDDFREGLTAVISVKVAEPQFEGQTKTKLGNNEVMG AVDQAVGEVLAYYLEEHPKEAKTIVDKVILAATARHAARKAREMVQRKSPMSGGGLPGKL ADCSDKDPSKCELFLVEGDSAGGTAKQGRNRMFQAILPLRGKILNVEKAMYHKALESDEI RNIYTALGVTIGTEDDSKEANIEKLRYHKIIIMTDADVDGSHIDTLIMTFFFRYMPQIIQ NGYLYIATPPLYLCKKGKVEEYCWTDAQRQKFIDTYGGGLENAVHTQRYKGLGEMNAQQL WETTMDPDNRMLKQVNIDNAAEADYIFSMLMGEDVGPRREFIEENATYANIDT >gi|226332258|gb|ACIC01000062.1| GENE 50 61676 - 61876 88 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237716666|ref|ZP_04547147.1| ## NR: gi|237716666|ref|ZP_04547147.1| conserved hypothetical protein [Bacteroides sp. D1] # 1 66 3 68 69 64 59.0 2e-09 MKTTTPNPVFAKIRLYLRILASCVIVTCLFLIIIHPEAWITPACIIVAMICLLFLLIRSK HKESRK >gi|226332258|gb|ACIC01000062.1| GENE 51 61885 - 62466 518 193 aa, chain - ## HITS:1 COG:no KEGG:BT_3427 NR:ns ## KEGG: BT_3427 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 193 1 193 193 377 100.0 1e-103 MIQIEDVVVSFDVLREKFLCNLDACKGECCIEGDAGAPVELDEVEKLEEVLPIIWEELSP EARAVIEKQGVVYTDQEGDLVTSIVNNKDCVFTCYDEKGCCYCAIEKAYRDGKTDFYKPV SCHLYPIRIGDYGPYKAVNYHRWDVCKAAVLLGKKENVAVYQFLKEPLIRKFGEKWYQEL EVAVKELKAHDMI >gi|226332258|gb|ACIC01000062.1| GENE 52 62565 - 62738 191 57 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRALFWILLIVSSNCWEQTWKEFNPTLGKTIDLSREGEKAFIKGLENDELIDGKDNY >gi|226332258|gb|ACIC01000062.1| GENE 53 62771 - 64381 427 536 aa, chain - ## HITS:1 COG:no KEGG:BT_3425 NR:ns ## KEGG: BT_3425 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 536 1 536 536 973 100.0 0 MTGRRLLPKIGMRYKVLYILAFLLCANSLFLQIESPIITIGDKWYASIILLLLFLIANST FSMSSFSWSLNKLLPSFYIIVLLSDVVLAMHGILQYTHIIPFHSYLGLSGSFDNPAGYAA SLCAGFPAVFYIYMHYCSKLIRGSVILAGLCVIIVVVLSGSRTGILSIAVMCIVCFLQKT EIGSRKKYLLLLLLLFPVFVTLLYFFKKDSADGRLLIWKCSALMIKDNPVTGYGSGGFLA NYMNYQAEYFARDTDNKYAMLAGDVKHPFNEYILLVVNYGLIGFLLFLTFVYFLWKCYRR NSCLETNIATVCLIGIAVFACFSYPLSYPFVWVMMIYSIYVIIFHAGYIAKYPQWLRRSM CLLIIGLSLAAFIYITRHILYTTAWNRTAKFSLIKQTDMTFARYNELLPKLGDNYLFLYN 
YAAELNYGKYYNQSQEMAEQCSELWADYNLQLLMADNCINTTQYDNAEKHLKLASLMCPA RFVPLYELMKLYQLKGEEKKTIQIAEIILKKKVKVPSSKVEWIKEKAKNSVKSVIH >gi|226332258|gb|ACIC01000062.1| GENE 54 64433 - 67171 861 912 aa, chain - ## HITS:1 COG:no KEGG:BT_3424 NR:ns ## KEGG: BT_3424 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 912 1 912 912 1853 99.0 0 MKDLINKYKKICCFALISLTVHSLSAQTIVSFDTIAHCMTEQFRIYPKEKLHIHIDRSCY LPGDTLWFKAYMVNASSHIPIRLSRYVYVELVNPENETVGRAKVRPDDMNQFHGYISLPE GLPGGKYALLAYTRYMLSEKNQLVFKREVCIAMASNWETTHLKACSKTSSHEENTELQLQ LWNNRQEQIPLKKVKAVCDKGKAVSVKLDKEKKIELSIRNKKVTPDACVRLDMTDVRNNT FRQYTAISTGMEDYDVTFYPEGGHLLADTPCRTAYKVLDVSGNSIDATLQLIDEQGNVMA ESKTLYAGMGAFTFTPKSGKKYSVQVLNKQGIRKIFELPQVQESAYGIVVQSRKKDIQIS INSALHSPHEKLLLLAHVRGKMICARWLLSDKVDIITIAKDQYPSGIVQCLLLDKNYNVL SERLGFIPYRKTIVCKMENDKNSYGKRTPVRTSLLLTDMNGNPVKGNLSVAITDKSMAVA DTTHSILSTLLLSSELKGRIETPSFYLQTDNPEAEKALDLLLMIHGWKRYRLPEIIKGNF ERPVMEPEKFTSLSGRLKNEKGEGKSRYDVYVRNLEFGLNRVIKLGQDGAFRVDSVEFAE GVPIGMIGKRFIRKGENYGRNKAEIILDKDQLYILSDKTVPQPFLSEISGGIQRDMVYPV GMSDYLLNQVTVKSKLLNKRLTLREIAKFKFKNMWELVDYMGVKFKYPYYTSTMKGSCKA SVKIMNAWACFYKDMPLVLMINGNMHSNAGILNYLSLSDIQVISLNSIDRKKILKPYLYL WEYEQHTSVAILELVLYPGVILDPLLHLCSIEQRFKGYGKLDDIQTDVYPSGGYQEPVAF YSPKYDKGNKDNALIPDFRSTLYWNPSLQTDSSGECDFLFYSADKATTYKVVIEGLTEEG EMVYETREFEIK >gi|226332258|gb|ACIC01000062.1| GENE 55 67358 - 68050 246 230 aa, chain - ## HITS:1 COG:no KEGG:BT_3423 NR:ns ## KEGG: BT_3423 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 230 1 230 230 446 100.0 1e-124 MKHKLLLVLLTALIPCLFMQGCISNFEEDNTELSEEDRVAEMFREAGFKVEKLSSKPVNP PMTEKEASLFLAAYKNRERSGSIQIPSENIELKPDGHLVVKVKCNSEKKNFSRTRTVYKE EFDFTHWATYTVRDIEIECWQDIAGYWKSDNVNCTFTTRDPEYICGFPDINPNTGYSKGS LYFSINITVSRYSGWQIYSVDENYDYTLMLTWLPVAQEWNASLSWHETFW >gi|226332258|gb|ACIC01000062.1| GENE 56 68189 - 70906 1065 905 aa, chain - ## HITS:1 COG:no KEGG:BT_3422 NR:ns ## KEGG: BT_3422 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 905 1 905 905 1833 100.0 0 MYKIICSLILILLVVPSLPAQEKVSFDTITSRMAEQLNMYPKEKLHVHIDRSCYLPGDTL WFKAYMVNASSHIPIRLSRYVYVELVNPENETIDRVKIRPDEMNQFHGYISIPEDLPGGN YSLLAYTCYMLFEENQPVFKRDVCIALASNWETAHLKANILSPHGEITGAQLQLWDNSQK QIPLKRVKAVCNKGKPVSAKLYDEGKIELRIQNEKVTSEACMRIDMTDIRNNVFRQYAAI STGTEDYDVTFYPEGGYLLEGTPCRTAYKVLDVSGNSIAATLQLMDEQGNVMATSETLHS GMGVFTFTPESGKRYIVQTSNKQGVKKVFELPPVQSSAYGIVIKDHQEDMQISINSALHS PHEELLLLAHVRGKMICAQWLLSDKGDIITIAKDQYPSGIVQCLLLDKNYNVLSERLGFI PYRKAIMCKVRNDKDSYGKRELIHTNLSLVDVNGNPVKGNLSVSITDESMSVADTTHSIL STLLLSSELKGRIETPSFYLQTDNPEAEKALDLLLMIHGWKRYRLPEIIKGNFEKPAMEP ERFTSLSGRLKDKIGDGKSRYEVYVRNLEFGLNKTITPGPDGAFRLDSLEFAEGTPIGMI GQRFIRKGETYGRSKTEIILDKEQTYNISDRTVPQPVLSEMSSGIRRDIIYPAGMSNYFL KQVTVKSDALNKSLNSREIAKLKLKNVWELAEYLGIKFSYPFIDSLMQGRCHIGVTLWGS WKCLYKDMPLALIVNGYIPLLDPGIFNYLSLSDLCAIHLYSIGREKFLGLYSYAQSYEHH TSVAILELKLHSDVDVESLEYVNGIDKRIKGYGKLDIQTDVYPFGGYQEPIAFYSPKYEK GNNNDVLIPDFRSTLYWNPNLQTDLSGKCNFSFYSADRITTYKVLIEGMTEEGEIVYEIR ELVIK >gi|226332258|gb|ACIC01000062.1| GENE 57 70919 - 72205 587 428 aa, chain - ## HITS:1 COG:no KEGG:BT_3421 NR:ns ## KEGG: BT_3421 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 428 1 428 428 731 100.0 0 MKLISTFIVIVLLSGCQSKEQSVVISQNSISIAMQIYAISSKISLSDESIMNLRTFFQEN DSLAEMELKKGKSLDEIARWYCPSINTIASLLTPLEGNDYMFYQKNNGPQLPYISDLRTV VKYRQELNLSHVQIEQLLHHSEEIEKRFGVQDYKHDSMEKQYLAEILSETQYKAFFIIRK TRQAEKIAAQQWKQIQVHQLCSTTCDSLAIIKQLYEFEREKSGILEYMSSRGDNKGYDKE RYRLNAHKPLLLLKLETIESFSHNKLLDIICKREVTKLSEQQIEQLLAEYYRIKQAEYKA MYEDASKNGETKFERSKLEGKCLINVVTHQQLEDYFKFVSQKRADEQAQRYWDELKNYDF IRKKDSVQVVSELADYELRLAVAEQWISLDNSRKHLFAREDVVNGKPEILKKKEEWDKKE KERKMVRF >gi|226332258|gb|ACIC01000062.1| GENE 58 72230 - 73756 637 508 aa, chain - ## HITS:1 COG:no KEGG:BT_3420 NR:ns ## KEGG: BT_3420 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 508 1 508 508 1040 100.0 0 
MKNIYLFIVSLFFFYTGCDTQQTKYPFSVETVLAKAGNNRKELEKALDYFYKQGDSIKIK AIEFLVAHMDIHYSETYYWKTREGKRVDFSEFDYANLETAVKAIDSMRKKYGSLDFQDTI IYDVNSLTGKYLINNVNHAVDTWRLSEYKDIPFNDFCEYILPYRVTVEPVTEWRKEYCKR YHWLTDSLQNKPLENVLDYAAIDYKDWFTFTYGSETRNEPLSRLSAQQLLFRKKGACEDI AALETFIFRSQGIPAAYISVPLWATSAGAHFSNTVFDTKMKPVKLDVTTHAVVDHPLDRE PSKVIRYTYSSQPGTLASKEKEECIPTGFLQKTNYIDVTHEYWETSDVTIPLFNDTTHII YACMFSMGRWNAEWWGERMENSVTFHNMPRGVVILPMIYKEHQLIPIGYPIVNGYNHQLY LVPDLLHTMTVEIEEQDRYLRFRPDKKYELFYWDNAWISLGTQVATMDADCLQFNQVPQN VLMLLVPEYSERKERPFIIMPDGTRYWW >gi|226332258|gb|ACIC01000062.1| GENE 59 74028 - 75542 1938 504 aa, chain - ## HITS:1 COG:MA4007 KEGG:ns NR:ns ## COG: MA4007 COG0696 # Protein_GI_number: 20092802 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglyceromutase # Organism: Methanosarcina acetivorans str.C2A # 6 503 15 518 521 507 50.0 1e-143 MSKKALLMILDGWGLGDQKKDDVIFNTPTPYWDYLMNNYPHSQLQASGENVGLPDGQMGN SEVGHLNIGAGRVVYQDLVKINRACADNSILQNPEIVSAFSYAKENGKSVHFMGLTSNGG VHSSMVHLFKLCDISKEYGIDNTFIHCFMDGRDTDPKSGKGFIEELSAHCEKSAGKIASI IGRYYAMDRDKRWERVKEAYDLLVNGQGKKATDMVQAMQESYDEGVTDEFIKPIVNANCD GTIKEGDVVIFFNYRNDRAKELTVVLTQQDMPEAGMHTIPGLQYYCMTPYDASFKGVHIL FDKENVANTLGEYIASKGLSQLHIAETEKYAHVTFFFNGGRETPFDNEDRILVPSPKVAT YDLKPEMSAYEVKDKLVDAINENKYDFIVVNFANGDMVGHTGIYEAIEKAVVAVDACVKD VIEAAKAQGYEAIIIADHGNADHALNEDGTPNTAHSLNPVPCVYVTGNKAAKVEDGRLAD VAPTILKIMGLEAPADMNGQILIK >gi|226332258|gb|ACIC01000062.1| GENE 60 75717 - 78188 1977 823 aa, chain + ## HITS:1 COG:no KEGG:BT_3418 NR:ns ## KEGG: BT_3418 # Name: not_defined # Def: putative thiol:disulfide interchange protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 823 1 823 823 1649 99.0 0 MKTILLCMMLMLSGMLTAQTVDNPPFKARSGSIGNITRIERTPDGTRVYIHAIFRPHWWI KEEGDSYLEDAATGKKYQFKSAEGIELNKEVYMPDSGEMDYVLVFEALPEETQVIHLLSP SDTEGNTYDISLVPSSGKNVSPLTAIKGNWFKADDLNTWEYGIYDSVTIMDNRIFTNENI RKKGKRVEITVKDKQNGDIRTLLVTPQKDGSCQIQVNGEKNQLYTRQRGATKTIAADTGF QQFFHTDTTCLQGYIDGYDRRLGFDTGLIYLSNHITRQDYPTVIQIDEDGSFLCKFVIKH 
PVEQSVTLDNNWIPFYIEPGQTLTMYIDWEALLARSRARDYYFPIKNTAYMGPSASLSYL LKEFKSLIPYRYDDLSNARNKLTPSQYQEHMKPIVARWEHTADSLIQICRPSAKAARLIK NKADLQAGGLFFDFLMSRDYYAKQDTANQALKVKEEDSYYDFLKKMPLNDETVLADANAS SFINRFEYMDAFRTAYNYHAPKAKDTISYTYPEESLLAFLKERGVKLNAEQEAIRLKQEK LAGTTVRIPLKELQEENDKVTGLYEKEEKLVLEYIDKQYKNKQSEQDMDRNFISMEQKNG HKKDSILARLYDVPDPLLWQIAKVRNLGFSLQNIKTRSIAREYVDSIKQKLTHPQLAEET EYLFAKTHPKEKVNSYQLPEGKATDVFRNIIKNHPGKVLFVDFWATTCGPCRAGIEATAD LRKKYKDHPEFEFIYITGQKDSPAGAYNQYVEKNLKGEASYYVTDSEFNYLRQLFRFNGI PHYELVLKDGSISKEKLGTHRIRKYLEEHFPVSESAGQGENDQ Prediction of potential genes in microbial genomes Time: Thu May 12 00:47:29 2011 Seq name: gi|226332257|gb|ACIC01000063.1| Bacteroides sp. 1_1_6 cont1.63, whole genome shotgun sequence Length of sequence - 51150 bp Number of predicted genes - 44, with homology - 44 Number of transcription units - 23, operones - 8 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 436 - 479 2.0 1 1 Tu 1 . - CDS 496 - 741 133 ## BT_3415 hypothetical protein - Prom 822 - 881 7.1 + Prom 771 - 830 5.3 2 2 Tu 1 . + CDS 946 - 1875 808 ## COG0598 Mg2+ and Co2+ transporters - Term 1895 - 1938 1.3 3 3 Tu 1 . - CDS 2051 - 3508 1287 ## BT_3413 hypothetical protein - Prom 3538 - 3597 5.1 4 4 Tu 1 . - CDS 3681 - 4283 660 ## COG0164 Ribonuclease HII - Prom 4306 - 4365 5.5 - Term 4673 - 4724 13.2 5 5 Tu 1 . - CDS 4749 - 6953 2652 ## COG3808 Inorganic pyrophosphatase - Prom 6974 - 7033 7.6 + Prom 7262 - 7321 5.5 6 6 Tu 1 . + CDS 7411 - 7716 167 ## gi|253568835|ref|ZP_04846245.1| predicted protein - Term 7645 - 7692 3.2 7 7 Tu 1 . 
- CDS 7713 - 8036 480 ## COG0393 Uncharacterized conserved protein - Prom 8265 - 8324 5.4 - Term 8296 - 8333 4.0 8 8 Op 1 24/0.000 - CDS 8343 - 9554 1120 ## COG0520 Selenocysteine lyase 9 8 Op 2 41/0.000 - CDS 9567 - 10910 1348 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component 10 8 Op 3 41/0.000 - CDS 10913 - 11677 215 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 11 8 Op 4 . - CDS 11681 - 13135 1370 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component 12 8 Op 5 . - CDS 13113 - 13643 461 ## BT_3405 hypothetical protein - Term 13653 - 13699 6.3 13 9 Op 1 20/0.000 - CDS 13723 - 16845 3713 ## COG0532 Translation initiation factor 2 (IF-2; GTPase) 14 9 Op 2 . - CDS 16898 - 18217 595 ## PROTEIN SUPPORTED gi|17988250|ref|NP_540884.1| transcription elongation factor NusA 15 9 Op 3 . - CDS 18220 - 18687 600 ## BT_3402 hypothetical protein - Prom 18709 - 18768 3.4 - Term 18700 - 18731 -0.8 16 10 Tu 1 . - CDS 18842 - 20083 906 ## COG0790 FOG: TPR repeat, SEL1 subfamily - Prom 20193 - 20252 6.5 + Prom 20157 - 20216 8.1 17 11 Tu 1 . + CDS 20240 - 20635 525 ## BT_3400 hypothetical protein - Term 20627 - 20672 -0.3 18 12 Tu 1 . - CDS 20675 - 21373 531 ## COG1451 Predicted metal-dependent hydrolase - Prom 21395 - 21454 8.6 + Prom 21302 - 21361 3.4 19 13 Tu 1 . + CDS 21509 - 22294 728 ## BT_3398 hypothetical protein + Term 22317 - 22363 8.1 - Term 22305 - 22351 5.1 20 14 Op 1 . - CDS 22470 - 22946 449 ## BT_3397 hypothetical protein 21 14 Op 2 . - CDS 22936 - 23445 290 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 22 14 Op 3 . - CDS 23507 - 24280 749 ## COG0548 Acetylglutamate kinase 23 14 Op 4 . - CDS 24340 - 26232 1694 ## COG1166 Arginine decarboxylase (spermidine biosynthesis) - Prom 26265 - 26324 3.7 24 15 Op 1 . - CDS 26350 - 26877 489 ## COG0703 Shikimate kinase 25 15 Op 2 . 
- CDS 26938 - 27540 487 ## COG3560 Predicted oxidoreductase related to nitroreductase - Prom 27569 - 27628 5.3 + Prom 27514 - 27573 8.1 26 16 Tu 1 . + CDS 27679 - 28314 609 ## COG3341 Predicted double-stranded RNA/RNA-DNA hybrid binding protein - Term 28150 - 28202 1.0 27 17 Op 1 . - CDS 28327 - 28758 457 ## BT_3390 hypothetical protein 28 17 Op 2 . - CDS 28808 - 29968 546 ## BT_3389 hypothetical protein 29 17 Op 3 . - CDS 29953 - 31854 817 ## BT_3388 hypothetical protein 30 17 Op 4 . - CDS 31851 - 32639 377 ## COG0726 Predicted xylanase/chitin deacetylase 31 17 Op 5 . - CDS 32700 - 34535 218 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein - Prom 34558 - 34617 3.4 32 18 Tu 1 . - CDS 34671 - 36761 1119 ## COG5545 Predicted P-loop ATPase and inactivated derivatives - Prom 36950 - 37009 7.3 + Prom 37317 - 37376 1.6 33 19 Tu 1 . + CDS 37402 - 38013 213 ## BT_3384 hypothetical protein + Term 38069 - 38111 6.0 - Term 38176 - 38236 5.7 34 20 Op 1 . - CDS 38243 - 40435 1604 ## BT_3382 hypothetical protein 35 20 Op 2 . - CDS 40466 - 42193 996 ## BT_3381 hypothetical protein - Prom 42293 - 42352 8.4 36 21 Op 1 . - CDS 42355 - 43686 257 ## BT_3380 hypothetical protein 37 21 Op 2 . - CDS 43695 - 44630 561 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 38 21 Op 3 . - CDS 44627 - 45010 245 ## BT_3378 hypothetical protein - Prom 45065 - 45124 9.3 39 22 Tu 1 . - CDS 45227 - 46195 421 ## BT_3377 putative glycosyltransferase - Prom 46312 - 46371 4.3 + Prom 46271 - 46330 3.2 40 23 Op 1 . + CDS 46350 - 47459 735 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis 41 23 Op 2 . + CDS 47461 - 48621 338 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 42 23 Op 3 . + CDS 48638 - 49822 779 ## COG0439 Biotin carboxylase 43 23 Op 4 . + CDS 49826 - 50269 243 ## BT_3373 spermidine N1-acetyltransferase 44 23 Op 5 . 
+ CDS 50276 - 51149 140 ## COG0463 Glycosyltransferases involved in cell wall biogenesis Predicted protein(s) >gi|226332257|gb|ACIC01000063.1| GENE 1 496 - 741 133 81 aa, chain - ## HITS:1 COG:no KEGG:BT_3415 NR:ns ## KEGG: BT_3415 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 81 1 81 99 169 100.0 4e-41 MVFQCAGSSLVDYGKITCKIVLKFPLAPYIVDFPYKESWDKLPISLVVHGRGKSCSRSWQ MLLTDVIKSCHGREQVKMPVA >gi|226332257|gb|ACIC01000063.1| GENE 2 946 - 1875 808 309 aa, chain + ## HITS:1 COG:CAC0294 KEGG:ns NR:ns ## COG: CAC0294 COG0598 # Protein_GI_number: 15893586 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Clostridium acetobutylicum # 19 309 24 315 315 199 39.0 6e-51 MRTYLYCEAGFVEKPQWLPNCWVNVVCPDNDDFEFLTKTLNVPESFLNDIADTDERPRTD TEGNWLLTILRIPMQNGQNESLPFGTVPIGIITNNEIIVSVCYHNTDLLPDFIEHTRRKG IEVRNKLDLILRLIYSSAVWFLKYLKQINLDISAAEKELERSIRNEDLLRLMRLQKTLVY FNTSIRGNEVMIGKLQSIFQDTDFLDKELVEDVIIELKQAFNTVNIYSDILTGTMDAFAS IISNNVNAIMKRMTSLSITLMIPTLIASFYGMNVDIHLEEMPYAFALIILCSVVLSTLAF IVFRKIKWF >gi|226332257|gb|ACIC01000063.1| GENE 3 2051 - 3508 1287 485 aa, chain - ## HITS:1 COG:no KEGG:BT_3413 NR:ns ## KEGG: BT_3413 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 485 1 485 485 924 100.0 0 MTTSRLYASKKFMGALLLMMGFVLPANAQFLRTSYFMEGTHYRMQLNPALAPTSGYFNIP AIGSLNVSASSNSLGTQDIIDIIDNSDGFYNNQKFMNRLSNDNRLNVNVNTDILSFGWYK GKNFWSVNVGARVDLGAQIPKSMFSFLHDIDQDGFSWNNSKFDIGKEELNINAYTEVGIG YARAINDRLSVGGKFKVLLGMGNLNLKVDGMNVDANLPLNINDITDVNQIRDYHAKMKVN ARLESSFKGMDLVENTSDPDPRKHYIDDFDFNGFGIAGYGGAIDLGASYKILDNLTVSAS VLDLGFIKWSKGSTSVATAKGSMDYDGKDYALTPEGLEDFQRDAQDFMNRVEGGDVLNYE MLQLQKEDVVESRTTSLHSTVVIGAEYELLDKWLAIGALSTTRFTKPKTQTEITLSANIR PKSWLNAAISYSMIQSAGKSFGLAVKLGPLFIGTDYMFFGKSTKTVNGFVGISVPLGGKK SSKEG >gi|226332257|gb|ACIC01000063.1| GENE 4 3681 - 4283 660 200 aa, chain - ## HITS:1 COG:NMA0075 KEGG:ns NR:ns ## COG: NMA0075 COG0164 # Protein_GI_number: 
15793104 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HII # Organism: Neisseria meningitidis Z2491 # 10 196 3 193 194 184 51.0 6e-47 MLLPYLNKDLIEAGCDEAGRGCLAGSVYAAAVILPKDFKNELLNDSKQLTEKQRYALREV IEKEALAWAVGVVSPEEIDEINILRASFLAMHRAVDQLSVRPQHLLIDGNRFNKYPDIPH TTVIKGDGKYLSIAAASILAKTYRDDYMNRLHEEYPFYDWNKNKGYPTKKHRAAIAEHGT TPYHRMTFNLLGDGQLNLNF >gi|226332257|gb|ACIC01000063.1| GENE 5 4749 - 6953 2652 734 aa, chain - ## HITS:1 COG:MA3879 KEGG:ns NR:ns ## COG: MA3879 COG3808 # Protein_GI_number: 20092675 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase # Organism: Methanosarcina acetivorans str.C2A # 4 727 12 683 685 523 47.0 1e-148 MDSILFWLVPVASVLALCFAYYFHKQMMKESEGTPQMIKIAAAVRKGAMSYLKQQYKIVG CVFLGLVILFSIMAYGFHVQNVWVPIAFLTGGFFSGLSGFLGMKTATYASARTANAARNS LNSGLRIAFRSGAVMGLVVVGLGLLDISFWYLLLNAVIPADALTPTHKLCVITTTMLTFG MGASTQALFARVGGGIYTKAADVGADLVGKVEAGIPEDDPRNPATIADNVGDNVGDVAGM GADLYESYCGSILATAALGAAAFIHSADTVMQFKAVIAPMLIAAVGILLSIIGIFSVRTK ENATVKDLLGSLAFGTNLSSVLIVAATFLILWLLQLDNWIWISCAVVVGLLVGIVIGRST EYYTSQSYRPTQKLSESGKTGPATVIISGIGLGMLSTAIPVIAVVVGIIASFLLASGFDF SNVGMGLYGIGIAAVGMLSTLGITLATDAYGPIADNAGGNAEMAGLGAEVRKRTDALDSL GNTTAATGKGFAIGSAALTGLALLASYIEEIRIGLTRLGTVDLSMPNGDVVSTANATFVD FMNYYDVTLMNPKVLSGMFLGSMMAFLFCGLTMNAVGRAAGHMVDEVRRQFREIKGILTG EAEPDYERCVAISTKGAQREMVIPSLIAILAPIATGLIFGVTGVLGLLIGGLSTGFVLAI FMANAGGAWDNAKKYVEEGNFGGKGGEVHKATVVGDTVGDPFKDTSGPSLNILIKLMSMV AIVMAGLTVAWSLF >gi|226332257|gb|ACIC01000063.1| GENE 6 7411 - 7716 167 101 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253568835|ref|ZP_04846245.1| ## NR: gi|253568835|ref|ZP_04846245.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 101 22 122 122 196 100.0 3e-49 MPNSKSSLAANEKTYSSDYLMKVGIDAFTTNQAFSRVIELTADNFLLPYSCQAFGTLVPR AWHSSANAVAQLCQRHGTNIKNKRGNTVRGESPFSYNKTIN >gi|226332257|gb|ACIC01000063.1| GENE 7 7713 - 8036 480 107 aa, chain - ## HITS:1 COG:STM0930 KEGG:ns NR:ns ## COG: STM0930 COG0393 # Protein_GI_number: 16764292 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Salmonella typhimurium LT2 # 1 103 1 103 107 124 64.0 4e-29 MLVTTTPIIEGKRIVKYYGIVSGETIIGANVFRDLFASIRDIVGGRSGAYEEVLRMAKDT ALQEMQAQAAKLGANAVIGVDLDYETVGGSGSMLMVTANGTAVTIED >gi|226332257|gb|ACIC01000063.1| GENE 8 8343 - 9554 1120 403 aa, chain - ## HITS:1 COG:mlr0021 KEGG:ns NR:ns ## COG: mlr0021 COG0520 # Protein_GI_number: 13470346 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine lyase # Organism: Mesorhizobium loti # 2 403 11 412 413 465 53.0 1e-131 MDIQKIREDFPILSRTVYGKPLVYFDNGATTQKPRLVVDALVDEYYSVNANVHRGVHYLS QQATELHEASRETVRQFINAGSTNEVVFTRGTTESINLLVSSFGDEFMQEGDEVILSVME HHSNIVPWQLLAARKGIAIKVIPMNDKGELLLDEYEKLFTERTKIVSVVQVSNVLGTVNP VKEMIATAHAHGVPFLVDAAQSIPHMKVDVQDLDADFLVFSAHKVYGPTGVGVLYGKEEW LDRMPPYQGGGEMIQHVSFEKTTFNELPFKFEAGTPDYIGTTGLAKALDYVNGIGLDPIA AHEHELTTYALKRLKEIPNMRIFGEAADRGAVISFLVGDIHHFDLGTLLDRLGIAVRTGH HCAQPLMQRLGIEGTVRASFAMYNTRSEVDALVAGIERVSQMF >gi|226332257|gb|ACIC01000063.1| GENE 9 9567 - 10910 1348 447 aa, chain - ## HITS:1 COG:alr2494 KEGG:ns NR:ns ## COG: alr2494 COG0719 # Protein_GI_number: 17229986 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Nostoc sp. 
PCC 7120 # 43 425 62 440 453 185 32.0 1e-46 MNVEQQYIDLFSQTEAMICKHSAEVLNAPRAAAFADFERLGFPTRKMEKYKYTDISKYFE PDYGLNLNRLQIPVNPYEVFKCDVPNMSTALYFVVNDAFYNKALPKTHLPEGVIFGSLKE VAAEHPELVKKYYGKLADTSKDGITAFNTAFAQDGVIFYVPKNVIVEKPIQLVNILRADV NFMVNRRVLIILEDGAQARLLICDHAMDNVNFLATQVIEVFAGENAVFDMYELEETHTST VRISNLYVKQEANSNVLLNGMTLHNGTTRNTTEVLLAGEGAEINLCGMAIADKNQHVDNN TSIDHAVPNCTSNELFKYVLDDQSTGAFAGLVLVRPDAQHTNSQQTNRNLCATRDARMYT QPQLEIYADDVKCSHGATVGQLDENALFYMRARGIAEKEARLLLMFAFVNEVIDTIRLDA LKDRLHLLVEKRFRGELNKCQGCAICK >gi|226332257|gb|ACIC01000063.1| GENE 10 10913 - 11677 215 254 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 1 239 7 232 318 87 28 1e-16 INKLMLEIKDLHASINGKEILKGINLTVKPGEVHAIMGPNGSGKSTLSSVLVGNPAFEVT KGSVTFYGKNLLELSPEDRSHEGIFLSFQYPVEIPGVSMVNFMRAAVNEQRKYKGLPALT ASEFLKLMREKRAVVELDNKLANRSVNEGFSGGEKKRNEIFQMAMLEPRLSILDETDSGL DIDALRIVAEGVNKLKTPETSTIVITHYQRLLDYIKPDIVHVLYKGRIVKTAGPELALEL EEKGYDWIKKEVGE >gi|226332257|gb|ACIC01000063.1| GENE 11 11681 - 13135 1370 484 aa, chain - ## HITS:1 COG:SMc00530 KEGG:ns NR:ns ## COG: SMc00530 COG0719 # Protein_GI_number: 15965488 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Sinorhizobium meliloti # 4 484 5 489 489 715 69.0 0 MQQEEPNKYVKELTQEKYKYGFTTEVHTDVIERGLNEDVVRLISSKKDEPEWLLEFRLKA YRHWLTLEMPTWAHLRIPEIDYQAISYYADPTKKKEGPKSMDEVDPELIKTFNKLGIPLE EQMALSGMAVDAVMDSVSVKTTFKETLMEKGIIFCSFSEAVREHPDLVKKYMGSVVGYRD NFFAALNSAVFSDGSFVYIPKGVRCPMELSTYFRINARNTGQFERTLIVADDDSYVSYLE GCTAPMRDENQLHAAIVEIVVHDRAEVKYSTVQNWYPGDAEGKGGVYNFVTKRGNCKGVD SKLSWTQVETGSAITWKYPSCILTGDNSTAEFYSVAVTNNYQQADTGTKMIHLGKNTRST IVSKGISAGHSENSYRGLVRVAEKADNARNYSQCDSLLLGDKCGAHTFPYMDIHNETAVV EHEATTSKISEDQIFYCNQRGIPTEDAIGLIVNGYAKEVLNKLPMEFAVEAQKLLTISLE GSVG >gi|226332257|gb|ACIC01000063.1| GENE 12 13113 - 13643 461 176 aa, chain - ## HITS:1 COG:no KEGG:BT_3405 NR:ns ## KEGG: BT_3405 # 
Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 176 1 176 176 282 100.0 4e-75 MTTIDIIILIIIGAGVVAGFMKGFIRQLASILGLVVGLLAAKALYATLAEKLCPAVTDSM TVAQVLAFIVIWIAVPLIFTLVASLLTKAMEAISLGWLNRWLGSGLGALKFLLLTSLVIC VIEFIDSDNKLISATKKSESLLYYPMETFAGIFFPAAKNITQQYILDNKDATRRTQ >gi|226332257|gb|ACIC01000063.1| GENE 13 13723 - 16845 3713 1040 aa, chain - ## HITS:1 COG:BH2413 KEGG:ns NR:ns ## COG: BH2413 COG0532 # Protein_GI_number: 15614976 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 2 (IF-2; GTPase) # Organism: Bacillus halodurans # 361 1039 79 730 730 539 46.0 1e-152 MTIRLNKVTRDLNVGIATVVEFLQKKGYTVEANPNTKISEEQYAVLVKEFSTDKNLRLES ERFIQERQNKERNKASVSIEGFEKQQEKPKSEDVIKTVVPEDARPKFKPVGKIDLDKFSG RRTDKVEKAPEKKVEPVVEQPVVEQPVAEKTVVEKPVVESEVKKEEPKVEKVEVIAPKPA PVEQPVVAPKLAVETKPVEVEKVVVEEVKKEEPKVVETAPVKAEERKVERTAEVTATVTT PTEENASKENEVFKIRQPELGAKINVIGQIDLAALNQSTRPKKKSKEEKRREREEKEKVR QDQKKLMKEAIIKEIRKEDSKQAKPGVKDNNADAARKKRNRINKEKVDVNNVATSNFAAP RPNTQTGKGGGNHPRPQGQGQGQGNNNNNNKKNNNKDRFKKPVIKQEVSEEDVAKQVKET LARLTTKGKNKTSKYRKEKREMASNRMQELEDQEMAESKVLKLTEFVTANELATMMDVSV NQVIATCMSIGMMVSINQRLDAETINLVAEEFGFKTEYVSAEVAQAIVEEEDAPEDLQPR APIVTVMGHVDHGKTSLLDYIRKANVIAGEAGGITQHIGAYNVQLEDGRRITFLDTPGHE AFTAMRARGAKVTDIAIIIVAADDNVMPQTKEAINHAMAAGVPIVFAINKVDKPTANPDK IKEELAAMNYLVEEWGGKYQSQDISAKKGMGVEDLMEKVLLEAEMLDLKANPNRNATGSI IESSLDKGRGYVATVLVSNGTLKVGDIVLAGTSYGRVKAMFNERNQRVKEAGPAEPALIL GLNGAPAAGDTFHVVESDQEAREITNKREQLAREQGLRTQKILTLDELGRRIALGNFQEL NIIVKGDVDGSVEALSDSLIKLSTEQIQVNVIHKGVGAISESDVSLAAASDAIIVGFQVR PSGAAAKMAEQEGVDIRKYSVIYDAIEEVKAAMEGMLAPELKEQITATIEIREVFNITKV GLVAGAVVKTGKVKRSDKARLIRDGIVIFTGNINALKRFKDDVKEVGTNFECGISLVNCN DMKVGDMIETFEEIEVKQTL >gi|226332257|gb|ACIC01000063.1| GENE 14 16898 - 18217 595 439 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|17988250|ref|NP_540884.1| transcription elongation factor NusA [Brucella melitensis 16M] # 1 403 1 426 537 233 33 1e-60 
MAKKEETISLIDTFSEFKELKNIDRTTMVSVLEESFRSVIAKMFGTDENYDVIVNPDKGD FEIWRNREVVADEDLTNPNMQISLTEAQKIDASYEEGEEVTDEVIFAKFGRRAILNLRQT LASKILELEKDSIYNKYIDKVGTIINAEVYQIWKKEMLLLDDEGNELLLPKTEQIPSDFY RKGETARAVVARVDNKNNNPKIILSRTSPVFLQRLFEMEVPEINDGLITIKKIARIPGER AKIAVESYDDRIDPVGACVGVKGSRIHGIVRELRNENIDVINYTSNISLFIQRALSPAKI SSIRLNEEEKKAEVFLKPEEVSLAIGKGGLNIKLASMLTEYTIDVFRELDENVADEDIYL DEFRDEIDGWVIDAIKAIGIDTAKAVLNAPREMLIEKTDLEEETVDEVIRILKSEFEEEP AAEMEPETEAEPEAEPTQE >gi|226332257|gb|ACIC01000063.1| GENE 15 18220 - 18687 600 155 aa, chain - ## HITS:1 COG:no KEGG:BT_3402 NR:ns ## KEGG: BT_3402 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 155 1 155 155 260 100.0 1e-68 MIEKKTVCQIVEEWLEGKDYFLVEVTVSPDDKIVVEIDHAEGVWIEDCVELSRFIESKLN REEEDYELEVGSAGIGQPFKVLQQYYIHIGQEVEVVTRDGRKLAGILKDADEEKFTVGVQ KKVKLEGSKRPKLIEEDETFTYEQIKYTKYLISFK >gi|226332257|gb|ACIC01000063.1| GENE 16 18842 - 20083 906 413 aa, chain - ## HITS:1 COG:jhp1045 KEGG:ns NR:ns ## COG: jhp1045 COG0790 # Protein_GI_number: 15612110 # Func_class: R General function prediction only # Function: FOG: TPR repeat, SEL1 subfamily # Organism: Helicobacter pylori J99 # 276 399 45 168 256 86 40.0 1e-16 MDHEVFISYSSANIQTAQAICHALESNRIKCWMAPRDIRPGAEYGDIIEEAIVTCKVFLI VFSETSQISRWVRSELNIGFSSNKPIIPFRIDPTDLKGSMKLMLNDKHWLNAYPNPEEKF SELAAVILDLLQHPAVDTISPPGFQPASNVASADGSNAGAQGLFGKLLNQLFVKRRVKVT SDHAAEIFLNGERKGKLDAYETGNYSVSEDFYQLEVYSCEYGKKVKCEYSGSFSNSSTVI ISVDMATAEKQYLMKAKNITSGQINDLGIIFLDEGKFEDAQECFLKAASMGEAAAAYNLG MMYHFGKGVEIDYDVAREYYEEAVKSDYPLALNNLGSIYYNGHGVRKDIAKSFPYFCRAA ERGVESAQFTVATMLFYGQGVAVDKAKAKKWFQKAAAQGCKDSQNYLNSWVDD >gi|226332257|gb|ACIC01000063.1| GENE 17 20240 - 20635 525 131 aa, chain + ## HITS:1 COG:no KEGG:BT_3400 NR:ns ## KEGG: BT_3400 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 131 1 131 131 233 100.0 1e-60 MAHHLNTNKQFMVGNGILAFAVIFVVVIFIYMSLRLQRENQAERHYIETYTISLVKGFAG 
DSISLFVNDSLIANQVMQEEAFTVEVGRFAEQSALMIVDNKTEIASVFDLSEKGGTYQFE KETDGIKQLAK >gi|226332257|gb|ACIC01000063.1| GENE 18 20675 - 21373 531 232 aa, chain - ## HITS:1 COG:RSc0521 KEGG:ns NR:ns ## COG: RSc0521 COG1451 # Protein_GI_number: 17545240 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Ralstonia solanacearum # 106 208 165 261 288 76 37.0 5e-14 MDTVIEDDELGSLIVRVNPRARSLVFRTKSDAVYVSVPPGTTTKEVKRAIENLRSKLLLS RQKVARPLIDLDYKIDAEHFKLSLILGEKEQFLANSRLGVMQIVCPPQADFTDENLQNWL RKVIEESLRRNAKSILPPRLEHLSAQCGLSYASVKINSSQGRWGSCSARKNINLSYYLVL LPSHLIDYVLLHELCHTREMNHGERFWDLLNRFTDGKALALRKELKKYRTEI >gi|226332257|gb|ACIC01000063.1| GENE 19 21509 - 22294 728 261 aa, chain + ## HITS:1 COG:no KEGG:BT_3398 NR:ns ## KEGG: BT_3398 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 261 1 261 261 451 100.0 1e-125 MKRNTLTAWLLGIILIFSATSCFNGPIVKGSKNYITKKENLEHFNDISMSGSANIIYQQD SSSRIEIYGSDNIVELLETKVNGKTLNIKFKKNVNIIDKGKLEIKVFSPDLNGLTLNGSG GILFANGIHTEGDLKVTINGSGNLGGSTFDTGHLAISIHGSGDVRLKQIDSKTCSASISG SGNITLDGMTGAAKYNISGSGNIKAADLEATDVYAGISGSGNISCFVNGKLGGHVSGSGN VAYKGNPQAIDFPHKKLRKLD >gi|226332257|gb|ACIC01000063.1| GENE 20 22470 - 22946 449 158 aa, chain - ## HITS:1 COG:no KEGG:BT_3397 NR:ns ## KEGG: BT_3397 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 158 1 158 158 293 99.0 2e-78 MDYKDIEQLLERYWQCETSVEEESVLRDFFTKEEVPAHLLRYKNLFVYQQVQQEVGLGED FDARILAQVETPVVKAKRLTLTGRFIPLFKAAAVIAIILSLGNVAQHSFSGDDGRVLATD TIGKQLTAPSVALSNELKADQALADSLAKINKVQVMKE >gi|226332257|gb|ACIC01000063.1| GENE 21 22936 - 23445 290 169 aa, chain - ## HITS:1 COG:MT1259 KEGG:ns NR:ns ## COG: MT1259 COG1595 # Protein_GI_number: 15840665 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mycobacterium tuberculosis CDC1551 # 15 168 93 247 257 63 28.0 1e-10 
MQEISFRNDILPLKDKLFRLALRITLDRAEAEDVVQDTMIRVWNKRDEWSQFESVEAYCL IVAKNLAIDRSQKKEAQNVEITPEMEEEPDANSPYDQLVHDEKMNIINRLVNELPEKQRL IMQLRDIEGESYKKIATLLNLTEEQVKVNLFRARQKVKQRYLEIDEYGL >gi|226332257|gb|ACIC01000063.1| GENE 22 23507 - 24280 749 257 aa, chain - ## HITS:1 COG:MK1631 KEGG:ns NR:ns ## COG: MK1631 COG0548 # Protein_GI_number: 20095067 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate kinase # Organism: Methanopyrus kandleri AV19 # 5 256 1 246 246 159 40.0 4e-39 MREKLTVIKVGGKIVEEEATLLQLLNDFAAISGHKVLVHGGGRSATKIAAQLGIESKMVN GRRITDAETLKVVTMVYGGLVNKNIVAGLQARGVNALGLTGADMNVIRSVKRPVKEVDYG FVGDVEKVDASLLADLIHKGVVPVMAPLTHDGQGNMLNTNADTIAGETAKALSALFDVTL VYCFEKKGVLRDENDDDSVIPEITRAEFEQYVADGVIQGGMIPKLENSFEAINAGVTEVV ITLASAIKDNEGTRIKK >gi|226332257|gb|ACIC01000063.1| GENE 23 24340 - 26232 1694 630 aa, chain - ## HITS:1 COG:all3401 KEGG:ns NR:ns ## COG: all3401 COG1166 # Protein_GI_number: 17230893 # Func_class: E Amino acid transport and metabolism # Function: Arginine decarboxylase (spermidine biosynthesis) # Organism: Nostoc sp. 
PCC 7120 # 2 629 51 677 679 613 46.0 1e-175 MRKWRIEDSEELYNITGWGTSYFSINDAGHVVVTPRRDGVTVDLKELVDELQLRDVASPM LLRFPDILDNRIEKMSSCFKQAAEEYGYKAENFIIYPIKVNQMRPVVEEIISHGKKFNLG LEAGSKPELHAVIAVNTDSDSLIVCNGYKDESYIELALLAQKMGKRIFLVVEKMNELKLI AKMAKQLNVQPNIGIRIKLASSGSGKWEESGGDASKFGLTSSELLEALDFMESKGLKDCL KLIHFHIGSQVTKIRRIKTALREASQFYVQLHSMGFNVEFVDIGGGLGVDYDGTRSSNSE GSVNYSIQEYVNDSISTLVDVSDKNGIPHPNIITESGRALTAHHSVLIFEVLETATLPEW DDEEEIAPDAHELVQELYSIWDSLNQNKMLEAWHDAQQIREEALDLFSHGIVDLKTRAQI ERLYWSITREINQIAGGLKHAPDEFRGLSKLLADKYFCNFSLFQSLPDSWAIDQIFPIMP IQRLDEKPERSATLQDITCDSDGKIANFISTRNVAHYLPVHSLKKTEPYYLAVFLVGAYQ EILGDMHNLFGDTNAVHVSVNEKGYNIEQIIDGETVAEVLDYVQYNPKKLVRTLETWVTK SVKEGKISLEEGKEFLSNYRSGLYGYTYLE >gi|226332257|gb|ACIC01000063.1| GENE 24 26350 - 26877 489 175 aa, chain - ## HITS:1 COG:alr1244 KEGG:ns NR:ns ## COG: alr1244 COG0703 # Protein_GI_number: 17228739 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Nostoc sp. PCC 7120 # 2 168 8 169 181 105 35.0 4e-23 MVRIFLTGYMGAGKTTLGKAFARKLNVPFIDLDWYIEERFHKTVGELFTERGEAGFRELE RNMLHEVAEFENVVISTGGGAPCFYDNMEFMNRTGKTVFLNVHPDVLFRRLRIAKQQRPI LQGKEDDELMDFIIQALEKRAPFYTQAQYIFNADELEDRWQIESSVQRLQELLEL >gi|226332257|gb|ACIC01000063.1| GENE 25 26938 - 27540 487 200 aa, chain - ## HITS:1 COG:CAC3314 KEGG:ns NR:ns ## COG: CAC3314 COG3560 # Protein_GI_number: 15896557 # Func_class: R General function prediction only # Function: Predicted oxidoreductase related to nitroreductase # Organism: Clostridium acetobutylicum # 1 199 1 198 198 235 56.0 5e-62 MERTFSEALKNRRTYYSITDQSPIPDQEIECIINLAVRHVPSAFNSQSTRVVLLLGKSHK KLWNIVKDALRKIVPGEAFAKTEEKIDNSFACGYGTVLFFEDQKVVKGLQEAFPSYQENF PGWSLQTSAMHQLAVWVMLEDVGFGASLQHYNPLIDDEVRRAWNLPAHWHLIAEMPFGVP VNKPGEKEFQPLEERIKVFK >gi|226332257|gb|ACIC01000063.1| GENE 26 27679 - 28314 609 211 aa, chain + ## HITS:1 COG:BH0863 KEGG:ns NR:ns ## COG: BH0863 COG3341 # Protein_GI_number: 15613426 # Func_class: R General function prediction only # Function: Predicted double-stranded RNA/RNA-DNA hybrid 
binding protein # Organism: Bacillus halodurans # 1 211 1 196 196 186 49.0 3e-47 MAKEKFYVVWEGVTPGVYASWTDCQLQIKGYEGAKYKSFDTREEAERALASSPYAYIGKN AKKKTEAARTGSDTLPASVIDNSLAVDAACSGNPGPMEYRGVHVASRQEIFHFGPMKGTN NIGEFLAIVHGLALLKQKGFDMPIYSDSVNAISWVRQKKCKTKLPRNEETEELFKVIERA EKWLRENTYTTRILKWETKVWGEIPADFGRK >gi|226332257|gb|ACIC01000063.1| GENE 27 28327 - 28758 457 143 aa, chain - ## HITS:1 COG:no KEGG:BT_3390 NR:ns ## KEGG: BT_3390 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 143 1 143 143 265 100.0 4e-70 MEIVKSVFQSKGKRYQFIRYCTVGTLAAGIHYGVYYLLQEYEMTNLNIAYTIGYVTSFIC NFFLTSYFTFRSNPSLKRALGFGGSHLVNYLIHMGLFNLFLYLNVDQEIAPLCVLAVAVP TNFLMLRFVFKHKKEQAADQVAE >gi|226332257|gb|ACIC01000063.1| GENE 28 28808 - 29968 546 386 aa, chain - ## HITS:1 COG:no KEGG:BT_3389 NR:ns ## KEGG: BT_3389 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 386 1 386 386 689 100.0 0 MLERLKGYVQNPVFLFILWMGVGLACSLSLMMKGTYSNYVIFSQSFWHAISSSPLYVEYL QEQKDFFLYGISFTALISPFAVLPRPLGMILWCLVNCGFLYYAISKLDLKKWQFAVVILV SVNDVFTAVLSQQYSIGITAMIIFAYVLIEKEKDFWAALMIVLGTMTKLYGIVGLAFFLF SKHKMKLAGGLVFWAVIVFLLPMLYASPEYVVHSYKEWLDVLVYKNGLNQFSINQNISLL GMLHRITGASFSDLWIIVPGMILFALPYLRIRQYGNESFRFLFLSSALLFMVLFSTGTET YGYLTAMIAVGIWYVKTPTKAATPILNLSLLIFCILLTSLSTTDLFPRFIRAEYVKPYAL KALPCTLIWLKIVWEQLTQDFARPSD >gi|226332257|gb|ACIC01000063.1| GENE 29 29953 - 31854 817 633 aa, chain - ## HITS:1 COG:no KEGG:BT_3388 NR:ns ## KEGG: BT_3388 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 633 1 633 633 1249 100.0 0 MKDIIRFILRLVQNPIVLAILWFGVAIRGFWVSWTEGLANNYLIFSRSFFHALEQTPLYV EYPKEYFDLFLYGIPFTLLIAPFSIMPTMVGSALWSLCNALLLYFAIKKLEFEKWKTAII VWLSYNGLYLSVVTQQYNAAVAAFILFTFILVERKKDFWAALMIVLGTLTKIYGIVGLAF FLFSKRKLYFLWGILFWAFVLFVVPMFYTSPQYVFDSYKEWVSILVVKNDVNELSFYQNI SLLGMVRKITHAVEYSDMWLIIPGIVLFLLPYLRIGQYENRNFRLSFLASVLLFMVLFST GTEECGYVGALIGVGIWYVSTPTYKKSFVLNTCLLLFCFALTAASSSSILFSKHFRTEYI 
TSFALKALPCAIIWFKIIWEQLTQDYTSRTPTPFLHKKDDERIDVILPCYNPHEGWEQQL IEKHKELEGMLNGYNIRFIVVNDGSKRGFTEEAVLRLTNNLPNTIIVDNKINQGKGAAVR DGIAHSDSELALYTDYDFPYKIESVCQVIKYLEEGYDVVVANRNHTYYSQLSTRRKLASH ASRFLNFMLLGLTHTDTQGGLKGFNCKGKAFLASTRIKQFLFDTEFIYKASLDDTTFIKE VPVDLRGEVMLPDMKKGVFVNELKNLLMICWRG >gi|226332257|gb|ACIC01000063.1| GENE 30 31851 - 32639 377 262 aa, chain - ## HITS:1 COG:MA0797 KEGG:ns NR:ns ## COG: MA0797 COG0726 # Protein_GI_number: 20089681 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Methanosarcina acetivorans str.C2A # 2 213 16 207 250 60 26.0 5e-09 MILLSFDIEEFDVPKEHQVDISMEEQIKVSVEGTTRILDCLERNQVKATFFCTANFALHA PDIIKRIQEGGHEIASHGFYHWTFEVEDLKRSKDQLEEITGTKVYGYRQARMMPVSEKAV YEAGYIYNSSLNPTFIPGRYMRLSTPRTPFIKDRVMQMPASVTPLLRFPLFWLSCHNLPA SLYRWLCKYTHRHDGYFVTYFHPWEFYPLNEHPEWKLPFIIRNHAGEGMEKRLDAFILYF RKKNVSFGRFIDYIKLNKDIMK >gi|226332257|gb|ACIC01000063.1| GENE 31 32700 - 34535 218 611 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 370 607 1 229 311 88 28 7e-17 MKEFLQLMRRFVSPYKKYIGWAILLNILSAVFNVFSFTLLIPILDILFKTGENNKVYEYM EWGTAGFKDVAINNFYYYVSQMIQDNGPTTALIFLGLFLALMTLFKTSCYFASSAVMIPL RTGVVRDIRIMVYAKVMRLPMGFFSEERKGDIIARMSGDVGEVENSITSSLDMLIKSPIM IILYFITLIATSWKLTLFTVLVLPAMGWLMGKVGRKLKRQSLEAQGKWSDTMSQLEETLG GLRIIKAFIAEDRMIDRFTRCSNELRDAVNKVAMRQALAHPMSEFLGTILIVSVLWFGGA LILGHNSSLTAPTFIFYMVILYSVINPLKDFAKAGYNIPKGLASMERVDKILKAENNIKE IPNPKPLTGMNDRIEFKDISFSYDGKREVLKHVNLTVPKGKTVALVGQSGSGKSTLVDLL PRYHDVQGGDITIDGTSIKDVRIADLRSLIGNVNQEAILFNDTFFNNIAFGVENATMEQV IEAAKIANAHDFIMEKPEGYNTNIGDRGGKLSGGQRQRVSIARAILKNPPILILDEATSA LDTESERLVQEALERLMKTRTTIAIAHRLSTIKNADEICVLYEGEIVERGRHEELLELDG YYKRLNDMQAL >gi|226332257|gb|ACIC01000063.1| GENE 32 34671 - 36761 1119 696 aa, chain - ## HITS:1 COG:all8519 KEGG:ns NR:ns ## COG: all8519 COG5545 # Protein_GI_number: 17232892 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase 
and inactivated derivatives # Organism: Nostoc sp. PCC 7120 # 332 668 306 631 836 66 24.0 2e-10 MRITQVRDDGKVNTMRTLKIEQLVEQMKKETKAQLVSNMREVLPYILPGDKNDYIERVPK ILPAAAFVRKNGVMAMAEYNGIVMLQVNGLSGRMEADEVKECVKELPQTYLAFIGSSGKS VKIWVRFTYPDNRLPDNREQAEVFHAHAYRLAVKYYQPQLPFDIELRVPSLEQYCRLTFD PELYFNPEAMPVYLKQPASLPGETTYREQVQAQASPLQRLVPGYDSYEALSVLFEAAFAR AFAEQKGYRPGDDIHSLLVCLAEQCFRAGIPQEDTVRWARAHYRLPKDEFLIRETVKNVY GTCEGFADKSSLLPEQLFVMQTDEFMKRRYEFRFNMLTSSVEYRERNSFNFYFRPIDKRV MASITMNAMYEGIKLWDKDVVRYLNSDHVPVYHPVEEFLYDLPHWDGKDHIRDLAERVPC DNPHWGQLFRRWFLSTVAHWRGVDKNHANSTSPILIGPQAYRKSTFCRLILPPCLQAYYT DSIDFSRKRDAELYLNRFLLINMDEFDQIGVNQQSFLKHILQKPVVNTRRPNASAVESLR RYASFIGTSNHKDLLTDTSGSRRFIGVEVTGVIDVVRPIDYEQLYAQAMTALYKNERYWF DEEEEAIMTESNQEFEQSPAIEQLFQVYYRAAADEEAGEWLLAADLLQRIQKASKMKFSP RQVSYLGRILQKLGVKSYRRSHGVYYHVVPISFDNE >gi|226332257|gb|ACIC01000063.1| GENE 33 37402 - 38013 213 203 aa, chain + ## HITS:1 COG:no KEGG:BT_3384 NR:ns ## KEGG: BT_3384 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 203 1 203 203 386 98.0 1e-106 MAVQFEFYKNPVQGEEEGETSYHPRVVNFQHVTTRRLAAEIHSATTFGKAEVEAMLMELS RCMGNHLREGQRVHLDGVGYFQITLQATEPIHSMAAHADKVKFKSVSFLADRDLKGELIG MHAQRSKYKPHSATLSQEEIDKKLTGYFSTHTVLTRSDMQSLCQFTHSMAARHIRRLKEE GSLQNIGIRTQPIYVPCPGHYGK >gi|226332257|gb|ACIC01000063.1| GENE 34 38243 - 40435 1604 730 aa, chain - ## HITS:1 COG:no KEGG:BT_3382 NR:ns ## KEGG: BT_3382 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 581 1 581 582 1212 98.0 0 MENLKAFFTRRPERLLGRCGMLFLMAFFVMNLSSCNDDFIEDDGGEVIPPEEIVIPPVYM LLDGPYKSGVTLTQNEDSSYTIETTNGDPWATAGLFKEDVPEECNVLEFEYQTEMGMSNL ELFFADAKNGIDATHSMSAGTVPASTEWASFSVRLKKYRQEFDWGKVKDYLRLDFGDQPN NIIQMRNIRLRMMNDEEKQAEEDENNEALNKEKYEQGIKDYLNKEYDCHVTDVVVGESTI SIQGNYTGEGTFFLGEIPPFVDMFKVEKVENKTPLSNGSFNVQLDRYVTIGDYEYDRLLS KWAIYKEGASKDEIVSHARYANVDKIHVKQSVEAIPLKSKKGLGGLINHGFLASDLDDLG IASATINIPISHFMHLSQQEGDIPHTYGGRTYYFNEGYMKSSFDAVLAQTSERGISVAGI 
LLVPPTGDAGTLLKHPDFNGIAPYTMPNMTTIESTNCYAAALDFLAERYSDPNMRIAHWI IHNEVDGGSHWTNMGDKPIATFMDTYLRSMRMCYNIAHQYDQHSEVFISFSHGWNIAAGG GWYKVRDMLDFMNQFSESEGDFFWSLACHSYPAQLGNPCTWDDEQATYSMDTEYVTLKNL EVLDKWVSLSSNKYKGTVKRSVWLSEAGTCSPSYSDNDLQDQAAGFAYGWKKINNLDGID GIQWHSWFDHLGDGACLGLRKYTEAPHNGEAKPVWTTYQKADTDEEDDYFEQYLSRIGID SWEGIIQDIP >gi|226332257|gb|ACIC01000063.1| GENE 35 40466 - 42193 996 575 aa, chain - ## HITS:1 COG:no KEGG:BT_3381 NR:ns ## KEGG: BT_3381 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 575 1 575 575 1188 99.0 0 MELLKSHFFKMRKAYIESFIILLFLSCCPFIVSSCHEEEKEEIPESPFDEEDIQHEQDLN AYLGKSYSCKISQVSVMESSVRVTGEYTGEGNFFLGEIPPYLDIIDVKKAPYKLKLEDSS FEIKLERYVERDGALYDRLLSKWAIYKEGVERDQLVSHAHQADEIHAFQNLPAIKLTSKK GLGGIIPNQYISDFTSLGISSATINVCITQFMHLTPRAGDIAHTYGGRTYYMDEGYLKTV LDVPLLEAAKRNIAVAAIILVEPAAKCVDPDLGALLQHPDYERGVYTMPNMTTLESVNCY AAAFDFLAKRYCTADNRYGRIAHWIMHNEVDGCIDWTNMGVKPLTVFTDTYIKSMRICYN IVRQYDKQAEVLGSFTHSWTQIANVGWWLYTSKEIIDLLNVYSRVEGDFQWGLAYHSYSQ DLTNPCVWIDPNATFSMDTQFITFKNLEVLSKWALTKENKYKGTIKRSVWLSEAGVNSPT YSDEDFQKQAASLAFAWKKINALEGIDGLQWHNWFDHPGDGACFGLRKYLDESYRGEAKP VWEVYRKAGTNEEDEYFEQFLPLIGIPDWNIIENF >gi|226332257|gb|ACIC01000063.1| GENE 36 42355 - 43686 257 443 aa, chain - ## HITS:1 COG:no KEGG:BT_3380 NR:ns ## KEGG: BT_3380 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 443 1 443 443 835 99.0 0 MSKKTVRIGIGAFLLIIGSLFYLMNMYTPICGDDYLYSFYLTPVAAKSFFEGASIGFEQK ISSFTDVIFSQYNHYFYVNGRTIPHILEQSFAGLWGENCFNLINVFAFLLLNMLVIWISG KRNLTKFGYWVAAVFFIWFLLPCPVDLFLLMSGALNYTWSAVLCLAFLLVYTKVRQMERV NWGVAFLLFLLGVISGWTHESLVIGISGALFIIYCVQYNKRKPKSPEVALVAGFWLGTLL LCLSPAARGRASFDHPSIWETFLLIIGELRAFYVLLFLLVYTFFREKRNNNNHTLRKFFY DNQLYFYVILIELVFSLVIGFRNVRQLFGIELFSVVILIKLISEQTSFNAVWCRSVSIVA ASAIVLHMAFVIPCAKRSHAQFQDIVTTYLHSEDGVVSFRYEEFPCWVDSYVWRFGGYAD WEAFCISVYYMGDRKPMKALLVD >gi|226332257|gb|ACIC01000063.1| GENE 37 43695 - 44630 561 311 aa, chain - ## HITS:1 COG:lin2695 KEGG:ns NR:ns ## COG: lin2695 
COG0463 # Protein_GI_number: 16801756 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Listeria innocua # 1 311 1 312 315 378 56.0 1e-104 MSKVSVLIPAYNEELSLPELYSKLKEMMDSYAEYEWEILFVNDGSHDKTLEVIKFLRSQD ERINFVDLSRNFGKEVAMLAGFDYATGDCLVIMDADLQHPPHLIPEMLKYWEEGYDDVYA KRITRGKESLMRKYLSLLFYKLLQKTTRVEILPNVGDFRLLDRCCINALKQMRESQRYTK GMYCWIGFRKKEIKFEQEDRVAGTSSFNFFSLLSLAVEGVTSFTVAPLRISTFAGIIVSL VAFIYMCFIMFKTLIWGEVVQGFPTLMVVILFLGGIQLLSLGVIGEYIGRIFNETKNRPT YIAREYNGVKQ >gi|226332257|gb|ACIC01000063.1| GENE 38 44627 - 45010 245 127 aa, chain - ## HITS:1 COG:no KEGG:BT_3378 NR:ns ## KEGG: BT_3378 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 127 1 126 126 200 98.0 2e-50 MMSVFNKRREFVRFILVGVLATATHYGIYFFLCVLMLPAIAYTIGYAISFILNFYLSNIF TFNTKPTVRKGIGFGISHFINYLLHIGLLSLFIWIGVPERWAPLPVFTLVVPVNFLLVRF VLKSEKI >gi|226332257|gb|ACIC01000063.1| GENE 39 45227 - 46195 421 322 aa, chain - ## HITS:1 COG:no KEGG:BT_3377 NR:ns ## KEGG: BT_3377 # Name: not_defined # Def: putative glycosyltransferase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 322 1 322 322 633 100.0 1e-180 MFGIYLTIVVLEFYLLYLCMIMNLFNDAKVMRHAFLILAHNEFQILKILLSMLDDGRNDI YLHIDKKVVLGPLEQDLFRLAKARLFVLEQRLDVRWGDISVVKAELLLLETASMKGPYDY YHLLSGVDLPIKSQDYIHHFFEKNKGYEFVPYSCGEANLKDLERKVFKYHLFCRYYKIPP RIFKKQVQSLRISFLKLQDFFHYNRPKEIEFKKGSNWVSITHELLTIILAQKSFILRRFK NVCCGDEIFLQSILWNSERRSHIYPGSEQLNAGLRAIDWERGNPYVWKMEDLQYLLETEH LFARKFDSQNMDVVIKIRELFS >gi|226332257|gb|ACIC01000063.1| GENE 40 46350 - 47459 735 369 aa, chain + ## HITS:1 COG:CAC2350 KEGG:ns NR:ns ## COG: CAC2350 COG0399 # Protein_GI_number: 15895617 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Clostridium acetobutylicum # 6 367 2 353 364 261 39.0 2e-69 
MIDKPIYVTSPLLPSLEDFTFLLKEIWESKMLTNNGNFHQKLEEELAKYLKVPYLSLFTN GTLPLITALQAMRITGEVITTPFSFVATTHSLWWNGIKPVFVDIEPETCNLDPSKIEAAI TPRTTAIMPVHVYGKPCKTKEIQEIANKYGLKVIYDAAHAFGVEINGESILNFGDMATLS FHATKVYNTLEGGALVVHDEQTKKRIDYLKNFGFASETEVVAPGINSKVDEVRAAYGLLN LKQVDHAINSRRKVAIRYRDELQGVKGITFFNDIPGVRHNYSYFPIFINAEEYGMTRDEL YFKMKEHNVFGRRYFYPLISTFSTYRGLDSANPDNLPVATQMSNNVICLPMHHALSENEV EYILQIIKK >gi|226332257|gb|ACIC01000063.1| GENE 41 47461 - 48621 338 386 aa, chain + ## HITS:1 COG:BH3350 KEGG:ns NR:ns ## COG: BH3350 COG0436 # Protein_GI_number: 15615912 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Bacillus halodurans # 3 384 10 389 393 296 40.0 3e-80 MIQLSEITNSISPSLTRRLFNLAQKYDNVIDFTLGDPDIHPHDKIKEAGCKAILEGRTRY SPNAGLLELREIISSRYKLQYNIEYNPTNEIMVTVGGMEGLYLTLLAILNRGDEVIIPAP YWINYVQMVCMCSGEPIITAPVSTNDLSISIENIRKAITPKTKAIILNTPSNPSGRIICD DSIQQIAQIAIENDLIVITDEVYKTLLYDNAHFKSIVTCDKMKERTVVINSLSKEFCMTG WRLGYVAAPSELISVMTMFQENIAACAPLPSQYAAIEALRNSEKYSAGMIEEFTLRRNVL LEEVAKIKTITVDAPQGTFYAMLNIKSTGLKSEEFAYALLEKEQVAVVPGITYGDCCEDF IRIAFTLDIYKIKEGIQRLKRFVESL >gi|226332257|gb|ACIC01000063.1| GENE 42 48638 - 49822 779 394 aa, chain + ## HITS:1 COG:mlr6296 KEGG:ns NR:ns ## COG: mlr6296 COG0439 # Protein_GI_number: 13475266 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxylase # Organism: Mesorhizobium loti # 1 301 8 308 407 70 26.0 6e-12 MLGGSLYQVYAIKEAVKMGYYVITCDYLPNNPGHQYAHEYYNVSTTDKEAVYELAKRLQV DGVVAYASDPAAPTAAYVCEKLGLPTSPYNSVEILSQKDLFRRYLAEHNFNVPKYVGCTS YTEALEEIKNLTLPVMIKPVDSSGSKGINKLTNVDQLKDFIEDALSYSREKRIIIEEFIE KDGYQISGDAFSVDGILKFHCFGNEYYSSSVVKDFAPLGECWPFQMKPEIITELEKDIQR LISELKMGTTAYNVEAILGKNGKLYILELGARSGGSLIPQITELATGVNMVKYVIKAALG EDCSEIEMSPARGYWSNYMLHSKQTGRFKEISFDENFAKNNLVDCVTELKSGDNVHAFRD AGDALGTLILKYSSKEEMFNVIENMDQFVHIIID >gi|226332257|gb|ACIC01000063.1| GENE 43 49826 - 50269 243 147 aa, chain + ## HITS:1 COG:no KEGG:BT_3373 NR:ns ## KEGG: BT_3373 # Name: not_defined # Def: spermidine 
N1-acetyltransferase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 147 1 147 147 282 100.0 2e-75 MSITIHTEPLAYTEFRDFIISSANWFVPSLLQMPNLDEWILKMYHNGSMYYTISGSNIIS LIVGYYNREERFLYIPYVCVNPDHQGHGISKKTINYIIEHLSTDIQEILLEVRKDNNNAL HAYHKMGFYEAEDRDEKYLMKKEVHKS >gi|226332257|gb|ACIC01000063.1| GENE 44 50276 - 51149 140 291 aa, chain + ## HITS:1 COG:MT1566 KEGG:ns NR:ns ## COG: MT1566 COG0463 # Protein_GI_number: 15840982 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Mycobacterium tuberculosis CDC1551 # 8 123 44 156 373 128 51.0 1e-29 MKDIANTPLVSIICSTYNHGLYISQCLDGFLMQKTNFPFEILIHDDASTDNTPDIIREYE HNHPQVIRPIYQKENKYSKKEDIFAKYQCSRVRGKYIAICEGDDYWIDPLKLQKQIDFLE NNPDYGMIYTTSKVYNQKEGKIEKDIIGREFNGYIELLSGNCIPTLTSCIRSALVMEYLK EVEPKSKNWLMGDYPMWLWISYYHKIKFIPESTTVYRVLEESASHSNDIQKYERFILSTI DITAFYIHKFKTPLTEPYLRSLSCFYYDLYDKYMHLGMYKKARHYSKLINP Prediction of potential genes in microbial genomes Time: Thu May 12 00:49:02 2011 Seq name: gi|226332256|gb|ACIC01000064.1| Bacteroides sp. 1_1_6 cont1.64, whole genome shotgun sequence Length of sequence - 70054 bp Number of predicted genes - 45, with homology - 44 Number of transcription units - 24, operones - 8 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 992 - 1051 6.8 1 1 Tu 1 . + CDS 1129 - 2610 615 ## Sdel_1215 membrane attack complex component/perforin/complement C9 2 2 Op 1 . - CDS 2530 - 2736 84 ## 3 2 Op 2 . - CDS 2714 - 3718 371 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 3961 - 4020 7.5 + Prom 3693 - 3752 5.4 4 3 Tu 1 . 
+ CDS 3911 - 4897 -7 ## BT_0473 putative glycosyltransferase - Term 4798 - 4847 3.5 5 4 Op 1 25/0.000 - CDS 4926 - 6188 512 ## COG0438 Glycosyltransferase 6 4 Op 2 25/0.000 - CDS 6175 - 7434 578 ## COG0438 Glycosyltransferase 7 4 Op 3 26/0.000 - CDS 7460 - 8701 514 ## COG0438 Glycosyltransferase 8 4 Op 4 11/0.000 - CDS 8703 - 9713 617 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 9733 - 9792 5.6 9 4 Op 5 . - CDS 9876 - 11915 1094 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 10 4 Op 6 . - CDS 11936 - 12670 314 ## BT_3364 hypothetical protein 11 4 Op 7 . - CDS 12720 - 13694 509 ## BT_3363 lipopolysaccharide core biosynthesis protein LpsA 12 4 Op 8 . - CDS 13726 - 14757 764 ## COG0859 ADP-heptose:LPS heptosyltransferase 13 4 Op 9 . - CDS 14761 - 15366 596 ## BDI_2820 hypothetical protein - Prom 15486 - 15545 7.4 + Prom 15317 - 15376 7.7 14 5 Tu 1 . + CDS 15484 - 16530 814 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases 15 6 Tu 1 . - CDS 16603 - 17229 303 ## COG0299 Folate-dependent phosphoribosylglycinamide formyltransferase PurN - Prom 17310 - 17369 7.8 + Prom 17249 - 17308 5.2 16 7 Op 1 27/0.000 + CDS 17328 - 17564 418 ## COG0236 Acyl carrier protein 17 7 Op 2 1/0.000 + CDS 17583 - 18845 1377 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase 18 7 Op 3 . + CDS 18772 - 19785 810 ## COG0571 dsRNA-specific ribonuclease + Term 19920 - 19954 -0.2 19 8 Tu 1 . - CDS 19829 - 20839 872 ## COG0205 6-phosphofructokinase - Prom 20880 - 20939 5.4 + Prom 20803 - 20862 3.7 20 9 Tu 1 . + CDS 20977 - 22488 1146 ## BT_3355 putative auxin-regulated protein + Term 22529 - 22571 6.2 - Term 22516 - 22559 1.3 21 10 Tu 1 . - CDS 22695 - 23804 785 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain - Prom 23841 - 23900 3.4 22 11 Tu 1 . + CDS 23860 - 26943 1974 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family + Prom 26975 - 27034 2.3 23 12 Tu 1 . 
+ CDS 27059 - 27871 769 ## COG0561 Predicted hydrolases of the HAD superfamily + Prom 27886 - 27945 4.3 24 13 Tu 1 . + CDS 27975 - 29456 1772 ## COG0215 Cysteinyl-tRNA synthetase + Term 29503 - 29545 5.1 - Term 29489 - 29532 9.1 25 14 Op 1 . - CDS 29580 - 32441 1947 ## BT_3350 putative chondroitinase (chondroitin lyase) 26 14 Op 2 . - CDS 32580 - 34106 1223 ## COG3119 Arylsulfatase A and related enzymes - Prom 34134 - 34193 2.5 27 15 Tu 1 . - CDS 34213 - 35415 1164 ## BT_3348 putative unsaturated glucuronyl hydrolase + Prom 35892 - 35951 4.8 28 16 Op 1 . + CDS 36020 - 37429 786 ## BT_3347 hypothetical protein 29 16 Op 2 . + CDS 37453 - 40536 2685 ## BT_3346 hypothetical protein 30 16 Op 3 . + CDS 40559 - 42598 1667 ## BT_3345 putative outer membrane protein 31 16 Op 4 . + CDS 42627 - 43637 885 ## BT_3344 hypothetical protein + Term 43654 - 43707 -0.9 + Prom 43823 - 43882 3.6 32 17 Tu 1 . + CDS 43943 - 44347 459 ## COG2050 Uncharacterized protein, possibly involved in aromatic compounds catabolism 33 18 Op 1 . - CDS 44353 - 45237 736 ## BT_3342 hypothetical protein 34 18 Op 2 . - CDS 45173 - 47065 1353 ## BT_3341 hypothetical protein - Prom 47085 - 47144 5.6 + Prom 47049 - 47108 5.0 35 19 Tu 1 . + CDS 47252 - 50362 2959 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 50422 - 50472 14.7 + Prom 50437 - 50496 2.5 36 20 Op 1 27/0.000 + CDS 50542 - 51717 911 ## COG0845 Membrane-fusion protein 37 20 Op 2 9/0.000 + CDS 51734 - 54949 3021 ## COG0841 Cation/multidrug efflux pump 38 20 Op 3 . + CDS 54961 - 56346 382 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 39 20 Op 4 . + CDS 56365 - 57135 756 ## COG1043 Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase + Term 57155 - 57205 10.2 - Term 57148 - 57188 6.1 40 21 Tu 1 . - CDS 57196 - 57837 503 ## BT_3335 hypothetical protein - Prom 57902 - 57961 5.5 + Prom 57864 - 57923 5.9 41 22 Tu 1 . 
+ CDS 57949 - 62013 3475 ## COG0642 Signal transduction histidine kinase + Prom 62163 - 62222 4.2 42 23 Tu 1 . + CDS 62313 - 63848 1325 ## COG3119 Arylsulfatase A and related enzymes + Prom 63864 - 63923 4.3 43 24 Op 1 . + CDS 63972 - 67169 3390 ## BT_3332 hypothetical protein 44 24 Op 2 . + CDS 67210 - 68928 1858 ## BT_3331 hypothetical protein 45 24 Op 3 . + CDS 68962 - 70005 990 ## BT_3330 hypothetical protein Predicted protein(s) >gi|226332256|gb|ACIC01000064.1| GENE 1 1129 - 2610 615 493 aa, chain + ## HITS:1 COG:no KEGG:Sdel_1215 NR:ns ## KEGG: Sdel_1215 # Name: not_defined # Def: membrane attack complex component/perforin/complement C9 # Organism: S.deleyianum # Pathway: not_defined # 50 450 27 415 466 117 25.0 1e-24 MKNKLCALSLILLLISCSDEIQQFDDSTKQTQNENKELTTRSLDTDDHALGFGYDVTGDY LDKYSIRNSVIDIVKLREFDKNSIITSNEVSGKNQYYYGYTSYDYIKEITKKTGVNTSVS CDSIPIVKGNLSFTGSFSKENFTRHELSTKYSYASADICKDVRWLRLAESVDVLSNFLTD KFKRDISNMTSPEQFVKDYGTHVLADITIGGVLRVMYSSSIIKEDDQEKKTKVIKAGLKG VLAKIGLSADVDKTVIENVSTSAENINRTLHLEYKGGDGVGGTYNLETGYPTIDKYSWEK SVTAANAGLTKINWDKAYPIYEFISDPVKREQVKKAVIKYINDSQLSVLQLNPIYEMFAP TMDDRFYAFSWDEVQMQINKWNNEFWGFKGYILAKPGVNTIPIYEMFAPTMNDRFYAFSW NEVQWQINDWKNENWGLRGYILSEPEANTQPIYEMYSPLLNDRFYVWSWEEVQMQISKWE NEYWGHRGYVFSK >gi|226332256|gb|ACIC01000064.1| GENE 2 2530 - 2736 84 68 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLVKKESKEGCVKTLAQPYNFLLISNLNSEVFYLHKMGQYLFLLTKYITSMSPILILPLT YLHLNFFP >gi|226332256|gb|ACIC01000064.1| GENE 3 2714 - 3718 371 334 aa, chain - ## HITS:1 COG:SP1365 KEGG:ns NR:ns ## COG: SP1365 COG0463 # Protein_GI_number: 15901219 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Streptococcus pneumoniae TIGR4 # 14 282 6 283 328 142 34.0 8e-34 MVRNMNDMNSLYSVSVIIPVHNTAPYLERCVESVRNQTLKNIEIILVENKSQDNSPVLCD EYARRDPRIKVLHLSIAGLSIARNAGLKIASAPYVGFIDSDDYISETMFQDLLDAISSSQ AEMAYCNFCYEYEDGKIETVYTDSGHTCIRQPKEVIEDIICEKVSSSSCTKLFKKELFAS 
LLFPEGVFFEDHAVLYKWVAMCTKVVWVDKVYYYYYQRDGSICHTLDMNKHYHFFMAEYP RLDFVKEKRLFSEKERGAIIKMIAQTCFYHFSDFMQEAKFLRDRKQIKDMRVRMRKWLSL PSDEIDKKLYKRIRKITYFWPIYYWKHYARKKRE >gi|226332256|gb|ACIC01000064.1| GENE 4 3911 - 4897 -7 328 aa, chain + ## HITS:1 COG:no KEGG:BT_0473 NR:ns ## KEGG: BT_0473 # Name: not_defined # Def: putative glycosyltransferase # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 244 2 244 313 185 39.0 2e-45 MDQNLAIIIPAYKGAYLKRTLDSLVQQTNKNFSVYIGDDCSPENLPDIINQYISLLNITY KKFSDNRGKSDLISQWERCLDMIQNEEYFILFSDDDIMEANCVEMFYQALACNHIYDVYH YNIDIIDQHENITKHCNEYPQILSASEFIRLIYTYQIDARMPEFIFRTSHFNETGRFIKF DLAYRSDNATVLVCAEEKGIFSIEKAKVLWRDSGVNISSTYNISTQKRKAIATIDFFNWL EAYYTSKNQTSPLTLKERLLLTITEISSLSPNYPHSELYKLLPLIRQITHSKTILFKSKC CLFLLIHQKRKTRKLIKKLINIFLKSYK >gi|226332256|gb|ACIC01000064.1| GENE 5 4926 - 6188 512 420 aa, chain - ## HITS:1 COG:MA1180 KEGG:ns NR:ns ## COG: MA1180 COG0438 # Protein_GI_number: 20090046 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Methanosarcina acetivorans str.C2A # 5 415 2 388 388 172 32.0 9e-43 MGKDKIAFVVVRYGADINGGAEYHCRMLAERLVDDYDVEVLTTCVKDYMKGGNDIAEGAE YINRVMVRRFKVDPIRDMSESEYLKRAKPVRRLRMFLYRIRGLRLFSYFIPVWTYYRQEE LESMRRCVFYSSQLHQFIKENKDVYKAFIALTIDYTPFYYTAILAGEKSIAIPTMHYTKI SFRGVLTEAFSKFAYVGFNTKAEQKLGERVFGKALGAHGIISVGIEPIKAANWEETKRKY RLPDKYLLCVGRVEKAKVNDLITCFSNYKQQYKESSLKLVMVGGIFGKVPDAAEVIYTGF VSDEEKIAIIQHAFLVVNPSQYESLSLILLETLNLEIPMLVNGRCAVLKEHCKLSHGAVD YYMNEQQFIRKLNKIESFPVFREQMAKGGKQYVDNNYNWNLILSRLKRVIELVAKSKQRK >gi|226332256|gb|ACIC01000064.1| GENE 6 6175 - 7434 578 419 aa, chain - ## HITS:1 COG:MA1180 KEGG:ns NR:ns ## COG: MA1180 COG0438 # Protein_GI_number: 20090046 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Methanosarcina acetivorans str.C2A # 5 399 2 371 388 188 33.0 2e-47 MKKEKVAFVVVRYGKNINGGAEYHCQMLAERLVSDYDVEVLTTCVRDVATGENIYPEGEE EWNGVVIRRFRTNPVQREKERYFAKRAKPARKLRQFLFKLGILKYLSYLIPVWTYKNDDE 
VQAMKSDKFYSSALNDYIRDHIDEYKAFIAMSSDYVTFYYTALYAGRKTIAIPTMHNMGI SFRSVLTSAFSKIAYVGFNTGEEQRLAENILGKALGVHGILSVGIEESLSADWAMTKEKF QLPEKYLLYIGRITPKKIHRLLTYFVNYKKKYSDSTLRLVLVGGLAMERFEHPDVIYTGF VSDEEKMSILQHADIVVNPSRYESLSLILLEAMSQKKPMLVNGHCKVLKEHCLKSDFASF YYMNKRGFNQALRRIEQSEDLREEMGEKGASYVESNYNWSLIMGRLKSDINFIKENGKR >gi|226332256|gb|ACIC01000064.1| GENE 7 7460 - 8701 514 413 aa, chain - ## HITS:1 COG:MA1180 KEGG:ns NR:ns ## COG: MA1180 COG0438 # Protein_GI_number: 20090046 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Methanosarcina acetivorans str.C2A # 5 401 2 374 388 191 34.0 3e-48 MTKEKIAFIVVRYGKDINGGAEYHCRMLAERLVNDYDVEVLTTCVKNYVTGDNEYPEGEE ILNGVLVRRFYSDPVEPDLHKSYVRQAAPAKKWRHFLYRCRLLKPLSYVKPIWHYKEQEE IKALNSQVFYSSSMYAFIRENKAFYRAFIPLSFFPHTYYTALYAPEKTILIPTMHNNGSS FRSIITSVFSEVAYIGFNIEAEQKLVENIVGKPLAAHGIISVGIEKTKAADWEQTKVKYN LPEDYLLYVGRIDAIKLNNIVEYFLSYKKKYVDSRLKLVLVGGIFGKTVEHPDIIYTGFV DDAEKVAIQLNAKVIVNPSRYESLSLILLETMSEGKAMLVNGRCNVLREHCEKSNYAALY YMNRRDFMRKLHHLENSETLRQQMGEKGRHYVQENYNWEMIIGRMKNVIQMLS >gi|226332256|gb|ACIC01000064.1| GENE 8 8703 - 9713 617 336 aa, chain - ## HITS:1 COG:SP1771_1 KEGG:ns NR:ns ## COG: SP1771_1 COG0463 # Protein_GI_number: 15901601 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Streptococcus pneumoniae TIGR4 # 10 221 7 224 259 153 40.0 4e-37 MNTNISGPLISLIVPVYNVKDYLKTCLQSILEQTYKNLEIILVDDGSDDGSSGICDEYAR MDQRIKTIHLPHSGVSAARNAGLAAATGELLGFVDSDDWIDHDMYQYLYTLMQEHDADVS ACTYLLEQEGRPSKIINNTGKLYVFSRKEIIRALVKNDLVKNYLWAKLFKRKLFDRLSFP VGRVYEDVAVLYKVFYSSQKVVLSCVSKYHYMIHKNESITRGGYDPVKEYHYFLSLYEQD KFIQNAELGGESSVGVLKRGIHLINHTLLCPPSPAFEDIIDETMLKMREYRHITPWQLGP SMSIKRYFMCKHFGFYRMLYRGYRAIFKRKHKLLID >gi|226332256|gb|ACIC01000064.1| GENE 9 9876 - 11915 1094 679 aa, chain - ## HITS:1 COG:Cj1135 KEGG:ns NR:ns ## COG: Cj1135 COG0463 # Protein_GI_number: 15792460 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
Glycosyltransferases involved in cell wall biogenesis # Organism: Campylobacter jejuni # 91 349 259 514 515 95 28.0 3e-19 MIVAFENNVVCSDEKVRDYLLAHHADLKEDQDEDALCLVRLHKEEDIDGTDRVDLAGWRE ISRELYWAGEQMEYNYSIIRFSRKTTSLQMSVVLSTCNQLEWLEKVLWGYEAQDTKNFEL IVADDGSRKETYDMLQRITPQLSFQVKHVWHEDKGFRKCDILNKGILAAQADYLLFSDGD CIPRKDFVSTHLRLRRKGRFLSGGYHKLSMDLSKDITKDDILSGRCFDLQWMRGKGMPAS FKNNKLTATGFKRWALNTFTPTKASWNGHNASGWLSDILAVNGFDERMQYGGQDREFGER LENYGIHGMQIRYSTVCLHLDHARGYKTKDSIQKNRNIRKHTRGAKVQWASLGIVKDVLR GQSVKVNSYYDRYTREEEKLTSYKEKGGFYRHIYSLPCRWRRAKYHDKVVRAYQQDTDAP ALSNHSGVIVSLTTFPPRISQLHLMLKSILWQTCPPEKIIVWLSELEFPGRLNDLPEELK TLMAKGIEFRFVSENFRSHKKYHYVFREYPDSKVITVDDDLIYPRNTVERLLTLSYQYPD TVCGNVIRKIHMDGNSFSVYRKWAKVFTMPVNSSLQNVAIGCGGIYYPPHWYGEELFDWK IISEHCPSADDLWLKANELKRRVKVTGGGEFYPRPIELPQTQNNSLQKKNNGKTNLNDKQ WKSLNELWKLDELYCINGK >gi|226332256|gb|ACIC01000064.1| GENE 10 11936 - 12670 314 244 aa, chain - ## HITS:1 COG:no KEGG:BT_3364 NR:ns ## KEGG: BT_3364 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 244 4 247 247 506 99.0 1e-142 MINPKYEYLREYVERIPKDFETIGTVIHSGRNLIKMITVDGLDINVKRYTIPPLINRIAY AFFRPSKGKRAFVYPEKLLEKGFETPCPIAYIEETKMGLIGHSYFMSIQSPYRYNFCQFG NADIKSCEDVVTAFAEFTARLHEAGILHLDYSPGNILYDKIGEEYHFSLVDINRMHFGEV DIKKGCANFARLWGQTPFFILLGKEYARSRGMDEEECVRLILHYRKKFWNRYRRRHQVWF QLDI >gi|226332256|gb|ACIC01000064.1| GENE 11 12720 - 13694 509 324 aa, chain - ## HITS:1 COG:no KEGG:BT_3363 NR:ns ## KEGG: BT_3363 # Name: not_defined # Def: lipopolysaccharide core biosynthesis protein LpsA # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 324 1 324 324 642 99.0 0 MKNKLLYKLRSGKNPKFIYYSVNALRLIIPKGIFRLRLQGKLSSLSRRKDKEYIEHRVDY YNKLSGTVQLPSSAPHLSEHKMSKQKVYFFDTYQYTRWFSDQFQWGFCPGDVTFVPDYPS IVKSRPLTDDNVNSIVMKLDKVRHFIFVDDKKAFTEKKNMVIFRGKVKGKPSRKLFMEMY FHHPMCDLGDVSKNTTDPAEWRTEKKTINEHLDYKFIMALEGIDVASNLKWVMSSNSIAV MPRPTCETWFMEGTLIPNYHYIEIKPDFSDLEERLNYYIEHVDESLEIIRHAHEYVSQFK DKRRENLISLLVLDKYFKMTGQKS >gi|226332256|gb|ACIC01000064.1| GENE 12 13726 - 14757 
764 343 aa, chain - ## HITS:1 COG:FN0546 KEGG:ns NR:ns ## COG: FN0546 COG0859 # Protein_GI_number: 19703881 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Fusobacterium nucleatum # 3 339 6 332 335 91 25.0 2e-18 MARILIIRFSALGDVAMTIPVVHSLAVQYPQLEITVLSRAVWQPLFQGLPANVSFIGADL TGKHKGLWGLNTLYSELKAKSFDYVADFHHVLRTKYLCLRFRLANIPVASIYKGRVGKEK LVRRRHKVLENQKSSFRRYADVLEKLGFPVLLNFSSIYGDEKGNFAEIESVTGAKDNLKW IGIAPFAKHMGKIYPIELQEQVVAHFAADPKVKVFLFGGGKSEQEVFDGWVAKYPTVVSM IGKLNMRTELNLMSHLDVMLSMDSANMHLASLVNIPVISVWGATHPYAGFMGWKQLPVNT VQLDLSCRPCSVYGQKPCWRGDYACLREIKPEQVIAKIEGIIN >gi|226332256|gb|ACIC01000064.1| GENE 13 14761 - 15366 596 201 aa, chain - ## HITS:1 COG:no KEGG:BDI_2820 NR:ns ## KEGG: BDI_2820 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 201 1 201 201 292 74.0 5e-78 MTFSNLCNDIFWKSTTDYHVTDSVDAPMNNPYELKSIEYYLYLKNWIDAVQWHFEDIIRD PHIDPVAAVALKRRIDKSNQDRTDLVELIDSYFLDRYKDVKPLADATINTESPAWAVDRL SILALKIYHMQQEVERTDTTEEHRAQCQVKLDILLEQRKDLSSAIDQLLADIEAGKKYMK VYKQMKMYNDPALNPVLYAKK >gi|226332256|gb|ACIC01000064.1| GENE 14 15484 - 16530 814 348 aa, chain + ## HITS:1 COG:STM2370 KEGG:ns NR:ns ## COG: STM2370 COG0111 # Protein_GI_number: 16765697 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Salmonella typhimurium LT2 # 1 337 1 328 378 244 38.0 2e-64 MKIIIDDKIPYIKEAAEKIAEEVIYAPGKDFTRELVQDADALIIRTRTHCNRELLEGSKV KFIATATIGFDHIDTEYCKQAGIEWANAPGCNSASVAQYIQSSLLIWKSLRNKKPDELTI GIIGVGNVGSKVAKVAQDFGMRVLLNDLPREEKEGNIAFTSLEKIAEECDIITFHVPLYK EGKYKTYHLADGNFFRSLQRKPVVINTSRGEVIETNALLEAINNGTILDAIIDVWEHEPE INRELLEKVLIGTPHIAGYSADGKANATRMSLDSICRFFHLNATYEITPPTPSSPLIEAK DREEALLKMYNPIEDSNRLKSHPELFETLRGDYPLRREEKAYNIIGIK >gi|226332256|gb|ACIC01000064.1| GENE 15 16603 - 17229 303 208 aa, chain - ## HITS:1 COG:MA0316 KEGG:ns NR:ns ## COG: MA0316 COG0299 # Protein_GI_number: 
20089214 # Func_class: F Nucleotide transport and metabolism # Function: Folate-dependent phosphoribosylglycinamide formyltransferase PurN # Organism: Methanosarcina acetivorans str.C2A # 19 195 8 187 204 137 44.0 1e-32 MQSFAHFSLFCALNSSIMKKNIAIFASGSGSNAENLIRYFQKSDSVEVSLVLSNKSDAYV LERAHRLKVPCNVFPKEDWIAGDEILAILQEYRIDFIVLAGFLVRVPDLLLHAYPDKIIN IHPALLPKFGGKGMYGDKVHQAVVAAGEKETGITIHYINEHYDEGNIIFQATCPVLPDDS PEEVAKKVHALEYEHFPHVVEETISIKY >gi|226332256|gb|ACIC01000064.1| GENE 16 17328 - 17564 418 78 aa, chain + ## HITS:1 COG:SMc00573 KEGG:ns NR:ns ## COG: SMc00573 COG0236 # Protein_GI_number: 15964896 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Sinorhizobium meliloti # 1 75 1 75 78 78 64.0 3e-15 MSEIASRVKAIIVDKLGVEESEVTNEASFTNDLGADSLDTVELIMEFEKEFGISIPDDQA EKIGTVGDAVSYIEEHAK >gi|226332256|gb|ACIC01000064.1| GENE 17 17583 - 18845 1377 420 aa, chain + ## HITS:1 COG:BS_yjaY KEGG:ns NR:ns ## COG: BS_yjaY COG0304 # Protein_GI_number: 16078199 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Bacillus subtilis # 1 418 1 411 413 429 54.0 1e-120 MELKRVVVTGLGAITPVGNDVPEFWENLVNGVSGAGPITHFDASQFKTQFACEVKNFDVT KYIDRKEARKMDLYTQYAVAVAKEAVSDSGLDVEKEDLNRIGVIFGAGIGGIHTFEEEVG NYYTHQEIGPKFNPFFIPKMISDIAAGQISIMYGFHGPNYATCSACATSTNAIADAFNLI RLGKANVIVSGGSEAAIFPAGVGGFNAMHALSTRNDEPSKASRPFSASRDGFIMGEGGGC LILEELEHAKARGAKIYAEVAGVGMSADAHHLTASHPEGLGAKLVMLNALEDAEMDPKEV DYINVHGTSTPVGDISEAKAIKEVFGDHAYELNISSTKSMTGHLLGAAGAVESIASILAI KNGIVPPTINHEEGDNDENIDYDLNFTFNKAQKREVNVALSNTFGFGGHNACVIFKKYAE >gi|226332256|gb|ACIC01000064.1| GENE 18 18772 - 19785 810 337 aa, chain + ## HITS:1 COG:SA1076 KEGG:ns NR:ns ## COG: SA1076 COG0571 # Protein_GI_number: 15926816 # Func_class: K Transcription # Function: dsRNA-specific ribonuclease # Organism: Staphylococcus aureus N315 # 53 267 24 240 243 
105 36.0 2e-22 MLPCPTHLDSVVTMRALSSRNTLSKIVLRNQIDKIRLLFRKDRESYLCFYRILGFYPHNI QIYEQALLHKSSAVRSEKGRPLNNERLEFLGDAILDAIVGDIVYKRFEGKREGFLTNTRS KIVQRETLNKLAVEIGLDKLIKYSTRSSSHNSYMYGNAFEAFIGAIYLDQGYERCKQFME QRIINRYIDLDKISRKEVNFKSKLIEWSQKNKMEVSFELIEQFLDHDSNPVFQTEVRIEG LPAGTGTGYSKKESQQNAAQMAIKKVKDQTFMDTVNEAKSQHSKPSEVETESVEPELTES ETMEPDTLETEAPEAETTADKVETTVNEVEATETEKE >gi|226332256|gb|ACIC01000064.1| GENE 19 19829 - 20839 872 336 aa, chain - ## HITS:1 COG:Cgl1221 KEGG:ns NR:ns ## COG: Cgl1221 COG0205 # Protein_GI_number: 19552471 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Corynebacterium glutamicum # 1 318 4 333 346 277 46.0 2e-74 MRIGILTSGGDCPGINATIRGVCKTAINHYGMEVIGIHSGFQGLLTKDIESFTDKSLSGL LNQGGTMLGTSREKPFKKGGVVSDVDKPALILQNIQEMGLDCVVCIGGNGTQKTAAKFAA MGVNIVSVPKTIDNDIWGTDISFGFDSAVSIATDAIDRLHSTASSHKRVMVIEVMGHKAG WIALYSGMAGGGDVILVPEIPYNIKNIGNTILERLKKGKPYSIVVVAEGILTDGRKRAAE YIAQEIEYETGIETRETVLGYIQRGGSPTPFDRNLSTRMGGHATELIANGQFGRMIALKG DDISSIPLEEVAGKLKLVTEDHDLVIQGRRMGICFG >gi|226332256|gb|ACIC01000064.1| GENE 20 20977 - 22488 1146 503 aa, chain + ## HITS:1 COG:no KEGG:BT_3355 NR:ns ## KEGG: BT_3355 # Name: not_defined # Def: putative auxin-regulated protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 503 1 503 503 1042 100.0 0 MNITKIISKTFDSRLKQIDLYASQASEIQHRVLSRLVNQAAQTEWGKKYDYASIRSYEDF RNRLPIQTYEEIKPYVERLRAGEQNLLWPSEIRWFAKSSGTTNDKSKFLPVSKEALEDIH YRGGKDAAALYFRINPDSHFFSGKGLILGGSHSPNLNSNHSLVGDLSAILIQNVNPLINF IRVPSKKIALMSEWETKIEAIANSTIPVNVTSLSGVPSWMLVLIKRILEKTGKQTLEEVW PNLEVFFHGGVAFTPYREQYKQVIHSKMMHYVETYNASEGYFGTQNDLSDPAMLLMIDYG IFYEFVPLEEVDKENPRAYCLEEVELNKNYAMVISTSCGLWRYMIGDTVKFTNKNPYKFV ITGRTKHFINAFGEELIVDNAEKGLAKACAETGAQVCEYSAAPVFMDEHAKCRHQWLIEF AKMPDSVEKFAAILDATLKEVNSDYEAKRWKDIALQPLEVIVARPGLFHDWLARKGKLGG QHKVPRLSNTREYIESMLVLNNE >gi|226332256|gb|ACIC01000064.1| GENE 21 22695 - 23804 785 369 aa, chain - ## HITS:1 COG:CAC2233 KEGG:ns NR:ns ## COG: CAC2233 COG0482 # Protein_GI_number: 15895501 # Func_class: J 
Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Clostridium acetobutylicum # 4 367 2 354 355 234 37.0 2e-61 MEENKRVLLGMSGGTDSSVAAMRLLEAGYEVTGVTFRFYELNDSTEYLEDARHLAERLGI RHITYDAREIFRKQIIEYFVHEYMVGHTPVPCTLCNNYLKWPLLSKIADEMGIFYIATGH YVRKVKVDDTCYITYAADSDKDQTFFLWGLKQDILRRMLLPMGDITKVEARAFAAERGFQ KVAVKRDSLGVCFCPMDYRSFLKMWLVSNCQPQVSVGQPQVSAGQTWSAEVRRGRFVDEK GDFIAWHEGYPFYTVGQRRGLGIHLNRAVFVKEIRPEKNEVVLASLQALEKTEMLLKDWN IVNRERLLGHADIIVKIRYRKQENHCTVTVTLDNFLHVQLHEPLTAIASGQAAAFYKDGL LLGGGIIVE >gi|226332256|gb|ACIC01000064.1| GENE 22 23860 - 26943 1974 1027 aa, chain + ## HITS:1 COG:MA0189 KEGG:ns NR:ns ## COG: MA0189 COG0553 # Protein_GI_number: 20089087 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Methanosarcina acetivorans str.C2A # 472 1026 567 1069 1078 291 32.0 4e-78 MEEKANTGQVIIVLTEHLTFGTLLIPYMAEKSDDGTYQLIEQAFHASPEAISRMNEAEQQ AIDIASHYTEKYLMGIYSREKTVSRFLRKLSEDPERVKNNIRPFIEKKMQEMLTLIRHHN LPLYQKQVGSKLLYAHHAYHVHPHDVEIRFTFLADETNFRYQLQCYYDGQPLSLSEQKPV IVLTSSPSALLLGMELYFFPHIESVRILPFTKKKKISVPASQIEKYIDNIVIPIARYHEI TTQGLNIIEETYACEAVLSLEDTIYDEQMLRLSFRYGDQSFTPDTVTEMKKIIYRNDFGE IFFFRRDSDAEEEKLRMLTDSGLQRVSDVHFKLSPDAPEKTITEWISTYRKMLQQSFLLT GNMGNTPYCLDEIRIEQSCDDEPDWFELHITVVIGNLRIPFSRFRKHILEEKREFLLPDG RMILLPEEWFSKYGNLLELGTQTETGIRLKPTFVGAVQSALEENGPKDLPFKREIRNVPV PQGLKAKLRPYQQKGFSWMVQLNKQGFGGCLADDMGLGKTLQTLTLLQYIYKPLSPDAVT APTKPVFTESTASDNQPDKECTAGNPPAKEQIADAEGQFSLFSFSSEDELLPDAREIRKK NRPKEDGHRQEHGKNSCKPATLIVVPTSLLHNWRREAKRFTTLAIAEYNSNTAFPKGHPE KFFNHFHLIFTTYGMMRNNIDILRSYTFEYIVLDESQNIKNNDSLTFRSVIQLQGKHRIA LTGTPIENSLKDLWAQFRFLQPDLLGEENAFHKQFIIPIRQGNVRMEKRLQQIIAPFILR RSKSEVAPELPPLTEETIYCAMSEKQSESYEQEKNSLRNILLQQPENRDRYHMFSILNGI LRLRQLACHPQLIFPDFDGISGKTEQIIDTFDTLRSEGHKVLIFSSFVRHLEILAEVFRQ RGWKYALLTGSTNNRPSEIAHFTEQKDVQAFLISLKAGGVGLNLTQADYVFIIDPWWNPA 
AESQAIARAHRIGQDKQVIAYRFITQNSIEEKILQLQEDKRRLAETFITDSEALPALSNE QWADLLK >gi|226332256|gb|ACIC01000064.1| GENE 23 27059 - 27871 769 270 aa, chain + ## HITS:1 COG:VC1364 KEGG:ns NR:ns ## COG: VC1364 COG0561 # Protein_GI_number: 15641376 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Vibrio cholerae # 3 265 2 264 273 172 35.0 8e-43 MKYKLIVLDLDGTLTNSKKEISSRNRETLIRIQEQGIRLVLASGRPTYGIVPLANELRMN EFGGFILSYNGGEIINWETKEMMYENVLPNEVVPVLYECARTNHLSILTYDGAEIVTENS QDPYVQKEAFLNKMAIRETNDFLTDITLPVAKCLIVGDAGKLIPVESELCIRLQGKINVF RSEPYFLELVPQGIDKALSLSVLLENIGMTREEVIAIGDGYNDLSMIKFAGMGIAMGNAQ EPVKKAADYITLTNDEDGVAEAIERIFNVP >gi|226332256|gb|ACIC01000064.1| GENE 24 27975 - 29456 1772 493 aa, chain + ## HITS:1 COG:DR1670 KEGG:ns NR:ns ## COG: DR1670 COG0215 # Protein_GI_number: 15806673 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Cysteinyl-tRNA synthetase # Organism: Deinococcus radiodurans # 5 490 52 531 532 457 48.0 1e-128 MEHQLTIYNTLDRKKELFVPLHAPHVGMYVCGPTVYGDAHLGHARPAITFDVLFRYLTHL GYKVRYVRNITDVGHLEHDADEGEDKIAKKARLEELEPMEVVQYYLNRYHKAMEALNVLS PSIEPHASGHIIEQIQLVQKILDAGYAYESEGSVYFDVAKYNKDYHYGKLSGRNLDDVLN TTRDLDGQSEKRNPADFALWKKAQPEHIMRWPSPWSDGFPGWHAECTAMGRKYLGEHFDI HGGGMDLIFPHHECEIAQSVASQGDDMVHYWMHNNMITINGTKMGKSLGNFITLDEFFNG THKLLAQAYTPMTIRFFILQAHYRSTVDFSNEALQASEKGLQRLIEAIEGLDKITPAATT SEGINVKELRAKCYEAMNDDLNTPIVIAQLFEGARIINNINAGNATISAEDLKDLKETFH LFCFDIMGLKEEKGSSDGREAAYGKVVDMLLEQRMKAKANKDWATSDEIRNTLTALGFEV KDTKDGFEWRLNK >gi|226332256|gb|ACIC01000064.1| GENE 25 29580 - 32441 1947 953 aa, chain - ## HITS:1 COG:no KEGG:BT_3350 NR:ns ## KEGG: BT_3350 # Name: not_defined # Def: putative chondroitinase (chondroitin lyase) # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 953 1 953 953 1981 99.0 0 MKAVCFIFLFLFCSLSSYAQVIGFEEKVPETFKVSGKGEVKLSSLFYKEGESSLEWDFQP ASTLDVQIEPLSLNAKKEQQFGITLWIYNEKPQQDSIRFEFLNKAGEVSYWFTYHLQAAG WRACWISFAYMNGDKKDKNIVSYRLVAPDRKGRIFLDRLTFPEKKMNLRTTPDQQLPANN 
GLSNRDLWHWCLVWKWEQQSYDVPLLSKLTTRQKKDLKTIEQRLTEFLDVKKAPQGQINA ARKTFERAAIAPSAAGTGFTGTPVVAPDEQDKKKGEMSWNDIETMLAGFAYDAFYNQNET SKKNYFTVFDYAMDQGFAFGSGMGTNHHYGYQVRKIYTTAWLMRDAIYKHPHRDAYLSTL RFWAALQETRQPCPPERDELLDSWHTLLMAKFISAMMFPDAREQEQALNGLSRWLSSSLN YTPGTLGGIKVDGTTFHHGGFYPGYTTGVLATIGQFIALTNGTGFELTEEARQHIKSAFL AMRNYCNLYEWGTGISGRHPFGGKMGSDDIEAFANIALSGDLSGQGNTFDHGLAADYLRL IRDRNTRNAHFFRKEGIQPAQAPHGFFVYNYGSAGIFRRADWMVTLKGYTTDVWGAEIYA KDNRYGRYQSYGSVQIMGKGNPVSRAGSGFVQEGWDWNRLPGTTTIHLPFELLDSPLKGT TMARSTENFSGSSSLGGMNGMFAMKLTERDYENFTPDFVARKSVFCFDNRMICLGTGITN SNADYPTETTLFQTKFNGKEQKTGKDNYWLHDGYDNYYHVVDGTLRSQIAEQESRHEKTR EKTTGTFSSAWIEHGNAPKNATYEYMVLIQPSASDLDELKKTPAYEVWQRDQTAHIVYDR KTGITGYAVFEDYRSADDKLVVASIPAETMVMYAAEGKKAIRLSVCDPNLNIAEKTYTTK EPSRPVRKIIELKGRWSFLETPANVRLEYKGAHTILDVTCQHGQPVEMILENK >gi|226332256|gb|ACIC01000064.1| GENE 26 32580 - 34106 1223 508 aa, chain - ## HITS:1 COG:YPO0829 KEGG:ns NR:ns ## COG: YPO0829 COG3119 # Protein_GI_number: 16121138 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Yersinia pestis # 6 504 28 511 517 312 39.0 9e-85 MGGLTLFAAQGCKAPKQVAEQAEHPNIIYVFPDQYRNQAMGFWNQEGFRDKVNFRGDPVH TPNIDTFARESMVLTSAQSNCPLSSPHRGMLLTGMYPNRSGVPLNCNSTRPISSLRDDAE CIGDVFSKAGYDCAYFGKLHADFPTPNDPENPGQYVETQRPVWDAYTPKEQRHGFNYWYS YGTFDEHKNPHYWDTDGKRHDPKEWSPLHESGKVVSYLKNEGNVRDTKKPFFIMVGMNPP HSPYRSLNDCEEQDFNLYKDQPLDSLLIRPNVDLNMKKAESVRYYFASVTGVDRAFGQIL EALKQLGLDKNTVVIFASDHGETMCSQRTDDPKNSPYSESMNIPFLVRFPGKIQPRVDDL LLSAPDIMPTVLGLCGLGDSIPSEVQGRNFAPLFFDEKAEIVRPAGALYIQNLDGEKDKD GLVQSYFPSSRGIKTARYTLALYIDRKTKQLKKSLLFDDVNDPYQLNNLPLDENKEVVEQ LYREMGTMLKEIDDPWYTEKILSDRIPY >gi|226332256|gb|ACIC01000064.1| GENE 27 34213 - 35415 1164 400 aa, chain - ## HITS:1 COG:no KEGG:BT_3348 NR:ns ## KEGG: BT_3348 # Name: not_defined # Def: putative unsaturated glucuronyl hydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 400 1 400 400 811 99.0 0 MKTILSALGLSLLIFTSCGGQKKVEVDFIQDNIDNAVAQNTIQTDIIEKSGKILNPRTIN 
ADGSISYIPIDDWCSGFFPGSMWLTYNLTGDKKWLPLAEKYTEALDSVKYLKWHHDVGFM IGCSYLNGYRFADKKEYKDVIVEAAKSLSTRFRPGAGVIQSWDADKGWQGTRGWKCLVII DNMMNLELLFEATAFSGDSTFYNIAVKHADTTMAHHFRPDNSCYHVVDYDPETGEVRKRQ TAQGYADESSWARGQAWALYGYTACYRYTKDKKYLDQAQKVYNFIFTNKNLPEDLVPYWD YDAPNIPNEPRDASAAACTASALYELDGYLPGNHYKETADKIMESLGSPAYRAKVGTNGN FILMHSVGSIPHGQEIDVPLNYADYYFLEGLMRKRDLEKK >gi|226332256|gb|ACIC01000064.1| GENE 28 36020 - 37429 786 469 aa, chain + ## HITS:1 COG:no KEGG:BT_3347 NR:ns ## KEGG: BT_3347 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 469 1 469 469 953 99.0 0 MKNHLITVLVCCLFFIGCKDDKANSNQYDPSQPVVFTDFSPKEGGMRTRLYIEGSNFGND PTKIHITIGGEKTKTIGADGKKIYCMVPSKAFDGIINVRVDGPDGKPIAEHTFEEEFKYT PATIVGTLLRKVDEDNNSAFQEGSFDEGASIPSNDCMVFDPKYKDGDNRLLFSSNHFDGL HLIDLTERTVKRLFPRTGYSTMYSFTFSADGDTLLFTDDHGQDNTTRANIYYSLRRENFR RIRPYNYGRTAYSLVYMDDGTIFYTTWWEGKVYKMIRNGAIPNIDENAEVAFSLSQISAV GGSHTILFRHPSNKFMYMLSDNFGAVFRADYDPVAKTFGNPSIIAGNMNDKSFKEGTGGS ARFNKPQFGVFVKNEKYGEGIGPDNDQYDFYFCDRENHCIWKLDPHGVASLAAGRSNENA DGKIWGYVDGNPLHEARFNQPAGLAYDPDTDMFYIGDIDNKGIRYMTTE >gi|226332256|gb|ACIC01000064.1| GENE 29 37453 - 40536 2685 1027 aa, chain + ## HITS:1 COG:no KEGG:BT_3346 NR:ns ## KEGG: BT_3346 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1027 1 1027 1027 2009 100.0 0 MRNVILLFLLIVGWTTTVYAQNEQETIEVTGIVTDPKNEPLIGVNVTARNMPGFGAITDI NGRYKIKVPNYSYLIFSYVGFTKQEIFIKDKKIINIVMAETKETVIDEVVVTGTGAQKKV TVTGAVTTVDVSQLRTPTASITNALAGNVAGILARQTSGQPGSNISEFWIRGISTFGAGG SALVLVDGFERDMNEINVEDIQDFSVLKDASATAIYGSRGANGVVLITTKRGKEGKTRVN AKVETSYNTRTRTPEFVDGPTYARMMNEALVSRSQAPAYSESDLELFANGLDQDLFPNVD WMNLILKDGAPTYRATVDLTGGGTVARYFISASYVNEGGMYETDRAMTDYNTNANYHRWN YRMNVDIDLTKSTLLKVGVSGSLDKQNLPGSQYHEIWHSLMGYSPIASPVQYSDGKWAAV GSEGRRNPWVLTTQQGYQETWKNKIQTTVNLEQDLKFITKGLKFYGRFGFDTYTDNGDNR FKWPESWLAERQRTSDGELQFRRIETQKLMDGTMYSSGQRKEYLEAELHYNRTFGDHMLG AVLKYSQDKTTNTSHNGNTDSYEKIVQSIQNRHQGFAGRFTYGWKYRYFFDFNFGYNGSE 
NFASGQQFGFFPAYSVAWNIAEESIIKKHLKWLNMFKLRYSYGKVGNDNVGTRFPYVEKF GTWDEDGYYYGDIGTSNYYYTGLTYSRIASSSITWEVAKKHDLGLDFSMFNDKFSGSIDY FHEQRNGIYMVRSYLPQIIGLNHITTKPYANVGSVLSEGFDGNIAYKQRIGEVDLTVRAN MTYSKNEIKEYDEENSRYPYKMKYGFRVDQARGLIAEGLFKDYEEIRNRPSQGSGIMPGD IKYKDVNGDGVINGDDEVPIGATTRPNLIYGFGVSAQWKEFDINVHFQGAGKSSFFIDGF TVYPFQEKSWGNILTDVVGNYWSLGTNEDPNAKYPRLSYGGNSNNYRASTYWLRNGSYIR LKNVELGYNIPKRFVNKLHLDRVRLYLMGTNLLTFSDFKLWDPELGSSNGQQYPLSRTVT IGLTVGI >gi|226332256|gb|ACIC01000064.1| GENE 30 40559 - 42598 1667 679 aa, chain + ## HITS:1 COG:no KEGG:BT_3345 NR:ns ## KEGG: BT_3345 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 679 1 679 679 1370 100.0 0 MNKNKYILLGAVVLGLGLTSCSDFLNVDRYFRDQQSTERIFSDKDYTLQWLSFCYSRLQG DNLEVGHSDVCPFNFSDDQVFNERGDRFAKFKRGEYLNSTGGQNAWKWSYDGIYQASILL NELYENEDLTPEEVTDVRGQARFLRAYFYWLLLRKFGPIPILPPEGADYTKSYDELAYPR KTYDECVSFITSELEIAATELFEKRDNLNIARPTKGAALAVRAKVFLYAASPLVNGNTEM ADFTNKDGQQLIPQEYNEEKWAKAAASARDMIEYSEMSGLYKLYTFERRPVSTDEAYPTT IEPPYHEEYSNKPFPEGWSNIDPFESYRSLFNGDIYAAENPELIFTRGTNGDSNDLKTDN TMVDFVKHQLPGTFGGYNVHGMTLKQSEAYAMADGTSFDKECYTLWKGKFTNDENKDEHK YDNVKNGVWWGYTNREPRFYASVAFNGAQWNALSIKEEGGKDSRNKQIWYYRGATDGRIN GSDNWCITGIGIMKYVNPNDCAKWGGSIYQKVEPTLRYADILLMYAEALNNISEGTHYQV TSWDGSQTYDIFRDKEQMRRGVKPVRMRAGVPDYSDEVYENPKKFFEKIVHERQIEFFAE TQRYYDLRRWKIVEEHEGEQIYGCNTLMNEEYKDMFYLPVRVAELQTSFSRKQYFWPISF DELKRNNNLSQAPGWQNYD >gi|226332256|gb|ACIC01000064.1| GENE 31 42627 - 43637 885 336 aa, chain + ## HITS:1 COG:no KEGG:BT_3344 NR:ns ## KEGG: BT_3344 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 336 1 336 336 650 100.0 0 MRKLYILGMILGMYATFSACNDEWKDELYTQMLSFKAPLGDNGLSEIYVRYQPDGTGIYQ LPVIVSGSKSNSRNIDAKIAVDEDTLRILNQEKFPVGRQDLWYVPLAEQHYSFESNICHI PAGENIQLYPIHFNFNGLELDEKYVLPLTIEEDPSYVQNKYKGRYKALLGVNLFNDYSGT YNTTLMNIYIEGTTTDPAKVDTRLARVVDENTIFFYAGSTWVEDENRSRYKVFVEFEEGT EDEEGTIKGKLRLYGDEGEPGINIRPLGECKYEKRIIQHETIKYMERHLTTLELSYKYTD 
VTSNPDYPITYEVKGSMTMERQINPLIPDEDQAIEW >gi|226332256|gb|ACIC01000064.1| GENE 32 43943 - 44347 459 134 aa, chain + ## HITS:1 COG:MA0735 KEGG:ns NR:ns ## COG: MA0735 COG2050 # Protein_GI_number: 20089620 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized protein, possibly involved in aromatic compounds catabolism # Organism: Methanosarcina acetivorans str.C2A # 4 133 16 144 146 115 49.0 2e-26 MTPQEFFKKDLFAENAGVVLLEVREGYSKAKLEIKPEHLNAGARTQGGAIFTLADLALAA AANSHGTLAFSLSSSITFLRASGPGDTLYAEARERYIGRSTGCYQIDITNQNGDLIATFE SSVFRKDQKVPFTL >gi|226332256|gb|ACIC01000064.1| GENE 33 44353 - 45237 736 294 aa, chain - ## HITS:1 COG:no KEGG:BT_3342 NR:ns ## KEGG: BT_3342 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 36 294 1 259 259 531 100.0 1e-149 MKTRKRKKEKEIEITKTVTVANLRRHLIVGILALLMGSFALPANAQCEAKNDAFQSGEHV MYDLYFNWKFVWVKAGIASLTTNATTYHSEPAYRINLLALGSKRADFFFKMRDTLTCVMG EKLEPRYFRKGAEEGKRYTVDEAWFSYKDGLCFAKQKRTFRDGEVQESEESDSRCIYDML TILAQARSYDPADYKVGDKIKFPMATGRKVEEQTLIYRGKENVKAENGVTYRCLIFSLVE YDKKGKEKEVITFFVTDDKNHLPVRLDMFLNFGSAKAFLNDVRGHRHPLTSIVK >gi|226332256|gb|ACIC01000064.1| GENE 34 45173 - 47065 1353 630 aa, chain - ## HITS:1 COG:no KEGG:BT_3341 NR:ns ## KEGG: BT_3341 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 604 13 616 642 1152 99.0 0 MRKALPVIALVAILYSCASIGRPEGGPKDYDPPRFVGSTPAAGAINNKRTKVSLQFDEFI KLEKATEKVVVSPPQIQQPEIKASGKKIQVNLLDSLKPNTTYTIDFSDAIVDNNEGNPLG NFAFTFSTGAQIDTMEVSGTVLDASNLEPIKGILVGLHANLNDSAFTKLPFDRVARTDSR GRFSIRGVAPGKYRIFGLMDSDQNFAFTQKSEVIAFNDSLIIPRMEERLRMDTAWVDSLT YDTIVEKKYMHYLPDDVILRAFKELNYSQYLIKSERLVPQKFTFYFAGKADTLPVLKGLN FDEKEAFVIEKNQRNDTIHYWVKDSLLYKQDTLALSLTYLYTDTLNQLIPRTDTLKLVSK QKTLTKEEPEKKKKKKKKEGEEDEPIPTKFLPVNVNAPSSMDVYGYVSLNFEEPIASFDT AAIHLRQKVDTIWKDIPFEFEQDSLNLRRFNLYPENDWEPTMEYEFSVDSTAFHGIYGLF TDKIKQSFKVRSEDEYFTLHFIVTGADPQAFVELLDTQDKVVRRRLVEDGRADFYYLNPG 
KYAARLINDRNGNGEWDTGDYAKGIQPEEVYYFPKVVEYKALWDVDQNWDIHATPVDKQK LDELKKQKPDEDKKKKERERDRNNQNRNRR >gi|226332256|gb|ACIC01000064.1| GENE 35 47252 - 50362 2959 1036 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 26 1028 7 983 1087 699 38.0 0 MKKQLLSCCLAALGLTTAIQAQNFNEWKDPEVNSVNRSAMHTNYFAYASADEAKAGSKED SQNFMTLNGLWKFNWVRNADARPTNFYQTSFNDKGWDNIKVPAVWELNGYGDPIYVNVGY AWRNQFQNNPPLVPTENNHVGSYRKEIVLPADWKGKDIFAHFGSVTSNMYLWVNGRYVGY SEDSKLEAEFDLTNYLKPGKNLIAFQVFRWCDGSYLEDQDFFRYSGVGRDCYLYARDKKR IQDIRVTPDLDSQYKDGTLNIAIDMKGSGTVALDLTDAQGKSVATADLKGSGKLDTTINV ANPAKWTAETPNLYTLTATLKNGSTITEVIPVKVGFRKIELTGGQILVNGQPVLFKGADR HEMDPDGGYVVSLERMIQDIKVMKQLNINAVRTCHYPDDNRWYDLCDQYGLYVVAEANVE SHGMGYGDKSLAKNPIYAKAHMERNQRNVQRGYNHPSIIFWSLGNEAGMGPNFEHCYTWI KNEDKTRAVQYEQAGTSEFTDIFCPMYYDYNNCIKYCEGNIDKPLIQCEYAHAMGNSQGG FKEYWDITRKYPKYQGGFIWDFVDQSCHWKNKDGVAIYGYGGDFNKYDASDNNFNDNGLI SPDRVPNPHAYEVGYFYQNIWTTPADLSKGEINIYNENFFRDLSAFYLEWQLLANGEIVQ NGIVSDLNVAPQQTAKLQLPFDTKNICSCKELLLNVSYKLKAAETLLPAGETIAYDQMSI RDYKAPELKLENKQSSNIAVVVPSFQNNDRNYLIVSGEDFTLEFNKHNGYLCRYDVDGMQ LMEDGSALTPNFWRAPTDNDFGAGLQHRYAAWKNPELKLTSLKHDIENEQAVVRAEYDMK SIGGKLFLTYTINNKGAVKVTQKMEADKSKKVSDMFRFGMQLRMPVTFNEIEYYGRGPGE NYSDRNHAARIGKYRQTVEEQFYPYIRPQETGTKTDIRWWRLLNIGGNGLQFVADAPFSA SALNYTIESLDDGAGKDQRHSPEVEKANFTNFCIDKAQTGLACVNSWGAIALEKYRLSYQ DYEFSFIMNPVYHKLK >gi|226332256|gb|ACIC01000064.1| GENE 36 50542 - 51717 911 391 aa, chain + ## HITS:1 COG:ECs4393 KEGG:ns NR:ns ## COG: ECs4393 COG0845 # Protein_GI_number: 15833647 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli O157:H7 # 10 375 13 382 385 165 31.0 1e-40 MKSRIVFFAFCLALLSSCGNKGNDTGKVPEYAVQELQKTTADLMKAYPATIKGRQDVEIR PQVSGFITKLCVDEGATVRKGQLLFIIDPTQYEAAVRTAKASVATAEAAVNTQQMTVDNK IELNKKQIISDYDLSMAKNSLAQAQAQLAQAKAQLTTAQQNYSFTQVKSPSDGVINDIPY 
RLGALVSPSMATPMTTVSEIDEVYVYFSTTEKELLAMTKTGGTIKEEISKIPAIKLQLID GTTYDAEGKVDAITGVIDQSTGSVSMRAIFPNKEHMLRSGGTANVLIPYNMENVISIPQS ATVEIQDKKFVYVLQPDNTVKYTEIGIFNLDNGKEYLVTSGLNPGDKIVVEGVQSLKDGQ KIQPITPAQKEANYQQHLKDQHDGNLATAFN >gi|226332256|gb|ACIC01000064.1| GENE 37 51734 - 54949 3021 1071 aa, chain + ## HITS:1 COG:BMEI1629 KEGG:ns NR:ns ## COG: BMEI1629 COG0841 # Protein_GI_number: 17987912 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Brucella melitensis # 6 1056 5 1035 1051 765 41.0 0 MKLDKFINRPVLSTVISILIVILGAIGLATLPITQYPDIAPPTVSVRATYTGASASTVLN SVIAPLEEQINGVENMMYMTSNASNTGSGDISIYFKQGTDPDMAAVNVQNRVSMAQGLLP AEVTRIGVTTQKRQTSMLVVFSLYDETDTYTDAFIENYAKINLIPQVQRVQGVGDASVMG QDYSMRIWLRPDVMAQYKLIPNDVSTALAEQNIEAAPGQFGERSNQTFQYTIRYKGRLQQ PEEFENIVIKSLPNGEVLRLNDIAEIQLDRLGYNFTNRVNGHKAVTCIVYQMAGTNATQT ITDIQKLLDEASTTLPSGLKINVSMNANDFLFASIHEVLKTLIEAFILVFIVVYIFLQDL RSTLIPTIAIPVALIGTFFVLSLIGFSLNLLTLCALVLAIAIVVDDAIVVVEGVHAKLDQ GYTSARLASIDAMNELGGAIVSITLVMMSVFIPVSFMGGTAGTFYRQFGLTMAIAIGLSA LNALTLSPALCAVLLKPHTDHGDKKQTLVSRFHTSFNAAYDSILKRYKKRVLFFIQKKWL SMGLVVLSIVLLIFFMNTTPTGMVPNEDTGTLMGAVTLPPGTSQDRSEQILARVDSLIAA DPAVASRTLISGFSFIGGQGPSYGSFIIKLKDWDERSMIQNSDVIVGSLYMRAQKIIKEA QVLFFAPPMIPGYSASTDIEVNMQDKTGGDLNKFFDVANNYTAALEARPEINSAKTTFNP NFPQYMIDIDAAACKKAGISPSDILSTMQGYYGGLYASNFNRFGKMYRVMIQSDPLSRKN LESLNNVKVRNNQGEMAPISQFISVEKVYGPDIISRFNLYTSMKVMVAPASGYTSGQALA ALAEVAKENLPAGYTYELGGMAREEAQSSGSTTGLIFILCFVFVYLLLSAQYESYILPLA VLLSIPFGLLGSFLFVNGMSAIGSISALKMILGTMSNNIYMQIALIMLMGLLAKNAILIV EFALDRRKMGMSITWAAVLGAGARLRPILMTSLAMVVGLLPLMFAFGVGAHGNRTLGTAS IGGMLIGMICQIFIVPALFVIFQYLQEKVKPMEWEDIDNTDAVTEIEQYAK >gi|226332256|gb|ACIC01000064.1| GENE 38 54961 - 56346 382 461 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 45 458 33 455 460 151 26 8e-36 MKKQILYMLCATALLSSCHIYKSYDRPEDITASGLYRDPAADNDTLASDTTNFGNLPWRE VFTDPQLQSLIEQGLKQNTDLLSAALNVKAAEASLMSARLAYAPSIGLSPQGTISSFDKN 
AATKTYSLPVTASWQVDLFGQLLNSKRNAQVTLKQTKAYRQAVQTQVIANIANMYYTLLM LDRQLEITQATAEVLKRNAETVQALSERSTYTSAALAQSKAAYAQVVASIPDIEQSIRET ENALSTFLGEAPHAIKRGVLEAQALPEELSAGIPLQLLSNRPDVKAAEMSLASCYYNTNS ARAAFYPQITLSGSAGWTNSAGSAIINPGKLLASAIGSLTQPLFYRGTNIARLKAAKAQE EQAKLSFQQTLYNAGSEVSNALSLYQNTSKKAESRQMQVESAKKASEDTKELFNLGTSTY LEVLSAQQSYLSAQISQVSDCFDKMQAVVSLYQALGGGRED >gi|226332256|gb|ACIC01000064.1| GENE 39 56365 - 57135 756 256 aa, chain + ## HITS:1 COG:PM1996 KEGG:ns NR:ns ## COG: PM1996 COG1043 # Protein_GI_number: 15603861 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase # Organism: Pasteurella multocida # 2 256 8 262 262 164 37.0 1e-40 MISPLAYVDPEAKLGKNVTVLPFAYIEKDVEIGDDCTIMSYASILKGTKMGKGNKIHQNA VLGAEPQDFHYTGEESSLIIGDNNDIRENVVISRATFAGNATRIGNGNYLMDKVHLCHDV QISNNCVVGIGTTIAGECSLDDCVILSGNVTLHQYCHIGSWTLVQSGCRISKDVPPYVIM SGNPVAYHGVNAVVLSQHHNTSERILRHIANAYRLIYQGNFSVQDAVQKIIDQVPMSEEI ENIVNFVKGSERGIVK >gi|226332256|gb|ACIC01000064.1| GENE 40 57196 - 57837 503 213 aa, chain - ## HITS:1 COG:no KEGG:BT_3335 NR:ns ## KEGG: BT_3335 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 213 1 213 213 442 100.0 1e-123 MKVKKISSANVEACSLPKLFDEEKIDFQPVQCVNWAEYPYKPKVSFRIAHTKDSILLHFK VREESVRAKYGEDNGSVWTDSCVEFFSVPASDGIYYNIECNCIGTILIGAGPARNGREHA PKEVTALVQRWSSLGNEPFEERVDDTTWEVALIIPYAAFFKHQIQSLDGKEVKANFYKCG DELQTPHFLSWNPIKIEKPDFHRPDFFGTLEFE >gi|226332256|gb|ACIC01000064.1| GENE 41 57949 - 62013 3475 1354 aa, chain + ## HITS:1 COG:BS_phoR_3 KEGG:ns NR:ns ## COG: BS_phoR_3 COG0642 # Protein_GI_number: 16079962 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 812 1043 41 272 279 138 36.0 7e-32 MRKTILLFLLFLSVTAVRTLGQNITFSHLTTDDGLSQFSVNSLYIDERGIIWIGTREGLN RYNGNDIKSFKLKKNDPNSLFSNTVLRITGNKNGKVYLLCTDGVAEFDMTTQRFKTLLQG NVDAIYYNEKLYIGKREEVFVFNENTNNFDLYYHLAGKDINLSCLHLDEKKNLWMGTTSN 
GVYCLSDDKQLSQPITKGNIASIYEDSSKELWIGSWEEGLYRIRTNGTIDNFRHDPKNPN SICSDFVRSCCEDNLGNLWIGTFHGLNRYDKSTGQFQLYTANDNKPDGLTHSSIWCIVKD EQGTLWLGTYFGGVNYFNPEYEIYTRYKVGDTEKEGLSSPIVGRMTEDKDGNLWICTEGG GVNVYNRKNNTYRWYRHEEGRNSISHNNVKAIYYDRTEEIMWIGTHLGGLNKLDLRTNRF TIYRMKAGDPTTLPSDIVRDIVPYGDKLIIATQNGVCLFNPATGTCEHLFKDTKEGRNIG MVASLYIDKDETLWVSATGEGVYSYRFDTGKLTHYAHNPSIPNTLSNNNVNNIMQDSNGN LWFSTSGSGLDRYRKESNDFENFDMQKDGLSSDCIYEVFESSIQKGDLLLITNQGFSQFD YPQKKFYNYGTENGFPLTAVNENALFVTHDGEVFLGGIQGMISFWEKKLHFTPKSYSIIL SRLLVNGKEITAGDDTGILEQSIAYTPEISLKADQSMFSIEYATSNFIPANRNEIVYRLE GFSDEWNHTDRKQTLITYTNLNPGKYTLVIKSDGDGIEEARLLIRVLPPWYETWWAYLIY TICTVSLLWYLIQNYNSRIKLRESLKYEKKHIEDLEALNQSKLRFFTNISHEFRTPLTLI VGQVETLLQVQTFTPNIYHKILGIYKNSLQLRELITELLDFRKQEQGHMKIKVSRHNLVN FLYENYLLFLEYASSKQINFKFNKQSDDIEVWYDQKQMQKVVNNLLSNALKHTKAEDTIS INVSQENKYVIIEIKDTGTGIAAAEIDKIFDRFYQTERLDSLNTGAGTGIGLALTKGIIE LHHGTIRVESEPGKGSRFIITLKSGNQHFTEEQIIRDNTDTNIQQQPETIVPTVEILPDS EWKEEDNKRIEDAKMLIVEDNESIKQMLAGIFETFYQVTTASDGVEALDIIQKDMPSIIL SDVVMPRMSGTELCKQIKTDFNTCHIPVVLLTARTAVEHNIEGLKIGADDYITKPFNTNL LISRCNNLVNSRRLLQEKFSKQPQAFAQMLATNPMDKEMLDRAMAIIEQHLDNTDFNVNI FAREMGMARTNLFTKLKAVTGQTPNDFILSIRLKKGAVMLRNNPELNITEISDRIGFSSS RYFSKCFKEIYHVSPLAYRKGEEKEEEGNEETDQ >gi|226332256|gb|ACIC01000064.1| GENE 42 62313 - 63848 1325 511 aa, chain + ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 13 501 11 467 497 155 26.0 2e-37 MKNVSRLLPLLPGIALLTGCNQKVQKDNGQNSQKPNIIYIFADDLGIGDLSCYGATKVST PHIDRLAGQGVQFTNAYATSATSTPSRFGLLTGMYPWRQENTGIAPGNSELIIDTACVTM ADMLKEAGYATGVVGKWHLGLGPKGGTDFNGHITPNAQSIGFDYEFVIPATVDRVPCVFV ENGHVVGLDPNDPITVNYEHKVGDWPTGEENPELVKLKPSQGHNNTIINGIPRIGWMTGG KSALWKDEDIADIITNKAKSFIVSHKEEPFFLYMGTQDVHVPRVPHPRFAGKSGLGTRGD VILQLDWTIGEIMNTLDSLQLTDNTILIFTSDNGPVIDDGYQDQAFERLNGHTPMGIYRG GKYSAYEAGTRIPFIVRWPAKVKPNKQQALFSQIDIFASLAALLKQPLPEGAAPDSQEHL 
NTLLGKDYTSREYIVQQNLNNTLAIVKGQWKYIEPSDAPAIEYWTKMELGNDRHPQLYDL SADPSEKNNVAKQHPEVVRELSELLESVKTR >gi|226332256|gb|ACIC01000064.1| GENE 43 63972 - 67169 3390 1065 aa, chain + ## HITS:1 COG:no KEGG:BT_3332 NR:ns ## KEGG: BT_3332 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 1065 1 1053 1053 2009 99.0 0 MKKHLSNQTRKRMLSILGMLLLSVPFVLAQVLVKGTVKDNLGEGVPGASVQVKGTSQGTI TDLDGKFTLNIPQKNATLVISFIGYVTVEQKADSQKPMVITLKEDTKTLDEVVVVGYQEV RRRDLTGSVAKANMADVLTAPVASFDQALGGRIAGVNVTSGEGMPGGNMSIVIRGNNSLT QENSPLFVIDGFPIEDSSAASTLNPSDIESLDFLKDASATAIYGARGANGVVIITTKKGK VGRAQLSYDGSFGVQHVTRTIPMMDAYEFVKLQNEMYPTVVAGSYLMNYEGKQWTLDDYK DIPQYNWQDEIFKTAWQQNHTVRLAGGTEGVRYNASLSYFDQDGTLIETGYKRMQGRMNT VVRRGKLNMSLTTNYSRSIQTGSTPSSTSYSGMNNLFYSVWGYRPVTSPDTPLSFLMDSS TDNAVDSSNDYRFNPIKSQKNEYRKSYTNNLQMNGFAEYEVLKGLKLKVSAGYTYDSRKQ DQFNNSNTRYGGPTSTDKVNAQVTRQERLTWLNENTLTYQTNIKKKHFLNVLGGITFQNS DYEIYSFRTTHIPNESLGMAGMSEGQAGTTTSAKSSWAMLSYLGRVTYNYMSKYYATVSF RADGSSKFNKDNRYGYFPSGSLAWSFSEEEFMKPLKSVLSSGKVRLSWGLTGNNRIGEYD YYALLAVLKSRVGSYTSTNSLPSGVYPFDNDATNAGVVPTSLPNKDLKWETTEQWNAGLD LGFFDERIGITMDIYRKTTRDLLLDASLPFSSGYYSATKNIGKVRNDGLELSLNTVNFQT RAFKWTTNFNISFNKNKVLALSENQTALLTAAQFDQNYNGQSSYIAKVGLPMGLMYGYVY EGTYKYDDFNKSGNSYSLKPGVPHYSTETNTQPGMPKYADLNGDGVVDSNDRTIIGRGLP IHTGGFTNNFEYKGIDLSVFFQWSYGNDIMNANRLFFESSNNRSRELNQFASYANRWTPE NPTSDIPAATNSSSNRVISSRIIEDGSYLRLKNVTVGYTFPAKLVKKWKIDKARVYVAAQ NLWTCTGYSGYDPEVSVRNSALTPGLDYSSYPRAYSISFGVSLGF >gi|226332256|gb|ACIC01000064.1| GENE 44 67210 - 68928 1858 572 aa, chain + ## HITS:1 COG:no KEGG:BT_3331 NR:ns ## KEGG: BT_3331 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 572 1 572 572 1131 99.0 0 MKALKITIIALLAGFSMASCDFLDKEPTKLTPENYFNTPAEANSFLTGIYAILSQPTFYG GDYMYLVAGDDLSHYGGSGRGPASTGLICNNATTSDNAVTAFWYALYSGINRANMFLENI DKVNGFDAGVKEQYIAEARFLRAFYYFNLVECWGDVPFKTVSTQSVTNLNIPRTDKQEIY DFIISEMADAADTGLKSASDLAYKPGRISQSTAWGILARVYLFRAGEHYREGRNATQAEK 
KDYFERASFYAQKVMTAGHKLAANYWDPFIDMCSDKYNTTANESIWEAEFAGNNTSDTQA EGRIGNIIGLAGPDLSSKSDVTGAKDPGYGYAFIYSTPKLYNLYVNNGDTKRFNWSIAPF EYKEAGGKNTGVTHREFEQGKLAEVMSQYGQQRGTYQYADDTEKTTATKNFSRMCGKYRR EYEADKKGKNYTSINFPILRYADVLLMIAEAENEANNGPTTLAYQCMKEVRERAGLNELP DMTQEEFRQTVKDERAMELCFEYTRRFDLIRWGEYVKNMRALVTEAQSGNNWTQGPTNVY TYFNISSTYNYFPIPDAEMSVNKDITQNNPGW >gi|226332256|gb|ACIC01000064.1| GENE 45 68962 - 70005 990 347 aa, chain + ## HITS:1 COG:no KEGG:BT_3330 NR:ns ## KEGG: BT_3330 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 347 1 347 347 694 99.0 0 MKTYKLIAICVASLFFGACSDGLDEAVGLHVKVATNENVSFDGQIITAKKGTPIEFILSG DPDFLTFFSGEAGSKYEYRERETIDPSQIKSSMLNFSIWFQYGNPSTTLEKHVYISDEFT GLYKDNFEADSLLVEQFEKDGKWKELVPQSAFPTAAVGNADLATPYSFDMKEYMGKRIAI AICYRGIDNTVAQSKMYFERMRINNVMTSGQEAEYSAGSFGFTPINMKNKWNLKDQTSMT KDREYGTVTNNVSGIWNLTGVGGGSFFIHCTNANDPLKYSWLVSDLITVNSCSPDQGTKV KDITQRLDKYTYTYNQIGIYNVTFLARNANIDHSSTTTYHMVVNVVE Prediction of potential genes in microbial genomes Time: Thu May 12 00:51:32 2011 Seq name: gi|226332255|gb|ACIC01000065.1| Bacteroides sp. 1_1_6 cont1.65, whole genome shotgun sequence Length of sequence - 141596 bp Number of predicted genes - 89, with homology - 88 Number of transcription units - 41, operones - 19 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 15 - 1931 1511 ## BT_3329 hypothetical protein 2 1 Op 2 . + CDS 1945 - 4551 2416 ## BT_3328 hypothetical protein + Term 4635 - 4671 5.1 + Prom 4651 - 4710 3.5 3 2 Op 1 . + CDS 4732 - 6189 658 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen 4 2 Op 2 . + CDS 6217 - 7161 702 ## COG0042 tRNA-dihydrouridine synthase 5 2 Op 3 . + CDS 7216 - 8478 1106 ## BT_3325 hypothetical protein + Term 8675 - 8711 4.7 - Term 9017 - 9051 3.3 6 3 Tu 1 . 
- CDS 9273 - 12350 2439 ## BT_3324 chondroitinase (chondroitin lyase) precursor - Prom 12468 - 12527 3.1 - Term 12431 - 12472 7.3 7 4 Tu 1 . - CDS 12549 - 13106 476 ## BT_3323 hypothetical protein + Prom 13082 - 13141 5.3 8 5 Tu 1 . + CDS 13290 - 15902 2008 ## BT_3322 putative thioredoxin family protein - Term 15807 - 15845 -0.6 9 6 Tu 1 . - CDS 15908 - 17191 783 ## BT_3321 hypothetical protein - Prom 17224 - 17283 5.2 + Prom 17189 - 17248 4.3 10 7 Op 1 . + CDS 17275 - 18039 798 ## COG0289 Dihydrodipicolinate reductase 11 7 Op 2 2/0.000 + CDS 18073 - 19554 1383 ## COG0681 Signal peptidase I 12 7 Op 3 . + CDS 19597 - 20535 622 ## COG0681 Signal peptidase I 13 7 Op 4 . + CDS 20578 - 21207 468 ## BT_3317 hypothetical protein 14 8 Tu 1 . - CDS 21715 - 22167 297 ## BT_3316 hypothetical protein + Prom 22015 - 22074 4.5 15 9 Tu 1 . + CDS 22183 - 22428 187 ## gi|253568933|ref|ZP_04846343.1| predicted protein - Term 22562 - 22600 -0.7 16 10 Op 1 . - CDS 22628 - 24922 2153 ## COG1472 Beta-glucosidase-related glycosidases 17 10 Op 2 . - CDS 24940 - 26943 1759 ## BT_3313 hypothetical protein 18 10 Op 3 . - CDS 26966 - 28456 1497 ## COG5520 O-Glycosyl hydrolase 19 10 Op 4 . - CDS 28515 - 30035 1529 ## BT_3311 hypothetical protein 20 10 Op 5 . - CDS 30050 - 33055 2739 ## BT_3310 hypothetical protein - Prom 33128 - 33187 6.2 - Term 33165 - 33202 -0.6 21 11 Op 1 . - CDS 33265 - 34908 1120 ## BT_3309 transcriptional regulator 22 11 Op 2 . - CDS 34957 - 36177 1192 ## COG0612 Predicted Zn-dependent peptidases - Prom 36209 - 36268 4.4 + Prom 36168 - 36227 3.2 23 12 Op 1 . + CDS 36283 - 36816 607 ## COG1611 Predicted Rossmann fold nucleotide-binding protein 24 12 Op 2 . + CDS 36821 - 37426 547 ## COG0794 Predicted sugar phosphate isomerase involved in capsule formation 25 12 Op 3 . + CDS 37414 - 38340 927 ## COG0524 Sugar kinases, ribokinase family + Term 38395 - 38433 3.0 + Prom 38375 - 38434 3.5 26 13 Tu 1 . 
+ CDS 38459 - 39529 882 ## COG2365 Protein tyrosine/serine phosphatase + Prom 39557 - 39616 5.3 27 14 Tu 1 . + CDS 39641 - 41584 1804 ## COG0513 Superfamily II DNA and RNA helicases + Term 41594 - 41638 9.0 - Term 41628 - 41663 2.2 28 15 Tu 1 . - CDS 41820 - 45920 2524 ## COG0642 Signal transduction histidine kinase - Prom 46142 - 46201 7.1 + Prom 46174 - 46233 7.5 29 16 Tu 1 . + CDS 46274 - 47461 783 ## COG4833 Predicted glycosyl hydrolase + Prom 47478 - 47537 3.8 30 17 Tu 1 . + CDS 47573 - 50050 1794 ## COG1472 Beta-glucosidase-related glycosidases + Prom 50056 - 50115 3.1 31 18 Tu 1 . + CDS 50174 - 52225 1269 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases + Term 52299 - 52347 4.1 + Prom 52291 - 52350 2.8 32 19 Op 1 . + CDS 52552 - 53679 441 ## BT_3298 hypothetical protein 33 19 Op 2 . + CDS 53700 - 56780 1892 ## BT_3297 hypothetical protein 34 19 Op 3 . + CDS 56791 - 58815 1059 ## BT_3296 putative outer membrane protein 35 19 Op 4 . + CDS 58834 - 59838 444 ## BT_3295 hypothetical protein + Term 59963 - 60017 1.1 + Prom 60005 - 60064 7.2 36 20 Tu 1 . + CDS 60114 - 62066 1222 ## BT_3294 putative alpha-glucosidase + Term 62105 - 62150 6.2 - Term 62091 - 62140 7.8 37 21 Op 1 . - CDS 62169 - 66434 2845 ## COG3250 Beta-galactosidase/beta-glucuronidase 38 21 Op 2 . - CDS 66468 - 68609 1747 ## BT_3289 hypothetical protein 39 21 Op 3 . - CDS 68656 - 70011 1414 ## COG1808 Predicted membrane protein 40 21 Op 4 . - CDS 70014 - 71462 1293 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 41 21 Op 5 . - CDS 71498 - 72265 529 ## COG4099 Predicted peptidase 42 22 Tu 1 . - CDS 72372 - 73403 1088 ## COG2255 Holliday junction resolvasome, helicase subunit + Prom 73935 - 73994 2.0 43 23 Op 1 . + CDS 74016 - 75248 1171 ## COG2715 Uncharacterized membrane protein, required for spore maturation in B.subtilis. 44 23 Op 2 . 
+ CDS 75264 - 75677 413 ## COG0319 Predicted metal-dependent hydrolase + Term 75788 - 75823 1.1 - Term 75704 - 75747 10.1 45 24 Op 1 . - CDS 75771 - 78320 2113 ## BT_3282 hypothetical protein 46 24 Op 2 . - CDS 78340 - 79239 642 ## BT_3281 hypothetical protein 47 24 Op 3 . - CDS 79246 - 80712 947 ## BT_3280 hypothetical protein 48 24 Op 4 . - CDS 80725 - 84273 2270 ## BT_3279 hypothetical protein 49 24 Op 5 6/0.000 - CDS 84307 - 85491 921 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 85580 - 85639 3.0 - Term 85610 - 85644 3.0 50 24 Op 6 . - CDS 85690 - 86220 226 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 86372 - 86431 3.5 - Term 86369 - 86421 17.1 51 25 Op 1 . - CDS 86522 - 89128 1903 ## BT_3276 hypothetical protein 52 25 Op 2 . - CDS 89137 - 89859 709 ## BT_3275 hypothetical protein 53 25 Op 3 . - CDS 89780 - 91765 1724 ## BT_3275 hypothetical protein 54 25 Op 4 . - CDS 91785 - 93455 1009 ## BT_3274 hypothetical protein 55 25 Op 5 . - CDS 93474 - 94199 501 ## BT_3273 hypothetical protein 56 25 Op 6 . - CDS 94224 - 95693 1063 ## BT_3272 putative outer membrane protein 57 25 Op 7 . - CDS 95713 - 99045 2098 ## BT_3271 hypothetical protein 58 25 Op 8 6/0.000 - CDS 99090 - 100298 984 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 100337 - 100396 10.2 - Term 100728 - 100763 1.0 59 25 Op 9 . - CDS 100814 - 101386 497 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 101484 - 101543 14.6 + Prom 102140 - 102199 2.7 60 26 Op 1 . + CDS 102221 - 104107 1591 ## COG0445 NAD/FAD-utilizing enzyme apparently involved in cell division 61 26 Op 2 . + CDS 104158 - 104688 578 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins + Term 104709 - 104743 6.0 62 27 Tu 1 . + CDS 104765 - 106594 1466 ## COG0322 Nuclease subunit of the excinuclease complex 63 28 Tu 1 . - CDS 106872 - 107099 105 ## + Prom 106840 - 106899 6.2 64 29 Op 1 . 
+ CDS 106975 - 107427 454 ## COG1490 D-Tyr-tRNAtyr deacylase 65 29 Op 2 . + CDS 107516 - 107854 444 ## COG1694 Predicted pyrophosphatase 66 29 Op 3 . + CDS 107841 - 108743 1039 ## COG0274 Deoxyribose-phosphate aldolase + Term 108777 - 108822 1.2 + Prom 108994 - 109053 7.2 67 30 Tu 1 . + CDS 109150 - 109830 466 ## BT_3262 hypothetical protein - Term 109726 - 109761 0.2 68 31 Tu 1 . - CDS 109831 - 110805 804 ## COG0142 Geranylgeranyl pyrophosphate synthase - Prom 110859 - 110918 8.7 + Prom 110785 - 110844 4.6 69 32 Tu 1 . + CDS 110894 - 113743 2670 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains + Term 113878 - 113914 6.5 + Prom 113933 - 113992 6.3 70 33 Tu 1 . + CDS 114076 - 114471 475 ## BT_3259 hypothetical protein + Term 114519 - 114570 7.8 - Term 114507 - 114555 13.2 71 34 Op 1 9/0.000 - CDS 114579 - 115358 648 ## COG3279 Response regulator of the LytR/AlgR family 72 34 Op 2 . - CDS 115355 - 117370 1462 ## COG3275 Putative regulator of cell autolysis - Prom 117418 - 117477 4.1 + Prom 117413 - 117472 8.4 73 35 Op 1 . + CDS 117512 - 118414 721 ## COG1045 Serine acetyltransferase 74 35 Op 2 . + CDS 118431 - 119972 1712 ## COG0116 Predicted N6-adenine-specific DNA methylase + Term 119984 - 120029 4.1 + Prom 120093 - 120152 3.3 75 36 Op 1 . + CDS 120182 - 122380 2097 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases 76 36 Op 2 . + CDS 122454 - 123728 1602 ## COG0151 Phosphoribosylamine-glycine ligase + Term 123759 - 123798 7.4 77 37 Op 1 . + CDS 123845 - 124738 550 ## BT_3252 hypothetical protein 78 37 Op 2 . + CDS 124723 - 125202 437 ## COG1238 Predicted membrane protein 79 37 Op 3 . + CDS 125249 - 126031 563 ## COG4121 Uncharacterized conserved protein 80 37 Op 4 25/0.000 + CDS 126034 - 126960 932 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin 81 37 Op 5 . 
+ CDS 127020 - 127811 225 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 + Term 127893 - 127938 2.7 + Prom 128065 - 128124 4.9 82 38 Tu 1 . + CDS 128230 - 131604 3454 ## BT_3247 hypothetical protein + Term 131626 - 131693 2.1 + Prom 131615 - 131674 2.9 83 39 Op 1 . + CDS 131707 - 132321 663 ## COG0726 Predicted xylanase/chitin deacetylase 84 39 Op 2 . + CDS 132397 - 133428 526 ## COG1600 Uncharacterized Fe-S protein - Term 133427 - 133472 9.1 85 40 Tu 1 . - CDS 133547 - 134494 498 ## BT_3244 hypothetical protein - Prom 134514 - 134573 8.7 86 41 Op 1 . - CDS 134627 - 135958 921 ## BT_3243 hypothetical protein 87 41 Op 2 . - CDS 135971 - 136885 797 ## BT_3242 hypothetical protein 88 41 Op 3 . - CDS 136900 - 138453 1030 ## BT_3241 hypothetical protein 89 41 Op 4 . - CDS 138489 - 141530 1753 ## BT_3240 hypothetical protein Predicted protein(s) >gi|226332255|gb|ACIC01000065.1| GENE 1 15 - 1931 1511 638 aa, chain + ## HITS:1 COG:no KEGG:BT_3329 NR:ns ## KEGG: BT_3329 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 638 8 645 645 1121 99.0 0 MLTAFALFALAACDTDDLRDDVDNLKDRVESLEAQVSLLNDNMTAIKRLLEGGQTITEVT NTNGTYKLKLSNGETISLTQGSKGEVAYPEITVNDEGQWVVNGEVLMQNGIPVQAVGTPG KDGIAPKFRITDEGSFWQVSYDNGTSWEDVLDTDGQKVSAVSDGSGGSSADSFFEEVYVD STGEFFVVKLKGQTEAISIPIVKDLLCEITEPETGMKNGYWEIGYGKTATTTVKVKGENI IVTAPAGWVATVSEADEMTNVATLSITAPANAMSTRATADNGSDVTVQVNKGASWAVAKI QVKAIQVVDSYYALYNSGATFTVNGIEVNNTKFENATYIDSDQTITTPGIYFIKGGVTVN YNSTVNAANLLFIGDDSQNISTVAITGNYIRLRQNTETGHFLCKNIVFKAAEGFTNYLFT VYADESFANVAFDQCQIVLNGKPVSAITNDKRSIANFSMENSTIKITAVTQQFIINTSSN KNQDYGNVIFRNNTFYCPSGKVNQLVLFNGSASGIASLTIENNTFINLETNTGGYVNIGN LAKTSIKNNIFWTNTDGTGNVVIIRPQITSPTGDICADNLLYKTMTYNWQMFYGGKLPFE GAEELKALTSNPFDGGTFDLANGIFVPNAEYAEYGATN >gi|226332255|gb|ACIC01000065.1| GENE 2 1945 - 4551 2416 868 aa, chain + ## HITS:1 COG:no KEGG:BT_3328 NR:ns ## KEGG: BT_3328 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # 
Pathway: not_defined # 1 868 1 868 868 1726 100.0 0 MKKVISIICITLLVALYSCDERDDLRSDIDNLKERVANLEASIEQMNSDISNYQQMVEGK ILVVGYSKDEQDNYTIELSNGETVTIYSGKVDMNDMPLFSVNASGHWAYTINDMTTELLV NDKPVSAIPEAGTAGVTPKLKVDANGFWLISIDNGSTWNKLGNNQIADGTQAVANASSLF SNVTIDEATGQITFTIRADNSQVKVPIYGKDFYLTIKYEGTATFGLGQKQEFLVEQANVE TATIENQTWGVKLTENKLIVTAPKTNVQGKEYEEQIYIKIFSKEGYCRVVKLPVKLLTTE IDANSALAWQRFRQGEDNVLPDYSYAGYNHGESAPQGAFSLGYQVINVKERMTAKNMTAR EALISILQEKGMTRVNGTNKLNANAKIVIYFPAGDYILHNDDDNTRDESKQKDAVDSKNN NVSSGIEIYGGNFVIKGDGPDKTRLIMETPNLPTSISNLSSSPILLAIKHTNGPNNAGNS PKLASVTENAKRGDFTVKVSGTTGISSGQWVQLRLRSGDRELVKKEIGPIALNENWAIAK APISINQSSDDLYGVKITEFHQVKSASNGKITFYEPIMHDIDIKYNDTEGWEIRTYKYLE NVGVEDLSFVGNALDGYAHHGEGHAEQAKVGWQYDGAYKPLLLQRVVNSWVRNVHFESVS EALTFAESANSSAYDIRISGKRGHSAVRSQGSSRVFIGKVRDESAGNDVYGKSCQGQFHG CGVSKPSVGTVLWNVTWGNDACFESHATQPRATLIDNCSGGLVYYRAGGDENEVPNHLGD LTLWNLNVTGTDSHASNFAWWSDSDTWWKIFPPIVVGTHGMNVKFSGKEQQQVTYEESTG MKVSPESLYEAQLRERLGYVPGWLNALK >gi|226332255|gb|ACIC01000065.1| GENE 3 4732 - 6189 658 485 aa, chain + ## HITS:1 COG:MA2370 KEGG:ns NR:ns ## COG: MA2370 COG2865 # Protein_GI_number: 20091202 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Methanosarcina acetivorans str.C2A # 11 441 2 419 458 174 29.0 4e-43 MKADHPEDIEDIKELIAQTENEQVEFKETTGQLERGMETLCAFLNGEGGTVLFGITDKGK IIGQEVADVTKRSIAEAINRLEPTANVHVFYIPLPDSSKKIIALHVENSRDNRPFCYKSR PYYRVESVTSAMPQAVYNQLLIIRDEAKYRWELFENPKLSLQDMDENEILKTVRLGIECG RLPENTGNNLSIILEKFGLLVNGVLNNAAAVLFANRELIEYPQCLLRLARFKGTDKMVFM DNQRVQGNLFQLLDSAMTFIFKHLSLSGTTEALEREEHLTIPYKAIREGIINSLCHRQFR TPGGSVGIAIYDDRVEIENPGTFPHGWDMERMKSEHCSEPRNPLIANVLYKRKLLENWGR GISLMTEECRKANLPEPEFKLANGFVILVFRYGTNNHTSTMQVPHKHHISTIQVQSLLNI MEYNTYSVKEMMELLELKNRSYFSKEYLKPAVETGVIEPIFPDQPKSPKQKYRLTEKGKA LLEKR >gi|226332255|gb|ACIC01000065.1| GENE 4 6217 - 7161 702 314 aa, chain + ## HITS:1 COG:CAC3454 KEGG:ns NR:ns ## COG: CAC3454 COG0042 # Protein_GI_number: 
15896694 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-dihydrouridine synthase # Organism: Clostridium acetobutylicum # 8 313 4 307 311 187 33.0 3e-47 MQDTLPIHFAPLQGYTEAIYRNAHDAFFGGVDTYYTPFVRLEKGNFRRRDVRGIEPENNG VPHLIPQLIASSAEKAEVILSLFIEKGYQEVDINLGCPFPLLAKRHNGSGILPYPEEVKA LLSIVTRYPQISFSVKMRLGWEQPDECLALAPILNDLPLRQITMHPRLGKQGYKGEVDLQ GFSAFREVCQLPLVYNGDIHNLEDIQRISTQFPSLAGIMIGRGLLANPALALEYKENRTL APDEMRDRLKSMHKSVYNNYDVLLEGGEGQLLTKMKTFWEYLAPQTDRKVLKAIHKSTNI SKYYQAVSAFFNQR >gi|226332255|gb|ACIC01000065.1| GENE 5 7216 - 8478 1106 420 aa, chain + ## HITS:1 COG:no KEGG:BT_3325 NR:ns ## KEGG: BT_3325 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 420 1 409 409 811 100.0 0 MKKYILLSLCLMTLSSSFAQLKDFKFKFYGQIRTDFYYNSRANEETVDGLFYMYPKDEVL DGNGEDLNATSNSNFYTLYSRLGLDVAGPKLGTAKTSAKVEVDFRGSGTSYSTIRLRHAY FNLDWGKSAVLVGQTWHPLFGDVSPQILNLSVGAPFQPFSRAPQIRYRFNNKHLQLTGAL VWQSQYLSQGPAGKSQEYIKKSNIPEIYVGADYKNGGFLAGAGIEMISLKPRTQSSWEEK TFDPTTNSTISIPHTYKVDERITTLSYEAHVKYTNKNWFIGAKSVLGSNLTQASGLGGFG IKHIDNKTKEQEYTPIRFSSSWLNVVYGQKWKPGIFVGYAKNLGTSDELVSEKLYGTGTN LDKLVTAGAELTYNVPHWKFGVEYTLSSAWYGKLDKSDGKIIDTHSVSNNRIVAVAMFMF >gi|226332255|gb|ACIC01000065.1| GENE 6 9273 - 12350 2439 1025 aa, chain - ## HITS:1 COG:no KEGG:BT_3324 NR:ns ## KEGG: BT_3324 # Name: not_defined # Def: chondroitinase (chondroitin lyase) precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 1025 1 1014 1014 2139 100.0 0 MIKQSFTLSVTMLILSFLCPAFLNAQIVTDERMFSFEEPQLPACITGVQSQLGISGAHYK DGKHSLEWTFEPNGKLELRKDLKFEKKDPTGKDLYLSAFIVWIYNEQPQDAAIEFEFLKD GRKCASFPFGINFKGWRAAWVCYERDMQGTPEEGMNELRIVAPNAKGRLFIDHLITATKV DARQQTADLQVPFVNAGTTNHWLVLYKHSLLKPDIELTPVSDRQRQEMKLLEKRFRDMIY TKGKVTEKEAETIRKKYDLYQITYKDGQVSGVPIFMVRASEAYERMIPDWDKDMLTKMGI EMRAYFDLMKRIAVAYNNSEAGSPVREEMKRKFLAMYDHITDQGVAYGSCWGNIHHYGYS VRGLYPAYFLMKDVLREEGKLLEAERTLRWYAITNEVYPKPEGNGIDMDSFNTQTTGRIA SILMMEDTPEKLQYLKSFSRWIDYGCRPAPGLAGSFKVDGGAFHHRNNYPAYAVGGLDGA 
TNMIYLFSRTSLAVSELAHRTVKDVLLAMRFYCNKLNFPLSMSGRHPDGQGKLVPMHYAM MAIAGTPDGKGDFDKEMASAYLRLVSSDSSSAEQAPEYMPKVSNAQERKIAKRLVENGFR AESDPQGNLSLGYGCVSVQRRENWSAVARGHSRYLWAAEHYLGHNLYGRYLAHGSLQILT APPGQTVTPATSGWQQEGFDWNRIPGVTSIHLPLDLLKANVLNVDTFSGMEEMLYSDEAF AGGLSQGKMNGNFGMKLHEHDKYNGTHRARKSYHFIDGMIVCLGSDIENTNTDYPTETTI FQLAVTDKAAHDYWKNNAGEGKVWMDHLGTGYYVPVPARFEKNFPQYSRMQDTGKETKGD WVSLIIDHGKAPKAGSYEYAILPGTDRKTMTAFAKKPAYSVLQQDRNAHILESPSDRITS YVLFETPQSLLPGGLLQRTDTSCLVMVRKESADKVLLTVAQPDLALYRGPSDEAFDKDGK RMERSIYSRPWIDNESGEIPVTVTLKGRWKVAETPFCKVVSEDKKQTVLRFLCKDGASYE VELEK >gi|226332255|gb|ACIC01000065.1| GENE 7 12549 - 13106 476 185 aa, chain - ## HITS:1 COG:no KEGG:BT_3323 NR:ns ## KEGG: BT_3323 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 185 1 173 173 335 100.0 7e-91 MRAKKFACCLMVMSLLAGISFISCGNSSRAKVESEAAQTGEDFKSFLDKFTSSAAFQYTR VKFPLKTPVTLLADDGETEKTFPFTKEKWPLLDSETLKEERITQEEGGVYVSKFTLNEPA HKVFEAGYEESEIDLRVEFELLPDGKWYVVDCYTGWYGYDLPIAELKQTIQHVQEENAAF KELHP >gi|226332255|gb|ACIC01000065.1| GENE 8 13290 - 15902 2008 870 aa, chain + ## HITS:1 COG:no KEGG:BT_3322 NR:ns ## KEGG: BT_3322 # Name: not_defined # Def: putative thioredoxin family protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 870 1 870 870 1764 100.0 0 MRPPYCFLLIGLLLLCSTQTIAFAQSGTYQTIPESLRGYWQFKADNVSDWNGPLIGENFV ENYYVVFYAEQIKQEADGSYFFHLRNQKGDTTEFRITPTGNDAATIWYKGWKEPKNCIRK QVPDHTEPLTPLTLPDRVYQKWVKGLSGKVIYEFTRDGKLLYDGKTWDILSAGYFLNKEY RLLVKSGESYKLLYLSFPLPKTMNVAAELQNEKVSPIASHPEIYAFAGCWINQATGDWRI GFFEDFAVYQCQFWDYESINTQKNQTTIILKNGTEQLKVRLTRKDETSCTLSVGKEKAQT YVLCNDKYLPDYPVADTTPFVDNGYQTDSVTLIGYLRNLPSTRPFEVDVPDMITNNEEEY TTEIDSLGRFTLRFPVLNSHNVFIDWGRTTICTSVEPGETYFLYVDFADKKKLVMGEKAR ILNELLAHEGLSEYIGYDEGEKMENMEYLQKTQDIIRHKSEYRAKMLNDHPLLSHKFRYY TEQEIRYNAARDLMQRRFSVDRNKQEHLQPKFMSYVDSALYPRPVEPYTLLRDYGSFLRD YVGYITDIAPASSNRLTVTPQKMEMLYLKFERDGKIQLSQEEKDALHAFSNFEQEIEKMK AANTADSLTMAAYNKKMEPSIKTIQSLVGRDEIFNDYMMGNLLANSINRTLAIIDSLQMD 
EKLKEILKANCYYEMLQQTHKELPDSLINKFKAEVSNPSLQAYVLNQQGIYEEISHKAIE YPESLMPNEPLAEITDGEQLFRKIIEPYKGKVVYLDVWGTWCGPCKDMMQYAGSAKKLFE GKDVIFLYLCNHSSDKSWKNIIKEYGLTGKIAVHYNLPDEQQRAIEKFLQVRSFPTYMLI DKEGNIVNRDAPRPNRENDLLNAVYKLLEK >gi|226332255|gb|ACIC01000065.1| GENE 9 15908 - 17191 783 427 aa, chain - ## HITS:1 COG:no KEGG:BT_3321 NR:ns ## KEGG: BT_3321 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 427 1 427 427 875 98.0 0 MEHLLHYVWKHKLFPLKILQTTKGLLVEIIDSGLQNPDAGPDFFNAKLKIDGTLWVGNIE IHTHSSDWYRHGHDHDKTYDSVILHVVAEADVEVTRSNGEQIPQLLLTCPENVQTHYHEL CIADQYPACYSIIGSLSKLTIHSWLTALQTERLEQKARQIADRLERCDRHWEDAFFITLA RNFGFGLNGDAFETWAGLLPFRAIDKHRNDLFQIEAFFFGQAGLLEEAFLKKEQEDEYSL RLRKEFRYLQRKFEITQVMDAGLWRFLRLRPENFPHIRLAQLAYLYQKVDKLFSQMMEAE TLPEVRQLLSTHASAYWDNHYIFGRPSSQKEKSMGERSQDLIVINTVVPFLYAYGLHRTD ERMCDRAGRFLEELKAENNHIIRSWGDAGLPVGSAADSQALIQLRKEYCDKRKCLFCRFG YEYLRKK >gi|226332255|gb|ACIC01000065.1| GENE 10 17275 - 18039 798 254 aa, chain + ## HITS:1 COG:MK1422 KEGG:ns NR:ns ## COG: MK1422 COG0289 # Protein_GI_number: 20094858 # Func_class: E Amino acid transport and metabolism # Function: Dihydrodipicolinate reductase # Organism: Methanopyrus kandleri AV19 # 44 249 69 266 275 106 34.0 4e-23 MKIALIGYGKMGKEIEKAALSRGHEIVCIIDVDNQEDFESEAFKSADVAIEFTNPMVAYS NYMKAFKAGVKLVSGSTGWMAEHGEEIKKLCTEGGKTLFWSSNFSLGVAIFSSVNKYLAK IMNQFPSYDVTMSETHHIHKLDAPSGTAITLAEGILENMERKDKWVKGTLLAPDGTVSGT DACASNEFPIDSIREGEVFGIHTIRYESEVDSISITHDAKNRGGFVLGAILAAEYTAQHE GYLGMSDLFPFLNE >gi|226332255|gb|ACIC01000065.1| GENE 11 18073 - 19554 1383 493 aa, chain + ## HITS:1 COG:STM2582 KEGG:ns NR:ns ## COG: STM2582 COG0681 # Protein_GI_number: 16765902 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Salmonella typhimurium LT2 # 437 488 269 320 324 73 57.0 1e-12 MRQATRTQWIKCTIAILLYLAFLIWVRSWWGLIVVPFIFDIYITKKIPWSFWKKSKNPTV RNVMSWVDAIVFALVAVYFVNIYIFQNYQIPSSSLEKSLLVGDFLYVSKMSYGPRVPNTP 
LSMPLAQHTLPILNTKSYIEWPQWKYKRVPGFGKVKLNDIVVFNFPAGDTVALNNQQTDF YSIAYGEGQRLYPKQIEMDSLTRQQQRAVYDLYYNAGRQQILSNPRVYGEVVYRPVDRRE NYVKRCVGLPGDTLQIVDGQVMIDGKAIENPENLQFNYFVQTTGPYITEDMFRELGISKA DQTLYDDSSWEETFRQIGLDNRNAQGKMAPIYHLPLTKKMYETLSGNKKLISKIVMEPEE YAGQMYPLNLHTKWNRNNYGPIWIPAKGATITLTEDNLPIYERCIVAYEGNKLEIKPDGI YINGEKTDQYTFKMDYYWMMGDNRHNSADSRYWGFVPEDHVVGKPIVVWLSLDKDRGWFD GKIRWNRLFKWVD >gi|226332255|gb|ACIC01000065.1| GENE 12 19597 - 20535 622 312 aa, chain + ## HITS:1 COG:NMA0976 KEGG:ns NR:ns ## COG: NMA0976 COG0681 # Protein_GI_number: 15793933 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Neisseria meningitidis Z2491 # 11 288 51 277 292 65 27.0 1e-10 MNIRKLKWIFAFAGAIAVVLLLRGFAFTSCLIPSTGMENSLFQGERILVNKWSYGLRLPL MSLFSYHRWCERSVRKQDVVVFNNPAAIGQPTIDRREIYISRCIGTPGDTLLVDSLFSVS SPEAQLNPDKKRLYTYPAAKEQLITSLMQTLSITNDGLMGSNDSTHVRSFSRYEYYLLEQ AISDQNWIQPLTEKSEKELKPLIVPGKGKALRVYPWNITLLRNTLVMHEGKQAEIKNDTL YIDGKPTQHCFFTKDYYWMASNNSVNLSDSRLFGFVPQDHIIGKASLIWFSKEKGTGIFD GYRWNRFFQSVK >gi|226332255|gb|ACIC01000065.1| GENE 13 20578 - 21207 468 209 aa, chain + ## HITS:1 COG:no KEGG:BT_3317 NR:ns ## KEGG: BT_3317 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 209 1 209 209 408 99.0 1e-113 MRIAYLSSAYLAPVEYYTKLLEYDKIYVEQHDHYIKQTYRNRCTIAGPDGELALSIPTVK PDTLKCPMKDIRISDHGNWRHLHWNAIESAYNSTPYFEYYKDDFRPFYEKKYEFLADFNE ELCQLVCKLIDIQPCIERTSEYKTEFTTEETDFREIIHPKKDFRIADPEFVPHPYYQVFD SKLGFLPNLSIIDLLFNMGPESLLVLTSK >gi|226332255|gb|ACIC01000065.1| GENE 14 21715 - 22167 297 150 aa, chain - ## HITS:1 COG:no KEGG:BT_3316 NR:ns ## KEGG: BT_3316 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 150 1 150 150 277 98.0 1e-73 MREVLNNQQIVCPGDATQFMHAIFSSDHEMMTFYLTLNRFINPSSYLVERSDRQRLEDLE NVLFGNVAAFEAVHHFKKISVKDVIKGFGMHMMNMQVSNANRMQSANIMGSFIDCIIDTT KNSWQYKQMYRVNHLHLENVRYLLNRLNEE >gi|226332255|gb|ACIC01000065.1| GENE 15 22183 - 22428 187 
81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253568933|ref|ZP_04846343.1| ## NR: gi|253568933|ref|ZP_04846343.1| predicted protein [Bacteroides sp. 1_1_6] # 1 81 1 81 81 161 100.0 1e-38 MVNKLSSTGLNRKVVLDNKRSGSACPIALHHPTTAVGALTLYMGVRNRHIEEWDKHKKAC HIDTAGSPVVGFDKEHYKDRI >gi|226332255|gb|ACIC01000065.1| GENE 16 22628 - 24922 2153 764 aa, chain - ## HITS:1 COG:YPO2803 KEGG:ns NR:ns ## COG: YPO2803 COG1472 # Protein_GI_number: 16123001 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Yersinia pestis # 10 761 1 711 793 416 34.0 1e-116 MNMKFKATLLGLSIAAVLPTMNMAQTPVYLDTSKPIEERVKDALSRMTLEEKVKMTHAQS KFSSPGVPRLGIPEVWATDGPHGIRPEVLWDEWDQAGWTNDSCIAYPALTCLSATWNPEM SYLYGKSIGEEARYRKKDILLGPGVNIYRTPLNGRNFEYMGEDPYLSSMMVVPYIKGVQE NGVAACVKHYALNNQEFNRHTTNVHLSDRALYEIYLPAFKAAVQEGGAWAIMGAYNLYSF SEDTDSGKLYKTQHACHNKRLLQDILRKEWGFDGVVVSDWGGVHDTFQAISNGLDMEFGS WTNGLSAGTRNAYDNYYLAHPYLKLIQDGTVGTKELDEKVSNILRLIFRTSMNPYKPFGS LASPEHGQAGRKIGEEGIVLLQNKDNVLPIDLKKARKIAVIGENAIKMMTVGGGSSSLKV KYEISPLDGLKNRVGSQAEVLYARGYVGDPTGEYNGVQTGQDLKDDRSEDELLAEAVEVS KDADYVIFFGGLNKSNHQDCEDSDRASLGLPYAQDRVIGELAKVNKNLIVVNISGNAVAM PWVNEVPAIVQGWFLGSEAGTALASVLLGDANPSGKLPFTFPARLEDVGAHKLGEYPGNK EELAHSKNNGDTINEIYREDIFVGYRWADKEKIKPLFPFGHGLSYTTFAYGKPSADKKVM TADDTISFTINVKNTGTREGQEVIQLYVSDKKSSLPRPVKELKGFKKVKLAPGEEKAVTL TIDKKALSFFDDVKHEWMAEPGKFEAVIGTSSRDIKGIVPFELR >gi|226332255|gb|ACIC01000065.1| GENE 17 24940 - 26943 1759 667 aa, chain - ## HITS:1 COG:no KEGG:BT_3313 NR:ns ## KEGG: BT_3313 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 667 1 667 667 1310 99.0 0 MKKSIYITLALAGILSMNSCNDDEFLPGNPSMEIKAENADALFGDSLPFTIKASDVDVPL STLKAQLFYGEEQVSETVIRTKTSGSDYTGKIYIPYYANIPNGKATLKYILQNIHFTTTE QETELTLARPDFPYLTLVDEEGQEYRMDRQSMYHYSVSGDFSQKMKVYIKTPKVGEHGNE LTFGWEDNTIAVGSTTSIPFSNTEPGNYTIQFNTFNYEASPFAKLLLNGKEMELVENDVY AVKLSLKQNDILTFEGVPAYDEWWIDPDFFEKQEDGTLKFLPIDGSYQITANGKRQYFSV 
IALKDGEAAKLQDDGTGAIWAIGNGIGKPSVSLYEVGWTPENGLCMPQLTAKKYQLTFTA GVTMKTSDIDFKFFHINKWDNGEFKGDAISTTSDLVEITESGNLKLQEGQKFERGGVYKF TVDVTKGNTKAVLTVEKVGQVDLPRPDITINGMKLEEGSSDMYQLQMSLTQNQTLEIGGI EDLNDWYVDPDYLVKTGDNHLKFLPVAGSYRVMVNATRKYFAVVRMNGNSEAGLDADGHG AVWLLGYGMGSPSIKDEFNWDFSTAYCVAEISPKVYQFTGKTGAEGSTVIGERIRYDYIS VKYLGGKSWDQQFHQKNLQMDQATRNLFNVDEENLVLASPLEQGVTYVFTLDFTQGIDKG ILSVVKK >gi|226332255|gb|ACIC01000065.1| GENE 18 26966 - 28456 1497 496 aa, chain - ## HITS:1 COG:CC1757 KEGG:ns NR:ns ## COG: CC1757 COG5520 # Protein_GI_number: 16126001 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: O-Glycosyl hydrolase # Organism: Caulobacter vibrioides # 27 492 15 469 469 237 31.0 3e-62 MINMRNIALTFLGCFTILAACSNSDDAEKPVVPVPTGDVAIYTTTSSLTRDLTRDAVNFS PKDNLAPTTITLNPAEQYQTMDGFGAAITGSTCYNLLLMKPADRHAFLTETFSDKDGFGF SYIRISIGCSDFSLSEYTCCDTKGIENFALQSEEKDYILPILKEILAINPSIKVIAAPWT CPKWMKVKSLTDRTPLDSWTNGQLNPDYYQDYATYFVKWIQAFKAEGIDIYAVTPQNEPL NRGNSASLYMEWEEQRDFVKTALGPQMKAAGLSTKIYAFDHNYNYDNIESQKNYPGKIYE DAAASQYLAGAAYHNYGGNREELLNIHQAYPEKELLFTETSIGTWNSGRDLSKRLMEDME EVALGTINNWCKGVIVWNLMLDNDRGPNREGGCQTCYGAVDINNSDYKTIIRNSHYYIIA HLSSVVKPGAVRIATTGYTDNGITCSAFENTDGTYAFVLINNNEKSKKITVSDGQRHFAY DVPGKSVTSYRWAKSK >gi|226332255|gb|ACIC01000065.1| GENE 19 28515 - 30035 1529 506 aa, chain - ## HITS:1 COG:no KEGG:BT_3311 NR:ns ## KEGG: BT_3311 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 506 1 506 506 1009 100.0 0 MKLRYIFYGLTSGLLFGLSSCSLNYDPIDTYSDVTEGVTDTGTKVVFKDKAAVESHLTTI YNQMRDRQEHWYVDLLLISDSHADNAYAGTTGAEVLPFENNSIEGSNSVLERDWNRYLED VARANKLICNIDLVTDNSLTTAERAQYKAEAMIFRAMVMFDMVRLWGDFPVITTVADDIT SENIEDVYEQYFPAQNTELEAYQQIEKDLLAAIQHAPDNTAGNKTLFTKSVARTLLAKIY AEKPLRDYSKVIRYCDEVKADGFELVKDFSDLFGMNAAGTDAKARNTKESILEAQFTPGS GNWCTWMFGRDLVNWNNNFTWAKWVTPSRDLINAFKKEGDQIRFQESVVYYDCNWSNYYP SDNYPFMYKCRSANSSIIKYRYADVLLLKAEALIMQDTPDLEAAAKIIDEVRERAGLGEL SSSVRKDRDAMITALLNERRLELAFEGQRWFDLVRLDKVEEVMNAVYAKDSGRKSQVYAF DKNSYRLPIPQSVIDANDKIEQNPGY 
>gi|226332255|gb|ACIC01000065.1| GENE 20 30050 - 33055 2739 1001 aa, chain - ## HITS:1 COG:no KEGG:BT_3310 NR:ns ## KEGG: BT_3310 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1001 1 1001 1001 1908 99.0 0 MKKNRRKILSGSRKILFAILAVFFSLSASAQQFTASGQVLDAQKEPLIGVSVQEKGTTNG AITDLDGNFTLTVRSNATLIFSYVGYKSQEQKASRQMKVTLQEDNEVLEEVVVIGYGSVK RKDVTTAISTVSTKDLDVRPIVSAGQAIQGKAAGISVIQPSGTPGGEMSIRVRGTTSMNG SNDPLYVVDGVPVDNIKFLSPNDIESMQILKDASSASIYGSRAANGVILITTKAGATGKA KVSLTAQFGLNKVADKVESLNAAQYKELQDEIGLVSLPDGLPDRTDWFDETYKTGKMQNY QVAVSNGNEKMKYYLSAGYLDEKGILDISYYKRYNFRVNLENQIRSWLTVSANISYSDYT SNGGGAMGTGANRGGVVLAVINTPTYAPIWDALNPNQYYNNFYGVGNITNPLENMARAKN NKDKENRLLASGNVLLTLLPELKFKSTLTLDRRNAVNTTFLDPISTAWGRNQYGEASDNR NMNTVLTFDNVFTYNKNFKRHGLEVMAGSSWTDSDYSNSWINGSHYRNDLIQTLNAANKI AWDNTGTGASQWGIMSFFGRVAYNFDSKYLLTANLRADGSSKLHPDHRWGVFPSFSAAWR VSSEKFMADLTWIDDFKIRGGWGQTGNQSGIGDYAYLQRYNIGRIEWFKVAAEGDTTDYA NAVPTISQANLRTSDLTWETTTQTNIGLDLTVLNGRLTFNMDYYYKKTKNMLMNVSLPAG AAAATSIARNEGEMTNKGFELTISSKNFRGGAFTWDTDFNISFNRNKLTKLELQKVYYDA KTADVVNDYVVRNEPGRALGGFYGYISDGVDPETGELMYRDLSGDGKISTSDRTYIGDPN PDFTYGLTNTFSWKGFNLSVFIQGSYGNDIFNASRIETEGMYDGKNQSTKVLNRWKIPGQ ITNVPKANFKLLNSTYFIEDGSYLRLKDVSLSYNFKGKLLQKWGITRLQPYFTATNLLTW TSYSGMDPEVNQWGNSGTVQGIDWGTYPHSRSYVFGINVEF >gi|226332255|gb|ACIC01000065.1| GENE 21 33265 - 34908 1120 547 aa, chain - ## HITS:1 COG:no KEGG:BT_3309 NR:ns ## KEGG: BT_3309 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 547 1 547 547 989 99.0 0 MKNKCLLLLLFASFPIFSWADETLDSLLHVLDQTILAHDTYVVQRESRIRHLKELAGDVA PNSIERYNLNNQIYKEYKAFICDSAIYYLNENVRIAGNLGDTDREIESKLQLSLLLSSTG MYTESIDVLKSVDRQKVTSHLILDYYTCFDHVYGEMGFYTQDQTLSAYYREISSAYKDSL YAILSPQSEEFMVMRETLFRDRHKYDEALEINDRRLMAAEPDTPQYALVTYHRSLIYKYL GDKIREKQNLCLSAISDIRSAIKDHASLWMLAQLLYENGDMERAYQYMRFSWNATKFYNA RLRSWQSADVLSLIDKTYQAMIEKQNDRLQQYLVLITALLVLLIGALGYIYRQMKKLAVA RNHLQTANHQLNQLNEELQQMNACLTSTNAELSESNQIKEEYIARFIKLCSTYINRLDAY 
RRMVNKKVSAGQIAELLKITRSQDALDEELEELYANFDTAFLHLFPDFVKKFNALLQDNE QIILKKDELLNTELRIFALIRLGIEDSSQIAEFLRYSVNTIYNYRAKVKNKARGSREDFE DLVRKIR >gi|226332255|gb|ACIC01000065.1| GENE 22 34957 - 36177 1192 406 aa, chain - ## HITS:1 COG:BH2405 KEGG:ns NR:ns ## COG: BH2405 COG0612 # Protein_GI_number: 15614968 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Bacillus halodurans # 4 404 3 401 413 212 33.0 1e-54 MHYNEYTLPNGLRIIHEPTLSKVAYCGFAIDAGTRDEAEDEQGMAHFVEHLIFKGTEKRK AWHILNRMENVGGDLNAYTNKEETVVYAAFLTGHLERALELLGDIVFHSTFPQHEIEKET EVIIDEIQSYEDNPSELIFDDFEDMIFRNHPLGRNILGKPELLRSFRTEDVLSFTSRFYQ PGNMVFFVQGQYDFKKIIRLAEKYLFDIPAVTVDNQRMPPPLYVPERLVVPKDTHQAHVM IGSRGYNAYDDKRTALYLLNNVLGGPGMNSKLNVSLRERRGLVYNVESNLTSYTDTGAFC IYFGTDIEDMDTCLKLTYKELKRMRDVKMTSSQLAAAKKQLIGQIGVASDNFENNALGMA KTYLHYHKYESSESVFHRIEALTAEQLLEVANEMFAEEYLSTLIYK >gi|226332255|gb|ACIC01000065.1| GENE 23 36283 - 36816 607 177 aa, chain + ## HITS:1 COG:BH2746 KEGG:ns NR:ns ## COG: BH2746 COG1611 # Protein_GI_number: 15615309 # Func_class: R General function prediction only # Function: Predicted Rossmann fold nucleotide-binding protein # Organism: Bacillus halodurans # 3 154 2 152 190 114 38.0 1e-25 MKKIGIFCSASENIDKMYFDSARQIGEWMGKAGKTLIYGGANLGLMECVARAVKENGGTV IGVVPAKLEEKGSVSTLLDEVIHTRNLSDRKDVITEKSEILVALPGGVGTLDEIFHVIAA ASIGYHQKKVIFYNEYGFYNELLAALKTLEDKGFARQSFSTYYETANNLDELKEKIN >gi|226332255|gb|ACIC01000065.1| GENE 24 36821 - 37426 547 201 aa, chain + ## HITS:1 COG:PA4457_1 KEGG:ns NR:ns ## COG: PA4457_1 COG0794 # Protein_GI_number: 15599653 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted sugar phosphate isomerase involved in capsule formation # Organism: Pseudomonas aeruginosa # 9 194 19 198 209 142 41.0 3e-34 MIDSIKQLLQQEAQAVLNIPVTDAYEKAVQLIVEQIHQKKGKLVTSGMGKAGQIAMNIAT TFCSTGIPSVFLHPSEAQHGDLGILQKNDLLLLISNSGKTREIVELTRLAHNLDPELKFI VITGNPDSPLANESNVCLSTGKPAEVCTLGMTPTTSTTAMTVIGDILVVQTMKKTGFTIE EYSKRHHGGYLGEKSRSLCEK >gi|226332255|gb|ACIC01000065.1| GENE 
25 37414 - 38340 927 308 aa, chain + ## HITS:1 COG:VCA0656 KEGG:ns NR:ns ## COG: VCA0656 COG0524 # Protein_GI_number: 15601414 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Vibrio cholerae # 1 257 18 276 323 89 24.0 8e-18 MRKVIGIGETILDIIFRGNQPSAAVPGGSVFNGIVSLGRMGIKVGFISETGNDRVGNIIL QFMRENNIPTDHVNVFPDGKSPVSLAFLNEQSDAEYIFYKDYPKQRLDVLYPKLEEDDIV MVGSYYALNPVLREKILELLDQAREKKAIIYYDPNFRSSHKNEAMKLAPTIIENLEYADI VRGSLEDFFYMYGLQDVDKIYKDKIKFYCPRFICTAGGDKVSLRTNLVSKEYPIEPLEAV STIGAGDNFNAGLIYGLLKYDVRYRDLNNLNEEVWDKVVQCGKDFAAEVCRSFSNSVSVE FAGQYASK >gi|226332255|gb|ACIC01000065.1| GENE 26 38459 - 39529 882 356 aa, chain + ## HITS:1 COG:lin2049 KEGG:ns NR:ns ## COG: lin2049 COG2365 # Protein_GI_number: 16801115 # Func_class: T Signal transduction mechanisms # Function: Protein tyrosine/serine phosphatase # Organism: Listeria innocua # 39 352 12 325 326 127 30.0 3e-29 MYRNLLNWLTILLVLPSCSGTSPTISVVCEENNVGNAIIKWETAPILKGQVKVYASTSPD FIPEENPVATINIAKGKKTIVTNDPSQRYYYLMVFNNRYRVRVAARNVNIPGIQNFRDLG GYKSAETGKDTRWGMLYRSAQIDSIPFCSRRELKNMGIRTIIDLRSEEERHNYPQFHDED FNVLHLPIATGNMEHILQDIRDKKIETDTIYRLVERMNRQLVTNYRKEYKELFTLLLDRN NYPVVIHCTSGKGRTGIVSALVLAALGVNEEAIMKDYRLSNDYFNIPKASRYAYKLPINS QEAITTIYSAKEDFLNAAKEQIDAEYGSVQAYLKKGIGLSAEDIERLRSILLIDNG >gi|226332255|gb|ACIC01000065.1| GENE 27 39641 - 41584 1804 647 aa, chain + ## HITS:1 COG:BH2384 KEGG:ns NR:ns ## COG: BH2384 COG0513 # Protein_GI_number: 15614947 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Bacillus halodurans # 1 538 5 524 539 374 41.0 1e-103 MKTFEELGVSPEIRKAIEEMGYENPMPVQEEVIPYLLGENNDVVALAQTGTGKTAAFGLP LLQQIDVKNRVPQSLILCPTRELCLQIAGDLNDYSKYIDGLKVLPVYGGSSIDSQIRSLK RGVHIIVATPGRLLDLMERKTVSLSTVHNIVMDEADEMLNMGFTDSINAILADVPKERNT LLFSATMSPEIARISKNYLQNAKEITIGRKNESTSNVKHVAYTVQAKDKYAALKRIVDYY PQIYGIIFCRTRKETQEIADKLMQEGYNADSLHGELSQAQRDAVMQKFRIRNLQLLVATD 
VAARGLDVDDLTHVINYGLPDDTESYTHRSGRTGRAGKTGTSIAIINLREKGKLREIERI IGKKFIAGEMPTGKQICEKQLIKVIDELEKVKVNEEEITDFMPEIYRKLEWLSKEDLIKR MVSHEFNRFLDYYRDREEIETPTDIRERNTRDSRERGSRKAAPGFTRLFINLGKMDSFFP SELISLLNSNTRGRIELGRIDLMKNFSFFEVEEKEAQNVVKALNRANWNGRKVSVEVAGE ETGEGRRGSGSAERRGGKRPFGSSSEKRGDSSRNSRSERSDRAPRADRTTKGSDKTDKKK RSDKPSREERGYGARGPKKTDDWQQFFKDKEPDFSEEGWARRKPKKK >gi|226332255|gb|ACIC01000065.1| GENE 28 41820 - 45920 2524 1366 aa, chain - ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 831 1060 8 231 294 123 31.0 3e-27 MKKHNEWTKGTTLLRRVLLILVSFLLIGVDINASYSLRQFSSKNGLSNSAILSMCQDRHG VIWIGSCDGLNIYDGTYLGLYKPTNVHHSFSGNLIERIIEADNDVLWIQTNYGLDRFDTR RQTIQSFKDFKYINRMAISPEDDFFIIKDDGYIYYYQPEQKDFCKLDVKKIHFEEVQQMA IDNFGVLWIFSSDNDNRSYIIEKNGNQVKLVPSNCLNHSEKLLWTFVDGDFLYFIDSTYA LYEYDLNNHKAYFIADMEAEILRRGEVSSIIKQKTDYFIGFKSSGLIQLKYMPDSKVKYS LQSINVQSGIFCLMKDRFQDIIWVGADGQGLYMYFTDEFSIDNIMLDVPEYRVDNPVRAL YQDQDQTLWIGTKGGGILKMFDFHPDMETLPRAERILASNSPLMDNSVYSFAPSRRKLLW IGTESGLNYYSYSQKRIQEFPVMADGKMVKYVHSVRETNDSTLWVVSAGEGIVKIILDAT SLLPKVKSARRFTLDGGKRNSNYFFVSFQENDSVIWFGNRGYGAYKMNTHTNQLTPCRID DVVKNQTVNDIFAIHKNDEGYWFGTSFGLTRLYQGEYRVYNETDGFPNNTIHGILEGRDN NLWLSTNQGLVRFNVRENTVQTYRQQGDLEVIEFSDGAFFKDQQTGTLFFGGTNGFITIS ENDQLSEEYMPPLHFNRLSIFGKECNIYDFLQGATKQETLVLDYSQNFFNLSFIAVDYIN GNNYTYSYKIDGLSDSWIENGLSTVAVFSNLSPGEYTLLVKYRSNITGKESEPYSLLIRI TPPWYMTTCAYIVYFLLLAGVIAGAIRMVIKRYRRKRNVMIEEMNRQQREELYESKLRFF TNITHEFCTPLTLINGPCEKILSYSRVDSYVRKYASMIQQNALKLNALILELIEFRRLET GNKILKIKHVPVTEQIRTIAESFGELAESRKLNYRLQIEDGVFWNTDVSCLSKIVNNLIS NAFKYTPENGSITVELRIEGEQLCIRVSNTGKGIKEADLTKIFDRYKILDNFEVQNKNGI SPRNGLGLAICHSMVNLLNGQIQVSSIPNEVTTFDVWLPVMEVTMDNEDEKVAEELVLSS DEQAVELKNSSVEFDKNKQTIMIIDDDPSMLWFVTEIFVGAYNVVSLGSAEEALKQLGIQ LPDLIISDVMMPGMDGMSFAKKIKSDKLLSRIPLILLSALNNIDEQTRGIESGAEAYITK PFNVEYLEKVVRRLLQREEDLKEYYSSVLSAFELDDGHFLHKEDKSFFEKMMQIIDSHIQ 
NTDLSVELLSSSLGYSTRQFYRKLKNVTEKTPADIIREYRLTIVERLLLTTQLSIEEIMD KTGFSNRGTFYKAFAQKFEMTPKQYRNIKTKDVHDASEEGGGTMNV >gi|226332255|gb|ACIC01000065.1| GENE 29 46274 - 47461 783 395 aa, chain + ## HITS:1 COG:lin0763 KEGG:ns NR:ns ## COG: lin0763 COG4833 # Protein_GI_number: 16799837 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosyl hydrolase # Organism: Listeria innocua # 143 348 90 295 341 75 28.0 2e-13 MKALRFLLIIVCAYFLASCVSTKQADSNSNLDRARQTLDSLYLNYSVSNSHLLRENYPFN EQYKVTYLASENQTNMPNQFSYLWSYSGTFSAVNALLEATHDKKYQQLLEKQVLPGLEEY FDTERTPVAYSSYIRTTPTSDRFYDDNIWVGIDFIDIYQITKEKKYLDKAQLIWNFIESG TDSLLGDGIYWCEQKKESKNTCSNAPGSVFALKLFKATNDSLYFKRGKELYKWTQKNLQD STDYLYFDNIRLDGKIGKAKFAYNSGQMMQSAALLYQLTKNPDYLKDAQSIAKECYNYFF TDFTTDTGESLKMLKQGNIWFTAVMLRGFIELYQLDQNKTFIDAFNQCLSYAWDNARDEN GLFSTDLTGNNNNEKKWLLTQAAMVEMYSRLAAIQ >gi|226332255|gb|ACIC01000065.1| GENE 30 47573 - 50050 1794 825 aa, chain + ## HITS:1 COG:SPBC1683.04 KEGG:ns NR:ns ## COG: SPBC1683.04 COG1472 # Protein_GI_number: 19111852 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Schizosaccharomyces pombe # 33 815 6 815 832 374 31.0 1e-103 MKQKRLTFLLLLISNLFCASAQKWYKPEIDLKVEELLSQMTVEEKLGYIGGIDWMYTKSI DRLGIHRMRMSDGPQGIGTKGKSTAYPATVTLAATWNENLAYQYGKALGRDCRARNINVI LGPAVNIYRAPMCGRNFEYMGEDPYLTSRICVGYIKGVQDQGVMATVKHFVANNSDYDRN NISNDIDERTLNEIYFPAYKAAIQEAEVGAIMTSYNLMNGIYTTENPWLLKEVLRNQWGF KGLVMSDWGSTHYCVPAARSGLDLEMAGGEKMNPKDMAYYLKTGDVTMDMVDEKVRHILR ILIAFGFKDGTEADKSIALDDPQSVETALNVAREGLVLLKNENNILPINPQKYKHIIVTG KNAHGYVHGGGSGAVVPFHYTNLFDGIQKEGKLNNVKVEYIDELDFMPSIMYTDDNLNEK GFHAEYFKNIHFEGAPVVKQTEKKINYSWAAGTELEGMPKDYFSVKWYSTMCVDETADYE FTLGGDDGYKLFINDQPIIDDWTPGGFRTTNVTKTLKAGEKYHVRVEYYQQGGGAGICFL WKRKNETQNKFADYLNKADLVIACFGHSSDTEGEASDRTFELPEVDKKMLASLSGCKKPV IGIVNAGGNIEMQKWEPSLKGLIWGFYGGQEAGTAIGEALFGKVNPSGKLPMTFEKKWED SPAYNSYHDPDKDKHVAYTEGIFIGYRGYDKLKREVQYPFGYGLSYTTFKLSNIVVSKPN ADGTVEVTCRLANTGKKDGAQVIQAYVGKAENSPVERPQKELRKFEKIFLKAGESTTVKM 
TLPKESFMYFDVNRTQFVTDPGIYNIMLGFSSRDIKAQKNIQYTL >gi|226332255|gb|ACIC01000065.1| GENE 31 50174 - 52225 1269 683 aa, chain + ## HITS:1 COG:alr4773 KEGG:ns NR:ns ## COG: alr4773 COG1501 # Protein_GI_number: 17232265 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Nostoc sp. PCC 7120 # 64 659 145 757 779 426 38.0 1e-119 MVGNGIAKFIPEGFDAQKIPSFAIEKEPREQGTLPADWVLVPEFSLTDGKANASLTVPEG TSIYGGGEVTGSLLRNGKTIKLWNTDSGAYGVDKGTRLYQSHPWMMGVRKDGTAFGILFD TTWKAELSSTDEKIELKSEGIPFRVFIIDRESPQAVIRGLSELTGTMPMIPRWALGYQQC RFSYSPDSRVIEIADTFRLKRIPCDVIWMDIDYMDGYRIFTFNPKSFPNPKAVNRDLHIR GFHSAWMIDPGAKVDPNYFVYKSGTENDVWVKTADGKNFHGDAWPGAAAFPDFTSPKVNK WWRNLYKDFLAQGVDGVWNDVNEPQINDTPNKTMPEDNLHRGGGKLPAGTHLQYHNVYGF LMVKASREGILDARPEKRPFILTRSNFLGGQRYAATWTGDNGSCWDHLKMSVPMSLTLGL SGQPFSGADIGGFLFNADADLFGNWIGFGAFYPFARGHACAGTNNKEPWVFGQKVEDASR IALERRYILLPYFYTLLHEASTNGMPIMRPVFFSDPKDLSLRAEEEAFLVGDNLLIIPAF ANQPALPKGIWKELSLVEGDQNDKYQAKMKIRGGAIIPTGKIIQNTTENSLDPLTLLVCL DEQGKASGNMYWDAGDGWSYKKGDYSLLQFVAERNGDKVTVKLTKKTGKYNTENKDMAVI KIITDQGIRQASGNLVEGIEIRL >gi|226332255|gb|ACIC01000065.1| GENE 32 52552 - 53679 441 375 aa, chain + ## HITS:1 COG:no KEGG:BT_3298 NR:ns ## KEGG: BT_3298 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 375 96 470 470 770 100.0 0 MIPPKAFDGNIKISIESADGTSSPIEYEFERRFQYVSQTSVGTLVGNEDEQGNSSDVDGD FENARFGNAEWLLMDTFGIQKCLIVNGFKGPIRRINLETQEVSTLMTEGQGLFGGMYYMC FDATGDTLFVSDDHGQSNKDRREIAYLLRSEDFRRARPYVYDRTGYSCAYQPLTKTLYYT VYWQGTIDKAVFDPTTQGMVAKEVFPTYESRNEHAYLTVHPEGKYMYITGANCLYKSMFN PETKEFQKPTMFAGSEGASDWVDSPGTSGRFVWPYQGTFVKNKDYIEAGKSDIYDFYLCD RNGHCIRKITPDGIISTYAGRGSVSSDGRVNGYIDGELRTEARFDSPCGIAYDEETQTFY IADRENHRIRTISVE >gi|226332255|gb|ACIC01000065.1| GENE 33 53700 - 56780 1892 1026 aa, chain + ## HITS:1 COG:no KEGG:BT_3297 NR:ns ## KEGG: BT_3297 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1026 1 1026 1026 2029 
100.0 0 MKKYILSLFFLISCVLNTYAHTEKQQQEKIEISGIVLDQNKEPLVGVNVSVKDIPGLGAI TDINGKFKIKVEPYQWLVFTFIGFDKQEILVKEQKSINVIMQEAKATAIDEVVITGTGAQ KKITMTGAATTVDVNTLRTSTSSITNALAGNVAGIMARQTSGQPGNNISEFWIRGISTFG AGSGALVLIDGFERDMNEINIEDIESFTVLKDASATAIYGSRGANGVVLITTKRGKSGKV QIDAKVETTYNTRTYTPELVDGYTYANLMNEARTTRNKEAFFSDEKLYILRNGLDKDLYP NVDWMDVLLKKGAPTYRATLNLNGGGSLARYYVSVSYVDEGGMYKVDEGLKDYNTNANYR RWNYRLNVDMDITKSTLLKVGVAGSLDKKNEPGMAGNIWASAMRNNPISVPVMYSNGYVP AYGSENDQLNPWVAATQTGYQETWNNKLQANATLEQSLNFITKGLKFIGRYGFDTNNYQW INRKKFPEMWRAEQNRASDGSLVMKKIKDEQLMTQESSSNGNRKEYLEAELHYDRTFGDH IVGAVFKYSQDKTIDTSQNGNDIMQGIDKRHQGLAGRFTYGWKYRYFFDFNFGYNGSENF APGHQYGFFPAYSAAWNIAEEPIIKKHLKWMNMFKLRYSYGKVGNDVVGDNDQRFPYLST FGASGGFNYADIGQKYEFTGLSYTHYATTAVTWEIATKHDIGIDFSLWDDKFTGTIDYFH EQRDGIYQQRNFIPISVGLNGAMRISTNIGSVLSKGFDGNFGFKQRIGDVNLTLRGNFTY SKSEILEYDEEYSNYPYKSQRGFRVDQTRGLIALGLFKDYDDIRNSPKQSWGDVMPGDIK YADVNGDGYVNDNDIVPIGATTRPNLVYGFGLSGQWKGIDINVLFQGSGKSTFCIDGPTV YPFENGDWGNILTDVVKSNRWILGENEDPNAKYPRLSYGGNSNNYRASTYWLRNGSYLRL KNLEIGYTIPKSLVNKMHLNNIRIYFMGTNLVTFSSFKLWDPELGSSNGQKYPLSKSYTL GLTINI >gi|226332255|gb|ACIC01000065.1| GENE 34 56791 - 58815 1059 674 aa, chain + ## HITS:1 COG:no KEGG:BT_3296 NR:ns ## KEGG: BT_3296 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 674 1 674 674 1344 100.0 0 MKKKHILLAIAFSIGLGVSLSSCSDYLSVERFFNDRQSEEKIFKSKQFTEEWLAKCYGLL NGSNIEYGLIKFTLSNYSDDIIFNESDGAVNYNALKFGQYDNGWVNDSYIRCYEGVRQAS IMINNVDINEELTKEEILDKKAQARFLRAYFYWLLLRRYGPVPLIPDETISIDETYIGMS CPRATYDEVVDYIDKEMILAANDLPDRRDRQNIARATKGAALAVRAKVLLYSASPLNNPL PNDPDKFTDFVDDQGRILMSQTYDESKWARAAAAAKDVIDLGRYELHIADYQDKGDNTYP ATVKPPYNAKYSTKNFPDGWANIDPFESYRSIFNGDLYANENEELIFTRGNNSLADQVNY LVRNQMPTFAGGENRHGLTAKQCDAYDMANGDAFNLQHFLDTCDASKRFVTEDEYKAGKY PQLRPNVWKEYANREPRFYASVAFSGAFWPFASAKDAEYRNQQIWYYRGNKEGRKNGSDK WIPTGIGMMKFINPNDCNTNNGKIYDKVDVAIRYADILLMYAEALNELTPGNSYEVTNWQ GETMVVSRNTEEMSKSVSRVRIRAGMPDYEQEVYNDQTLFRKKLMHERQVEFMGENQRYH DLRRWKNVAIEESEQIYGCNVLMTEATKERFYERVRVENLQANFSRKMYFWPILERELQW 
NRRMTQAPGWETFD >gi|226332255|gb|ACIC01000065.1| GENE 35 58834 - 59838 444 334 aa, chain + ## HITS:1 COG:no KEGG:BT_3295 NR:ns ## KEGG: BT_3295 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 334 1 334 334 666 100.0 0 MKRRIIYIVTMFAALSFFSACNDEWKDELYSHMISLKAPIGSEEVSDIYVRYKPNGEVVY DLPVLVSGSTPLDRDLNVRIAVDPDTLVALNLANYHELRKDLWYELLPEQNYEFTAPTCY MPAGTRKELFPIKFKFSGLNLSKEWVLPLTVEEDPSYVTNKRKGWRKALLHVLPYNDYSG NYSATAMNIYFAGSNDDPLVVDNRRAHVVDENTVFFYVGTKEDNSIDRETYKINVKFEKP IEETETKKTGNLTITATNPAINFQIIGNPTYEIWDEMDTNQPYLLHRYHVLRMEYSYDDV TSSNVPTSYRASGSLLMERKINTLIPDQDQAIQW >gi|226332255|gb|ACIC01000065.1| GENE 36 60114 - 62066 1222 650 aa, chain + ## HITS:1 COG:no KEGG:BT_3294 NR:ns ## KEGG: BT_3294 # Name: not_defined # Def: putative alpha-glucosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 650 1 650 650 1345 99.0 0 MRKLLLLFLFICGLVLHAQAQKNVELSSPNGEIKVSVRLTDKIYYDVACNDEVLLKDNYL QLKLKDKILGEHPKLSGQKQNKIKETLTPVVPLKFSTINNEYNQLLLNFKGNYSVEFRAF NDGIAYRFITRQKGEIEVLGEDLSLQFPTDYLLHLQQPGGFKTAYEEPYTHISSHSWKPS DKMSVLPVLIDTRKSYKILISESDLSDYPCMFLKGNGTNGISSIFPKVPLAFGEDGDRSL KIEKEAEYIAKTSGKRNFPWRYFVITKEDKQLVENTMTYKLADKNQLEDVSWIKPGQVSW EWWNGATPYGQDVTFKAGCNLETYKYFIDFASKFGIPYIIMDEGWAKSTRDPYTPNPDVD LHELIRYGKEKNVGIVLWLTWLTVEKNFDLFKTFNEWGIKGVKIDFMDRSDQWMVNYYER VAQEAARHHLFVDFHGSFKPAGLEYKYPNVLSYEGVRGMEQMGGCKPENSIYLPFMRNAV GPMDYTPGAMISMQPNIYRSERPNSASIGTRAYQMALFVIFESGLQMLADNPTLYYRNED CTRFITQVPVTWDETVVLEAKVGEYVIVAKRKGEKWFIGGMTNDKENEREFEINLDFLKD GKTYRVTSFEDGINAGYQAMDYRMKSATMNKNQKLSVKMARNGGFAAVLE >gi|226332255|gb|ACIC01000065.1| GENE 37 62169 - 66434 2845 1421 aa, chain - ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 29 1042 7 981 1087 660 38.0 0 MRRKIMSGVMATFLLMPFLALAQTTEQPEWCNEYVSGVNKEEAVQIAIPFTDEQQAMNLT IEESPYYKTLNGIWKFHWVADPKDRPQDFCKPEYDVSQWDNIKVPATWQIEAVRHNKNWD 
KPLYCNVIYPFCEWDWKKIQWPNVIQPRPSNYTFATMPNPVGSYRREFILPDSWKGRDIF IRFNGVEAGFYIWVNGKKVGYSEDSYLPAEFNLTPYLKAGKNVLAVEVYRFTDGSFLECQ DFWRFSGIFRDVFLWSAPKTQIRDFFFRTDLDKEYKNASVSLDIDITGKRSNNEIQVKVT DQNGKEIATQNARAVTGTNKLQFEVVNPLKWTAETPNLYNLTILLKQKGKTVDIRSVKVG FRKIELAQDGRLLINGKSTLFKGVDRHDHSSENGRTVSKEEMEKDVQLMKSLNINAVRTS HYPNNPYFYDLCDRYGIYVLSEANVECHGLMALSSEPSWVKAFTERSENMVRRYKNHASI VMWSLGNESGNGINFKSAVEAVKKLDDTRPTHYEGNSSYCDVTSSMYPDVQWLESVGKER LQKFQNGETVKPHVVCEYAHAMGNSIGNFKEYWETYERYPALVGGFIWDWVDQSIKMLAP DGSGYYMAFGGDFGDTPNDGNFCTNGVIFSDRTYSAKAYEVKKIHQPVWVEAMGNGTYKL TNKRFHAGLDDLYGRYEIEEDGKVVFSANLEELSLNAQDSKVITIADNQINKIPGAEYFI KFRFCQKQDTEWEKAGYEVASEQFKLSDSAKPVFKAGEGSIDLIETDDAYLVKGSQFEAS FSKQQGTISSYTLNELPMISKGLELNAFRAPTDNDKQVDGDWYQKGLYQMTLEPGHWNVR KEDNKVTLQIENLYRGKTGFDYRTNIEYTVAADGSILVNSTIIPSTKGVIIPRIGYRMEL PEGFERMRWYGRGPLENYVDRKDATYVGVYDELVSDQWVNYVRAQEMGNREDLRWISITN PDGIGFVFIAGDKMSASALHATAQDMVDPANHRRLLHKYEVPMRKETVLCLDANQRPLGN ASCGPGPMQKYELRSQPTVFSFIILPLERSYSTEELIKKARVQMPVCMPVLIERDNNGYL NLKTNTPGATVHYSLNGGEEKIYTEPFEFISGGHVEAYAVSEQLGKSAGTSAEFPIYVDR SLWKIVSVSSENGGEEARNAIDGDLNTIWHSRWNDPVAKHPHEIVVDMSSSLEIDKFIYQ PRNSENGRIKDYELYFSKDGKNWENKTKGRFENSSSAQFVTLEKPIVARYFKLIALSEIY GRDWASAAELNVNAVRNLSGASEERQKVVYVDSDADGSMKLAADGDINTFWHTVHNQFYL APYPHEIQVALAKETTVKGLKYTPRQDSSEGRIGKYEVYISHDGKEWGKAVASGTFADSK EVQTVEFNPCKARYVKLQALSAVIKEAKMAAVAELEVLLAE >gi|226332255|gb|ACIC01000065.1| GENE 38 66468 - 68609 1747 713 aa, chain - ## HITS:1 COG:no KEGG:BT_3289 NR:ns ## KEGG: BT_3289 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 713 1 713 713 1462 99.0 0 MKLRLTAMILLSLCLSQAFADEGMWLLGNLRKNKQAERVMKELGLQMPVNKLYNPKKPAL ANAVVSFGGFCSGVVVSEDGLVFTNHHCGFSSIQQHSSVEHDYLKDGFVARNLGEELPNP ELYVRFLLRTEDVTKRVLSAAKHAHTESERRVVVDSVMNVIGMEVSEKDSTLTGIVDAYY AGNEFWLSVYRDFNDVRLVFAPPSSVGKFGWDTDNWMWPRHTGDFSVFRIYANKQNGPAD YSPENVPYHPEYVAPISLDGYKEGSFCMTLGYPGSTERYLSSYGIEEMMNGINQAMIDVR GVKQAIWKREMDRHPDIRIKYASKYDESSNYWKNSIGMNKAIRHLKVLEKKRAAEAALRD 
WIQSHPEEREKLIRLFSSLELNYNNRRETNRALAYFGESFINGPELVQLALEILNFDFEA EEKLVVTRMKKLLEKYDNLNLSIDKEVFAAMLKEYRSKVDKKYLPAMYLQIDTLYNGNVQ TYVDSLYATSQITSPKGLKRFLERDTTYNLIEDPAISLSLDLIVKYYEMNQSVSEASEQI EQDERLFNAAMRRMYADRNFYPDANSTMRLSFGTVCGYSPFDGATFSYYTTVKGIFEKVK EHAGDIDFAVQPDLLALLSSGDFGRYADEKGDMNVCFISNNDITGGNSGSAMFNAKGELL GLAFDGNWEAMSSDIVFEPDLQRCIGVDVRYMLFIMEKYGKAANLIEELKLTR >gi|226332255|gb|ACIC01000065.1| GENE 39 68656 - 70011 1414 451 aa, chain - ## HITS:1 COG:SP1264 KEGG:ns NR:ns ## COG: SP1264 COG1808 # Protein_GI_number: 15901124 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pneumoniae TIGR4 # 29 332 10 311 347 234 41.0 2e-61 MKTDERNKFAIKSFLGEYLDLKKDKDNEMETVDSIRKGVEFKGANLWILIFAIFMASLGL NVNSTAVIIGAMLISPLMGPIMGVGLSVGLNDFELMKRSLKSFLITTAFSVTTATIFFLF TPIAEAQSELLARTSPTIYDVFIALFGGLAGVVALSTKEKGNVIPGVAIATALMPPLCTA GYGLASGNLVYFLGAFYLYFINSVFISLATFIGVRVMHFQRKEFVDKAREKMVRKYIILI VVLTMCPAVYLTFGIIKSTFYETAANRFINEQLNFENTQVLDKKISYEHKDIRVVLIGPE VPDASISIARSKLKEYKLEETKLTVLQGMNNEAVDVSSIRAMVMEDFYKNSEQRLQEQQK KITRLEENLAQYKEYDTMGRTLVPELKVLYPSITSLSIAHSLETRVDSMKTDTVTLAVLK FGKHPSVAEKEKISAWLKARVGAKKLRLITE >gi|226332255|gb|ACIC01000065.1| GENE 40 70014 - 71462 1293 482 aa, chain - ## HITS:1 COG:TM0620 KEGG:ns NR:ns ## COG: TM0620 COG2244 # Protein_GI_number: 15643386 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Thermotoga maritima # 23 430 24 427 479 72 21.0 2e-12 MAELKSLAKDTAIYGASSIIGKFLNYFLVPIYTFSLPAASGGYGVITKMYAIVALLLVVL TFGMETGFFRFANKGDDDPKKVYSVSLLMVGSVSLLFLLLCLVFLNPIAGLMGYSEHPWY LGMMLIVLAMDAIQAIPFAYLRYKKRPIKFAGLKLLFILVSILLNILYFVVMKGDDVGVA FLINLICSFVVMLGLIPEFREFRYCPDRLLMKRMLYYSFPILILGVAGIVNQVGDKIIFP FVYPDKVEADVQLGIYSAATKIAMIMAMITQAFRFAYEPFVFGKSKDKDNKRIYAQAMKF FIIFTLLAFLAVMFYLDILRYIIAEDYWEGLKVVPIVMIAEMFMGIYFNLSFWYKLTDDT RWGAYFSIIACLTVVLMNIFLIPVYGYVACAWAGFTGYAIAMLLSYFVGQKKYPIAYDLR GIGCYVGLAIILYLIGEWMPIHNTVLLLAFRTVLLLLFIAYIIKRDLPLSQIPVINRIIK KK 
>gi|226332255|gb|ACIC01000065.1| GENE 41 71498 - 72265 529 255 aa, chain - ## HITS:1 COG:TM0033 KEGG:ns NR:ns ## COG: TM0033 COG4099 # Protein_GI_number: 15642808 # Func_class: R General function prediction only # Function: Predicted peptidase # Organism: Thermotoga maritima # 32 255 169 395 395 172 38.0 6e-43 MKQWSILLVFLSLSLSLSAQQGYEKDVFVSSTGDSLPYRVIRPESMKAGKKYPLVLFLHG AGERGNDNERQLTHGGQMFLNPVNREKYPAFVLVPQCPAGSYWAYTERPKSLFPADMPAG QEMSSLTRTLKQLLDTYLAMPEVDKKRIYIVGLSMGAMGTYDLVVRYPEIFAAAVPICGS VNPDRLPVAKEVKFRIFHGDADDVVTVEGSREAYKALKAAGAKVEYIEFPGCNHGSWNPA FNYPGFMEWLFKQKK >gi|226332255|gb|ACIC01000065.1| GENE 42 72372 - 73403 1088 343 aa, chain - ## HITS:1 COG:NMB1243 KEGG:ns NR:ns ## COG: NMB1243 COG2255 # Protein_GI_number: 15677115 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, helicase subunit # Organism: Neisseria meningitidis MC58 # 14 330 22 338 343 388 62.0 1e-108 MEQEDFNIREHQLTSRERDFENALRPLSFEDFSGQDKVVENLRIFVKAARLRGEALDHVL LHGPPGLGKTTLSNIIANELGVGFKVTSGPVLDKPGDLAGVLTSLEPNDVLFIDEIHRLS PVVEEYLYSAMEDYRIDIMIDKGPSARSIQIDLNPFTLVGATTRSGLLTAPLRARFGINL HLEYYDDDILSNIIRRSASILDVPCSVRAASEIASRSRGTPRIANALLRRVRDFAQVKGS GSIDTEIAQFALEALNIDKYGLDEIDNKILCTIIDKFKGGPVGLTTIATALGEDAGTIEE VYEPFLIKEGFMKRTPRGREVTELAYKHLGRSLYSSQKTLFND >gi|226332255|gb|ACIC01000065.1| GENE 43 74016 - 75248 1171 410 aa, chain + ## HITS:1 COG:RSc0452_1 KEGG:ns NR:ns ## COG: RSc0452_1 COG2715 # Protein_GI_number: 17545171 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, required for spore maturation in B.subtilis. 
# Organism: Ralstonia solanacearum # 1 210 1 215 251 199 46.0 1e-50 MVLNYIWIAFFVIAFIIALIKVIFLGDTEIFTAIMNSTFDSSKTAFEISLGLTGVLALWL GIMKIGENSGLINALARFLSPVLCRLFPDIPKGHPVLGSIFMNMSANMLGLDNAATPLGL KAMKELQELNPKKDTASNPMIMFLVINTSGLIIIPISIMVYRAQMGAAQPTDVFIPILLS TFISTLVGVIAVSIAQKINLINKPILILMGVICLFFSGLIYLFLNISREEMGTYSTLIAN ILLFGVIILFILTGVRKKINVYDSFVEGAKEGFTTAVRIIPYLVAFLVGIAVFRTSGAMD FLVGGIGYVVGLCGVDTSFVGALPTALMKSLSGSGANGLMIDTMKEMGPDSFVGRMSCVV RGASDTTFYILAVYFGSVGISKTRNAVTCGLIADFSGIIAAILISYLFFF >gi|226332255|gb|ACIC01000065.1| GENE 44 75264 - 75677 413 137 aa, chain + ## HITS:1 COG:TP0650 KEGG:ns NR:ns ## COG: TP0650 COG0319 # Protein_GI_number: 15639637 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Treponema pallidum # 37 122 40 135 160 70 39.0 6e-13 MAVTYQTEGVKMPDIKKRETTEWIKNVAASYGKRLGEIAYIFCSDEKILEVNRQYLQHDY YTDIITFDYCEGDRLSGDLFISLDTIRTNAEQFGAAYDDELHRVIIHGILHLCGINDKGP GEREIMEEAENKALAMR >gi|226332255|gb|ACIC01000065.1| GENE 45 75771 - 78320 2113 849 aa, chain - ## HITS:1 COG:no KEGG:BT_3282 NR:ns ## KEGG: BT_3282 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 849 1 849 849 1670 100.0 0 MNKIFTLLLVGALTAGAVTANAESQIFKKKKKKAKAETEQPAAKDSTKSSMTEYKKLLKG AKTIDGMFKVHIVKDKYYFEIPKDLMGRDFMIASRVSSTSNNKDIAAGQMPRNPVLVTFS ADKKKIYMHKKMVRNLCDTTSNMYAAFQRNFSDPIWEAYKIESLSPDSSAYVIDMSSLFI TDVPEFSPFRSENIMDVLMKRKALKGSLASSKSTILGMKCFPLNINIKTLMSYTVDGGPF TVTMTRNIILLPEEIMRPRYGDSRIGYFDESKRFYTEKKDGLQELTYINRWDLQPKPEDL ERYKQGELVEPQKPIVYYVDTAIPDKWRDYIKKGIEDWQVAFEEIGFKNAIVAKDYPKDD PNFDPDDIRYSCYRMITTPIQNSMGPSCADPRSGEIIQGDVLFYSNIIKLLHNWRFVQTA TVDPKVRKAVFDDETMGSSLRYVAAHEIGHTLGLMHNFGASYSYPVDSLRSATFTQKYGT TPSIMDYTRYNYVAQPEDKGVALTPPLLGVYDKFAIKWGYKPIFDAASPEDEKATLNKWI KEKENDPMYKYGPQPFINEVDPSCKSEDLGDNAVKAGRYGLKNLKLIMKNLPEWTYEDDH QYERLTESYQEVIMQMQRYLLHAMVSIGGIYFDEPRRDSEKPVIRFVPKAEQKEALKFIM ETMMELPEWVLDKKVIEYTGPTYSPSTLQSIIMNRLFFTYITSSLVLYEELYPKQAYTFA 
EYMDDIYDFVWKKTISGARLNMYDRNLQITYVEKLLKEGGMLKQKSSPFSFKDLNEVEKQ LLTDNVDWTKAGFELNPLGSVDMVKNPAIYQKVMDSYSLLRSKVNTGDATTRAHYQSLVF KIKQALTEQ >gi|226332255|gb|ACIC01000065.1| GENE 46 78340 - 79239 642 299 aa, chain - ## HITS:1 COG:no KEGG:BT_3281 NR:ns ## KEGG: BT_3281 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 299 1 299 299 576 100.0 1e-163 MKKLILLICMLAGMSSCYHEDALIVPEQPDKYNILTDDPSDPTQHFIYQFYQKYQTVIIT NPTEADYKFNFTANNGIKITAPEQKQEIIDEGIEFLQKVLLNLYSDSFLKKNLPFSILLS EEVRMASYGETTIMNCYASSSFIALGNVSSSLKTMTDEEFVKIRADVNASFWAKYMSEVR GLFTISDAFYEASEEVEPKLYDPNWYRFKGTDPNEIDFYKYGVITYSENSYIDEDWPDFN SIYAPLKSEDLAQWMNFVFEKTPAEIQEICDKYPVMKKKYDVIREAMLENGFDLSKLEL >gi|226332255|gb|ACIC01000065.1| GENE 47 79246 - 80712 947 488 aa, chain - ## HITS:1 COG:no KEGG:BT_3280 NR:ns ## KEGG: BT_3280 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 488 3 489 489 972 99.0 0 MKKYNLYIILLICTLSSCSDFLEESSQDLMIPRSVKDYKELFFGEVMKTNEKDIPHPYLE YMTDDVKDQCYYGTRPQVISNDFREYVWAYYTWQGNPEVGISNELTPDVAWTVYYHKILM TNIILDQLYTMNGTDMERQDLAGEAYFMRAFSYFMLANLYGKPYHPATANEDLCVPLNKE IGLSDKMMKRATNAAVYAQMEEDINKAIQCFKAIGGEKTIFRPNLPSAYLLASRIALFQE KYDETILYCDSVFKVATQPLYKLQEKKDHYFFSLANKEILLSYGLTSLETHMKEDFRYTG NLVVSDDLLALYSEDDLRLTNYFKNTIGNQTRPTNKQYSVYTPIKWTKNSATVYANALRI SEAYLNRAEAFAELGNTNKALADLNELRENRMKPGTPPLAVDEDGIVATVRKERRRELAF EGFRWFDLRRYGCPPLTHTYSSKENEGAGDVFELKEKGSYVLPIPKSERERNTEIEIFDR PNSEPVNN >gi|226332255|gb|ACIC01000065.1| GENE 48 80725 - 84273 2270 1182 aa, chain - ## HITS:1 COG:no KEGG:BT_3279 NR:ns ## KEGG: BT_3279 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1182 1 1182 1182 2338 99.0 0 MQKKRARERIIMFLCIISLLFTPVLAASQSNAMRVSLDLKSATVKEFFDAVKVQTGLNFI YSTEQVKNMPRITIQSNSQPISEVLNKVLANTGYSYEIEGNIVTIVYQQPKENVRTATGV VVDEGGLPLPGAYIKLSKTEHSTITDNEGKFSINLFHQKEPVLIVSYLGMIVQEVKVAGT KPMTIVMKSDVKSIDEVVVTGYQEIDRRKLASSISSIKGADLIGGEYLSIDKMLQGRLPG 
VAVMNMSSTPGAAPKIRIRGSSSITGNREPVWVVDGIILEEPVNISTEELNSPDKINLIG NAIGSVNPEDIERVDILKDASATAIYGIKAANGVIVVTTKQGKSQKPSISYTATLGITAP PTYDKMFRMNSADRIDMSIEMQERGLSFGSYKPSDIGYEGALQHLWNKDITYQEFLNQVK TLKGLNTDWYDLLFRTAFTHQHSVSITGGNDRSNYYMSMGYANNNSVTIGEGLERYNVLA KINTRINRNIHLGLKVSGSLSKAEHPHTSIDVYEYAYNTSRAIPLRTASNELYFYANEAG RNGVLSYNIMNELNHTGNKNDNSAIDVAVNLDWKVASWMRFSSILGLSRSNVTQENWGDE QSYYISSMRQSPYGKMLPNPIEDSEFAERYCLLPFGGELMTTNTRNTSYTWRNSLSLMQS FGKHEVSGSIGQEVRSSKYDGLSSTQYGYLPERGKKFVDIDPTVWKRYGALVKSHPDVVT DTKNNVISLYATAAYVYDSRYILNFNIRTDGSNKFGQSKSVRFLPIWSVSTRWNVINEKF MKNVDFLNDLAIRASYGIQGNVHPDQTPNLIASLGTLESMPQEYISTLYKLPNNKLKWEK TNSYNIAIDWAFWNNRIYGSLDVYYKKGVDQVVTKNVAPSTGASSVSINDGDVENKGWDL AVSFVPIQTRDWMWSLSFNTGKNYNKVLNAGNSAITWQDYIAGTLVSNGNAINSFYSYKF DKLDDQGFPTFKDINEKDEEGNAVVHSLQEMYDRAFVLSGKREPDLTGGFSTYLKYKNIT FNALFSFSFGNKMRLNDLYESSGQRLPYPDQNMSSEFVNRWREPGDEDRTIIPVLSDKSL QINDKDVTYRIADNGWDMYNKSDIRVVSGSFLRCRSMSIRYDFKREWLKPVYLKGASVSF DMGNVFVIKDKALKGRDPEQIGFGSRSVPPQRSYSLRFNITL >gi|226332255|gb|ACIC01000065.1| GENE 49 84307 - 85491 921 394 aa, chain - ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 201 384 134 315 331 78 30.0 2e-14 MIKRKRSINSIIHDQLLGYASEEDKKQLQSWLDSSQENRENYDRLMREACLIDRYKQFAQ VDEARAWERFQKKHFSIRSARWIKIGRYAAIFLLPIIGFAIWFWTLRLMDSQPVISDEVR IAMIRSEKMGKQKATLVLANGQKMDLKSVPAKPLQDSVEQVPVAQPSVPKTDMNESEEVP VVENNKLSTYDDSEFWMTFEDGTRVHLNYNTTLKYPPHFGTTTRTVYLDGEAYFQVAKDS KRPFRVITANGVVKQYGTTFNVNTHVPGITKVVLVKGSVSVLPNQGGEYEIKPGELAVLQ ADTQDVQISVVDIEPYIAWNSGRFVFDNCSLESLMNVISRWYNKDVVFESEDTKKIRFTG DIDRYGSIEPILKAIQRVTHLEMEIQGKTIIIKK >gi|226332255|gb|ACIC01000065.1| GENE 50 85690 - 86220 226 176 aa, chain - ## HITS:1 COG:PA2896 KEGG:ns NR:ns ## COG: PA2896 COG1595 # Protein_GI_number: 15598092 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, 
sigma24 homolog # Organism: Pseudomonas aeruginosa # 2 164 25 189 194 63 28.0 2e-10 MDAAAFRLLYKNYYKALVCYAITIVGDSESAEDIVQELFSTIWEKKMFFRSLASFRVYLY NSVRNASLDYLKHKDVEGNYLQKMLDSHSTTFRLEEEEEGFFSEEVYRQLLQTIDALPDR CREVFLMYMEGKKNEEIATALHVSLETVKTQKKRAMSFLRKKLGSYHFLLLQMLFL >gi|226332255|gb|ACIC01000065.1| GENE 51 86522 - 89128 1903 868 aa, chain - ## HITS:1 COG:no KEGG:BT_3276 NR:ns ## KEGG: BT_3276 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 868 1 868 868 1725 99.0 0 MRNRYSKILCLLLLFACCLPQEGNAFWPFKKKKKEDKKENLTPYQKLFKNKKVQTAHGLM TIHKVEGKVYVEFPVDMLGREMLFASSIENTSDGGEGAPGQLGGTDVRFRFEMIDSTLVA RMPLLSKPVNTSGDAYIARALDNAHNPGIFKSFKVLACTPDSSALVVDMKGLFLEGSAFT KPFPSTSANGYYGFVSRDHSLQSDKSAILGVSASENHISVREELSYTVDHTLMGAYNMYK DVPLTAVVTKMLCILPEEPMTPRLADSRLGLTTQLKSDYSGATQEVKSIRYAKRWRIEPS DSAAYRRGELVRPVKQLVFYIDSLMPVKWNPYIKAGAESWNKAFEKIGFKDVICVKEFPK NDTLFNANSFDCMTIRYSASWLNSAQTTIHSDTRTGEILNASILINANMISVQYADRIGA TVAIDPRVRTTVFPQEIQGELIQAAIAQAVGTGLGLTMNWGASCAYPVDSLRSASFTQKY GLASSVMGGVIINDVATEEDVRNGVCLVNTKPGPYDELVIKYLYQPIYASSLQEEKETLD SWIREHTGDPYYAYIRNQSRFDSDPRNSRGSLGDDHLKSFDYMLPNVRKGFENYYSWFAK EDRDFLMRRRVHSALSERLSGRIYAILSYIGGIYLNDIREKDAIPSYSMVDREKQKAALS KALELAKNLDWVDDTAHLNEFEISDKKADRLRLDIFNGIFGRLPYVEVCTERFPDAAYTA SEYLDDIYGIVWEGVLKHRPLENFDRALQTAFLESIISTSTATAPIGSFKASKTGFAATG KPEINLAALRNGGKVDMKMYSLQNAEEVAGFSSIPPIYTNQTKIAAYYFDLLMRTKDMLK KAISTAPEADRSHYDLLIYRINKATEIE >gi|226332255|gb|ACIC01000065.1| GENE 52 89137 - 89859 709 240 aa, chain - ## HITS:1 COG:no KEGG:BT_3275 NR:ns ## KEGG: BT_3275 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 240 628 860 860 458 96.0 1e-128 MTVISVQALKLYRSNKQKKALNFLLNELKDLSWLDARETLKEFPIRTSVAMQMEDAIIDG ILSRCGAVAICSGKATSAPYTQKEYLADIERFVWAPTRAGRTLTETEMRLQMNYLTKLIE GSVAGVAKNATRKGIADMYAHIEVPEWLKAASRERFGIISEEFMGCFNNKPREMQAARLE EISGFDDDVDLKAPIEPMEHIYYGELKKVRSLVAASASTGSADTQRHYRLLLYKIDQALK >gi|226332255|gb|ACIC01000065.1| GENE 53 89780 - 91765 1724 661 aa, chain - ## 
HITS:1 COG:no KEGG:BT_3275 NR:ns ## KEGG: BT_3275 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 660 1 645 860 1262 98.0 0 MKKLYLLLLVALCGALITPAVAKKKNKKKASTEATTKKEKKETKYDKLFKDKKVTTAKGF ITLHKLDNKIYFELPIKVMNRDILLGSTIVETTDNQFGCVGEKSGDPFLIRFVQRDSTIT LRRVQAGQFSEDREIKKRLVSSTMPAIAETFEIKAYNNDSTAVVFDMTDYLLSDKKNLSP FSPYSMMEMMGADISKDFEKESSFIQGVKGFEDNVSIRSLLTYKISVGMKGNYAVKKMPF TAVMNRSFILLPEQPMRPRYADPRIGIFEHSVVEFASEGRGLRTRHIAHRWNLEPSDSAA YLRGELVEPKKPIVFYIDDTFPLSWNKYIHKGVEVWQKAFEKIGFKNAIIAKDFPKDDPN FDPENIKYSCIRYAPSTVANAMGPSWIDPRSGEIINASVTVFHNIVQLVQYWRFLQTAPA DEEVRDVVLREDLLGDCIAYVLSHEVGHTLSLMHNMAGSSSIPVESLRDPKFTQEFGTTY SIMDYARNNYIAQPGDKERGVRLTPPELGAYDYYAIAWLYTPIFEAKTAEEEIPILDKWI SEKSGDVKYRYGKQQFRRRFDPSSVEEDLGDDPVKASEYGRRNLQYLLKHINDWVADKDK DFEFRTAVYDEIVYAYVRYLTFVLGDIGGIYLNERYDGDQRPSFKTVSKQQTKKGSEFLI E >gi|226332255|gb|ACIC01000065.1| GENE 54 91785 - 93455 1009 556 aa, chain - ## HITS:1 COG:no KEGG:BT_3274 NR:ns ## KEGG: BT_3274 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 553 1 533 534 649 61.0 0 MNMKIIFIVVFLLGMASCYDDKGNYDYKEMNDIEVSVSTETSYYSLGDKVISKPELTFAL GRESQNLAYEWSYDGHVIAHSKDLEWVADTIAKNKDLRLAVLDKSTGVSYFGSTTISISS PYVKDGWVILSEKDGNSMLTFMKFQTEEGVLKPVVTRDIYQMINKEPLGSQPVSIYPHWV EQWDGEDPGISWLWIAQKGGQGAVDVSGSSYQREGVLSQMFLEGYPQDLVPEAVIDLQCL TMAVGEDGTIYTRVKESNLLFNTSRFINTPLTSDAEGKVKVDGRMIAYAPFSGHGGLLLY DKNSSQYLHVTDKLSSNGTFNSGKILTLNVDETTYTNHPTYARLDNMKDYTVHYVGACES DSYDYGKMMYFAVIEHKQSHKFYIQKFAVRDFSGSTTVTSLATTYVSQVEAPAELGQIID GTSKNSFSLFRYQDTWQYLFISKGNALYLYYLKENMLFKCATFNSSITSVDTQFYNNQYI IVGLESGDAYIMKGYTYNTSDEYTIDKYVINKGQTIQVTEVDGKFVLYHEKDLGRIVQVR YKPNTGNGWDDNFDWQ >gi|226332255|gb|ACIC01000065.1| GENE 55 93474 - 94199 501 241 aa, chain - ## HITS:1 COG:no KEGG:BT_3273 NR:ns ## KEGG: BT_3273 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 241 1 236 236 325 69.0 9e-88 
MKKNILFALISLCLWACSEDEIKPYQGEQYLYFSQLINTTNDYDKENGYMEVSFNNYPTS EEITVKIGMSLIGKPLNEDTPYSVVVVEENEKEDPTPNAASKNFRLPINPKFGAGLSNDT LEVTLVKTDDLKENVKLCLKLVPNEYFQGTMQEYEKIKIVFNNVVSKPEWWTTQVTRVYL GTYSRKKYEEFVKCTGVTNFGALNTAEKRAYSLTFKRYIAQNNVMDVDADGKEFPMTVPI N >gi|226332255|gb|ACIC01000065.1| GENE 56 94224 - 95693 1063 489 aa, chain - ## HITS:1 COG:no KEGG:BT_3272 NR:ns ## KEGG: BT_3272 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 489 1 488 488 669 68.0 0 MNNMKKVYAMILLTASLFTTSCSDWLDVAPSNQVNGDKLFEVGDGYRNALNGIYLNLGTS SMYGKSISWGFMDVIAQYYQSGSSYMKSTSSYYKAANYKFDDSDVKSIISSIWSVGYNNI ANCNNLIANVSEASPSVFAEDELEKNMIWGEALGLRALIHFDMLRMFAPSMLKDDGKAYI PYVDVYPTIVPSYENNKEILRKIVNDLKAAKDLLAKCDTIKENKHWMSTENRMLAVNSGT SGELAKDVFFAYRGYRMNYYAVTAMLARVYCWAGEYGNAYKEAKEVVDATYSTGSTSTAS CFNFSNSLTTNRKDYNSIIMTFFKSTCYEDYLPYMTKGNDVVLVVNTTNLFGDEAGADDR NLDLFGDWDGGDIKYSFKYDIKEGTSGTDMIPAIRLSEMYYIMGEYFAYKDEFSKAGEML DKVRYARGISTTNLASSIGSLEVFHTHLIKEMRREFVGEGQMFFQYKRLDQKPKDNVVFV FDKPNNEDI >gi|226332255|gb|ACIC01000065.1| GENE 57 95713 - 99045 2098 1110 aa, chain - ## HITS:1 COG:no KEGG:BT_3271 NR:ns ## KEGG: BT_3271 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1110 8 1116 1116 2050 93.0 0 MLCLFALSMVLSAWALPPDEKKVTLDLKDVPMETFLNTVKQKAGINMLYNSQMFKGVEPV TVKATNEPWKELLDRVLSPKGFTYATKDGIVVIKKKDQKTRTLEGKVVDEQGGELPGVTV LILGKERNIGTMTDADGAFSLGLPEGDVTIRLSFVGMQTLEIHTGKLNLDKVHTFKLKPD SKQLEEVVVTGYVKISKNSFTGTSVTVTADQLMSVSKTNVLGALQTFDPSFRIAENNLAG SNPNNVPELYIRGRSGIGTTQLDAQSLSKTSLKNNPNLPTFIMDGFEISVQKLYDMDPSR IESITILKDAAATAMYGSRAANGVVIITTVTPKAGKVNIDYNFTGTLEYPDLTDYNMMNA REKLETEVAAGMFKAKPEAGASEQNRLDQIYNRKLNNVVRGVDTYWLSKPLRTVFNHKHS LYLDGGTDDLRWGADFSYNSGDGVMKGSYRDRMSGGLSLSYRFGAFQVKNYFSYTYTKSE ESPYGSFSDYTSKLPYDEYLDENGNYLETTTSWTGGTEENPLYEATLKSYDRSKSWELIN NTELLWNIDNYWLLKGQFSVTKSNSDSRQFLDPRSKKNSEPLNANNVISGQLDVATANSL SWDATATLSYNRNIKKNNLNFLIGINATSSTGDSFGIAYKGFPSGELSSPGYANKVPTLP 
VNSDSKSRLFGALGTFNYSFDNIYLADVSLRFDGNSEFGSDQRFAPFFSGGLGLNLHNYK FMKQFEWMERFKIRGTYGITGKVDFSPYEAQTIYQVITDNWFKTGLGATLMALGNSSLGW EKTHNMDLGLDVDMFKGLLQMNFSYYRKKTVDLVNSVTLPSSTGFTSYKDNIGEVMNKGV EIQLRSTILNSKDWYVAAFVNMAHNKNEITKISDSLKEYNKRVQEKYADYNSGINKGKAE YSATYLQYVEGGSLTSIFGVKSLGINPADGREIFVRPDGTITYDWNAADQVVIGNEEPKL QGTFGFNMRWKQFSVYSTFMYELGGQRYNSTLVSKVENAYIQSSNVDRRVLTGRWQNPGD CTPYGRLQTDGVVAVTRPTSRFIQDYNVLTFNSLTVGYDFNAEWLKKARIGLLRVELSGN DLFRVSTVRAERGLDYPYSRSFAASVKMSF >gi|226332255|gb|ACIC01000065.1| GENE 58 99090 - 100298 984 402 aa, chain - ## HITS:1 COG:AGl2871 KEGG:ns NR:ns ## COG: AGl2871 COG3712 # Protein_GI_number: 15891547 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 203 348 135 278 331 70 34.0 6e-12 MKKLENTYQDASLIKKALLGDASDKEQQVLEQRLSMHPELKAVYEKLQDSEELKDAFNKY KEYSSEKAYEHFLQQIQQKENSSGGKIRRIRTWWYAAAAVVVLAVGISFYAVNHYQAVEE SQTRIAQIKPGSKQAVLTLPDGSTIDVQKKDINVIVDGVQVKYNKGVLSYRPTVTTQQHE EENVDESPVKSNELVIPRGGENTVLLADGTTVHLNAGSKLLYPVRFVGKRRIVTLEGEAY FDVRKDEEHPFVVRTRFGEVTVLGTAFNINAYNDADACYTTLVYGKVNFSTPDQNTITLA PGEQAVASSRGIEKRVVDVNEYIGWAQGVYVFNNKRLGDIMKTFERWYDVHVYYEKESLR DLTYSGDLQRYGTINTFLNALELTGDIYYRINGRNILIYENE >gi|226332255|gb|ACIC01000065.1| GENE 59 100814 - 101386 497 190 aa, chain - ## HITS:1 COG:BS_sigX KEGG:ns NR:ns ## COG: BS_sigX COG1595 # Protein_GI_number: 16079367 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus subtilis # 23 176 8 165 194 58 29.0 1e-08 MSDKDDLIVAGVNQKDKGIWRDFYDRYYAALCSYVEKILFLTDAVEDLVQEVFISVWEGK RTFSDSRELTNYLYRACYNNALLYIRNHQIHDSILNGLPQEEDFEDEEMLYALTVKEEAI RQLYFYIEELPAEQRRIILLRIEGHSWDEIASRLGVSINTVKTQKSRSYKFLREKLANSS YSVLLALIFY >gi|226332255|gb|ACIC01000065.1| GENE 60 102221 - 104107 1591 628 aa, chain + ## HITS:1 COG:RSc3328 KEGG:ns NR:ns ## COG: RSc3328 COG0445 # Protein_GI_number: 17548045 # Func_class: D Cell 
cycle control, cell division, chromosome partitioning # Function: NAD/FAD-utilizing enzyme apparently involved in cell division # Organism: Ralstonia solanacearum # 4 624 6 627 647 595 49.0 1e-170 MDFKYDVIVIGAGHAGCEAAAAAANLGSKTCLITMDMNKIGQMSCNPAVGGIAKGQIVRE IDALGGQMGLVTDETAIQFRILNRSKGPAMWSPRAQCDRAKFIWSWREKLENTPNLHIWQ DTVCELLVENGEVVGLVTLWGVTFKAKCIVLTAGTFLNGLMHVGRHQLPGGRMAEPASYQ LTESIARHGIAYGRMKTGTPVRIDARSIHFDLMDTQDGECDFHKFSFMNTSTRHLKQLQC WTCYTNEEVHRILRKGLPDSPLFNGQIQSIGPRYCPSIETKIVTFPDKEQHQLFLEPEGE TTQELYLNGFSSSLPMDIQIAALKKVPAFKDIVIYRPGYAIEYDYFDPTQLKHSLESKII KNLFFAGQVNGTTGYEEAGGQGLIAGINAHINCHGGEAFTLARDEAYIGVLIDDLVTKGV DEPYRMFTSRAEYRILLRMDDADMRLTERAYHLGLAREDRYQLMKTKKEALEQIVNFAKN YSMKPALINDALEKLGTTPLRQGCKLIEILNRPQITIENIAEHVPAFQRELEKATAADSD RKEEILEAAEILIKYQGYIDRERMIAEKLARLESIKIKGKFDYASIQSLSTEARQKLMKI DPETIAQASRIPGVSPSDINVLLVLSGR >gi|226332255|gb|ACIC01000065.1| GENE 61 104158 - 104688 578 176 aa, chain + ## HITS:1 COG:YPO3123 KEGG:ns NR:ns ## COG: YPO3123 COG0503 # Protein_GI_number: 16123288 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Yersinia pestis # 11 162 18 169 187 159 50.0 3e-39 MIMSKETLIKSIREIPDFPIPGILFYDVTTLFKDPWCLQELSNIMFEMYKDKGITKVVGI ESRGFIMGPILATRLNAGFIPIRKPGKLPAEVIEESYDKEYGTDTVQIHKDALDENDVVL LHDDLLATGGTMKAACELVKKLKPKKVYVNFIIELKDLNGKSVFRDDVEVESVLTL >gi|226332255|gb|ACIC01000065.1| GENE 62 104765 - 106594 1466 609 aa, chain + ## HITS:1 COG:lin1197 KEGG:ns NR:ns ## COG: lin1197 COG0322 # Protein_GI_number: 16800266 # Func_class: L Replication, recombination and repair # Function: Nuclease subunit of the excinuclease complex # Organism: Listeria innocua # 9 592 2 575 603 416 42.0 1e-116 MNTEPEAKTNEYLRGIVANLPEKPGVYQYLNTEGTIIYVGKAKNLKKRVYSYFSKEHEPG KTRVLVSKIADIRYIVVNTEEDALLLENNLIKKYKPRYNVLLKDDKTYPSICVQNEYFPR IFRTRKIIKNGSSYYGPYSHLPSMYAVLDLIKHLYPLRTCNLNLSPENIRAGKFKVCLEY HIKKCAGPCVGLQSHEDYLKNIDEIKEILKGNTQDISRMLVEKMQDLASEMKFEEAQKIK 
EKYLLIENYRSKSEVVSSVLHNIDVFSIEEDDSNSAFINYLHITNGAINQAFTFEYKKKL NESKEELLTLGIIEMRERYKSHSREIIVPFELDLELNNVVFTVPQRGDKKKLLDLSILNV KQYKADRLKQAEKLNPEQRSMRLLKEIQSELHLDKPPLQIECFDNSNIQGSDAVAACVVF KKAKPSKKDYRKYNIKTVVGPDDYASMKEVVRRRYQRAIEENSPLPDLIITDGGKGQMEV VREVIEDELHLNIPIAGLAKDNKHRTSELLFGFPAQTIGIKQQSSLFRLLTQIQDEVHRF AITFHRDKRSKRQVASALDSIKGIGEKTKTALLKEFKSVKRIKEASLEEIAKVIGEVKAQ TVKKGLSNE >gi|226332255|gb|ACIC01000065.1| GENE 63 106872 - 107099 105 75 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSSCPSDSSMPTNISIPFPMEDLHCPSMVTDAWLTRCTTILIDIFSILLNVTKGKFTKLR RKNDELEGNIIGYKY >gi|226332255|gb|ACIC01000065.1| GENE 64 106975 - 107427 454 150 aa, chain + ## HITS:1 COG:SP1644 KEGG:ns NR:ns ## COG: SP1644 COG1490 # Protein_GI_number: 15901480 # Func_class: J Translation, ribosomal structure and biogenesis # Function: D-Tyr-tRNAtyr deacylase # Organism: Streptococcus pneumoniae TIGR4 # 1 149 1 147 147 150 50.0 7e-37 MRIVVQRVSHASVTIEGQCKSSIGKGMLILVGIEESDGQEDIDWLCKKIVNLRIFDDENG VMNKSILEDGGEILVISQFTLHASTKKGNRPSYIKAAKPEISIPLYEQFCKDLSGALGKE IGTGTFGADMKVELLNDGPVTICMDTKNKE >gi|226332255|gb|ACIC01000065.1| GENE 65 107516 - 107854 444 112 aa, chain + ## HITS:1 COG:SA1292 KEGG:ns NR:ns ## COG: SA1292 COG1694 # Protein_GI_number: 15927040 # Func_class: R General function prediction only # Function: Predicted pyrophosphatase # Organism: Staphylococcus aureus N315 # 2 99 3 101 105 80 46.0 6e-16 MTLEEAQKEVDKWIKTYGVRYFSELTNMAVLTEEVGELARVMARKYGDQSFKEGEKDNLD EEIADILWVLLCIANQTGVDVTKAFRESLEKKTKRDNKRHINNPKLNDHGRE >gi|226332255|gb|ACIC01000065.1| GENE 66 107841 - 108743 1039 300 aa, chain + ## HITS:1 COG:SMb21300 KEGG:ns NR:ns ## COG: SMb21300 COG0274 # Protein_GI_number: 16264552 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Sinorhizobium meliloti # 54 293 71 314 334 181 39.0 1e-45 MEENSSQSNKYDAALAKYNTNLSDADIQARVADLIEKKVPENNTEEVKKLLFNCIDLTTL NSTDSDESVMHFTEKVNEFDNEFPDMKNVAAICVYPNFADIVKNTLQVDGINIACVSGGF 
PSSQTFIEVKVAETALAIADGADEIDIVISIGKFLSGDYETMCEEIQELKEVCKERHLKV ILETGALKTASNIKKASILSMYSGADFIKTSTGKQQPAATPEAAYVMCEAIKEYYQKTNN KIGFKPAGGINTVHDAIIYYTIVKEILGEEWLNNRLFRLGTSRLANLLLSDIKGEEIKFF >gi|226332255|gb|ACIC01000065.1| GENE 67 109150 - 109830 466 226 aa, chain + ## HITS:1 COG:no KEGG:BT_3262 NR:ns ## KEGG: BT_3262 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 226 1 226 226 467 99.0 1e-130 MKSHTKNLRFNHNPLNLILGTRKKQGLRIGYMEAALDGFYLNCIETGVHPEKLSKLLSDK FHCTDAISSCQLFLFLINEGDRASYSIMVPYLLSTENLNQFENTIRERFYGVDRFIQQGR NLYKFKEYIEERGEPIVWINDLERGVIGWDMAQVVGLARAAKDCGYITKEQAWEYIEQAS TLCSEILRTPEEIDKSFLIGGAMKSNKIEDWEELILCYSLLRNSGE >gi|226332255|gb|ACIC01000065.1| GENE 68 109831 - 110805 804 324 aa, chain - ## HITS:1 COG:PA4569 KEGG:ns NR:ns ## COG: PA4569 COG0142 # Protein_GI_number: 15599765 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Pseudomonas aeruginosa # 16 322 12 320 322 179 36.0 7e-45 MDSISLIKTPIEAELEDFKALFDTPLSDSNALLDSVITHIRKRNGKMMRPILVLLVARLY GAVTPATLHAAVSLELLHTASLVHDDVVDESTERRGQLSVNAVFNNKVSVLAGDYLLATS LVHAEQTNNYEIIRLVSSLGQKLAEGELLQLSNVSNHSFSEEVYFDVIRKKTAALFAACA EAAALSVQVGEEEVAFARLLGEYIGICFQIKDDIFDYFDSKKIGKPTGNDMLEGKLTLPA LYALNTTKDAWAEQIAFKVKEGTATPDEIVRLIEFTKDNGGIEYACRMIEQYKKKAFDLL ATLPDSNICLALRTYLDYVVDREK >gi|226332255|gb|ACIC01000065.1| GENE 69 110894 - 113743 2670 949 aa, chain + ## HITS:1 COG:YPO0017_2 KEGG:ns NR:ns ## COG: YPO0017_2 COG0749 # Protein_GI_number: 16120370 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Yersinia pestis # 356 949 45 645 645 512 46.0 1e-144 MESDNKLFLLDAYALIYRAYYAFIKNPRINSKGFNTSAILGFVNTLEEVLKKENPSHIGV AFDPSGPTFRHEAFEQYKAQREETPEAIRLSVPIIKDIIHAYRIPILEVAGYEADDVIGT LATEAGRQGITTYMMTPDKDYGQLVTNNVFMYRPKHTGGFEVMGIEEVKAKFDIQSPAQV IDMLGLMGDASDNIPGCPGVGEKTAQKLIAEFGSIENLLEHTDQLKGALKTKVETNKELI 
TFSKFLATIKIDVPIQLEMDKLVREQADEDSLRQIFEELEFRTLIDRVLKKENSAGGVTM ATGSKTATAKSAPSPLPLFPEEGGAIQGDLFANFTGNEAGEAKKSNLETLETLNCDYQLI DTEKKRAEIIQKLLTSKILSLDTETTGTEPMDAELVGMSFSITENQAFYVPVPDNREEAL KIVNEFRPVFENENSLKVGQNIKYDMIVLENYGVQVKGALFDTMIAHYVLQPELRHGMDY LAEIYLHYQTIHIDELIGPKGKNQKNMRDLDPKDIYRYACEDADVTLKLKNVLEKELKEN DAERLFYDIEMPLVPVLVNIERNGVLLDTEALKQSSVHFTAQMQRIEQEIYELAGETFNI ASPKQVGEVLFDKLRIIEKAKKTKTGQYVTSEEVLESLRHKHPVVEKILEHRGLKKLLGT YIDALPQLINPRTGRVHTSFNQTVTSTGRLSSSNPNLQNIPIRDENGKEIRKAFIPDEGC LFFSADYSQIELRIMAHLSEDKNMIDAFLSNHDIHAATAAKIYKIDLKDVDSDMRRKAKT ANFGIIYGISVFGLAERMNVDRKEAKELIDGYFETYPSVKAYMEKSIQIAQEKGYVETIF HRKRFLPDINSRNAVVRGYAERNAINAPIQGSAADIIKVAMARIYQRFQTEGIQAKMILQ VHDELNFSVPVNEKERVEEIVIEEMEHAYRMHVPLKADCGWGKNWLEAH >gi|226332255|gb|ACIC01000065.1| GENE 70 114076 - 114471 475 131 aa, chain + ## HITS:1 COG:no KEGG:BT_3259 NR:ns ## KEGG: BT_3259 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 131 1 131 131 210 100.0 1e-53 MKTKVSLKMFVLSVVLLMTSFVASARSYDGQLVYNPVEENGVMVGQTVYKMNGSTLANYM KYNYKYDDNKRMIESETLKWNSTKEEWEKDLRINYTYEGKTVTTNYYKWNNKKRAYVLVP EMTVTMDNTNL >gi|226332255|gb|ACIC01000065.1| GENE 71 114579 - 115358 648 259 aa, chain - ## HITS:1 COG:VC0693 KEGG:ns NR:ns ## COG: VC0693 COG3279 # Protein_GI_number: 15640712 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Vibrio cholerae # 11 122 5 114 237 92 41.0 5e-19 MMNAEDKYKVVIVDDERTAIDALRRELGAYQEFEIKGTANNGAKGKKMIMELRPDLLFLD VELPDVLGLNLLSEMRDEVLWDMKVVFYTSYDKYLLQALRESAFDFLLKPFETEDLKVVI ERYRKAVASIPATPSSSFSSSVSALMPQQGMFMISTVTGFKLLRLEEIGYFEYLKDKRQW QVVLFNQTRLNLKRNTKAEDIIGYSQAFIQISQSAIVNVNYLAMIDGKCCQLYPPFHDKS DLVISRSFLKGLQDRFFVL >gi|226332255|gb|ACIC01000065.1| GENE 72 115355 - 117370 1462 671 aa, chain - ## HITS:1 COG:ECs3260 KEGG:ns NR:ns ## COG: ECs3260 COG3275 # Protein_GI_number: 15832514 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # 
Organism: Escherichia coli O157:H7 # 457 664 350 551 565 94 29.0 5e-19 MKLFGLIGLLILWNCSCVDRHRSHALCEQSLIDSLEVRAQDSLFSNLPYSRSLLRNAMRQ AQDSMSYYRLMGLYGKTFFISSDFDSILYYNRPVKEYDKCAAACPRWNDVLSDVYNIEGN VWMQLNQPDSAVAYYEKSYAYRLKGEKGHLLSDICMNLADAHLHRGELAHTASYYRRALF ICDSLHLSEHSKFPVYCGLGQTYMDLRDFDLSNHYYELAGQFFDEMTVSEKWVYLNNRGN HYYYKKDYQEALVYMRQAAELIADYPQMVFESNLSKVNLGDLYLLTNRLDSAENNLNEGY RYFSEIKNNSAMHYIETLMIGLSLKKGNIARAREMIARTASTGHVDANMLTIRNQYLQHY YEKAGDYRNAYEYLKRDYQLNDSIRSERIRTRVAELDMRYCQDTIVLRKEMQIQRQAGEV RVLKLSMYIWVLVCLLLAAGTVVIIWYMRKKREFLRERFFRQINRVRMENLRSRISPHFT FNVLGREINQFNGSEEVKNNLMELVKYLRRSLELTEKLSVSLQDELDFVQSYIGLESGRV GEDFTASVVVEEGLDAKSIMIPSMIVQIPVENAIKHGLAGKDGTKELTIRISRDGKGVRI VICDNGRGYLPQVASSTRGTGTGLKVLYQTIQLLNTKNKNEKIHFNIGNRNDGQTGTQVS VYIPFHFSYDL >gi|226332255|gb|ACIC01000065.1| GENE 73 117512 - 118414 721 300 aa, chain + ## HITS:1 COG:all4037 KEGG:ns NR:ns ## COG: all4037 COG1045 # Protein_GI_number: 17231529 # Func_class: E Amino acid transport and metabolism # Function: Serine acetyltransferase # Organism: Nostoc sp. 
PCC 7120 # 125 294 1 161 253 144 44.0 3e-34 MSPLNFTHILTQAVDELSESESYKGLFHQHKDGEPLPSAKVLYEIIELSRAILFPGYYGN STINSRTINYHIGVNIEKLFDLLTEQILAGLCFSTAEGDCNVCSESRREEAARLAANFIS KLPAMRRILATDVEAAYNGDPAAKSYGEVIFCYPAIKAISNYRIAHELLELGVPLIPRMI TEMAHSETGIDIHPGAKIGSHFTIDHGTGVVIGATSIIGNNVKLYQGVTLGAKSFPLDAD GKPIKGIPRHPILEDNVIVYSNATILGRITIGSNATVGGNIWVTENIPAGAKIVQTKAKK >gi|226332255|gb|ACIC01000065.1| GENE 74 118431 - 119972 1712 513 aa, chain + ## HITS:1 COG:slr0064 KEGG:ns NR:ns ## COG: slr0064 COG0116 # Protein_GI_number: 16331495 # Func_class: L Replication, recombination and repair # Function: Predicted N6-adenine-specific DNA methylase # Organism: Synechocystis # 9 374 16 384 384 268 38.0 1e-71 MSEQFEMIAKTFQGLEEILAEELTTLGANDIQIGRRMVSFIGDKRMMYKANFCLRTAIRI LKPIKNFTAKDADEVYNQIQAIPWEEYLDVNKTFAIDAVVFSEEFRHSKFVSYKVKDAIV DYFREKTGKRPSVRINNPDVLLNIHIAHTTCTLSLDSSGESLHRRGYRQEQVDAPLNEVL TAGMLLMTGWRGECDLIDPMCGSGTIPIEAALIARNIAPGVFRKGFAFEKWVDFDSEMFD EIYNDDSQEREFTHKIYGYDNNPKANEIATHNIKAAGVSKDVTLKLQPFQQFEQPQEKSI IVMNPPYGERISTNDLLGLYQMIGERLKHAFVGNEAWVLSYREECFDQIGLKPSKKVPLF NGALECEFRKYEIFDGKYKEFKSQEGEGEEKKETEGNYDTSRPRERKEFKPRGEGEFKPR REGEFKPRREGEFKPRREGEFKPRREGEYKPRREGEFKPRKEGDFKPRREEGSFTRDDRD RKPRGEFRGERDSRAPREFRGNREPRIPKKEEE >gi|226332255|gb|ACIC01000065.1| GENE 75 120182 - 122380 2097 732 aa, chain + ## HITS:1 COG:CC2154 KEGG:ns NR:ns ## COG: CC2154 COG1506 # Protein_GI_number: 16126393 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Caulobacter vibrioides # 150 714 161 715 738 286 34.0 1e-76 MSKIGKRAILILLALPIGFNVMAQEIKKLTLEDLIPGGETYRYAENLYGLQWWGDVCIKP STDTIYTVQPRTGKETVLTTLGQINKVLADNKAGKLPTPYSIRYPWADKPQMLLKVSGKY IVYDFENNRIVSTLKLKDKAANEDYCVANGNVAYTVNNNLYVNEQAITDEPEGIVCGQSV HRNEFGINKGTFWSPKGNLLAFYRMDESMVTQYPLVDITARVGEVNNVRYPMVGMTSHQV KVGVYNPSTGKTIYLNAGDPTDRYFTNISWSPDEKSIYLIELNRDQNHAVLCQYDATTGK LLGKLLEETHPKYVEPQHPIVFLPWDSSKFIYQSQRDGYNHLYLCDLTSSLKGDWKSDAA GGKHIEYVPTKQLTEGKWLVGDILGFNAKRKEVIFQGVDGTGSNNFAVNVNTGKRSLPFS 
FRSTTEGEHNGMLSASGSYLIDRYSTPTLPRRIDIVDTKSLKTVNLLTAKDPYEGYEMPT IETGTIKADDGTTDLYYRLTKPADFDPNKKYPVIVYVYGGPHAQLVTGGWLNGSRGWDIY MANKGYIMFTLDNRGSANRGLEFENATFRRLGIEEGKDQVKGIEFLKSLPYIDGNRIGVH GWSFGGHMTTALLLRYPEIFKVGVAGGPVIDWGYYEVMYGERYMDTPESNPEGYKECNLK NLAGQLKGHLLIIHDDHDDTCVPQHTLSFMKACVDARTYPDLFIYPCHKHNVSGRDRVHL HEKITRYFEQNL >gi|226332255|gb|ACIC01000065.1| GENE 76 122454 - 123728 1602 424 aa, chain + ## HITS:1 COG:PA4855 KEGG:ns NR:ns ## COG: PA4855 COG0151 # Protein_GI_number: 15600048 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylamine-glycine ligase # Organism: Pseudomonas aeruginosa # 1 422 1 419 429 382 47.0 1e-106 MKILLLGSGGREHALAWKIAQSPKVEKLYIAPGNAGTNAVGENVNIKATDFEAISAFALK ENIEMVVVGPEDPLVEGIYDYFQNRPELKHIAVIGPSANGAQLEGSKEFAKGFMMRHQIP TARYKSITAENLEEGLAFLETLEAPYVLKADGLCAGKGVLILPTLDEAKKELKEMLGGMF GSASATVVIEEFLSGIECSVFVLTDGEHYKVLPVAKDYKRIGEGDKGLNTGGMGSVTPVP FADEEFMEKVRTRIIEPTINGLKEENIVYKGFIFLGLIRVKGEPMVIEYNVRMGDPETES VMLRIQSDLVELLEGTAAGNLDEKTLVMDPRSAGCVILVSGGYPEAYEKGFPISGLEQAA ATESIIFHAGTAMKDGQIVTNGGRVIAVCSYGATKEEALAQSYKVADMIDFDKKYFRRDI GFDL >gi|226332255|gb|ACIC01000065.1| GENE 77 123845 - 124738 550 297 aa, chain + ## HITS:1 COG:no KEGG:BT_3252 NR:ns ## KEGG: BT_3252 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 297 40 336 336 501 99.0 1e-140 MTAGEEGGSVLWQSARTLLLPAWAEHLVSFLIYAAIGYFLIELNNRFSIIRMRASVQTAI YFLLVTVCPEMHLLYAGNVAAITFLFSIYFLFKSYQQSQASGYLFYSFLFIGAGSILFPQ LTFFSVLWLFEAHRFQSLTFRSFCGALIGWTMPYWMLFGHAFFYDQMELFYHPFKELATF GDIFNLQILQPWELATLGYLLVLFIVSAAHCVVAGFEDKIRTRAYLQFLIDVTLFLFVLI VLQPSQCSNLLPLLMISNSILIGHLFVLTNNKTSNIFFIVATVCLILLFGFNVWTLL >gi|226332255|gb|ACIC01000065.1| GENE 78 124723 - 125202 437 159 aa, chain + ## HITS:1 COG:MA3555 KEGG:ns NR:ns ## COG: MA3555 COG1238 # Protein_GI_number: 20092362 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Methanosarcina acetivorans str.C2A # 5 148 2 144 153 82 32.0 3e-16 
MDAFIDSTIQLLIEWGLPGLFISALLAGSIVPFSSELVLVALVKLGLPPTACILAATLGN TAGGMTCYYMGRLGKISWIEKYFKVKKEKVDKMVKFLQGKGALMAFFAFLPAIGEVISIA LGFMRSNIWLTTASMFVGKLIRYILLLYVLESAWNVVAG >gi|226332255|gb|ACIC01000065.1| GENE 79 125249 - 126031 563 260 aa, chain + ## HITS:1 COG:DR1672 KEGG:ns NR:ns ## COG: DR1672 COG4121 # Protein_GI_number: 15806675 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Deinococcus radiodurans # 8 249 22 232 234 103 31.0 4e-22 MKRVIEKTDDGSATLFVPELNEHYHSTKGARTESQHIFINMGLKASAAPSPHILEIGFGT GLNAWLTLEEAERSGRNVHYTGLELYPLEWQMVEQLGYIEKENDEQSASSSGQQAATDLF RAIHTSPWEEEISFTPNFTLCKVQADVNKWVNERVENGQQTINRTITQRSHSPLSFDIIY FDAFAPEKQPEMWSQELFNRLYVLLNEGGILTTYCAKGVIRRMLQAAGFIVERLPGPPGG KREILRARKLEQNQPPILIQ >gi|226332255|gb|ACIC01000065.1| GENE 80 126034 - 126960 932 308 aa, chain + ## HITS:1 COG:MTH604 KEGG:ns NR:ns ## COG: MTH604 COG0803 # Protein_GI_number: 15678632 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Methanothermobacter thermautotrophicus # 40 302 38 292 295 193 36.0 3e-49 MKQQLKDLKTLLLLGSCLLLAACTGRTSKASGSEEAKPVITVTIEPQRYFTEAIAGDKFT VVSMVPKGSSPETYDPIPQQLVSLGDSKAYLRIGYIGFEQTWMDRLMNNTPHIQVFDTSK GIDLILNNGEHNHAAGHHDHDGHNHAVEPHIWNSTANALILAGNTFKALCMLDKPNEAYY LARYDSLCQRIQHTDSLIRRQLSAPESAKAFMIYHPALSYFARDYGLHQISIEEGGKEPS PAHLKELIDLCKSEKVSVIFVQPEFDKRNAETIAQQTGTKVVPINPLSYDWETEMLNVAK ALTPMGPE >gi|226332255|gb|ACIC01000065.1| GENE 81 127020 - 127811 225 263 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 219 1 210 305 91 29 2e-17 MKPIIEIKNLAAGYDGRTVLHDVNLNVYERDFLGIIGPNGGGKTTLIKCILGLLKPTAGE INFHAPTEASSHSQLSTSNSSLSLGYLPQYSTIDRKFPISVEEVILSGLSIQKSLTSRFT PEQREKGKHIIARMGLEGLEHRSIGQLSGGQLQRALLGRAIISDPSVLILDEPSTYIDKR FEARLYELLAEINKECAIILVSHDIGTVLQQVKSIACVNETLDYHPDTGVSTEWLERNFN CPIELLGHGTLPHRVLGEHHHHH >gi|226332255|gb|ACIC01000065.1| GENE 82 128230 - 
131604 3454 1124 aa, chain + ## HITS:1 COG:no KEGG:BT_3247 NR:ns ## KEGG: BT_3247 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1124 1 1124 1124 2302 99.0 0 MKQYRTINNLVGWLTFIIAATVYCLTIEPTASFWDCPEFITTGYKLEVGHPPGAPFFMLV ANLFSQFASDVTTVAKMVNYMSALMSGACILFLFWSITHLVRKLVITDENNITKGQLITV MGSGLVGALVYTFSDTFWFSAVEGEVYAFSSLFTAVVFWLILKWEDVANQPHSDRWLILI AYLTGLSIGVHLLNLLCLPAIVLVYYYKKTPNATAKGSLLALLGSGVLVAAVLYGIVPGI VKVGGWFELLFVNGLGMSFNSGVVVYIILLAAALIWGVYESYTEKSRLRMAISFILTIAL LGIPFYGHGVSSILIGVVVIAILGIYLAPQVQEKIKEKWRISARTMNTALLCTMMIVIGY SSYALIVIRSTANTPMDQNSPEDIFTLGEYLGREQYGTRPLFYGPAFSSKVALDVKDGYC VPRQSQTGSKYVRKEKTSPDEKDSYIELPGRIEYEYAQNMLFPRMYSSAHSSLYKQWVDI KGHDVPYDQCGEMVMVNVPNQWENIKFFFSYQLNFMYWRYFMWNFAGRQNDIQSSGEIEH GNWITGIPFIDNLLVGNQELLPQDLKNNKGHNVFYCLPLLLGLIGLFWQAYHSQRGIQQF WVVFFLFFMTGIAIVLYLNQTPAQPRERDYAYAGSFYAFAIWVGMGVAGIIRMLRDYCKM KEVPAAALASVLCLLVPIQMAGQTWDDHDRSGRFVARDFGQNYLMTLQEEGNPIIFTNGD NDTFPLWYNQETEGFRTDARTCNLSYLQTDWYIDQMKRPAYDSPSLPITWDRVEYVEGQN EYIPIRPEVKKNIDQMYAQADSALENGNPEAMNELKEQFGENPYELKNILKYWVRSDKEG LRVIPTDSIVMKIDKEAVRRSGMKIPEALDDSIPEYMTILLRDANGNPKRALYKSELMML EMLANANWERPIYMAITVGSENHLGMDNHFTQEGLAYRFTPFDTDKLNSKIDSEKMYDNL MNKFKFGGIEKPGIYIDENVMRMCYTHRRIFTQLVGQLIKEGKKDKALAALDYAEKMIPS YNVPYDWANGAFQMAEAYYQLGQNEKANKIIDELANKSLEYMIWYLSLTDYQLSIASENF MYNAGLLDAEVRLMEKYKSEELAKHYSEQLDQLYNEYVARMKGK >gi|226332255|gb|ACIC01000065.1| GENE 83 131707 - 132321 663 204 aa, chain + ## HITS:1 COG:all4345 KEGG:ns NR:ns ## COG: all4345 COG0726 # Protein_GI_number: 17231837 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Nostoc sp. 
PCC 7120 # 20 201 100 284 305 122 38.0 3e-28 MFIEQPPWFFRALYPQAIFRMDPNERAVYLTFDDGPIPEVTPWVLELLKKHDIKATFFMV GDNIRKHPDEYRMVVEHGHRIGNHTFNHIRGFEYSNPDYLANTRRVDEIIHSDLFRPPHG HMGFRQYYTLRHHYRIIMWDLVTRDYSKRMRPEQVLNNVKRYVRNGSIITFHDSLKSWNN GNLQYALPRAIEFLKAEGYVFKVF >gi|226332255|gb|ACIC01000065.1| GENE 84 132397 - 133428 526 343 aa, chain + ## HITS:1 COG:PA4950 KEGG:ns NR:ns ## COG: PA4950 COG1600 # Protein_GI_number: 15600143 # Func_class: C Energy production and conversion # Function: Uncharacterized Fe-S protein # Organism: Pseudomonas aeruginosa # 7 341 19 321 361 210 35.0 3e-54 MKKLLSSDRIKAEALRLGFSACGLAPAEAVDETVATAFRQWLADGCQAEMAYMQNYEDKR LDPRLLVEGARTVISVALNYYPAKKLPEGEYQIAWYAYGKDYHDVMKGKLKELFEFIEKE VSFSEETDSTIASTNNIGTYTENYASAPQAPVSQTSASPLQGRIFCDTAPILERYWAWRT GLGWIGKNTHLIIPHAGSCFFLGEIILDREADNYDSPQRNQCGSCTRCLDACPTKALEAP FRLNSERCLSYLTIEYRGELSLNTGKKMGNKIYGCDECLKACPWNRFATPCRTAEFQPSP SLLSMKKDDWHSLSEEQYKTIFKGSAVKRAKYSGLMRNIKIIK >gi|226332255|gb|ACIC01000065.1| GENE 85 133547 - 134494 498 315 aa, chain - ## HITS:1 COG:no KEGG:BT_3244 NR:ns ## KEGG: BT_3244 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 315 40 354 354 593 100.0 1e-168 MKASGGTGTVEFAATGTVTATVNADWCQVIEVADHKVSISVDANTGYPGRSAQIVLTDGV STQQATILQEGAIWKYNRDATYFTLTDAAEEVPVEMSSNLPIQVSIPADASQWLSYEMTS DGFKFIAKENLTGKIRSAIVKVTTGIREIEYALLQYDIDDLLGQWTGVVKVVATAFGINQ VLSLEENTQIVKNPGGNGYIFQLPMTRLLGATIVLAVTYQDGALVITTPQQQNYALSDGQ GGVMYGSIITKTDDGLYLQGKIILSPVLMNDGQVALAYITDAYMSMGLFKGRVPSTANYA GLSIDFPAILMQYAE >gi|226332255|gb|ACIC01000065.1| GENE 86 134627 - 135958 921 443 aa, chain - ## HITS:1 COG:no KEGG:BT_3243 NR:ns ## KEGG: BT_3243 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 443 1 443 443 858 100.0 0 MKKNIMIYLLMALVCFGLQSCLFQEEDYFDDSSANRATEEVKQYSELLESASNGWRMEYY IGQDYALGGITLLCKFDGQRVTMASQGYEGDETISSLYKVVSEEATMLTFDTYNAFIHAY AKPQGGGSNPNANLQGDYEFIIKEASAEKIVMQGKKYGNTIVMYALPDDLNWEAYINSVN 
DVEENAFFIQYQLLVDGNLVGMTQRSNYTFVIAYNDGGNIVQEQSPFLFTTDGFRFREPV SLAGVTVQNFVWDPSSELFTCTDEGATNVKMKGIYPEGYIKYNDYLGNYTLTCKMYKDNN LTDEELPISIMEDVENKSYILRGLIGDIVLDYDRGEGIMSFKTQKVGYISGYYLGCTTYC YAMGFYPTYISYQAGLMFGFVANVQQSPFGFTFEDDGVFEAVGGAKADYMAFDRYSAPDY SQSSYVQGVERMYGEMVFMKVEE >gi|226332255|gb|ACIC01000065.1| GENE 87 135971 - 136885 797 304 aa, chain - ## HITS:1 COG:no KEGG:BT_3242 NR:ns ## KEGG: BT_3242 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 304 1 304 304 602 100.0 1e-171 MKKYYIYILTAVLPILWGACNNGDDIDTKHSIFSTEPMEHNGFDEWLLANYTYPYNVDFK YRMQDIESDHKYNLVPADYDKSVALAKIIKHVWMEAYTELAGPAFLRSYVPKTFHLIGSP AYDSSGTKVLGTAEGGKKITLYEVNSLDFENVDIEVLNEYYFKTMHHEFAHILHQKRNYD PSFDRITEGKYVGSDWYFYVAENGNSYLREDADAWPEGFVTAYAMSEAREDFVENIAMYV THDQAYWDNMLKVAGTAGAAIINQKFTIVYNYMLETWGINLDDLREIVLRRQQEIPELDL STIE >gi|226332255|gb|ACIC01000065.1| GENE 88 136900 - 138453 1030 517 aa, chain - ## HITS:1 COG:no KEGG:BT_3241 NR:ns ## KEGG: BT_3241 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 517 1 517 517 1011 100.0 0 MRRNQLYIPLFIATLAIVGSSCNDFLDELPDNRTELNSEQKIAKMLVSAYPEGSANELFE LYSDNTDDNSARYSYYKLSEEECYNWKDTQEEYQDTPTNLWETHYIAIASANMALEEIEK RGNPESLMPQRGEALVCRAYNHFVLANIFCNAYNTHASQELGIPYMTKVETTVQPQYGRG TLQETYEKIEKDLLDGMALISDDSYSVPKYHFTRKAAYAFAARFYLYYMKPDFSNCDEVI KYADRVLGSDPAAELRDWEALGKLSANDNVQPDAYIASTSKANLLIGSTVSNWGIILSNY MTGKRYLHNKLIASYETSRSSGLWGIPDRMYMQPFTTSGYECSMLRKGGYYAGDYLYYMP VIFSTDETLLCRAEAYTLKKNYPMAAADLTTWEKAYTSNKQTLTPGMINEYYDKMDYYNP TQNPTPKKKLQPDFTIESGIQENFIHCILHMRRIATLHEGLRWPDIKRYGIVIYRRSIAN NNIVVTDEMPVNDARRAMQIPKSVVLAGMQPNPRTEN >gi|226332255|gb|ACIC01000065.1| GENE 89 138489 - 141530 1753 1013 aa, chain - ## HITS:1 COG:no KEGG:BT_3240 NR:ns ## KEGG: BT_3240 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1013 46 1058 1058 1999 100.0 0 MKPDAEVLEEVVVTGMQKMDKRLFTGATQQLTADNVKLDGLPDISRGLEGRAAGVSVQNV 
SGTFGTAPKIRVRGATSIYGSSKPLWVVDGVIMEDAIDVGPDDLSSGDAETLISSAIAGL NSDDIESFQILKDGSATSIYGARAMAGVIVVTTKKGKAGVSKINYTGEFTTRMVPSYKEF NIMNSQEQMGIYKELEQKGWLNSADLFRAKNSGVYGRMYKLIDQFNETNGQFGLMNTLEA RNAYLTEAEMRNTDWFDVLFSNNISQNHSVSITSGSDKSSFYASLSAMLDPGWYKQSEVK RYTANLNTTYNILKNLSINLISSASYRKQNAPGTMNSSLDVAAGQVTRDFDINPYSYALN TSRTLDPNTDYVANYAPFNILRELDNNFIELNVVDVKFQGELKWKVLQGLEVSALGAVRY QSSSQEHNIMDDSNQAIAYRTGLDDATIRDENDWLYKNPDNPYALPISILPEGGIYQRQD RKMLGLDFRGTVSWNHLFAEKHITNLFAGMEVNDLQRSNSSFQGWGMQYSMGEIPSYVYQ YFKKGIETGEKYYSMGHTRYRSVATFANATYSYDGRYTLNGTFRYEGTNRMGRSRSARWL PTWNLSAAWNVHEEKFFQSLQPTLSNLTFKASYSLTADRGPADVTNSQAIIKSYSPYRPF TDIQETGLHIVDLENSELTYEKKHELNIGVDVGFINNRINLSADWYTRNNYDLIGLIPTQ GVGGTIYKYANVATMKSHGIEFTLSTSNIKTKDFSWNSDFIFSHAKNEVTDLRGRTRMME LVSGNGFAREGYPVRGLFSIPFAGLDEKGIPMFDINGNITSTDINFQEREKLDYLKYEGP TDPTITGSFGNVFAYKGFKLNVFMTYSFGNVVRLNPYFNYKYSDLSAMPREFKNRWTLSG DEAKTNIPAILSNPQYEANRTLYKAYNAYNYSTERIAKGDFIRMKEISLSYEFPKSWISA AKISNLSLKLQATNLFLIYADKKLNGQDPEFFNTGGVASPVPRQFTLTLRLGL Prediction of potential genes in microbial genomes Time: Thu May 12 00:56:08 2011 Seq name: gi|226332254|gb|ACIC01000066.1| Bacteroides sp. 1_1_6 cont1.66, whole genome shotgun sequence Length of sequence - 7717 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 1, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 67 - 3111 2068 ## BT_3239 hypothetical protein 2 1 Op 2 . + CDS 3130 - 4689 982 ## BT_3238 hypothetical protein 3 1 Op 3 . + CDS 4703 - 5581 932 ## BT_3237 hypothetical protein 4 1 Op 4 . + CDS 5594 - 6850 840 ## BT_3236 hypothetical protein 5 1 Op 5 . 
+ CDS 6868 - 7717 398 ## BT_3235 hypothetical protein Predicted protein(s) >gi|226332254|gb|ACIC01000066.1| GENE 1 67 - 3111 2068 1014 aa, chain + ## HITS:1 COG:no KEGG:BT_3239 NR:ns ## KEGG: BT_3239 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1014 46 1059 1059 1977 100.0 0 MKPDAEVLEEVVVTGMQKMDKRLFTGAATKLDAEGVKLNGMADISRGLEGRAAGVSVQNV SGTFGTAPKIRVRGATSIYGSSKPLWVVDGVIMEDAVEVSADQLSSGDAETLISSAIAGL NADDIESFQILKDGSATSIYGARAMAGVIVVTTKKGKAGVSRISYTGEFTTRLVPSYNDF NIMNSQDQMGVYKEMQQKGWLNFAETSRTAESGVYGKMYQLINTYDPTTGQYALLNTDEA KNAYLREAEMRNTDWFDTLFSPSLSQNHSVSLSSGTEKSSFYASLSAMHDPGWYKQSGVD RYTANLNMNHNILKNLSINLISSASYRKQKAPGTLGQDIDLVNGQVKREFDINPYSYALN TSRTLDPNTYYTSNYADFNILNELDNNYMELNIADLKFQGELKWKPITGLELSALGAVKY QSTSQEHNIKDQSNQASAYRAGLDDKTIMDRNPFLYKNPDNPYALPISILPEGGIYQRRD YKMLGYDFRATASWNHVFNDIHITNFFAGTEINSTDRSKNYYQGWGMQYSMGEIPYYVYE FFKKGIEDGNNYYSLSQTRSRSAAFFANATYSYKGRYTVNGTARYEGTNRMGKSRSARWL PTWNVSGAWNAHEEKFFSNLTSVLSHFTLKASYSLTADRGPADVTNSQAIITSYSPYRPF TSIQESGLYIKELENSELTYEKKHELNIGADMGFLNNRINLAVDWYKRNNYDLIGIVNTQ GAGGQILKYANIASMKSHGIEFTLSTKNIKTKDFNWNTDFIFSNAKNEVTDLKQNANVME LISGNGFAKQGYPVRGLFSIPFVGLDKDGIPVIINEKGEATSTDINFQERQNTGFLKYEG PTDPTITGSLGNMFSYKGFRLNLFITYSFGNVIRLNPVFKAYYSDLTAMTREFKNRWTLP GDEAKTNVPAILSKRMYDSNEDLKYAYNAYNYSTERIAKGDFIRLKEISLAYDFPKAWIN PLKLTDLSLKVQATNLFLIYADDKLNGQDPEFFNAGGVASPVPRQFTITLRLGL >gi|226332254|gb|ACIC01000066.1| GENE 2 3130 - 4689 982 519 aa, chain + ## HITS:1 COG:no KEGG:BT_3238 NR:ns ## KEGG: BT_3238 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 519 1 519 519 1047 100.0 0 MKKYIIMASVAALAIGLSSCSNFLDELPDNRTELNENNVGKILLSAYPTTAICEMGEMSS DNTDAYPNRFSSFNRLQEDLYKWADSAEQDQDSPHALWESCYMAISACNEVLKVIEDAGN PASLSAEKGEALVCRAYAHFLLANIFCNAYSSQASSDLGIPYMKTIETTVSPDYQRGTLE EVYQNIEKDLLEGLDLVTDIYAVPKYHFTKKAAQAFAARFYLYYVKSDKSNYTKVIDYAT KVLGSNPATSVRDWKSLGALDINGSVQPNAYVDATNGANLLLVSAGSYWGYVHAPYGLGE RYAHGPKVGNETCNSVGPWGSDYYMGVWSNSSALPTKIVVMKITQYKEVVDAVAGTINGH 
MINAAFTTDETLLCRAEAYAMKEMYPQAIADLNIWREAYTRSTTPLTTESINDFYGSMEY YTPTESTVKKKLNPDFTITNETQENVIHCILHARRLTTLHEGLRWQDIKRYGITIYRRLM NDNGTITVTDKLEPNDPRRAIQIPSDVISAGLKPNPRTK >gi|226332254|gb|ACIC01000066.1| GENE 3 4703 - 5581 932 292 aa, chain + ## HITS:1 COG:no KEGG:BT_3237 NR:ns ## KEGG: BT_3237 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 292 1 292 292 560 100.0 1e-158 MKKYIIYTLIVTIAVFLGSCNKDDIDTKRSIFDPTEVYEPNEFDKWLLDNYVYTYNIQVK YRLDDNETDVEYHLVPAKYDNAITLAKIVKFVWMEAYDEIWGIDMTRTYVPKLLQFIGNV AYTESGMILGQAEGGMKVTLFRVNEIDKNNLSVAQLNEYYFKTMHHEFTHILNQKKAYDT SYDRISESDYVGSSWYRVSESVALQKGFISPYAMDRATEDFAEMVAIYVTNDASTWEDML ASAGTTGRPIIEKKFEIVFDYMLNSWGLDLDKLREIVLRRQSEITELDLSTL >gi|226332254|gb|ACIC01000066.1| GENE 4 5594 - 6850 840 418 aa, chain + ## HITS:1 COG:no KEGG:BT_3236 NR:ns ## KEGG: BT_3236 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 418 1 418 418 836 100.0 0 MRKIKIYGLLTCVCLMMQSCLFNEDDIFDESSAQRAIASVNECQEILKGAANGWLLEYYT GEDGEYGGFNVLARFDGNNVIMAADFATDNYEIGEESTSLYKVESYQGTELSFDSYNELI HEFCEPSGYNSPGYAGDYEFVFRSVSKEKIVLTGKKHGVTLIMTPLPAETNWQEKLTNIA NVVSQASYVTYKLIVNGQEITKMGQEEHAFSVTKVDETGETTVSLYPFIYTEEGIKMYEP LVVNGVEINNFKWDNENLTYICTDTGVDAKIEFYCPEGYLNYLGNYILQLANGQRIQLEL KQKMIGKSFAMNFALSGTPIEFVYNYNMTTDCIDVPSQTVGVYQGYNVLLYPGIPGGNFY ADDSAVFQGRIANTDPLTIKFTYVNNPICTLMLLVYQKTDGWYGFSTMFQDVTLIKVD >gi|226332254|gb|ACIC01000066.1| GENE 5 6868 - 7717 398 283 aa, chain + ## HITS:1 COG:no KEGG:BT_3235 NR:ns ## KEGG: BT_3235 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 283 1 283 344 516 100.0 1e-145 MKKITSIFALLLCVFVFSCSNDDDKETSSLQIVSSNVTFESAGGTGTIKVNAISEITATS NKDWCTVSVNGDIINVSVIENNDMSGRTAAVTITDGEASTLVPVSQGGCVILYNKSELGH AFTYAGGSATVSFSTTASYSIEVPAEAQSWLSYTLDEENRTITFNVAASADKTPRGAAVK VTAGKKTIYYHLGEYELKDIAGKWRVSFVDGDDSTLAGEIEVVQDEEEPTIFYLSGISNF FDLPLIYNGEALLTMGGLNLGTYAGRYNIYTVTLSEGGYVSWD Prediction of potential genes in microbial 
genomes Time: Thu May 12 00:56:44 2011 Seq name: gi|226332253|gb|ACIC01000067.1| Bacteroides sp. 1_1_6 cont1.67, whole genome shotgun sequence Length of sequence - 27800 bp Number of predicted genes - 33, with homology - 31 Number of transcription units - 14, operones - 8 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 262 - 321 7.4 1 1 Tu 1 . + CDS 370 - 1560 471 ## BT_3234 hypothetical protein + Term 1654 - 1704 5.5 - Term 1735 - 1799 4.8 2 2 Tu 1 . - CDS 1839 - 3083 1343 ## BT_3233 hypothetical protein - Prom 3130 - 3189 5.0 - Term 3158 - 3216 17.3 3 3 Op 1 9/0.000 - CDS 3236 - 4039 1021 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 4 3 Op 2 . - CDS 4079 - 4921 1023 ## COG3717 5-keto 4-deoxyuronate isomerase - Prom 5117 - 5176 7.5 - TRNA 5186 - 5259 56.2 # Gln TTG 0 0 - Term 5284 - 5319 4.4 5 4 Op 1 . - CDS 5368 - 6660 712 ## PROTEIN SUPPORTED gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 - Prom 6680 - 6739 3.4 6 4 Op 2 . - CDS 6745 - 7434 516 ## COG0084 Mg-dependent DNase 7 4 Op 3 . - CDS 7463 - 7738 96 ## COG0759 Uncharacterized conserved protein 8 4 Op 4 . - CDS 7735 - 8124 256 ## BT_3227 ribonuclease P (EC:3.1.26.5) 9 4 Op 5 . - CDS 8130 - 8876 435 ## BF0075 uroporphyrinogen-III synthase 10 4 Op 6 . - CDS 8881 - 9603 467 ## BT_3225 hypothetical protein 11 4 Op 7 . - CDS 9600 - 10187 642 ## COG1611 Predicted Rossmann fold nucleotide-binding protein - Prom 10228 - 10287 3.5 12 5 Op 1 . - CDS 10368 - 10949 376 ## BT_3223 hypothetical protein - Prom 10975 - 11034 2.3 13 5 Op 2 . - CDS 11037 - 11969 649 ## BT_3222 hypothetical protein - Prom 12219 - 12278 10.1 + Prom 11945 - 12004 5.7 14 6 Op 1 . + CDS 12030 - 12194 89 ## + Prom 12199 - 12258 6.0 15 6 Op 2 . + CDS 12282 - 12644 262 ## BT_3221 hypothetical protein + Prom 12672 - 12731 5.3 16 7 Tu 1 . + CDS 12884 - 14437 730 ## BT_3220 TPR repeat-containing protein - Term 15483 - 15542 16.5 17 8 Tu 1 . 
- CDS 15590 - 16882 1660 ## COG0192 S-adenosylmethionine synthetase - Prom 16903 - 16962 5.9 + Prom 16867 - 16926 5.3 18 9 Op 1 . + CDS 17120 - 17323 216 ## BT_3218 hypothetical protein 19 9 Op 2 . + CDS 17320 - 17511 247 ## BT_3217 hypothetical protein 20 10 Op 1 . - CDS 17542 - 18024 260 ## PROTEIN SUPPORTED gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 21 10 Op 2 . - CDS 18032 - 18367 261 ## BT_3215 hypothetical protein 22 10 Op 3 . - CDS 18384 - 19442 1230 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) 23 10 Op 4 . - CDS 19460 - 20182 779 ## COG0130 Pseudouridine synthase 24 10 Op 5 . - CDS 20235 - 21029 905 ## COG1968 Uncharacterized bacitracin resistance protein 25 10 Op 6 . - CDS 21042 - 21275 236 ## BT_3211 hypothetical protein 26 10 Op 7 . - CDS 21325 - 22206 675 ## COG2177 Cell division protein 27 10 Op 8 . - CDS 22210 - 23109 727 ## BT_3209 hypothetical protein - Prom 23219 - 23278 5.9 28 11 Tu 1 . - CDS 24004 - 24165 122 ## BT_2926 hypothetical protein - Prom 24272 - 24331 3.4 - Term 24173 - 24212 7.3 29 12 Tu 1 . - CDS 24430 - 24585 96 ## - Prom 24636 - 24695 6.8 - Term 24619 - 24670 3.0 30 13 Op 1 . - CDS 24716 - 24949 180 ## BT_3208 hypothetical protein 31 13 Op 2 . - CDS 24939 - 25709 293 ## BT_3207 hypothetical protein - Prom 25887 - 25946 4.2 32 14 Op 1 . - CDS 25954 - 27000 296 ## BT_3206 hypothetical protein 33 14 Op 2 . 
- CDS 27002 - 27640 401 ## BT_3205 hypothetical protein - Prom 27670 - 27729 5.1 Predicted protein(s) >gi|226332253|gb|ACIC01000067.1| GENE 1 370 - 1560 471 396 aa, chain + ## HITS:1 COG:no KEGG:BT_3234 NR:ns ## KEGG: BT_3234 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 396 1 396 396 814 100.0 0 MKMILFRLICLKTCLLLIPVLYAQNNYRPGFIITVQKDTIYGEIDYRTDKMNAKRCVFQS QGNDTEPVTYHPFEILGYRFTDDGKYYVSKNIELKYGVSTPVFLEYLLQGMKSLYYYETE DNIPIYFVEDNNTLVKIDAPKLSKQATNIQFKGQTDRYIPLLHYAFKDCPSLAPQIDRAR FSRKEMIKITKEYHYAMCTSNEDCIEFEAKEDKRSIQLDITPYVGVIQYNVPSGSPIGLY SSPELSYLAGVTLAISNRRWMSSLSGCFDVSFSRITSSARSLNNENSSTIFKHSGTMFSG KLGIRYTYPKGMARPFVELGADFSGMINAKIKINDKSERWLDGVYPGYYANAGVNFKLSR KNKQMICIRAQFKGLRDMMEKSALVNGWSGVIGYTF >gi|226332253|gb|ACIC01000067.1| GENE 2 1839 - 3083 1343 414 aa, chain - ## HITS:1 COG:no KEGG:BT_3233 NR:ns ## KEGG: BT_3233 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 414 1 414 414 814 100.0 0 MKKIFILAAIALMGFASCADSKQSMTVTVTNPLALERTGEMVEVPMSDVTAKLQLADTAQ IVVLGEDGQQVPYQVTYDEKVVFPATVKANGTATYTIQSGTPDPYNVIACGRYYPERLDD VAWENDLGGYRAYGPALQKRGERGFGYDLFTKYNTTEPILESLYAEELDKEKRARIAELK KTDPKAAAELQNAISYHIDHGYGMDCYAVGPTLGCGTAALMAGDTIIYPYCYRTQEILDN GPLRFTVKLEFNPLTVRGDSNVVETRVITLDAGSYLNKTVVSYTNLKEAMPVTTGIVLRE PDGVVTADAANGYITYVDPTTDRSGGNGKIFIGAAFPSLVKEAKTVLFPEKEKKELRGGA DGHVLAISEYEPGSEYTYYWGSAWDKAAIKTVDAWNAYMAEYAQKVRTPLTVTY >gi|226332253|gb|ACIC01000067.1| GENE 3 3236 - 4039 1021 267 aa, chain - ## HITS:1 COG:CAC2607 KEGG:ns NR:ns ## COG: CAC2607 COG1028 # Protein_GI_number: 15895865 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Clostridium acetobutylicum # 1 267 1 267 267 434 80.0 1e-122 MNQYLNFSLKGKVALVTGASYGIGFAIASAYAEQGATVCFNDINQELVDKGMAAYAEKGI 
KAHGYVCDVTDEPAVQAMVATIEKEVGTIDILVNNAGIIRRVPMHEMEAADFRRVIDIDL NAPFIVSKAVLPAMMKKRAGKIINICSMMSELGRETVSAYAAAKGGLKMLTRNICSEYGE YNIQCNGIGPGYIATPQTAPLREKQADGSRHPFDSFICAKTPAGRWLDPEELTGPAVFLA SEASNAVNGHVLYVDGGILAYIGKQPK >gi|226332253|gb|ACIC01000067.1| GENE 4 4079 - 4921 1023 280 aa, chain - ## HITS:1 COG:YPO1725 KEGG:ns NR:ns ## COG: YPO1725 COG3717 # Protein_GI_number: 16121985 # Func_class: G Carbohydrate transport and metabolism # Function: 5-keto 4-deoxyuronate isomerase # Organism: Yersinia pestis # 6 280 2 278 278 299 50.0 4e-81 MKTNYEIRYAAHPEDAKSYDTARIRRDFLIEKIFVPNEVNMVYSMYDRMVVGGALPVGEV LTLEAIDPLKAPFFLTRREMGIYNVGGPGVVKAGDAVFELDYKEALYLGSGDRVVTFESK DASNPAKFYFNSLTAHRNYPDRKVTKADAVVAEMGSLEGSNHRNINKMLVNQVLPTCQLQ MGMTELAPGSVWNTMPAHVHSRRMEAYFYFEIPEEHAICHFMGEVDETRHVWMKGDQAVL SPEWSIHSAAATHNYTFIWGMGGENLDYGDQDFSLITDLK >gi|226332253|gb|ACIC01000067.1| GENE 5 5368 - 6660 712 430 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 [Phaeobacter gallaeciensis BS107] # 2 427 8 414 418 278 36 2e-74 MNFVEELRWRGMLQDIMPGTEELLNKEQVTAYLGIDPTADSLHIGHLCGVMILRHLQRCG HKPLALIGGATGMIGDPSGKSAERNLLNEETLRHNQACIKKQLAKFLDFESDVPNRAELV NNYDWMKEFSFLDFVREVGKHITVNYMMAKDSVKRRLNGEARDGLSFTEFTYQLLQGYDF LHLYETKGCKLQMGGSDQWGNITTGAELIRRTNGGEVFALTCPLITKADGGKFGKTESGN IWLDPRYTSPYKFYQFWLNVSDSDAERYIKIFTSIEKEEIEALVAEHQQAPHLRALQKRL AKEVTIMVHSEEDYNAAVDASNILFGNATSESLRKLDEDTLLAVFEGVPQFEISRDALAE GVKAVDLFVDNAAVFASKGEMRKLVQGGGVSLNKEKLEAFDQVITTADLLDGKYLLVQRG KKNYFLLIAK >gi|226332253|gb|ACIC01000067.1| GENE 6 6745 - 7434 516 229 aa, chain - ## HITS:1 COG:NMA1946 KEGG:ns NR:ns ## COG: NMA1946 COG0084 # Protein_GI_number: 15794829 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Neisseria meningitidis Z2491 # 56 228 93 284 284 80 31.0 2e-15 MENNASHILDIHTHKSEDASHGRAIINFPLPVDDSLCLSASDVRTAGKEGYFYSAGIHPW KLTERNAEEQFELLQQLLVKEQFVAVGEAGFDKLTAAPMELQVRMFEKQVELSEKYRLPL IIHCVKAMDELLAVRKKLNPAQPWIWHGFRGKPQQAGQLIKNGIYLSFGAHYSEETVKGV 
PVGRLFLETDDSPVDIEEVLKQVAKARSTDAEELRQAIRENIQKVFFRR >gi|226332253|gb|ACIC01000067.1| GENE 7 7463 - 7738 96 91 aa, chain - ## HITS:1 COG:NMA0549 KEGG:ns NR:ns ## COG: NMA0549 COG0759 # Protein_GI_number: 15793543 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Neisseria meningitidis Z2491 # 31 91 36 96 96 89 68.0 1e-18 MKPRAHRVSVCWMSLKSLVRKVFSFLLLIPIYFYRVCISPLTPPSCRFTPTCSAYAVEAI KKHGPVKGLYLAVRRILRCHPWGGSGYDPVP >gi|226332253|gb|ACIC01000067.1| GENE 8 7735 - 8124 256 129 aa, chain - ## HITS:1 COG:no KEGG:BT_3227 NR:ns ## KEGG: BT_3227 # Name: rnpA # Def: ribonuclease P (EC:3.1.26.5) # Organism: B.thetaiotaomicron # Pathway: not_defined # 21 129 1 109 109 185 100.0 6e-46 MSVYTLRKAERLNSKILIGKMFEGGHSKSFSIFPIRVVYMPVEQGEVPATILISVSKRRF KRAVKRNRVKRQIREVYRKNKQPLLDGLQNKGQRLAIAFIYLSDELVATAELEEKMKTAL SRISEKLSL >gi|226332253|gb|ACIC01000067.1| GENE 9 8130 - 8876 435 248 aa, chain - ## HITS:1 COG:no KEGG:BF0075 NR:ns ## KEGG: BF0075 # Name: not_defined # Def: uroporphyrinogen-III synthase # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 248 1 248 250 455 93.0 1e-127 MKIKKVLVSQPKPASEKSPYYDIAEKYGVKIDFRPFIKVESLSAKEFRQQKVSILDHTAV IFTSRHAIDHFFNLCTELRVTIPETMKYFCVTEAVALYIQKYVQYRKRKIFFGATGKIED LVPSIVKHKTEKYLVPMSDVHNDDVKNLLDKNNIQHTEAVMYRTVSNDFTSDEEFDYDML VFFSPAGVTSLKKNFPDFDQKEIRIGTFGSTTAQAVRDAGLRLDLEAPTVQAPSMTAALD MFIKENNK >gi|226332253|gb|ACIC01000067.1| GENE 10 8881 - 9603 467 240 aa, chain - ## HITS:1 COG:no KEGG:BT_3225 NR:ns ## KEGG: BT_3225 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 240 1 240 240 368 100.0 1e-100 MIADILNSGFEGIPISYSPRTDDVIALTLLACFFLSSIALARGKKFLTQQVKDFVLHRER TSIFDSSTAADVRYLLVLVVQTCVLTGIVFFNYFHDTCPMLMTKVSPLLLLGIYVGFCLA YFLLKWLLYMFLGWVFFDKNKTNIWLESYSTLIYYVGFALFPFVLFLVYFDLNLTNLVII GLIILIFTKILMFYKWVKLFFHQLSAAFLLILYFCALEIIPCLLLYQGMIQVNNVLLIKI >gi|226332253|gb|ACIC01000067.1| GENE 11 9600 - 10187 642 195 aa, chain - ## HITS:1 COG:PA4923 KEGG:ns NR:ns ## COG: 
PA4923 COG1611 # Protein_GI_number: 15600116 # Func_class: R General function prediction only # Function: Predicted Rossmann fold nucleotide-binding protein # Organism: Pseudomonas aeruginosa # 4 189 3 185 195 170 46.0 2e-42 MNQIHSVCVYSASSTKIAPVYFEAAEKLGRLLAKQHIRLINGAGSIGLMRSVADAVLKNG GEVTGVIPHFMVEQNWHHTGLTELIEVTSMHERKQKMANLSDGIIALPGGCGTLEELLEI ITWKQLGLYLNPIIILNINGFFDPLFQMLERAIEENFMRQQHGDIWKVAQTPEEAVELLQ TTPVWDASIRKFAAI >gi|226332253|gb|ACIC01000067.1| GENE 12 10368 - 10949 376 193 aa, chain - ## HITS:1 COG:no KEGG:BT_3223 NR:ns ## KEGG: BT_3223 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 193 1 193 193 374 100.0 1e-102 MKTITNLIFIALLFVTASLSAQEHGPVLGGSLGTLHEKVSVVSSNFEGYKLHYHIGYAYR QYWGNRKFAMDGIALFGSQGTDYQAFNSDSPKYSVNGKYLSIGTVFSYECFSNFRLGVGA DVAWYVNRLGASTYNSKKNAIDIPIVARVSYTLKWLELQLSYKQGTCNLMNTSGVGKVTS RNIQFSVFMPIFK >gi|226332253|gb|ACIC01000067.1| GENE 13 11037 - 11969 649 310 aa, chain - ## HITS:1 COG:no KEGG:BT_3222 NR:ns ## KEGG: BT_3222 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 310 1 310 310 607 100.0 1e-172 MKKLFSTLTLTVCLGAFISCTNDEQEVNMVKPTNANSEIEVINDGTMIKFKDVESYENAL LKVSAMSTSEQVSFLNSLSFKSQMILMQEADGELDKICNQAADKAEFDVLYEKYKHKYGD VFMFNTIDATDLSPYSRLVYVANEYFVNMKGEFMIGDSLVVDKVYTDFKERQQQFTVSTR SSVSDLSSINEAYSRQKDRKVGLYLSVSSGIIHANFTSQKKGVFGWSRYSTTYHAKVNLR GFEFAQGELLGYGPVYVNKDGIPFAIDTKEMGGNVTKVFGRKLAQECTGTIEIWSRGVPY DQRGFATVRL >gi|226332253|gb|ACIC01000067.1| GENE 14 12030 - 12194 89 54 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTQLFNSFFYPSIILYSKYPYQNFDQIICYKSRDNGGIDLDIQFSNNSEIKNAI >gi|226332253|gb|ACIC01000067.1| GENE 15 12282 - 12644 262 120 aa, chain + ## HITS:1 COG:no KEGG:BT_3221 NR:ns ## KEGG: BT_3221 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 120 1 120 120 218 100.0 5e-56 MMKKLLFIAIIAVLGCTALQAKNRIRIKRESWEVKASTRVPISTPLNLYYEGSTLFIHAN 
IDLENIIIEVKDSFNKTIYLDLTSIQANNEHVFTLENISSGEYIIEIKSGENYIIGCFIL >gi|226332253|gb|ACIC01000067.1| GENE 16 12884 - 14437 730 517 aa, chain + ## HITS:1 COG:no KEGG:BT_3220 NR:ns ## KEGG: BT_3220 # Name: not_defined # Def: TPR repeat-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 517 62 578 578 880 100.0 0 MYYYLLLTEARDRCYVPHTSDSLMLSVTDYYERENDKDKLIKSYYYTGRVYLDLHEGPTA LSYFHKAFDISTDSKNYALLGRIYSQMGTLFAYEGLTEEVLQAYRNAYHYLQLGKDSLTS AYALRDLGRAFNLLEITDSTIYYYNKAYQLANSNGDIQREVNILEELAGIYIQMGKYDEA LKTLQETNSKWQDEKDINYDVWGNFYVSIGQKDSAAYYFKKNIGHNNVYSDIGAYRNLYE IEKERGNYEEGLAYLEKYTECSDSIYDISNTESLRKIKALYDYQHIEKEKEYLRKKNAEQ RAWIIFIVIAIVLGVIILIRYNRNRKAMIREQEKKLHEINEQQYRKSQLYIEENKKKIEE LKEKVKSFEQENNDLQQKLLLAKKEKLEQTNQAIETSQREQALLEEALRRSDIYAYCYRA IEDSSIRLTETEWKELENIINDTYDNFTNKLFILHPSITKMELRICLLLKIKIPVSTISQ LVCRTQSAVSMSRKQLYKKIFNKEGTPANLDDFIVSF >gi|226332253|gb|ACIC01000067.1| GENE 17 15590 - 16882 1660 430 aa, chain - ## HITS:1 COG:TM1658 KEGG:ns NR:ns ## COG: TM1658 COG0192 # Protein_GI_number: 15644406 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylmethionine synthetase # Organism: Thermotoga maritima # 1 430 1 395 395 426 52.0 1e-119 MGYLFTSESVSEGHPDKVADQISDAVLDKLLAYDPSSKVACETLVTTGQVVLAGEVKTKA YVDLQLIAREVIKKIGYTKGEYMFESNSCGVLSAIHEQSPDINRGVERQDPMEQGAGDQG MMFGYATNETENYMPLSLDLAHRILQVLADIRREEKVMTYLRPDAKSQVTIEYDDNGTPV RIDTIVVSTQHDDFIQPADDSAEAQLKADEEMLSIIRRDVIEILMPRVIASIHHDKVLAL FNDQIVYHVNPTGKFVIGGPHGDTGLTGRKIIVDTYGGKGAHGGGAFSGKDPSKVDRSAA YAARHIAKNMVAAGVADEMLVQVSYAIGVARPINIFVDTYGRSHVNMTDGEIARVIDQLF DLRPKAIEERLKLRNPIYQETAAYGHMGREPQVVSKTFFSRYEGNKTVEVELFTWEKLDY VDKIKAAFGL >gi|226332253|gb|ACIC01000067.1| GENE 18 17120 - 17323 216 67 aa, chain + ## HITS:1 COG:no KEGG:BT_3218 NR:ns ## KEGG: BT_3218 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 67 11 77 77 127 100.0 9e-29 MRTITLMQYVFSIILLVIIFLVMDYFWKGDEFSWTYEFAKGGIGGTIICLLNRHLFLRHP NPKQKKI >gi|226332253|gb|ACIC01000067.1| GENE 19 17320 - 
17511 247 63 aa, chain + ## HITS:1 COG:no KEGG:BT_3217 NR:ns ## KEGG: BT_3217 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 63 1 63 63 99 100.0 3e-20 MIINNKLRTLFVLALYIGFTIAIYAIVCHFLKIEFQDIHLLYAVLVGCVAYLPRFIAEKK SKK >gi|226332253|gb|ACIC01000067.1| GENE 20 17542 - 18024 260 160 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 [Streptococcus pneumoniae SP9-BS68] # 3 160 121 270 278 104 38 5e-22 MAKVYISLGTNLGDKEQNLRTAVQKIEEQVGKVISLSAFYITAPWGFDSEHSFLNAAACV ETELSPLEVLQKTQEIERELGRTHKSVGGVYSDRLIDIDLLLYDDLILSVTSPSGAALNL PHPLMAERDFVMRPLTEIAPELVHPVLGKAMKVLLAQIEK >gi|226332253|gb|ACIC01000067.1| GENE 21 18032 - 18367 261 111 aa, chain - ## HITS:1 COG:no KEGG:BT_3215 NR:ns ## KEGG: BT_3215 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 111 1 111 111 216 100.0 1e-55 MKICPKCHEEVEDSFEICWNCNYSFPDDKILDMTPAEESDDSGRSIDCLRCQVRMVYSGK YEFHEGMNTGIFGNLFELFQNREAFDLYVCPKCGKVEFFVPLDREREFKVE >gi|226332253|gb|ACIC01000067.1| GENE 22 18384 - 19442 1230 352 aa, chain - ## HITS:1 COG:SA1466 KEGG:ns NR:ns ## COG: SA1466 COG0809 # Protein_GI_number: 15927220 # Func_class: J Translation, ribosomal structure and biogenesis # Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Staphylococcus aureus N315 # 1 350 1 341 341 295 42.0 8e-80 MKLSQFKFKLPEDKIALHPTKYRDESRLMVLHRRTGEIEHKMFKDVLNYFDDKDVFIFND TKVFPARLYGNKEKTGARIEVFLLRELNEELRLWDVLVDPARKIRIGNKLYFGADDSMVA EVIDNTTSRGRTLRFLYDGPHDEFKKALYALGETPLPHSIINRPVEPEDSERFQSIFAKN EGAVTAPTASLHFSRELMKRLEIKGIDFAYITLHAGLGNFRDIDVEDLTKHKMDSEQMIV NAEAVNTVNRAKDKGKNVCAVGTTVMRAIESAVSTDGHLKEFEGWTNKFIFPPYEFTVAN SMISNFHMPLSTLLMIVAAFGGYEQVMDAYHVALKEGYRFGTYGDAMLILDK >gi|226332253|gb|ACIC01000067.1| GENE 23 19460 - 20182 779 240 aa, chain - ## HITS:1 COG:MT2862.1 KEGG:ns NR:ns ## COG: MT2862.1 COG0130 # Protein_GI_number: 15842331 # Func_class: J Translation, ribosomal 
structure and biogenesis # Function: Pseudouridine synthase # Organism: Mycobacterium tuberculosis CDC1551 # 8 216 8 210 298 172 45.0 4e-43 MNFKKGEVLFFNKPLGWTSFKVVGHVRYHICRRIGVKKLKVGHAGTLDPLATGVMILCTG KATKRIEEFQYHTKEYVATLRLGATTPSYDLEHEIDATYPTGHITRELVEETLTHFLGAI EQVPPAFSACMVDGKRAYELARKGEEVELKAKQLVIDEIELLECRLDDPEPTIRIRVVCS KGTYIRALARDIGEALQSGAHLTELIRTRVGDVRLEDCLDPEHFKEWIDRQEIENDEDNN >gi|226332253|gb|ACIC01000067.1| GENE 24 20235 - 21029 905 264 aa, chain - ## HITS:1 COG:Cgl1479 KEGG:ns NR:ns ## COG: Cgl1479 COG1968 # Protein_GI_number: 19552729 # Func_class: V Defense mechanisms # Function: Uncharacterized bacitracin resistance protein # Organism: Corynebacterium glutamicum # 1 251 20 274 293 141 37.0 1e-33 MSWLEAMILGLIQGLTEYLPVSSSGHLAIGSALFGIQGEENLAFTIVVHVATVCSTLVIL WKEIDWIFKGLFKFQMNDETRYVINIVISMIPIGIVGVFFKDYVEAIFGSGLMIVGCMLL LTAALLSFSYYYKPRQKDKISMKDAFIIGLAQACAVLPGLSRSGSTIATGLLLGDNKAKL AQFSFLMVIPPILGEALLDSVKMMKGEDVVGDIPALSLIVGFLAAFVAGCLACKWMINIV KKGKLIYFAIYCAIAGLAVIITQL >gi|226332253|gb|ACIC01000067.1| GENE 25 21042 - 21275 236 77 aa, chain - ## HITS:1 COG:no KEGG:BT_3211 NR:ns ## KEGG: BT_3211 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 77 1 77 77 131 100.0 7e-30 MSDKQKFAFDKVNFILLAIGMAIVIIGFLLMTGPTSSETVFEPDIFSVRRIKVAPVVCLF GFISMIYAVLRKPKTKE >gi|226332253|gb|ACIC01000067.1| GENE 26 21325 - 22206 675 293 aa, chain - ## HITS:1 COG:L2 KEGG:ns NR:ns ## COG: L2 COG2177 # Protein_GI_number: 15672955 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Lactococcus lactis # 23 285 29 311 311 82 26.0 7e-16 MGNKNRYKSKSLFDMQFITSSISTTLVLLLLGLVVFFVLTAHNMSVYVRENISFSVLISD DMKEADILKLQKKLNQEPFVKQSEYISKKQALKEQTEAMGTDPEEFLGYNPFTASLEIKL HSDYANSDSIAKIEKMIKKNSNIQDVLYRKELIDAVNENIRNISLVLLALAVVLTFISFA LINNTIRLAIYSKRFLIHTMKLVGASWSFIRGPFLRKNVWSGVLAGMLADAILMGTAYWA VTYEQELIQVITPEVMLIVCGSVLVFGIVITWLCAYISMNKYLRMKANTLYYI >gi|226332253|gb|ACIC01000067.1| GENE 27 22210 - 
23109 727 299 aa, chain - ## HITS:1 COG:no KEGG:BT_3209 NR:ns ## KEGG: BT_3209 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 21 299 1 279 279 579 100.0 1e-164 MEKLIINACPVCGSTHLKRVMTCTDFYASGEQFELCSCEDCGFTFTQGVPVEAEIGKYYE TPDYISHTDTRKGAMNTIYHYVRSYMLGRKARLVAREAHRKRGRLLDIGTGTGYFADTMA RRGWKVEAVEKSPQARAFAKEHFDLDVKPESALKEFAPGSFDVITLWHVMEHLEHLNETW EMLRELLTEKGMLIVAVPNCSSYDAGRYGEYWAAYDVPRHLWHFTPVTIQQLASRHGFIM AARHPMPFDAFYVSMLSEKHRGSSCSFLKGMYAGTLAWFSTLGRKERSSSMIYVFRKKR >gi|226332253|gb|ACIC01000067.1| GENE 28 24004 - 24165 122 53 aa, chain - ## HITS:1 COG:no KEGG:BT_2926 NR:ns ## KEGG: BT_2926 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 53 96 147 372 72 71.0 5e-12 MNYKDKYILFYLEGKKALSEKRVSRILGISYPDDMPDGYHGQDPRIWVYIKDK >gi|226332253|gb|ACIC01000067.1| GENE 29 24430 - 24585 96 51 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNFNTELKKNIGCNIVYSSSDGGNYSIILMEFDSGNRIMTYCYWEVQKDGK >gi|226332253|gb|ACIC01000067.1| GENE 30 24716 - 24949 180 77 aa, chain - ## HITS:1 COG:no KEGG:BT_3208 NR:ns ## KEGG: BT_3208 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 77 1 77 77 112 100.0 3e-24 MDNNILKIIFRVVAVLLSLGALFFQLINFVFGTMVKNVSDMQDDYWYYGTELIPHLFTII IIIWFVVFQIRKIINMK >gi|226332253|gb|ACIC01000067.1| GENE 31 24939 - 25709 293 256 aa, chain - ## HITS:1 COG:no KEGG:BT_3207 NR:ns ## KEGG: BT_3207 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 256 1 256 256 503 100.0 1e-141 MYDAALGRWHAVDPMSEKYYSISSYVYGLNTPHNCIDPDGQKIIFVNGYLGFGSPRGGGT YWGGVNSSFVKGAKNFFNDQSAYFTDFDFNYLRSSTFLRNLDGYAYAKENYKQLIMGMNP QEDVFRIISHSMGGAFSEGIIRYLKEQGWNVDFSIHLNTWLPSELMGSVGTFLIDATITN DWVQGLSLPIDGSRDIPNANYKIRKKSNEGYQYRHRDWIDSGSFWNANNGITWNQLMPIL DSWLIQNPNIQINYGQ >gi|226332253|gb|ACIC01000067.1| GENE 32 25954 - 27000 296 348 aa, chain - ## HITS:1 COG:no KEGG:BT_3206 NR:ns ## KEGG: BT_3206 # Name: 
not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 348 1 348 348 691 100.0 0 MLMRKLVYVVLLIILGGCIPPSPSLEDIHQRVAKQVEVLIDSGYLLTTYIEIDEVFSTDS NSLYYIGESDSPGSDGAELPSRVIKYKERYLCFIELDEPEMSRTELFERGFVSDSNFHEN LCLNRGRDWLLALRKYEDKHILVKMLPNYYRLFEYPELWSYFSGDIPQEKTALMGLTSHD IIVPSSYIPDLFELEIDSLKNYVERFSGEIFVRNQTDSVLLLSRNSARSMCYAVINGPDT LKLVLRDSLPVAIAPHDFKSLKYDSEPPHSFLQNLPDKDIWMSMYKLFSDSTFCFLNINN IPQKFRIMHNDAVYSSDLRDSLSKRVRYIYNKGVYDKEERIRRFFKWD >gi|226332253|gb|ACIC01000067.1| GENE 33 27002 - 27640 401 212 aa, chain - ## HITS:1 COG:no KEGG:BT_3205 NR:ns ## KEGG: BT_3205 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 212 33 244 244 386 99.0 1e-106 MLRIDLDGKDDYVISRSGRLFNETPIDKRGKGSTDNLYLSSDRSISVTVNQGLLGEMHSM QAKEQKENRVKKSYGSTQDLETAATVFKFAADHTTVEWKLDVYDDNGTRTAVVATDRDPY GVDNGVYAQNKLSVKGEKVIDIHSHLPGGTKGGAGNDFNLAKPQRKNAVYMKDNRVSTDK KDMIYEYTKNASRVNSIRVYDATDLLQYIKRK Prediction of potential genes in microbial genomes Time: Thu May 12 00:57:55 2011 Seq name: gi|226332252|gb|ACIC01000068.1| Bacteroides sp. 1_1_6 cont1.68, whole genome shotgun sequence Length of sequence - 1509 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 75 - 112 7.3 1 1 Op 1 . - CDS 167 - 739 272 ## BT_3203 hypothetical protein 2 1 Op 2 . 
- CDS 699 - 1508 408 ## BT_3202 cell well associated RhsD protein precursor Predicted protein(s) >gi|226332252|gb|ACIC01000068.1| GENE 1 167 - 739 272 190 aa, chain - ## HITS:1 COG:no KEGG:BT_3203 NR:ns ## KEGG: BT_3203 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 190 1 190 190 374 100.0 1e-102 MRKGIKLYGLLNHKKCFYGGSLFLLLLFITSCKSNQFLIDSNEVENVNFWFIGDVDTDVP ITDCLHIVFGNDSHKTIIRDRKIINRFLTLVNQLKPANPDSYIDLRVSSLIRLKPINGER RPDIKVCIGAGGYGVLLNDVLMKGNPKQLQKFIQEVLYDSLTPYEWLPSFIKEYLRDHPD EIDEYLPKSK >gi|226332252|gb|ACIC01000068.1| GENE 2 699 - 1508 408 269 aa, chain - ## HITS:1 COG:no KEGG:BT_3202 NR:ns ## KEGG: BT_3202 # Name: not_defined # Def: cell well associated RhsD protein precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 269 1069 1337 1337 531 100.0 1e-150 KYNGKELDRKNGLDWYDYGARMYDAALGRWHAVDPMSEKYYSWSPYTYCMGNPINHIDPD GNTVVIWYNNDAGKKVSYSYSGGDITHPNSFVQSVITAYQYNKANGLKAGNGGGASTVAI VENTNIKVNVMEAVFENSYNPNAARGAGSIYWKSNWGSQKDNGIVNSPATVFDHEADHAL EHKTNTQEYEVNRARGSDSQYQTKEERRVITGSEQKTSRANGETRSGQVTRRNHNGKTVI TKGVTSNVIDRQKTQEYEKRNKAVWTSEP Prediction of potential genes in microbial genomes Time: Thu May 12 00:58:13 2011 Seq name: gi|226332251|gb|ACIC01000069.1| Bacteroides sp. 1_1_6 cont1.69, whole genome shotgun sequence Length of sequence - 47054 bp Number of predicted genes - 37, with homology - 35 Number of transcription units - 23, operones - 9 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 2634 1485 ## BT_3202 cell well associated RhsD protein precursor - Prom 2655 - 2714 3.5 2 2 Tu 1 . - CDS 2877 - 6023 1724 ## BT_3201 hypothetical protein - Prom 6048 - 6107 3.4 - Term 6057 - 6120 2.2 3 3 Op 1 . - CDS 6136 - 6237 58 ## 4 3 Op 2 . - CDS 6237 - 6752 413 ## COG3023 Negative regulator of beta-lactamase expression 5 3 Op 3 . - CDS 6833 - 7339 464 ## BT_3199 putative non-specific DNA-binding protein - Prom 7369 - 7428 3.6 6 4 Op 1 . 
- CDS 7462 - 8370 562 ## BT_3198 hypothetical protein 7 4 Op 2 . - CDS 8358 - 8705 246 ## BT_3197 hypothetical protein - Prom 8915 - 8974 4.3 + Prom 8890 - 8949 6.2 8 5 Tu 1 . + CDS 9102 - 9812 769 ## BT_3196 hypothetical protein + Term 9865 - 9923 16.6 - Term 9962 - 9994 2.8 9 6 Tu 1 . - CDS 10047 - 11414 474 ## PROTEIN SUPPORTED gi|227395721|ref|ZP_03879044.1| SSU ribosomal protein S12P methylthiotransferase - Prom 11568 - 11627 7.8 + Prom 11520 - 11579 8.2 10 7 Op 1 . + CDS 11750 - 12190 347 ## gi|253569056|ref|ZP_04846466.1| conserved hypothetical protein 11 7 Op 2 . + CDS 12247 - 13746 1611 ## COG0427 Acetyl-CoA hydrolase 12 7 Op 3 . + CDS 13750 - 14499 390 ## BT_3192 TonB 13 8 Op 1 . - CDS 14524 - 15168 408 ## BT_3191 hypothetical protein 14 8 Op 2 . - CDS 15169 - 15759 435 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 15 8 Op 3 . - CDS 15756 - 17048 748 ## BT_3189 hypothetical protein 16 8 Op 4 . - CDS 17069 - 18250 933 ## BT_3188 hypothetical protein - Prom 18296 - 18355 4.7 17 9 Op 1 . - CDS 18399 - 19805 969 ## COG2027 D-alanyl-D-alanine carboxypeptidase (penicillin-binding protein 4) 18 9 Op 2 . - CDS 19885 - 21228 664 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 19 9 Op 3 . - CDS 21240 - 21635 215 ## BT_3185 hypothetical protein - Prom 21701 - 21760 3.0 + Prom 21454 - 21513 3.4 20 10 Op 1 . + CDS 21743 - 23314 1762 ## COG0029 Aspartate oxidase 21 10 Op 2 . + CDS 23317 - 23589 186 ## BT_3183 hypothetical protein + Term 23607 - 23657 12.0 - Term 23595 - 23645 12.0 22 11 Tu 1 . - CDS 23670 - 24248 718 ## COG1592 Rubrerythrin - Prom 24463 - 24522 4.7 + Prom 24207 - 24266 6.7 23 12 Tu 1 . + CDS 24493 - 26172 1647 ## COG0659 Sulfate permease and related transporters (MFS superfamily) + Term 26190 - 26249 12.8 - Term 26182 - 26233 3.4 24 13 Tu 1 . - CDS 26268 - 27398 791 ## COG4299 Uncharacterized conserved protein + Prom 27539 - 27598 3.4 25 14 Tu 1 . 
+ CDS 27670 - 29103 1067 ## BT_3171 sialic acid-specific 9-O-acetylesterase + Term 29116 - 29163 1.6 26 15 Op 1 . - CDS 29244 - 29444 59 ## 27 15 Op 2 . - CDS 29485 - 33369 4056 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases - Prom 33419 - 33478 8.2 28 16 Op 1 . - CDS 33580 - 35130 1548 ## COG3119 Arylsulfatase A and related enzymes 29 16 Op 2 . - CDS 35174 - 35947 755 ## COG0731 Fe-S oxidoreductases - Prom 35970 - 36029 6.5 30 17 Tu 1 . - CDS 36061 - 38109 1074 ## BT_3167 hypothetical protein - Prom 38191 - 38250 7.4 + Prom 38155 - 38214 4.8 31 18 Tu 1 . + CDS 38258 - 38830 558 ## BT_3166 hypothetical protein + Term 38951 - 39016 3.8 - Term 39146 - 39185 -0.8 32 19 Tu 1 . - CDS 39215 - 39742 166 ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 - Prom 39835 - 39894 4.8 + Prom 40133 - 40192 7.2 33 20 Tu 1 . + CDS 40234 - 41172 798 ## COG0379 Quinolinate synthase + Term 41201 - 41256 14.6 - Term 41189 - 41244 10.0 34 21 Op 1 . - CDS 41328 - 43352 1531 ## BT_3163 alpha-glucosidase, putative 35 21 Op 2 . - CDS 43391 - 44176 526 ## COG0584 Glycerophosphoryl diester phosphodiesterase - Prom 44262 - 44321 3.4 36 22 Tu 1 . - CDS 44328 - 44999 711 ## BT_3161 hypothetical protein - Prom 45167 - 45226 6.7 + Prom 44965 - 45024 5.6 37 23 Tu 1 . 
+ CDS 45199 - 46926 725 ## BT_3160 hypothetical protein + Term 46984 - 47034 8.5 Predicted protein(s) >gi|226332251|gb|ACIC01000069.1| GENE 1 3 - 2634 1485 877 aa, chain - ## HITS:1 COG:no KEGG:BT_3202 NR:ns ## KEGG: BT_3202 # Name: not_defined # Def: cell well associated RhsD protein precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 877 83 959 1337 1738 100.0 0 MGMEMEKVPGQLECQLKITDDNLLTNQVYYQNAKSVDPDSLSSVSFTDTLLLFAGTYCLT STVIEKNKDILVPDDNGVNAASVLKTNTFDIKPIDPHEPIKPFPPIDPFPPIDPVLPMDS IRPLPLFYIMTLSIETKGLAPEEQPMPAIPFPAINWGNDLLPERFGKNAVSVFKSRTGNK EEGMLVVNYYDGLGRSKETVQQGVSPSHKDLVTLQEYDGWGRKANTWLPAVTTQNTGDFI SVASCGELARNTYGGDAYPYLTFTYENSPLEKVIKQHNPGRVWRENGRSLKIDEYVNIAG NDTLGCFGFQLINGAREALNISVSKQYENGSLLVTRSEDEDGVTLFEFKDRLERTVLERR IESRLKGDKQLSDTYYIYDDLGKLCAVLPPALSDQLSVGNIPAEKLDMYGYLYKYDVAGN LMAKKLPGISWEYYVYNVNNNLIFSQNGEERKRGEWKFSIPDAFGRVCLQGVCKVAIDPF DNPYLKPIFSGLSNWYVCQYVGSTEYGGYQFSGSLMAGLVGPKPLVQVINYYDNYDFMYR TDLSARDEFRYTHEEGFAATSSGAKGMLTGIAKLHTDAYDQYRNGEYGVDSPYNYSVMYY DDRGRLSQTVSDNHLGGMDRDFFSYDFNGNALRHLHRQTGAGNEILSDSYTYEYDHAERL VKALHRLGDAQEVILIDNVYDDLGRLSRKTFHNGLLNTSYSYNIRSWLTGITGSSFEQVL HYTDGTGIPYYNGNISSMVWKSGEDDIMRGYHFTYDNLNRLTNAVYGEGSVLVQNQNRFN EQVTGYDKMSNILGIKRSGQTSSTGYGLIDDLAMSYNGNQLKSVSDRATNSVYGNGFDFK DGVNKEAEYEYDENGNMTKDLNKKILNIQYNCLNLPS >gi|226332251|gb|ACIC01000069.1| GENE 2 2877 - 6023 1724 1048 aa, chain - ## HITS:1 COG:no KEGG:BT_3201 NR:ns ## KEGG: BT_3201 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1048 1 1048 1048 2063 100.0 0 MKKRLLSVFVFLAIFSFNCIGQQQIPIKNVPSPEIAGLGEYGKVPVSLYTGVPDISVPIY ELKAGNYSLSLAASYHLASVKPNSQSGCLGLGWNLIAGGYITRSVRGMYDEKCQSNGYAP GYYAHASKLKNISNEQFKAETMHIQSTENDYYELTADEFSFSVCGYSGNFYYAGNGEWNV VSDQDIRVEFNPVDGEGFLSLSEVGKRLDVSRWGASTRNNRFFNKFTLITPDGCRYEFGG INATEYSIPYYARYNSDLIATTWRLSKITTVDKRVIEFSYDTSAIMCDLRYVPQQKVVTN IPCTYSGIQSGRSGMTGYLLFPVNLKTIKTPNEILEFNYYNEYGYGDKFVDSYLAWKDNA NYDRQDIDNMNFEDPANQFTLFLGNQIDNTDQATLRQSIKSKLQHKILHCIYVKGKNSKA 
LKTIYFDYARNNRTKLSLITERSGNPDLIPNYVWHPHGFYFLTWYNIPENLTGGRVPEYR FLYNSEKRMPNDYVRPIADSWGYYTGGSVAFAEIPNFSQTYSSLQYTLAEVLTEVIYPTG GKSRFEYELNNYSKVVAPSLMSLTDKSGTAGGLRIRRITNLDNEDNVLGAKQYYYSNTRD RFGKSSGILKSLPVNEMVYTLKDGDKEPDPKNAISLYLKSKGGFFPSVTNLNTPDVGYSC VIEEAFDKDNKSQGYIVRHYSNYNEDIYGNTHYDELAFYSMFEGNSYTMPFSSRSMERGK LLSEEYYDVNDRLRKKVNYRYKEVTPGSFVTADQMVLFFCTDLDNFMLGKVGTLTRTYTH AYLTDSVIETLYPQSGNTAFVIEKAYQYNKYKQLSQIAGRNSDGKSTLTEYVYAATLPEY KWMEEAHILSPVSSKKEQTGGSYLKEVYQYMGPIPYIKQISTDRDGYVHKHYTVQAVDGY GNPIYLHEESTPVVLIWGAEGQRLISRIENATLNQVEEALGMNVKDFSSSDISATNLLKI ENIRHKISGTHFYIYKYTNELRLGSETKPNGITVFYKYDFLGRLTENYIMEFKDGDYQKR ILNIYDYNYYYGSKIESGEVAIEKGGQL >gi|226332251|gb|ACIC01000069.1| GENE 3 6136 - 6237 58 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAMKRSVWDMILKVVIAVASAVAGVLGANAMNL >gi|226332251|gb|ACIC01000069.1| GENE 4 6237 - 6752 413 171 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 46 141 2 97 116 100 48.0 9e-22 MRIINLIVVHCTATQGNRTLSPEALDLMHRRRGFNGTGYHYYIRKDGTVHLTRPVERIGA HAKGFNASSIGICYEGGLDCRGRPADTRTPEQRATLRLLIHQLLEVFPSCRVCGHRDLSP DRNGNGEIEPEEWMKACPCFDAEQEFKEFAAEDTEETQSKLNYLQNKKGGK >gi|226332251|gb|ACIC01000069.1| GENE 5 6833 - 7339 464 168 aa, chain - ## HITS:1 COG:no KEGG:BT_3199 NR:ns ## KEGG: BT_3199 # Name: not_defined # Def: putative non-specific DNA-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 168 1 168 168 281 100.0 8e-75 MAVLYKSFQSVLEDKNHKKMFHPRVIYTANISTSQIAKEIAAYSSLSPGDVKNTLDNLVT VMGQHLQASESVSLDGFGTFRMVMKSNGKGVETSEEVSAAQASLTVRFLPNFTKNPDRTT ATRSLVTGAKCVRFDRTDTPASGSGNTNKPGGGDGGEGGGEEAPDPTV >gi|226332251|gb|ACIC01000069.1| GENE 6 7462 - 8370 562 302 aa, chain - ## HITS:1 COG:no KEGG:BT_3198 NR:ns ## KEGG: BT_3198 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 302 1 302 302 609 100.0 1e-173 
MGRIKRGLDYFPMSTSFMHDRMVRRIMRREGDSAFATLVETLSYIYAGKGYYISVGDEFY EELVDSLYSTELDDVKRIISLSVEYGLFDAGLFRQYNILTSADIQRQYLFITKRRSSALI EPDYCLLESEEITSYRSSQSGKSSTDDSLDSECAEALNGDADHKTACTVTSSFDSVTMED EIATSGTQNKRKQIKINQNKVNHLPNPPQGGDEGGKYLKSRTAVTQEDIDCLQPPCDGVQ RNFQGLTDNLRLYKVPPSEQYAIILKSNFGAIGNPVWKGFNTIRGSNGKIRLPGHYLLSI IN >gi|226332251|gb|ACIC01000069.1| GENE 7 8358 - 8705 246 115 aa, chain - ## HITS:1 COG:no KEGG:BT_3197 NR:ns ## KEGG: BT_3197 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 115 1 106 106 191 100.0 7e-48 MKTLKRKINMEPLMNQSLHICGCCKRELPHEAFYTNKRTGTLDNYCKECRKANTRKRRDL RRCTSFENKPVSYPVITEVKDPALRMLLILHARQVVAESISRKQEKEKLNTLWEE >gi|226332251|gb|ACIC01000069.1| GENE 8 9102 - 9812 769 236 aa, chain + ## HITS:1 COG:no KEGG:BT_3196 NR:ns ## KEGG: BT_3196 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 236 1 236 236 483 100.0 1e-135 MDTFLDVSGIVKRAKQALNFKKDSELASYLGVSRATLSNWCARNRIDFHLLLNKMRDVDL NWLLVGKGTPRHQTKLCKSDLASGEVQMIHNPKTVEALDDRSVALYDITAAANLKTLLTN KRQYMVGKIQIPSIPLCDGAVYISGDSMYPILKSGDIVGFKEINSFSNLIYGEMYLVSFT IEGDEYLAVKYVNRSEKEGCLKLVSYNAHHDPMDIPFSTINAMAIVKFSIRRHMMM >gi|226332251|gb|ACIC01000069.1| GENE 9 10047 - 11414 474 455 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227395721|ref|ZP_03879044.1| SSU ribosomal protein S12P methylthiotransferase [Haliangium ochraceum DSM 14365] # 20 450 1 450 461 187 30 1e-46 MNELTGADFKSATADDNKKLFIETYGCQMNVADSEVIASVMQMAGYSVAETLEEADAVFM NTCSIRDNAEQKILNRLEFFHSLKKKKKALIVGVLGCMAERVKDDLITNHHVDLVVGPDA YLTLPELIAAVEAGEKAINVDLSTTETYRDVIPSRICGNHISGFVSIMRGCNNFCTYCIV PYTRGRERSRDVESILNEVADLVAKGYKEVTLLGQNVNSYRFEKPTGEVVTFPMLLRMVA EAAPGVRIRFTTSHPKDMSDETLEVIAQVPNVCKHIHLPVQSGSSRILKLMNRKYTREWY LDRVAAIKRIIPDCGLTTDIFSGFHSETEEDHQLSLSLMEECGYDAAFMFKYSERPGTYA SKHLEDNVPEDVKVRRLNEIIALQNRLSAESNQRCIGKTYEVLVEGVSKRSRDQLFGRTE QNRVVVFDRGTHRIGDFVNVRVTEASSATLKGEEV >gi|226332251|gb|ACIC01000069.1| GENE 10 11750 - 12190 347 146 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|253569056|ref|ZP_04846466.1| ## NR: gi|253569056|ref|ZP_04846466.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 146 1 146 146 234 100.0 2e-60 MEDFLKFLLIAGVILVGIFKEVSKNNKSKKAQNKRPVPPAPSPVEVDPDAIPIPEFWGRG SRALDELLQPIPMEQPAPKPTPKKKKKEEVSVAASIANSSAQDKRNTKQGSHYDHPEPAG EEDFSIHSVEEARRAIIWGEILQRKY >gi|226332251|gb|ACIC01000069.1| GENE 11 12247 - 13746 1611 499 aa, chain + ## HITS:1 COG:ygfH KEGG:ns NR:ns ## COG: ygfH COG0427 # Protein_GI_number: 16130821 # Func_class: C Energy production and conversion # Function: Acetyl-CoA hydrolase # Organism: Escherichia coli K12 # 3 489 5 490 492 508 49.0 1e-144 MSFNRISAAEAASLIKHGYNIGLSGFTPAGTAKAVTAELAKIAEAEHAKGNPFQVGIFTG ASTGESCDGVMSRAKAIRYRAPYTTNSDFRKAVNNGEIAYNDIHLSQMAQEVRYGFMGKV NVAIIEACEVTPDGKIYLTAAGGISPTICRLADQIIVELNAAHSKSCMGLHDVYEPLDPP YRREIPIYKPSDRIGLPYIQVDPKKIVGVVETNWPDEARSFADADPLTDKIGQNVADFLA ADMKRGIIPSTFLPLQSGVGNIANAVLGALGRDKTIPAFEMYTEVIQNSVIGLIREGRIK FGSACSLTVTNDCLEGIYNDMDFFRDKLVLRPSEISNNPEVVRRLGIISINTAIEVDLYG NVNSTHIGGTKMMNGIGGSGDFTRNAYISIFTCPSVAKEGKISAIVPMVSHLDHSEHSVN IIITEQGVADLRGKSPKERAQAIIENCAHPDYKQLLWDYLKLAGGRAQTPHAIQAALGMH AELAKSGDMKNTNWAEYAR >gi|226332251|gb|ACIC01000069.1| GENE 12 13750 - 14499 390 249 aa, chain + ## HITS:1 COG:no KEGG:BT_3192 NR:ns ## KEGG: BT_3192 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 249 1 249 249 483 100.0 1e-135 MKRLSFILSFIVCICVANTNDVFSQKRELQQFKGKVTNLQGEPVRAVSIYIPGTGRYTSS DMNGDFSVEAKHKETLRFSHIGMKDVLIQLDKDFPSELLVRMKPDTFQINNIFIQKKQII KVKPEDMPEQTMINFIINGPDFPGGLSGFDEFIKKNIQYPEEAFQAGEEGQVTVEFTIDV NGYVSDAKVTKSVSASLDKEALRIIESMPRWKSGMQLGRPVAVRMSVPINFVIEQDYQVI DSLQVKTNT >gi|226332251|gb|ACIC01000069.1| GENE 13 14524 - 15168 408 214 aa, chain - ## HITS:1 COG:no KEGG:BT_3191 NR:ns ## KEGG: BT_3191 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 214 1 212 212 389 100.0 1e-107 MNMKSDDLERLEESQEMQQLIHAAKEQYEQEKSVHPPVAYAAFAQNLRERQKDGKMKAKQ 
EPPERETFEVKGGKRSGISPWWLVAACLLGGIIGYGISTTSDRQEAGLERLAVADTVIVV RERVDTVYREVEVPSQPLVASRSTVKSKSTGLTKEHVKQSRKAVQKAEPAFISIEFLQQQ QNLPDPNSECYAANGMTVAEGNYPFHLLATVPCK >gi|226332251|gb|ACIC01000069.1| GENE 14 15169 - 15759 435 196 aa, chain - ## HITS:1 COG:PM1789 KEGG:ns NR:ns ## COG: PM1789 COG1595 # Protein_GI_number: 15603654 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pasteurella multocida # 12 186 6 182 191 71 31.0 1e-12 MMLFTSFSHDETDEQLMERFAFRNSEKAFEELYCRYAPRFKGFFMRMLSADGALADDFLQ ELFLRVYEARGSYQHGKQFSTWAFAMAYNLCKNEYRHRDIVDEYRLQQSYTNEEDSYSSL EFEVAHDRQVFDRQLKKVLAGLTDEQRAAYTLRYEEELPVQEIAQILCCPEGTVKSRLYY TLKVLQRSLSDYNPQK >gi|226332251|gb|ACIC01000069.1| GENE 15 15756 - 17048 748 430 aa, chain - ## HITS:1 COG:no KEGG:BT_3189 NR:ns ## KEGG: BT_3189 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 430 1 430 430 825 100.0 0 MKRLIYILFLLIGFEQALFAQKIDTLRTERLELRLEPDLVGFIPCGKEWFQVQKDAWKLE IDKDVRNEKAWENYYLACKGIWDEDTLLWKKEQPRLLKKMKKYIPDTRVYYKVLDDEVMI SDKDKREAIKEKIVYLRRDCERDYRDDMWYYQRHGQIDKVREIAREWFDSGLYSRSILTY YYNEFVGLKRNAILAGTGPEWAYSVLLQYGAGLFKDVEVVDLSELMNPEEESDFWKTKGI DVNTLPDRGKVKCPGAWYFAEKEQRPVYISQYSASRQDVVEMKDFLYSEGLVFRYSLKPY DNMAILRRNYEQVYLLDYLQSPIITDISRGEDSYVLSFLPLLQFYRTSGDKNQYLRLKKL LLDIVDRMDDGCFVNKSWVDAKTAEHLKLFNRIFDHVRDMEKKGLTSVIVRSFKEEHRKE YQQLIEMVEP >gi|226332251|gb|ACIC01000069.1| GENE 16 17069 - 18250 933 393 aa, chain - ## HITS:1 COG:no KEGG:BT_3188 NR:ns ## KEGG: BT_3188 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 393 1 393 393 741 100.0 0 MKTIKYLLLVLLCYCTAQVYSQNKASWVPDSIASQIPVIQNQAREWKQKIYENPKDEKAW MSYARTIQTLKSLTPGDDIEKEINEMMDKMKKEIPNTATYALIQNMILPFGKNDMTFDEI IDKWPDAVMHYPVYMGLSFSNKDRLKDISTRWYQSGAYPVQSLNYTYNELTSAEKDALIF TDVNWTLFGSYLLQYGKGLFDDKKVILSGLILPSFSMNRLTEELGIPEFKDTDPEFYKSK TPTATFANEIKKRIEHIAKYTNRPIYISVSTNEAVKDLLKDHLYTEGLLMRYSAKPYDNL 
AIMRRNYENTYLLDYLYESFYPETLTNVCLDLKGVKMLSIYYVPAFKSLLQFYKESGDVT HYDKLHALLESIIKKADYYNEEVRERYLKSINF >gi|226332251|gb|ACIC01000069.1| GENE 17 18399 - 19805 969 468 aa, chain - ## HITS:1 COG:BS_pbp KEGG:ns NR:ns ## COG: BS_pbp COG2027 # Protein_GI_number: 16078896 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase (penicillin-binding protein 4) # Organism: Bacillus subtilis # 5 465 13 488 491 162 28.0 1e-39 MKKYLLFLVLSVVLLPASVWGQANLSQIDSLIKRMLPEASEVGISVYDLTAKKSLYTYRD TKLSRPASTMKLLTAITALSRPDADNPFRTEVWHDGVIEHDTLQGNLYVVGGFDPEFDSL MMDSLIEEVITFPFSVINGQVYGDVSMKDSLYWGHGWAWDDTPEAYQPYMSPLMFCKGAV EVTVVPGSLQGDTASVSCKPVSSYYTLTNRTKTRTPSAGKYSLSRDWLTNGNNLIVTGNV PTFRKDLINVYDSGSFFMHAFLERLRAKGIVVPESYGFTELPSDGAEQMARWETPVQKVL NQLMKESDNLNAEAMLCRIASQATGKKRVTAEDGIVEIMKLVRNLGHDPKDYKIADGCGL SNYNYLSPALLVDFLKYAYSQSDVFQKLYKSLPVAGIDGTLKNRMKNTPAFRNVHAKTGS FTAINALAGYLKMKNGHTLAFAIMNQNVLSAAKARAFQDKVCEVIIGH >gi|226332251|gb|ACIC01000069.1| GENE 18 19885 - 21228 664 447 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 2 447 3 448 458 260 33 1e-68 MKYQVIIIGGGPAGYTAAEAAGKAGLSVLLIEKNNLGGVCLNEGCIPTKTLLYSAKTYDS ARHASKYAVNVSEVSFDLPKIIARKSKVVRKLVLGVKAKLTSNNVAMVTGEAQIIDKNTV RCGEETYNAENLILCTGSETFIPPITGVETVNYWTHRDALDSKELPASLAIVGGGVIGME FASFFNSLGVQVTVIEMMDEILGGMDKELSALLRAEYAKRGIKFLLSTKVVALSQTEEGA VVSYENEEGKGSVIAEKLLMSVGRRPVTKGFGLENLNLDKTGRGAIKVNKKMQTSLSGVY VCGDLTGFSLLAHTAVREAEVAVHSILGKEDAMSYRAIPGVVYTNPEIAGVGETEESASA KGITYKVVKLPMAYSGRFVAENEGVNGVCKVLLDEQEQIIGAHVLGNPASEIITLAGTAI ELGLTAAAWKKIVFPHPTVGEIFREAL >gi|226332251|gb|ACIC01000069.1| GENE 19 21240 - 21635 215 131 aa, chain - ## HITS:1 COG:no KEGG:BT_3185 NR:ns ## KEGG: BT_3185 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 130 1 130 131 217 98.0 1e-55 MNRIFHARIAWYQYFLLVVLTVNAVGALWCKYILPAVLLMLLLIVVIEQIIHTVYTVSTD GLLEISTGRFMRKKVIPIAEITAIRKYHSMKFGKFSVTNYVLIEYGNGKFASVTPVKERE 
FVELIKKRMEL >gi|226332251|gb|ACIC01000069.1| GENE 20 21743 - 23314 1762 523 aa, chain + ## HITS:1 COG:PA0761 KEGG:ns NR:ns ## COG: PA0761 COG0029 # Protein_GI_number: 15595958 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate oxidase # Organism: Pseudomonas aeruginosa # 4 519 6 520 538 485 47.0 1e-137 MVRKFDFLVIGSGIAGMSFALKVAHKGKVALICKSGLEEANTYFAQGGVASVTNLLVDNF DKHIEDTMIAGDWISDRGAVEKVVREAPAQIEELIKWGVDFDKNEKGEFDLHREGGHSEF RILHHKDNTGAEIQDSLIKAVQRHPNIEVIENQFAVEILTQHHLGVTVTRQTPDIKCYGA YILDPKTGKVDTYLAKVTLMATGGVGAVYKTTTNPLVATGDGIAMVYRAKGTVKDMEFVQ FHPTALYHPGDRPSFLITEAMRGYGGVLRTMDGKEFMQKYDPRLSLAPRDIVARAIDNEM KNRGDDHVYLDVTHKDPEETKKHFPNIYEKCLSLGIDITKDYIPVAPAAHYLCGGILVDL NGQSSIERLYAVGECSCTGLHGGNRLASNSLIEAVVYADAAAKHSLKVVDQYAYNEAIPE WNDEGTRSPEEMVLITQSMKEVNQIMSTYVGIVRSDLRLKRAWDRLDIIYEETESLFKRS VASREICELRNMVNVGYLIMRQAMERKESRGLHYTIDYPHAKK >gi|226332251|gb|ACIC01000069.1| GENE 21 23317 - 23589 186 90 aa, chain + ## HITS:1 COG:no KEGG:BT_3183 NR:ns ## KEGG: BT_3183 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 90 1 90 90 161 100.0 9e-39 MKTLTRLFILLFVMAGVSSCYTGIPLKQGKSENNRTYEVSYLFEHDGVKVYRFMDMGNYV YFTTKGDVTSIKNDSTKERTVTIYRDDTVR >gi|226332251|gb|ACIC01000069.1| GENE 22 23670 - 24248 718 192 aa, chain - ## HITS:1 COG:CAC2575 KEGG:ns NR:ns ## COG: CAC2575 COG1592 # Protein_GI_number: 15895835 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 3 192 2 195 195 202 58.0 4e-52 MTKSIKGTQTEKNLLTSFAGESQARMRYTYFASVAKKEGYEQISAIFTETADQEKEHAKR MFKFLEGGMVEITASYPAGVIGNTLENLRAAAAGEHEEWSLDYPHFADVAEQEGFPMIAA MYRNISIAEKGHEERYLAFMNNIENMTVFAKEGEVVWQCRNCGYIEIGKEAPEVCPACLH PQAYFEVKKENY >gi|226332251|gb|ACIC01000069.1| GENE 23 24493 - 26172 1647 559 aa, chain + ## HITS:1 COG:CT856 KEGG:ns NR:ns ## COG: CT856 COG0659 # Protein_GI_number: 15605592 # Func_class: P Inorganic ion transport and metabolism # Function: Sulfate permease and related transporters (MFS 
superfamily) # Organism: Chlamydia trachomatis # 8 559 13 560 567 403 42.0 1e-112 MKLFEFKPKLVSCLKNYSKETFMADLMAGVIVGIVALPLAIAFGIASGVSPEKGIITAII AGFIISLLGGSKVQIGGPTGAFIVIIYGIIQQYGEAGLIVATLMAGVLLILLGVFKLGAV IKFIPYPIIVGFTSGIAVTIFTTQIADIFGLSFGGEKVPGDFVGKWMIYFRHFDTVNWWN TIVSIVSIIIIAITPRFSKKIPGSLIAIIVVTVAVYLMKTYGGIDCIPTIGDRFTIKSEL PDAVVPALDWEAIKNLFPVAITIAVLGAIESLLSATVADGVIGDRHDSNTELIAQGAANI VAPLFGGIPATGAIARTMTNINNGGKTPIAGIIHAIVLLLILLFLMPLAQYIPMACLAGV LVIVSYNMSGWRVFKALLKNPKSDVTVLLITFFLTVIFDLTVAIEVGLIIACVLFMKRVM ETTEISVITDEIDPNKESDIAVNEENIMIPKGVEVYEINGPYFFGIATKFEETMAQLGDR PNVRIIRMRKVPFIDSTGIHNLTTLCEMSQKEKITVILSGVNEKVYKVLEKSGFYELLGK ENICPNFKVALDRAEEVMK >gi|226332251|gb|ACIC01000069.1| GENE 24 26268 - 27398 791 376 aa, chain - ## HITS:1 COG:all1887 KEGG:ns NR:ns ## COG: all1887 COG4299 # Protein_GI_number: 17229379 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. PCC 7120 # 11 323 2 307 375 105 29.0 2e-22 MSKLSENNTSRLASLDILRGFDLFLLVFFQPVFAALVRQLNLPFLNDILYQFDHEVWEGF RFWDLVMPLFLFMTGASMPFSLSKYVGMSGSYWPVYRRILRRVFLLFIFGMIVQGNLLGL DSSHIYLYSNTLQSIAVGYFIAAVIQLHFSFRWQIGITLLLLFIYWIPMTFLGDFTPAGN FAEQVDRWVLGRFRDGVFWNEDGTWSFSPYYNYTWIWSSLTFGVTVMLGAFAGKIMKEGK ANRKKVVQTLSVIGVLLVGLAMLWSLQMPIIKRLWTGSMTLLSGGYCFLLMALFYYWIDY KGHSRGLNWLKVYGMNSITAYLLGEVVNFRCIADSVSYGLKQYIGDYYPVWLTFANYLIL FFLLRMMYKRGLFLKV >gi|226332251|gb|ACIC01000069.1| GENE 25 27670 - 29103 1067 477 aa, chain + ## HITS:1 COG:no KEGG:BT_3171 NR:ns ## KEGG: BT_3171 # Name: not_defined # Def: sialic acid-specific 9-O-acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 477 1 477 477 979 98.0 0 MKKHFLIFTISLLFFTLAQGKVKLPAMMGDHMVLQQNSSVKLWGWADGKKVTVTTSWNNR TYKVSTDDEGAWLVKVDTPAGSFTPYSITISDGTRVTLSDILIGEVWICSGQSNMEMTMK GNMGQPIDHSLETLLNAGNYRDRIRFITVPRTKGVKERTDFEGAKWEVSSPETTMDCSAA GYFFARQLTETLHLPVGLVINSWGGSRIEAWMTEEALASVEGADIEAAKNPKLDTNSRLQ CLYNIMLLPVKNYTARGFLWYQGESNLFNYQIYAPMMTAMVQLWRNVWEAPDMPFYYVQI APHKYGNSRNINSALLQEAQMKALQTIPNSGMIPTIDVGDEFCIHPPQKNVVGLRLANLA 
LTKTYGLHKFPSTGPMMTKVEYSKNKAIVTLDNAPSGLAPGSCELEGFEIAGADKKFYPA KARIAGRTRNVEVWSDQVAQPVAVRYAFRNYVGNITLRNTLGIAAFPFRTDTWDDVK >gi|226332251|gb|ACIC01000069.1| GENE 26 29244 - 29444 59 66 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPVGNYWHPLWETDGKDYFHKQSFSLLPPYLIWLGISSFEYSFYEVRIILLDSNLYSISS KLCRNE >gi|226332251|gb|ACIC01000069.1| GENE 27 29485 - 33369 4056 1294 aa, chain - ## HITS:1 COG:alr5331 KEGG:ns NR:ns ## COG: alr5331 COG1501 # Protein_GI_number: 17232823 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Nostoc sp. PCC 7120 # 457 698 488 720 818 134 34.0 9e-31 MKKETPISLRHLRRITAILLLCPISLSSIQASVASDQSDVSSASVCQEQTSGILRAEKIN PTTVDVLFTNQQRMTIDFYGENIFRVFQDNSGGVIRDPEAKPEAQILVDQPRRKVSGLSV DEKDGYITLTTAQVRIELNKQTGLMKVLNPLTGKCVIEEVAPVVFGPKEVTVTLKENPEE YFYGGGVQNGRFSHKGKVIAIENQNSWTDGGVASPAPFYWSTNGYGMMWYTFRKGEYDFG ATEKNIVKLSHNSSYLDIFYMVNDGAVSLLNDFYQLTGNPVLLPKFGFYEGHLNAYNRDY WKEDEKGILFEDGKRYKESQKDNGGIKESLNGEKNNYQFSARAVIDRYKNHDMPLGWLLP NDGYGAGYGQTETLDGNIANLKSLGDYARKNGVEIGLWTQSDLHPKEGVSALLQRDIVKE VRDAGVRVLKTDVAWVGAGYSFGLNGVADVGHIMPYYGNDGRPFIISLDGWAGTQRYAGI WSGDQTGGEWEYIRFHIPTYIGSGLSGQPNICSDMDGIFGGKNAAVNIRDFQWKTFTPMQ LNMDGWGANEKYPHALGEPATSINRMYLKLKSELMPYTYSFAREAVDGMPLIRAMFLDYP NEYTYGTATRYQYMYGTDFLVAPVYQNTKADKEGNDIRNGIYLPEGTWIDYFSGEKYEGN RILSNFDTPVWKLPVFVKNGAIIPMTQPNNNVSEIDPSLRIYEFYPNRHTATVEYDDDGV TEAYRQEKSVSTLIESNVDAKNRVTITIHPTAGSFDGFVKDKKTELRINVTEKPKKLTAR INNKKVKLTEVSTAAELLNGENVFWYEETPNLNKFATKGSEFEKVVITRNPLLHIKLGST DITANRIELDVEGFRFEPADRNLVSTGTLSVPQNAQVSEQNREAYTLQPTWDKVPNADYY EIEFNGMLYTTIRDTYLLFEDLKADTPYAFKVRAVNKDGVSEWAEIQVKTKTNPLEFAIR GIEGESTAASQGGFGVDRLFNFSESGDTWHTKYSVNSIPLDLIIDLKTVNQLDKFHYLPR ADAGNGTLLKGTVSYSMDKENWTEAGAFEWQRNGDVKVFTFTERPNARYIKLNVTAGVGN YGSGREIYVFKVPGTASYLQGDINNDGKIDRNDLTSYMNYTGLRRGDSDFEGYISKGDIN MNDLIDAYDISVVATQLDGGVDREATEKVSGSLSISTPKKQYQKDEIVEIRVKGNDLRSV NALSFALPYDQSDYEFVGVEPLNMKAMENLTYDRLHTNGVKSLYPTFVNMGKQEALEGSE ELFILKLKAKRKVKFDLNLKDGILVDKQLRMHSF >gi|226332251|gb|ACIC01000069.1| GENE 
28 33580 - 35130 1548 516 aa, chain - ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 15 503 8 471 497 150 26.0 4e-36 MKNTNLAFYALAGISSMPTTLTAGQAHPKERPADKQQPNVILIVADDLGYGDLSCYGAHR ISTPGMDRVANEGIRFTQGFCTAATSTPSRYSVMTGLYPWTNTDAKILPGNAALIIDTQK ITLPKLMKQAGYATGSVGKWHIGLGDGKVDWNKEVHPGAAEIGYDYSFIQAATNDRVPCV FLENGRVVGLDPNDPLYVDYRQNFPGEPTGKNNPELLRMHPSVGHAGSIVNGVPRIGFQK GGKAAQWKDEEMAELFLNKAKQFVNDHKDAPFFLYYGLHQPHVPRVPNERFAGQSGMGPR GDVILEADWCVSEFLKELDKLGLAENTIVVLTSDNGPVLDDGYKDDAVELVGDHKIAGPL RGGKTSMFDGGTRIPFMLRWPAAVKPQVSDAFVCQMDLLASFAALLGQSYPDKLDSENTL DTFLGKSKKGREELVIEGMFNYAYRQGDWALIPPYYNPYSKEDGDFIGLGYGYKLYNLKK DIGQQENVATKYPKKLGELINRFEYLKATTNKVTRY >gi|226332251|gb|ACIC01000069.1| GENE 29 35174 - 35947 755 257 aa, chain - ## HITS:1 COG:Cj1244 KEGG:ns NR:ns ## COG: Cj1244 COG0731 # Protein_GI_number: 15792568 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductases # Organism: Campylobacter jejuni # 9 251 4 233 300 109 33.0 4e-24 MTIIFPSPIFGPVHSRRLGVSLGINLLPEDGKVCSFDCIYCECGFNADHRPKKHLPTREE VRIALEEKLKDMQKNGPAPDVLTFAGNGEPTAHPHFPEIIEDTLALRDHYFPNAKVSVLS NSTFIDRPAVFDALNKIDNNILKLDTVNEEYIHLLDRPNGKYSVRKIIEKMKEFKGNCII QTMFLKGSYLGKDMDNTSDEFVLPWLEAVKEIAPSQVMIYTIDRETPDHDLQKATHEELD RIVELIKRTGIPATASY >gi|226332251|gb|ACIC01000069.1| GENE 30 36061 - 38109 1074 682 aa, chain - ## HITS:1 COG:no KEGG:BT_3167 NR:ns ## KEGG: BT_3167 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 682 1 682 682 1334 98.0 0 MKTTQLFRLISIINSLFIIPACSAQNPSQSLIEDIFEDLSVNNSVDNAVNEPNWENELEE LSVRLHEPVDLNRATRQQLEQFPFLSDIQIEHLLAYIYVHGQMKTLYELQLIEDMDRQTI QYLLPFVCIKAINNESSFRWKTMLKSAMKYGKNEVITRIDIPFYKRKGYEHTYLGPSVYN SVKYAFRYRDQLYAGLVAEKDAGEPFLALHNRQGYDYYSFYLLLKDCGRLKTLAVGNYRL SFGQGLVISTDYLLGKTVYASSFNTRSGGIRKHSSTDETNYFRGVAATVSITKQWSMSGF 
YSYRSLDGVLTDGEITSIYKTGLHRSQKEADKKNLFTMQLTGGNVSYQQNRIRLGITGIY YVFNRPYEPQLTGYSQYNIHGNQFYNLGIDYAYRWHRFSFQGETAMGKQGSATLNRFQYS PVEGTQLMIVQRYYSYNYWAMFAHSFGEGSAVQNEQGYYLGVETSPFRHWHILASFDLFS FPWKKYRISKPSRGMDGLLQATFTPHSNLTMYLKYRYKQKERDLTGSKENLTLPIFHHQL RYRLNYFYGDVFSCRTTLDYNHFHSQDRAASKGYQVTQMMSSQLPWTRLFADVQGSYFST DDYDSRVYVSEKGLLYTFYTPSFQGRGFRCAVRLRYEPNEHWMFITKFGETVYLDRNEIG SGNDLIFGNKKADVQMQLRIKF >gi|226332251|gb|ACIC01000069.1| GENE 31 38258 - 38830 558 190 aa, chain + ## HITS:1 COG:no KEGG:BT_3166 NR:ns ## KEGG: BT_3166 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 190 1 190 190 351 99.0 8e-96 MAFAMVCSLKMQAQDKQSINGYLVPMCIYNGDTIPCVQLRTVYIFRPLKFKNEKERLEYY RLVRNVKKVYPISKEINQAIIETYEYLQTLPNEKARQKHLKRVEKGLKEQYTARMKKLSF TQGKLLIKLIDRQSNSTSYELVKAFMGPFKAGFYQTFAALFGASLKKEYDPLGEDKLTER VVLLVENGQI >gi|226332251|gb|ACIC01000069.1| GENE 32 39215 - 39742 166 175 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 [Bacillus selenitireducens MLS10] # 18 173 93 245 255 68 29 7e-11 MRKLKITELNRISAEEFKQAEKLPLVVVLDDIRSLHNIGSVFRTSDAFRIECIYLCGITA TPPHPEMHKTALGAEFTVDWKYVNNAVDAVDNLRKEGYIVYSIEQAEGSIMLDQLELDKT KKYAIVMGNEVKGVQQEVIDHSDGCIEIPQYGTKHSLNVSVTTGIVIWDLFKKLC >gi|226332251|gb|ACIC01000069.1| GENE 33 40234 - 41172 798 312 aa, chain + ## HITS:1 COG:all4673 KEGG:ns NR:ns ## COG: all4673 COG0379 # Protein_GI_number: 17232165 # Func_class: H Coenzyme transport and metabolism # Function: Quinolinate synthase # Organism: Nostoc sp. 
PCC 7120 # 3 307 20 323 324 372 58.0 1e-103 MNNLIKAINELKKEKNAIILGHYYQKGEIQDIADYVGDSLALAQLAAKTEADIIVMCGVH FMGETAKVLCPDKKVLVPDMEAGCSLADSCPADQFAQFVKEHPGHTVISYVNTTAAVKAV TDVVVTSTNARQIVESFPKDEKIIFGPDRNLGNYINSVTNRNMLLWDGACHVHEQFSVEK IVELKAQHPEALVLAHPECKSTVLKLADVVGSTAALLKYAVNHPENTYIVATESGILHEM QKKCPQTTFIPAPPNDSTCGCNECSFMRLNTLEKLYECLKNESPEITVDPEIAEKAVKPI QRMLEISAKLGL >gi|226332251|gb|ACIC01000069.1| GENE 34 41328 - 43352 1531 674 aa, chain - ## HITS:1 COG:no KEGG:BT_3163 NR:ns ## KEGG: BT_3163 # Name: not_defined # Def: alpha-glucosidase, putative # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 674 1 674 674 1335 97.0 0 MDKKKISYLILLCLFSLSCTLQAEKYETVSPNGKLKIKLKIDKGTQYEVWYDNTQLILPS PIGLHLADGRVIGNGSVKSVKKRKVNQTIDVPVGKNKELQDTYNELTVSFNDSCDLVVRA YDEGVAYHFVTRLNGEITIISEDAIFNLASTPTIYYPECDANYSGETDQSGRTHQVHQGY RNFERLYKVYKGPLEIPYERFAVSPVLYEYPDSPYKVVVTESNTYDYPALYMESNGYNSM RGKWANYPKETIDSDPSNPYYWYSNHLVVTRENYIAKTDGNRSFPWRVIIVSKDDKSLLN NELVYKLADPCRLTDTSWIQPGKSAWEWWHKAVLEGVDFPCGNKQLSLQLYKYYVDWASK NHIEYMTLDAGWSEDYIKELCSYAKEKNVKIIVWTWASCARENPSDWIAKMHSYGVSGAK IDFFERNDQIAMRWGKEFAERLAEKQMVAIFHGCPVPTGLHRTYPNILNYEAVRGAECNF WEKTLTPEYHTRFPFIRLLAGPADYTPGSMRSVTQDKFRPMDIDNTPPMSMGTRSHELSM FVIYDQWMAYLCDSPTEYNKYPDVLDFLSKVPAVWDKTLPLEAKLSEYIVTAKQKGNDWY VGGMTNWDARSTEVNLSFLKDNVSYQATIFKDAPDSYEQPKEYMVEKRTVDNKTILNIDM AKGGGFVIRLEAEN >gi|226332251|gb|ACIC01000069.1| GENE 35 43391 - 44176 526 261 aa, chain - ## HITS:1 COG:AGl598 KEGG:ns NR:ns ## COG: AGl598 COG0584 # Protein_GI_number: 15890416 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 255 36 294 306 123 32.0 3e-28 MLVVAHRGNWRSAPENSTAAIDSAIAMKVDIVEIDIQKTKDGQLILMHDNTLDRTTTGKG EIKNWTLADIKKLKLKDKDGKVTNYVVPTLEEALLTAKGKIMVNLDKAYDIFDDVYAILE KTETQNQVIMKGGQPIETVKREFGSYLDKVLYMPVIDLGNKEAEKIITDYLKELRPAAFE IIYSDPKNPLPPKIKQLLFKKSLIWYNTLWGSLAGNHDDNLALTDPEKSYGYLIEQLGAR ILQTDQPAYLLDYLRKKGWHN >gi|226332251|gb|ACIC01000069.1| GENE 36 44328 - 44999 
711 223 aa, chain - ## HITS:1 COG:no KEGG:BT_3161 NR:ns ## KEGG: BT_3161 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 223 1 223 223 451 100.0 1e-126 MKHTFITFILLSLCIFTYAQKKIAREYIEWSDIWIPGANKTDLPHVLLIGNSITRGYYGK VEAALKEKAYVGRLSNSKSVGDPALIEELAVVLKNTKFDVIHFNNGLHGFDYTEEEYDKS FPKLIKIIRKYAPKAKLIWANTTPVRTGEGMKEFAPITERLNVRNQIALKHINRASIEVN DLWKVVIDHPEYYAGGDGTHPIDAGYSALANQVIKVIKNVLVH >gi|226332251|gb|ACIC01000069.1| GENE 37 45199 - 46926 725 575 aa, chain + ## HITS:1 COG:no KEGG:BT_3160 NR:ns ## KEGG: BT_3160 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 575 1 568 568 1196 100.0 0 MKKYILFMMSVWFIQLAIFAQSRIIEVFPEQGKDIENIALALKKAADCKGRPVTVKFSPG IYQLDRAKSSQVLYYISNTTSELDDPDPTKHIGLYLNTLKNITIDGCGSTLLMNGEMTSF VLDKCEGIVLKNFNIDYKHPTQTEVEVLEEGNDYLIVQVHPTSQYRIVDAQLEWYGDGWS FKNGIAQSYDRISEMTWRSWSPMENLLRTVELRPNVLYLQYKEKPQVGLHTIFQMRDSFR DEVSGFVNRSKGILLENINFYYLGNFGVVCQYSENITVDRCNFAPRPGSGRTNAGFADFI QVSGCRGMIDIKNSRFIGAHDDPINIHGTHLRVIEFLSDNRLKLRFMHDQTFGFEAFFKG DDIELVDSRSLLVVGKCKVKEAKLVTPREMELTLSSPLSSEVMQQKDLVIENVTWTPEVR ITNNYFARVPTRGILITTRRKSLIEGNTFYGMQMSGIFVADDGLSWYESGPVHDLTIRQN TFLNCGEPIISIDPENREYKGAVHKNITIEENYFYMRKNSSCAIRAKAVDGLMIRHNLIY SLDTEKNKESDFIQMYNCNEVTIKENRVQLHHLFK Prediction of potential genes in microbial genomes Time: Thu May 12 01:00:13 2011 Seq name: gi|226332250|gb|ACIC01000070.1| Bacteroides sp. 1_1_6 cont1.70, whole genome shotgun sequence Length of sequence - 67684 bp Number of predicted genes - 40, with homology - 40 Number of transcription units - 17, operones - 8 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 23 - 1315 715 ## BT_3159 hypothetical protein 2 1 Op 2 . - CDS 1336 - 2496 601 ## BT_3158 hypothetical protein 3 1 Op 3 . - CDS 2547 - 4562 1387 ## BT_3157 hypothetical protein 4 1 Op 4 . - CDS 4574 - 7744 2628 ## BT_3156 hypothetical protein - Prom 7982 - 8041 7.2 - Term 7907 - 7945 0.2 5 2 Op 1 . 
- CDS 8043 - 8840 204 ## gi|253569090|ref|ZP_04846500.1| predicted protein 6 2 Op 2 . - CDS 8908 - 11334 1241 ## BT_3155 hypothetical protein - Prom 11450 - 11509 6.9 + Prom 11414 - 11473 8.0 7 3 Tu 1 . + CDS 11502 - 15545 3088 ## COG5002 Signal transduction histidine kinase + Term 15650 - 15703 9.5 + Prom 15547 - 15606 6.3 8 4 Op 1 . + CDS 15723 - 17744 1674 ## BT_3133 alpha-galactosidase 9 4 Op 2 . + CDS 17753 - 19699 1306 ## BT_3132 sialic acid-specific 9-O-acetylesterase 10 4 Op 3 . + CDS 19724 - 21880 2033 ## COG3345 Alpha-galactosidase 11 4 Op 4 . + CDS 21927 - 24134 1743 ## COG3537 Putative alpha-1,2-mannosidase + Term 24189 - 24225 -0.4 - Term 24213 - 24256 7.8 12 5 Tu 1 . - CDS 24268 - 25110 817 ## COG1360 Flagellar motor protein - Prom 25204 - 25263 5.1 13 6 Op 1 . - CDS 25272 - 25853 303 ## PROTEIN SUPPORTED gi|71274727|ref|ZP_00651015.1| Ham1-like protein 14 6 Op 2 . - CDS 25902 - 26819 881 ## COG1284 Uncharacterized conserved protein - Prom 26893 - 26952 2.3 15 7 Tu 1 . - CDS 26971 - 29805 3089 ## COG0495 Leucyl-tRNA synthetase - Prom 30021 - 30080 8.0 + Prom 30698 - 30757 5.8 16 8 Tu 1 . + CDS 30952 - 32391 910 ## BT_3124 putative sialic acid-specific acetylesterase + Term 32480 - 32516 1.2 + Prom 32453 - 32512 2.6 17 9 Op 1 3/0.000 + CDS 32575 - 33450 803 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 18 9 Op 2 . + CDS 33475 - 34344 607 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 34392 - 34423 1.8 + Prom 34494 - 34553 4.9 19 10 Tu 1 . + CDS 34688 - 37306 2701 ## COG0249 Mismatch repair ATPase (MutS family) + Term 37400 - 37437 -0.2 - Term 37349 - 37385 7.4 20 11 Op 1 . - CDS 37608 - 39020 829 ## BT_3120 hypothetical protein 21 11 Op 2 . - CDS 39087 - 39731 571 ## COG4845 Chloramphenicol O-acetyltransferase 22 11 Op 3 . - CDS 39728 - 40573 868 ## COG0682 Prolipoprotein diacylglyceryltransferase 23 11 Op 4 . - CDS 40600 - 41511 742 ## COG1893 Ketopantoate reductase 24 11 Op 5 . 
- CDS 41562 - 42665 1326 ## COG0012 Predicted GTPase, probable translation factor - Term 43290 - 43321 -0.6 25 12 Op 1 4/0.000 - CDS 43372 - 44514 1134 ## COG2814 Arabinose efflux permease - Prom 44559 - 44618 3.8 26 12 Op 2 1/0.000 - CDS 44640 - 44831 215 ## COG1073 Hydrolases of the alpha/beta superfamily 27 12 Op 3 . - CDS 44731 - 46047 859 ## COG1073 Hydrolases of the alpha/beta superfamily 28 12 Op 4 . - CDS 46071 - 48452 1837 ## BT_3111 hypothetical protein 29 12 Op 5 . - CDS 48449 - 49474 581 ## BT_3110 hypothetical protein - Prom 49522 - 49581 6.1 30 13 Tu 1 . + CDS 49860 - 51386 1337 ## COG3119 Arylsulfatase A and related enzymes + Prom 51491 - 51550 5.2 31 14 Op 1 . + CDS 51578 - 52567 656 ## COG3507 Beta-xylosidase 32 14 Op 2 . + CDS 52594 - 54213 785 ## COG3119 Arylsulfatase A and related enzymes 33 14 Op 3 . + CDS 54223 - 55875 1052 ## COG3119 Arylsulfatase A and related enzymes 34 14 Op 4 . + CDS 55911 - 58181 1618 ## BT_3104 hypothetical protein 35 14 Op 5 . + CDS 58195 - 61380 2193 ## BT_3103 hypothetical protein 36 14 Op 6 . + CDS 61399 - 63099 1263 ## BT_3102 hypothetical protein 37 14 Op 7 . + CDS 63146 - 64513 789 ## COG3119 Arylsulfatase A and related enzymes + Prom 64611 - 64670 7.2 38 15 Tu 1 . + CDS 64690 - 65565 362 ## COG0657 Esterase/lipase - Term 66249 - 66285 -0.7 39 16 Tu 1 . - CDS 66343 - 66912 400 ## BT_3099 hypothetical protein - Prom 66938 - 66997 8.0 40 17 Tu 1 . 
- CDS 67019 - 67576 400 ## BT_3098 hypothetical protein - Prom 67624 - 67683 3.6 Predicted protein(s) >gi|226332250|gb|ACIC01000070.1| GENE 1 23 - 1315 715 430 aa, chain - ## HITS:1 COG:no KEGG:BT_3159 NR:ns ## KEGG: BT_3159 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 430 1 430 430 846 99.0 0 MKKINILILCSILCFFYSCNGMLDGIQPYLDEGEKIYVGKLDSLKAFTGKNRIKIEGKMM YGVNQVKCVISYKDPITLEEKFKEFPIERTEPRETFEFMLENLTEGQYDFSIVTYDPKNN TSIPTEVSAYAYGELYQQALTNRILHSISPEQKEIVNESGQSERIWAAKLNWNISRGDGL VGCNLEYERQDGTLGSKYISVDETTTELDNFKADGILRYNSIYKPTTTSLDEFVTEYKEV TLPSQSYIGITKDLTSLYIKNAGYPFRGYDVNNKWGKPYDWKWNEVMETKNANGGAGFTE YNDGTIQFETTQWSQGYYDNGKIWQTFTLPEGKYEMRIEVKNAANIGSGSTNIHFAVAQG ESLPDNEELEKSDIILSYLKFESEHTNKTSALPIFELTEEKTITVGWVVSFSDICRNIEF KSVRLWSVAE >gi|226332250|gb|ACIC01000070.1| GENE 2 1336 - 2496 601 386 aa, chain - ## HITS:1 COG:no KEGG:BT_3158 NR:ns ## KEGG: BT_3158 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 386 17 402 402 780 99.0 0 MLHSCEDVETHKPYGDGSGHAPGTVQVTSYEQIPGGVTLKFIAPTDEDLLYIKIKYTLDN GKEMEARASLYTDETTIEGFGNTNPKKLVISAVNKMEKEGEAIITEIVPGKPAYLTAIEE IEVNPTFGGIYIHTTNGGRNYLIFDVSTKDSKGNWNIEHTEYTSVKNIGFTLRGFSAEPH DFKVRVRDLYDNQSEEYLTTLTPLYEEKLDLTKFKTFYLANDIKMDNAGHTLESLFNGDH GLNSWNYAHGYDFNPSEFPVWFTFDMGQTAQLSRFTSWQRSMGGSYYYRAGAIKEWEVWG RSDLPSSDGSWDGWTKLADCESIKPSGWPAGSNSEEDITYASKGEEFEFPADIPPVRYIR FKILSTHDGAGLVVMQQLWFYGTPIQ >gi|226332250|gb|ACIC01000070.1| GENE 3 2547 - 4562 1387 671 aa, chain - ## HITS:1 COG:no KEGG:BT_3157 NR:ns ## KEGG: BT_3157 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 671 1 671 671 1329 94.0 0 MKKFTKYILFISLGLLCSCSDYLDIVPDNVTTLDIVFNNKNTAQRYLANCYHFVPDHGSV KRIPGMGAGHECWFYTDKDSYFDNEVSFYIARGLQNTNAPYANYWDGEGGSKLPMFRAIR ECNIFLSYVRDQHKVNDLTDSERKRWIAEVNVLKAYYHFFLFEHYGPIPIVDEALPVSTS SEGVKVTRMPVDQVVDFMVKLIDDSYKDLPPVITMEATEMGRLTQPAALALKAKILLWAA 
SPLFNGNKEYANFRNKEGKPLINQEFSREKWVKTAEACKAAIESAEANGCKLYDCSMDMS YSQDLPEEIFYSMNIRQTVTERFNPELVWSVGKQGTNELQNLSMACILPSLKQGSGNGEV TGGILQSRAAGVYAPTLETAEQFYSKNGVPIDEDKEWLTNGNYDNRYTTRQVTSADKYNM RTGWTTAVLHFNREPRFYGSLGFDGSTWYGNGRTDVNDLNYVDGKYGRNSGNKWGFNYSI TGYFAKKVVYYQNAITQSSQVVYEYPFPIIRLGDLYLMYAESLNEATEGDNNVPEEVYTY LRKVRERSGLKKGVLGQTEDIGDVRVAWEKYSNDPTKPKTKDGMRSIIKQERKIELALEG HQYNDLRRWKDASKELSKSIKGWNVQGEIEEDFYKVTTLFVPRAFTTKDYLWPIKKQDLQ VDDKLIQTYGW >gi|226332250|gb|ACIC01000070.1| GENE 4 4574 - 7744 2628 1056 aa, chain - ## HITS:1 COG:no KEGG:BT_3156 NR:ns ## KEGG: BT_3156 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1056 1 1056 1056 1878 90.0 0 MKNMKERTYSFFKMNRLILVLLCLFFATNIYAQNREIRGMVTGDDNEPLAGATITIKNRQ GGIIVDIDGKFTIKAKTGEILVVSFLGMKKQEVKIDNKDFYAIKLLPETNEFDEVTVVAF AKQKKESVISAVSTVNPQELKVPSSNLTTALAGRMSGLIAFQTSGEPGKDNAQFFVRGAT TFGYKKDPLILIDNIELSTDDLARLNVDDIAQFSIMKDATATALYGARGANGVILVTTKQ GTAGRPKVSVRVEQSVSTPRKKVKLTDPVNFMKLNNEAVNSRRDPNNPAASANYTVYSQE KIENTIAGTNPYYYPAVDWYDELFNDYALSTRVNANLSGGGSAVRYYVAASYTKDGGVIK NDKLNNYNSNINWQRYSVRSNINMDLSKTTEFAIRVNGNFDDYTGPLDSGEGLYKKVMKT SPVLYPKSYPATDEYANNTHVLFGNANKGAYINPYADMVRGYKESNNLLVAAQAELKQKL DFITQGLDARVLVSTTRSSYSDLTRAINPYYYQANYDKTNNSYTLTNIVEGEEYLSYNQG AKNINTTNYIEAAVSYGRTFGAHATSGMLVYTRREEKRSSEKTLQQSLPRRNQGIAGRFT YAYDSRYFAEFNFGYNGSERFAAHERYGFFPSFGLGYIISNEDFWIPLRNTVDKLKFKLT YGLVGNDAIGSSEDRFYYLSEVNPNDGNRDQFFGTDYDNHMNGVSVSRYPNIYITWEKSK KVNLGFELGMFNSALEIQTDFFYERRTQIYQPRAHIPSSMGLSAGVSANIGEASSRGIDL SANYSYIANKDFWIKGMGNFTLAKGRYEVYEELDYASYGQGYRSRIGQAINHNYGYIAER LFIDVNDVANSPSQTSFNGGAMPGDIKYKDLNGDGKITDEDQTFLGYPNQPEITYGFGFS AGYKGFDLSMFFQGNARVSFFIDPRNTSPFVNYDAGDRIGVQQCLDVIANDHWSENNRDI YAFWPRLSPTLVDNNMVTSTWWMHDGSFLRLKTVELGYTFSQKALKKIGVQSLRLYLTGS NLLTFSKFKLWDAEMGNNGLGYPIQRVYNLGININF >gi|226332250|gb|ACIC01000070.1| GENE 5 8043 - 8840 204 265 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253569090|ref|ZP_04846500.1| ## NR: gi|253569090|ref|ZP_04846500.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 265 1 265 265 501 100.0 1e-140 MKNFLLILITCITILFTSCNTNYFDDYEKEETTTIMTRVVQDPLHPIINGENVYGNWIQL FYIHPTQYIGKWNIYLQYQSGAEWKYYNPNGGNTPYIVLSGEALNSYIIPFSGNYLPKGN PVIRICVDEYAPNGSRSTWSEPYQLKGYNQCGLIPAPNFDGFGSSVLELYGTLRPLISGV HSAKIIFAINGIFENSPITKQYKIEWIPGTYYFSTKLGIEPNQNYTLSYTVTYYDYTGKA LFPIDKGSYSINVGYSNYSSLKIFD >gi|226332250|gb|ACIC01000070.1| GENE 6 8908 - 11334 1241 808 aa, chain - ## HITS:1 COG:no KEGG:BT_3155 NR:ns ## KEGG: BT_3155 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 807 1 807 808 1597 95.0 0 MRNVTLGLIIGAALLVSGSLSLKATENKMKLWYDKPADEWMKSLPLGNGRLGVIIYGGIE TETLALNESTMWSGEYDEHQQRPFGREKLNQVRKLFFENNLSEGNHVAGNMMAGSPHSVG THLPIGDLKINFSYPQGEISDYRHELDLHTAINTVSYKVGNTEYIRQCIASNPDDVVAMH IKASRPKAITMELELKLLRQANVVASGNQLIYTGNAEFEKHGKGGVHFEGRIAVQIKGGT IKAEGKKLYIEKATEVTLLSDVRTNFKNNTFSGYNYKIKCEKTIELASKKDFKTLKKKHI EDYSPLFSRVGLSFEHHAKFDHLPNDERWARVKKGESDPGLDALFFQYARYLLIASSRPN SPLPVALQGFFNDNLACHMGWTNDYHLDINTEQNYWIANVGNLAECHLPLFDYIKDLSIH GAKTAKDLYGCKGWTAHTTANPWGYTAVSGSILWGLFPTASSWLASHLWTQYDYTQDKDF LKNTAYPLLKSNAEFLLDYMVIDPRNNYLVTGPSISPENSFRHQGQEFCASMMPTCDRVL AYEIFSACLQSTEILNVDASFADSLRTAISQLPPFRISTNGGVQEWFEDYEEAHPNHRHT THLLSLYPYSQITLDKTPELAQAAAKTIEKRLAAKDWEDTEWSRANMICFYARLKDSEKA YSSVKQLLGKLSRENMFTVSPAGIAGAGEDIFAFDGNTAGAAGMAEMLLQSHDNCIELLS CLPEEWKNGSFKGLCARGGIEIDASWKNARIENTVFKATAATEFILRLPNAEAYRWKLNK KTFIPKKNADGSIQIKMNVDDCITIILI >gi|226332250|gb|ACIC01000070.1| GENE 7 11502 - 15545 3088 1347 aa, chain + ## HITS:1 COG:SP1226 KEGG:ns NR:ns ## COG: SP1226 COG5002 # Protein_GI_number: 15901088 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Streptococcus pneumoniae TIGR4 # 830 1050 212 439 449 110 27.0 2e-23 MKKHLLRLAFINGLLFLSICIFAQEYSDFYFTRVNGENGLSESNVKAILQDSYGFMWFGT KNGLNRYDGTSILQFDCDDLEEGTGNHNIGALFEDKERNLWVGTDRGVYIYNPAVDVFKR FKIASSEGITLDNWVAEILSDSLDNIWVLIPDQGLFRYKDGKVHHYSLIDKDNLKNNNPE CICINEQGEVWVGTSGIGLFKYNYRDDNFEQYLTDRMGRSLIDKTIISICFQKENAILGI 
HEGDLLKYNTRTDELSEVSFLGEKKTFLRDVMCFGDEIWVGSLHGIFIINEKENNVIHLK EDLMRSFSLSDNSIYSIYKDYEGGIWIGTMFGGVNYLPNRILTFAKYVPGSDPYSLNTKR IRGLAEDNNGCIWIGTEDNGVNVLDPRTGKVHQIYDNVPGRLITLSVKHYENHIYVGLFK QGMNDISIPGEHLKYVSDKDLGIEEGSVYSFLKDSKGRTWIGPGWGLYVSQPKERKFHRV EEVGYNWVFDIMEARDGTIWLATMGNGVWKCDPKNNSYKNYSYKEGVDNSLSSNSVSSIM QDSKGNIWFSTDRGGICRYNEAQDNFTTFSIEDGLPDDVAYNILEDDAGNLWFGTNKGLV KFNPESGDVRVFTNKDGLLGNQFNYQSALKAQDGRFYFGGVDGLIAFDPTVQEEEKPLPP VYISKFSIYNKEVTVHTPESPLKQCIVHTDEIVLPYDQANISFDVALLSYSTAESNQYYY RMEPLDRDWVRAASNQNISYAKLPPGKYTFRVQATHGSKSESSTRSLSIIILPPWWQSII AYIVYTVCIILIVVGTILWYKRRKEKQLEERQKLFEIEKEKELYESKVDFFTEIAHEVRT PLTLINGPLEAIEEMEIQEPKMKKYISVMVQNTKRLLELTGQLLDFQKIGANKLTMKFES VDITELLAGIVARFEVTFSLNRKELQLKSTEEQVWAAVDKEAITKILSNLLNNALKYASQ NVLVELEKGEDSFTIRVTSDGNKIPAEVSQYIFEPFYQVDRKEKPRNGVGIGLSLARSLA SLHKGTIYLDTRQENNMFVLTIPLNMEGIKQENNKAIQKDIVELDEHTPVTADMYGYTLL LVEDNESMLTFILERLQENFTVETAMNGIEALEILRSGHIDLIISDIMMPVMDGYQLCKE VKSDMELSHIPIIFLTAKNDIDSKINGLKYGAEAYVEKPFSFNYLKEQVLSLLDNRRRER EAFAKRPFFSVNNMQMNKADEEFMNKVIKVIEDNITDDNFGVERMADILCMSRSSLLRKI KTLFNLSPLDFIRLIKLKKAAEFIQEGKYRIGDICYMVGINSPSYFSKLFLKQFGMTPKD FEKQYQSGKQIIITQEIKNVEAGEGTK >gi|226332250|gb|ACIC01000070.1| GENE 8 15723 - 17744 1674 673 aa, chain + ## HITS:1 COG:no KEGG:BT_3133 NR:ns ## KEGG: BT_3133 # Name: not_defined # Def: alpha-galactosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 673 1 657 657 1352 99.0 0 MKNQKLLLKKMSGVSLMLLLFLFISNCASGDKIFSTDVWNITYDDQQGGIALTYKSKNIC NGMYASYQWEGKEIKTSDYARHKISTGKVSDSFGKGHIYKVEYTDKELPKLTQYFYLYPG KEYILTDFTVEAKEEIRSGRMAPVNVDSISGFLQAGDNRALFVPFDNDKWIRYQSHPLGF DTLVSYEVTAIFNNESRNGLVIGSVEHDNWKTGISIGKGDRDNIGSLVCYGGIADEQTRD VKAHGALAGKKIKSPKVFLGFFEDWCDGLEAYAKANAVIAPPKAWEKAVPFGWNSWGALQ FNLTYEKAMEVSDYFKENLQNNHFVNPDQTVYIGLDSGWDCMNEEQLKSFIEKCKSNGQI GGIYWTPFTDWARDPERTVDAAPEYKYKDIYLYANGKPQELDGAYAVDPTHPAVEAMMKK TSGLFHRAGFEYVKMDFMTHGAMEADKWYNPEITTGIQGYNYGMKLLNQYFGDMYINLSI SPVFPAQYAQSRRIACDAWNQIKDTEYTLNALSYGWWQKYIYQYNDADHIVLRDATDGEN RARITSGVITGIYITGDDFSAGGGKDAKEKSQKYLTNAAVNAVATGEVFRPVEGNGEKPE 
HIFVRKTPDGRFHCAVFNYSDQEQTTVLSLERIGLDKARGYQVKELWSGNETFATGQLEV TVPSKDVQLLEFK >gi|226332250|gb|ACIC01000070.1| GENE 9 17753 - 19699 1306 648 aa, chain + ## HITS:1 COG:no KEGG:BT_3132 NR:ns ## KEGG: BT_3132 # Name: not_defined # Def: sialic acid-specific 9-O-acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 648 1 648 648 1330 99.0 0 MKKYQLTLLSLCVAAFAQAKVTLPSFFTDNMVIQQNSQLTLPGKAKAGKQVTVKVSWNHE KYTAKASADGHFSIEVPTPPAGGPYQIIVSDGEKLVLNNVLSGEVWICSGQSNMEMPVAG WGKVMNYEQEVADADYPSIRLLQVKKTVAFSPQDNVEMNMNGWQECSSATVPEFSSVAYF YARKLWKELNVPVGVIDCTWGGTPAESWTSHQTLKQVVGFQKKMAEMESAGFQREKLVEL YHREMDEWLQQFDAKDAGFKDGVPQWISALQTGNEWKPMELPAYWETRGLNFDGTVWFQK EVEIPADWNGKEISFHLAMIDDDDVTYFNGKEIGRTSGCNTMRTYKIPAALAKAGKGVIT IRAIDYGGEGGIHGEPQQMYMEANDKKISLAGNWNYHTGVSMTGAPSRPLSPEGTGWPTS CYPTVLYNAMIHPFTVFPIKGAIWYQGENNVGFDEQYRVLFQSMITDWRRAWKQDFPFYF VQLANYLKPEEVQPDSKWAALRDAQAHALHLPNTGMACAIDLGEAYDIHPKNKQEVGKRL AQAALTKTYNKGTYEVPAFLGYRISGRTLILTFDQEVVAKEGAPKGFILAGPDGKYVPAQ ATIRGKEVILQSDRIEIPTAACYAWADNPVCNLYGKSGLPVPPFRTRE >gi|226332250|gb|ACIC01000070.1| GENE 10 19724 - 21880 2033 718 aa, chain + ## HITS:1 COG:BH2223 KEGG:ns NR:ns ## COG: BH2223 COG3345 # Protein_GI_number: 15614786 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidase # Organism: Bacillus halodurans # 28 702 14 714 748 345 31.0 2e-94 MKLKAFLLFFLLSSFSLLQAADKPLIKIETERTSLIYQVADNGRLYQKYLGKKLNHDSDI QYLPQGTEAYLTHGMEDYFEPAIHIRHNDNNSSLLLKYVDHSSNAIDNGVNETVITLKDD KYPVTVKLHYVAYGKENIIRTFTEISHEEKKPVILCKYASSMLHLNSSKYFLTEFSGDWA HEANVTERELAFGKKVIDTKLGARANMFVSPFFQLALDNPSQENAGEVLVGTIGWTGNFR FTFEVDNKNELRIISGINPYDSEYSLPAKEVFRTPDFYFTYSTQGKGEASRNFHDWARNH QVKNGNDTRMTLLNNWESTYFDFDENKLIGLMDEATKLGVDMFLLDDGWFANKYPRSSDH QGLGDWEETADKLPNGVGRLVEEAQKKGIKFGIWIEPEMVNPKSELYEKHKDWVIHLPNR DEYYFRNQMVLDLSNPKVQDHVFGVVDNLMTKYPGIAFFKWDCNSPITNIYSVYLKDKQS HLYVDYVRGLYKVLDRIKAKYPDLPMMLCAGGGGRSDYEALKYFTEFWPSDNTDPIERLF IQWGYSQIFPSKTLCAHVTTWNRGTSIKFRTDVAMMCKLGFDIKLEDMNQNEHLYCDQAV 
KNYNRLKPVILEGDLYRLVSPYGSNHTSSMFVGKDKKTAAVFAFDIHPRYAEKTLPVRLQ GLNADKKYRVKEINMMPGSGSSLKGNNQVFSGEYLMNVGLDLFTTQQLNSRLIEITEE >gi|226332250|gb|ACIC01000070.1| GENE 11 21927 - 24134 1743 735 aa, chain + ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 11 731 34 766 790 500 38.0 1e-141 MKKLLLSVCAFSLTLATLQAGEITKYVNPFIGTGAIDGGLSGNNYPGATSPFGMIQLSPD TSEAPNWGDASGYDYNRNTIFGFSHTRLSGTGASDLIDITLMPTSSGRTSSAFTHDAEKA RPGYYQVMLKDENINAELTTTQRNGIHRYQYPAGKDAEIILDMDHSADKGSWGRRIINSQ IRILNDHAVEGYRIITGWAKLRKIYFYMEFSSPILTSTLRDGGRVHENTAVINGTNLHGC FRFGQLNGKPLTCKVALSSVSMENARQNMEQEAPHWDFDRYVAAADADWEKQLGKIEVKG TEVQKEIFYTALYHTMIQPNTMSDVNGEYMAADYTTRKVANNETHYTTFSLWDTFRASHP LYTLLEPERVTDFVKSMIRQYEYYGYLPIWQLWGQDNYCMIGNHSIPVITDAILKGIPGI DMEKAYEAVYNSSVTSHPNSPFEVWEKYGFMPENIQTQSVSITLEQAFDDWCVAQLAAKL NKDADYQRFHKRSEYYRNLFHPKTKFFQSKNDKGEWIEPFDPYQYGGNGGHPFTEGNAWQ YFWYVPHNIQALMELTGGTKAFEQKLDTFFTSTYKSEQMNHNASGFVGQYAHGNEPSHHV AYLYNFAGQPWKTQKYVSHILNTLYNNTSSGYAGNDDCGQMSAWYVFSAMGFYPVNPADG RYIIGSPLLDECTLKLAGNKEFRIRTIRKSPEDIYIQSVTLNGKKHKDFFITHQDIMNGG TMVFKMGKKPSGWGK >gi|226332250|gb|ACIC01000070.1| GENE 12 24268 - 25110 817 280 aa, chain - ## HITS:1 COG:PA1461 KEGG:ns NR:ns ## COG: PA1461 COG1360 # Protein_GI_number: 15596658 # Func_class: N Cell motility # Function: Flagellar motor protein # Organism: Pseudomonas aeruginosa # 91 267 69 245 296 104 34.0 2e-22 MKKIALFTFLTLLLCTSCVTKKKFMMTEAARIASIDSLQSLLTDCRNTGDQLSAQIKKLL RDTTQMGNSIRQYQSMLSTNMTEQEKLNALLSQKKNELSERERTINELQDMINAQNEKVK QLLNSVKDALLGFSTDELTVREKDGKVYVAMSDKLLFQSGSARLDKRGEEALGKLAEVLN KQTDIDVFIEGHTDNKPINTVQFKDNWDLSVIRATSVVRILIKNYGVNPLQIQPSGRGEY MPVDDNETAEGRSKNRRTEIIMAPKLDKLFQMLQTSEEVK >gi|226332250|gb|ACIC01000070.1| GENE 13 25272 - 25853 303 193 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|71274727|ref|ZP_00651015.1| Ham1-like protein [Xylella fastidiosa Dixon] # 1 191 1 195 200 121 42 1e-26 
MKRKLVFATNNAHKLEEVAAILGDKVELLSLNDIDCHTDIPETAETLEGNALLKSSFIYR NYQLDCFADDTGLEVEALNGAPGVYSARYAEGEGHDAQANMRKLLHELEGKENRKAQFRT AISLILDGKEYLFEGVIKGEIIKEKRGDSGFGYDPIFKPEGYEQTFAELGNETKNKISHR ALAVQKLCEFLQR >gi|226332250|gb|ACIC01000070.1| GENE 14 25902 - 26819 881 305 aa, chain - ## HITS:1 COG:TM0177 KEGG:ns NR:ns ## COG: TM0177 COG1284 # Protein_GI_number: 15642951 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 9 303 1 281 283 151 31.0 2e-36 MRKPAKAEIMREVKDYIYITLGLISYALGWAAFLLPYQITTGGTTGIGAIIYYATGFPIQ WSYFIINAVLMTFAIKILGPRFSIKTTYAIFMLTFLLWIFQVLVNNYIQTPDMSADGKPL LLGSGQDFMACIIGAAMCGVGLGITFNYNGSTGGTDIIAAIVNKYKDVSLGRMIMICDVF IISSCYFVFHDWRRVIFGFVTLFIIGVVLDWIINSARQSVQFFIFSKKYDEIADRIIKDT DRGVTVLNGTGWYSKNDVKVLVVLAKRRQSLDIFRLVKRIDPNAFISQSSVIGVYGEGFD QLKVK >gi|226332250|gb|ACIC01000070.1| GENE 15 26971 - 29805 3089 944 aa, chain - ## HITS:1 COG:SP0254 KEGG:ns NR:ns ## COG: SP0254 COG0495 # Protein_GI_number: 15900189 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Leucyl-tRNA synthetase # Organism: Streptococcus pneumoniae TIGR4 # 3 944 4 832 833 708 42.0 0 MEYNFREIEKKWQQRWVEEKTYQVTEDESKQKFYVLNMFPYPSGAGLHVGHPLGYIASDI YARYKRLRGFNVLNPMGYDAYGLPAEQYAIQTGQHPAITTKANIDRYREQLDKIGFSFDW SREIRTCEPEYYHWTQWAFQKMFNSYYCNDEQQARPIQELIDAFAIYGNQGLNAACSEEI SFTAEEWKAKSEKEQQEILMNYRIAYLGETMVNWCQALGTVLANDEVIDGVSERGGFPVV QKKMRQWCLRVSAYAQRLLDGLDTIDWTESLKETQKNWIGRSEGAEVQFKVKDSDLEFTI FTTRADTMFGVTFMVLAPESDLVAQLTTPAQKAEVDAYLDRTKKRTERERIADRSVTGVF SGSYAINPFTGEAVPIWISDYVLAGYGTGAIMAVPAHDSRDYAFAKHFGLEIRPLVEGCD VSEESFDAKEGIVCNSPRPDVTPYCDLSLNGLTIKEAIEKTKQYVKEHNLGRVKVNYRLR DAIFSRQRYWGEPFPVYYKDGMPYMIDEDCLPLELPEVDKFLPTETGEPPLGHAKEWAWD TVNKCTVENEKIDNVTIFPLELNTMPGFAGSSAYYLRYMDPHNNKALVDPKVDEYWKNVD LYVGGTEHATGHLIYSRFWNKFLHDVGASVVEEPFQKLVNQGMIQGRSNFVYRIKDTHTF VSLNLKDQYEVTPLHVDVNIVSNDILDLEAFKAWRPEYAEAEFILEDGKYICGWAVEKMS KSMFNVVNPDMIVEKYGADTLRMYEMFLGPVEQSKPWDTNGIDGVHRFIRKFWSLFYSRT 
DEYLVKDEPATKEELKSLHKLIKKVTGDIEQFSYNTSVSAFMICVNELSNLKCNKKEILE QLVITLAPFAPHVCEELWDTLGHETSVCDAAWPAYNEEYLKEDTINYTISFNGKARFNME FDADAASDAIQAAVLADERSQKWIEGKTPKKIIVVPKKIVNVVV >gi|226332250|gb|ACIC01000070.1| GENE 16 30952 - 32391 910 479 aa, chain + ## HITS:1 COG:no KEGG:BT_3124 NR:ns ## KEGG: BT_3124 # Name: not_defined # Def: putative sialic acid-specific acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 479 1 479 479 971 98.0 0 MKYLISFLFLALAFSMKSEAKVTLPSILGDNMVLQQQAEVKLWGKARPNATVTVKTSWNG QTYKFSSDGKGGWSLTVSTPVAGGPYKIIFNDGELLTLQNVLIGEVWFCSGQSNMEMPMG CFDRQPVRGANDIIAKAKPSTPIRMYTTDSKDGRWVRQFSKTPAEDCQGEWLENTPVNVS HISAVSYYFARYIQEVLEVPVGIVVSTWGGSRIEAWMSRESIKPFSSIDLSILDNDAEVK NPTATPCVLYNGKIAPLTNFAVRGFLWYQGESNRDNADLYQSLMPAFVADLRAKWGRGEL PFYFVQIAPFDYEGADGTSAARLREVQLQNMKDIPNSGMVTTMDVGHPVFIHPVDKETVG NRLAYWALAQTYGMKGFGYAPPVYKSMEIQENKIYINFYNAQRGLSPMWTSLKGFEIAGE DKVFYPAFAEIETTTARLAVSSDKVSHPVAVRYAYKNYVEASIFSIHGIPAAPFRTDNW >gi|226332250|gb|ACIC01000070.1| GENE 17 32575 - 33450 803 291 aa, chain + ## HITS:1 COG:BS_ybfH KEGG:ns NR:ns ## COG: BS_ybfH COG0697 # Protein_GI_number: 16077290 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus subtilis # 9 291 10 292 306 183 40.0 4e-46 MELKHGYYHLIAILTVAIWGLTFISTKVLINHGLTPQEIFFYRFLIAYLGIWVISPKRLF TNNWQDELWLVAGGIFGGSLYFFTENTALGITQASNVSFIICTAPLLTTILSLLFYKSEK ASKGLIYGSLLALMGVGLIVFNGSFVLKISPIGDLLTLLAALSWAFYSLIIKKMAGRYPT IFITRKIFFYGVLTILPAFLLHPLHPDTAVLLQPAVLFNLLFLAVLASLICYVLWNVVLK QLGTVRASNYIYLNPLVTMIASFLILDEKITLVALGGAACIVCGVYWAEKK >gi|226332250|gb|ACIC01000070.1| GENE 18 33475 - 34344 607 289 aa, chain + ## HITS:1 COG:AGl1135 KEGG:ns NR:ns ## COG: AGl1135 COG2207 # Protein_GI_number: 15890685 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 25 284 40 297 313 
145 29.0 6e-35 MRKMMNEQLPISKSSPLKARFYDYPYFTYPWHFHTEYEIIYVKEGTGMRFVGNNAEKYAD GDILLLGSDLPHYMKSDEVYRQEGSTLRTTGTIIQFEKDFMHHAFNYYPHFMKIKNLLEE AQRGIYFPAGCSPRLVDLLESIPVETGVDQITSFLQLLKEMANVASRSVLSTSDFENDII YDGSRIDKIIVYLNKNYTRQIDLNEISSLAAMNPAAFCRFFKSKTGKSFKNYILEMRIGY ACKLLLMGDMNISQISVECGFDTISHFNKSFKKHTGFTPSYYKRMMLAE >gi|226332250|gb|ACIC01000070.1| GENE 19 34688 - 37306 2701 872 aa, chain + ## HITS:1 COG:MA0523 KEGG:ns NR:ns ## COG: MA0523 COG0249 # Protein_GI_number: 20089412 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Methanosarcina acetivorans str.C2A # 7 869 4 899 900 617 40.0 1e-176 MNEEEIVLTPMMKQFLDLKAKHPDAVMLFRCGDFYETYSTDAIVASEILGITLTKRANGK GKTIEMAGFPHHALDTYLPKLIRAGKRVAICDQLEDPKLTKKLVKRGITELVTPGVSIND NVLNYKENNFLAAVHFGKASCGVAFLDISTGEFLTAEGPFDYVDKLLNNFGPKEILFERG KRLMFEGNFGSKFFTFELDDWVFTESTAREKLLKHFETKNLKGFGVEHLKNGIIASGAIL QYLTMTQHTQIGHITSLARIEEDKYVRLDKFTVRSLELIGSMNDGGSSLLNVIDRTISPM GARLLKRWMVFPLKDEKPINDRLNVVEYFFRQPDFKELIEEQLHLIGDLERIISKVAVGR VSPREVVQLKVALQAIEPIKQACLEADNASLNRIGEQLNLCISIRDRIAKEINNDPPLLI NKGGVIKDGVNEELDELRRISYSGKDYLLQIQQRESEQTGIPSLKVAYNNVFGYYIEVRN IHKDKVPQEWIRKQTLVNAERYITQELKVYEEKILGAEDKILVLETQLYTDLVQALTEFI PQIQINANQIARLDCLLSFANVARENNYIRPVIEDNDVLDIRQGRHPVIEKQLPIGEKYI ANDVMLDSTTQQIIIITGPNMAGKSALLRQTALITLLAQIGSFVPAESAHIGLVDKIFTR VGASDNISVGESTFMVEMNEAADILNNVSSRSLVLFDELGRGTSTYDGISIAWAIVEYIH EHPKAKARTLFATHYHELNEMEKSFKRIKNYNVSVKEVDNKVIFLRKLERGGSEHSFGIH VAKMAGMPKSIVKRANTILKQLESDNRQQGISGKPLTEVSENRSGMQLSFFQLDDPILCQ IRDEILNLDVNNLTPIEALNKLNDIKKIVRGK >gi|226332250|gb|ACIC01000070.1| GENE 20 37608 - 39020 829 470 aa, chain - ## HITS:1 COG:no KEGG:BT_3120 NR:ns ## KEGG: BT_3120 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 470 1 470 470 926 97.0 0 MSTKTLFKLLWVPFVTSVFASCSDKEELPYTPPEETNDSYISPAVPQGNSWPEAAMSLTQ PFLGAGYDIMGSYIDNNSVKEPVLDLSKIDNERISNLVGSSGVGDSFIGRDMKEFLRSIT KYKKFVVPAENKDDLLFTATITGHKYFQKPYDYSSQYTFAFGSSGANGVIQRLLTLNADW 
SNWLADDFRQELEDASPETIIELFGTHILVNTHLGYISKTLYRSIVADDEENLLRTANTG MGAHQSSIIKHPNVSITYPEETVKKNYGGTIVVSLQGADSKVFNQLTGNPMDISPWIQSA NEKNRALTTLTGEDLIPIYEVIADPIKKQQIKEAVIAHIKRHQLSLQQTAPIFQASDGYY HRYYTSYKELTAKADICQGVIGSVFIRHEPGTVPLYLSSNGKNHRLTLEPAPNGDGTIIG YVYEKESDDLNCIYEISDGKNFAYTTEEKDAYGDKGTWKPTGLSFYTKKV >gi|226332250|gb|ACIC01000070.1| GENE 21 39087 - 39731 571 214 aa, chain - ## HITS:1 COG:MA1703 KEGG:ns NR:ns ## COG: MA1703 COG4845 # Protein_GI_number: 20090555 # Func_class: V Defense mechanisms # Function: Chloramphenicol O-acetyltransferase # Organism: Methanosarcina acetivorans str.C2A # 3 213 6 206 209 107 32.0 1e-23 MKQIIDIENWERKENFNFFRHFQNPQLSITSEVECGGARQRAKAAGQSFFLHYLYAVLRA ANEIPEFRYRIDPDGRVVLYDTIDMLSPIKIKENGKFFTTRFPYHNDFDTFYQEARLIID AIPEDGDPYAAENEEVADGDYGLILLSATPDLYFTSITGTQEKRSGNNYPLLNAGKAIIR EGRLVMPIAMTIHHGFIDGHHLSLFYKKVEDFLK >gi|226332250|gb|ACIC01000070.1| GENE 22 39728 - 40573 868 281 aa, chain - ## HITS:1 COG:VC0674 KEGG:ns NR:ns ## COG: VC0674 COG0682 # Protein_GI_number: 15640693 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Prolipoprotein diacylglyceryltransferase # Organism: Vibrio cholerae # 11 272 10 257 271 124 34.0 2e-28 MNTLLLSINWNPNPELFNLFGISIRYYGLLWAVGIFFAYIIVHYQFRDKKIDEKKFDPLF FYCFFGILIGARLGHCLFYDPGQYLTSGKGFVEMLLPIKFLAGGGWKFTGYEGLASHGGT LGLMIALWLYCRKTKLHYMDVVDMIAVATPITACFIRLANLMNSEIIGNPTDVPWAFVFE RVDMLPRHPGQLYEAIAYLILFFIMIYLYKNYSKKLHRGFFFGLCLAYIFTFRFFIEFVK QNQEAFEDNMMFNMGQWLSVPFIIIGFYFMFFYDRKKVAKK >gi|226332250|gb|ACIC01000070.1| GENE 23 40600 - 41511 742 303 aa, chain - ## HITS:1 COG:BH1763 KEGG:ns NR:ns ## COG: BH1763 COG1893 # Protein_GI_number: 15614326 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate reductase # Organism: Bacillus halodurans # 1 291 1 287 304 113 27.0 5e-25 MKYLIAGTGGVGGSIAGFLSLAGKDVTCIARGAHLQSIQTNGLKLKSDLKGEHTLQIKAT TAEEFKGKADVIFVCVKGYSVDSITELIKRASHEKTVVIPILNVYGTGPRIQRLVPGVTV LDGCIYIVGFVSGTGEITQMGKIFRLVYGAHKGTQVASGIMEAIQKDLQESGIKAEISPD 
INRDTFVKWSFISAMAVTGAYYDVPMGEVQKPGKVRDTFIGLSTESAALGRKLGVEFPED PVSYNLKVIDKLDPDSTASMQKDLARGHESEIQGLLFDMIAAAEEQGIDIPTYRMVAEKF TKH >gi|226332250|gb|ACIC01000070.1| GENE 24 41562 - 42665 1326 367 aa, chain - ## HITS:1 COG:PA4673 KEGG:ns NR:ns ## COG: PA4673 COG0012 # Protein_GI_number: 15599868 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted GTPase, probable translation factor # Organism: Pseudomonas aeruginosa # 1 367 1 366 366 431 59.0 1e-121 MALQCGIVGLPNVGKSTLFNCLSNAKAQAANFPFCTIEPNVGVITVPDERLNALAELVHP QRIVPTTVEIVDIAGLVKGASKGEGLGNKFLANIRETDAIIHVLRCFDDDNVTHVDGSVN PVRDKEIIDYELQLKDLETIESRIQKVQKQAQTGGDKAAKQAYDVLVQYKDALEQGKSAR TVTFETKDEQKIAHELFLLTSKPVMYVCNVDEASAVNGNKYVDMVRDAVKDENAQILIVA AKTEADIAELETYEDRQMFLAEVGLEESGVARLIKSAYKLLNLETYFTAGVQEVRAWTYE KGWKAPQCAGVIHTDFEKGFIRAEVIKYDDYIQYGSEAAVKEAGKLGVEGKEYVVQDGDI MHFRFNV >gi|226332250|gb|ACIC01000070.1| GENE 25 43372 - 44514 1134 380 aa, chain - ## HITS:1 COG:STM0394 KEGG:ns NR:ns ## COG: STM0394 COG2814 # Protein_GI_number: 16763774 # Func_class: G Carbohydrate transport and metabolism # Function: Arabinose efflux permease # Organism: Salmonella typhimurium LT2 # 1 366 1 366 390 315 50.0 1e-85 MKKSLIALAFGTLGLGIAEFVMMGILPDVARDLKISIPAAGHFISAYALGVCVGAPVLTL ARKYPLKHILLVLVTLIMVGNICAALSPNYWMLLISRFISGLPHGAYFGVGSIVAERLAD KGKSSEAVSIMIAGMTIANLFGVPLGTSLSTMLSWRATFLLVGLWGIVILYYIWRWVPQV EGLKDTGFKGQFRFLKTPAPWLILGATALGNGGVFCWYSYINPMLTHISGFSAESITPLM ILAGFGMVMGNLISGRLSDRYTPGKVGTAAQALICLMLLLIFFLSPYKWAAALLMCLCTA GLFAVSSPEQILIIRVAKGGEMLGAACVQVAFNLGNAIGAYAGGLAVSGGYRYPALTGVP FALIGFILFLIFYKKYQAKY >gi|226332250|gb|ACIC01000070.1| GENE 26 44640 - 44831 215 63 aa, chain - ## HITS:1 COG:MA2933 KEGG:ns NR:ns ## COG: MA2933 COG1073 # Protein_GI_number: 20091752 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Methanosarcina acetivorans str.C2A # 2 60 431 489 496 60 45.0 6e-10 MNLTAIKQHISENGNKNVTIKAYPKLNHLFQTCEKGTLAEYGQLEETINPEVLKDMTEWI KKQ 
>gi|226332250|gb|ACIC01000070.1| GENE 27 44731 - 46047 859 438 aa, chain - ## HITS:1 COG:MA2933 KEGG:ns NR:ns ## COG: MA2933 COG1073 # Protein_GI_number: 20091752 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Methanosarcina acetivorans str.C2A # 29 387 42 411 496 269 39.0 6e-72 MRTTTFKTKIAAVAMLLTGLTLSSSLLAQDISGTWHGKLSLPAGSLTIVFHIKHTEQGTY VATLDSPDQGTKDIKTETTSFQDSTLTVQIPIIHASYKGKLNADQTITGTFTQGMPLPLN LTKGEFSGPKRPQEPQPPFPYKVEEVSVKNTQDGITLAGTLTLPEKGSKFPAVVLVTGSG AQNRDEEIMGHKPFLVIADYLTRNGIAVLRCDDRGTAASQGDYASATNEDFAKATEAALN YLRSRKEINTRKIGIIGHSCGGTIAFDIAAKDPNISFIISLAGAAVRGDSLMLKQVELIS KSQGMPDPVWQTMKPSVRHRYSLLQQTAKSSDEIRKEVYADVTRTMSAEQLKNLNTVQQL SAQINSMTSPWYLHFMRYDPTASLKKIKVSGPRIKRRKRHSSRCRYESDCYQAAHQRERK QECNNQSLSKTEPPVSDL >gi|226332250|gb|ACIC01000070.1| GENE 28 46071 - 48452 1837 793 aa, chain - ## HITS:1 COG:no KEGG:BT_3111 NR:ns ## KEGG: BT_3111 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 793 1 793 793 1444 100.0 0 MKYLKRSILTLAIAVIPFHSYVNGCGGDYYDYMNYYNLFDQLLLENKGLQPFLLTTDYAF YGEDTNPDAEQQPDENLNAWMTFFKKNNTLQEMDTAQFKTLLYSASYQSLKQPSSPYVIA LNKTDAGKQTLTYLQYAKELEPYAQLSENDGWWDMKRAASPSEETYAHYKNKGLELYQHC PYHELKLRYGYQLVRLAHYMRNKNNEAIRMYNLYVKPLKQEHYIYYAALEQTAGALYNIG KLANANYLYSRVFDHSDNRKKIAYSSFRIQSEVDWNEAMTWCKDNREKAAMYALRGYNTF SNELEEVENILEIYPESPYIKLLAIRYINKMERNVLTRYNHSNATDDTSSFMQPSGKVLA EYERGQKVIKAVMNHPKVSDKDFWALYLAHLSFLCKDYQQASALIDSVRTTKPELLKQKS RTQFSLYLAQLKFIGTDEEKTIRQYLQTSNADEDFINEIVGHLYKMQKDYGKAFLTHNRI EDLRQNPDPNIINSLMSNAEKENDQALLTQLYELEGTYFLRMNNFEEAAKWFAKVPPSYS ITHYDYDYETEKYIPVEILPNEFNGYSKISPLIFSNGFKRLFSVPADSQLTDTMYEQYPY LNQEHDKATLTAALMQLEKESQMMTEESARAAYMLANYYYNISPTGYYRNIPTYFRDNSY CWSAYGSYGSAVSNRIPDYSKEYNYRDFTQEYMTINNMENALALYEQAATYFTDREWKAR ALFMASSCTMDLYAQNWWDNWNNILDPDFKRSDEEKKVDSYFYQLAKSYSDTQFFKQAVH ECKYFDYYVKNEF >gi|226332250|gb|ACIC01000070.1| GENE 29 48449 - 49474 581 341 aa, chain - ## HITS:1 COG:no KEGG:BT_3110 NR:ns ## KEGG: BT_3110 # Name: 
not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 341 1 341 341 707 100.0 0 MQIRLLQLAVLLICLSGCIDRHTNTPAFYVWKSKLSAEDADTAYLNTLGVQKIYARMFDV DNKGNGAFPTADYSPSSGNPDFQQEVVPVIFITNQTILQCPAADIEKLARNCADRIDTVY RSHFNKLPNEYQFDCDWTVKTKENYFNFLKHIRKLRKGALVSCTIRLHQIKFKDKTGIPP VDKGTLMYYASSEPTDFENKNTILNNKDAASYIKDVGSYPLHLDIALPLYSWGIVRNPFG QIKLINGIRQATIGAHPEYYKQTKEGVYNILQSHYLGGVWVNKDYELKVEEVSPETLLEA AQLLQRKLRKENREIIFYHLDKEILKQYSTQQLTNIINAFS >gi|226332250|gb|ACIC01000070.1| GENE 30 49860 - 51386 1337 508 aa, chain + ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 14 507 16 483 497 155 26.0 2e-37 MKKIIYLVPALSLAGCTSQQVEEKPNVIVILADDLGFGDVSAYGSTTIHTPNIDSLARGG VCFTNGYATSATSTPSRYALMTGMYPWKNKDAKILPGDAPLIINESQYTLPKMMRECGYV TGAIGKWHLGMGNGNVNWNETVKPGAKEIGFDYSCLIAATNDRVPTVYVENGDVVGRDPS DPIEVSYEQNFEGEPTAISNPEMLKMQWAHGHNNSIVNGIPRIGYMKGGKKARWKDEDMA DYFVDKVKNFITEHRDSSFFLYYGLHEPHVPRAPHQRFVGKTTMGPRGDAIVEADWCVGE LLTYLKKEGLLEKTLIIFSSDNGPVLNDGYKDGAPELAGKHAPAGGLRGGKYSLFDGGTH IPLFVYWKGKIQPVKSDALVCQMDLLASLGSMVGATLPDGLDSRNYLNAFMGTELKAREN LIIEAQGRLGYRSGDWIMMPPYKGSQRNLTGNELGNLDEFSLFDVKSDKGQKSNVAGRHP ELLERLKQEFFVQTDGFYRSEVEEEPLK >gi|226332250|gb|ACIC01000070.1| GENE 31 51578 - 52567 656 329 aa, chain + ## HITS:1 COG:L0234 KEGG:ns NR:ns ## COG: L0234 COG3507 # Protein_GI_number: 15673487 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Lactococcus lactis # 38 298 14 283 559 71 24.0 3e-12 MKTLIQFLLATFTILACVCCKTAKEEKTDDKLNEITFADPTIFVENGKYYLTGTRNREPN GFAIFESTDLKEWKTANGDTLQLILRKGDSAYGERGFWAPQLFKEGNRYYLTYTANEHTA LASSNSVYGPFRQDSIRPIDETAKNIDSFLFKDSDGKYYLYHVRFNKGNYLWVAEFDLQK GCIKPETLKKCFDNTEAWERTPNYKSAPVMEGPTVVKLDDLYYLFYSANHFKNIDYAVGY ATSTSPYGPWKKHANNPIIHRSIVHENGSGHGDLFKGLDGRYYYVYHVHHSDSIVQPRKT RIVPLILKKENDVYSITIDKENVIKPMWK 
>gi|226332250|gb|ACIC01000070.1| GENE 32 52594 - 54213 785 539 aa, chain + ## HITS:1 COG:PA0183 KEGG:ns NR:ns ## COG: PA0183 COG3119 # Protein_GI_number: 15595381 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 30 531 3 527 536 309 34.0 1e-83 MMDGNFISRKRIGGIVALSFLQFGNLSGQERPNIVMILADDMGYSDIGCYGSEIETPHID SLALQGIRLTQFYNTSRSCPTRASLMTGLYQHQAGIGWMSEDPFKNVDKDPKDWGVAEYR GALNRNCVTIAEVLKSSGYHTYMTGKWHLGMHGMEKWPLQRGFERFYGILAGACSYLRPE GERGLVLDNEKLPAPEAPYYTTDAFTDYAVNFINEQKDNTPFFLYLAYNAPHWPLQAKEA DIEKYYELYRTKGWDQIRKERHKRMADLGIIDSEIGFAEWENRQWEELSEAEKDHTAYRM AVYAAQVHCMDYNIGKLIESLKKSGKLDNTLIFFMSDNGACAEPHNELGGGKQKDINNPA VSGHPSYGKAWAQTSNTPFRKYKQRAYEGGISTPLIITWKNRLKGFENTWCHVPGYLPDI MPTILNVTGATYPSTFHNGNKIHPLVGTDLMPAIKGKVSSIHEYMYWEHQGNRAVRWKNW KAVWDQEGKVWELYDIGKDRVEACDLSAVHPDILKQLSDKWAVWAAETKVLIPFPNNKK >gi|226332250|gb|ACIC01000070.1| GENE 33 54223 - 55875 1052 550 aa, chain + ## HITS:1 COG:PA0183 KEGG:ns NR:ns ## COG: PA0183 COG3119 # Protein_GI_number: 15595381 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 29 530 4 526 536 300 36.0 4e-81 MKRIDYTFMLTLGASIGACPVLAQQNVSKPNIVLILLDDVGYSDFGCYGSEINTPNIDRL AENGIRLRHFYNQARSAPTRASLITGLYPHQVGNGALGKVPGYPAYQGYPNENNVFLPEA LKTAGYFNVMTGKWHLGYFQGVTPISRGFDRSLNAPFGGYYFSSDKSLKKNKKERTNQRN LYLNDEEIGFDDDRLPENWYSTHLWTDFGLKFIDEAIEEKQPFFWYLAHNAAHFPLQAPV ETINKYKGKYMKGWEDVRNARFKKQLELGLFERPDQLTPRNPKVPEWDSLTQEEKERYDM QMAIYAAVIEEVDRSIGRVVEHLKEKGVLDNTLIILLSDNGGNGEPGIEGRFAGKNPGSA GSTVFLGAAWADVANAPFFLYKHHGHEGGCNTPFIVSYPNGIDKSLNGTIQKDNYGHIVD IMPTLVKLTGATYPSSRGGHKVPPMEGISLLPLLKGELVKQTKPTIVEHEGNKMLRDGEW KIVQEYKEPKWRLYNLKNDPTEMRNLADREPEILNRMIHDYQKMADHIGVESEIEFKVGK WYTPVEEYLK >gi|226332250|gb|ACIC01000070.1| GENE 34 55911 - 58181 1618 756 aa, chain + ## HITS:1 COG:no KEGG:BT_3104 NR:ns ## KEGG: BT_3104 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 360 756 1 397 397 661 100.0 0 MKNKYRKLAVIMLTAVCFAGCQQEDIVYPEPDAKVAETYLTLNELTQSPLQVPGPEGTYI LKVISDDKWTLSSSQAWCSVSETEGFKYSQVPVSYSENPWNEPRTAELTFLINESQEIKI VTVEQEASETSLAADSKELQYNIGGGEKDITLQTNAVEWNVEVVDEPTHTPVKWCTVTPV TGKGTATLKVTTESNTTEVVRKASLVFTAADKSLSIPVIQADKLEAPVITLDDHNDFLLS WNEITGVDGYKLKVIADQNENTIDIPSGTASYNLDVIDWKGYVGMVSIQLFSYANIGGGT VAQMGSEIIEAHNLFDETSGDGEKGSEYIITKPRHLRNVSKYLDKNYKQTVDIDLTGIDM APISTELVNNTYNGNFTGVYEATKGDIIDTGTGRSSEQYKIKNWTLNKSSNTNCGLFASI GAEGIVRNVCIENVVIKAKAKVGAIAGNCSGKIISCHTTGNEGSISTAATTDAEIYLGGI VGYLTEKGEVSYCSNTAAIEGTAGCVGGVVGIIMRDANNAPAVAYCTNKANVVCSSKSPV GGVVGSIAGAAGNDLIKVTGCANSGEISGTQANNQVGGIVGRATIDTEISQSYNTGSITA MGSASGIIARMGGTNAIVRDCYNTGAIKSTGTATNGNSNAAGIVATCTIAAGGGMTIENC HNVGDMTTANGASFGNGLFHRTDGKISQITMKSCFSLNEDGKSQSSGSDSYISGGSSSYK NLSASEMKNTSSFTNWNFSTVWEMGSEYPTLQGLLK >gi|226332250|gb|ACIC01000070.1| GENE 35 58195 - 61380 2193 1061 aa, chain + ## HITS:1 COG:no KEGG:BT_3103 NR:ns ## KEGG: BT_3103 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1061 1 1061 1061 2059 99.0 0 MNNEYNIMKQKIRFKLTDRYLVLLICFLGYISTVSAQSHKISGVVVDEGKSPIIGATIQV KGANNGTITDIDGNFSLEVKNPKAQLQISYIGFTTITVPVQLKGKMTITMQEASRNLDEV VVIGYGTVRKSDLSGSVSAVSMKNEAEIMPITSADQFLQGRIAGVSIAANSGAPGAGMNI QIRGVSTLSGNTDPLYVVDGFPIEAATASVNGGTSELSQQPAMNPLASINPNDIESIQVL KDASATAIYGSRATNGVILITTKQGKEGKVNVNYNARVDISNISKRYNLLNAHDYALFEN ELDRTSNGYDMQGNVIPSTKVPRHTDDALERYKIYSTDWQDLMYQTAVSQDHQLAVNGGN KMTQYNITAGYTDQEGIIMNTGLERYSFRLNLKSELAKRLVLQLNTSYSQTEQKQTSHSQ SSSMNQMVRRILTTKPTLMPGDQIYEDENVEYVPADNPYIMATELKDILQQRFFILNASL TYNFGKGLSWKGAGSFNRTDGSRSTYYPIGTNAGNSAHGMAFRGEDNRQNIVLETTVNYD RKFKKHHHINAVLGYTHEDRQRKTLAIQLGDFAGNDLLYYAIGEGTNTMDKSSSVIQTKL SSFLGRMNYTLYDRYIFTATGRYDGSSLLASGNRWSFFPSFAVAWRINQEKFMKDIPAIS NIKLRLSYGVTGNQSIAYAAPLAIMNHVRSYSGGAVTHGMVNGKLANPNLGWESTATYNA GLDLGFIDNRFRVTMNVYQRTTKDMLMNFGLPLSSGYGSIPYNMGKMTNKGFELEASADI LTGRVKWTLGGNIYLNRNKVDDISGNELLGTSYLAGGGVFSQSIHITKAGYPVGSFYGYV VDGVYQNEAEAKLAPFDTPQATPGSLRFKDISGPDGLPDGKITSDDMTIIGTAEPKFNYG 
INSELSWKGLTLSMIFTGRVGGDIANLNRYFLDSFTDTNDNIRAEAWEGRWQGEGTSNFY PAVNGSQGSSYFNKRFSTFLLEDGSFFRLKNLTLAYQFSLKKLRWLRSVRVFGTVTNVFT ITNYSGYDPEVNITSGAMSPNVDYAAYPSSRTYSMGINLAF >gi|226332250|gb|ACIC01000070.1| GENE 36 61399 - 63099 1263 566 aa, chain + ## HITS:1 COG:no KEGG:BT_3102 NR:ns ## KEGG: BT_3102 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 566 1 566 566 1154 100.0 0 MKKYILSLLIGTALLSSCLDVDIVDRVEKEDGFFKTEDHALAAINGVYSTLYAFAYHKTN WAIILPTYEDAMFSTGNAVPATVSNNTHSAGSAPVVNFWGELYNGIKNANEVIARVPSIS FKSENDKNRIIAEAYFLRALYHYDLMRLYGGVNGIPIVIEPVTGIENAYVPQSPASKVYE QIISDFEYAAGLNDDDTPRLPKRLDPAYTSGRVTNGAAHAFLAEVYLTLGRWEDAAREAQ EVISSEQYTFVNDYQNLWDIEQESLAQTEHIFSVPFFRDKDALADQSLGSNIAHLFNPAG VQVGGKLVSGNPYGKGAGAHRVQKWFIKYFQDDQGNLGYSDPSVDASIDESKLISKDYRI EVTFWRRFQQKNNNTGVLGNIVTVYPAAGGNSQENWGYIRKFIDPQGINNRTNENDMPRL RLSDMYLIRAEALNELGKYTEACQAIDMVRERARKANGTERAYPKYIGSDRPDNIGRTLT KQEFRWLVFMERGLEFAGEQKRWFDLIRMKYDENTQMYDYIKDTFIPSRPAADVQKQGVM AARKKYFPIPFAEVSRNDGVQQNPGY >gi|226332250|gb|ACIC01000070.1| GENE 37 63146 - 64513 789 455 aa, chain + ## HITS:1 COG:MT0310 KEGG:ns NR:ns ## COG: MT0310 COG3119 # Protein_GI_number: 15839682 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Mycobacterium tuberculosis CDC1551 # 27 437 12 421 465 119 27.0 9e-27 MNRLFLSVSVALSATTCSFAQQITQPNLVLFIADDCSYYDLGCYGSVDSKTPNIDNFATQ GVRFTQAYQAAPMSSPTRHNLYTGLWPVGSGAYPNHTCADQGTLSVVHHLHPLGYKVALI GKKHVAPKSVFPFDLYVPSEKGELHFEAIQKFIADCKRKGQPFCLFVASNQPHTPWNKGD VSQFDPDKLTLAPMYVDVPQTRQEFTKYLAEVNFMDQEFGNVLSILEQEKVADQSVVVYL SEQGNSLPFAKWTCYDAGVHSACIVRWPGVVKPGSVSDALVEYVDIVPTFVDIAGGKPQT RVDGESFKSVLTGKKKEHKKYSFSLQTSRGINKGPEYYGIRSAYDGRYRYIVNLTPEATF QNAMTATPLFKEWKQLAETDAHAKAMTFKYQHRPAIELYDVRNDPFCMNNLAGDTKYSNI IIRLDAELKKWMKACGDEGQATEMRAFDHMPSKQK >gi|226332250|gb|ACIC01000070.1| GENE 38 64690 - 65565 362 291 aa, chain + ## HITS:1 COG:CC3346 KEGG:ns NR:ns ## COG: CC3346 COG0657 # Protein_GI_number: 16127576 # Func_class: I Lipid 
transport and metabolism # Function: Esterase/lipase # Organism: Caulobacter vibrioides # 56 262 40 239 278 87 28.0 4e-17 MRIKLIRCIELKAIVTLSLVLINSGNLLIGSNLFIRDNNDLSADTILRYNHYPTGKLYVY YSKDQTNQSSPAVIFFFGGGWNSGSPKQFESQASYLNKYGVTVVLADYRTQKNAGTTPKE ALMDAKSAMRYLKQHAMSIHVDPDKILAGGGSAGGHLAAATAFCHQINNPEDDLNVSSIP KALILFNPVIDNGPHGYGFDRVKDYYRDFSPIHNIKKDAPPAIFFLGSEDNIVLLETALK FKKKMEHVGSRCDLLLYPGQKHGFFNAKFEEFFEKTMSATVVFLKSLGYIK >gi|226332250|gb|ACIC01000070.1| GENE 39 66343 - 66912 400 189 aa, chain - ## HITS:1 COG:no KEGG:BT_3099 NR:ns ## KEGG: BT_3099 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 189 1 189 189 378 100.0 1e-104 MKTFKFILLALVVSNTITAQKNVRPTKNLDSYEGTWIYQKNDIIFKIKLQKGQEAWRNDL INGLYGGYYLSVRGEVLENYMSGFPVYWDFSKDPRPNNMYIWVSNHSLRLDDINPNYVSV WFYDQRKRHFGGKGITGGYIQLLSPTKIRWKLDELTGIWNETEGDESISDAERRPIGFSV PDDVIMTKE >gi|226332250|gb|ACIC01000070.1| GENE 40 67019 - 67576 400 185 aa, chain - ## HITS:1 COG:no KEGG:BT_3098 NR:ns ## KEGG: BT_3098 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 185 1 185 185 376 100.0 1e-103 MKTFRIIVLLLVLGINATAQENQNFAKILDTYVGTWVYQKNDTVFKIKFQKGQQLWTKKT ANGLYGGYYLSVNGRVLEDYMGELPTCWDVLKECQPNNLFIWAYSPYTDELGSLGIIFYD QRKRHFGGKGITGGYIQLLSPTKIRWKLDEEKGIWNETEGDESISDAGRRPIGFSVPDDV IMTKE Prediction of potential genes in microbial genomes Time: Thu May 12 01:02:20 2011 Seq name: gi|226332249|gb|ACIC01000071.1| Bacteroides sp. 1_1_6 cont1.71, whole genome shotgun sequence Length of sequence - 12533 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 4065 2069 ## COG0642 Signal transduction histidine kinase - Prom 4165 - 4224 3.3 + Prom 3948 - 4007 4.8 2 2 Op 1 . + CDS 4195 - 6240 1354 ## COG3534 Alpha-L-arabinofuranosidase 3 2 Op 2 . 
+ CDS 6248 - 7804 1027 ## COG3119 Arylsulfatase A and related enzymes 4 3 Op 1 . + CDS 7910 - 9040 865 ## BT_3094 putative secreted xylosidase 5 3 Op 2 . + CDS 9037 - 10704 1218 ## COG3119 Arylsulfatase A and related enzymes 6 3 Op 3 . + CDS 10738 - 12532 1285 ## COG3250 Beta-galactosidase/beta-glucuronidase Predicted protein(s) >gi|226332249|gb|ACIC01000071.1| GENE 1 3 - 4065 2069 1354 aa, chain - ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 817 1099 29 317 318 145 34.0 6e-34 MEYTFVLEKPIEKRLSLSVKQGVESDMKRSTLKNLISVAVCLLFLLHPTFAEGKIQYNNV RFKQLSITEGLSHNTVNAISQDSKGFMWFGTRNGLCRYDGYNITRYFHEEGDSTTISHDF ITKLYNDPCRNVLWISTEQGICKYNPQNEQFTRYHIEGNNKNSVLFLNTSDSRLLTGCSN GIYQYDNEKNSFTPFILNEGKGENIRGLVEDSNNVLWINSNKGVKRYNLKKEQFEPLPLN IRPFLETCIQLVMLPGNQLLFNTPQEVFVYNINNDSLYPLAAIKDIRQFRCATTDTMNNI WLGTENGIFIYDQTFRLVTHYQQSESDLSNLNDSPIYSLFEDYNHNMWVGTYFGGVNYFI YASDQFRIYPYGNSPNHLSGKAVRQIINTPDNGLYIATEDGGLNYLNSNREITRSERLHE RMNIKARNIHSLLIDSKQNLWIGLFLRGMNYYMPKENKTLSFNDGMGKNSSGFCIIEDET GKIWYGGPSGLFTLKKQNGSFQLKKVSALPVFCMLNLNDSIIWTGNRQNGIYQINKRTEE QTPLPQFSSSKLYMTYLYMDSQGNIWAGTNNDGLFVLNKKGEKLKSYSKKELGSNAIKGI IEDDQNTIWIGTDNGLCNIQPKSGLISRYTIADGLPTNQFNYSSVCKKPDGELFFGTING MISFYPEQVRPVEPHFNIALTGVWSNNDVVSSSNPDALLPASISESDVMTLTHEQAQSIR IEYSGLNYQYKDKTQYAMKLEGIDKEWQFVGNQHQVRFSNLPTGDYTLKIKASNDGVNWD EKGQKELTIHVLPPWWLSIWAYLTYVCMVLCIIYLAYKYTKARLILLMRLKTEHEQRVNM EKMNQSKINFFTYVSHDLKTPLTLILSPLQRLIKQKQIDNNDREKLEVIYRNANRMHYLI DELLTFSKIEMQQMEINVRKGNIMHFLEEISHIFDIVSKEKEIDFIVSLEETDEEVWFSP SKLERIMYNLLSNAFKYTQPGDYVKLSAKLLKKDSENFIEISVKDSGRGIPKEMKDKIFD SYFQVEKKDHREGFGLGLSLTKSLIHMHKGEIKVESEVGKGSEFIVSLNVSESAYSSSEK SLESITSEEIQKYNLRMKETIELIPDQLISTEQDNTDVKESILIVEDNKEMNDYLAEIFS KDYQVFRAYNGAEACKLLKKQLPDLIVSDVMMPVMDGLELTAYVKQDLNSSHIPVILLTA KTDELDHTQGYLKGADAYITKPFNAQNLELLVQNMRTNRKQNIEYFKRIEKLNITQITNN 
PRDEVFMKELVDLIMANIKDEEFGVTEIITHMKVSRSLLHTKLKSLTGCSITQFMRTIKM KEAKIHLQNGMNVSEASYAVGMSDPNYFTKCFKK >gi|226332249|gb|ACIC01000071.1| GENE 2 4195 - 6240 1354 681 aa, chain + ## HITS:1 COG:CAC3436 KEGG:ns NR:ns ## COG: CAC3436 COG3534 # Protein_GI_number: 15896677 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Clostridium acetobutylicum # 51 540 56 512 835 262 32.0 2e-69 MKTKFCRTAVYCLCCFMFIQPITGSQVNDTHEGVLHIDKQKTRKVSRVQYGFHYEEIGMI GEGALHAELVRNRSFEEATPPADLAVKNGLYQNVPNPRGKNKDVFHVDPLIGWNTYPLSY TPIFISRTEENPLNKENKYSMLVNVTEDIANNPEAMILNRGYYGMNLRKEVSYHLSMYIK SKNYTALLQVMLVDEQGKPVSTQLVLDVKGKEWTKLTGTLKPDKDVKRGMLAIQPLGKGQ FQLDVVSLFPSDTWDNGKSVFRADIMQNLKEYAPDFIRFPGGCIVHGVNEATMYHWKKTI GPIENRPGQWSKWAPYYRTDGIGYHEFYELCEYLGADAMYVIPTGMICTGWVKQSSPWNF IQPDVDLDAYIQDALDAIEYAIGPETSKWGALRVKNGHPKPFPLKYIEIGNEDFGPVYWE RYEKIYQALHRQYPDLIYIANSIIGKENDDKRIDIAKFVNPKNVKVFDEHHYQPVEWACK QHYRFDNYERGIADLFVGELGIDGKYPYNLLASGAVRMSLERNGDLNPLLAERPVMRHWD FSEHRQLHSMLINGVDCSIRTSFYYLSKMFRDHTFDVCLDAGIQGMKGLQNVFVTMGYDS ESKEYILKLINLRKQEVVLQTAVKGFGKKIQAQKITLSLEPGKNNTPEMPDAVKPVQSTV TLDLNKDLQLEASSMIVYRFK >gi|226332249|gb|ACIC01000071.1| GENE 3 6248 - 7804 1027 518 aa, chain + ## HITS:1 COG:PM0598 KEGG:ns NR:ns ## COG: PM0598 COG3119 # Protein_GI_number: 15602463 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pasteurella multocida # 41 497 1 456 467 162 29.0 2e-39 MRNAAFCITGLALQGLSMSLMAQTDKPNIVIIMTDQQRADLCGREGFPLAVTPYVDQLAQ ENVWFNKAYTVAPASSPARCSMFTGRFPSATHVRTNHNIPDIFFEQDLVGVLKENGYKTA LVGKNHAYLKPADLDFWSEYGHWGKNKKTTPEEKETARFLNQKARGQWLEPSPIPVEEQH PAKIVNETLSWIESQKENPFFVWVSFPEPHNPYQVCEPYYSMFSPDKLPVLKTSRKDLEK KGEKYRILAELEDASCPNLEQDLPRIRANYIGMIRLIDDQIKRLIESLKASGQFENTIFV VLSDHGDYWGEYGLIRKGAGLSESLARIPMVWAGYHVKNQSAPMDSHVSIADIFPTLCSA IGAKIPAGVQGRSLWPMLTGKEYPKEEFSSIVVQQGFGGAHVGLDSSLTFEQEGALTPGK IAYFDELNTWTQSGTSRMVRKDDWKLVMDNYGNGELYNLKKDPSEVDNLFGKKKYTAVQT ELLTKLVAWELRLQDPLPLPRKRYHFKQDPYNYLQPEQ 
>gi|226332249|gb|ACIC01000071.1| GENE 4 7910 - 9040 865 376 aa, chain + ## HITS:1 COG:no KEGG:BT_3094 NR:ns ## KEGG: BT_3094 # Name: not_defined # Def: putative secreted xylosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 376 1 376 376 768 100.0 0 MKFILRLGLALLTGFVSVSTASAQAYGTADTNAPELRVPKSVQPAFDYWMRDTWATLGPD GYYYITGTTSTPDRHFPGQRHCWDWNDGLYLWRSKDLKIWEAMGRIWSMEKDGTWQKDPK VYKEGEKYAKKSINNDPMDNRFHAVWAPEMHYIKSAKNWFIVACMNQSAGGRGSFILRST TGKPEGPYENIEGNEDKAIFPNIDGSLFEDTDGTVYFVGHNHYIARMKPDMSGLAEEIKT LKESKYSPEPYVEGAFIFKYDGKYHLVQAIWAHRTVKGDTYVEKEGLTNKKTRYSYDCII ATADNVYGPYGKRYNAITGGGHNNLFQDKDGNWWATMFFNPRGAQAAEYKVTCRPGLIPM LYENGKFKPNHNYQAK >gi|226332249|gb|ACIC01000071.1| GENE 5 9037 - 10704 1218 555 aa, chain + ## HITS:1 COG:PA0183 KEGG:ns NR:ns ## COG: PA0183 COG3119 # Protein_GI_number: 15595381 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 21 532 3 527 536 284 33.0 3e-76 MKTSLLITGGLAFGAMTFAQEKPNIIVILADDLGFSDLGCYGGEVQTPVLDKMAKQGVRM TQMYNSARSCPSRANLLTGLYPHQTGLGHMDATRPAWPKGYAGFRSNSDNVTIAEVLKDA GYFTAMSGKWHLGKTANPINRGFQEYYGLLGGFNSFWNPDVYTRLPKDRNPRQYKEGEFY ATNVITDYAIDFINQAHQEEKPLFLYLAYNAAHFPLHAPKEVTDKYMKIYLQGWDKIRDQ RWKRIVKMGLLQGKPALSPRGVVPESLFEDETHPLPAWDSLTKEQQTDLARRMAIYAAMI DIMDTNIGRVLDTLQKNGELDNTFIMFMSDNGACAEWHEFGFDKQTGTEYHTHVGAELDQ MGLPGTYHHYGTGWANVCCTPFTLYKHYAHEGGISTPCIIQWGKQIKHKGSIDHQPAQFS DIMATCIELAGTKYPEEYEGRKIVPTPGKSILPIVRGEEMPDRYIYAEHEGNRMVRKGDW KLVSANFRGDEWELYNIKEDRTEQHNLIKEYPEMAKELEDAYFEWADHSDVLYFPKVWNT YNKGRRKDFKEYRDK >gi|226332249|gb|ACIC01000071.1| GENE 6 10738 - 12532 1285 598 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 108 508 117 502 1087 88 24.0 4e-17 MNKKLLIAVFVSAMAHTFACAASPKENRMKTVWADKVTSENVWQSYPRPQLQRSQWMNLN 
GLWQYAVTTQDTPKKEVKFEGQILVPFAIESSLSGVQRSFLPTEKLWYQRNFTLDDAWKG KTVILHFGAVDYECQVWVNNKLAGIHKGGNNPFSFDVTKLLRKGGSQLVEVAVIDPTDTE SISRGKQQLEPKGIWYSPVSGIWQTVWLEAVNSTHILQVLPTADIHKKTVSLDISVAKAK GGEKVKVELLDEGKVVGTTEQKLASKIEMEVPEAVLWSPSSPKLYHLNIELLSQGQVIDQ VKSYFAMREVSIRKDECGYQRICLNGDPIFQYGTLDQGWWPDGLLTPPSEEAMLWDMVQL KNMGFNTIRKHIKVEPEQYYYYADSLGLMMWQDMVSGFSTERKDEEHIKPNAQTDWNAPA SHTNQWQQEMFEMIDRLRFYSCITTWVIFNEGWGQHNTVEIVEKVMEYDKSRIIDGVTGW TDRGVGHMYDVHNYPSASMILPACNGNRISVLGEFGGYGWAIKEHLWNPGMRNWGYKNID GAMALMDNYGRLVYDLETLIAQGLSAAIYTQTTDVEGEVNGLITYDRKVIKIPEGLLH Prediction of potential genes in microbial genomes Time: Thu May 12 01:02:34 2011 Seq name: gi|226332248|gb|ACIC01000072.1| Bacteroides sp. 1_1_6 cont1.72, whole genome shotgun sequence Length of sequence - 33275 bp Number of predicted genes - 23, with homology - 23 Number of transcription units - 13, operones - 5 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 40 - 399 213 ## BT_3092 beta-galactosidase + Prom 435 - 494 4.8 2 2 Tu 1 . + CDS 581 - 2152 1631 ## BT_3091 putative regulatory protein + Term 2204 - 2252 4.3 + Prom 2165 - 2224 5.5 3 3 Op 1 . + CDS 2334 - 5339 3304 ## BT_3090 hypothetical protein 4 3 Op 2 . + CDS 5355 - 6845 1418 ## BT_3089 hypothetical protein 5 3 Op 3 . + CDS 6879 - 8396 1488 ## BT_3088 hypothetical protein 6 3 Op 4 . + CDS 8410 - 10188 1595 ## BT_3087 cycloisomaltooligosaccharide glucanotransferase 7 3 Op 5 . + CDS 10223 - 12727 1951 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases + Term 12844 - 12890 10.1 - Term 12957 - 13015 1.1 8 4 Tu 1 . - CDS 13068 - 15206 1669 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases - Prom 15282 - 15341 3.8 + Prom 15384 - 15443 7.5 9 5 Tu 1 . + CDS 15470 - 16273 789 ## COG0657 Esterase/lipase + Prom 16298 - 16357 4.1 10 6 Tu 1 . + CDS 16384 - 17709 912 ## COG1373 Predicted ATPase (AAA+ superfamily) + Prom 17756 - 17815 5.2 11 7 Op 1 . 
+ CDS 17872 - 19518 1390 ## COG1621 Beta-fructosidases (levanase/invertase)
12 7 Op 2 . + CDS 19571 - 19912 379 ## COG1917 Uncharacterized conserved protein, contains double-stranded beta-helix domain
+ Prom 19956 - 20015 4.3
13 8 Op 1 . + CDS 20041 - 20988 854 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9
+ Term 21029 - 21078 12.4
+ Prom 21002 - 21061 2.2
14 8 Op 2 . + CDS 21082 - 22842 1320 ## BT_3079 hypothetical protein
+ Term 23003 - 23040 4.0
15 9 Tu 1 . - CDS 23021 - 23500 530 ## COG3467 Predicted flavin-nucleotide-binding protein
16 10 Tu 1 . - CDS 23649 - 25805 1234 ## PROTEIN SUPPORTED gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1
- Prom 25920 - 25979 6.0
+ Prom 25926 - 25985 7.3
17 11 Op 1 . + CDS 26061 - 27068 735 ## COG0451 Nucleoside-diphosphate-sugar epimerases
18 11 Op 2 . + CDS 27059 - 28024 596 ## BT_3074 hypothetical protein
+ Term 28056 - 28095 5.3
19 12 Tu 1 . - CDS 28027 - 29019 498 ## PROTEIN SUPPORTED gi|148828154|ref|YP_001292907.1| ribosomal protein L11 methyltransferase
- Prom 29142 - 29201 7.0
+ Prom 28970 - 29029 4.0
20 13 Op 1 . + CDS 29157 - 30440 1312 ## COG0826 Collagenase and related proteases
21 13 Op 2 . + CDS 30444 - 30848 525 ## COG0824 Predicted thioesterase
22 13 Op 3 . + CDS 30890 - 31966 878 ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake
23 13 Op 4 .
+ CDS 31996 - 33246 1142 ## BT_3069 putative disulphide-isomerase Predicted protein(s) >gi|226332248|gb|ACIC01000072.1| GENE 1 40 - 399 213 119 aa, chain + ## HITS:1 COG:no KEGG:BT_3092 NR:ns ## KEGG: BT_3092 # Name: not_defined # Def: beta-galactosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 119 616 734 734 247 99.0 8e-65 MPDAQNGSAVSRAVSLNGQESKMTSLPFDCPPRSEIVSETSFDVNDEFSHLSLWVRATGD VKVWLNGVEVFSQEVKQTRQYNQYNISNYCRYLRKGKNELKIEVRETKKMSFDFGLRAY >gi|226332248|gb|ACIC01000072.1| GENE 2 581 - 2152 1631 523 aa, chain + ## HITS:1 COG:no KEGG:BT_3091 NR:ns ## KEGG: BT_3091 # Name: not_defined # Def: putative regulatory protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 523 28 550 550 931 99.0 0 MLLREIDGIIRNRQTYGAEKEARISDLKKLLSEATSDEQRYGFCGRLFDEYRAYNLDSSY VYAQRKQELASHLNKQDYLDDAAMNMAEVMGTTGMYKEALEQLGQIDKKTLSDYLYPYYY HLYRTIYGLMGDYAVTEKEKKEYYRMTDLYRDSLLQTNASDSLGHVLVMADKCTVHAQYD QAITMLTDFYRKPSLDDHSKAMITYTLSEAYRLKGDKKGQKHFLALSAIADLKSAVKEYV SLRKLASLVYEDGDIDRAYNYLKCSLEDATLCNARLRTLEISQVFPIIDQAYQLKTKRQQ QEMKISLICISLLSVFLLVAIFFVYKQMKKVAAARREVIDTNTLLQELNGELHDSNSQLK EMNHTLSEANYIKEEYIGRYMDQCSTYLDKMDLYRRSLNKIAATGRVEELYKAIKSSQFL EEELKEFYANFDMTFLQLFPNFVEEFNALLVEPMQPKQGELLNTELRIFALIRLGITDST KIAQFLRYSVTTIYNYRTRVRNKALGERDEFEAKVMKIGKVEE >gi|226332248|gb|ACIC01000072.1| GENE 3 2334 - 5339 3304 1001 aa, chain + ## HITS:1 COG:no KEGG:BT_3090 NR:ns ## KEGG: BT_3090 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 1001 1 999 999 1891 99.0 0 MYMEQSIKSKGFEHRLLLIMWGLLLSLSAFAQQITVKGHVVDATGEPVIGASVIEGKSTN GTITDIDGNFSLNVSANSTLTISFVGYKTQTVSVNGKTALKVTLQEDTEVLDEVVVVGYG TMKKSDLTGAVSSVGVKDIKDSPVANIGQAMQGKVSGVQIIDAGKPGDNVTIKIRGLGTI NNSNPLIVIDGIPTDLGLSSLNMADVERVDVLKDASATAIYGSRGANGVVMITSKRGAEG AGKVTVNANWAIQNATKVPDMLNAAQYAALSNDMLSNNDDNTNPYWADPSSLGKGTNWLD EMLRTGVKQSYSVSYSGGTEKAHYYVSGGFLDQSGIVKSVNYRRFNFQANSDAQVNKWLK FTTNLTFSTDVKEGGTYSIGDAMKALPTQPVKNDDGSWSGPGQEAQWYGSIRNPIGTLHM 
MTNETKGYNFLANITGEITFTKWLKLKSTFGYDAKFWFADNFTPAYDWKPNPVEESSRYK SDNKSFTYLWDNYFVFDHTFAKKHRVGVMAGSSAQWNNYDYLNAQKNIFMFDNIHEMDNG EKMYSLGGSQSDWALLSLMARLNYSYEDKYLLTATVRRDGSSRFGKNNRWGTFPSVSLAW RVSQEDWFPKDNFLMNDLKLRVGYGVTGNQEIGNYGFVASYNTGVYPFGNNNSTALVSTT LSNPNIHWEEVRQANFGVDMSLFDSRVSLSLDAYIKNTNDMLVKASIPITSGFEDTTETF TNAGKMRNKGVEMTLRTINLKGIFSWESALTATYNKNEILDLNSETPMFINQIGNSYVTM LKAGYPINVFYGYVTDGLFQNWGEVNRHATQPGTAPGDIRFRDLNNDGVINDEDRTILGN PNPNWFFSLSNNLSYKGWELSVFLQGVAGNKIYNANNVDNEGMAAAYNQTTAVLNRWTGE GTSYSMPRAIWGDPNQNCRVSDRFVENGSYLRLKNITLSYTLPKKWLQKIQLENARISFS CENVATITGYSGFDPEVDVNGIDSSRYPISRTFSMGLNFNF >gi|226332248|gb|ACIC01000072.1| GENE 4 5355 - 6845 1418 496 aa, chain + ## HITS:1 COG:no KEGG:BT_3089 NR:ns ## KEGG: BT_3089 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 496 1 496 496 1030 99.0 0 MKKKLTFIMILVVLALTSCSDFLDKYPKYGVDPESEVTNEIAVALTTACYKTLQSSNMYN QRLWSLDILAGNSEVGAGGGTDGLETVQAANFIAQSDNGFALYVWRSPWVGIGRCNIVLS NLPSAAISDEIKDRCMGEAYFLRAHYYYILVRLYGGVPLRLQPFEPGQSTDIARNTVDEV YAQILSDCKNAVDMLPPKSSYGENDKGRACKEAAMAMLADIYLTLAPNHRDYYSEVVTLC DQITAMGYDLSQCKYADNFDATINNGAESLFEVQYSGSTEYDFWGGDNQSSWLSTFMGPR NSGMVAGAYGWNLPTEEFIKEYEAGDLRKDVTVLYQGCPAFDGMEYRRSWSGTGYNVRKF LVSKTVSPEYNTNPNNFVVYRYADVLLKKAEALNELGHPDQAAAPLNIVRHRAGLADVPT TLNQETMREKIIHERRMELAFEGHRWFDMIRINNGNYAIEFLKSIGKNQVTKERLLLPIP QTEMDSNNLMTQNPGY >gi|226332248|gb|ACIC01000072.1| GENE 5 6879 - 8396 1488 505 aa, chain + ## HITS:1 COG:no KEGG:BT_3088 NR:ns ## KEGG: BT_3088 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 505 1 504 504 978 99.0 0 MMKKYIYQMLCSLFIGGAMVSCAEDYMETDKGHDTLTLTVNQQEIVLNEKNHSQEALTLS WTTGTNYGSGNRISYTLEIAKAGTDFARAYSVDLGTGTYQWTKKTEELNQFLNTQLGVGY AEKVSLEARITATVAGMEEKEQRATVALDVTTYQPVTPTLYLIGEAAPNGWSADQATPME RTDNGQFTWTGKLNTGVFKFITTLGEFLPSYNRDAAAGEELRLIYRTSGDEPDEPFTVSK EATYIVKVDLLDLTMTMTETENIGWRFEEFYIVGSFTGDNGWGFEALSKDAVQMNLFHYG AVIPWKADGDFKFTSVTDFGQSDAFFHPTEGNAPYTSTSVVLGGEDNKWQMKESECGKAY 
KVLFLTAKGKEKMLMRPFTPYEGLYLVGDATPNGWSIDNATPMAKSADSPYIFTWSGTLN TGEMKISCDKQSDWNGDWLMADKSGKAPTGEVETALFTSKTDAELKNMYPDTDLGSLDNK WNIQEAGSYRITIDQLKETISIVKQ >gi|226332248|gb|ACIC01000072.1| GENE 6 8410 - 10188 1595 592 aa, chain + ## HITS:1 COG:no KEGG:BT_3087 NR:ns ## KEGG: BT_3087 # Name: not_defined # Def: cycloisomaltooligosaccharide glucanotransferase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 592 1 592 592 1222 99.0 0 MKKIIYLVAAFLCLSCSDDHESNPQNGGASGSVTEVTPVTSDLCVELTTDKAFYKPNETV TFTAADALPAGTKVRYRLSGEIVGEEPVSGTNWTWKAPSTDFKGYMAELYRQENGTDVIV GTIAVDVSSHPARFPRYGFVADFDGVKTEEKTLEEMAYLNRHHINWVQFQDWHNKHHWPL GGTRTQLDEEYLDIANRPVHTSSVKNYIKAQQHFGMKSMFYNLCFGALKDAASDGVKEEW YLFKDASHTTKDSHDLPSGWKSNIYLVDPSDKEWQQYMAERNDDVYANFAFDGYQIDQLG KRGTLYNYNGTPVNLREGYASFIEAMKQAHPDKSLVMNAVSRYGARQIGETGKVDFFYNE MWADEADFTHLKAVLYENGVYGNNQLNTVFAAYMNYNKADHRGEFNTAGILLTDAVMFAL GGSHLELGGDHMLCKEYFPNDNLTMSEELKTAMVHYYDFLTSYQNLLRDGGTENSIAMNC TNGEMKLNVWPPKLGSVTTYAKQVDGKQVVHLLNFSQANSLSWRDVDGTMPEPALITKAT LQMNLPAKVNKLWVASPDVHGGALQELAFTQENGVVSFTLPALKYWTMIVAE >gi|226332248|gb|ACIC01000072.1| GENE 7 10223 - 12727 1951 834 aa, chain + ## HITS:1 COG:alr4773 KEGG:ns NR:ns ## COG: alr4773 COG1501 # Protein_GI_number: 17232265 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Nostoc sp. 
PCC 7120 # 31 771 23 759 779 438 33.0 1e-122 MKIRYKFGVWMLCLMFPALVWAENAKSICTSFTQQGRQVIFHLADSAALQLQLCSPSVVR IWFSPDGKLQRRNASFAVINEELEDVGTVHVDEQAACYEIFTSKLRIRVNKSPMSIQIFD KYQKLLFSDYADKGHVSEGTKKVEYKTLRRDEHFFGLGEKTGKIDRRGESYKMWNSDKPC YSVVEDPLYKSIPFFMSSYRYGIFLDNTYKTEFKFGTESRDYYSFEAPDGEMIYYFIFGK DYKEILSQYVGLTGKPIMPPKWALGFAQCRGLLTSEKLSREIAEGYRTRGIPCDIIYQDI GWTEHLQDFEWRKGNYGHPKKMLADLKEMGFKVVVSQDPVISQANKKQWEEADRLGYFVK DSTNGKSYDMPWPWGGNCGVVDFTLPAVADWWGTYQQKPIDDGISGFWTDMGEPAWSNEE QTERLVMKHHLGMHDEIHNVYGLTWDKVVKEQFEKRNPNRRVFQMTRAAYAGLQRYTFGW TGDCGNGDDVLQGWGQLANQIPVILSAGLGVIPFTTCDITGYCGDIEDYPSMAELYTRWI QFGAFNPLSRIHHEGDNPVEPWLFGPEAEKNAKEAIELKYRLLPYIYTYAREAHDTGLPI MRPLFLEYPADMETFSTDGQFLFGQELLVAPVVKKGARTKNVYLPEGTWIDYNNKQTVYT GEQWTTVEAPLSCIPMFVKQGSIIPTMPVMNYTHEKPVYPLIFEVFPAPKGEEASFTLYE DEGEDLGYQRDEFAKTPVRFRTEEGGYLLSVGAREGKGYAVPAPRNFIFRMYLKTAPKGV TVQGKKVKKVKPERLEENPDDDTESMVWSWDKATGICSLRMPDRGEESRISLRL >gi|226332248|gb|ACIC01000072.1| GENE 8 13068 - 15206 1669 712 aa, chain - ## HITS:1 COG:alr4773 KEGG:ns NR:ns ## COG: alr4773 COG1501 # Protein_GI_number: 17232265 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Nostoc sp. 
PCC 7120 # 199 678 225 723 779 187 28.0 8e-47 MRKLLLNLFLSTVGLLFAGQASAQEEIAPGIIKLQAGEIDTFTPYSLFGGKPAVEAMKQL PAAKPPFSPDEVQIKITDRGCLIEVPLEDNEQIYGFGLQFETFGQRGLRKRPIVNDNPLN GLGYTHAPQTFYVSTKGYGILVNTARYTTFLCGSNQKTEHSRQLQAEERKHIATTTEDLY KNRSNGNKVHIDVPGAKGIEVFIITGPEVLDVVKRYNLLSGGGCLPPMWGLGFKYRVKGD ATQDSVMRFANYFREKQIPCDVLGLEPGWQTATYSCSYRWSDDRFPRHKEMLDQLQQKGY KVNLWEHAYVHPSSPIRKALEPYSGDFLVWNGLVPDFIQPEAHKIFTDYHRTLIEEGISG FKLDECDNSNISFASATWCFPDMAQFPSGIDGEKMHQVFGSLYVNAMDSIYREKNTRTYQ DYRSSGMFMSSRNAVLYSDTYDPKEYIQALCNSAFGGLLWCPEVREAHSAEDFFHRLQTV ILSPQAMVNAWYLQYAPWLQFDRGKNERGEFLPEAKRYEEYARTLINLRMQLIPYLYSAF YTYYKEGVPPFRPLLMDYPKDERLRTISDQYMMGDGLMAAPLYQNKKTRTVYFPEGTWYN FNTNEKYEGNREYEITTELDQLPLYVRQGTLLPLAAPVPYVDAQTVFDLHCKVYGAPSAT FLLLEDDGISYDFQKGQFNEVTLEAAKGKVKLKRTKEYKQKRYQLTDYEFIN >gi|226332248|gb|ACIC01000072.1| GENE 9 15470 - 16273 789 267 aa, chain + ## HITS:1 COG:DR0821_2 KEGG:ns NR:ns ## COG: DR0821_2 COG0657 # Protein_GI_number: 15805847 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Deinococcus radiodurans # 44 225 6 188 242 106 33.0 5e-23 MKKNCFLLLFLLCACLVQAQDTYRTDKDVPYISGSETDAYRLERCKLDVYYPENKKDFST IVWFHGGGMEGGSKFVPRELTDQGFAVVAVNYRLSPKAKNPAYIEDAAEAVAWTFRNIEK YGGRKDRIFVSGHSAGGYLTLILAMDKKYMAAYGADADSVAAYLPVSGQTVTHFTIRKER GLPNGIPIIDEFAPVNKARKDTAPVVLITGDKNLEMADRYEENALLASVLKNIGNRKVSL YELQGFDHGQVLGPACYLIVNYVKRFK >gi|226332248|gb|ACIC01000072.1| GENE 10 16384 - 17709 912 441 aa, chain + ## HITS:1 COG:FN1101 KEGG:ns NR:ns ## COG: FN1101 COG1373 # Protein_GI_number: 19704436 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 3 437 24 458 470 396 46.0 1e-110 MFERGIIQLLVAWKTDVNRKPLVIHGARQVGKTWALKYFGQKYFEDVAYFSLDKDESGLC DIFKTTKDPERIIQQLSFLHGRKINPQTTLLILDEIQECNEALNSLKYFCEEAPEYAVAC AGSLLGIYLNHIGNSFPVGKVNHLSMYPLTFTEFLNTKDPAMYQYMCSVKEIAPLPQIFF DKLREAFIAYSICGGMPEPASLMVDFNDMQKVDSSLRDILNDYALDFVKHATPTLAPRIN YVWNSLPSQLAKENRKFVYQLVRPGARAREYEDAILWLEQAGLVNKVVLSKSPRLPLKAY 
DDLSTFKLYALDIGLLRQLSELGASVLLLSTPGYTEYKGALAENYVLQSLTAQFQASFRY WTSGNKAEVDFLLQHGNHIYPIEVKADQNITGKSLIQYEKLYQPACRIRYSMLNLKQDGN LINIPLFLADKTKELLNNNVG >gi|226332248|gb|ACIC01000072.1| GENE 11 17872 - 19518 1390 548 aa, chain + ## HITS:1 COG:BS_sacC KEGG:ns NR:ns ## COG: BS_sacC COG1621 # Protein_GI_number: 16079757 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Bacillus subtilis # 112 548 30 513 677 330 39.0 6e-90 MKTLSRTVLTLFVLSFSCLAAIAGEVSFKITKQYLNFPISHKENRGRMTFEVNGKPDLSV VIRLAPDEAEYWVFKDVSAFKGKTIKITYDGNEKGLSKIYQSDEIEGQAELYKEKNRPQF HFTTRRGWINDPNGLIYYEGEYHLFYQHNPFERDWENMHWGHAVSKDLIHWTELPDALYP DHLGTMFSGSAVIDYDNTAGFNKGKTPAMVAAFTAASSDRQVQGIAYSLDKGRTFTKYDK NPVINSKEKWNSQDTRDPKVFWYAPSKHWVLVLNERDGHSIYTSSNLKDWKYESHVTGFW ECPELFELPVDGDKNHTKWVMYGATGTYMLGSFDGKVFTPEAGKYCYTTGSIYAAQTFTN IPASDGRRIQIGWGRISHPGMPFNGMMMLPTELTLRTTKDGIRLVNVPVKEVESLCKPLR SWKSLSSDEANRHLKEFYDADCLRIKTTIKLSHATDAGFNLFGQRMIGYDMNSNTLNGRF YSPQDPTSMELSADIYIDRTSIEVFIDGGLYSYSMERRPDTNNKEGFHFWGNRIEVKDLQ VFSVESIW >gi|226332248|gb|ACIC01000072.1| GENE 12 19571 - 19912 379 113 aa, chain + ## HITS:1 COG:SMb21347 KEGG:ns NR:ns ## COG: SMb21347 COG1917 # Protein_GI_number: 16264671 # Func_class: S Function unknown # Function: Uncharacterized conserved protein, contains double-stranded beta-helix domain # Organism: Sinorhizobium meliloti # 5 113 4 112 112 103 44.0 7e-23 MKTRSEVFQFEKELKWEHPAPGIRRQIMGYDGQLMMVKVEFEKGAVGTLHEHYHTQATYV ASGKFELTIGDRKEILSTGDGYYVAPDEPHGCVCLEAGVLIDTFSPMRADFLE >gi|226332248|gb|ACIC01000072.1| GENE 13 20041 - 20988 854 315 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 4 311 3 307 308 333 56 8e-91 MEKIARKLTDLVGNTPLLELSNYNKSNGLKARLIVKIESFNPAGSVKDRVALAMIEDAEV KGVLTPGATIIEPTSGNTGVGLAFVAAAKGYKLILTMPDTMSVERRNLLKALGAELVLTP GADGMKGAIAKAEELKAATPGSVILQQFENPANPAMHLRTTGLEIWRDTEGKVDIFVAGV GTGGTISGVGEALKMRDPSVKAVAVEPADSPVLSGGKPGPHKIQGIGAGFVPKTYNASVV 
DEIIQVQNDDAIRTSRALAEKEGLLVGISSGAAVYAATELAKRPENEGKMIVALLPDTGE RYLSTILYAFEEYPL >gi|226332248|gb|ACIC01000072.1| GENE 14 21082 - 22842 1320 586 aa, chain + ## HITS:1 COG:no KEGG:BT_3079 NR:ns ## KEGG: BT_3079 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 586 1 586 586 1120 98.0 0 MQNKTLGILTIILLLFSACKDEETSLSSTKQLISYSIQKSDNQGKIKNDVRGSIKGNVIT LSMDQYDDLKSLIATFKYEGTSVSVNGVGQESGITSNDFSRPLMILVEAEDGSREQYTVE VVLKDAQVLSEFRFLRKDNALLTADVSCTIEDETIVSSYTFPQSKLIPVFMTDAVKVMVD DVEQVSGVTEIDFASPVTYQFVMRNGEVVRYILTLDFILIPQFTITTEDPSITEIPSKDY YLNATLTVDGKGIYENYTGKTEVKGRGNSTWGYPKKPYRLKLDKKSEICGLGKAKNYILL ANHIDPTLMLNSVAFKVGQLLNIPFTNHAIPVDVVLNGKYKGSYLLTEQIEIKENRVDLD ENNSVMWELDSYFDEDPKFKSEAFNLPVMVKDPDLTTEQFEYWKKDFNAFTVQFAKEPLE GNMYVDMIDIESVAKYLITFNLVHNMEINHPKSIFIHKEGKGKYVMGPIWDFDWAYDYEG KETHFGSYETPLFSDDMNGVGTAFFQRFLQDSRVRTIYKNFWQDFKSNKLNELLQYIDDQ AKLIKPSVTRNSELWENTRSFDAKVIELKNWLKNRAEYIDGEVNQY >gi|226332248|gb|ACIC01000072.1| GENE 15 23021 - 23500 530 159 aa, chain - ## HITS:1 COG:MA2197 KEGG:ns NR:ns ## COG: MA2197 COG3467 # Protein_GI_number: 20091038 # Func_class: R General function prediction only # Function: Predicted flavin-nucleotide-binding protein # Organism: Methanosarcina acetivorans str.C2A # 5 153 6 152 152 96 34.0 2e-20 MKTIVIEDKQRIESIILQADACFVGITDLEGNPYVVPMNFGYENDTLYLHSGPEGGKIEM LQRNNNVCITFSLGHKLVYQHKQVACSYSMRSESAMCRGKVEFIEDMEEKRHALDIIMRH YTKDQFSYSDPAVRNVKVWKVPVDQMTGKVFGLRADEKP >gi|226332248|gb|ACIC01000072.1| GENE 16 23649 - 25805 1234 718 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 [Clostridium acetobutylicum ATCC 824] # 25 717 7 703 730 479 37 1e-135 MAKKKEKKEKKAGKRMSKKELAALLIDFFHAKPSETLSMKYIFSELHLTTHPQKMLCADL LHDLSDDDYISEIEKGKFRLNNHGTEMIGTFQRKSNGKNSFIPEGGGDPIFVAERNSAHA MNNDKVKITFYAKRKNKDAEGEVIEILERANDTFVGTLEVAKSYAFLVTENRTLANDIFI PKEKLKGGKTGDKAIVKVTEWPDKAKNPIGQVIDILGVAGDNTTEMHAILAEFGLPYVYP KAVETAADKIPAEITPEEIARREDFRKVTTFTIDPKDAKDFDDALSIRPIKDGLWEVGVH 
IADVTHYVKEGGIIDKEAEKRATSVYLVDRTIPMLPERLCNFICSLRPNEEKLAFSVIFD ITEKGEVKDSRIVHTVINSDRRFTYEEAQQIIETKTGDFKEEVLMLDTIAKALREKRFAA GAINFDRYEVKFEIDEKGKPISVYFKESKDANKLVEEFMLLANRTVAEKVGRVPKNKKAK VLPYRIHDLPDPEKLENLSQFIARFGYKVRTSGTKTDISKSINHLLDDIHGKKEENLIET VSIRAMQKARYSTHNIGHYGLAFDYYTHFTSPIRRFPDMMVHRLVTKYMDGGRSVSESKY EDLCDHSSNMEQIAANAERASIKYKQVEFMSERLGQIYDGVISGVTEWGLYVELNENKCE GMVPIRDLDDDYYEFDEKNYCLRGRRKNKIYSLGDAITIRVARANLEKKQLDFALIEK >gi|226332248|gb|ACIC01000072.1| GENE 17 26061 - 27068 735 335 aa, chain + ## HITS:1 COG:PAB2145 KEGG:ns NR:ns ## COG: PAB2145 COG0451 # Protein_GI_number: 14520521 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Pyrococcus abyssi # 4 331 6 303 307 85 29.0 2e-16 MESILITGASGFIGSFIVEEALKRKFGVWAGIRPTSSKKYLKNRKIHFLELDFAHPNELR AQLSGHKGTYNKFDYIIHCAGVTKCPDKKTFDYVNYLQTKYFIDTLKALNMVPKQFIYIS TLSVFGPVREKDYSPIEAGDVPMPNTAYGLSKLKAELYIQSIPGFPYVIYRPTGVYGPRE LDYFLMAKSIRQHVDFSVGFRRQDLTFVYVKDIVQAIFLGIEKKVTRRAYFLTDGKVYNS RVFSDLIQKELGNPFVIHVKCPLIVLKVISLLAEFIATRSGKSSTLNSDKYKIMKQRNWQ CDITPAINELGYAPEYDLEKGVRETIDWYKNEGWL >gi|226332248|gb|ACIC01000072.1| GENE 18 27059 - 28024 596 321 aa, chain + ## HITS:1 COG:no KEGG:BT_3074 NR:ns ## KEGG: BT_3074 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 321 1 321 321 555 100.0 1e-157 MALDLFKRVETRKGLFAVEKITLIYNLLTSILILFLFQRMDHPWHMLLDRAMIAAMTFLL MYLYRLAPCKFSAFVRVAIQMSLLSYWYPDTFEFNRFFPNLDHVFAITEQFIFNGQPAIW FCHTFPHLLVSEAFNMGYFFYYPMMLIVTVFYFIYKFEWFEKMSFVLVTSFFIYYLIYIF VPVAGPQFYFPAIGFDNVSKGIFPAIGDYFNNNQELLPGPGYQHGFFYSLVEGSQQVGER PTAAFPSSHVGISTILMIMAWRGSKKLFACLIPFYMLLCGATVYIQAHYVIDAIVGFFSA FLLYVVATWMFKKWFAQPMFK >gi|226332248|gb|ACIC01000072.1| GENE 19 28027 - 29019 498 330 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148828154|ref|YP_001292907.1| ribosomal protein L11 methyltransferase [Haemophilus influenzae PittGG] # 1 287 1 284 326 196 38 2e-49 
MKISHIDLGEHPILLAPMEDVTDPAFRLMCKKFGADMVYTEFVSSDALIRAVSKTAQKLS ISDEERPVAIQIYGKDTETMVEAAKIVEQAQPDILDINFGCPVKRVAGKGAGAGMLQNIP KMLEITRAIVDAVKIPVTVKTRLGWDANNKIIVDLAEQLQDCGIAALTIHGRTRAQMYTG EADWTLIEEVKNNQRMHIPIIGNGDVTTPQRCKECFDRYGVDAVMIGRASFGRPWIFKEV KHYLETGEELPPLSFEWCMEVLRQEVIDSVNLLDERRGILHVRRHLAASPLFKGIPNFKE TRIAMLRAETKEELFRIFETIPEKVNSLLD >gi|226332248|gb|ACIC01000072.1| GENE 20 29157 - 30440 1312 427 aa, chain + ## HITS:1 COG:aq_1015 KEGG:ns NR:ns ## COG: aq_1015 COG0826 # Protein_GI_number: 15606313 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Aquifex aeolicus # 7 413 5 401 409 270 37.0 3e-72 MNIKDFEIMAPVGSRESLAAAIQAGADSIYFGIENLNMRARSANTFTVDDLREIAQTCDE HGMKSYLTVNTIIYDKDIPLMRTIVDAAKAAGISAVIAADVAVMNYARQIGQEVHLSTQL NISNAEALKFYAQFADVVVLARELNLEQVAEIYRQIREENICGPSGEQIRIEMFCHGALC MAVSGKCYLSLHEMNHSANRGACMQVCRRSYTVRDKETDVELDIDNEYIMSPKDLKTIHF MNKMMDAGVRVFKIEGRARGPEYVRTVVECYKEAIKAYLDDTFTDEKIAAWDERLKAVFN RGFWDGYYLGQRLGEWTRNYGSAATERKIYVGKGIKYFSNIGVSEFLVEAAEVSVGDKLL ITGPTTGAVFMTLDEARVDLKPVETVKKGQRFSMKSEKIRPSDKLYKLVSTEELKKFKGL DVEQKRG >gi|226332248|gb|ACIC01000072.1| GENE 21 30444 - 30848 525 134 aa, chain + ## HITS:1 COG:CC3234 KEGG:ns NR:ns ## COG: CC3234 COG0824 # Protein_GI_number: 16127464 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Caulobacter vibrioides # 7 118 13 126 147 59 34.0 1e-09 MSQYIYELEMKVRDYECDLQGIVNNANYQHYLEHTRHEFLTSVGVSFAALHEQGVDPVVA RINMAFKTPLKSGDEFVSKLYMKKEGIKYVFYQDIFRKSDQKVVVKSTVETVCVVNGRLS NSELFDNVFAPYLK >gi|226332248|gb|ACIC01000072.1| GENE 22 30890 - 31966 878 358 aa, chain + ## HITS:1 COG:FN1068 KEGG:ns NR:ns ## COG: FN1068 COG0758 # Protein_GI_number: 19704403 # Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Fusobacterium nucleatum # 69 356 10 285 288 
202 37.0 6e-52 MVPGIGHIGAKRLVEGMKSATDVFRFRKELAQRLSGVHERVSGALDCPSIIARAEQELVF LQKNHIQCLTFHDEAYPSRLRECEDAPVVLFYKGNADLNALHIINMVGTRHATDYGTQLC TTFLRDLKALCPDVLVVSGLAYGIDINAHRSALDNDLPTVGVLAHGLDRIYPSLHRKTAV EMLDKGGLLTEFPVGTTPDKHNFISRNRIVAGMCDATIVVESAAKGGSLITAELAESYHR DCFAFPGRVTDEYSKGCNQLIRDSKASLLLSAEDLVEAMGWTLDSHPAKVENVQRSLFLE LTEEEQKVVHTLEKQGNLQINTLVVETDIPVHKMSAILFELEMKGAVRVLAGGVYQLL >gi|226332248|gb|ACIC01000072.1| GENE 23 31996 - 33246 1142 416 aa, chain + ## HITS:1 COG:no KEGG:BT_3069 NR:ns ## KEGG: BT_3069 # Name: not_defined # Def: putative disulphide-isomerase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 416 1 416 416 815 96.0 0 MKRHFLVFMAAVCLLSVQNTRGQQMVPAQKISATPAAGTQMTLAQADGIAFRVLSFSEAL KRAEVEDKLLFVDCFTTWCGPCKRLSKVVFKDSLVADYFNRHFVNLKMDMEKGEGVELRK KYGVHAYPTLLFINSSGEVVYRLVGAEDAPELLKKVKLGVESGGLSGLKKRYEAGDRDLA FICGYINALSAANREQEAGKVAADFLQGREQKMLEDEDYFLVFYYYVHDVNSSAFQYVVN HQKEIADKFPRQGASLDRRLLEDWIAGSYAYLTVDESGHCTFDEQGLDAYVAKMKQMNVA EADMIGESLRLSRDGIMKQWDSFVKRGDKILASHTILGDEGHLLQWVNWLNKECADMSLR EKAAQWCDKACEELIKKNEEIKKNLPPGAIPAISMVDHKGQLLQVAEKLRKLMQQS Prediction of potential genes in microbial genomes Time: Thu May 12 01:03:49 2011 Seq name: gi|226332247|gb|ACIC01000073.1| Bacteroides sp. 1_1_6 cont1.73, whole genome shotgun sequence Length of sequence - 103331 bp Number of predicted genes - 63, with homology - 62 Number of transcription units - 37, operones - 15 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 13 - 58 7.5 1 1 Op 1 . - CDS 68 - 1123 908 ## BT_3067 hypothetical protein 2 1 Op 2 . - CDS 1137 - 3245 1371 ## BT_3066 hypothetical protein - Prom 3267 - 3326 7.0 + Prom 3193 - 3252 6.3 3 2 Tu 1 . + CDS 3353 - 4864 795 ## PROTEIN SUPPORTED gi|90021240|ref|YP_527067.1| ribosomal protein S32 + Term 4936 - 4967 1.8 + Prom 5234 - 5293 6.2 4 3 Tu 1 . + CDS 5335 - 5556 179 ## BT_3064 hypothetical protein + Term 5585 - 5625 9.0 + Prom 6431 - 6490 4.9 5 4 Tu 1 . 
+ CDS 6516 - 7013 570 ## BT_3062 hypothetical protein + Term 7072 - 7129 8.1 + Prom 7046 - 7105 7.3 6 5 Op 1 . + CDS 7285 - 8562 629 ## BT_3061 hypothetical protein 7 5 Op 2 . + CDS 8570 - 9532 718 ## BT_3060 hypothetical protein 8 5 Op 3 . + CDS 9583 - 10557 670 ## BT_3059 hypothetical protein + Term 10621 - 10682 13.0 + Prom 10663 - 10722 5.3 9 6 Op 1 . + CDS 10756 - 11598 497 ## BT_3058 transcriptional regulator + Prom 11633 - 11692 7.5 10 6 Op 2 . + CDS 11833 - 13359 1331 ## COG3119 Arylsulfatase A and related enzymes + Term 13379 - 13435 13.1 11 7 Tu 1 . - CDS 13431 - 14660 582 ## BT_3056 hypothetical protein - Prom 14698 - 14757 7.4 - Term 14749 - 14790 10.0 12 8 Op 1 36/0.000 - CDS 14821 - 15576 834 ## COG0479 Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit 13 8 Op 2 . - CDS 15613 - 17592 2329 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 14 8 Op 3 . - CDS 17630 - 18334 810 ## BT_3053 putative cytochrome b subunit - Prom 18367 - 18426 3.2 15 9 Op 1 . - CDS 18455 - 19330 674 ## BT_3052 transcriptional regulator 16 9 Op 2 . - CDS 19405 - 19569 79 ## gi|253569173|ref|ZP_04846583.1| predicted protein 17 10 Tu 1 . - CDS 20081 - 21637 1283 ## COG3119 Arylsulfatase A and related enzymes - Prom 21751 - 21810 4.6 + Prom 21836 - 21895 5.3 18 11 Tu 1 . + CDS 21962 - 22927 749 ## BT_3050 chitinase precursor + Term 22981 - 23040 5.9 19 12 Tu 1 . - CDS 22896 - 27026 2855 ## COG5002 Signal transduction histidine kinase - Prom 27144 - 27203 5.1 + Prom 27103 - 27162 6.8 20 13 Tu 1 . + CDS 27211 - 29898 1499 ## BT_3048 hypothetical protein + Prom 29976 - 30035 9.6 21 14 Op 1 . + CDS 30253 - 31554 1241 ## BT_3047 hypothetical protein 22 14 Op 2 . + CDS 31577 - 34672 2545 ## BT_3046 hypothetical protein 23 14 Op 3 . + CDS 34712 - 36694 1698 ## BT_3045 hypothetical protein 24 14 Op 4 . + CDS 36743 - 37630 793 ## BT_3044 hypothetical protein + Term 37708 - 37759 14.2 + Prom 37713 - 37772 8.6 25 15 Tu 1 . 
+ CDS 37879 - 39465 1094 ## BT_3043 putative xylanase + Term 39496 - 39538 -1.0 + Prom 39484 - 39543 3.9 26 16 Tu 1 . + CDS 39609 - 39986 169 ## COG3943 Virulence protein + Term 40216 - 40251 -0.6 27 17 Op 1 . - CDS 39907 - 40545 494 ## BT_3041 hypothetical protein 28 17 Op 2 . - CDS 40564 - 41571 811 ## COG3943 Virulence protein 29 18 Tu 1 . + CDS 42399 - 42854 293 ## BT_3039 hypothetical protein 30 19 Op 1 . - CDS 42952 - 44085 967 ## COG3712 Fe2+-dicitrate sensor, membrane component 31 19 Op 2 . - CDS 44102 - 44539 442 ## BT_3037 RNA polymerase ECF-type sigma factor - Prom 44603 - 44662 1.6 32 20 Op 1 . - CDS 44854 - 47436 2060 ## BT_3036 hypothetical protein - Prom 47456 - 47515 2.9 33 20 Op 2 . - CDS 47517 - 49103 1883 ## COG0793 Periplasmic protease - Prom 49123 - 49182 1.9 34 21 Op 1 . - CDS 49212 - 49667 283 ## PROTEIN SUPPORTED gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 35 21 Op 2 . - CDS 49664 - 51541 2028 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit 36 21 Op 3 . - CDS 51547 - 52683 1074 ## BT_3032 hypothetical protein - Prom 52703 - 52762 2.4 37 21 Op 4 . - CDS 52776 - 55262 1749 ## COG1520 FOG: WD40-like repeat - Prom 55320 - 55379 2.7 38 22 Tu 1 . - CDS 55393 - 57270 1963 ## COG0591 Na+/proline symporter - Prom 57345 - 57404 3.3 39 23 Tu 1 . - CDS 57531 - 58721 880 ## COG2152 Predicted glycosylase - Prom 58850 - 58909 4.0 40 24 Op 1 . - CDS 58911 - 60089 905 ## BT_3027 hypothetical protein 41 24 Op 2 . - CDS 60115 - 61677 1137 ## BT_3026 glycosylhydrolase, putative xylanase 42 24 Op 3 . - CDS 61713 - 63344 1496 ## BT_3025 hypothetical protein 43 24 Op 4 . - CDS 63378 - 66443 2614 ## BT_3024 hypothetical protein - Prom 66515 - 66574 2.5 - Term 66889 - 66937 10.1 44 25 Tu 1 . - CDS 66963 - 67880 839 ## BT_3023 hypothetical protein - Prom 67942 - 68001 6.4 - Term 68477 - 68513 6.8 45 26 Tu 1 . 
- CDS 68535 - 69782 971 ## COG4289 Uncharacterized protein conserved in bacteria - Prom 69871 - 69930 6.1 + Prom 69884 - 69943 5.0 46 27 Tu 1 . + CDS 70005 - 71573 1103 ## BT_3020 hypothetical protein + Term 71597 - 71644 5.0 - Term 71585 - 71632 0.4 47 28 Tu 1 . - CDS 71655 - 72641 920 ## COG0530 Ca2+/Na+ antiporter - Prom 72754 - 72813 7.7 + Prom 72586 - 72645 5.4 48 29 Tu 1 . + CDS 72768 - 74618 1480 ## COG0668 Small-conductance mechanosensitive channel + Term 74619 - 74658 -0.9 - Term 74679 - 74722 11.7 49 30 Op 1 . - CDS 74775 - 75707 968 ## BT_3017 acid phosphatase - Prom 75766 - 75825 3.1 - Term 75797 - 75855 1.4 50 30 Op 2 . - CDS 75875 - 78691 2917 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 78716 - 78775 3.6 - Term 78794 - 78827 4.3 51 31 Op 1 . - CDS 78878 - 81784 2163 ## BT_3015 hypothetical protein 52 31 Op 2 . - CDS 81793 - 82689 786 ## BT_3014 putative chitobiase 53 31 Op 3 . - CDS 82727 - 84661 1722 ## BT_3013 hypothetical protein 54 31 Op 4 . - CDS 84674 - 88060 3141 ## BT_3012 hypothetical protein 55 31 Op 5 . - CDS 88109 - 89017 632 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 89107 - 89166 13.5 + Prom 88977 - 89036 7.2 56 32 Tu 1 . + CDS 89212 - 89787 388 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Term 89854 - 89884 -0.9 57 33 Tu 1 . - CDS 90091 - 90219 70 ## - Prom 90243 - 90302 2.8 + Prom 90167 - 90226 8.9 58 34 Tu 1 . + CDS 90251 - 92842 1834 ## COG1472 Beta-glucosidase-related glycosidases + Term 92920 - 92956 5.4 - Term 92942 - 92973 -0.8 59 35 Op 1 . - CDS 93139 - 94329 904 ## BT_3008 hypothetical protein 60 35 Op 2 . - CDS 94367 - 95239 637 ## COG2173 D-alanyl-D-alanine dipeptidase - Prom 95434 - 95493 4.1 61 36 Tu 1 . - CDS 95507 - 98410 2128 ## BT_3006 hypothetical protein - Prom 98447 - 98506 3.6 62 37 Op 1 . - CDS 98537 - 101383 1511 ## BT_3005 hypothetical protein 63 37 Op 2 . 
- CDS 101371 - 102726 974 ## BT_3004 hypothetical protein Predicted protein(s) >gi|226332247|gb|ACIC01000073.1| GENE 1 68 - 1123 908 351 aa, chain - ## HITS:1 COG:no KEGG:BT_3067 NR:ns ## KEGG: BT_3067 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 351 1 351 351 699 99.0 0 MRKEKLYTGCLLLMALITGSCSEEENPEVRSATKPAEPYTSYYYYSYEAARQLSATDFAL AEGQFTPGPTYIKGDTLFVANIQSGAFSLELYDRNKNRHLASLKSWKYKEAEQKFPDKIE AIGISGNRLYLANISSRIDVFDVRTLEFITRIGTGNWGDGINQMVHSHAMAITPDGNIMI RTKKNLLVYREADVTLENYQKVPFYCRSKGDGMDVNNGFHSHQMVQDSTGIVYLADFGQY GNQKIQAIDTTLIEKGDQKILIDTEKTLPLSFNPCGIALHENRMILSAQNGSLFIYDRTK GNWENSFKSVKGYTFKKPEKLLVDQNTLWVSDVNAQKLVEVKIYKNEIREY >gi|226332247|gb|ACIC01000073.1| GENE 2 1137 - 3245 1371 702 aa, chain - ## HITS:1 COG:no KEGG:BT_3066 NR:ns ## KEGG: BT_3066 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 702 1 702 702 1466 99.0 0 MIFHRIFRFTYFLIFCIWGLQSCTKDDLIETPADTDSETTENPYGIIRIAEKDLTPDVFK LMLQDDEPATILFNNTQRGFRVNQPLQVSITEQQELFIRFYSPRPVKEVTVWATISGYEE AFQLAKFDVLPAFTEFHKELPMLTQSKRYITRSGKEIQIMANPHLSAADFKLEIECNDKY YQKLLSTKSKYNVRFSAYSQTGSWAYPLYPAHAREAVAMMLNYGYMFSSKEFAEELEKYR GKLHSDANKTVIDIDMLLKKVINHSGFVIGKVTTVDGLGGGETYGLNEWCFLEHYADDGA HTSATFHELGHCLGYGHSGNMTYEQTGTGWITLCATVYNKLCIEKKLPVYSRRFMHTRRY GKLYGSSKYNASRYIIEDPELDAIDGGLSPILKEEDEDTAQGTPLSCIITYKDIPQATES TFAPKDVCVYGNRIYIVNNASGNFSLEILEEQNGKLTHIKSLKEWTEGGATKGFAATPNG VTVAHGKIYVTNEQSRTDIFDEKTFELVATIGTGSWGEGSNQTVHAFDVLVHRGCVFIRD KKRVCVFIEDDIVPGKSFKNVPNYCRTSNMGEAMGTYGQTIGNDGLLYTTHQGNKKIYVF DLQAMREQVEWKAQRVINLTSYSPYDIAFIGKRMFVSFATDKNQPIALAEVNPETGTVIK DYTTVEGHTFSNVEKMSMARQTLFIVDRNAHTVTGIPVEKLN >gi|226332247|gb|ACIC01000073.1| GENE 3 3353 - 4864 795 503 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90021240|ref|YP_527067.1| ribosomal protein S32 [Saccharophagus degradans 2-40] # 121 447 28 339 408 310 46 1e-83 MNQTRCIVILSILALFVASVSVAGNISLPDASQKVFEGKPCINSPRAIGNYPASPFLFYI 
PTTGQRPMEWSAEKLPKGLKLDSKTGIITGSVASKGEYTVTLKAKNALGTSTEKLVIRIG DDLLLTPPMGWNSWNTFGRHLTEELVLQTADVLVANGMRDLGYSYINIDDFWQLPERGAD GHLQINKDKFPRGIKYVADYLHERGFKLGIYSDAADKTCGGVCGSYGYEEVDAKDFASWG VDLLKYDYCNAPVDRVEAMERYAKMGKALRGTGRSIVFSICEWGQREPWKWAKQVGGHLW RVSGDIGDVWNREANKLGGLRGILNILEINAPLSEYGGPSGWNDPDMLVVGIGGKSMSIG YESEGCTHEQYKSHFALWCMMASPLLCGNDVRSMNDSTLQVLLDRDLIAINQDVLGKQAE RSIRADHYDIWVKPLADGRKAVACFNRADTPRTIELNSKTVEDLSLEQVYSLDSRSMENA ANNIMVDLAPYQCKVYICGKPKK >gi|226332247|gb|ACIC01000073.1| GENE 4 5335 - 5556 179 73 aa, chain + ## HITS:1 COG:no KEGG:BT_3064 NR:ns ## KEGG: BT_3064 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 73 1 73 73 153 100.0 2e-36 MRGILSFAALAAMYFPGCTTSKNAVHCLSRWIKGCNPLVKELIPTGYLPYNHRFLTSKQY KIITKHLGDPYED >gi|226332247|gb|ACIC01000073.1| GENE 5 6516 - 7013 570 165 aa, chain + ## HITS:1 COG:no KEGG:BT_3062 NR:ns ## KEGG: BT_3062 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 165 1 165 165 291 98.0 5e-78 MALEYVVTKRVFGFDKDKNEKYVAKSVRSGRVSFSKMCGKVSRLCGVHRKVVDLVVSGLV DMMAEDIDDGKSVQMGEFGIFSPTIRAKSADDEKEVSSKSIVQRKILFYPGKIFKTTLDD MSITRRAELDTDYTDGSSNNGSGSVKPNPGGDGGKGDEEAPDPTV >gi|226332247|gb|ACIC01000073.1| GENE 6 7285 - 8562 629 425 aa, chain + ## HITS:1 COG:no KEGG:BT_3061 NR:ns ## KEGG: BT_3061 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 425 1 425 425 862 99.0 0 MKHGIYTCVILMAFLALSESVYAVVPVFTIASDTIRRVVIYFDQEETDVDSCYKTNNQEI NVLDSLLEERINSRYITALNVVTFVSPDGDSVYNAALSVRRNNSMREFLRQRYPYVDVEK IKLTSEGEDWSELRKLVASDSDLPDREEVLMLIDYHRDNIVKRKELLQKLNQGMAYRYIV RNVLPKLRRAEITVVRKVPEMEKNIFEPVSSVSGLFVSKQEEALPSDQPDKPEGESKESM AYEVVVSEAEGPVESKTVLAVKNNLLYDLALAPNLEVEIPVGKRWSLNAEYKCPWWLNSK HDFCYQLLSGGVEGRCWLGNRQKRNRLTGHFVGLYAEGGIYDFQWKGDGYRGDYYGAAGV TYGYARQLARHLSLEFSFGIGYLTTEYKKYTPYEGDIVWTTSGRYNFIGPTKAKVSLVWL IKRGR >gi|226332247|gb|ACIC01000073.1| GENE 7 8570 - 9532 718 320 aa, chain + ## HITS:1 COG:no 
KEGG:BT_3060 NR:ns ## KEGG: BT_3060 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 320 1 304 304 545 100.0 1e-154 MQDTRYILLVLLSSLLMLTGCSRRELLDDYPVSGVDIKLDWNGVTDKLPEGVRVIFYPKS AEGRKIDTYLSVRGGKVKVPPGRYSVVAYNYDTETVQIRGEEAYETIEAFTGYCNGLGIA GTEKMVWGPDPLYVVQIDDLHIANSEEELLLDWKPKLVVKTYFFKIKVEGLEYVSSIVGS VEGMAGCYCLGRCCGMMCDAPIYFEAQGRSGEVTGSFTAFGMPEAAISRAGDKIKLTLAF IKVDKTVQKAEIDITEVVTNSESGGSGGGNVDEPPTEIELPLDEKVKVEKPTTNPDGGGG GIGGDVGDWGDEDEVVLPVE >gi|226332247|gb|ACIC01000073.1| GENE 8 9583 - 10557 670 324 aa, chain + ## HITS:1 COG:no KEGG:BT_3059 NR:ns ## KEGG: BT_3059 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 324 1 324 324 561 100.0 1e-158 MKKILLAIAAVATITGCSQSEEFENPGQKAEINFSTAVTRATALDTEGLQNAGFQVYAYN TGTTEMSTTAVLPSSAWIANKATYASTWSVAGGPYYWPLDENLQFFAYSPSDGVTYTGPD GTIAGYPKFTYKVGDTAADQKDLVISSVLNQTKETDKTAPVVSLNFKHALTQINVQVIKG STDYTYEFAADNAVTISGIKGEGTFAYTGADTGDWTPEGTEATYTYTLGDFTDGTSAIVP GNALMLIPQDITDAKITIRYNTKKGDSYIFKGSKEIPLKSTTAWAFGTKILYKLTLPVGG EEVKIEATAAKWDTEKPETTEPSV >gi|226332247|gb|ACIC01000073.1| GENE 9 10756 - 11598 497 280 aa, chain + ## HITS:1 COG:no KEGG:BT_3058 NR:ns ## KEGG: BT_3058 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 280 1 280 280 569 99.0 1e-161 MSSICQGTWDCQNCPKAVDNAIAHVIHPRGYHQPVRKCEENFILFLIRGEVLVNSQEYAG TMLRAGEFILQAIGSKFEMLAMTECECIYYRFIQPELFCDFRFNHIMKEVSSPLIYSPLK IIPELKYFLDGSKSYLSESKVCRDLLSLKRKELAFVLGYYYSDYDLASLVHPLSKYTSTF QYFILQNYKKVKTVEELAQLGGYTVSTLRRIFNNVFHEPVYEWMQARRKEGILDELLNTN NSISEICYKYGFESLPHFSNFCKKFFGASPRMLRTNKKTI >gi|226332247|gb|ACIC01000073.1| GENE 10 11833 - 13359 1331 508 aa, chain + ## HITS:1 COG:CC1172 KEGG:ns NR:ns ## COG: CC1172 COG3119 # Protein_GI_number: 16125424 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Caulobacter vibrioides # 24 504 29 512 521 166 29.0 8e-41 
MRTKRAFKEFCVLGGLASSVCGVAQERPNIIVFLVDDMGLMDTSVPFIADESGQPVRHPL NDWYHTPNMERLAKQGICFSTFYAQSVSSPSRASIMTGQNATRHGVTNWINAESNNRNPF GPPQWNWKGLRKDMPTMPRVLQQAGYKTIHVGKAHFGCMGSEGENPLNIGFDVNIAGSGI GHPGSYYGEWGYGHIKGQKIRAVPDLEKYHGTDTFLSEALTIEANREITKAVEEKRPFYL NMAHYAVHSPFQADKRFLSRYTDPDKNEQARAFATLIEGMDKSLGDIMDQLEKLGIAENT LILFLGDNGGDAPLGDERGYGSSAPLRGKKGTEFEGGMRVPFIAAWAKPEKKSKVQKNLP IEVGSMQTQLGTIMDIYPTVLSVAGCEVPQNYVIDGFDLKKQLSGKVDKKRPESFLMHFP HAHRGSYFTTYRMGDWKLIYYYLPETPKQPKALLYNLKDDPEERNELSAAHPDQCREMIR EMSARLEKEGALYPVDKQGNELKPFVYF >gi|226332247|gb|ACIC01000073.1| GENE 11 13431 - 14660 582 409 aa, chain - ## HITS:1 COG:no KEGG:BT_3056 NR:ns ## KEGG: BT_3056 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 409 1 409 409 804 100.0 0 MNKLLLIWIVVLLTSCITNQENARDTLFVNLTDATDSQSLKLSDFGNTVRYVPLETNEVC LIGNNPHIALLEDKIVIATKDQCFLFHKQTGKYICSIGHIGDDPSGYSSTNYWIDDAGLF YFFRAPDQLLKYNQKGEMTGKIQIPHIPAAPDFFAFSDSTIIAHCNSSIGLESSNSLLFL NSFGEKLDSIPNLLKSPSIPSPDNISSLSIIKRGAGFFGNLGRRGVMIIKDKEQGELLLP LYNPSLWSSENEVRFRETFTDTIYTIKNRKLYPYMVFHTDTDHSSANPLWYSNPQSIYVA YVLENTHSVFFQYVKNKQVYNGLYNKETQQTKFAKCQQFIVDDLTSEQELKVDLSALCSY KGEYGFILEAASIPDASQKQSDNKNENIPHWMSQLDEDANPVIAIVSDK >gi|226332247|gb|ACIC01000073.1| GENE 12 14821 - 15576 834 251 aa, chain - ## HITS:1 COG:Cgl0368 KEGG:ns NR:ns ## COG: Cgl0368 COG0479 # Protein_GI_number: 19551618 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit # Organism: Corynebacterium glutamicum # 5 243 1 238 249 234 46.0 1e-61 MDKNISFTLKVWRQAGPKAKGAFETYQMKDIPGDTSFLEMLDILNEQLISERKEPVVFDH DCREGICGMCSLYINGHPHGPATGATTCQIYMRRFNDGDTITVEPWRSAGFPVIKDLMVD RTAYDKIMQAGGYVSVRTGAPQDANAILIPKPIADEAMDAASCIGCGACVAACKNGSAML FVSAKVSQLNLLPQGKPEALRRAKAMLSKMDELGFGNCTNTRACEAECPKNISISNIARL NRDFIIAKLKD >gi|226332247|gb|ACIC01000073.1| GENE 13 15613 - 17592 2329 659 aa, chain - ## HITS:1 COG:Cgl0367 KEGG:ns NR:ns ## COG: Cgl0367 COG1053 # Protein_GI_number: 19551617 # Func_class: C 
Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Corynebacterium glutamicum # 4 658 28 673 673 607 48.0 1e-173 MIKIDSKIPEGPVAEKWTNYKAHQKLVNPANKRRLDIIVVGTGLAGASAAASLGEMGFRV FNFCIQDSPRRAHSIAAQGGINAAKNYQNDGDSVYRLFYDTVKGGDYRAREANVYRLAEV SNAIIDQCVAQGVPFAREYGGTLDNRSFGGAQVSRTFYAKGQTGQQLLLGAYSALSRQVN VGTVKLFTRYEMQDVVIIDGRARGIIAKNLITGELERFAAHAVVIATGGYGNAYFLSTNA MGCNCTAAISCYRKGAVFANPAYVQIHPTCIPVHGDKQSKLTLMSESLRNDGRIWVPKKK EDAIKLQKGEIKGSDIPEEDRDYYLERRYPAFGNLVPRDVASRAAKERCDAGFGVNNTGL AVFLDFSEAINRLGIDVVLQRYGNLFDMYEEITDVNPGELAKEISGVKYYNPMMIYPAIH YTMGGIWVDYELQTTIKGLFAIGECNFSDHGANRLGASALMQGLADGYFVLPYTIQNYLA DQITVPRFSTDLPEFAEAEKAVQAKIDKFMSIQGKESVDSIHKKLGHIMWEYVGMGRTAE GLKKGIAELKAVRKEFETNLFIPGSKEGMNVELDKAIRLYDFITMGELVAYDALNRNESC GGHFREEYQTEEGEAKRDDENFFYVACWEYQGDDEKAPVLIKEPLVYEAIKVQTRNYKS >gi|226332247|gb|ACIC01000073.1| GENE 14 17630 - 18334 810 234 aa, chain - ## HITS:1 COG:no KEGG:BT_3053 NR:ns ## KEGG: BT_3053 # Name: not_defined # Def: putative cytochrome b subunit # Organism: B.thetaiotaomicron # Pathway: Citrate cycle (TCA cycle) [PATH:bth00020]; Oxidative phosphorylation [PATH:bth00190]; Benzoate degradation via CoA ligation [PATH:bth00632]; Butanoate metabolism [PATH:bth00650]; Metabolic pathways [PATH:bth01100]; Biosynthesis of secondary metabolites [PATH:bth01110] # 1 234 1 234 234 385 100.0 1e-106 MWLSNSSVGRKVVMSVTGIALVLFLTFHMAMNLVAIISADGYNMICEFLGANWYALVATA ALAALFIIHIIYAFWLTMQNRTARGSERYAVVDKPKTVEWASQNMLVLGIIVIVGLGLHL FNFWAKMQLPELMHNLDMHADTLTLAYAANGAYHIQQTFSCPVYVVLYLIWLFALWFHLT HGFWSSMQSLGWNNKVWINRWKCISNIYSTIVVLGFALVVVVFFVKSLLCGGAC >gi|226332247|gb|ACIC01000073.1| GENE 15 18455 - 19330 674 291 aa, chain - ## HITS:1 COG:no KEGG:BT_3052 NR:ns ## KEGG: BT_3052 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 291 26 316 316 614 100.0 1e-175 MSKDSLCAQHTDCSGCPKAMESTLIHRNLLRGQHIPKDKCTQNCMLFMIEGELLINSDEH 
PGTTLHEKQFILQAIGSKYEILALTDVSYILFWFNELPLLCENRYREIIDQAEGPLTYTP MVMNPRLFYLINDVAAYLDESTPACGKYIDLKCQEMVYMITNYYPLPQLRAFFYPISTYT ESFQYFVMQNYSKAKNVEEFAHLGGYNTTTFRRLFRNLYGVPVYEWMLEKKREGILEDLQ HSNMRITEICNRYGFDSLSHLAHFCKDSFGDTPRALRKKAANGEKIGKITD >gi|226332247|gb|ACIC01000073.1| GENE 16 19405 - 19569 79 54 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253569173|ref|ZP_04846583.1| ## NR: gi|253569173|ref|ZP_04846583.1| predicted protein [Bacteroides sp. 1_1_6] # 1 54 2 55 55 70 100.0 3e-11 MKLLVLNISCKGMKKMLQMADKNVLKNAIQESVKLQSCANKTNKDNHIPVSAKT >gi|226332247|gb|ACIC01000073.1| GENE 17 20081 - 21637 1283 518 aa, chain - ## HITS:1 COG:mll7612 KEGG:ns NR:ns ## COG: mll7612 COG3119 # Protein_GI_number: 13476324 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Mesorhizobium loti # 25 484 3 435 509 128 26.0 2e-29 MNKLVLTGLLAAGTMTHLQGAQPAGQRPNILFILSDDHTSQAWGIYGGVLAEYAHNANIR RLAKEGVVLDNCFCTNSISAPSRASILTGLYSHRNRLYTLADSLDTSIPTLATLLQANGY HTGLVGKWHIQSQPQGFDYYSIFYDQGEYRDPTFIESTDPWPGNHQFGERVLGFSTDLVT EKAIRWMKEQDGNQPFLMCCHFKATHEPYDYPIRMEHLYDGVTFPEPENLLDWGPETNGR SFKGQTLEELERRWRIASQDPDKWWCRYPGLPFSTEGMQRTAARRASYQKFIRDYLRCGA TVDDNIGKLLNALDEMNIADNTIVIYVSDQGYFLGEHGFFDKRMFYEESARMPFVIRYPK KVPAGKRLDDLILNVDFAPTLAEFAGVKMENVQGDSFVSNLEGNTPTDWRKEIYYRYWTN HAIRPAHFAIRSDRYKLIFYYARNLDMTDTENFDFTPSWDFYDLQNDPHENHNAYNDPKY APVIKQMKKDLLRLRKETGDTDEKYPEMQKLLEKYYNK >gi|226332247|gb|ACIC01000073.1| GENE 18 21962 - 22927 749 321 aa, chain + ## HITS:1 COG:no KEGG:BT_3050 NR:ns ## KEGG: BT_3050 # Name: not_defined # Def: chitinase precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 321 1 321 321 653 100.0 0 MLLPRKIVFFLLLFVGLSLYAQQNCFISSYVRGNFYNRGRISAESLRASDDLIFLNVRPN KDGSLSFENPRIFENGKGVTSWEELIKSVRADVKGTKTKLRLGAASGEWKAMVADEAARV AFAKNIKKILEKNKLDGIDLDFEWAETEKEYEDYSLAILKMREVLGDKYVFSVSLHPVCY KISKEAIAAVDFISYQCYGPSPVRFPMEKYCSDIQMALAYGIPKEKLVAGVPFYGVTKDG SKKTEAYFSFVQDGLITGPAQNEVTYKGDVYVFDGQNNIRAKTRYAMDQQLKGMMSWDLA TDMPLNDSRSLFKTMVEELGR 
>gi|226332247|gb|ACIC01000073.1| GENE 19 22896 - 27026 2855 1376 aa, chain - ## HITS:1 COG:BH4026 KEGG:ns NR:ns ## COG: BH4026 COG5002 # Protein_GI_number: 15616588 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 839 1067 369 597 607 128 32.0 1e-28 MKTYIAAFFLLLSATFVAAHPYLIQRLGIEQGLSNNYVLSITQDKQGFLWFATEEGLNKF DGTRFITYYKEEQSSSVQSITGNELNEVYTDPVQPVIWIATQRAGLNAYNYETQSFSVYQ YNPEDPQSLITNDVTHITSSVQAGKGLWVCTYYRGIEYLDIATGKFTHYNKSTVPALPSE QTWTATEAEDGKLYIGHVEGGLSILSLNDKSVKHFVHDPQNPNSLPGNDVRCIYKDTNGN IWIGTSKGLALFNANTETFTNFHNNPGNIHGALSSYIFSIKQLKDNKLWIATELNGIMIL DLQQNQFLLPEQIRFEFIREGDNNYSLSNASARYIFQDSFNNIWIGTWGGGINFISNAPP AFHTWSYSPTQMNESSLSNKVVSSVCDDGQGKLWIGTDGGGINVFENGKRVAIYNKENRE LLSNSVLCSLKDSEGNLWFGTYLGNISYYNTRLKKFQIIELEKNELLDVRVFYEDKNKKI WIGTHAGVFVIDLASKKVIHHYDTSNSQLLENFVRSIAQDSEGRFWIGTFGGGVGIYTPD MQLVRKFNQYEGFCSNTINQIYRSSKGQMWLATGEGLVCFPSARNFDYQVFQRKEGLPNT HIRAISEDKNGNIWASTNTGISCYITSKKCFYTYDHSNNIPQGSFISGCVTKDHNGLIYF GSINGLCFFNPDIAINSPQIPPVVITKVRIPGRLTSREKNETAIPISEGEIELTHEQNSF NLTFNVQDYSLANQVEYAYMLKGLENSWYTINEQNSVTFRNIPPGKYEFLVKARLHNQDW SEDTTSLRIHINPPLWLTWWAKLIYILITISIIYTIIHAYKKKIDLESLYTLEKKNHEQE QELNQERLRFYTNITHELRTPLTLILGPLEDMQKDTSLPTRQAQKLSVIHLSALRLLNLI NQILEFRKTETQNKKLCVCKGNIVPLIHEIGLKYKELNQKKAIDFQIQIEKEEMPLFFDK EIITIILDNLISNAIKYTEQGKITLSLYPTTRNGVTYTEIKVSDTGYGISAEALPHVFDR YYQESGKHQASGTGIGLALVRNLVELHEGEIRVESIQNEGSTFYISLLTDNIYPNALHGD STKQTEEEMISEAVPEDSQNTEPETSKPILLIVEDNEDIQKYIAESFSDSFEVITGSNGE EGKQQALNRIPDIIVSDIMMPVMDGITLCRQLKEDVRTSHIPVILLTAKDSLQDKEEGYE VGADSYLTKPFSASLLRSRINNLLDSRKKLIAQFQQTGTNHNPNSHLDEKRSIITEALSK LDNEFIEKITQLVEDNLSSEKIDITYLSDKMCMSGSTLYRKMKALTGLSTNEYIRKVKMQ NAERLLLEGKYNISEIAYKIGMNSTGYFRQCFKDEFGVSPSDYLKQFTSPTPPPSS >gi|226332247|gb|ACIC01000073.1| GENE 20 27211 - 29898 1499 895 aa, chain + ## HITS:1 COG:no KEGG:BT_3048 NR:ns ## KEGG: BT_3048 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 895 1 895 895 1833 99.0 0 
MESDSLHEHDKSWKAFFITTLFILCTANVYAQSSDTIFSRYFRYAQNFADAYPREKVSLH LDNASYYLGDTIWFKAYVVTAEQNLPTTISKPLYVELLDQLGNVVERQIIQLTDGEGTGQ IILNNTFFTGYYEMRAYTKWMLAFDNPSCFSRVLPVYRKRLSDEETPRSIATYRMDASMK QRPKDKEKKFTVRFFPEGGQLVKGISSIVAFEATSRDKGAADVEGTVVLPSGEELAHIRS LHDGMGYFEYKPEEKAGVAKIDYEGSTYQFDLPEALPQGYVLRIDNRREMLDITVARSSQ AMKDTLAVFVSSQGRPYKCMTLDFEDELNCQFRISTKELPPGVQQISLVNLKGETLCERF CYVMPRSSMLLACKTDHALYRPFEPVTCRIKVRDHLDRPVQANLSVSIRNGVESDFREYD HSIYTDLLLVSDLKGYIHQPGFYFENQSAERFKMLDVLLLVRGWRKYDLSRLIGKRPFLP RYLPETSLTLYGQVESYFGKALRNVGVSILARRDSVSIAGMTKTDSLGYFSAPVDGFSGS MDALIQTRNEGKKWNKQAVVKLFRNFEPSLRKLDYYELNPEWKEAGDLKQLLDTLDIAYK DSVFGPDHHLLDEVVVNAKRLNLLLKQTERFEKEILGYYNITQVVDKMRDKGEAVYNLPM LLKELNPNFRLSDSLSLHYNNSRVLFIVNGGVLSYGKTDYVLDKDVDAIKSIMLYYDQAG GESVFVMNKQSNRVTKFTANNFWSGRWQDGDLSDLSLQDAIGADSGPDALWGEKDRKTMK KGPLQKSSVVVCSITTIDDWDPNKTYKARRGIRHTYIQGYNEPLEFYSPAYPNGAPLYTE DSRRTLYWNPNVKTNEKGEAVIRCYNSDNSAPLIINVETLYKGSPSSLNIYSIDY >gi|226332247|gb|ACIC01000073.1| GENE 21 30253 - 31554 1241 433 aa, chain + ## HITS:1 COG:no KEGG:BT_3047 NR:ns ## KEGG: BT_3047 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 433 1 433 433 856 100.0 0 MKRLFLPLLGFLSLVLMCCNDNKESGTDSYDPNKPVTIDYFTPDTGRVAEKVIIIGNNFG NDKSAVTVLFTDNMEKENKAAVVGVDNTTIYCMAPRQAEGFNKITVKVNDQTVVADETFK YSVSENVSTIAGDFVQADGAGKDGSLAEATFGQMFGVACIDEQSGIVGQAWNNNSVRYIS IDEDAVITIQSGPVVGKVAVSNDKTLAYGATINGANTVYVYKKSQAWVPSRLSDISSGTD VWAVALNGDNQWLYYVTANGRLGRIEVDNPTHNEILFEHNDFSGGPFAYIAYSAFEDCFY VTTDKNKILKVWVKEDGTHAYEQINQNNTGTTDGFLADEAKFQYLRGLTIDEDGNIFVCQ GDNSHVIRKIAYDENMEKRYVTTVLGTSTVHGTDDGSPDIALFNGPQDISYDGNGGYWIA MRYEPALRKYSIE >gi|226332247|gb|ACIC01000073.1| GENE 22 31577 - 34672 2545 1031 aa, chain + ## HITS:1 COG:no KEGG:BT_3046 NR:ns ## KEGG: BT_3046 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1031 1 1031 1031 1998 99.0 0 MKKIISILFIICSLFSSMAYAQGGLTYSGTVIGEDGFELIGVSVAVKGTSIGVITDLDGK FRLNNIPPKSTIVFSYVGFENKEVKITESKERETIVLKQSISSLDEVVVVGRGSQRKVSV 
VGAITSVDPAQLQVPATSVSNMLQGRVPGIIGVTRSGEPGNNFSEFWVRGISTFGANASA LILIDGVEGDLNTLDPADIESFSVLKDASATAVYGVRGANGVVIVTTKRGKAGKLTVNFK TNATYSYSPRMPEYVDAVGYANLANEARVVRGKNPIYTNSEIQLFRSGLDPDLYPNVNWR DVILNDYVIDNQHHLSLSGGGTNARYYVSMGIMNQGAVFKQDKSVSKHNTNVDYHQYNFR ANVDANMTKTTLLSLNMETIITKRNSPGYGDNNNALWSAQANLPATIVPVKYSNGTLPSF GRNSDEVSPYVQLNYTGYKISETQATRMNASLNQKLDFITKGLSARGVFSFNYTGYFDQY RTKTPDLYYATGRKQNGALISKRTSSATDMTYSDYRRIARQYYFEGNINYERIFGEDHRV TGLLNAYRQENKDSYLNNDDDDTTLDERIRSIPKRYQAISARATYSYKDTYMFEFNVGYT GSEQFKKGNRYGWFPAVALGWVPTQYEWTREKLPFLDFFKLRASLGKVGNDRIKKIRFPY LSTVSTDYSSVTWPGSTVGESRVGANDLNWEETTKMDIGVDLKMFNDKVELTADIFKDKN KGIFQQRANIPEEAGFVTNPYTNIGGMESWGFDGNITFNHNFTKDFGMTVRGNWTFARNN VTYWEQSGVVYPYQSFVDVPYGVQRGLISLGLFKDQADIESSPVQTYSSDVRPGDIKYKD VNGDGVINDDDIVPLSYSNTPTLQYGFAAEFRYKQFTASIFFEGVGKVQYFYGGSGYYPF AWETRGNVLNIVADQKNRWTPREYSGDASTENPNARFPRLTYGENKNNNRASTFWLADGR YFRLKNIELSYRFTSAWLRDKISVQNATLSLVGENLHVWDKVKLWDPEQASSNGAVYPLQ RKFTLQLNVTF >gi|226332247|gb|ACIC01000073.1| GENE 23 34712 - 36694 1698 660 aa, chain + ## HITS:1 COG:no KEGG:BT_3045 NR:ns ## KEGG: BT_3045 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 660 1 660 660 1348 100.0 0 MNFRKIYCIAAMALMCCTGSMFTSCDDYLDVDSYFDEIFELDSVFKRKEYLEEYINGAGK LLPNEGDLWTNAWSPYQGASDENFTSWNDSRHKAIQLMVDEVTPQSDFYNNYGTWYKGIR KANLVLERINECEDITTSDLRDFMGRCYFLRAYFYYKLVEAYGPVPIVPEMAYDVDASAE SMSLERETYENCINYICENFEKAYEYLPSSRTSTLVNLPTSGAALALMGRVRLIEASPWY NGNEFYADWKRSDGTNFMPQVKDESKWGTAALLAKRLIKGSEAGSFKYKLHTVERKLDTK PLPENVPDENYPNGAGGIDALRSYAFMFNGETPAYNNDEFIYMCGYSSTAGDSPAWIATP TSLGGGNGLNITYATVKAFRMEDGSDINNSPLYPTNYWEAIGGSSQSFSDYTLPSDAAKM FDKMEMRFYASVGFNHCYWSGLSYIGTEGNQTKQTVTYYANGTAAPSSDHPEDYNHTGFT CKKYIHYVEDQLKGMMKSKTFPIMRYAEVLLNYVEALNELGSNTYTDEENGITVSRDVDE MVKYFNMVRYRAGLPGITADDARDVTTMRDLIKRERRVEFFCEGRRYHDLRRWGDAIDAY NEPVTGMNIAARSTDRKAFHTETVINNTRSHRVFSYKDYFLPIPRTTMDKNPKLVQNPGW >gi|226332247|gb|ACIC01000073.1| GENE 24 36743 - 37630 793 295 aa, chain + ## HITS:1 COG:no KEGG:BT_3044 NR:ns ## KEGG: BT_3044 # Name: 
not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 295 1 295 295 520 100.0 1e-146 MNINKILYPILLLLVVLSSCDYESPLDKEQYKKTLYLIGASSNLVTKELAYSSEVQEGFI TVGVSGSLLIDGNVDVTLESHNSVIDWYNKKFKFLATDIKYQGLATSAFEVPSYTTTLKA GETYARVPFKVKTEGLECDSLYAITFKIASSSAGYEINQKDSALIVSFNLVNKYSGNYIF KALRHELDSNDEIVASTSITIVRTLKATEENAVRFYNEQQAETTANIKKHAVVLKVDAGN NVTISAWEDFDLIDGTCTYNQNSKVFNVDYKYTADGKTYQMVGTFTYQDEDAGSN >gi|226332247|gb|ACIC01000073.1| GENE 25 37879 - 39465 1094 528 aa, chain + ## HITS:1 COG:no KEGG:BT_3043 NR:ns ## KEGG: BT_3043 # Name: not_defined # Def: putative xylanase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 528 1 528 528 1098 98.0 0 MNYKLLSVALAFLYIGIATSCSQPAPDLRYQIEVDKPLQTMEHFGASDAWSMHILGLWPQ EKQNQIADWLFSTENDANGKPKGIGLSLWRFNVGAGSTEQGEASQIGSSWMRTECFMNAD GTYDWNKQQGQRNFLKLAKERGVTKFLAFLNSPPVYYTQNGLATNTGRGGTANLKPECYE KYARFLADVVEGVEKHDGIKFNYICPFNEPDGHWNWVGPKQEGSPATNREVARTVRLLSR EFVNRKMDTQIMVNESSDYRCMLRTHQTDWQRGYQIQAFFCPDSVDTYLGDTPNVPRLML GHSYWTTTPLSELRAMRCQLREALDKYNVGFWQSETCIMGNDEEIGGGHGFDRTMKTALY VARIIHHDIVYAGAKSWQWWRAIGGDYKDGLIREYTNDDLKDGRVEDSKLMWALGNYSRF IRPGAVRLSVSAFDQAGNLIPGGDTDQKGLMCSAYQNADGSYAVVLINYAQEDKEFSINK INGKKTRWQVYRTSDVEGEDLLPVEKVKSGRTVRIPARSIITLLNQSL >gi|226332247|gb|ACIC01000073.1| GENE 26 39609 - 39986 169 125 aa, chain + ## HITS:1 COG:STM3755 KEGG:ns NR:ns ## COG: STM3755 COG3943 # Protein_GI_number: 16767039 # Func_class: R General function prediction only # Function: Virulence protein # Organism: Salmonella typhimurium LT2 # 5 74 12 81 345 58 34.0 2e-09 MDNRGNIVIYQTKDGKTSIDVKLEDETVWLTQAQMAELFQTDRTSIVKHVNNIYKSEELV KDSTCAKIAHVQIEDISHCFSCSDAASLTLIEAIFFCLSLILRCLLLPAYHNTPASEHKP DSDEY >gi|226332247|gb|ACIC01000073.1| GENE 27 39907 - 40545 494 212 aa, chain - ## HITS:1 COG:no KEGG:BT_3041 NR:ns ## KEGG: BT_3041 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 212 1 212 212 388 99.0 1e-107 MAIQFELYKSPNPKDEEDKELYHARVVNFQHIDTDYLAKEIQQATSLTEGDVKAVLESLS 
HFMGSRLREGERVHLDGIGYFQVKLNSLEPITSPKLKANQMKLKANIGFKADKKLRSSVS VVKVERSKLKLHSVPRSNEEIDRLLTAYFSNNQILTRSDFQGLCKLTLTTAARHIKRLKE EKKLQNINTRQSPVYVPMPGYYGKPEVEDTVK >gi|226332247|gb|ACIC01000073.1| GENE 28 40564 - 41571 811 335 aa, chain - ## HITS:1 COG:NMA1039 KEGG:ns NR:ns ## COG: NMA1039 COG3943 # Protein_GI_number: 15793995 # Func_class: R General function prediction only # Function: Virulence protein # Organism: Neisseria meningitidis Z2491 # 11 335 3 332 336 308 54.0 1e-83 MDEIEKNNGEIIIYRTEDGRTQLEVRLENENVWLSQQQIANLFGVQRPAITKHLKNIFES GELEENSVSSILEHTASDGKNYKTQFYNLDAIISVGYRVNSLQATHFRRWATERLKEYLI KGFAMDDKRLKEMGGGGYWYELLNRIRDIRSSEKVLYRQVLDLYATSVDYDPKADESIRF FKIVQNKLHYAAHGHTAAEVIFERANAEKPFMGLTTFPGEQPRKEDVLIAKNYLNEKELK ILNNLVSGYFDFAEIQAIKRSPMYMSDYIHHLDLILSTTGEQVLQNAGTISHEQAKQKAL GEYQKYHVKTLSPVEEAYFDSIKKLTAETKKKKKK >gi|226332247|gb|ACIC01000073.1| GENE 29 42399 - 42854 293 151 aa, chain + ## HITS:1 COG:no KEGG:BT_3039 NR:ns ## KEGG: BT_3039 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 135 1 135 142 246 97.0 2e-64 MNDSTCPRPCLMKLDLQSSTNKLAFLKDNWPSFGQIESIDRLSETELRCTLCLLDVVLAA LAKDECFCPNREIIRLVLTRTYVQNRCELCETEEIRSKLVKGFCTREKKNGLSRKNNIRR RGISVFYGAILRMLSLFSSFRKEENRDTTDF >gi|226332247|gb|ACIC01000073.1| GENE 30 42952 - 44085 967 377 aa, chain - ## HITS:1 COG:AGl2871 KEGG:ns NR:ns ## COG: AGl2871 COG3712 # Protein_GI_number: 15891547 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 172 326 128 278 331 68 33.0 1e-11 MEEKEKRISKEEDSGYGLTADNDSCRLADGDSSEPTDSNSCEPADSDSCEPADSDNCEVA TDDQPPFSENAEIRRLQNAFGEALGDLPLPDEVREEWSTFLQRQAQQKRKLYLRICLSGG VAAAIALLLLLWSPWHITDKDSILQNIEIFTALHAPEQITTIEENGRIIVSTPPATTIRL TLEDGSHVLLSANSRLEYPKEFSSQGSRTVNLTGEARFEVTKDAHRPFIVSADKMQTQVL GTVFDVNAYPGNAPAVTLYQGRVKVGKAASPIEKEIVPGQCATLTTSGDIRLAKATRTEK EGWTKDEFYYDNTEMITVLQNIGTWYNISVICHSADLLHKRVHFRFSRNVPIKTLLNVLN 
DLGIAHFQYKDKQIVVE >gi|226332247|gb|ACIC01000073.1| GENE 31 44102 - 44539 442 145 aa, chain - ## HITS:1 COG:no KEGG:BT_3037 NR:ns ## KEGG: BT_3037 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 145 25 169 169 275 100.0 5e-73 MVRYASQLMGDGEEARDIVSEVMEQAWKHFDQLDEADRGGWIYTAVRNTCLNRMKHLQVE RDNAKALYEATLADVKSNYREHEALLQKAETIARSLPEPTCTILRLCYYEHLTYREVAQQ LGISPDTVKKHISKALRTLREAMKE >gi|226332247|gb|ACIC01000073.1| GENE 32 44854 - 47436 2060 860 aa, chain - ## HITS:1 COG:no KEGG:BT_3036 NR:ns ## KEGG: BT_3036 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 860 1 860 860 1678 94.0 0 MIRQFGFTWTAISWLLMGYGLFFALSPLQAQTAIKKFSLELKNESLPEALKQLEKAGGKN ILFTYNGTESYRVTVSIREKTEREAIDLVLAGKPFLCIEREEYFVVQRKKSDKAIAIEGK VYDEKGAPLPFVNILALAADSSFLAGSVTEEDGSFHLPPIAGEDCLLKATYIGYRPQIIP CRQQNTIRLQPDTELLKEVVVTASRPLIERKDGTLTANIAGTPLSLMGSAKEMISHLPFV TGSDGEFTVLGRGTPEIYINGRKVRDKTELDRLQANEILSAEIITTPGVQYGSSVGAVIR LRTIRKRGQGMSGSFYTDYSQGREPIGNEGISLNYRTGGLDIFVKGDFAEINNYRTGTSS QDIYASSNWNQSTEDKSKQTYRTFNGELGFNYEIDEDQSFGMRYMPGTNIGNAHTTNEGT TLILQDGKEVDQLHALRQTDAHTGWWQAANGYYNGTFGKWNIDFNADYLYGRDRIRQYAE NNGTEDATSSNRVRNHLYAAKLLLTAPLWKGKLSFGTEETFTNRHDVFLQSGFSADADDR IKQTMLSGFIDYILSLGKFNILAGLRYEYQQTDYYEKGIHQDGQSPIYRDWIPVISIRYT SGNWFFALSHRTLKYSPSYDMLTSAITYQNKYSYQSGDPFLVPQIHRATFFDAGWKWVNF SLSYDHCWNMYTNYTRPYDDVNHPGVLLFGRASIPHTNIYGANIVLSPKIGIWQPQFTTS VNWYDSHAACIGILQSWNEPRFHFSFDNNFSFPKGWFFNIKGVLSPGAKQSYAIWKTEGR VDAQLTKSFLKDQALKVSVTAKDLFHTAHRYFTIYGDRTFSSNRDYLDQQRFGIRLSYQF NATKSKYKGTGAGASEKSRL >gi|226332247|gb|ACIC01000073.1| GENE 33 47517 - 49103 1883 528 aa, chain - ## HITS:1 COG:XF2704 KEGG:ns NR:ns ## COG: XF2704 COG0793 # Protein_GI_number: 15839293 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Xylella fastidiosa 9a5c # 30 343 75 386 508 221 41.0 2e-57 MMKKIKVYALLVCLLATSAAQAQSFGSAAMRKLQMAEFAISNFYVDKVDEDKLVEEAIIK 
MLAQLDPHSTYSDAEEVKKMNEPLQGNFEGIGVQFQMIEDTLLVVQPVSNGPSEKVGILA GDRIIAVNDSAIAGVKMSTEDIMKRLRGPKGSKVNLTIVRRGVQDPLVFTVKRDKIPILS LDASYLIQPKIGYIRINRFGATTAEEFKKAMKDLQKQGMKDLILDLQGNGGGYLNAAIDL ANEFLGQKELIVYTEGRTAQRSEFFAKGNGEFRDGRLIVLVDEYTASASEIVSGAVQDWD RGIIVGRRSFGKGLVQRPIDLPDGSMIRLTIARYYTPAGRCIQKPYDSSTDYNKDLIDRF NHGELMNADSIHFPDSLKVQTKKLKRTVYGGGGIMPDYFVPIDTTLYTDYHRKLVGKGVI IKFTMKFIEDHRKELADKYKKFDSFNEKFVIDDDMLATLREMGEKEGVKFNEEQYQKALP LIKTQLKALIARDLWDMNEYFRVMNTTNESVQKALEILRSGEYQKKLK >gi|226332247|gb|ACIC01000073.1| GENE 34 49212 - 49667 283 151 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 [Bacillus selenitireducens MLS10] # 2 149 4 152 164 113 37 3e-24 MRKAIFPGTFDPFTIGHYSVVERALTFMDEIIIGIGINENKNTYFPIEKREEMIRNLYKD NPRIKVMSYDCLTIDFAQQVEAQFIVRGIRTVKDFEYEETIADINRKLAGIETILLFTEP ELTCVSSTIVRELLTYNKDISQFIPEGMEIN >gi|226332247|gb|ACIC01000073.1| GENE 35 49664 - 51541 2028 625 aa, chain - ## HITS:1 COG:CT661 KEGG:ns NR:ns ## COG: CT661 COG0187 # Protein_GI_number: 15605394 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Chlamydia trachomatis # 17 618 7 602 605 568 50.0 1e-161 MEENELISVDNNNAVEYTDDNIRHLSDMEHVRTRPGMYIGRLGDGAHAEDGIYVLLKEVI DNSIDEFKMQAGKKIEITVEENLRVSVRDYGRGIPQGKLVEAVSVLNTGGKYDSKAFKKS VGLNGVGVKAVNALSSRFEVRSYRDGKVRIATFSKGNLQTDVTQDTEEDNGTFIFFEPDN TLFVNYCFKPEFIETMLRNYTYLNTGLAIIYNGHRILSRNGLVDLLNDNMTATGLYPIVH LKGEDIEIAFTHTGQYGEEYYSFVNGQHTTQGGTHQSAFKEHIARTIKEFFNKNMDYTDI RNGLVAAIAVNVEEPIFESQTKTKLGSTNMTPGGVTVNKYVGDFIKLEVDNFLHKNADVA EVMQQKIQESEKERKAIAGVTKLARERAKKANLHNRKLRDCRIHLNDPKGKGMEEDSCIF ITEGDSASGSITKSRDVNTQAVFSLRGKPLNSFGLTKKVVYENEEFNLLQAALNIEDGIE GLRYNKVIVATDADVDGMHIRLLLITFFLQFFPDLIKKGHVYILQTPLFRVRNKKKTVYC YSEDERVNAINELSPNPEITRFKGLGEISPDEFKHFIGKDMRLEQVTLRKTDAVKELLEF YMGKNTMERQNFIIDNLVIEEDLAS >gi|226332247|gb|ACIC01000073.1| GENE 36 51547 - 52683 1074 378 aa, chain - ## HITS:1 COG:no KEGG:BT_3032 NR:ns ## KEGG: BT_3032 # Name: 
not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 378 1 378 378 765 100.0 0 MAITIKKVSTKRELKKFIRFNYRMYKGNPYSVPDLYDDMLNTFNKKKNAAFEFCEADYFL AYRDDKIVGRVAAIINNRANEKWECKNVRFGWIDFIDDPEVSSALIKTVEDWGKERGMTH ITGPLGFTDFDAEGMLIEGFDQLSTMATIYNHPYYPVHMERLGFEKDADWVEYKIYIPDA IPDKHKRISELIQRKYNLKIKKYTSGRKIAKDYGQEIFELMNEAYSPLYGYSPLTQRQIN QYVKMYLPILDLRMVTLITDADDKLVCVGISMPSLAEALQKSHGRLLPLGWFYLLKALFM KRRAKMLDLLLVAVKPEYQNKGVNALLFSDLIPVYQKLGFIFAESNPELELNGKVQAQWD YFETKQHKRRRAFTKEIK >gi|226332247|gb|ACIC01000073.1| GENE 37 52776 - 55262 1749 828 aa, chain - ## HITS:1 COG:MTH1485 KEGG:ns NR:ns ## COG: MTH1485 COG1520 # Protein_GI_number: 15679482 # Func_class: S Function unknown # Function: FOG: WD40-like repeat # Organism: Methanothermobacter thermautotrophicus # 535 828 60 332 407 67 25.0 1e-10 MKRLSVILLSFLLYLPLCAQFKGTVYVDTNQSGTFDKGDKPLAGVMVTDGMNVVKTDKKG RFSLSGFKKTRFISITTPAGFETQQFYLPVKEDRKSYDFIVTESERTKTREHSFIHITDT EVTGGVGRWVTDLQQYIKNEQPAFLIHTGDICYEPGLTVHNQIVNAQTMDCPVYYCIGNH DLVKGNYGEELYESIYGPTWYSFDVGNVHYVVTPIDRGDNPTDYTQRDVYNWLKNDLALM KKDQALVLFNHDLFTPGDNFVFKADEKDILDLRTFNTKAQIYGHMHYNYVRNQKGIYTIC TGTLDKGGIDHSPSSFRDIKIDANDHLTTQLRYTFIEPQIAIVSPMKNQAAPIHPETGYQ LPVSVNTYNSQAKTSSVSYLLSNPEDNREIAKGNLSPLTDWNWSGNIQIPANENGKKLNL TVTSSFNDGKEATATSQFLYQKDFKSTAITGKDWNTLLQNASHSGGINDSQIKLPLQLQW TANTGSNIFMASPLIVGQKVFIATTDDNTSLNTYICAFDFHSGKQLWKFRTENSVKNTIA CENEIVVAQDASCNLYALDVASGKPLWQQYINLKNYPYLSEGLTVDKGIVYAGIGAGLSA YDLKTGKVIWTNTDWKQREGSTTTLTIAGDVLISGTQWGGLYGNDIRTGKQLWKLSDNGL GNRGASPVYKDGKLWIISSKSFFLIEPQTGKVLQQKELSANLDVTSTPLVTEHEIIFGSA DRGIFALDKPTLFIKWRAETLPSLVYTAPYSSTPQAAVETSPVSSQGMVYIGASDGYLYA IDQSTGIIKDKLYFGAPVFSTVAVSGNRMIVCDFAGNVYCLSSKAAEE >gi|226332247|gb|ACIC01000073.1| GENE 38 55393 - 57270 1963 625 aa, chain - ## HITS:1 COG:MA0279 KEGG:ns NR:ns ## COG: MA0279 COG0591 # Protein_GI_number: 20089177 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Methanosarcina acetivorans 
str.C2A # 8 484 8 444 639 78 22.0 4e-14 MKLHTSDLLIICAYLIAMIVIGLILKKRAAQNMDSYFLGGKSLPFYMLGLSNASGMFDIT GTMLMVYWAFAYGFKSLWIPWLWPVFNQIFLMVYLSVWLRRSNVLTGAEWIKTRFGKGKG ATLSHTIVVVFALLSVLGFLSYGFIGIGKFMEIFIPWEVISPYIPFTVSPEYVPHVYGIF FTAIATFYVMLGGMLSIVWTDVVQFLIMTVAGIVIVVIGMQMVAPDMIHSFVPAGWDSPF FGWTLDIDWSSRMSLLTERMANEPYSLIGIFVMMALLKGIFMSMAGPAPNYDMQKILSCK SPKEAAMMSGSVSVVLLIPRYLMIMGFALLAIYFFKEDGGLTQMEVTRTDFETILPHLIT RYVPAGLAGLLLAGLLAAFMSTFASTVNAAPAYIVNDIYLKYINPKASVKTQIRSSYVIS VAVVVVSTVIGFFLKDINEIFQWIVGALFGGYIAANVLKWHWWRFNGEGYFWGMTAGVVA AIVMKFTVPDAWVLYFFPVLFGVSLIGCIIGTYSAPATDEETLINFYVNVRPWGWWKPIQ EKAIARYPHIQANKNFKRDAFNVAIGIIWQCTLTIIPMYLVVREQLGLWSSIALLLITTL ILRKTWYKPLCKEEARYNEEMKQIR >gi|226332247|gb|ACIC01000073.1| GENE 39 57531 - 58721 880 396 aa, chain - ## HITS:1 COG:MA2382 KEGG:ns NR:ns ## COG: MA2382 COG2152 # Protein_GI_number: 20091213 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Methanosarcina acetivorans str.C2A # 44 368 5 317 335 131 30.0 2e-30 MNRYDNRLHILTKEYDELISRENEKILPGNGVFERYKYPILTADHPPLEWRYDFNPETNP YLMERFGINAVFNAGAIKFNGKYLIMARVEGHDRKSFFAIAESPNGIDNFRFWEYPVQLP DLYPEETNVYDMRLTKHEDGWIYGIFCSESKDPDAPAGDLTSAIAAAGIIRSRDLKNWER LPNLVSQSQQRNVVLHPEFVDGKYALYTRPQDGFIDADSGGGISWALIDDITHAVVKKEI VIEQRHYHTIKEVKNGEGPHPIKTPEGWLHLAHGVRACAAGLRYVLYLYMTSLDDPSKVI AQPGGYFMAPVGEERTGDVSNVLFSNGWIADEDGTVYIYYASSDTRMHVATSTIERLIDY CRHTPEDRLRSTTSVKSIYDIIEANKLVMSENAVIL >gi|226332247|gb|ACIC01000073.1| GENE 40 58911 - 60089 905 392 aa, chain - ## HITS:1 COG:no KEGG:BT_3027 NR:ns ## KEGG: BT_3027 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 392 4 395 395 709 99.0 0 MKRNLLYLAGFLLAAATVFSGCNDDDPSYHDLVPNTQELIINLDETPEGIIQIVQGNGNY KITSSNEDVATAIAEGNKIKVTALKAGSTDLSITDWAKMSTNVKVIVDQLSELVLSNSAT TMYPGENKTVNVYTGNRGYKVTVDKESVVKATVDEEGHIQIESLAPGNATVTVTDRLNKS AEFSVKVIKKLVVDNSEEIAYLTVGEPLTIKILDGNGGYTCTNNGSATYLKCTIVENGTD VIIEGLKRYRFNKTVTIKDQEGESISINIVYIDDLYLENPSYRYLIAGSSSYQSVSTSAV 
GSITHSEEFNMSEIVIKGTGTYASGFAVQFTGDLTKGNKGDAVLYKVTRNAVDKKVKYPI EDCRIDKVEDGWYWVSFLEPNCTIRSYMITKQ >gi|226332247|gb|ACIC01000073.1| GENE 41 60115 - 61677 1137 520 aa, chain - ## HITS:1 COG:no KEGG:BT_3026 NR:ns ## KEGG: BT_3026 # Name: not_defined # Def: glycosylhydrolase, putative xylanase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 520 1 520 520 1069 99.0 0 MKKIYSLLFVSLFLFIGYSCDDGSLSHSDVYIPDPVEETDEDDGFSAEPTTEAVIKVQFG EGHEHQIIDGFGCAFAEWSHRIWNNMMREDVVNDLFGENGLKLNIFRGEVFPHYQNPTTN VIDFGMNRTFNLAANDPSMINDYWRDFNGSGCGEQVQLGQMWLVDILQKKYKDVKFIFST WSPPGTMKSNGKPSGGSLASGSEDAYANYLIDFIKAYTEKFGIEIYAISPSNEPNSSGTG WNGCSWSYTNLANFCQKNLRPALDKAGYQDMKIIFGEHSWWKAGVTFLENGLKACPDLVN SNIIAAAHGYTLIGNTEFVQSPLCAENNIHLWNTETSSTDTYDPSWKNAMQWATTFHNYL AVSNLNAFIWWAGARPCTNNEALIRLEEALPGTNYERASRYYSYGQFTKFIPEGSRRVDI KTVAPEGDEEAFPKELLMTAYIKDDNYTIVLVNNSTSKAFETKLEIEGKEFQTMISYTSD EGVKWQRKKVNPSLSGLRSITVPKFSVVTITGKMKDIDAE >gi|226332247|gb|ACIC01000073.1| GENE 42 61713 - 63344 1496 543 aa, chain - ## HITS:1 COG:no KEGG:BT_3025 NR:ns ## KEGG: BT_3025 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 543 1 541 541 1042 100.0 0 MNMKRNWLSYMCMAALFSGALTSCNESSFLNLSDPNYFTESNFWKDKADAESALAAAYSP IKGAMYGYFGAFDGWLNLNGRGDDIYTIKGEEVPMWNIANFLNSPSTGNDPYGALYSGIQ RANIVLRYIDQIPASGITEEDRSMIKGEALFLRGYQYFLLVNNYKEVPLRLIPSNEDETN KAAATEAALWAQVEEDLDNAIKCNLPVERSTAKERITKGAAIAMLGKVYATQHKYPEAKQ LLGTLLKAPYSYELMDNFEKNFTNEEEFNKESVFELAYSPDGDYSWSNESGICLGCYIPQ FIGPVKSGGWAKLMPNSLIVGEFTKETRPSDADTKFDKRMYASLFFDAANYGDKVPNEKW YGNNYSMDDLWEGNEGKMAGGAPSFNVNGTAGKFLIKKNTAYYVDDKAPDNMGNKEGRSS NLRVMRFAEVLLLYAEACAKTNDPDGANYALKQIRQRAGLTEKTFAQAELMNEIEHQCLL EFFAEGHRFDDLKRWYSPSEIQLIFKANGKQGAENFQEKHLYYPIPTSELNNNTAMEQTS LWK >gi|226332247|gb|ACIC01000073.1| GENE 43 63378 - 66443 2614 1021 aa, chain - ## HITS:1 COG:no KEGG:BT_3024 NR:ns ## KEGG: BT_3024 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1021 1 1021 1021 1967 99.0 0 
MRRKIYFLLFFFLCLTSIPLAAQDIEIKGMVTEATTNEPLPGVTVRVKGKDTGTVTDMDG KYSIKANKGNILIFSTIGMKQIEKAVTSGTPINVSMEEDNIALEQVVVIGYGTVKKSHLS GAVGSVSAKELNGQVASNAATALQGKIPGVSVASSSGDPNGSMTINVRGISSLSNNNPLY VVDGAFGDISMVDPNDISSIEVLKDAAAAAIYGSRAAGGVVLITTKSGRKDMPTKLDVNF FTGISQTPKTLKVFNGEEYSRFARYYKLAGDGYGAENGATPFIGEGTDWQDVMLRTAMTY KANATISGGSKSGSYSSSVSYLNKEGILRNTDHESYNIRLKSDYSFFDNRLTIGESMIVN LTKGSGYIHQDTMFDIFQFPSVVPVYDPTNAGGWGTSNDINLPNPLAEMTVNDERTETTK IFLNAYLQAEIIKGLKYKLNVGIRKEHTKWRYYKDVYDLGTFGKNDKPDLEEKSSTWESW VLENTLNYDRTFGKHNLSLLAGYSAQKDKSYSLYGKNGDMPQFIETMPGNVDPSNLKASS SLNELALVSLFGRVMYSFDDRYLFSASIRRDGSSRFRSGHQYGAFPSASIGWNINREKFF KPLENVFDQLKLRFSYGKLGNQEMTSYYPTQSVVSDGMNYVMNNSPWFGSMPYVQAISPA NLTWENTETYNIGLDVSLLNGRLTLTADAYVKNTNDVLLPIPSTASTGISGNSIQNAGQV RNKGFELAVNYRGAIKEKFNYYIGANIAADKNEVTKITLGGQNLMISGYSAHGAGGRGIN MFAEGHPMSYFNLIETDGLFRSAEEIANYKNKDGELIQPAAQVGDVRYKDWNGDGKINTD DQHDVGSPFPDFTFGIRLGGEWNNFDFNLFFDGMVGNKIYNYPRYRLESGNFNGNMSTVL ANSWRPDNQNTDIPRFSKTDGADNKWAYTDRWLEDGSYIRLKTLDIGYTLPKVLTKKIKL ENVRIYTSMENLFTLTKYSGYTPDLGESSVAGVAYNVFSRGIDQGRYPLPRTISFGIQVN L >gi|226332247|gb|ACIC01000073.1| GENE 44 66963 - 67880 839 305 aa, chain - ## HITS:1 COG:no KEGG:BT_3023 NR:ns ## KEGG: BT_3023 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 305 1 305 305 608 99.0 1e-173 MNKQTFLTLLATFALLFSFTFSCHAKGKDKAKHVVFIGLDGWGAYSLPKADMPNVKKLME DGAYTLKKRSALPSSSAINWASMFMGAGPELHGYTEWGSKTPELPSRVLNKNGIFPTVFQ LLRDARPEAEIGCLYEWEGIKYLVDTLSLSYHYHVADCNKTPKELGNMASSYIKEKHPAL VAICYDGPDHTGHTAGHDTPEYYEKLKELDMYVGQIVQAVKDAGMLDDTIFILTSDHGGI DKGHGGKTMQEMETAFIISGKNIKKGLRFDDVSMMQYDVASTIARIFHLEQPQVWIGRPM EIVFK >gi|226332247|gb|ACIC01000073.1| GENE 45 68535 - 69782 971 415 aa, chain - ## HITS:1 COG:TM1061 KEGG:ns NR:ns ## COG: TM1061 COG4289 # Protein_GI_number: 15643819 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Thermotoga maritima # 31 411 8 384 387 296 42.0 4e-80 MKRLHWILLLMILLIPTESISAKKKTEKSDREIWCDVMYRMAAPVLSNMSEGLLKQNMLV 
ELSPTWDGRNQNVTYMECFGRLMAGLAPWLSLPDDETPEGLQRKQLREWALKSYANAVDP ASPDYLLWRQENQTLVDAAFLAESFLRSYDALWMPLDSITKQRYIAEFTDLRRVDPSYSN WLLFSATVESFLRKAGAPSDTYRISSSLRKIEEWYVGDGWYSDGPRFAFDYYNSFVIHPM YIEALEIITEAGKREKIGNMPGCNFHEAIRRAQRFGVILERLISPEGTLPVFGRSITYRT GSLQTLALLAWRNWLPEELSNGQVRAAMTAVIQRMFGDGRNFNTAGFLTLGFNGSQPGIS DYYTNNGSLYMASLAFLPLGLSADDPFWTDASQSWTSKKAWEGEEFPKDHSYHKE >gi|226332247|gb|ACIC01000073.1| GENE 46 70005 - 71573 1103 522 aa, chain + ## HITS:1 COG:no KEGG:BT_3020 NR:ns ## KEGG: BT_3020 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 522 1 513 513 1068 100.0 0 MKRNWIGSSMIVALFALLASGCGHSQRTLEFRGICVDDPQGENGLYNPGRGFRLETAVDV LHEKDTPTEELNELSAKYVADSVSLSQSYFYLTYLIGKELSEENFRTMQAYFDELQKQGK KAVLRFAYERDFMGRSPVGPTGEQILAHLDQLKPFLEKNKDLILVVQAGMIGAWGEWHSS VQGLENSEETKAAVLEKLLSVVPAERNVQVRLPEFKNLLKDKPELYKRLSFHDDFIVIRP DRWDADMHEGTPKFDQIVAESPYLVVDGELPWGFWSVGADPDSPSAGWIIDGMQAARRLF LQHYTSLSIIHNYKEQHPNNRFDENNPPEYSMVVWKKTMITEDSLLQHHMPVSDSYFRKK DGTKVKRNMFDYIRDHLGYRIELQSLQLPSKFVSGKENVLKLSLKNRGFATVFGEHPVYF VLIDDAGEVTEFPTDANPKNWQPFEPKDSAYTSLMHTVDVSLELPASVTAGTYKLGLWIP DGSDRLRYNPRYAIHCANGDTDWWISKDGKYGVNVLTAVEVE >gi|226332247|gb|ACIC01000073.1| GENE 47 71655 - 72641 920 328 aa, chain - ## HITS:1 COG:BH0465 KEGG:ns NR:ns ## COG: BH0465 COG0530 # Protein_GI_number: 15613028 # Func_class: P Inorganic ion transport and metabolism # Function: Ca2+/Na+ antiporter # Organism: Bacillus halodurans # 16 328 16 318 318 224 44.0 2e-58 MNILLLIGGLILILLGANGLTDGAASVAKRFRIPSIVIGLTIVAFGTSAPELTVSVASAL KGSADIAIGNVVGSNIFNTLMIVGCTALFAPIIITRNTLRKEIPLCILSSVVLLICANDI FLDNATENILNRVDGLLLLCFFAIFMGYTFAIAFPKSSAAPEPAAPEHTAPEEEIKLLSW WKSILYIIGGLAALIYGGQLFVNGATGIARSMGVSESIIGLTLVAGGTSLPELATSIVAA LKKNPEMAIGNVIGSNLFNIFFVLGCSASITPLHLSGITNFDLFTLVGSSILLWLFGLFF AKRTITRIEGSIMILCYVAYTVVLIYNS >gi|226332247|gb|ACIC01000073.1| GENE 48 72768 - 74618 1480 616 aa, chain + ## HITS:1 COG:sll0590_2 KEGG:ns NR:ns ## COG: sll0590_2 COG0668 # Protein_GI_number: 16331818 # Func_class: M Cell 
wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Synechocystis # 358 606 1 251 264 249 49.0 1e-65 MDMKKIKRFALLLIMASLAVNVQAQLEQAVKKIFAGDTVATHPVSLHRDSDSARVANLQK SLEEARLNEANMRMEMEQMRLQMLSADSVKFSQQRQRIDSLRQFTKGIPVVAEGDTLFYL FTKRGGYTPQQRAQMTGAAIEEIGKRFNLRPDSVSIDHSDIVSDLMYGNKVLLSLTDQDA LWEGVSRDSLAKERRQNVITKLHEMKAEHSIWRMAKRILYFLLVIVGQYLLFRLTNWAFR KLKVRILRLKDTKIKPVSIQGYELLDAQKQANLLVFLAGIGRYILMGIQLLITVPLIFII FPQTEGLAYRLLGYIWNPIRKIFVDIIDYVPNLFTIVVIWYAVKYLVRMVLYLAREIEAG RLKFNGFYPDWAMPTFHIVRFLLYAFMIAMIYPYLPGSDSGVFQGISVFVGLIVSLGSST VIGNIIAGLVITYMRPFKMGDRIKLNDTTGDIIEKTPLVTRIRTPKNEVVTVPNSFIMSS HTVNYSTSAREYGLIIHSEVSIGYDVPWRQVNQILIDAALNTPGVVDDPRPFVLETSLSD WYPVYQINAYIKEAHKMSQIYSDLHQTIQDKFNEAGIEIMSPHYMAVRDGNATTTPKNDL SKSNTADTASQSNKSE >gi|226332247|gb|ACIC01000073.1| GENE 49 74775 - 75707 968 310 aa, chain - ## HITS:1 COG:no KEGG:BT_3017 NR:ns ## KEGG: BT_3017 # Name: not_defined # Def: acid phosphatase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 310 1 310 310 645 99.0 0 MKIKSLFILLFLSVFTFAHAQLTDYSIFDKKFNFYVANDLGRNGYYDQKPIAELMGTMGE EIGPEFVLATGDVHHFDGVRSVNDPLWMTNYELIYSHPELMIDWFSILGNHEYRGSTQAV LDYTNISRRWSMPDRYYTKVFEEKGATIRIVWIDTTPLIDKYRNESDKYPDACKQDISKQ LSWLESVLASAKEDWIIVAGHHPIYAYTPKEESERLDMQKRVDSILRKHNVDMYICGHIH NFQHIRVPGSDIDYVVNSAASLARKVEPIEGTKFCSPEPGFSVCSIDKKELNLRMIDKKG NVLYTVTRKK >gi|226332247|gb|ACIC01000073.1| GENE 50 75875 - 78691 2917 938 aa, chain - ## HITS:1 COG:CC0995 KEGG:ns NR:ns ## COG: CC0995 COG1629 # Protein_GI_number: 16125247 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Caulobacter vibrioides # 125 938 53 903 903 174 24.0 1e-42 MKRFLRLVSLTLLIMSTSGNTFAEEKVNVARQGTIRGRIVDTSKQILPGASIYIEKLHTG VTSDVNGYYTFANLTPGTYTVKVSYVGYSPVEMKITIPAGKTLEKDVVLNEGLELQEVVV GGAFQGQRRAINSQKNNLGITNVVSADQVGKFPDSNIGDALKRINGINVQYDQGEARFGQ VRGTSADLSSVTINGNRLPSAEGDTRNVQLDLIPADMVQTIEVNKVVTSDMDGDAIGGSI 
NLVTKSTPYKRMFSATAGTGYNWISQKAQLNLGFTYGDRFFNDKLGMMAAISYQNAPSGS DDVEFEYDVNKKGEVVMVEAQKRQYYVTRERQSYSLAFDYDINPNHRLTLQGIYNRRHDW ENRYRVTYKDLDKTGLDDEGDMQQSAQIETKGGTPDNRNARLELQQTMDLSLSGEHQFGK LSVNWGASYARASEDRPNERYFSLKQDFLGFTVVDAGDRFPYVTTDVNLHNGETDGERGK WKVKELTESNQEIYEKDLKFKVDFELPLTNGIYGNSLRFGAKYASKTKDRNTLCYDYADA YKDAFDKDYMNNLTSQIRDGYMPGDQYKATDFVSKEYLGSLDLSSMEGEQVLEESSGNYH ARENVTSAFFRFDQNLGKKIKMMAGLRMEATHIKYNGWNWNVADDKDETETLEPTGNHKN SYVNWLPSLLFKYDVNDDLKLRASFTETLSRPKYSALIPCVNINRSDNELVMGNSDLTPT ISYNFDLSADYYFKSVGLISAGIFYKKINDFIVDQVIGDYTYQNNEYKKFTQPKNAGDAD LLGVELAYQRDFGFIAPALKCIGFYGTYTYTHTQVNNFNFEGRENEKDLSLPGSPEHTAN ASLYFEKKGFNVRFSYNYASSFIDEMGEVAALDRYYDAVSYMDLNASYTFGKKIKTTFYA DATNLLNQPLRYYQGTKDRTMQAEYYGVKINAGVKVNF >gi|226332247|gb|ACIC01000073.1| GENE 51 78878 - 81784 2163 968 aa, chain - ## HITS:1 COG:no KEGG:BT_3015 NR:ns ## KEGG: BT_3015 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 968 1 968 968 1891 98.0 0 MKKYTLLFFLLSLLPCLSTACSDDDGSSTPNLTVGKETVDFKSESGSQNVAVTTNVDTWT VKSDKNWCHPSADGKALKISVDESDERYVRKATVTVIAADQTKTITVRQLGYEAAILVDQ PSFEVGVIGGEIQFDVTTNVEVAITLPEWITAKPASRAPATVTIPHTYMVKATGLDSQRH GNIEITEVLPAIDPDTEQAEPVSASVFVTQKGLNEFAEGNGEDVKGDIKIKIVSGTASSF QSGSNIEKSFDGDYSTLYHSSWSNGAGNYFPITLTYNFETVTDVDYLIYHPRNNGNNGRF IETEIQYSADGHTFTKLIDKDFQGSATASKITFDQTIQAKSFRFIVKSGSGDGQGFASCA EMEFFAKNPVNFDYSTLFTDASCSELKTGITEDDIAQCEYPFFKNIAYYMIKGKYPAEFR ISEFKAYPNPDIQSETHKTNPYSQLDNPTGISVKAGENLIVLVGDTHGYDIGLRVQNLDA PENDGFGGVTYLLNQGINKLTISEQGLVYVMYVTKTLDDPAAAPVKIHFASGKVNGYFDS QNPEHNGRWSELLNKATNRYFDVLGKYAHLTFETSDLRTYTGSKGDELIDLYDKIVYSEQ QLLGLEKYDKMFRNRMYLNVMYHSYMYATAYHTAYNRTTMNEICSPEKLKTSACWGPAHE IGHCNQTRPGVLWGGNTEVTNNIMSEYIQTTIFGQPSRIQVEDMGITYRNRYSKAWSGII ATGSPHADFQNLGKNNANDVFCKLVPFWQLELYFGKVLGRTPLQQADKGGFYPEVYEYAR NKDYTGMTHGEIQLDFVYACSKISGMNLLDFFTKWGFLTPVDKELDDYGKKQLTVTQDMI DALKQKVNALGGTRLDVALEYISDNTYELYKTKPAIIKGENATHAPKTFTVGSGDNAVTY NGETITIKNWTNVVTYEVKDETGKFILICSGENAPSSVDTFTIPVRWKDGFRLSAVSVTG ERIDIPMN >gi|226332247|gb|ACIC01000073.1| GENE 52 
81793 - 82689 786 298 aa, chain - ## HITS:1 COG:no KEGG:BT_3014 NR:ns ## KEGG: BT_3014 # Name: not_defined # Def: putative chitobiase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 298 1 298 298 595 100.0 1e-169 MKKILMTICLIGGLAMLWSCSDDKDSYPVPSDIENLKATPAPGQITLSWTNPADENLYYV QIEYTIGATGKSYRKQVSQYANELVIDNLLQKYGEINFTVQAFNRGNTAGTIHQITAQAE KASPTFGTPVKIDLDYKKIWTNAPFPTRPIKDLVDGNIATFFHSWWSSLVEMPHYLVVDL GEEVSAIKFRSTNTNRANDSSWKTINLYTSDSYNPAEWFDGVEKIDGNTVDISQAGTHKE TTLTGLPDGVSEVYNSEVIPLSKPSRYLWFEVTETTKGTPYFALGELEIYQCSMVVLE >gi|226332247|gb|ACIC01000073.1| GENE 53 82727 - 84661 1722 644 aa, chain - ## HITS:1 COG:no KEGG:BT_3013 NR:ns ## KEGG: BT_3013 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 644 1 644 644 1298 97.0 0 MKKNTLYLFILLLITGLNTSCNYLDIVPDETATEKDAFANPRAALRYLYSCYGYLPQSNM VQSCMDFTGDETISPFAESYVKFAEGSYDSSNTIISYWNTLFQGIRQCYLLKENIHSVPK ISQEEVDLYTAEADFLIAYFHLLLIKCYGPSILVKELPALDTPANNMLGRRPYDECIDWV ADLLDDAATRLPATRNSSDYGRATSVIAKSLKARMLLYAASPLFNGNPDYADFKNPDGEQ LMPTTYSEEKYKRAADATWDAIQAALGAGHELYKANTTSNAYPEPTNLTERTLRMTFMDS ENFKEVIFPETRKAGAYGIQRKSIPFFPRGSWNGIAPTITMLDRFYTVNGLPIDEDPAFN TNNKLDIVTIPEGTTYAEPGKRTLYMNMNREPRFYAWIAFENGYYECRTDDKRYAYHKFW GAERSEGDKWLTGFLATENCGVRADDGKIVTAARSQNYSKTGYLNKKGVHPGIQATVGTP GPTVEYPWPVIRLAELYLNYAEACVGYGKEGYPEKGMAYLDKVRERAGLKSVLESWANAK VPLTSYDSQCGPDGRVMKIVRQERMIELYQENHNFWDIRRWKMADTYFNVKVRGLNILAE TLEDFAKIVEIQDKRTFDAPRQYLMPIPAGEVSKNPNMVQNPGY >gi|226332247|gb|ACIC01000073.1| GENE 54 84674 - 88060 3141 1128 aa, chain - ## HITS:1 COG:no KEGG:BT_3012 NR:ns ## KEGG: BT_3012 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 1128 1 1125 1125 2206 100.0 0 MKNMHFSNSILRYAVLLICFLPMALYAHGAQGKRIILKGQSITMQQAIQLIEENSQYTFF FKSGVIANDEKKDYDCEGDINEILRKVFSNSGIDYVIKNNEIILKSAKAEAVQQKSQGPK KSEIVGVVVDANTGESIIGASVQIKGAATGVITNIDGKFTIMAAPSDVILISYVGYTPKE IKIGSRKVLSIDLNEDAKQLEEVVITAYGTGQKKASMVGSVESIKPAELKVPSTNLSTAF 
AGRLAGVIAVQKSGQPGADGANFWIRGVSTMNGVTDPLIILDGVQVSSSDLNNLDPEIID SFSILKDATATAMYGTRGANGVMIVTTKSGMNLDKPIINFRMEAQMSQPTSTPKFVDGAT YMELFNEAVNNDNSGDVLYSQDRIDGTRAGLNPYIYPNVNWYDELFKDASYSEKFNFNIR GGGKRVDYFSSISVNHESGMLKNRSKDFFSYNNNIEIMRYNFQNNINAKLSNSSKLSLRL NVQLRDMTTPNQGVGAIFNNAMNTSPVEFPVYFPDDGETPYIKWGATERVNADYQTNPVA QAATGYNKGFQSTVIAALEFNQKLDFITEGLSFKALASFKNWSSSENKREGKWNKFALTG YEDDGNGGYTYTTSRIGDEYTTNLTAKNTNSGDRRFYLEGMVNYSRTFGEHDVNAMLIYS QDELVNNTPGDGNFIGSLPQRKQGLAARASYAYAGKYMAEVNVGYNGSENFAKGHRFGLF PSIAAGYNISEESFFKPLKNVVSKLKLRGSWGLVGNDQINGSRFIYMSQIDLGGKGFTTG VNQNVTYNGPVYQRYANEDITWEVGEKINFGVDLQLFHSLDLTFDIFRENRRDIFQEKGT TPTYMGTATTKVYGNLAAMRNQGFELAASYNKQFNKNWFVSFKGTFTYAHNEITQYDESP KYPWQSKIGVSANKNAIYIADGLFIDAADIANHQQQLGAAVIPGDIKYINISRDWYGYDD NLTDTDDWVWAKHPGVPEIVYGFGPSIKWKNLDFSFFFQGIARTSFMVKDFHPFGDSMLR NVLSWVADERWSPDNQNTNATYPRLSRKTNNNNNKLSSYWERNGAFLKLKNAEIGYTYKN MRLYISGSNLMTFAPFDLWDPEQGGGSGLSYPTQRVFNIGFQMTINNK >gi|226332247|gb|ACIC01000073.1| GENE 55 88109 - 89017 632 302 aa, chain - ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 85 297 109 321 331 72 25.0 1e-12 MLADRLTTSEREQLLNEKPMVNFQSQQWDKAPKAHIQDHVPPHTILNKIMVRCRNEKQPL ERNKKYYRIGYSIAATVAIFIIGFWIANNISSSDINISAPMNDKLAVMLPDSSEVWLNAA SQIRYHKSFLNNREIFLEKGEAFFKVKKAQGAPFRVYFRESRIEVTGTEFNIKAGHMESE ITLFTGSIKFQAEEGQRELPMQPNERIVYNTQAKSVVRTNIDINEYDWRSSKYRFTNKPL QEFIDFINRSYHVNIIIKEEKLKELKFNGTIRKDEPLTNIIEKICISLDLKEKQENNNII LY >gi|226332247|gb|ACIC01000073.1| GENE 56 89212 - 89787 388 191 aa, chain + ## HITS:1 COG:PA1912 KEGG:ns NR:ns ## COG: PA1912 COG1595 # Protein_GI_number: 15597108 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 15 172 9 161 168 62 27.0 4e-10 MNERERVLAIKGGDHQAFIDLYNEYWSQVYDFSRLYIATIADAEEIVQDVFVKLWESRHL 
LKEDENIKGFLFIVTRNIVFNKNKKRVNENLFKTSVLVAYGNEGYYNSTTVEEDYCASQL KIFIDRLINSLPEQQRKCFLLSREESLSYKEIAERLGISQKTVEIHMGKALKFLKDKVKR GWEILLSLLSF >gi|226332247|gb|ACIC01000073.1| GENE 57 90091 - 90219 70 42 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLPCNFTSENYILIIVFKYNNYDRGMLSDLSVQMETPNLKQR >gi|226332247|gb|ACIC01000073.1| GENE 58 90251 - 92842 1834 863 aa, chain + ## HITS:1 COG:XF0845 KEGG:ns NR:ns ## COG: XF0845 COG1472 # Protein_GI_number: 15837447 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Xylella fastidiosa 9a5c # 33 820 31 831 882 546 38.0 1e-155 MKKVLLTLSLSIVASMGSFAQNEPYKNPKLTPEERAEDLLGRLTLKEKIGLMKNSSFAVE RLGVAPYNWWSEALHGVARNGLATVFPITMGMASTFDDEAIERVYVAVSDEGRAKFHDAH RSNRYGYGNEGLTFWNPNVNIFRDPRWGRGQETFGEDPYLTTRMGVAVVKGMQGPADAEY DKAHACVKHYAVHSGPEAKRHSFDVEDLSPRDLWETYLPAFKALVQEADVKEVMCAYQRL EGEPCCDSNRLLTQILRDEWGYKHLVVSDCGAIDDFFVKGRHETHKDAADASASAVINGT DLECGSIYSHLEEAVKQGLITEERIDTSLRRLLKARFALGEMDPDSIVPWSRISIDTVDC DLHKQMALDLARKSMVLLCNNGVLPLAKTGARIAVMGPNAVDSVMQWGNYEGVPSHTYTI LEGIRCKIGDVPFEKGCELLDNRIFESYFNEISNNGRPGLTATYWNNMNLSGDVAATSQI TSPINLSNGGNTVFATGVGLYNFTAVYEGTFRPKESGAYELLIEGDDGYRVYVNGEKVID YWGEHASAKREYTLKAIAGTDYKIRIEYMQAGAEALLRFDLGVYRHISPEMVVDRVKEAD IVIFAGGISPSLEGEEMYSVNSPGFAGGDRTSIELPQVQRDILKALKKAGKKVVFVNCSG SAVALVPEMESCDAILQAWYPGQAGGLAVADVLFGDFNPSGKLPVTFYKNTDQLPDFEDY SMKNRTYRYMTEVPLFPFGYGLSYTTFDISKGRLNKKTISAGQGLNFKVNVKNTGKYDGA EVIQVYVRKVDDTEGPIKSLRAFRRVPLKAGETCVVSIDLLPTTFEFFDPTTNTMRIMPG KYEIMYGNSSDIPSGNKLSVTLR >gi|226332247|gb|ACIC01000073.1| GENE 59 93139 - 94329 904 396 aa, chain - ## HITS:1 COG:no KEGG:BT_3008 NR:ns ## KEGG: BT_3008 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 396 1 395 395 782 98.0 0 MMQISPETQLFIREHQTDDVRALALQARKYPNVDMPTAITQIAGRQVAAEKIPSWRDTDD IWYPKHLSLEQCSSEVTARYKATLLKGNSLTDLTGGFGIDCAFLAARFKSATYVERQQEL CEIAAHNFPILNLNHINVKNEDGVSYLQAMSPVDCIFLDPARRNEHGGKTVAISDCEPDV 
AELEELLLNKAGQVMVKLSPMLDLSLALKELQHVQEVHIISANNECKELLLILGQASVEE ISIHCVNLPTKGIQEEQHFVFTREQEQCSECNYTNVLENYLYEPNASLLKAGAFRSIASA FPVKKLHPNSHLYTSDVLVESFPGRAFHIISQCSLNKKELKESLGDLKKANITVRNFPAT VAELRKRIKLSEGGDTYLFASTLNNGQKVLIRCEKA >gi|226332247|gb|ACIC01000073.1| GENE 60 94367 - 95239 637 290 aa, chain - ## HITS:1 COG:CC2273 KEGG:ns NR:ns ## COG: CC2273 COG2173 # Protein_GI_number: 16126512 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine dipeptidase # Organism: Caulobacter vibrioides # 90 276 16 193 212 96 33.0 5e-20 MLKYSLFVFCFLLAGYSCVSGHKETKTSAETETTCTDSMCAGLGDTSPEGTASDCTSLAH ADSIDADSIYTSPSQKNESASTLIPKHQKSPMALYMDSLGLINIAELDNSIAISLMYTKA DNFTGEILYDDLSEAYLHPHAAYALLKAQEALKALHPSYSLIIYDAARPMSVQRKMWDVV KGTSKYRYVSNPNRGGGLHNYGLAVDISIQDSLGRPLPMGTKVDHLGVEAHITEEGELIR NGKITETERQNRILLRKVMKAAGFRALPSEWWHFNFCSREEARKKYNVIP >gi|226332247|gb|ACIC01000073.1| GENE 61 95507 - 98410 2128 967 aa, chain - ## HITS:1 COG:no KEGG:BT_3006 NR:ns ## KEGG: BT_3006 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 967 1 967 967 1913 97.0 0 MTALETQLTEIVEKEQGQKIIPFLQKLTQEERKSLIPCLSRLEEYYNKFVQLEERTYGTR ATSGQHHIIDLAALVIFPLKEFRKHEWGINTAHLNEIAAWHIPTWLDSYFVEGEGKEFGG FYNMDYEILMDWIERGILTVSPSPQTIAGYLVNYIHTTPVLEKRDITINEHIWYLFEYDC GQNWHANPAKGYPYYTFQHFTENGKLDRMRVLKESLLAINRNFNKNLCSWFAGMFTALNP SVEEQLTLQPEMFAALSSPHSRPINIILGLLKNLCSHPRFLTDDFLDQTTVLFASDVKAV HQNMLGVLSKLAKEKKEYHDAICCAATQGLMSRDESTQNKIVKLIQTFGETESPTLKEAL SAYAETMLTSTKKELAAYLKDNVSDALSTDKVLLTTLDEQANVASFDYEPMPPILREDNR IQEITSIEDLLFLASQILDSNELYHFDLFLNALIEWNEQLEAKHITQWTPVLQRAYKLLI NGGSSRNGISDSMMATFLIDYAKLLIKRFPIEAKELSTLHEKMVQKDELQKGQWRYRNLQ KITIRQKSNKRTELPIHKQLLCRTLDLLESNENRLPMLSTPTHMPAFIDPIVLIKRLGQY QQANAEPDDMDMQIALSRMALNDYPSQDLPTVLQELEGEYQNLFSFLMGAKDAVPQAPFT HPSWWMTAGLIKSPETVYTEFKDFSYSKSSREFLTGNFSWWTFQTPHSYTDYHNKVVNWT SSTLSFNVPEGENIHIVNKGKYDERVSYHSYDPHPLLVEMYSQIERYDDIQNDLPRLVWL APNTPEPLFVWCIRCAIYDPMLNEVREVGITQATIEALHQLRHEWHETSYLLEASCMLVA 
DKTSRSYAAGIWTDRVSTGCIDSVRIGRILGSHQRTGWGPLKRLTDLIQQQMINVSPLHN RELELLLTSLIAELPEQPVKELKKLLEIYSELLSVNNSKVTDERILQLLEIWKNTANLKK VIVSINR >gi|226332247|gb|ACIC01000073.1| GENE 62 98537 - 101383 1511 948 aa, chain - ## HITS:1 COG:no KEGG:BT_3005 NR:ns ## KEGG: BT_3005 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 948 1 948 948 1700 89.0 0 MEELKHELEKLSKTYVDTPEKEEKILIPFIKRLLELSVKERRKLLPTIRELQWIKGKFSE TFCDHHRARFLGAVQFVCGNKREMDLGYHIDFNLLCKLLSLYTPSWLAEYINDAESYLNF NISYEQLMQLIDMGYMQELSPKRIAQILPGYICIRSSIPKGKDTFNSDLLLKREITLKEH IWYLFELESSIGYHNNRAQTAYKEGTTSRDESFSAAFYRFSLDGHLDRNRLLRATLSTFN RSFKKNMVNWFAGLFEELQPRAEELISLQEEIMQVFTSPYTKPVNVMLQQLKKIASEGGF HYQEFIERATTLFFSSPKNSLLTIYSIFEQIVTGHPEMKEACCIPLCQLFLKKDESLQKK AASFISKYGDASSSTLQETLLSYQSEMFQSVQDILVSFMKQPAEEAGLPETTFQEKVRIC RKDNRIPFPANKEDFLFQLSRLFDMNESWETDTAIAALIAFHPQLDEEDFSRMEPVFQRA ANIIINSWAVYENFLATFLLEYQRLWTQKDTANQGFLSKLFTRLEERLKGIDANRGAYDE RAFKRLADWQPTYSNRTCFEPIKQLWLEVIRKIQKGDSLPLLSTPTHIPVYIQSTELIRR LAVYQEANVKPCSWDFQLAIARCALEDKEEAIAVARQLLNNEYLHLCLFLLDKETQPEPP YNHPSAWLAAGLVKEPDTEFEAFKNFSCNTLPHNYLTGNYGWKEPAQKENQYAADTRLLQ LDFYKWHEYAEHNSHQLWQEHLVINSRYNIGDSDYMERLFSFFPNQPEPLVAQIINCYMD FGSPQEESKRSVANALRMLLSFHCPLREMSLLLLGGGLLFVDKTVRSYAAELWVEGIANN RIDNHRLGEIVARIINMGIVPLKRFTTEVYESIYKRSAFHNQQLEELLTVLIGGLPDNPV TGQKQLLELYAEVLRVNGHCVTDDKVRERLEMWKKNANLKKVVTALER >gi|226332247|gb|ACIC01000073.1| GENE 63 101371 - 102726 974 451 aa, chain - ## HITS:1 COG:no KEGG:BT_3004 NR:ns ## KEGG: BT_3004 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 451 1 451 451 899 99.0 0 MTDTISYQYAAPSVLQRSSDQDELFLAKYSEIEKKEAPCFFWGKLTQPYMTARCLIALSN VVQSSFNLTPAQLSMLKDPIVTAGNERLRFEGFSNCAGVYARVDVLPDGHDGEFLENGTT NVDFNAGMISALGSISKQEKVVMSVGPKEVGLYNKGEKVIERKVPLPVKWIKGLTTVQIY QSVAERMYSFNRIQTLQLFQTLPKGTVKCDYYLVMRGQKPAFSPVKSLNAVCVGGVHRLR LLEPLLPFADELRVFVHPTMQSTIWQLYFGPVRFSLSLSRECWRGFSGEGAALDALLEDV PEKWIAAMDKYSYANQQFNPTLFAIEEHIDLDKVDSLSARLAAMGLLGFDLDDNSFFYRR 
LPFKTERILSLNPRMIAAEKLLEEEKVEIISNDGNRIEARVAGSGGVRHTVILDREGEKE RCTCTWFSGNQGERGACKHILAVKKLVQWKN Prediction of potential genes in microbial genomes Time: Thu May 12 01:08:12 2011 Seq name: gi|226332246|gb|ACIC01000074.1| Bacteroides sp. 1_1_6 cont1.74, whole genome shotgun sequence Length of sequence - 85553 bp Number of predicted genes - 62, with homology - 62 Number of transcription units - 25, operones - 18 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 593 - 652 9.0 1 1 Op 1 . + CDS 739 - 954 231 ## BT_3153 hypothetical protein 2 1 Op 2 . + CDS 951 - 2597 869 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen 3 1 Op 3 . + CDS 2647 - 3993 472 ## COG0270 Site-specific DNA methylase 4 1 Op 4 . + CDS 3980 - 5530 919 ## RCAP_rcc02716 hypothetical protein 5 1 Op 5 . + CDS 5532 - 7733 485 ## RCAP_rcc02715 hypothetical protein 6 1 Op 6 . + CDS 7735 - 8721 314 ## RCAP_rcc02714 hypothetical protein 7 1 Op 7 . + CDS 8718 - 9419 219 ## gi|253569230|ref|ZP_04846640.1| predicted protein + Term 9423 - 9472 10.5 8 2 Op 1 . - CDS 9875 - 10417 342 ## BT_2996 hypothetical protein 9 2 Op 2 . - CDS 10441 - 11652 321 ## BT_2995 hypothetical protein 10 2 Op 3 . - CDS 11668 - 12066 337 ## BT_2994 hypothetical protein - Prom 12086 - 12145 3.4 - Term 12175 - 12210 -0.5 11 3 Op 1 . - CDS 12241 - 12825 368 ## BT_2993 hypothetical protein 12 3 Op 2 . - CDS 12877 - 13257 232 ## gi|253569235|ref|ZP_04846645.1| conserved hypothetical protein - Prom 13287 - 13346 2.0 13 4 Op 1 . - CDS 13360 - 14745 745 ## COG5545 Predicted P-loop ATPase and inactivated derivatives 14 4 Op 2 . - CDS 14750 - 15067 331 ## BVU_2467 hypothetical protein - Prom 15167 - 15226 5.1 15 5 Op 1 . - CDS 15245 - 16489 761 ## BT_2991 hypothetical protein 16 5 Op 2 . - CDS 16502 - 17995 830 ## BT_2990 hypothetical protein - TRNA 18618 - 18691 73.9 # Met CAT 0 0 - Term 18714 - 18770 -0.2 17 6 Op 1 . 
- CDS 18785 - 19096 298 ## BT_2979 hypothetical protein 18 6 Op 2 . - CDS 19101 - 19565 518 ## COG1522 Transcriptional regulators - Prom 19614 - 19673 9.2 - Term 19648 - 19706 13.3 19 7 Op 1 . - CDS 19726 - 20601 1039 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 20 7 Op 2 . - CDS 20622 - 21206 747 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 - Prom 21292 - 21351 3.2 + Prom 21410 - 21469 4.0 21 8 Tu 1 . + CDS 21678 - 22382 645 ## COG0846 NAD-dependent protein deacetylases, SIR2 family + Term 22418 - 22457 6.6 22 9 Op 1 . - CDS 22366 - 22584 239 ## BT_2974 hypothetical protein 23 9 Op 2 1/0.167 - CDS 22599 - 22847 144 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 22941 - 23000 3.8 24 10 Tu 1 . - CDS 23024 - 23797 756 ## COG0500 SAM-dependent methyltransferases + Prom 24171 - 24230 8.6 25 11 Tu 1 . + CDS 24265 - 28263 2126 ## COG0642 Signal transduction histidine kinase + Term 28300 - 28365 2.9 + Prom 28529 - 28588 5.0 26 12 Op 1 . + CDS 28610 - 30064 1396 ## COG3669 Alpha-L-fucosidase 27 12 Op 2 . + CDS 30084 - 32192 1831 ## COG3250 Beta-galactosidase/beta-glucuronidase + Prom 32232 - 32291 3.2 28 13 Op 1 . + CDS 32319 - 35477 2247 ## BT_2968 hypothetical protein 29 13 Op 2 . + CDS 35483 - 37435 1430 ## BT_2967 hypothetical protein 30 13 Op 3 . + CDS 37459 - 38706 880 ## BT_2966 hypothetical protein 31 13 Op 4 . + CDS 38730 - 39401 398 ## BT_2965 hypothetical protein + Term 39452 - 39503 3.5 32 14 Op 1 . + CDS 39535 - 41700 930 ## BT_2964 chondroitinase AC precursor 33 14 Op 2 . + CDS 41697 - 44639 2481 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins 34 14 Op 3 . + CDS 44679 - 44900 63 ## BT_2962 hypothetical protein 35 14 Op 4 . + CDS 44905 - 45762 860 ## BT_2961 hypothetical protein + Term 45976 - 46020 5.5 36 15 Tu 1 . 
- CDS 46058 - 46309 99 ## BT_2960 hypothetical protein - Prom 46471 - 46530 5.1 + Prom 46448 - 46507 8.3 37 16 Op 1 . + CDS 46527 - 47909 1099 ## BT_2959 hypothetical protein 38 16 Op 2 . + CDS 47915 - 50284 2026 ## BT_2958 hypothetical protein + Term 50324 - 50378 7.1 39 17 Tu 1 . - CDS 50492 - 51328 395 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 51401 - 51460 6.0 + Prom 51348 - 51407 4.4 40 18 Op 1 11/0.000 + CDS 51526 - 52425 422 ## COG1180 Pyruvate-formate lyase-activating enzyme 41 18 Op 2 . + CDS 52422 - 54566 1388 ## COG1882 Pyruvate-formate lyase 42 18 Op 3 . + CDS 54595 - 55020 382 ## BT_2954 hypothetical protein 43 18 Op 4 . + CDS 55028 - 56035 828 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Prom 56041 - 56100 2.5 44 19 Op 1 . + CDS 56173 - 59061 2484 ## BT_2952 hypothetical protein 45 19 Op 2 . + CDS 59067 - 60605 1207 ## BT_2951 hypothetical protein 46 19 Op 3 . + CDS 60632 - 61579 675 ## BT_2950 hypothetical protein 47 19 Op 4 . + CDS 61640 - 62791 884 ## BT_2949 putative alpha-1,6-mannanase 48 19 Op 5 . + CDS 62803 - 65496 1685 ## COG3537 Putative alpha-1,2-mannosidase + Term 65578 - 65623 1.1 49 20 Op 1 26/0.000 - CDS 65554 - 66309 482 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 50 20 Op 2 . - CDS 66302 - 67567 1098 ## COG0438 Glycosyltransferase - Prom 67804 - 67863 2.5 + Prom 67526 - 67585 3.1 51 21 Op 1 10/0.000 + CDS 67673 - 68875 1305 ## COG0677 UDP-N-acetyl-D-mannosaminuronate dehydrogenase 52 21 Op 2 1/0.167 + CDS 68926 - 70056 1139 ## COG0381 UDP-N-acetylglucosamine 2-epimerase + Prom 70529 - 70588 4.7 53 22 Tu 1 . + CDS 70736 - 72394 560 ## COG0110 Acetyltransferase (isoleucine patch superfamily) + Term 72411 - 72456 3.0 - Term 72467 - 72512 8.1 54 23 Op 1 9/0.000 - CDS 72556 - 73920 409 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 55 23 Op 2 27/0.000 - CDS 73936 - 77112 3402 ## COG0841 Cation/multidrug efflux pump 56 23 Op 3 . 
- CDS 77127 - 78230 1174 ## COG0845 Membrane-fusion protein - Prom 78373 - 78432 6.0 + Prom 78206 - 78265 3.9 57 24 Tu 1 . + CDS 78454 - 79365 720 ## BT_2939 putative transcriptional regulator + Term 79471 - 79500 -0.7 58 25 Op 1 . - CDS 79337 - 80563 1231 ## COG0438 Glycosyltransferase 59 25 Op 2 . - CDS 80560 - 81693 417 ## BT_2937 putative glycosyltransferase 60 25 Op 3 . - CDS 81690 - 82889 902 ## COG0438 Glycosyltransferase 61 25 Op 4 . - CDS 82877 - 83980 951 ## BT_2935 hypothetical protein 62 25 Op 5 . - CDS 83977 - 85425 1192 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid Predicted protein(s) >gi|226332246|gb|ACIC01000074.1| GENE 1 739 - 954 231 71 aa, chain + ## HITS:1 COG:no KEGG:BT_3153 NR:ns ## KEGG: BT_3153 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 71 44 114 114 80 54.0 2e-14 MEKEMINRLKIVLAEEQKSSKWLAEQLGKDRATVSKWVTNTMQPDARNLLRIAEALNIDA GRLLNQKHLKK >gi|226332246|gb|ACIC01000074.1| GENE 2 951 - 2597 869 548 aa, chain + ## HITS:1 COG:MA2370 KEGG:ns NR:ns ## COG: MA2370 COG2865 # Protein_GI_number: 20091202 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Methanosarcina acetivorans str.C2A # 10 351 7 315 458 112 29.0 2e-24 MNEIDIIQQIKSGEVSKVQFKERLLDNYDIGCELVAFSNARGGTLVIGVKDKTGSINSLS YQEVQETTNTLTNIASENVVPAIIVEVETIPVDDGSLVLAHIKEGLNKPYHDNRGIVWVK NGADKRKVFDNAELAEMMTDCGSFSPDEAGVRDATIKDLDENTIKTFLSNKFEKVLVRKG LVGDALREASLDTVSSAIASGHDGEKLLRNLRFIRPDGKITVAAMLLFGKYTQRWLPVMT AKCICFVGNNLGGTQFRDKVNDADIEGNLLHQFESIMDFFTRNLRHIQVGDEFNSQGVLE IPYISLVEFTVNALVHRSLNAKAPIRIFIFDDRVEIHSPGTLPNGLNVDDIVVGTSMPRN MFLFTNAIHLLPYTGAGSGMLRALSEGMKVSFTNNDRTNEFVITINRPVGNQVSEKSNQV DNKSNQVNGNLDTKLGHYDTNLDTELRHLDTYPATGLGHSNTDLDTNSETLNAYLDTKRP KITNKQKDIVNFCSVPRSSKEILDRIGVTNHSKNRQTYIMSLVEAGYLEMTNPEHPNASN QKYRRSKK >gi|226332246|gb|ACIC01000074.1| GENE 3 2647 - 3993 
472 448 aa, chain + ## HITS:1 COG:NMA1500 KEGG:ns NR:ns ## COG: NMA1500 COG0270 # Protein_GI_number: 15794400 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Neisseria meningitidis Z2491 # 2 164 18 188 337 134 43.0 3e-31 MKQYRFIDLFAGLGGFHLALEKLGCKCVFASELQQELQELYYLNHGIKCHGDINQVNIKN DIPSHDILCAGFPCQPFSKAGKQEGFNDQQDRGNLFYKIMEILEFHKPEYVFLENVSNLK THDNHNTYHVIYEKLSSLYDVQDAIISPHYFGVPQHRNRIYIVGRLHSKGGLKGFEFPDY TQRPECHISSIIHPEDTDYMSLRPITRKHLEAWEHFLQLLDKHNVPVPTFPIWAMEFGAN YDYKGIAPYYQQRRQMDGKLGRFGEPIMGNSKDDYLQCLPIYSQTNKLENRQFPDWKKQF IEYNRKFYNDNKEWIDEWIHEIREPGFENSHQKFEWNCGQDPEVPRTLYNKIIQFRPSGI RVKRPTFSPALVLTTTQIPIFPWIETPNGEKGRYMTRKEAAALQCMEELQYYPDTCAKAF KAFGNAVNVEVVKRIAEKLLIVGDYDAN >gi|226332246|gb|ACIC01000074.1| GENE 4 3980 - 5530 919 516 aa, chain + ## HITS:1 COG:no KEGG:RCAP_rcc02716 NR:ns ## KEGG: RCAP_rcc02716 # Name: not_defined # Def: hypothetical protein # Organism: R.capsulatus # Pathway: not_defined # 33 514 29 503 510 271 35.0 4e-71 MMQTEHIDKVSIALKPNVYNTFRNLVNTVSNTLAEYVDNAVQSYLNHKTEIKTLDPDYQL QVEITIDRNADTISIIDNAAGIDTLNFKRAFEPAHIPLDNTGLNQFGMGMKTASVWLADK WSVSTKALGEDVERTTSFDLQKVTSEGREELIVENKSLAPEEHYTKIYLTALSSNAPSAN QMDKIRRHLSSIYRKYLRSEELKIVVNGESIQAANYAILNTPYYKTPDGDSLLWKKAIDY QSYDGKYKAKGFIAILDKIQNGANGLVLMRNGRVIVGGGEDRYFPSVIFGQVGSFKYRRI FGELELEGFEVTFNKNGFRDEDNLYALMEALRDELRSDNFNLLAQADNYRQRSKEEYAKI SKDIKKKLDKQVKPKDLTRQIQDVENKTSNKEFVKQEERKVQDAKPIDSHTEDFFYQGEN YSLRMDLFTDAGVDVLYSITLNSNESTDLFSVKQYVCKINLAHPFFTRFEQFKKSNDYMP IISIFKSFAMAELTATLSGISNVSDLRIKFNQYILQ >gi|226332246|gb|ACIC01000074.1| GENE 5 5532 - 7733 485 733 aa, chain + ## HITS:1 COG:no KEGG:RCAP_rcc02715 NR:ns ## KEGG: RCAP_rcc02715 # Name: not_defined # Def: hypothetical protein # Organism: R.capsulatus # Pathway: not_defined # 9 723 24 733 748 334 32.0 8e-90 MDIQIISNDTQEKFCPIRGERTQNLLDRLAKEIDSEGVDTIEREAIEVLSHCSNPTVDKV QSSTSLVVGYVQSGKTMSFTTLSALASDNGYRIIIYFAGIKDNLVEQTSERLREDLLDDG NNNQVYKLHENPTVDEAQRIKSELLLSSKPTILITIHKNPNRIKKLIELLSTTTVSHEIG 
KKGVMIIDDEADQASLNGYAYKNSKNHSENWEDDAEFTSTYRAILKLRSILPNHSYVQYT ATPQGPLLISLLDLLSPKYHTVLTPGKGYTGGKTFFKDEPGLVLTIPETEVYNYRRNNLT ECPPSLIEALQLHMMGVAIVVRLKKTVKFLSMMVHADREQDASQTFHTWIASTIDMWTNF IRAGKDDIAYINLMNDFKNRYSEAIREYSKHDEDYPSFEQIADVLPDIICDTNLELIISN KRKKNIKIAWKDYPSHILVGAEMLNRGFTVKNLAVTYMPRYSVGKSTADTIQQRCRFFGY KQKYLWSCRVFLPDEVQVEYSEYVEHEEEMRKWLTECKNIDEVERLLIISPRMNATRKNI LSVDTVNDKLRGWRLTNAVQAIDENNELVHNFLGGLTFEVCQNYGTPDRNHRFAKLPIQD VIDFLTGFKFQNMPDSARKQATLRYLKYYASHETSPITHAYIVQMAYDGEPRLRQFNDKT LKIGNLHTGRSIKGKTTYPGDKEIKFDDSITIQIHHIKLKSTSNSWGGKELYTLAIYYPD EMAINFVNSQNSK >gi|226332246|gb|ACIC01000074.1| GENE 6 7735 - 8721 314 328 aa, chain + ## HITS:1 COG:no KEGG:RCAP_rcc02714 NR:ns ## KEGG: RCAP_rcc02714 # Name: not_defined # Def: hypothetical protein # Organism: R.capsulatus # Pathway: not_defined # 23 304 21 302 326 124 26.0 4e-27 MNNSIYDIFLDLEGNNTAENDKFSAKPLPFNPNHKIGIDSNGYPLFFVECVDFEVVPNID LELISVQFNQLCRLTNKEKTSEGRYTIISLKTINPDFLRYFIDVTSLILQRMNKRPSHKE VTSELSKLIELFKVFSKPSRKTIQGLWAELFVIEQAKDPVYMLQSWHVSPEDKYDFNDGN DKVEVKCTVNSQRKHRFSYEQLTPNTKSQILIASVITTRSGKGKNVFDLKDSILRKVQDL KLHLILNEVIASTLGTDFDKAFEYYFDYQMAIDGLSFFDVNEIPNIPSDVIPSELSNIKF DCDLTRIKVASKETNPVLNTILFNSIGL >gi|226332246|gb|ACIC01000074.1| GENE 7 8718 - 9419 219 233 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253569230|ref|ZP_04846640.1| ## NR: gi|253569230|ref|ZP_04846640.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 233 1 233 233 440 100.0 1e-122 MKIEKIFPETLLELIRKELDKRSCFEYIEGVQPFLMKYDNVPYYVYIKNLSSAFFKDRPG TTRAQLPKRDLFDEIKKSPNIFIFLGYDQENDVFVCWDFNVAKERLNVGKSVSFYSRISY QEEVEEGEFLRINLKNGDNPIVFKRKSITEFFDKINTFFDVKSKGLSDSTHPVVEDGKIK VITEEELLKQLRPLIKISSPHTLEAISVAEKFYEGKYPKMNMRDWLQLVKNID >gi|226332246|gb|ACIC01000074.1| GENE 8 9875 - 10417 342 180 aa, chain - ## HITS:1 COG:no KEGG:BT_2996 NR:ns ## KEGG: BT_2996 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 168 1 172 181 98 36.0 1e-19 MGIGKSRGTSPINVNDMLENVEMNQQIESEKQEFDKLLIEIREAKETLIKFNEDLEKAIV AECHIEGALKAAAGSCDNIVSGICNAIVKAERDTKFKTTITPEQLTKLRQLIDHSVESWI SVLANHRAEQSKLIIEHESNMRKILRRNEGVWFSDFWMKVLVIFLVVYTVGLGLVVYYVT >gi|226332246|gb|ACIC01000074.1| GENE 9 10441 - 11652 321 403 aa, chain - ## HITS:1 COG:no KEGG:BT_2995 NR:ns ## KEGG: BT_2995 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 403 1 403 403 573 74.0 1e-162 MIAKASTIPHGANAIRYSVNKDRADIVKANLLPDDISPEAMYGRMMLVQKMFAEKINKGR PLGRNVIRIEISPSEEESRNWTMDEWVRLADEFIRVFDAIDLSQKTKRASSKQTNLKGSQ YIAALHRDSKSGILHLHIDANRVDMNGKINDSHKIGERAVMAANIINEKRGWVQSKEIGI RHRQEISKCCMEILRTMDKFSWRQYEMELTKRGYKVHIQENDKGKVYGYSVKRGNSNYKS SVLGIGRNLTPSKIEATWVKLHPQERKSEPTKSISQQTRTANADSVVQSKMTPQPVIKSY DIATDEYHSYHVKIPETADNIIRQDCSLVEAHPLAKIEEIQHTALLLFAGYMDAATSIAV SSGGGGSDTGGWGRDKDEDELEWARRCARMANSMCKRRRGLHR >gi|226332246|gb|ACIC01000074.1| GENE 10 11668 - 12066 337 132 aa, chain - ## HITS:1 COG:no KEGG:BT_2994 NR:ns ## KEGG: BT_2994 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 132 1 135 135 166 74.0 2e-40 MKKKKNVSNEADEENDKKVIKRTLRLEARVTEQEYAKAKELAKTCGLSMSNYVRRTALGQ HPRQRLSEREVEALCSLTDARGDLIRIAAAVKSIQADRRAMYFSDTRFVEQWMMAATQLI NRWNQIENYLTE >gi|226332246|gb|ACIC01000074.1| GENE 11 12241 - 12825 368 194 aa, chain - ## HITS:1 COG:no KEGG:BT_2993 NR:ns ## KEGG: BT_2993 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron 
# Pathway: not_defined # 1 133 2 134 272 201 76.0 1e-50 MKTNLNYCIVLSSEQLAYLAGSRYGIDRMKILHQLIEAAVLEENDYAIKGFSTTLQVGQA ILSEVDLSCKLGYDKKTISRVLDKMNQLGIVASTQGNRTSVHTLKCVSAWMLDDNRIDNP FYVRIKDRQEKGEDDCSQSEAMNEGCVSDSDFSFAGEMPVISNNGEIPSSLISVSSQSAA EEESNENISVDSQI >gi|226332246|gb|ACIC01000074.1| GENE 12 12877 - 13257 232 126 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253569235|ref|ZP_04846645.1| ## NR: gi|253569235|ref|ZP_04846645.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 12 126 1 115 115 210 100.0 2e-53 MLGTRDNINHNMRKEELIKQLQERAPLLANAVSHMVTYVQDRYPSAFPSKEQTEAVNNYL RSVHADGDGSTSERNCEHRRIASQNITIAAIRILDSRQLDRLQDVLDHIAYDKEYYMPER GYGMHR >gi|226332246|gb|ACIC01000074.1| GENE 13 13360 - 14745 745 461 aa, chain - ## HITS:1 COG:L109011 KEGG:ns NR:ns ## COG: L109011 COG5545 # Protein_GI_number: 15672499 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Lactococcus lactis # 7 348 45 390 480 131 29.0 3e-30 MEDNWKKNLATDSYGNPARSISNLRLIFTLDENLSQIRFDTFYQDDVCFCPLFRNVNGNK IDEESAGKIQDYLEKTYRIRLTQNKVFEILKTTSSERSFNPVQEFIAQETWDGQPRIATT IIDYLGAEDNPLVREQTKLWFVAAVTRVFNPGCKFDNVLTLPGPQGIGKSTFFKAISGRW FNDSFSFASGDKEKVETITNGWIIEISELNGLKRANDAEAAKAFLSRCSDYMRPAYGHKV VEFMRHNVFAATTNEANFLQGDNGNRRWWIIPVKGNGHVSDWLDDLQYSVPQFWAEAYTY YKQGMKLYLAPDMEIEANEIQMQHSNIIVDPIMEDIEMYLEREVPMQYASWMIPTRLAYQ KGAYSEPNSTMTSLNMVCARQIIEELPNDLVRRNPSKYTSQYINRLMSMFPNWKRSEQEK VKGLHPAYCDKTGRTKYPWVRVDAPSEESCATLPSELELPF >gi|226332246|gb|ACIC01000074.1| GENE 14 14750 - 15067 331 105 aa, chain - ## HITS:1 COG:no KEGG:BVU_2467 NR:ns ## KEGG: BVU_2467 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 5 100 1 99 102 101 55.0 1e-20 MEEKLNFTQLLQKPIAMMTGEELCFLITKSVENTETVTPQVASKGNYYGIEGIARVFGCS VPTANRIKKSGIIDKAITQIGRKIVVDADLALALAKESGGIHIKE >gi|226332246|gb|ACIC01000074.1| GENE 15 15245 - 16489 761 414 aa, chain - ## HITS:1 COG:no KEGG:BT_2991 NR:ns ## KEGG: BT_2991 # Name: not_defined # Def: hypothetical protein # 
Organism: B.thetaiotaomicron # Pathway: not_defined # 20 410 1 394 398 426 51.0 1e-117 MINFIERIKSYSKRKDTADMAIRAWKSANEEAYANFCKRIDAVAKGNISVMMDMYLMMRD CVPPEALMMYNWLSDFVNGKDVSDIANQQWAGQYTETIARCITNKCLWIGINVKIGTVEL FPSPKSGLLMVHSETPIEIWNRLPQGLRSYLIGQLDMFMRNSKGCYLLSKLERKMVYQFL TYISQIIFLSYAVFIGGFMANLYDRVMEKKEDLVYCMYYFVVFDHGLSRMAKLLNRLLNS EEVDYGDMLLVKSCVTMLTNRSIEMGTETKADWEDTIEDCTPEIWKEVMFALRKVKGKRG NRKVIQSLDDILLGDKEHIKQGIRLFLEENTEDISLAYLLKSLVKSGKIKASIRYMTFHR AIEQFSQRYYGHDIPQKRYGEIKELALNSPQRGNSYTKAKRIIDRWTDYFINNG >gi|226332246|gb|ACIC01000074.1| GENE 16 16502 - 17995 830 497 aa, chain - ## HITS:1 COG:no KEGG:BT_2990 NR:ns ## KEGG: BT_2990 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 494 1 487 488 914 88.0 0 MEKLTKSLSLFKERNCSVSIVLDSRTRRKNACEFPLSLRFTIDRKFFYFTVGSSFTEKKF SDICNATKCKSENYKFQKEWKDTIAPKYKEILMNLNKGGILTFEMVRQCIMGEEVGLSKE DTTQSQSFISIWEQVIHELRTEDGGARFTTAESYECSLKSFRKILGENAIKGFCICAAEL QKWKDGMHNGVKDENGAIVGKISDTTAGIYLRCCRAVWNRCVHEGYLKDVPYPFSNKKEK GLVSIPKSAKRKQSFLNVNQMTELYNLFVSKKYPEHWSEEYVQRAHYSLGLFLAQYLCNG FNMADAGRLTYNSYYYQTNGKAFRFNRKKTSRRSADGSEVIVPIIPPLRYILDEIAAPQT RDSFVFPDILKGAETEELRRKYTMQENSNVKDRVIKICHEALNWNKSICPSGTWCRHSFA TNLHNAGVDMDYISESMGHSTSDHAITQIYIEHYPLDIQMQNNSKLLNLENSSERDALLA KLAIMSTEELLKLFRQP >gi|226332246|gb|ACIC01000074.1| GENE 17 18785 - 19096 298 103 aa, chain - ## HITS:1 COG:no KEGG:BT_2979 NR:ns ## KEGG: BT_2979 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 103 1 103 103 191 100.0 1e-47 MEFLNEYHLAGLFIGICTFLIIGLFHPVVVKAEYYWGTKCWWIFLILGIAGVAASLSIDN VILSSLLGVFAFSSFWTIKEVFEQEDRVKKGWFPKNPKRKYKF >gi|226332246|gb|ACIC01000074.1| GENE 18 19101 - 19565 518 154 aa, chain - ## HITS:1 COG:YPO0002 KEGG:ns NR:ns ## COG: YPO0002 COG1522 # Protein_GI_number: 16120355 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Yersinia pestis # 3 148 6 151 153 114 40.0 5e-26 MEKIDNLDRQILEIISQNARIPFKDVAAECGVSRAAIHQRVQRLIDLGVIVGSGYHVNPK 
SLGYRTCTYVGIKLEKGSMYKSVVAELQKIPEIVECHFTTGPYTMLTKLYACDNEHLMDL LNNKMQEIPGVVATETLISLEQSIKKEIPIRVEK >gi|226332246|gb|ACIC01000074.1| GENE 19 19726 - 20601 1039 291 aa, chain - ## HITS:1 COG:ECs5185 KEGG:ns NR:ns ## COG: ECs5185 COG0545 # Protein_GI_number: 15834439 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Escherichia coli O157:H7 # 76 286 61 258 259 174 48.0 1e-43 MKKVSIFMAIAAAASLASCTAQAPKANLKTDIDSLSYSIGMAQTQGLKGYLTGRLDVDTA YMAEFIKGLNEGANKTSKKDIAYMAGLQIGQQISNQMMKGINQELFAGDSTKTISKDNFM AGFIAGTLEKGGVMTMEAAQEYTRTAMETIKAKAMEEKYADNKAAGEKFLAENKAKEGVK TTESGLQYKVITEGKGEIPADTCKVKVNYKGTLIDGTEFDSSYKRNEPATFRANQVIKGW TEALTMMPVGSKWELYIPQELAYGSRESGQIKPFSTLIFEVELVGIEKDKK >gi|226332246|gb|ACIC01000074.1| GENE 20 20622 - 21206 747 194 aa, chain - ## HITS:1 COG:ECs5185 KEGG:ns NR:ns ## COG: ECs5185 COG0545 # Protein_GI_number: 15834439 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Escherichia coli O157:H7 # 5 194 67 259 259 190 54.0 1e-48 MDKFSYAIGLGIGQNLLSMGAKGIAVDDFAQAIKDVLEGNQTAISHQEAREIVNKYFEEL EAKMGAAAIEQGKAFLEENKKKPGVVTLPSGLQYEVINEGTGKKAKATDQVKCHYEGTLI DGTLFDSSIKRGEPAVFGVNQVIPGWVEALQLMPEGSKWKLYIPSDLAYGARGAGEMIPP HSTLVFEVELLEVL >gi|226332246|gb|ACIC01000074.1| GENE 21 21678 - 22382 645 234 aa, chain + ## HITS:1 COG:jhp1180 KEGG:ns NR:ns ## COG: jhp1180 COG0846 # Protein_GI_number: 15612245 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Helicobacter pylori J99 # 1 231 1 225 234 233 50.0 3e-61 MKNLVVLTGAGMSAESGISTFRDAGGLWDKYPVEQVATPEGYQRDPALVINFYNARRKQL LEVKPNRGHELLAELEKNFNVAVITQNVDNLHERAGSSHIIHLHGELTKVCSSRDPYNPH YIKELKPEEYEVKMGDKAGDGTQLRPFIVWFGEAVPEIETAVRYVEKADIFVIIGTSLNV YPAAGLLHYVPRGAEVYLIDPKPVDTHTSRSIHVLRKGASEGVEELKQLLIPAP >gi|226332246|gb|ACIC01000074.1| GENE 22 22366 - 22584 239 72 aa, chain - ## HITS:1 COG:no 
KEGG:BT_2974 NR:ns ## KEGG: BT_2974 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 72 1 72 72 125 100.0 3e-28 MENQILIFRTSITKRRDIKLIGSLLAKFPQITRWNVDFEDWEKVLRIECSNITALEISET LRNSHIFATELE >gi|226332246|gb|ACIC01000074.1| GENE 23 22599 - 22847 144 82 aa, chain - ## HITS:1 COG:lin1814_1 KEGG:ns NR:ns ## COG: lin1814_1 COG2207 # Protein_GI_number: 16800881 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 1 81 34 114 142 66 41.0 1e-11 MAGIACFSPFHFHRVFTFLTGETPTEYIKRTRIEKAALLLQNDRQLSATEIASRCGFSSL SLLSRNFKQHFNITIREFRSQE >gi|226332246|gb|ACIC01000074.1| GENE 24 23024 - 23797 756 257 aa, chain - ## HITS:1 COG:slr1117 KEGG:ns NR:ns ## COG: slr1117 COG0500 # Protein_GI_number: 16329224 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Synechocystis # 25 257 19 253 253 200 41.0 2e-51 MSNNNTSIHDFDFSFICNYFKLLKRQGPGSPEATRKAVSFINELTDDAKIADIGCGTGGQ TLFLADYVKGQITGIDLFPDFIEIFNENAVKANCADRVKGITGSMDNLPFQNEELDLIWS EGAIYNIGFERGMNEWSKYLKKGGFIAVSEASWFTSERPAEIEDFWMDAYPEISVIPTCI DKMERAGYTPTAHFILPENCWTEHYFAPQDEVRETFMKEHVGNKTAMDFMKGQQYERSLY SKYKDYYGYVFYIGQKR >gi|226332246|gb|ACIC01000074.1| GENE 25 24265 - 28263 2126 1332 aa, chain + ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 795 1078 1 273 294 153 35.0 2e-36 MVMAEATNHYFKHLGVSDGLSQVCIPSIYQDELGAVWLGTTEGLNRYNGKDVKVFQPSES NHGLTNNEINELCGDKNGRMYIRSGNDLIKLDIYTEQFTCLRQKGVYGLFCKKDTLWVSC ADGIYYYTGQGTELTFFARVQKGFVGRALYVDKETVWVVTRKYLVAISREDPLRQEILLT LDKGNCILGDSSGNIWVGAWNGLYRISPNRETTAYVSHSGMGELSDNQVRCVLEDDFKHI WVGTFKGIDCYDPVTDTWSHYTRYGSSSNSLSHTSILSLHKDMQGNIWVGTYYGGVNVFN PDKNNHSFYCAEPLREDCLSFPVVGKMTEDSAGNVWICTEGGGLNCYHGDTGVFSRYMYH KGDQGTLGSNNVKSIFYRKENNRLYAGTHLGGLFVLDLKSNKGHTLHHVTGDPASLPHEI VNDIQEYKDGIAMLTQGGPVFFDPVTEKFSPLTDDPSVQKLISREYAFETFLIDSRHRMW LAVGGGGIVCVNLSSSKVTQYMADSEKDTLTVGRYKTVHIFEDNKGDIYFSTIGAGIFKY QEKQDTFKSYGTTNGVLPSNYCYYICESARDNCLLVLHGKGLSIFDREKEVTEETYHLFR QTYNLGSALYRSKNGIIYISGTNGLAMFQERSFYESQVKSFLNFDKLFIFNDEIAPGDES GILTEILAKTSDIFLSYQQNNVTVEFATFNYNSDRNRLFEYYLDGFDKVWTQTSGTTVTY TNLPPGDYTLKVRSMENKESLKDAICLNIHVSAPFYATIWAYLFYVLCLTALMVAFVRFK TRQAALRSSLEFERKEKERIEELNQVKLRFFTNISHEFRTPLTLILGQIEVLMQMDLGTT VYNRIQRIYKNAWHMRNLISELLDFRKQEQGYLKLRVEEQNLVTFTRQIYMCFYEYAQKK EITYRFDSVEETISVWFDPIQLQKVIFNLLSNAFKYTSNKGNITVEVRKVASQAVVSVCD TGIGIPVEHISKIFDRFYQTDNASSSSSFTLGTGIGLALAKGIMNMHHGKIEVESTVGEG TKFILSLPLGNRHFSDEEMAVTEGRESLIIPEASVSSYGQLMAEEIKEPESQKNMDEEDK PTILLVDDNTELLSMLEDIFLPMYNVYTACNGREGLEMAQQIQPDLIVSDVIMPEMSGKD LCYKIKTNVELSHISVVLLTAQTSVEYVVEGLMFGADDYITKPFNIKVLVARCNNLIKNK KRLIAHYAGKVITESPVAEAINEKDKELLTKCVNIVRENFENPDFDVTALASELCMGRSK LYMQFKQMTGLTPNEFILKVKLDEAMSLLTEHPELNITEISVRLGFSSPRYFSKSFKAFF GIAPQGVRSRKE >gi|226332246|gb|ACIC01000074.1| GENE 26 28610 - 30064 1396 484 aa, chain + ## HITS:1 COG:TM0306 KEGG:ns NR:ns ## COG: TM0306 COG3669 # Protein_GI_number: 15643075 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Thermotoga maritima # 39 457 7 433 449 122 26.0 2e-27 MKNTIIYRMKSRLITLLLLLITVSSVTFAQEAKKEIPLKYGATNEGKRQDPAMQKFRDNR LGAFIHWGLYAIPGGEWNGKVYGGAAEWLKSWAKVPADEWLKLMDQWNPTKFDAKKWAKM AKEMGTKYVKITTKHHEGFCLWPSKYTKYTVANTPYKRDILGELVKAYNDEGIDVHFYFS VMDWSNPDYRYDIKSKEDSIAFSRFLEFTDNQLKELATRYPTVKDFWFDGTWDASVKKNG 
WWTAHAEQMLKELVPGVAINSRLRADDKGKRHFDSNGRLMGDYESGYERRLPDPVKDLKV TQWDWEACMTIPENQWGYHKDWSLSYVKTPIEVIDRIVHAVSMGGNMVVNFGPQADGDFR PEEKAMATAIGKWMNRYGKAVYACDYAGFEKQDWGYYTRGKNDEVYMVVFNQPYSERLIV KTPKGITVEKATLLTTGEDITVVETTRNEYNVSVPKKNPGEPYVIQLKVRAAKGTKSIYR DALT >gi|226332246|gb|ACIC01000074.1| GENE 27 30084 - 32192 1831 702 aa, chain + ## HITS:1 COG:SSO3036 KEGG:ns NR:ns ## COG: SSO3036 COG3250 # Protein_GI_number: 15899743 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sulfolobus solfataricus # 23 585 22 554 570 150 26.0 1e-35 MKFKYILILSALIGQILSLSAREVTSFNEGWLFKRGPFSQDPVKVAAQWNADWETVNLPH TWNAKDMQVKADAFYEGVGYYRKTQFFGNELKGKRVFLRFEGVGANTEVYVNGKLVGMHR GAYSAFAFEIGTALKLGAENEIMVKADNASRSDVIPVNHNLFGVYGGIYRPVWLIITEQN NITVTDCASPGVYITQRNVSKRSADITVKVKLDNGSLTPADLVLENTIYTQNGKKVTSHR LPLVLTPQGTQTYISTFKLNKPHLWQGRKDPYLYKVISRLLADGKVIDEVVQPLGVRRYE IVPGKGFYLNGEKYPMYGVTRHQDWWGLGSALTNKEHDFDLAQIMDIGATTVRFAHYQQS DYLYSRCDSLGLVIWAEIPFVNRVTGYEAENAQSQLRELIRQSFNHPSIYVWGLHNEVYQ PHEYTAGLTQTLHNLAKTEDPDRYTVSVNGYGHAEHPVNQNADIQGMNRYFGWYEKKLQD IEPWVKGLEEKYPWQKLMLTEYGADANLDHQTEYLGDALNWGKSFYPETFQTKTHEYQWS VIAKHPYIIASYLWNMFDFGVPMWSRGGVPARNLKGLITFDRKIKKDSYYWYKANWSKDP VLYLTQRRNIDRERKHTSVTVYSNIGTPQVYLNGKELTGIRQGYTDVHYVLADVTLEQGK NTIKTVATYNGKEYTDEIEWNYTGEKKRSADWSENKEEHAGW >gi|226332246|gb|ACIC01000074.1| GENE 28 32319 - 35477 2247 1052 aa, chain + ## HITS:1 COG:no KEGG:BT_2968 NR:ns ## KEGG: BT_2968 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1052 1 1052 1052 2039 100.0 0 MKIEKFYLFLLACFVAIGAYSQDGQQKMTGDEKSQQQSDAKVKITGQVFDESGEGIPGAN VTLKSNPTSGTVTDLDGKFILMASPQKDVLVVSFIGYNTQEFPLKGKTNVTIQLSQNVNE LDAVEIVAFGTQKKESVIGSITTLSPKSLRVPSSNMTTALAGQVAGIISYQTSGEPGADD ASFFVRGIASFGFNTSPLILIDNIESTSTDLGRLNPDDIESFSIMKDAMATALYGSRGAN GVVLVKTKEGERGKTKFDVRIEGSNSRPTSNIELADPVTYMKLHNEAILTRDPSAPVMYS DDKIDRTVPGSGSIIYPTNDWRRQLMKNSTWNGRANMSISGGGNSATYYVSLRYTKDQGL LNVDGKNNFNNNINLQTYQMRANVNINVTKTTQVRVNLSGIFDTYEGPIYSGSDIYKMVM 
KSNPVLFPAVYPTDEQHKYIKHILFGNSDDGSYLNPYAEMVKGYKEYENTTLLATLGVTQ DLNFITKGLKFEGFFNVSRKSYYGQTRQYKPYYYALSSYDFMTEKYSIENINPDSGTEYL DFSPGDKTVNNVMTIETRTSYNQTFGDHSVGGLIVTQYIDSKNPNYKTLQESLPSRNMGV SGRFTYAYSDRYFTEFNFGYNASERFDKKHRWGFFPSVGGGWMISNEPFFQPLSSKITKL KLRASYGLVGNDKIGRVDERFLYLSNVNMNAGGASFGYENKYSRPGVNVSRYANPAIGWE KSRKANFALEASFYGFDLIAEYFTEHRTDILQKRASIPSVMGYQADVYANIGETKGHGVD LDLKYQKNLNKNAFLIVRGNLTYAHSEYLKYEDNTYDKEWWKYKIGYSPNQKWGYIAEGL FIDDAEVANSPVQFGDYKAGDIKYRDMNGDGVINSLDQVPIGHPTSPEINYGFGSTFSYK GFDINFQFHGSAQSSFWIDYDKMSPFFKDSKMSQKTNNQLVKFIANSYWSESNRNRYATW PRLSTTSVANNKELSTWFMRDGSFLRLKLVELGYTVPQKIVSKWGMSNLRLYMSSTNLFV LSKFKDWDVELAGNGLNYPLQRVFNIGVNVSF >gi|226332246|gb|ACIC01000074.1| GENE 29 35483 - 37435 1430 650 aa, chain + ## HITS:1 COG:no KEGG:BT_2967 NR:ns ## KEGG: BT_2967 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 650 1 648 648 1352 100.0 0 MVMKKIKICILSCLSVFILTSCELNIVPDNIPTIEDHAFALRKEAEKVLFTCYRYMPSDG SLKNNVALLGAGDFCISDVFRSYLGTSAWYIAQGLQKSDSPYCAQWGHLYEGIAYCNILI ENIHTTPDMDDSEKNRWIGEVEFLKAYYHFYLIRMYGPIPIMDRYVPVSSSTEEMHPYRE TLDSCFNYVTETLDKVIANPDVPAKIENEAEELGRVTVGMAKALKAKVLVTAASPLFNGN MDYEVIADNRGVKIFNPVKTEEEKKQKWVKAAEACKEAITYLEDLGFGLYYFDDPTVVMT ESDKLMMNLRGAVTEKWNKEVVWANSQSWVGSGSYDNYQIQAMPRDLVPGRNAKNATNRT NLGVSLGLTNSFYTKHGVPIEEDKTWDYANRFEVQAPTPVEAEPGYENTIMLNYKTGTVK LNLEREHRYYASLGFDGGIWFGQGKTGLSGLYSVNARKGGNVSPIPADHSQNLTGIWPKK VVNYKTVMSDAASGFTSVTYPFPVIRMADLYLMYAEALNEKGEDYATVLPWIDMVRERSG LQGVKESWDQYVGNSKYATQTGLRQIIMQERRIELAFEGHYFWDVRRWKTATTELTQPLT GWKVRYGETDVDYYSENLVLVRDFTPKMYFWPIDIGELRKDPNLVQNYGW >gi|226332246|gb|ACIC01000074.1| GENE 30 37459 - 38706 880 415 aa, chain + ## HITS:1 COG:no KEGG:BT_2966 NR:ns ## KEGG: BT_2966 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 415 1 415 415 810 100.0 0 MKKIIYSSLLVLAITLLPSACDNDSDMNSPHGSTTPPEQVVLNGVKNLPGKSIIYFTQPS DKNFLYMKAVYMTEIGERSTNASYFADSVVVEGFEKAGEYNVKLYSVSPGEAYSEPLEVK 
INPEEPPYITAYKEMEIKPDFAGIRIKTSNTSNETLTFYVYRKDATGKWTEAGALYTKKQ EINEPIRGMEAVPSEFSVVVKDRWGHLSESKEISLTPYYEEEVDKKKMGYLAIGEYKGYL APNANTPKNLYDGIIGSNNTFMTLTTAGYDFTKPSSVTLDLGKKFKLSRMIVYGRRNGTD YSSIFDGLYPKEIEIWGRNDNNVTKFDPENDEGWVRLYQGVLPRADGSVIPAAIVPLTDA DKELARDGNELEFSVDLDAYRYVTFVCIKNYNVGTSRINVAELTFFGTPEDKIIQ >gi|226332246|gb|ACIC01000074.1| GENE 31 38730 - 39401 398 223 aa, chain + ## HITS:1 COG:no KEGG:BT_2965 NR:ns ## KEGG: BT_2965 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 223 1 223 223 419 100.0 1e-116 MKKFIFISVSILIFCSIISCESMDATYKDFVKDGPIMYLTRLSKDSITVRNGWERVLISF PIVKDGRSTKIALALNQSDTVRYELAKNKRTDILLENMREGSIIFSAWLEDDELNKSLAT DFTGTIYGTQYQSYLLNRSIVSKSMQSGNLVIKYSMLLDSTLVASRLTWNKGGEETTKIS YYNKEGQDVLEDFTGDSFIMETLYAPQENVLDKIWSKPVKYTK >gi|226332246|gb|ACIC01000074.1| GENE 32 39535 - 41700 930 721 aa, chain + ## HITS:1 COG:no KEGG:BT_2964 NR:ns ## KEGG: BT_2964 # Name: not_defined # Def: chondroitinase AC precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 721 1 721 721 1545 100.0 0 MKMIHYKIIILLLLAWSPWDLVLPCPSICTSALQNDDFDILMKKIRDGVAENPNIDKALE LYDDVQGCFIDIDYARRDRTNWMPLIHVDRLYDFVFAYTHPKNSYYQNEDLYHKIVKGLE YWHHRNPHCNNWWYNQIAEPQKLGILLIQMRVGKRQIPEVLETKVLQRMRKDGGHPARWT GANRTDIALHWIYRACLQKNDVDLKTAVENVYSPLSYTVKEGFQHDNSYFQHGVQLYIGG YGDEILKGVTQVALYTKGTKYALDDERIQFLRHFMCGTYYQVIRGQYMLFDVLGRGVSRN NATQKSHAALFAKRMLELAPAHIDEYNAIIARLEGKKSANYGIKPLHTHYFRGDYALHVR PHYTFDVRMVSNRTMRCEYGNGENLKTYFMSDGCTNIVTEGDEYARIFPVWNWNRIPGVT APQLDTIPRTVIDWQTKGTSVFAGGVSDSLYGVSVYSYLDTYADINTAAKKSWFFFDDEI ICLGAGVNSTAGVPVCTTINQCLLSKKEVILSQSKKQSMVKEGDFVYDSPEWVLHNGIGY VFPAGGNLFLSKKIQTGSWYSINHTESKNEQQQEVFTLGFNHGCNPRNATYAYIVVPGIH SARKMNNYRKSPVEILANTDSMQIVRHTKLGIWQMVFYKEGTFRNGELSVSVDKACALMI KDGHSGNAELHIADPGQTQSCIKVELLIPEISSERKTVLCDFRNTGIYAGASKAYKLKNI L >gi|226332246|gb|ACIC01000074.1| GENE 33 41697 - 44639 2481 980 aa, chain + ## HITS:1 COG:YPO0840 KEGG:ns NR:ns ## COG: YPO0840 COG4225 # Protein_GI_number: 16121148 # Func_class: R General function 
prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Yersinia pestis # 27 367 12 352 352 218 33.0 4e-56 MKKLLLILFLSIAYLCTTHAQSFTKLEVKQSMRRVADWQIGHYNRAVYGDLNWVNATFFL GLAYWAEIAEKDDQDDFYYKWLTRLGSRNYWQVDKRMYHADDICIAQTYLYLYEKYKQKD MIVPTLARTEWVVANPPSGSFQLTYGDATTLEHWTWCDALFMAPPVYLKLYNITGDKKFI RFMDKEYKATYEFLFDKEENLFYRDHRYFDMKESNGAKVFWGRGNGWVLGGLVEMLRELP AKSKYRPFYQELFQKLCYRIANLQSSDGFWRASLLDPDAYPSPETSCSGFFVYALAYGLN EGLLPKEKFLPVVEKGWKALLSAVEEDGKLGYVQPIGADPKKVTRNMTEVYGPGAFLLAG TEMYRMAEDAPKQNATIPQNRIQEIAAMLPDRPQGIGTSYKDRTFWNKIKESDDAKQLLE KEAPALLKQGMPPFVDSLYLHLNKTNVRLPGENMMNARYQYLYRLTLAECLENKRRYVPA IEKALVALCSQKPWSIPAHDRGLKNYKGTDYYVDLVVATAGNGIAQCVTMLDDRLSPEVR ARVQCAFREKVFRPVYRCLEETKPFWWFTATNNWNSVCLAGVTGAALALLPDKEERAYFV AAAEKYNVYGMKGYADDGYCSEGVGYYNYGFCAYILLREEVYRATQGKIDFFQTPKFVRI ARYGKKIQMNEGVCPAYSDCRIGLSPDKFILSYCDRALGITSAEEQPVLPKGNNLSLHLL ELFTSQVAKVGMTDGIRQVLQEESDALRAYYEQAGILIARPAGGTSCRLAISAKGGTNAE NHNHNDVGSYAVALGSETMVGDQGGPNSYPGDYFNGDAPQKYKIKGSFGHPVPVVDGRTQ SSGGKAKGVVLQKEFTNEKDVFSIDYTSAYSTPNLDKLIRTFVYDRQGEGNFTVGDEFTA KTPIRFETAITTRADWKILDDTHLLLATDSEQMIVAIEASGKVAFTSETIEVNSPAYTRI GIALKNQSKEGYIRLKMYTK >gi|226332246|gb|ACIC01000074.1| GENE 34 44679 - 44900 63 73 aa, chain + ## HITS:1 COG:no KEGG:BT_2962 NR:ns ## KEGG: BT_2962 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 73 1 73 73 105 100.0 5e-22 MPSVQDIYVQRAGDSYPAHWLYMSCNQIIVALQTGEEGTIFIFLFLFLLLSLKLSIFEAA FPYLNTSTSLLIE >gi|226332246|gb|ACIC01000074.1| GENE 35 44905 - 45762 860 285 aa, chain + ## HITS:1 COG:no KEGG:BT_2961 NR:ns ## KEGG: BT_2961 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 285 1 285 285 607 100.0 1e-172 MKNRFLIICTVLFLSSCLAYGQEQASVTNDFSENKQGCIQHPWQGKKVGYIGDSITDPNC YGDNIKKYWDFLKEWLGITPFVYGISGRQWDDVPRQAEKLKKEHGGEVDAILVFMGTNDY NSSVPIGEWFTEQEEQVLSAHGEMKKMVTRKKRTPVMTQDTYRGRINIGITQLKKLFPDK 
QIVLLTPLHRSLANFGDKNVQPDESYQNGCGEYIDAYVQAIKEAGNIWGIPVIDFNAVTG MNPMVEEQLIYFYDAGYDRLHPDTKGQERMARTLMYQLLALPVAF >gi|226332246|gb|ACIC01000074.1| GENE 36 46058 - 46309 99 83 aa, chain - ## HITS:1 COG:no KEGG:BT_2960 NR:ns ## KEGG: BT_2960 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 83 1 83 83 145 100.0 4e-34 MHKNREKNAMRREVNRADIGGKSGNKVKDCHVTIIRSLLSQIKENSLRVLLTYRKLFTIN TKIAVEFIINSCIVYYKRSPHLL >gi|226332246|gb|ACIC01000074.1| GENE 37 46527 - 47909 1099 460 aa, chain + ## HITS:1 COG:no KEGG:BT_2959 NR:ns ## KEGG: BT_2959 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 460 1 460 460 955 100.0 0 MKYWMLLFALLWVSYGTGMARKTEKVVNNGIPWFDDRGEIVNAHGACIVEENGRYYLFGE YKSDKSNAFPGFSCYSSDDLVNWKFERVVLPMQSSGILGPDRVGERVKVMKCPSTGEYVM YMHADDMNYKDPHIGYATCSTIAGEYKLHGPLLYEGKPIRRWDMGTYQDTDGTGYLLLHG GIVYRLSKDYRTAEEKVVSGVGGSHGESPAMFKKDGTYFFLFSNLTSWEKNDNFYFTAPS VKGPWTRQGLFAPEGSLTYNSQTTFVFPLKCGEDTIPMFMGDRWSYPHQASAATYVWMPM QVDGTKLSIPEYWPSWDVDKLKPVNPLRKGKTVDLKKITFSKEADWKVEEGRISSNVKGS TLSIPFTGSCVAVMGETNCHSGYARMNILDKKGEKIYSSLVDFYSKANDHATRFKTPQLA EGEYTLVIEVTGISPTWTDKTKRIYGSDDCFVTITDIVKL >gi|226332246|gb|ACIC01000074.1| GENE 38 47915 - 50284 2026 789 aa, chain + ## HITS:1 COG:no KEGG:BT_2958 NR:ns ## KEGG: BT_2958 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 789 1 789 789 1627 100.0 0 MRNRILFSFLAWVAVIVSVQGKQKDFVLQSGQPVAIACSGSEAPVVRTSLDLLSRDLQTV LSATAHIDTNTGNIIVGTIGQSKLIEQAGIDISALKNKKQAFMLAVSEDGKLVVAGSDSH GTAYGILEISRLLGVSPWEWWADVTPEKKETFRLSGKFRELQSPSVEYRGIFINDEDWGL MPWSNKTYEPSDVKGEIGPRTNERIFELLLRLRANTYWPAMHECTLPFFLTKGNREAAKK YGIFMGASHCEPMACNAAGEWKIRGKGAYDYVNNSPAVYQFWEDRVKEVAGQEILYTLGM RGVHDGKMQGAKTVEEQKAVLDRVFVDQRGLLEKYVNKDVTQVPQVFIPYKEVLDIYHAG LQVPEDVTLMWCDDNYGYIRHFPTAEERARKGGNGVYYHVSYWGRPHDHLWLSTMSPSLI YQQMKQAYDQGIQKMWILNVGDIKPAEYQIELFMDMAWNLDKVSSEGVTAHLKHWLEREL GTSCAKTILSVMQEHYRLAHIRKPEFMGNTREEEKNPVYRVVKDLPWSEREINERLNAYS 
ELSETVEKAASKVPAGRQSAYFELVKYPVQAATQMNRKLLYAQLARHDKEDWEKSDAAYD SIAALTQHYNSLENGKWNRMMDFKPRKLPVFNRVERKAATAPMTADRKAVCQWNAAEAKK GNAIVCEGLGYESKAAEIKKGDALTFSFGNLKTDSVEVDIRLLPNHPVHGDKLRFTVSLD GAEPEVIAYETKGRSEEWKENVLRNQAIRKIVLPVTGKKSHQLVIKALDEGVILDQVMLY EVNRNSSGI >gi|226332246|gb|ACIC01000074.1| GENE 39 50492 - 51328 395 278 aa, chain - ## HITS:1 COG:YPO2681 KEGG:ns NR:ns ## COG: YPO2681 COG2207 # Protein_GI_number: 16122886 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Yersinia pestis # 16 272 19 269 280 78 26.0 1e-14 MRDSVFKLSYFLSANEKFHIARVNITSSQDLSLHSHDYAEILWVEKGTGIHHVNGHQFRL SPGDLIMVRPKDRHTFSSSGKGIVIVNVAFPVETLDYFRQRYFEWSNLYFWTTGMFPFQM RLSRDIVKRLSSRAEEAMKYGRSNIQLDSLLLFIFRQITANEKVEDSSEIPVWLFNAIQK FNSPEFFIQGIAGFVTLCDRNIDYVNRIVKLHFHKSLTDLINEFKMQYATTQLSITSMPI KEICTNCGFRNLGHFYKIFRSVYNQTPYEYRRINQLIV >gi|226332246|gb|ACIC01000074.1| GENE 40 51526 - 52425 422 299 aa, chain + ## HITS:1 COG:AF1450 KEGG:ns NR:ns ## COG: AF1450 COG1180 # Protein_GI_number: 11499045 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Archaeoglobus fulgidus # 1 266 4 266 302 213 40.0 4e-55 MKGFITNIQRMSIHDGPGIRSTIFLKGCNLRCKWCHNPETWSMKPQLQYIEDKCIHCFSC ITVCEYEVLFIDSNRLSIHRERCTDCGKCTERCTSGALSWIGKEVDSSDIIHEILQDLIY YQKSGGGITLSGGEPLQQKDFALDILQKCREHRIHTAVETNLLTDVNTLEAFLPWVDLWM CDFKMADDTLHRKWTGHSNVPIIKNLEFLAKQAVPLTIRTPVIPNVNDSEEAIESICRFI RQLPNQPAYELLGFHSLGFVKFENLGMKNPLSNSAFLKKGQLQKLKEILIRYNLNNNKK >gi|226332246|gb|ACIC01000074.1| GENE 41 52422 - 54566 1388 714 aa, chain + ## HITS:1 COG:ECs4880 KEGG:ns NR:ns ## COG: ECs4880 COG1882 # Protein_GI_number: 15834134 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Escherichia coli O157:H7 # 157 710 185 760 765 192 27.0 2e-48 MKNYEQKIANLRMRKLAQTQEKIEKEGLLDEDDYGRVVPPENLWNIIPNHPDGSFYGFDA WADNFCSLMNIHPVYIDADDAFAGRWMYFMSKMRPNKWNPDYSYDFLKENIKKYDLICGI 
GDDAHFAPDYEIGVKLGWNGLIKKIEHYQSMHHSEEQQHFYSLHLRVIRSVQGWIQRHID QAYRMAASATDDCSKNNLLEIAKVNESIINDAPTTLREACQWIIWYHLASRTYNRDGAGG QIDTLLAPFYEKDLAEGRIDKETAVYYLACFLLNDPVYWQLGGPDESGKDQTSPLSFLIL EAADKINTSLNLTVRVFDGLNKDLFRKSVEYLVKNKNGWPRFSGDKALVEGFMKNGFSAQ LARRRIAVGCNWMSLPGLEYTMNDLVKINMAKVFEVSYTEMMNEQKDHSTSRLWMLFNEH LAKAVQTSADGIRHHLKFQKYNEPELLLNLLSHGPLEKGKDVSDGGAEYYNLAIDGAGLA TVADSFAALEQRIELENKLSWNTLTEYLRNNFSGDGGCRIRHLLKNSEKYGAAESLGEKW AIRIKETFTELVRLQSEPDKRRIFIPGFFSWANTVGFGQSVGATPDGRLAQTPISHGANP TPGFVKDGASLSLATIIAKIQPGYGNTAPMQWELDPTFANMDNIELIMSIIRAHFDLGGT LINVNIMNKDTVLAAHKDPAGYPELVVRVTGFTAYFSMLSPQFRQLVVDRILES >gi|226332246|gb|ACIC01000074.1| GENE 42 54595 - 55020 382 141 aa, chain + ## HITS:1 COG:no KEGG:BT_2954 NR:ns ## KEGG: BT_2954 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 141 1 141 141 271 100.0 6e-72 MKIIDEHTFEGKGYHPFLITDRWQVAYLNYAEAESLEQIEKLDIHHQTDEVFILLQGKAA LIGASVTDQNITYDVVNMQPGIVYNIRREVWHKIAMQPGSQVLIIENNDTHLNDFEFFEL TESQKSQLRECVTKIVSLTEN >gi|226332246|gb|ACIC01000074.1| GENE 43 55028 - 56035 828 335 aa, chain + ## HITS:1 COG:YPO0334 KEGG:ns NR:ns ## COG: YPO0334 COG0697 # Protein_GI_number: 16120671 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Yersinia pestis # 1 335 1 340 344 142 31.0 8e-34 METTVISGFLLILLAGGCSGTFALPFKHNSQWKWENNWFIWSIIALLVAPWIMAFISIPD LESVYAHESNTVLLVAFFGLLWGIGAILFGKGIDYLGVSLSLPIMQGLINVVGTLMPVIL RNPSELLTPTGLKLLTGTVIILAGIIFFAIAGHNRDSKSRQTHSETPIKKNFRKGLIICL LAGIFGPMINFAFVYGAPLQEKAVATGASSLYAANVIWSIALSAGFIINLLECIRLFGKN QSWKTYRHRTTGGLIMASLAGVLWYLSIMFYGMGGSFMGVLGASVGWATMQSTAIIAGNV AGLASGEWKGATRYAIRMMVIGLVCLIGGVVVIAL >gi|226332246|gb|ACIC01000074.1| GENE 44 56173 - 59061 2484 962 aa, chain + ## HITS:1 COG:no KEGG:BT_2952 NR:ns ## KEGG: BT_2952 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # 
Pathway: not_defined # 1 962 27 988 988 1840 100.0 0 MAGFLCVCVPSLYGVENLNVTGKVVDSSGEPLIGVSIKVENKNMGTTTDINGNFTLADIP KNTVLVLTYIGMETQKVTVTQNRLNIVMRENSVQLEEVVAIGYGTVKKEELTSAVTKVGE ESFNAGVTTSPINLLNGKVPGLSIRNTSGNDPNASPEIHLRGIGSIRAGSAPLIIVDGTY STMSELQAINANDIKSFNVLKDGSSAAIYGTRGSNGVIIVETKNASKGKASIEYTSYYYT ERPVRKLEMMSADEYMGYLKDKGHSTINNDYGFSTDWTDQLIRNNFSYYQGVSFSTGTEN SSIRGSVGYRDSESMVINTGNKQLTGRVNFKQYFFDKKLTIEGTLNGANTQSDYTDYGAF MQAIVYHPTAPVYNQEGKFFEYAGVGPYNPVALLSQVDNRGERYTYSGNLSANLKVLPCL SVYAMMSANNDTYEGSKYISRDSRYSVLGGFEGQADMNTYFYKKRTFEAYANYSFSTDVH NLTALAGYSFNREDRTWHNASNSGFLTDIPGANNIGNGTYLEDGLASMGRGQDESKLIAF FGRINYSYKGKYLFNASVRREGSTRFGENHKWGWFPAVSAGWRILEENFFSDATPLSNLK LRVGFGITGNQDIPLYASIAKYNDLGYAYYNGKWDKVYGPVSNPNPDLKWEKNAEYNAGV DFGIWNDRLTASVDYYYRKTTDLLDWYDAQMPSNIYSTIFTNVGTLTNQGIEFAIGYDVI RNKELKWHLDGGFSYNENKLVSLANSSYRANHITYNPLSSPANGQTTYILEEGKPIGTYY GLKYRGFNSAGKWVFEDRDEDGAYSDKDYTYLGNGLPKWYFNLASTLTWKDFDFSFQLRG AAGFKVLNTKRIYYENSVSLPFNLLKSALGRPLNDAATFSDYYLEKGNYLKLDNITVGYT FDLSQLKYVSNARIYATATNLLTITGYTGVDPEVGTGLTPGFDDSGYYPRSTTLMLGLNI KF >gi|226332246|gb|ACIC01000074.1| GENE 45 59067 - 60605 1207 512 aa, chain + ## HITS:1 COG:no KEGG:BT_2951 NR:ns ## KEGG: BT_2951 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 512 1 512 512 1047 100.0 0 MKKIIYFITLLFLVGCVDLSENMYNDLGKDNYYQTEQEYDAAFMNQYAFLRTMYAWNVFY LQEITTDEACLPQKGRDGYDGGIYQRMHWHTWTDQDVIINGAWQELFKAVGFCNQVLNDL NNSRSDILPENKKKLYIAESRALRAFFYSMLVDMFGNVPIVEKVSELNPPTKTRTEVFNY VEKEIAEVKDQLLRSTDTGSYGRFTQEAALALLSRLYLNAEVYTGTSRLDDCISVSQTLL QKGYQLESTWDAPFAWNNENSKENIWVIPNDEIVANEMSVLFYRNIHWSQRSQWDWKNPN GGWNGVCTVREFIETYDITHDRRCKYDPQNGMYGQFMWGPQFDLSGNPILGTNEYAGQQL DFTLDLPDMVNNKENAGARNIKYKVKVGAQGMENDFALFRIAEINFNLAEATLRKSNEVD SKALEGINKIRTRAGVSTYATGELTLDELYREKGREMCYETLRRTDMIRFGRFIQPMWDK NYTDKETINLFPIPLQATKLNPELKQNPGYTK >gi|226332246|gb|ACIC01000074.1| GENE 46 60632 - 61579 675 315 aa, chain + ## HITS:1 COG:no KEGG:BT_2950 NR:ns ## KEGG: BT_2950 # Name: not_defined # Def: hypothetical protein # 
Organism: B.thetaiotaomicron # Pathway: not_defined # 1 315 1 315 315 639 100.0 0 MKKIFHLACFAFILLAAACSDEKEVTLEYNKIDGLLSQCNELLSSSSEGEDLGDFVTGSK AIFESKIRETEFIRNKTDRQSALDNYCEKLEIARQTFLSSKVLPACPLFDGTGYIDCGPA SQFFSDKLTLEAWVYANEKTGGSIIGSEGSGDNGIYGLLIRLAEGVNNAIDFTIVNGTWS SCITQENSIELKKWVHIAATFDSQTAKVYLNGQEAATMDIAPYIPEGSCKFIIGDLSTWN GRQFRGKIYDVRIWHTIRTQQQIADNYQIFLKGDEEGLVANWQLNVKSGSSIKDITGKYP ATLVNLTWSDLDNLN >gi|226332246|gb|ACIC01000074.1| GENE 47 61640 - 62791 884 383 aa, chain + ## HITS:1 COG:no KEGG:BT_2949 NR:ns ## KEGG: BT_2949 # Name: not_defined # Def: putative alpha-1,6-mannanase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 383 14 396 396 753 100.0 0 MLSFVQVTSGCDATVQDIIIDTDPGVEIGNNDYYTWCKETLSVIDKDLKISGTHSYYENQ DRSQVSFIWGNIFLLYTYTEGISLSKSEWSDALMNCFLNFDNYWHPNYKGIAGYATLPTS AEKVPDRFYDENGWTAIGLCDAYLATQNNSYLEKAKGALAFSLSGEDNVLGGGIYFQETF VSLPVQKNTICSAVTMLSCMKLYEITQDRQYLDAAIRINDWTVENLLDKSDNLLWDAKMV ADGSVNTQKWSYNAGFMIRSWLKMYQATKDEKYLSQAKATLASSEAKWYNSINGALNDPG YFAFSIIDSWFDMYDTDKNTVWLTKAFHAINFIHNKLRDGNGRYPEHWGTPTTSNLEKYD LRFSTVAAYMYMRAANYKRILND >gi|226332246|gb|ACIC01000074.1| GENE 48 62803 - 65496 1685 897 aa, chain + ## HITS:1 COG:CC0533 KEGG:ns NR:ns ## COG: CC0533 COG3537 # Protein_GI_number: 16124788 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Caulobacter vibrioides # 406 892 274 755 770 252 34.0 3e-66 MKMKLLILAVLPFLTPINMNSQEEKASWVQYVNPLIGTEVWQSGVAVAGHEDPSGYTFPG VTEPFGMTEWTAHTLESKHAGTLHHRVPYWYKHNYISGFMGTHYPSGAVMFDYGAVELMP VVGQLKCRPEERSSSFTHITEKSKPHFYEVMLDDYQVKAQLSATKTSAILQFTYPQSDSA YVVVDAMPSMFTAGAPGYIHIDPVRKEISGKSIQSARGYRETGYFVVRFDKDFDSFGTFN LNNDYPEVIEEKYLFTQKEGKWVNGLKGIYTQDSKGVGHLRSEKIDPVIDFDWDWYKPAD DFSFNDYQVTWSGKLKAPSTGEYTLGIQADDGARLYINGELLIDDWKSHSFSYQPTQKKI SLEAGKMYDIKLEYYQHEWSSRIKLSWIRPDKKSSTSLLTGNRHLESSTKIGGYIRFKTG KNEVIKAIVGTSFISVEQARINLEREIGAKSMETISAQTEALWNKELSVIDLPGATEQDK IVFYTALYHSFLLPRSLSEDGKYRSPFDGKVHKGISFTDYSIWDTFRATHPLFVLLKPDF 
AGDLITGLLHAYDEGGWLPKWPNPGYTNCMMGTHSDAIIADAYVKGVRNFDVEKAKKAVL KNAYEKGNHIAWGRLGIMDYERLGYVPVDKYGESVARTMEFAYDDYCLSRFFAEKGEPDL SDKLGKRSKNFRNVLDKETKMVRARKADGSWSNPEDYDISVWSGFNPKGVYNYKKNYTLF VPHDVPELIRFLGGTDSLAVFMDELFDKDIYYVGDEFVMHAPYMYNLCKRPWMTQKRIYD IVNKYYLPTPSGLPGNDDCGQLSSWYIFSAMGFYPMCPASIEYQLGVPCLPGFVLHLPQN RTFTIKTKNFGKGNCYVRAVYLNGKPHRSSVITHSDIINGGEILFELTDKPAYNWFQ >gi|226332246|gb|ACIC01000074.1| GENE 49 65554 - 66309 482 251 aa, chain - ## HITS:1 COG:jhp0094 KEGG:ns NR:ns ## COG: jhp0094 COG0463 # Protein_GI_number: 15611164 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Helicobacter pylori J99 # 9 187 3 182 260 137 40.0 1e-32 MHSVHPTPKFSIITVTYNAEKVLEDTIQSVISQTYHHVEYIIVDGASKDGTLSIIDRYRP RITTVVSEPDKGLYDAMNKAISLASGDYLCFLNAGDCFHEDDTLQQMVHSIHGSVLPDVI YGETAIVDKDRHFLHMRRLSAPEKLDWKSFKQGMLVCHQAFFARHTLVEPYDLSYRYSAD FDWCIRIMKKACTLHNTHLTIIDYLDEGMTTQNRKASLKERFRIMAKHYGWISTAAHHAW FVLRLIIKPGQ >gi|226332246|gb|ACIC01000074.1| GENE 50 66302 - 67567 1098 421 aa, chain - ## HITS:1 COG:all4426 KEGG:ns NR:ns ## COG: all4426 COG0438 # Protein_GI_number: 17231918 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Nostoc sp. 
PCC 7120 # 1 416 1 412 417 239 32.0 8e-63 MRVLIINTSERIGGAAVAAGRLMESLKNNGIKAKMLVRDKQTDQISVVGLKGSWLHVWKF MWERIVIWKANRFKKNDLFAVDIANTGTDITSLPEFQQADVIHLHWINQGMLSLNTIRKI LTSGKPVVWTMHDMWPCTGICHYARECKNYEQECHHCPYIYGGGGKKDLSTRIFRKKKEI YSQASITFVGCSRWLAEKAKVSGLLTGQTVISIPNAINTNLFKPHNKQEARRKCRLPQEG KLILFGSVKITDKRKGIDYLIEACKLLAEKHPEWKESLGVVVFGNQSQQLQDLIPFRVYP LPYIKNEHELVDIYNAVDLFAIPSLEENLPNMVMEAMSCGVPCVGFNTGGIPEMIDHLHN GYVAQRKSSEDLANGIHWVLTEPEYAELSAQACRKAIGNYSESIIAKKYTEVYNKITGKY A >gi|226332246|gb|ACIC01000074.1| GENE 51 67673 - 68875 1305 400 aa, chain + ## HITS:1 COG:ECs4720 KEGG:ns NR:ns ## COG: ECs4720 COG0677 # Protein_GI_number: 15833974 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetyl-D-mannosaminuronate dehydrogenase # Organism: Escherichia coli O157:H7 # 4 381 6 399 420 422 54.0 1e-118 MKKVVFLGLGYIGLPTAAVAAAHGFEVIGVDVNPSVVETINQGKIHIVEPELGQVVRDVV QSGKLRAVCKPEAADAFFIVVPTPFKQNHRADITYVEAATRSVIPYLQEGNLFVIESTSP VYTTERMAEVIYKERPELKGKIHIAYCPERVLPGNTLYELVHNDRVIGGIDPASTAKAIE FYSAFVQGKLHGTNARTAEMCKLTENSSRDSQIAFANELSMICDKAGINVWELIELANKH PRVNILQPGCGVGGHCIAVDPWFIVSDYPEQAQLIKRARETNDYKADWCANKVLETCQKF VEQNEREPVVACMGLAFKPNIDDLRESPAKYIASRIISESRADVLIIEPNISTHKSFHLT DYREAYEKADIIVWLVRHDVFLALPREEGKIELDFCGVRR >gi|226332246|gb|ACIC01000074.1| GENE 52 68926 - 70056 1139 376 aa, chain + ## HITS:1 COG:STM3920 KEGG:ns NR:ns ## COG: STM3920 COG0381 # Protein_GI_number: 16767195 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine 2-epimerase # Organism: Salmonella typhimurium LT2 # 3 375 2 370 376 473 61.0 1e-133 MKKILLVFGTRPEAIKMAPLVKALQKDTEHFETRVCVTAQHRQMLDQVLEVFGITPEYDL NIMAPNQDLYDITAKVLLGLREVLKDFRPDTVLVHGDTTTSMAASLAAFYMQIPVGHVEA GLRTYNMLSPWPEEMNRQVTDRICTYYFAPTEQSKVNLLQENIDAKKIFITGNTVIDALL MAVDIISTTAGVKEKMAKELQEKGYTVGDREYILVTGHRRENFGDGFLHICKAIKELAAL HPEMDIVYPVHLNPNVQKPVYELLSGLSNVYLISPLDYLPFIYAMQHSTLLLTDSGGVQE EAPSLGKPVLVMRDTTERPEAVEAGTVKLVGTNAEAIVSNVTALLLDKEMYKRMSETHNP YGDGQACERIIAALRC >gi|226332246|gb|ACIC01000074.1| GENE 53 
70736 - 72394 560 552 aa, chain + ## HITS:1 COG:CC1011 KEGG:ns NR:ns ## COG: CC1011 COG0110 # Protein_GI_number: 16125263 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Caulobacter vibrioides # 60 203 67 209 215 86 32.0 1e-16 MVIIGSKGCAKEILTALKWDNVEETVSLFDNINTDISDAYYDFPIIKSWNELEQHLKTDS KVIIGVGGGQRREVLARKIACLGGVLTTFISQKALVGGYDNTIEPGVVILSGATITCNVS IGQGTFINKSTVISHDVRIGRYCEVSPGAKILGRAIIGDRTEIGANAVILPDVIVGADCK IGAGAVVTRNIDSHTTVAGVPARSITKSSNNAFKLKSKIRNLLYHIRIADFRKLREYNHY VFGKRKLMFLELLSHSWMYGASFENYYELQFFKKSRTECRQYLTSSLRHELTRQVNDPCE ALVLKDKVRFAEVFEDILGRRVMTFDEIKRQMHDPYSISINEVVIKPIKGQAGQGIIFPM QNFTSLRQLHDYVISTVKKPDEYLYEERIIQHSALNKLNPSSLNTLRIVTYYDESINKVD VWSVVLRIGIKARTDNFATGGIAALVDHRGVVCQPAIIKHPSGERFHIHPVSGEKITGCI IPYYDQAIALAKQAAMRIPKVRSIGWDIAITETGPYMLEGNDNWCMTLFQLPGGEGLRHL ANSVCNMFSVYE >gi|226332246|gb|ACIC01000074.1| GENE 54 72556 - 73920 409 454 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 3 452 2 455 460 162 28 7e-39 MKKNIILLVAMSLTLSGCGIYTKYKPATSVPDDLYGGEVVTEDTAGLGNMDWRELFTDPH LQSLIETGLRTNTDYQSAQLCVEEAQAVLMSAKLAFLPAFALSPQGTVSSFDTNKATQGY SLPVTASWELDVFGRMRNSKKQAKALYAQSEDYRQAVRTQLIAGIANTYYTLLMLDEQLI ISRQTEEAWKETVASTRSLMNAGLANESAVSQMEATYYQVQGSVLDLQQQINQVENSLAL LLAETPRNYERGTLADQQFPTDFAVGIPVRMLAARPDVRSAERTLEAAFYGTNAARSAFY PSITLSGSAGWTNSVGSMILNPGKFLASAVGSLTQPLFNKGQVVAQYRIARAQQEEAALG FQQTLLNAGSEVNDALTAYQTSQGKRLLLDKQVTSLQTALRSTSLLMEHTNTTYLEVLTA RQSLLSAQLSQTANHFTEIQSLINLYRALGGGQE >gi|226332246|gb|ACIC01000074.1| GENE 55 73936 - 77112 3402 1058 aa, chain - ## HITS:1 COG:BMEI1629 KEGG:ns NR:ns ## COG: BMEI1629 COG0841 # Protein_GI_number: 17987912 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Brucella melitensis # 6 1042 5 1035 1051 726 39.0 0 MKGNIFIKRPVMAISISVLILAIGLISLFTLPVEQYPDIAPPTVYVSASYTGADAEAVMN SVIMPLEESINGVEDMMYITSSASNAGLAIIQVYFKQGTDPDMAAVNVQNRVAKAQGLLP 
AEVTKVGVSTMKRQTSFLQIGALVCTDGRYDQTFLANYLDINVIPQIKRIEGVGDVMELG DTYSMRIWLKPERMAQYGLVPSDITGVLGEQNIEAPTGSLGESSQNVFQFTMKYRGRLKS VEEFQNTVVRSREDGSILRLKDVADVELGTMTYSFRSEMDSQPAVLYMIFQTAGSNATAV NKEITAQMERMEKNLPEGTEFVTMMSSNDFLFASIHNVVETLIIAIILVILVVYFFLQDL KSTLIPSISIIVSLVGTFACLVAAGFSLNILTLFALVLAIGTVVDDAIIVVEAVQSKFDA GYKSPYLATKDAMGDVTMAIVSCTCVFMAVFIPVTFMGGTSGVFYTQFGITMATAVGISM ISALTLCPALCAIMMRPSDGTKSAKSINGRVRAAYNASFNAVLGKYKKGVMYFIRHRWMV WTSLAATIVLLVYLMSTTKTGLVPQEDQGVIMVNVSISPGSTLEETTKVMDRLENILKDT PEIEHYARVAGYGLISGQGTSYGTIIVRLKDWSERKGKEHGSDAVASRLNAQFQSIKEAQ IFSFQPAMIPGYGMGNSLELNLQDMTGGDLATFYDASTQFLNALNQRPEVAMAYTSYAIN FPQLSVEVDAAKCKRAGISPSAVLDAVGSYCGGAYISNYNQYGKVYRVMMQASPEYRLDE QALNNMFVRNGTEMAPVSQFVTLKQVLGPETANRFNLYSTITANVNPADGYSSGEVQKVI EEVAAESLPIGYGYEYGGMAREEASSGGAQTVFIYAICIFLIYLILACLYESFLVPFAVI FSVPFGLMGSFLFAKILGLENNIYLQTGVIMLIGLLAKTAILITEYAIERRRKGMGIVES AYSAAQVRLRPILMTVLTMIFGMLPLMFSSGAGANGNSSLGTGVVGGMAVGTVALLFVVP VFYIIFEYLQEKIRKPMEEEADVQVLLEKEKSEAERER >gi|226332246|gb|ACIC01000074.1| GENE 56 77127 - 78230 1174 367 aa, chain - ## HITS:1 COG:Cj0367c KEGG:ns NR:ns ## COG: Cj0367c COG0845 # Protein_GI_number: 15791734 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Campylobacter jejuni # 39 367 39 362 367 140 31.0 5e-33 MITVNKKWIRLIGIVGCTVWMASCKQAPDAGVKSSYAVMQIEPTDKELSSSYSATIRGRQ DIDIYPQVSGTIEKLCVTEGQKVRRGQLLFIIDQVPYRAALKTATANVEAARAATSTAEL TYKSNKELYAQKVVSEFSLKTAENSYLTAKAQLAQAEAQEISARNNLSYTEVKSPSDGVV GALPYRAGALVSANIPYPLTTVSDNSDMYVYFSMTENQLLALTRQYGDMDEALKNMPEVE LILNDNSVYQKKGTIESISGVIDRQTGTVVARVVFPNESRLLHSGASGTVVVPSIYKNCI AIPQTATVRMQDKTIVYKVVDGKAVSTLITVAGINDGREYVVLDGLKAGDEIVSEGAGLV REGTQVK >gi|226332246|gb|ACIC01000074.1| GENE 57 78454 - 79365 720 303 aa, chain + ## HITS:1 COG:no KEGG:BT_2939 NR:ns ## KEGG: BT_2939 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 303 1 303 303 623 100.0 1e-177 MEKENIMTVDIEDFKKSQHILDYIDDDFAIVNSLDGIPYSNDTVRLNCFLIAVCAEGCIQ 
LDINYRTYQLQAGDLLLGLPNTLISRTMLSPKYKIRLAGFSTRFLQRIIKMEKKTWNTAA HIHNNPVKSISEGGDDPVFRFYRDLIVAKINDEPHYYHREVMQHLFSALFCEMLGQLHKE IDAPDKGDSPKEGIKQADYTLRKFVELLSKDKGMHRSVSYYADALCYTPKHFSKVIKQAC GRTPLDLINETTIEHIKYRLKRSDKSVKEIAEEFNFPNQSFFGKYVKMHLGTSPANYRNR EDE >gi|226332246|gb|ACIC01000074.1| GENE 58 79337 - 80563 1231 408 aa, chain - ## HITS:1 COG:DR1225 KEGG:ns NR:ns ## COG: DR1225 COG0438 # Protein_GI_number: 15806244 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Deinococcus radiodurans # 121 373 108 345 402 94 30.0 5e-19 MKVLFTFGGIPHYLNAMLNRLQAKGVEITVVSPKKGNTTIGKGVKMVEGGTYKHLTTPEK KMFYGKSAFPALPEIIAEEKPDIVVVGWPYFLQVFFQPALRKAMKSCNAKLVIREIPFQT PPYGKIKEYFKANPMHDEGMRLLSKGIGFYLRQWITARIRKYCYAQATGTLNYSTAAYDI LPSYGVKKEQIHVTYNSTDTEALLKEKESVLASPSILPPCDRRLLHIGRLVKWKRVDLLI EAFRKVATDYPEAELVIVGDGPELDNLRQQANDLQLADSIRFIGAVYDPKTLGAYMNESS IYVLAGMGGLSINDAMTYGMPVLCAVCDSTERDLVMEGKNGYFFKDGDADSLAGRIREMF ESPERCKEMGKESERMIREKINMETVSERYLKAFQEILRLTHLRDSGN >gi|226332246|gb|ACIC01000074.1| GENE 59 80560 - 81693 417 377 aa, chain - ## HITS:1 COG:no KEGG:BT_2937 NR:ns ## KEGG: BT_2937 # Name: not_defined # Def: putative glycosyltransferase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 377 1 377 377 749 100.0 0 MKVTVIDCSVSGHRETYYKQFAQTWAGMGYETSLIAPDGKGIQEAVAFQPVSALPLLPLP AGKPLQKKLVVLRNAVIRLRNLYRLRRSLQRLNPDLVFFACLDDMLPTLAPLWLFNLLLP YQWSGLLVQSALPPYHRGMPDTRPFLRSKNCVGVGILNEYSAKGLETFQRHILLFPDFAD LSAPATNYPLLRKLKERARGRKVISLLGSINFRKGIKLLLESIPLLPKDEYFFLIAGKSS LTAQQENDLRAFGESRNNCLFSLEKIPDESCFNALIAESDLIFAAYKQFTGSSNLLTKAA AFRKPVIVSRGLCMGRRVEQYGTGQTTGEDNAGECSAAIRSLCTETRMDTQAFARYAHEH EVEKLITCFNQISKHIQ >gi|226332246|gb|ACIC01000074.1| GENE 60 81690 - 82889 902 399 aa, chain - ## HITS:1 COG:NMA1057 KEGG:ns NR:ns ## COG: NMA1057 COG0438 # Protein_GI_number: 15794006 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Neisseria meningitidis Z2491 # 203 388 17 202 230 114 36.0 4e-25 
MHILILPMYYPEKDSSPHRGYMFFEQAMQLAKSGCKVGLAFTEQRPLKNFTWKRFRKESH FQISAEDNGSFVTMRLHAWNPKLSTRAGGIIWSLLTVLLVRKYIRTYGRPDLIHAHFGTW AGYAARLVYKWYKVPYVITEHASSINGNQTTPSQAVILKKAYSEARKIICVGTKLKRSLC AYVSDPDKVTVIPNFVDTNTFAFSPHRTEKKKHFTFISIGNLNKRKGFWDLLTAFHWAFK DMPHVSLIIAGDGEEMQPLKKLIQSLHLQEQVKLTGRLSREELSGLLGTCDAFVLASFAE TFGIVFIEAMATGLPAIGTICGGPEDIITPESGFLIRPGDVDALAAKMKTLYDTYESFDK EKIRQSIVSRFDFQLAGQKLRQVYSEALEMSKPPKASES >gi|226332246|gb|ACIC01000074.1| GENE 61 82877 - 83980 951 367 aa, chain - ## HITS:1 COG:no KEGG:BT_2935 NR:ns ## KEGG: BT_2935 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 367 1 367 367 704 100.0 0 MTTPTNYIATWFYKESKEDASFYPQAGKRGDSALVHSIYMQIQVPFFTTFRHYHPNDKLL FFSNLDQFPDYLERLFKELRIETVTLPYLCIPPKGWYGAWRNQFYLYDILRYMEKRMQAD DTLLICDADCLCMRPLGQLFSDTRKYGSALYDASDRPDLKVNGITLKEMTDIYNDCYGEK EKREKEKEATKKTEAEKEEAEQQDTKNGVTRTELVHYYGGEFISLRGDVVARINEAYPAL WNYNLERFAANQPKLNEEAHFLSVVATRLNIHNNIANQYVKRLWTYPKYNTVEKGDEQLA VWHLPYEKKRGLYLLYKHFVNHPTFDDEEAFRKKASAYTGVPTVTFGKRIRDFITILSLK IKDKCIY >gi|226332246|gb|ACIC01000074.1| GENE 62 83977 - 85425 1192 482 aa, chain - ## HITS:1 COG:mll5270 KEGG:ns NR:ns ## COG: mll5270 COG2244 # Protein_GI_number: 13474395 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Mesorhizobium loti # 4 435 73 503 561 157 27.0 4e-38 MSEEQSLKQKTAKGLFWGGLSSSVQQLLGLLFGVVLGRLLDPADYGMIGMLAIFPAIASA LQESGFVAALANREKVSDKDYNAVFWFSISFSAVLYILLYSSAPLIAKFYDTPELTSLAR FSFLNFFIASFGIAPRALLFRNLKVKENTIISLSSLFLSGIVGIILAANGFAYWGLAIQT MTFVVIGTALNWYFAHWKPSFRIDFSPIREMFGFSSKMLITQVFIIINQNLFSVLLGKFY TKQEVGDFSQANNWNNKGHALITGMINGVTQPVLASVANDPQRQAAVFHKMLRFTAFISF PAMFGLSLISREFILIAITDKWLASARIMQLLCIWGAFIPINNLFSLLLVSRGRSSIFMF NSIALSVLQLITACISYPYGITTMIYLFVAINILWLFVWYCFARREIPLTLFSILKDIAP YFLLAASLTIAAHYITSGITNLYLSLAIKVVFVASLYALVLWKMQSVIFKECIQFIKKKK IS Prediction of potential genes in microbial genomes Time: Thu May 12 01:11:10 2011 Seq name: 
gi|226332245|gb|ACIC01000075.1| Bacteroides sp. 1_1_6 cont1.75, whole genome shotgun sequence
Length of sequence - 54482 bp
Number of predicted genes - 31, with homology - 30
Number of transcription units - 11, operones - 7 average op.length - 3.9

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Tu 1 . - CDS 69 - 2108 2279 ## COG0143 Methionyl-tRNA synthetase
           - Prom 2185 - 2244 3.0
2 2 Tu 1 . - CDS 2252 - 2977 731 ## BT_2932 transcriptional regulator
           - Prom 3038 - 3097 5.6
           + Prom 3327 - 3386 2.0
3 3 Op 1 . + CDS 3422 - 3745 278 ## BT_3197 hypothetical protein
4 3 Op 2 . + CDS 3794 - 4702 622 ## BT_2930 hypothetical protein
           + Term 4703 - 4754 -0.5
           + Prom 4767 - 4826 4.6
5 4 Tu 1 . + CDS 4886 - 5404 567 ## BT_2929 putative non-specific DNA-binding protein
           + Prom 5870 - 5929 6.0
6 5 Op 1 . + CDS 5949 - 9551 1650 ## BT_2928 hypothetical protein
7 5 Op 2 . + CDS 9570 - 12785 1503 ## COG3209 Rhs family protein
8 5 Op 3 . + CDS 12788 - 13903 504 ## BT_2926 hypothetical protein
           + Prom 13907 - 13966 3.8
9 6 Op 1 . + CDS 13988 - 14374 169 ## BT_2925 hypothetical protein
10 6 Op 2 . + CDS 14380 - 14544 107 ## BT_2925 hypothetical protein
           + Term 14734 - 14790 11.6
11 7 Tu 1 . - CDS 14942 - 15145 98 ##
           - Prom 15177 - 15236 2.6
           + Prom 14901 - 14960 4.6
12 8 Op 1 . + CDS 15080 - 17140 1782 ## COG1042 Acyl-CoA synthetase (NDP forming)
           + Prom 17161 - 17220 3.0
13 8 Op 2 . + CDS 17240 - 21256 2039 ## COG3292 Predicted periplasmic ligand-binding sensor domain
           + Term 21311 - 21356 1.0
           + Prom 21345 - 21404 8.0
14 9 Op 1 . + CDS 21568 - 23382 1243 ## COG3250 Beta-galactosidase/beta-glucuronidase
15 9 Op 2 . + CDS 23414 - 24496 831 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins
16 9 Op 3 . + CDS 24514 - 27708 2590 ## BT_2920 hypothetical protein
17 9 Op 4 . + CDS 27721 - 29643 1203 ## BT_2919 hypothetical protein
18 9 Op 5 . + CDS 29663 - 30877 902 ## BT_2918 hypothetical protein
19 9 Op 6 . + CDS 30902 - 31576 505 ## BT_2917 hypothetical protein
20 9 Op 7 . + CDS 31591 - 32814 754 ## COG4289 Uncharacterized protein conserved in bacteria
           + Prom 32816 - 32875 3.4
21 10 Op 1 1/0.000 + CDS 32898 - 33872 703 ## COG1409 Predicted phosphohydrolases
           + Prom 33904 - 33963 4.2
22 10 Op 2 . + CDS 33984 - 35324 552 ## COG3119 Arylsulfatase A and related enzymes
23 10 Op 3 . + CDS 35348 - 36556 544 ## BT_2913 unsaturated glucuronylhydrolase
24 10 Op 4 . + CDS 36619 - 41205 2446 ## COG3940 Predicted beta-xylosidase
25 10 Op 5 . + CDS 41217 - 43184 978 ## COG3533 Uncharacterized protein conserved in bacteria
           + Term 43224 - 43286 9.1
26 11 Op 1 . + CDS 43529 - 44410 512 ## COG2207 AraC-type DNA-binding domain-containing proteins
27 11 Op 2 . + CDS 44493 - 45917 943 ## COG2160 L-arabinose isomerase
28 11 Op 3 . + CDS 45936 - 48482 1045 ## BT_2908 hypothetical protein
29 11 Op 4 . + CDS 48537 - 49895 632 ## BT_2907 hypothetical protein
30 11 Op 5 . + CDS 49930 - 51240 881 ## COG3754 Lipopolysaccharide biosynthesis protein
31 11 Op 6 .
+ CDS 51268 - 54471 1775 ## BT_2905 hypothetical protein Predicted protein(s) >gi|226332245|gb|ACIC01000075.1| GENE 1 69 - 2108 2279 679 aa, chain - ## HITS:1 COG:PAB2364_1 KEGG:ns NR:ns ## COG: PAB2364_1 COG0143 # Protein_GI_number: 14521189 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA synthetase # Organism: Pyrococcus abyssi # 7 546 3 553 562 547 48.0 1e-155 MEKKFKRTTVTSALPYANGPVHIGHLAGVYVPADIYVRYLRLKKEDVLFIGGSDEHGVPI TIRAKKEGVTPQDVVDRYHYLIKKSFEEFGVSFDVYSRTTSKTHHELASDFFKTLYNKGE FIEKTSEQYYDEEAKTFLADRYITGECPHCHSEGAYGDQCEKCGTSLSPTDLINPKSAIS GSKPVMKETKHWYLPLDKHEGWLRQWILEDHKEWRPNVYGQCKSWLDMGLQPRAVSRDLD WGIPVPVEGAEGKVLYVWFDAPIGYISNTKELLPDSWETWWKDPETRLLHFIGKDNIVFH CIVFPAMLKAEGSYILPDNVPSNEFLNLEGDKISTSRNWAVWLHEYLEDFPGKQDVLRYV LTANAPETKDNDFTWKDFQARNNNELVAVYGNFVNRAMVLTQKYFDSKVPACAELTDYDK ETLKEFANVKAEVEKLLDVFKFRDAQKEAMNLARIGNKYLADTEPWKLAKTDMERVGTIL NISLQLVANLAIAFEPFLPFSSEKLRKMLNMESFEWSELGRNDLLPVGHQLNKPELLFEK IEDSVIEAQVQKLLDTKKANEEANYKANPIRPNIEFDDFTKLDIRVGTILECQKVPKADK LLQFKIDDGLETRTIVSGIAKHYQPEELVGKQVCFIANLAPRKLKGIVSEGMILSAENND GSLAVIMPQKEVKPGSEVK >gi|226332245|gb|ACIC01000075.1| GENE 2 2252 - 2977 731 241 aa, chain - ## HITS:1 COG:no KEGG:BT_2932 NR:ns ## KEGG: BT_2932 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 241 1 241 241 487 99.0 1e-136 MNEKQADTLLNVPEIVKRAKIALNLKRDSELASYLGVARATLSNWCARNSIDFPLLLNKL HHVDYNWLLTGKGSPLHDPKSFDNGKIRGEVETIHNSKTTEAIDDRSVTLYDITAAANLR TLLSDKRQYALGKILIPSIPACDGAIFVNGDSMYPILKSGDIVGFKGINNFSNVIYGEMY IVAFHLDGDQYLTVKYVNRSEKEGYVKLVSYNPHHEPMDLPVDTIQDMAIVKFSIRKNMM M >gi|226332245|gb|ACIC01000075.1| GENE 3 3422 - 3745 278 107 aa, chain + ## HITS:1 COG:no KEGG:BT_3197 NR:ns ## KEGG: BT_3197 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 97 1 97 106 91 53.0 9e-18 METLQKQTLYVCGCCKHTLPSEAFYVDKKTGLSNNYCKECRKSASRKYRKTEKRTLARDD REPYPVITSTKDPDLRQKLILSALQTVRASIKRKKQKLCEVEAELTD 
>gi|226332245|gb|ACIC01000075.1| GENE 4 3794 - 4702 622 302 aa, chain + ## HITS:1 COG:no KEGG:BT_2930 NR:ns ## KEGG: BT_2930 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 302 1 301 301 564 98.0 1e-159 MGRIKQGLDYFPLSTDFMHDRIVRRVMKREGDSAFTVLIYIMSYLYSGEGYYVRADTDFC DELSDQLFSTDNDTVRRVIRLFLEYGLFDSALYERYSILTSEDVQRQYLFITKRRSRHHI CPDYCLLPEGKTTDEQPDTVAATGENVAVSPNIATASRDTATKTALIKRKEKEKKENILP NPPFAKGGDEEEEGSEGEKVAEGAGGGMREECDQRSATRKRNLTQEDIDRMQVPPDGCPR NFSGLLENLRLFRVPPAEQYAIVLKSNYGEIGGDIWKGFSVIRGSGGKIKLPGHYLLSVL NK >gi|226332245|gb|ACIC01000075.1| GENE 5 4886 - 5404 567 172 aa, chain + ## HITS:1 COG:no KEGG:BT_2929 NR:ns ## KEGG: BT_2929 # Name: not_defined # Def: putative non-specific DNA-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 172 1 172 172 286 100.0 2e-76 MPVLYKPFQSNLEDKKSGKKLFYPHVVRTGNINSAQLSKEIAAYSSLSPGDVKNTLDNLV TVMAQHLQSSESVSVDGLGTFRMVMSARGRGVETADEVSAAQATLTVRFQPTTTKNLDRT TATRSMVTGAKCARYDKLVSAPGDGGSVDDPNDKPGGGSDGGDDGEAPDPTV >gi|226332245|gb|ACIC01000075.1| GENE 6 5949 - 9551 1650 1200 aa, chain + ## HITS:1 COG:no KEGG:BT_2928 NR:ns ## KEGG: BT_2928 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 86 1200 1 1115 1115 2155 100.0 0 MTISGKAQDFNPGPEAQNLTERKNVTVDYATGLFHYTVPLYQLKSGDYELPISLDYIAKG VKVSDPEGLIGQNWTLNVGGIVTRTMRGGFADEKSGYGYLWTENAVTPLEQDARTVGLRK RDGESDIFTVVFNGKKVDFIIRMNEKRQIYALPLGQTDVRIECEGTSTEITGWTITDNNG DRYIYRQREICADVEYVDVSTSNAISDSGYTSAWHLTRILPYNGAPIDFCYKGDVMDLDF GNLSLDSIHTMKIYDSYKMIYHYGQSVKEQPFDFDQYKSRFYSAIEVAQNYLNMCSLLLD FKNVDSKIKDFERYSRINIQPLQSEYIKTNNRIVGVLSNISKMNGVSKELGESLRGFAAY CKRIGGFNADMAGSYLEEAADYIYACLSEVKYVKTKEIWGGKSYKVHSPLLNRIVFPEYI VKFAYFSSSSSLSAISLYNRNMELISSVSSTGGALARGLAFCDKNGKKTSGIEFNYYEKS DFPVWKETGVDLWGYPYAEDEDEECTDYEIYATLNSLKNIVLSDGGKIEVKYERNYGRKG IVGGIRLKSLVFSNELNGRSDTISYGYPRVGVSVYKPLSNVVSISYPECTDWIEYSRVIQ EGYPVINMGNNGLYYSCVTETISGRGSTVYSYQVSPPTSVNESDYPYWENGLLREKAVYD 
TNGEMLKKIQYIYETDVNYEDKLPQMQPSDFYLDGKSLESFYRNQGTSYLTGKEIYEYNI KPRLSPTNTDKFYNLQYGWRTALKEEVEYRSDENIPYSRTKYYYDNPMSMYPTCVIRIGS DGIERTEVLKRAMDMADTADSVFVMMKEANFLSPVVKSLILVDEKLVHETVCRYQVDRES KIGFIAPVEVLTYVPDMPEAYAQSVVDTILFSHGESNYTTEASYRYASKMWVETNGRTER KSRVYDSYGKLLLECDVIGTTASDKYKGVGKAIDEADLKFAINYLRKQQRTFASGIEVLK EEVDNQHFLRFLNSQDHSFIASFAEEIVKSKPDLSQARYYYEHIVCYDLFNVFKQEYERM IGLYPKFEILRQFISAMEAMMQFDGLSLFDYLYGLYGEVDKYKFAYLPKLTPASGMKNLR LYILDGNATASGSITHAGSEVPYTVESVSSSKLKIYDIDLSKYTDITSVTANTQGSYMAL VPEGANFKATSYNPDGTVYARFDQSGNVEYYTYDAAGRVIKVEDQYGNILKTYEYNKLNN >gi|226332245|gb|ACIC01000075.1| GENE 7 9570 - 12785 1503 1071 aa, chain + ## HITS:1 COG:MA2045 KEGG:ns NR:ns ## COG: MA2045 COG3209 # Protein_GI_number: 20090892 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Methanosarcina acetivorans str.C2A # 417 863 1580 2012 2217 75 25.0 7e-13 MKTIEMKRTLSILCLLSFSLFMNGQQATGYTKTRNYTAERTYMDASSQTGSGGARIVSDI VYTDCFGRKEQEIQVGGSPGGNADLVLSYTYNMMGQVEKEYLPYAKTGNNGAFDKFSPER WSVYGTGEQAYAYNLTQYEDSPLMRVSRKMGPGKAWHTSDKSVRVTYGMNSTNEVRCYKV SSSGSLIASGFYGAGKLEVIVETDEDGHRSKTYTDSNERTLLTVSVNGEDSLATYYVYDD RNCLRYVLPPEASHRLAEAGTTDISVLHSLAYSYEYDKLNRMISKRLPGCAPIYMVYDCR DRLVLSQDGTRRTADTKKWSYFLYDSHDRVIESGEILLSAEQTCGELQLAAWENEHYLPS GTRTPLQYTVYDNYKPTENVTAHPFVAAVGYDTPYTLYPSGLTTSTKTRLLGTDDWLTET IYYDDRSRVIQTLCHYPEKGLRYRHTAYDFTGNVLRKQDNIGTDILETVYTYDDRARLLT KANTWNGRFTDKITYVYDALGRLTGRNYGDKVSESLSYNIRGWLTGIESQHFSQTLHYVD GVGVPCYNGNISSMIWHSGSEAGLRGYKFTYDGFSRLKDAIYGEGEQLSNNLNRFNEQVT GYDKNGNILGLLRYGQTNTMSYGLIDNLNLVYNGNQLESVNDNATGHVFGNGMEFKDGAS KEIEYEYDVNGNLTKDLNKKIADIQYNCLNLPEKVQFEGGNSISYLYAADGTKLRTTYKT GNATTITDYCGNAIYENGVLIKVLTEDGYITVSNNQFHYFIQDHQGNNRVVVAQNGTVEE VNDYYPFGGLLSSSLSNNVQPYKYNGKELNRDNGLDWYDYGARMYDASLGRWHAVDPSGE KYPALGLYAYCKNSPIIRIDPDGKDDYVVNANGAVYLMRKTDRIVDVLYASGINSSQKAT QPDPNWKSIRVFDKSMLQGLTASKGEKGRDYWDRKLYTTTNSVYNAANVFLFVADNTSVE WTLKGGYMDGKRTFILGTNNNRNRVSSMYGLTKVQEALFPTFIPIIDIHSHPYNPFASPE DMDNARLLRDIRYGIYYKDNIINYTDQNNSTYSIKVNSMNDLMRYIFQQLR 
>gi|226332245|gb|ACIC01000075.1| GENE 8 12788 - 13903 504 371 aa, chain + ## HITS:1 COG:no KEGG:BT_2926 NR:ns ## KEGG: BT_2926 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 371 1 372 372 733 98.0 0 MRKKVIISLIYVSIVLTIPLAVACQGANDDSIDPILLEHVKSVIDFYPDCGKTLKVSQFD SRKTHVYSMWPKQSGFWFDTGQNGDELRLILPTNAMRYKDKYILFYLEGKKRLSEKEISR LLGISSQDDMPDSREIDQRIWIYVKDKESGKSAFEQIEYGTKIWEYPNLRYFNGGDKDSA VIDIAIYDISVHGEKGKSEKFSSPDKISLNMSVYNKSDSSLLIGLNPDLYGSFIIKNGEY SMPLMADVEVNRYFGEFHEYSPGLYFIAPHGRMSFYLSTAQQPIKLKDTSPHEYVHKLYD LFYDSICYVPAPTIQMPDTIQGIVWNKEFTAYFPFGSWYHFFVNDSIYDIYPNGEVAGYA MDKHRYKWFEE >gi|226332245|gb|ACIC01000075.1| GENE 9 13988 - 14374 169 128 aa, chain + ## HITS:1 COG:no KEGG:BT_2925 NR:ns ## KEGG: BT_2925 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 43 127 1 85 200 145 88.0 4e-34 MKPIIRILLIVLLLVLYSCKSIALYDTSTDYDFKFDLYSQVKMYFRANLKYPTVNELWNY CWKITNEVNDDTFLSFDDFDKAAYKNVSGREEFLQHLSLYKKEISFQFKKGAMFVFWKNK NGLKLSLI >gi|226332245|gb|ACIC01000075.1| GENE 10 14380 - 14544 107 54 aa, chain + ## HITS:1 COG:no KEGG:BT_2925 NR:ns ## KEGG: BT_2925 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 50 90 139 200 85 86.0 5e-16 MIADNGYKFAYFYDTQGNYSLDFDYEEDFQKIRKEVRDKCVPDTNCGVNFTRVS >gi|226332245|gb|ACIC01000075.1| GENE 11 14942 - 15145 98 67 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYIIRSPYYNYTLGTKQLGCYHIIHWFSFKLGYKYILSCEKKERKRGELWVEVFRNRRKH LGYAIIS >gi|226332245|gb|ACIC01000075.1| GENE 12 15080 - 17140 1782 686 aa, chain + ## HITS:1 COG:AF1211 KEGG:ns NR:ns ## COG: AF1211 COG1042 # Protein_GI_number: 11498810 # Func_class: C Energy production and conversion # Function: Acyl-CoA synthetase (NDP forming) # Organism: Archaeoglobus fulgidus # 5 682 3 679 685 325 33.0 2e-88 MITTQLLRPESIVVVGASNNVHKPGGAILKNLINGGYQGELRAVNPKETEVQGVPAFADV KELPDTDLAVLAVPAVMCPDIVETLASEKQTRAFIILSAGFGEETREGALLEERILETVN 
KYGASLIGPNCIGLMNTWHHSVFSQPIPNLNSKGVDLISSSGATAVFILESAVTKGLQFN SVWSVGNAKQIGVEDVLQFMDENFDPEKDSRIKLLYIESIHNPDRLLFHASSLIRKGCKI AAIKAGSSESGSRAASSHTGAIASSDSAVEALFRKAGIVRCFSREELTTVGCIFTLPELK GKNFAIITHAGGPGVMLTDALSKGGLNVPKLEGELAEELKAQLFPGASVGNPIDILATGT PDHLRICIDYCEEKLDNIDAILAIFGTPGLVTMFEMYDVLHEKMQTCHKPIFPVLPSINT AGAEVAAFLAKGHVNFSDEVTLGTALSRIVNAPKPAVPEIELFGVDVPRIRRIIDSIPEN GYIEPHYVQALLHAAGISLVDEYVSNKKEEVVDFARRCGFPVVAKVVGPVHKSDVGGVVL NIKGEQHLALEFDRMMQIPDAKAIMVQPMLKGTELFIGAKYEEKFGHVVLCGLGGIFVEV LKDVSSGLAPLSYEEAYSMIRSLRAYKIIQGTRGQKGVNEDKFAEIIVRLSTLLRFATEI KEMDINPLLATEKAVIAVDARIRIEK >gi|226332245|gb|ACIC01000075.1| GENE 13 17240 - 21256 2039 1338 aa, chain + ## HITS:1 COG:XF1330_1 KEGG:ns NR:ns ## COG: XF1330_1 COG3292 # Protein_GI_number: 15837931 # Func_class: T Signal transduction mechanisms # Function: Predicted periplasmic ligand-binding sensor domain # Organism: Xylella fastidiosa 9a5c # 28 606 28 625 740 118 23.0 6e-26 MKKIIITFVALIIGLICRAQTIDEHYYFKSLSSQNGLSQNTVSAILQDSKGFMWFGTKDG LDRYDGVSFRHFKYDRTNPRSLGNNFVTSLYEDVEGNIWVGTDVGVYIYYPEKDTFRHFV ELSDKNTRIERAVAMISGDKQGRIWIASEAQNLFCYDLKEQSLRNYILTEHSLVSTNVKC LTVDNSGTIWIGFYGDGLFYSKDNLKTLHPYLSPVDNEETYNDDVVSKIVSGAYNCLYIS SLKGGVKELNLTSGKLRDLLLKDENNESIYCRTLLVYTDNELWIGTESGVYIYNLRTEKY THLRSSEYDPYSLSDNAIYSFCKDREDGIWIGSYFGGVNYYPRSYTYFEKYYPKGIENSL HGKRVREFCRDSQGILWIGTEDGGLNRFNLKTKDFSFFAPSAGFTNIHGLCLVDDKLWVG TFSKGLKVIDTRSGVVLKTYQKNDSPRSLIDNSIFAICKTATGDIYLGTMFGLVRYNKQS DDFDRIPELNGKFIYDIKEDSAGNIWLATYANGVYCYNVNDRKWKNYVHDDKDNTSLPYN KVSSVFEDSHRQIWLTTQGGGFCIFHPESETFTSYNSTNGLPSDVVYQIVEDKEGQLWLT TNNGLVCFQPNTEAMKVYTTANGLLGDQFNYRSSFEDENGMIYFGGIDGFIAFNPKDFSE NKNLPTVVITDFLLFNKEVYADEPNSPLEKSITFSDKIVLRADQNSFSFRISALGYQAPR MNKLKYKLEGFDDEWLSVGESPLITYSNLRYGHYIFRVKAANSNGVWNSNEISLSIQILP PFYLSVWAYCIYVLLIIGCSVYLVLYFKRRTNRKHRRQMEKFEQEKEREVYNAKIDFFTN VAHEIRTPLTLINGPLENILLKKSVDSETREDLNIMKQNTERLLSLTNQLLDFRKTESQG FRLNFAECNITEVLKETHKRFTSLAKQRELDFILNVPVQDFYAHVNKEAFTKIVSNLLNN AVKYAETYVHVYLEINGTGENKMFYIRSVNDGVIIPNEMKEEIFKPFVRFNEKEDGKVTT 
GTGIGLALSRSLAELHRGNLTMLESEEANVFCLSLPVEQDSVIKLASEHKSVEESKTVER RIEQGESKANRPVILVADDNQDMLSFVVRQLEDNYIVLTAKDGLEALEVLDNQEVTLVVS DVVMPRMDGFELCKVVKSKLDYSHIPVILLTAKTNIQSKIEGLELGADAYIEKPFSVEYL QACIANLINSREKLRQAFAQSPFVAANTMALTKADEEFMKKLNEIIQGNFHNPDFSMDDI VDSLNMSRSNFYRKIKGVLDLSPNEYLRLERLKRAAQLLKEGNGRVNEICYMVGFNSPSY FSKCFQKQFGVLPKDFAG >gi|226332245|gb|ACIC01000075.1| GENE 14 21568 - 23382 1243 604 aa, chain + ## HITS:1 COG:L0025 KEGG:ns NR:ns ## COG: L0025 COG3250 # Protein_GI_number: 15673962 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Lactococcus lactis # 31 485 10 458 996 94 25.0 9e-19 MIKHLTLLGTALLFSSQILLAQWKPAGDKIRTFWAEKVDVNNVLPEYPRPIMERSDWQNL NGLWNYAVLPLGQSAPTTFDGKILVPFAIESSLSGVGKTLGMEKELWYQRAFNIPSSWRG KRVLLHFGAVDWKTEVWVNNIKVGEHTGGFTPFTFDVTEALEKSTNTLTVKVWDPTDKGF QPRGKQVSNPRGIWYTPVSGIWQTVWLEPVSEHYIAGIKTTPDIDTNKIKVKVETDSNRQ SDRLEVKVFDGKHLVAMGTSINGLPVEIPMPADAKLWSPDSPFLYQMEVSLISAGKLVDQ IKSYVAMRKYSIKRDEHGIVRLQLNNKDLFQFGPLDQGWWPDGLYTAPTDEALRYDIEKT KDFGFNMIRKHIKVEPARWYTYCDQIGLIVWQDMPSGDKSPEWQNRKYFEGTELTRSAES EETYRKEWKEVIDCLYSYPCIGTWVPFNEAWGQFKTQEIAEWTKQYDPSRLVNPASGGNH YTCGDMLDLHHYPGPEMFLYDAQRATVLGEYGGIGLVLKDHLWEPNRNWGYIQFNTSAEA TAEYLKYAGILKDMISRGFSAAVYTQTTDVEVEVNGLMTYDRKVIKLDESKLKKMNTEIC NSLK >gi|226332245|gb|ACIC01000075.1| GENE 15 23414 - 24496 831 360 aa, chain + ## HITS:1 COG:YPO0840 KEGG:ns NR:ns ## COG: YPO0840 COG4225 # Protein_GI_number: 16121148 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Yersinia pestis # 16 359 2 351 352 238 35.0 2e-62 MRKTLILILTLFAVFTTVDARSKKDQDRALIDKVAMWQIDHQTKVKHHDLAWTNGVLFRG MVEWADYTQDSRYYDFLMQIGKKHRWGFLKRLYHADDLVIAQMYIRMYEKYHDPVMIQPT IARVDSIVAKPSKARLWLGAKNWSERWSWCDALFMAPPVYGQLNKLYPEKNYLDFMDREF REATDSLYDADAGLYYRDRRYIPKKEKNGEKVFWGRGNGWVFAGLPLLLQTLPKEHPTYN YYLNIYKEMASSVIKCQDKNGSWHASMLDPESYPSPENSASGFFVYGLGWGINQGILKGK 
KYKKAAQKGWKALKSYVHEDGMLGYVQPVSAAPEQVTKDMTEVYGVGAFLMAGIEMLKMK >gi|226332245|gb|ACIC01000075.1| GENE 16 24514 - 27708 2590 1064 aa, chain + ## HITS:1 COG:no KEGG:BT_2920 NR:ns ## KEGG: BT_2920 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1064 1 1064 1064 2109 100.0 0 MKTKTRDLSACLKRRLWIAGVCLAAFSFLSVSVVYAADDIQTIQQQKKGMTIKGKVLGTD GEPIIGASVLVKGSTTGVVTDLDGNYTLANVSKGAILEFSYVGYVKLTKTVGSETTINAI LYEDSQMLEGVEVVAFGTQKKESVIGAISTVKVAELKTPSSNLTNALAGRIAGVISYQRT GEPGVDDASFFVRGVTTFGYKKDPLILIDGIELTSTDLARLQPDDIASFSIMKDATATAL YGARGANGVILVKTKEGQKGKARLSLRVENSISAPTQEIDLADPVTYMRLHNEAYLTRNP LAPLMYSEEKIDNTVPGSGSVIYPATDWRKELLKDFTMNQRVNLNITGGGDVARYYVAAS YSQDNGILEVDKNSNFNNNIKQRVYTLRSNVNINATKTTELIVRLSGVFDDYNGPLYGGS AMYNLIMKSNPVLFPAKYPKDEAHAYTKHTLFGNAENGQYLNPYAEMVRGYKESGRSNLS AQFEVKQDLKFLTEGLSARLLFNTSRISSYDVSRKLNPYYYKMTNYDYLTGDYAIDIINP DEGSEYLIYDPGGKSITANMYIEAALNYNRDFGKHGVGGLLVYQLRNNLQPNAGSLQASL PYRNVGLSGRFTYAYNNRYFAEFNFGYNGSERFHKSKRFGFFPSAGLAWVVSNESFWDPF KSVVSKLKLRASYGIVGNDAIGSGRFLYLSDINMNDSKYGANFGYNFDYHRDGISVKRYS DPEITWEKSAKTNFALEFTLFDDLNVTAEYYTERRKSILQQRASIPASMGLWVQPYANLG EAKGSGVDLSLDYNKYFANKSWLQLRGNFTYATSEYMVYEDYEYPGAWWKQKVGYPTNQT WGYIAEGLFVDDNEVANSPVQFGEYGAGDIKYRDVNKDGKITDLDQVPIGYPKEPEIVYG FGASYGYKNWDISVFFQGLARESFWIDYNNVSPYFNTVDTKVVGNRVGHNALAKFIADSH WSEDNRDAYAVWPRLSPTSIENNSKTSTWFMRDGSFLRLKQLEIGYTIPEKVTNKVGIKN LRFYVTGNNLLCFSKFKLWDPEMAGAGLGYPVQRVYNVGLNLTF >gi|226332245|gb|ACIC01000075.1| GENE 17 27721 - 29643 1203 640 aa, chain + ## HITS:1 COG:no KEGG:BT_2919 NR:ns ## KEGG: BT_2919 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 640 1 640 640 1306 100.0 0 MKKLQLLILNCLIIVLSSCDSFLDIVPDDIATIDNAFTMRAQAEKYLFTCYYYLPAHGLL EANPAIAGGDELYITESFRSAAHVHAWYISHNMQSSARPRCDYWTGADQQNNKSLYQGIS DCNIFLENIQKVPDMTQAEKDRWAAEVRFLKAYYHYWLIRMYGPIPIMDQNIPVNAGEEE VKVFRNTIDECFDYVINTLTEIIDSNHLPDKIMNEAEELGRITQGIAMAIRAEVMVTAAS PLFNGNADYKGYTDSRGIEIFNPNKSEQDKKQRWIDAAEACKQAIEFLKAQGHDLYKYTS 
LEYTISDQTRAKMNIRNIVTEKWNQEIIWANSNSIIGQLQDHAIPRGLEPGKEGNASVGG NLAVPLKIADLFYTKNGVPITEDKTWNYDDRLELRKATSDDKYFIKEGYTTASINFDREV RYYASLGFDGAVWFGQGVTDETKPIYVQCKQGQSAANQISNSWNETGIWPKKLVHFKSVV GQTSGFTKITYPFPVMRLGNLYLLYAEALNESGASKTDVLTWVDLIRERAGLEGVEESWD TYTNNKKYETVDGRREIIQQERGIELAFEAQRFWDLRRWKLAYQEQNKPITGWNTQYSTN EDYYTEQLIYAQEFRIRNYFWPILDKELYSNKNLVQNYGW >gi|226332245|gb|ACIC01000075.1| GENE 18 29663 - 30877 902 404 aa, chain + ## HITS:1 COG:no KEGG:BT_2918 NR:ns ## KEGG: BT_2918 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 404 1 404 404 835 100.0 0 MKKKLYKYGLLYIGVALAACADNADLNEPAGSTTPPAQVLNATVKNLPGAAIIYYDLPDD QNLKYVRASYKVDNMVRTVNASFYTDSLVVEGFPTKGEYDIELYSVSYGEVVSSPLVVKV NPDTPPYQKVRGTLVSAETFGGIRVNFDNPEKAKLGLGVIKKQAEGIWTQVYMHYTEAKG GDFYVRGLDAVTTDFGIFVRDRWGHLSDTLYVTETPLYEEQCDKSLFRKMALPTDSYECH SWNEVTKGNDMTRLWDGITDADPCFQTKTTTVMPQWFTFDMGENYKLSRFVMVSRYYPGK YGNTFKAGHPKHFEIWGSINPNPDGSFDDSWVLLSEYESVKPSGGGVNDALTAEDQEAAK NGENFIIPDNAPAVRYIRFRTNNTWGNTRYMHLHELTFFGAKHN >gi|226332245|gb|ACIC01000075.1| GENE 19 30902 - 31576 505 224 aa, chain + ## HITS:1 COG:no KEGG:BT_2917 NR:ns ## KEGG: BT_2917 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 224 1 224 224 437 100.0 1e-121 MRLFKNFALVFLFIGTLASCDGMDATYKEFIEEGPIVYIGKVDSLKAYAGRNRAMLEWQK LLDPRAKTAKIFWENRTRSTELQLTDKAGLTQVIVKDLAEGSYVFEVCTYDTHGNSSIMS EVPCTVYGDVYEKLLFNTKVKTAVLKNDVLTITFAASLEPTFFGSEITYTSIDGGDKTVI LKAPETQIKIDDFAGDRIIYRSVYLPEETAIDYFYSESDEFEIK >gi|226332245|gb|ACIC01000075.1| GENE 20 31591 - 32814 754 407 aa, chain + ## HITS:1 COG:TM1061 KEGG:ns NR:ns ## COG: TM1061 COG4289 # Protein_GI_number: 15643819 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Thermotoga maritima # 27 407 8 387 387 310 43.0 4e-84 MRKVVILLYLVQLLFTGFAQAKPGKDREYWVKTMIKIIDPFYTNLSQNTLRKNMPVETFD GLNNGNTRKNVTHLEALGRSFNGISAWLNLPPDGTKEGQLRAKYTDLVVKSISNAVNPDS 
SDYMRFDGPGGQPLVDAAFFAQGLLRSKDQIWPKLDKVTQERIIKELKASRRIKASESNW LMFSATIEAALLEFTGECELNPIHYALKRHKEWYKGDGWYGDGRNFHLDYYNSYVIQPML IDVLSVMKKHEVEGADFYDVQLQRLIRYADQQEKMISPEGTYPVLGRSMGYRFGAFQVLA QVSWMKLLPEHIKPAQVRCALTKVMKRQLIKGTFDKDGWLNLGFCGHQPEIADRYVSTGS NYLCTFIFLPLGLQADDEFWTAKPEEWSSVKIWSGSRDIKKDGSIKN >gi|226332245|gb|ACIC01000075.1| GENE 21 32898 - 33872 703 324 aa, chain + ## HITS:1 COG:AGl909_1 KEGG:ns NR:ns ## COG: AGl909_1 COG1409 # Protein_GI_number: 15890570 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 29 315 361 670 1299 169 33.0 5e-42 MKNQMKWIALLTTAVFIFSGCKTQKQVRLVLLPDIQTYSRLYPDILKSQTQWAIDHADSI DFVLQQGDMTDHNVDKEWEVAAAALNTMDNQVPYAFVMGNHDLGKNSNKRDSELFNRYFP YDKYSKTRNFGGAFEEGKMDNVWYTFKAAGLKWLILCLEFGPRTSVLDWAGEIVKKHPHH KVIINTHAYMYSDDTRMGEGDRWLPQKYGLGKDTGENAVNNGEQMWDKLVSKYPNILFVF SGHVLNSGVGTLVSIGDHGNKVFQMLANFQDGVKGTNRGQTGFLRIVDIDVKKQQVRVKT YSPYLKEYKNDVKNKFSFEGVNFK >gi|226332245|gb|ACIC01000075.1| GENE 22 33984 - 35324 552 446 aa, chain + ## HITS:1 COG:PA0031 KEGG:ns NR:ns ## COG: PA0031 COG3119 # Protein_GI_number: 15595229 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 2 369 6 367 503 150 32.0 6e-36 MNVLLIIADDMRPELGCYGIEDIVTPRLDSLARYATVFQNAYCNIPVSGASRASLFTGMY PRYPNRFTAFDASAEKDCPEALSLPECFKKNGYYVVSNGKVFHNITDHADSWSEAPWRVH PDGYGKDWAEYNKWELWQNEESSRYVHPKTLRGPFCESADVADTTYIDGRVAQKTIADLR RLHKKEKPFFLACGFWKPHLPFNAPKKYWDLYRREEIHLAQNPYRPKALPKQVTSSGEIR GYGKFVTTKDETFQREAKHGYYACVSYIDAQIGLILDELDRLGLSENTIVVILGDHGWHL GEHGFWGKHNLMNHATRAPLIVRVPHCRGGKAKGIVEFVDIYPTLCELCGVPMPKDQLQG KSFVPILQDSGKKTKQYAFIQWKGGYNIVSEQYSSTIWLQGDSVVARMIFDRQLDASENE NKVHCPYLSRERDKLEAQIKIRKSQL >gi|226332245|gb|ACIC01000075.1| GENE 23 35348 - 36556 544 402 aa, chain + ## HITS:1 COG:no KEGG:BT_2913 NR:ns ## KEGG: BT_2913 # Name: not_defined # Def: unsaturated glucuronylhydrolase # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 1 402 1 402 402 827 100.0 0 MKVSSSIIASLLLLTVCSCSTPKSEMSTLVNNSLQTATVQSKLMAEDLLNEEGKLPRTIG KDGKLMTSKSNWWTSGFFPGVLWYLYEVNGNDSLKMYAENYTKRIENQKYTTDNHDVGFM LYCSFGNGLRLTGNDDYRKTLLQGSESLSTRFRPQVGCIRSWDWNQKVWEYPVIIDNLMN LEMLMWASKNSDDPKFEEIAKSHADVTMKYHFRSDYSSYHVISYDTISGLPEKKNTCQGY AHESCWARGQGWALYGYTMMYRETGQEKYLRHAINVAKFIINHPRLPEDKIPYWDFDAPN IPNELRDASAGALMASAFIELSLYTEGDFSKQCLSVAETQLKTLSSPEYLAEPGTNCNFI LKHSVGNNPGKAEVDVPLTYADYYYVEALVRYKRDILKEKLN >gi|226332245|gb|ACIC01000075.1| GENE 24 36619 - 41205 2446 1528 aa, chain + ## HITS:1 COG:BH1867 KEGG:ns NR:ns ## COG: BH1867 COG3940 # Protein_GI_number: 15614430 # Func_class: R General function prediction only # Function: Predicted beta-xylosidase # Organism: Bacillus halodurans # 11 316 8 317 327 174 35.0 1e-42 MVYGQEHDGDYTNPILSSGADPWVIKHEGWYYYCSGVPGGIGVSRSRDLHKINPPVRVWK VPEKGQWNSTCVWAPELHFWKGKWYIYYAAGYSGPPFIHQKAGVLESVTSDAMGKYVDKG MLFTGDALGDWENNRWAIDMTLLDHKGQLYAVWSGWEGNELTDKTQQHLYIAKMLNPWTM ASGRVKISSPDRYYEQGELPLNEGPQILKHEKDVFIVYSCGQSWLDTYKLSYLRLKDSDA DLLDPESWIKSEKPAFEGTNQVLGVGHASFTTSPDDREYYICYHSKREKTPGWKRDIRLQ KFTFDASGVPCFGKPVSVGEKLPLPSGTTHLVKAKSMVDLEKDFTELSSTARPYTYWFWM NGNITKEGITKDLEAMHRIGVGGVFNLEGGTGIPKGAVTYLSSEWSELKAHAVKEAARLG IDYVMHNCPGWSSSGGPWITPEYSMQKLTWSEIEVVGGEQIDTLLPQPAMELGYYKDVAI LAFPSFRNGKPIGFSDWQLLNNSVFNHKGLIDIREYDKEQVIHPEDVIDLTKQVDSNGCL SWEAPSGNWTIIRMGHTSTGRKNCAAPDTGVGLECDKFSKQAIQLHFNKMMDLLYPLIKP YVHQIQIGLEIDSWEVGMQNWTSGFEDEFCERTGYDLIRYLPAMTGKIVGSKEITERFLW DIRRIQADLLADNYYGEFRSLCNQYGLVSYCEPYDRGPMEELQIGSRVDGVMGEFWNGLS AIFQNNLMMRRTAKLASSIAHINGQKVVGAEAYTSEPESGRWQEYPFALKAVGDKAFTEG INRMVVHRYAMQPHSNAVPAMTLGPWGIHFDRTNTWWEPARAWMDYLNRCQTLLQEGLFV ADLAYFTGDNVVGYTKVHRNELNPVPPEGYDYDLMNTETLLNRAWIEQGRLRLPDGMSYR ILVLQEQSYITLGLLRKLREMVEQGLVIVGARPHQTVGLQSYSITEEKEFEQLCDELWGK NMATMIDRNIGKGRVFWEISLNLNQVFRQIQLKPDFEVGDNPNSAPIRYIHRQIGDTDVY FVTNQRRTPEDIICNFRVEGKVPEFWNPLTGERSRALVYQKNEGMTSVPIQLDEYGSVFV VFRSDLPEQTAIRSIYKDSECLVDASVVSVEEQEQAKYFGVKDDFSISLWVKPESDAMLN TDNPMGYISYPWTEYYAIYPSHGELLYGTGHATCGMAIGRNGVAVWENAKGYPEFKMAVE 
KPISGWSHICLVYKEGAPHIYINGEYVTYKTRSLQTIHPGLNPTTLKEGASYYNGDMSVP VLFQRVLSNKEISRLAAEGYVKKADERALNWVPYADNRFLLAWHDGNYDFITSDGVKKKI KVEKVGTPVMLNQKWEITFPDGCGAPEKITLPKLFSLHKHEDEGVKYFSGTATYMTDFVI KASILSDEKVVFLDLGAVEVMAEVIVNGVNKGILWARPYSIDVTDVLKPGKNTLEIKVTN QWTNRLIGDEQLPEENEYVLGGGINGIAALSRGAIKKLPDWYKNGVNKPKGGRVAFTTWK HYRKDSPLLESGLIGPVRLVFAKKIELK >gi|226332245|gb|ACIC01000075.1| GENE 25 41217 - 43184 978 655 aa, chain + ## HITS:1 COG:BH1877 KEGG:ns NR:ns ## COG: BH1877 COG3533 # Protein_GI_number: 15614440 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 48 651 4 589 758 385 36.0 1e-106 MRKGVLTKTSIKVLLSLLLLIGSFEPVVSASYGGSKPADVTFESYFPLREVRLLDSPFLD LQRKGKEYLLWLNPDSLLHFYRIEAGLPSKAAPYAGWESQDVWGAGPLRGGFLGFYLSSV SMMYQSTDDKRLLKRLKYVLKELELCQKAGKDGFLLGLKDGRKLFAEVASGKIKTNNPTV NGAWAPVYLINKMLLGLSAAYTQCQMEEALPILIRLADWFGYQVLDKLTDDQIQRLLICE HGSINESYVEAYELTGEKRFLDWARRLNDHAMWGPLSEGKDILFGWHANTQIPKFTGFHK YYQFTGDERFLTAATNFWNIVTQNHTWVIGGNSTGEHFFPKEEFADRVLLVGGPETCNSV NMLRLTESLFCQYPDAAKASYYERVLFNHILSAYDPEKGMCCYFTSMRPGHYRIYASRDS SFWCCGHTGLESPAKLSKFIYSHSKRIIDGDPDIRVNLFIPSILFWKEKGIELIQQNRLP ESEQVSFMLNLKKKQELILRIRKPDWADKVTFIINGKVEYPILDKDGYWVVNRTWARKNK IILQLPMHVYVESLMGSDRYAALLYGPYVLAGRMGTENLPTTFWGKMNNTAMNKIDLNKI PVFRMPLKQIPAYVKSISTDTLKFKIDLKEFDNVVVEPFYKVHFERYAVYWPISY >gi|226332245|gb|ACIC01000075.1| GENE 26 43529 - 44410 512 293 aa, chain + ## HITS:1 COG:SMb21419 KEGG:ns NR:ns ## COG: SMb21419 COG2207 # Protein_GI_number: 16264994 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Sinorhizobium meliloti # 23 287 26 287 295 141 34.0 1e-33 MERMSIKYFVNNNSLGPDRNISSVGFEEIPPNTSYPTLNHSAGYYFNPDKGRILTEYQLI YITEGEGVLETRSGGVFTIKRGMIFVLFPGEWHTYYPNYQTGWSHYWIGFCGPEVDSWMT NEYCSKESPVFKVGINDEIVSLFRKAIDVANEEPTLYQRVLSGLVTYLVALMCSIDKNLQ VENDDFSSKIDYACVLMKELIDQPVSMQEIAKKAGMGYSLFRKLFKEQKNYAPVQYFQNL KIQKAIELLTTTTIPVKEIAYRLDFESPAYFSARFKKQTGKSPIEYREEFRIK >gi|226332245|gb|ACIC01000075.1| GENE 27 44493 - 
45917 943 474 aa, chain + ## HITS:1 COG:SMb21420 KEGG:ns NR:ns ## COG: SMb21420 COG2160 # Protein_GI_number: 16264996 # Func_class: G Carbohydrate transport and metabolism # Function: L-arabinose isomerase # Organism: Sinorhizobium meliloti # 1 473 1 474 475 511 51.0 1e-144 MIQQKVRVGLLGVGLDTYWGQFEGLLPRLLTYQDEIAAKIEAMDVQVINTGMVDSPLKAN ECVLQLKQADVELVFLFISTYALSSTILPVAQQVGKPIIILNIQPASAIDYQKLNSMGDR GRMTGEWLAHCQACSVPEFASVLNRAGVRYDIITGYLSEDYVWEEIASWVDAVRVMYGMR TSRLGVLGHYYCGMLDVYTDLMKQSAVFGTHIELLEMCELKAYREEVSDGELKRKLDEFY DKFNVEASCSSEELVRAARTSVALDKLVNVHQLGAMAYYYEGFCGNDYENIVTSVIAGNT LLTGYGIPVAGECEVKNAQAMKIMSLLKAGGSFSEFYAMDFKDDIVLLGHDGPAHFAIAE EKVKLVPLPLYHGKPGKGLSIQMSVKPGDVTLLSVCEGRDGVFLLAAEGEAVQGETLHIG NTNSRYRFPCGARRFMDQWSKAGPSHHCAIGIGHKVSELKKLAFLLDIPIIVVE >gi|226332245|gb|ACIC01000075.1| GENE 28 45936 - 48482 1045 848 aa, chain + ## HITS:1 COG:no KEGG:BT_2908 NR:ns ## KEGG: BT_2908 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 848 1 848 848 1761 99.0 0 MEKLKYSLLFMISILAISNRWVSANDIDDERNRIYNSSYSGKYNNRIAFPIGGIGTGMYC LEGTGYISHMSVWHRPEVFHEPGMFAALYVKGVCNGAKVLEGPVSDWRKFGMPNYGTGGS MGSILGLPRFDTVEFEARFPFAKVSLTDKDIPVKVTILGWSPFIPGDPDNSSLPVGGLEY SLENTSKEVQETIFSYHARNFLSWGKGLDAIKTMPHGFILSQSGTETEPHLQGDFAIFTD QDSLKINYCWFRGGWFDSLTMVWNAIEAGLMPQCPAIEKGAPGASMFVPVTLMPGEKKTI RIYTAWYVPNSTLRLGEEPEDWNDNNVDSARLAVEKADKGNYKPWYSSRFTGVNEVIDYF LSHYKILRNQTERFTDSFYRSTLPPEVIEAVSANLSILKSPTVMRQYDGRLWTWEGCADN WGSCHGSCTHVWNYAQAIPHLFPSLERSLRHTEFEEGQDLKGHQVFRVNLPIRPTRHNFH SAADGQLGGIMKVYREWRISGENEFLISMYPKVKKSLDYCISTWDPRRVGSIEEPHHNTY DIEFWGPDGMHNSFYYGALSAFIRMSEFLDKDVTEYKKLLKKGRKFTETGLFNGEYFIQK IEWRGLNAKDPTVAQSFHSSYSPEAKEILEKEGPKYQYGNGCLSDGVLGSWLSRMCGMEE TLNTEKVKSHLLSVHRYNFKKDLTDHANPQRSPYALGKEGGLLLGSWPKGGKLSLPFVYS NKVWTGIEYQVASHLMLQGEVEKGLEIVRACRQRYDGSVRNPFNEYECGHWYGRALSSYG LLQGLTGVRYDAVDKTLYINSKIGDFISFISTESGFGNIELRSGKPFVKVVSGHIEVDRF VVSGKVVE >gi|226332245|gb|ACIC01000075.1| GENE 29 48537 - 49895 632 452 aa, chain + ## HITS:1 COG:no KEGG:BT_2907 NR:ns ## 
KEGG: BT_2907 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 452 1 452 452 928 100.0 0 MDRRTFLKSASMKGTAIVTASAVGTEMLHAAESIGASDVASGKRNAPKAKRLPEDLQELV KDSSLLRKPDNLTVACYTFPNYHASALHDKIYGPGWTEYNLVRSARPWFQGHAQPRGPLL GEMDESKPGTWEKYNELCKQSGIDVLIWDWYWYNNEPCLHEALENGFLRASNRNDVKFAC MWTNHPWYVLYPTLLPNGYKAYPPSFAPADGSLKECWQSLSYIISRYCHLENYWRIDDKP VVCIWDPNRLEKNIGVDGVKQLFAELTEFARKLGHKGLHFHSSGFYSPNSKEVGYNTAGS YNPFTWVADNYQPKNIELPDYGVAAADVAFKLWPKHHDDFAIPYLPSLSPGWDSTPRYIP PVSRPDQPNRDAWPNCVILDNENPASFKALVQSAFAYLNKHKDVPPILTIACFNEWTEGH YLLPDNRFGYGMLDALAEAVGKSDNHQIHGFF >gi|226332245|gb|ACIC01000075.1| GENE 30 49930 - 51240 881 436 aa, chain + ## HITS:1 COG:CC0633 KEGG:ns NR:ns ## COG: CC0633 COG3754 # Protein_GI_number: 16124886 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipopolysaccharide biosynthesis protein # Organism: Caulobacter vibrioides # 55 425 218 564 818 110 25.0 4e-24 MKNQFVKCISQLLFLLLIIGVLPVYAQKKSKEKIYRLPDDLETLAGDPALLKKPEGLTVA AYAFPNYHASALHNKIYSQGWTEYNLIRSARPWFEGHQQPRTPLLGELDESKPSTWETYN KLCKQSGIDVLIWDWYWYDGKPCLHEALENGFLEASNTKDVKFACMWTNHPWYVLYPTKR TDGSNAYPPSFDAPDFSKEECWKSLSYMISRYCHLENYWRIDDKPVICIWDARRLESKLG IAGVKQLFAELTDYAKKLGHKGLHFHVTGFSCGNMKEEGYDTVGSYNPMDWIAGRFQPKE IELPDYGTVAADVAFKLWDEHHGQFDIPYVPAVAPGWDSTPRYIAPANRPAKADRSQWPG CTIFKNENPASFKAFVQSSFVYLNKHPEVPRILTIACFNEWSEGHYLLPDNRFGYGMLDA LGEALGKEGNHEKHGK >gi|226332245|gb|ACIC01000075.1| GENE 31 51268 - 54471 1775 1067 aa, chain + ## HITS:1 COG:no KEGG:BT_2905 NR:ns ## KEGG: BT_2905 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1067 1 1067 1067 2167 100.0 0 MFTMTNFAKRGRLSISWRACILLVGLCCPFPVVSSYATDSITEIQQNKTKKFRGNVLDEA GAPVTGATVQIQGKAGGVITDIDGNFTIEVGESDVLQITFIGLETQLVPVRGKASMKIVM KELRDELEEVTVVAFAKQKKESVIGSIQTVKPAELKTPSSNLTTSLAGRMAGVIAYQRSG EPGKDNANFFIRGVTTFGYKKDPLILLDNVEVSSDDLARVQPDDIASFSVLKDATSTALY GARGANGVILVTTKEGKEGKAQVSVRVENSFSAPTKMIDIADPISYMRLNNEAVLTRDPQ 
GVIPYPESKIANTMAPNRNPYVYPTVDWMKEMFKDYTVNQRVNFNVRGGGSVATYYLAGT FSNDTGLLKNDGLNNFNSNISLKKYNILSTFNIHLTKTTEVKVRFQADFDDYRGPVDGGN ILFDHAIHASPVDFPKYYLPDLQNQGVGHPLFGNMEGANHINPYAIMTSGYKDYSRTNIS AQAEIEQNLDFITEGLTVQGHFSTVRRSYFDVSRSYTPYYYKIGFYDKESDNYVLQALNP DSGTEYLGYSEGPKDVYTQIYMQAKANYVRKFGLHDVGALLVYQRKEELLGNAGSVIKSL PKRNQGISGRMSYGYDSRYFVEFNFGYNGSERFAQNERYGFFPSIGGAWMLSNEAFWRPI EKVANKLKLKVTYGLVGNDAIGSDSDRFFYLSNMNMNSSGRGQVFGTNWGNYKDGISTVR YPNEFITWEVAKKLNIGFELGLFNSLDVQFDYFREDRSKILMDRSFIPTTMGLEASVRAN VGEAASHGVDLSVNYNHWFNNKIWLQGYGNFTYATSEFKVADEPDYAAAGLPWRSRVGYS LSQQWGYIAERLFVDEADIANSPVQSFGTYLPGDIKYKDINKDGKISDADLVPIGYPTTP EITYGFGLSMGFKNWDISCFFQGSARSSFFIDSEQIAPFNNPPSTTGSYTRKTALFQAIA DSYWSEEHRDIYAFWPRLSTESIKNNTQVSTWWMRDGSFMRLKSLELGYSFPQQMIKKIH LTKLRLYANGTNLLTFSKFKLWDPEMGGLGVGYPNQRVINLGIQVDF Prediction of potential genes in microbial genomes Time: Thu May 12 01:12:53 2011 Seq name: gi|226332244|gb|ACIC01000076.1| Bacteroides sp. 1_1_6 cont1.76, whole genome shotgun sequence Length of sequence - 14397 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 4, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 1924 1100 ## BT_2904 hypothetical protein 2 1 Op 2 . + CDS 1951 - 3132 742 ## BT_2903 hypothetical protein 3 1 Op 3 . + CDS 3144 - 4352 747 ## BT_2902 hypothetical protein 4 1 Op 4 . + CDS 4355 - 5380 741 ## BT_2901 hypothetical protein + Term 5444 - 5494 8.1 + Prom 5433 - 5492 5.3 5 2 Tu 1 . + CDS 5516 - 6493 530 ## COG3507 Beta-xylosidase + Prom 6612 - 6671 6.6 6 3 Op 1 . + CDS 6720 - 8786 1359 ## BT_2899 hypothetical protein 7 3 Op 2 . + CDS 8808 - 9791 648 ## COG3940 Predicted beta-xylosidase + Term 9872 - 9931 2.1 + Prom 10345 - 10404 7.5 8 4 Tu 1 . 
+ CDS 10426 - 14395 1956 ## COG0642 Signal transduction histidine kinase Predicted protein(s) >gi|226332244|gb|ACIC01000076.1| GENE 1 2 - 1924 1100 640 aa, chain + ## HITS:1 COG:no KEGG:BT_2904 NR:ns ## KEGG: BT_2904 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 640 1 640 640 1282 100.0 0 MKTKKIFNFLLLLGFLSGCDYLDIVPDNLPTLEMAFNNKSSAERFLFTCYSFVPEHGKIG TDPGMDVGDEVWYYSDNSADYSNKTTFWIAKGMQNSNEPLVDYWNGTQWGKAMFVGIRDC NVFIENIEKVPDLSAYDKETWKAEAKVLKAFYHFWMMRMYGPIPIIKENIPVSAGEDEVR VVRDKIDDIVTYIVSLIDESVEYLPLNIQNEAAEMGRITRPAAKAIKAKILAFAASPFFN GNLDYANFKNAEGEPFFNQVYDNEKWTKAADACLEAIQCAEEAGHGLYEFVNMSSTQLSD ETILSLSNRCKVTERWNKELVWGCGQSGIRDLQVLCQPWLESNYSSDDRYHNARNGTFAP TLAVAETFYTKNGVPMDEDKNYDYSKRYTTQVATEADKYYIQPGYTTAKLHFDREPRFYA TLGFDGSSWYGIGKMDDNDMWYLQAKAKQASGKRGNTLYSITGYFAKKLVRYQNAMVPAS IQIETYPFPIIRLADLYLLYAEALNEAKKEEGTVPEDCYTYIDKVRARAGLKGVKDSWRL YANDANKPNTYEGFQTIVRKERMIELALEGQRFWDIRRWNLANQYFNKSLQGWDINQSET NEYYKLNTYFIRNYEDRENLWPLKDYDCIVNPKLVQNPGW >gi|226332244|gb|ACIC01000076.1| GENE 2 1951 - 3132 742 393 aa, chain + ## HITS:1 COG:no KEGG:BT_2903 NR:ns ## KEGG: BT_2903 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 393 9 401 401 822 100.0 0 MVLHLVLIFLSIGCKEDDNIRPFGKDVEAGAPGVVSEIKVKNIPGGAVISYQLPDNPDIM YVRARYKVGQGKEMEARASVYANELTVNGFGDMNEHEVELSCVDRMENEGATTVAVVTPL KPSVLTTFESLEVNATFGGVYVNFKNPEKASLSIHVVTTDSLNQPYEAHVQYTQAEKGPF YVRGFKAEERMFDIYVVDRWGNSSDTLYNVLTPYEETRLNKKLFYPYILPNDVACNAWGG NLAYAWNDDYTPALFVHSPGGDTFPMWFTFDMGVTAKLSRYKFYHLFMEEHAFQRGNLKT WEVWGRTDTPPADGSWDGWTKLMDCESHKPSGLPVGQHTAEDWEYLQAGEDFEFPAEAPP VRYIRFKVLETWGGMDFIHFTEFTFWGNVISNN >gi|226332244|gb|ACIC01000076.1| GENE 3 3144 - 4352 747 402 aa, chain + ## HITS:1 COG:no KEGG:BT_2902 NR:ns ## KEGG: BT_2902 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 402 1 402 402 806 100.0 0 MKYLYKIGLILSVVLTAGACSGQLDTIQEYLDAGETIYAAKMDSVDIRPGYNRVEVTGLL 
KYGMDTERCVIHWTPDNDSLVVPVKRIDPVDTFRVFIENLPEGTYQFEIVTYNKSGYRSI STSKGSKTYGDRYISSLRVRSLLSTEVEGENLLLNFSSEMAEALATKVFYLNSAGEKQEQ EVARGDNQIKITDWKPRGAYEVKTYYIPEVNAVDTFSVSSTGVFPEKIVEMDKSTFREII LDNDIRLNAWGGALWKAWDNAYENTNYAHSDNTNPIVFPAWFTFDLGEKALLKRFDLFSV VRDDLNYNGGNMKSWEIWGREDEPANSSWDGWTKLITCNSFKPSGKPVGENTDEDNSYIA KGEKFDFPSGIPEVRYIRIKVLDSWSGQGYVQFSEFTFYKSE >gi|226332244|gb|ACIC01000076.1| GENE 4 4355 - 5380 741 341 aa, chain + ## HITS:1 COG:no KEGG:BT_2901 NR:ns ## KEGG: BT_2901 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 341 1 341 341 705 100.0 0 MKHLLAGILVWVIWGLCSCQSDGKVIIPSPIYVDNHYHGPADPEITWNPKTEEWMIFYTS RRPFKDQASYVGTPVGVAVSKDFVHWKFAGYCSFDGVGGQPDSEKTYWAPGVIVEGDSAH MFVTLKDDSTPVWGGPSNIVHFSAPLEDMISGWRRVNTVVDTPISIDATVVKNGDRWDMW YRDRPTEDTGGLYYAQSNDLYHWTLKGLAKGDINNVEVTKHTYQEGAYAFQWKGYFWIIA DPHKGLAVYRSKDAENWEYQGIVMYEPGTRFFDNTRARHPSVIIRNDRAFIVYHVQPFLG YNPNTHEAGDDVYQKLSFLQMAELEFKDGKLICDRNKVLYK >gi|226332244|gb|ACIC01000076.1| GENE 5 5516 - 6493 530 325 aa, chain + ## HITS:1 COG:BS_abnA KEGG:ns NR:ns ## COG: BS_abnA COG3507 # Protein_GI_number: 16079933 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Bacillus subtilis # 32 267 39 284 313 108 35.0 1e-23 MKKLLLIEILLTISFTLMAEDPAKTYKNPVVNYSLPDPSVIRAEDGYFYLYATEDIRNLP IHRSKDLIKWESVGTAFTDETRPTFEPKGNLWAPDINKIGDKYVLYYSMSCWGGEWTCGI GVATADKPEGPFVDHGMMFRSNEIGVQNSIDPFYIEDGGRKYMFWGSFRGIYGIELTADG LKVKSGADKKQIAGTAYEGTYIHKKNGYYYLFASTGTCCEGLKSTYQTVVGRSSSLWGPY LDKQGRSMLDNNHVVLIHNNKSFVGTGHNAELITDKADNDWILYHGVSVSNPYGRVLLLD RVDWKKGWPVVKSSSASTESPKPHF >gi|226332244|gb|ACIC01000076.1| GENE 6 6720 - 8786 1359 688 aa, chain + ## HITS:1 COG:no KEGG:BT_2899 NR:ns ## KEGG: BT_2899 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 688 1 677 677 1403 100.0 0 MIATDSHNLIYMNLKCLLCGIAILVCVGTVKAIDNVSERVSASISLKVPGNEAKQYPLTF QKTKNDKSAYQLEASERIPLTVYQTIEEKGGKSQITVCITALEDVYFNYSQQVSTGFRHD 
DCQFYMPGFWYRRNLRSPKSAPSFHTSDSWLVREDRLSTPLTGIYSEGTKRFMTVNRIDQ FENDALTTHREGEVILSGKTSLGFTGFENRNGIATLSFGFPYQEAPKSYIRKLTLAPQVK AFQLLKKGETVLLNWTIYEDAAEDYSDFIRHTWEYCYDTYAPKPVDTPYSIADMKNTLSS FFVNSLVSKPELTYYSGVGLRTDDCKSNGEAEVGFVGRVLLNAFNAWEYGWESNRAELKE NSLKIFDSYLKNGFTEAGFFKEFVNLDRNFEEPVHSIRRQSEGIYAMLHFLAYEKEQGRR HPEWEQRMKKMLDMFMQLQNQDGSFPRKFRDDFSIVDKSGGSTPSATLPLVMGYKYFKDK RYLDSAKKTADYLENELISKADYFSSTLDANCEDKEASLYAATATYYLALITKGEEHKHY ADLTKKAAYFALSWYYLWDVPFAPGQMLGDIGLKTRGWGNVSVENNHIDVFIFEFASVLQ WLSKEYKEPRFADFAEVISTSMRQLLPHDGHMCGIAKVGYYPEVVQHTNWDYGRNGKGYY NDIFAPGWTVASLWELLTPGRAEKMLVK >gi|226332244|gb|ACIC01000076.1| GENE 7 8808 - 9791 648 327 aa, chain + ## HITS:1 COG:BH1867 KEGG:ns NR:ns ## COG: BH1867 COG3940 # Protein_GI_number: 15614430 # Func_class: R General function prediction only # Function: Predicted beta-xylosidase # Organism: Bacillus halodurans # 47 294 17 285 327 60 26.0 4e-09 MLKKRFYILLFFCLLLGGNVNSSTVDKGDNNAPFENKRTETDYLPIADPFVMLYNNKYYA YGTGGTVNEGFACFSSDDLKNWKRERQALSADDSYGKWGFWAPEVYYIQSKKKFYMFYSV EEHICVATSDSPVGPFRQESKQPIWEEKSIDTSLFIDDDGTPYLYFVRFTDGNVIWVAQM TDDLMKIKKETLSECIKAEEPWELLQAKVAEGPSVLKKKGMYYLIYSANHYQNKGYGVGY ATSKSPMGPWTKYDKNPLLQGDESTGLVGTGHGAPFQCKDGSWKYIFHAHWNKEKIHPRT SYFKDFTISKQGVVSITGKVIKPKVIK >gi|226332244|gb|ACIC01000076.1| GENE 8 10426 - 14395 1956 1323 aa, chain + ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 790 1043 34 291 318 113 27.0 2e-24 MRILHSLSILIVFIFHTFYVFADLNEGYAFRSLDINNGLSQNTVHAILQDKQGFMWFGTK DGLDRYDGISFRAFMKESGTLGNNFITSLYEDQQGQIWIGTDVGLYVYSPEHETVERFIM KSDLDTGIDYTVNLVTGDKDGGIWVVTQSQAVFYYNPQTNKLINYLSDQSGRLKFGSLGQ LYFDSDNVCWLDIRDGNLYFSKDKLQTLVPVFPKDGNEPFKGEYICKLLPGPYNCMYVGT ITGLKEVNLTNKTVRTLLSKDESGDDIYVREIAFYSDDELWAGTESGLYIYNLRTKKTVH LRNVSGDPYSISDNAIYSICKDREGGIWIGSYFGGVNYYPKQYTYFDKVYPRTEIDEMGK RVREFCADHDGTLWIGTEDKGLFHYYPSTGLIEPFRHPDIYHNVHGLCLDGNYLWVGHFA 
KGLNRIDLRTHAVKHYFDAPNDIFSICRTTSGNLWLGTTAGLFRYHSETNRFERVPELGW VFIYNIKEDKQGNLWLATYIDGVYKRNVRTGEWEHYMHDETNPSSLPSNKVLSIFEDSQN QLWFTTQGGGFCRFDPSGKIFVRYDSSIGLPSNVVHRIEEDEKGLFWITTNKGLVHFNPK TLEFKVYTIANGLLSNQFNYQSSYKDKNGRIYFGSINGFISFEPSVFVDNDFLPPVVITD FMLFNKKVPVGSTDSPLKQSITLSDYLELQSNQNSFSFSVAALSYQSPDMNTVLYKLEGY DSEWYSVGKNLITYSNLPYGTYVLKVKAANSDGIWNPDVRTLKIRILPPFYLSVWAYIIY VILILGALFATFFYFRKRAVEKHQRAMEKFEQEKERELYTSKIEFFTNVAHEIRTPLTLI KSPLESVLTEKELPENVKMELEIMDQNAERLLNLTNQLLDFRKTENKGFKLNPVECSVGS IIRSVYKRFTTLVNQKGIELKVEIPEEELLASVDKEALTKILSNLFTNALKYAQTYAYLS LSVDEAGKEFTIVMSNDGKIIPIEMRENIFRPFVQYRDGKDIVPGTGIGLALARSLAELH QGTLTMDQDMECNRFILSIPIRHQTASDVLEEEHREKPVYDEEDNETRIMPSDKDKKEVA SVLIVEDNKDMLAFVARQLSPLYHVITAENGIEALKVLEHEYINLVISDIMMPEMDGLEL CEHLKSNLDYSHIPIILLTAKTTLESKIEGLEQGADAYIEKPFSVEYLRVNVANLLSNRE RLRRRFIESPFIKADTMAQTKADEVFIQKLNEYVSQHLDNTDLVIDDMAEAMNMGRSNFY RKLKGVLDMSPNEYLRLIRLKKAAQLLKDGRYGIVEISYMVGFNSPSYFSNCFKKQFGVL PKD Prediction of potential genes in microbial genomes Time: Thu May 12 01:13:43 2011 Seq name: gi|226332243|gb|ACIC01000077.1| Bacteroides sp. 1_1_6 cont1.77, whole genome shotgun sequence Length of sequence - 84189 bp Number of predicted genes - 59, with homology - 59 Number of transcription units - 22, operones - 12 average op.length - 4.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 373 - 432 7.7 1 1 Op 1 . + CDS 504 - 1862 966 ## BT_2896 hypothetical protein 2 1 Op 2 . + CDS 1908 - 2900 644 ## COG3940 Predicted beta-xylosidase 3 1 Op 3 . + CDS 2951 - 6025 2442 ## BT_2894 hypothetical protein 4 1 Op 4 . + CDS 6034 - 7908 1457 ## BT_2893 hypothetical protein + Term 7920 - 7967 9.4 5 2 Tu 1 . + CDS 7986 - 9698 882 ## BT_2892 hypothetical protein + Term 9897 - 9946 9.0 + Prom 10481 - 10540 7.8 6 3 Tu 1 . + CDS 10684 - 11913 732 ## BT_2890 transposase 7 4 Tu 1 . 
- CDS 12112 - 12996 354 ## BT_2889 AraC family transcription regulator - Prom 13200 - 13259 10.8 + Prom 13141 - 13200 6.2 8 5 Op 1 1/0.000 + CDS 13401 - 14513 613 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 9 5 Op 2 . + CDS 14524 - 15504 804 ## COG0451 Nucleoside-diphosphate-sugar epimerases + Prom 15521 - 15580 2.5 10 6 Op 1 . + CDS 15670 - 16176 410 ## BT_2886 putative transcriptional regulator 11 6 Op 2 . + CDS 16225 - 17334 825 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis 12 6 Op 3 . + CDS 17338 - 18033 595 ## BT_2884 hypothetical protein 13 6 Op 4 . + CDS 18046 - 18630 154 ## COG0299 Folate-dependent phosphoribosylglycinamide formyltransferase PurN 14 6 Op 5 1/0.000 + CDS 18674 - 19645 344 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 15 6 Op 6 . + CDS 19682 - 20398 255 ## PROTEIN SUPPORTED gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 16 6 Op 7 . + CDS 20395 - 21429 362 ## COG0451 Nucleoside-diphosphate-sugar epimerases 17 6 Op 8 . + CDS 21413 - 22675 584 ## BT_2879 hypothetical protein 18 6 Op 9 . + CDS 22693 - 24126 568 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 19 6 Op 10 7/0.000 + CDS 24123 - 24728 237 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 20 6 Op 11 4/0.000 + CDS 24741 - 25724 83 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 21 6 Op 12 4/0.000 + CDS 25781 - 26317 307 ## COG1045 Serine acetyltransferase 22 6 Op 13 2/0.000 + CDS 26317 - 27117 312 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Prom 27126 - 27185 2.1 23 6 Op 14 . + CDS 27218 - 28057 381 ## COG1442 Lipopolysaccharide biosynthesis proteins, LPS:glycosyltransferases 24 6 Op 15 . + CDS 28060 - 28998 611 ## BT_2872 putative capsular polysaccharide synthesis protein 25 6 Op 16 . 
+ CDS 29027 - 29866 212 ## BT_2871 putative glycosyltransferase 26 6 Op 17 . + CDS 29866 - 30732 338 ## BT_2870 putative glycosyltransferase + Term 30930 - 30984 1.1 + Prom 30875 - 30934 4.3 27 7 Op 1 . + CDS 31050 - 31832 290 ## BT_2869 putative succinyltransferase involved in succinoglycan biosynthesis 28 7 Op 2 . + CDS 31829 - 32827 468 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Term 32830 - 32871 -1.0 + Prom 32841 - 32900 4.5 29 8 Op 1 . + CDS 32948 - 34084 329 ## BT_2867 hypothetical protein 30 8 Op 2 25/0.000 + CDS 34081 - 35220 688 ## COG0438 Glycosyltransferase 31 8 Op 3 25/0.000 + CDS 35217 - 36329 503 ## COG0438 Glycosyltransferase 32 8 Op 4 . + CDS 36409 - 37560 676 ## COG0438 Glycosyltransferase 33 8 Op 5 . + CDS 37597 - 38394 530 ## BT_2863 putative polysaccharide export protein 34 8 Op 6 . + CDS 38411 - 40792 1385 ## COG0489 ATPases involved in chromosome partitioning + Prom 40807 - 40866 3.1 35 9 Tu 1 . + CDS 40975 - 42165 1377 ## BT_2861 putative outer membrane protein + Term 42328 - 42386 6.4 36 10 Tu 1 . + CDS 43035 - 47054 2972 ## COG5002 Signal transduction histidine kinase + Term 47130 - 47171 4.1 + Prom 47134 - 47193 8.4 37 11 Op 1 . + CDS 47291 - 50419 2984 ## BT_2859 hypothetical protein 38 11 Op 2 . + CDS 50433 - 52373 1640 ## BT_2858 hypothetical protein 39 11 Op 3 . + CDS 52403 - 53608 1114 ## BT_2857 hypothetical protein 40 11 Op 4 . + CDS 53648 - 54862 1106 ## BT_2856 hypothetical protein + Prom 54962 - 55021 3.6 41 12 Op 1 1/0.000 + CDS 55052 - 58021 2762 ## COG3250 Beta-galactosidase/beta-glucuronidase 42 12 Op 2 . + CDS 58028 - 60307 2280 ## COG1472 Beta-glucosidase-related glycosidases + Term 60420 - 60465 3.0 + Prom 60440 - 60499 4.5 43 13 Op 1 . + CDS 60522 - 62066 1224 ## COG3507 Beta-xylosidase 44 13 Op 2 . + CDS 62148 - 64304 1949 ## COG3345 Alpha-galactosidase + Term 64504 - 64562 15.2 45 14 Tu 1 . 
- CDS 64792 - 66345 1353 ## COG0642 Signal transduction histidine kinase - Prom 66375 - 66434 10.2 46 15 Tu 1 . - CDS 66472 - 67485 374 ## PROTEIN SUPPORTED gi|15900011|ref|NP_344615.1| aldose 1-epimerase - Prom 67570 - 67629 6.2 - Term 67665 - 67701 5.4 47 16 Tu 1 . - CDS 67814 - 69022 922 ## BT_2848 hypothetical protein - Prom 69148 - 69207 6.7 48 17 Tu 1 . - CDS 69221 - 70750 582 ## BT_2847 hypothetical protein - Prom 70877 - 70936 4.8 49 18 Op 1 . - CDS 70945 - 72483 955 ## COG0606 Predicted ATPase with chaperone activity 50 18 Op 2 . - CDS 72497 - 73612 1040 ## BT_2845 hypothetical protein - Prom 73646 - 73705 5.5 - Term 73649 - 73697 14.2 51 19 Op 1 . - CDS 73725 - 75416 1664 ## BT_2844 hypothetical protein - Prom 75469 - 75528 6.4 - Term 75454 - 75490 -1.0 52 19 Op 2 . - CDS 75555 - 76514 713 ## COG4974 Site-specific recombinase XerD - Prom 76664 - 76723 3.4 + Prom 76410 - 76469 4.3 53 20 Op 1 . + CDS 76589 - 77011 372 ## COG0757 3-dehydroquinate dehydratase II 54 20 Op 2 . + CDS 77043 - 78500 1620 ## COG0469 Pyruvate kinase 55 20 Op 3 . + CDS 78526 - 79164 634 ## COG4122 Predicted O-methyltransferase 56 20 Op 4 . + CDS 79181 - 79513 299 ## COG0858 Ribosome-binding factor A + Prom 79528 - 79587 2.3 57 21 Tu 1 . + CDS 79663 - 80736 840 ## COG4591 ABC-type transport system, involved in lipoprotein release, permease component + Term 80858 - 80902 1.2 - Term 80841 - 80895 11.6 58 22 Op 1 23/0.000 - CDS 80918 - 81928 1089 ## COG1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit 59 22 Op 2 . 
- CDS 81932 - 83782 1958 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit - Prom 84003 - 84062 76.8 + TRNA 83986 - 84060 54.5 # Arg CCT 0 0 Predicted protein(s) >gi|226332243|gb|ACIC01000077.1| GENE 1 504 - 1862 966 452 aa, chain + ## HITS:1 COG:no KEGG:BT_2896 NR:ns ## KEGG: BT_2896 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 452 1 452 452 879 100.0 0 MKNSFLCVGSFLLAASFFTSCTEDKDWSGGLKNSDLVFQTYLPTEGAKGTELSIRGNNFG EDISQVQVWVNEKEAEVISVTSTRIMARVAEASGSGVVKVRVGEVVYSYPDLFSYGYIRN VYTVAGNGQGTTVDGAYLQASVQWPIVMVYDKWDDAILFLQDEGEHRIRRMKDGKIETLC STKSLVNNARSICFSLDGDTLFIGNDNANNYVANPVAVGMVTRKGGFKDLKSYIPSEKLA QPHINGIAVNPVDGSLFTYHWGRHVFRYNKATETCEYVITRDQFNELVVGLFPDADGNMQ NIGGDGGYGGLAFSPDGKTLYWNGRDPYQGILKADYDLVTKKCTNLTRFAGNGVWGIIDG QGVSSRMDQPNQIAVDAEGNLLVTTVYGRTVRKITPEGYVSTYAGIGYQTGYVDGLAAEA KFNKPYGIAIDAQGNVYVGDCENWRIRVIKEE >gi|226332243|gb|ACIC01000077.1| GENE 2 1908 - 2900 644 330 aa, chain + ## HITS:1 COG:BH1867 KEGG:ns NR:ns ## COG: BH1867 COG3940 # Protein_GI_number: 15614430 # Func_class: R General function prediction only # Function: Predicted beta-xylosidase # Organism: Bacillus halodurans # 49 300 17 287 327 60 25.0 5e-09 MFKKEIYSLLLFSFGLLFTCCGESATDDEEKDNNGQGSVSVETNYLPIADPYVMFYNNKY YAYGTGGTTAGEGFACFSSDDLKNWKREGQALSATDSYGTWGFWAPEVYYVESKKKFYLF YSAEEHICVATSTTPEGPFRQEVKQPIWSEKSIDTSLFIDDDGTPYLYFVRFTDGNVIWV AQMTDDLMSIKTETLNQCIKAEVSWELLQGKVAEGPSLLKKNGVYYLIYSANHYENKGYG VGYATSDTPMGPWVKYSKNPLLQGDAATGLVGTGHGAPFQCKDGSWKYIFHAHWSAAEIQ PRTSYIKDFAISDQGVVTISGTVIKPRVLK >gi|226332243|gb|ACIC01000077.1| GENE 3 2951 - 6025 2442 1024 aa, chain + ## HITS:1 COG:no KEGG:BT_2894 NR:ns ## KEGG: BT_2894 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 1024 1 1018 1018 1951 99.0 0 MNKNNRMSKRLLNYVIVLLVCLISSAESFAQGTKFQVKGNVTDASREPIVGATIKVKGTS QGVITNIDGEYSIDTRSNAILEFSYIGYVSQEVPVSGSKIINVSLQEEVSKLDEVVVVGY 
GTQKKISVTGSVSNVSTKEIAKIATPSLSNTLGGQIPGIVTRQATGEPGYDQASIYIRGF GTWTNRSPLILVDGIERDMNTINTEEVESISVLKDASATAVYGVRGANGVILITTKRGQL GKPKVTLRSEYAVLTGLRYPEYINAAEYAGLMNEARDNAGVANMAYTDEEIELFRNGSSP YLYPNVNWVDEVLKKNTTQSITNLNITGGTEVVRYFVNVGYTTQSGLYRDNGDNAYSTNS RVNRYNYRSRVDVNLTKDLSVELGVGGIIQNRNFPGKSQYDIFYAIRNTSPLAFPVKNPD GTPGASPTYLGNNPWGMTTQSGYETQNWNTLQGTFSARWDLSSLVTEGLSVSGRFAYDHY YSNNKQYYKDFEVKQYMGEDAEGNAIYNILRNENIKNVTLAESANRAIYYEVAANYDRTF GMHNITGMFLFNRRDYVNLRNTDRTGSVPYRRQGIAGRASYNYLQRYFAEFNFGYNGSEQ FPKGKRYGFFPSVSLGYVLTNEDFWNRNWWISNLKLRGSYGTVGNDISSSTRFLYLTTMN MNVSAGYMGKEGNNYMAGIMEGQTGNQNVTWETAKKLNVGFDLGLFNDVVSLQVDIFKEK RDGILITRKTVPVLAGFSGASIPVGNLGKAENKGIETALEVKKRVANGLFYSLRGNFSFA RNKIIENDEPKPKYEYQDARGRRIDQTFGLVALGFFKDQDDIKNSPKQTFQSTVRPGDIK YKDINGDGVVDEYDKVAIGDPRTPEIMFGFGGTVAYKNFDVSLFFTGAAKTSFFLEGATV YPFLNGEGTWNVLREVYDNRWTSETAATAKYPIVLNANSYNNYQTSTMYMRNGSYLRLKS AEIGYTFKGRLIDKMFMDNIRLFCNGQNLLTLDYIKIVDPESNNGVGNYPMQRTINFGFQ INFK >gi|226332243|gb|ACIC01000077.1| GENE 4 6034 - 7908 1457 624 aa, chain + ## HITS:1 COG:no KEGG:BT_2893 NR:ns ## KEGG: BT_2893 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 624 1 624 624 1277 99.0 0 MKISKMKIKVAGLCLLTWAFVSTSCSDYLDKTPDDATSITLEEVFSSEIYTERFLLRAYD YLPCETNYNDNNYWALGAWSGASDEMNITWTYPMAKEMNRGSWNPKMLEGSSSQGTSYAM WWRYYEGIRQCNVFLENIDLCPAAQATKNVWICEARFMRAMLHFFLLRSHGPIPIMDHAL KMDDEMTYFPRNTFDQCVDFIVSDCQYGIDGALPMIYTNASGTVAESLYGRATKAAAYAL KARVLLYAASPLFNGNADYAKFANEDGTLLFAPKDDSKWGKAAAAAKEAIDKIEGSSYYG LYYSDEPDNGYRNYMELFLPSKKWNKEYIFARNQGTGYSDNIHQEKCMAANGMGGWSGLC PTQELVDAYEMANGSTPIIGYNADGTPIINGASGYSETGFAATAGKHYPAGVYNMYVGRE PRFYASINFNGQQWRGRQLEFWQGGKDGINVSKVDYCCTGYLLRKTADEGVDVINGKGGM KEASILFRVGSLYLDYAEALNEAEGPVDDVYKYVDAIRKRAGLPGLAKGLDKDKMRERIQ HERQIELAFEAGHRYFDCHRWKIAEKTDNGYMHGMNITATNKVDYSKRSTVGDSRVFEKK HYLFPIPQSEMDKRIGLVQSPYWE >gi|226332243|gb|ACIC01000077.1| GENE 5 7986 - 9698 882 570 aa, chain + ## HITS:1 COG:no KEGG:BT_2892 NR:ns ## KEGG: BT_2892 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 570 1 570 570 1199 99.0 0 MKTKTRRLLSLLNFCVLTLLACGNSDGAGEDPGVPYLAKKGEPVDGVRIGWDYGTIRQLA TRGGYPRSIRLQDNTVVAVYENYDTGYEMRCSTDEGSTWGDPVILFPIHSITNDKGTARI NIANSEIIQLANGDLLAACNYRPQTPEITPFAIAVRRSTDNGVTWSDPQIIYEAEPRFSD GCWEPSFLQLPNGEVQVYFANEGPYTHSNEQEISMMTSVDNGKTWGGYKTVCFRAGSRDG MPVSKVVGDEIVCVIEDCGFVTFKPYTVRTKLSDNWSSPVLADSPNRAMALGEPVEDWIY MGAPYLGVLPTGETLLSYQVDDIRHDDQLGDRLPYSTMEVAIGDKNARNFVRRTRPFPVP AGKHAVWPSVAVWDANTIAALATSNYQGGTEAPFFMKGHVMRDLEVNSSDIVNYPIFIGH TGICNLRLGVGKDADNLYITCKVKDGELYSGGQGTQKGSGVFIYLDTKNECLKEPAEGVY KVWCSYKGDITVWQGNKGKWENSELEGLQITPSVVSGGYELAFTIPLNRSFNKDAMRLCA TLSTKTYTETLVHSNADRSCTWMKMNLGGH >gi|226332243|gb|ACIC01000077.1| GENE 6 10684 - 11913 732 409 aa, chain + ## HITS:1 COG:no KEGG:BT_2890 NR:ns ## KEGG: BT_2890 # Name: not_defined # Def: transposase # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 409 1 397 397 740 99.0 0 MASIKLLLNKQRMLNNGTFPLVFQFIHRRRKLLYYTKYHIFQQQLADGALEVEYCEASLY SMKEIKEINRELKRDYKRFQNRILELERNNEEYLVDDIVEFKKKKNHRSYLLMQYIETQI SYKKKMGKDGIAAAYHSTYISLKKYIGMKSARKSDIRMEEINFSFVVGYEDYLNAQGLAR NTINYYLRNFRTIYNSSIRDGFKPKSENPFTYIQTKPCKTIKRAINKDDMKKLSSLILPV HSGMDIARDMYLFSFYAQGMAFVDIVFLKKKNIRDGILSYRRHKSQQLIHIVVTPQMQGL IDKYANDSEYVFPIIDTSLSTSIYDQYRLALGRVNRYLKKITFRLNINVRLTTYTARHTW ATLARESGAPISIISAGLGHTSEEMTRVYLKEFDQETLARVNRIVTNLL >gi|226332243|gb|ACIC01000077.1| GENE 7 12112 - 12996 354 294 aa, chain - ## HITS:1 COG:no KEGG:BT_2889 NR:ns ## KEGG: BT_2889 # Name: not_defined # Def: AraC family transcription regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 294 1 294 294 587 100.0 1e-166 MRGEQSPNGLHYAREHLSCKNYMTDIDTGFKYIKFDTEKCIEEEYTHKNYLLFFLEGDFS VYCNQFCNRTFHSNEMVVLPKSSMIKISATDKSQMLAMAFDIPQSNCDKLLFQGLSSICE NIDYDFSPIPMRYPLMPFLETVVHCLRNEMNCTYLHNVIEREFFFLLRGFYNKKEIATLF HPIIGRQLEFRDFVMQNYSKVNNLDELITLSNIGRTSFFIKFKEEFGITAKQWMMKQLKK RILGKVIEPGICVKQLMEVCNFESQAQLYRYFKQHFRCTPKQLIDRYQAKTDNL >gi|226332243|gb|ACIC01000077.1| GENE 8 13401 - 14513 613 370 aa, chain + ## HITS:1 COG:BS_tagO KEGG:ns 
NR:ns ## COG: BS_tagO COG0472 # Protein_GI_number: 16080606 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Bacillus subtilis # 6 330 9 310 358 118 28.0 2e-26 MNYFFIAIAFLVAVFMGWIITPQILLVAFRKRLFDSVGGRKTHSGIVPRLGGVVFVPVQC FLLVLSMFFMYKLEITSYLQDPFIPFQFLLLMIGLLILNMVGVIDDLIRINYRRKFVAQI VASSFLPLSGLWINDLYGLLGITTLSPWIGMPLTVFVAVFIINAVNLIDGIDGLCSGLVG MGALVFGFLFIYNAAWLHAVFAFITAGVLCPFFYYNVFGKSKGRQRIFMGDTGSLTLGLS MAFLAISYAMNNPLIKPFSEGAIVVSFATLIVPLFDVVRVVRIRYFQHKPLFMPDQNHIH HKFLRVGMSHYVAMILILALALFFSFFNIVAVEYISNNIVIFIDIVLWIAFHLWLDRREL IRAKHGVANN >gi|226332243|gb|ACIC01000077.1| GENE 9 14524 - 15504 804 326 aa, chain + ## HITS:1 COG:ECs2847 KEGG:ns NR:ns ## COG: ECs2847 COG0451 # Protein_GI_number: 15832101 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Escherichia coli O157:H7 # 3 317 5 318 331 414 62.0 1e-115 MKIAIIGGSGFVGSRLIGLLQTVPNIELLNIDKQQSELYPHLTQIADVQDVQKLTELLAG TDLVVLLAAEHKDNVTPASLYYTVNVEGTRNTLQAMESNGVARLVFTSSVAVYGLNKDNP SELHPADPFNDYGRSKWQAECMLQEWYDTHREWNIHILRPTVIFGEGNRGNVYNLLRQIT SGRFLMVGDGENRKSMAYVGNVVAFIAFLIENNMEGYHVFNYIDKPDFTMNDLVYHVGEV LNKHIPTTHYPYWLGMLGGYCFDALAKMTGRKLSVSSVRVKKFCAVTQFDSVKVQSSGFK PAFSMEEGLRRTLQYEFGSGAEEAIN >gi|226332243|gb|ACIC01000077.1| GENE 10 15670 - 16176 410 168 aa, chain + ## HITS:1 COG:no KEGG:BT_2886 NR:ns ## KEGG: BT_2886 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 168 1 168 168 306 100.0 2e-82 MDEKKCWLVVCVQSNREKKTYERLSALGFESFLPLQEETRRWSDRSKKVQRVVIPMVVFA RIAPSERISVLRLPSVSRFMVLRGESAPAIIPDAQMERFRFMLDYSEEAVEMCSERIQPG EQVKVIKGPLTGLTGELITMDGKSKVAVRINMLGAAMVEVPVGFVERI >gi|226332243|gb|ACIC01000077.1| GENE 11 16225 - 17334 825 369 aa, chain + ## HITS:1 COG:alr0556 KEGG:ns NR:ns ## COG: alr0556 COG0399 # Protein_GI_number: 17228052 # 
Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Nostoc sp. PCC 7120 # 5 368 20 383 406 281 39.0 1e-75 MKEQQITVTSPLLPSLEELNVYLQDIWQRKWITNNGYYHQMLEAALCEYLGVPYISLFTN GTLPLTAALLAMRITGEVITTPFSFVATTHSLWLNGIKPVFVDIDPVTCNLDPDKIEVAI TPNTTAIMPVHVYGYPCDTKSIQQIADKYGLKVIYDAAHAFGVEREGESILNAGDMSTLS FHATKTFNTIEGGALVLHDEHTKKRVDYLKNFGFVGETEVVIPGTNGKLDEVRAAYGLLN LKQVDAAIEARCQATTNYREALRNVSGITFMDDISGVKHNYSYFPIFVDEERYGMSRDEL YFKMKEYNVLGRRYFYPLISTFSTYRELKSARKENLPVATKMAEQVICLPMHHALSEDDM ERVLRLIRK >gi|226332243|gb|ACIC01000077.1| GENE 12 17338 - 18033 595 231 aa, chain + ## HITS:1 COG:no KEGG:BT_2884 NR:ns ## KEGG: BT_2884 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 231 1 231 231 436 100.0 1e-121 MKLGVMQPYFMPYIGYFQLMKAVDKYVVYDDVNYIKGGWVNRNHILINGEKEMFTVTLKK ASQNKLFNEIVIGDDFKKLMKTLQLNYSKAVNFDQTMTLMKRIISFSDKRLAAFIANSFR EIFSYLSIDTEILMSSDIPKDNSLRGKDKILQICEILGADTYYNAIGGQNLYDKKEFSEH GIVLNFVDTIPKVYSQLRTKEFVPYLSMIDVLMNNTKDEVNDLLDSFCVRY >gi|226332243|gb|ACIC01000077.1| GENE 13 18046 - 18630 154 194 aa, chain + ## HITS:1 COG:FN0985 KEGG:ns NR:ns ## COG: FN0985 COG0299 # Protein_GI_number: 19704320 # Func_class: F Nucleotide transport and metabolism # Function: Folate-dependent phosphoribosylglycinamide formyltransferase PurN # Organism: Fusobacterium nucleatum # 4 186 2 178 180 148 48.0 6e-36 MKSFNIVVCASGGGGNFRSLIKYQCDYGYHISLLIVDRECPAIKIAKENGISYSVLEKKV LGKSFFEEFEKIVPIDTNLIVLAGFLPIIPKWICEKWERKIINIHPSLLPKYGGKGMYGV KVQEAILRNHEKYAGCTVHYVDSEIDTGEIIAQKKILVMENESAWELGGRVFNEEIILLP LAIKHIREAMKVIP >gi|226332243|gb|ACIC01000077.1| GENE 14 18674 - 19645 344 323 aa, chain + ## HITS:1 COG:MT1570 KEGG:ns NR:ns ## COG: MT1570 COG0463 # Protein_GI_number: 15840986 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Mycobacterium tuberculosis 
CDC1551 # 2 285 7 298 357 159 33.0 8e-39 MTQCPLVSISCLTYNHAPYLRQCLDGFVMQITSFPIEILIYDDASGDGTQNIIEEYQKKY PDIIKPIYQTENQYSKGVKVGFVYNYSRAKGEYIAFCEGDDYWTDPYKLQKQIDFLECYS DYVICSHRYRICLKEEKVMNDEIKPIGDLSDGMSFDLSFLIRGGWLFQPLSVVYRKSALD LDTYSKYAIYIDVALFYAILKNGKGYCMPDVMGVYRIHEKGVWSGLDLNHQRIFSLKARE AIYDVEKTDEAAMFILSQFSRPMGRFLVVKECSMFMRITKILISHFGLRFVLRMWFDRLF LNKKLCVNEFYQIKDFCKRKKNG >gi|226332243|gb|ACIC01000077.1| GENE 15 19682 - 20398 255 238 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 [Bacillus selenitireducens MLS10] # 4 233 6 228 234 102 31 5e-21 MNIALIIAGGSGNRMGQDIPKQFIHVDNCPVIVHTMLAFQRHPDITAVAVVCLGGWETVL SSYAHQYNVSKLRWIFAGGINGQESISNGIYGLKKNGVKKDDLVLIHDAVRPLVSQSIIS SNIAICKQYGYAITGIQCREAILESDDGFCSVTSIPRDKLIRTQTPQTFRLENIINVHEQ AKSKGIINSVSSCTLLAELGGYEMHIVPGEERNIKITTTEDLEIFKVLRQTSKEDWLK >gi|226332243|gb|ACIC01000077.1| GENE 16 20395 - 21429 362 344 aa, chain + ## HITS:1 COG:slr0809 KEGG:ns NR:ns ## COG: slr0809 COG0451 # Protein_GI_number: 16330703 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Synechocystis # 24 338 20 324 328 127 29.0 4e-29 MMTYYDDIRKVQDLHLPWEILTDINILIVGASGLIGRALVDTLMQLPDKTFHLYAGVRDL VYAQSCFMRYKDDESFTLIQCDVTALIPFDIDFHYIIHAASYAGPAAFHNDPVGVMKANI LGVDNLFSYGEQHNLRRLLYVSSGEVYGEGNGIPFREEDSGSLDWASLRACYPAAKRTAE TLCIAYAAQFQIETVIARPCHIYGPFYTSKDDRAYAQFIRNVLAGENIILRSSGLQQRSW CYVVDCVFALLYVLLKGKTKNAYNIADVRSNVSIREFAEMIALKSEREVIFDLPDNTGKN KPIISQAIFNTEKINKIGWFPQWKLEEGVSHTLNTLISVDGTYI >gi|226332243|gb|ACIC01000077.1| GENE 17 21413 - 22675 584 420 aa, chain + ## HITS:1 COG:no KEGG:BT_2879 NR:ns ## KEGG: BT_2879 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 420 1 420 420 830 100.0 0 MEHIYKLLRSFKWDCAKYLYKNTLNFKVKRLRRKKNIRVLFAVAESATWKSDCLYKAMAE HPRFTPSILVLPDEQKEKTLLKEEVDSCFNLFCRKGYACTYPYQNGKLINIRKKLKPDII 
FYQKPYTFYPKSLLYYKNMNALFCYTNYAFHSLLTDWANDNEFFRLMWQNYYENESANVD LKKKFAGIFSNIVVTGLPVTDLFLTEKHEDKWKKTDKSRKRIIWAPHFSISDGGCLNYST FLSIAEELLEFIKNTQLPVQMAFKPHPLLKSQLYNYSSWGKEKTDEYYAAWEFLPNAQLE TNEYVDLFMTSDAMIHDCGSFTIEYHHTLKPVMYLVNGKEHTATMNSFAKAAYDLHYKGQ TIQDIRNFIECTVLENSDYLLEKRKNFFKSYLLPPHNQSATENIIHAILGTGYYTEKKQR >gi|226332243|gb|ACIC01000077.1| GENE 18 22693 - 24126 568 477 aa, chain + ## HITS:1 COG:Cgl0350 KEGG:ns NR:ns ## COG: Cgl0350 COG2244 # Protein_GI_number: 19551600 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Corynebacterium glutamicum # 11 378 30 397 497 135 26.0 1e-31 MGISGVKWASIGRFSSQGISFVLGLILARLLLPSDYGMLGMLGVFTAFAGSFIDCGFGSA LIRKLDRTEIDCSTVFYYNLGASLLVYMVMFLGAPFIADFYKQPLLTNVTRIACLTIPIG ALCSVHSNLLYCQLRFRDIAIGNILATFFSGGIGLLLAYNGYGVWALVCQGIIASLVNCC YLWRISCWKPLWMFSTSSFKELFGYGSKLMLSGWLNTMYSQLSPLIIGRFYSSSTLGYYT RAQSYVDFPSSNIMGILQQVVFPVLSRLQNEDEQLIGIYRKYIKVCAIFIFCGMTILAAL AKPLILLLLTDKWLPSVPIMILLCFSFMFSFVNTINLSLLQIKGRSDLFLKLEVVKKAIS ITMILLSASWGIMAMCWSMVIYTQIAIFINTYYTGKLFHLGYREQLTDFLPYFFMALLAN IPTYMLTYTSLSIVTQILLGGLMSLSIYILLLKIKRDDMYLLIEHMLLSYCKRKMAR >gi|226332243|gb|ACIC01000077.1| GENE 19 24123 - 24728 237 201 aa, chain + ## HITS:1 COG:ECs0395 KEGG:ns NR:ns ## COG: ECs0395 COG0110 # Protein_GI_number: 15829649 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Escherichia coli O157:H7 # 70 195 48 184 203 82 34.0 5e-16 MIFGFIAKIYRKYQEYIHPGIYVTRHGILYREKDCRYNVYPLQRLIVGKVWVGNIPSQEK GRLILHANSELIVKGNFDIIGSTVVVLPDAKLILGSGYINFHSKLHCFNHIEIGENVIIS ENVIIRDSDNHQITGGNSMFAPVIIKDNAWIGMSAIILKGVTVGEGAIVAAGSVVTKDVP PHTIVAGVPARVIKKDVYYTI >gi|226332243|gb|ACIC01000077.1| GENE 20 24741 - 25724 83 327 aa, chain + ## HITS:1 COG:STM3707 KEGG:ns NR:ns ## COG: STM3707 COG0463 # Protein_GI_number: 16766992 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # 
Organism: Salmonella typhimurium LT2 # 1 294 1 300 344 113 28.0 6e-25 MRNETLVSIIVPVYNSAQYLRKCIDSILAQSYTEFELLLINDGSEDHSGLICDDYAQSDN RIQVFHQENAGVSAARNLGLEKHTGEHFLFIDSDDYIQERYLEELMKYASCDFVQCSCCS EPVGNDYLFVDGNFEGSDEIKQCLLKYIYPEFTVPFGRLYKSSIQRKNKLFFDTYLYSGE DTLWVSQYLLCVHSLRVSSYIGYIYVHHVGEHLSQKSISYEHLEYTLHKLLISYSDLEKR YDFDLTGVRYSVIIYFFHRYIVYIANRNYTEIRKELEKSCANPLIKGVFYDKKYLLKGKK MKLFNWLVLHDMYSVLALYVKRWKRYL >gi|226332243|gb|ACIC01000077.1| GENE 21 25781 - 26317 307 178 aa, chain + ## HITS:1 COG:MA2737 KEGG:ns NR:ns ## COG: MA2737 COG1045 # Protein_GI_number: 20091560 # Func_class: E Amino acid transport and metabolism # Function: Serine acetyltransferase # Organism: Methanosarcina acetivorans str.C2A # 68 174 4 110 140 94 48.0 1e-19 MVTLKEYLNSDRIDYFPFGSKGYIMDWLLRTEQYWIRRYIRALRKEEFYMNYKRNKILQY YFSRKKNLLGIRLGFFINAGCFDIGLKIYHYGSIIVNPKSRIGKNCTIHGNCCIGSKGTF PDDSPVIGNNVDIGQNAQILGGIYIADGVKIGAGAVVTKSVLVPGVTVVGVPARIVEK >gi|226332243|gb|ACIC01000077.1| GENE 22 26317 - 27117 312 266 aa, chain + ## HITS:1 COG:STM2111 KEGG:ns NR:ns ## COG: STM2111 COG0463 # Protein_GI_number: 16765441 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Salmonella typhimurium LT2 # 1 208 1 204 248 103 34.0 3e-22 MKISVITINYNNGEQLEATIQSVVSNVTVLRELLGEKAEYIVIDGGSVDCSVSVLQKYSE YISYWISEKDKGIYHAMNKGISVAQGEYCFFLNSGDTFYEDDTLKKALLFLKEDFVCGNA VLKYAGGINEWSAPEIVNTLFFMQRFSVCHQSLFIRTELLKSHPYNETLMIVADYEQMFY EIAVNHRSYKKIDLTICYYGCDGVSSDHEKADAEKRSVINEFRYLGYIEPDELWNVVSKL KIGTRKYRLLLSLAKCLTSSPKYWVR >gi|226332243|gb|ACIC01000077.1| GENE 23 27218 - 28057 381 279 aa, chain + ## HITS:1 COG:BS_gspA KEGG:ns NR:ns ## COG: BS_gspA COG1442 # Protein_GI_number: 16080894 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipopolysaccharide biosynthesis proteins, LPS:glycosyltransferases # Organism: Bacillus subtilis # 45 224 77 259 286 78 31.0 1e-14 MQAEGELHVICLLSEELPERLKLKIQLIGEGRTCYSFVNLQGKLQHIYIDQKYTEAASYR 
LLLPDLLPEYKKVIYIDCDIIVRNDLVQLYHSIDLGMNYLAAVFEASMDFQLDHLKTIGC NPNEYINSGFLIMNLELMRKDNMVEKFIEASKVDYLEFPDQDVLNQLCKDRILALPPYYN SIRTFYLPQYKKFFLQKYTEQDWLEVHRHGTVHYTGAKPWNQFTVQFQLWWQYYEQLPEI IKKEWQVDKKIYFLSKLYRTSCGTFVVDKLQALYRKMKY >gi|226332243|gb|ACIC01000077.1| GENE 24 28060 - 28998 611 312 aa, chain + ## HITS:1 COG:no KEGG:BT_2872 NR:ns ## KEGG: BT_2872 # Name: not_defined # Def: putative capsular polysaccharide synthesis protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 312 1 309 309 615 100.0 1e-175 MEKMKLKELVKQFIPMNYWNTRRKASIIRQQGKVADFWAPILKAYYNGEIERYSLKPKKK LGTQKVIWQYWGQGIDKDELPEIIQICFDSVDRNKNDYQVIRLTDITISEYIDLPDFVWR KREYVQFTRTFFSDLLRVALLSTYGGVWLDATILLTGSIPAVYEKTDFFMYQRSDEEKNK KYWENVYAYYFGWEPNFKVRMLSSILFAQKESEIISTLTDLLLYFWKTQDSLPDYFCFQI LFNELVANYRPAENCPIVNDCIPHIIQTKINGTYDDVSFEEALELSNIHKMTYFDAAAMI RLKMVLRLARNA >gi|226332243|gb|ACIC01000077.1| GENE 25 29027 - 29866 212 279 aa, chain + ## HITS:1 COG:no KEGG:BT_2871 NR:ns ## KEGG: BT_2871 # Name: not_defined # Def: putative glycosyltransferase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 279 13 291 291 544 100.0 1e-153 MAHHEFEVLGKLVQALDDERNDIYIHFDKKVKNYPLLKTKYTNLYILQKRTDIRWGHISQ IKAEYALFEAAFMQDRYDYYHLLSGVHLPLKSQSYIHHFFEGLKDKEVLVHVPNSDYQTH LKMRRYNFFTKNFMHKVLFIRRINQLLWRVCIRIQKELHICRNRNQSYTNAANWVSITGK CIGYLLQIKKDVLRKYRFTLCGDEFFIPSELGKSYLKDRIYYDDKLLKCDFDGGSNPRIY HLGDYDELISSGCLFARKFSYTDMHIVDKILIHITADQE >gi|226332243|gb|ACIC01000077.1| GENE 26 29866 - 30732 338 288 aa, chain + ## HITS:1 COG:no KEGG:BT_2870 NR:ns ## KEGG: BT_2870 # Name: not_defined # Def: putative glycosyltransferase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 288 1 288 288 579 100.0 1e-164 MKHAYLIIAHNEFEILKRLIQALDDERNDIYIHFDRKLNHYPDCRTSYAKLTFLEERMDV RWGDISVVDAEFALFDEAYRRGEYSYYHLLSGVDMPLKTQNYIHRFFEKNAGKEFVGYYQ GNISKEIDRKVCRWHLFPKSFKETEGGLAVTKRVLRAGCVRLQSLLGIRRNKDINFRKGT QWLSISNELVGYLLQQQKEVRRVYTHTFCADEIFVQTICWNSSFRDRVYDMHDEGHGCLR MIGWKDNQLEEWKEKDFEILMNSEALFARKFSIRHIEVVDQILNEISK >gi|226332243|gb|ACIC01000077.1| GENE 
27 31050 - 31832 290 260 aa, chain + ## HITS:1 COG:no KEGG:BT_2869 NR:ns ## KEGG: BT_2869 # Name: not_defined # Def: putative succinyltransferase involved in succinoglycan biosynthesis # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 260 108 367 367 451 100.0 1e-126 MILAVYIKNYAFIFLFSMEPGEELQEVRNNSLYMLFWHTPVNYPLWFLRDLICMSALSPL FYAFFKYLKIYGLLILLALYLSVWETNIAGLSMTAIMFFGAGSYMGIYKKNVLAFCSKFR YVTAIITLMFLCCAIICNGRELHAYIVRIYILFGIITAFNLMDWLIDKESWKNLFCKLSA TVFFIYAAHEIYIINWTKGFFSRTSLADSGGGLMLSYLLIPFITLGVCLGLYYFLNRMAP NMLALLVGGRMKTQIIRNSK >gi|226332243|gb|ACIC01000077.1| GENE 28 31829 - 32827 468 332 aa, chain + ## HITS:1 COG:YPO0187 KEGG:ns NR:ns ## COG: YPO0187 COG0463 # Protein_GI_number: 16120528 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Yersinia pestis # 3 239 1 239 329 117 29.0 3e-26 MNIGNITISVIIPVYNASVTLPHCLESLHKQTYRNLELLFVDDCSTDESLYILTSYAEQF AEIDFTIRILQHERNRGVAAARNTALKCATGDYIYYVDADDSIEPHTLECLVNEINHKGL DIVGHEWYLTFNSNARYMKQPAYTTSVEALRKMMGGVMRWNLWLFLVRRSLYVENDIRFT EGLNMGEDMMVMMKLFICAGKVGIVHRPFYHYRQSNPESLTKIYSQEHIIQVANNVYEVE KYVNASAYADSLIEFIPFLKLNIKLPLLITDDKSHYEQWLECFPEANCYVMNNKLLPLRT RIIQWMAVKRYFILLRLYYRTVFKLVYGIIYK >gi|226332243|gb|ACIC01000077.1| GENE 29 32948 - 34084 329 378 aa, chain + ## HITS:1 COG:no KEGG:BT_2867 NR:ns ## KEGG: BT_2867 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 378 37 414 414 641 100.0 0 MLGIALVAYQGFQKHRITFSRDLLGAIVFAFIFSFICFVAADYNHTDDYSYVTYFVSFFT WLGGAYVVCYVIRAFHGKATLNLLIAYSAFVCVSQCILAILIDRFSAFRALVDTYISQGQ EFFQEVGRLYGIGAALDPAGVRFSIVLLLIVYLLCEDEGVKQVRWKTFACLFAFFVIAVI GNMISRTTSVGLFLGIVYLICSTGIFRLVIKGRYIRLYSILGGMLIVFTMLSVYLYNNDP FFYKNIRFAFEAFFNWVETGELRTDSTDKLNTMMWVWPEDVKSWIIGTGLFANFVYSTDI GYCRFILYCGLIGFGTFVLFFVYNACVFAWKFPSFRLFSFVLLSLSFIIWMKVATDLFFI YALFYCLDWKMEPVLIEE >gi|226332243|gb|ACIC01000077.1| GENE 30 34081 - 35220 688 379 aa, chain + ## HITS:1 COG:HI1698 KEGG:ns NR:ns ## COG: HI1698 
COG0438 # Protein_GI_number: 16273585 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Haemophilus influenzae # 2 368 3 349 353 132 29.0 8e-31 MKIVYCIAGTYNSGGMERVLANKANYLVSHGYEVVIITTDQREKQSFFELDQRINCCDLR INYEENNGKSFLNKLANYPFKQWKHRKRLTACLKQLKADIVVSMFCNDASFLWKIDDGSK KVLEIHFSRYKRLQYGRKGVWKIADRWRSRMDERTVRKYDRFVVLTAEDKAYWGNLSNMI VIPNALSHSTTELHLSPLTARKVIAIGRYGYQKGFDYLIEAWEIIHCAQPAWTLDIIGDG EWTDRLQRQIKRKRLNHCVFLKPPTGQIEEEYRQASLLVLSSRYEGLPMVLLEAQSFGLP IVSFACKCGPGDVITDGKNGFLVSVGNLPMLADRIMRLMEDEGLRKRMGMNAYHNSKTFS EERIMQCWIDMFDKLVSQR >gi|226332243|gb|ACIC01000077.1| GENE 31 35217 - 36329 503 370 aa, chain + ## HITS:1 COG:wbbK KEGG:ns NR:ns ## COG: wbbK COG0438 # Protein_GI_number: 16129972 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Escherichia coli K12 # 5 334 3 333 372 198 36.0 2e-50 MSSKRKTIVVSAVSLRKGGTLTILQDCLCYLSALAADGNYRVIALVHKRELADYPHIEYI EMPDVIKGWGRRLWCEYVTMYKLSQKIMPVYLWLSLHDATPRVVAERRVVYCQTSFPFYK WSWRDFLFDYKIVLFALFTRWIYRINVHRNTYLIVQQEWLRSGLSRMLGVKKEKFIVAPP ERKENDVIAEIVELSCFTFFYAATPDCHKNFELLCDAASLLEQELGKGRFKVVLTISGQE NKYAQWLYKKFGSVSSIEFAGFMSREKLFGYYHAVDCLVFPSKVETWGLPVSEFMATGKP MLLADLPYAHETAAGSMQTAFFKLSDPVQLMNLMKRLCEDDFTFLTSIGTSDKRFSFASS WRELFDSILN >gi|226332243|gb|ACIC01000077.1| GENE 32 36409 - 37560 676 383 aa, chain + ## HITS:1 COG:aq_516 KEGG:ns NR:ns ## COG: aq_516 COG0438 # Protein_GI_number: 15605985 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Aquifex aeolicus # 1 370 1 367 368 254 38.0 3e-67 MRILQIGKFYPIRGGIEKVMYDITMGLSQRQVYCDMLCASAEKQKPENLLLNDYARILCV STWKKVAATMLSPAMIFRLRKINKEYDIIHIHHPDPMACLALFLSGYKGPVVLHWHSDIL KQKMLLKLYSPLQNWLLRRAKVIVGTTPVYVQESPFLENIQRKVISVPIGIDEMKPVPGR VAQIKERYAGKKIIFSLGRLVEYKGYEYLIKASQKLNDDYIILIGGKGPLQEYLQSLIDE LGVRKKVKLLGFIDDKDLPDYFGACDLFCLSSIWKTEAFGIVQIEAMSCGKPVIAMNIPE SGVSWVNINRFSGINVKPEDADALAEAITAVLTDECLYEELARGARRRYETMFTKELMTE LCLNLYSGVLDSSETDYPANKKE 
>gi|226332243|gb|ACIC01000077.1| GENE 33 37597 - 38394 530 265 aa, chain + ## HITS:1 COG:no KEGG:BT_2863 NR:ns ## KEGG: BT_2863 # Name: not_defined # Def: putative polysaccharide export protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 265 1 265 265 504 99.0 1e-141 MKRNVYVGVLLTALLMTSCINTKRIVYLEDMKESLSYPMNAQPEMKIQRDDRLSIVVSSR NLELTVPFNISTGGNFQVTSSGDVATGTDMQKCEKGYMVDLNGYIEFPVLGKLKVVDLSC KQVAELIKKRLIDENLINDPLVFIDILNIKITVMGEVESPQVLKIDDSRITLLEAITRTG GVTSNALLDRVAVIREEGNERRMYMHDIRSSDIFYSPCYYLQQNDMVYVHPKFAEANTKE RRTLSFYSFGLTLLSLITTITVLLK >gi|226332243|gb|ACIC01000077.1| GENE 34 38411 - 40792 1385 793 aa, chain + ## HITS:1 COG:STM2116_2 KEGG:ns NR:ns ## COG: STM2116_2 COG0489 # Protein_GI_number: 16765446 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Salmonella typhimurium LT2 # 562 779 28 240 240 120 34.0 1e-26 MEKQKKMALTEEEKSINIVDLFIYLIAHWKWFVLSILLFGGYFWYSYCETPFVYSRTAIV MIKTPANSRSAMQLNNADFIGQVNVASEILQFKSKELMRKVIDRLHADVSYMVRDGLRPR ELYTDSPVRVAFLEAGLDEVCSLSVIPKNKQQVELADFSLDELEQRKTVNLNDTVDTPLG KLIVFTAENYSESYFDRPIKVTSKSREGMVAFWLSNLTIRQMSGDAALLSMTMNDLSPTR AADILDMLITVYNEEAIKDKNRISVNTAEFIKERLQIIEHELGSVETDIEDLKRANNGVD INTVAGMYIQDSRQYESSIKELDTQLQLVSFIKQYLQDSNKDDELIPSNIGLSDLSIESQ ISRYNETLLRRNRLVSGSSSNNPVVQELNRIMQTMKQNIYMAVDNLSKSLRLKKQDYMRQ ESHVRQKVQAVPGKQREMLSIERQQKVKESLYIFLLNKREENALSQAMVDNNARILDPVS GSNLPISPNKYKKMILGVGCGIIVPSVILLLILMLDTRVHNRKEVEAVVSAPFLADIPQT AKASVDAHEVVVRARGLDPLSEAFRILRTNLGFMLSQAQDHKIITLTSFNIGAGKTFVSV NLAASLVQTKKKVLILDLDLRKGKMSEMAHSKHVKGVAHYLSNPSIVVDDLILRDAFGEG LDLIPIGVIAPNPTELLLSRRLDELMDRLRELYDYIIVDNVPIGLVADASVVNRISDLTL FIVRVGKIDRRQLPELERLYQEHKLTNMAVVLNGTKKGSSGYGYGYGYGQGYGYKNAEKK KGLFGRKKKKSRK >gi|226332243|gb|ACIC01000077.1| GENE 35 40975 - 42165 1377 396 aa, chain + ## HITS:1 COG:no KEGG:BT_2861 NR:ns ## KEGG: BT_2861 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 396 1 396 396 791 
100.0 0 MRKVIVLSALLLITGMSVYAQEDYSKSLKTTTTIVENADKYKVETNRFWSNWFVTAGGGG LIFFGDHNMQMKFGDRLSPALDIGFGKWFTPGIGIRFMYSGLTIKGATQNGSHSTGKVYD ASQWLDEQKFDFMNIHGDVLFNASNLLCGYNEKRFWSVTPYVGLGWILTWESPRARNFNA SIGLINSFRLSSAFDLNLDVRGTATKDEFDGERGGRKEEGLLSVTVGVTYKFPRRTWGRS TVKTITFSDEELRLMREQLKAMNDENNRLKDELATSNKVTERVVETNILSAPYLVTFQIS RYALSNEARVNIGFQAKIMKENKNAVYTIIGYADKGTGTKEFNQFLSKARAEAVYNCLVN EFGVPASQLKITYEGGVDNMFYDDPRVSRAVITVIK >gi|226332243|gb|ACIC01000077.1| GENE 36 43035 - 47054 2972 1339 aa, chain + ## HITS:1 COG:BH4026 KEGG:ns NR:ns ## COG: BH4026 COG5002 # Protein_GI_number: 15616588 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 816 1036 364 588 607 113 27.0 2e-24 MIMRNTLYILLALVTLFLQKVSGEENSSYYFTHINGDNGLSSSNVKAIIQDSYGFMWFGT KNGLNRFDGTSILRMNCEDHQLKKGNNNISALYEDANRTLWVGTDRGVYLYHPASDVFTH INQKTKEGKTMDNWVSAIVGDTQGNIWVIIPDQGIFRYKDNELYFYEISNRRQFKQESPN CICVRENGEVWIGFWGLGICRYNPQNDSFEQIVEDRDGRPLVGKNINSICEYGDWLIMAA NEGELIKYNTKSHVLEDIKVAGADNTFYTTVAYMKGKIWLGTFNGLYVIDEKKNEVVSLK EDLMRSFSLSDKMIYSMCQDSEGGIWIGTLFGGVNYLPNRNLQFDKFVPGSSGNSLNTKR IRELAEDVKGNIWIGTEDAGISVLNVENGVIRQVKDHKSGNHLVTLAVVPFQDQIYCGLF KAGLDVIQVPGYTVRHYTPSELKIGEGSVYVFHIDRKGRKWLGTGWGLYVAEAGSFNFVN VKETGYNWIFDLCEGKDGSVWIATMGSGVWRYSPQTDSYKKYESKENEPDGLSSNSVSSV MQDSKGRVWFSTDRGGICYYNEEKDNFVTYSIKEGLPDDVAYKILEDEEGNLWFGTNRGL VRFHPDKNDIRVFTTKDGLLGNQFNYKSALKARNGKFYFGGIDGLVAFDPREAKTDDFVP PIYISKFSIYNKEVTVHTADSPLKECITHTDRIVLNYDESNISFDVALLSYSTAEANQYY YRMAPIDKDWIKAATNQNISYAKLPPGQYTFQVKATSNDNGGQFVERSLSIEILPPWWFS PWAYVFYFIWLVCVVVSWFFWYKHRKEKEMEERQKLFEIEKEKELYESKVNFFTEIAHEV RTPLTLINGPLETLQEMEIADPKIQKNLSVITQNTNRLLTLTGQLLDFQKIGAHKFEMTM ECVDITSLLRETVARFEPTISKKNKELALHIPEESIEAVVDKEAITKILSNLLNNALKYA RHSICVELQRETEAFLVRVTSDGDKIPAEISQQIFEPFYQAAKKDSASFGVGIGLPLARS LASLHRGKLYLDSEHEDNSFVLSVPLVENTLRSLPKLEEITVETDTLGEGVPVEAEMKSY VLLLVEDNETMLEFMSERLLEVFTVETAKNGQEALEVLRSRRIDLVISDIMMPVMNGWEL CKEIKSDMDLSHIPVIFLTAKNDIDSKINGLKIGAEAYVEKPFSFNYLKTQIVSLLYNRQ 
KEREAFSKRPFFPTNKMQMNKVDEEFMNKVIRTIEENIIDTNLNVEHLAEILGMSRSSLL RKIKMLSNLSPVDFIRLIRLRKAAELIHEGKYLIGDICFMVGINSPSYFSKLFLKQFGMT PKDFEKQSQADKEKIDMPS >gi|226332243|gb|ACIC01000077.1| GENE 37 47291 - 50419 2984 1042 aa, chain + ## HITS:1 COG:no KEGG:BT_2859 NR:ns ## KEGG: BT_2859 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1042 1 1042 1042 2083 100.0 0 MEKIRLIYLTLILFCCCQLSAQNKTVTGQVTAAGSEEPIIGASVVVRGTSNGTITDVDGN FSLSVPVGKELQISYIGYITQYVRVPKNNKVAVVLEEDSKALDEVVVVAFGTQKKESMIS SIETIKPAELKVPSSNLTTALAGRMSGVIAYQRSGEPGKDNADFFIRGVTTFGYKKDPLI LIDGIETTTTELARLQPDDIAAFSILKDATATALYGARGANGVIQVATKEGKEGATKVSI RVENSFSSNTKNVELADPITFMQLANEAVRTRNPLAPLAYSREKIDNTIKGTNPYVYPAN DWRKLLVKDMTANQRVNMNVSGGGKLARYYIAASFSKDNGNLKSNSMNGFDNNINLRSYQ LRSNININLTKTTEAVVRLSGTFDDYRGPLDGGDVMYKKIVRSNPVLFPAYYPSDALPTA KHLLYGNAIRETEANYTNPYAELTKGYKDYSRSTMDAQFELKQDLSFITKGLNIRGLYNT SRYSYFDVNRSFNPYYYNVSSYDKKTNQYNIGLLNEKSNPTDYLGYSEGWKDIQSTTYIE AAMQYNRSFDIHEVSGLLVYQQREKINANSGSLQKSLPYRNQGLSGRFTYSYDSRYMAEL NFGYNGSERFYKNERWGFFPAAGLAWVISNEKFWRPMNKVIPKLKLKATYGLVGNDAIGD ENDRFFYLSEVDPNDSGKGYWFGTNFDEGKPGVTVKRYDNRLITWERAKKANYGLELTLF DALNIQLDYFTEHRTNILMDRASIPADMGLAAKVRANVGEAKANGVDLSVDYNKFFRNGY WLQAHGNFTYAHSEFLKYEEPAYNEKYRSHIGQSLNQWYGLIAERLFIDEYDVQNSPKQT FGEYGAGDIKYRDVNGDGQITDADMVPLGYPTVPEIVYGFGFSFGNQRFDISAFFQGSAR SSFWIDAKATSPFQHINPTDDDPYHQDIPLLKAYANDHWSESNRNLYAMWPRLSTTVIEN NVRTSSWFMRNGAFLRLKTLEVGYTLPESWTKKAYISSARVYCSGNNLLLLSGFKLWDIE MGGSGLGYPIQRVINLGVQINF >gi|226332243|gb|ACIC01000077.1| GENE 38 50433 - 52373 1640 646 aa, chain + ## HITS:1 COG:no KEGG:BT_2858 NR:ns ## KEGG: BT_2858 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 646 1 646 646 1280 99.0 0 MKNRMIYQMKNKVRNLTYFLLLIGGTALTQSCSYLDVVPDKIGTIDNSFTNRNEAEKYLN TCYSYIPGNDSPYNNVALMGADELWSYYPLTPYNDFGPWKIALGRQNVNDPLCNMWYVYY RGIRDCNIFLENVSDMSKVGDLTPSMRKRWLAEVKFLKAYYHFCLFRMYGPIVIADDNLP ISATPEEVRKKREPVDKVVQYIVDLLDECLEDLPRVISNKGSELGRVTQIANLTLKARLL 
VTAASPLFNGNKDYANYVDKDGEHLFNSTEDPTKWDAAIAACEAAIEASQDPELGLELYS YKSNLILSDETKCQLSIRNSVTEKWNTELIWGLSSRSDAHLQSLCMTRVGAYPNNMWGGQ EMLNPTMEATNLYYTKNGVPMDEDKTWNYADRFKVKMHTNEQPYELASYYETIQMNFDRE PRFYANIGFDGCTWYQYNCPSDSEKDIWTAKNRAGQAQGKLGTNSYTTTGYWTKKLVSAT YEVTQSSYTTERFPWPEMRLTDLYLLYAEALNEKDDANNRELAISYLDQIRARAGLKGIK ESWDTYSKYPNRYKTQEGLREIIQRERAIELMFEGSRFWDLRRWKTANKVLNEKIHGWNF EGETPAEYYSQRTLYTLKFIAPRDYFWPIHQNDLVVNPKLVQNPGW >gi|226332243|gb|ACIC01000077.1| GENE 39 52403 - 53608 1114 401 aa, chain + ## HITS:1 COG:no KEGG:BT_2857 NR:ns ## KEGG: BT_2857 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 401 1 401 401 811 98.0 0 MISRKFIIWACGLLFVFSSCAEDMVKPIENDKDAPGTVQDVKTESLPGAVKFTYTLPSDP DLLYVLAKYTNKTGKVMEFRSSFYTNSVTVEGFGDTDTYKVELYTVDRSENRSQPQIVEV APLTPPILSCYESLSVVSDFGGMTFEMDNKFKSDLAIYVCTPDELGDMVLAETFYSAREE IVYSVRGYDAVPRRFGIFLQDKYGNETDTLFTELTPIYERELDKTKFREMYLQNDSRVES YDGKMEYVWNGRISKDGDSGGVGLHTGTGTKDGPAVFTFDLGVLAKLSRFALWAIQDDKH FYNDMSPRRYEVWGYATEPNPDGSWDQWVKLLDMENVKPSGSPIGILTEDDIEAAKIGDQ ANVPLDMPRVRYIRIKCLKNWSNNYNICFTELTFWGNDNEE >gi|226332243|gb|ACIC01000077.1| GENE 40 53648 - 54862 1106 404 aa, chain + ## HITS:1 COG:no KEGG:BT_2856 NR:ns ## KEGG: BT_2856 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 404 1 404 404 802 99.0 0 MKQFIKHTPYLVSLLILLGITSCTKMDEYLKYTDGKEILYTGIPDSIAMYSGYNRVVFRG VLASDPKIAKIKIYWNLKQDSLEQDIRREGNDNVLIIPIPLEEGTYNFEMHTYDKDGLHP SVPFNLTGTSYGDSYKDGLVNRLVKKVEKIEDDVTIDWSPAEPTALYTMVHYTDNSGKVQ ELKIENSEEETVLKDYKSMTKFDVQTYFLPDEMSIDTFKTAVVSYGVSEDITDLYLKNYK RPFEGRDKNADNKWGILVDWDFTPNILNQSNGKGGWSEDWGNYSIHLESKDWDGEGITNG KVYQSFELPAGNYQLECELEGGSNGMNGYLAASRGSVLPDIDRLKEEALGYSQYGDSNMG GKHTLTFSLAEPATVSVGWVVTFGSSTWMKVLYIKLMNIADIAE >gi|226332243|gb|ACIC01000077.1| GENE 41 55052 - 58021 2762 989 aa, chain + ## HITS:1 COG:TM1624 KEGG:ns NR:ns ## COG: TM1624 COG3250 # Protein_GI_number: 15644372 # Func_class: G Carbohydrate transport and metabolism # Function: 
Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 71 650 22 619 785 129 25.0 3e-29 MCMKFITKKGKSLGIGLAMVGSLLVSSLQAGPVYLHSGESIVSWKLKPEAEVGTNVPALL NAGYDTSSWVDAVVPGTAFTSYVTAGLEKDPNFGDNIHNVDRAKYDRSFWYRTEFKVPAD FDKALTWLNFNGVNRKAEIYLNGHLLGILDGFMHRGRFNITEIVKKDQPNVLAVLVHMPQ TPLANYGSPTYLSSGGWDWMPYVPGLNMGITDKVYLSNTGTATIIDPWIRTDLPLPSRTR ADLSVALEVKNSSDKPSKVVVNGTITPGDVKFTKEVDINPGATSEVKFDKRYYPQLVINS PKLWWPNGYGEPNLYTCKLEVSVDGKVSETKDVTFGIKEYSYDTNNNTLHLHINGVPVFV KGANWGMSEYMLRCRGEEYDTKLRFHHEMNFNMIRNWLGSTTDDEFYEMCDKYGLMVWDD FWINSNPNLPYDLNAFNNNMIEKIKRVRNHPSLAVWCGDNESNPQPPLEGWMAENIKTFD GGDRYFQPNSHAGNLTGSGPWGAFDPRFYFTEYPDGLEGDPERGWGFRTEIGTAVVPTFE SFKKFMPEKDWWPRNKMWDLHYFGQSAFNAAPDRYDASLAKGFGAPSGIEDYCRKAQLIN IESNKAMYEGWLDRMWDDASGIMTWMGQSAYPSMVWQTYDYYYDLTGAYWGTKSACEPLH ILWNPVTDAVKVANTTAENYQDLKAEVTVYNMDGKAVPAYSKSSVVHSASNSTLECFTID FNKERPNLGLNQKVVVSSTSEGDPSMAVDGKKDTRWSSAYRDNEWIYVDLGKVQPVGGVR LDWEASYGKEYKIQVSNDAQQWEEAYSTKNGIGGVELITFPEKDARYVRMFGFKRGWWYG YSLWSFDVLGGTGKSEGLSDVHFIRLKLKDKNGKLISENNYWRGNDRRDFTALNQLPKAE LKVSSKMEQKGEKAEIRATIGLPKSAKSVAFAVHVQAVRTVDGERILPAIMNDNYFTLMP GENKEITINFDKSLLKGGSYKLLVTPYNN >gi|226332243|gb|ACIC01000077.1| GENE 42 58028 - 60307 2280 759 aa, chain + ## HITS:1 COG:TM0076 KEGG:ns NR:ns ## COG: TM0076 COG1472 # Protein_GI_number: 15642851 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 30 759 4 761 778 489 37.0 1e-138 MTLRNITCAVTLCLLTLSGQAVGKTEVPPYKNKSLSIEKRVDDLMGRMTLREKVLQLQNR GAGRLDEIDRIFNGESYGCTHEMGTTAAECAAMYKELQQYMLTKTRLGIPIITSVEGIQG ILQNNCTLFPHALAQGSTFNPALIQRMTEAAGEEAKVIGIHQILSPVLDIARELRWGRVE ETFGEDPYLISEMGIAFINGYQKNRITCMPKHFVAHGTPSGGLNCAQVSGGERELRSLYL YPFRRVIKETNPLALMTCYSAYDGVAITGSPYYMTDILRGELGFKGYVYSDWGSVDRLKT FHAITPETDEAGRLALEAGVDLNIDSAYDNFERMVQEGRLDIKYIDLAVRRILTVKFQLG LFDAPYGDPKAVSKVVRSAEKVALAKEIADESAILLENKNQILPLDLAKYKSIAVVGPNS NQTIYGDYAWTTRDTKEGVTLLQGLKEVLGNKVTVRHAEGCDWWSSKKDKIAEAVEAVRG SDLAIVAVGTRSTYLGRSPKYSTTGEGFDLSSLELPGVQEELLQEIKKTGKPMVVVLIAG 
KPLAMPWVKENADALLVQWYGGEQQGRSLADILVGKVNPSGRLNVSFPRSTGNTPCFYNH FITDRNEPFDQPGSPEEPKGHYIFDAPDPLWSFGSGQSYTTFEYVDCALSDSVLTDKDQL TVTVKVKNTGKMDGKEVVQLYVRDRFSSVATPIRQLKAFRKDLIKAGATNTLTLKLPISE LALWNARMQEVVEPGEFEIQIGSAADNIHFTKVITVKGK >gi|226332243|gb|ACIC01000077.1| GENE 43 60522 - 62066 1224 514 aa, chain + ## HITS:1 COG:CAP0114 KEGG:ns NR:ns ## COG: CAP0114 COG3507 # Protein_GI_number: 15004817 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Clostridium acetobutylicum # 10 511 15 528 531 214 30.0 3e-55 MNTRILCVWIALLTCQIAGAQTKNVTWGDQGNGTYINPILNADYSDPDVIRVGDKYYMVN SDFHYMGMPVLESDDMINWKIISQVYRRLDFPDWDTNGNYGGGSWAPSIRHHDGKFWIYF CTPREGLMMSTATDPHGPWSPLHCVKRIGGWEDPCPIWDDNGQAYLGRSQLGAGPIILHK MSADGRTLEDDGHVIYTGPVAEGTKFHKRDGYYYISIPEGGVGEGWQTILRSKNIYGPYE KKVVLEKGSTNVNGPHQGALVDTPEGEWWFYHFQLTEPLGRVVHLQPAHWKDGWPVIGVD IDMNGIGEPVKVWTKPNTGKKVPVSFPQGGDSFDSPELNLQWQFNHNPSDADWNLTERKG WLLLKALKADHLRASRNMLTQKCIGYEGTVTTEMDMSSWTEGQRAGLFCIGNLFNGIGIL KENGKNYLYLENNGSVEKVKPVSGKKIYFRATMNARTNQHQLYYSTDNKNFTPCGEAYSL RFGDWKGARVGLYSYNTLRDGGNAFFNWFTYDFN >gi|226332243|gb|ACIC01000077.1| GENE 44 62148 - 64304 1949 718 aa, chain + ## HITS:1 COG:BH2223 KEGG:ns NR:ns ## COG: BH2223 COG3345 # Protein_GI_number: 15614786 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidase # Organism: Bacillus halodurans # 27 705 13 720 748 360 32.0 7e-99 MKVRILFLICFLSITCLLNAADKPVIQIHTDNSSLIFRVADNGRLYQSYLGKKLNHEADI SHLPQGTEAYITHGMEDYFEPAIHILHNDGNPSLLLKYVSHETKAVSQGVNETIITLGDD KYPVTVKLHYVTYPAEDIIKTYTEISHKEKKPVVLHQYASSMLHLNRAKYYLTEFSGDWA HEANISDQPLEFGKKVLDTKLGARANMFTPPFFQLSLDQLATENCGEVLVGTLGWTGNFR FTFEVDNKNELRIISGINPYASEYYLPAGVVFRTPDFYFTYSANGKGKASRNFHDWARRY QLKDGDETRMTLLNNWEATYFDFNEEKLIGLIGDAAGLGVDMFLLDDGWFANKYPRSSDH QGLGDWDETADKLPNGIGRLVEEATKKGIKFGLWIEPEMVNPKSELYEKHKDWVIHLPNR DEYYFRNQLVLDLSNPKVQDFVFGVVDDLMTKYPGIAFFKWDCNSPITNIYSSYLKEKQS HLYIDYVRGLYNVLERIKTKYPDLPMMLCSGGSGRIDYEALRYFTEFWPSDNTDPIERLF IQWGYSQLFPAKTLCAHVTTWNHDASIKFRTDVAMMGKLGFDIKLSDLNDNEKKFCRDAV 
TNYNHLKPVVLEGDMYRLVSPYSGNHTSTQYVGKDKKGAVVFAFDIYSRYGEKLLPVRLQ GLDAGRQYRVKEINLMPGTNSSLQGNGELFSGEYLMTVGLNIFTGQRLNSRVIEVVAE >gi|226332243|gb|ACIC01000077.1| GENE 45 64792 - 66345 1353 517 aa, chain - ## HITS:1 COG:ECs4089_1 KEGG:ns NR:ns ## COG: ECs4089_1 COG0642 # Protein_GI_number: 15833343 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli O157:H7 # 308 517 283 505 525 84 25.0 5e-16 MKTSIVTLLITFCFYLSVYAQAPQDKATELKEQALSSLKQKDYIKARYLFKKAYEAFAVR ENYPQAIECGIQANALYVRENFYKEGFELCRNMEQLIWTGEQKQNKVFYDLRFPISKERL QMYISLKNPAQAKNQLDKLEEIASLAKNDSLMEVLLYTKANYYYTFNQNTQGDACFRKLI SQYKEKKDYDKVSDCYKTLIGIARKANNAPLMERTYESYIVWTDSVKALTAQDELNVLKR KYDESLQTIQDKDSTVSAKQYIIIGLCTLVAILVAAIIVLAILLLKFITGNRKLKKSVVI ANEHNELKTKFIRNISSQMEPTLNTLGTSAKELSAQAPRYAEQMQAQVAALKQFSDNIQE LSSLENSLTEPYEMGEVNAGTFCESVMEKAKEFMKPDVTPSVNAPKLQVKTNKEQLERIL LYLLKNAAFYTEQGRISLDFKKRGAHTHQFIITDTGTGIPAEQQENLFKPFTQVKDLTTG DGLGLPICSLIATKMNGSLTLDTSYTKGCRFILELHA >gi|226332243|gb|ACIC01000077.1| GENE 46 66472 - 67485 374 337 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900011|ref|NP_344615.1| aldose 1-epimerase [Streptococcus pneumoniae TIGR4] # 14 333 14 340 345 148 29 8e-35 MQFTITHWGCTPSGDSIYLFRLTNATGASIELTNLGATWVSANMPNRYDVFENILLGYDH AEGYLKDTYYMGATVGRFANRIHQASFSIEGTTYLLEKNDGENTNHGGLSGFHKKIWQWK RTDSGIRFLLHSPDMKGGYPGNVQAEVEYQFTETNELTISYHGTTDRPTYLNLTNHAYFN LSGDKRKITEHELMIPAARILETTSQFIPTGQAQDVKDSPFDFSTSRSIGAHLYNDNEQL HWNKGYNHCYILKEESSDTLLTAAVLSDPFSGRRLTVRTDLPGVLLYTAGYLAPTPDIGV CLETQYFPDTPSHPHFPSCLLMPGEEYRHRTIYTFDK >gi|226332243|gb|ACIC01000077.1| GENE 47 67814 - 69022 922 402 aa, chain - ## HITS:1 COG:no KEGG:BT_2848 NR:ns ## KEGG: BT_2848 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 402 1 402 402 827 100.0 0 MKNRIIILSLCASCLLAIVAFVYNSETKTDPLALSPIVAKRVTTPTGTLVSCNLKALKDT VDIPLSYLTEELQVVKLDNRDEALVGGWIRTTVGEKYILVSNNKQTPYKLFTRDGKFITT IGAYGQGPNEYGNTYADQLDEAHNRIYILPWQSDKLLVFDLQGNPQPPIPLCMRVPKGKF 
RVDTEKSEVTLTTLPFEGWPAVVWTQDFKGKRKNFIAPGHLTVPRDFSNEVFMDNNTNDY GVMLMVIMPAPRTDSLYHYNAAQNRLEARFTVEYPNKEKIPWHGYSEYPRHFIGDVSVPI QVSENTWSGSKPAKYIIDKKTLHGSYFRLYNDFLGTKKMQIWPSFGNGYYVANMEPAQLK ETLEKEIARKDIPAEAKKRAQTLINSLDEEGNNVILLAKMKK >gi|226332243|gb|ACIC01000077.1| GENE 48 69221 - 70750 582 509 aa, chain - ## HITS:1 COG:no KEGG:BT_2847 NR:ns ## KEGG: BT_2847 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 509 1 507 507 903 100.0 0 MQMENVSTDSIFFYLQQIEEPDKLPLNDQGDYYLLLYKATLWKTGTPDDSLLQIAIRQYK QNEQQTQYVRARIEQSASYLYRNLPDSTLLMSEQILKGYQLNDTLKTRLYGLRRAAYSRK QDYPQALVMADSSRQLALKMKDTLFYFSTSQLYLQIINKMQDHDRYTQSYLQLMKELMDS PKYQSLNYYALESLLNTSLQRKDFRQAVIYLQQLSSQRRGRNEVPHYLLLCGRTHAALNQ IDSAQYYYRQAAISSSSFIAMEANARLFNLINEKEYPEQVFYTKQKESTIRDNILNNIST GIREREFNQLKLQNELYQLHLQQQQRELWMLGIVTALLSIGFIAFFFYQREKKKRLLREN QLLYKEAEVSGLREKEIRLRNKEAELREALFRRMPFFHKLPSLHANNNQDEPGTSHKIVV TDAEWAEVTSVVNDAFDNFVVRLRQAYPQLGDKEIGFCCLVKINVNIQDLSDIYCVSKAA ITKRKYRIKTDKLGITDENISLDSFLKVF >gi|226332243|gb|ACIC01000077.1| GENE 49 70945 - 72483 955 512 aa, chain - ## HITS:1 COG:slr0904 KEGG:ns NR:ns ## COG: slr0904 COG0606 # Protein_GI_number: 16331658 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATPase with chaperone activity # Organism: Synechocystis # 1 507 1 507 509 525 52.0 1e-149 MLIKVFGAAVQGIDATLITIEVNSSRGCMFYLVGLPDSAVKESHQRIISALQVNGYKMPT SNLVVNMAPADIRKEGSAYDLPLAIGLLGASETISSEKLSRYLMMGELSLDGSIQPIKGA LPIAIKAREENFEGLIIPQQNAREAAVVNQLKVYGVSNIKEVIQFFNGERELEQTVVNTR EEFYQQQTAFDLDFADVKGQENVKRALEVAAAGGHNLIMIGAPGSGKSMMAKRLPSILPP LSLGESLETTKIHSVAGKLNRGSSLISQRPFRDPHHTISQVAMVGGGSFPQPGEISLAHN GVLFLDELPEFNRGVLEVLRQPLEDRQITISRIKSTISYPANLMLIASMNPCPCGYYNHP TKACVCSPGQVQKYLNKISGPLLDRIDIQIEIVPVPFDKISDQRQGEASSVIRNRVIQAR RIQEQRYADHPGIYCNAQMSSKLLSIYARPDDKGLSLLRNAMERLNLSARAYDRILKVAR TIADLEGAELIQPSHLAEAISYRNLDRENWAG >gi|226332243|gb|ACIC01000077.1| GENE 50 72497 - 73612 1040 371 aa, chain - ## HITS:1 COG:no KEGG:BT_2845 NR:ns ## 
KEGG: BT_2845 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 361 13 361 371 625 100.0 1e-178 MKKSSILILIIICLVGCKKGSNETTITGEIKGLGTDTLYLYGMDELYDRIDTIYVENDKF SYTTSVDTITSAYLLLKNRIEYPVFLDKGNKIKIKGDTINLNFLTISGNIYNEEFTDFQK ALEDPADPSEKAGEETVDKRITVEKANTAEEMAEEFILQHHSSYVSLYLLDKYFVQKETP DFSKIKKLVEVMTGVLQDKPYIERLNETITQAEKSEIGKYAPFFSLPNAKGEKITRSSDA FKQKSLLINFWASWNDSISQKQSNSELREIYKKYKKNKYIGMLGISLDVDKQQWKDAIKR DTLDWEQVCDFGGLNSEVAKQYSIYKIPANILLSSDGKILAKNLRGEELKKKIENIVEEA TEKEKKNKQKK >gi|226332243|gb|ACIC01000077.1| GENE 51 73725 - 75416 1664 563 aa, chain - ## HITS:1 COG:no KEGG:BT_2844 NR:ns ## KEGG: BT_2844 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 563 1 553 553 975 100.0 0 MTKKLYLPLLMAIVVALFSSCKKMGPLSADYFTVTPQVLEAVGGKVPATINGKFPEKYFK KKAVVEVTPVLKWNGGEAKGQSAVFQGEKVEGNDQTISYKVGGSYTMKTSFDYVPEMAKS ELWLEFKAKVGKKEVVIPAVKVADGVISTSELVNNTLGSANPALGEDAFQRIIKEKHDAN IMFLIQQANIRSSELKTAKEFNKEVANINEAANKKISNIEVSAYASPDGGVSLNTTLAEN RESNTTKMLNKDLKKAKIDAPVDAKYTAQDWEGFQELVSKSNIQDKELILRVLSMYQDPA QREQEIKNISSVYKNLADDILPQLRRSRLTLNYEIIGKSDEEITKLASSNPSELNIEELL YAATLTNDPAKQEAIYTQATKQFPNDYRAYNNLGKLAYQAGNVDKAESYLKKAASVNAAP EVNMNLGLISLIKGDKAAAEAYFGKAAGTKELGESMGNLYIAQGQYERAVNSFGDAKTNS AALAQILAKDYNKAKNTLAGVEKPDAYTDYLMAVLGARTNNSSMVTSSLKSAVAKEPALA KKAATDLEFSKFFTNADFMSIIK >gi|226332243|gb|ACIC01000077.1| GENE 52 75555 - 76514 713 319 aa, chain - ## HITS:1 COG:SA1328 KEGG:ns NR:ns ## COG: SA1328 COG4974 # Protein_GI_number: 15927078 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Staphylococcus aureus N315 # 16 306 4 294 295 211 41.0 2e-54 MEINEKNKKKEQQALIIRKYQQYLKLEKSLSPNTLDAYLTDLDKLMSFLTLEGIDVLEVC LSDLQRFAAGLHDIGIHPRSQARILSGIKSFFRFLIMADYLEADPSELLEGPKIGLKLPE VLTVEEIDNIISSVDRSKAEGQRNRAILETLYSCGLRVSELVSLKLSDLYFDEGFIKVEG KGSKQRLVPISPRAINEIKLYFLDRNRIEVKKDYEDFVFVSQRRGKSLSRIMIFHMIKEL 
AQNAGITKNISPHTFRHSFATHLLEGGANLRAIQCMLGHESIATTEIYTHIDRNMLRSEI IEHHPRNIKYRREKESGFH >gi|226332243|gb|ACIC01000077.1| GENE 53 76589 - 77011 372 140 aa, chain + ## HITS:1 COG:SMc01343 KEGG:ns NR:ns ## COG: SMc01343 COG0757 # Protein_GI_number: 15965074 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate dehydratase II # Organism: Sinorhizobium meliloti # 3 136 5 140 148 152 53.0 2e-37 MRIQIINGPNINLLGKREPSIYGSVTFEDYLAELRKRYADLEIDYFQSNIEGEMIDCIQQ VGFEVDGIILNAGAYTHTSIALQDAIRSVTSPVIEVHISNVHSRESFRHVSMIACACKGV ICGFGLNSYRLALEALLGNK >gi|226332243|gb|ACIC01000077.1| GENE 54 77043 - 78500 1620 485 aa, chain + ## HITS:1 COG:BB0348 KEGG:ns NR:ns ## COG: BB0348 COG0469 # Protein_GI_number: 15594693 # Func_class: G Carbohydrate transport and metabolism # Function: Pyruvate kinase # Organism: Borrelia burgdorferi # 1 471 1 473 477 394 47.0 1e-109 MLLKQTKIVASISDRRCDVDFIKQLFEAGMNVVRMNTAHASREGFEALIANVRSVSNRIA ILMDTKGPEVRTTANAEPIPYQIGDKVKIVGNPDQETTRECIAVSYPNFVNDLNIGGLVL IDDGDLELEVIDKTADYLLCEVKNDATLGSRKSVNVPGVRINLPSLTEKDRNNILYAIEK DIDFIAHSFVRNRQDVLDIREILDAHNSDIRIIAKIENQEGVDNIDEILEVADGVMVARG DLGIEVPQERIPGIQRVLIRKCILAKKPVIVATQMLHTMINNPRPTRAEVTDIANAIYYR TDALMLSGETAYGKYPVDAVKTMTKIAAQAEKDKLEENDIRIPLNENSNDVTAFLAKQAV KATSKLKIRAIITDSYSGRTARNLAAFRGKFPVLAICYKEKTMRHLALSYGVEAIYMPEL ANGQEYYFAALRRLLKEGRLQPTDMVGYLSSGKAGTQTSFLEINVVEDALKHAGETVLPN NNRYL >gi|226332243|gb|ACIC01000077.1| GENE 55 78526 - 79164 634 212 aa, chain + ## HITS:1 COG:aq_1507 KEGG:ns NR:ns ## COG: aq_1507 COG4122 # Protein_GI_number: 15606661 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Aquifex aeolicus # 5 211 7 211 212 139 37.0 3e-33 MQETESIDEYILQHIDAESDYLKALYRDTHVKLLRPRMASGHLQGRMLKMFVEMIRPRHI LEIGTYSGYSALCLAEGLSEGGMLHTFEINDEQEDFTRPWLENSPFADKIRFYIGDALEL VPHLGVTFDLAFIDGDKRKYIDYYEMVLKHLSTGGYIIADNTLWDGHVLEQPRSADTQTI GIKAFNDALVTDDRVEKVILPLRDGLTIIRKK >gi|226332243|gb|ACIC01000077.1| GENE 56 79181 - 79513 299 110 aa, chain + ## HITS:1 COG:BS_rbfA 
KEGG:ns NR:ns ## COG: BS_rbfA COG0858 # Protein_GI_number: 16078728 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-binding factor A # Organism: Bacillus subtilis # 3 109 2 107 117 62 37.0 2e-10 METTRQNKISRLLQKELSEIFLLQTKAMPGILISVSAVRISPDMSIARVYLSVFPSEKAE EMVKNINENMKSIRFELGTRVRHQLRIIPELKFFVDDSLDYIEKIDSLLK >gi|226332243|gb|ACIC01000077.1| GENE 57 79663 - 80736 840 357 aa, chain + ## HITS:1 COG:HI1548 KEGG:ns NR:ns ## COG: HI1548 COG4591 # Protein_GI_number: 16273448 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ABC-type transport system, involved in lipoprotein release, permease component # Organism: Haemophilus influenzae # 37 354 91 406 416 95 24.0 1e-19 MVASFFTAFDPQLKITIREGKVFDGQDERIRAVCALPEVSVSTETLEENAMVQYKDRQAM VVLKGVEDNFEDLTEIDNILYGAGEFILHDSIVNYGVMGVELVSTLGTGLAFVDPLQVYL PKRNAKVNMANPGASFNRDYLYSPGVVFVVNQQKYDESYILTSLDFLRNLLDYTTEVSGI ELKLKPNTNISSVQSKIEKMLGDDFVVQNRYQQQADVFRIMEIEKLISYLFLTFILMIAC FNVIGSLSMLILDKKDDVITLRSLGASDKLISRIFLFEGRLISLFGAISGIILGLILCFI QQKFGIITLGGGGGTFVVDAYPVSVHAWDVVLIFITVLAVGFLSVWYPVRYLSKRLL >gi|226332243|gb|ACIC01000077.1| GENE 58 80918 - 81928 1089 336 aa, chain - ## HITS:1 COG:Rv2454c KEGG:ns NR:ns ## COG: Rv2454c COG1013 # Protein_GI_number: 15609591 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit # Organism: Mycobacterium tuberculosis H37Rv # 2 335 38 372 373 328 47.0 8e-90 MSDKVYTVQDYKSGQPRWCPGCGDHAFLNSLHKAMAELGVAPHNIAVISGIGCSSRLPYY VNTYGFHTIHGRAAAVATGAKVANPDLTIWQISGDGDGLAIGGNHFIHAVRRNIDLNMIL LNNRIYGLTKGQYSPTSERGFVSKSSPYGTVEDPFHPAELAFGARGRFFARCIAVDGAAS VEVLKAAANHKGASVVEVLQNCVIFNDGTHASVATKEGRAKNAIYLEHGKPMLFGENKEF GLTQEGFGLKVVKLGENGITEKDILIHDAHCQDNTLQLKLALMEGPDFPIALGVIREVEA PTYNDAVAEQIEEVKGKKKYHNFQELLMTNDTWEVK >gi|226332243|gb|ACIC01000077.1| GENE 59 81932 - 83782 1958 616 aa, chain - ## HITS:1 COG:MT2530_2 KEGG:ns NR:ns ## COG: MT2530_2 COG0674 # Protein_GI_number: 15841979 
# Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Mycobacterium tuberculosis CDC1551 # 213 606 1 389 425 383 52.0 1e-106 MADEMMVKELEEVVVRFSGDSGDGMQLAGNIFSNVSATVGNDICTFPDYPADIRAPQGSL TGVSGFQIHVGAGQVYTPGDRCHVLVAMNPSALKTQIKFCKPQGLIITDSDSFEARDLEK AQFKTNNPFEELGVKQEVLEVPISSMCKESLKDSGLDNKTILRCKNMFALGLVCWLFNRN LAAAEKMLREKFAKKPEIAEANIKVLNDGYNYGANTHASTSTYKIESKAPKSKGLYTDIN GNKATSYGLIAAAEKAGLELYLGSYPITPATDILHELAKHKSLGVKTVQCEDEIAGCASA VGAAFAGALAVTTTSGPGICLKSEAMNLAVIGELPLVIVNVQRGGPSTGLPTKSEQTDLL QALYGRNGESPMPVIAATSPTNCFDAAYMAAKIALEHMTPVVLLTDAFVANGSAAWKLPD LNDYPAINPPYVTPDMAGNWTPYQRNEETGARYWATPGTEGFMHRIGGLEKSNETGAIST EPENHNKMVHLRQAKVDKIADYIPELEVLGDEDADLLIVGWGGTYGHLRLAMDFMREHGK KVAFAHFQYINPLPKNTADVLRKYKKIVVAEQNLGQFAGYLRMKVPGLNISQFNQVKGQP FVTRELIDAFTKLLEE Prediction of potential genes in microbial genomes Time: Thu May 12 01:16:02 2011 Seq name: gi|226332242|gb|ACIC01000078.1| Bacteroides sp. 1_1_6 cont1.78, whole genome shotgun sequence Length of sequence - 16192 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 6, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 24 - 3056 3217 ## COG0342 Preprotein translocase subunit SecD - Prom 3106 - 3165 3.9 - Term 3171 - 3209 2.1 2 2 Op 1 . - CDS 3222 - 5306 2130 ## COG0339 Zn-dependent oligopeptidases 3 2 Op 2 . - CDS 5334 - 6419 726 ## BT_2833 hypothetical protein 4 2 Op 3 . - CDS 6423 - 7325 766 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) 5 2 Op 4 . - CDS 7306 - 7980 578 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) - Prom 8164 - 8223 8.3 + Prom 8081 - 8140 10.3 6 3 Op 1 . + CDS 8196 - 8468 322 ## COG0776 Bacterial nucleoid DNA-binding protein + Term 8487 - 8535 9.1 7 3 Op 2 . + CDS 8551 - 10344 1863 ## COG0018 Arginyl-tRNA synthetase 8 4 Tu 1 . 
- CDS 10505 - 12547 1135 ## BT_2828 hypothetical protein - Prom 12587 - 12646 6.0 - Term 12647 - 12706 10.8 9 5 Tu 1 . - CDS 12723 - 15071 2282 ## COG0550 Topoisomerase IA - Prom 15138 - 15197 9.0 + Prom 15037 - 15096 6.1 10 6 Tu 1 . + CDS 15318 - 16191 453 ## BT_2826 two-component system sensor histidine kinase/response regulator, hybrid ('one component system') Predicted protein(s) >gi|226332242|gb|ACIC01000078.1| GENE 1 24 - 3056 3217 1010 aa, chain - ## HITS:1 COG:AGc2877_1 KEGG:ns NR:ns ## COG: AGc2877_1 COG0342 # Protein_GI_number: 15888881 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecD # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 395 668 275 548 562 243 44.0 2e-63 MQNKGFVKVFAVLLTLVCVFYLSFSFVTRHYTNKAKEIANGDPKVEQDYLDSLSNEKVML WNWTLKDCREMEISLGLDLKGGMNVILEVSVPDVIRALADNKPDENFNKALNEAAKQAVN SQDDIITLFVREYQKTAPGAKLSELFATQQLKDKVNQKSSDAEVEKVLRAEVKAAVENSY NVLRTRIDRFGVVQPNIQSLEDKMGRIMVELPGIKEPERVRKLLQGSANLEFWETYTAKE ILPAMQSADSKLRAILSQETAADSTATNATADTIPAAKLAEATPAKKAVSVADSLAATLK GDAKDEKAGANMEEIKKQYPLLAVLQLNSSGQGPVIGYANYKDTADINRYLSMPEIQSEL PKDLRLKWGVSPSEFDKKGQTFELYAIKSTERNGKAPLEGDVVTDAKDEFDQYSKPAVSM TMNSDGARRWAQLTKQNIGRSIAIVLDNYVYSAPNVNSEITGGRSQITGHFTPEQAKDLA NVLKSGKMPAPAHIVQEDIVGPSLGQESINAGIFSFVVALILLMIYMCSMYGFIPGMVAN CALFLNFFFTLGILSSFQAALTMSGIAGMVLSLGMAVDANVLIYERTKEELRAGKGVKKA LADGYSNAFSAIFDSNLTSIITGIILFNFGTGPIRGFATTLIIGILVSFFTAVFMTRLVY EHFMNKDKWLNLTFTSKISRNLLVNTRFDFMGTNKKSLIIVSAIILVCIGSFALRGLSQS IDFTGGRNFKVQFENPVEPEQVRELIADKFGEDVNVNVIAIGTDKKTVRISTNYRIADEG NNVDSEIESYLYETLKPLLTQNITLATFIDRDNHTGGSIVSSQKVGPSIADDIKTGAVWS VVLALIAIGLYILIRFRNIAYSIGSIVALTCDTIMIIGAYSLLWGIVPFSLEIDQTFIGA ILTAIGYSINDKVVIFDRVREFFGLYPKRDKRQLFNDSLNTTLARTINTSLSTLIVLLCI FILGGDSIRSFAFAMILGVVIGTLSSLFIASPIAYNMMKNKKVVVAATEE >gi|226332242|gb|ACIC01000078.1| GENE 2 3222 - 5306 2130 694 aa, chain - ## HITS:1 COG:XF1944 KEGG:ns NR:ns ## COG: XF1944 COG0339 # Protein_GI_number: 15838538 # Func_class: E Amino acid transport 
and metabolism # Function: Zn-dependent oligopeptidases # Organism: Xylella fastidiosa 9a5c # 24 694 35 716 716 592 45.0 1e-169 MIRKTLTILAVSCMMYSCGTKTESNPFFTEFQTEYGVPSFDKIKLEHYEPAFLKGIEEQN QNIQAIIASPEVPTFDNTIVALDSSAPILDRVSAIFFNMTDAETTDELTELSIKMAPVLS EHEDNISLNQELFKRVNVVYQQKDSMNLTTEQKRLLDKTYKGFVRSGANLDAEKQARLRE INKELSTLGITFSNNILNENNAFQLFVDKKEDLAGLPEWFCQSAAEEAKAAGQPGKWLFT LHNASRLPFLQYAENRPLREKMYKAYINRGNNNDKNDNKETIRKIVSLRLEKARLLGFNN YANFVLDETMSKNDSNVMSLLNNLWSYALPKAKAEAAELQQLMDKEGKGEKLEAWDWWYY TEKLRKEKYNLSEEDTKPYFKLENVREGAFAVANKLYGITLNKLEGIPTYHPDVEVFEVK DADGSQLGIFYVDYFPRSGKSGGAWMSNYREQQGATRPLVCNVCSFTKPVGDTPSLLTMD EVETLFHEFGHALHGLLTKCEYKGTSGTNVVRDFVELPSQINEHWATEPEVLKMYAKHYQ TGEVIPDEIIEKILKQKTFNQGFMTTELLAAAILDMNLHMITDVKNLDMLAFEKEAMDKL GLIPEIAPRYRVTYFNHIIGGYAAGYYSYLWANVLDNDAFEAFKEHGIFDKNTADLFRYN VLEKGDSEDPMILYKNFRGAEPSLEPLLKNRGMK >gi|226332242|gb|ACIC01000078.1| GENE 3 5334 - 6419 726 361 aa, chain - ## HITS:1 COG:no KEGG:BT_2833 NR:ns ## KEGG: BT_2833 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 361 1 361 361 718 99.0 0 MEHIGKFVLYLILAVNALFVGMLILSAYSPYLQPKIHPIASCLGLAFPIFLAVNICFTLF WVIISYRYALLPVIGFLVCIPQIRTYIPINSTVETIPDGSIKFLSYNVMGFNNLEKKEGK NPILSYLADSEADIICLQEYNSTKNKKYLTDEDIRKALKAYPYRSIHNPEKGGSQLACFS KFPILSARPIKYESTYNGSMQYTLKVNEDTITLINNHLESNKLTKEDKVIYEDMIKDPNA KKVKTGLRQLIKKLAEASAIRSSQADSVAVAIANSKYPTIIACGDFNDASISYTHRILTQ QLDDAFTQSGRGLGISYNLNKFYFRIDNILISPNQKAYNCTVDRSIKDSDHYPIWCYIGK Q >gi|226332242|gb|ACIC01000078.1| GENE 4 6423 - 7325 766 300 aa, chain - ## HITS:1 COG:MA3859 KEGG:ns NR:ns ## COG: MA3859 COG0705 # Protein_GI_number: 20092655 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Methanosarcina acetivorans str.C2A # 56 214 62 212 226 85 32.0 8e-17 MGHIITDLKETFRRGNTFIRLIYINVGIFVIGTLISVILQLFNLSATGIFDLFALPASLH RFILQPWSLLSYMFMHAGFLHILFNMLWLYWFGSLFLYFFSGKHLRGLYVLGGICGGLFY 
MIAYNIFPYFSQTLPFSTLVGASASVLAIVAATAYREPNYRVQLFLFGAVRLKYLALIVI GTDLLFITSNNAGGHIAHLGGALAGLWFAASLSKGKDVTYWINWFLDGFASLFQKKTWKR KPKMKVHYGSDTSREKDYDYNARKKAQSDEVDRILEKLKKSGYESLTTEEKKSLFDASKR >gi|226332242|gb|ACIC01000078.1| GENE 5 7306 - 7980 578 224 aa, chain - ## HITS:1 COG:XF0649 KEGG:ns NR:ns ## COG: XF0649 COG0705 # Protein_GI_number: 15837251 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Xylella fastidiosa 9a5c # 1 221 9 211 224 147 42.0 2e-35 MPTVTKNLIIINVLVFFGTLVAQRYGIDLTNYLGLHFFLASDFNPAQLITYMFMHGGFSH IFFNMFAVFMFGTVLERTWGPKRFLFYYIACGIGAGLIQEGVQYIKYIVDYSHYSQVDIG TGIIPMGEFLNMLTTVGASGAVYAILLAFGMLFPNNQLFIFPLPFPIKAKFFVIGYALIE LYAGFANNPGDNVAHFAHLGGMIFGFILIMYWRKKSRNNGTYYN >gi|226332242|gb|ACIC01000078.1| GENE 6 8196 - 8468 322 90 aa, chain + ## HITS:1 COG:YPO3154 KEGG:ns NR:ns ## COG: YPO3154 COG0776 # Protein_GI_number: 16123316 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Yersinia pestis # 1 89 1 89 90 79 61.0 2e-15 MNKSELISAMAAEAQMSKADAKKALDAFISSVTNAMKAGDKVALVGFGTFSVSERAARTG INPSTKAAITIPAKKVAKFKAGAELSAAVE >gi|226332242|gb|ACIC01000078.1| GENE 7 8551 - 10344 1863 597 aa, chain + ## HITS:1 COG:TP0831 KEGG:ns NR:ns ## COG: TP0831 COG0018 # Protein_GI_number: 15639817 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Arginyl-tRNA synthetase # Organism: Treponema pallidum # 8 597 12 589 589 473 41.0 1e-133 MKIEDKLVASVINGLKALYGQEVPEKMVQLQKTKKEFEGHLTLVVFPFLKMSRKGPEQTA QEIGEYLKANEPAVAAFNVIKGFLNLTIASATWIELLNEIQSDEEYGLVKATETSPLVMI EYSSPNTNKPLHLGHVRNNLLGNALANIVAANGNRVVKTNIVNDRGIHICKSMLAWKKYG NGETPESTGKKGDHLVGDYYVSFDKHYKAELAELMEKGMTKEEAEAASPLMQEAREMLVK WEAGDPEVRALWEMMNNWVYAGFDETYRKMGVGFDKIYYESNTYLEGKEKVMEGLEKGFF FKKEDGSVWVDLTAEGLDHKLLLRGDGTSVYMTQDIGTAKLRFADYPIDKMIYVVGNEQN YHFQVLSILLDKLGFEWGKGLVHFSYGMVELPEGKMKSREGTVVDADDLMEEMVSTAKET SQELGKLDGLTQEEADDIARIVGLGALKYFILKVDARKNMTFNPKESIDFNGNTGPFIQY 
TYARIQSVLRKAAESGIVIPEQIPAGIELSEKEEGLIQLVADFAAVVKQAGEDYSPSIIA NYTYDLVKEYNQFYHDFSILREENEAVKVFRIALSANVAKVVRLGMGLLGIEVPSRM >gi|226332242|gb|ACIC01000078.1| GENE 8 10505 - 12547 1135 680 aa, chain - ## HITS:1 COG:no KEGG:BT_2828 NR:ns ## KEGG: BT_2828 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 680 1 670 670 1311 100.0 0 MQTNNFKSLSMGVLCALSLASCMEKDLYQAPEEKTADEYFGFTTSSSCTVDLNYGLNYQV VFELYAENPLSEDKDGNIIKTKEEPVYRGATDKKGIFNDFISIPSYLTELYLYSDYLGTV SPIQLQITNGKVSFDQQAFIQSKRNAKGTSRGVTNSGYKYPEGFQVLGEWNEIGCPTYLS STPTEIDGQLMYNIREVFIKPGGSAAMEDYYPEYIGDNVNIEINVKKPTEIGLVMLSSTG TKQNTVGYFTYPTDQKPTDISQVTPIIAYPRISTAVCNSSSTAGSMYTGDRVELKYWDGT KFVNEFPAGVSIAWFLIESSYNQGTKEIMNNKRTFYSIRDLNKSKEHRTIALKNKSGEVV AFGMEDATNFEPTGDNTRKGNFGDAVFYLDFSDGSAIETGGVEELPDKSIDNKEIYNSFK GVLSFEDFWPSKGDYDMNDMIVEYKREIYKSVLTSKVVKVIDTFVPKHDGANWQNGFGYQ LTGIANSDIKKITVESGGIVSQFMEGQDREPGQNYPTIILFDNQKAAINKTFTVTIDVAQ ERFTETAFTPFFNNKFWEQTYSKFNMNPFIIISANTGRDKEAHIVKFPPTSKMNFSYFGT GSDVSRPDEGLYYVNNENMPTGLQISGISVGTKRGTDFLVPVETTSILEAYPKFGDWASS FGASNPYWWKDPDNSKVITQ >gi|226332242|gb|ACIC01000078.1| GENE 9 12723 - 15071 2282 782 aa, chain - ## HITS:1 COG:AGc2398_1 KEGG:ns NR:ns ## COG: AGc2398_1 COG0550 # Protein_GI_number: 15888629 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 585 2 579 580 437 42.0 1e-122 MQKNLVIVESPAKAKTIEKFLGKDFKVLSSYGHIRDLKKKEFSIDVDKNFKPDYEIPADK KALVSTLKAEAKEAETVWLASDEDREGEAIAWHLYEVLKLKPENTKRIVFHEITKGAILK AIEQPRNIDLNLVNAQQARRILDRIVGFELSPVLWRKVKPSLSAGRVQSVAVRLIVERER EIHAFKSEAAYRVIAIFLVPDTDGKLVEMKAELSRRIKTKEEAKAFLDACKGATFTIEDI TTRPVKKTPPAPFTTSTLQQEAARKLGYTVAQTMMIAQRLYESGFITYMRTDSVNLSEYA TASSKDAIIQMMGERYVHPRHFETKTKGAQEAHEAIRPTYMENQSIEGTAQEKKLYDLIW KRTIASQMADAELEKTTATISISKSGDAFTAIGEVIKFDGFLRVYRESYDDENEQEDESR LLPPLKKGQKLEYGPIVATERFTQRPPRYTEASLVRKLEELGIGRPSTYAPTISTIQQRE YVEKGNKDGEERTFNVLTLKDNQIKDESHNEVTGAEKSKLFPTDTGTVVNDFLTEYFPDI 
LDYNFTASVEKEFDEIAEGEVKWTSIMKTFYDQFHPAVEKTLSIKTEHKVGERMLGEEPE TGKPVSVKIGRFGPVVQIGAADDEEKPRFAQMKKGQSMETITLEEALELFKLPRTLGEYE GKTVTVGIGRFGPYIQHNKVYVSLPKTLDPMKVTLEEAEQLILEKRTKEAERHIKKFDEE PELEILNGRYGPYIAYKGNNYKIPKDIVPQDLSLKSCFDLIKIQDEKGPGTSAKGKRTAK KK >gi|226332242|gb|ACIC01000078.1| GENE 10 15318 - 16191 453 291 aa, chain + ## HITS:1 COG:no KEGG:BT_2826 NR:ns ## KEGG: BT_2826 # Name: not_defined # Def: two-component system sensor histidine kinase/response regulator, hybrid ('one component system') # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 291 1 287 1305 605 100.0 1e-172 MWLFMRKILFKIYLLFFFVGIAQPILSQSPYYYYKQLGIKEGLSQSKVQCVLNDHRGYLW IGTESGLNCYDRDHLKQYLHRPGDERTLPSNNITFIAEDSLCNLWVATMNGICLYDRGSD SFRTLSNNGKPIYVASYLLVEGGILLGGSGAIYKYVYATNKLEPLYYAQDPVYYNPFWQM IRYDEENILLNSRWHGIYSFNMKTYEVKKIESFTENNYTSIYLDSYKRLWVSVYGNGLYC YQGDQLIKHFTASNSPLTYDVIHDIMERDNQLWVATDGGGINIISLDDFSF Prediction of potential genes in microbial genomes Time: Thu May 12 01:16:49 2011 Seq name: gi|226332241|gb|ACIC01000079.1| Bacteroides sp. 1_1_6 cont1.79, whole genome shotgun sequence Length of sequence - 117861 bp Number of predicted genes - 100, with homology - 98 Number of transcription units - 47, operones - 26 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 3057 1930 ## COG0642 Signal transduction histidine kinase + Term 3168 - 3217 10.8 - Term 3454 - 3503 1.4 2 2 Tu 1 . - CDS 3597 - 4640 803 ## BT_2825 chitinase - Prom 4663 - 4722 1.6 - Term 4669 - 4722 7.2 3 3 Op 1 . - CDS 4766 - 5887 1096 ## COG2273 Beta-glucanase/Beta-glucan synthetase - Term 5916 - 5955 5.3 4 3 Op 2 . - CDS 5981 - 7456 1150 ## BT_2823 hypothetical protein 5 3 Op 3 . - CDS 7484 - 8962 1167 ## BT_2822 hypothetical protein 6 3 Op 4 . - CDS 8994 - 10724 1431 ## BT_2821 hypothetical protein 7 3 Op 5 . - CDS 10738 - 13518 2513 ## BT_2820 hypothetical protein 8 3 Op 6 . - CDS 13548 - 15305 1689 ## BT_2819 hypothetical protein 9 3 Op 7 . 
- CDS 15317 - 18490 3152 ## BT_2818 hypothetical protein - Term 18786 - 18840 12.1 10 4 Op 1 . - CDS 18878 - 20599 1786 ## BT_2817 putative TonB-dependent receptor 11 4 Op 2 . - CDS 20612 - 23623 3092 ## BT_2816 TPR domain-containing protein - Prom 23643 - 23702 3.5 12 4 Op 3 . - CDS 23712 - 23807 71 ## - Prom 23836 - 23895 4.3 13 5 Tu 1 . - CDS 23915 - 24442 418 ## COG2816 NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding - Prom 24467 - 24526 7.0 + Prom 24499 - 24558 4.3 14 6 Op 1 2/0.000 + CDS 24588 - 25517 713 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 15 6 Op 2 . + CDS 25542 - 27011 1326 ## COG0591 Na+/proline symporter + Term 27055 - 27089 -0.8 + Prom 27105 - 27164 5.0 16 7 Op 1 11/0.000 + CDS 27194 - 27760 728 ## COG0450 Peroxiredoxin + Term 27784 - 27823 5.8 17 7 Op 2 . + CDS 27843 - 29396 382 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 + Term 29403 - 29447 4.0 + Prom 29398 - 29457 1.6 18 7 Op 3 . + CDS 29478 - 31313 1302 ## COG3568 Metal-dependent hydrolase + Term 31351 - 31398 12.6 19 8 Op 1 . - CDS 31434 - 32438 959 ## BT_2809 putative integral membrane protein 20 8 Op 2 . - CDS 32460 - 33476 918 ## BT_2808 putative inosine-uridine preferring nucleoside hydrolase - Prom 33552 - 33611 3.9 21 9 Op 1 . - CDS 34378 - 34689 312 ## BT_1095 hypothetical protein 22 9 Op 2 . - CDS 34698 - 34934 102 ## BT_1096 transposase - Prom 34970 - 35029 1.9 23 10 Tu 1 . - CDS 35116 - 35487 232 ## Npun_R1938 transposase, IS4 family protein - Prom 35609 - 35668 5.5 24 11 Tu 1 . - CDS 35712 - 35924 165 ## BDI_0746 hypothetical protein - Prom 36031 - 36090 1.7 + Prom 35995 - 36054 2.7 25 12 Tu 1 . + CDS 36108 - 36278 70 ## BVU_3507 putative exopolyphosphatase + Term 36336 - 36387 7.6 - Term 36612 - 36651 6.1 26 13 Op 1 . - CDS 36746 - 37282 429 ## COG1803 Methylglyoxal synthase 27 13 Op 2 . - CDS 37337 - 38146 479 ## BT_1099 putative arginase 28 13 Op 3 . 
- CDS 38163 - 40721 1622 ## COG0058 Glucan phosphorylase 29 13 Op 4 . - CDS 40743 - 42989 1175 ## COG0475 Kef-type K+ transport systems, membrane components 30 13 Op 5 . - CDS 43003 - 43986 835 ## COG0205 6-phosphofructokinase 31 13 Op 6 . - CDS 44052 - 44507 281 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 44546 - 44605 3.4 32 14 Op 1 . - CDS 44753 - 45019 276 ## BT_1104 small heat shock protein 33 14 Op 2 . - CDS 45087 - 45836 670 ## COG0588 Phosphoglycerate mutase 1 34 14 Op 3 . - CDS 45850 - 46899 1014 ## COG1830 DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes 35 14 Op 4 . - CDS 46928 - 47410 524 ## COG1528 Ferritin-like protein 36 14 Op 5 . - CDS 47461 - 48108 440 ## COG2095 Multiple antibiotic transporter - Prom 48128 - 48187 3.5 37 15 Tu 1 . - CDS 48211 - 48663 245 ## COG1528 Ferritin-like protein - Prom 48706 - 48765 4.6 - Term 48710 - 48761 -0.5 38 16 Tu 1 . - CDS 48825 - 49034 99 ## BT_1110 hypothetical protein 39 17 Tu 1 . - CDS 49629 - 49826 58 ## - Prom 49989 - 50048 4.2 + Prom 50319 - 50378 1.9 40 18 Op 1 . + CDS 50414 - 50536 161 ## BDI_2138 integrase 41 18 Op 2 . + CDS 50561 - 50986 124 ## BT_1112 transposase 42 18 Op 3 . + CDS 50878 - 51207 174 ## PG1113 integrase 43 18 Op 4 . + CDS 51228 - 51419 184 ## PG1113 integrase 44 19 Tu 1 . - CDS 51505 - 51957 256 ## COG0716 Flavodoxins - Prom 52010 - 52069 8.2 45 20 Op 1 . - CDS 52088 - 53119 768 ## BT_1114 hypothetical protein 46 20 Op 2 . - CDS 53127 - 54299 946 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) - Prom 54323 - 54382 3.8 47 21 Op 1 . - CDS 54426 - 54983 289 ## COG1661 Predicted DNA-binding protein with PD1-like DNA-binding motif 48 21 Op 2 . - CDS 55081 - 55446 240 ## BVU_3495 hypothetical protein 49 21 Op 3 . - CDS 55456 - 57072 754 ## BVU_3494 hypothetical protein 50 21 Op 4 . - CDS 57085 - 59781 1595 ## BT_1119 hypothetical protein + Prom 60206 - 60265 4.5 51 22 Tu 1 . 
+ CDS 60497 - 60736 81 ## BT_1120 hypothetical protein
- Term 60685 - 60728 -0.3
52 23 Op 1 . - CDS 60820 - 61014 61 ## gi|154495398|ref|ZP_02034403.1| hypothetical protein PARMER_04455
53 23 Op 2 . - CDS 60950 - 62557 1585 ## COG1073 Hydrolases of the alpha/beta superfamily
54 23 Op 3 . - CDS 62582 - 63736 350 ## COG2706 3-carboxymuconate cyclase
- Prom 63975 - 64034 7.3
+ Prom 63882 - 63941 4.7
55 24 Tu 1 . + CDS 64016 - 64936 583 ## BT_1123 transcriptional regulator
+ Term 65077 - 65112 2.2
- Term 64738 - 64777 0.8
56 25 Op 1 . - CDS 65005 - 65931 467 ## BT_1124 putative integrase
57 25 Op 2 . - CDS 65939 - 66700 476 ## BT_1125 hypothetical protein
58 25 Op 3 . - CDS 66730 - 67650 495 ## BT_1126 mobilization protein BmgA
59 25 Op 4 . - CDS 67655 - 68026 210 ## BT_1127 mobilization protein BmgB
- Prom 68176 - 68235 1.9
- Term 68057 - 68107 4.0
60 26 Op 1 . - CDS 68256 - 68504 106 ## BDI_3249 hypothetical protein
61 26 Op 2 . - CDS 68395 - 68760 277 ## BDI_3249 hypothetical protein
- Prom 68830 - 68889 3.5
+ Prom 69027 - 69086 1.8
62 27 Op 1 . + CDS 69238 - 69549 238 ## BT_1129 hypothetical protein
63 27 Op 2 . + CDS 69593 - 69916 372 ## BT_1130 hypothetical protein
+ Term 69938 - 69993 17.6
- Term 70405 - 70431 -1.0
64 28 Op 1 . - CDS 70441 - 70833 349 ## BDI_2238 hypothetical protein
65 28 Op 2 . - CDS 70865 - 71425 441 ## BT_1133 hypothetical protein
- Prom 71487 - 71546 4.1
- Term 72248 - 72289 1.4
66 29 Op 1 . - CDS 72371 - 72970 371 ## BT_1135 hypothetical protein
67 29 Op 2 . - CDS 72976 - 73143 114 ## gi|154492100|ref|ZP_02031726.1| hypothetical protein PARMER_01731
- Prom 73182 - 73241 2.5
- Term 73376 - 73414 6.2
68 30 Tu 1 . - CDS 73650 - 74897 873 ## BDI_3265 transposase
- Prom 74994 - 75053 3.1
69 31 Op 1 . - CDS 75083 - 76711 999 ## COG1874 Beta-galactosidase
70 31 Op 2 . - CDS 76745 - 78745 1381 ## BT_2806 hypothetical protein
71 31 Op 3 . - CDS 78758 - 82009 1748 ## BT_2805 hypothetical protein
72 31 Op 4 2/0.000 - CDS 82022 - 82957 704 ## COG0524 Sugar kinases, ribokinase family
73 31 Op 5 . - CDS 82998 - 83921 759 ## COG0524 Sugar kinases, ribokinase family
- Prom 84025 - 84084 8.0
- Term 83927 - 83969 -0.3
74 32 Tu 1 . - CDS 84149 - 85690 986 ## BT_2802 hypothetical protein
- Prom 85865 - 85924 6.0
+ Prom 85824 - 85883 3.4
75 33 Op 1 . + CDS 86007 - 86810 819 ## COG0483 Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family
76 33 Op 2 . + CDS 86842 - 87549 248 ## COG1040 Predicted amidophosphoribosyltransferases
+ Term 87709 - 87764 15.6
- TRNA 87995 - 88071 79.9 # Asn GTT 0 0
- TRNA 88096 - 88172 79.9 # Asn GTT 0 0
+ Prom 88166 - 88225 3.4
77 34 Op 1 . + CDS 88250 - 89239 829 ## COG0524 Sugar kinases, ribokinase family
78 34 Op 2 . + CDS 89284 - 90981 1847 ## COG0793 Periplasmic protease
+ Term 90989 - 91059 22.1
+ Prom 91086 - 91145 6.0
79 35 Op 1 . + CDS 91173 - 92591 1448 ## COG0499 S-adenosylhomocysteine hydrolase
80 35 Op 2 . + CDS 92717 - 95248 2081 ## BT_2796 hypothetical protein
+ Term 95293 - 95333 10.5
+ Prom 95417 - 95476 7.2
81 36 Op 1 4/0.000 + CDS 95526 - 96911 1224 ## COG1538 Outer membrane protein
82 36 Op 2 . + CDS 96936 - 98027 1009 ## COG1566 Multidrug resistance efflux pump
83 36 Op 3 . + CDS 98033 - 99670 1243 ## BT_2793 putative MFS transporter
84 36 Op 4 . + CDS 99710 - 100582 493 ## COG2207 AraC-type DNA-binding domain-containing proteins
- Term 100358 - 100389 -0.6
85 37 Tu 1 . - CDS 100589 - 101242 458 ## COG0035 Uracil phosphoribosyltransferase
86 38 Tu 1 . + CDS 101524 - 103131 1633 ## COG1866 Phosphoenolpyruvate carboxykinase (ATP)
+ Term 103167 - 103200 5.2
+ Prom 103170 - 103229 5.1
87 39 Op 1 7/0.000 + CDS 103263 - 103820 475 ## COG2059 Chromate transport protein ChrA
88 39 Op 2 . + CDS 103817 - 104341 486 ## COG2059 Chromate transport protein ChrA
+ Term 104421 - 104457 3.3
- Term 104405 - 104449 7.6
89 40 Tu 1 . - CDS 104471 - 106270 1897 ## COG1217 Predicted membrane GTPase involved in stress response
- Prom 106408 - 106467 5.3
+ Prom 106223 - 106282 4.1
90 41 Tu 1 . + CDS 106427 - 106696 446 ## PROTEIN SUPPORTED gi|29348195|ref|NP_811698.1| 30S ribosomal protein S15
+ Term 106723 - 106756 0.5
- Term 106711 - 106744 2.1
91 42 Tu 1 . - CDS 106771 - 109395 1029 ## BT_2785 hypothetical protein
- Prom 109520 - 109579 10.6
+ Prom 109567 - 109626 6.7
92 43 Op 1 . + CDS 109769 - 110344 690 ## COG1396 Predicted transcriptional regulators
93 43 Op 2 . + CDS 110378 - 112027 1523 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II
+ Term 112044 - 112108 11.5
94 44 Tu 1 . - CDS 112116 - 113198 923 ## COG0836 Mannose-1-phosphate guanylyltransferase
- Prom 113395 - 113454 80.3
+ TRNA 113363 - 113449 50.8 # Leu CAA 0 0
- Term 113443 - 113488 12.2
95 45 Op 1 . - CDS 113713 - 114384 286 ## gi|253569508|ref|ZP_04846918.1| predicted protein
96 45 Op 2 . - CDS 114454 - 114759 172 ## gi|253569509|ref|ZP_04846919.1| predicted protein
- Prom 114811 - 114870 3.5
97 46 Op 1 . - CDS 114881 - 115147 280 ## gi|253569510|ref|ZP_04846920.1| predicted protein
98 46 Op 2 . - CDS 115200 - 115535 253 ## gi|253569511|ref|ZP_04846921.1| conserved hypothetical protein
99 46 Op 3 . - CDS 115617 - 116105 405 ## gi|253569512|ref|ZP_04846922.1| predicted protein
- Prom 116129 - 116188 6.9
- Term 116238 - 116283 3.6
100 47 Tu 1 . - CDS 116297 - 117700 1029 ## BVU_1439 mobilization protein
Predicted protein(s)
>gi|226332241|gb|ACIC01000079.1| GENE 1 1 - 3057 1930 1018 aa, chain + ## HITS:1 COG:all0824_2 KEGG:ns NR:ns ## COG: all0824_2 COG0642 # Protein_GI_number: 17228319 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp.
PCC 7120 # 514 734 2 239 273 113 30.0 3e-24 TNIQQKQDDVHSFPANTIYRLYLDPANNMWAGSIRRGLIGIKNVYACSYQNVPFGNLYGL SNQTINSFFQDSDGIVWVGTDGGGINRFDPVSGTFKHYPATKYEKVVSIVEYTPDELLFF SFNKGLFIFHKQTGQIRPFVLIDKEMNDQTCINGFSVNIQRITKTKILFSAQHIFIYDMV TCKFDIVATMGEEYERHSPLIIATKGAKTYLSDLKNICEYDSSEGTFRTIYQGQYTINDA SMDQDGIFWLASTEGLVRYDPRTGKSELINTSLFQDVASVVADNQYRIWIGTRRHLFVYS SRTHNFTTLEEVDGVLPNEFIFHATLLAKNGDVFVGGTTGMTVINSAVHFDTDEDYTVEL LDVLLNGLPVSLSEEPQEAAETIQIPWNFSSLQLKVLLNESDVFRKNFFRFNIEGLDQEL ASSNSNSLVINYLPIGEYTITASYYTRDGGWSMKQPILHVVVTPPWWKTGWFYVGLYILL GAVAYSIVYYFYRKKKAKQKREIMRLKNKMYEEKINFLTNISHELRTPLTLICAPLKRII NKESDKQDVEKLLVPIYKQAYQMKSIIDMVLDVRKLEEGKDMLHILPHPLNEWVCSVGDK FINEFHVKGIRLEYELDEQIKEVPFDQNKCEFVLSNFLMNALKFSESGTTTTLITTLSQE KDWVRISVRDEGMGLNMVDTDSLFSSFYQGGHEKGGSGIGLSYAKSLITHHKGRVGASNA AGKGAIFYFELPLFTDTCGQLEPASVEAATGVEVNEPDRIDYTFLKKYSVMVVEDTPELR SYLKETLSNYFARVYVAKDGKEGLEQIKDRLPDIIISDVMMPRMNGFELCREVKTNLDIS HIPFVLLTAYHNSQNMYTGYKTGADAFLPKPFEIDGLLALIHNQLRQREQIRARYKDDKL LTHKEVSFSNADETFLLKLNTLIADNMSNPDLDVAFIATNMCISRSLLFNKVKAITGMGI VDYVNKQRIEQSIVLLTTTTMNITEISEVVGFSSLRYFSKVFKSIKGEIPSAYRKQDK >gi|226332241|gb|ACIC01000079.1| GENE 2 3597 - 4640 803 347 aa, chain - ## HITS:1 COG:no KEGG:BT_2825 NR:ns ## KEGG: BT_2825 # Name: not_defined # Def: chitinase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 347 1 347 347 672 98.0 0 MKHLLYALLTAVSILFTSCGPSSNTNTPQEPNVPQPQPQPQPEVTQKVVIGYLALDDWEF ENLFPTIEWKYLTHINASFARVKADGTLNINPVRKRIESVRETAHKHNVKILISLAKNSP GEFTTAINDPKARKELIQQIIAFTKEYKLDGFDIDYEEYDNWDKNFPSLLVFARGLYLAK EKNMLMTCAVNSRWLNYGTEWEQYFDYINLMSYDRGAFTDKPVQHASYDDFVKDLKYWNE QCRASKSKIVGGLPFYGYSWEESLQGAVDDVRGIRYSGILKHLGNEAADKDNIGKTYYNG RPTIANKCKFIKENDYAGVMIWQLFQDAHNDNYDLKLINVVGREMME >gi|226332241|gb|ACIC01000079.1| GENE 3 4766 - 5887 1096 373 aa, chain - ## HITS:1 COG:CC0380 KEGG:ns NR:ns ## COG: CC0380 COG2273 # Protein_GI_number: 16124635 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucanase/Beta-glucan synthetase # Organism: Caulobacter vibrioides # 135 371 28 297 301 
125 34.0 9e-29 MMKRKILLILNMAIIGFSTAACSDDETNYITPEKKYPLTAFAVAVNNVSANTTSYYHGKI DQTTHRVEIGTIEDANTITGVDYTLMSDGATISPDPATFVHNWKKEQTVTVTTEDNQTTT YTIVLTKFDDTMKDVLFMDEFDVDGNPDPTKWVLCQKAGSDWNDEMSESYDQAYVKDGRL ILKAEKIGDEYKAGGIETQGKFDFTFGRVEVKAKITSYPNGAFPAIWMMPKKYIYDGWPN CGEIDIMEHVKQESAIHHTIHTHYTYDLNIKDPSNTAQVTCNYQDWNIYALEWSEDKLTF FVNGQETFSYSNLKLENEAEMKQWPFTKDSSFYLILNMGLGGDRPGSWAGPIDDANLPAI MEIDWVKITKLDK >gi|226332241|gb|ACIC01000079.1| GENE 4 5981 - 7456 1150 491 aa, chain - ## HITS:1 COG:no KEGG:BT_2823 NR:ns ## KEGG: BT_2823 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 491 1 491 491 944 100.0 0 MKSDKLHKIAFLGLFTLLAAVPVLQSCDDDEKQKAVDLRYRVEDSYLLPADGTTDELSVT FQVKSTDPWEIFGENKGDWYAISPATGDNPEKTYDVTIKCEENTSLDDRTEVINIKSDYW TGKKFTLTQKGTAYLEYEGVDMIEKNGNVPEVFSVLSNQKWTAKVTDGEEWLSISQGASG MENGTVELKATPNTGAMRYGVVTLYDRNGDKQQEVNITQDGVQIEPAQPENGKWYEVEAV GGKLTIHVVANSRWGVTKEIPSDETWYEFEQTEFNGSADLVINVAEYSTSSSVRNGTIIL SSLSDEEGIEPVTTTIRIKQASSESSRTTVNEENREVSGDYYFGSQLMPGRYNIYIGPFN GNMNMFFMINATPWTEFRWHVAGGKSDLSTTPWSQKVFSGAAGTVKTLDTNADNKLGWNI LQEENETGTWIRVEWYLNDEFIISTISDGTQDWKVPGNILVEKGGQIMVRCNDGTLPLRK WEYIEPLKWGE >gi|226332241|gb|ACIC01000079.1| GENE 5 7484 - 8962 1167 492 aa, chain - ## HITS:1 COG:no KEGG:BT_2822 NR:ns ## KEGG: BT_2822 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 492 1 492 492 961 99.0 0 MKKLYSNTLMQLILALCTFSFVACQEFDIDSQPEGPLNIQIDALESYTALATSPSNVVFN ISSNTPWTIESDQQWCKPTPSMSAASSLVSEIVVTMENNTGKKARTAKLTIKAEGVEGTK VVTIKQASKEDLVVIPYDQIVPTVGGTISFNIVSNKPFQIIPSTQFVEQISPASGDGNED GNKIPITITIPENTGGVRTAEITVKTEFQEKSFTITQDGIIIEPKNEEEKTNQLNGMGGE KNIEINSSVEWKVEVPAEFKEWLSAEADGNNLTLKAGYNNLFITRVGHVLLYPKSNVPGF EGVPVEVRQPRNMWADGAEDIDPETGYATIRSNAQNRYVPNFSYGKGKIVWNFESVNMPA GTDGFLYLNGDTWHWNGAANGAGWIQIRIYPQDSTTKSSFKLHYDWTAEEFTLEESMNDI KTIELDTKVDPEDATKVNVTFSINGKVYYSGKKHNAFDLDGHKGVIFYSGFSGPFSPECT IVFKSLDYEVYE >gi|226332241|gb|ACIC01000079.1| GENE 6 8994 - 10724 1431 576 aa, chain - ## HITS:1 
COG:no KEGG:BT_2821 NR:ns ## KEGG: BT_2821 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 576 1 573 573 1142 99.0 0 MNIMRTKYILLTILSSLVFASCNYLDFDETNNLKTKEDMYKYFGTSKSMLTHVYSYMPQG YQLFATSGGIFTEDRYSMRDCASDDGEFGAYADNIQNTNNGNWSPIKTYDDSWTLYRGIR AANSFIAEIAQVDFTRYEHDGQYTNWMKQLKYFPYEARVLRAHYFFELARRYGDIAMPLE VLTEEEANTIGKTPFSEVISFIVSECDDAANNLPDSYVNEPGAEIGRVTKGFAMAVKSKA LLYAASKLHNPSMDTELWKKSAKAAIDIINTGLYSLDPQESANNLDSKEVVLMRINGDDS DFEMFNLPLRMTAGTRTSSLIPYSNYPSQNLVDAFETVNGYKVTLENTGWVCEDPAFDPQ SPYENRDKRFYRAILANGMSFKDYTIETFKGGADDGIVSQGGSPTGYFLRKYIQEATSFE PGKEAVSKHHWVIYRYAETLLTYAESMVNAFNDVNYTDETYKYSALWAINEVRKNADMPL IPSMGKEDFLERLYNEWRVEFAFEDHRFWDVRRWKIADTTQRELYGVKIEKQADGTFNFY KNLYETRNWRDCMYLYPIPQSELYKNTNLNPQNTGW >gi|226332241|gb|ACIC01000079.1| GENE 7 10738 - 13518 2513 926 aa, chain - ## HITS:1 COG:no KEGG:BT_2820 NR:ns ## KEGG: BT_2820 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 926 1 926 926 1742 98.0 0 MNKINYILITLLLGGSVSLNAQTSKAASEMQLADTLSIGYQINVSSRTSSYSINGVNASA FEKSPYIDISKALYGKVAGLNVYQGTGSSADNVSTLSIHGNAPLVLIDGFPRDISDITSM EIESCYVLKDAAAAALYGMRGANGVVLITTKRGISNGLKVNVDYNFGVNTQFRSPDFADA YTYANSLNTALSGDGLPARYNAQELDAFRTGIYPYDYPNVDWWNETLNDTGFTHNLKMSF SGGSDKFRFYTVVDYYRDRSMLKKNTEDTRFDTTPTDTRLTVRTNLDVKVTESTLLKAGI VGRLKEFRGTRYGRSAIFNRIYGIPSAAFPIRYENGIYGGSSVYGTGNPVALLKDYGHIR NVYGTLLADLSIRQDLGALTKGLAAEASVSFDNIGGMNETTSKEYRYMNSNASITSDGTL VTTPVIYGTDSETLGHNQPFERLMLRSDFQAKVDYNRTFGKHQVGGALIYDMQSVVKNGR NNSQKNQSVLVNATYTYDNRYSLNAVFNRSGSAYLPDGDKYSNYPAVSAAWIVSNEAFME KVTPINLFKIRASYGLSGWDGNLSHELWRQSYGSGGAGYNFGVNAGGQSGGSEGDLPVIG LVAEKSQKATFGFDLAAFDNRLNATVEGFYEKRSDILVSGANSTSGIIGITVGQVNEGIY KYKGVDASLSWNDKIGDFHYGIGASMSYLNTEVVNVNQAYQEYDYLYTKGNRIGQMYGLE AIGFFNSQQEINNSPLQTFSDVAPGDVKYKDQNGDNRIDEKDVVKMFGSSVPRFYFGFNL NFSYKRLELSADFQGMTGVTVSLLNSPLYSPLVSNGNISNTFLNEEISWTPENKTNATMP RLTTQENLNNYRASSLWYRDGSFLKLRNLLVAYTFPKSQTRFADLKVFVQGTNLFSLDNI HFADPEQLGIAYPSTRSYWAGIKLNF >gi|226332241|gb|ACIC01000079.1| 
GENE 8 13548 - 15305 1689 585 aa, chain - ## HITS:1 COG:no KEGG:BT_2819 NR:ns ## KEGG: BT_2819 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 585 1 585 585 1208 99.0 0 MNRIIKGLGIALGCVLIASSCSEISFGDKFLGDQPESSGATTEEMFSSKINAEKVLTKAY TGLPYGLPTGSDYKLGVNILESITDLCYSFRDNISDGPVKLYYNGALSANNVPKNSAYRY GSKSDWTTIRYAWLYIENVEKVPDMTASEKSERIAEAKVLISLCYFEMLRYVGGVTWLDH YVDVNETMNFPRITFAETVEKIVGLLNEAINSNLAWKAEKADDGRMTLAGAMALKFKVLQ WAASPTFNSNTKWHPEADEYTCYGNYSDQRWKDAAEAGAAFFTELQKRHEYELIQPTEET HRARRLAYRQAYYNRGGTEILISTRQGYSSDVHSSFIGQRYYSGPTLNYVDMFPWEDGTD FPANFDWKSPSKQPFFTKGEMVPTRDPRLYENVACPGDTYCNGTTAPVYINHADYKDGSG FLIMKYILQENNDREGPVQWSHTRLAEIMLGYAEVLNEVNGRPTDEAYKMVNDVRKRVGL PELSKTMNHDQFLEAVLRERALELGFEEVRWFDLVRRDRQSDFKKKLYGLRSRGNDLNNP TEFTFEKIELGDHYWATNWDTKWYLAPIPQEEINKQYGMTQNPGW >gi|226332241|gb|ACIC01000079.1| GENE 9 15317 - 18490 3152 1057 aa, chain - ## HITS:1 COG:no KEGG:BT_2818 NR:ns ## KEGG: BT_2818 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1057 1 1057 1057 2039 99.0 0 MRNALNLRFCFLVTFLLFIPLMYALPPEGITIKGRVTDETLKEGVPGANVTVKGTSIGTI TDFDGNFTIVAPSKKSVLNISFIGYVTQEIVVGDRTIINVSLKEDAQNLGEVEVVAIAYG NQDKRMLTSAVSSIDNKELIKSPATSITNLLAGALPGVSSVQTTGQPGKDAAAIYVRGVG SLNDSQSKPLVLVDGIEREFSQIDPNEIESISVLKDASSTAVFGVRGANGVVLITTRRGK AGKTQISASTKLGLQQPISLVEQTGSYEFARFWNMKQEMDHVTNPKLYFTPEQVEAYRTG SDPIMYPSMDWKKYIFNNIYLQSQNNLNISGGSDNVRYFISVGYTYQNGLLKELPGQMYD NNYRYNRYNYRANIDANLTKTTSMKLGVGGVLGKIQEPRSVVSGTGEDQNPWVIAQIWSH PFAGPGFINGVRTLIPKDMVPLGEVMRDGMFVFYGQGYNQTYDTTLNLDLDITQKLDFIT PGLSVSVKGAYDNKFKLEKVRTGGAIEAQTVYYKSYFDTNGTMSQTDPDYDKSLIYVPSG SDTPLTYSESSGRGQNWYLEGRINYDRTFGDHRISGLFLYNQSRNYYPDSYTYLPRNYVG LVGRATYAYQSKYLLDVNVGYNGSENFAPGSTRFGVFPSFSAGWIATAEKFMENQKIVDY LKLRASWGRVGNDKGVDSRFMYMPAVWGSSGSYSFGVDNPISQSAAAISRMGNPAVTWET ADKQNYGIDLKFFNSRLSATFDYFIEHRTGILISPESTPSIIATALPNLNIGKVDNHGYE VSLGWRDKIGKDFTYNVDANVSFARNKILYMDEVPKQFSYMNQTGGSTGRQTGVYNYIRL 
YQYSDFITGADGELILKPELPQPYQKVYPGDAMYADLNGDHIVDGNDKSVAGYATRPEYV FGMNMGFDWKGFNFTMQWVGSKNVNRMYDIEYRIPYTNAGKRGLLTYFYDGCWTPENQLG AVYPRPSEESESWNSEPSTLWLVDASYLRLKSLSFGYTITGKKFLKKLGLSSLGLNFSGY NLLTFSPLKYLDPESDPNRFGDYPLIKVYSFGLNLNF >gi|226332241|gb|ACIC01000079.1| GENE 10 18878 - 20599 1786 573 aa, chain - ## HITS:1 COG:no KEGG:BT_2817 NR:ns ## KEGG: BT_2817 # Name: not_defined # Def: putative TonB-dependent receptor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 573 1 573 573 1127 100.0 0 MKRTLYIAGGFTLAFLMPFNLQAQNTQPKDTTMNRTVVVEQEYNPDILDASKVNVLPKVE EPTVSKKEVEYATTFFPATSIPADLMRPYVGTEVQPGSKPGYIRAGYGNYGNLDLLANYL FRLSDRDKLNVRFQMDGMDGKLTMPETDTKWNAYYYRTRANIDYIHQFNKVDFNIAANFG LSNFNLSPVQPGKQKFTSGDFHLGVKSTDENYPIQFEAETNLMMYNRQNNNTFFFNDKVR ETQVHTKGLISGAISDEQSINIGLDMRNLIYNKDLKLADDLQVYENRTALALNPHYKLSS DSWNLRAGANVDISFGSGKSFRVSPDVIAQYIFSDSYVLYAQATGGKLVNDFRRLETICP YALPAGPILDTYEQLNAAIGFKASPYPGLWINLYGGYQDLKDDFFEVQEETFVNYDPSLA GDAQEGNSSSPVTAHQYLNLEFSNANNFYAGMKISYEYKDLIALFAAGTYRNWDAKEKYS LYMKPACEINFSADFRPITGLNINLGYDYIGRTKVEGIKAAAVSDLHAGASYNVFKGVSV YARINNILNKKYQYYLGYPTEGLNFLGGLSFSF >gi|226332241|gb|ACIC01000079.1| GENE 11 20612 - 23623 3092 1003 aa, chain - ## HITS:1 COG:no KEGG:BT_2816 NR:ns ## KEGG: BT_2816 # Name: not_defined # Def: TPR domain-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1003 1 1003 1003 1774 99.0 0 MKKRISRIICTLLCCAPIAISAQTSEKITSPVNLYKEGKELFQEKNYAAAIPALKAFVKQ KPTASLLQDAEYMLVSSAYELKDKNRIELLRKYLDCYPDTPYANRIYSLLASCYFYEGKY DEALALFNSTRLDLLGNEERDDRTYQLATCYMKTDNLKEAAIWFETLRASSPKYAKDCSY YLAYIRYTQKRYDEALKDFLPLQDDPKYKELVPYYIAEIYAIKKNYDKAQIVAQNYLSAY PNNEHAAEMYRILGDAYYHFGQYHQAVEAFTGYLDRDHSAPRRDALYMLGLSYYQTKVYS KAAEMLGQVTTANDALTQNAYLHMGLSYLQLAEKNKARMAFEQAAASNANLQIKEQAAYN YALCLHETSYSAFGESVTAFEKFLNEFPTSPYAEKVSNYLVEVYMNTRSYEAALKSIERI AKPSAQIMEAKQKILFQLGTQSFANANFEQALQYLNQSIAIGQYNRQTKADAYYWCGESY YRLNRMMEAARDFNAYLQLTTQPNNEMYALANYNLGYIAFHRKDYTQASNYFQKYIQLEK GENRTALADAYNRIGDCHLNVRNFEEAKHYYSQAEQMNTPSGDYSFYQLALVSGLQKDYT 
GKITLLNRLVGKYPSSPYAVNAIYEKGRSYVLMDNNGQAITSFKELLEKYPESPVSRKAA AEIGLLYYQNGNFDQAINAYKQVIEKYPGSEEARLAMRDMKSIYVDLNRIDEFAALANAM PGHIRFDASEQDSLTYAAAEKIYARGRTEEAKSSLNKYLQTFPEGAFSLNAHYYLCQIGN EQKNYDMVLLHSGKLLEYPNNPFAEEALIMRAEVQFNQQQMAEALASYKMLKEKATNVER RQLAETGILRCAFLLRDDVETIHAATEVLAEAKLSPELKNEALYYRAKAYKNQKADKKAL DDFRELAKDTRNSYGAEAKYQVAQALYDAKEYAAAEKELLNYIEQSTPHAYWLARSFVLL SDVYVAMGKDLDARQYLLSLQQNYQGNDDIESMIEDRLKKLNK >gi|226332241|gb|ACIC01000079.1| GENE 12 23712 - 23807 71 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKFNIRLNYSHQEDEKESQKNKYFGFKMDER >gi|226332241|gb|ACIC01000079.1| GENE 13 23915 - 24442 418 175 aa, chain - ## HITS:1 COG:MA1439 KEGG:ns NR:ns ## COG: MA1439 COG2816 # Protein_GI_number: 20090298 # Func_class: L Replication, recombination and repair # Function: NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding # Organism: Methanosarcina acetivorans str.C2A # 11 104 126 217 285 68 39.0 5e-12 MMIHPLAQFLYCPECGSPHFEVNNEKSKKCTDCGFVYYFNPSSATVALILNEKKELLVCR RAKEPAKGTLDLPGGFIDMNETGEEGVSREVWEETGLKVEKATYQFSLPNIYIYSGFPVH TLDMFFLCTVKDMSHFSAMDDVADSFFLPLSDIRPEDFGLDSIRRGLVQFLAQHS >gi|226332241|gb|ACIC01000079.1| GENE 14 24588 - 25517 713 309 aa, chain + ## HITS:1 COG:BMEII0862 KEGG:ns NR:ns ## COG: BMEII0862 COG0329 # Protein_GI_number: 17989207 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Brucella melitensis # 11 304 23 320 322 172 33.0 1e-42 MNNKMNIPLSGIIPPLVTPLLDDDVLDVEGLQRLIEHLIAGGVHALFVLGTTGESQSLSY KLRVEMIKNTCRIAKGRLPVLVCISDTSIVESVNLACLAADHGADAVVSAPPYYFATGQP ELIEFYEHLLPQLPLPLFLYNMPTHTKVNFAPATIQRIAENPGVIGFKDSSANTVYFQSV MYAMKDNPDFSMLVGPEEIMAESVLLGAHGGVNGGANMFPELYVSLYNAAKNADMEEVRR LQEKVMQISATIYTVGQHGSSYLKGLKCALSLLGICSDYVAAPFHKFEQRERGKIWKALQ NLGVDVVLR >gi|226332241|gb|ACIC01000079.1| GENE 15 25542 - 27011 1326 489 aa, chain + ## HITS:1 COG:STM1128 KEGG:ns NR:ns ## COG: STM1128 COG0591 # Protein_GI_number: 16764485 # 
Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Salmonella typhimurium LT2 # 3 440 8 448 498 209 33.0 1e-53 MSILDIIIFLFFTGGVVVFGCSFFNKKRSSEEFTSAGRSLPGWVVGMSIFATYVSSISYL AYPGKAYMSDWNAFVFSLSIPIASYFAAKYFVPFYRSIGSVSAYSFLEERFGPWARVYAS SCYLLTQVARMGSILYLLALPMNALLGWDIKMIIAVTSVAIIAYSMLGGLKAVIWTEAIQ GFILIGGAVVCLCVLMFKMPEGPAQVFQIAVTDQKFSLGSFGSSLTESTFWVCLIYGIFI NLQNYGIDQNYVQRYLTAKSDKQAKFSALFGGYLFIPVSAVFFMIGTALYAYYKTFPELL PAGVEGDAVFPYFIVHALPTGLTGLLIASIFAAGMSTVATSITSSATIILTDYYARYINK KPTEKQSVRALYVSNVLIGIIGIFVALAFLNVESALDAWWALSSIFSGGMLGLFLLGYIS KKARNVDAVLGVICGIIVVVWMSLSPIFFKDGFMSEYASPFHANLTIVFGTIVIFLAGFL SMRLLKTKK >gi|226332241|gb|ACIC01000079.1| GENE 16 27194 - 27760 728 188 aa, chain + ## HITS:1 COG:STM0608 KEGG:ns NR:ns ## COG: STM0608 COG0450 # Protein_GI_number: 16763985 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Salmonella typhimurium LT2 # 4 188 3 187 187 251 61.0 4e-67 MEPILNSQLPEFSVQAFQNGAFKTVTNNDLKGKWAILFFYPADFTFVCPTELVDMADKYA QFQEMGVEIYSVSTDSHFVHKAWHDASESIRKIKYPMLADPTGALSRALGVYIEEEGMAY RGTFVVNPEGKIKVVELNDNNIGRDASELLRKVEAAQFVATHDGEVCPAKWKKGESTLKP SIDLVGKI >gi|226332241|gb|ACIC01000079.1| GENE 17 27843 - 29396 382 517 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 212 505 2 297 306 151 32 1e-35 MLEPALKEQLKGIFAGLEADFTFDISVSASHESRAELLELLSDVAECSDHITCVVDEGSG LKFTIGKNGKPTGITFRAIPNGHEFTSLLLAILNLDGKGKNFPDEAVCNRVKALKGPIHL VTYVSLTCTNCPDVVQALNAMTTLNPFITHEMVDGALYQDEVDALKIQGVPSVFADGKLL HVGRGEFGELLAKLEAQYGIDETKAETEVKEYDVIVAGGGPAGVSAAIYSARKGLRVAIV AERVGGQVKETVGIENLISVPETTGSELADNLKTHLLRYPVDLLEQRKIEKVELAGKDKL VTTSVGEKFIAPALIIATGASWRKLNVPGENDYIGRGVAFCPHCDGPFYKGKHVAVVGGG NSGIEAAIDLAGICSKVTVFEFMDELKADNVLQERLKSLPNVEVFVSSQTTEVIGNGDKL TALRIKDRKTEEERLVELDGVFVQIGLSANSSVFRDIVETNRPGEIVIDAHCRTNVTGIY AAGDVSTVPYKQIIISMGEGAKAALSAFDDRVRGLIV 
>gi|226332241|gb|ACIC01000079.1| GENE 18 29478 - 31313 1302 611 aa, chain + ## HITS:1 COG:SMb20092 KEGG:ns NR:ns ## COG: SMb20092 COG3568 # Protein_GI_number: 16263840 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Sinorhizobium meliloti # 15 247 1 245 252 87 30.0 6e-17 MRNMKKIFLLISAILLIVPVQAQHTLRLMTYNIKNATGMDGVCDFQRIANVINNASPDVV AVQEVDSVTNRSNQKYVLGEIAERTQMYACFAPAIDYDGGKYGIGLLSKKAPVHLQTIAL PGREEARALILAEFEDYIYCCTHLSLTEEDRMKSLEILKTFAASYKKPLFLAGDMNAEPE SDFIKELQKEFRILSNPKQHTFPAPVPKETIDYVAAFKQNDKGFAVVSSEVVNEPVASDH RPIVVELRTAEKADKIFRTKPYLQNPVGNGMTVMWETTVPAYCWVEYGTDTTQLRRARTI VDGQVVCNNKLHKIRLDDLQPGQKYYYRVCSQEMLLYQAYKKVFGNTARSAFSEFTLPVT GTDSFTAVVFNDLHQHTHTFRALCRQIQDIDYDFVVFNGDCVDDPASHDQATAFISELTE GVRGDCIPTFFMRGNHEIRNAYSIGLRDHFDYVGDKTYGSFNWGDTRIVMLDCGEDKTDD HWVYYDLNDFTQLRNEQVGFLKKELAAKEFKKAKKRILLHHIPLYGNDGKNLCAELWTKL LEKAPFDICLNAHTHKYAYHPKGELGNHFPVIIGGGYKMEGATVMILEKRKEELRVRVLD AKGETLLDITV >gi|226332241|gb|ACIC01000079.1| GENE 19 31434 - 32438 959 334 aa, chain - ## HITS:1 COG:no KEGG:BT_2809 NR:ns ## KEGG: BT_2809 # Name: not_defined # Def: putative integral membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 334 1 334 334 543 98.0 1e-153 MYIVNSYTLAIIFCFITMICWGSWGNTQKLASKNWRYELYYWDYTIGILLFALLLVFTLG SFGDSGRGFLEDIQQVEAAYIASALIGGVIFNASNILLSASVSIAGMAVAFPLGVGLALV LGVFINYFSTPKGDPFWLFTGVVLIVIAIICNGIAAGKNQKAGTNNSKKGIILAAIAGIL MSFFYRFVAAAMDLNNFESPTVGMATPYTAFFIFAIGIFLSNFLFNTLVMKRPFVGLPVT YKEYFIGKASTHMVGILGGCIWGLGTALSYIAAGKAGAAISYALGQGAPMIAALWGVFIW KEFTGSSKATNRLLGVMFILFILGLTFIVISGGS >gi|226332241|gb|ACIC01000079.1| GENE 20 32460 - 33476 918 338 aa, chain - ## HITS:1 COG:no KEGG:BT_2808 NR:ns ## KEGG: BT_2808 # Name: not_defined # Def: putative inosine-uridine preferring nucleoside hydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 338 1 338 338 682 98.0 0 MFKYIAIILITVSLFCSCAQQKETTSAIPVILDTDVGNDIDDVLAMQMLLNYEKKGKIDL LGITISKCNPYSLEYIDAYCRFNDKYDIPLGYAYNGMNTDDGHYLRQTLDTIIDNNKILH 
PKRSLKDHIPEGYKLLRKLLAEQEDHSVVFIALGPETNLARLLISEADEYSELDGKALVA QKVKLLSVMGGLYGNEFDFPEWNIVNDLNAAQITFKEWPTPIIASGWELGNKIRYPHQSI LNDFADSYKHPLCVSYKIYQEMPYDRETWDLTSVIQAIEPDNNYFNLSEKGTITIDSIGQ SLFSPSESGKHQYLTIQGEENIRATRNALIRQVTGKEN >gi|226332241|gb|ACIC01000079.1| GENE 21 34378 - 34689 312 103 aa, chain - ## HITS:1 COG:no KEGG:BT_1095 NR:ns ## KEGG: BT_1095 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 103 1 103 103 205 100.0 4e-52 MFNELKTKDDAWMQSIHERIDRLSAMIDGIFGDGAVPPKEDVYLCDSEVACMLKVSRRTL GEYRSNGTLPYYVLGGKVLYKRSEIEQVLEREYRSVGKARGSG >gi|226332241|gb|ACIC01000079.1| GENE 22 34698 - 34934 102 78 aa, chain - ## HITS:1 COG:no KEGG:BT_1096 NR:ns ## KEGG: BT_1096 # Name: not_defined # Def: transposase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 78 77 154 154 146 100.0 2e-34 MPKKKPKGKELTCVEKQENKRISGVRIKVEHAIGGMKKCRIVKERFRCHKFGFEDMVILI ACGLHNFRISHKMSHITN >gi|226332241|gb|ACIC01000079.1| GENE 23 35116 - 35487 232 123 aa, chain - ## HITS:1 COG:no KEGG:Npun_R1938 NR:ns ## KEGG: Npun_R1938 # Name: not_defined # Def: transposase, IS4 family protein # Organism: N.punctiforme # Pathway: not_defined # 26 121 61 159 297 69 41.0 3e-11 MNITAILPWKVKCANVYLTTVLSLIQDKMFFILVYLKTNPLQELHAIQFEMTQPQANRWI HLLSEILRRTLKTLGELPDRNSKRLIHILQGCEEVLLDGTERPIQRPLDEDWQSACYSGK KNS >gi|226332241|gb|ACIC01000079.1| GENE 24 35712 - 35924 165 70 aa, chain - ## HITS:1 COG:no KEGG:BDI_0746 NR:ns ## KEGG: BDI_0746 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 2 70 35 103 103 91 65.0 7e-18 MKRTAGWLDHQDVCLKLKISKRTLQTLRDNGTLAYTKIGNRTYYLPEDVERIVTKVEDRR KDARYRGHTI >gi|226332241|gb|ACIC01000079.1| GENE 25 36108 - 36278 70 56 aa, chain + ## HITS:1 COG:no KEGG:BVU_3507 NR:ns ## KEGG: BVU_3507 # Name: not_defined # Def: putative exopolyphosphatase # Organism: B.vulgatus # Pathway: not_defined # 1 56 247 302 302 104 96.0 8e-22 MEEFNLKTDRADVIVPAAEIFLIIAGIVKAEYIHVPVIGLADGIIDNLYAIYLKTE 
>gi|226332241|gb|ACIC01000079.1| GENE 26 36746 - 37282 429 178 aa, chain - ## HITS:1 COG:TM1185 KEGG:ns NR:ns ## COG: TM1185 COG1803 # Protein_GI_number: 15643941 # Func_class: G Carbohydrate transport and metabolism # Function: Methylglyoxal synthase # Organism: Thermotoga maritima # 6 163 16 165 166 140 49.0 2e-33 METIVRRIGLVAHDAMKKDMIEWVLWNSERLIGHKFYCTGTTGTLIKKALEEKHPEIKWD ITILKSGPLGGDQQIGSRIVEGEIDYLFFFTDPMTLQPHDTDVKALTRLAGVENIVFCCN RSTADHIITSPLFTDPTYERIHPDYTNYTQRFENKGIISEAVEQVKKRRNKSENNISK >gi|226332241|gb|ACIC01000079.1| GENE 27 37337 - 38146 479 269 aa, chain - ## HITS:1 COG:no KEGG:BT_1099 NR:ns ## KEGG: BT_1099 # Name: not_defined # Def: putative arginase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 269 1 269 269 574 100.0 1e-162 MQKQRMSGQNQKTDKRIAWPIIIMNFTGVYDYEAFARNNKFIWLDCRHLYGTDGYCDREG ALALKGMIADYPAEGVHFIDSGNYHYLTKFWTDKLETPFSLIVFDHHPDMQPPLFDNILS CGSWVKDILDHNNNCKKVIIVGSSDKLIQAVPKGYERQVRFYSETTLMHEEGWQNFSSGH INGPVYISIDKDVLNPASAATNWDQGSLSLWELEKLLAVILQKEQVVGIDICGECSTTLN LFEEKRETVMDSQANKELLRFIRSSSGLQ >gi|226332241|gb|ACIC01000079.1| GENE 28 38163 - 40721 1622 852 aa, chain - ## HITS:1 COG:PH1512 KEGG:ns NR:ns ## COG: PH1512 COG0058 # Protein_GI_number: 14591294 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Pyrococcus horikoshii # 22 763 18 751 837 595 43.0 1e-170 MEYSSYNVNTPQWREITVGSHLPAELRKLAEIAHNLWWTWNDDAKKLYCDLDSELWKEVE QNPVLFLERINYEKLVALAHDENFVYKMDAVYSAFKKYVDVEPDHQRPSIAYFSMEYGLD EVLKIYSGGLGMLAGDYLKEASDSNVDLCAIGLLYRYGYFDQSLSMDGQQTVNYKAQNFG QLPIEKVMQPDGKQLVIHVPYADSFVVHANVWKASVGRIPLYLLDTDNELNSEFDRPITH HLYGGDWENRLKQEILLGIGGMITLRALGITKDVYHCNEGHAALINIQRLCDYINGGLNF GQAMELVRASSLYTVHTPVPAGHDYFDEGLFNKYMKGYPGKLGITWDNLMDLGRHNPGDK EERFCMSVFACKTCQEVNGVSKLHKSVSQQMFAPIWKGYFPEENHVGYVTNGVHLPTWCA AEWKKLFKDNFDENFFCDQSNQKIWEAVYGIPDEEIWNTRLKQKAKLLDYIKSKCSKDWL RTQIDPALSVSIFERFNPDALLIGFGRRFATYKRAHLLFTDIDRLARIVNNPKYPVQFIF AGKAHPNDGAGQGLIKQIVEISRRPEFLGKIIFLENYDMDLARHLISGVDIWMNTPTRLA 
EASGTSGEKALMNGVLNFSVLDGWWYEGYRKDAGWALTDKRTYQNEQYQNQLDAEAIYYL LEHDILPLYYEHGGKNYSEDWVKYIKNSIAQIAPHFTMKRQLDDYYDRFYNKLSEHFHIL AADNFAKAKMMADWKANVRSRWDAIEIKSIEAGNGLNATVEAGKEYEVTVVVDEKGLDDA IGIESVIIRREDGQDHIYEVIPLLPVSKNGNLYTFKATSGIFNAGSFKQAFRMYPKNALL PHRQDFCYVRWF >gi|226332241|gb|ACIC01000079.1| GENE 29 40743 - 42989 1175 748 aa, chain - ## HITS:1 COG:PA5529 KEGG:ns NR:ns ## COG: PA5529 COG0475 # Protein_GI_number: 15600722 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, membrane components # Organism: Pseudomonas aeruginosa # 7 443 6 447 585 327 39.0 7e-89 MSEVAPLISDLAIILIIAGIVTVIFKWLKQPVILGYIVAGIMAGPSVSLVPTVSDPANIK IWADIGVIFLLFAMGLDFSFKKLINVGITAIVATVTIVCGMMFIGYTAGNAMGFSHMSSI FLGGMLSMSSTAIVFKAFNDMGLLQQKFTGIVLGILVIEDLVAVVMMVVLSTLAVGKHFE GKEMLESILKLAAFLIFWSALGIYLIPTLLKKIRRFTSNEILLITSLGLCLGMVMIATKA GFSAALGAFVMGSLLAETVEAEKIVHIVQPVKDLFASIFFVSVGMMIDPAMMWEYAVPIL ILTLLVLSGQVLFGSFGVLLSGQPLKIAIQSGFSLTQVGEFAFIIASLGVSLNVTDKYLY PVIVAVSVITTFLTPYMIWLSEPAYRFIDIHMPESLKDYLVHYTSGAMTVKHQGTWHKLI RSMLVSVTLYLVVCVFFITLYFSYVHPLIRKSLPGMEGNLLGFIIIFLVISPFLWAIIMK RNNSTEFRKLWTDNKFNRGPLVSIVLVKILICTIIMMSIITHLFNVALGAGLVVSSIIIA VIYFSKRIKKRSLTIERQFMANFQGTDGNGLSTESDTGSTFGSNIPFKELHLADFTVSPD SIYVGRTLKESSLRTLFQINVISITRGEKRLDIPQGEEHLYPYDKVTVVGTDRQLESFRT SMEQKKVERNGNGTSSQDEMEIGQFPVEGGSPLIGKTIREADISDSIIIGIERSTVNIMN PDPDTIFKENDTVWIVGKRKIIKGLNKD >gi|226332241|gb|ACIC01000079.1| GENE 30 43003 - 43986 835 327 aa, chain - ## HITS:1 COG:BH3164 KEGG:ns NR:ns ## COG: BH3164 COG0205 # Protein_GI_number: 15615726 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Bacillus halodurans # 4 327 2 319 319 300 50.0 4e-81 MDNKYIGILTSGGDASGMNAAIRAVTRAAIFNGFKVKGIYRGYEGLIAGEVKELTTEDVS SIIQRGGTILKTARSETFTTPEGRKKAYKVIQKENINALIIIGGDGSLTGARIFAEEYDV TCIGLPGTIDNDLYGTDFTIGYDTALNTIVECVDKIRDTATSHDRIFFVEVMGRDAGFLA QNSAIASGAEAAIIPEDRTDVDQLETFIGRGFRKTKNSSIVIVTESPENKNGGAIYYADR VKKEYPGYDVRVSILGHLQRGGAPSANDRILASRLGEAAIQALMEGQRNVMIGIRNNEIV 
YVPFVQAIKKDKPIDKSLIRVLNELSI >gi|226332241|gb|ACIC01000079.1| GENE 31 44052 - 44507 281 151 aa, chain - ## HITS:1 COG:mll3697 KEGG:ns NR:ns ## COG: mll3697 COG1595 # Protein_GI_number: 13473184 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 2 74 6 76 183 62 38.0 4e-10 MRGFDFDKALVALQNELHCFAYKLTADKDEAENLLQETMLRTLDNKDKFDSGTNFKGWMY TIMRNAFINNCRTKKIRGNLYVLSEPEYHFLLRDDSFIFVDNGHDAKEIREALKTLPKAH YVVFMLYRSQISGNSRKDRSVTEYDKKPYLL >gi|226332241|gb|ACIC01000079.1| GENE 32 44753 - 45019 276 88 aa, chain - ## HITS:1 COG:no KEGG:BT_1104 NR:ns ## KEGG: BT_1104 # Name: not_defined # Def: small heat shock protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 71 1 71 142 139 100.0 3e-32 MVPVKTNQNWLPSIFNDFFDNEWMARANATAPAINVIENEKDYKVELAAPGMTKNDFKVS VDESNNLVICMEKRTRKGGEERRKILAP >gi|226332241|gb|ACIC01000079.1| GENE 33 45087 - 45836 670 249 aa, chain - ## HITS:1 COG:STM0772 KEGG:ns NR:ns ## COG: STM0772 COG0588 # Protein_GI_number: 16764136 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglycerate mutase 1 # Organism: Salmonella typhimurium LT2 # 1 248 3 250 250 287 58.0 2e-77 MKRIVLLRHGESLWNKENRFTGWTDVDLSEKGVEEACKAGDALREAGFSFEAAYTSYLKR AVKTLNCVLDRLDKDWIPVEKTWRLNEKHYGMLQGLNKSETAVQYGEEQVHIWRRSYDVA PAPVGKDDPRNPGMDIRYAGVPDSELPRTESLKDTIGRVMPYWKCIIFPALMYKDSLLVV AHGNSLRGIIKHLKGISDTDISNLNLPTAVPYVFEFDDRLVLVKDYYLGNPEEIRKRAEA VAKQGMAKK >gi|226332241|gb|ACIC01000079.1| GENE 34 45850 - 46899 1014 349 aa, chain - ## HITS:1 COG:all3735 KEGG:ns NR:ns ## COG: all3735 COG1830 # Protein_GI_number: 17231227 # Func_class: G Carbohydrate transport and metabolism # Function: DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes # Organism: Nostoc sp. 
PCC 7120 # 3 349 11 360 360 504 68.0 1e-142 MKIVDLLGGKAEYYLNHTCKTIDKQLIHIPGPDMIDKVWMNSDRNIRTLESLQALYGHGR LANTGYVSILPVDQGIEHSAGASFAPNPLYFDPENIIKLAIEGGCNAVASTFGVLGAVAR KYAHKIPFIVKLNHNELLTYPNSYDQVMFGTVKEAWNMGAVAVGATIYFGSEQSRRQIVE VSQAFEYAHELGMATILWCYLRNSSFKKDGTDYHAAADLTGQANHIGVTIKADIVKQKLP SNNGGFKAIGFGKTNECMYSELTTDHPIDLCRYQVANGYMGRVGLINSGGESHGESDLHD AVVTAVVNKRAGGMGLISGRKAFQKPMKDGIQLLNTIQDVYLDSSITIA >gi|226332241|gb|ACIC01000079.1| GENE 35 46928 - 47410 524 160 aa, chain - ## HITS:1 COG:MTH158 KEGG:ns NR:ns ## COG: MTH158 COG1528 # Protein_GI_number: 15678186 # Func_class: P Inorganic ion transport and metabolism # Function: Ferritin-like protein # Organism: Methanothermobacter thermautotrophicus # 1 158 2 161 171 124 43.0 9e-29 MTEKLQNALNEQITAELWSANLYLSMSFYLEREGFSGMARWMQKQSAEETGHAYAIAGYM IKREATPKVDKVDVVPQGWGNPVEVFEHALEHEKHVSKLIDELVQVASEEKDNATQDFLW QFVREQVEDEANVLNIVSHLRKAGDCAILFMDAKLGERES >gi|226332241|gb|ACIC01000079.1| GENE 36 47461 - 48108 440 215 aa, chain - ## HITS:1 COG:BS_yvbG KEGG:ns NR:ns ## COG: BS_yvbG COG2095 # Protein_GI_number: 16080438 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Multiple antibiotic transporter # Organism: Bacillus subtilis # 3 204 2 209 211 123 38.0 3e-28 MDIFVYLTLCFTSLFTLMDPLGVMPVFLQMTDGMDTKERRYIALKACTIAFIILVLFTLS GRFLFHFFGISTNGFRIVGGIIIFKIGYDMLQAHFTHVKLNETERKEYSKDITITPLAIP MLCGPGAISSGITLMEDASEYTFKIVLLGVIALVCILSFFILCASTQLLKILGETGNNVM MRLMGLILMVIAVECFISGIRPVLIEILKQAHACS >gi|226332241|gb|ACIC01000079.1| GENE 37 48211 - 48663 245 150 aa, chain - ## HITS:1 COG:MTH158 KEGG:ns NR:ns ## COG: MTH158 COG1528 # Protein_GI_number: 15678186 # Func_class: P Inorganic ion transport and metabolism # Function: Ferritin-like protein # Organism: Methanothermobacter thermautotrophicus # 1 106 2 107 171 73 30.0 2e-13 MDKRIENAMNELINTEIWSTGLYLSLQVYFEDERLPILSSWLNSQAQDNMNKVYQMMNRI CHDGGCVVINEMKRDTHEWTTPLNALNELLEHEQYISRQVNTFLILCWNVSMSFHSFISG LYADRIYVSTAFMELLRILAKENERKLPYF >gi|226332241|gb|ACIC01000079.1| GENE 38 
48825 - 49034 99 69 aa, chain - ## HITS:1 COG:no KEGG:BT_1110 NR:ns ## KEGG: BT_1110 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 69 1 69 69 115 100.0 4e-25 MVLDNSNSPYHIEEELWLQILSLSRELQECKASLHHNTLDCISDKIEGIERELDVLLDIL QGRQDKKTS >gi|226332241|gb|ACIC01000079.1| GENE 39 49629 - 49826 58 65 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVCGLRDHSPLYVNLYVLGYEYCPCHYPPLHFYYCQIDKILTEEWVMTYIPAVSWSVFLL FCYTI >gi|226332241|gb|ACIC01000079.1| GENE 40 50414 - 50536 161 40 aa, chain + ## HITS:1 COG:no KEGG:BDI_2138 NR:ns ## KEGG: BDI_2138 # Name: not_defined # Def: integrase # Organism: P.distasonis # Pathway: not_defined # 1 40 1 40 406 63 80.0 3e-09 MARSTFKVLFYVNGNKEKNDIVRIIGRVTINGIVTQFSYK >gi|226332241|gb|ACIC01000079.1| GENE 41 50561 - 50986 124 141 aa, chain + ## HITS:1 COG:no KEGG:BT_1112 NR:ns ## KEGG: BT_1112 # Name: not_defined # Def: transposase # Organism: B.thetaiotaomicron # Pathway: not_defined # 45 141 1 97 97 191 100.0 8e-48 MKGNRAKGKSAEVRDINLALDNIKAQIIKHYLRIFDREAFVTAEMVSNAYQGIGSEYETL LKASGRENEVFKKRVGKDRVMATTVHGWWQETMWQRSSSLFTDGRICSCWRLSPTSSRGL QPTYQRKRDFTTGRYGKNACG >gi|226332241|gb|ACIC01000079.1| GENE 42 50878 - 51207 174 109 aa, chain + ## HITS:1 COG:no KEGG:PG1113 NR:ns ## KEGG: PG1113 # Name: not_defined # Def: integrase # Organism: P.gingivalis # Pathway: not_defined # 1 105 156 259 407 135 63.0 5e-31 MFMLEIKPDFIKGFAAYLSTKAGLHNGTIWEKCMWLKGVVMCVHFNGLIPKKTFAQFHIS PNVTEREFLTEDELKTLMTHELGMSNCSIFVTFLFSPASLPSLSWTSRN >gi|226332241|gb|ACIC01000079.1| GENE 43 51228 - 51419 184 63 aa, chain + ## HITS:1 COG:no KEGG:PG1113 NR:ns ## KEGG: PG1113 # Name: not_defined # Def: integrase # Organism: P.gingivalis # Pathway: not_defined # 1 63 273 335 407 107 69.0 1e-22 MNGEKWILSKRYKTKVPFQVKLLDTPLQIIERYRPCQEDNLIFPNLNYWSICKSLKKGMK ECG >gi|226332241|gb|ACIC01000079.1| GENE 44 51505 - 51957 256 150 aa, chain - ## HITS:1 COG:MA0407 KEGG:ns NR:ns ## COG: MA0407 COG0716 # Protein_GI_number: 20089301 # Func_class: C Energy 
production and conversion # Function: Flavodoxins # Organism: Methanosarcina acetivorans str.C2A # 8 148 8 162 179 112 36.0 2e-25 MPHSEPGKTLIVYFSWSGHTQTVANIIHELIGCDMVEIEPEEPYSDEYNEVVDRFKNERD NHILPALRTKVENMDDYDTLIIGSPIWGGLLSSPVKSFLSGYDLSGKKILPFCTHGGSGT AQSVDNIRKLCPHAEILLFMAVKLQTLGMK >gi|226332241|gb|ACIC01000079.1| GENE 45 52088 - 53119 768 343 aa, chain - ## HITS:1 COG:no KEGG:BT_1114 NR:ns ## KEGG: BT_1114 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 343 1 343 343 719 100.0 0 MKKQIFYMTFIALLSGCNCYKGNILQIEEQGSFAVGGTVLTDSLGHKYHGDHAYVFYQKP VDARKYPLVFAHGVGQFSKTWETTPDGREGFQNIFLRKGFSTYLVDQPRRGNAGRSTEAV TLEPVFDEEEWFNRFRVEIYPDYFEGVQFSRDREVLNQYFRQMTPTIGPLDFDVYSDAYA ALFDKIGPAIFVTHSQGGPVGWFTLLKTKNIKAIVAYEPGGSVPFPTGQVPEEGKVLTRS KKTEGIEVPMAVFKRYMEIPIIIYYGDNLPETDEHPELYEWTRRLHLMRKWAEMLNKLGG DVTVIHLPDVGLHGNTHFPMSDLNNVEVADLLSKWLYEKQLDR >gi|226332241|gb|ACIC01000079.1| GENE 46 53127 - 54299 946 390 aa, chain - ## HITS:1 COG:TM1006 KEGG:ns NR:ns ## COG: TM1006 COG0667 # Protein_GI_number: 15643766 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Thermotoga maritima # 71 380 15 323 333 303 48.0 4e-82 MDKDKDMSRRRFLKMAALTGAAMCVGPTLNKVTAAERVWMGKQQVANNIPAGMAAVRTRR TLGHGNTAFTVSAMGYGCMGLNHNRSQYPSREKEIALVHEAVERGVTLFDTAESYGYHIN EKLVGEALKGYTDRVFVSSKFGHKFVNGVQIKTEEDSTPANIRRVCENSLRNLGVETLGM FYQHRIDPNTPIEAVAETCSKLIKEGKILHWGMCEVNVDTIRRAHKICPVTAIQSEYHLM HRMVETNGVLELCEDLGIGFVPYSPLNRGFLGGMINEYTKFDVTNDNRQTLPRFQPEAIR ANYRIVEVLNAFGRTRGITPAQIALAWLMNKKPFIVPIPGTTKLSHLEENLRACDIRFTA EEIEELETAVAAIPVVGSRYDALQESKIPK >gi|226332241|gb|ACIC01000079.1| GENE 47 54426 - 54983 289 185 aa, chain - ## HITS:1 COG:SA0649 KEGG:ns NR:ns ## COG: SA0649 COG1661 # Protein_GI_number: 15926371 # Func_class: R General function prediction only # Function: Predicted DNA-binding protein with PD1-like DNA-binding motif # Organism: Staphylococcus aureus N315 # 48 182 4 138 140 85 31.0 
5e-17 MVRLIKTETVISVLMGHFIKKLLFILLFVGVLIAPANAQNEKNMYSYKKIGNKYIVSINN HTEIVKALNAFCKEKGILSGSINGIGAIGELTLRFFNPKTKAYDDKTFREQMEISNLTGN ISSMNEQVYLHLHITVGRSDYSALAGHLLSAIQNGAGEFVVEDYSERISRTYNPDLGLNI YDFER >gi|226332241|gb|ACIC01000079.1| GENE 48 55081 - 55446 240 121 aa, chain - ## HITS:1 COG:no KEGG:BVU_3495 NR:ns ## KEGG: BVU_3495 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 118 1 118 200 245 99.0 4e-64 MSIQFDIAKQIIAETCPNRKLKLENIRLFADIIQPKKYQKGDIILNEGDICNCLFYIEKG FIRQHYLKHDKDVIEHLACEKNVVWCIDSYFNREPTHLMMDAVENSVLWEIPRDIMEEYG K >gi|226332241|gb|ACIC01000079.1| GENE 49 55456 - 57072 754 538 aa, chain - ## HITS:1 COG:no KEGG:BVU_3494 NR:ns ## KEGG: BVU_3494 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 538 1 538 538 1059 99.0 0 MKRYINILMVALATMMTSCLTEDPRDQLYEEDIYNNANNIYINAVAVLYNYIGGSADSEG LQGTCRGIYDYNTLTTDEAMIPIRGGDWYDGGLWTNMYQHKWSPNDLPLYNTWKYLYKVV VLSNKSLHIIDKYSHNLSEEQRVAYEAEVRALRALFYYYIMDMYGRVPIVTSYEQPQDEV VQSERSEVFRFIVNELQEVAELLPNERSNKMGNYYGRITTPVANFLLAKLALNAEIYCDD NWTDGVPRNGKEIFFTVDGEKLNAWQTCIKYCNKLAEVGYRLEDDYSYNFSVHNENSNEN IFTIPLDKNLYAAQFWYLFRSRHYNHGGALGGSSENGTSANISTVLANGYGTDDVDARFE KNFYAGIVEVDGKPVMMDNGQQLEYFPLELKLNLTGSSYVKTAGARMAKYEVDRTSHSDG RQPDNDIVLFRYADALLMKSEAKVRNGEDGSLELNEVRGRVGMPYREATLENILKERLLE LVWEGWRRQDMVRFGVYHKSYDQRVQLADEKNGYTTVFPIPQKSIDLNPKLKQNVGYK >gi|226332241|gb|ACIC01000079.1| GENE 50 57085 - 59781 1595 898 aa, chain - ## HITS:1 COG:no KEGG:BT_1119 NR:ns ## KEGG: BT_1119 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 898 1 898 898 1738 100.0 0 MGRVSLCLVIMFFICSAQKIDAQVQNIREKQDTISATKSLEHIRPENIKKGPLNNALDVL SGQSAGVNVTTNGTDRLAMLNSIRVRGTTSIMGGNDPLVIIDGVTSDIATLSTIYPADIE SFTILKNATETAMYGSRGASGVIEIKTKKGTGRGFEISYDGNYGFESMYKHLQMLNGPEY IATAEALGLDYNNGGYNTNFHDVITRTGLIHQHHLAFSGGSENSNYRASFGFMDHNTIVK VNDYRNLVVKLDATQKAFDGRLVGDFGVYGYSSKIHDIFDTRMLFYSAAAQNPTYPAGTD 
VNGNWVKNSAASHINHPGALLYEKNDSEERNFNTHLGLKFNILDNLILSAFGSYSYSSTG NAQFCPTWVWAQGNVYRGEFKGEDYFTNVSLSYNNAWGDSHLDAVVGAEYLKQVRTGLWV QAKGITTNDFSYNNIGATSSRPFGGTSSSYEDQSLASIMGSVTYSYKDRYSIAAALRGDG SSMVSDNNTFGFFPSVSLGWDVKKEGFLSDTDFITMLKLRTGYGRSGNLGGITSYTTLNT VKENGIVSINGAPTVTMGSIRNTNPDLKWETRSTFNIGFDLGIWDNRLMLTSELYYSKTT DMLYEYDVPVPTFAFDKLMANIGSMSNQGVELGISVVPIQRKDMEMNINFNMSYQKNKLL SLSGEYNGMHMTASDITPIGSLYGAGQNGGDNNVVYQIVGQPLGVFYLPHCKGLKENELG GYSYDIEDLNDDGEIDFSDGGDRYIAGQATPKVTIGSNISFRYKSFDIAMQINGAFGHKI FNGTGLAYTNMSIFPDYNVLKGAPEKNIIDQNVSDYWLEKGDYVNIEHITIGYNIPMKSK AVKSLRLSAGISNLATITGYSGLTPMINSYVVSNTLGIDDKRTYPLYRTYSLGLSIQF >gi|226332241|gb|ACIC01000079.1| GENE 51 60497 - 60736 81 79 aa, chain + ## HITS:1 COG:no KEGG:BT_1120 NR:ns ## KEGG: BT_1120 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 79 1 79 79 161 100.0 6e-39 MLFSFFSEDYYFRQREIWQAADFTPSKIIDLNCCPLAFIGNCSTFVVACANEPSNVERWE HRTINKVPKRGKIPSTTAL >gi|226332241|gb|ACIC01000079.1| GENE 52 60820 - 61014 61 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154495398|ref|ZP_02034403.1| ## NR: gi|154495398|ref|ZP_02034403.1| hypothetical protein PARMER_04455 [Parabacteroides merdae ATCC 43184] # 18 64 1 47 47 84 100.0 2e-15 MTGWIRYRSKKFRNSSGMLCGNSDENGNAIRDYDTAKRIRRRVTLQQKQFKDYSICNQLL LNIL >gi|226332241|gb|ACIC01000079.1| GENE 53 60950 - 62557 1585 535 aa, chain - ## HITS:1 COG:MA0419 KEGG:ns NR:ns ## COG: MA0419 COG1073 # Protein_GI_number: 20089312 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Methanosarcina acetivorans str.C2A # 214 530 2 322 327 362 53.0 1e-99 MNKLITTIACLICCIVYTQAQNKNNMLSKKEQSIAAISMYAARGNQDSLKVILARGLDCG LTVSEEKEVLTQLYAYCGFPRSMGALVTLMNLTKERAAQGIKDEAGREPSPVKSSDMFVV GGQNQLKLFGRPALGEVLTFAPALDQFLKAHLFGDIFSRDNLDWRTRELSTVAALSVLDG VKNELNTHIAHAKHNGVTQAQIDEVLIMAARCRNGMVLSESDEPAKTFQTDPTITVRKVF YKNRYDIMLCAEMYLPKDFNEAQHYAALIIGHPFGAVKEQCSGLYAQEMARRGYVTLAFD 
ASYQGESGGEPRHTVSPDALVEDFSASVDWLGLQPFIDRNRIGVIGICGSGGFSVCAASL DPRIKALATVSMYDMGRATRNGLGDSMTDEQRRKLLDEVAEQRWKEAETGEARIRFGTPE KLLGNANAVQKEFFDYYRNPLRGYHPRYQGIRFTSQAALMNFYPFAMIKEISPRPVLFIA GEHAHSRYFSEDAYQEASEPKELYIVPGANHVDLYDRMDKIPFEKISEFFWYALR >gi|226332241|gb|ACIC01000079.1| GENE 54 62582 - 63736 350 384 aa, chain - ## HITS:1 COG:BS_ykgB KEGG:ns NR:ns ## COG: BS_ykgB COG2706 # Protein_GI_number: 16078366 # Func_class: G Carbohydrate transport and metabolism # Function: 3-carboxymuconate cyclase # Organism: Bacillus subtilis # 39 384 8 346 349 179 31.0 8e-45 MKPKALFVMLILWLSFTTIQARSTEDVPDKGTYAKFLLVGCYMKPDEEGVRMYRFDGQTA DVDYPCGLRGISNPAFLTSDSTGNRIYAIGDDEGKSSTANALLFDKESGLLSLLNSQSTD GELPIYITLSPKEYFVLTANYKGGSITVFSQDKKGKLQRDTKIIRFAGNGPNKKRQEQSH LHCVTFTPDGKFLLATDLGTDCIYLFPIGKRPEAGKAHSLLDESRVVRIQMDSGSGPRHI CFHPNGRFAYLISELSGKITVFSYNEGKLERLQTIVCDPFVAEGNADIHVSSDGKFLYAS KHLKEDGIIVYSIDSQKGTLVQIGFQPTGLYPRSFAISPDGCYLAVVCRDANCIQIFERN RNTGLLKNTGKNIRLERPAFVKFL >gi|226332241|gb|ACIC01000079.1| GENE 55 64016 - 64936 583 306 aa, chain + ## HITS:1 COG:no KEGG:BT_1123 NR:ns ## KEGG: BT_1123 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 306 1 306 306 632 100.0 1e-180 MVMGDITRIDTVQQYCDLFEVEALHPLVSVVNCYEVQPIRHSKKLYNIYAVLLKDTDCGT MNYGRSLYDYEKGSMLFIAPGQVMGSDDDGSLHQPAGWALMFHPELLRGTSLAHIIKEYS YFSYNANEALHLSEQERKVVIECINNVAEELRHPIDKHSRSLIIDTMKLLLDRCIRFYDR QFITRENANNDLLARFELLLNNYYHSALPTSKGIPTVQYCADQLCLSTNYFSDLVKKETG MSAIKHIQQKIMDIAKERIMNTQKSISQISDEMGFQYPQHFTRWFKKMEGCTPNEYRNEI IKQAIN >gi|226332241|gb|ACIC01000079.1| GENE 56 65005 - 65931 467 308 aa, chain - ## HITS:1 COG:no KEGG:BT_1124 NR:ns ## KEGG: BT_1124 # Name: not_defined # Def: putative integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 308 1 308 308 607 100.0 1e-172 MATIRTKFRSSSAEGREGALYYQVIHNRVVRQISTGYKLFASEWDQRSEAVIPCQHPAGM ERDNYLLSVGERIRRDKIRLEKAIRTLSQSGPFTADDIVIRFHDSGQEPSFNDYIRQQIV RLKRLGKIRTSETYTAALKSFSSFMKGSDILFGELSSDLLMEYEAYLKNRGNSPNTISFY 
MRILKAVYNRAVENGLTGQRNLFKSVYTGVEKTLKRAIHLNDIRRIKRLDLSLKPHLDFA RDMFLFCFYTRGMSFVDMAYLKKGDIANGILTYRRKKTGQQLFIRWEKCMQEIIKLFCLS VQDFKVVH >gi|226332241|gb|ACIC01000079.1| GENE 57 65939 - 66700 476 253 aa, chain - ## HITS:1 COG:no KEGG:BT_1125 NR:ns ## KEGG: BT_1125 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 253 1 253 253 459 100.0 1e-128 MKEENYLEGIYGCLERIEKKVETLPVEGTSPAVGNRTSSPEGIAELKIRLERLQSAVEKN GLEIAAVRNHTVRLSEGWPLSAETFAGEMEKIRYCLSQDCQAVKETVRRLDERMVLLKKE PERRLVTYRLESASKAVVTTASALILALIISVWTNCNQYRTNRLLKDADLKYRAIRICLP GDDPDIAFLEKHFTIKRDEEKIRRVERLVTAFEDSVRNRIRNHEMAAYKDSLAHRLFREA QEIRKQLDNPNSK >gi|226332241|gb|ACIC01000079.1| GENE 58 66730 - 67650 495 306 aa, chain - ## HITS:1 COG:no KEGG:BT_1126 NR:ns ## KEGG: BT_1126 # Name: not_defined # Def: mobilization protein BmgA # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 292 11 302 316 581 100.0 1e-165 MIGKIRKGRSFSGCIRYVTQKDDAKIIASEGVLLGTVEETARSFRWQCLLNPDVAKPVGH IALSFKPEDAPRLTDAFMARLAEEYLELMGIRNTQFIVVRHHGTDNPHCHIVFNRVNFDG KVISDSNDFKRNEKVTKMLKDKYSLTYSEGKQSVKTEKLHASEKVKYEIYRAVKEALRSA DTWKEFQNKLLKMGVEMEFKYKENTNEVQGIRFIKNGLSFKGSGIDRSFSWSRLDAALDH NHVTSLENDVSQKQPYHEQSHGSVIDNLVEVTGTGGVFMPSVAPTEDEKEAERLRRKKKR RKGRSL >gi|226332241|gb|ACIC01000079.1| GENE 59 67655 - 68026 210 123 aa, chain - ## HITS:1 COG:no KEGG:BT_1127 NR:ns ## KEGG: BT_1127 # Name: not_defined # Def: mobilization protein BmgB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 123 66 188 188 228 100.0 5e-59 MTNDVNGMKKKCGRPALGRTRKLTRGVTVKFSPVSYEALRFRAGKSGRSLAVYIREAALA ATVTARHTPEENALLRSLAGMANNLNQLTKLSHQTGFYRTRLLIEGLLGKLKRIMDDYRP KGG >gi|226332241|gb|ACIC01000079.1| GENE 60 68256 - 68504 106 82 aa, chain - ## HITS:1 COG:no KEGG:BDI_3249 NR:ns ## KEGG: BDI_3249 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 82 86 167 167 138 95.0 7e-32 MRRRTSGRQRRASLEEYRETYLTVPKIRNRKTVFVSEDVRDELDAVVRRLGGRGMSVSGL LENLAREHLAAYRGDIEQWRKI 
>gi|226332241|gb|ACIC01000079.1| GENE 61 68395 - 68760 277 121 aa, chain - ## HITS:1 COG:no KEGG:BDI_3249 NR:ns ## KEGG: BDI_3249 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 46 1 46 167 75 91.0 4e-13 MKSEPTEKTKGKKMAVSETADRKNGRSPGAGHDEWWERLMMEPGPGGIRRYGCFRDGIDD SIRARCGGGIRGGEKKGHPAGTGRPREKKNERQATQGLAGGVPGNVPHRPEDQEPQDGVR Q >gi|226332241|gb|ACIC01000079.1| GENE 62 69238 - 69549 238 103 aa, chain + ## HITS:1 COG:no KEGG:BT_1129 NR:ns ## KEGG: BT_1129 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 103 1 103 103 197 100.0 1e-49 MEIVNIEARTFEAMLSAFRTFADRLDTLCRLYGDMEEKKWLDNQEVCLLLKVSPRTLQTL RDNGTLAYTQICHKTYYKPGDVESIIRIVEERRKRAESMGRSI >gi|226332241|gb|ACIC01000079.1| GENE 63 69593 - 69916 372 107 aa, chain + ## HITS:1 COG:no KEGG:BT_1130 NR:ns ## KEGG: BT_1130 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 107 1 107 107 176 100.0 2e-43 MMNTDNRLLTRESSEHIREFFSTVERLSVSMERLFAGRSPAMAGENFYTDRELAEKLKVS RRSLQQYRDSGLLAFTRLGGKILYRSSDIEKLLDGCYREARTRPEEL >gi|226332241|gb|ACIC01000079.1| GENE 64 70441 - 70833 349 130 aa, chain - ## HITS:1 COG:no KEGG:BDI_2238 NR:ns ## KEGG: BDI_2238 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 130 1 130 130 259 98.0 2e-68 MALIVYNRENSRPQEVTYKGKRTINLDSRGTVYLSKTMSIELGILGGGRVNFAHDDETDD WYICRADDSEGFIVWKDKRCARFSAGVIVQRLMRQAKVERKSVQFMMARMPVEIGGVAYY KILLSNPILR >gi|226332241|gb|ACIC01000079.1| GENE 65 70865 - 71425 441 186 aa, chain - ## HITS:1 COG:no KEGG:BT_1133 NR:ns ## KEGG: BT_1133 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 186 1 186 186 352 100.0 3e-96 MADKSAEKERLFNEWFTASYDRLRGTLRRYGMLDEDNFHDTYLFVRRQVLVPGKDITDYD AYFIGCYKKAALVKIKRENRYAHPEDDFFLRCGEEAKFLSEDDLNGCERLVRDILRFVRQ KFSYEEYRMFMLRFYEAQFSFKALAECMGISASAISQKVCRIVDAVRTHSGFAWRSQMLA VESFMY 
>gi|226332241|gb|ACIC01000079.1| GENE 66 72371 - 72970 371 199 aa, chain - ## HITS:1 COG:no KEGG:BT_1135 NR:ns ## KEGG: BT_1135 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 199 1 199 199 406 100.0 1e-112 MDGISLYDDCVMLAYNKEVRRNCLPFTCGENDLDDFFLNDADLYADELLGKTYCWVTTEI PHRIVALFTLSNDSIKTRLISPNDKNRLQRNIVNPKRGRSYPAVLIGRLGVNLEYQGTSS HVGRQLMAFIKDWFRHEDNKTGCRFIVVDAYNEEKILRYYERNGFVPLYKTDVIEKQYYD IPQDEPLKTRLLYFDLKKD >gi|226332241|gb|ACIC01000079.1| GENE 67 72976 - 73143 114 55 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154492100|ref|ZP_02031726.1| ## NR: gi|154492100|ref|ZP_02031726.1| hypothetical protein PARMER_01731 [Parabacteroides merdae ATCC 43184] # 1 55 1 55 55 82 100.0 9e-15 MAITIKNIPVLEGVTAEDFVRSADKNAAKVTPRLSVEAKKRLRKVLEKSRSFRFN >gi|226332241|gb|ACIC01000079.1| GENE 68 73650 - 74897 873 415 aa, chain - ## HITS:1 COG:no KEGG:BDI_3265 NR:ns ## KEGG: BDI_3265 # Name: not_defined # Def: transposase # Organism: P.distasonis # Pathway: not_defined # 1 415 1 415 415 837 99.0 0 MKTSMSRSTFKILFYVKKGSERANGYLPLMCRLTVDGEIKQFSCKLDVPPKLWDVKTARA TGKSAEAQKINAAVDRIRVDVNRRYQELMQSDGYVTAARLRDACLGLGVKRETLLKLFEQ HNEEFIKKVGHSRVQGTYNRYRTIYRHLCEFVPKVYRRDDIPLKELNLTFINNFEYFLRT EKKCRTNTVWGYMIGLKHVISIARNSGALPFNPFAGYINSPESVDRGYLTEREIQTLMET PVKSGTCELVRDLFIFSVFTGLAYADVKALTTDRLQTFFDGNLWIITRRRKTNTESNIRL LDVPKRIIEKYKELSKDGHVFPVPSNGRCNTILKELGRQCGFKIRLTYHVARHTNATTVL LSHGVPIETVSRLLGHTDLKTTQIYARITNQKISSDMEILSHKLEKMEKEICDAI >gi|226332241|gb|ACIC01000079.1| GENE 69 75083 - 76711 999 542 aa, chain - ## HITS:1 COG:CC0788 KEGG:ns NR:ns ## COG: CC0788 COG1874 # Protein_GI_number: 16125041 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Caulobacter vibrioides # 42 223 38 203 549 75 28.0 2e-13 MYKRMIILTYALVFFSLFATACSDNNNETYVEPILSQIEVSKLVTENNKTYVSVDGKPFP FLGAQIRLDALLNCDKMTINEVENYFKKAQELGLNCVQIPISWNMVEPKENKYDYSIVNS ILQFVNKYNLKMELLWFSTNMVGDSFSYLIPQYVLQEYNKRLSRNDEGNFWNYYGYQYTM 
ILDDEWVLERETKAITALFNHIRYWDSQNGDKHPVISAQIHNEPDALMRWRIDQKDLKYR DGTPLSKEKAWTMITNALNTVGKAVKNSSYKVVTRVNLIYGDGINPFPEATNARPKDVFD LQGIDFIGVDAYKDNIKHLKNEVMAYASIAGNYALVAENKGSYANSPSLILTSFALGGGY DIYDLATSNFFINNTTEPDQIDHGIYTWDLQEKEFTPPTRSLIKGLAAAYIDVAKVKPEN FAAFNINDNQPKDKLEQLICTTGAQITFQTNNASLGFVLDMHNYLLIYSLNDSQFKLENG KFGETISGRYDVNGTFTKEGTATLENQTLHAKGGVLYKVNAERKSKVDDNTAKMQSIKWL GV >gi|226332241|gb|ACIC01000079.1| GENE 70 76745 - 78745 1381 666 aa, chain - ## HITS:1 COG:no KEGG:BT_2806 NR:ns ## KEGG: BT_2806 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 666 1 666 666 1356 99.0 0 MKRLTISTLLSGVLFLTSCNDFLNQEPLDQLSPNEYLTTESNIAAFATDQYQVLPTHGTY GYGTFEIDNNTDNMAGMQPNVMYAPGYWKVGQEGGSWYFADIYRCNYFFENVLPSYEDNA IVGNTTNIKHYIGEMYFFRAFMYFERLKSLGDFPIITKTYPNEREALIEISKRAPRNEVA RFILSDLDKAIEMMQNIAPSGGKNRLSKDCATLLKSRVALYEGSWLKNFKGTAFVPNGPG WPGANKDYNANYSFKSGSIEKESEFFFDEAIKAAQIIAEKYNVLTANTGVFPQTSSEPEN PYFSMFCDIDMEKYNEVLLWRRYNTGLGVANEVCQYGCVGNNGVGTTKSMIDAFILKNGE PIYASPMWADENSSYWGDNNIIHITKNRDTRADIFIKKPGQKNLHTQAGDHGVVEEWGPN ITASSVTEKYNTGYALRKGLNPDGKYTNNTQSIVGSIIFRTAEAYLNYIEAYYELHGSLD SYAEKYWKAIRRRAGIDEDYTKTIRLTDMSKEAETDWGAYSGGEIIDATRYNIRRERRCE LMAEGLRSMDLHRWRSMDQMITKPYHILGMNLWQEMYGWDWYKDSNGNSILKEGENVSPR SFSTYLAPYHITANNIVYNGYKWNMAHYLDPIAAEHFLVTSSGGDLDSSPIYQNPGWPMT GGGTPQ >gi|226332241|gb|ACIC01000079.1| GENE 71 78758 - 82009 1748 1083 aa, chain - ## HITS:1 COG:no KEGG:BT_2805 NR:ns ## KEGG: BT_2805 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1083 1 1083 1083 2105 99.0 0 MKTNRHSTLLKKAVHLITYNSSHKLICLLGCLLLCSMNTYAQNYVSGTVKDNNGEPLLGV SIKVKDGNIVSGTVTDFNGHYQVKADPSSTIEFSYIGFKTVQFTVGDRKMINVTLGVDDN LLDEVIVVGYGTQKKINLTGSVGVIDSKAFEAIPVSNAVQALQGQVPGLNIYSNQGGGLN QKQSINVRGVGTIGEGSTGSALVLIDGMEGDIYSINPQDIESISVLKDAAASSIYGSRAP FGVILVTTKKGKAGKAQVNYNNSFRVSSPINMPSSLDSYTFSLYYNDAAANSGWGPYNWV SEERINRIKDYMAGKITDSTIPVPNNPSLWADGYSQGNDNIDYYDYFFKDNVFAHEHNIS VTGGTDKIQYYLSANYLGQDGELRIGDEYSNRYTASAKINAQLSKIVSVGFNTRFIRSDY 
VQPTHLNESFFAEIGRQCWPTKPLYDPNGILYDDHVLQMQNGGEKQERNTWLYQQLNITV EPIKGWRLIGDLSYRYNTQYSHWDYLTVHQTGVDGKTKGNTWNNDSQVHEGTYASDYFNV NLYTDYAKTFAKSHNMKVLFGFQAEQNNYKDIAAEKLGVIYPGKPTINTSTGMDSNNQKV APNVAGGHNRWATAGFFGRLNYDYLEKYLLEVNLRYDGSSRFRSDSRWGFFPSASVGWNI AREKFFLPVSKVVNTFKVRASYGSLGNQNTSSLYPTYSTIGTGTGSWIIDGTLANIAWAP SLVSYNLSWEKIRTWNAGVDFGLFNNRLTGSFDYYIRKTDDMVGPSEKLPVALGTAVPSS NNTNLKTFGWELELMWKDRLNNGLNYSARFTLADSRTKITRYSNPSGLIDGFYAGKYVGE IWGYETIGIAQSDQEMAEHIGRLVNGGQSNLGQDWKAGDIMYKDLNEDGKIDAGSRTLQD HGDLKRIGNSTPRYNVGLELAADWKGFDIRMFWQGTLKRDYFQGSYYFWGANGRQGPWFS TALKGHDDYFRNDEASPLGTNLNSYYPRPLFNTDKNQQSQTKYLQDASYLRLKNLQIGYS LPHNIVQKMGMQNLRIFASGENLFTITSLIDFFDPESIEGGSWGHGNVYPLSRTFSFGLN VTF >gi|226332241|gb|ACIC01000079.1| GENE 72 82022 - 82957 704 311 aa, chain - ## HITS:1 COG:PM0152 KEGG:ns NR:ns ## COG: PM0152 COG0524 # Protein_GI_number: 15602017 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Pasteurella multocida # 2 300 3 303 306 202 42.0 6e-52 MKVVVIGSSNIDMVAQVNHLPAPGETVGDASFMQSLGGKGANQAVAAARLGGSVTFITSL GNDMYAEILKKHFKKEGITTDYIIDDVNQPTGTALIFVADSGENCIAVAPGANYSLLPGS IIHFSKVIDEADIIVMQAEIPYETIKRIALLAKQKGKKVLFNPAPACLIDEELMKAIDIL VVNELEAAFISGIEYTGNNLEEIALSLLQAGARNTVITLGSQGVYMKNDKEIIQLPGYKV NAIDTIAAGDTFCGALAVICAQREIDRDALSFANAAAAIAVTRSGAQPSIPTLDEVKHFM LEKELALSFNF >gi|226332241|gb|ACIC01000079.1| GENE 73 82998 - 83921 759 307 aa, chain - ## HITS:1 COG:HI0505 KEGG:ns NR:ns ## COG: HI0505 COG0524 # Protein_GI_number: 16272449 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Haemophilus influenzae # 8 306 2 300 306 230 45.0 3e-60 METISIHRPKIVVIGSCNTDMVVKANRLPVPGETILGGTFYMNPGGKGANQAIAAARLGA EVTFISKIGYDLFGLQALEIYRSEKINTEYIFTDQKSPSGVALISVDSFGENSIIVAPGA SRSLSIEDINKAEEKIKEADIILMQLEIPIDTVEYAATIACKYGKKVILNPAPASSLSNS FLRNVHTILPNRIEAEMLSGIKVMNIESAHRAAQAIGEKGIENVVITLGKDGAYVKEKDE YTMIPAKEVETIDTTGAGDVFCGAFSVCLSEGHSLAKSVKFANAAAAIAVTRIGAQSAIP YKREVVL >gi|226332241|gb|ACIC01000079.1| 
GENE 74 84149 - 85690 986 513 aa, chain - ## HITS:1 COG:no KEGG:BT_2802 NR:ns ## KEGG: BT_2802 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 513 1 513 513 919 99.0 0 MKRVDSVISLIYSLSKSEKKHFSLQVIKDKEEKDYLVIYDIVIKSKQPNGNTVKEEFYRR KPNGSFEVSIQYLYEKLTDTLLTLRKKKDIYYDLLNDLCKARMLYERSLFEECFEILSNT IEQAQFYENNEILTIALKLELEYLLRLNFPNMTEQELFHKHFIQNEALKKIRKITEQSSL HNLLKYRLSHIGSIRTVKQKQDMNDLMVNELYIAASSDSEGNFELTRNHKLFQANYLMGA GDYRAALNSYKELDSLFEQNQQFWSNPPIYYLSVLEGVLGSLRSVSNYDEIPYFLDKLRK LISDSTSLEFKVNATCLLFQYELFPYLDKGDFSKCTQLMADYQEILYDKEAWLSPIRKSE LLLYTTLVHIGNQEYKVAKKYISNAIIDHNIKYLPLMRTIRLVRLIVFYEVQEHELIQYE SRSITRSLSSPKEQTFKTERIILWFLNKRNIPILKKDREAFWEKLSPEIHELYNNKYESQ LLRLFDFTAWMESKIRKEKLSEVLRARTSAKEC >gi|226332241|gb|ACIC01000079.1| GENE 75 86007 - 86810 819 267 aa, chain + ## HITS:1 COG:PA3818 KEGG:ns NR:ns ## COG: PA3818 COG0483 # Protein_GI_number: 15599013 # Func_class: G Carbohydrate transport and metabolism # Function: Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family # Organism: Pseudomonas aeruginosa # 13 264 10 263 271 150 36.0 4e-36 MLDLKQLTAEVCRIATEAGHFLKEERKNFRRERVMEKHAHDYVSYVDKESEVRLVKALSA LLPEAGFITEEGSATYQDEPYCWVIDPLDGTTNYIHDEAPYCVCIALRSRTELLLGVVYE VCRNECFYAWKEGKAFMNGEEIHVSDVRDIKNAFVFTELPYNYDQYKPTALHLIDNLYGA VGGIRMNGSAAAAICYIANGRFDAWAEAFIGKWDYSAAALIVLEAGGRVTNFYGDDHFIE GHHIIATNGYLHPLFLKLLAEVPPLDM >gi|226332241|gb|ACIC01000079.1| GENE 76 86842 - 87549 248 235 aa, chain + ## HITS:1 COG:DR1389 KEGG:ns NR:ns ## COG: DR1389 COG1040 # Protein_GI_number: 15806406 # Func_class: R General function prediction only # Function: Predicted amidophosphoribosyltransferases # Organism: Deinococcus radiodurans # 11 218 14 204 219 94 30.0 1e-19 MKHTLLIKDWLYSFLSLWFPRCCVVCGGSLAKGEECICTMCNINLPRTDYHLRKDNPVEK LFWGKFPLERATSFFFYRKGSDFRQVLHQLKYGGQKEIGAIMGRYMASELQASDFFHGVD VIIPIPLHKKKQQIRGYNQSEWISRGITAVTGIPVDTEAIIRRKNTETQTQKSALERWEN VDGIFELHCSEHLAGKHILIVDDVLTTGATTVACASRLAEIEGVRISVLTLAMAE 
>gi|226332241|gb|ACIC01000079.1| GENE 77 88250 - 89239 829 329 aa, chain + ## HITS:1 COG:SMc02846 KEGG:ns NR:ns ## COG: SMc02846 COG0524 # Protein_GI_number: 15963924 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Sinorhizobium meliloti # 4 308 6 313 330 184 35.0 2e-46 MDKIIGLGNALIDVLATLKDDTLLDELGLPKGSMQLIDDAKLQQINTKFSQMKTHLATGG SAGNAILGLACLGAGTGFIGKVGNDHYGDFFRKNLQNNNIEDNLLTSEQLPSGVASTFIS QDGERTFGTYLGAAASLKAEDLTLEMFKGYAYLFIEGYLVQDHEMILHAIELAKEAGLQI CLDMASYNIVENDLEFFSLLINKYVDIVFANEEEAKAFTGEEPEEALRVIAKKCSIAIVK VGANGSYIRKGTEEIKVSAIPVEKVLDTTGAGDYFAAGFLYGLTCGYSLEKCAKIGSILS GNVIQVIGTTISPERWDEIKLNINKVLAE >gi|226332241|gb|ACIC01000079.1| GENE 78 89284 - 90981 1847 565 aa, chain + ## HITS:1 COG:aq_797 KEGG:ns NR:ns ## COG: aq_797 COG0793 # Protein_GI_number: 15606169 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Aquifex aeolicus # 7 409 2 398 408 237 36.0 5e-62 MKKLLNRRIALVLVAVIATVAFFSFKSGDDRNFQIAKNLDTFNAIVKELDMFYVDTIDPN KTIREGIDNMLFSLDPYTEYYPEEDQSELEQMVKGSFGGIGSLITYNTKLKRSMIAEPFE GTPAAKAGLKAGDILMEIDGKDLLGKNNSEVSQMLRGQAGTSFKLKVERPNEKGGQTPME FTIVRESIQTPAIPYATVMDNKVGYISLSTFSGNPSKDFKKALLDLKKQGATSLVIDLRN NGGGLLDEAVEIANYFLPRGKVIVTTKGKTKQASNTYKTLREPLDLDIPIAVLVNSGTAS ASEILSGSLQDLDRAVIVGNRTFGKGLVQVPRSLPYGGMMKVTTSKYYIPSGRCVQAIDY KHRNEDGSVGTIPDSLTNVFYTAAGREVRDGGGVMPDITVKQEKLPNILFYLVRDNLIFD YATQYCLKHPTVAAPEKFVVTDADYNDFKELVKKADFKYDQQSEKILKTLKEAAEFEGYM TDAAEEFKALEKKLNHNLDRDLDYFSKDIKRMIANEIIKRYYFQRGGIIEQLKDDDDLQE AIKVLNDPAKYKEMLSAPITKAEKK >gi|226332241|gb|ACIC01000079.1| GENE 79 91173 - 92591 1448 472 aa, chain + ## HITS:1 COG:XF1037 KEGG:ns NR:ns ## COG: XF1037 COG0499 # Protein_GI_number: 15837639 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylhomocysteine hydrolase # Organism: Xylella fastidiosa 9a5c # 33 472 1 446 446 650 68.0 0 MSTELFSTLPYKVADITLADFGRKEIDLAEKEMPGLMALREKYGESKPLKGARIMGSLHM 
TIQTAVLIETLVALGAEVRWCSCNIYSTQDHAAAAIAASGVAVFAWKGENLADYWWCTLQ ALNFPGGKGPNVIVDDGGDATMMIHVGYDAENDAAVLDKEVHAEDEIELNAILKKVLAED KTRWHRVAEEMRGVSEETTTGVHRLYQMQEEGKLLFPAFNVNDSVTKSKFDNLYGCRESL ADGIKRATDVMIAGKVVVVCGYGDVGKGCSHSMRSYGARVLVTEVDPICALQAAMEGFEV VTMEDACAEGNIFVTTTGNIDIIRIDHMEKMKDQAIVCNIGHFDNEIQVDALKHYPGIKC VNIKPQVDRYYFPDGHSILLLADGRLVNLGCATGHPSFVMSNSFTNQTLAQIELFNKKYE VNVYRLPKHLDEEVARLHLEKIGVKLTKLTPEQAAYIGVSVDGPYKAEHYRY >gi|226332241|gb|ACIC01000079.1| GENE 80 92717 - 95248 2081 843 aa, chain + ## HITS:1 COG:no KEGG:BT_2796 NR:ns ## KEGG: BT_2796 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 843 1 843 843 1648 99.0 0 MKKFLPDLIAILAFIILSFAYFFPADIEGRILFQHDTVAGVGAGQEAQEYLERTGERTRW TNSLFGGMPTYQMSPSYDSTKPLKWIENIYHLYLPSYVVLTFIMMLGFYILLRAFGLSVW LSALGGIIWAFSSYFFILISAGHIWKFVTLAYIPPTIAGIVLAYRKKYLLGGIVTALFIA LQIQSNHIQMSYYFMFVILFFVGAYFEDAYKKKELPHFFKASAVLALAAMIGVCANLSNL YHTYEYSKETMRGKSELKQEGPAANQTSSGLDRNYITNWSYGIGETLTLLVPNVKGGASV PLSKNETAMAKANPMYSNIYSQLTQYFGDQPMTSGPVYVGAFVLFLFILGCFIVKGPLKW ALLGATIFSILLSWGKNFMGLTDFFIDYVPMYNKFRAVSSILVIAEFTIPLLAVFALKEV LAKPEILKLKENRTGVIITLVLTAGVSLILAVAPGVFFSSYIPAQEMAALQQGLPAEYLS PIITNLAEMRKAMLTSDAWRSFFIIVVGCFLLFLYQQKKLKASFTMTGIVLLCLIDMWTV NKRYLNDEQFVSKSNQTGAFVKTQTDEIILQDTALNYRVLNFVGFPGNTFNENNTSYWHK SVGGYHAAKLRRYQEMIDHHITPEMKDAYQEVAGAGGAMDSVDASKFRVLNMLNTKYFIF PAGQQGQTVPVENPYAYGNAWFVDKVQYVNNANEEIDALNDILPTETAVVDAKFKDQLKG VTKGYKDSLSTIRLTSYEPNRLVYETSSAKDGVAVFSEIYYPGWQAKIDGQPVDIARADY ILRVMNVPAGQHTIEMWFDPQSLHVTESIAYASLALLLIGVMILIWFSKKKLITPKKKEA ENS >gi|226332241|gb|ACIC01000079.1| GENE 81 95526 - 96911 1224 461 aa, chain + ## HITS:1 COG:aq_1332 KEGG:ns NR:ns ## COG: aq_1332 COG1538 # Protein_GI_number: 15606535 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Aquifex aeolicus # 15 453 4 414 415 100 23.0 9e-21 MDSPILSQPIKFVPEYRAIIMKRLLFIFTFFCLLAIHGGAQNILTLENCLRIGIENNLSL 
QGKRKAMQKSKYGISENRVKLLPQINGFANFNDNIDPPVSVTDGSSYGVPYNITRTLQYS ANAGIELQMPLYNQTLYTSISIAEIVDEMSRLSYEKAREDLILQISKMYYLGQVTAEQIA LIKANITRLEELRDITQAFFDNGMAMEVDLKRVNINLENLKVQYDNAQAMMTQQLNMLKY IMDYPAEKEIGLLPVNTDSIATVALTGLSENLYELQLLQSQVQLAERQKRLISNGYIPSL NLTGNWRFAAYTDEAYHWFHSGPSNHWFRSYGVGLTLRIPIFDGLDKKYKIRKAMIDIET MKLSRLDTRKNLQTQYLNAVNDLMNNQRNFKKQKDNYLLAEEVYTVTTDRYREGITSMTE VLQDEMRMSEAQNNYISAHYNYRVTNLMLLKLTGQISSLFK >gi|226332241|gb|ACIC01000079.1| GENE 82 96936 - 98027 1009 363 aa, chain + ## HITS:1 COG:BMEII0793 KEGG:ns NR:ns ## COG: BMEII0793 COG1566 # Protein_GI_number: 17989138 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Brucella melitensis # 56 360 14 316 325 167 35.0 2e-41 METMENNTVSSTHQEKAKKMRKLRRWQIIVSLFGVAVIVWGVIEIVFLFLGYKQTETSND AQIEQYVSPVNLRASGYINKIYFTEHQQVRKGDTLLVLDDREYKIRVMEAEAALKDAQAG ATVINATLQTTQTTASVYDASIAEIEIRLAKLEKDRQRYENLVKRNAATPIQLEQITTEY EATFKKLEATRRQREAALSGVNEVSHRRENTEAAIQRATAALEMARLNLSYTVVLAPCDG KLGRRSLEEGQFITAGQCITYILPDTQKWIIANYKETQIENLRIGQEVAITVDAISKKEF TGKVTAISGATGSKYSLVPTDNSAGNFVKIQQRIPVRIDFTDLSKEDNDKLAAGMMVVVK AKL >gi|226332241|gb|ACIC01000079.1| GENE 83 98033 - 99670 1243 545 aa, chain + ## HITS:1 COG:no KEGG:BT_2793 NR:ns ## KEGG: BT_2793 # Name: not_defined # Def: putative MFS transporter # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 545 1 545 545 1006 100.0 0 MPSCPRNYPFYNWVPKPIGIIILLFFFLPILSVGGVYSVNSTEMMSGLGIISEHIQFANF VTSIGMAAFAPFLYELVCIRREKLMCIAGFALMYVFSYVCAKTDSIFLLALCSLLTGFLR MVLMMVNLFTLIKYAFGMEATRNITPGMEPKDATGWNKLDIEKCISQPVVYLFFMILGQM GTALTAWLAFEYEWKYVYYFMMGILLLSILILFITMPNHGFAGRFPINFRKFGNVTAFCI SLTCITYVLVYGKVLDWYDDPSICWATAISILFAGIFLYMDVTRRSSYFILGALRLRTIR MGALLYLLLMIINSSAMFVNVYAGVGMHLDNLQNASLANWCMVGYFIGALISIVLGSKGV HLKYLFALGFFFLALSAGFMYFEVQTAGLYERMKYPVIIRTIGMMILYALTAAYANQRMP FKYLSTWICIMLTVRMVLGPGIGGAIYSNVLQERQQHYITRYAQNVDLVNPEASASYTNT VQGMQYQGKSETEAHNMAAMSVKGRIQVQATLSAVKEMAGWTIYAGVICMIFVLVVPYPK RKLLT >gi|226332241|gb|ACIC01000079.1| GENE 84 99710 - 100582 493 290 aa, chain + ## 
HITS:1 COG:AGl645 KEGG:ns NR:ns ## COG: AGl645 COG2207 # Protein_GI_number: 15890441 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 161 287 177 301 327 70 33.0 4e-12 MDTQQERLLQLDRDLKSGKNICSSGGVLSCFPSSLKEPFLMQGLGLIVCRQGSFQFSLDN KCASAKAGETLFIHEESWVQVLQETEDVEILILAYQIEPIRDIIGNSVVSMYMYSKLTPE LSCVWNTGEEDEIMKYMSLIDSVLEMEESVFSLYEQKLLLLALTYRICSVYNRKLISAGQ ETGGRKNEIFVRLIQLIEQYYMQERSVEFYADKLCLSPKYLSALSKSICGYTVQELVFKA IIRKCISLLKNTQKNIQEISNEFGFPNASYFGTFFKKQMGMSPQQYRRNL >gi|226332241|gb|ACIC01000079.1| GENE 85 100589 - 101242 458 217 aa, chain - ## HITS:1 COG:MTH1114 KEGG:ns NR:ns ## COG: MTH1114 COG0035 # Protein_GI_number: 15679125 # Func_class: F Nucleotide transport and metabolism # Function: Uracil phosphoribosyltransferase # Organism: Methanothermobacter thermautotrophicus # 1 214 8 211 215 135 37.0 5e-32 MKVIDFSQTNSILNQYISEIRNVEVQNDRLRFRRNIERIGEIMAYEMSKEFRYSVKNIRT PLGIAPVSTPDNQLVISTILRAGLPFHQGFLSYFDGAENAFVSAYRKYKDTLKFDIHIEY IASPRIDDKTLIITDPMLATGGSMELSYQAMLTKGHPAEIHVASIIASQHAIDHIKSVFP EDKTTIWCAAIDPEINEHSYIVPGLGDAGDLAYGEKE >gi|226332241|gb|ACIC01000079.1| GENE 86 101524 - 103131 1633 535 aa, chain + ## HITS:1 COG:VC2738 KEGG:ns NR:ns ## COG: VC2738 COG1866 # Protein_GI_number: 15642731 # Func_class: C Energy production and conversion # Function: Phosphoenolpyruvate carboxykinase (ATP) # Organism: Vibrio cholerae # 2 535 10 541 542 811 73.0 0 MANLDLSKYGITGVTEIVHNPSYDVLFAEETKPGLEGFEKGQVTNMGAVNVMTGVYTGRS PKDKFFVKDETSENTVWWTSEEYKNDNKPVDAKCWAAVKDLATKELSNKRLFVVDAFCGA NENSRLKLRFIMEVAWQAHFVTNMFIRPTAEELANFGEPDFVIMNASKAKVENYKELGLN SETAVVFNLTEKIQVILNTWYGGEMKKGMFSYMNYLLPLNGMASMHCSANTDKEGKSSAI FFGLSGTGKTTLSTDPKRLLIGDDEHGWDDEGVFNFEGGCYAKVINLDKESEPDIWNAIK RDALLENCTVNAEGEINFADKSVTENTRVSYPIYHIENIVKPVSKGPHAKQVIFLSADAF GVLPPVSILNAEQTKYYFLSGFTAKLAGTERGITEPTPTFSACFGAAFLSLHPTKYGEEL VKKMEKTGAKAYLVNTGWNGTGKRISIKDTRGIIDAILDGSIDKAPTKVMPYFDFVVPTE 
LPGVDPKILDPRDTYECACQWEEKAKDLAGRFIKNFAKFTGNEAGKALVAAGPKL >gi|226332241|gb|ACIC01000079.1| GENE 87 103263 - 103820 475 185 aa, chain + ## HITS:1 COG:FN0712 KEGG:ns NR:ns ## COG: FN0712 COG2059 # Protein_GI_number: 19704047 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 6 184 7 186 186 122 39.0 5e-28 MIITYLKLFCTFAKIGMFTIGGGYAMIPLIEREIVKKNWMSKDEFMEMFALTQSLPGVFA VNISIFVGYKLHKVSGSLVCALATILPSFVIMMLIAMFFARFQDNEVMIRIFNGIRPAVV ALILFPCISAVRALHLKYVQLIAPVIATVLIWQFGLSPIYVVLAGISGGLVYTLWLKNKI KDIKA >gi|226332241|gb|ACIC01000079.1| GENE 88 103817 - 104341 486 174 aa, chain + ## HITS:1 COG:FN0713 KEGG:ns NR:ns ## COG: FN0713 COG2059 # Protein_GI_number: 19704048 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 1 174 1 173 176 122 43.0 2e-28 MIYLQLLWVYLKIGTFGFGGGYAMLSLIQHEIVDLHQWLTPQQFTDVVAISQMTPGPIGI NSATYVGYAVTHSVWGAVLATVAVCLPSFILVLLISYFFAKCKDNKYIKAAMSGLLPMSV ALIASAALLMMNRENFIDYKSIGIFAGAFLITWKWDLHPILLICLAGLVGVILY >gi|226332241|gb|ACIC01000079.1| GENE 89 104471 - 106270 1897 599 aa, chain - ## HITS:1 COG:DR1198 KEGG:ns NR:ns ## COG: DR1198 COG1217 # Protein_GI_number: 15806217 # Func_class: T Signal transduction mechanisms # Function: Predicted membrane GTPase involved in stress response # Organism: Deinococcus radiodurans # 5 594 4 593 593 662 55.0 0 MQNIRNIAIIAHVDHGKTTLVDKMLLAGNLFRSNQNSGELILDNNDLERERGITILSKNV SINYNGTKINIIDTPGHSDFGGEVERVLNMADGCILLVDAFEGPMPQTRFVLQKALEIGL KPIVVVNKVDKPNCRPEEVYEMVFDLMFSLNATEDQLDFPVIYGSAKNNWMSTDWQKPTD TITPLLDCIIENIPAPEQLEGTPQMLITSLDYSSYTGRIAVGRVHRGTLKEGMNITLVKR NGDMFKSKIKELHVFEGLGRVKTNEVSSGDICALVGIEGFEIGDTVCDFENPEALPPIAI DEPTMSMLFAINDSPFFGKDGKFVTSRHIHDRLMKELDKNLALRVRKSEEDGKWIVSGRG VLHLSVLIETMRREGYELQVGQPQVIFKEIDGVKCEPIEELTINVPEEYSSKIIDMVTRR KGEMVKMENTGERINLEFDMPSRGIIGLRTNVLTASAGEAIMAHRFKEYQPHKGEIERRT NGSMIAMESGTAFAYAIDKLQDRGKFFIFPQDEVYAGQVVGEHSHDNDLVINVTKSKKLT 
NMRASGSDDKVRLIPPVQFSLEEALEYIKEDEYVEVTPKAMRMRKVILDEIERKRANKS >gi|226332241|gb|ACIC01000079.1| GENE 90 106427 - 106696 446 89 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348195|ref|NP_811698.1| 30S ribosomal protein S15 [Bacteroides thetaiotaomicron VPI-5482] # 1 89 1 89 89 176 100 5e-43 MYLDAAEKQEIFGKYGKSNSDTGSTEAQIALFSYRISHLTEHMKLNRKDYSTERALTMLV AKRRRLLNYLKDKDITRYRSIVKELGLRK >gi|226332241|gb|ACIC01000079.1| GENE 91 106771 - 109395 1029 874 aa, chain - ## HITS:1 COG:no KEGG:BT_2785 NR:ns ## KEGG: BT_2785 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 874 1 874 874 1734 99.0 0 MRIRSIIIIFCLLSSIAVQIHSQQVADSIFATYFARSQAYADAYPREKAYLHLDNTSYYI GDTIWYKAYVVTAEQNQPSPISTPLYVELLDQLGNTSERQIIQLRDGEGEGQFILTKALF SGFYEIRAYTKWMLAFSEKQYFSRTIPVYRKRRNEVDENRSIITYRMDESMKQRPRSKEK DFMIRFFPEGGQLVEGVPSVVAFEATSKEEGTVGISGTVYGPDDVELAQFTTLHDGMGCF TYTPTDKQAKAVVTYQDKKYSFQLPKALPQGYVLNVTSKEKALVIKVLRNSTSLKDTLAL FISHQGRPLMYQTIDFKDQTVYHLPLSTQGLPGGVIQLSLMNSNGATLCERFCYVMPSPS LQLTATSTNALYYPFEPIEYHISMKNHTGQPIQGHFSVAVRDASKSDFQRFDQTIYTDLL LTSDLKGYIHQPGYYFANNSPRRRVALDNLLMIHGWRKYDISQAISATPEIPIYLPETRL TLHGQVTSFLRNKEQKDISVSVIARRDTISAAGTTVTDSLGFFHIPMDAFNGSMSAAIQT RRKGKKLNRKTNISLFRNFTPELRKYAYEELHPKWEDISGLSWWSSLSDSLYLDSIMGHD THLLDQVTIKAKRTKYKKILAFEQSVRAYYDIPKELDRMRDEGKFMDYFPWLMTTLNPNI QVKSKWKGDPPFITMKYKNHYILCVINGVLYSTFDRELFIDKNIDAIKSVMICDGTTASA GIDDDSAYHLSDESLKNHMETSRLSPFEEKALKSYLNNTTPNNQPGNQFAICYITTIDNW DPEKKYQSRGIRNTQIQGYSQPIEFYSPYYPDISKGQGNDHRRTIYWNPKVTTDENGVAI IRCNNANSSTFLTISCEALYNGQPAAINMHSIKY >gi|226332241|gb|ACIC01000079.1| GENE 92 109769 - 110344 690 191 aa, chain + ## HITS:1 COG:MTH659 KEGG:ns NR:ns ## COG: MTH659 COG1396 # Protein_GI_number: 15678686 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Methanothermobacter thermautotrophicus # 7 190 6 189 190 194 57.0 8e-50 MDTSKIVGEKIKALREDKSISIEELAQRSGLAIEQIERIENNIDIPSLAPLIKIARVLGV RLGTFLDDQDEIGPVVCRKKEAKDAISFSNNAIHSRKHMEYHSLSKSKADRHMEPFIIDV 
MPTEDSDFVLSSHEGEEFIMVMEGVMEISYGKNTYLLEEGDSIYYDSIVPHHVHAYEGQA AKILAVVYTPI >gi|226332241|gb|ACIC01000079.1| GENE 93 110378 - 112027 1523 549 aa, chain + ## HITS:1 COG:MTH657 KEGG:ns NR:ns ## COG: MTH657 COG0318 # Protein_GI_number: 15678684 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Methanothermobacter thermautotrophicus # 1 544 1 545 548 723 61.0 0 MQLFERTLGQWLEHWAEETPDKEYIVYSDRNLRFTWSQLNQRVDDMAKGLIAVGVERGTH VGIWAANVPDWLTLLYACAKIGAVYVTVNTNYKQAELEYLCQNSDMHTLCIVNGEKDSDF VQMTYTMLPELKTCERGHLKSERFPYMKNVIYVGQEKHRGMYNTAEILLLGNNVEDDRLT ELKSKVDCHDVVNMQYTSGTTGFPKGVMLTHYNIANNGFLTGEHMKFTADDKLCCCVPLF HCFGVVLATMNCLTHGCTQVMVERFDPLVVLASIHKERCTALYGVPTMFIAELHHPMFDL FDMSCLRTGIMAGSLCPVELMKQVEEKMYMKVTSVYGLTEAAPGMTATRIDDSFDVRCNT VGRDFEFTEVRVIDPETGEECPVGVQGEMCNRGYNTMKGYYKNPEATAEVIDKDNFLHSG DLGIKDEDGNYRITGRIKDMIIRGGENIYPREIEEFLYKLDGVKDVQVAGIPSKKYGEAV GAFIILQEGVEMHESDVRDFCKNKISRYKIPKYVFFVKEFPMTGSGKIQKFRLKDLGLQF CKEQGIEII >gi|226332241|gb|ACIC01000079.1| GENE 94 112116 - 113198 923 360 aa, chain - ## HITS:1 COG:CAC3072 KEGG:ns NR:ns ## COG: CAC3072 COG0836 # Protein_GI_number: 15896323 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannose-1-phosphate guanylyltransferase # Organism: Clostridium acetobutylicum # 9 340 5 337 350 245 38.0 9e-65 MTNMDNYCVIMGGGIGSRFWPFSRKTLPKQFLDFFGTGRSLLQQTFDRFQKVIPTENILI VTNAMYADLVKEQLPELDEKQILLEPARRNTAPCIAWAAYHIRALNPNANIVVAPSDHLI LKEEEFRAAIIKGLEFVSHSEKLLTLGIKPNRPETGYGYIQIDEPTGDGFYKVKTFTEKP ELELAKVFVESGEFYWNSGLFMWNVNTIIKAAEILLPELASKLAPGKNIYATDKETAFIE ENFPACPNVSIDFGIMEKADNVYVSLGDFGWSDLGTWGSLYDLSERDREGNVTLKSHSLI YNGKDNMIVLPKGKLAVIDGLEGYLIAESDNVLLICRKDEEHSIRKYVNDAQMQLGDDFI >gi|226332241|gb|ACIC01000079.1| GENE 95 113713 - 114384 286 223 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253569508|ref|ZP_04846918.1| ## NR: gi|253569508|ref|ZP_04846918.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 223 1 223 223 389 100.0 1e-107 MRKEIIQAIIKPYMKSKSFQNKGMRFTKDMGLFQIEVEIQSQRYYKEEHTENFRINYFLF CREFTELSGKKLSFAGGSIAEEKSSWIKINPQTDIERLKSWLLCELDSMLNKLENKYSIE YLLDIWSKYDTDLQYPFLLAKNNPKQLDEWTDEMKFEIQKLDAKLLELSSEKSEQEKRDD CLDKEMRLDGIRMKINRLVSSRKNILETLDFIQSNANKIEITA >gi|226332241|gb|ACIC01000079.1| GENE 96 114454 - 114759 172 101 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253569509|ref|ZP_04846919.1| ## NR: gi|253569509|ref|ZP_04846919.1| predicted protein [Bacteroides sp. 1_1_6] # 17 101 1 85 85 157 98.0 3e-37 MIQVYFITDRHKLEHPLFGITEQELQVMNNLPKGVLPDIDPYGDTIFYIDHIKFVCDNIQ IMIENKELQFCYDKEYKQLKELYEKFKSVIPKETALFLVGD >gi|226332241|gb|ACIC01000079.1| GENE 97 114881 - 115147 280 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253569510|ref|ZP_04846920.1| ## NR: gi|253569510|ref|ZP_04846920.1| predicted protein [Bacteroides sp. 1_1_6] # 1 88 1 88 88 126 100.0 4e-28 MQESISNFSEKISVFLDKNHDIAFLIFIGVFVFLLIGNILNWKWTYEPKSWVGYHWLELL GPTSYRFWHAVSLVLIIIILAVIYFTTY >gi|226332241|gb|ACIC01000079.1| GENE 98 115200 - 115535 253 111 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253569511|ref|ZP_04846921.1| ## NR: gi|253569511|ref|ZP_04846921.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 111 1 111 111 210 100.0 3e-53 MKDIIELLQKERIKTVDALKNGNKQELSYLQQIDKALGWLKLIEEKGLENVGCYDIHSLP DLPPKSRGIYNYYHLMMDYEDPSIENWREYKPDGQPLLLMYDDIVITRKGR >gi|226332241|gb|ACIC01000079.1| GENE 99 115617 - 116105 405 162 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253569512|ref|ZP_04846922.1| ## NR: gi|253569512|ref|ZP_04846922.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 162 8 169 169 298 100.0 9e-80 MKKNITYFGEVEINSPQETTEGKVVIDNLQIELDLNFYDGRPEYDWVAEYENYIKDLKQH KANVEAAIRSDYKDGGDVKEYVDFHLEELDVSIIDKVLAGTDASKPKEERLLTVLKLVRV GFYPGDEDYAIWDYTIGREIADMLVVVNTDNTGKINYVTWEN >gi|226332241|gb|ACIC01000079.1| GENE 100 116297 - 117700 1029 467 aa, chain - ## HITS:1 COG:no KEGG:BVU_1439 NR:ns ## KEGG: BVU_1439 # Name: not_defined # Def: mobilization protein # Organism: B.vulgatus # Pathway: not_defined # 1 467 1 467 467 666 74.0 0 MATKSSIHIKPCNIASSEAHNRRTAEYMRHIGESRIYVVSELSTDNEQWINPDFGSPDLR THYDNIRKMVKEKTGRAMQEKERERKGKNGKTVKIAGCSPIREGVLLVRSDTTLADVRKF GEECQRRWGITPLQIFLHKDEGHWLNGQPEVEDRESFKVGERWFKPNYHAHIVFDWMNHE TGKSQKLNDEDMTEMQSMASDILLMERGRSKVVTGKEHLERNDFIIEKQKAELQRMDAAK RHKEEQINLAEQELKQVKSEIRTDKLKKTATTAATAITSGVASLFGSGKLKELEHANEKL QDEVSKRDTSIEKLQSQIQQMQKQHDTQIHNLREMHRQELDMKEKELSRLARIIDKAFRW FPMFREMLRMEKFCAMLGFSKEMTESLLVKKEALKCSGKIYSEQHRRNFDIKDDILRVEN DPNDENRLNLTINRKPIADWFREQWHKLRYGIREPQQKERQGRGMKL Prediction of potential genes in microbial genomes Time: Thu May 12 01:21:38 2011 Seq name: gi|226332240|gb|ACIC01000080.1| Bacteroides sp. 1_1_6 cont1.80, whole genome shotgun sequence Length of sequence - 55371 bp Number of predicted genes - 43, with homology - 43 Number of transcription units - 17, operones - 12 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 63 - 1013 529 ## BVU_1440 DNA primase - Prom 1080 - 1139 3.7 2 2 Op 1 . - CDS 1245 - 2417 869 ## BDI_2140 hypothetical protein 3 2 Op 2 . - CDS 2440 - 2754 217 ## Amuc_0323 phage transcriptional regulator, AlpA - Prom 2793 - 2852 4.3 - Term 2802 - 2839 6.2 4 3 Op 1 . - CDS 2924 - 3874 694 ## gi|253569517|ref|ZP_04846927.1| conserved hypothetical protein 5 3 Op 2 . - CDS 3881 - 5215 832 ## PG0838 integrase - Prom 5346 - 5405 4.5 + Prom 5516 - 5575 6.2 6 4 Tu 1 . + CDS 5612 - 6421 553 ## COG3177 Uncharacterized conserved protein - Term 6342 - 6383 3.2 7 5 Tu 1 . 
- CDS 6418 - 7608 569 ## COG2311 Predicted membrane protein
- Prom 7644 - 7703 3.9
8 6 Op 1 . - CDS 7809 - 8585 271 ## COG2227 2-polyprenyl-3-methyl-5-hydroxy-6-metoxy-1,4-benzoquinol methylase
9 6 Op 2 . - CDS 8664 - 10313 579 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
- Prom 10505 - 10564 5.3
+ Prom 10469 - 10528 7.0
10 7 Op 1 . + CDS 10576 - 12528 1256 ## BT_2777 hypothetical protein
11 7 Op 2 . + CDS 12539 - 13702 795 ## BT_2776 hypothetical protein
- Term 13678 - 13726 0.2
12 8 Tu 1 . - CDS 13827 - 15857 1142 ## COG5545 Predicted P-loop ATPase and inactivated derivatives
13 9 Op 1 17/0.000 - CDS 16539 - 18020 472 ## COG0515 Serine/threonine protein kinase
14 9 Op 2 . - CDS 18033 - 19097 721 ## COG0631 Serine/threonine protein phosphatase
15 9 Op 3 . - CDS 19103 - 20842 1035 ## gi|253569529|ref|ZP_04846939.1| conserved hypothetical protein
16 9 Op 4 . - CDS 20859 - 22133 419 ## COG0515 Serine/threonine protein kinase
17 9 Op 5 . - CDS 22130 - 22957 373 ## gi|253569531|ref|ZP_04846941.1| conserved hypothetical protein
- Prom 22982 - 23041 4.0
18 10 Op 1 . - CDS 23046 - 23816 597 ## gi|253569532|ref|ZP_04846942.1| conserved hypothetical protein
19 10 Op 2 . - CDS 23800 - 25344 518 ## COG0515 Serine/threonine protein kinase
- Prom 25469 - 25528 10.9
+ Prom 26134 - 26193 3.3
20 11 Op 1 . + CDS 26213 - 27598 661 ## gi|253569534|ref|ZP_04846944.1| predicted protein
+ Term 27619 - 27660 7.3
21 11 Op 2 . + CDS 27673 - 28524 428 ## BT_0198 hypothetical protein
22 11 Op 3 . + CDS 28537 - 30483 735 ## Kfla_2319 hypothetical protein
23 11 Op 4 . + CDS 30535 - 31287 330 ## COG3279 Response regulator of the LytR/AlgR family
+ Prom 31346 - 31405 5.8
24 12 Op 1 . + CDS 31500 - 32183 564 ## BT_2762 TonB
+ Term 32208 - 32245 -0.1
+ Prom 32197 - 32256 2.5
25 12 Op 2 . + CDS 32278 - 32916 415 ## BT_2761 hypothetical protein
26 12 Op 3 . + CDS 32968 - 33531 401 ## COG2096 Uncharacterized conserved protein
27 12 Op 4 . + CDS 33596 - 33817 389 ## PGN_1678 hypothetical protein
+ Term 33840 - 33909 18.0
+ Prom 33875 - 33934 5.9
28 13 Op 1 . + CDS 34016 - 35374 851 ## BF4274 tyrosine type site-specific recombinase
29 13 Op 2 . + CDS 35382 - 36014 186 ## BF4273 hypothetical protein
+ Prom 36025 - 36084 5.3
30 14 Op 1 . + CDS 36254 - 36586 219 ## BF4272 hypothetical protein
31 14 Op 2 . + CDS 36528 - 38039 472 ## BF4271 hypothetical protein
32 14 Op 3 . + CDS 38052 - 39146 318 ## BF4270 hypothetical protein
33 15 Tu 1 . + CDS 39267 - 40823 695 ## BF4269 putative protein involved in recombination
+ Term 40835 - 40881 3.3
+ Prom 40868 - 40927 4.9
34 16 Op 1 . + CDS 40985 - 41185 241 ## BDI_2133 hypothetical protein
35 16 Op 2 4/0.000 + CDS 41216 - 42742 1165 ## COG0286 Type I restriction-modification system methyltransferase subunit
36 16 Op 3 27/0.000 + CDS 42747 - 44285 947 ## COG0286 Type I restriction-modification system methyltransferase subunit
37 16 Op 4 11/0.000 + CDS 44288 - 45895 444 ## COG0732 Restriction endonuclease S subunits
38 16 Op 5 . + CDS 45916 - 48849 1688 ## COG0610 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases
39 16 Op 6 . + CDS 48873 - 50078 357 ## Pnap_4263 hypothetical protein
40 16 Op 7 . + CDS 50032 - 50388 153 ## gi|153805911|ref|ZP_01958579.1| hypothetical protein BACCAC_00151
41 16 Op 8 . + CDS 50375 - 52750 603 ## Pnap_4262 hypothetical protein
- Term 53001 - 53058 13.3
42 17 Op 1 . - CDS 53072 - 54223 1177 ## BT_2758 hypothetical protein
43 17 Op 2 .
- CDS 54257 - 55315 1020 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D Predicted protein(s) >gi|226332240|gb|ACIC01000080.1| GENE 1 63 - 1013 529 316 aa, chain - ## HITS:1 COG:no KEGG:BVU_1440 NR:ns ## KEGG: BVU_1440 # Name: not_defined # Def: DNA primase # Organism: B.vulgatus # Pathway: not_defined # 1 316 1 318 318 515 80.0 1e-145 MTTEEAKKIRIADYLHSLGYSPVKQQGVNLWYKSPLREENEPSFKVNTEREQWFDFGLGK GGNIIALAAHLYATESVPYILKRIEEQTPHVRPVPFSFHRQSATEPSFQQLDIVQLSSPA LLSYLQERGINTALAKRECREAHFTNNGKRYFAIAFPNISGGYEIRNRYFKGCIAPKEIS HIRQSGEPRKACYVFEGFMDYLSFLTLRLESCPQSPDFDRQDYMILNSVANVSKALYPLG SYERIHCFLDNDRAGMEALQQIRKEYDNARYIRDASHIYSGCKDLNEYLQKRAETKKQAQ SIKVKTPPNKPGGFRL >gi|226332240|gb|ACIC01000080.1| GENE 2 1245 - 2417 869 390 aa, chain - ## HITS:1 COG:no KEGG:BDI_2140 NR:ns ## KEGG: BDI_2140 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 390 24 413 413 629 77.0 1e-179 MRVGTTLYKVVNQPCASGGYEKRRVIWNNSTLRQDYGKNYLATVPKYDGFCTVPDHLNYR KEIDGFLNLYEPIEHTPQIGDFPNIRSLMLHIFGEQYNLGLDYLQLLFLQPLQKLPILLL VSEERNTGKSTFLNFLKAVFGDNVTFNTNEDFRSQFNSDWAGKLLIVVDEVLLNRREDSE RLKNLSTTFNYKVEAKGKDRTEISFFAKFVLCSNNEYLPVIIDAGETRYWVRKINPLQDD DTNFLQKLKEEIPAFLFFLTQRELSTEKESRMWFNPKLIHTAALQKIIRSNRNRLEIEMA ELFLDIMSNMNVESVSFCLNDLMTLLIYSQVKAEKHQVRKVVQEVWKLTSAPNSLSYTAY EIAPHRDCHYETKRKIGRFYTITKEQLTAI >gi|226332240|gb|ACIC01000080.1| GENE 3 2440 - 2754 217 104 aa, chain - ## HITS:1 COG:no KEGG:Amuc_0323 NR:ns ## KEGG: Amuc_0323 # Name: not_defined # Def: phage transcriptional regulator, AlpA # Organism: A.muciniphila # Pathway: not_defined # 2 101 5 105 106 61 36.0 1e-08 MTDLMAIIQNGNGNIKLEVTGEDLLMFSNQLISRAKHELSTAIAEARKEKYLTKEEVKKM CDVCDTTLWHWAKKNYLKPVKVGNKVRYRQSDIQKILGERNPSI >gi|226332240|gb|ACIC01000080.1| GENE 4 2924 - 3874 694 316 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253569517|ref|ZP_04846927.1| ## NR: gi|253569517|ref|ZP_04846927.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 316 1 316 316 568 100.0 1e-160 MGQQEYDNFKRLIREWLDSHPDEYAAFVEEMNDKEFKGFVKVFKVATTLVPRYKEATRRR IGDDKISDFEELENVLLDSDLAKKIVNEFHHSKRRSIIPAMLAWLYYGRSYECMVEQGEE LAKRKGITRLYKWLVSYMVRFIIKKSISSGMRTKEDWVAFRKQQKAIEENNLVEWSIEDE EDIEETETEDTDSLQQGQPKSAGRKADTRTLPELLIENRDVILERIGTRLKTHTTETDIA RMYIALVEYRFMRSCPIKTFRNALQQQYTDLSIVHERGIQKAYRNLTSPFGGSKKLIKDI GEDHVAIEELKAYLSA >gi|226332240|gb|ACIC01000080.1| GENE 5 3881 - 5215 832 444 aa, chain - ## HITS:1 COG:no KEGG:PG0838 NR:ns ## KEGG: PG0838 # Name: not_defined # Def: integrase # Organism: P.gingivalis # Pathway: not_defined # 218 438 202 424 432 91 30.0 5e-17 MKATAFIRKTASKNDTNSVATIYFRLRDGKKDIKAASELIINPNHWSSEKQGYKDRVALV SEEKKQALNNEVQNILNLINQNYTPESDSEWLSTLIDKYHHPNRYKTKEEIEAENKPMLL ELFSEFLVKHKLSEVRKKNFRVIQRTLARYELYVRATKKGQKDFILDVDTVTPETLRDMW DFFENEYQYYELYPSIYETIPEKRTPQPRGKNTLIDCFSRIRTFFLWCFDNKLTANRPFD KFPIEECKYGTPIYINLEERDRIFNADLSATPQLEIQRDIFIFQTLIGCRVSDLYRMTRL NIVNGAVEYIQKKTKEGNPVTVRVPLNDKAKEILERYKDHEGKLLPFISEQKYNEAIKKI FKLSGVDRIVTILDPLTRDEVKKPLYEVASSHLARRTFIGNIYKKVKDPNLVSALSGHKE GSKAFRRYRDIDEDMKKDLVKLLD >gi|226332240|gb|ACIC01000080.1| GENE 6 5612 - 6421 553 269 aa, chain + ## HITS:1 COG:mlr2757 KEGG:ns NR:ns ## COG: mlr2757 COG3177 # Protein_GI_number: 13472455 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 39 255 41 243 263 87 31.0 2e-17 MEKSIWQEIEKLYQEFQKLGIGQSVDYEKYYLYSLITHSTAIEGSTLTEMDTQLLFDEGV TAKGKPLVYHLMNEDLKNAYELAKEESVQGVDITSSLLQKLNATLMRTTGSVHNTIGGSF DSSKGEFRLCGVTAGVGARSYMSYQKVSAKVEELCSILQERQKLMNTLREQYELSFNAHL NLVTIHPWVDGNGRAARLLMNYIQFCYRLFPAKIFKEDRADYILSLQQSQDEETSQPFLN FMATQLKKSLSLEIERDHTFRERGFSFMF >gi|226332240|gb|ACIC01000080.1| GENE 7 6418 - 7608 569 396 aa, chain - ## HITS:1 COG:BS_yxaH KEGG:ns NR:ns ## COG: BS_yxaH COG2311 # Protein_GI_number: 16081049 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 10 394 10 395 402 125 28.0 2e-28 MHLSNTLPHNNRIEIADVLRGFAVMGITLIHFIERFSLNSFPEETCNFLIFTDRIIWDSV 
FFTFSGKAYCIFALLFGFSFFIQDNSQKLKGKDFRGRFAWRLVLLLFIACINSALFPGEI LVLYALLGYVLIAVCRLSTRTVTIIAVILLLQPIELGQITYALINPDYMINADLDAPYWE LVNAAQKEGSFLEMCKTAIWTGNVANMGWMLLHGRVTQTAGLFMVGMIIGRSNVFLYSEK NIKIWLKVFIIAVTAFFPLYGLIDILPDFISREALLVPSVLLLKSLSNIAFTGIMFAGII LVYYLTKLKHVLHKFAPYGRMSLTNYLSQSLIGGFLFYNWGLGLYLHTGITACILIGVCV FFLQYFFCNWWLRSHRQGPLEWLWKKATWINWKPNN >gi|226332240|gb|ACIC01000080.1| GENE 8 7809 - 8585 271 258 aa, chain - ## HITS:1 COG:RC0965 KEGG:ns NR:ns ## COG: RC0965 COG2227 # Protein_GI_number: 15892888 # Func_class: H Coenzyme transport and metabolism # Function: 2-polyprenyl-3-methyl-5-hydroxy-6-metoxy-1,4-benzoquinol methylase # Organism: Rickettsia conorii # 37 135 104 201 289 63 37.0 4e-10 MQKRHTNRERYFEEQAQTTRNYYIPYIKEYTGNLPNKVLEVGCGEGGNLLPFAELGCDVI GIDIAASRIEQAQNFFITKRQKGTFIASDIFLLKDLQKHFPLILIHDVIEHIDNKEQFLR NLKKYLSPNGMIFIAFPAWQMPFGGHQQIARSKIISHMPFIHLLPRMLYKWILELFSEQE STVKELLTIKYTRCTIEMFRRVVKQTDYQIVNEQLYFINPHYKIKFGLTPHKLNKTIACI PFIRNVFSTSCFYLIRPT >gi|226332240|gb|ACIC01000080.1| GENE 9 8664 - 10313 579 549 aa, chain - ## HITS:1 COG:mll8140 KEGG:ns NR:ns ## COG: mll8140 COG1595 # Protein_GI_number: 13476734 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 17 174 14 177 208 67 25.0 9e-11 MDIKELVDLCKKGNEQALSLLYKTYSKKMMGICLHYIPDKQIAQDLLHDGFIVIFTSIET LRTPEKLESWMGIIMKNISLRYLNQKSTNSAIPLSDIPENEEPIDNSLSSDSIPYDKITE MVEKLPEGYSKIFKLAVLEGLSHKEIGKLLNIAPHSSSSQLFRAKVLLKKMISDYRLILI LVILFFFPTIHDYLYWKRKETKDNHWSNNVIRKVKLDKKEENESITSSGHISHYKEEPPF NTTLANLPRLALDTSATGLSTTSEIKDSVEKQPVAKPSFTWEHAEKIITFSYSSQEFTSL PKKKIGKWKMMLAGSVGPQLAQNLYKLMTTPHSDGLESGLPQQVNTWEEYYDYLNTRNQE GTLGDSLTLMTIAKNNSGRIIEHQHHNSPITIGLALNKKLNNHWSIETGLQYIYLKSEFT TGKEYRIQETQKLHYIGIPLRISYRFGNYKQFSFYSTAGLQMEIPIKGTLHTSHVTDSVP INWGYQSLDVPLQWSINASTGVQYHFTPHTSIYIEPTINYYIPDGSSLRTIRKEHPVTFS VPVGIRFSW >gi|226332240|gb|ACIC01000080.1| GENE 10 10576 - 12528 1256 650 aa, chain + ## HITS:1 COG:no KEGG:BT_2777 NR:ns ## KEGG: BT_2777 # Name: 
not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 650 1 650 650 1234 99.0 0 MGKYKFIEERVETMSSSELKIFLNILKSRSKELMSTLESVREIKDLEAVCKSTQKLNNRV SEFDGGSFLILVVGPVKSGKSTLVNLIAHAHVSPTHFLECTVRPSIISKGQEESITRIFS VADKARKVEQFDSVIDSLRGFEKLENISEISIEKKELNDANLEECVSLALEESVINTDHT LVTSITTSGGELLKDNIFLVDMPGLDGGYANLDNPIYETISQRADFVIFVQSSNSAISKV SNRFLKLLQENNPKVPVCLLHNVFEAAYWHSEEEKQAVIKAQVEFAENEMGRRNFQLKEN IYSLNLGKVADAASYEGMEDLQEEAAKFQRFEKDLYGKVISNSTDIRLRSVIGRTLNQLE KLDNLVGESIDQQEAIDIEYQAAATSFDELLRQASELSYDGSDFRKMVQVYMDDDLKEFT EIVREYYDSGCNSAFAGGAKGKDATRSLVTKFMTDSSAAIKSKFLDSHEHSLARQYLRKL SNLKDDVAFIEKMNTFLKQKGCAPFDSVPQVTDFPVVDFDLQGAFDVKNINNPTVPHIGI PNPFGTYSTMEIQSFLRKAMECITGIDHTHTGKTIKGYLQLTVGKSIYKAANDLYDQYVP VRKQSIIDFLEQQKTMYLNALVSDKEEFDRKDALLRSINAQVQSFKDSIR >gi|226332240|gb|ACIC01000080.1| GENE 11 12539 - 13702 795 387 aa, chain + ## HITS:1 COG:no KEGG:BT_2776 NR:ns ## KEGG: BT_2776 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 387 1 389 389 659 94.0 0 MKRNLLIITLLFLGSFVILLLGNIIIIGEKIASLSQVAWAEYAFYTLILVVFYAVVIRPV VRVHRAPQIPALSIDGEWNTAQLVAFGHKLADNCNYIPKDKSCPELRKLHQQRLREDLER YATSREELVQIISEELKLRIEGGELKETSGPRIEDMYSVRIIGINRRIIEWAKSVFMITA ISQNGKLDTVSVLYMNYKMIEDVIMASGFRPTRQQLFRQYVNILVTSLMTFVASEVFKDM GSVTPFGSLADQSSDAASDIDISDASVDAADVDLDDIGDTVSGDTGFLSILSNVKIPGVV IGSICDGIVNSLMTLRIGYVTRNYLIDGMNSLNGIKAKRKAKRVAVKEALKSLPKVVVVG TSFVGKGAMNIILNIIGGKKVKETNEN >gi|226332240|gb|ACIC01000080.1| GENE 12 13827 - 15857 1142 676 aa, chain - ## HITS:1 COG:L109011 KEGG:ns NR:ns ## COG: L109011 COG5545 # Protein_GI_number: 15672499 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Lactococcus lactis # 372 609 142 382 480 64 23.0 5e-10 MKHETKAQPISNLRTSIRYASPDAKLDDAKKLAKVIPAAAFRKTANGIQMTAYNGIIQIE VNHLANLMEVNRVKQEAEELSQTYLAFMGSSGHSVKIWVRFTRPDKSLPKNREEAEIFQA HAYRKAVSLYQPILSYSIELKNPALEQFCRQTYDPELYYNPDSTIMYMRQPMEMPSETTY 
QEAVQAETSPFKRLIPGYDSLETLSALFEVALNKACQSLSELQPGIYPRSDEDLKPLLVQ LAENCFQAGIPEEETARWAIAHLYRQKKEFLIRQTVQSVYTIAKGFGKKSPLSAEQELEL RTEEFMQRRYEFRYNTMTTVTEYRERNTFCFYFRPLSSRVRNSIAMNARLEGLSLWDRDV VRYLDSDRIPIFNPIEDFLFGLDVRWDGHDRIRELAARVPCNNRHWADLFYRWFLNMVAH WRQTDRKYANCTVPLLVGPQAYRKSTFCRSLLPPELQAYYTDRIDFSNKRDAEISLNRFA LINMDEFDQNRVNQQAFLKHILQKPIVNVRRPHGTATQEMRRYASFIGTSNHKDLLTDTS GSRRYIVVDVTGPIDCSPIDYEQLYAQAMHDLYKGERYWFDPEDEKVMNESNQEFQVMPI AEQLFHEYFRAATEGEECEQFLAIEILEQVQHDSKIRVSDCNIIQFGRILQKNRVPSVHT KRGNVYRVVRIKAKRE >gi|226332240|gb|ACIC01000080.1| GENE 13 16539 - 18020 472 493 aa, chain - ## HITS:1 COG:YAR019c KEGG:ns NR:ns ## COG: YAR019c COG0515 # Protein_GI_number: 6319328 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Saccharomyces cerevisiae # 12 256 24 265 974 108 31.0 3e-23 MQFNSGDIFAEHYQLEKKLGTGSFGEVWLARNTLADVEVAIKFYGLLDDNGIKDFREEFK LAYKLHHPNLLHLNHFDVFERCPFLIMPYCSNGSSATLIGKMSEKQIWHFIRDVSCGLMF LHSQNPPIIHQDIKPDNILIGDDGKFIISDFGISRKLEHTFRKGTNKVKSSGTLAYMGPE HFSEKPFIAGTSDIWSLGMSIYELLTGHILWEGMGGCVQLNGARIPALDNGYSPQLDQFL HACLSLNTWDRPTAQQAYNYANSMLRQEVNHPHSPSAVHASTPPLPPAPSRLQKRIRLST INKRMAGWIILGILFLFMLIKGVSAFFGSIEEERRYNACRTTEDFRHFLNRYPASSHAKF VKQKICMLVEDSIKKEIKKNRVIAPVSVDTIEPEQPKKEEIVVKHPSWSRPKPKVEIVSP APDTKKVKREDEERMFLNCQTITDYEKYLRQYPKGKFAYKAKRAIQQIESGMMESHPTQE IHVKKSTRVSFGL >gi|226332240|gb|ACIC01000080.1| GENE 14 18033 - 19097 721 354 aa, chain - ## HITS:1 COG:BH2505 KEGG:ns NR:ns ## COG: BH2505 COG0631 # Protein_GI_number: 15615068 # Func_class: T Signal transduction mechanisms # Function: Serine/threonine protein phosphatase # Organism: Bacillus halodurans # 12 273 7 238 249 122 32.0 8e-28 MSEIKIKLAAGTDVGLVRKNNEDNFVVNRDLSQSEWIIPQSSETIPLGKYGTLLVVADGM GGTNAGEVASAIAIETVQNAFTPEKLDKIVTLEGEMATEEAIEEFLTKTVKAADLNIVNA SKEDSSTQGMGTTIVIAWILNEKAYISWCGDSRCYVFNANSGFCRLSKDHSYVQDLVDQG KLDPENAFDHPYNNVITRCLGDPTNRSNPDFRSYNLKDDDIFLLCSDGLCGLCHDEEIMQ 
IIEENQNDLVVCKDQLIEAALAVGGYDNVTIVLCHIIQKETDEPKANLNNTVFSKPNNHK FRKIVLLLFVLALLLGGYLYKNPQLSAKWKAKIFPTDTVIVTETDTSTISPQPD >gi|226332240|gb|ACIC01000080.1| GENE 15 19103 - 20842 1035 579 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253569529|ref|ZP_04846939.1| ## NR: gi|253569529|ref|ZP_04846939.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 22 579 1 558 558 1121 100.0 0 MDTQNYKRTLAGSVGAGMGALMGASGRTYFILEHKTPSKYHQAGESQKIIVDQVELGRDA SCQVRFDESFETVSRKHAAIVRDGENWKLIHLSHSNPTLVNGHPINGSYYLQTGDEIQLS VGGPRLGFIVPQGRQALTSSIGLTERMSLFRKQALRPYKTALTIMGIVFVLAVAGLAAWN YSLGVKNDLLAQKTEQQEVQLRGYQNQLDSLGTERAALETQQRELEAKLRTSDGNTEALK SQLANVNNQLGNVNASFYKVKSDLDNLSSTIAAENATEEVEEASYKPQTTQQARSHQDNV ESDVVEEQDHNASGDLKDYYDHIYTIKVDHIEIEYNGSKFDPGIQVSDIICGTGFMLSDG TFVTDRQNVEPWIFSNTEWKDSWRRLLAIYRGAGCNIILHYRAYSTKGTGKPLTFTNADF SRNQLYDTTFTEIEVEKSIRVQLREHGIKLAYKRRNYVQVSIVTPAAYSWAYIRGKGTFG EGLPFDTTAANNMNGGTEIQMVGYAGYADIHNLTPAHFNDHTNISDTKYRTIVLQNRVAE PGYYGSPAFLLENGTYQVIGFMVGSIEGKDRLVPIGNIK >gi|226332240|gb|ACIC01000080.1| GENE 16 20859 - 22133 419 424 aa, chain - ## HITS:1 COG:Cgl2127_1 KEGG:ns NR:ns ## COG: Cgl2127_1 COG0515 # Protein_GI_number: 19553377 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Corynebacterium glutamicum # 10 226 3 212 418 115 34.0 2e-25 MTNQSVDRCDYNVGEYIDNRYRVSKTLGEGSFGKVYRVTDNAGKDYALKLLRLWEVPPEI RKPLTDRFEMEFKTGQIDCEYLVSSIDYGTVGGNPYIVMEYCPGGDLTPYLGSQNSKIPL ICQQILLGLHSLHIRGKVHRDLKPENVLFKSNGVAALTDFGIAGDRNKRMTERNIFGKPN QIFGTYAYMPPEQVNRMRGEATVLPTTDIFSFGVLAFQLLTGSLPFGELTSHNELALYQK RGKDGDWNRRKLVQIKNGKEWDRLFSGCLNPDFKNRFQSVKEVLKYLPETQQVPMERGYC PPQTVQGFCMRIMQGEEYGKLYQLSEFATHGCRILTIGRESDNLLPVTERQSTYVSRHHC TLETTPSGDHWIIRDGQWQKEQKVWQESSNGTYVNSAQVTQQGYRLKVGDIISIGDVKMR FENY >gi|226332240|gb|ACIC01000080.1| GENE 17 22130 - 22957 373 275 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253569531|ref|ZP_04846941.1| ## NR: gi|253569531|ref|ZP_04846941.1| 
conserved hypothetical protein [Bacteroides sp. 1_1_6] # 24 275 1 252 252 479 99.0 1e-134 MRCMNCGWDNLSDTTVCLKCGQLLSISTGYANNQNDFVSREGYGEAIPKPTVLNANPNRE KCRKTVVFPTLNEPDTHSAITQPATECPNCSYPIVGEFTSCPCCGTPLERADKSADDKHK KIKEKTSFLPEDKMTCQECGKEIDITCSFCPHCGVRVHRPTIRRQIKNIAETEPEDITPH CSLTIIPEENENIPSEKMEYEGKSIILNRENTEVSNRTITSKEQAEIVFEDGHWYLLDRS ELRTTFIQANRKIEIIPDDIIVLGDRRFKFESDSL >gi|226332240|gb|ACIC01000080.1| GENE 18 23046 - 23816 597 256 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253569532|ref|ZP_04846942.1| ## NR: gi|253569532|ref|ZP_04846942.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 256 1 256 256 446 100.0 1e-124 MKKCIKLLILVFALAMPISTWGQCAAIYKKGETSMKKGRYREAIKSFKAAMKCDSKLEQD CKSKIKECEEKLKPAPKSTPAPAIEVTRLAITPDSVRFGYETTKAEYIKVDSAPEEWTAT SDSNWCKVIPHGKNLSVSCEINQLTSERKATVTISNGKMEETVKVVQSGQKEFINIALDK LEFGSKGEIKELPIKTNTEWEVINIPSWCEVIAKDSDKLILKVGKTKKAKEGTLILKTKG GEISSAILSQKKGGIF >gi|226332240|gb|ACIC01000080.1| GENE 19 23800 - 25344 518 514 aa, chain - ## HITS:1 COG:YAR019c KEGG:ns NR:ns ## COG: YAR019c COG0515 # Protein_GI_number: 6319328 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Saccharomyces cerevisiae # 11 268 24 271 974 121 34.0 3e-27 MLDTGQVIADRYELLKQLGRGGFSEVWLALDKLTDVKVAIKIYAPGMGLDDAGISLFTQE FSLVFDINHTNLLHPTYYDCWERMPYLILPFCKNGSIFKYLTDNNRITETESWHLLHDVA EGLAYLHAKLPPVIHQDIKPDNILINDENGYMITDFGISTRIRSTLRRNQGQELSGGTLA YMGPERFGPSPTPIMASDIWSLGATMYELLTGLPPYGDHGGLLQKNGADIPLINNEEYSQ ELKDIVYKCLALNTWDRPTARQIADYAFQHINGNSAPFNIQSSQQTTSDLSSKDEINEIQ TPEGHENFSFELKDNHEPSSIHSDAEFPVMQSSDIKKRVATLFKREHRKLIIIAAIIIIC ATIVTIYLATSGKEATPEPKQEIASKTNANDSICSAIILEGNNYKSQGNIFRDKIDTLRQ NNDSVFEDFYINAIKKYREVNHYKDSISKGVYMNAEDQIKQVEIILDSAYNLFILKADTM RMFDEIEAAKVFENRAKKIEPYINIDNNNNEEMH >gi|226332240|gb|ACIC01000080.1| GENE 20 26213 - 27598 661 461 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253569534|ref|ZP_04846944.1| ## NR: 
gi|253569534|ref|ZP_04846944.1| predicted protein [Bacteroides sp. 1_1_6] # 1 461 1 461 461 790 100.0 0 MKRNYLLTFILVLVIASCGGSDDDEAETLLTVSPNTFTLSSDGDSQVLKIQSNTSWLITG ASGWLSLSSSSGKGNASVVLTAQESKDIDERKCTLIVQSEGGEISEPVSVVQQGVSVSLS VDVKEINLAEAGKEQTFNITCNTSWSISGVPEWLTVSSTNGVGNAAIKVSANSFNNSPAD RDVTLNISSSGGSSQSVKIIQKAGLAAGSNVTAQTVVVLSDCFATDFEYGNNVTYFYAVV FEASFIERYTDEEIIDVMKKDENRNTPNDNYVISFYNLEPQTKHVLFLLGFNKDGERGEL IRKEITTKSGTNQPEAYVSNVTYTTSEWSWYTTIGAYATKYYQWILEYSSNYYNSSDGLV AWYFSKNIKERPNADFDPILQSGGWTRARTSSITKLQTVTWGVGQSGELGGTINNYHWES KTNGVMRINESNRNTAKGIAHIGMIVNENTLFKGATITRIR >gi|226332240|gb|ACIC01000080.1| GENE 21 27673 - 28524 428 283 aa, chain + ## HITS:1 COG:no KEGG:BT_0198 NR:ns ## KEGG: BT_0198 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 51 279 60 289 634 65 26.0 2e-09 MKTVIYLQTLIVILLFLFHSCSSEEVDKCDKEIITFLPAVSVSTRSTMERGFEEGDKIGI FLFDQNKAAKWKYGSECWVSNIPVEYNGSVGWDSEMQLYWQKVPYTLMGIGYYPYASSTL DKDFTNVYFQLPNDQSNRKLLRQSDFLWASTTDIIYEKYNGGIPLLFNHLFSKLTVSFIN TTSIANSDMKNGVQVTNVYNRAIVNLVTGEVSYESSISPQVIQMYYDENRQTAEAIIIPQ SVPVGQWINFKYKGRFYHYQLQEPLILESGKEYKIIIDLSTVQ >gi|226332240|gb|ACIC01000080.1| GENE 22 28537 - 30483 735 648 aa, chain + ## HITS:1 COG:no KEGG:Kfla_2319 NR:ns ## KEGG: Kfla_2319 # Name: not_defined # Def: hypothetical protein # Organism: K.flavida # Pathway: not_defined # 324 598 66 316 774 73 28.0 2e-11 MKRIDLSGLRTLVMLLVLAACSTEHEEERIYFEISQSISNGFQSTDGKSQDSSISFNTGD IIGVFLTEQGSQLSTDSYLYNQACIFDGNQWSLGKRFSFPAEDKGQKMRMVAYYPFMQPL ENAVLPFEVATLQNNANKQKESDLLFAEQEYIISEAAVDIHFSHLMSQVTFQVDYANGIS DVCSNIYLKACNQCSLNLENGAVSTHGTVTSIEAMKLKEETSDNSSRRFSLLIPPQHLSD EQAIELKVNESPFFIKLDQTFDSGVHYIMHLTVLGDRQVTLNGVSVANWESVNVTQGSLY SPETYSTGDVIVYQKMREKHLVTLVVTGDGFTTNELAPNGLFESSAREALNCLFSVEPYK SYREYFNVYILPTVSEETGAGNTDTGKMRNTYFKTSWGNNYSDMQVKDYNEIFDFVSSTC PDIIENKTSIDKVPVFLLVNDSRYGGICWIWNNGLSYAIIPLTEGNLQWSGNSSIGISTG DWKNVFVHEGGGHGFGKLLDEYHYNDSPNYTAESIESQAWEVPLGLNLTTDFNNTKGSVY WKHLLNDSRYPRTGFFEGGMGYKSGIWHSEIISCMDDNRLYFNTISRQLIVERIKTVAGE 
PFVIEEFFAKDINYDPLRDFNSFKTSDQYGIREKARQVPRTPHPHLMK >gi|226332240|gb|ACIC01000080.1| GENE 23 30535 - 31287 330 250 aa, chain + ## HITS:1 COG:ECs2936 KEGG:ns NR:ns ## COG: ECs2936 COG3279 # Protein_GI_number: 15832190 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Escherichia coli O157:H7 # 5 246 8 240 244 79 27.0 5e-15 MNKYKVVIVDDDEVAIGKLSDALKKNPRFSIEGTARNDKQGKKLIVKIQPDLLFLDVEMS DTAGLDLLHGLRDSISWNIRVVFYTAYKKYMIHAIRESVFDYLLKPFKEQELEYILTRFI TQVEAEQKRSLPIGASLYMGQTFIVFTPTNDMRALHPAEVVFFRYCAERKQWEVILSTQL PVVLRRGMTAEQIIKYSPSFVQIHQSYIINIDYLMIIKDNKCMLYPPFDNVTELFVSRKY KKELQDRFCL >gi|226332240|gb|ACIC01000080.1| GENE 24 31500 - 32183 564 227 aa, chain + ## HITS:1 COG:no KEGG:BT_2762 NR:ns ## KEGG: BT_2762 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 227 1 227 227 370 96.0 1e-101 MEAKKSKKAAIENQRGSWLLMGLVVALAFMFVSFEWTQHDVRVAAVSPDDESIFVTELVP ITFPEEKLEPPPPPEIKQSEWIEIVDDTEEVTDNVSVVPDNVNEPYKVVWVPPVVETKEV EEDVIHVSVEIMPEFPGGSAELLKYLSTHIKYPTMSQEMGSQGRVIVQFVVDKDGSITNP TVVRGVDAYLDKEAIRVISGMPKWKPGVQNGKKVRVKYTVPVVFRLQ >gi|226332240|gb|ACIC01000080.1| GENE 25 32278 - 32916 415 212 aa, chain + ## HITS:1 COG:no KEGG:BT_2761 NR:ns ## KEGG: BT_2761 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 212 1 212 212 419 98.0 1e-116 MKDWFMNLLYYLKRPLIWLSRFRYRCGYGVHSPFAFSLITDVIYEKMPYYAYDLLKAEQK KRITTQGWEKGIPRINRLLFRLVNKVQPATIIEVGQPSAASLYLQSAKPSAGYLFASDLS ELFLDADTPVDFLYLNDYRNPALLEEVFNICSRRTTPNSLFVVHGICYSKAMKNLWKQLQ NDERVGITFDLYDAGLLFFDTTKIKQHYIVNF >gi|226332240|gb|ACIC01000080.1| GENE 26 32968 - 33531 401 187 aa, chain + ## HITS:1 COG:lin1172 KEGG:ns NR:ns ## COG: lin1172 COG2096 # Protein_GI_number: 16800241 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 6 185 3 179 188 154 44.0 8e-38 MKKSLVYTKTGDKGTTSLVGGSRVPKTHIRLEAYGTIDELNSHLGWLNTYLLDEADRDFI 
LSIQHKLFSIGSHLATDQEKTQLKAASILTPEDVERIEREIDKLDEQLPELCAFIIPGGS RGAAVCHVCRTICRRAERRILTLSETCTISPEVLAFVNRLSDYLFVLSRKINFDEQNSEI FWDNSWK >gi|226332240|gb|ACIC01000080.1| GENE 27 33596 - 33817 389 73 aa, chain + ## HITS:1 COG:no KEGG:PGN_1678 NR:ns ## KEGG: PGN_1678 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 1 73 13 85 85 117 98.0 2e-25 MYWTLELASKLEDAPWPATKDELIDYAMRSGAPLEVIENLQEMEDEGEIYESIEDIWPDY PSKEDFFFNEEEY >gi|226332240|gb|ACIC01000080.1| GENE 28 34016 - 35374 851 452 aa, chain + ## HITS:1 COG:no KEGG:BF4274 NR:ns ## KEGG: BF4274 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: B.fragilis # Pathway: not_defined # 1 452 1 452 452 808 86.0 0 MGRPKRLYPLGKYRLRTPKESDKNKAYPIELEYTWNRQIIRKTTNVFVKETDWNQNGNQG RGEIRVSYGSEYKRLNQLLLARVERIDSMLAEYNEKHPNQITADIISGFLADKPLTRPDQ GKDFVEFTLERLYSEYSRNKIGRSRYENGKSGMNIFQTFLRSTKQGTYRTDSIYVGDITP ELLDAYILWRREIKQNSDATINHSLTPILKACAYACEMGMLEPSVNARIQDMRIITKVSL SEEESEFDGKSLNKEQLLALLEYYKTCPEPRRKEFLEIFFFAFHACGLRVVDVMTLQWGH INFEKKELRKIMIKTNKRHVIPLTEQAISILRQWQEKREGCKYVFNLVKEELDLDDAEAL YKARNNATKCINQSLVVVGEQIGLPFNLSMHVARHSFAVFALNKGLSMTVVSRLLGHGST DVTEKVYARFLPETLSTEVARLKDEFKTLEII >gi|226332240|gb|ACIC01000080.1| GENE 29 35382 - 36014 186 210 aa, chain + ## HITS:1 COG:no KEGG:BF4273 NR:ns ## KEGG: BF4273 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 210 1 222 222 271 64.0 1e-71 MLQIIFLNIGLCVILLAIAFVFKKYARVILWSSAITLFVGTLLLVGLGIVNPVSDNETGS SEFWGTIMFLISAAALVIGAEILREKNIMNVEEVVSVDAEIILSETLDTELARKVFAKAI EAGYMKEDGTHYKWNESKVLLAYMCGRIYCGDKPIPSKFDDKGSWKFGETFFPDTELNNL FDISGLGQSRQNRKDLAVPVKSTEIDKFFE >gi|226332240|gb|ACIC01000080.1| GENE 30 36254 - 36586 219 110 aa, chain + ## HITS:1 COG:no KEGG:BF4272 NR:ns ## KEGG: BF4272 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 3 108 4 109 110 135 69.0 3e-31 MAKIKNVESLCERELLEKILHTVEEQEKALAQHQMPQNHLYDNKALMAKLGIKEKYLKKL 
RDNGYLGYSRQGDKYWYTQEDVDRFLRNFHYDAFAASDELPQRRGGGYGR >gi|226332240|gb|ACIC01000080.1| GENE 31 36528 - 38039 472 503 aa, chain + ## HITS:1 COG:no KEGG:BF4271 NR:ns ## KEGG: BF4271 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 17 501 1 490 499 608 58.0 1e-172 MPLRQVMNYPNGEEVAMVDKLHLTNMLRAKVEHNLDGGLPLDVFPDKIQEIILNLSRHEN FNVEYVASIIISAMAAAIGNSYQINIRNEWKDSPSLYMMLIGRPGLGKTPPLNFLYKPIN DLDDRLDEKYSEELEKYERAKQANGGNDKLKAPKWLTNIISDFTPEAMVEAHWRNPRGIA IIVDEIIGLFNFAKRYNGNNNLIELLLTAYSGGTIKVLRKSSSRHIRVKTPCINVIGTVQ TNMLHEVFRKEFIANGFLDRFIFIFPKDRKISRWRRNDNSIPKPDIAGQWATILNKVLEI PCTINEIRNVAEPKVLEMTEEAEVYFYDWYNNIIDNVNSIDDDADVESRSMKLNGHAGRL SLIFQIMKWAVGEEDMQPVSLSSVKSAIRMVDYYEDTYHRIQEILLSNTIGDVKEDWLSQ LGNTFTASDAIAAAKIYEIPRRTVFYALKKLCQTKQPILEKTKHGEYRKIQHQTSNASCT IALSTQVEELQTKHSAKVHSANE >gi|226332240|gb|ACIC01000080.1| GENE 32 38052 - 39146 318 364 aa, chain + ## HITS:1 COG:no KEGG:BF4270 NR:ns ## KEGG: BF4270 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 341 1 342 342 455 60.0 1e-126 MSKYRFHLQKYSLGSKIACPSCERPRCFVKYVDEEGVVSFSNEVGKCDHEISCGYHYTPR DFFRDNPNANIKEERSSDYVQPVKSTHEKEIIVPPSFIPLEYFNASLAHFEKNPLYKYLC RVFGEAEAQRVFHLYHVGTSAKWGGATVFWQIDANEQVRSGKIILYNSETGHRVKEPHSY VSWAHRELQLKDFNLKQCLFGEHLLIRYPDAPVILVESEKTAIVMSHFIPNYVWVATGGI NGCFKEEFVHSLKGRDVTLIPDLGATQLWKEKSIILTRICSRVVVSDMLEQIATEEEQSK GLDISDYYLFSPSKHQILQMMIEKNPLLQNLIDALGLELIDAQQMTESTWIVQNLNNKGA WSRS >gi|226332240|gb|ACIC01000080.1| GENE 33 39267 - 40823 695 518 aa, chain + ## HITS:1 COG:no KEGG:BF4269 NR:ns ## KEGG: BF4269 # Name: not_defined # Def: putative protein involved in recombination # Organism: B.fragilis # Pathway: not_defined # 1 514 1 513 513 619 60.0 1e-176 MNESIKQAMHLDACKSFGAAEANENERRWDDKMIDSKNNDSINNYDKTRMRLNFEIGPDG KIHSLGYQDKRLDVRLQERLNELGWRPFKADSKIQPNCCARLIFGGNRERTLEMAFGTQD LKLDKDSDNSHIHRCKEVEQWALDVYNWCCRRYGKENIIGFQVHLDETSPHIHALIVPVG VRPKSGRECVMWSAKFGKNKYEYGSILKEMHTSLYEEVGSKYGLERGDSVEGRNIQHLNK 
RDYIRKLAKEQKKMEKAIRGLQTMKDNIEAEMQMFSDELKILEKDLAAGRVVLAEYQNKK ACIEDLIHDCQERLDDKETKLMIKQQELDSLLKDIQNIHSVNRPFKNYKIEFNAPQITEK PPMFGTDKWLERQNTIIEKQFTAMGRKLEELYRNEAANQVEAVRQNILVDMKEVHTLKAE NKTLTECMNYWSNLALKTFDYLSIPSLRDELCAVATALIGGDYVVPTGGGGGGNESDLRW DGRRPDEPDFAFQRRCFSHAVKYIMAKYSVKQNKGRSR >gi|226332240|gb|ACIC01000080.1| GENE 34 40985 - 41185 241 66 aa, chain + ## HITS:1 COG:no KEGG:BDI_2133 NR:ns ## KEGG: BDI_2133 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 3 66 6 69 80 89 79.0 3e-17 MNDMNRIKVVLVEKKKTSKWLAEVLGKNPATISKWCTNTSQPDVATLREIARVLDIDVKE LLNSTK >gi|226332240|gb|ACIC01000080.1| GENE 35 41216 - 42742 1165 508 aa, chain + ## HITS:1 COG:BMEII0451 KEGG:ns NR:ns ## COG: BMEII0451 COG0286 # Protein_GI_number: 17988796 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Brucella melitensis # 27 503 19 506 518 323 38.0 4e-88 MVKKKNKEESLVPKDEKTLKMEGVQNLYNFLFEACNILRGPVSQDNFKDYITPILYFKRI SDVYDEETQTALEESGGDEEYASLPEQHRFVIPDGCHWSDIRERSENLGAAIVGAMRGIE LANPDTLYGVLSMFSAQKWTDKKNLSDGKIRDLIEHLSTRRLGNNDYPADLMGDAYEILL KKFADDSKAQAGEFYTPRSVVSLLVRILDPKPGETVYDPACGSGGMLIEAVQHMNHSSLC CGSIFGQEKNVVNSAIAKMNLFLHGASDFNIMQGDTLRSPKILQNGEIAKFDCVIANPPF SLEKWGSVEWSSDKYGRNVWGTPSDSCGDYAWIQHMVKSMASGNGRMAVVMPQGVLFRGN EEGRIREKLVKSDLIEAVVTLGDKLFYGTGLSPCFLILRRLKPAAHSARVLMIDGTKILT VKRAQNILSPENVDRLYELYINYEDVEDFAKVVTLDAIAAKDYDLSPNKYVEYHKEEIRP YAEVLAEFKAAYEEVKRCEEEFRKLINL >gi|226332240|gb|ACIC01000080.1| GENE 36 42747 - 44285 947 512 aa, chain + ## HITS:1 COG:BMEII0451 KEGG:ns NR:ns ## COG: BMEII0451 COG0286 # Protein_GI_number: 17988796 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Brucella melitensis # 19 511 18 515 518 295 35.0 1e-79 MEQLVKYYRNPDEPISLEDLKSFLWGAATRLRGQIDAAGYKEYIFPLLFFKRISDVYDEQ FEGYVCEGGIEYANAQAQELVIRIPDGAHWRDVRECTENVGQRLVEAFIAIEQANPGEHA 
DGRVIGGLEGIFGPKDGWTNKAKMPDHIITSLIEDFSRYNLSLKACPADEMGQAYEYLVG KFADDAGNTAQEFYTNRTVVDLMAEILQPRPGESIYDPTCGSGGMLVKCLDFLRKKGEPW QGVKVFGQEINALTSAIARMNLYLNGVEDFSIVREDTLAHPAFVDGSRLRKFDIVLANPP YSIKTWNREAFMNDKWGRNFLGTPPQGRADYAFFQHILASMDDKTGRCAILFPHGVLFRD EEQSLREKLIKSDVVECVIGLGANLFYNSPMEACILICNNQKRSTLKNKIIFINALKEVT RKNAESYLEKVHIEKIVSAYFNASEIQNFSTVASLEQIANYNYNLNISLYANALNNENNF NVLSVSETLSNWKSVHAEMTSSITEMFTLIGK >gi|226332240|gb|ACIC01000080.1| GENE 37 44288 - 45895 444 535 aa, chain + ## HITS:1 COG:PAB2150 KEGG:ns NR:ns ## COG: PAB2150 COG0732 # Protein_GI_number: 14520513 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Pyrococcus abyssi # 8 146 252 391 427 102 37.0 1e-21 MKLEEFARESRATHKGDKSGVPIVGLEHLIPQEIKFSGYDVDTENTFTKTFKKGQILFGR RRAYLKKAAIADFDGICSGDITVIEAIPGKVDPLLLPFIIQNDKFFDYAVSRSAGGLSPR VKWEHLKDYEFDLPPIEEQRILADKLWAAYRLKESYKKLLTATQEMVKSQFIEIFYGMET TPVKDYIDDSFPGEWGTEDKDGNGVKVIRTTNFTNSGKLNLADVVTRSIEDRKVVRKQIK KYDTILERSGGTADNPVGRVVLFEEDNLFLCNNFTQVLRFKDVDPRFAFYALYYFYQTNR TAIRSMGSKTTGIQNLNMSKYLEIGIPNASDEDQKAFVTIAEQADKSEFVGCKSQFIEMY YNTHNKQTLESVCPIMNKGITPKYVESSSVLVINQACIHWDGQRLGNIKYHNEEIPVRKR ILESGDVLLNATGNGTLGRCCVFICPSDNNTYINDGHVIALSTDRAVILPEVLNTYLSLN DTQAEIYRQYVTGSTNQVDIVFSDIKKMKVPVPSMDEQILFVEVLTQADKSEYYN >gi|226332240|gb|ACIC01000080.1| GENE 38 45916 - 48849 1688 977 aa, chain + ## HITS:1 COG:MTH940 KEGG:ns NR:ns ## COG: MTH940 COG0610 # Protein_GI_number: 15678960 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Methanothermobacter thermautotrophicus # 139 887 129 922 1013 339 31.0 2e-92 MAYFNEENTVEQMLINAAGQCGWIYVEPQFVPRLPDEVLVVEWLMEALLALNPITAEQAE QVIYKLRACITSGGASDELITANDKFRKLLFEENSYPFGDNGDNINIRFFAGKDDEYKNR CIVTNQWEYPRKSKEGGKRLDLVYVINGIPMVIGEAKTPVKASVTWADGATDIMHYQKSI PEMFVPNILTFASEGRELQYAGIGCPIDKWGPWFADEERKHGTLEDVEHNYVSLMTPERL LDIYRFYSVFTGTSGGRKIKIVCRYQQYLGGEAIVQRVLSTYTKGSGPRKGLIWHFQGSG 
KSWLMVFAAQKLRRQEVLKAPTVVIVDDRIDLEDQITGDFTRAEIPNIDSISSKQELEEK IHQRKILITTIFKFGDLNDGEVIDERDNIILLMDEAHRTQEGDLGKKMRTALPNAFFFGL TGTPINRNDHNTFACFGAEEDKYGYISKYTFQNSVADGATLELNFKTVPVEMHLDDAKLQ KEFDELTDQISEEDKNELVRRTSVDAFFTAEKRINDVCKYIVSHFREYVKPTGMKAQVVV YNRECCVKYKKALDALLGTDDQTTIVMHTAGDKADEYKAYKRSRDEEKKLLDQFRDPLSP LKFVIVTSKLLTGFDAPILQCMYLDKPMKNHTLLQAICRTNRTYNENKKCGLIVDFVGVF EDVAKSLAFDEETVKTIVQNIDEIKSLIPTFMQQCIEFFPGVDRTIGGWEGLTAAQQCLK DDGVKTNFGRHFARLNKAWEIVSPDACLAEFQNDYTWLAQVYQSVRPVSGGNLIWTLLGA KTIEIIHRNIETIDIGTPLEDLVVDADVIDAVLEDEKAREKKIVEVEKMLRLRLGEHKGD PNFKKFAEKLDELRERMEQNLISSIDFLKQLLALAKDLLEEEKKKDESQDKRAKARAALT DLFQSIKTEETPIIVEQVVNDIDNEVVNIVRQFNDAFQSVTARREIKKKLRAILWVKYQI KDNDVFERAYQYIEMYY >gi|226332240|gb|ACIC01000080.1| GENE 39 48873 - 50078 357 401 aa, chain + ## HITS:1 COG:no KEGG:Pnap_4263 NR:ns ## KEGG: Pnap_4263 # Name: not_defined # Def: hypothetical protein # Organism: P.naphthalenivorans # Pathway: not_defined # 21 335 16 342 520 86 25.0 2e-15 MAQYKQSEITNVFWASNDGFKGGRDPLGIQNSSIATYSKLLPGLTNLTRHIRYYSLYCWL LSEYDKFEVAGQTSLHQYNFIRRAELAMALIMKEQNVGSVVGALFVSQGRYKQIEDGIYD IADGADYESKDKYWTFKSGAFGQYYLGSLIYYELVKIEEGRFYLRNKGKELADAVRNSID ENIRKLFLKCILDGSLKEEAIEDLQSLAIHRINVGSEEWLFLNNLLTKSDEDSSLRRETI FLLLNDISKGVEIQEFVKNRFLHITEDGNLHAAFGWYFYYLCEGLHYCIDLFFCLILYKI HELHNPPIALLSQDIKQSLLSVIEKEMNYNSLDEWRKNVSDNINIIYDELRDYVSKQDYI SAAVHAIRLLLRLYTEFENNSKEIEEFEKKNRFETSARHTQ >gi|226332240|gb|ACIC01000080.1| GENE 40 50032 - 50388 153 118 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|153805911|ref|ZP_01958579.1| ## NR: gi|153805911|ref|ZP_01958579.1| hypothetical protein BACCAC_00151 [Bacteroides caccae ATCC 43185] # 6 118 392 504 504 210 96.0 2e-53 MRKKIDLKRQRGILSEGLRSYMERYLSFSISSFIESLIVQIMQEHTVVAIAKMGKNNSDL RKFILEDGRIVLVEQRYPVETSPRINSLFNFLQDMGYLDEDNTLTEIASQFIENYGKE >gi|226332240|gb|ACIC01000080.1| GENE 41 50375 - 52750 603 791 aa, chain + ## HITS:1 COG:no KEGG:Pnap_4262 NR:ns ## KEGG: Pnap_4262 # Name: not_defined # Def: hypothetical protein # Organism: P.naphthalenivorans # Pathway: not_defined # 
21 347 12 330 833 107 26.0 1e-21 MEKNEHAILLDIPSGGKNGKYHSAVLTTYAIDLIHFDNQLLNMLHRKQVCSINVFADTNQ MDKSMEYVSPIYIRHIGKEYSITSISAVGAFHPKINFFVGDDAVLVVFGTGNLTVTGHGK NHEAFTGFMIDETDTTHRPLIEECWQYLCRFTKQCNDYDHNRILREIPENCTFLDSSFNI VPHSMCKVQEGLNAALLYNDSQSGILQQISNLVPLNEVQTITLLSPYFDEYGESLITLSQ LCPNSTVNVLIHQDCALPPSGMLPNSSIHFYDFSETKRGKIAFKTYERQLHAKVLHFKTN DAEYCMVGSANATLAGLGTITHRGINEEFGVLYHSTKQDFLSTLGLKTKKRIDVPTNRSK HSNEAPSETGRRLRLLSAYYESGKLNVYSNEEIPDGVLLSIDNGIETLVSELKHDKGNRY STDIKLAKTQYTCYLVDKDKKSISNKLFVNWTEFLATTNPSKMSRNLNRFISRIENEGYD GMEVADMLSDVMWDLVNDAGEKVSPKIKVSSEDKRQSDLSLPEIKYNSEYDNDDAKSSRI FQIDRTSRLIECIEESIKKKIRSIDDAIIDEEEEGSAETSNNREIEEQEDIFLSKQLLKG YGELSTSLLMKYQKMISKRYEQVKATGDGVITRDDLNFFSLSMFAAMEICYLNKFRYKFD QIPYSSRNFSQKQFYDSLERSISNVGLDSLEKFVKFCNTMKLPAQRDENFNKVSSRSMKY AILYGTLFNRLSHIETIRILGKRLQKAVDSLAVLFGKPSFETLTKELEPLSERYDYVFRM NHVENMIRSLS >gi|226332240|gb|ACIC01000080.1| GENE 42 53072 - 54223 1177 383 aa, chain - ## HITS:1 COG:no KEGG:BT_2758 NR:ns ## KEGG: BT_2758 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 383 7 389 389 747 99.0 0 MLLLAGSIQAMYAQKFSPNTPLFEELTDVKKKTDKFNLYLNMQGSFDAHFQNGFQEGDFN MHQLRIEAKGNINNWLSYRYRQRLNRSNDGNNMIDNLPTSIDWAGIGIKLNDKFNLFAGK QSANYGGFEFDLNPIDVYQYCDMIDYMSNFLTGLNASYNITPDQQLNLQILNSRNSSFDN TYGITEDAEGNLPDLKSGKMPLVYTLNWIGNFNNVFKTRWSASVMNEAKDHNMYYYAVGN ELNLGKWNAFVDFMYSKEEIDRKGIITGIVGRPGGHNAFDTGYLSVVAKCNYRFLPKWNA FVKGMYETASVTKASEGIEKGNYRTSWGYLAGIEFYPMETNLHFFVTYVGRSYDFTSRAK VLGQENYSTNRISVGFIWQMPVF >gi|226332240|gb|ACIC01000080.1| GENE 43 54257 - 55315 1020 352 aa, chain - ## HITS:1 COG:STM3106 KEGG:ns NR:ns ## COG: STM3106 COG0252 # Protein_GI_number: 16766407 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Salmonella typhimurium LT2 # 11 352 9 348 348 380 64.0 1e-105 MKQIKKLGLVVVTLLLSATMAFAAKPNIHILATGGTIAGTGSSATGTSYTAGQVAIGALL 
DAVPEIKDIANVTGEQIVRIGSQDMNDEVWLTLAKKINELLKRPDIDGIVITHGTDTMEE TAYFLNLTVKSDKPVVLVGAMRPSTALSADGPLNLYNAVVTAAAKESKDKGVLVAMNGLI LGAESVVKMNTVDVQTFQAPNSGALGYVLNGKVCYNQITLKKHTTQSVFDVTKLTSLPKV GIVYSYSNIEADMMTPLLNNGYKGIIHAGVGNGNIHKNIFPSLIDARRKGIVVVRSSRVP TGPTTLDAEVDDAKYQFVASQELNPQKSRVLLMLALTKTTDWKQIQEYFNEY Prediction of potential genes in microbial genomes Time: Thu May 12 01:24:31 2011 Seq name: gi|226332239|gb|ACIC01000081.1| Bacteroides sp. 1_1_6 cont1.81, whole genome shotgun sequence Length of sequence - 10381 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 5, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 1279 1281 ## COG2704 Anaerobic C4-dicarboxylate transporter - Prom 1386 - 1445 6.9 + Prom 1342 - 1401 6.8 2 2 Tu 1 . + CDS 1459 - 2892 1363 ## COG1027 Aspartate ammonia-lyase + Term 2909 - 2963 13.4 + Prom 2951 - 3010 5.8 3 3 Tu 1 . + CDS 3093 - 4463 517 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 + Term 4526 - 4566 -0.5 + Prom 4490 - 4549 10.1 4 4 Op 1 . + CDS 4605 - 5585 882 ## BT_2754 hypothetical protein 5 4 Op 2 . + CDS 5644 - 6234 708 ## BT_2753 hypothetical protein + Term 6262 - 6303 6.4 + Prom 6347 - 6406 3.6 6 5 Op 1 . + CDS 6426 - 8891 1936 ## COG1198 Primosomal protein N' (replication factor Y) - superfamily II helicase 7 5 Op 2 . 
+ CDS 8947 - 10381 1222 ## BT_2751 hypothetical protein Predicted protein(s) >gi|226332239|gb|ACIC01000081.1| GENE 1 1 - 1279 1281 426 aa, chain - ## HITS:1 COG:HI0746 KEGG:ns NR:ns ## COG: HI0746 COG2704 # Protein_GI_number: 16272687 # Func_class: R General function prediction only # Function: Anaerobic C4-dicarboxylate transporter # Organism: Haemophilus influenzae # 2 421 6 424 440 366 56.0 1e-101 MILQLAFVLTAIIIGARLGGIGLGVMGGVGLGILTFVFGLQPTAPPIDVMLMIAAVISAA SCMQAAGGLDYMVKLAEHLLRKNPSHVTLLSPLVTYLFTFIAGTGHVAYSVLPVIAEVAT ETKIRPERPLGIAVIASQQAITASPISAATVALLGLLAGFDITLFDILKITIPATIIGVL VGALFSMKVGKELIEDPEYQKRLKEGLFNDKKIEIKDVKNKRSAMLSVLIFILATAFIVL FGSFEGMRPSFLIDGEIVTLGMSSIIEIVMLSAAAIILIVTKTDGIKATQGSVFPAGMQA VIAIFGIAWMGDTFLQGNMAQLTLSIEGIVRQMPWLFGIALFVMSILLYSQAATVRALMP LGIALGISPYMLIALFPAVNGYFFIPNYPTVVAAINFDRTGTTKIGKYVLNHSFMMPGLV STVVAI >gi|226332239|gb|ACIC01000081.1| GENE 2 1459 - 2892 1363 477 aa, chain + ## HITS:1 COG:Cj0087 KEGG:ns NR:ns ## COG: Cj0087 COG1027 # Protein_GI_number: 15791475 # Func_class: E Amino acid transport and metabolism # Function: Aspartate ammonia-lyase # Organism: Campylobacter jejuni # 9 468 3 462 468 514 55.0 1e-145 MEQKLSKKTRTESDLIGSREVPESALYGVQTLRGIENFRISKFHLNEYPLFIQALAITKM GAAVANRELDLLTEEQTDAILKACKEILEGKHHDQFPVDMIQGGAGTTTNMNANEVIANR ALELMGHARGEYHYCSPNDHVNRSQSTNDAYPTAIHIGLYYTHLKLVKHFATLIEAFRKK GAEFAHIIKMGRTQLEDAVPMTLGQTFNGFASILEHELKNLDFAAQDFLTVNMGATAIGT GITAEPEYAEKCIAALRKITGLDIKLADDLIGATSDTSCMVGYSSAMRRVAVKMNKICND LRLLASGPRCGLGEINLPAMQPGSSIMPGKVNPVIPEVMNQVSYKVIGNDLCVAMSGEAA QMELNAMEPVMAQCCFESADLLMNGFDTLRTLCIDGITANEEKCRRDVHNSIGVVTALNP VIGYKHSTKIAKEALETGKGVYELVLEHNILSKEDLDTILKPENMIKPVKLDIHPNH >gi|226332239|gb|ACIC01000081.1| GENE 3 3093 - 4463 517 456 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 1 456 1 449 458 203 30 4e-52 MNAFDVIIIGFGKGGKTLAAEFAKRGQKVAVVERSDKMYGGTCINIGCIPTKTLVHQAKI ASGMKGLTFLEKREFYRNAVSVKDSVTGALRNKNYHNLADNPHVTVYTGFGSFVSSDTVS 
VRTAAEELLLTAKQIIINTGAETVIPSIDGIADNPFVYTSTSIMELTDLPCHLVIVGGGY IGLEFASMYASFGSQVTVLESYPELIAREDRDIAASVKETLEKKGIVFRMNAKVQSVNHT SEKAIVAFTDLQTNEAFELEADAVLLATGRRPNTQGLHLEAAGVEVDARGAIIVDEYLKT TNPAIRAVGDVKGGLQFTYISLDDYRIIRDDLFGDAERKTSDRNPVSYSVFIDPPLARVG LNEEEARKQNLDIIVKKLPVMAIPRAKTLGETDGLLKAIIDKKTGLILGCMLFAPDSSEV INTVALAMKTEQDYTFLRDFIFTHPSMSEALNDLFA >gi|226332239|gb|ACIC01000081.1| GENE 4 4605 - 5585 882 326 aa, chain + ## HITS:1 COG:no KEGG:BT_2754 NR:ns ## KEGG: BT_2754 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 326 1 326 326 625 99.0 1e-178 MKNKPFENFDFIDFWDDDEYAMNEYIGAPPTEKMIEETERELGYKLPESYIWLMKQHNGG IPFNVCFPCDEPTSWADDHVAITGIMGVDKDKIYSLCGQLGSRFMIEEWGYPDIGVAICD CPSAGHDMIFLDYRECGPQGEPKVVHVDQEDDYYVTFLADNFEEFIRGLVNEEVFDTSEE DERMELEKVRNAAFSPLLSDLCAKCDHPVDTERWIRKISEEIVTDKGFFALHADERSYLL YDIQLWLYTNVYPDTTEEDYLSAYKKIIALDGEFSTGGYASDFVTDWLTRRKESGMVTCN DGILSMAAGTREALLANRELISNKEQ >gi|226332239|gb|ACIC01000081.1| GENE 5 5644 - 6234 708 196 aa, chain + ## HITS:1 COG:no KEGG:BT_2753 NR:ns ## KEGG: BT_2753 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 196 1 196 196 364 99.0 1e-100 MKKIFGALMIAVCIGMAMPAQAQLHFGVKGGLNLSKASFSDVKENFKKDNFTGFFIGPMA EFNIPVVGLGVDASLLFAQRGIKISDGGEEATVKQNGLDIPVNLKYNIGLGSLVGLYVAA GPDFYFDFAGNKTIDGVKTDKKKAEVGINVGAGVKLLNHLQVGANYNIPLGKTASFEDIE GSYKTKTWQVSVAYIF >gi|226332239|gb|ACIC01000081.1| GENE 6 6426 - 8891 1936 821 aa, chain + ## HITS:1 COG:lin1938 KEGG:ns NR:ns ## COG: lin1938 COG1198 # Protein_GI_number: 16801004 # Func_class: L Replication, recombination and repair # Function: Primosomal protein N' (replication factor Y) - superfamily II helicase # Organism: Listeria innocua # 1 820 1 793 797 461 34.0 1e-129 MLAMKKYVDVILPLPLPKYFTYSLPDECAEEVEIGCRVIVPFGRKKFYTAIVRNVHYCAP TEYEVKDISALLDASPILLSTQFKFWEWLADYYLCTQGDVYKAALPSGLKLESETIVEYN PDFEADAPLSEREQRILDLLSVDSQQCVTKLEKESGMKNILTVIKSLLDKEAIFVKEELR 
RTYKPKTEARVRLAGSADEKRLHILFDILSRAPKQLALLMKYVECSGILGRETPKEVSKK ELLQRAGVSPSILNGLVEKKIFEIYFHEVGRLNQEMKEVVELNALNEFQQRAHDEIVQSF QEKNVCLLHGVTSSGKTEVYIHLIEETIRQGKQVLYLLPEIALTTQITERLQRVFGSRLG IYHSKFPDAERVEIWRKQLGDKGYDIILGVRSSVFLPFRDLGLVIVDEEHENTYKQQDPA PRYHARNAAIVLAAMYGAKTLLGTATPSIETWQNARNGKYGFVQLKERYKEIQLPEIIPV DIKELHRKRRMVGQFSPLLIQYMKEALEQKEQVILFQNRRGFAPMIECRTCGWVPKCKNC DVSLTYHKGINQLTCHYCGYTYQLPKFCPACEGTELVNRGFGTEKIEDDIKVLFPEAAVA RMDLDTTRTRAAYEKIIADFEQGKTDILIGTQMVSKGLDFDHVSIVGILNADTMLNYPDF RSYERAFQLMAQVAGRAGRKNKRGRVVLQTKSIDHPIIHQVIGNNYEEMVDGQLAERQMF HYPPYYRLVYVYLKNHNEALLDQMAAVMAEKLRIVFGNRVLGPDKPPVARIQTLFIKKIV LKIEQNAPMGRARELLQRIQKDMLEDERYKSLIVYYDVDPM >gi|226332239|gb|ACIC01000081.1| GENE 7 8947 - 10381 1222 478 aa, chain + ## HITS:1 COG:no KEGG:BT_2751 NR:ns ## KEGG: BT_2751 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 478 1 478 598 991 99.0 0 MKRLTIYCLTCLIGVSPLFAQQGVTQCGTPTGQPKFPIETYQELPDPSSPSDKDWAAVTN PQISWGTIDTRYAKHRLPQLKKQQLTTLKGWRGERVNAQAVVWTGTEVKDLNFSFTDFKD KKGNSLLDEAFKAGFIRYVMTDELNKDGRGACGHRQAVDYDSLLVADPIDTNLKSMSVPA RTVQPIWVQCWIPQSATPGTYKGALLIKDGSRLLQQLNLEILVSSRELPAPSEWAYHLDL WQSPFAVARYYQVPLWSQEHLDAMRPLMKMLADAGQKIITATLTHKPWNGQTEDYFETMV TWMKRADGTWTFDYTVFDRWVEFMMSVGINQQINCYSMVPWELSFQYFDQATNSLKFVRT APGEPAYEEMWVAMLTSFSKHLREKGWFDICAIAMDERPMDVMQKTLKVIRKADPEFKVS LAGNYHAEIEPDLYDYCIVIGQNFPEEVRLRRAAENKRTTYYTCCTEAHPNTFTFSDP Prediction of potential genes in microbial genomes Time: Thu May 12 01:28:00 2011 Seq name: gi|226332238|gb|ACIC01000082.1| Bacteroides sp. 1_1_6 cont1.82, whole genome shotgun sequence Length of sequence - 54666 bp Number of predicted genes - 63, with homology - 62 Number of transcription units - 17, operones - 8 average op.length - 6.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 346 280 ## BT_2751 hypothetical protein 2 1 Op 2 . + CDS 389 - 859 413 ## COG0394 Protein-tyrosine-phosphatase + Term 913 - 953 -0.9 3 2 Tu 1 . 
- CDS 834 - 2906 585 ## PROTEIN SUPPORTED gi|163762592|ref|ZP_02169656.1| ribosomal protein S21 - Prom 2940 - 2999 4.6 + Prom 2810 - 2869 4.8 4 3 Op 1 . + CDS 3001 - 4515 1537 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases 5 3 Op 2 . + CDS 4532 - 5755 1143 ## COG1519 3-deoxy-D-manno-octulosonic-acid transferase + Term 5830 - 5866 -0.7 - Term 5917 - 5963 11.6 6 4 Op 1 . - CDS 6028 - 7656 958 ## BT_2746 thiol:disulfide interchange protein TlpA 7 4 Op 2 . - CDS 7653 - 8867 531 ## BT_2745 thiol:disulfide interchange protein - Prom 8888 - 8947 4.1 8 4 Op 3 . - CDS 8966 - 9151 86 ## - Prom 9181 - 9240 5.0 - Term 9293 - 9328 -0.1 9 5 Tu 1 . - CDS 9463 - 9975 499 ## COG0663 Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily 10 6 Tu 1 . - CDS 10125 - 11906 1251 ## COG0006 Xaa-Pro aminopeptidase - Prom 12030 - 12089 2.9 + Prom 11906 - 11965 2.5 11 7 Op 1 . + CDS 12098 - 12289 310 ## PROTEIN SUPPORTED gi|160883083|ref|ZP_02064086.1| hypothetical protein BACOVA_01051 12 7 Op 2 . + CDS 12363 - 13244 662 ## COG4974 Site-specific recombinase XerD 13 7 Op 3 . + CDS 13261 - 13560 201 ## PROTEIN SUPPORTED gi|163755828|ref|ZP_02162946.1| 30S ribosomal protein S21 + Term 13605 - 13654 4.3 + TRNA 13658 - 13734 81.2 # Thr TGT 0 0 + TRNA 13890 - 13975 63.9 # Tyr GTA 0 0 + TRNA 14001 - 14073 68.9 # Gly TCC 0 0 + TRNA 14086 - 14157 80.8 # Thr GGT 0 0 14 8 Tu 1 . + CDS 14208 - 15392 1406 ## PROTEIN SUPPORTED gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 + Term 15407 - 15448 8.0 + TRNA 15447 - 15522 81.1 # Trp CCA 0 0 + Prom 15449 - 15508 80.2 15 9 Op 1 . + CDS 15534 - 15725 145 ## BF4198 preprotein translocase SecE subunit 16 9 Op 2 45/0.000 + CDS 15743 - 16285 586 ## COG0250 Transcription antiterminator 17 9 Op 3 55/0.000 + CDS 16347 - 16790 744 ## PROTEIN SUPPORTED gi|29348147|ref|NP_811650.1| 50S ribosomal protein L11 18 9 Op 4 . + CDS 16806 - 17504 1160 ## PROTEIN SUPPORTED gi|29348146|ref|NP_811649.1| 50S ribosomal protein L1 19 9 Op 5 . 
+ CDS 17520 - 18038 856 ## PROTEIN SUPPORTED gi|29348145|ref|NP_811648.1| 50S ribosomal protein L10 20 9 Op 6 28/0.000 + CDS 18086 - 18460 589 ## PROTEIN SUPPORTED gi|29348144|ref|NP_811647.1| 50S ribosomal protein L7/L12 + Term 18486 - 18530 9.2 + Prom 18467 - 18526 3.5 21 10 Op 1 58/0.000 + CDS 18559 - 22377 2917 ## PROTEIN SUPPORTED gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 + Term 22391 - 22447 13.0 + Prom 22399 - 22458 2.0 22 10 Op 2 . + CDS 22484 - 26767 4195 ## COG0086 DNA-directed RNA polymerase, beta' subunit/160 kD subunit + Term 26788 - 26841 13.1 + Prom 26819 - 26878 7.7 23 11 Tu 1 . + CDS 26910 - 27215 266 ## BT_2732 hypothetical protein + Prom 27269 - 27328 5.1 24 12 Op 1 56/0.000 + CDS 27353 - 27754 686 ## PROTEIN SUPPORTED gi|29348140|ref|NP_811643.1| 30S ribosomal protein S12 + Prom 27777 - 27836 2.2 25 12 Op 2 51/0.000 + CDS 27898 - 28374 808 ## PROTEIN SUPPORTED gi|29348139|ref|NP_811642.1| 30S ribosomal protein S7 26 12 Op 3 4/0.000 + CDS 28430 - 30547 1889 ## COG0480 Translation elongation factors (GTPases) + Term 30564 - 30604 4.3 + Prom 30552 - 30611 4.0 27 12 Op 4 40/0.000 + CDS 30632 - 30937 495 ## PROTEIN SUPPORTED gi|29348137|ref|NP_811640.1| 30S ribosomal protein S10 28 12 Op 5 58/0.000 + CDS 30956 - 31573 1071 ## PROTEIN SUPPORTED gi|29348136|ref|NP_811639.1| 50S ribosomal protein L3 29 12 Op 6 61/0.000 + CDS 31573 - 32199 1045 ## PROTEIN SUPPORTED gi|29348135|ref|NP_811638.1| 50S ribosomal protein L4 30 12 Op 7 61/0.000 + CDS 32216 - 32506 481 ## PROTEIN SUPPORTED gi|29348134|ref|NP_811637.1| 50S ribosomal protein L23 31 12 Op 8 60/0.000 + CDS 32512 - 33336 1419 ## PROTEIN SUPPORTED gi|29348133|ref|NP_811636.1| 50S ribosomal protein L2 32 12 Op 9 59/0.000 + CDS 33357 - 33626 467 ## PROTEIN SUPPORTED gi|29348132|ref|NP_811635.1| 30S ribosomal protein S19 33 12 Op 10 61/0.000 + CDS 33662 - 34072 679 ## PROTEIN SUPPORTED gi|29348131|ref|NP_811634.1| 50S ribosomal protein L22 34 12 Op 11 50/0.000 + CDS 34078 - 34809 1250 ## 
PROTEIN SUPPORTED gi|29348130|ref|NP_811633.1| 30S ribosomal protein S3 35 12 Op 12 . + CDS 34833 - 35267 736 ## PROTEIN SUPPORTED gi|29348129|ref|NP_811632.1| 50S ribosomal protein L16 36 12 Op 13 . + CDS 35273 - 35470 321 ## PROTEIN SUPPORTED gi|29348128|ref|NP_811631.1| 50S ribosomal protein L29 37 12 Op 14 50/0.000 + CDS 35467 - 35736 456 ## PROTEIN SUPPORTED gi|29348127|ref|NP_811630.1| 30S ribosomal protein S17 38 12 Op 15 57/0.000 + CDS 35739 - 36104 596 ## PROTEIN SUPPORTED gi|29348126|ref|NP_811629.1| 50S ribosomal protein L14 39 12 Op 16 48/0.000 + CDS 36125 - 36442 527 ## PROTEIN SUPPORTED gi|29348125|ref|NP_811628.1| 50S ribosomal protein L24 40 12 Op 17 50/0.000 + CDS 36445 - 37002 914 ## PROTEIN SUPPORTED gi|29348124|ref|NP_811627.1| 50S ribosomal protein L5 41 12 Op 18 50/0.000 + CDS 37009 - 37308 502 ## PROTEIN SUPPORTED gi|29348123|ref|NP_811626.1| 30S ribosomal protein S14 + Term 37326 - 37354 -0.9 42 12 Op 19 55/0.000 + CDS 37361 - 37756 664 ## PROTEIN SUPPORTED gi|29348122|ref|NP_811625.1| 30S ribosomal protein S8 43 12 Op 20 46/0.000 + CDS 37774 - 38343 965 ## PROTEIN SUPPORTED gi|29348121|ref|NP_811624.1| 50S ribosomal protein L6 44 12 Op 21 56/0.000 + CDS 38365 - 38709 560 ## PROTEIN SUPPORTED gi|29348120|ref|NP_811623.1| 50S ribosomal protein L18 45 12 Op 22 . + CDS 38715 - 39233 848 ## PROTEIN SUPPORTED gi|29348119|ref|NP_811622.1| 30S ribosomal protein S5 46 12 Op 23 . + CDS 39243 - 39419 281 ## PROTEIN SUPPORTED gi|53715448|ref|YP_101440.1| 50S ribosomal protein L30 47 12 Op 24 53/0.000 + CDS 39451 - 39897 735 ## PROTEIN SUPPORTED gi|29348117|ref|NP_811620.1| 50S ribosomal protein L15 48 12 Op 25 2/0.000 + CDS 39902 - 41245 879 ## PROTEIN SUPPORTED gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 49 12 Op 26 9/0.000 + CDS 41261 - 42058 721 ## COG0024 Methionine aminopeptidase 50 12 Op 27 . + CDS 42060 - 42278 239 ## PROTEIN SUPPORTED gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 51 12 Op 28 . 
+ CDS 42288 - 42404 198 ## PROTEIN SUPPORTED gi|53715443|ref|YP_101435.1| 50S ribosomal protein L36 52 12 Op 29 48/0.000 + CDS 42438 - 42818 637 ## PROTEIN SUPPORTED gi|29348113|ref|NP_811616.1| 30S ribosomal protein S13 53 12 Op 30 36/0.000 + CDS 42830 - 43219 665 ## PROTEIN SUPPORTED gi|29348112|ref|NP_811615.1| 30S ribosomal protein S11 + Prom 43221 - 43280 4.2 54 12 Op 31 26/0.000 + CDS 43336 - 43941 1029 ## PROTEIN SUPPORTED gi|29348111|ref|NP_811614.1| 30S ribosomal protein S4 55 12 Op 32 50/0.000 + CDS 43953 - 44945 912 ## COG0202 DNA-directed RNA polymerase, alpha subunit/40 kD subunit 56 12 Op 33 . + CDS 44949 - 45440 813 ## PROTEIN SUPPORTED gi|29348109|ref|NP_811612.1| 50S ribosomal protein L17 + Term 45462 - 45513 13.7 + Prom 45451 - 45510 8.4 57 13 Tu 1 . + CDS 45588 - 45983 153 ## BT_2699 hypothetical protein + Term 46011 - 46057 5.1 + Prom 46001 - 46060 6.2 58 14 Tu 1 . + CDS 46090 - 46629 422 ## BT_2698 hypothetical protein 59 15 Tu 1 . - CDS 46820 - 48634 1037 ## COG0249 Mismatch repair ATPase (MutS family) - Prom 48855 - 48914 5.7 + Prom 49587 - 49646 8.6 60 16 Tu 1 . + CDS 49690 - 50337 457 ## BT_2695 hypothetical protein + Term 50407 - 50454 2.4 + Prom 50349 - 50408 5.5 61 17 Op 1 1/0.000 + CDS 50458 - 52449 1433 ## COG2987 Urocanate hydratase 62 17 Op 2 1/0.000 + CDS 52478 - 53380 995 ## COG3643 Glutamate formiminotransferase + Prom 53393 - 53452 2.7 63 17 Op 3 . 
+ CDS 53478 - 54666 1074 ## COG1228 Imidazolonepropionase and related amidohydrolases Predicted protein(s) >gi|226332238|gb|ACIC01000082.1| GENE 1 2 - 346 280 114 aa, chain + ## HITS:1 COG:no KEGG:BT_2751 NR:ns ## KEGG: BT_2751 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 114 485 598 598 242 100.0 3e-63 SFYSSKKHLDGYLRWAYNSWPLEPLLDSRFRTWAAGDTYLVYPGARSCIRFERLIEGIQA HEKITILRQEFEKKGNKAGLKKIEKMLAPFNLGNMPEIPAAVTVNRANQILHSF >gi|226332238|gb|ACIC01000082.1| GENE 2 389 - 859 413 156 aa, chain + ## HITS:1 COG:alr5068 KEGG:ns NR:ns ## COG: alr5068 COG0394 # Protein_GI_number: 17232560 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Nostoc sp. PCC 7120 # 3 153 4 155 161 150 49.0 7e-37 MKKILFVCLGNICRSSTAEGVMLHLIKEAGLEKEFVIDSAGILSYHQGELPDSRMRAHAA RRGYQLVHRSRPVRTEDFYNFDLIIGMDDRNIDDLKEKAPSTEEWKKIHRMTEYCTRIPA DHVPDPYYGGAEGFEYVLDVLEDACAGLLTSLTQDN >gi|226332238|gb|ACIC01000082.1| GENE 3 834 - 2906 585 690 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762592|ref|ZP_02169656.1| ribosomal protein S21 [Bacillus selenitireducens MLS10] # 196 687 243 732 750 229 30 2e-59 MEHMKKKKRFSRRDILYKSLLFVATVTLIVYFLPRDGKFNYQFDINKPWKYGQLIATFDF PIYKDDAVVNREQDSLLASFQPYYLLDKQVEKDAIAKLKENYHTHLKGILPSVDYLRYIE RTLKEIYGEGIVSTENIQELHKDSTSAIMIIDDKLANSKPTDHIYTVKKAYEYLLSADTT HFNREILRQCSLNEYITPNLTFDQQRTQTAKEEMLNNYSWANGLVVSGQKIIDRGEIISP ETYNILESLRKESIKRSESIDQSRLILGGQILFVGMLMLCFMLYLDLFRKDYYERKGSLS LLFTLIVFYSVVTAFMVSHNIFNVYMIPYAMLPIIIRVFLDSRTAFLTHVITILICSISL RFPHEFILTQLAAGLVAIFSLRELSQRSQLFRTALLVILTYAAIYFAFELMTENGLANDF SKLNARMYTYFFINGILLLFTYPLLFLLEKTFGFTSNVTLVELSNINSDLLRQMSETVPG TFQHSMQVANLAAEAAIRIGAKSQLVRTGALYHDIGKMENPAFFTENQSGGVNPHKNLNY EQSAQVVISHVTDGLKLADKHNLPKVIKDFISTHHGRGKTKFFYISWKNEHPDEEPNEEL FTYPGPNPFSKETAILMMADAVEAASRSLPEYTEETISNLVDKIIDSQVAEGYFKECPIT FKDIATIKTVFKEKLKIAYHTRISYPELKK >gi|226332238|gb|ACIC01000082.1| GENE 4 3001 - 4515 1537 504 aa, chain + ## HITS:1 COG:BB0372 KEGG:ns NR:ns ## COG: 
BB0372 COG0008 # Protein_GI_number: 15594717 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Borrelia burgdorferi # 7 504 4 487 490 332 37.0 9e-91 MAERKVRVRFAPSPTGALHIGGVRTALYNYLFARQHGGDLIFRIEDTDSNRFVPGAEEYI LESFKWLGIHFDEGVSFGGEHGPYRQSERREIYKKYVQILLDNDKAYIAFDTPEELDAKR AEIANFQYDASTRGMMRNSLTMSKEEVDALIAEGKQYVVRFKIEPNEDVHVNDLIRGEVV INSSILDDKVLYKSADELPTYHLANIVDDHLMEVSHVIRGEEWLPSAPLHVLLYRAFGWE DTMPEFAHLPLLLKPEGNGKLSKRDGDRLGFPVFPLEWRPESGEVSSGYRESGYLPEAVI NFLALLGWNPGNDQEVMSMDEMIKLFDIHRCSKSGAKFDYKKGIWFNHQYIQLKPNEEIA ELFLPVLKEHGVEAPFEKVVTVVGMMKDRVSFIKELWDVCSFFFVAPAEYDEKTVKKRWK EDSAKCMTELAEVIAGIEDFSIEGQEKVVMDWIAEKGYHTGNIMNAFRLTLVGEGKGPHM FDISWVLGKEETLARMKRAVEVLK >gi|226332238|gb|ACIC01000082.1| GENE 5 4532 - 5755 1143 407 aa, chain + ## HITS:1 COG:YPO0055 KEGG:ns NR:ns ## COG: YPO0055 COG1519 # Protein_GI_number: 16120408 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 3-deoxy-D-manno-octulosonic-acid transferase # Organism: Yersinia pestis # 49 405 51 416 425 135 27.0 1e-31 MLYNLAIVIYDLLVHFAAPFSRKPRKMMKGHWVVYELLRQQVEKGEQYIWFHAASLGEFE QGRPLIEMIREKYPDYKILLTFFSPSGYEVRKHYRGADIVCYLPFDKPRNVKKFLDIANP CMAFFIKYEFWKNYLDELHKRRIPVYSVSSIFRRDQIFFKWYGGTYRNVLKDFDHLFVQN EASKRYLSKIGICRVTVVGDTRFDRVLQIREEAKDLPLVKMFKGDNAFTFVAGSSWGPDE DLFLEYFNTHPEMKLIIAPHVIDENHLVEIISKLKRPYVRYTRADEKNVLKVDCLIIDCF GLLSSIYRYGEVAYIGGGFGVGIHNTLEAAVYGIPVIFGPKYQKFMEAIRLLEAKGAYSI KDYHELKTLLDRFQADDVFMRETGANAGYYVTSNAGATEKIMHMINF >gi|226332238|gb|ACIC01000082.1| GENE 6 6028 - 7656 958 542 aa, chain - ## HITS:1 COG:no KEGG:BT_2746 NR:ns ## KEGG: BT_2746 # Name: not_defined # Def: thiol:disulfide interchange protein TlpA # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 542 1 542 542 1087 100.0 0 MNRLFTFLFLSLLFNIVQAQLRSPEYQKGKAILSGTIANYSPDDHPDLKIGAPNIVMGAA ETLFPTIEADGSFKINIPLYHNTQVRMTIGKADIVILLSPDKETNVAVNLSNPQGKQFVF SGQYATINNEWCQPELITRIAPVYRNGDILDSIAGISANEFKKRCIDQYKQCVAHNNTKT 
QFSEDTRTLANLSCAFDCIENLNATRYCLQTAYQKKENITREQASTAFANFDFPANFYDF LKSFPVNHPLALYCYNYRNVISGELYELHHDPLKFEKYLLSKAALTKEEQALIRQYETAL KTGIPFQQGSELIALIAKYPKEYNEFSQKLFTKAKEYLSHIMQDSTCLMVDYIRAIYMRS SLYNLKPLTTQQEAMATEITNPIFLGIIQDMNRQMQPRAKVTTKKYSVCEAPKVSEEELL SALVDRHKGKVQFIDFWATWCGGCRRTIKEYEPIKKELGENVAFVYLTGPSSIEKTWKIL IQDIVGEHYWLNEKQWGYLWKHFQMKGLPMYLVIDKQGNIVKRFTHVTRKELEELLKQEI NK >gi|226332238|gb|ACIC01000082.1| GENE 7 7653 - 8867 531 404 aa, chain - ## HITS:1 COG:no KEGG:BT_2745 NR:ns ## KEGG: BT_2745 # Name: not_defined # Def: thiol:disulfide interchange protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 404 12 404 404 786 100.0 0 MKTKLLLITLLFATMPAVNAQNRERDYLKKVFANLEKIESATYYVANESWQPGDSTALSI LHGFIKEYNHPIDSTIGASYVSLDAKDTTKLNFCYDGNVRVEAYHDNKKLVIDDFTFKPL PFRLVRPPFFNFAKNIIKYALTTDDNITIELKDEGNDYYFKLTINEEQQVEFFGKAFHMP LPPFDIGNTTSIYELWINKSDNLPFKERREMSHDISVSTCSNLQINKLNIRDFNINDYFP KDYETVKYSDLHKKSGDSPSTSGLTGQKAPGWILENIDGQSLSLANCKSKVILINFTGIG CGACQAAVPFLKKLKELFSNEEFDLIAIESWSHNASAIRNYAKRKDLNYTILNATNEVIK QYQTGGAAPFFFILDQKRIIRKIIRGYSTENTDKEIINAIKELL >gi|226332238|gb|ACIC01000082.1| GENE 8 8966 - 9151 86 61 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNNIQLLIHMNNINIRFQIISKFMCFQGTIFNASLIFEKLAKERIGSANIHEGVHVFICF I >gi|226332238|gb|ACIC01000082.1| GENE 9 9463 - 9975 499 170 aa, chain - ## HITS:1 COG:BS_ytoA KEGG:ns NR:ns ## COG: BS_ytoA COG0663 # Protein_GI_number: 16080104 # Func_class: R General function prediction only # Function: Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily # Organism: Bacillus subtilis # 3 170 1 166 171 147 48.0 7e-36 MALIKSVRGFTPEIGENCFLADNATIIGDVKIGNDCSIWFCTVLRGDVNSIRIGNGVNIQ DGSVLHTLYEKSTIEIGDHVSVGHNVTIHGATVKDYALIGMGSTILDHAVVGEGSIVAAG SLVLSNTVIEPGSIWGGVPAKFIKKVDPEQAKELNQKIAHNYLMYSQWYK >gi|226332238|gb|ACIC01000082.1| GENE 10 10125 - 11906 1251 593 aa, chain - ## HITS:1 COG:Cj0653c KEGG:ns NR:ns ## COG: Cj0653c COG0006 # Protein_GI_number: 15792013 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase 
# Organism: Campylobacter jejuni # 6 593 5 595 596 461 42.0 1e-129 MKQSIKERIHALRMAFRPNNIKAFIIPSTDPHLSEYVAPYWMSREWISGFTGSAGTVVIL MDKAGLWTDSRYFLQAEKELEGSGITLYKEMLPETPSITKFLCQNLKPGESVSIDGKMFS VQQVEQMKEDLAPYQLQVNLFGDPLKNIWKDRPSMPDAPAFIYDVKYAGKSCGEKVAAIR AELKKKGIYALFLSSLDEIAWTLNLRGSDVHCNPVIVSYLLVTQDEVVYFISPEKITQQV NEYLQEQQVSLRKYDEAESFLNSFAGENILIDPKKTNYAIYSAINPACKIIRGESPVTLL KAIRNEQEIVGIHHAMQRDGVALVRFLKWLEQSVPSGKETELSVDRKLHEFRAAQPLYMG ESFDTIAGYKEHGAIVHYSATEESDVTLQPKGFLLLDSGAQYLDGTTDITRTIALGELTE EEKTDYTLILKGHIALAMAKFPAGTRGAQLDVLARMPIWSHGMNFLHGTGHGVGHFLSVH EGPQSIRMNENPIVLQPGMVTSNEPGVYKAGSHGIRTENLTLVCKDKEGMFGEYFKFETI TLCPICKKGIIKEMLTAEEVKWFNDYHRTVYEKLSPSLNEEEKKWLLEATKAI >gi|226332238|gb|ACIC01000082.1| GENE 11 12098 - 12289 310 63 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160883083|ref|ZP_02064086.1| hypothetical protein BACOVA_01051 [Bacteroides ovatus ATCC 8483] # 1 63 1 63 63 124 100 2e-27 MIVVPVKEGENIEKALKKFKRKFEKTGIVKELRSRQQFDKPSVTKRLKKERAVYVQQLQQ VED >gi|226332238|gb|ACIC01000082.1| GENE 12 12363 - 13244 662 293 aa, chain + ## HITS:1 COG:lin1316 KEGG:ns NR:ns ## COG: lin1316 COG4974 # Protein_GI_number: 16800384 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Listeria innocua # 2 293 7 300 300 191 38.0 1e-48 MLIKSFLDYLRYERNYSEKTVLAYGEDIKQLQEFAQEEYGKFDPLEVEGELIREWIVSLM DRGYTSTSVNRKLSSLRSFYKYLLRQGEVTVDPLRKITGPKNKKPLPAFLKESEMDKLLD DTDFGEGFKGCRDRLIIGVFYATGIRLSELIGLDDKDVDFSASLLKVTGKRNKQRLIPFG DELKELMLEYINVRNETIPERSEAFFVKEDGERLYKNLVYNLVKRNLSKVATLKKRSPHV LRHTFATTMLNNDAELGAVKELLGHESVATTEIYTHATFEELKKVYKQAHPRA >gi|226332238|gb|ACIC01000082.1| GENE 13 13261 - 13560 201 99 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163755828|ref|ZP_02162946.1| 30S ribosomal protein S21 [Kordia algicida OT-1] # 1 98 4 101 102 82 42 7e-15 MDIRIQSIHFDASEQLQAFIQKKVSKLEKYYEDIKKVEVSLKVVKPEAAENKEAGIKILI PNGEFYASKVCDTFEEAVDLDVEALEKQLVKYKEKQRSK >gi|226332238|gb|ACIC01000082.1| GENE 14 14208 - 15392 1406 394 aa, chain + ## PROTEIN SUPPORTED ## 
NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 394 1 407 407 546 66 1e-154 MAKEKFERTKPHVNIGTIGHVDHGKTTLTAAITTVLAKKGLSELRSFDSIDNAPEEKERG ITINTSHVEYETANRHYAHVDCPGHADYVKNMVTGAAQMDGAIIVCAATDGPMPQTREHI LLARQVNVPRLVVFLNKCDMVDDEEMLELVEMEMRELLSFYDFDGDNTPIIQGSALGALN GVEKWEDKVMELMDAVDNWIPLPPRDVDKPFLMPVEDVFSITGRGTVATGRIESGIIHVG DEVEILGLGEDKKSVVTGVEMFRKLLDQGEAGDNVGLLLRGVDKNEIKRGMVLCKPGQIK PHSRFKAEVYILKKEEGGRHTPFHNKYRPQFYLRTMDCTGEITLPEGTEMVMPGDNVTIT VELIYPVALNPGLRFAIREGGRTVGAGQITEIID >gi|226332238|gb|ACIC01000082.1| GENE 15 15534 - 15725 145 63 aa, chain + ## HITS:1 COG:no KEGG:BF4198 NR:ns ## KEGG: BF4198 # Name: not_defined # Def: preprotein translocase SecE subunit # Organism: B.fragilis # Pathway: Protein export [PATH:bfr03060]; Bacterial secretion system [PATH:bfr03070] # 1 63 1 63 63 100 90.0 2e-20 MKKVIAYIKESYDELVHKVSWPTYSELANSAVVVLYASLLIALVVWGMDVCFQNFMEKIV YPH >gi|226332238|gb|ACIC01000082.1| GENE 16 15743 - 16285 586 180 aa, chain + ## HITS:1 COG:CC3205 KEGG:ns NR:ns ## COG: CC3205 COG0250 # Protein_GI_number: 16127435 # Func_class: K Transcription # Function: Transcription antiterminator # Organism: Caulobacter vibrioides # 7 179 12 183 185 150 47.0 9e-37 MAEIEKKWYVLRAISGKEAKVKEYLEADLKNSDLGEYVSQVLIPTEKVYQVRNGKKIVKE RSYLPGYVLVEAALVGEVAHHLRNTPNVIGFLGGSEKPVPLRQSEVNRILGTVDELQETG EELNIPYVVGETVKVTFGPFSGFSGIIEEVNSEKKKLKVMVKIFGRKTPLELGFMQVEKE >gi|226332238|gb|ACIC01000082.1| GENE 17 16347 - 16790 744 147 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348147|ref|NP_811650.1| 50S ribosomal protein L11 [Bacteroides thetaiotaomicron VPI-5482] # 1 147 1 147 147 291 100 7e-78 MAKEVAGLIKLQIKGGAANPSPPVGPALGSKGINIMEFCKQFNARTQDKAGKILPVIITY YADKSFDFVIKTPPVAIQLLEVAKVKSGSAEPNRKKVAELTWEQIRTIAQDKMVDLNCFT VDAAMRMVAGTARSMGIAVKGEFPVNN >gi|226332238|gb|ACIC01000082.1| GENE 18 16806 - 17504 1160 232 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348146|ref|NP_811649.1| 50S ribosomal protein L1 [Bacteroides thetaiotaomicron VPI-5482] # 1 232 1 
232 232 451 100 1e-126 MGKLTKNQKLAAEKIEAGKAYSLKEAASLVKEITFTKFDASLDIDVRLGVDPRKANQMVR GVVSLPHGTGKEVRVLVLCTPDAEAAAKEAGADYVGLDEYIEKIKGGWTDIDVIITMPSI MGKIGALGRVLGPRGLMPNPKSGTVTMDVAKAVKEVKQGKIDFKVDKSGIVHTSIGKVSF SPDQIRDNAKEFISTLNKLKPTAAKGTYIKSIYLSSTMSAGIKIDPKSVDEI >gi|226332238|gb|ACIC01000082.1| GENE 19 17520 - 18038 856 172 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348145|ref|NP_811648.1| 50S ribosomal protein L10 [Bacteroides thetaiotaomicron VPI-5482] # 1 172 1 172 172 334 100 7e-91 MRKEDKNSIIEQIAATVKEYGHFYLVDVTAMNATATSALRRDCFKSDIKLMVVKNTLLHK ALESLEEDFSPLYGSLKGTTAVMFCNTANVPAKLIKDKAKDGIPGLKAAYAEESFYVGAD QLDALVAIKSKNEVIADIVALLQSPAKNVISALQSGGNTLHGVLKTLGERPE >gi|226332238|gb|ACIC01000082.1| GENE 20 18086 - 18460 589 124 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348144|ref|NP_811647.1| 50S ribosomal protein L7/L12 [Bacteroides thetaiotaomicron VPI-5482] # 1 124 1 124 124 231 100 7e-60 MADLKAFAEQLVNLTVKEVNELATILKEEYGIEPAAAAVAVAAGPAAGAAAVEEKTSFDV VLKSAGAAKLQVVKAVKEACGLGLKEAKDMVDGAPSVVKEGLAKDEAESLKKTLEEAGAE VELK >gi|226332238|gb|ACIC01000082.1| GENE 21 18559 - 22377 2917 1272 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 [alpha proteobacterium BAL199] # 1 1271 4 1387 1392 1128 46 0.0 SQMSSNTVNQRVNFASTKNPLEYPDFLEVQLKSFQDFLQLDTPPEKRKNEGLYKVFAENF PIADTRNNFVLEFLDYYIDPPRYTIDDCIERGLTYSVPLKAKLKLYCTDPDHEDFDTVIQ DVFLGPIPYMTDKATFVINGAERVVVSQLHRSPGVFFGQSVHANGTKLYSARIIPFKGSW IEFATDINNVMYAYIDRKKKLPVTTLLRAIGFENDKDILEIFNLAEDVKVNKTNLKRVLG RKLAARVLKTWIEDFVDEDTGEVVSIERNEVIIDRETVLEEVHIDEILESGVQNILLHKD EPNQSDFSIIYNTLQKDPSNSEKEAVLYIYRQLRNADPADDASAREVINNLFFSEKRYDL GDVGRYRINKKLNLTTDMDVRVLTKEDIIEIIKYLIELINSKADVDDIDHLSNRRVRTVG EQLSNQFAVGLARMSRTIRERMNVRDNEVFTPIDLINAKTISSVINSFFGTNALSQFMDQ TNPLAEITHKRRMSALGPGGLSRERAGFEVRDVHYTHYGRLCPIETPEGPNIGLISSLCV FAKINELGFIETPYRKVENGKVDLSDNGLVYLTAEEEEEKIIAQGNAPLNDDGTFVRNKV KSRQDADFPVVEPSEVDLMDVSPQQIASIAASLIPFLEHDDANRALMGSNMMRQAVPLLR SEAPIVGTGIERQLVRDSRTQITAEGDGVVDFVDATTIRILYDRTEDEEFVSFEPALKEY 
RIPKFRKTNQNMTIDLRPICDKGQRVKKGDILTEGYSTEKGELALGKNLLVAYMPWKGYN YEDAIVLNERVVREDLLTSVHVEEYSLEVRETKRGMEELTSDIPNVSEEATKDLDENGIV RIGARIEPGDIMIGKITPKGESDPSPEEKLLRAIFGDKAGDVKDASLKASPSLKGVVIDK KLFSRVIKNRSSKLADKALLPKIDDEFESKVADLKRILVKKLMTLTEGKVSQGVKDYLGA EVIAKGSKFSASDFDSLDFTSIQLSNWTSDDHTNGMVRDLVMNFIKKYKELDAELKRKKF AITIGDELPAGIIQMAKVYIAKKRKIGVGDKMAGRHGNKGIVSRVVRQEDMPFLADGTPV DIVLNPLGVPSRMNIGQIFEAVLGRAGKTLGVKFATPIFDGATMDDLDQWTDKAGLPRYC KTYLCDGGTGEQFDQAATVGVTYMLKLGHMVEDKMHARSIGPYSLITQQPLGGKAQFGGQ RFGEMEVWALEGFGAAHILQEILTIKSDDVVGRSKAYEAIVKGEPMPQPGIPESLNVLLH ELRGLGLSINLE >gi|226332238|gb|ACIC01000082.1| GENE 22 22484 - 26767 4195 1427 aa, chain + ## HITS:1 COG:mlr0277 KEGG:ns NR:ns ## COG: mlr0277 COG0086 # Protein_GI_number: 13470543 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta' subunit/160 kD subunit # Organism: Mesorhizobium loti # 13 1395 18 1356 1398 1326 50.0 0 MAFRKENKTKSNFSKISIGLASPEEILENSSGEVLKPETINYRTYKPERDGLFCERIFGP IKDYECHCGKYKRIRYKGIVCDRCGVEVTEKKVRRERMGHIQLVVPVAHIWYFRSLPNKI GYLLGLPTKKLDSIIYYERYVVIQPGVKAEDGIAEYDLLSEEEYLDILDTLPKDNQYLED NDPNKFIAKMGAEAIYDLLARLDLDALSYELRHRAGNDASQQRKNEALKRLQVVESFRAS RGRNKPEWMIVRIVPVIPPELRPLVPLDGGRFATSDLNDLYRRVIIRNNRLKRLIEIKAP EVILRNEKRMLQESVDSLFDNSRKSSAVKTDANRPLKSLSDSLKGKQGRFRQNLLGKRVD YSARSVIVVGPELKMGECGIPKLMAAELYKPFIIRKLIERGIVKTVKSAKKIVDRKEAVI WDILEHVMKGHPVLLNRAPTLHRLGIQAFQPKMIEGKAIQLHPLACTAFNADFDGDQMAV HLPLSNEAILEAQMLMLQSHNILNPANGAPITVPAQDMVLGLYYITKLRAGAKGEGLTFY GPEEALIAYNEGKVDIHAPVKVIVKDVDENGNIVDVMRETSVGRVIVNEIVPPEAGYINT IISKKSLRDIISDVIKVCGVAKAADFLDGIKNLGYQMAFKGGLSFNLGDIIIPKEKETLV QKGYDEVEQVVNNYNMGFITNNERYNQVIDIWTHVNSELSNILMKTISSDDQGFNSVYMM LDSGARGSKEQIRQLSGMRGLMAKPQKAGAEGGQIIENPILSNFKEGLSVLEYFISTHGA RKGLADTALKTADAGYLTRRLVDVSHDVIITEEDCGTLRGLVCTDLKNNDEVIATLYERI LGRVSVHDIIHPTTGELLVAGGEEITEEVAKKIQDSPIESVEIRSVLTCEAKKGVCAKCY GRNLATSRMVQKGEAVGVIAAQSIGEPGTQLTLRTFHAGGTAANIAANASIVAKNSARLE FEELRTVDIVDEMGEAAKVVVGRLAEVRFVDVNTGIVLSTHNVPYGSTLYVSDGDLVEKG KLIAKWDPFNAVIITEATGKIEFEGVIENVTYKVESDEATGLREIIIIESKDKTKVPSAH 
ILTEDGDLIRTYNLPVGGHVIIENGQKVKAGEVIVKIPRAVGKAGDITGGLPRVTELFEA RNPSNPAVVSEIDGEVTMGKIKRGNREIIVTSKTGEVKKYLVALSKQILVQENDYVRAGT PLSDGATTPADILAIKGPTAVQEYIVNEVQDVYRLQGVKINDKHFEIIVRQMMRKVQIDE PGDTRFLEQQVVDKLEFMEENDRIWGKKVVVDAGDSQNMQPGQIVTARKLRDENSMLKRR DLKPVEVRDAVAATSTQILQGITRAALQTSSFMSAASFQETTKVLNEAAINGKTDKLEGM KENVICGHLIPAGTGQREFEKIIVGSKEEYDRILANKKTVLDYNEVE >gi|226332238|gb|ACIC01000082.1| GENE 23 26910 - 27215 266 101 aa, chain + ## HITS:1 COG:no KEGG:BT_2732 NR:ns ## KEGG: BT_2732 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 101 1 101 101 179 100.0 3e-44 MEEQNNNQLQIELKEEVAQGTYANLAIITHSSSEFILDFVRVMPGVPKAGVQSRIIVAPE HAKRLLRALEDNIAKYERAFGPIRISEESPMPPLSVVKGEA >gi|226332238|gb|ACIC01000082.1| GENE 24 27353 - 27754 686 133 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348140|ref|NP_811643.1| 30S ribosomal protein S12 [Bacteroides thetaiotaomicron VPI-5482] # 1 133 1 133 133 268 100 4e-71 MPTIQQLVRKGREVLVEKSKSPALDSCPQRRGVCVRVYTTTPKKPNSAMRKVARVRLTNQ KEVNSYIPGEGHNLQEHSIVLVRGGRVKDLPGVRYHIVRGTLDTAGVAGRTQRRSKYGAK RPKPGQAAPAKKK >gi|226332238|gb|ACIC01000082.1| GENE 25 27898 - 28374 808 158 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348139|ref|NP_811642.1| 30S ribosomal protein S7 [Bacteroides thetaiotaomicron VPI-5482] # 1 158 1 158 158 315 100 3e-85 MRKAKPKKRVILPDPVFNDQKVSKFVNHLMYDGKKNASYEIFYAALETVKAKLPNEEKSA LEIWKKALDNVTPQVEVKSRRVGGATFQVPTEIRPDRKESVSMKNLILFARKRGGKSMAD KLAAEIMDAFNEQGGAYKRKEDMHRMAEANRAFAHFRF >gi|226332238|gb|ACIC01000082.1| GENE 26 28430 - 30547 1889 705 aa, chain + ## HITS:1 COG:HP1195 KEGG:ns NR:ns ## COG: HP1195 COG0480 # Protein_GI_number: 15645809 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Helicobacter pylori 26695 # 3 700 4 692 692 845 60.0 0 MAKTDLHLTRNIGIMAHIDAGKTTTSERILFYTGLTHKIGEVHDGAATMDWMEQEQERGI TITSAATTTRWKYAGDTYKINLIDTPGHVDFTAEVERSLRILDGAVAAYCAVGGVEPQSE 
TVWRQADKYNVPRIAYVNKMDRSGADFFEVVRQMKAVLGANPCPIVVPIGAEETFKGLVD LIKMKAIYWHDETMGADYTVEEIPADLVDEANEWRDKMLEKVAEFDDALMEKYFDDPSTI TEEEVLRALRNATVQMAVVPMLCGSSFKNKGVQTLLDYVCAFLPSPLDTENVIGTNPNTG AEEDRKPSDDEKTSALAFKIATDPYVGRLTFFRVYSGKIEAGSYIYNSRSGKKERVSRLF QMHSNKQNPVEVIGAGDIGAGVGFKDIHTGDTLCDETAPIVLESMDFPEPVIGIAVEPKT QKDMDKLSNGLAKLAEEDPTFTVKTDEQTGQTVISGMGELHLDIIIDRLKREFKVECNQG KPQVNYKEAITKTVDLREVYKKQSGGRGKFADIIVKIGPVDEDFKEGGLQFIDEVKGGNI PKEFIPSVQKGFTSAMKNGVLAGYPLDSMKVTLIDGSFHPVDSDQLSFEICAIQAYKNAC AKAGPVLMEPIMKLEVVTPEENMGDVIGDLNKRRGQVEGMESSRSGARIVKAMVPLAEMF GYVTALRTITSGRATSSMVYSHHAQVSNSIAKAVLEEVKGRADLL >gi|226332238|gb|ACIC01000082.1| GENE 27 30632 - 30937 495 101 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348137|ref|NP_811640.1| 30S ribosomal protein S10 [Bacteroides thetaiotaomicron VPI-5482] # 1 101 1 101 101 195 100 5e-49 MSQKIRIKLKSYDHNLVDKSAEKIVRTVKATGAIVSGPIPLPTHKRIFTVNRSTFVNKKS REQFELSSYKRLIDIYSSTAKTVDALMKLELPSGVEVEIKV >gi|226332238|gb|ACIC01000082.1| GENE 28 30956 - 31573 1071 205 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348136|ref|NP_811639.1| 50S ribosomal protein L3 [Bacteroides thetaiotaomicron VPI-5482] # 1 205 1 205 205 417 100 1e-116 MPGLLGKKIGMTSVFSADGKNVPCTVIEAGPCVVTQVKTVEKDGYAAVQLGFQDKKEKHT TKPLMGHFKRAGVTPKRHLAEFKEFETELNLGDTITVEMFNDATFVDVVGTSKGKGFQGV VKRHGFGGVGQATHGQHNRARKPGSIGACSYPAKVFKGMRMGGQLGGDRVTVQNLQVLKV IADHNLLLIKGSIPGCKGSIVIIEK >gi|226332238|gb|ACIC01000082.1| GENE 29 31573 - 32199 1045 208 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348135|ref|NP_811638.1| 50S ribosomal protein L4 [Bacteroides thetaiotaomicron VPI-5482] # 1 208 1 208 208 407 100 1e-113 MEVNVYNIKGEDTGRKVTLNESIFGIEPNDHAIYLDVKQFMANQRQGTHKSKERSEISGS TRKIGRQKGGGGARRGDMNSPVLVGGARVFGPKPRDYFFKLNKKVKTLARKSALSYKVQN NALIVVEDFVFEAPKTKDFVAMTKNLKVSDKKLLVILPEANKNVYLSARNIEGANVQTVS GLNTYRVLNAGVVVLTENSLKAIDNILI >gi|226332238|gb|ACIC01000082.1| GENE 30 32216 - 32506 481 96 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348134|ref|NP_811637.1| 50S ribosomal protein L23 [Bacteroides thetaiotaomicron 
VPI-5482] # 1 96 1 96 96 189 100 2e-47 MGIIIKPLVTEKMTAITDKLNRFGFVVRPDANKLEIKKEVEALYNVTVVDVNTVKYAGKN KSRYTKAGIINGRTNAFKKAIVTLKEGDTIDFYSNI >gi|226332238|gb|ACIC01000082.1| GENE 31 32512 - 33336 1419 274 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348133|ref|NP_811636.1| 50S ribosomal protein L2 [Bacteroides thetaiotaomicron VPI-5482] # 1 274 1 274 274 551 100 1e-156 MAVRKLKPTTPGQRHKIIGTFEEITASVPEKSLVYGKKSSGGRNNEGKMTMRYIGGGHRK VIRIVDFKRNKDGVPAVVKTIEYDPNRSARIALLYYADGEKRYIIAPNGLQVGATLMSGE TAAPEIGNTLPLQNIPVGTVIHNIELRPGQGAALVRSAGNFAQLTSREGKYCVIKLPSGE VRQILSTCKATIGSVGNSDHGLERSGKAGRSRWQGRRPRNRGVVMNPVDHPMGGGEGRSS GGHPRSRKGLYAKGLKTRAPKKQSSKYIIERRKK >gi|226332238|gb|ACIC01000082.1| GENE 32 33357 - 33626 467 89 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348132|ref|NP_811635.1| 30S ribosomal protein S19 [Bacteroides thetaiotaomicron VPI-5482] # 1 89 1 89 89 184 100 9e-46 MSRSLKKGPYINVKLEKRIFAMNESGKKVVVKTWARASMISPDFVGHTVAVHNGNKFIPV YVTENMVGHKLGEFAPTRTFRGHAGNKKR >gi|226332238|gb|ACIC01000082.1| GENE 33 33662 - 34072 679 136 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348131|ref|NP_811634.1| 50S ribosomal protein L22 [Bacteroides thetaiotaomicron VPI-5482] # 1 136 1 136 136 266 100 2e-70 MGARKKISAEKRKEALKTMYFAKLQNVPTSPRKMRLVADMIRGMEVNRALGVLKFSSKEA AARVEKLLRSAIANWEQKNERKAESGELFVTQIFVDGGATLKRMRPAPQGRGYRIRKRSN HVTLFVGAKSNNEDQN >gi|226332238|gb|ACIC01000082.1| GENE 34 34078 - 34809 1250 243 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348130|ref|NP_811633.1| 30S ribosomal protein S3 [Bacteroides thetaiotaomicron VPI-5482] # 1 243 1 243 243 486 100 1e-136 MGQKVNPISNRLGIIRGWDSNWYGGNDYGDSLLEDSKIRKYLNARLAKASVSRIVIERTL KLVTITVCTARPGIIIGKGGQEVDKLKEELKKVTDKDIQINIFEVKRPELDAVIVANNIA RQVEGKIAYRRAIKMAIANTMRMGAEGIKIQISGRLNGAEMARSEMYKEGRTPLHTFRAD IDYCHAEALTKVGLLGIKVWICRGEVFGKKELAPNFTQSKESGRGNNGGNNGGKNFKRKK NNR >gi|226332238|gb|ACIC01000082.1| GENE 35 34833 - 35267 736 144 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348129|ref|NP_811632.1| 50S ribosomal protein L16 [Bacteroides thetaiotaomicron 
VPI-5482] # 1 144 1 144 144 288 100 6e-77 MLQPKKTKFRRQQKGRAKGNAQRGNQLAFGSFGIKALETKWITGRQIEAARIAVTRYMQR QGQIWIRIFPDKPITRKPADVRMGKGKGAPEGFVAPVTPGRIIIEAEGVSYEIAKEALRL AAQKLPITTKFVVRRDYDIQNQNA >gi|226332238|gb|ACIC01000082.1| GENE 36 35273 - 35470 321 65 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348128|ref|NP_811631.1| 50S ribosomal protein L29 [Bacteroides thetaiotaomicron VPI-5482] # 1 65 1 65 65 128 100 8e-29 MKIAEIKEMTTNDLVERVEAETANYNQMVINHSISPLENPAQIKQLRRTIARMKTELRQR ELNNK >gi|226332238|gb|ACIC01000082.1| GENE 37 35467 - 35736 456 89 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348127|ref|NP_811630.1| 30S ribosomal protein S17 [Bacteroides thetaiotaomicron VPI-5482] # 1 89 1 89 89 180 100 2e-44 MISLMEARNLRKERTGVVLSNKMEKTITVAAKFKEKHPIYGKFVSKTKKYHAHDEKNECN IGDTVRIMETRPLSKTKRWRLVEIIERAK >gi|226332238|gb|ACIC01000082.1| GENE 38 35739 - 36104 596 121 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348126|ref|NP_811629.1| 50S ribosomal protein L14 [Bacteroides thetaiotaomicron VPI-5482] # 1 121 1 121 121 234 100 1e-60 MIQVESRLTVCDNSGAKEALCIRVLGGTGRRYASVGDVIVVSVKSVIPSSDVKKGAVSKA LIVRTKKEIRRPDGSYIRFDDNACVLLNNAGEIRGSRIFGPVARELRATNMKVVSLAPEV L >gi|226332238|gb|ACIC01000082.1| GENE 39 36125 - 36442 527 105 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348125|ref|NP_811628.1| 50S ribosomal protein L24 [Bacteroides thetaiotaomicron VPI-5482] # 1 105 1 105 105 207 100 1e-52 MSKLHIKKGDTVYVNAGEDKGKTGRVLKVLVKEGRAIVEGINMVSKSTKPNAKNPQGGIV KQEASIHISNLNPVDPKTGKATRIGRRKSSEGTLVRYSKKSGEEI >gi|226332238|gb|ACIC01000082.1| GENE 40 36445 - 37002 914 185 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348124|ref|NP_811627.1| 50S ribosomal protein L5 [Bacteroides thetaiotaomicron VPI-5482] # 1 185 1 185 185 356 100 1e-97 MSNTASLKKEYADRIAPALKSQFQYSSTMQVPVLKKIVINQGLGMAVADKKIIEVAINEM TAITGQKAVATISRKDIANFKLRKKMPIGVMVTLRRERMYEFLEKLVRVALPRIRDFKGI ESKFDGKGNYTLGIQEQIIFPEINIDSITRILGMNITFVTSAETDEEGYALLKEFGLPFK NAKKD >gi|226332238|gb|ACIC01000082.1| GENE 41 37009 - 37308 502 99 aa, chain + ## PROTEIN 
SUPPORTED ## NR: gi|29348123|ref|NP_811626.1| 30S ribosomal protein S14 [Bacteroides thetaiotaomicron VPI-5482] # 1 99 1 99 99 197 100 8e-50 MAKESMKAREVKRAKLVARYAEKRAALKQIVRTGEPADAFEAAQKLQELPKNSNPIRMHN RCKLTGRPKGYIRQFGISRIQFREMASNGLIPGVKKASW >gi|226332238|gb|ACIC01000082.1| GENE 42 37361 - 37756 664 131 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348122|ref|NP_811625.1| 30S ribosomal protein S8 [Bacteroides thetaiotaomicron VPI-5482] # 1 131 1 131 131 260 100 1e-68 MTDPIADYLTRLRNAIGAKHRVVEVPASNLKKEITKILFEKGYILNYKFVEDGPQGTIKV ALKYDSVNKVNAIKKLERVSSPGMRKYTGYKDMPRVINGLGIAIISTSKGVMTNKEAAEL KIGGEVLCYVY >gi|226332238|gb|ACIC01000082.1| GENE 43 37774 - 38343 965 189 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348121|ref|NP_811624.1| 50S ribosomal protein L6 [Bacteroides thetaiotaomicron VPI-5482] # 1 189 1 189 189 376 100 1e-103 MSRIGKLPISIPAGVTVTLKDDVVTVKGPKGELSQYVNPAINVAIEDGHVTLTENEKEMI DNPKQKHAFHGLYRSLVHNMVVGVSEGYKKELELVGVGYRASNQGNIIELALGYTHNIFI QLPAEVKVETKSERNKNPLIILESCDKQLLGQVCSKIRSFRKPEPYKGKGIKFVGEVIRR KSGKSAGAK >gi|226332238|gb|ACIC01000082.1| GENE 44 38365 - 38709 560 114 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348120|ref|NP_811623.1| 50S ribosomal protein L18 [Bacteroides thetaiotaomicron VPI-5482] # 1 114 1 114 114 220 100 2e-56 MTTKIERRVKIKYRVRNKISGTAERPRMSVFRSNKQIYVQIIDDLSGKTLAAASSLGMAE KVAKKEQAAKVGEMIAKKAQEAGITTVVFDRNGYLYHGRVKEVADAARNGGLKF >gi|226332238|gb|ACIC01000082.1| GENE 45 38715 - 39233 848 172 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348119|ref|NP_811622.1| 30S ribosomal protein S5 [Bacteroides thetaiotaomicron VPI-5482] # 1 172 1 172 172 331 100 6e-90 MAGVNNRVKVTNDIELKDRLVAINRVTKVTKGGRTFSFSAIVVVGNEEGIIGWGLGKAGE VTAAIAKGVESAKKNLVKVPVLKGTVPHEQSAKFGGAEVFIKPASHGTGVVAGGAMRAVL ESVGITDVLAKSKGSSNPHNLVKATIEALSEMRDARMIAQNRGISVEKVFRG >gi|226332238|gb|ACIC01000082.1| GENE 46 39243 - 39419 281 58 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|53715448|ref|YP_101440.1| 50S ribosomal protein L30 [Bacteroides fragilis YCH46] # 1 58 1 58 58 112 100 3e-24 
MSTIKIKQVKSRIGAPADQKRTLDALGLRKLNRVVEHESTPSILGMVDKVKHLVAIVK >gi|226332238|gb|ACIC01000082.1| GENE 47 39451 - 39897 735 148 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348117|ref|NP_811620.1| 50S ribosomal protein L15 [Bacteroides thetaiotaomicron VPI-5482] # 1 148 1 148 148 287 100 8e-77 MNLSNLKPAEGSTKTRKRIGRGAGSGLGGTSTRGHKGAKSRSGYSKKVGFEGGQMPLQRR VPKFGFKNINRVEYKAINLDTIQTLADAKNLTKVGISDFIEAGFISSNQLVKVLGNGALT NKLEVEANAFSKSAAAAIEAAGGTVVKL >gi|226332238|gb|ACIC01000082.1| GENE 48 39902 - 41245 879 447 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 [alpha proteobacterium BAL199] # 13 443 19 447 447 343 41 2e-93 MRKAIETLKNIWKIEDLRQRILITILFVAIYRFGSYVVLPGINPAMLAKLHEQTSEGLLA LLNMFSGGAFSNASIFALGIMPYISASIVIQLLGIAVPYFQKLQREGESGRRKMNQYTRY LTIAILLVQAPSYLLNLKMQAGPSLNASLDWTLFMVTSTIILAAGSMFILWLGERITDKG IGNGISFIILIGIIARFPDALIQEVVSRVANKSGGLIMFIIEVVFLLLVIGAAILLVQGT RKIPVQYAKRIVGNKQYGGARQYIPLKVNAAGVMPIIFAQAIMFIPITFIGFSNNVNNAG GFLHAFTDHTSFWYNFVFAVMIILFTYFYTAITINPTQMAEDMKRNNGFIPGIKPGKKTA EYIDDIMSRITLPGSFFLALVAIMPAFAGIFGVQAGFAQFFGGTSLLILVGVVLDTLQQV ESHLLMRHYDGLLKSGRIKGRAGVAAY >gi|226332238|gb|ACIC01000082.1| GENE 49 41261 - 42058 721 265 aa, chain + ## HITS:1 COG:BH0156 KEGG:ns NR:ns ## COG: BH0156 COG0024 # Protein_GI_number: 15612719 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Bacillus halodurans # 1 251 1 246 248 263 50.0 2e-70 MIFLKTEDEIELLRQSNLLVGRTLAEVAKVVKPGVTTRELDKVAEEFIRDNGATPTFKGF PNQYGDPFPASICTSVNEQVVHGIPGDIVLKDGDIVSVDCGTYMNGFCGDSAYTFCVGEV DEEIRNLLKVTKEALYIGIQNAVHGKRIGDIGYAIQQYCESHSYGVVREFVGHGIGKEMH EDPQVPNYGKRGYGPLMKRGLCIAIEPMITLGDRQVIMEGDGWTVRTRDRKCAAHFEHTV AVGAGEADILSSFKFIEEVLGDKAI >gi|226332238|gb|ACIC01000082.1| GENE 50 42060 - 42278 239 72 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 [Streptococcus pneumoniae TIGR4] # 1 72 1 72 72 96 61 3e-19 
MAKQSAIEQDGVIVEALSNAMFRVELENGHEITAHISGKMRMHYIKILPGDKVRVEMSPY DLSKGRIVFRYK >gi|226332238|gb|ACIC01000082.1| GENE 51 42288 - 42404 198 38 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|53715443|ref|YP_101435.1| 50S ribosomal protein L36 [Bacteroides fragilis YCH46] # 1 38 1 38 38 80 100 1e-14 MKVRASLKKRTPECKIVRRNGRLYVINKKNPKYKQRQG >gi|226332238|gb|ACIC01000082.1| GENE 52 42438 - 42818 637 126 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348113|ref|NP_811616.1| 30S ribosomal protein S13 [Bacteroides thetaiotaomicron VPI-5482] # 1 126 1 126 126 249 100 2e-65 MAIRIVGVDLPQNKRGEIALTYVYGIGRSSSAKILDKAGVDKDLKVKDWTDDQAAKIREI IGAEYKVEGDLRSEIQLNIKRLMDIGCYRGVRHRIGLPVRGQSTKNNARTRKGRKKTVAN KKKATK >gi|226332238|gb|ACIC01000082.1| GENE 53 42830 - 43219 665 129 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348112|ref|NP_811615.1| 30S ribosomal protein S11 [Bacteroides thetaiotaomicron VPI-5482] # 1 129 1 129 129 260 100 1e-68 MAKKTVAAKKRNVKVDANGQLHVHSSFNNIIVSLANSEGQIISWSSAGKMGFRGSKKNTP YAAQMAAQDCAKIAFDLGLRKVKAYVKGPGNGRESAIRTIHGAGIEVTEIIDVTPLPHNG CRPPKRRRV >gi|226332238|gb|ACIC01000082.1| GENE 54 43336 - 43941 1029 201 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348111|ref|NP_811614.1| 30S ribosomal protein S4 [Bacteroides thetaiotaomicron VPI-5482] # 1 201 1 201 201 400 100 1e-111 MARYTGPKSRIARKFGEGIFGADKVLSKKNYPPGQHGNSRKRKTSEYGVQLREKQKAKYT YGVLEKQFRNLFEKAATAKGITGEVLLQLLEGRLDNVVFRLGIAPTRAAARQLVSHKHIT VDGEVVNIPSFAVKPGQLIGVRERSKSLEVIANSLAGFNHSKYAWLEWDETSKVGKMLHI PERADIPENIKEHLIVELYSK >gi|226332238|gb|ACIC01000082.1| GENE 55 43953 - 44945 912 330 aa, chain + ## HITS:1 COG:BMEI0781 KEGG:ns NR:ns ## COG: BMEI0781 COG0202 # Protein_GI_number: 17987064 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, alpha subunit/40 kD subunit # Organism: Brucella melitensis # 8 325 11 321 337 239 42.0 7e-63 MAILAFQKPDKVLMLEADSRFGKFEFRPLEPGFGITVGNALRRILLSSLEGFAITTIRID GVEHEFSSVPGVKEDVTNIILNLKQVRFKQVVEEFESEKVSITVENSSEFKAGDIGKYLT GFEVLNPELVICHLDSKATMQIDITINKGRGYVPADENREYCTDVNVIPIDSIYTPIRNV 
KYAVENFRVEQKTDYEKLVLEISTDGSIHPKEALKEAAKILIYHFMLFSDEKITLESNDT DGNEEFDEEVLHMRQLLKTKLVDMDLSVRALNCLKAADVETLGDLVQFNKTDLLKFRNFG KKSLTELDDLLESLNLSFGTDISKYKLDKE >gi|226332238|gb|ACIC01000082.1| GENE 56 44949 - 45440 813 163 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29348109|ref|NP_811612.1| 50S ribosomal protein L17 [Bacteroides thetaiotaomicron VPI-5482] # 1 163 1 163 163 317 100 7e-86 MRHNKKFNHLGRTASHRSAMLSNMACSLIKHKRITTTVAKAKALKKFVEPLITKSKEDTT NSRRVVFSNLQDKIAVTELFKEISVKIADRPGGYTRIIKTGNRLGDNAEMCFIELVDYNE NMAKEKVAKKATRTRRSKKSAEAAAPAAVEAPATEEPKAESAE >gi|226332238|gb|ACIC01000082.1| GENE 57 45588 - 45983 153 131 aa, chain + ## HITS:1 COG:no KEGG:BT_2699 NR:ns ## KEGG: BT_2699 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 131 1 131 131 256 100.0 2e-67 MAKFVKVILFLLLAVALHGVAGNVFTEKQVEQPEHAVTYSMKPQGQINAPESPYLPVAEL TNLQSHQISVTRIQRVQIGEYFSSLRNVLQCCADRDNSLSQHWGRIYNTTTSYYCQPASE YYVFTLRHIII >gi|226332238|gb|ACIC01000082.1| GENE 58 46090 - 46629 422 179 aa, chain + ## HITS:1 COG:no KEGG:BT_2698 NR:ns ## KEGG: BT_2698 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 179 1 179 179 307 100.0 1e-82 MATQAIDATIFASSHPDIAKRTSISGLLISCVMLLIGILSFASTFELEDKSSTVSMALMV FGTGLFLLGIFRLFWKSKEVVYVPTGSIAKERSIFFDLKHMDKLTDLVNSGSFSADSKIK SEASGNIRMDVILSADNKFAAVQLFQFVPYTYQPVTAVHYFTNDAAAAVAAFLTKSQLN >gi|226332238|gb|ACIC01000082.1| GENE 59 46820 - 48634 1037 604 aa, chain - ## HITS:1 COG:CAC3034 KEGG:ns NR:ns ## COG: CAC3034 COG0249 # Protein_GI_number: 15896285 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Clostridium acetobutylicum # 1 601 1 597 598 259 27.0 1e-68 MEKQENIIATYQQIIQETEQQLQKARKRIRYISLLRLILFVEAVASAIIFWSDGWLKLIV FTVIPIIAFIWLVQSHNRWFYWKDYLKKKIEINQQELRALQYDFSDFDNGKEYIDPSHLY TFDLDIFGEHSLFQYINRTSTPIGKQRLANWFNAHLEEKEAIEQRQEAIRELSSELEFRQ QFRLLGLLYKGKPSDTSEIKEWVNSPSDYRKHAFLRILPTAVGIINLLCIGATILGFLPA 
SISGIVFACFVVFSFIFSKGITKIQATYGEKLQILSTYADQILITEKKEMHSPALQQLKA ELTSQNQTASQSVHRLAKLMNALDQRGNLLMSTILNGLLFWELRQVMQIEKWKEAHSADL PRWIEAIGAIDAYCSLATFSYNHPDYIYPQITSQSFHLQAEALGHPLMNRNKCVRNGIDM EKRPFFIIITGANMAGKSTYLRTVGVNYLLACIGAPVWAAKMEIFPARLVTSLRTSDSLT DNESYFFAELKRLKLIIDKLKAGEELFIILDEILKGTNSMDKQKGSFALIKQFMSMDANG IIATHDLLLGTLIDAFPQNIRNYCFEADITNNELTFSYQMRSGVAQNMNACFLMKKMGIA VIDG >gi|226332238|gb|ACIC01000082.1| GENE 60 49690 - 50337 457 215 aa, chain + ## HITS:1 COG:no KEGG:BT_2695 NR:ns ## KEGG: BT_2695 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 215 8 222 222 434 100.0 1e-121 MKKGYILWLLISVISLPVVAQNIIYSNLKELMVQDGDTLAVLRVEKRSRSQILLTGGADY RITAGNDESMCRQLKKRCFAVRTSEGDLFINCRKLRYKKMRFGVWYAPAVQLGKNIYFCA MPLGSVIGGTFAEEDDVKLGGNVGDALAASSLVTKRVCYELNGETGKIDFVGREKMAQLL EKHPEWKKAYLTKDSQEAKHTFKYLQMLCGEQQKE >gi|226332238|gb|ACIC01000082.1| GENE 61 50458 - 52449 1433 663 aa, chain + ## HITS:1 COG:SPy2082 KEGG:ns NR:ns ## COG: SPy2082 COG2987 # Protein_GI_number: 15675840 # Func_class: E Amino acid transport and metabolism # Function: Urocanate hydratase # Organism: Streptococcus pyogenes M1 GAS # 1 661 13 673 676 1039 70.0 0 MKITLGNTLPPYPAFIEGIRRAPDRSYMLTPAQTATALKNALRYIPLELHEQLAPEFMEE LRTRGRIYGYRFRPAGDLKARPIDEYRGKCVEGKAFQVMIDNNLCFDIALYPYELVTYGE TGQVCQNWMQYLLIKQYLEELTPEQTLVIESGHPLGLFRSRPDAPRVIITNSMMIGQFDN QHDWHIAAQMGVANYGQMTAGGWMYIGPQGIVHGTFNTLLNAGRLKLGIPQDKDLRGHLF VSSGLGGMSGAQPKAAEIAGAASIIAEVDSSRIETRYRQGWVGHVTQDLSEAFRLAKEAM ERREPCSIAYHGNVVDLLEYAEREQIHIELLSDQTSCHAVYEGGYCPVGLTFEERTRLLH ESPDQFRRLADASLRRHFEVIRKLVACGTYFFDYGNSFMKAIYDAGVKEISRNGVDEKDG FIWPSYVEDIMGPQLFDYGYGPFRWVCLSGRHEDLIKTDRAAMECIDVNRRGQDLDNYNW IRDAEKNRLVVGTQARILYQDAVGRMNIALRFNEMVRRGEVGPIMLGRDHHDVSGTDSPF RETSNIKDGSNVMADMAVQCFAGNCARGMSLVALHNGGGVGIGKAINGGFGMVCDGSERV DEILRSAMLWDVMGGVARRSWARNPHAMETSDAFNESHSEGYHITMPYLADDELIKKTIK EAL >gi|226332238|gb|ACIC01000082.1| GENE 62 52478 - 53380 995 300 aa, chain + ## HITS:1 COG:SPy2083 KEGG:ns NR:ns ## COG: SPy2083 COG3643 # 
Protein_GI_number: 15675841 # Func_class: E Amino acid transport and metabolism # Function: Glutamate formiminotransferase # Organism: Streptococcus pyogenes M1 GAS # 4 299 2 298 299 337 53.0 1e-92 MNWSKIIECVPNFSEGRDLEKIDKIVAPFRGKSGVKLLDYSNDEDHNRLVVTLVGEPEAL RDAVIEAIGIAVRLIDLNHHSGQHPRMGAVDVVPFIPIKNTTMEEAVALSKEVASRVAEL YNLPVFLYEKSATAPHRENLASVRKGEFEGMAEKIRLPEWQPDFGPAERHPTAGTVAIGA RMPLVAYNINLSTNNLEIATKIAKNIRHINGGLRYVKAMGVELKERNITQVSINMTDYTR TALYRAFELVRIEARRYGVAIVGSEIIGLVPMEALIDTASYYLGLENFSMQQVLEARIME >gi|226332238|gb|ACIC01000082.1| GENE 63 53478 - 54666 1074 396 aa, chain + ## HITS:1 COG:SPy2081 KEGG:ns NR:ns ## COG: SPy2081 COG1228 # Protein_GI_number: 15675839 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Imidazolonepropionase and related amidohydrolases # Organism: Streptococcus pyogenes M1 GAS # 23 391 32 398 428 338 46.0 2e-92 MSENLIIFNARIVTPTGTSARKGAEMGQLRIIENGTVEVTKGIITYVGESRGEDRDGYYQ HYWHYNARGHCLLPGFVDSHTHFVFGGERSEEFSWRLKGESYMSIMERGGGIASTVKATR QMNFLKLRSAAEGFLKKMSAMGVTTVEGKSGYGLDRETELLQLKIMRSLNNDEHKRIDIV STFLGAHALPEEYKGRGDEYIDFLIREMLPVIRENELAECCDVFCEQGVFSVEQSRRLLQ AAKEQGFLLKLHADEIVSFGGAELAAELGALSADHLLQASDAGIRAMADAGVVATLLPLT AFALKEPYARGREMIDAGCAVALATDLNPGSCFSGSIPLTIALACIYMKLSIEETITALT LNGAAALHRADRIGSIEVGKQGDFVILNSDNYHILP Prediction of potential genes in microbial genomes Time: Thu May 12 01:28:45 2011 Seq name: gi|226332237|gb|ACIC01000083.1| Bacteroides sp. 1_1_6 cont1.83, whole genome shotgun sequence Length of sequence - 952 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 697 214 ## COG0582 Integrase 2 1 Op 2 . 
- CDS 690 - 950 79 ## GFO_1229 phage integrase family protein Predicted protein(s) >gi|226332237|gb|ACIC01000083.1| GENE 1 1 - 697 214 232 aa, chain - ## HITS:1 COG:mlr9323 KEGG:ns NR:ns ## COG: mlr9323 COG0582 # Protein_GI_number: 13488300 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Mesorhizobium loti # 53 224 47 211 316 78 32.0 9e-15 MAEQFNYSSVLSPYIKRMLEIRKSMGIVDSRVRWILKEFDDFANSIGLQEPHITEEFVKR WHKSRISDKEITIYGKYLVLRQLTSLMCRNGCVCYIPIIPKQPKSEFTPHIYTHEQISQL FTAADNSRLFNSCMKTAIISMPVILRLLYSTGMRVSEALYMRNEDVNLDSGYIHLRKTKN RCERLVPIGESMVIVLKQYIEYRNRMPIEKISHSNHLFFTKLDGTSFRVCTL >gi|226332237|gb|ACIC01000083.1| GENE 2 690 - 950 79 86 aa, chain - ## HITS:1 COG:no KEGG:GFO_1229 NR:ns ## KEGG: GFO_1229 # Name: not_defined # Def: phage integrase family protein # Organism: G.forsetii # Pathway: not_defined # 5 85 344 424 425 103 53.0 2e-21 NKIIRLSGVNIDKKRHGPHSLRHSLASNMLENGATMPTISEVLGHRNTATTMTYLKINLV ALRKCVLPVPPIPDSFYTQKGGAFYG Prediction of potential genes in microbial genomes Time: Thu May 12 01:28:49 2011 Seq name: gi|226332236|gb|ACIC01000084.1| Bacteroides sp. 1_1_6 cont1.84, whole genome shotgun sequence Length of sequence - 5678 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 4, operones - 1 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 167 - 403 72 ## BF0112 lysozyme + Term 431 - 476 5.7 + Prom 416 - 475 7.5 2 2 Tu 1 . + CDS 532 - 2022 479 ## ECSE_P2-0099 hypothetical protein + Term 2041 - 2094 12.2 - Term 2037 - 2075 1.6 3 3 Tu 1 . - CDS 2087 - 2308 108 ## BF1096 hypothetical protein - Term 2386 - 2421 3.9 4 4 Op 1 . - CDS 2431 - 2739 310 ## BF0110 hypothetical protein 5 4 Op 2 . - CDS 2793 - 3038 237 ## BF0109 hypothetical protein 6 4 Op 3 . - CDS 3035 - 3265 205 ## BT_2286 hypothetical protein 7 4 Op 4 . - CDS 3285 - 3542 206 ## BF0108 hypothetical protein 8 4 Op 5 . - CDS 3567 - 4898 776 ## BF0107 hypothetical protein 9 4 Op 6 . 
- CDS 4895 - 5311 337 ## BF0106 hypothetical protein 10 4 Op 7 . - CDS 5325 - 5546 184 ## BF0105 hypothetical protein - Prom 5570 - 5629 1.7 Predicted protein(s) >gi|226332236|gb|ACIC01000084.1| GENE 1 167 - 403 72 78 aa, chain + ## HITS:1 COG:no KEGG:BF0112 NR:ns ## KEGG: BF0112 # Name: not_defined # Def: lysozyme # Organism: B.fragilis # Pathway: not_defined # 1 78 97 174 174 135 87.0 3e-31 MFRQFGKDSLLLATLAYNVGPYRLLGSKTIPKSTLIKKLEAGDRNIYHEYIAFCSYKGKR HAMLLTRRKVEFALLYIP >gi|226332236|gb|ACIC01000084.1| GENE 2 532 - 2022 479 496 aa, chain + ## HITS:1 COG:no KEGG:ECSE_P2-0099 NR:ns ## KEGG: ECSE_P2-0099 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 12 465 1 519 583 207 30.0 9e-52 MPNFASKDRTIMNNKLNIKIEQLKNINNIELNLPIETGLYAITGANGTGKSTIMTVISKV VRNSAFNVFQPHDYSSNSKITISYDGKENSWTKASRGWSCSSTDIISLKGFYEGSIIHGM RFIDANYDTLLKAERVNNTILTDADSFVSRNLSYILHGNYDFYTNLKRIKNRTLAQLKAF KGIPYFIEGTNGIVNQFCMSAGENMLISLLHMLNVVIVRPAKSEDVRLILIDEIELALHP SAIMRLVDFLQKLATEYNLAIYFSSHSIELLRKIKPSNIFHLQKELDNIAIVNPCYPSYA TRDIYQHSGYDFLILVEDVLAKYILENIIDENALYKSKLINILPSGGWENVLKMQDDICK SNLAGVGTKVLSVLDGDVKPDFEQLYKQKGLYTNLTINFLPIHSLEKYLHEKIIVNKDAD FFKEIGDRFFKVKSLKEVVDSLIKKNDDKAFYNYLIKNLKEQGIEENVFVQKVCEMIYRK EDMSKLLTFLQKTFQN >gi|226332236|gb|ACIC01000084.1| GENE 3 2087 - 2308 108 73 aa, chain - ## HITS:1 COG:no KEGG:BF1096 NR:ns ## KEGG: BF1096 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 73 2 74 74 74 43.0 1e-12 MEAIRQNGKIILHSNDGISIKMIFRNLTGRNFQGQEYADYISHIAIGSMGFTPGSIEHCR DGGVIDTGTIPNV >gi|226332236|gb|ACIC01000084.1| GENE 4 2431 - 2739 310 102 aa, chain - ## HITS:1 COG:no KEGG:BF0110 NR:ns ## KEGG: BF0110 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 102 38 139 139 187 95.0 1e-46 MIAQTILQQIGGKRFTAMTGSRDYINMGNGLRMSLARNKTSANRLDIIYDAGADLYNMRF YRRTFSKKTFECRTKDIAVHEGIYFDMLEEMFTMVTGLYTRF >gi|226332236|gb|ACIC01000084.1| GENE 5 2793 - 3038 237 81 aa, chain - 
## HITS:1 COG:no KEGG:BF0109 NR:ns ## KEGG: BF0109 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 81 28 108 108 150 95.0 2e-35 MNTTYQTLIVKFSEPIAALDGIFDDAQAWGTDTLKGWIDDYESTRFTATDSHTAVITSEY NMECVKEWLQRQTPIAEMREF >gi|226332236|gb|ACIC01000084.1| GENE 6 3035 - 3265 205 76 aa, chain - ## HITS:1 COG:no KEGG:BT_2286 NR:ns ## KEGG: BT_2286 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 73 1 72 72 72 58.0 4e-12 MATRMTINGVSTCTEAGTEKYERFQSGIGRRKRTLVQYDYRHTDGELFTCVKPTLDECRT ARNKWLNAKQGKEGNR >gi|226332236|gb|ACIC01000084.1| GENE 7 3285 - 3542 206 85 aa, chain - ## HITS:1 COG:no KEGG:BF0108 NR:ns ## KEGG: BF0108 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 85 1 85 85 150 92.0 2e-35 MEVRIESMICVWDDKIPTLFLEFVNLLTLSTSEEGLRKSVKEFAEKHELDKFFLYGFGSH HFYLHQRYTSNPEIVMKNRVLSVHF >gi|226332236|gb|ACIC01000084.1| GENE 8 3567 - 4898 776 443 aa, chain - ## HITS:1 COG:no KEGG:BF0107 NR:ns ## KEGG: BF0107 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 443 1 443 443 808 92.0 0 MKPKTKIQKKVARLSANLRPISATQIDWAYRHCIEHIGYRTKKGNITCSDCGHEWYSNSV LCDTLEGCTCPKCYAKLKVQDTRKRIYKETQYFSVITTCKGYQVIRVAQVRCESRKGEPM HFYCHEVVQRWISPDGKVTDMALLRGFTFYYCDVWALCSAMEIRPHNSLYDDVVARSCAY PKMRVLPQLRRNGFKGDFHGISPVRLFKALLSDPRIETLMKGDEIEVMKHFLFNARTADE CWASYLIAKRHKYRIDNFSMWCDYLRMLNKLGQDLRNPKNICPEDFMAAHDNATRKIEAI HEKERAEQRRRWEIERREREQQRQLQREKDAEDFIANKSKFFGLVITDEEIIVKVLESID EYYSEGKAQNICVFGSEYYKKADTLILSARIGGEIIETVEVDLRTLKVVQCHGKYNQDTE YHERIIDLVNKNANLIRERMKAA >gi|226332236|gb|ACIC01000084.1| GENE 9 4895 - 5311 337 138 aa, chain - ## HITS:1 COG:no KEGG:BF0106 NR:ns ## KEGG: BF0106 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 138 1 138 138 232 89.0 4e-60 MKGTDHFKRTIQMYLEQRAEEDTLFAKKYRNPAKNIDECVTHILNYVQKSGCSGFTDGEI FGQAIHYYEENEIEVGKPMNCQVVVNHVVELTEEEKAEARQNAARRYQEEELRKLQNRNR 
PPARKVTQSQPSLFDLGL >gi|226332236|gb|ACIC01000084.1| GENE 10 5325 - 5546 184 73 aa, chain - ## HITS:1 COG:no KEGG:BF0105 NR:ns ## KEGG: BF0105 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 73 59 131 131 139 94.0 2e-32 MAKRNSKTVAQQCRYYEVDNIFVYMVETYINGNFETFRRLYHELNKDTRRDFMDFLLSEV EPTYWRERLKQTI Prediction of potential genes in microbial genomes Time: Thu May 12 01:29:15 2011 Seq name: gi|226332235|gb|ACIC01000085.1| Bacteroides sp. 1_1_6 cont1.85, whole genome shotgun sequence Length of sequence - 831 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Thu May 12 01:29:28 2011 Seq name: gi|226332234|gb|ACIC01000086.1| Bacteroides sp. 1_1_6 cont1.86, whole genome shotgun sequence Length of sequence - 45099 bp Number of predicted genes - 46, with homology - 45 Number of transcription units - 19, operones - 12 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 107 - 166 4.1 1 1 Op 1 . + CDS 268 - 1104 606 ## BT_4576 hypothetical protein 2 1 Op 2 . + CDS 1116 - 1892 602 ## BT_4575 hypothetical protein + Prom 1917 - 1976 5.3 3 2 Op 1 . + CDS 2000 - 2374 167 ## PROTEIN SUPPORTED gi|15903216|ref|NP_358766.1| CrcB protein 4 2 Op 2 . + CDS 2395 - 3057 571 ## COG1357 Uncharacterized low-complexity proteins + Prom 3099 - 3158 2.5 5 3 Tu 1 . + CDS 3237 - 4517 1540 ## COG0148 Enolase + Term 4545 - 4580 6.5 + Prom 4531 - 4590 5.0 6 4 Op 1 . + CDS 4646 - 5209 396 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 7 4 Op 2 . + CDS 5206 - 5613 143 ## BT_4570 hypothetical protein 8 4 Op 3 . + CDS 5621 - 6706 772 ## BT_4569 hypothetical protein + Prom 6814 - 6873 4.1 9 5 Op 1 . + CDS 6899 - 9331 1116 ## COG1629 Outer membrane receptor proteins, mostly Fe transport 10 5 Op 2 9/0.000 + CDS 9356 - 10375 521 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 11 5 Op 3 . 
+ CDS 10399 - 11103 495 ## COG3279 Response regulator of the LytR/AlgR family + Term 11128 - 11176 4.3 - TRNA 11172 - 11245 72.9 # Thr TGT 0 0 + Prom 11480 - 11539 2.7 12 6 Op 1 . + CDS 11560 - 12111 358 ## BT_4565 hypothetical protein 13 6 Op 2 . + CDS 12135 - 13085 816 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 14 6 Op 3 . + CDS 13106 - 13657 226 ## BT_4563 hypothetical protein 15 6 Op 4 . + CDS 13593 - 14387 609 ## BT_4562 hypothetical protein 16 6 Op 5 . + CDS 14394 - 14975 462 ## COG1971 Predicted membrane protein 17 6 Op 6 . + CDS 15015 - 16031 1017 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis + Term 16060 - 16104 6.8 - Term 16040 - 16100 21.0 18 7 Op 1 . - CDS 16157 - 16528 242 ## COG3169 Uncharacterized protein conserved in bacteria 19 7 Op 2 . - CDS 16556 - 17302 456 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control - Prom 17353 - 17412 3.4 + Prom 17317 - 17376 2.5 20 8 Op 1 . + CDS 17418 - 18869 1448 ## BT_4557 hypothetical protein 21 8 Op 2 . + CDS 18879 - 19349 590 ## BT_4556 hypothetical protein 22 8 Op 3 . + CDS 19373 - 20374 1066 ## COG4864 Uncharacterized protein conserved in bacteria + Term 20470 - 20505 -1.0 + Prom 20404 - 20463 1.6 23 8 Op 4 . + CDS 20537 - 21415 880 ## COG2820 Uridine phosphorylase + Term 21436 - 21491 10.7 - Term 21425 - 21476 8.1 24 9 Op 1 . - CDS 21485 - 22144 679 ## COG2910 Putative NADH-flavin reductase - Prom 22165 - 22224 1.7 - Term 22145 - 22192 2.6 25 9 Op 2 . - CDS 22226 - 24133 1612 ## COG0642 Signal transduction histidine kinase - Prom 24212 - 24271 6.1 + Prom 24174 - 24233 7.2 26 10 Tu 1 . + CDS 24266 - 25663 864 ## COG0486 Predicted GTPase + Term 25704 - 25743 4.3 + Prom 25683 - 25742 3.9 27 11 Op 1 . + CDS 25935 - 26813 657 ## BF1137 putative transposase + Prom 26818 - 26877 4.8 28 11 Op 2 . + CDS 26931 - 28046 871 ## COG4974 Site-specific recombinase XerD + Term 28067 - 28127 18.3 29 12 Op 1 . 
+ CDS 28216 - 29061 525 ## BF1134 putative transmembrane protein 30 12 Op 2 . + CDS 29161 - 29535 341 ## BF1133 putative excisionase 31 12 Op 3 . + CDS 29538 - 30608 654 ## BF1132 hypothetical protein 32 12 Op 4 . + CDS 30580 - 30678 58 ## 33 13 Op 1 . + CDS 30919 - 31338 330 ## BF1129 hypothetical protein 34 13 Op 2 . + CDS 31335 - 32546 608 ## BF1128 hypothetical protein 35 13 Op 3 . + CDS 32551 - 33078 313 ## BF1127 putative transmembrane protein 36 14 Tu 1 . - CDS 33139 - 33525 310 ## gi|253569680|ref|ZP_04847089.1| predicted protein - Prom 33545 - 33604 3.0 - Term 33569 - 33621 10.1 37 15 Tu 1 . - CDS 33631 - 34557 566 ## COG0582 Integrase - Prom 34682 - 34741 6.1 + Prom 34547 - 34606 6.9 38 16 Tu 1 . + CDS 34790 - 35797 290 ## COG0732 Restriction endonuclease S subunits 39 17 Tu 1 . - CDS 35790 - 36443 109 ## COG0732 Restriction endonuclease S subunits - Prom 36507 - 36566 4.3 40 18 Tu 1 . + CDS 36433 - 37017 128 ## COG0732 Restriction endonuclease S subunits 41 19 Op 1 1/0.571 - CDS 37010 - 38212 232 ## COG0732 Restriction endonuclease S subunits 42 19 Op 2 2/0.571 - CDS 38218 - 39624 799 ## COG3943 Virulence protein 43 19 Op 3 . - CDS 39632 - 41185 1306 ## COG0286 Type I restriction-modification system methyltransferase subunit 44 19 Op 4 . - CDS 41190 - 41894 462 ## APA01_04920 hypothetical protein 45 19 Op 5 . - CDS 41908 - 44736 1919 ## COG0610 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases 46 19 Op 6 . 
- CDS 44780 - 45052 271 ## BT_4534 hypothetical protein Predicted protein(s) >gi|226332234|gb|ACIC01000086.1| GENE 1 268 - 1104 606 278 aa, chain + ## HITS:1 COG:no KEGG:BT_4576 NR:ns ## KEGG: BT_4576 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 278 1 278 278 520 99.0 1e-146 MTQHFRYTTLVLLTTLSTVYASAQSLRKLMEMPPRIIESKWEQTPEGGTELNYYNGDLCS YYLFRENDRSYNLNPGKNTVFRIEKGSNASNSFKTSSRYMFFRGQFPKDFQISTPYALPV KTGEETQWQIAPQESAKTMIFQIQEGDTVYATRRGVACATVLPQQLLIYHPDHTFAAYLM MRQNFIQTGEEVMTGQPIGIAGVLGVSISYFFLDNNKFDANGFLGYPYSHFTPVFRTDKG DVKLEEKTAYKSATDDELIMLEMSKREKKKFQKQKYSK >gi|226332234|gb|ACIC01000086.1| GENE 2 1116 - 1892 602 258 aa, chain + ## HITS:1 COG:no KEGG:BT_4575 NR:ns ## KEGG: BT_4575 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 258 1 258 258 515 100.0 1e-145 MRIFYLILCFICLCDLLHAQTVRISYEGDPLTDKERRKIEQTLQYEVEFYAQFGLPDTLN LQLTVFNKREDALVYLNKFNIHPPKSTNGMYISRLQKAIILSREKEYQQGLGVIYHELSH HLTLQITAGRPPIWFNEGLAEYFEHCKVNKKGLQHTFTDYEKGRIRTIYMLGDIDLLTFI NSRQKEFMKQQRTDEQYAYILSHALVTFWIEKVPRQILRDFVLSFQNKDDSSIVSERIDK VYTGGFKQFEKDFEAFCK >gi|226332234|gb|ACIC01000086.1| GENE 3 2000 - 2374 167 124 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15903216|ref|NP_358766.1| CrcB protein [Streptococcus pneumoniae R6] # 6 116 9 117 124 68 36 5e-11 MLKTLLFIGMGSFTGGVLRYLISRYVQNFLTPSFPLGTLLVNVLGCFAIGLFYGLFERGN LMNPNLRMFLTVGFCGGFTTFSTFMNENFQLIKDDNFFYLSLYVGLSLFVGFIMLYLGYS LVKQ >gi|226332234|gb|ACIC01000086.1| GENE 4 2395 - 3057 571 220 aa, chain + ## HITS:1 COG:CAC1657 KEGG:ns NR:ns ## COG: CAC1657 COG1357 # Protein_GI_number: 15894934 # Func_class: S Function unknown # Function: Uncharacterized low-complexity proteins # Organism: Clostridium acetobutylicum # 56 219 53 216 216 124 40.0 2e-28 MIKRNAAKPVRVAPPMMEEQEVSTSTLQELLEKEETVSNLIFSKGREEGISKSYKSFKNC TFRNQTFSECKFRSSQLADIRFENCDLSNISFAESSLYRVEFIACKLLGTNLSETTMNHV LLHDCNASYINLAMSKMNQVRFSHCHFRNGSFNDCRFSSVAFDSCDLVEADFSHAPLRGI 
DLRTSRIGGITLNISDLKGAVITSLQAMDLLPLLGVIIED >gi|226332234|gb|ACIC01000086.1| GENE 5 3237 - 4517 1540 426 aa, chain + ## HITS:1 COG:SP1128 KEGG:ns NR:ns ## COG: SP1128 COG0148 # Protein_GI_number: 15900994 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Streptococcus pneumoniae TIGR4 # 3 420 4 423 434 606 73.0 1e-173 MKIEKIVAREILDSRGNPTVEVDVVLESGIMGRASVPSGASTGEHEALELRDGDKQRYGG KGVQKAVDNVNKIIAPKLIGMSSLNQRGIDYAMLALDGTKTKSNLGANAILGVSLAVAKA AASYLDLPLYRYIGGTNTYVMPVPMMNIINGGSHSDAPIAFQEFMIRPVGAPSFREGLRM GAEVFHALKKVLKDRGLSTAVGDEGGFAPNLEGTEDALNSIIAAIKAAGYEPGKDVMIGM DCASSEFYHDGIYDYTKFEGAKGKKRTAEEQIDYLEELINKFPIDSIEDGMSENDWEGWK KLTERIGDRCQLVGDDLFVTNVDFLAMGIEKGCANSILIKVNQIGSLTETLNAIEMAHRH GYTTVTSHRSGETEDATIADIAVATNSGQIKTGSLSRSDRMAKYNQLLRIEEELGDLAVY GYKRIK >gi|226332234|gb|ACIC01000086.1| GENE 6 4646 - 5209 396 187 aa, chain + ## HITS:1 COG:RSc1055 KEGG:ns NR:ns ## COG: RSc1055 COG1595 # Protein_GI_number: 17545774 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Ralstonia solanacearum # 1 180 1 187 199 89 32.0 4e-18 MMNQEKIQELVSRSQQDDKRAFALLVSEFQPLVFRLAFRLLCDEEEARDITQETFVKVWL SLKTYNQENRLSTWLYKITCNTCYDRLRSLRHSPLDNESAFSDSVNIPSDDHIEISLSNK QLKELILRYTNELPPQQKLVFTLRDVEELEVAEVQIITGLSPEKIKSTLYLARKNIRNKM NQIDPDL >gi|226332234|gb|ACIC01000086.1| GENE 7 5206 - 5613 143 135 aa, chain + ## HITS:1 COG:no KEGG:BT_4570 NR:ns ## KEGG: BT_4570 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 135 1 135 135 244 100.0 8e-64 MKQEDKQYEKWLTEVKSNQPILENPEELTATILNRISGISPERKRRKFLIGAWASGIAAS LLLLLFINDTCFTSVPHRTEMQNEYDNWSNSIPLPDNWEEMRLLEKNTYLSSQYIQHRKS RKMQILQVIKKKRLK >gi|226332234|gb|ACIC01000086.1| GENE 8 5621 - 6706 772 361 aa, chain + ## HITS:1 COG:no KEGG:BT_4569 NR:ns ## KEGG: BT_4569 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 361 1 361 361 716 100.0 0 
MKTNYFLITMLLLLSMTSCSTYYRMTSRVERDGSMYREIYAQGDSAFIAGDKTHNPFLFQ PNANWQLVNLDSTIKFNFWGEEEKLNVKACQKLSGVDGEYFTVDKGKEHLSAMAIPMERV KKSFRWFYTYYIYTATYKELQDKGPVPLDNYLNKEEQMIWLHGNDDAFRGMNGIEMNDKL DKLEAKFGEWYNRNVYEINWEVIRHFTSLQGDTACLQCLEELKDSVYKKHSSEKGDSMGD ADIEEVCGMFDKACSTKYFSDLYKTNKEMMDALCEEKINIAEVFYHAIQFELTLPGRLLT SNAKVLKDNVVIWKIDGFRLLAGDYILTAESRVINYWAFGITLFMMLFVLGIFIKLYRRR S >gi|226332234|gb|ACIC01000086.1| GENE 9 6899 - 9331 1116 810 aa, chain + ## HITS:1 COG:CC0815 KEGG:ns NR:ns ## COG: CC0815 COG1629 # Protein_GI_number: 16125068 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Caulobacter vibrioides # 124 791 63 723 737 145 22.0 3e-34 MKTLKIYFLLMLLIQTISISAQSVISGKITNNDSSPIVGATILLSNGMDSTYITGTTSDL DGRFKMINVKSGNYFLSLSMIGYKKANIPLQIKESMNLELGDITLEEDSYILSTVNIIGK RPPIKAEPGKMTVNLSSALLSTDGNILDALRKLPGVIVQNDGTIILNGKSGANVLMDDKV TYLSGENLINYLRSIPASSIENIELISQPSSKHDASGSSGIINIQKKKIKEQGISLTASS GLEQGKHTKGNENLTLNFHHNKLNMYADYSYYWGKDFIELSVSGHYLDPMTLKPLELRKD FDSDINRQYKGHYIKTGVDYDLSEKIAIGTYFSSSWLNRNKQEVTVSDFFNNDKTQSDST LTALSTPDYSYTNIIGGANMIYKFAKTGKWDASFDYQLFNQEDNHLLKSFFQTGIHPVKG DTLSGITNGDIKIYSGQTNLSYDISDKFGITTGLKSVFVHISSDALYKNLIAGDWREDSD LSSSFAYHENIYAGYLQLNAKWSARFSTEIGLRLESTYTKSNYNSAVQDSVFNQSYVHLF PTLMVQYQLSENHNLSMAYSRRIVRPNYRNMNPFVEVRDQFLYEQGNTELRPELIDNIEI SWLLKKRYSFNVFYSHRSDPISLSFLVEDNNRVLLMPQNLSGNNSFGVRAGLNNLKPFQW WTSHINGSLTYKKFSWATLGKTLKNEVTTPMLHISNQFTLPYGWDAEALGFYSGEMIEGQ TRVKPLWTISLGVRKNLCNNKFSLYIYAHDIFHSNRPHVSMETNYLYYASQEKNDSRMIG ISLSYRFNRGKEIKKSQNENRIEESKRIGL >gi|226332234|gb|ACIC01000086.1| GENE 10 9356 - 10375 521 339 aa, chain + ## HITS:1 COG:BH3678 KEGG:ns NR:ns ## COG: BH3678 COG2972 # Protein_GI_number: 15616240 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 141 322 357 564 605 97 28.0 5e-20 MNLKYEWINTLFIILPLSLCVFIIVDIAYNERIWELDTFAVKARLQALGITVAYFYLLNY 
IFKRVARFFIDKSKDEKKANWKEYVWVFLINFVFLNIMHTFIMFCITMAGTFKWGESALI NVTGALLAFLYYIMIRNRILFKSLIEQSLQLEKVKVDQLETELKFLKSQYHPHFLFNALN TIYFQVDENNKVAKQSIEQLSDLLRYQLYDIEKEVTMEQEINYLRSYIAFQQLRMSERLM LDLYFDPQLKEQKIHPLLLQPLIENAFKYVRGDYKIKLEIKVDGNQIQSEIKNSISSSPS TANKKEKGIGIENLKRRLDLLYPRKIQLRLQTNRPYVYR >gi|226332234|gb|ACIC01000086.1| GENE 11 10399 - 11103 495 234 aa, chain + ## HITS:1 COG:ECs2936 KEGG:ns NR:ns ## COG: ECs2936 COG3279 # Protein_GI_number: 15832190 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Escherichia coli O157:H7 # 3 221 7 238 244 102 28.0 8e-22 MKIKCIITDDEPIARKGLRGYIEKVDFLTLTGECEDAVQLNTILKTQEIDLLFLDIEMPE MTGIELLSNLTNPPKVIIVSAYEQYALKGYEFNVVDYLLKPVSFDRFIKSVNKIYDLLQT EQKEKNDYIFVKSNKLLKKILFKDILYIESMENYVVIQTVSCKEVVYTTLKQLNDSLPKD VFQQTHRSYIVNLEKVTAIDGNQLTLSSYKIPVARTFRDSIFNLILNNRLIIKS >gi|226332234|gb|ACIC01000086.1| GENE 12 11560 - 12111 358 183 aa, chain + ## HITS:1 COG:no KEGG:BT_4565 NR:ns ## KEGG: BT_4565 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 60 183 1 124 124 239 100.0 5e-62 MTENRSYLQKYAMHFGTYMGIYWILKFILFPLGFHIPFLSLLFVILTLAVPFIGYHYVKM YRDKICGGSIQFSHAVLFTIFMYMFASLLVAVAHYAYFQFIDHGFIINSYIQLWDELMTK TPALMENQEIIKETIDTARSLTSINITMQLLSWDVFWGSLLAIPTGLMVMKKARPENNTP AQS >gi|226332234|gb|ACIC01000086.1| GENE 13 12135 - 13085 816 316 aa, chain + ## HITS:1 COG:aq_1899 KEGG:ns NR:ns ## COG: aq_1899 COG0463 # Protein_GI_number: 15606924 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Aquifex aeolicus # 3 316 2 310 322 246 42.0 5e-65 MNISVVVPLFNEEESLPELYAWIERVMKANGFSFEVIFVNDGSTDRSWEVIEKLKAQSDC VKGIKFRRNYGKSPALYCGFAEAQGDVVITMDADLQDSPDEIPELYRMITEDGYDLVSGW KQKRYDPLSKTLPTKLFNATARKVSGIPNLHDFNCGLKAYRKDVVKNIEVYGEMHRYIPY LAKNAGFQKIGEKVVHHQARKYGSTKFGFNRFFNGYLDLISLWFLSKFGVKPMHFFGLLG SLMFFIGMISVIMVGVSKLHAMYNGLPYRLVTDSPYFYLSLTAMIIGTQLFLAGFLGELI SRNAPERNNYQIEKKI 
>gi|226332234|gb|ACIC01000086.1| GENE 14 13106 - 13657 226 183 aa, chain + ## HITS:1 COG:no KEGG:BT_4563 NR:ns ## KEGG: BT_4563 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 183 1 183 183 344 100.0 8e-94 MKNLIRIFLIVVIVGAGISIGSSCSDENDCSLAGRPMMYCTLYTIDEITDLRVNDTLDSL TITALGTDSIILNNQKKVHTLMLPLRYTSDTTVFILWYNPNSPTNKKDVDTLYIIQKNTP YFQSMECGYMMKQNILSTKFGNTKNSSERIDSLHIQNKEANTNEIENLQIFYRYRDRTPP TIE >gi|226332234|gb|ACIC01000086.1| GENE 15 13593 - 14387 609 264 aa, chain + ## HITS:1 COG:no KEGG:BT_4562 NR:ns ## KEGG: BT_4562 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 77 264 1 188 188 372 100.0 1e-102 MKLRIYKYSIDIATARLLRLSSVLLLFCIGIPTIAQQQRPTPVQKRDQKKKEAVVDTIPF YNGTYVGVDIYGIGSKMLGGDFMSSEVSIGVNLKNKFIPTIEFGMGGTDTWNETGIHYKS KTAPFFRIGVDYNTMAKKKEKNSYLYVGLRYAFSSFKYDVSTLPVDDPIWGGSIGNPSLE DDYWGGSVPFSHLGMKGSMQWFELVVGVKVRIYKNFNMGWSVRMKYKTNASTNEYANPWY VPGYGKFKSNNMGITYSLIYKLPL >gi|226332234|gb|ACIC01000086.1| GENE 16 14394 - 14975 462 193 aa, chain + ## HITS:1 COG:Cj0167c KEGG:ns NR:ns ## COG: Cj0167c COG1971 # Protein_GI_number: 15791554 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Campylobacter jejuni # 8 192 8 186 187 122 45.0 4e-28 MTGLEIWLLAIGLAMDCFAVSIASGIILKRTQWRPMLVMALAFGLFQALMPFIGWMFAKT FSHLIESVDHWIAFAILAFLGGRMILESFKDEDCRQTFNPASPKVVFTMAIATSIDALAI GISFALLGINNYTEILSPILIIGFVSFVMSLIGLYFGIKCGCGCARKLKAELWGGIILVA IGLKILIEHLFLQ >gi|226332234|gb|ACIC01000086.1| GENE 17 15015 - 16031 1017 338 aa, chain + ## HITS:1 COG:VC2289 KEGG:ns NR:ns ## COG: VC2289 COG1477 # Protein_GI_number: 15642287 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Vibrio cholerae # 34 334 60 365 367 196 37.0 5e-50 MKTKKSFLWLAFLILATIWILARRNQKAEFNTASGFVFGTVYKIAYQHDADLKPEIEAEL KRFDQSLSPFNDSSVISRVNRNEELVTDSFFQTCFNRSMEISRETKGAFDITIAPLANAW 
GFGFKKGAFPDSLMIDSLLQITGYEKVKLENGKVVKQDPRVMLSCSAVAKGYSVDVIARL LDRKGIKNYMVDIGGEVVVKGKNATGDLWRIGINKPYDDSLAVKQDIQVVLNLTDLGMAT SGNYRNYYYKDGKKYAHTIDPRTGYPVQHSILSSTVIAEDCMTADALATSFMVMELEEAE KFCKANPMIDAYFIYSGENGEFKTYYTDGMKRYMPDAK >gi|226332234|gb|ACIC01000086.1| GENE 18 16157 - 16528 242 123 aa, chain - ## HITS:1 COG:XF0449 KEGG:ns NR:ns ## COG: XF0449 COG3169 # Protein_GI_number: 15837051 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 4 121 9 114 116 89 53.0 1e-18 MKGFYTILLLIVSNVFMTFAWYGHLKMKQEYSWFAALPLIGVITFSWVIAFFEYSCQIPA NRIGFVGNGGPFSLMQLKVIQEVITLIIFTVFTTVFFKGETLHWNHLAAFVCLIAAVYFV FMK >gi|226332234|gb|ACIC01000086.1| GENE 19 16556 - 17302 456 248 aa, chain - ## HITS:1 COG:NMA1465 KEGG:ns NR:ns ## COG: NMA1465 COG0037 # Protein_GI_number: 15794367 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Neisseria meningitidis Z2491 # 1 245 5 252 319 136 35.0 4e-32 MTQFTEEEKTIRRIERRFNKGVVQYRLIEEGDKILIGLSGGKDSLALVELLGKRARIYKP RFSVIAVHVVMNNIPYQSDTEYLKAYCDSFGVPFVQYETSFDPATDTRKSPCFLCSWNRR KALFTVAKEQGCNKIALGHHMDDILETLLMNITYQGAFSTMPPRLVMKKFDMTIIRPMCM VHEADLIELAALHDYRKQVKNCPYESQSSRSDMKGILRQLEAMNPEARYSLWGSMTNVQE ELLPDKID >gi|226332234|gb|ACIC01000086.1| GENE 20 17418 - 18869 1448 483 aa, chain + ## HITS:1 COG:no KEGG:BT_4557 NR:ns ## KEGG: BT_4557 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 483 1 475 475 927 100.0 0 MHGLVNKTMKRKYILFIICSFFLGGVSAQTLEQARTLFTKGDYEQAKPVFQKYAKSQPSN GNYSYWYGVCCLKTGEPEEAVKYLETAVKRRVAGGQLYLGQAYNDTYRFEDAVNCFEEYI ADLSKRKRSTEEAEALLEKSKADLRMLKGVEDVCIIDSFVVDKANFLSAYKISEESGKLF TFNEFFQTEGDHPGTVYETEIGNKIYYSEKGERGNLDIFSKNKLLNEWSNGRPLPGSINE SGNANYPFVLSDGVTIYYATDGEGLGGYDIFVTRYNTNTDSYLTPENVGMPFNSPYNDYM YVIDEYNNLGWFASDRFQPEEKVCIYVFIPNSSKQTYNYEAMDPQQMIRLAQIHSLKETW 
KDENTVTEALQRLKEAINHKPQERRVVDFEFVIDDSTTYYQLSDFKSPKAKLLFQSYRQL EKDYLQQEDKLNGLRQQYASANQQGKEKLAPAILDLEKRILQMNEELDALGVSVRNAEKT KSK >gi|226332234|gb|ACIC01000086.1| GENE 21 18879 - 19349 590 156 aa, chain + ## HITS:1 COG:no KEGG:BT_4556 NR:ns ## KEGG: BT_4556 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 156 1 156 156 264 100.0 9e-70 MDILIITVLIIAAVILFLVELFVIPGISLAGISALACILYANYYAFANLGMAGGFITLGI SAVACIGSLIWFMRSKMLDKLALKKDIDSKVDRSAEKSVKVGDTGISTTRLAQIGYAEIN GNIVEVRSIDGFLNEKTPIIVSRITDGTILVEKHKI >gi|226332234|gb|ACIC01000086.1| GENE 22 19373 - 20374 1066 333 aa, chain + ## HITS:1 COG:BS_yqfA KEGG:ns NR:ns ## COG: BS_yqfA COG4864 # Protein_GI_number: 16079592 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 1 322 1 321 331 387 65.0 1e-107 METSLYLPIVLIVGGIIFLILFFHYVPFFLWLSAKVSGVNISLIQLFLMRIRNVPPYIIV PGMIEAHKAGLKNITRDELEAHYLAGGHVEKVVHALVSASKANIELSFQMATAIDLAGRD VFEAVQMSVNPKVIDTPPVTAVAKDGIQLIAKARVTVRASIKQLVGGAGEDTILARVGEG IVSSIGSSENHKSVLENPDSISKLVLRKGLDAGTAFEILSIDIADIDIGKNIGAALQIDQ ANADKNIAQAKAEERRAMAVASEQEMKAKAQEARAKVIEAEAEVPKAMAEAFRSGNLGIM DYYRMKNIEADTSMRENIAKPTTGGTTNQPLSK >gi|226332234|gb|ACIC01000086.1| GENE 23 20537 - 21415 880 292 aa, chain + ## HITS:1 COG:VNG0893G KEGG:ns NR:ns ## COG: VNG0893G COG2820 # Protein_GI_number: 15790029 # Func_class: F Nucleotide transport and metabolism # Function: Uridine phosphorylase # Organism: Halobacterium sp. 
NRC-1 # 19 268 14 227 273 97 30.0 2e-20 MKKYFPSSELIINEDGSVFHLHVKPEWLADKVILVGDPGRVALVASHFENKECEVESREF KTITGTYKGKRITVVSTGIGCDNIDIVMNELDALANINFETREEKEKFRQLELVRIGTCG GLQPNTPVGTFVCSQKSIGFDGLLNFYAGRNAVCDLPFERAFLNHMGWSGNMCAPAPYVI DASEELIDRIAKEDMVRGVTIAAGGFFGPQGRELRIPLADPKQNDKIEAFEYKGFKITNF EMESSALAGLSRLMGHKAMTVCMVIANRLIKEANTGYKNTIDTLIKTVLDRI >gi|226332234|gb|ACIC01000086.1| GENE 24 21485 - 22144 679 219 aa, chain - ## HITS:1 COG:PA0741 KEGG:ns NR:ns ## COG: PA0741 COG2910 # Protein_GI_number: 15595938 # Func_class: R General function prediction only # Function: Putative NADH-flavin reductase # Organism: Pseudomonas aeruginosa # 6 219 2 213 213 197 50.0 2e-50 MEKVKKIVLIGASGFVGSALLNEALNRGFEVTAVVRHPEKIKIENEHLKVKKADVSSLDE VCEVCKGADAVISAFNPGWNNPDIYDETIKVYLTIIDGVKKAGVNRFLMVGGAGSLFIAP GLRLMDSGEVPENILPGVKALGEFYLNFLMKEKEIDWVFFSPAADMRPGVRTGRYRLGKD DMIVDIVGNSHISVEDYAAAMIDELEHPKHHQERFTIGY >gi|226332234|gb|ACIC01000086.1| GENE 25 22226 - 24133 1612 635 aa, chain - ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 205 490 4 304 328 165 36.0 2e-40 MEPDTNLYSPNENNEIIRENSQKILSVLAAHQIALWEYDIPTGKCSFTDDYFRTLGLKEA GVVFKDINDFYHFAYPEDINAYQTAFAKMLASESKASQIQVRCVGEHGEVIWLEDHFLSY KGNEEGDPDKLLAYTVNVTSQCEKEQHIKHLEEHNRKIIEALPEFIFIFDENFFITDVLM APGTILLHPVEVLKGADGRSIYSPEVSDLFLCNIRECLKDGNLKEIEYPLEVEGSKHYFQ ARIAPFEGNTVLALIHDIGDRIRRSEELIEAKRRAEDADRMKSVFLANMSHEIRTPLNAI VGFSEIMVLTENEEEKHEYLEIIQKNSNLLLQLINDILDLSRIESGKSEMHFQQVEIAGL VDEVEKVHQLKMKLNVDLKVVRPQGEFWTSTDRNRVMQVLFNFLSNAIKNTETGSITLGL KQEGDWLKLFVSDTGCGIPEEKLPQIFTRFEKLNDFVQGTGLGLSICKSIVERLGGRIEV SSELGQGSTFALYLPYQEIPVEVAERSLSTKIDSESVRHKKILVVEDIESNFAQLNILLK KEYTILWVRNGQEAINSFIREKPDLILMDIRMPVMDGIQATEKIRTISLSVPIIAVTAYA FYTEQQQAIQAGCNAVISKPYSLEKLKETIETYIG >gi|226332234|gb|ACIC01000086.1| GENE 26 24266 - 25663 864 465 aa, chain + ## HITS:1 COG:CAC3734 KEGG:ns NR:ns ## COG: CAC3734 COG0486 # 
Protein_GI_number: 15896965 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Clostridium acetobutylicum # 4 465 5 459 459 309 39.0 1e-83 MNQDTICAIATAQGGAIGSIRVSGPEAISITSRIFQPAKAGKLLSEQKPYTLTFGRIYNG EEVIDEVLVSLFRAPHSYTGEDSTEITCHGSSYILQQVMQLLIKNGCRMAQPGEYTQRAF LNGKMDLSQAEAVADLIASSSAATHRLAMSQMRGGFSKELTDLRSKLLNFTSMIELELDF SEEDVEFADRSALRKLADEIEQVISRLVHSFNVGNAIKNGVPVAIIGETNAGKSTLLNVL LNEDKAIVSDIHGTTRDVIEDTINIGGITFRFIDTAGIRETNDTIESLGIERTFQKLDQA EIVLWMVDSSDASSQIKQLSEKIIPRCEEKQLIVVFNKADLIEEKQKEELSALLKDFPKE YTKSIFISAKERKQTDELQKMLINAAHLPTVTQNDIIVTNVRHYEVLSKALDAIHRVQDG LETHISGDFLSQDIRECIFFISDIAGEVTNDMVLQNIFQHFCIGK >gi|226332234|gb|ACIC01000086.1| GENE 27 25935 - 26813 657 292 aa, chain + ## HITS:1 COG:no KEGG:BF1137 NR:ns ## KEGG: BF1137 # Name: not_defined # Def: putative transposase # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 292 1 292 292 452 86.0 1e-126 MEVRKICQWCGKPFIAQKTTTCYCSHQCSNLGYKERIRERKRQLKRSQEILQSRQAAEGQ DFFSFAQAAKLMGVTRQYIYKLVKESKLRASRISGKKSLIRRADIELMMKTKPYERVLPK EDFDISEYYTAEEIAEKYKVNAKWVWTYTRQHKVPKVRIRQFNYYSKKHIDAAFAKYEVD SDLTEWYTPEEIQEKYGMTRVAIRSQVYRNNIPSKKEHGQIFYSKLHFDLSKSSEQESKA EYYTVKEAMEKFKLSRDSVYGILQFHQINREKNGRFVRFLKVEFDRIMGVRK >gi|226332234|gb|ACIC01000086.1| GENE 28 26931 - 28046 871 371 aa, chain + ## HITS:1 COG:BS_ripX KEGG:ns NR:ns ## COG: BS_ripX COG4974 # Protein_GI_number: 16079408 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Bacillus subtilis # 146 364 59 291 296 65 25.0 1e-10 MHECKTVTLRTRPLKGRMMSFYLDYYPGYRDKETMKVIRHESLGIYIYANPRNKREQNFN EVMTEKAEAIRCRRFESVVNERYDFFDKYKLKADFLEYYRKQLRKHDQKWEFVYLHFSNF VHGKCTFEEIDIDLCNKFREYLLSAKKLRRNGRITRNSASGYWSTFRGFLKILYRNGMIK TNVNDFLEKIETEDVMKEALSVEELYKLAETPCKKPILKTASLFSCMTSLRISDILSLCW KDIVDYSAGGKCVHIITQKNKAEDIIPISEEALGLIGYNSEKKGLVFKGLMRSWTQIPMK EWIRSAGITKNITFHSYRRTFATLQAAAGTDIRTIQSIMAHKSITTTQRYIKVVDANKRE ASKKITLTRQD >gi|226332234|gb|ACIC01000086.1| GENE 29 28216 - 29061 525 281 aa, chain + ## 
HITS:1 COG:no KEGG:BF1134 NR:ns ## KEGG: BF1134 # Name: not_defined # Def: putative transmembrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 281 4 284 284 502 91.0 1e-141 MTNKKQIQTKKISFILALSVIIYAAIDTYLFLCHDINIMRITTPVILGLVIAIILCGLLH KFFYKWLTKNPINFSFGEPSTPIVKSENAIESMDATEILPIERPMGLAEQGCPPSQQLNQ DTVNSDCPQDSHLNKYDSILVELKKKEIERQVEIMDAIREYVTVKTAPYLSKEDIATLIS NIEYMTYDQPESYKPIRSNIDNSLRSPGLRHLAWNIGERLGVPLAKRAVFIKKSFPYELA NATLEYLKLNLRDNVSSQIPIDVPDKGDYRFHLDADNDDSQ >gi|226332234|gb|ACIC01000086.1| GENE 30 29161 - 29535 341 124 aa, chain + ## HITS:1 COG:no KEGG:BF1133 NR:ns ## KEGG: BF1133 # Name: xis # Def: putative excisionase # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 124 1 124 124 223 95.0 2e-57 MKKDQLTFNDLPTVVGELCDRIASMENLLTEKLSKQYETKEDTHVPMTVQEACNYLKMPL STFYYKVKKDDIPVIKQGKHLYIYRDELDRWLESSRKNPAPQSFEDENESLLASHRRKPN PKNW >gi|226332234|gb|ACIC01000086.1| GENE 31 29538 - 30608 654 356 aa, chain + ## HITS:1 COG:no KEGG:BF1132 NR:ns ## KEGG: BF1132 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 356 1 356 356 677 95.0 0 MEKTDTIILSPEELTAYIAESTISVTSTYEHSPVVLMVDDTVIGTLGNFSASIGKAKSKK TFNISAIVASALSGSTVLHYRSTFPENKRKILYIDTEQGRYHCQLVLKRILRLADLPEYK NPDNLIMLALRKFSPKLRLAIVEQAIGTIPDLGLVVIDGIRDFLYDINSPGESTDIISKF MQWTDDRQIHIHTILHQNKNDEHARGHIGTELNNKAETIMQVEVDKEDKAVSVVEAVHIR DSEFEPFAFRINEEAMPEPVESYLPKEKKTGRPTKGPFDPDKEIPKNVHRPALDAVFANG NINNYDDYIERLKEGYGLQGIKLGYNKAVKVATLLSDKQMVIKEGKDYAFNPEYHY >gi|226332234|gb|ACIC01000086.1| GENE 32 30580 - 30678 58 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPSIQNTIIDFNFTLLLGRIYKNKAKSEEDIM >gi|226332234|gb|ACIC01000086.1| GENE 33 30919 - 31338 330 139 aa, chain + ## HITS:1 COG:no KEGG:BF1129 NR:ns ## KEGG: BF1129 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 139 1 139 139 241 90.0 7e-63 MKEDIENTSDSSKETRSVFIGAKVTLTQKEHIKSLAGQCGMTVSDYILSRAYNFKPKARL 
TKEEAVLLQNLDDCRSDLVKYTSALHGMSTKQRMAMFNQVPFMVGWLKELGNVAENICQF LNAVKEKNKIPSTPKSEEE >gi|226332234|gb|ACIC01000086.1| GENE 34 31335 - 32546 608 403 aa, chain + ## HITS:1 COG:no KEGG:BF1128 NR:ns ## KEGG: BF1128 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 403 1 403 403 720 91.0 0 MIAKAKAISHGINDLRYITGESQHKKHPEKIYRVLDNLLSSEPDAMGIWNSMQLTLSQHR PIKNSVIRIELSPSPEHTQFYDIEDWQNLWQDFAEEFDKQVITGKDGKVRSYPTNLAGSK YSVWLHTESKGEVPHLHAAVCRLDENGNINNDHNIHLRAQRAAERVAKKRGWTTAAQIRN RNIPQVNRDCMEVLKAMPSWSWDEYKNALIRKGYTVHEREDKKGLLRGYALMNGNTKYKA SELGIGRNLMVSKLPATWKKLHHQPATVIRNNAPQAVQQIAIHKVAPIDYTQYLTYRHDM IPYTLNHEDKEHKFYISEKVLDCFNDEFDYRAVANCQELTDMAVAIFVGLIDTPNVTTGG SGGGSQSDLPWRDKDEDDLQWARRCARAASRSLGKKTKIGLKR >gi|226332234|gb|ACIC01000086.1| GENE 35 32551 - 33078 313 175 aa, chain + ## HITS:1 COG:no KEGG:BF1127 NR:ns ## KEGG: BF1127 # Name: not_defined # Def: putative transmembrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 175 1 175 175 248 77.0 4e-65 MKTKFDFSAEMNEAESIKNTPKNEYLEEEKRLLDINAKELASLNDNVFKLRTEVENLSGS IRECKPIISKEMQNLAVEFGATLLCNFLSQIESKCKETERRMKKADNAIPIPDTAFYILI IIVVVLSSFFVCMIVANAEILHSGLIWKAVICCILIAVLGIGIALIVQKFLDRGK >gi|226332234|gb|ACIC01000086.1| GENE 36 33139 - 33525 310 128 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253569680|ref|ZP_04847089.1| ## NR: gi|253569680|ref|ZP_04847089.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 128 8 135 135 234 100.0 2e-60 MRKGLIFFLGIVTGCVLTIAVLFVIGITNSNANESDITIAEQQTVFTTATKFEVFQVLGD GALANCEEKGYSTSFFTGPVVYIVTDGQNLFYDDQVIEVPKGKKAMQIGTFRYETKLGEK VVPVIKFQ >gi|226332234|gb|ACIC01000086.1| GENE 37 33631 - 34557 566 308 aa, chain - ## HITS:1 COG:SPy2122 KEGG:ns NR:ns ## COG: SPy2122 COG0582 # Protein_GI_number: 15675872 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pyogenes M1 GAS # 1 299 65 371 381 97 27.0 4e-20 MIQKTIREISDAWRENKRPYVKQSTLAAYMLILENHILPKFGESNELHENDVQGFVLEKL EGGLSMKSVKDILIVLKMVMKFGVKNEWMNYYEWDIKYPTDVAGKKLEVLSVANHKKILN YIQSHFSFPGLGIYISLSTGLRIGEICALKWSDINVYDGILTVNRTIERIYIIEGERKHT ELVINTPKTKNSCREIPINKELLTMLKPLKKVINDDYYILTNDERPTEPRTYRNYYKRLM EKLDIPKLKYHGLRHSFATRCIEVGCDYKTVSVLLGHSNISTTLDLYVHPNMEQKKRCIS KVFKSLGK >gi|226332234|gb|ACIC01000086.1| GENE 38 34790 - 35797 290 335 aa, chain + ## HITS:1 COG:SA0392 KEGG:ns NR:ns ## COG: SA0392 COG0732 # Protein_GI_number: 15926110 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Staphylococcus aureus N315 # 1 335 70 403 403 161 35.0 2e-39 MSNYYLLHKGDFAYNKSYSSEYPWGAIKRLDCYEQGTLSSLYICFKPYSHVSSDFLTHYF ETSKWHQGISEIAVEGARNHGLLNVGIQDFFETRHCLPQSLLEQEKIAKFLNLIEERIAT QNKIIEKYESLIQAIIYQKKAAGIRKGDWQKTELSNVLKERIEKNTNGYIICSVSVSQGV INQIEYLGRSFAAKETLHYNVVKYGDIVYTKSPTGDFPYGIVKRSYIKDDVAVSPLYGVY MPVNDYIGVILHFYFMQPSNAFNYLHPLIQKGAKNTINITNERFLKNSIPLPKTENEAIY IANTLISIQKKIDMEKKMLWSYEKEKQYLLSKMFI >gi|226332234|gb|ACIC01000086.1| GENE 39 35790 - 36443 109 217 aa, chain - ## HITS:1 COG:jhp0726 KEGG:ns NR:ns ## COG: jhp0726 COG0732 # Protein_GI_number: 15611793 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Helicobacter pylori J99 # 30 192 21 185 454 103 35.0 3e-22 MADITEISSLIQAMCDTLIESEQHKVELAFSDFGKSYSGLSGKSAEDFGEGCPYITYMNV YQNQIINATNVGLVKINGAEQQSVVHYGDILFTLSSETAEEVGIGAVYLGDTYPLYLNSF CFGIHIIDDNKIFPPFLAFYVSTKSFRKVVFPLAQGSTRFNLQKNDFMKKGFSFPTVERQ 
RKIYSALKTYSDKLAVEKSIAKLLCKQKNHLLSKLFI >gi|226332234|gb|ACIC01000086.1| GENE 40 36433 - 37017 128 194 aa, chain + ## HITS:1 COG:XF2722 KEGG:ns NR:ns ## COG: XF2722 COG0732 # Protein_GI_number: 15839311 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Xylella fastidiosa 9a5c # 22 178 34 192 468 62 28.0 4e-10 MSAIVDKLYSSTNGKTYSFCQLFEVVNEKNKKLAYKNVLSASQELGMIERSNINIDIKFE QESISGYKIVRKGDYVVHLRSFQGGFAFSDTTGICSPAYTILRPNDLVVYGYLSHFFTSK PFIKSLKLVTYGIRDGRSINVDEWLDMPILLPSAQEQMRILTIVNAIDAKLHNEAKVQFC LSSQKSYLLNTMFI >gi|226332234|gb|ACIC01000086.1| GENE 41 37010 - 38212 232 400 aa, chain - ## HITS:1 COG:MJECL41 KEGG:ns NR:ns ## COG: MJECL41 COG0732 # Protein_GI_number: 10954528 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Methanococcus jannaschii # 23 197 24 203 432 128 38.0 2e-29 MADNNENKVLNVPHLRFPEFSGEWETKSINDLADVIGGGTPDTTVKSYWDGGIQWFTPSE IGKNKFVDASLRTITEDGLNNSSAKLLPPNTILLSSRATIGECSLSLRECATNQGFQSLV SKKCNVDFLYYLIQTKKKDLIRKSCGSTFLEISANEVRKIQVSVPSDVEQQKIAGLLSLI DKRIATQNKIIEDLKKLKSAIVEMLLCNQNGESFKLRDVGCFVRGLTYANEDVTENKAAT TVIRANNLNYGNNVDKDEVVYVNKTPTTSQILRKGDIVICMANGSSSLVGKNSYYPFNDG QSTIGAFCGIYRTSYPFVKWLMQSQRYKRLVYQSLQGGNGAIANLNGDDILNMSFPLIEN GKSQSIIFAIDAIEKSLMVNKSLQRLYYTQKSYLLKQMFI >gi|226332234|gb|ACIC01000086.1| GENE 42 38218 - 39624 799 468 aa, chain - ## HITS:1 COG:STM3755 KEGG:ns NR:ns ## COG: STM3755 COG3943 # Protein_GI_number: 16767039 # Func_class: R General function prediction only # Function: Virulence protein # Organism: Salmonella typhimurium LT2 # 122 467 3 345 345 257 41.0 3e-68 MAEIDNIPEMRPSFDNIRRQDGGGNEYWSSRDLCAAMGYSAYWKFQRVIDKAIKVAGEKG MNVDDHFNQTVDMVRIGSGSFRKVNIFRLSRMACMIVAENADVKKVLVQQARDYFTRTIS TNELMQNSLNSNILLYKTAQGEVRIEVIFNSETFWMSQKRMADLFGVDVRTVNYHLGQIY ETGELTKEATIRKIGIIQSEGKRDVERTPLFYNLDAIIAVGYRVNSYQATQFRIWATSVL KEFIIKGYALDDERLKQGKHFGKDYFDDLLERIREIRTSERRYYQKITDIYAECSADYDL KSDCTKLFFKMVQNMMHLAVTHRTAAEIVYERADSEMPHMGLTTWKKAPDGRVQKSDTIV 
AKNYLSDKEVLELNGITNAFLEFAEQRAQRHIITTMADWKQRLEQFLATMDYQAQDSAGK VSQEEAREKAYGEYEKYKVIQDRSFISDFDRFNDGDKLLPFDINPDKE >gi|226332234|gb|ACIC01000086.1| GENE 43 39632 - 41185 1306 517 aa, chain - ## HITS:1 COG:SA1626 KEGG:ns NR:ns ## COG: SA1626 COG0286 # Protein_GI_number: 15927382 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Staphylococcus aureus N315 # 6 517 11 516 518 580 56.0 1e-165 MSEELQQKLRDQLWEVANRLRGNMSASDFMYFTLGFIFYKYLSEKIETYANSALDDDEVT FKELWEMTDSDAPELQEEVKNQCLENIGYFIEPKFLFSSVIEAIKRKENVLPMLERSLKR IEDSTLGQDSEEDFGGLFSDIDLASPKLGKTADDKNTLVSNVLLALDDIDFGVEASQEID ILGDAYEYMISQFAAGAGKKAGEFYTPQEVSRILAEIVTLGHNRLRNVYDPTCGSGSLLL RAASIGKAAYIYGQEKNPTTYNLARMNMLLHGIRFSSFKIENGDTLEWDAFDDMQFDAVV ANPPFSAEWSAADKFNNDDRFSKAGRLAPKKTADYAFILHMVYHLNEGGTMACVAPHGVL FRGNAEGVIRRFLIEKKNYIDAIIGLPANIFYGTSIPTCILVLKKCRKEDDNILFIDASK EFEKVKTQNKLRPQHIQKIVETYRDRKEIEKYSHLATLQEVAENDYNLNISRYVDTFEEE EPIDIKAVMAEITELEAKRAELDKEIEIYLKELGLVE >gi|226332234|gb|ACIC01000086.1| GENE 44 41190 - 41894 462 234 aa, chain - ## HITS:1 COG:no KEGG:APA01_04920 NR:ns ## KEGG: APA01_04920 # Name: not_defined # Def: hypothetical protein # Organism: A.pasteurianus # Pathway: not_defined # 6 214 42 238 244 78 31.0 2e-13 MVKTNEPGYVYILTNPSFREDWVKIGKSARPVDIRSKELDNTAVLLPFEIYATIQTVKYN DVEKHVHKTIDRLTDLRIRQNREFFNVPPQIALDIFNDIAKMIDDAVVTVYVDNKPVCHN EKDSLPVVQKRTVKRGRFKFSMVGIKIGECVTFIPTDTEVKVASDDSVEYEGRIYKLSPF VGTFMPEEKRNTSGAYQGAKYFSYKGKVLDDLRSIIESNIPLPESDVIQDNIEQ >gi|226332234|gb|ACIC01000086.1| GENE 45 41908 - 44736 1919 942 aa, chain - ## HITS:1 COG:SA0189 KEGG:ns NR:ns ## COG: SA0189 COG0610 # Protein_GI_number: 15925899 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Staphylococcus aureus N315 # 1 925 1 916 929 629 39.0 1e-180 MSIQSEAALEAGLIATLRQMGYEYVQITEEDNLYANFKRQLEIHNKKQLAEVGRNSFTDE EFEKIVIYLEGGTRFEKAKKLRDLYPLDTANGQRIWVEFLNRTQWCQNEFQVSNQITVEG 
RKKCRYDVTILINGLPLVQIELKRRGVELKQAYNQIQRYHKTSFHGLFDYIQLFVISNGV NTRYFANNPNGGYKFTFNWTDAANMPFNELDKFAVFFLEKCTLGKIIGKYIVLHEGDKCL MVLRPYQFYAVEKILDRVQNSNDNGYIWHTTGAGKTLTSFKTAQLVSELDDVDKVMFVVD RHDLDTQTQSEYEAFEPGAVDGTDNTDELVKRLHSNSKIIITTIQKLNAAVSKTWYSSKI DSIRHSRIVMIFDECHRSHFGESHKKIMQFFDNAQIFGFTGTPIFTENAVDGHTTKEVFG NCLHRYLIKDAIADENVLGFLVEYYHGSEEVQNGSANRMTEIAKFILNNFNKSTFDGEFD ALFAVQSVPMLIRYYKIFKELNPKIRIGAVFTYAANGSQDDDLTGMGTGSYLNDSAGEVD ELQAIMDDYNEMFGTSFTTENFRAYYDDINLRMKKKRADMKPLDLCLVVGMFLTGFDSKK LNTLYVDKNMEYHGLLQAFSRTNRVLNEKKRFGKIVCFRDLKSNVDTAIKLFSNSNNPEE IVRPPFEEVKQEYKELATNFLKKYPDTNCIDLLQSEKAKKEFVLAFRDIIRKHAEIQIYE DYNEESDDLGMTEQQFMDFRSKYLDIHDTFALVDPAPSPKPDDDTDVPDDGDLGDVDFCL ELLHSDIINVAYILELIAELDPYSADYAERRQNIIDTMIKDAEMRSKAKLIDGFIQKNVD DDKENFMIQRGKADGTSDLEERLNHYIAVERENAVNSLAKEEEISSSVLNHFLKEYDYLQ KEQPEIIQKALKEKHLGLIKTRKALTRILDRLRNIIRTFSWD >gi|226332234|gb|ACIC01000086.1| GENE 46 44780 - 45052 271 90 aa, chain - ## HITS:1 COG:no KEGG:BT_4534 NR:ns ## KEGG: BT_4534 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 90 1 87 94 102 73.0 3e-21 MHIMAEEKIGNALEVKEETVECKSKDWVCHSQAIAAIMSNRMGELSMTQRALAEKMNCTQ QYVSKVLKGRENLSLETLCKIENALGIKIL Prediction of potential genes in microbial genomes Time: Thu May 12 01:30:47 2011 Seq name: gi|226332233|gb|ACIC01000087.1| Bacteroides sp. 1_1_6 cont1.87, whole genome shotgun sequence Length of sequence - 20183 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 9, operones - 5 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 104 - 2326 1213 ## gi|301162148|emb|CBW21693.1| putative transmembrane protein 2 1 Op 2 . - CDS 2356 - 3729 775 ## COG1672 Predicted ATPase (AAA+ superfamily) - Prom 3751 - 3810 2.5 3 2 Tu 1 . - CDS 3850 - 4170 284 ## BF1073 putative DNA-binding protein - Prom 4222 - 4281 3.7 4 3 Tu 1 . - CDS 4301 - 4594 142 ## BF1149 hypothetical protein 5 4 Op 1 . - CDS 4723 - 4887 92 ## 6 4 Op 2 . 
- CDS 4909 - 8166 1692 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member - Term 8411 - 8448 -1.0 7 5 Op 1 8/0.000 - CDS 8558 - 9658 479 ## COG3550 Uncharacterized protein related to capsule biosynthesis enzymes 8 5 Op 2 . - CDS 9651 - 9968 227 ## COG1396 Predicted transcriptional regulators - Prom 10026 - 10085 6.0 + Prom 10577 - 10636 7.2 9 6 Op 1 . + CDS 10800 - 11090 203 ## BT_4524 hypothetical protein 10 6 Op 2 . + CDS 11077 - 12261 380 ## COG0732 Restriction endonuclease S subunits 11 7 Tu 1 . - CDS 12256 - 12783 161 ## COG0732 Restriction endonuclease S subunits - Prom 12877 - 12936 4.2 + Prom 12685 - 12744 5.0 12 8 Tu 1 . + CDS 12790 - 13593 615 ## COG0582 Integrase + Term 13605 - 13649 11.4 - Term 13558 - 13606 0.0 13 9 Op 1 . - CDS 13623 - 15059 579 ## COG0732 Restriction endonuclease S subunits 14 9 Op 2 . - CDS 15105 - 16235 636 ## PROTEIN SUPPORTED gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 15 9 Op 3 5/0.000 - CDS 16239 - 17657 1078 ## COG0286 Type I restriction-modification system methyltransferase subunit 16 9 Op 4 . 
- CDS 17661 - 20183 1271 ## COG4096 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases Predicted protein(s) >gi|226332233|gb|ACIC01000087.1| GENE 1 104 - 2326 1213 740 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|301162148|emb|CBW21693.1| ## NR: gi|301162148|emb|CBW21693.1| putative transmembrane protein [Bacteroides fragilis 638R] # 1 740 11 750 750 1370 99.0 0 MPETLNYQNRAIAYSVLAHIYDKGTLAHGPLDIFIPIVKNALSELYPLGVVKGNSMKEIT DAIEKKFSLVIPISVLNNILKVIASEINKTNGREDMRIFNDGSFWIDKFIFEDYSELIQK GKEDVAKVVHMFKTFCKAYNVGTTNNDVDLFKFVEQNRSEISYYLANGSSEATIDSSHNV LIAQFIDTFKQVPEIFEKIRDIYLGSMLSCYLDYQPSDANMNVELLLDTNFIISLLDLNT PESTETCNTLIDVSRKLGYKFTVLKDTIEEIQALLLFKSTNLSSAIIAKNINREDIYNAC DRRHLSSTDLNRISDNLEDTLVNHFKFHVIPQTKQWQGKAKFSKEFSIIRKYRNTDKAAL HDAMALVYVREKRGEKYIQEFGKVNCWFVNNAISHDNDYSDTEYKDLYSCKKHQPEIIKV DDLLNILWLSNPTIDITSDDVATMGITSLVSYTLNSSLPKARIIKELDDNIQKYREIYKI TDRDVVNLSTRIASRQIKDVQELNELAHKDEAAFAARVKDESMRQEQIEVNRAQKFDNLF QMLQGEIKEIKDNQVKMQQKYNERMHELESKEAELKNSEKIIYQENEVLKADKDQLVSEK EGIKKEKEIFEMKIKNLWEQENKQRAEQKEKILDYEIAKSKTRSTTLFVVGLFIMIITLG AMSYYYFVAADDTVQVINDFCNFKPISFVITFILALFNYFTISNFYNWHYNPSFVVNKKK LVKIPDCYKPISFEDYMHEK >gi|226332233|gb|ACIC01000087.1| GENE 2 2356 - 3729 775 457 aa, chain - ## HITS:1 COG:FN0123 KEGG:ns NR:ns ## COG: FN0123 COG1672 # Protein_GI_number: 19703471 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 1 418 1 412 454 198 32.0 2e-50 MKFVDRIDEAARLKDALAREKSSLVVMYGRRRLGKSTLIKRVLSENDVYFLADRSEGQHQ RILLAKVIAQVFPDFDKLTYPDWESMFRAVNYRTDKRFTLCLDEFPYLVEQSSELPSVLQ KLVDEKQLKYNLVLCGSSQNMMYGLFLDSTAPLYGRADEIMKLALIRLPYIQEALSLDAM NAIEEYAVWGGVPRYWELRENQNSLNDALWHNILSVNGTLYEEPIKLFQDDVKDIVKTST IMSYIGTGANRLSEIAARCNEPATNLSRPLKKLIDLGFLAKDVPFGIDEKNAKKSLYKIA DPFMAFYYQFVVPNRSFIELGRRLPIEQALTAHFSEYVSMQWEKLCRDAVTGNLVNGVVY GKAKRWWGSVINEDKKPEQVEFDVMAESLDKKYLLVGECKWTTQENGKQLTTELLRKANL LPFAKKYTIVPMLFLKNAPKDEAGNTMLPEDVVRLSY >gi|226332233|gb|ACIC01000087.1| GENE 3 
3850 - 4170 284 106 aa, chain - ## HITS:1 COG:no KEGG:BF1073 NR:ns ## KEGG: BF1073 # Name: not_defined # Def: putative DNA-binding protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 105 3 107 108 160 92.0 1e-38 MTKNTMGTKLPRKLVQKMSIVGEQIELARLRRNLSVAQIAERATCSLLTVSRIEKGAPTV AIGIYLRVLYALQLDDDILWLAKEDKLGKALQDLSLKTRRRASKKV >gi|226332233|gb|ACIC01000087.1| GENE 4 4301 - 4594 142 97 aa, chain - ## HITS:1 COG:no KEGG:BF1149 NR:ns ## KEGG: BF1149 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 91 102 192 200 145 84.0 3e-34 MSGTSDGERIPVLYSIHHCYYGRYPDDADVLELSDETINVMKEAYRQIRIAEIYATRVDW MMSGNDSEENFRERIKEDLAEFEKEYASKDWIFFDVD >gi|226332233|gb|ACIC01000087.1| GENE 5 4723 - 4887 92 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSGGYFNRHMIAFGEIANSIERDIARTLQPKPGRYIKIIGLFTRKIALAHTIAI >gi|226332233|gb|ACIC01000087.1| GENE 6 4909 - 8166 1692 1085 aa, chain - ## HITS:1 COG:MJ1519 KEGG:ns NR:ns ## COG: MJ1519 COG0507 # Protein_GI_number: 15669714 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Methanococcus jannaschii # 115 1083 168 1171 1175 176 25.0 2e-43 MSLDSQSWLSDRHPEFHDVEKSPFNSSWLYGAERQLDILKWFRGNIEANESICVFYCKNG NPVDDEGRRMIVGMGEVTSVASIKLYDTEADYTYPLWEMVVQHSIRQELQDSKGFLLPYN QYLEFDEDYIQKKTGLTKEEALDEIKISLDKLGNTERIFNELSYGCEFISNQSMLIILEN ARRALEAVMKHGLVGGDWQLQLRWINDSIAKVKSSISPFPSFAECLKAIGVNYSYLIERD ILTAGCGKKDNPWRYYNDLMAGKLPVPNTVYFSELPAYKKSWEYRSDEGKRVLELLSRFE LDADIIGQYANNAETYEKLLTNPYIISEKCAQDYDNRVNTQTIDFGVIPDVDIQGENIPT APFAVRTLIDERRLRSMTVERLCSALDDGDTLLSIAELEQYVSDTLSDTNSLLPNDYFLT VRGFFSDELVYLPDDNPKALQLKEYAEMERWLSKRLLARAKSSVRNKLNVDWETRAMSSS HYDKNNENSREATRQQIEALEMMTDRKLSVLTGGAGTGKTTVVETFLSCSQIKNEGVLLL APTGKARVRLGKMGKGVEAQTIAQFLVRQGFFDWDRMLAIDNPNGRQYANAANVIIDECS MLTTRDLYILLKALDLAKINRLILIGDPCQLPPIGAGRPFADLCYRLQNKDTVPVLNSAI TSLETVVRTITTGESDILTLASWFSSKKPKKGADEIFNKMATGDLNGDLQVMTWTDESDL 
EKCLMEALCIELGCTEDNLSAVLQHRLGIDSIKSLAANPNTIESVQVLTPVLNPIWGSLH LNECVQKWIGTYDKEFIQFSTQKIYPKDKIMQLKNEKVEAYPSHQKYQLSNGQIGFVRSI YKGYANLIYAGIPNESFGKRGVSGEDSETPIELAYSITIHKSQGSDFNTVVLVLPKFGRI LTRELIYTALTRAKKKLILLIQDNVYWLWEKTKPQASILAQRNSNMFEQLVARENKSSIP YVEGLIHRTKNPNLMVRSKSEVIIANELISADIKFKYEEMFNRDGHQCLPDFTFVDLSDE IIIWEHLGMLTVPEYKTSWEKKLKFYNSIGFIEGENLFTTHDHENGSIDTTEIMKVIDKI KNLVE >gi|226332233|gb|ACIC01000087.1| GENE 7 8558 - 9658 479 366 aa, chain - ## HITS:1 COG:CC2770 KEGG:ns NR:ns ## COG: CC2770 COG3550 # Protein_GI_number: 16127002 # Func_class: R General function prediction only # Function: Uncharacterized protein related to capsule biosynthesis enzymes # Organism: Caulobacter vibrioides # 29 355 28 402 435 142 29.0 1e-33 MNSIKRIEVIYDNRLVGRLALTKESLCAFEYSAEWLNSGFSISPFELPLRSGVFIAKPRP FDGGFGVFDDCLPDGWGLLILDRYLQRNGVNPRTLSLLDRLALVGSTGRGALEFRPDKSV VSKQEYADFEKLALEAEQILDSDNYNGEGIEEFQYRGGSPGGARPKIFTRYDGKEWLVKF RAKRDSKHIGEDEYRYSLLAKECGIEMPKTRLFEDKYFGVERFDRTSNGKLHVVSIAGLI GADYRLPSIDYTHIFQVCATLTHSVAEMWKVYRLMVFNYLIDNKDDHAKNFAFIYRDGNW HFAPAFDLLPSDGINGFRTTSINDRIEPTKEDLFAVMVKVGLNEKEAVEIFEEIQGILSM TFDQSF >gi|226332233|gb|ACIC01000087.1| GENE 8 9651 - 9968 227 105 aa, chain - ## HITS:1 COG:FN1997 KEGG:ns NR:ns ## COG: FN1997 COG1396 # Protein_GI_number: 19705293 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 7 94 18 105 106 79 44.0 2e-15 MNNIIAFNVSSPSDIALQIAARVKVRRLELDLTQEGLAARAGVKFATYRRFEQTGEISLR GLLQVGFALNALSDFDALFAQKQYQSLDDVLNEQSVIRKRGKKNE >gi|226332233|gb|ACIC01000087.1| GENE 9 10800 - 11090 203 96 aa, chain + ## HITS:1 COG:no KEGG:BT_4524 NR:ns ## KEGG: BT_4524 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 96 1 96 96 160 100.0 2e-38 MATSKMKIKKVCEWCGTTFYAQKLTTRFCSHRCNNLAYKEAVRQKRIQEIETKVQTVISE QPISYFKDKEYLSFKEVATLLGLSKQAVYKMVYATL >gi|226332233|gb|ACIC01000087.1| GENE 10 11077 - 12261 380 394 aa, chain + ## HITS:1 COG:SA1625 KEGG:ns NR:ns ## COG: SA1625 
COG0732 # Protein_GI_number: 15927381 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Staphylococcus aureus N315 # 5 180 7 188 409 103 34.0 8e-22 MPHYENVPFEIPNSWVWTTIEEICSKIGSGSTPRGSNYSANGIPFFRSQNVYNDRLVYDD IKYISEEVHQKMKGTEVLANDLLLNITGGSLGRCAVVPADFNCGNVSQHVCIMRSVLVEP EYFHALVLSSYFAKSMKITGSGREGLPKYSLEQMAFPLPPLSEQQRIVMEIEKLFALIDQ IEHSKVNLQTIIKQTKSKILDLAIHGKLVPQDPNDEPAIELLKRINPDFTPCDNGHYTQL PDGWTFCRLDQIIGYEQSTAYIVESTAYDDSYSTPVLTAGKSFIIGYTNEATGIYSNLPC IIFDDFTTDSKLVDFPFKVKSSAMKILKVHKDIEVDYVAMFMSITKLVGDTHKRYWISEY SKLEIPIPSKAEQKRIIHAIHGIFTQLDLIMESL >gi|226332233|gb|ACIC01000087.1| GENE 11 12256 - 12783 161 175 aa, chain - ## HITS:1 COG:MJ1218 KEGG:ns NR:ns ## COG: MJ1218 COG0732 # Protein_GI_number: 15669403 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Methanococcus jannaschii # 16 162 17 173 425 65 28.0 5e-11 MIYNEYGNNIIEYIGHYTQLPNGWTVAPMQVLCSLVDGDKQKGIERINLDVKYLRGERDA KTLTSGKYVTANSLLILVDGENSGEVFRTSIDGYQGSTFKQLLINENMNEEYVLQAINLH RKVLRESKVGSAIPHLNKKLFKAIEIPIPPYKEQQRIIKAITKAFMSLDLIMESL >gi|226332233|gb|ACIC01000087.1| GENE 12 12790 - 13593 615 267 aa, chain + ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 5 267 4 265 265 339 65.0 3e-93 MVTKFKSYLAKTNLAKNTVTSYVWTVQYFLNHYGEVNKKNLLTYKGYLVENFKPQTVNLR LQGINKYLEFTKQEKLKVKFVKVQQKNFLENVISDADYKFLKTRLKADGYNEWYFIVWFM AATGARVSELLHIKAEHVQVGHLDLYSKGGKIRRLYIPKNLRTEATKWLKEKSLISGYIF LNRFGERITTRGIAQQLKHFAGKYGMNKEVVYPHSFRHRFAKNFLDRFNDLALLADLMGH ESIETTRIYLRRTASEQQKIVDKVVNW >gi|226332233|gb|ACIC01000087.1| GENE 13 13623 - 15059 579 478 aa, chain - ## HITS:1 COG:SP0508 KEGG:ns NR:ns ## COG: SP0508 COG0732 # Protein_GI_number: 15900422 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Streptococcus pneumoniae TIGR4 # 1 286 1 307 522 159 36.0 1e-38 
MDTKALRQKILDLAIHGKLVPQDPNDEPASVLLERIKAEKERLIKEGKIKRSKKSAKTSD TPHYENVPFEVPDNWVWMTLGEVGTWQSGGTPSRSNKTYYGGNIPWLKTGDLNDGLISDI PESITEEAVANSSAKINPAGSVLIAMYGATIGKLGILTFPATTNQACCACIEFNAITQLY LFYFLLSQRNGFIAKGGGGAQPNISKEIIVNTFIPLPPLSEQQRIVMEIEKWFALIDQVE QGKADLQNTIKQTKSKILDLAIHGKLVPQDPNDEPAIKLLKRINPDFTPCDNGHSRKLPQ GWYSVTANDVCSIIGGVSYNKADIQDTGIRVLRGGNIQNGKVIDCFDDVFISLSYQNNDN QVQRGDIIVVASTGSQTLIGKTGFADRDIPKTQIGAFLRIVRPKQKTLSPYIRLIFQTDA YKDYIRNVAKGSNINNVKNAHLQNFQICLPPLEEQQRIVQKIEELFSSLDDILTALEV >gi|226332233|gb|ACIC01000087.1| GENE 14 15105 - 16235 636 376 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 [Haemophilus parasuis 29755] # 17 366 10 337 339 249 38 1e-65 MRKDISTNDKDINDLFLKVVDLVTQARERVATAINIAEVFTKFHIGQYIVEYEQKGETRA EYGKAVLKELSKRLTDRLGDGWSYSTLKNIRQFYIVFSKGLTSGCPKSSQKANQWLADYP KAEEHISEPTFALSWSHYLILMRVENPDARSFYEIECTQQQWSKRQLSRQVGSCLYERLA LSRNKDEVMRLAKEGQTIGKPSDIIKNPITLEFLGLKPDAVYSESKLENAIINKMQQFLL ELGKGFLFEARQKRFTFDEQHFFVDLVFYNRLLQCYVLIDLKIDKLTHQDLGQMQMYVNY YDRYVKQDFEKPTIGILLCKEKNDALVELTLPKDANIYASAYQLYLPNKALLQAKVKEWI EEFEENEELKKLEENK >gi|226332233|gb|ACIC01000087.1| GENE 15 16239 - 17657 1078 472 aa, chain - ## HITS:1 COG:MA2116 KEGG:ns NR:ns ## COG: MA2116 COG0286 # Protein_GI_number: 20090959 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Methanosarcina acetivorans str.C2A # 1 471 1 486 498 422 46.0 1e-118 MATNSSTEQSLTKKVWNLATTLAGQGIGFTDYITQLTYLLFLKMDAENVEMFGEESAIPT GYQWADLIAFDGLDLVKQYEETLKLLSELDNLIGTIYTKAQNKIDKPVYLKKVITMIDEE QWLIMDGDVKGAIYESILEKNGQDKKSGAGQYFTPRPLIQAMVDCINPQMGETVCDPACG TGGFLLTAYDYMKGQSASKEKRDFLRDKALHGVDNTPLVVTLASMNLYLHGIGTDRSPIV CEDSLEKEPSTLVDVILANPPFGTRPAGSVDINRPDFYVETKNNQLNFLQHMMLMLKTGG RAAVVLPDNVLFEAGAGETIRKRLLQDFNLHTILRLPTGIFYAQGVKANVLFFSKGQPTK EIWFYDYRTDIKHTLATNKLERHHLDDFVSCYNNRVETYDAENNPQGRWRKYPVDEIIAR DKTSLDITWIKPGGEVDDRSLAELMADIKDKSQTISRAVTELEKLLANIEEN >gi|226332233|gb|ACIC01000087.1| GENE 16 17661 - 20183 1271 840 aa, 
chain - ## HITS:1 COG:MA2122 KEGG:ns NR:ns ## COG: MA2122 COG4096 # Protein_GI_number: 20090965 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Methanosarcina acetivorans str.C2A # 1 802 66 882 917 423 35.0 1e-118 VEAKREETDVFASKVCEQAALYARSVPNIYQAYQKPLPFIFTSNGKELYFCDFREQDSCF KQIITIPTPHELVKKLGIEDTFAGLPTLKRKGLRDCQYEAVTELEKSFRAGQNRALMVLA TGAGKTYTACLAAYRMLSYTPMRRVLFLVDRNNLGKQAEGEFGTFRLTENGDAFNTIFTV NRLRSSSIPSDSNVVISTIQRLFSFLKGEAIEDNDDDDENEPAEEVTLPPNPNLPHDYFD MIIIDECHRSIYGNWRMVLEYFDTARLVGLTATPIPETMAFFNNNRIVNYTLEKSIVDGV NVDCRVYRIKTQVTETGGAILEGEKFKEETRYTGEVKTVSSKETKNYTNKELNRSVINPA QIKLILSTYRDVVYTELFNDPQREPNMDYLPKTLIFALNEAHATNIVQIAKEVFGRTDDR FVQKITYSAGDSNELIRQFRNDKDFRIAVTCTLVATGTDVKPLEVVMFMRDVESLPLYIQ MKGRGVRTIGDDQLRNVTPNAFSKDCFYLVDAVGVTEHAQTVAPIDDGPTTKTITLKELL ERISHGYIPDEYLKRLAATLARIYNKADDSQRKEFVRLSHDDMKELSARIYDVLEKGILP LFVSTDEPNNERKGLVAPLANHADARKYLLILAAGFVNTLMPGEDTLISKGFSIEEAKST TEAFEEFCKEHSDEIEALRIIYNNEGTPITYSMLKDLENKLKMANNHFTSKQLWNSYAIV NAKAVRRSTTKEESDALTNIIQLVRFAFRQIERLDSVVTTSKQFFNLWLGQTQREITDKQ REVISRIVDYIASNGACTVRDIREDDATQAAQMIRAFGNMQKVDEALHSLYTFVVLRKAA Prediction of potential genes in microbial genomes Time: Thu May 12 01:31:24 2011 Seq name: gi|226332232|gb|ACIC01000088.1| Bacteroides sp. 1_1_6 cont1.88, whole genome shotgun sequence Length of sequence - 15137 bp Number of predicted genes - 15, with homology - 13 Number of transcription units - 11, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 131 - 1870 824 ## COG1479 Uncharacterized conserved protein 2 1 Op 2 . - CDS 1938 - 2141 130 ## BT_4515 hypothetical protein + Prom 2312 - 2371 3.4 3 2 Tu 1 . + CDS 2537 - 2692 72 ## 4 3 Tu 1 . + CDS 2798 - 3616 632 ## BT_4514 hypothetical protein - Term 3521 - 3567 -0.9 5 4 Op 1 . - CDS 3635 - 4435 809 ## COG0566 rRNA methylases 6 4 Op 2 . 
- CDS 4440 - 5492 776 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases - Prom 5612 - 5671 5.6 + Prom 5587 - 5646 3.9 7 5 Tu 1 . + CDS 5670 - 6104 395 ## BT_4511 hypothetical protein + Term 6176 - 6232 21.4 - Term 6164 - 6219 21.2 8 6 Tu 1 . - CDS 6240 - 8330 1125 ## COG5545 Predicted P-loop ATPase and inactivated derivatives - Prom 8385 - 8444 5.0 + Prom 8271 - 8330 2.1 9 7 Tu 1 . + CDS 8350 - 8529 78 ## + Term 8577 - 8613 5.4 - Term 8958 - 9004 4.4 10 8 Tu 1 . - CDS 9150 - 10043 474 ## COG1708 Predicted nucleotidyltransferases - Prom 10093 - 10152 3.4 11 9 Tu 1 . - CDS 10218 - 11444 501 ## BT_4508 hypothetical protein - Prom 11494 - 11553 4.6 + Prom 11480 - 11539 5.4 12 10 Op 1 . + CDS 11595 - 12476 733 ## COG2367 Beta-lactamase class A 13 10 Op 2 . + CDS 12504 - 13880 1138 ## COG0346 Lactoylglutathione lyase and related lyases 14 10 Op 3 . + CDS 13902 - 14444 385 ## BT_4505 hypothetical protein - Term 14169 - 14213 -0.4 15 11 Tu 1 . - CDS 14439 - 15113 491 ## COG2095 Multiple antibiotic transporter Predicted protein(s) >gi|226332232|gb|ACIC01000088.1| GENE 1 131 - 1870 824 579 aa, chain - ## HITS:1 COG:NMA2230 KEGG:ns NR:ns ## COG: NMA2230 COG1479 # Protein_GI_number: 15795099 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Neisseria meningitidis Z2491 # 1 571 1 558 571 271 34.0 4e-72 MANLNIEQKKVKSLFQESKYNFLIPDYQRPYAWGEMECKTLWDDLFSFSFPNDDCDQFID DEEYFLGPIVTFKNRNQMEVIDGQQRLTTLMLLLRAFYNKLGSMKDPRSKRMKEDIEKCI WRANEFGEFETSALKINSQVATDEDKEEFIAILKDGQGIGKKSRYAKNFNFFVEKIDAFL SNYPSYFAYFPARVLNNCVLLPIEAESQNTALRIFSTLNDRGKPLSDADIFKAQLYKYYS SFGKKDEFIETWKNLDKITSEVFHPIYGTPLDELFTRYMYYERALAEIKSSTTEALRKFY EGDGSYPLLHQPKTLSNLVSLAKFWQDVNNQETERFSDKVLKRLFVLSYAPNGMWTYITS VYFMYHRNENDEIEENAFCRFLDCITAFIWAYAITNPGVNSLRTPVYAEMVKIIRGEDVT FSDFKFSEERYRAQITNYAFNNSRAITKSMITWWAFQYDNQSLLSLENKFDIEHIYARNR LDKEKGLSNPQNVEILGNKVLLEKRLNIRASDYRFVDKVKYYKGFTTANGQEKNGSQIVE LAEISNSYSDFTETDIINRNSKIIDSFVNFLRANQLLKE >gi|226332232|gb|ACIC01000088.1| 
GENE 2 1938 - 2141 130 67 aa, chain - ## HITS:1 COG:no KEGG:BT_4515 NR:ns ## KEGG: BT_4515 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 67 79 145 145 125 98.0 5e-28 MKLNRLKAVLLKKGISQTWLAKQLDMSFSMVNAYACNRIQPNLQTLQQFASILQVDLKDL ITDKKDR >gi|226332232|gb|ACIC01000088.1| GENE 3 2537 - 2692 72 51 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQLTHSFSLIKLLSSLSIKVRMNNDVTFERPNVSIALMVEHKIMDMRAYRF >gi|226332232|gb|ACIC01000088.1| GENE 4 2798 - 3616 632 272 aa, chain + ## HITS:1 COG:no KEGG:BT_4514 NR:ns ## KEGG: BT_4514 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 272 1 272 272 525 99.0 1e-148 MYLTSMTHEELYAEVHKDLIEISTKANMFMDKVRKKTKNMLPYPLATQRITLTTTRRNVW TVVGKHNSYMQGVGFQAYAPIIGTSSNGYIQMTGFKSRDRLMHYTAHFMQRYKERYIDHY QIDRKGENIFEYFVYNNPQVLYTRKNNGGYFIVSDHGIAVADFSEGLKLMTHVTFLGDDE LTLKKQLIYDEEIKIYKGALELKRLKSRKQKDDLVTIWNVAKKHNAGIEMVKRWYQWNGV KVDEDYLQQCIDLIEKYNVQSLDQFAELMSRQ >gi|226332232|gb|ACIC01000088.1| GENE 5 3635 - 4435 809 266 aa, chain - ## HITS:1 COG:Cgl0802 KEGG:ns NR:ns ## COG: Cgl0802 COG0566 # Protein_GI_number: 19552052 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Corynebacterium glutamicum # 6 262 7 270 276 148 35.0 9e-36 MPVIEISSLSHPGVEIFCTLTEAQLRNRIEPDKGIFIVESPKVIERALDAGYEPLAILCE HKHITGDASDIIERCGNVPVYTGSRELLATLTGYVLTRGVLCAMRRPVVRSMAEVCREAR RIVVIDGVVDTTNIGAIFRSAAALGIDAVLLTRNSCDPLNRRAVRVSMGTVFLVPWTWMD GSLSDLGKLGFRTAAMALTDNSVSIDDPVLATEPRLAIVMGTEGDGLSPETIAEADYVVR IPMSHGVDSLNVAAAAAVAFWQLRAR >gi|226332232|gb|ACIC01000088.1| GENE 6 4440 - 5492 776 350 aa, chain - ## HITS:1 COG:RSc0194 KEGG:ns NR:ns ## COG: RSc0194 COG1063 # Protein_GI_number: 17544913 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Ralstonia solanacearum # 1 332 1 332 345 210 36.0 3e-54 
MLAYTYIEHGKFELREKPEPKITDARDAIVRVTLGSICTSDLHIKHGSVPRAVPGITVGH EMVGVVEEVGAEVTSVSPGDRVTVNVETFCGECFFCHHGYVNNCTDPNGGWALGCRIDGG QAEYVRVPYADQGLNRIPDTVSDEQALFVGDVLATGFWATRISEITEKDTVLLIGAGPTG ICTLLCSMLKNPKRIIVCEKSPERIQFVREHYPDVLITPPENCKEFVLQNSDHGGADVVL EVAGTEDSFRMAWDCARPNAIVTIVALYDKPLLFPLPEMYGKNLIFKTGGVDGCDCAEIL SLIEEGKIDTTPLITHRCSLNEIEEAYRIFENKLDGVIKVAIVCQTESNK >gi|226332232|gb|ACIC01000088.1| GENE 7 5670 - 6104 395 144 aa, chain + ## HITS:1 COG:no KEGG:BT_4511 NR:ns ## KEGG: BT_4511 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 144 1 144 144 288 98.0 5e-77 MKEDPDSLSVPYNFARCFNDQCSQASKCLRHIAAQYNGADYLYITSVNPARYPADGNQCE CFKTAVKVHVTWGLKRLLDRIPYEDAVSIRSQLVGHYGKTGYYRLYRGERGLMPKDQAYI KQLFRNKGIKEEPAYQRYTEEYIW >gi|226332232|gb|ACIC01000088.1| GENE 8 6240 - 8330 1125 696 aa, chain - ## HITS:1 COG:L109011 KEGG:ns NR:ns ## COG: L109011 COG5545 # Protein_GI_number: 15672499 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Lactococcus lactis # 317 639 61 389 480 70 22.0 1e-11 MKITQFRKNGDTTALSVLDLEKLVNKVKTEIKSRPVSTFREELRYMLPDDRCMFADKLPE IIPAAEFRKVNGQKQMKAYNGIVELTVGPLSNKSEIALVKQKASEQPQTRCAFMGSSGKT VKIWTTFTRPDNSLPKTREEAELFHAHAYRLAVKCYQPQIPFDILPKEPTLEQYSRLSYD PDIMYRPNSVQFYLSQPTAMPEETTFREAVQAEKSPLTRAVPGYDAENAFLMLFEAAFRK AYTDLSEAGLQLREDKWQPLVVQLAKNCFASGLPQEEVVKRTVFHFYMYKQEVLIREMIG NVYLECKGFGKNISLSKEQQLALQTEEFMKRRYEFRYNTQIAEVEYRERLSFRFRFNPLD KRALNSIALDAQMEGIPLWDRDISRYIYSNRVPVFNPLEDFLYRLPGWDGKDRIRELAAT VPCRNPYWTDLFHRWFLNMVSHWRGYDKKYANSVSPLLVGAQGTRKSTFCRSIMPPSERS YYTDSIDFSRKKDAELYLNRFALINIDEFDQVSSTQQGFLKHILQKPVVNMRKPHASAVL EMRRYASFIATSNQKDLLTDPSGSRRFICIEVTGVIDTNRPIDYEQLYAQAMYELEHGER YWFDQEDEKIMMENNREFEQVPPEEQLFFRYFRAAQPEEGEWLSPAEIMEDIQKNSSIPM SVKRVNSFGRILKKQEIPSKHVRSGTLYHVVKLVTG >gi|226332232|gb|ACIC01000088.1| GENE 9 8350 - 8529 78 59 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCQRLRVVETAKLLIISGTDKKKRKFNIEKADPKVGSILNSPPSGRRGAKGGVVGKHIT 
>gi|226332232|gb|ACIC01000088.1| GENE 10 9150 - 10043 474 297 aa, chain - ## HITS:1 COG:SMb20835 KEGG:ns NR:ns ## COG: SMb20835 COG1708 # Protein_GI_number: 16264326 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferases # Organism: Sinorhizobium meliloti # 1 289 27 322 331 162 33.0 5e-40 MKKSIKHLPKRTQEELTVLLDLVCKSVDNCQMIILFGSYARGNYVLWDTKIEFGVRTSYQ SDYDILVITNGAVKRVERKLERITNKYHDLFEYRRHAFPQFIVEHINTVNNNLEVSQYFF TDIIKEGILLYDSGKCQLAKPRKLSFREIRDIAQNEFDKLFPYACDFLTGVKDYFILKEK YNLSAFMLHQSCEKLYNTILMVFTNYRPKSHRLQDLGGRVKSFSMELVTVFPQNTDDEKE CFDLLCQAYIEARYNKDYKITREQLEYLISRLDILKEMTECLCKEKIAEYNAMAENG >gi|226332232|gb|ACIC01000088.1| GENE 11 10218 - 11444 501 408 aa, chain - ## HITS:1 COG:no KEGG:BT_4508 NR:ns ## KEGG: BT_4508 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 408 1 408 408 822 98.0 0 MLKCHSELFKSSVKLADGLRLSTKEELKDIAQLLLLPVPSKLRKEEYVSCLAEAVLTCPD IWISQLTHYEWLLLQKLVKAGANTYVEEPNMIMTSTLELLSFVAVECSLNGDKIRYMICD ELREAVAPHIDKFLTSTEENSRFVLEQYALGILNLYGLLPYTEFLKQLTGYLKGSMTKDE IAEGLSNSMLMRQLTFDIEDIYNSVMYVRSPLLLEVEDLEERLYAHRAIKSLKKFTVEEV LSAGTMPVFYISNPHSDQLKGFMMRKLGYNEELAEAKIQWLWYAIQMNENPMSAIVSAID TKVLSMQELQEVVGIAVNYCNDCPRWFLKGHSSTEASALLGRGESVKTPPRIVAGPTMKA AGMDITPEMQKMVDGMFYDTFSGTKIGRNDPCPCGSGKKYKKCCGRDN >gi|226332232|gb|ACIC01000088.1| GENE 12 11595 - 12476 733 293 aa, chain + ## HITS:1 COG:SMa1953 KEGG:ns NR:ns ## COG: SMa1953 COG2367 # Protein_GI_number: 16263522 # Func_class: V Defense mechanisms # Function: Beta-lactamase class A # Organism: Sinorhizobium meliloti # 10 274 17 317 334 79 27.0 6e-15 MRSFILLLCLIPTIICAQNLSLENQLKQAIQGKKAEIGIAVIIDGKDTVTVNNETHYPLM SVFKFHQALALADYMGKQQQSLNFELTIKKEDLKPDTYSPLRDSFPQGGFNIDIADLLKY TLQQSDNNACDILFQYQGGVDAVNQYIHSLGVTDCAIVCTENDMHQDESLCYQNWTTPLA AARLLEIFRKEALFPQEYKDFIYQTMTECQTGQDRLVAPLLGKEVTIGHKTGTGDRNAKG QQVACNDIGFVLLPDGRAYSIAVFVKDSEENNQENSKIIADISRIVYEYVTHQ >gi|226332232|gb|ACIC01000088.1| GENE 13 12504 - 13880 1138 458 aa, chain + ## HITS:1 
COG:lin0429 KEGG:ns NR:ns ## COG: lin0429 COG0346 # Protein_GI_number: 16799506 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Listeria innocua # 1 124 1 123 126 140 52.0 4e-33 MKLHHIAIWTFRLEELKDFYVRFLGGTSNEKYINPKKGFESYFISFDEGPTLELMSRVDV QNTPIEENRRGLTHLAFTFPSKEEILRFTEEMRSEGYTIAGEPRTSGDGYFESVVLDPDG NRLECVYKKEPEAERTEAALCPNIETKRLLLRPFQENDAEAFFACCQNPNLGNNAGWAPH KTLNESREILHSAFIGQEGIWAVTLKDTQQLIASIGIVPDPKRENPQVRMLGYWLDEPYW GKGYMSEAVQAVLNYGFNELQLSLITANCYPHNKRSQQVLKRNGFIYEGTLHQAELTYNG NIYDHECYYIPNIARPTEQDYDELIRLWEKSVRSTHHFLTEESIQFYKPLIRNHYLPAVA LFIIRNSHGKIAAFMGLSDELIEMLFVHPDEQGKGYGKRLIEYAIRQKQIDKVDVNEDND QALRFYQHLGFEIIGRDETDSMGKPYPILHLQLTDDKK >gi|226332232|gb|ACIC01000088.1| GENE 14 13902 - 14444 385 180 aa, chain + ## HITS:1 COG:no KEGG:BT_4505 NR:ns ## KEGG: BT_4505 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 180 1 180 180 294 99.0 8e-79 MDLRTQLATRFHIENIRELLHYIKEDERLREEIYRLIFDEDDIVSYQALWVCTHFSKPEV EWLTLKQDELIDAALTCPHSGKRRMLLNLLCQQQLADPPRVDLLDFCMERMVSRQEPAGV QSLCIKLAYQLTCSIPELQQELRTMLEIMEPELLVPAIRSVRRNTLKAIKAKKKKEHPNS >gi|226332232|gb|ACIC01000088.1| GENE 15 14439 - 15113 491 224 aa, chain - ## HITS:1 COG:CC1662 KEGG:ns NR:ns ## COG: CC1662 COG2095 # Protein_GI_number: 16125908 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Multiple antibiotic transporter # Organism: Caulobacter vibrioides # 4 218 9 218 228 73 26.0 3e-13 MSLSSLLIIFSSSFMALFPVINPLGNGFVVNGFFADLDPSQRKTAIQKLTINFIMIGVGT LLIGHLFLLIFGLAIPVIQLGGGILICKTAIELLGDSNSSDKEGTSQNVDSFKWKSIEQK IFYPITFPISIGPGSISVIFTLMASASVKGKLLHTGINYLVIALVIICMAGILYVFLSQG QRIIQKLGPVGNQIINKLVAFFTFCIGIQISVTGISQIFHLNIL Prediction of potential genes in microbial genomes Time: Thu May 12 01:32:02 2011 Seq name: gi|226332231|gb|ACIC01000089.1| Bacteroides sp. 
1_1_6 cont1.89, whole genome shotgun sequence Length of sequence - 59710 bp Number of predicted genes - 54, with homology - 54 Number of transcription units - 31, operones - 12 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 118 - 164 7.7 1 1 Tu 1 . - CDS 183 - 647 590 ## COG2030 Acyl dehydratase - Prom 668 - 727 3.4 2 2 Tu 1 . - CDS 754 - 1284 442 ## BT_4502 hypothetical protein - Prom 1311 - 1370 2.8 + Prom 1280 - 1339 1.5 3 3 Op 1 . + CDS 1385 - 3034 254 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit 4 3 Op 2 . + CDS 3116 - 4093 803 ## BT_4500 hypothetical protein - Term 4093 - 4158 6.1 5 4 Tu 1 . - CDS 4166 - 5524 1188 ## COG0534 Na+-driven multidrug efflux pump - Prom 5571 - 5630 5.1 6 5 Tu 1 . - CDS 5642 - 5848 164 ## BT_4498 hypothetical protein - Prom 5976 - 6035 4.0 + Prom 5877 - 5936 3.1 7 6 Tu 1 . + CDS 5973 - 6278 194 ## BT_4497 hypothetical protein - Term 6049 - 6093 2.0 8 7 Tu 1 . - CDS 6307 - 7812 789 ## COG0168 Trk-type K+ transport systems, membrane components - Prom 7889 - 7948 4.8 + Prom 7739 - 7798 5.8 9 8 Tu 1 . + CDS 8043 - 8774 193 ## PROTEIN SUPPORTED gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 + Prom 8794 - 8853 6.0 10 9 Tu 1 . + CDS 8875 - 9144 294 ## BT_4495 hypothetical protein 11 10 Op 1 . - CDS 9276 - 9548 424 ## COG2388 Predicted acetyltransferase 12 10 Op 2 . - CDS 9559 - 10872 847 ## COG1090 Predicted nucleoside-diphosphate sugar epimerase - Prom 10946 - 11005 11.5 + Prom 10661 - 10720 6.6 13 11 Tu 1 . + CDS 10967 - 12703 782 ## BT_4492 hypothetical protein + Prom 12736 - 12795 4.3 14 12 Op 1 . + CDS 12834 - 13220 168 ## BT_4491 hypothetical protein 15 12 Op 2 . + CDS 13239 - 13811 431 ## BT_4490 hypothetical protein 16 12 Op 3 . + CDS 13849 - 15447 687 ## BT_4489 hypothetical protein + Term 15570 - 15617 -0.5 + Prom 15600 - 15659 6.5 17 13 Op 1 . + CDS 15833 - 16672 751 ## COG0266 Formamidopyrimidine-DNA glycosylase 18 13 Op 2 . 
+ CDS 16705 - 17685 946 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) 19 13 Op 3 . + CDS 17769 - 18647 481 ## COG1226 Kef-type K+ transport systems, predicted NAD-binding component 20 14 Tu 1 . - CDS 18681 - 18980 220 ## BT_4485 hypothetical protein - Prom 19112 - 19171 5.0 + Prom 19112 - 19171 4.2 21 15 Tu 1 . + CDS 19234 - 19902 379 ## PROTEIN SUPPORTED gi|163764775|ref|ZP_02171829.1| ribosomal protein L16 22 16 Op 1 . - CDS 19905 - 20678 584 ## BT_4483 hypothetical protein 23 16 Op 2 . - CDS 20711 - 22054 948 ## COG4277 Predicted DNA-binding protein with the Helix-hairpin-helix motif - Prom 22082 - 22141 10.7 - Term 22159 - 22204 2.8 24 17 Tu 1 . - CDS 22376 - 23467 989 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins - Prom 23490 - 23549 2.4 - Term 23519 - 23557 2.6 25 18 Tu 1 . - CDS 23601 - 27224 3365 ## BT_4480 hypothetical protein - Prom 27250 - 27309 9.0 26 19 Tu 1 . - CDS 27718 - 28656 636 ## BT_4479 integrase protein - Term 29423 - 29466 10.1 27 20 Tu 1 . - CDS 29500 - 29676 323 ## BT_2472 hypothetical protein - Prom 29754 - 29813 6.7 + Prom 29735 - 29794 10.4 28 21 Op 1 . + CDS 29824 - 30471 578 ## BT_4477 putative ATP-dependent DNA helicase 29 21 Op 2 . + CDS 30506 - 31894 817 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 + Prom 31903 - 31962 10.1 30 21 Op 3 . + CDS 31983 - 32615 611 ## BT_4475 hypothetical protein + Term 32652 - 32706 10.0 + Prom 32643 - 32702 7.1 31 22 Tu 1 . + CDS 32944 - 35103 1582 ## COG1509 Lysine 2,3-aminomutase - Term 34996 - 35055 9.7 32 23 Op 1 . - CDS 35183 - 36493 360 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 33 23 Op 2 . - CDS 36552 - 37259 705 ## BT_4472 hypothetical protein 34 23 Op 3 . - CDS 37332 - 38927 1303 ## BT_4471 hypothetical protein 35 23 Op 4 . 
- CDS 38945 - 42085 2474 ## BT_4470 outer membrane protein - Prom 42132 - 42191 5.1 + Prom 41979 - 42038 5.3 36 24 Op 1 . + CDS 42177 - 42761 386 ## COG3663 G:T/U mismatch-specific DNA glycosylase 37 24 Op 2 . + CDS 42816 - 43169 366 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family + Term 43313 - 43362 14.1 + TRNA 43235 - 43307 80.5 # Trp CCA 0 0 - Term 43305 - 43344 5.0 38 25 Tu 1 . - CDS 43565 - 45430 1821 ## COG2304 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain - Prom 45516 - 45575 4.9 + Prom 45503 - 45562 7.7 39 26 Op 1 4/0.000 + CDS 45621 - 46274 596 ## COG0558 Phosphatidylglycerophosphate synthase 40 26 Op 2 5/0.000 + CDS 46271 - 47227 872 ## COG4589 Predicted CDP-diglyceride synthetase/phosphatidate cytidylyltransferase 41 26 Op 3 . + CDS 47253 - 47906 406 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase 42 26 Op 4 . + CDS 47912 - 48484 334 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 43 26 Op 5 . + CDS 48481 - 49830 1040 ## BT_4460 hypothetical protein + Term 49852 - 49892 1.7 - Term 49918 - 49970 17.1 44 27 Op 1 . - CDS 50078 - 51250 937 ## COG2311 Predicted membrane protein 45 27 Op 2 . - CDS 51331 - 52203 838 ## COG2240 Pyridoxal/pyridoxine/pyridoxamine kinase - Prom 52262 - 52321 4.3 - Term 52267 - 52311 10.0 46 28 Op 1 . - CDS 52407 - 52988 548 ## BT_4457 hypothetical protein 47 28 Op 2 17/0.000 - CDS 52985 - 54370 1143 ## COG1139 Uncharacterized conserved protein containing a ferredoxin-like domain 48 28 Op 3 . - CDS 54367 - 55107 620 ## COG0247 Fe-S oxidoreductase 49 28 Op 4 22/0.000 - CDS 55164 - 55712 220 ## PROTEIN SUPPORTED gi|157803532|ref|YP_001492081.1| 50S ribosomal protein L35 50 28 Op 5 . - CDS 55739 - 56113 334 ## COG0720 6-pyruvoyl-tetrahydropterin synthase - Prom 56168 - 56227 6.3 + Prom 56127 - 56186 5.7 51 29 Tu 1 . + CDS 56242 - 56601 281 ## COG2832 Uncharacterized protein conserved in bacteria + Term 56837 - 56874 -0.1 52 30 Tu 1 . 
- CDS 56571 - 57137 555 ## BT_4451 putative MTA/SAH nucleosidase - Prom 57277 - 57336 4.7 + Prom 57164 - 57223 4.2 53 31 Op 1 . + CDS 57287 - 58522 1058 ## COG3274 Uncharacterized protein conserved in bacteria 54 31 Op 2 . + CDS 58544 - 59611 835 ## COG0673 Predicted dehydrogenases and related proteins Predicted protein(s) >gi|226332231|gb|ACIC01000089.1| GENE 1 183 - 647 590 154 aa, chain - ## HITS:1 COG:CC0942 KEGG:ns NR:ns ## COG: CC0942 COG2030 # Protein_GI_number: 16125194 # Func_class: I Lipid transport and metabolism # Function: Acyl dehydratase # Organism: Caulobacter vibrioides # 8 148 5 145 148 119 45.0 2e-27 MEKVIINSYEEFEKLVGQQIGVSDYVELSQERINLFADATLDHQWIHVDTERAKVDSPYH STIAHGYLTLSMLPYLWNQIIQVNNLKMMINYGMDKMKFGQAVLSGQSVRLVTTLHSLTN LRGVAKAEIKFAIEIKDQPKKALEGIAVFLYYFN >gi|226332231|gb|ACIC01000089.1| GENE 2 754 - 1284 442 176 aa, chain - ## HITS:1 COG:no KEGG:BT_4502 NR:ns ## KEGG: BT_4502 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 176 1 176 176 364 99.0 1e-100 MEDKPKFAFVGNSHMAFWALNVYFPQWECLNYGAPGEGLAYVESFHEDTSDCQVVIQFGS NDIYQLNEENTDDYVERYVKAVLAVPKVKTYLFCIFPRNDYDDYSTAVNKFIRMLNEKIV AKLTGTGVIYLDVFDQLLKNGRLNPELTIDDLHLNGKGYRILSTALKQAFNGQEHL >gi|226332231|gb|ACIC01000089.1| GENE 3 1385 - 3034 254 549 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 342 516 82 247 285 102 37 5e-21 MIHFFDQPVSHIALPERFTYPFNYTPHPLCVLAAEEVKAYISTKKEWLEELALGKMFGVL IVQTQEEESSSIGYLAAFSGNLAGKNLHPYFVPPVYDLLQPQGFFKIEEEQISAINVRIS ELEVNPHYLHLKERLDRETEQARLVLIQAKEELKTAKRERELRRQSSPALSEEEQDTLIR ESQYQKAEFKRLERGWKERINTLEEEVITFETEIEKLKNERKQRSAALQQKLFEQFRMLN AKGEIKDLCTIFEQTVHKIPPAGAGECALPKLLQYAYLHQLKPLAMAEFWWGNSPKTEVR HHGYYYPSCKGKCEPILQHMLQGLKVDGNPLSPHAHRKEELEIVFEDEWLVVVNKPSGML SVPGKEEETDSVYHRVKAKYPEATGPMIVHRLDMATSGLLLVAKTKEVHQHLQEQFINRS IKKRYVALLDRNGLNQQLEETGTINLPLCLNPLDRPRQMVSEEYGKPAVTEYRILNNSDK 
YIRIALYPLTGRTHQLRVHAAHHQGLNCPILGDELYGKKADRLYLHAEYIEFRHPVYGDI ICIQKEAEF >gi|226332231|gb|ACIC01000089.1| GENE 4 3116 - 4093 803 325 aa, chain + ## HITS:1 COG:no KEGG:BT_4500 NR:ns ## KEGG: BT_4500 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 325 1 325 325 582 98.0 1e-165 MNFKFRITKYLAVSALAVLLLGACSKNNIYMDVAYPNGEGNNGGEEGNSDNPDKKDALIT FSASVEGRNITRAMSPMGKGLQSWLCAYPSNTTNTIEGEPVGEGNYITSSPGVLTGIQSY KMYLSNDIYSFYAVSCNSSNPAPTFTNGKSEALSNGVDYLWWHALHQDVTSSQVNIPITY QHVATQVVITITGGENITLNKVLSATITPTKPGAFMDLSTGIISSEVSYDKPANMGINDF TVQYIMLPLKSSDPMALTMELMVNGESFSRTYNTTITPPDNILAAGNSYLFRAVINENSV SFGNVSVKDWTDVDESGNPLYPIQD >gi|226332231|gb|ACIC01000089.1| GENE 5 4166 - 5524 1188 452 aa, chain - ## HITS:1 COG:L170983 KEGG:ns NR:ns ## COG: L170983 COG0534 # Protein_GI_number: 15672149 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Lactococcus lactis # 5 444 2 441 446 250 36.0 4e-66 MATSKEMTAGPALPLIFNFTLPLLFGNLLQQTYSLVDAAIVGKFLGINALASVGASTSVV FLILGFCNGCCGGFGIPVAQKFGARDYSTMRSYVSVSLQLAVVMSVVIAIFTSIYCADIL KMMRTPENIFEGAYAYLLVTFIGIPCTFFYNLLSSIIRALGDSKTPFYFLVLATVLNIIL DLFCILVLGWGVMGAAIATVFSQGVSAFLCYVYMYRKFDILRGTPKERKYQSKLAKTLLS IGVPMGLQFSITAIGSIMLQSANNALGTACVAAFTSAMRIKMFFLCPLESLGMAMATFSG QNYGAGKPERIWSGIKASTLMMVIYTAVTFLILMVGAKSFALIFVDPSEIEILEKTELFL HISVSFFPMLGLLCILRYTIQGVGYTNLAMFSGVSEMIARILVSIYAVPAFGFIAVCYGD PMAWIAADLFLIPAFIYVYRRLKRQVLTSAVA >gi|226332231|gb|ACIC01000089.1| GENE 6 5642 - 5848 164 68 aa, chain - ## HITS:1 COG:no KEGG:BT_4498 NR:ns ## KEGG: BT_4498 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 68 14 81 81 139 98.0 3e-32 MRKLLCPQCKIAALYVVNDRKERLLVYVLENGEVVPKDPTASTDGFDLTEVYCLGCSWHG SPKRLVKR >gi|226332231|gb|ACIC01000089.1| GENE 7 5973 - 6278 194 101 aa, chain + ## HITS:1 COG:no KEGG:BT_4497 NR:ns ## KEGG: BT_4497 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 101 1 101 
101 167 99.0 7e-41 METDKKPRITQEKRTVEIMIRLYCRKKEKNATLCPECEELLRYAHARLDHCPFGEKKSSC KKCPVHCYKPAMRKRMREVMRFSGPRMLFYAPLEAIRHLFS >gi|226332231|gb|ACIC01000089.1| GENE 8 6307 - 7812 789 501 aa, chain - ## HITS:1 COG:MA1483 KEGG:ns NR:ns ## COG: MA1483 COG0168 # Protein_GI_number: 20090342 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Methanosarcina acetivorans str.C2A # 45 498 26 474 476 290 40.0 6e-78 MEEYISDNSVYARAKSNSMINPKMIGRVLGALLFIELGMFLLCAGISLGYGEDDYKYFLY TCVINAVAGGLLLLYGRGAENKMSRRDGYCVVTLSWIFFTVFGMLPFYFSGSIDTITNAF FETMSGFTTTGATILDDIESLSHGMLFWRSLTQWIGGLGIVFFTIAILPIFTTGGVQLFS AETTGVTHDRTHPKINVMAKWLWTIYLILTVSETILLMFGGMDLFDAICQSFATTATGGY STKQNSISYWNSPYIEYVVAIFMIISSINFSLFLMCLKGKVGRLFKDEETHWFLGSVGIL TLLITLALVFQNNYGWELAFRKSLFQVATVHTSCGFATDDYNLWPPFTWLLLFIAMLSGG CTGSTSGGIKNMRLLIVARTIRNEFKHLLHPNAVLPVRVNKQTVSSSIVTTVLLFFAFYL VIILIGWTILLFLGVGFSESVSTVISSIGNVGPGLGSCGPAYSWNSLPDLAKWVLSFLML IGRLELFSVLLLFYPGFWKNR >gi|226332231|gb|ACIC01000089.1| GENE 9 8043 - 8774 193 243 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 [alpha proteobacterium BAL199] # 5 241 8 254 259 79 28 6e-14 MKRAIIIGATSGIGKEVAQRLLSEGWQIGIAGRRQSALEDFRQIAPEQIKIQSLDVTQED SAGKLDALIHQLGGMDLFLLSSGIGFQNMELNMEIELNTTRTNVEGFTRMVDTAFSYFRN NGGGHLAVISSIAGTKGLGVAPAYSATKRFQNTYIDALEQLAHLQKLNIHFTDIRPGFVA TDLLNDGKHYPMLMKTDQVGRDIVRALHRKQRVAIIDGRYRILVFFWRMIPRWIWKRLPV KTK >gi|226332231|gb|ACIC01000089.1| GENE 10 8875 - 9144 294 89 aa, chain + ## HITS:1 COG:no KEGG:BT_4495 NR:ns ## KEGG: BT_4495 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 89 1 89 89 152 100.0 3e-36 MEQKICQCCAMPIDETTFGTEADGSKNEEYCQYCYADGHFTKECTMDEMIELNLNYLEEF NKDSEVKYTIEEARATMKEFFPQLKRWKQ >gi|226332231|gb|ACIC01000089.1| GENE 11 9276 - 9548 424 90 aa, chain - ## HITS:1 COG:PA1749 KEGG:ns NR:ns ## COG: PA1749 COG2388 # Protein_GI_number: 15596946 # Func_class: R General function 
prediction only # Function: Predicted acetyltransferase # Organism: Pseudomonas aeruginosa # 5 82 74 152 161 56 43.0 1e-08 MDYKITHQPEQKLFKTEVDGRTAFVEYRLLGDYLDIIHTIVPKPIEGRGIAAALVKAAYD FALANGMKPKATCSYAVRWLERHPEMNADS >gi|226332231|gb|ACIC01000089.1| GENE 12 9559 - 10872 847 437 aa, chain - ## HITS:1 COG:ECs3188 KEGG:ns NR:ns ## COG: ECs3188 COG1090 # Protein_GI_number: 15832442 # Func_class: R General function prediction only # Function: Predicted nucleoside-diphosphate sugar epimerase # Organism: Escherichia coli O157:H7 # 1 273 1 283 297 179 39.0 9e-45 MNIAMTGATGYIGKHLSNYLTEKGGHRIIPLGRAMFREGMSGLLVQTLTHCDVIINLAGA PINQRWTPEHKQELFNSRITVTHRIIRALNAVKTKPKLMISASAVGYYPSLVESDEYTQT RGDGFLSDLCYAWEKEAKRCPQPTRLVITRFGVVLSPDGGVMQQMLRPLRATKVAAVIGP GTQPFPWIDIRDLCRAMSFFIENEELRGVFNLVAPQAVSQSTFTRAMGKAYHAWTTLIVP QAVFRLLYGEAASFLTAGQSVRPTRLLEAGFHFSVPTIEKLFEGTDHSTVDRLDLKRYMG LWYEIARYDHRFERGLTEVTATYTLRPDGTIRVENRGYKRNSPYDICKTATGHAKIPDPA QPGKLKVSFFLNFYSDYYVMELDQENYNYALIGSSTDKYLWILSRTPQLPEDIKKKLVTA AERRGYDTNRLQWIEQL >gi|226332231|gb|ACIC01000089.1| GENE 13 10967 - 12703 782 578 aa, chain + ## HITS:1 COG:no KEGG:BT_4492 NR:ns ## KEGG: BT_4492 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 578 1 578 578 919 88.0 0 MLSLCHMQTNKKTKLVVKICSILFLGTFLFCACSSRERTETSFLQAESLLEEHPDSALQL LQTLPALQELSHKESARYALLLARATDKCEKSLLPCDSLLNLALNYYDNDEKERAVALLY KGRLEIEINSTKEAIAHLQEALTILKNYPKEIEIKRHTLSSLGNTYYRTSFFEEAIKAYQ ELYKYCITDKDKYIALNDISSYYCMTDNKDSTITTQHKALEYAIASKDSSMIAISKLNLS VYYDEFEELDSALYHAYSALQWFPSGKVPGNLYSHLGTLLFEKGDNPDSAIYYLNKNIEN TTDITGKAAALLSLYDIEKEQGNYKAASAYLEEHVEILDSMYSTEQYSELQQLINKYNIK IHIREEQIKEQHRLLLIITLFIFCCLLIILFYQYHINKRKRIQLIYQQNLKQTQYKLSSM QTIIEDNQSIISLLQEEQRNLKQDQANKIDEIQERESTIERLRQEKHELRNWLFTQSDIY KRIIALSKQKVSDKKAMKVLTNAELKKLQKIIFEIYADYISYLQKKYPKLTEEDILYLCL NEAKLQPLTIALCFGYNNTHPINQRKLRIKERMKDEEM >gi|226332231|gb|ACIC01000089.1| GENE 14 12834 - 13220 168 128 aa, chain + ## HITS:1 COG:no KEGG:BT_4491 NR:ns ## KEGG: BT_4491 # Name: not_defined 
# Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 128 1 128 128 179 76.0 3e-44 MKAKSLFILLSVLFIIPMNAATLETQRHRSHWRVRDKESKVSTRSPFTFDINVQEVDNCL QIIFLSPLPDAEITITDKNGKTVVHEPPTFINKDKTLYIETPNGYPYTVKIISPIMDITG DIVEEESE >gi|226332231|gb|ACIC01000089.1| GENE 15 13239 - 13811 431 190 aa, chain + ## HITS:1 COG:no KEGG:BT_4490 NR:ns ## KEGG: BT_4490 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 190 1 190 190 370 100.0 1e-101 MKLNIYLIGAFFGMLILASCGNDDDETPVPKDDSWYWGYFKGEINGKEISLENEAHTDWS VHSVKTSASPPNNDTDSIRGMMTSIKYGEDELLSITIYHLHKGIRYITKSTKTDWIYDGI QITRDTHSDKYEDKYIYYIPQKEKPFQVEITKSAYADQWHPIIEGNLNGVLYRSDNPKDS IIINGSYGTR >gi|226332231|gb|ACIC01000089.1| GENE 16 13849 - 15447 687 532 aa, chain + ## HITS:1 COG:no KEGG:BT_4489 NR:ns ## KEGG: BT_4489 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 532 1 532 532 997 99.0 0 MKKILLMLASAFFMFSCSQEEGTSPINENTSNLPVSATADAQLKFAKLLSQAASSSIEVR NFLKKEALVQFDNDYDIFYPLIKDKKVSGNQSFRDILLSYCKEENELSQIEQSLPLLNIM VPDLTLFWDFNATTWDTSDKEISVLCRDDKNNTLYENGENIGNMTNGDIPGFPCLVVKNN ERMKITSVNTRAGEATYEFISDAFDGSKKVQTRHYDADNDLEATEDLNKYVSKDVVATAI NAWREFKNVPNACQRDYIYYGINKTNTPGSLNRNIREKLYRFRISATAFGRIADADGDQR LQDLTQTKRYLSNEEILQNIWTDGNFEFRFKSYIAGETNKDAMEHLLTFSVKAKDVFSIE KIHLHHKNGTAFRQSKNFYSVDINNLRSKWIYPEKLEPNKDNQVFTLPWDLYNKALSIHM YVEEFDVDQTIEKTITVVNEFTNKADFSIEGGGSIDSVKLSAKLGYGFSHTNTTTSSTKV TTTVGSDELGTLSFFFYDPIIRAESNNTYKLYDVSSGDVTATILPINLIITR >gi|226332231|gb|ACIC01000089.1| GENE 17 15833 - 16672 751 279 aa, chain + ## HITS:1 COG:SP0970 KEGG:ns NR:ns ## COG: SP0970 COG0266 # Protein_GI_number: 15900847 # Func_class: L Replication, recombination and repair # Function: Formamidopyrimidine-DNA glycosylase # Organism: Streptococcus pneumoniae TIGR4 # 1 276 1 271 274 88 27.0 1e-17 MIEAPEARYLCEQLTETVVGKRISDVFIQFSPHKFAWFNGNSDEFAEWLVDKRINSAQSQ GGMVEITIEDKVLVLTDGVNLRYLTPGTKLPAKHQLLIAFEDESCLIASVRMYGGIMCYS 
KDAANGVLSEYYRTAKSKPQVMSDAFSKEYFLGLINDESAQKKSAKAFLATEQTVPGLGN GVLQDILYHAHIHPKKKIAALTDKEKENLFYQVKETMNDIYRQGGRNTESDLFGENGKYT ACLSKDTAGKACPRCGETIVKENYLGGSIYYCRGCQILE >gi|226332231|gb|ACIC01000089.1| GENE 18 16705 - 17685 946 326 aa, chain + ## HITS:1 COG:aq_1420 KEGG:ns NR:ns ## COG: aq_1420 COG0741 # Protein_GI_number: 15606599 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Aquifex aeolicus # 93 317 81 286 299 115 33.0 1e-25 MKKFTKINYILTFILVFCIGATLPILIGSSHVDEQHSAKSEVPYCVTSPTVPEQVTFDGE TIDLRRYDRRERMDREMMAFTYMHSTTMLLIKRANRYFPIIEPILKANGIPDDFKYLMVI ESNLNNIARSPAGAAGLWQFMPATGREFGLEVNENVDERYHIEKATVAACKYFKQAYAKY GDWMAVSAAYNAGQGRISSQLDQQLASHAMDLWLVEETSRYMFRLLAVKEIFKNPQRYGF LLKKEHLYPPIPYKEITVTTPIANLSDFAKQQGITYAQLRDANPWLREQSLKNRTGKTYV LQIPTQEGMYYDPKKTVAYNKHWVID >gi|226332231|gb|ACIC01000089.1| GENE 19 17769 - 18647 481 292 aa, chain + ## HITS:1 COG:MA2034 KEGG:ns NR:ns ## COG: MA2034 COG1226 # Protein_GI_number: 20090882 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, predicted NAD-binding component # Organism: Methanosarcina acetivorans str.C2A # 3 279 5 279 279 274 49.0 1e-73 MKLKERIHQFLNDQKLKHKLYVIIFESDTPSGKLFDVVLIGCILASVLLVIIESLKGLPS YLTTPFVVMEYLFTAFFTFEYLTRIYCSPRPKKYIFSFFGIVDLLATLPLYIGLIFPGAR YLLIIRAFRLIRVFRVFKLFNFLNEGERLLTALRESSKKIAVFFLFVVILVTSIGTLMYM IEGALPNSQFNNIPNSIYWAIVTMTTVGYGDITPVTGLGKFLSACVMLIGYTIIAVPTGI VSASMMKDYKRRRDKECPNCHRSGHEDNAQFCKYCGHDLNPTSENEDKENIQ >gi|226332231|gb|ACIC01000089.1| GENE 20 18681 - 18980 220 99 aa, chain - ## HITS:1 COG:no KEGG:BT_4485 NR:ns ## KEGG: BT_4485 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 99 23 121 121 157 97.0 1e-37 MYVRDSFVRPYMGDALVVVLVYSFVRIFIPTGVPRLPFYVFLFACFVEVLQYFQLVETLG ITNRALRIILGSTFDWKDIVCYGAGCIFIFLFEQIFRRR >gi|226332231|gb|ACIC01000089.1| GENE 21 19234 - 19902 379 222 aa, 
chain + ## PROTEIN SUPPORTED ## NR: gi|163764775|ref|ZP_02171829.1| ribosomal protein L16 [Bacillus selenitireducens MLS10] # 8 216 15 229 236 150 38 2e-35 MMLNVDFMLRLLVAGILGAIIGLDREYRAKEAGYRTHFLVSLGSALIMIVSQYGFQEIIK ENSVTLDPSRVAAQVVSGIGFIGAGTIILQKQIVRGLTTAAGIWATAGIGLAVGAGMYTI GIATTVLTLIGLELLSYIFKSVGMKSSMISFSTNNKDTLKQIADRFNSKDYMIVSYEMET QHSGEMESYQVTMVIKSKRNNDEGHLLSLMQKFPDVTVQRIE >gi|226332231|gb|ACIC01000089.1| GENE 22 19905 - 20678 584 257 aa, chain - ## HITS:1 COG:no KEGG:BT_4483 NR:ns ## KEGG: BT_4483 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 257 1 257 257 510 99.0 1e-143 MNVYIYDKTFDGLLTAVFDAYFRKTFPDFLLSDGDALPLFYDELHTVVTDEEKAARVWRG LQKKVSSSALGCLTQCWLSELPDIGMVIFRYIRKAIDAPCSIETNFGDPDVLLLAQIWKK VDGERMHLMQFVRFQKAADGTFFAAFEPQYNALPLTVQHFKDRFADQKWIIYDMKRRYGF YYDLQEVTTISFDDDSREAHLITGMLDESLMDENEKIFQKLWKTYFKAICIKERMNPRKH KKDMPVRYWKYLTEKQK >gi|226332231|gb|ACIC01000089.1| GENE 23 20711 - 22054 948 447 aa, chain - ## HITS:1 COG:CAC3343 KEGG:ns NR:ns ## COG: CAC3343 COG4277 # Protein_GI_number: 15896586 # Func_class: R General function prediction only # Function: Predicted DNA-binding protein with the Helix-hairpin-helix motif # Organism: Clostridium acetobutylicum # 29 406 2 378 440 439 55.0 1e-123 MSDLVRYQPAFRLKSYICVCNQIGSMNENVLTKLKILAESAKYDVSCASSGTVRSNKPGM LGNTVGGWGICHSFAEDGRCISLLKVMLTNYCIYDCAYCINRRSNDLPRATLSVSELVEL TIEFYRRNYIEGLFLSSGVVRSPDYTMERLVRVAKDLRQVHRFNGYIHLKSIPGASRELV NEAGLYADRLSVNVEIPKEENLKLLAPEKDHKSVFAPMKYIQQGVLESKEERQKFRHAPR FAPAGQSTQMIVGATSESDKDILYLSSALYGRPTMKRVYYSGYVSVNTYDTRLPALKQPP LVRENRLYQADWLLRFYQFKVDEIVDDAYPDLDLEIDPKLSWALRHPEQFPVDINKADYE MLLRVPGIGVKSAKLIVASRRFSRLGFYELKKIGVVMKKAQYFITCRELPLQMQTVNELS PQRVRSLLLPKPKKKVDDRQLILDFGE >gi|226332231|gb|ACIC01000089.1| GENE 24 22376 - 23467 989 363 aa, chain - ## HITS:1 COG:VC2213 KEGG:ns NR:ns ## COG: VC2213 COG2885 # Protein_GI_number: 15642211 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and related 
peptidoglycan-associated (lipo)proteins # Organism: Vibrio cholerae # 231 345 174 290 321 66 39.0 7e-11 MRIRNLLVAILTVSSTVAFAQEQRQIKEEGKTVFKPHWFMQVQAGAAHTVGEADFTDLIS PAAAVNVGYKFAPAFGARLGVSGWQAKAGWVTPSQTYQYKYLQGNLDIMADLSTLFCGFN PKRVFNAYIFGGAGLNHAFDNDEANALDTRSHELEYLWQDKQNLIAGRMGLGCDLRLNDR LAINIEGNANALSDKFNSKKAGNCDWQFNVLVGLNIKLGKSYKKTAPVYYEPEPVVEQPK PQPVVKQPEPEPVAVVVEPMKQNIFFALNSALLQKDQQSKIDAMVAYMEKYPASKVAITG YADKETGNPRINMTLSEKRAKIVADALKAKGIAADRIVTDFKGDTVQPFRVPEENRVSVC IAE >gi|226332231|gb|ACIC01000089.1| GENE 25 23601 - 27224 3365 1207 aa, chain - ## HITS:1 COG:no KEGG:BT_4480 NR:ns ## KEGG: BT_4480 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1207 1 1206 1206 1855 88.0 0 MNKKFLNAVLFGALIASSAGTFTSCKDYDDDIKSLQEQIDKSGSTVGDLQTQLTTLKTAA EAAQAAADAAKTAAAEAKTAAEAAKTAGDQAAADAATAKALAEEAKAAASQAKADAIAEA AKQVEALRVSLQSAIDQKVDLTVYEAAVKALGAQIEGIETGLSNLTNGAVKENTEAIKTA QDAIETLITADKDLQTQLDALKLYAEGVKEVADANAEDIKAAQEAIEAAQDEIKKLWDEI NGEGGLKALIGENKSAIEVLQKSTAQDIANLEDKITDELDAIKNDIKDVQSDIKTINTQI ININADLASLHTLMTCRLTSITFAPDYIVDGVEAIKFNSLKYAAMAASENAVIPTAYKFS TAALATASYHFNPASFKLANADYSYIDRSATVINTRAAASQWVEISGTPVANAATGTVDF KLLRLNVHSTQPAEDKVNMIALQAALKGDAVDQNETNVVVTSPYVSIYDDVLAAPDVRIS DKATLISGADKAHYATTFDACKSEAVRYTTPYDKVFNLKELVATCFGNGNHKEFPIEDYK LSYKFSVATSDYNVTSGSTTTNQQKWIVCNDAKAGLYQAEGFNKEAIGRTPILKVELVDE AGNVVRRGFVKIEFGVTKQDDLVVGDKANDLVAKCAETAASYTISEQFIRENVYRVITNG KETSMSHEEFWNLYDATTAVAAVKKNNLSHSMSLPKIVDGDTSTGTATKKIVWSFTHTEL GKIGAGGSQFVATVTVKNKLTSSEYPASITFKFTVNVKLPELTLAATENDLYWTKEDGKY KYFNVNVNVPSSITSPAENCQFSRALPEAYSAYAVTGLPGCESARYKVIKTYSNGVATST VMNGVQIDGVNITLDKSNAAVKAALNSANGLQAVVAHIYTLSNGDEVTVNEFMVNFIRPV NLNMPSGISVTDAMTGGDVADFQWNGLLTDWRNEAIVKDSWSWVDHTRSYWKRVCAPEFE YVEGHEEMVTPAQLDVEYGTIKFTTTTTTTMYTGKATYRYYPLVGRSITKTYTIEEEMLT KAEINTALEAKKVEGWPFGYISAELDGEIEYTEVSKPASTTIEYDYVKSINYVPAVYKWV EGNWTMVPHKHTAMPAYEGTSYGQTSGCWEWTKKEWTRPEWNAGQYWFFYGPFSDVTVNV NEATTSLEYNGNKLPDGATLVQEGNTVKYVNVSSPVGYTYQIFIPATVNYGWGTTSSTLT ITVNPKN >gi|226332231|gb|ACIC01000089.1| 
GENE 26 27718 - 28656 636 312 aa, chain - ## HITS:1 COG:no KEGG:BT_4479 NR:ns ## KEGG: BT_4479 # Name: not_defined # Def: integrase protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 312 1 305 305 590 92.0 1e-167 MLKVVAFMKQVAEELRMGGNFGTAHVYRSSINAIIAYHGEGDFSFDEVTPEWLKGFEVHL RGRGCSWNTVSTYLRTFRAVYNRAVNSRMVIYTPHLFRSVYTGTRADHKRALCDEDMKKV FTKLPQSSAVPSAIRCAQDLFVLMFLLRGLPFVDLAYLRKSDLRDNVITYRRRKTGRPLS VTLTPEAMALLKKYMNHDPHSPYLFPLLKSGEGTKEAYREYQLALRSFNQQLTLLGEILG LNDRLSSYTARHTWATTAYYCEIHPGIISEAMGHSSIKVTETYLKPFRNKKIDEANNQII DFVKHSIVGTVA >gi|226332231|gb|ACIC01000089.1| GENE 27 29500 - 29676 323 58 aa, chain - ## HITS:1 COG:no KEGG:BT_2472 NR:ns ## KEGG: BT_2472 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 57 51 107 107 61 63.0 9e-09 MKELVEKVAALYADFSKDANAQIENGNKAAGTRARKASLEIEKAMKEFRKASLEASKN >gi|226332231|gb|ACIC01000089.1| GENE 28 29824 - 30471 578 215 aa, chain + ## HITS:1 COG:no KEGG:BT_4477 NR:ns ## KEGG: BT_4477 # Name: not_defined # Def: putative ATP-dependent DNA helicase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 215 1 215 215 407 99.0 1e-112 MKTLTDTDYIHTLIAEGEHQQQDFKFEISDARKIAKTLSAFANTDGGRLLIGVKDNGRIA GVRSEEEKYMIEAAAQLYCIPEIEYTLQTYIVEGKQVLVATIEENPHKPVYAKDENGKPL AYLRIKDENILATPIHLRVWQQSGSPRGELIRYTEREQLLLDLLEQGTLLSLNRYCRQTG ISRRAAEHLLAKFVRYDIVEPVFENHKFYFRMKNE >gi|226332231|gb|ACIC01000089.1| GENE 29 30506 - 31894 817 462 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 2 434 5 440 456 319 41 3e-86 MNFINEINDILWTYILIIMLLGCAVWFSIRTRFVQFRMLREMIVLLSESAGKGKQGEKHV SSFQAFAITIASRVGTGNLAGVATAIAIGGPGAVFWMWVIALLGASSAFIESTLAQLYKI RGKDSFVGGPAYYMKKGLKQPWMGMLFAVLISITFGFAFNSVQSNTICAAAEHAFGFNHM VLGGVLTALTLVIIFGGIRRIARVSSIIVPVMALGYVGLALVIVLLNITHLPEVISLIIS HAFGWEQALGGGVGMALMQGIKRGLFSNEAGMGSAPNAAATANVTHPVKQGLIQTLAVFT DTLLICTCTAFIILFSGAPLDGSSNGVQLTQQALTNEIGASGSIFVAVALFFFAFSSILG 
NYYYGEANIRFITAKKWVLHTYRILVSGMVLFGSVATLDLVWSLADITMALMAVCNLIAI IFLGKYAIRLLSDYRAQKKAGIQSPVFRKETMPDIEKDLECW >gi|226332231|gb|ACIC01000089.1| GENE 30 31983 - 32615 611 210 aa, chain + ## HITS:1 COG:no KEGG:BT_4475 NR:ns ## KEGG: BT_4475 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 210 1 210 210 402 99.0 1e-111 MGLFGLFKKKSDETKVGNVEDFISLTRVYFQSVIATNLGITNIRFLPDVANFKRLFKIPT QGGKLGLAEKSASRKMLMQDYGLNESFFKEIDNSVKRNCRTQNDIQSYLFMYQGFSNDLM MLMGNLMQWKFRMPSIFKKALYGMTQKTVHDVCTKTVWKADDVHKTATTVRQYKERLGYS EQWMTDYVYNIVLLAKKEPKRKNDDTETKK >gi|226332231|gb|ACIC01000089.1| GENE 31 32944 - 35103 1582 719 aa, chain + ## HITS:1 COG:MJ0634 KEGG:ns NR:ns ## COG: MJ0634 COG1509 # Protein_GI_number: 15668815 # Func_class: E Amino acid transport and metabolism # Function: Lysine 2,3-aminomutase # Organism: Methanococcus jannaschii # 188 686 202 619 620 134 24.0 5e-31 MLMKQKKMLVLTFSQLKQIYTQEMPGLVEMAAVSPTVEDFKAGLLRHLDSCGVVNEVAEE AREQIRLLLQYDGQDVHELSTGQDISVQTIRLLYQFLTEKLENIEMPTDLFVELFQLFKR LQGESVPLPSPQRIKSRNDRWDTGLDEEVREMRDENKERMLHLLIQKIENRKSKPSVRFH FEEGMSYEEKYQLVSKWWGDFRFHLSMAVKSPAELNRFLGNSLSSETMYLLNRARKKGMP FFATPYYLSLLNVTGYGYNDEAIRSYILYSPRLVETYGNIRAWEKEDIVEAGKPNAAGWL LPDGHNIHRRYPEVAILIPDTMGRACGGLCASCQRMYDFQSERLNFEFETLRPKESWDSK LRRLMTYFEQDTQLRDILITGGDALMSQNKTLKNILEAVYRMAVRKQRANLERPEGEKYA ELQRVRLGSRLLAYLPMRINDELVDILREFKEKASAVGVKQFIIQTHFQTPLEVTPEAKE AIRKILSAGWIITNQLVYTVAASRRGHTTRLRQVLNSLGVVCYYTFSVKGFNENYAVFAP NSRSMQEQQEEKIYGQMTPEQAEELYKILETKISTGINEENPKEDADTAKQIRRFMRKHH LPFLATDRSVLNLPAIGKSMTFQLVGLTEEGKRILRFEHDGTRHHSPIIDQMGQIYIVEN KSLAAYLRQLSKMGEDPEDYASIWSYTKGETEPRFSLYEYPDFPFRITDKMSNLEISNK >gi|226332231|gb|ACIC01000089.1| GENE 32 35183 - 36493 360 436 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 9 433 7 418 447 143 26 3e-33 MKTDLIYGVEDRPPFKDALFAALQHLLAIFVAIITPPLIIASALKLDVEKTSFLVSMSLF ASGVSTFIQCRRIGPIGAKLLCIQGTSFSFIGPIIATGLVGGLPLIFGSCIAAAPIEMVV 
SRTFKYLRNIITPLVSGIVVLLIGLSLIKVGIVSCGGGYSAMDDGSFGSWENLSIAALVL LSVLFFNRCKNKYLRMSSIVFGLCLGYGLAFALGKVNMSALNVEMLMSFNIPQPFKYGID FNISSFIAIGLVYLITAIEATGDVTANSMISGLPIEGDSYLKRVSGGVMADGFNSLLAGI FNSFPNSIFAQNNGIIQLTGVASRYVGYYIAAMLVLLGLFPIVGAIFSLMPDPVLGGATL LMFGTVAAAGIRIVSSQEIGRKETLVLAVSLSLGLGVELMPDVLKQAPEAIRSIFSSGIT TGGLTAIVANMVIRVK >gi|226332231|gb|ACIC01000089.1| GENE 33 36552 - 37259 705 235 aa, chain - ## HITS:1 COG:no KEGG:BT_4472 NR:ns ## KEGG: BT_4472 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 235 1 235 235 459 99.0 1e-128 MKKIISALFLCMLLSAVPELQAQNIQLHYDFGRSLYDKDLKGRPLLTSTVEKFHPDNWGS TYFFIDMDYTSEGVASAYWEIARELKFWKGPFSAHLEYNGGLAKGFSYNNAYLAGATYTY NNASFSRGFTLTTMYKYIQKHPSPNNFQLTGTWYMNFCKDLLTFSGFADWWREETKYGKT IFLSEPQFWVNLNRIKGVNKNFNLSVGSEVELSNNFGGRDGFYVIPTLALKWTLN >gi|226332231|gb|ACIC01000089.1| GENE 34 37332 - 38927 1303 531 aa, chain - ## HITS:1 COG:no KEGG:BT_4471 NR:ns ## KEGG: BT_4471 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 531 1 531 531 1112 99.0 0 MKKKHIYIYLLGMLLLSVGCTGKFREFNTDQSGVTDDDLLIDDNGYGIRLGIIQQGIYFN YDFGKGKNWPFQLIQNLNADMFSGYMHDGKPLNGGTHNSDYNMQDGWNSAMWTHMYSYIF PQIYQSENATRDTEPALFGITKVLKVEVMHRVTDYYGPVIYKNFANAQNQYRPDTQKEVY YEFFNELDSAVVSLADYIEKKPESNGFTRFDILLDGQYSSWIKFANSLRMRLAMRISAVD PDKARTEFRKGLENEFGVFEAEDERVAVSTKSGFTNPLGELNRVWNETYMSAAMESILTG YDDPRLKVYFETCTDKAYAGEYRGIRQGTCFAHNHYAGLSKLSVDQSTDAPLMSSAEIWF LRAEAALRGWIPEDEEECYQNGIKASLHQRGIYQMDDYLNSERTAADFIDTQDPENNIPA RCLVSPKWNPADDKEIKLEKIITQKWIALFPEGCEAWAEQRRTGYPRLFPVRFNHSRNGC IDTETMVRRLNFPGTLQTEDREQYLALVEALGGPDHGGTRLWWDTGNNSLD >gi|226332231|gb|ACIC01000089.1| GENE 35 38945 - 42085 2474 1046 aa, chain - ## HITS:1 COG:no KEGG:BT_4470 NR:ns ## KEGG: BT_4470 # Name: not_defined # Def: outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 1046 1 1035 1035 2069 98.0 0 MHGDGEKQLIEMLLKTRIIISCCFISLLSFFEEDCYASRPGSKECVKVSSASFTVRGRVI 
DTYGEPLIGATIREKGGANGTVTDVEGNFFLSVPDSAVLQISFVGYESQEVSVKGRSMLE ICLKENILLLDHVIVTALGLEKKESSLAYAVQKVRGEELNRVKEVNMITALAGKAAGVQV SKNSSGMGGSAKVSIRGVRSVAGDNQPLYVIDGVPMLNSTSEQSYSAIGGTANAGNRDGG DGISNLNPEDVESISILKGAPAAALYGSQAANGVILITTKKGNTVGRRDIHFSTGLTFEK AFSLPEMQNSYGVSDVTDSWGEKENLTVYDNLGDFFRTGLTSITSVSVNSGNDKLQTYFS YANTTGRGILNENKLSKHNINLRETAVLFNSRLKLDGNVNVMKQTVKNKPVSGGFYMNPL VGLYRFPRGEDLSYYKENYERYNEDRKLNTQNWHTFTEDFEQNPYWIQNRIQSKDARIRL LLSLSANLKVTDWLTLQARGNLDHSADKLRQKFYASTATALCGMNGRYIEMDYQETQMYG DVMAMVKKQLNDFTLDVAIGGSINDRITNSTRIDSKNASLKYANVFNLANIVMSSSASID QKIDAHYQLQSVFATAQIGYKEGVFLDLTARNDWSSTLAYTKYEKKGFFYPSAGVSFLPD KWVKMPEWVSFSKFRGAYSIVGNGVPPFITYPVSYVTAGGEILASDAAPFAEMKPEMTHS VELGMEWKFLQHRLGLNLTYYRTDTYEQFFKLPALAGDKYAYRYVNAGDIRNQGWEITLD ATPVLTPDFTWKTSLNFSANKNKIVELHEELKEFVYGPTSFSSSYAMKLVKGGSIGDIYG KAFIRDEAGNIVYGTDGDYKGFPLVEGDGNTVKVGNANPDFMLGWNHSFSYKGFSLYFLL DWRYGGEVLSQTQAEMDLYGVSKATAQARDKGYVSLEGRQISDVKGFYKNVVGGRAGVTE YYMYDATNLRLREVSLSYSFPRKWIQKTNCLKDLQLSFVARNLCFLYKKAPFDPDLVLST GNDNQGIEVFGMPTTRSLGFTLTCEF >gi|226332231|gb|ACIC01000089.1| GENE 36 42177 - 42761 386 194 aa, chain + ## HITS:1 COG:NMB0698 KEGG:ns NR:ns ## COG: NMB0698 COG3663 # Protein_GI_number: 15676596 # Func_class: L Replication, recombination and repair # Function: G:T/U mismatch-specific DNA glycosylase # Organism: Neisseria meningitidis MC58 # 3 188 33 220 229 195 50.0 3e-50 MEIEIEKHPLKPFLPPKAKLLMLGSFPPQRKRWSMDFYYPNLNNDMWRIVGLLFFGDKDH FLNDTRKAFCREQIIDFLNEKGIALFDTASSIRRLQDNASDKFLEVVQPTDIAALLRQLP ECRAIVTTGQKATDTLRAQLEVEEPKVGDYAEFVFEGRAMRLYRMPSSSRAYPLALDKKA AAYRIMYQDLQMVI >gi|226332231|gb|ACIC01000089.1| GENE 37 42816 - 43169 366 117 aa, chain + ## HITS:1 COG:FN0052 KEGG:ns NR:ns ## COG: FN0052 COG1393 # Protein_GI_number: 19703404 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Fusobacterium nucleatum # 4 116 5 117 120 125 61.0 2e-29 MTPLFLQYPACSTCQKAKKWLIENNIEYTNRLIVEDNPTVEELKAWIPRSGLPIKKFFNT 
SGLVYKELKLSEKLPSMTEEEQIALLATNGKLVKRPLVVTDSFVLVGFKPDEWEKLK >gi|226332231|gb|ACIC01000089.1| GENE 38 43565 - 45430 1821 621 aa, chain - ## HITS:1 COG:mll1222 KEGG:ns NR:ns ## COG: mll1222 COG2304 # Protein_GI_number: 13471293 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain # Organism: Mesorhizobium loti # 145 613 164 634 638 410 47.0 1e-114 MKTNQLRAMMLTVMLVITTLGGTVWAQPLIVKGTVTDGSDGSPLVGCAVTVKGTSRGAVT NLQGQYRIEANKGETLIFSYIGFDKVEKVVFSAKMDVSLKASSDVLEECVVVGYGSQRKV AVTGAISTVNLASLRHPSVRMDAVNTEEYKSISENGFKQVGESPLSTFSIDVDAASYSNM RRMINSGTLPVADAIRTEELVNYFSYDYAKPTGSDPVKITMEAGVCPWNADHRLVRIGLK AREIPTDKLPESNLVFLIDVSGSMWGPTRLDLVKSSLKLLVNNLREKDKVAIVVYAGNAS VKLESTPGSDKQKIRDAIDELTSGGSTAGGAGIQLAYKVAKHNFLPKGNNRIILCSDGDF NVGVSSVEGLEQLIEKERKSGVFLSVLGYGMGNYKDNKGQALAEKGNGNHAYIDNLQEAN RVLVGEFGATLHTVAKDVKLQVEFNPAQVQAYRLVGYESRLLKDEDFNNDAKDAGELGAG HTVTAFYEVIPVGVKNDYVGKIDDLKYQKKQKETVQPTGSNELLTVKLRYKAPDKDVSKK MEVPFVDNKGNNVSSDFRFASAVAMFGQLLRDSDFKGEASYDKVIGLAKQGLDNDEKGYR REFIRLVETAKGMKREIAENK >gi|226332231|gb|ACIC01000089.1| GENE 39 45621 - 46274 596 217 aa, chain + ## HITS:1 COG:VC1935 KEGG:ns NR:ns ## COG: VC1935 COG0558 # Protein_GI_number: 15641937 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphate synthase # Organism: Vibrio cholerae # 2 206 27 233 252 161 46.0 1e-39 MKHEVDGRREIASRNTAWANIIARKLTGWGITPNQISMMSVFFALAGCLLLTGTVIYPDF NKYVAYILFIVCMQSRLLCNLFDGMVAIEGGKKSANGDLYNDMPDRFADALFIIPVGYVA GGFGIELGWLAALLAVMTAYFRWIGAYKTQQHFFNGPMAKQHRMALLTLCFVIATCTIHW GYEQIVCLITLILMNIGLIATLIHRLYLMSHTNNEAK >gi|226332231|gb|ACIC01000089.1| GENE 40 46271 - 47227 872 318 aa, chain + ## HITS:1 COG:VC1936 KEGG:ns NR:ns ## COG: VC1936 COG4589 # Protein_GI_number: 15641938 # Func_class: R General function prediction only # Function: Predicted CDP-diglyceride synthetase/phosphatidate cytidylyltransferase # Organism: Vibrio cholerae # 19 311 11 303 310 261 47.0 1e-69 
MKQLLDKLFPTLSDELIIVITLIICLLVTASIALFLVKKLSPKTNISELVARTRSWWIMA AMFIGAVFISYDISYFFLAFLSFIAFRELYSVLGFREADRRALFWGILAIPIQYYLAYIA WYGAYIIFIPVVMFLLLPLRLVLKGDTHGITKSIALLQWILMLSVFGISHLAYLLSLPEL PGFNAGGRGLLMFLVFLTEINDIMQFVWGKLFGRHKILPKVSPNKTWEGFLGGVVSTTVI GYFLGFLTPLSTPNVILVSALIAIAGFAGDVVISAIKRDKGIKAMGDSIPGHGGVFDRID SLSYTAPVFFHLVYYIAY >gi|226332231|gb|ACIC01000089.1| GENE 41 47253 - 47906 406 217 aa, chain + ## HITS:1 COG:VC1937 KEGG:ns NR:ns ## COG: VC1937 COG0204 # Protein_GI_number: 15641939 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Vibrio cholerae # 5 205 1 204 223 129 35.0 4e-30 MQSAMMQILYKGIFRWFLKLIVGVQFPDTRFLRKEGQFIILANHNSHLDTLSLLSSLPGN LLWKIKPVAAEDYFGRTRFQAAFSNYFINTLLIRRKGEKDSEHDPLRKMLDAIDEGYSLI LFPEGTRGKPEQMGKIKSGIARILSLRPELKYIPVFMTGMGRSLPKGELILLPYKASIFY GTPAVAGSTDVHEILKQITDDFEMMKDKYQVVTEDDE >gi|226332231|gb|ACIC01000089.1| GENE 42 47912 - 48484 334 190 aa, chain + ## HITS:1 COG:mll4824 KEGG:ns NR:ns ## COG: mll4824 COG1595 # Protein_GI_number: 13474039 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 25 187 17 179 179 72 24.0 3e-13 MLFNVRNISKLSDEELLTHYIKSGDTEYFGELYNRYIPLTYGLCLKYLQDEDQAQEAVMQ MFEDLLPKIINYEIKAFKPWLYRVAKNHCLQLLRKENKEIPLDYTINIMESDEFLHLLSE EESSEEQLKALHHCLEKLPEEQRTSITRFFLEEMSYADIVEQTGFTLNNVKSYIQNGKRN LKICIKKQAL >gi|226332231|gb|ACIC01000089.1| GENE 43 48481 - 49830 1040 449 aa, chain + ## HITS:1 COG:no KEGG:BT_4460 NR:ns ## KEGG: BT_4460 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 449 1 449 449 799 100.0 0 MKLLDYIRGLRKGKEAHRLEKESMKDPFLADAMDGYHQVEGDHEEQINKLRMKVNAHSAK KRNTYAVTWSIAACLIIGIGISSYFLFLKQNVGEDVFIAKEQPAATATVPAHKEDTSLSA PKMEPDSGRPSATKETVGKDIIAKTRQDSPSQSTPSGAPSATVPKAVASKAATPQATPAG TPIMEEMVAPAEELEEAAITTVTDTSFLDANRKKMKAAQLTTQMNNMIKGKVTDDRGEPL VGANVTYKGATYGAITDINGEFSLPKKEGNEILTAHYIGYNPVSIPADTSKNMLIAMSEN 
KTTLDEVVVTGYGAQKKVAVTGAISAVSIKDLKKASGALQKSDTLKDAVISQKADSLQGP VVPEPVTGMKQYKKYLKKNLAYPADDACAEVKGKVTLTFFVNKEGRPFDIKVKESLCKSL DKEAIRLIQEGPDWTYGNQSAEITVKFHK >gi|226332231|gb|ACIC01000089.1| GENE 44 50078 - 51250 937 390 aa, chain - ## HITS:1 COG:BS_yxaH KEGG:ns NR:ns ## COG: BS_yxaH COG2311 # Protein_GI_number: 16081049 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 3 390 5 391 402 115 28.0 1e-25 MTQTLKTTERLGVVDALRGFALLAIVLLHNLEHYNLFLPLDYTLPAWLQTIDKYAWDTMF FLFAGKAYATFSLLFGFSFYIQFHNAEKRGTDFRGRFAWRMCLLFLFAQLHALFYNGDIL LLYAVVGFALIPVCKLKDKTVFWIAAILLLQPYEWGRAIYAMINLEYVPSTGHFIPYYKL AQEVTSNGSFFEVLRSNITDGQMYSNIWQVENGRLFQTAALFMFGMLLGRRKYFMKSEES LRFWKKMLTGAILAFIPLYCLKTFIPALITNPSIMVPYKIAVPSYANFAFMVILVSVFTL LWFKKDTGYSWQSLLIPYGRMSLTNYISQSIMGVTIYYGFGLSMYKYAGATGSLLIALLI FTIQLIFSRWWLARHKQGPLEFLWRKGTWI >gi|226332231|gb|ACIC01000089.1| GENE 45 51331 - 52203 838 290 aa, chain - ## HITS:1 COG:CAC1622 KEGG:ns NR:ns ## COG: CAC1622 COG2240 # Protein_GI_number: 15894900 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal/pyridoxine/pyridoxamine kinase # Organism: Clostridium acetobutylicum # 6 290 5 290 290 320 51.0 2e-87 MYANKVKKIAAVHDLSGMGRVSLTVVIPILSSMGFQVCPLPTAVLSNHTQYPGFSFLDLT DEMPKIIAEWKKLEVQFDAIYTGYLGSPRQIQIVSDFIKDFRQPDSLIVADPVLGDNGRL YTNFDMEMVKEMRHLITKADVITPNLTELFYLLDEPYKADSTDEELKEYLRLLSDKGPQV VIITSVPVHDEPHKTSVYAYNRQGNRYWKVTCPYLPAHYPGTGDTFTSVITGSLMQGDSL PMALDRATQFILQGIRATFGYEYDNREGILLEKVLHNLDMPIQMASYELI >gi|226332231|gb|ACIC01000089.1| GENE 46 52407 - 52988 548 193 aa, chain - ## HITS:1 COG:no KEGG:BT_4457 NR:ns ## KEGG: BT_4457 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 193 1 193 193 377 100.0 1e-103 MSSREEILASIRKHTQTRYEKPDIADMKRLTYPDKIEQFCAISRAVGGTVVVLGEGEDVN AVIRRTYPDAMRIASVLPDISCATFNPDMVDDPKELDGTEVAVIRGEIGVAENGAVWIPQ TVKYKAIYFISEKLVILIDRNKIVDTMYDAYRKLDGQEYQFGTFISGPSKTADIEQALVM GAHGAREVLVILT >gi|226332231|gb|ACIC01000089.1| GENE 47 52985 - 
54370 1143 461 aa, chain - ## HITS:1 COG:ykgF KEGG:ns NR:ns ## COG: ykgF COG1139 # Protein_GI_number: 16128292 # Func_class: C Energy production and conversion # Function: Uncharacterized conserved protein containing a ferredoxin-like domain # Organism: Escherichia coli K12 # 29 458 34 473 475 312 38.0 7e-85 MSTKHSKAAEKFLQDSKMAAWHNETLWMVRAKRDKMSKEVPEWEELRNKACELKLYSNSH LEELLQEFEKNATANGAIVHWAKDADEYCAIVYEILNEHNVHHFIKSKSMLAEECGLNPF LMERGIDVVESDLGERILQLMHLEPSHIVLPAIHIKREQVGELFEKEMGTEKGNFDPTYL THAARKNLRHLFLNAEAAMTGANFAVASTGDIVVCTNEGNADMGTSYPKLNIAAFGMEKI VPDLDALGVFTRLLARSATGQPVTTYTSHYRRPREGGEYHIIIVDNGRSTILSKPDHIKT LNCIRCGACMNTCPVYRRSGGYSYTYFIPGPIGINLGMAHDPEKYYDNLSACSLCMSCSD VCPVKVDLAEQIYKWRQDLDGLGKANTGKKIMSGGMKFLMERPALFNAALWAAPMVNGLP RFMKYNDFDDWGKGRELPEFAKESFNEMWKKNEVQGKEESK >gi|226332231|gb|ACIC01000089.1| GENE 48 54367 - 55107 620 246 aa, chain - ## HITS:1 COG:BH1832 KEGG:ns NR:ns ## COG: BH1832 COG0247 # Protein_GI_number: 15614395 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Bacillus halodurans # 1 245 1 238 244 166 37.0 4e-41 MKVGLFIPCYINAIYPNVGVASYKLLKSLGVDVDYPLDQTCCGQPMANAGFEDESMKLAL RFDDLFREYDYIVGPSASCVAFVKENHPGILEKEGHVCQSAGKIYDLCEFIHDVLKPTKI PARFPHKVSIHNSCHGVRELLISAPSELNIPYYNKLRDLLDMVEGIEVFEPSHIDECCGF GGMFAVEEQAVSVCMGRDKVKDHMATGAEYIVGADSSCLMHMQGVIKREHLPIQIIHIVE ILASQS >gi|226332231|gb|ACIC01000089.1| GENE 49 55164 - 55712 220 182 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157803532|ref|YP_001492081.1| 50S ribosomal protein L35 [Rickettsia canadensis str. 
McKiel] # 4 181 19 224 225 89 33 4e-17 MMRKINEIFYSLQGEGYHTGTPAVFIRFSGCNLKCSFCDTQHEAGTLMTDDEIIAEVSKY PAVTVILTGGEPSLWIDDALIDRLHEAGKYVCIETNGTRPLPESIDWVTCSPKQGVKLGI TRMDEVKVVYEGQDISIYELLPAEHFFLQPCSCNNTALTVDCVMRHPKWRLSLQTHKLID IR >gi|226332231|gb|ACIC01000089.1| GENE 50 55739 - 56113 334 124 aa, chain - ## HITS:1 COG:mll5797 KEGG:ns NR:ns ## COG: mll5797 COG0720 # Protein_GI_number: 13474825 # Func_class: H Coenzyme transport and metabolism # Function: 6-pyruvoyl-tetrahydropterin synthase # Organism: Mesorhizobium loti # 1 100 2 110 119 65 36.0 2e-11 MFTVIKRMEISASHKLVLPYRSKCASLHGHNWIITVYCRSSRLNSEGMVVDFTRIKEVVT EKLDHQNLNEVLPFNPTAENIARWVCRQIPQCYKVEVQESEGNIVIYEKDAVANEKTPAA GETE >gi|226332231|gb|ACIC01000089.1| GENE 51 56242 - 56601 281 119 aa, chain + ## HITS:1 COG:NMB0571 KEGG:ns NR:ns ## COG: NMB0571 COG2832 # Protein_GI_number: 15676476 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Neisseria meningitidis MC58 # 1 96 2 96 119 90 51.0 6e-19 MKTLYIVLGSISLGLGILGIFLPLLPTTPFLLLTAALYFKGSPRLYNWLLNHRHFGPYIR NFRENKAIPLRAKIISLVLMWGTMLYCIFFLIPFLWVKILLGLIAAGVTYHILSFKTLK >gi|226332231|gb|ACIC01000089.1| GENE 52 56571 - 57137 555 188 aa, chain - ## HITS:1 COG:no KEGG:BT_4451 NR:ns ## KEGG: BT_4451 # Name: not_defined # Def: putative MTA/SAH nucleosidase # Organism: B.thetaiotaomicron # Pathway: Cysteine and methionine metabolism [PATH:bth00270]; Metabolic pathways [PATH:bth01100] # 1 188 1 188 188 395 100.0 1e-109 MLKILVTYAVQGEFVEIKWPDVEPYYIRTGIGKVKSAFHLAEAIRQVQPDLVLNLGSAGT VNHQVGDIFVCRKFIDRDMQKLAGLGMECEIDSSALLETKGFCKHWTEEGICNTGDGFLT ELTHVSGDVVDMEAYAQAFVCRSKEIPFISVKYVTDIIGQNSVKHWEDKLADARQGLSNY FNVLKERI >gi|226332231|gb|ACIC01000089.1| GENE 53 57287 - 58522 1058 411 aa, chain + ## HITS:1 COG:RSc3292 KEGG:ns NR:ns ## COG: RSc3292 COG3274 # Protein_GI_number: 17548009 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Ralstonia solanacearum # 11 409 1 336 336 72 24.0 1e-12 
MTQNSIADNRMVWLDVVRCVAMLMVIGVHCIDPFYISPTMRVIPEYTHWAAIYGSLLRPS VPLFVMMTGLLLLPVKQQPLGTFYKKRIFRVLFPFLIWSVLYSMFPWFTGLLGLPKEIIG DFFCYTQGHESQSLIDSLKDVAMIPFNFSHKENHMWYIYLLIGLYLYMPFFSAWVERASN KTKQIFLFIWIVSLFIPYIREYVANMLFDRSGYVFGTDTWNEFSMFYYFAGFNGYLLLGH YLKEGGNGNALKIFWPFSGSSKDDRLTNDWSVWKTFLICAVMFAIGYYVTYTGFSTTAAN PNATETEMELYFTFCSPNVLLMTLAMFLMLQKVVINTPVIVKALANMTQCGFGIYMVHYF VVGPFFLLIGPSEIPIPLQVPLMAVCIFLCSWAFTALLYKLMPHKAKYIMG >gi|226332231|gb|ACIC01000089.1| GENE 54 58544 - 59611 835 355 aa, chain + ## HITS:1 COG:lin2932 KEGG:ns NR:ns ## COG: lin2932 COG0673 # Protein_GI_number: 16801991 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Listeria innocua # 2 267 4 269 333 77 25.0 6e-14 MKIKVGLIGFGRMGRFYLKEMQKSGRWDVAYICDINPDCRELAKQLSPESKILENEQEMF DDESVQVVGLFALADSRLDRIEKAIKYGKHIISEKPVSDTIDKEWKVVDITEKASCFSTV NLYLRNAWYHQTIKQFIDQGEIGELAILRICHMTPGLAPGEGHEYEGPAFHDCGMHYVDI ARWYAGCEYKTWHAQGVNMWSYKDPWWVQCHGTFQNGVVFDITQGFVYGQLSKDQTHNSY VDIIGTKGIARMTHDFKTAVVDLRGVTQTHHIEKPFGGKNIDVLCDLFADSIETGIRNPQ LPLIRDSAIASEYAWTFLHDAYTHDLPAIGELETLEQIWERRRNMKNGYGLLQKP Prediction of potential genes in microbial genomes Time: Thu May 12 01:33:41 2011 Seq name: gi|226332230|gb|ACIC01000090.1| Bacteroides sp. 1_1_6 cont1.90, whole genome shotgun sequence Length of sequence - 5619 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 4, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 127 68 ## BT_4448 putative dehydrogenase 2 1 Op 2 . + CDS 165 - 1037 800 ## BT_2157 hypothetical protein 3 1 Op 3 . + CDS 1049 - 1924 816 ## COG1082 Sugar phosphate isomerases/epimerases + Term 1943 - 1996 9.3 4 2 Tu 1 . - CDS 1998 - 3713 1141 ## BT_4445 hypothetical protein - Prom 3754 - 3813 6.1 - Term 3823 - 3884 7.7 5 3 Op 1 . - CDS 3922 - 4320 170 ## BT_4444 hypothetical protein 6 3 Op 2 . - CDS 4317 - 4907 183 ## BT_4443 N-acetylmuramoyl alanine amidase 7 3 Op 3 . 
- CDS 4927 - 5277 335 ## BT_4442 hypothetical protein - Prom 5300 - 5359 3.3 8 4 Tu 1 . - CDS 5366 - 5596 246 ## gi|253569781|ref|ZP_04847190.1| predicted protein Predicted protein(s) >gi|226332230|gb|ACIC01000090.1| GENE 1 2 - 127 68 41 aa, chain + ## HITS:1 COG:no KEGG:BT_4448 NR:ns ## KEGG: BT_4448 # Name: not_defined # Def: putative dehydrogenase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 41 451 491 491 93 100.0 2e-18 IHDGHPSFNKTWTDPINAKQFAAELVKHNYREGWNLPDMPR >gi|226332230|gb|ACIC01000090.1| GENE 2 165 - 1037 800 290 aa, chain + ## HITS:1 COG:no KEGG:BT_2157 NR:ns ## KEGG: BT_2157 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 290 1 290 290 582 98.0 1e-165 MKKVFYPLACCLAAGVLVSCSGQKKAGSAQEEQSANEVALSYSKSLKAAEMDSLQLPVDA DGYITIFDGKTFNGWRGYGKDRVPSKWTIEDGCIKFNGSGSGEAQNGDGGDLIFAHKFKN FELEMEWKVSKGGNSGIFYLAQEVTSKDKDGNDVLEPIYISAPEYQVLDNDNHPDAKLGK DNNRQSASLYDMIPAVPQNAKPFGEWNKAKIMVYKGTVVHGQNDENVLEYHLWTKQWTDL LQASKFSQDKWPLAFELLNNCGGENHEGFIGMQDHGDDVWFRNIRVKVLD >gi|226332230|gb|ACIC01000090.1| GENE 3 1049 - 1924 816 291 aa, chain + ## HITS:1 COG:lin2265 KEGG:ns NR:ns ## COG: lin2265 COG1082 # Protein_GI_number: 16801329 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Listeria innocua # 25 279 2 239 246 106 32.0 5e-23 MKTNYLLVLFALLLAIPQAADAKKKKDIAIQLYSVRDIINNGTDLNVVLKNLAQMGYTSI EAANYDNGKFYGKTPEEFKNAVEKAGMKVLSSHCSRGLSDKEVASGDFSESLKWWDQSIA AHKAAGMKYIVNPGIGVPKTMKEMKMYCDYFNEIGKRCQQNGMKFGYHNHAHEFQKVENQ AVMLDYMIENTNPEYVFFQMDVYWIVRGQHSPVDYFNKYPGRFTVLHIKDDKEIGDSGMV GFDAIFRNAKVAGVKHIVAEIEGYSCPVEESVKKSLDYLLDAPFVKASYSK >gi|226332230|gb|ACIC01000090.1| GENE 4 1998 - 3713 1141 571 aa, chain - ## HITS:1 COG:no KEGG:BT_4445 NR:ns ## KEGG: BT_4445 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 571 1 571 571 1132 100.0 0 MAKITIPYAVSDFVEMRERGFYYVDKTRYISGLEDYKAPVFLRPRRFGKSLLISMLACYY 
DRTKAHRFEELFGNTWIGSHPTSEHNRYMIIRYDFSAMVMADNIQGLTQNFNDLNCGPVE VMVEHNRDLFGDFQFVNRGDAAKMLEEALTYIRSHELPKVYILIDEYDNFTNQLLTAYND PLYEEVTTADSFLRTFFKVIKKGIGEGSIRTCFCTGVLPVTMDDLTSGYNIAEILTLEPN FLNMLGFTYEETETYLRYVLDKYAAGQDRYDEIWQLIVSNYDGYRFRPNGERLFNSTILT YFFKKFAANAGSIPDELVDENLRTDINWIRRLTLSLDNAKKMLDALVIDDELPYNVADLS SKFNKRKFFNKEFYPVSLFYLGMTTLKDNYVTTLPNMTMRSVYMDYYNQLNQIEGNAQRY VPVYRKYDADRRLEPLVQNYFEQYLGQFLAQVFDKINENFIRCSFYELVSRYLSSCYTFA IEQNNSAGRSDFEMAGIPGTDYYTDDRVVEFKYYRGKDAERMLSLSEPLPEHVEQVKGYA EDTKRKFPNYNVRTYVVYICSNKGWKCWEVY >gi|226332230|gb|ACIC01000090.1| GENE 5 3922 - 4320 170 132 aa, chain - ## HITS:1 COG:no KEGG:BT_4444 NR:ns ## KEGG: BT_4444 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 132 1 132 132 204 100.0 7e-52 MRHLVCILMLSFVLGGCFCSCRTQYIPIESVKTEYNVRDSIRYDSIYQHDSVYLTVKGDT VYQYKYKYLYKYQYVNKTDTLIKTDSIQVPYPVEKQLAKWQQFKLDLGGIAMLIIIVIVF ILLGRTVHRLKI >gi|226332230|gb|ACIC01000090.1| GENE 6 4317 - 4907 183 196 aa, chain - ## HITS:1 COG:no KEGG:BT_4443 NR:ns ## KEGG: BT_4443 # Name: not_defined # Def: N-acetylmuramoyl alanine amidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 196 23 218 218 412 100.0 1e-114 MKILIDNGHGENTPGKCSPDGRLKEWAYTREIADRVVAGLRHRGEEAERIVKEDVDIPLS IRCRRVNKIYQESGGNAILISIHCNAAALGIDWLSAHGWSVFVSNNASANSKCLATSLAE SAIMQSVFVRQPMPGQLFWTQNLAICRDTICPSVLTENFFQDNKEDVEFLLSPEGKQQVI QIHIDGILNYLKTIEL >gi|226332230|gb|ACIC01000090.1| GENE 7 4927 - 5277 335 116 aa, chain - ## HITS:1 COG:no KEGG:BT_4442 NR:ns ## KEGG: BT_4442 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 116 1 116 116 206 100.0 2e-52 MGLNEWLALIGALGGLEAIKWVINFYVNRKTNARKEGAAADSMENENERKQIAWLEERIA QRDAKIDTIYVELRQEQAAHLDEIYKRHGIELKQKEAEMRRCDIRKCDRRQPPSGY >gi|226332230|gb|ACIC01000090.1| GENE 8 5366 - 5596 246 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253569781|ref|ZP_04847190.1| ## NR: gi|253569781|ref|ZP_04847190.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 76 1 76 76 119 100.0 7e-26 MKKVKIMIGDNTPSFHLIASEGKVLQRISDKQIFGYEIYLGYTYYIGSQKLPEPLLELPE HYCEIDAPEEYKDMLY Prediction of potential genes in microbial genomes Time: Thu May 12 01:34:10 2011 Seq name: gi|226332229|gb|ACIC01000091.1| Bacteroides sp. 1_1_6 cont1.91, whole genome shotgun sequence Length of sequence - 19896 bp Number of predicted genes - 9, with homology - 8 Number of transcription units - 4, operones - 1 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 5977 3614 ## BT_4440 putative cell surface protein 2 1 Op 2 . - CDS 5979 - 6251 382 ## BT_4439 hypothetical protein 3 1 Op 3 . - CDS 6278 - 9814 1782 ## BT_4438 hypothetical protein 4 1 Op 4 . - CDS 9817 - 13023 1577 ## gi|253569785|ref|ZP_04847194.1| conserved hypothetical protein 5 1 Op 5 . - CDS 13020 - 13442 314 ## gi|253569786|ref|ZP_04847195.1| conserved hypothetical protein 6 1 Op 6 . - CDS 13447 - 17622 2694 ## BDI_0893 putative viral A-type inclusion protein - Prom 17667 - 17726 9.1 + Prom 17720 - 17779 4.6 7 2 Tu 1 . + CDS 17915 - 18640 499 ## gi|253569788|ref|ZP_04847197.1| predicted protein + Term 18655 - 18696 -0.8 - Term 18643 - 18683 7.4 8 3 Tu 1 . - CDS 18774 - 19154 210 ## BT_0846 hypothetical protein - Prom 19233 - 19292 5.2 + Prom 19639 - 19698 1.8 9 4 Tu 1 . 
+ CDS 19743 - 19896 65 ## Predicted protein(s) >gi|226332229|gb|ACIC01000091.1| GENE 1 1 - 5977 3614 1992 aa, chain - ## HITS:1 COG:no KEGG:BT_4440 NR:ns ## KEGG: BT_4440 # Name: not_defined # Def: putative cell surface protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1992 1 1992 2183 3932 100.0 0 MELTQEQINDIVTAVFSLMESQGMLPESQLDVLAQNVKRRLQLDSVGVDEVPLVDSIEGI NSLPCVRQSGSVFDVVRTPLELLKGKLAALPVFRISSGYWQVSEDNGNTWKDITDSLGNR VSAQGEKGEQGNQGDPGQTGEKGDPGDKGDNGENVYLQSNGTELQWKMGEDGEWQTLILL SEIKGAASMGELTNVSQEADIAEDGSVLMYRDNEWKPAAECFIPTGTAEDGSIITTFSDL MNYISTHGEGGGGTGVQRNIRITNNLESKNISASKGEPCLLDFTFISQERYSSNEPYEDT GERGMCQISVKNNINSEYVIVKQLYINSSVPFKTDIAEFLTSGANNVMIKVTGEVTEVTT PAFVYAVQLTSLSISAVNFKWWTAFNNDITVPFNISGNVSKSLYVNISGTDYSEGYEVAL GTGVYVETAYNYSIPHPQKSGIFKISAYVSNADGSIRTKTLSFNVICAVAGEQAKLIAIN NVAERITNWSENILFDYTMYDGDNVSTSALFEITKGNEVVFSSTEESISTSTKYSFSLPL EIDTIDNQDFSIIANVKDSGVLLTDPLEFPVNNSLGYSAVAGAVMYINPKTRTNRQGNRE CMINEVDGTQIEAIWNGINWGNDGWTNDNLGYKTLRLLAGSSVDINYSPFGKESARTGKT FEIDYQISNVTDFSKPVITVSTPAGDLFVGLNIYPDNVIMYSQSLKDKDVQSLHTFEEKR TRLTLTIMPDAYGNSGFNLCILYINGIKNREFTYENNDYFAHNGTIVIGSDNADVDVYGI REYDSALTSQGVQTNYVNWLSTAEEKNSFKTENDILDTNGSEIDFDNTVDQYNVIVFDNT IPSMADQTQRIGTLDVYFYDHPEWNVSISDVTAKGQGTSSMKYWIWNTRYQLDKNLSVIT HADGSTSKKVWQMVPWIPAGQKFTAKKNFASSMQSHKIGAVNSYTDLYKQVGLSNEAMQR EGYSDVRVSVYELPFFCFEKSINDDGEPVYVFKGLYTFGPDKGDKYTFGYDTDYFPDLLS IEGSDNSPLLTLFRVPWNTDSGRVVYDEDKEAWQYNGANSFGFGAGDIANIVNWIPTYNH VYQCSPRLLPFDGTPDELNDDLDIYRTQPYEFWIAKVGDSHRFDVYYYEASVGLFIPSDI GEGPINLVSQLVDKDYGLASADIENKTNDSLNTLFINARVAKFRKEAALYWDIDDCLYFM NNVEFNAGTDERAKNTYPYSFGIETSKWRWRVDDADTRFDTTNRGLPDKEYSVETHDLDE TGAAVWNGETNNFFNLMELAFPEEKIISMRKSMAAMQSLGGLKSGNDLEKIYAFYKKYFF DQAQEYFPSDGYNADAKYCYENGKLAYNAGIYSNDTDPITQSLGDHYLAEQRWITKRILY MMSKYNFGLFSASGTDTITVRAAGNTIKYQLTPAMDMYPAIANGTSIIQGKRTKAGEVCE MEIELSGSGDQQNAIQGASYLQDIGDWYNKNVTGSMIIQGRMLRDIRLGSKNNPVIISIS SLTLSNCVSLQRLLLSNITTLSGTLNLTACEHLQEIYADGTSLAQIVLPSGGGLRVVEYS RFNQYLSLSNYPLLTDDGIGIDLCKTVISDFFIVDCALVRPMKILVDIMNAQMEQGDNHA 
LKRIRAVGFEESYNNSFILDKLVDLTNGTYSGLSSEGLSGEDELPVLDGTLNINASVYGD TVEALRAMFTRLTLNINGEFYVRFADDIVTMLCAENWGDGIGTSKRQMGNITELGIIFAG TGIKSFKELSLSKIESLTNEFTGCSQLSSIAFPETLRILNTTAFAGSMVALDLSDCTHIT DLLIDSDDTIDFMPSANDNLVNITYNNGSSCIHMVGYSNAILTINNESEIVDFWVENCNT NNNILSKIQSIYSVKEHLLKYIRAVGFDEEFYTNEILKTLLSLAENGYRGINESGERDDN IIPVLSGKVLCTDKYSPDMLYDLKSYFPNILFNMTGEACVDFKDQVVKDICVRNWGSDGE LTVEQAAGITTI >gi|226332229|gb|ACIC01000091.1| GENE 2 5979 - 6251 382 90 aa, chain - ## HITS:1 COG:no KEGG:BT_4439 NR:ns ## KEGG: BT_4439 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 90 1 90 90 150 100.0 1e-35 MKVDFTKAKVEVTFGEYKEMDIAKAVGNAIHQNTSDIGVDETARNIYHSEGELEIPDEQI SAIIYALSNASTILVSAKKAALNLLKLTQE >gi|226332229|gb|ACIC01000091.1| GENE 3 6278 - 9814 1782 1178 aa, chain - ## HITS:1 COG:no KEGG:BT_4438 NR:ns ## KEGG: BT_4438 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 766 1178 37 449 449 801 99.0 0 MVKINGSSFPHQYRRTNPYPIDSTETWASIKEATEYARNTDAEPYMPYAGQVISIEGQED IYVLVEDKAISKSDGREHFKLHKISTQETAEGIYLSKLVEDTAQKLIHFLEGINVKGEAK METLKVLGAAIFSDTATFEKSLSSSEFISGNPNGQGWAIYLKEFVNSGGIKELRTILEID ELDIRKRMRVYEMIVSQLLGENGTRLTTDMIEVLRIDKTNMRIYIDTQNGKLYNPFRTGD YIMVQQFDEGNFSVNKEYEFVVSGISADYITYSNFVGSENDVREGDTLVRVDSMTDVDRK GLIKHTSVEANGPYIDIVYGMKTDPENATRTRIGNLSGITTPYWRQLKDYGIFCDNGYFR GDFMLRSGEDIFTKFQVTDANIRAEINSVRYDVSEKDNYLKNASFVYDMTYWEHSTAVNV YSASGEMVNVNANLYSEKSNVSDIVLYNNQYMLRVKSSGVKQLNKDLNHKPDKVEKFDIS LRYLCKTSGTIKVGFTSSALYVEESISANSDFTTKSWSGDWDGTGDFILEFSGDIYIKTL TLTTDSLESFKTEMNTVISTIDGRIDAYVEKTDNLEQTTTQMGISINGLTEDLKLYYKKS DAQNYIDSEIGVAIDGVNSTLESYYRSSEVDSMFLNIGNRIDGIDESLTLYASKTDTLER NYTDVGTRLSAAEGTLTNYAEFKSVTENSLTQVNQQLNALEGSLETTVRDMVAGEVSGAA QNLAAQLYASSEDILSRIKGYSQDGRITPAEKTDIISIYESIKSEMADTAVDAGTPSILE ANSPVISRVLAFLDAGINTGETIKTNLRTAINALKTKYSNLKMGLFSTTYTLEGSSAAYP LLSSDSLQRISLLDADKQIYPLFKEYVTARNELANLGVEASREWIVKNQTTIDQTSTSIG LLVKKTEGGEVVNAASIIAAINESGDGEIKLVADKITLEGFATDNKGFSIENGYMTAKGG 
KIGGFEIDGYGLSNYSNADAFISIETQKTRTSGNQTLTAIRKAILGNGIPSIAGFETASM FQASGDDENIAVKINAYGSSTYDYKYGGKANYALVASGGCQWKLSEAGDTWCMPGVLGCV EIYTTNSGINLNRKWGNGISINRIALNAEREYEFYHNLGHTDYYFLAMPSVNWVDGYHEA IPSYTSIENNVFRILFRSGNNDKAKPAYFMVIIFGRPA >gi|226332229|gb|ACIC01000091.1| GENE 4 9817 - 13023 1577 1068 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253569785|ref|ZP_04847194.1| ## NR: gi|253569785|ref|ZP_04847194.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 1068 1 1068 1068 2146 100.0 0 MIDIKDISGNIRFSTPINQGSKRKFLLMKEDYITLKFSLDKLIPFSLGDNVDHEIGMFEL VDLYKPDYNIETGAYEYNLRLDAYYWKWKNKMFFFTPENGGREASWNLTDSLKVHMQVFL KNLEVLGYKYQGEAFECTIDDSVNTSSKLISYDSTNLIDALSQMAEAFECEWWVVKNEIH FGRCENGDPVDFELGMNVGKMDRSDSQSTYATRIYAFGSTRNIPLTYRKKLVFDVKDING RDLSDTARSLVTDFFPSDIVTEEKHVIDPIITGTLNSNHREFSNNNDLVDTLVGSIYKII DKGNGISFGTGTYYMPMPGQPSFPRDYLPAGEYVFRVSFIYLQDGIEKEILIGSNTVTLN ENQQHEINLSFPIAKEFAPGLGASLFKLRTYLLVPQFDNTLIGMGIGLFARVSLDMELVA GKSAATSVTFLSGVNSGKTFDAIYNPDFHIGETANILRLPEGVTASIGDKYTIDNIVKSK VPVSYFSDDKDTQIAEGIATRHLLMPEGVTYIDAYPDMKAEEAVEQIVVFDDIYPKREGV TDMVTTHTYTDTIDNPDGTKTSKDWLAWRFKDDDPGFHFSKDYKLDSEELRIEFLSGPLA GMDFEVLFNPYDKVRDDAPKPEHLKDGTWNRDAQVYEIKRNDNYGRMLPDDILHPTDQGG DTYVLYGYDPQFVSDVMIPNAEVEVEERAKEYIEKLKQDPSTYTNTMMSDYIYGINPETG EQDEGFAKSFTVGQKVKLINKAYFEEARISRIIGFEYSLDIPYDSPVYTVGETAPYSRLG EIESKIDSLTYRKEKDRLSIIAGGGTSSGIEGTNAVFPRNIEVTVDRVGYFKAGDIILGG TTVVDAFIKLISQKSTGELRSEISTNKDVEYGSPKGFITYTAVRNSQGPMEQAYYDGNPN NKLNFSEEVGGVQTATRQLDGFYTQKETYIANVIYAASEDDVLPRKVLSDSINVNVHRKW FAGVCSSIPKTSDDVRALGSNDLYSGPGTYQFPVSNWKIVVVCVPSDSVCDISIASSYGN FIENGKVCSGPKAISVAGANGKDEINYKMWVIQTPGLNDPDSFTFKTV >gi|226332229|gb|ACIC01000091.1| GENE 5 13020 - 13442 314 140 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253569786|ref|ZP_04847195.1| ## NR: gi|253569786|ref|ZP_04847195.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 140 1 140 140 268 100.0 9e-71 MKDELRINGKDAYTTWGISMDNNALSELMTPSSNKTFIENESRLEHGKRVVIANPRVDVR NLTLQINLTASSEEQFFERYNSFCEELATGALEIETKYQPKVVYKTIYQSCSQFSQFMRG IGKFSLKLYEPNPNDRTKTV >gi|226332229|gb|ACIC01000091.1| GENE 6 13447 - 17622 2694 1391 aa, chain - ## HITS:1 COG:no KEGG:BDI_0893 NR:ns ## KEGG: BDI_0893 # Name: not_defined # Def: putative viral A-type inclusion protein # Organism: P.distasonis # Pathway: not_defined # 32 408 171 526 1388 155 28.0 1e-35 MGNLTFNIGVNSDDFIMKAEQMRFSIHALALESKKANNANSDFANILKKTLDIIGGTETL KDFASDVVRVRGEFEQLEISFATLLKNREKANVLMSQIIKTAAETPFNLKELAIGAKLLI EYGAEFEQVNDTLIRLGNIASGMGLPLEQLIELYGNVMEQGKLYEQNIDQFTIFGIPMLQ GLADMLGVTTAKITELVSVGKIGFPEVQKVIGNLTNEGGQFYGLMDEQSRTISGKIENIR DAWESMLNEVGQSQEGIINESLNGTQYILDNYEQLGKAVSELVIVYGSYKAILITITALQ RLNLIVMEQANLEMLLAAKAGMVVSQSQALAAAKTKLFTGAIKANTIALLQNPYTFIAAA VVGLGYAIYKVTTAETELEKAHSRLNEANMNVEKSLTSEVSKLSSLERKLIEVKKGSEEY NKIKKTIVDNYGQYYIGLDEEIDRVGNLSTSYAQLVENMRLSIGQRKFEFFFNTEQENLD KTISEKLDVAYDTLIKKFGQTRGGNLYNQFFDSAMSGKALSPNVIKDLKSATFWDMKWGD NAEDGLVDFRNNVFNLRSEISKETKATETVLEEYKSKFRITDEEVAGILFDNAEKTDNGE IKGLKTLKKLVSEIEKVKENINNQRKKAQDGLADDSELQKANETLNKLKELYKLRTGADY DKQASFSGDNKQKQEDLHNRQDLEGSNLIEDLENKSAQAEIDAMEEGEEKKLAQMELNHQ KEIAELKRLKTDYLQKKIDLEKTIFEANSKNEGKTFDESTVSLNENEKSAFANILKQTVA RQGNDISVYYKEILSKYQGYTEKRLAVQQKFEQEREALRKAGASKETLAEHDYQKEGALE AVDSEFAMREDSFQSWTNSIANMSLEQLQKLLFQAEQELQRSEFLSPNDPKLAGQRAKVS SLKNEISEKTNETQTAPEKRGIKEWQELYKTLGNVESGFNEIGDAVGGTAGEIISAAGGI ASSTLQMIDGIVMLANGSSTAMSGTAQAASTSIQTVEKASVILAIIGAALQVATKIADMF GADYSDYDTAKENYESYVKVLDVIIGKQKELLETLTGKAAVEASQKAIELIEKQADAART LGKERLNAGASAGSHSIGVRMRKGMNNEGWAEWNNFANSIGMNPDDIGNRMTGLFDLTAE QLATLQRDAPTFWAKLDGDVREYLEQIIACNEETEEMRDLLNESLTQVSFDDVFSSFLDA LSDMDSSSEEFANNFEKYMQKAILNSMLIDNYKSRINEWYKAFAKANESGGIDEKEYADL QESWNGIVSDALEERNALMQQFGWSSGEESSSQSSTQKGFAAMSQDTGDELNGRFTALQI SNEEIKNSMLFVLGSLSSLCTTASDSNLLLTDMRNLAVMSNGHLEDIAKYTKVLLGFGEK LDNIDRNTKNI >gi|226332229|gb|ACIC01000091.1| GENE 7 17915 - 18640 499 241 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|253569788|ref|ZP_04847197.1| ## NR: gi|253569788|ref|ZP_04847197.1| predicted protein [Bacteroides sp. 1_1_6] # 1 241 48 288 288 399 100.0 1e-110 MKQALQLFIITLLIVSCTKKQSNQIEQYDSMIFTDAEIRNQEIQDSLQQARLDSLALIAW GDAKFGMSMKETLSTNAFKNGRKQHGVDEITMNFDQEFKFKKLFELNNLAGISAYYQENE LKRISIKSYYVRANYINELIMDCNIFAKEFTKKHGSPSFQKGKTNILDFTVGKEFTYAKF KMKDKNIVIKLGELYDGGNYYYIIYIENDKFPKKKHVKTEKEKKEEQRRMKEAKEIKENS F >gi|226332229|gb|ACIC01000091.1| GENE 8 18774 - 19154 210 126 aa, chain - ## HITS:1 COG:no KEGG:BT_0846 NR:ns ## KEGG: BT_0846 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 124 1 117 119 86 39.0 3e-16 MKKILFFMILLCPIVLGGCSSDDDDWMDLDSKHLVGTWSTGTEGTHKYLKFENDGTGYYA LLNGASFLTNYLFTYEISGDNINIEITYSDDGNKLKGQKKVWECLFSKDILKIKNGSEDG TYKRVE >gi|226332229|gb|ACIC01000091.1| GENE 9 19743 - 19896 65 51 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKILLLLAILPMFMTSCSNDDESTEQTFFVNVYTKWEDSEEKVAKKAFLY Prediction of potential genes in microbial genomes Time: Thu May 12 01:36:04 2011 Seq name: gi|226332228|gb|ACIC01000092.1| Bacteroides sp. 1_1_6 cont1.92, whole genome shotgun sequence Length of sequence - 60276 bp Number of predicted genes - 49, with homology - 46 Number of transcription units - 25, operones - 13 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 320 143 ## gi|260171447|ref|ZP_05757859.1| hypothetical protein BacD2_06235 2 1 Op 2 . + CDS 376 - 1062 550 ## gi|253569790|ref|ZP_04847199.1| conserved hypothetical protein + Prom 1087 - 1146 3.9 3 2 Tu 1 . + CDS 1192 - 2406 868 ## COG3328 Transposase and inactivated derivatives 4 3 Tu 1 . - CDS 2534 - 3217 581 ## BT_4436 hypothetical protein - Prom 3334 - 3393 5.9 5 4 Op 1 . - CDS 3397 - 4104 465 ## BT_4435 hypothetical protein - Term 4109 - 4146 6.2 6 4 Op 2 . - CDS 4165 - 4626 697 ## BT_4434 hypothetical protein - Prom 4651 - 4710 8.1 7 5 Op 1 . - CDS 4798 - 5247 356 ## BT_4433 hypothetical protein 8 5 Op 2 . 
- CDS 5269 - 5454 122 ## gi|253569796|ref|ZP_04847205.1| predicted protein - Prom 5624 - 5683 5.1 + Prom 5808 - 5867 8.0 9 6 Op 1 . + CDS 5942 - 6355 324 ## BT_4432 putative non-specific DNA-binding protein HU-1 10 6 Op 2 . + CDS 6360 - 8159 1346 ## BT_4431 hypothetical protein - Term 8258 - 8329 5.1 11 7 Op 1 . - CDS 8355 - 10790 2074 ## COG0755 ABC-type transport system involved in cytochrome c biogenesis, permease component 12 7 Op 2 . - CDS 10804 - 11949 843 ## BT_4429 putative pteridine-dependent dioxygenase 13 7 Op 3 . - CDS 11990 - 12880 737 ## BT_4428 hypothetical protein 14 7 Op 4 . - CDS 12917 - 14716 1549 ## BT_4427 surface layer protein 15 7 Op 5 . - CDS 14716 - 17250 1923 ## BT_4426 hypothetical protein 16 7 Op 6 . - CDS 17295 - 18011 986 ## COG0274 Deoxyribose-phosphate aldolase 17 7 Op 7 . - CDS 18051 - 19085 1310 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 18 7 Op 8 . - CDS 19110 - 20627 1582 ## COG1070 Sugar (pentulose and hexulose) kinases 19 7 Op 9 . - CDS 20656 - 21135 627 ## BT_4422 hypothetical protein 20 7 Op 10 . - CDS 21149 - 21520 324 ## BT_4421 hypothetical protein 21 7 Op 11 . - CDS 21577 - 22665 1089 ## BT_4420 hypothetical protein - Prom 22898 - 22957 6.5 + Prom 23391 - 23450 4.9 22 8 Op 1 . + CDS 23484 - 23777 233 ## BT_4419 hypothetical protein 23 8 Op 2 . + CDS 23791 - 24081 349 ## BT_4418 hypothetical protein + Term 24118 - 24179 8.1 + Prom 24118 - 24177 3.2 24 9 Tu 1 . + CDS 24198 - 25733 2092 ## COG1418 Predicted HD superfamily hydrolase + Term 25763 - 25792 2.1 + Prom 25744 - 25803 2.3 25 10 Tu 1 . + CDS 25856 - 26605 586 ## COG3142 Uncharacterized protein involved in copper resistance + Term 26817 - 26853 -0.6 26 11 Op 1 . - CDS 26652 - 27866 1266 ## COG1883 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit 27 11 Op 2 . - CDS 27881 - 28312 449 ## BT_4414 hypothetical protein - Prom 28334 - 28393 7.9 - Term 28432 - 28480 -1.0 28 12 Op 1 . 
- CDS 28660 - 29097 238 ## BT_4413 hypothetical protein 29 12 Op 2 . - CDS 29181 - 31898 1971 ## BT_4412 hypothetical protein - Prom 31993 - 32052 8.0 + Prom 31985 - 32044 7.4 30 13 Tu 1 . + CDS 32225 - 32362 125 ## + Term 32386 - 32436 12.5 - Term 32379 - 32420 9.4 31 14 Op 1 . - CDS 32437 - 33138 767 ## BT_4411 hypothetical protein 32 14 Op 2 . - CDS 33197 - 35131 1610 ## BT_4410 hypothetical protein - Prom 35357 - 35416 4.9 - Term 35359 - 35395 0.3 33 15 Tu 1 . - CDS 35424 - 36485 781 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair - Prom 36718 - 36777 6.3 + Prom 36523 - 36582 5.6 34 16 Tu 1 . + CDS 36728 - 38119 743 ## BT_4408 hypothetical protein + Term 38189 - 38234 12.4 - Term 38176 - 38220 12.2 35 17 Op 1 . - CDS 38255 - 39430 951 ## BT_4407 hypothetical protein 36 17 Op 2 . - CDS 39427 - 40593 1117 ## BT_4406 hypothetical protein 37 17 Op 3 . - CDS 40597 - 42225 1560 ## BT_4405 hypothetical protein 38 17 Op 4 . - CDS 42233 - 45553 2994 ## BT_4404 hypothetical protein - Prom 45574 - 45633 4.6 39 18 Op 1 6/0.500 - CDS 45721 - 46665 538 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 46687 - 46746 1.6 - Term 46703 - 46752 2.6 40 18 Op 2 . - CDS 46755 - 47306 465 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 47414 - 47473 8.6 + Prom 48017 - 48076 5.4 41 19 Op 1 . + CDS 48171 - 48356 57 ## 42 19 Op 2 . + CDS 48340 - 48780 318 ## BT_4400 hypothetical protein - Term 48655 - 48708 1.2 43 20 Tu 1 . - CDS 48725 - 48916 66 ## - Prom 48971 - 49030 6.4 + Prom 48871 - 48930 4.8 44 21 Tu 1 . + CDS 48978 - 50129 1011 ## COG2706 3-carboxymuconate cyclase 45 22 Op 1 . - CDS 50203 - 53484 3488 ## BT_4398 TPR domain-containing protein 46 22 Op 2 . - CDS 53506 - 54888 1311 ## COG0477 Permeases of the major facilitator superfamily - Prom 55024 - 55083 6.0 + Prom 54944 - 55003 4.7 47 23 Tu 1 . 
+ CDS 55051 - 55938 612 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 56074 - 56107 3.0 - Term 55880 - 55911 -0.8 48 24 Tu 1 . - CDS 56140 - 58353 2110 ## BT_4395 hyaluronoglucosaminidase precursor - Prom 58531 - 58590 6.2 + Prom 58480 - 58539 7.4 49 25 Tu 1 . + CDS 58559 - 60199 1468 ## COG3525 N-acetyl-beta-hexosaminidase Predicted protein(s) >gi|226332228|gb|ACIC01000092.1| GENE 1 3 - 320 143 105 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260171447|ref|ZP_05757859.1| ## NR: gi|260171447|ref|ZP_05757859.1| hypothetical protein BacD2_06235 [Bacteroides sp. D2] # 2 105 55 158 158 141 77.0 2e-32 ENENKSIDNSQSTISVVNDGLITYTDGSKSSKPKYATKFQSGVFNIENMPNGEYILWVTY REEYGGRTYSSYKAISVNHDYRGKSEKKVFQTSMQDQGLYVYQNW >gi|226332228|gb|ACIC01000092.1| GENE 2 376 - 1062 550 228 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253569790|ref|ZP_04847199.1| ## NR: gi|253569790|ref|ZP_04847199.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 13 228 1 216 216 418 100.0 1e-115 MKRIILSALIGLMAISASAQLMRSEELEKYAKEKFGDKWVDAAANIGSQLALDKNNSLTY TQVIECGEQTKEKLYVTLNHWFVESFNDANSVIQLNDKEEGVIIGKGYLSDIAGHLGGMN AYNVSVHPIIKVDIKDKKIRVTYTVQCFEVEKAVGGGVMSALSNTRPSISTEKWPLETCY PFIEKDEHKAKKTSSKALVMTHAYSNVIMDKIEEAVKNGLAGNESDNW >gi|226332228|gb|ACIC01000092.1| GENE 3 1192 - 2406 868 404 aa, chain + ## HITS:1 COG:YPO0011 KEGG:ns NR:ns ## COG: YPO0011 COG3328 # Protein_GI_number: 16120364 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Yersinia pestis # 30 401 25 398 402 335 45.0 7e-92 MKEEFDFESIKNKAIEQLKAGKPLLGKDGAFAPLLESILNAALEGEMDAHLTEEERQMGN RRNGKMQKQVQTPLGEVTVSTPRDRNSSFDPQFIKKRETILAEGVADRIIGLYAMGNSTR EISDWMEENLGNRVSADTISSITDRVLPEIKAWKSRMLDSVYPIVWMDAIHYKVTDERGC AVTRAIYNVLSIDREGHKELLGMYISRNEGANFWLSVLTDLQNRGVEDILIACIDGLKGF PEAIQSVYPNTAVQLCVVHQIRNSIKYVGSKNQKEFLRDLKCVYQAVNKESAENELLKLD EKWGEQYPVVIRSWQDNWDKLSEYFQYTPVIRKLIYTTNTVEGYHRQIRKVTKNKGVFPS DTALEKLVYLAYRNIRKKWTMPLANWATISQQLAIKFGNRFKLL >gi|226332228|gb|ACIC01000092.1| 
GENE 4 2534 - 3217 581 227 aa, chain - ## HITS:1 COG:no KEGG:BT_4436 NR:ns ## KEGG: BT_4436 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 227 8 234 234 400 100.0 1e-110 MTAVLVIAGCSKSEDGKLTLSANQVSLYSGDTKQVTVNDNATWSSKSEFVAEVSEDGIIK GNHVGKTIITATSDNGEALCEVVVNAKYSTYTEPVLEFGVDKATVKAKEKRTILEDKTST LGYRGENSAVKSVAYLFENGKLTSSAIALSYSYTEEIAKFLSERYQVIGKDSDGAYYFIN NDSDKYNMGVRLSVESGFIMVVYVPQSPKSRSIQAADLQELKSILGI >gi|226332228|gb|ACIC01000092.1| GENE 5 3397 - 4104 465 235 aa, chain - ## HITS:1 COG:no KEGG:BT_4435 NR:ns ## KEGG: BT_4435 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 235 1 235 235 449 100.0 1e-125 MNINKEIEINISDAIVEKAISFSIGKNKLCLYPSTLGKMQILKNLYLSMDVNMELLAINP LVETLRICQEKTESVCQVIAYSTFNDRKSILDMEKVLRRARLFQDEASVEDLATILTIIL SSDKIEEFIRYFGIDADREMKTRISRIKGEGSSITFGGKSIYGLLIDFACQRYGWTMDYV LWGISYVNLNMLFADAVTTVYLSEEERKKLGRGDGEVINADDPGNRDLIRRMISE >gi|226332228|gb|ACIC01000092.1| GENE 6 4165 - 4626 697 153 aa, chain - ## HITS:1 COG:no KEGG:BT_4434 NR:ns ## KEGG: BT_4434 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 153 1 153 153 263 100.0 1e-69 MAQLSWGKPTIEFGKCVNGATPSEWTKLGCDPVESSTKLTPTKGEKKEAKVEGGENEAVK YARNTYAFEFEIRAAKGREKPIKDSDGVVEGEYAFRLTPEDETCEGILIERSVVSVEESY DTAEGKKWKYTVDVLKPATGDQVKPYTPATPAE >gi|226332228|gb|ACIC01000092.1| GENE 7 4798 - 5247 356 149 aa, chain - ## HITS:1 COG:no KEGG:BT_4433 NR:ns ## KEGG: BT_4433 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 149 1 149 149 293 100.0 2e-78 MLQEFLEIEELKSIHEEKLRLMEREMALSTPLLTELEYIPMLYKWYCELSGCCEEAGVQN TDQKGQFLFVILFFYSPIALVGGRIVSGVRDKLAQLFGFTSPSAVSNLCKLIKSFYTTYK GYRKIVNQLCDEFMSRLKENGIIPQNPIL >gi|226332228|gb|ACIC01000092.1| GENE 8 5269 - 5454 122 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253569796|ref|ZP_04847205.1| ## NR: gi|253569796|ref|ZP_04847205.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 61 5 65 65 102 100.0 9e-21 MDIKKNLRTVARNAAFRVEFLTSGREILLYTNAIYSAMMWGWTKRIEEKEKETHIREELI K >gi|226332228|gb|ACIC01000092.1| GENE 9 5942 - 6355 324 137 aa, chain + ## HITS:1 COG:no KEGG:BT_4432 NR:ns ## KEGG: BT_4432 # Name: not_defined # Def: putative non-specific DNA-binding protein HU-1 # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 137 1 137 137 249 100.0 3e-65 MSILYKTTRFADTFTGDKETNVRVQLLSWDTLDTKAFVEYLATKYNLSKGEAYKSLSIVL EGMEAVLKDGNILNLDDFGSFSLNGNFCEDKEPDENHRAESIQVKNVVFKADKNLKKRIS AAGFEKYNPKKHNKRKY >gi|226332228|gb|ACIC01000092.1| GENE 10 6360 - 8159 1346 599 aa, chain + ## HITS:1 COG:no KEGG:BT_4431 NR:ns ## KEGG: BT_4431 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 599 1 599 599 1204 100.0 0 MEYDRIYSIRKGEYFADALKRAGKDFIPTNCIINKLLPGLGATHCELTAPRKSIIIEPNV PVIESKAKVHKNALAVYKGVSIRQIADFLEANREKDYKLLTTPEGFNKIKEAMQTVDIDM YTECFILFDECEKLVQDVHYRDSIREPMNDFFRFQNKALISATPIVPEKDSRFDGFMRVL IQPDYVYRQKLKLITTNNVLETLQEVIEAKRGTVCIFCNSIDSIDSFYRLIPELSNACTF CSEDGQYKLWKGNRRKKSMMITELERYNFFTSRFYSAVDILCKNPPHVIFVSDLYGAAQS VIDPATEAIQIIGRFRGGVNSVTHIASIRPELECMSSSEIDHWIQGASTIFNGWKAQLAR TTNIGERTLLQEAIGENSYLPYLDENGKPDSFLIANFYEKEQVKRLYTSADLLHLAYEQT GYFVFSHEERLMPVSDNERMAIQHRLAKKKRAELIVRKLEEMEKMSKATDKKIQKRYQRM LMNLITSTADRYIYDCFCRFGAEFVREADYNENKLRTALNVSSEHTIKKSGQMRTYIQRA FPVGAEISVQEAKSMLRQVYKKMGLNTGRGITTKELEQYAEIENSRNREARMIKILKHK >gi|226332228|gb|ACIC01000092.1| GENE 11 8355 - 10790 2074 811 aa, chain - ## HITS:1 COG:Cj1013c_2 KEGG:ns NR:ns ## COG: Cj1013c_2 COG0755 # Protein_GI_number: 15792340 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome c biogenesis, permease component # Organism: Campylobacter jejuni # 535 806 10 280 287 246 50.0 1e-64 MKLIRLIASPILMYVLAGVYALVLAIATFVENSSGPAVAREYFYYAPWFILLQLLQAVNL LAMFLQGGYFKRISKGSLIFHGALVFIWLGAAVTHYAGVTGIMHIREGETVDRMMRDEGA GMGNASLPFSVTLDDFRLKRYPGSHSPMSYESDLVIKKENEAPLQATVRMNKVIEVDGYR 
LFQSSFDPDEQGTVLSVSYDRPGMQITYIGYFLLFAGFVLTLFSKKSRFGRLRRELGEMK KNAPFCLLLFLGLSGALGTQASYAQETLSSSQLPCIPAPHARKFGSLVLLNPNGRLEPVN SYTSAILRKLYGADKLNSINSDQFFLNLLAFPDEWGGYPFIKVDNKDILQRFGRDGKYIA WQDVFDADGNYVLTDEVNAIYAKSASERKRMDSDLLKLDESVNIVYRIMQHQLLPLFPDE NDVQGKWYSAGDEQTVFHDKDSLFVSKIMDWYIYELGNGVRTNNWKEADKIVDMMHIFQQ AKSKTPAIDNQRVKAELLYNQLNLFFWCRLAYLILGGILLFIACGEIIADFKWGSRLSSI LIVLLIAAFLAHTTGVLLRWYISERAPWANAYESMICTSWLLVGGGLLFARRFRILPALA GLLGGIMLFVAGLNHLNPEITPLVPVLQSYWLMSHVAIIMIGYVFFALCALTGLFNLILM NLLSATNRVKLLFRIREFTLLNEMAMILGLFFMTAGTFLGAIWANVSWGRYWGWDPKETW ALISIVVYALVLHIRFIPLLKGKTTWCYNLLSVVSILSIIMTWFGVNYYLSGLHSYGKTE GGDLLLWIWGAGLCVVLALALFARRRLKKYS >gi|226332228|gb|ACIC01000092.1| GENE 12 10804 - 11949 843 381 aa, chain - ## HITS:1 COG:no KEGG:BT_4429 NR:ns ## KEGG: BT_4429 # Name: not_defined # Def: putative pteridine-dependent dioxygenase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 381 1 381 381 756 100.0 0 MKYYRIKCEIYKAAGKSWERMAEELLLSAVRGEGKEHGGKVVRLVFFTFCEDNREYQLQR SFLEQWVDDHFVSPRPVLSLVAQKPLVGELVMEVHSLPATAGEEVTVEEQMTSSVRYLRV TSGHYREIIAGGLYADDLTLPVREQSEQVFGKAEEILKAEQMSFGDIVRQWNYLERITDI VYGNQCYQDFNDIRSQFYASSEWASGYPAATGIGTQHGGILVDFNAVKGGIEIIPLDNDW QRAAHVYSDEVLISHRTDAEKGTPKFERGKSLSDHQQEMIYISGTAAIRGEESIVTGDVL VQTEITLENIQHLIGLEEGREKLPEHSGKLELLRVYLKHEEDAKIVKEDMDKLCPDVPIV YLYADVCREELLVEIEGIAYL >gi|226332228|gb|ACIC01000092.1| GENE 13 11990 - 12880 737 296 aa, chain - ## HITS:1 COG:no KEGG:BT_4428 NR:ns ## KEGG: BT_4428 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 296 1 296 296 600 100.0 1e-170 MIKICSVAFLSLLLSASSLFAQTKTFAGADYSQGIVFVMENNQMVWKHKAPDSNDLWILP NGNILFTTGHGVLEMTRQNDTIFHYESKSPVFACQRLKNGNTFVGECVTGRMLEISPKGK IVKETCILPKGVKDGGFAFMRNARRLDNGHFLVAHYGDQCVKEYDANGKVVWKLDVPGGP HSLTRLPNGHTLIAVADKDQNPRLIEVTPEGKTIWEISNADIPGKPLKFLGGFQYFSDGR FLITNWTGHVNPKEKVHLLLVDRQKNVLYSLENTPELQTMSSVYSTDKPAGVASYH >gi|226332228|gb|ACIC01000092.1| GENE 14 12917 - 14716 1549 599 aa, chain - ## HITS:1 COG:no KEGG:BT_4427 NR:ns ## KEGG: 
BT_4427 # Name: not_defined # Def: surface layer protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 599 1 599 599 1194 100.0 0 MKKILIPVSIVAILLCASWKEDAKTVSPDRLFLADSGKSLFVTNRAGNELIKMSSDGQKA EQKVTLSSPVNAMTQDPSGNLWVVCDGNTGMMYELDGKKLSVQSKTKSGATPSDILYSPV SKSLWVTQRFNNELWEIDPATRKVKTKIAVGREPVAMAPFAGDSCLLIANNLPEMSSTAY PIAVQLDIVDVPSKKVTGRIMLPNGATDAKSVAVDKNQTYAYVTHLIARYQLPTNQLDRG WMATNTLSIIDLKAGKLLTSVLLDTPQKGAANPWSVIVTPDDKQIIVAAAGSQELVRIDR IALHERLAKAKQGEMVTPSVKSWNNIPNDAGFLYGIRDFIPTQGKGPRSVVATGDKIYTA NYYTSELVSMDLNGKNLKKQVLGAPLAFTKVGKGDMYFHDATICFQNWQSCATCHPNDAR MDGLNWDLLNDGMGNPKNTKTLLLSHQTPPCMATGIRKNAEVAVRSGVKYILFMEGEDEI YESIDEYLKSLKPLPSPYLVNGKLSGKAKKGKKIFEENCASCHSGEYYTDQKQYKVDWTT GPDKGLSMDVPALNECWRTAPYLYDGRSYSMKDMLKVHGPHKPVTEKELEELEEYVLSL >gi|226332228|gb|ACIC01000092.1| GENE 15 14716 - 17250 1923 844 aa, chain - ## HITS:1 COG:no KEGG:BT_4426 NR:ns ## KEGG: BT_4426 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 844 1 844 844 1779 99.0 0 MIRSKLIKWLKWGMVCCMLVPFLAWQAKDTSFQPTDTKGFIAEMQKQFAEVQTIKKKENH REELERKVRDTHRMLMHDYPVYYDWWLQDGNDAKDIKDGKDVNWFRGSFSRQLSLRMQKL NITTAVNDTPESIEAAFQSYLKACERRRAERLETFTANQPEIVFTKFRTLRPSFFAYTEG VSDARAECNYFAGGSLSKIKMNGIWAEEETLLTDEDGVYRDPNLHFDGQHLLFSWKKSAK DDDFHLYEMDLKTRAVKQLTFGKGHADIEGIYLPDENILFNSTRSGSAVDCWFTEVSNMY LCDREGRYMRQVGFDQVHTTTPTLLDDGRVVYTRWDYNDRGQVWAQPLFQMNPDGTGQAE YYGMNSWFPTTVAHTRQIPGTRKVMTVFMGHHNPQHGKLGIIDPEAGRDENEGVMFVAPV RKPEAERIDSYGQFTDQFQHPFPLNETEFLISYTPLGYHIGHPMEFGIYWMNANGERELL VADSKISCNQPILLAPRKRPFHRSTSVDYTKNEGVYYMQNIYEGNGLKGVAPGTIKQLRI VEIQFRAAGVGEVNGNDEGGGALASSPVGVGNAAWDVKRVIGVTDVYPDGSAFFKVPARR PLYFQALDDKGRVVQTMRSWSTLQPNEVQSCVGCHEHKNTVPVAGHRVSMAMDKGIKALA PEDEMGERNFSYLKEIQPIWDRNCISCHDGVKHPMSLKGELKVVDKQSKRKYTDSYLSLT HARPDGPDRAWRGDAHHPEVNWISALSQPTLLPPYFAGSNKSNLIKRLEEGHGGTKLTPQ EIRKVSLWIDLLVPQIGDYREANNWSDHDREFYDRYDKKRKQARMEEQENIRQYIKSLQT KQQK >gi|226332228|gb|ACIC01000092.1| GENE 16 17295 - 18011 986 238 aa, chain - ## HITS:1 COG:BS_dra KEGG:ns NR:ns ## COG: BS_dra 
COG0274 # Protein_GI_number: 16080993 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Bacillus subtilis # 18 224 3 208 211 194 53.0 1e-49 MEKKNINEVIANLSVEQLAGMIDHTFLKPFGTAENIEKLCAEARQYQFAMVAINPAEVET CVKLLEGSGVRVGAAIGFPLGQNTVECKAFETRDAIAKGATEIDTVINVRALQKGQTDIV KKEIEDMVSICKPAGVICKVILETCYLTDEEKETVCRIAKEAGVDFVKTSTGFGTAGANV HDVALMRRVVGPVIGVKAAGGIRDLDTALALIQVGATRIGTSSGIQIVEAYKELKKGL >gi|226332228|gb|ACIC01000092.1| GENE 17 18051 - 19085 1310 344 aa, chain - ## HITS:1 COG:SMa1156 KEGG:ns NR:ns ## COG: SMa1156 COG1063 # Protein_GI_number: 16263079 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Sinorhizobium meliloti # 1 342 4 355 357 110 26.0 4e-24 MKSVIVPAPGKIEIREVETPVINAYQALVKTEMVALCNATDSKLIAGHFPGVDTYPLALG HENAGIVVAVGEKVRNFKVGDRAIGGLISDFGAQGINSGWGGFSEYVVVNDFEVLKEEGL ATPEQGCWDSFEIQNSVPAHVQPEEAVISCTWREVLGAFKDFNLTPGKKVIVVGSGPVGL SFVKLGKLFGLGQIDIVDMLPAKLEVARRMGADNGYTPAEISTPEFIAAAGRSYDAVIDA VGLDVVVNSVLPLVKMGGDVCVYGVMTKNPTFDLSKAPYNFDLHMHQWPTRSEEKAAMTT LAQWIEEGTLSASDFITHRFKIEEIEEAFAAVKRGEVLKCVLTF >gi|226332228|gb|ACIC01000092.1| GENE 18 19110 - 20627 1582 505 aa, chain - ## HITS:1 COG:CAC2612 KEGG:ns NR:ns ## COG: CAC2612 COG1070 # Protein_GI_number: 15895870 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Clostridium acetobutylicum # 2 490 3 494 500 319 35.0 7e-87 MYILAHDLGTSGNKATLFDESGLLIASRTAAYPTDYASGNRAEQNPHHWWKAIVDTTQAL LELVSPNDIAGVALSGQMMGCLCIDKDGNPLRPHMLYCDQRSQEEEAKLTEKIDPLHFYE ITGHRISASYSVEKLMWVKKHEPEIFAQTAKMLNAKDYINYRLCGTIATDPSDASGTNAY DLNRWQWSEEIIEAAELDLSLFPEVRSSIDVIGEITNEAARETGLLAGTPVICGGGDGSC AGVGVGCVAPGTAYNYLGSSSWVALTVEKPIVDEQRRTMNWAHVVPGMLHPSGTMQAAGS SYNWMINQLCQNEQALAAQSGRSVFELIDEQIIASPIGANKLLFLPYMLGERTPRWNVDA KGAFIGLTLGHKHGDMLRAVMEGVTLNLGFIINIFRKHVPIDRMTVIGGCAQNPVWRQMM 
ADIYQAEIRVPNYLEEATSMGAAILAGIGAGVFPDFSVIDRFVRIEQTVQPIPENVKKYE AWMPVFDQAYHALCEMYTEIAKTEL >gi|226332228|gb|ACIC01000092.1| GENE 19 20656 - 21135 627 159 aa, chain - ## HITS:1 COG:no KEGG:BT_4422 NR:ns ## KEGG: BT_4422 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 159 1 159 159 261 100.0 7e-69 MSEWLQTYNEMFGMYHAGIITTIFGAFAVTFTVLMSWPKLVKDFGPIGGFMAAALIIGTF WLVNHKLPGFGFSTGLLNDADGLPMQFSLIHQGNRGSAPWVDMGWAIAMGFIFADVLCAP KGTRGGLLKEAFPRWVVIILGGIVGGIFVGLTGYTNAAL >gi|226332228|gb|ACIC01000092.1| GENE 20 21149 - 21520 324 123 aa, chain - ## HITS:1 COG:no KEGG:BT_4421 NR:ns ## KEGG: BT_4421 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 123 1 123 123 217 100.0 1e-55 MKYYLSTFAGSAICGGFAFGIWPELWKTYGLMGGWLAATLIIGIMWYMNHYNGAILNPPG KIWLDQGWCIGSAGIAWGIVRFQGDITNFFYAVPTLVCCLIGGALAGIVVWKIRSCDCAR KIK >gi|226332228|gb|ACIC01000092.1| GENE 21 21577 - 22665 1089 362 aa, chain - ## HITS:1 COG:no KEGG:BT_4420 NR:ns ## KEGG: BT_4420 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 362 1 362 362 734 100.0 0 MKKYYLLSLISASLIGATCIQCTSPKQKSGENTLPQKSELFVSLPDYCPTPDGMAIAPNG DLILACPNFADITQPACLMRITKDGAVSKWLDVPVLEETGWASPMGLAFNEEGDLFISDN QGWSGAEKAKNKGRVLRLKFENDQLKETITVASGMEHPNGIRIRNGKLYVTQSSLSQIKD PSGLLVSGVYCFDMNDRDIAVTNTSADQNLLTTVITKNPEVQYGLDGIVFNEAGDLFVGN FGDGAIHRIKMDAEGKVVSNDVWAQDTTQLRTTDGMCIDDKGNIWVADFSANAVARVDKD GKIQRIAQSPDCDGSDGGLDQPGEPIVWNGQVIVSCFDLVTGPDKVNTKHDKPFTLAKLS LE >gi|226332228|gb|ACIC01000092.1| GENE 22 23484 - 23777 233 97 aa, chain + ## HITS:1 COG:no KEGG:BT_4419 NR:ns ## KEGG: BT_4419 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 97 1 97 97 130 100.0 1e-29 MTEEEKKLLSTFETQLRHLMYLHDELKRENAGLRKLLENEKLKNEKVQAQYDELEVNYTN LKTATAISLNGSDVKETKLRLSKLVREVDKCIALLNE >gi|226332228|gb|ACIC01000092.1| GENE 23 23791 - 24081 349 96 aa, chain + ## HITS:1 COG:no KEGG:BT_4418 
NR:ns ## KEGG: BT_4418 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 96 1 96 96 139 100.0 4e-32 MNDKIKINLQIADSNYPLTINREEEEMVREAAKQVNIRLNAYREYYKNLEPEKIIAMVAY QFSLEKLQLMQRNDTTPYTEKVKELTELLEDYFKKE >gi|226332228|gb|ACIC01000092.1| GENE 24 24198 - 25733 2092 511 aa, chain + ## HITS:1 COG:CAC1816 KEGG:ns NR:ns ## COG: CAC1816 COG1418 # Protein_GI_number: 15895092 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Clostridium acetobutylicum # 34 511 44 514 514 399 49.0 1e-111 MIAIIATAIACFVVGGILSYVLFRYVLKSKYDSVLKDAETEAEVIKKNKLLEVKEKFLNK KADLEKEVALRNQKIQQAENKLKQREMVLSQRQEEIQRKKLEAEAVKENLEAQLVIVDKK KEELDKLQHQEIEKLEAISGLSADEAKERLVESLKEEAKTQAQSFINDIMDDAKLTASKE AKRIVIQSIQRVATETAIENSVTVFHIESDEIKGRIIGREGRNIRALEAATGVEIVVDDT PEAIVLSAFDPVRREIARLALHQLVTDGRIHPARIEEVVSKVRKQVEEEIIETGKRTTID LGIHGLHPELIRIIGKMKYRSSYGQNLLQHARETANLCAVMASELGLNPKKAKRAGLLHD IGKVPDEEPELPHALLGMKLAEKYKEKPDICNAIGAHHDETEMTSLLAPIVQVCDAISGA RPGARREIVEAYIKRLNDLEQLAMAYPGVTKTYAIQAGRELRVIVGADKIDDKQTESLSG EIAKKIQDEMTYPGQVKITVIRETRAVSFAK >gi|226332228|gb|ACIC01000092.1| GENE 25 25856 - 26605 586 249 aa, chain + ## HITS:1 COG:PM0526 KEGG:ns NR:ns ## COG: PM0526 COG3142 # Protein_GI_number: 15602391 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in copper resistance # Organism: Pasteurella multocida # 5 229 2 224 244 217 50.0 2e-56 MKKYQFEVCANSVESCLAAQAGGADRVELCAGIPEGGTTPSYGEISTARDMLTTTRLHVI IRPRGGDFLYSPIEVRTMLKDIEMARQLGADGVVFGCLTANGEIDLPVMQELMKASQGLS VTFHRAFDVCRDPEKALEQIIELGCNRILTSGQQATAELGIPLLKTLQTQASGRIILLAG CGVNEKNIVRIASETGIQEFHFSARESIKSDMKYKNESVSMGGTVHIDEYERNVTTAQRV INTIQAIKS >gi|226332228|gb|ACIC01000092.1| GENE 26 26652 - 27866 1266 404 aa, chain - ## HITS:1 COG:PAB1772 KEGG:ns NR:ns ## COG: PAB1772 COG1883 # Protein_GI_number: 14521092 # Func_class: C Energy production and conversion # Function: Na+-transporting methylmalonyl-CoA/oxaloacetate 
decarboxylase, beta subunit # Organism: Pyrococcus abyssi # 23 401 19 399 400 169 30.0 7e-42 MENIDFATLFQGIGTMMESGWLLASARVFLVLLGLLLIYLGWKGVLEPMVMIPMGLGMVA INCGTLMMPDGTLGNLFLDPMLSDTDDLMNTMQIDFLQPVYTLTFSNGLIACFVFMGIGT LLDVGFLLQKPFASIFLALCAELGTFLTVPIASALGLTLKESASVAMVGGADGPMVLFTS LALAKHLFVPITVVAYLYLGLTYGGYPYLVKLLVPKRFRAIKMVTKKAPKNYDAKVKLAF SAVLCAVLCFLFPVASPLFFSLFLGVAVRESGMKHIYDFVSGPLLYGSTFMLGVLLGVLC DAHLLLDPKILKLLVLGIVALLLSGIGGIIGGYIMYIVKRGNYNPVIGIAAVSCVPTTAK VAQKIVSKDNPDSFVLGDALGANISGVITSAIITGIYITIIPYL >gi|226332228|gb|ACIC01000092.1| GENE 27 27881 - 28312 449 143 aa, chain - ## HITS:1 COG:no KEGG:BT_4414 NR:ns ## KEGG: BT_4414 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 143 41 183 183 283 98.0 9e-76 MIDLLLSTFAMMGLTFLIGFFVAAVIKLIAYAADSFDFYSSHRLELLRLHRWRQHRQKVE RLVRQMPLPDEGTLGDYREDFSRGINRNVSGYSGYYHGVSPGASEENLMDYYYPRDTREF FLKEEELAHVNKKNTKKSTTNKQ >gi|226332228|gb|ACIC01000092.1| GENE 28 28660 - 29097 238 145 aa, chain - ## HITS:1 COG:no KEGG:BT_4413 NR:ns ## KEGG: BT_4413 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 145 1 145 145 278 99.0 5e-74 MAKYIFCILAFAFLALGFTHISNKELQGFQKETEVPAIDMQYNAGANCSVSSAGFDKSCK DTHSLLTDGTYGSKVSDCIFDNLSFPRSIVPFKNLRFNTNTAIIQILSSLNTLLPEERLS HTCFDVNYTKYSCKYYVYTLAHILI >gi|226332228|gb|ACIC01000092.1| GENE 29 29181 - 31898 1971 905 aa, chain - ## HITS:1 COG:no KEGG:BT_4412 NR:ns ## KEGG: BT_4412 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 905 1 905 905 1859 98.0 0 MKTFIHLLSVLILSVVLYACNNAHFLKEENYRNQVTEDFEQKKQALPHGDLFTVFSNPDL SVYEQEALMFLYAYMPIGDVTDYSGDYYLENVRLSGQTRTEMPWGDQIPDELFRHFVLPV RVNNENLDDSRRVFYGELKDRVKHLSMKDAILEVNHWCHEKVVYRPSDARTSSPLASVKT AYGRCGEESTFTVAALRSVGIPARQVYTPRWAHTDDNHAWVEAWADGQWYFFGACEPEPV LNLGWFNAPASRGMLMHTKVFGRYTGPEEIMLETPNYTEINVIDNYAPTAKATVTVTDTE GHPVSGAKVEFKIYNYAEFYTVATKYTDAEGKAFLTAGKGDMLVWASRDGKFGYAKLSFG 
KEDALKLSLDKKEGESYTLPMDIVPPVEGANLPEVTPEQRAENDHRMAQEDSIRNAYVAT MMTDEQAKEWVNGLYGNILQPETMKDKLAAFLVASRGNHQTLKDFLSAIRKEKKHISWEE MRGMWLLENISAKDLRDVTLDVLNDHLKNTSDGEKTDADLVKRALLNPRIANEMLTPYKK VLYDAISEAVLKSAPVDAAHDAKALIEWCRKEIKIDNELNSQRIPISPMGVWKSRVADEK SRDIFFVAAARSIGIPAWIDEVTGKVQYLSDGLSPQDVNFETSRSTQSRTGMLKASYTPI RSLSDPKYYSHFTISKFKNGTFQLLNYDEGDVDMGGGATWSNLLKNGVKLDEGYYMMVTG TRLASGAVLSNTTFFTIEPDKTTTVDLVMRESKDQVQVIGNFNSEATYRPVGGTDLQSIL QTCGRGYFVVAVLGVGQEPTNHALRDIAALRSEFEQWGRKMVFLFPSEEQYKKFNTHEFK DLPSTIVYGIDVDNSIQKQIVDAMKLNQSTLPVFIIADTFNRVVFVSQGYTIGLGEQLMK VVHGL >gi|226332228|gb|ACIC01000092.1| GENE 30 32225 - 32362 125 45 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKNANEKIMMLQYRIKRYQAMGNGAMCQTLNGKLQKLLTKQPAM >gi|226332228|gb|ACIC01000092.1| GENE 31 32437 - 33138 767 233 aa, chain - ## HITS:1 COG:no KEGG:BT_4411 NR:ns ## KEGG: BT_4411 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 233 1 233 233 414 98.0 1e-114 MKKVLLVSLLAIAGVSVSAQNLIKNEKFATEVTNKVTNPNKATAGEWFIMNNEADGVTTI AWEQTGDAKYPNAMKIDNSGAEKNTSWYKAFLGQRITDGLEKGIYVLTFYAKAKEAGTPV SVYIKQTNEEKNDNGKLNTTFFMRRDYDADAQPNASGAQYNFKIKDAGKWTKVVVYYDMG QVVNAISSKKSNPALEVSDTDDDAAILKDCYVAILGQNKGGVVEISDVTLKKK >gi|226332228|gb|ACIC01000092.1| GENE 32 33197 - 35131 1610 644 aa, chain - ## HITS:1 COG:no KEGG:BT_4410 NR:ns ## KEGG: BT_4410 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 644 1 644 644 1320 98.0 0 MNRMKHLICVFLLAAFGSCSFKANAYTERDMLQKAADETTLKNVLVMKQAWVPYPAYTDR AAWDSLMGPNKQRLIAAGEKLLDYKWQLIPATAYLEYERTGNRKIMEVPYDANRQALNTL MLAELAEGKGRFIDQLLNGAYMSCEMNSWVLSAHLPRQSSKRSLPDFREQIIDLGSGGYG ALMAWVHYFFRKPFDKINPVVSLQMRKAIKERILDPYMNDDDMWWMAFNWQPGEIINNWN PWCNSNALQCFLLMENNKDRLAKAVYRSMKSVDKFINFVKSDGACEEGTSYWGHAAGKLY DYLQILSDGTGGKISLLNEPMIRRMGEYMSRSYVGNGWVVNFADASAQGGGDPLLIYRFG KAVNSNEMMHFAAYLLNGRKPYATMGNDAFRSLQSLLCCNDLAKETPKHDMPDVTWYPET EFCYMKNKNGMFVAAKGGFNNESHNHNDVGTFSLYVNTIPVILDAGVGTYTKQTFGKDRY 
TIWTMQSNYHNLPMINGIPQKYGQEYKATNTICNEKKRIFSTDIAAAYPSEAKVKSWIRS YTLDDRKLTITDSYTLDEAVAPNQVNFMTWGNVTFPSLGKIQIEVKGQKVELDYPIQFKA ELETIQLDDPRLSNVWGKEIYRITLKTNEKKETGNYKFVIQQIK >gi|226332228|gb|ACIC01000092.1| GENE 33 35424 - 36485 781 353 aa, chain - ## HITS:1 COG:SMa2355 KEGG:ns NR:ns ## COG: SMa2355 COG0389 # Protein_GI_number: 16263727 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Sinorhizobium meliloti # 1 330 34 363 379 338 53.0 8e-93 MDAFYASVEQRDHPELRGKPLAVGHAEERGVVAAASYEARRYGVRSAMSSQKAKRLCPQL IFVSGRMDVYKSVSRQIHEIFHEYTDIIEPLSLDEAFLDVTENKKGISLAVDIAKEIKLR IREQLNLVASAGVSYNKFLAKIASDYRKPDGLCTIHPDQALDFIAGLPIESFWGVGPVTA KKMHLLGIHNGLQLRKCSLEMLTAYFGKVGALYYECSRGIDERPVEAVRIRKSIGCERTL ERDISVHSSVIIELYHVAVELIERLQRKEFKGNTLTLKIKFHDFSQITRSITQTQELYTL DRILPLAKELLKSVEYEQHPIRLIGLSVSNPKEEADEQHGVWEQLSFEFSDWE >gi|226332228|gb|ACIC01000092.1| GENE 34 36728 - 38119 743 463 aa, chain + ## HITS:1 COG:no KEGG:BT_4408 NR:ns ## KEGG: BT_4408 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 463 1 463 463 912 99.0 0 MKKLLFYLFILCSTSIVAQNGITFEVEQLSKPEKLYPVASPDEIYRYLMLFDNESTIRIT EKDIWKPPFNVIAKSESPDSLLYFGYNSFFCGMYQAYADHRPFVLSPDMIWLLINQGFAQ HVNANHESLRKYFVNFSGKESLIVQSNKKLKDPSLLWEEIFPQFTEQIRKEVGGNLVETL TCNFSTTTSLEKTVSEITIMETVKSYFEFITIMIVCGIPEITLEGTPQDWEKVLNKARGL KEYKLEWWISQLEPLLEEFVKASKGTINQEFWRNMFKCHSPKSCGAPETFDGWIIKFFPY DNEGKRNNLKQIVGRKKLPSEIVKVDLKYIEAYNDTVIETPLELWSGFIGLKQNNENFAL RPQIGWMVKKKNTDNTGLINRLKADAKGRGINLRVKEFPSVLLKLEEIKQLELTFINEID IPDELSKVKIERLKLSGRITKEEMQRIKTLFPNTDIKINGSRI >gi|226332228|gb|ACIC01000092.1| GENE 35 38255 - 39430 951 391 aa, chain - ## HITS:1 COG:no KEGG:BT_4407 NR:ns ## KEGG: BT_4407 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 391 1 391 391 782 99.0 0 MKRIKILYAGALVALSFSSVGCNDSESDLLEPKLYFESNVIRVEVEADTYECELVSRLST MVKSNVNVNYQVGGQDLVDDYNHKHGTTGQLLATDNYEMTGTSSTIKAGELYADPCGLSV 
KNITKGEDGVTYILPVIVNSTDIEKISSSSVTYIVVRKPIIIDKVYRINPGWLDVRLPTA YKTMGSVTYEALVYAERWKNLGTIMGNEGTLIMRTGDLNHPDNELQMAGSVALQMPDVSI FLLNQWYHVAFTYDASAGMSTLYLNGEKIAEKTVGSLTFDLNERFCIGYAYDYDRYRKWN GFMSEVRLWSVARTANQIRENMMFVDSGSDGLVGYWKLNGEDIEQRDGVWYVLDQSPNHN DATSNNGLRGETGGSQSFVEPTVVDMRVRLE >gi|226332228|gb|ACIC01000092.1| GENE 36 39427 - 40593 1117 388 aa, chain - ## HITS:1 COG:no KEGG:BT_4406 NR:ns ## KEGG: BT_4406 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 388 1 388 388 788 99.0 0 MIHMKMLRRLLYILPVIGIMIASCVDIESKDFEHIGGYNTMDNEESAQYYADLRAWKATS QGYGRPIFFGWFSNWSPEGPIRKGYLASLPDSIDMVSMWSGPFGLNEAKLADKEIFQKKK GGKITVCYILHNIGTGITPASVSEKVQAENPNASSEEITELVNKATEAYWGFTSGEKGAE DHIAAIQRYAKALVDTIVATDYDGLDIDWEPDNGGDGGRYVGSLKDRRGGPRGEFLHYLV EEIGKYFGPMATERPNGKYYYFMIDGEIWNSNKESAPYFDYFITQAYGDSNLDRRVSTLQ SWCGEYYDYRKHIFTENFESSWTTGGVLLTQAAYNHANGPKGGVGAFRLDNDYDNARDYN FVRHAIQINLQAYQEYMDNQSNENTEQQ >gi|226332228|gb|ACIC01000092.1| GENE 37 40597 - 42225 1560 542 aa, chain - ## HITS:1 COG:no KEGG:BT_4405 NR:ns ## KEGG: BT_4405 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 542 1 542 542 1030 99.0 0 MKKINQYKLLAGICALSLFTACDFEELNTNPFEMTDEMGIRDGVAVGGAVMAMQRAVITV GTQADDTEAINQYQVAYNLSADSWSGFFGQNNNWYSGSNNTTYYLQDNWVAATYTNSYTT LLSSWKKIKQESEKNETPEIFALAQILKISAWHKTLESFGPIPYTHAGEPALVIPFDSEQ VAFNAMFADLTAAIEALTVRAEQGATIVADYDAVYAGDTRKWVKYANSLMLRLAMRISYA DETTAQKYARQALNHPFGVMTSKSDEAQMSTGAGMVFRNNIDWLANQYNECRMGSSMFSY LLGYQDPRLSAYFEASPSAYAVAAFDGKNYQAVPPGNANQQNTIYTDFSKPNITSNTPTY WMRASEVYFLRAEAALRWGSEFGDAEALYEQGVATSFDENGISSSVDDYLASGLTPIAHN MRASYYSYNAAAPTTATPAFSGGTEQKLEKIMIQKWIALYPNGHEAWTEWRRTGYPKLNQ VQTNRGQGVTREGGIRRMVYPVSFYQSAEDRANYEEALKLLGGLDKDKATTQLWWDCKNK IY >gi|226332228|gb|ACIC01000092.1| GENE 38 42233 - 45553 2994 1106 aa, chain - ## HITS:1 COG:no KEGG:BT_4404 NR:ns ## KEGG: BT_4404 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1106 1 1106 1106 
2115 100.0 0 MLQIYEKRADNVVQKRVFLIFLGLLLMQTLAFAQSDSKITIQQKDITVVDALKTVERQSK MSINYSDSELKGKKIADLNLEDVSVTTALDAILKGTGFSYQIQGNYIIVTEKKPAAAQSV KDIKGKVTDENGEPLIGVNISVDGSSTGTITDFDGNFSIKSPGNARLKISYIGYATQLIA VSNKDFYPITMKQDTEVLDEVVVTALGIKRAEKALSYNVQQIKSEDITGIKDANFINSLN GKIAGVSINKSGSGVGGATRVVMRGAKSIEGDNNALYVVDGIPLFNTNMGNTDSGIMGEG KAGTEGIADFNPEDIESISILSGPSAAALYGSSAANGVVLITTKKGQEGKLQISFSSSSE FSKAYMVPEFQNTYGNKEGMFESWGDKLARPNSYDPKSDFFNTGTNFINSLTLSTGTKQN QTFASVSSTNSKGIVPNNSYDRLNFTIRNTATFLDDKLQLDLGASYVKQEDKNMVSQGQY WNPVISAYLFPRGEDFEDIKTFERYDQSRNIPIQYWPIADATYGSQNPYWTAYRNIATNE KSRYMFNVGLSYKITDWMNLAGRYRMDDTHVQFERKVYATSDQKFAEGKKGLYDYSNYKD RQEYADFMLNINKQFNDFNISANLGYSYSNYWSLARGYKGTLLGVTNLFAASNIDPSNGR ISEDGGDSRVRNHAIFANVELGWRSMLYLTLTGRNDWNSRLVNTSEESFFYPSVGLSGII SEMVKLPEFISYLKVRGSYTEVGSPVSQSGLTPGTVTTPIIGGALAPTGIYPFTDFKAER TRSYELGLSVRLWNKLSAEVTYYKSNTYNQTFIGDLPEFSGYKQIYLQAGNVQNHGWEAS LGYRDSFGDFSFSSNLTFSRNVNEIKEMVENYRTDMSPEPINVPEVRKDNGRVILKVGGS INDIYANTFLKKDSQGYVEIKQDGSYGLEQGEPVFLGKTSPDFNLGWSNSLSYKGFGLSF LINGRFGGVVTSSTQAILDRFGVSKASAEARDAGGVLLPGQGRVDAQSYYQLIGTGDYTT SGYYVYSATNIRLQELTLSYRFPDLWFKNVLKDVTLSFIANNPWMLYSKAPHDPELTPSV GTYGQGNDYFMQPSVRSFGFGIKFKL >gi|226332228|gb|ACIC01000092.1| GENE 39 45721 - 46665 538 314 aa, chain - ## HITS:1 COG:AGl2289 KEGG:ns NR:ns ## COG: AGl2289 COG3712 # Protein_GI_number: 15891252 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 72 259 87 272 323 90 32.0 3e-18 MSIHWLRVEKCSDGWLTTIMRKEKKEALHTLWQQAGEQKVPRGMQQSILRMQQNLGMQSV GSGKKYQLLIWRAAAIFLLAVSSVSIYLMLEKDKPQKDLVECFIPTAEIRELTLPDGTRV MLNSRSTLLYPEQFAGKTRSVYLIGEANFQVKPDEKHPFIVKANDFQITALGTEFNVNAY PENNELIATLLEGSVKVEFNNLISNVILKPNEQITYNKQTRKRSLQSPRIEDVTAWQRGE LVFSDMHLDEIFTSLERKFPYTFVYSLHSLNNKSYSFRFQQQATLEEVMKIITQVAGDVK YIIKGNKCYITDKI >gi|226332228|gb|ACIC01000092.1| GENE 40 46755 - 47306 465 183 aa, chain - ## HITS:1 COG:PA2426 KEGG:ns NR:ns ## COG: PA2426 COG1595 # 
Protein_GI_number: 15597622 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 37 172 39 171 187 62 31.0 4e-10 MENNVNQIELKFQRFFISNFPKVKNYAQLLLKSEVDAEDVAQDIFCKLWLQPEIWLDNEN ELDNYLFIMTRNIILNIFKHQQIEREYQEGYLEKTVLHELIEGEDALNNIYYEEMLLVVR LTLEKMPERRRLIFELSRFKGLSYKEIAEKLNVSIRTVEHQVYLALIELKKVLLFLFFLF SFK >gi|226332228|gb|ACIC01000092.1| GENE 41 48171 - 48356 57 61 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQILADDMLSGIQRYIMSGGRLGSQTYNICRPGYFQLTNIIRTFVEYNKYYQPQKSNGRT I >gi|226332228|gb|ACIC01000092.1| GENE 42 48340 - 48780 318 146 aa, chain + ## HITS:1 COG:no KEGG:BT_4400 NR:ns ## KEGG: BT_4400 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 47 146 1 100 100 208 100.0 4e-53 MEEPFDYSQVPFTFGMCAAENCPQASTCLRQIALKHAPANKVFLPIMNPNHIKGIKEKCD YFCSNEKVRYAKGFMCTINALTVRVANTFRYRMIGYLGRKNYYLKRSGKLALTPAEQQWI INTAKELGVIQSEYFDSYIVEYNWDR >gi|226332228|gb|ACIC01000092.1| GENE 43 48725 - 48916 66 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIAVSLSFPYEDTMVSRRGNGRFSTGILAFLYEETVVSLLETEISLTCPSYTLLYNCRNT HFE >gi|226332228|gb|ACIC01000092.1| GENE 44 48978 - 50129 1011 383 aa, chain + ## HITS:1 COG:BS_ykgB KEGG:ns NR:ns ## COG: BS_ykgB COG2706 # Protein_GI_number: 16078366 # Func_class: G Carbohydrate transport and metabolism # Function: 3-carboxymuconate cyclase # Organism: Bacillus subtilis # 40 382 8 346 349 204 36.0 3e-52 MLKTFATICIIGMFTSSCTSKKASRTEGSHEAEQELTMIVGTYTSGDSKGLYSFRFNEEN GTATALSEAEVENPSYLVPSADGKFIYAVSEFSNEQAAANAFAFNKEEGTFRLLNTQKTG GEDPCYIITNGSNVVTANYSGGSISVFPIDKDGSLLPASEVVKFKGSGADKERQEKPHLH CVRITPDGKYLFADDLGTDQIHKFIVNPNAKADNEDVFLKEGSPASYKVEAGSGPRHLTF APNGSYAYLINELSGTVIAFEYNDGELKEIQTIAADTAGAKGSGDIHISPDGKFLYASNR LKADGLAIFSIHPENGMLTKVGYQLTGIHPRNFIITPNGKYLLVACRDSNVIQVYERDTD TGLLTDIRKDIKVDKPVCIKFVP >gi|226332228|gb|ACIC01000092.1| GENE 45 50203 - 53484 3488 1093 aa, chain - ## HITS:1 COG:no KEGG:BT_4398 NR:ns ## KEGG: BT_4398 # Name: 
not_defined # Def: TPR domain-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1093 1 1093 1093 2241 99.0 0 MVKAWREKVVIPTYEVGKPEKNPIFLEKRVYQGSSGVVYPYPVIESMSDEKVDKEYEAIF IENEYIKVMILPELGGRVQMAYDKIRERHFIYYNHVIKPALVGLAGPWISGGIEFNWPQH HRPSTYMPVDTTIEENADGSVTVWVNEMERMFHQKGMAGFTLRPGHAFLEIKGVLYNRTE VPQTFLWWANPAVAVNDYYQSVFPPDINAVFDHGKRAVSSFPIATGTYYKMDYSAGVDIS NYKNIKVPTSYMAVNSRYNFEGGYENDTCAGMLHVANHHISPGKKQWTWGNGDFGRAWDR NLTDEDGPYIELMAGVYTENQPDFTWLQPYEEKSFVQYFLPYRELGVVKNASKDLLMNIE PEGEKEVRFKIFATSKQVVNVVLKGDDGKVYYSRQVTITPEELLNETVDVAGEKLNKLNL EITANGKELLYWHAEPDEVKPIPDAAEAALLPEEIKTTEQLYLTGLHLEQYRHATYNPVD YYEEALRRDPIDVRSNNALGLWYIRKGRFRKAEQYLLTAVKTLQKRNPNPYDGEPIYNLG LALKYQGRYNEAYDRFYKSCWNAAWQDAGYFACAQIATMQGRLEDALDEVDRSLIRNWHN HKARALKTTILRQMGKAEEALQVIEDSLAIDKFNFGCRYEKYLITNAAEDLLTLKTMMRG EAHNYDEIALDYCAAGCWTEAASLWNVAIAEDSVTPMTYYYLGWCLVQGKLSGAEQAFAD AASACPDYCFPNRLEAILALQCAMEQNPNDAKAPYYLGNLYYDKRQYDLALEAWETSAGL DDKFPTIWRNLALANFNKKDEEATAIEYMERAFQLDTTDARVLMELDQLYKRVRRPHSER LVFLQQYPELIAQRDDLVLEEITLLNQTGEYEKAKTLLDAHIFHPWEGGEGKVPGQYQFA RVELAKKALEAGNYSEAVSLLAECLEYPHHLGEGKLHGAQENDFYYFMGCAYEGLGQNDK AVECWEQAIVGPTEPAAAMYYNDAKPDKIFYQGLALLKLNRMDEANGRFHKLTTYGEKHL FDKVKMDYFAVSLPDLLIWEDDLTIRNVIHCKYMMALGYWGLNEKEKSVRLLSEVERLDI NHQGIQAFRSLIG >gi|226332228|gb|ACIC01000092.1| GENE 46 53506 - 54888 1311 460 aa, chain - ## HITS:1 COG:ECs5014 KEGG:ns NR:ns ## COG: ECs5014 COG0477 # Protein_GI_number: 15834268 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 4 455 5 475 491 296 38.0 5e-80 MKSYNKKFVYSICLVSAMGGLLFGYDWVVIGGAKPFYELYFGIADSPTMQGLAMSVALLG CLIGAMVAGMMADRYGRKPLLLISAFIFLSSAYATGAFSVFSWFLAARFLGGIGIGIASG LSPMYIAEVAPTSIRGKLVSLNQLTIVLGILGAQIANWLIAEPIPADFTPADICASWNGQ MGWRWMFWGAAFPAAVFLLLACFIPESPRWLAMKGKREKAWSVLSRIGGNRYAERELQMV 
EQTSASKSEGGLKLLFSRPFRKVLVLGVIVAVFQQWCGTNVIFNYAQEIFQSAGYSLGDV LFNIVVTGVANVIFTFVAIYTVERLGRRALMLLGAGGLAGIYLVLGTCYFFQVSGFFMVV LVVLAIACYAMSLGPITWVLLAEIFPNRVRGVAMATCTFALWVGSFTLTYTFPLLNTALG SYGTFWIYSAICVFGFLFFLRALPETKGKSLETLEKDLIK >gi|226332228|gb|ACIC01000092.1| GENE 47 55051 - 55938 612 295 aa, chain + ## HITS:1 COG:PA3571 KEGG:ns NR:ns ## COG: PA3571 COG2207 # Protein_GI_number: 15598767 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 14 291 26 298 307 156 32.0 4e-38 MVKLKDGFTGERALVLPRIIVDKMEEDPLTSMLHITDIGYYPKAKHHFRERKEPINQYVF IYCIDGAGSYRIGDQEYTVSANQYFILPAGVPHSYASNKSTPWTIYWIHFKGILAPFYAK DANRPMDIQPEQHSRISTRINLFEEIFNTLKNGYSNENLRYAFSMFHFYLGTLRYVQQYR NAGTNEAEDEGIVNVAIHYMKENMEKHLSLEEISTQIGYSPSHFSMLFKKQTGHSPLTYF NLLKIQQACLLLDTTDMKINQICYKIGIEDTYYFSRLFSKIMGMSPREYRKSKKG >gi|226332228|gb|ACIC01000092.1| GENE 48 56140 - 58353 2110 737 aa, chain - ## HITS:1 COG:no KEGG:BT_4395 NR:ns ## KEGG: BT_4395 # Name: not_defined # Def: hyaluronoglucosaminidase precursor # Organism: B.thetaiotaomicron # Pathway: Metabolic pathways [PATH:bth01100] # 1 737 1 737 737 1487 98.0 0 MKNNKIYLLGACLLCAVTVFAQNVSLQPPPQQLIVQNKTIDLPAVYQLNGGEEANPHAVK VLKELLSGKQSSKKGMLISIGEKGDKSVRKYSRRIPDHEEGYYLSVNEKEIVLAGNDERG TYYALQTFAQLLKDGKLPEVEIKDYPSVRYRGVVEGFYGTPWSHQARLSQLKFYGKNKMN TYIYGPKDDPYHSAPNWRLPYPDKEAAQLQELVAVANENEVDFVWAIHPGQDIKWSQEDR DLLLAKFEKMYQLGVRSFAVFFDDISGEGTNPQKQAELLNYIDEKFAQVKPDINQLVMCP TEYNKSWSNPNGNYLTTLGDKLNPSIQIMWTGDRVISDITRDGISWINERIKRPAYIWWN FPVSDYVRDHLLLGPVYGNDTTIAKEMSGFVTNPMEHAESSKIAIYSVASYAWNPAKYDT WQTWKDAIRTILPSAAEELECFAMHNSDLGPNGHGYRREESMDIQPAAERFLKAFKEGNN YDKADFETLQYTFERMKESADILLMNTENKPLIVEITPWVHQFKLTAEMGEEVLKMVEGR NESYFLRKYNHVKALQQQMFYIDQTSNQNPYQPGVKTATRVIKPLIDQTFATVVKFFNQK FNAHLDATTDYMPHKMISNVEQIKNLPLQVKANRVLISPANEVVKWAAGNSVEIELDAIY PGENIQINFGKDAPCTWGRLEISTDGKEWKTIDLKQKESRLSAGLQKAPVKFVRFTNVSD EEQQVYLRQFVLTIEKK >gi|226332228|gb|ACIC01000092.1| GENE 49 58559 - 60199 1468 546 aa, chain + ## HITS:1 COG:VC0613 
KEGG:ns NR:ns ## COG: VC0613 COG3525 # Protein_GI_number: 15640633 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Vibrio cholerae # 19 534 132 626 637 316 35.0 6e-86 MKIKHFLPLLLLLGSNEMLTAQEIALTPQPAHLTVKDGRFEFGNQLKAKVAPYQGDSIRM VFESFKKELQEATGIKVSSTQKEAKARIILDLNPQLPAEAYKLNVSKEQVRIEASRPAGF YYALQTLKQLMPRNVMAGVATSDHSQWSLPSVKIEDAPRFEWRGFMLDEGRHFFGKDEIK RVIDMMAIYKMNRFHWHLTEDQGWRIEIKKYPKLTETGAWRNSKVLAYGDVKPDGERYGG FYTQKDIKEIVAYAKKKFIEIIPEIDIPGHSQAAVAAYPEFLACDPENKHEVWLQQGIST DVINVANPKAMQFAKEVIDELTELFPFNYIHLGGDECPTNKWQKNDECKKLLSEIGSSNF RDLQIYFYKQLKDYIATKPADQQRQLIFWNEVLHGNTSILGNDITIMAWIGANAAAKQAA KQGMNTILSPQIPYYINRKQSKLPTEPMSQGHGTETVEAVYNYQPLKDVDAALQPYYKGV QANFWTEWVTEPSVLEYLMLPRLAAVAEAGWTPQEKRNYEDFKERIRKDAELYDLKGWNY GKHIMK Prediction of potential genes in microbial genomes Time: Thu May 12 01:39:05 2011 Seq name: gi|226332227|gb|ACIC01000093.1| Bacteroides sp. 1_1_6 cont1.93, whole genome shotgun sequence Length of sequence - 35056 bp Number of predicted genes - 31, with homology - 31 Number of transcription units - 11, operones - 8 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 192 - 830 439 ## COG0739 Membrane proteins related to metalloendopeptidases 2 1 Op 2 . - CDS 842 - 1507 622 ## COG3382 Uncharacterized conserved protein 3 1 Op 3 . - CDS 1547 - 2332 477 ## COG1496 Uncharacterized conserved protein 4 1 Op 4 . - CDS 2356 - 3522 711 ## PROTEIN SUPPORTED gi|149915191|ref|ZP_01903719.1| 50S ribosomal protein L27 5 1 Op 5 . - CDS 3535 - 4104 730 ## COG0563 Adenylate kinase and related kinases 6 1 Op 6 . - CDS 4138 - 4674 695 ## COG0634 Hypoxanthine-guanine phosphoribosyltransferase 7 1 Op 7 . - CDS 4716 - 6260 1249 ## COG3104 Dipeptide/tripeptide permease - Prom 6285 - 6344 2.9 8 2 Tu 1 . - CDS 6453 - 6674 231 ## BT_4384 hypothetical protein - Prom 6714 - 6773 4.6 + Prom 7229 - 7288 2.8 9 3 Op 1 . 
+ CDS 7333 - 8850 597 ## PROTEIN SUPPORTED gi|225093729|ref|YP_002662469.1| ribosomal protein S15 10 3 Op 2 . + CDS 8924 - 9988 1112 ## BT_4382 hypothetical protein 11 3 Op 3 . + CDS 10010 - 10291 357 ## BT_4381 hypothetical protein + Term 10447 - 10502 7.1 12 4 Op 1 . - CDS 10385 - 10729 379 ## BT_4380 hypothetical protein 13 4 Op 2 . - CDS 10773 - 12161 1610 ## BT_4379 putative oxalate:formate antiporter - Prom 12223 - 12282 5.8 - Term 12382 - 12415 1.3 14 5 Op 1 . - CDS 12452 - 12847 375 ## BT_4378 preprotein translocase subunit SecG 15 5 Op 2 . - CDS 12854 - 13648 563 ## BT_4377 hypothetical protein 16 5 Op 3 . - CDS 13653 - 14180 466 ## BT_4376 hypothetical protein 17 5 Op 4 . - CDS 14167 - 15408 1190 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains - Prom 15463 - 15522 4.6 18 6 Op 1 . - CDS 15532 - 16626 863 ## PROTEIN SUPPORTED gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 19 6 Op 2 . - CDS 16631 - 17668 903 ## BT_4373 hypothetical protein 20 6 Op 3 . - CDS 17686 - 18720 877 ## COG0820 Predicted Fe-S-cluster redox enzyme - Prom 18769 - 18828 3.2 - Term 18786 - 18838 7.3 21 7 Tu 1 . - CDS 18862 - 21000 2464 ## BT_4371 peptidyl-prolyl cis-trans isomerase - Prom 21039 - 21098 5.0 22 8 Op 1 . - CDS 21120 - 22376 939 ## COG1253 Hemolysins and related proteins containing CBS domains 23 8 Op 2 . - CDS 22380 - 23033 577 ## BT_4369 hypothetical protein 24 8 Op 3 . - CDS 23039 - 24364 1413 ## BT_4368 hypothetical protein 25 8 Op 4 . - CDS 24418 - 25713 1052 ## BT_4367 putative outer membrane protein 26 8 Op 5 . - CDS 25700 - 26368 468 ## COG1521 Putative transcriptional regulator, homolog of Bvg accessory factor - Prom 26581 - 26640 6.1 27 9 Op 1 . + CDS 26874 - 28061 782 ## BT_4364 hypothetical protein 28 9 Op 2 . + CDS 28061 - 29635 1360 ## BT_4363 putative alkaline phosphatase + Prom 29637 - 29696 4.4 29 10 Tu 1 . 
+ CDS 29766 - 33086 4028 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) + Term 33118 - 33168 16.2 + Prom 33119 - 33178 2.8 30 11 Op 1 . + CDS 33208 - 34317 1018 ## BT_4361 hypothetical protein 31 11 Op 2 . + CDS 34379 - 34825 523 ## BT_4360 hypothetical protein + Term 34879 - 34937 17.2 Predicted protein(s) >gi|226332227|gb|ACIC01000093.1| GENE 1 192 - 830 439 212 aa, chain - ## HITS:1 COG:NMA2172 KEGG:ns NR:ns ## COG: NMA2172 COG0739 # Protein_GI_number: 15795043 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Neisseria meningitidis Z2491 # 62 171 271 387 430 87 38.0 2e-17 MKNILILLTVLLLPTLAYAQSKSSFSSMEVNHVRVATPGLFAKSNHIYLHLDSLQEHEYA FPLPGAKVISAYGTRGGHSGADIKTCAKDTIRAAFDGVVRMSKPYYAYGNLVVVRHANGL ETIYSHNFKNLVQSGDTVKAGQPIGLTGRTGRATTEHVHFETRINGQHFNPNLIFNLKEG TLRKECIKCTKNGNGVVVKPQANNNRVAQNKK >gi|226332227|gb|ACIC01000093.1| GENE 2 842 - 1507 622 221 aa, chain - ## HITS:1 COG:SSO0658 KEGG:ns NR:ns ## COG: SSO0658 COG3382 # Protein_GI_number: 15897568 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Sulfolobus solfataricus # 44 206 42 208 224 79 35.0 4e-15 MFEIKVSQEIKNACPVFAGAAVYAAVKNTAYCDGLWKEINTFTEDLTTTTQMADIKLQPV IAATREAYKRCGKDPGRYRPSAEALRRRLMRGIPLYQIDTLVDLINLVSLRTGHSIGGFD ADKIQGKHLELGIGKAEEPFEGIGRGILNIEGLPVYRDSFGGIGTPTSDHERTKMDVGTT HILAIVNGYNGKEGLKEAAEMIQSLLRDYAGSDGGELIYFE >gi|226332227|gb|ACIC01000093.1| GENE 3 1547 - 2332 477 261 aa, chain - ## HITS:1 COG:SA1030 KEGG:ns NR:ns ## COG: SA1030 COG1496 # Protein_GI_number: 15926770 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Staphylococcus aureus N315 # 2 257 12 260 263 118 32.0 1e-26 MLGYESLSSYSNISHFVTTRQGGCSEGNYASFNCTPYSGDEAEKVRRNQTLLMEGMSQTP EELVIPVQTHEANCLLIGDAYLSASSQQRQEMLHGVDALITRELGYCLCISTADCVPVLI YDKKHSAIAAIHAGWRGTVAYIVRDTLLRMEKEFGTSGEDVIACIGPSISLASFEVGEEV YEAFQKNGFDMPRISIRKEETGKHHIDLWEANRMQILAFGVPSGQVELARICTYIHHDEF 
FSARRLGIKSGRILSGIMIHK >gi|226332227|gb|ACIC01000093.1| GENE 4 2356 - 3522 711 388 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915191|ref|ZP_01903719.1| 50S ribosomal protein L27 [Roseobacter sp. AzwK-3b] # 6 318 3 316 345 278 46 3e-74 MAESNFVDYVKIYCRSGKGGRGSTHMRREKYCPNGGPDGGDGGRGGHIILRGNRNYWTLL HLKYDRHAMAGHGESGSKNRSFGKDGADKVIEVPCGTVVYNAETGEYLCDVTDDGQEVIL LKGGRGGQGNSHFKTATRQAPRFAQPGEPMQEMTVIMELKLLADVGLVGFPNAGKSTLLS AISAAKPKIADYPFTTLEPNLGIVSYRDGKSFVMADIPGIIEGASQGKGLGLRFLRHIER NSLLLFMVPADSDDIRKEYEVLLNELRTFNPEMLDKQRVLAITKSDMLDQELMDEIEPTL PEGVPHIFISSVSGLGISVLKDILWEELNKESNKIEDIVHRPKDVTRLQQELKDMGEDEE LDYEYEEDADDDDDDLDYEYEEEDWEEK >gi|226332227|gb|ACIC01000093.1| GENE 5 3535 - 4104 730 189 aa, chain - ## HITS:1 COG:CC1269 KEGG:ns NR:ns ## COG: CC1269 COG0563 # Protein_GI_number: 16125518 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate kinase and related kinases # Organism: Caulobacter vibrioides # 2 187 1 186 191 174 45.0 8e-44 MLNIVIFGAPGSGKGTQSERIVEKYGINHISTGDVLRAEIKNGTELGKTAKGYIDQGQLI PDELMIDILASVFDSFKDSKGVIFDGFPRTIAQAEALKKMLAERGQDVSVMLDLEVPEDE LMVRLIKRGKDSGRADDNEETIKKRLHVYHSQTSPLIDWYKNEKKYQHINGLGTMDGIFA DICEAVDKL >gi|226332227|gb|ACIC01000093.1| GENE 6 4138 - 4674 695 178 aa, chain - ## HITS:1 COG:CAC3203 KEGG:ns NR:ns ## COG: CAC3203 COG0634 # Protein_GI_number: 15896450 # Func_class: F Nucleotide transport and metabolism # Function: Hypoxanthine-guanine phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 13 178 8 174 178 123 37.0 2e-28 MDTIQIKDKQFTVSIKEQDIQKEVIRVANEINRDLAGKNPLFLSVLNGSFMFTADLLKHI TIPCEISFVKLASYQGVTSTGVIKEVIGLNEDIAGRTVVIVEDIVDTGLTMQRLLDTLGT RNPEAIHIASLLVKPEKLKVNLNIEYVAMEIPNDFIVGYGLDYDGFGRNYPDIYTVVD >gi|226332227|gb|ACIC01000093.1| GENE 7 4716 - 6260 1249 514 aa, chain - ## HITS:1 COG:CAC0751 KEGG:ns NR:ns ## COG: CAC0751 COG3104 # Protein_GI_number: 15894038 # Func_class: E Amino acid transport and metabolism # Function: Dipeptide/tripeptide permease # Organism: Clostridium acetobutylicum # 7 510 24 520 521 282 
35.0 1e-75 MSSLKGHPKGLYLIFATSTAERFSYYGMRAIFILFLTQALLFDKEAAASIYGSYTGLVYL TPLIGGYIADKYWGIRRSVFWGAVMMAVGQFLMFMSASSLNNTDLAHWLMYGGLGFMILG NGCFKPTVSSLVGQLYEPGDKRLDAAYTIFYMGVNVGSFAAPLICGFLGDTGNPQDFKWG FLASGIMTLFTVVLFETQKNKYLFSPSGEPIGIVPDARREKKEDKAEHISHPKMDKRTKV RNIIIITALTVALIAFFSYAFSDDWISVGIFTACIVFPVLILLDGSLTKVERSRIFVIYI VAFFVIFFWAAYEQAGASLTLFASEQTDRSIFGWEMPASWFQSFNPLFVVILAYIMPGIW GFLNKRNMEPASPTKQAIGLLLLSLGYLFICFGVKDAIPGVKVSMIWLTGLYFIHTMGEI ALSPIGLSMVNKLSPLRFASLMMGIWYLSTATANKFAGMLSGLYPEDGKVKSILGYQIAT MYDFFMLFVIMSGVASLILFLLSKKLQKMMHGVE >gi|226332227|gb|ACIC01000093.1| GENE 8 6453 - 6674 231 73 aa, chain - ## HITS:1 COG:no KEGG:BT_4384 NR:ns ## KEGG: BT_4384 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 73 2 74 74 135 100.0 3e-31 MEKKRIGSNAGKVWRILNEKGEQSMFTLCHELGLTFEDVAIAIGWLARENKILLRKKEGM LYASIENVEFTFG >gi|226332227|gb|ACIC01000093.1| GENE 9 7333 - 8850 597 505 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225093729|ref|YP_002662469.1| ribosomal protein S15 [gamma proteobacterium HTCC5015] # 1 496 4 491 497 234 31 5e-61 IAMKIFPSSSIKKLDAYTIEHEPIASIDLMERAAQALTKAITNRWNPETPVTIFAGPGNN GGDALAIARMMAEKGYKIEVYLFNTKGELSPDCQTNKELVEMMEEVTFHEISTQFVPPTL TPEHLVIDGLFGSGLNKPLSGGFAAVVKYINASPAMVVSIDIPSGLMGEENTFNVKSNII RADVTFSLQLPKLAFLFAENTEFVGEWELLDIQLSEEGIEETETNYEMLEIEEIRSLIKP RQQFAHKGNFGHALLIAGSKGMAGASVLAARACLRSGVGLLTVHAPLCNNDILQTSAPEA MVEADVSETCFAVPTDTDDYQAVGIGPGLGRNEETEAALIEQLEHCQTPTVLDADALNIL ANHRHTLTHLPKGSILTPHPKELERLVGKCQDSYERLMKACELAHTAKVHIILKGAYSAI ITPEGKCYFNQTGNPGMATGGSGDVLTGVILALLAQGYPAEDAAKIGTYIHGLAGDIAQK KQGMIGLIASDIVTCLPTAWRLVSE >gi|226332227|gb|ACIC01000093.1| GENE 10 8924 - 9988 1112 354 aa, chain + ## HITS:1 COG:no KEGG:BT_4382 NR:ns ## KEGG: BT_4382 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 354 1 354 354 624 100.0 1e-177 MKKLIIATGLLMATSAYAQTEVLTGVTRGKDYGVVYALPKTQIEIEIKANKVNYKPGEFS KYADRYLRLTNVSAEPEEYWELNSVKVKPVGVPNSEATYFVKLKDKTVAPLMELTEEGLI 
KSINVPYSSNNAKNAANNAVAPQRKANPRDFLTEEILMASSTAKMAELVAKEIYNIRESK NALLRGQADNMPSDGAQLKIMLDNLNLQEEAMTEMFSGVRNKEEKTFTVRLTPDKEFDNE VAFRFSKKLGIVANNDLAGTPFYISLKDLKTVKIPQEDGKKKKEMEGIAYNVPGQAMVTL TDGKKKLYEGEIPVTQFGIIEYLAPVLFNKNSTIKVYFDPVTGGLLKVDREESK >gi|226332227|gb|ACIC01000093.1| GENE 11 10010 - 10291 357 93 aa, chain + ## HITS:1 COG:no KEGG:BT_4381 NR:ns ## KEGG: BT_4381 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 93 1 93 95 159 100.0 2e-38 MGGEGHMLDMIRRLKDGKEASRLRRERRNDKLGRLRRNNDPYLLPDTTPEEMERIIKDSE KKKEKDNSYFVWGTLIIMGVLIGCAVLLWAIFF >gi|226332227|gb|ACIC01000093.1| GENE 12 10385 - 10729 379 114 aa, chain - ## HITS:1 COG:no KEGG:BT_4380 NR:ns ## KEGG: BT_4380 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 113 1 113 115 220 99.0 1e-56 MATKEKINLLEVIPCRNEHIKAEQEGETIVLSFPRFKRSWMSRYLLPKGMSKDIHVRLEE HGTAVWNLIDGQRTVREIIEKLADHFQHEAGYESRVSTYLSQLQKDGFIKWIID >gi|226332227|gb|ACIC01000093.1| GENE 13 10773 - 12161 1610 462 aa, chain - ## HITS:1 COG:no KEGG:BT_4379 NR:ns ## KEGG: BT_4379 # Name: not_defined # Def: putative oxalate:formate antiporter # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 462 1 462 462 841 99.0 0 MTEQLKQKLNDSAVLRWSVLALVAFTMLCGYFLTDVMSPLKPMLEKELLWDSLDYGFFTS AYGWFNVFLLMLIFGGIILDKMGVRFTGMGACLLMVFGCGLKYYAITTTFPEGAMLFGFK MQVTLAALGYAIFGVGVEIAGITVSKIIVKWFKGKEMALAMGLEMATARIGTTLAMVLTV PLADFFGSTDESGVFHTNIPAPILFCLVMLCVGTIAFFIYTFYDKKLDASLDAEGLEPEE PFRMKDIVYIITNKGFWLIALLCVLFYSAVFPFIKYAADLMVQKYNVDPKLAGTIPGLLP IGAIILTPLFGSLYDRIGKGATLMVIGSVMLIFVHTMFALPILNIWWFATVIMIILGFAF SLVPSAMWPSVPKIIPEKQLGTAYALIFWVQNWGLMGVPLLIGWVLNSYCKGPVVDGAQT YDYTLPMAIFAMFGVLALIVALMLKAENKKKGYGLEEANIQK >gi|226332227|gb|ACIC01000093.1| GENE 14 12452 - 12847 375 131 aa, chain - ## HITS:1 COG:no KEGG:BT_4378 NR:ns ## KEGG: BT_4378 # Name: secG # Def: preprotein translocase subunit SecG # Organism: B.thetaiotaomicron # Pathway: Protein export [PATH:bth03060]; Bacterial secretion 
system [PATH:bth03070] # 1 131 1 131 131 203 99.0 1e-51 MYLLFIILMVIAALLMCFIVLIQNSKGGGLASGFSSSNAIMGVRKTTDFLEKATWGLAIF MVVMSVATAYVVPSSSAAKDVLLEQAQKEQQTNPYNMPTGTAAPQTDATAPAESAPATEA PATETPAPAAE >gi|226332227|gb|ACIC01000093.1| GENE 15 12854 - 13648 563 264 aa, chain - ## HITS:1 COG:no KEGG:BT_4377 NR:ns ## KEGG: BT_4377 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 264 1 264 264 447 98.0 1e-124 MTSVNLQQWIQHPETLNRDTLYELRNQLARYPYFQSLRLLYLKNLYILHDINFGAELRKA VLYIADRRKLFHLIEGERFAVESQKKGLPLSEVLKDEPTVDRTLALIDAFLSTAPEEVTS QTSFDYSMDYTAYLLEETPVVDEPAEAMPKLKGHELIDNFIEKSETDPVVYLKPLKEEEK AKAASSAETNETGRTDETTETNTVEEEDDSCFTETLAKIYVKQQRYSKALEIIKKLSLKY PKKNAYFADQIRFLEKLIINANSK >gi|226332227|gb|ACIC01000093.1| GENE 16 13653 - 14180 466 175 aa, chain - ## HITS:1 COG:no KEGG:BT_4376 NR:ns ## KEGG: BT_4376 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 175 1 175 175 327 100.0 2e-88 MNWIKKIARPLILIVLPAVIIACSVSYKFNGSSINYDKVKTISIADFPIKSEYVYAPLAT KFNEDLKDIFIRQTRLQLLKPNQNADLQIDGEITGYNQYNQAVSADGYSSETKLTITVNV RFVNNTNHAEDFEQQFSAFRTYDSSQLLTAVQDGLIAEMSKEITDQIFNATVANW >gi|226332227|gb|ACIC01000093.1| GENE 17 14167 - 15408 1190 413 aa, chain - ## HITS:1 COG:BMEI0866 KEGG:ns NR:ns ## COG: BMEI0866 COG2204 # Protein_GI_number: 17987149 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Brucella melitensis # 15 263 181 428 528 243 48.0 4e-64 MTKAEIQQVKQRFGIIGNTEALSRAIDVAIQVAPTDLSVLITGESGVGKESFPQIIHQYS RRKHGQYIAVNCGAIPEGTIDSELFGHEKGAFTGAIGERKGYFGEADGGTIFLDEVGELP LPTQARLLRVLESGEFIKVGSSKVQKTDVRIVAATNVNLTQAIAEGRFREDLYYRLNTVP IQIPPLRERGDDVLLLFRKFAADFAEKYRMPAIQLTEDAKKILLAYPWPGNVRQLKNITE QISIIETNREITAAILQTYLPAQNTQRLPALFGTRESKSFESEREILYSVLFDMRQEVAE LKKMVHNMMAERAGQVGQMGQVVATPVVTTTHQPSVPAIIHAVQQPSVCPKEDDDDIQDT EEYVEETLSLDEVEKEMIRKALERHHGKRKSAAKDLNISERTLYRKIKEYELD 
>gi|226332227|gb|ACIC01000093.1| GENE 18 15532 - 16626 863 364 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 [Flavobacteriales bacterium ALC-1] # 2 344 3 343 346 337 47 8e-92 MEENKIRIGITQGDINGVGYEVILKTFSDPTMLELCTPIIYGSPKVAAYHRKALDVQANF SIVNTASEAGYNRLSVVNCTDDEVKVEFSKPDPEAGKAALGALERAIEEYREGLIDVIVT APINKHTIQSEEFSFPGHTEYIEERLGNGNKSLMILMKNDFRVALVTTHIPVREIATTIT KELIQEKLMIFHRCLKQDFGIGAPRIAVLSLNPHAGDGGLLGMEEQEIIIPAMKEMEEKG IICYGPYAADGFMGSGNYTHFDGILAMYHDQGLAPFKALAMEDGVNYTAGLPVVRTSPAH GTAYDIAGKGLASEDSFRQAIYVAIDVFRNRQREKAARVNPLRKQYYEKRDDSDKLKLDT VDED >gi|226332227|gb|ACIC01000093.1| GENE 19 16631 - 17668 903 345 aa, chain - ## HITS:1 COG:no KEGG:BT_4373 NR:ns ## KEGG: BT_4373 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 345 1 345 345 683 100.0 0 MKKYFFYLSMALVAFVVASCGLKGNHTSSGRPYELLVVVDAGVWDRAAGRALHDVLDSDM PGLPQSEPSFRIMYTSPKDYDSTLKLIRNIVIVDIKDIYTKGTFKYAKDVYASPQMILTI QAPNEEVFEKFVEENKQTIIDFFTRAEMNRQITLLEEKHNNFISNKVDSLFGCDIWIPSE LNNSKTGEDFFWASTNTGSADRNFVMYSYPYTDKDTFTKEYFVHKRDSVMKANIPGYKEG VYMSTDSLLTDVRPINVHNDYTMEARGLWRMKGDFMGGPFVSHTRLDQKNQRIITAEIFV YSPDKMKRNLVRQMEASLYTLKLPQEGQQSQIPLGVTREAEPTNK >gi|226332227|gb|ACIC01000093.1| GENE 20 17686 - 18720 877 344 aa, chain - ## HITS:1 COG:STM2525 KEGG:ns NR:ns ## COG: STM2525 COG0820 # Protein_GI_number: 16765845 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster redox enzyme # Organism: Salmonella typhimurium LT2 # 2 334 20 363 388 246 37.0 4e-65 MSKYPLLGMTLIELQSLVKRLGMPGFAAKQIASWLYDKKVTSIDEMTNLSLKYRELLKQN YEVGAEAPVEEMRSVDGTVKYLYPVGENHFVESVYIPDDERATLCISSQVGCKMNCKFCM TGKQGYSANLTAHQIINQIHSLPERDKLTNVVMMGMGEPLDNLEEVLKALDILTGSYGYA WSPKRITVSTVGLRKGLRRFIEESDCHLAISLHSPVTAQRAELMPAEKAFSITEMVELLK NYDFSKQRRLSFEYIVFKGLNDSQVYAKELLKLLRGLDCRMNLIRFHSIPGVALEGADMD TMTRFRDYLTTHGLFTTIRASRGEDIFAACGMLSTAKQEENNKS >gi|226332227|gb|ACIC01000093.1| GENE 21 18862 - 21000 2464 712 aa, chain - ## HITS:1 COG:no KEGG:BT_4371 
NR:ns ## KEGG: BT_4371 # Name: not_defined # Def: peptidyl-prolyl cis-trans isomerase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 712 1 712 712 1311 100.0 0 MATLQNIRSKGPLLVIVIGLALFAFIAGDAWKVMQPHQAHDVGEVNGDALSAQEYQNLVE EYTEVIKLSRGVTALNDEQTNQVRDEVWRSYVNNKLVEKEAKALGLTVSAAEIQDILKAG VHPLLQQTPFRNPQTGAFDKDMLNKFLVDYAKMNESQMPAQYAEQYNNMYKYWSFIQKTL VQSRLAEKYQALVAKALLSNPVEAQDAFDARVNQYDLLMAAVPYSSIVDSTIVVKESELK DLYNKKKEQFKQYQESRDIKYIDVQVTASAEDRAAIQQEVDEATAQLATTTDDYTSFIRS VGSEAPYVDLFYNKTAFPSDVVARLDSASVGSVYGPYYNGADNTINSFKVVAKTAAADSI EFRQIQVFAEDALKTKALADSIYTAIKGGANFADLAKKYGQTGETNWMSSAQYEGAQIDG DNLKFISAINNTGVNEVVNLPLGQANVILQVTNKKAVKDKYKVAVVKREVEFSKETYNRA YNDFSQFIAANPTAEKMIANAEEAGYKLLDRRDLYSSEHTIGGVRGTKEALRWAFSAKPG DVSGLYECGESDHMVAVALVGVTPEGYRPLKAVQDQLRAEIVKDKKAEKIMADMKAANAT SLDQYKAMSGAVSDSLKLVTFAAPAYVSALRSSEPLVGAYASVAEMNKLSAPIKGNAGVF VLQMYGKDKLSDTFNAKDEEATLANMHARFASRLMNDLYLKGKVKDTRYLFF >gi|226332227|gb|ACIC01000093.1| GENE 22 21120 - 22376 939 418 aa, chain - ## HITS:1 COG:CAC1422 KEGG:ns NR:ns ## COG: CAC1422 COG1253 # Protein_GI_number: 15894701 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Clostridium acetobutylicum # 14 415 17 426 428 156 26.0 7e-38 MNIYISLLITMAFSAFFSGMEIAFVSVDKLRFEMERKGGITSRILSIFFKNSNEFISTML VGNNIALVIYGILMAQIIGDNLLAGFIDNHFLMVLAQTVISTLIILVTGEFLPKTLFKIN PNLVLNVAAIPLFICYVILYPVSKLSSGLSCLFLRIFGMKVNKDASDKAFGKVDLDYFVQ SSIDNAANEEELDTEVKIFQNALDFSNIKIRDCIVPRTEVVAVDLTTSLDELKSRFIESG ISKIIVYDGNIDNVVGFIHSSEMFRDPKDWRDNVKDVPIVPETMSAHKLMKLFMLQKRTI AVVVDEFGGTSGIVSLEDLVEEIFGDIEDEHDNTSYICKKIDEDEYVLSARLEIEKVNEM FGLDLPESDDYLTVGGLILNQYQSFPKLHELVRVGRYQFKIIKVTATKIELVRLKVME >gi|226332227|gb|ACIC01000093.1| GENE 23 22380 - 23033 577 217 aa, chain - ## HITS:1 COG:no KEGG:BT_4369 NR:ns ## KEGG: BT_4369 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 25 217 1 193 193 386 100.0 1e-106 
MLRKRSNRLLHKSISITIAFGAIVMLLSFSSCGGKKNTLGDAITERDSLPVMNTLGVTTL ISDSGVTRYRVNTEEWTVYDRKKPSYWAFEKGVYLEQFDSIFNIEASIKADTAYYYDKQK LWKLIGNVDIQNRKGERFNTELLYWNEATQKVYSDKFIRIQQPDRVITGHGFDSNQQMTV YVIQNIEGIFYVDENGGTAQPEVKALPPDSTKKDSVK >gi|226332227|gb|ACIC01000093.1| GENE 24 23039 - 24364 1413 441 aa, chain - ## HITS:1 COG:no KEGG:BT_4368 NR:ns ## KEGG: BT_4368 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 441 1 433 433 765 100.0 0 MKIKTLVAMLFLSAGATTVVAQDASNCNSNSSISHEAVRAGNFKDAYTPWKAVLENCPTL RFYTFTDGYKILKGLMAQIKDKNNPEYQKYFDELMNTHDLRMKYTEEFLAKGTKVSSADE ALGIKAVDYIALAPKVDANQAYQWLSQSVNAVKGESAAATLFYFLQMSLDKLKADPNHKE QFIQDYLLASEYADAAIAAETNEAKKKNFMGIKDNLVALFVNSGTADCESLQSIYGPKVE ANQTDLAYLKKVIDIMKMMRCTESEAYLQASFYAYKIEPTAEAATGCAYQAFKKGDIDGA VKFFDEAIQLETDNVKKAEKAYAAAAVLASAKKLSQARSYCQKAISFNENYGAPYILIAN LYAMSPNWSDESALNKCTYFAVIDKLQRAKAVDPSVAEEANKLIGTYSGHTPQAKDLFML GYKQGDRITIGGWIGETTTIR >gi|226332227|gb|ACIC01000093.1| GENE 25 24418 - 25713 1052 431 aa, chain - ## HITS:1 COG:no KEGG:BT_4367 NR:ns ## KEGG: BT_4367 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 431 1 431 431 818 100.0 0 MVGFKHTIWALLLMMVTGTAIAQNNTNSPYTRYGYGDLSDQSFGNSKAMGGIAFGLRDGA QINPTNPASYTAIDSLTFLFEGGVSLQNMNISGGGLKLNAKNASFDYLAMQFRLAPWMAM SVGLLPYSNVGYTVSDSQTTDNGLAYSRSFTGDGGLHQMYVGAGVKVLKNLSVGVNASYF WGDITRTRGMFYPGTSSYDSYQRKMVTSISDYKLDFGAQYTQALNKKSSLTIGAVYSPKH KLNNDYTSIVIMGASSSSYGTEYKDVLDATFELPNTFGVGFTYNYDKRLTVGADYSLQQW SKTNFGVVTSDENVRQDFNETFTYCDRTKISVGAEYIPNLIGRSYFAHIKYRLGAYYTTP YYKIDGKKASREYGVTAGFGLPVPRSRSILSISGQFVRVKGLETNMVNENIFRVSIGLTF NERWFFKRRVE >gi|226332227|gb|ACIC01000093.1| GENE 26 25700 - 26368 468 222 aa, chain - ## HITS:1 COG:TM0883 KEGG:ns NR:ns ## COG: TM0883 COG1521 # Protein_GI_number: 15643645 # Func_class: K Transcription # Function: Putative transcriptional regulator, homolog of Bvg accessory factor # Organism: Thermotoga maritima # 26 211 55 239 246 87 33.0 1e-17 
MVEVLTESNQSLDSLEALCSKYRIERGIVATVVDLNERILAELAALPFPLLWLNHETPLP VGNLYETPETLGYDRIAAVVGANEQFPHNDILVIDAGTCITYEFIDSKGQYHGGNISPGM QMRYKALHQFTGRLPLIDSNGRKLPMGRDTETAIRAGVLKGMEYEISGYIEAMKHKYPEL LVFLTGGDDFSFDSSVKSAIFADRFLVLKGLNRILNYNNGRI >gi|226332227|gb|ACIC01000093.1| GENE 27 26874 - 28061 782 395 aa, chain + ## HITS:1 COG:no KEGG:BT_4364 NR:ns ## KEGG: BT_4364 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 395 1 395 395 776 100.0 0 MKRTILYPLLTLYLLIFSNGQAIASGNDSIRLSLLTCAPGEEIYSLFGHTAIRYEDPANG IDAVFNYGLFSFNTPNFILRFSLGETDYQLGATDYARFAAEYAFDGRSVWQQTLNLSKEE KAELIRLLQENYLPENRVYRYNFFYDNCATRPRDKIEESIDGKVIYPAEPQDGSLSFRDI VHQYCKGHPWARFGIDLCIGSEADRPITQRQMMFAPFYLMDAFAGAQIVHDSVQRPLVSG KELIVDALPEEEEGGWMPTPFQCSLLLFILTAAATIYSIRKRTGLWGVDLILFGAAGIVG CVLAFLALFSEHPAVSSNFLLLVFHPGQLLLLPYIIYCVRKGKKCWYLTLNLVVLTLFMV LFPLIPQRFDLAVVPLALCLLIRSASNLILTSKKK >gi|226332227|gb|ACIC01000093.1| GENE 28 28061 - 29635 1360 524 aa, chain + ## HITS:1 COG:no KEGG:BT_4363 NR:ns ## KEGG: BT_4363 # Name: not_defined # Def: putative alkaline phosphatase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 524 1 524 524 1088 100.0 0 MKGLLTSLITVLTFTGLQAQSLPTAPKLVVGLTIDQLRTDYLEAFSSLYGERGFKRLWKE GRVFHNAEYTFSGVDRASAMAAIYSGSTPSVNGIISNRWMDVATLRPVNSTDDAAFMGYY TDQTCAPTKLLTSTIADELKIATQGKGIVYAIAPFCDAAIFAAGHAGNGAFWINPTTGKW SGTTYYGEFPWWASQYNDRQAIDSRISSVTWEPVFPRGMYTFLPDWRDVVFKYKFDDDRN NKFRRFITSPFVNDEVNALAEEAIGKGSVGMDDITDLLALTYYAGNYAHKSVQECAMEIQ DTYVRLDRSIANLLDLLDKKVGLQNVLIFVTSTGYTDSESTDSGLYKIPTGEFHLNRCTA LLNMFLMATYGEGKYVEAYHDQQIYLNHKLLEQKQLNLTEVQEKSADFLMQFSGVNEAYS AHRLLLGSWTPEIYRIRNGYHRKRSGDLVIDVLPGWTIVSENGNENKVVRHSYIPAPLIF MGHSVKPAIIQTPVTIDHIAPTLAHFMRIRAPNASSSAPITDLR >gi|226332227|gb|ACIC01000093.1| GENE 29 29766 - 33086 4028 1106 aa, chain + ## HITS:1 COG:CT701 KEGG:ns NR:ns ## COG: CT701 COG0653 # Protein_GI_number: 15605434 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA 
helicase) # Organism: Chlamydia trachomatis # 5 1005 3 917 969 596 38.0 1e-170 MGFNEFLSSIFGNKSTRDMKEIKPWVEKIKAAYPEIEALDNDALRAKTEELKKYIRESAT DERAKVEELKASIESTELEDREEVFAQIDKIEKEILEKYEKALEEVLPVAFSIVKATAKR FTENEEIVVTATEFDRHLAATKDFVRIEGDKAIYQNHWNAGGNDTVWNMIHYDVQLFGGV VLHKGKIAEMATGEGKTLVATLPVFLNALTGNGVHVVTVNDYLAKRDSEWMGPLYMFHGL SVDCIDRHQPNSDARRQAYLADITFGTNNEFGFDYLRDNMAISPKDLVQRQHNYAIVDEV DSVLIDDARTPLIISGPVPKGDDQLFEQLRPLVERLVEAQKALATKYLSEAKRLIASNDK KEVEEGFLALYRSHKCLPKNKALIKFLSEQGIKAGMLKTEEIYMEQNNKRMHEVTEPLYF VIEEKLNSVDLTDKGIDLITGNSEDPTLFVLPDIAAQLSELENQNLTNEQLLEKKDELLT NYAIKSERVHTINQLLKAYTMFEKDDEYVVIDGQVKIVDEQTGRIMEGRRYSDGLHQAIE AKERVKVEAATQTFATITLQNYFRMYHKLSGMTGTAETEAGELWDIYKLDVVVIPTNRPI ARKDMNDRVYKTKREKYKAVIEEIEKLVQAGRPVLVGTTSVEISEMLSKMLTMRKIEHSV LNAKLHQKEAEIVAKAGFSCAVTIATNMAGRGTDIKLSPEVKAAGGLAIIGTERHESRRV DRQLRGRAGRQGDPGSSVFFVSLEDDLMRLFSSDRIASVMDKLGFQEGEMIEHKMISNSI ERAQKKVEENNFGIRKRLLEYDDVMNKQRTVVYTKRRHALMGERIGMDIVNMIWDRCANA IENNDYEGCQMELLQTLAMETPFTEEEFRNEKKDTLAEKTFNIAMENFKRKTERLAQIAN PVIKQVYENQGHMYENILIPITDGKRMYNISCNLKAAYESESKEVVKAFEKSILLHVIDE AWKENLRELDELKHSVQNASYEQKDPLLIYKLESVTLFDAMVNKINNQTISILMRGQIPV QEAPADEQQPRRVEVRQAAPEQRQDMSKYREQKQDLSDPNQQAAASQDTREQQKREPIRA EKTVGRNDPCPCGSGKKYKNCHGQNA >gi|226332227|gb|ACIC01000093.1| GENE 30 33208 - 34317 1018 369 aa, chain + ## HITS:1 COG:no KEGG:BT_4361 NR:ns ## KEGG: BT_4361 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 369 1 369 369 712 100.0 0 MAKFYSLAFVSLLLLAFGSCQSLEQISIDYMQPGDMTFPSQLRKVAIVNNTSTEPDNKLI TQTEKPKENVPEISHATAYANGNVKIAAESLAEEIAHQNYFDVVVICDSALRANDKFPRE STLSQEEVQQLTSDLGVDCIIAMENLQFKATKTVRYIRDFNCYLGTVDVKAYPTVKVYLP SRSKPMTTLHPTDSIFWEEYGGSVTETFAHMIPDAQMLREASEFAGTIPVKQLLPFWKTG KRYLYTGGSVQMRDAAIFVRENSWDRAFELWEQVYNGTKKEKKKMKAALNIAVYYEMKDS LAKAEEWAVKAQQLAQKVDKKNIPENAAYATIDDVPNYYLTTLYANELKERNSQLPKLKM QMERFNDDF >gi|226332227|gb|ACIC01000093.1| GENE 31 34379 - 34825 523 148 aa, chain + ## HITS:1 COG:no KEGG:BT_4360 NR:ns ## KEGG: BT_4360 # Name: not_defined # Def: 
hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 148 1 148 148 223 99.0 2e-57 MKLSQQSLSTIESAIQKAAGKYVCGCDQTAVTDIHLQPDQTSGQLTIYNDDDEELANVMI EEWATYDGDDFMENVKPNLKSILCRMKDAGDFDKVTILKPYSFVLVDEDKETVAELLLID DDTLLVDDELLKGLDKELDEFLKNLLEK Prediction of potential genes in microbial genomes Time: Thu May 12 01:40:18 2011 Seq name: gi|226332226|gb|ACIC01000094.1| Bacteroides sp. 1_1_6 cont1.94, whole genome shotgun sequence Length of sequence - 8036 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 109 - 2319 1826 ## BT_4359 alpha-N-acetylglucosaminidase precursor - Prom 2398 - 2457 4.6 - Term 2505 - 2557 14.9 2 2 Op 1 . - CDS 2579 - 4465 1785 ## BT_4358 hypothetical protein 3 2 Op 2 . - CDS 4500 - 7958 3025 ## BT_4357 hypothetical protein Predicted protein(s) >gi|226332226|gb|ACIC01000094.1| GENE 1 109 - 2319 1826 736 aa, chain - ## HITS:1 COG:no KEGG:BT_4359 NR:ns ## KEGG: BT_4359 # Name: not_defined # Def: alpha-N-acetylglucosaminidase precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 736 9 744 744 1469 100.0 0 MNMRLLLLLLFGSVVMYTSCQSTSAPIDSLAARVTENTSKDQILFRLVTDEADPAKDYFE IDSKDGKVLITGNSDLSLATGLNWYLKYVAGIHLSWNNPSQKLPEVLPLPQKKIRQATAM KNRYYLNYCTYSYSMAFWDWERWEKEIDWMAMHGINMPLSITGMEVVWYNLLKRIGYTTE EINEFISGPAFMAWWQMNNLEGWGGPNPDSWYRQQEALQKKIIARMRELGIEPVFPGYAG MVPRNIGEKLGYQIADPGKWCGFPRPAFLSTEDEHFDSFAAMYYEELEKLYGKAKYYSMD PFHEGGNTEGVDLAKAGTSIMSAMKKANPEAVWVMQAWQANPREAMVSTLDSGDLLVLDL YSEKLPQWGDPESMWYREKGFGKHDWLYCMLLNFGGNVGLHGRMEQLVNGYYNACAHVNG KTLRGVGATPEGIENNPVMFELLYELPWREERFAPDAWLQAYLKARYGNDLSPEVAEAWR ALEHTVYNAPKNYQGEGTVESLLCARPGFHQDRTSTWGYAKLFYSPDSTAKAARLLLSVA DQYKGNNNFEYDLVDVVRQSLADKGNVLLEEISQSYDRKDKDSFGKQSQQFLELILAQDS LLSTRKEFSVSSWLNAARSLGTTEEEKKLYEWNASALITVWGDSIAANRGGLHDYSHREW SGILKDLYYQRWKTFFEQKQRELDGKLDQSAEEPINFYGMEKAWAEKNKTADSSDHKSQT VIETAKSVYRTAIEAK >gi|226332226|gb|ACIC01000094.1| GENE 2 2579 - 
4465 1785 628 aa, chain - ## HITS:1 COG:no KEGG:BT_4358 NR:ns ## KEGG: BT_4358 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 628 1 628 628 1250 99.0 0 MLAAAAIGFTSCNDSFMERYPDTSLTEQTVFSNYNTFKTYAWGLYGVFTNTNILRIPGTN GAYASATSYTSDIYAGYLMRRQGDGNPYAFQNVSSSSSGNGWAFSFVYRVNVMLANIDNS GMTDADKEHWRSVGYFFRAYYYSELIARFGDVPWVDRVLGDSDKEIAYGPRTPRKEVADH VLNDLIYAEEHIKEEGDGENTINVHVVRALLSRFCLFEGTWRKYHALGDEDKYFDACITY SKKLMDSYPELNSDWGEMLTSDLKGMKGIILYKEYVEKELTNYVLTHVERTSTHNVEMPQ HIIDMYLCKDGKTIHNSSEYEWSQDGDNSMNATFRNRDLRLLETVAPPYKVIPSADNTSW EYTSDPKDREFMDIMGITRYTGFGGGNGEAGKHKAFPLMNWSAAILKGMPHFFTNNGGQG FLVARSGNYVYRYYNVWDNSKENEGTSDVPLFKIDEVMLNYAEAKFEKGGTGAEGFNQTV ADLTINKLRDRVGVAHMKVAEINAGFDPKRDQTVDPVLWEIRRERMVELMGEGFGFYDVR RWKKAPWFINKIQYGQWATKEQIGDSGQFVDLEKGYADTTGKKEGFIYMYNDPLVAGKGW LDKYYLYQVPTNEIALNPELAPNNPGWE >gi|226332226|gb|ACIC01000094.1| GENE 3 4500 - 7958 3025 1152 aa, chain - ## HITS:1 COG:no KEGG:BT_4357 NR:ns ## KEGG: BT_4357 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1152 27 1178 1178 2273 100.0 0 MRLVILFLFCFVGLTNATDSYAQSAKVTMNVRNQTVEDVLRAIEKQTEFSFFYNNAHINL KRLVTISADRNDIFKVLDEVFKNTNVEYKVIDKKIILSTELASVEQAHQEKVKVSGRVVD AAGEPVIGASVIEKGSANGTITDMDGNFILNVGSKEAILEISYIGYQGQSLKVTPGKTLS VVLKEDTQSLDEVVVVGFGVQKKANLTGAVSQVKMDDVLGSRPVVNAMSALQGAMPGLQI TPNNDAAGPGQSKSFNIRGTTSINGGGPLVLIDNVPGDIDMLNPEDIESVSVLKDAASAA IYGARAAFGVILVTTKKAKKGDGFHVNYNNNFGFQSSINRPEQADGLEWMQAYLDGEFNA GKYYTGQDIKTWMNYLTEYRKNPGKFQTTGDGVYVDPETGLNYYLNEKDLYANMLDDYGF LQAHNVSLSGGTDKLAYRLSLGYNSEQGILITDKDRYKRLSGSAYISAEITSWLTQSVDI RYAQSDKNMPVTSDKTGLYDMRLPVVYPEGSLTLPDGTSLMTNTPSNVLRMATDNNTIRD NARILSKTVLKPLKGLEVAFEYTFDKTWSNQNVNKASIDYTTVELAKIQTATTSSLETTH QSTDYNAINLYANYRYSWNDTHNLSLMGGFNQESSDWKKLYTYSYDMINEKYPSHSTATG ENKVITDDHRVYTVRGAFYRVNYDYKGKYLFETNGRYDGSSKFPKKNRFGFFPSVSVGWN IARENFMKPVAGDWLSDLKLRGTWGQIGNQGIDPYKFVPTMSQVEKKDVAWLVNGAKPLT LNAPGLVSDSFTWETVETLDFGFDITALNGRLQSTFDWYRRDTKDMLAPGAELPSVVGAS 
APLQNTADLRTKGWELSLTWRDRIGAWGYNVGFNLYDSKTVVTKYHNESKIILKSDGTNN YYEGYEIGSIWGYVTDGYYTADDFEDTNTWKLKEGVVTVDGVSPRPGDIKYQNLRDDGSS TNRIDTGDGTFDNPGDRKIIGNNSLRLQYGINLGVNYKGFDLSVLLQGVGKRDVWISDAR RWPFNSGQFGSLFKDQLDYWKPVDSANGDWTAANPNAEYFRIYGQGNNSSYNTRAQTKYL MNGSYLRVKNVTLSYNFPKSWLAPITLTSLKAFVSCENLHTFTKLMKGYDPERLSWGYPF YRTISFGFNVTL Prediction of potential genes in microbial genomes Time: Thu May 12 01:41:10 2011 Seq name: gi|226332225|gb|ACIC01000095.1| Bacteroides sp. 1_1_6 cont1.95, whole genome shotgun sequence Length of sequence - 73471 bp Number of predicted genes - 58, with homology - 57 Number of transcription units - 28, operones - 15 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 26 - 74 1.8 1 1 Tu 1 . - CDS 170 - 1198 728 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 1280 - 1339 10.6 + Prom 1239 - 1298 7.9 2 2 Tu 1 . + CDS 1329 - 1904 334 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Term 2097 - 2148 1.4 - Term 2148 - 2204 1.1 3 3 Tu 1 . - CDS 2294 - 4189 1268 ## COG0642 Signal transduction histidine kinase - Prom 4213 - 4272 4.0 - Term 4236 - 4276 7.3 4 4 Op 1 . - CDS 4306 - 6945 2911 ## COG0525 Valyl-tRNA synthetase 5 4 Op 2 . - CDS 7010 - 7648 608 ## BT_4352 hypothetical protein + Prom 7636 - 7695 5.2 6 5 Op 1 . + CDS 7743 - 8807 742 ## BT_4351 hypothetical protein + Prom 8809 - 8868 3.9 7 5 Op 2 . + CDS 8912 - 9700 1063 ## COG3956 Protein containing tetrapyrrole methyltransferase domain and MazG-like (predicted pyrophosphatase) domain + Term 9772 - 9818 9.6 - Term 9756 - 9811 14.3 8 6 Op 1 . - CDS 9852 - 10319 452 ## BT_4349 hypothetical protein 9 6 Op 2 . - CDS 10346 - 10732 386 ## BT_4348 hypothetical protein 10 6 Op 3 . - CDS 10780 - 11331 546 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 11 6 Op 4 . - CDS 11318 - 12286 602 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III - Term 12311 - 12353 11.0 12 7 Op 1 . 
- CDS 12373 - 14172 3054 ## PROTEIN SUPPORTED gi|29349753|ref|NP_813256.1| 30S ribosomal protein S1 13 7 Op 2 . - CDS 14221 - 14433 63 ## + Prom 14168 - 14227 8.2 14 8 Tu 1 . + CDS 14341 - 20169 4141 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits + Term 20315 - 20352 1.0 - Term 20027 - 20070 0.2 15 9 Tu 1 . - CDS 20203 - 21336 713 ## COG3344 Retron-type reverse transcriptase - Prom 21369 - 21428 5.7 16 10 Tu 1 . - CDS 21542 - 23728 1638 ## BT_4341 hypothetical protein - Prom 23751 - 23810 8.3 + Prom 23862 - 23921 4.3 17 11 Op 1 . + CDS 24130 - 24330 223 ## BT_4340 hypothetical protein + Term 24398 - 24454 -0.5 + Prom 24399 - 24458 7.3 18 11 Op 2 . + CDS 24513 - 26702 2504 ## COG3968 Uncharacterized protein related to glutamine synthetase + Term 26712 - 26761 11.6 + Prom 26735 - 26794 5.7 19 12 Tu 1 . + CDS 26820 - 27521 516 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases + Term 27567 - 27605 6.2 + Prom 27555 - 27614 6.2 20 13 Tu 1 . + CDS 27679 - 30108 2352 ## COG3525 N-acetyl-beta-hexosaminidase + Term 30130 - 30175 -0.7 - Term 30118 - 30163 -0.5 21 14 Tu 1 . - CDS 30223 - 31173 713 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 22 15 Op 1 . - CDS 31234 - 31881 647 ## BT_4335 hypothetical protein 23 15 Op 2 . - CDS 31901 - 34396 2283 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins - Prom 34423 - 34482 8.0 + Prom 34383 - 34442 6.0 24 16 Op 1 . + CDS 34519 - 35178 634 ## BT_4333 hypothetical protein 25 16 Op 2 . + CDS 35185 - 35835 493 ## BT_4332 hypothetical protein 26 16 Op 3 1/0.125 + CDS 35890 - 37068 563 ## PROTEIN SUPPORTED gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative + Term 37252 - 37284 2.4 + Prom 37274 - 37333 6.9 27 17 Op 1 . + CDS 37445 - 38695 1076 ## COG0477 Permeases of the major facilitator superfamily 28 17 Op 2 . 
+ CDS 38709 - 39305 632 ## COG1259 Uncharacterized conserved protein 29 17 Op 3 . + CDS 39312 - 40010 677 ## COG1385 Uncharacterized protein conserved in bacteria 30 17 Op 4 . + CDS 40026 - 41579 1599 ## BT_4327 hypothetical protein 31 17 Op 5 . + CDS 41596 - 42240 188 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 32 17 Op 6 . + CDS 42286 - 43491 1094 ## BT_4325 hypothetical protein 33 17 Op 7 . + CDS 43503 - 46145 2135 ## BT_4324 hypothetical protein + Prom 46196 - 46255 4.0 34 18 Op 1 . + CDS 46285 - 47205 1009 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase 35 18 Op 2 . + CDS 47286 - 48212 983 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase + Prom 48214 - 48273 2.1 36 19 Op 1 . + CDS 48296 - 49096 944 ## COG2877 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase 37 19 Op 2 . + CDS 49179 - 52019 2907 ## COG0612 Predicted Zn-dependent peptidases 38 19 Op 3 . + CDS 52042 - 53535 1037 ## BT_4319 hypothetical protein 39 19 Op 4 . + CDS 53586 - 54290 207 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 40 19 Op 5 . + CDS 54333 - 54845 234 ## PROTEIN SUPPORTED gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 41 19 Op 6 . + CDS 54872 - 55756 791 ## COG3735 Uncharacterized protein conserved in bacteria 42 19 Op 7 . + CDS 55756 - 56397 426 ## COG0546 Predicted phosphatases + Prom 56413 - 56472 5.8 43 20 Op 1 32/0.000 + CDS 56557 - 56874 533 ## PROTEIN SUPPORTED gi|29349722|ref|NP_813225.1| 50S ribosomal protein L21 44 20 Op 2 . + CDS 56896 - 57165 462 ## PROTEIN SUPPORTED gi|29349721|ref|NP_813224.1| 50S ribosomal protein L27 + Term 57189 - 57239 6.1 + Prom 57224 - 57283 5.6 45 21 Op 1 . + CDS 57315 - 58589 1466 ## COG0172 Seryl-tRNA synthetase + Term 58601 - 58650 2.4 46 21 Op 2 . + CDS 58672 - 59883 971 ## COG0738 Fucose permease + Term 59906 - 59948 7.1 + Prom 59894 - 59953 6.1 47 22 Tu 1 . 
+ CDS 60028 - 62319 2508 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases + Term 62346 - 62382 2.5 - Term 62327 - 62375 9.2 48 23 Op 1 12/0.000 - CDS 62393 - 62746 473 ## COG0853 Aspartate 1-decarboxylase 49 23 Op 2 . - CDS 62768 - 63616 679 ## COG0414 Panthothenate synthetase - Prom 63730 - 63789 9.6 + Prom 63657 - 63716 4.4 50 24 Op 1 . + CDS 63759 - 64595 622 ## COG0297 Glycogen synthase 51 24 Op 2 . + CDS 64615 - 65583 715 ## BT_4306 hypothetical protein 52 24 Op 3 . + CDS 65475 - 66242 487 ## BT_4306 hypothetical protein + Term 66252 - 66310 17.1 - Term 66318 - 66363 7.4 53 25 Op 1 2/0.000 - CDS 66387 - 67769 1321 ## COG1449 Alpha-amylase/alpha-mannosidase 54 25 Op 2 4/0.000 - CDS 67783 - 69057 1320 ## COG0438 Glycosyltransferase 55 25 Op 3 . - CDS 69075 - 71012 1556 ## COG3408 Glycogen debranching enzyme - Prom 71117 - 71176 12.1 + Prom 71094 - 71153 8.9 56 26 Tu 1 . + CDS 71228 - 71893 546 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) 57 27 Tu 1 . - CDS 71901 - 72500 577 ## COG2095 Multiple antibiotic transporter - Prom 72524 - 72583 10.7 + Prom 72451 - 72510 5.8 58 28 Tu 1 . 
+ CDS 72636 - 73298 580 ## BT_4300 Crp family transcriptional regulator Predicted protein(s) >gi|226332225|gb|ACIC01000095.1| GENE 1 170 - 1198 728 342 aa, chain - ## HITS:1 COG:AGl2289 KEGG:ns NR:ns ## COG: AGl2289 COG3712 # Protein_GI_number: 15891252 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 23 321 35 311 323 79 24.0 8e-15 MKDLNNNRIEELLPRYCEGRLSEGERLEVEAWMDESEENKRVATQTFALYLAVDTVQVMK KVDTEKALLKVKGKMSDREVRRTVWWEWAQRAAAILFIPLLTLFIWQNWKGDTGEVAEMM EVKTSPGMTTSLTLPDGTIAYLNSESSLSYPSRFNGDFRKVKLSGEAYFEVAKDPEKKFI LSTTHQSQIEVLGTCFNVEAYEQNTEVITTLIEGKVDFMFEKDAVMKHIILSPREKLVYD SETDKVHLYKTSGKSELAWKDGEVVLDNTPLEEALWMLERRYSVKFVIKNEKLKNSSFTG TFTNQRLEKILEYFKVSSKIRWKHINDDKDGSDRKKEIIEIY >gi|226332225|gb|ACIC01000095.1| GENE 2 1329 - 1904 334 191 aa, chain + ## HITS:1 COG:XF2239 KEGG:ns NR:ns ## COG: XF2239 COG1595 # Protein_GI_number: 15838830 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Xylella fastidiosa 9a5c # 10 184 11 196 206 64 25.0 1e-10 MEELRIKRNDLQLSEIQRGGLKAFETLFRQYYAVLCAYGHKYVDFHDAEEIVQDSLLWIW ENRENLIIESSLSSYLFKMVHHKALNKLAHIDAIKRADTRFYEEMQEMIHDMDFYQIKEL TKRIEDAVAALPESYRQAFVMHRFRDMSYKEIAETLEVSPKTVDYRIQQALKQLRIDLKD YLPLLLPLLFP >gi|226332225|gb|ACIC01000095.1| GENE 3 2294 - 4189 1268 631 aa, chain - ## HITS:1 COG:mlr3786_1 KEGG:ns NR:ns ## COG: mlr3786_1 COG0642 # Protein_GI_number: 13473249 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 370 627 192 462 478 174 35.0 7e-43 MKQLLRICLTGAFFLLFEGLLSPMLADNGKSDKEILVLNSINFNLPWARNFYWYVHDVLQ EQGITARAESLSVPALQDTMEVNAVIEHLRQKYPLPPAAIVIIGDPGWIVCRPLFDDLWK DVPVIITNSRDRLPATLEVLLSHAPLNETNSVPAQEWRKGYNLTTLKQHYYVKETVELIY SLIPDMDRLAFISDDRYISEETRGDVKEAVEKHFPNLPLELLSTTHLSTEMLLDTLRSYK 
ANTGIIYYSWFESHNKEDNNYLFDHIQEIITNFTSSPLFLLSSEDLSNNTFAGGYYVSAE SFGDSLLEIIDRVLHGEQAEDIPESIGGKASAYLCYPVLESYNIPYHLYPERAVYVNEPK SIFHQYCVEILSCSIFLLILIAAITYYIRILRKAYTRLSEAMEKAEQANQLKSAFLANMS HEIRTPLNAIVGFSNMLPDIEDRKEMREYADIIETNTDLLLQLINDILDMSKIEAGTFDF CPSLIDVNQTMEEIEQSMRLRLKKETVTLTFTERLPECTLYTDKNRLIQLISNFVINAIK FTQIGSIRMGYRLKDADTICFYVSDTGCGMSEEQCRHVFERFVKYNPFVQGTGLGLSICQ MIIDRLGGTIGVESEEGKGSTFWFTLPYQQE >gi|226332225|gb|ACIC01000095.1| GENE 4 4306 - 6945 2911 879 aa, chain - ## HITS:1 COG:FN2011 KEGG:ns NR:ns ## COG: FN2011 COG0525 # Protein_GI_number: 19705307 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Valyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 2 878 3 887 887 758 45.0 0 MELASKYNPADVEGKWYQYWLEHRLFSSKPDGREPYTIVIPPPNVTGVLHMGHMLNNTIQ DILVRRARMEGKNACWVPGTDHASIATEAKVVNKLAAQGIKKTDLTRDEFLKHAWDWTEE HGGIILKQLRKLGASCDWDRTAFTMDEERSESVLKVFVDLYNKGLIYRGVRMVNWDPKAL TALSDEEVIYKEEHGKLFYLRYKVEGDPEGRYAVVATTRPETIMGDTAMCINPNDPKNAW LKGKKVIVPLVNRVIPVIEDDYVDIEFGTGCLKVTPAHDVNDYMLGEKYNLPSIDIFNDN GTLSEAAGLYIGMDRFDVRKQIEKDLDAAGLLEKTEAYTNKVGYSERTNVVIEPKLSMQW FLKMQHFADMALPPVMNDELKFYPAKYKNTYRHWMENIKDWCISRQLWWGHRIPAYFLPE GGYVVAVTPEEALAKAKEKTGNAALTMDDLRQDEDCLDTWFSSWLWPISLFDGINHPGNE EISYYYPTSDLVTGPDIIFFWVARMIMAGYEYEGKMPFKNVYFTGIVRDKLGRKMSKSLG NSPDPLELIDKYGADGVRMGMMLAAPAGNDILFDDALCEQGRNFCSKIWNAFRLIKGWTV DDNIQAPDAAKLAVHWFESKQNEVAAEVADLFSKYRLSEALMAVYKLFWDEFSSWYLEMI KPAYGQGIDRTTYSATLCFLDNLLHLLHPFMPFITEELWQQMYERNAEEGESLMVSALSM DTYVDTAFVAQFEVVKGVISNIRSIRLQKNIAQKEPLDLQVLGENPVAEFNAVIQKMCNL SSITVVESKAEGASSFMVGTTEYAVPLGNMIDVEAEIARMEAELKHKEGFLQGVLKKLSN EKFVNNAPAAVLEMERKKQADAESIISSLKESIAALKKA >gi|226332225|gb|ACIC01000095.1| GENE 5 7010 - 7648 608 212 aa, chain - ## HITS:1 COG:no KEGG:BT_4352 NR:ns ## KEGG: BT_4352 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 212 1 212 212 399 100.0 1e-110 MATVEEIEKYCRNCVSRDFVNGKGLVCKRTRELPDFDEECENFEKDEELLKMAPPKPDDF PVSMTEEELLAEENLPKGVLYASVACILGAVAWSLISVSTGLQMGYMAIGVGFLVGFAMR 
QGKGIRPVFGLLGAALALISCVLGDFLSIIGFAAKDYDMTFFEVLAGVDYGEIFSVMVKN VVSMSALFYGIAVYEGYKLSFRAQKHPVGGKI >gi|226332225|gb|ACIC01000095.1| GENE 6 7743 - 8807 742 354 aa, chain + ## HITS:1 COG:no KEGG:BT_4351 NR:ns ## KEGG: BT_4351 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 354 2 355 355 681 98.0 0 MKRKLIIRYILLGVLLLLVWMTQSIPALGNLYSQTVYPVISRFLSFFSNLFPFAIGDLFI FGSITGVIVYPFYARIRKKLPWKKILLRDGEYLLWVYVWFYLAWGLNYSQPNFYQRTHIP YTAYTPEIFQEFVDDYIDSLNSSFVPVKGIHEEQVRDEAVKLYNQLSDSLGVHRPPFPNP RVKTMIFTPFISMVGVTGSMGPFFCEFTLNGDLLPINYPATYTHELAHLLGISSEAEANF YAYQVCTRSQIKGIRFSGYFSILNHVLGNARRLLSEKEYADIINRIRPEIIELAKKNQEY WMAKYSPLIGDVQDWIYDLYLKGNKIESGRKNYSEVIGLLISYQVWKAKNTSDQ >gi|226332225|gb|ACIC01000095.1| GENE 7 8912 - 9700 1063 262 aa, chain + ## HITS:1 COG:BS_yabN KEGG:ns NR:ns ## COG: BS_yabN COG3956 # Protein_GI_number: 16077126 # Func_class: R General function prediction only # Function: Protein containing tetrapyrrole methyltransferase domain and MazG-like (predicted pyrophosphatase) domain # Organism: Bacillus subtilis # 5 259 224 484 489 218 48.0 7e-57 MIHTREEQMEAFGRFLDILDELRVKCPWDRKQTNESLRPNTIEETYELCDALMRDDKKDI CKELGDVLLHVAFYAKIGSETGDFDIKDVCDKLCDKLIFRHPHVFGEVKAETAGQVSENW EQLKLKEKDGNKSVLSGVPAALPSLIKAYRIQDKARNVGFDWEEREQVWDKVKEEIGEFQ AEVANMDKEKAEAEFGDVMFSLINAARLYKINPDNALELTNQKFIRRFNYLEEHTIKEGK NLKDMSLEEMDAIWNEAKKKGL >gi|226332225|gb|ACIC01000095.1| GENE 8 9852 - 10319 452 155 aa, chain - ## HITS:1 COG:no KEGG:BT_4349 NR:ns ## KEGG: BT_4349 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 155 1 155 155 276 98.0 2e-73 MKKLIALVVLLCGFMPVLWAADGCDQHLSREEFRNKQKAFIIEQAGLTKEEAAKFFPVYF ELQEKKKKLNDESWSLMRQGKDDKTTEAQYEEIVAKVCDNRIAADRLDKSYLDRFKKILS NKKIFLVQRAEMRFHREMLKGMNRKDGGNDPKRKK >gi|226332225|gb|ACIC01000095.1| GENE 9 10346 - 10732 386 128 aa, chain - ## HITS:1 COG:no KEGG:BT_4348 NR:ns ## KEGG: BT_4348 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # 
Pathway: not_defined # 1 128 1 128 128 218 97.0 4e-56 MKEEDTLLKKIGKEHSFKVPDGYFENLTSEVMNKLPEKEKVAFKEEHVSTWTRLKPLFYM AAMFVGAALIIRVASSDHKPVADDIAVTVTAMEADTEQVSDEMIDVALDRAMLDDYSLYV YLSDASVE >gi|226332225|gb|ACIC01000095.1| GENE 10 10780 - 11331 546 183 aa, chain - ## HITS:1 COG:mll8140 KEGG:ns NR:ns ## COG: mll8140 COG1595 # Protein_GI_number: 13476734 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 22 181 14 177 208 83 29.0 2e-16 MTPYNEREVLKLLQEESTQRKGFELIVAQYSEQLYWQIRRMVLSHEDANDLLQNTFIKAW TNIDYFRAEAKLSTWLYRIALNECLTFLNRQRATTTVAIDDPEAAIVQKLESDPYYSGDQ VQMLLQKALLTLPEKQRMVFNLKYYQEMKYEEMSEIFGTSVGALKASYHHAVKKIEKFLE EVD >gi|226332225|gb|ACIC01000095.1| GENE 11 11318 - 12286 602 322 aa, chain - ## HITS:1 COG:slr0050 KEGG:ns NR:ns ## COG: slr0050 COG1234 # Protein_GI_number: 16331469 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Synechocystis # 11 314 2 311 326 205 38.0 1e-52 MYICAAMEKFELHILGCGSALPTTRHFATSQVVNLREKLFMIDCGEGAQMQLRRSRLKFS RLNHIFISHLHGDHCFGLLGLISTFGLLGRTADLHIHSPKGLEELFAPLLSFFCKTLAYK VFFHEFETKEPTLIYDDRSVAVTTIPLRHRIPCCGFLFEEKQRPNHIIRDMVDFYKVPVY ELNRIKNGADFVTPEGEVIPNHRLTRPSAPARKYAYCSDTIYRPEIVEQIKGIDLLFHEA TFAQTEQVRARETHHTTAAQAAQIALNAEVKQLVIGHFSARYEDESVLLNEAAAIFPQTV LARENMCITINNYTDTRDYDPL >gi|226332225|gb|ACIC01000095.1| GENE 12 12373 - 14172 3054 599 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|29349753|ref|NP_813256.1| 30S ribosomal protein S1 [Bacteroides thetaiotaomicron VPI-5482] # 1 599 1 599 599 1181 99 0.0 MENLKNVAPIEDFNWDAYENGEAVTSASHEELEKAYDGTLNKVNDREVVDGTVIAMNKRE VVVNIGYKSDGIIPLNEFRYNPDLKVGDTVEVYIENQEDKKGQLVLSHRKARATRSWDRV NAALENEEIIKGFIKCRTKGGMIVDVFGIEAFLPGSQIDVKPIRDYDVFVGKTMEFKVVK INQEFKNVVVSHKALIEAELEQQKKEIIGKLEKGQVLEGTVKNITSYGVFIDLGGVDGLI HITDLSWGRVSDPKEVVELDQKLNVVILDFDDEKKRIALGLKQLTPHPWDALDTDLKVGD KVKGKVVVMADYGAFIEIATGVEGLIHVSEMSWSQHLRSAQDFMKVGDEVEAVVLTLDRE 
ERKMSLGIKQLKQDPWETIEEKYPVGSKHTAKVRNFTNFGVFVEIEEGVDGLIHISDLSW TKKVKHPSEFTQIGADIEVQVLEIDKENRRLSLGHKQLEENPWDVFETVFTVGSVHEGTI IEMLDKGAVVALPYGVEGFATPKHLVKEDGSQAQLDEKLEFKVIEFNKDAKRIILSHSRI FEDVAKAEERAEKKAASGAKKSSNNKREDAPMIQNQAASTTLGDIDALAALKEQLEGKK >gi|226332225|gb|ACIC01000095.1| GENE 13 14221 - 14433 63 70 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDDRYFVLLHREINKWQIIKLHPDFVQCSVHVESRIILDLQYYKKYGYYHLFFGYSFVGQ ENIPTFAPDL >gi|226332225|gb|ACIC01000095.1| GENE 14 14341 - 20169 4141 1942 aa, chain + ## HITS:1 COG:MA3490 KEGG:ns NR:ns ## COG: MA3490 COG1112 # Protein_GI_number: 20092301 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Methanosarcina acetivorans str.C2A # 384 1754 20 1407 1939 564 30.0 1e-160 MNGTLDKIRVQFDYLPLINFAMQQNKVSVIHQLSIENMTSEPFRNIQVQITAEPDFGSIT PVMMEAIPANNSVCLQSFSLVLSANYFAQLTERMSGSLKIEIRSEAETIFTRTYPIDILA YDQWGGINIFPEMLAAFITPNHAVLTPIIKRAAAILEQWTGTPSLDEYQSRNPDRVRKQM AAIYTALTEQQIIYSTIPASFEEHGQRVRLTDSVLAQKLGTCLDMALLYTSCLESIGLNA LIVITKGHAFAGGWLVPETFPDPAIDDVSLLTKRTAEGIYDITLVETTCMNMGHNADFDN AVKSANGKLSDPGSFILAIDIRRARHSGVRPIPQRVLNGQVWEIKEDEDMNRNTTHATPQ SVNPYDLSGSETQTVLTKQLLWERRLLDLSLRNNLLNIRITKNTLQLIPANLACLEDALA EGDEFRILHRPAEWENPAMEFGIYSSIPESDPIADFVNSELSQKRLRFYLPENDLGKALT HLYRSSRTSIEENGANTLYLALGLLKWYETPSSERPRYAPILLLPVEIIRKSAAKGYVIR SREEETMMNITLLEMLRQNFGISVPGLDPLPTDESGINVKLIYSIIRHCIKNQRKWDVEE QAILGIFSFNKFIMWNDIHNNAHKLTQNKVVSSLINGKIEWDVTAKEVDAAYMDRQLSPA DIVLPIIADSSQLEAIYEAVHDKTFILHGPPGTGKSQTITNIIANALYKGKRVLFVAEKM AALSVVQNRLAGIGLAPFCLEIHSNKTKKSTVISQLKETTEIIRRTPPEEFMKEAERLLK LRTELNKYIEALHKEYPFGLSLYDAIIHYQLTDVEPCFDIPSSYLDNLDKDRFSHWEDAI ESLVSTANACGHPYLHPLTGIFIREYSSAIKEEALQTLATFIGLLTAIQSKLPVFSALLE NTDIHPTRKDFNIISAIIRKILDIPELTPELLTTPLLNETLEEYRKVTEHGRKRDEIKAE IENGFTKEVLKINAGPMLAEWNRVSAQWFLPRYFGQRKIKKVIRPYALQPVKPETVQPLL HQVIRYQEELDFTDRYTAKLPSLFGRFGRDEEWPIIDQIIHEVSSLHSLLLSYSKDVAKT SRIKQNLALQLTEGIRTFRDIHSHSLNELYQLVDTLTATEQRLSTTLGITVETLYTNSAD 
WIGIALQQAGIWKENLDKLKDWYQWLQSYNKLNELGLGFIAEEYKEKNIPTDLLTSSFRK SFYQAVIHYIIAKEPTLELFNGKIFNDIITKYKQVSANFEDITKKELFARLASNIPSFTH EAIQSSEVGILQKNIRNNARGISIRKLFDQIPILLSRMCPCMLMSPISVAQYIDADAEKF DLIVFDEASQMPTYEAVGAIARGKNVVIVGDPKQMPPTSFFSVNTIDEDNIEIEDLESIL DDCLALSIPSKYLLWHYRSKHESLITFSNSEYYDNKLMTFPSPDNIESKVRMVAVDGYYD KGKSRQNQAEAQAIVDEIARRLRSEKLRRKSMGVVTFNVVQQALIEDLLSDLFAFHPELE TFALECEEPLFIKNLENVQGDERDVILFSVGYGPDAEGRVSMNFGPLNRAGGERRLNVAV SRARYEMIIFSTLRSDMIDLNRTSSIGVAGLKRFLEYAEKGTKRILNTSTNIRTSEEETS IEKIIADKLRSSGYTVHTDIGCSGYKIDIGIVDTQTPSNYQLGIICDGKNYRRTKTVRDR EIVQNNVLKALGWNICRIWTMDWWEKPDEVMASIQAAIAQDMKAEKPVIVKKEQEKREKN VQETPMQLKSACAYTNFSFISVQNYNSTKIRPFLYAADDIFAKRNNPIIVSQVQKIIENE APISKTLLGKKIISEWGVNRVGPRIDAQLEIIFDTLHLYRIEHDGLIFCWKDEAQYRSYS EYRPDSDRDAADLPPEEIANAIQQILINSISLPLPDLAKACAQVFGFTHMGSNIEASMLR GIQEAVKRNYAKVENGRATIID >gi|226332225|gb|ACIC01000095.1| GENE 15 20203 - 21336 713 377 aa, chain - ## HITS:1 COG:SA2010 KEGG:ns NR:ns ## COG: SA2010 COG3344 # Protein_GI_number: 15927789 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Staphylococcus aureus N315 # 83 334 24 273 338 166 38.0 7e-41 MDGILITILIVTGFLIYKLISSSRKQEGYKRWKATGSGNTTNEETSKNKWEHGAWLRHAL GVVTPKMRTCPVFDEEAVVWCANLLGIEVMRLKEILQDVSSHYREFWIRKRRGGYRMISA PDKELKAIQDTIYHRILLSVNVHPAATGFRRQHSIVDNVRPHLGKRCVLKTDIHDFFISI RSPRVKKTFEKIGYPKNISKVLGELCCMRRHLPQGASTSPVLSNIIAYEMDRKLAVMAEE FGLTYSRYADDLTFSGDVFPKEEVLARVKEIIREEKFEPNHQKTRFLNENDRKIITGVSV SSGVKLTIPKARKREIRKNVYFILTKGLAEHQRRIGSHDPAYLKRLIGSLCYWRSIEPDN TYVADSIAALKRLQKGY >gi|226332225|gb|ACIC01000095.1| GENE 16 21542 - 23728 1638 728 aa, chain - ## HITS:1 COG:no KEGG:BT_4341 NR:ns ## KEGG: BT_4341 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 728 1 719 719 1401 100.0 0 MKYKILIGCMLCLFSFIPKGAANLVLNPYTSQEISADSIIERVMTFAPSYESIVSDYRAN LYIKGKMNIQKKNFILRYVPSMFRLQKGVREYLLETYSDLHYTAPNIYDQKVKASQGTVR GNRGLPGLLEYFSVNIYSSSLLNDERLLSPLAKNGQKYYKYRIDSVMGDPNNLDYRIRFI 
PRTKSDQLVGGYMIVSSNVWSVREIRFSGRSELITFTCWIKMGEVGKKDEFLPIRYDVEA LFKFLGNKVDGNYTASLDYKSIELKERKVRKKEKKKYNLSESFSLQCDTNAYKTDASTFG VLRPIPLSEGEKQLYKDYTYRRDTVSVQQKTKSQAFWGTMGDLMVEDYKFNLSNIGSVRF SPFINPLLFSYSGSNGLSYRQDFRYNRIFRGDKWLRIVPKLGYNFTRKEFYWSLNADFEY WPQKRGFFRLSVGNGNRIYSSKMLDELKAMPDSIFNFDLIHLDYFKDLYFNFRHSFEITN GLDISLGFSAHKRTAVEKSRFVITGDYPMPPPEFMERFRNTYISFAPRIRVEWTPALYYY MNGKRKINLRSDYPTFSVDYERGIEGVFKSTGEYERIEFDLQHKIQLGLMRNIYYRFGFG AFTNQDELYFVDFANFARHNLPVGWNDEIGGVFQVLDGRWYNSSRRYVRGHFTYEAPFLI LRHLMKYTRYVQNERLYISALSMPHLQPYLEVGYGIGTHVFDVGVFVSSENWKFSGIGCK FTFELFNR >gi|226332225|gb|ACIC01000095.1| GENE 17 24130 - 24330 223 66 aa, chain + ## HITS:1 COG:no KEGG:BT_4340 NR:ns ## KEGG: BT_4340 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 66 1 66 66 87 100.0 2e-16 MQMNASMGKHLQFLTQMAMDSTTMKLLYRNINKHYMQVEILVKQMAAEIDRQKNKDGQQE ILESIS >gi|226332225|gb|ACIC01000095.1| GENE 18 24513 - 26702 2504 729 aa, chain + ## HITS:1 COG:slr0288 KEGG:ns NR:ns ## COG: slr0288 COG3968 # Protein_GI_number: 16331104 # Func_class: R General function prediction only # Function: Uncharacterized protein related to glutamine synthetase # Organism: Synechocystis # 5 729 7 724 724 621 44.0 1e-177 MSKMRFYALQELSNRKPLEVTAPSNKLSDYYGSHVFDRKKMQEYLPKEAYKAVTDAIEKG TPISREIADLIANGMKSWAKSLNVTHYTHWFQPLTDGTAEKHDGFIEFGEDGGVIERFSG KLLIQQEPDASSFPNGGIRNTFEARGYTAWDVSSPAFVVDTTLCIPTIFISYTGEALDYK TPLLKALAAVDKAATEVCQLFDKNVTRVFTNLGWEQEYFLVDSSLYNARPDLCLTGRTLM GHSSAKDQQLEDHYFGSIPPRVTAFMKELEIECHKLGIPAKTRHNEVAPNQFELAPIFEN CNLANDHNQLVMDLMKRIARKHHFNVLLHEKPYSSVNGSGKHNNWSLCTDTGINLFAPGK NPKGNMLFLTFLVNVLMMVYKNQDLLRASIMSASNSYRLGANEAPPAILSCFLGSQLSAT LDEIVRQVGNEKMTPEEKTTLKLGIGRIPEILLDTTDRNRTSPFAFTGNRFEFRAAGSSS NCAAAMIAINAAMANQLNEFRASVEKLMEEGVGKDEAIFRLLKETIIASEAIRFEGDGYS EEWRQEAARRGLTNICHVPEALMHYVDNQSKSVLIGERIFNETELNSRLEVELEKYTMKV QIEGRVLGDLAINHIVPTAVTYQNRLLENLCKMKEIFSPEEYEVLSADRKELIREISHRV TSIKVLVREMTEARKVANHKDNYKERAFEYEEKVRPYLDKIRDHIDHLEMEVDDEIWPLP KYRELLFTK 
>gi|226332225|gb|ACIC01000095.1| GENE 19 26820 - 27521 516 233 aa, chain + ## HITS:1 COG:slr0449 KEGG:ns NR:ns ## COG: slr0449 COG0664 # Protein_GI_number: 16332256 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Synechocystis # 22 233 20 233 238 77 24.0 2e-14 MVKMNMSDIDISEPLSDLLAPLNNEQREFLMNNYTIQTYKKNETIYCEGETPSHLMCLIT GKVKIFKDGVGGRSQIIRMIKQREYFAYRAYFAKEDFVTAAAAFEPSVICLIPMTAITTL IAQNNDLAMFFIRQLSIDLGISDERTVSLTQKHIRGRLAESLIFLKESYGLEEDGSTLSI YLSREDLANLSNMTTSNAIRTLSQFATERLITIDGRKIKIIEEEKLKKISKIG >gi|226332225|gb|ACIC01000095.1| GENE 20 27679 - 30108 2352 809 aa, chain + ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 31 636 31 605 757 462 40.0 1e-129 MKKSISLLLLSLLMITPSCQKPKEVTNEYNIVPQPNQLVPKEGRFELSNKVRLVVPSDAP EVKKVADGFAEQLKQTAGISLKEAESVDGKPAISFVLQEGMPKEGYKLSVTPTLITVTAS QPNGFFYGVQTIYQLLPPAVYGKELKKKADWSVPAVEIEDAPRFVHRGLMLDVCRHYAPI EYIYKFIDLLAMNKMNVFHWHLTDDQGWRIEIKKYPKLTEIGSKREKTLVDYYYVNYPQV FDGIEHGGYYTQEQIKEVVAYAASKYINVIPEIEMPGHALAALAAYPELSCDSTQTYKVS PTWGVFEQVFCPSETTFKFFEGVMDEVVELFPSEYIHIGGDECPKTAWKNSAFCQQLIRQ LGLKDDVTPSKVDGMKHSKEDKLQSYFVTRMEKYLNGKGRNIIGWDEILEGGLAPNATVL SWRGVEGGLNAAKAGHNAIMAPMPYAYLDFYQEDPEIAPTTIGGYTTLKKTYSYNPVPDD ADELVKKHIIGMQGNLWREYMKTSDRVDYQAFPRAMAIAETGWTLDANKNWKSFCERMVT EFERLEVMDTKPCLNFFDVNINTHADENGPLMVLLETFYPNAEIRYTTDGSEPTYGSTLY EQPFALEGNIDLKAAAFKDGKMLGKVTNKPLYGNLLAGKPFTVNYTMGWTGDIFGDNDVL GADKTTFGLTNGKRGNNASYTPWSSFAIVEGKDLEFIVHLDKPTEVRKVVFGSLFNPAMR MLPAGGVAVEVSADGQQYTQIAEKVLKHDCPETGRIAFTDSIEFEPTQATFLKVKIKNGG TLRNGVNFEKNNGPEVIPAELWIDEIEAY >gi|226332225|gb|ACIC01000095.1| GENE 21 30223 - 31173 713 316 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 10 310 5 303 306 279 
48 4e-74 MAEIEKVKCLIIGSGPAGYTAAIYAGRANLCPVLYEGLQPGGQLTTTTDVENFPGYPEGI SGPQLMEDLRAQASRFGTDVRFGIATAADLSKAPYKITIDGDKVIETEALIISTGATAKY LGLEDEKKYAGMGVSACATCDGFFYRKKVVAVVGGGDTACEEAIYLAGLASKVYLVVRKP YLRASKIMQERVQKHEKIEVLFEHNVVGLFGDNGVEGMNVVKRWGESDEERYSLPIDGFF LAIGHKPNSDIFKEYIDTDEVGYIITDGDSPRTKIPGVFAAGDVADPHYRQAITAAGSGC KAAIEAERYLSAKGII >gi|226332225|gb|ACIC01000095.1| GENE 22 31234 - 31881 647 215 aa, chain - ## HITS:1 COG:no KEGG:BT_4335 NR:ns ## KEGG: BT_4335 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 215 1 215 215 372 97.0 1e-102 MRKYIFSVLIALLSLPVIAQQQQSQAKVVLEKTAEAFKKAGGVRADFTLKAVNDGHLEGR ENGIIQLKGEKFMLKTSETTTWFDGKTQWSYMVRNDEVNVSNPTQEELQQINPYTFLYMY QKGFSYKLGTVKVYQGKAVWEVILTANDKKQELESITLYVTKSTYEPVYIQLQQRGQKTR NEITVTAYQTGLDYADHVFTFDRKAYPTAEVIDLR >gi|226332225|gb|ACIC01000095.1| GENE 23 31901 - 34396 2283 831 aa, chain - ## HITS:1 COG:lin1423 KEGG:ns NR:ns ## COG: lin1423 COG1674 # Protein_GI_number: 16800491 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Listeria innocua # 291 815 250 753 762 399 42.0 1e-110 MAKKKIDKEAERTPSSPGRIVSIFKNETVHFVIGLMLVIFSVYLLLAFSSFFFTGAADQS IIDSGNSADLAAVNNNVKNYAGSRGAQLASYLINDCFGISSFFILVFLAVAGLKLMRVRI VRLWKWFIGCTLLLIWFSVFFGFALMDHYQDSFIYLGGMHGYNVSRWLVSQVGVPGVWMI LLITAVCFFIYISARTIIWLRKLFALSFLKRQKKEEEKETAEEGTQEFTTSQPQEVEFNL KRTYKQTPPPAPVMDIQAEEPKEESPVNAPEPDDELPSADEAEGVTMVFEPTVSDVVPPA AQDELPGEDEPGFQVETATSEEEYQGPEQEPYNPMKDLENYRFPTIDLMKHFENDDPTID MDEQNANKDRIINTLRSFGIEISTIKATVGPTVTLYEITPEQGVRISKIRGLEDDIALSL SADGIRIIAPIPGKGTIGIEVPNKNPKIVSGQSVIGSKKFQESKFDLPIVLGKTITNEVF MFDLCKMPHVLVAGATGQGKSVGLNAIITSLLYKKHPAELKFVLVDPKKVEFSIYSVIEN HFLAKLPDGGEPIITDVTKVVQTLNSVCVEMDTRYDLLKMAHVRNVKEYNEKFINRRLNP EKGHKFMPYIVVVIDEFGDLIMTAGKEVELPIARIAQLARAVGIHMIIATQRPTTNIITG TIKANFPARIAFRVSAMMDSRTILDRPGANRLIGKGDMLFLQGADPVRVQCAFIDTPEVE EITKFIARQQSYPTPFFLPEYVSEDSGSEVGDIDMGRLDPLFEDAARLVVIHQQGSTSLI 
QRKFAIGYNRAGRIMDQLEKAGIVGPTQGSKARDVLCIDDNDLEMRLNNLQ >gi|226332225|gb|ACIC01000095.1| GENE 24 34519 - 35178 634 219 aa, chain + ## HITS:1 COG:no KEGG:BT_4333 NR:ns ## KEGG: BT_4333 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 219 1 219 219 375 99.0 1e-103 MEKESQTIFEKNVIEFVTVAAEFCAFLERAEHMKRKAFVDTSLKILPLLYLKASLLPKCE TIGDEAPETYVTEEIYEILRINLAGLMGEKDDYLDVFVQDMVYSDQPIKKSISEDLADIY QDIKDFIFVFQLGLNETMNDSLAICQENFGLLWGQKLVNTLRALHDVKYNQQNENDEEDN EEENNELSDDDYCCEEDGCHCHDDECHCHEDGCHCRNDE >gi|226332225|gb|ACIC01000095.1| GENE 25 35185 - 35835 493 216 aa, chain + ## HITS:1 COG:no KEGG:BT_4332 NR:ns ## KEGG: BT_4332 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 216 1 216 216 424 100.0 1e-117 MIVRRSINKDELKELPKTVFPGRIHVIQSEAETEKAVAYLQSQPILGIDSETRPSFTKGQ SHKVALLQISSDECCFLFRLNMTGLTQPLVDLLENPAVIKVGLSLKDDFMMLHKRAPFTQ QSCIELQDYVRQFGIQDKSLQKIYAILFKEKISKSQRLSNWEADVLSDGQKQYAATDAWA CLNIYNLLQELKRTGNYEVESLPVEEEVDANAPANR >gi|226332225|gb|ACIC01000095.1| GENE 26 35890 - 37068 563 392 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative [Thermococcus barophilus MP] # 1 392 1 393 396 221 36 9e-57 MHKVYLKPGKEESLKRFHPWIFSGAIARFDGEPEEGEVVEVYTSKKEFIAEGHFQIGSIA VRVLSFRQEPIDHDFWKRKLQIAYDMRCGIGIAVNPTNDTYRLVHGEGDNLPGLVIDIYA RTAVMQAHSAGMHVDRMTIAEALSEVMGDKIENIYYKSETTLPFKADLFPENGFLKGGSS DNIAREYGLKFHVDWLKGQKTGFFVDQRENRSLLEHYSKDRSVLNMFCYTGGFSFYAMRG GAKLVHSVDSSAKAIDLTNKNVELNFPGDARHEAFAEDAFKYLDRMGDQYDLIILDPPAF AKHKDALRNALQGYRKLNAKAFEKIKPGGILFTFSCSQVVTKDNFRTAVFTAAAMSGRSV RILHQLTQPADHPVNIYHPEGEYLKGLVLYVE >gi|226332225|gb|ACIC01000095.1| GENE 27 37445 - 38695 1076 416 aa, chain + ## HITS:1 COG:STM3113 KEGG:ns NR:ns ## COG: STM3113 COG0477 # Protein_GI_number: 16766414 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function 
prediction only # Function: Permeases of the major facilitator superfamily # Organism: Salmonella typhimurium LT2 # 1 411 1 408 418 413 54.0 1e-115 MSIKVRLIIMNFLQFFVWGSWLISLGGYMGRELHFEGGQIGAIFATMGIASLVMPGIIGI IADKWFNAERLYGLCHIAGAACLFYASTVTNYDQMYWAMLLNLLVYMPTLSLANTVSYNA LEQYKCDLIKDFPPIRVWGTIGFICAMWAVDLTGFKNSSAQLYVGGASALLLGLYSFTLP ACKPAKTEKKTLLSSFGLDAFVLFKRKKMAIFFLFSMLLGAALQITNTYGDLFLGSFASI PEFADSFGVKHSVILLSISQMSETLFILAIPFFLRHFGIKQVMLISMFAWVFRFGLFGFG DPGSGLWMLILSMIVYGMAFDFFNISGSLFVEQETSSSIRASAQGLFFMMTNGLGAIIGG YASGAVVDAFSVYADGKLVSREWPDIWFIFAAYALVIGILFALVFKYKHQRENKVN >gi|226332225|gb|ACIC01000095.1| GENE 28 38709 - 39305 632 198 aa, chain + ## HITS:1 COG:MT1877 KEGG:ns NR:ns ## COG: MT1877 COG1259 # Protein_GI_number: 15841299 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mycobacterium tuberculosis CDC1551 # 6 158 3 144 164 97 41.0 2e-20 MDKKVELQVLNITNSQAQVGAFALLLGEVNGERQLPIIIGPAEAQATALYMKGVKTPRPL THDLFMTIIGVLGASLLRVLIYKAKDGIFYSYIYLKKDEEIIRIDTRTSDAVGMAIRAEC PILIYESILEQECLRISNEERRHPEESDEEAEDEKKRDLPRNVTSMSLEEALEQAIKDEN YELAAKIRDRINSRNQNH >gi|226332225|gb|ACIC01000095.1| GENE 29 39312 - 40010 677 232 aa, chain + ## HITS:1 COG:ECs3822 KEGG:ns NR:ns ## COG: ECs3822 COG1385 # Protein_GI_number: 15833076 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 15 223 29 240 252 97 32.0 2e-20 MHVFYTPDIQKNYELPEEEAQHCTRVLRLGIGDEITLTDGKGHFYKAEITVATNKRCLVA IKETIFQEPLWPCHLHIAMAPTKNMDRNEWLAEKATEIGFDELTFLNCRFSERKVIKTER IEKILISAIKQSLKARLPKLNEMTDFDKFIAQEFQGQKFIAHCYEGEKPQLKNVLKAGED ALVLIGPEGDFSEEEVKKAIECGFTPISLGKSRLRTETAALVACHTMNLLNQ >gi|226332225|gb|ACIC01000095.1| GENE 30 40026 - 41579 1599 517 aa, chain + ## HITS:1 COG:no KEGG:BT_4327 NR:ns ## KEGG: BT_4327 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 517 1 515 515 915 99.0 0 MSMAKKMISRLSVLAVLIVFLAACSKQTEYTNVIPADATAVASIDLKSLANKAGMNDKEN 
EAAKQKLLEAMKSGMNAATFQQLEKVINNPGASGLDPEAPIYIFSSPQISGGAFVAKVSN EDDLHASLDVMAKEQICQPVSEADGYSFTTLNGNLLAFNETTAIIISASRTSQTEAAKEA ITKLMKQTADNSIAKSGAFQKIAKQKSDINFFASMSAIPSTYRSQISMGLPSEIKPEDIT ILGGLNFEKGKIALKTENYTENDAVKALLKKQMESFGKTNGTFVKYFPASTLMFINMGVK GDGLYNLLSENKEFRSTVSIAKADEVKELFNSFNGDISAGLINVTMNSAPTFLAYADVKN GNALEALYKNKQSLGMRKGEDIMELGKDEYVYKTRGMNIFFGIKDKQMYATNDELLYKSI GKTVDKSIKDAPYAADMKGKTVFMAINAEAILDLPVVKMLVGFGGKEFKTYSDLASKVSY LSVSSEGETSETDLCLKDKDVNALKQIVDFAKQFAGM >gi|226332225|gb|ACIC01000095.1| GENE 31 41596 - 42240 188 214 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 18 209 22 210 223 77 27 3e-13 MNSIHLQQTLPQVFADRNSVTSDVWHQDLIFRKGEMYLIEAASGTGKSSLCSYLYGYRND YQGIINFDETNIKAYSVKQWVDLRKHSLSMLFQDLRIFTELTAIENVQLKNNLTGHKKKK EILSLFEKLGISDKLNVKAGKLSFGQQQRVAFIRAFCQPFDFLFLDEPISHLDDENSRIM GEIIIDEAGKQGAGVIATSIGKHIELPYKKVLQL >gi|226332225|gb|ACIC01000095.1| GENE 32 42286 - 43491 1094 401 aa, chain + ## HITS:1 COG:no KEGG:BT_4325 NR:ns ## KEGG: BT_4325 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 401 1 401 401 763 99.0 0 MNTLVWKLLRQHISIGQLAGFFLANLFGMMIVLLSVQFYKDVIPVFTEGDSFMKKDFIIA TKKISTLGSFAGKSNTFSPEDIADLKKQSFTKTIGAFTPSQFKVSAGLGMQKAGIHLSTD MFFESVPDEFVDIKLDKWHFDEETHTIPIIIPRNYLNLYNFGFAQSRSLPKLSEGLMGLI QMDIMMRGNGRVEQYKGNIVGFSNRLNTILVPQSFMNWANKNFAPDAEAQPARLIIEVSN PADSSIASYFQKKGYETEDGKLDAGKTTYFLRLIVGIVLGVGLFISILSFYILMLSIFLL LQKNTTKLESLLLIGYSPNRVALPYQILTVGLNIIVLILSVGLVCWLRSYYIESIRLLFP QLETGSLWVAVSMGILLFLIVSVINILAVKRKVLSIWMHKS >gi|226332225|gb|ACIC01000095.1| GENE 33 43503 - 46145 2135 880 aa, chain + ## HITS:1 COG:no KEGG:BT_4324 NR:ns ## KEGG: BT_4324 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 880 1 880 880 1730 98.0 0 MKNTTTLQSKRSITVLLGVLLYFSSLSAQTMQDTIIANFSLMERIPKEKLYLHLDKPFYG AGEKIWFKGYLVNAITHQDNAQSNFIITELINRSDSIVERKKIRRDSLGFHNAFTLPATL 
PAGDYYLRGYSNWMLNEDPDFFFSRNIKIGNSIDNTIVSSIEYQQEDDTHYTAKIKFTSN VQAVFENTTIKYLYLENGKIKNKGKKKTDENGWISISLPDLKSPVARRIEVEFDDPQYIY KRTFHLPVFTNDFDVKFFPEGGALINIPHQNVAFKAQGADGFSKEIEGFLFNSKGDTLTN FRSEHNGMGIFTMNPVNNETYYVTVRTNDSITKRFDLPAIEPKGISIAMSHYKQEIRYEI QKTEATEWPQKLFLLAHTRGKLAILQPINPKRTFGKMNDSLFTEGITHFMLIDEQGNALS ERLIFVPDHKPNQWQITTDQPTYGKREKVSLQIAAKDSEGNPVEGTFSVSITDRKSIQPD SLADNILSNLLLTSDLKGYVEDPAFYFLNQDARTLRSIDYLMMTHGWRRHKMENVLRTPS LNFTNYIEKGQTISGRIMGFFGANVKKGPICVLAPKYNIIATTETDEKGQFIVNTSFRDS TTFLVQARTKKGFAGVDILMDPPQYPVATHKAPYFNGATTFMEDYLMNTRDQYYMEGGMR VYNLKEVTVTAKRERPSSKSIYTGGINTYTVEEDRLQGYGQTAFDAASRLPSVTITNGSE IHIRNNSEPAIIVIDDIVYEDASDILKDIQVSDMSSISLLRGADAVILGPRASGGAVVIT LKDPRNLPARPAQGIITYTPLGYSESVEFYHPTYDTPEKKNAQRSDFRSTVYWNPELRLD AEGKATIEYYTPDSTAPEDIIIEGVDKNGKVCRILQTINK >gi|226332225|gb|ACIC01000095.1| GENE 34 46285 - 47205 1009 306 aa, chain + ## HITS:1 COG:TP0637 KEGG:ns NR:ns ## COG: TP0637 COG0324 # Protein_GI_number: 15639624 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Treponema pallidum # 4 286 20 306 316 228 44.0 8e-60 MPDYDLITILGPTASGKTPFAAALAHELNTEIISADSRQIYRGMDLGTGKDLADYTVDGH PIPYHLIDIADPGYKYNVFEYQRDFLISYESIKQKGCLPVLCGGTGMYLESVLKGYKLMP VPENPELRARLANHSLEELTEILKQYKTLHNSTDVDTVKRAIRAIEIEEYYAVHPVPERE FPKLNSLIIGVDIDRELRREKITRRLKQRLDEGMVDEVRRLTEQGISPDDLIYYGLEYKF LTLYVIGKLTYEEMFTELETAIHQFAKRQMTWFRGMERRGFTIHWVSAELPMEEKIAFVI EKLRGN >gi|226332225|gb|ACIC01000095.1| GENE 35 47286 - 48212 983 308 aa, chain + ## HITS:1 COG:TM0358 KEGG:ns NR:ns ## COG: TM0358 COG1597 # Protein_GI_number: 15643126 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Thermotoga maritima # 10 284 6 281 304 112 28.0 8e-25 MSVEPGKWGVIYNPKAGTRKVQKRWKEIKEYMDSKGVNYDYVQSEGFGSVERLAKILANN GYRTIVIVGGDGALNDAINGIMLSDAEDKENIALGMIPNGIGNDFAKYWGLSTEYKPAVD 
CIINHRLKKIDVGFCNFYDGKEHQRRYFLNAVNIGLGARIVKITDQTKRFWGVKFLSYVA ALFSLIFERKLYRMHLRINDEHIRGRIMTVCVGSAWGWGQTPSAVPYNGWLDVSVIYRPE FLQIISGLWMLIQGRILNHKMVKSYRTRKVKVLRAQNASVDLDGRLLARHFPLEIGVLSE KTTLIIPN >gi|226332225|gb|ACIC01000095.1| GENE 36 48296 - 49096 944 266 aa, chain + ## HITS:1 COG:FN1224 KEGG:ns NR:ns ## COG: FN1224 COG2877 # Protein_GI_number: 19704559 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase # Organism: Fusobacterium nucleatum # 12 266 30 284 286 263 50.0 3e-70 MIELKNNPAGNFFLLAGPCVIEGEEMAMRIAERVVGVTEKLQIPYVFKGSYRKANRSRLD SFTGIGDEKALKVLKKVHDTFGVPTVTDIHSADEAEMAAEYVDILQIPAFLCRQTDLLVA AAKTGKTINIKKGQFLSPLAMQFAADKVVEAGNKNVMLTERGTTFGYQDLVVDYRGIPEM QTFGYPVILDVTHSLQQPNQTSGVTGGMPQLIETVAKAGIAVGADGIFIETHENPAVAKS DGANMLKLDLLEGLLTKLVRIREAIK >gi|226332225|gb|ACIC01000095.1| GENE 37 49179 - 52019 2907 946 aa, chain + ## HITS:1 COG:BB0536 KEGG:ns NR:ns ## COG: BB0536 COG0612 # Protein_GI_number: 15594881 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Borrelia burgdorferi # 9 894 16 885 933 306 29.0 2e-82 MKHLLRGFFIAVLFICCNFQLVLAQPMQTLPVDKNVRIGKLDNGLTYYIRHNALPEKRVE FHIAQKVGSILEEPQQRGLAHFLEHMAFNGTKNFPGDETGLGIVPWCETKGIKFGTNLNA YTSIDKTVYRISNVPTDNVSVVDSCLLILHDWSSAINLADKEIDKERGVIREEWRSRNSG MQRIMTNALPVMYPDSKYADCMPIGSLDVINNFPYQDIRDYYAKWYRPDLQGIMIVGDIN VDEMEAKLKKVFADVKAPVNPAERIYYPVADNQEPQIFIGTDKEIETPSISFFFKSEAFP DSLKNTINYYGIQYMISMGVTMLNSRLAEIRQQANPPFTGASAGYGDFFVAKTKNAFGVD ASSKIDGIELAMKTILEETERARRFGFTETEYDRARANYLQRVESAYNEREKMKNDTYVN EYISNFLDNEPMPGIEYEYAMMNQLAPNIPVAAINQVMQQLITDNNQVVLLAGPEKEGVK YPTKEEIAALLKQMKSFDLKPYEDKVSNEPLLKEEPKGGKIVSEKAGDIYGTTKLVLSNG VKVYIKTTDYKADQILMKGTSLGGSSQFPDKEILNISQINSVAMVGGIGNFSKVDLSKAL AGKRASVGAGVSATTETVSGSCSPKDFETMMQLTYLTFTAPRKDNEAFESYKNRMKAELQ NADANPMTAFSDTVSYALYGNHPRSFSMKENMVDKIDYDRVMEMYKDRFKDASDFTFYFV GNIDVEKMKPMIAKYLGGLPSINRKETFKDTKMEIRKGQYKNEFAKKQETPMATIMFLFN 
GTCKYDLRNNLTLSILDQALDMVYTAEIREKEGGTYGVSCSGSLTKYPKEQLVLQIVFQT DPAKKDKLSGIVIEQLEKMAKEGPSAEHMQKIKEYMLKKYKDAQKENGYWLGNLDEYFYT GIDYTKDYETLVNSITAKDVQEFLAKLMKQNNEIQVIMTVPEEEAK >gi|226332225|gb|ACIC01000095.1| GENE 38 52042 - 53535 1037 497 aa, chain + ## HITS:1 COG:no KEGG:BT_4319 NR:ns ## KEGG: BT_4319 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 497 1 496 496 895 100.0 0 MMIFNELRKHGRLASKRHPMYEKNKVAKIFGYIGVAFWAGYLIFFGTTFAFGFADMVPNR EPYHVMNAVALIFILALDFLLRIPFQKTPTQEVKPYLLLPVKRNRIIDFLLIRSGLSLFN LFWLFMFVPFSFITITKYFGISGVLTYLIGIWLLIVANNYWYLLCRTLINERIWWVLLPI AFYGGLGCLAFIPEDSPLFYFFMDLGDGYINGNILCFLGTILVIAILWLINRKIMSGLIY AELAKVDDTRIKHVSEYKFFERYGEVGEYMRLELKMLLRNRRCKGALRNVLLVVIAFSCL LSFSSLYDTSTMTTFICVYNFAVFGMIILSQLMSFEGNYIDGLMSRKESIMSLLKAKYYI YTIGEIVPFVLMIPAIVMDKVPLLGIFAWFFYTTGFIYFCFFQLAVYNKQTVALNEKVTS RQTNSAIQMVVNFGAFGIPLILYSVLNAFLGETAAYSILFAIGLGFTLTSPLWIRNVYKR FMKRRYENMEGFRDSRQ >gi|226332225|gb|ACIC01000095.1| GENE 39 53586 - 54290 207 234 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 231 1 240 245 84 26 2e-15 MISINNLQKKFGEKLAVNIDHYEINQGDMLGLVGNNGAGKTTLFRLMLDLLKADNGNVVI NEIDVSQSEDWKNFTGAFIDDGFLIDYLTPEEYFYFIGKMYGLKKEEVDERLLPFERLMS GEVIGQKKLIRNYSAGNKQKIGIISAMLHYPQLLILDEPFNFLDPSSQSIIKHMLKKYSE EHNATVIISSHNLNHTIDVCPRIALLENGVIIRDIMNENNSAEKELEDYFNVEE >gi|226332225|gb|ACIC01000095.1| GENE 40 54333 - 54845 234 170 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 [Haemophilus parasuis 29755] # 58 169 69 174 175 94 41 1e-18 MGVNFKLIFILTGLIFILSSCRTSAPRLDYQALARASILLGVDINMEDNHKLYLESAEWI GVPYRGGGDSKRGTDCSGLTYQIYRKVYRTQVPRNTEDLRKESTKVAKRNLREGDLVFFS SSRSRKKVAHVGIYLKNGKFIHASTSKGVIVSRLSEDYYTRHWISGGRIR >gi|226332225|gb|ACIC01000095.1| GENE 41 54872 - 55756 791 294 aa, chain + ## HITS:1 COG:SMc02488 KEGG:ns NR:ns ## COG: SMc02488 COG3735 # Protein_GI_number: 15966800 # Func_class: S Function unknown # Function: 
Uncharacterized protein conserved in bacteria # Organism: Sinorhizobium meliloti # 21 294 84 358 358 85 28.0 1e-16 MKSFIGAVIFICVALSANAQLLWKVSGNGLSSPSYIMGTHHLAPLSVKDGITGLQKAMDE TQQVYGELKMSDIQSPATIQKMQKMMMIESDTTLTTLLSPEDYETANKFCKENLMMDLNM APKIKPAFLLNNIAVVAYIKHIGNYNPQEQLDTYFQTQATQKGKKVDGLETPDFQFNLLY NGISLQRQAQLLMCTLNNIDKEVESLKRLTDAYMKQDLETMLKINEERKGNQCDALPTEE DALIYDRNKTWAQKLPAIMKAAPTFVAVGALHLPGNKGLLSLLKKQGYTVEAVK >gi|226332225|gb|ACIC01000095.1| GENE 42 55756 - 56397 426 213 aa, chain + ## HITS:1 COG:MA2967 KEGG:ns NR:ns ## COG: MA2967 COG0546 # Protein_GI_number: 20091785 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Methanosarcina acetivorans str.C2A # 1 210 63 272 279 180 45.0 2e-45 MNYKTYLFDFDYTLADSSRGIVTCFRNVLNRHQYTNVTDEAIKRTIGKTLEESFSILTGV TDWEQLTAFRQEYRLEADVHMNVNTRLFPDTLSTLKELKERGARIGIISTKYRFRILSFL DEYLPENFLDIVVGGEDVQAAKPSPEGIKFALEHLGRTPQETLYIGDSTVDAETAQNAGV DFAGVLNGMTTADELRAYPHRFIMENLSGLLYI >gi|226332225|gb|ACIC01000095.1| GENE 43 56557 - 56874 533 105 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29349722|ref|NP_813225.1| 50S ribosomal protein L21 [Bacteroides thetaiotaomicron VPI-5482] # 1 105 1 105 105 209 100 3e-53 MYAIVEINGQQFKAEAGQKLFVHHIEGAENGSTVEFEKVLLVDKDGNVTVGAPTVEGAKV VCQVISNLVKGDKVLVFHKKRRKGYRKLNGHRQQFTELTITEVVA >gi|226332225|gb|ACIC01000095.1| GENE 44 56896 - 57165 462 89 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29349721|ref|NP_813224.1| 50S ribosomal protein L27 [Bacteroides thetaiotaomicron VPI-5482] # 1 89 1 89 89 182 100 4e-45 MAHKKGVGSSKNGRESHSKRLGVKIFGGEACKAGNIIIRQRGTEFHPGENIGMGKDHTLF ALVDGTVKFKVGREDRRYVSIIPAEATEA >gi|226332225|gb|ACIC01000095.1| GENE 45 57315 - 58589 1466 424 aa, chain + ## HITS:1 COG:aq_298 KEGG:ns NR:ns ## COG: aq_298 COG0172 # Protein_GI_number: 15605830 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Seryl-tRNA synthetase # Organism: Aquifex aeolicus # 1 423 1 422 425 327 43.0 2e-89 MLTIKQITENTEAVLRGLEKKHFKNAKETIDEVIALNDKRRTTQNLLDKNLAEVNSLSRT 
IGQLMKEGKKEEAEAARARVAELKESNKALDATMTQAATDMQNVLYTIPNIPYDSVPEGV GAEDNVVEKMGGMETELPKDALPHWELAKKYDLIDFDLGVKITGAGFPVYKGKGAQLQRA LINFFLDEARKSGYTEIMPPTVVNAASGYGTGQLPDKEGQMYHCEVDDLYLIPTAEVPVT NIYRDVILEEKQLPIMNCAYTQCFRREAGSYGKDVRGLNRLHEFSKVELVRIDKPEHSKE SHQQMLDHVEGLLQKLELPYRILRLCGGDMSFTAALCFDFEVYSEAQKRWLEVSSVSNFD TYQANRLKCRYRSAEKKTELCHTLNGSALALPRIVAALLENNQTPEGIRIPKALVPYCGF DMID >gi|226332225|gb|ACIC01000095.1| GENE 46 58672 - 59883 971 403 aa, chain + ## HITS:1 COG:XF1462 KEGG:ns NR:ns ## COG: XF1462 COG0738 # Protein_GI_number: 15838063 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Xylella fastidiosa 9a5c # 55 403 4 368 377 185 32.0 1e-46 MTKNTKSFVLPLAFIGVMFFSLGFALGINSVLVPVLKGSLEISSAESYLIIAATFIPFLI FGYPAGLTIKKIGYKRTMVLSFLMFAIAFGLFIPSASYESFPLFLLASFISGSANAYLQA AVNPYITILGPIDSAAKRISIMGICNKLAWPIPPLFLAFLIGKEVTDITVADLFLPFYVI IAAFIALGIIAYMAPLPEVKAVGEDDSEEAAEACPYAAKKTSIWQFPHLLLGCLALFLYV GVETVSLGTLVDYATSLHLENAAMYAWIAPIGIVIGYICGIIFIPKYMSQATALKICSVL AIIGSVLVVLTPADISIYFISFMALGCSLMWPALWPLAMADLGKFTKSGSSLLIMAMAGG AVIPTLFGYLKDISDIQKAYWICLPCFLFILYYGVAGYKIRTK >gi|226332225|gb|ACIC01000095.1| GENE 47 60028 - 62319 2508 763 aa, chain + ## HITS:1 COG:TM1640 KEGG:ns NR:ns ## COG: TM1640 COG0493 # Protein_GI_number: 15644388 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Thermotoga maritima # 307 760 5 459 468 427 50.0 1e-119 MNKIISKERFSEKVFKLEIEAPLIAKSRKAGHFVIVRVGDKGERMPLTIAGSDLKKGTIT LVIQEVGLSSTRLCELNEGDYITDVVGPLGQATHIENFGTVVCAGGGVGVAPMLPIVQAL KAAGNRVITVLAGRNKDLIILEKEMRESSDEVIIMTDDGSYGRKGLVTEGVEEVIKREKV DKCFAIGPAIMMKFVCLLTKKYEIPTDVSLNTIMVDGTGMCGACRITVGGKTKFVCVDGP EFDGHQVDFDEMLKRMGAFKKIEREEMNKLQPECEATKEIDEKSRNAAWRQELRKSMKPK ERTAIPRVEMNELDAEYRSHSRKEEVNQGLTAEQAVTEAKRCLDCANPGCMEGCPVGIDI PRFIKNIERGEFLEAAKTLKETSALPAVCGRVCPQEKQCESKCIHLKMNEKPVAIGYLER FAADFERESGQISVPVIAEKNGIKVAVIGSGPAGLSFAGDMAKYGYDVTVFEALHEIGGV 
LKYGIPEFRLPNKIVDVEIDNLVKMGVTFIKDCIVGKTISVEDLKEEGFKGIFVASGAGL PNFMNIPGENSINIMSSNEYLTRVNLMDAASEDSDTPVAFGKNVAVIGGGNTAMDSVRTA KRLGAERAMIIYRRSEEEMPARIEEVKHAKEEGVEFLTLHNPIEYIADEQGCVKQVILQK MELGEPDASGRRSPVAIPGATETIDIDLAIVSVGVSPNPIVPSSIKGLELGRKGTIAVDD NMESSIPMIYAGGDIVRGGATVILAMGDGRKAAAAMNEQLQSK >gi|226332225|gb|ACIC01000095.1| GENE 48 62393 - 62746 473 117 aa, chain - ## HITS:1 COG:NMA1492 KEGG:ns NR:ns ## COG: NMA1492 COG0853 # Protein_GI_number: 15794392 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate 1-decarboxylase # Organism: Neisseria meningitidis Z2491 # 1 109 1 109 127 130 62.0 6e-31 MMIEVLKSKIHCARVTEANLNYMGSITIDEDLLDAANMIPGEKVYIADNNNGERFETYII KGERGSGKICLNGAAARKVQPDDIVIIMSYALMDFEEAKSFKPTVIFPDPATNSVVK >gi|226332225|gb|ACIC01000095.1| GENE 49 62768 - 63616 679 282 aa, chain - ## HITS:1 COG:CAC2915 KEGG:ns NR:ns ## COG: CAC2915 COG0414 # Protein_GI_number: 15896168 # Func_class: H Coenzyme transport and metabolism # Function: Panthothenate synthetase # Organism: Clostridium acetobutylicum # 1 279 1 279 281 234 42.0 1e-61 MKVIHTIKDLQAELTALRAQGKKVGLVPTMGALHAGHASLVKRSVSENGVTVVSVFVNPT QFNDKNDLAKYPRTLDADCRLLEDCGAAFAFAPSVEEMYPQPDTREFSYAPLDTVMEGAF RPGHFNGVCQIVSKLFDAVQPDRAYFGEKDFQQLAIIREMVRQMDYKLEIVGCPIVREED GLALSSRNKRLSARERENALNISQTLFKSRTFAASHTVSETQKMVEEAIEDAPGLRLEYF EIVDGNTLQKVSSWEDSLYVVGCITVFCGEVRLIDNIKYKEI >gi|226332225|gb|ACIC01000095.1| GENE 50 63759 - 64595 622 278 aa, chain + ## HITS:1 COG:TM0895 KEGG:ns NR:ns ## COG: TM0895 COG0297 # Protein_GI_number: 15643657 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen synthase # Organism: Thermotoga maritima # 6 177 2 169 486 84 27.0 2e-16 MTKANKVLFITQEITPYVSESEMANIGRNLPQAIQEKGREIRTFMPKWGNINERRNQLHE VIRLSGMNLIIDDTDHPLIIKVASIQSARMQVYFIDNDDYFQNRLQTADENGVEYDDNDS RAIFYARGVLETVKKLRWCPDVIHCHGWMTALAPLYIKKAYKDEPSFRDAKVVFSVYEDD FKNTLSEDFATKLMLKGITKKDLAGLKEPVDYAALCKLAVDYSDGIIQNSAKVDESIIEY ARQSGKLVLDYQDPENYANACNEFYDQVWETTTNKEEE >gi|226332225|gb|ACIC01000095.1| GENE 51 64615 - 65583 715 
322 aa, chain + ## HITS:1 COG:no KEGG:BT_4306 NR:ns ## KEGG: BT_4306 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 302 1 302 542 586 98.0 1e-166 MKAKYALIALLAITFFGCDDNTGGLGLGMFPGNDQNIKGKLSTFDVTTESVKTGDIYAKT NIGYIGKFTDETFGTYQAGFLAQLNCPDGLTFPEPYKEVTDASGNVISATGRMVVDDKDP ENKDVTFIKDGNQIIGNIRAVELYLWYDSYFGDSLTACRLSVYELGGNGKETLNLDNAYY TDINPEDFYDSQNILGTKAYTAVDLSVKDSIRNLSTYVPSVHIAFKEDIATRVGGNILTA ARKAKNADKEFNSQLFREAFQGIYVKSDYGDGTVLYIDQPQMNVVYKCYATDSITGKNYR RKTEVVRTLLITLIVCLLQLVK >gi|226332225|gb|ACIC01000095.1| GENE 52 65475 - 66242 487 255 aa, chain + ## HITS:1 COG:no KEGG:BT_4306 NR:ns ## KEGG: BT_4306 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 255 297 542 542 467 100.0 1e-130 MLCNRFYYRKKLQKKDGSGKDSTYYSYRVFATTREVIQANQLKNDPERIDALIKEDKNTY LKSPAGIFTEATLPISDIQNELTGDTLNAVKLTFTNYNQTGDKKFGMAIPSTVMLVRKKF QDSFFKDNKLSDGVSSYLTSHTSSTNQYVFSNITKLVNACIAEKEEAKKNAGSSWDETKW LQENPDWNKVVLIPVLVTYDSSNTTTGQANIIRIQHDLKPGYVRLKGGSLGKTNPDYKLK LEVISTDFGLTTKSN >gi|226332225|gb|ACIC01000095.1| GENE 53 66387 - 67769 1321 460 aa, chain - ## HITS:1 COG:MA4052 KEGG:ns NR:ns ## COG: MA4052 COG1449 # Protein_GI_number: 20092845 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-amylase/alpha-mannosidase # Organism: Methanosarcina acetivorans str.C2A # 1 387 1 390 396 265 36.0 2e-70 MRTICLYFEIHQIIHLKRYRFFDIGTDHYYYDDYANETGMNEVAERSYIPALSTLIEMVK NSGGAFKVALSISGVALEQLEIHAPAVIDLLHQLNDTGCCEFLCEPYSHGLSSLANEDCF REEVLRQRDKMKQMFGKEPKVFRNSSLIYSDEIGGLVASMGFKGMLTEGAKHVLGWKSPH YVYHCNQAPSLKLLLRDFKLSDDIGLRFSNSEWPEYPLFADKYINWIDALPQEEQVINIF MELSALGMAQPLSSNILEFLKALPECAKSKGITFSTPTEIVSKLKSVSQLDVPYPMSWVD EERDTSSWLGNILQREAFNKLYSVAERVHLCDDRRIKQDWDYLQASNNFRFMTTKNTGIW LNRGIYDSPYDAFTNYMNILGDFIKRVDSLYPVDVDNEELNSLLTTIKNQGEEIAELQKE VTKWQAKAEKGAVKAPKKVAAKKEPVAKATRKKATAKKEE >gi|226332225|gb|ACIC01000095.1| GENE 54 67783 - 69057 1320 424 aa, chain - ## HITS:1 COG:Ta0340 KEGG:ns NR:ns ## COG: Ta0340 
COG0438 # Protein_GI_number: 16081471 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Thermoplasma acidophilum # 1 417 20 388 388 214 30.0 2e-55 MKVLMFGWEFPPHILGGLGTASYGLTKGMSQQDDLEITFCIPKPWGDEDQSFLRIIGMNS TPVVWKNVGWDYVKGRVGSYMDPQLFYDLRDHIYADFNYLNTNDLGCIEFSGRYPDNLHE EINNYSIVAGVVARQQEFDIIHSHDWLTYPAGIHAKQVSGKPLVIHVHATDFDRSRGNVN PTVYAIEKNGMDNADHIMCVSELTRQTVIHKYFQDPKKVSTVHNAVSPLSQEIQDIVPQK NPSEKVVTFLGRITMQKGPEYFVEAAAMVLHRTRNVRFVMAGSGDMMDQMIRLVAERGIA DRFHFPGFMKGKQVYEVLKASDVYIMPSVSEPFGISPLEAMQCSVPSIISKQSGCAEILD KCIKTDYWDIHAMADAIYSICTYPAMYEYLKEEGKKEVDEIKWENVGYKVRGIYDEVIKN YGKQ >gi|226332225|gb|ACIC01000095.1| GENE 55 69075 - 71012 1556 645 aa, chain - ## HITS:1 COG:MA0905 KEGG:ns NR:ns ## COG: MA0905 COG3408 # Protein_GI_number: 20089784 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen debranching enzyme # Organism: Methanosarcina acetivorans str.C2A # 1 636 22 669 680 287 32.0 4e-77 MSYLRFDKTLMTNLEESLQREILRTNKAGAYHCTTIVDCNTRKYHGLLVIPVPNIDDENH VLLSSLDETVIQHGAEFNLGLHKYQGNNFSPNGHKYIREFDCEHIPATTYRVGGVVLRKE KIFVHHENRILIRYTLLDAHSATTLRFRPFLAFRSVREYTHENAQASRDYQLVENGIKTC MYPGYPELFMQLNKACEFHFQPDWYRGFEYPKEQERGYDFNEDLYVPGYFEVDIKKGESI VFSAGTSEVTPRRLKQTFEAEVADRTPRDSFYHCLKNSAHQFHNQQEGEHYILAGYPWFK CRARDMFISLPGLTLALDEVDQFEDVMKTAEKAIRSFINDEPAGYKIYEMEHPDVLLWAV WALQQYAKETSREQCRQKYGELLKDIIEFIRQRKHENLFLHENGLLYANGTDKAITWMNS TVNGHPVIPRTGYIVEFNALWYNALRFVADLVREDGNVLLADALDAQAEVTGKSFIEVFR NEYGYLLDYVDGNMMDWSVRPNMIFAVAFDYSPLDRAQKKQVLDIATKELLTPKGLRTLS PKSGGYNPNYVGPQIQRDYAYHQGTAWPWLMGFYMEAYLRIYKMSGLSFIERQLIGLEDE LTIHCISSLPELFDGNPPFKGRGAVSFAMNVAEVLRILKLLSKYY >gi|226332225|gb|ACIC01000095.1| GENE 56 71228 - 71893 546 221 aa, chain + ## HITS:1 COG:Rv1337 KEGG:ns NR:ns ## COG: Rv1337 COG0705 # Protein_GI_number: 15608477 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Mycobacterium tuberculosis H37Rv # 7 190 35 223 240 93 35.0 2e-19 
MKRDIQRIMLATAVPLFLIFILFLLKIVEVGMDWDFSHMGIYPLEKRGITGILTHPLIHS GFSHLLANTLPLFFLSWCLFYFYRGIAGRIFMIIWIGAGLLTFIIGKPGWHIGASGLIYG LAFFLFFSGILRKYVPLIAISLLVTFLYGGIIWHMFPYFSPANMSWEGHLSGGIMGTLCA FGFLNHGPQRPEPFADETDEEEEESPEEDINNTTDSTDITS >gi|226332225|gb|ACIC01000095.1| GENE 57 71901 - 72500 577 199 aa, chain - ## HITS:1 COG:BMEI0883 KEGG:ns NR:ns ## COG: BMEI0883 COG2095 # Protein_GI_number: 17987166 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Multiple antibiotic transporter # Organism: Brucella melitensis # 7 193 4 207 209 76 29.0 2e-14 MFTGFNWQQMISAFIVLFAVIDIIGSIPIIINLKEKGKDVNATKATVISFALLIGFFYAG DMMLKLFHVDIESFAVAGAFVIFLMSLEMILDIEIFKNQGPIKEATLVPLVFPLLAGAGA FTTLLSLRAEYASVNIIIALILNMLWVYFVVSMTGRVERFLGKGGIYIIRKFFGIILLAI SVRLFTANITLLIEALHKS >gi|226332225|gb|ACIC01000095.1| GENE 58 72636 - 73298 580 220 aa, chain + ## HITS:1 COG:no KEGG:BT_4300 NR:ns ## KEGG: BT_4300 # Name: not_defined # Def: Crp family transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 220 1 220 220 439 100.0 1e-122 METMFDTLLQLPLFQGLCHEDFTSILDKVKLHFIKHKAGETIIKSGNPCTQLCFLLKGEI SIVTNAKENIYTVIEQIEAPYLIEPQSLFGMNTNYASSYVAHTEVHTVCISKAFVLSDLF RYDIFRLNYMNIVSNRAQNLYSRLWDEPTLDLKSKIIRFFLSHCEKPQGEKTFKVKMDDL ARCLDDTRLNISKTLNELQDNGLIELHRKEILIPDAQKLL Prediction of potential genes in microbial genomes Time: Thu May 12 01:42:31 2011 Seq name: gi|226332224|gb|ACIC01000096.1| Bacteroides sp. 1_1_6 cont1.96, whole genome shotgun sequence Length of sequence - 15023 bp Number of predicted genes - 9, with homology - 8 Number of transcription units - 4, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 52 - 87 -0.8 1 1 Tu 1 . - CDS 125 - 289 100 ## - Prom 333 - 392 6.7 + Prom 88 - 147 7.7 2 2 Op 1 . + CDS 368 - 1855 843 ## BT_4299 hypothetical protein 3 2 Op 2 . + CDS 1874 - 4963 2412 ## BT_4298 hypothetical protein 4 2 Op 3 . + CDS 4982 - 7033 1594 ## BT_4297 hypothetical protein 5 2 Op 4 . 
+ CDS 7061 - 8080 874 ## BT_4296 hypothetical protein 6 2 Op 5 . + CDS 8090 - 9934 1301 ## BT_4295 putative chitobiase + Prom 9965 - 10024 3.7 7 3 Op 1 . + CDS 10044 - 10751 518 ## BT_4294 hypothetical protein + Prom 10761 - 10820 2.8 8 3 Op 2 . + CDS 10845 - 12590 1431 ## BT_4293 hypothetical protein + Term 12667 - 12722 11.5 9 4 Tu 1 . - CDS 12712 - 14805 1495 ## COG5545 Predicted P-loop ATPase and inactivated derivatives - Prom 14859 - 14918 3.8 Predicted protein(s) >gi|226332224|gb|ACIC01000096.1| GENE 1 125 - 289 100 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MWVELGYAYKGYTFVFVCMSKFENIWVCLTAIIHLLKLGVLKLIMNSEYISYLF >gi|226332224|gb|ACIC01000096.1| GENE 2 368 - 1855 843 495 aa, chain + ## HITS:1 COG:no KEGG:BT_4299 NR:ns ## KEGG: BT_4299 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 495 1 495 495 1016 100.0 0 MRKRRVYWTLFITAFAWLASCSDDDINGSSGFNPNQPIEITEFYPDSGGIATPMIIEGRN FGTDTTGMKVYFEDVDGIRHPAGLVSSNGSRIYAFVPKGLTFKREMNILVERRTPDGQEY IGKAPDQFLYKTQTSVSTVAGLASPDNNINTVGGDLATCTFSSPFYLCIDGEDNIFVVDR KGDSGKDKQPNTTCRNEKGEGVNGNISMISIASNSSIVLKYGTAYINAPAYSDEKDAEAV YIPDDAGMKYYDMQKLLNYVPRYRTVLKSEELSTVDENNWKHCFVINKLDHMIYTVMWKG QLVRINPKNRTAEILLKKISNVATGDGGKAGSDSYIAFSPIKGEENVLYVSLADFHQIWR VDVSKITPEDKDTYNGESYAGKAIYEGVMNGKGWEDGLLKNAKFRHPRQICFTDDGKMYI ADSGNSCIRVIDTTMPKERAAVTTPIGLPGAEGYKDGGPEIAKFHFPCGVAVNSDGTIVY VADTQNKVIRKLSIE >gi|226332224|gb|ACIC01000096.1| GENE 3 1874 - 4963 2412 1029 aa, chain + ## HITS:1 COG:no KEGG:BT_4298 NR:ns ## KEGG: BT_4298 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1029 1 1029 1029 2054 99.0 0 MRNNFLLIVLIMALIPLTSYSQDNKKKQFTVAGVVIDKTGEPLPGTTIYVKNAPGVGTSA DMDGKFSIKIDANQILVIQAIGMKSIEKLVTKDEMTLKVTLEEDDTKLDEVVVTGMTSQK KISVVGAISTIDVAQLSTPGTSLNNMIGGRLAGVITMQSSGEPGKNISNFWIRGISTFGA SSGALVLIDGIEGRLEDIDPDDVKSFSILKDASATAVYGTRGANGVVLVTTKRGETGKLT ITGRATLKVSHIKRLPEYLGAYDYALLANEARAMSGEDDLYSRLELDLIKYNLDKDLYPD 
VNWIDEIMKRTSIQQNYYINAQGGGDIARYYLSLGYQDEGAAYRQEDNLFKKPLSYKKIT YRANIDMNLTKTTKVYFGLDGYISNYTSPGGQNTNTVWSAVRQLTPLMFPKTYSDGTFPS YGKHDLASPYVMLNNTGYTQDETQRNMLTLKLEQEFGGFLKGMTASVQAMSENINYFNEG RYIWPDLYRATGRSSQGVLIKTLRTAKENMKYTQNNNRYRKYYMEAKANWDRTFGLHSLG ALALYYMEDVHNTQWDYDAMGINAIPQRRQNLSGRVSYGYNNCYFIDANFGYTGSAQFEK GKRFGFFPSIAVGWVPTSYNWVNEAIPWLSFLKFRGSYGQVGNDQISGDRFPYLTLINDN AASYWGYRERGITETVKGADNLQWEVAKKLNFGIDAKFFHDKLSVTVDFFRDTRDHIFQD RVTLPKFVGMVTVPKSNVGKMHSYGSDGNFEFFHQINKDMSFTVRGNYTFSQNIIDYYEE NKLPYDYLSVTGKPFNILRGYIAEGLFSSKEEINTSADQSGFGRIRPGDIKYRDVNGDGI INDDDKVPLSYSNQLPRVMYGFGADFHWKDLTLSFLFKGSAKVDYYRSGLGNDAGWIPFY NGELGNVIKLANNPKNRWTPAWYSGTTATENPHAEFPRLSYGGNTNNSQLSSFWKRNGSF LRFQELSLKYNLKTPWIKKTGLSSVDLEFVANNLFTIDKVKYFDPEQASANGAAYPIPAT YTFQVYLKF >gi|226332224|gb|ACIC01000096.1| GENE 4 4982 - 7033 1594 683 aa, chain + ## HITS:1 COG:no KEGG:BT_4297 NR:ns ## KEGG: BT_4297 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 683 1 683 683 1380 99.0 0 MKKLFKIILGATVGSTLLLSSCNFLDVDPYFEATFKEDSIFHSKKNAEGYLWNTPKGFPD AGAIWGNSWNPGESASDEITLKYQTNEFWGLQFSVGTINSRNLPIQNQWYDMYVIVARCN KMLKEVYNVPDMNEMDRRRYIGYVHFMRGYAYYHLLMNWGPLIIVGDEELSTSEPAEYYN RERATYDESVDYICDEFRLATQGIYSADEQSVNYYQRPTKGAAMALIARLRLFQASPLFN GGAAARKCFGTWKRKSDGAYYVNQEYNPRRWAVAAAAAKQLTKMGYELHTVEADAQNPYP LASNVPTANFPDGAGNIDPYHSYSDMFTGEGIIQTNKEFIWAMESSNVTNYTHHSFPVKF GGWGSMSVPQRVIDCYLMADGRTIHNSSAEYPYEPDFSRLTGESKKLGTYLLRENVPMMY ANRSARFYASIGFPGRYWPMSSASTDASYVNQQFWYSHDDTNAGIAGAGNNVNDYSVSGY VPVKYIHPDDSWANGKGSVKGAFVTSPKPFPIIRYGEVLLEYVEALNRVEGTVTVEVSDN TGTLIEETVSRDPAEMAKYFNMIRYRVGLPGVDANDMAEADNFEKIIRNERQVELFNEGY RYFDTRRWGTYLDEDANSSNWRGLDVTKDRTNANGNEGFWNIVTINTQNVRDRIALPKMV FLPIRHDELLKVPNADQNFGWDR >gi|226332224|gb|ACIC01000096.1| GENE 5 7061 - 8080 874 339 aa, chain + ## HITS:1 COG:no KEGG:BT_4296 NR:ns ## KEGG: BT_4296 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 339 7 345 345 619 100.0 1e-176 
MKKKFNIMLCTLIGVLSACNEYDFDQEQYRNEVGLLSNSSLIYDRQVANVGQEKDTIYLV ATVSGSQISPNTHHVALLESDSLLKAYNKSNFDIEKEKFAKLLPDECYDFPNKELDIQAG TSKVMFPIYLKNLERISPDSIYFLDYKIDPEKTPNYNPDKSHVLLRIYKENYYATTKTST YYNYTSSTIVIPNPSGSPEVRRPTNANQVFPIGANSVRMMAGDEDLGDYKTARSRIVSKS IILEIGEQQPENPQARELTIRAHDRVNDVEVDPVDVVQLTPIDDYDNTFLLNAIRTPDGR ATYYKEFRLHYKYRLESTAPYREVKAKLRYEFNPRVDNL >gi|226332224|gb|ACIC01000096.1| GENE 6 8090 - 9934 1301 614 aa, chain + ## HITS:1 COG:no KEGG:BT_4295 NR:ns ## KEGG: BT_4295 # Name: not_defined # Def: putative chitobiase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 614 1 614 614 1211 100.0 0 MSMKKNILIIGLASIFGIASGLISCSPDYETEFKVETLVVPDKSQAPITFPLLGGEHEIE VQTNVPLDRWSAQSNAEWCKVVQHEGKVVVSASANNIYKQRRAEITVAYGHQSYSITVSQ FGKEPAILIGDKLQQEGYVEIIDAERETLTIPVATNLNLDNIIIPDTCNWIRLAEQPATF DAKTRAAEDVNKQELKFTLDKSTETDVRYCTIILQSSQNYSYTASFLIKQQPRGYIVEID EDKKIYEVKAMGETITIPFKVNSPAGEVSYTYEVEESAQSWITPVSLPASRALRDVSESF IIKANTEVENQPREGKITFKSTNSTDKVPSQFVVTVKQAGFIATPPLNVINATATPGAGS IQLQWEIPEDVDFNKIKITYYDKVTKENKEILINDYKTTSYIIDDTYQCAGEYSFTINTY GPTGMETDSPVTITGISGEASEMERVTLTIDMLSDNANHVGDGGGLPALIDGKVNTYYHT KWNAPVTTEAHYVQIKLNKPLKDLCFEYDARQSGVNNGGDVKAATIYGSMNGEFFESMGN EEFNLPTTNGGHATAKNNVSGKQAYNYIRFTPTARRDKDPLDYTVAGSAWWNMSEIYLYR IRHDEAWAREQLGI >gi|226332224|gb|ACIC01000096.1| GENE 7 10044 - 10751 518 235 aa, chain + ## HITS:1 COG:no KEGG:BT_4294 NR:ns ## KEGG: BT_4294 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 235 1 233 233 442 100.0 1e-123 MNMKPKKLSLLLTILPMIIAATGCHSQQESKVDITPHCINLQADSLYRQAMTLMESSCDV DSTRKCIRFLDKALAIDSLNPDYYGMKAKLLSEMGELDSALYIQTLAMEKNAITGEYLFQ LGLLQAAKDMHAEAHESFGKSKVFLQAVLKQYPDSLGAFILAEAANSLYENKDSLFMRNI DEIRKRFPERLLEIEMTRRVKPHSLVNQIRRIKIEKDYNIDFDMDSLVEATVNKD >gi|226332224|gb|ACIC01000096.1| GENE 8 10845 - 12590 1431 581 aa, chain + ## HITS:1 COG:no KEGG:BT_4293 NR:ns ## KEGG: BT_4293 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 581 1 581 581 1191 100.0 0 
MIEDVKRIPYGVSNFVEVVEQNQYYVDKTMYLPLLEQQPNSLFFIRPRRFGKSIFLSMLR TYYDISQKEKFQKRFGNLWIGSQPTPLQGTFQILFLDFSRIGGIDGTLAQNFDDYCCGGL DDFASIYEPYYYPGFALEMKQLEGSTNKLNFLDRKARNNGSHLYLIIDEYDNFTNVVLNE QGNEIYHALTHASGFYREIFKKFKGMFERIFMTGVSPVTLDDLTSGFNIGWNISTDFQFN MMLGFSETDVRTMFQYYKDAGQLPADTDIDALIREIKPWYDNYCFAKESLERDPKMFNCD MVLYYLRHYITLGKSPEQMIDPNTRTDYNKMKKLIRLDKLDGNRKGVLRKITEEGEVITN LVTTFPASEIANPEIFPSLLFYYGMLTITGTRGVRLILGIPNNNVRKQYYDFLLEEYQEK RHIDLNGLRDLFDDMAFDGHWQKSLEFIAHAYKENSSVRSAIEGERNIQGFFTAYLSVNA YYLTAPEVELNHGYCDLFLMPDLLHYEVNHSYIIELKYLSEKDSDAKAEIQWEEAVEQIK GYAAAPKVRQLIQNTKLHCIVMQFRGWELQRMEEVIQPHIK >gi|226332224|gb|ACIC01000096.1| GENE 9 12712 - 14805 1495 697 aa, chain - ## HITS:1 COG:all8519 KEGG:ns NR:ns ## COG: all8519 COG5545 # Protein_GI_number: 17232892 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Nostoc sp. PCC 7120 # 397 649 357 610 836 71 26.0 7e-12 MKITLIRQDSGSGKEALSICEAGTLFNKMKTETKSGHITALRNLIPMLEGTYSQYEHIDK LPYIYSAVECTRTKEGERKMKQYNGLVQLEVNRLAGPSEMEYVKRQAALLPQTFAAFCGS SGRSVKIWVRFALPDDRGLPEKEAEAELFHAHAYWLAVKCYQPMLPYDINLQAPVLTQKC RMTLDESPYYNPDAVPFCLEQPLAMPGEETFRQRKQGEKNLLLRLKPGYESAKTFTKIYE AALNRAFQEMEDWKRGDDLQPLLVHLAEHCFKAGIPEEEAVRQTLIHYYREEDERIIRST IHNLYQECKGFGKKNSIGREQETAFLLEEFMNRRYEFRYNTVLDDLEYRQRDSIHFYFKP VDKRVRNSISICALKEGINAWDRDVDRFLNSEYVPLHNPVEEYLYDVGRWDGKDRIRALA GLVPCTNPYWQELFYRWFLSMVAHWRGVDRQHGNSTSPLLVGPQGYRKSTFCRIILPPEL RFGYTDSIDFKSKQEAERYLGRFFLINIDEFDQINVSQQGFLKHLLQKPVANLRKPHGNT IREMRRYASFIGTSNQKDLLADPSGSRRFICIEVIAPIKTNATINYKQLYAQAMEAIYKG ERYWLDDEDEKILKQTNREFEQASPLEQLFHCYLRPAEEEEGGEWMTSMQILNYLQTKTR DRLAINKVAVFGRALQKLNIPCRKSGKGTLYHLLKIE Prediction of potential genes in microbial genomes Time: Thu May 12 01:43:23 2011 Seq name: gi|226332223|gb|ACIC01000097.1| Bacteroides sp. 
1_1_6 cont1.97, whole genome shotgun sequence Length of sequence - 4807 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 315 - 372 4.3 1 1 Op 1 13/0.000 - CDS 488 - 1828 966 ## COG0845 Membrane-fusion protein 2 1 Op 2 . - CDS 1782 - 4007 159 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 - Prom 4069 - 4128 5.8 + Prom 4004 - 4063 7.2 3 2 Tu 1 . + CDS 4088 - 4744 430 ## BT_4287 hypothetical protein Predicted protein(s) >gi|226332223|gb|ACIC01000097.1| GENE 1 488 - 1828 966 446 aa, chain - ## HITS:1 COG:XF1216 KEGG:ns NR:ns ## COG: XF1216 COG0845 # Protein_GI_number: 15837818 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Xylella fastidiosa 9a5c # 167 441 133 410 420 74 24.0 4e-13 MNRKQTSSPQKKAQTKSSGRSEELGEIIDRMPMAFGKWVALAVVIFAALFLLFGWIIKYP DMVTGQIKINAQNPTIRLVANSTGNLLLLSHKAQEEVKKGEYIAVVQNPASTEDVRKIAD LINRIDFDGTHLLALKDTFPDKVYLGEINPQYYAFLAVLKAQCDYLQQNVYEKQRENITT SIEWKKKIVREAEDSQKAAKDRMDVAQKWLKRYVSLDQQEIATYEYETDQIKNNYLTTVQ EVQNINREIASTRMQITEAYHRLEQLEVEQLEKERELKVELLSTHQNLIANMAAWEQKYV FKAPFDGKVEFLKFISDGQFVQAGEAVFGVIPKENHIYGQVLLPANGAGKVKENSKVVIK LENYPYMEYGYIEGYVSSISLVTQTQKTGEKTIETYLINVELPNGLTTNYEETLDFKYEL GGTADIIVKDRRLIERLFDNLRYRTK >gi|226332223|gb|ACIC01000097.1| GENE 2 1782 - 4007 159 741 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 507 713 25 228 309 65 27 6e-11 MQTRQFPWEYQMDAKDCGPACIKMIAKYYGKYYSLQYLRDLCGITREGVSFLDISYAAEK IGLRTVAVKATMENLTNRIPLPCIIHWDQHHFIVVYKTKKGKIYVSDPAKGLLSYPEEDF KDRWYKEGEEFGMLMVLEPMANFKQIEAHERIERFKSFENLLNYFTPYKKAFGILFAIML IATGLQAVLPFISKSVIDIGIYTQDISFIYMMLIGNIVLLLSITLSNVLRDWVLLHVSTR VNISLISDYLIKLMKLPVTFFENKLVGDILQRAGDHERIRSFVMNNSLGMFFSIITFVVF SIILLIYNPMIFFIFIAGSGIYVAWIFTFLSIRKKLDWEYFELNAKNQSYWVETIENVQE 
IKINNYEDLKRWKWEAIQARIYRLNLKVLKINNAQSLGAQFINSMMNIAVTFYCAIAVIN GDITFGVMISTQFIIGMLNGPVAQLVSFIQSAQYAKISFMRINEIHQLKDEDDSSPVVSN SLSLPVDKSLYLKNVSFQYSRNAPLVLKNITLQIPKGKVTAIVGDSGCGKSTLLKLLLRL YMPSYGEICMGDMNVNNISLRDWRAKCGCVMQDGKLFNDTIQNNIVLDDANIDYEALQKA VEVANISHEIEAMPQGYQTMIGEMGRGLSGGQRQRVLIARALYKDPDYLFLDEATNALDT INEQKIVRALNNVFKNRTVIVVAHRLSTIRRADQIIVLKAGVIIEIGNHQSLMANKQFYY NLIQSQYEPETNIFSTEESAD >gi|226332223|gb|ACIC01000097.1| GENE 3 4088 - 4744 430 218 aa, chain + ## HITS:1 COG:no KEGG:BT_4287 NR:ns ## KEGG: BT_4287 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 215 1 222 222 192 50.0 7e-48 MKQKLTRALIDEIRKEMPVLTEMEMRCCNGGDGGTVSWDCLFNCMNAIDPSRSAQEYAKE YADRYNLDPNTMGGVPENYINDVLTTLGFNNVPTNSMTPSREYHQIVAIENHDGTAHAVI LSSFMDENGNFFYYDPTTNTGGLQANKSDIVGIYRISRSDINNPENVDNSYVGDSTFGYN NSIYADYYTSDYNYNGSYNNYEYNRDDQNDYSNYNSSF Prediction of potential genes in microbial genomes Time: Thu May 12 01:43:29 2011 Seq name: gi|226332222|gb|ACIC01000098.1| Bacteroides sp. 1_1_6 cont1.98, whole genome shotgun sequence Length of sequence - 7675 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 3, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 190 92 ## BT_4286 hypothetical protein + Term 202 - 232 0.3 2 1 Op 2 . + CDS 263 - 808 353 ## BT_4285 hypothetical protein + Prom 1419 - 1478 5.8 3 2 Tu 1 . + CDS 1615 - 1770 89 ## BT_4284 hypothetical protein + Prom 1801 - 1860 2.4 4 3 Op 1 . + CDS 1914 - 2903 551 ## BT_4283 hypothetical protein 5 3 Op 2 26/0.000 + CDS 2987 - 4126 548 ## COG0438 Glycosyltransferase 6 3 Op 3 26/0.000 + CDS 4119 - 4952 453 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 7 3 Op 4 11/0.000 + CDS 5002 - 6735 729 ## COG0438 Glycosyltransferase 8 3 Op 5 . 
+ CDS 6791 - 7624 500 ## COG0500 SAM-dependent methyltransferases Predicted protein(s) >gi|226332222|gb|ACIC01000098.1| GENE 1 2 - 190 92 62 aa, chain + ## HITS:1 COG:no KEGG:BT_4286 NR:ns ## KEGG: BT_4286 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 62 198 259 259 127 100.0 9e-29 GCIALYCFYNDKGARKLSQYIRSLKKVFWYREPKDYIEVIDPPIIEEYEIPHLKAKEKSA DN >gi|226332222|gb|ACIC01000098.1| GENE 2 263 - 808 353 181 aa, chain + ## HITS:1 COG:no KEGG:BT_4285 NR:ns ## KEGG: BT_4285 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 181 1 181 181 346 100.0 2e-94 MKQKLTRALIDEIRKEMPVLSQNEEKGVIGGTLYVIGVDGRVLYSNETNTDEVLVSMGSW DGAPTMELPKGTSFQISSGQLVIEGTSEQNRDIYSFLTQNTSVEWSMCVDSSTYHFFAGT NHQEKEVSMAYSGCDIKYHNHQSEYANYPSDADYETKSKLQEIGYKEFYIYHEPTDTYIP Y >gi|226332222|gb|ACIC01000098.1| GENE 3 1615 - 1770 89 51 aa, chain + ## HITS:1 COG:no KEGG:BT_4284 NR:ns ## KEGG: BT_4284 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 51 266 316 316 95 100.0 4e-19 MIVSYLEKQIRKFPENSFGPLITIDKKMLYGRYLKFIVKDEKFKIEDYLEY >gi|226332222|gb|ACIC01000098.1| GENE 4 1914 - 2903 551 329 aa, chain + ## HITS:1 COG:no KEGG:BT_4283 NR:ns ## KEGG: BT_4283 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 329 1 329 329 586 99.0 1e-166 MKNNFLVNFILLHGYSLDNSSLMFGKMGYSLILFEYSHYFKDALAEKHAFELLQEVLASP MKSNTFNEGKMGIAWSLIHLIEKEYIEADYLELYGQEHKEIVAFIKQLKTDMNNIVSKND AISFLIASKSYIQESDFDEILPNLIENLYDYLKRIPKSLFERNLFYYHATKMLCLYNLYE ELSIRGETLIDIIVQTHKKLISDKHVCTNISFGVNLLQYGLCHKRKDIIKLANAIIKCYF SNMVLETLNLKESIDAIYNLNKLQLLNKESEWTSLKKQLSSLLFNKESYLYKENRLTLSR LKGGIPQLLLMECLNFQDIKSTGHYIMLK >gi|226332222|gb|ACIC01000098.1| GENE 5 2987 - 4126 548 379 aa, chain + ## HITS:1 COG:MJ1607 KEGG:ns NR:ns ## COG: MJ1607 COG0438 # Protein_GI_number: 15669803 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
Glycosyltransferase # Organism: Methanococcus jannaschii # 135 375 141 386 390 99 28.0 1e-20 MKHMNGIKVHIICLNTSLIFPKFEVTEDRIIANIPYPSKSKPLRNEIYWLTKYFNVVSDL ISPYFKNRKQLIWHVQELLLVKLADILKLSLGGHILTHLHIIPWKLSLEYNESLFKKQYS QWLNNTFNLINENQLEKIAYPLSDRIICVSYSAMKHIISAYGIHPDKISVIYNGMDDTGI TLQERKSRTPEILFVGRISREKGVICLLNALNKVASRGYFCKLKLVGQCTGYMSSHIRKA YKQLKIDILGTVSFNELKELYTTNTIGVIPSLHEQCSYVAIEMSMFGMPMIVSDVDALSE MFEDEVNALRIPLVFDEDFGLELDEEKLADAIIRLIDDEALRLKLSTNAIKNYQERFTLE KMIENTINVYEQLIEQDNA >gi|226332222|gb|ACIC01000098.1| GENE 6 4119 - 4952 453 277 aa, chain + ## HITS:1 COG:CAC2173 KEGG:ns NR:ns ## COG: CAC2173 COG0463 # Protein_GI_number: 15895442 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Clostridium acetobutylicum # 1 208 1 215 333 131 39.0 1e-30 MPEITVLMPVRNGEKYIKESIDSVLNQTLTDFEFLIIDDGSTDRTVEIIQGYTDKRIRLV RKEHQFIQNLNEGLELASGSYIARMDADDIMHTERLRIQLKRMKKNPDITVCGTWAKIFS DKGNERNVSHLGYGIIHEPVLELLKYNMLLHPSVMIKKEFLLNHHIEYQNYPCVEDYKLW FDIAKAGGILFVEPQELLMFRRSDTQVTVTKKKEMSLGSIRLRKEILLYLLSIYNNKTLN SLLSDFENLEKNKWMSNEDIFRFFVNLFNRIQRDTMV >gi|226332222|gb|ACIC01000098.1| GENE 7 5002 - 6735 729 577 aa, chain + ## HITS:1 COG:MTH173 KEGG:ns NR:ns ## COG: MTH173 COG0438 # Protein_GI_number: 15678201 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Methanothermobacter thermautotrophicus # 81 359 94 370 382 68 24.0 3e-11 MNIAFLSSSNPNDQNNWSGTLYSLYTSLQKKHRVIWVGETVFAEVWEFHKANFKNNETFF PENYALLFGKLFSDLLKKECYDVIICRDYFFAAYLVSDIPLIYIGDTTFHLFNQYMNWQD KSLTKLAEQLESLAIQKVDKIIYCSEWAKQSAMQDYNADAENIEVVEFGANITENIPIPD NVSPINAPCNLLFIGRNWNMKGGNKVLEIYHNLKGRGFQCTLTIVGSEPPMSLPNDPNIE IYPFIDKTNSNDRLKFHEILTRSHFLILPTRFDCFGIVFCEACAYGIPSLGTNVGGVSQV IKERENGFLFNIDASSLEYADKIEEIFNNHITYSKLRKTARKDFEERLNWDIWLDKSNKI IEQLASEHQPDFYLPVYVINMRERVERKQHITKEFDNKEEFELNWVEASAHPIGAVGLWN SMIKIIKMAKEKGDDIIVICEDDHYFTENYSPKLLFKEVTEAYIQGAEVLSGGIGGFGQA IPAGYHRYKVDWFWCTQFIVIYNRFFDKMLDYSFQNTDTADGVISKLATNIMVIYPFISE 
QKDFGYSDVTQSNMEQQGKIREHFARANKHMKSISLK >gi|226332222|gb|ACIC01000098.1| GENE 8 6791 - 7624 500 277 aa, chain + ## HITS:1 COG:ML2346 KEGG:ns NR:ns ## COG: ML2346 COG0500 # Protein_GI_number: 15828262 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Mycobacterium leprae # 3 242 12 227 301 102 30.0 6e-22 MENLIFDIGFHKGEDTLFYLLKGYRVIAVDADPDLINEWQNIFKKYIENGKLLLLNYVIS DTNDVDTDFYIGPNTIWSSTKVSISSRMCCKAIKKKIKSKRLDHLFHEYGTPFYCKIDIE GNDIIALQTMEKVSEKPLYISVETECIGEDEDIAGHELDTLNALYQLGYRKFKLVDQRTL TVLDYNCFYKNNSEHNWFEQIETNCKYAEELIVLSDTDQRVKFTDFFPGSSGPFGEELAG KWYDYPQAKEMLKKHREDKMRLNEPAWTFWCDWHATF Prediction of potential genes in microbial genomes Time: Thu May 12 01:43:39 2011 Seq name: gi|226332221|gb|ACIC01000099.1| Bacteroides sp. 1_1_6 cont1.99, whole genome shotgun sequence Length of sequence - 460 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Thu May 12 01:43:47 2011 Seq name: gi|226332220|gb|ACIC01000100.1| Bacteroides sp. 1_1_6 cont1.100, whole genome shotgun sequence Length of sequence - 33294 bp Number of predicted genes - 25, with homology - 25 Number of transcription units - 11, operones - 6 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 685 - 2271 575 ## BT_4273 hypothetical protein + Term 2432 - 2472 0.4 - Term 2243 - 2310 16.5 2 2 Op 1 . - CDS 2337 - 5327 2464 ## BT_4272 hypothetical protein 3 2 Op 2 . - CDS 5333 - 6040 509 ## BT_4271 hypothetical protein - Prom 6065 - 6124 10.8 4 3 Op 1 . - CDS 6153 - 7988 1342 ## BT_4295 putative chitobiase 5 3 Op 2 . - CDS 8002 - 8997 717 ## BT_4269 hypothetical protein 6 3 Op 3 . - CDS 9026 - 11077 1704 ## BT_4268 hypothetical protein 7 3 Op 4 . - CDS 11099 - 11824 436 ## BT_4267 hypothetical protein 8 3 Op 5 . - CDS 11852 - 14218 2044 ## BT_4267 hypothetical protein 9 3 Op 6 . 
- CDS 14236 - 15708 991 ## BT_4266 hypothetical protein - Prom 15783 - 15842 5.2 - Term 16592 - 16629 1.0 10 4 Tu 1 . - CDS 16654 - 18177 1550 ## COG0519 GMP synthase, PP-ATPase domain/subunit - Prom 18282 - 18341 6.1 - Term 18314 - 18378 15.4 11 5 Tu 1 . - CDS 18383 - 18829 510 ## COG1970 Large-conductance mechanosensitive channel - Prom 18857 - 18916 8.5 + Prom 18853 - 18912 6.8 12 6 Tu 1 . + CDS 18972 - 19982 1022 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase + Term 19998 - 20048 10.1 + Prom 20021 - 20080 5.2 13 7 Op 1 . + CDS 20164 - 22233 2106 ## COG0339 Zn-dependent oligopeptidases 14 7 Op 2 . + CDS 22252 - 22761 241 ## BT_4261 hypothetical protein 15 7 Op 3 . + CDS 22773 - 23222 415 ## COG2131 Deoxycytidylate deaminase 16 7 Op 4 . + CDS 23254 - 25002 1965 ## COG0793 Periplasmic protease 17 7 Op 5 . + CDS 24944 - 25504 329 ## COG0212 5-formyltetrahydrofolate cyclo-ligase 18 7 Op 6 . + CDS 25449 - 26240 525 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family - Term 26076 - 26110 -0.6 19 8 Op 1 . - CDS 26227 - 26514 357 ## BT_4256 hypothetical protein 20 8 Op 2 . - CDS 26511 - 27620 995 ## COG1195 Recombinational DNA repair ATPase (RecF pathway) - Prom 27730 - 27789 7.6 + Prom 27667 - 27726 4.8 21 9 Op 1 . + CDS 27782 - 28465 996 ## BT_4254 hypothetical protein + Prom 28547 - 28606 5.7 22 9 Op 2 . + CDS 28633 - 29127 444 ## COG0054 Riboflavin synthase beta-chain + Term 29208 - 29278 30.1 + TRNA 29178 - 29263 65.6 # Tyr GTA 0 0 + TRNA 29269 - 29341 68.9 # Gly TCC 0 0 + Prom 29188 - 29247 80.4 23 10 Op 1 . + CDS 29403 - 30662 1032 ## COG0673 Predicted dehydrogenases and related proteins 24 10 Op 2 . + CDS 30702 - 32549 1143 ## BT_4251 hypothetical protein + Term 32568 - 32622 11.4 - Term 32561 - 32604 1.8 25 11 Tu 1 . 
- CDS 32614 - 33159 535 ## BT_4250 RNA polymerase ECF-type sigma factor - Prom 33228 - 33287 13.7 Predicted protein(s) >gi|226332220|gb|ACIC01000100.1| GENE 1 685 - 2271 575 528 aa, chain + ## HITS:1 COG:no KEGG:BT_4273 NR:ns ## KEGG: BT_4273 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 528 1 519 519 1021 100.0 0 MIFITSCTIMIPEQINGDSCQVSIKSKTREITNAWLTVTALNSNEVAVASEDKKLEHPEW TDNVLSLKLTDGKYKAKALRINIFYFGNSNPDQQICLSDLQINIDGKDLGKQSIEDQTVI NTNIHKRLIKNKIIKLSHDNDSTLLTRINELKDKKIIGLGECTHGSQELKTAVIQFSKNL IQNGDCRFVLLEAPVDALLLVDAYIQGIISSPDIEKQIKEIMQMFFTNNSELMGFVDWLK EYNKTSLRKVHLSGMDYKDIISPYFYDYLLNLLDKEKGRYYLLKLYDKEFKDILQYAHED VYLKTKLGEENFSLFTQYIETCINLGIGSQLPPPDYRDFYMFTYTKQLADHFLKKDEKLI IHAHSAHLSNLERFDSFPYKEIPAGNRLKKYYGDKYYSIGFQVGKGTFTQESAGYFSKLI ALPLSKPPYNSFEHLALSTKESYFYYPSKYMDDGIVELLVIPRMARYQRQYQFHSVKNWC DAYVFIQESTPLKNVTTGEFETMFKQFLRIKELIKELKANEEKIRNKE >gi|226332220|gb|ACIC01000100.1| GENE 2 2337 - 5327 2464 996 aa, chain - ## HITS:1 COG:no KEGG:BT_4272 NR:ns ## KEGG: BT_4272 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 996 1 996 996 2040 99.0 0 MLLDKVKHILCISLILLVGVTTLYACKSDDKELQGEPVLQVQKSIGFKKEGGEVAVPVKS NREWNASVTEGKEWLTARKASDTELTVSAISSPEKGVREGNITIANNALTAKLRVVQTGG DLIIEVAEESHVIQVAGTGNDHIEVNLLSNTDYEVVIPEEAKDWITEKEVPDTRADLASS TRIFSIASNPLTTERNATIKFVSKENTNIYDQSEIKQQKKSSDISGVNPEKDVKLKVTGG YDTDHQPGQDISKSYDGQFGGTCYHSTWSQSAKFPVTLEYQFDQNQLTLDYILYHSRNGN GNFGAFELYIKPQGSADFVHIQDYDFKGAGGSHRILLNDPVVPAAVQFKVKSGLNDFVSC DEMEFFHAAENPLDEQLITVFTDRSCSELRPDASDETINRLPAFFNVLAKSLQSNTYPEA EKRFRIQSYQAYSVPEYWGDKLRTNYYSPLCNPTGIITNAGEEMVVLADGIPQGESISLR CCSDLGPDGEERFLKNGINKFSFSRAGNLFVIYQKLDPRGMPAVKIHFPPQYVEITEHAR VGFNVWDLTVDKTDDLFREYIRKAKSVTLDGSDKCVFVLKGRKILFTALKDLLQNQDNFK QYGVVRGMERWDNLIDWEQELAAIDTYSNTGEFNSLMHVTTFTDGLYATNYYINMAAGDV STKDGWGFKNNFDPRDMDKNQDNEWGPGHELGHMHQGAINWPSTTESSNNLFSNYVVYKI NQWGSRGSSIGTLATYRYAPPTPWSRFMHPRDPNTLAFTPQDMTSDDANKYGLYQGEASE 
MHMRLNQQLWTYFERIGKKPNTIRKIFEQGRTPEFWLPFNDPGAAQLMYARNVAKAANMD MTEFFDAWGFFIPVSFKLYAYGSFSYTVTQDMINQTLAYMKTFSTKCPPIEYIEDRRYQV GAKGNQKGISEDGGDVGYFETFQNNVKITKTVSYTVSGRTYTVTNGEQAVAFELIKDGKK VWFANRFVFTVPAEADIEGAELYAVQADGQRIKANK >gi|226332220|gb|ACIC01000100.1| GENE 3 5333 - 6040 509 235 aa, chain - ## HITS:1 COG:no KEGG:BT_4271 NR:ns ## KEGG: BT_4271 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 235 1 235 235 423 99.0 1e-117 MYMKTKKLNIILLVLLLICTAIGCHSRQKPDIRPHPVNLSADSFYQQAVAILQSSYDVDS TRKCISLLDRALSIDSLNPDYYGTKAKLLAEIGELDSALHVQTLAMERKAITGEYLFQLG LFQAAKDMNADAHQSFGKSLEILRAVLEQYPDSLGAFILEESANALYQGADSIYMKDIDG IRKRFPNRLLEIEMIRRLKPHSLVKQIKKIQIENEYNIDFDLDSLVNEMEKQQKL >gi|226332220|gb|ACIC01000100.1| GENE 4 6153 - 7988 1342 611 aa, chain - ## HITS:1 COG:no KEGG:BT_4295 NR:ns ## KEGG: BT_4295 # Name: not_defined # Def: putative chitobiase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 611 3 614 614 827 68.0 0 MKKNILIIGLTGILGTFVALSSCGPDYETEFNVLTLTIPSESQAAVVFPLSGGEHEIEVE TNVALDNWTVSSNAEWCKLQKQEGKVTVSADQNEGYKQRIAEVTIAYGHQSYSISVKQFG LEPVIDIPGLTDDGMKAVDAKATSLSIQVNTNLNLDVITVPDTCNWVRFNEPKELKAKKG QKAVKAEDIVKKNLVFSLDQNTDTIVRYCTITLQSSQNYNCVGTFIIKQQPRGYIVDIDD AHKVFEVKAMGETITIPFKVNGPAGAEYEYVIDENAQTWIKPATAPLTRGALRDASESFI IEPNTAVVEQPREGTITFTSKDPVDPNSFTVTVKQAGFVPVPPVNVVNATATPGAGHITL QWEAPEDVDYSKVKITYYDKVTKENKEIEIDGFKTTSAIIDDTYQCAGEYSFTFKTYGPT GMETESPVIITGTSGEASELERVILTVDMLSANATQDGDGGGLPALIDGKGGTYYHSRWQ DPVVSEAHYVQIKLNKPLQGLRFEYDARQTGINNGGDVTIATIYGSTNGEYFESMGTEEF NLPTSNGGHATAKNNVNGKQAYNYIRFTPTGRRNQATLDYTNANKAWWNMAEMYLYRVRH DEAWAKEQLGI >gi|226332220|gb|ACIC01000100.1| GENE 5 8002 - 8997 717 331 aa, chain - ## HITS:1 COG:no KEGG:BT_4269 NR:ns ## KEGG: BT_4269 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 331 1 331 331 588 91.0 1e-167 MKKILTIILCSALVMISGCDKYDFDQEQFRKEVNLLSNSNLVYDRQVAELQQGGDTLFVV ASLSGSQATDEPVTVVLQHSDTLLRAYNKSNFDINKARFAKYLPEECYEFPTMEMNISAG 
SSKAMFPVYLKNLEKISPDSIYFLEYKIDSLRTPDCNPKKRHVLLRIHKENYYASTQTAT YYNYTSSTVIIPNPDGYSEVRRPTNSNRVFPISENSVRLMAGDENFTDYTTALDIINKGS IILEMGAQLPENPLAKELTILPYQSIDVVQLTPIGEYDNTYLLNVIRTPDGRSTYYKEFR LHYKYRLKSTDLYREVNAKLRYQFNPRVENL >gi|226332220|gb|ACIC01000100.1| GENE 6 9026 - 11077 1704 683 aa, chain - ## HITS:1 COG:no KEGG:BT_4268 NR:ns ## KEGG: BT_4268 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 683 1 683 683 1270 90.0 0 MKNRIKILLGAVIGSSLLLSSCNFLDVDNYFQATFKEDSIFHSMKNAEGYLWNTPTRFPD AGAIWGNSWNPGETASDEIAVRWQTNEFWGAQFSVGTVNGRNLRQDIWYSMYVVVARCNR MLENIDKVSDMTEADRRRYIGYVHFMRGYAYYHLLMNYGPLLIVGDEVLGTSESAEYYDR ERSTYDESVDYICNEFKLATQGIYGPTEQSISYSDRPTKGAALALIARLRLFQASPLFNG GDAARLCFSNWKRKSDGADYVNQTYDPDRWAVAAAAAKQVINMEYYSLFTVAPDNQYPYP LADNVPTAPFPDGAGGIDPYHSFADMFNGEGIIQTNKEFIWAMASQNVTNYTHHSFPVKF GGWGGMSVPQRVVDCFLMMDGRDIHNASADYPYVADLSQTIGTNKVLGNYQLRGDVPKMY DNRSARFYASIGFPGRLWTMSSASSDATYVNQQFWYSHDDTQAGLAGAGNNVNDYNISGY TPVKFVHPDDSWSSGKGSVKGAFVTQPKPFAIIRYAEVLLEYVEALNRVTGTVTVTTPDM TGTDVEVTVSRDIQEMAKYFNMIRYRVGLPGPALGDFNEVERFEQIIRNERQVELFNEGY RYFDTRRWGTYLTEDANTSNWQGLDVQKDRTDVAGNPGFWNIVTIDQQNIRDRVALPKMV FMPISHSELLKVPSMDQNPGWDR >gi|226332220|gb|ACIC01000100.1| GENE 7 11099 - 11824 436 241 aa, chain - ## HITS:1 COG:no KEGG:BT_4267 NR:ns ## KEGG: BT_4267 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 241 799 1039 1039 468 93.0 1e-131 MTGRPFNILRGYISEGLFASKDEINTSPDQSGFGKIRPGDIKYRDVNGDGIINEDDKVPL SYSNQLPRVMYGFGGDFQWKDLTVSVLFKGSAKVDYYRAGLGNDNGWIPFYNGDLGNVIK LANNPKNRWTPAWYSGTTETENPNAEFPRLSYGGNKNNSQLSSFWRRNGSFLRLQEVSLR YSLKHLPWIKAVGLSSVDLEFVANNLCTFDNVKYFDPEQASANGAVYPIPATYSFQVYLR F >gi|226332220|gb|ACIC01000100.1| GENE 8 11852 - 14218 2044 788 aa, chain - ## HITS:1 COG:no KEGG:BT_4267 NR:ns ## KEGG: BT_4267 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 788 1 788 1039 1458 93.0 0 
MKKNLLLIVLLMTLIPLAGYSQNETSTQFTVAGVIVDATGEPLVGTAVYVKNEPGVGVVA DLDGKFKIKVTKNATLIFQSVGMKNVEMLITKNEEKLKIVMKEDETKIDEVVVTGMSSQK KVSVVGAITTIDVAQLKTPATSLNNMIGGRMAGVITMQSSGEPGKNISNFWIRGISTFGA SSGALVLIDGLEGRLEDIDTDDVESFSILKDAAATAVYGTRGANGVVLVTTKRGTSGKLE VTGRATMKISHIKRLPEYLGAYDYALLANEARAMSGEDDLYTRLELDLIKNRLDPDLYPD VNWIDEIMKRNSIQQNYYVSAKGGGDVARYFLSLGYQDEGAAYRQEENLFKKPLSVNKLN YRANIDMNLTKTTQLYFGVDGYVNSYVSPGGGNTDLVWSAVQQLTPLLFPVTYSDGTLPV YGGSRELASPYVMLNNTGYMQSEDNRNMLTLKLTQEFGGFLKGLTMSVQAMTEHIGYFSE YRRIWPDLYRATGRTAQGVLIKALRSAQSSMAYGSGENSYRQYYMEAKANWNRSFGDHTF GALLEYYMKDEKNSLWGKYDDLGIASIPKRHQNLSGRISYDYKNRYFIDANFGYTGSAQF KKGEQFGFFPSIAAGWVPSSYAWWTEKLPWFTFLKIRGSYGIVGNDQIVSTNDYTGQGRF PYLTLIDNHAGSAWGYRYAGITETHTGADNLKWEVAKKANLGIDAKFFHDKLSLTVDIFR DTRDHIFQDRVTLPSYVGMVTYPKSNVGRMHSVGSDGNIEFFHTINKDMSFTVRANYTFS QNIIDYFE >gi|226332220|gb|ACIC01000100.1| GENE 9 14236 - 15708 991 490 aa, chain - ## HITS:1 COG:no KEGG:BT_4266 NR:ns ## KEGG: BT_4266 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 490 1 490 490 927 91.0 0 MQKRTIFLTFFAAVFVYLTSCSDSKNIGSAVFNPDQPIEVTGFYPDSGGIATPMIVEGSN FGSDTTNLKVYFEDADGIKHQAGLVSSNGNKIYLYVPSGLTYKKEMNLLVARTMPDGTEY EGQAGRQFIYKTQTSVTTVVGQASPDANQPTVGGDIATSTLSAPNYISLDDEDNIFITER HVWHGGNNYPSVTCQNDKGAQSNGNIVMASIKSNSVLVLQYGTSAILNAPAFSDLDGTMY APEDGGMGYYSLAKSVSYAPRRLTVLLNDETKDVADGNYKHCFVVNKLDHYIYTVMVKGQ LVRIKPNTRTCELLLKKVGTSTRPDACDSYLAFSPVKGQENMLYVAMAEGNQIWRVDVGN LEGKDKNTYPGEGYAGKAILEGQVAGAGWEDGLLRNAKFDNPHQICFTEDGKLYIADCGN NCIRVIDTKLPLDRAMVTTPIGLPGMKGYKDGGPDIALFNHPFGVAVSADGQIVYVADTG NKVIRKLSIQ >gi|226332220|gb|ACIC01000100.1| GENE 10 16654 - 18177 1550 507 aa, chain - ## HITS:1 COG:FN1444_2 KEGG:ns NR:ns ## COG: FN1444_2 COG0519 # Protein_GI_number: 19704776 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Fusobacterium nucleatum # 193 507 1 318 318 419 62.0 1e-117 MQEKIIILDFGSQTTQLIGRRVRELDTYCEIVPYNKFPKEDPTIKGVILSGSPFSVYDKD 
AFKVDLSEIRGKYPILGICYGAQFMAYTNNGKVEPAGTREYGRAHLTSFCKDNVLFKGVR ENTQVWMSHGDTITAIPDNFKKIASTDKVDIAAYQVEGEKVWGVQFHPEVFHSEDGTQIL RNFVVDVCGCKQDWSPASFIESTVAELKAQLGDDKVVLGLSGGVDSSVAAVLLNRAIGKN LTCIFVDHGMLRKNEFKNVMNDYECLGLNVIGVDASEKFFAELAGVTEPERKRKIIGKGF IDVFDVEAHKIKDVKWLAQGTIYPDCIESLSITGTVIKSHHNVGGLPEKMHLKLCEPLRL LFKDEVRRVGRELGMPEHLITRHPFPGPGLAVRILGDITREKVRILQDADDIYIQGLRDW GLYDQVWQAGVILLPVQSVGVMGDERTYERAVALRAVTSTDAMTADWAHLPYEFLGKISN DIINKVKGVNRVTYDISSKPPATIEWE >gi|226332220|gb|ACIC01000100.1| GENE 11 18383 - 18829 510 148 aa, chain - ## HITS:1 COG:YPO0238 KEGG:ns NR:ns ## COG: YPO0238 COG1970 # Protein_GI_number: 16120576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Large-conductance mechanosensitive channel # Organism: Yersinia pestis # 5 147 2 134 137 148 55.0 3e-36 MGKSTFLQDFKAFAMKGNVIDMAVGVVIGGAFGKIVSSLVANVIMPPIGLLVGGVNFTDL KWVMKAAEVGADGKEIAPAVSLDYGQFLQATFDFLIIAFAIFLFIRLITKLTTKKAAEEA PAAPPAPPAPTKEEVLLTEIRDLLKEKK >gi|226332220|gb|ACIC01000100.1| GENE 12 18972 - 19982 1022 336 aa, chain + ## HITS:1 COG:VC2000 KEGG:ns NR:ns ## COG: VC2000 COG0057 # Protein_GI_number: 15642002 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Vibrio cholerae # 2 332 3 330 331 488 77.0 1e-138 MIKVGINGFGRIGRFVFRAAMKRNDIQIVGINDLCPVDYLAYMLKYDTMHGQFDGTIEAD VENSKLIVNGQAIRITAERNPADLKWDAVGAEYVVESTGLFLSKDKAQAHIEAGAKYVVM SAPSKDDTPMFVCGVNEKTYVKGTQFVSNASCTTNCLAPIAKVLNDKFGILDGLMTTVHS TTATQKTVDGPSMKDWRGGRAASGNIIPSSTGAAKAVGKVIPALNGKLTGMSMRVPTLDV SVVDLTVNLAKPATYAEICAAMKEASEGELKGILGYTEDAVVSSDFLGDTRTSIFDAKAG IALTDTFVKVVSWYDNEIGYSNKVLDLIAHMASVNA >gi|226332220|gb|ACIC01000100.1| GENE 13 20164 - 22233 2106 689 aa, chain + ## HITS:1 COG:XF1944 KEGG:ns NR:ns ## COG: XF1944 COG0339 # Protein_GI_number: 15838538 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent oligopeptidases # Organism: Xylella fastidiosa 9a5c # 9 685 36 716 716 476 39.0 1e-134 
MNNILNAQNPFFGQYQTPHGTVPFDRIKTEHYEPAILEGIKQQNAELDAIIQNPEKATFT NTIEAYEQSGRLLDKVTAVFGNMLSAETNDDLQALAQKIMPLLSEHSNNITLNEKLFARV KEVYGQKQSLQLTQEQNRLLDDIYDSFVRHGANLEGEAREQYRQLTNELSKLTLDFSENN LKETNRYQMLLTDKASIAGLPEIIVEAAAETARSEDKEGWAFTLHAPSYVPFMTYADNRE LRHKLYIAYNTKCTHDNEFNNIEIVKKLVNTRMKIAQLLGYKDYAEYTLKKRMAENSDAV YKLLNQLLEAYTPTAQKEYLEVQELAREEQGDDFIVMPWDWSYYSNKLKNKKFNINEEML RPYFELEQVKKGVFGLAERLYGITFRKNTEIPVYHKDVEAYEVFDKDGKFLSVLYTDFHP RPGKRAGAWMTSYKEQWIDPVTGEDSRPHISVVMNFTKPTESKPALLTFNEVETFLHEFG HSLHGMFANSTYQSLSGTNVYWDFVELPSQIMENFAIEKEFLNTFARHYETGEVLPDELI QRLVDASNFNAAYACLRQVSFGLLDMAWYTRNTPFEGDVKAYEQEAWKDAQVLPIVKEAC MSTQFSHIFAGGYSAGYYSYKWAEVLDADAFSLFKQQGIFNREVADSFRNNILSKGGTEH PMVLYKRFRGQEPTIDALLIRNGIKKQYN >gi|226332220|gb|ACIC01000100.1| GENE 14 22252 - 22761 241 169 aa, chain + ## HITS:1 COG:no KEGG:BT_4261 NR:ns ## KEGG: BT_4261 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 169 1 169 169 293 100.0 2e-78 MKKRLILKVLGLLLLLPMFSGCNDTDDVAGIFTGKTWKLTYITVKDSHQMFNFWGNDNKA REQSMKLLDETGRYVITFNGMEESNIITGTLSGTVITSTFTGSWSANGKDNQFNASIQGG NESSDILAKNFIEGLNNATSYGGDERNLYLYYKPSGSQQTLSLVFHVVK >gi|226332220|gb|ACIC01000100.1| GENE 15 22773 - 23222 415 149 aa, chain + ## HITS:1 COG:AF1764 KEGG:ns NR:ns ## COG: AF1764 COG2131 # Protein_GI_number: 11499353 # Func_class: F Nucleotide transport and metabolism # Function: Deoxycytidylate deaminase # Organism: Archaeoglobus fulgidus # 6 136 2 143 157 116 44.0 1e-26 MDTEKKQLELDKRYIRMASIWAENSYCQRRKVGALIVKDKMIISDGYNGTPSGFENVCED ENNLTKPYVLHAEANAITKIARSNNSSDGATMYVTASPCIECAKLIIQAGIKRVVYSEHY RLEDGIELLKRAGIEVIYTEMNEDSSPNK >gi|226332220|gb|ACIC01000100.1| GENE 16 23254 - 25002 1965 582 aa, chain + ## HITS:1 COG:aq_797 KEGG:ns NR:ns ## COG: aq_797 COG0793 # Protein_GI_number: 15606169 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Aquifex aeolicus # 51 358 43 346 408 214 40.0 3e-55 MSTRNSSRFTPIIIAISVVAGILIGTFYAKHFAGNRLGIINGSSNKLNALLRIVDDQYVD 
TVNMTDLVEKAMPQILAELDPHSTYIPAQNLEEVNSELEGSFSGIGIQFTIQNDTIHVNA VIQGGPSEKVGLMAGDRIVNVDDSLFVGKKLNNELAMRTLKGPKGSQVKLGVKRVGEPKL LDFTITRGDIPQNTVDAAYMLNDDIGYVKVSKFGRTSHVELLNALAQLNHKKCKGLIIDL RGNTGGYMEAAIRMVNEFLPEGKLIVYTQGRKYPRAEEFANGTGSCQKMPLVVLIDEGSA SASEIFTGAIQDNDRGTVVGRRSFGKGLVQQPIDFSDGSAIRLTIARYYTPSGRCIQRPY ESGKDKNYELDIYNRYEHGEFFSRDSIKLNENERYLTSLGRTVYGGGGIMPDVFVPQDTT GVTSYLSSVINRGLTLQFTFQYTDNNRKKLSQYETEEDLLNYLRHQGLVEQFIRFADSKG VKRRNILIQKSYKRLEANIYGNIIYNMLGLEAYLKYFNKSDATVQQGIELLEKGEAFPKA PAETEEEVTKDNKDGKKKRTAQAYGITEDPTRIFDFAEATIS >gi|226332220|gb|ACIC01000100.1| GENE 17 24944 - 25504 329 186 aa, chain + ## HITS:1 COG:aq_1731 KEGG:ns NR:ns ## COG: aq_1731 COG0212 # Protein_GI_number: 15606807 # Func_class: H Coenzyme transport and metabolism # Function: 5-formyltetrahydrofolate cyclo-ligase # Organism: Aquifex aeolicus # 18 167 26 174 186 124 42.0 1e-28 MASLKIRHASSTLQKLQSAKILAALEAHPAFRAATTVLLYHSLKDEVDTHAFIRKWSNEK RILLPVVVGDELELRLYTGPEDMATGTYGIEEPVGETFTDYADIDFIAVPGVAFDRKGNR LGRGKGYYDRLLPRIPAAYKAGICFPFQIVEEVPAEAFDICMDIIITSNEDELSHPHHPL PPCDRE >gi|226332220|gb|ACIC01000100.1| GENE 18 25449 - 26240 525 263 aa, chain + ## HITS:1 COG:DR0470 KEGG:ns NR:ns ## COG: DR0470 COG1387 # Protein_GI_number: 15805497 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Deinococcus radiodurans # 6 250 8 247 260 86 28.0 6e-17 MRTNYHTHTTRCLHATGSDEEFVLSAIKGGYQELGFSDHTPWEYHTDYVSDIRMTPEELP EYVESLRSLQKKYKDQISIKIGLECEYFPEYIPWLKGVIKEYQLDYIIFGNHHFHTDEKF PYFGRNTHTVDMLELYEESAIEGMESGLFAYLAHPDLFMRSYPEFDRHCKLVSRHICRTA ARLNLPLEYNIGYEDYNDEHGITTIPHPAFWEIAAHEGCTAIIGVDAHNNQYLETPFYYD RANETLQKLGMKVTDRIPFLNEK >gi|226332220|gb|ACIC01000100.1| GENE 19 26227 - 26514 357 95 aa, chain - ## HITS:1 COG:no KEGG:BT_4256 NR:ns ## KEGG: BT_4256 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 95 1 95 95 164 100.0 1e-39 
MKRNDAEQIGKLIQQYLRQESLESPLNEQRLLDAWPQVLGPAASYTSNLYIRNQTLYVHL TSAALRQELMMGRELLVRTLNQRVGATVITNIIFR >gi|226332220|gb|ACIC01000100.1| GENE 20 26511 - 27620 995 369 aa, chain - ## HITS:1 COG:BH0004 KEGG:ns NR:ns ## COG: BH0004 COG1195 # Protein_GI_number: 15612567 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair ATPase (RecF pathway) # Organism: Bacillus halodurans # 1 366 1 371 371 173 30.0 5e-43 MILKRISILNYKNLEQVEIGFSAKLNCFFGQNGMGKTNLLDAVYFLSFCKSSGNPIDSQN IRHEQDFFVIQGFYEAEDGTPEEIYCGMKRRSKKQFKRNKKEYSRFSDHIGFLPLVMVSP ADSELIAGGSDERRRFMDVVISQYDKEYLEALIRYNKALAQRNTLLKSEFPVEEELFLVW EEMMAQAGEIVFRKREAFIEEFIPIFQSFYSFISQDKEQVGLSYDSHARDASLLEVLKQS RERDKIMGFSLRGIHKDELNMLLGDFPIKKEGSQGQNKTYLVALKLAQFDFLKRTGQTVP LLLLDDIFDKLDASRVEQIVKLVAGDNFGQIFITDTNREHLDRILQKVGSDYKVFRVDQG VINEMGAEQ >gi|226332220|gb|ACIC01000100.1| GENE 21 27782 - 28465 996 227 aa, chain + ## HITS:1 COG:no KEGG:BT_4254 NR:ns ## KEGG: BT_4254 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 227 1 227 227 390 100.0 1e-107 MAEQKNQNEHLNVEDALTQSEAFLVKYKNAIIGGVVAVIIIVAGFIMYKNLYAEPREEKA QAALFKGQEYFEQDAYEQALNGDSIGYVGFLKVADEYSGTKAANLAKAYAGICYAQLGKY DEAVKMLDGFNGGDQMVAPAILGATGNCYAQLGQLDKAASTLLSAADKADNNSLSPIFLM QAGEILVKQGKYDDAVNAYTKIKDKYFQSYQAMDIDKYIEQAKLMKK >gi|226332220|gb|ACIC01000100.1| GENE 22 28633 - 29127 444 164 aa, chain + ## HITS:1 COG:SP0175 KEGG:ns NR:ns ## COG: SP0175 COG0054 # Protein_GI_number: 15900112 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase beta-chain # Organism: Streptococcus pneumoniae TIGR4 # 8 163 1 154 155 137 44.0 7e-33 MATAYHNLSEYDFNSVPNAENMKFGIVVSEWNFNITGALLNGAVNTLKKHGVKDENILVK TVPGSFELTFGANQMMENCDVDAIIAIGCVIKGDTPHFDYVCMGATQGITELNATGDIPV IYGLITTNTMEQAEDRAGGKLGNKGDECAISAIKMIDFVWSLNK >gi|226332220|gb|ACIC01000100.1| GENE 23 29403 - 30662 1032 419 aa, chain + ## HITS:1 COG:lin2266 KEGG:ns NR:ns ## COG: lin2266 COG0673 # Protein_GI_number: 16801330 # Func_class: R 
General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Listeria innocua # 21 285 3 248 358 65 25.0 2e-10 MEDSPTQSHVLQLAHPPIPIVRIGIIGLGNRGLLTLQRYLQIEGTEIKALSEIREGNLNK AQLILKEAKHPEATGYTGPGGWRKMCECNDLDLIFICTDWLTHTPMATYAMECGKHVAIE VPAAMNIAECWQLVDTAEKTRRHCIMLENCCYDPFALTTLEMARQGVLGEIMHVEGAYIH DLRSMYFAEESEGGYHNHWGKRYSIEHTGNPYPTHGLGPACQILDIHRNDRMEYLVSMST HQAGMSEYARKRFGENSPEARQKYKLGDVNTTLIHTAKGKTIMLQYNVSTPRPYSRLQTV CGTLGFAQKYPVPCIALDSHGDTPLEGEALETVLTRYKHPFSATIGEEAHRKGLPNEMNY VMDYRLIYCLRNGLPLDMDVYDAAEWSCITELSEKSVLNGSIPVEIPDFTRGVWKKHKH >gi|226332220|gb|ACIC01000100.1| GENE 24 30702 - 32549 1143 615 aa, chain + ## HITS:1 COG:no KEGG:BT_4251 NR:ns ## KEGG: BT_4251 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 615 1 615 615 1310 100.0 0 MRTFLSLKTCLLSALLLCVNSIAASKIISVSDFGLKPDSRINAVPFIQKAIDACKQHPGS TLVFPKGRYDFWAQHAIEKDYYETNTYDVNPKILAVLLEQINDLTIDGNGSEFIMHGRMQ PFTLDHCRNITLKNFSVDWEIPLTAQGIVTQSTSEYLEIEIDSHQYPYIIENKRLTFVGE GWKSSLWAIMQFDPDTHLVLPNTGDNLGWRSYDATEINPGLIRLSDPKKEADKFFPAPGT VLVLRHSTRDHAGIFIYHSMDTKLENVKLFHTCGLGILSQYSKNISFNDVHIIPNTSKKR VLSGHDDGFHFMGCSGLLKIENCSWAGLMDDPINIHGTCSRIMEVLSPTRIKCKFMQDMS EGMEWGRPDETIGFIEHKTMRTVATGKMNKFEALNKAEFIIELSVPLPAGVEAGYVIENL TCTPDAEIRNCHFGSCRARGLLVSTPGKVIIENNVFESSGSAILIAGDANAWYESGAVKD VLIRNNDFRYPCNSSIYQFCEAVISIDPEIPTPEQKYPYHRNIRIMDNTFHLFDYPILFA RSVNGLTFSSNTLIRDTTYQPYHYRKEGITLEACKSVVISNNKIEGDVLGRIVTIEKMKP SDVKISKNPFFKLKK >gi|226332220|gb|ACIC01000100.1| GENE 25 32614 - 33159 535 181 aa, chain - ## HITS:1 COG:no KEGG:BT_4250 NR:ns ## KEGG: BT_4250 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 181 1 181 181 328 100.0 7e-89 MEETIILEKVKAGDAAAFSLLYDLYWLKVYNFARLYITSSSVISEVVQDVFVKVWESKEF LDITKNFDGLLFMITRNIIFNYSRRHFNELNFKMTVLKGLENSYDIEEELDAADLKNYID KLISQLPARRQQIFRMSREEHLSNREIAERCAVSEKAIERQITLALKFIKENLPLFVVFM G Prediction of potential genes in microbial genomes Time: Thu May 12 
01:45:39 2011 Seq name: gi|226332219|gb|ACIC01000101.1| Bacteroides sp. 1_1_6 cont1.101, whole genome shotgun sequence Length of sequence - 155278 bp Number of predicted genes - 106, with homology - 105 Number of transcription units - 47, operones - 25 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 59 - 712 259 ## COG3712 Fe2+-dicitrate sensor, membrane component 2 1 Op 2 . + CDS 763 - 4116 2404 ## BT_4247 hypothetical protein 3 1 Op 3 . + CDS 4135 - 6063 1573 ## BT_4246 hypothetical protein 4 1 Op 4 . + CDS 6100 - 7392 721 ## BT_4245 hypothetical protein + Term 7408 - 7445 5.5 5 2 Tu 1 . + CDS 7475 - 10048 1626 ## BT_4244 hypothetical protein + Prom 10141 - 10200 5.1 6 3 Tu 1 . + CDS 10282 - 11685 1392 ## COG0673 Predicted dehydrogenases and related proteins + Term 11712 - 11766 12.3 + Prom 11699 - 11758 2.7 7 4 Op 1 . + CDS 11794 - 12684 512 ## COG1284 Uncharacterized conserved protein 8 4 Op 2 . + CDS 12699 - 16082 3101 ## COG3250 Beta-galactosidase/beta-glucuronidase 9 4 Op 3 . + CDS 16112 - 17203 1121 ## BT_4240 hypothetical protein + Term 17241 - 17278 4.2 - Term 17053 - 17095 0.6 10 5 Op 1 . - CDS 17249 - 17899 603 ## BT_4239 hypothetical protein 11 5 Op 2 . - CDS 17910 - 18809 873 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 12 5 Op 3 . - CDS 18874 - 19683 733 ## COG0101 Pseudouridylate synthase 13 5 Op 4 . - CDS 19731 - 21377 1571 ## COG0784 FOG: CheY-like receiver 14 5 Op 5 . - CDS 21383 - 23887 1641 ## COG3292 Predicted periplasmic ligand-binding sensor domain - Prom 23928 - 23987 6.4 - Term 23981 - 24030 8.1 15 6 Tu 1 . - CDS 24192 - 24827 536 ## BT_4235 hypothetical protein - Prom 24848 - 24907 3.2 + Prom 25145 - 25204 3.1 16 7 Op 1 . + CDS 25326 - 25895 590 ## BT_4234 hypothetical protein 17 7 Op 2 . + CDS 25908 - 27362 1543 ## BT_4233 hypothetical protein - Term 27380 - 27415 4.9 18 8 Op 1 . - CDS 27436 - 27621 308 ## BT_4232 hypothetical protein 19 8 Op 2 . 
- CDS 27605 - 27754 153 ## gi|253569990|ref|ZP_04847399.1| conserved hypothetical protein - Term 27815 - 27841 1.0 20 9 Tu 1 . - CDS 27888 - 28526 631 ## BT_4231 hypothetical protein 21 10 Tu 1 . + CDS 28856 - 30115 849 ## BT_4230 hypothetical protein + Term 30270 - 30306 -1.0 22 11 Op 1 1/0.000 - CDS 30159 - 30809 368 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen 23 11 Op 2 . - CDS 30829 - 31641 649 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen - Prom 31684 - 31743 3.7 24 12 Tu 1 . + CDS 32783 - 34063 700 ## BT_4227 hypothetical protein + Term 34082 - 34120 8.8 + Prom 34065 - 34124 2.2 25 13 Op 1 . + CDS 34182 - 35264 725 ## BT_4226 hypothetical protein 26 13 Op 2 . + CDS 35278 - 36321 842 ## BT_4225 hypothetical protein 27 13 Op 3 . + CDS 36388 - 39216 1763 ## BT_4224 hypothetical protein 28 13 Op 4 . + CDS 39239 - 39499 278 ## gi|253569999|ref|ZP_04847408.1| conserved hypothetical protein + Term 39574 - 39637 5.1 + Prom 39504 - 39563 2.1 29 14 Tu 1 . + CDS 39663 - 39806 152 ## gi|298383894|ref|ZP_06993455.1| toxin-antitoxin system, antitoxin component, Xre family + Term 39896 - 39928 -0.9 + Prom 39808 - 39867 3.6 30 15 Tu 1 . + CDS 40106 - 41179 572 ## COG2184 Protein involved in cell division + Prom 41416 - 41475 4.7 31 16 Op 1 . + CDS 41501 - 41998 462 ## BT_4221 hypothetical protein 32 16 Op 2 . + CDS 42019 - 43677 2144 ## COG2268 Uncharacterized protein conserved in bacteria + Term 43696 - 43743 9.2 + Prom 43679 - 43738 8.1 33 17 Op 1 . + CDS 43761 - 44438 473 ## BT_4219 hypothetical protein + Prom 44448 - 44507 1.6 34 17 Op 2 . + CDS 44529 - 45545 1062 ## COG1702 Phosphate starvation-inducible protein PhoH, predicted ATPase 35 17 Op 3 . + CDS 45592 - 46536 1019 ## COG0152 Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase 36 17 Op 4 . 
+ CDS 46539 - 47276 560 ## PROTEIN SUPPORTED gi|163754278|ref|ZP_02161401.1| 30S ribosomal protein S15 37 17 Op 5 . + CDS 47315 - 48061 708 ## COG0169 Shikimate 5-dehydrogenase 38 17 Op 6 . + CDS 48136 - 49086 652 ## COG1073 Hydrolases of the alpha/beta superfamily 39 17 Op 7 8/0.000 + CDS 49159 - 50061 448 ## COG1512 Beta-propeller domains of methanol dehydrogenase type 40 17 Op 8 . + CDS 50112 - 50693 712 ## COG1704 Uncharacterized conserved protein - Term 50717 - 50769 18.1 41 18 Tu 1 . - CDS 50775 - 50900 100 ## - Prom 50994 - 51053 4.3 + Prom 50836 - 50895 7.2 42 19 Op 1 . + CDS 51019 - 52185 1155 ## COG0150 Phosphoribosylaminoimidazole (AIR) synthetase 43 19 Op 2 . + CDS 52189 - 53301 1190 ## COG0216 Protein chain release factor A 44 19 Op 3 . + CDS 53318 - 54142 1025 ## COG0284 Orotidine-5'-phosphate decarboxylase 45 19 Op 4 . + CDS 54152 - 55381 1032 ## COG1078 HD superfamily phosphohydrolases + Prom 55383 - 55442 3.4 46 20 Op 1 . + CDS 55463 - 56503 1015 ## COG1044 UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase 47 20 Op 2 . + CDS 56515 - 57900 1189 ## COG0774 UDP-3-O-acyl-N-acetylglucosamine deacetylase 48 20 Op 3 . + CDS 57911 - 58678 848 ## COG1043 Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase 49 20 Op 4 . + CDS 58700 - 59254 650 ## BT_4204 hypothetical protein + Term 59270 - 59329 15.1 + Prom 59290 - 59349 3.7 50 21 Tu 1 . + CDS 59373 - 60299 892 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase - Term 60219 - 60261 7.5 51 22 Tu 1 . - CDS 60272 - 60715 359 ## BT_4202 hypothetical protein - Prom 60741 - 60800 7.3 52 23 Tu 1 . - CDS 60830 - 63271 1461 ## BT_4201 hypothetical protein - Prom 63310 - 63369 6.1 53 24 Op 1 . - CDS 63373 - 65505 958 ## BT_4200 hypothetical protein 54 24 Op 2 . - CDS 65523 - 67139 711 ## BT_4199 hypothetical protein 55 24 Op 3 . - CDS 67174 - 67539 385 ## BT_4198 hypothetical protein - Term 68027 - 68083 19.1 56 25 Op 1 . 
- CDS 68176 - 68478 265 ## BT_4196 hypothetical protein 57 25 Op 2 . - CDS 68557 - 68733 174 ## BT_4195 hypothetical protein 58 25 Op 3 . - CDS 68797 - 69744 1044 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs - Prom 69963 - 70022 6.8 + Prom 69713 - 69772 5.7 59 26 Op 1 . + CDS 69924 - 72224 1977 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases 60 26 Op 2 . + CDS 72224 - 73072 594 ## COG0320 Lipoate synthase 61 26 Op 3 . + CDS 73130 - 74251 614 ## BT_4191 hypothetical protein + Term 74343 - 74382 0.1 62 27 Op 1 . - CDS 74217 - 74435 116 ## BT_4190 putative S-adenosylmethionine-dependent methytransferase 63 27 Op 2 . - CDS 74383 - 74922 415 ## COG0313 Predicted methyltransferases 64 27 Op 3 . - CDS 74963 - 75931 849 ## BT_4189 hypothetical protein 65 27 Op 4 . - CDS 75977 - 76834 1035 ## COG0623 Enoyl-[acyl-carrier-protein] reductase (NADH) - Prom 77008 - 77067 7.6 + Prom 76978 - 77037 9.6 66 28 Tu 1 . + CDS 77184 - 78800 1334 ## COG5434 Endopolygalacturonase + Term 78854 - 78908 7.1 + Prom 78846 - 78905 3.8 67 29 Tu 1 . + CDS 78952 - 80184 1100 ## BT_4186 hypothetical protein + Term 80245 - 80291 3.5 + Prom 80270 - 80329 3.6 68 30 Op 1 . + CDS 80349 - 82025 1319 ## COG3507 Beta-xylosidase 69 30 Op 2 . + CDS 82099 - 82749 733 ## COG0546 Predicted phosphatases + Term 82786 - 82844 6.6 - Term 82784 - 82819 -0.6 70 31 Tu 1 . - CDS 82862 - 84175 952 ## BT_4183 pectate lyase L precursor + Prom 84411 - 84470 4.0 71 32 Op 1 . + CDS 84520 - 88815 2730 ## COG0642 Signal transduction histidine kinase + Term 88853 - 88897 -0.8 72 32 Op 2 . + CDS 88903 - 91872 2552 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 92032 - 92097 1.4 73 33 Op 1 . - CDS 92617 - 93327 565 ## BT_4180 acetyl xylan esterase A 74 33 Op 2 . - CDS 93352 - 93486 61 ## gi|298383848|ref|ZP_06993409.1| polysaccharide deacetylase - Prom 93555 - 93614 4.1 75 34 Tu 1 . 
- CDS 93643 - 97944 2723 ## COG0642 Signal transduction histidine kinase - Prom 98014 - 98073 4.1 - Term 98013 - 98067 14.1 76 35 Op 1 . - CDS 98087 - 98401 375 ## COG3254 Uncharacterized conserved protein 77 35 Op 2 . - CDS 98432 - 99829 1409 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins 78 35 Op 3 . - CDS 99833 - 101674 1633 ## BT_4175 hypothetical protein - Prom 101894 - 101953 8.4 + Prom 101654 - 101713 6.3 79 36 Op 1 . + CDS 101941 - 103080 887 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins 80 36 Op 2 . + CDS 103191 - 104735 1552 ## COG2755 Lysophospholipase L1 and related esterases + Term 104765 - 104813 14.1 - Term 104752 - 104800 14.1 81 37 Tu 1 . - CDS 104814 - 108110 2075 ## BT_4172 hypothetical protein - Prom 108130 - 108189 6.0 - Term 108290 - 108337 10.1 82 38 Op 1 . - CDS 108363 - 109016 436 ## BT_4171 hypothetical protein 83 38 Op 2 . - CDS 109038 - 110645 1077 ## BT_4170 putative pectate lyase L precursor 84 38 Op 3 . - CDS 110678 - 112429 1463 ## BT_4169 hypothetical protein 85 38 Op 4 . - CDS 112435 - 115587 2436 ## BT_4168 hypothetical protein 86 38 Op 5 . - CDS 115604 - 117346 1402 ## BT_4167 hypothetical protein 87 38 Op 6 . - CDS 117352 - 119097 1162 ## BT_4166 putative lipoprotein 88 38 Op 7 . - CDS 119106 - 120665 1251 ## BT_4165 hypothetical protein 89 38 Op 8 . - CDS 120705 - 123971 2685 ## BT_4164 hypothetical protein 90 38 Op 9 . - CDS 123990 - 126323 1514 ## BT_4163 hypothetical protein - Prom 126351 - 126410 8.2 - Term 126818 - 126864 8.2 91 39 Tu 1 . - CDS 126882 - 128960 1444 ## BT_4162 hypothetical protein + Prom 129090 - 129149 6.8 92 40 Tu 1 . + CDS 129373 - 129975 375 ## BT_4161 hypothetical protein + Term 130001 - 130052 6.0 - Term 129988 - 130040 10.0 93 41 Op 1 . - CDS 130114 - 132210 1416 ## COG1874 Beta-galactosidase 94 41 Op 2 . 
- CDS 132239 - 133900 1440 ## BT_4159 hypothetical protein - Prom 133934 - 133993 4.3 95 42 Tu 1 . - CDS 134141 - 134833 575 ## BT_4158 hypothetical protein + Prom 134796 - 134855 5.7 96 43 Op 1 . + CDS 135077 - 136660 1355 ## BT_4157 alpha-galactosidase precursor 97 43 Op 2 . + CDS 136685 - 138358 1327 ## COG3250 Beta-galactosidase/beta-glucuronidase + Prom 138378 - 138437 2.0 98 43 Op 3 . + CDS 138458 - 139570 715 ## BT_4156 beta-galactosidase + Term 139663 - 139713 10.2 - Term 139653 - 139697 9.2 99 44 Tu 1 . - CDS 139714 - 142122 2001 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 142200 - 142259 4.4 - Term 142221 - 142279 15.1 100 45 Tu 1 . - CDS 142291 - 144216 1320 ## BT_4175 hypothetical protein - Prom 144400 - 144459 6.0 - Term 144430 - 144473 4.8 101 46 Op 1 . - CDS 144474 - 145862 1063 ## COG5434 Endopolygalacturonase 102 46 Op 2 . - CDS 145896 - 146864 819 ## BT_4154 chitin deacetylase 103 46 Op 3 . - CDS 146877 - 148175 771 ## COG5434 Endopolygalacturonase 104 46 Op 4 . - CDS 148219 - 151200 2384 ## COG1874 Beta-galactosidase - Prom 151232 - 151291 2.1 + Prom 151287 - 151346 3.5 105 47 Op 1 . + CDS 151443 - 154319 2308 ## COG3250 Beta-galactosidase/beta-glucuronidase 106 47 Op 2 . 
+ CDS 154328 - 155276 937 ## COG2755 Lysophospholipase L1 and related esterases Predicted protein(s) >gi|226332219|gb|ACIC01000101.1| GENE 1 59 - 712 259 217 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 17 198 123 309 331 60 25.0 2e-09 MNSIRDNINDEFIKVIAQQNQMHVLPDSTKVWMESGSSIKYTKAFNKKREVWLEGNSFFE VYKHEGSFFQVHINKAFIEVKGTCFQIKQTNAEKNEITLFHGKIEFNVESTGEKIIMSPS QKVMYNPNNAQTLVENVMDINWKDGRYNFKETPLSQLISIINQIYQSNIILEGKFTKQPS FTGSIRYDETLDDVIDKLCFSLNLTYKKQNRKIVIYN >gi|226332219|gb|ACIC01000101.1| GENE 2 763 - 4116 2404 1117 aa, chain + ## HITS:1 COG:no KEGG:BT_4247 NR:ns ## KEGG: BT_4247 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1117 1 1117 1117 2162 100.0 0 MKKNHSCYRRRYLKHIALVLLFYPLSALGAQGHISIKGQSITIMQAIQLIEKNSDYTFFY NAADLEDKQRKDINCNGPIDEILDEVFKDSGISYIVKNKEVILNVQKTNNTQQKKKRTVT GTIIDAVDDSPIIGANITIKGDKNTGTISDIDGNFSLSIPDNKTILVVTYIGYKTREVPV EDLGNIKIVLEGDDHTLNEVVVVGSGTQKKVSVTGAITSIKGASLKLPSSTLSNSFAGKL AGVIAKTNSGEPGSGAEFYIRGIGTFGGRATPLILLDDVEISSGDLNYVPAENIESFSIL KDASATAIYGSRGANGVMIVTTKGGEYNSKTSINVTAENSFNYLDKFPEFVDGATYMDMY NKASLARNSSATPKYSATDMERTASGVNPYLYPDVNWQDVLFKNMSMRQRANVNISGGGS KVKYYMSLDVSHDSGLLNTGKAYSWNNNINIMNYTFQNNIAYKLTPTTTIKMNMNAQIRQ KKSPNVSSEDLFKQILTTTPIEFPVTYPSQDGRIMYGNNIISGSTLYTNPYARMMTSFAE TNENTLNTVIKIDQDLDFITKGLKINAFVNFKTWSSSYFDRSIAPYYYRIKSGSYDETDL ENTNYELELLNSNGSDYISQSAIGKSSDQTFELQFNLNYARQFGLHNVGAMLLYKQREYR SDVLPNRNQGLSGRLTYDYGQRYLFEFNFGYNGTERLAKEDRFGFFPAVSLGWVISNEAF FEPMKNVVDNLKIRGSYGLVGSDDLATAGGSYYLYIDKITNNDLSYLKWTSGQNMDYQLG GPQMAYYAMSGLGWEKVKKLDIGIDFTLFRNWTFTFDYFYDKRYDIFMNREAWPQSLAYH IAKPWSNIGKMDNKGVEFSINYANNFSKDLSVSLQANFTYNKNKMVYVDEPEYPTIWKSE TGKPYSRITGYIAEGLFKSQEEIDNSPAQNLGSTPKVGDIKYRDLNGDGIIDSDDQTMIS KYGSTPRLQYGFGGTVNYKKFDFGVFFSGSALRSIMTNGIDPFQEGIAVGNRNVLKYIAD 
NYWSEEKQNWDAKYPRLGLLATDVANNTVNSSYWLRNGSFLRLKNVEVGYKIPYGRIFVS AANLFTFSPFDLWDPELSSWNSYPMQKTVNVGIQLQF >gi|226332219|gb|ACIC01000101.1| GENE 3 4135 - 6063 1573 642 aa, chain + ## HITS:1 COG:no KEGG:BT_4246 NR:ns ## KEGG: BT_4246 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 642 1 642 642 1306 100.0 0 MKLTRIFQNISIAVIITALGLTSCDYLDVVPPEQPNIDDAMSSPTRALGFLYSCYGGVST DLPSAYLGEINSTTDEYVLPYSWNTDGYWGAYAFNTASSTNQDWLWGTTYQYIGQCYLFL QKLENAGSDIASDAEKEQWRAECQFLVAYYHFATLRRYGPIPITDSYIPMDTPTSEYNGR FHFDYCVDWIANQLDEAAKVLPANRTVTNEWGRATSTIAKAVKARLLLYAASPLWNGSFP YPNWQNENFETPGYGKALVSNTYDKSKWERALAACQEALTLATTSGDRELYDDDEYYSRQ SLNLPFVPGVADVEDNKEFLKNVMKMRYAVSTRESEGNKEIIWGLSNQFDFYSRYPLRIL KKSDGTWHAGYSGVSPTLYTFEHFYTANGKLPEKDLDFTPSSEWFESAGISSREDIIKLN VGREPRFYAWMAFDGGDYGTKFAAGSPLKLEMRNSEMHGYNPSLFNRDHSVTGFLTQKFV DPVTEFYTAGGSTSGTSAPTILFRLAELYLNVAECHAALGNTQEAIDALNPVRERAGIPK LTLADITNNMTIKDWVHNERFVELWNEGHRFFDVRRWAEGAKYFGANKREGLNAEVQSPT FEEFNKRTTVDAPYVWENRMYLNPVFYNEVYKNPQMVQAPGY >gi|226332219|gb|ACIC01000101.1| GENE 4 6100 - 7392 721 430 aa, chain + ## HITS:1 COG:no KEGG:BT_4245 NR:ns ## KEGG: BT_4245 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 430 1 430 430 820 99.0 0 MKHISIALCLCGMILMNTSCDNYDDTYPQEYEKILSLQTTGEQAVDLYKTGEATNYSITV IKSGSQPTLTASAHIGAMDAVNFEKYISERGLDYVAMPANCYSFNMEELDYSSAETYKII NLQLNTNEIEIFEQTLEEGQTCVLPIMLTSTSDSILADKNTLVLKPEIITPSLSVTESSS GTVTKYLPQSGGTIALDLGLQVENQWNFTCKVAIDETTTTLEGATLVSDIISFEPGNTSS VQVNIPKFTKTSGNVGIKILEINGKDGFEFDTNPFILVASVEKYSLTADMLSSNAIEPSE GSLANLLDGDIGTYFHSAWSVSIADKHYVQVKLPVSTKTFRFTYTNRSNNGNAALAWFNL YTGTNENNLQLYKRFAWDEDGLPSGAAGVYVSPDVSIDNAANTLRFECESNWTGGSFFVW SEFSLFILSE >gi|226332219|gb|ACIC01000101.1| GENE 5 7475 - 10048 1626 857 aa, chain + ## HITS:1 COG:no KEGG:BT_4244 NR:ns ## KEGG: BT_4244 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 857 1 857 857 1663 100.0 0 
MTIKRFITNLLALFTLFTVSLACKDTEKSIINSSFSISEEYLIQNLDKSSTSVQIPINTS MELAQWSVSYEANWLQCSKQKTAAEGTFLRITVNENTGETKRTANIKVTSTTATYTITVN QYAKGEVIVEGDIKVTPTGGKASEHQEGQDIENTYDGKFSTDGAAPFHTPWGQSAKFPVT LEYYFKGDTEIDYLIYYTRSGNGNFGKVKVYTTTNPDRSDYTLQGEYDFKEQNAPSKVSF SEGIKATGIKFEVLSGLGDFVSCDEMEFYKTNTDKTLDKQLLTVFTDITCTEIKNNVTNE QIQALPDYFVRIAEAVRDNTYDKWEKEFRIRSYEPYSNIAEWADKLMTKKYSDLDNPTGI SVKAGDDIIVLVGDTYGQNISMQCIWETGTEYKQTASSGDVYMLNPGVNKLTMKGEGQLF VMYNTELTSNTAKPIKIHIPLGSGTVNGFFDLKEHKTDEKYAELLKKSTHKYFCIRGEKI MFYFHRNKLLEYVPNNILSAIHLWDNIVGWQQELMGIDDVRPSQVNNHLFAISPEGSYMW ASDYQIGFVYTYLGNILLEDNVMAAEDNAWGPAHEIGHVHQAAINWASSTESSNNLFSNF IIYKLGKYKSRGNGLGSVATARYANGQAWYNMGDATHQNEDTETHMRMNWQLWIYYHRCE YKTDFWQTLFKLMREVNMTEGEDPGKKQLEFAKMASKAANQNLTDFFEMWGFFEPVNTTI EQYGTYKYYVSDAMIREAKEYMAQFPAPKHAFQYIEDRKKSEFPSNDYRYSAVGDVGYYT QFKENQKITKAITAELAGRKVSIQNGDEAVAFELRENDENGKLLYFSTFTTFEIPSSILM VNAKLYAVQADGKRILL >gi|226332219|gb|ACIC01000101.1| GENE 6 10282 - 11685 1392 467 aa, chain + ## HITS:1 COG:TM0585 KEGG:ns NR:ns ## COG: TM0585 COG0673 # Protein_GI_number: 15643351 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Thermotoga maritima # 64 233 5 180 360 65 26.0 2e-10 MKKLLLNTLIGLALLTCQTSFAQKTKAKFSPIKVETPARPANQQDVIQLVTPKLETVRVG FIGLGMRGPGAVERWTHIPGTQIVALCDLLPERVENAQKILEKAGLPKAASYAGDEKAWK KLCERDDIDVVYIATDWKHHADMGVYAMEHGKHVAIEVPAAMTLDEIWKLINTSEKTRKH CMQLENCVYDFFELTSLNMAQQGVFGEVLHVEGAYIHNLEDFWPYYWNNWRMDYNQKHRG DVYATHGMGPACQVLNIHRGDRMKTLVAMDTKAVNGPAYIKKSTGKEVKDFQNGDQTTTL IRTENGKTMLIQHNVMTPRPYSRMYQVVGADGYASKYPIEEYCLRPTQVDSNDVPNHEKL NAHGSVSEDVKKALMAKYKDPIHKELEETAKKVGGHGGMDYIMDYRLVYCLRNGLPLDMD VYDLAEWCCMAELTRLSIENGSAPVEVPDFTRGGWNKVQGYRHAFAE >gi|226332219|gb|ACIC01000101.1| GENE 7 11794 - 12684 512 296 aa, chain + ## HITS:1 COG:TM0177 KEGG:ns NR:ns ## COG: TM0177 COG1284 # Protein_GI_number: 15642951 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 13 294 1 281 283 149 32.0 8e-36 
MKADILKPSRQTILREAKDYVMIAVGMILYGIGWTVFLLPNDITTGGVPGIASIVYWATG FPVQYTYFSINFFLLLLALKLLGLKFCIKTIFGVFTLTFFLSVIQQLASGVSLLRDQPFM ACVIGASFCGGGIGVAFCANGSTGGTDVIAAIINKYRDITLGRVILICDMIIITSSYFVL KDWEKVVYGFATLYICSFVLDQVVNSARQSVQFFIISNKYEEIGRHINEYPHRGVTIINA TGFYTGREVKMMFVLAKKRESTIIFRLIKDIDPNAFVSQSAVIGVYGEGFDHIKVK >gi|226332219|gb|ACIC01000101.1| GENE 8 12699 - 16082 3101 1127 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 33 1091 5 966 1087 692 37.0 0 MKNMKYRSYHLLLAGIFSLLTTGAMAEKTSDKPYWQDVQVVSVNKEYPRTSFMTYDNRTD ALTGKFEKSNYYQLLNGTWKFFFADSYKDLPANITDPSVSTDSWNDIKVPGNWEVQGYGV AIYTNHGYEFKPRNPQPPSLPEANPVGVYRRDIDIPADWKERDIYLHLAGAKSGVYVYIN GQEVGYSEDSKNPAEFLINKYVKPGKNVLTLKIFRWSTGSYLECQDFWRISGIERDVYLY SQPKAALKDFRITSTLDDSYKNGVFALNVDLRNHRPAATNLTLAYELLDAQGKVVATEEK TTYVPVNETRTLSFDKNLTDVHTWTSEHPNLYKLVMTVKEEGKVNEIIPFNVGFRRIEIK PIDQKAANGKPYVCLFINGQPLKLKGVNIHEHNPETGHYVTEELMRRDFELMKQHNLNSV RLCHYPQDRRFYELCDEYGLYVYDEANIESHGMYYDLRKGGSLGNNPEWLKPHMDRTINM FERNKNYPSVTFWSLGNEAGNGYNFYQTYLWLKEADKNIMQRPVNYERAQWEWNTDMYVP QYPGAGWLEDIGKNGSDRPIVPSEYAHAMGNSTGNLWGQWQAIYKYPNLQGGYIWDWVDQ GLLQKDKNGREYWAYGGDFGVNAPSDGNFLCNGLVNPDRGLHPAMAEVKYVHQNVGFDAI DAASGKFNITNRFYFTNLKKYQIHYSVLANGKTVKSGKVSLDIAPQASKEFTVPVNGLKA RPGTEYFVNFSVIAVEPEPLIPAGYEIAYDQFRLPVEAERNTYKANGPALQTQTQGDELT VSSSKVNFVFNKKSGLVTSYKVDGTEYFKDGFGIQPNFWRAPNDNDYGNGAPKRLHVWKQ SSKDFHVTDTKVSTEDKAVLLQATYLLAAGNLYVVTYKIYPSGIVHVNAKFTSTDMQAAE TEVSEATRMATFTPGSDAARKAASKLEVPRIGVRFRLPAQMNNVQYFGRGPEENYIDRNH GTMVGIYQTTADKMYFNYVRPQENGHHTDTRWLNLSPDKGNGLVIVADSLIGFNALRNSV EDFDSEEALPHPYQWNNFSPEEAANHDEKAARNVLRRMHHVNDVTPRDFVEVCVDMKQQG VGGYDSWGARPEPFHQIPANREYNWGFTLVPVRSGSQATEAAKYDYR >gi|226332219|gb|ACIC01000101.1| GENE 9 16112 - 17203 1121 363 aa, chain + ## HITS:1 COG:no KEGG:BT_4240 NR:ns ## KEGG: BT_4240 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 1 363 1 363 363 745 100.0 0 MKDLSSIVAKFNTQGTITEIKPLGAGLINDTYKVNTQEADAPDYVLQRINHAIFQNVEML QSNIAAVTGHIRKKLTEAGEADIDRKVLSFLATEEGKTYWFDGESYWRVMVFIPRAKTYE TVNPEYSNYAGEAFGNFQAMLADIPETLGETIPDFHNMEFRLKQLRDAVAANAAGRVAEV QYYLDEIEKRADEMCKAERLFREGKLPKRVCHCDTKVNNMMFDEDGKVLCVIDLDTVMPS FIFSDYGDFLRTGANTGDEDDKDLNRVNFNMEIFKAFTKGYLKGAKSFLTPIEIENLPYA AALFPYMQCVRFLADYINGDTYYKIKYPEHNLVRTKAQFKLLQSVEEHTPEMVAFINECL VNG >gi|226332219|gb|ACIC01000101.1| GENE 10 17249 - 17899 603 216 aa, chain - ## HITS:1 COG:no KEGG:BT_4239 NR:ns ## KEGG: BT_4239 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 216 1 216 216 405 100.0 1e-112 MKSNKITHTIAFFIIGLFTFVSSQAQEAKTLFINVPDSLCPLLTKVNREDCIDFLSSKMK AQVENRFGQKSEMTDLSKDYIRMQMTPETTWQMKVLALNDTTNVICTVATACAPACDSSI RFYTTDWKPLTADSFITLPVMNDFLLTPDSTAIYEFDEASRSADILLMKVDMNKENTELA VTLSTPDYMSKETAEKLKPFLRRPIVYQWKNGAFTK >gi|226332219|gb|ACIC01000101.1| GENE 11 17910 - 18809 873 299 aa, chain - ## HITS:1 COG:CAC1984 KEGG:ns NR:ns ## COG: CAC1984 COG0697 # Protein_GI_number: 15895255 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Clostridium acetobutylicum # 2 296 4 283 285 126 33.0 4e-29 MWLLLAFLSATLLGFYDVFKKQSLKDNAVLPVLFLNTLFSSLIFLPFILVSAFEPDLFWG TIFNVPVAGWEQHKYIIIKSFIVLSSWIFGYFGMKHLPITIVGPINATRPVMVLVGAMLV FGERLNLYQWIGVMLAVASFFMLSRSGKKEGIDFKHNKWILFIILAAVMGAVSGLYDKFL MKQLNPMLVQSWYNVYQFFIMGTIIFLLWWPKRKTTTPFRWDWTIILISVFLSAADFVYF YALSYDDSMISIVSMVRRGSVIVSFIFGALFFREKNLKSKAIDLILVLIGMIFLYLGSK >gi|226332219|gb|ACIC01000101.1| GENE 12 18874 - 19683 733 269 aa, chain - ## HITS:1 COG:VC0999 KEGG:ns NR:ns ## COG: VC0999 COG0101 # Protein_GI_number: 15641014 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Vibrio cholerae # 3 246 2 244 264 155 38.0 1e-37 
MQRYFIYLAYDGTNYHGWQIQPNGSSVQECLMKALSTFLRRDVEVIGAGRTDAGVHASLM VAHFDHEDVLDTVTVADKLNRLLPPDISVYRVRQVKPDAHARFDATARTYKYYVTTAKYP FNRQYRYRVYGSLDYERMNEAARTLFEYIDFTSFSKLHTDVKTNICHISHAEWTKMEGED TTWVFTIRADRFLRNMVRAIVGTLLEVGRGKLTVEGFRKVIEQQDRCKAGTSAPGNALFL VNVEYPEDIFSEETKKSVLSTLLPEGGEC >gi|226332219|gb|ACIC01000101.1| GENE 13 19731 - 21377 1571 548 aa, chain - ## HITS:1 COG:alr2240 KEGG:ns NR:ns ## COG: alr2240 COG0784 # Protein_GI_number: 17229732 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Nostoc sp. PCC 7120 # 295 428 4 143 196 110 43.0 9e-24 MRNRLLMLFVQELRTPLSLIIAPLKELQKDDAQTSQLSLQVAYRNSLRMVDACDQLLAVY GQGNMESKLEVAPYSVDKMIDSSLFGVRDLLKVYTIDFHCEKRIKKEMEFYVDKKKIEFV IHNLLTNAFTHTHYAGTVSLSVCEVVNDNMHYVCLIVEDDGKERVRTVEQLMSENEGSER DMSAAQMGFTIMQQMIEAHYGTITLESTEGKGTKVTVNLPADRDVFENNPNIQFIDPEEL TEVAELEPESAETQKQDLAVEDAVVQQDQQLPLFAEALPAEEVASPVAGGVKKTILIVED HKDIRLYLKVLFGNEYNLLMATNGQEGVDTAMKEMPDLIICDVMMPVKDGFECCREVKEN PETCSIPFIMLTAKVEDDDIIHGLQLGADDYVLKPFTPGILKAKVSSLINGRQTLKQMYT KLFKLPGTDTIVVSEPEQAGEEVKTEDPFITAVIKIVEENICEADFSVKKLAAEMNMSQP TLYRKVKQSTDYTIIELIRGVRMRRAGVLLKTKQYAVQEVAEMVGYNDIPTFRKHFVDAF GTTPSTYE >gi|226332219|gb|ACIC01000101.1| GENE 14 21383 - 23887 1641 834 aa, chain - ## HITS:1 COG:all1523 KEGG:ns NR:ns ## COG: all1523 COG3292 # Protein_GI_number: 17229015 # Func_class: T Signal transduction mechanisms # Function: Predicted periplasmic ligand-binding sensor domain # Organism: Nostoc sp. 
PCC 7120 # 400 649 100 330 386 61 28.0 9e-09 MRKFFFLLLLLTVFLPSYSIDTHWDLEPQIIKNGVGNNTIYHVCYGSEGFMWFSTDKGIS RYDGFRFRDYPLIMSVDSLSTPLHQAVKTLKEGSDGLFYASLYQGGITCFDKEMEKFLPV RFDRSLKLKEIQDFCWNDGSLYLATSHGLFESRPIRKKEGKEDFVYCILNPEPLIKGKIT NLCTDGKTNLYFAVDREKVIHYDLVTKRTSVIREYDVVNRLFLRQGYLWICRLWNDIVCY DLKSHKERVVSLEGGDQPDFSNSYVTDMVMKDKQTFYLTTWDGLYKLHFENLNLCESPFT LTLLTQGERAFHSRIENKMTSLLWDNRQQILWVGTFGGGVVKFDISDSMYSRVRQNFKSR VDGMVEDAKGYIWLTVTDGGIMRSTTPSFSMDTHFEPWKKVSGLSGRYHIYKGKDGNIWL GNNFGEIISVNPLTEDVESFQLKNSEGGRMQAVVHCFCLDSWNRLWVGTSNGLVQVDPKT HGCKNIKLPGEIKNVFAITEDKEGNVWVGTDKGLKRIETNGDQIRVEGNYEKENGLEEAG VRTLYVNNYNQIYAAYLNVVIRIDGREKDKIESVYTLQSGLTDGHVSCMVDDHIGNTWAG NNVGVMTIRNGQEAFYSYLSVGNCSAVCRLNDGRLLWANSWGLIFFDPSATKGDSGKKHL MLTDVEVGGETVLAGEKRNGQMILAVSPEKQEKLVFASDNNDFHLFFSDLRYGLAQRKIA YRLLPADKEWKMIPLAEGLWFNGLAAGKYALQAKLVFPDGKEGEVIEIPLVVKGKWYQTI WAYIMYVLLLGGLSYFFYSYFRKKDQRKQIHRDREMILKENLNLEKLKQEQKKR >gi|226332219|gb|ACIC01000101.1| GENE 15 24192 - 24827 536 211 aa, chain - ## HITS:1 COG:no KEGG:BT_4235 NR:ns ## KEGG: BT_4235 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 211 1 211 211 369 99.0 1e-101 MKSIKFFFACVLALTFVACSNAESDSFEDNSSNFESMISLYGIQSATVDSKVGDVPSVTA EEMASVLEALRQNGNTIRNCESETLEGYYEGGDRKEVKMIAEYQARTRSGAFVEQFALCV SINFNIDNNGGVFYIGTTYSSSTDLFNWQGYGASLSTTAEGNSVFSSTSYLYFRVSDQGN CLVKVPVSFKGSYNFKDNKGTYSFTLSKAAN >gi|226332219|gb|ACIC01000101.1| GENE 16 25326 - 25895 590 189 aa, chain + ## HITS:1 COG:no KEGG:BT_4234 NR:ns ## KEGG: BT_4234 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 189 1 189 189 379 100.0 1e-104 MKKKRILFLFVTLFASTLLYSQVVGVKTNLVMDAMKIINLGAEVGLSKKLTLDLYANYNP WKYKDQKMMKMLAIQPELRYWFCDKFNGHFVGFHVHGGVYQAAAINMPWGIWPELKDHRF KGNFFGAGISYGYQWILAKHWNLEGNIGVGYARVKYEQFECKTCGEKVSEGHKNYLGPTK AAISLIYLF >gi|226332219|gb|ACIC01000101.1| GENE 17 25908 - 27362 1543 484 aa, chain + ## HITS:1 COG:no KEGG:BT_4233 NR:ns ## KEGG: BT_4233 # Name: not_defined # Def: hypothetical 
protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 484 1 484 484 955 100.0 0 MKKILCILFCLLPVLGGKAQTLYRDQVRIEKESITRNEDNNVLSIDMDIVMQENLKLSSN HVVTLTPFLESNGRTKILPSIVVYGRKREVVNQRNNIAPSENTYSVIRRKHNKEQIINYH QQFQFEAWMRDAEMKLNIDVCGCCDLKEETSGELITRLNILPVQLQPAISYITPQAEAVK HRAIEGSAFLDFPVNQIIIRPEYRRNTVELAKIRATIDSVRNDDKTTLSSIRIHGYASPE GGYANNTRLAKGRTQALVDYVTSYYKFDNKLITSEYTSEDWEGFRKFIAASSLEKKDEIL KLMDDSTLDIDKKERQIAQLAGPQTYKYILEECYPALRHSDYTVNYTVRGFNLEETKEII KTRPQLLSLQEIYRIAESCQPGSEEFNRSFQVAAAMFPDDPIANLNAGAMEIQKGGDLTT AKKHLAKADQKAPETLNNLGVIALLEGDYDAAEKYFNAAKAGGLTTQADTNMKQLKSIRN YPKE >gi|226332219|gb|ACIC01000101.1| GENE 18 27436 - 27621 308 61 aa, chain - ## HITS:1 COG:no KEGG:BT_4232 NR:ns ## KEGG: BT_4232 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 61 4 64 64 110 100.0 1e-23 MQVNPDVGSMDAVLDKLYGKVGTNKSYISRIEKGALEPGVGLFFRIIDALGLKVEIVKPM I >gi|226332219|gb|ACIC01000101.1| GENE 19 27605 - 27754 153 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253569990|ref|ZP_04847399.1| ## NR: gi|253569990|ref|ZP_04847399.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 49 23 71 71 92 100.0 7e-18 MNIYGAFFIFDEGNIVMLFNGFQKKTQKTPESEIEKAVKLKNEYYASKP >gi|226332219|gb|ACIC01000101.1| GENE 20 27888 - 28526 631 212 aa, chain - ## HITS:1 COG:no KEGG:BT_4231 NR:ns ## KEGG: BT_4231 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 212 1 212 212 427 100.0 1e-118 MAILFDWYENPVSPDQPKKKRFHPRIAYNGQVDTDEMRSKIQSRCTLNEVDVTAVLDALS QVMGEELGEGRQVHLDGIGYFYPTLTATEEIAADTPRRNMKIKLKAIRFRSDQKLKNSIG LIKVKQLKKNFHSSKLSEVEIDMLLKEYFSTHQMMQRRDFQSLCGMVRSTAMIHIRRICQ EGKLQNLGLRNQPIYVPVPGFYGTSRDQPTLR >gi|226332219|gb|ACIC01000101.1| GENE 21 28856 - 30115 849 419 aa, chain + ## HITS:1 COG:no KEGG:BT_4230 NR:ns ## KEGG: BT_4230 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 419 1 419 419 867 99.0 0 MMEKVTQMIAMLIKAWKKESVSQIDTPPVPATPEPKQQSPNNSACLTERVNTFLQTHYDF RYNRLTEETEFRPLSGAKTEFRPIGKRELNTLCMEAHAEGISCWDKDVSRYIYSTQIGEY HPFRLYMDELPPWDGIDRLTPLARRVSALPLWVKGFHTWMLGLAAQWEGKTGVHANSLAP ILISAEQGRMKSTFCKSLMPKVLQRYYMDNLKLTSEGQAERLLSEMGLINLDEFDKYAES KMPLLKNLMQMSSLHVCKAYQRNFRDLPRIASFIGTSNRSDLMSDPTGSRRFFCVEVEKP INCEGIEHEQIFAQLKAELADGCPYWFDKETEREIQQNNAPFYRTCPTEEVFLSCFRTSA SAEEKYLRLSAADIYKELKKRNAAALRGFNPNHFAQVLIKMGVERKHTEYGNVYLVVRR >gi|226332219|gb|ACIC01000101.1| GENE 22 30159 - 30809 368 216 aa, chain - ## HITS:1 COG:MA2121 KEGG:ns NR:ns ## COG: MA2121 COG2865 # Protein_GI_number: 20090964 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Methanosarcina acetivorans str.C2A # 30 213 265 449 458 64 27.0 1e-10 MVVPRLTRDIPRPFQMDGIIRKDDTPQHKAVREAFTNMIIHADLMLNGLLRVEKYDDRFV LSNPGLLKLPVEQIYAGGESKARNQRMQHMLRMIGYGENLGSGFPLILSAWNEKHWLKPE LVEQPELMQVKLILHIENRVYVTKDVTKDVTKELTERQQIILEFMQADGTITISEMSQKT NVTERTIKRDIESLTEKGILSREGGRKEGRWVIRKI >gi|226332219|gb|ACIC01000101.1| GENE 23 30829 - 31641 649 270 aa, chain - ## HITS:1 COG:FN0191 KEGG:ns 
NR:ns ## COG: FN0191 COG2865 # Protein_GI_number: 19703536 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Fusobacterium nucleatum # 1 261 1 245 477 71 25.0 2e-12 MTEQDIYKLLADGERVTLECKKAQNGVPNSLWETYSAFANTNGGILLLGVYENVAEKDSL MRFTITGVEDADKIRKDLWNTINSREKVNVNLLHDEDVQTISVNGKEVIAINVPRADYNL RPVYINNNLMRGTYRRNHEGDYHCTEQMIKMMVRDAYEDGNDRMFLEYYTMDDIDIPTLE GYRVMFKTNNPEHIWNSLDHKEFLMQLGGYVVNRKDGTEGLTIAGLLMFGKGLPVRERFD YLRMDYIDKSNLIGDQRYSDRLTYDGTWGE >gi|226332219|gb|ACIC01000101.1| GENE 24 32783 - 34063 700 426 aa, chain + ## HITS:1 COG:no KEGG:BT_4227 NR:ns ## KEGG: BT_4227 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 426 1 426 426 805 100.0 0 MKVKNLLLAGLAVAAMTACSNDEIIDNDGIQTNGENAAMQINFQFPTTRVVSDGGTNAGI DIEYAATEITAVLEYTETNKRIVVKGLTFGQQTESGARIYSTEKFQVEAGNNVKVYAFIN PIMEIPTTGLENLKVGKQKLPSEGLDYIAETIAKSGNFMMSNVNGEPTVIKAIVGGTDTN TANISVERISAKLTEETKRSIDNKYELTTPTFVGPKVYVQILSHTYTNLADDSYVLGGRN AAWSSFLHPYKTGEVSETDYRWLNATGTTYVLENLGSEWNTATSTSVLYQGQVFFENDGN QEIAGTFYSRPKLQSDGETWKVQIYKNWEELCADVDLTGIGENDDVALANRSIMRYAGGK CYYQAPIEHFGVGANIARNNWYQLKVTSIADLGYPKPVPPTPENETKLMMSVTIAPWTIH INNVGL >gi|226332219|gb|ACIC01000101.1| GENE 25 34182 - 35264 725 360 aa, chain + ## HITS:1 COG:no KEGG:BT_4226 NR:ns ## KEGG: BT_4226 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 360 1 360 360 659 100.0 0 MKRILTTIQEKANNILLFTILAGAITSCDSILDYNENCDIEYCVKFKYDYNMEEKDVFAE QVRTVTLYAFDDNNNLVFQNTDEGEPLGEETYAMNVDIDLSQYHLVVWAGLNDESFAVPL LYPNQAKIDELRVKTLRKEATRSTTEDEKGQYIVDNSLHSLWHGEVKKGTTTRSGRQQIT EVSLVKNTNTIRVVVAQVNQSGGPVTRLTQKTFECAIYDNNGYMNYDNTLLEDNLLTYKP YNVTSDVVSTRAFSSADEPAKQYNGIVSEMSVARLVESQKPELTIKNTATQEVLFQSSDL VKYFEEVDAEKYKDRNYSLQEYLDREDKYELVIFVDEKLALIKTVIQVNDWIIQLNDIEL >gi|226332219|gb|ACIC01000101.1| GENE 26 35278 - 36321 842 347 aa, chain + ## HITS:1 COG:no 
KEGG:BT_4225 NR:ns ## KEGG: BT_4225 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 347 1 347 347 660 100.0 0 MKYSHLTYYLLAILCMMATSCVSDGVMGDCPIDNNDQAVLIEGGKVNFALTFPTSSTRGI NGSSEGLASERTINDVQVYTFVGGKFVEKVKYILISGTNGDATRFVEGKLTETYMTGTAM DFVVMVNTESKGVQSPTMTKGNSKADLYKQLVYSYEGKNWSTNIPMWGEGTIASIQSGEY NIGELTLQRAIAKVNVTVNDGKGLENFEITNVSLHNYNNGGYCAPTNANGQPSIPTNVVK ATTPLSAGSLSGTQGNNIENKFYIPEHKNIGVDKAEQLYLKIEAKVKGQIKYYDIMFSEN GSDYDVLRNYMYVFNITNVKVEVDVNPILEYEVKIWEEKTVNIPSFN >gi|226332219|gb|ACIC01000101.1| GENE 27 36388 - 39216 1763 942 aa, chain + ## HITS:1 COG:no KEGG:BT_4224 NR:ns ## KEGG: BT_4224 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 942 17 958 958 1759 99.0 0 MTLACIAFNACTDEEIYKSSEIVEGVPVEVGFKFTIPSMQKVSTRGLSDEGEFQVNDLYV LIFNANNDQARKTGGGYYNSDLIKNVITGNQANGKVSSGTLTLKTTSGESLIYAVANVKG NDLGGDITAALENVNSLADFKKLSVALTIKANVERSSPALVMSGAYIDGSTQPQTGYCNI PAQSTTLSGKISLTRLDSHIIFKITPNMQANGGKIKTFTPKSWRVYNVPNKSYIVAQDAD AVGNTAEDYENTESSIRFGEQTDNIYDFDFYMLENRKNAKTYEGRSIENYKQREEEVKTN EHKNTGEYKYVEPYATFVEIKAHMEIENADNDNGIRVADVTYVIHLGYVDNVAADFKNER NKKYTYNVTINNVEDIVTEVTEEGNPENTPGAEGDIVDSQTTVYNLDAHYGYLILKFKYS EVKDGLQFYVKTPFGETNNDDKSVVGHDCKWLRFARCNNESTLANYPKDGKGLINLFELK ADIEDQYQKDTNKDNTYYYTVFVDEYFYENNPLETSSNNNWGNENWEEFVNKDDRYALLI FSPQKSPDGESSYASAKYMITQKSIQTYYSTEKFNSDKTALGMEHIDETGVPNGWESGSY GSSQENGYKNTYPVVNNTNISSYGTETLSNGKNTFTINDAANAIQACMARNRDENNDGKI SGSEVKWFLPAINQLVGMFLGAESLPTPLFGDGDKQPGTYTYNKKEIGTYGTYHYISSDK QRLWSEEGATFGPAAGILYAKAPEKLRCVRTLGISSQYNSTSKKEGKIYNMNNSYTFQMA YLDKQSIRTSFIENGELDLHHNFSSYNRPYTAFQVANKRMTIDGIETSNGWGGSNNRPRP TNWESLVKNSGLSRSVCTNYFENANKSDKGSWRAPNQRELMIIYLQDPSLVEYQVTDAYD YRYGSFTRTCWKFNENDHFTVDKDLITKGTVGSFVRCVRDVK >gi|226332219|gb|ACIC01000101.1| GENE 28 39239 - 39499 278 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253569999|ref|ZP_04847408.1| ## NR: gi|253569999|ref|ZP_04847408.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 86 1 86 86 173 100.0 4e-42 MTPKDILQLESSDMRQAYLFEENGRWYAYEHSANRIEKFVKGFVDFKHKVQDVCGRIKVE IDLSILEKCSITLCEDSLLVLDCPAA >gi|226332219|gb|ACIC01000101.1| GENE 29 39663 - 39806 152 47 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|298383894|ref|ZP_06993455.1| ## NR: gi|298383894|ref|ZP_06993455.1| toxin-antitoxin system, antitoxin component, Xre family [Bacteroides sp. 1_1_14] # 2 43 1 42 102 84 95.0 3e-15 MMTNKKIKINGCTPIEDLITEDFGAIGTPERDEFERGCEAFIIWDEK >gi|226332219|gb|ACIC01000101.1| GENE 30 40106 - 41179 572 357 aa, chain + ## HITS:1 COG:SP0571 KEGG:ns NR:ns ## COG: SP0571 COG2184 # Protein_GI_number: 15900482 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Protein involved in cell division # Organism: Streptococcus pneumoniae TIGR4 # 22 269 21 261 267 199 45.0 6e-51 MNDFEEYIRQSEPHKREKGYAWQTAIGLQAVDGLKPSEYLKEKARQHIEGDITIDEVKQL VDSYYKSKVARSSSEDRTEEADKVSARITEILSENTFTFSPIEYLAIHRHLFEGIFSHAG QIRDYNITKNEWVLKGATVLYASAGSIRETLEYDFSQEKIFDYKNLNIDEAIRHIARFVS GIWQIHAFGEGNTRTTAVFTIKYLHTFGFNFSNETFANHSWYFRNALVRANYNDLTKGVY ATTEFLEKFFRNLILNEQNELKNRNLQIDEIEKEAIQSAKQTDMDIPKCKNCTLDCTLEE IAVLNYLKEKPNATQKEIAQHIGKSERTVKSMTVNLSERGIIERKNGRRNGFWEIKK >gi|226332219|gb|ACIC01000101.1| GENE 31 41501 - 41998 462 165 aa, chain + ## HITS:1 COG:no KEGG:BT_4221 NR:ns ## KEGG: BT_4221 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 165 1 165 165 276 100.0 3e-73 MSHTIFLIIALVTTGIFLIQFVLSIFFGDIDADVDVDADISSVVSFKGLTHFGIGFGWYM YLVGNADIASYAIGILVGLFFVFAVWFLYKKAYQLQQVNRSEETDQLVGRECTIYFKQSD NKYTVQTNRDGAMREVDVISETGKAYQTGDRTIISSYKDGTLYIK >gi|226332219|gb|ACIC01000101.1| GENE 32 42019 - 43677 2144 552 aa, chain + ## HITS:1 COG:BS_yuaG KEGG:ns NR:ns ## COG: BS_yuaG COG2268 # Protein_GI_number: 16080153 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 1 370 1 378 509 114 29.0 5e-25 MTQEMLIMAAILVAVILLTFIGILSRYRKCKSDEVLVVYGKTGGEKKSAKLYHGGAAFVW 
PIIQGYEFLSMKPLQIDCKLTGALSAQNIRVDVPTTITVAISTDPEVMQNAAERMLGLTM DDKQNLITDVVYGQMRLVIADMTIEELNSDRDKFLSKVKDNIDTELRKFGLYLMNINISD IRDAANYIVNLGKEAESKALNEAQANIEEQEKLGAIKIANQIKERETKVAETRKDQDIAI AETKKQQEISVANADKERISQVAFANAEKESQVAKAEAEKNIRIEQANTEKESRVAELNS DMEIKQAEAAKKAAIGRNDAQKEVALSNAELAVTQANADKQAGEAAAKSEAAVQTAREIA QKEVEEAKAKKVESSLKAEKIVPAEIARQEAILQANAIAEKITREAEARAKATLAQAEAE AKAIQLKLEAEAEGKKRSLLAEAEGFEAMVRAAESNPAIAIQYKMVDQWKEIAGEQVKAF EHMNLGNITVFDGGNGGTSNFLSSLVKTVAPSLGVLDKLPIGETVKGIINPESKTEEKPA GKPEEKKEEKKK >gi|226332219|gb|ACIC01000101.1| GENE 33 43761 - 44438 473 225 aa, chain + ## HITS:1 COG:no KEGG:BT_4219 NR:ns ## KEGG: BT_4219 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 225 1 225 225 454 100.0 1e-127 MNNHKLHLLPLFILITGLVTLTAGCKKKDMSLKLNEPRNIRGVVSYKRSFPDLNDKHLAV AQAVGICPPEDRDAAEKMKEQLIHITDNQFYTVDSLTHSIPYLVPRASELLDTIGSNFLD SLTAKGLNPNQIIVTSVLRSQSDVKRLRRRNGNASANSAHCYGATFDVSWKRFKKVEDED GRPLQDVNADTLKLVLSEVLRDLRQADKCYIKYELKQGCFHITAR >gi|226332219|gb|ACIC01000101.1| GENE 34 44529 - 45545 1062 338 aa, chain + ## HITS:1 COG:DR1988 KEGG:ns NR:ns ## COG: DR1988 COG1702 # Protein_GI_number: 15806986 # Func_class: T Signal transduction mechanisms # Function: Phosphate starvation-inducible protein PhoH, predicted ATPase # Organism: Deinococcus radiodurans # 18 323 65 370 380 270 46.0 3e-72 MIEKLIVLEDIDPVIFYGVNNANIQLIKALYPKLRIVARGNVIKVLGDEEEMCAFEENIT KLEKYCAEYNSLKEEVIIDIIKGNAPQAEQTGNVIVFSVTGKPIIPRSENQLKLVEGFAK NDMVFAIGPAGSGKTYTAIALAVRALKNKEIKKIILSRPAVEAGEKLGFLPGDMKDKIDP YLQPLYDALQDMIPAAKLKEYMELNIIQIAPLAFMRGRTLNDAVVILDEAQNTTVQQIKM FLTRMGMNTKMIVTGDMTQIDLPASQTSGLVQALRILKGVKGISFVELNKKDIVRHKLVE RIVDAYEKFDKEAKAEREKRKNEQLVINGERPVKQVNN >gi|226332219|gb|ACIC01000101.1| GENE 35 45592 - 46536 1019 314 aa, chain + ## HITS:1 COG:CC3242 KEGG:ns NR:ns ## COG: CC3242 COG0152 # Protein_GI_number: 16127472 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase # Organism: 
Caulobacter vibrioides # 10 313 13 319 320 270 45.0 3e-72 MKALTKTDFNFPGQKSVYHGKVRDVYNINGEKLVMVATDRISAFDVVLPEGIPYKGQMLN QIAAKFLDATTDICPNWKMATPDPMVTVGVLCEGFPVEMIVRGYLCGSAWRTYKSGVREI CGVKLPDGMRENEKFPEPIVTPTTKAEMGLHDEDISKEEILKQGLATPEEYETLEKYTLA LFKRGTEIAAERGLILVDTKYEFGKHNGTIYLMDEIHTPDSSRYFYSDGYQERFEKGEPQ KQLSKEFVREWLMENGFQGKDGQKVPEMTPAIVQSISDRYIELFENITGEKFVKEDTSNI AERIEKNVMNFLSK >gi|226332219|gb|ACIC01000101.1| GENE 36 46539 - 47276 560 245 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163754278|ref|ZP_02161401.1| 30S ribosomal protein S15 [Kordia algicida OT-1] # 24 245 1 221 221 220 48 3e-56 MDYPQQHIKPYNEDGKKTEQVERMFDNIAHAYDKLNHTLSLGIDRSWRRKAIAWLRPFRP QHIMDVATGTGDFAILACRELNPDELIGTDISEGMMNVGREKVKKEGLSDKISFAREDCT SLSFADNRFDAITVAFGIRNFEDLDKGLSEMCRVLKPGGHLVILELTTPDRFPMKQLFAI YSKVVIPLLGKLLSKDNSAYHYLPDTIKVFPQGEVMKGVISKTGFGEVHFRRLTFGICTL YTATK >gi|226332219|gb|ACIC01000101.1| GENE 37 47315 - 48061 708 248 aa, chain + ## HITS:1 COG:MK0117 KEGG:ns NR:ns ## COG: MK0117 COG0169 # Protein_GI_number: 20093557 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Methanopyrus kandleri AV19 # 5 246 15 271 290 137 33.0 2e-32 MEKYGLIGYPLRHSFSIGYFNEKFRSEGINAEYVNFEIPNINDFMEVIEENPNLCGLNVT IPYKEQVIPFLNELDRDTAKIGAVNVIKIIRQPKGKVKLVGYNSDIIGFTQSIQPLLQPQ HKKALILGTGGASKAVYHGLKNLGIESVFVSRTHKTDDMLTYEELTPEIMEEYTVIVNCT PVGMYPKVDFCPNIPYELLTPNHLLYDLLYNPNVTLFMKKGEAQGAVTKNGLEMLLLQAF AAWEIWHR >gi|226332219|gb|ACIC01000101.1| GENE 38 48136 - 49086 652 316 aa, chain + ## HITS:1 COG:lin2180 KEGG:ns NR:ns ## COG: lin2180 COG1073 # Protein_GI_number: 16801245 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Listeria innocua # 1 313 1 317 319 218 36.0 1e-56 MKKRAVYSIIIIMLVLTSCTIGGSFYMLNFSLTPNAKILSKDADSYPYMYRNYPFLRPWV DSLKQVNALKDTFIINPQGIQLHAFYITAPAPTSKTAVIVHGYTDNAIRMFMIGYLYNRD LGYNILLPDLQHQGESEGRAIQMGWKDRIDVLQWMNIANEIFGDSTQMVVHGISMGGATT MMVSGEKQQPYVKCFVEDCGYTSVWDEFSHELKSSFHLPAFPLMYTTSWLCEKKYGWNFK 
EASSLKQVEKCELPMLFIHGDKDTYVPTWMVYPLYEAKPEPKELWIVPGAAHALSYKENK QEYTDKVRDFVGRYIH >gi|226332219|gb|ACIC01000101.1| GENE 39 49159 - 50061 448 300 aa, chain + ## HITS:1 COG:TM0962 KEGG:ns NR:ns ## COG: TM0962 COG1512 # Protein_GI_number: 15643722 # Func_class: R General function prediction only # Function: Beta-propeller domains of methanol dehydrogenase type # Organism: Thermotoga maritima # 12 161 5 150 238 89 32.0 1e-17 MKSILTLILATFLLIPLQAQEKVYTVDNLPKVHLQNKMQYVCNPAGILSQAACDTIDTML HALEQQTGIETVVAIVPSIGDMECFDFCHQLLNQWGVGKKGKDNGLVILLVTDQRCIQFY TGYGLEGVLPDAICKRIQTKYMIPYLKDGNWNEGMVAGIRATCQRLDGSMENESLSESNN ESMDFIFAVILFAVIGVGIAFFAARNQSRCPKCGKHALQRTGSRLVSRVNGVKTEDVTYT CKNCGNTIIRRQQSYDSDYHNRGGGGGGPFIGGFGGSGGGFSGGSFGGGMGGGGGAGSRF >gi|226332219|gb|ACIC01000101.1| GENE 40 50112 - 50693 712 193 aa, chain + ## HITS:1 COG:PM0785 KEGG:ns NR:ns ## COG: PM0785 COG1704 # Protein_GI_number: 15602650 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Pasteurella multocida # 1 193 1 192 193 196 55.0 2e-50 MKKSIIIILAVVAILVIWVVSVYNGLVTMDENVSGQWANVETQYQRRADLIPNLVNTVKG YASHEKETLEGVVEARSKATQIKVDAEGLTPEKLAEYQKAQGAVTSALGKLLAITENYPD LKANQNFLELQAQLEGTENRINVARKNFNDAAQSYNTNIRRFPKNIFAGMFGFDKKAYFE AEEGSEKAPKVEF >gi|226332219|gb|ACIC01000101.1| GENE 41 50775 - 50900 100 41 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVRIILIIMLIDSGKRLVKPLKRQEQKFLFNALLWKSFRGG >gi|226332219|gb|ACIC01000101.1| GENE 42 51019 - 52185 1155 388 aa, chain + ## HITS:1 COG:MJ0203 KEGG:ns NR:ns ## COG: MJ0203 COG0150 # Protein_GI_number: 15668375 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazole (AIR) synthetase # Organism: Methanococcus jannaschii # 47 369 49 323 350 114 29.0 2e-25 MSNQRYMMRGVSASKEDVHNAIKNIDKGIFPKAFCKIIPDILGGDPEYCNIMHADGAGTK SSLAYMYWKETGDLSVWKGIAQDALIMNIDDLLCVGAVDNILVSSTIGRNKLLIPGEVIS AIINGTDELLAELREMGVGVYATGGETADVGDLVRTIIVDSTVTCRMKRSDVINNANIRP GDVIVGLASYGQATYEKEYNGGMGSNGLTSARHDVFGKYLAEKYPESYDAAVPEELVYSG 
GLKLTDTVEDSPIDAGKLVLSPTRTYAPVVKKLLDTLRPEIHGMVHCSGGAQTKVLHFVE NVRVVKDNLFPVPPLFKTIQEQSGTDWAEMYKVFNMGHRLEVYLSPEHAEEVIAISESFG IPAQIVGRIEACDQTELIIKSEFGEFRY >gi|226332219|gb|ACIC01000101.1| GENE 43 52189 - 53301 1190 370 aa, chain + ## HITS:1 COG:VC2179 KEGG:ns NR:ns ## COG: VC2179 COG0216 # Protein_GI_number: 15642178 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor A # Organism: Vibrio cholerae # 5 365 3 355 362 330 47.0 3e-90 MADNSTILEKLDGLVARFEEISTLITDPAVIADQKRYVKLTKEYKDLDDLMKARKEYIQL LGNIEEAKNILSNESDADMREMAKEEMDNSQERLPALEEEIKLMLVPADPQDSKNAILEI RGGAGGDEAAIFAGDLFRMYAKFCETKGWKMEVSNANEGTAGGYKEIVCSVTGDNVYGIL KYESGVHRVQRVPATETQGRVHTSAASVAVLPEAEEFDVVINEGEIKWDTFRSGGAGGQN VNKVESGVRLRYIWKNPNTGVAEEILIECTETRDQPKNKERALARLRTFIYDKEHQKYID DIASKRKTMVSTGDRSAKIRTYNYPQGRITDHRINYTIYNLAAFMDGDIQDCIDHLIVAE NAERLKESEL >gi|226332219|gb|ACIC01000101.1| GENE 44 53318 - 54142 1025 274 aa, chain + ## HITS:1 COG:RSc2773 KEGG:ns NR:ns ## COG: RSc2773 COG0284 # Protein_GI_number: 17547492 # Func_class: F Nucleotide transport and metabolism # Function: Orotidine-5'-phosphate decarboxylase # Organism: Ralstonia solanacearum # 4 268 22 285 288 204 41.0 1e-52 MNKQQLFENIKRKKSFLCVGLDTDIKKIPEHLLKEEDPIFAFNKAIIDATADLCIAYKPN LAFYESMGVKGWIAFEKTVKYIKENYPDQFIIADAKRGDIGNTSAMYARTFFEELEIDSV TVAPYMGEDSVTPFLTYEGKWVILLALTSNKGSHDFQLTEDANGERLFEKVLRKSQEWAN DERMMYVVGATQGRAFEDIRKIVPNHFLLVPGVGAQGGSLEEVCKYGMNSTCGLIVNSSR GIIYVDKTEKFAEAARLAAQEVQVQMAEQLKAIL >gi|226332219|gb|ACIC01000101.1| GENE 45 54152 - 55381 1032 409 aa, chain + ## HITS:1 COG:BS_ywfO KEGG:ns NR:ns ## COG: BS_ywfO COG1078 # Protein_GI_number: 16080812 # Func_class: R General function prediction only # Function: HD superfamily phosphohydrolases # Organism: Bacillus subtilis # 4 406 10 410 433 191 30.0 2e-48 MPYERKIINDPVFGFINIPKGLLYDIVRHPLLQRLTRIKQVGLSSVVYPGAQHTRFQHSL GAFYLMSEAITQLTSKGNFIFDSEAEAVQAAILLHDIGHGPFSHVLEDTIVQGVSHEEIS LMLMERMNKEMNGQLSLAIQIFKDEYPKRFLHQLVSGQLDMDRLDYLRRDSFYTGVTEGN 
IGSARIIKMLDVADDRLVIESKGIYSIENFLTARRLMYWQVYLHKTSVAYERMLISTLLR AKELASQGVELFASPALHFFLYNDINHTEFHNNPDCLENFIQLDDNDIWTALKVWSNHPD KVLSTLSLGMINRNIFKVENSAEPIGEDRIKELTLQISQQLGITLSEANYFVSTPSIEKN MYDPADDSIDIIYKDGTIKNIAEASDMLNISLLSKKVKKYYLCYQRLHR >gi|226332219|gb|ACIC01000101.1| GENE 46 55463 - 56503 1015 346 aa, chain + ## HITS:1 COG:FN1909 KEGG:ns NR:ns ## COG: FN1909 COG1044 # Protein_GI_number: 19705214 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase # Organism: Fusobacterium nucleatum # 1 336 1 331 332 228 38.0 9e-60 MEFSAKQIAAFIQGEIIGDENATVHTFAKIEEGMPGAISFLSNPKYTPYIYETQSSIVLV NKDFVPEHEIRATLIKVDNAYESLAKLLNLYEMSKPKKQGIDSLAYIAPSAKIGENVYIG AFAYIGENAVIGDNTQIYPHTFVGDGVKIGNGCLLYSNVNVYHDCRIGNECILHSGAVIG ADGFGFAPTPNGYDKIPQIGIVILEDKVDIGANTCVDRATMGATIIHSGAKIDNLVQIAH NDEIGSHTVMAAQVGIAGSAKIGEWCMFGGQVGIAGHITIGDRVNLGAQSGIPSSIKADS VLIGTPPMEPKAYFKAAVVTKNLPDMQKEIRNLRKEVEELKQLLNK >gi|226332219|gb|ACIC01000101.1| GENE 47 56515 - 57900 1189 461 aa, chain + ## HITS:1 COG:XF0803 KEGG:ns NR:ns ## COG: XF0803 COG0774 # Protein_GI_number: 15837405 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-3-O-acyl-N-acetylglucosamine deacetylase # Organism: Xylella fastidiosa 9a5c # 1 319 1 297 304 178 34.0 2e-44 MLKQKTLKDSFSLSGKGLHTGLDLTVTFNPAPDNHGYKIQRIDLEGQPTIDAVADNVTET TRGTVLSKNGVKVSTVEHGMAALYALGIDNCLIQVNGPEFPILDGSAQYYVQEIERVGTE EQSAVKDFYIIKSKIEFRDESTGSSIIVLPDENFSLNVLVSYDSTIIPNQFATLEDMHNF KDEVAASRTFVFVREIEPLLSAGLIKGGDLDNAIVIYERKMSQESYDKLADVMGVPHMDA DQLGYINHKPLVWPNECARHKLLDVIGDLALIGKPIKGRIIATRPGHTINNKFARQMRKE IRLHEIQAPTYDCNREPVMDVNRIRELLPHRYPFQLVDKVIEMGASYIVGIKNITANEPF FQGHFPQEPVMPGVLQIEAMAQVGGLLVLNSVDEPERYSTYFMKIDGVKFRQKVVPGDTL LFRVELLAPIRRGISTMKGYAFVGEKVVCEAEFMAQIVKNK >gi|226332219|gb|ACIC01000101.1| GENE 48 57911 - 58678 848 255 aa, chain + ## HITS:1 COG:VC2248 KEGG:ns NR:ns ## COG: VC2248 COG1043 # Protein_GI_number: 15642246 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Acyl-[acyl 
carrier protein]--UDP-N-acetylglucosamine O-acyltransferase # Organism: Vibrio cholerae # 1 255 1 262 262 210 43.0 2e-54 MVSPLAYIHPEAKIGENVEIAPFVFIDKNVVIGDNNKIMANVNILYGSRIGNGNTIFPGA VIGAVPQDLKFRGEESTAEIGDNNLIRENVTVNRGTAAKGRTIVGSNNLLMEGVHVAHDA LIGNGCIIGNSTKMAGEIVIDDNAIISANVLMHQFCHVGSHVMIQGGCRFSKDIPPYIIA GREPIAFSGINIIGLRRRGFANEVIESIHNAYRIIYQSGLNTTDALKKIEDEVEKSPEID YIINFIRNSERGIIK >gi|226332219|gb|ACIC01000101.1| GENE 49 58700 - 59254 650 184 aa, chain + ## HITS:1 COG:no KEGG:BT_4204 NR:ns ## KEGG: BT_4204 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 184 1 184 184 308 100.0 6e-83 MIYRFTIISDEVDDFVREIQIDPEATFLDFHEAILKSVGYTNDQMTSFFICDDDWEKEKE VTLEEMDDNPEMDSWIMKETTISELVEDEKQKLLYVFDYMTERCFFIELSEIITGKDMNG AKCTKKSGDAPPQTVDFEEMAAASGSLDLDENFYGDQDFDMEDFDQEGFDIGGNAGGSYE EEKF >gi|226332219|gb|ACIC01000101.1| GENE 50 59373 - 60299 892 308 aa, chain + ## HITS:1 COG:BH2366 KEGG:ns NR:ns ## COG: BH2366 COG0324 # Protein_GI_number: 15614929 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Bacillus halodurans # 4 282 5 287 314 192 36.0 8e-49 MPTLIVLIGPTGVGKTELSLRLAENFHTSIVSADSRQLYAELKIGTAAPTPDQLKRVPHY LVGTLHLTDYYSAAQYEQEAMEILHQLFTEHEVVVLTGGSMMYVDAICKGIDDIPTVDAE TRQVMLQKYEEEGLEQLCAELRLLDPDYYRIVDLKNPKRVIHALEICYMTGKTYTSFRTQ QKKERPFRILKIGLTRDREELYDRINRRVDQMMEEGLLDEVRSVLSYRHLNSLNTVGYKE LFKYLDGEWELPFAIEKIKQNSRIYSRKQMTWFKRDEEIRWFHPEQETEILEYLRLQNLT HLPSLDTF >gi|226332219|gb|ACIC01000101.1| GENE 51 60272 - 60715 359 147 aa, chain - ## HITS:1 COG:no KEGG:BT_4202 NR:ns ## KEGG: BT_4202 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 147 1 147 147 288 100.0 6e-77 MKKQFCKILLAVSFLFLVPALCMSQNLKGIWKLVQSEGSSQVVRYKVLDKDGNYFNVDAY IKDAVDVSSGSRSSDDVFCPYKITRSGEYSIIAKGLYCEKLRNEHGRSANAIVPISYRID GKKMTLLFRLGNNVYREVYQKVSKLGK >gi|226332219|gb|ACIC01000101.1| GENE 52 60830 - 63271 1461 813 aa, chain - ## 
HITS:1 COG:no KEGG:BT_4201 NR:ns ## KEGG: BT_4201 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 813 1 813 813 1474 100.0 0 MGIFLIYILKSAVCLSLFYLFYRLLLSKETFHRFNRIALLGILFLSLLIPFIEVTTAHQT ELSQTVLTVEQLLMMAEAMDPAEVSVAQPEELSISWVQVLLLFYLVGIIFFACRNLYSLS RLLLLIKSGKRERLKGGVRLIVLEREVAPFSWMRYIVISRKDLEEDGREILIHEMAHIQN RHSIDLLVADICIFFQWFNPGIWLLKQELQNIHEYEADETVINEGIDAKDYQLLLIKKAV GTRLYSMANSFNHSKLKKRITMMLKEKSNPWARLKYLYVLPLATIAVTAFARPEISERVE EISAVKVNDLAAIVEAKVEEITKDVSNITSGDSLKKLVVTGDSILKEKGTISIYGEKGKN GVLATKLLPDEDNHFKITKTQAKIKPGMNTSVDKSKLMGTHIGGISTVDLRDKDVLVIID GKESSRTVVDALDPSRIESISILDGKEATDIYGDKAKNGAMVIQLHSTAEQILQNKYKID AISKTRLDALNRGSKNWGVTFHSVSGKKPLVYIDGKEAVGEEALSSVSPERIKSISVMKD KAAVEVYGERGKDGVVLVDLLTEEEYQNKQKFPKPAKVRTESESPKKSHFYMGGSHDEEW HVAQAKKKPLVIIDGKEALEEDAISKLAPDRIKNFTILKDKSATDIYGERGKNGVLIITL FTDAEYEFNKANPKKPYADALELAESMAKDVEGEIIYCIDDEKIKKSKLKGMSTKNIRSV SVNEMDGTKIVRLETDKYRSDWISVTGVVTDEEGKTIAATVLVKGTNDYTVADADGRFNL KAPKNGILRIADVNKSVAEVKVKPMLKVVLKDK >gi|226332219|gb|ACIC01000101.1| GENE 53 63373 - 65505 958 710 aa, chain - ## HITS:1 COG:no KEGG:BT_4200 NR:ns ## KEGG: BT_4200 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 710 1 710 710 1391 100.0 0 MGFFFVYILKSGVCLSLFYLFYRLLLSKETFYRFNRIALLGILVFSLLLPLIEVTKAPQN EINQAVLTIEQLLVMAENHQETQVTTVVEGDDLVDTWRSPVHWIEIVLLFYIAGIFFLVC RNVYSLFRLVRLMNTAQRRQIDKHTVLLVHDRNVAPFSWMKFVVISRTDLEENGREILIH ECAHIRKHHSWDLLIADICIFFQWFNPGAWLLKQELQNIHEYEADEAVINEGINARDYQL LLIKKAVGTRLYSMANSFNHSKLKKRITMMLKEKSNPWARLKYLYVLPLAAIAVTAFARP EISEKADEISAVKVNDLAAIVEAKVEEITKDVLKDSLKAKPYVVPEKSKMYGGRWVSKDS SIVYSADSVVIFNDSVASKRSSVVLYGGRFERISESNKPLILVDGKEVSKDTLYSIMNDE ANPNRIKTVSILRSDAALSIYGEKGKHGVYDIELLPNADNHFRITKTQAKFKTKTGSFPK LNSVPGIGGLPIRGVYDGEEPLVVIDGKEVLEPNALSKISPDRIKSFSVLKSEAALEAYG EKAKNGVILIDLLSEEEYEDIKKNPQRPYNNAWEMAMNIHDNKVVPTGKVLFFIDGKEAS EEELKNIPANEVRGVSRIDNSGEGETTIRVETERNSDKWVLASGIVMDKEGKPMEGVTIT MYGTSSSTKTDATGHFELKTPKDASLQISRGEKVLFRTKAAKDIKIVLDK 
>gi|226332219|gb|ACIC01000101.1| GENE 54 65523 - 67139 711 538 aa, chain - ## HITS:1 COG:no KEGG:BT_4199 NR:ns ## KEGG: BT_4199 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 538 1 538 538 1035 99.0 0 MHIFFLIYMLKTALCLSVFYLFYRFWLSKETFHRFNRMVLIGLILVAFVIPSLKVSQNFH LPQIEQVFERLNIPEEETEQEHDLQKMEQVTTTTLHTQLEKERSDFTILTLGNLGFLYLG GALLLLLRYIFSIACIYRLIRNSEKKCWKGKIRLVVHQQNVAPFSWRHYIVLSRQDLEEY GEEIIAHEYAHIMHKHFRDLLLAEICLLFQWFNPAMWLFRKELKTVHEFEADESVLKAGI DAKKYQLLLIQKTVGTRLYSMVNSFDHSSLKKRITMMLKKKSNPWKLSKYLIVFPLAAIT AVLFARPNLLNMNNQEGNAFIQDTLDVRHFEEILVSKSYAESDTVQIIEIDTLKMHIPVG YAGWSISESSSDDNVISYRKGVYQKEKGILRLSSTTPWTESDLADFVHRVKKGMLKSLNE LYGVKKQQNAHPALAHEVKTSHTVKCIYYVKDFFKIYCTDGGTGHDGQLILLDGKEISYQ QLKSLSPDDLLHDVWEILPESTNLESKFGSKAKKGIWVLASKKNPSYLEEFEKKLAEN >gi|226332219|gb|ACIC01000101.1| GENE 55 67174 - 67539 385 121 aa, chain - ## HITS:1 COG:no KEGG:BT_4198 NR:ns ## KEGG: BT_4198 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 121 1 121 121 230 100.0 1e-59 MKGLTVKEEELMGYFWEKGPLFVKEMLAFYEEPKPHFNTLSTIVRGLEDKGFLAHHTFGN TYQYYPVVSEEDFRKGTLRNVISKYFNNSYLSAVSSLVKEEDISLDDLKRLINEVEQAHQ K >gi|226332219|gb|ACIC01000101.1| GENE 56 68176 - 68478 265 100 aa, chain - ## HITS:1 COG:no KEGG:BT_4196 NR:ns ## KEGG: BT_4196 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 100 1 100 100 177 100.0 1e-43 MNSKRKDLQYASVFLLAVALTFLMGKGDNLWLISWGDLIPSLFVLFIAGDCLHSSLLRIK RGEEAGGSRWSTCFAFLIFSIVFMGDLFFIGHFIIDKLLG >gi|226332219|gb|ACIC01000101.1| GENE 57 68557 - 68733 174 58 aa, chain - ## HITS:1 COG:no KEGG:BT_4195 NR:ns ## KEGG: BT_4195 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 58 13 70 70 80 98.0 2e-14 MRVDAETMDAIEKWAADEFRSTNGQLQWIIAEALRKSGRLKKKSAKAGNAPSESSDQA >gi|226332219|gb|ACIC01000101.1| GENE 58 68797 - 69744 1044 315 aa, chain - ## HITS:1 COG:SP2132 KEGG:ns NR:ns ## COG: 
SP2132 COG0330 # Protein_GI_number: 15901946 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Streptococcus pneumoniae TIGR4 # 14 313 11 334 335 214 38.0 2e-55 MNAKEFQYNDVVINGFVALFVNLAILPLLIVMSFILFKGSIVLFLLLILFLAAAILMIPG YFSQEPNEARAMVFFGKYKGTFTETGFFWVNPFMNKKKLSLRARNLDIEPIKVNDKIGNP ILIGLVLVWKLKDTYKAMFEIDAQTMADNKGTGQMSVTVAGRMNAFEDFVRVQSDAALRQ VAGLYAYDDNEANSDELTLRSGGDEINDQLEHQLNERLAMAGMEIVEARINYLAYAPEIA AVMLRRQQASAIISAREKIVEGAVSMVRMALHKLSEEEIVELDEDKKAAMVSNLLVVLCA DEAAQPVVNTGTLNH >gi|226332219|gb|ACIC01000101.1| GENE 59 69924 - 72224 1977 766 aa, chain + ## HITS:1 COG:CC2154 KEGG:ns NR:ns ## COG: CC2154 COG1506 # Protein_GI_number: 16126393 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Caulobacter vibrioides # 168 751 141 719 738 313 33.0 1e-84 MYFEVFKFIVIFAGYELNPLKQVKYKTIYKMRKVSLALLLCLLCLAGMAQGQKALDLKDI TSGRFRPENIQGVIPTPDGEHYTQMNADGTQIIKYSFRTGEKVEVIFDVNQARECDFKNF DSYQFSPDGDKLLIATKTTPIYRHSYTAVHYIYPLKRNDKGVTTNNIIERLSDGGPQQVP VFSPDGTMIAFVRDNNIFLVKLLYGNSESQVTEDGKQNMVLNGIPDWVYEEEFGFNRALE FSADNTMIAFIRFDESEVPSYSFPMFAGEAPQITPLKDYPGEYTYKYPKAGYPNSKVEVR TYDIKSHVTRTMKLPIDADGYIPRIRFTKDASKLAVMTLNRHQDRFDLYFADPRSTLCKL VLRDESPYYIKENVFDNIKFYPETFSLLSERDGFSHLYWYSMGGNLIKKVTNGKYEVKDF LGYDATDGSFYYTSNEESPLRKAVYKIDKKGKKTKLSQREGTNTPLFSKSMKYYMNKFSN LDTPMLVTLNDNTGKTLKTLITNDQLKQTLAGYAIPQKEFFTFQTTDGVTLNGWMMKPVN FSASKKYPVLMYQYSGPGSQQVLDTWGISWETYMASLGYIVVCVDGRGTGGRGEAFEKCT YLKIGVKEAKDQVETALYLGKQPYVDKDRIGIWGWSYGGYMTLMSMSEGTPVFKAGVAVA APTDWRFYDTIYTERFMRTPKENAEGYKESSAFTRADKLHGNLLLVHGMADDNVHFQNCA EYAEHLVQLGKQFDMQVYTNRNHGIYGGNTRQHLYTRLTNFFLNNL >gi|226332219|gb|ACIC01000101.1| GENE 60 72224 - 73072 594 282 aa, chain + ## HITS:1 COG:BH3435 KEGG:ns NR:ns ## COG: BH3435 COG0320 # Protein_GI_number: 15615997 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate synthase # Organism: Bacillus halodurans # 5 
281 8 287 303 288 49.0 1e-77 MADRVRKPEWLKINIGANDRYTETKRIVDSHCLHTICSSGRCPNMGECWGKGTATFMIGG DICTRSCKFCNTQTGRPHPLDANEPTHVAESIALMKLDHAVVTSVDRDDLPDLGAGHWAH TIREIKRLNPQTTIEVLIPDFQGRMELVDLVIEANPDIISHNMETVRRISPLVRSAANYD TSLQVIGHIARSGTKSKSGIMVGLGETPQEVETIMDDLLAVGCQILTIGQYLQPTHRHYP VAEYVTPQQFATYKTIGLEKGFSIVESAPLVRSSYHAEKHIR >gi|226332219|gb|ACIC01000101.1| GENE 61 73130 - 74251 614 373 aa, chain + ## HITS:1 COG:no KEGG:BT_4191 NR:ns ## KEGG: BT_4191 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 373 1 373 373 697 100.0 0 MDHKGTLSSAFNMSLGFIPVIISILLCEFITQDTAIYIGTGIGIVGIYLLLHRKGALIPN FILYIATGMLALLSLAALIPGDYVPPGALPLTLEVSILIPMLILYMHKKRFINHFLRQIG SCNKRLYAQGAEAAVVSARFALIFGILHFIIISIVVASQDPLSQTSMVILYKVFPPVVFV MSILFNQIAIRYFNHLMSHTEYVPIVNTKGDVIGRSLAIEALNYKNAYINPVIRIAVSTH GMLFLCDRPMNAILDKGKTDIPMECYLRYGESLTEGVNRLVHNALPHATEDFKPEFNIVY HFENEATNRLIYLFIVDIKDDSILCTPRFKNSKLWSFKQIEENMGKGFFSSCFEDEYEHL KDVICIREKYRES >gi|226332219|gb|ACIC01000101.1| GENE 62 74217 - 74435 116 72 aa, chain - ## HITS:1 COG:no KEGG:BT_4190 NR:ns ## KEGG: BT_4190 # Name: not_defined # Def: putative S-adenosylmethionine-dependent methytransferase # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 72 173 234 234 138 98.0 6e-32 MLKVRHSYSSKTPYRNHKMIEDILQNCRPQTKLCIAANITCEGEFIQTRTVKDWKGHIPE LSKIPCIFLLYK >gi|226332219|gb|ACIC01000101.1| GENE 63 74383 - 74922 415 179 aa, chain - ## HITS:1 COG:NMA0547 KEGG:ns NR:ns ## COG: NMA0547 COG0313 # Protein_GI_number: 15793541 # Func_class: R General function prediction only # Function: Predicted methyltransferases # Organism: Neisseria meningitidis Z2491 # 1 173 6 177 241 161 50.0 6e-40 METALYLLPVTLGDTPLEQVLPSYNTEIIRGIRHFIVEDVRSARRFLKKVDREIDIDSLT FYPLNKHTSPEDISGYLKPLAGGASMGVISEAGCPAVADPGADVVAIAQRQKLKVIPLVG PSSIILSVMASGFNGQSFAFHGYLPIEPGERAKKLKTLEQRVYAESQTQLFIENSLPES >gi|226332219|gb|ACIC01000101.1| GENE 64 74963 - 75931 849 322 aa, chain - ## HITS:1 COG:no KEGG:BT_4189 NR:ns ## KEGG: BT_4189 # Name: not_defined # Def: 
hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 322 14 335 335 646 100.0 0 MLFASCGLSTGKVAEQKEEGISVLRYDKLLSEYVRSNSFSAMQKLTMDYRQPTKILIEDV LSIGTVKDDTISQRLQKFYSDSTLLRLLSDVEAKYPNLDEVEKGLTKGFRKLKKEVPDTK IPFIYSQISAFNESIILVDTLLGISLDKYMGEDYPLYKRFYYDYQCTSMRPERIVPDCFA FYLLSRYEMNYHEGTCLVDLMMHSGKINYVVQHLLGYDDIGQAMGYSDQENEWCRKNEKD IWEYICANDHLHARDPMVIRYYMKPAPTVDMLGGQAPALIGSWVGARIIASYMKKHKDLK IKDLLELTDYQSMFEESGYLKL >gi|226332219|gb|ACIC01000101.1| GENE 65 75977 - 76834 1035 285 aa, chain - ## HITS:1 COG:mll5565 KEGG:ns NR:ns ## COG: mll5565 COG0623 # Protein_GI_number: 13474637 # Func_class: I Lipid transport and metabolism # Function: Enoyl-[acyl-carrier-protein] reductase (NADH) # Organism: Mesorhizobium loti # 3 260 2 256 267 125 35.0 7e-29 MSNNLLKGKRGIIFGALNDQSIAWKVAERAVEEGATITLSNTPMAIRMGEVNALAEKLNC QVVPADATSVEDLTNVFKTSMDILGGQIDFVLHSIGMSPNVRKKRTYDDLDYGMLDKTLD ISAVSFHKMIQSAKKLNAIAEYGSIVALSYVAAQRTFYGYNDMADAKALLESIARSFGYI YGREHSVRVNTISQSPTFTTAGSGVKGMDKLYDFANRMSPLGNATADECADYCIVMFSDL TRKVTMQNLFHDGGFSSVGMSLRAMATYEKGLDEYMDENGNIIYG >gi|226332219|gb|ACIC01000101.1| GENE 66 77184 - 78800 1334 538 aa, chain + ## HITS:1 COG:TM0437 KEGG:ns NR:ns ## COG: TM0437 COG5434 # Protein_GI_number: 15643203 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Thermotoga maritima # 49 510 18 447 448 241 33.0 3e-63 MNTMTRRLLWMVVCCLPFVSGCKQSECAISENAIDDTIYQNLPFEMPKVQQPVFPAYEVN ISKFGAKGDGMTLNTKAINDAIKEVNQRGGGKVIIPEGTWLTGPIELLSNVNLYTERNAL ILFTGDFEAYPIIPTSFEGLETRRCQSPISARNAENIAITGYGIFDGNGDCWRPVKKEKL TASQWNKLVKSGGVLDEQERIWYPTAGSLKGAMACKDFNVPEGINTDEEWNEIRAWLRPV LLNFVKSKRILLEGVTFKNSPSWCLHPLSCEDFTVNNIQVINPWYSQNGDALDLESCKNA LILNSVFDAGDDAICIKSGKDENGRRRGEPCQNVIVKNNTVLHGHGGFVVGSEMSGGVKN IYVEDCTFLGTDVGLRFKSTRGRGGVVENIYINNINMINIPNEPLLFDLFYGGKGAGEES EEDLLSRMKTAIPPVTEETPAFRDIHISNVICRGSGRAMFFNGLPEMPIRNVTVKNVVMT EATDGVVISQVDGVTLENIYVESTKGKNILNVKSAKNLKVDGETYEEIDAKGQILNFK >gi|226332219|gb|ACIC01000101.1| GENE 67 78952 - 80184 1100 410 aa, chain 
+ ## HITS:1 COG:no KEGG:BT_4186 NR:ns ## KEGG: BT_4186 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 410 1 410 410 843 99.0 0 MKKNIFIITLLVGCCSLSAWAQKQEKTITIEVNNNWNRAKTDEPVVINLRDLHTGFKVKS AVVMEGSTEIPSQLDDLNRDRKMDELAFVTSLPAHGRKTFQVTLSSEKSTKTYPARVYAE MFIADPRKGKHQSVQAITVPGTSNIYSMVRPHGPVLESELVGYRLYFNEKQTPDIYGKFN KGLEIKESQFYPTDEQLARGFGDDVLRVFDSCGPGAFKGWDGKKATHITPVATRTERIIS YGPVRVIAEIEVTGWKYQNAELDMMTRYTLYAGHRDLRIEAFFDEPIKNEVFCTGVQDIV GTAESYSDHKGLVGSWGTDWPVNDTVKYAKETVGLGTCIPQRYVKSEEKDKDNYLYTITS PGSKYLQYHTTFTSMKETFGYKTPEAWFAYLREWKEELAHPVTVKIKDIH >gi|226332219|gb|ACIC01000101.1| GENE 68 80349 - 82025 1319 558 aa, chain + ## HITS:1 COG:CAP0114 KEGG:ns NR:ns ## COG: CAP0114 COG3507 # Protein_GI_number: 15004817 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Clostridium acetobutylicum # 18 556 22 529 531 196 30.0 8e-50 MKRLTQTLAFCLLTVFTAVAQKNYVSEVWISDLGNGKYKNPVLYADYSDPDACRVGDDFY MTSSSFNCLPGLQILHSKDLVNWTIIGAAVPYALTPIETPERPEHGNRVWAPSIRHHNGE FYIFWGDPDQGAFMVKAKDPQGPWTEPVLVKPGKGIIDTCPLWDEDGKVYLVHAYAGSRA GLKSVITICELNKEATKAITPSRIIFDGHEAHQTCEGPKFYKRNGYYYIFHPAGGVPTGW QVVLRSKNVYGPYEWRTVLAQGDSPVNGPHQGAWVDTPSGEDWFFHFQDVGAYGRLVHLQ PMKWVNDWPVIGIDKDGDGCGEPVMTYKKPNVGKIYPICTPQESDEFDGYILSPQWQWHA NINEKWAYYAGDKSYVRLYSYPVVADYKNLWDVANLLLQKTPSDNFTATMKLTFMPNPKL KGERTGLVVMGRDYAGLILENTDRGLVLSQIECKKADKGEAEQVNSSVGLTQNTVYLKVR FSCDGKKIKASEGGNDLIVMCNFSYSLDGKKFLPLGNPFQAREGQWIGAKVGMFCTRPAI VTNDGGWTDVDWFRITRK >gi|226332219|gb|ACIC01000101.1| GENE 69 82099 - 82749 733 216 aa, chain + ## HITS:1 COG:TP0554 KEGG:ns NR:ns ## COG: TP0554 COG0546 # Protein_GI_number: 15639543 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Treponema pallidum # 1 211 4 216 222 137 35.0 1e-32 MKKLVIFDLDGTLLNTIADLAHSTNYALNKLGYPTHEIEKYNFMVGNGIDKLFERALPEG EKSKENVLRVRKEFVPYYDIHNADDSRPYPGIPELLSYLQDAGIQIAVASNKYQAATQKL IDHYFPEIHFTAVFGQREGIKVKPDPTVVFDILEVAKVTKEEVLYVGDSGVDMQTAANAR 
VTVCGVTWGFRPRAELEEFSPQYIVDTAEEIKRLIL >gi|226332219|gb|ACIC01000101.1| GENE 70 82862 - 84175 952 437 aa, chain - ## HITS:1 COG:no KEGG:BT_4183 NR:ns ## KEGG: BT_4183 # Name: not_defined # Def: pectate lyase L precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 437 1 437 437 891 99.0 0 MKKHIGISLFFMGCFLSLSATNYFVATNGDDSNAGTLDKPFATLQKAQSKVVPGDTVYIR GGEYRIREEQMMGGDHLRAYVFEMNKSGTQAKRICYTGYQDERPIFNLAEVKPEGKRVSV FYVSGSYLHFRNFEIIKTQVTIREHTQSECIYNQGGNHNIYENLAMHDGMAIGFYLVRGS HNLVLNCDAYNNFDPVSENGTGGNVDGFGGHPASASYTGNVFKGCRAWYNSDDGFDLIKA QAAYTIEDCWAFYNGYKPGGFVGAGDGTGFKAGGYGMRSKVKMPNEIPHHVVKNCLAYKN KNKGFYANHHLGGISWFNNTGYQNPSNFCMLNRKSPGEIVDVDGYDHIIRNNLSYKPRAA GKHIVDVNREECTIINNSFLPVDMTVGEDDFVSLDPAQLTLPRKADGSLPDIDFLKLKRN SKLYDAGIGFQFSAQNL >gi|226332219|gb|ACIC01000101.1| GENE 71 84520 - 88815 2730 1431 aa, chain + ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 898 1120 8 229 294 139 36.0 4e-32 MKNALLIFAFLSFYLSSVHASVEIRSNKLTTGDGLANNSIRYMFQDSKGFMWMGTVNGLS YYDGNSFVSIYPDPNLPISLADPRIRNMEEDSNGFLWIATLSSLYSCYDLKHGRFVDFTG CGEYKQSYSKKIIASDQSIWLWDNNNGCRRVVYQDGQFSSQAYKKELGNLSSGEVLFVYE SNDSPGHVWIGTKQGLWKYHDGKLEAMDTQGESWEHIFSYDQYTCIITGKKEIYRHSLSN NRLEKIASLTELGDTGVITGSLRLQHQWVMFTATGSYILDPVTGKLRRFSRLNIKNGNVT RDNKGNAWVHNYTGNVWYVNTSTGDIKHFQFLSSEHLGYIDVERYSIIHDSRDIIWITTY GNGLFAYDLNTGDLQHFTFEVSHSSHINSNYLQYIIEDRSGGIWVSSEFSGLSHLEILNK GTLRIYPNGEDASDRSNTIRMLLRGKNGNVYMANRMGTLYEYDADLKNILRREKFTHNVY SMCEDNEGQLWLGMRGIGLRIGADQWYRYNSKDNNSLSNDNVYLIYRDRKGRMWIGTFGG GLNLAVKTGNGYQFKHFFQGSYGEKRVRVIQEDRNGMMWVGTNNGIYIFHPDSLINSPKN YVLYNHVNETFPSNEIRCLVNDHEGNMWIGTTGAGFAICYPGNDYQHLTFDCYSIKDGLP NGVIQSIVEDQDNKMWIATEYGISRFTLATKQIENYYFSSHTLGNVYSENTACINADGRL LFGTNYGLVVLDANKVENMEKLASTVFTGLHINGAHMLPGMDDSPLNETMSYTGQLNLKH YQNSFVIAFSTFNFLNGASKYSYRMPPYDSEWSIPSAQNLATYRNLPPGKYQLQVKACNV AGVWGEESTMEIVIAPPFWQTTWAYLIYLVFIGIVCYFSFHIIRNFNRLRNRIAVEKQLT EYKLEFFTNISHEFRTPLTLIQGALNKLINIENPPKEMQRPLKTMDKSTQRMLRLINQLL EFRKMQKNKLALSLEETDVIAFLYEIFLSFKDTSESKNIDFSFEPSQPAYKMFIDKGNLD KVTYNLLSNAFKYTPSNGKIIFKIDIQEDKQQLRIQVIDNGIGIPKEKRSELFKRFMQSS FSHSSVGVGLHLTHELVQVHKGNISYDENEGGGSVFTVLLPTNSDIYQEKDFLIPNQLLT EEEEQHSKDFLRNETSEDTFQPPVDPLNKRKVLIIEDDTDIREFLREEIGVYFEVEVAAD GTSGFEKASTYDADLIVCDVLMPGMNGFEVTRKLKNEFTTSHIPIILLTALNIEEKYQEG IESGADAYITKPFNVSLLLARIFKLIELRDKLRQKYSNEPGLAHSIICTNDKDQKFSVKL NEVLNEHMTDTDFSVNDFAGIMGLGRTVFYKKVRGVTGYSPYEYLRVMRMKKAAEMLLTE DLTIAEVAYSVGINDPFYFSKCFKNQFGVSPSAYRKKLSEDENEPINDADV >gi|226332219|gb|ACIC01000101.1| GENE 72 88903 - 91872 2552 989 aa, chain + ## HITS:1 COG:SP0648_2 KEGG:ns NR:ns ## COG: SP0648_2 COG3250 # Protein_GI_number: 15900551 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pneumoniae TIGR4 # 24 784 53 871 871 140 24.0 9e-33 MRNKLFLFVFLFAVLAIVDSHAFERKKYNFNSEWRLQIGDFPEAKQSQFDDSRWKAVTLP HAFNEDEAFKVSIEQLTDTVVWYRKHFRIPASGKKQKVFIEFEGVRQAGDFYLNGQYLGK 
HENGVMAAGFDLTPYIKEGDNVLAVRTDNDWMYREKSTNSKFQWNDRNFNANYGGLPKNV FLYVTDEVYQTLPLYSNLKTTGVYIYAKDIDVKGRTATIHAESEVKNDSREPRQFSYQVV LLDADGKQVKSFEGEKVTLQGGETRVVKAASKVNNLHFWSWGYGYLYTVQTILKDNKNQV FDEVSTRTGFRKTRFAEGKIWLNDRVIQMKGYAQRTSNEWPAVGLSVPAWLSDYSNGLVV KGNGNLVRWMHVTPWKQDVESCDRVGLIQAMPAGDSEKDRTGRQWDQRTELMRDAIIYNR NNPSILFYECGNKGISREHMIEMKAIRDQYDPFGGRAIGSREMLDIREAEYGGEMLYINK SVHHPMWATEYCRDEGLRKYWDEYSYPFHKEGDGPLHRNQPATDYNHNQDMLAIAMIRCW YDYWRERPGTGRRVSSGGAKIIFSDTNTHHRGEENYRRSGVTDPMRIEKDAFFAHQVMWN GWVDTEEDNTYIIGHWNYPANTVKPVYVVSTGEEVELFLNSQSLGKGKREYNFLFTFDNV AFKAGKLEAVSYNKAGKEISHYAVSTVGEPAGLKLTTIQNPEGFHADGADLALIQVEVVD KDGKRCPLDNRTIQFDLKGNAEWRGGIAQGKDNYILNTSLPVECGINRALIRSTTQAGKI TLTAKAEGLPSATITLETLPVKVSNGLSTYLPQMTLKGNLDKGKTPLTPSYKDTKRDIRI VSAKAGANNETVNQSFDDNERSEWMNDGKLSTAWITYTLEKEAAIDDICIKLNGWRSRSY PLEVFAGNTMIWSGDTNKSLGYVHLNVEKPVRSKQITIRLKGNTSDQDAFGQITEVAAKA ANDMELEAKANKNNANLRIIEIEFLEAIK >gi|226332219|gb|ACIC01000101.1| GENE 73 92617 - 93327 565 236 aa, chain - ## HITS:1 COG:no KEGG:BT_4180 NR:ns ## KEGG: BT_4180 # Name: not_defined # Def: acetyl xylan esterase A # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 234 1 234 267 489 98.0 1e-137 MKTNYLCGLFLLCVLVWGRSEAHAEKPLKTLDLYLCIGQSNMAGRGKLSPEVMDTLQNVY LLNADDQFEPAVNPLNRYSTIGKGLSWQQVGPAYGFAKTMATKKHPVGLIVNARGGSSIR SWVKNAKQSGGYYDEAIRRAKEAMKYGTLKAIIWHQGEADCHHPEAYKEKIIQLMTDLRN DLGMPDLPVVVGQIAQWNWTKKPYIPEGTKPFNDMIKEISTFLPHSACVSPKDLLR >gi|226332219|gb|ACIC01000101.1| GENE 74 93352 - 93486 61 44 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|298383848|ref|ZP_06993409.1| ## NR: gi|298383848|ref|ZP_06993409.1| polysaccharide deacetylase [Bacteroides sp. 1_1_14] # 1 42 35 76 626 81 88.0 2e-14 MRTLFFILFFVLGFCHIQARGQKPAHVIITAGQSNQLGVCTNKV >gi|226332219|gb|ACIC01000101.1| GENE 75 93643 - 97944 2723 1433 aa, chain - ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 900 1120 8 227 294 125 37.0 6e-28 MTQRRILLFTLFMTLVFNAAYGGIELRSKQMRTSDGLPNNSVRYLYQDSKGFLWLATLNG LSRYDGNSFLTFRPEIGDKVSLADNRIYDLTEDKNGFLWISTTPELFSCYDLQRACFVDY TGSGALEKNYSSIFVAANGDVWLRHSGNGCLRMVHQKDRQMVSTEFKTERGNLPDNRVQF VNEDAGGRIWIGTQRGLVSVSDGQYKVEDRLLHFTSSLAFKNDMYFLTADGDIYCYQTTK QKLTKVGSLSAVAGKTSPTGNFLLNDKWVILTNTGVFNYDFNTRKISVDSGLNIRNGEVI CDNHGDFWIYNHTGCVTYILAKNGARKEFQLIPEDKLGYIDYERYHIVHDSRGIIWISTY GNGLFAYDTTEDKLEHFLANINDQSHISSDFLLYVMEDRAGGIWVSSEYSGLSRISVLNE GTSRIYPESRELFDRSNTIRMLTKMPDGDIWVGTRKGGLYTFDSNLHSKMSNQYFHSNIY AIAEDNQGEMWVGTRGNGLKVGDSWYRNEVSDPASLSENNVFSLFRDRKDRMWIGTFGGG LELAEPTTDGKYKFRHFLQQKFGLRMVRVIEEDDNGMIWVGTSEGVCIFHPDSLIADSDD YHLFNYTNGTFCSNEIRCIYRDTKGRMWVGTSGSGLNLCEPEDNYRSLKYEHYGTSEGLV NDVIQSILGDNNGNLWVATEYGISKFNPTNHSFENYFFSSYTLGNVYSENSACVGVDGKL LFGTNYGLLVIDPDKIQDSETFSPVVFTDLHVNGTQINPTMEDSPLKQSLAYSDEITLKY FQNSFLIDFSTFDYSDSGRTKYMYWLENYDKGWSTPSPLNFASFKYLNPGTYVLHVKSCN GAGVWNESETTLKIVIVPPFWKTNWAMLGYVLLLIVTLYFTFRIVRNFNSLRNRINVEKQ LTEYKLVFFTNISHEFRTPLTLIQGALEKIQRVADIPRDLIHPLKTMDKSTQRMLRLINQ LLEFRKMQNNKLALSLEETDVIAFLYEIFLSFGDVAEQKNMNFRFIPSIPSYKMFLDKGN LDKVTYNLLSNAFKYTPSNGTIILSVTVDEVKKTLQIQVADTGVGIPKEKQNELFKRFMQ SSFSGDSIGVGLHLSYELVQVHKGTIEYKDNDGGGSVFTVCIPTDKSVYSEKDFLIPGNV LLKEAGGQAHHLLELSEELPEPEKIAEPLNKRKVLIIEDDNDIREFLKEEVGVYFEVEVA ADGTSGFEKARTYDADLIICDVLMPGMTGFEVTKKLKSDFATSHIPIILLTALNSPEKHL EGIEAGADAYIAKPFSIKLLLARVFRLIEQRDKLREKFSSEPGIVRAAVCSTDRDKEFAD RLAVVLEQNLSRPEFSIDEFAQLMKLGRTVFYRKLRGVTGYSPNEYLRVVRMKKAAELLL SGENLTVAEVAYKVGISDPFYFSKCFKTQFGVAPSVYQRGEMKEQGENETAEA >gi|226332219|gb|ACIC01000101.1| GENE 76 98087 - 98401 375 104 aa, chain - ## HITS:1 COG:mll5702 KEGG:ns NR:ns ## COG: mll5702 COG3254 # Protein_GI_number: 13474745 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 2 103 3 104 105 109 53.0 2e-24 MKREAFKMYLKPGYEAEYEKRHAAIWPELKALLSKNGVSDYSIYWDKETNILFAFQKTEG EGGSQDLGNTEIVQKWWDYMADIMEVNPDNSPVSIPLPEVFHMD >gi|226332219|gb|ACIC01000101.1| GENE 77 98432 - 99829 1409 465 aa, chain - ## 
HITS:1 COG:STM1911 KEGG:ns NR:ns ## COG: STM1911 COG4225 # Protein_GI_number: 16765253 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Salmonella typhimurium LT2 # 219 412 164 357 379 97 31.0 4e-20 MNYRMMTVAALLVALPACAQKKKAVINDSNTPLHLLQPAYQGTYGDLTPEQVKKDIDHVF AYIDKETPARVVDKNTGKVITDYTAMGDEAQLERGAFRLASYEWGVTYSALIAAAETTGD KRYTDYVQNRFRFLAEVAPHFKRVYEEKGKTDSQLLQILTPHALDDAGAVCTAMIKLRLK DESLPVDGLIQNYFDFIINKEYRLADGTFARNRPQRNTLWLDDMFMGIPAVAQMSRYDKE AKNKYLAEAVKQFLQFADRMFIPEKGLYRHGWVESSTDHPAFCWARANGWALLTACELLD VLPEDYPQRPKVMDYFRAHVRGVTALQSGEGFWHQLLDCNDSYLETSATAIYVYCLAHAI NKGWIDAIAYGPVAQLGWHAVAGKINEEGQVEGTCVGTGMAFDPAFYYYRPVNVYAAHGY GPVLWAGAEMIRLLNTQHPQMNDSAVQYYQEKQKTTAPIFAVDSE >gi|226332219|gb|ACIC01000101.1| GENE 78 99833 - 101674 1633 613 aa, chain - ## HITS:1 COG:no KEGG:BT_4175 NR:ns ## KEGG: BT_4175 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 613 1 613 613 1286 100.0 0 MKCTSIFFSLLVIATFVVAQPNYDFTKLKREHLGRGVIAIRENPSTVVVSWRYLSSDPMD ESFDIYRDGKKVNKHPLKNATFFQDSYQGTEPALYTVKAIKGKTESNYQLPADAPTGYLN IPLVRPEGGTTPSGQAYTYAPNDASIGDVDGDGEYEIILKWDPSNAHDNAHDGYTGPVIF DCYKLNGQQLWRINMGRNVRAGAHYTQFMVFDLDGDGRAEVVMKTGDGTVDGTGKVIGDA NADYRNERGRILTGPEYLTIFNGLTGEAMQTIDYVPERGNLMDWGDGRANRSDRYLACIA YLDGVHPSVVMCRGYYTRTVLAAYDWDGKNLKNRWVFDSNNPGCRAYAGQGNHNLRVGDV DGDGCDEIVYGQCTINNDGTGLYSTRMGHGDAMHLTHFDPSRPGLQVWSCHENRRDGSTF RDAATGEIIFQIKSNTDVGRCMAADIDPNHPGVEMWSLDSKGVRNVKGEVIASRVRGLST NMAVWWDGDLLRELLDRNVVSKYNWEKGLCERIAVFEGALSNNGTKSTPCLQGDIVGDWR EEVLLRTADNTALRLYVSTIPTDYRFHTFLEDPVYRISIATQNVAYNQPTQPGFYFGPEL RGTIFRGCKIPKK >gi|226332219|gb|ACIC01000101.1| GENE 79 101941 - 103080 887 379 aa, chain + ## HITS:1 COG:YPO0840 KEGG:ns NR:ns ## COG: YPO0840 COG4225 # Protein_GI_number: 16121148 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of 
bacterial surface properties, and related proteins # Organism: Yersinia pestis # 66 378 47 351 352 175 34.0 2e-43 MKKLLLSFITATLLSSMSAGSAGAQELPEQKETLATIVKVNDYFMKKYADYTLPSFFGRV RPSNIWTRGVYYEGLIALYSIYPREDYYSYTYDWANFHKWGMRNGNTTRNADDQCCGQVY IDLYNMCPSDPNMIRNIKASIDMVVNTPQVNDWDWIDAIQMAMPIYAKFGKMTGEQKYYD KMWDLYSYTRNTEGEAGMYNAKEGLWWRDQDFDPPYKEPNGKNCYWSRGNGWVYAALVRV LDEIPADETHRQDYINDFLTMSKALKQCQRTDGFWNVSLHDESNFGGKETSGTALFVYGM AWGIRNGLLDRKEYLPVALKAWNAMVKEAVHPNGFLGYVQGTGKEPKDGQPVTYKSVPDF EDYGVGCFLLAGTEVYKLK >gi|226332219|gb|ACIC01000101.1| GENE 80 103191 - 104735 1552 514 aa, chain + ## HITS:1 COG:BS_yesY KEGG:ns NR:ns ## COG: BS_yesY COG2755 # Protein_GI_number: 16077774 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Bacillus subtilis # 287 511 5 211 217 95 28.0 2e-19 MKDVNQVVDNTLDSLNKARTARPVAGASRKGNNPVLFLVGNSTMRTGTLGNGNNGQWGWG YYAGDYFDSNRITVENHALGGTSSRTFYNRLWPDVIKGVQAGDWVIIELGHNDNGPYDSG RARASIPGIGKDSLNVTIKETGVKETVYTYGEYMRRFINDVKAKGAHPILFSLTPRNAWA DKDSTIITRVNKTFGLWAKQVAEEQNVPFIDLNDISARKFEKFGKNKVKYMFYIDRIHTS AFGAKVNAESAADGIRACEGLELAKYLKPVEKDEATGSSRKEGRPVLFTIGDSTVKNKDN DKNGMWGWGSVIADEFDLNKISVENCAMAGRSARTFLDEGRWDKVYHALQPGDFVLIQFG HNDAGEINTGKARAELPGSGEESKVFLMEKTGKYQVIYTFGWYLRKFIMDVQEKGAIPIV LSHTPRNKWKDGKIERNTASFGKWTREAAEATGAYFIDLNKISADKLEKKGIKKAADYYN NDHTHTSLKGAHMNAKSIADGLKMADCPLKQYLK >gi|226332219|gb|ACIC01000101.1| GENE 81 104814 - 108110 2075 1098 aa, chain - ## HITS:1 COG:no KEGG:BT_4172 NR:ns ## KEGG: BT_4172 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1098 1 1098 1098 2244 100.0 0 MKSISIFLFLLITLASNAQDNKTSGLNARQFHKYWKVESESPDYKVTFRGDTAEILSPKG LTLWRKEKMSGKVTIEYDACVVVEAEGDRLSDLNCFWMASDPQYPDNIWKREKWRNGIFL NCYSLQLYYMGYGGNHNSTTRFRRYDGNEAAVTNAKARPAILKEYTDADHLLEANKWYHI KITNENNRVSYYINGVRLVDFRDAEPLTEGWFGFRTTLSRTRIANFHYECSSQETSEIPL HWIGNTPQQNKAVSLGVPFNEGELYPEHTLQLMTDKGEMLPTDTWALAYWPDGSVKWKGI AGVIPKNTEKLLLKKTGKKSKEKASSRTADDRSNKLSLSVVETPQSIRIETGIISAYIPR 
QGDFMIDSLFREGVKVGEKARLVCNTQSEPVLENTSQITFTNYTGRLTSATVERIGKVRT LVKLEGTHRSETGREWLPFVVRLYFYAGSEQIKIVHSFVYDGDQNKDFIRALGIRMDAPM REALYNRHVAFSCANGGVWSEPVQPLAGRRKLTLGKEDTLSLQQQQMDGKRIPPYEAFDG KNRDLLDNWASWNDYRLSQLSADAFSIRKRANDNNPWIGTFSGTRSGGYAFVGDITGGLG LCLHDFWQSYPSSLEISGAKTSSATITAWLWSPEGEPMDLRHYDNVAHDLNASYEDVQEG MSTPYGIARTTTLTLIPQKGYAGKKAFADVAESLSEPGILLPTPDYLHAQQAFGVWSLPD RSTSFRARVEDRLDAYIDFYRKAIEQNKWYGFWNYGDVMHAYDPVRHTWRYDIGGFAWDN TELASNMWLWYNFLRTGREDIWRMAEAMTRHTAEVDVYHIGPNAGLGSRHNVSHWGCGAK EARISQAAWNRFYYYLTTDERCGDLMTEVKDADQKLYELDPMRLAQPRSEYPCTAPARLR IGPDWLAYAGNWMTEWERTGNTVYRDKIVAGMKSIAALPNRIFTGPKALGFDPATGIITS ECDPKLESSNHLMTIMGGFEVMNEMIRMIDLPEWKDAWLDHAARYKQKAWELNHSRFRIS RLMAYAAYHKRDALMAKEAWEDLFTRLEHTPAPPFRITKLFPPEVPAPLDECVSISTNDA ALWSLDAIYMQEVIPMDE >gi|226332219|gb|ACIC01000101.1| GENE 82 108363 - 109016 436 217 aa, chain - ## HITS:1 COG:no KEGG:BT_4171 NR:ns ## KEGG: BT_4171 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 217 1 217 217 379 100.0 1e-104 MKQFINIYIISFVLILSQLSLFSSCKDEDTVAELKLDRTAIQIKEGSEGILKIESGSGGY QFSFSEEGYATAKYRENLIYIQGVKYGKVVLNVTDLEGHSVSLDVVVVSSVLSTDNQRFV WGNMIELNKANNWSLFTDVNSVAITNVMEGRQYVLSCDGDLSVGVKSNATLEILSSKTGA EPEIVQLAGMEVLQIKDGLCSIVFSTADRMGELVFQK >gi|226332219|gb|ACIC01000101.1| GENE 83 109038 - 110645 1077 535 aa, chain - ## HITS:1 COG:no KEGG:BT_4170 NR:ns ## KEGG: BT_4170 # Name: not_defined # Def: putative pectate lyase L precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 535 1 535 535 1076 100.0 0 MKDITKITYLLLGLMLSVPLAAQKTYYMDPEGSDSNPGTSDKPFATLVKVQEVVVAGDVV YINPGTYVVPANQVPMTTTNSGLYHCVFHMNKSGEAGKPISYLANPNKQGRPIFDLSQVK PKDQRITVFYVTGSNLYLKGFDVIGTQVTITGHTQSECFRIVKGANNNKFEDLRTHDGMA IGFYLLGGSNNHILNCDAYNNYDSVSEGGKGGNVDGFGGHINSSSVGEGKGTGNVFEGCR AWYNSDDGFDLINCFEAVKIINCWSFLNGYKPGTKEVAGDGTGFKAGGYGMAADKLPAIP SVIPQHEVRNSLAYYNRLRGFYANHHLGGIIFESNTAVNSGENYNMTNRESPLALPPTDV NGYDHMVKNNLSLVTRSGSKHIVMVNRAKSEVSNNSFDGSEEVIETDFISLEEAELMRDR 
KPNGDLPDVNFGKLTTDAELRFWGMGCFATGEPTDLDFGWLKKPTIVVVGSKASVVGPEA ASFTKMYVIVDGEETTEFDKNSIDLSDFSGVLEVKAVIEDANGNITKSIALKFKR >gi|226332219|gb|ACIC01000101.1| GENE 84 110678 - 112429 1463 583 aa, chain - ## HITS:1 COG:no KEGG:BT_4169 NR:ns ## KEGG: BT_4169 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 583 1 583 583 1182 100.0 0 MKKFKIEIILMGVLAMIFASCSDFFEPNNDTTLNGDRYIKEGSELYSGFLGIITKIQAIG DKAIYLTDMRGELLEPTENTPADLYSIYNYDDDLSGNTYADPAKYYDVIIACNDYLQKAK EYRDTHVTTVDDDDYRGLISSTVRLKTWTYLMLAKIYGQAIWFDDPMQSMKDIVSQTPKN LSELVDECDKLLNTGFDGINGTYNMSWNEWLDPANGATSTNDEYKRWNMMVPGYFVLQAE IALWKGDYQKAATTILGEMSRTFALAEYQGNVANIKYMRSGSYGSNYGQQYDSEMPVLTA AESVIMYDYKYNQQNSLLNHFDRSGSYMMRASVPGSKRFEDITFNPSADATAIKTSDARF RAIQEIEKDVEYAIRKYRGFRKSGHSVAHDDTYIMIYHTSDLYFMAIEALNNMGRFEPVS VLMNTGVDPFYGAGVVLGEQWQGLNCNWGLWTTLYAGFRRTYADNGVRGILGLGKRDMWQ TPEDIKEEIVAEQKLDGLSDEDKIKKHNDIELVKEAFLEFPCEGKTYPLMIRVAKRWNDY SFIADFVGQKYESDQAKQAEVKAKIMNGAYFVPWDLQGKSSSH >gi|226332219|gb|ACIC01000101.1| GENE 85 112435 - 115587 2436 1050 aa, chain - ## HITS:1 COG:no KEGG:BT_4168 NR:ns ## KEGG: BT_4168 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1050 1 1050 1050 2031 100.0 0 MKKIKRILSISALLVCGVANTWGQGGQKLTGKILSATNQPVEGAIVTVLDTMNVTTNKEG AFQFEVKDLSKAGEISVWAPGYFSVKQLIRERSNIVITLIPENQYKYNETMILPFRREGE MQLEDYTAATNIAKKDFMPGTTKIDRALTGQVAGLQVKRSSGMPGEGSYYNLRGIRTLTG DNAPLIVINGVPHMPDKTPSALIDGFTRDIFQFYHLQDIQNITILKGAEAAMYGSMGSNG VILIETDGTASNDLETRVSYYGSYGINWNDKRMPVMGLDDYKQYLADMGMTISKDPQNFY NNFPFMQNPNDPRYNYLYNNNTDWQDLIYKNTASTDHLFRVEGGDNIAKYDLSLGYYREN GLMDNTSMERFHTLLNANVLVNKQLNIFATVGLAYMNGHYQMQGMDITTNPILAAYARSP FLSPYEKDREGNTLKTYASHFYGRSKSRDYSVSNPLAIVNTLDSRNRQYDLNMKAGIAYN PFRELTLTGTVGLYYNYDNEHLFIPGASEATIVPLSDKYGLRNNAVSDGVAVTTNFFANL NASYKKTFNYVHQLNAIAGWQLMTTKNEYDAGEGRNTGNDFYQTLGSTIDGRRFLGYINS WNWTNFYGHADYTYNNMVQASVNIAVDAASSTGTDVARFYTYPSVGVTLLGKGWKPLLDA TWLNKLNVRAEYGLTGNSRFSSQMGGYYYSTVPYMQLSTIVRSNIPNVSLKPEKNASLNL 
GLDLSVLNNRLNVSFDYYNNQISDMISAMPLSSVYGSVPYYANVGKLENTGIELSVQASL VRTRNFEWIVGGNITRSRDKIKSLGGEEQIVLSYDNGVQMVNRVGESPYQFYGYQADGVY STQAEADAANLSNRTGRRYNAGDVRFVDQNGDNRIDDKDRILLGSAAPSYFGGFYTQLKY KGFALSAEFSYSKDNIAYNAVRQQLEAVSTTNNQSIAAVNRWTVEGQKTDMPKATWKDPV GNSYFSSRWMEDASYLRMKNVTLSYTFSKTLWNFFRSGTIYVTGENLLTFTDYLGMDPEF SYSYAENMQGFDNAKLMQPKTVKMGVNLKF >gi|226332219|gb|ACIC01000101.1| GENE 86 115604 - 117346 1402 580 aa, chain - ## HITS:1 COG:no KEGG:BT_4167 NR:ns ## KEGG: BT_4167 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 580 1 580 580 1143 100.0 0 MKKIRYIIASALVSIGLCACTDTWDSHYSKWETVIDNTEIQAVDEPAADFLKTAQDYSKM YELFDKTGVIKTWQEKNLMYTIMVVGNESTANANQPVTKAEGSEGQATAEEIFKAEAHIT DALLSPSNLEDGQRLLMWNGKYVTVRIYDEPVDGMEPGIYFNGSKVKKVIKTNNAYIYDL EDYINTPKALMEYLKDLPDEEYSIFKEMVLSRTQRVFDKAASTPIGIDKTGNTVYDSVFT EKSQYFADKKLDLYSENITATLLVPSNDLIENALKEAKEKLKSWNMEREDSILNNWIFQT AFFSKKYVKQDFIYNESDPASTQDFYSVFDQQWRTTVNKVDLDNPVELSNGVAYKITSLK IPTNKILIWRIKERFETFDQLTDEDKKYYYPGYNYISTYLKDIGENVQMNRVKQYVGANQ PKPWLPAVYCKSMMLWVVDTDRPGIFKFKCYRLVEDKASTTGFTAVPYKIPAGPYNFYMG FNGSRSQVNATFYLNGKKIPKCESGPIPSSTMKGSNHDRSGGGYSELYKDSRYDRDGSTD LGVVIFDKTEELEVTIEFTKGAKSDMEPTTWCFRPTVDLY >gi|226332219|gb|ACIC01000101.1| GENE 87 117352 - 119097 1162 581 aa, chain - ## HITS:1 COG:no KEGG:BT_4166 NR:ns ## KEGG: BT_4166 # Name: not_defined # Def: putative lipoprotein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 581 1 581 581 1181 100.0 0 MKNIVRFCLMILCITCYSCDDPYKDTVFKVYDVQPAATYLQNRPDDFSEWVKVLKYGDLF NAVNRAEDAFTVLAPTNDAVLRFYEKKGVTSIEDLGYEYARTLVTYHVINDSIDREDFVK SGELPGRTLSGDALKVSFGNEGGDKSVYINKEAHVSELAIRTANGRIYVLDDVLSPAVET IYARLSDKGNYKIFCEALEKTAWSDSLSVVNSVLEGPLGMRVELRKYFTVFAVSDAVFAQ EGITSFSALAAKVGALTDDYELPNNELNRFIAYHIMDGEHTLESLRTFSKPFDRETWEYR KMLNTRATNDLILISENGLNNGVKFIEEQADEIVKNGYIQPIDGYLPVKTDFEPIPVIFD FCDYPEVSSYIAAKGKGQLYQQVSRDDDDTYTALTKFIGRQEIVPQVSSYKIEMGPSGTL ASDWSYLSYCTKGANTSGNWTKLMNNDALVVNIGYNGTLDLTIPPILAGKYRITLYYAYD 
PSMKFIGERGEGSQAGLTNFRLDSKRLAGGDNKRIYDGKTGDVQDCFNTPLTSNLSGEET AFEFTTTASHTLNVVVVDPAASSHNKFRMYFDYILFEPVIE >gi|226332219|gb|ACIC01000101.1| GENE 88 119106 - 120665 1251 519 aa, chain - ## HITS:1 COG:no KEGG:BT_4165 NR:ns ## KEGG: BT_4165 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 519 1 519 519 1023 100.0 0 MAGATLLTSSCNDWLDVLPKNEQVSPEYWKTKEQVEEVLAQGYQNMRLTVPTLIYWGELR GASIYAYSGKSKQELQNFQLNSSSGECKWGGFYSILNVANSIIKYAPEVKRQDETYHEAV MNSNMAEAYFMRAWTYFTLVRNFKEVPLILEPYMTDEYAVDIPKSSEETIIAQIKLDIEN ALSTGAAKEMYDDEDWTGMSKGRVTVWALYALMADVCLWSEDYDGCVRYADMLINSTSAF RPAFVEDPEQWYNIFFPGNSNGSIFELNFDQSRNQSPDDDATASTYPSPSNVYPWTQSTV ASLQFSNAMCKRLFDEQTDQWVTSNTVRGYGNTFVLSSGTVIGNSNTEGYLPFKFRIGGK DGLSTTRSYKDANWILYRMADVLLMKAEALIWKGGEANFAQALELINKIRTRANVTTLSV QTNAVSQENMLGYLLQERDLEFAAEGKRWYDLLRFGRSQNFKYKDQFINMIIENNSTVSA SWIRSVLKNTDAWYLPISQGEIDSNPNLIQNPYYDVTAN >gi|226332219|gb|ACIC01000101.1| GENE 89 120705 - 123971 2685 1088 aa, chain - ## HITS:1 COG:no KEGG:BT_4164 NR:ns ## KEGG: BT_4164 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1088 1 1088 1088 2087 100.0 0 MNQTNKVFYTLVAIFLSCSIEMAGQVQKVISGTVTELFGKTAEPLVGVNVNLVNNQNRSL GGGITNLNGQYNVKVPEGEKDLTIVYSYIGMKTKRIKYTGQTLLNVTLESESMAVDEVVV SARRLNRNDLGISDKEMVSATQKVDMEKLIAAAPVVSIEEALQGQLGGVDIVLGGDPGSR SAIRIRGTSTLNASSDPLIVIDGVPYPTEISDDFNFSTATEEDLGALLNISPNDIASVEV LKDASATAIWGTQGANGVLVIKTKQGTVGKTRFSFSSKWTMKEDPSTIPMLNGKEYTSLM QESIWNSAKYIGLNNPGNKYLNMLSDAPAIGDNPDWRYYDEYNQNTNWLDYVRQKALVSD NSFSMNGGGEKATYRFSLGYMSDIGTTIGTSMNRLNTSMVINYQFSNKLKFGADFSYSQT NTDANWTNTIRSEALSKMPNKSPFTVNDLTGALTDEYFTYHDPNFEGSFNGKSNYNPVAM AHEAINRTVQREGKITFRADYEILPGFYYKGWASINMRAIKTRRFLPQVVTGVEQVNKYA NQSADAYSDQLALQTENKLMYIKNWNDKHNIIANVLVRTGQYINSGYNSEVYGNASSDLS DPVVGTTISDMNSSESETRNVSFVGVLNYTLLNRYVVHGSLNAEGNSAMGRNERMGYFPA VGLAWNFQNEPLLEKARDKWLDEAKFRFSIGQSGRAPSGASVYLGAYVKGTDYMNMSATK QARMQLDNLKWETSTEYNYGLDASVLKGRLRFTFDYYYKTVKDLLQKNYKLPSTTSFGSI 
SYFNSGKMENKGWEFRTDVVIFENKDWRINGYVNFSRNENKITELPANMVQENYSPKNGA YAARAEVGRPIGSFYGYRYKGVYQNTDATYARDAEGNVMNDAKGRPIVMKNLTATCQPGD AMYEDINHDGVINQYDIVYLGNANPTLTGGAGLTVKYKQFSINTFFHGRFGQSVVNTARM NNESMHGNANQSTAVLRRWKNEGDITDIPRALYGEGFNYLGSDRFVEDTSFLRLKQLTLN YAFPKSICNKLGITSLTCFVTGYNLFTWSSYTGQDPEVNLPSRPTDLATDGATTPVSRRY TFGFNLSF >gi|226332219|gb|ACIC01000101.1| GENE 90 123990 - 126323 1514 777 aa, chain - ## HITS:1 COG:no KEGG:BT_4163 NR:ns ## KEGG: BT_4163 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 28 777 1 750 750 1447 100.0 0 MRNVMNRKRHWLLLLLLSPFFLSCEDKMDEHYEKPEWLKGTAWEVLSNEYGGKFSMFLEA AELSGFKPILDGKSVATVMAPDNDAFAAYLEEHGYVSVKDIPTDDLKKLIGYHLIYYSYS KSDLENFRPEDSATSKDDDDDDELGVLQPGMYYKFRTHSTSPITKEVDPSTNNTVTVYHL ERFLPVFSHHIFASKGIDAKKNYEFFYPNSTWTGDNGFNVSNASVKEYQIITNNGYIYNV DRVLEPLETIYDVLKKKSDYSDFLDFYSQYSTYAYDKDLSADYGKAVGVDSLFLHAHSPN GLPNIALEWPTPNFRLYPELASISYSIFAPSNQALNTFFNRYWKAGGYSSLTDLDPLITK ILLYQSVYGGSIVFPDEISGITNSLGSHYDFQLSDVKDKSICVNGSFYGLSNFPMPEIFS TVMGPSFLKRDYLLSLYAIFQSNQMAAYTTTATNYTMLITKNSGYEISDMRLMSDGVGNT LATSGEDGDVAVSTSDLKRIVSGGTVVGDVNFNTPWAVYATQDGGTYWFVKDGKMTTNYV FNSVLGQDPQTVIPTLFTEVKEVTNDAGGSWANGKVYEYESDFGVFGKLDGLEYTSLRTM LTSIGETKYPNFVFAYLMRQAELFATIDGVLAWDYRLAGRLIGFIPTNEALKEALDNDRI PGVKGTIDLSLPSPTLQGEITNQYLLREYLLNYFFTPTNAPVASGCPYLGSPDWLSGEYR NSNNIPVKYTDNGAFITLQLQNQTTGQYGNACKIVSYENFPFAFMDGAFHLIDAVFN >gi|226332219|gb|ACIC01000101.1| GENE 91 126882 - 128960 1444 692 aa, chain - ## HITS:1 COG:no KEGG:BT_4162 NR:ns ## KEGG: BT_4162 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 692 1 692 692 1431 99.0 0 MKITQIREKDGVEALSVLDFNLLIEKIKTEIKTRPVTGLRQALHFVLPGESCSFAGKLPK VIPAAVFGRVNGVKRMKVYNGIVELTVGPLAGKVEVDLVKKKAAELPQTMFAYMGASGNS VKIWTLFTRPDGTLPQTQEEAEIFQAHAYRLAVKCYQPQIPFDIQLKEPVLEQYSRLSHD PEPFYRENSVPFYLSQPFGMPKEMNYQEKLTPEKSPLNRAVPGYDTEDALALMYEAALRK TFQSMEEGWRRNDDLQPLVVRLAENCFLSGIPEEEVVRRTIHRYYHSKQAVLVREMIGNV YQECKGFGKKNGLTKEQYLNMQTEEFMNRRYEFRYNTQIGEVEYRERCSFYFRFRPLDKR 
AQNSILLDAQSEGIAVWDRDVDRYLHSNRVSVYNPLEEFIFHLPNWDKKDRITELADRVP CANPHWTMLFHRWFLNMVAHWRGYDRQHANSTSPLLVGAQGTRKSTFCRDLIPPGLRGYY TDSIDFSRKRDAEMYLNRFALINIDEFDQITLTQQGFLKHILQKPVVNLRKSYSNSVQEL RRYASFIATSNQKDLLTDPSGSRRFMCVEVTGTIDTARPIDYEQLYAQAMYEICHGERYW FDDKDEAILAQGNREFEQTPPAIQLFYRYFKVAGDDKEGEYLSPVEILDYLQKRTTITLS SGKAHHFGRLLQKEGIPCKHTNKGTVYLVVKI >gi|226332219|gb|ACIC01000101.1| GENE 92 129373 - 129975 375 200 aa, chain + ## HITS:1 COG:no KEGG:BT_4161 NR:ns ## KEGG: BT_4161 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 200 1 200 200 375 99.0 1e-103 MAADYDFRRKPNEKGDGEVQPLYPRIVSKGTIDSKRLFREIAEASSFTEGDLAGIMVAFQ EKVSYYLSEGYHVKLGEIGYFSSSLRARPVMDKKEIRSVSISFDNVNFRATPWFRRRSSG TVTRAKFGFQESSNLPEETRRSRLEAFLAKNHFITRREYSQITGLLKGKALRELNLLVEN GVLNTRGYGNRVVYLKPNNQ >gi|226332219|gb|ACIC01000101.1| GENE 93 130114 - 132210 1416 698 aa, chain - ## HITS:1 COG:XF0840 KEGG:ns NR:ns ## COG: XF0840 COG1874 # Protein_GI_number: 15837442 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Xylella fastidiosa 9a5c # 1 215 2 205 612 122 31.0 2e-27 MKKKITFLLMFVLSLSMVSAQGTFRFGTSATPEGKTLLVDSKGLILDGKHIIPVMGEIHY SRVPESEWRREIRKMKAGGINIISTYIFWIHHEPEEGKWNWSGNHNLRRFVRICAEENVM LVLRLGPFCHGEVYQGGIPSWVHEKAGQNPKYKIRARTPGFLEDCTELYNTIFAQVNGLL WKDGGPVVGVQIENESRGPWDYLEALKNIAVKAGFDVPFYTRTGWPALRGKEVFGQLLPL YGDYADGFWDRKLEDMPGSYADAFIMRDKRMSSAIATETFSKEELSEDSPSLSSKLSYPY FTCELGGGMMPAYHRRININGKELKPLVICKLGSGSNLPGYYMYHGGTNPYNPLHTMGET QASPGTNHNDLPHMTYDFQAPLGEVGQVFETPFHEGRFIHQMLTDWGSELLQMNVDSLSR HYARRGAFEFYNDYVRIKNESGTSHVTFKDYRTGGATIDWTTVEPFCKVDDLIYFIEIRG KKPQISVDGKVYTCKLNKQQKAGKLNVCVLSYEKAKTAYKIDGKLLYAKNGGILYKSDSC IVEEVWTKSPVIAATVTEVKKADAPRVVPMGRQAVAAQPVEEDFAKAAVYTINYDTSGIN NYDNLFLRINYRGDVARVYADGRLVADNFWNGKEMWVRMADLVGKKVELKILPLRKDAPV YFQKEQKAMIEAAKGDYMLGLDSVEVIERHTLEFNDAF >gi|226332219|gb|ACIC01000101.1| GENE 94 132239 - 133900 1440 553 aa, chain - ## HITS:1 COG:no KEGG:BT_4159 NR:ns ## KEGG: BT_4159 # Name: not_defined # Def: 
hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 553 1 543 543 1114 100.0 0 MYGKNLKLVLMVMLLLVSKSTFSQVTERERPKEWSQLVKGARFMDRFLPMKGNQLSSDTW GTSDVRPRYVDNGIEDRIWSYWGGNIRKGEDGKYHLMVCGWLEASPKGHMEWRNSWVFNT VSDNLTGPFKPINIIGKGHNPEMFQAKDGRYVLYVIDGRYVADDINGKWEYGKFDFNARD RRIIEGLSNLSFAQREDSSYVMVCRGGGIWISRDGLSEYNQLTDRRVYPDVKGRFEDPVI WRDHIQYHLIVNDWLGRIAFYLRSKDGVNWVTDPGEAYMPGVAVHEDGHSEGWFKYERLK MYQDKYGRAIQANFAVIDTLKHEDKPFDNHSSKNISIPLNPGLLLTVLNDKPITAGTKTI RLKVQAEEGFHPQTDMDISSLRFGASEEVNYGRGSKVLKTENDGNDLIITFDGKGNGITE KEFAPKLIGRYKNGKMLYGYARLPYVDYVEPILSARAPVFSESQKGWNGNIEVQNFGQVS SQKASVKIEYKKEGKMVKVASAAVPALKPYEKADIRFATKADFEKGEDYNFLVTIYSGKK VLSTFRLNRKVVE >gi|226332219|gb|ACIC01000101.1| GENE 95 134141 - 134833 575 230 aa, chain - ## HITS:1 COG:no KEGG:BT_4158 NR:ns ## KEGG: BT_4158 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 230 1 230 230 471 100.0 1e-132 MTHLKIFLLLIVCALSVLDGKAQTGIALTDELNGKRIGVIGDSYVRNHKEPFENTWHYKF AKKHGMEYFNYGKNGNSIAYSSPRWGKAMYLRYAEMADSLDYVIVIGGHNDAFKLDSIGG IDNFKDKMEILCKGLVEKYPTAKIFFFTRWNCKNFKGSDSEKVVDAMIEVCGNYSIPIFD CARKGSIYADNDTFRRIYFQKSKNNTDTAHLNSKGHDRFLKVAESFLLQY >gi|226332219|gb|ACIC01000101.1| GENE 96 135077 - 136660 1355 527 aa, chain + ## HITS:1 COG:no KEGG:BT_4157 NR:ns ## KEGG: BT_4157 # Name: not_defined # Def: alpha-galactosidase precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 527 80 606 606 1118 99.0 0 MKKHLIAWGILSTMFMANTFAQKDIDRPIMGWSSWNTYHVNISEELIKQQADALIKHGLK EAGYNYINIDDGFFGHRDETGKMHPHPDRFPNGMKVVSDYIHSLGLKAGIYSDAGDNTCG SIYDNDANGVGSGLYGHEQQDMDLYLKEWNYDFIKIDYCGGRELGLDEEKRYSTICQAIA NTGRTDVSINICRWAFPGTWAKRLARSWRISPDIRPRWNSVKGIIEKNLYLSAYATDGHY NDMDMLEIGRGLKPNEEEVHFGMWCIMSSPLLIGCDMNTIPDFLLKLLKNKELIALNQDV LGLQAHVVQHENESYVLVKDIERKRGLTRAVALYNPSDQPCDFIVPFETLELGGNVKVRD LIKQKDLGKMKGEIRQTVQPHSVMICKMEAEKRLEPVSYEAEWAYLPCYDDLGKKSKPIV YVPAPDCSGRMKISRLGGREENFAEWSEVYSEKGGNYEMTIFYSCDKNRKLEVSVNGTKT VLKDLNSNNEVKSVTIPVSLKQGYNTVQMGNNFGWAPDIDRFTVSRQ 
>gi|226332219|gb|ACIC01000101.1| GENE 97 136685 - 138358 1327 557 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 47 555 63 594 1087 187 29.0 3e-47 MNKKFRCLLVSLIFTAMWNGTTIQAQAPHPERIYLSGTGIDNTKTWDFFCSAGQNSGKWK KIEVPCNWELQGFGEYTYGRWYTIKGQRPSDETGTYRYKFDAPKSWAGQRVKIFFDGVMT DAEIMINGKPAGEMHQGGFYRFNYDITELLNLGKKNQLEVKVAKESANRSINAAERKADW WLFGGIYRPVWLEVLPQVHMEHFVLNADHHGKLQTAVDMAGDAKGHEIIVSVRSLKDGKT VYTSNGQTTITHPINNSDKEQMISGEWASIIPWSTENPNLYVAKLELKNPEGKIVQTRET RIGFRTVEFFPQDGVYLNGTKLVVKGINRHSFSVDGGRTTSAALSRMDALLIKEMNMNAV RSHYPPDEHFLDMCDSLGLVYMDELAGWQNGYDSKVGAKLVKEMIERDVNHPCIILWSNG NEGGWNTQTDPLFAQYDRLQKRHMVHPWADFNDLDTHHYPTYLTGVARFTNGYKVFMPTE FMHAMYDQGGGAGLRDFWDRWCTNPLFAGGFIWVYCDEAPKRSDKGGILDSDKSNAPDGV VGPRREKEGSYYAIRAQ >gi|226332219|gb|ACIC01000101.1| GENE 98 138458 - 139570 715 370 aa, chain + ## HITS:1 COG:no KEGG:BT_4156 NR:ns ## KEGG: BT_4156 # Name: not_defined # Def: beta-galactosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 370 592 961 961 778 99.0 0 MTYKVLSCGTPLQGNAEAGKVIVEGKVQLPSINPGETGTARFELPDSFRKGDVLELEAFD REGKSICNWTYPIHLAKQYFERNLAQTTLTPAPQKAEARQTSDAIELKSSKVSITFDAAT GMIRHIKSGETEVPFKDGPVAVGMKMRYEPSLSYTRHSSDGAIYCAKYKGAADSIVWRLT NEGLLYMDAILLNRGSGGGGFDDAFMDAQVFNLGLSFSYPEQNCSGMKWMGRGPYRVWKN RIPGTNYSIWHKDYNNTITGESFENLIYPEFKGYHANLYWATLEGEKTSFTVYSRNDGTF FRVFTPEEPAGRIKDTMPAFPEGDLSFLLDIPAICSFKPIEQQGPNSQPGNIRIKSGDEG LHLNLMFDFR >gi|226332219|gb|ACIC01000101.1| GENE 99 139714 - 142122 2001 802 aa, chain - ## HITS:1 COG:SP0648_2 KEGG:ns NR:ns ## COG: SP0648_2 COG3250 # Protein_GI_number: 15900551 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pneumoniae TIGR4 # 32 787 59 871 871 295 29.0 2e-79 MKKLILLSLLTLQFGLLAAQNADGQGRLSHLFNFGWKFHAGDLKDAYSVNYDDSSWRVLD 
LPHDFQIEQPWDKDAGGARGFKAMGTGWYRKTFKADPEWKGKRVLLDFEGIMLIGDVWVN GQKVGGTDYGYIGFETDITKLLRYDADNVVAVSASTGKKGGSRWYTGGGLFRDVHLLVKD YVSIARHGVFISTPKITEQSADINVQVEVEGISGKQLDVEINARIFSPKGELVTETKGMA PKKNKLATVEVPLPVVAVSNPQLWSCETPNLYRAEITLVRDGKVIDNVTEKFGIRTLEYS PEFGFKLNGKKLFLKGIANHHDLGAVGAAAFDRAIERQFELLKKFGYNHVRTSHNPYSKS FMELADKHGILIVDELIDKWSDKSYWGGRVPFTQLWYKMIPEWIKRDRNHPSVIMWSIGN ELQMREDLAGFPTGDWGVTTYRIFDVLVKRYDSTRKTTVAMYPSRAGAISRKDADFNKKI LPPELSTVTEVASFNYQYTDYAKYLEACPDLIVYQSEATSSELTAPFFGMDRDKMVGLAY WGAIEYWGESNGWPKKGWNYSFFNHALEPYPQAYLIKSAFSDEPLVHIGVVDSEKESIEW NDVIVGRMPLSSHWNRAEGSKQNLFTYTNADEVELLVNGKSLGIQKNKRDDIQSRNMIYW QNIPYGKGGNIIAIACNNGKEVARHMLETTGKAKALKIELENADWKADGMDLQYVKVYAV DSKGRVVPATEGEVTFEVSGEARLIAVDNGDHMTDELFSGNHKKLHQGFVMAILRSSQTS GNVKIKASLKGLKSAEKKMTTK >gi|226332219|gb|ACIC01000101.1| GENE 100 142291 - 144216 1320 641 aa, chain - ## HITS:1 COG:no KEGG:BT_4175 NR:ns ## KEGG: BT_4175 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 62 638 42 600 613 417 40.0 1e-115 MKNALFFLAAILLLTVSCTDNQIQLNGDEPEPTPPAERVEPPLKRALWASINGRKDASYY PEYNNKILVSWRMFPTDDSSTGFDLYRKSGNEEEVKLNEEPITLSTNFQDITADRNKDNT YRLCFAGSDETLDTYTITAAQASAGLPYISIPLAGTTGISSEYYYDANDASVGDLDGDGV YEIVLKRLLRSSSSTEDEEDESGAVQMGPWHTTLLEAYKLDGSFLWRVALGPNVPVGNLT SFAVYDFDGDGKCEIAVRTAEGTVFGDGTEIKDTDGDGKVDYRVEGSAHIHGGPEFLSVL DGMTGRELARTDYIALGKSEDWGDNYYKRSASYRVGVGCFSGTTPSILICRGVYGKMVLE AWDFQGQELKKRWRFDTSDGVHGDYAGQGNHSLSVGDVDDDGCDEVVYGGCCIDHNGKGL WNSRHGHGDALHLGKFDPSRKGLQIWSCFEACPFKVGAALRDARTGETIWDFPYSGDMGR CLVADIDPDSPGCEMWWYKGNAHSCTGADLGYGAGSSSMSYNMAVWFSNSLNRQLLDRSK IDAPKEKRVFTIYRYEVTTINSSKSNPCFYADIWGDWREEIIQVTSDQTELRLFTTWYPT DYKFPYLMSDHVYEMSALNQNIGYNQPTQLGYYLGSDLYKK >gi|226332219|gb|ACIC01000101.1| GENE 101 144474 - 145862 1063 462 aa, chain - ## HITS:1 COG:TM0437 KEGG:ns NR:ns ## COG: TM0437 COG5434 # Protein_GI_number: 15643203 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Thermotoga maritima # 48 374 24 359 448 274 42.0 3e-73 
MKQNFLKWMVCMCFGLLSVATVMAGGNYKTVKVKAPFPMQPIKVFIYPDRDFKITDFGAV PGGEVDNTKAIAAAIDACNKAGGGRVVVPAGIWLTGPVHFKSNVNLCLEEDAVLSFTDNP EDYLPAVMTSWEGLECYNYSPLLYAFECENVAISGKGTLQPKMGTWKVWFKRPAPHLQAL KELYTKASTNVPVIERQMATGENHLRPHLIHFNRCKNVMLDGFKIRESPFWTIHLYMCDG GIVRNLDVRAHGHNNDGIDFEMSRNFLVEDCSFDQGDDAVVIKAGRNQDAWRLNTPCENI VIRNCRILKGHTLLGIGSEISGGIRNIYMHDCTAPYSVMRLFFVKTNHRRGGFIENIYMK NVASGTAQRVLEIDTEVLYQWKDLVPTYEKRITRIDGIYMDKVTCESADAVYELKGNAEL PVKNVRIKDVKVGSVKKFVKKVSNVENVVEKNVTYQMEDNKD >gi|226332219|gb|ACIC01000101.1| GENE 102 145896 - 146864 819 322 aa, chain - ## HITS:1 COG:no KEGG:BT_4154 NR:ns ## KEGG: BT_4154 # Name: not_defined # Def: chitin deacetylase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 322 1 322 322 655 99.0 0 MKTFSLFILSLLISIGYLSGADWNVQIARYKQDKACAISYTFDDGLAEHYTLLVPQLEKR GFRGTFWVCGSNINKDNENITDTTRMTWPQLKEMADNGHEISNHGWAHKNFARFPIEEIK EDIFKNDSAIFANTGVMPRTFCYPNNNKKAEGRRIAVQNRVGTRTHQRSIGSKSTLKDLE KWVNTLIETNDWGVGMTHGLTYGYDAFRNPQRLWDHLDQVKARENEIWVGTFREVASYIK EREETKLEVVNKKNTLLIAPELKLDQELFVEPLTMVITGKEIKKVTVKQGKRRLPVQITD DKVLFDFDPYGGMIKVTLKIQK >gi|226332219|gb|ACIC01000101.1| GENE 103 146877 - 148175 771 432 aa, chain - ## HITS:1 COG:CAC0355 KEGG:ns NR:ns ## COG: CAC0355 COG5434 # Protein_GI_number: 15893646 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Clostridium acetobutylicum # 48 327 85 363 513 145 34.0 2e-34 MKQIVGLVLILCCTQIVAQEVFPDGKAIPEWFRENKEVNVNALGKKYSITDYGVANDSTV VQTKKIQGVIDLAARNGGGVIVIPKGTFLSGSLFFKNNTHLHLEEDAVLKGSDDISHFPV KMTRMEGQTLKYFMALVNADGLDGFTISGKGTLNGNGLRYWKSFWLRRSINPDCTNMEEM RPRIIYLSNCKNTQIEGIRIMNSPFWSTHFYKCSFLKLLNLRITSPASPVKAPSTDAVDL DVCNNVLIKNCYMSVNDDAVALKGGKGPWADKDANNGENYDIIIEDCTYGFCHSAFTCGS ESILSRNVIVRRCKVDRARALLTLKMRPDTPQKYEYILVEDITGNVQNCLNIAPWTQFYD LKDRKDTPLSYSNNITMRNIDIDCNVFFNVKKSDQYKLSDFCFENLTVRAKKGKVDQSII DSFTMNNVKINQ >gi|226332219|gb|ACIC01000101.1| GENE 104 148219 - 151200 2384 993 aa, chain - ## HITS:1 COG:TM1195 KEGG:ns NR:ns ## COG: TM1195 COG1874 # Protein_GI_number: 15643951 # 
Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Thermotoga maritima # 335 991 1 644 649 347 32.0 6e-95 MKKVFAWMALTLWSVMTIFAGETAYLFSYFINDSKDGLHLAYSYDGLNWTPLNGGRSFLT PAVGKDKLMRDPSICQSPDGTFHMVWTSSWTDRIIGYASSRDLIHWSEQQAIPVMMHEPE AHNCWAPELFYDEPSETYYIFWATTIPGRHKEVPTSESEKGLNHRMYYVTTKDFRTFSKT KMFFNPDFSVIDAAIVKDPTQGDLIMVVKNENSNPPEKNLRVTRTKNIAKGFPTKVSAPI TGKYWAEGPAPLFVGDALYVYFDKYRDHRYGAVRSLDHGETWEDVSDQVSFPKGIRHGTA FAVDASVVESLIDDRKHQSVKAQTSSWFNDKDLTLTGVYYYPEHWDESQWERDFKKMHEL GFEFTHFAEFAWAQLEPEEGRYDFAWLDKAVALAAKYDLKVIMCTSTATPPVWMSRKYPE ILLKNEDGTVLDHGARQHASFASPLYRELSYKMIEKLAQHYGNDSRIVGWQLDNEPAVQF DYNPKAEQAFRDYLRAKYNHNIQLLNDAWGTAFWSEAYSSFDEITLPKRVQMFMNHHQIL DYRRFAAAQTNDFLNEQCLLIRKYAKNQWITTNYIPNYDEGHIGGSPALDFQSYTRYMVY GDNEGIGRRGYRVGNPLRIAWANDFFRPIQGTYGVMELQPGQVNWGSINPQPLPGAVRLW MWSVFAGGSDFICTYRYRQPLYGTEQYHYGIVGTDGVTVTPGGREYETFIKEIRELRKHY APRETKPADYLARRTAILFNHENSWSIERQKQNRTWDTFAHIEKYYRTLKSFGAPVDFVS EQKELTEYPVVIAPAYQLADKELVNKWIAYVKNGGNLVLTCRTAQKDRYGRLPEAPFGSM ITPLTGNEMNFYDLLLPEDPGTVVMNGKQYAWNTWGEILIPASDSQVWATYANEFYEGGP AVTFRKLGKGTVTYVGVDSHNGALEKDILKKLYAQLQIPVMDLPYGITVEYRNGLGIVLN YSDRSYTFDIPEGSKVLVGTEEIPTAGVLVFSM >gi|226332219|gb|ACIC01000101.1| GENE 105 151443 - 154319 2308 958 aa, chain + ## HITS:1 COG:ZuidAm KEGG:ns NR:ns ## COG: ZuidAm COG3250 # Protein_GI_number: 15804986 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Escherichia coli O157:H7 EDL933 # 27 359 15 304 604 90 25.0 2e-17 MKQLLVSIAIVFASVLTATGQNHSFSLSGKWDFQIDREDGGIKEQWFNKSLDESINLPGS MPEKLKGDNVTARTQWTGSLYDSSYYFNPYMEKYRIEGQVKLPFFLTPDKHYVGVAWYQK KVTIPSDWKGERIILFLERPHIETTVWVNTKEIGMQNSLCVPHVYDLTSAVTPGKPCRIT IRVDNRIKEINVGPDSHSITDQTQGNWNGIVGKIELQTTPKVYFDDIQVYPDLSNGKALV RMNVKSSSTAKGEITLSAESFNTDTRHNVPPVQQSFRIQAGDNRVEMELPMGKDFLTWDE FNPALYKLTAMLNSGKQSEKKQVQFGMRDFKIEGKWFYVNGRKTMLRGTVENCDFPLTGY APMDVASWERVFRICRNYGLNHMRFHSFCPPEAAFIAADLVGFYLQPEGPSWPNHGPRLG NGQPIDKYLMDETIALTKEYGNYASYCMLACGNEPSGRWVAWVSKFVDYWKATDPRHVYT 
GASVGGSWQWQPHNQYHVKAGARGLSWAGSQPESMSDYRAKIDSVKQPYVSHETGQWCAF PNFSEIRKYTGVNKAKNFEIFQDILNDNHMGSMGHDFMMASGKLQAICYKHEIEKTLRTP DYAGFQLLALNDYSGQGTALVGLLDVFFEEKGYINADEFRRFCSPTVPLARIPKFVYTND ETFHADIEVSHFGAAPLQGAKTVYSIKDEYGKVYAHGTVGTQNIPVGNLCPLGSVDMKLS GITRPQKLNMEIRIEGSNAVNDWDFWVYPAQVELAQGNVYTTDTLDAKAISILQEGGNVL ITAAGKIQYGKEVKQYFTPVFWNTSWFKMRPPHTTGIFLNEYHPLFREFPTEYHSNLQWW ELLNKAQVMQFTGFPAEFQPTIQSIDTWFINRKIGMLFEANVLNGKLIMTSMDITSKPEK RVVARQMHKAILDYMNSDAFRPTANIAPELIQELFTKVAGDVKSYTKDSPDELKPKIN >gi|226332219|gb|ACIC01000101.1| GENE 106 154328 - 155276 937 316 aa, chain + ## HITS:1 COG:BS_yesT KEGG:ns NR:ns ## COG: BS_yesT COG2755 # Protein_GI_number: 16077769 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Bacillus subtilis # 165 314 6 155 232 135 46.0 1e-31 MTDMKTTFLGLLLLTATAISAQEQARTFQLADAPRYSEETGYGYDLAPTPEKESKAPFFF SVRVPDGNYKVTVRLGSKKQAGVTTVRGESRRLFIDNLPTRKGQFTEETFIINKRNPRIS DKESVRIKPREKTKLNWDDKLTLEFNGDAPVCQSISIEPADPSVITVFLCGNSTVVDQDN EPWASWGQMIPHFFGTDVCIANYAESGESANTFIGAKRLAKALSQIKKGDYLFMEFGHND QKQKGPGKGAYYSFMTSLKTFIDEARARGAHPVLVTPTQRRSFDANGHIRDTHEDYPEAM RWLAAKENVPLIDLNE Prediction of potential genes in microbial genomes Time: Thu May 12 01:50:53 2011 Seq name: gi|226332218|gb|ACIC01000102.1| Bacteroides sp. 1_1_6 cont1.102, whole genome shotgun sequence Length of sequence - 28833 bp Number of predicted genes - 22, with homology - 21 Number of transcription units - 13, operones - 7 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 252 179 ## BT_4150 putative rhamnogalacturonan acetylesterase 2 1 Op 2 . + CDS 277 - 1797 1220 ## COG5434 Endopolygalacturonase 3 1 Op 3 . + CDS 1748 - 2338 384 ## BT_4147 hypothetical protein + Term 2401 - 2438 2.1 + Prom 2400 - 2459 4.0 4 2 Op 1 . + CDS 2547 - 3950 1298 ## COG5434 Endopolygalacturonase 5 2 Op 2 . + CDS 3950 - 6724 2295 ## BT_4145 hypothetical protein + Term 6886 - 6936 1.4 - Term 6827 - 6861 -0.6 6 3 Op 1 . 
- CDS 6931 - 7611 267 ## COG1600 Uncharacterized Fe-S protein 7 3 Op 2 . - CDS 7608 - 8735 332 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 8775 - 8834 3.7 - Term 9294 - 9353 3.4 8 4 Tu 1 . - CDS 9554 - 9868 226 ## BT_4141 hypothetical protein - Prom 10063 - 10122 8.5 + Prom 9809 - 9868 3.1 9 5 Tu 1 . + CDS 9999 - 10172 78 ## - Term 10110 - 10148 2.2 10 6 Op 1 . - CDS 10318 - 10722 343 ## COG1895 Uncharacterized conserved protein related to C-terminal domain of eukaryotic chaperone, SACSIN 11 6 Op 2 . - CDS 10719 - 11033 345 ## BT_4140 hypothetical protein - Prom 11054 - 11113 6.7 12 7 Tu 1 . - CDS 11121 - 12212 875 ## COG1408 Predicted phosphohydrolases - Prom 12289 - 12348 6.2 13 8 Tu 1 . - CDS 12352 - 13521 1184 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities - Prom 13546 - 13605 3.5 + Prom 13448 - 13507 3.1 14 9 Op 1 . + CDS 13590 - 14375 822 ## COG0561 Predicted hydrolases of the HAD superfamily 15 9 Op 2 . + CDS 14420 - 16417 1853 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member + Term 16430 - 16468 -0.9 + Prom 16419 - 16478 2.2 16 10 Op 1 . + CDS 16538 - 19054 2709 ## BT_4129 outer membrane assembly protein 17 10 Op 2 . + CDS 19072 - 19686 780 ## COG0009 Putative translation factor (SUA5) 18 11 Op 1 . - CDS 19809 - 20621 872 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase 19 11 Op 2 . - CDS 20667 - 21863 1139 ## COG0426 Uncharacterized flavoproteins 20 11 Op 3 . - CDS 21919 - 22893 891 ## COG1242 Predicted Fe-S oxidoreductase - Prom 22971 - 23030 9.2 21 12 Tu 1 . + CDS 23338 - 27690 3042 ## COG0642 Signal transduction histidine kinase + Term 27937 - 27975 4.2 - Term 27777 - 27829 3.7 22 13 Tu 1 . 
- CDS 27923 - 28831 578 ## COG5434 Endopolygalacturonase Predicted protein(s) >gi|226332218|gb|ACIC01000102.1| GENE 1 1 - 252 179 83 aa, chain + ## HITS:1 COG:no KEGG:BT_4150 NR:ns ## KEGG: BT_4150 # Name: not_defined # Def: putative rhamnogalacturonan acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 83 330 412 412 174 98.0 6e-43 AFVHYPAGTYPGQTRDFADNTHFNPYGAYQIAQCVIEGMKKAVPELAKHLKIDPAYNPAH PDDVNTFHWNDSPFTEIEKPDGN >gi|226332218|gb|ACIC01000102.1| GENE 2 277 - 1797 1220 506 aa, chain + ## HITS:1 COG:TM0437 KEGG:ns NR:ns ## COG: TM0437 COG5434 # Protein_GI_number: 15643203 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Thermotoga maritima # 63 372 33 363 448 138 28.0 3e-32 MRIKNILPLLLAGSLMIACTPAKQAGNSFEWGQVPQQPDLSWADSVGSRQLPGDKMIISA NSFGAVADSTVLSTEAIQKAIDSCAISGGGTVVLQPGYYQTGALFVKSGVNLQIGKGVTL LASPDIHHYPEFRSRIAGIEMTWPAAVINIVDEKNAAISGEGTLDCRGKVFWDKYWEMRK EYEARGLRWIVDYDCKRVRGILVERSSDITLSGFTLMRTGFWGCQILYSDYCTIDGLTIN NNIGGHGPSTDGIDIDSSCNILIENCDVDCNDDNICIKSGRDADGLRVNRPTENVVVRNC TARKGAGLITCGSETSGSIRNVLGYDLKAVGTYTVLRLKSAMNRGGTIENIYMTRVSAEN VRQVLAADLNWNPSYSYSTLPKEYEGKEIPEHWKVMLTPVEPPEKGYPRFRDIYLSQVKA ENVDEFISASGWNDTLRLENFYLYGIEAQTNKPGKICYTRNFNLSEITLDVKDKDAIVLK ENEQSNIHFDYVKTTTDRRTAGNLPH >gi|226332218|gb|ACIC01000102.1| GENE 3 1748 - 2338 384 196 aa, chain + ## HITS:1 COG:no KEGG:BT_4147 NR:ns ## KEGG: BT_4147 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 196 1 196 196 404 97.0 1e-111 MLKQLLTVVLLAICLINVQAQQLTPPAGTFRLGISKGTNSHWLAPQEKVKGIAFRWEALP DTRGFILEVAVTSLQQADTLFWSFGDCQPDMDINVFSVEGQAFTCYYGESMKLRTLQAVT PTDDIRLSNGRQDKTPLLLYESGKRTDRPVLAGRCPLAANSKLYFCFYEQNTRADYNYFM LPDLFAKIDESKHSKK >gi|226332218|gb|ACIC01000102.1| GENE 4 2547 - 3950 1298 467 aa, chain + ## HITS:1 COG:CC0572 KEGG:ns NR:ns ## COG: CC0572 COG5434 # Protein_GI_number: 16124826 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: 
Caulobacter vibrioides # 20 449 34 474 527 518 59.0 1e-146 MKRIYLLLSLLTGCLYMQAAIYNVRDFGAKADGKTIDSPAINRAIEAAAQEGGGTVYLPA GEYACYSIRLKSNIHLYLEQGARIIAAFPGKDEGYDSAEPNEHNKYQDFGHSHWKNSLIW GIGLENITISGPGLIYGKGLTREESRLPGVGNKAISLKDCRNVTLKDFSMLHCGHFALLA TGVEHLTLLNLKVDTNRDGFDIDCCRNVRISQCTVNSPWDDAIVLKASYALGRFKDTENV TISDCYVSGFDKGSVLDGSWQLDEPQAPDHGYRTGRIKLGTESSGGFRNIVLTNCIFEHC RGLALETVDGGHLEDIVISNITMRNIVNAPIFLRLGARMRSPEGTPIGTMKRILISDINV WNADSRYSSIISGVPGASIEDVTFRNIHIYYKGGYSEEDGKRTPPEQEKVYPEPWMFGTI PAKGFYIRHAKNITFDGVRFHFAQPDGRPLFVTDDAENIEYYHTPQE >gi|226332218|gb|ACIC01000102.1| GENE 5 3950 - 6724 2295 924 aa, chain + ## HITS:1 COG:no KEGG:BT_4145 NR:ns ## KEGG: BT_4145 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 924 1 924 924 1922 98.0 0 MNKKQLILLCMLAAGGSIQAQQWPDAPAEARPGTRWWWLGSAVDEKNLTYNLEEYARAGM GAVEITPIYGVQGNDANDIQFLSPRWMEVLKHTQAEGKRTGIEIDMNTGTGWPFGGPEVS IEDAASKAIFQTYDIEGGQEIVQDINVTDKKQQPYSVLSRVMAYDENGRCINLTSHVRKD KLEWKAPAGKWKVIALYNSKTRQKVKRAAPGGEGYVMNHLSKTAVKNYLSRFDRAFKSSK TSYPHTFFNDSYEVYQADWTEDFLDQFARRRGYKLEEHFPEFLDESRPEVSRRIVSDYRE TISDLLLENFTRQWTDWAHKNGSITRNQAHGSPANLIDVYAAVDIPECEGFGLSQFHIKG LRQDSLTKKNDSDLSMLKYASSAAHIAGKPYTSTETFTWLTEHFRTSLSQCKPDMDLMFV SGINHMFFHGTPYSPKEAEWPGWLFYASINMNPTNSIWHDAPSFFDYITRCQSFLQMGKP DNDFLIYLPVYDMWDEQPGRLLLFSIHHMAKLAPKFIDAIHRINNSGYDGDYISDNFIRS TRFKDGQIITSGGTGYKALVVPAAHLMPNDVLVHLLKLAQQGATIVFLENYPTDVPGYGQ LEQKRKTYQQTLQKLPSVSFSETTVTPVGKGKIITGTDYARTLASCNIPQEEMKTKFGLQ AIRRVNDSGHHYFISSLQDKGVNDWVTLGTKAEAAALFNPMTGECGEAKVRQTGEQTQVY LQLKSGESVILQTYQQPLQAARPWNYIQEQPFSLSLDHGWKLHFAESEPEIKGTFNIDRP CSWTGIDHPAAKINMGSGVYSLDIELPALQADDWVLDLGDVRESARVRINGQEAGCAWAV PYQLRVGHLLKSGKNHIEIEVTNLPANRIAELDRQGVQWRKFKEINIVDLNYRPANYGHW APMPSGLNSEVRLIPVDYLSFKTH >gi|226332218|gb|ACIC01000102.1| GENE 6 6931 - 7611 267 226 aa, chain - ## HITS:1 COG:MA3660 KEGG:ns NR:ns ## COG: MA3660 COG1600 # Protein_GI_number: 20092460 # Func_class: C Energy production and conversion # Function: Uncharacterized Fe-S protein # Organism: 
Methanosarcina acetivorans str.C2A # 1 220 10 248 248 118 33.0 1e-26 MNKEVENELKKSRVGFVHFVDISKLTNKQNRGLPCAILIGIAINPKFVKDVFNNPDYKPV LEDEYVKTENRVGEVTDELAEFLVSKGYKALSQSDAGLLAEGVFNFETKESVLPHKTVAQ LSGLGWIGKNNLFITPEYGAAQCLGTVLTDAPLETVPYAPISPKCGNCNICRDICERKVL KGTTWNTAISRDDIIDVYGCSTCLKCLVHCPWTQKYVNFKSRFLQP >gi|226332218|gb|ACIC01000102.1| GENE 7 7608 - 8735 332 375 aa, chain - ## HITS:1 COG:lin1814_1 KEGG:ns NR:ns ## COG: lin1814_1 COG2207 # Protein_GI_number: 16800881 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 100 209 11 121 142 78 32.0 2e-14 MNINSETKYCQTCGIPLDIDYTNLKVDQNEEYCDYCLKNGVKLYDFSMDYLIYLWGLFPE EYYKEVGIRHTSSELREIMSRRLPEIKRWKQKINTAHIQYELVIRVQEYINCHLFDDLDS EKLSQIVCISKYHFRRLFKAVCGDSLGNYIQRLRLEYIAFKLISTNVSVSEILSQINYQN KHTLSRAFKNYFNCSIPEFRKRHSNACSEGKNPIQIEPSIERVPHTRIAYLKLERTHHVS HSFSVLWKQVLQFSESYGLLSKGCKYVSMTLDYPFITLEEQSRFMVGVTLPQSFKIPKGF GVYEVPAGEYAIFRFKGLYHELNRVYRYIYLDWLPANDYSLRDPFTFETYINTPEKTPVS ELITDIYIPVKKKEI >gi|226332218|gb|ACIC01000102.1| GENE 8 9554 - 9868 226 104 aa, chain - ## HITS:1 COG:no KEGG:BT_4141 NR:ns ## KEGG: BT_4141 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 104 7 110 110 185 99.0 4e-46 MDDPTCSYPCLMKLDLQDSTDKLAFLKDNWPSFGQIESIDKLSKHELKCTLCFLDVLIDE FVDDEYVCSSRVMTRLVLTRIYVQNTMEIKALKELRLELEKSDG >gi|226332218|gb|ACIC01000102.1| GENE 9 9999 - 10172 78 57 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRYITNLYTTETWRPVTYYLQAQKFTRFSRADNKNAISKIKKRSSFAEIRYKYSYIF >gi|226332218|gb|ACIC01000102.1| GENE 10 10318 - 10722 343 134 aa, chain - ## HITS:1 COG:TM1000 KEGG:ns NR:ns ## COG: TM1000 COG1895 # Protein_GI_number: 15643760 # Func_class: S Function unknown # Function: Uncharacterized conserved protein related to C-terminal domain of eukaryotic chaperone, SACSIN # Organism: Thermotoga maritima # 10 134 4 128 132 77 30.0 4e-15 MKDILDEESRKALIAYRMQRAYDTMKEAEVMIRETFYNAAINRMYYACYYATVALLLKNN 
IQTQTHNGVKTMLGLHFVSTGKLPLRIGKTFTTLFEKRHSGDYDDFMFCDKEMVDELFPQ AELFIKSVDELLKE >gi|226332218|gb|ACIC01000102.1| GENE 11 10719 - 11033 345 104 aa, chain - ## HITS:1 COG:no KEGG:BT_4140 NR:ns ## KEGG: BT_4140 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 104 1 104 104 187 100.0 1e-46 MRRTEVIEQIKDIIRHVAPTAKTILYGSQARNEARSDSDIDLLILLDGEKITLKEEEAIT LPLYELELKTGISISPMVMLKKLWENRPFKTPFYINVTNEGIVL >gi|226332218|gb|ACIC01000102.1| GENE 12 11121 - 12212 875 363 aa, chain - ## HITS:1 COG:CAC3027 KEGG:ns NR:ns ## COG: CAC3027 COG1408 # Protein_GI_number: 15896279 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 41 361 55 391 392 174 28.0 2e-43 MRIFFLIFLLAYVGGNVYIFIRTLQMLSGFSLAMKILLSVIYWLVAFSLVIAMLTRHIDM PVILSKSMFHIGSVWLVFTLYMIFALLIADVTKIFVPSLNHGFYYALGATCCLLLYGYYN YRHPQVNKIDVSLDKPIEGNGINIVAVSDVHLGYGTGKAMLKEYVDMINAQHPDLILIGG DLIDNSLTPLYKENMAEELAQLKAPLGIYMVPGNHEYISGIDESVRFLKDTPIQLLRDSV VTLPNGVQIIGRDDRSNRSRHSLPTLLKQADRSKPIILLDHQPYNLAKTDSLGIDLQFSG HTHHGQIWPISWVTDRIYEQSHGYRKWSQSHIYVSSGLSLWGPPFRIGTNSDMAVFRLNQ MAK >gi|226332218|gb|ACIC01000102.1| GENE 13 12352 - 13521 1184 389 aa, chain - ## HITS:1 COG:YPO3006 KEGG:ns NR:ns ## COG: YPO3006 COG1168 # Protein_GI_number: 16123185 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Yersinia pestis # 1 388 1 391 393 363 43.0 1e-100 MNYNFDEIINRNGTDSVKWDAVERRWGRNDLIPMWVADMDFRTAPFVIDALKKRLEHEVL GYTFACKEWAESIINWLKERHGWEIHEDMLTFTPGIVRGLAFAIHCFTEKGDKVMVMPPV YHPFFLVTQKNEREVVYSPLVLKDGQYHIDFDRFRKDVQGCKLLILSNPHNPGGRVWTKE ELSQIADICYENGTLVISDEIHADLTLPPYKHPTFALISEKARMNSLVFMSPSKAFNMPG LASSYAIIENDELRHQFQVYMEASEFSEGHLFAYLSVAAAYSHGTEWLDQVVAYIKGNID FTESYLKERIPAIRMIRPQASYLIFLDCRELGLNQEELNRLFVEDAHLALNEGTTFGKEG EGFMRLNVACPRATLEKALRQLEQAVNNR >gi|226332218|gb|ACIC01000102.1| GENE 14 13590 - 14375 822 261 aa, 
chain + ## HITS:1 COG:lin1028 KEGG:ns NR:ns ## COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 1 261 1 256 256 138 31.0 1e-32 MTKALFFDIDGTLVSFETHRIPSSTIEALEAAHAKGLKIFIATGRPKAIINNLSELQDRN LIDGYITMNGAYCFVGEEVIYKSAIPQEEVKAMAAFCEKKGVPCIFVEEHNISVCQPNEM VKKIFYDFLHVNVIPTVSFEEASNKEVIQMTPFITEEEEKEVLPSIPTCEIGRWYPAFAD VTAKGDTKQKGIDEIIRHFGIKLEETMSFGDGGNDISMLRHAAIGVAMGQAKEDVKAAAD YVTAPIDEDGISKAMKHFGII >gi|226332218|gb|ACIC01000102.1| GENE 15 14420 - 16417 1853 665 aa, chain + ## HITS:1 COG:Cj0945c KEGG:ns NR:ns ## COG: Cj0945c COG0507 # Protein_GI_number: 15792274 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Campylobacter jejuni # 23 422 13 413 447 169 32.0 2e-41 MSVDTNNAAFQDALNLIQYTRQSVFLTGKAGTGKSTFLRYVCEHTKKKHVVLAPTGIAAI NAGGSTMHSFFKLPFYPLLPDDPNLSLQRGRIHEFFKYTKPHRKLLEQIELVIIDEISMV RADLIDAIDRILRVYSHNLREPFGGKQLLLVGDVFQLEPVVKNDEREILNRAYPTPYFFS ARVFSQIDLVSIELQKVYRQTDSVFVSVLDHIRTNTAGAADLQLLNTRYGSHIEESEADM YITLATRRDTVDSINEKKLAELAGEPITFEGSIEGDFPESSLPTSQELVLKPGAQIIFIK NDFDRRWVNGTIGVIAGIDEEEETIYVITDDGKECDVKRESWRNIRYRYNEKTKEIEEEV LGSFTQYPIRLAWAITVHKSQGLTFSRVVIDFTGGVFAGGQAYVALSRCTSLDGIQLKKP INRADIFVRPEIVNFAGRFNDRQAIDKALKQAQADVQYAAAARAFDKGDMEECLEQFFRA IHSRYDIEKPVPRRLIRLKLGIINTLQEQNKKLKEQMREQQERLRQYAHEYLLMGNECIT QAHDARAAIANYDKALSLDPNYIDAWIRKGITLFNSKEYFDAENCFNTAVSLHPANFKAV YNRGKLRLKIDNTEGAIADLDKATSLKPEHAGAHELFGDALLRVGKEVEAAIQWRIAEEL RKKKS >gi|226332218|gb|ACIC01000102.1| GENE 16 16538 - 19054 2709 838 aa, chain + ## HITS:1 COG:no KEGG:BT_4129 NR:ns ## KEGG: BT_4129 # Name: not_defined # Def: outer membrane assembly protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 838 1 838 838 1525 99.0 0 MKKGLKIAAIVVGVIIILMLLLPFAFQGKIAGIVKTEGNKMLNAQFDFKKLNISLFRNFP QASVTLEDFWLKGAGEFANDTLVQAGEVTATINLFSLFGDSGYDISKVFIEDTRLHAIVL 
PDGRANWDIMKPDTTDTQETPAAEEDSSPFKVKLQRFVIKNMNLIYDDQQGKMYADIRDF NALCAGDLGSDRTTLKLEAETKSLTYKMNGIPFLANANISAKMDVDADLANNKYTLKDNT IRLNAIQAGIDGWVELKDPAIDMDLKLNTNDVGFKEILSLIPAIYATEFSSLKTDGTATL AASAKGTLQGDTVPAFNIDMQVKNAMFRYPALPAGVDQININANVRNPGGNIDLTTIQIN PFSFRLAGNPFSLTADVKTPISDPDFKAEAKGTLDLGMIKQVYPLGDMELNGTINADMQM SGRLSYIEKEQYDNMKASGTIGLTNMKLKMQDMPDVDIKKSLFTFTPKYLQLSETTVNIG KNDITADSRFENYIGYALKGTTLKGTLNIHSNYFNLNDFMTASADSVATTEAAATDSTAI AGVIEVPRNIDFQMDANLKQVLFDKMTFNNMNGKLIVKDGKVDMKNLSMGTMGGNVVMNG YYSTANAKKPEMKAGFKLSDISFSQAYKELDMVQQLAPIFENLKGNFSGSINVLTDLDAA MSPVLETMQGDGSLSTCDLSLSGVKAIDQIADAISQPSLKEMKVKDMTLNFTIKDGRVET KPFDIKMGDYNLNLSGSTGLDQTIDYSGKIKLPASTGISKLMTLDLKIGGSFTSPKVSVD TKSMANQALESVADEAISKLGEKLGLDSATTANKDSIKQKVTEKATEKALDFLKKKLK >gi|226332218|gb|ACIC01000102.1| GENE 17 19072 - 19686 780 204 aa, chain + ## HITS:1 COG:YPO2212 KEGG:ns NR:ns ## COG: YPO2212 COG0009 # Protein_GI_number: 16122440 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Yersinia pestis # 5 199 7 200 206 157 41.0 2e-38 MLLKLYDKNNNPQDLQRVTDILNDGGLIIYPTDTMYAIGCHGLKERAIERICRIKDIDPR KNNLSIICYDLSSISEYAKVGNNEFKLMKHNLPGPFTFILNGTNRLPKIFRNRKEVGIRM PDNNIIREIARLLDAPIMTTTLPYDEHEDLEYMTDPELIDEKFGDIVDLVIDGGIGGIEP STVVKCTEDEPEIVRQGKGWLEGI >gi|226332218|gb|ACIC01000102.1| GENE 18 19809 - 20621 872 270 aa, chain - ## HITS:1 COG:BB0152 KEGG:ns NR:ns ## COG: BB0152 COG0363 # Protein_GI_number: 15594497 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Borrelia burgdorferi # 1 259 1 259 268 375 71.0 1e-104 MRLIIQPDYQSVSQWAAHYVAAKIKAANPTPEKPFVLGCPTGSSPLGMYKALIDLNKKGI VSFQNVVTFNMDEYVGLPKEHPESYYSFMWNNFFSHIDIKKENTNILNGNAPDLDAECAR YEEKIKSYGGIDLFMGGIGPDGHIAFNEPGSSLTSRTRQKTLTTDTIIANSRFFDNDINK VPKTALTVGVGTVLSAKEVMIIVNGHNKARALYHAVEGSITQMWTISALQMHEKGIIVCD DAATEELKVGTYRYFKDIEAGHLDPESLIK >gi|226332218|gb|ACIC01000102.1| GENE 19 20667 - 21863 1139 398 aa, chain - ## 
HITS:1 COG:FN0512 KEGG:ns NR:ns ## COG: FN0512 COG0426 # Protein_GI_number: 19703847 # Func_class: C Energy production and conversion # Function: Uncharacterized flavoproteins # Organism: Fusobacterium nucleatum # 5 398 5 402 403 361 45.0 1e-99 MEQKTRIKGNVHYVGVNDRNKHRFEALWPLPYGVSYNSYLIDDEMVALVDTVDICYFEVY LRKIKQVIGERPINYLIINHMEPDHSGSIRLIKQHYPDIIIVGNKQTFGMIEGFYGVTGE QYLVKDGDFLALGKHMLRFYMTPMVHWPETMMTFDETDGILFSGDGFGCFGTVDGGFLDT RINVDKYWGEMVRYYSNIVGKYGSPVQKALQKLGGLPITTICSTHGPVWTENISRVIGIY DRLSRYDADEGVVIVYGSMYGNTEQMAEAIAAELSAQGIRNIVMHNVTKSHPSYILADIF RYKGLIVGSPTYSNQIFPEVESLLSKILVRELKGRYLGYFGSFTWAGAAVKRMAEFAEKS KFELVGDPVEMKQAMQDITYTQCENLARAMADRLKKDR >gi|226332218|gb|ACIC01000102.1| GENE 20 21919 - 22893 891 324 aa, chain - ## HITS:1 COG:BS_ytqA KEGG:ns NR:ns ## COG: BS_ytqA COG1242 # Protein_GI_number: 16080100 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductase # Organism: Bacillus subtilis # 6 312 15 318 322 236 38.0 3e-62 MSYQIQYNDFPTFLRKHFPYKVQKISLNAGFTCPNRDGTKGWGGCTYCNNQTFNPDYCRT EKSIATQLEEGKCFFAHKYPEMKYLAYFQAYTNTYAELEGLKRKYEEALQVDGVVGLVIG TRPDCMPEELLRYLEELNRHTFLMVEYGIESASDETLRRINRGHTYADTVEAVQRTAACG ILTGGHVILGLPGETHDTMVEQAGILSALPLSTLKIHQLQLIRGTRMAHEYEENPEGFHL FTKVDEYIDLVINYVEHLRPDLVLERFVSQSPKELLIAPDWGLKNYEFVTRLQKRMKERG AYQGKKYRDSEKRVIFAEDNFTTE >gi|226332218|gb|ACIC01000102.1| GENE 21 23338 - 27690 3042 1450 aa, chain + ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 911 1140 56 290 318 135 33.0 6e-31 MKRTWLLLISLFNLCILVAQPTYHIKHYSVNDGMSQGIVQTIIQDKKGFLWFGTWNGLNK FDGYTFKNYKTSYKDGYILNTNRISQIAETKYGDIWCQAYDGRVYMFDNQTEKFIDVLGS VAETMITNNYADHIYTLSKGISWITFQNECAAIRIDDQLCKQGKGITLYSSFKQNLKGDK MYTVFEDSEGDEWILTNKGISIIGKKQVDSDFPFQYIKEADGIIYLVSQSDKLASYNFKT RQFKFIEIPYHVSMINEVKKFDSNTLLLATDNGLIIYKIKEKEFQLIDIRTSTQPSSVVQ 
SVYQDHYKELWIFPQTTGVIRYNPSTEEMQHLFTPADEVIRYERKNKNVIFEDRQHTLWV LPPEGNLSYYDRENKELKPMLKDSNNPKSVFSPLVRFYTLDNQGNCWLTSTRGIYKLSFF PHTYNLVHIDNGFETRAFLCDNEKRLWVSSKAQMIRVYQPDGQLEGYLTPQGKISKDKQS FTSVYCFLEDHEGNIWIGTKENGIYLLKKKSPDSYSVQQFTHQPNMPYSLSGNSVYSIFQ DSHKRIWIGCYGGGINLLTYNKDGKTEFIHSDNELKNYPTGYASKVRHISEAPGGVLLIG TTNGLLTFSNQFERPEEIKFYRNCCESGISTSLSGNDVMHTYTDKQKRTYIASFTGGISQ IVSKNLLSNKIEFKTYTTDDGLASDLVQAMQEDQQGNLWILSENAISKFDAQRKTFENYG LNSLHQEFSFSEAAPVINARNQIVFGTDKGFIEISSKEMQKSSYVPPIVFTGLKVQGQSS VIAIDDLKELRLTPSQRNVTFQFAALDYVDSKEIRYAYRLKGLEEEWHDGDKNRSASYIN LPNGKYKLQIKSTNSDGVWMDNIRTLSINVLPTFWETGWAWLLYIILFVLFTGSIVYVLF YIYRLRHQVDIEQQLSNIKLRFFTDISHELRTPLTLISSPVNEVLENEDLSATAREHLTV VQKNAERMLRLMNQILDFRKIQNQKMKVLVEKTDLIPLLEKVMINFRLIAEEKKINFRLE SELKSIYAWIDRDKFEKIFFNLISNAFKYTPAEKAITIEVTKQTDKVTISVVDEGIGIEP TKLRTLFQRFETLAQQNMLQPSSGIGLSLVKEMVDMHHGTIKVTSEPEAGSRFMVTLPLQ KEVFEQDSQVEFILNDSQSPVTHPDSSLQTEKRAEAEDKEDMENNAAPDTLTILVVEDNE ELKTFLKNILSENYTVITASNGKEGWQHAVDNIPDLIISDVMMPVMDGLEMIRQIKENNN ICHIPIIVLSAKASLDDRIAGLEQGIDDYITKPFSATYLKTRIASLLRQRKSLQEIYMAK LTEGKEIAVAEALTPSQPQITPYDEQFMQKVMEFIEEQMDNAELTIDEFAEHLMLSRTIF YRKLKSIIGLTPVDFIREVRIKRAAQLIDSGEYNFSQVAYMTGFNDPKYFSKCFKKVVGI TPSEYKEKNK >gi|226332218|gb|ACIC01000102.1| GENE 22 27923 - 28831 578 302 aa, chain - ## HITS:1 COG:TM0437 KEGG:ns NR:ns ## COG: TM0437 COG5434 # Protein_GI_number: 15643203 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Thermotoga maritima # 2 262 178 429 448 172 36.0 8e-43 DEEWTYMKSWLRPVMLSIVKSKRILLEGVTFKNSPGWCIHPLSCESLTLNDVKVFNPWYS QNGDALDVESCKNVLVTNCFFDAGDDAICLKSGKDEDGRRRGEPCENVIIKNNTVLHGHG GFVIGSEMSGGVRNVYVSGCSFVGTDVGLRFKSTRGRGGVVENIFIDNINMIDIPNDALT MDLYYAVNDSPETPIPDVNEETPVFRNIYISNVLCRGAGRAVYFNGLPEMPLKNIFIKNM TVTNAKKGIVINQASQVNMENIKVEDPEAPGIQIKNATGIIINGKEYKKDSGKMLLSGNQ RN Prediction of potential genes in microbial genomes Time: Thu May 12 01:51:34 2011 Seq name: gi|226332217|gb|ACIC01000103.1| Bacteroides sp. 
1_1_6 cont1.103, whole genome shotgun sequence Length of sequence - 33891 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 8, operones - 7 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 669 423 ## COG5434 Endopolygalacturonase - Prom 889 - 948 6.4 - Term 879 - 936 7.3 2 2 Op 1 . - CDS 960 - 3035 1733 ## BT_4122 hypothetical protein 3 2 Op 2 . - CDS 3050 - 6478 2771 ## BT_4121 hypothetical protein 4 2 Op 3 . - CDS 6505 - 8172 1203 ## BT_4120 hypothetical protein 5 2 Op 4 . - CDS 8202 - 9881 1010 ## COG3866 Pectate lyase - Prom 9921 - 9980 4.2 - Term 10269 - 10327 -0.4 6 3 Op 1 . - CDS 10332 - 11144 667 ## COG3279 Response regulator of the LytR/AlgR family 7 3 Op 2 . - CDS 11148 - 11942 454 ## BT_4117 hypothetical protein - Prom 12075 - 12134 5.4 + Prom 12067 - 12126 4.6 8 4 Op 1 . + CDS 12226 - 13797 988 ## BT_4116 hypothetical protein 9 4 Op 2 . + CDS 13804 - 15297 1266 ## BT_4115 hypothetical protein + Prom 15350 - 15409 3.9 10 5 Op 1 . + CDS 15517 - 18738 2254 ## BT_4114 hypothetical protein 11 5 Op 2 . + CDS 18758 - 20578 1554 ## Phep_1154 RagB/SusD domain protein 12 5 Op 3 . + CDS 20592 - 22169 1162 ## Phep_1155 fibronectin type III domain protein + Term 22193 - 22221 1.6 - Term 22873 - 22911 -1.0 13 6 Op 1 . - CDS 23060 - 27595 3294 ## COG0642 Signal transduction histidine kinase 14 6 Op 2 . - CDS 27661 - 29373 1327 ## COG4677 Pectin methylesterase - Prom 29448 - 29507 1.6 15 7 Op 1 . - CDS 29514 - 30482 749 ## COG4677 Pectin methylesterase 16 7 Op 2 . - CDS 30492 - 31646 1025 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins - Prom 31702 - 31761 5.5 17 8 Op 1 . + CDS 31993 - 33390 1536 ## COG3775 Phosphotransferase system, galactitol-specific IIC component 18 8 Op 2 . 
+ CDS 33423 - 33891 516 ## COG3717 5-keto 4-deoxyuronate isomerase Predicted protein(s) >gi|226332217|gb|ACIC01000103.1| GENE 1 3 - 669 423 222 aa, chain - ## HITS:1 COG:TM0437 KEGG:ns NR:ns ## COG: TM0437 COG5434 # Protein_GI_number: 15643203 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Thermotoga maritima # 49 179 18 149 448 108 43.0 1e-23 MNGILKKDLRVVLTGVMIFISYQVYSNPESGTKEQGDVYRNLPFSMPEVSQPSFPDYEVN IRDFGALSDGVTLNTEAINNAIKAVNSKGGGKVIIPEGLWLTGPVVLLSNVNLYAEKNAL IVFSSDTSLYPIIDTSFEGLDTKRCQSPISAMNAENIAITGSGVFDGAGDRWRPVKKDKM TERQWKNLVSSGGKVDENGKVWYPDAGALKASVLMTGQNNGQ >gi|226332217|gb|ACIC01000103.1| GENE 2 960 - 3035 1733 691 aa, chain - ## HITS:1 COG:no KEGG:BT_4122 NR:ns ## KEGG: BT_4122 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 691 1 691 691 1346 100.0 0 MKKKNFLYTGTLALGMFMSGCSDSFLDMNNYGAYDDFNSETKITWYLAGLYQTSFENYTS PTSQYLGLYTSYAQDFNEFTDEMWGITSTSRIDPSTQYSTIDDIKTQTDASSGKSYDPLF AGYFGKALGSSVTNNAYTRIRNCNILLRDIDASSVSQDTKDKAKGQALFLRAMQLFDLVR MYGCVPIVTTVLNAEVTDAGLPRASVTQCVEQIVKDLTDAAALLPDEWGTNDYGRLTRGG ALAYKSRVLLFYASPIFNKNWNDPGNLRWQKALEATQNAISGITASGLDGVTDAASWSKI LADDDNEHSNRETLVVRLLAKESNSSLGYKNNRWEKNIRLSSQGGSGGKGAPIELIDVFP MADGTLPDAAHQVTEGSLRFMENRDPRFYQTFAFNGLKWGHKDVTNDTVWAYRWRTSEST TSGFAYSEGVNITSPVFVRKMSGLNTASANNYEASGVHIYDFRYAELQLNLAECYAATNQ IDLCKQAIGKLRARVGIPSTNNYGLDTYVFDRASALAACLRERQVELAYEGKRYWDIWRW MLYDGGRGEAMQLSTTNTCSFLGVTPLTERYRTAKYVDVKDGYTPGSKDVLSELRKNIFA DPESADFQKQLKEVADFWESNFQYGEPNVQPDKNNSNEWIKIGWRSNYYIMGLSKDILDN NSWLGQTKGWTDQNGAAGTIDWQDDETLTID >gi|226332217|gb|ACIC01000103.1| GENE 3 3050 - 6478 2771 1142 aa, chain - ## HITS:1 COG:no KEGG:BT_4121 NR:ns ## KEGG: BT_4121 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1142 1 1142 1142 2244 99.0 0 MNKHIVFMKFKRSSFIKKVGVLLCFVAFALSANAQTRKVTGQIVDESGQPIIGATIRLQD ATTGTITDIDGHFSLNVPDGKKVVISYIGYLKQVILPKGDTLKVILQEDNQKLDEVVVVG 
YGSMKQKNITGSVSTISAEELEDLPVSNLSEALQGMVNGLNVQLGSSRPGTNANEVYIRQ NRTFTGISKDGGNSTPLIIIDDVIQLGTNGQPSMEQFNMLDPSEVESITVLRDASAAIYG SRAANGAILVKTKRGKKGVPVISYSGKFAVNDAVSHSKVLKGSAYGRFYNALAIGSNKAS GYDDLDVLYSDMELAEMDNLNYDWLDKAGWKSAFQQTHTLNVTGGSERATYYAGATYFDQ GANLGDQSYKRYTYRAGVDVRLTNDIKLSATVAGNEGKSDQIYTKGARFKLYGMSGSTEK SDYSALHHMPNHMPWSVTLSDENGQDQEYWLGPIENTYSSPSFNRDYVTSWNYFALNNSG SFSKNRSNSWNADISLTYEVPFVKGLSLRATYSSSHSSEATEQASFPYELAYVGGRMPAD QHLVYTIPSSSFKTAIFDKNSTLSFKDKQAERRQMNFYVNYDRTFGQHSISAMASIERYE SFYDSRDIEYADLAHDISDTYLGVGGPSIVGPDGKSALASDNTVTLKGESGSLSYLGRVA YSYADRYMLQFIFRSDASTKFAPENYWGFFPGISAGWIMSEESWFKRSLPWFEFLKVRAS WGRTGRDNIKMWKWKEQYKMDLKGMQFGAESGKPGTSLIPQSSPNRNVKWDVSDKFNLGF DTRFFDGRLSAVFDFYYDINDNILNQFMASQPGIPVYAGGSYAEENFGRVDTYGGELSLT WRDKVGQVNYNIGMDFGLNGSRVKEWVPGLRYNKYPSSSSWEEGMSTYLPVWGFKVWKGT SNGDGILRTQDDINRYWSYLESYTPEGGQTKYLDKTSKDDLRPGMLAYQDLGGEMVNGVQ QGPNGQIVLEQDYGKLCEKNKTYNVSTRLGASWKGLAISANIATSWGGVRFIDRASMGGN KSTMIWAPDSFWGDMFDEMYNPNGKYPNLGTESLISSSAIANSDFWMISTFRCYIRNLSV SYTLPKKWIAPLKMSAVRLNLTGNNLWDLYNPYPNHYRNMYDASSTDYPTLRTWSLGVNV TF >gi|226332217|gb|ACIC01000103.1| GENE 4 6505 - 8172 1203 555 aa, chain - ## HITS:1 COG:no KEGG:BT_4120 NR:ns ## KEGG: BT_4120 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 555 1 555 555 1079 98.0 0 MNLLKLMTVGCLLSCFSGLMTSCSDDDDAGSKVLLRPVTAMEIVQNKAYLSWKSVEGATE YVVEVYKVVDKGEELYKTETVPGDRSSCVIDLDWEENYKFKVKCEGNGRLSGYWETEVTG VLYRPLSIELGEARTIDTQALISWTPNDTVVITALTAVPMGLETVNSQDIKVYNVSSEEY LAGSKIIDDLTPETSYRVSFYSGDEQSSDTYQARIEVKTAVTENLDEDYGTANRIDLRNE AFDPDYFNRLDWNSLAEGTTFVLPAGKTYVLNSGETVIEFAHSVHFVTPQTLEDYPTFSF DNAFRIVEGGVVDKVTFKRINLRASKSLSEVADNSLSGKQVICPESDVFLINTIDFTNCY IENFRSIVRSKKATGNVGAIAFKECTINAIGNQGIVSTDGKNGNYINDVSFDECTITNIC GIADLRNSSSGKSISITNTTFCYAPMENSFLFRVDPSIAVKIENCVFGGSMKIDGKLPKF NELGSGGQDDYTGVYPFSSVNSFQANDRTSSKGNLGLSDSKMSTATLFTAPGTNNFKLNE LFTGCSSVGASKWRR >gi|226332217|gb|ACIC01000103.1| GENE 5 8202 - 9881 1010 559 aa, chain - ## HITS:1 COG:TM0433 KEGG:ns NR:ns ## COG: TM0433 COG3866 # 
Protein_GI_number: 15643199 # Func_class: G Carbohydrate transport and metabolism # Function: Pectate lyase # Organism: Thermotoga maritima # 69 272 33 226 367 61 28.0 5e-09 MKNCAFIKKWKSWYQRFPVLIRTTFVVTPVFFRRYLVQYPFCACLLFLGACTDVKGFTNM ELGPYAKGPEVLKAFPTAEGFGKNATGGRGGKVVIVTNTNDDGDGSFRWALQQCSQNEAT TVVFAVSGKIELKSEIRCKAKNFTLAGQTAPGDGVCIIKNEINFGGSENFIIRHMRFRVG EKDASGKEHNAACLRVENANNFIIDHCSFSWASEENTDFIDTHFSTVQWCISSEGLYYSV NKKGARAYGGAWGGTSSTYHHNLFAHCNSRTPLMNGARGKDPGQDIVVYMEYINNVNYNW GSQMATYGGMDESQDPEHHGWSCNFVNNYYKPGSATTARVKELKFFRQSSAREPNKAPLR AVSKWYFHGNVMEGNSQLTSDNWEGVYTDGNYPYSIDEMKASSFIIPSGKENYEQYWFDW ESYTLSDQYESAEKAYQSVLADKSGAGAFPRDKVDARIVKEVKSGLCTYTGAGDANSGAI PGIINSPDEAEGLDGLTYKTSGTITDADQDGMDDAWEKKVGLDPANPEDRNRTTEVGYTA LEVYLNSLVGESISYNFKK >gi|226332217|gb|ACIC01000103.1| GENE 6 10332 - 11144 667 270 aa, chain - ## HITS:1 COG:BS_lytT KEGG:ns NR:ns ## COG: BS_lytT COG3279 # Protein_GI_number: 16079944 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Bacillus subtilis # 135 268 106 239 241 58 27.0 9e-09 MKTHPIIDYPYRWVPVLVVALVLVLAQVMIMSVYTGADYVPAVIDGIATIGWLMALGYLV WFVVGVVSIFQTEVITLVAGILIWIAGSFMVYDIVTRIAGIPYITFASTIPFRLLFGIPT WVAILLWYRLIVAKEDALNQELEKELIMHQPVSLLEEPQIEQIDRITVKDGSKIHLIKTD ELIYIQACGDYVMLITPSGEYLKEQTMKYFETHLPSDTFVRVHRSTIVNVTQISRVELFG KETYQLLLKNGVKLRVSLSGYRLLKERLGL >gi|226332217|gb|ACIC01000103.1| GENE 7 11148 - 11942 454 264 aa, chain - ## HITS:1 COG:no KEGG:BT_4117 NR:ns ## KEGG: BT_4117 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 264 1 264 264 513 93.0 1e-144 MEQKSKFSSPTGFSGKLLVASLFILSGILLFARNMGWITAELFSIIVSWYSLFIIMGIYS MIRRHFVGGIILVLIGVYFLLGGLSWLPENSQAMVWPLALIIAGVLFLFKSGHRGPWNDK QRMFRDHREWMKHGHYGHAGMNFANNQQQSESEDGFLRSDNTFGAARHVVLDELFKGAVV RTSCGGTTIDLRHTHIAPGETYIDLDCSWGGIEIYVPSDWKVVFKCNAFFGGCDDKRWQN GNINKESVLVIRGRLSFGGLEVKD >gi|226332217|gb|ACIC01000103.1| GENE 8 12226 - 13797 988 523 aa, chain + ## HITS:1 COG:no 
KEGG:BT_4116 NR:ns ## KEGG: BT_4116 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 523 1 523 523 942 96.0 0 MNKTFLGAFLASVFISFTACSEENLEQDTNPPTEQPGDSTDPDDSGEDQLPEYPTPDRST VAAFPGAEGAGKLTSGGAGGTVYTVTSLKDDGSEGTLRWAIEKSGKRTIVFAVGGVIPLS KQLQIKNDDITIAGQTAPHPGICLKNYTLRVNANNVIIRFIRSRMGDECKTEDDSMNGYQ DSYPGKRNIIIDHCSMSWSTDECASFYGNTDFTMQWCIISESLTNSLHNKGGHGYGGIWG GSPATFHHNLLAHHSNRTPRLCGSRYSNREDMEKVDLCNNVIYNWTGEGAYAGEGGSYNI MNNYYKPGPVNAIAKVHCRIFTAYLDDGKNQQAKGVYGKFYINGNYFETHEKLSNSQKTE LANANADNTSSTAFCVKNNEVSTKDLLVSSRFPILDDYSFVQSAQDAYQSVLLYAGVSNL RDHIDERIVKETQKGTYTYTGSNGGTNGLIDTQADVEGWSEYASTTSAQQDSDKDGIPDE WETANGLNPNDSSDGNKYNLSKEYTNLEVYLNSLVNSLYPTNN >gi|226332217|gb|ACIC01000103.1| GENE 9 13804 - 15297 1266 497 aa, chain + ## HITS:1 COG:no KEGG:BT_4115 NR:ns ## KEGG: BT_4115 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 497 1 497 497 991 98.0 0 MNKKILIGLCALFMLPSIQGNTNTDNSYKENITPDRSKVPAFPGADGAGKYTTGGAGGTV YTVTSLADDGSEGTFRWAINKKGPRTIVFAVSGIIELQKPLKLSNGDVTIAGQTAPGDGI CLKNYTFSIQADNVIVRFIRSRMGADIKQKGDDAMNGTKGNSNIIIDHCSLSWCTDECAT FYDNSQFTLQWCIISESLANSIHEKGAHGYGGIWGGQKASFHHNLLAHHTNRTPRLCGSR YTGKPEEEKVELFNNVIYNYGSDGAYAGEGGSYNFLNNYYKPGPYSATKGSYRRLFTAYA DDGKNQNEAGVHGTFHFKGNFMDATCPSLTDKQKEALYKVNMDNTFGLIVKNDFAPEKNL LSKKAFDIAEHTSLQPAKKAYKDVLQFAGASHRRDAVDQRIVEETRKGNYTYEGSHGSTN GMVDQPIDVGGWPEYKSEPTPTDSDGDGIPDEWEKKYNLNPNDPSDGAQYGLSKEYTNLE VYLNELVNHLYPADSKK >gi|226332217|gb|ACIC01000103.1| GENE 10 15517 - 18738 2254 1073 aa, chain + ## HITS:1 COG:no KEGG:BT_4114 NR:ns ## KEGG: BT_4114 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1073 1 1071 1071 1308 65.0 0 MPKGMKNMRTLLLMIFAALSLSVSAQTITLNGNVKDTTGEPIIGASIVEKGNTTNGTITD LDGNFSLKVPANATVVISYIGMKTQEIAIKGKSKIDVTLSDDAKALDEVVVIGYGTAKRK DITGSVASVNSETLAAIPVASATEALQGKMAGVQITTTEGSPDAQMSIRVRGGGSITQSN EPLFIVDGFPVESISDIPPTDIESIDVLKDASSTAIYGARGANGVIIVTTKSGKEGKVSV 
SYNAYYAWKKVAKTLDVLSASDYTKWQYELAQLTNKNDDYIKYFGNYQDLDMYDNVATND WQDLTFGRTGHSFNHNLSINGGSDKFKYSFSYSHIDDKAIMQMSDYARDNLNLKLNHKPL KNVSLDFSARFSKTKINGGGSNDANSSLNTDKRLKYSLLYTPYPVPGISDALNADEEDAN SLLINPLISLRDNDTRKERINYNLAGSITWEIIKDLKLKAEVGYDDYRNNADRYYGVTTY YVRNNVDDEDKSKPAISFTRGTRTAFRSTNTLSYDFKKFFKNNKNHLNLLAGQEYIVTRE RELRNVVHGFPTEYTADDCINYSTKGNPYEVTNYGYPDDVLFSFFGRANYDFDSKYLFSA TFRADGSSKFAKGNQWGFFPSAAVAWRISSESFMENTKNWLDDLKLRFSYGTAGNNNIPS GQISQLYELNKTPWINGVNSYIAPSKYMANSDLTWETTITRNVGLDFTTLGGRLSGTIEG YWNNTKDLLIMFPTSGTGYNYQYRNMGETENKGLEASVNWVAVDKKNFGLSIGANISFNK NKIKSLGSLQKLTNEQAQSYWASTEIGQDFLPEVGGSVGKMYGYVSAGRYEVSDFESYDA TTDTWTLKKDVNKSSAVGTARPGMMKLKDLDNSGDANGEKDKTIIGDANPLHTGGFNINA RLYGFDLAANFTWSYGNDIYNANKIETTSTSKYLYRNMTSEMSAAKRWTNIDGNGNLVTD MNQLAQMNENTTMWSPYMSKFVFSDWAVEDGSYLRLSTLTLGYTLPAHLTKKVGINNLRF YVTGYNLFCITGYSGYDPEVSTCRRNGSQLTPGVDYSAYPKSRQFVIGLNLNF >gi|226332217|gb|ACIC01000103.1| GENE 11 18758 - 20578 1554 606 aa, chain + ## HITS:1 COG:no KEGG:Phep_1154 NR:ns ## KEGG: Phep_1154 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 1 602 1 587 592 560 49.0 1e-158 MKKILYTALAASLFAFSSCDDILDTSKKSSMEKTEVFSNEALVNDVVMGLHQSFGETNSY RGRYIAYFGVNSDCEIWNNTGKKGAFTDKEGALVTYNATTDNQYMNTDNNVWAKLYEAIE RANSAITGMDEYSDMSNANMRQFYGELLTLRAFIYFDLIKAFGDVPARFEPNTTETIDLP KTDRMVIMRRLLNDLLIAQDYVGWPNENSFTKSTERVSQTFTKGLRARIALFAAGYSQHP DGIRYNTEDATERQELYTIAKNECLDIISKGYNTLGTFEANFKALCAEGTIAGAESIFEI PFSASRGRVIYTWGVKHEKKDQWTKLAKGGINGPIPTLFYDYDVEDVRRDITCVPFKWTS DNDGDIAWKAPNKCWGGWSFGKVRFEWMNRVVDSSNDDGMNWQVMRMADIYLMAAEAINE LEGPKGSSDAGKYLKAILDRSYPAEKASAILTKAKASQDAFFNVIVDERKFEFAGEAIRK VDLIRWNLLGSKMNEAKEKMTRLYNREGEYADLPLKIYYNEGLDGTDATSYKMYGLNHGD TDEIGQTLGYSKSKEWIVPKESADQAAALLLIDQLYDNNPDTKQFWPIWKVFIDGSNGVL TNDYDY >gi|226332217|gb|ACIC01000103.1| GENE 12 20592 - 22169 1162 525 aa, chain + ## HITS:1 COG:no KEGG:Phep_1155 NR:ns ## KEGG: Phep_1155 # Name: not_defined # Def: fibronectin type III domain protein # Organism: P.heparinus # Pathway: not_defined # 1 522 3 515 516 203 32.0 
1e-50 MKKNYIYTVLFSLVLSAFTVSCSEPDDEVTTGIFDRLFSPTNVEAVIQKKTNVKFKWTAV TNATSYTIELYENQDMTFEGTPKTYEGITDNTYTVEGLLGETQYTARIRALSEEINESKW SAVSFMTEAENIFNSVKDENIEAHAVTLTWDAIDATATKIVLSADGKADITYTLKSTDIA NKKAYIDGLEESTSYTAKLYNVDKLRGTVTFKTAIDFQGKTPVYEGDDLATVLEGAADGA NIVLVSGSFVLGDYALNKSVIISGYDKANMPTIYGRLQAEAGASSIEINNVIFRGDTPGA EELVSNFIELQGGANISTLTVSGCEIRNYKNQILYCNVTATLGTALFENCWADNITGSGG DGFDLRANTTLGTLTIQNSTFSNGIRTFLRCNMTSATVSVTNCTFYKVCSYDGGSNNNGL FLMDKVSTSTGKLTVEKCVFSQIGVGTLGYWAKKGKMKAQASYSKNYYHNSANLWDATNG LYTDPSACNATEIDPKFTNPESGDFTVGAEDIKDSKAGDPRWIKE >gi|226332217|gb|ACIC01000103.1| GENE 13 23060 - 27595 3294 1511 aa, chain - ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 943 1170 56 287 318 142 35.0 5e-33 MIKRILYLILLCLTAPVVQAQQHSFFTHYSTEDGLSQNTVMSMLQDHKDNLWFATWDGIN RFDGYNFKTYKARQGNYISLTNNRADRMYEDRYGFLWLLTYDNRVHRFDPKTETFEQVPA AGEEGSTYNIHAIQVLPDGTVWLLTEHDGAIRVVTLPQKNYTTKSDIYSTKSGLFPADHF YQVYQDKAGNHWLLTDNGLGMIRPGEQQPVNYFVETKGKQDGMNQAFYAVRECEDDICFA SDRGRIWCYQKQGGEFRLLELPTKARITSIHTVSNGNTVVTTDSDGFFLCNLKTNKHTHY SPVTCKELSAQPILSAYVDRTSEVWFEQEEPGVVVHFNPATGVVRREQMRVEYSNADRSR PAFHIHEDVHGHIWVHPYGGGFSYFDRDRNCLVPFYDDLGSNNWRFSNKIHAAFSDKQGN LWLCTHSKGLEKVTFRNVPFSMLTPVPHEYESLSNEVRSVCEDKQGNLWVGLKDGMIRLY DSNRKFIGHLTGNGTISMTGTPMEGTAYFIMQDSKGIIWIATKGNGLVRAEQISPTSMSY KLTRYQHDSNDMYSLSDNNVYCVYEDHHGRIWAATFAGGINYISQGEHGETVFINHRNNL KGYPIDVCYKARFITSDNNGRLWVGTTTGAVAFDENFKKPEDIQFHHFSRVPNDTKSLSN NDVHWIIATQQKELYLATFGGGLNKLISISENGHGEFKSYSVLDGLSSDVLLSIREDHKQ NLWISTENGICKFVPSGERFENYDERSISFRVRFSEAASTLTSGGDMLFGTSNGLFMFTP DSIRKSSYVPPVVFSKLMVANEDVIPGEKSILKVDLDDTQELVLSHDENIFSVQYAALDY TNPQNIQYAYILDGFEKQWTFADRQRSVTYTNLPKGDYIFRVRSTNSDGVWVDNERILNI TILPSFWETPLAYVLYVCFVLLIIFVAVYILFTIYRLKHEVSVEQQISDIKLRFFTNISH ELRTPLTLIAGPVEQVLKNDKLPADAREQLVVVERNTNRMLRLVNQILDFRKIQNKKMKM QVQQLNVVAFVRKIMDNFESVAEEHNIDFLFQTEKEALNLWVDADKFEKIVFNLLSNAFK 
YTPNGKMITVFIREDEGTVSVGVQDQGIGIAENKRKSLFVRFENLVDKNIFNQASSGIGL SLVKELVEMHKATISVDSRLGEGSCFKVDFLKGKEHYNSSVEFILEDSVAPLSMERIVDI ANSSLQTEAAIADAPDLEVSAAKEEAEESSSKELMLLVEDNQELRSFLRSIFASTYRVVE ASDGMEGWSKALKYLPDIIISDVMMPEKDGIEMTRELRADMTTSHIPIILLTAKTTIESK LEGLEYGADDYITKPFSATYLQARVENLLMQRKKLQNFYRDSLTHVTVSETPVAQGETLT VHASTGPESSAAEEPAMPEMSPNDRKFMDKLVDLMEQNMDNGELVVDDLVRELAVSRSVF FKKLKTLTGLAPIEFIKEMRIKRAAQLIETGEFNMTQISYMVGINDPRYFSKCFKAQVGM TPTEYREKVGR >gi|226332217|gb|ACIC01000103.1| GENE 14 27661 - 29373 1327 570 aa, chain - ## HITS:1 COG:CAC3373 KEGG:ns NR:ns ## COG: CAC3373 COG4677 # Protein_GI_number: 15896615 # Func_class: G Carbohydrate transport and metabolism # Function: Pectin methylesterase # Organism: Clostridium acetobutylicum # 274 561 2 318 321 247 44.0 5e-65 MKNRLLLLFIVFILFSAFRADKPVLTIFTIGDSTMANKNLYGGNPERGWCMVLPGFFSED IRVDNHAANGRSSKSFISEGRWAKVISQVKKGDYVFIQFGHNDEKADSARHTDPGTTFDD NLRRFVNETRAKGGIPVLFNSIVRRNFVQPEDASIATDARRAPGEQELPKEGNVLYDTHG AYLDSPRNVAKEMGVAFIDMNKITHDLVQGLGPAESKKLFMFVEPEKVPAFPKGREDNTH LNVYGARTIAGLTVDAIAKEIPELAKYVRHYDYVVAQDGTGDFFTVQEAINAVPDFRKNV RTTILVRKGTYKEKIIIPESKINISLIGEDGVVLTNDDFANKKNVFGENMGTSGSSSCYI YAPDFYAENITFENSAGPVGQAVACFVSADRAFFKNCRFLGYQDTLYTYGKHSRQYYEDC YIEGTVDFIFGWSVAVFNRCHIHSKRDGYVTAPSTDQGKKYGYVFYDCRLTADPDVAKVY LSRPWRPYAQAVFIRCELGKHILPEGWHNWGKKEAEKTVFYAEYDSHGEGANPKARAAFS RQLKNLKGYEMETVLAGEDGWNPLKNDSVK >gi|226332217|gb|ACIC01000103.1| GENE 15 29514 - 30482 749 322 aa, chain - ## HITS:1 COG:CAC3373 KEGG:ns NR:ns ## COG: CAC3373 COG4677 # Protein_GI_number: 15896615 # Func_class: G Carbohydrate transport and metabolism # Function: Pectin methylesterase # Organism: Clostridium acetobutylicum # 32 318 1 318 321 250 43.0 2e-66 MKRSILKGMALFLLLCSGGTLACAQQQKQDTIVVSRDGTGKYRDIQEAVEAVRAFMDYTV TIFIKNGIYKEKLVIPSWVKNVQLVGEDAEKTIITYDDHANINKMGTFRTYTVKVEGSDI TFKNLTIENNAAPLGQAVALHTEGDRLMFVGCRFLGNQDTIYTGSEGSRLLFTNCYIEGT TDFIFGPSTALFEYCELHSKRDSYITAASTPKEVEFGYVFKNCKLTAAPGIKKVYLGRPW RPYAATAFINCEFGGHIRSEGWHNWKNPENEKTARYAEFKNTGEGADASGRVKWAKQLTD 
KEAVQYTPQNIFKECSNWYPYK >gi|226332217|gb|ACIC01000103.1| GENE 16 30492 - 31646 1025 384 aa, chain - ## HITS:1 COG:CAC0359 KEGG:ns NR:ns ## COG: CAC0359 COG4225 # Protein_GI_number: 15893650 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Clostridium acetobutylicum # 44 382 23 361 361 336 47.0 5e-92 MSSTISLSAQQVDEKLPWSVRMTESEMIRCPESWQLDFQPRLKWDYCHGLELGAMLDVYD TYGDKKIRDYAIAYADTMVHEDGSITAYKLTDYSLDRINSGKILFRIYEQTKDEKYKKAL DLLYSQFAGQPRNEDGGFWHKKIYPHQMWLDGLYMGAPFYAEYAFRNNRPQDYADVINQF ITCARHTYDPKNGLYRHACDVSRTERWADPVTGQSKHCWGRALGWYAMALVDVLDFIPKH EAGRDSLLAILDNVAVQVKKLQDPETGGWYQVMDRSGDKGNYLESSCSAMFIYSLFKAVR MGYIDPSYLDVAKAGYEGFLKNFIEVDKDGVVTITKACAVAGLGGKVYRSGDYDYYINET IRNNDPKAVGPFIMASLEYERLQK >gi|226332217|gb|ACIC01000103.1| GENE 17 31993 - 33390 1536 465 aa, chain + ## HITS:1 COG:Z4877 KEGG:ns NR:ns ## COG: Z4877 COG3775 # Protein_GI_number: 15804015 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIC component # Organism: Escherichia coli O157:H7 EDL933 # 4 393 21 402 462 281 38.0 1e-75 MEEVFKYIIGLGAAVMMPIIFTILGVCIGIKLPKALKSGLLVGVGFVGLSVVTALLTSSL GPALSKMVEIYGLELGIFDMGWPSAAAVAYNTSVGAFIIPVCLGVNLLMLLTKTTRTVNI DLWNYWHFAFIGAIVYFASDSIFWGFFAAIICYIITLVMADMTAPAFQKFYDKMDGISIP QPFCQSFVPFAIVINKLLDKIPGFDKLNIDSEGMKKKFGLMGEPLFLGIVIGCGIGALGC ASWKEVLDGIPGILGLGIKMGAVMELIPRITSLFIEGLKPISDATRELIAKKYKNNTGLS IGMSPALVIGHPTTLVVSLLLIPVTIFLAVILPGNRFLPLASLAGMFYLFPMILPITKGN VVKSFIIGLVALIVGLYFVTELAGFFTMAAKDVYAATGDPTVNIPAGFEGGALDFASSPF CWGIFHLTYSVKIIGPAILVALALGMAVYNRIRMKRNDAKNAANV >gi|226332217|gb|ACIC01000103.1| GENE 18 33423 - 33891 516 156 aa, chain + ## HITS:1 COG:BH2166 KEGG:ns NR:ns ## COG: BH2166 COG3717 # Protein_GI_number: 15614729 # Func_class: G Carbohydrate transport and metabolism # Function: 5-keto 4-deoxyuronate isomerase # Organism: Bacillus halodurans # 28 156 6 134 276 146 51.0 
2e-35 MKKLAIAMMMGIAAMSASAQVNYKMQVACNPQDVKTYDTNRLRGSFLMEKVMVPDQINVT YSMYDRLIFGGTVPATKELVLETIDPLKAKYFLERRELGVINIGGEGIVTVDGKEYTLNF KDALYVGRGKQKVTFKSKDASKPAKFYINSATAHKE Prediction of potential genes in microbial genomes Time: Thu May 12 01:52:45 2011 Seq name: gi|226332216|gb|ACIC01000104.1| Bacteroides sp. 1_1_6 cont1.104, whole genome shotgun sequence Length of sequence - 13726 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 6, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 441 449 ## COG3717 5-keto 4-deoxyuronate isomerase + Term 476 - 538 11.1 + Prom 549 - 608 5.4 2 2 Tu 1 . + CDS 701 - 2218 1425 ## COG0477 Permeases of the major facilitator superfamily + Term 2246 - 2290 7.0 - Term 2434 - 2463 1.4 3 3 Tu 1 . - CDS 2589 - 5963 2808 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits - Prom 5983 - 6042 3.5 4 4 Op 1 . - CDS 6078 - 6908 873 ## COG0805 Sec-independent protein secretion pathway component TatC 5 4 Op 2 . - CDS 6915 - 7136 351 ## BT_4102 putative sec-independent protein translocase - Prom 7207 - 7266 9.1 6 5 Op 1 . - CDS 7345 - 9894 2321 ## COG0787 Alanine racemase 7 5 Op 2 . - CDS 9870 - 10859 400 ## BT_4100 hypothetical protein - Prom 10918 - 10977 6.6 8 6 Op 1 . + CDS 11525 - 13468 1725 ## COG1154 Deoxyxylulose-5-phosphate synthase 9 6 Op 2 . 
+ CDS 13471 - 13726 284 ## COG0569 K+ transport systems, NAD-binding component Predicted protein(s) >gi|226332216|gb|ACIC01000104.1| GENE 1 1 - 441 449 146 aa, chain + ## HITS:1 COG:YPO1725 KEGG:ns NR:ns ## COG: YPO1725 COG3717 # Protein_GI_number: 16121985 # Func_class: G Carbohydrate transport and metabolism # Function: 5-keto 4-deoxyuronate isomerase # Organism: Yersinia pestis # 19 146 153 278 278 159 56.0 2e-39 LITIDGRKGSLKANSFAAGKMEESNDRVINQLIVNNVLEEGPCQLQMGLTELKPGSVWNT MPAHTHSRRVEAYFYFNVPEGNSICHFMGEPQEERIVWMQNEQAIMSPEWSIHAAAGTSN YMFIWGMAGENLDYGDMDKIKYIEMR >gi|226332216|gb|ACIC01000104.1| GENE 2 701 - 2218 1425 505 aa, chain + ## HITS:1 COG:CC1508 KEGG:ns NR:ns ## COG: CC1508 COG0477 # Protein_GI_number: 16125755 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Caulobacter vibrioides # 9 430 18 395 431 250 36.0 5e-66 MNAFQKTGEKMTNYRWTICAMLFFATTVNYLDRQVLSLTWDEFIKPEFHWDESHYGTITS VFSIVYAICMLFAGRFVDWMGTKKGFLWAIGVWSAGACLHAVCGIVTEAQVGLHSAAELA GATGDVVVTIATVSMYCFLAARCILALGEAGNFPAAIKVTAEYFPKKDRAYATSIFNAGA SIGALIAPLTIPILAKAFGWEMAFIVIGGLGFIWMGFWVFMYDAPSKSKHVNKAELEYIE QDQNEAGAGPKTEEKDEKKMRFWQCFSYKQTWAFVFGKFTTDGVWWFFLFWTPSYLNSQF GIKTSDPLGMGLIFTLYAITMLSIYGGKLPTIFINKTGMNPYAARMKAMLIFAFFPLVVL LAQPLGTFSPWFPVILIGIGGAAHQSWSANIFSTVGDMFPRTAIASITGIGGMAGGIGSM ILQKVAGNLFVYASGTTMVDGKEVEMTKELLEQGAQFVHPAMTFMGFEGKPAGYFVIFCV CAVAYLIGWVIMKALVPKYKPIVLD >gi|226332216|gb|ACIC01000104.1| GENE 3 2589 - 5963 2808 1124 aa, chain - ## HITS:1 COG:sll1582 KEGG:ns NR:ns ## COG: sll1582 COG1112 # Protein_GI_number: 16329815 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Synechocystis # 194 1118 203 1096 1118 301 28.0 4e-81 MENEGRESYEILLAVCKADHLQLTIGYKQMRDLLERLCRLHMHNGSLQMTDLSARISFVA 
AKVGLSVAEQNRLHTFRLTSNAILNRQQEPTREHLLRDAKTLAFFIRKLFEEDIPQELYR LLPRTDATYIVAPPAHKQVQRMRVCFQYSDEQYLYVTPLDEIADEPLRVRYNIPQINEEF AETCQLLWRHAQLNLLDVAVDEAGILTPSFIVLEPDYLLDISSLAECYRDYGHHPANYFL SRLQPIENARPLLLGNIANLFLDEWIHAEGEVDYLRCMQKAFRRYPIELAACADLRDREK ERQFFDDCKLHFDHIRETVNDTFHAAGYELDKTDAVLEPSYICEALGLQGRLDYMQRDMS SFIEMKSGKADEYAIRGKVEPKENNKVQMLLYQAVLQYSMGMDHRKVKAYLLYTRYPLLY PSRPSWAMVRRVIDLRNRIVADEYGIQLRNSLEYTAQKLEEIKASVLNERGLSGRFWETY LRPSIDNFQEKLKSLSTLEKSYFYALYNFITKELYTSKSGDVDYEGRTGAASLWLSTLAE KCESGEIIYDLRIKENHAADEHKSHLLLVPSGELQRTVADDAQHTLPNFRQGDAIVLYER NADTDNVTNKMVFKGNIDYLNENEICIRLRATQQNSSVLPSDSLYAIEHDAMDTTFRSMY QGLYAFMSATKERRDLLLSQREPQFDVALDKQIAEAADDFTRIALKAKAAKDYFLLVGPP GTGKTSCALKKMVETFYKEEGAQILLLSYTNRAVDEICKALSSIRPEVDFIRVGSELSCD ENYRDHLIENELATCMRRTEVCERINRCRILVGTVASISGKPELFRLKHFDVAIVDEATQ ILEPQLLGILCAKGENGKEGVGKFILIGDHKQLPAVVLQNTEQSEVYDEALSAVGLKNLK DSLFERLYRTARQHTDAHRTYDMLCRQGRMHPEVALFANQAFYEGRLLPVGLPHQLEDSG NINRLSFYPSQPEPMGSSAKTNHSEAKIAARLAATVYKEHSEAFDASRTLGIITPYRSQI ALIKKEIEVLGISELNQILIDTVERFQGSERDVIIYSFCVNYPYQLKFLSNLTEEDGILI DRKLNVALTRARKQMLLTGVPTLLERNPLYKSLLKLIESLRNIA >gi|226332216|gb|ACIC01000104.1| GENE 4 6078 - 6908 873 276 aa, chain - ## HITS:1 COG:DR0806 KEGG:ns NR:ns ## COG: DR0806 COG0805 # Protein_GI_number: 15805832 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Sec-independent protein secretion pathway component TatC # Organism: Deinococcus radiodurans # 1 267 19 264 270 131 33.0 1e-30 MAEMTFWDHLDELRRVLFRVIGVWFVLAIGYFVAMPYLFDHVILAPCHNDFVFYDLLRFI GQKFDLTDEFFTQEFKVKLVNINLAAPFFIHMSTAFWMSVVTAMPYIFFEVWRFINPALY PNERKGVRKALTIGTGMFFIGVLMGYFMVYPLTLRFLSTYQLSSEVENILSLNSYIDNFM MLILCMGLAFELPLVTWLLSLLGVVNKSFLRKYRRHAVVIIVIAAAIITPTGDPFTLSVV AIPLYLLYEMSILMIKDKKEVPELEEEEEEEGVAKV >gi|226332216|gb|ACIC01000104.1| GENE 5 6915 - 7136 351 73 aa, chain - ## HITS:1 COG:no KEGG:BT_4102 NR:ns ## KEGG: BT_4102 # Name: not_defined # Def: putative sec-independent protein translocase # Organism: B.thetaiotaomicron # Pathway: Protein export 
[PATH:bth03060]; Bacterial secretion system [PATH:bth03070] # 1 73 1 73 73 108 100.0 7e-23 MTNLLLLGFLPSGSEWIIIALVILLLFGGKKIPELMRGLGKGVKSFKDGVNEAKEEINKA KDEIDAPADSAKK >gi|226332216|gb|ACIC01000104.1| GENE 6 7345 - 9894 2321 849 aa, chain - ## HITS:1 COG:CAC0492 KEGG:ns NR:ns ## COG: CAC0492 COG0787 # Protein_GI_number: 15893783 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Clostridium acetobutylicum # 491 849 11 376 386 209 37.0 2e-53 MSYTIESIAEYIGARRVGEHEATIDWLLTDSRSLSFPEETLFFALPTKRNNGARYISELY DRGVRNFVVTEEDFKRMENGEWKMENAMQQDGAQPTPNSRLSILDFSNFLIVANPLKALQ KLAEAHRENFKIPVIGITGSNGKTIVKEWLHQLLSPDRCIVRSPRSYNSQIGVPLSVWQL NEEAELGIFEAGISEMGEMGALKRMIKPTIGILTNIGGAHQENFFSLQEKCMEKLTLFKD CDVVIYNGDNELISNCVAKSMLTAREIAWSCKDIERPLYISKVTKKEDHTVIAYRYLDMD NIFCIPFIDDASIENALNCLAACLYLMTPADQITERMARLEPIAMRLEVKEGKNNCILIN DSYNSDLASLDIALDFLVRRSEKKGLKRTLILSDILETGQSTATLYRRVAQLIKSRGINK LIGVGAEISSCAARFEGTPERYFFPDTDALLRSGIFKTLHSEVILIKGSRVFNFDLVSEE LELKVHETILEVNLGAMVANLNHYRSMLRHPETKMICMVKAAAYGAGSYEIAKTLQEHHV DYLAVAVADEGSELRKAGITSSIIIMDPELTSFKTMFDYKLEPEVYNFHLLDALIKAAEK EGITNFPIHVKLDTGMHRLGFSVDEIPLLIRRLKSQNAVIPRSVFSHFVGSDSAQFDFFT RQQIELFEKGSQELQEAFSHKILRHICNTAGIERFPGAQFDMVRLGIGLYGVSPIDNSII NNVSTLKTTILQIRDVPGEDTVGYSRMGHLTRPSRIAAIPIGYADGLNRHLGRGNAYCLV NGKKAPYVGNICMDVCMIDVTDIDCREGDKAVIFGDDLPITTLSDKLGTIPYEVLTSISN RVKRVYYQD >gi|226332216|gb|ACIC01000104.1| GENE 7 9870 - 10859 400 329 aa, chain - ## HITS:1 COG:no KEGG:BT_4100 NR:ns ## KEGG: BT_4100 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 329 1 329 329 662 100.0 0 MDFQTKVELPAGLPPVSHAERILLMGSCFAENMGRLLAENKFRVDMNPFGILYNPLSVST ALVEILKGKVYQEKDLFLYKECWHSPMHHGLFSASSPEEVLEKINTRLSQAHRSVHELDW LMLTFGTAYVYEQKETRQVVSNCHKLPESCFNRRILSVDEIVNEYTSLITSMVARNSHLK VLFTVSPIRHIRDGMHANQLSKSTLLLAIDRLQQLFPDHVFYFPSYEIVLDELRDYRYYA DDMLHPSPLAVRYLWERFSEAFFSAETKQVITAIEDITKDLSHKPFHPESEAYQRFLGQI VLKIERLNGKYPYLDFQKETELCHIRLNP >gi|226332216|gb|ACIC01000104.1| GENE 8 11525 - 13468 
1725 647 aa, chain + ## HITS:1 COG:HI1439 KEGG:ns NR:ns ## COG: HI1439 COG1154 # Protein_GI_number: 16273346 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Haemophilus influenzae # 7 630 8 616 625 548 44.0 1e-155 MKNEPIYNLLNSINSPDDLRRLEVDQLPEVCDELRQDIIKELSCNPGHFAASLGTVELTV ALHYVYNTPYDRIVWDVGHQAYGHKILTGRREAFSTNRKLGGIRPFPSPEESEYDTFTCG HASNSISAALGMAVAAAKKGDDQRHVIAIIGDGSMSGGLAFEGLNNSSTTPNNLLIILND NDMAIDRSVGGMKQYLFNLTTSNRYNQLRFKASRLLFKLGILNDERRKALIRFGNSLKSM AAQQQNIFEGMNIRYFGPIDGHDIKNLSRVLRDIKDLKGPKILHLHTIKGKGFAPAEKHA TEWHAPGKFDPVTGERFVANTEGMPPLFQDVFGNTLVELAEANPRIVGVTPAMPSGCSMN ILMSKMPKRAFDVGIAEGHAVTFSGGMAKDGLQPFCNIYSSFMQRAYDNIIHDVAIQNLP VVLCLDRAGLVGEDGPTHHGAFDMAYLRPIPNLTIASPMNEHELRRLMYTAQLPDKGPFV LRYPRGRGVLVDWKCPLEEIPVGKGRKLKDGKDIAVISIGPIGNKARSAIARAESESGRS IAHYDLRFLKPLDEELLHEVGRTFRHIVTIEDGTIQGGMGSAVLEFMADHEYTPTVKRIG IPDKFVQHGTVAELYQLCGMDEDSLTKELLKQCELLPDMSKIKELTN >gi|226332216|gb|ACIC01000104.1| GENE 9 13471 - 13726 284 85 aa, chain + ## HITS:1 COG:HI0625 KEGG:ns NR:ns ## COG: HI0625 COG0569 # Protein_GI_number: 16272568 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Haemophilus influenzae # 1 85 1 85 458 73 47.0 1e-13 MKIIIAGAGNVGTHLAKLLSREKQDIILMDDDEEKLSALSANFDLLTVTASPSSISGLKE VGVKEADLFIAVTPDESRNMTACML Prediction of potential genes in microbial genomes Time: Thu May 12 01:53:14 2011 Seq name: gi|226332215|gb|ACIC01000105.1| Bacteroides sp. 1_1_6 cont1.105, whole genome shotgun sequence Length of sequence - 92724 bp Number of predicted genes - 57, with homology - 56 Number of transcription units - 19, operones - 12 average op.length - 4.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 17/0.000 + CDS 2 - 1093 1120 ## COG0569 K+ transport systems, NAD-binding component + Prom 1137 - 1196 4.8 2 1 Op 2 . 
+ CDS 1263 - 2699 1161 ## COG0168 Trk-type K+ transport systems, membrane components - Term 2703 - 2771 3.1 3 2 Op 1 . - CDS 2784 - 4166 1103 ## BT_4096 hypothetical protein - Prom 4190 - 4249 6.1 4 2 Op 2 . - CDS 4260 - 5798 822 ## COG3507 Beta-xylosidase 5 2 Op 3 . - CDS 5805 - 6938 920 ## COG2152 Predicted glycosylase 6 2 Op 4 . - CDS 6965 - 9118 1453 ## COG3537 Putative alpha-1,2-mannosidase - Prom 9166 - 9225 2.7 - Term 9143 - 9187 8.5 7 3 Op 1 . - CDS 9237 - 11510 1621 ## COG3537 Putative alpha-1,2-mannosidase 8 3 Op 2 . - CDS 11548 - 13536 1346 ## BT_4091 sialic acid-specific 9-O-acetylesterase - Prom 13615 - 13674 7.1 - Term 13613 - 13671 10.8 9 4 Op 1 . - CDS 13689 - 16769 2008 ## BT_4090 hypothetical protein 10 4 Op 2 . - CDS 16774 - 18570 1071 ## BT_4089 hypothetical protein 11 4 Op 3 . - CDS 18577 - 21711 1826 ## BT_4088 hypothetical protein 12 4 Op 4 . - CDS 21736 - 22788 433 ## BT_4087 hypothetical protein 13 4 Op 5 . - CDS 22806 - 23258 294 ## BT_4086 hypothetical protein 14 4 Op 6 . - CDS 23288 - 25144 1069 ## BT_4085 hypothetical protein - Prom 25192 - 25251 8.7 - Term 25277 - 25325 11.0 15 5 Op 1 . - CDS 25362 - 26693 613 ## BT_4084 hypothetical protein 16 5 Op 2 . - CDS 26720 - 27748 538 ## BT_4083 hypothetical protein 17 5 Op 3 . - CDS 27768 - 27941 199 ## BT_4082 hypothetical protein 18 5 Op 4 . - CDS 27988 - 29820 972 ## BT_4082 hypothetical protein 19 5 Op 5 . - CDS 29852 - 32953 1623 ## BT_4081 hypothetical protein 20 5 Op 6 . - CDS 32985 - 34355 675 ## BT_4080 hypothetical protein - Prom 34393 - 34452 4.7 21 6 Op 1 . - CDS 34540 - 36300 927 ## BT_4079 hypothetical protein 22 6 Op 2 . - CDS 36328 - 36777 257 ## BT_4078 hypothetical protein 23 6 Op 3 . - CDS 36799 - 37020 195 ## BT_4078 hypothetical protein - Prom 37216 - 37275 11.2 24 7 Op 1 . - CDS 37352 - 38218 433 ## COG0627 Predicted esterase 25 7 Op 2 . - CDS 38228 - 40906 1213 ## BT_4076 alpha-rhamnosidase 26 7 Op 3 . 
- CDS 40977 - 42173 865 ## BT_4075 hypothetical protein - Prom 42360 - 42419 9.7 + Prom 42502 - 42561 5.2 27 8 Tu 1 . + CDS 42628 - 42825 62 ## - Term 43059 - 43110 11.1 28 9 Op 1 . - CDS 43146 - 46229 2253 ## COG3250 Beta-galactosidase/beta-glucuronidase 29 9 Op 2 . - CDS 46243 - 48591 1792 ## COG3537 Putative alpha-1,2-mannosidase 30 9 Op 3 . - CDS 48626 - 51175 1411 ## COG0383 Alpha-mannosidase 31 9 Op 4 . - CDS 51183 - 51413 310 ## BT_4071 hypothetical protein 32 9 Op 5 . - CDS 51489 - 52013 302 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 52132 - 52191 4.9 - Term 52070 - 52105 3.4 33 10 Tu 1 . - CDS 52204 - 53808 841 ## BT_4069 putative regulatory protein - Prom 53836 - 53895 6.5 + Prom 53776 - 53835 6.3 34 11 Tu 1 . + CDS 54068 - 54445 123 ## BT_4068 hypothetical protein + Term 54482 - 54529 7.2 + Prom 54480 - 54539 3.5 35 12 Op 1 30/0.000 + CDS 54559 - 54909 211 ## PROTEIN SUPPORTED gi|154175415|ref|YP_001407462.1| NADH dehydrogenase subunit A 36 12 Op 2 9/0.000 + CDS 54900 - 55493 436 ## PROTEIN SUPPORTED gi|154175216|ref|YP_001407461.1| NADH dehydrogenase subunit B 37 12 Op 3 8/0.000 + CDS 55514 - 57106 1657 ## COG0649 NADH:ubiquinone oxidoreductase 49 kD subunit 7 + Term 57157 - 57204 -0.6 + Prom 57148 - 57207 5.2 38 12 Op 4 31/0.000 + CDS 57232 - 58308 1091 ## COG1005 NADH:ubiquinone oxidoreductase subunit 1 (chain H) 39 12 Op 5 28/0.000 + CDS 58319 - 58807 535 ## COG1143 Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) 40 12 Op 6 30/0.000 + CDS 58810 - 59322 488 ## COG0839 NADH:ubiquinone oxidoreductase subunit 6 (chain J) 41 12 Op 7 26/0.000 + CDS 59418 - 59726 367 ## COG0713 NADH:ubiquinone oxidoreductase subunit 11 or 4L (chain K) + Prom 59735 - 59794 2.4 42 12 Op 8 30/0.000 + CDS 59814 - 61745 1326 ## COG1009 NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit 43 12 Op 9 22/0.000 + CDS 61828 - 63312 1429 ## COG1008 
NADH:ubiquinone oxidoreductase subunit 4 (chain M) 44 12 Op 10 . + CDS 63350 - 64792 1281 ## COG1007 NADH:ubiquinone oxidoreductase subunit 2 (chain N) + Term 64806 - 64845 9.1 45 13 Tu 1 . - CDS 64842 - 67637 2383 ## COG0642 Signal transduction histidine kinase - Prom 67691 - 67750 5.0 46 14 Op 1 . - CDS 67907 - 69118 1225 ## BT_4056 hypothetical protein 47 14 Op 2 5/0.000 - CDS 69162 - 71876 2699 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 71903 - 71962 1.8 - Term 72128 - 72202 19.8 48 15 Op 1 2/0.000 - CDS 72421 - 73317 761 ## COG2207 AraC-type DNA-binding domain-containing proteins 49 15 Op 2 . - CDS 73378 - 74709 1250 ## COG0668 Small-conductance mechanosensitive channel 50 15 Op 3 . - CDS 74760 - 75761 780 ## BT_4052 putative ABC transporter ATP-binding protein 51 15 Op 4 . - CDS 75775 - 76650 922 ## COG3950 Predicted ATP-binding protein involved in virulence - Prom 76678 - 76737 2.7 + Prom 76538 - 76597 3.2 52 16 Tu 1 . + CDS 76757 - 77674 789 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 77710 - 77780 9.2 53 17 Op 1 . - CDS 77796 - 81824 3723 ## COG3250 Beta-galactosidase/beta-glucuronidase 54 17 Op 2 . - CDS 81920 - 83014 1127 ## COG1703 Putative periplasmic protein kinase ArgK and related GTPases of G3E family 55 17 Op 3 . - CDS 83026 - 84093 963 ## BT_4048 hypothetical protein - Prom 84113 - 84172 6.5 - Term 84217 - 84268 21.0 56 18 Tu 1 . - CDS 84329 - 90097 4696 ## COG2373 Large extracellular alpha-helical protein - Prom 90246 - 90305 6.9 57 19 Tu 1 . 
+ CDS 90965 - 92521 1020 ## BT_4046 hypothetical protein Predicted protein(s) >gi|226332215|gb|ACIC01000105.1| GENE 1 2 - 1093 1120 363 aa, chain + ## HITS:1 COG:PA0016 KEGG:ns NR:ns ## COG: PA0016 COG0569 # Protein_GI_number: 15595214 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Pseudomonas aeruginosa # 12 362 96 450 457 162 30.0 1e-39 MLATNLGAKKTVARIDNYEYLLPKNKEFFRKLGVDSLIYPEMLAAKEIVSSMRMSWVRQW WEFCGGALILIGTKMREKAEILNIPLHQLGAPDIPYHVVAIKRGTETIIPRGDDVIKLHD IVYFTTTRKYIPYIRKIAGKEDYADVRNVMIMGGSRIAVRTAQYVPDYMQVKIVDNDINR CNRLTELLDDKTMIINGDGRDMDLLIEEGLKNTEAFVALTDNSETNILACLAAKRMGVEK TVAEVENIDYIGMAESLDIGTVINKKMIAASHIYQMMLDADVSNVKCLTFANADVAEFTV PAGAKITKNLIKDLGLPKGTTIGGMIRNGEGVLVTGDTLIRPGDHVVVFCLSMMIKKIEK FFN >gi|226332215|gb|ACIC01000105.1| GENE 2 1263 - 2699 1161 478 aa, chain + ## HITS:1 COG:MA1483 KEGG:ns NR:ns ## COG: MA1483 COG0168 # Protein_GI_number: 20090342 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Methanosarcina acetivorans str.C2A # 22 477 26 476 476 304 39.0 2e-82 MIYRIMGFLLLIETAMLMCCGGVSLFYKEDDLNSFLLSSAITAFVGVIMLAIGRGAEKSL NRRDGYVIVSVAWIAFSFFGMLPYYIGGYIPSVTDAFFETMSGFSSTGATIMNNIESMPH GILFWRAMTQWIGGLGIVFFTIAVLPIFGMGGIQVFAAEASGPTHDKVHPRIGVTAKWIW GIYAGMTGTLIILLVFGGMGLFDSICHAFTTTSTGGFSTKQTSIEYYHSPYIDYVISVFM FLSGVNFTLLLLMFNGKIKKFIHDAELKFYFMSVAFFTVFIAVWLYQTSSMGAEEAFRKS LFQVISLQTSTGFATADYMLWPSILWGCLLIVMIMGACAGSTTGGIKCIRMVILFKVAKN EFKHILHPNAVLPVRVNKQVISPSIQSTVLAFTFLYAIIAIISILVMMGFGVGFLESIGT VISSIGNMGPGLGTCGPAFSWSELPDAAKWLLSFLMLLGRLELFTVLLLFSSDFWKRN >gi|226332215|gb|ACIC01000105.1| GENE 3 2784 - 4166 1103 460 aa, chain - ## HITS:1 COG:no KEGG:BT_4096 NR:ns ## KEGG: BT_4096 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 460 1 460 460 932 100.0 0 MKKFLVLVAAVCMAYTTAFAQTVKPFKEGDRAVFLGNSITDGGRYHSFIWLYYMTRFPNM 
PIRVFNGGIGGDTAYDMNKRLDGDIFSKNPTVLMVTFGMNDSGYYEYNGDNAKEFGEQKY QESIKNFQQMEKRFKELPHTRIVMTGTSPYDETAQIKDNTVFKKKNETIKRIIEYQRESA ARNGWEFTDWNAPMVAINQELQQKDPSFTLCGNDRIHPDNDGHMVMAYLFLKAQGFAGKD VANMEINANKKQAVKAEGCTISNIKKIGKDISFDYLAEALPYPLDTIARGWGSKKSQAEV IKEVPFMEEMNTELLKVTGLKGQYKLLIDDQEIGTWDAADLAKGINLAAESKTPQYQQAL TIMHLNEYRWELERTFREYAWCQFGFFQQKGLLFANDRKAIEVMDENVEKNMWLKGRRDL YSKMMFKEIRDAREQEMDVLISKIYEINKPVVRKIVLRKI >gi|226332215|gb|ACIC01000105.1| GENE 4 4260 - 5798 822 512 aa, chain - ## HITS:1 COG:CAC3452 KEGG:ns NR:ns ## COG: CAC3452 COG3507 # Protein_GI_number: 15896693 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Clostridium acetobutylicum # 27 511 4 533 533 120 27.0 5e-27 MKKIGLFLFGICFALQMVAQEQNYFTNPVIRGDVPDPSVIRIEDTYYATGTSSEWAPFYP MFTSKDLVNWKQVGHVFTKQPSWTSNSFWAPELFYHNNKVYCYYTARQKSTGISYIGVAT SDSPLHEFTDHGPIVEYGKEAIDAFIYDDNGQLYISWKAYGLDTRPIELLGCKLSADGLH LDGEPFTLLVDEKGIGMEGQYHFKEGDYYYIVYAAHGCCGPSSDYDVYVARARNYGGPYE KYFGNPILHGGEGDYKSCGHGTVVRTSDGRMFYMCHAYLKGDGFFIGRQPILQEMEMTDD HWVRFKTGNLAIAEQPIPFVGTKQEPLSDFEDNFKGNQLKVDWTWNYPYSDIHAVLKKGK LFLSGTPKNNNKYGTALCLRPQSPQYSCETKVMNTGKGLKGLTLYGDDKNLIAWGIEGDK LILKVVKDDIESVLYDSAFASKEIYLKLEVEQGCIFHFYKSLDGKTWQSVQNTPFKGKSL IRWDRVQRPGLLHYGDKDVPAEFSYFKMKNLK >gi|226332215|gb|ACIC01000105.1| GENE 5 5805 - 6938 920 377 aa, chain - ## HITS:1 COG:PH1107 KEGG:ns NR:ns ## COG: PH1107 COG2152 # Protein_GI_number: 14590938 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Pyrococcus horikoshii # 66 373 16 287 299 152 35.0 9e-37 MKRKFQHIAYLLMVAAVITSCGEKKQTSEFPDWAWADFQRPEGINPIVSPDTTTVFYCPM RQDSVAWESSDTFNPAATIYDGKVVVLYRAEDNSAVGIGSRTSRLGYAYSDDGLHFNRMT VPVFYPADDNQKELEWPGGCEDPRVAVTDDGLYVMLYTQWNRKQARLAVATSRDLQIWEK YGPAFAKAYGGRFFDEFSKSASIVTKLVDGKQVIAKIDGKYWMYWGEKFVNVATSTDLIN WEPMLDKKGDFLKVITPREGKFDSDLTECGPPAIMTDKGILLLYNGKNKSGAEGDTLYTA NSYCAGQALFDAKDPTKLIDRLDKPFYIPESDFEKSGQYPAGTVFIEGLVFHNQKWYLYY GCADSRVAVAVYDPFKK >gi|226332215|gb|ACIC01000105.1| GENE 6 6965 - 9118 
1453 717 aa, chain - ## HITS:1 COG:Rv0584 KEGG:ns NR:ns ## COG: Rv0584 COG3537 # Protein_GI_number: 15607724 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Mycobacterium tuberculosis H37Rv # 29 716 42 768 877 399 35.0 1e-111 MKHKFLFILLFSLVLEGMVTTQAVAGDYVHQVNTLIGTKGTGLTSGYLYPGATYPYGMVQ FTPSYFSKRSGFVINQLSGGGCEHMGNFPTFPVKGKLKMSPDNILNYRINVSEEKGHAGY YEAMVQEDIKAKLTVTERTGMASYEYPADQQYGTIIIGGGISATPIEQAAIVITAPNKCE GYAEGGNFCGLRTPYKVYFVAEFDTDAFETGTWKREELMPNTTFAEGEYSGVYFTFDVNK KKNIQYKIGVSYVSVENARENLKAENAEWDFQKIQNQAEAKWNHYLGMIEVEGTNPDRTT QFYTHLYRSFIHPNVCSDVNGEYMGADFRVHKSRSKHYTSFSNWDTYRTQIQLLSMLDPE VASDIVISHQLFAEQSGGSFPRWVMANIETGVMQGDPTPILIANAYAFGARNYDPKPIFK IMRKGAEEPGSKSQDVETRPGLKQYLDKGYYNASIQLEYTSADFAIGQFALHAVGDEFAS WRYFHFARSWKNLYNSDTGWLQSRNPDGSWKSLGEDFRESTYKNYFWMVPYDIAGLVEII GGKEKAEKRLDEFFTRLDAGYNDAWFASGNEPSFHIPWIYNWIGRPYKTQEIINRVLNEQ YSSKIDGLPGNDDLGTMGAWYVFACIGLYPEIPGVGGFTVNTPIFSSVKVHLKKGDIVIK GGSEKDIYIKSMKLNGKSYESTWIDWDQLNSGATIEYRTSGKPDMKWGAKVVPPSFP >gi|226332215|gb|ACIC01000105.1| GENE 7 9237 - 11510 1621 757 aa, chain - ## HITS:1 COG:L135972 KEGG:ns NR:ns ## COG: L135972 COG3537 # Protein_GI_number: 15673483 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Lactococcus lactis # 25 756 3 716 717 390 32.0 1e-108 MKKLVYLIIFFLYSTVMNAQLKDLVQYVNTLQGTDSNFGLSYGNTYPTTGMPYGMHMWSA QTGKNGDGFKYMYAVDKIRAFSQSHQCSPWVSDYAVYSFMPMVGELVVNQDARATKFSHD NEIAKPHYYKVTFDNGITTEMAPTTRGVHLRFSYPTTGDAYLVLDGYTDMSEIKIDPAKR QISGWVNNQRFVNDSKSFRNYFVVQFDKPFEDYGVWENQKDEVFPQKLDGAGKGYGAFIK FKKGSKVQAKAASSYISAEQALITLNKELGKDKNLEVTKARGQKTWNEVLNRIVVEGCTD EQMKTFYSCLFRANLFSRKFYERKENGEPYYYSPYNGKIYDGYMYTDNGFWDTFRSQFPL TNILHPTMQGRYMNALLAAQEQCGWLPSWSSPGETGGMLGNHSISLLADAWAKGIRDFDP EKALKAYAHEAMNKGPWGGANGRGFWKEYFQLGYVPYPESMGSSAQTMEYAYDDFCGYQL AKMVGNKHYQEVFARQMYNYKNVFDKSIGFMRGKGVDGKWQEPFDPLEWGGPFCEGNAWH YTWSVFHDVQGLIDLFGSDEKFTTKIDSVFTIPNIIKPGTYGGVIHEMKEMELANMGQYA 
HGNQPIQHMPYLYCYAGQPWKTQYWVRQIVERLYNSTEKGYPGDEDQGGMSSWYILSSLG IYAVCPGTDEYVIGSPLFKKATITMENGNKFVIEAPENSKENLYIQSATLNGRLLDKNYI HYDDIAEGGVLKFEMGSQPNKERCTSKYAAPFSLSKE >gi|226332215|gb|ACIC01000105.1| GENE 8 11548 - 13536 1346 662 aa, chain - ## HITS:1 COG:no KEGG:BT_4091 NR:ns ## KEGG: BT_4091 # Name: not_defined # Def: sialic acid-specific 9-O-acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 662 1 662 662 1341 100.0 0 MNKNFVISLFFLLGSATLSAQVKLLPIFSDNMVLQQQTQAPIWGESKPKKKVEITTSWDQ KKYTIQADAQGKWSTKVATPVAGGPYNITISDGKKVKLSNVMIGEVWICSGQSNMEMQVE GWGKVKNYEQEKEEANNYPNIRFLLVENAMSPTPVENITVKENGWQVCTSKSVADFSAAG YFFGRDLNKYRNVPIGLIDTSWGGTIIETWTSNEALSTIPSMKKRLEALVGLPASQEGRK KKFEEDVETWKAEVERIDKGCVNGEAIWAAPDFNDAAWKSMKVPGLMQEQDLPGFSGLVW FRKTIDIPAGWAGKDLILNLGVIDDNDFTYFNGIQIGHTEGWMAPRSYKIPKELVKKGKA VIAVRVMDTGGTGGINGSPESISLHLSDTEAIQLAGNWKYQVSLDMREVAPMPVDMSWNP NSPTFLFNAMLNPLIPYAIKGAIWYQGESNAGEAFQYRDLMPLMITDWRNRWGYDFPFYM VQLASFTAKQTAPVESTWAELREAQTRTLHLQNTGMAVAIDIGEEFDIHPKNKQEVGRRL ALAARAQTYGEKIPYSGPIYKSYKIEGNKIRIFFDHIDGGLKTANGEMPKGFTIAGVDHK FHWADAVIEGNTVVVSSSEVTLPVAVRYAWADYPICNMYNGADLPMSPFRTDDWKGITYV NK >gi|226332215|gb|ACIC01000105.1| GENE 9 13689 - 16769 2008 1026 aa, chain - ## HITS:1 COG:no KEGG:BT_4090 NR:ns ## KEGG: BT_4090 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1026 1 1026 1026 1981 100.0 0 MKRLLLILVCISLFISAFGQSFVLKGIVSSADGELLPGVNVKIQGTTVGTITDIDGNFQL NVNKGAVLEFSYIGFKKQSIKVNSQQMLNVELAPDQTNLDEVVVVGYGKAKRITLTGAVS GIQAREIRNVPTSSVQNALTGKLPGFFSQQQSGQPGKDASDFFIRGVSSLNNDGNKPLIM VDDMQYTYEQLSQINVNEIESISILKDASTTAIYGIKGANGVLVVKTRRGEQGKPKINVR LETGMQTPVRTPKFLNAYESLQLVKEAHTNDGTLSDFPYTEDDMIAFRDHTDPYGHPDVN WYDEIFKKMAFQENINVDVSGGSKKLRYFVSAGYFTQNGLVKDFGGNSGDGVNPNYSYRR FNYRTNLDFDVTDNFNMRLDVSSRFMDINEPYNMNVTGELYDFSKMHPYSAPLLNPDGSF AYLYDTQDRKPTLNARLANEGYKRTRRNDNNILYGATWKMDFLTPGLAADFQLAYSSSDE NYRAVMRTRYPTYHYDSATGLYNINPNGVYDYEQYFVYNSTDKAIKDLNIKASLKYAHVF NNAHDVNVMFLYNRQSTTNEKDAAVPNNFEGYTMQLGYKYKNKYLVDLNMAYNGTDRFGK 
DNNFGFFPALAIGYAISQEDFFKNVDWLGRNVQLLKFRTSYGLVGSDVASGDRYLYRQVY KTGDSYTFGEGNNFGVSGIKEGDLGNLNVTWEKARKFNVGVDMNLFDKFSLTFDWFIDKR YDQLVTRNDIPDILGIGLSPENVAETTNKGFDGQIGYQDRFGNFNFNTNFVFGYAKNRVD YKAEAQQKYEWLRETGRQIGQPFGYHWIGYYTPEDIEKINSGAADAPAKPDIAVQAGDLK YADLNGDGTINDFDKRAIGKPNLPTTTLGWTIGGSWKGFSISVLFQGSFDYSFAINGTGI ESGKSQFQPLHQKRWTQERYENGESIEFPRLTMQDGTINSASAYNSDFWLINAWYIRLKT VDLGYQLPKTALPKFLDNVRFYVNAYNLLTFTGYDKYQQDPEIKTNTAGDAYMNQRVINF GVQLGF >gi|226332215|gb|ACIC01000105.1| GENE 10 16774 - 18570 1071 598 aa, chain - ## HITS:1 COG:no KEGG:BT_4089 NR:ns ## KEGG: BT_4089 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 598 1 598 598 1219 100.0 0 MISMIRNKYLTLIGVGLLFTAVSCTDGYEPEPVELVSIDFVFSKTDSLGTNAVKFMNNIY ATLQNGHNRVGSDYLDAASDDAISINVSDPDVYKLAMGRYTASTRVESDMRWKEYYSGIR KANILINHIDVVPFMLTYKNAKGETKPLNVTMKAEARFLRAYFYFELVKRYGGVPLMGDD VHILGDDMEIPRNTFEQCVQYICDELDDIKDDLRTNPMPDFEQYAHTPTREACLALKSRV LLYAASPLFNERPIEIGNELIGYASYDRERWNDAAKAAKTFIDEYGPNGNGAYGLTQSTS DGDNRDFRDVFLGFYNKTNNPEVIFYRPGGEDKSIESNNGPLGFSGDNLGKGRTLPTQNL VDAFPMKDGMFAGQGSKYTLNQSNPYENRDPRLDYTILHHGSSWLNNTLDISIGGVNNPS NSAEYSKTGYYMCKFMGKFGEESQYGNKIHLWVMFRYAEMLLNYAEAMNEYLSSPSQDVY DAIIALRARAGIESGNDESPYGLKKNMTQAEMREVIQNERRIEMAFEEQRYWDIRRWRIA EEIFKNPLEGLEIRVKGNTTSFNEVDVLSTTFDVKRYLYPIPYNEVVKNDNMIQNPKW >gi|226332215|gb|ACIC01000105.1| GENE 11 18577 - 21711 1826 1044 aa, chain - ## HITS:1 COG:no KEGG:BT_4088 NR:ns ## KEGG: BT_4088 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1044 1 1044 1044 2033 100.0 0 MRNILNKRNSRYCFTCFVMWLLLGLGHLNAVAQEAGGTANITGKVIDQYGNPVSGVVITM KNTDFKTVTGDDGTFEFQYKKGDMLRFSHPGFLHKEIKVNKLRNQERIFKVTLTEEFVKF PDVINGPYDTKDKASYLGSAATVYTDQVSSLMGTTILPSLQGRLPGLDIVRTRGARKSQI ESSSSGTIFNFNSPTLGKEAYSDNTEFNVLSRHNAPVVVVDGVQRELYSIDPDAIESVSI QKDALSSMFLGMRSSRGALVITTKDPIKQGFQLSFTGRFGVQSSVKKLNPLSTSQYAYLL NEALLNDGKNPFYSYDDFIKFRDHSSPYTHPSVNWCDELMNKNSTTQSYNLNATGGNKYA QYFISVGYVGENGLFKNPGGDAHDTNMTFDRYMISSKVNINITDDLTAKVTLMGRIEEGT 
QPGGTGNGYDDILSSIYSTPSNAYPVTNPDGSWGGSQSFNNNLLSQTINSGYITDGARDV LGAINLRYDFGKLVKGLSVRMVGSVTSQNRSTTKRTKTSEVFDYTIDKDGNDVYTRYGEK KTQSNSFSSVSTYRQMYGQLAVDYERQFGKHKFKASVLGDTRNTLTNWDLPEYPSNIIGD VSYDYAERYFAQVALSESYYNRYAPGRRWGTFYAFGLGWDISKENFMENCEWLNQLKIRG VYGKTGNGMNNAGYYTYYQTYSSGGDDYRLGTNLGQSSGSFTEKDKLANLYQTWEKGNKL NIGVDIALFNNKLQVTADYFNDKYYDLLQARGKSIELIGQNYPDENIAKERWYGGEFSIT YQDHVGDFNYYASANWSCEQSKVLYKDEQKVPYEYLRTTGKPKGAIFGLVAEGFFTSQDE ITKSPVIEGFNNIQPGDIKYKDQNNDGVINDFDKVMIGGDKPLSYFGIDLGFEWRGLEFS MFWQGVYNKDVLMSDWNLLEGFQTQGQVYGQAYENMLDRWTPETAATATFPRLSAGGNKY NQGNGWGSSFWLRSGNYIRLKNVSLGYNLPDSFCRNYLGGARVKVFVSGQNLFTKAANEL VDPEVSFGNYPLQRCISTGINVKF >gi|226332215|gb|ACIC01000105.1| GENE 12 21736 - 22788 433 350 aa, chain - ## HITS:1 COG:no KEGG:BT_4087 NR:ns ## KEGG: BT_4087 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 350 1 350 350 696 100.0 0 MKYYFKKHRIKIFFLLVLSLIVFGCSFLIKEVNVKQENEAGEMVAYIKAGEIATFTFSGE INIDGDASNETFIVGFLAPRSWNVRQNATVTYREDRYETEVDHKMTVIPDTEQPANYKGM SWSAALKKKYGVRGNVLNDMEWIAFKSDNYPSVNGTIHYTVTIKCNSGKSNLKFRPSFFI NHSSDGIGGDEAHYSVKDADDCFEVVEGSGTVIDFCSTHYYQIEPLSALQDDYVTFTFQG DINTNELIKAENVYIEATAYTIEGKIYTVNEKSAKTLMKRETKLPRYNVTLWPGGFFNIP DGETISRIEYIFTNEDGTVSISQSDDSRDNEGEEVEEGIKEPFVFEFQCE >gi|226332215|gb|ACIC01000105.1| GENE 13 22806 - 23258 294 150 aa, chain - ## HITS:1 COG:no KEGG:BT_4086 NR:ns ## KEGG: BT_4086 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 150 1 150 150 299 100.0 2e-80 MRFKIQTTVLLFVLLGTVSACNTFKDEIAPDSYMEVPKQLDGKWQLKTVVRNGTDISEVM DFSQFRLIMNKDNTYNIENYLPFLVKKNGTWRIDNLTYPFFLTFQEEGAEREAITEITYP IVQGKRHITLTISPGCSSNSYVYSFEKIEE >gi|226332215|gb|ACIC01000105.1| GENE 14 23288 - 25144 1069 618 aa, chain - ## HITS:1 COG:no KEGG:BT_4085 NR:ns ## KEGG: BT_4085 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 618 1 618 618 1259 100.0 0 MNITVFRVYQLNLLTAISMYMKSLFYLFLSIVGAWLFSSCSDDFLGETETTDLDQATVFA 
DSTYTADFLNQIYVDIGFDIQHNRYKDQYNDHGGLQTSCDEAAYKASTGLTTDVMFATGT VNPVTISEDDVWRIAYRNIRRVNVFFKYADGSRMAEVAKEEYKAEARFLRAWYYAMLLRH YGGVALIGDDVYETVEEAIKERNSYADCVEYIVDEANKAAETLPVERSGNKFGRVTRGAC KALISRVRLYAASKLFNGSDFAPADFPKELLGYPTYDKERWKIAVDAALDVIKMKQYDLY IRNEDENNEAYPGWGYYAQLLPADYYGKVGTEVYCGTIFEKKAGASIDTNRWFAPPSTGG NGIGGYVYHDLAELFPMADGTPTKDSPDYDPTNPANKRDPRFMFTVTYDGCIMKSNMQDT EINISVGTQQDAIYRGTPTGYYTHKFLKFGSMANQMLYGGSQARPLMRYTEILLNYAEAA NEYYGPDHKDVLGDQEISPYIVLRKIRECAGIEPGEDGTYGIENNMSQADMTEAIRLERR LDLAFEGHRFFDVRRWMIAEDTDNRMMHGFEITRNGERKTGRIIDTRQHTFRKAMYFYPI PYKETVKSPDLLQNPYYE >gi|226332215|gb|ACIC01000105.1| GENE 15 25362 - 26693 613 443 aa, chain - ## HITS:1 COG:no KEGG:BT_4084 NR:ns ## KEGG: BT_4084 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 443 1 443 443 888 100.0 0 MKTRNIYSLLYVTFVAVALVACNDYDAAQTAYEEIVDSDVTTTPPNIDSAWELQLIPNVG QHSGEVFVYKDKKYDKLFTRTLGWNGGIGVQSTSLSDGNVLWAFNDSYFGVVDAETRARG NCNFPHNSIMIQTTVGGSLGETDDDLRWLVDYIQTNDPDGEGYYQAYTHIAPDETIMEET DEEHFYQIGGATIFDNNGVKELQMLWGEIDNHEGKMTRTGTCLAVYSLEGQPGNSTYLKR ISKNEEFNTDDVGYGSTIWKDEDGHIYLYVTENNRPLVARTTTHDLTSEWEYYIRDLSGN FMWQKMYPTKEERTRSTIMENNYVCSMPQIFKKGDYYYMIGQAVSYGHSVYLYRGETPYG PFTDQKILFNVPYSVDKIGNQYYKNLLRVNLHLELAREGELVFSTNTDADTAGDNFDFPG SADFCRPYFYRIFNWESIYDEDD >gi|226332215|gb|ACIC01000105.1| GENE 16 26720 - 27748 538 342 aa, chain - ## HITS:1 COG:no KEGG:BT_4083 NR:ns ## KEGG: BT_4083 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 342 1 342 342 647 100.0 0 MKYRNFTNKLMLMVGVTATAVLTSCEQEFYQDEQYRKEIYIVSGEDNIFQREFAFGGEEI GYLSVYASGTTPIEKEVMVELERNETVLSDYNQKRYGDNYKNYVLELPDTHYKVDDWSIN LYPNANSSYSLFPIKVNIDGLEPEDNYFLPLRIASVSDYMISSARRNVLMQILMKNDYAT TKEKSYYTMNGTRLRVAKDTWEPLDKKNNAPDYKPINATKLVAPVTEYGIRILTGSTLTS DRKELRKQGIVVTVHPEEMIDVPVIGSNGLPTGEYIQCQKVTLDKWYNVNSGITVLNIED TPSYYNPEKKEFTLNYRYNYNSNDWYEMKEVMSPVAIANENN >gi|226332215|gb|ACIC01000105.1| GENE 17 27768 - 27941 199 57 aa, chain - ## HITS:1 COG:no KEGG:BT_4082 NR:ns ## 
KEGG: BT_4082 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 57 627 683 683 126 100.0 2e-28 MGLNVEQAEWEGFYQPTVIQYKSIIERDFKPKMVWLPLHLDEIRKVSVLDQNPGWDK >gi|226332215|gb|ACIC01000105.1| GENE 18 27988 - 29820 972 610 aa, chain - ## HITS:1 COG:no KEGG:BT_4082 NR:ns ## KEGG: BT_4082 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 600 1 600 683 1211 100.0 0 MKKYFILSVLLLVSGFFASCNYLNIDDYFEDTFQEDSIYANKRNIERYFNGAAALLPVVD KIWEYGNTPGVTGSDEAVSCGDWNGMVVIQFSGTKLMNNEITSSSMGGWTWDFNIWPKCY KVIRKVNTILPHIDGVRDMNSFERMEFRAKCRFLRAYAYYLILQQNGPMILLGDEIVSNN EEAEYYARTRNTYDECVDYICSEFDEAAKNLPDATTSMDQYIPTEGAALALAARVRLQAA SPLYNGGEAARRFFGDFTRCTDGDHYVSQSYDERKWALAAAAAKKVIDLGRYQLYTVAAS TDAEHPESGNYGTYVVTLPQEVPTAEFPNGVGMGDKKVVDPYRSYAEMFNGELAINQNPE FIWCSSTAQVGSHMGYVFPLNFGGSSCLCVPQHIVDQFYMADGRDIKNSSSAYPYISRPY DKTCVTTEGKVLSEGYKISQGTYLAYTNREPRFYVNIGYSHAWWAMGSTTESAKKNVNID YWNGANSGKNHSNNNVYNITGYTSRKYINPQDAMSGSGARQKDKSFPIIRYAEILLAYAE ALNNLTQAYEIDGQTYTRDTEAIKYYFNQIRYRAGIPGLTADDLITVEAFNKVVQRERLI VVILGRTTLL >gi|226332215|gb|ACIC01000105.1| GENE 19 29852 - 32953 1623 1033 aa, chain - ## HITS:1 COG:no KEGG:BT_4081 NR:ns ## KEGG: BT_4081 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1033 1 1033 1033 2069 100.0 0 MNRIFVIFALLLAFVAQVHAQKEQSFMLTGTVFDEFNEPVPGANVYVKDKPGVGITTNID GKFRLKVNIYDIVVVSFLGYENYEQRLTNKIDNVKVTLKPATENIDEVVVVGMGTQRKVS VAGAITTMDPAQLEVPATNIVNTLAGRVAGVIGVQSSGEPGKNISEFWVRGIGTFGANSG ALVLIDGLEGSLSQIDAADVESFSVLKDAAATAVYGSRGANGVVLVTTKRGLESKLKISG RANMTISHLKRLPEYVNATQYAEMANEASVATGLSPIYNKTEMDIIKYGLDPDLYPNIDW QDVILNPNSFQQTYYVSAQGGSSVARYFASLGMSKESAAYNPSKDSKYNKGVGYDTYNYR LNLDIDLTKTTKVYIGTTGYMSVNTRPSMGEYSRGVSLTDWLWSSQAKTTPISYPLRYSN GYYPAAGTKDEISPYVLLNYTGNAREQNTRNLVTLGITQDLSMITKGLSAKVQGSWDTQS LFGEARYKMPELNEALKERNPNGTLSFEKIADEKTVTYSSAAWVWRKLYLEANVNYDRTF GDHRLGGLLFYYIEDTSETGADKSMNAIPKRYQSLSGRFTYGFKDTYFWDLNFGLNGSEN 
FEPGKQYGFFPAGAFAWVPSSYEFIRENVSWLNFLKLRVSYGVVGNDRISSRRFPYLTLI KEENASNPWGGTGSLTEEQVGANNLMWEKAKKFDVGMDLHLFKDKFTLTLDYFKDTRDGI FQERKQIPDYVGLIQMPYGNIGRMKSWGADGNMEFYQQIGKDAHVILRSNFTLSKNKILN WEDTKKPYTYLENNGYANNVQRGFVAMGLFKDQQDVEMSPTQFGTVRPGDIKYRDINGDG KITDDDKVPLFAYSGVPQLMYGIGAEFRYKSWTLNVLFKGTGRNKFLYGGSDGRNFDGYM PFNQKDKGNIMTIAYNPENRWISAEYSGNQATENPNARFPRLYYGKNENNTKPSTFWMGD ARYLRLQELSLAYNLKVPALQRILGINSMNIQLMCENLAVWDSVNIFDPEQATSCGQAYP LPARYSLQLYLNF >gi|226332215|gb|ACIC01000105.1| GENE 20 32985 - 34355 675 456 aa, chain - ## HITS:1 COG:no KEGG:BT_4080 NR:ns ## KEGG: BT_4080 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 456 1 456 456 944 100.0 0 MRNLLQKCFKWQWLVCLAVGCCFSSCKDDEESTSGYDPNKPIVLTDFYPPEGKLATQVIL NGSNFGDSKENVKVFFNDKEASVISVKDDRMLVLAPKRASTIEDPECVVKVQVHEQIEEY KQTFDYYIQTTVTTLVGGSTSAQVNPTGTIPLSEAQFRANIDRCICVDQDKNVFFLVDND GKFAAFMLNEEADKLISLKSDINALFNSPVLGYNSKDDIVYQFWANRDSHEVYYFDPKTD YAPTTAISSISWDDPNFPNIEGFGVWAAKCNFTMGPDGKMYSRMLGGNLVRIDVENARGE NLTNGDLVGTKDGSAYGLVFDPQDENVFYFSNNDKHCIYKYDLRTKECACWAGQEGKSGY LDGPIGQAMFNKPGQMCVDSEGNIILTDTENHCIRKITMSTGYVSTLAGKPQNSGYVNGS AEDAQFKKPLGICIDNDDVMYIGDSENRAIRRLAVE >gi|226332215|gb|ACIC01000105.1| GENE 21 34540 - 36300 927 586 aa, chain - ## HITS:1 COG:no KEGG:BT_4079 NR:ns ## KEGG: BT_4079 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 586 1 586 586 1229 100.0 0 MRRYIYFVFLLCCAVNLLLTSCVRQKYVVMDMVHHNPGEAMTESKFLDPSFLKKNEYGAK VFFLFEAAQFGIDWKSFDPSLFPDTTEAGRWVAEKAEIIHKKYDAAKKEDLQVYCMLDML VLPSLLVEKHRTELTNEQGKLDISKPYTQLCIRELMKEMFETFPQLDGLVIRTGETYLHD APYYVGNHPVQNGMYDHITLINLLREEVCERRNKKLFYRTWDMGQLHSIPKYYLSVTDSI EPHPNLYFSIKHTMTDFWRSAITDPDMNYNTMDKYWLEESGQYGVPFNPCIGIGKHQQVV EVQCQREYEGKGAHPNYIAKGVIDGFEEFKKSNIKKPYCLNQVKDNPLFKGVWTWSRGGG WGGPYIKNEFWIELNAYVISHWASNPLKTEKEILYDFVKAKGLPESEWEMFRRLCLLSED GVIKGQYSTMGDTYVNWTRDDTITGDVYQKSYFDRMIERNQVNAYLKEKEEAVRIWKEIE LISQKLHFPSEELNHFIRISCSYGRIKYELFAVSWQIMLCGYVADTTKKSFNRIEMDKYI 
TAFDDLWKEWNDLSLENDNCPSMYKISSNFFGFPVGIQETIDKYRK >gi|226332215|gb|ACIC01000105.1| GENE 22 36328 - 36777 257 149 aa, chain - ## HITS:1 COG:no KEGG:BT_4078 NR:ns ## KEGG: BT_4078 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 149 82 230 230 307 100.0 1e-82 MSTDAFINAQKFQPNVVIIKLGTNDSKPENWKYKDEFETDLEYMISTFQKCGSKPKIIIC RCIPASNNLYGIRNEIIKDEIYPIQKKVARKFHLKLVDLYSPLEHKVDSTCYVWDNVHLN KSGSLILAEHIYKAITGKKAPKLENAFAN >gi|226332215|gb|ACIC01000105.1| GENE 23 36799 - 37020 195 73 aa, chain - ## HITS:1 COG:no KEGG:BT_4078 NR:ns ## KEGG: BT_4078 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 73 1 73 230 160 100.0 1e-38 MTRIQNHMTKIVRILVFAFLMLIPVCGVAQDKIKIACIGNSITEGADNYPTPLARMLGNQ YEVGNFGKWGHTL >gi|226332215|gb|ACIC01000105.1| GENE 24 37352 - 38218 433 288 aa, chain - ## HITS:1 COG:AGc3637 KEGG:ns NR:ns ## COG: AGc3637 COG0627 # Protein_GI_number: 15889292 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 9 281 6 309 322 113 29.0 3e-25 MKKHLLSALLAMCLVTTAFAQQGKVYETRTVKSKILGMERSYSIYLPAGYDEGDGSYPVL YLLHGLGDNHTGWVQFGQVQYIADKAIAEGKSAPMIIVMPDADTVHKGYFNLLDGTYNYE DFFFQELIPHIEKTYRVRAESRYRAISGLSMGGGGALFYALHYPEMFVAVAPLSAVGGAW TFDQMKNQSDLSKVSEEKKAEVLGQMDIQTILEKSPKEKLDRIKWIRWYISCGDDDFLSV TNCLLHNTLLQHQVGHEFRMKDGSHSWTYWRMELPEVMRFVSRIFTQY >gi|226332215|gb|ACIC01000105.1| GENE 25 38228 - 40906 1213 892 aa, chain - ## HITS:1 COG:no KEGG:BT_4076 NR:ns ## KEGG: BT_4076 # Name: not_defined # Def: alpha-rhamnosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 892 1 892 892 1856 100.0 0 MRNKKLLLGIGIMCCLWCLPKIVHGKENKVSFSLVELKCENMVDPLGIDNVTPHLSWKLK GDGVVDGQAFYEIQVASDSLLLIGGKADLWKSGKLKSDVSVMVPYQGLPLASRSLCYWRV RAWDRKRHVSQWSPVARFSVGLLNKEQIHGVYIGSSPEGGKVCAPLLRKKVQIGELATTF LYVNSLGYHEVYVNGKKVTENVLTPAVSQLNKRSLMVTYDVSSYLKEGENDLLIWLGQGW YKKTTFGAAYDGPLVKAELNMLRNGKWEVLTATDTSWRGRESGYSDTGNWCALQFGGERV 
DGRIVPTDFSTYSLDKMKWYPVVEVNVPRHIVSPQMCEANKIHQTLQPVSIRKLSEDTWL VDMGRIQTGWFEMKMPMLSAGHEVTMEYSDNLTKEGEFDKQGESDVYIAGGRRGEYFRNK FNHHAYRYVRISNLPARPKTEWIKSLQIYGDYRQTATFECSDADLNAIHNMIQYTMRCLT FSGYMVDCPHLERAGYGGDGNSSTMSLQTMYDVAPTFTNWIQTWGDSMREGGSLAHVGPN PGAGGGGPYWCGFIVQAPWRTYVNYNDPRLIKNYYPKMKEWFSYVDKYTVDGLLKRWPDT QYRDWFLGDWLAPIGVDAGAQSSVDLVNNCFISECLGTMEKIALMLGEKEEAEKFAMRRK NLNELIHQKFYHPGKGIYSTGSQLDMCYPMLVGAVPDSLYDDVKRKMMTDTEKQHKGHIA VGLVGVPILTEWAIRNREVDFLYQMLKKRDYPGYLYMIDNGATATWEYWSGERSRVHNCY NGIGTWFYQAIGGLRIDESIPGYQHVFIDPQIPKGLTWAKMAKDTPYGVIAVDWELTDNI MDLQVDIPVGVTATLCIPDEAVSCMMDGKNIRIEKKMIQLKAGNFKYYIYMK >gi|226332215|gb|ACIC01000105.1| GENE 26 40977 - 42173 865 398 aa, chain - ## HITS:1 COG:no KEGG:BT_4075 NR:ns ## KEGG: BT_4075 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 398 70 467 467 807 100.0 0 MNAGIGGESAWDIKDRLDYDVFDRKPTYVTLTFGMNDTGYDIFWKENAKELSEQRIEKSL ESFREIEKRLLAENKMTKVLIGGSPYDETTKLNSLLFLHKNDAILKIIDAQRKAAKKNGW GFVDFNQPMVQISLEEQKKDSTFTFCRVDRIHPDNDGQMVMAYLFLKAQGLAGVEVSDIS IDANNKNLLSHRNCKVSKLKKEAGGLSFDYLANSLPYPLDSIPRHGWGNKRSQRDAMDLI PFMKEFNQERLQVTNLEKGHYRLTIDGLFIDNVSSGQLEDGINLADYPNTPQYQQAMKIM YLNEERFEVEKRFREYLWTEYSFLKKEGLLFADNEKAINKLREYLPKDGFLRMSYEWYTK AMYPEIREVWSRYMKIIVDTIYKMNKPTTHKVKLTKIG >gi|226332215|gb|ACIC01000105.1| GENE 27 42628 - 42825 62 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIIEFYKFLLSPLSHYKSSMLTYLFNSKALFSSFCSYIGLFIMVQIKQMSVTLIYEEGKT NKLCQ >gi|226332215|gb|ACIC01000105.1| GENE 28 43146 - 46229 2253 1027 aa, chain - ## HITS:1 COG:TM1624 KEGG:ns NR:ns ## COG: TM1624 COG3250 # Protein_GI_number: 15644372 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 67 746 22 682 785 156 26.0 2e-37 MMNKFRLLLLASSLFVLNGIGHAQTSTMSLNSSNPKIVWEVKPQADLNNIGGEQISTPGF KMPDYVKGVVPGAVFTAYVEAGIVPDPNYADNIYKVDETFYNRPFWYRTEFELPASYSAG KRVWLHFDNTNRFADFYFNGEKISGTKTSTKDVSGHMLRSKFDVTHLIKKSGKNAVAVLI 
TDPDQKKTRKGKDPYGVACSPSYLAGAGWDWMPYVPGRLAGITGNAYLAITGDAVMEDPW IRSELPTLQQAELFFSTGIKNVSSAPKEVEVSGVIQPGNITFSKNIRVEGKETVQLSVDK SDFAALVIRNPKLWWPNGYGEPNLYTCKLTCSVDGKISDEKDITFGIKKYEYKMINNVVN YPVLTFFINGQKIYLKGGNWGMSEYLLRCHGKEYETKIKLHKDMNYNMIRLWTGCVTDDE FYDYCDKYGIMVWNDFWLYVAYNDVAEPEAFKANALDKVRRLRNHPSIAIWCGANETHPA PDLDNYLREMIAQEDKNDRMYKSCSNQDGLSGSGWWGNQPPKHHFETSGSNLAFNKPAYP YGIDHGYGMRTEIGTATFPTFESVKLFIPQESWWPLPTDEQLKDDDDNVWNKHFFGKEAS NANPINYKKSVNTQFGESSSLEEFCEKAQLLNIEVMKGMYEAWNDKMWNDAAGLLIWMSH PAYPSFVWQTYDYYYDPTGAYWGAKKACEHLHLQWNSSNNSIKAVNTTTKDLKGAYAKAT IYNLNGKEVAAYGRTKQMDVPASNIAEAFTLNFNPYNLAFGKNVIASSSSPSRSASLVAD GGAGSRWESDASDSQWIYVDLGKKEKIEHVVLKWETARAKEYEIQVSNDAKKWKTVYTNK DGQGSTDEIKLSPVTARYVKMAGVSRATDFGYSLYEFEIYGKKQKNVEELTPLHFIRLEL TDANGNLISDNFYWRNGVTDLDYTALNTLPEAELSCKLVDKSMLSEGKMKLSVKNHSKTV ACANRIRLVNTATQERILPVIMSDNYITLMPGEERTISVEAEPEMLKGGVSVLLKQYGKA EQKKLDI >gi|226332215|gb|ACIC01000105.1| GENE 29 46243 - 48591 1792 782 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 33 777 47 770 790 367 31.0 1e-101 MYKASANILIACSFLWLTACSSAGVVTEDLIPTDYVNPFIGASTSVGAAGVYHGLGKTFP GATTPYGMVQVSPNTITGGDNSSGYSDEHKTIEGFAFTQMSGVGWFGDLGNFLVMPTTGE LQKIAGKEDGSIKGYRSSYDKATETAKAGYYSVELTDYKIKVESSATPHCGILQFTFPSN EQSRIQIDLARRVGGTSTSQYVKVLDDYTIQGWMKCTPDGGGWGNGEGNSDYTVYYYAQF SKPLSNYGFWSADIPDEWVRKRDEVVSIPYLTRISQAPVIKDKKELEGKHLGFFTEFPTK EGEQVEMKVGISFVDMEGAANNFKQEIASKNFAQVKQEASDLWNKELSRIRISGGTDDEK TVFYTSLYHTMIDPRIYTDVDGRYIGGDKKVHEQDGTFTKRTIFSGWDVFRSQFPLQAMI NPRLVSDALNSLITMADQSRREYYERWELLNSYSGCMIGNPALSVLADAYMKGIRTYDVE KAYQYAVNTSAKFGNDSLGYTPEPLSISYTLEYAYADWCVAQLAKALGKEEDAKRFYEKG QAYRNMFDAEKGWFRPRNADGSWKAWPENALTEEWYGCIESNAYQQGWFVPHDVPGMVEL MGGKEEVIANLTNLFDHTPSDMLWNDYYNHANEPVHFVPFLFNQLDVPWYTQKWTRYICK NAYANKVEGIVGNEDVGQMSAWYILAASGIHPSCPGNTRMEITSPVFDKVEFNLDSKYHQ GKVFTIIAHNNNTNNLYIQKALLNGKEYNKCYLDFAEIAAGGTLELFMGDKPNTEWGVLS NI 
>gi|226332215|gb|ACIC01000105.1| GENE 30 48626 - 51175 1411 849 aa, chain - ## HITS:1 COG:all0848 KEGG:ns NR:ns ## COG: all0848 COG0383 # Protein_GI_number: 17228343 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Nostoc sp. PCC 7120 # 64 813 278 1015 1047 90 20.0 1e-17 MNRLNNIILLVVFICTHAHSQQAYFVDGYHGGIYGHYPVKWKTQFIVDQLSKHPDWRICL EIEPETWDTVQIQTPGAYRQLKNVVVTKQVEFTNPTYAQPYCYNISGESIIRQFQYGIAK INKHFPGVTFTTYSVEEPCFTSCLPQILKLFGFKYAVLKCPNTCWGGYTNAYGGELVNWV GPDGTPILTVPRYACEKLEENSTWQTTAWCNEESYLNACRDAGIEHPVGMCFQDAGWKNG PWLGSGKNTKSNSVYVTWKDYFENISIGKTNDDWHFSQEDIRVNLMWGSQVLQRIAQEVR TSENKIIIAEKMSVIAHLANGYTCVQEDLDEAWRTLMLAQHHDSWIVPYNGLNKFGTWAD QIKRWTDETNTVADKITAASIFSFDNDTINTEKAQGFVRVYNTLGTKRKELVTVELPQEY ADFDLEVNDYRNKKVDYSIGKEGGKIRLLFEADVPPFGYSTYRIKQVKTGKRTVSGSEKI INEGEYVVENDMYKIVFDLSKGGIIKSLVAKKEENKEFAKQSGEYSIGELRGYFYDEGKF RSSTEAPAKLTVLRSNIQETKVKLEGKIASHPFVQIISIAQGEKRIDFDLTINWKKNVGI GEYKENSWRDNRRAYCDDRFKLSVLFPVNLHSPRIYKNAPFDVCESKLDDTFFNSWDQIK HNIILHWVDLAEQEGDYALALFADHTTSYSHGKDYPLGLTAQYSGNGLWGPDYKITGPLK MKYAIVPHRGKWDKAAIATKSDCWNEPLLCSYHSFAKLESRSLVDLKNTGYQVSAANIKD GKIILRLFNAEGDRKQHNITFDMPLSSIEEVDLNGRVIDRKTIKTRTGKSEISISMPRFG LKTFALNLN >gi|226332215|gb|ACIC01000105.1| GENE 31 51183 - 51413 310 76 aa, chain - ## HITS:1 COG:no KEGG:BT_4071 NR:ns ## KEGG: BT_4071 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 76 1 76 76 117 100.0 1e-25 MGTNLNKYFAGELTSEEKEVFLLNVKNNGEMREEFIEYQSVVALVDWSFPKDDKELAKQK LSEFMSRIENSENKKA >gi|226332215|gb|ACIC01000105.1| GENE 32 51489 - 52013 302 174 aa, chain - ## HITS:1 COG:PA1912 KEGG:ns NR:ns ## COG: PA1912 COG1595 # Protein_GI_number: 15597108 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 4 163 13 162 168 61 29.0 9e-10 MDFSELYLTYYSKLVRFAKEFVILEEDAENITQDVFTDLWAKRDSMDRIENMNAYLFRLI KNRCLDHLKHKMFEQKYIESVQTSFEIEMSLKLQSLNRFDVSDISEGNETEMLVRNAINS 
LPRKCRDIFLLSRVEGLKYREISERLGISVNTVECQMGIALKKLRVKLNICLAA >gi|226332215|gb|ACIC01000105.1| GENE 33 52204 - 53808 841 534 aa, chain - ## HITS:1 COG:no KEGG:BT_4069 NR:ns ## KEGG: BT_4069 # Name: not_defined # Def: putative regulatory protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 534 1 534 534 940 100.0 0 MKHTLCFITILLGSLLNLYANNENDSLLKVLDKVISERLVYTEKKEATIKELKAKKKEQK TLDDMYRLNSEILHQYETFVCDSAEQYINENIEIAKKLDNKTYLLEGRLQLAFVYSLSGL FIQANDIFKSINCSDLPSHLQALYCWNRIRYYENLIKYTDDARFASEYLVEKEAYRDTVM SILYDASEEYSKERAIKLQDQGNTKEALKILTKIYQKEKTGTHGFAMMSMGLSRAYRLVG EHELEEKYLILAAMTDIKLAVKENEALLTLAVNLYHKGDIDRSYNYIKVALSDAIFYNSR FKNTVIARIHPIIENTYLYRLEKQKQNLRFYILLTSLFVVALAITLYFTYKQTKIVSRAK KNLNVMNEELVALNKNLDEANLIKERYVGYFMNQCAVYINKLDEYRKNVNRKIKTGQVDD LYKSSSRPFEKELEGLYTNFDKAFLKLYPNFVEEFNSLLKPEDYYKLDKDQLNTELRIFA LMRMGITDVSQIAVFLHYSVQTIYNYKSKVKRMSLLDGNIFEEEVKKLGSLSQK >gi|226332215|gb|ACIC01000105.1| GENE 34 54068 - 54445 123 125 aa, chain + ## HITS:1 COG:no KEGG:BT_4068 NR:ns ## KEGG: BT_4068 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 125 1 125 125 227 100.0 1e-58 MKQSLKYGLFLVLALLIHVVSNEAMEDNCRVSPPTYRQEKCYVSQDHPVHNALERLHYFY STQSCDMSHADVAHIPTDKSVQLLITYFREYKSQQNTSQTHSLHIPKYFYDPITYYIYGL RKIVI >gi|226332215|gb|ACIC01000105.1| GENE 35 54559 - 54909 211 116 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|154175415|ref|YP_001407462.1| NADH dehydrogenase subunit A [Campylobacter curvus 525.92] # 5 116 14 126 129 85 39 7e-16 MNFTFLVVVLLTALAFVGVVIALSRAISPRSYNVQKFEAYECGIPTRGKSWMQFRVGYYL FAILFLMFDVETAFLFPWAVVMHDMGPQGLVSILFFFIILVLGLAYAWRKGALEWK >gi|226332215|gb|ACIC01000105.1| GENE 36 54900 - 55493 436 197 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|154175216|ref|YP_001407461.1| NADH dehydrogenase subunit B [Campylobacter curvus 525.92] # 33 183 12 160 170 172 51 6e-42 MEITKKPKIKSIPYDEFIDNESLEKLVKELNAGGANVFLGVLDDLVNWGRSNSLWPLTFA TSCCGIEFMALGAARYDMARFGFEVARASPRQADMIMVCGTITNKMAPVLKRLYDQMPDP 
KYVVAVGGCAVSGGPFKKSYHVVNGVDKILPVDVYIPGCPPRPEAFYYGMMQLQRKVKIE KFFGGTNRKEKKPEFMK >gi|226332215|gb|ACIC01000105.1| GENE 37 55514 - 57106 1657 530 aa, chain + ## HITS:1 COG:SMa1529 KEGG:ns NR:ns ## COG: SMa1529 COG0649 # Protein_GI_number: 16263284 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 49 kD subunit 7 # Organism: Sinorhizobium meliloti # 163 530 11 404 404 308 39.0 2e-83 MQEIQFIVPAALHDEMLRLRNEKQMDFLESLTGMDWGVADEKDAPEKLRGLGVVYHLEST VTGERIALKTAVTDRERPEIPSVSDIWKIADFYEREVFDYYGIVFVGHPDMRRLYLRNDW VGYPMRKDNDPEKDNPLCMANEETFDTTQEIELNPDGTIKNREMKLFGEEEYVVNIGPQH PATHGVMRFRVSLEGEIIRKIDANCGYIHRGIEKMNESLTYPQTLALTDRLDYLGAHQNR HALCMCIEKAMGIEVSDRVKYIRTIMDELQRIDSHLLFYSALAMDLGALTAFFYGFRDRE KILDIFEETCGGRLIMNYNTIGGVQADLHPNFVKRVKEFIPYMRGIIHEYHDIFTGNIIA QSRMKGVGVLSREDAISFGCTGGTGRASGWACDVRKRIPYGVYDKVDFQEIVYTEGDCFA RYLVRMDEIMESLKIIEQLIDNIPEGPYQEKMKPIIRVPEGSYYAAVEGSRGEFGVFLES QGDKMPYRLHYRATGLPLVAAIDTICRGAKIADLIAIGGTLDYVVPDIDR >gi|226332215|gb|ACIC01000105.1| GENE 38 57232 - 58308 1091 358 aa, chain + ## HITS:1 COG:RP796 KEGG:ns NR:ns ## COG: RP796 COG1005 # Protein_GI_number: 15604628 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 1 (chain H) # Organism: Rickettsia prowazekii # 30 349 15 327 339 247 44.0 2e-65 MFDFSIVTNWIHELLLSVMPEGWAIFIECIAVGVCIVALYAILAIVLIYMERKVCGFFQC RLGPNRVGKWGSIQVICDVLKMLTKEIFTPKDADRFLYNLAPFMVIIASFLTFACIPFNK GAEILNFNVGVFFLLAASSIGVVGILLAGWGSNNKFSLIGAMRSGAQIISYELSVGMSIM TMVVLMGTMQFSEIVEGQADGWFIFKGHIPAVIAFIIYLIAGNAECNRGPFDLPEAESEL TAGYHTEYSGMHFGFFYLAEYLNLFIVASVAATIFLGGWMPLHIVGLDGFNAVMDYIPGF IWFFAKAFFVVFLLMWIKWTFPRLRIDQILNLEWKYLVPISMVNLLLMACCVAFGFHF >gi|226332215|gb|ACIC01000105.1| GENE 39 58319 - 58807 535 162 aa, chain + ## HITS:1 COG:SMa1519 KEGG:ns NR:ns ## COG: SMa1519 COG1143 # Protein_GI_number: 16263279 # Func_class: C Energy production and conversion # Function: Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) # Organism: 
Sinorhizobium meliloti # 19 146 18 140 188 95 38.0 4e-20 MEYKDQKYTYLGGLIHGISTLATGMKTSIKVYFRKKVTEQYPENRKELKMFDRFRGTLAM PHNENNEHRCVACGLCQIACPNDTITVTSETIETEDGKKKKILAKYEYDLGACMFCQLCV NACPHDAITFDQNFEHAVFDRTKLVLQLNHAGSKVIEKKKEV >gi|226332215|gb|ACIC01000105.1| GENE 40 58810 - 59322 488 170 aa, chain + ## HITS:1 COG:jhp1190 KEGG:ns NR:ns ## COG: jhp1190 COG0839 # Protein_GI_number: 15612255 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 6 (chain J) # Organism: Helicobacter pylori J99 # 5 164 2 162 182 65 30.0 6e-11 MGSTLETVVFYFLAAFIIAMSIMTVTTQRIVRSATYLLFVLFGTAGIYFLLGYTFLGSVQ IMVYAGGIVVLYVFSILLTSGEGDRAEKLKRSRFLAGLFTMVAGLAIILFITLKHNFLQT TNLVPLEINIHTIGNALLSSDKYGYILPFEAISILLLACIIGGIMIARKR >gi|226332215|gb|ACIC01000105.1| GENE 41 59418 - 59726 367 102 aa, chain + ## HITS:1 COG:VNG0643G KEGG:ns NR:ns ## COG: VNG0643G COG0713 # Protein_GI_number: 15789840 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 11 or 4L (chain K) # Organism: Halobacterium sp. 
NRC-1 # 1 102 1 100 100 75 43.0 2e-14 MIHMEYYLVVSTIMMFAGIYGFFTRRNTLAILISVELMLNATDINFAVFNRFLFPGGMEG YFFALFSIAISAAETAIAIAIMINIYRNLRSIQVRNLDDLKW >gi|226332215|gb|ACIC01000105.1| GENE 42 59814 - 61745 1326 643 aa, chain + ## HITS:1 COG:slr0844 KEGG:ns NR:ns ## COG: slr0844 COG1009 # Protein_GI_number: 16331732 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit # Organism: Synechocystis # 2 643 6 680 681 382 39.0 1e-105 MELTILILLLPFLSFLILGIGGKWMSHRTAGAIGTLILGAVVVLSYVTAFQYFSAPRLED GTFATLIPYNFTWLPFTETLRFDLGILLDPISVMMLIVISTVSLMVHIYSFGYMKGETGF QRYYAFLSLFTMSMLGLVVATNIFQMYLFWELVGVSSYLLIGFYYTKPAAIAASKKAFIV TRFADLGFLIGILIYGYYGGTFGFTPDTVSLISGGASMLPLALGLMFIGGAGKSAMFPLH IWLPDAMEGPTPVSALIHAATMVVAGVYLVARMFPLFIAYAPDTLHMIAWVGAFTAFYAA SVACVQSDIKRVLAFSTISQIGFMMVALGVCTSMDPHHGGLGYMASMFHLFTHAMFKALL FLGAGSIIHAVHSNEMSAMGGLRKYMPITHWTFLIACLAIAGIPPFSGFFSKDEILAACF QYSPAMGWVMTVIAAMTAFYMFRLYYGIFWGAEPPRASSHSDHSTPHGNLEAAPCRPHES PLAMTFPLIFLAVVTCLAGFIPFGHFISSNGESYSIHLDLSVAVTSVVIAIISIGIATWM YKNAKQPVANALAKRFNGLWTAAYHRFYIDDIYQFITHKIIFRCISTPIAWFDRHVVDGF FNFLAWATNTTSDEIRGLQSGQVQQYAYVFLCGALALILLLVL >gi|226332215|gb|ACIC01000105.1| GENE 43 61828 - 63312 1429 494 aa, chain + ## HITS:1 COG:slr1291 KEGG:ns NR:ns ## COG: slr1291 COG1008 # Protein_GI_number: 16329430 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 4 (chain M) # Organism: Synechocystis # 69 439 71 443 559 251 37.0 3e-66 MNFLSIFVLIPLLMLAGLWAARGIKAIRGVMVTGASALLIASVVLTFMYLGERSAGNTAE MLFRADTLWYAPLHISYSVGVDGISVAMLLLSAVIVFTGTFASWRLQPLTKEYFLWFTLL SMGVFGFFISIDLFTMFMFYEIALIPMYLLIGVWGSGRKEYAAMKLTLMLMGGSAFLLIG ILGIYFGSGATTMNLLEIAQLHNIPFAQQCIWFPLTFLGFGVLGALFPFHTWSPDGHASA PTAVSMLHAGVLMKLGGYGCFRIAMYLMPEAANELSWIFLILTGISVVYGAFSACVQTDL KYINAYSSVSHCGLVLFAILMLNQTAATGAILQMLSHGLMTALFFALIGMIYGRTHTRDV RELTGLMKVMPFLSVCYVIAGLANLGLPGLSGFIAEMTIFVGSFQNNDVFHRTLTIIACS 
SIVITAVYILRLVGKILYGTCTNKHHLALTDATWDERVAVICLIVCVAGLGMAPFWVSHM IGESVLPVVSQLIP >gi|226332215|gb|ACIC01000105.1| GENE 44 63350 - 64792 1281 480 aa, chain + ## HITS:1 COG:BMEI1145 KEGG:ns NR:ns ## COG: BMEI1145 COG1007 # Protein_GI_number: 17987428 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 2 (chain N) # Organism: Brucella melitensis # 68 432 64 430 478 244 39.0 2e-64 MDYSQFLHMREELSLVAVLLLLFLADLFMSPDAHKQKGTRPVLNTMLPVILMAIHTAINL VPSTAADIFGGMYHYVPMHTVVKSILNIGTLIVFLMAHEWMKRDDTSFKQGEFYVLTLST LFGMYLMISAGHFLMFFIGLETASIPMAALIAFDKYRHNSAEAGAKYILTALFSSALLLF GLSMIYGSAGTLYFDDLPAHIDGNPLQIMALVFFFTGMAFKLSLVPFHLWTADVYEGAPS TVTAYLSVISKGSAAFVLMAILIKVFAPMIHDWQEVLYWVTIASITIANIFAIRQQNLKR LMAFSSISQAGYIMLGVIGGTAQGMTAMVYYILVYAAANLGVFAVITIVALRSQKFTLED YAGLYKTNPKMAFLMTISLFSLAGIPPFAGFFSKFFIFMAAFEAGFHLLVFIALVNTVIS LYYYLLIVKAMYITPSDNPIPTFRSDRCTKWGLALCTLGIIGLGIASIVYQSIDKLSFGI >gi|226332215|gb|ACIC01000105.1| GENE 45 64842 - 67637 2383 931 aa, chain - ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 679 922 55 308 328 184 40.0 8e-46 MQTNYERYCKMSSLAQIGWWEADFLAGHYVCSDFLCDLLGLEGDTISFMDFQNLIREDYR EQIVQEFRANANIHKDFYEQTFPICLKNGEVWLHTRLALREKGTGTNGGDKSFGVIQRVE APKEVEQKNTLRRVNDLLRRQNYISQSLLRFLHDDDVDSCIMEILNDVLSLYQGGRVYIF EYNEDYTHHSCTYEVVSEGVSKEKNKQQSIPVNETRWWCEQILSGKPIILTSLKQLPEEA ENEYKFLDAQGICSLMVAPLMAGDRVWGFMGIDLVESYREWSNEDFQWFSSLANIISICI ELRKAKDRVVREQSFLSNLFHFMPLGYIRMSVVRDENGQLLDYKITDVNKACSRFFARPA ETYIGVLASEIYPDFSSQLLFLKEVLDNNSYREKDIFFPQTELYTHWIVYSPEKDEVVGL FTDSTEAVKTNRALDRSEKLFKSIFANIPAGVEIYDKDGFLIDLNNKDLEIFGVENKQDV IGVNFFENPNVPQHIRDRVRDEDLVDFRLNYSFERAEGYYHPDRRDTIDIYTKVSKLYDN EGNFNGYILISIDNTEQIDAMNRIRDFENFFLLISDYAKVGYAKLNLLNRKGYAIKQWYK NLGEEEDTPLADVVGVYRNMHPEDRERIFDFYREVRKGNRKHFQGEMRIYRPGKKNEWNW IRMNVVVTTYNPEENEVEIIGINYDITELKETEKELILARDKAEMMDRLKSAFLANMSHE 
IRTPLNAIVGFSDLLVDTENIEERREYIQIVKENNDLLLQLISDILDLSKIEAGTFEFTN GDVDVNMLCEDIVRSMQMKVKENVELMFDPHLAECRIISDRNRLHQVISNFVNNAIKFTS EGSIRVGYKQKGEELEFYVQDTGVGIDAESQAHIFERFVKLNSFVHGTGLGLSICQSIVE QMGGKIGVESEPGKGSRFWFSLPCFVVMSEE >gi|226332215|gb|ACIC01000105.1| GENE 46 67907 - 69118 1225 403 aa, chain - ## HITS:1 COG:no KEGG:BT_4056 NR:ns ## KEGG: BT_4056 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 403 1 403 403 664 100.0 0 MKRLFVNFMTFAAMATALTFTACSSDSDGDDNGNGNGGNGGGTGSSIVVGENILSGTLTG EQTLDAKEYILNGTVIIADGGRLNIPAGTTIKAREGFSSYLLVAQGGKLYADGTADKPIV FTANTTSPVSGYWGGVIINGKAPISGSKTDKSDTALTEINNDYKYGGSAADDNSGSLTYV KICYAGARSTADIEHNGLTLNGVGNGTKIENIYVLESADDAIEFFGGTVNVTNLLAVNPD DDMFDFTQGYCGTLKNCYGVWESDYTSTEADPRGIEADGNMDGLYPDHLRQSDFKVENMT IVNNAANTADNVDRMQDVIKIRRGAKATITNALVKGTGGAIDLVDMKDSKGAGNIESSIN ITSTLSLTGLKQNGELNTFVESADNTGADASLFGWTGYNLSAF >gi|226332215|gb|ACIC01000105.1| GENE 47 69162 - 71876 2699 904 aa, chain - ## HITS:1 COG:CC0171 KEGG:ns NR:ns ## COG: CC0171 COG1629 # Protein_GI_number: 16124426 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Caulobacter vibrioides # 143 904 79 888 888 120 23.0 1e-26 MKKLFLLRFSVAAFFCLLCVLPALAANIKIKGAVKDKLSKEPLIGATIRLVGTPAGAVTD MDGNFELNSTGVLEGLYDIEIKYVGYKTEVRRKVRVENGKLAILNLELETDAHELSDVVV VAKKNRENENMLLLEQQKAVIAVQAVSVKELSRKGVSDAEGAVTKVAGVSKQDGVKNVFV RGLGDRYNATTFNGFAVPSEDPEYKNISLDFFGTDIIQSVGVNKAFNAGGISDVGGATID IVSKELIGSGNLNIGLSGGLNTQTVTADFLKQDGVNLLGFATTTEPADENNWGFKNKLDP SKQSLQINRSYSISGGKRFHIGKDRNPLSFFLTAGHTTDYQFTDETIRNTTTSGTIYKDM TGKKYTENISQLALANVDYDMQNRHHMSYNFMMIHANVQSVGDYTGKNSIFSDDYDNQGF TRRQQANDNLLIVNQLMTNWGLTKTLSLDAGASYNIVKGNEPDRRINNITKAEEGYTLLR GNSQQRYFSTLDEDDINVKAGLIYRLKDNVEEISNIRLGYTGRFVDDNFKATEYNLTVGH ASSIPSLDNFSLDDYYNQQNLSSGWFAVQKNIDKYSVTKNIHSAYAEATYQFTPRWIVNV GMKYDNVDIQVDYNVNKGGSEGSNTIQKDFFLPSLNLKYNLSEKHSLRLGASKTYTLPQA KEISPYRYVGVNFNSQGNANLKPSDNYNVDLKWDFNPTPTELISLTAFYKLIKNPISRIE 
VASAGGYLSYENIADKATVAGVEVEVRKNLFVRPLSSAANGMNKLSFGLNGSYIYTNAKM PLATVTTGSQLEGAAPWIANFDLSHNFTKGNHSFINTLVLNYVSDKIYTIGTQGYQDIME QGMMTLDFVSQAKLNKYVSLTLKARNLLNPSYKLSRKANESGEKVVLSDYKKGINISLGV SCTF >gi|226332215|gb|ACIC01000105.1| GENE 48 72421 - 73317 761 298 aa, chain - ## HITS:1 COG:AGl1135 KEGG:ns NR:ns ## COG: AGl1135 COG2207 # Protein_GI_number: 15890685 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 26 296 36 304 313 125 30.0 8e-29 MNTPIPNQIIREITPLSDKDCFYIAERYKTEFTYPIHNHSEFELNFTEKAAGVRRVVGDS SEVIGDYDLVLITGKDLEHVWEQNECRSKEIREITIQFSSDLFFKSFINKNQFDSIRRML DKAQKGLCFPMSAILKIYPLLDTLASEKQGFYAVIKFMTILYELSLFEEEARTLSSSSFA KIDVHSDSRRVQKVQEYINLHYQEEIRLGQLASMVGMTDVSFSRFFKLRTGKNLSDYIID IRLGFASRLLVDSTMSIAEICYECGFNNLSNFNRIFKKKKDCSPKEFRENYRKKKKLI >gi|226332215|gb|ACIC01000105.1| GENE 49 73378 - 74709 1250 443 aa, chain - ## HITS:1 COG:VC0265 KEGG:ns NR:ns ## COG: VC0265 COG0668 # Protein_GI_number: 15640294 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Vibrio cholerae # 51 437 26 410 412 349 45.0 8e-96 MKEVIDTVSVALANGDTENVGNVVMQEVNRTLLSAGVDEVWADKIDNLIVLLFIIGIALL ANIICRKIILRVVAKLVKQTKATWDDIVFNHKVMVNVSRMVAPILIYIAIPIAFPEHADS DLLDFLRRLCLIYIIAVFLRFISALFTAVYQVYSEREQYRDKPLKGLLQTAQVILFFIGA IIIISILINQSPMVLLTGLGASAAILMLVFKDSIMGFVSGIQLSANNMLKVGDWITMPKY GADGTVIEVTLNTVKVRNFDNTITTIPPYLLISDSFQNWQGMQESGGRRVKRSINIDMTS VRFCTPEMLEKYRKIQLLANYVDETEKVVEEYNKEHDIDNSVLVNGRRQTNLGVFRAYLT NYLRSLPTVNQDMTCMVRQLQPTETGIPLELYFFSANKVWVAYEGIQADVFDHVLAIIPE FGLRVFQNPSGEDLRRIGVKIEH >gi|226332215|gb|ACIC01000105.1| GENE 50 74760 - 75761 780 333 aa, chain - ## HITS:1 COG:no KEGG:BT_4052 NR:ns ## KEGG: BT_4052 # Name: not_defined # Def: putative ABC transporter ATP-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 333 297 629 629 665 100.0 0 MATSLRDNLTSSYFNAAHKLYSKKARRRIIAYVESYDDVAFWRTLLEEFEDDEHYFQVML 
PSATSLAKGKKMVLMNTLNTAELGRSLIACVDSDYDFLLQGATNTSRKINRNKYIFQTYT YAIENYHCFAESLHEVCVQATLNDRFILDFNAYLKRYSEIVYPLFLWNVWFYRQRDTYTF PMYDFHTYTALREISLKHPEHSLEALQHRVNQKLAELKKRFPGSVNQVNGLRSELKELGL VPETTYLYMQGHHVMDNVVMKLLIPVCTALRREREQEIKRLAEHNEQFRNELTCYQNSQV NVEIMLKKNVAYKRLFHYDWLRQDIQEYLAKGE >gi|226332215|gb|ACIC01000105.1| GENE 51 75775 - 76650 922 291 aa, chain - ## HITS:1 COG:STM2746 KEGG:ns NR:ns ## COG: STM2746 COG3950 # Protein_GI_number: 16766058 # Func_class: R General function prediction only # Function: Predicted ATP-binding protein involved in virulence # Organism: Salmonella typhimurium LT2 # 162 266 302 410 427 62 33.0 8e-10 MEQQADYIRRIEIKGLWARFNIQWDLRPDVNILSGINGVGKTTILNRSVGYLEQLSGDIQ LSGEMKSDAKNGVHLFFDNPAATYIPYDVIRSYDRPLIMGDFTARMADKNVKSELDWQLY LLQRRYLDYQVNIGNKMIEMLSSNDEEQRSKAATLSLAKRRFQDMIDELFSYTRKKIDRR RNDIAFYQDGELLFPYKLSSGEKQMLVILLTVLVQDNSHCVLFMDEPEASLHIEWQQKLI SMIRELNPNVQIILTTHSPAVIMEGWLDAVTEVSDISTEADGFRLPPTINC >gi|226332215|gb|ACIC01000105.1| GENE 52 76757 - 77674 789 305 aa, chain + ## HITS:1 COG:BH0390 KEGG:ns NR:ns ## COG: BH0390 COG0697 # Protein_GI_number: 15612953 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus halodurans # 4 290 2 291 311 66 22.0 5e-11 MGNNKNLHGHLFALTANVMWGLMSPIGKSALQEFSAISVTTFRMVGAAAAFWILSIFCKQ EHVDHRDMLKIFFASLFALVFNQGIFIFGLSMTSPIDASIVTTTLPIVTMIVAAIYLKEP ITNKKVLGIFVGAMGALILIMSSQAASSGNGSLIGDLLCLVAQISFSIYLTVFKGLSQRY SAVTINKWMFIYASMCYIPFSYQDISVIQWTSIPMVAILQVLYVVLGGSFLAYLCIMTAQ KLLRPTVVSMYNYMQPIVATIAAILMGIGSFGWEKGIAITLVFLGVYIVTQSKSKADFEK AGKEL >gi|226332215|gb|ACIC01000105.1| GENE 53 77796 - 81824 3723 1342 aa, chain - ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 39 1117 7 984 1087 586 34.0 
1e-166 MNKTLLTGLLCCSLSIQSFADQPLEGFTYGSVNAPTGKEWESPENLALNKEQPHAYFFPF QHLDNARKVLPENSKYWQSLDGDWKFHWAPDPDSRPKDFYQTEYDVSSWDAIPVPSSWNI YGIQKDGSQKYGTPIYVNQPVIFQHSVKVDDWRGGVMRTPPANWTTYKDRNEVGSFRRDF EIPQDWDGREVFISFDGVDSFFYLWINGQYVGFSKNSRNTANFNITPYLQKGKNTVAAEV YRSSDGSFLEAQDMFRLPGIFRTVALYSVPKVHFRDLVATPDLDVTYTDGSLTVNAEIRN LDKKAIKDYKVYYSLYANKLYSDENTLVDGFLSPVIDKIAPNETGSVQTVLKVKAPNKWS AEFPYRYTLVAELKDKKNRTVEMVSTIVGFRKVEIKDTPASEDEFGLAGRYYYVNGKTVK LKGVNRHESNPGVGHAITREMMEKEIMLMKCANINHVRNSHYPDDPYWYFLCNKYGIYLE DEANIESHEYYYGAASLSHPVEWKNAHVARVMEMVRANVNNPSIVIWSLGNEAGPGKNFV AAYDALKKFDTSRPVQYERNNDIVDMGSNQYPSIGWVRGAVQGKYDIKYPFHISEYAHSM GNACGNLIDYWEAMESTNFFCGGAIWDWVDQSMYNYDPKTGVRYLAYGGDFGDTPNDGQF VMNGIVFGDLDPKPQYYEVKKVYQHIGVKAIDTEKGVFEIFNKYYFKNLAEDYQLVYSLY EDGKPIMTGKPMDINIAPRQRAQITLPYDHASLKKDAEYFMKLQFILKDQRPWAAKGFPM AEEQILIKEATDRPSISEVTAGAAKQDGFVLDKDTKRILIKGADFEAIFDPQTGSIYSLK YGNETVIADGNGPKLDALRAFTNNDNWFYAPWFEHGLHNLIHKATEYKVLNKGNGTLVLS FTVESQAPNAARIKGGTSSGKNSIEELTDRKFGSNDFKFVTNQIWTVYPDGSIELQSSIT SNRSSLVLPRLGYVMKVPQQYSNFTYYGRGPIDNYADRKSGQFIEQYTNSVAGEFVNFPK PQDMGNHEDVRWCALTNQAGNGAVFVATDRLSASALQYSALDLILASHPYQLPKAGDTYL HLDCAVTGLGGNSCGQGGPLVHDRVFANQHSMGFIIRPAGKELSVVANVAPAGDLPLSIT RTPAGMVELTSAKKDAVICYSIDGSKKVQEYTEPVPMRNGGTIKAWYKDSKDISSTMKFE KIESIQTQVVYASSQESGEGDASHLTDGDPNTIWHTMYSVTVAKYPHWVDLDAGEVKEIK GFTYLPRQNGGNGNIKDYSIQVSMDGKEWGEPVNKGTFARDSKEKRVLFDKPVKARYIRF TALSEQNGQDFASGAEITILAN >gi|226332215|gb|ACIC01000105.1| GENE 54 81920 - 83014 1127 364 aa, chain - ## HITS:1 COG:BH2954 KEGG:ns NR:ns ## COG: BH2954 COG1703 # Protein_GI_number: 15615516 # Func_class: E Amino acid transport and metabolism # Function: Putative periplasmic protein kinase ArgK and related GTPases of G3E family # Organism: Bacillus halodurans # 33 364 4 335 340 336 48.0 3e-92 MEHPENNEEYKGLVVNKGIEQPSSVNPYLKRKPKKRQLSVAEFVEGIVKGDVTILSQAVT LVESVKPEHQAVAQEVIEKCLPYSGNSIRVGISGVPGAGKSTSIDVFGLHVLEKGGKLAV LAIDPSSERSKGSILGDKTRMEQLSVHPKSFIRPSPSAGSLGGVARKTRETIVLCEAAGF DKIFVETVGVGQSETAVHSMVDFFLLIQLAGTGDELQGIKRGIMEMADGIVINKADGDNL 
EPAKLAASQFRNALHLFPAPESGWIPKVLTYSGFYNIGVKEIWDMVYEYIDFVKKNGYFD YRRNEQSKYWMYETINEQLRDSFYHNPRIEAMLQEKEQQVLQGNLTSFIAAKSLLDTYFD ELKR >gi|226332215|gb|ACIC01000105.1| GENE 55 83026 - 84093 963 355 aa, chain - ## HITS:1 COG:no KEGG:BT_4048 NR:ns ## KEGG: BT_4048 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 355 19 373 373 654 100.0 0 MKRTLLLIFTLLTVTLAAVAQPRISSNKETHNFGQIEWKKPVTVEYTITNTGNEPLVLTN VTTSCACAVADWTKEPIAPGAKGTVKASFDAKALGRFDKSIGIYSNATPNLVYLKFTGEV VQEIKDYSKLLPYAIGNIRIDRDEFSFPDVYRGQQPSMTFSIANLSDRPYEPVLMHLPPY LKMEAEPKVLLKGKKGTIKLTLDASQLQDFGLTQTSVYLSRFSGDKVSEDNEIPVSAILL PDFSRMTEKDSLNAPSIHISETNIDLSIPLIKKNKVSHDILIANAGKTPLVISKLQVFNS SVGVRLKKTVIPPDGMTKLKVTIHKRDVGNKKHHLRILMITNDPLRPKVEINIKR >gi|226332215|gb|ACIC01000105.1| GENE 56 84329 - 90097 4696 1922 aa, chain - ## HITS:1 COG:TM0984 KEGG:ns NR:ns ## COG: TM0984 COG2373 # Protein_GI_number: 15643744 # Func_class: R General function prediction only # Function: Large extracellular alpha-helical protein # Organism: Thermotoga maritima # 470 682 149 354 1536 62 22.0 8e-09 MKIRQICMLVLLCLGVIPAVQAQTYDKLWKQVEQAEKKSLPETVIKLTGEIYQKGEKEKN SPQMLKAYMWRMRYRDMLTPDSLYTNLNGLELWAKQTDQPMDRAVLHSLIAEIYSMYAFN NQWQLRQRTEIVGEAPSADMREWTANMFVEKVRTNVREAMADSVLLLNTSSRTYIPFVEL GETSEYYHHDMYHLLATRGIVALQQVAGLDRATPVEDISEDSSAEKESSVKQDIIAIYGN MIAAYKASGLKEGYVLTALSYLEWRRDSDRNIRPFGLKKGLSGLTEDTYVTALNELKSRF KSESICAEVYLAQARYAIEKEQQTSALQLCDEAIRLYPGYRRINALKNLREDILSPFLNV TAAATAFPGEEIEIRASHKNLDGFTLRLYQAKKLIKEQHFAVLRPEDYRTQDTVFTFKAP EVGQYVMRIVPDIRAKRDSESKFNVTRFKVLTCRLPGNQYEVVTLDGQTGHPIPNAKITL YTNDEKVLQEYTTGADGKVVFPWKSEYRYLKAAKGIDTGMPFQSIYGGSYGYYGDENKVS EGMTLLTDRSLYRPGQTVYVKGIAYVQKSDTANVLPNKDYTVVLMDANNQEVGQKAVRTN EFGSFTVDFALPSACLNGMFSLKAGEGRTNIRVEDYKRPTFDITFDKQQGSYQLGDKVEV KGKIQSYSGVLLQDLPVKYTVKRSVVDLWRLAESTQIASGEVIANEDGEFTIPVFLEENN AYKNNTRIYYRYSIEATATNVAGETQSSADVIAAGNRSLILQTELKEKTCKNQPFNTVFK VQNLNGQPVEVKGTFSLYQAKDADFKQLDENPAATGTFTSNEEMTIEWGNLPSGPYVLKA 
VVKDGQGKEVTAEANTILFSREDNRPPVETTVWFYEANTEFDATHPAVFCFGTSKKDAYV MMNVFSGNKLLESKTMNLSDTIVHFKYPYQESYGDGVLVSFCMVRDGQVYQEQARLKKRL PDKTLTMKWEVFRDKLRPGQKEEWKLTIKTPQGQAAEAEMLATMYDASLDKIWNRKQNFS VFYNQIIPYSSWMGGYFGNNSFNYWWNNKYFKVPAMEYDYFVTQGTMSMDQALNGMVPGV MVRGYGVQKQASMTGSMMIRGVSRSKAEAKYVPALVGSVAEDAVFESELVSVEAGKQDSA SEEETLPEAPADLRTNLAETAFFYPQLRTNEQGEVSFSFTMPESLTRWNFRGYSHTKGML MGTLDGEATTSKEFMLTPNLPRFVRVGDKTSVAASISNMIGKPQAGTVSMILFDPMTEKV ISTQKQKFTVEAGKTIGVSFMFTVSDKYEILGCRMIADSGTFSDGEQQLLPVLSNKEHLV ETLPMPVRGEETRTFSLDSLFNHHSKTATDRKLTVEFTGNPAWYAIQALPSLSLPENNNA ISWATAYYANTLASYIMNSQPRIKAVFDSWKLQGGTKETFLSNLQKNQEVKNILLSESPW VLEAQTEEQQKERIATLFDLNNIRSNNIAALTRLQELQNSSGAWSWYKGMTGSRYVTTYI AELNARLAMMTGEQPSGTALALQKNAFTYLHQEALKEYREILKAQKDGVKFTGISGSILQ YLYLIALSGEQVPASNKAAYTYYLSKIGEMLPTASMDTKAIAAIILDKAGRKKEAQEFIA SLKEHLTKTDEQGMFFAFNENPYSWGGMKMQAHVDVMEALELTGGNNDTVEEMKLWLLKQ KQTQQWNSPVATADAVYALLMKGVNLLDNQGDVRIVIANEVLETVSPSKTTVPGLGYIKR SFTQKNVVDARKIEVEKRNPGIAWGAVYAEYESPIKDVRQQGGELNVQKQLYVERMVNNA PQLQPITEKTVLQVGDKVVSRLSIRADRAMDFVQLKDQRGACFEPIGSVSGYNWSNGIGY YVDIKDASTNFFFDHLGKGVYVLEHSYRISRVGTYETGLATMQCAYAPEYASHSASMTII IK >gi|226332215|gb|ACIC01000105.1| GENE 57 90965 - 92521 1020 518 aa, chain + ## HITS:1 COG:no KEGG:BT_4046 NR:ns ## KEGG: BT_4046 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 518 1 518 518 1026 99.0 0 MDERFRRYPIGIQNFEDLRNNGYVYVDKTELIYRLTNTNKVYFLSRPRRFGKSLLVSTLE AYFSGKKDLFNGLAMETLEKEWAVYPVLHIDFSVSKYMTAEALSAVINYQLLQWEKLYGR EEGETTFSLRLKGIIQRAYTQTGKPVVILVDEYDAPMLDSNNNTELQIEIRDIMRDFFSP LKAQGQYIRFLFITGISKFSQMSIFSELNNLQNISMWDEYSAICGITEQEVHTLQIDIEQ MAQANNETYEEACTHLKQQYDGYHFSENSEDIYNPFSLFNAFAQKKYSNFWFSTGTPTFL IDLLQESNFDIRELDDTTATAEQFDAPTNRITDPLPVLYQSGYLTIKGYDPSFQLYTLAY PNKEVRKGFLESLMPAYVHLPARENTFYIVSFIKDLRAGNLDGCLERLKSFFASIPNKLN NKEEKHYQTIFYLFFRLMGQYIDVEVDTAIGRADAIVKLQDTLYVFEFKVDGTPEEALAQ INSKEYAIPYQVGDWKIIKVGVNFDSATRTIGTWKTER Prediction of potential genes in microbial genomes Time: Thu May 12 01:56:12 2011 Seq name: 
gi|226332214|gb|ACIC01000106.1| Bacteroides sp. 1_1_6 cont1.106, whole genome shotgun sequence
Length of sequence - 87030 bp
Number of predicted genes - 76, with homology - 74
Number of transcription units - 32, operones - 19 average op.length - 3.3

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
        - Term 278 - 331 15.1
1 1 Op 1 . - CDS 368 - 1825 1739 ## COG2195 Di- and tripeptidases
2 1 Op 2 . - CDS 1850 - 2836 559 ## BT_4044 putative dolichol-P-glucose synthetase
        - Prom 2884 - 2943 4.7
        + Prom 2751 - 2810 2.9
3 2 Op 1 . + CDS 2907 - 3710 904 ## COG0030 Dimethyladenosine transferase (rRNA methylation)
4 2 Op 2 . + CDS 3764 - 5104 1471 ## COG2239 Mg/Co/Ni transporter MgtE (contains CBS domain)
        + Prom 5113 - 5172 3.9
5 2 Op 3 . + CDS 5193 - 7067 1934 ## BT_4041 hypothetical protein
        + Term 7106 - 7147 2.0
        + Prom 7878 - 7937 3.1
6 3 Op 1 . + CDS 7973 - 9316 1066 ## BT_4040 putative galactose oxidase precursor
7 3 Op 2 . + CDS 9356 - 12457 2534 ## BT_4039 hypothetical protein
8 3 Op 3 . + CDS 12482 - 14323 1288 ## BT_4038 hypothetical protein
        + Term 14407 - 14446 4.0
        + Prom 14526 - 14585 6.8
9 4 Tu 1 . + CDS 14791 - 15048 179 ## BT_4037 hypothetical protein
        + Term 15065 - 15109 4.2
        + Prom 15058 - 15117 3.6
10 5 Op 1 . + CDS 15138 - 15686 433 ## BT_4036 hypothetical protein
11 5 Op 2 . + CDS 15704 - 16024 297 ## BT_4035 hypothetical protein
        + Prom 17214 - 17273 2.3
12 6 Tu 1 . + CDS 17298 - 17504 162 ##
        + Term 17577 - 17634 3.4
        - Term 17655 - 17705 11.2
13 7 Tu 1 . - CDS 17743 - 18219 574 ## BT_4032 putative non-specific DNA-binding protein
        - Prom 18302 - 18361 2.4
        - Term 18335 - 18375 7.9
14 8 Op 1 . - CDS 18413 - 18523 106 ## gi|253570192|ref|ZP_04847601.1| predicted protein
15 8 Op 2 . - CDS 18614 - 19117 371 ## COG3023 Negative regulator of beta-lactamase expression
16 9 Tu 1 . - CDS 19294 - 19548 220 ## BT_4030 hypothetical protein
        + Prom 19930 - 19989 2.4
17 10 Tu 1 . + CDS 20040 - 20318 164 ##
18 11 Op 1 . - CDS 20558 - 20830 207 ## BT_4028 hypothetical protein
19 11 Op 2 . - CDS 20827 - 21054 322 ## BT_4027 hypothetical protein
20 11 Op 3 . - CDS 21081 - 22073 814 ## BT_4026 hypothetical protein
21 11 Op 4 . - CDS 22070 - 22366 317 ## BT_4025 hypothetical protein
22 11 Op 5 . - CDS 22388 - 22648 303 ## BT_4024 hypothetical protein
23 11 Op 6 . - CDS 22645 - 22791 102 ## BT_4023 transposase
        + Prom 23068 - 23127 4.0
24 12 Op 1 . + CDS 23153 - 24382 917 ## COG4974 Site-specific recombinase XerD
25 12 Op 2 . + CDS 24402 - 25619 671 ## BDI_0742 integrase
        + Term 25704 - 25764 0.5
        - Term 25906 - 25948 1.0
26 13 Tu 1 . - CDS 25956 - 26978 827 ## COG3177 Uncharacterized conserved protein
        - Prom 27066 - 27125 3.5
27 14 Op 1 . - CDS 27160 - 27486 288 ## BDI_0745 hypothetical protein
28 14 Op 2 . - CDS 27483 - 27860 297 ## BDI_3258 hypothetical protein
        - Prom 27911 - 27970 2.9
29 15 Op 1 . + CDS 28297 - 28857 371 ## BDI_2239 hypothetical protein
30 15 Op 2 . + CDS 28903 - 29322 337 ## BDI_2238 hypothetical protein
31 16 Op 1 . - CDS 29436 - 29741 178 ## BDI_0745 hypothetical protein
32 16 Op 2 . - CDS 29772 - 30083 310 ## BDI_0746 hypothetical protein
        + Prom 30476 - 30535 4.3
33 17 Tu 1 . + CDS 30662 - 31108 235 ## BT_3141 hypothetical protein
        + Prom 31267 - 31326 3.1
34 18 Op 1 . + CDS 31346 - 31720 234 ## BDI_3248 mobilization protein BmgB
35 18 Op 2 . + CDS 31717 - 32673 727 ## BVU_3726 mobilization protein BmgA
36 18 Op 3 . + CDS 32728 - 33537 569 ## PGN_0926 hypothetical protein
        + Term 33611 - 33655 4.0
        + Prom 33597 - 33656 4.5
37 19 Tu 1 . + CDS 33676 - 34872 400 ## BDI_0013 integrase
        + Term 35012 - 35054 -0.9
        + Prom 34942 - 35001 5.9
38 20 Op 1 . + CDS 35204 - 36493 507 ## BDI_0751 hypothetical protein
39 20 Op 2 . + CDS 36495 - 36977 438 ## BDI_0752 hypothetical protein
40 20 Op 3 . + CDS 36998 - 38470 798 ## GFO_2613 hypothetical protein
41 20 Op 4 . + CDS 38476 - 41862 1707 ## BDI_0754 hypothetical protein
42 20 Op 5 . + CDS 41868 - 42863 446 ## gi|212692359|ref|ZP_03300487.1| hypothetical protein BACDOR_01855
43 20 Op 6 . + CDS 42867 - 44549 597 ## BDI_0756 hypothetical protein
44 20 Op 7 . + CDS 44552 - 45205 149 ## BDI_0757 hypothetical protein
45 20 Op 8 . + CDS 45278 - 45598 217 ## gi|212692356|ref|ZP_03300484.1| hypothetical protein BACDOR_01852
        + Prom 45604 - 45663 3.9
46 21 Op 1 . + CDS 45743 - 45961 62 ## BDI_0757 hypothetical protein
47 21 Op 2 . + CDS 45965 - 46741 32 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain
        + Term 46836 - 46885 -0.8
        + Prom 46860 - 46919 4.3
48 22 Tu 1 . + CDS 46995 - 47234 88 ## FIC_01982 SNase-like nuclease
        - Term 47485 - 47520 1.3
49 23 Tu 1 . - CDS 47755 - 48522 372 ## BT_4009 integrase
        - TRNA 48989 - 49062 49.9 # Gln CTG 0 0
        + Prom 49079 - 49138 8.1
50 24 Op 1 . + CDS 49327 - 49686 331 ## COG0799 Uncharacterized homolog of plant Iojap protein
51 24 Op 2 . + CDS 49696 - 51840 1286 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9
52 24 Op 3 . + CDS 51857 - 52699 790 ## COG0575 CDP-diglyceride synthetase
        + Term 52722 - 52766 1.2
        - Term 52698 - 52762 16.3
53 25 Op 1 . - CDS 52812 - 53588 841 ## BT_4005 hypothetical protein
        - Prom 53618 - 53677 5.0
        - Term 53617 - 53659 3.3
54 25 Op 2 . - CDS 53680 - 54816 1091 ## COG0763 Lipid A disaccharide synthetase
55 25 Op 3 . - CDS 54868 - 55647 618 ## COG0496 Predicted acid phosphatase
        - Prom 55809 - 55868 7.1
        + Prom 55619 - 55678 3.2
56 26 Op 1 25/0.000 + CDS 55825 - 56772 918 ## COG1192 ATPases involved in chromosome partitioning
57 26 Op 2 . + CDS 56784 - 57674 1077 ## COG1475 Predicted transcriptional regulators
58 26 Op 3 . + CDS 57675 - 58562 564 ## BT_4000 hypothetical protein
59 26 Op 4 . + CDS 58596 - 59891 1169 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains)
60 26 Op 5 . + CDS 59964 - 62207 2250 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases
61 27 Op 1 . - CDS 62209 - 62550 376 ## COG0789 Predicted transcriptional regulators
62 27 Op 2 . - CDS 62572 - 63540 914 ## COG0739 Membrane proteins related to metalloendopeptidases
        - Prom 63608 - 63667 3.2
        + Prom 63569 - 63628 3.4
63 28 Tu 1 . + CDS 63648 - 66266 2880 ## COG0013 Alanyl-tRNA synthetase
        + Term 66289 - 66349 16.0
        + Prom 66419 - 66478 6.8
64 29 Op 1 . + CDS 66498 - 68729 1852 ## COG3537 Putative alpha-1,2-mannosidase
65 29 Op 2 6/0.000 + CDS 68758 - 69312 535 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
66 29 Op 3 . + CDS 69365 - 70360 879 ## COG3712 Fe2+-dicitrate sensor, membrane component
67 29 Op 4 . + CDS 70392 - 72677 2048 ## COG3537 Putative alpha-1,2-mannosidase
68 29 Op 5 . + CDS 72701 - 74971 1959 ## COG3537 Putative alpha-1,2-mannosidase
        + Term 75079 - 75128 14.2
        + Prom 75077 - 75136 2.8
69 30 Tu 1 . + CDS 75163 - 75597 253 ## BT_3989 hypothetical protein
        - Term 75772 - 75813 9.4
70 31 Op 1 . - CDS 75845 - 77179 977 ## BT_3988 putative peptidoglycan bound protein
71 31 Op 2 . - CDS 77210 - 78640 1144 ## BT_3987 endo-beta-N-acetylglucosaminidase F1 precursor
72 31 Op 3 . - CDS 78674 - 79828 830 ## BT_3986 putative patatin-like protein
73 31 Op 4 . - CDS 79837 - 80913 931 ## BT_3985 hypothetical protein
74 31 Op 5 . - CDS 80938 - 82551 1674 ## BT_3984 hypothetical protein
75 31 Op 6 . - CDS 82568 - 85975 2868 ## BT_3983 hypothetical protein
        - Prom 86077 - 86136 10.8
76 32 Tu 1 .
- CDS 86177 - 87028 549 ## BT_3982 exonuclease V subunit alpha Predicted protein(s) >gi|226332214|gb|ACIC01000106.1| GENE 1 368 - 1825 1739 485 aa, chain - ## HITS:1 COG:VC2279 KEGG:ns NR:ns ## COG: VC2279 COG2195 # Protein_GI_number: 15642277 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Vibrio cholerae # 2 485 50 533 534 461 47.0 1e-129 MSTILSLAPQNVWKHFYSLTQIPRPSGHMEKVTEFLINFGKGLGLESFVDEAGNVIIRKP ATPGMENRKGVILQAHMDMVPQKNNDTVHDFEKDPIETYIDGDWVKAKGTTLGADNGLGV AAIMAVLEANDLKHGPLEALITKDEETGMYGAFGLKPGTVNGEILLNLDSEDEGELYIGC AGGMDVTATLEYKEVAPEEGDVAVKVSLKGLRGGHSGLEINEGRANANKLLVRFVREAVA SYEARLVSWDGGNMRNAIPREACAVLAIPAENEEELLGLVKYCEDLFNEEFSAIETPISF TAERVELPAGEVPEEIQDNLIDAIFACQNGVTRMIPTVPDTVETSSNLAIITIGGGKAEI KILARSSSDSMKEYLTTSLESCFSMAGMKVEMTGGYSGWQPNVESPILHAMKESYKQQFG VEPAVKVIHAGLECGIIGAIIPGLDMISFGPTLRSPHSPDERALIPTVQKFYDFLVATLA QTPTK >gi|226332214|gb|ACIC01000106.1| GENE 2 1850 - 2836 559 328 aa, chain - ## HITS:1 COG:no KEGG:BT_4044 NR:ns ## KEGG: BT_4044 # Name: not_defined # Def: putative dolichol-P-glucose synthetase # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 328 14 328 328 535 100.0 1e-150 MKKLLKKTLKLILPIVLGGFILYWVYRDFDFSRVGEVLRHGTNWWWMLFSLLFGVLAQVF RGWRWRQTLEPLDAFPRRSDCVNAIFISYAASLVVPRIGEVSRCGVLAKYDNVSFAKSLG TVVTERLVDTLTIFLITGITVLLQMPVFVTFLENTGTKIPSFAYLLTSVWFYIVLFCFIG VVVLLYYLRKTLFFYERVKGFVLNIWEGIMSLKGVRNIPLFIFYTLAIWACYFFHFYFTF YCFAFTAHLGILAALVMFVGGTFAVIVPTPNGAGPWHFAIMEMMMLYGVNVTDAGIFALI VHGIQTFLVVLLGVYGLAALPFTNRQRK >gi|226332214|gb|ACIC01000106.1| GENE 3 2907 - 3710 904 267 aa, chain + ## HITS:1 COG:PA0592 KEGG:ns NR:ns ## COG: PA0592 COG0030 # Protein_GI_number: 15595789 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Dimethyladenosine transferase (rRNA methylation) # Organism: Pseudomonas aeruginosa # 5 257 8 263 268 166 38.0 6e-41 MKLVKPKKFLGQHFLKDLKVAQDIADTVDTFPELPVLEVGPGMGVLTQFLVKKDRLVKVV EVDYESVAYLREAYPSLEDHIIEDDFLKMNLHRLFDGKPFVLTGNYPYNISSQIFFKMLE 
NKDIIPCCTGMIQKEVAERIAAGPGSKTYGILSVLIQAWYKVEYLFTVSEHVFNPPPKVK SAVIRMTRNDTKELGCDEKLFKQVVKTTFNQRRKTLRNSIKPILGKDCPLTEDALFNKRP EQLSVEEFISLTNQVEEALKTATASGN >gi|226332214|gb|ACIC01000106.1| GENE 4 3764 - 5104 1471 446 aa, chain + ## HITS:1 COG:BH0511 KEGG:ns NR:ns ## COG: BH0511 COG2239 # Protein_GI_number: 15613074 # Func_class: P Inorganic ion transport and metabolism # Function: Mg/Co/Ni transporter MgtE (contains CBS domain) # Organism: Bacillus halodurans # 9 441 14 441 452 232 31.0 1e-60 MNEEYIDNVKHLIEQKDADKVKELLVDLHPADIAELCDDLNPEEARFVYRLLNNETAADV LVEMDEDVRKEFLEMIPSETIAKRFVDYMDTDDAVDLMRDLDEDKQEEILSHIEDIEQAG DIVDLLKYDENTAGGLMGTEMVVVNENWSMPECLKEMRQQAEELDEIYYVYVIDDDERLR GIFPLKKMITSPSVSKVKHVMQKDPISVHVDTSIDEVAQTIEKYDLVAIPVVDSIGRLIG QITVDDVMDEVREQSERDYQLASGLSQDVETDDNVLRQTTARLPWLLIGMIGGIGNSMIL GNFDATFAAHPEMALYIPLIGGTGGNVGTQSSAIIVQGLANSSLDAKNTFKQVTKEAVVA LINATIISLLVYTYNFIRFGATATVTYSVSISLFAVVMFASIFGTLVPMTLEKLKIDPAI ATGPFIAITNDIIGMMLYMGITVLLS >gi|226332214|gb|ACIC01000106.1| GENE 5 5193 - 7067 1934 624 aa, chain + ## HITS:1 COG:no KEGG:BT_4041 NR:ns ## KEGG: BT_4041 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 624 1 624 624 948 100.0 0 MDAHDTNLPLNQGELEEEKKTVEVSEAITETPTEEGTEKVTEEVTAEVQTEAAPKPATKE DVLNRLKELAQDSENANKQEIDNLKQSFYKLHNAELEAAKKQFIDNGGAAEDFTPKDDPT EEEFKRLMGSIKEKRSQLVAEQERQKEENLQVKLSIIEELKELVESGDDANKSYTEFKKL QQQWNETKLVPQGKVNELWKNYQLYVEKFYDLLKLNNEFREYDFKKNLEIKTHLCEAAEK LADEEDVVSAFHQLQKLHQEFRDTGPVAKELRDEIWNRFKAASTAVNRRHQQHFEALKES EQHNLDQKTVICEIVEAIEYDELKTFSAWENKTQEVIALQNKWKTIGFAPQKMNVKIFER FRHACDDFFKKKGEFFKALKEGMNENLEKKKALCEKAEALKDSTDWKATADALTKLQKEW KTIGPVAKKYSDAVWKRFITACDYFFDQKNKATSSQRSVEIENMEKKKALIEKLASIDEN MDTDEASTLVRELMKEWNTIGHVPFKEKDKLYKQYHGLIDQLFDRFNISASNKKLSNFRS NISNIQGGGSQSLYREREKLVRIYENMKNELQTYENNLGFLTSSSKKGSSLLTELNRKVD KLKADLELVLQKIKVIDESLKAEE >gi|226332214|gb|ACIC01000106.1| GENE 6 7973 - 9316 1066 447 aa, chain + ## HITS:1 COG:no KEGG:BT_4040 NR:ns ## KEGG: BT_4040 # Name: not_defined # Def: 
putative galactose oxidase precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 447 1 447 447 888 99.0 0 MKKCFELTPCLLAFLCITSCVEDVDNERSKGMFSASQSGFSTIEVYDLGPLNKIDLSIAK AGLEDTGGTVTFSVEQSLLDSLNNADGTSYQLLPPECYTIENATYSVSPGGDRRVTGGSL VYDPHKIYNLCGFDELKYVLPLRVSSTGTPMNSDRTATLYGFIVKEAIVRLASTGVDFVV EGGTTTSPMNLDTEISFENEWDLTTSFATKGADYVDNYNSANSTYYIPLPSDFYTLPDVT IAKGKTTGGSAISLLGETLPPGNYLLPVSLGGLSGQGDASINIDKETVANYFVIKQGGPI NRDHWTVTANTEEKTGEVSAAYPHNGQTISLIDGEINTFWHSKWQGGEVAPPYEIILDMG KENSVSQLGLIARQNALTSMNLEIYAGDDGETWNLIGKYFFDGSIKTEQMVPVKACDARW IRIYIPTLGSGSSVGHLAEIYVYGTDK >gi|226332214|gb|ACIC01000106.1| GENE 7 9356 - 12457 2534 1033 aa, chain + ## HITS:1 COG:no KEGG:BT_4039 NR:ns ## KEGG: BT_4039 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1033 1 1033 1033 2047 100.0 0 MNKILKVMCFILFALFAGINTIYAQGQSPTGLVVDENGQPMIGVQIKIEGTTVGAITDVD GNFTIKAQKGNILLFSYVGYEQQKITYKGEKVLAIKMLPSTEMLEDVVVIGYGKQKKNSV VSSINAIGPKELSVSSSRNMTNNLAGQVPGLIAVQRSGEPGYDNADFWIRGQSSFKGGTN PLVLVDGVPRQMQDIEADEIESFTLLKDAAATAVYGAEGANGVILITSKRGNSQKPKISF RAEGTILEPTRLPTFMNSVETLNLYNEALNNEGTASIRTDEEIAKYGPGADRDLYPDTDW LGTMLRNHTYNMRYTLNVRGGTERARYFVSGAFYQENGIFKDFGNDYDNNIGLKRYNLRS NIDFDATKTTTVKVDLSGQYLQTNYPGTGTSTIFTTMCRTPSYIMPAVYSDGTVAGHPRP SGNRSNPYNLLVNSGYAKEWRTSIQSKVEVDQKLDFLTKGLTWKGLISFDADMSYIAKRT KTPTQYLATGRDENGKLLYKTVVEGSDVLSENLSNSSNKKIYFETAFNYNRTFAEKHDVG AMFLYMQKETQYHNNALPYRKQGLVGRVTYGYDGRYFLEGNFGYTGSETFAKGYRFGFFP AMGLAWYISNEHYYPEALKKVVSKLKLRFSIGRTGNDDTGGNRFLYRGTMKQDNSGYNIG FSNSGALGGVGNGITEAQFESPFLSWEIEDKRNFGIDLGLFDNRIDLQVDYFNNKRKDIL LQRNTVSNVTGFQQMPWQNFGIVKNQGVDASLNLNYKVGEVNLSARGNFTFARNEILEYD QVPQVYPWLEKKGTRLNSWKLYIADGLYTADDFNITGEGLNRQYELKPGVVSGLSSGVRP GDIKYKDLNGDGKIDSNDQMEDVGNPSVPEIVYGFGLNAEWKGAYVGIFFQGSGNTSTVL GASSNGAFFPFQWGVEESAVRSEVANRWTEQNPSQNVMFPRMHSTNYDNNTAASTWWLRN ASFLRLKNIEVGYNFKEKTLKKIGIQALRVYLQGNNLCVWDDIKMWDPELGNTNGGFSYP LSRTFTFGLDFTF >gi|226332214|gb|ACIC01000106.1| GENE 8 12482 - 14323 1288 613 aa, chain + ## HITS:1 COG:no KEGG:BT_4038 
NR:ns ## KEGG: BT_4038 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 613 1 613 613 1238 100.0 0 MKTLKKLSIIVAVGLSTIFFSGCSDYLDVSDDLAAELSMEEVFNNTGYARRFHRYIYSGI PDVSNIIITSSYAALTGLDNPWPAVSDELKSAQNNVKTIPTVGYHAGSADLSRWSLYKQI RQANEFLAYSHTIPQQGDVTDYIDEDELARLKNEARFLRAYYHYLLFELYGPIPIMTEIS EPSASDLDYYRNSVDEVVQFIDSELNECYNELPEKEVNDDGTPNENRSAAPTKGAALAIL AKLHVYAASPLLNGGYSEAIALRDNQDKQLFPAKDDKKWQTALNALQRFIDYAKTHYHLY KEKDKDGELNAEESLYQLFQVSLNNPEAIWQTSKNSWGDVGGEGRERRCTPRAIYSGFGC VGVLQEAIDDFYMNDGKSIHESGLYSEEGIGEDGIPNMYKNREPRFYQAITYSGKTWQKT TKQIYFYKGSGDDNSKADMSYSGYLLYKGMNRDLLNQGSNAKSKYRAGMLFRLADFYLLY AEALNHVNPSDERIIAHIDSVRYRAGIPLLKDIKPEIKGNQALQEEAIRKERRIELFAEG QRYFDVRRWMCADEEGYKQGGPVHGMNMNADNLEDFMERTAFETRIFERRMYLYPIPLNE IQKSSKLVQNPGW >gi|226332214|gb|ACIC01000106.1| GENE 9 14791 - 15048 179 85 aa, chain + ## HITS:1 COG:no KEGG:BT_4037 NR:ns ## KEGG: BT_4037 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 85 1 85 85 174 100.0 1e-42 MQKNYSRGEAYARGYFHHIKGHLTAGAIIAIVMFVLHLLSGCEPMLPDESNRETVASPLD STKNDLDINEWETDTTTYQSHAKPV >gi|226332214|gb|ACIC01000106.1| GENE 10 15138 - 15686 433 182 aa, chain + ## HITS:1 COG:no KEGG:BT_4036 NR:ns ## KEGG: BT_4036 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 182 1 182 182 336 98.0 2e-91 MTFDNITGSKKLWSVRLDGEEDNEFLKLFYDWSDVMWLRSFFKENINDLSAYFKITDINQ AIKDTIEDSKRLRCVIMDISPEADLSRLFRPLDNNQASDFMLQKEKARLKRTIGHSSWLR LYAIKLTPGVYIITGGAIKLTATMQEREHTRKELVKMDKVRRYLLEENIIDDDGFIDYMS EL >gi|226332214|gb|ACIC01000106.1| GENE 11 15704 - 16024 297 106 aa, chain + ## HITS:1 COG:no KEGG:BT_4035 NR:ns ## KEGG: BT_4035 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 106 1 106 106 173 100.0 2e-42 MNENMRKIVERLEKHASPTPSKWREVFEYMDANETWLRYSQHIAMLMLDRMEELEMNQKQ LAEKMNCSRQYISKVLKGRENLSLETLAKIENALGISIIKEEPLAV >gi|226332214|gb|ACIC01000106.1| GENE 12 17298 - 
17504 162 68 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNAPVAQPLPRLQPTPFGGGTPCGKGGDHAHRLSASPKGGREGGRTKIPLYKFIDNSVSI PLPAFIYG >gi|226332214|gb|ACIC01000106.1| GENE 13 17743 - 18219 574 158 aa, chain - ## HITS:1 COG:no KEGG:BT_4032 NR:ns ## KEGG: BT_4032 # Name: not_defined # Def: putative non-specific DNA-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 158 1 158 158 280 99.0 1e-74 MAIQVKASLRKNPQDKKAAGKYYAQVVLAPEMTQRQIIDQIADRCTVTGSDVKAVLDALM TVIKRNLANGSPIRLGDLGSFRPSVSGIGSENVDKFTASNVKKARVIYVPSTEIKEAVAM YAFSKVGATAEGGNTEKPGGGDIDNPGGGDEEAPDPTV >gi|226332214|gb|ACIC01000106.1| GENE 14 18413 - 18523 106 36 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570192|ref|ZP_04847601.1| ## NR: gi|253570192|ref|ZP_04847601.1| predicted protein [Bacteroides sp. 1_1_6] # 1 36 1 36 36 66 100.0 5e-10 MKPFWRNTLRVLKVIGKIFVWLTGADSPVNDNEKKD >gi|226332214|gb|ACIC01000106.1| GENE 15 18614 - 19117 371 167 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 49 145 2 98 116 108 53.0 3e-24 MIKQRNISLIVVHCTASRCTSDLTPSALDAMHKRQGFTECGYHYYITKDGRIHHMRDITK IGAHAKGHNSESIGIAYEGGLNASGKATDTRTTAQKQSLETLLRFLLLTYPGAKVCGHRD LSPDLNHNGIIEPCEYIKECPCFNVRTEYGYLMENGKWRMDNVPCGI >gi|226332214|gb|ACIC01000106.1| GENE 16 19294 - 19548 220 84 aa, chain - ## HITS:1 COG:no KEGG:BT_4030 NR:ns ## KEGG: BT_4030 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 84 1 84 84 159 98.0 4e-38 MYTNVHSQNNYQANLSNRSYGFKELAVLYFPNIAPASASIRLKQWIKDDVELLGSLEETN YHLSNRILTPKQRDLITTTFGSPF >gi|226332214|gb|ACIC01000106.1| GENE 17 20040 - 20318 164 92 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDFPRKSCSRIETPEAHRARSITTARARRTAKGADFRRTNAFAINTGTPHRMAALRTNGL FPVVQTQGGSGLSLVPAASAGQSETAPGIEND >gi|226332214|gb|ACIC01000106.1| GENE 18 20558 - 20830 207 90 aa, chain - ## HITS:1 COG:no KEGG:BT_4028 
NR:ns ## KEGG: BT_4028 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 90 1 90 90 133 100.0 2e-30 MSEKVSILTLRLTAEEAAQMEVLKSITGKKSGSEAIKYIVKEYPRFCAHYKQEAREKGEL QRKYQDQKIAVGDFLKAFERLQQTMEDDRK >gi|226332214|gb|ACIC01000106.1| GENE 19 20827 - 21054 322 75 aa, chain - ## HITS:1 COG:no KEGG:BT_4027 NR:ns ## KEGG: BT_4027 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 75 1 75 75 131 100.0 7e-30 MNENTTKETLYYPRKMRIGWCVAHEVTAAGVKIERYGIKCRTYAEAFDRAAKLNNENRAG KAAVNESASEGGGQL >gi|226332214|gb|ACIC01000106.1| GENE 20 21081 - 22073 814 330 aa, chain - ## HITS:1 COG:no KEGG:BT_4026 NR:ns ## KEGG: BT_4026 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 330 1 332 332 618 97.0 1e-176 MSKNGFSYYKAETDRFQDIKIKRLKKKYGCDGYAVYQYALNEIYRVEGAYIRWTEDQLFD CADYWDMNEARVKEIIGYCAEVCLFDPVMWKTQCILTSRAIQSRYLDICKISKKKSYIPL EILLVEPEQPMREPVAMPLFEGGAGAAEHDTPNQATLAEQKFRSTPETFQNTQESSGNIP EKTDKEKKIKEKQNKENSSSTPPQPSGALTEEEARVLLSSTVLGEDRSSKPDIARHQPPQ ADAKAEEEKQRNPQGLINTLRPYNLSQRELEEVLQLSRHGEIGNPVWSILAEMHGNKRLK MPRLFLLKRLRDAMGLVAGGSPGGKAKEAS >gi|226332214|gb|ACIC01000106.1| GENE 21 22070 - 22366 317 98 aa, chain - ## HITS:1 COG:no KEGG:BT_4025 NR:ns ## KEGG: BT_4025 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 98 1 98 98 164 98.0 1e-39 MKTKNNNMKIMKTESTANEQERKEAKYKPVTPVILDNVSEDVTFIRSADVCKLLNVSNST LRNLRAAQAIPFYKLGGTFLYNKEEIMNYLASNYSKRI >gi|226332214|gb|ACIC01000106.1| GENE 22 22388 - 22648 303 86 aa, chain - ## HITS:1 COG:no KEGG:BT_4024 NR:ns ## KEGG: BT_4024 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 86 1 86 86 147 100.0 1e-34 MKTTATLIQQALEQKAIDSMIAYERNLISEQKMGKALNDALQHYSNVEGHRSIVLKGWII KTIYALKSNQLNDLDRIAFKYIKNEY >gi|226332214|gb|ACIC01000106.1| GENE 23 22645 - 22791 102 48 aa, chain - ## HITS:1 COG:no KEGG:BT_4023 
NR:ns ## KEGG: BT_4023 # Name: not_defined # Def: transposase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 48 71 118 118 88 100.0 6e-17 MLGHTSIKTTQIYAKILDTKVMDDMAALKEMYARKESQESPSNKAANE >gi|226332214|gb|ACIC01000106.1| GENE 24 23153 - 24382 917 409 aa, chain + ## HITS:1 COG:ECs3766 KEGG:ns NR:ns ## COG: ECs3766 COG4974 # Protein_GI_number: 15833020 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Escherichia coli O157:H7 # 177 391 64 287 298 60 28.0 5e-09 MKVEKFKVLLYLKKSGLDKSGKAPIMGRITVNRTMAQFSCKLSCTPGLWNPRESRLNGKS REAVETNAKIDKLLLDINAAFDSLLERKGDFDAASVKDAFQGSMKTQMTLMKMLDALRDE VKSRIGIDRAKGTYPAYDFTCRTMREFIETKFKTKDLAFGQLTEQFIHDYENFILDEKGY AVDTVRHYLAILKKTCKRAYQEGHSERFMFQHYVLPKQTVKTPKALCRESFEKIRDVEIA PHRTTHRLARDLFLFACYTGVAYSDAVTVTRENLYTGEDGKLWLKYRRKKNELRASVKLL PEAVALIEKYHDDSRDTLFPMIHYPSMRNHMKALAVLAGIKENLCYHVGRHSFASLVTLE AGVPIETISSMLGHSNIQTTQVYARVTPKKLFEDMDRLIEATGDLKLVL >gi|226332214|gb|ACIC01000106.1| GENE 25 24402 - 25619 671 405 aa, chain + ## HITS:1 COG:no KEGG:BDI_0742 NR:ns ## KEGG: BDI_0742 # Name: not_defined # Def: integrase # Organism: P.distasonis # Pathway: not_defined # 1 395 1 395 403 537 65.0 1e-151 MRSTFSILFYINRGKIKADGTTAVMCRITIDGRNTAITTGICCKPEDWNARTGTIRTVRE NARLQEYRKYIEQTYEEILRTQGVVSAEIIKNRVTRQFVVPTHLLRMGEIERERLRIRSR EINSTSTYRQSQYFQKYLTDYLVSLGKKDIAFEEITEDFGRNYKAFLIRNKNFSTSQTNR CLCWLNRLLYLAVDNEILRTNPVENVEYEKKTAPKHKYVTREEMKRILAMPLNEGRAELG RRAFIFSYFTGLAYADIKQLHPCHIGTTAEGRRFIRINRKKTGVEAFIPLHPIAEQILSL YNTTDIHSPVFPLPSRDSIWHEIREIGVILGRHDDLSYHQARHGFGVLLISESVSIESIA KMMGHSNISTTQGYARITEEKISREMDRLMEKRSQSRTHSGSDSQ >gi|226332214|gb|ACIC01000106.1| GENE 26 25956 - 26978 827 340 aa, chain - ## HITS:1 COG:mlr2757 KEGG:ns NR:ns ## COG: mlr2757 COG3177 # Protein_GI_number: 13472455 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 35 224 41 214 263 86 32.0 8e-17 MQTELEKLVSLYRELGIDRQIDYDKFYLYSLITHSTAIEGSTITELENQIMFDQGISLKG 
KSIVEQHMNLDLKDAYEHAIRLADAHTDITVDLLKSLSALVLKNTGQEYKTVLGDFSSAR GDLRLLNVTAGPGGKSYMNYSKVPAKLSEFCTRLNRERENHAAKSMTQLYEISFDAHYDL VTIHPWADGNGRMARLLMNMLQFEFGLIPTKILKEDKEEYIKALVETRENEDLNVFREFM TATMIKNLTRDIEVYRKSIDDTPISGEKPQKSREKKVKSREKIMTLLSQDNTLSAATLAE RIGITAKAVEKQIAALKADGVLRRIGPDKGGYWQVVEKKD >gi|226332214|gb|ACIC01000106.1| GENE 27 27160 - 27486 288 108 aa, chain - ## HITS:1 COG:no KEGG:BDI_0745 NR:ns ## KEGG: BDI_0745 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 21 103 15 97 97 99 61.0 4e-20 MNDNGNIRLLTPENDMRVRAFLSSLEELSEKVEKIRENNKPSLDGERYYTDKELAVRLKV SRRSLQDYRNNGILPYIQIGGRILYRASDIERTLMDGYKEAYRLKRDI >gi|226332214|gb|ACIC01000106.1| GENE 28 27483 - 27860 297 125 aa, chain - ## HITS:1 COG:no KEGG:BDI_3258 NR:ns ## KEGG: BDI_3258 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 93 1 93 103 109 55.0 3e-23 MEVVVIDKATFERMLSGFEIFAEKVERLCREQEDLGEKEWLDSNDVCRLLCISPRTLQTM RENGTLAYTKISHKVYYRPEDVKAVFPVAEMKRCITAGKERKCNVTNKPTNKPINQQTNK QISDL >gi|226332214|gb|ACIC01000106.1| GENE 29 28297 - 28857 371 186 aa, chain + ## HITS:1 COG:no KEGG:BDI_2239 NR:ns ## KEGG: BDI_2239 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 9 180 12 179 186 137 43.0 2e-31 MATKDVKEFNGWFNRSYARLKERLSIYGKIDEDAFHDAYLAVRKQIMFSSVGIEDPESYF FGCYRRILQSGARDESRYDSPGDEYFARLGETDCAEETEEREEMLTGCDRLVRDIQKFLR RHFSYEDYRIFMLRFYETGSSFRTIARHMGEKTSVVTRRAQAMMESVRANRKFIARRRLI MAGEAA >gi|226332214|gb|ACIC01000106.1| GENE 30 28903 - 29322 337 139 aa, chain + ## HITS:1 COG:no KEGG:BDI_2238 NR:ns ## KEGG: BDI_2238 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 127 1 128 130 114 46.0 2e-24 MKLTVYDKSNSHPALTYKGKRIITVCRDGSMYLSRILSRELSLHAGNRLCIARDEDRPKD WYMFVSDDENGFTIWNDPRCARFSNSFIAGMILDVAKVEKCAGFMVAKEPVNVDGRLCYR IILDNPIPKGVSVRTTGTK >gi|226332214|gb|ACIC01000106.1| GENE 31 29436 - 29741 178 101 aa, chain - ## HITS:1 COG:no KEGG:BDI_0745 
NR:ns ## KEGG: BDI_0745 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 3 98 2 97 97 106 58.0 3e-22 MSSEIREKDHEWVCKFHSNFDRLLASFEKLFSQRRPPVYGDELLTDKEVSHLLKVSRRTL QDYRSNGILPYIQVGGKILYRASDIERTLMDGYREAYRSRK >gi|226332214|gb|ACIC01000106.1| GENE 32 29772 - 30083 310 103 aa, chain - ## HITS:1 COG:no KEGG:BDI_0746 NR:ns ## KEGG: BDI_0746 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 103 1 103 103 131 66.0 9e-30 MEIVSFEKRTFEEIAAKLDYFVQRMDDLCRQHGEKKAERWMDSHDVCRKLRISPRTLQTL RDNGTLAFTKIGNRTYYRLEDVERVIVDVEERRKEAKWKGKSI >gi|226332214|gb|ACIC01000106.1| GENE 33 30662 - 31108 235 148 aa, chain + ## HITS:1 COG:no KEGG:BT_3141 NR:ns ## KEGG: BT_3141 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 29 148 35 160 160 117 51.0 9e-26 MATRKSFDKERWEAMTGMEMSQLLPECENPESAGEVTEHAGNEILPTVTEQSGIQEPAEK DGSAPASKCRISGRQRRLSLEEYRSTFLQVPRIEDRKPVFVSCEVRDRLDEYVRKLGSRR MSVSGLLENIAWQHLEIYSEDFDRWRKL >gi|226332214|gb|ACIC01000106.1| GENE 34 31346 - 31720 234 124 aa, chain + ## HITS:1 COG:no KEGG:BDI_3248 NR:ns ## KEGG: BDI_3248 # Name: not_defined # Def: mobilization protein BmgB # Organism: P.distasonis # Pathway: not_defined # 8 117 106 215 219 116 55.0 3e-25 MSVKDKTRPRGRPKASGIRKLSKSVTVKFSRIDYERLLHRSRQANRTLAEFIREAAFEAR IVARHSTEEAAVIRNLVGMANNLNQLARLSHQTGFYRTRNAVMELLEKLKVIMNEYKKVE RRNT >gi|226332214|gb|ACIC01000106.1| GENE 35 31717 - 32673 727 318 aa, chain + ## HITS:1 COG:no KEGG:BVU_3726 NR:ns ## KEGG: BVU_3726 # Name: not_defined # Def: mobilization protein BmgA # Organism: B.vulgatus # Pathway: not_defined # 1 297 1 301 316 314 54.0 3e-84 MIGKIKKGKSFGGCIRYVMGKDNAEIIGSDGVLLGNNREIADSFNCQCLLNPKIKQPVGH IALSFKPEDKPVLSNEFMAKIAMEYMDLMGIRNTQFILVRHHHTDNPHCHLVYNRIGYDG KVISSQGDYKRNEIATKRLKDKYGLTYAEDKGKTNVTKLHDSERIKYEIYHAVKQALKRA RTWKELVVGLALQGIKLEFVGRGGKMKSAGDIQGIRLTKDGLTFKGSQISREFSFAKLNA ILGGNSPDTGVDLEVKKQNQAPSNRNRKEQEPSNMAFIESNGLGLFSSFGESVPEEQIPY 
DELLRKRKKKKKRKGLGL >gi|226332214|gb|ACIC01000106.1| GENE 36 32728 - 33537 569 269 aa, chain + ## HITS:1 COG:no KEGG:PGN_0926 NR:ns ## KEGG: PGN_0926 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 177 258 159 240 240 69 50.0 1e-10 MKQEEFMESIYGCLERIENKINGLSMPQSAGGDAEADKEVIQELNALKTGFKRVLEALVM IKGDTANLQKRNSMPDKFMETLSVLKSEQQAYHKNQNEFLERFAQTEKDTILHISEKMES LSTSVRNRMEEPDVVCHRHSISIDTPYIFWTLIILVTYSIVVSVAFCLGKQLDYDCSDND LKYRYIKMKGEASPGQIGELENLFELNRDEAGIEQMREDVEAYEDAVRRQATLAEQARLK EQAAKEQESKAKFIKMKQGQPKGNLKSRN >gi|226332214|gb|ACIC01000106.1| GENE 37 33676 - 34872 400 398 aa, chain + ## HITS:1 COG:no KEGG:BDI_0013 NR:ns ## KEGG: BDI_0013 # Name: not_defined # Def: integrase # Organism: P.distasonis # Pathway: not_defined # 1 396 1 401 407 403 49.0 1e-111 MATIKVKLRPSSVVERAGTIYYQVTHRRATQQITTNIRLQPDEWDTIGEQVVVSVADKNI IQNRIDSDIALLKRIVKDLNNSGVTYSVGDIVKRFKSPECHVLVLDFMQNQIRLMRNANR LGTALNYEKTMKNFVKFLGGVNLPFSAMTEQVIADYNAFLVQRGMVRNSISFYMRIMRAV YNKAVRQKLVEQSHPFTEVYTGIDRTRKRAVSESVISQLYKLNLAEGTPLALARDIFIFS YCTRGMAFVDIAYLKRENIQNGVICYARRKTGQLLSVRIEPSIQRIIDRYSSALSPYVFP ILTSTETKPAYEEYQVAINNHNRQLRRLSKMLPAGCKLTSYTSRHSWATAARNHNVPISV ISAGMGHTSEQTTQIYLTMLENSVIDDANQGLIRSLLE >gi|226332214|gb|ACIC01000106.1| GENE 38 35204 - 36493 507 429 aa, chain + ## HITS:1 COG:no KEGG:BDI_0751 NR:ns ## KEGG: BDI_0751 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 5 429 8 393 395 160 29.0 6e-38 MKRLILSAACTLSLASCGGPPTGPNQKLGNESSQTEVKLPAPKIKVFVENSGSMDGYVKG ATDFENAVYSYLSDVQHADLGTRIDSLATKNILVLNYINSEVLQQKPDIREFIEALEPAD FKMKGGKRGTSDMSNILDTIISQTDDDEVSIFVSDCIFSPGKKYKAKDNADEYIVAQQIG IKNHIVEKLAKNPNFSIVVMRLTSQFNGIYYNKFDDRQPINNDRPFFMWLMGDKSFLNTI LKKVELNQIKGKGVQNIFMISRPLTAILYNISLPQPGNGKYEIAKSEQYAIKNAKTDGRG GNSRFQVGISVNFSNILLPDEYLMNPDNYVVSNKAYGLEITKYSGPRQDLYTHTIKLNLL QPVLSKGTVKISLKNTLPQWINDCTDESGLDINAPGAMEKTYGLSYLLGGVYDAYASDGQ YGSITINIK >gi|226332214|gb|ACIC01000106.1| GENE 39 36495 - 36977 438 160 aa, chain + ## HITS:1 
COG:no KEGG:BDI_0752 NR:ns ## KEGG: BDI_0752 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 159 1 152 152 102 38.0 4e-21 MENLLGQIYCWFQSFYGQDLSYYLWGYDPGTEAYTNPNIYNLVGLITLVGSLVLVVVFYY IINHPRLCKWWSWLITLGINAVIALFVGYGIVMSKYVNGYIHDTLMYQRDADGNIISILI GESNCWGFGIANMFVAMIAFVLLSFMLKWWSSSAKHVPFL >gi|226332214|gb|ACIC01000106.1| GENE 40 36998 - 38470 798 490 aa, chain + ## HITS:1 COG:no KEGG:GFO_2613 NR:ns ## KEGG: GFO_2613 # Name: not_defined # Def: hypothetical protein # Organism: G.forsetii # Pathway: not_defined # 4 490 3 465 465 407 45.0 1e-112 MNGKLYIFGIGGTGSRVIKSLVMLAASGVKIDASAIVPIIVDPDFANADVTRTIEQIKAY VAIRSQLAFNDATKNRFFEVPIENVVNDHRLALKDTKNKKFREFIEFSTMSKENRALASM LFSEANLESDMEVGFKGNPNIGSVVLNQFTDSQDFIDFADAFKPGDRIFIISSIFGGTGA SGFPLLLKNLRGLDAKVSNNDAIQHAPIGAITVLPYFAVKTDEESTINSSTFIGKTKAAL QYYERNVNGDESSVNVLYYIGDERTKQYENEEGGTEQRNNAHIVELAAAMSIVDFASIQN DDPILVCEENSNGKIYAPNPDFREFGIEKDVQEVLFSNLTQSTRDYLCTPMTQFVLFAKY VNEHLRGATSLQWARDNKFDDAFLKSSFINNIQKFVNSYIEWLEEMSDNDRAFKPFKLSV KSSDLYSIVDGVKPGSIKALWAFNKSGYDLFDAALNEEHSSLSKNFTLPHKFTELFYNVT KLLVGKKYKF >gi|226332214|gb|ACIC01000106.1| GENE 41 38476 - 41862 1707 1128 aa, chain + ## HITS:1 COG:no KEGG:BDI_0754 NR:ns ## KEGG: BDI_0754 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 1126 1 1124 1130 660 36.0 0 MAKVLRLHKGAADTIGNWDTSVKIGTKAIDSIADPIGANDKHEITSIPSPFARMDLVKRA FKIVAEGSLEGKTAYHKLVSDCLDVGQIFFNIEKYRDKIEIIVWDKKNCLAELSDSDYEE HTRLGKTYKTYLEQDCDEYNFDQMDCIYLLNYIDPSAPGVMNIIGATSPATIFFTSANDL QYVGKKIKFGNDCPFDTEYKPLYKRDFEYIKYWWGLKKSRKDFARVFPEVDLYLEKCFRM LTDEQRNELRQNIVDETYYRGNYDDIPVVPTAQQYVMVLDEKLRRKRTVTNISSGFEMKV SDSLRNGNVPLALPVEMYTEPTHYVVAKWDKNTYVPYYDARPIDDRTLPDDGSKYPYVTV GDFLEDTIVKVPYKFNSEAFFNGNDENPDSKFSYILPLKRTYFNYFTTKDLTEKVCGKNR IEVVRLNGEAVKVVLRIPIKDGGYIQYERIYYKDGKAEATANKGAIIEKEFTLGLYPSIK YADGVKPYYKVAFLDRDSVDNPDSAYSLSFYDYSNKEVSVEGVVRRNRNADNSRFDTSYI DYITYALESEYQYITLSNDNESGIHGVIIPKFTARNGSHKFRFAIDFGTTNTHIEYSVDG 
STSSNPFDITEKDMQIQKLHITDDYMINDVFNSDFIPATIGGESLYGYPMRTVISESNNT NWDKAVLAMANVNIPYTYEKSLSLPYNVLHTDLKWSTNTEDKKRASKYIESILLMLRTKV LLNNGDLSKTEIVWFYPASMTQNRFNKFKAEWENTFATLFGAPISNIIAASESVAPYYYH KAKKGATSTVVSVDIGGGTTDVLIVDKGEPKYLTSFRFAANTIFGDGYSYDSDSNGFVNK YKDIITNQLETNNLRGLKAVLKSVLDKRVSTDVIAFLFSLASNKEIKKEKVEINFAKMLA DDNRGKYVVILFYVAIVYHIAKLMKAKGFDMPRHMTFSGNGSKVLNILSTNDATLVRLTK IIFEEIYAQSYSIDGLDIIRPANSKESTCKGGIILTPFQSQDYGEIKDMKTILIGTDNEK FADVHMTYNDVTEADLDSVVDVIKEYIEFTFKLDKKFSFYDNFDVDRSIMNKVKDLCYRD IRTYLENGLAIKKSEIAQDGADDNLEETLFFYPLVGIINAVVRNIYQM >gi|226332214|gb|ACIC01000106.1| GENE 42 41868 - 42863 446 331 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212692359|ref|ZP_03300487.1| ## NR: gi|212692359|ref|ZP_03300487.1| hypothetical protein BACDOR_01855 [Bacteroides dorei DSM 17855] # 1 331 1 331 331 533 96.0 1e-150 MKKIIILLSVITLASCSLLPNNGENDNEAIKKAQQTAQENISQYTDQEIDRAKKEIDNAV ASVINRADSTLTNQFTAVEKSFNAKAEETKKELSGKISGIEDNITKAKAISFIGIAIGIL GIVLAFFAYRKRPRTNVNKVKDIITEEINTNDYIRNEIRRIAGGQSSSYRPQTSAPSQAT IDRAIEAYLASKKFKDILQQYIALATAPTQSVTENIKMEPATTAKPVYQIYAKESNSMIL SSIQDTYQKGKSIYRLTMSEANAYTAEVSICIEQEEVKQRILKFDSQYLEPICSVTRSSN DPTQVLIKTTGTAERIGEEWKVIKPITVEIK >gi|226332214|gb|ACIC01000106.1| GENE 43 42867 - 44549 597 560 aa, chain + ## HITS:1 COG:no KEGG:BDI_0756 NR:ns ## KEGG: BDI_0756 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 81 440 59 424 542 208 33.0 4e-52 MELHYILVVLVIIAIIVAQIYIYRNTKKKIATYKSIFPNSTSSYSIVEKEIQTESCSDDD DVYVEDIDTISVSQLNINTDNETLKEIRDALNMYLQKNRGAASDFYLMKDVVERYCDAEE EEINIQQPIPLYLGLMGTMVGIIVGIGFIAVSGGLSSESLMDNITSLMTCVAIAMAASLV GICCTTLISWSAKSATSKVEADKNRFYSWLQTELLPVLSGNAVNALYLLQQNLMTFNQTF QSNIEGLDGALSKVEESSREQIELITLIKDIDIKRVAQANVTVLKELKECTGEIAVFNKY LHSVSGYLHSVNELNSNINEHLNRTAAIENMGAFFEREINQVAAREQYINEVVAKVDDTL RKTFEKLAESTRESVTQLRNNSVSEFDALLKHYSEQKEEFARMLQEQREEFAARNAETTE LMKEIRNLADIKAVMGQLVESTKGQTAILERLVSSLKNQNNGGRREGFPIESAVQHVSAP VFPKSITYMVATITLLAFMAFGLYVYNSFIAEPRIEVVGVSNEPQQTIVPTSTQVEPNIS NDQTVNVDYLESTQSATQEQ 
>gi|226332214|gb|ACIC01000106.1| GENE 44 44552 - 45205 149 217 aa, chain + ## HITS:1 COG:no KEGG:BDI_0757 NR:ns ## KEGG: BDI_0757 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 211 1 201 210 175 46.0 1e-42 MAKESKSFFWASYADLMTSLFFVMLTLFIVVIIALNNARIDAIEQTAELQAKIDKADEIN NATRELDTQHSQYFQYFPEFKKHKLAVTVSFRSGSADMNSLPSSTKEDLRTTGKILQDFI IKTTQSNPHIQYLLIIEGQASKDGYAYNYELSYQRALSLKKFWEDNGLNFNDKNCEVLIC GSGDGRLSGTGLMRESKEVLNQRFLIHILPKPGKIGD >gi|226332214|gb|ACIC01000106.1| GENE 45 45278 - 45598 217 106 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212692356|ref|ZP_03300484.1| ## NR: gi|212692356|ref|ZP_03300484.1| hypothetical protein BACDOR_01852 [Bacteroides dorei DSM 17855] # 1 106 24 129 250 187 100.0 2e-46 MVVLFIISIVELKQIDATPLEVKELKAERDSLLNLNSRMVLRQKQYSEELDSMRYLANAT QAHIDKINEINDATKNLDQNYFVYDSINKKHKLNFVVRFKIDDDQI >gi|226332214|gb|ACIC01000106.1| GENE 46 45743 - 45961 62 72 aa, chain + ## HITS:1 COG:no KEGG:BDI_0757 NR:ns ## KEGG: BDI_0757 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 3 67 137 201 210 72 56.0 7e-12 MDYNYDLSYRRARNLKKFWDNNNIHFDRDNCEVLISGSGDGRLSGTGLMRELEEKANQRF LIHILPKPGQIQ >gi|226332214|gb|ACIC01000106.1| GENE 47 45965 - 46741 32 258 aa, chain + ## HITS:1 COG:TP0546 KEGG:ns NR:ns ## COG: TP0546 COG0265 # Protein_GI_number: 15639535 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Treponema pallidum # 102 258 287 454 666 82 32.0 1e-15 MKQILYILLILLLSVGCAQKPAAPKGNNNYPVSSPSSRPGSFSNNGRPPLPHQQKGTSQT NKTSDVGNTTNGRVLSPSEIFEKLSSAVFKIHTSTAYQGFQGSGFFISSNGIAVSNYHVF QGTAVGYEDIILSDGSSYKVTEVYHKSQDNDFIIFKVGVSRKVNYIKIANNTPKVGEKIY TIGSPRGLDNTFSSGEISQIRENGKILQISAPIDHGSSGGVLLNSKGEAVGITSGGIDDS GANLNYAWNIQLIKSYIP >gi|226332214|gb|ACIC01000106.1| GENE 48 46995 - 47234 88 79 aa, chain + ## HITS:1 COG:no KEGG:FIC_01982 NR:ns ## KEGG: FIC_01982 # Name: 
not_defined # Def: SNase-like nuclease # Organism: F.bacterium # Pathway: not_defined # 2 79 74 151 161 68 41.0 6e-11 MGENIIVDVQKQDGWGRYIAYVYTLENKDVALEMLNAGMAWHYTKYDQSEKYHNAEIKAR NNKVGLWVYPRRIAPWDFR >gi|226332214|gb|ACIC01000106.1| GENE 49 47755 - 48522 372 255 aa, chain - ## HITS:1 COG:no KEGG:BT_4009 NR:ns ## KEGG: BT_4009 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 254 100 353 370 500 99.0 1e-140 MFQTQDNEFSTISTLFDYHEIMEKKNLRTSTFIGYHVTKKHLLNFIRIKYHVSDYDLAAV DKAFVYEFYAYLQGYRREGDTVCAVNGALKHIQRFKKVMNVALQNEWISRNPVCLLNAKK TKVERGFLSEKELKSLEEVPLPANLSIVRDVFIFAVYTGLSYVDIGNLTNENINVGIDKS LWLSYYRQKTDIHAILPLLQPAVNILKRYEAYHEGKRNNHIFPVPLNQVMNRYLKKVAKQ AGVDKNITFHMARHR >gi|226332214|gb|ACIC01000106.1| GENE 50 49327 - 49686 331 119 aa, chain + ## HITS:1 COG:slr1886 KEGG:ns NR:ns ## COG: slr1886 COG0799 # Protein_GI_number: 16330295 # Func_class: S Function unknown # Function: Uncharacterized homolog of plant Iojap protein # Organism: Synechocystis # 4 118 27 140 154 81 35.0 4e-16 MNDTKILIEKIKEGIQEKKGKNIIIADLTSIEDTICKYFVICQGNSPSQVNAIVDSVKEF ARKGADSKPTAIDGLRNAEWVAMDYSDVLVHVFLPEVRSFYNLEHLWADAKLTQIPDLD >gi|226332214|gb|ACIC01000106.1| GENE 51 49696 - 51840 1286 714 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. 
McKiel] # 18 680 3 636 636 499 42 1e-140 MDNNNTNNNNNKPNNKVNMPKFNLNWLYMIIALMLLGLWWSSDSRGSGSKVVPYSDFQKY VTNGYISKVLGSEKTLEAHVKPNAVGAIFGADSTKVGRNPIITSNPPSKDRLDKFLQDEK AAGHFDGIADYPADSDIFPAILINILPLVLLIALWIFFMRRMSGGGSGGAGGVFNVGKSK AQLFEKGGSIKITFKDVAGLAEAKQEVEEIVEFLKEPQKYTDLGGKIPKGALLVGPPGTG KTLLAKAVAGEANVPFFSLAGSDFVEMFVGVGASRVRDLFKQAKEKAPCIVFIDEIDAVG RARGKNPAMGGNDERENTLNQLLTEMDGFGSNSGVIILAATNRVDVLDKALLRAGRFDRQ IHVDLPDLNERKEVFGVHLRPIKIDDTVDVDLLARQTPGFSGADIANVCNEAALIAARHG KKFVGKQDFLDAVDRIIGGLEKKTKITTEAERRSIALHEAGHASISWLLEYANPLIKVTI VPRGRALGAAWYLPEERQITTKEQMLDEMCATLGGRAAEDLFIGRISTGAMNDLERVTKQ AYGMIAYLGMSDKLPNLCYYNNEEYSFNRPYSEKTAELIDEEVKRMVNEQYERAKKILSE HMEGHNELAQLLIDKEVIFAEDVERIFGKRPWASRSEEIMAAKESQDAARAERELAEKLK AEEQEIKEEEAKNSAQEEATTETKITAEGKKVTVEGKVTIEAKSNEKEQTTESK >gi|226332214|gb|ACIC01000106.1| GENE 52 51857 - 52699 790 280 aa, chain + ## HITS:1 COG:HI0919 KEGG:ns NR:ns ## COG: HI0919 COG0575 # Protein_GI_number: 16272856 # Func_class: I Lipid transport and metabolism # Function: CDP-diglyceride synthetase # Organism: Haemophilus influenzae # 13 277 11 288 288 106 32.0 4e-23 MISNFIKRAITGILFVAILVGCILYNSFSFGILFMAISALTIYEFGQLINSRAEGVNVNK TIIMLGGAYLFLAVMGFCIDAADSKIFIPYLLLLLYLMISELYLKKASSMLNWAYSMLSQ LYIGLPFALLNVLAFHTDPEYSSVSYNPILPLSIFIFLWLSDTGAYCVGSLIGKHRLFER ISPKKSWEGSIGGGVVAIGASFVLAHYFTIMSMWEWAGLALVVVVFGTWGDLTESLLKRQ LHVKDSGTILPGHGGMLDRFDSALMAIPAAVFYLYALTWF >gi|226332214|gb|ACIC01000106.1| GENE 53 52812 - 53588 841 258 aa, chain - ## HITS:1 COG:no KEGG:BT_4005 NR:ns ## KEGG: BT_4005 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 258 1 258 258 501 100.0 1e-140 MKKFNWLLVMLLLALVPALQSCDDDGYSIGDIGWDWATVRATGGGGYYLEGDRWGMIDPV ASSIPWYKPVDGERVVAFFNPLADTDKGAQVKIEGIQEVLTKEVEDMTAENEEEFGNDPI VIYQGDMWLGGKFLNIIFHQYLPRSEKHRISLVQNKIEPEAPETPEALNVDEDGYIHLEL RYNTYDDVTGYRGWGRVSYNLEEFYPTAKDAAETEFKGFKVTINSKENGEGRVIVLDLDH PVGVPEKAKDVHSTSFVK >gi|226332214|gb|ACIC01000106.1| GENE 54 53680 - 54816 1091 378 aa, chain - ## HITS:1 COG:FN0597 
KEGG:ns NR:ns ## COG: FN0597 COG0763 # Protein_GI_number: 19703932 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipid A disaccharide synthetase # Organism: Fusobacterium nucleatum # 1 341 1 323 356 157 32.0 3e-38 MKYYLIVGEASGDLHASHLMTALKAEDPQADFRFFGGDLMAAVGGTMVKHYKELAYMGFI PVLLHLRTIFANMKRCKEDIVAWNPDVVILVDYPGFNLNIAKFVHSETQIPVFYYISPKI WAWKEYRIKNIKRDVDELFSILPFEVEFYTLKHRYPIHYVGNPTVDEVTAYQKAHPKNPE AFLADNNLEDKPIIALLAGSRKQEIKDNLPDMLKAASAFPDYQLVLAGAPGIAPEYYKQY VGQAKVKIIFAQTYRLLQHAEVALATSGTATLETALFRVPQVVCYYTPIGKVVSFLRRHI LKVKFISLVNLIADREVVKELVADTMTVGNMQNELKKLIEDQEYKNRMLAEYEYMADRLG PAGAPQHAARKMLELLKK >gi|226332214|gb|ACIC01000106.1| GENE 55 54868 - 55647 618 259 aa, chain - ## HITS:1 COG:VC0531 KEGG:ns NR:ns ## COG: VC0531 COG0496 # Protein_GI_number: 15640553 # Func_class: R General function prediction only # Function: Predicted acid phosphatase # Organism: Vibrio cholerae # 8 256 16 259 263 151 34.0 1e-36 MESKKPLILVSNDDGVMAKGISELVKFLRPLGEIVVMAPDSPRSGSGSALTVTHPVHYQL VKREVGLTVYKCTGTPTDCIKLALGSVLDRKPDLIVGGINHGDNSAINVHYSGTMGVVIE GCLKGIPSIGFSLCNHRPDADFEPSGPYIRKIAAMILEKGLPPLTCLNVNFPDTPNLKGV KVCEQAKGCWVNEWVTCPRLDDHNYFWLTGSFTDHELENENNDHWALENGYVAITPTTVD MTAYGFIDELNGYCQQLEF >gi|226332214|gb|ACIC01000106.1| GENE 56 55825 - 56772 918 315 aa, chain + ## HITS:1 COG:lin2923 KEGG:ns NR:ns ## COG: lin2923 COG1192 # Protein_GI_number: 16801982 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Listeria innocua # 62 311 1 250 253 278 56.0 9e-75 MYICKLKQRKYDDKINQVSELFRPAVINKTGDLSTENAKFLFFRRRSKELSLISLPINLE YMGKIIALANQKGGVGKTTTTINLAASLATLEKKVLVVDADPQANASSGLGVDIKQSECT IYECIIDRANVQDAILDTEIDSLKVISSHINLVGAEIEMLNLPNREKILKEVLTPLKKEY DYILIDCSPSLGLITINALTAADSVIIPVQAEYFALEGISKLLNTIKIIKSKLNPALEIE GFLLTMYDSRLRQANQIYDEVKRHFQELVFNSVIQRNVKLSEAPSYGIPTILYDADSTGA KNHLALAKEIINRNK >gi|226332214|gb|ACIC01000106.1| GENE 57 56784 - 57674 1077 296 aa, chain + ## HITS:1 COG:ML2706 KEGG:ns NR:ns ## COG: 
ML2706 COG1475 # Protein_GI_number: 15828464 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Mycobacterium leprae # 32 295 64 334 335 197 43.0 3e-50 MATQKRNALGRGLDALLSMDDVKTEGSSSISEIELAKITVNPNQPRREFDQTALQELADS IAEIGIIQPITLRKLSDDEYQIIAGERRYRASQKAGLKTIPAYIRTADDENMMEMALIEN IQREDLNAVEIALAYQHLLDQYELTQERLSERIGKNRTTIANYLRLLKLPAPIQMALQNK QLDMGHARALISLGDPKLQVKIFEEIQEHGYSVRKVEEIVKSLSEGEAVKSGTRKITPKR SKLPEEFNLLKQQLTGFFNTKVQLTCSEKGKGKISIPFSNEEELERIMEIFDTLKK >gi|226332214|gb|ACIC01000106.1| GENE 58 57675 - 58562 564 295 aa, chain + ## HITS:1 COG:no KEGG:BT_4000 NR:ns ## KEGG: BT_4000 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 295 1 295 295 575 100.0 1e-163 MTRKSKKYQLPIITLLLCFLQVAGIDVYAQEPVTPVQKDSIIQSREAPKARARRHRDPAQ VMDSLKKDSIKILHPERLQATDSLSAVKIQIADSLDAANKKELKKIEQPAPIVVKTDTVP PVQDLNKKLFIPNPTKATWLAVVFPGGGQIYNRKYWKLPIIYGGFAGCAYALSWNGKMYK DYSQAYLDIMDSNPNTKSYEDLLPPNAQYNEEQLKNTLKRRKDSFRRFRDLSIFAFIGVY LISIIDAYVDAELSNFDISPDLSMKLEPTVIDNNSQFRSNSYKNKSVGLQCVLRF >gi|226332214|gb|ACIC01000106.1| GENE 59 58596 - 59891 1169 431 aa, chain + ## HITS:1 COG:PA1812 KEGG:ns NR:ns ## COG: PA1812 COG0741 # Protein_GI_number: 15597009 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Pseudomonas aeruginosa # 126 429 118 462 534 165 31.0 2e-40 MKKLENYCSIIFLFLLATTQVKAQSVDVVIRENGTERKESIDLPKSMTYPLDSLLNDWKA KNYIDLGKDCSTAEVNPLFSDSVYIDRLSRIPAVMEMPYNDIVRKFIDMYAGRLRNQVSF MLSACNFYMPIFEEALDAYGLPLELKYLPIIESALNPSAVSRAGASGLWQFMIGTGKIYG LESNSLIDERRDPIKATWAAARYLKEMYDIYGDWNLVIAAYNCGPGTINKAIRRAGGETD YWKIYNLLPKETRGYVPAFIAANYVMTYYCDHNICPMETNIPASTDTVQVTKNLHFEQIA ELCNVPLEEVKSLNPQYKKQVIPGTTKPCTLRLPQGAISTFIDKQDTIYAHRADELFRNR KTVAVKEVTPTTRRQTSAVAGKGKLTYYKIKSGDTLSSIAEKLGVRVKDIQQWNGMSNTR IAAGKQLKIYK >gi|226332214|gb|ACIC01000106.1| GENE 60 59964 - 62207 2250 747 aa, chain + ## HITS:1 
COG:VC2710 KEGG:ns NR:ns ## COG: VC2710 COG0317 # Protein_GI_number: 15642704 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Vibrio cholerae # 34 710 19 667 705 459 38.0 1e-128 MDNITPKEIADEEMINQAFQELLNDYLHTKHRKRVEIITKAFNFANQAHKGIKRRSGEPY IMHPIAVAQIVCNEIGLGSTSICAALLHDVVEDTDYTVEDIENIFGPKIAQIVDGLTKIS GGIFGDRASAQAENFKKLLLTMSNDIRVILIKIADRLHNMRTLGSMLPNKQYKIAGETLY IYAPLANRLGLYKIKTELENLSFKYEHPEEYAEIEEKLNATAAERDKVFNDFTAPIRTQL DKMGLKYRILARVKSIYSIWNKMQTKHVPFEEIYDLLAVRIIFEPRNVEEELNDCFDIYV SISKIYKPHPDRLRDWVSHPKANGYQALHVTLMGNNGQWIEVQIRSERMNDVAEQGFAAH WKYKEGGGSEDEGELEKWLKTIKEILDDPQPDAIDFLDTIKLNLFASEIFVFTPKGELKT MPQNSTALDFAFSLHTDIGSHCIGAKVNHKLVPLSHKLQSGDQVEILTSKSQRVQPQWEV FATTARARAKIAAILRKERKANQKIGEELLNEFLKKEEIRPEEAVIEKLRKFHNFKNEEE LLAAIGSKAITLGEADKNELREKQTSNWKKYLTFSFGNSNKEKPEEKEPQEKEKINPKEI LKLTEESLQKKYIMAECCHPIPGDDVLGYVDENDRIIIHKRQCPVAAKLKSSYGNRILAT EWDTHKELSFLVYIYLRGIDSMGLLNEVTQVISRQLNVNIRKLAIETNDGIFEGKIQLWV HDVEDVKTICNNLKKIQNIKQVNRVEE >gi|226332214|gb|ACIC01000106.1| GENE 61 62209 - 62550 376 113 aa, chain - ## HITS:1 COG:AGc2183 KEGG:ns NR:ns ## COG: AGc2183 COG0789 # Protein_GI_number: 15888519 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 3 82 26 105 203 67 41.0 6e-12 MLNTDKELKLYYSIAEVADMFGVNASLLRFWEKEFPQISPRTTGRGIRQYRKEDVETIGL IYHLVKEKGLTLPGARQRLKDNKEATVRNYEIVNRLKGIKEELLAIKKELDGR >gi|226332214|gb|ACIC01000106.1| GENE 62 62572 - 63540 914 322 aa, chain - ## HITS:1 COG:mll8577 KEGG:ns NR:ns ## COG: mll8577 COG0739 # Protein_GI_number: 13477076 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Mesorhizobium loti # 84 297 222 423 434 134 34.0 2e-31 MRKVYYIYNPRTQTYDRIYPTVRQRALSILRRLFIGMGLGAGCFIVLLLVFGSPSEKELR IENSRLLAQYNVLSRRLDDAMGVLQDIQQRDDNLYRVILQADPVSPAIRQAGFGGTNRYE 
ELMDLANSKLVVNTTQKLDVLSKQLYIQSKSFDDVVDMCKNHDEMLKCIPAIQPISNKDL RKTASGYGTRIDPIYGTTKFHSGMDFSAHPGTDVYATGDGTVVKVGWETGYGNTIEIDHG FGYLTRYAHLQSYNTKVGKKVVRGEVIGKVGSTGKSTGPHLHYEVHVKGKVVNPVNYYFM DLSAEDYEKMIQLAANHGKVFD >gi|226332214|gb|ACIC01000106.1| GENE 63 63648 - 66266 2880 872 aa, chain + ## HITS:1 COG:ZalaS KEGG:ns NR:ns ## COG: ZalaS COG0013 # Protein_GI_number: 15803211 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Alanyl-tRNA synthetase # Organism: Escherichia coli O157:H7 EDL933 # 3 870 4 871 878 648 43.0 0 MLTAKEIRDSFKNFFETKGHHIVPSAPMVIKDDPTLMFTNAGMNQFKDIILGNHPAKYHR VADSQKCLRVSGKHNDLEEVGHDTYHHTMFEMLGNWSFGDYFKKEAINWAWEYLVEVLKL NPEHLYATVFEGSPEEGLSRDDEAASYWEQYLPKDHIINGNKHDNFWEMGDTGPCGPCSE IHIDLRPAEERAKISGRDLVNHDHPQVIEIWNLVFMQYNRKADGSLEPLPAKVIDTGMGF ERLCMALQGKTSNYDTDVFQPMLKAIAAMSGTEYGKDKQQDIAMRVIADHIRTIAFSITD GQLPSNAKAGYVIRRILRRAVRYGYTFLEQKQSFMYKLLPVLIDNMGDAYPELIAQKGLI EKVIKEEEEAFLRTLETGIRLLDKTMGDTKAAGKTEISGKDAFTLYDTFGFPLDLTELIL RENGMTVNIEEFNTEMQQQKQRARNAAAIETGDWVTLREGTTEFVGYDYTEYEASILRYR QIKQKNQTLYQIVLDCTPFYAESGGQVGDTGVLVSEFETIEVIDTKKENNLPIHITKKLP EHPEAPMMACVDTDKRAACAANHSATHLLDSALREVLGEHIEQKGSLVTPDSLRFDFSHF QKVTDEEIRQVEHLVNAKIRANIPLKEYRNIPIEEAKELGAIALFGEKYGERVRVIQFGS SIEFCGGIHVAATGNIGMVKIISESSVAAGVRRIEAYTGARVEEMLDTIQDTISELKSLF NNAPDLGIAIRKYIEENAGLKKQVEDYMKEKEASLKERLLKNIQEIHGIKVIKFCAPLPA EVVKNIAFQLRGEITENLFFVAGSLDNGKPMLTVMLSDNLVAGGLKAGNLVKEAAKLIQG GGGGQPHFATAGGKNTDGLNAAIEKVLELAGI >gi|226332214|gb|ACIC01000106.1| GENE 64 66498 - 68729 1852 743 aa, chain + ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 22 732 40 765 790 540 41.0 1e-153 MKTPIYLLLIVCIFASCNTKQTQAEIDYTSYVNPFIGTDFTGNTYPGAQAPFGMVQLSPD NGLPGWDRISGYFYPDSTIAGFSHTHLSGTGAGDLYDISFMPVTLPYKEAEAPLGIYSKF SHDEESAYAGYYQVRLKDYHINVELTATERCGIQRYTFPKAEAAIFLNLKKAMNWDFTND 
SHIEVVDSVTIQGYRYSDGWARDQRIYFRTRFSKPFDKVELDTTAIIKDKQHIGTAVIAR FDFHTEEGEQILVNTAISGVSMEGAAKNLQAEVPENDFDKYLAETKANWNRQLGKIEVES NNQDDKVNFYTALYHSMIAPTIYSDVDGAYYGPDKKVHQSDGWVNYSTFSLWDTYRAAHP LFTYTEPERVNDMVKSFIAFFEQNGRLPVWNFYGSETDMMIGYHAVPVIVDAYLKGIGNF DAEKALAACVATANLDNYRGIGLYKQLGYIPYNVTDHYNAENWSLSKTLEYAFDDYCIAE MANKMGKKEIADEFYKRSQNYKNVYNPATSFMQPRDDKGNFIKDFKADEYTPHICESNGW QYFWSVQHDIDGLIDLTGGTSRFAEKLDSMFTYHPAADEELPIFSTGMIGQYAHGNEPSH HVIYLFNAVGQQNLTQKYVAKVMNELYKNEPAGLCGNEDCGQMSAWYVFSAMGFYPVNPV SGKYEIGTPLFPEMKLHLANGKTFTVLAPKVSKENIYIQSIKVDGQPYNKTYLTHEQIMS GATVEFEMGNTPLVEVEFEEQTQ >gi|226332214|gb|ACIC01000106.1| GENE 65 68758 - 69312 535 184 aa, chain + ## HITS:1 COG:CC0981 KEGG:ns NR:ns ## COG: CC0981 COG1595 # Protein_GI_number: 16125233 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Caulobacter vibrioides # 10 172 14 176 201 69 33.0 4e-12 MDENFDLTYKALFRRYYPSLIFYATRLVGEEEAEDVVQDVFVELWKRKDSIEIGEQIQAF LYRAVYTRALNVLKHRNVEDGYCTAMEEINRRRAEFYQPDNNEVIRKIEDRELRKEIHDA INELPDKCKEVFKLSYLHDMKNKEIADVLGVSLRTVEAHMYKALKFLRNRLGHLWFILLL FLLD >gi|226332214|gb|ACIC01000106.1| GENE 66 69365 - 70360 879 331 aa, chain + ## HITS:1 COG:CC0560 KEGG:ns NR:ns ## COG: CC0560 COG3712 # Protein_GI_number: 16124814 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Caulobacter vibrioides # 107 301 112 302 329 76 27.0 6e-14 MRNLSEEIINRYLTGQCSEEELIEVNAWISESDENARQLFRMEEIYHLGKFDHYADEQRM ANAEKRLYKQLSQEKKKKDKVLHMHRWMRYAAILAVALLMGGGAGYWFYNRPEHQMLVAV ANEGIVKEVVLPDGTKVWLNNAATLKYPREFSEKERNVYLDGEAYFEVTKNRHKPFTVES DAMRVRVLGTTFNFKCDKRCRIAEATLIEGEIEVKGNKDEGQIVLAPGQRAELNRNSGRL TVKQVDAKLDAVWRDNLIPFNKANIFTITKALERFYDVKIILSPDIRSDKTYSGVLKKKS TIESVLKSLQNSIPIEYKIVGNNIFISSDKK >gi|226332214|gb|ACIC01000106.1| GENE 67 70392 - 72677 2048 761 aa, chain + ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G 
Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 28 751 45 765 790 540 39.0 1e-153 MKLKTLLIGCFGGFIMLNSCTESPSVKDYSAYVNPFIGTGGHGHTFPGAVVPHGMIQPSP DTRIDGWDACSGYYYDDNTINGFSHTHVSGTGCCDYGDVLLMPIVGKPQYLTTDPESQKL AYASAFSHENETAEPGYYSVFLDTYQVKAEITATKRGAIHRYTFPESTESGFIIDLDYSL QRQTNYEMEIEVISDTEICGHKKTTYWAFDQYINFYAKFSKPFAYTLITDSVTMDNGKRL PVCKAVLHFNTKKDEEVLVKVGVSAVDIAGARKNVESEIPEWDFDKVRKDARTAWNNYLS KIDITTSDKEDKTIFYTALYHTAISPNLFTDADGRYLGMDLEVHQGDTINPLYTVFSLWD TFRALHPLMTIIDPNLNNQFINSLIRKHQEGGIYPMWDLASNYTGTMIGYHAVPVIVDAY MKGYRNFDAHEAYKASLRAAEYDTTGIKCPDLVLPHLMPKAKYYKNAIGYIPCDRENESV AKALEYAYDDWCISIFAEAMNDFENKAKYERFAKAYEFYFDKSTRFMRGLDSNGEWRTPF NPRASTHRSDDYCEGTAWQWTWFVPHDVEGLVKLMGGEEAFVEKLDSLFTVDSSLDGETT SADISGLIGQYAHGNEPSHHVIHLYNYVNRPWRTQELVDSVYRSQYANNVDGLSGNEDCG QMSAWYILNSMGFYQVCPGKPVYSIGRPAFDKAVINLPSEKTFSIVVKNNSKSNKYIESV LLNGKPLDTPFFGHQDIVAGGIMEIKMTDHPTQWGSNNSPQ >gi|226332214|gb|ACIC01000106.1| GENE 68 72701 - 74971 1959 756 aa, chain + ## HITS:1 COG:L135972 KEGG:ns NR:ns ## COG: L135972 COG3537 # Protein_GI_number: 15673483 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Lactococcus lactis # 31 754 13 716 717 435 34.0 1e-121 MKRLLLIACVLSCTLLSQAKDWTQYVNPLMGSQSTFELSTGNTYPAIARPWGMNFWTPQT GKMGDGWQYTYTANKIRGFKQTHQPSPWINDYGQFSIMPIVGQPVFDEEKRASWFAHKGE VATPYYYKVYLAEHDIVTEMTPTERAVLFRFTFPENDHSYVIVDAFDKGSSIKILPEENK IIGYTTRNSGGVPENFKNYFIIEFDKPFTYKATVENGNLQENVAEQTTDHAGAIIGFKTR KGEQVNARIASSFISFEQAAANMNELGKDNIEQLAQKGKDAWNQVLGKIEVEGGNLDQYR TFYSCLYRSLLFPRKFYELDANGQPIHYSPYNGQVLPGYMFTDTGFWDTFRCLFPLLNLM YPSVNKEMQEGLINTYLESGFFPEWASPGHRGCMVGNNSASILVDAYMKGVKVDDIKTLY EGLIHGTENVHPEVSSTGRLGYEYYNKLGYVPYDVKINENAARTLEYAYDDWCIYRLAKE LKRPKKEINLFAKRAMNYKNLFDKESKLMRGRNEDGTFQSPFSPLKWGDAFTEGNSWHYT WSVFHDPQGLIDLMGGKEMFITMMDSVFAVPPIFDDSYYGQVIHEIREMTVMNMGNYAHG NQPIQHMIYLYDYAGQPWKAQYWLRQVMDRMYTPGPDGYCGDEDNGQTSAWYVFSALGFY PVCPGTDEYVMGAPLFKKATLHFENGNSLVIDAPNNSTENFYIDSMSFNGADHTKNYLRH 
EDLFKGGTIKVDMSNRPNLNRGTKEEDMPYSFSKEQ >gi|226332214|gb|ACIC01000106.1| GENE 69 75163 - 75597 253 144 aa, chain + ## HITS:1 COG:no KEGG:BT_3989 NR:ns ## KEGG: BT_3989 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 144 1 144 144 285 99.0 5e-76 MEEEIDFTKVPYQYAMCLNRECSKANTCLRLLTAQSVPEKIEYWVIISPKHLAAQQGNCP YYRSNVKVRYAKGFIRMLEDLPYKQMQTVISHLMSYFGRRTYYRIRKGERLLTPSEQQRI LNILKNYGATHLQNFDAYVEDYDW >gi|226332214|gb|ACIC01000106.1| GENE 70 75845 - 77179 977 444 aa, chain - ## HITS:1 COG:no KEGG:BT_3988 NR:ns ## KEGG: BT_3988 # Name: not_defined # Def: putative peptidoglycan bound protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 444 1 444 444 823 99.0 0 MKAHYKLFLSLAIGSFVTFAGCQDDEVVDLVKYPVNQPAITIDDAEGASKATLTAVYKSD GTLELDGPVTRTYTFHFAASPEDATVTFDIINTNIPKENVEISATKVVLPAGSTDASVTV TLKDEDFSFAASNYDATTYELGVKASVEGYKIGTEPIESKVVIEKEAYTASCSVVGESGN NVTFERAFSQGAIVNPDPISYTFKMKLDKPARKDVKVKLATTGLDEQFMKNITVTPAEIT IAAGELESAEVTWTITDDFLLTTAGEESHTLVVTASAESEDPVVKVNSKENFLTFNVDKV LRNFKYLSVIGSNWTELSKAGWSAEIPSGVSGRASYLIDGNGGSYGSDVYSYNPFWFVID MKSAQTFNALGMDYYYTYASKKVRISTSLDNENWTPQGVLEVPRVGNHYFQFFSSITARY VKVELLEGFGSYIDVTEVYIYNAQ >gi|226332214|gb|ACIC01000106.1| GENE 71 77210 - 78640 1144 476 aa, chain - ## HITS:1 COG:no KEGG:BT_3987 NR:ns ## KEGG: BT_3987 # Name: not_defined # Def: endo-beta-N-acetylglucosaminidase F1 precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 476 1 476 476 944 100.0 0 MNMKYITSGLFAAMIVSSTAFLTSCADDLEVGKNIDESAYSGIYENNAYLRDGKSNLVSK VVELHGETYATTVKMGLSKTPNTATSAKVKIDAAYLETYNKAHNTDFALYPQDLVTFANE GILTVNANTKSAEVEMTIRAGEGLQEDKTYAIPVAISDQSSDITIKDEDAKHCIYLVKDM RNAGDAYKGEGVMQGYLFFEVNDVNPLNTLSFQLENGKLLWDVVVLFAANINYDAEAGRP RVQCNPNVQYLLDNNETLLQPLRRRGVKVLLGLLGNHDITGLAQLSEQGAKDFAREVAQY CKAYNLDGVNYDDEYSNSPDLSNPSLTNPSTAAAARLCYETKQAMPDKLVTVFDWGQMYG VATVDGVDAKEWIDIVVANYGSAAYPIGQMTKKQCSGISMEFNLGGGGSLSASKAQSMID GGYGWFMGFAPSPAKYGSVFSRLQGGGEVLYGSNVAAPTIFYKKNDPTPYKYPDDL >gi|226332214|gb|ACIC01000106.1| GENE 72 78674 - 79828 
830 384 aa, chain - ## HITS:1 COG:no KEGG:BT_3986 NR:ns ## KEGG: BT_3986 # Name: not_defined # Def: putative patatin-like protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 384 1 384 384 756 99.0 0 MKLYKYCFALLALTTVTVSGCKNEDINEEHHYDNKLYVSSAPVCDDLLIKPSITEATREL SYRIASPAEQDIQISFDAAPAMTAAYNLIYNDNATALDSYFYNIPTKTATIKAGDISSDN IVIDFKNTNELDKSKRYVLPVTILDASNIDVLESARTAYFIFKGAALINVVANIKEIYFP INWKSSVNSLSTVTIEALVRSEDWVAGRDNALSSVFGIEGKFLVRIGDADRPRDQVQVVA PGGNFPGPNVVPGLPVNEWVHIAIVYDNTTKERIYYKDGVPVYKDEAASGNVSLSSGCYI GRAWDDTRWLPGDISEVRVWSVQRTAEQIATNPYEVDPASEGLVAYWKFNEGAGNVITDQ TGHGNDITGSGDPTWVKVEIPKIN >gi|226332214|gb|ACIC01000106.1| GENE 73 79837 - 80913 931 358 aa, chain - ## HITS:1 COG:no KEGG:BT_3985 NR:ns ## KEGG: BT_3985 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 358 1 358 358 710 99.0 0 MKNNLKYRLSALLFLASALTITSCSDWTDVESLQLNTPTFEEQNPQLYADYLKDLNRYKS EEHKVTFVSFENPKGSPGKQAERLTVVPDSVDFICLNNPEVSPEVQAEMVKIREKGTRTV YSIDYSSIENAWKEKVKAEPELTEEDALQYIGERTNEMLALCDKYNFDGVIADYTGRSLV SLPEAALKEYNDRQQKFFGEVMNWQGNHDDKTLVFYGNVQYLVPENMDMLSKFDYIMLKT ASSTNADDLALKAYLAIQAGIDAVGGTEGGVNPVPADRFIVCVELPQADDKDKVKGYWST VDEKGNKLVAAPGAARWMVEASPNYTRKGIFIMNVHNDYYNNTYGYVREVIRIMNPNK >gi|226332214|gb|ACIC01000106.1| GENE 74 80938 - 82551 1674 537 aa, chain - ## HITS:1 COG:no KEGG:BT_3984 NR:ns ## KEGG: BT_3984 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 537 1 537 537 1064 100.0 0 MKNIINKLVFGSLTVALSSCIGNYENINSNPYEAPDLSADGYALGSAMNNLAGCVVSPDV NTAQFTDCLLGGPLGGYFADSNAGFTETISNFNPKDDWSRVFLKSDKIIPTLYSNLTQVK LVSQNTNDPVPYAIAQVIKVAAMHRVTDAFGPIPYSQIGANGEIATPYDSQEVTYNTFFD ELNAAIATLNENSNEQLVPTADYIYKGDVKKWIRFANSLKLRLAIRIAYANPVKAQQMAE EAVNPANGGVIESNADNATWNYFETSQNPIYVATRYNQVQTSDHGGVPCLTGGDTHAAAD IICYMNGYKDNRREKFFTKSEWAGQDYVGMRRGIVIPELKTTGHKYSGVNIAPTSPLYWM NAAEVAFLRAEGQAVFNFSMGGTAESFYNQGIRLSFEQWGADGVEDYLKDDVNKPTAYTD PAGTNTYQNALSNITIKWNDSADKEEKQERIIVQKWIANWQLGNEAWADFRRTGYPKLIP 
VKENKSGGVVDSEKGARRMPYPLDEFVSNKANVEYAIANYLHGADNMATDVWWASKK >gi|226332214|gb|ACIC01000106.1| GENE 75 82568 - 85975 2868 1135 aa, chain - ## HITS:1 COG:no KEGG:BT_3983 NR:ns ## KEGG: BT_3983 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1135 1 1135 1135 2150 100.0 0 MFNLKFKCMRINDSCRGSKIFRALLILMLFALPAQSAIAQLTIRISNSSLGTVIKQIQSQ SKYQFFYDDNLASMRIESLNVKDVSLVEVLDKALKGKNVVYKIDDNVVYLSKANASSSTK STQQQQKVTGKVVDANNEPLIGVSVLEKGTTNGTITDFDGNYTLVVTGSNAVLQFSYVGY QTLERAVAGKTAINITLKEDAQVLDEVVVTALGIKRSEKALSYNVQQVNADAVTTNKDPN FINSLSGKVAGVNINASSSGVGGVSKVVMRGTKSIMQSSNALYVVDGVPMYSNANKVNGT EFSSKGNTEPIADINPEDIESMSVLTGAAAAALYGSDAANGAIIITTKKGKEGRVNITVN SNVEFNAPLVMPRFQTRYGTGIGGVKDDNSSRSWGPKLTEARYFGYNPRDDYFQTGVIGT ESVSFSTGSEKNQTYASAAAVNSKGIVPNNKYDRYNFNVRNTTSFLDDKMTLDVNASYIL QKDRNMVNQGTYNNPLVGAYLFPRGNDWEDIKMYERYDVARKIYTQYWPTGDGNMTMQNP YWVNYRNLRENKKDRYMLGASLNYQILDWLNVSGRVRLDNSNNDYTEKAYASTNTQLTEL SDRGLYGISRSYEKQLYADFLVSVNKTFGEKWSLQANMGGSFTDMRYDEMAVRGPIADDS KTFAGEKAGLTNGFYIQNLSTTKTSKMQSGWREQTQSIYASAEVGYQSTYYLTLTGRNDW PSQLAGRNSVNKSFFYPSVGMSVVLSELMPKLNKDYLSYWKIRGSFASVGTAFERYIANP LFAWNTSIGQWSNLTDFPVYDLKPERTNSFEVGMNMRFLKNFELDVTYYNAKTMNQTFNP ELPVGEYARIYIQTGAVRNQGLELALNYNNTWKDFTWNTGVTYSMNKNKILTLADNAINP ITREKFSISSLNMGGLGSTRFILKEGGSMGDIYSLMDLKRDANGAVYIDENNSVVTESLE ANNYIKLGSVLPKGNLAWRNNFSWKNINVAFLVSARLGGVVFSRTQAVLDNFGVSEASAA ARDKGYVSVNGNDRVNPEGWYSVVAGGTAVPQYYIYSATNIRLQEASIGYTIPRKWLGNV CDIKVSLIGRNLWMIYNKAPFDPESVASTDNFYQGIDYFMMPSLRNIGFNLSFKF >gi|226332214|gb|ACIC01000106.1| GENE 76 86177 - 87028 549 283 aa, chain - ## HITS:1 COG:no KEGG:BT_3982 NR:ns ## KEGG: BT_3982 # Name: not_defined # Def: exonuclease V subunit alpha # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 283 190 472 472 592 100.0 1e-168 YGLEVREVDLTQVVRQIQESGILWNATQLRQLIAEDNCYSLPKIKITGFPDIKMVPGTEL IDAITSCYDHDGMDETIVICRSNKRANLYNNGIRAQILWREDELNTGDMLMIAKNNYYWT EQYKEMDFIANGEIAVVRRVRKTREMYGFRFAEVTLRFPDQNDFELDANLLLDTLHSDSP ALPKVDNDRLFYTILEDYADISNKRDRMKKMKADPHYNALQVKYAYAITCHKAQGGQWQN 
VFLDQGYMTDEYLTPDYFRWLYTAFTRATKTLYLVNYPKEQIL Prediction of potential genes in microbial genomes Time: Thu May 12 02:00:02 2011 Seq name: gi|226332213|gb|ACIC01000107.1| Bacteroides sp. 1_1_6 cont1.107, whole genome shotgun sequence Length of sequence - 19398 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 11, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 575 216 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member - Prom 617 - 676 7.3 + Prom 523 - 582 6.6 2 2 Op 1 . + CDS 671 - 1357 767 ## BT_3981 hypothetical protein 3 2 Op 2 . + CDS 1357 - 2127 495 ## BT_3980 hypothetical protein 4 2 Op 3 . + CDS 2118 - 2651 222 ## PROTEIN SUPPORTED gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 5 3 Tu 1 . - CDS 2686 - 4125 1093 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes - Prom 4154 - 4213 5.2 - Term 4174 - 4244 5.6 6 4 Op 1 . - CDS 4269 - 7082 1465 ## BT_3977 hypothetical protein 7 4 Op 2 . - CDS 7163 - 7567 217 ## BT_3976 hypothetical protein - Prom 7602 - 7661 9.8 + Prom 7399 - 7458 5.2 8 5 Tu 1 . + CDS 7648 - 8709 1095 ## COG0337 3-dehydroquinate synthetase 9 6 Tu 1 . - CDS 8833 - 9729 619 ## BT_3974 hypothetical protein - Prom 9864 - 9923 7.6 - Term 10062 - 10105 7.2 10 7 Tu 1 . - CDS 10212 - 10961 367 ## gi|253570268|ref|ZP_04847677.1| predicted protein - Prom 11158 - 11217 5.5 11 8 Op 1 . - CDS 11845 - 12840 426 ## BDI_2534 hypothetical protein 12 8 Op 2 . - CDS 12893 - 13684 396 ## gi|253570270|ref|ZP_04847679.1| conserved hypothetical protein 13 8 Op 3 . - CDS 13687 - 14931 863 ## BDI_2532 tyrosine type site-specific recombinase - Prom 14961 - 15020 6.9 + TRNA 15454 - 15530 62.7 # Pro GGG 0 0 + Prom 15456 - 15515 79.1 14 9 Op 1 . + CDS 15586 - 16806 961 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) 15 9 Op 2 .
+ CDS 16827 - 17369 523 ## COG1443 Isopentenyldiphosphate isomerase + Term 17458 - 17499 1.1 - Term 17070 - 17122 -0.3 16 10 Tu 1 . - CDS 17347 - 17889 508 ## COG0386 Glutathione peroxidase - Prom 17909 - 17968 3.9 + Prom 18327 - 18386 3.9 17 11 Tu 1 . + CDS 18614 - 18997 125 ## BT_3970 hypothetical protein + Term 19184 - 19227 2.0 Predicted protein(s) >gi|226332213|gb|ACIC01000107.1| GENE 1 2 - 575 216 191 aa, chain - ## HITS:1 COG:BMEI0619 KEGG:ns NR:ns ## COG: BMEI0619 COG0507 # Protein_GI_number: 17986902 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Brucella melitensis # 21 174 6 151 373 68 34.0 7e-12 MINNYLERQIKENFPYQPTFEQEIAVKSLSEFLLSTLADEVFILRGYAGTGKTSLVGALV KTMDQLQQKAVLLAPTGRAAKVFSSYAGHPAFTIHKKIYRQQSFSNETSNFSINDNLATN TLFIVDEASMISNEGLSGSMFGTGRLLDDLIQFVYSGQGCRLLLMGDTAQLPPVGEELSP ALFADALKGYG >gi|226332213|gb|ACIC01000107.1| GENE 2 671 - 1357 767 228 aa, chain + ## HITS:1 COG:no KEGG:BT_3981 NR:ns ## KEGG: BT_3981 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 228 1 228 228 457 100.0 1e-127 MKTVFNIVLVLCAAALIYICYSSIMGPINFEKAKKEREQAVIARLIDIRKAQQEYRSLHH GMYTEHFDTLIDFVKNQKLPFVMKVGQLTDDQLESGLTEKKAMAIINKAKKTGKYDEVKK NGLENFKRDTMWVAVMDTIYPKGFNADSMKYIPYGNGAIFEMNVKNDTAKSGAPVYLFEV KAPYETYLGGLDRQEIINLKDLNEKLGRYSGLMVGSIDNPNNGAGNWE >gi|226332213|gb|ACIC01000107.1| GENE 3 1357 - 2127 495 256 aa, chain + ## HITS:1 COG:no KEGG:BT_3980 NR:ns ## KEGG: BT_3980 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 256 1 256 256 462 100.0 1e-129 MIDFTKSKQYTLSIRLSTDGFSFSIYNPINDNSQSLFEKEVDTSLSLTANLKNVFHESDF LSYSYKRVNIMIASKRFTMIPLELFEEEQAELLFYHNHQKRENEIVMYNILKKNNVVIIF GIDKSTYTFLNEQYPEARFYSQSTPLIEYFSIKSRLGNSKKMYASVRKDAIDIYCFERGQ LLLANSFECMQTEDRIYYLLYVWKQLEFNQERDELHLTGTLSDKETLMNELKKFILQVFI MNPANNIDMQALLTCE >gi|226332213|gb|ACIC01000107.1| GENE 4 2118 - 2651 222 
177 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 [Bacillus selenitireducens MLS10] # 1 177 13 193 199 90 32 1e-17 MRVISGIYKRRRFDVPRTFKARPTTDFAKENLFNVLNNYIDFEEGVTALDLFAGTGSISI ELVSRGCDRVISVEKDPAHHSFICKIMKEVQTDKCLPIRGDVFKFINSGREQFDFIFADP PYALKELESIPELIFKNNLLKEGGLFVLEHGKQNNFEDHPHFIERRVYGSVNFSLFR >gi|226332213|gb|ACIC01000107.1| GENE 5 2686 - 4125 1093 479 aa, chain - ## HITS:1 COG:BS_ywnE KEGG:ns NR:ns ## COG: BS_ywnE COG1502 # Protein_GI_number: 16080712 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes # Organism: Bacillus subtilis # 11 479 3 482 482 351 37.0 2e-96 MIDWNILVSQIATVAFDIVYFGAIISTIVVIILDNRNPVKTMAWILILLFLPIVGLVFYF FFGRSQRRERIIGQKSYDRLLKKPMAEYMAQDCSDVPEEYARLIQLFQHTNQAFPFEGNR IAVYTEGYTKLQSLLRELQKAKQHIHIEYYIFEDDAIGRLVRDVLIEKASQGVEVRVIYD DVGCWHVPNRFFEEMRNAGIEVRSFLKVRFPLFTSRVNYRNHRKIVVIDGRIGFVGGMNL AERYMRGFSWGIWRDTHILIEGKAVHGLQTAFLLDWYFVDRTLITASRYFPKIESCGSSL VQVVTSEPIGPWKEIMQGLTMAINGAKKYFYMQTPYFLPTEQVLAAMQTAALAGVDVRLM LPERADNRITHLGSHSYLADVMQAGVKVYFYKKGFLHSKLMVSDDMLSTVGSTNVDFRSF EHNFEVNAFMYDVETALEMKEIFLQDQRESTQIFLKNWVKRSWRRRAAESVVRLLAPLL >gi|226332213|gb|ACIC01000107.1| GENE 6 4269 - 7082 1465 937 aa, chain - ## HITS:1 COG:no KEGG:BT_3977 NR:ns ## KEGG: BT_3977 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 912 1 912 937 1665 99.0 0 MKRLSVWWVLTLLCTFSIFAQNKVITVSGRVVESDSKEPAAQATIQLLSLPDSAYAAGIA SSNKGWFTLPKVKAGKYLLKVSYIGFRTKFVPVQLSNNVTEKKMGTISLDPDAVMLSEAV ITAEAPPVTVKADTTEYSAAAYPVPEGSMLEDLVKKIPGAEVSDEGKITINGKEIKKIMV DGKEFFSDDPKVSMKNLPANMIEKVKAYDKKSDMARITGIDDGDEEPVLDLTVKKGMRKG WIGNLIAGYGSQDRYEGGVMISRFKDDASLSIIGSANNTNNKGFSEFGDAGQGLGGGNAG SGITTAQSLGVNFAKDTKKLQIGGNVQYGHSDNDARRKTSSETFLGETSSFAQSENFSNR NRHDFRVDFRLEWRPDTLTTIIFRPNGSYSQTESSSRSWSKTENNSHSPVNEKEAASSSK SHNASFNGSLMAFRRLNNKGRNLSLGARFGYSDSESDSYSDSKTEFFEKDSISDISRYTD
RNSDSRNWSVSASYTEPVFKNHFLQLRYEFAHRKQLSQSLVYDSINYYPYPEYIERGYDN ELSTRVENFYDTHTADVSVRGIHPKMMYSVGVGVTPQSSLSKTTIGPNYKKNLPEQNVLN WAPSVMFRYMFNKQHVLMFRYRGRSSTPNIEDLQEVIDITDPMNLRYGNPNLKPSFNNNF TLDYRKFVPESMRSYSANLYYTSTLNSVANRMTYDPQTGARVYKKENVNGNWQARGFFSF NTPLKNKKFTISSNTNARFSDAVSYTSVGSSKNADQELSTTHNLGLGERFTSSYRSEVFD ISLSGSVNYNLVRNSKQENSNRETFDYYIGGNTNVNLPWQVSISTDINCRFKDGYTGGLN NNEVMWNAQISKNFLKNNSGTIRFKIYDILKQQSSLSRSISETMMSDTEYNTLGSYFMVH FVYRFNTLGGKSPNRRGPGGGPRGGHGFGGGGRPMRF >gi|226332213|gb|ACIC01000107.1| GENE 7 7163 - 7567 217 134 aa, chain - ## HITS:1 COG:no KEGG:BT_3976 NR:ns ## KEGG: BT_3976 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 134 1 134 134 214 100.0 8e-55 MKALFPVFCRPLGYVVLLVALFLLPILLMQGLITDHNLLFYKECTKLLMMAGCLLIIFAL SKNESRETEQIRNAAVRNAVFLTFLFVFGGMLWRVLQGDVINVDTSSFLTFLIFNVLCLE FGLKKALVDRFFKR >gi|226332213|gb|ACIC01000107.1| GENE 8 7648 - 8709 1095 353 aa, chain + ## HITS:1 COG:FN0871_1 KEGG:ns NR:ns ## COG: FN0871_1 COG0337 # Protein_GI_number: 19704206 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate synthetase # Organism: Fusobacterium nucleatum # 26 348 26 349 350 184 35.0 2e-46 MSKQEVILCESLETSLGRAIERCPHDKLFILTDEHTQRLCLPSLKEVSFLKDAVEICIGA EDVHKTLETLASVWMALSQQGATRHSLLINLGGGMVTDLGGFAAATFKRGISYINIPTTL LAMVDASVGGKTGINFNGLKNEIGSFAPADSVLIETEFLRSLDAQNFFSGYAEMLKHGLI SNTAHWVELLDFNTNNIDYAYLKNMVGRSVQVKEDIVEQDPFERGIRKALNLGHTAGHAF ESLALAENRPVLHGYAVAWGIVCELYLSHLKVGFPKDKMRQTIQFIKENYGSFAFDCKQY EQLYAFMQHDKKNTSGTVNFTLLKEIGDICINQTADKDTIFEMLDFYRECMGV >gi|226332213|gb|ACIC01000107.1| GENE 9 8833 - 9729 619 298 aa, chain - ## HITS:1 COG:no KEGG:BT_3974 NR:ns ## KEGG: BT_3974 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 35 298 1 264 264 514 99.0 1e-144 MYLVLFNEGVNVYNKNRETLEAYVLGDDFKYFQNLKYLPLSVYPSNIQSALVVYCFSGKA KLNVYDDSHWIQPQELIILLPGQFVSFTEASDDFLTTTLVVSPTMFSDALSGVPRFSPHF FFYMRTHYWYPQTENDTRRLMNFFGMVKDKVTSNDIYRRELIIHLLRYLYLELFNAYEKE 
ASLMTTRKDTRKEELANKFFGLIMKHFKENKDVAFYADKLCITSKYLTMVIKEVSGKSAK DWIVEYIVLEIKALLKNTNMNIQEIAVKTNFANQSSLGRFFRKHTGMSLSQYRMSNLG >gi|226332213|gb|ACIC01000107.1| GENE 10 10212 - 10961 367 249 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570268|ref|ZP_04847677.1| ## NR: gi|253570268|ref|ZP_04847677.1| predicted protein [Bacteroides sp. 1_1_6] # 1 249 1 249 249 500 100.0 1e-140 MKRTFLIVLSLCAMLIGKAQERPVLDLSNNNCRFEVGYNKVKTICTGARSIEQLHDLYNM DVCSYYNLTQYDTELKRKTFANSEDGKTLLETMKNIKTNLNAMKHYYIFPFSANMGWDKA YNLKTKTFDFGYVVDETGFLPVSGYLNFPQFAIKCTPKIRQTVQKRQSTDRASYLTTIKI PMTESEALAIEEHIADLALAVQFTIQGWGETKRKINLGGAVFITTTCVKAVSPKIYIIDK NTNEIYFEL >gi|226332213|gb|ACIC01000107.1| GENE 11 11845 - 12840 426 331 aa, chain - ## HITS:1 COG:no KEGG:BDI_2534 NR:ns ## KEGG: BDI_2534 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 53 313 56 315 325 66 27.0 2e-09 MYDKIKLILYDLPTGYDWQTVLQRTVVKDYFANGTGGKGYWLGRRVIATETYISFEGSLP KCLWGHNLKTLSLRQVKWLIMKLSKDLGVPMYKAVVESAEFAHNFSMTEPPIMYMQKLDA MKKFRPNEWNGTKYIEDEEVRCKFYDKIQEAKKKRELPKYGRENLPRNLLRYEVTFSTKG LNRLFGRDIVAEELWSKQVFWTLVAEWFGYYEDMVKLPNDCWDVDYHIFESAKDFAKWCI CIANADQNLSYYVKHVLFKLRANPQPQDRVLHGQIQKKIQEALEWGKKHLALPNLTLELT GKIEQYLAGLLEQSADGMSVAEERRIFNTAC >gi|226332213|gb|ACIC01000107.1| GENE 12 12893 - 13684 396 263 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570270|ref|ZP_04847679.1| ## NR: gi|253570270|ref|ZP_04847679.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 263 1 263 263 520 100.0 1e-146 MSMAINMPLGGLINKSLKGEARFAHEEWLRTYVKRLTNGNKTKLKQALNEVDEAINQWRD GFFISTYDEKLSAVRQGKLNMDSQELEDVYNEETYLQKEGKTGLVADFLYNSIAQYGFIN EHHRDAIESAWMFHDMEFLQQQWEYYALAQIRSLRAVIVSMLGTVPTIPETEPQNAKREP KRIEDYPEVFDITLCCELTNYAKDTIYKWTRTREIPCHRSGTNGRKLVFKRDEIVAWMTA RKQETKDEFIKRMESQLAARLRK >gi|226332213|gb|ACIC01000107.1| GENE 13 13687 - 14931 863 414 aa, chain - ## HITS:1 COG:no KEGG:BDI_2532 NR:ns ## KEGG: BDI_2532 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: P.distasonis # Pathway: not_defined # 1 410 1 410 422 372 48.0 1e-101 MARTKIVLFKSNIRRDGSCPVCLRVAKEDKTKYIDLQLSATKGQWDELASRFKKDKRVNP NYENYNALLNRYEVRKDEILQKFMEERVNWTLNQFEEEFLGMSKQGKVYDYFMRQVENLK ATRHIGNAKVYERTLHMLAKYDDKIEERLFSELDVKYINRFNLEMEKDGCCGNTRKYYLK TLRAVINKAIKEREASSNTYPFGKGGFEIGKLAEETAKRYLSPHDLELIKNSPQQNPVLE LSRRVFLFSYLCFGMSFIDEAMLTKNNIDTFGTEEHIVYKRQKTQNAKNAKPITIPVTPA IREQLEWFKANTALTGNYLLPIITRDYEGEQLYDHIRSRYKRINDGLKQLGKLLRIRMNL TTYVSRHTMAMTLQGNDVPREIISQALGHRNLTTTNVYLDSFSTSVLDRVAKIL >gi|226332213|gb|ACIC01000107.1| GENE 14 15586 - 16806 961 406 aa, chain + ## HITS:1 COG:HI0245 KEGG:ns NR:ns ## COG: HI0245 COG0809 # Protein_GI_number: 16272205 # Func_class: J Translation, ribosomal structure and biogenesis # Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Haemophilus influenzae # 8 404 1 353 363 231 34.0 2e-60 MKEDPKHIHISEYNYPLSDERIAKFPLAVRDQSKLLVYRHGEVSEDVFTSLPDYLPKGSL MVFNNTKVIQARLHFRKETGALIEVFCLEPIQPNDYVLNFQQTEHAAWLCMIGNLKKWKD GALKREMTVKGFPITLTATRGECRGTSHWIDFSWDNPEVTFADILEVFGELPIPPYLNRD TEESDKETYQTVYSKIKGSVAAPTAGLHFTPRVLEALQEKGIDLEELTLHVGAGTFKPVK SEEIEGHEMHTEYISVNRATIKKLIDHDGCAVAVGTTSVRTLESLYHIGVTLAENPDATE EELHVKQWQPYEKYDQIPPVVALQKILGYLDRNGLEALHSSTQIIIAPGYQYKIVKAMVT NFHQPQSTLLLLVSAFVKGNWRTIYDYALAHDFRFLSYGDSSLLIP >gi|226332213|gb|ACIC01000107.1| GENE 15 16827 - 17369 523 180 aa, chain + ## HITS:1 COG:MT1787 KEGG:ns NR:ns ## COG: MT1787 COG1443 # Protein_GI_number: 15841209 # Func_class: I Lipid 
transport and metabolism # Function: Isopentenyldiphosphate isomerase # Organism: Mycobacterium tuberculosis CDC1551 # 8 174 12 185 203 63 25.0 2e-10 MPSDNNQEMFPIVDEQGNITGAATRGECHSGSKLLHPVIHLHVFNSKGDLYLQKRPEWKD IQPGKWDTAVGGHIDLGESVEIALKREVREELGITDFTPELLTSYVFESDREKELVFVHK TVYDGELHPSDELDGGRFWTIEEIEENLGKGIFTPNFEGELKKVSLIPSLFPAPTPALSK >gi|226332213|gb|ACIC01000107.1| GENE 16 17347 - 17889 508 180 aa, chain - ## HITS:1 COG:PA0838 KEGG:ns NR:ns ## COG: PA0838 COG0386 # Protein_GI_number: 15596035 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione peroxidase # Organism: Pseudomonas aeruginosa # 23 179 3 157 160 155 45.0 3e-38 MKSFILMVITMVCAISLEAQNKSFYDFNVTTIDGKEFPLSSLKGKKVLVVNVASKCGLTP QYAKLQELYDKYKDKNFVIIGFPANNFMGQEPGSNEEIAQFCSLKYDVTFPMMAKISVKG KNMSPLYQWLTEKKLNGKEDAPVQWNFQKFMIDENGNWVGFVAPKESPLSEKIVTWIEQE >gi|226332213|gb|ACIC01000107.1| GENE 17 18614 - 18997 125 127 aa, chain + ## HITS:1 COG:no KEGG:BT_3970 NR:ns ## KEGG: BT_3970 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 127 112 238 238 258 96.0 4e-68 MNRPDKETLHRVIRKMHRQNDCEKFIELASVFTGMLGKLDFRSVEFSSDDPYHPFPLYMK HIHSVICEEEYLYILCYNGNLHILGTYERGHILIPLKEQLTLRELFCWTCRSWWDRLLTK SENELPI Prediction of potential genes in microbial genomes Time: Thu May 12 02:01:17 2011 Seq name: gi|226332212|gb|ACIC01000108.1| Bacteroides sp. 1_1_6 cont1.108, whole genome shotgun sequence Length of sequence - 81064 bp Number of predicted genes - 56, with homology - 56 Number of transcription units - 23, operones - 14 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 11/0.000 + CDS 477 - 4802 4504 ## COG3696 Putative silver efflux pump + Prom 4806 - 4865 2.6 2 1 Op 2 . + CDS 4885 - 5814 809 ## COG0845 Membrane-fusion protein 3 1 Op 3 9/0.000 + CDS 5877 - 6935 818 ## COG3275 Putative regulator of cell autolysis 4 1 Op 4 . 
+ CDS 6932 - 7744 601 ## COG3279 Response regulator of the LytR/AlgR family - Term 7804 - 7844 8.3 5 2 Op 1 . - CDS 7871 - 10183 1811 ## COG3537 Putative alpha-1,2-mannosidase 6 2 Op 2 . - CDS 10224 - 11018 1007 ## BT_3964 putative secretory protein 7 2 Op 3 . - CDS 11048 - 13360 2250 ## COG3537 Putative alpha-1,2-mannosidase 8 2 Op 4 . - CDS 13394 - 15679 2257 ## COG3537 Putative alpha-1,2-mannosidase - Prom 15865 - 15924 6.1 - Term 15892 - 15944 2.5 9 3 Op 1 . - CDS 15961 - 17196 901 ## BT_3961 hypothetical protein 10 3 Op 2 . - CDS 17222 - 18748 1153 ## BT_3960 hypothetical protein 11 3 Op 3 . - CDS 18806 - 20770 1622 ## BT_3959 putative outer membrane protein 12 3 Op 4 . - CDS 20795 - 23947 2869 ## BT_3958 hypothetical protein 13 4 Op 1 . - CDS 24579 - 24815 212 ## COG2207 AraC-type DNA-binding domain-containing proteins 14 4 Op 2 . - CDS 24900 - 28526 2215 ## BT_3957 transcriptional regulator - Prom 28572 - 28631 5.9 - Term 28739 - 28795 12.7 15 5 Op 1 . - CDS 28831 - 30222 1176 ## COG3669 Alpha-L-fucosidase 16 5 Op 2 . - CDS 30301 - 31653 1185 ## BT_3955 hypothetical protein 17 5 Op 3 . - CDS 31681 - 33036 1428 ## BT_3954 hypothetical protein 18 5 Op 4 . - CDS 33074 - 35008 1769 ## BT_3953 putative outer membrane protein 19 5 Op 5 . - CDS 35030 - 38173 3073 ## BT_3952 hypothetical protein - Prom 38348 - 38407 6.4 - Term 38311 - 38345 1.0 20 6 Tu 1 . - CDS 38470 - 42444 2299 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 42565 - 42624 4.2 - Term 42566 - 42611 8.0 21 7 Op 1 . - CDS 42641 - 44029 1796 ## COG1109 Phosphomannomutase 22 7 Op 2 . - CDS 44059 - 44703 475 ## BT_3949 hypothetical protein - Prom 44725 - 44784 4.8 - Term 44751 - 44796 1.2 23 8 Op 1 . - CDS 44822 - 45859 829 ## COG0618 Exopolyphosphatase-related proteins 24 8 Op 2 . - CDS 45905 - 47578 397 ## COG0658 Predicted membrane metal-binding protein - Term 47932 - 47971 6.5 25 9 Op 1 1/0.000 - CDS 48016 - 48666 700 ## COG0036 Pentose-5-phosphate-3-epimerase 26 9 Op 2 . 
- CDS 48719 - 49687 995 ## COG0223 Methionyl-tRNA formyltransferase 27 9 Op 3 . - CDS 49750 - 51546 1452 ## COG0038 Chloride channel protein EriC 28 9 Op 4 . - CDS 51547 - 52110 560 ## COG0009 Putative translation factor (SUA5) - Prom 52166 - 52225 4.6 + Prom 52104 - 52163 5.3 29 10 Tu 1 . + CDS 52189 - 52626 408 ## COG0824 Predicted thioesterase 30 11 Tu 1 . - CDS 52817 - 53743 645 ## COG4632 Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase - Prom 53769 - 53828 4.5 + Prom 54388 - 54447 3.1 31 12 Op 1 . + CDS 54533 - 56251 1807 ## COG0608 Single-stranded DNA-specific exonuclease 32 12 Op 2 1/0.000 + CDS 56244 - 58148 1551 ## COG0514 Superfamily II DNA helicase 33 12 Op 3 . + CDS 58188 - 59150 1378 ## COG0457 FOG: TPR repeat + Term 59360 - 59400 -0.7 + Prom 59309 - 59368 5.2 34 13 Op 1 . + CDS 59501 - 60349 733 ## COG0077 Prephenate dehydratase 35 13 Op 2 . + CDS 60324 - 61505 1372 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase + Prom 61518 - 61577 2.0 36 13 Op 3 . + CDS 61598 - 62659 1172 ## COG2876 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase + Prom 62771 - 62830 5.0 37 14 Tu 1 . + CDS 62895 - 63668 863 ## BT_3933 chorismate mutase/prephenate dehydratase (TyrA) + Term 63699 - 63740 8.2 + Prom 63694 - 63753 4.9 38 15 Tu 1 . + CDS 63783 - 65975 1895 ## COG0358 DNA primase (bacterial type) - Term 65875 - 65909 -0.7 39 16 Op 1 . - CDS 66067 - 66657 616 ## COG0302 GTP cyclohydrolase I 40 16 Op 2 . - CDS 66665 - 67114 554 ## BT_3930 hypothetical protein - Prom 67147 - 67206 2.6 41 17 Op 1 . - CDS 67241 - 67999 1037 ## COG0149 Triosephosphate isomerase 42 17 Op 2 . - CDS 68032 - 69345 1137 ## BF3957 hypothetical protein 43 17 Op 3 . - CDS 69335 - 69877 586 ## BT_3927 hypothetical protein - Prom 69926 - 69985 2.2 - Term 69910 - 69963 15.1 44 18 Op 1 . - CDS 69987 - 70859 965 ## COG0739 Membrane proteins related to metalloendopeptidases 45 18 Op 2 . 
- CDS 70878 - 71342 346 ## COG0105 Nucleoside diphosphate kinase - Prom 71390 - 71449 5.9 - Term 71501 - 71536 1.2 46 19 Op 1 . - CDS 71566 - 73662 1492 ## COG1200 RecG-like helicase 47 19 Op 2 . - CDS 73663 - 74322 270 ## PROTEIN SUPPORTED gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 48 19 Op 3 . - CDS 74330 - 74881 723 ## COG0693 Putative intracellular protease/amidase 49 19 Op 4 . - CDS 74907 - 75785 1027 ## BT_3921 hypothetical protein 50 19 Op 5 . - CDS 75827 - 76243 349 ## BF3738 putative tansport related protein 51 19 Op 6 . - CDS 76289 - 77008 991 ## COG0811 Biopolymer transport proteins 52 19 Op 7 . - CDS 77008 - 77745 748 ## COG0854 Pyridoxal phosphate biosynthesis protein - Prom 77767 - 77826 1.8 + Prom 77657 - 77716 4.6 53 20 Tu 1 . + CDS 77870 - 78742 746 ## COG0061 Predicted sugar kinase - Term 78879 - 78917 -0.9 54 21 Tu 1 . - CDS 79051 - 79242 150 ## gi|298384857|ref|ZP_06994416.1| hypothetical protein HMPREF9007_01498 - Prom 79316 - 79375 3.2 55 22 Tu 1 . - CDS 79792 - 80742 547 ## BT_3916 site-specific recombinase IntIA - Prom 80766 - 80825 6.2 56 23 Tu 1 . 
- CDS 80885 - 81058 109 ## BT_3915 hypothetical protein Predicted protein(s) >gi|226332212|gb|ACIC01000108.1| GENE 1 477 - 4802 4504 1441 aa, chain + ## HITS:1 COG:RSp0493 KEGG:ns NR:ns ## COG: RSp0493 COG3696 # Protein_GI_number: 17548714 # Func_class: P Inorganic ion transport and metabolism # Function: Putative silver efflux pump # Organism: Ralstonia solanacearum # 1 1032 1 1035 1064 793 42.0 0 MFKAIVRFSIRKKLFVGLTTLFLFIGGIYAMLTLPIDAVPDITNNQVQIVTVSPTLAPQE VEQLITMPIEIAMSNIMNVEDIRSVSRFGLSVVTVVFKEDVPTLDARQLINEQIQTVSGE ISPELGTPEMMPITTGLGEIYQYILKVAPGYEEKYDAMELRTIQDWMVKRQLSGIPGIVE INSFGGYLKQYEVAVDPDALFSLNITIGEVFEALSSNNQNTGGSYIEKAKNAYYIRSEGM ITRIKDIEQIVVANRNGIPVHISDVGTVRFGAPKRFGAMTMDGKGECVGGIAMMLKGANA NVVTQELEKRVEKIQHLLPEGISIEPYLNRSELVNRNISTVVNNLIEGAIIVFLVLIIFL GNVRAGLIVASVIPLAMLFAFIMMRLFNVTANLMSLGAIDFGIVVDGSIVILEGILAHIY SKQFRGRTLTRKEMDEEVEKGASGVVRSATFAVLIILIVFFPILTLNGIEGKYFTPMAKT LVFCIIGALILSLTYVPMMASLFLKHTIVVKPTLADRFFEQLNKLYQRCLHACLHHKART VVIAFAALIGSLFLFTRLGAEFIPTLDEGDFAMQMTLPAGSSLSESIKLSEEAEKTLMDQ FPEIKHVVAKIGTAEVPTDPMAVEDADVMIIMKPFKEWTSATSRAEMVEKMKEALEPLSE RAEFNFSQPIQLRFNELMTGAKADIAVKLYGEDTHELYQRAKEAATYVEKVPGAADVIVE QTMGLPQLVVKYNRGKIARYGINIEELNTIIRTAYAGEASGVVFENERKFDLVVRLDQEK VADLNLDKLFVRTSEGIQIPVGEVASIELVSGPLQINRDATKRRIVIGVNVRDADIQQVV ANIQKTLDKNIKLQPGYYFEYGGQFENLQNAINTLMIVIPVALMLILLILFFAFKNITYT LMVFSTVPLSLIGGIVALWLRGLPFSISAGVGFIALFGVAVLNGILMVNHFNELRKRNKY AMTTNRILTLGTPHLLRPVFLTGLVASLGFVPMAIATSAGSEVQRPLATVVIGGLIISTV LTLLIIPVFYKIVNSFAVWRRPGSKFHLPFFVILPLLLLIPSFASAQQPEAVSLEQAIEI AKQNHPRLKIAANAIRQAKATRGEIVEAAPTSFSYSWGQLNGENKQDKELAFEQSLGSLL TPFYKNALVSRQVKTSTYYRRMVEKEVIAEVKRAWAYYQYAANLCSMYRDQDKMAEELKR IGEIRYQQGEITLLEKNMMTTTAADLHNRWYQAQEEEKTALARFQWCCYADSPIVPADST LSLFYTTLSDGNLSEAHTGYFRSQAEEAKAMLHVERSHFFPEISIGYTRQDILPLKNLNA WMVGVSFPVYFLPQKSKVKQARLAAASAQIQADANIRELRNKVMELEASLRRYNESLRYY TTSALKEAEELTKAANLQLQQSETGIAEYIQSVTTARDIRRGYIETVYQYNIAALEHELF E >gi|226332212|gb|ACIC01000108.1| GENE 2 4885 - 5814 809 309 aa, chain + ## HITS:1 COG:RSp0529 KEGG:ns NR:ns ## COG: RSp0529 COG0845 
# Protein_GI_number: 17548750 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Ralstonia solanacearum # 59 309 21 267 344 73 24.0 5e-13 MRKIFYLILLLALVACKGQQQTAESESVADPATESVNPEMGKNNGQSDADATAQPEVDAI TSATSRPNQVSFNGRIVLPPQRQATVALTIGGVIKNTSLLPGQHVTKNSVIATLENPEFI TLQQTYLDSHAQTEYLRMEYERQKNLSAEQAASQKKFQQSKADYLSMKSRQDAAATQLSL LGVNPETLLRDGIQPLLEVKAPISGYVVNVAMNMGKYINPGEALCEVIDKSAPMLCLTTY EKDLADMQTGSPVQFRVNGMGTQTFNGTVISIGQKVDEVNRSLEVYASIKETNPQFRPGM YVTAHIQKQ >gi|226332212|gb|ACIC01000108.1| GENE 3 5877 - 6935 818 352 aa, chain + ## HITS:1 COG:ECs2937 KEGG:ns NR:ns ## COG: ECs2937 COG3275 # Protein_GI_number: 15832191 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Escherichia coli O157:H7 # 152 336 354 540 561 93 32.0 5e-19 MVLKAINKDKYFLSTVIISLMVAVLIHFPESVSLFDRFESHSLFPGMKFMDVANEILFTF VSLLILFAINTRLFHFNQASIKITAAKILLSFILTWILSNLLGQVFVFLHRTFDIPAIDA MVHHYLHPLRDFIMACLVTSSCCIIYLVRRQQLVLIENEQLQAENIRNQYEVLKNQLNPH MLFNSLNTLRSLVRENQDKAQDYIQELSRVLRYTLQSNESQSVSLREEMEFASAYIFLLK MRFENNLQFDIQIAKSFEDYRLPPMAVQVLIENAVKHNEISDRKPLTIHIVTNNEGYLSV SNDIQPKRTAASGTGIGLVNLAKRYRLLFKQDIQITEDKEFSVCIPLIDEVQ >gi|226332212|gb|ACIC01000108.1| GENE 4 6932 - 7744 601 270 aa, chain + ## HITS:1 COG:SA0251 KEGG:ns NR:ns ## COG: SA0251 COG3279 # Protein_GI_number: 15925964 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Staphylococcus aureus N315 # 8 263 1 240 246 89 28.0 1e-17 MNDKPSVMKTVIIEDEKAAVRNLTSLLNEVKPEAEIIAILDSINSTIEWFGIHPMPELVF MDIHLADGSAFEIFDHISITCPIIFTTAYDEYALRAFKVNSIDYLLKPIGKEDIEHAFEK LDNLQDTIPENGSRRENKEEELLHLIHSLKKQENYKTHFLIPIKGDKLLPVSIDMIQLFY IKDCQVKAVLTDGMEYNFSLTLDELVDCLNPSLFFRVNRQFLISREAIKDIDLWFNSRLS INLRHSRMTEKILVSKARVAEFKEWFSSKK >gi|226332212|gb|ACIC01000108.1| GENE 5 7871 - 10183 1811 770 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G 
Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 21 764 46 789 790 494 36.0 1e-139 MFLVWSCLVSILPGMAQTEKLTDYVNPFVGTDGYGNVYPGAQIPFGGIQISPDTDSRFYD AASGYKYNHLTLMGFSLTHLSGTGIPDLGDFLFIPGTGEMKLEPGTHEDPDQGYRSRYSH DKEWASPNYYAVELADYGVKAEMTSGVRSGMFRFTYPESDNAFIMIDMNHTLWQSCEWSN LRMINDSTITGYKLVKGWGPERHVYFTATFSKKLTGLRFVQDKKPVIYNTSRFRSSYEAW GKNLMACISFDTKAGEEVTVKTAISAVSTDGARNNMKELDGLTFNELRAKGEALWEKELG KYTLTADRKTKETFYTSAYHAALHPFIFQDSDGQFRGLDKNIEKAEGFTNYTVFSLWDTY RALHPWFNLVQQEVNADIANSMLAHYDKSVEKMLPIWSFYGNETWCMIGYHAVSVLADMI VKEVKGFDYERAYEAMKTTAMNPNYDCLPEYREMGYVPFDKEAESVSKTLEYAYDDYCIA QAAKKLGKEDDYHYFLNRALSYQTLIDPETKYMRGRDSKGDWRTPFTPVAYQGPGSVHGW GDITEGFTMQYTWYVPQDVQGYINEAGKELFRKRLDELFTVELPDDIPGAHDIQGRIGAY WHGNEPCHHVAYLYNYLKEPWKCQKWIRTIVDRFYGNTPDALSGNDDCGQMSAWYMFNCI GFYPVAPSSNIYNIGSPCAEAITVRMSNGKNIEMTADNWSPKNLYVKELYVNGKKYDKSY LTYDDIRDGVKLRFVMSGKPNYKRAVSDEAVPPSISLPEKTMKYKSSIGF >gi|226332212|gb|ACIC01000108.1| GENE 6 10224 - 11018 1007 264 aa, chain - ## HITS:1 COG:no KEGG:BT_3964 NR:ns ## KEGG: BT_3964 # Name: not_defined # Def: putative secretory protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 264 1 264 264 543 100.0 1e-153 MKKRHLVYAIALIVGMGACAASAKKQAEAKPDVWKSYNVGTVLFEDKASETKGSDIYHRI IPDAESYIKEQARTVLATLYNSPEDSITPVNKIHYTLEDIEGISAKGGGNGDVTIFYSTR HIEKSFAENDTAKLFFETRGVLLHELTHAYQLEPQGIGSYGTNRVFWAFIEGMADAVRVA NGGFDGPNARPKGGNYMDGYRTAGYFFVWLRDNKDPEFLRKFNRSTLEVIPWSFDGAIKH ILGTEYSIDELWHEYQVAVGDIQV >gi|226332212|gb|ACIC01000108.1| GENE 7 11048 - 13360 2250 770 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 19 766 49 781 790 550 41.0 1e-156 MMTMAAVTLSAVAQQPVDYVNPIIGTNGMGHTFPGACTPFGWVQLSPDTDTIPHNVNGAY QKNAYEYCAGYQYRDKTIVGFSHTHLSGTGHSDLGDILLMPAVGDVKLNPGRADHPEEGY RSRFDHATEKATPGYYEVMLDDYGIKAQLTATQRTGVHKYTFPKGKDGHLILDLVHGIYN 
YDGKVLWANLRVENDTLLTGYRITNGWARTNYTYFAISLSQPIKDYGYKDKEKVLYNGFW RRFKLEKNFPEITGRKIVAYFNFETAKDPELVVKVALSAVSTEGAVKNLRAEASGKSFEQ LAEAARTDWNNELDHFEIEGTPDQKAMFYTSLYHTMINPSVYMDVDGSYRGLDHNIHQAK GFTNYTIFSLWDTYRAEHPFLNLVKPERNADMVESMIKHEQQSVHGMLPIWSLMGNENWC MSGYHAVSVLADAITKGVFSNVDEALSAMVSTSTVPYYEGVADYMKLGYIPLDKSGTAAS STLEYAYDDWTIYQTALKAGNKEIADTYRKRALNYRNIYDTSIGFARPRYSDGTFKKEFD VLQTYGEGFIEGNSWNFSFHVPHDVFGMIDLMGGEKTFVQKLDELFSMHLPEKYYEHNED ITEECLVGGYVHGNEPSHHVPYLYAWTSQPWKTQYWLREILNKMYKNDINGLGGNDDCGQ MSAWYLFSVMGFYPVCPGTDQYVLGAPYLPYLKLTLPNGKTLEIKAPGVSDKKRYVQSLK LNGESYDKMYITHEDILKGGVLEFKMSASPNKRRGVSVQDKPYSLTNGIN >gi|226332212|gb|ACIC01000108.1| GENE 8 13394 - 15679 2257 761 aa, chain - ## HITS:1 COG:L135972 KEGG:ns NR:ns ## COG: L135972 COG3537 # Protein_GI_number: 15673483 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Lactococcus lactis # 27 759 3 717 717 434 33.0 1e-121 MNKQKLLSITMTLLLGVSSVFAQKQPVDYVNPLMGTDSKISLSNGNTYPAIALPWGMNFW MPQTGKMGDGWAYTYASDKIRGFKQTHQPSPWINDYGQFSIMPMTKQLKIDQDSRASWFS HKAEKATPYYYSVYLSEYNMTTEIAPTERCAYFRFTFPEASDAYVVVDAFDRGSYVKVIP EENKIVGYTTRNSGGVPQNFRNYFVIEFDKPFTFNKVWADYHLVETHLELQSNHVGAAIG FSTKKGEQVHAKVASSFISPEQAELNLKEIGNKTFEQTKEAGRKAWNDVLGRIKVEDDDE NRMRTFYSCLYRSVLFPRMFHEVNAKGETVHYSPYNGEIRPGYMFTDTGFWDTFRCLFPF VNLIYPSMGEKMQEGLLNTYLESGFFPEWASPGHRGCMVGNNSASVVADAFMKNVTKADA EKMYEGLLKGANSVHPRVSTTGRRGYEYYNKLGYVPYDVKINENAARTLEYAYDDWCIYR MGEKLGRPAEELELYKSRSQNYRNLFDPETKLMRGKNADGTFQTPFNPFKWGDAFTEGNS WHYTWSVFHDVQGLADLMGGRKMFVSMLDSVFNLPPIFDDSYYGGVIHEIREMEIANMGN YAHGNQPIQHMIYLYNYAGEPWKAQYWLREVMNRLYFATPDGYCGDEDNGQTSAWYVFTA LGFYPVCPGSNEYVMGAPYFKKATITLENGKKLEISAPKNSDANRYIRSLNYNGKNYTKN YLNHFDLLKGGRLVFDMDNKPNKGRGINESDFPYSFSRDNK >gi|226332212|gb|ACIC01000108.1| GENE 9 15961 - 17196 901 411 aa, chain - ## HITS:1 COG:no KEGG:BT_3961 NR:ns ## KEGG: BT_3961 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 411 1 411 411 843 100.0 0 
MKKLIYIMLSLCCLVFYSCGMTEPWKDWEHEGDMSADRLRPSEVKELLCAADGWKMIYQG VTFYFQFDEEGNVASDSDETLLKNEVGTDYSLDFQGEKAVLLTLLNGGMLQYLNENSETT FVITGYSDSQITAVGQTHGKEMILTPVSTAALQQAKERKRLAIIAYNKAQAMDLLKGELN NGVFRRSSSSFLAHYLIICDESNNWKVKISAIDNGVVKHTEYPMIIDTTNDENAVLTLGS NVTVDGISLNKLYYNYLNGEIETDNANVVCDTRKASDIAAWYANGWKTHIVDQDEIHADF KGIFHSGVEFDDRNPRNLIACPWSGMGSYIGFAVTMTADNATGRIFISLGEPYDLFGWNN NPADYNRVQQDYSKFLSFCVSEDGFYWSYDDNDSMVYVLSATGERWFRMKK >gi|226332212|gb|ACIC01000108.1| GENE 10 17222 - 18748 1153 508 aa, chain - ## HITS:1 COG:no KEGG:BT_3960 NR:ns ## KEGG: BT_3960 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 508 1 508 508 1025 100.0 0 MKIPQIRCSDVNAVKRIIFLLVLLIPFCLAGCSSDDDEGDNGGEGPNEDAPVSYVLGSGN TNMPSSGTIIAQYADAPVGSEIRRLVDDNADTKYVTYHSSFNITWNGNSSKAVTAYSLTS AADTPEMDPKDWTLYGSNDNTTWTELDAQTNQLFAARKEEKSYEVDNATSYRYYRLSVEA NNGGAATQIAEWKLVAMRSYTENINDLIASKGSSTFSAITPMGRQHENDREATAADLKWL ADPAEEPEPFGDGGTKMAWNTFNVVSIYPNGNPVMSDVNQRWVGDCCACAVIASMAYLYP RFVKHIIKDNMDKTYTVTMYDPKGKQISVGVGGDYFIGNSGDLGALGGRNKEVTWATILE KAMIKWNQIYQGSSNVGGIGTEYVSAIFTGDGESVGFGANALIAEDLQRAVEVSLKQGRL VIGGFTRSGEQVDQNWQTTSGHAFTFILPDDNTHLFKMRNPWGGTTDGVMKVKDDGRIPP MIDLRICAPGAAKNYGVGPDLGGYIPNF >gi|226332212|gb|ACIC01000108.1| GENE 11 18806 - 20770 1622 654 aa, chain - ## HITS:1 COG:no KEGG:BT_3959 NR:ns ## KEGG: BT_3959 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 654 1 654 654 1342 99.0 0 MKNIFKIGTLALCLVVGLSSCNDFFDAIPGEQYDLEDTFTNRSKTEQFLNNVYNYVPDET LERFARNTMSGIWTTGSLECKLSWDGNNGSEWASGATYAGSSWINFWYIEYYKGISRAST FIMNVDRCLEASAAQRKQWKAQARALRAFYYFMIFRSYGPFVILGEEPIPLDISTAELLK ERNTVDECVAFMAKEFDDAANELPDRYDGSNLGRIDRAACKAFKAKMLLYAASPLFNCNP DYAAIVNPESGKQLFPQDKSQEKAKWEAARDAYKEFFDEYGNTFSLYTEKTADGKIDFYE SYRKVTSGVLYGTENKEQIFIRLADHDYRAYETTPYHKGYDDNNGALRGGLGFGVPQEMV DLYFMKDGRRIVDDTNYKEYEGVPSNEYLGWSSDYTDEVVPSRTYFKSNSNQTLKQWANR EPRFYTNITFHGSTWLKTDTPRGEITTELTYNGNSGYANANWDAPYTGYGMRKMASKEGR 
SGANRHCATLLRLADMYLGYAETLSACDQRNEAIKYVNKIRARAGIPGYGAVGTKDDNGF ACIPYEDTRDGVDKRIHRERLVELMFEWNHFFDVRRWKVADMAVGDDWIYPKYHRGGEGG PIHGMAFRSDAPAFFEKVVVETRTFLPKHYLFPIPDEDVRRNPKMVQNLGWTTE >gi|226332212|gb|ACIC01000108.1| GENE 12 20795 - 23947 2869 1050 aa, chain - ## HITS:1 COG:no KEGG:BT_3958 NR:ns ## KEGG: BT_3958 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1050 1 1050 1050 2096 100.0 0 MNLFEMEKHHTKTLWLLALLLTFSTVVWAQGTPVTGRVSDEKGELLIGVSVQEKGTTNGT ITDTNGQYNLKLTSNNPILIVSYIGYKSQEVKVGKQKVLDVILAEDVSSLDEVVVVAYGH QRKVSVVGAQSSMKIEDIKMPTANLSSAIAGRLPGVVAVQRSGEPGHDDSDLWIRGISTL AGQNSKPLVLVDGVERSFNNIDPEDIESFTVLKDASATAVYGVRGANGVILIKTKPGKVG KPQFSVDYYEGFVTLTKKPEMADAFTYMDAANEAYMDTRGSMLYSPQYIEATKKAHGLLP NDNPLMFNPYLYPNVDWMNELFNDWGHNRRVNVSVRGGVPNATYYVSLSYYNEKGLTRTA EMENYDANIRYDRYNYTANLNLKPTETTTIDLGFNGFLSMGNYPQQSTSDLFASAMEINP VYLPLMMPDGSVPGISTNGDLRNPYADLTRRGYKNEARNQLNSNIRLTQDLGFWKWSKGL TASAMLAFDVHNSRDLKYNKREDTYNFAGTKDENGLWNDDVFDADGNYRYALTYTGHKDL AFDQGASDSRSTYFEASLNYDRSFGLHRIGGLLLYNQKIYRSSSDNLIGSLPYKQQGLAA RATYSWNDRYFFEANLGYNGSENFSPEKRFGFFPAFGLGWAISNEAWWESLQETVSYFKV RYTDGLVGTDAVTGRRFMYLDQMASVDGYRFGDQNNGVGGWGFSKYGANVGWSTSRKQDL GVDLKFFKDNLSLTLDIFKEHRKDIFITRRVIPDYSGFVEMPYANLGVVDNKGFEATLEY TQQLGKKCFLTVRGNFSWNEDKIIENDDPRVQYPWMEKRGTNVNGRWGWIAEGLFTSEEE IMDHAKQFGEGHPGQISKVGDIKYKDLNGDGVIDDYDKCLIGQGDVPKIYYGFGADLQLG DFSVGALFAGNAKADRCLGGNAMYPFNDGSGITNLFANITDRWSADNPTNQDVFYPRLHH GNNANQNNMKTSTWWQKDVSFLRLKQLTIAYQLPKKLINQTFLKSARIYVMGTNLLTFSK FKLWDPELNTNNGTAYPNVRTYSVGVNVSF >gi|226332212|gb|ACIC01000108.1| GENE 13 24579 - 24815 212 78 aa, chain - ## HITS:1 COG:ykgD KEGG:ns NR:ns ## COG: ykgD COG2207 # Protein_GI_number: 16128290 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 3 78 205 280 284 60 35.0 1e-09 MAMSPRQFYRKFKEISNAAPSDLIKSYRMEKAARLLQNEELSIQDVIAEVGISSRSYFYK EFTRRFGMTPKDYREQLR >gi|226332212|gb|ACIC01000108.1| GENE 14 24900 - 28526 2215 1208 aa, chain - ## HITS:1 COG:no 
KEGG:BT_3957 NR:ns ## KEGG: BT_3957 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1208 1 1208 1315 2480 99.0 0 MRRCLILCLFTFIGSLVCWADAGKDAYLFRKVDYQQGLSNSAVLCLYQDKAGLMWFGTYD GVNCYDGKGIEVFRSDFSMKKTLSNNVIHSIQQADSSCLWITTHLGVNRFSQDSRQVVGY YDFTGDYYLHSNPKGDTWVVSDEGIFYYNTYHKKFVHLKNIETPVENMDQRAFVTDDGVL WTFPRNTGSLIQCSLDGFDRDTLSVHPTVSSSDFHAKPIDNVFYQNGILCFVDAERDLYV YDISRRSKIYIRNLSSLVQRYGKVVGIVPFYEDIIIAFQTNGLVRLRTSKKYEEEVVDRN VRIYDIFRDPHQNILWVASDGQGAVMYAKKYSIATNLMLSKLSSNLSRQVRSVMTDKYGG LWFGTKGDGLLHVHDYEGGMDASATTVYSPVGRQDASSYTRWNREFQAYSLKQSRYMDGF WVGSGNPGLFYYSFADKALHCVEDKSADPVIEIHDIYEENDSVLYAVTAGVGFRKLILER KQGEIHLKSQKRYHFFYEQKEITMFYPMLAEGDSILWLGSREKGLIRFDKQTEEYKVISL KEILHKSVDDVLSLHRAKDGRMYVGTTSGLVCLTFNKEQISASYIGREQGLLNDMIHGIL EDANGFLWLGTNRGLIKYNPKNTSSHAYYYAAGVQVGEFSDDAYYQCPYTGRLFFGGIDG LLYLDKEVATAPEFYPNILLRKLMIGRKEVNLGDYYTDGGKALSFKGAKASFSLSFVVPD FLTGGDVEYSYMLDGFDKDWTSFSSINEASYLEIPSGSYVLKVRYKKDVFNTEYKVFSIP LYILPPWYLSTVAYVIYLLFFILIAGYLIHLLRKYFLQKRMMQRLLTAESNEALPESASL NRELLNRFTSIYRSCDQLRAENLPYEQRLRIMEQVHETVIATLFRSGTLAVEELKSFFPT EYAITGCMCMKELSIEVLHVLEGQGVDISSVKLAIPEHFVYPVYKNALRCMLYWCYLYIG GVKHKSDIVVDVKEEEGRMLLQFSAVDDTLKELYKQLSGSEEEGRRDSKETEEAITTRQL LFSVQAALRQLNVTLHYADREKDHLLTLAFEPAIIKETEEHGKKTVLLLEDRDEMVWLIS DLLADEFVVCPVKSVQLAFEEIRRSAPALLLVDMLMYAKAESDFLEYVNKNRTQLSKTAF IPMLSWKVSSSIQRELIMWADSYIVLPYDIIFLKEAVNKAIYGTREAKQIYMEELGDWAG RIVCTTEE >gi|226332212|gb|ACIC01000108.1| GENE 15 28831 - 30222 1176 463 aa, chain - ## HITS:1 COG:XF0106 KEGG:ns NR:ns ## COG: XF0106 COG3669 # Protein_GI_number: 15836711 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Xylella fastidiosa 9a5c # 5 459 16 451 460 126 27.0 1e-28 MKRLFLLMILGVTSVLCLNAQTKSLHQLQQEFVDLRCGMFIHFNMPTFFNEDWPDPDAAP ELFNPVRMDCKQWAKAAKSANMTYGCLTTKHHSGFCIWDTKTTDYSVMSSPFKRDVVKEY ADAFRAEGMKVMLYYSILDTHARLRPKCITPQHIEMIKEQLRELLTNYGEITALIIDGWD APWSRISYDDVPFEDIYRLVKSIQPNCLVMDLNAAKYPAEALFYTDIKSYEQGAGQHISK 
DTNRLPALSCLPLQQNWFWKESFPTTPVKSPAQMVNDNIIPMGKSYCNFILNVAPNRDGL MDANALKALKEIGKLWKNDGRVATVPEADAPIISSNIAKYQPAEGTWSSDYAIMDFANDD DFGTCWNSNPEVKVPWYSVTFEREKSFNMVVITDRNNDRLQEYRLEYRAGGTWKPLYEGK APTGLRVKIHRFDTVWGDAVRMTVLNSNGTTSIAEFGVYCERR >gi|226332212|gb|ACIC01000108.1| GENE 16 30301 - 31653 1185 450 aa, chain - ## HITS:1 COG:no KEGG:BT_3955 NR:ns ## KEGG: BT_3955 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 450 1 450 450 884 100.0 0 MKKVLFALAVLCTSFWSCNTLDLNDVGYGDGDRLKASSVKEILYGTADGCWKADYQGHEF YFQFHEDGTVTLDSDFLEMAVEGKTSFSAKGKEVALDIENCDVHLQNLGSEFADTKFVVS EIPAEGEAPQLNLYGESTGNTIELQPTTQAYIDGKVASKADFTELFEKNLLDNQSICDAS GNFIGYYGLVLNGVNDLSVKVITIENKDGSDANGHTQYYESKLTKEGQIFKLDTPVEQIK SVNGATYAFKAIDCTGDAVAVDGMSNVTLTSNKGAVNDFDYVTSGGKFTLGKAQDHGAAC DEIWAGTGGQATSSGGTIADINAMSYDFGWSGSAKQRPLVIWTWWFANLAFPSSEEGASI LMNNSDKDRVLFKNISGSGQTCGGGTLNATEVAEINAYCKDLIDTWFNEKGLFVVRHDRQ SAGDKFYIYLLCPDTEATSKGGMWMKYQRD >gi|226332212|gb|ACIC01000108.1| GENE 17 31681 - 33036 1428 451 aa, chain - ## HITS:1 COG:no KEGG:BT_3954 NR:ns ## KEGG: BT_3954 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 451 1 451 451 892 99.0 0 MKHWKDIAFGLFVSLTLAACYGEEDSFATPDIYKDYQPVLKDGNTVAGYAAAPLKEAKYD EVLNELYLTWDGSNSGWEKRDEYVGVEVEFNSLLSGKKIKRVMIPNVGDFGNLTVKRRDY GNEPTTLRLSKYRTVVVTDRYGVEELRFRSIYKDADGARQTSDWTNLSDQADKYELQVDM NYAAWQYFKTSQATQVQISFESDSEIRPVSALYNLIGNGDEAATQEEFRKWYYMDCAAMS FDPYNLAFDPYSKLRVFIKKDQTGGAYAIDYPTHNEGRGIVYPASESNPNDAWWKLPDMQ HVFMHEMGHCVQWMPKQGKYIMEGVQDCDRQGYQEGWPDAVKVASKGYILATQKEEYQAA IAKSYRNPQSDKYFVWQVDYNTSGAFMSWLRLYNGDFVRMLPWTVLMDELTNQWSLEDAV KYILKESYPDLTMEELWNEYKTEVEVFLQNN >gi|226332212|gb|ACIC01000108.1| GENE 18 33074 - 35008 1769 644 aa, chain - ## HITS:1 COG:no KEGG:BT_3953 NR:ns ## KEGG: BT_3953 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 644 1 631 631 1290 99.0 0 MKNRIKQYLLAASMCVGLTACSDFFEPIPGVQFGLDETFASKQRTEEYLNNVYSYVREVT 
DAIHPNTYGGIFTEASLDGANRWNKTYAEWTNGSFNSASAQASEYFSKYYQAIAKASTFI QNVDKCTEAAASTRGKWKSEARALRAYYYFELLRLYGPIPLIGEDPIPLDASLEELIKER NSVDECVNFIATELQSAIDSGDLLQRAGKANLGRMDVATCMALKAKLYLYWASPLFNGNT DQASVKNKDGKQLFPQTEDNSKWTQARDAYERFMTFATGQGYKLTEVYTNGKLDPYASCR AAGEFFTTTWEAVDELIFVKLRDLYDYTYWVCPKFTDFQDTDVTGGGGYYTTQETVDLFF TKDGLTIEEDPGYDKFEGIPGANNFTSGRYYDPNNPSRLYFDADKSKVLKQWKDREPRFY VNITYSGSIWLNEGKYNEEMRTDFTNGANGTCGKSKASGDCPDSGYLIRRGAKASNNNGS KHFSPVLRLADMYLGYAEALCMCSDLDNALTYLNKIRVRAGIPEYTFTATAGKITCPKTQ TDLLNRIRRERLVELVFEWNRYFDVRRWKVAEGQNDPEHWIYPAYHTGGEGGKVYGMNMD KDYPAFFERTSFETRVAFTKKQYFMPIPYDDLRRIPSLVQNLGW >gi|226332212|gb|ACIC01000108.1| GENE 19 35030 - 38173 3073 1047 aa, chain - ## HITS:1 COG:no KEGG:BT_3952 NR:ns ## KEGG: BT_3952 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1047 1 1047 1047 2037 99.0 0 MNLFGMKKHRIRSLWLLLLLLTCSVTVWAQGSSVTGRVSDEKGELLIGVSVQEKGTTNGT ITDMNGQYTLKLSTGNPILLISYIGYKPQEVKVAKQKIVDVVLVEDVSSLDEVVVVGYGN QRKVSVVGAQSTMDVKDIKMPAASLSSAISGRIAGVVAVQRSGEPGHDESDIWIRGLSSL LGQSSSPLVLVDGVERSFNNIDPEDIESFTVLKDASATAVYGVRGANGVVIVKTKPGKVG KPQFSVDYYESFTRLTKKVDMADAYTYMDARNEAQMNTSGTLKYSAAYIEATKKANGLLP NDNPRLYNPYLYPAIDWADNLFNDWGHNRRGNINIRGGVPNANYYASLSYYGETGMTRNF KLENYNTQMKYDRYNFTSNLNLKPTSETTVDLGFSGYLGQGHYPQSSTSSLYAACMDVNP VIYPLLLPNGTVSGINSQQKFNPYGLLARGGYYDEFSSQLNSNIRVKQDLDFWKWSKGLS ASAMVAFDTYNSRKRKYNRNEPMYTFAGKTDENGIWIEDTLFDEETGDYLYSVLKEADGS LSLQTPEQWSSRTVYTEASLNYDRSFGAHRVGGLLLYNQKVYWDLNATDVIGGMPYKQRG FAGRATYSWNDRYFAEFNLGINGSENFSPGKRYGVFPAFGLGWAVSNESFWNPVRKYISF LKFRYTDGWVGSDTATGRRFMYQGVFTGLTGTMFGTNYTSANGYGEEKYGVNVTWSKSRK QDLGIDIKFLNDNLSFVVDFFKERRDNIFLQRSTIPSYAGWIENPYANLGVVENKGIELA MDYTQRLGKKTFLTVRGNLTFNKDKIIENDEPPVDYPWMETRGTNVNATWGFIADGLFTS QAEIEDHATQFGTVHVGDIKYRDLNGDGVIDNYDKTVIGRGDVPRIYYGFGADLQIGDFS ISALFQGTGQADRYLDGICIKPFWDDEGRDNVFANIHDRWSPDDPTNQDVFYPRMYVGSD ANTNNVQKSSWWVKDVSFLRLKQLNISYNIPKKLLDRCFLKSASVYLMGTNLLTFSNFKL WDPELNSSNGTAYPNVSSYSVGVKFSF >gi|226332212|gb|ACIC01000108.1| GENE 20 38470 - 42444 2299 1324 aa, chain 
- ## HITS:1 COG:all3171 KEGG:ns NR:ns ## COG: all3171 COG2207 # Protein_GI_number: 17230663 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Nostoc sp. PCC 7120 # 1213 1314 204 304 306 68 31.0 1e-10 MRIRLILVFITLICPLISIADNVKDAYLFQKVDYQQGLSNSAVLCLFQDNEGLMWFGTYD GVNCYDGKSMEVFRSDFSEQKTLSNNIIHSIQQADSSCLWITTHLGANRFSKDSRQVICN YEFDTDFVIHSNHAGNTWALSYDWIAYYNTHHRRFVRIPMPDIKMLKVDVRAFVTDEGEL YLFPYESGDLYRFSLNSFDQDTLSTRLTTTPSHFHSKQIQYVCYQNGVFCFYDSDYDLFV YDISRKSKIYIRNIGEMVHKYGQITGIVPFYEDIVIAFQTNGLIRLRLSQKYAEEIIDQN MRIYGIYNDSRQGVLWVGTDGQGAIMYSKKYSIATNLMLNSFSPNLTRQVRSIMTDKYGG LWFGTKGDGLLHVKNYRDGVHASNTEVYSLDKKQNAMSYVKQNREFQVYSMLESRYRNGF WVGTGSSGLCYYSFDNGKLHELSNRETGTGREVAEVHSIFEESDSVLYLATSGKGFYKVI VDNSKQDVQIKSWKNYRFYHEQQELNLFYSMVPQGDSLLWLGSRQKGLIRFDRKTEEYQI YSLNEILHKSVDDILCLHWHGEQLYVGTTSGLVRVTFKERKLEADYIGREQGLLNDMIHS ILEDANGLLWLGTNRGLIKFNPENSFSHAYYYSGGIQIGEFSDDAYYRCPYTGCLFFGGI DGLLYLDKKVSAAPEYYPEILLRKLIIEKTFVNLQDHYLPDRKGLRMQGANLSFSLFFVV PDYASGGDVEYSYMLEGCDKDWGAFSSVNEASYFSVPSGDYLFKVRYKKDVFATEYKTFT IPVHILPPWYRTVYAYIVYILLGIVLVMYVIHLFRKYFRHERMMKKLLESENRNISLAAD SYQAREVLNTCTLIYQACDQLNDKNISSEGHAEKIGQIRESVMSLLFGCGLGDECFRLLS SLHFSVSGRLSLLKLAEEVLQVLEKEGHDVSSIQLDIPENFTYPAYKNALRCILYFSYLY ISSCKTEKVSVTVTEEAGCMMLTFVSPDETVKGLKDVLSGEELLSIRAKDSDDMFRIRTM QRFVFSALKQQNSVMRYITPVEQDNQLTITFDPVVVAEQNEVRKTVLLLEDREEIVWLIS ALLSDEYEVRSVRSVQLAFDEMHKSAPTLFLVDMLMYADAESTFIKYVNKNRSMLSKTAF IPMLTWKVSASMQRELILWADSYIVLPYDIPFLKETIHKVIYGKREARQIYMEDLEGWAD SIVCTTTEQADFVRKFLQVVEQNLDREDLGSTFVAEQMLMSSRQFYRKFKEISGMSPSDL IKDYRMEKAARLLQNEELSIQDVISDVGISSRAYFYKEFTRKFGVTPKVYREKLLGNDTH KNQE >gi|226332212|gb|ACIC01000108.1| GENE 21 42641 - 44029 1796 462 aa, chain - ## HITS:1 COG:PH0923 KEGG:ns NR:ns ## COG: PH0923 COG1109 # Protein_GI_number: 14590777 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Pyrococcus horikoshii # 12 448 6 441 455 250 37.0 3e-66 MTLIKSISGIRGTIGGGAGEGLNPLDIVKFTSAYATLIRKTCKAQSNKIVVGRDARISGE 
MVKNVVVGTLMGMGWDVVDIDLASTPTTELAVTMEGACGGIILTASHNPKQWNALKLLNE HGEFLNAEEGNEVLRIAEAEEFDYADVDHLGSYRKDLTYNQKHIDSVLALDLVDVEAIKK ANFRVAIDCVNSVGGIILPELLERLGVKHVEKLYCEPTGNFQHNPEPLEKNLGDIMNLMK GGKADVAFVVDPDVDRLAMICENGVMYGEEYTLVTVADYVLKHTPGNTVSNLSSTRALRD VTRKYGMEYSASAVGEVNVVTKMKATNAVIGGEGNGGVIYPASHYGRDALVGIALFLSHL AHEGKKVSELRATYPPYFIAKNRVDLTPEIDVDAILAKVKEIYKNEEINDIDGVKIDFAD KWVHLRKSNTEPIIRVYSEASTMEAAEEIGQKIIDVINELAK >gi|226332212|gb|ACIC01000108.1| GENE 22 44059 - 44703 475 214 aa, chain - ## HITS:1 COG:no KEGG:BT_3949 NR:ns ## KEGG: BT_3949 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 214 1 214 214 414 98.0 1e-114 MKKLVFLFLSLLAAGGIFQACDDSKTYAEMLEDEKNAVNKFIKDKRIQIISQDEFEKNDT VTDLIRNEYVALSDGVYMQIVDRGSAENKTDTFANNNEICVRYIEEDIMTRDTTCFNVFL EEWGDANQLYTNPAVFRYVAEGSYVYGTFIQMDYYWASYYQSTAVPAGWLLALPFVRNYA HVRLIVPSKVGHSSAQQYVNPYYYDIWTFSKALN >gi|226332212|gb|ACIC01000108.1| GENE 23 44822 - 45859 829 345 aa, chain - ## HITS:1 COG:aq_1630 KEGG:ns NR:ns ## COG: aq_1630 COG0618 # Protein_GI_number: 15606737 # Func_class: R General function prediction only # Function: Exopolyphosphatase-related proteins # Organism: Aquifex aeolicus # 24 338 19 319 325 131 30.0 2e-30 MLTKVIEQAKIDHFTKWFERADKIVIVSHVSPDGDAIGSSLGLYHFLDSQEKTVNVIVPN AFPDFLRWMPGSKDILLYDRYKDFADKLIAEADVICCLDFNALKRIDEMADAVAASPARK VMIDHHLYPEEFCKITMSYPKISSTSELIFRLICRMGYFSDISKEGAECIYTGMMTDTGG FTYNSNNREIYFIISELLSKGIDKDDIYRKVYNTYSESRLRLMGYVLSNMVVYSDYNAAL ISLTKEEQSKFDYIKGDSEGFVNIPLTIKNVCFSCFLREDTEKPMIKISLRSVGTFPCNQ LAAEFFNGGGHLNASGGEFFGTIEEAKAVFEQALEKYKPLLTAKS >gi|226332212|gb|ACIC01000108.1| GENE 24 45905 - 47578 397 557 aa, chain - ## HITS:1 COG:lin1517_1 KEGG:ns NR:ns ## COG: lin1517_1 COG0658 # Protein_GI_number: 16800585 # Func_class: R General function prediction only # Function: Predicted membrane metal-binding protein # Organism: Listeria innocua # 10 381 115 478 487 111 26.0 3e-24 MFTLLKLGDKQLKIGDELLVSARIAPPANGGNFDEFDYARYLMRHGISGTGYVASGKWAL 
WSPSIRYTAMFCQEKVINLYRKLGFEGDELAVLSALTVGEKTDLSDSIRESYSVSGASHV LALSGLHIGLLYALLFLLLKPLARKWQAGRYFRSVLLLVLLWSFAFFTGLSPSVVRSVSM FSVLAIAELFGRQSLTLNTLAATAWVMLLVNPAWLFDVGFQLSFLAVLSILMIQKPVYQL LPVKSRIGKYVWGLMSVSIAAQIGTAPLVMLYFSHFSTHFLLTNLVVIPLVTVTLYAAVL MLLLTPLPAVQFVMAGAVRFLLKVLNDFVRWVEQLPYASLDGIWLYRLEVLGIYIFLLLF LYYLKTRRFRNLVVCFSCLLCLGIYHTVMRWYDRPCPSLVFYNVRGCPAIHCIAEDGTSW LNYADTLSDKRRLQAVAANYWRRHQLLPPIEVTADCQNVDFCRHQQIVFYHGCRICMVTD NRWRNKSAASPLFINYMYLCKGYNGRLEELTGLFSPSCILLDASLSDDRKQFFREECKRL HLHFITLSEEGSVRFLL >gi|226332212|gb|ACIC01000108.1| GENE 25 48016 - 48666 700 216 aa, chain - ## HITS:1 COG:lin1932 KEGG:ns NR:ns ## COG: lin1932 COG0036 # Protein_GI_number: 16800998 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Listeria innocua # 5 216 4 215 218 206 47.0 4e-53 MKPIISPSILSADFGYLARDLEMINRSEAEWVHIDIMDGVFVPNISFGFPVLKYVAKLTD KPLDVHLMIVNPEKFIPEVKALGAHTMNVHYEACTHLHRVIQQIKEAGMQPAVTINPATP VALLQDIIQDVYMVLIMSVNPGFGGQKFIEHSVEKVRELRALIERTGSKALIEVDGGVNL ETGARLVEAGADVLVAGNAVFGAPDPEEMIRQLHKL >gi|226332212|gb|ACIC01000108.1| GENE 26 48719 - 49687 995 322 aa, chain - ## HITS:1 COG:fmt KEGG:ns NR:ns ## COG: fmt COG0223 # Protein_GI_number: 16131167 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA formyltransferase # Organism: Escherichia coli K12 # 4 314 3 305 315 221 40.0 1e-57 MKKEDLRIVYMGTPDFAVEALRQLVEGGYNVVGVITMPDKPAGRGHKIQYSPVKQYALEQ NLPLLQPEKLKDEAFVQALREWKADLQIVVAFRMLPEVVWNMPRLGTFNLHASLLPQYRG AAPINWAVINGDTETGITTFFLQHEIDTGKVIQQVRVPIADTDNVEVVHDKLMVLGGKLV LETVDAILNDTVKPIAQEDMAVVGELRPAPKIFKETCRIDWNSPVKKVYDFIRGLSPYPA AWSELVSPEGEAVVMKIFESEKIYEAHQLAVGTVVTDGKKYIKVAVPDGFVSVLSLQLPG KKRLKTDELLRGFRLSDGYKMN >gi|226332212|gb|ACIC01000108.1| GENE 27 49750 - 51546 1452 598 aa, chain - ## HITS:1 COG:RSp0020 KEGG:ns NR:ns ## COG: RSp0020 COG0038 # Protein_GI_number: 17548241 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Ralstonia 
solanacearum # 22 449 28 447 461 148 27.0 3e-35 MKREERQSLLQRCIKWREANIKEKQFILILSFLVGIFTAIAALFLKFLIHQIQNFLTNNF NATSANYLYLVYPVIGIFLAGWFVRNIVKDDISHGVTKILYAISRRQGRIKRHNIWSSTI ASAITIGFGGSVGAEAPIVLTGSAIGSNLGSVFKMEHRTLMLLVGCGAAGAIAGIFKAPI AGLVFTLEVLMIDLTMSSLLPLLISAVTAATVSYITTGTEAMFKFNLDQAFELERIPYVI LLGIFCGLISLYFTRAMNSIEGVFGKLKNPYQKLAFGGVMLSVLIFLFPPLYGEGYDTIE LLLNGTSAAEWDTVMNNSMFYGYGNLLQVYLMLIILLKVFASSATNGGGGCGGIFAPSLY LGCIAGFVFSHFSNDFAFSAYLPEKNFALMGMAGVMSGVMHAPLTGVFLIAELTGGYDLF LPLMIVSVSSYLTIIAFEPHSIYSMRLAKRGQLLTHHKDKAVLTLMKMENVVEKDFVAVH PEMDLGELVKAIAASHRNMFPVTDKKTGELLGIVLLDDIRNIMFRQELYHRFTVNKLMTS APAKIFDTDGMEQVMQTFDDTKAWNLPVVDEEGRYQGFVSKSKIFNSYRQVLVHFSED >gi|226332212|gb|ACIC01000108.1| GENE 28 51547 - 52110 560 187 aa, chain - ## HITS:1 COG:AF0781 KEGG:ns NR:ns ## COG: AF0781 COG0009 # Protein_GI_number: 11498387 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Archaeoglobus fulgidus # 3 158 10 164 309 102 38.0 5e-22 MIEDIKKACQVMNEGGVILYPTDTVWGIGCDATNEEAVHRVYEIKKRADSKAMLVLVDSP VKVDFYVQDVPDVAWDLIEVADKPLTIIYSGARNLAPNLLAEDGSVGIRVTNEEFSRRLC QQFRKAIVSTSANISGQPGAANFTEISEEVKSAVDYIVGFRQDDMSRPKPSSIIKLDKGG VIKIIRE >gi|226332212|gb|ACIC01000108.1| GENE 29 52189 - 52626 408 145 aa, chain + ## HITS:1 COG:TVN0706 KEGG:ns NR:ns ## COG: TVN0706 COG0824 # Protein_GI_number: 13541537 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Thermoplasma volcanium # 12 102 11 101 133 70 34.0 1e-12 MEEIVFHHALPIQLRFNDVDKFGHVNNTVYFSFYDLGKTEYFASVCPGVDWEKIGIVVVH IEANFVKQIFASDHIAVQTAVSEIGTKSFHLIQRVIDTETQEVKCICKSIMVTFDLEKHE SMPLTEDWIEAICKFEGKDVRKKKN >gi|226332212|gb|ACIC01000108.1| GENE 30 52817 - 53743 645 308 aa, chain - ## HITS:1 COG:CAC2630 KEGG:ns NR:ns ## COG: CAC2630 COG4632 # Protein_GI_number: 15895888 # Func_class: G Carbohydrate transport and metabolism # Function: Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester 
alpha-N-acetylglucosaminidase # Organism: Clostridium acetobutylicum # 81 307 117 344 347 70 27.0 3e-12 MTGEVYLCVISCYTKYESKTYRLFMRRNLVGIIVLILLSVRVVNAQTVADSIAIVTAPWE VVTVENGIVHKRASIPFLYQGAQSINILEINPRTGKKIGIAFTGQLEKISRIARKHQAIG AINGSYFDMTKGNSVCFLKVGSQVVDTTSLDELKLRVTGAVYEKKGKVKLIPWDRQIEKN YKKNKGSVLASGPLMLKDGEYYDWSQCNANFIETKHPRSAICLTEEGKILFVTVDGRSPE NAVGINIPELAHLLHVLGGKDALNLDGGGSTALWLSGAPEEGIVNFPCDNRNYDHQGERK VANFLYVH >gi|226332212|gb|ACIC01000108.1| GENE 31 54533 - 56251 1807 572 aa, chain + ## HITS:1 COG:lin1560_1 KEGG:ns NR:ns ## COG: lin1560_1 COG0608 # Protein_GI_number: 16800628 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-specific exonuclease # Organism: Listeria innocua # 5 568 8 561 562 362 37.0 1e-99 MNHKWNYQPITPEQAETSQTLAQELGISPILGQLLVQRGITKAADAKKFFRPQLPDLHDP FLMKDMDIAVERLNKAMGKKERILIYGDYDVDGTTAVALVYKFIQQFYSNIDYYIPDRYN EGYGISKKGVDYASETGVGLIIVLDCGIKAVEEITYAKEKGIDFIICDHHVPDDILPPAV AILNAKRLDNTYPYTHLSGCGVGFKFMQAFAINNGIEFHHLIPLLDLVAVSIASDIVPIM GENRILAYHGLKQLNSNPSVGMKAIIDVCGLSEKEITVSDIVFKIGPRINASGRIQNGKE AVDLLTEKDFSAALEKAGQINQYNETRKDLDKSMTEEANNIVANLEGLSERRSIVLYNEE WHKGVIGIVASRLTEVYYRPAVVLTRTDDMATGSARSVSGFDVYKAIEYCRDLLENFGGH TYAAGLSMKVENVEAFTRRFEEYVSQHILPEQTSAVINIDAEIDFRDITSKFFNDLKKFN PFGPDNIKPIFCTHHVYDYGTSKVVGRDQEHIKLELVDNKSNNVMNGIAFGQSSHVRYIK TKRSFDICYTIEENTHKRGEVQLQIEDIKPIE >gi|226332212|gb|ACIC01000108.1| GENE 32 56244 - 58148 1551 634 aa, chain + ## HITS:1 COG:CAC2687 KEGG:ns NR:ns ## COG: CAC2687 COG0514 # Protein_GI_number: 15895945 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Clostridium acetobutylicum # 7 490 8 468 714 292 36.0 2e-78 MNKYQEILKQYWGYDSFRDLQEEIITSIGEGKDTLGLMPTGGGKSITFQVPALAQEGICI VITPLIALMKDQVQNLRKREIKALAIYSGMTRQEILTALENCIFGNYKFLYISPERLDTE IFRTKLRSMKVSMITVDESHCISQWGYDFRPAYLKIAEIRELLPEVPVLALTATATPEVV TDIQARLKFREGNVFRMSFERKNLAYIVRKTDNKTKELLYILQRISGSAIIYVRNRRRTK EITELLMNEGITADFYHAGLDNAVKDLRQKRWQSGEVRVMVATNAFGMGIDKPDVRIVLH 
LDLPDSPEAYFQEAGRAGRDGEKAYAVILYSKSDKTTLHKRVVDTFPDKEYILNVYEHLQ YYYQMAMGDGFQCIREFNLEEFCRKFKYFPVPVDSALKILTQAGYLEYTDEQDNSSRILF TIRRDELYKLREMGKEAEALIQSILRSYTGVFTDYAYISEESLAVRTGLTRQQIYNILVT LTKRRIVDYIPRKKTPYIIYTRERLELRFLHIPPSVYEERKARYEARIKAMEEYVTTENI CRSRMLLRYFGEKNEHNCGQCDVCLSKRATDNLSEKSYEEVKRQILDLLSHSPLTPAETA DQIKAEKEDIGQVIRYLLDEGELKMQDGMLHISK >gi|226332212|gb|ACIC01000108.1| GENE 33 58188 - 59150 1378 320 aa, chain + ## HITS:1 COG:mll2208 KEGG:ns NR:ns ## COG: mll2208 COG0457 # Protein_GI_number: 13472041 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Mesorhizobium loti # 17 303 51 337 551 75 25.0 1e-13 MPNFFKSFFSGKSETPESEKQKNDQKNFEIFKYDGLRAQRMGRPDYAIKCFTEALAIEED FETMGYLSQLYIPMGETEKARELLEKMAVMEPHVTSTFLTLANVCYIQEDYKAMEEAAGK AIAIEEGNAVAHFLLGKARKGQDDEIMTIAHLTKAITLKDDFIEARLMRAEALLKMKQYK ETMEDIDAVLAQNPEEETAILLRGKVKEANGQEEEAETDYKFVTEINPFNEQAYLYLGQL YINQKKLAEAIALFDEAIELNPNFAEAYKERGRAKLLNGDKDGSVEDMKKSLELNPKDEA SLNGEFKNLGPKPEALPGIF >gi|226332212|gb|ACIC01000108.1| GENE 34 59501 - 60349 733 282 aa, chain + ## HITS:1 COG:VC0705_2 KEGG:ns NR:ns ## COG: VC0705_2 COG0077 # Protein_GI_number: 15640724 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydratase # Organism: Vibrio cholerae # 11 275 3 265 278 139 35.0 7e-33 MKKIAIQGTLGSYHDIAAHKYFEGEEIELICCANFEDVFTSIRKDSQVIGMLAIENTIAG SLLHNNELLRQSGTQIIGEYKLRISHSFVCLPEESWEDLTEVNSHPIALMQCRDFLNQHP QLKVVEGEDTARSAEIIKKENLKGHAAICSKTAAERYGMKVLQEGIETNKHNFTRFLVVA DPWQVDELRQHHAKATNKASMVFTLPHTEGSLSQVLSILSFYNINLTKIQSLPIIGREWE YQFYVDVSFNDYLRYKQSIAAITPLTKELKILGEYAEGKSNV >gi|226332212|gb|ACIC01000108.1| GENE 35 60324 - 61505 1372 393 aa, chain + ## HITS:1 COG:aq_273 KEGG:ns NR:ns ## COG: aq_273 COG0436 # Protein_GI_number: 15605813 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Aquifex aeolicus # 13 392 5 385 387 296 41.0 4e-80 MQKESQTYKIAPAERLASVSEYYFSKKLKEVAQMNAEGKDVISLGIGSPDMPPSKVTIET 
LCNNAHDPNGHGYQPYVGIPELRKGFAAWYQRWYGVELNPNTEIQPLIGSKEGILHVTLA FVNPGEQVLVPNPGYPTYTSLSKILGAEVISYDLKEEDGWMPDFEALEKMDLNRVKLMWT NYPNMPTGANATPELYERLVDFARRKNIVIVNDNPYSFILNEKPISILSVPGAKECCIEF NSMSKSHNMPGWRIGMLASNAEFVQWILKVKSNIDSGMFRAMQLAAATALEAEADWYEGN NHNYRGRRHLAGEIMKTLGCTYDENQVGMFLWGKIPASCKDVEELTEKVLHQARVFITPG FIFGSNGARFIRISLCCKDAKLAEALERIKKLS >gi|226332212|gb|ACIC01000108.1| GENE 36 61598 - 62659 1172 353 aa, chain + ## HITS:1 COG:DR1001_2 KEGG:ns NR:ns ## COG: DR1001_2 COG2876 # Protein_GI_number: 15806024 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Deinococcus radiodurans # 1 242 13 245 270 147 38.0 4e-35 MELESILLPGIEAKRPIVIAGPCSAETEEQVMETAKQLAAKGQKIYRAGIWKPRTKPGGF EGIGVEGLAWLKEVKKETGMYVSTEVATAKHVYECLKAGIDILWVGARTTANPFAVQEIA DALKGVDIPVLVKNPVNPDLELWIGALERINNAGLKRLGAIHRGFSSYDKKIYRNLPQWH IPIELRRRLPNLPIFCDPSHIGGKRELVAPLCQQAMDLNFDGLIVESHCNPDCAWSDASQ QVTPDVLDYILNLLVIRTETQTTESLSQLRKQIDECDDNIIQELAKRMRVAREIGTYKKE HGITVLQAGRYNEILEKRGAQGEQCGMSADFMKLIFEAIHEESVRQQIEIINK >gi|226332212|gb|ACIC01000108.1| GENE 37 62895 - 63668 863 257 aa, chain + ## HITS:1 COG:no KEGG:BT_3933 NR:ns ## KEGG: BT_3933 # Name: not_defined # Def: chorismate mutase/prephenate dehydratase (TyrA) # Organism: B.thetaiotaomicron # Pathway: Phenylalanine, tyrosine and tryptophan biosynthesis [PATH:bth00400]; Novobiocin biosynthesis [PATH:bth00401]; Metabolic pathways [PATH:bth01100]; Biosynthesis of secondary metabolites [PATH:bth01110] # 1 257 1 257 257 494 100.0 1e-138 MRILILGAGKMGSFFTDILSFQHETAVFDVNPHQLRFVYNTYRFTTLEEIKEFEPELVIN AVTVKYTLDAFRKVLPVLPKDCIISDIASVKTGLKKFYEESGFRYVSSHPMFGPTFASLS NLSSENAIIISEGDHLGKIFFKDLYQTLRLNIFEYTFDEHDETVAYSLSIPFVSTFVFAA VMKHQEAPGTTFKKHMAIAKGLLSEDDYLLQEILFNPRTPGQVANIRTELKNLLEIIEKK DAEGMKAYLTKIREKIK >gi|226332212|gb|ACIC01000108.1| GENE 38 63783 - 65975 1895 730 aa, chain + ## HITS:1 COG:CAC1299 KEGG:ns NR:ns ## COG: CAC1299 COG0358 # Protein_GI_number: 15894581 # Func_class: L Replication, 
recombination and repair # Function: DNA primase (bacterial type) # Organism: Clostridium acetobutylicum # 1 426 5 418 596 286 39.0 7e-77 MIDQVTIDRILDAAQIVDVVSEFVTLRKRGVNYVGLCPFHNEKTPSFSVSPAKGLCKCFS CGKGGNSVHFIMEHEQMSYYEALKYLAKKYNIEIKERELTNEEKQAQTTRESMFIVNNFA RDYFQNILKNHVDGRSIGLAYFRQRGFRDDIIEKFQLGYCTESHDAMSQEALRKGYKKEF LVKTGICYETDDHRLRDRFWGRVIFPVHTLSGKVVAFGGRVLSTATKGVKVKYVNSPESE IYHKSNELYGIYFAKQAIVRQDRCFLVEGYTDVISMHQSGIENVVASSGTALTPGQIRLI HRFTNNITVLYDGDVAGIKASIRGIDMLLEEGMNIKVCLLPDGDDPDSFARKHNSEEFQA FIHEHEKDFIRFKTDLLMEDAGRDPIKRAELISNIVRSISVIPEAIIRDVYIKECSQHLR IEEKLLVAEVAKLREAQAEKANRPSYNHSSSTTGDTSGTAAASGTPAYSGAPTYSSPSSP YGNAGMPPEPEYDDDGNIVSFADPMSAATGGAAGFPPGNAPASTTDTTVPGDSYTSFIPQ EGKEGQEFYKFERLILQAVVRYGEKVMCNLTDEEGNEIPVTVIEYVVNDLKEDDLAFHNP LHRQILTEAATHIHDAGFIAERYFLAHPDQTISRLSVDLINVRYQLSKYHSKSQKIVTDE ERLYELVPMLMINFKYAIVTEELKHMLYALQDPALAHDNEKCNSLMQRYNEMREVQSLMA KRLGDRVVLR >gi|226332212|gb|ACIC01000108.1| GENE 39 66067 - 66657 616 196 aa, chain - ## HITS:1 COG:slr0426 KEGG:ns NR:ns ## COG: slr0426 COG0302 # Protein_GI_number: 16331608 # Func_class: H Coenzyme transport and metabolism # Function: GTP cyclohydrolase I # Organism: Synechocystis # 6 193 41 229 234 210 57.0 2e-54 MLEKEEIVSPNLEELKSHYHSIITLLGEDAGREGLLKTPERVAKAMLSLTKGYHMDPHEV LRSAKFQEEYSQMVIVKDIDFFSLCEHHMLPFYGKAHVAYIPNGYITGLSKIARVVDIFS HRLQVQERMTLQIKDCIQETLNPLGVMVVVEAKHMCMQMRGVEKQNSITTTSDFTGAFNQ AKTREEFMNLIQHGRV >gi|226332212|gb|ACIC01000108.1| GENE 40 66665 - 67114 554 149 aa, chain - ## HITS:1 COG:no KEGG:BT_3930 NR:ns ## KEGG: BT_3930 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 149 1 149 149 296 100.0 1e-79 MRKLGLLLWMLAFTAVGAQAQNIIKSLERTVPGQGKVTIHQDPKIEALIGVERPSMGEQK VLKAAGFRIQAYAGNNTREAKNDAYRVASRIKEYFPELTIYTSFNPPRWLCRVGDFRSIE EADAMMRKLKATGVFKEVSIVKDQINIPL >gi|226332212|gb|ACIC01000108.1| GENE 41 67241 - 67999 1037 252 aa, chain - ## HITS:1 COG:FN1366 KEGG:ns NR:ns ## COG: FN1366 COG0149 # Protein_GI_number: 19704701 # Func_class: G Carbohydrate 
transport and metabolism # Function: Triosephosphate isomerase # Organism: Fusobacterium nucleatum # 1 251 1 251 251 238 50.0 6e-63 MRKNIVAGNWKMNKTLQEGIALAKELNEALANEKPNCDVIICTPFIHLASVTPLVDAAKI GVGAENCADKASGAYTGEVSAEMVASTGAKYVILGHSERRAYYGETVAILEEKVKLALAN GLTPIFCIGEVLEEREANKQNEVVAAQMESVFSLSAEDFSKIILAYEPVWAIGTGKTASP EQAQEIHAFIRSIVADKYGKEIADNTSILYGGSCKPSNAKELFSNPDVDGGLIGGAALKV SDFKGIIDAFNA >gi|226332212|gb|ACIC01000108.1| GENE 42 68032 - 69345 1137 437 aa, chain - ## HITS:1 COG:no KEGG:BF3957 NR:ns ## KEGG: BF3957 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 436 1 435 443 706 76.0 0 MKDKNLHIIQKTGANVCRFLLAASFIFSGFVKAVDPLGFQYKIQDYLTAFGVISWFPSFF PLLGGIILSAIEFSIGIFLFFGIKKTISTTLALVLMIFMTPLTLYLAIFDPVSDCGCFGD AWVLTNWETFAKNIVLLLAAIATFQWRKMLIRFVTRKMEWLISLYTIFFVFTLSFYCLDR LPVLDFRPYKIGQNIMKGMSIPEGAKPSVYESVFILEKNGEKKEFTLDNYPDSTWTFVDT HTILKEKGYEPSIHDFSMMDMSTGDDITEDVLTDMGYTFLLVAHRIEEADDSNIDLINEI YDYSVEHGYRFYCLTSSPEEQIELWKDKTGAEYPFCQMDDITLKTMIRSNPGLMLIKNGT ILNKWSDEDIPDEYVLTDKLENLELGQQKVRSDVHTIGYVFLWFVIPLLLVLGVDILIVR RRERKNEKRKQQQEVKE >gi|226332212|gb|ACIC01000108.1| GENE 43 69335 - 69877 586 180 aa, chain - ## HITS:1 COG:no KEGG:BT_3927 NR:ns ## KEGG: BT_3927 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 180 1 180 180 348 99.0 6e-95 MKDTKQQFEHVIALCRDLFSKKLHDYGPAWRILRPASVTDQIFIKANRIRSIETKGVTLV DEGIRSEFIAIVNYGIIGLIQLELGYAESADISNEEALALYDKHAKEALELMLAKNHDYD EAWRSMRVSSYTDLILMKIYRTKQIESLAGNTLVSEGIDANYMDMINYSVFGLIKIEFEG >gi|226332212|gb|ACIC01000108.1| GENE 44 69987 - 70859 965 290 aa, chain - ## HITS:1 COG:HI0409 KEGG:ns NR:ns ## COG: HI0409 COG0739 # Protein_GI_number: 16272358 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Haemophilus influenzae # 98 206 328 443 475 104 45.0 2e-22 MNFNCIIKTGLVAVAAMVSLSSFSQDLIARQAPIDKKLKTVDSLALQKQIRAEQSEYPAL 
SLYPNWNNQYAHAYGNAIIPDTYTIDLTGFCMPTPSTKITSPFGPRWRRMHNGLDLKVNI GDTIVAAFDGKVRIVKYERRGYGKYVVIRHDNGLETIYGHLSKQLVEENQLVKAGEVIGL GGNTGRSTGSHLHFETRFLGIAINPIYMFDFPKQDIVADTYTFRKTKGVRSAGSHDTQVA DGTIRYHKVKSGDTLSRIAKVRGVSVSTLCKLNRIKPTTTLRIGQVLRCS >gi|226332212|gb|ACIC01000108.1| GENE 45 70878 - 71342 346 154 aa, chain - ## HITS:1 COG:AF0767 KEGG:ns NR:ns ## COG: AF0767 COG0105 # Protein_GI_number: 11498373 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside diphosphate kinase # Organism: Archaeoglobus fulgidus # 2 149 1 148 151 146 45.0 1e-35 MIEKTLVILKPCTLQRGLVGEITHLFERKGLRLAGMKMMQLTDELLSEHYAHLSSKPFFQ RVKDSMMATPVIVCCYEGVDAIQAVRTLAGPTNGRLAAPGTIRGDYSMSFQENIVHTSDS PETAAIELTRFFKPEEIFDYKQATFDYLYANDEY >gi|226332212|gb|ACIC01000108.1| GENE 46 71566 - 73662 1492 698 aa, chain - ## HITS:1 COG:slr0020 KEGG:ns NR:ns ## COG: slr0020 COG1200 # Protein_GI_number: 16331409 # Func_class: L Replication, recombination and repair; K Transcription # Function: RecG-like helicase # Organism: Synechocystis # 18 674 146 809 831 500 43.0 1e-141 MFDLTTRDIKFISGVGPQKAAVLNKELEIYSLYDLIYYFPYKYVDRSRIYYIHEIDGNMP YIQLKGEILGFETIGEGRQRRLTAHFSDGTGIVDLVWFQGIKYILGKYKLHEEYIIFGKP TVFNGRINVAHPDIDKPDDLKLSSVGLQPYYNTTEKMKRSFLNSHAIEKMMATVIQQIQE PLPETLSPKILSDHHLMPLTEALRNIHFPTNPDSLRRAQYRLKFEELFYVQLNILRYAKD RQRRYRGYIFERVGDVFNTFYSQNLPFQLTGAQKRVLKEIRNDVGSGRQMNRLLQGDVGS GKTLVALMSMLLALDNGFQACMMAPTEILANQHYETIKELLFGMDIRVELLTGSIKGKKR EAILTGLLTGDVKILIGTHAVIEDTVNFSSLGLVVIDEQHRFGVAQRARLWTKNIQPPHV LVMTATPIPRTLAMTLYGDLDVSVIDELPPGRKPITTIHQFDNRRESMYRAVRKQIEEGR QVYIVYPLIKESEKIDLKNLEEGYQHILEEFPGCTVAKVHGKMKSAEKDEQMQLFISGQA QIMVATTVIEVGVNVPNASVMIIENAERFGLSQLHQLRGRVGRGAEQSYCILVTNYKLTE DTRKRLEIMVRTNDGFEIAEADLKLRGPGDLEGTQQSGIAFDLKIANLARDGQLLQYVRS IAEDIVDNDPSAQSPENEILWRQLKSLRKTNVNWAAIS >gi|226332212|gb|ACIC01000108.1| GENE 47 73663 - 74322 270 219 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 [Bacillus selenitireducens MLS10] # 6 219 7 223 234 108 37 9e-23 
MKQSVIIVAGGKGLRMGSDLPKQFLPIGGKPVLMRTLEAFRKYDAMLQIILVLPREQQDF WKQLCEEHHFSVEHLVADGGETRFHSVKNGLALVEAPGLVGVHDGVRPFVSLEVIRRCYK LAEQHKAVIPVVDVVETLRHLTDAGSETVSRTEYKLVQTPQVFDVELLKQAYGQEFTPFF TDDASVVEAMGVPVHLAEGNRENIKITTPFDLKVGSALL >gi|226332212|gb|ACIC01000108.1| GENE 48 74330 - 74881 723 183 aa, chain - ## HITS:1 COG:BB0621 KEGG:ns NR:ns ## COG: BB0621 COG0693 # Protein_GI_number: 15594966 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Borrelia burgdorferi # 8 178 7 178 184 136 43.0 2e-32 MGTVYAFFADGFEEIEAFTAVDTLRRAGLNVQIVSVTPDEIVMGAHDVSLLCDINFENCD FFDADLLLLPGGMPGAATLDKHEGLRKLLLDFAAKGKPIAAICAAPMVLGKLGLLKGRKA TCYPSFEQYLEGAECVSEPVVRDGNIITGMGPGAAMEFALTIVDLLAGKEKVDELVEAMC VKR >gi|226332212|gb|ACIC01000108.1| GENE 49 74907 - 75785 1027 292 aa, chain - ## HITS:1 COG:no KEGG:BT_3921 NR:ns ## KEGG: BT_3921 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 292 1 292 292 418 100.0 1e-115 MDDRRKKGEFVGALGALLVHVAVIALLILVSFTVPQPDEDAGGVPVMLGNVESARGFDDP SLVDVDILPEEPEAPAPAETQPELPSEQDLLTQTEEETVVLKPKTEPKKETVKPKEVVKP KEPVKKPEKTEAEKAAEAKRLAEEKAERERKAAEEAAKKKVANAFGKGAQMGGSKGTSAS GTGTEGSKDGNSSTGAKTGTGGYGTFDLGGRSLGTGSLPKPAYNVPEEGRVVVNITVNPA GVVIGTSINPQTNTVNSTLRKAAEDAAKKARFNTVEGPNNQTGTITYYFNLR >gi|226332212|gb|ACIC01000108.1| GENE 50 75827 - 76243 349 138 aa, chain - ## HITS:1 COG:no KEGG:BF3738 NR:ns ## KEGG: BF3738 # Name: not_defined # Def: putative tansport related protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 138 6 143 145 238 89.0 6e-62 MGLKRRNRVSPNFSMASMTDVIFLLLIFFMITSTVVSPNAIKVLLPQGKQQTSAKPLTRV VIDKDLNFYAAFGNEKEKPLSLDELTPFLQGCAEKEPEMYVALYADESVPYREIVRVLNI ANENHFKMVLATRPPENK >gi|226332212|gb|ACIC01000108.1| GENE 51 76289 - 77008 991 239 aa, chain - ## HITS:1 COG:FN1312 KEGG:ns NR:ns ## COG: FN1312 COG0811 # Protein_GI_number: 19704647 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport 
proteins # Organism: Fusobacterium nucleatum # 35 223 1 190 202 92 31.0 7e-19 MNVLMLLAQGAMNMADSLATANPVLTPVSAPEMNMLDMAIKGGWIMIVLGVLSVVCFYIL FERNYMIRKAGKEDPMFMERIKDYIHSGEIKAAIQYCRTMNTPSARMIEKGISRLGRPIN DVQVAIENVGNLEVAKLEKGLTVMATISGGAPMLGFLGTVTGMVRAFYEMANAGSGNIDI TLLSGGIYEAMITTVGGLIVGIIAMFAYNYLVMLVDRVVNKMEARTMEFMDLLNEPAQK >gi|226332212|gb|ACIC01000108.1| GENE 52 77008 - 77745 748 245 aa, chain - ## HITS:1 COG:XF0060 KEGG:ns NR:ns ## COG: XF0060 COG0854 # Protein_GI_number: 15836665 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal phosphate biosynthesis protein # Organism: Xylella fastidiosa 9a5c # 10 244 5 250 260 202 47.0 3e-52 MRQILRKTMTKLSVNINKVATLRNARGGDTPNVVKVALDCEAFGADGITVHPRPDERHIR RADVYDLRPLLRTEFNIEGYPSPEFIDLVLKVKPHQVTLVPDDPSQITSNSGWDTKANLE FLSEVLDQFNSAGIRTSVFVAADPEMVEYAAKAGADRVELYTEPYATAYPKNPEAAVAPF VEAAKTARKLGIGLNAGHDLSLVNLNYFYKNIPWVDEVSIGHALISDALYLGLERTIQEY KNCLR >gi|226332212|gb|ACIC01000108.1| GENE 53 77870 - 78742 746 290 aa, chain + ## HITS:1 COG:PA3088 KEGG:ns NR:ns ## COG: PA3088 COG0061 # Protein_GI_number: 15598284 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar kinase # Organism: Pseudomonas aeruginosa # 64 286 64 289 295 164 37.0 2e-40 MKFAIFGNTYQPKKSLHALRLFELLKKQGAEICMCREFYQFLTADLKMEVPVDALLEGND FTADMVISIGGDGTFLKAARRVGRKQIPILGINTGRLGFLADVSPEEMEVTFEEIQAGRY SVEERSVLQLICNDRNLQESPYALNEIAVLKRDSSSMISIRTAINGAYLNTYQADGLVIA TPTGSTAYSLSVGGPIIVPHSNTIAITPVAPHSLNVRPIVIRDDWEITLDVESRSHNFLV AIDGSSETCKETTQLTIRRADYSIKVVKRFNHIFFDTLRSKMMWGADGRR >gi|226332212|gb|ACIC01000108.1| GENE 54 79051 - 79242 150 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|298384857|ref|ZP_06994416.1| ## NR: gi|298384857|ref|ZP_06994416.1| hypothetical protein HMPREF9007_01498 [Bacteroides sp. 
1_1_14] # 1 63 1 63 63 106 98.0 4e-22 MESLWKVWFSRRRKVYVRIARQYGSTPWRVYYLGHGGRCRSLKDMQILEALQRQGVISHI YPW >gi|226332212|gb|ACIC01000108.1| GENE 55 79792 - 80742 547 316 aa, chain - ## HITS:1 COG:no KEGG:BT_3916 NR:ns ## KEGG: BT_3916 # Name: not_defined # Def: site-specific recombinase IntIA # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 316 1 316 316 603 99.0 1e-171 MKQKRAKIGFTGCAKAYIRSLQEEGRYSTAHVYKNAVLSFTRFHGSTYIAFEQITRESLR RYGQYLYDCKLKLNTISTYMRMLRCIYNRGVETGIARFIPRLFRDVYTGVDVRQKKAIPV KELHTLLYKTPQSKHLRRTQEIARLMFQFCGMPFADFAHLEKSALAQGVLRYNRIKTGTS VSLEVLESSLPTISKLRNNDPVREDGTNYLFSILRGNKSPKGESMYKEYQSALRRFNNQL KSLSRELHLKSAVTSYTIRHSWATNAKYQGIPIEMISESLGHKSIKTTQIYLKGFELEKR TEANRLNCFYVENCNN >gi|226332212|gb|ACIC01000108.1| GENE 56 80885 - 81058 109 57 aa, chain - ## HITS:1 COG:no KEGG:BT_3915 NR:ns ## KEGG: BT_3915 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 57 24 80 80 82 96.0 3e-15 MMSKVNSSEFSVVDVRVTAIVVITVPSPVAVTALSLDYECVICVFVMGDSNNWINNT Prediction of potential genes in microbial genomes Time: Thu May 12 02:03:32 2011 Seq name: gi|226332211|gb|ACIC01000109.1| Bacteroides sp. 1_1_6 cont1.109, whole genome shotgun sequence Length of sequence - 55574 bp Number of predicted genes - 48, with homology - 48 Number of transcription units - 22, operones - 14 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 181 - 228 16.1 1 1 Op 1 . - CDS 462 - 1364 613 ## BT_3914 hypothetical protein 2 1 Op 2 . - CDS 1376 - 1951 547 ## BT_3913 hypothetical protein - Prom 1971 - 2030 2.0 - Term 1990 - 2036 6.0 3 2 Tu 1 . - CDS 2060 - 2932 993 ## BT_3912 hypothetical protein - Prom 3052 - 3111 4.5 + Prom 3037 - 3096 5.8 4 3 Tu 1 . + CDS 3116 - 4057 1137 ## COG0039 Malate/lactate dehydrogenases + Term 4089 - 4134 10.2 - Term 4079 - 4119 4.2 5 4 Op 1 . - CDS 4145 - 5275 1065 ## BT_3910 hypothetical protein 6 4 Op 2 . - CDS 5286 - 7514 2218 ## BT_3909 hypothetical protein 7 4 Op 3 . 
- CDS 7529 - 8608 1052 ## COG0714 MoxR-like ATPases 8 4 Op 4 . - CDS 8627 - 10102 962 ## BT_3907 hypothetical protein 9 4 Op 5 . - CDS 10089 - 11417 795 ## BT_3906 hypothetical protein - Prom 11592 - 11651 6.7 + Prom 11382 - 11441 7.7 10 5 Op 1 13/0.000 + CDS 11654 - 13117 1211 ## COG1538 Outer membrane protein + Prom 13125 - 13184 5.5 11 5 Op 2 9/0.000 + CDS 13204 - 14196 1285 ## COG0845 Membrane-fusion protein 12 5 Op 3 22/0.000 + CDS 14199 - 15380 647 ## COG0842 ABC-type multidrug transport system, permease component 13 5 Op 4 . + CDS 15380 - 16639 1024 ## COG0842 ABC-type multidrug transport system, permease component + Term 16669 - 16710 8.1 - Term 16732 - 16775 7.2 14 6 Op 1 . - CDS 16811 - 17368 619 ## BT_0646 hypothetical protein - Prom 17420 - 17479 2.2 15 6 Op 2 . - CDS 17528 - 17761 173 ## BT_3900 hypothetical protein - Prom 17940 - 17999 4.5 + Prom 17477 - 17536 5.8 16 7 Op 1 . + CDS 17760 - 18125 341 ## BT_3899 transcriptional regulator 17 7 Op 2 . + CDS 18143 - 19972 1206 ## BT_3898 TonB + Term 20016 - 20069 5.2 + Prom 20026 - 20085 6.4 18 8 Op 1 . + CDS 20117 - 22039 1228 ## BT_3897 putative thiol:disulfide interchange protein DsbE 19 8 Op 2 . + CDS 22087 - 22944 605 ## BT_3896 TonB + Term 22949 - 22996 15.9 - Term 22933 - 22988 19.1 20 9 Op 1 . - CDS 23013 - 24110 1173 ## COG0489 ATPases involved in chromosome partitioning - Prom 24134 - 24193 2.5 21 9 Op 2 . - CDS 24199 - 25041 702 ## COG0220 Predicted S-adenosylmethionine-dependent methyltransferase 22 9 Op 3 . - CDS 25068 - 25691 578 ## COG5523 Predicted integral membrane protein 23 9 Op 4 . - CDS 25732 - 26751 1218 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase 24 9 Op 5 . - CDS 26803 - 27009 280 ## BT_3891 hypothetical protein 25 9 Op 6 . - CDS 27021 - 28274 1150 ## COG1570 Exonuclease VII, large subunit - Prom 28303 - 28362 3.7 26 10 Op 1 . - CDS 28364 - 29719 1116 ## COG1404 Subtilisin-like serine proteases 27 10 Op 2 . 
- CDS 29744 - 30799 1003 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain - Prom 30976 - 31035 6.1 + Prom 31657 - 31716 4.1 28 11 Tu 1 . + CDS 31737 - 32615 564 ## BT_3886 hypothetical protein - Term 32548 - 32583 -0.5 29 12 Op 1 . - CDS 32590 - 33597 838 ## COG1409 Predicted phosphohydrolases 30 12 Op 2 . - CDS 33635 - 34114 576 ## COG0245 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase 31 12 Op 3 . - CDS 34154 - 34768 751 ## COG0179 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) 32 12 Op 4 . - CDS 34768 - 35430 591 ## COG2344 AT-rich DNA-binding protein - Prom 35450 - 35509 4.9 33 13 Op 1 . - CDS 35519 - 36691 826 ## BT_3881 hypothetical protein 34 13 Op 2 . - CDS 36618 - 37196 375 ## BT_3880 hypothetical protein - Prom 37296 - 37355 5.0 + Prom 37180 - 37239 5.3 35 14 Tu 1 . + CDS 37308 - 37658 386 ## COG0023 Translation initiation factor 1 (eIF-1/SUI1) and related proteins + Term 37666 - 37704 3.1 - Term 37651 - 37694 6.5 36 15 Op 1 38/0.000 - CDS 37717 - 38709 1309 ## COG0264 Translation elongation factor Ts - Prom 38732 - 38791 4.3 - Term 38765 - 38807 0.0 37 15 Op 2 . - CDS 38835 - 39671 1398 ## PROTEIN SUPPORTED gi|29349285|ref|NP_812788.1| 30S ribosomal protein S2 38 16 Op 1 59/0.000 - CDS 39793 - 40179 639 ## PROTEIN SUPPORTED gi|160883130|ref|ZP_02064133.1| hypothetical protein BACOVA_01099 39 16 Op 2 . - CDS 40186 - 40647 797 ## PROTEIN SUPPORTED gi|29349283|ref|NP_812786.1| 50S ribosomal protein L13 - Prom 40872 - 40931 6.8 - Term 40879 - 40916 3.4 40 17 Tu 1 . - CDS 40953 - 41444 495 ## BT_3874 hypothetical protein - Prom 41480 - 41539 7.1 - Term 41518 - 41566 9.2 41 18 Op 1 . - CDS 41584 - 42987 1499 ## COG0017 Aspartyl/asparaginyl-tRNA synthetases 42 18 Op 2 . - CDS 43015 - 44505 1822 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases 43 18 Op 3 . 
- CDS 44580 - 45926 1501 ## COG0015 Adenylosuccinate lyase - Prom 45982 - 46041 8.8 - TRNA 46063 - 46140 88.0 # Val TAC 0 0 - TRNA 46170 - 46247 88.0 # Val TAC 0 0 - Term 46451 - 46482 1.7 44 19 Op 1 . - CDS 46483 - 47034 401 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 45 19 Op 2 . - CDS 47040 - 47813 602 ## COG3022 Uncharacterized protein conserved in bacteria - Prom 47840 - 47899 8.8 + Prom 47871 - 47930 7.7 46 20 Tu 1 . + CDS 47960 - 49945 1720 ## COG3525 N-acetyl-beta-hexosaminidase - Term 50737 - 50778 -0.8 47 21 Tu 1 . - CDS 50780 - 54001 3651 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) - Prom 54038 - 54097 7.4 + Prom 54340 - 54399 6.3 48 22 Tu 1 . + CDS 54432 - 55532 1286 ## COG0180 Tryptophanyl-tRNA synthetase Predicted protein(s) >gi|226332211|gb|ACIC01000109.1| GENE 1 462 - 1364 613 300 aa, chain - ## HITS:1 COG:no KEGG:BT_3914 NR:ns ## KEGG: BT_3914 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 300 1 300 300 604 97.0 1e-171 MKKIHLLAMLLFTACVVPVSQAQESLFLKDPSTKNKLLQLDLDVGTNFDLAYRGKGNMFT AMSDQKSVFPAISLRLQHFFSRKWGWYTNIRLGIPVKYRRDCYTELAHAVEADYYVNNLI PGTQKPDVNPCLDFGVAYRFENSHWAFYPRLGIGVNSISYQRVCAELKKKGGNELYRIEY RGDDESNYGSESIDAFILSAGITANYKLSRNCFLLLNVNYIQPLGRFTYRKYVTDLYTGE KVERGVYKSSTFARDLNVSVGFGFPFYLGRKTNRKSPHRERTRQLMEQKRKTYGLFPGNK >gi|226332211|gb|ACIC01000109.1| GENE 2 1376 - 1951 547 191 aa, chain - ## HITS:1 COG:no KEGG:BT_3913 NR:ns ## KEGG: BT_3913 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 191 3 193 193 345 96.0 6e-94 MKKHLTQLLLSLLFLVVTAMTCDEQDWAEPIDVPCTLKEVELYHWDNAGEKPKEAVDNKV LKEAYMMEIRLLTDAGEKDPESYDADHYLRHVLSDGIKKIQIFTETAFNEEFPAGAEVTS CFYDYPKTFVKDQQTDYTVNGGVISMIDEVNKIYKALLTIPQSGGEFRFRVVLTMESGET VERLSDPVTFY >gi|226332211|gb|ACIC01000109.1| GENE 3 2060 - 2932 993 290 aa, chain - ## HITS:1 COG:no KEGG:BT_3912 NR:ns ## KEGG: BT_3912 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 290 1 290 290 456 100.0 1e-127 MNKKSLLIAAVAVLVIAIIGITYLLFTEKKANRELVQEFQLDKEDLENEYSQFVQKYDEL KFTVTNDSLALLLEQEQLKTQRLLEELRTVKSSNATEIRRLKKELATLRKILVGYVNQID SLDRINKRQQQVIADVTQKYNTASQQISTLSKEKENLDKKVTLAAQLDVTNIRIEPRNKR GKVAKKVKDIVKLAISFTVVKNITAENGERTIYIRITKPDNDALTKSASNTFSYENRTLT YSIKKYIEYNGEEQNVNVFWDVEEFLYAGNYRLDIFEGGNLIGSQKFTLD >gi|226332211|gb|ACIC01000109.1| GENE 4 3116 - 4057 1137 313 aa, chain + ## HITS:1 COG:BH3158 KEGG:ns NR:ns ## COG: BH3158 COG0039 # Protein_GI_number: 15615720 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Bacillus halodurans # 3 307 7 311 314 272 47.0 5e-73 MSKVTVVGAGNVGATCANVLAFNEVADEVVMLDVKEGVSEGKAMDMMQTAQLLGFDTTVV GCTNDYAQTANSDVVVITSGIPRKPGMTREELIGVNAGIVKSVAENILKYSPNAILVVIS NPMDTMTYLSLKALGLPKNRIIGMGGALDSSRFKYFLSQALGCNANEVEGMVIGGHGDTT MIPLTRFATYKGMPVTNFISEEKLNEVAAATMVGGATLTKLLGTSAWYAPGAAGAFVVES ILHDQKKMIPCSVYLEGEYGESDICIGVPVILGKNGIEKIVELDLNADEKAKFAASAKAV HGTNAALKEVGAL >gi|226332211|gb|ACIC01000109.1| GENE 5 4145 - 5275 1065 376 aa, chain - ## HITS:1 COG:no KEGG:BT_3910 NR:ns ## KEGG: BT_3910 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 376 1 376 376 717 99.0 0 MEEELLKRWRLILGGDEADGTGVTLNLEEQRIDHSLEAVYDSDRRGGLGSSAPKVSRWLG DIREFFPQTVVQVIQRDAIKRLNLTSLLTEKEMLETVVPDVHLVATLMSLSRVIPEKNKE MARQVVRKVVEELLRKLSAPTQQAVTGALNRYSRRRNPRYNEIDWKTTITKNLKNYQPDY KTIIPEIRIGYGRKRKAMKDIILCLDQSGSMGTSVIYSGIFGSVLASIPAVSTRMVVFDT AVVDLTDDLQDPVDLLFGVQLGGGTDIARALTYCQGVITRPQDTVMVLVTDLYEGGDSRE MRKKFVSLVNSGVQLIVLPALNDDGAPSYDKGHAEFLASIGVPTFACTPDKFPDLMAAAL SKQDIGMWVSQNVKSE >gi|226332211|gb|ACIC01000109.1| GENE 6 5286 - 7514 2218 742 aa, chain - ## HITS:1 COG:no KEGG:BT_3909 NR:ns ## KEGG: BT_3909 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 742 1 742 742 1488 99.0 0 MAIHLLGIRHHGPGSCRNVLNYLQELQPDLILVEGPAEAEALLACVENPQMTPPVALLAY 
QPDEPQNAVFYPFAEFSPEWQAMLYAVQNRIPFRFFDLPLIYSLALRTEKTSEQETETTA EVAEAGDPFDWLAHAAGFTDGESWWETMIEHRQEPADIFQAVQEAVTALREELPGHTSPR DLIREAWMRKMIRAAQKENFERIVVVCGAWHVPALDDMPKVKDDNELLKGLPKVKVECTW IPWTYDRLAFRSGYGAGIESPGWYHYLWHHPEDDGTLWGSRIASLLRQKNMDISVAHVIE TVRLAQVTAALRDLPYPSLNEYNEAVTTVMGFGDDILLQIIKEELIISNRLGSVPDDVPK VPLLVDVEKIQKRLRVPFTAEIKELILDLRKPNDLERSIFFHRLQLLGIDWTILGRIDGK GTFKEKWTLYHKPEQIIQIIERAIWGNTVMEATQKYLLKQMAEIQHIPELTALLSTVLPA DLPALVEAMTVQLDRLSAASTDILEMMEAVPDLVNIVRYGNVRDLDFSKVGDMLEAMIAR ILAGGVLVCINIDEEAAGDLLNKLVATDYALSILNREELNLMWRNFIGQVRSSANVHLLL SGYATRLLNDKGEISAEEMETTLSFYSSVGNAPADMAYWFEGFLRSSGSILLLDDRLWNL VNNWVCSQDKETFMELLPVLRRTFSEFTSAERRKLGEKARRTDSTGTSSAVGASVTENEC LNREEAKKVIPVIRQLLGLETK >gi|226332211|gb|ACIC01000109.1| GENE 7 7529 - 8608 1052 359 aa, chain - ## HITS:1 COG:ECs2927 KEGG:ns NR:ns ## COG: ECs2927 COG0714 # Protein_GI_number: 15832181 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Escherichia coli O157:H7 # 4 354 30 379 384 223 39.0 5e-58 MSTLLRQHAEQQFAEELHELKQNETNPVPENWEMSPQSVVTYLMGGKLKNGFEVSPKYIG NRRLMEIAVATLVTDRALLLYGLPGTAKSWVSEHIAAAISGDSTLIVQGTAGTGEEAIRY GWNYARLLAEGPSVGALVQTPVMKAMQEGKIGRIEELTRIGSDVQDTLITILSEKTLPIP ELNTEVQAIRGFNIIATANNRDKGVNDLSSALKRRFNTVILPLPDSIDEEIDIVRRRVES FEKVMQLPAEKPALEEIRRVVTIFRELRQGATTDGKTKLKTPSGTLSTAEAISVVNNGLA MAGYFGDGEIHAEDLVSGILGAVVKDPVQDKVVWQEYMETVVKNRDGWKDIYRASRDMI >gi|226332211|gb|ACIC01000109.1| GENE 8 8627 - 10102 962 491 aa, chain - ## HITS:1 COG:no KEGG:BT_3907 NR:ns ## KEGG: BT_3907 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 491 1 491 491 935 97.0 0 MNLTDHIINVALLGTATRELITTDFPEELQETLRDIQAKAEDAEALFYQQSALGFAFARA GVEAQSIAGVTNVTEAPEEDKPYFLREVGELLTSLYLNKNQYLLLYAYRRAADKGKLIPP AYLQTLLRRAFDRNNLYRYEEQHWLSLLAGQRGRWLLPQMGFPVWGESGNETWETASHEE RKRMLTNLRKNSPEQGLALLQTELKNESAAHRDELIQCLRWGLSKSDEAFLQEIVATDRS SNVKETARRLLCSLPDSELVKIYEELLRGKLHFNFLLGWSYDKIEFTPEMKKLGLEEVSS NKNEKDDRFLLRQLAERVPLSFWSEFYDCPPEKAASKLAKNPPFQKLFDLSKPILNFSDP 
GWAYHTLKENADEKMADALMGLLPSSQREEIAFQSERGGYIPDSWFNKDGIGWGIKFSTR VFQRMLRNNYYLPKETAEQLALYFPSEMRKPIEQTALSTAAQENNTSTRFCRLMMEYMDL KQRIDTLLNND >gi|226332211|gb|ACIC01000109.1| GENE 9 10089 - 11417 795 442 aa, chain - ## HITS:1 COG:no KEGG:BT_3906 NR:ns ## KEGG: BT_3906 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 442 1 442 442 870 98.0 0 MLLNLTEEQITQLAPDAASVKAGKGLANRTKWVLLEHSDRAIWGHCQGSGKTPYQTVVDT KNIAFKCSCPSRKFPCKHGLGLLFMYASHADLFKEAEEPDWVTAWLSKREEKAEKKEQKE KSETPVDEAAQAKRQAVRHQKVLAGIDDLQIWMKDLLRNGLLNIPERAHTLFEPISRRMI DAQAGGLAGRLRSLQEINYYTDSWKYELTDKLSKLYLLTESYKNLDSLSAEWQTEIRTQV GYPQAKEEVMSGVPVADQWIVLHTRSRKINELRTETFWLYGKSSHRMALYLNFLAPGALP EFNLVPGGVYEGELYFYQGVGALRALPQNMKWLDSTFLPSLCADIPVAMQSYRAVIQTHP FAEEVPLLVDNVRLVTSGNTHYLQDAGGMAIPVHIEETARVDILAITGGKPFSAFLLADA VSCELKTIWYQSEFYFWKDELN >gi|226332211|gb|ACIC01000109.1| GENE 10 11654 - 13117 1211 487 aa, chain + ## HITS:1 COG:jhp1382 KEGG:ns NR:ns ## COG: jhp1382 COG1538 # Protein_GI_number: 15612447 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Helicobacter pylori J99 # 23 465 48 481 510 79 24.0 2e-14 MKKLFLLTILLSLTFIVKAQSFLSLDSCRALALANNKDLLISNEKINAAHYQHKAAFTNY LPSFSATGTYMRNQKEFSLLNNDQKAALSGLGTNLAGPIQQTATEIATAHPELRPLIASL SEKLGAALPALDQAGNSLVDALRTDTRNVYAGAITLTQPLYMGGKIRAYNKITKYAEELA QQQHQGGMQEVIMSTDQAYWQVISLVNKKKLAEGYLKLLQQLDSDVEKMIAEGVATKADG LSVRVKVNEAEMTLTKVEDGLSLARMLLCQLCGIDLSSPITLADENMEDIPLLTPETHFD MSTAYANRPEIRSLELATQIYKQKINVTRAEHLPSIALMGNYMVTNPSVFNSFENKFKGM WNVGVMVQIPIWHWGEGIYKTKAAKAEARIAQYQLQDAREKIELQVNQSAFKVKEASKKL VMATKNMEKADENLRYATLGFKEGVIATSNVLEAQTAWLSAQSEKIDAQIDVKLTEIYLK KSLGTLK >gi|226332211|gb|ACIC01000109.1| GENE 11 13204 - 14196 1285 330 aa, chain + ## HITS:1 COG:VC1607 KEGG:ns NR:ns ## COG: VC1607 COG0845 # Protein_GI_number: 15641615 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: 
Vibrio cholerae # 17 328 6 322 324 218 45.0 1e-56 MAPIKSQNSNMLLAFLTLLGVIAIVAVVGFFMLRKGPEIIQGQAEVTEYRVSSKVPGRIL EFRVKEGQSVNAGDTLAILEAPDVVAKMEQARAAEAAAQAQNEKAIKGAREEQIQAAYEM WQKAQAGVTIAEKSYKRVKNLYEQGVMPAQKLDEVTAQRDAAIATEKAAKAQYTMAKNGA EREDKMAAEALVNRAKGAVAEVESYIKETYLIAPAAGEVSEIFPKVGELVGTGAPIMNIA ELNDMWVTFNVREDLLKNLTMGAEFEAVVPALDNKTVRLKVYYLKDLGTYAAWKATKTTG QFDLKTFEVKASPMEKVENLRPGMSVIINK >gi|226332211|gb|ACIC01000109.1| GENE 12 14199 - 15380 647 393 aa, chain + ## HITS:1 COG:VC1608 KEGG:ns NR:ns ## COG: VC1608 COG0842 # Protein_GI_number: 15641616 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Vibrio cholerae # 28 355 22 348 387 128 27.0 2e-29 MKERKKKYVALWQVMQRECRRLVSRPLYLFCMVIAPLFCYLFFTTLMDSGLPQNLPAGVV DMDDSSTSRNIVRNLDAFSQTGVVAHYSNVTDARIAVQEGKIYGFFYIPKGLSAEAQSQR QPKISFYTNYSYLIAGSLLFRDMKMMGELTAGSAARSVLYAKGATEDQAMGFLQPIVIDT HPLNNPWLNYSVYLCNTLIPGVLMLLIFMVTVYSIGVEIKDRTAREWLRMSNNSIYIALA GKLLPHTVVFFIMGIFYNVYLYGFLHFPCNSGIFPMIFATLCLVLASQCCGIVMIGTLPT LRLGLSFASLWGVISFSISGFSFPVMAMNPVLQALSNLFPLRHYFLIYVDQALNGYSMAY SWSNYMALLIFMMLPFFVVHRLKEALVYYKYIP >gi|226332211|gb|ACIC01000109.1| GENE 13 15380 - 16639 1024 419 aa, chain + ## HITS:1 COG:VC1609 KEGG:ns NR:ns ## COG: VC1609 COG0842 # Protein_GI_number: 15641617 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Vibrio cholerae # 17 381 29 388 408 152 28.0 2e-36 MKDIKFKDKIAQGINDLFYIWKREFQTTFRDQGVLIFFILVPLGYPLLYSFIYDNEVVRE VPAVVVDDSHSSLSREYLRKVDATPDIQIVAYCADMEEAKQMLKNRLAYGIIYIPSDFSD NIAKGKQTQVSIYCDMSGLLYYKSMLIANTAVSLDMNKDIKIARSGNTTERQDEITGYPI EYEEISIFNPTAGFAAFLIPAVLVLIIQQTLLLGIGLAAGTARENNRFKDLVPINRHYNG TLRIVLGKGLSYFLVYVLVSFYVLHIVPRLFSLNQIGQPGSLVLFVAPYLAACIFFAMTT SIAIRNRETCMLIFVFTSVPLLFISGISWPGAAIPPFWKYVSYIFPSTFGINGFVKINNM GATLSEVAFEYKALWIQAGVYFLTTCWVYRWQILMSRKHAIERYKELKEKENLAKQVTD >gi|226332211|gb|ACIC01000109.1| GENE 14 16811 - 17368 619 185 aa, chain - ## HITS:1 COG:no KEGG:BT_0646 NR:ns ## KEGG: BT_0646 
# Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 185 1 190 190 203 59.0 2e-51 MKKGLIFVLFALVSIVSYSQISWNAKVGMNMSNFTGDADTDMRVGFNVGVGMEYQFTDMW SIQPSLMFTQKGAKLGDIKANPMYLELPVMAAARFAVVDGQNVVVKAGPYLACGIAGKYK IAGVKSDFFGDDAAKRFDFGLGMGVAYEINRFFVDLSGEFGLVKMYDGDGSPKNINFSIG VGYKF >gi|226332211|gb|ACIC01000109.1| GENE 15 17528 - 17761 173 77 aa, chain - ## HITS:1 COG:no KEGG:BT_3900 NR:ns ## KEGG: BT_3900 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 77 1 77 77 145 96.0 6e-34 MEMMFNYDSFAKLRIFRNKAIASRLNIVNACLGWSDLRVKAPDSRKHENNVDKRYIHPNK QSLGHLGEQKAYRGFFI >gi|226332211|gb|ACIC01000109.1| GENE 16 17760 - 18125 341 121 aa, chain + ## HITS:1 COG:no KEGG:BT_3899 NR:ns ## KEGG: BT_3899 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 121 1 121 121 218 100.0 4e-56 MEKLTIQEEEVMIYIWELQDCFVKDIVSKFPQPAPPYTTVASIVKNLERKGYVKSKHIGN TYQYTPSIRENEYKRHFMSGVVRNYFENSYKEMVSFFAKDQKISTDDLKDIIELIEKGKE K >gi|226332211|gb|ACIC01000109.1| GENE 17 18143 - 19972 1206 609 aa, chain + ## HITS:1 COG:no KEGG:BT_3898 NR:ns ## KEGG: BT_3898 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 609 1 609 609 1217 99.0 0 MTPELAYFLKINVAIALFYAFYRLFFHKDTFFHWRRTALLCFFGISLLYPLLNIQEWIKE QEPMVAMADLYATIILPEQIIEAPQETTANWQELIPQILGFIYWTGVLLLALRFLIQLGS IIRLHFLCPKSTIQGSRIHILKKGTGPFSFFHWIFIHPESHTESEISEIITHEETHARQY HSADVLVSEIMCTFCWFNPFVWLMKREVRGNLEYMADHRVLETGHDSKSYQYHLLGLAHH KAAANLSNSFNVLPLKNRIKMMNKRRTKEIGRTKYLMFLPLAALLMIISNIEVVARTTKE FAKEMMAAPEEQGVSQTELANGPELPDGMTGVATLQDKKGMQKTKEVAPPPPPAPVKSAT VNDSVVFEVVEEMPDFPGGQSALMEYLAKNIKYPATAHENGKQGRVIVMFVVKKDGSISD VKTVRGVDPYLDKEAERVIAAMPNWKPGKQRGQAVNVRFTVPVTFRLSGPEPAKPAETPE AVAIEKFEEVVVVGYGPKEATPNNEPTFKVVDEMPKFPGGQEGLMRYLAKNIKYPTMAQQ NKEQGKVLVQIVIGKDGNVSNIKILEGASAWLDAEAIRVVRGMPKWEPGKQNGQAVAVEY TFPITFRLQ >gi|226332211|gb|ACIC01000109.1| GENE 18 20117 - 22039 1228 640 aa, chain + ## 
HITS:1 COG:no KEGG:BT_3897 NR:ns ## KEGG: BT_3897 # Name: not_defined # Def: putative thiol:disulfide interchange protein DsbE # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 640 1 634 634 1287 100.0 0 MKRFALMISLLLSILCMTQAKDRIVERPPFLAQSSSSIEIDKIVMSDTVTTVYIKAFYRP KYWIKIATGSVLKDNNGNLYPVRKGIGITLDKEFWMPESGEAEFQLTFPPIPQNVTCLDF SEGDFEGAYKIWGIQLDKKTFSKFKLPKDVTVTKADHKTTLPEPVVKYAKAILKGRILDY QKGMPNKGSLYFQDVIRDEAQEGKIRIQDDGTFYHEMNVFMVTPCMIRFPFGAVSYLLAP GETTSVVINLRENARKQSHLRQAEKPYGKEIYYGGYLAGVQQELADHPMSFSFTEGNSYE EYQQQIKKLNGKTAEEYKAQILARLPEFRKQITQSKGSPAYKEIINTDLELSAALKIIET ERNLKLVHIMTNGLEGEKANKYYAETKIDIPQGYYESLQDFTYILSPKACYNSRYPILFD IIRRIGIDDKILKSLSNHTAASYLIQNLKAQMIGVSMRDLNPLNDKQKAELSTMPAAYND MALAMNNDLLKQIEINKKKTGFTVNETGEVSNEDLFPSIISKFRGHTLLVDFWATWCGPC RSANKQILPMKEELKDKDIIYLYITGETSPLGTWKNMIPDIHGEHFRVTDEQWSYLREKF SIRGVPTYFVIDKEGNITYKQTGFPGVDTMKEQLMKALEK >gi|226332211|gb|ACIC01000109.1| GENE 19 22087 - 22944 605 285 aa, chain + ## HITS:1 COG:no KEGG:BT_3896 NR:ns ## KEGG: BT_3896 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 285 1 285 285 555 100.0 1e-157 MKHQQLSRANGTKRPFLLALIAFSLGFCHYAEAKVQPTEAFTSNVDSLIINDNNKDKDET IYEAVDEMPKFPGGTQALFKYLGENIQYPEEVQKLGIAGRVITQFVISKKGEITSVAVVR SLHPELDKQAIQAITAMPTWTPGKKDGKVVNVKFTLPINFHPTPAATTKATGEQQPDSTT DKVFMTAQKMPRFPKEQGELDQYLKQHITYWKNAAKQKEEGRVIVTFIVRKDGQITDARV VRSVSPTLDAEALRIISNMPKWEPGENNGVPVSVKYTIPIMFRLR >gi|226332211|gb|ACIC01000109.1| GENE 20 23013 - 24110 1173 365 aa, chain - ## HITS:1 COG:alr0652 KEGG:ns NR:ns ## COG: alr0652 COG0489 # Protein_GI_number: 17228148 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Nostoc sp. 
PCC 7120 # 8 349 10 350 356 263 43.0 6e-70 MTLYPKLILDALATVRYPGTGKNLLEAEMVADNLRIDGMAVSFSLIFEKPTDPFMKSMLK AAETAIHTYVSPDVQVTITTESKQAARPEVGKLLPQVKNIIGISSGKGGVGKSTVSANLA VALAKLGYKVGLLDADIFGPSMPKMFQVEDARPYAEKINGRDLIIPVEKYGVKLLSIGFF VDPDQATLWRGGMASNALKQLIGDADWGDLDYFLIDLPPGTSDIHLTVVQTLAMTGAIVV STPQAVALADARKGINMFTNDKVNVPILGLVENMAWFTPAELPENKYYIFGKEGAKKLAE EMNVPLLGQIPIVQSICEGGDNGTPVALDEDTVTGRAFLSLAASVVRQVDRRNVEMAPTK IVEMH >gi|226332211|gb|ACIC01000109.1| GENE 21 24199 - 25041 702 280 aa, chain - ## HITS:1 COG:L156302 KEGG:ns NR:ns ## COG: L156302 COG0220 # Protein_GI_number: 15672731 # Func_class: R General function prediction only # Function: Predicted S-adenosylmethionine-dependent methyltransferase # Organism: Lactococcus lactis # 48 237 15 203 217 120 33.0 3e-27 MEYGAICTPFIFVVYAKSKYLCVLISNEMGKNKLEKFADMASYPHVFEYPYSAVDNVPFD MKGKWHQEFFGNDHPIVLELGCGRGEYTVGLGRMFPDKNFIAVDIKGSRMWTGATESLQA GMKNVAFLRTNIEIIERFFAAGEVSEIWLTFSDPQMKKATKRLTSTYFMERYRKFLKPDG IIHLKTDSNFMFTYTKYMIEANQLPVEFITEDLYHSDLVDDILSIKTYYEQQWLDRGLNI KYIKFRLPQEGVLQEPDVEIELDPYRSYNRSKRSGLQTSK >gi|226332211|gb|ACIC01000109.1| GENE 22 25068 - 25691 578 207 aa, chain - ## HITS:1 COG:lin0656 KEGG:ns NR:ns ## COG: lin0656 COG5523 # Protein_GI_number: 16799731 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Listeria innocua # 6 165 4 166 345 124 41.0 1e-28 MLKQNSELRAQAREALRGKWPMAAVAALIYSAIAGGLSSIPFIGWIGSLLVGLPVAYGFA ILMLAVFRGAEEVDLGVLFAGFQEYSRILTTKLLQAVYTFLWSLLLLIPGIIKHYSYAMT DYILKDEPELCNNAAIERSMAMMEGNKMKLFLLDLSFIGWAILCLFTFGIGFFVLQPYVQ VSHAAFYEDLKAQQGGININVEVTVES >gi|226332211|gb|ACIC01000109.1| GENE 23 25732 - 26751 1218 339 aa, chain - ## HITS:1 COG:L0086 KEGG:ns NR:ns ## COG: L0086 COG0115 # Protein_GI_number: 15673270 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Lactococcus lactis # 4 339 5 340 340 342 51.0 7e-94 
MKEIDWANLSFGYMKTDYNVRINFRNGAWGELEVSSDEHLNLHMAATCLHYGQEAFEGLK AFRGKDGKVRIFRLEENAARLQSTCQGILMAELPTERFKEAILKVVKLNERFIPPYETGA SLYIRPLLIGTSAQVGVHPAEEYMFVVFVTPVGPYFKGGFSTNPYVIIREFDRAAPHGTG IYKVGGNYAASLRANKKAHDLGYSCEFYLDAKEKKYIDECGAANFFGIKDNTYITPKSSS ILPSITNKSLMQLAEDMGIKVERRPIPEEELETFEEAGACGTAAVISPIQRIDDLENGKS YVISKDGKPGPICTKLYNKLRGIQYGDEPDTHGWVTIVE >gi|226332211|gb|ACIC01000109.1| GENE 24 26803 - 27009 280 68 aa, chain - ## HITS:1 COG:no KEGG:BT_3891 NR:ns ## KEGG: BT_3891 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: Mismatch repair [PATH:bth03430] # 1 68 1 68 68 73 100.0 2e-12 MAAKKETYSQAMERLEKIVRQIDNNELDIDTLSEKIKEANEIIAFCTEKLTKADREVEKL LQERRQSE >gi|226332211|gb|ACIC01000109.1| GENE 25 27021 - 28274 1150 417 aa, chain - ## HITS:1 COG:DR0186 KEGG:ns NR:ns ## COG: DR0186 COG1570 # Protein_GI_number: 15805222 # Func_class: L Replication, recombination and repair # Function: Exonuclease VII, large subunit # Organism: Deinococcus radiodurans # 4 408 28 415 416 153 31.0 7e-37 MDSLSLLELNALVRRSLEQCLPDEYWIQAELSDVRSNTTGHCYLEFVQKDPRSNNLVAKA RGMIWSNIYRLLKPYFEETTGQLFASGIKVLVKVTVQFHELYGYSLTVLDIDPAYTLGDM ARRRREILMQLEEEGVLTLNKELEMPVLPQRIAVISSATAAGYGDFCHQLQHNSGGFFFY TELFPALMQGNQVEESVLAALDRINDRVNEFDVVVIIRGGGATSDLSGFDTYLLAAACAQ FPLPVITGIGHERDDTVLDSVAHTRVKTPTAAAELLIHRITESADHLEELSARLQQGAYA LLEQEGRRLEMIQTRIPNLVHRKLTDARFALLAAGKDLAQATQTLLSRHRHRLELLRQRV ADASPDKLLSRGYSITLKDGKAVTDAASLNPGDQLVTRLAKGSFTSEVRLDEHYYSS >gi|226332211|gb|ACIC01000109.1| GENE 26 28364 - 29719 1116 451 aa, chain - ## HITS:1 COG:BS_aprX KEGG:ns NR:ns ## COG: BS_aprX COG1404 # Protein_GI_number: 16078789 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Bacillus subtilis # 169 438 146 431 442 89 29.0 2e-17 MKRLVLAALILTTAVGASAQFTSTDTLKYRISLTDKAATTYSIQQPEEFLSRKSIDRRLR QKLAIDSTDLPVCKKYVDAIRKKGVHVLVTGKWDNFVTVSCNDSMLIDEIAKLPFVRSTE KVWQGKVSAVTKRDSLINDPQRSDSLYGPAITQIEMSRANLLHDAGFKGQGMTIAVIDAG 
FHNADRIEAMKNIRILGTRDFVNPEADIYAESNHGMSVLSCMAMNQPNVMVGTAPEASYW LLRSEDEASEHLVEQDYWAAAIEFADSVGVDVVNTSLGYYTFDDPTKNYRYRDLNGHYAL MSREASKAADKGMVLVCSAGNSGSGSWKKITPPGDAENVLTVGAIDKKKLLAPFSSVGNT ADGRVKPDVVAVGLKADIVGTDGNLRLANGTSFSSPIMCGMVACLWQACPRLTAKEVIEL VRSVGDRSDFPDNIYGYGIPDLWKAYQMSKE >gi|226332211|gb|ACIC01000109.1| GENE 27 29744 - 30799 1003 351 aa, chain - ## HITS:1 COG:CAC2233 KEGG:ns NR:ns ## COG: CAC2233 COG0482 # Protein_GI_number: 15895501 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Clostridium acetobutylicum # 1 334 9 354 355 247 38.0 3e-65 MSGGIDSTATCLMLQEQGYEIVGVTMRVWGDEPQDARELAARMGIEHYVADERIPFKETI VKNFIDEYKQGRTPNPCVMCNPLFKFRVLAEWADKLDCAYIATGHYSRLEERSGHIYIVA GDDDKKDQSYFLWRLGQDILKRCIFPLGDYTKIKVREYLAEKGYEAKSKEGESMEVCFIQ GDYRDFLREQCPELDTEIGPGWFVNSEGVKLGQHKGAPYYTIGQRKGLEIALGKPAYVLK INPQKNTVMLGDAEQLQTEYMLAEQDRIVDEQELFSCENLAVRIRYRSRPIPCRVKRLED GRLLVRFRETASAVAPGQSAVFYDGRRVLGGAFIASQRGIGLVIIENKDKL >gi|226332211|gb|ACIC01000109.1| GENE 28 31737 - 32615 564 292 aa, chain + ## HITS:1 COG:no KEGG:BT_3886 NR:ns ## KEGG: BT_3886 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 292 1 292 292 572 100.0 1e-162 MANPKLPGISESEQALLYAKLNEYNRGRASFKEAGVYLVVLPRPGKPNYSLWLYSPLPEK QSILYIHDLSPDINESLRIASTMFYYSRRCLILMDYNEKRMQSNGDDLIFFGKYRGHFLH EILKVDPAYLSWIAYKFTPKIPKQERFVKIAQAYHSVHLDVMLRKSKEVRRKSRYLGEVN EKLTDLKLKVMRVRLEDDPYKTRVYGTTPQFFVKQILTLNDASGNLVTMSIPSKTPSTVS CTLSGIEHEYRPGEIIYVASARVSRLFESYGSKYTRLSNVKLAIVNAWSSRL >gi|226332211|gb|ACIC01000109.1| GENE 29 32590 - 33597 838 335 aa, chain - ## HITS:1 COG:CAC2806 KEGG:ns NR:ns ## COG: CAC2806 COG1409 # Protein_GI_number: 15896061 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 23 302 26 303 317 158 38.0 2e-38 
MNRTLKLFLALVSLCMTVFCFAQKSELRFNKDGKFKIVQFTDVHFKYKNPASDIALERIN QVLDEEQPDFVIFTGDVVYSAPADKGMLQVLEQVSKRKLPFVVTFGNHDNEQGMTREQLY DIIRQVPGNLMPDRGSVLSPDYVLTVKASSDAKKDAAVLYCMDSHSYSPLKDVKGYAWLT FDQVNWYRQQSAAYTAQNGGKPLPALAFFHIPVPEYNEAASDENAILRGTRMEEACAPKL NTGMFAAMKESGDVMGIFVGHDHDNDYAVMWKGILLAYGRFTGGNTEYNHLPNGARIIVL DEGARTFTSWIRQKDGIVDKVTYPASFVKDDWTKR >gi|226332211|gb|ACIC01000109.1| GENE 30 33635 - 34114 576 159 aa, chain - ## HITS:1 COG:NMB1512 KEGG:ns NR:ns ## COG: NMB1512 COG0245 # Protein_GI_number: 15677365 # Func_class: I Lipid transport and metabolism # Function: 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase # Organism: Neisseria meningitidis MC58 # 3 156 4 157 160 175 55.0 3e-44 MKIRVGFGFDVHQLVEGRELWLGGIRLEHSKGLLGHSDADVLLHTVCDALLGAANMRDIG YHFPDTAGEYKNIDSKILLKKTVELIATKGYRVGNIDATICAERPKLKAHIPLMQETMAA VMGIDAEDISIKATTTEKLGFTGREEGISAYATVLIEKD >gi|226332211|gb|ACIC01000109.1| GENE 31 34154 - 34768 751 204 aa, chain - ## HITS:1 COG:ycgM KEGG:ns NR:ns ## COG: ycgM COG0179 # Protein_GI_number: 16129143 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) # Organism: Escherichia coli K12 # 2 198 18 214 219 158 39.0 7e-39 MKIIAVGMNYALHNQELGHTSVNKEPVIFMKPDSAILKDGKPFFIPDFSNEVHYETELVV RINRLGKNISPRFASRYYDALTVGIDFTARDLQRKFREAGNPWELCKGFDSSAAIGTFVP VDRYKDIQSLNFNLMIDGKEVQRGCTADMLFKVDDIIAYVSKFVTLKIGDLLFTGTPAGV GPVSIGQHLQGYLEGEKLLDFYIR >gi|226332211|gb|ACIC01000109.1| GENE 32 34768 - 35430 591 220 aa, chain - ## HITS:1 COG:lin2178 KEGG:ns NR:ns ## COG: lin2178 COG2344 # Protein_GI_number: 16801243 # Func_class: R General function prediction only # Function: AT-rich DNA-binding protein # Organism: Listeria innocua # 11 210 4 203 215 139 36.0 4e-33 MAENQKIQQIDCTKVPEPTLRRLPWYLSNVKLLKQRGERFVSSTQISKEINIDASQIAKD LSYVNISGRTRVGYEVDALIEVLEHFLGFTEIHKAFLFGVGSLGGALLQDSGLKHFGLEI VAAFDVDPTLVGTNLNGIPIYHSDDFLKKMEEYDVQIGVLTVPIEIAQCITDMMVDGGIK 
AVWNFTPFRIRVPEDIVVQNTSLYAHLAVMFNRLNFNEIK >gi|226332211|gb|ACIC01000109.1| GENE 33 35519 - 36691 826 390 aa, chain - ## HITS:1 COG:no KEGG:BT_3881 NR:ns ## KEGG: BT_3881 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 390 1 390 390 714 100.0 0 MGKTYRTSDKKAWIGWKGLPSYVSEYNFVDGFWLGAKFETGLKLSEASTLRFTPSAYYTT ARKAVAGQSELSLSYAPRRRGYLELSGGVLSADYNGESGESRLINTVASSFFGRNDVKLY EKRFLSFQNKIELANSLLLSTSLSWQRRQMLENHIHRSWFKKEAESNIPDSRAFYPMPEN ELLKASFALEYTPAHYYRMYRGKKVYEESKCPTFTLRYDRAFPLKGALPSPSYHLAEFSA RQSIEFGMFNTLDWAVNAGTFWNKSGMQFPDFKHFATTGLPVTERSFDTGFSLLDNYAYS TNTRWVQANISWYTPCLLLKFLPFLKKKNLDEALHLRTLVEYDRRPYSEIGYSIGFMKLA RLGVFLGFDSLKYRSAGVSVSIPLSTFAGE >gi|226332211|gb|ACIC01000109.1| GENE 34 36618 - 37196 375 192 aa, chain - ## HITS:1 COG:no KEGG:BT_3880 NR:ns ## KEGG: BT_3880 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 192 2 193 193 382 98.0 1e-105 MKQSIAFALICSFTISLWAQTFKGRVVDLAGNPIPYAALYLKELKTGFTTDDNGCFQTSL PAGQYTCEVSSLGFAGQTFSFSISAGDVEKNIVLSEQIYSLREVNVVKGVEDPAYSVMRK AIANAPYYRTQVKGFRAGTYLKGTGKMKTIPAILKLSKEVRKESKEFPDGQNLSYFRQES MDWLERLAFLCV >gi|226332211|gb|ACIC01000109.1| GENE 35 37308 - 37658 386 116 aa, chain + ## HITS:1 COG:alr3795 KEGG:ns NR:ns ## COG: alr3795 COG0023 # Protein_GI_number: 17231287 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 1 (eIF-1/SUI1) and related proteins # Organism: Nostoc sp. 
PCC 7120 # 33 116 33 115 115 69 48.0 2e-12 MKNNDWKDRLNVVYSTNPDFNYDMDDDQEQVTLDKSKQNLRVSLDRKNRGGKVVTLITGF VGTENDQKELGKMLKSKCGVGGSAKDGEIIVQGDFKQKVIDLLIKEGYTKTKGVGG >gi|226332211|gb|ACIC01000109.1| GENE 36 37717 - 38709 1309 330 aa, chain - ## HITS:1 COG:PM1985 KEGG:ns NR:ns ## COG: PM1985 COG0264 # Protein_GI_number: 15603850 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor Ts # Organism: Pasteurella multocida # 3 225 4 222 282 117 35.0 3e-26 MAVTMADITKLRKMTGAGMMDCKNALTEAEGDYDKAMEIIRKKGQAVAAKRSERDASEGC VLAKTTGDYAVVIALKCETDFVAQNADFVKLTQDILDLAVANKCKTLDEVKALPMGNGTV QDAVVDRSGITGEKMELDGYMTVEGTSTAVYNHMNRNGLCTIVAFNKNVDDQLAKQVAMQ IAAMNPIAIDEDGVSEEVKEKEIAVAIEKTKAEQVQKAVEAALKKANINPAHVDSEEHME SNMAKGWITAEDIAKAKEIIATVSAEKAAHLPEQMIQNIAKGRLGKFLKEVCLLNQEDIM DAKKTVREVLAAADPELKVLDFKRFTLKAE >gi|226332211|gb|ACIC01000109.1| GENE 37 38835 - 39671 1398 278 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|29349285|ref|NP_812788.1| 30S ribosomal protein S2 [Bacteroides thetaiotaomicron VPI-5482] # 1 278 1 278 278 543 100 1e-153 MSRTNFDTLLEAGCHFGHLKRKWNPAMAPYIFMERNGIHIIDLHKTVAKVDEAAEALKQI AKSGKKVLFVATKKQAKQVVAEKAQSVNMPYVIERWPGGMLTNFPTIRKAVKKMATIDKL TNDGTYSNLSKREVLQISRQRAKLDKTLGSIADLTRLPSALFVIDVMKENIAVREANRLG IPVFGIVDTNSDPTNVDFVIPANDDATKSVEVILDACCAAMIEGLEERKAEKIDMEAAGE APANKGKKKSVKARLDKSDEEAINAAKAAAFIKEDEEA >gi|226332211|gb|ACIC01000109.1| GENE 38 39793 - 40179 639 128 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160883130|ref|ZP_02064133.1| hypothetical protein BACOVA_01099 [Bacteroides ovatus ATCC 8483] # 1 128 1 128 128 250 99 1e-65 MEVVNALGRRKRAIARVFVSEGTGKITINKRDLAEYFPSTILQYVVKQPLNKLGAAEKYD IKVNLCGGGFTGQSQALRLAIARALVKMNAEDKAALRAEGFMTRDPRSVERKKPGQPKAR RRFQFSKR >gi|226332211|gb|ACIC01000109.1| GENE 39 40186 - 40647 797 153 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|29349283|ref|NP_812786.1| 50S ribosomal protein L13 [Bacteroides thetaiotaomicron VPI-5482] # 1 153 1 153 153 311 99 5e-84 
MDTLSYKTISANKATVTKEWVIVDATDQTLGRLGAKVAKLLRGKYKPNFTPHVDCGDNVI IINADKVKLTGNKWNDRVYLSYTGYPGGQREMTPARLITKPNGEERLLKKVVKGMLPKNI LGAKLLGNLYVYAGSEHKHEAQSPKMIDINSYK >gi|226332211|gb|ACIC01000109.1| GENE 40 40953 - 41444 495 163 aa, chain - ## HITS:1 COG:no KEGG:BT_3874 NR:ns ## KEGG: BT_3874 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 163 1 163 163 327 100.0 1e-88 MKKLYFFTMLSIMLLAVTGATAQKKTKFKPADLKGIWQLCHYVSESPDVPGSLKPSNTFK VLSDDGRIVNFTIIPGSDAIITGYGTYKQLTDDSYKESIEKNIHLPMLDNQDNILEFEIK DNDYLHLKYFIKNDLNGNELNAWYYETWKRVEMPAKFPEDIVR >gi|226332211|gb|ACIC01000109.1| GENE 41 41584 - 42987 1499 467 aa, chain - ## HITS:1 COG:sll0495 KEGG:ns NR:ns ## COG: sll0495 COG0017 # Protein_GI_number: 16332045 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl/asparaginyl-tRNA synthetases # Organism: Synechocystis # 4 467 52 513 513 553 55.0 1e-157 MEKIGRTKIVDLLKREDIGAMVNVKGWVRTRRGSKQVNFIALNDGSTINNVQIVVDLANF DEEMLKLITTGACISVNGVMVESVGSGQKVEVQAKEIEVLGTCDNTYPLQKKGHSMEFLR EIAHLRPRTNTFGAVFRIRHNMAIAIHKFFHEKGFFYFHTPIITGSDCEGAGQMFQVTTM NLYDLKKDERGSISYDDDFFGKQASLTVSGQLEGELAATALGAIYTFGPTFRAENSNTPR HLAEFWMIEPEVAFNDIADNMDLAEEFIKYCVKWALDNCADDVKFLNDMFDKGLIERLQG VLKDDFVRLPYTDGIKILEDAVAKGHKFEFPVYWGVDLASEHERYLVEEHFKRPVILTDY PKEIKAFYMKQNEDGKTVRAMDVLFPKIGEIIGGSEREADYNKLMTRIEEMHIPMKDMWW YLDTRKFGTCPHSGFGLGFERLLLFVTGMSNIRDVIPFPRTPRNADF >gi|226332211|gb|ACIC01000109.1| GENE 42 43015 - 44505 1822 496 aa, chain - ## HITS:1 COG:SPy0369 KEGG:ns NR:ns ## COG: SPy0369 COG1187 # Protein_GI_number: 15674518 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Streptococcus pyogenes M1 GAS # 261 490 1 232 240 166 39.0 7e-41 MSTENEEWRENSYNEENTGAGRDGNKSYNREGGERPYRPSYNREGGDRPYRPRFNPNSEG GERPQRSYGERSYDRPQRPSYNREGGDRPYRPRFNSNNEGGERPQRPYNREGGEQRSYDR PQRPSYNREGGDRPYRPRFNPNGEGGDRPQRPYNREGGEQRSYGDRPYRPRFNSGEGGDR 
PQRPYNREGGDRPYRPRFNSGEGGGYRNNNGGGYRPRFNNDRSQGGYRPRPRTSDYDPNA KYSIKKQIEYKEQFVDPNEPIRLNKFLANAGVCSRREADEFITAGVVSVNGEVVTELGTK IKRSDVVKFHDEPVSIERKVYVLLNKPKDTVTTSDDPQERRTVMDLVKGACNERIYPVGR LDRNTTGVLLLTNDGDLASKLTHPKFLKKKIYHVHLDKNLTKADMDQIAAGVQLEDGEIH ADAISYTDDFKKDQVGIEIHSGKNRIVRRIFESLGYKVVKLDRVFFAGLTKKGLRRGDWR YLSEAEVNYLRMGSFE >gi|226332211|gb|ACIC01000109.1| GENE 43 44580 - 45926 1501 448 aa, chain - ## HITS:1 COG:PA2629 KEGG:ns NR:ns ## COG: PA2629 COG0015 # Protein_GI_number: 15597825 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate lyase # Organism: Pseudomonas aeruginosa # 1 447 1 447 456 453 50.0 1e-127 MELDVLTAISPIDGRYRGKTKALAAYFSEFALIKYRVQVEVEYFITLCELPLPQLKGIDS SVFETLRNIYRNFSEADAQRIKDIESVTNHDVKAVEYFLKEEFDKLGGMDDYKEFIHFGL TSQDINNTSVPLSIKEALDKVYYPLIEELIAQLKTYATEWAEIPMLAKTHGQPASPTRLG KEVMVFVYRLERQLAMLKACPITAKFGGATGNYNAHHVAYPQYDWKQFGNRFVAEKLGLE REEYTTQISNYDNLSAVFDAMKRINTIMVDMNRDFWQYISMEYFKQKIKAGEVGSSAMPH KVNPIDFENAEGNLGIATSILEHLAVKLPVSRLQRDLTDSTVLRNVGVPFGHIVIAIQSS LKGLRKLLLNEPAIYRDLDNCWSVVAEAIQTILRREAYPRPYEALKALTRTNQAITESSI KEFIEGLNVSEDIKKELRAITPHTYTGL >gi|226332211|gb|ACIC01000109.1| GENE 44 46483 - 47034 401 183 aa, chain - ## HITS:1 COG:BS_yyaI KEGG:ns NR:ns ## COG: BS_yyaI COG0110 # Protein_GI_number: 16081137 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Bacillus subtilis # 2 181 4 183 184 157 44.0 9e-39 MTEVEKMRSSQLADMSAPELQVRFEHAKKLLAIMRGLSTYDDGYRELLEELVPGIPETSV ICPPFHCDHGDGIKLGEHVFVNANCTFLDGGYITIGAHTLVGPCVQIYTPHHPMNYLERR GSKEYAYPVTIGEDCWIGGGAVICPGVTIGNRCVIGAGSVVTKDIPDDSVAVGNPARLIR KQA >gi|226332211|gb|ACIC01000109.1| GENE 45 47040 - 47813 602 257 aa, chain - ## HITS:1 COG:ECs0006 KEGG:ns NR:ns ## COG: ECs0006 COG3022 # Protein_GI_number: 15829260 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 251 1 252 258 158 35.0 7e-39 
MLVLLSCAKTMSAVSKVKVPLTTIPRFQKEAAEIALQMSQFSVDELERLLRVNAKIAVEN YKRYQAFHAETTPELPALLAYTGIVFKRLNPKDFSAEDFGYAQEHLRLTSFCYGLLRPLD VIRAYRLEGDVVLPELGNQTLFSYWRSRLTDTFIEDIRSAGGILCNLASDEMKSLFDWKR VESEVRVVTPEFHVWKNGKLATIVVYTKMSRGEMTRFILKNKIGNPEELKGFSWEGFEFD ESLSDERKLVFINGMGE >gi|226332211|gb|ACIC01000109.1| GENE 46 47960 - 49945 1720 661 aa, chain + ## HITS:1 COG:SMb21160 KEGG:ns NR:ns ## COG: SMb21160 COG3525 # Protein_GI_number: 16264574 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Sinorhizobium meliloti # 92 477 201 619 639 100 24.0 8e-21 MRNKFVSLKVLALLVIFCLTGSLTRAAVNPKPFVVPELKQWTGKDGNFTPGKDARIVCTS QNPELLRIARMFADDYQQMFGQTLSVAQGKATPGDFVLSLSADKKLGEEGYAIKITDRVA ISAPTPTGLYWSTRTLLQLAEQNQERSLPQGTIRDYPDYPLRGFMIDCGRKFIPMAYLQD LVKIMAYYKMNTLQVHLNDNGFKQYFEHNWDKTYAAFRLESETYPGLTARDGSYSKKEFI DFQKQAVSNFVEIIPEIDVPAHSLALTHYKPEIGSKEYGMDHLDLFKPETYEFVDALFKE YLEGDNPVFVGKRVHIGTDEYSNAKKDVVEKFRAFTDHYIRFVEGFGKQAVVWGALSHAK GDTPVKSENVVMNAWYNGYADPATMIKDGYQLISIPDGLVYIVPKAGYYYDYLNEPYLYK EWTPAHIGKAVFDEKHPSILGGMFAIWNDHVGNGISVKDIHHRIFSPLQTLSVKMWTGAQ TGIPYETFNEKRALLSEAPGVNQLARIGKKPELVYERSTVAPGSTSDYPEIGYNYTVSFD ITGAKESEGTELFRSPNAVFYLSDPIRGMMGFARDGYLNTFPYKVNPGEKATIQIEGNNC STTLRVNGKVVDEMNTQKLYFNAGKDSMNYVRTLVFPLEKAGNFNSKVQNLKVYNYCVSK P >gi|226332211|gb|ACIC01000109.1| GENE 47 50780 - 54001 3651 1073 aa, chain - ## HITS:1 COG:YJL130c_2 KEGG:ns NR:ns ## COG: YJL130c_2 COG0458 # Protein_GI_number: 6322331 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Saccharomyces cerevisiae # 4 1055 4 1051 1070 1157 54.0 0 MEKEIKKVLVLGSGALKIGQAGEFDYSGSQALKALKEEGISSVLVNPNIATIQTSEGIAD KVYFLPVTTYFVEEIIKKERPDGILLAFGGQTALNCGAELYTQGILDKYGVKVLGTSVEA IMYTEDRDLFVKKLDEINMKTPISQAVESMEDAIAAARKIGYPVMVRSAYALGGLGSGIC ANEEEFLKLAESSFAFSKQILVEESLKGWKEIEFEVIRDANDHCFTVASMENFDPLGIHT GESIVVAPTCSLDDKELKLLQELSTKCIRHLGIVGECNIQYAFNSDTDDYRVIEVNARLS 
RSSALASKATGYPLAFVAAKIALGYSLDQIGEMGTPNSAYLAPQLDYYICKIPRWDLTKF AGVSREIGSSMKSVGEIMSIGRSFEEIIQKGLRMIGQGMHGFVGNDDVHFDDLDKELSRP TDLRIFSIAQAMEEGYSIDRIHELTKIDPWFLGKLKNIVDYKAKLSTYNKVEDIPADVMR EAKILGFSDFQIARFVLNPTGNMEKENLAVRAHRKALGILPAVKRINTVASEHPELTNYL YMTYAVEGYDVNYYKNEKSVVVLGSGAYRIGSSVEFDWCSVNAVQTARKLGYKSIMINYN PETVSTDYDMCDRLYFDELSFERVLDVIDLEQPRGVIVSVGGQIPNNLAMKLYRQSVPVL GTSPVSIDRAENRNKFSAMLDQLGIDQPAWMELTSLEEVKGFVEKVGYPVLVRPSYVLSG AAMNVCYDDEELENFLKMAAEVSKEYPVVVSQFLENTKEIEFDAVAQNGEVVEYAISEHV EFAGVHSGDATLVFPAQKIYFATARRIKKISRQIAKELNISGPFNIQFLARNNEVKVIEC NLRASRSFPFVSKVLKRNFIETATRIMLDAPYSQPDKTAFDIDWIGVKASQFSFSRLHKA DPVLGVDMSSTGEVGCIGDDFSEALLNAMIATGFKIPEKAVMFSSGAMKSKVDLLDASRM LFAKGYQIYATAGTAAFLNAHGVETTPVYWPDEKPGAENNVMKMIADHKFDLIVNIPKNH SKRELTNGYRIRRGAIDHNIPLITNARLASAFIEAFCELKLSDIQIKSWQEYK >gi|226332211|gb|ACIC01000109.1| GENE 48 54432 - 55532 1286 366 aa, chain + ## HITS:1 COG:SP2229 KEGG:ns NR:ns ## COG: SP2229 COG0180 # Protein_GI_number: 15902033 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Tryptophanyl-tRNA synthetase # Organism: Streptococcus pneumoniae TIGR4 # 6 349 5 341 341 417 61.0 1e-116 MGKEKIILTGDRPTGRLHIGHYVGSLKRRVDLQDAGDYSKMFIFIADSQALTDNIDNPEK VRQNVIEVALDYLACGIDPSKATIFIQSQIPELCELSFYYMDLVSVSRLQRNPTVKSEIQ MRNFEASIPVGFFTYPISQAADITAFRATTVPVGEDQEPMLEQAREIVRRFNYIYGETLV EPEILLPDNAACLRLPGTDGKAKMSKSLGNCIYLSEEPEEIQKKIMSMYTDPGHLRVQDP GKIEGNTVFTYLDAFCRPEHFERYLPDYPNLDELKAHYQRGGLGDVKVKRFLNAIMQETL EPIRNRRKEFSKDIPAVYEMLQKGCEVARAAAAETLADVKKAMKINYFDDKELIEEQVKR FAASQE Prediction of potential genes in microbial genomes Time: Thu May 12 02:04:58 2011 Seq name: gi|226332210|gb|ACIC01000110.1| Bacteroides sp. 1_1_6 cont1.110, whole genome shotgun sequence Length of sequence - 33632 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 6, operones - 3 average op.length - 5.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 24 - 2399 1925 ## COG5373 Predicted membrane protein - Prom 2419 - 2478 2.9 2 1 Op 2 . 
- CDS 2480 - 3628 713 ## BT_3862 endo-alpha-mannosidase - Term 3635 - 3686 7.4 3 2 Op 1 . - CDS 3703 - 4845 937 ## BT_3861 hypothetical protein 4 2 Op 2 . - CDS 4872 - 6002 1088 ## BT_3860 hypothetical protein 5 2 Op 3 . - CDS 6036 - 8012 1602 ## BT_3859 hypothetical protein 6 2 Op 4 . - CDS 8051 - 10315 1444 ## COG3537 Putative alpha-1,2-mannosidase 7 2 Op 5 . - CDS 10350 - 12257 1480 ## BT_3857 hypothetical protein 8 2 Op 6 . - CDS 12276 - 12971 606 ## BT_3856 hypothetical protein 9 2 Op 7 . - CDS 13000 - 14931 1692 ## BT_3855 putative outer membrane protein 10 2 Op 8 . - CDS 14943 - 18074 2824 ## BT_3854 hypothetical protein - Term 18293 - 18332 4.4 11 3 Tu 1 . - CDS 18543 - 21116 1603 ## BT_3853 hypothetical protein - Prom 21186 - 21245 5.2 - Term 21209 - 21278 23.0 12 4 Op 1 . - CDS 21333 - 22517 1190 ## BT_3852 major outer membrane protein OmpA - Prom 22557 - 22616 3.0 13 4 Op 2 . - CDS 22620 - 24542 1464 ## COG0323 DNA mismatch repair enzyme (predicted ATPase) 14 4 Op 3 . - CDS 24557 - 24859 293 ## BF4070 hypothetical protein 15 4 Op 4 . - CDS 24861 - 26591 1544 ## BT_3849 hypothetical protein 16 4 Op 5 . - CDS 26614 - 27996 1585 ## COG0760 Parvulin-like peptidyl-prolyl isomerase 17 4 Op 6 . - CDS 28006 - 28851 691 ## BT_3847 hypothetical protein 18 4 Op 7 . - CDS 28906 - 30447 1103 ## BT_3846 peptidyl-prolyl cis-trans isomerase - Prom 30467 - 30526 1.8 - Term 30463 - 30520 13.6 19 5 Tu 1 . - CDS 30545 - 32023 1616 ## COG0516 IMP dehydrogenase/GMP reductase - Prom 32189 - 32248 5.3 + Prom 32047 - 32106 4.7 20 6 Tu 1 . 
+ CDS 32199 - 33630 642 ## PG0838 integrase Predicted protein(s) >gi|226332210|gb|ACIC01000110.1| GENE 1 24 - 2399 1925 791 aa, chain - ## HITS:1 COG:RSc0786 KEGG:ns NR:ns ## COG: RSc0786 COG5373 # Protein_GI_number: 17545505 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Ralstonia solanacearum # 134 477 147 485 938 124 27.0 9e-28 MGELYGTCALLAAVLFFVLLQKFESRFGKVDKELGEIKKRMDDLLKAQEKMASGPVRAKE DVTAETLDNTDILPYVTEELPDSTVTALKEEKAVEEKAVEGIQPIMAETTPQKLNASLEK ESAEATLQETSPEAPLEIPSVSKRKQVNYEKFIGENLFGKIGILVFVIGVGFFVKYAIDK DWINETLRTVLGFLTGSALLVVAERLQKKYRTFSSLLAGGAFAVFYLTVAIAFHFYHLFP QTVAFIILIATTLFMSILSILYDRRELAIISLVGGFLAPFLVSTGNGNYLVLFTYMSILN LGMFGLSIYMKWGELPVIAFVFTYVVMGIFLLTGFTTGSTHISVHLFIFATLFYFIFLLP ILSILRIEAVKKNRGLLLVIITNNFIYLLLGILFLRNMGLPFKSEGLLSLLIAIINLVLV IWLRMSKKDYKFLIYAMLGLVLTFVSITIPIQLDGNYITLFWAAEMVLLLWLYVKSKIGV YERATQVLMGLTLVSYLMDIYNVLMTSSSSETIFLNSSFATSLFVGLATGAFALLMGRYR SLFTEARYLRYTPWNSIMLLAAAAILYYTFMAEFALHLAGATRSGMMLAFTSAAIFILSY TFKKRFPIKQYTIPYLTAMGMNVLIYVINIWGDQWVYTSLTPALLRWFAAAFVIANLYYV ARQYYTLIGLKTPFTVYLNVLALFLWLTMARSFLLQAGVEDFDAGFSVSLSIAGFIQMAL GMRLHQKVLRIISLSTFGIVLLKLILKDLWAMPTIGKIIVFIILGLILLILSFLYQKLKD VLFKNDEDETD >gi|226332210|gb|ACIC01000110.1| GENE 2 2480 - 3628 713 382 aa, chain - ## HITS:1 COG:no KEGG:BT_3862 NR:ns ## KEGG: BT_3862 # Name: not_defined # Def: endo-alpha-mannosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 382 1 382 382 771 99.0 0 MRKELVFVLLALFLCAGCNGNKKKMNGEHDLDAANITLDDHTISFYYNWYGNPSVDGEMK HWMHPIALAPGHSGDVGAISGLNDDIACNFYPELGTYSSNDPEIIRKHIRMHIKANVGVL SVTWWGESDYGNQSVSLLLDEAAKVGAKVCFHIEPFNGRSPQTVRENIQYIVDTYGDHPA FYRTHGKPLFFIYDSYLIKPAEWAKLFAAGGEISVRNTKYDGLFIGLTLKESELPDIETA CMDGFYTYFAATGFTNASTPANWKSMQQWAKAHNKLFIPSVGPGYIDTRIRPWNGSTTRD RENGKYYDDMYKAAIESGASYISITSFNEWHEGTQIEPAVSKKCAAFEYLDYKPLADDYY LIRTAYWVDEFRKARSASEDVQ >gi|226332210|gb|ACIC01000110.1| GENE 3 3703 - 4845 937 380 aa, chain - ## HITS:1 COG:no KEGG:BT_3861 NR:ns ## KEGG: BT_3861 # Name: not_defined # Def: hypothetical protein # 
Organism: B.thetaiotaomicron # Pathway: not_defined # 1 380 1 380 380 769 100.0 0 MKNNYLYNWVVIVFMMLQMLAFTSCSDDDDNSGWPTETDKTEIFIERFSTLVTNLTALRD GATYGELKDNYPVSSKAMLDDEIVYLEETIAKLKEGNKKLADSEADRIIREANQIEKNFR ATRRTEDFLPVAAELLVNGKNGGYIDFGVHPEYSAFGEQGQQAFTVEFWVKLTDVDEYLN SFVFLLSTFTDDDTKDHERKGWAVNSHFGRLRMTYGIGYSDLFEPGFSFSTLNQWVHVAV VTNENGVDGEVRDGIPVMTKIYVNGQLMLSERGRDDRLPYTPNDKEVAMVAFTGLSATAN RIGEKSTNGCMRHLHIWKSAKTQAEIQHLMDTPESVTGSESDLVCGWTLNKTVSDNNNIK DLTGKFSARLIGDFQWVENR >gi|226332210|gb|ACIC01000110.1| GENE 4 4872 - 6002 1088 376 aa, chain - ## HITS:1 COG:no KEGG:BT_3860 NR:ns ## KEGG: BT_3860 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 376 1 376 376 764 100.0 0 MKNRYSIRLLVIVFLSAFAMGICYSCSDDDGEGVKVYEYREQIGKKIKVLTDLQAESQFG LREGMYPETSRAILEDAIAKLKDFLQTIKEAGVAEARIPEETARLLKESDEKIAEFKATV RTEDLLVPAELEVSGKNGGYIDFGAHPEYSTFGEVGKQAFTVEFWVKLKDVEGFFYLIST FTDDDTNNHERKGWCVNSYNYGGNNMRMTYGMGFNDLMEPAFGFATVNEWVHLAVVTDET GVDGEMNGGRPVMTKMYMNGELKLSTTSHQDASKPYASNDSNVPMVAFGGMSATGNRIGD KGANGSMKHLHIWRTAKTQDEIRRLMEHPENVTGEEDDLVCGWTFAKMALDDQEIKDLTG KYTAKLVGDYRWIELE >gi|226332210|gb|ACIC01000110.1| GENE 5 6036 - 8012 1602 658 aa, chain - ## HITS:1 COG:no KEGG:BT_3859 NR:ns ## KEGG: BT_3859 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 658 1 658 658 1355 100.0 0 MKKMIVLTSLWSFLFIFSFSMFGCSEDDDEFGTTAERGNEVRTPRVEEINGKAVVTWIDP YITDIKEVQVKDLQTNEQQTVAKGVQSAEFAITDNSLLSYRYEMKVVRTTGEMSAGVTAR LVKNWAQKLHPLMDYHSDATPQSGMFFKNQPVAKVNVFDIRDDENISKLTTAVMQGVINQ EQALTYLIWLQQDLTQLDDAEVQYEMQPLANTSRNRGFAALYNMYKDRFNCLVVWDENQP WSWSMAQMISSQEKGIPVTESMRKFIEDELGIGNLEVRDIRNQWSSKAEAYGWAIAHYAD KCHPKLTFSGGLRSDYKDNPWRMYDYVAASKGFVFWLDDSNGDDKQIMDNIFNSGSYPVG SSVFGYGMNANGDELNKITNIHNAGFVVSDYYANGSYWCSFPSKAFQQRKGIAGEVKPGK IYVAISLSDGDNIQFDANSLYQIFKEGKRRGEVPVGVTLAAGLQELNPKLLEFYYKNMTP NDELTAGPSGFQFIYGDYYAQSGKYAEWLEMNKKWLSTAGFHTAHLWNTDEQMYFEQYMK SKAVDAIMDGSNRTHTTGSSYKLVDGVVRIDQGTMCRNNGDVYRDLMSVSPSPRRPLFRH 
VYLLTNYYGFEGNKVVVYERLIKELERVEQDCPDTYEFMLPMDLAASIRKYIEEGGIY >gi|226332210|gb|ACIC01000110.1| GENE 6 8051 - 10315 1444 754 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 27 752 39 783 790 238 25.0 3e-62 MMNRLNIKRTVGSCLMAMAFFSCTHTDQTPTKDFVDYVNPYIGNISHLLVPTYPTVHLPN SMLRVYPERGDYTSDRVNGLPVVVTSHRGSSAFNLSPVQGEVSRPIVSYSYDLENITPYS YSVYLDEADIQVEYAPSHQAGIYHISFGTEGDNALVVNTKNGKLVAEEKGVSGYQVIDNT PTKIYLYLETSQLPLRKGVLADGKVDMESKEGSAIALYYGSEKNLNLRYGISFISAEQAK KNLQRDITTYDVKAVADAGRRIWNKTLGKIVIEGGSEDEKEIFYTSLYRTYERMINLSED GKYYSAFDGKIHEDGGVPFYTDDWIWDTYRATHPLRILIEPQKELDMIRSYIRMAEQSDR RWMPTFPEVTGDSHRMNGNHAVAVIWDAYCKGLKDFDLEAAYEACKGAITEKTLLPWLRC PLTELDKFYQEKGFFPALNPGEEETCKAVHSFERRQAVAVMLGNCYDNWCLAQIARTLNK TDDYKKFMRMSYTYRNVYNAETGFFHPKNKDGKFIEPFDYRYSGGQGARGYYGENNGWIY RWDVQHNPADLIALMGGQASFIERLNQTFNEPLGRSKFDFYHQLPDHTGNVGQFSMANEP CLHIPYLYNYAGQPWMTQKRIRVLLNQWFRNDLMGVPGDEDGGGMTAFVVFSMMGFYPVT PGSPTYNIGSPVFQSAKMEVGDGHYFEIIAENYAPDHKYIQSATLNGTPWNKPWFSHADI QNGGRLVLQMGDKPNKKWGIASDAVPPSSESLPE >gi|226332210|gb|ACIC01000110.1| GENE 7 10350 - 12257 1480 635 aa, chain - ## HITS:1 COG:no KEGG:BT_3857 NR:ns ## KEGG: BT_3857 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 635 1 635 635 1335 100.0 0 MRKYLQLVFLAVSVFCSEILLAESRDPVRILDLRTLNELDLKEQEKAEQLWDIMHTTATL QGIVNRNSPRLYIRYVKNGQGENVDDYWWNKYRQAGQWLAGRDTIAYTELSDVVTVFRKE IRGVVVYDSKVASTSNIASSVAGIENLIAVRYDISPNSLYTRLVLQGPKLAVKCWLVNKD GSSLFTGKGRIAGTGQPSTGSLKIDPYVWFIEKYLKKGLCNTEYAAYYIDQFWRTDPTRT VTNHHQLTNHDFFVSKKAFFFDLSPWGDEPATDDPTQEEGLDLQILKTFLQEAYKQNKGE KFCYIGGFPSWIYKYTQHAGGKHEDVATEWEFSRIISAYNAFKDADAIGLGALANSSFWQ HFPLQEKYPQKWVTHQELMDRGYLNRDGTINFQGRNFILFYVGDYDSSSWIAQTTPFLWD EPSRGEVPLMWSVSPVLAERVPMVMHNYRVTATPNDYFAAADNGAGYLMPGMLQEPRSVS GLKSGLSAWAKHCSKYYQKWGLTITGFVIDGEAPGLDSDGLDCYASFSPNGIVPQKMPLT 
LLHNDMPVIRADYDIVDHDYRRATDVIVERVEKRPVPFHWFRAILKSPSWYKGICDELKQ RHTNIELLDAPTFFELYRIYLKQHPDAAAGKITMN >gi|226332210|gb|ACIC01000110.1| GENE 8 12276 - 12971 606 231 aa, chain - ## HITS:1 COG:no KEGG:BT_3856 NR:ns ## KEGG: BT_3856 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 231 1 231 231 463 100.0 1e-129 MKKLLFLLCVVVAGSMFYSCEKDNLEGPDAIFAGELRDRKTGELIQQEISDGSRVYFIEQ GWGDNPPVQNMVVKKDGTFYNGMIFSGDYKIILNRGNYVPLDTLDMKIKPGKNYQVFEVN PYLRIIEPEISVIGRKVVAKFKLEQVTSNEVYRISLFAHSHIDVSNKLNIVNRTEELNRS ITDGEAFELSINLEDYTSTLVPGKSYYFRIGAQSHGSETKYNYSASVQLDI >gi|226332210|gb|ACIC01000110.1| GENE 9 13000 - 14931 1692 643 aa, chain - ## HITS:1 COG:no KEGG:BT_3855 NR:ns ## KEGG: BT_3855 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 643 1 643 643 1330 100.0 0 MKIYSYLLLGAMFLLGGCSDLLDIDPKNKIPADELFSTPEGVQAHMANLYGRLPIEDFTY SPNRGFNVGVGTDVNNAGFMAAHFCDEAIHPEYNDFGEEWFDYWEDGYKLIRDLNSLLVT IPTLTSITEQQKNEINAETHFLRAYTYFALAKRYGGVPIIKEPQEYNGNIEELRVPRNTE KDTWDFVLEECDQAVSLFGDANENDVLRANKWVALALKSRAALYAASVAKFTHQPYVSFS GPAVDQKLVGIEVISADHYYDECISASQEIMNSGKFGLYKPSPATPEEATTNYQKLFEQP FQCLDGLKEPIFMKAYAANTILAHNYDVWFSPRQMILDPNLYPGRMNPTLDFVDSFEDYT DDGTGTPKPISTRVDGNESDYNGFNLSTRYLSFPIDKPYQAFAGRDARLSAMVLFPGQNF GSTKIIIQGGLVKADGSGYHYRTQASEKGQDGLIYYTYGAEKSTEYSGFDPTLGHYTRSG FLFKKFLQIESPVEQAWSKGTQPWIDFRYGEILLNYAEAVIEKTTSTSAEKQAAQDALNA VRKRAAHKDDIALTQANVRKERFVELAFENKRRWDLSRWRTFHKEFENRVRKGLVPFLDL RTNPAHYVFVRVNPLGIESKTFDYSWYYKSIPGTGANGLVQNP >gi|226332210|gb|ACIC01000110.1| GENE 10 14943 - 18074 2824 1043 aa, chain - ## HITS:1 COG:no KEGG:BT_3854 NR:ns ## KEGG: BT_3854 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 32 1043 1 1012 1012 1973 99.0 0 MKTGKSKDHLQRVFLRFLYLIMGSILSLDSVMAQENLKISGQVTDNKGEAIIGASVKVLK TGTGTISDMDGKFTIQVPVGAELEIGYVGYNPKRVKVVNKNFVTVVLEENVVALGDVVVV GYGIQKKESLTGAIGNLKVDDIVKTKAPSLAQAIQGKVAGLRIRQENGEPGKFSSNINVR 
GFGTPLFVIDGVVRDGSSEFQRLNPEDIESISFLKDATASIYGMNSANGAVIVTTKKGAT GKPRITLNANVGITSPTNVPEMANARQYMTMRNEAEINAGRPAYITKDELTKWQQGAPGY ESVDLYDAVFNKHATQFQTTLSLEGGTDKVSYYGSFGYATDNSLLKNNALTYDKYTFRSN VSLKITKDLTASINLGGRYDTTNRPWFPFYDIFKSTRVNPPTTSIYANDNPDYYNNFSYV PNPAAMIDADYTGSAKERNKNLQTQFALEYNIPYVKGLKVKGTFIYDYNNYGYKATRKGF KTYTYGEYTGEYTAADANYPALLQDNRRESERVDMQFQTNYNRTFGNHTIGATYVFERRE EKANWMNGERKFDFFTIGELDNARESDQKVSGSSEHQAYLSHIGRLTYDYKGKYLAELAC RYDGSYRYAPGSRWAFFPSASVGWRISEESFIKDNFKFVDNLKLRFSAGRSGQDAGDPFQ YFSGYTLNSGGYVFSQGNYTNGVASPVMINKNLTWIKVNMYNIGIDFSIFNRLIAVEFDI YQRDRSGLLADRYGSLPNTFGSKLPQENLNGDRTRGIEFTLTHTNKIGDFHYSVSGNFNL ARTQRRYIESGPYKSSMERWRNQSSNRWGDFIWGYQTDGRFQNFDEINTYPIQNGDNGNS KELPGDYILKDVNGDGVVNDLDKTPLFWSGSPLIHYGFNVEASWKNFDFYALFQGSALYT VQFDEVYAKMLCFKGGNTPEYFYDRWHLSDPYDANSEWIPGEWPAIRLEQDMGSFYTRDS QIWRKNASYLRLKTIEIGYTFSPRLMHKLGIGSLRIYANGNNLFTVCDPFVKAFDPEKIE GDYSAGLNYPLNKSFNFGLTLNF >gi|226332210|gb|ACIC01000110.1| GENE 11 18543 - 21116 1603 857 aa, chain - ## HITS:1 COG:no KEGG:BT_3853 NR:ns ## KEGG: BT_3853 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 857 1 857 857 1743 100.0 0 MIVTMKCRYLLSLVFLLHIWVCKSNVIDNSVYDYGLTFLAHSTNQDQRTNLDLTPAASLS FPEDGFSVGFDIKLRNELYTYGYVVRVIADDSSCFDFISYLLYSRFNIVLTDKDRVIKNT EIADSVKIVADRWIHVDLQFAKDRIHIAADGIQAEINHSLSNFKDIKIYFGGSKHPRFFS TDVPPMTIRNIELADIQGKLLYKWELAAHDKDVTYDSVRNKQAFVRNGVWEIDKHTKWAA LASLNVHHINPQVAYDDVSGRFFIAGGGQLFVYDVKANRIDSIAYKGHPYIGASSQIIFD AKRNRLLSYTPDFNDLNVYEFDRKCWTLETPVMIDTRQHHNRIINQKRDELIVFGGYGNH RYNSQLSRINLSDPQGWSISSLDSCLFPRYLSAMGAENEDYLLIMGGYGNQSGKQEESPG NFYDLYRLNLKTGKCTKLWEFVNDRQHFTFGNSMIIDTPSNSVYALTYNNDRYNTFVYLS RFDIQTRQPVQEVMSDSIVYNFLDIHSYCDMFLHKETSSIYAVVLQEKEPGISKVEFYKL AFPPLSKEDILPHQTGGMKPVILISGILAGLLCLIGGSIWLLHSKRKRKVNVAVGPVATE EVKDRSVEEEPTEQKVSSVLLLGGFQVFDKQGGNITGDFTPTLKQLFLFLLLNTIKNGKG TTSQCLDETFWFDMSKSSASNNRNVNIRKLRLIIEKIGDINIANKNGYWYLNLGKDVTCD YQEVMRLLDQIKDTITDKKIINKIISLASAGALLPNVSAEWIDEYKSAYYVLLTEILLSV VNRPDIKEDSRLLLKISDVILLVDNIDEDAIRTKCRVLYQMGQKGLSKQSFDKFCIEYER 
LLNAKPDFSYDDIINSL >gi|226332210|gb|ACIC01000110.1| GENE 12 21333 - 22517 1190 394 aa, chain - ## HITS:1 COG:no KEGG:BT_3852 NR:ns ## KEGG: BT_3852 # Name: not_defined # Def: major outer membrane protein OmpA # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 394 1 394 394 750 100.0 0 MKKILMLLAFAGVASVASAQQTVTVTEYEVIQVQDKHQVITNPFWSNWFFSVGGGAQVLF GNNDHIGKFRDRIAPTLNVSVGKWVTPGFGLRMQYSGLQSKGFTTNESANYVVGGPREDG SYKQRWDYMNLHGDLLINLNALFGGYNPDRVYEIIPYIGAGWAHSYSKPHTNAATFNAGI INRFRLSNAVDLNLELSATGLEGKFDGEHGGKPDYDGILGATLGVTYYFPTRGFQRPVPQ IISELELRQMRDQMNAMAAANMQLQQQLANAQQPVVEEAEEVVVTDANIAPRTVFFKIGS DKLSPQEEMNLSYLASKMKEFPNMTYTINGYADSATGTPSINQKLSLERAQAVKDLLVKK YGISADRLSVAAGGGVDKFGQPILNRVVLVESAQ >gi|226332210|gb|ACIC01000110.1| GENE 13 22620 - 24542 1464 640 aa, chain - ## HITS:1 COG:SPy2121 KEGG:ns NR:ns ## COG: SPy2121 COG0323 # Protein_GI_number: 15675871 # Func_class: L Replication, recombination and repair # Function: DNA mismatch repair enzyme (predicted ATPase) # Organism: Streptococcus pyogenes M1 GAS # 1 640 1 645 660 300 31.0 7e-81 MSDIIHLLPDSVANQIAAGEVIQRPASVIKELVENAIDADAQNIHVLVTDAGKTCIQIID DGKGMSETDARLSFERHATSKIREAADLFALRTMGFRGEALASIAAVAQVELKTRLESEE LGTKLVIAGSKVESQEAVSCSKGSNFSVKNLFFNVPARRKFLKANSTELSNILAEFERIA LVHPEVAFSLYSNDSELFNLPVSPLRQRILAIFGKKLNQQLLNIEVNTTMVKISGYIAKP ETARKKGAHQYFFVNGRYMRHPYFHKAVMEAYEQLIPVGEQVSYFIYFEVDPANIDVNIH PTKTEIKFENEQAIWQILSASVKESLGKFSAIPSIDFDTEDMPDIPAFEQNLPPAPPKIH YNSDFNPFKVSSGGSGGGSYSRPKVEWEGLYGGLTKASKMNEPQQEPEMDWENSPFEEEP MVAEEKTISAVSAASSTFSSASSTLYANESVIEKGNLHLQFKGRFILTSVKSGLMLIDQH RAHIRVLFDRYMVQIKQKQGVSQGVLFPEILQLPASEAAVLQGIMDDLSAVGFDLSDLGG GSYAINGIPSGIEGLNPVELVRNMLHTAMEKGNDVKEEIQNILALTLARAAAIVYGQVLS NEEMVSLVDSLFACPSPNYTPDGKTVLTTIKEEDIEKLFK >gi|226332210|gb|ACIC01000110.1| GENE 14 24557 - 24859 293 100 aa, chain - ## HITS:1 COG:no KEGG:BF4070 NR:ns ## KEGG: BF4070 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 100 1 98 98 159 79.0 4e-38 
MGMFFNSMRKPRGFNHQYIYVNERKEKLEKMEEKAKRELGMLPDKEFSPEDIRGKFVEGT THLKRRKASGRKPVSFGIILIIIAFLLYLWHYLATGSWSF >gi|226332210|gb|ACIC01000110.1| GENE 15 24861 - 26591 1544 576 aa, chain - ## HITS:1 COG:no KEGG:BT_3849 NR:ns ## KEGG: BT_3849 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 576 1 576 576 1149 100.0 0 MLKQNKKDKQQDGHRWLLIGFLCLFAVCLVAQVKPTGQTKKAETPAAKTPNGTKQTDKDI KKEPENKKTKVYLLHANQGQADKLARPDVQVLIGNVKLRHDSMYMFCDSALIYEKTNSVE AFSNVRMEQGDTLFIYGDYLYYDGMTQIAQIRENVKMINRNTTLLTDSLNYDRLYDLGYY FEGGTLMDEENVLTSDWGEYSPATKQSVFNHDVKLVNPKFVLTSDTLKYNTFSKIATILG PSNIVSDNNHIYSERGFYNTLSEQAELLDRSILTNEGKKLIGDSLFYDRKVGYGEAFDNI RMTDTINKNMLTGDYCFYNELADSAFATKRAVAIDYSQGDSLFMHGDTLQLISYNLNTDS VFRLMKAYHKVRMYRTDVQGVCDSLVYNSKDSCLTMYTDPILWNEGQQLLGEEIKIYMND STINWAHIINQALTVEMKDSVHYNQVSGKEMKAYFENGDMRHIEVIGNVMTAFYPEEKDS TMTGFNNMEGSVLHLYMKEKKMEKGMFVGKSNGTLYPMDQIPPDKLRLSTFAWFDYVRPL NKEDIFNWRGKKEGETLKPTTDRKPKTDKRSLINMK >gi|226332210|gb|ACIC01000110.1| GENE 16 26614 - 27996 1585 460 aa, chain - ## HITS:1 COG:RSc1715 KEGG:ns NR:ns ## COG: RSc1715 COG0760 # Protein_GI_number: 17546434 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Parvulin-like peptidyl-prolyl isomerase # Organism: Ralstonia solanacearum # 146 301 242 404 648 73 30.0 6e-13 MKKFVNFRFIVTLVLAVFANVVTYAQDNVIDEVVWVVGDEAILKSDVEEARMDALYNGRR FEGDPYCVIPEEIAVQKLFLHQAKLDSIEVSEAEIIQRVDMMTNMYIQQIGSKEKMEEYF NKTATQIRETLRENARDGLTVQKMQQKLVGDIKVTPAEVRRYFKDLPQDSIPYIPTQVEV QIITLQPKIPIAEIEDVKRRLRDYTDRVTKGEIDFSTLARLYSEDKASAIKGGELDFMGR GMLDPAYANVAFSLQDPKKVSKIVESEFGYHIIQLIEKRGDRVNTRHILLRPKVSEKELT EACARLDSIGDDIRQNKFTFDEAASVISQDKDTRNNHGLMVNTNERTGITTSKFQMQDLP QDVAKVVDKMNVGEISRAFTMINEKDGKEVCAIVKLKAKINGHKATIAEDYQDLKEIVMD KRREEMLQKWILNKQKHTYVRIDPNWQKCDFKYPGWVRND >gi|226332210|gb|ACIC01000110.1| GENE 17 28006 - 28851 691 281 aa, chain - ## HITS:1 COG:no KEGG:BT_3847 NR:ns ## KEGG: BT_3847 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # 
Pathway: not_defined # 13 281 13 281 281 498 100.0 1e-140 MRILVLLLITLLCCGACKEQYDHKGKTALVEVDGNFLYKEDLMSVLPVGLSKDDSILFAE HYIRSWAEEILLYEKAANNIPDNVDVDKLVENYRKALIMHTYQQELINQKLTNDISEQEI ADYYEKNKELFKLESPLIKGLFIKVPLTAPQLNNVRRWYKTEKQDAVESLEKYSLQNAVK YEYFYNKWVPVTDVLDLIPLKEASPEQYVDKHRHVELKDTAFYYFLNVSDYRGAGEEKPY EFARSEVKDLLVNQKRVNFMEQVKNDLYQQAVNKKKIIYNY >gi|226332210|gb|ACIC01000110.1| GENE 18 28906 - 30447 1103 513 aa, chain - ## HITS:1 COG:no KEGG:BT_3846 NR:ns ## KEGG: BT_3846 # Name: not_defined # Def: peptidyl-prolyl cis-trans isomerase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 513 2 514 514 974 100.0 0 MKRNLLLGWISLFGVLAFAQEDPVVMRINGKDIPRSEFEYSYRRHADGNGMKLSPKEYAE FFIQSKLKVEAARAAGLDTTSAFRKQQEAYRTNLLRSYLLDDQEMDGNARILYQKMKENV RGGQVQIQQIYKYLPQTITSRHLQEEQARMDSIYQVIQNQPGVDFARLVDRFSDDKRCRW IESLQTTSEFEEAAFSLAKGEISKPFFTPEGIHILKVIDRKEVPAYEVVSDSLLNRLRRQ PLDKGTEAIVEQLKKEYQYIPNMESLEEVLQKGGTERTLFTIDGQAYTGEMFKRFAASHP QAVKRQLNGFIAKSLLDYESQHMERKHPELRYTLQEYAEKCLAEEIVHQKVDLPAVNDRV GLATYFKFHSSDYRWEHPRYKGVVLHCADKKTAKQAKKLLKKVPENEWEDMLRKTFNTSG AEIIKIEQGVFADGDNKYIDKLVFKKGAFDPVVSYPFTIAVGKKQKGPEDYREVIDQVRK DYRNYLNAYWERELRESGKVEINQEVLKTVNNN >gi|226332210|gb|ACIC01000110.1| GENE 19 30545 - 32023 1616 492 aa, chain - ## HITS:1 COG:L21264_3 KEGG:ns NR:ns ## COG: L21264_3 COG0516 # Protein_GI_number: 15672202 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Lactococcus lactis # 212 489 5 283 285 343 64.0 4e-94 MSFIADKIVMDGLTYDDVLLIPAYSEVLPRTVDLSTKFSRNIELKIPFVTAAMDTVTEAK MAIAIAREGGIGVIHKNMSIEEQARQVAIVKRAENGMIYDPVTIKRGSTVSDALGIMAEY KIGGIPVVDDEGYLVGIVTNRDLRFERDMTKHIDLVMTPKERLVTTNQSTDLESAAQILQ KHKIEKLPIVGMDGKLIGLVTYKDITKAKDKPMACKDAKGRLRVAAGVGVTADTLDRMQA LVDAGADAIVIDTAHGHSMYVIEKLKEAKKRFPNIDIVVGNIATGEAAKALAEAGADAVK VGIGPGSICTTRVVAGVGVPQLSAVYDVAKALKGTGVPLIADGGLRYSGDVVKALAAGGY CVMIGSLVAGTEESPGDTIIFNGRKFKSYRGMGSLEAMENGSKDRYFQSGTADVKKLVPE GIAARVPYKGTLYEVVYQLSGGLRAGMGYCGAANIEKLHDAKFTRITNAGVMESHPHDVT ITSESPNYSRPE 
>gi|226332210|gb|ACIC01000110.1| GENE 20 32199 - 33630 642 477 aa, chain + ## HITS:1 COG:no KEGG:PG0838 NR:ns ## KEGG: PG0838 # Name: not_defined # Def: integrase # Organism: P.gingivalis # Pathway: not_defined # 230 431 224 427 432 79 33.0 4e-13 MERQIFINEMQARFNLRKPRSEKPTNLYLVCRINNKQVKLSTGVKIYPDHWNEKRQEAYI SVRLSEIDNINNTIVNKKITKLKEYFIEFKHYLCMHPDEIGESMKLLKQHIYKDRMKKEL QKPATFIMKQIIEAKTCAESSKKQYRSNIDKFERFLKENEIPNTWESMNLDTINRYQKQI IKENPLHPHNTLRNIIKGTIFNLLGIADKRLDIPFKWSDSNLNSFEFVKDKSNKELADNK KVSLTEEQLNKFYKHIITGTERQIKKYTEIRDLFILQCLVGQRIGDMQKFFNGDNEMDEE AGTISIIQQKTKARAIIPLLPLAKEIISKYENKELLYYKERKSIVNEALKEVAEQAGLDE PITYEENGIKQTQPLYKLLHTHTARHTFITILCRKGIPKETVIIATGHEDTKMIDKVYSH LNSKDKAKKVSNAFKSLNNGIFNMGKVETNSLNEVKPTNDATNNITFDTLLDTQFFA Prediction of potential genes in microbial genomes Time: Thu May 12 02:06:30 2011 Seq name: gi|226332209|gb|ACIC01000111.1| Bacteroides sp. 1_1_6 cont1.111, whole genome shotgun sequence Length of sequence - 392 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 387 104 ## BVU_1734 hypothetical protein Predicted protein(s) >gi|226332209|gb|ACIC01000111.1| GENE 1 3 - 387 104 128 aa, chain - ## HITS:1 COG:no KEGG:BVU_1734 NR:ns ## KEGG: BVU_1734 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 34 123 115 208 950 68 42.0 6e-11 MKNKHEQYPDTLLVIGGVDTKEIEAYLQTAELPKILVSYDSVYKLIGCIKYKSDWRVVVD EFQCLLADSSFKSEIELHFLDNSRSFPYVTFLSATPILDKYLEQIDHFKDMNYYQLDWEE KDIVRVYR Prediction of potential genes in microbial genomes Time: Thu May 12 02:06:33 2011 Seq name: gi|226332208|gb|ACIC01000112.1| Bacteroides sp. 1_1_6 cont1.112, whole genome shotgun sequence Length of sequence - 1675 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 59 - 403 149 ## BF0112 lysozyme + Term 425 - 473 13.0 - Term 413 - 460 13.6 2 2 Tu 1 . - CDS 467 - 688 174 ## BF1096 hypothetical protein - Prom 708 - 767 1.7 - Term 769 - 805 4.4 3 3 Op 1 . - CDS 806 - 1114 325 ## BF0110 hypothetical protein 4 3 Op 2 . - CDS 1160 - 1405 240 ## BF0109 hypothetical protein 5 3 Op 3 . - CDS 1402 - 1635 259 ## BT_2286 hypothetical protein Predicted protein(s) >gi|226332208|gb|ACIC01000112.1| GENE 1 59 - 403 149 114 aa, chain + ## HITS:1 COG:no KEGG:BF0112 NR:ns ## KEGG: BF0112 # Name: not_defined # Def: lysozyme # Organism: B.fragilis # Pathway: not_defined # 1 114 61 174 174 192 85.0 4e-48 MGWGHQVQPGERYSARTMTKRQADALLRKDLRKFCAMFRKFGRDSLLLATLAYNVGPYRL LGSGKIPKSTLIRKLEAGDRNIYREYIAFCNYKGKRHAMLLKRRKAEFALLYIP >gi|226332208|gb|ACIC01000112.1| GENE 2 467 - 688 174 73 aa, chain - ## HITS:1 COG:no KEGG:BF1096 NR:ns ## KEGG: BF1096 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 73 2 74 74 72 43.0 3e-12 MEEIRQNGKIILHSDDGISIKMIFKNLTGKNFQGQKYAEYIRHIAIGEMGFSPGIIEHCR DGEVIGKGKIPNV >gi|226332208|gb|ACIC01000112.1| GENE 3 806 - 1114 325 102 aa, chain - ## HITS:1 COG:no KEGG:BF0110 NR:ns ## KEGG: BF0110 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 102 38 139 139 177 90.0 1e-43 MIAKTILQQIGGRRFVAMTGSKDFTDMGNGLRMSLARNKTSANRLDIIYDAGLDLYNMRF YRKTFSKKTFECKTKDIETHEGIYCDMLEEMFTMVTGLYTRF >gi|226332208|gb|ACIC01000112.1| GENE 4 1160 - 1405 240 81 aa, chain - ## HITS:1 COG:no KEGG:BF0109 NR:ns ## KEGG: BF0109 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 80 28 107 108 130 76.0 1e-29 MKAAYQTLIVKFSQPIKVLDGIFDDAEAWGVDTLKGWIDDYESSRFTAINSHTAVITSEY NMECLMEWLKRNTPIAEITEC >gi|226332208|gb|ACIC01000112.1| GENE 5 1402 - 1635 259 77 aa, chain - ## HITS:1 COG:no KEGG:BT_2286 NR:ns ## KEGG: BT_2286 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 74 1 
72 72 77 56.0 2e-13 MATRMTINGISTCTEAGTEKYERFQLGIGRRKRTLVQYDYRHPTDGELFSCVKPTLDECR AARDKWLTEKEEKEDNR Prediction of potential genes in microbial genomes Time: Thu May 12 02:06:43 2011 Seq name: gi|226332207|gb|ACIC01000113.1| Bacteroides sp. 1_1_6 cont1.113, whole genome shotgun sequence Length of sequence - 2633 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 250 154 ## BF0108 hypothetical protein 2 1 Op 2 . - CDS 275 - 1564 638 ## BF0107 hypothetical protein 3 1 Op 3 . - CDS 1567 - 1989 312 ## BF0106 hypothetical protein 4 1 Op 4 . - CDS 2003 - 2224 183 ## BF0105 hypothetical protein Predicted protein(s) >gi|226332207|gb|ACIC01000113.1| GENE 1 1 - 250 154 83 aa, chain - ## HITS:1 COG:no KEGG:BF0108 NR:ns ## KEGG: BF0108 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 83 1 83 85 144 87.0 1e-33 MEVRIESMICLWDDKIPTMFLEFVNLLTLATSEEQLRRSVKDFAEKHELDRFFLYGFGSH HFYLHQRYTSDPEMVMKNRILSV >gi|226332207|gb|ACIC01000113.1| GENE 2 275 - 1564 638 429 aa, chain - ## HITS:1 COG:no KEGG:BF0107 NR:ns ## KEGG: BF0107 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 429 1 443 443 374 43.0 1e-102 MKPRTPIQQEVARLSERLPKLTATQRAYAFRHCFKHYAIKRADGTNICTECGHSWRSEHD LADTVCGCTCPHCGMKLEALRTRKSVFSENEYFCIITTCKQYQVIRFFFVKSRYKAGQAA EYSIYEVVQRWISPKGTTVTVARLRGMSILYYDLWAEYSDMEVRKNNKLRAYDINPVCTY PRQRFIPELKRNGFNGEYHNILPYDLFTAILSDSRAETLLKAGQYPMLRHYIRSSFDIER YWASIKICIRNGYTISDGSMWRDTIDLLRHFGKDTNSPKYVCPSDLKAEHDRLMHKRNKE IERKKLEERIRQAKKHEKAYRKLKGIFFGIAFTDGTLQVRVLESVAEFAAEGTELHHCVF SNSYFLEKNSLILSATIDGKRIETVEVSLKTLEVVQSRGLHNSNTEYHDRIVNLVNSNVN LIRQRMEAA >gi|226332207|gb|ACIC01000113.1| GENE 3 1567 - 1989 312 140 aa, chain - ## HITS:1 COG:no KEGG:BF0106 NR:ns ## KEGG: BF0106 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 140 1 
136 138 202 77.0 4e-51 MKGTEHFKRTIQMYLEQRAAEDALFAKNYRNPAKNIDDCVTYILNYVQRSGCNGFTDGEI FGQAVHYYDENEIEVGKPIQCHVAVNHVVELTAEEKAEARQNAVRRYQEEELRKLQNRSK PRTATKATAQEVQQPNLFNF >gi|226332207|gb|ACIC01000113.1| GENE 4 2003 - 2224 183 73 aa, chain - ## HITS:1 COG:no KEGG:BF0105 NR:ns ## KEGG: BF0105 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 73 59 131 131 134 87.0 1e-30 MAKRSSKTVAQQCRYYEVDNIFVYMVETYINGNISVFRELYRELNKDARRDFTDFLLSEV EPTYWREILKQTI Prediction of potential genes in microbial genomes Time: Thu May 12 02:07:04 2011 Seq name: gi|226332206|gb|ACIC01000114.1| Bacteroides sp. 1_1_6 cont1.114, whole genome shotgun sequence Length of sequence - 45368 bp Number of predicted genes - 35, with homology - 33 Number of transcription units - 25, operones - 5 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 492 -26 ## BF0104 hypothetical protein + Prom 878 - 937 3.1 2 2 Tu 1 . + CDS 987 - 1544 259 ## BT_1907 putative RNA polymerase sigma factor RpoS + Term 1551 - 1597 3.0 - Term 1538 - 1584 6.8 3 3 Tu 1 . - CDS 1596 - 2465 845 ## BT_1908 hypothetical protein - Prom 2487 - 2546 7.4 + Prom 2344 - 2403 2.1 4 4 Tu 1 . + CDS 2467 - 2592 81 ## + Prom 2759 - 2818 5.3 5 5 Tu 1 . + CDS 2996 - 3934 755 ## BT_1910 hypothetical protein + Prom 4108 - 4167 4.2 6 6 Tu 1 . + CDS 4318 - 5097 246 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 + Term 5313 - 5347 1.6 - Term 5656 - 5709 8.1 7 7 Tu 1 . - CDS 5717 - 5860 78 ## BT_1912 hypothetical protein - Prom 6060 - 6119 6.7 + Prom 5738 - 5797 6.0 8 8 Tu 1 . + CDS 6030 - 7199 934 ## COG1488 Nicotinic acid phosphoribosyltransferase + Term 7242 - 7273 1.8 - Term 7230 - 7259 1.4 9 9 Tu 1 . - CDS 7412 - 7735 374 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 7765 - 7824 5.3 + Prom 8175 - 8234 2.0 10 10 Op 1 . + CDS 8268 - 9779 1467 ## COG0439 Biotin carboxylase 11 10 Op 2 . 
+ CDS 9800 - 10309 562 ## COG1038 Pyruvate carboxylase 12 10 Op 3 . + CDS 10330 - 11865 1757 ## COG4799 Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) + Term 11892 - 11952 4.0 + Prom 11926 - 11985 6.6 13 11 Tu 1 . + CDS 12029 - 13531 1300 ## COG3119 Arylsulfatase A and related enzymes + Term 13776 - 13812 2.0 14 12 Tu 1 . - CDS 13695 - 14729 936 ## COG2855 Predicted membrane protein - Prom 14749 - 14808 2.2 - Term 15089 - 15132 4.5 15 13 Tu 1 . - CDS 15308 - 16348 982 ## BF3546 putative N-acetylmuramoyl-L-alanine amidase - Prom 16442 - 16501 4.7 + Prom 16299 - 16358 6.3 16 14 Tu 1 . + CDS 16470 - 17750 1368 ## COG2873 O-acetylhomoserine sulfhydrylase + Term 17778 - 17824 6.1 + Prom 17805 - 17864 1.7 17 15 Tu 1 . + CDS 17964 - 18731 443 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold - Term 18805 - 18841 1.0 18 16 Tu 1 . - CDS 19011 - 19535 219 ## BT_1925 hypothetical protein - Prom 19556 - 19615 2.4 - Term 19553 - 19599 10.1 19 17 Op 1 . - CDS 19635 - 20501 750 ## BT_1926 hypothetical protein 20 17 Op 2 . - CDS 20528 - 23350 3118 ## BT_1927 hypothetical protein - Prom 23374 - 23433 7.1 21 18 Tu 1 . - CDS 23747 - 23923 67 ## BT_1793 integrase protein - Prom 23963 - 24022 5.4 + Prom 23961 - 24020 7.7 22 19 Op 1 . + CDS 24238 - 25467 1036 ## COG4974 Site-specific recombinase XerD 23 19 Op 2 . + CDS 25487 - 26698 827 ## BT_1929 transposase - Term 26688 - 26748 14.7 24 20 Op 1 . - CDS 26763 - 27056 311 ## BT_1930 hypothetical protein 25 20 Op 2 . - CDS 27273 - 27392 56 ## gi|253570444|ref|ZP_04847852.1| conserved hypothetical protein - Prom 27414 - 27473 4.1 + Prom 27729 - 27788 4.3 26 21 Tu 1 . + CDS 27934 - 31464 2217 ## COG1002 Type II restriction enzyme, methylase subunits + Term 31501 - 31527 -1.0 - Term 31387 - 31419 -0.9 27 22 Tu 1 . - CDS 31431 - 32138 412 ## BT_1932 hypothetical protein - Prom 32194 - 32253 6.8 + Prom 32099 - 32158 7.6 28 23 Tu 1 . 
+ CDS 32329 - 32601 162 ## COG3328 Transposase and inactivated derivatives - Term 32343 - 32379 -0.1 29 24 Op 1 . - CDS 32548 - 34740 1487 ## BT_1934 hypothetical protein 30 24 Op 2 . - CDS 34743 - 36944 1396 ## BT_1935 hypothetical protein 31 24 Op 3 . - CDS 36951 - 39098 1596 ## BT_1936 hypothetical protein 32 24 Op 4 . - CDS 39156 - 40718 1373 ## BT_1937 hypothetical protein 33 24 Op 5 . - CDS 40742 - 42013 960 ## BT_1938 hypothetical protein 34 24 Op 6 . - CDS 42027 - 44888 1588 ## BT_1939 putative outer membrane receptor - Prom 44912 - 44971 9.3 + Prom 44880 - 44939 9.1 35 25 Tu 1 . + CDS 44985 - 45089 104 ## Predicted protein(s) >gi|226332206|gb|ACIC01000114.1| GENE 1 3 - 492 -26 163 aa, chain - ## HITS:1 COG:no KEGG:BF0104 NR:ns ## KEGG: BF0104 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 53 163 20 129 136 84 50.0 1e-15 MMQGPFLLCLCQRGAFTGSQMISFSAVPLSRKRKAVSDRPIHKWNKPSCTAGKPGRMILS VRPDTLGQTYKIILPCRWFTRFHASGYVPFPFWRFPFFTDFSFEVFLSNLPPLPFTSIFA SFSEPHQAVISVLGAKVIPGLDGNARSSLLFSEKISSPAGSIF >gi|226332206|gb|ACIC01000114.1| GENE 2 987 - 1544 259 185 aa, chain + ## HITS:1 COG:no KEGG:BT_1907 NR:ns ## KEGG: BT_1907 # Name: not_defined # Def: putative RNA polymerase sigma factor RpoS # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 185 221 405 405 349 98.0 2e-95 MLEKKFYKRLEKSHEVIYYVEQGITITYDRGKYTPDAIVFLDDGKGFVVEIKPLTEMANQ SVQKKFKALLDFCEETGLGATLTDGRTDINHIFETIPNLAFEESILQSLKEFKKLTYGKV NELKNKYQVTTIHLLQCIIKNNLSYNSMPTLIWKTKKPIICDLLLSPENKMLLKESTDII NNDKT >gi|226332206|gb|ACIC01000114.1| GENE 3 1596 - 2465 845 289 aa, chain - ## HITS:1 COG:no KEGG:BT_1908 NR:ns ## KEGG: BT_1908 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 289 1 289 289 521 99.0 1e-147 MKKKFLLFGALVGALLLSSCSGGSKKQTVSSESTEELDDASKVINYYHMSLAVLRHVANA KDINAVLGYMEQTGKVPEVEPIAPPEIAARDTAELLDPGDYFNPEVRQNLKQNYAGLFNV RTQFYDNFNKFLAYKKLKDTAKTAQLLDENYKLSVELSEYKQVIFDILSPLTEQAESELL 
ADEPLKDQIMAMRKMSGTVQSIMNLYSRKHAMDGVRIDLKMAELEKELKAAEKIPAVTGY DEELKNFQSFLSTVKSFMNDMQKARSKGAYSDKEYQAMSEAYEYGLSVI >gi|226332206|gb|ACIC01000114.1| GENE 4 2467 - 2592 81 41 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFSLLVTFRTITIKNDFCYVQTIGITYTYNGGKIQDAINDD >gi|226332206|gb|ACIC01000114.1| GENE 5 2996 - 3934 755 312 aa, chain + ## HITS:1 COG:no KEGG:BT_1910 NR:ns ## KEGG: BT_1910 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 312 1 312 312 644 99.0 0 MIISASRRTDIPAFYSQWFFNRIKEGYVLVPNPFNPKMISKVSLHPAVVDCIVFWTKNPA PMIDKLDHLQDYKYYFQFTLNPYGEKLENNLPSIDKRIDIFKRLSDKIGKEKVIWRYDPI LTNEEYDVSFHKEAFAQIAYGLKDHTEKCMLGFIDHYQHIRTAVGQFNIQPLRKEEIEEI AVSFRNTINEYPAIQLDTCTSKVDLRHLGIPAGLCIDKELIERITGYPLLAKKDKNQRNV CNCIESIDIGTYESCLNGCIYCYAIKGNYNSVEYNTKKHDRNSPLLIGHPAEDDVVKERE MKSLRNNQYSLF >gi|226332206|gb|ACIC01000114.1| GENE 6 4318 - 5097 246 259 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 7 247 4 238 242 99 30 3e-20 MKRFENKVVVITGAAGGIGEATTRRIVSEGGKVVIADLSQERADKLAAELTQAGADVRPI YFSATELQSCKELVDFAMKEYGQIDVLINNVGGTDPKRDLNIEKLDIDYFDEVFHLNLCC TMYLSQQVIPIMTTHGGGNIVNVASISGLTADANGTLYGASKAGVINLTKYIATQMGKKN IRCNAVAPGLVLTPAALDNLNEEVRNIFLGQCATPYLGEPEDVAATIAFLASNDARYITG QTIVVDGGLTIHNPTVELS >gi|226332206|gb|ACIC01000114.1| GENE 7 5717 - 5860 78 47 aa, chain - ## HITS:1 COG:no KEGG:BT_1912 NR:ns ## KEGG: BT_1912 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 47 21 67 67 79 100.0 3e-14 MNKLLYNQSIIAVNRKTQNTYMIIIQYDTDYYILNHSHGDEKKFSPL >gi|226332206|gb|ACIC01000114.1| GENE 8 6030 - 7199 934 389 aa, chain + ## HITS:1 COG:MA2533 KEGG:ns NR:ns ## COG: MA2533 COG1488 # Protein_GI_number: 20091361 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid phosphoribosyltransferase # Organism: Methanosarcina acetivorans str.C2A # 2 383 1 395 404 282 39.0 1e-75 
MIVRTLLDTDLYKFTTSYAYIKLFPYAMGTFSFNDRNETEYTEEFLEALKSEFNKLSRLR LTEEELEYMTRNCRFLPRVYWEWLSSFRFDPDKIDIHLDTTGRLHIEVSDFLYKVTLYEV PLLAIVSEIKNKFFGNVPDMSEILCKLSEKVELSNQHQLRFSEFGTRRRFSIDVQETVIK RLNDTAKYCTGTSNCYFAMKYGMKMMGTHPHEWFMFHGAQFGYKHANYMALENWVNVYDG DLGIALSDTYTSGIFLSNLSRKQAKLFDGVRCDSGNEFDFIDKLVARYKELGIDATTKTI VFSNALDFTKALEIQEYCKDKIRCSFGIGTNLTNDTGFAPSNIVMKLTQCKMNVNQEWRE CIKLSDDEGKHTGSPEEVQACLYELRLAK >gi|226332206|gb|ACIC01000114.1| GENE 9 7412 - 7735 374 107 aa, chain - ## HITS:1 COG:slr0233 KEGG:ns NR:ns ## COG: slr0233 COG0526 # Protein_GI_number: 16331440 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Synechocystis # 14 91 12 90 105 67 37.0 5e-12 MKEKKIARDTRNREKLAGDDWVMAEFYATWCPHCQRMQPLIKEFKKEMEGIVEVVQVDVD EESDLANFYTIESTPTFILFRKGEQLWRQSGELTLERLGRAVKEFKS >gi|226332206|gb|ACIC01000114.1| GENE 10 8268 - 9779 1467 503 aa, chain + ## HITS:1 COG:MA0675 KEGG:ns NR:ns ## COG: MA0675 COG0439 # Protein_GI_number: 20089560 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxylase # Organism: Methanosarcina acetivorans str.C2A # 1 463 1 464 493 516 55.0 1e-146 MIKKILVANRGEIAIRVMRSCREMEITSIAIFSEADRTAKHVLYADEAYCVGPAASKESY LNIEKIIEVAKECHADAIHPGYGFLSENATFARRCQEENIIFIGPDPETMEAMGDKISAR IKMIEAGVPVVPGTQENLKSVEEAVELCNKIGYPVMLKASMGGGGKGMRLIHSAEEVEEA YTTAKSESLSSFGDDTVYLEKFVEEPHHIEFQILGDKHGNVIHLCERECSVQRRNQKIVE ETPSVFVTPELRKDMGEKAVAAAKAVNYIGAGTIEFLVDKHRNYYFLEMNTRLQVEHPIT EEVVGVDLVKEQIKVADGQVLQLKQKDIQQRGHAIECRICAEDTEMNFMPSPGIIKQITE PNGIGVRIDSYVYEGYEIPIYYDPMIGKLIVWATNREYAIERMRRVLHEYKLTGVKNNIS YLRAIMDTPDFVEGHYDTGFITKNGEHLQQCIMRTSERAENIAMIAAYMDYLMNLEENRG DATDNRPISKWKEFGLHKGVLRI >gi|226332206|gb|ACIC01000114.1| GENE 11 9800 - 10309 562 169 aa, chain + ## HITS:1 COG:AGc4940 KEGG:ns NR:ns ## COG: AGc4940 COG1038 # Protein_GI_number: 15889978 # Func_class: C Energy production and conversion # Function: Pyruvate carboxylase # Organism: Agrobacterium 
tumefaciens strain C58 (Cereon) # 88 164 1095 1171 1174 63 37.0 2e-10 MEIHIGDRVAEIELVSKEDNKVVLTIDGKPFEADVVMAENGTCNILMDGRSANAQLIRRE NGKSYKVNTHYSSFNVEIVDSQAKYLRMRKKGEDEQNDRITSPMPGKVVKIPVTAGQEMR TGDTVIVIEAMKMQSNYKVTSDCRIKEILVQEGDNITGDQTLITLEPII >gi|226332206|gb|ACIC01000114.1| GENE 12 10330 - 11865 1757 511 aa, chain + ## HITS:1 COG:VNG1529G KEGG:ns NR:ns ## COG: VNG1529G COG4799 # Protein_GI_number: 15790513 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) # Organism: Halobacterium sp. NRC-1 # 11 511 10 516 516 625 58.0 1e-179 MEEINKAYAMFQERDRIASLGGGVEKIEKQHESGKMTARERIEMLLDKGTFVELDKLMVH RCTNYGMDKNKIPGDGIVSGYGKIDGRQVFVYAYDFTVYGGSLSASNAKKIVKVQQLALK NGAPIIALNDSGGARIQEGIESLSGYADIFYQNTMASGVIPQISAILGPCAGGACYSPAL TDFIFMVKEKSHMFVTGPDVVKTVIHEEVSKEELGGAMTHSSKSGVTHFMCNTEEELLMS IRELLSFLPQNNMDETKKQNCTDETNREDAVLDTIVPADPNVPYDMKDIIERVVDNGYFF EVMTNFAKNIIIGFARLAGRSVGIVANQPAYLAGVLDIDASDKASRFIRFCDCFNIPLIT FEDVPGFLPGYTQENNGIIRHGAKIVYAFAEATVPKLTVITRKAYGGAYIVMNSKQTGAD VNFAYPSAEIAVMGAEGAVNILFRKADAETKGKELEAYKEKFATPYQAAELGFIDEIIYP RQTRKRLIQALEMTENKMQTNPPKKHGNMPL >gi|226332206|gb|ACIC01000114.1| GENE 13 12029 - 13531 1300 500 aa, chain + ## HITS:1 COG:SMc00127 KEGG:ns NR:ns ## COG: SMc00127 COG3119 # Protein_GI_number: 15964702 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Sinorhizobium meliloti # 24 453 5 430 512 184 32.0 5e-46 MINLKCTFAVTAGLCSSLAYAQKQPHIILIMTDQQRGDAMGCMGNESLISPHLDALASEG TLFMNGYSSCPSSTPARAGLLTGQSPWHHGLLGYGKVAPKYNHEMPQMLKDAGYYTFGIG KMHWHPQRIKHGFEGTLLDESGRREDPNFISDYRLWFQIQAPGKNPDETGIGWNDHGAAT YKLKESLHPTYWTGEMACQMIQNYDNGNQPLFLKVSFARPHSPYDPPQRFLDMYKDAQVP DPVIGEWCGKYAKELDPEKAAKDAPYGNFGNEYARHSKRYYYANITFIDEQIGRVLQTLK DKGMYDNSLIIFVSDHGDMMGDHYHWRKTYPYEGSTHIPYIIKWPAKAQVVPGKVDNPVE LRDLLPTFFEIAGTSVPTDIDGRSLLSLAKGTETEWRKYIDLEHATCYSDDNYWCALTDG KIKYIWYFYTGEEQLFDLAKDPKELHNAVNDKKYKKLLTGMRAEMIRHLSERGEEFVKDG QLVVRKKTMLYGPNYPKEKR 
>gi|226332206|gb|ACIC01000114.1| GENE 14 13695 - 14729 936 344 aa, chain - ## HITS:1 COG:NMA0465 KEGG:ns NR:ns ## COG: NMA0465 COG2855 # Protein_GI_number: 15793467 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Neisseria meningitidis Z2491 # 25 338 22 332 338 232 43.0 7e-61 MFNEKRSSMLHGVLLIALFSCAAFYIGEMSFVRSISFSPMIVGIILGMLYANSLRNHLPE TWVPGIQFCSKKILRIGIILYGFRLTFQDVMAIGLPAMLIDVIIVAVTICGGIYLGKLLK MDRGIALLTSIGSGICGAAAILGAESTIKAKPYKTAVSVSTVVIFGTISMFLYPFLYRNG FCALTPDQMGIYTGATLHEVAHVVGAGDAMGNGISDSAIIVKMIRVMMLVPVLLITTYLV ARARKRQVQKGQKFQKIAVPWFAIGFMGVIAFNSFDLLPAQLVAGINTLDTFLLTMAMTA LGTETSIDKFRKAGAKPFVLALLLYVWLVVGGYFLVKYLTPYLM >gi|226332206|gb|ACIC01000114.1| GENE 15 15308 - 16348 982 346 aa, chain - ## HITS:1 COG:no KEGG:BF3546 NR:ns ## KEGG: BF3546 # Name: not_defined # Def: putative N-acetylmuramoyl-L-alanine amidase # Organism: B.fragilis # Pathway: not_defined # 20 343 20 346 346 545 83.0 1e-153 MKKRFYILLLISFLLSLADVQAQQKATPKAGEGISTFLLRHNRAPKKYYDDFVELNKAKL GKGNVLKLGVTYTIPPVKRSAAADKETPARKQSSKASKIGTTLHEPLFGKQLANVKVTSN RLAGACFYVVSGHGGPDPGAIGRVGKHELHEDEYAYDIALRLARNLMQEGAEVHIIIQDA KDGIRNDAYLSNSKRETCMGDPIPLNQVQRLQQRCNKINALYRKDRQNYTYCRAIFIHVD SRSKKKQTDVFFYHSNKKAESKRLANNMKDTFESKYGKHQPNRGFSGTVSGRNLYVLSHT TPASVFVELGNIQNTFDQRRLVMDSNRQALAKWLMEGFLKDFKGRK >gi|226332206|gb|ACIC01000114.1| GENE 16 16470 - 17750 1368 426 aa, chain + ## HITS:1 COG:PM0738 KEGG:ns NR:ns ## COG: PM0738 COG2873 # Protein_GI_number: 15602603 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Pasteurella multocida # 9 426 5 420 422 524 58.0 1e-148 MAKQFKPETLCVQAGWTPKKGEPRVLPIYQSTTFKYDTSEQMARLFDLEDSGYFYTRLQN PTNDAVAAKIAALEGGVAAMLTSSGQAANFYAIFNICQAGDHFVCSSAIYGGTFNLFGVT MKKLGIDVTFVNPDASEEEISAAFQPNTKALFGETISNPSLEVLDIEKFARIAHSHGVPL IVDNTFPTPINCRPFEWGADIIVHSTTKYMDGHATSVGGCIVDSGNFDWDAHADKFPGLC TPDESYHGLTYTKAFGKGAYITKATAQLMRDLGSIQSPQNSFLLNLGLETLHLRMPQHCR NAQKVAEYLSKNEKVAWVNYCGLPDNKYYALAQKYMPNGSCGVISFGLKGGRDVSIKFMD 
SLEFIAIVTHVADARSCVLHPASHTHRQLSDEQLMEAGVRPDLIRLSVGIENADDIIADI EQALNA >gi|226332206|gb|ACIC01000114.1| GENE 17 17964 - 18731 443 255 aa, chain + ## HITS:1 COG:MA0289 KEGG:ns NR:ns ## COG: MA0289 COG2220 # Protein_GI_number: 20089187 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Methanosarcina acetivorans str.C2A # 47 235 13 201 225 113 37.0 4e-25 MKKILLVALALCLSYGFAYSCLFNHSINLTTKMQNFETDSFTTKNGKSLKITFFKHASLL IEYAGKKFFVDPVSDYADFTQQPKADYILITHEHHDHFDTKAIAAIETPDTKIIANPNCQ KMLDKGQAMKNGDILQITPEIKLEAVPAYNTTPGRDKFHPKGRDNGYILTVGGTRIYIAG DTEDIPEMKQIKNIDIAFLPVNQPYTMTPEQAIQSAKTIQPHILYPYHYGDTNINEVKDG LKNEKEIEVRIRALQ >gi|226332206|gb|ACIC01000114.1| GENE 18 19011 - 19535 219 174 aa, chain - ## HITS:1 COG:no KEGG:BT_1925 NR:ns ## KEGG: BT_1925 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 44 174 1 131 131 258 100.0 8e-68 MNRCIRLLPFFIGVLVLGACSQIKGYRIDGSAPLPEFEGKMVYMKDVSTDAPVDSARIIN GKFAFADTTKIENPVIKILSIHASKMGLEYRLPVVIENGTIKASIADVVCTEGTMLNERM QDFLLAIDAYSAACTDKPVEQIKSGFSELLKRYIEMNNDNVIGTYIQTAYQSSL >gi|226332206|gb|ACIC01000114.1| GENE 19 19635 - 20501 750 288 aa, chain - ## HITS:1 COG:no KEGG:BT_1926 NR:ns ## KEGG: BT_1926 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 288 16 288 288 549 100.0 1e-155 MIKKLCTLLLLTCITAAPSLAQRYSSSDDASFAPKKGQWQVSVLLGSGKFFNENTSYLLP KFTNTEGVVGLPNGGTDNSGDLNRYLNIGSLNNNSLVNIAGIEGKYFLSDNWDINFQFSM NISLTPKKDYVEGDNSVPDMIIPAQSYINAQMTNNWYTSVGSNYYFKTRNERIHPYVGGA LGFQMARIETTEPYTGDTYKDSDDSEELPSQVYTAGSKAGQMYGLKVAAVAGIEYSIAKG FIFGFEMHPLAYRYDLIQICPKGFDKYNVSHHNIKIFEMPVVKLGFRF >gi|226332206|gb|ACIC01000114.1| GENE 20 20528 - 23350 3118 940 aa, chain - ## HITS:1 COG:no KEGG:BT_1927 NR:ns ## KEGG: BT_1927 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 940 1 928 928 1283 100.0 0 
MRKWTYLVATLLMAGTTATFTGCIDTDEPEGIVDLRGAKSELIKAQAAVKLVEVEWQKAQ VAYQELVNKSKELDNQYKEYDVQMHALDVKLKELEVERAQAATEQAKAEAEAKIAEANRN KAYWENKMAEEAEIFKAAMLNYQTQTAQAQEAYDNAMKLIEAGKLLLSDGEKAIIDKAQQ RLYVASASLNQYYTALKTAQDNYYDALVNPNLPTLASLQAELKLAQIAVEKAEILLDEKN NMLALAEDFDAAAWDDKILDLKKKKSEYESEKSKADVEIATIKTSADYKAAEQKVAEKIK ARKAAKEAYDKAVADSTTQVNTQLDIAAYKSEPINEGLKTLFSSSNDFTSLDGYTVSTGV FDYPAVQYTQTEYNADLKIEDVTARTSQASLTLMKVNAWIDALGKYSVDENGVEWNKLTL AEKEKTAKNAKEKFEKDKANWEISAKAVKGTATTVPTTDLKKATDTYNSSYAAVESAVKA YNSAWDAVYQAAYDDAVEKEKASVLEKTYRDNMITALSPTSKAAWDALTPAEQTTSKLEA ILDDTVKQAKAKADANATLAEWLKLDQTVAELAAKGEAAGNQALADDKDKKVEKAKAALV KAAGEAATAVAKVAPAISTYSALAANPYGQILANTVSVEDMTGKDAFYKEEEKDGKKTGH MQALRSDISADEFTKLSATKLDRNTAEDALEYTSDATFGTVIPGEDRLVEVTEAMVRAYI KQNNSAVLTDFGTLGAMMAANDDVQTCKDMIAAADLIKPLKAQMEGVLADLNAEINTNTA LMDPFIAKADETRIALKTAKEEVKKAQEEQDALTAEAEANSKKFAELIQDYIGLISVVQD QIDGINGGVVTGSVVTVESVLNYWKNQVATQEKSVEEAKQKVTAAEKSIELFNKGEYKEA YVIEQKKLALETAQEAYDVAKAIYDTALAQVKSVLETLSK >gi|226332206|gb|ACIC01000114.1| GENE 21 23747 - 23923 67 58 aa, chain - ## HITS:1 COG:no KEGG:BT_1793 NR:ns ## KEGG: BT_1793 # Name: not_defined # Def: integrase protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 58 258 315 315 115 96.0 7e-25 MAYYCEVHPGVISEAMGHSSITVTETYLKPFKNKKIDEANVAVISSLKKVYSVGKLLN >gi|226332206|gb|ACIC01000114.1| GENE 22 24238 - 25467 1036 409 aa, chain + ## HITS:1 COG:ECs3766 KEGG:ns NR:ns ## COG: ECs3766 COG4974 # Protein_GI_number: 15833020 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Escherichia coli O157:H7 # 176 391 63 287 298 60 26.0 7e-09 MKVEKFKVLLYLKKSEPDKTGKAPIMGRITLNRTMAQFSCKLSCTPGLWNARESRLNGKS REAVETNEKIERLLLAVHSAFNSLMERKRDFDAAAVRDMFQGNAGMQMTLLKLLDRHNGE MKARVGVDRAPTTLSTYLFTYRTLSEFIKAKFKVPDLVFGQLNEQFIRDYQDFILLEKGY AVDTLRGYLAILKKICRIAYKEGHSEKYHFCHFKLPKQKETTPKALSRENFEKLRDLEIP EKRRSHVITRDLFLFACYTGTAYADAVSITRKNLFRDDEGSLWLKYQRKKTDYLGRVKLL PEAVALIEKYRDDTRETLFPPQDYHTLRANMKSLRLMAGLSQDLVYHMGRHSFASLVTLE 
EGVPIETICKMLGHSNIKTTQIYARVTPKKLFEDMDRFVEATRDLKLIL >gi|226332206|gb|ACIC01000114.1| GENE 23 25487 - 26698 827 403 aa, chain + ## HITS:1 COG:no KEGG:BT_1929 NR:ns ## KEGG: BT_1929 # Name: not_defined # Def: transposase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 403 1 403 403 799 100.0 0 MRSTFKLLFYINRNKVKSDGTTAVLCRISIDGKKSAVTTGVYCKPGDWDSKKCEIKTARE NNRLAAFRSRLEEAYGNLLRNQGVVTAELLKTTVSGANSVPEYLLQAGEVERERLRVRSK EINSTSTYRQSKTTQLNLRQFIESRGMKDITFSDITEEFAESFKVFLKKELGHRNGHVNH CLCWLNRLIYIAVDREILRANPIEDVAYERKETPKLRHISRSELKRMMETPLPDPMMELA RRTFIFSSLTGLAYADTRALHPRHIGTTSEGRRYIRIRRAKTDVEAFIPLHPIAGQILEL YNTTDDDRPVFPLPVRDVLWYEVHGMGVALGMKENLSYHMARHSFGTLTLTAGIPIESIA RMMGHTNIDSTQVYAQVTDRKISSDMNRLMERRKPAAGKEAAG >gi|226332206|gb|ACIC01000114.1| GENE 24 26763 - 27056 311 97 aa, chain - ## HITS:1 COG:no KEGG:BT_1930 NR:ns ## KEGG: BT_1930 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 97 1 97 97 169 100.0 2e-41 MEGIIDKENERVRRFFALLDDMEKKVERLARDNRPPFNGERFLTDRELSGMLKISRRCLQ DYRDQGRIPYIQLGGKILYRQSDIERLLEENYHPALV >gi|226332206|gb|ACIC01000114.1| GENE 25 27273 - 27392 56 39 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570444|ref|ZP_04847852.1| ## NR: gi|253570444|ref|ZP_04847852.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 39 17 55 55 67 100.0 3e-10 MEICYIEAGVLERMLARAENLSARVDRLYERNRCKEPGE >gi|226332206|gb|ACIC01000114.1| GENE 26 27934 - 31464 2217 1176 aa, chain + ## HITS:1 COG:jhp1409 KEGG:ns NR:ns ## COG: jhp1409 COG1002 # Protein_GI_number: 15612474 # Func_class: V Defense mechanisms # Function: Type II restriction enzyme, methylase subunits # Organism: Helicobacter pylori J99 # 6 1174 7 1157 1252 618 36.0 1e-176 MGLLKPNQVLNKAYRQVAIETTDFDLFKNALRTLRDNIVDGQREHTQKEHLRNFLSETFY KPYYMAPEEDIDLAIRLDKTIKSNIGLLIEVKSTTNKGEMISNDNLNRKALQELLLYYLK ERVNKKNNDIKYLIATNIHEFFIFDAHEFERKFYQNKQLRREFQDFVDGRKTSNKTDFFY TEIATTYIEEVKDSLEYTYFNLQDYQHLLDRTDSSASRKLIELYKIFSDTHLLKLSFQND SNSLNRGFYTELLHIIGIEERKENNKTVIVRKAVERRDEASLLENTINQLDAEDCLRHIN GSLYGNDYEERLFNVAMELCITWMNRILFLKLLEAQMLKYHNGDAIYKFLSITKIHDYDD LNTLFFQVLARDMGSRTHSIMRDFAYVPYLNSSLFEVTDLESKTIKINSLSQRTVLPVLA SSVLRNKKRNLQVNALPTLQYLFAFLDAYNFASEGSEEVQEEAKTLINASVLGLIFEKIN GHKDGSVFTPGFITMFMCREAITKTVLQKFNGYYGWNCTTRIELYNHIDNIVEANELINS LRLCDPAVGSGHFLVSALNELILLKYELGILVDATGKRIRKADYQLAIENDELIVTDTEG NLFAYNPLNAESRRMQETLFKEKRQIIENCLFGVDINPNSVKICRLRLWIELLKNAYYTA ESNYTYLETLPNIDINIKCGNSLLHRFALTDSIQTVLRESSISISQYKEAVAKYKNAQSK SEKQDLETFITEIKSKLKTEINRRDARLVRLNKRRSELANLQAPQLFEPTKKEKKASDKR IADLKKEIATLENIFEEIRSNKIYLGAFEWRIEFPEVLDAEGNFLGFDCIIGNPPYIQLQ SMGKSADVLECMGYITYARTGDIYCLFYELGMNLLTPNGFLCYITSNKWMRAGYGEALRG YFASKTNPIMLVDFAGIKIFDAITVEANILLSQKAANIFNTQACLVQDSNGLNNLSDFVQ QQGVKCNFADSIPWVILSPIEQSIKQKIESVGIPLKDWNIQINYGIKTGFNDAFIISTEK RDEILANCQTEDERVRTAELIRPILRGRDIKRYEYEWADLWIIATFPSRHYDIESYPAVK NYLLSIGIERLEQTGETHIVNGKKIKARKKTSNEWFETQDSISYWEDFSKPKIVWKIIGN QMAFAYDANNYVMNNACYIMTGDHLDYLLAVLNSNN >gi|226332206|gb|ACIC01000114.1| GENE 27 31431 - 32138 412 235 aa, chain - ## HITS:1 COG:no KEGG:BT_1932 NR:ns ## KEGG: BT_1932 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 235 1 235 235 458 100.0 1e-128 MVKIQKISEIEPRLGFTEFDMLKKYRQSFATSELGRLHALFPFSELARQMHLKSSALGRK SYFSPEGKIALMVLKSYTNFSDAQLIEHLNGNIHYQLFCGVQIDPLHPLTNPKIVSAIRQ 
ELAHRLDVEPLQLILAEHWKPYLENLHVCMTDATCYESHLRFPTDTKLLWEGIVWLHRHL CKHCQTLHIQRPRNKYLDVRRAYLAYSKLRKRRKSQTRMITRRLLQLLEFNTANR >gi|226332206|gb|ACIC01000114.1| GENE 28 32329 - 32601 162 90 aa, chain + ## HITS:1 COG:SMa0384 KEGG:ns NR:ns ## COG: SMa0384 COG3328 # Protein_GI_number: 16262658 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Sinorhizobium meliloti # 14 71 106 163 400 61 44.0 4e-10 MSLYSVDNIKEKLVISLYAKGMSVSDIEEEMREIYEIELSTSAISIITNKVNQAAQEWQN RPLDPVYLIVWMILPILTFQPINCFGNFVH >gi|226332206|gb|ACIC01000114.1| GENE 29 32548 - 34740 1487 730 aa, chain - ## HITS:1 COG:no KEGG:BT_1934 NR:ns ## KEGG: BT_1934 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 730 1 714 714 1436 100.0 0 MNKYILLAITALCLQDMQAQTVVHPSIKTKTTFAIVVDQKSYDEAKSEIDAYRTSIEKEG LGTYLLIDDWKRPEPIREQLVKLHENEKTPLEGCVFIGDIPIPMIRDAHHLSSAFKRSPK ANWQKSSVPSDRYYDDFGLKFDYIKQDSLIPDYHYMTLRADSKQYISPDIYSARIRPLHL EGENRYQMLRDYLKKAVAEKAKQNAFDQLTMARGHGYNSEDPLAWSGEQIALREQLPQIF KSGNTVKFYDFNMRYPMKPLYLNEIQREGLDVMLFHHHGGPTMQYINGYENGSGINLSIE NAKIFLRSKVPSYAKKHGREAAIKEYAKQYGVPESWCAEAFDEEKIKSDSIVNRNMDIYT EDIRLLTPNARFILFDACFNGSFHLDDNIVGSYIFNKGKTIATMGCTVNTIQDKWPDEFL GLLAAGMRIGQFTRFTCFLENHLIGDPTFHFTNNAGLDMDINQALVAQEGNVTFWKKQLN SPMADMQAMALRQLSMANYSGLVELLKKSYHESNYFVVRLEALRLLALNYPTEVADVLQT AMNDSYELIRRYAVEYVEKNCNPELLPAWIESYLLRGHENRHRFRIFSAINTFDHDMALN ELKKQAADWSFYDSSYVNELLEYLPRQKKGLERDFALIDSPESTTKQIQSEISRFRNKPI AKAIEPLLNIIKNESQEEELRILAAETLGWYNLYYNKADIIKELNTFRTSNQKLMNEVTK TINRLKSQNR >gi|226332206|gb|ACIC01000114.1| GENE 30 34743 - 36944 1396 733 aa, chain - ## HITS:1 COG:no KEGG:BT_1935 NR:ns ## KEGG: BT_1935 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 733 1 733 733 1449 99.0 0 MIKKIIYLAFLLPLAGNAQTTVIKPLVKQPTAFAIITDNQTYANTKDAMHQYKTAVEDDG LATYLISGDWQNPDQVKQIIIKTYQECPSLEGLVLIGDVPVALVRNAQHMTTAFKMNEKA FPWDQSSVPTDRFYDDLNLKFEFIRQDSVNHQHFYYKLTEDSPQRLNPTFYSARIKYPEK 
KEGDKYASIASYLKKAAAAKADKHNQLDRVFSFNGASYNSDCLIVWMDDEKAYMENFPLA FGRQMGFKHWNFRMKHPMKYKLFSELQRKDLDLFMFHEHGMPTGQLINDELACTDFNNRY KMLKSTLYNAVMSHVGKRDKDTLRIQMQEKRQVNEVFFKDLDNPKFWEADSLHYADERIV TEDLMKRNLSTNPKMIMFDACYNGSFHENDYIAGQYIFNDGQTLVAQGNTRNVLQDRWTI EMIGLLSHGVRAGQYNKLIASLEGHLFGDPTFRFAPIEANTLSTDITIHKDDKAYWKNLL NSPYADVQSLAMRMLADADTQKELSPLLLKKYRESGFNTVRMEAIKLLSRYQDDNFIEAL REGLNDTYEMVARQSAIYAGFVGDDSLLPAIVEALVEHNERLRVQMSANKALSLYPKEKV EKTIEDFYAKVDRLNENEEKKRLLRSLERMFVQEAKVHQTLMDVAAPEAKRISAIRNVRN YTFHFHVDDYLNVIRDAGNPQEVRVVMAEALGWFTNSVQRPHILEEIKKMQQTANLPEDL KAELEQTIKRLSL >gi|226332206|gb|ACIC01000114.1| GENE 31 36951 - 39098 1596 715 aa, chain - ## HITS:1 COG:no KEGG:BT_1936 NR:ns ## KEGG: BT_1936 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 715 1 715 715 1444 100.0 0 MLCSILSLRAQTFVKPAVKVKDTSFAVITDKGTFQACEAELKAYQEILGMEGLPTFIVYN EWNKPEDVKKVIVKLYKKDKLEGVVFVGDIPIPMLRKAQHMTSAFKMDEKNNDWRDSSVP SDRFYDDFDLQFDFLKQDSVENNFFYYNLAIKSPQQIRCDIYSARVKAVDNGEEPHAQIS RYFKKVVAEHQINNKLDQFFSYTGDGSYSNSLTAWTPETFTIREQMPGVFDKEGRARFIR YNFSDYPKDDVINMLKRTDLDLSIFHEHGMPERQYLSGSPATNRWNAHVDAMKYYYRGLA RRKQNNKKSFDEMLDMMKNTYGLDTTWIAGYDDPKVIAEDSLLDLRTGIILSEVTEFKPN SRMVIFDACYNGDFREKDYIAGRYIMSEGKCVTTFANSVNVLQDKMANEMLGLLGMGARV GQWAKLTNILESHITGDPTLRFQSINEVDANALFKEPYSESRMLELLQSPYADIQNFALH NLYRNDYPGISDLLRKTFETSSFMMVRFTCLALLEKISDKNFREVLHLAITDSYEFIRRT SVRMMQHVGLNEYVYPQIKAYVEDNLSERVAFNVSLGLQVFDQAAVQAAIDKVMAETYVL QDKEEMRKVLENANNSRSMQKELLSKETSERWRILYCNSLKNHMAHACVDGLLALLTDSS ESEKLKTCLLEAFAWFTHSYRKPDILRVCDQLRKDKSLSENLREEADRTYYRLKN >gi|226332206|gb|ACIC01000114.1| GENE 32 39156 - 40718 1373 520 aa, chain - ## HITS:1 COG:no KEGG:BT_1937 NR:ns ## KEGG: BT_1937 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 520 1 520 520 1001 100.0 0 MKKQYMNYMKSCLLVMIACLVSMVGAAQGFSPAAMEQLKTRRLWSHSQNAAGMPFDDIQN YSNVILGYDLQDGNYCRPQEGQKEAIVGVSSEGFINLKNAYVWGAFNFAQKNLTDAGYNA SIADPFRGMPYYVADQHLSKWRNQYYDLKFRAATPLLGNHWALGLEGNYVATLAAKQRDP 
RVDTRFYTLGLTPGITYKLNNSHKFGASFKYSSIKEDSRMSNVNSYVDQDYYILYGLGTA IKGIGSGVTSNYIGDRFGGALQYNFSMPSFNLLLEGSYDVKAETVQQSYTTPKKIAGVKD KTAHVSLTMIQEGKDYTNYMRTTYTNRNIDGIQYISQRDNSESQSGWVELYNNIRSTYKA QTASLNYALSRNRGNEYSWKAELNVNYTKQDDEYLMPNSVQNAENLSLGLGGKKNFVLGN SLNRRLLIDVHVAYNNNLGGEYVYGGSHADYPTVTELQQGLTNYYTCDYYRIGGSITYSQ QVRENRRMNLFAKVVFDRVNTSDYDYDGRTHLSISLGCNF >gi|226332206|gb|ACIC01000114.1| GENE 33 40742 - 42013 960 423 aa, chain - ## HITS:1 COG:no KEGG:BT_1938 NR:ns ## KEGG: BT_1938 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 25 423 1 399 399 795 100.0 0 MKKYLIYLFTLASTLLIGCDSFRDMSGTAEVNPITVDVYLDITVENISTLKDLTVKFDNY DEDLHYVKEVTDNSVKVDGIIPGIYSVTVSGTAIDTENNEYYINGNSVNAALFKHGSALN IEVQGLKVSPLIFKEIYYCGSRPEKGGVYFRDQFYEIYNNSADILYLDGIYFANLTPGTA TTKLPIWPEADGNNYAYGERVWKFPGNGTEYPLAPGESCIISQFAANHQLDIYNPQSPID GSSSEFEFNMNNPNFPDQAAYDMQHVFYQGKAEMGSIPQYLTSVFGGAYVIFRVPEGEAW DPVNDENMKTTDLSKPNSNVYYAKIPIKYVLDAVEAVNNESKMNAKRVPGVLDAGITWVG ATYCGLGIARKLSTDEEGNPIIREETGTYIYQDTNNSTDDFERGVVPVMRRNGAKMPSWN HTL >gi|226332206|gb|ACIC01000114.1| GENE 34 42027 - 44888 1588 953 aa, chain - ## HITS:1 COG:no KEGG:BT_1939 NR:ns ## KEGG: BT_1939 # Name: not_defined # Def: putative outer membrane receptor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 953 1 953 953 1825 100.0 0 MKKGILGCNIIYLFILICIGIALPIKSQNNYKVKLTGTVYEYDHNNKRLPLEFAAVSIPE IALGTTSDENGRYILENVPTGKIRMQIQYLGKVSIDTLINVNKDLVLNFTMRNEDFKLKE VTVTATNSRSGKSTSSHISRSAMDHMQATSLYDVMSLMPGGISQNQDMSSAQQINIRQVS SSSGPEAPMNAMGTAIIRDGAPISNNANLSAMSPTVLSGTETPASLAGGASPAGGTDVRS ISTENIESIQIVRGIPSVEYGDLTSGAVIINTKAGREPLRVKAKANPNIYQVSMGTGFEL GKKKGALNVSADYAYNTNNPISSYQHYQRATTKLLYSNTFFNNKLRTNSSFDFIYGKDQR ERNPDDEQTKTASEGRDIGFTLNTNGTWNINKGWLKTLRYVLSGTYMDKDSYYETVYSSA TSPYSMTTTNGAVLSNFAGQHIYDANGNQITNFGPEDINHYAVYLPSSYLGHYEIDSREV NLFAKVTSSLFKASGHVNNRILIGADFRSDGNVGKGKTYDPSTPPYRSQYGHNSSFRPRN YKDIPFINQFGAYVEDNFKWSISGTHDLNIQAGVRYDHTSVVGGIFSPRVNASIDLIPNL LSLQGGYGIAAKMPSLLYLYPENAYFEYININELTNENIPESQRLFMTTTEVRQVDNSDL 
KIAQNHKAEVGFNLRVGKTNLNVIAYKERLKDGYVMSQTFNTFNTFIYNEYQRTENGIEL SSSLPVLSTYAKPTNNLNIETKGLEFDLNIGRIDAIRTAFQINGSWMRTKSWRQGYSFYD NSEDAASARKPVAIYSQEGNASYKQQFVTTLRATHNIPRIGFVVTMTAQAIWQQSNWNTF GNDSIPVGYLALEDASVNMFPKGKYTTTQQVKDAGYGYMLNNVSHNNAIKESYSPYFCFN LNVTKEISNMLRVSFFANNMFRSYPRRESKRNPGSYIQLNNRFFFGLELSLTL >gi|226332206|gb|ACIC01000114.1| GENE 35 44985 - 45089 104 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLFLLVPLHNLTDLNRLLSNNIATALYLGKSNIY Prediction of potential genes in microbial genomes Time: Thu May 12 02:08:51 2011 Seq name: gi|226332205|gb|ACIC01000115.1| Bacteroides sp. 1_1_6 cont1.115, whole genome shotgun sequence Length of sequence - 5409 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 464 - 2062 649 ## BT_1940 hypothetical protein 2 1 Op 2 . - CDS 2134 - 3168 565 ## BT_1941 hypothetical protein 3 1 Op 3 . - CDS 3200 - 3640 273 ## BT_1942 hypothetical protein 4 2 Tu 1 . - CDS 4206 - 4469 62 ## BT_1943 hypothetical protein - Prom 4700 - 4759 1.9 5 3 Tu 1 . + CDS 4484 - 4888 212 ## BT_1944 putative transposase 6 4 Tu 1 . 
- CDS 4810 - 5403 468 ## BT_1945 conjugate transposon protein Predicted protein(s) >gi|226332205|gb|ACIC01000115.1| GENE 1 464 - 2062 649 532 aa, chain - ## HITS:1 COG:no KEGG:BT_1940 NR:ns ## KEGG: BT_1940 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 532 1 530 536 1061 99.0 0 MKMDYDECRAKISIIDIAEDLGYTRISGRQATNLTYVLGTPKNPEDEIVIFPKKNTYFSR KGSFDDKGELTKFVLKRLHMFSYCTQQGYRGVNEVLSRYMTGERIVTGNVQSRTKDHTQN YIKEFNLNYWNPRPIKKDNPYLTVQRKLSPQTVEDFANRLFIYQVGKNNHIGFPFRKPSQ MEILNFEMRNYFAETNTNYKAFATGGDKAQSCWMANFVPFDKVTDIYLFESAIDAMSFYE INHYTKETTCAFISTGGYVTKSQIENISRIFPSDKVKWNCCYDNDASGNGFDITTAYYLK GEECKAFARTNTGDTYKTIYLSFPDGNTQTFKEDAFSSGEYLKQHSIDNVNIIKPSRYKD WNELLVYYKRFDLNLGPGMKFIPAIEKTISQLNLRGYEQLANSISSSTKELVDSLLEQAN YCISAPLAESGAYTLMVDCNIFMGLDTMVPVPSNLYVIEKCTQKKISAHAINEFLKKEYI NIFRDMSSSDFKNFLEKDILTYTKGAVEKNFEKVILTFGWSLKPSILKKKKF >gi|226332205|gb|ACIC01000115.1| GENE 2 2134 - 3168 565 344 aa, chain - ## HITS:1 COG:no KEGG:BT_1941 NR:ns ## KEGG: BT_1941 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 273 1 273 276 522 99.0 1e-147 MKDQKELLKKCLVNDIPAIVFQGSDSCAVEILQAAENIYRKNGCSPEFLIDFHENVVENF RAYQLENSIATKLPDLTTSEKEAFYILQEKEEAYFPIHIKDFEQHLKDYGFTRTYENTTS YGKYLLLKEKDYIGITGNNGNYDFAINYSKESNQISYFVFSDDPNYPERIGDYNSLKTCF SDIMSRHPLKEEKNELFTVIENYHQMGYDDISQALTQYASAIDNFCNNAQSKYIIEGVSI CKTDSRQLKGNITITQKGNIPVSQITDFQIKDKSTGKTEPLNIGSINLEKQSPETIKKLL SGHKVEMTNKSGGNSLVSLNKTVTGWGLSIAKQIANSTDSSAEI >gi|226332205|gb|ACIC01000115.1| GENE 3 3200 - 3640 273 146 aa, chain - ## HITS:1 COG:no KEGG:BT_1942 NR:ns ## KEGG: BT_1942 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 146 1 146 146 243 100.0 2e-63 MREYAKNPFGALNKENISAEGMDKWAAVTNKYMEMKTNISTKQIELQSSGCKTLIYDVSH SSGQEEIGNYRILDKSTGKTESINVGDIDLEKQSPETLKKLLSGQQTEMTNKSGTNSLVT LNKTITGWGISAVKQVFNSADNSAGI >gi|226332205|gb|ACIC01000115.1| GENE 4 4206 - 4469 62 87 aa, chain - ## HITS:1 COG:no 
KEGG:BT_1943 NR:ns ## KEGG: BT_1943 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 87 1 87 87 164 100.0 1e-39 MLNTHSCIFHRGVNGSEPLLPVVNGVNICGNPRIISCCLLPLVNPCKPMSTGFDFVAIKK SRTFTAIEIRHKNMRKMAEQQIRCAKC >gi|226332205|gb|ACIC01000115.1| GENE 5 4484 - 4888 212 134 aa, chain + ## HITS:1 COG:no KEGG:BT_1944 NR:ns ## KEGG: BT_1944 # Name: not_defined # Def: putative transposase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 134 1 134 134 238 100.0 6e-62 MKTTKKCAFCGRMFTPNSGMQKYCTVHCADEAKKAKKKRQQDLLNAVEPVLEIQRQEYLT FSKAAILMGCSRQYIYKLVAQGRLRASRISNRMAFIRRADIEKMLEDNPYNRVIPGSRPK KAASTSSGRPESFS >gi|226332205|gb|ACIC01000115.1| GENE 6 4810 - 5403 468 197 aa, chain - ## HITS:1 COG:no KEGG:BT_1945 NR:ns ## KEGG: BT_1945 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 197 64 260 260 405 100.0 1e-112 MKNDDLKVNLYRQYERIRKPAYPVIKSDPEKGVEDLRRYMDEKGETFDIVLFDLPGTLRS EGVVHTVAAMDYIFVPLKADNIVMQSSLQFTKVLEEELIAKGNCNLKGIRLFWNMVDRRG RKNLYDAWNRVIHRMGLRLLSSHIPNTLRYNKEADPVCKGVFRSTLFPPDPRQEKDSGLP ELVEAAFFGRDPGITLL Prediction of potential genes in microbial genomes Time: Thu May 12 02:09:36 2011 Seq name: gi|226332204|gb|ACIC01000116.1| Bacteroides sp. 1_1_6 cont1.116, whole genome shotgun sequence Length of sequence - 108474 bp Number of predicted genes - 107, with homology - 102 Number of transcription units - 43, operones - 19 average op.length - 4.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 21 - 49 -1.0 1 1 Op 1 . - CDS 75 - 461 164 ## BT_1946 hypothetical protein 2 1 Op 2 . - CDS 491 - 907 137 ## BT_1947 hypothetical protein - Prom 927 - 986 5.5 + Prom 865 - 924 8.2 3 2 Tu 1 . + CDS 984 - 1238 141 ## BT_1948 hypothetical protein + Prom 1540 - 1599 3.9 4 3 Tu 1 . + CDS 1621 - 1893 109 ## + Term 2007 - 2043 -0.5 - Term 1661 - 1704 2.3 5 4 Op 1 . 
- CDS 1822 - 2388 240 ## BT_1949 hypothetical protein 6 4 Op 2 35/0.000 - CDS 2385 - 3143 195 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 7 4 Op 3 33/0.000 - CDS 3140 - 4120 684 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 8 4 Op 4 . - CDS 4121 - 5260 1065 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 9 4 Op 5 . - CDS 5272 - 7350 1285 ## BT_1953 putative TonB-linked outer membrane receptor 10 4 Op 6 . - CDS 7369 - 8469 793 ## BT_1954 putative surface layer protein 11 4 Op 7 . - CDS 8498 - 10588 1228 ## BT_1955 putative cell wall biogenesis protein 12 4 Op 8 . - CDS 10603 - 12384 1271 ## BT_1956 putative cell surface protein - Term 12399 - 12437 4.0 13 4 Op 9 . - CDS 12449 - 13366 751 ## BT_1957 hypothetical protein 14 5 Op 1 . - CDS 14476 - 15114 429 ## BF3036 tyrosine type site-specific recombinase 15 5 Op 2 . - CDS 15199 - 15369 190 ## BF3036 tyrosine type site-specific recombinase 16 6 Tu 1 . - CDS 15799 - 16575 461 ## BT_1962 hypothetical protein - Prom 16742 - 16801 7.6 + Prom 16715 - 16774 7.4 17 7 Tu 1 . + CDS 16805 - 17533 525 ## BT_1963 putative transcriptional regulator - Term 17955 - 18004 10.3 18 8 Op 1 . - CDS 18008 - 18565 373 ## BT_1964 hypothetical protein 19 8 Op 2 9/0.000 - CDS 18627 - 19997 373 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 20 8 Op 3 27/0.000 - CDS 20001 - 23198 3176 ## COG0841 Cation/multidrug efflux pump 21 8 Op 4 . - CDS 23217 - 24359 1367 ## COG0845 Membrane-fusion protein - Prom 24520 - 24579 2.5 + Prom 24325 - 24384 5.8 22 9 Tu 1 . + CDS 24514 - 25362 647 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 25407 - 25466 4.6 23 10 Tu 1 . + CDS 25552 - 27843 2747 ## COG0281 Malic enzyme + Term 27874 - 27910 5.0 + Prom 27997 - 28056 7.2 24 11 Tu 1 . + CDS 28181 - 29515 1551 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase + Term 29557 - 29601 9.4 25 12 Tu 1 . 
+ CDS 29920 - 30384 346 ## COG0753 Catalase + Term 30612 - 30656 9.2 - Term 30596 - 30648 3.2 26 13 Tu 1 . - CDS 30689 - 30907 203 ## gi|253570493|ref|ZP_04847901.1| predicted protein 27 14 Tu 1 . - CDS 31303 - 32100 255 ## Coch_0655 putative phage repressor - Prom 32179 - 32238 8.8 + Prom 32104 - 32163 5.5 28 15 Op 1 . + CDS 32186 - 32428 173 ## gi|253570495|ref|ZP_04847903.1| predicted protein 29 15 Op 2 . + CDS 32438 - 32770 66 ## + Term 32916 - 32957 3.1 30 16 Tu 1 . - CDS 32612 - 32863 211 ## gi|253570496|ref|ZP_04847904.1| predicted protein - Prom 32978 - 33037 5.7 + Prom 32789 - 32848 2.9 31 17 Op 1 . + CDS 32990 - 33169 193 ## gi|253570497|ref|ZP_04847905.1| predicted protein 32 17 Op 2 . + CDS 33177 - 33341 64 ## 33 17 Op 3 . + CDS 33361 - 35553 1987 ## gi|253570498|ref|ZP_04847906.1| predicted protein 34 17 Op 4 . + CDS 35582 - 36496 649 ## gi|253570499|ref|ZP_04847907.1| predicted protein 35 17 Op 5 . + CDS 36493 - 37170 436 ## gi|253570500|ref|ZP_04847908.1| conserved hypothetical protein 36 17 Op 6 . + CDS 37136 - 37570 351 ## gi|253570501|ref|ZP_04847909.1| predicted protein 37 17 Op 7 . + CDS 37557 - 37781 316 ## gi|253570502|ref|ZP_04847910.1| predicted protein 38 17 Op 8 . + CDS 37805 - 38452 718 ## gi|253570503|ref|ZP_04847911.1| conserved hypothetical protein 39 17 Op 9 . + CDS 38449 - 38835 320 ## BF2406 hypothetical protein 40 17 Op 10 . + CDS 38832 - 39095 177 ## gi|253570505|ref|ZP_04847913.1| predicted protein 41 17 Op 11 . + CDS 39103 - 39306 195 ## gi|298387187|ref|ZP_06996740.1| hypothetical protein HMPREF9007_03952 + Prom 39367 - 39426 4.1 42 18 Op 1 . + CDS 39483 - 39932 282 ## BVU_0933 hypothetical protein 43 18 Op 2 . + CDS 39943 - 40239 204 ## gi|299145782|ref|ZP_07038850.1| hypothetical protein HMPREF9010_01234 44 18 Op 3 . + CDS 40262 - 40438 137 ## 45 18 Op 4 . + CDS 40459 - 40707 269 ## gi|253570507|ref|ZP_04847915.1| predicted protein 46 18 Op 5 . 
+ CDS 40716 - 40898 127 ## gi|253570508|ref|ZP_04847916.1| predicted protein + Term 40916 - 40957 6.1 - Term 40904 - 40943 6.5 47 19 Tu 1 . - CDS 40950 - 41867 379 ## COG1533 DNA repair photolyase - Prom 41888 - 41947 6.2 + Prom 41828 - 41887 7.4 48 20 Tu 1 . + CDS 42128 - 42385 171 ## gi|253570510|ref|ZP_04847918.1| predicted protein + Term 42386 - 42423 1.7 49 21 Op 1 . - CDS 42394 - 42993 408 ## gi|253570511|ref|ZP_04847919.1| predicted protein 50 21 Op 2 . - CDS 43005 - 43589 335 ## gi|253570512|ref|ZP_04847920.1| conserved hypothetical protein 51 21 Op 3 . - CDS 43667 - 46213 1493 ## COG2369 Uncharacterized protein, homolog of phage Mu protein gp30 52 21 Op 4 . - CDS 46246 - 46668 308 ## gi|253570514|ref|ZP_04847922.1| predicted protein 53 21 Op 5 . - CDS 46671 - 48209 1110 ## HTH_0882 phage uncharacterized protein 54 21 Op 6 . - CDS 48209 - 48724 478 ## gi|253570516|ref|ZP_04847924.1| predicted protein - Prom 48902 - 48961 5.2 + Prom 48805 - 48864 3.4 55 22 Op 1 . + CDS 48927 - 49988 892 ## COG0740 Protease subunit of ATP-dependent Clp proteases 56 22 Op 2 . + CDS 50015 - 50950 776 ## gi|253570518|ref|ZP_04847926.1| conserved hypothetical protein 57 22 Op 3 . + CDS 50962 - 52185 1067 ## Coch_0648 hypothetical protein 58 22 Op 4 . + CDS 52190 - 52630 581 ## Coch_0647 hypothetical protein 59 22 Op 5 . + CDS 52659 - 53039 469 ## gi|253570521|ref|ZP_04847929.1| predicted protein + Prom 53174 - 53233 3.0 60 23 Op 1 . + CDS 53253 - 55181 1445 ## XBJ1_3939 putative tapemeasure protein 61 23 Op 2 . + CDS 55186 - 55821 503 ## Coch_0643 hypothetical protein 62 23 Op 3 . + CDS 55818 - 56789 714 ## Cpin_0289 hypothetical protein 63 23 Op 4 . + CDS 56786 - 57289 533 ## gi|253570526|ref|ZP_04847934.1| predicted protein 64 23 Op 5 . + CDS 57286 - 57585 288 ## gi|253570527|ref|ZP_04847935.1| predicted protein 65 23 Op 6 . + CDS 57597 - 57896 270 ## gi|253570528|ref|ZP_04847936.1| conserved hypothetical protein 66 23 Op 7 . 
+ CDS 57909 - 58157 231 ## BDI_0444 hypothetical protein 67 23 Op 8 . + CDS 58150 - 58647 396 ## COG3023 Negative regulator of beta-lactamase expression 68 24 Op 1 . + CDS 58821 - 59150 185 ## gi|253570532|ref|ZP_04847940.1| predicted protein 69 24 Op 2 . + CDS 59147 - 59425 251 ## gi|253570533|ref|ZP_04847941.1| predicted protein 70 24 Op 3 . + CDS 59425 - 60273 659 ## Coch_0637 hypothetical protein 71 24 Op 4 . + CDS 60263 - 60751 290 ## Cpin_0295 hypothetical protein 72 24 Op 5 . + CDS 60744 - 61586 396 ## gi|253570536|ref|ZP_04847944.1| predicted protein 73 24 Op 6 . + CDS 61593 - 67883 4258 ## gi|253570537|ref|ZP_04847945.1| conserved hypothetical protein 74 24 Op 7 . + CDS 67885 - 68676 610 ## gi|253570538|ref|ZP_04847946.1| conserved hypothetical protein 75 24 Op 8 . + CDS 68699 - 69511 536 ## gi|253570539|ref|ZP_04847947.1| conserved hypothetical protein + Term 69544 - 69569 -0.5 + Prom 69522 - 69581 5.8 76 25 Op 1 . + CDS 69657 - 70037 232 ## Rpic_2349 hypothetical protein 77 25 Op 2 . + CDS 70052 - 71473 699 ## COG3344 Retron-type reverse transcriptase 78 25 Op 3 . + CDS 71439 - 71714 82 ## gi|253570542|ref|ZP_04847950.1| conserved hypothetical protein + Term 71717 - 71765 11.2 + Prom 71731 - 71790 5.8 79 26 Tu 1 . + CDS 71810 - 72883 1042 ## COG0753 Catalase + Term 72912 - 72951 -0.8 - Term 73085 - 73122 5.7 80 27 Tu 1 . - CDS 73158 - 76130 2736 ## BT_1972 phosphoenolpyruvate synthase/pyruvate phosphate dikinase - Prom 76336 - 76395 7.2 + Prom 76200 - 76259 8.6 81 28 Op 1 . + CDS 76462 - 77799 1498 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase + Term 77837 - 77885 13.0 + Prom 77869 - 77928 5.1 82 28 Op 2 . + CDS 77952 - 79115 1239 ## COG0006 Xaa-Pro aminopeptidase + Term 79164 - 79234 26.4 - Term 79153 - 79220 28.6 83 29 Tu 1 . - CDS 79253 - 80755 1250 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase - Prom 80958 - 81017 4.5 + Prom 80749 - 80808 5.3 84 30 Tu 1 . 
+ CDS 80984 - 81826 1034 ## BF3641 hypothetical protein + Term 81937 - 81970 -0.2 85 31 Op 1 . - CDS 82348 - 83337 1185 ## BT_1977 hypothetical protein 86 31 Op 2 . - CDS 83406 - 84011 720 ## COG0632 Holliday junction resolvasome, DNA-binding subunit - Prom 84084 - 84143 4.7 + Prom 84103 - 84162 4.8 87 32 Tu 1 . + CDS 84190 - 85089 1271 ## BT_1979 meso-diaminopimelate D-dehydrogenase + Term 85107 - 85166 10.1 + Prom 85096 - 85155 9.0 88 33 Op 1 . + CDS 85185 - 85490 115 ## BT_1981 hypothetical protein 89 33 Op 2 . + CDS 85487 - 85816 230 ## BT_1982 hypothetical protein 90 33 Op 3 . + CDS 85858 - 86508 526 ## COG1272 Predicted membrane protein, hemolysin III homolog + Term 86686 - 86723 1.0 91 34 Op 1 . - CDS 86542 - 86763 245 ## COG1476 Predicted transcriptional regulators 92 34 Op 2 . - CDS 86753 - 87217 308 ## BT_1985 hypothetical protein - Prom 87279 - 87338 5.8 93 35 Tu 1 . + CDS 87411 - 88112 720 ## COG0120 Ribose 5-phosphate isomerase + Prom 88213 - 88272 7.4 94 36 Tu 1 . + CDS 88293 - 90734 1979 ## BT_1987 hypothetical protein - Term 91012 - 91054 -0.8 95 37 Tu 1 . - CDS 91060 - 91296 315 ## BT_1988 hypothetical protein - Prom 91352 - 91411 4.2 + Prom 91271 - 91330 7.2 96 38 Tu 1 . + CDS 91522 - 92604 373 ## PROTEIN SUPPORTED gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 + Term 92654 - 92711 13.4 + Prom 92630 - 92689 3.9 97 39 Op 1 . + CDS 92906 - 93391 633 ## BT_1990 hypothetical protein 98 39 Op 2 . + CDS 93431 - 93871 350 ## COG3023 Negative regulator of beta-lactamase expression + Prom 93877 - 93936 3.5 99 39 Op 3 . + CDS 93956 - 94057 74 ## + Term 94100 - 94147 2.5 + Prom 94090 - 94149 4.3 100 40 Op 1 . + CDS 94170 - 97301 1615 ## BT_1992 hypothetical protein 101 40 Op 2 . + CDS 97314 - 101264 1844 ## COG3209 Rhs family protein 102 40 Op 3 . + CDS 101221 - 101553 293 ## BT_1994 hypothetical protein 103 40 Op 4 . + CDS 101623 - 102264 258 ## BT_1995 hypothetical protein + Term 102301 - 102353 9.1 104 41 Tu 1 . 
- CDS 102471 - 102875 335 ## BT_1997 hypothetical protein - Prom 103074 - 103133 3.6 105 42 Op 1 12/0.000 + CDS 103332 - 105725 2492 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase + Term 105740 - 105813 5.1 + Prom 105757 - 105816 3.7 106 42 Op 2 . + CDS 105838 - 106332 277 ## COG0602 Organic radical activating enzymes + Term 106467 - 106507 1.6 + Prom 106456 - 106515 7.0 107 43 Tu 1 . + CDS 106546 - 107958 921 ## COG0477 Permeases of the major facilitator superfamily + Term 108007 - 108050 -0.7 Predicted protein(s) >gi|226332204|gb|ACIC01000116.1| GENE 1 75 - 461 164 128 aa, chain - ## HITS:1 COG:no KEGG:BT_1946 NR:ns ## KEGG: BT_1946 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 128 1 128 128 235 100.0 4e-61 MTGKNVKERIGLGGMDADRIREIMGEAPACPRKRRGTSGNDIIRPARKNGPMQPSVYGEE YLHGIAGVQRRSLHIPAALHRKLSILAGASRNGKVTLEGFINHLVSRHLEEYRETTDMIL EESLPGRS >gi|226332204|gb|ACIC01000116.1| GENE 2 491 - 907 137 138 aa, chain - ## HITS:1 COG:no KEGG:BT_1947 NR:ns ## KEGG: BT_1947 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 138 1 138 138 262 100.0 3e-69 MDSEKEKNRLSDIVLERVGLTGNLLSAPVSPSLEPVVEIPSHGSQVRAGKVTGPEEYKRR FLVPAPRAAEWKTAYIDGRLHRRIAMLVRAAGCGSISGFIIRLLELHMEEHREDIASLLG EVYRPWDEDGQPGGTPRR >gi|226332204|gb|ACIC01000116.1| GENE 3 984 - 1238 141 84 aa, chain + ## HITS:1 COG:no KEGG:BT_1948 NR:ns ## KEGG: BT_1948 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 84 1 84 84 149 100.0 2e-35 MKPPVTYSHPKSVIVMIHMTGGTEMPFRSDGSSYAEATSEARSGVEKRAGFRFLVHRSFL ILMGRRESAHPSGYGFLNYRSMGV >gi|226332204|gb|ACIC01000116.1| GENE 4 1621 - 1893 109 90 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MWSAPADSKLCFRLTKTGRFNGEYTDKPLLISSSSVAFYISNYNSFEITKQIEPNVVSLS VYFMSESLVCYVAKFVFIMKKSAYENKIFR >gi|226332204|gb|ACIC01000116.1| GENE 5 1822 - 2388 240 188 aa, chain - ## HITS:1 COG:no KEGG:BT_1949 NR:ns ## KEGG: BT_1949 # Name: 
not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 188 1 188 188 391 100.0 1e-108 MSIYTVENFTSDITVEGYIAEFRDELHFLELCKQCTNYGKSWGCPPFDFDTESFLRQYKY AHLMATKIIPEDKDIPIEYTQKLILPERIRIESELLDMERKYGGRSFAYIGKCLHCSDNE CTRNCGTPCRHPEKVRPSLEAFGFDIAKTLSELFNIELLWGKDGKLPEYLVLVSGFFHNE YELCNIAY >gi|226332204|gb|ACIC01000116.1| GENE 6 2385 - 3143 195 252 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 216 1 217 245 79 25 6e-14 MIELQHFSIGYKENSLLHEVNATIKKGQLTALIGRNGTGKSTLLRAIAGLNRCYSGKIIL DGHDIACMKTEDMAKTLAVVTTERTRIANLRCKDVVAIGRAPYTNWIGRMQETDKEIVMQ SLISVGMEAYANRTMDKMSDGECQRVMIARALAQDTPIILLDEPTSFLDMPNRYELVALL RRLVHDEKKCIMFSTHELDIALSMCDSIALLDTPNLSCLTASEMQKSGYIDRLFQNENIR FDSLCGTMILKQ >gi|226332204|gb|ACIC01000116.1| GENE 7 3140 - 4120 684 326 aa, chain - ## HITS:1 COG:alr4032 KEGG:ns NR:ns ## COG: alr4032 COG0609 # Protein_GI_number: 17231524 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Nostoc sp. PCC 7120 # 6 323 22 356 362 237 48.0 2e-62 MRSRSTILFSILITLTVGLFLLDLAVGAVNIPIRDVWAALTGGNCSRATEKIVLNIRLIK AIVALLAGAALSVSGLQMQTLFRNPLAGPYVLGISSGASLGVALVVLAGIGSSIGIAGAA WVGAAVVLLVITAVGQRIKDIMVILILGMMFSSGVGAVVQILQYLSKEESLKAFVIWTMG ALGDVTSGQLLILVPSVFAGLLLAVLTIKPLNLLLFGEEYAVTMGLNIRRSRSLLFLSTT LLAGTITAFCGPIGFIGLAMPHVTRMLFQNSDHHVLLPGTILSGASILLLCDIISKIFTL PINAITALLGIPIVVWVVLRNKSITA >gi|226332204|gb|ACIC01000116.1| GENE 8 4121 - 5260 1065 379 aa, chain - ## HITS:1 COG:alr4031 KEGG:ns NR:ns ## COG: alr4031 COG0614 # Protein_GI_number: 17231523 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Nostoc sp. 
PCC 7120 # 51 375 93 420 426 226 36.0 4e-59 MNALKNLSLILLLSLAFTGCHNKSSKINDFNLLLYAPEYASGFDIKGAGGKESVLITVRN PWQGADSVTTWLFIVRNGEEVPEGFAGQVLKGDAKRIVAMSSTHIAMLDAIGEVRCITGV SGIDYISNPDIQARRDSIGDVGYEGNINYELLLSLDPDLVLLYGVNGASAMESKLEELDI PFMYVGDYLEESPLGKAEWMVVLSEVTGKREKGEKAFAAIPVRYNALKKKVADSTLGTPS VMLNVPYGDSWFMPSTQSYVARLITDAGGRYIYQKNTGNASIPIDLEEAYLLASDADMWL NVGMANSLDDLKASCPKFTDTRCFKNGEVYNNNARTNTAGGNDYYESAVVNPDIVLRDLV KIFHPELVQEECVYYKQLK >gi|226332204|gb|ACIC01000116.1| GENE 9 5272 - 7350 1285 692 aa, chain - ## HITS:1 COG:no KEGG:BT_1953 NR:ns ## KEGG: BT_1953 # Name: not_defined # Def: putative TonB-linked outer membrane receptor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 692 1 692 692 1381 100.0 0 MKRHLILLFVGVSLPFLLAAQQKNSVSITKRVLRIPEVTVVGKRPMKDIGVQRTRFDSIA MKENIALSMADVLTFNSSVFVKNYGRATLSTVAFRGTSPSHTQVTWNGMRINNPMLGMTD FSTIPSYFIDDASLLHGTSSVNETGGGLGGLVRLSTSPANHEGFGLQYVQGVGSFSTFDE FLRLTYGDKHWQSSTRVVYSSSPNDYKYRNRDKKENIYDEDKNIIGSYYPTERNRSGAYK DLHVLQEIYYNTGEGDKFGLNAWYINSNRELAMLSTDYGNDMDFENRQREQTFRGVLSWD RVREKWKVGVKGGYIHTWMAYDYKRDKGNGEMASMTRSRSKINTFYGSADGDYAPSEKWL FTAGVSVHQHLVESADKNIISQEGNKAVVGYDKGRVEFSGSVSAKWRPVDRFAASLVLRE DMFGTEWAPVIPAFFIDGVLSKKGNIVAKASISRNYRFPTLNDLYFLPGGNPDLKSEHGF TYDVGLSFSVGKENVYALSGGINWFDSHIDDWIIWLPTTKGFFSPRNLKKVHAYGAETNA HLDIMLGKDWKLDMNGTFSWTPSINESEPMSPADQSVGKQLPYVPEFSATVTGRLSWRTW SLLYKWCYYSQRYTMSSNDYTLTGYLPPYFMNNVTLEKQLSFRWADLSLKGSINNLFDEE YLSVLSRPMPGINFEIFIGITPKFGKNKNSKR >gi|226332204|gb|ACIC01000116.1| GENE 10 7369 - 8469 793 366 aa, chain - ## HITS:1 COG:no KEGG:BT_1954 NR:ns ## KEGG: BT_1954 # Name: not_defined # Def: putative surface layer protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 366 1 366 366 751 100.0 0 MIRVLFFIRMTMSRTIQRICLFLFCLPVFGSCMKWDYGEMEDFSVSASGLFITNEGNFQY SNATLSYYDPATCEVENEVFYRANGFKLGDVAQSMVIRDGIGWIVVNNSHVIFAIDINTF KEVGRITGFTSPRYIHFLSDEKAYVTQIWDYRIFIINPKTYEITGYIECPDMDMESGSTE QMVQYGKYVYVNCWSYQNRILKIDTETDKVVDELTIGIQPTSLVMDKYNKMWTITDGGYE GSPYGYEAPSLYRIDAETFTVEKQFKFKLGDWPSEVQLNGTRDTLYWINNDIWRMPVEAD 
RVPVRPFLEFRDTKYYGLTVNPNNGEVYVADAIDYQQQGIVYRYSPQGKLIDEFYVGIIP GAFCWK >gi|226332204|gb|ACIC01000116.1| GENE 11 8498 - 10588 1228 696 aa, chain - ## HITS:1 COG:no KEGG:BT_1955 NR:ns ## KEGG: BT_1955 # Name: not_defined # Def: putative cell wall biogenesis protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 696 1 696 696 1297 100.0 0 MKIYFLPMLLSLFFLGACDKNDEIIPEDADENFITSVVMTVDGKSYTADITDNTVTITVP YTVSLNNAEVEFKYTTSATIIPDPETVTDWDNERTFRVTSYNGDAREYTYKVVKSEIESD GDVELKTTEEVASFAATKTTVVKGNLIIGSDAEEAEKITDISALASLKEVTGNIVIRNSY NGADLTGLDNIVSAGGLQVGSTDVASKATELHMISMKALETLSGDISVYNDQVTYVLFEK LATIEGSVMFNASSLQSFEFPVLTTVGQDLNLQGLNEENTAAGSIASLEIPELTSVGGVL SVNNLAKLTSMSFLKLKETGGLDFHTVPVMLETINLPEIETVNGSIIMEANMEAPPTGSF VPQRNDVLQAFGGMDKLTTIKGQIKIKNFTALKQLPDWSKITTLGSITLDYLEDVSGTLL LPNARFETFGETAPQIEIINKVQLSKIETAEDLSNVNFVITSLTNNKFPEITFKNIKDFT CKPTTNNTDYTISTIQHVYGNLNVTGQMRSNAKFPDLEIIDGYGYIQIPMFASITMPVLK EVGGQFYLSGNFTSCNLPLLSKVCCSASPVYYKEGEGSLAISLQSKSLDIPELLHVGGEG LFVNKATGITCDKLQTIDGTLQIKSATSLSQETLSMEKLETLHGVVFDGLTKFTDYTFFG KFIENGMITGESWSVTKCGYNPTFQNMKDKQYTQQD >gi|226332204|gb|ACIC01000116.1| GENE 12 10603 - 12384 1271 593 aa, chain - ## HITS:1 COG:no KEGG:BT_1956 NR:ns ## KEGG: BT_1956 # Name: not_defined # Def: putative cell surface protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 593 1 593 593 1164 100.0 0 MHRFHYFIISACMLFTSCNKDEVITEEVGGQPIIELDSETGIYTVKVDHELTIAPTYQNV EDALFAWTIDGTLVSSGPSLQRTWNECGDFYVKLRVDNAEGYAEEELKVEVKELTPPVIS LALPSQGLKVVRNTDYTFTPDIQHSDVEGFKIEWVREGKIVSTENTYTFNEKELGVYTVT INASNIDGTTTKDVSVEVVETMPYVVKFPTPSYLQTSTDRYTFADRPVFLRPLLEYFDNP RFEWSVDGQVMEGEVERMFKFTPSAPGEYTVSCTVSEDTPTEKISRNIDKGKTAVTATVK VVCVDKKEQDGFRASGSSKLWNKVYEYTPAPGQFINETSTIGGMTGNETSPEAAVAWATQ RLKDKLHVSLGSFGGYIIVGFDHSIPNSGNQYDFCVQGNAFDGSSEPGIVWVMQDINGNG LPDDEWYELKGSEAGKEETIQNFEVTYYRPEGKKMDVQWISSDGRNGWVDYLSAYHTQDY YYPAWISENSYTLTGTCLAARNTQDSQTGYWDNQSYDWGYVDNFGNDQIEGGSTVDGSGQ RNGFKISNAIHADGTEANLQYIDFIKIQCGVLAKSGWLGEVSTEVFSFEDLTK >gi|226332204|gb|ACIC01000116.1| GENE 13 12449 - 13366 751 305 aa, 
chain - ## HITS:1 COG:no KEGG:BT_1957 NR:ns ## KEGG: BT_1957 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 305 1 305 305 598 100.0 1e-169 MKRKLRFLAGACLFTATALFSGCSSDDDFLMDPVDSGTSQTRAVTNPDGTLTITFDDFDP GMLAGPTSAGENLYSYQGYPQVTTIYDNTPEEYLFLSMFNTVGGSTEYSSGGIALSNWNI RSNQSGNTGDWWYSYLNQCSVYNTAVEAEGQNKEAGHSGSNFGVVYGYVDAYNQAWMAKP EFYFNVPRKLVGLWICNTSYTYGVITYGNQFGSTGVATPLKEMKGYFQVNLECYDVNGGL IRTYKRLLADYRNGQQQVDPITTWDYWEINAEGVQSVKFNFEGSDSGAYGLNTPAYICID DITIQ >gi|226332204|gb|ACIC01000116.1| GENE 14 14476 - 15114 429 212 aa, chain - ## HITS:1 COG:no KEGG:BF3036 NR:ns ## KEGG: BF3036 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: B.fragilis # Pathway: not_defined # 1 212 86 297 297 434 99.0 1e-121 MNLFREQLEPYKKKIGIEKAESTYCGLVADYKSLLLFMKSKKNAEDIVIEELEKSFIEDY YNWMLGTCALANSTVFGRVNTLKWLMYIAQEKGWIRVHPFASFECMPEYKRRSFLSEEEL QRIIHIEPRYKRQRAMRDMFLFMCFTGLSYVDLKAITYDNIHTDSDGGTWLMGNRIKTGV AYVVKLLPIAIELIEKYRGTDEKKDSPNVSFR >gi|226332204|gb|ACIC01000116.1| GENE 15 15199 - 15369 190 56 aa, chain - ## HITS:1 COG:no KEGG:BF3036 NR:ns ## KEGG: BF3036 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: B.fragilis # Pathway: not_defined # 1 56 1 56 297 114 98.0 1e-24 MGRITINGTQAGFSCKKEVSLALWDVKTNRAKGKSEEARTLNQELDNIKAQITRHY >gi|226332204|gb|ACIC01000116.1| GENE 16 15799 - 16575 461 258 aa, chain - ## HITS:1 COG:no KEGG:BT_1962 NR:ns ## KEGG: BT_1962 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 258 1 258 258 503 99.0 1e-141 MGNLISFMKEVADGLRESGNFGTAHIYRSSLSAILAFHGSDKLPFRKVSQEFLKSFESYL RGRNCSWNTVSTYMRTLRAVYNRAVDRHLAPYVPHHFRYVYTGTRADRKRALDKEDMERL MKELPKQLCQDNKELQRTRGLFFLMFLLRGIPFVDLAYLKKHDIDGNVMTYRRRKTGRLL TVTLVPEAMKLIKRYMNTDPASPYLFSLITSGEGTEAAYKEYQLALRNFNYQLMILKQVL GLTSEVSSYTAKHHTISI >gi|226332204|gb|ACIC01000116.1| GENE 17 16805 - 17533 525 242 aa, chain + ## HITS:1 COG:no KEGG:BT_1963 NR:ns ## KEGG: BT_1963 # Name: not_defined 
# Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 242 1 242 242 452 100.0 1e-126 MEQSKNKRARRTRAELEADVFDAIRQLASEKGLAQITFTDIMQRADIQMSVLLNNYKNIE RLLDKYAYISDYWLHDLFDEEHPTDKANEDIMKSTLKALANYLYDNTDMQHLLVWELEAD NSTTRRMARSREKHYKLAIEEYKHLFEGTGIPIDIIAGLLTAGTYYLILHRKRSTFFSVD YQRKENRERLYSTLEYLSGLVFSALKEHNQTIEIARNFKQKGIADDVIAECTGLSVDVVK GL >gi|226332204|gb|ACIC01000116.1| GENE 18 18008 - 18565 373 185 aa, chain - ## HITS:1 COG:no KEGG:BT_1964 NR:ns ## KEGG: BT_1964 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 185 18 202 202 372 98.0 1e-102 MYLHGADGLTMDDIAKGMKMSKRTLYKLFPSKTCLFRICLSDFTNGIRSCLKQSQMRMDS SCMQVLFATVNGYLTLLHSLGKTLLLDIAANEDYRASFKREEAFWLQQFIDVLTHCKICG YLLPGVDPDRFAADLQEVIYQSCLQGTPYVVQRTLNHTLLRGLFEVDGIRYIDEHLKLDK FNVCV >gi|226332204|gb|ACIC01000116.1| GENE 19 18627 - 19997 373 456 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 1 454 1 455 460 148 26 1e-34 MMKKILILSAATLVLSSCGIYNKYKPVSEVPEGLYGSETVVSADTANFGNLSWREVFTDS YLQNLIDSALVRNTDMQTAHLRVEEAEASLLTSKLSYLPSLFLAPEGAVSSFDRGKATQT YSLPVTASWELDIFGKVTTAKRRAKAAYEQSKEYEQAVKTQLVSAVANTYYTLLMLDAQY EISVSTEAAWKESVRTARAMKKAGMMTEAGLAQTEATYYNICTAVLDLKEQINQTENSMA LLLAETPHKIQRGKLENQQLPEDFSVGIPLQMLSNRPDVRSAEFSLAQAFYTTNAARAAF YPSITLSGSAGWTNSAGSMIINPGKFVASAVASLTQPLFNKGVNIAQLKIAKAQQEEARL SFEQTLLNAGVEVNEALVQYQTAREKSAFYDKQITSLQTAARSTSLLMEHSNNTTYLEVL TAQQTLLNARLSQVANRFTEIQGVITLYQALGGGRM >gi|226332204|gb|ACIC01000116.1| GENE 20 20001 - 23198 3176 1065 aa, chain - ## HITS:1 COG:BMEI1629 KEGG:ns NR:ns ## COG: BMEI1629 COG0841 # Protein_GI_number: 17987912 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Brucella melitensis # 4 1025 3 1022 1051 765 41.0 0 MNLRTFIERPVLSAVISITIVVVGIIGLFSLPVEQYPDIAPPTIMVSTTYYGASAETLQK SVIAPLEEAINGVEDMTYMTSSATNAGTVSITVYFKQGTDPDMAAVNVQNRVSRATGQLP 
AEVTQVGVTTSKRQTSILQMFSLSSPDDSYDENFLSNYISINIKPQILRISGVGDLMIMG GEYSMRVWMKPDVMAQYKLIPSDITGVLAEQNIESATGSFGENSNETYQYTMKYKGRLIT PEEFGDIVIRSTDNGEVLKLKDIAEIQLGQDSYAYHGGMDGHPGVSCMVFQTAGSNATEV NQNIDELLEEVRKDLPKGVELTQIMSSNDFLFASIHEVVKTLIEAIILVILVVYVFLQDF RSTLIPLVGVIVSLIGTFAFMAMAGFSINLLTLFALVLVIGTVVDDAIIVVEAVQARFDV GYRSSYMASIDAMKGISNAVITSSLVFMAVFIPVSFMGGTSGTFYTQFGLTMAVAVGISA INALTLSPALCALLLKPYINEDGTQKNNFAARFRKAFNSAFDILIEKYKNVVLIFIKRRW LAWSLLVCSVVLLVFLMNTTKTSLVPDEDQGVVFVNVSTAAGSSLTTTDEVMERIEKRLM DIPQIKHVQKVAGYGLLAGQGSSFGMLILKLKPWDERPNDEDNVQAVIGQVYGRTADIKD ASVFAISPGMIPGYGMGNALELHMQDKMGGDINEFFTTTQQYLGALNQRPEIAMAYSTFD VRYPQWTVEVDAAKCKRAGITPDAVLSTLSGYYGGQYVSNFNRFSKVYRVMIQADPQFRL DETSLDNAFVRMSNGEMAPLSQFVTLTRSYGAESLSRFNMYNSIAVNAMPADGYSTGDAI KAVQETAVQALPKGYGYDYGGITREENQQSGTTMIIFGICFLMIYLILSALYESFIIPFA VLLSVPCGLMGSFLFAWMFGLENNIYLQTGLIMLIGLLAKTAILLTEYAAERRKAGMGLI ASAVSAAKARLRPILMTALTMIFGLFPLMLSSGVGANGNRSLGTGVVGGMTIGTLALLFI VPTLFIAFQWLQERLRPVQSRSTEDWQIEEEIKVSEEEKSKAGKK >gi|226332204|gb|ACIC01000116.1| GENE 21 23217 - 24359 1367 380 aa, chain - ## HITS:1 COG:Cj0367c KEGG:ns NR:ns ## COG: Cj0367c COG0845 # Protein_GI_number: 15791734 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Campylobacter jejuni # 9 372 6 367 367 145 30.0 1e-34 MSKKVCKMKQGLLLLGCLVAAVGCKQAPPTQMETGYEVMTVSPTDRMISSAYSATIRGRQ DIDIYPQVGGTLTKVSVTEGQRVKSGQTLFIIDQVPYEAALQTAVANVESAKASLATAQL TYDSKEELYKENVVSAFDLSTAKNSLLAAKAQLAQAKAQEVSARNNLSYTVVKSPADGVV GTLPYRVGALVSSSIPEPLTTVSDNSDMYVYFSMTENQLLGLIRQYGSKDEALKSMPAID LQLNDKSAYPEQGQIESISGVIDRSTGTVSLRAVFPNKDGLLHSGGAGNVVIPVQKTGAL VIPQGATFEIQDKRYVYKVVDGKAQSALVQVTRVNGGREFIVDEGLAPGDVIVAEGVGLL REGTPIKAKTAQASTTTTEN >gi|226332204|gb|ACIC01000116.1| GENE 22 24514 - 25362 647 282 aa, chain + ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 175 277 183 285 288 69 35.0 7e-12 
MIQRNSTLAVLRESFVAGSDENLSPIMHHLWKMEGGAIYFCRSGWAHVTIDLKEYEIVKN TQVVLLPGSIIRINGSSSDFTASFFGFPKDMFREACLRLEPTFFRFMKERPCYVLPDKDT GAINGLIQAATAIYNDRENRFRNQIAKNHLQSFMLDIYDKCYRYFDKQEIEGGSRQDEIF KKFVALVHENCISQREVNFYANELCISTKYLTGICRSVTGDSAKKIIDDFAILEIKVLLQ STELTMQDIADRLGFPDQSYLGRYFKRHEGMSPREYQNRYVI >gi|226332204|gb|ACIC01000116.1| GENE 23 25552 - 27843 2747 763 aa, chain + ## HITS:1 COG:STM2472_1 KEGG:ns NR:ns ## COG: STM2472_1 COG0281 # Protein_GI_number: 16765792 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Salmonella typhimurium LT2 # 1 429 1 429 434 521 62.0 1e-147 MAKITKEAALLYHSQGKPGKIEVVPTKPYSTQTDLSLAYSPGVAEPCLEIEKNPQDAYKY TAKGNLVAVISNGTAVLGLGDIGALSGKPVMEGKGLLFKIYAGIDVFDIEVDEKDPEKFI AAVKAIAPTFGGINLEDIKAPECFEIERRLKEELDIPVMHDDQHGTAIISSAGLVNALQV AGKKIEDVKIVVNGAGASAVSCTKLYVSLGARLENIVMLDSKGVISKTRTDLNEQKRYFA TDRTDIHTLEEAIKGADVFLGLSKGNVLSQDMVRSMAPMPIVFALANPTPEISYEDAMSA RPDVLMATGRSDYPNQINNVIGFPYIFRGALDTQAKAINEEMKIAAVHAIANLAKQPVPD VVNTAYHVNNLSFGPEYFIPKPVDPRLITEVSCAVAKAAMESGVARTEIKDWDAYCVHLR ELMGYESKLTRQLYDTARRNPQRVVFAEGIHPNMLKAAVEAKAEGICHPILLGNDEAIGK LAEELDLSLEGIEIVNLRHPDESERRERYSRILAEKRAREGFTYEEANDKMFERNYFGMM MVETGDADAFITGLYTRYSNTIKVAKEVIGIQPGFNHFGTMHILNSKKGTYFLADTLINR HPDTETLIDIAKLADKTVRFFNHTPVISMLSYSNFGADTSGSPVKVHGAVNYMQKEYPEL AIDGEMQVNFAMNRELRDAKYPFTRLKGKDVNTLIFPNLSSANAGYKLLQAMDPDTEFIG PIQMGLNKPIHFTDFESSVRDIVNITAVAVIDAIVVKKKNESR >gi|226332204|gb|ACIC01000116.1| GENE 24 28181 - 29515 1551 444 aa, chain + ## HITS:1 COG:PA4588 KEGG:ns NR:ns ## COG: PA4588 COG0334 # Protein_GI_number: 15599784 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Pseudomonas aeruginosa # 7 444 9 445 445 584 62.0 1e-166 MNAAKVLDDLKRRFPNEPEYHQAVEEVLSTIEEEYNKHPEFDKANLIERLCIPDRVFQFR VTWMDDKGNIQTNMGYRVQHNNAIGPYKGGIRFHASVNLSILKFLAFEQTFKNSLTTLPM GGGKGGSDFSPRGKSNAEVMRFVQAFMLELWRHIGPETDVPAGDIGVGGREVGFMFGMYK KLAHEFTGTFTGKGREFGGSLIRPEATGYGNIYFLMEMLKTKGTDLKGKVCLVSGSGNVA 
QYTIEKVIELGGKVVTCSDSDGYIYDPDGIDREKLDYIMELKNLYRGRIREYAEKYGCKY VEGAKPWGEKCDIALPSATQNELNGDHARQLVANGCIAVSEGANMPSTPEAIKVFQDAKI LYAPGKAANAGGVSVSGLEMTQNSIKLSWSAEEVDEKLKSIMKNIHEACVQYGTEADGYV NYVKGANVAGFMKVAKAMMAQGIV >gi|226332204|gb|ACIC01000116.1| GENE 25 29920 - 30384 346 154 aa, chain + ## HITS:1 COG:YPO1207 KEGG:ns NR:ns ## COG: YPO1207 COG0753 # Protein_GI_number: 16121498 # Func_class: P Inorganic ion transport and metabolism # Function: Catalase # Organism: Yersinia pestis # 2 146 4 148 480 253 81.0 6e-68 MEEKKLTAANGRPIADNQNSQTAGQRGPVMLQDPWLIEKLAHFDREVIPERRMHAKGSGA FGTFTVTHDITKYTRAAIFSEVGKQTECFVRFSTVAGERGAADAERDIRGFAIKFYTEEG NWDLVGNNTPVFFLRDPLKFPDLNHVRGSIQNVA >gi|226332204|gb|ACIC01000116.1| GENE 26 30689 - 30907 203 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570493|ref|ZP_04847901.1| ## NR: gi|253570493|ref|ZP_04847901.1| predicted protein [Bacteroides sp. 1_1_6] # 1 72 1 72 72 114 100.0 2e-24 MAKETKVIHVHLIFKKTSRFFGSISAIYSEFTAEEIGITEETLRHKGLSDGVSFATKKAI IQQGVLIRSARK >gi|226332204|gb|ACIC01000116.1| GENE 27 31303 - 32100 255 265 aa, chain - ## HITS:1 COG:no KEGG:Coch_0655 NR:ns ## KEGG: Coch_0655 # Name: not_defined # Def: putative phage repressor # Organism: C.ochracea # Pathway: not_defined # 9 260 2 269 277 102 28.0 1e-20 MKENMRDFSVLKQRILQYLDFKGITKYECYKNTGITNGVLSQPNGMSEDNLLKFLSYYSD ISTDWLLAGCGSMLRDDNQTKISKIVPIESEFESIPIVDISVAAGYGCENPDFIEVVETI RLPYNMLRRNRKYFCVKVRGESMSPTLLDCSYLILRLLDRSEWNEIKDNHVYVVSDRSGR AYVKRIKNRFREYGFIVCTSDNVDKANYPNFNLMEDEINTILYVEWYLSAKMPNINATYY DKVNHLEDDVDALKSQMSLLMKRLT >gi|226332204|gb|ACIC01000116.1| GENE 28 32186 - 32428 173 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570495|ref|ZP_04847903.1| ## NR: gi|253570495|ref|ZP_04847903.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 80 1 80 80 130 100.0 3e-29 MKQIIELRDTEKRKMIAETFGISLANLSQILRFKRNGKNAEAIRRMAQENGGIKYTEGNE PSKVKVLDSHGNVTRVISNK >gi|226332204|gb|ACIC01000116.1| GENE 29 32438 - 32770 66 110 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQKNKILKRIDRIIFQAIVMGDNLTADDVLRLGNHARFKKKLFMVGDTCRVSVRKLSNSA SIFTVVNYYYINSLCFTDRIRHYLGKLRTTAFRNMGISEVFVVIIDDTCL >gi|226332204|gb|ACIC01000116.1| GENE 30 32612 - 32863 211 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570496|ref|ZP_04847904.1| ## NR: gi|253570496|ref|ZP_04847904.1| predicted protein [Bacteroides sp. 1_1_6] # 1 83 1 83 83 140 100.0 2e-32 MMFFDDLESKLASDQTICKSILRDVKCPVHKLKARIIYDYDKDFTYAHITKCCCPQFAQI VADTIRKTETIDVVVIDDCEYTR >gi|226332204|gb|ACIC01000116.1| GENE 31 32990 - 33169 193 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570497|ref|ZP_04847905.1| ## NR: gi|253570497|ref|ZP_04847905.1| predicted protein [Bacteroides sp. 1_1_6] # 1 59 1 59 59 83 100.0 5e-15 MKTFRVIQNVLIAVGIITTVSLVDGIEVSASNVQAAFVIACFTIVTILEREFRSEKDEE >gi|226332204|gb|ACIC01000116.1| GENE 32 33177 - 33341 64 54 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQARNFPVFRDNTDSWFQIEFVMVLFNGVRTGFQWEQVKRGEAATILGQCRDLR >gi|226332204|gb|ACIC01000116.1| GENE 33 33361 - 35553 1987 730 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570498|ref|ZP_04847906.1| ## NR: gi|253570498|ref|ZP_04847906.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 730 1 730 730 1474 100.0 0 MAEIFNNRICVFANELIIFNPKTQVGSEDGFIPEGTYYSMVRNGQLIVLRRGIPGCPALV DFETMRKDVKKGYIVRKGDPRAEIAAKTQKSILEDAIVYSNAAYEFFSVKYRYDGDKKLP PAKIDEYTLNVRIMNALLSLRDGRKANSIGGGGTRINVWEKLCKLSNDLLTLKDPNGRDI FPHNLPKNWKALKRKCEQYEAARRISEEEGYRSVIHKSYGNKYAAVVLNEDAKAVMHKLI SMHNNLNNVQIMEEYNKVASLMDWKPIDSPTTVENWRQKFALTTMAGNKGDKALKNTRMK QIHREAPTQALTYWTLDGWDAELFYQKKTPKTVKKNGEEKRYMYTTYTNRKTMVVVLDAC EKYPVGYAIGDHESPALIREALRNAVQHTKELFGERYKPLQLQSDNYQKKVMVPFYEAMT KYYTPAALGNAKSKIVEPYFKRLNVEYCQKQANWSGFGITADKDNQPNLEVLNQNHKFIP DEATVIAQLEAIIAQERAKKIDAYRAAWERTEEARKMPFGIEEYLMLMGETTGRTNKITG SGLFIEFMGERICFDSFDLSLRDHYNEDWIVRFDPDDMSQVLVSNAKRLKSGRVDKEIGT LQYVLQRDIKVPMALADQKPEHFEYRARVDRFNTEMVETVKEKVKEVDRRITTICQRIPE IAAGTVLDRYLITDSLGQHKDVRSKMRDDATDADFMEVTQHITRQSVAMASTGTDDEDYD YNPLDMNFSR >gi|226332204|gb|ACIC01000116.1| GENE 34 35582 - 36496 649 304 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570499|ref|ZP_04847907.1| ## NR: gi|253570499|ref|ZP_04847907.1| predicted protein [Bacteroides sp. 1_1_6] # 1 304 1 304 304 608 100.0 1e-172 MDNQALKTYIEKLINRGSSATELARKCGISDTAMSQFRSGKYGANEDSIAEKIASGLNYY ENAWNVVESVTSYQQVRTAFVAAKRNHKWMCISSRSGSGKTQSLIDLYNMSTDNSVIYLK CRKWTARKFLTKLATCMGETVTRYMDNDDLMDLVVSHINRMAGKSPLLILDDAGKLAHSA LCTLIPLYDDTLHRLGAIVAGTETLERNIKRYVGRVEGYDEIDGRFCRNYIALLGATKKD VKAICAANGINDTEEQENIWGKLNKEKKEPVPGKYVWFTDDLRELSGMIEDRIIKQQIER GELA >gi|226332204|gb|ACIC01000116.1| GENE 35 36493 - 37170 436 225 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570500|ref|ZP_04847908.1| ## NR: gi|253570500|ref|ZP_04847908.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 225 1 225 225 443 100.0 1e-123 MKVWSQKNLEDIRHEYIDFDGEWYLAFGRPEKSGCWIIYGKSGQGKSSFALQLARKFDEM GLRVLYLTLEMGACDDFVNSVLSVGIHSKTNNIIYSDEATITELDEYLSKQRSPDVIMID SIQYFEQQGGAKAPEIIRLRKKYPRKIFVFISHVDGREVEGKTAYDVKRDSFKRIYVEHF KATFIGRGKGGSRGYYIVWAEGYQKYWIENIKSDNDGTEDEETYQ >gi|226332204|gb|ACIC01000116.1| GENE 36 37136 - 37570 351 144 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570501|ref|ZP_04847909.1| ## NR: gi|253570501|ref|ZP_04847909.1| predicted protein [Bacteroides sp. 1_1_6] # 1 144 1 144 144 221 100.0 1e-56 MMEQKTKKPISKSLIKRLHIIYSAQGIDDEQKRAILLDLTDGRTNTTKELTYSEAMYLCG YLNGAKKENQDLTITEREIRRRRSAVLKRVQRIGIDTTDWGAVNAFCLDTRIAGKKFREL DGEELLLLIPKLESILKKKEDGGY >gi|226332204|gb|ACIC01000116.1| GENE 37 37557 - 37781 316 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570502|ref|ZP_04847910.1| ## NR: gi|253570502|ref|ZP_04847910.1| predicted protein [Bacteroides sp. 1_1_6] # 1 74 1 74 74 114 100.0 1e-24 MADISAEQHRINRINELLDRLDKIPGELDAIHEKLYAGNMNRNEFAKLVDQRSSLYIEAE NKERELKEVYKIKL >gi|226332204|gb|ACIC01000116.1| GENE 38 37805 - 38452 718 215 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570503|ref|ZP_04847911.1| ## NR: gi|253570503|ref|ZP_04847911.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 215 1 215 215 377 100.0 1e-103 MDISKLSKEEKAELLRKLKEEEKTESIQRKETYEALRHQFMFDVESKLMPVVNDVQGFYD WIVGESKAFRNVMREYGQLRMRQGEETATFSVVDGNFKLEVKSNKVKSFDERADLAAERL IDYLKNYIAHSEKGVDDPMYQMAMTLLERNRQGDLDYKSISKLYELESRFDEEYASIMQL FKESNVVYKTATNYYFHKRDENGVWRRIEPSFCRL >gi|226332204|gb|ACIC01000116.1| GENE 39 38449 - 38835 320 128 aa, chain + ## HITS:1 COG:no KEGG:BF2406 NR:ns ## KEGG: BF2406 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 116 1 116 121 189 75.0 2e-47 MIIAVDFDGTISRGRFPAIDGEQPYAGESLRKLHDEGHKIIIWTCRTGDQLLNAINWLLE RKIPFDRVNDHDPENVAKYGEGGKKIYAHCYIDDKNIGGFPGWLACVEEIERMEEAYKTI LKEDKTKV >gi|226332204|gb|ACIC01000116.1| GENE 40 38832 - 39095 177 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570505|ref|ZP_04847913.1| ## NR: gi|253570505|ref|ZP_04847913.1| predicted protein [Bacteroides sp. 1_1_6] # 1 87 1 87 87 144 100.0 3e-33 MNKKKEIIRTIRNFKRILKSGNVKTILVVSDWDIYIKTYTIEEIAARFLRIKGYNVQITI SDNTEHPSYQFGYIRFYRYARIKFNSN >gi|226332204|gb|ACIC01000116.1| GENE 41 39103 - 39306 195 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|298387187|ref|ZP_06996740.1| ## NR: gi|298387187|ref|ZP_06996740.1| hypothetical protein HMPREF9007_03952 [Bacteroides sp. 1_1_14] # 1 67 1 67 67 117 100.0 3e-25 MNAKDQRKLCKAGYTILRRHDYPQPHITFKSDINPDSWKRYGDNYPSKAERDRAMKRLLT DDKIVED >gi|226332204|gb|ACIC01000116.1| GENE 42 39483 - 39932 282 149 aa, chain + ## HITS:1 COG:no KEGG:BVU_0933 NR:ns ## KEGG: BVU_0933 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 148 1 148 151 208 66.0 5e-53 MAKIYVASSWRNQHQPQVVSFLREQGHEVYDFRHPAGKTGFQWSQIDEDWRNWSTDQYRA ALEHPIAQAGFKSDFDAMQWADVCVLVLPCGRSAHSEAGWMKGAGKKVIVYQIWEEEPEL MYKLFDGVCSMGVGLQMFLAEFDKEKNNV >gi|226332204|gb|ACIC01000116.1| GENE 43 39943 - 40239 204 98 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|299145782|ref|ZP_07038850.1| ## NR: gi|299145782|ref|ZP_07038850.1| hypothetical protein HMPREF9010_01234 [Bacteroides sp. 
3_1_23] # 1 97 9 100 100 74 42.0 2e-12 MNKDNIIPPMTHPYGMCWQQPPTYLILIDDTHAVMSRLDFEILMDYTCSRPSALYNGKMW KAQYENEGALKWFLCYCFNENEKTNEIDIAYREILIID >gi|226332204|gb|ACIC01000116.1| GENE 44 40262 - 40438 137 58 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLYIAEVVAIIEEAKERGEHSHTFINLSSYVRHILIKVGYKIRFSIDENWNGVYKVEW >gi|226332204|gb|ACIC01000116.1| GENE 45 40459 - 40707 269 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570507|ref|ZP_04847915.1| ## NR: gi|253570507|ref|ZP_04847915.1| predicted protein [Bacteroides sp. 1_1_6] # 1 82 1 82 82 142 100.0 9e-33 MSFIARQKNGLLCRFSTVIDTVSDYNMTDEEYIEMCAQKAREEAQETLKHSLRPFEEVKA SFVPTNMSRDEFNRILKLMEKE >gi|226332204|gb|ACIC01000116.1| GENE 46 40716 - 40898 127 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570508|ref|ZP_04847916.1| ## NR: gi|253570508|ref|ZP_04847916.1| predicted protein [Bacteroides sp. 1_1_6] # 1 60 1 60 60 100 100.0 4e-20 MKRFNTQTRFVPLKIDEDFNVGHIQSKDGKVKDFKTRKAVEKYCKENHCIYCEEKYIFYK >gi|226332204|gb|ACIC01000116.1| GENE 47 40950 - 41867 379 305 aa, chain - ## HITS:1 COG:PAE1082 KEGG:ns NR:ns ## COG: PAE1082 COG1533 # Protein_GI_number: 18312397 # Func_class: L Replication, recombination and repair # Function: DNA repair photolyase # Organism: Pyrobaculum aerophilum # 41 214 15 185 295 96 35.0 7e-20 MSMLKEEIAGMTTGYTVNAPQWVTTFTKAPEFKSFYKTVSGNEGNKCNYPTRLDLYGCGC FHDCSYCYAKSLLNFRGLWHPDNPSVSRTDKVGRKICKLERGTVVRLGGMTDCFQPCEAV YRETYKAIQNLNRQGVHYLIVTKSSMVADDIYIRLMDRKLAHIQISVTSTDDTLSRTFEK ACPPSARIKAIEKLQEQGFDVSVRLSPFIPQFIDFRVLNSIRCDKILVEFLRVNTWVKQW FDIDYSEYTLKHAGYNHLQLERKIEYLNKISGFRKISVCEDVDSHFQYWKKNVNYSPNDC CNLRI >gi|226332204|gb|ACIC01000116.1| GENE 48 42128 - 42385 171 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570510|ref|ZP_04847918.1| ## NR: gi|253570510|ref|ZP_04847918.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 85 1 85 85 137 100.0 3e-31 MRNPEMTKIRDRKMVETFYLLYDKKRIRLEDVLLRMSHDLFFLDQNYIYKRIFYISENLS YYEQLKEGKKPDSKKNDTNQLSLGF >gi|226332204|gb|ACIC01000116.1| GENE 49 42394 - 42993 408 199 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570511|ref|ZP_04847919.1| ## NR: gi|253570511|ref|ZP_04847919.1| predicted protein [Bacteroides sp. 1_1_6] # 1 199 1 199 199 358 100.0 1e-97 MIWTDCYKELVEIIRSKDEFLASIPDEYSELRERMENTPGIEHIDMWHEQVSFLDEEHPF SSPAVFIEFNTLGIEDEGLLVQRLHTQIDFRLFYETFSDTYEGAAMQEEALSFLDLLTLL GMMLHGKSGKNFGTLRRTHVGREESGGAGNLYRISFECEIMDYTTMELASHADMKDREMK ISNGDLPEKTEDEEPLYHL >gi|226332204|gb|ACIC01000116.1| GENE 50 43005 - 43589 335 194 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570512|ref|ZP_04847920.1| ## NR: gi|253570512|ref|ZP_04847920.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 194 1 194 194 369 100.0 1e-101 MQMERTELPDFFKELSTLVEDAHRYAKVAGVNFFKQNFRRQGFLDTSLTPWAKRSLAIGS DRGVLIQSGKLRDSIHAVSRGIDRIIYQTDPLAYAKIHNEGGYIVVTERMKRYFWYLYMK STGSMQKRKNGELRQNKANERLSTMASFYKSMALKKVGSRIRIPKRQYMGESATFMKQLD TWIASEIDKRFSNI >gi|226332204|gb|ACIC01000116.1| GENE 51 43667 - 46213 1493 848 aa, chain - ## HITS:1 COG:NMA1850_1 KEGG:ns NR:ns ## COG: NMA1850_1 COG2369 # Protein_GI_number: 15794738 # Func_class: S Function unknown # Function: Uncharacterized protein, homolog of phage Mu protein gp30 # Organism: Neisseria meningitidis Z2491 # 525 635 53 186 207 72 32.0 4e-12 MYKKLREIFNWFQQKAIRRMSLKNILNEYYYRMDSSGLPTSGTMYKRQAVVYREKTIDDW IMSVTAATDPDDPRRGLLYRFFQSLYNDEHLQTTIDNRVLPVQQAKYNLVDDNDNEDEEA KKLLDRPWFHQLIRICFLHQLQGVSLADLSHLDDNLEISHVEEIPMSNYIPQQQIIIREE SDQTGWSYKDGALEPYYVQFGNPWSLGMLNELAVIILAKKLGLGAWMNYIEKYGVPPVFV TSDRMDKKRMDELFEMMTDFRNNFFAVLQGNETVEYGKEAGGNTTNAFLPLEERCDNQIS KRLLGQTGTTENGAWEGTAEVHERVEKSRHEYDKMLFQFYFNYIIIPKLVKISPVYKPLE RLKLKWDDTESLSITEYIEAINKLAYTFEFDHEEVAKKTGLPIIGQKKNPGGEQQGGTLP NQPQTDPQKKKTEPDDETVTSPVMEAGEYDFSSIIGRVMKQVYERKVKTGNIDGELFRKT YEELNKKAAEGWGEDDYNDPEQAEEPQRIRDNLFKFSGAKTYQEIKEMNDALYDDKGKKL SYEDFREKVMAIHKDYNENYLRTEFETAETSGRRASEWQEFKENADIMPNLKYVTAGDER 
VRESHRILDGVVKPINDPFWLQNYPPNGYRCRCYVEQTDEPETPATPIVTIPDAFANNVG QSGEIFTVAHPYFSMPDNDLIKIRKETERNKIYAPYHRDPESKVMISDFADPKDLAKNVE SARVISKELKMKVKIRPHINEDGVKNPEYLIDEKLADLKNIQGLGGIKHGLDSSKKQQCE YTVFNLSAFDTVEPEMLKNKLNGIYKLYGEKYAGQRMVFIYKRKAVKVSWQDVVDGKATD LLKELQEQ >gi|226332204|gb|ACIC01000116.1| GENE 52 46246 - 46668 308 140 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570514|ref|ZP_04847922.1| ## NR: gi|253570514|ref|ZP_04847922.1| predicted protein [Bacteroides sp. 1_1_6] # 1 140 1 140 140 261 100.0 1e-68 MKYINMDDLTTIIQNRLLIESIEKEEEILAGIEDLVISEVCAYIGGRYDVGKIFGDPPIR TGLLVRVVACITACRAVSRNATRKVPDSLSGLNDWADGILVKLRDGIMTLPQDIPPVTDE DGNVQYPILYGHTRNGGWFL >gi|226332204|gb|ACIC01000116.1| GENE 53 46671 - 48209 1110 512 aa, chain - ## HITS:1 COG:no KEGG:HTH_0882 NR:ns ## KEGG: HTH_0882 # Name: not_defined # Def: phage uncharacterized protein # Organism: H.thermophilus # Pathway: not_defined # 37 484 23 454 470 119 26.0 3e-25 MKVEDSKALKEYQEKLKRVRCTGNLIDPDESLTVRMNRIQRAKRDVKYLVETYLPHYATA DCADFQIAHANRVMNDPIYKGYAEWGRGLAKSVWNDVIIPLWLWINGETHYMCIVSDTFD RACDLLEDLRAEFEANELLKHDFGEQYNPGYWEKGNFVTMNGFICKAFGAKQKVRGLRKG AHRPDLWIIDDLETPQTIKNNRMQDDYADWIEADVLATMTGKRRRLIGANNRFASRMVQT ILKQRHPDWDWHLVKAYDPVTYEPAWKSMYSAQFYRQQEKDMGILAAHAEYNHVPLVKGK IFKPEMVKWGKLPDLHTMNAIVAHWDIAYAGTDTSDFNACKIWGRHKNDFWLIDGFVKQS KMKLCVQWMCMKQAEFRAKGIICFWQYESQFWNDEVKRNIEEAEAETGVELNLVPIQTPK TMTKLLRMLSMHPYYQNGRMYVNELLKSNPDIAVGLKQLYAVEPGMTEHDDSPDADEQAV KKLEIYTDPPQSEDEPASRPWKAGRYKRKYTW >gi|226332204|gb|ACIC01000116.1| GENE 54 48209 - 48724 478 171 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570516|ref|ZP_04847924.1| ## NR: gi|253570516|ref|ZP_04847924.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 171 1 171 171 299 100.0 5e-80 MPSKEYYRKLKKEAHDLYVREGMTCKEISTRINVSERSVSSWINENDALWKKERQASVIS SQKQGDNLKQIINILADQKLELLRMIDEAIAEGDSDKVLELRKQAATLDNSVAQWGNQLK EVDKKNRITLAIYIDVMSRIFDAMKVYDADLYFKTLDFQENHLYEAAKMLG >gi|226332204|gb|ACIC01000116.1| GENE 55 48927 - 49988 892 353 aa, chain + ## HITS:1 COG:ECs2960_1 KEGG:ns NR:ns ## COG: ECs2960_1 COG0740 # Protein_GI_number: 15832214 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Escherichia coli O157:H7 # 10 173 79 240 244 74 31.0 4e-13 MNLTATAENGRARIELKGTISKWRETEAEFTSKIEQLIRSGIKDVHIYINSPGGECFEAN EIVNVIKKFPGKITGEGGALVASAATYIAINCTSFSMPANGLFMIHQVSGGACGRVADIE SALEVMRKLNEHYLNAFLSKCTDKKKIRDAWEKGDYWMSAQEAKENGFVTEVTGKAKVDK AMAQMITNCGYTGEIEITDSINNEKSKNDMDLTMLTTRFGMDASTTEAQFIAQVDVWKRK ADRVDMLERQEEARKEQEIENILNSAIKEKRITADVRDDWKANLTSNFDTAKKLLDAIKP VEMPEVHAPSLTDTTNKKFEDLQNDPEALKNIMEKNPAEYERLLNDYIKRNGK >gi|226332204|gb|ACIC01000116.1| GENE 56 50015 - 50950 776 311 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570518|ref|ZP_04847926.1| ## NR: gi|253570518|ref|ZP_04847926.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 311 1 311 311 628 100.0 1e-178 MAQPVDGLYLNKYVDPQLLIERRNYRADFMQVLGSVPAGALAADGVRRNKLINNVGFRVN NTEDFEPKQMTGKNYIVPWEIYDTEPSSCTDDEIRYLAFDKRAAIRVKHNEAFQVGIRNH VLHKLAPEDDSNEEMPVIRTTGEKDINGRLRLSYKDLVDFATLAKTWNLPVTDALYMVLS PLHMGDLLLDKDASKYFYDRTFYLDPATGKPKGFMGIKFFENNDCPFYNAETAKKVAEGT KPSAETDFQASTFFYAPNTYYHIESVKSLYRPETTDTRSKSPTSEYRTQTYGIVDRIEDF GVGAILSGKSV >gi|226332204|gb|ACIC01000116.1| GENE 57 50962 - 52185 1067 407 aa, chain + ## HITS:1 COG:no KEGG:Coch_0648 NR:ns ## KEGG: Coch_0648 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 7 406 6 386 390 76 22.0 2e-12 MGNFTGVIINKVNGGLVRDTDTSDRIILLVVGGSEIGKLEYYKPENLNDITDLEALGWDD TIDLENKELVHYHTSEVFRLSPERSLYLMLVPKSEKVSSLLTKEDFVNAVRTINGVNTIG ICSLTADETITVAVQEAQKMVNKFREDHLYIDAVILEGVGKYINAIADAVDLRKLDAENV SVVIAQDPARAAKDEAYRTHAAVGSALGMLSVRYVHENMGSVDIENHPRTAKGTKDYPLT DKLNGLWLDAALSNGKPFSQLSVSDQKKLTDQGYIFVGSFQGYAGFFFSNSCTCTEADSD YAYIEYNAVWNKAARIIRNTLLPRVRSKVKADPSTGYISNTTISSWDALVKSALETMVTS EDIADFDIYINPKQMAVSDKPFNIKVKLVADGIVHEFEIDLGFTNKI >gi|226332204|gb|ACIC01000116.1| GENE 58 52190 - 52630 581 146 aa, chain + ## HITS:1 COG:no KEGG:Coch_0647 NR:ns ## KEGG: Coch_0647 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 16 136 10 130 138 63 28.0 3e-09 MALLGTLINKFGKIAGWNSVKVVMLGRQVEGITALSYKDSKEKDNIYGAGEFPVGRGEGN YKAEASITLLKEEVNALQLTLGSGKRLTDIEPFDIPVMYEYKGLVMKDVIRNVEFTDNGV DVKQGDKSIATQFTLLPSHIDWNVAM >gi|226332204|gb|ACIC01000116.1| GENE 59 52659 - 53039 469 126 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570521|ref|ZP_04847929.1| ## NR: gi|253570521|ref|ZP_04847929.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 126 1 126 126 205 100.0 8e-52 MKEEEMKIKAGKPYEELTTEEKALIVDFTEEEHTELKLKYGKRLKHVTVQVDEDERYDYL IVRPKKNILLAMAKKKDDLEEANDILIRNCVAAGNMEALEDSTVYTSVLTAIGQLIAGQA AFISKA >gi|226332204|gb|ACIC01000116.1| GENE 60 53253 - 55181 1445 642 aa, chain + ## HITS:1 COG:no KEGG:XBJ1_3939 NR:ns ## KEGG: XBJ1_3939 # Name: not_defined # Def: putative tapemeasure protein # Organism: X.bovienii # Pathway: not_defined # 76 356 74 333 1102 103 26.0 2e-20 MQVTQWILELVDRITSPLHAATDAAEEATRVIDDTEEVVERLGETSGKTAGKLEGLGKGM FFLNQLKEGVDNIRDSFNDAIEPGVRFETAVAEMSGITNMEGKELDVLATKARNTAKAFG VDASNAMVVYKDLLSKITPELKKAPDALEIMSNNVMTLSKTMQNDVPGASAAMSTAMNQY KVSLDDPMKAAQTMTDYMNIMAAGTVEGSAEIREVAEALKQTGSVAKTFGVEFAETNSLI QLLDKSGKKGSEGGIALRNTIVKLQAPTTDAIKQLKAAGVNIKTMQNQSLSLTDRLRALT PVMHNATIMSALFGSENLASAMALIEGVDQIDTWTEAIQGSTSAVDMANKQMDTYAEKQK RMQAFIDDLKISFFEFVEPIAPAIEVVGIFVGALVTLGTVAWSISQIMSLGITKIAGVWV ASMAKMALSTIVGSRLISVAIMGIPVIGWIVAIITAVIAFVAFLYNKFEGVRVFLFGLWE VLKTGFLSFFKTIHTIQMGIIEILNPVNWFRDDWSIDDVFERVKKEVFDNAVAVGRAWEE GKEKGRESWRNKDKVPGLDKFQLDTAPAAVNKPTTVTTSTGGTSGKDVGLGGKGGSSVRN ITMNVTFNNHFRVAAGADMRDVADKVKREILAVITDTVPAIG >gi|226332204|gb|ACIC01000116.1| GENE 61 55186 - 55821 503 211 aa, chain + ## HITS:1 COG:no KEGG:Coch_0643 NR:ns ## KEGG: Coch_0643 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 92 204 67 180 198 62 34.0 8e-09 MTGNTALNIGALFTEVFGISSPIYLPWGRTLQDYDPGKYIGVTTIPDAEAEAYSWMGTPV IGTFTLDGNKQYSTYNPDGSRGTMNMASFPMPYATIVDFSRSMNCSKTKVLGVHGTVKEV YGLDDWKINIRGFCIADKSREGYKTVAEQVNALCKFRKVTEAVGVTGSIFNNKEIYSIVI DNISFNPIQGNSSVVPFTIEATSDNPYELTL >gi|226332204|gb|ACIC01000116.1| GENE 62 55818 - 56789 714 323 aa, chain + ## HITS:1 COG:no KEGG:Cpin_0289 NR:ns ## KEGG: Cpin_0289 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 22 300 15 307 331 86 27.0 1e-15 MSYMMCSRITFPANKRREELVIYTISSVHIESSWKMLTDSAEIVLPRRIKYFAGKDLKEL LSAGDQVKIELGYDSNLYTEFEGYISLIGWGVPVTIRCEDEMYNLKRKTVSYSAKNVTLK 
KLLADVAKGYEVKTNYDAELGAVRYSSRTVAEILNDIRKKTNLHCYFIGKVLYCGNVYSE KVDTEKVKIVLEKNAVSQNLNETNGEFQVKVVSIGAGGKKLEAKAGTEGSEVYNLTYNEK GKSVKVEDLKKFARDFYESLKKQKYRGGVELFGIPVVRHGITIDLKSEITPEMNGYYYVE KVTKDFSDDATYRQKLELGGRAE >gi|226332204|gb|ACIC01000116.1| GENE 63 56786 - 57289 533 167 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570526|ref|ZP_04847934.1| ## NR: gi|253570526|ref|ZP_04847934.1| predicted protein [Bacteroides sp. 1_1_6] # 1 167 1 167 167 310 100.0 2e-83 MTTDEQLRDALEKWREGARQAQLRWVTVDKVDKENKAMDVTGVIDQLEYYDVQLGMGALC IYPKPGTTCLVGIIEGQETDAFLISADEVDEIVLNGGTLGGLVKVGELTERLNLIEKDIN SLKQKLSGWTPVPNDGGSALKTALSSYISESLKETQVRDIENERVKQ >gi|226332204|gb|ACIC01000116.1| GENE 64 57286 - 57585 288 99 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570527|ref|ZP_04847935.1| ## NR: gi|253570527|ref|ZP_04847935.1| predicted protein [Bacteroides sp. 1_1_6] # 1 99 1 99 99 172 100.0 6e-42 MKGLLLDKDGDIRIVPHKGKDGLTGFVIGDTLIQNAAIVLELNQGELKEDPVLGANLIRY IRSQADKRAIEKQMKIHLKRAGIDYSELVDKINIEITND >gi|226332204|gb|ACIC01000116.1| GENE 65 57597 - 57896 270 99 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570528|ref|ZP_04847936.1| ## NR: gi|253570528|ref|ZP_04847936.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 99 3 101 101 158 100.0 1e-37 MKASNDLIKKFGVDKIIHGLIGMLILAVCVVASVFLFGVSFPSVLGGMVLGTVSAWLAGK WKESKDDVPDMADIRATVRGALLADVVILLVWIVFRLIL >gi|226332204|gb|ACIC01000116.1| GENE 66 57909 - 58157 231 82 aa, chain + ## HITS:1 COG:no KEGG:BDI_0444 NR:ns ## KEGG: BDI_0444 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 62 1 62 65 95 70.0 4e-19 MKRLHVQLWIAVFLSVSGMILLFCGFWVVPTGQIDNSVLVAYGEVSTFAGALFGVDYRYK CKYKKYIQGEDETENKEEKKDE >gi|226332204|gb|ACIC01000116.1| GENE 67 58150 - 58647 396 165 aa, chain + ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 46 156 2 104 116 86 43.0 2e-17 MNKPTYIIIHCSATREDKDFTEKQINDSHVARGFGKWGYHYYIRKDGRVIPMRAENEIGA HDNFIVPGEKTSYNRCSIGICYEGGLDKNGKAKDTRTDAQKKAMRELVQDICHRHDIIDI LGHRDTSPDKNGNGIVEKCEWMKECPCFDVKSEFTSFLPPVIVRP >gi|226332204|gb|ACIC01000116.1| GENE 68 58821 - 59150 185 109 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570532|ref|ZP_04847940.1| ## NR: gi|253570532|ref|ZP_04847940.1| predicted protein [Bacteroides sp. 1_1_6] # 1 109 59 167 167 144 100.0 2e-33 MIEMMQQMEFNWQKTNYSPPDSTGKQYPTSTETATGTSTRQEKETYNEQLQVQIQEIQET LLTLKEQLEKQERNDTKVVEKVAYIPPWAKAVIAAFFIAFVFFIYKNVR >gi|226332204|gb|ACIC01000116.1| GENE 69 59147 - 59425 251 92 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570533|ref|ZP_04847941.1| ## NR: gi|253570533|ref|ZP_04847941.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 92 1 92 92 158 100.0 9e-38 MKTVVQAGQTLLDIAVQEYGTIEAAFMLARTNDMGITDTLQAGQEIETPEKVYNSELADY CQRNSVCPATSETASNAVRLRIFTEQFTEQFK >gi|226332204|gb|ACIC01000116.1| GENE 70 59425 - 60273 659 282 aa, chain + ## HITS:1 COG:no KEGG:Coch_0637 NR:ns ## KEGG: Coch_0637 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 1 236 1 229 280 117 31.0 5e-25 MARTIAEIKKEMTDAYMSNSIIRDLYGITGDADFDSVFSPVSIESTLFYIFAATAHVIEQ MFDQFKSDVEERIDANIIPTVRWYHSSALAFQYGDPLVYDPEKYQFRYSVIDKTKQLVKY VAVKDRGGSIQILVSGDEGGLPCPLTGDVLTAFKSYMNSIKIAGVILSIQSIPADDIRIN ATIEVDPMVINASGIRLTDGSKPVLAAINDYLKGIEYGGKFNKTKLVDAIQRVEGVLDVE LGECAAKAASATEYNVIKNNNYTAVAGCFILNSLETSLTYVV >gi|226332204|gb|ACIC01000116.1| GENE 71 60263 - 60751 290 162 aa, chain + ## HITS:1 COG:no KEGG:Cpin_0295 NR:ns ## KEGG: Cpin_0295 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 19 157 1 139 141 67 33.0 1e-10 MWYDFDIIKYAQYVLRPSLRKRKIFAIISIFLLPLIFIYTLFKSYRKQAINKLNINGQVI YIEKVLNDRFFLKNREIYITDIAGKESYLYHRREEQIPSYLYKRGEGVGKKHIQQRGEGN YSGNYMVNIPSFLSTYENEIKNLIDYYKPAGRTYVLKIYEYE >gi|226332204|gb|ACIC01000116.1| GENE 72 60744 - 61586 396 280 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570536|ref|ZP_04847944.1| ## NR: gi|253570536|ref|ZP_04847944.1| predicted protein [Bacteroides sp. 1_1_6] # 1 280 18 297 297 543 100.0 1e-153 MNKLLFKEGGQPFYLDDLEFMQESTADVLKAICSGMKLGEKHILLSDPVSTEILGSNTVY TIVGNGYIVIGDEVYPIKPDILTVPTSQPVYWVVVQEKFQNEIFADNSEAQVYERRYVKL STTYTKSDMYVGRNDVVTFRNKILAIVTDYLDKTIIEKDMKAQLSLSEVVSGKAEIIYRA KQTGNETVYFNILAAANTGTSMIAPEVNGKRRLCTFDSSVKNISGVFSLSMSYADSWDNP QSMIVQLTFDNGNCYIASADGSPLVQMPASTILIEDTLKI >gi|226332204|gb|ACIC01000116.1| GENE 73 61593 - 67883 4258 2096 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570537|ref|ZP_04847945.1| ## NR: gi|253570537|ref|ZP_04847945.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 36 2096 1 2061 2061 4019 99.0 0 MATIYELKRRAQELSAKKDSLSISPEEVGGLIDETLDVINEAEKNQIGLGIRNTYTTVAE MNADSTSPVGSDGKPLKFGQIVTVYDENTPDAVDNGNIYAFQNPGWKLVSTTGNLSVYAK KEDVETAKNTADAAQKKANEAAESAKKANENIGKLSDNIGTESESEDGTVWGKLKSLSDD ADSTSQDVSSLMVDFVHHSTERFDEIVTDSSIVLEQSNAPTEDGKIVFLASLGKFACFVD NKYYPSWKGVDAYMNTDRTHPHENKIYLFGNKTYIYFAGALLSADSDAIQLAASADLAAK AAKKSAEDAQATASSALSLANKALSVINVNEICGGSVYSLSAAIAAITERENADNVTYRK PGIVLTYKIAEGEWESKQFAGSSLEGFATEANWTDFGGAGGDMTGKGAVLLVDEIAPLSN GYYILQTAIDALTAYETANETECIKPGVVIIYRTGKETFESKQLCASRADYNDLAAWNDF GSAAGGTVETDSEIIKDSVNPVAGGAVYDAMPVDVDGEQAEDGTVRVYMKDAEGFPLGDG FTFAVGTGGGGGVAGTIVYIYPQKTSLYAALGTDDLTIRLAILSRTGSGEMVSYNNIETL QLKDKSTGETLETFNVNRESSPSDTDYTFAISVKSYFSEAMNRKFVVVATDDGGNTAQKT ISVTAVNLKLSRVWALYKTLQEGSGLITMTDVFKLSSANKSTVTAHIKIGDEWKLISQTS VASTRSQDLQINVSSLGLKHGAYTVRIVAQDVESGVWSNYQFFDVMIVNPSSLMPIVALA HSEETETAWSVKKYANLNIEVACYDPSHVATDAHVEIHKVAKVANTSTGDNSETDTVLTT VSVGRNSTFNLSTRVDGFTIADNIRNTLGIYGKCGAGESNTIEYSVNSSVIDINGDSSYM IYFNPADKDNSDQDKSWLYGLYEMKQNGFNYSTNAFVTDKNEGKAFKVSDDATALCTYRP YNRTNIEQTGSTTIIKIKTQNAADPDANVVSCWDEANQIGWRITSKCVYFKALGTELIER YFKPGDIYEFAFVIEKANAEEDGKGYTKLYCDGDLIGASKYTAGQSAIKQSEQISFSGTA GELYMYRLLSWEKEMADEQINDEFVIGKSDTDEMIALNKKNDILTDNKIDLNKALEMCDC LVEMPHGDYKLETLDNVTDTSTKIYTDLYLFCKDKGMSLIFENVETTNQGTTSAFYPTYK NRKYKLKKAIIRAMYPERAPQALLDAIVNKKIILRGETIPFDKVCLKVNYASPDKVNTPI SRINNDMQKALGEEYMTPAQNAYYADENNTLDLRTSIDGNSVLVFKSDTGNINDAYFWCR GDWNIDKGNPPTFGFKDVPGYNADCLSYGDFTDLPDVTESYFMSHADDYDQDTIYMLSKS TDASYKFMEYVDGAWKNTTGTISFNGRKTVVTGRVLNPVECVEMLDYEGMCIFDDIDNFM TMQSTHSKWVKGLYGAELSTESLVPKWTMFFEFRTPDDDGMNLAYALGKKTPYHWKQFCE WVYSCNPKNRMAGGKISINGAQVSDTLENRYRKLVEEMDKYCSVASFRAYLVRILYHSGV DQLSKNSMWALYLCPDGIYRWYMNHDYDSDSTNGKNNSGIFKLPYNVMLDSVMEGENVFA GRMSVVWQGMWRYDQVGLAATAEKIRTSRLPGGESAFSYEAVLRESEEKDHLLIPAIVAC RDSVAKYITNPGGQAFNVISGMGIPYRHYYVSARYDFLDAYFGVSTILKADNMCMFRAIG EDINIEVTASEQWKLWAGFNTPAAQQGAWAEEDGSKVTFHFDGSNSSSAIYIIGASKIKS LGDLSTVNIDGTQAKDFTTLIRVEELVFGSKREGYANNSVTDLPLGEKPYMRLLNVENFK KLVSLDLTGATRLLRLLAYGSSLQIINFVGGCPVQYAELPTTMTQFKLMNLDKLSYKGLN 
ADTGIVVESMPNITTLRVENCPLIDVVKMIRDIIDSQEGNVVFRHIRITNRDFIGNGSEV LEILQLGIGGLDENGNQVEKPVLTGNYLLDEVIENSDIEAIRNGFEGLTVSTIIDAYIKI IDWFNAEAYGGEPYYPEVTLDNVGEIMDYYNGETYEEYLQRFAEENMDINDIVNSK >gi|226332204|gb|ACIC01000116.1| GENE 74 67885 - 68676 610 263 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570538|ref|ZP_04847946.1| ## NR: gi|253570538|ref|ZP_04847946.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 263 1 263 263 485 100.0 1e-135 MSTKEQSATLLRLNKQEQVKALQAVGFADITENSRASEFPNRIKWATGLLDMRVACNRIS DNSKWYFTREEWNSLTPANKLKFIRRGLCIRAHSQSFVIAAQECYAADLSSSFYWGGLGK AIDGLSAKMLGKMYTCFTGKEDTRLILDALKGTNSNGVEGAPAAEAAVAYKAFTLDGDGL EDDTEWFLPSSGQMMIMYRYRDQINEMLRAFWSSDSMFLTDKYYWTSTYYDTTNAWTCNL NTGHMNVQNKNTSLLHVRATAEE >gi|226332204|gb|ACIC01000116.1| GENE 75 68699 - 69511 536 270 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570539|ref|ZP_04847947.1| ## NR: gi|253570539|ref|ZP_04847947.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 270 1 270 270 513 100.0 1e-144 MTDKNNESALLLRMNKEEQVSVLQEIGFTNVNENTPASDIAKYIRWAGGLLDLSFATIRI EDGVNVFFTAEEWNSLSANNRSKYLRVGVRIRADRRQFVIAKSDCTDDIGGRTFKWGAYG TDIRGVKNYGNGNQGLYETADGKQNTDAIIEATAGVKDNSGVVGAPAAEAAKNYKACTLE LDGLEDKTEWYLPSEGELITIAKYKTEINELLSSVSGNQNIITADWYWSSTEYDASGAWS VNMCYGIVHTSSKTYAGRVRPVSAIDSLSL >gi|226332204|gb|ACIC01000116.1| GENE 76 69657 - 70037 232 126 aa, chain + ## HITS:1 COG:no KEGG:Rpic_2349 NR:ns ## KEGG: Rpic_2349 # Name: not_defined # Def: hypothetical protein # Organism: R.pickettii # Pathway: not_defined # 1 124 1 123 123 65 29.0 5e-10 MGKAEDRPVYQLMYRLLMLILDARDKFPKNYRYEFGTELMMSALRCCELIRYANSSLPRR VEYLNEFLVKFDTLKLLLRVCRDRKLINIQTTADIIEMITSIEKQIIGWRNFTASQEKTA SVKPEP >gi|226332204|gb|ACIC01000116.1| GENE 77 70052 - 71473 699 473 aa, chain + ## HITS:1 COG:alr3497 KEGG:ns NR:ns ## COG: alr3497 COG3344 # Protein_GI_number: 17230989 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Nostoc sp. 
PCC 7120 # 78 422 26 343 352 98 25.0 3e-20 MGEQSNLSIGHSPGDEPGKTKTVNAESSNAWYVNMNNGNVNTNNKTNAGRVRPVSATDKP IYDIPLSSIVHAFDVCCKNKRNTDDCIEFSFEYDTDLVAVWDAIRYGRYEPDYSKCFIRK KPVLREIFASAYVDRVIQHWADLRLDPILEERFQAQGNVSKNCRIGEGALSAVIYLNGMI MEVSENYTKDAYIFKGDFKSFFMSMSKSLLWEMIDLFIRDNYKGDDIECLLYILRTVIFH QPQYKCYRKSPLHLWDELPHDKSLFHADPDHGFAPGSLHAQKFANFIGSCFDYYVSEILG IKHYVRFVDDFAFVMRNKEDILNTVPLLDSYLKEQLLLKLHPKKIYIQHHSKGVLFVGAF ILPGRIYISNRVVGNLYDVISKYNKIAEEGFAEAHADKFVATLNSYYGLMRHFNTYNLRR KVAKRINPAWWEYFYVQGHWEVFVLKNEYNFKKQLKKQIRKGNAKKYLTPEIG >gi|226332204|gb|ACIC01000116.1| GENE 78 71439 - 71714 82 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570542|ref|ZP_04847950.1| ## NR: gi|253570542|ref|ZP_04847950.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 91 1 91 91 167 100.0 2e-40 MRKNILPRKLAKPIEQLSDGTWIIRYAIQSIDRTDNEGNELVTFASSIFLEKPTLEMIKK SIHRYAMSVLDDEDVLPLVANPDLSVYMIID >gi|226332204|gb|ACIC01000116.1| GENE 79 71810 - 72883 1042 357 aa, chain + ## HITS:1 COG:NMA0050 KEGG:ns NR:ns ## COG: NMA0050 COG0753 # Protein_GI_number: 15793081 # Func_class: P Inorganic ion transport and metabolism # Function: Catalase # Organism: Neisseria meningitidis Z2491 # 5 354 143 488 504 467 64.0 1e-131 MFFGREFFDLAIVHAIKRDPKTNMRSPNSNWDFWTLLPEALHQVTITMSPRGIPYSYRHM HGFGSHTYSFINADNQRIWVKFHLRTLQGIKNLTDQEAEAIIAKDRESHQRDLFESIEKG DYPKWLFQIQLMTEEEADHYRINPFDLTKVWLHKDFPLQDVGILELNRNPENYFAEVEQA AFNPINTVDGIGFSPDKMLQGRLFSYGDAQRYRLGVNSEQIPVNKPRCPFHAYHRDGAMR VDGNYGATKGYEPNSYGEWKDSPHMKEPPLKASGNGEIYNYNEREYDDDYYSQPGDLFRL MPADEQQLLFENTARAMGDSELFIKQRHTRNCYKADPAYGAGVAKALGIDLQEALKE >gi|226332204|gb|ACIC01000116.1| GENE 80 73158 - 76130 2736 990 aa, chain - ## HITS:1 COG:no KEGG:BT_1972 NR:ns ## KEGG: BT_1972 # Name: not_defined # Def: phosphoenolpyruvate synthase/pyruvate phosphate dikinase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 989 1 989 990 2012 100.0 0 MLSKYKLNQLYFKDTQFANLMTRRIFNVLLIANPYDAFMLEDDGRIDEKIFNEYTSLSLR YPPRFSQVSTEEEALTQLENMSFDLVICMPSTGDNDSFDIGRHIKEKYEHIPIVILTPFS 
HGITKRIINEDLSAFEYVFCWLGNTDLLVSIIKLMEDKMNLEHDVQEVGVQMILLVEDGI RFYSSILPNLYKFVLKQSQEFSTEALNAHQRTLRMRGRPKIVLARTYQEAMEIYRKYQNN ILGVITDVRFPKVERGEKDGLAGIKLCAAIRKNDPFVPLIIQSSESENSSYAAKYGASFI DKNSKKMDVDLRRIVSDNFGFGDFVFRNPETGEEIARVRNLKELQNILFAVPAESFLYHI SRNHVSRWLYSRAMFPVAEFLKPITWTSLQDVDAHRRIIFEAIVKYRKMKNQGVVAVFKR DRFDRYSNFARIGDGSLGGKGRGLAFIDNMVKRYPEFEEFENARVAIPKTVVLCTDVFDE FMETNNLYQIALSDADDDTILRYFLKAKLPDRLVEDFFTFFDVVKSPIAIRSSSLLEDSH YQPFAGIYNTYMIPYLDDRYEMLRMLSDAIKGVYASVYFRDSKAYMQATSNVIDQEKMAV ILQEVVGNQYGDRYYPSMSGVARSLNYYPLGDEKAEEGTVNLALGLGKYIVDGGMTLRFS PYHPNQVLQTSEMEIALKETQTRFYALDLKNAGHDFSIDDGFNLLKLHVKEAESDGSLRY IASTYDPYDQVIRDGLYPGGRKVITFANILQHDVFPLARILQLVLKYGEQEMRRPVEIEF AATLSREQDKTGTFYLLQIRPIVDSKEMLDEDLTLIPDEDVVLRSNNSLGHGVMNEIYDI VYVKTDGYSASNNQAIAWEIEKMNLQFLNAGRNYVLVGPGRWGSSDTWLGIPVKWPHISA ARVIVEAGLTNYRVDPSQGTHFFQNLTSFGVGYFTINAFMNDGVYNQDFLNAQPAVEETK FLRHVRFEKPMVVKMDGKKKLGVVLMPGIG >gi|226332204|gb|ACIC01000116.1| GENE 81 76462 - 77799 1498 445 aa, chain + ## HITS:1 COG:PA4588 KEGG:ns NR:ns ## COG: PA4588 COG0334 # Protein_GI_number: 15599784 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Pseudomonas aeruginosa # 2 445 4 445 445 538 58.0 1e-153 MNIQKIMSSLEAKHPGELEYLQAVKEVLLSIEDIYNQHPEFEKVKIIERLVEPDRIFTFR VTWVDDKGEVQTNLGYRVQFNNAIGPYKGGIRFHASVNLSILKFLGFEQTFKNALTTLPM GGGKGGSDFSPRGKSDAEIMRFCQAFMLELWRHLGPDMDVPAGDIGVGGREVGYMFGMYK KLTREFTGTFTGKGLEFGGSLIRPEATGFGGLYFVYQMLQAKNIDIKGKTVAISGFGNVA WGAATKATELGAKVITISGPDGYIYDPDGISGEKIDYMLELRASGNDIVAPYAEKYPGAT FVEGKRPWEVKADIALPCATQNELNGEDAQNLIKNNVLCIGEISNMGCTPEAIDLFIEKK IMYAPGKAVNAGGVATSGLEMSQNAMHLSWSAAEVDEKLHAIMHGIHAQCVKYGTEPDGY INYVKGANIAGFMKVAHAMMGQGVI >gi|226332204|gb|ACIC01000116.1| GENE 82 77952 - 79115 1239 387 aa, chain + ## HITS:1 COG:MA4232 KEGG:ns NR:ns ## COG: MA4232 COG0006 # Protein_GI_number: 20093022 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Methanosarcina acetivorans str.C2A # 23 384 20 385 388 
167 31.0 3e-41 MLQPELKLRRDKIRSLMVSQGIDAALITCNANLIYTYGCVVSGYLYLPLHSPALLFFKRP NNITGEHSFPIRKPEQIVDILKEKGLPLPTKLMLEGDELPYTEYVRLAGLFPDAEVVNGT PLIRQARSVKTAIEIEMFRRSGMAHAKAYEQIPFAYYPGMTDIEFSIEIERLMRLQGCLG IFRVFGRSMEIFMGSVLTGDNAGYPSPYDFALGGRGLDPALPGGADKTPLKEGQSVMVDL GGNFNGYMGDMSRVFSIGKLSEEAYTAHQVCLDIQEKIASIARPGIPCEMLYNTAIEMVT QAGFADKFMGTGQQAKFIGHGIGLEINEAPVLAPRIKQELEPGMVFALEPKIVLPGVGPV GIENSWVVTNEGVEKLTNCNEEIIELS >gi|226332204|gb|ACIC01000116.1| GENE 83 79253 - 80755 1250 500 aa, chain - ## HITS:1 COG:MT4026 KEGG:ns NR:ns ## COG: MT4026 COG0617 # Protein_GI_number: 15843539 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Mycobacterium tuberculosis CDC1551 # 50 473 34 457 480 233 34.0 8e-61 MDRPTGVFINYLTVCCQSLSIKKMIELTQEELKQYFSEPIFGHIAETADALGLECYVVGG YVRDIFLQRPSKDIDVVVVGSGIAMAEALGKRLGRGAHVSVFKNFGTAQVKYKGTEVEFV GARKESYQRDSRKPIVEDGTLEDDQNRRDFTINAMAVCLNKARFGELVDPFGGMADMKEK TIRTPLDPDITFSDDPLRMMRCIRFATQLNFYIDDDTFESLCRNKDRIEIISRERIADEL NKIILSPIPSKGFVELERSGLLPLIFPELAALQGVETRNGRSHKDNFYHTLEVLDNISKK TDNLWLRWAALLHDIAKPATKRWEPKAGWTFHNHNFIGEKMIPNIFRKMKLPMNEKMKYV QKMVSLHMRPIVIADDVVTDSAVRRLLFEAGDDIDDLMTLCEADITSKNMERKQRFLNNF QLVRQKLKDLEEKDRVRNFQPPVSGEEIMETFGLQPCREVGALKSAIKDAILDGVIPNEY EAAHAFMLERAVKMGLKPVK >gi|226332204|gb|ACIC01000116.1| GENE 84 80984 - 81826 1034 280 aa, chain + ## HITS:1 COG:no KEGG:BF3641 NR:ns ## KEGG: BF3641 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 280 1 280 280 360 67.0 3e-98 MMKKLIIFFCGTLALTACGNGIEKKANEKLTIARAAYERGDYEEAKTQIDSIKILYPKAF EARKAGQELMLDVELKAQQEILAFLDSALQAKQAAFDAIKGKYTLEKDAEYQQVGNYIWP TQAIEKNLHRSFLRFQVSEQGIMSMTSIYCGAGNIHHVGVKVMTPDGSFAETPTSKDSYE TSDMNEKIEKADYKLGEDGNVIEFLNLNKDKNIRVEFIGDRRYTTTMSPTDRQAVAGVYE LSKILSAMQQIKKEQEDANLKIGFINKKKERKAMEEAAEE >gi|226332204|gb|ACIC01000116.1| GENE 85 82348 - 83337 1185 329 aa, chain - ## HITS:1 COG:no KEGG:BT_1977 NR:ns ## KEGG: BT_1977 # Name: not_defined # Def: hypothetical 
protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 329 1 329 329 624 100.0 1e-177 MKQGKIYMKAWLDLHGRAKVLATDHWYLEFANLLLPVVSESYLYKSETQESQNQVTLMLT LYLEDCVTDGGNWRQFIRWHKRNYGRYLPFYELSEGYLTDEINKEDIAFLLWGINSPVGD DFDGVENPLDADLLEFADVIYAQLEDVFEKAPISDGLAGDWLMESELMEKQRTALPVASP GDQLPVNVERFLKASGGEPLMFFDSYEALKLFFVQALQWEDEEDALLPDLKEFSDFVMYA NPKGLLIGPDVARYFADKRNPLYNAEMAEEEAYELFCEEGLCPFDLLKYGMEHDLLPEAQ FPFENGKELLHENWDFIARWFLGEYYEGE >gi|226332204|gb|ACIC01000116.1| GENE 86 83406 - 84011 720 201 aa, chain - ## HITS:1 COG:PA0966 KEGG:ns NR:ns ## COG: PA0966 COG0632 # Protein_GI_number: 15596163 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, DNA-binding subunit # Organism: Pseudomonas aeruginosa # 1 199 1 198 201 108 35.0 5e-24 MIEYVRGELAELSPATAVIDCNGVGYAANISLNTYSAIQGKKNCKLYIYEAIREDAYVLY GFADKQEREIFLLLISVSGIGGNTARMILSALSPAELVNVISTENANLLKTVKGIGLKTA QRVIVDLKDKIKTMGATVAGGSASAGMLLQSASVEVQEEAVAALTMLGFAAAPSQKVVLA ILKEEPDAPVEKVIKLALKRL >gi|226332204|gb|ACIC01000116.1| GENE 87 84190 - 85089 1271 299 aa, chain + ## HITS:1 COG:no KEGG:BT_1979 NR:ns ## KEGG: BT_1979 # Name: not_defined # Def: meso-diaminopimelate D-dehydrogenase # Organism: B.thetaiotaomicron # Pathway: Lysine biosynthesis [PATH:bth00300] # 1 299 1 299 299 595 100.0 1e-169 MKKVRAAIVGYGNIGHYVLEALQAAPDFEIAGVVRRAGAENKPEELANYAVVKDIKELEG VEVAILCTPTRSVEKYAKEYLAMGINTVDSFDIHTGIVDLRRTLDATAKEHKAVSIISAG WDPGSDSIVRTMLEAIAPKGITYTNFGPGMSMGHTVAVKAIDGVKAALSMTIPTGTGIHR RMVYIELKDGYKFEEVAAAIKADPYFVNDETHVKLVPSVDALLDMGHGVNLTRKGVSGKT QNQLFEFNMRINNPALTAQVLVCVARASMKQQPGCYTMVEVPVIDLLPGDREEWIGHLV >gi|226332204|gb|ACIC01000116.1| GENE 88 85185 - 85490 115 101 aa, chain + ## HITS:1 COG:no KEGG:BT_1981 NR:ns ## KEGG: BT_1981 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 101 1 101 101 176 99.0 2e-43 MKRLFIFSTFLLCLLSTTFSQQSGNTIKPNLKYGKPSKEELTMTSFAPDTTATAYADIQI PYYSNGRNPTLKENVSQLALFLDFRKQVAKQFNNKIIIKRI >gi|226332204|gb|ACIC01000116.1| GENE 
89 85487 - 85816 230 109 aa, chain + ## HITS:1 COG:no KEGG:BT_1982 NR:ns ## KEGG: BT_1982 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 109 1 109 109 185 100.0 5e-46 MKIYKKKELPRHEEEEFSVIVLTCDEHGETNLGFYNFDTEDWSFLSEADQSYRDSEFVWI YPPANQMKEFLIKSIKAERESAPPSSKENWFRKKTQPIKKKITLAFKKK >gi|226332204|gb|ACIC01000116.1| GENE 90 85858 - 86508 526 216 aa, chain + ## HITS:1 COG:XF0175 KEGG:ns NR:ns ## COG: XF0175 COG1272 # Protein_GI_number: 15836780 # Func_class: R General function prediction only # Function: Predicted membrane protein, hemolysin III homolog # Organism: Xylella fastidiosa 9a5c # 10 212 13 209 214 115 37.0 7e-26 MENKRYNNVEEWANTLSHGAGILLGVIAGYFLLAKAAAGAEPKWAVACVTVYLFGMLSSY VSSTWYHGSRPGKLKELLRKFDHGAIYLHIAGTYTPFTLLVMRHAGGWGWGIFSFVWLSA IVGFILSFKKLKEHSNLETICYIAMGACILVAMKPLMDHLAEMGAGPAFWWLIGGGASYI IGAVFYSLRKPYMHATFHLFCLGGSIGHIIAIWLIL >gi|226332204|gb|ACIC01000116.1| GENE 91 86542 - 86763 245 73 aa, chain - ## HITS:1 COG:MJ0272 KEGG:ns NR:ns ## COG: MJ0272 COG1476 # Protein_GI_number: 15668446 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Methanococcus jannaschii # 3 70 11 78 79 70 47.0 6e-13 MENKELLNKIKVYRAMKNISQEELAIAIGVTRKTINTVETGKFIPSTVLALRIARYFGVP VEEIFVLNDEASY >gi|226332204|gb|ACIC01000116.1| GENE 92 86753 - 87217 308 154 aa, chain - ## HITS:1 COG:no KEGG:BT_1985 NR:ns ## KEGG: BT_1985 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 154 1 152 152 271 100.0 5e-72 MVMENITIEKMEIKRFNALLMRTIVGMLFMGVLMWETFCLDTGRKGFLGETWYIFAFAVL LYVVVAIRASDILERIKKDKKLMGALDSEIYSDYNSKALTAGFYAAMQMGLLVYCFGDFF NLSVRMGALVIVVVAMLFSEIRRLLLYNPYKDGK >gi|226332204|gb|ACIC01000116.1| GENE 93 87411 - 88112 720 233 aa, chain + ## HITS:1 COG:MTH608 KEGG:ns NR:ns ## COG: MTH608 COG0120 # Protein_GI_number: 15678636 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase # Organism: Methanothermobacter 
thermautotrophicus # 37 222 22 207 226 154 45.0 1e-37 MNWENHLIKDLQWSNEIINREAKERVAREIAATAKTGDVIGAGSGSTVYLTLFELAKRIH EEHLYLEVIPASQEISMTCIQLGIPQTTLWDKRPDWTFDGADEVDPDRNLIKGRGGAMFK EKLLIRSSGKTFIIVDPSKLVNILGSKFPIPVEVFPHALSYVEKELRRLGASEISLRPAH GKDGPILTENGNFILDTRFHYIDASLEEQLKTITGVIESGLFINYDIEVVVTR >gi|226332204|gb|ACIC01000116.1| GENE 94 88293 - 90734 1979 813 aa, chain + ## HITS:1 COG:no KEGG:BT_1987 NR:ns ## KEGG: BT_1987 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 813 1 811 811 1635 98.0 0 MEKQSLPQTENLQFQFSFFRNTWSKESTIITLHNLYLQTIGTLWKAQTECYRKLQDRPDR SNEAKMVKDAMPVVIIEGICRPHCSHAAANLEKMSRLAMYDLDHLNERTSAVKALFRALP YVAYTQTSISDRGLKVVVFLDARTPEEYPLAYAICQQTLERIAGHPCDEQCTRITQPCSC VWDADAYYNPTPEPYPWREELDTDPSLTNLIKDKRNLPGSNGNPYATSNGSSSPFPPATE ACGYIETFARNFTHYHPWQKGNRHESMLAMGRSARRKGFSKEDLEKLTSVMSVEIVRNGY TLQELRKDLSAGYQYVDLSYVPQETHNSLPTLTTDTFRPVSVGNESGKEEELSIKNEELR GSTPCIPNQVYDHLPDFLKRAMKPARSKRERDILLLGLIANLSGCLPQVRISFDQRPYSP QLYTLIIAPPATGKGLLTLANMLPREIENYLKGENKRKKDAYDRELTEWERTNQQIKKGT KSTAASPAPMPEPPEYYHLCGAPSTSRNQIIRRLKINGDLGLIICATELDMISGAIKQDY GKHDDVFRAAYHHEPVATDYKVDGELICAEVPRLALCLSGTPSQLANFIRSLENGLYCRF GIYTCGARWKYRSAAPIKGQEDYITLYKGFGKEVLEMYFFFQQSPTEVTLSDRQWEEHTA YFDCLLNEVASEQADAPGSIVLRGALMVARIASIFTALRKYEGAMQMKEYICTDEDFHAS MQIVQTILNHSLLVASSLPGEKMKPQPMQSYFRIRSVVESLPRIFTYKQIKEKALTEGIS ERSTCRYLKSLIELKYIEKQTDKYIRIKEFTDK >gi|226332204|gb|ACIC01000116.1| GENE 95 91060 - 91296 315 78 aa, chain - ## HITS:1 COG:no KEGG:BT_1988 NR:ns ## KEGG: BT_1988 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 78 1 78 78 142 100.0 4e-33 MKELIEEKKFEMRAYDKVELALLYCPGRSAESALKTLMRWIKQCQPLMQALGGIGYNVRR HRFLRQEVEQIVRHLGEP >gi|226332204|gb|ACIC01000116.1| GENE 96 91522 - 92604 373 360 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 [Haemophilus parasuis 29755] # 9 349 5 327 339 148 29 1e-34 
MENQNHAFNYAYLLKQVKARVLLAQKKAIYTANEEMLSMYWDIGKLLSESQKQIGWGNNA LEKLANDLKNDYPEMKGFSVRNCQFMIQFYNEYNQQLTNTKRAVSYLNNNEIVLPISKLG WSHNITLMQRTKDIKARYWYMIQCLKNGWSRDFLIEAIKQDYYHSHGALASNFDTTLPEI QAKQVKETLKDPYIFDMLTFTEEYDERDIELGLIKHIEKFLVEMGAGFAFMGRQYHIEVS EKDFYIDILMYNAFMHRYLVVELKRGEFQPEYIGKLNFYCSAVDDILCREGDNPTIGLLL CQNKDQIMAEYALRDVHKPIGISDYELGQALPKDIKSGLPSIEELENKLSRDLQDIDNKE >gi|226332204|gb|ACIC01000116.1| GENE 97 92906 - 93391 633 161 aa, chain + ## HITS:1 COG:no KEGG:BT_1990 NR:ns ## KEGG: BT_1990 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 161 1 161 161 295 100.0 3e-79 MAKVFSFTSVLRKNMMEKDKPDLYYALAKSSGEIDIDEMAERIQRSSTVNWADVVCVLRA LQTEMIDSFKKGEIVRLGNIGSFYVTLRSNGVLVQKDVKEGLIKGARVRFRPGKEIKDAL KTLDFSKYKPENSEGDKTDPENPKPKPDPDDGDEEAPDPTV >gi|226332204|gb|ACIC01000116.1| GENE 98 93431 - 93871 350 146 aa, chain + ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 46 142 2 98 116 106 49.0 2e-23 MRKIDLIVIHCSATRADRTLTAFDLETLHRRRGFNGTGYHYYIRKDGTTLLTRPIERIGA HAKGFNASSIGICYEGGLDCRGRPADTRTPEQRAALRLLVHQLEQRFPGCRVCGHRDLSP DLNGDGEIEPEEWIKACPCFEVKESL >gi|226332204|gb|ACIC01000116.1| GENE 99 93956 - 94057 74 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAMKKSLWDMILKVVIAVASAVAGVLGANAMSL >gi|226332204|gb|ACIC01000116.1| GENE 100 94170 - 97301 1615 1043 aa, chain + ## HITS:1 COG:no KEGG:BT_1992 NR:ns ## KEGG: BT_1992 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1043 1 1043 1043 1954 98.0 0 MKYILFTIAFLLASNCLKAQGQFEKVIGLTGAQGFPQPGVSNANTQGKIPVGLYTGTPNI SIPLYEFKLRDNIQLPINLNYHIYNVKPNNLPSEVGLGWSLECGGCITRIIKNEPDISYE SSSNEYKPITTEADLLTTADILVRVSGNYINTQDEYQFNFLGYTGSFMYSQEKSKWMVQS DSDIKIEFTSNTYNNTRSQLTSPLSQFYNYCRSEGTGFKNPLSCWLIDSFTLTTPDGYKY IFGGTDKTDYNLPFKGFLNLPAPITWHLSKIITPAGHEIEFTYEIMPFQINGNMSFCISL 
DALFWQTAMSYDYELLAPVQLATVKDVTDNKILARFHYSPSTQLPYDSQYAWETCMDHGP ATFFTKEKNFTLNKLNSVVILDKINYQFTYTNSSTERLKLKTLTKTTPSGTQSTYSLNYF PNHLPGYNTGHYDNLGFNNGENFSYYFSKEFFENAIFADKQIAEGKEYTNKRMGDKGGFR VTAEMLKSITYPTHGRTEFIYEPNVISSMVSADRKTVQSAHLPYPGTPDYTYPGGLRIKE INNYDSNDELLTRKHYYYTKEFTPTTKGGVSSGILSFTPQYLWGWQLYNLLKSQNGGPEY YTLNAIMSQASNPLWYNSRGEYIGYSKVIECNEDKNGKLIDGYTVHTFSNFGPGYMDEDP IAMLNNKFSREYPPHVGTPYSPYTPCSSNALKRGMLLSKEQFDCAGHVKQKELFEYTPIQ KDSILITEITTTNVMDYNSDDPTLGFLRFAFGGTYYQKFYSNLLSEKRTITYDDNGNTIE YKDKYEYNSVNKQIKLKTSEDGAGNVYEEKTRYVPDMLIFPFVPPYSSFYQMNQLHFTDY PLEVTKIKNGKVTENETYFYKLLTADSKSLVKDKVSILGKHADAATYQGLHNVGNELVAD VSNIPATTYLAYDSYSNPTHIRNEKDKTETVYLYGYKGKYAIAEIKNSDYESVTGLLGND LIKRLADATKPSYSDMQKVENLRTQLPASFITTYEYIPYIGISKIRDPKNVSTYFKYDDS GRLIEKTDHKGELISSYKYSNNL >gi|226332204|gb|ACIC01000116.1| GENE 101 97314 - 101264 1844 1316 aa, chain + ## HITS:1 COG:RSp0477 KEGG:ns NR:ns ## COG: RSp0477 COG3209 # Protein_GI_number: 17548698 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Ralstonia solanacearum # 638 1105 129 621 741 62 23.0 5e-09 MKIIIYITALICSLLASPVLATETGQISMTDMGLKVTPSPESTLYEGDTWMITARNNFSL NEAEIKNFVPSRISEELRPYVRLFKVEFTSNSDGICNLYFHITADPTDTIRFVIKESPQD FPIEIMPAESRHGEISLSDLGLRVTTSIDSHLEYGGQWIVTMENKNHLFTAEIERLVPSR ITEDLRRCLRLTGVGISDISSEFVVLFFDVTFETTFPVHFIIKQSPQNIPIEILPIGRDN YISTTNYLDNTRSGGRKEITYYNGLGYPSQKIEVGTSNKNKSIYHPIVYDNLLRDDAKSY LPYAQGSNTEMIDWKFTESLQDFYAFNYKEYTYHFSEKVYEDSPLGRVREVYGPGSDWRS QKASTKFSYLSNIAGNDTLNCLRFEMTEKDNQNMELKVAGNCKSSELSVTRTENEDRQVT LEFKNKFDQVVLSRQIEHNNGSKTNNDTYYLYDEFDNLKAVLPPMVSAQLVSGSSYSSQT SASLAQYAYLYKYDMRNRCIGTKLPGCSWEYKVYDLADRLIFSQTGEQRKRGEWQFALPD AMGRECIMGICKNAIDPFNNPILNTCVKCERTDNASLLGYSVTGVALNSVTVLTAKYYDD YLFKMQNGTLISNSSLNYEANSEFGERYTTSSQGLQTGNATARLDKNGSVTGYDYTATYY DYNGRAIQTKSTNHLGGYEKTYTAYNFTGQPVHTKHIHSKSNSELIEETKHSYNHAGRLS ETVQIVNEKDSTKMSYLYDDLGNIKSLTRIDGTSTLTTTNTYNIRNWLTSIESPLFSQTL HYTDGPGTPQYGGNISSMTWKANKESITRNYQYSYDKLNRLTSANYSEGSNSMTNTNHFN EQVTGYDKHGNITGLKRYGQTGQSSYGLIDDLSLTYTGNQLKKVTDSATSSAYANGFEFK 
DGVNLDTEYSYDEDGNLTKDLNKNISDIQYNFLNLPRRIQFKDGSEISYLYSADGTKLQT THIIAGNTTTTDYCGNVIYENGSRDKLLTEQGYFSIADKKFHYYIQDHQGNNRVVASQDG MIEEVNHYYPFGGIFASNSSVQPFKYNGKELDTKNGLNLYDYGARQYDPVLGRWHTMDLM TEKYYKISPYTYCLNNPILLVDPNGMWPTWGGISRGLSNVFKGTLSFTNGAARAMADNIL LGQTSLRETGIYSNASAYNAGQDVGDIISIFAGAAEIVNGFEEAAGGMALSPETAGISLG VTAKGVYDITHGSLMGTSGFMKLFSKKGRVSEGSNNGSGYSKSSGKNEKHSNIDKRQQAA NDYSTAKEKYDNLKSKTNKTAKEKKELETLKKQRDHYKKQMDYAGDHDHQRGRGGN >gi|226332204|gb|ACIC01000116.1| GENE 102 101221 - 101553 293 110 aa, chain + ## HITS:1 COG:no KEGG:BT_1994 NR:ns ## KEGG: BT_1994 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 110 1 106 106 187 100.0 1e-46 MQETMTIKEVEEVIKMKLFTFIADYKGGTYISQYVAKTLEEALNLWINNVDFFTDKQLKS FKKNIKYGDIDPPTKLNGLSNVWCTCFIVFSTLLLLNIVETVKKTTNKTS >gi|226332204|gb|ACIC01000116.1| GENE 103 101623 - 102264 258 213 aa, chain + ## HITS:1 COG:no KEGG:BT_1995 NR:ns ## KEGG: BT_1995 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 213 1 194 194 375 91.0 1e-103 MKTFTFLVNANDKHCFVHQYQSEMPTNDLFRELVHDTSHISSKMKSLLIRESLECYHDNG PVKLDCVKNVWRDFFLIDDLISLYNANIAKKYHEGCGDRKYYKLMYLYDVNMIETDMSDT VFPLSEKATFTFITYIRNYNASYQFEVTTLEEGLMLWATNIDILNRQQRKVLLKYIQKSK NNPIAVEGVKNVWSTSYRIFRPLLTLHIVKTVS >gi|226332204|gb|ACIC01000116.1| GENE 104 102471 - 102875 335 134 aa, chain - ## HITS:1 COG:no KEGG:BT_1997 NR:ns ## KEGG: BT_1997 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 134 1 134 134 174 100.0 6e-43 MKILKDLHTGRKKFSRVERMHFLQELEEELKVYYLSYTETVDRPYSEQEQEIIVRIIKML IRQLRGTSKFQLFSRRGTRRKSEDPDDVEEREERRKEQTDEIRLKHTNLYSYNPLSKKYK ARKEELLRNLKREF >gi|226332204|gb|ACIC01000116.1| GENE 105 103332 - 105725 2492 797 aa, chain + ## HITS:1 COG:CAC1209 KEGG:ns NR:ns ## COG: CAC1209 COG1328 # Protein_GI_number: 15894492 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: 
Clostridium acetobutylicum # 8 797 5 690 699 414 35.0 1e-115 MNYAEICIIKRDGKREDFSISKIKNAIGKAFSATGIQNEQQLIADVTMSVISQFTNPTIT VEEIQDLVEKALMKVRPEVAKKYIIYREWRNTERDKKTQMKHVMDGIVAIDKNDVNLSNA NMSSHTPAGQMMTFASEVTKDYTYKYLLPKRFAEAHQLGDVHIHDLDYYPTKTTTCIQYD MDDLFERGFRTKNGSIRTPQSIQSYATLATIIFQTNQNEQHGGQAIPAFDFFMAKGVAKS FRKHLASFINFYVAMENGTQADEKAIRTLIKEHLPSIKSTEAERETLRIALIALQIIIDK EHLTRIAEKAYQQTKKDTHQAMEGFIHNLNTMHSRGGNQVVFSSINYGTDTSAEGRLVIE ELLKATIEGLGTRGEVPVFPIQIFKVKDGVSYSEKDFEKAMKAENIEDAMRGTYEAPNFD LLLRACQTTSKALFPNFMFLDTPFNKNEKWKADDPKRYIYELATMGCRTRVFENVAGEKS SLGRGNLSFTTLNMPRLAIEARIKAENLIEGERNKDAIEQKAKEIFMESVRNMATLVADQ LYERYQYQRTALARQFPFMMGNDVWKGGGALNPNEQVGDALRSGTLGIGFIGGHNAMVAL YGEGHGHSQKAWDTLYEAVMEMNKVVDEYKEKYNLNYSVLATPAEGLSGRFTKIDRRKYG KIPGVTDRDYYVNSFHVDVKEPISITEKIRCEAPFHAITRGGHITYVELDGEAQKNVRAI AKIVKVMHDEGIGYGSINHPVDTCHNCGYKGVIFDKCPVCQSENILRMRRITGYLTGDLS SWNSAKRAEEKDRVKHL >gi|226332204|gb|ACIC01000116.1| GENE 106 105838 - 106332 277 164 aa, chain + ## HITS:1 COG:PM0941 KEGG:ns NR:ns ## COG: PM0941 COG0602 # Protein_GI_number: 15602806 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Pasteurella multocida # 11 160 1 154 158 126 41.0 2e-29 MNTLNSPLSVLHLLSTYPETIVDGEGIRYSIYLAGCSHHCVGCHNPESWNPRAGELLTEE RIQSIIREIKANPLLDGVTFSGGDPFYNPEAFLLFVKRVKEETGLNIWCYTGYTYEEIQA NPRLKAVLDYIDVLVDGRFEQALFSPYLEFRGSSNQRILRIGNK >gi|226332204|gb|ACIC01000116.1| GENE 107 106546 - 107958 921 470 aa, chain + ## HITS:1 COG:YPO1712 KEGG:ns NR:ns ## COG: YPO1712 COG0477 # Protein_GI_number: 16121972 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 13 461 6 454 455 478 59.0 1e-135 MIQANNTTSIDETDGLPMPRRIWAVVSIGFALCMSVLDVNIINIVLPTLSHDFGTSPAVT TWIINGYQLAIVVSLLSFSSLGEIIGYRKVFLSGIGLFCITSLICALSDSFWTLTIARIF 
QGFSASAITSVNTAQLRYVYPKSQIGRGMGINAMVVAISAAAGPSVASGILSIASWHWLF AINVPLGITALLLGIKHLPRQEERSKRKFDFVSAIANAVTFGLLIYTLDGFAHHEEMDFL VLQLIILAIVGTFYVRRQLTQTTPLLPLDLLRIPIFRLSILTSICSFIAQMAAMVSLPFF LQNTLGHSEVMTGLLLTPWPLATLVTAPLAGYLVERIHPGILGSIGMALFAVGLFSLSGL TAESSDISIILRLMLCGAGFGLFQTPNNSTIISSAPTKRSGGASGMLGMARLLGQTFGTT LVALLFSFVVHDRSTAVCLMVGSGFAVVAAIVSSLRLSQPSTLKRETNQS Prediction of potential genes in microbial genomes Time: Thu May 12 02:17:31 2011 Seq name: gi|226332203|gb|ACIC01000117.1| Bacteroides sp. 1_1_6 cont1.117, whole genome shotgun sequence Length of sequence - 63990 bp Number of predicted genes - 62, with homology - 57 Number of transcription units - 30, operones - 10 average op.length - 4.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 17/0.000 - CDS 189 - 1544 1187 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1 - Prom 1603 - 1662 5.1 2 1 Op 2 . - CDS 1669 - 2841 1228 ## COG0743 1-deoxy-D-xylulose 5-phosphate reductoisomerase 3 1 Op 3 . - CDS 2905 - 3762 844 ## COG0739 Membrane proteins related to metalloendopeptidases 4 1 Op 4 . - CDS 3775 - 4314 496 ## COG0806 RimM protein, required for 16S rRNA processing 5 1 Op 5 . - CDS 4311 - 5615 1216 ## COG0766 UDP-N-acetylglucosamine enolpyruvyl transferase - Prom 5663 - 5722 4.0 - Term 5682 - 5720 -0.7 6 2 Tu 1 . - CDS 5749 - 6363 752 ## BT_2006 hypothetical protein - Prom 6384 - 6443 5.2 7 3 Tu 1 . - CDS 6485 - 7180 660 ## COG1214 Inactive homolog of metal-dependent proteases, putative molecular chaperone - Prom 7200 - 7259 3.4 + Prom 7148 - 7207 6.7 8 4 Tu 1 . + CDS 7360 - 9069 730 ## BT_3478 integrase + Term 9164 - 9209 1.2 - Term 9152 - 9195 0.8 9 5 Tu 1 . - CDS 9218 - 9667 390 ## gi|253570581|ref|ZP_04847989.1| predicted protein - Prom 9693 - 9752 5.2 10 6 Tu 1 . + CDS 10011 - 10613 489 ## gi|253570582|ref|ZP_04847990.1| conserved hypothetical protein + Term 10640 - 10679 5.6 + Prom 10926 - 10985 4.0 11 7 Tu 1 . 
+ CDS 11051 - 11215 214 ## gi|253570583|ref|ZP_04847991.1| predicted protein + Term 11302 - 11363 3.6 + Prom 11237 - 11296 3.0 12 8 Tu 1 . + CDS 11394 - 12413 644 ## BF3462 hypothetical protein + Term 12441 - 12494 5.4 + Prom 12426 - 12485 3.0 13 9 Op 1 . + CDS 12516 - 13148 378 ## BT_1133 hypothetical protein 14 9 Op 2 . + CDS 13202 - 13648 258 ## BDI_2238 hypothetical protein + Prom 13724 - 13783 5.7 15 10 Tu 1 . + CDS 13850 - 14524 511 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs + Term 14530 - 14579 -1.0 16 11 Op 1 . + CDS 15489 - 16307 367 ## COG1342 Predicted DNA-binding proteins 17 11 Op 2 . + CDS 16307 - 16681 140 ## BDI_3898 hypothetical protein + Term 16752 - 16816 7.2 + Prom 18066 - 18125 3.8 18 12 Tu 1 . + CDS 18342 - 18551 119 ## + Prom 19087 - 19146 4.6 19 13 Op 1 . + CDS 19171 - 19443 119 ## 20 13 Op 2 . + CDS 19451 - 19795 117 ## gi|189462837|ref|ZP_03011622.1| hypothetical protein BACCOP_03536 21 13 Op 3 . + CDS 19807 - 20007 260 ## gi|253570590|ref|ZP_04847998.1| conserved hypothetical protein 22 13 Op 4 . + CDS 20004 - 21713 1058 ## gi|253570591|ref|ZP_04847999.1| conserved hypothetical protein + Prom 22164 - 22223 2.0 23 14 Tu 1 . + CDS 22391 - 23401 512 ## BVU_1734 hypothetical protein + Prom 23512 - 23571 3.9 24 15 Op 1 . + CDS 23639 - 24010 220 ## BVU_1736 hypothetical protein 25 15 Op 2 . + CDS 23991 - 24575 479 ## gi|253570594|ref|ZP_04848002.1| conserved hypothetical protein 26 15 Op 3 . + CDS 24598 - 25017 314 ## gi|253570595|ref|ZP_04848003.1| conserved hypothetical protein + Term 25024 - 25071 8.1 - Term 25014 - 25055 -0.2 27 16 Tu 1 . - CDS 25254 - 26813 183 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 28 17 Tu 1 . - CDS 26917 - 27363 144 ## BT_0074 hypothetical protein - Prom 27512 - 27571 4.2 - Term 27526 - 27580 -0.5 29 18 Tu 1 . - CDS 27600 - 28010 264 ## BT_2526 hypothetical protein - Prom 28151 - 28210 5.2 - Term 28412 - 28450 5.1 30 19 Tu 1 . 
- CDS 28465 - 29655 1275 ## BT_0066 major outer membrane protein OmpA - Prom 29680 - 29739 7.1 31 20 Op 1 . - CDS 29854 - 31947 1402 ## BF1750 1-phosphatidylinositol phosphodiesterase precursor 32 20 Op 2 . - CDS 31979 - 32176 301 ## gi|253570601|ref|ZP_04848009.1| predicted protein - Prom 32218 - 32277 4.1 33 21 Op 1 2/0.000 - CDS 32344 - 34674 1820 ## COG0489 ATPases involved in chromosome partitioning 34 21 Op 2 . - CDS 34689 - 35486 486 ## COG1596 Periplasmic protein involved in polysaccharide export 35 21 Op 3 . - CDS 35507 - 36412 370 ## BT_0059 hypothetical protein 36 21 Op 4 . - CDS 36409 - 37503 467 ## BT_0058 hypothetical protein 37 21 Op 5 . - CDS 37534 - 37800 293 ## BT_0057 hypothetical protein 38 21 Op 6 12/0.000 - CDS 37831 - 38496 561 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 39 21 Op 7 . - CDS 38484 - 39608 668 ## COG0438 Glycosyltransferase - Prom 39719 - 39778 3.9 40 22 Op 1 . - CDS 39808 - 40785 265 ## gi|253570609|ref|ZP_04848017.1| predicted protein 41 22 Op 2 . - CDS 40844 - 41917 371 ## COG0438 Glycosyltransferase 42 22 Op 3 . - CDS 41922 - 42767 394 ## COG3774 Mannosyltransferase OCH1 and related enzymes 43 22 Op 4 . - CDS 42685 - 43959 317 ## BT_2867 hypothetical protein 44 22 Op 5 11/0.000 - CDS 43964 - 44734 330 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 44770 - 44829 1.9 45 22 Op 6 . - CDS 44954 - 45973 459 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 46 22 Op 7 . - CDS 45982 - 46485 193 ## gi|253570615|ref|ZP_04848023.1| conserved hypothetical protein - Prom 46546 - 46605 5.0 + Prom 46272 - 46331 6.2 47 23 Tu 1 . + CDS 46568 - 46885 95 ## - Term 46947 - 47005 2.6 48 24 Op 1 . - CDS 47052 - 48164 789 ## COG0438 Glycosyltransferase 49 24 Op 2 . - CDS 48183 - 49202 211 ## Gmet_1497 glycosyl transferase family protein 50 24 Op 3 . - CDS 49165 - 50151 377 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 51 24 Op 4 . 
- CDS 50154 - 50426 96 ## gi|253570619|ref|ZP_04848027.1| predicted protein 52 24 Op 5 . - CDS 50429 - 51289 311 ## BT_2870 putative glycosyltransferase 53 24 Op 6 . - CDS 51363 - 52481 750 ## BDP_1842 polysaccharide pyruvyl transferase 54 24 Op 7 . - CDS 52486 - 53952 694 ## BVU_2391 putative transmembrane protein 55 24 Op 8 . - CDS 54005 - 55168 256 ## BVU_2392 F420H2-dehydrogenase, beta subunit 56 25 Tu 1 . - CDS 55493 - 56509 303 ## BT_0037 putative transcriptional regulatory protein - Prom 56748 - 56807 7.1 + Prom 56284 - 56343 4.6 57 26 Tu 1 . + CDS 56578 - 56775 81 ## - Term 56875 - 56909 3.9 58 27 Tu 1 . - CDS 56931 - 58136 785 ## BDI_0750 transposase - Prom 58160 - 58219 6.6 - Term 58680 - 58719 4.5 59 28 Tu 1 . - CDS 58733 - 59563 448 ## Metev_0644 radical SAM domain-containing protein - Prom 59583 - 59642 6.1 - Term 59599 - 59644 0.5 60 29 Op 1 . - CDS 59811 - 60767 572 ## gi|253570627|ref|ZP_04848035.1| predicted protein 61 29 Op 2 . - CDS 60820 - 61935 767 ## gi|253570628|ref|ZP_04848036.1| predicted protein - Prom 62012 - 62071 6.1 62 30 Tu 1 . 
- CDS 62921 - 63133 138 ## - Prom 63285 - 63344 7.4 Predicted protein(s) >gi|226332203|gb|ACIC01000117.1| GENE 1 189 - 1544 1187 451 aa, chain - ## HITS:1 COG:aq_1964 KEGG:ns NR:ns ## COG: aq_1964 COG0750 # Protein_GI_number: 15606963 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Aquifex aeolicus # 9 450 4 428 429 137 28.0 4e-32 METFLIRALQLIMSLSLLVIIHEGGHFLFARLFKVRVEKFCLFFDPWFTLFKFKPKRSDT EYAVGWLPLGGYVKIAGMIDESMDTEQMKQPEQPWEFRSKPAWQRLLIMVGGVLFNFLLA LFIYSMILFAWGDQYIKVQEAPLGMDFNETAKAVGFQDGDILLSADNVPFVRYDGDMLSQ IADAREVSVLRDGKKASVYIPEDMMQRLMADSVRFASFRFPYVIDSVMVNSPAAQAGILP GDSIIALDGKSISFSDFKQTMAERKKNAEALLKDSIDPRLITLTYVRGGVTDTTSLRVDS AYLMGVVASLTTDRLLPMVKKEYTFFESFPAGVSLGVKTLKGYVGNMKYLFSKEGAKQLG GFGTIGSIFPATWDWHQFWYMTAFLSIILAFMNILPIPALDGGHVLFLFYEMIARRKPSD KFMEYAQMTGMVLLFGLLIWANFNDILRFFF >gi|226332203|gb|ACIC01000117.1| GENE 2 1669 - 2841 1228 390 aa, chain - ## HITS:1 COG:alr4351 KEGG:ns NR:ns ## COG: alr4351 COG0743 # Protein_GI_number: 17231843 # Func_class: I Lipid transport and metabolism # Function: 1-deoxy-D-xylulose 5-phosphate reductoisomerase # Organism: Nostoc sp. 
PCC 7120 # 10 383 3 380 399 350 47.0 4e-96 MDSEKKNKKKQIAILGSTGSIGTQALQVIEEHPDLYEAYALTANNRVELLIAQARKFQPE VVVIANEEKYAQLKEALSDLPIKVYAGIDAVCQIVEAGPVDMVLTAMVGYAGLKPTINAI RAKKAIALANKETLVVAGELINQLAQQYHTPILPVDSEHSAVFQCLAGEVGNPIEKVILT ASGGPFRTCTLEQLKSVTKTQALKHPNWEMGAKITIDSASMMNKGFEVIEAKWLFGVQPS QIEVVVHPQSVIHSMVQFEDGAVKAQLGMPDMRLPIQYAFSYPDRICSSFDRLDFTQCTN LTFEQPDTKRFRNLALAYEAMYRGGNMPCIVNAANEVVVAAFLRDGISFLGMSDVIEKTM ERAAFVAAPAYDDYVATDAEARRIAAELIP >gi|226332203|gb|ACIC01000117.1| GENE 3 2905 - 3762 844 285 aa, chain - ## HITS:1 COG:NMB1483 KEGG:ns NR:ns ## COG: NMB1483 COG0739 # Protein_GI_number: 15677336 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Neisseria meningitidis MC58 # 162 285 295 415 415 88 40.0 2e-17 MPRKRSKAFWNNFKFKYKLTIINENTLEEIVGLRVSKLNGLSVLLSVLAVLFLIAACIIA FTPLRNYLPGYMNSEVRSQIVDNALRVDSLQELLNRQNLYIMNIQDIFSGKVPIDSVQTL DSLTAAREDTLMERTRREEEFRRQYEENEKYNLTTIVSQPDVNGLILYRPTRGMVSDHFN TDKKHFGTDIAANPNESVLATMDGTVFLSTYTAETGYVIGVQHSQDFVSIYKHCGSLLKK EGDRVKGGEAIALVGNSGTLSTGPHLHFELWYKGHPVNPEKYIVF >gi|226332203|gb|ACIC01000117.1| GENE 4 3775 - 4314 496 179 aa, chain - ## HITS:1 COG:CAC1757 KEGG:ns NR:ns ## COG: CAC1757 COG0806 # Protein_GI_number: 15895034 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RimM protein, required for 16S rRNA processing # Organism: Clostridium acetobutylicum # 5 171 2 164 166 58 28.0 7e-09 MIKKEEVYKIGLFNKPHGIHGELQFTFTDDIFDRVDCDYLICLLDGIFVPFFIEEYRFRS DSTALVKLEGIDTAERARMFTNVEVYFPVKHAEEAEDGELSWNFFVGFRMEDVRHGELGE VVEVDTTTVNTLFVVEQEDGEELLIPAQEEFIVEINQEKKLITVELPEGLLNLEDLEED >gi|226332203|gb|ACIC01000117.1| GENE 5 4311 - 5615 1216 434 aa, chain - ## HITS:1 COG:BB0472 KEGG:ns NR:ns ## COG: BB0472 COG0766 # Protein_GI_number: 15594817 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine enolpyruvyl transferase # Organism: Borrelia burgdorferi # 1 434 16 439 442 387 46.0 1e-107 
MASFVIEGGHRLSGEIHPQGAKNEVLQIICATLLTAEEVTVNNIPDILDVNNLIQLMREM GVTVAKEGVDTYSFKAENVDLAYLESDEFLKKCSSLRGSVMLIGPMVARFGKALISKPGG DKIGRRRLDTHFVGIQNLGADFRYDELRGIYEITADRLQGSYMLLDEASVTGTANILMAA VLAKGTTTIYNAACEPYLQQLCRMLNRMGAKISGIASNLLTIEGVEELHGTQHTVLPDMI EVGSFIGMAAMTKSEITIKNVSYENLGIIPESFRRLGIKLEQRGDDIYVPAQETYEIESF IDGSIMTIADAPWPGLTPDLLSVMLVVATQAKGSVLIHQKMFESRLFFVDKLIDMGAQII LCDPHRAVVIGHNHGFKLRGARLTSPDIRAGIALLIAAMSAEGTSTISNIEQIDRGYQNI EGRLNAIGARITRI >gi|226332203|gb|ACIC01000117.1| GENE 6 5749 - 6363 752 204 aa, chain - ## HITS:1 COG:no KEGG:BT_2006 NR:ns ## KEGG: BT_2006 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 190 1 190 204 371 100.0 1e-102 MQYNTQQKRMPLPEYGRSIQNMVDYALTIQDRAERQRCANTIINIMGNMFPHLRDVPDFK HKLWDHLAIMSGFELDIDYPYEIIREDNLVTKPEPIPYSTTRMRYRHYGHTLEVLIKKAC ELPEGNDKRNLTAMICNHMKKDYMAWNKDTVDDRKIAEDLYELSGGKLQLTDDIIRLMAE RLNQNYRPKVNYNNNRNNNPRRRY >gi|226332203|gb|ACIC01000117.1| GENE 7 6485 - 7180 660 231 aa, chain - ## HITS:1 COG:SA1856 KEGG:ns NR:ns ## COG: SA1856 COG1214 # Protein_GI_number: 15927626 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Inactive homolog of metal-dependent proteases, putative molecular chaperone # Organism: Staphylococcus aureus N315 # 5 218 13 214 229 78 31.0 9e-15 MSCILNIETSTTVCSVAASQDGQTIFVKEDLKGPSHAVSLGVFVDEALSFIDSHAIPLDA VAVSCGPGSYTGLRIGVSMAKGICYGRNVPLIGLPTLEVLSVPVLLYHDLPEDALLCPML DARRMEVYAAVYDRALNVKRAIAADIVDENSYLEFLNEAPVYFYGNGAAKCREKITHPNA HFIDDIHPLAKMMYPLAEKAVAREDYKDVAYFEPFYLKEFVASLPKKSLLE >gi|226332203|gb|ACIC01000117.1| GENE 8 7360 - 9069 730 569 aa, chain + ## HITS:1 COG:no KEGG:BT_3478 NR:ns ## KEGG: BT_3478 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 30 437 31 423 427 140 27.0 1e-31 MTHQVFINEIVKANFNLRQPKSERPTNIYLIVRINQKQAKLSTGVKVYPEHWNIKKQEAY ISCRLTESDNINNTIANKKLLELRSQFEEFKRYLCDNPSELKNCWDLLRKYIYKDNNMQR TNKINAIHWLRDIIANDKTIKTSGEHKRGSTMETYLIVLKDFQTFLKEKGQDTISFDDIN 
LALIKEYETYLFNKKVKGEQTTATSTVGNKCVQLIGIIKRAEPYNLIDIHEAKLDKYSKP KSRQGDENEIYLNEEEISKIYSLKLPYKEGVARDVFILQCWTGQRFSDIKSLNEGIVKET SSGKVLEIVQLKKTRRVTIPLFPIALEILKKYDFNLPEISENTMLRYLKKIGLKAGLTEE HIVTEDRGGKVTNSIKQRWELIGTHTARRSYISNMLKRGYDSHLLMKITGHTTEEAFKRY VKVRSEDVASLILKTEAGRVKDKLPQESSQENNVPMNSNQILKVIKEGIEEGLKPFKKEL SDIKEVMQYITTNKRSIRPVNARRLIAMVVSLEKDNTPAQTIINMLEASGIIGSVSVTMT GGQYFPMQNMNEQVLDELKSIGTKLKDSQ >gi|226332203|gb|ACIC01000117.1| GENE 9 9218 - 9667 390 149 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570581|ref|ZP_04847989.1| ## NR: gi|253570581|ref|ZP_04847989.1| predicted protein [Bacteroides sp. 1_1_6] # 1 149 1 149 149 251 100.0 9e-66 MRTLRLIVLAIVAIVMSVNFTACSDDDDDFNISDLEGLWEGVTSEFEEKENGQVVDKDTE SLDDQRIRFKSDGTITSYYKSGSNWIVEDEGTWSVKNGKIYMRADGEEDVAKILELNPQT LVIEISESGVEDGISYSYYEKDTYKKITE >gi|226332203|gb|ACIC01000117.1| GENE 10 10011 - 10613 489 200 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570582|ref|ZP_04847990.1| ## NR: gi|253570582|ref|ZP_04847990.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 200 18 217 217 376 100.0 1e-103 MCKMGIPKEVVIIATAHENTKMIDAVYLHVSSHDKAYKLSEAIIEKAQGSMFVAHKEQVT TKPEVDVFNYIFAGDLLLGLTKKYEATKVHEVLDKDLDKGFLELPTTKQAIAILKDTNRI NEVDINKYKGDEELKYKVKAICTIVWEIGKAANDVLLIQFFQDNVITLGLNTEYYLPETM SKREIQRLFEMKPYKGTIIA >gi|226332203|gb|ACIC01000117.1| GENE 11 11051 - 11215 214 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570583|ref|ZP_04847991.1| ## NR: gi|253570583|ref|ZP_04847991.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 54 1 54 54 84 100.0 3e-15 MKHLMKGIRIAAYIAEIIVAGSTIVELVEKYSGRSKRKSAASKVDTGNVATENP >gi|226332203|gb|ACIC01000117.1| GENE 12 11394 - 12413 644 339 aa, chain + ## HITS:1 COG:no KEGG:BF3462 NR:ns ## KEGG: BF3462 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 6 339 5 340 340 405 56.0 1e-112 METTNLLLPSTRMARTNEWKKEAEDAVLVTEQTHKRSPFIEANTIEVTLEHLRNDCIIPT FAKDNEVCISHQSFIESVYEATRDFYHGETICSPEIRTSHIVRGRIPEAINKRVDQLLES DKTMYYERMIFNIEIPSISRNVNGNRLNLCITGCKSYARDNLSGKMTAQRFNLAIGFLNL ACTNQCLSTDGYKEEIRATSARDLYQSTLDLFSQYNVARHIHLMRSLGDTMLSEHQFCQI LGRMRLYNYLPQAQQRELPRLLITDSQINNVARAYIHDDNFAGNNGELSMWKFYNLITGA NKSSYLDTFLGRSVNATEVSVGLTEALNGKDMAYSWFIE >gi|226332203|gb|ACIC01000117.1| GENE 13 12516 - 13148 378 210 aa, chain + ## HITS:1 COG:no KEGG:BT_1133 NR:ns ## KEGG: BT_1133 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 39 208 12 181 186 152 45.0 8e-36 MMTIFTRDYSQRLRHFNKKQNNNAVKTETIGNTAEVTRFDKWFASSYTRLREKIRFFSMV DEDNFHNTYLFIREKIMGGEEKIENLEAYFFRCYRYKAMTEMRNENRYVHPEDDFFYRFS QEETEPYPWGLNRCERLAGDVLRYIRSRFSRQEYRMFMLRYYRQQCSLKVLSEYTGLPLS EVMRKTRRMLESLRSNSYFMERCEMLYVYE >gi|226332203|gb|ACIC01000117.1| GENE 14 13202 - 13648 258 148 aa, chain + ## HITS:1 COG:no KEGG:BDI_2238 NR:ns ## KEGG: BDI_2238 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 4 132 16 130 130 70 33.0 2e-11 MLSVAQEGQRSIRMDRNGCIFLSSALVQELGLQEGSMAVLVGDEDDPKQWFLCIVDDEGG FSVRLKKRKRYNSKGLLEEYCYGAVFNCSYLCHKILDNVGADKSATFMLATAPVEMEGLK LYKILTSNPITRPDRKYVRKKDRTQMEI >gi|226332203|gb|ACIC01000117.1| GENE 15 13850 - 14524 511 224 aa, chain + ## HITS:1 COG:mll4708 KEGG:ns NR:ns ## COG: mll4708 COG1961 # Protein_GI_number: 13473947 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Mesorhizobium loti # 5 218 3 204 215 125 35.0 5e-29 MNNQYVAYLRVSTQKQGYSGLGLEAQREIIQKYLRDKTPVAEFVEIESGRKKDRPKLKEA 
LSLCRKTEATLIVAKLDRLARNVSFLSNLLENDVEIVFCDFPQANKMVLHILSAISQYEA ELIAARTKSALQAKKARGFRLGNPEHLMDKHEQAIQNSIRTCREKADNNPNNRRAVAMLR TLVKEEHTLQEITNILNKEGFVTSKGCSFSKSTVYKLIRRYNLK >gi|226332203|gb|ACIC01000117.1| GENE 16 15489 - 16307 367 272 aa, chain + ## HITS:1 COG:MTH1178 KEGG:ns NR:ns ## COG: MTH1178 COG1342 # Protein_GI_number: 15679189 # Func_class: R General function prediction only # Function: Predicted DNA-binding proteins # Organism: Methanothermobacter thermautotrophicus # 3 135 2 128 189 87 39.0 3e-17 MSPRPKNIRKVNNMPSVEGFRPIASNSYHKDTILLHFEEYEAIRLCDYEMKTQQEASVSM GVSRPTLSRIYVSARQKIANALVRGVTIMIEGGVAYTDSEWFCCGVCGFLFNNINPAFKI RKTECPVCHSNDISISNININKNEIMMKIAIPTRDKVVDNHFGHCEYYTILTIGQDNQIL SSETIPSPQGCGCKSNIAGELENMEVSVMLAGNMGQGALNVLTAHNIKVIRGCSGNILDV ATDYLNGKLIDSGVGCSSHAHQHECHGHNHKE >gi|226332203|gb|ACIC01000117.1| GENE 17 16307 - 16681 140 124 aa, chain + ## HITS:1 COG:no KEGG:BDI_3898 NR:ns ## KEGG: BDI_3898 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 124 2 125 125 161 84.0 6e-39 MFGKNKMLLNIIIDFVMLIAMALVSISGVILEIVIPSRHAVRFQSVTPWSSHLFGLGRHD WGNIHLWAGVVLFALLAIHILSHINMVSAFVKKKCPNHILRVLLYILFLMLLIITIMPWL YLCY >gi|226332203|gb|ACIC01000117.1| GENE 18 18342 - 18551 119 69 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHETLFTKSMLQKENESLSILFLGQAKIQLLSEFKKQHKLYPTQKSNGIIHESIANIIFL QLPIKNNTK >gi|226332203|gb|ACIC01000117.1| GENE 19 19171 - 19443 119 90 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHFYFSKSKERPLRRQPKVINKKTSPHPYYPQLSLPMHIYWDTPKWDALLGQLLDGTTGI RSIHSSLIRLCLIFTLSKKSLFRRSNPSKA >gi|226332203|gb|ACIC01000117.1| GENE 20 19451 - 19795 117 114 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|189462837|ref|ZP_03011622.1| ## NR: gi|189462837|ref|ZP_03011622.1| hypothetical protein BACCOP_03536 [Bacteroides coprocola DSM 17136] # 17 102 27 117 118 68 43.0 2e-10 MKANIILSGKNDSFILDRQVRDCPDVNEKTVPQRIPRRYREDIKRLQERYDLHEGLVINV SLKEFRGICERDYPKIEAYLGLRKYLLRTFGVTLNIHSQKTKGSSDKNTAITQK >gi|226332203|gb|ACIC01000117.1| GENE 21 19807 - 20007 260 66 
aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570590|ref|ZP_04847998.1| ## NR: gi|253570590|ref|ZP_04847998.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 66 1 66 66 97 100.0 2e-19 MDTIVIKKSELIEQIREDFKLWEEMSPDIDEGYFDEEDVQSYLNFLIERYHDEWVVIDDT QEGGDV >gi|226332203|gb|ACIC01000117.1| GENE 22 20004 - 21713 1058 569 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570591|ref|ZP_04847999.1| ## NR: gi|253570591|ref|ZP_04847999.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 569 1 569 569 1129 100.0 0 MKELFDIIPNSTGDGFRMQLSTGVIDIPDDNGGYIISSGCGSGKTESIKSLIRQKYNSGI LYCVDTRDELGKMYDWILANLVNRELGYGDILRESDVMIISSDKERSSFLNQYKDNPEIL MDKKIILITHVRFWTDLINYFLIYRPQMPVDSFDGDFRKLMVRPDLRRYILFDETPTFIR PFVEFDRTILGVFSKTDDTGNIICMSPEEIGIYYDHFIRNTRNDLFNQSYRINRIKRDVA LNLIPKYYDSWMLSDSDKAGITFYPVDLCPLGVYINTHVLIFEGAGDLLFKDSRNFRLLD VDRKYNCVTEFRKIDFGLFRRNLNPRRFDEFTTRIAMLINKPTLVVCWKDINGGDDGPGK SEYAEQISEALLLKGVPKELFTVTYYGSSDNKSTNNYRDIDQIVMCGDWTLPNIESARIR RAYGTTTDTQNQKDWFFSQLITRIGIRKHDGGTYTVYYTDDFKYDFIGRMYVYFNENRVI SSSHSRESCDWKNRLDSMNIRSNLKNEIVLLAMDDEDMRNAIGMDWEYTKEVSFDYLENL GIKRSARERRRYNKLIRVLEKIKITLLIE >gi|226332203|gb|ACIC01000117.1| GENE 23 22391 - 23401 512 336 aa, chain + ## HITS:1 COG:no KEGG:BVU_1734 NR:ns ## KEGG: BVU_1734 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 6 323 647 950 950 82 26.0 2e-14 MINKTVSYYNNADDSKSKIRYNLDYVLGMIRDDQELFQQTVRLRSISTEEEYKQAKKLLP MIAPSGVFDYRNDNPENLREYSNVLVLDFDHFQNHLDAGEFKRNLIDNADRLYIYALWFS PSGLGVKVAMLHDNTTPMYHNELFRCIREQLYPGIPEFDDKCGNLSRTFFLSSDPEIYIN PQRDYLIPYHFKHSPSVRATPMKSYNQGYISNCFTHTSAEIIQNNCFQARWKDKTLINYI DKKWRLEYPESYEDGHRHQSILSRAKWLCRYGVLYENALAYLTGTFGLHGIDKADIEAMV INNYNANREDFGKDRMRLYLKKESGRNYRNQKLKGE >gi|226332203|gb|ACIC01000117.1| GENE 24 23639 - 24010 220 123 aa, chain + ## HITS:1 COG:no KEGG:BVU_1736 NR:ns ## KEGG: BVU_1736 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 12 112 1 103 104 67 39.0 1e-10 
MKNSIFDMWSIMNLFFGGPPEESLKRQVGKKASGPDVSEKTGSKHNEKYEQDITALKEKF GNSFITGLCVRITLKEALGMMPRNRKRVDAYYGLARYLQEEYGITLTIYSQKTKPNGHEE DNL >gi|226332203|gb|ACIC01000117.1| GENE 25 23991 - 24575 479 194 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570594|ref|ZP_04848002.1| ## NR: gi|253570594|ref|ZP_04848002.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 18 194 1 177 177 319 99.0 5e-86 MKKIIYKGTIEESDYNRVKDIDSENYITPDGKTVMPKLTQMPLRDLAILSFTSENELKKY YTGNEEYFSYSVVELMLDTRIQTSNLSRHKVSSFEDALYLLYTYSEEIPQADDPKYLSIL IAADILNVEEEDIIEEARRDNKLYSDEDKNLFVPVRWIGDWYNDALATLGISSVIYIQTR GTGKVKILIERDLE >gi|226332203|gb|ACIC01000117.1| GENE 26 24598 - 25017 314 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570595|ref|ZP_04848003.1| ## NR: gi|253570595|ref|ZP_04848003.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 139 8 146 146 284 100.0 1e-75 MGKVYDLSTLGVNEVENAVQADLDKVFNAGGVKFKLREVSGKTLELTFLRKYREGEIDWL NYDPKTIYNIDANIITGHSFNGFRIPDYWGGVPYGYTFFMPKREFIGCYRNSAVMLGADQ VKKVKITTQPEKIVMRLMF >gi|226332203|gb|ACIC01000117.1| GENE 27 25254 - 26813 183 519 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 290 519 260 497 563 75 29 9e-13 MMGTLYVGISLFFIWICKCLIDIATKEPGDHMNVLIYLMAACLISRLLLSVAGVRLGNYI EIRFRNGLRHQLFNHLMESRWTGRERFHTGDILNRLEEDVITVTDMLSRGFPSIMVIVSQ LIGALFFLSRLDSRLTGAILFIMSGALLLSKSYIKKMRGLSRDIRSMDSRIQSHIQEYLQ HCILVRTLGNTLRAGNTLASLQLNLQHQVMRRTDFSLFSRMVVQTGFSAGYMTAFLWGVF GLQDGTVTFGVMAAFLQLVAQIQGPLVELSRQIPAFIHLITSSERLAEIAVLPLEQHGES IMLDGIPGIRVESLDFSYPDGEKKVLNGFTHNFLPGSLTAVIGETGTGKSTLLRLILALL SPDSGRIVFYNKRKEVDVSPLTRCNLSYVPQGNTLISGTIRDNLLMGNPNATDEELYVAL HTAVADFVYTLPDGLNTLCGEQGIGLSEGQAQRIAIARGLLRPGRILLLDEPTSALDEKT EKILLERLSEQVRNKTLILITHREMIAQLCSSMVRMHGK >gi|226332203|gb|ACIC01000117.1| GENE 28 26917 - 27363 144 148 aa, chain - ## HITS:1 COG:no KEGG:BT_0074 NR:ns ## KEGG: BT_0074 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 144 15 159 163 131 47.0 6e-30 MTERVILSNDLLLSEVALMLSEGCTVTLLAKGSSMFPFIIGGCDRVVLQKTDCIQVGDIV LAYLTGRSYVLHRIYRIADDDIILMGDGNVSEIERCRRENICGKVLRIMHSGKHIECSSL KERCKVRIWMALLPFRKYILGVCRYWYK >gi|226332203|gb|ACIC01000117.1| GENE 29 27600 - 28010 264 136 aa, chain - ## HITS:1 COG:no KEGG:BT_2526 NR:ns ## KEGG: BT_2526 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 136 1 132 132 149 64.0 4e-35 MILYNSLLAKIILNRKRNKREHYFMVFGCCFTRYKYLEIWEDMELRIHERQYFECFLLAL MPSLVLSLLFSWWFMLLALFSYHLLYWAERWFGHHSSFDWEALEYCGDTLYLRKRKFCAW MKWYGKRTLPPSEWED >gi|226332203|gb|ACIC01000117.1| GENE 30 28465 - 29655 1275 396 aa, chain - ## HITS:1 COG:no KEGG:BT_0066 NR:ns ## KEGG: BT_0066 # Name: not_defined # Def: major outer membrane protein OmpA # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 396 1 400 400 587 74.0 1e-166 MVRNLFLTTLCALAFCPAFAQTSTTEEKVEYSEDKYKVETNRFWSNWFISAGGGAQIYFG DHDKQVSFGKRLAPALDIAVGKWFTPGIGMRLMYSGLQQKGATQKGALSHSTGGDVPGKG GNGYWLEKQKFDMANFHADILFNFSNLFCGYNEKRIWNCSPYAGLGWARVWKSPSAKEVS ANVGILNSFRLCSALDLNLDVRAMLVNDRFDGEGGGRFNEGMLSATVGLAYKFKPRGWDR SKTIYRYDYGDLESMRRKLNEMSAENERLKKALAEGNVQDARTIIKKIAAANLVTFPINK 
SKLDNEARANLGMLAEIIKSGDPEATYTITGYADAGTGSKKGNEKLSKERAHAVYDCLVK EFGVKESQLRIDYKGGVENMFYNDPRLSRAVITRSE >gi|226332203|gb|ACIC01000117.1| GENE 31 29854 - 31947 1402 697 aa, chain - ## HITS:1 COG:no KEGG:BF1750 NR:ns ## KEGG: BF1750 # Name: not_defined # Def: 1-phosphatidylinositol phosphodiesterase precursor # Organism: B.fragilis # Pathway: Inositol phosphate metabolism [PATH:bfr00562] # 298 555 15 244 345 77 26.0 1e-12 MKIQMKHFISGCALFVTVLTACSDDDFEGKTSIKSGDAVQFAASSRTSTRTVYDDENIFQ INWEEQDKIKIYSNKTYEGVADADYTVTPIETEDPKKSTNKLYNEGTIAAFGTSQLMWAD NDTHKFIGVYPSDNSNISVDAETGIVTLPIKRNQRCEVVAVSGNDYYSQRYSDYTYYAKP DMKNAYLVAYNALTPEQAATNDGNVFLDFKPVMTTIEVVVKGRAKVNEEDVQVTGISVVR EVPEGDGDTFTWNAETGETTNKLPESSKGMNTQSTTTFVGLKTSDDTNIVTLKNNESIVF TVFLPPYAIDGNWPMKIRVHATGATEIITKDISGILASNKRRVTLPNFKSSNEQIGNNWI TPLDDNIYVSQLSIPGTHDAATGDGTTFSLGKTQDMTLDQQFEMGIRAFDLRPALNASST MILCHGIVATTFVWDNVMERFKYYLKENPGEFIIAIMRHEDEYSGTSITHWTNENTHEKW APAMLEKLNVMKNTINPSTNQSYTIDFRPDLTVGEMRGKILFMCRSWTKYNDAGPVVGGY HDWSHSKDGGEVSIWGPSSVIGTLNIQDCYNRSEAGVSSDEFSTVKWNAIEALLEKSRYF HTNPAMINRWSYNHTSGYTSGIIATASTTDGYRENAANNHIKFYNKITSTDWVGSTGIIL SDFVGARRSGNFTVYGDLLPQAIINNNYKYRMKRKGE >gi|226332203|gb|ACIC01000117.1| GENE 32 31979 - 32176 301 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570601|ref|ZP_04848009.1| ## NR: gi|253570601|ref|ZP_04848009.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 65 13 77 77 108 100.0 9e-23 MKEQKKKKYEAPETKKTQVELEDGICAASRTGKPVVDDKNDKVNISEQGHGGWNGIGDDI SGEWQ >gi|226332203|gb|ACIC01000117.1| GENE 33 32344 - 34674 1820 776 aa, chain - ## HITS:1 COG:RSp1018_2 KEGG:ns NR:ns ## COG: RSp1018_2 COG0489 # Protein_GI_number: 17549239 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Ralstonia solanacearum # 554 772 21 237 252 119 35.0 2e-26 MNEQNRNNQDINLKDLFFYLLSKWRWFLLSILVFGSLSWLRYAQSPFVYFRQATVIIKDP SNKTTTAGLDRYDNYINKVNVANEILQFRSKKLMQEVVKRVHADISYLHKEGLRENELYT KAPVAVVFPDATPEQYLSFKVTPRDKQTVTVSDFMGTEKDKVCQVGLNDTVDVNGVRMVI SPTNYYTDAWLEKIVRVQKNPLAAVAGYYRGNLGIRQEEDDSSILTLSLKDSSPIRAEDV LNMLITVYNEEAVRDKNQVAVNTADFINERLIIISRELGGVETELETFKRDNRIMDISST ASSYMSESQQSNSQVLELETQLRLARYIKDYLEDHTKGTDLIPSNTGIDNASIENQISQY NTIKLRRDKLIDDSSDSNPVVEELNNSLHAMKQSIIRAVDNMIVSINVRRNDARSREMRA QSRVSSIPTKEREMLSIERQQKIKEALYLFLLNRREENALSQAMADNNARVIDGADGSTV PIAPSGKRILLLGVLIGFALPCIVFLVRMFMDTYVHSRKDLKGKVSVPFLGEIPLDKETA KSGRKGASINEKEDDIVSEAFRILRTNMSFMEKKDSRMQVITFTSFNEGAGKTFISRNLG MSLVYAKKRVLLLDMDIRKGTLSRHFHKHKLGLTNYLANSSIGIDDIIHHDGNQPGLDII SSGTVAPNPAELLMDERLDELVKELRTRYDYIIADSVPVGIIADATIANRIADLTIFVVR AGKLDRRQLPDIESLYQEKTLNNMTLVLNGVDVSHRGYGYGYGYGNYGYGKYEKKK >gi|226332203|gb|ACIC01000117.1| GENE 34 34689 - 35486 486 265 aa, chain - ## HITS:1 COG:XF2370 KEGG:ns NR:ns ## COG: XF2370 COG1596 # Protein_GI_number: 15838961 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Xylella fastidiosa 9a5c # 99 230 58 191 217 64 32.0 2e-10 MRKKMIFISCMAVLLLSSCASRKGIAYLQDMELGQRYLYDARYEATVHRNDRLSITVNCK RPELAIPFNAHGNSIHVGADGNVSTSSGEVSAESAKGYRVDVEGNIDFPILGKLHVEGLK VSEVTDLIKNRIIQGNYIKDPQVSLEFLNFKYTVLGAVGRTGTFTVKDDRITLLEAIANA GDLSSKAKINSVAVIREIDGEREIFVHDLRSKELFTSPCYYLQQNDIVYVEPRYRKKDSE DRGFQVGSIFLSLASVIVSLIWALK >gi|226332203|gb|ACIC01000117.1| GENE 35 35507 - 36412 370 301 aa, chain - ## HITS:1 COG:no KEGG:BT_0059 
NR:ns ## KEGG: BT_0059 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 291 1 293 313 279 49.0 1e-73 MMREYDFRVAGFLFKLHLPEEQDIDRLLPSFRPFRCVTDIQGRAIFNFVAVSSSLSEKGI IRVLEETESDLGHIRLLEMAEGYRVELRYTAGSPVHVLHAEPFFTLAMANIRWDDPYAGE VLCSLLRIVYSQAILFWGGISVHASAVAWRGKAYLFMGKSGTGKSTHAVQWLKCFPKSEL LNDDNPTIRMESGRSIAYGTPWSGKTPCYKNRCFPVGGIVRLRQADANRFVLQKDVDAFI TLLPGCSAIRLHSYLCNGLYDTLAETVSAVPVGELDCLPDEGAALLCEESLTKEYEYLYN H >gi|226332203|gb|ACIC01000117.1| GENE 36 36409 - 37503 467 364 aa, chain - ## HITS:1 COG:no KEGG:BT_0058 NR:ns ## KEGG: BT_0058 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 364 1 376 376 296 45.0 6e-79 MEMYPNKEVNILFILLRAGLWEKEPESLSLFPLSGESWENIYRMARRQTVTGLVYRGVCH LPDEMLPPEKLLVRWVAEINAIEQRNHRMNAVLAELYEMFRNDGLTPVLQKGQSAALFYK FPLLRECGDIDFYFPYKNEREFAIRIVKNRGIRVDWQPDDSVSYIWNSIEVEHHPRMLDL YNPFLHEDLNSLETLFGYHDVWLSPGCEITIPSPTLYVLLLNTHIMKHAIGRGIGLRQLC DMARACHRLHIAVTSSEIKNICCKAGIIVWNRLLHSFLVEHLGLPVTSLPYKDRPVSSRP LLERVLEGGNFGQYRMERTPGTQAVWQRKLYTAHSFLRNVSLSFGYAPKEAFWTFTNLLT GQVK >gi|226332203|gb|ACIC01000117.1| GENE 37 37534 - 37800 293 88 aa, chain - ## HITS:1 COG:no KEGG:BT_0057 NR:ns ## KEGG: BT_0057 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 87 1 88 95 83 48.0 2e-15 MKLQPNLQLRKIGNKYMIVSTASGNVNMTDVFTLNETAARLWQLMEGKDITPKELAVLLC NEYEVGEEDALKDVEKQLCEWKQSGLVG >gi|226332203|gb|ACIC01000117.1| GENE 38 37831 - 38496 561 221 aa, chain - ## HITS:1 COG:VNG0011C KEGG:ns NR:ns ## COG: VNG0011C COG2148 # Protein_GI_number: 15789348 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Halobacterium sp. 
NRC-1 # 33 217 284 469 477 149 42.0 4e-36 MEKVEVVSKAVKGSATERSRSGDIMSPIARTTKRFLDVVASIVGLAVFSPVLLVIYIAIK CEDHGKAIFSQERVGYHGEIFTLYKFRSMITLAEADGKPALCQKRDKRLTRVGRFLREHH LDELPQLWNVLKGEMSFVGPRPERKYFVDQIKSINPDYEQLYQLRPGLFSAATLYNGYTD SMTKMLERLRMDLDYLYHRSLWLDTKIIFLTVFSILVGKKF >gi|226332203|gb|ACIC01000117.1| GENE 39 38484 - 39608 668 374 aa, chain - ## HITS:1 COG:aq_516 KEGG:ns NR:ns ## COG: aq_516 COG0438 # Protein_GI_number: 15605985 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Aquifex aeolicus # 4 373 1 368 368 266 39.0 5e-71 MKALKILQLGKFYPIMGGVEKVMYDLMSGLSERGVPCDMLCALSQGSSRTFSINAHSRLI GCRTWMKVAATMISPGMIFSLRKRCRDYDIIHVHHPDPMACLALFLSGYKGKVILHWHAD IEKQKILLKLYSPLQEWLLARADVIIGTTPPYLAESPCLARVRHKTECLPIGIEPVCPAP AAVEEVRKRYPGKKIIFSIGRLVAYKGYKYLIESAHYLDDDYVILIGGSGAMKYDLEAEI ETWGVQDKVMLLGRISDKELSAYYGACTLFCLSSVQKTEAFGIVQIEAMSCGKPVVATNI PHSGVSWVNAHGFSGLNVTPCSAKELAGAIMTITRNDDLYQKLAKGARERYQETFTKEKM IDNILEIYRSLWKK >gi|226332203|gb|ACIC01000117.1| GENE 40 39808 - 40785 265 325 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570609|ref|ZP_04848017.1| ## NR: gi|253570609|ref|ZP_04848017.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 325 1 325 325 525 100.0 1e-147 MIFFLVIIGILYIFSFCSFNERVSTFNKEHVSTLKGVMAISIVAFHLFYQTDDWLFMFSS WGAPIVSMFYFISGYGLALNYRAKGNEYLSHFFKHRIWESLILPFLLVWVVNRIISGNIS MSLLDELIKLLMYGETTLPYSWYVFSILLFYILFYMIASKKNVVIISFLCVFLYIALTVS LSYERCWYISALAFPLGIFYCKYEERICALWNVPVKYYVTVPLCLLLTAACVISKNEFCY LFAYIFIPIVIVCLCAKIQIYNRNIQWISNVSYEIYLCQGVSMILLRGNYFFVKSDFLYI IATFVLTLFMAYCIKKTCILLVNKH >gi|226332203|gb|ACIC01000117.1| GENE 41 40844 - 41917 371 357 aa, chain - ## HITS:1 COG:DRA0039 KEGG:ns NR:ns ## COG: DRA0039 COG0438 # Protein_GI_number: 15807709 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Deinococcus radiodurans # 1 354 1 334 343 184 32.0 3e-46 MIYVNGRFLLQSLTGVNRFAYELCRAWVQMGIPFILCCPSGSIKGCYDVSHFNIVVYGWG KSHVWEQLLLPLWFSRIKGEKVLVCFTGLGPLLIRKKIMTIHDLAFMANPDWYSRSYRLW YRLMTPLCVATSMKILTVSEFSKSEIVRRLSIDDRKISVIYNAVSSLFCVSDSSHRNARE VTGEKYILAVSSIDPRKNFSMLLKAFAQMDDKNIKLYIVGGQANIYSTSIKELCDNIPTD RIKWLGRITDCELKEYYMNSCCFIYPSLYEGFGIPPLEAMACGTPTIVSDIPPLREICSN ASLYICPLDTEDIAKKIMLLVSDIKLREKLRIAGYNQYKKFAWKDSANAVYDLLCSL >gi|226332203|gb|ACIC01000117.1| GENE 42 41922 - 42767 394 281 aa, chain - ## HITS:1 COG:FN1241 KEGG:ns NR:ns ## COG: FN1241 COG3774 # Protein_GI_number: 19704576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannosyltransferase OCH1 and related enzymes # Organism: Fusobacterium nucleatum # 25 129 1 107 243 87 42.0 3e-17 MPCFCCWIAMSPWKIVPDRLKKHPMIPKIIHYCWLSNEPFPKKIQMCIDSWKKVMPDYEL KLWNTQTFDIENSVPYVKEAFANRKWAFVADYIRLYALYTEGGIYLDSDVKVLKCFDRFL HHKFFTSMEYHPFMVERDNSLADIDIDGYRIADRYISGIELQAAIMGAEKGYVFVKNVLD WYQDKHFIRPDGSMGLDVIAPQIYARIAENYGFRYKDIDQLLADGLMVYRSEIFAGNKRE ATPASYAIHYCENSWNERALLDKIRHYWRFFVYLFKSKMSR >gi|226332203|gb|ACIC01000117.1| GENE 43 42685 - 43959 317 424 aa, chain - ## HITS:1 COG:no KEGG:BT_2867 NR:ns ## KEGG: BT_2867 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 405 1 404 414 249 39.0 2e-64 
MVKLLKVVFTGVVFSCYYFPFEFVFLPGINTKMMLAVIGVALFVQQWLKEDTSRLNRTFL VVSVWAALFSLSSFLSVVFNNTTDYTYVSFIVGLWVWIAGAYSVMFLIRRVHGFVSIQWV FQYMAYVCATQSILAICIDNMPLLQEWVDGLILQNVEYLHRTNRLYGIGASFDTAGIRFS CSMLGLGYLLVHEISNKRKICYWSLFLIIGIVGNIVSRTTTIGLVISIVYMALSNFSLSG QISHSRVRFIGYGIVFTGLLVGGIYYLYNYSPVFRNYLQYGFEGFFNWFENGEWTANSTD RLQRMVVFPNNLKTWLIGDGWFLNPNDSNGFYKYTDIGYLNFIFYCGTIGLAIFIFLFIH STYFLCQKGRENTFFFLLLLLLQLIVWIKISTNIFSVYAMLLLLDSYESLEDSARPTEKT SYDT >gi|226332203|gb|ACIC01000117.1| GENE 44 43964 - 44734 330 256 aa, chain - ## HITS:1 COG:BS_yveT KEGG:ns NR:ns ## COG: BS_yveT COG0463 # Protein_GI_number: 16080481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 2 167 69 241 344 94 30.0 2e-19 MGVASARNTGLEAATGEYIAFVDADDWIEENMFEKLYQRAEAGHFDIVGCDWYLEFETSK RYMRQPVYERTSDCLKAMLSGEMRWFLWAFLIRRGIYVKNNIRFLDGANIGEDMAVLIRC FSFARSYRHISEALYHYVKSNAVSMTALDSKKQIEIVKRNVDATVSFIRSRYPNSLEQEL DFLKLNVKFSLLISDNAANYEVWNISFPEANRSIWKNAKQPFRNRLLQWSASHRYYWIVR GYYNILFKFVYGVLYR >gi|226332203|gb|ACIC01000117.1| GENE 45 44954 - 45973 459 339 aa, chain - ## HITS:1 COG:YPO0187 KEGG:ns NR:ns ## COG: YPO0187 COG0463 # Protein_GI_number: 16120528 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Yersinia pestis # 3 267 6 275 329 118 28.0 2e-26 MKVSIVIPIYNVSVYIENCLESVRRQIYQDLEVILVDDCGTDNSMEIVQEYLEYHNFVEV KIIHHTHNRGLSAARNTGLEAATGDYVYFLDSDDALMEDCIFILVAPVEAQSYDFVIGNY EVKGSNKEYPALTLPSGALRSNKEILHSYAEGRWYMMAWNKLCNRKFLLDNKLFFEEGLL HEDVIWSFKLACKARSMYVIYEPTYRYTIRAASIMTGTDIERDAYQYIKVFKVITQFVVN EGRQQAQDEYALLEGRKSALLFSLLQRDEYGLYNRYYPYLHEMSPISPWVAFKNKTIGIK YLLRDLHYLLPVVWGSQYKRLFYNLYYKWRGKPIEGALW >gi|226332203|gb|ACIC01000117.1| GENE 46 45982 - 46485 193 167 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570615|ref|ZP_04848023.1| ## NR: gi|253570615|ref|ZP_04848023.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 167 190 356 356 281 100.0 1e-74 MFFSLCPKEFGFTLNACFFFSLGLYTRLFSHVIEYVERLNFIKYLYFLLIAADLYTVTLG LPVNHYIHLLVIFSGILTVFNYVFISVKKGKNGCLFKEEVFFIYASHGLFIAFLQKTVLK IMRPITEVEFLAIYFLVPILTITISIGMYGCMKRVTPKVLSILVGGR >gi|226332203|gb|ACIC01000117.1| GENE 47 46568 - 46885 95 105 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSSRTYHKGKSKGKPFPSHAQYPDKTLGRVTSSKETLGNKPCKAGILTNIKMRRHKVFHK TYGINKDQNRFFIFVRYSFSLKVPSLKNRYPETMKKIGTAIHGIT >gi|226332203|gb|ACIC01000117.1| GENE 48 47052 - 48164 789 370 aa, chain - ## HITS:1 COG:SP1366 KEGG:ns NR:ns ## COG: SP1366 COG0438 # Protein_GI_number: 15901220 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Streptococcus pneumoniae TIGR4 # 7 370 17 365 367 134 28.0 4e-31 MMPALPGGGAEKVLIDILKNFDYASYEVTLFLEYKEGVYLNDVPEEVRILALHSQNTIWF ERFHRVLRIFHSYVLFHTLIYKYMFMKLLKGERFNTIVSFMEGAAVKFHSYIIHKANNNL SWVHIDLKQKHWSLDFFRNEKDEFKVYRKMDKIVFVSEDVKRMFLELYAIENDKCKVIYN LIDKNVIQRLAISNNAIKRKFTICMVGRLNRQKRYDRALKVAKRLKSDGCDFDLWIIGEG SLEKSLKAMSHEYGMDDCVHFLGFQKPSYPYMKEADIYLNTSEAEGYPLVVCEALCLGLA IVATDISGASEILASSEYGLLVSEKEEDIYNGLKRLMDDAALYGDYRAKALQRAEMFDVP TTMAQIYKVL >gi|226332203|gb|ACIC01000117.1| GENE 49 48183 - 49202 211 339 aa, chain - ## HITS:1 COG:no KEGG:Gmet_1497 NR:ns ## KEGG: Gmet_1497 # Name: not_defined # Def: glycosyl transferase family protein # Organism: G.metallireducens # Pathway: not_defined # 11 316 10 312 328 122 27.0 3e-26 MGKNISIPFLKNDMQMKYTAVIRTLGKGGKNYQRLLNSLLEQSIRPTSVLVYIAEGYPLP KETVGIEQCVYVKKGMVAQRALAYDEVSTEYCLFLDDDVYLPHHAVETFYNELVGQDAQV ISPCVFANHEVAIKDKIRSSILGREVCRLWSRKWGYKVLRTAGFSYNNHPVKPVYESQSN AGPCFFCRKEDFLKIHYEEELWLDKTYYAFPDDQVMYYKMYKTGLKILTSFDSGAIHLDA SSTVEKSKEKTEKLIYSEYRNKLIFWHRFIFLPEKNLLLKFWSVIAISYVYGIQGMKYGV KYLFDDKDMSVAYINGVADGFSFLKSEGYKKLPKIKSCR >gi|226332203|gb|ACIC01000117.1| GENE 50 49165 - 50151 377 328 aa, chain - ## HITS:1 COG:SP1365 KEGG:ns NR:ns ## COG: SP1365 COG0463 # Protein_GI_number: 15901219 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
Glycosyltransferases involved in cell wall biogenesis # Organism: Streptococcus pneumoniae TIGR4 # 4 203 6 207 328 115 34.0 1e-25 MKTISVIVPVYNTEAYLDRCIKSLFCQSYADLEIILVDDGSKDGSLRICREWEQRDSRIQ VISQPNLGVSSARNEGIRKSTGDLIMLLDSDDWLAADACEKLLALIKGKNADCIVCGLKQ TSGNIWAPAFDKDYSDLASFKCDFIYWINTELLSSSVNKIYKRGLITELYPENMSFGEDL VFVLNYLKHCNRISFTQEPLYQHEVYNSVSLTHSFSPARFSNLEDIQKAILDFADDKNKI NPRIYDKYVKDALHLTRMWYKNKNVPYIRKKEIIGEWLKQSYIRKLKISDYQLHWKDRML LHCLHMSCFIGINLIVNGKEYINSLSKE >gi|226332203|gb|ACIC01000117.1| GENE 51 50154 - 50426 96 90 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570619|ref|ZP_04848027.1| ## NR: gi|253570619|ref|ZP_04848027.1| predicted protein [Bacteroides sp. 1_1_6] # 25 90 1 66 66 128 98.0 1e-28 MLHPTITPIDSLRGIFDLLIVWHHLTLTEVSPYHYDFGSTIVLFFFILSGYGISLSWKDK ISEKGGGKKFLSRPCSGYLLKVANKMFNEL >gi|226332203|gb|ACIC01000117.1| GENE 52 50429 - 51289 311 286 aa, chain - ## HITS:1 COG:no KEGG:BT_2870 NR:ns ## KEGG: BT_2870 # Name: not_defined # Def: putative glycosyltransferase # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 278 2 282 288 238 45.0 3e-61 MEKHAFLIIAHTDWSLLKTLVSLLDYELNDIYIHIDAKVPAKAIPDIICSKSNLYMLEHR ISVAWGDISVVEAEYLLFEIAYNNSHYGYYHLLSGVDLPLKSKEYIYSFFMQSGKEFIGF CPYNDTLSDIRVRTYHFFVSKMRNNRFYRLLDRIIAKCLVVFDCLRNKEIYFRKGSTWVS VTNDFVRYMLVNKTMVLELYKHTFGADEFFIQTLCWNSRFRNSVYDLNDEYNGCQRLIDW ERGWPYTWQEKDYNELIASEYLFARKFSSENAELINRLTTFLNTQN >gi|226332203|gb|ACIC01000117.1| GENE 53 51363 - 52481 750 372 aa, chain - ## HITS:1 COG:no KEGG:BDP_1842 NR:ns ## KEGG: BDP_1842 # Name: not_defined # Def: polysaccharide pyruvyl transferase # Organism: B.dentium # Pathway: not_defined # 3 361 2 367 375 177 32.0 5e-43 MKKVGIITFHAAYNYGSMLQAYALQQVILSMGYDCEIINFRSPAQKRQYKPIFVVGSLYG RCVRFIIQAAYVWGILKKQRLFEQFLNSELKLSYNEYGTLEDLENAGFNYDYYISGSDQI WNVYCNDFNYAYFLPFVKSGKRIAYAPSMGAQLSIKTYKDRKKVIDLLNQYDAISVREAV GARYIGEITKMPTASVLDPTLLLNPQEYDNLIDDKPLIKGEYVFIYSPNFTEKVNEMAEA LGDKYNKQVVISQGLISKNAMLKWGRKFNIYTATGPKEFLNLCKNASIICCDSFHAVAFS ILFKKCFFVLDGMKDNRISNLLQITHLQNRNFSSPNEYFNAPLQMDFTQPLGALEIERRK 
SLEWLKEHLDVC >gi|226332203|gb|ACIC01000117.1| GENE 54 52486 - 53952 694 488 aa, chain - ## HITS:1 COG:no KEGG:BVU_2391 NR:ns ## KEGG: BVU_2391 # Name: not_defined # Def: putative transmembrane protein # Organism: B.vulgatus # Pathway: not_defined # 1 483 26 511 512 385 44.0 1e-105 MIVALYTSRKILETLGVDDFGIFNVVGGIISLMSFINGSMSVATQRYLTYELGRGTEGQF NKVFNMAVYIHAIIAVIVLIAAETVGVWFVNTQLNIPEVRMEAANWVFQATILTTILGIL QTPYNAAIISHEHMHVYAYVGLGETFAKLFIVWGLFYYPYDRLAIWGFAIFLLQFLIAMT YRMYCSRQFPECRLHLRWNKAIFNSMVQFTGWNMFGTIAWLLKDQGVNILMNLFGGPVAN AARGVSCQVSGAIQNLTGGFQSAVNPQLTKRYAAHESEETCDLMCKSSKISYFLLFTIAL PVMMETDFILDLWLVEVPPMTASFTRIIIIEALLNAFGGPMITSLMATGNIKWYQIVVSS SLLFIIPVAYLLLKSGYSIETPLIVSVIFIMIGNIVRLMFCKKQLGLAFRQYTWNVVFPI AGVSALAIIFPLGIHVYMTEGWSRLLITIFVSCTAMGILTYTIGLTASERTFIISYVTPK AKKFFHID >gi|226332203|gb|ACIC01000117.1| GENE 55 54005 - 55168 256 387 aa, chain - ## HITS:1 COG:no KEGG:BVU_2392 NR:ns ## KEGG: BVU_2392 # Name: not_defined # Def: F420H2-dehydrogenase, beta subunit # Organism: B.vulgatus # Pathway: not_defined # 6 372 4 379 379 298 42.0 2e-79 MISVAIKQKVNCSGCNACAEVCPKHCIEMVPDKKGFFYPKVDAVTCIDCGACEKVCPFQD GNIKLDTPLTAYAAWNKDREQYLASSSGGAAHVFSSHIIKRGGVVYGCTSEGMHIRHIHV DFLSKLSKLQGSKYVQSDVRGVFSLVKADLKAGKPVLFIGTPCEVAGLKKYIKRIPEDLY LVDLICHGVPSQQMLYEHINHILNGRSAERLSFRKGQSFHIELTDQYGTVYSSEPHRDMY YRAFLGGISYRESCYECPFARRERVSDITIGDFWGLQDAASLPLEISEGISVLLPSSEKG KSLIAAAKSDMWIYERSVEEAVEGNTQLYRPVHNGLSARLLSMLYPCFPFDKAVRIVIVK DLILESLKNILRSFKPVLMPIINVIRR >gi|226332203|gb|ACIC01000117.1| GENE 56 55493 - 56509 303 338 aa, chain - ## HITS:1 COG:no KEGG:BT_0037 NR:ns ## KEGG: BT_0037 # Name: not_defined # Def: putative transcriptional regulatory protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 338 13 351 351 417 60.0 1e-115 MGDNISNGTGQALVSEVRGLSKGKGGANWCYLFIHHTKVNAVNEKLKERYCTFIHKSVVY KRKNKRIAKDEKPTISGLVFVQGESDDIQSFLCENFFGLYLVKDCSTGKIAVIHNNVMQS FMQVSQVEPARIRFMPHTFDYYSSGNTLVRITSGPLTGLEGYRIRISRDKCLVTSIGGMT VAIGGIYKESFENIDEYVRQRRELLCKESQEECFTLTSCQSEIAQCFFTPQNQMDVMAIA 
SALNLWIVKAESYIKEKNFNKAVEITLFILNKIGNCFHSIYSDPRIKNFKEINTVCCDAD KILISVLNSVDVSVDLKEIIKSERRSLVICYPFLPMEL >gi|226332203|gb|ACIC01000117.1| GENE 57 56578 - 56775 81 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLFLRNSSVAFFLLIKSNTSSIPPIKKIKINIFNQILKYKNTLLKLRLLIRDEYIILIVD YKIIM >gi|226332203|gb|ACIC01000117.1| GENE 58 56931 - 58136 785 401 aa, chain - ## HITS:1 COG:no KEGG:BDI_0750 NR:ns ## KEGG: BDI_0750 # Name: not_defined # Def: transposase # Organism: P.distasonis # Pathway: not_defined # 1 398 1 399 402 454 59.0 1e-126 MTSVKIKFRASTMADKEGYLYYQIIHNRIIRQISTDYRIFSSEWDNNSGMIILYPEGVGT ERASIVLSIQERIRWDKVRLNKIIESLDNQESGYTADDVVQQFNDRAKEQSFFDFMLNVI VGLRRLGKVRTSETYTAALKSFMRFRNERDVMLEEIDSEMMMLYEAWLKERDISMNTISF YMRILQATYNRAIEKELTVQRYPFKHVYTGVEKTVKRAVPFKYIKKIKDLELPAGSSIDF ARDMFLFSFYTRGMSFVDMAYLKKKNLKNGVLTYRRRKTKQQLTIKWEKSMQEIVDKYPD TNMYLLPIIREVENERKQYENALHLINNKLKEISIMINLAARLTMYVARHSWASVAKSKN IPISVISEGMGHDSESTTQIYLASLDNSVIDKANKLILKNL >gi|226332203|gb|ACIC01000117.1| GENE 59 58733 - 59563 448 276 aa, chain - ## HITS:1 COG:no KEGG:Metev_0644 NR:ns ## KEGG: Metev_0644 # Name: not_defined # Def: radical SAM domain-containing protein # Organism: M.evestigatum # Pathway: not_defined # 10 253 2 234 238 97 29.0 6e-19 MKTEKNYHGKAIYQPSGKAAEYGEWACNFHIGCSNLCDYCFCPAALRPGLWSSTVTLKKA FKDEQDAIAVFGKEFSQNLQSLRGAGLFFSFTTDPLLPETMELTAQGVKTCVENGVNVKV LTKRADFIDNFFGLLASYGNFDEEQYREYTAFGFTLTGHDELERGASSNIERISAMEELH NRGYRIFASIEPIVDFPSSMQMINSSLPFCDLYKIGLMSGNGVTYDPVEAHTFLSELQQL PGQPKIYLKNSLVRFLGVDRKILPENFVESNYNMFG >gi|226332203|gb|ACIC01000117.1| GENE 60 59811 - 60767 572 318 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570627|ref|ZP_04848035.1| ## NR: gi|253570627|ref|ZP_04848035.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 318 1 318 318 632 100.0 1e-180 MDNAFKVRTFKEVIADKNNFKEPGAVLGDFIKEGELSVIGVAANSSETAFCYDVAFANAS GLCHWEEPVSDKIRKTLCVDFELSDSQIARRYANVPDFVSCSVRRAHPVSSAHGCSPEDT IRNIEKLVEENRPELVIIDDLGALTGNAVSVSVVKKTMKGLNHIRESFELTMILVAHFRK RNGRNPIEISDIEGSSVICNYTDSIVAIGSSVEGTEIKYLKQLKTRSAQKMSEVAVMCLE DVPWLHFDFIRFDDEFNHLVNSQKSRSTITDFMGENIVRLNSEGFSIRSIAEMLGLSKSV VGRFLKDRNSYSPQLNYD >gi|226332203|gb|ACIC01000117.1| GENE 61 60820 - 61935 767 371 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570628|ref|ZP_04848036.1| ## NR: gi|253570628|ref|ZP_04848036.1| predicted protein [Bacteroides sp. 1_1_6] # 1 371 1 371 371 700 100.0 0 MLENNQNPVTEVSEQGAVQATVENVNNGMIPQGSATPDSNSSINAVVPNGTASSGTLLSG TVLQNSYLKMWDGEPLSEGKLYHRNEMIGLPECSIMENRDFLGRAEEWQNKCREVGMAQP GKYVRARIVKQAGYTPALFNKEKGEWERVSGDLLDRYYCRMDGHGRAAGHDLELAEAMKN PAYQPFDFIFFFEDICDPDIFRKQFVSINFDTKKTTNAELAGYAAAVYKNADTQYYNDLL KTGYVAKAAAYYAYAKEPSRDDMKKINEGTSVSVDRPMVDAMKRALAVYRKIFTGKASSK LLNGVPLARWTYNRLKQETDKEKLLKIITQKFSVLTPQYVAVLQEARGVKGDRTRTTEIV LGELFDNILKD >gi|226332203|gb|ACIC01000117.1| GENE 62 62921 - 63133 138 70 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLTVFIAAIHHNPTEKVFMDEGDNTIILHGTRQYFNNGIEELFQDQVNTVFIIITNISLD TSPKLGEKYF Prediction of potential genes in microbial genomes Time: Thu May 12 02:21:38 2011 Seq name: gi|226332202|gb|ACIC01000118.1| Bacteroides sp. 1_1_6 cont1.118, whole genome shotgun sequence Length of sequence - 20493 bp Number of predicted genes - 19, with homology - 18 Number of transcription units - 12, operones - 5 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 240 136 ## - Prom 270 - 329 4.1 + Prom 1440 - 1499 6.1 2 2 Op 1 8/0.000 + CDS 1525 - 2403 1162 ## COG1561 Uncharacterized stress-induced protein 3 2 Op 2 . + CDS 2427 - 2996 459 ## COG0194 Guanylate kinase 4 2 Op 3 . + CDS 2993 - 3604 478 ## COG1057 Nicotinic acid mononucleotide adenylyltransferase - Term 3514 - 3549 -1.0 5 3 Tu 1 . 
- CDS 3609 - 4664 799 ## COG1408 Predicted phosphohydrolases - Prom 4756 - 4815 7.0 + Prom 4629 - 4688 6.3 6 4 Tu 1 . + CDS 4778 - 5674 823 ## COG1575 1,4-dihydroxy-2-naphthoate octaprenyltransferase + Term 5893 - 5924 -1.0 - Term 5583 - 5629 7.1 7 5 Op 1 16/0.000 - CDS 5676 - 6812 1414 ## COG1088 dTDP-D-glucose 4,6-dehydratase 8 5 Op 2 . - CDS 6829 - 7704 782 ## COG1209 dTDP-glucose pyrophosphorylase 9 5 Op 3 . - CDS 7782 - 8273 402 ## COG0622 Predicted phosphoesterase - Prom 8332 - 8391 7.0 + Prom 8160 - 8219 3.1 10 6 Op 1 . + CDS 8329 - 10407 1758 ## COG0855 Polyphosphate kinase 11 6 Op 2 . + CDS 10480 - 12732 2168 ## BT_2020 putative phosphate/sulphate permeases + Term 12830 - 12878 9.6 - Term 13112 - 13172 9.2 12 7 Tu 1 . - CDS 13387 - 13914 493 ## BT_2021 putative non-specific DNA-binding protein - Prom 14147 - 14206 5.4 13 8 Tu 1 . + CDS 14190 - 14447 213 ## gi|253570642|ref|ZP_04848050.1| conserved hypothetical protein 14 9 Tu 1 . - CDS 14552 - 16714 1440 ## COG0642 Signal transduction histidine kinase - Prom 16962 - 17021 6.4 15 10 Tu 1 . - CDS 17283 - 17948 421 ## BT_2024 hypothetical protein - Prom 18057 - 18116 6.2 + Prom 17909 - 17968 5.9 16 11 Op 1 . + CDS 18117 - 18455 247 ## gi|160883429|ref|ZP_02064432.1| hypothetical protein BACOVA_01398 17 11 Op 2 . + CDS 18452 - 18721 221 ## gi|253570645|ref|ZP_04848053.1| predicted protein + Term 18864 - 18913 -1.0 + Prom 18850 - 18909 5.4 18 12 Op 1 . + CDS 18929 - 19897 502 ## BT_2026 endonuclease 19 12 Op 2 . 
+ CDS 19894 - 20331 392 ## COG3023 Negative regulator of beta-lactamase expression Predicted protein(s) >gi|226332202|gb|ACIC01000118.1| GENE 1 3 - 240 136 79 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGKGTQKIQLMYYGMVYTILQHNGFFLRGKNASPINITCSKYCQLFSQNRKSLSNNIYTF NFYDIEKEKGSKVWIKTYD >gi|226332202|gb|ACIC01000118.1| GENE 2 1525 - 2403 1162 292 aa, chain + ## HITS:1 COG:CAC1716 KEGG:ns NR:ns ## COG: CAC1716 COG1561 # Protein_GI_number: 15894993 # Func_class: S Function unknown # Function: Uncharacterized stress-induced protein # Organism: Clostridium acetobutylicum # 1 291 1 291 292 132 32.0 6e-31 MIQSMTGYGKATAELPDKKINVEIKSLNSKAMDLSTRIAPAYREKEIEIRNEISKVLERG KVDFSLWIEKKESAESATPINQVLVEGYYKQIQAISENLGIPVPTDWFQTLLRMPDVMSK TEIQELTEEEWEMVRATVLEAIGHLVDFRKQEGAALEKKFREKIANIALLLEKITPYEKE RVEKVKERITDALEKTLNTDYDKNRLEQELIYYIEKLDVNEEKQRLTNHLKYFISTLESG NGQGKKLGFIAQEMGREINTLGSKSNHAEMQKIVVQMKDELEQIKEQVLNVM >gi|226332202|gb|ACIC01000118.1| GENE 3 2427 - 2996 459 189 aa, chain + ## HITS:1 COG:RSc2155 KEGG:ns NR:ns ## COG: RSc2155 COG0194 # Protein_GI_number: 17546874 # Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Ralstonia solanacearum # 3 183 20 198 221 145 42.0 4e-35 MTGKLIIFSAPSGSGKSTIINYLLTQNLNLAFSISATSRPPRGTEKHGVEYFFLTPEEFR CRIENNEFLEYEEVYKDRYYGTLKEQVEKQLEKGQNVVFDLDVVGGCNIKKYYGERALSI FVQPPSIEELRCRLTGRGTDEPEVIECRIAKAEYEMTFAPQFDRVIVNDDLEAAKAETLK VIKEFLNKE >gi|226332202|gb|ACIC01000118.1| GENE 4 2993 - 3604 478 203 aa, chain + ## HITS:1 COG:BS_yqeJ KEGG:ns NR:ns ## COG: BS_yqeJ COG1057 # Protein_GI_number: 16079618 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid mononucleotide adenylyltransferase # Organism: Bacillus subtilis # 11 194 3 183 189 107 34.0 2e-23 MKESLKRQKLKTGIFSGSFNPVHIGHLALANYLCEYEGLDEIWFMVSPQNPLKAGTELWP DDLRLRLVELATEEYPRFRSSDFEFHLPRPSYSVHTLEKLHETYPERDFYLIIGSDNWAR FDRWYQSERIIKENRILIYPRPGFPVNENGLPETVRLVHSPTFEISSTFIRQALDEKKDV RYFLHPKVWEYIREYIRQSIREH >gi|226332202|gb|ACIC01000118.1| GENE 5 3609 - 
4664 799 351 aa, chain - ## HITS:1 COG:mll3894 KEGG:ns NR:ns ## COG: mll3894 COG1408 # Protein_GI_number: 13473337 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Mesorhizobium loti # 106 346 47 311 312 111 30.0 2e-24 MKKINYLFIIFVLFLFVSCKSKKNVVSILPRPVLNVDSVRPDSSAAIDGLFAPDHSQLAD LKVSTKKQKKTAKKEAATEQEERNLLLRGTKITSSTVNVPATYSGIDRVVEYDFTHRDVP EAFEGFRIAFISDLHYKSLLKEKGLNNLVDLLIAQKPDVLLMGGDYQEGCEYVEPLFAAL ARVKTPMGTFGVMGNNDYERCHDEIIRTMKHYGMRPLEHEVDTLRKDGQQIILAGVRNPF DLKQNGVSPTLALSPNDFVILLVHTPDYVEDVSVANTDIALAGHTHGGQVRVFGYAPIQN SHYGTRFLTGLAYNSTKMPLIVTNGIGTSQMPVRIGAPAEIIMITLHRLKE >gi|226332202|gb|ACIC01000118.1| GENE 6 4778 - 5674 823 298 aa, chain + ## HITS:1 COG:VNG1075G KEGG:ns NR:ns ## COG: VNG1075G COG1575 # Protein_GI_number: 15790173 # Func_class: H Coenzyme transport and metabolism # Function: 1,4-dihydroxy-2-naphthoate octaprenyltransferase # Organism: Halobacterium sp. NRC-1 # 10 296 11 311 311 182 38.0 9e-46 MEEVKRNSLRAWILAARPKTLTGAITPVMIGSALAYMDGHFQWLPALICCLFAGLMQIAA NFINDLFDYLKGTDREDRLGPERACAQGWISPRAMRNGIIVTVLFACLIGSALLFYAGWE LIIVGLLCVLFAFLYTTGPYPLSYNGWGDVLVIVFFGFVPVGGTYYVQALTWTSDTTVAS LICGLLIDTLLVVNNYRDREADARSGKRTVIVRFGEKFGRYLYLMLGVAASLLCFWFFFE GHLFAALLPLLYLIPHVLTWRRMVQIHSGKKLNSILGETSRNMLLMGILLSIGFLIRH >gi|226332202|gb|ACIC01000118.1| GENE 7 5676 - 6812 1414 378 aa, chain - ## HITS:1 COG:FN1667 KEGG:ns NR:ns ## COG: FN1667 COG1088 # Protein_GI_number: 19704988 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Fusobacterium nucleatum # 1 377 1 397 399 504 61.0 1e-142 MKTYLVTGAAGFIGANYIKYILAKHSDIKVVVLDALTYAGNLGTIAKDIDNERCFFIKGD ICSREVVDSLFAEYRFDYVVNFAAESHVDRSIENPQLFLITNILGTQNLLDCARRAWVMG KDEQGYPTWRKGVRYHQVSTDEVYGSLGAEGYFTEETPLCPHSPYSASKTSADMVVMAYH DTYKMPVTITRCSNNYGPYHFPEKLIPLIIKNILEGKHLPVYGDGSNVRDWLYVEDHCKA IDLVVREGKEGEVYNVGGHNEETNLEIVKLTIATIRKLMTEKPEYRQVLKKKVKGEDGEI SVDWINNDLITFVKDRLGHDQRYAIDPTKITNALGWYPETKFEDGIVKTIVWYLENQDWV EEVTSGDYQGYYEKMYGK 
>gi|226332202|gb|ACIC01000118.1| GENE 8 6829 - 7704 782 291 aa, chain - ## HITS:1 COG:MTH1791 KEGG:ns NR:ns ## COG: MTH1791 COG1209 # Protein_GI_number: 15679779 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Methanothermobacter thermautotrophicus # 1 287 1 287 292 398 63.0 1e-111 MKGIILAGGSATRLYPLSKAISKQIMPVYDKPMIYYPLSTLMLAGIREVLVISTPRDLPM FRDLLGSGEELGMSFSYKIQEQPNGLAQAFVLGADFLNGEAGCLILGDNMFYGQGFSAML RRAAGVEKGACIFGYYVKDPRAYGVVEFDEQGKVISLEEKPLVPKSNYAVPGLYFYDATV TEKAASLRPSARGEYEITDLNRLYLEEGTLNVELFGRGFAWLDTGNCDSLLEASNFVATI QNRQGFYVSCIEEIAWRQGWISSGQLLLLGQKLEKTEYGKYLIELSKQPLK >gi|226332202|gb|ACIC01000118.1| GENE 9 7782 - 8273 402 163 aa, chain - ## HITS:1 COG:PA0351 KEGG:ns NR:ns ## COG: PA0351 COG0622 # Protein_GI_number: 15595548 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Pseudomonas aeruginosa # 3 137 9 134 157 61 33.0 6e-10 MTRIGLLSDTHAYWDEKYLEYFESCDEIWHAGDIGSVEVAEKLAAFRPLRAVYGNIDGQE IRKMFPQVNRFTVDGAEVLIKHIGGYPGKYDPSVIGSLMARPPKLFISGHSHILKVKYDK TLDMLHINPGAAGMSGFHKVRTMVRFVIDNGTFKDLEVIELAG >gi|226332202|gb|ACIC01000118.1| GENE 10 8329 - 10407 1758 692 aa, chain + ## HITS:1 COG:ECs3363 KEGG:ns NR:ns ## COG: ECs3363 COG0855 # Protein_GI_number: 15832617 # Func_class: P Inorganic ion transport and metabolism # Function: Polyphosphate kinase # Organism: Escherichia coli O157:H7 # 7 685 7 682 688 426 38.0 1e-119 MESKYNYFKRDISWLSFNYRVLLEALDEHLPLYERINFISIYSSNLEEFYKIRVADHKAV ASGATESDEETVQSARELVEEINHEVNRQLDDRVRIYEEKILPALRKNHIIFYQDRHVEP FHQQFIKDFFREEIFPYLQPVPVSKDKIVSFLRDNRLYLAIRLYPKENRNPANRQPFYFV MKQPYAKVPRFIELPPRGDNYYIMFTEDIIKANLNLIFPGYDVDSSYCIKISRDADILID DTASSADLVAQLKKKVKKRKIGDVCRFVYDRAMPQDFLDSLIDAFHIHRDELVPGDKHLN LEDLRHLPNPNKSLRRIEKPQPMKLNILDEKESIFNYVAQKDLLLYYPYHSFEHFTHFLY EAVHNPETREIMVTQYRVAENSAVINTLIAAAQNGKKVTVFVELKARFDEENNLATAEMM QAAGIKIIYSIPGLKVHAKVALVRRRGLNGEKIPSYAYISTGNFNEKTATLYADCGLFTC RKEIVNDLYNLFRTLQGKEDPKFTTLLVARFNLIPELNRLIDREISLADQGKGGRIILKM 
NALQDPAMIDRLYEASEHGVQIDLIVRGICCLIPEQSYSRNIRVTRIVDSFLEHARIWYF GNEGHPKVYMGSPDWMRRNLYRRIEAVTPILDPDLRASLIEMLHIQLADNQKACWVDDKL QNVFKKRASGTPAVRAQYDFYEWLKKDIVTEM >gi|226332202|gb|ACIC01000118.1| GENE 11 10480 - 12732 2168 750 aa, chain + ## HITS:1 COG:no KEGG:BT_2020 NR:ns ## KEGG: BT_2020 # Name: not_defined # Def: putative phosphate/sulphate permeases # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 750 1 750 750 1442 100.0 0 METIYLCIIIFLFVLAVFDLIVGVSNDAVNFLNSAVGAKAASFKTILFIAGIGIFIGASL SNGMMDIARHGIYQPEHFYFAEIMCILLAVMLTDVVLLDVFNSMGMPTSTTVSLVFELLG GTFALSLIKVNNDATLALGDLINTDKALSVIMAIFVSVAIAFFFGMLVQWLARIVFTFNY TKNIKYSIGLFGGIAATSIIYFMLIKGLKDSSFMTPENKQWIQDNTLLLIASFFVFFTAL MQVLHWLKVNVFKVVVLMGTFALALAFAGNDLVNFIGVPLAGYSSFIDYTTNGAGASPDS FLMTSLLGPAKTPWYFLIGAGAIMVYALCTSKKAHAVIKTSVDLSRQDEGEENFGSTPMA RTLVRFSMTLANGTSRIMPESTKQWINSRFRKNEAIIADGAAFDLVRASVNLVLAGLLIA LGTSLKLPLSTTYVTFMVAMGTSLADRAWGRDSAVYRITGVLSVIGGWFITAGAAFTICF FVALVLHYGGNISIIALIGIAIFILIRSQVMYKKRKAKEKGNETLKQLMQTADSEEALQL MRKHTREELAKVLEYAETNFELTVTSFIHENLRGLRRAMGSTKFEKQLIKQMKRTGTVAM CRLDNNTVLEKGLYYYQGNDFASELVYSISRLCEPCLEHIDNNFNPLDAIQKGEFSDVAE DITYLIQQCRKKLENNDYQNMEEEIRRANDLNGQLSQLKRKELQRIQSQSGSIRVSMVYL TMVQEAQNVVTYIINLMKVSRKFQMETEMP >gi|226332202|gb|ACIC01000118.1| GENE 12 13387 - 13914 493 175 aa, chain - ## HITS:1 COG:no KEGG:BT_2021 NR:ns ## KEGG: BT_2021 # Name: not_defined # Def: putative non-specific DNA-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 175 1 175 175 320 100.0 2e-86 MPLIYKPYQANIANKAGQKLYYPRLVKFSKMVNTQKMAELIAEKASLTPGDVHNVIRNLM SVMREQLLNSRTVRLEGLGTFTMIAKAGGKGVVLESKVSSSQIVSLRCQFTPEYTRSADG VTTRALTSGVEFVHVKDVAGGFVDDDDKNHSGGGDNPGGGSTPGGDDDEAPDPTV >gi|226332202|gb|ACIC01000118.1| GENE 13 14190 - 14447 213 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570642|ref|ZP_04848050.1| ## NR: gi|253570642|ref|ZP_04848050.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 85 1 85 85 144 100.0 2e-33 MKAKIMKPTSEENATLPRIQAYSKSQLATLYLPHIQPASARRTLRSWIAKNTALQNELAR TGYSEKAILLTPAQVGLIFRFLGEP >gi|226332202|gb|ACIC01000118.1| GENE 14 14552 - 16714 1440 720 aa, chain - ## HITS:1 COG:MA4377_3 KEGG:ns NR:ns ## COG: MA4377_3 COG0642 # Protein_GI_number: 20093164 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 464 718 1 260 311 178 41.0 3e-44 MEARITRKTAALAIFLISAFQTYIYAGRPSSCQLLIINSYTENCLWSDDFMAPVYKEFRV QNSPVDICAEHMELMATVVPDMRKQVFMSDRRWISAQCRKEAEEVMPFLLIGGYFWTDSE IKKHLLPVIKSRLDGASHPHRVETTAMGTPPSVINYADYVESGLLLGLCPDDTVFGMKPP TFFERNKYYLALFFSLMALAVIYVIWLRRALHERSRRLEIMRSYSSLVENMPVLYARVEL IFDPGARIIDYVYREVNPTFEKYILPKEKILGKKYSELNPDYSPELPDRYSELNDNRQIT FQYYLEKTKTHLTVISIHSKTKGCVDVFGVDNTELVLTQQMLKSTNHKLSAALDAADMTP WKWDLQTGLLSCNVSHDLYVTEEEVTHDGNLIIVPTSACFAKICDEDRERVRDAFERLAN GETQKMREEYRVGRQWLPSPQQNEWVEVRAAVDERDANGKPLSLIGTSMTVTQRKEMEEA LVQAKVKAEEANTLKSSFLANISHEIRTPLNAIVGFSSLLVSAERGISEEKQEYINIIEN NNTLLLQLISDVLDLSKIEAGTMEFDYAPVDVHGLFIELEDTFRLRNKKSGICICYHRRT TECGVKADRNRLVQVMMNLMNNAVKFTGEGSIEFGFDVREDGFLHFYVTDTGCGIPEERL EEIFGNFVKLNSFVQGTGLGLTICRAIVERMGGKIGAVSRLGQGSTFWFTLPYTANEEKV >gi|226332202|gb|ACIC01000118.1| GENE 15 17283 - 17948 421 221 aa, chain - ## HITS:1 COG:no KEGG:BT_2024 NR:ns ## KEGG: BT_2024 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 221 1 221 221 345 98.0 1e-93 MNKKENCQNSSEREQHIEVKRTVNQKSTSQKKVNQKSMSQEKANQKSMSQKKANQSGEKS VYDENRDALQERVEEKQWKKKQCNGLQHSDDEEKQKVIDSIMKVSRDNGFDDAYLQQHSD CSASSIKRFHSAWMGKRMSNWTTIFNLAHCVSVNCVFAENLVGMLVVIIMFLIRDAGIVS YHIDSTRKVVIEINFGKDKLLRMKDEEKSEKGKEGKDDEHL >gi|226332202|gb|ACIC01000118.1| GENE 16 18117 - 18455 247 112 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160883429|ref|ZP_02064432.1| ## NR: gi|160883429|ref|ZP_02064432.1| hypothetical protein BACOVA_01398 [Bacteroides ovatus ATCC 8483] # 1 112 1 112 120 176 75.0 6e-43 
MTEMKKQHSLPRVRLMRRAAVIRRNNDMNRRDALIIAHRIGGLIRKMHRENVKFCYTKQD GTVRHAVGTLTGYQHSFHRPYMPRPENTFVVYYDLEAKGWRTFHAENFLCVE >gi|226332202|gb|ACIC01000118.1| GENE 17 18452 - 18721 221 89 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570645|ref|ZP_04848053.1| ## NR: gi|253570645|ref|ZP_04848053.1| predicted protein [Bacteroides sp. 1_1_6] # 1 89 14 102 102 129 100.0 4e-29 MSFEEQMRFGEQMSFGQQINFEMSFGQQISFGQQMNFGEQMTSEERKEWETLCRMYHVDS LYEAIQGCREELEFLVRCAERLEPRAEEE >gi|226332202|gb|ACIC01000118.1| GENE 18 18929 - 19897 502 322 aa, chain + ## HITS:1 COG:no KEGG:BT_2026 NR:ns ## KEGG: BT_2026 # Name: not_defined # Def: endonuclease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 322 1 322 322 644 99.0 0 MENWRFIEANSDYMVSDHGRILSFKGKSKLIISSSITAKGYEYVAIRQKGIYVGYSVHRL VATAFIPNPKRLPQVNHLDGNKLNNHVANLEWCDAYDNVMHAIRTGLRPSSPALSPVPCA TTDEAGNILQAYPSMNALVKGEQMNPKQRNWLVLHLLHPERLKQSAMKKKPRAERSGSEE CFLIAPEIPVLTVKADSIGEFEAGNPLGIGKSSNSVGEFEAGNPLGIGKSSNSVEEFEAG NPLGIGKSSNSVGEFKAGNVCVQPVSFSFVSSPFPLISLFAPIHHSVTQHYYRRLSPEES AIFGFNTYILPRERGQRKEFLQ >gi|226332202|gb|ACIC01000118.1| GENE 19 19894 - 20331 392 145 aa, chain + ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 43 141 1 98 116 102 48.0 3e-22 MKDTIIIHCSATRAGQDITAADIDCWHRARGFWSIGYHYVIRLDGTIEPGRDVTLDGAHC MGWNQRSIGICYVGGLDKEGRPADTRTDAQRTALIRLVKSLQLAFPNVKQVIGHRDTSPD LNGDGIISPNEYMKACPCFDVKKEF Prediction of potential genes in microbial genomes Time: Thu May 12 02:22:18 2011 Seq name: gi|226332201|gb|ACIC01000119.1| Bacteroides sp. 1_1_6 cont1.119, whole genome shotgun sequence Length of sequence - 1235 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 55 - 114 4.4 1 1 Tu 1 . 
+ CDS 165 - 1233 634 ## BT_2028 hypothetical protein Predicted protein(s) >gi|226332201|gb|ACIC01000119.1| GENE 1 165 - 1233 634 356 aa, chain + ## HITS:1 COG:no KEGG:BT_2028 NR:ns ## KEGG: BT_2028 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 335 1 335 335 581 99.0 1e-164 MARISKPGLDYFPLDVNFFQDRKVRRISNRHHAAGIAALTSLLCLIYKEKGFYVAWNQDT LFDISQEVCCEEEEMQAIIDDCLSVGLFDTYIYKEYGILTSQAIQEQYHKIITDSRRKYK LPLERFWLIKEEKDGTGNNSADIRSNINSKGTEVDEAENKIVDAGVNTTKTGVNITATEV TKTKPGIGTTGTNIHAAGTDIHAAGTGINATKTVTDGAAMKINAAKNREIAATIPQTKQE TDTEIESKSETDTEIQSKPKREMENDIKPEREKDKERERQSKTENEWEKDREQPPVPSNG VSQAAPVAVKGLSLEISSGKREEENKEERRNGGIDQKGEARYNGTQYERNQLEEPK Prediction of potential genes in microbial genomes Time: Thu May 12 02:22:36 2011 Seq name: gi|226332200|gb|ACIC01000120.1| Bacteroides sp. 1_1_6 cont1.120, whole genome shotgun sequence Length of sequence - 44320 bp Number of predicted genes - 45, with homology - 45 Number of transcription units - 20, operones - 12 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 294 188 ## gi|253570651|ref|ZP_04848059.1| conserved hypothetical protein + Prom 311 - 370 2.2 2 2 Tu 1 . + CDS 467 - 1369 699 ## BT_2030 hypothetical protein - Term 1285 - 1323 2.5 3 3 Tu 1 . - CDS 1568 - 1789 266 ## BVU_1841 hypothetical protein - Prom 1968 - 2027 8.2 + Prom 2011 - 2070 7.0 4 4 Op 1 . + CDS 2165 - 5125 3075 ## BT_2032 hypothetical protein 5 4 Op 2 . + CDS 5147 - 6709 1462 ## BT_2033 hypothetical protein + Term 6790 - 6833 6.3 6 5 Tu 1 . - CDS 7147 - 7545 598 ## COG0784 FOG: CheY-like receiver - Prom 7664 - 7723 7.8 + Prom 7563 - 7622 6.3 7 6 Op 1 . + CDS 7690 - 9327 165 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 + Prom 9330 - 9389 5.2 8 6 Op 2 . + CDS 9415 - 9996 461 ## COG4185 Uncharacterized protein conserved in bacteria 9 6 Op 3 . 
+ CDS 9986 - 10198 210 ## gi|253570660|ref|ZP_04848068.1| predicted protein + Prom 10216 - 10275 3.1 10 6 Op 4 . + CDS 10295 - 10705 127 ## BT_2037 hypothetical protein + Term 10706 - 10751 4.0 11 7 Op 1 11/0.000 + CDS 10792 - 12063 978 ## COG0845 Membrane-fusion protein + Prom 12067 - 12126 4.0 12 7 Op 2 . + CDS 12193 - 15321 3106 ## COG3696 Putative silver efflux pump 13 7 Op 3 . + CDS 15371 - 16564 1208 ## BT_2040 hypothetical protein + Prom 16842 - 16901 9.0 14 8 Tu 1 . + CDS 16924 - 17748 657 ## BT_2041 hypothetical protein + Term 17960 - 18004 4.3 - Term 17715 - 17749 -0.9 15 9 Tu 1 . - CDS 17823 - 19301 1065 ## COG0144 tRNA and rRNA cytosine-C5-methylases - Prom 19398 - 19457 3.3 + Prom 19224 - 19283 5.8 16 10 Op 1 . + CDS 19410 - 20162 756 ## BT_2043 hypothetical protein 17 10 Op 2 . + CDS 20159 - 20707 313 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 18 10 Op 3 . + CDS 20691 - 21020 401 ## BF3732 hypothetical protein + Prom 21154 - 21213 6.0 19 11 Op 1 . + CDS 21289 - 22560 817 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes 20 11 Op 2 16/0.000 + CDS 22598 - 23392 702 ## COG0207 Thymidylate synthase 21 11 Op 3 . + CDS 23397 - 23891 317 ## COG0262 Dihydrofolate reductase + Term 23925 - 23963 0.3 - Term 23816 - 23846 2.0 22 12 Tu 1 . - CDS 23911 - 24390 490 ## COG1522 Transcriptional regulators - Prom 24516 - 24575 5.2 + Prom 24413 - 24472 4.1 23 13 Op 1 . + CDS 24544 - 25821 905 ## BT_2050 hypothetical protein + Prom 25862 - 25921 4.8 24 13 Op 2 . + CDS 26002 - 26538 480 ## BT_2051 hypothetical protein + Term 26668 - 26712 10.5 - Term 26657 - 26698 10.4 25 14 Op 1 . - CDS 26720 - 27190 529 ## BT_2052 hypothetical protein 26 14 Op 2 . - CDS 27194 - 27787 536 ## BT_2053 hypothetical protein 27 14 Op 3 . - CDS 27823 - 28308 416 ## BT_2054 hypothetical protein 28 14 Op 4 . 
- CDS 28315 - 29118 862 ## COG0811 Biopolymer transport proteins - Prom 29333 - 29392 6.4 - TRNA 29194 - 29281 56.1 # Ser GGA 0 0 29 15 Op 1 . - CDS 29402 - 30178 658 ## COG0084 Mg-dependent DNase 30 15 Op 2 . - CDS 30172 - 30888 495 ## BT_2057 hypothetical protein 31 15 Op 3 . - CDS 30904 - 31878 947 ## COG0142 Geranylgeranyl pyrophosphate synthase - Term 31895 - 31926 4.1 32 15 Op 4 . - CDS 31949 - 32632 773 ## BT_2059 TonB - Prom 32652 - 32711 6.5 + Prom 32598 - 32657 5.8 33 16 Op 1 2/0.000 + CDS 32856 - 33545 197 ## PROTEIN SUPPORTED gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 34 16 Op 2 . + CDS 33564 - 34433 353 ## PROTEIN SUPPORTED gi|15895122|ref|NP_348471.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase + Prom 34477 - 34536 3.2 35 17 Tu 1 . + CDS 34562 - 35542 1331 ## COG0205 6-phosphofructokinase + Term 35588 - 35646 13.6 36 18 Op 1 . - CDS 35642 - 35944 141 ## gi|253570688|ref|ZP_04848096.1| predicted protein 37 18 Op 2 3/0.000 - CDS 36006 - 36857 752 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 38 18 Op 3 . - CDS 36854 - 38077 982 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family 39 18 Op 4 . - CDS 38116 - 38877 720 ## BT_2068 3-oxo-5-alpha-steroid 4-dehydrogenase - Prom 38929 - 38988 10.7 + Prom 38844 - 38903 6.9 40 19 Op 1 . + CDS 38927 - 39085 110 ## gi|253570692|ref|ZP_04848100.1| predicted protein 41 19 Op 2 . + CDS 39052 - 39210 289 ## gi|253570693|ref|ZP_04848101.1| predicted protein 42 19 Op 3 . + CDS 39214 - 39849 409 ## BT_2069 hypothetical protein + Term 39966 - 40006 11.5 - Term 39938 - 40004 13.5 43 20 Op 1 4/0.000 - CDS 40245 - 41588 1614 ## COG0372 Citrate synthase 44 20 Op 2 1/0.000 - CDS 41614 - 42804 1210 ## COG0538 Isocitrate dehydrogenases 45 20 Op 3 . 
- CDS 42819 - 44303 1278 ## COG1048 Aconitase A Predicted protein(s) >gi|226332200|gb|ACIC01000120.1| GENE 1 1 - 294 188 97 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570651|ref|ZP_04848059.1| ## NR: gi|253570651|ref|ZP_04848059.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 20 97 1 78 78 156 100.0 3e-37 APNESIASGGFSESLLRLNMDAIGIRNEQTVKGILALARRRKLGGPCGTLWKVLSSEYRS TLLKKNEPGDYILWALNHPAEFEDTYTGILKKRVRGR >gi|226332200|gb|ACIC01000120.1| GENE 2 467 - 1369 699 300 aa, chain + ## HITS:1 COG:no KEGG:BT_2030 NR:ns ## KEGG: BT_2030 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 300 1 304 304 523 97.0 1e-147 MSKFINPFTDYGFKLIFGREVSKDLLIEFLNDLLEGERVITDLTFLNNEQLPDYPEGRGI IYDVYCTTDTGEKIIVEMQNRMQSNFKERSIFYLSRAIVNQGRTGHDWKFEIKAVYGVFL MNFIMDKNIKLRTDVILADKETGELFSEKFRQIFIALPLFKKSEEECETNFERWIYILNN METLKRLPFKARKAVFEKLEEIADVASMSPKERELYDNSVKVYRDYLVTMDAAEKEGIKK GMKEGMKEGLKKGLEEGLKKGREEALNIFQTAIDMKKQGIDNQLIAEKTGLPLSLIESLK >gi|226332200|gb|ACIC01000120.1| GENE 3 1568 - 1789 266 73 aa, chain - ## HITS:1 COG:no KEGG:BVU_1841 NR:ns ## KEGG: BVU_1841 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 72 1 72 1002 112 75.0 6e-24 MKRKLMFFMTFLFVGIGLVTAQTSRVTGVVTAEEDGLPVVGASVLVNGTTLGTITDIDGK FTITNVPSSSKTY >gi|226332200|gb|ACIC01000120.1| GENE 4 2165 - 5125 3075 986 aa, chain + ## HITS:1 COG:no KEGG:BT_2032 NR:ns ## KEGG: BT_2032 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 32 986 1 955 955 1829 100.0 0 MTPQTVTITQGVIKVVLKSDAKSLDEVVVTAMGISREKKALGYAVQDVKSDQLTRAANTD LAGALQGKVSGVDIAPSSGMPGASSKITIRGSRSFTGDNTPLYVIDGMPISSAADVTTTD NANGAAYGTDYANRSVDIDPNDIESINILKGQAASALYGMRASNGVIVITTKSGKGVAKG KPTITFRSNLSFDVVSTLPELQNEFGQGSGGSYDPYSGHSWGPKIADLANDPTYGGNTDN AFTQKMGKRQGQYYVPQRAEAGLDPWATPKAYNNMKDFFDTGITWSNNVNVSQNLDKGNY SFSLGNSHQEGIIPTTGMDRYNAKMSAEVQLSPNWSTGFNGNFVTSKIKKQSTANSGVTA TIYNAPVSYNMKGIPSHVEGDPYEQNTYRQAWIDDAYWAVDNNLFTERSQRFFGNAFVKF 
TTKFGTDNHKLDVKYQLGDDAYTTNYSEIYGYGSTMADTGDAIEYHYSINELNSLLTASY RWDINKDWVFDALIGNELVEKRTQYAFSEGMNFNFPGWNHINNASIYQSSKSYNKKRTVG NFANLSLAWKNMVYLNGTIRNDVVSNMPRDNRSFTYPSVSLGFVFTELEPLKNNILTFGK LRASYAEVGMAGDYMQSYYYTPSYGGGFFNGTPIAYPINGTMAYIPYYKVYDPNLKPQNT KSYEIGADLTFFNGLFTLNYTYSRQNVKDQIFEVPLSGSTGYDSMIMNGGKIHTNSHELT LGVSPVNNRNFKLDFAFNFSKIDNYVDELAPGVESIYLGGFVTPQVRAGIGEKFPVVYGS TYKRNKAGQIVVDANGLPQVGEDDVIGRVSPDFRLGFNTNIELYKFRIAAVFDWKQGGQM YCGTAGEMNFYGVTKESGEKRKSNFVVPNTVKETGTDAQGNPTYAANDIEVTNAQAYYTR LRSINESYIYDSSFIKLRELSVSYPVYASKWLNVDVNVFARNLIVWSELKGFDPEASQGN DNMGGAFERFSLPGTASYGFGVNVKF >gi|226332200|gb|ACIC01000120.1| GENE 5 5147 - 6709 1462 520 aa, chain + ## HITS:1 COG:no KEGG:BT_2033 NR:ns ## KEGG: BT_2033 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 520 1 520 520 986 100.0 0 MKRYDKITGVLAVAALSLFSACSEDTMDRINKDHDHTTSVASRFILADVITSTAFSNASG DINTYASSYIEYEVGVDNQLYYAEVRENEPSSSSTFNNSWNGIYSSLKNARIIIDQCGEG GRDHGNDVTRGMAEVMAAYNCALIADFFGDAPCSQAALVDEKGSPVYMTPKMDTQQEIYT QIISYLDDAIANLQKEDLADVTEQDFLYAGDADKWLKFAYGLKARYTMRLINRSSNKSAD YEKVLDYVSKSFTSADDQAAFDIYDSNNINPFYGFYNSRAGFGASTSLGTKLLAYNDPRA NRAFFTPIVDKKRSQVAANDPSLVPAPNGSPDQSTSKYGISAFVYAKTAPTLLMSYHELM FLKAEALCRLNRDAEDALKEAVVAGLLNAENSISIAIKELGSGLNTNSSEVITETSAGKY FDDVVKAKYAANPLQETMIQKYLAMWGASGEATETYNDFRRMKGLNENFITLTNPNNSSK FPLRYPYGNSDTAANPEVKAAYGNGDYVYSEPVWWAGGSR >gi|226332200|gb|ACIC01000120.1| GENE 6 7147 - 7545 598 132 aa, chain - ## HITS:1 COG:slr2104_4 KEGG:ns NR:ns ## COG: slr2104_4 COG0784 # Protein_GI_number: 16330590 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Synechocystis # 13 128 10 126 130 79 39.0 1e-15 MEGGQTNEFRPLILVAEDDDSNFKLIKAIIGKKCDIQWAKNGEEIVALFKEYGNEAKAIL MDIKMPVMNGLDATKIIRGENKEIPIIMQTAYAFSSDKENAMNAGASEVLVKPITLSILR NTLTKYFPELVW >gi|226332200|gb|ACIC01000120.1| GENE 7 7690 - 9327 165 545 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium 
HTCC2080] # 330 520 7 214 305 68 25 8e-11 MISIDGLAVEFSGTTLFSDISFVINEKDRIALMGKNGAGKSTLLKILAGVRQPTRGKVSA PKDCVVAYLPQHLMTEDGRTVFAETAQAFAHLHEMEARIDALNKELETRTDYESDSYMAL IEEVSALSEKFYSIDATNYEEDVEKSLLGLGFTREDFQRQTSDFSGGWRMRIELAKLLLQ KPDVLLLDEPTNHLDIESIQWLEDFLINNGKAVIVISHDRKFVDNITTRTIEVTMGRIYD YKVNYSQYLQLRKERREQQQKAYDEQQKFIAETEAFIERFKGTYSKTLQVQSRVKMLEKL ELLEVDEEDTSALRLKFPPSPRSGSYPVTMEGVGKTYGDHVVFRNANLTIERGDKVAFVG KNGEGKSTLVKCIMNEIDHDGTLTLGHNVQIGYFAQNQASLMDENLTVFQTIDDVAKGEI RNKIRDLLGAFMFGGPEESMKKVKVLSGGERTRLAMIKLLLEPVNLLILDEPTNHLDMKT KDILKQALMDFDGTLIVVSHDRDFLDGLVTKVYEFGNKKVTEHLCGIYEFLEKKKMDSLQ ELEKK >gi|226332200|gb|ACIC01000120.1| GENE 8 9415 - 9996 461 193 aa, chain + ## HITS:1 COG:alr5363 KEGG:ns NR:ns ## COG: alr5363 COG4185 # Protein_GI_number: 17232855 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. PCC 7120 # 7 189 4 186 187 184 50.0 1e-46 MDETRQLYIISGCNGAGKTTASYTVLPEILLCKEFVNADEIAKGLSPFNPESMAIEAGRL MLKRIDELLATRTSFSIETTLATRSYTRLITRAQSAGYKVSLIYFWLNSPELAVNRVLQR VNEGGHNVPIDTIYRRYQAGINNLFRIYAPRVDYWLLADNSVSPRVIVAEGCQQGEDRIY ELELFNRIKSYVK >gi|226332200|gb|ACIC01000120.1| GENE 9 9986 - 10198 210 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570660|ref|ZP_04848068.1| ## NR: gi|253570660|ref|ZP_04848068.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 56 17 72 86 87 100.0 2e-16 MSNNEMQELSDKLRRGLQLAEQRLLERNARHGKLLSQGTPDGKVIYVSATELLERLQKQE KEKSKESEKE >gi|226332200|gb|ACIC01000120.1| GENE 10 10295 - 10705 127 136 aa, chain + ## HITS:1 COG:no KEGG:BT_2037 NR:ns ## KEGG: BT_2037 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 136 1 136 136 252 99.0 2e-66 MKGKKRFIVTMLFFINIIMLVAAVIPHHHHPNGMICMKQDQPVEQQCPNHHHHPASDSCC SSECMTRFHSPIPSVHTDSGPNYVFVATLFTDMIIEHLLRPQERRVKNYYIYRESLHGTK IPRTSSLRAPPYSVFA >gi|226332200|gb|ACIC01000120.1| GENE 11 10792 - 12063 978 423 aa, chain + ## HITS:1 COG:aq_468 KEGG:ns NR:ns ## COG: aq_468 COG0845 # Protein_GI_number: 15605952 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Aquifex aeolicus # 98 421 38 359 359 79 26.0 1e-14 MKKLIFIGVMGLFVLGSCNSKNTGHEGHDHETVTHNHDEHEGHDHEAEGHDHEAEGADHS HEGECSGGHDHGKAATSEPAGEHSDEIILPKAKADAAGVKVNAITPAPFQQVIKTSGQVL AAQGDESVAVATVAGVVSFRGKVTEGMSVGRGTPLVTISSHNIADGDPVQRARIAYEVSK KEYERMKSLVKNKIVSDKDFAQAEQNYENARISYEALAKNHSAIGQNITAPIAGYVKSIL VNEGDYVTIGQPLVSVTQNRRLFLRAEVSEKYYPYLRTISSANFRTPYNNEVYELNELSG RLLSFGKTSGDNSFYVPVTFEFDNKGEVIPGSFVEVYLLSSQLENVISVPRTALTEEQGI FFVYLQLDEEGYKKQEVTLGADNGKSVQILTGIKPGDRVVTEGAYQVRLASASNAIPAHS HEH >gi|226332200|gb|ACIC01000120.1| GENE 12 12193 - 15321 3106 1042 aa, chain + ## HITS:1 COG:all7618 KEGG:ns NR:ns ## COG: all7618 COG3696 # Protein_GI_number: 17158754 # Func_class: P Inorganic ion transport and metabolism # Function: Putative silver efflux pump # Organism: Nostoc sp. 
PCC 7120 # 1 1020 1 1019 1058 774 42.0 0 MLNKIIHYSLHNRLVVLCAAILLLIAGTYTAMHTEVDVFPDLNAPTVVIMTEANGMAAEE VEQLVTFPVETAVNGATGVRRVRSSSTNGFSVVWVEFDWGTDIYLARQIVSEKLAVVSES LPVNVGKPTLGPQSSILGEMLIVGLTADSTSMLDLRTIADWTIRPRLLSTGGVAQVAVLG GDIKEYQIQLDPERMRHYGVSMGEVMAVTQDMNLNANGGVLYEFGNEYIVRGVLSTPKVE ELGKAVVKTVNNFPVTLEDIADVKIGPKAPKLGTASERGKPAVLMTVTKQPATSTLELTD KLEASLQDLQKNLPADVKVSTDIFRQSRFIESSIGNVKKSLFEGGIFVVIVLFLFLANVR TTLISLVTLPLSLLVSILTLHYMGLTINTMSLGGMAIAIGSLVDDAIVDVENVYKRLREN RLKAEAERLSTLEVVFNASKEVRMPILNSTLIIVVSFVPLFFLSGMEGRMLVPLGIAFIV ALFASTVVALTLTPVLCSYLLGSNKTNKELKESFLARWMKGIYEKALTWVLAHKRVTLGS TIVLFLIALGVFFTLGRSFLPSFNEGSFTINISSLPGISLEESNKMGHRAEELLLTIPEI QTVARKTGRAELDEHALGVNVSEIEAPFELKDRPRSELVAEVREKLGTITGANIEIGQPI SHRIDAMLSGTKANIAIKLFGDDLNKMFSLGNQIKGAISDIPGIADLNVEQQIERPQLKI QPKREMLAKFGITLPEFSEYVNVALAGKVISQVYEQGKSFDLIVKVKDDARDEMEKIRNL MVDTNDGRKVPLNYVAEVVSSMGPNTINRENVKRKIVISANVADRDLRSVVNDIQKQVDA TIQLPEGYHIEYGGQFESEQAASRVLALTSFMSIVIIFLLLYHEFRSVKESGVILLNLPL ALIGGVFALVITTGEVSIPAIIGFISLFGIATRNGMLLISHYNHLQQVEGLNVYDSVIQG SLDRLNPILMTALSSALALIPLALGGDLPGNEIQSPMAKVILGGLLTSTFLNGFIVPIVY LMMHRQRSAGANSPCGVPAEQA >gi|226332200|gb|ACIC01000120.1| GENE 13 15371 - 16564 1208 397 aa, chain + ## HITS:1 COG:no KEGG:BT_2040 NR:ns ## KEGG: BT_2040 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 397 1 397 397 697 99.0 0 MKRIIIIGSLVSCALLALTGEVQAQSGIEQVLKNIEANNKELQANAQLITSQKLEAKTDN NLPDPTLSYAHLWGAKDKNETIGELVVSQSFDFPSLYATRNKLNRLKAGAFDSQADVFRQ EKLLLAKEVCLDIIMLRQQKHILEERLRNAEELAKMYAKRLQTGDANALETNKINLELLN VKTETSLNETALRNKLQELNTLNGNIPVVFEENTYPATPFPADYQILKSEVLSADRTLMA FNNESLVARKQIAVNKSQWLPKLELGYRRNTETGTPFNGVVVGFSFPLFENRNKVKIAKA QALNIDLQKDNASLQVESELAQLYREAKTLHTSMEEYRKTFQAQQDLALLKQALTGGQIS MIEYFVEVSVIYQSHQNYLQLENQYQKAMARIYKSKL >gi|226332200|gb|ACIC01000120.1| GENE 14 16924 - 17748 657 274 aa, chain + ## HITS:1 COG:no KEGG:BT_2041 NR:ns ## KEGG: BT_2041 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 
274 1 274 274 546 98.0 1e-154 MKTKTLSYTPILFCILILWSACKDDADAPLRFYDSKYEVPMGGRRYLGIESGNGDYSLEI GNPRIASAGIESGWSGVPAGRMIYISGILTGSTYLKVTDNATQETLTLPIKVVDYYEDLN LIHGSSSLRPNGDENLLPGVDDIFLVSNAARDAYFFKQGQRTAFSSGLELITKGTYALKQ EEDNKATLTLTFSLDASPATEHHFTVWGNAYLLHRLDKNLHLNWGTPPIGETRTSPAPPP AYTLEEITEGAEPGTGRQVGFILNYKEIPTGILP >gi|226332200|gb|ACIC01000120.1| GENE 15 17823 - 19301 1065 492 aa, chain - ## HITS:1 COG:yebU_1 KEGG:ns NR:ns ## COG: yebU_1 COG0144 # Protein_GI_number: 16129788 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Escherichia coli K12 # 3 350 10 353 385 174 32.0 5e-43 MNLPASFIDYTRALLGNEEYEKLAAALQQEPPVSIRINKLRMKEEGLSSLTDSSARFSFN KVPWASDGYYLDERLTFTFDPLFHAGCYYVQEASSMFVEQVLRQYVESPVVMLDLCAAPG GKSTHARSVLPEGSLLVANEVIRNRSQVLAENLTKWGHPDVVVTNNDPADFSALPSFFDV ILTDVPCSGEGMFRKDPVAVEEWSPENVEICWQRQRRIIADIWPCLKPGGILIYSTCTYN SKEDEENVCWIQQEFGAELLPLEVQDEWNITGNLLDGEDESQRSLSVCHFLPHKTKGEGF FLAALRKPDAEDEPATYSFSKAKSSKKKDKKGGAAASPVSKEHMGMALNWLKQENAERYT LSAEGAGIVAFLQRYTDELAAMKQYLKVIQAGVLTGEVKGKDLIPTHALAMSATLLRQDA FDTEEVSYEQAIAYLRKEAITLPETAPRGYILLTYRNIPLGFVKNIGNRANNLYPQEWRI RSGYLPEEIRTL >gi|226332200|gb|ACIC01000120.1| GENE 16 19410 - 20162 756 250 aa, chain + ## HITS:1 COG:no KEGG:BT_2043 NR:ns ## KEGG: BT_2043 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 238 1 238 250 417 98.0 1e-115 MKRILIALLVMMTIFSLANAQKQTTIVNDSTGNVKVTVTKGKGKDAKVNVGNTAVTVIGI DDEDADTTSVDTGKVSSSSRGSHGKASFTISSDDDDFPFNNFGNAVGGGILVAIISIIAI FGMPVFIIFVVFFFRYKNRKARYRLAEQALAAGQPLPEEFIREHKSTDQRTQGIKNTFTG IGLFIFLWAITGEFGIGAIGLLVMFMGLGQWLIGYKQHANEETKNRETAVRSNDIIISER NDEEKNEENR >gi|226332200|gb|ACIC01000120.1| GENE 17 20159 - 20707 313 182 aa, chain + ## HITS:1 COG:slr1545 KEGG:ns NR:ns ## COG: slr1545 COG1595 # Protein_GI_number: 16330063 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Synechocystis # 6 174 35 218 223 94 
32.0 1e-19 MSQLNDISLVAQVVVFKNTRAFDQLVQKYQSPVRRFFLHLTCGDSELSDDLAQDTFIKAY TNLASFRNLSSFSTWLYRIAYNIFYDYIRSRKEMADLDTREVDAVNCTEQANIGQTMDVY QSLKSLKEVERTCITLFYMEDVSIDKIAGITGIPAGTVKSHLSRGKEKLATYLKRNGYDG NR >gi|226332200|gb|ACIC01000120.1| GENE 18 20691 - 21020 401 109 aa, chain + ## HITS:1 COG:no KEGG:BF3732 NR:ns ## KEGG: BF3732 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 109 1 109 109 111 53.0 9e-24 MTEIDNDKLLKDFFAENKREIADNGFSRRVMHHLPDRSNRLARLWTVFVMTVGATLFVTL GGLEAVWGTLKDVLIGMINHGATSLDPKSIIIATVVLLFMAGRKVVSMA >gi|226332200|gb|ACIC01000120.1| GENE 19 21289 - 22560 817 423 aa, chain + ## HITS:1 COG:SA1891 KEGG:ns NR:ns ## COG: SA1891 COG1502 # Protein_GI_number: 15927663 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes # Organism: Staphylococcus aureus N315 # 52 423 127 494 494 247 35.0 3e-65 MKLRIFLLFLFLSLFRAQADVIDSLMTHPRDSIALTSDSLVLRFLSESGIPISDNNKVRL LKSGREKFIDLFSAIRDAKHHIHLEYFNFRNDSIANALFDLLAEKVKEGVEVRAMFDAFG NWSNNKPLKKRHLKKIREQGIEIVKFDPFTFPYINHAAHRDHRKIAVIDGEVAYTGGMNI ADYYINGLPKIGTWRDMHMRIEGDAVNDLQEIFLTIWNKETKQNIGGEAYFPKHKEQSDS TNVVVAIVDRTPKKNSRMLSHAYAMSIYSAQKNVHIVNPYFVPTSSINKALQRTIERGVD VTIMVSSASDIPFTPDAALYKLHKLMKRGATVYMYNGGFHHSKIMMVDDIFCTVGTANLN SRSLRYDYETNAFIFNKEITGELNEMFRNDIEHCTQLTPEFWKKRSPWKKFVGWFANLFT PFL >gi|226332200|gb|ACIC01000120.1| GENE 20 22598 - 23392 702 264 aa, chain + ## HITS:1 COG:BH3451 KEGG:ns NR:ns ## COG: BH3451 COG0207 # Protein_GI_number: 15616013 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate synthase # Organism: Bacillus halodurans # 1 264 1 264 264 423 71.0 1e-118 MKQYLDLLNRVLTEGTEKSDRTGTGTISVFGHQMRFNLDEGFPCLTTKKLHLKSIIYELL WFLQGDTNAKYLQEHGVRIWNEWADENGDLGHIYGYQWRSWPDYDGGFIDQISEAVETIK HNPDSRRIIVSAWNVADLKNMNLPPCHAFFQFYVADGRLSLQLYQRSADIFLGVPFNIAS YALLLQMMAQVTGLKAGEFIHTLGDAHIYLNHLDQVKLQLSREPRALPQMKINPDVKSIY DFQFEDFELVNYDPHPHIAGIVAV >gi|226332200|gb|ACIC01000120.1| 
GENE 21 23397 - 23891 317 164 aa, chain + ## HITS:1 COG:RSc0946 KEGG:ns NR:ns ## COG: RSc0946 COG0262 # Protein_GI_number: 17545665 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Ralstonia solanacearum # 1 161 1 159 167 124 44.0 8e-29 MSKISIIAAVDRRMAIGFQNKLLFWLPNDLKRFKALTTGNTIIMGRKTFESLPKGALPNR RNVVLSTRPDTVCPGAEVFPSLEVALQSCKEDEHVYIIGGASVYQQALPLADELCLTEIN DVAPEADAFFPEVSPAQWHEKSREAHPVDEKHLCPYAFVDYVKQ >gi|226332200|gb|ACIC01000120.1| GENE 22 23911 - 24390 490 159 aa, chain - ## HITS:1 COG:HI0563 KEGG:ns NR:ns ## COG: HI0563 COG1522 # Protein_GI_number: 16272506 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Haemophilus influenzae # 4 156 2 149 150 117 38.0 1e-26 MGHHQLDALDEQILKLIAGNARIPFLEVARACNVSGAAIHQRIQKLTNLGILKGSEYVID PEKIGYETCAYIGIYLKDPESFDSVTRALEAIPEVVECHFTTGKYDMFIKIYAKNNHHLL SIIHDKLQPLGLARTETLISFHEAIKRQMPIMVDIEDED >gi|226332200|gb|ACIC01000120.1| GENE 23 24544 - 25821 905 425 aa, chain + ## HITS:1 COG:no KEGG:BT_2050 NR:ns ## KEGG: BT_2050 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 425 1 425 425 889 100.0 0 MKQICSILLFFLISAGSYAQNFADYFQNKTLRVDYIFTGNNKQQAIYLDELSQLPSWAGR EHHLSELPLEGNGQIIVRDLATRQCIYKTSFSSLFQEWLSTDEAKETAKGFENTFLLPYP KQPAEVEIVLFSPRKEVMTSFKHIVRPDDILIHKRGTSHVTPHRYMLQSGNEKECIDVAI LAEGYTEKEMDLFYQDAQKACESLFSHEPFRSMKNKFNIVAVASPSIDSGVSVPRENQWK HTAVHSHFDTFYSDRYLTTSRVKAIHNALAGIPYEHIIILANTDVYGGGGIYNSYTLTTA HHPMFKPVVVHEFGHSFGGLADEYFYDDDVMTDTYPLDVEPWEQNISTRVNFALKWEDML APNTPVPTPVAQHQNYPVGVYEGGGYSAKGIYRPAFNCRMKTNEYPEFCPVCQRAIQRII EFYVP >gi|226332200|gb|ACIC01000120.1| GENE 24 26002 - 26538 480 178 aa, chain + ## HITS:1 COG:no KEGG:BT_2051 NR:ns ## KEGG: BT_2051 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 178 1 178 178 340 100.0 1e-92 MIRFQPITTSDVQHYKFMEELLVESFPPEEYRELEHLREYTDRIGNFHNNIIFDDDLPIG FITYWDFDEFYYVEHFATNPALRNGGYGKRTLEHLCEFLKRPIVLEVERPVEEMAKRRIN 
FYQRHGFTLWEKDYYQPPYKEGDDFLPMYLMVHGNLDAEKDYEGIRHKLHTIVYGVKE >gi|226332200|gb|ACIC01000120.1| GENE 25 26720 - 27190 529 156 aa, chain - ## HITS:1 COG:no KEGG:BT_2052 NR:ns ## KEGG: BT_2052 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 156 1 145 145 254 100.0 6e-67 MGKFNKTGKREMPALNTSSLPDLIFTLLFFFMIVTTMREVTLKVQFTLPVGTELEKLEKK SLVTFIYVGEPTQEYRAKMGTESRIQLNDSYAEVGEVQDFIFQERASMNEGDQAKMTVSL KVDQKTKMGIITDVKNALRKSYALKINYSSTKRGEK >gi|226332200|gb|ACIC01000120.1| GENE 26 27194 - 27787 536 197 aa, chain - ## HITS:1 COG:no KEGG:BT_2053 NR:ns ## KEGG: BT_2053 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 197 1 197 197 366 100.0 1e-100 MARGKRKVPDINSSSTADIAFLLLIFFLITTSMDTDRGLARLLPPPPEDQDQQNTDKIKE RNILQVYLNKDDALMCGNDYIGVEQLREKAKEFIANAGNAEHMPEKTQKNVEFFGTTLVN DKHVISLQNDRGSSYQAYISVQNELVAAYNELRDELALQKWQRPYAELNDEQQKAIREIY PQRISEAEPKKYGEKRK >gi|226332200|gb|ACIC01000120.1| GENE 27 27823 - 28308 416 161 aa, chain - ## HITS:1 COG:no KEGG:BT_2054 NR:ns ## KEGG: BT_2054 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 161 1 161 161 236 100.0 3e-61 MSKLSYKVSYYALYAMFAIILIVLGLFYLGGDAQGADVIAGVDPEMWQPANTNALLMLIY GLFGLAVAATVVAAVFQFGAALKDSPANAIKSLLGLVLLVVVLVIAWAAGDGTPMNIPGY DGTDNVPFWLKLTDMFLYSIYILLFVTIVAIIASGIKKKIS >gi|226332200|gb|ACIC01000120.1| GENE 28 28315 - 29118 862 267 aa, chain - ## HITS:1 COG:VC1547_2 KEGG:ns NR:ns ## COG: VC1547_2 COG0811 # Protein_GI_number: 15641555 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Vibrio cholerae # 114 255 46 184 205 73 33.0 3e-13 MKKLFAIVAVIGAFTFGSIQLAQAQDAPAAEQTEQQAAPAAAQAAPAAAPAAEEGGIHKE IKVKFIEGTASFMSLVAIALVIGLAFCIERIIYLSLAEINTKKFMASIEAALEKGDVEAA KDIARNTRGPVASIYYQGLMRIDQGIDVVEKSVVSYGGVQAGYLEKGCSWITLFIAMAPS LGFLGTVIGMVQAFDKIQQVGDISPTVVAGGMKVALITTIFGLIVALILQVFYNYVLAKI EALTSEMEDSSISLLDMVIKYDLKYKK 
>gi|226332200|gb|ACIC01000120.1| GENE 29 29402 - 30178 658 258 aa, chain - ## HITS:1 COG:VC0103 KEGG:ns NR:ns ## COG: VC0103 COG0084 # Protein_GI_number: 15640135 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Vibrio cholerae # 2 258 1 255 255 218 40.0 7e-57 MLIDTHSHLFLEEFSDDLPQVMERARQAGVSRIYMPNIDSTTIEPMLSVCADYPDFCYPM IGLHPTSVNESYRQELSIVRERLEAPNNFVAIGEIGLDLYWDKTFLNEQLYVFEKQIEWA LEYKLPIVVHSREAFDYIYKVMEPYKNTALTGIFHSFTGNAEEAARLLEFGGFMLGINGV VTFKKSSLPDTLLTVPLERIVLETDSPYLTPAPNRGKRNESANVHDTFLKLVEIYRTTPE HLSQATSENALKVFGMVK >gi|226332200|gb|ACIC01000120.1| GENE 30 30172 - 30888 495 238 aa, chain - ## HITS:1 COG:no KEGG:BT_2057 NR:ns ## KEGG: BT_2057 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 238 1 238 238 438 97.0 1e-122 MPYRRLPNTDQARVRALKAAVEKGDVYNVRDLAISLKTLFEARNFLLKFEAAQIYYTQCY DNQSRASRKHQANVRMARLYISHFIQVLNLAVLRDEIKPVHKELYDLPEANVVPDLLSEA ALVEWGRKIIEGEQRRTSQGGIPIYNPTIARVKVHYDIFLDSYERQKNYQSATNRSLDEL ASMRDRADELILDIWNQVEAKFQEVNPNEARLAKCRDYGLVYYYRSNEKVKEESKLSC >gi|226332200|gb|ACIC01000120.1| GENE 31 30904 - 31878 947 324 aa, chain - ## HITS:1 COG:MA0606 KEGG:ns NR:ns ## COG: MA0606 COG0142 # Protein_GI_number: 20089495 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Methanosarcina acetivorans str.C2A # 24 245 30 253 324 188 43.0 1e-47 MFTASQLLDKINNHLSENQITRTPEGLYEPIEYILSLGGKRIRPVLMLMAYNLYKEDVSS IYDPATGIEVYHNHTLLHDDLMDRSDMRRGKPTVHKVWNDNTAILSGDTMLILAFRYVAG CAPEHLKEVIDLFSLTALEICEGQQLDMEFESRNDVAEDEYIEMIRLKTAVLLAASLKIG AILAGASAVDAENLYNFGMQIGVAFQLQDDLLDVYGDPEVFGKRIGGDILCNKKTYMLIK ALERANGEQLEELNRWLNAENCQPAEKIAAVTEIYNQLTIRSVCENKMREYYTLAMESLE AVAVAEEKKKELKNLVKLLMYREM >gi|226332200|gb|ACIC01000120.1| GENE 32 31949 - 32632 773 227 aa, chain - ## HITS:1 COG:no KEGG:BT_2059 NR:ns ## KEGG: BT_2059 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 227 1 227 227 372 100.0 
1e-102 MEVKKSPKADLEGKKSTWLLIGYVVVLAFMFVAFEWTQRDVKIDTSQAVADVVFEEEIIP ITETPEQATPPPPEAPKVAELLEIVDDQADIEESTTILNEDNTPKVEVKYVPVQVVEEEP EEQTIFEVVENMPDFPGGQAALMQYLAKNIKYPTIAQENGTQGRVIVQFVVNKDGSIVDA KVVRSVDPYLDKEALRVINTMPKWKPGMQRGKPVRVKFTVPVMFRLQ >gi|226332200|gb|ACIC01000120.1| GENE 33 32856 - 33545 197 229 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 [Treponema pallidum subsp. pallidum str. Nichols] # 1 222 32 282 863 80 27 2e-14 MKKITIAIDGFSSCGKSTMAKDLAREVGYIYIDSGAMYRAVTLYSIENGIFNGDVIDTEK LKEAIRDIRITFRPNPETGRPDTYLNGVNVENKIRTMGVSSKVSPISALDFVREAMVAQQ QAMGKEKGIVMDGRDIGTTVFPDAELKIFVTATPEIRAQRRFDELKAKGQEGSFEEILEN VKQRDYIDQHREVSPLRKADDALLLDNSNLSIEQQKEWLSEQFGKVVKE >gi|226332200|gb|ACIC01000120.1| GENE 34 33564 - 34433 353 289 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15895122|ref|NP_348471.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase [Clostridium acetobutylicum ATCC 824] # 1 278 1 274 642 140 30 1e-32 MIKVEIDEGSGFCFGVVTAIHKAEEELAKGETLYCLGDIVHNSREVERLKAMGLITINRD EFRQLRNAKVLLRAHGEPPETYQIAHKNNIEIIDATCPVVLRLQKRIKQEFRKEDFEEKQ IVIYGKNGHAEVLGLVGQTGGQAIVIESAEEAKKLDFTKSIRLFSQTTKSLDEFQEIVEY IKLHISPDATFEYYDTICRQVANRMPNLREFAATHDLIFFVSGKKSSNGKMLFEECLKVN ANSHLIDNEKEIDPTLLRNVESIGVCGATSTPKWLMEKIHDHIQLLIKD >gi|226332200|gb|ACIC01000120.1| GENE 35 34562 - 35542 1331 326 aa, chain + ## HITS:1 COG:BH3164 KEGG:ns NR:ns ## COG: BH3164 COG0205 # Protein_GI_number: 15615726 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Bacillus halodurans # 4 326 1 319 319 311 50.0 1e-84 MGTVKCIGILTSGGDAPGMNAAIRAVTRAAIYNGLQVKGIYRGYKGLVTGEIKEFKSQNV SNIIQLGGTILKTARCKEFTTPEGRQLAYDNMKREGIDALVIIGGDGSLTGARIFAQEFD VPCIGLPGTIDNDLYGTDTTIGYDTALNTILDAVDKIRDTATSHERLFFVEVMGRDAGFL ALNGAIASGAEAAIIPEFSTEVDQLEEFIKNGFRKSKNSSIVLVAESELTGGAMHYAERV KNEYPQYDVRVTILGHLQRGGSPTAHDRILASRLGAAAIDAIMEDQRNVMIGIEHDEIVY VPFSKAIKNDKPVKRDLVNVLKELSI >gi|226332200|gb|ACIC01000120.1| GENE 36 35642 - 35944 141 100 aa, 
chain - ## HITS:1 COG:no KEGG:no NR:gi|253570688|ref|ZP_04848096.1| ## NR: gi|253570688|ref|ZP_04848096.1| predicted protein [Bacteroides sp. 1_1_6] # 1 100 2 101 101 189 100.0 4e-47 MNKKKREKTTILSSAGEGEKEITWNSVINPKKTVVAVCSFYTDLSQMLVNSILNGVTKKS VVSPLEILLFDSFVSFIFHSDRAKQGGQTARNLWHGSWNG >gi|226332200|gb|ACIC01000120.1| GENE 37 36006 - 36857 752 283 aa, chain - ## HITS:1 COG:alr1722 KEGG:ns NR:ns ## COG: alr1722 COG1028 # Protein_GI_number: 17229214 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Nostoc sp. PCC 7120 # 1 203 1 217 311 120 33.0 4e-27 MSEEKWAIITGADGGMGTEITRAVAEAGYHIIMACYRPSKAEPIRQRLVNETGNVNMEVM AVDLSSMASTASFADRIVERHLPVSLLMNNAGTMETGLHITDDGFERTVSVNYLGPYLLT RKLLPALTRGARIVNMVSCTYAIGHLDFPDFFRQGRKGRFWRIPVYSNTKLALMLFTIEL SERLREKGITVNAADPGIVSTDIITMHQWFDPLTDIFFRPFIRTPKKGASTAVGLLLDEA VAGVSGQLYASNRRKELSDNYIYHVQKEQLWEITEQLLAQWFT >gi|226332200|gb|ACIC01000120.1| GENE 38 36854 - 38077 982 407 aa, chain - ## HITS:1 COG:MT3467 KEGG:ns NR:ns ## COG: MT3467 COG1902 # Protein_GI_number: 15842955 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Mycobacterium tuberculosis CDC1551 # 5 379 11 385 396 280 42.0 2e-75 MESKLFTPVTFGPLTLRNRTIRSAAFESMCPGNAPSQMLLDYHRSVAAGGVGMTTVAYAA VTQSGLSFDRQLWLRPEIISGLREVTGAIHTEGAAAGIQIGHCGNMSHKKICGTTPISAS TGFNLYSPTFVRGMKKEELPEMARAYGRAVHLAREAGFDAVEVHAGHGYLISQFLSPYTN HRKDEYGGSLENRMRFMEMVMNEVMTAAGSDMAVFVKMNMRDGFKGGMETDETLQVAKRL LALGAHALVLSGGFVSKAPMYVMRGAMPIRSMAYYMDCWWLKYGVRMFGKWMIPTVPFRE AYFLEDALKFRAALPEAPLIYVGGLVSREKIDEVLDAGFDAVQMARALLNEPEFVNRMRR EEQARCNCGHSNYCIGRMYTIEMACHQHLKEKLPPCLQREIEKLEKQ >gi|226332200|gb|ACIC01000120.1| GENE 39 38116 - 38877 720 253 aa, chain - ## HITS:1 COG:no KEGG:BT_2068 NR:ns ## KEGG: BT_2068 # Name: not_defined # Def: 
3-oxo-5-alpha-steroid 4-dehydrogenase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 253 1 253 253 447 97.0 1e-124 MSITAFNLFLGVMSLIALIVFIALYFVKAGYGIFRTASWGVAIPNKLAWILMEAPVFLVM CWMWMHSERRFDPVILTFFIFFQIHYFQRAFVFPLLLTGKSKMPLAIMSMGILFNLLNGY MQGEWIFYLSPEGMYHSGWFTSAWFIAGSLLFFAGMLMNWHSDYIIRHLRKPGDTRHYLP QKGMYRYVTSANYLGEIIEWAGWAILTCSLSGLVFFWWTVANLVPRANAIWHRYREEFGS EVGGRKRVFPFIY >gi|226332200|gb|ACIC01000120.1| GENE 40 38927 - 39085 110 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570692|ref|ZP_04848100.1| ## NR: gi|253570692|ref|ZP_04848100.1| predicted protein [Bacteroides sp. 1_1_6] # 1 52 1 52 52 72 100.0 6e-12 MATKITQIKNKSNDNRLPAFFVYSLKSAIFAVGYELNKKKKQWQWKLKPYRP >gi|226332200|gb|ACIC01000120.1| GENE 41 39052 - 39210 289 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570693|ref|ZP_04848101.1| ## NR: gi|253570693|ref|ZP_04848101.1| predicted protein [Bacteroides sp. 1_1_6] # 1 52 1 52 52 64 100.0 2e-09 MAMEIKAIPTLKGKEAEHFVKAADKAYKNQNKQDFSKHVLEARAVLKKAKML >gi|226332200|gb|ACIC01000120.1| GENE 42 39214 - 39849 409 211 aa, chain + ## HITS:1 COG:no KEGG:BT_2069 NR:ns ## KEGG: BT_2069 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 211 1 211 211 424 96.0 1e-117 MGFLFEKCTFAVLDEYTIKECDPFSCGHQDLDDFFHNDAPLYNAQLLGKSYCFRSDKKPS DIVCAFTVSNDSIRVNILPNSREKKVQKAIPRAKQMRRYPGVLIGRLGINKEYKHQGIGS DLMTFIKSWFIDTGNKTGCRFLIVDAYNENIPLNYYLKNGFKYLFSTEAQEVEYTGFETG THLKTRLMYFDLIDIMPYPNNSSKDSINGNY >gi|226332200|gb|ACIC01000120.1| GENE 43 40245 - 41588 1614 447 aa, chain - ## HITS:1 COG:L67186 KEGG:ns NR:ns ## COG: L67186 COG0372 # Protein_GI_number: 15672652 # Func_class: C Energy production and conversion # Function: Citrate synthase # Organism: Lactococcus lactis # 12 447 8 441 441 380 46.0 1e-105 MKKEYLIYKLSEEMKEATRIDNELFPKFDVKRGLRNEDGTGVLVGLTKIGNVVGYERIPG GGLKPIPGKLFYRGYDVEDISHAIIKEKRFGFEEVAYLLLSGRLPDKEELASFRELINDN MALEQKTKMNIIELEGNNIMNILSRSVLEMYRFDPDADDTSRDNLMRQSIDLISKFPTII 
AYAYNMLRHATFGRSLHIRHPQEKLSIAENFLYMLKKDYTELDARTLDLLLILQAEHGGG NNSTFTVRVTSSTGTDTYSAIAAGIGSLKGPLHGGANIQVADMFHHLQENIKDWKSVDEI DTYFTRMLNKEVYNKTGLIYGIGHAVYTISDPRALLLKELARDLAREKGKEEEFAFLELL EERAIATFGRVKNNGKTVSSNIDFYSGFVYEMIGLPQEIFTPLFAMARIVGWCAHRNEEL NFEGKRIIRPAYKNVLDDLAYIPIKKR >gi|226332200|gb|ACIC01000120.1| GENE 44 41614 - 42804 1210 396 aa, chain - ## HITS:1 COG:SA1517 KEGG:ns NR:ns ## COG: SA1517 COG0538 # Protein_GI_number: 15927272 # Func_class: C Energy production and conversion # Function: Isocitrate dehydrogenases # Organism: Staphylococcus aureus N315 # 3 395 5 422 422 513 58.0 1e-145 MNKITMQKDGTLSVPDVPVVPYITGDGVGAEVTPSMQAVVNAAVQKAYGGKRRIEWKEVL AGERAFNETGSWLPDETMKAFQEYLIGIKGPLTTPVGGGIRSLNVALRQTLDLYVCLRPV RWYQGVHSPVKAPEKVNMCVFRENTEDIYAGIEWEAGTPEAEKFYQFLKNEMGVTKVRFP ETSSFGVKPVSREGTERLVRAACQYALNHHLPSVTLVHKGNIMKFTEGGFKKWGYELAQR EFGDALADGRLVIKDCIADAFLQNTLLIPEEYSVIATLNLNGDYVSDQLAAMVGGIGIAP GANINYKTGHAIFEATHGTAPNIAGKDVVNPCSIILSAVMMLEYLGWKEAAALIEKALEQ SFLDARATHDLARFMPGGTSLSTTAFTREIVERIEK >gi|226332200|gb|ACIC01000120.1| GENE 45 42819 - 44303 1278 494 aa, chain - ## HITS:1 COG:SPAC24C9.06c KEGG:ns NR:ns ## COG: SPAC24C9.06c COG1048 # Protein_GI_number: 19114943 # Func_class: C Energy production and conversion # Function: Aconitase A # Organism: Schizosaccharomyces pombe # 1 492 282 769 778 557 54.0 1e-158 MGAEVGATTSLFPFDGRMATYLRATGRDRIVELAEAVDCELRADQQVTDEPEKYYDRVID IDLSTLEPYINGPFTPDAATPISEFAEKVLQNGYPRKMEVGLIGSCTNSSYQDLSRAASL ARQVKEKNLSVASPLIINPGSEQIRATAERDGMMDDFVQIGAVIMANACGPCIGQWKRHT DDPMRKNSIVTSFNRNFAKRADGNPNTYAFVASPELTMALTIAGDLCFNPLKDRLMNHDG KKVKLAEPEGEELPLRGFTSGNEGYIAPGGTKTAINVNPASQRLQLLTPFPAWDGQDILN MPLLIKAQGKCTTDHISMAGPWLRFRGHLENISDNMLMGAVNAFNGETNNVWNRSTNTYG TVSGTAKMYKSEGIPSIVVAEENYGEGSSREHAAMEPRFLNVRVILAKSFARIHETNLKK QGMLALTFADKADYDKIQEHDLLSVIGLPDFAPGRNLTVVLHHEDGTKESFEAQHTYNEQ QIAWFRAGSALNAR Prediction of potential genes in microbial genomes Time: Thu May 12 02:24:18 2011 Seq name: gi|226332199|gb|ACIC01000121.1| Bacteroides sp. 
1_1_6 cont1.121, whole genome shotgun sequence Length of sequence - 37824 bp Number of predicted genes - 43, with homology - 41 Number of transcription units - 22, operones - 8 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 682 644 ## COG1048 Aconitase A - Prom 705 - 764 7.0 + Prom 644 - 703 11.7 2 2 Tu 1 . + CDS 800 - 2695 1484 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits - Term 3161 - 3197 6.5 3 3 Op 1 . - CDS 3228 - 4307 1268 ## COG0059 Ketol-acid reductoisomerase 4 3 Op 2 . - CDS 4336 - 5079 779 ## COG3884 Acyl-ACP thioesterase 5 3 Op 3 32/0.000 - CDS 5079 - 5642 639 ## COG0440 Acetolactate synthase, small (regulatory) subunit 6 3 Op 4 6/0.000 - CDS 5661 - 7358 1943 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] 7 3 Op 5 . - CDS 7406 - 9208 1780 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase + Prom 9494 - 9553 2.5 8 4 Tu 1 . + CDS 9683 - 10258 590 ## COG1047 FKBP-type peptidyl-prolyl cis-trans isomerases 2 + Term 10279 - 10343 4.2 - Term 10267 - 10329 8.2 9 5 Tu 1 . - CDS 10397 - 11686 1216 ## COG3681 Uncharacterized conserved protein - Prom 11720 - 11779 4.4 + Prom 11709 - 11768 6.8 10 6 Op 1 . + CDS 11820 - 12905 833 ## BT_2081 hypothetical protein + Prom 12912 - 12971 4.9 11 6 Op 2 . + CDS 12996 - 13781 700 ## BT_2082 hypothetical protein + Term 13852 - 13882 -1.0 12 7 Tu 1 . - CDS 13797 - 14360 575 ## BT_2083 hypothetical protein - Prom 14492 - 14551 4.8 + Prom 14435 - 14494 5.5 13 8 Op 1 . + CDS 14519 - 15595 1051 ## COG0082 Chorismate synthase + Prom 15603 - 15662 1.6 14 8 Op 2 . + CDS 15697 - 17061 1348 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases + Term 17103 - 17153 5.1 - Term 17090 - 17142 8.1 15 9 Tu 1 . 
- CDS 17187 - 18245 895 ## COG3049 Penicillin V acylase and related amidases - Prom 18285 - 18344 6.0 + Prom 18395 - 18454 5.4 16 10 Tu 1 . + CDS 18486 - 19361 361 ## BT_2087 hypothetical protein + Term 19420 - 19460 3.1 17 11 Tu 1 . - CDS 19464 - 19658 110 ## - Prom 19755 - 19814 8.4 + Prom 19663 - 19722 7.1 18 12 Op 1 . + CDS 19908 - 20240 337 ## gi|253570714|ref|ZP_04848122.1| predicted protein 19 12 Op 2 . + CDS 20244 - 20618 368 ## BVU_2108 hypothetical protein 20 12 Op 3 . + CDS 20624 - 20689 72 ## + Prom 20727 - 20786 2.7 21 13 Tu 1 . + CDS 20912 - 21400 160 ## BDI_0861 site-specific DNA-methyltransferase 22 14 Tu 1 . + CDS 21676 - 22452 659 ## COG3645 Uncharacterized phage-encoded protein - Term 22468 - 22512 9.2 23 15 Tu 1 . - CDS 22545 - 22874 214 ## gi|253570719|ref|ZP_04848127.1| predicted protein - Prom 22919 - 22978 1.8 24 16 Op 1 . - CDS 22999 - 23466 324 ## gi|253570720|ref|ZP_04848128.1| predicted protein 25 16 Op 2 . - CDS 23470 - 23934 255 ## gi|253570721|ref|ZP_04848129.1| predicted protein 26 16 Op 3 . - CDS 23940 - 24404 125 ## gi|253570722|ref|ZP_04848130.1| predicted protein 27 16 Op 4 . - CDS 24420 - 24893 570 ## BVU_2146 conjugate transposon protein TraQ 28 16 Op 5 . - CDS 24914 - 25498 333 ## BF1350 conjugate transposon protein TraO 29 16 Op 6 . - CDS 25503 - 26396 768 ## BVU_2144 conjugate transposon protein 30 16 Op 7 . - CDS 26438 - 27391 688 ## BF1352 conjugate transposon protein TraM 31 17 Op 1 . - CDS 27542 - 27745 206 ## gi|295086045|emb|CBK67568.1| hypothetical protein 32 17 Op 2 . - CDS 27726 - 28022 206 ## gi|253570727|ref|ZP_04848135.1| predicted protein 33 17 Op 3 . - CDS 28064 - 28687 434 ## BT_2292 conjugate transposon protein 34 17 Op 4 . - CDS 28724 - 29743 1020 ## BVU_2139 conjugate transposon protein 35 17 Op 5 . - CDS 29767 - 30393 673 ## BT_0091 conjugate transposon protein 36 18 Op 1 . - CDS 30604 - 32925 1688 ## COG3451 Type IV secretory pathway, VirB4 components 37 18 Op 2 . 
- CDS 32888 - 33238 95 ## BF1242 putative transmembrane conjugate transposon protein 38 18 Op 3 . - CDS 33249 - 33446 90 ## Fjoh_3006 hypothetical protein - Prom 33616 - 33675 5.0 - Term 34081 - 34105 -1.0 39 19 Tu 1 . - CDS 34153 - 34581 292 ## gi|253570735|ref|ZP_04848143.1| predicted protein - Term 34685 - 34723 -0.9 40 20 Tu 1 . - CDS 34816 - 35616 636 ## BF3019 conjugate transposon protein TraA - Prom 35755 - 35814 3.7 - Term 35842 - 35889 10.1 41 21 Op 1 . - CDS 35913 - 36146 202 ## BF1525 hypothetical protein 42 21 Op 2 . - CDS 36143 - 37072 698 ## BT_2465 hypothetical protein - Prom 37126 - 37185 7.0 43 22 Tu 1 . - CDS 37219 - 37818 357 ## BF3019 conjugate transposon protein TraA Predicted protein(s) >gi|226332199|gb|ACIC01000121.1| GENE 1 1 - 682 644 227 aa, chain - ## HITS:1 COG:SPAC24C9.06c KEGG:ns NR:ns ## COG: SPAC24C9.06c COG1048 # Protein_GI_number: 19114943 # Func_class: C Energy production and conversion # Function: Aconitase A # Organism: Schizosaccharomyces pombe # 12 227 41 255 778 306 66.0 2e-83 MVYDLTMLEAFYSAYKGKVEHVRAVLKRPLTLAEKILYAHLFNEGDVKNYKRGEDYVNFC PDRVAMQDATAQMALLQFMNAGKEKVAVPSTVHCDHLIQAYKGAKEDIATATKTNEEVYD FLRDVSSRYGIGFWKPGAGIIHQVVLENYAFPGGMMVGTDSHTPNAGGLGMVAIGVGGAD AVDVMTGMEWELKMPRIIGVRLTGKLSGWTSPKDVILKLAGILTVKG >gi|226332199|gb|ACIC01000121.1| GENE 2 800 - 2695 1484 631 aa, chain + ## HITS:1 COG:MK0070 KEGG:ns NR:ns ## COG: MK0070 COG1112 # Protein_GI_number: 20093510 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Methanopyrus kandleri AV19 # 136 626 135 669 698 267 35.0 5e-71 MNNHPKSPIIDLQQQQLLLRMEYEHEKEEFKRQTETMGVARKVKRGLCWYPASPGRSYYN SLNQLVIDITRTENKEIEHSFEFGRPVCFFRQSFDGKVNYMNFIATVSYADDERMVVVLP SAGALLELQTEEVLGVQLYFDETSYRAMFEALEDTIRAKGNRLAELRDTLLGTQKPGFRE LYPVRFPWLNSTQETAVNKVLCTRDVAIVHGPPGTGKTTTLVEAIYETLHREPQVLVCAQ SNTAVDWICEKLVDRGVPVLRIGNPTRVNDKMLSFTYERRFENHPSYPELWGIRKSIREM GSRMRRGSYSEREGMRSRMSRLRDRATELEILINADLFDSARVIASTLVSSNHRLLNGRR 
FPTLFIDEAAQALEAACWIAIRKADRVILAGDHCQLPPTIKCIEAARGGLDHTLMEKVVQ QKPSAVSLLKVQYRMHEAIMRFPSEWFYNGELEAAPEVRNRGILDFDTPMNWIDTSEMDF HEEFVGESFGRINKQEANLLLQELEAYISRIGKARILDESIDFGLISPYKAQVQYLRSKI RGSSFLRPFRSLITVNTVDGFQGQERDVVFISLVRANEDGQIGFLNDLRRMNVAITRARM KLVILGDATTLTKHAFYRKLIQYIRQEAVSS >gi|226332199|gb|ACIC01000121.1| GENE 3 3228 - 4307 1268 359 aa, chain - ## HITS:1 COG:YLR355c KEGG:ns NR:ns ## COG: YLR355c COG0059 # Protein_GI_number: 6323387 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Ketol-acid reductoisomerase # Organism: Saccharomyces cerevisiae # 10 358 45 394 395 422 59.0 1e-118 MAQVIKNKKTKKMAQLNFGGTVENVVIRDEFPLEKAREVLKNETIAVIGYGVQGPGQALN LRDNGFNVIVGQRQGKTYDKAVADGWVPGETLFGIEEACEKGTIIMCLLSDAAVMSVWPT IKPYLTAGKALYFSHGFAITWSDRTGVVPPADIDVIMVAPKGSGTSLRTMFLEGRGLNSS YAIYQDATGNAMDRTIALGIGIGSGYLFETTFIREATSDLTGERGSLMGAIQGLLLAQYE VLRENGHTPSEAFNETVEELTQSLMPLFAKNGMDWMYANCSTTAQRGALDWMGPFHDAIK PVVEKLYHSVKTGNEAQISIDSNSKPDYREKLEEELKALRESEMWQTAVTVRKLRPENN >gi|226332199|gb|ACIC01000121.1| GENE 4 4336 - 5079 779 247 aa, chain - ## HITS:1 COG:CAC3591 KEGG:ns NR:ns ## COG: CAC3591 COG3884 # Protein_GI_number: 15896825 # Func_class: I Lipid transport and metabolism # Function: Acyl-ACP thioesterase # Organism: Clostridium acetobutylicum # 17 215 15 211 248 92 27.0 8e-19 MSEENKIGTYQFVAEPFHVDFNGRLTMGVLGNHLLNCAGFHASDRGFGIATLNEDNYTWV LSRLAIELDEMPYQYEKFSVQTWVENVYRLFTDRNFAVIDKDGKKIGYARSVWAMINLNT RKPADLLALHGGSIVDYICDEPCPIEKPSRIKVTSNQPVATLTAKYSDIDINGHVNSIRY IEHILDLFPIELYQTKRIRRFEMAYVAESYFGDELSFFCDEVSENEFHVEVKKNGSEVVC RSKVIFE >gi|226332199|gb|ACIC01000121.1| GENE 5 5079 - 5642 639 187 aa, chain - ## HITS:1 COG:MTH1443 KEGG:ns NR:ns ## COG: MTH1443 COG0440 # Protein_GI_number: 15679440 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase, small (regulatory) subunit # Organism: Methanothermobacter thermautotrophicus # 14 161 16 162 168 76 33.0 2e-14 MSDKTLYTIIVHSENIAGLLNQVTAVFTRRQINIESLNVSASSIKGVHKYTITAWTDKDT 
IEKVVKQIEKKIDVIQAHYFTEDEIYFHEIALYKVSTPAFQETPEASKVIRRYNARIVEV NPVFSIVEKNGMSEEITSLYEELRALKCVLQFVRSGRVAITTSCFERVNEFLDGQEAKYK QSKKEQE >gi|226332199|gb|ACIC01000121.1| GENE 6 5661 - 7358 1943 565 aa, chain - ## HITS:1 COG:ECs4702 KEGG:ns NR:ns ## COG: ECs4702 COG0028 # Protein_GI_number: 15833956 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Escherichia coli O157:H7 # 6 562 1 544 548 501 45.0 1e-141 MSKDLITGAEAMMRSLEHQGVTTIFGYPGGSIMPTFDALYDHQNTLNHILVRHEQGAAHA AQGYARVSGKVGVCLVTSGPGATNTITGIADAMIDSTPIVVIAGQVGTGFLGTDAFQEVD LVGITQPIAKWSYQIRRAEDVAWAIARAFYIASSGRPGPVVLDFAKNAQVEKTKYEPTQQ EFIRSYVPVPDTDEESVKAAAELINNAERPLVLVGQGVELGSAQEELRIFIEKADMPAGC TLLGLSALPTDHPLNKGMLGMHGNLGPNINTNKCDVLIAVGMRFDDRVTGNLATYAKQAK VIHFDIDPAEVNKNVKVDIAVLGDCKKTLAAVTGLLKKNRHTEWVDSFKEYEAVEEEKVI RPELHPATDSLSMGEVVRAVSEATRHEAILVTDVGQNQMISARYFKYTRERSIVTSGGLG TMGFGLPAAIGATFGRPDRTVCVFMGDGGLQMNIQELGTIMEQKAPVKIICLNNNYLGNV RQWQAMFFNRRYSFTPMLNPDYMKIASAYDIPSKRVFSREELKVAIDEMLSTDGPFLLEA CVVEEGNVLPMTPPGGSVNQMLLEC >gi|226332199|gb|ACIC01000121.1| GENE 7 7406 - 9208 1780 600 aa, chain - ## HITS:1 COG:NMB1150 KEGG:ns NR:ns ## COG: NMB1150 COG0129 # Protein_GI_number: 15677026 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Neisseria meningitidis MC58 # 4 597 3 612 619 806 66.0 0 MKKQLRSSFSTQGRRMAGARALWAANGMKKNQMGKPIIAIVNSFTQFVPGHVHLHEIGQL VKAEIEKLGCFAAEFNTIAIDDGIAMGHDGMLYSLPSRDIIADSVEYMVNAHKADAMVCI SNCDKITPGMLMAAMRLNIPTVFVSGGPMEAGEWNGQHLDLIDAMIKSADDSVSDQEVAN IEQNACPTCGCCSGMFTANSMNCLNEAIGLALPGNGTIVATHENRTKLFEDAAKLIVENA MKYYEEGDESVLPRSIATRQAFLNAMTLDIAMGGSTNTVLHLLAVAHEAGVDFKMDDIDM LSRKTPCLCKVAPNTQKYHIQDVNRAGGIIAILAELAKGGLIDTSVLRVDGMSLAEAIDQ 
YSITSPNVTEKAMSKYSSAAGNRFNLVLGSQGAYYQELDKDRANGCIRDLEHAYSKDGGL AVLKGNIAQDGCVVKTAGVDESIWKFTGPAKVFDSQEAACEGILGGRVVSGDVVVITHEG PKGGPGMQEMLYPTSYIKSRHLGKECALITDGRFSGGTSGLSIGHVSPEAAAGGNIGKIV DGDIIEIDIPSRTINVRLTDEELAARPMTPVTRDRYVPKSLKAYASMVSSADKGAVRLID >gi|226332199|gb|ACIC01000121.1| GENE 8 9683 - 10258 590 191 aa, chain + ## HITS:1 COG:FN1875 KEGG:ns NR:ns ## COG: FN1875 COG1047 # Protein_GI_number: 19705180 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 2 # Organism: Fusobacterium nucleatum # 1 156 1 149 164 88 39.0 7e-18 METVENKYITVAYKLYTVEDGEKELFEETNAEHPFQFISGLGTTLESFENQITALSKGDK FDFTIPADEAYGAYDEQHVFDLPKNIFEIDGKFDSERVTEGSIVPLMTGDGQRVNASVVE IKPDVVVVDLNHPLAGADLIFEGEVIESRPATNEEIQELVKMMSGEGCGCGCDSCGSDCG DDCGCEGGHCH >gi|226332199|gb|ACIC01000121.1| GENE 9 10397 - 11686 1216 429 aa, chain - ## HITS:1 COG:STM3238 KEGG:ns NR:ns ## COG: STM3238 COG3681 # Protein_GI_number: 16766537 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Salmonella typhimurium LT2 # 13 427 17 435 436 325 44.0 8e-89 MIESERKQIIDLIKKEVIPAIGCTEPIAVALCVAKAAETLGMKPEKIEVLLSANILKNAM GVGIPGTDMVGLPIAVALGALIGRSEYQLEVLRDCTPEAVEQGKLFIAEKRICISLKEDI TEKLYIEVICTAGSQKATAVIAGGHTTFVYIATDEKVLLDKQQTANEEEEDASLELNLRK VYDFALTSPLDEIRFILDTARLNKAAAEQAFKGNYGHSLGKMLRGTYEHKVMGDSVFSHI LSYTSAACDARMAGAMIPVMSNSGSGNQGISATLPVVVFAEENGKTEEELIRALMLSHLT VIYIKQSLGRLSALCGCVVAATGSSCGITWLMGGNYNQVAFAVQNMIANLTGMICDGAKP SCALKVTTGVSTAVLSAMMAMEDRCVTSVEGIIDEDVDQSIRNLTRIGSQAMNETDKMVL DIMTHKGGC >gi|226332199|gb|ACIC01000121.1| GENE 10 11820 - 12905 833 361 aa, chain + ## HITS:1 COG:no KEGG:BT_2081 NR:ns ## KEGG: BT_2081 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 361 1 361 361 721 100.0 0 MKAKHGILYLLLAIFSSSCIREEAPNAEADILSCRLPGVVMTTSPIITNNSINIFVGPGT DISSLAPEFTLTPGATIDPPSGTARDFHSPQQYTVTAADGFWKKKYTVSVIDTELATIYN 
FEDTLGGQKYYIFVEREGEKVVMEWASGNAGYAMTGVPKTADDYPTFQFANGKTGKCLSL VTRSTGFFGSIMGMPIAAGNLFIGSFDVGNAMSNPLKATKFGLPFRHIPTYLAGYYKYKA GDQFTEGGKPVSGKRDICDIYAIMYETSESVPTLDGTNAFTSPNLVSIARIDDAKETDEW TYFKLPFHMLSGKYIDKEKLTAGKYNVAIVFTSSLEGDHFNGAIGSTLLIDEVELIYRSE D >gi|226332199|gb|ACIC01000121.1| GENE 11 12996 - 13781 700 261 aa, chain + ## HITS:1 COG:no KEGG:BT_2082 NR:ns ## KEGG: BT_2082 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 261 1 261 261 533 100.0 1e-150 MKIHSYIFSLLACLVIALPGYSQVDRNETLIRSALHGLEYEIKAGFSIGGTAPLPLPVEI RSIDGYNPTLAISIGGEVTKWFAVQNKLGVIVGLRLENKAMTTEATVKNYNMEILGQGGE RISGVWTGGVKTKVHTSGLTIPLMATYKLSNRWNIKAGPYFSYLLSREFSGHVYEGYLRE GDPVGPKVEFTDGKIATYDFSDDLRHFQWGLQIGAGWRAFKHLNIYADLTWGLNDIFRND FQTITFAMYPIYLNIGFGYAF >gi|226332199|gb|ACIC01000121.1| GENE 12 13797 - 14360 575 187 aa, chain - ## HITS:1 COG:no KEGG:BT_2083 NR:ns ## KEGG: BT_2083 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 187 1 187 187 341 98.0 6e-93 MKKNLLYLWALICSVSLLTACSSDDDNTVNDETTPPEEEAVVTAPDVVGTYWGNLDISML PDGSDQEVVIADGLPKFITFSQVSDTEVKMELKEFELFINGNILKFGDIVIDKCAVKKET DASTFTGQQDLTFQGDAAALGTCATSIEGTVQGGNATMNIQVKVPALQQTVKVAFSGVKQ VEESGKD >gi|226332199|gb|ACIC01000121.1| GENE 13 14519 - 15595 1051 358 aa, chain + ## HITS:1 COG:sll1747 KEGG:ns NR:ns ## COG: sll1747 COG0082 # Protein_GI_number: 16330007 # Func_class: E Amino acid transport and metabolism # Function: Chorismate synthase # Organism: Synechocystis # 1 351 1 353 362 361 53.0 1e-100 MFNSFGNIFRLTSFGESHGKGVGGVIDGFPSGITIDEEFVQQELNRRRPGQSILTTPRKE ADKVEFLSGIFEGKSTGCPIGFIVWNENQHSNDYNNLKEVYRPSHADYTYKVKYGIRDHR GGGRSSARETISRVVAGALAKLALRQLGISITAYTSQVGAIKLEGTYSDYDLDLIETNDV RCPDPEKAKEMADLIYKVKGEGDTIGGTLTCVIKGCPIGLGQPVFGKLHAALGNAMLSIN AAKAFEYGEGFKGLKMKGSEQNDVFFNNNGRIETHTNHSGGIQGGISNGQDIYFRVVFKP IATLLMEQETVNIDGVDTTLKARGRHDACVLPRAVPIVEAMAAMTILDYYLLDKTTQL >gi|226332199|gb|ACIC01000121.1| GENE 14 15697 - 17061 1348 454 aa, chain + ## HITS:1 
COG:DR2025 KEGG:ns NR:ns ## COG: DR2025 COG0624 # Protein_GI_number: 15807020 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Deinococcus radiodurans # 18 441 18 440 459 402 48.0 1e-112 MNEIQKYIAANEPKIMEDLFSLIRIPSISALPEHHDDMLACAQRWAQLLLEAGVDEALVM PSKGNPIVFAQKIVDPNAKTVLVYAHYDVMPAEPLELWKSQPFEPEVRDGYIWARGADDD KGQSFIQVKAFEYLVKNELLKNNVKFIFEGEEEIGSPSLEAFCEEHKELLKADVILVSDT SMLGAELPSLTTGLRGLAYWEIEVTGPNRDLHSGHFGGAVANPINVLCEIISKVTDKDGR ITAPGFYDDVEEVPQAEREMIAHIPFDEKKYKEAIGVKELFGEKGYSTLERNSCRPSFDV CGIWGGYTGEGSKTVLPSKAYAKVSCRLVAHQDHHKISQMFADYILQMAPDTVQVKVTPM HGGQGYVCPISLPAYQAAEKGFEIAFGKKPLAVRRGGSIPIISTFEQVLGTKTVLMGFGL ESDAIHSPNENFSLDIFRKGIEAVIEFHQEYAKR >gi|226332199|gb|ACIC01000121.1| GENE 15 17187 - 18245 895 352 aa, chain - ## HITS:1 COG:mlr8141 KEGG:ns NR:ns ## COG: mlr8141 COG3049 # Protein_GI_number: 13476735 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Penicillin V acylase and related amidases # Organism: Mesorhizobium loti # 8 346 9 348 350 382 54.0 1e-106 MKKKLTGVALVLAAVSLMGIQPAEACTRAVYLGPDGMVVTGRTMDWKEDIMSNIYVFPRG MQRAGHNKEKTVNWTSKYGSVIATGYDIGTCDGMNEKGLVASLLFLPESVYSLPGDTRPA MGISIWTQYVLDNFATVREAVDEMKKETFRIDAPRMPNGGPESTLHMAITDETGNTAVIE YLDGKLSIHEGKEYQVMTNSPRYELQLAVNDYWKEVGGLQMLPGTNRSSDRFVRASFYIH AIPQTADAKIAVPSVLSVMRNVSVPFGINTPEKPHISSTRWRSVSDQKNKVYYFESTLTP NLFWLDLKKIDFSPKAGVKKLSLTKGEIYAGDAVKDLKDSQSFTFLFETPVM >gi|226332199|gb|ACIC01000121.1| GENE 16 18486 - 19361 361 291 aa, chain + ## HITS:1 COG:no KEGG:BT_2087 NR:ns ## KEGG: BT_2087 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 291 1 291 291 567 98.0 1e-160 MRMRQKVSISRLPIRLGIFLLLYFVPQLLSAQENNDSTHILRNGASITPDDVLKQANINL FPEKSTTYSLYSPGQLEHIPRFSLKRDISLPYQTNPSLLFRGDYSTGGVLHQFDHGALFG SGSQTSMAGIGRFNSASLGYQHIFNEQFELQVRANALKVNMSHITGQAFSTSGAFLYHPS DHVTFKVFGSYDIGNSYGMSTHSYGATMSVDMSDRFGMEMGVQRYYDAMSGRWETVPVMI 
PYYHFDKFTLGLDVGGIIYEILRNAVFDKNRGIGGPSGPTIGPPRMHIPIR >gi|226332199|gb|ACIC01000121.1| GENE 17 19464 - 19658 110 64 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPALNKKFSIDLLHSKILTIFVIEIFEIMQRNAICSKTFATNVLPVTFEFSPYFLIFKKI KSIT >gi|226332199|gb|ACIC01000121.1| GENE 18 19908 - 20240 337 110 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570714|ref|ZP_04848122.1| ## NR: gi|253570714|ref|ZP_04848122.1| predicted protein [Bacteroides sp. 1_1_6] # 1 110 9 118 118 200 100.0 2e-50 MEAGNEDIEMLKALIQKSFSEQRDLITRLETALDAMTSFNGKQMLDSRDMRLQLKVCDRT LIRWRNSGKLPYFKLSGKIYFWASDVYRFLREEYRDENPDIRNNKPVKKS >gi|226332199|gb|ACIC01000121.1| GENE 19 20244 - 20618 368 124 aa, chain + ## HITS:1 COG:no KEGG:BVU_2108 NR:ns ## KEGG: BVU_2108 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 6 107 3 105 109 67 34.0 2e-10 MENQSKVIMIERNKFASLVKAHRKCLQMLSILAYACATKEVQLTFTLEEICELLQISREE VERQRQKGYIRFVHRNGMTVYEITDILRLKNMLEMGGVYRKIDEKAFILEVAEKEKKETE PSIR >gi|226332199|gb|ACIC01000121.1| GENE 20 20624 - 20689 72 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRQLVHASLFSGFGAADLAAE >gi|226332199|gb|ACIC01000121.1| GENE 21 20912 - 21400 160 162 aa, chain + ## HITS:1 COG:no KEGG:BDI_0861 NR:ns ## KEGG: BDI_0861 # Name: not_defined # Def: site-specific DNA-methyltransferase # Organism: P.distasonis # Pathway: Cysteine and methionine metabolism [PATH:pdi00270]; Metabolic pathways [PATH:pdi01100] # 1 101 94 194 427 133 61.0 2e-30 MLRAIHEIRPGWIVGENVGGILQVVQPGKAAHVGCQPSLFGEDQPVYRKREEYVVETICR DLEREGYTVQPLLIPACAVGAPHRRDRVWFVAHADSLGLQGKGTRKPTEGPAGDHPRDAA AHPDSNGDTPLATGGTLETSGAHLHARDGRRGEESQRPDGLP >gi|226332199|gb|ACIC01000121.1| GENE 22 21676 - 22452 659 258 aa, chain + ## HITS:1 COG:lin1738_2 KEGG:ns NR:ns ## COG: lin1738_2 COG3645 # Protein_GI_number: 16800806 # Func_class: S Function unknown # Function: Uncharacterized phage-encoded protein # Organism: Listeria innocua # 116 249 11 136 152 89 38.0 9e-18 MRTLVITEHPEFGKVRTVEAGGRVWFCARDVASALGYANPKDAVNRHCRPKGVCVHDLLT 
AGGRQKVKFIDEGNLYRLMACSRLPSAERFESWIFDELVPRTLKEGGYLLEKEGETDAEL LSRTLQLAEAKLKERDRYISGLEKENALNALKLSLQAPKVRYFDEVLHSPSTYTVTQIAK ELGMSGRELNRRLKALGIQFRLGGTWLLTARYQKEGYTRTHTHSWQSRCGETGTAMHTVW TEKGRLFIHCLFSGLSLF >gi|226332199|gb|ACIC01000121.1| GENE 23 22545 - 22874 214 109 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570719|ref|ZP_04848127.1| ## NR: gi|253570719|ref|ZP_04848127.1| predicted protein [Bacteroides sp. 1_1_6] # 1 109 1 109 109 217 100.0 2e-55 MRIVEHSVRRWEAYPEADTETCLWEACVEHEGEYYEVQVEAREVGDDKMYYMGYIRNTQC IKAYFSDGWLGFGFSDIRGTCKYILKSMRKVIRGDYSSIRGAGHSVDMP >gi|226332199|gb|ACIC01000121.1| GENE 24 22999 - 23466 324 155 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570720|ref|ZP_04848128.1| ## NR: gi|253570720|ref|ZP_04848128.1| predicted protein [Bacteroides sp. 1_1_6] # 8 155 1 148 148 245 100.0 6e-64 MERQNPGMVRFMDRLNEKMELLDEKIQNKAAKAGEDYLSFFESHAEEAYKEYYLYKCFRD LRQKARESDSPEKVLEYLKKRQNVCLDTLLRQDIAARSTSPMANMAHTLRLECVQQLVED YGHFIRILADTLRQQETRRDTRTLREKENRRGPKL >gi|226332199|gb|ACIC01000121.1| GENE 25 23470 - 23934 255 154 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570721|ref|ZP_04848129.1| ## NR: gi|253570721|ref|ZP_04848129.1| predicted protein [Bacteroides sp. 1_1_6] # 1 154 1 154 154 293 100.0 3e-78 MRQENWSVRKLRKLLDNKIAAVENAMREGLESASTDYCRFFEWKAAYLYKGELLRGHYRL LRNKVSRTDDVEDLKRYFGDIAMYHREKLAGRTPAGSGTHPMADTCRALSLECSRQVLKD CMEFSRILSLGQPEEDMKKENRQPVKHRTNGLKR >gi|226332199|gb|ACIC01000121.1| GENE 26 23940 - 24404 125 154 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570722|ref|ZP_04848130.1| ## NR: gi|253570722|ref|ZP_04848130.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 154 1 154 154 271 100.0 6e-72 MTEQDKYGMEQLAEYLDMRIRYEEKTIKEIRNKLDRDYLYHFAWTGEELFKSHFMVKQYG ELRQVIRQVGVPGEVHGYIRHKREECLKELVSGSIRRRSTDDISNLAHTYRLECMQRLVK DYTGFERLLSMKAPQKEIRTKKAPERKRPDGLKM >gi|226332199|gb|ACIC01000121.1| GENE 27 24420 - 24893 570 157 aa, chain - ## HITS:1 COG:no KEGG:BVU_2146 NR:ns ## KEGG: BVU_2146 # Name: not_defined # Def: conjugate transposon protein TraQ # Organism: B.vulgatus # Pathway: not_defined # 10 148 8 147 153 149 55.0 3e-35 MKLGTFIMGCALALLACTFAGCDDRLEVRQAYDFSLSSWYLQEGIASDETVEIRLTLDRE GDYREARYRIGYIQLEGDGEVFAADGTRLVNRELTELSGIAGLDTTDIRRQVFTLFYRNT GEDNPEIRFVAVDNFRVERTLDIPFSIEDRANTDTID >gi|226332199|gb|ACIC01000121.1| GENE 28 24914 - 25498 333 194 aa, chain - ## HITS:1 COG:no KEGG:BF1350 NR:ns ## KEGG: BF1350 # Name: not_defined # Def: conjugate transposon protein TraO # Organism: B.fragilis # Pathway: not_defined # 1 194 1 193 193 179 48.0 4e-44 MRKYLFITMCAFLSLWAGKVSAQRYLPGQTGLELRLGGVDDFGKNVRHLRGNFQVGMALS RYNRNRSRWVVGADYVKKHYAYKETAVPKEQFTAEAGCLFPFLSDRGRNVFLSAGLSALA GYERVNRNGRLLYDGATLRNGGAFIYGFAPSFEMEAYLTDRWAFLFNVRQRVFFGSSVGH FHTVVAVGLKYIID >gi|226332199|gb|ACIC01000121.1| GENE 29 25503 - 26396 768 297 aa, chain - ## HITS:1 COG:no KEGG:BVU_2144 NR:ns ## KEGG: BVU_2144 # Name: not_defined # Def: conjugate transposon protein # Organism: B.vulgatus # Pathway: not_defined # 1 294 1 296 299 319 53.0 7e-86 MRTFILLIMAALAGMTCRAQERKPKGGVRTLTAGQHITPYRIGVPFSKTVHVLFPSEVRY VDLGSTDIIAGKADGVENVVRVKAAVRDFPGETNFSVITGDGSFYSFLVSYEEEPEALNI NMDSRFPTGPSTGGSAVRVTELGEEDPSVIASLMYTVHRLDRRDVKHIGCRQFGMQALLK GIYVHKDLLFLHISLANSSHVPFDIDFVRFKVVDKKVAKRTAQQETYIEPVRALNELRRI GGKGEGRIVYAFHKVVIPDDKLIEVEIYEKGGGRHQWFHIENSDLVDARLVDELVKE >gi|226332199|gb|ACIC01000121.1| GENE 30 26438 - 27391 688 317 aa, chain - ## HITS:1 COG:no KEGG:BF1352 NR:ns ## KEGG: BF1352 # Name: not_defined # Def: conjugate transposon protein TraM # Organism: B.fragilis # Pathway: not_defined # 26 317 142 438 438 260 49.0 6e-68 MQSLSDLAGRLLEGKGQTDGVSGPDGSIRQSADTYERITEQLETFYETPQETAGGLEQKV 
DELNRRLEEAEAENRRAESRERMMEQSYRMAARYLNPKPEATPGEAEEEEKPEPVPVSRA ESGTTSALKQPLADSAFVTSLVVERNYGFNTAVGSSYRMGMNTIPACISESQTLEQGGRV KLRLLEPLQAGSVTIPANSLVTGTADIRGERLDILVSSIEYAGNIIPVELATYDIDGQRG IFVPGSESRTAAKEVLGDVSQSMGGSISFAGSAGQQVAMDLTRGVLQGGTRFISQRVKAV KVKLKAGYKVLLATKKQ >gi|226332199|gb|ACIC01000121.1| GENE 31 27542 - 27745 206 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|295086045|emb|CBK67568.1| ## NR: gi|295086045|emb|CBK67568.1| hypothetical protein [Bacteroides xylanisolvens XB1A] # 1 67 1 67 67 105 94.0 1e-21 MENRKSNMEEGHVPAEEAPEGTRPAEAGSVPSEGTKTEKRLSGEEILRRRKYIVIPVFVL VFLAVLY >gi|226332199|gb|ACIC01000121.1| GENE 32 27726 - 28022 206 98 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570727|ref|ZP_04848135.1| ## NR: gi|253570727|ref|ZP_04848135.1| predicted protein [Bacteroides sp. 1_1_6] # 1 98 14 111 111 176 100.0 3e-43 MDIKSTARERLKAYLDGLSEKQRRGILVAMLAFTCVASGIVLLRAAGRLTPARERMELPF GEGSLSDTLRRLPDPLKGQVTSIYRTIKQPQQNGKQEK >gi|226332199|gb|ACIC01000121.1| GENE 33 28064 - 28687 434 207 aa, chain - ## HITS:1 COG:no KEGG:BT_2292 NR:ns ## KEGG: BT_2292 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 207 1 207 207 301 71.0 1e-80 MEFKSLKNIESSFRQIRLFSLVFLASCTLLTVFSVWKSYRFAEEQREKIYVLDEGKSLIM ALSQDAARNRPVEAREHVRRLHELFFTLAPDKAAIESNINRAMYLADKSLYGYYRDWNEK GFYNRLISGNVNQTVVVDSMGCDFDSYPYRVTTYARQMIIRGSNITERSLVTRCRLQSTV RSDNNPQGFLAEHFEILENRDLRTVER >gi|226332199|gb|ACIC01000121.1| GENE 34 28724 - 29743 1020 339 aa, chain - ## HITS:1 COG:no KEGG:BVU_2139 NR:ns ## KEGG: BVU_2139 # Name: not_defined # Def: conjugate transposon protein # Organism: B.vulgatus # Pathway: not_defined # 1 311 1 308 337 347 52.0 4e-94 MVLLSLSFDSLHGILRRLYDEMTELCQPMMTLATAIAALGALFYIAYRVWQSLSRAEPVD VFGMLRPFCLGICILFFDVIVLGSLNGILSPVVQGTGAMLHDQTFDVRKFQEEKDRLLSE MPLKVAVTGEFNPTDEELEKEIADMGWNGEDYAVMQRMSHTCYSFSLQSIVQTIMRKVME FLFRAASLVVDTIRTFFLIVLSVLGPIAFAISVFDGFQGTLVQWLARYISVYLWLPISDL LGVMLARIQTLVLQSEMEMIQDPLSVFSPDGSTAIYLVFMLIGLVGYFCVPTVANWVVQA 
GGMGAYNRNMNNTASKAANVGGAVAGASTGNVAGVLLKK >gi|226332199|gb|ACIC01000121.1| GENE 35 29767 - 30393 673 208 aa, chain - ## HITS:1 COG:no KEGG:BT_0091 NR:ns ## KEGG: BT_0091 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 208 1 209 209 212 52.0 6e-54 MRRKIFSALCAFLLTVPAMHAQWVVTDPGNLAQGIVNSVNEMVETSETAQNAMSTFKEAS KIYEQSRKYYDMLKSVNSLVSGSLKVKESVLMLSEISESYVTNYGKMLTDENYTQRELEA IAFGYNNIMKKSSASVAGLKDIINPTGMSMTDKERLDLIERVYREMRHYRTLVNYYTRKN LRVSYLRAREKGETEQVLKLYGEDERYW >gi|226332199|gb|ACIC01000121.1| GENE 36 30604 - 32925 1688 773 aa, chain - ## HITS:1 COG:PSLT088_2 KEGG:ns NR:ns ## COG: PSLT088_2 COG3451 # Protein_GI_number: 17233453 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Salmonella typhimurium LT2 # 433 733 184 472 593 72 24.0 3e-12 MRNILKAATLESKFPVLSVERGCILSKDADVTVGFKVRLPELFTLTSAEYAAMHGTWVKA IKVLPCYTVVHRQDWFLEEKYRAVTGRDDMGFLDRSFELHFNERPFLNHYAYLFITRTTR ERSLQQSNFSILCRGAIIPREARDREAVVRFMESVDQFERIMGDSGLVGLERLTTDEIVG TGKKAGIIEKYFSLSQENTTTLKDISLSPAEMRIGDEILSVHTLSDADDLPAAVRTDGRF EKYSTDRSDCRLSFAAPVGLMLSHNHIYNQYVFMDDPADTLQHLEKTARNMQSLSRYSRA NQINRQWIEEYLNEAHSFGLIPVRCHCNVMSWADSRERLRNVKNDTGSALAQMGCKPRYN TVDAPTLYWAAIPGGEGDFPAEESFHTFIEQAACLFIGETNYASSPSPFGIKMADRLSGR PLHLDISDEPMRRGITTNRNKFILGPSGSGKSFFTNHLVRNYYEQGAHILLVDTGNSYKG LCRLIHERTRGEDGIYITYEEDRPITFNPFYSEDRQFDVEKRESIKTLILTLWKREDELP KRSEEVALSGAVNAYIRRITENPELRPDFNGFYEFVRDDYRRMMEKKRVREKDFDIDGFL NVLEPFYKGGDYDFLLNSDKEIDLTHKRFIVFELDNISSNKVLLPVVTLIIMETFISKMR RLKGIRKVILLEEAWKSLMSPNMATYVLYLFKTVRKYFGEAVVVTQEVDDIISSPIVKEA IINNSDCKILLDQRKYLNKFERIQTLLGLTDKECSQILSINRANDPSRRYKEV >gi|226332199|gb|ACIC01000121.1| GENE 37 32888 - 33238 95 116 aa, chain - ## HITS:1 COG:no KEGG:BF1242 NR:ns ## KEGG: BF1242 # Name: not_defined # Def: putative transmembrane conjugate transposon protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 2 106 3 107 110 125 53.0 5e-28 
MEHKVNKGADRGVEFSGLQAQYLYVLAGGLLSVFLVFVILYMAGVNRWFCILFGTVSAST LVLTVFRLGRKYGPHGLMKLAAAKCRPFHIINRKRIISLFKRSSHEKHPESGNTGK >gi|226332199|gb|ACIC01000121.1| GENE 38 33249 - 33446 90 65 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_3006 NR:ns ## KEGG: Fjoh_3006 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 1 65 63 127 127 104 80.0 1e-21 MVTSYFDPATKLIYAAGALCGLIGAIKVYSKFSSGDPDTGKVAGSWFGACVFLIVAATVL RAFFL >gi|226332199|gb|ACIC01000121.1| GENE 39 34153 - 34581 292 142 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570735|ref|ZP_04848143.1| ## NR: gi|253570735|ref|ZP_04848143.1| predicted protein [Bacteroides sp. 1_1_6] # 1 142 77 218 218 251 100.0 1e-65 MSEKRTDTAMKGRKAPVPEFPAPFGRPEDGEVPERYEITGYVERPDPHRARGVTFDDMDL LSRFMATDRLSEPERSHARQTLERLEGTDMERILQAGILGSDETLKRYMELYVDSDREPP LLAGKDRESIIRDFDIMDFIPR >gi|226332199|gb|ACIC01000121.1| GENE 40 34816 - 35616 636 266 aa, chain - ## HITS:1 COG:no KEGG:BF3019 NR:ns ## KEGG: BF3019 # Name: not_defined # Def: conjugate transposon protein TraA # Organism: B.fragilis # Pathway: not_defined # 1 255 2 257 262 292 56.0 7e-78 MKKEPLFIAFASQKGGVGKTAFTVLTAGILRYQRGYNVAVVDCDPPQHSIGLMRDRDLEC VKENDSLKMALYRQYKQKPQKAYPIIKSEPENAVENFYRYMREQETEFGIVLFDLPRAFR YMGMIRTLALMHHIFIPLKADDAVMQSTLQFASVIREELVSHEDYPLKGIHLFWNMIDKR EDKGTFEAWNKTICDSKLHLMTTCVPETKRYNREASSMRGGIFRSTLLLPDNRQVKGSGL MELVDEICGIIGLAEKQGESPMTNLM >gi|226332199|gb|ACIC01000121.1| GENE 41 35913 - 36146 202 77 aa, chain - ## HITS:1 COG:no KEGG:BF1525 NR:ns ## KEGG: BF1525 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 76 4 79 81 86 55.0 2e-16 MNKHMYILADGGRIAASDPSEFVRVLREGSWFDSGCTDGEYMVNFSGRYRELHGVTVRTD TPEHFMDDLKKYGYIKG >gi|226332199|gb|ACIC01000121.1| GENE 42 36143 - 37072 698 309 aa, chain - ## HITS:1 COG:no KEGG:BT_2465 NR:ns ## KEGG: BT_2465 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 303 1 303 304 580 90.0 1e-164 
MNEQVRSILAQETTKTSKIRQLFLLGVPRAEIARMVTNGNYGFVVNALRRMSEREGGLNI HPATAALDYTFNRKFGIEIEAYNCSRERLARELREAGIEVTVEGYNHTTRPYWKLVTDGS LNGNNTFELVSPILVGEAGLRELEKVCWVLDLCEVKVNESCGLHVHIDAAGFSMETWRNL ALSYKHLEPVIDRFMPASRRDNYYCKGLGHVSDAMIRSARTVDELKSRIGNRYHKVNLEA YSRHKTVEFRQHSGTTNFTKIRNWVLFLHKLVTFATRGQVPAATALQDIPFLDSEQKLYY KLRTKKLSA >gi|226332199|gb|ACIC01000121.1| GENE 43 37219 - 37818 357 199 aa, chain - ## HITS:1 COG:no KEGG:BF3019 NR:ns ## KEGG: BF3019 # Name: not_defined # Def: conjugate transposon protein TraA # Organism: B.fragilis # Pathway: not_defined # 1 199 64 262 262 368 89.0 1e-101 MKNDDLKVNLYRQYERIRKPAYPVIKSDPEKAMEDLRRYMDEKGETFDIVLFDLPGTLRS EGVVHTVSAMDYIFVPLKADNIVMQSSLQFAKVLEEELIAKGNCNLKGIRLFWNMVDRRG RKDLYDAWNRVIHRMGLRLLSSHIPNTLRYNREADTVCKGVFRSTLFPPDPRQEKGSGLP ELVEEICPAIGLEESDTDR Prediction of potential genes in microbial genomes Time: Thu May 12 02:26:33 2011 Seq name: gi|226332198|gb|ACIC01000122.1| Bacteroides sp. 1_1_6 cont1.122, whole genome shotgun sequence Length of sequence - 66793 bp Number of predicted genes - 56, with homology - 52 Number of transcription units - 32, operones - 12 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 22 - 70 0.8 1 1 Op 1 . - CDS 77 - 463 136 ## BF3020 hypothetical protein 2 1 Op 2 . - CDS 489 - 905 216 ## BF3021 hypothetical protein - Prom 925 - 984 5.5 + Prom 863 - 922 8.1 3 2 Tu 1 . + CDS 1027 - 1200 76 ## + Term 1346 - 1380 0.2 4 3 Tu 1 . - CDS 2480 - 2719 62 ## gi|253570746|ref|ZP_04848154.1| predicted protein - Prom 2856 - 2915 11.8 5 4 Op 1 . - CDS 4137 - 5675 1057 ## CYA_1022 leucine-rich repeat-containing protein 6 4 Op 2 . - CDS 5689 - 6558 618 ## gi|253570748|ref|ZP_04848156.1| predicted protein 7 4 Op 3 . - CDS 6587 - 8053 1038 ## Coch_1071 hypothetical protein 8 4 Op 4 . - CDS 8067 - 9014 842 ## BT_3237 hypothetical protein 9 4 Op 5 . - CDS 9016 - 10512 1176 ## Phep_0306 hypothetical protein 10 4 Op 6 . - CDS 10524 - 14108 2888 ## Phep_0305 TonB-dependent receptor plug 11 4 Op 7 . 
- CDS 14116 - 14235 58 ## 12 4 Op 8 . - CDS 14311 - 15492 761 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 15555 - 15614 10.6 + Prom 15560 - 15619 9.2 13 5 Tu 1 . + CDS 15693 - 16235 323 ## BT_3277 RNA polymerase ECF-type sigma factor 14 6 Tu 1 . + CDS 16367 - 16627 96 ## gi|298384233|ref|ZP_06993793.1| two-component system sensor histidine kinase/response regulator + Prom 16703 - 16762 2.1 15 7 Tu 1 . + CDS 16833 - 16991 68 ## - Term 16764 - 16806 5.3 16 8 Tu 1 . - CDS 17005 - 17397 180 ## BF2872 hypothetical protein - Prom 17444 - 17503 6.2 + Prom 17814 - 17873 7.0 17 9 Tu 1 . + CDS 17969 - 18280 410 ## gi|253570756|ref|ZP_04848164.1| predicted protein + Prom 18712 - 18771 2.9 18 10 Tu 1 . + CDS 18865 - 19056 67 ## gi|253570759|ref|ZP_04848167.1| predicted protein + Term 19121 - 19151 1.1 19 11 Tu 1 . - CDS 18987 - 19205 109 ## BF1977 hypothetical protein 20 12 Tu 1 . - CDS 19309 - 19446 83 ## - Prom 19468 - 19527 6.2 - Term 19952 - 20009 10.4 21 13 Tu 1 . - CDS 20092 - 21171 857 ## COG3049 Penicillin V acylase and related amidases - Prom 21277 - 21336 6.1 22 14 Tu 1 . - CDS 21787 - 21942 70 ## gi|253570763|ref|ZP_04848171.1| predicted protein - Prom 21984 - 22043 1.8 23 15 Tu 1 . - CDS 22700 - 22855 71 ## gi|253570765|ref|ZP_04848173.1| conserved hypothetical protein - Prom 22938 - 22997 6.7 - Term 22914 - 22956 0.7 24 16 Tu 1 . - CDS 23015 - 24232 567 ## COG3182 Uncharacterized iron-regulated membrane protein - Term 24246 - 24284 2.9 25 17 Op 1 . - CDS 24322 - 25710 1244 ## BDI_3402 hypothetical protein 26 17 Op 2 . - CDS 25751 - 28165 1786 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 28203 - 28262 4.0 27 18 Op 1 . - CDS 29555 - 29764 127 ## gi|293372895|ref|ZP_06619268.1| conserved domain protein 28 18 Op 2 . - CDS 29843 - 30859 881 ## BVU_3699 hypothetical protein 29 18 Op 3 . 
- CDS 30901 - 33237 826 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 33259 - 33318 2.2 + Prom 33213 - 33272 5.9 30 19 Tu 1 . + CDS 33368 - 33670 69 ## gi|295085997|emb|CBK67520.1| hypothetical protein - Term 33998 - 34042 9.5 31 20 Op 1 . - CDS 34093 - 37341 1760 ## BT_4224 hypothetical protein 32 20 Op 2 . - CDS 37378 - 38781 698 ## gi|253570774|ref|ZP_04848182.1| conserved hypothetical protein 33 20 Op 3 . - CDS 38790 - 39704 538 ## gi|253570775|ref|ZP_04848183.1| conserved hypothetical protein - Prom 39731 - 39790 6.3 - Term 39774 - 39810 6.1 34 21 Op 1 . - CDS 39899 - 41332 1348 ## gi|253570776|ref|ZP_04848184.1| conserved hypothetical protein 35 21 Op 2 . - CDS 41388 - 42890 871 ## PG2167 immunoreactive 53 kDa antigen PG123 36 21 Op 3 . - CDS 42946 - 43557 483 ## BF2182 hypothetical protein 37 21 Op 4 . - CDS 43570 - 43854 127 ## gi|301165273|emb|CBW24844.1| hypothetical protein - Prom 44012 - 44071 9.2 - Term 44125 - 44157 -0.1 38 22 Tu 1 . - CDS 44386 - 45495 615 ## BT_0727 hypothetical protein - Prom 45535 - 45594 4.8 - Term 45548 - 45591 3.3 39 23 Op 1 . - CDS 45613 - 45804 102 ## gi|253570780|ref|ZP_04848188.1| conserved hypothetical protein 40 23 Op 2 . - CDS 45801 - 46028 224 ## gi|253570781|ref|ZP_04848189.1| conserved hypothetical protein 41 24 Tu 1 . - CDS 46154 - 46846 399 ## BDI_3897 hypothetical protein - Prom 46932 - 46991 2.5 - Term 46966 - 47010 8.0 42 25 Op 1 1/0.000 - CDS 47023 - 47310 75 ## COG0776 Bacterial nucleoid DNA-binding protein - Prom 47333 - 47392 1.8 43 25 Op 2 . - CDS 47436 - 47858 386 ## COG1970 Large-conductance mechanosensitive channel - Prom 47880 - 47939 2.2 - Term 47885 - 47934 -0.1 44 26 Op 1 . - CDS 47941 - 48633 398 ## COG0580 Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) - Prom 48653 - 48712 2.5 45 26 Op 2 . - CDS 48719 - 49882 917 ## BT_1391 hypothetical protein - Prom 49910 - 49969 10.3 - Term 50767 - 50818 2.0 46 27 Tu 1 . 
- CDS 50866 - 51723 368 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 51875 - 51934 7.1 47 28 Op 1 8/0.000 - CDS 51990 - 52511 321 ## COG1475 Predicted transcriptional regulators 48 28 Op 2 . - CDS 52517 - 53794 722 ## COG3969 Predicted phosphoadenosine phosphosulfate sulfotransferase 49 28 Op 3 . - CDS 53794 - 54198 171 ## BVU_3704 hypothetical protein - Prom 54322 - 54381 5.1 - Term 54537 - 54595 17.2 50 29 Op 1 . - CDS 54611 - 56215 1280 ## BVU_3705 hypothetical protein 51 29 Op 2 . - CDS 56235 - 59231 2678 ## BF4254 hypothetical protein 52 29 Op 3 . - CDS 59319 - 59777 218 ## BVU_3708 hypothetical protein - Prom 59797 - 59856 5.7 - Term 59814 - 59860 4.3 53 30 Tu 1 . - CDS 59902 - 61380 799 ## COG0753 Catalase - Prom 61460 - 61519 7.5 - Term 61436 - 61481 2.2 54 31 Tu 1 . - CDS 61658 - 63493 1049 ## COG1874 Beta-galactosidase - Prom 63546 - 63605 9.5 55 32 Op 1 . - CDS 63620 - 64480 529 ## COG2273 Beta-glucanase/Beta-glucan synthetase 56 32 Op 2 . - CDS 64511 - 66787 1022 ## COG3525 N-acetyl-beta-hexosaminidase Predicted protein(s) >gi|226332198|gb|ACIC01000122.1| GENE 1 77 - 463 136 128 aa, chain - ## HITS:1 COG:no KEGG:BF3020 NR:ns ## KEGG: BF3020 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 127 1 127 128 206 87.0 3e-52 MTGKNVKDRTGPGRMDTARIREIMGEAPVCPRDGRGTSGNDIIHSARKSGSIQPSVYGEE YLHGIAGVQRRSLHIPAALHRKLFILAGASRNGKVTLEGFINHLVSRHLEEYREMVDMIL EESLPGRP >gi|226332198|gb|ACIC01000122.1| GENE 2 489 - 905 216 138 aa, chain - ## HITS:1 COG:no KEGG:BF3021 NR:ns ## KEGG: BF3021 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 138 1 138 138 235 91.0 5e-61 MNSEKEKNRLSDIVLERVGLTGNLLSDPVSPSPEPAMRMLGHERPVGAGKVTDPEEYKRR FLVPAPKAAEWKTAYIDGRLHRRIAMLVRAAGCGSISGFIIRLLELHMEEHREDIAALLG EVYRPWDEDGQPGGTARR >gi|226332198|gb|ACIC01000122.1| GENE 3 1027 - 1200 76 57 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIHMTLRSNGYSYAEVTSEARCRSGVEKRRDSVSLSTGPFKSLWAGVGAHIRPDTDS 
>gi|226332198|gb|ACIC01000122.1| GENE 4 2480 - 2719 62 79 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570746|ref|ZP_04848154.1| ## NR: gi|253570746|ref|ZP_04848154.1| predicted protein [Bacteroides sp. 1_1_6] # 1 79 1 79 79 142 100.0 6e-33 MITVDDSIRIIRDAFEIKWRKNLDFAHIQVERYTSSLQKADLSVMRKFYQNYIDKYQKIA DPLSRLEPLLPVAYSDAAP >gi|226332198|gb|ACIC01000122.1| GENE 5 4137 - 5675 1057 512 aa, chain - ## HITS:1 COG:no KEGG:CYA_1022 NR:ns ## KEGG: CYA_1022 # Name: not_defined # Def: leucine-rich repeat-containing protein # Organism: Cyanobacteria_CYA # Pathway: not_defined # 366 499 163 292 296 87 34.0 1e-15 MRYKILRLASAAFAILLVACHEQSEYLFGADAPLVEISADAENEGNFTPEAQTGYALLRS NARWKAESLSEWLTCNTTAGEYNDTIYFSMPENKGDTRIGKIVVRNTIGEERLDTLTVRQ QCAVSFIPDGYKIKIESDDLKAPLSGESFAEVKFEVNSPTAWRAFTKATDGWSTVITKKG ETSGSGSLIVTENETVVPRSMYIYVQSVEFPTLKDSILLTQSARPLKLEIISPTNKKIML DGAESQFIFSVNGDGKWKIEDAPGWVELEKTEYEGNANVFVKVSATGTERTAQLTIRSLV QTDKTDVLTIEQKNIPDGRLKDSLALVAIYQATKGENWKYTWKPELPLSDSNWPGVLFDT VDGELRVVDLSLLDYNMEGSLPNEIGWLTEIIKIKLQRNKLSGPLPASINRLTNLTHLYI TSNQFSGEFPDIPNLQKLTWIEMEFNRFTGEFPPVFSLLPKLSILKVKYNNFDPNTCVPA RFGGWKLSLYINPQRATHADAKTDYNLRDCTE >gi|226332198|gb|ACIC01000122.1| GENE 6 5689 - 6558 618 289 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570748|ref|ZP_04848156.1| ## NR: gi|253570748|ref|ZP_04848156.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 289 1 289 289 550 100.0 1e-155 MKYIIYNLWILMFLFACSESEQVEYSKLSVGFQKSKVDVGENTGILEIPVVLSGCNMDMP LQVSVQVSATDGDAVAGVDYELIDTRLAFEVCGQAMLKVRIMDNEEITDVIKTFTVNLKA ETPEVKSGISSIKVHIISDDVEKVTMAGHYTLTAQDFVEGTKLSSTPDGVEIVQDLDDSN KYYMRNMVLVNGDKVLPLTMGGDLYFTVDNSGNMAIPSQLKIGNYGDGEAFAVGLTTDGY FTEDPIKIEKKDNRLLIMGGGLAGIFIDDKDDLSIYYALKNIILEKVNH >gi|226332198|gb|ACIC01000122.1| GENE 7 6587 - 8053 1038 488 aa, chain - ## HITS:1 COG:no KEGG:Coch_1071 NR:ns ## KEGG: Coch_1071 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 1 184 1 169 450 78 32.0 5e-13 MKNIIGILLLAAGCYSCSDNEVDNVFDKTPEERIAAVKAEYKNILVEAPHGWKTYFGTTD KLGRWLILMDFDADGKVTMKCDPVDYYYIGGVSIEDPVTYRVDYSQFPELVFESFSQFSA WNEFYVDSDGDGYFDKYASPETQFIFDAYRDGKLYMKGKTNMGVGKKPEEIMTFVFEPAS EADWTLNGIADVKKAINYNESKGKYQRLEYKGELLESLFLIDAESRVVLYMSTEIAGSQM QILPFYITPTGFTLVSPLKIKGVGNIRQFDVSQDGTAVTESTHQQLRVVYTDKVPAVQRT PFSIFDKLYFGIEVYHSTGDPIGRNLKKYFDDLRPPYAPMEQHLNVIYFAKNIDKDMGKI PFKYAEELTIIYADSDPSYSSKDLYGDNPTFVRIPVSFETSADRMWIISLKGSVEQAFLD AYPDDAGYARDKAKAALPMLYRLLNERGWGLYAQGLDTNNPTVKVVDEEDVVGSFFTLYP YTQTDVDR >gi|226332198|gb|ACIC01000122.1| GENE 8 8067 - 9014 842 315 aa, chain - ## HITS:1 COG:no KEGG:BT_3237 NR:ns ## KEGG: BT_3237 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 313 1 285 292 117 25.0 5e-25 MHMKRSILYICCLSLCCFMSCSDDESTNYVPESKELSTDEIDVYFQENFQEKYGSTVRWQ WIDKYVDINYIVAPAMRKVLIPTGEMIRRFWIEPFILESKASGEFIRKHFPPEIVCVGSE LRNADGTRTLGYADAGVRITLTELNYFDLTNREWVIQQLHTMHHEFSHIIHQTYKMPNGF NKITEDTYTGQAWKDINTNAQAELIEKGISSPTTHEIDSVAQNMAIQRGMLTPYGTSSEF EDFAELVSLYLLTEPAVFEETYIASDPKRTFLDEGKANINKKLALVKEYYVSNFSIDLTH LREIILQRLNEISMK >gi|226332198|gb|ACIC01000122.1| GENE 9 9016 - 10512 1176 498 aa, chain - ## HITS:1 COG:no KEGG:Phep_0306 NR:ns ## KEGG: Phep_0306 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 21 477 19 475 481 257 37.0 9e-67 MKYLYIKIAVVLSAVVSLSSCDSFLDETPDNRLRLNSYETIAELVTNAYPEGSGVFMEWM 
SDNVGADPKNIQRSEMTQAYNWQDVEQEGQDTPSFYWSGNYKAIAHANQALEALEEVKEN NPGYKDAIRGEALACRAFAHFMLVSTFAKSYDPATAASDKGIVIMTKPEENLLATYERSS VQEVYDFVEKDLLEAIELVSDQYYKNSGKYHFNRAALMAFASRFFLFKKDYEKVEKYATE LLGTDYNPQMIRDYTQVYTGTTSEMMGRKFTAATLPSNLLLIRKDVLYGYKGYAGYRFNQ DVYNGLVLSQRDLRFSVSYSYGGTSSFLPKFQRDLIRKTSLTSSSGYPYVIEVAFRGEEV FFNRLEAWAMMGTTKLSQFESQFSRYLKAVYAGTVDYPILYQNYQRFYPGVPANELRLKM VLDEKRREFVEEGLRWFDICRHKLSVTHTDIAGVTRILQPDDSRKVIQIPESAIIYGDLQ PNERKNISEPEAHLVIRN >gi|226332198|gb|ACIC01000122.1| GENE 10 10524 - 14108 2888 1194 aa, chain - ## HITS:1 COG:no KEGG:Phep_0305 NR:ns ## KEGG: Phep_0305 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.heparinus # Pathway: not_defined # 27 1194 51 1219 1219 904 42.0 0 MSNKISYGLKYLIITLLLFCNAPWTAAQQPARKVTLNLESVSFENCLKAIERQTHYTFLY RKDLINLSKLVSFSCKDEALTAVLDSLLPSQNIAYRIQDLTIVLTPKQVRNKKTPTLIQG NIVDENKEPVIGASIKIKGTTQGTITDIDGNFSFEASLDDKVIISYVGLETIERKIGTKV MNIVMKDDAVSLGDVVVTGYQKINRKMFTGSASKVNADETVLKGNPDVTNSLQGKVAGVQ ITNVSSTFGASPVLTIRGNSSINGTNKPLWVVDGVILEDLMSVSAEELTSGNLSTLLSSG VAGLNPEDIESFEILKDVSATSLYGAQAMNGVIVITTKQGKKDRLTINYTGSIALKEKPD YRNFNIMNSDDEMSIYKELERKGWIDITTVARSENFGAMGKMFDEITRKNINWGPNGSLN EDFLSAYANSNTDWFNTLFHNSLVNQHSLSISGGGDKTTFYASLSYYTDNGMAVSSDKVN RYTASLRGTFAVSPKLNIGLKLSANIRDQRVPGTKDRNFDTSTGEFTRDFDINPFNYALN TSRSMRAYDSNGNLEFFRRSFADFNILHELRHNFVDLTVKDISTQMDLEYKPLKVLTLKG SFQYRSANTLREHKIHEFSNQAEAYRADYSQAIIDNNRFLFQNPSLPTANPYSILPEGGF YNTNEADLRHTYFRATADYSPKLGLDHVVNFFAGFEINKVDRKMRLNEGWGYLWDKGGVV VTHPDLTYYLKEQGEVYYDYAEGRDRRISSFINSAYSFKGKYIVNAGFRYDGSNQLGSSR EARYLPSWNVSAAWNMHDEAFMKGISEVVNTLKPKISYGYNGIMGPSTSAELAIYAKQTL RPSDNEVYNVIEGLKNKDLTWEKMYELNLGFEAGLFNNRIMMDFAHYRRKSIDLIDYVNT AGIGGQSIKLGNIGNMKSYGFEFAINSVNISTKDFDWKTNLNINFHKSKITKLNNFSRMS TAIANTGTAMLGYPQRGLFSVRFAGLDGEGIPTFYGANNEIVYDMNLQSRNDVDKILKYE GPLEPKGYGGLTNTFRYKDWTLSFGFVYKYGNVIRLDDEFYPYYDDFSSFPKELKNRWIM PGDENRTSVPAILDKRTYDRIEGKYTYQMYNKSTERVAKGDFIRLKDLSIGYRFPKEWLT GTCIHNVNLSFQATNLWLLYSDSKLNGIDPEFYLSGGVSSPVSRMYTFTVNLTF >gi|226332198|gb|ACIC01000122.1| GENE 11 14116 - 14235 58 39 aa, 
chain - ## HITS:0 COG:no KEGG:no NR:no MKIYYRNINKCGDNTRVEISMKFTNFLCDKHYELTIIKL >gi|226332198|gb|ACIC01000122.1| GENE 12 14311 - 15492 761 393 aa, chain - ## HITS:1 COG:RSc2919 KEGG:ns NR:ns ## COG: RSc2919 COG3712 # Protein_GI_number: 17547638 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Ralstonia solanacearum # 194 361 71 237 274 78 31.0 2e-14 MNNCHLIYIATLIRKSILGEITLEEKRQLEDWLDESARHRELFDKYNNPDFLASVDIEKD MEEAQEAYSNFMQRLKPAKPVRRIVWNRYVAAAALIAFLCASAIFVYMYDGGKPHSPSGL AVVPVQQMDTVPADSTDILLITASGEKIPVQSSQKMLSVSSGTLRIDEQPIEEYAQVKTV PAEVVYNTLKILRGKRFKMQLSDGTLVWLNADSEITFPSNFDGKERKVLAKGELFFDVTE NKSQPFIVETPTGNIRVLGTAFNIHCYEDEVPMTTLVRGKIAYSLGKESVILAPGQQCRV EDNKLSVREVDTYEYVSWIDEVLVFKDKELDEILSTLSRLYDVDIYYENPELKRLPFTGS FKQYEHLDKIIRMIEDCGLIRIKQDGKTLTISK >gi|226332198|gb|ACIC01000122.1| GENE 13 15693 - 16235 323 180 aa, chain + ## HITS:1 COG:no KEGG:BT_3277 NR:ns ## KEGG: BT_3277 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 178 4 184 189 87 35.0 2e-16 MARLNPDEVIISKIATDFPEGFRALFITYYKPLLRIAEKMIGVEFAEDIVQDTFLTIWNN KQSFDNILSLKSYLYTSVQNKCLNIIRTNQQFEKYKVEQSAALDEEYILDEEVISQLYQS IELLPDHYKEVMLKSMEGESIAKIALSMDTTEDAIKAYKRRAKQILKKELGDKAFILLFL >gi|226332198|gb|ACIC01000122.1| GENE 14 16367 - 16627 96 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|298384233|ref|ZP_06993793.1| ## NR: gi|298384233|ref|ZP_06993793.1| two-component system sensor histidine kinase/response regulator [Bacteroides sp. 
1_1_14] # 1 79 342 420 669 138 91.0 1e-31 MVQQLVELHHGKIILHSEIRKWSTFSIHFPQDKSSHNKDEFLNDNMEEKQDIMPSIYSNK ICMNDEHEPTNTQVIKKRNYFNSRKQ >gi|226332198|gb|ACIC01000122.1| GENE 15 16833 - 16991 68 52 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLAKKRELTRLQLSSNNCFYFTERIYPFSDFDLQLKRGTAIMCNLCKNAQCS >gi|226332198|gb|ACIC01000122.1| GENE 16 17005 - 17397 180 130 aa, chain - ## HITS:1 COG:no KEGG:BF2872 NR:ns ## KEGG: BF2872 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 111 19 129 278 98 48.0 7e-20 MHIIAYWRRKTGEVVFTTAWMADAGYVCDRYYHMAVMAGHRDACRIFYFLHPNLFGNYPL LKLFNSFSIRLAFYGHYYPLRYEVWLNDKTGILRSLFWFKDGEIHGIEFVKLIPFGFKMG SFLLIVCDND >gi|226332198|gb|ACIC01000122.1| GENE 17 17969 - 18280 410 103 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570756|ref|ZP_04848164.1| ## NR: gi|253570756|ref|ZP_04848164.1| predicted protein [Bacteroides sp. 1_1_6] # 1 103 1 103 103 181 100.0 1e-44 MSKTNEMYLDKARSLTKGLRKHLVTVKNYGVSASDLSKLEAAILEGEKLNAEVDRRRAEL NEIIPNTNRKLAEIRNLTVYLKGLIKPRVDSSHWPEYGIADKR >gi|226332198|gb|ACIC01000122.1| GENE 18 18865 - 19056 67 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570759|ref|ZP_04848167.1| ## NR: gi|253570759|ref|ZP_04848167.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 63 1 63 63 85 100.0 1e-15 MSFRLTSEILQPMKKCRKVARKKKELIQAEKNALLFANAFTYHNAASFSSSSFITFIDST SFA >gi|226332198|gb|ACIC01000122.1| GENE 19 18987 - 19205 109 72 aa, chain - ## HITS:1 COG:no KEGG:BF1977 NR:ns ## KEGG: BF1977 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 54 79 132 345 65 55.0 5e-10 MSPRNLWDVKKFYLCYYECDTKLRQAIAVLLWSHNLILMSYDLSPEYMVFYANEVESINV IKLLLEKDAALW >gi|226332198|gb|ACIC01000122.1| GENE 20 19309 - 19446 83 45 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKGLTVSKRFNEEGDNELLRHVVAVIESLRLQIAKQLNTVVMSSY >gi|226332198|gb|ACIC01000122.1| GENE 21 20092 - 21171 857 359 aa, chain - ## HITS:1 COG:BMEI0543 KEGG:ns NR:ns ## COG: BMEI0543 COG3049 # Protein_GI_number: 17986826 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Penicillin V acylase and related amidases # Organism: Brucella melitensis # 9 336 22 347 367 219 36.0 7e-57 MNTKLSTFLLASIGIVSQVPAGACTGITLKSKDGAIVAARTIEWAESAMNNMYVVVPRNQ ELQSLTPSGMDGIKFRSKHGYVGLAVEQKEFMVEGINEKGLSAGLFYFPGYGKYQPYDAS LKYKCLADFQVVSYVLAECGTLDEVKEALEKVRVIHIDPRSSTVHWRFTESSGKQMVLEI VDGQVHFYDNPLGVLTNSPGIEWHWTNLNNYINLQPGNAPVHKFGPLEMKNFGHGSGMLG LPGDFTPPSRFVRAAFFQLTAPEQPDARGCVFQAFHILNNFDIPTGTELAWGKAPADMPS ATQFTVASDTRNRMIYYRTMYNSNIRCIDLKEIDFDGVKYHAKPLDGTKTQPVEKISIR >gi|226332198|gb|ACIC01000122.1| GENE 22 21787 - 21942 70 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570763|ref|ZP_04848171.1| ## NR: gi|253570763|ref|ZP_04848171.1| predicted protein [Bacteroides sp. 1_1_6] # 1 51 1 51 51 92 100.0 5e-18 MRDNYRTPSQIKTDGGKLTDFTLVLFEETLKVISNKDVRDGLIPFSTVQAL >gi|226332198|gb|ACIC01000122.1| GENE 23 22700 - 22855 71 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570765|ref|ZP_04848173.1| ## NR: gi|253570765|ref|ZP_04848173.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 51 1 51 51 96 100.0 5e-19 MSSTPCSYEIVKTVFLFSCFTGLRYGDMFILEWYGMHKEADEKTLYIEHEH >gi|226332198|gb|ACIC01000122.1| GENE 24 23015 - 24232 567 405 aa, chain - ## HITS:1 COG:PA4513_1 KEGG:ns NR:ns ## COG: PA4513_1 COG3182 # Protein_GI_number: 15599709 # Func_class: S Function unknown # Function: Uncharacterized iron-regulated membrane protein # Organism: Pseudomonas aeruginosa # 1 393 1 364 395 136 26.0 9e-32 MKKIFRKIHLWLSVPFGLIITLVCFSGAMLVFENEVNEWSRPDLYYVETVKESPLSMDKL LEKVAMALPDSVSVTGVSISSDPGRAYQVSLSKPRRASLYVDQYTGEVKGKSERSGFFMF MFRMHRWLLDSMNPGNEGIFWGKMIVGVSTLLFVFVLISGIVIWWPRTRKALKNSLKITV SKGWRRFWYDLHVAGGMYALFFLLAMALTGLTWSFPWYRTAFYKVFGVEVQQRFSHGNEQ KHNEHKGNTRLANQEKKRGGNGRRSADKREGLENNHSDRCSAASPFACWQEVYDKLRHQN LEYKQISVSSGTASVSFDRFGNQRASDRYSFNTDTGEFTETSLYQHQDKSGKIRGWIYSV HVGSWGGMLTRILAFIAALIGAALPLTGYYLWMRKFIRQLIGTKK >gi|226332198|gb|ACIC01000122.1| GENE 25 24322 - 25710 1244 462 aa, chain - ## HITS:1 COG:no KEGG:BDI_3402 NR:ns ## KEGG: BDI_3402 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 457 1 466 471 360 46.0 6e-98 MKKFLLQAFAIAFCLSSGVFVASCDDDNNPVDNPPIDGETAYVVAATTGEASYLVVANSL DEGTVSTQGNGTEVIGGTYWVYKGLDYVFALVYNKGGAGTGASYYLGADGKMKEKYTYTY NRITSYGTWGDKVVTVSTGDSKITDEDQNVAQALLFNYLDATDGSQEEGTLLAENYLGNG EKVSWAGLVEANNKIYTSVIPMGMSKYGIKKWPEAVTDQELVTKTDGGSGSGAYTAGVIP STQYPDNAYVAIYSGTNFNETPVIAKTDKIGYACGRMRSQYYQTIWAADNGDVYVFSPGY GRTAVSSSDLKKVTGQKPSGAMRIKAGATDFDPDYYVNFEEIGTKHPIFRCWHISEDYFL LQLYKKGAEDMIGGGRNADVSELAVFKAEDKMIMPVTGLPANCKFGGEPYGEKGYVYMAV TVTTGENPAFYKIDAKTGKAVKGLTVQADAITTVGKMRYLSK >gi|226332198|gb|ACIC01000122.1| GENE 26 25751 - 28165 1786 804 aa, chain - ## HITS:1 COG:FN1971 KEGG:ns NR:ns ## COG: FN1971 COG1629 # Protein_GI_number: 19705267 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Fusobacterium nucleatum # 107 279 25 204 657 75 34.0 5e-13 MSNKYFLVFLAFLFSVSVYSQHGRHRSSLISGRVMSTEKEVVDFATVYLKGTNQGCYTDE 
KGIYHLKTTAGEYTLVVSAIGYQTVEKKVKLAKGERIKVNVTIAPAVKELGEVLVTTSGV GRVNQSAFNAVAIDAKKLHNSTQTLAGALTKVPGVKLRESGGVGSDMQLYIDGFSGKHVK IFIDGVPQEGAGAAFDLNNVPINFADRIEVYKGVVPVGFGTDAIGGVVNIVTNKQPGKWF MDASYSYGSFNTHKSYVRFGQTFKNGFMYEVNAFQNFSDNDYYVDTYVREFEIKEDGSVR FPPLDKNKIYHLKRFNDQYHNEAVIGKIGLVGKKWADRLALSFNYSHFYKEIQTGVYQDV VFGEKFRKGHSLVPSLEYYKKNLLVKNLDLLLTANYNHNITNNVDTASRAYNWRGDFYEK GSRGEQSYQNSESKNKNWNGTLKMNYHIGQVHTFTFSHVISDFERTSRSTIGASSKFTDF SIPKITRKNVSGLSYRLMPSDRWNVSAFAKYYRQYNKGPVSQNTDGIGNYINLSNTASAL GYGVAGTYFIWKDLQVKLSYEKAFRLPTTDELFGDEDLEAGKMNLKPEKSDNVNLSFSYN HQFGKHGLYAEAGLIYRDTKDYIKRGLDVLGGTSYGYYENHGHVRTKGYNLSLLYSFSHW FDIGGTFNSIDTRDYEKFLAGSSLQESMHYKVRMPNLPYRYANINANFYWNDLFVKGNVL SIGYDSYWQHDFPLYWENLGDKDSKNMVPEQFSHNLSLSYTMKNGRYNVSFECRNFTDAQ LFDNFSLQKAGRAFYGKFRVFFGK >gi|226332198|gb|ACIC01000122.1| GENE 27 29555 - 29764 127 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293372895|ref|ZP_06619268.1| ## NR: gi|293372895|ref|ZP_06619268.1| conserved domain protein [Bacteroides ovatus SD CMC 3f] # 9 64 3 54 140 63 57.0 6e-09 MKLVVFNFLVAVVIIIVTLGCAGSQKKTAPQEQSSFIATINVVNTENKGCYGTYEGILPC AGCGVSKLH >gi|226332198|gb|ACIC01000122.1| GENE 28 29843 - 30859 881 338 aa, chain - ## HITS:1 COG:no KEGG:BVU_3699 NR:ns ## KEGG: BVU_3699 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 338 1 338 338 682 97.0 0 MRTIVCKTVWALLVGVSLVLSLNSCSKDPVIPEDETKNKLHEDPAKMIVRLVECHLHADW NEIQKAGGPHQNPESPARYMKRVQEITYELKAGSGWTLAEGSQGKFYVQKNGEYKNGNNF TPAPVYLMFIYYYNSKGELMNGQFVENGQENIHQHFFTPENVRPTFDGKPEADDNDPEAL VDYLYVDTTPWDKTKHDNEAEITGGTNPVGLKGVIRFLKDRKEFDLKLRLYHGYNSKKNP QTNGFDPFYKPSGVLIQRGTWDINLSIPVVVFWSREEFVDVDLEADVNLIGEDSLDEDSN RTLHSIMETFNLTWKEALEEFISYTYQAGDVEAGSIWL >gi|226332198|gb|ACIC01000122.1| GENE 29 30901 - 33237 826 778 aa, chain - ## HITS:1 COG:PA0781 KEGG:ns NR:ns ## COG: PA0781 COG1629 # Protein_GI_number: 15595978 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Pseudomonas 
aeruginosa # 511 778 418 687 687 67 24.0 1e-10 MTFHAYLQYVCLLLVCLCTLPSPMEARVQADRQSTGTRDTLLVIDRENGLPIEGAYILTG ERLLVSSPRGMIVFGHGTCFMDTVLVQCLGYGSRRVPLNEVFKESSIHTVCLSPDIQKLG EVVVTGERAGASPNVVSRRLSSPEIRNALGTSLATLLERVSGVSSISTGTTVSKPVIQGM YGNRILIIHNGARQTGQQWGADHAPEVDMNGSSSVSVIKGSDAVRYGSDALGGIIVMEQS PLPFRKRSLQGGISVLYGSNGRRYVATGQLEGAFPGDFAWRLQGTWSNSGDRSTAHYLLN NTGTREYHASASLGYDRGRLRVEGFYSRFYSRTGVMLSAQMGSEDLLAERIRLGRPLHTD PFSRGIRPPCQEVTHQIAFGRMRLGMKKGGSIHWQSTWQKDDRQENRVRRLDSNIPTVSL HLNSFQHLLRWKRDYRSWQVEAGGQVMFIENHSRAGTGFVPVIPNYTETQAGIYGIGKYH LARGGVEAGLRLDMQETRASGYDWTGSLYGGIRKFNNVSYSLGGHYQLSRRWRLTSNFGL AWRAPHVYELYSNGNELGSGMFVRGDSAMHSERSYKWISSLRYGDGMFSVCLDGYLQWVD GYIYDGPEKETVTVISGAYPVFQYRQTPAFFRGMDFDLRFTPGGSWNYHAVVSFIRANER TKGNYLPYIPSFRFSHELAWIHETKSHLRLRLNIRHRFTAKQRRFDPDTDLIPYTPPAYH LLGFDAGLELPVKRGHQVRFMLSADNLLNREYKEYTNRSRYYAHDMGRDVRCGVNWIF >gi|226332198|gb|ACIC01000122.1| GENE 30 33368 - 33670 69 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|295085997|emb|CBK67520.1| ## NR: gi|295085997|emb|CBK67520.1| hypothetical protein [Bacteroides xylanisolvens XB1A] # 1 100 21 120 120 208 99.0 9e-53 MREKNDDDICFPGDKIPYRPASRNIIVYKNPCLDKRGERKETDGTIFRRIVTGVPLFVMG YARMFPVNMKSFHEEKGHQCQQQHTCYDSPLSLSFFRTIH >gi|226332198|gb|ACIC01000122.1| GENE 31 34093 - 37341 1760 1082 aa, chain - ## HITS:1 COG:no KEGG:BT_4224 NR:ns ## KEGG: BT_4224 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 988 11 899 958 189 25.0 8e-46 MKAYRLKITCFILLLLPVLTCCTDEELLRGSGVPEGVPTTVDISVGTAANRAKTRSGLPE GEERKIYDLYLWVFDSRGGVEFSREYPRAELYQAAAALETATGEKDTDAPTSMGLLKNIP ITSGEKTLVLLANYKSDGDGLFQVEPGILAGVSSLSDIQRVKAEMTARTLFRPNGNLLMS VTKTEEISPDTRRLEVSLKRAEAKITVNLKTGEGLTFVPGTWRVGNVPASTFVTEHVKGS DAGAPWDAGGMEMDSYWLSELFQFEGDATTTTTFYLPENRKLAKKMITPASPGYNAERKG YDLRQKQVKSPVPDATEKPNVQNGATEYAPDLAPYIEFTGELRQDMTTGEVSTERFGRVT YRVYLGYTAENDPVNDYDIERNVHYTYNVTIKGMNDLVVEAMSDKDREEEPAPGVEGLVY DAHRSFSFDCHYEQGLMRFNKDELAIFNPDGSLKDDAMISFAIRTPFCDKIVSYTRAELE 
QLEDNGYVPSGEKKADTGWLAFFIHPSTLSDNGNETMEYYSNTGVHLLSLEQFLYRLVKE PDYVFNASTGLCKVTVYANEYFYERNPMQDGAPEDKNLWKTFANAPDRTFDLLVNTSHEI SPDGQSRYHQAVVSIRQMSIKTVFTEGPEGMRVWGVENVNETPDLDFMVKAPGEPDAFYN KYTSNGWSNTWKVMARSGDFSQEVMIDPMRRSKQDQLWQVMTKVDNAKLTLSRDHYNLAA KNYHAEYACFTPFLRNRDDNRDGRMQAAEMKWYIPSVSESGMLIVGERALPLHSRLHGHA PSDNATAFYMSRAFMASTNLSLYSSNPYICILESRSVTPIANFDYYKGGNSPGDRTPYSE VRLIRDLGMLESGEESYHIEELGRELNKSLVNNRTKEEYLVFRADNLPYSMTRGVRAIYE LPAHDEDSQSNTIYKNGFEVAKYIANKIDKPKDLNNPDRKSEYYLQDWSTLMADIEQGNS PCAYYYQEADKSDLGTWRVPNEVELIIMCGSLFDWHNPEKKEVYEFGNDLPSLGVVESEV LHSRTGFSKRSMNGARSFSAGYQMYLHGYLRWVTTVDTQWAGDNDRLVNNRKGYVRCVRD LP >gi|226332198|gb|ACIC01000122.1| GENE 32 37378 - 38781 698 467 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570774|ref|ZP_04848182.1| ## NR: gi|253570774|ref|ZP_04848182.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 467 1 467 467 933 100.0 0 MLKDTKYTRTVRALRNFVMTVLPLFFLALTGCRNEEDIPERLSGEDDVTVRLNLSTRGVT AGGSEMLDLYSSDPANWYKTGTLFPQAPAQPRIALMVFNKEQNRYVYSRLLPFSSGNAEG EYQTKIRIPKGESEFYAFYAPNQTGKDLPYYTPGGILQNAVPWDFLGEGMEEVDREDIVE AAFPAIIREQDDGSLGIPVDNPYDGVTPSPSDEKIIDWTTTSNISWDDAPGISNRLHMGM LSGKLSTAVQPASGEQVQDITIPLFRDFSRVRIYIASIAPFEQIIYNYKRIAFLNFPVLM SPSFRENDSDGGQLAGSTPGTGKNTEHTGTYSYGIKNEAQLTIYEAPLNAGGEIDYTKMD AQKYEQFFLPQYLAPYIPEANSWQEGHPHPKIQLTVEYYTGGNISQTRTRTFLLDIGEES SLGVYSGPIYPNRDYKVFMVLPEAADREIIYRVESWERKDVEFPPFQ >gi|226332198|gb|ACIC01000122.1| GENE 33 38790 - 39704 538 304 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570775|ref|ZP_04848183.1| ## NR: gi|253570775|ref|ZP_04848183.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 304 1 304 304 622 100.0 1e-176 MNIRNIPTWLGLLAALYLVTACDRVYDDLSGCLGNTIVFSYLADDNREHLQEYVDDIDVF IFDVGNGALVEKHHLEGEQLLSPLEVRLPEGDYKVVAVGNALNETVIGDGAYGEASVSRP ELAEKDGPAAGTFDRLYLGETTITSKVMNDSRDVVRLYSQHVKIHAVVLPDDDENAGTWY EQNKGTGFRLTMETLSARFSFSGQRSGETSFDLPFYTGGKNDRFVLDFNTLRFKDGDPLT IRLMQGDKVLCTVDVAEYIALYPDQIRITGRQEAVLPLFFRQNPRSLTVSVKPWEAVDVV PITD >gi|226332198|gb|ACIC01000122.1| GENE 34 39899 - 41332 1348 477 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570776|ref|ZP_04848184.1| ## NR: gi|253570776|ref|ZP_04848184.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 477 1 477 477 868 100.0 0 MKVMKYFAVASLALFAACNNEENPVAPESGNQPVDVVMSVALPGPGTRAAMGELQNTAND QDAASVEKLHVYLVAGEDIVLSKEFTKDTDDFTKLTSATAVGAEATTGGYKFLDVDNGVT KALVIANPQGALVDGEGKTVAEISKQQLKAQINEVIYAGSETLTTVGAEPYGVNPQEDKR TVKKAELTLTASMNRFQVTGTKFVKVIWKDGKKTEAEQWCTTWLAKDENKGKSSAEAWAA FKADAAGFNGQTWDGKSDIAQQTSLSQWLQVVEVTTANQGILMNRFSRTLSVPQLTVEEE ANWFWAKTYAGDRYKFEDGTFKPDGSTDLSEVASYFNAAGFDFAQGAKAAAFNFFVNGIT GYGKENNAPKLAFVFKTGDDGVSADRRFVVISGYAKAEGETAGTTDLEAAKGGYLINLDL AALNGGQGILVDVDPSIPEGVTPEPGGQGDLEDENVNVIVRVEVQPWTAVNVFPILD >gi|226332198|gb|ACIC01000122.1| GENE 35 41388 - 42890 871 500 aa, chain - ## HITS:1 COG:no KEGG:PG2167 NR:ns ## KEGG: PG2167 # Name: not_defined # Def: immunoreactive 53 kDa antigen PG123 # Organism: P.gingivalis # Pathway: not_defined # 66 481 54 464 479 199 32.0 3e-49 MKPYHINKKQILIMGCLGMFPLFSSAQDILSTSGTSRWDYSNSRVEREPGRDTLDITFSV FPLAGLGSQEVAYLFPVYVSVDGRDSVRLEPVCVAGKRRYKVIKRRKALGNLKPGNPGSG EVRSAKVLESSGLTVKRSVPFERWMADGRLVVREVSYGCAECGTNESEDIAFQAGIPLFG EKDYAYSFIEPEKVMIKCYKDSFDCKVVFPVARHDLQEDFAGNAQELDRLKKFLSENLNI QGTSLKEVDIKGYASPEGSFDYNRSLAQRRTQILSDYISRQYPALKNAPVYRTEGIGEDW EGLKSMVSGSTLVNRDEILFIIGHNERDTEREEAIRALDDGRSYRLLLEEFYPGLRRTTF SLSFDVRPYTVEELPEIFETKPECLSQHEMYQLAGLYASRGMSPLPVYERAYGQFPDDAV AILNYANALLKYGQDADGALRVLEPVTNDSRAFFPMAVAYNMKGDWRKAEELIKKAGMAQ GKQKLSGEPGDGGNNQQIEM >gi|226332198|gb|ACIC01000122.1| GENE 36 42946 - 43557 483 203 aa, chain - ## HITS:1 COG:no KEGG:BF2182 NR:ns ## KEGG: 
BF2182 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 18 203 14 188 188 192 48.0 6e-48 MNNSMMKPFFLLVFLLCSAGAFSQKVALKNNLAYDALKTPNLSLEFSMGRKWTLDTQVGM NFFFYTRDATSSRYKAKKFSHWLVQPELRYWTCDVFNGWFFGLHAHGGQMNIGGVDVPFV LQKGDGNMKDHRYEGYFWGGGLSAGYQWVLSNRFNIEASLGIGYVHARYDKYKCTTCGQK LGKGDADYIGPTRAAISIIYMLK >gi|226332198|gb|ACIC01000122.1| GENE 37 43570 - 43854 127 94 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|301165273|emb|CBW24844.1| ## NR: gi|301165273|emb|CBW24844.1| hypothetical protein [Bacteroides fragilis 638R] # 1 94 42 135 135 193 98.0 3e-48 MRMSHEIFNMEWSNRDSIYLHTRDGVLYSAFAGSAERLLSLVGGLRPCRSFWCGQEFPAV HSIRARKILEFFPGSCIFINDYYIRITLACPLIK >gi|226332198|gb|ACIC01000122.1| GENE 38 44386 - 45495 615 369 aa, chain - ## HITS:1 COG:no KEGG:BT_0727 NR:ns ## KEGG: BT_0727 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 369 1 368 393 401 52.0 1e-110 MKAPRKIYTWTSILLFVCCSLIFLSCEKEELGEAMANRKTLFMFLPWSTDLTGYVYTNIA DMEACVSRRGLEHERILVFMSTSSTEATMFEIIHPKGKCDRKTLKRYGTSGFTTVEGITG ILNDVQEFAPAPVYAMTIGSHGMGWFPVDGTQAHSLFRMKKHWEYQEQPLTRYFGGLTRE FQTNVGTLARGIVGAGVKMEYILFDDCYMSSVKAAYELREATRFLIASASEMMAYGMLYA TVGEFLLGNPDYGSLCEGFHDFYSTYEMMPCGTLAVTDCSELDNMAAIMKSINDEYVFDD SLQGGPQGVDGYTPVIFYDFADYILTLCSDPVLTARFREQLERLVPYKPHTSKFYSRAKG TFPYVLFRG >gi|226332198|gb|ACIC01000122.1| GENE 39 45613 - 45804 102 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570780|ref|ZP_04848188.1| ## NR: gi|253570780|ref|ZP_04848188.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 63 1 63 63 112 100.0 1e-23 MKEYCVYWFENGESRHEVFSCLDGAEMFSCMIRGQDGVEHVEISEEDISAPEEFQEICPG DFS >gi|226332198|gb|ACIC01000122.1| GENE 40 45801 - 46028 224 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570781|ref|ZP_04848189.1| ## NR: gi|253570781|ref|ZP_04848189.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 75 1 75 75 147 100.0 2e-34 MKDYYFIMNAGVKAGGEITHAVLEGKIVSAPKGYDAFTGIEAAREKLACGNIRQQMEEFG IELEIVPVNTDFLLR >gi|226332198|gb|ACIC01000122.1| GENE 41 46154 - 46846 399 230 aa, chain - ## HITS:1 COG:no KEGG:BDI_3897 NR:ns ## KEGG: BDI_3897 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 20 229 22 231 231 246 65.0 5e-64 MRVFNLLLLISMFIPFPLPAQVGERYIEVTGTSEIEVVPDRIHYVIEIREYFEEEFDGVS KPEEYRTKVPLTRIEEELKLVLKIVGVPREAIRTQDVGDNWRKPGQDFLVSKSFDVTLRD FTLIDEILKRVDTKGIHTMYIDKLEHRDILSYHRKGKIEALKAAREKAVYLLEAIGKRPG EIIRIVEGGDAGKEMFAQGRILSVAPPPFERSRTIKKRYSMLVRFGIVDR >gi|226332198|gb|ACIC01000122.1| GENE 42 47023 - 47310 75 95 aa, chain - ## HITS:1 COG:SA1305 KEGG:ns NR:ns ## COG: SA1305 COG0776 # Protein_GI_number: 15927054 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Staphylococcus aureus N315 # 1 85 1 85 90 60 34.0 9e-10 MTKADIINRVSEELGIDRRTVGLVIESFMKCVKDALGRERSVFLRGFGTFSLKKRAAKKA QNIQQHTTICIPARKIPHFKPSESFLVLRKEDNRK >gi|226332198|gb|ACIC01000122.1| GENE 43 47436 - 47858 386 140 aa, chain - ## HITS:1 COG:VCA0612 KEGG:ns NR:ns ## COG: VCA0612 COG1970 # Protein_GI_number: 15601370 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Large-conductance mechanosensitive channel # Organism: Vibrio cholerae # 5 140 2 133 136 150 59.0 6e-37 MSKSSFLLDFKAFVMRGNVVDMAVGVIIGGAFGKIISSVVADIIMPPIGLLVGGTNFSEL RWELEPARVVDGVEQAAVTINYGNFIQTMLDFVIIAFAIFSFIRLLSNLRRKKEETPLPP PVPSNEEKLLSEIRDLLKKQ >gi|226332198|gb|ACIC01000122.1| GENE 44 47941 - 48633 398 230 aa, chain - ## HITS:1 COG:mll6943 KEGG:ns NR:ns ## COG: mll6943 COG0580 # Protein_GI_number: 13475778 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) # Organism: Mesorhizobium loti # 1 221 1 219 225 156 50.0 4e-38 MNRYVSEMIGTMVLVLMGCGSAVFAGDMPGAVTTGVGTLGVAIAFGLSVVAMAYAIGGIS 
GCHINPAITLGMYCSGGMGGKDALLYIIFQIIGGILGSAVLFILVSTGPHAGPTMTGSNG FVEGEMLQAFIAEAVFTFIFVLVALGATDKKKGAGKLAGLVIGLTLVLVHIVCIPITGTS VNPARSIGPALFEGGGAISQLWLFIVAPLTGGLASAIVWKAISQHSDRQR >gi|226332198|gb|ACIC01000122.1| GENE 45 48719 - 49882 917 387 aa, chain - ## HITS:1 COG:no KEGG:BT_1391 NR:ns ## KEGG: BT_1391 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 378 1 373 375 400 56.0 1e-110 MMKKIFICLFLSGVFMYHTNAQTAVEGNKFLDNWSIGISAGGTTPLTHHSFFGNMRPITG IELNKQLTPVFGFGLEAVGSFNTSHSRTIFDRSNVSLLGLVNLNNLLGTYTGVPRPFEIE AVAGIGWLHYYMNRETGSDQNSMSTKLGLNFNFNLGESKAWTLALKPALVYDMNAMGSEA VRFHSGRAVWEISVGLKYHFGCSNGKHHFTKVRAYDQQEVDVLNAKINELHTQAGKDAEA LQEAVRKVTELEAALDKCRNQEPKIVKDTIDNTKKTLESVITFRQGRTTVDNSQLPNVER IATYLKNHKGASVLIKGYASPEGSVEVNERIARQRAEAVKKMLVGKYGIAEERIVAEGQG VGNMFEEPDWNRVSICTINAGTKSSSR >gi|226332198|gb|ACIC01000122.1| GENE 46 50866 - 51723 368 285 aa, chain - ## HITS:1 COG:STM1355 KEGG:ns NR:ns ## COG: STM1355 COG2207 # Protein_GI_number: 16764706 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Salmonella typhimurium LT2 # 187 282 191 286 296 65 31.0 1e-10 MKLFCLNEHTSCYNYARCVQEGFRYYTFDKGLEHEEELQNDCILFVLKGSLRFSYNEFQF TISAGKMVFVCRESLFSTYSLEKCEVVVALFEGGVWPCQKISFSELSYLRDIIEYRMEPL EIKDRLYRFLELLECYLKDGANCIHFHEIKIKELFWNMRFYYSKRELASFFYMVIGRSQG FRNMVLNNYKKCKTVKELASVCGISISSFKRQFAAEFGETPTGWMQKQLVREIKYKLSIT DLPLGSIVYELNFSSLAHFSRFCKRCLGCSPKEWREQMKNNLKTF >gi|226332198|gb|ACIC01000122.1| GENE 47 51990 - 52511 321 173 aa, chain - ## HITS:1 COG:L69383 KEGG:ns NR:ns ## COG: L69383 COG1475 # Protein_GI_number: 15673430 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Lactococcus lactis # 4 170 5 171 180 228 63.0 3e-60 MSSFKSPAYNVKAVPVEKIVANSYNPNVVAPPEMKLLELSIWEDGYTMPLVCYYREEEDI YELVDGYHRYLVMKTSVRIYKRENGLLPVTVINKDISNRMASTIRHNRARGMHSLELMTG IVAELSKSGMSDSWIMRNIGMDKNELLRFKQISGLAELFRDRSFGLSDDWLEE >gi|226332198|gb|ACIC01000122.1| GENE 48 52517 - 53794 722 425 
aa, chain - ## HITS:1 COG:lin1347 KEGG:ns NR:ns ## COG: lin1347 COG3969 # Protein_GI_number: 16800415 # Func_class: R General function prediction only # Function: Predicted phosphoadenosine phosphosulfate sulfotransferase # Organism: Listeria innocua # 1 423 1 430 434 399 45.0 1e-111 MRRMDVHEAANRRLKIIFDYFDYVYVSFSGGKDSGILLHLCMDYIRMHAPGRKLGVFHMD YEVQYRQSTEYVERMFSNNRDILEVFHCCVPFKVPTCTSMYQQYWRPWQEGYQNIWVRQM PGTALTVKDFDFWNDSLWDYDFQSLFPSWIRRKKGCKRVCCLVGIRTQESFNRWRAIHSD KNYRKLANYKWTHRVGYYTYNAYPIYDWKTTDVWTGYARYGWDYNRLYDLYYQAGIPLSR QRVASPFISQAVSTLHLYKVIDPDTWGRMVSRVNGVSFAGMYGNTVAMGWRSISCPDGFT WKEYMYFLLDTLPRATRENYLEKLRVSQKFWREKGGCLGEETIGKLRAAGVPFTVEECTT YRTDKRPVRMEYIDEIDIPEFREIPTYKRMCVCILKNDHTCKYMGFTQTKREREMKERVL KRYKL >gi|226332198|gb|ACIC01000122.1| GENE 49 53794 - 54198 171 134 aa, chain - ## HITS:1 COG:no KEGG:BVU_3704 NR:ns ## KEGG: BVU_3704 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 134 1 134 134 279 100.0 2e-74 MNILKLQGLDGRLFDLVAPLVMNPAVLRQNNNYPFKTTRNHVWYIAMDEQRVLGFMPVKM TLTNNCIDNYYISGDNSSVIEVLLDRIIHDFSSDGSLVAVVHERHVEDFSMKNFIPCVEW KKYVKMRYHEGGGA >gi|226332198|gb|ACIC01000122.1| GENE 50 54611 - 56215 1280 534 aa, chain - ## HITS:1 COG:no KEGG:BVU_3705 NR:ns ## KEGG: BVU_3705 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 534 1 534 534 1072 99.0 0 MKIKIIAGTMVALSFSLMSCSDFLTEDPRGKLTPENFFSTQDELNMSIYALYQKVNLSQV YTNMQLPQWQGDDITTNPGSNKQPAAEMDKFAAANNNKGVKDAWNMHYAIVKAANLIIQG ASKTPTTQDEINIALGQAKFWRAYAYFTLVRLWGPLPMNLDNVNDDYTKPLSPVEEVYGH IVQDLTEAEAVLPTGYSGSPRFLNGVNVYVTRQAAKSTLAAVYMAMAGWPMNKTKYYAKA AEKAKEVIEGVNRGEYEYKLDKEYKDVYAMSNNYNNETVLGINYSPFVDWAQDSELTSCN QFESLGGWGDAWGEIRFWKEFPDGPRKDATYDPKIRLKDGTLVDWWELKEDGTPVVPEHH PMFSIFSVNWDPASKVNTSAPYDYTKPASQNMCNDHRHRIIRYSEVLLWYAEAKARTGQT DELAFKCLNDVRERAGLEPLTGLSADDLAEAAYKEHGWEVAGYWVALVTRRADQFRMNRL KDTFKERAENTAVEVADGILVKESVEYTNRTWSDNLMYLPYPDMDSQKNPNLVR >gi|226332198|gb|ACIC01000122.1| GENE 51 56235 - 59231 2678 998 aa, chain - ## HITS:1 
COG:no KEGG:BF4254 NR:ns ## KEGG: BF4254 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 998 1 998 998 1903 99.0 0 MNTLQKTAGAILSLSLLCGMQVYAFPDYPSLRGQIIEQSDICQGVVKDANGESIIGASVL VKGTTNGSITGLDGDFSLRNVKKGDIIVVSYVGYQSQEIAWTGEPLNIVLKEDAEVLDEV VVIGYGAVRKADMAGSVAVLDNKNFKDQPITQVADALQGRVSGVHVENSGVPGGSVKIRI RGANSISKSNDPLYVVDGIVRESGLDGINPEDIRSMQVLKDASSTAIYGSRGSNGVVLIT TKIGKAGVREIMFDASVGVSNVYKRYDILGAYDYALALKEVKGIDFSNEEMQSYQNGTGG IDWQDEIFRTGITQNYKLALSNGSEKTQYYISANYMSQEGVVIESKNERYQAKANLSSQL TDWLHITADINASHGVRRGGSFASGKDNPIWIALNYSPTMTMMAENGNYNTDTYNSIASN PVGILKLQSGETMTNVFNGRVDLRLDIMKGLTFTTTNGVDYYDGKSYSFSSKRVGTKSGM GNNDTYRLMLQSSNNLTYTGSWNDHHLTATAVYEVTSSETRTMGITGNNLLTEGVGWWNV GMASSRDANNGYEQWALMSGVARVMYNFKDRYMLTGTFRADGSSRFAKKKWGYFPSIAAA WTLSNEDFMKDVSSVQDIKLRASYGIVGSQAISPYATMGLMSATAYNFGTNSNFTGYWAN DIATPELTWEKTKQFDLGLEFSLFDRRLNFSVDYFYKRTTDALLKRSIPGYVGGNSFWVN DGEISNRGIDLSVTARIMQNDRFQWTSTLNGTYLKNRVERLSGGENDFINGSSPAAGMVD YATIIKPGEAIGTFWGYEWTGLDENGHDTYTDVDGNQMIDGGDRKVIGKANPDFTLGWNN SLSYKNWDLNLFFNGSFGAKRLNLVRYTMASAEGNSRFVTLADAYLKGFDKIGSSATYPS LTEGGNNLQPVSTKWLENADFLRLENISLSYTFPKKTTGFADLRLTFSCQNLFTITGYKG MDPAGTTFSNSSVDVDAGIDMGAYPSPRTFTFGLRMNF >gi|226332198|gb|ACIC01000122.1| GENE 52 59319 - 59777 218 152 aa, chain - ## HITS:1 COG:no KEGG:BVU_3708 NR:ns ## KEGG: BVU_3708 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 9 152 7 154 154 233 85.0 2e-60 MKVYRQGALLSIICSFVGLILIGCAEEKQGVETGIHKIVIQQSGDTDSFEVSVSIGGADK GGPAKLYNDKGEYIGDSYSAQIRTATMSCRTNGNAFFMTCAGSVSSISEAGKRLHITVIG YIDDKEVNRLEKEYITDGNTLIETFSVSTKEI >gi|226332198|gb|ACIC01000122.1| GENE 53 59902 - 61380 799 492 aa, chain - ## HITS:1 COG:NMA0050 KEGG:ns NR:ns ## COG: NMA0050 COG0753 # Protein_GI_number: 15793081 # Func_class: P Inorganic ion transport and metabolism # Function: Catalase # Organism: Neisseria meningitidis Z2491 # 7 487 8 488 504 708 70.0 0 MDKNNKIHKLTAANGRPIADNQNSQTVGPRGPIVLQDPWFLEKLAHFDREVIPERRMHAK 
GSGAYGVFTVTHDITQYTRAAIFSEIGKQTECFVRFSTVAGERGAADAERDIRGFAIKFY TEEGNWDLVGNNTPVFFLRDPLKFPDLNHAVKRDPRTNMRSANNNWDFWTLLPEALHQIT ITMSSRGIPYSYRHMNGYGSHTYSFINAKNERIWVKFHLKTLQGIKCLTDQEAEAIIAKD RESHQRDLYESIERRDYPRWKFQIQLMTEREAESYRVNPFDLTKVWPHKDFPLQDVGILE LNRNPENYFAEVEQSAFNPQNIVEGIGFSPDKMLQGRLFSYGDAQRYRLGVNAEQIPVNR PRCPFHAFHRDGAMRVDGNYGSAKGYEPNSYGEWKDSPEKKEPPLKGYGNIYNYDEREYD ADYYSQPGDLFRLMPVEEKQLLFENTARAMGDAELFIKYRHVRNCYRADPDYGTGVAAAL GIDLHTALVSTE >gi|226332198|gb|ACIC01000122.1| GENE 54 61658 - 63493 1049 611 aa, chain - ## HITS:1 COG:XF0840 KEGG:ns NR:ns ## COG: XF0840 COG1874 # Protein_GI_number: 15837442 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Xylella fastidiosa 9a5c # 11 589 28 588 612 387 37.0 1e-107 MNSMSQEKIPFEIRDGHFYRYGEEIPILSGEMHYARIPHQYWRHRLQMMKGMGLNTVATY VFWNLHEVEPGKWDFSGDKNLAEYIRIAGEEGMMVILRPGPYVCAEWEFGGYPWWLQNIP GMEIRRDNTEFLKYTKKYIDRLYEEVGDLQCTKGGPIIMVQCENEFGSYVSQRKDIPLEE HRSYNAKIKGQLADAGFTIPLFTSDGSWLFEGGCVAGALPTANGESDIANLKKVVNQYHG DKGPYMVAEFYSGWLSHWGEPFPQVSASEIARQTEAYLQNDVSFNFYMVHGGTNFGFTSG ANYDKKRDIQPDLTSYDYDAPISEAGWLTPKYDSIRSVIQKYVKYPIPAPPAPIPVIEIS SIKLERVVDALLLAQSIQPVNASTPLTFEQLNQGYGYVLYTRHFNQPISGILEIPGLRDY AVVYVDGEKIGVLNRNTRTYSMEIDIPFNATLQILVENMGRINYGSEIVYNTKGIISPVT VAGKEITGGWNMYRLPMDKCPVLTEFGDNVYRNTPLQAVQFKDRPVIYEGEFTLDQPGDT FIDMRAWGKGIIFINGKHIGRYWKVGPQQTLYIPGVWLRKGKNKIVIFEQLNEIPQQNVN TVREPVLMNLK >gi|226332198|gb|ACIC01000122.1| GENE 55 63620 - 64480 529 286 aa, chain - ## HITS:1 COG:TM0024 KEGG:ns NR:ns ## COG: TM0024 COG2273 # Protein_GI_number: 15642799 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucanase/Beta-glucan synthetase # Organism: Thermotoga maritima # 15 272 189 455 642 80 29.0 3e-15 MRRILYCFLFIAGGGIIHASAREERVRTEDRIKSDSVLFVDSFDGSSVVPDTAVWKLCTY ANNAWSQYFRGVNGYENVKVEDGYLKLRACKDNGTYKNGGVFSKIGFPCGTRLEVKAKLT KLVRGGFPAIWQMPMGAPEWPRGGEIDLMEWVQGTPLQIYQTVHTYYINGANGSAGVTNK NPDKNFDVTKDHVYAVERTEKELIFYVDGKETWRYGNQYLDEGKLQYPFCEYPFNIILNF 
SLGGELNGRMTWPGEICDEDLPGEMWVDWVRVVSLNNENQSDTGDK >gi|226332198|gb|ACIC01000122.1| GENE 56 64511 - 66787 1022 758 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 8 586 29 597 757 358 35.0 2e-98 MKSYNAGINIIPTPRFLEIREGSFKLYDGVSIGTTTPEAKKIATCFASKISQATGYDVKV DDKGEITLLLDSSATFPPEGYTLNVESSRVRITASSHQGLFYGMQSFLQLLPAEIESAGI VENIGWKAPAVNIIDSPRFAYRGIHMDPCRHFMTVEEVKKQIDVLSMFKINTIHWHLTDD QGWRIEIKKYPRLAEVGGRRIEGEGVEYGPFYYTQEEIKDIVSYAAEHFITIIPELEIPG HELAAISAYPELSCKGDSVTPRNIWGVEDIVMCPGKESVFTFLENVIDEMVALFPGTYFH IGGDECPKESWKSCSLCQKRILEEGIKPDKKHTSEQLLHTYVVKRIGKYLARYNKKIIGW DEILEGKPDSTATIMSWRGDAGGISAALSGHDVIMSPGPNGLYLDYYQGDSKVEPVAIGG CSTLEKVYNYNPVPDTLVLIGKEHHIIGVQANNWSEYFYNNNILEYHMYPRSLALAEIAW SDIERKNYMDFERRIENACVRLDGHDINYHIPLPEQPGGSCDHIAFTDSVSLEFTTSRPI SLVYTLDGTEPGISSEIYTSPLSFYQNGILKIASLLPSGKLSKVRTILIEKQALLPAVGV EGSFHGLNMEVTYGMFLNMQEFMKTGKSVDVSTDIRETGELTSFVPSTNSMRGVRQYAAV ATGFIDIPENGVYFFSSDLEEVWVGGKLLVNNGGEVKRHSRNDRSIALEKGMHELKLVFL GHIIGGWPSNWNDGAVMLRKSDEKKFRKITSDMLWRKK Prediction of potential genes in microbial genomes Time: Thu May 12 02:30:33 2011 Seq name: gi|226332197|gb|ACIC01000123.1| Bacteroides sp. 1_1_6 cont1.123, whole genome shotgun sequence Length of sequence - 5075 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 341 - 1174 392 ## COG2273 Beta-glucanase/Beta-glucan synthetase - Prom 1199 - 1258 3.6 - Term 1200 - 1254 9.7 2 2 Op 1 . - CDS 1262 - 2851 1116 ## BF4248 hypothetical protein 3 2 Op 2 . 
- CDS 2893 - 5073 1498 ## BF4247 hypothetical protein Predicted protein(s) >gi|226332197|gb|ACIC01000123.1| GENE 1 341 - 1174 392 277 aa, chain - ## HITS:1 COG:CC0380 KEGG:ns NR:ns ## COG: CC0380 COG2273 # Protein_GI_number: 16124635 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucanase/Beta-glucan synthetase # Organism: Caulobacter vibrioides # 44 275 24 296 301 112 33.0 9e-25 MKMKYFTLLFFCSLFVGCSDETSTSSVNNQNQQARDTIDNGDDNSWVLIFADEFNIDGAV DSKKWNYTPRGDVAHAYYYKPNDTSVVWCEDGLLNLKFMKDETDPRGYKSGSIRTDGGKF DFTYGKVEVKAKFSSGDGSWPAIWMMPASSKYGGWPASGEIDIMEHIKDLDVAVQTVHTT GTEQKTYPFGHSGSKFKQGEWNIWGIEWNNKKIDFMVNGEVCATYYNKGLGAVQWPFDIP FFLILNVAGGKGMAGPINEKHLPFTMQVDYVRVYQWK >gi|226332197|gb|ACIC01000123.1| GENE 2 1262 - 2851 1116 529 aa, chain - ## HITS:1 COG:no KEGG:BF4248 NR:ns ## KEGG: BF4248 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 529 10 538 538 961 89.0 0 MVLNIAAIVAASCGDMLDLAPIDNYGSTSYWKTEAQVSAYIDGLHKQLRDKAGQHVITFG ELRGGHYKDGASADGLAISSGEIRLQNLSKETPGVSKFGDIYGCITNINLFIARVTDANF IPEAKKNYYLGQAYGLRAFYYFDLYRVYGGVPIRLGVEVIDGVLDPVKLYLQRSQPSEVM SQIKKDLETSLQYFGEQSGFNPYGHGNKVYWSKAATECLAGEVYLWNSKVTIGDNKADES DLSKAKQFLKNVESNYGLQLQQDFKRIFSTDNEGNSEVIMAVSYMEGEVENSLPRGYTYS LVSGTTNKDSYRADGTPWNDALDVQNNGQQYYEYKYALYEKFEENDTRRDATFMPSYRKK ESGELYIYGTHVCKNLGSLNAQGNRVYDGDFILYRLSWVYLALAEIANMESDNVNVERYI NLVRNRAYKSEAGSHIYKASDFLTNELAILHEKDKEFVQEGQRWWDLCRMKNAKDGIPLV FCIEGDIENKAAILDQETEAYKVLWPLDQNILDNDSALKQTPGYEKQEE >gi|226332197|gb|ACIC01000123.1| GENE 3 2893 - 5073 1498 726 aa, chain - ## HITS:1 COG:no KEGG:BF4247 NR:ns ## KEGG: BF4247 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 726 355 1081 1081 1305 90.0 0 KGTYYASLGYNDSPGAPITTEYKRYNFTFNGSYKITDWLKTISNATFSRANYVWLNNAWG TESDYFGRVMSTPPTVRFEDEDGNPTLGPSPGDGNQTYVAKSYDQDNERTKLTFVQSVEA NLFKGLVLRASASWYYFHTVQENMRKDYETVPGTWNRTRSTSASYSRQFSQTYNAVLDYT TTFAASHNLNVMLGSEFYDQYNNGFSASGSGAATDDFKKLALTDAGEGKRSIDSSHSRYR 
ILSYFGRLNYDYEGKYLFSGVFRYDGYSSLLGKNRWGFFPGVSGGWIFTKEDFVKESLPF LYYGKLRASYGLNGNASGIGPYTLQGSYNTNKYHGNNGFLIGTLPNPSLRWEKTASFDIG LNVGLLDGRLNLDIDYYNRLTSDKYADFTLPSTTGFSSVKNNNGKLRNSGFEMQVSGKVL ETKDFTWDASLNITYNKNKIISLPYNKLPRNRQGGQEVYTGRKVLNEKGILEDEKVYVGG FQEGKEYGVLVGYQAEGIYQSVDDIPGNLVVTSGNYQGKSQYGPDAWNKLSDNERAKGIE LKPGDVKWKDVNGDGMIDQYDQVVIGNTTPRWFGGFNTTMSWKGLSLYARFDFALDYWVY DDKTPWFLGCMQGAYNTTKDVFNTWSESNPNAKYPRYVFADQLGAANYYRTSTLFASKGN YLGIRELSLSYTLPQELSKKLYMQKLQFSITGQNLGYLTSARTVSPEISSGYPLPRTVIF GVNVTF Prediction of potential genes in microbial genomes Time: Thu May 12 02:31:08 2011 Seq name: gi|226332196|gb|ACIC01000124.1| Bacteroides sp. 1_1_6 cont1.124, whole genome shotgun sequence Length of sequence - 78698 bp Number of predicted genes - 59, with homology - 58 Number of transcription units - 28, operones - 18 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 1079 564 ## BF4247 hypothetical protein - Prom 1144 - 1203 7.1 2 2 Op 1 . - CDS 1329 - 2495 124 ## COG3055 Uncharacterized protein conserved in bacteria 3 2 Op 2 . - CDS 2473 - 3237 323 ## COG1402 Uncharacterized protein, putative amidase 4 2 Op 3 1/0.250 - CDS 3265 - 4503 725 ## COG0477 Permeases of the major facilitator superfamily 5 2 Op 4 . - CDS 4513 - 5721 718 ## COG2942 N-acyl-D-glucosamine 2-epimerase 6 2 Op 5 . - CDS 5762 - 6679 582 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase - Prom 6702 - 6761 8.8 7 3 Op 1 . - CDS 6769 - 8841 1148 ## BT_0457 sialic acid-specific 9-O-acetylesterase 8 3 Op 2 . - CDS 8842 - 9507 512 ## COG2755 Lysophospholipase L1 and related esterases - Prom 9562 - 9621 6.6 9 4 Op 1 . - CDS 9634 - 11595 1333 ## COG3525 N-acetyl-beta-hexosaminidase - Prom 11615 - 11674 2.9 10 4 Op 2 . - CDS 11676 - 13310 1058 ## COG4409 Neuraminidase (sialidase) - Prom 13367 - 13426 10.8 + Prom 13253 - 13312 11.0 11 5 Op 1 . + CDS 13507 - 17097 968 ## COG0642 Signal transduction histidine kinase + Prom 17161 - 17220 5.0 12 5 Op 2 . 
+ CDS 17240 - 17446 181 ## BF4234 two-component system sensor histidine kinase/response regulator hybrid + Prom 18497 - 18556 2.6 13 6 Op 1 13/0.000 + CDS 18587 - 20986 1070 ## COG0642 Signal transduction histidine kinase 14 6 Op 2 . + CDS 20928 - 22244 674 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 15 7 Tu 1 . + CDS 23012 - 23641 182 ## gi|253570815|ref|ZP_04848223.1| conserved hypothetical protein + Prom 23717 - 23776 5.0 16 8 Op 1 . + CDS 23975 - 24223 261 ## BT_0100 hypothetical protein + Term 24228 - 24264 -0.5 17 8 Op 2 . + CDS 24313 - 25440 839 ## BT_0101 hypothetical protein 18 8 Op 3 . + CDS 25477 - 25749 233 ## BT_2306 putative mobilization protein + Term 25754 - 25794 5.1 19 9 Op 1 . + CDS 25930 - 27489 1394 ## COG3505 Type IV secretory pathway, VirD4 components 20 9 Op 2 . + CDS 27526 - 27696 147 ## gi|253570820|ref|ZP_04848228.1| DNA adenine methylase 21 10 Tu 1 . + CDS 27754 - 28269 368 ## COG0338 Site-specific DNA methylase + Term 28354 - 28386 -1.0 22 11 Op 1 . - CDS 28444 - 28680 58 ## 23 11 Op 2 . - CDS 28554 - 29024 107 ## BDI_3503 DNA primase - Term 29187 - 29227 -0.2 24 12 Op 1 . - CDS 29298 - 29456 148 ## BVU_2465 DNA primase 25 12 Op 2 . - CDS 29476 - 29826 93 ## COG0739 Membrane proteins related to metalloendopeptidases - Prom 30063 - 30122 4.5 + Prom 30011 - 30070 3.4 26 13 Op 1 . + CDS 30135 - 31223 941 ## PGN_0581 hypothetical protein 27 13 Op 2 . + CDS 31228 - 32658 532 ## gi|253570827|ref|ZP_04848235.1| conserved hypothetical protein 28 13 Op 3 . + CDS 32664 - 33476 391 ## gi|253570828|ref|ZP_04848236.1| conserved hypothetical protein 29 13 Op 4 . + CDS 33489 - 34448 462 ## gi|253570829|ref|ZP_04848237.1| conserved hypothetical protein 30 13 Op 5 . + CDS 34466 - 35566 469 ## BF3847 hypothetical protein 31 13 Op 6 . + CDS 35568 - 36779 558 ## BF3847 hypothetical protein 32 13 Op 7 . 
+ CDS 36793 - 36984 246 ## gi|256839867|ref|ZP_05545376.1| conserved hypothetical protein 33 14 Op 1 . + CDS 37198 - 37677 224 ## gi|253570832|ref|ZP_04848240.1| conserved hypothetical protein 34 14 Op 2 . + CDS 37708 - 37821 116 ## gi|253570833|ref|ZP_04848241.1| DNA topoisomerase III 35 14 Op 3 . + CDS 37876 - 39780 1101 ## COG0550 Topoisomerase IA + Prom 39858 - 39917 6.6 36 15 Op 1 . + CDS 40106 - 40393 283 ## BF3841 hypothetical protein 37 15 Op 2 . + CDS 40397 - 40708 315 ## BF3840 hypothetical protein + Term 40722 - 40791 4.1 - Term 40721 - 40761 -0.8 38 16 Tu 1 . - CDS 40785 - 42014 839 ## COG0582 Integrase - Prom 42201 - 42260 6.2 - Term 42588 - 42622 4.2 39 17 Tu 1 . - CDS 42801 - 44963 2146 ## COG0550 Topoisomerase IA - Prom 45035 - 45094 5.5 - Term 45032 - 45083 12.2 40 18 Op 1 7/0.000 - CDS 45116 - 47263 2436 ## COG1884 Methylmalonyl-CoA mutase, N-terminal domain/subunit 41 18 Op 2 . - CDS 47265 - 49166 2077 ## COG1884 Methylmalonyl-CoA mutase, N-terminal domain/subunit - Prom 49214 - 49273 7.7 42 19 Tu 1 . + CDS 49485 - 51149 1916 ## COG2985 Predicted permease + Term 51173 - 51223 6.2 - Term 51160 - 51211 14.0 43 20 Tu 1 . - CDS 51242 - 51877 707 ## BT_2093 hypothetical protein - Prom 51930 - 51989 8.1 44 21 Op 1 . - CDS 52044 - 54098 1355 ## BT_2094 TonB-dependent receptor 45 21 Op 2 . - CDS 54174 - 55241 833 ## BT_2095 putative surface layer protein - Prom 55377 - 55436 3.6 46 22 Tu 1 . - CDS 55763 - 56740 583 ## BT_2096 putative transcriptional regulator - Prom 56837 - 56896 4.5 + Prom 56701 - 56760 4.6 47 23 Tu 1 . + CDS 56907 - 58970 1148 ## COG3533 Uncharacterized protein conserved in bacteria + Term 59104 - 59146 -0.7 + Prom 59020 - 59079 2.8 48 24 Op 1 33/0.000 + CDS 59163 - 60302 1043 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 49 24 Op 2 35/0.000 + CDS 60377 - 61315 985 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 50 24 Op 3 . 
+ CDS 61312 - 62067 213 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Prom 62078 - 62137 3.4 51 25 Op 1 1/0.250 + CDS 62191 - 63546 777 ## COG0534 Na+-driven multidrug efflux pump 52 25 Op 2 . + CDS 63560 - 64123 371 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Term 63957 - 63996 0.1 53 26 Tu 1 . - CDS 64150 - 64884 497 ## COG2188 Transcriptional regulators - Prom 64910 - 64969 8.2 + Prom 64859 - 64918 10.0 54 27 Op 1 . + CDS 65014 - 67707 1058 ## COG1940 Transcriptional regulator/sugar kinase 55 27 Op 2 . + CDS 67739 - 70882 1117 ## COG3537 Putative alpha-1,2-mannosidase 56 27 Op 3 . + CDS 70893 - 72053 362 ## COG0738 Fucose permease + Prom 72087 - 72146 11.9 57 28 Op 1 . + CDS 72263 - 75442 1762 ## BT_2107 hypothetical protein 58 28 Op 2 . + CDS 75450 - 77378 1109 ## BT_2108 hypothetical protein 59 28 Op 3 . + CDS 77402 - 78601 819 ## BT_2109 hypothetical protein + Term 78657 - 78691 -0.8 Predicted protein(s) >gi|226332196|gb|ACIC01000124.1| GENE 1 2 - 1079 564 359 aa, chain - ## HITS:1 COG:no KEGG:BF4247 NR:ns ## KEGG: BF4247 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 8 359 1 352 1081 628 91.0 1e-178 MVFTMKNIKKNFILFVFLLCSITAVAQKKNITGIITDAAGEAVIGASIVEVGTTNGTVSD IDGNFTLSVSDNASIQVSFIGFSSQILSVKGKTSFRIVLKEDSELLQEVVVTGYGGKVSR AKLTNSISTVNPQILDKGIYTNPSQALSGSVPGLKVSLTSGNPTSSPKIILRGGTEFDGS GSPLVIVDGQLRDNFDDINPQDIASMDVLKDAGATAIYGARASNGVILITTKSGKAGHRE INFRANIGLGYVNNPYDFLGAEDYITVLRTAYKNTPWAATANLTGSTAFGTGNKLGANMV WNIMGKTDENAYLLNMGWREMQDPLDPSKTIIYKETKPEEYNLRNPVVTQDYTVSMSGG >gi|226332196|gb|ACIC01000124.1| GENE 2 1329 - 2495 124 388 aa, chain - ## HITS:1 COG:FN1470 KEGG:ns NR:ns ## COG: FN1470 COG3055 # Protein_GI_number: 19704802 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 52 378 32 361 372 108 27.0 2e-23 
MNFINPFKKMWNSQKYSIIYMIVGLGLSTCIPPGDTFSRDVRIELMSGFPDGESGYDLGV SGCYAGKLGDFLIMAGGCNFPDKPLSEGGGKRYYKGIYVARVNDSSVLHWVKVGNLPVEA AYGATVSLPDRLIFIGGSNSSGRLSSVLSFSFDCMKELGVACEILPSLPCTFDNMSATLL GDTLYVLGGYRNGIPSCSMFSFCLSNYSGGWTEVFFPGKPRVQPVCASLFGNLYIWGGFI PGKESSVLTDGLCYQPASGQWQVLRAPLTTEGQPLTLTGGASVSYGDTLVICAGGVNRDI FEDAISGRYKKVCESEYLLQPVEWYHFNDQLLAYDVRLGEWIRIGTPSPMLARAGASLVL FGEALFYIGGELKPGVRTSDICRISFQE >gi|226332196|gb|ACIC01000124.1| GENE 3 2473 - 3237 323 254 aa, chain - ## HITS:1 COG:MK0183 KEGG:ns NR:ns ## COG: MK0183 COG1402 # Protein_GI_number: 20093623 # Func_class: R General function prediction only # Function: Uncharacterized protein, putative amidase # Organism: Methanopyrus kandleri AV19 # 21 245 16 219 224 92 30.0 8e-19 MNREIDLSFACYGYVKELVYDLAILPWGAIEPHNLHLPYLTDCILSHDIAVDAALKAWER YGVRCMVMPFVSMGSQNPGQRELPFCVHSRYETQKAILTDVVSSLYTQGMRRLVIVNGHG GNSFKNMIRDLSVDYPDFLIASSEWFRVLPVQDYFECPGDHADEVETSVMLHYHSEMVDM ELAGNGKYTPFAINSLRTNVAWIPRNWQKVSTDTGIGNPQSATANKGKRFAEAISERYAQ LFKELVRDELYQSI >gi|226332196|gb|ACIC01000124.1| GENE 4 3265 - 4503 725 412 aa, chain - ## HITS:1 COG:CC2486 KEGG:ns NR:ns ## COG: CC2486 COG0477 # Protein_GI_number: 16126725 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Caulobacter vibrioides # 7 372 40 428 519 104 26.0 3e-22 MKNNRIYPWILVGLLWGVALLNYMDRQMLSTMKEAMQVDISALESAENFGRLMAVFLWIY GFMSPVAGMIADRLNRKWLIVGSLFVWSAVTYGMGYAETFTQLYWLRAVMGISEALYIPA GLSLIADWHQGKSRSLAVGIHMTGLYIGQAIGGFGATVSAAYSWHATFHGFGIIGIIYAL VLILFLRENKGTGTAEQVCRKGVKTSSPSIFKGMSLLFSNIAFWVILFYFAVPSLPGWAT KNWLPTLFSDSLDMPMAEAGPLSTITIALSSFLGVIAGGILSDRWVLKNIRGRVYTGAIG LGLTIPALLLLGFSHSVFGVVGASLLFGLGFGIFDANNMPILCQFVPPGYRATAYGIMNM TGVFAGAAVTHLLGRWTDQGNLGGGFAMLAAGVLCAVVIQLYFLRPKTDNME >gi|226332196|gb|ACIC01000124.1| GENE 5 4513 - 5721 718 402 aa, chain - ## HITS:1 COG:slr1975 KEGG:ns NR:ns ## COG: slr1975 COG2942 
# Protein_GI_number: 16330802 # Func_class: G Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Synechocystis # 11 391 10 386 391 275 37.0 1e-73 MNVTEYLTCWAASYKNDLINNIMPFWMKYGLDQVNGGVYTCVNRDGSLMDTTKSVWFQGR FGFIAAFAYNHMDKNSEWLSASKSCIDFIERHCFDTDGRMYFEVTGDGRPLRKRRYVFSE CFAAIAMSEYALASGDRVYAEKALELFSRVLRFITTPGILSPKYCDTLSLRGHSITMILI NTASRIREAIDNPILTARIDDSLSDLKTFFMHPEFNALLETVGGNGEFIDTINGRIINPG HCIETAWFILEESRYRGWDKEMLETALTILDWSWEWGWDKDFGGIINFRDCRNLPVQDYS QDMKFWWPQTEAIIATLYAYEATGDEKYLKMHNQVSDWTYAHFPDGEYGEWYGYLHRDGS VAQPAKGNIFKGPFHIPRMMIKGHVLCNEILSGLSGTENIKL >gi|226332196|gb|ACIC01000124.1| GENE 6 5762 - 6679 582 305 aa, chain - ## HITS:1 COG:YPO3024 KEGG:ns NR:ns ## COG: YPO3024 COG0329 # Protein_GI_number: 16123201 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Yersinia pestis # 1 289 1 286 297 204 38.0 2e-52 MEKITGLINAPFTPFYENGDVNYEPLGTYAEFLVKNGLKGVFVNGSSGEGYMLTDEERMK LAERWLEVSPADFKVMVHVGSTCARSSRRLAAHAQSVGAYAIGAMAPPFPRIGRVEELVA YCQEIACGAPRLPFYFYHIPAFNGVYLSMTEFLEAVDGRISNFAGIKYTYENLYEYNQCR LYKDGKFDMLHGQDETLLPCLAMGGARGGIGGTTNYNGRTLVGILDAWKSGDLERARDLQ NFAQQVINVICHYRGNIVAGKRIMKLLGLDLGKNRIPFRNMTEEEESTMRAELEEIHFFD RCNKF >gi|226332196|gb|ACIC01000124.1| GENE 7 6769 - 8841 1148 690 aa, chain - ## HITS:1 COG:no KEGG:BT_0457 NR:ns ## KEGG: BT_0457 # Name: not_defined # Def: sialic acid-specific 9-O-acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 686 2 686 692 1086 74.0 0 MKKILLLFVCLFVCWVLSAQQRIKVACVGNSITYGTGLSDRATQSYPVKLQKLLGERYEV ENFGKPGATLLNQGHRPYTRQEEYRKALDFAGDIVVIHLGINDTDPRNWPNYRDFFVKDY LKLIDTFREANPAVRIIIARMTPIADRHTRFLSGTRDWHGEIQTAIETVARYAGVQLIDF HEPLYPYPYLLPDAVHPAAEGAAIMARTVYSAITGDYGGLKLSPLYTDNMILQRDTPLKV HGIANAGEQVTVCIDRQQWITKAAPDGKWSVKLSPLKAGGPYTLTISTPHRILKYTNVLI GEVWLCSGQSNMEFMLRQTITGKKDIPQAADEQLRLYDMKARWRTDAVQWDASVLDSLNH LQYYKETEWEICTPANAAHFSAIAYYFGQMLRDSLKVPVGLICNAIGGSPTESWIDRNTL 
EYQFPAILKDWTRNDFIQDWVRGRAALNVKLSKEKFQRHPYEPCYLYESGIRPLEQYPVR GVIWYQGESNAHNCEAHEKLFKLLVGSWRKNWKNEDLPFYYVQLSSIARPSWPWFRDSQR RMMAEIPNTGMAVSSDYGDSLDVHPRNKKPVGERLGRWALNKTYGYTSLVPSGPLFRKAE FRDWEVLITFDYGEGMRSSDNGPIRAFEVAEVEGLYYPATAEVLEDGLMRVFCEFVKSPR YVRYGWRPYSNGNLVNRENLPASTFRVGIP >gi|226332196|gb|ACIC01000124.1| GENE 8 8842 - 9507 512 221 aa, chain - ## HITS:1 COG:all0976 KEGG:ns NR:ns ## COG: all0976 COG2755 # Protein_GI_number: 17228471 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Nostoc sp. PCC 7120 # 46 217 69 243 249 99 32.0 3e-21 MKKIFFLVVILTLSLLCRAQERKYSTFYYQRATLFEELPVTSSDIIFLGNSITNGAEWAE LFKNKHVKNRGISGDICMGVYDRLDAILKGKPAKIFLLIGINDVSRGTPADTIVSRIEMI VRKIKADSPKTKLYLQSVLPVTDHYNMFKGHTSRWQVIPEINKGLVGLAEKEGATYIDLY SHFIDKQTGKMNTTYTNDGLHLLGKGYLKWVEIVKPYIGKK >gi|226332196|gb|ACIC01000124.1| GENE 9 9634 - 11595 1333 653 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 75 411 103 510 757 128 25.0 3e-29 MVKAGDYHLLPQPQKFTPSHLNFQVNKVQLSTPVLQQEWETLISEMGGTVSPKATGTIEV KLLPTLPEIPMNQDEAYRLSITKKRIIVEAVTERGVYWAMQTLRQLAEKRNSKTHIQGAE IIDWPAFRVRGFMQDVGRSYISLDELKREIAALAKFKINVFHWHLTENQSWRLESKIFPM LNDSANTTRMPGKYYTLEEAKELVAFCKAHHMTLIPEIDMPGHSAAFIRTFRHDMQSPEG MKILKLLMDEVCETFDVPYLHIGTDEVQFTNPRFVPEMVSYVRSKGKKVISWNPGWHYKP GEIDMTQLWSYRGKAQKGIPAIDSRFHYLNHFDTFGDIIALYNSRIYNKEQGSEDLAGTI LAIWNDRLVSTEWGMIIENNFYPNMLAMAERAWKGGGTEYFDKNGTILPTDEHSELFRSF ADFERRLLWHKEHTFDGYPFAYVRQTNVKWNITDAFPNEGNLAKAFPPEETLQDSYSYGG KTYNVRPAIGAGIYLRHVWGTLIPGFYKEPKENHTAYAYTYVYSPKEQNVGLWAEFQNYG RSEADLPPLPGKWDYKESRIWINEQEILPPVWTATHRTKSNEIALGNENCVARPPLEVHL QKGWNKVLLKLPVGKFVSPEVRLVKWMFTTVFVTLDGQKAVEGLIYSPNKTLE >gi|226332196|gb|ACIC01000124.1| GENE 10 11676 - 13310 1058 544 aa, chain - ## HITS:1 COG:STM0928 KEGG:ns NR:ns ## COG: STM0928 COG4409 # Protein_GI_number: 16764290 # 
Func_class: G Carbohydrate transport and metabolism # Function: Neuraminidase (sialidase) # Organism: Salmonella typhimurium LT2 # 172 528 34 393 412 126 30.0 1e-28 MKFYLLFIFISLNLCSFSARATDTIFVYETQVPILIERQDNMLFLMRLTTARSDTKLNEV VLRLGQNVNLSNIQSIKLYYGGTEARQNYGKELYLPVAYISRDVSGKTLAANPSYSINKS QVNNPGRKVILNAKQKLFPGINYFWISLQMKPGTSLLDKVSAKIVTVKVDNKEALIYTVS PENITHRVGVGVRHAGDDGSAAFRIPGLATTNKGTLLGVYDVRYNSSVDLQEHVDVGLSR SVDGGKTWEKMRLPLAFGETGGLPAAQNGVGDPSILVDTKTNTTWVVAAWTHGMGNQRAW WSSYPGMDMNHTAQLVLSKSTDDGKTWSEPINITDQVKDPSWYFLLQGPGRGITMQDGTL VFPIQFIDSTRVPNAGIMYSKDRGETWKIHNYARTNTTEAQVAEVEPGVLMLNMRDNRGG SRAVATTKDLGKTWTEHPSSRKALQEPVCMASLISVKAADNTLNKDILLFSNPNTVKGRH HITIKASLDGGITWLPEHQVMLDEGDGWGYSCLTMIDKETVGILYESSVAHMTFQAIRLR DIIQ >gi|226332196|gb|ACIC01000124.1| GENE 11 13507 - 17097 968 1196 aa, chain + ## HITS:1 COG:BS_resE_4 KEGG:ns NR:ns ## COG: BS_resE_4 COG0642 # Protein_GI_number: 16079368 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 779 1020 24 266 269 128 28.0 8e-29 MHRNTALLIIVILLCAFSALGRGVNYQFTSISIEEGLSQSTVQSILLDKKGKLWIGTRNG LNSYTGQDLKIFKSNPEDRYSLPDNEILHLTQDSLNNIWISTREGMVTYDEKHGNFTLVN RDIIYSSLCITDGILFGSENRIYKYDYKKRSFKSINIKKQEDKGSDVTKYRVQKIIAFSE REALVATRRDGVYVLDYQTMRLKRLFSGSNNILQGAYLSSNGYIYLSFWGQGLFCYEKTG KFIKRYTKENSDLTNNYVLDIVEKEGFLWLATDGGGINRLELSNENFSNLYHIAGDKNSL PVNSVTILYKDENEGLWAGSVRGGVFNIKESYIRTYKDCPLGYINGLSEKSVTSLFEEPN GRLWIGTDGGGINLYEPQTGDFKHFYSTYGDKVVSIAPISEKELIVSVYTKGLFLFDTHT GAYRPFLIINDSINFRQCFYGYIPRAHRVTKNKIYILSKDIWIYDINNRKFSPIKTDKNY QLPTSVMAYSDEKMSLTMSGNKVFQIINKNDSIQPLFQIDEKETITAIDCDGQNRIWVGT TAGIGYYNLKEKRYSKIDSQLFSDISALRYDPSSERVWICAQNQLYSYCIKDDKFIQWNR SDGFHPNEIIFTYQQRTMKKHRYLYFGGIEGLVQINTDIPEPNEPYPEIELCGVELNGQL QTSDINIRNIKIPWKYNTLILQVHIKNKDIFQRVPFRYIIKGDVDKSIESYHPMLELSNL SPGKYHIEVSCMTKDGNYTMPIHLTDIHVTPPWYKTDWFIILCCIFLTGGIIVGMSIFNI RKEVKIQKRLKEYRQYFNEKKIDFLIHINHELRTPLTLIYAPLKRLIDKNETQGLPSYLM PQLQLIFNQAQYMREIVDMVLDWNSMEAGYSKLKIQKCKLDEWITNIVKDFTEEARQKGI 
CIKLQMTTDIEEVWFDKQKCHTVLSNLLMNALKFSMPESCITINTRKLENKVRISVVDEG IGIQDSDITNLFTRFYKGNHKEKGSGFGLYYAKTIMEMHGGDIGAYNNTDKGATFYLELP LLDINEKAEATRLLQKEAPIVNTTQDLHFDCTGKTILIVEDEKELREYLIESFTGTFKKV YAADNAISALETCRKKQPSIIVSDVMMPQMDGFELCRQIKNDIRISHIPVILLTARYDQT GITTGYKSGADSYIPKPFDLAFLKVVVGNILKNKRKNTQSICNGYWFTLLIRFDKQ >gi|226332196|gb|ACIC01000124.1| GENE 12 17240 - 17446 181 68 aa, chain + ## HITS:1 COG:no KEGG:BF4234 NR:ns ## KEGG: BF4234 # Name: not_defined # Def: two-component system sensor histidine kinase/response regulator hybrid # Organism: B.fragilis # Pathway: not_defined # 1 59 1245 1303 1307 120 98.0 1e-26 MGVNDYINRLRIEQAMSLLVNTNLNINEISCEVGFTYPRYFSSTFKNMTGMTPKQFRNEN RTTQTTDD >gi|226332196|gb|ACIC01000124.1| GENE 13 18587 - 20986 1070 799 aa, chain + ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 302 660 282 653 676 127 28.0 7e-29 MWLRVKVFIGYVPLIFLLIITVGLFRKEQVNRDRLRQEEQKLHLVRGLVEQSYAGLLELS TYGEMVSVWDKEDFNSYHTKREEVCRNLLKLKKCTTDSGEQSHIDSLCLLLQEKEALLDT LMRTFEHLWEIDEVVNRKIPAIISNAKKNIPSVISDTSSTVKRESLWSKISCILKRKKKE SAYRERHKKNDITVQKNTDMLHSLSKEVTDMQENKEEQLLFQMDRLYENSSKLNKKLYRI VRELESEAKLKMEKRYRRFTLGREHSFRSVFVLSASVSALTILLYVIIHRDLKRKYKYQK ELEASDKANRELLRSRREMMLSIAHDLRSPLTTINGYTRLLPREKDSSLRIKYIENIRHS SEYMLLLVDTLMEFYLLDTEQIQPHLSFYNLESLFKEIIDNHLLQARKKDLRFSYNFSGM NTIVSGDRGWLQQIVNNLLGNAFKFTDKGNIHLNAEYGNGELRFWVQDTGSGMSEPETKK IFTAFARLGNASDISGFGLGLTISHRLVTQMGGNIQVKSHPGKGSTFTVMIPLPPADEKL QITENDYPSTAYKLDRLHILLLDDDIRQLHITSEMINRSGAHCDICTNSSELVSRLREDE YDLLLTDIRMPELDGYSILELLRSSNIRRANMIPVIALTARMDEEANYLARGFSGCIRKP FTMKTLIQGIYSTIGAEKSQAWKPDFSVLFTDEDNQTEMLEIFLCESRKELTRLHKSLCE NNRQSICDILHKNLPLWEIVNLDYPVENLWKIIITDPEKWQEREFMEIRKIELAAEKLVA YAEYIQKNGYEKDNTGNRR >gi|226332196|gb|ACIC01000124.1| GENE 14 20928 - 22244 674 438 aa, chain + ## HITS:1 COG:hydG KEGG:ns NR:ns ## COG: hydG COG2204 # Protein_GI_number: 16131834 # Func_class: T 
Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 14 434 8 441 441 302 40.0 8e-82 MPNTYRRMDMKKTILVIEDDVIFSRSICNWLSKNDMITKSVTMLSAARKHLATKEFDLIL ADLRLPDGNSTELLRWMHEKNLAVPFLIMTNYGQVENAVEAMQLGAVNYLCKPVRPEKLL DTINKVFSHMKHDMNEFYRGESDKAREMYRQIGLVAAPEISVLLRGASGTGKEHIARELH EQSRRKNKPYITVDCGSILEELAASEFFGHRKGAFTGADSDKTGLFQEADSGTLFLDEIG NLSYKTQMLLLRALQEKRYKPVGSTKERCFDIRLLAATNENLEKAISEGRFREDLFHRLN EFTIWVPRLSECHEDILPLAKFFLKRFSKEYRLAVQGFDRLAVATMEQYGWPGNIRELKN CVRRAMLLAGGGWITATDLNLDPTLKLEENVVLTGKERERQLLLQILERTGNNRARTARE LNMSRTTLYEKLKRYGII >gi|226332196|gb|ACIC01000124.1| GENE 15 23012 - 23641 182 209 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570815|ref|ZP_04848223.1| ## NR: gi|253570815|ref|ZP_04848223.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 5 209 1 205 205 397 100.0 1e-109 METSMKYHSFCSFTEIIQKKENPTESELSSAYEEFVNHLSLVASSELSIIGKLRKMRQLE LELATYRNGKYSVPDNPTTIYLTKAVALVRTEIDLLHFAVDHPECHAVPPPVSKRGNTTP SLHWKSSLVNLMELITSLDYSGFVTDEKGNRHSFSALVTAFETFFHVTFSKPYDLRADLA RRKKSLSVLLPKLKENYEKNIVNCGIDRR >gi|226332196|gb|ACIC01000124.1| GENE 16 23975 - 24223 261 82 aa, chain + ## HITS:1 COG:no KEGG:BT_0100 NR:ns ## KEGG: BT_0100 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 82 65 145 145 97 63.0 2e-19 MKVVKIDKAATDYYIKLTNLQSEYRRIGVNYNQIVKAVHSGELTEKKALALLYKLEQLTV ELVSLNKEIIRLTKEFEQWLQR >gi|226332196|gb|ACIC01000124.1| GENE 17 24313 - 25440 839 375 aa, chain + ## HITS:1 COG:no KEGG:BT_0101 NR:ns ## KEGG: BT_0101 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 347 37 386 412 289 44.0 1e-76 METPEGKVGITECMRSFEPYLAMNRRTEKPVIHLSLNPHPDDVLSDEQLEAIGREYMDKL GYGDQPYIIFRHDDKARPHIHIVSLRIDEQGKKLRDYKEWERSTDICRELERKYGLLQSR REEYRISRPMTTVDYTKGDLKHQIAGVVKPAVQNYRFQSFKEFRALLGLFNVTVEEVNKV VDGKRCSGLVYAATDGKGKRIGVGIKSSNIDKSVGYRALQKRFSQSRTWMKKHPLPEKTR 
ENIRTALRRDTREGFLHELAGKGIVPVLWENGAGVIYGVTYIDHNIRAVFKGSALGKEFS ASVINRKYGTLPSVPDKPTPGQEPPVNTDTRETGLAEGLLDIFSMESCPYPAEDIPRNIY GKKKKRKGRKGPRIN >gi|226332196|gb|ACIC01000124.1| GENE 18 25477 - 25749 233 90 aa, chain + ## HITS:1 COG:no KEGG:BT_2306 NR:ns ## KEGG: BT_2306 # Name: not_defined # Def: putative mobilization protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 83 1 84 669 99 58.0 3e-20 MAQEDSRGLGKTIDFMRAVSILFLVMNIYYFCYPFFHAAGCTNGMVDRILLNFQHDTGLF TSSPVTKCFSLMFLFFSSMGAKGRRVPEMT >gi|226332196|gb|ACIC01000124.1| GENE 19 25930 - 27489 1394 519 aa, chain + ## HITS:1 COG:alr7213 KEGG:ns NR:ns ## COG: alr7213 COG3505 # Protein_GI_number: 17233229 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Nostoc sp. PCC 7120 # 48 406 116 466 589 90 25.0 9e-18 MDDVFNEENESFMQETRLITNEYSVNLPTRFYYKKKWHDGWINVVLPQRGCIVVGSPGSG KSYCVINQFIKQQIEKGYALYCYDFKFPDLSLIVYNHLLKNKDKYKANVQFYVINFDDPR HSHRCNPINPTFMSDIADAYESAYVIMLNLNRTWIQKQGDFFVESPIVLFAAIIWYLKIY ENGKYCTFPHAIELLNKPYSDIFTILTSYRELENYLSPFMDAWKGGAMEQLQGQIASAKI PLSRMISPSLYWIMSGDDFTLDINNPDEPKVLCVGNNPDRQNIYSAALGLYNARIVKMVN RKGKLKCSILVDEVPTIYFKGLDTLIATARSNKVAVCLGAQDFSQLVRDYGEKEARVIQN TVGNVFSGQVLGETAKNLSERFGKVLQQRKSVNMTREDTSTSISTQLDSLIPASKISNLS QGMFVGSVCDSFQEKMEQKIFHCEIVVDNARVAAETKAYKPIPVITDFTGGDGKDHMREE IERNYYRIKEEVSGIIQKELRRIENDPNLKHLLETEDDG >gi|226332196|gb|ACIC01000124.1| GENE 20 27526 - 27696 147 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570820|ref|ZP_04848228.1| ## NR: gi|253570820|ref|ZP_04848228.1| DNA adenine methylase [Bacteroides sp. 
1_1_6] # 1 56 1 56 56 105 100.0 1e-21 MEKKKTSSGCLPGKPCFPWVGGKRRLLPVLIQSLPENITEMKTYVEPFVGGGALFF >gi|226332196|gb|ACIC01000124.1| GENE 21 27754 - 28269 368 171 aa, chain + ## HITS:1 COG:MJ0598 KEGG:ns NR:ns ## COG: MJ0598 COG0338 # Protein_GI_number: 15668778 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Methanococcus jannaschii # 1 171 69 240 289 128 38.0 4e-30 MNVYRVIRDTPEELVGLLAGIQGEYHALRERTERRDYFMEKRRVFNEEHPDDITRAALFI FFMRTCYNGIYSVNRKGRLSVTFGTGSRARILEEELIRCNHKLLQDVIILDGDYRQTEKY AGEKSFFYFDPPYKPVNEAGTCTSYMPDDFDDDCQIELAGFCKGLGEKGSK >gi|226332196|gb|ACIC01000124.1| GENE 22 28444 - 28680 58 78 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MREDKRQGHTLSATQGFERISAGKNSIRADDSATGGKESRGLKKKKATISCVLDIMVASI YCLCKNKELMFWTAPLYN >gi|226332196|gb|ACIC01000124.1| GENE 23 28554 - 29024 107 156 aa, chain - ## HITS:1 COG:no KEGG:BDI_3503 NR:ns ## KEGG: BDI_3503 # Name: not_defined # Def: DNA primase # Organism: P.distasonis # Pathway: not_defined # 3 136 147 286 312 128 50.0 5e-29 MYGKWYFAIGFKNRKGGLEIRNPYFKGAVSPKDITHVSHNTGDRRQSSVLVFEGFMDYLS YLALKKGQAVPDCVVLNSVTNLPKAMDILRSYGQVCCFLDNDEVGRKAVEEIRKQCGKIS DKAIHYLPHKDLNEFLQERIRSERMTVRQEAKNQEG >gi|226332196|gb|ACIC01000124.1| GENE 24 29298 - 29456 148 52 aa, chain - ## HITS:1 COG:no KEGG:BVU_2465 NR:ns ## KEGG: BVU_2465 # Name: not_defined # Def: DNA primase # Organism: B.vulgatus # Pathway: not_defined # 1 52 2 53 323 63 55.0 2e-09 MNIKEVKKIPLEDFLGRAGFSPVRRQGDSVWYLSPFRQERTPSFKVSLSLNL >gi|226332196|gb|ACIC01000124.1| GENE 25 29476 - 29826 93 116 aa, chain - ## HITS:1 COG:RP407 KEGG:ns NR:ns ## COG: RP407 COG0739 # Protein_GI_number: 15604272 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Rickettsia prowazekii # 2 111 300 411 458 114 49.0 5e-26 MFGMRVHPVKGGRLFHQGVDLAAPCGTPVYSAGNGKITEARYSRSYGWFVHVRHAEGYST LYAHMSRLHVKAGTHVRIGQHIGNVGHTGVATGNHLHFELRKDGVLLDPLSWPPFK >gi|226332196|gb|ACIC01000124.1| GENE 26 
30135 - 31223 941 362 aa, chain + ## HITS:1 COG:no KEGG:PGN_0581 NR:ns ## KEGG: PGN_0581 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 9 348 5 365 494 125 28.0 2e-27 MEELKPAATGSENATQEQGYLMAYDKKEQKAKGVKGIAANGELETLEANETNKSQFIRVD RYGNFFTNFGKNFMYQYNHPTRYALYRMEEKTPVEEAKRRIEQGQLPENESLRRELSRDN RIYNNHLFNEREINWQQAERYGLTPDVLRQTGDMERLLQGRQSGIAYDISMNTELGRQKG DAKLSLFRDENGLAKFDLHFIRQAPKVGQEYRGYKLEEDVLDALNRTGNAGRLVDLVVDF RSKETKPCYLSKDPVTNEMFFLPADQARCSKKIKDYTLSQQEYDDYTAGKEVPIEFKSSN GKILRTSIQMSAAERGTEFLWERSTKKLENRQGLEQKDGLPEKTQKPEKRTYARKSKITP KM >gi|226332196|gb|ACIC01000124.1| GENE 27 31228 - 32658 532 476 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570827|ref|ZP_04848235.1| ## NR: gi|253570827|ref|ZP_04848235.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 476 1 476 476 964 100.0 0 MASPKDNMEQLEELFRQDGRGCLLIGYETGMDKPHAAISYQLYPVNPEQDGMTYQFLGLL HVGVETARISAFVPDTRLEIYRFPRMSDVPSISRDIPVREYITDKLLPHIRRYGLEPVVS VNLRDAVFMRSALKRPMEPGGRLRLTAAEIDRLMDFRLLQDEKARLYGYDPAYKLPLHIV ETSRGILVFSDGPAGQKGLEEFYQHLADNYWWIHSEPGPVKQYDMHSVPASLAPLIDASC RKDPDTGRYVYEFTDSPVRADLPDERKLEPVFFTDMTPSAEGYRNLTEFSGCGMNRCNAD IYRLLSLTRHFDRQLILDPAFSYRHQFREFVERMDSFLRGNPGDDDMGKILDDMHGKAGR ILKTDFDVRGHRTLERLLNDCSVPFLIGDHEADDTLRRALLEGKWIYFPGLSAKMPGLRY IHADKTCDRVMAYKNPPGLKPVYQAKDGKIVPYEAKAVKTDKSRAKRNRKRNNLKL >gi|226332196|gb|ACIC01000124.1| GENE 28 32664 - 33476 391 270 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570828|ref|ZP_04848236.1| ## NR: gi|253570828|ref|ZP_04848236.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 270 1 270 270 535 100.0 1e-150 MEQTYTAIETWGGFLAFTDTAEGRGKLRQFLQQTADAYFNPAFNSGALHVYRAEGKLGNR PWVNPGRMRPDEYPYGPKPHGSRMELLYSNQMRPTAEDFRSFCHNAGCEISARNVNITDT LDALERYDRQAEELQRIPAKSARDREELLQTLETRRQLQKLVDSAYDVRGYRTAGRILDD PAECVILEGVPLYGPHRSVLKEGLGLYLPHESGNNPSHAYAWVDQATDRIIFGGNPPVDR KTVRIRPEVEKRLYSPPGKTRKRTEIRPKM >gi|226332196|gb|ACIC01000124.1| GENE 29 33489 - 34448 462 319 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570829|ref|ZP_04848237.1| ## NR: gi|253570829|ref|ZP_04848237.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 319 1 319 319 588 100.0 1e-166 MTMEKSNRTAPLKVVLRKHGYLLFSGSRAGEEALKRYLQRAADRYFEPHAHDGALEIYEY SGRRRELMPHINGGGHGMSCREGTGKAEYIPRETLPAGFAESLEPRLYHAMNPTADEYRG LFSKVRGYTGFELSSRGENEDIYRLLSIRERGYMNIHDRPFTYYRELLPLAERFEEVTQV KAAELFDPQAFRELSSQIRKKADDILRRDFDVRGHRSLEKYLGNPDTALLVGNVRLENEQ FRVLSEGHALYLPENDRPASRHLLYCMADFTRNKLLVSGEPFPIRTYRVKDGIRHPAGQR EEASEKRKRAGRKPSGRKL >gi|226332196|gb|ACIC01000124.1| GENE 30 34466 - 35566 469 366 aa, chain + ## HITS:1 COG:no KEGG:BF3847 NR:ns ## KEGG: BF3847 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 138 361 27 281 301 63 24.0 9e-09 MYIGFILNGRNELEYCRMLSSDVNDLDRRFGEFCFSQVRGQLKGAYLPEERIYCIEAAGE DGNQVESLVRNIIDRGAWPSDDYQRLRDIGFVHETADSYRRTDSRDFIVGELERTLHNGR AGRLLEAYHDFHRIREKDTGNRVLTAIQTGQGVLLFDDTGRGLERAESYLQYNADNFFSP IHRNTDKLGVYYFSTDNARLVEKARECGRMFTSDDRCRFIPAKAEFLRSDILRGCKPAVE CDMFPDLARYREMLKTFKLRESEPAFNIGILDRLCRTGNLDEMPENRNFRHFCSFSSLHL QMNHSFLSSQADTLLGSSMRQAVSDTARRILQNEYDVRGYELPTPDTRRRREKKPEKTGQ KTKIGR >gi|226332196|gb|ACIC01000124.1| GENE 31 35568 - 36779 558 403 aa, chain + ## HITS:1 COG:no KEGG:BF3847 NR:ns ## KEGG: BF3847 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 166 397 33 283 301 69 26.0 2e-10 MASWLEKLGLQRTRPPKEELKDFCLELCPDNRIASFMITDHTPRSGNPFIQFVKEAARKA VIEEARGVPDNLLLFYAGIRKEELPPSNGLLEKEPRQAWRFIIENGGITDGNSVPLRRAA YLEAHGVETDRRNLEGLEHSPGYKDFVAHETARERILQGKNARFLVTAVKTDQGVRVFNR 
GLHGPRHLREYLQEIADNFYSPCGKRPGTLSIFRIETSSKRLMEMSRQPGRCFPASQPGL EVFGNYRPVASFDLSPTAANLDRFITAGSLELSRRNLDIMTLQDIAEKGYAHLTLEEPFR YRREFAPVEKELGRLSREKKLHPDFPLGERMAELRETSRKIAADLLVREGVRTGMPRSET RYADRISRMLGSPADTLGGSDTGKKRKQPSSKRKHAAGRQVKL >gi|226332196|gb|ACIC01000124.1| GENE 32 36793 - 36984 246 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|256839867|ref|ZP_05545376.1| ## NR: gi|256839867|ref|ZP_05545376.1| conserved hypothetical protein [Parabacteroides sp. D13] # 1 62 1 62 294 115 98.0 9e-25 MATNKPLTPEEAVDMSRRAVGRRALMRTLRDIFSYRNPLRQFMDAYRFNVIRWSASNLRE PTL >gi|226332196|gb|ACIC01000124.1| GENE 33 37198 - 37677 224 159 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570832|ref|ZP_04848240.1| ## NR: gi|253570832|ref|ZP_04848240.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 159 68 226 226 317 100.0 2e-85 MTGYEHCPSLLTERVLLSTYRYIFMAPHDRRFSALCPDGICGMLRQNGVPPESLYYQNYE SLLQGRETVIYEFASGGKGQQYLRPLEHVRLDAGPGGYDMLPARRPDGPKDPSVSQGAEE AVPLSEKMEKNGRNIAREKRTVQNTKVKNTKAKNKGIRM >gi|226332196|gb|ACIC01000124.1| GENE 34 37708 - 37821 116 37 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570833|ref|ZP_04848241.1| ## NR: gi|253570833|ref|ZP_04848241.1| DNA topoisomerase III [Bacteroides sp. 
1_1_6] # 1 37 1 37 37 63 100.0 6e-09 MRIVLTDKPAMARSIASVLGANEKAEGYLYGNGYAVT >gi|226332196|gb|ACIC01000124.1| GENE 35 37876 - 39780 1101 634 aa, chain + ## HITS:1 COG:CAC2947_1 KEGG:ns NR:ns ## COG: CAC2947_1 COG0550 # Protein_GI_number: 15896200 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 6 553 66 615 618 278 33.0 4e-74 MERQALPLIPVPFRLVVRQVRTEEGYVTDPLATQQLGVLRGLLENAEGVVIATDAGREGE LVSRYLLDYLGYRKPALRLWISSLTEKAILNGFKNLREDGDFDNLALAARARREADWILG VNASIALGIVAGMSNHSLGRVQTPTLALVCRRYLENRDFVPVNCYHLQTGVCKDGQKVLF TCPLQYERKEDAEKARSRLEGESRCTVLAVKKKENMEDPPLPHDLTGLQQEANTRLGMAA GQTLAVVQRLYERGYVSYPRTGCCHIPEDIFEQAPTLVASLKGHPRFGRHAERLLEKGLN PHAVDGNSVTEHHALLITGEVPEGLSPDEQNIYSLIAGRMLEAFSEGCVREIVRVRVDYG EMEFEAETSCVKYPGWRDIYNGPDMAEEGKSLPQFHQGEILAVLETEILEDCTKPQPPFT EAELLAAMENAGSTADSENKKKRMEKCGLGTPGTRAGIIDLLVARRYVERIGNRLIPTEK GLEIYGIVGDKLIADAAMTAAWEEALREIEEGRLSAGKFMKGIHEYARKIVSELLALQVR NPSVTKCTCPKCGTGTVTFYDRVAKCGDPDCAFHLPRMFNGRTMTDEDMTRLMAGEATPF LKFTTKAGKPYEASLRMDENYKVELTFKDRRPEG >gi|226332196|gb|ACIC01000124.1| GENE 36 40106 - 40393 283 95 aa, chain + ## HITS:1 COG:no KEGG:BF3841 NR:ns ## KEGG: BF3841 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 41 94 1 54 54 82 70.0 6e-15 MEITIMETSAYLELKRQLSTLSVQMEEFNRKTAPPSPDKWIDAQEVCQALGISKRCLQAH RNRGLIPCSHIGGKYFYREADIQKILEEGLIRNRK >gi|226332196|gb|ACIC01000124.1| GENE 37 40397 - 40708 315 103 aa, chain + ## HITS:1 COG:no KEGG:BF3840 NR:ns ## KEGG: BF3840 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 101 1 98 100 146 72.0 2e-34 MAEIITKDSEEFKEITGWIKRTGRALEKATARIRPGIADEHYLSGEEVCEKLHISKRTLQ ALRDEKAIPYTSVTAAGGKLLYPESGLYEVLKKNYKDFRKYIR >gi|226332196|gb|ACIC01000124.1| GENE 38 40785 - 42014 839 409 aa, chain - ## HITS:1 COG:SSO0375 KEGG:ns NR:ns ## COG: SSO0375 COG0582 # Protein_GI_number: 15897309 # Func_class: L Replication, recombination and repair # Function: 
Integrase # Organism: Sulfolobus solfataricus # 214 386 109 270 291 67 33.0 4e-11 MRSTFSVLFYTKNQSLKDGRVPIMGRITINRTTACFSCKLEVSLALWDAKANRAKGKSDE ARRLNQKLDHIKAQITRHYQYICDHDSPVTAKSVYNRYLGFGDSYHTLMGLFREELASYK EKVGKEKAASTYRGLVADYNNLLLFLKEKRRIEDIAIADLDKKFIEDYYNWMLGTCALAS STAFCRVNTLKWLMYTAQERGWIKLHPFIGFDCLPEYKRRSFLTEEDLQRVIHVKLNYKR QRAIRDMFLFMCFTGLAYADLKEITYKNIHTDSEGGTWLMGNRIKTGVSYVVKLLPIAIE LVERYKGENKKKSSPDKVFPVGEYQTMVSSLRVLTQKCGCSTEITPHIGRHTFAVLAILK GMPLETLQKVLGHKSILSTQIYAELINPKVGEDTDKLCMKIGDVYRLTN >gi|226332196|gb|ACIC01000124.1| GENE 39 42801 - 44963 2146 720 aa, chain - ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 2 715 4 650 709 449 39.0 1e-125 MIVCIAEKPSVARDIAEVLGAHTRKEGYIEGNGYQVTWTFGHLCTLKEPHEYTPNWKSWN LGSLPMIPPRFGIKLIENPTYEKQFHIIEGLMQNADEIINCGDAGQEGELIQRWVMQKAG ARCPVKRLWISSLTEEAIREGFAKLKDQSDFQSLYEAGLSRAMGDWLLGMNATRLYTIKY GQNKQVLSIGRVQTPTLALIVNRQLEIANFQPKQYWELKTNYRDTTFSALIRKSDEEIAA EEEKSGGKKKIDNPGIDPIANREEGEALVERIKDLPFIVTNVGKKDGKEYAPRLFDLTSL QVECNKKFAYSADETLKLIQSLYEKKVATYPRVDTTFLSDDIYPKCPGILKGLRDYEVLT APLAGASLPKSKKVFDNSKVTDHHAIIPTGVYAQNLTDMERRVYDLIARRFIAVFYPDCK ISTTTVMGEVDKIEFKVTGKQILEPGWRVVFAKDVKDTSEEKEEEDENVLPAFVKGESGP HVPDLNEKWTQPPRPYTEATLLRAMETAGKLVDNDELRDALKENGIGRPSTRAAIIETLF KRNYIRKERKNLIATPTGVELVQIIHEELLKSAELTGIWEKKLREIEKKTYDARQFLEEL KQMVSEIVMSVLSDNTNRRITIQEAVAQTEEKAKKEPKKRERKASAPKEKKAKAEPKVPA SSPSPALTPSAPASAPATGNADTFVGQPCPVCGKGTIIKGKTAYGCSEWRNGCTYRKNFE >gi|226332196|gb|ACIC01000124.1| GENE 40 45116 - 47263 2436 715 aa, chain - ## HITS:1 COG:BH2955_1 KEGG:ns NR:ns ## COG: BH2955_1 COG1884 # Protein_GI_number: 15615517 # Func_class: I Lipid transport and metabolism # Function: Methylmalonyl-CoA mutase, N-terminal domain/subunit # Organism: Bacillus halodurans # 24 587 19 582 582 845 72.0 0 MRKDFKNLDIYAAFQPANGAEWQKANGISADWNTPEHIDVKPVYTKEDLEGMEHLGYAAG 
LPPYLRGPYSVMYTLRPWTIRQYAGFSTAEESNAFYRRNLASGQKGLSVAFDLATHRGYD PDHERVVGDVGKAGVSICSLENMKVLFDGIPLSKMSVSMTMNGAVLPIMAFYINAGLEQG AKLEEMAGTIQNDILKEFMVRNTYIYPPAFSMKIISDIFEYTSQKMPKFNSISISGYHMQ EAGATADIELAYTLADGLEYLRAGTAAGIDIDAFAPRLSFFWAIGTNHFMEIAKMRAARM LWAKIVKQFNPKNPKSLALRTHSQTSGWSLTEQDPFNNVGRTCIEAMAAALGHTQSLHTN ALDEAIALPTDFSARIARNTQIYIQEETYICKNVDPWGGSYYVEALTNELAHKAWERIEE VEKLGGMAKAIETGIPKMRIEEAAARTQARIDSGSQTIVGVNKYRLEKEAPIDILEIDNT AVRLEQIENLKCLKEGRNQAEVDKALAAITECVETGKGNLLELAVEAARVRATLGEISYA CEKIVGRYKAIIRTISGVYSSESKNDGDFKRACELAEKFAKKEGRQPRIMVAKMGQDGHD RGAKVVATGYADCGFDVDMGPLFQTPAEAAREAVENDVHVVGVSSLAAGHKTLIPQIMEE LKKLGREDIVVIAGGVIPAQDYDFLYKAGVAAIFGPGTPVAKAACQILEILMDEE >gi|226332196|gb|ACIC01000124.1| GENE 41 47265 - 49166 2077 633 aa, chain - ## HITS:1 COG:BH2956_1 KEGG:ns NR:ns ## COG: BH2956_1 COG1884 # Protein_GI_number: 15615518 # Func_class: I Lipid transport and metabolism # Function: Methylmalonyl-CoA mutase, N-terminal domain/subunit # Organism: Bacillus halodurans # 8 470 9 468 525 224 31.0 4e-58 MADKKEKLFSDFSPVSTEQWMEKVTADLKGADFEKKLVWKTNEGFKVKPFYRMEDLEGLK TTDALPGEFPYLRGTKKSNNEWLVRQEIKVECPKEANAKALDILNKGVDSLSFHVKAKEL NAEYIETLLNGIQAECVELNFSTCQGHVVELAGLLVAYFQKKDYDVKKLRGSVNYDFFNK MLTRGKEKGDMVQTAKALIEAIQPLPFYRVLNVNAISLNNAGAYISQELGYALAWGNEYM NQLTDAGIPAATVAKKIKFNFGISSNYFLEIAKFRAARMLWANIVASYHPECLRDCDNKG ANGECRCAAKMAVHAETSTFNLTLFDAHVNLLRTQTEAMSAALAGVDSMTVVPFDKTYST PDEFSERLARNQQLLLKEESHFDKVIDPAAGSYYIENLTVSIAKQAWELFLAVEEAGGFY AALKAGTVQAAVNESNKARHKAVAQRREVLLGTNQFPNFNEKAGEKQPIEAKCCCGGDAH TCEKDVDTLVFDRAASEFEALRLETEASGKRPKAFMLTIGNLAMRQARAQYSCNFLACAG YEVIDNLGFETVEAGVEAAMAAKADIVVLCSSDDEYAEYAVPAYHALNGRAMFIVAGAPA CMDELKAAGIENFIHVRVNVLDTLKEFNAKLLK >gi|226332196|gb|ACIC01000124.1| GENE 42 49485 - 51149 1916 554 aa, chain + ## HITS:1 COG:STM3807 KEGG:ns NR:ns ## COG: STM3807 COG2985 # Protein_GI_number: 16767092 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Salmonella typhimurium LT2 # 28 549 18 545 553 394 41.0 1e-109 
MDWLQSLLWDPSSVAHIVFLYAFVVAAGVYLGKIKIFGVSLGVTFVLFAGILMGHFGFTA DTHILHFIREFGLILFVFCIGLQVGPSFFSSFKKGGMTLNLLAVGIVVLNIAVALGLYYL WNGRVELPMMVGILYGAVTNTPGLGAANEALNQLSYNGPQIALGYACAYPLGVVGIIGSI IAIRYIFRVNMTKEEESLKTQSGDAHHKPHMMSLEVRNESISGKTLIEIKEFLGRNFVCS RIRHEGHVSIPNHETIFNMGDQLFIVCSEEDAPAITVFIGKEVELDWEKQDLPMVSRRIL VTKPEINGKTLGSMHFRSMYGVNVTRVNRSGMDLFADPNLILQVGDRVMVVGQQDAVERV AGVLGNQLKRLDTPNIVTIFVGIFLGILLGSLPIAFPGMPTPLKLGLAGGPLVVAILIGR FGHKLHLVTYTTMSANLMLREIGIVLFLASVGIDAGANFVQTVVEGDGLLYVGCGFLITV IPLLIIGAIARLYYKVNYFTLMGLIAGSNTDPPALAYANQVTSSDAPAVGYSTVYPLSMF LRILTGQMILLAMM >gi|226332196|gb|ACIC01000124.1| GENE 43 51242 - 51877 707 211 aa, chain - ## HITS:1 COG:no KEGG:BT_2093 NR:ns ## KEGG: BT_2093 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 211 1 211 211 418 100.0 1e-116 MSGNYVINIGRQLGSGGKEIGEKLAARLGIDFYDKELINLASEESGLCKEFFEKADEKAS QGIIGGLFGMRFPFISEGAMPCTNCLSNDALFKVQSDVIRRLAAEKSCVFVGRCADYILR EHPRCANVFISATKEDRIARLCQMHRIDEEAAEEMIEKADKRRSEYYNYYSYKTWGAAAT YHLCVDSSSLGIEETVRFVEEFVAKKLQIGM >gi|226332196|gb|ACIC01000124.1| GENE 44 52044 - 54098 1355 684 aa, chain - ## HITS:1 COG:no KEGG:BT_2094 NR:ns ## KEGG: BT_2094 # Name: not_defined # Def: TonB-dependent receptor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 684 1 684 684 1354 99.0 0 MSGQQHSSDNKNWAEKGIVIREVEILGKRPMKDIGVQQTRFDSLVLKGNIALSMADILTF NSSIFVKSYGRATLSTVSFRGTSASHTQVTWNGMRINNPMLGMTDFSMIPSYFIDDASLL HGTSSVNMAGGGLGGLVKLSTVPAHQEGFGMQYVQGIGSFSTFDEFLQLKYGDKHWQIST RAVYQSSPNDYKYRNHDKKENIYDDKYNTIEQYYPIERNRSGAYKDLHILQEVYYNTGKG DRFGLNAWYTDSNRELALLTTDQGDLMDFENRQREHTLRSVLSWDHTRENWKVSARGGYV HTWLAYDYKRDLGNGIMATMTRSRSKVNTFYGQLDGEYFFSDKLLLTAGVSAHQHLVNSL DKDLNKDDNKNDKYGQGRKNDSIVYFDKGRIELSGNVSLKWQPVNRLGMSLVLRGEMFGT KWAPVIPAFFVDYVLSKRGNIMAKASITRNYRFPTLNDLYFLPGGNPALNNESGFTYETG LSFSVDKDNVYTLSGSASWFDQHINDWIIWLPTSKGFYSPVNLKKVHAYGVEVQADYAVA IDKAWKLGLNGTFAWTPSINEGEPTSKADQSVGKQLPYIPEYSATLSGRLTYRSWGLLYK WCYYSERYTMTSNAVSYTGHLPPYLMSNVTLEKGFSLRWADLSLKGTVNNLFDEEYLSVL 
SRPMPGINFEFFIGITPKWGKKKK >gi|226332196|gb|ACIC01000124.1| GENE 45 54174 - 55241 833 355 aa, chain - ## HITS:1 COG:no KEGG:BT_2095 NR:ns ## KEGG: BT_2095 # Name: not_defined # Def: putative surface layer protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 355 1 355 355 721 100.0 0 MKRILLSVLFIVFCLTAFVGCMKWDYGKMEPFRATGDGLFIMNEGNFQYGNATLSYYDPE TKKVENEIFYRANAMKLGDVAQSMIVRDTIGWVVVNNSHVIFAISTNTFKEVGRITGLTS PRYIHFISDEKAYITQIWDYRIFIVNPKTYQITGYIECPDMTMETGSTEQMVQYGKYVYV NCWSYQNRILKIDTTTDKVVDQLTVGIQPTSLVMDKNFKMWTITDGGYKGSPYGYEEPSL YRIDAETFKIEKQFKFQLGDAPSEVQLNGAGDELYWINKDIWRMSVDEERVPVRPFLKYR DTKYYGLTVSPKNGDVYVADAIDYQQQGMIYRYTEDGELVDEFYVGIIPGAFCWK >gi|226332196|gb|ACIC01000124.1| GENE 46 55763 - 56740 583 325 aa, chain - ## HITS:1 COG:no KEGG:BT_2096 NR:ns ## KEGG: BT_2096 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 325 1 325 325 642 100.0 0 MKSNKSKYKLVVDYVIDGINEGRLKKGAWIPSLNEFREMYGLSRDTVFSGIRELKSRGII KSNPGVGYYIVSTRVPFKHNIFLLFNEFNEFKEDIYNSFMETVGNSATVDLYFHNYNRKV FETLVNNANHKYTTYIIMSGKFADIGPLLESLSGNVFLLDHYHSELKGKYSSVFQNFEKD TYEALVYGLSNLRKYKHIVMVQKDSKEPFERYDGLRAFCKEHGFTHECIGEIQGREIVKQ EVFMVVNDRDLVNLLKQADRQQLVPGKDFGIISYNDTPLKEVLAGGITTLSTDFKLMGRT MASLINKKTIETIENPWNLNIRNSL >gi|226332196|gb|ACIC01000124.1| GENE 47 56907 - 58970 1148 687 aa, chain + ## HITS:1 COG:mlr2247 KEGG:ns NR:ns ## COG: mlr2247 COG3533 # Protein_GI_number: 13472070 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Mesorhizobium loti # 100 675 99 651 662 291 31.0 3e-78 MKSTNLLKSLAVIATLSGTPHSAVQGQTGKVQFANIENVKVNDAFWSPKFKTWNEVTIND VLNKFEGKHTNAPEQHNAFCNFDKVAKGERGTQGHFGEPWFDGLIYESIRGIADYLVMYP DKELETRIDRYIDRIEAAQMTEPTGYLETYTLLKEPEHRWGDNGGFLRWQHDVYNAGMMI EAAVHYYKATGKTKLLEVATRYTNYMADYMGPAPKKNIVPSHSGPEEAIIKLYWLFKEQP ELKKQLSVSVNEDAYWKLATFWIENRGHHCGYPLWLTWGNGKSEKWIRDAQYNAPEHGEH SRPTWGDYAQDSIPVFEQQTIEGHAVRATLLATGITTAALENHSPKYIETAKRLWDNMTG 
RRMFITGGVGAIHEDEKFGPDFFLPPGAYLETCAAIGAGFFSQRMNELTGKGMYMDELER VLYNSLLTAVSLKGDNYTYQNPLNAEKHNRWEWHGCPCCPPMFLKITSALPGFIYASDKK GVYINLFIGSETEIKLSSKNSVQLKQETSYPWKGKVNISVNPQKTDKFPIKVRIPGWAQG IENPYELYQSNLKGAPQLFVNGKSVPIKIVDGYAEINRKWQKGDLIELELPIQPRIITAH TNTKDLSNTVCIASGPIIYCFEDVDNPDFKDFRLDTNAPLEIIYQKDLLNGVNIIKSNGK TTAIAIPYYSITNRKRDSSHKVWVPKL >gi|226332196|gb|ACIC01000124.1| GENE 48 59163 - 60302 1043 379 aa, chain + ## HITS:1 COG:alr4031 KEGG:ns NR:ns ## COG: alr4031 COG0614 # Protein_GI_number: 17231523 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Nostoc sp. PCC 7120 # 6 377 44 422 426 206 30.0 5e-53 MKFKSIISTITLLCLLSLASCVYNKKTSLEAFKQDVYMPEYATGFKILGAKNAQSTLIQV SNPWQGAKDVTMSYFISRNGELPPAGFTGPTIPAGAQRIVCMSSSYIAMLDALGQMNRIV AVSGINYIANPYILAHKDSIKDMGPEINYELLLGLKPDVVLLYGIGDAQTAVTDKLKELA IPYMYVGEYLEESPLGKAEWLVALSELTDSRDKGIEVFSEIPKRYQALKDLTASVEHHPT VMFNTPWNDSWVMPSTQSYMAQLVTDAGADYIYKENTSNSSAPIGLETAYGLIQKADYWI NVGTASTLDELKNMNPKFADAKSVRDKTVYNNNLRITATGGNDYWESAVVRPDVVLRDLI HIFHPELVSDSTYYYRHLE >gi|226332196|gb|ACIC01000124.1| GENE 49 60377 - 61315 985 312 aa, chain + ## HITS:1 COG:alr4032 KEGG:ns NR:ns ## COG: alr4032 COG0609 # Protein_GI_number: 17231524 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Nostoc sp. 
PCC 7120 # 1 304 40 351 362 215 42.0 7e-56 MATGDTYIPIPKVWAVLTGGECDEMTRNILLSIRFIRVVVAGLIGIALSVSGLQMQTVFQ NPLADPYLLGVSSGAGLGVALFILGAPLLGWADFPLLQSAGIVGSGWIGTAVILLGVAII SRKVKNILGVLIMGVMIGYVAGAIIQILQYLSSAEQLKMFTLWSMGSLSHITAGQLSIMI PVVCIGLLLSVACIKSLNLLLLGENYARTMGMSIKRSRTLVFISTALLTGTVTAFCGPVG FIGLAVPHITRLLFDNADHRILMPGTMLTGLIAMLICDIIAKKFLLPVNCITALLGVPVI LWVIGKNLRIFK >gi|226332196|gb|ACIC01000124.1| GENE 50 61312 - 62067 213 251 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 211 1 212 245 86 26 4e-16 MIQLKDLTLGYEQRTLLEKVSAHITGGQLVALLGRNGTGKSTLLRAVMGLEKPQNGEIIL HGNNIASLKPEELARNISFVTTDKVRIANLRCRDVVALGRAPYTNWLGQLQGEDKEKVDN AMHLVGMDSYAEKTMDKMSDGECQRIMIARALAQDTPVILLDEPTAFLDLPNRYELCLLL RKLTQKEGKCILFSTHDLDIALSLCDTIMLIDNPQLYSLPTNEMITSGHIERLFHNESIT FDAQAMKIRIR >gi|226332196|gb|ACIC01000124.1| GENE 51 62191 - 63546 777 451 aa, chain + ## HITS:1 COG:lin2192 KEGG:ns NR:ns ## COG: lin2192 COG0534 # Protein_GI_number: 16801257 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 12 450 13 443 443 199 30.0 1e-50 MKDSIDFGSMDISTLFRKLLVPTVLGMVFSAIFVITDGIFVGKGIGSDALAAVNITAPLF MIATGIGLMFGVGGSVVASIYLSQGKRKAANINITQALVFSTFIVLVMSALCFCFVGPLG CLLGSSDRLLPLVIEYMIWYLPFLVFYELLSTGMFFIRLDGSPNYAMMCNAIAAIFNIIF DYIFIFEFGWGMMGAAFATSLGTVVGGLMTVIYLTRFSRNIHLHRIKLSLKSLMLTLRNG GYMIKLGTSAFISEASIACMMFLGNYVFIHHLGEDGVAAFSIACYFFPIIFMVYNAIAQS AQPIISYNFGAGQPDRVRKALHLAIRTALICGISFFIITFLCRQNIVSLFIDRSCPAFDI AVNGIPYFGVGFIFFAANMIGIGYYQSIRRGQRATIITLLRGVVFMLIGFFVLPLVLGVP GIWLAVSLAELLTTLYIIGIYFKDRFMVHRL >gi|226332196|gb|ACIC01000124.1| GENE 52 63560 - 64123 371 187 aa, chain + ## HITS:1 COG:all4541 KEGG:ns NR:ns ## COG: all4541 COG0664 # Protein_GI_number: 17232033 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Nostoc sp. 
PCC 7120 # 4 184 8 184 193 70 29.0 1e-12 MDAFIKKLCDKYKVLESDVQTLLGCMEEVHFKKRDLIVREGTKNSNLYFIKTGIWRAYYH KDGADATIWFASDGEAAFSVWGYADNAYSLINIEAMCDSVAYSISRATLNQLYHSSIGLA NLGLRLMDHQLLLQENWLINSGSPRAKERYLTLIKETPELLQYVPLKYIASYLWITPQSL SRIRASL >gi|226332196|gb|ACIC01000124.1| GENE 53 64150 - 64884 497 244 aa, chain - ## HITS:1 COG:BS_yvoA KEGG:ns NR:ns ## COG: BS_yvoA COG2188 # Protein_GI_number: 16080556 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 5 242 3 241 243 117 33.0 2e-26 MKILVNHNSAIPLYLQIEEQLRTIIKGEEYKQGKMLPNEIDLSKQLGISRNTLRQAINNL VNEGLLMRKKGVGTVVVNSAVCSKAQNWMSFTQEMKTLGIVPMNYELHVSWTKPTDDIAL FFNAPDVKRILKLERLRGNQEYPFVYFISYFNPKIGLTGSEDFSRPLYEILERDYHAIAK VSKEDISARLADKYIAGKLEIKVGDPILVRKRFVYDPEGCPLEWNIGYYRADSFIYSLEF ERNI >gi|226332196|gb|ACIC01000124.1| GENE 54 65014 - 67707 1058 897 aa, chain + ## HITS:1 COG:SPy0258 KEGG:ns NR:ns ## COG: SPy0258 COG1940 # Protein_GI_number: 15674438 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Streptococcus pyogenes M1 GAS # 1 266 1 257 312 86 28.0 2e-16 MKYYLGIDIGGTHIKGGIVNLLTNDIHQNILSHEELNATDSTSSIITKVRKVITDIQACM PLSKLGGIGIAMPGPCDYAKGIVAIYGVPKFQSLFGLNLKEEIKKVSDLNTVFINDASAY ALGEYYAGAAKDTSRSIIVTIGTGLGSTFLENDTVLNELTEGIPEHGYLYNIPYRDGMAD DFFSTRWFVNTWNMLFPDKKVTGVKEIALRASNGDNNAQSLFENFASNFVEFITPFLLNF KPEKLIIGGNIAKASDFFLDNIQFQLKKLNLITKIDICKLWDMSPLIGSAIYTSNILENM ENTKEKRHTQQFIAPTNSTATPSGEYDIYPAFPLGKGKIGKGINQLADWIEKHSQIKIDG YIGVFWDELIIKLGEELRKRGKNVRFFHSSVAMKDPQTIEKMIAPYLGGDNPLFGTITDK HLVNWFDENKLNSIQPDPEADLNIFIGTGAALSQWKAPLIYIDIPKNEIQFRMRAGAINN LGLDYRKDNQQAYKQLYFVDWIVLNKHKKQCLPLIDLLIDGQREWDELLMISGNDLREGL HKMSRNFFRVRPWFEPGAWGGQWMKNHIQGLNKEVNNLAWSFELMVLENGLMLESDGYRL EVSFDFLMYSDYQNILGECSETFKYDFPIRFDFLDTFDGDNLSIQCHPRPRYIQEHFNMP FTQDETYYILDCKNSPCVYLGFQDNIVPEEFQYTLERSQQKATKVEIERFVQKHQAKKHD FFLIPNGTIHASGKDCVVLEISSAPYIFTFKMYDWIRMGLDGKPRPLNIQHGMNNLYFER KGEKVIQELICHPYIMKENQECTIEHLPTHKEHFYDVYRYTFKDRIQMNTENTCHVCMIV 
EGDSVCIETEDGMKQRFNYAETFVIPAAARSYTIINENPDKRIMLVKAFVKKEVTLK >gi|226332196|gb|ACIC01000124.1| GENE 55 67739 - 70882 1117 1047 aa, chain + ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 290 1042 49 766 790 286 30.0 2e-76 MKRTIVQIIFFIYCIANVYPQYSTIWQLGKTDNSSKEFALAPDGKDRFIISGFGDNKKYF YAGEHTPADFPYIIPGPTAEWAGSSYWAGQCRIQLPILIKLSDVNPLKKYQWNIFIENVE YEDCMFLRLEVNGKNYDSPIKPGTKQLTYSIQPGILKEGYNKIVMQLFNGKSLTFDAICL NGPQETQINKIGDTPIISMKMADYELEQGKTRTQPLLLKTITKKSGTLKIQINQKKIFKQ VEEGENIYEIPTGKLKDQSKIKVKISTEGQTVATQEFIRSNQQLRRSIDYVDQFAGSSGS RWMIGPGPWMPFGMVKLMPDNEDAHWKAGYEYNVENIMGFSHIHEWTMTGLLMMPTTGDL KIQPGTEKQPDYGYRSRINKKTETARIGYYSVNLTDYNIQAELTATTRSSLQRYTFNKAE QPRILVDFFFPAEYDWNLDDVYVKKVSDTEIEGWTLNDCRSTGYHGVQRYKLHFVMQFDK PFKTMNGWIRNKVYSQIEQLHKSNMKSRQVFTVENNSQDKLDAGIFLDFNLNTGDDVMVR TGISLVSIDNARLNLEEEIARPFGWNFDKVVTNQQDTWETLFQRVSITTDNYLLKQKFYT NLYRSISPRTIWNDVNGEWIDMNGNKALIDKPGKSVYGGDSAWGMHWTLGPFYNLLYPEY MSNWIYTYEQFYRRGGWLPNGNPGMKYFRIMIGNPALPLIVSGYQHGIRDFDSQLMYRAL IHQQTATMINYPGGGQVGNESYPDYITKGYVPLYDDAWDWNSPHYQSYVSNTMEYSYQDY CAAQYFDALNKKDDFNTFMKRSDNWKNIFDPSTGYVRPRRPNGEWIENTNPYHAPGFCEG SAWQFTWYVPHDVKSLINLIGERQFIDRLNAGFATSEKVSFNALGDNMGAYPINHGNETN MQAAYLFSYTSEPWHTQKWARAIQEKYYGMGPRDAYPGDEDQGQMSSWYILSSIGLFQMD GGCSKDPVWLLGSPRFDRVEILLDNTYYSGKKLIIKAENVSKDNCYIQYVRFNNKRLSNN YIEWSKLKKGGTLHFIMGDRPNPNAFK >gi|226332196|gb|ACIC01000124.1| GENE 56 70893 - 72053 362 386 aa, chain + ## HITS:1 COG:NMB0535 KEGG:ns NR:ns ## COG: NMB0535 COG0738 # Protein_GI_number: 15676441 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Neisseria meningitidis MC58 # 14 379 39 423 426 93 26.0 5e-19 MKNKSKILPVLFGFFIMGFCDLVGVSVTYAKDYFSWSETQAGFLPSMVFIWFLLISIPIS IWMDKKGRKTISLIGLLSTFIGMLLPLLTFTQTACYIAFALLGIGNTILQVSLNPLLTNV IAEGKLSSIMTAGQFIKALSSFVGPIVAGFCSVYFNNWILMFPIFAAITLISGIWLFFTP 
INEKDDKRLTSSFYQVISLLKNKTICLLFGGIICIVGLDVGMNVFTPKLLIENAELTKEI ASYGTSWYFAARTLGTLCGVILLAKFSEIYYYRINMFIVLIALSCIYWIHSQYIILLLVC IIAFAMSSIFAVIYSLALHTLPYKTNEISGLMITGISGGSIIPPLMGVCADYTESQSGSI LVMLICVIYLILCSFHKFKTIHTKPH >gi|226332196|gb|ACIC01000124.1| GENE 57 72263 - 75442 1762 1059 aa, chain + ## HITS:1 COG:no KEGG:BT_2107 NR:ns ## KEGG: BT_2107 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 1059 1 1050 1050 2070 100.0 0 MTYFLKSQSMKKLLCNLPRFVLWIAFLLIIESYVTLQAQTSSSNNSSKAISGKIVDETDL PIAGATVQVKGSTKGAISDIDGLFTISSEPGSILIFSFLGYQSIEYKIDKVPQTIKLLPK VDELDEVTVVGFAKQKKESVIGAISTLKPERLKIPTSNLTQALGGNVAGIISYQLSGEPG NDNVEFFIRGVTTFGYSKSPLILIDNIESSTTDLSRLSADDIAAFSIMKDATATAIYGAR GANGVILITTKQGKPGKVKISARVEGSLSAPTKRIDMVDPISYMKYHNEAVLTRNPLAVR PHTDEKIASTKAGKNPYVYPAVDWYNELFNDNVFNSRANINMSGGSETARYYVSLGISDD NGIMKVDKRNNYNSNIDLKKIYVRSNIDIDVTKTTTFSVKFSGNFDDYTGPIVSGNDLYR RAMATSPVLFPKYYMPDENHTSTSHILFGNAGDGNYINPYAEMIRGYKQYTNSVISAQAE LNQKLDFITPGLSIKVFASTTRNSYSGQQRSTSPFYYSISYYDKLTDTYTLRELNPNGGR EDLDYTAEGNQVSSSYYYEASLNYNRTFREKHNITGLLVGVLRENKSGGASSIQTSLPAR NIGLSGRFTYAYDGRYFAEANFGYNGSERFAKNERFGLFPSIGAGWMISEEKFWNQDLKN IINKLKLKGTYGLVGNDNIGAWNDRFFYMSEVNLNGSGANFSWGQDFERNVGTIDMIRYR NDKIKWEIAQKMNLGLEVGLFNIFDIQFDYFTEYRKNIFGERQTIPYEMGFSAPLFANKG EASSHGFEVQVDANHAFNKNFWLGVRGNFTYATGKYEYVEEPYRPYPWLSHIGKRIEQSY GLVAERLFIDEEDIANSPVQTYGEYMPGDIKYKDINNDGIINDEDIVPIGFAKSPEIVYG FGVSMGYKNFDISCFFQGAARSSFFINSKATNPFMQMDLGGNNLNGMMQAYADDYWSESN RNIYATFPRLSTTDIANNNRNSTWWLRDGSYLRLKSVEIGYTLPKRLTSHLGMSNLRIYA SGLNLFAWNRFKLWDVEQGNDGMQYPIQAVYNFGINLSF >gi|226332196|gb|ACIC01000124.1| GENE 58 75450 - 77378 1109 642 aa, chain + ## HITS:1 COG:no KEGG:BT_2108 NR:ns ## KEGG: BT_2108 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 34 642 1 609 609 1239 100.0 0 MKYKHLFTSTLLAILITSCNYLDIVPDNIATLEMAFNTRANAEKYLATCYNFVPAAANLD HNSGLICGGEMWYLKPESHYYKNITSFNIARGMQNSNDPLLNYWDGGQGGYNIFRGIHEC NIFIDNIDRVPDMRMEEKNKWKAEVMTLKAYYIYYLMLHYGPIPLLKENISVEADLDMMQ 
LERETINDIVTYTTDLLDEAMNIPSGLTFNVTSPRTELGRITLPIAKAIKAKLLILAASP LFNGNTMYAEFVNSEGIPFFDQEYKKEKWEAAAAACLDAINTAHEAGASLYKFDEKLNYT PSSTLQQELTLRNTITGRFNSEIIWALGSNATDGIQNLCQAHLTSYSLGNRMIRSCSQLV PSLEIVEQFYTANGVPMDEDKDYEYENRYKLTTTPEDDPIHFTPNYTTIKMHLNREPRFY AYIGFDGGKWFSIECCNDNETSSYPIKSKKGDIAGVSDELYSPTGYFTKKLVSYQNVNTS ATNINYVYSFPIIRLSDLYLLYAEALNELNETPTTDVYYYVQQVRNKAGLDEGGSLVNTW RNHSNRPDKPTTQRGMREIIQKERMIELAFEGHRYHDVRRWKLGSTYLNTQIKGWSVLEQ DPEYFYNVRVLFTRKFMPRDYLWPIKTNSINRNPKLVQNPQW >gi|226332196|gb|ACIC01000124.1| GENE 59 77402 - 78601 819 399 aa, chain + ## HITS:1 COG:no KEGG:BT_2109 NR:ns ## KEGG: BT_2109 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 399 1 399 399 826 100.0 0 MKNIAFILLILAAVCFANCAGDYEAEPRGTNDGKAPAPITNISVKNVNGGAIIKYTPSDD VDLLYVKAIFTNTRGEQQEVKSSMYVDSLVIEGLGDTQKREVKLYSVDRHENYSTPAIVT IEPLTPAVVQTQESLKVTESFGGFFLDFNNAARSALSFYIYKWVDEHNEYEIHDVYSSQQ EEGRLTIKGLEHKLLKLAIFVRDKWENTSDTLYAEITPWKESLLDKTKFQLVKVMDDVSW DYHEGYHSRLWDGQIGGWNWGHTEYPIPFPHAFSIDLNVNVRLSQFQFWQRLDSQDLLYA HGAPKYFKLYGCEEGKDLQNPDNWVVLFDGEMKKPSGGAFGDALTQEDIEEAQRGHVFTL NQDVPFIRYFRFQSLMSWSGMETSVMSEITLWGDIQGEE Prediction of potential genes in microbial genomes Time: Thu May 12 02:34:01 2011 Seq name: gi|226332195|gb|ACIC01000125.1| Bacteroides sp. 1_1_6 cont1.125, whole genome shotgun sequence Length of sequence - 37215 bp Number of predicted genes - 25, with homology - 24 Number of transcription units - 16, operones - 7 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 1140 656 ## BT_2110 hypothetical protein + Term 1154 - 1222 13.6 + Prom 1143 - 1202 3.7 2 2 Op 1 . + CDS 1247 - 4375 1756 ## COG3537 Putative alpha-1,2-mannosidase 3 2 Op 2 . + CDS 4399 - 5436 687 ## BT_2112 hypothetical protein 4 2 Op 3 . + CDS 5480 - 8905 1741 ## COG0383 Alpha-mannosidase 5 3 Tu 1 . - CDS 9084 - 9827 388 ## BT_2114 hypothetical protein - Prom 9870 - 9929 6.4 - Term 10003 - 10052 2.1 6 4 Op 1 . 
- CDS 10255 - 10806 428 ## BT_2116 hypothetical protein 7 4 Op 2 . - CDS 10809 - 10961 113 ## - Prom 11066 - 11125 4.0 - Term 11185 - 11238 2.5 8 5 Op 1 9/0.000 - CDS 11296 - 12663 1436 ## COG1538 Outer membrane protein 9 5 Op 2 27/0.000 - CDS 12660 - 15800 2733 ## COG0841 Cation/multidrug efflux pump 10 5 Op 3 . - CDS 15797 - 16867 1019 ## COG0845 Membrane-fusion protein - Prom 16890 - 16949 5.4 + Prom 16830 - 16889 7.4 11 6 Tu 1 . + CDS 17049 - 17915 538 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 17783 - 17820 3.1 12 7 Tu 1 . - CDS 17896 - 19248 1255 ## COG0534 Na+-driven multidrug efflux pump - Prom 19352 - 19411 4.4 + Prom 19271 - 19330 3.5 13 8 Op 1 . + CDS 19354 - 21084 2126 ## COG1190 Lysyl-tRNA synthetase (class II) + Prom 21086 - 21145 2.4 14 8 Op 2 . + CDS 21179 - 22174 877 ## COG0240 Glycerol-3-phosphate dehydrogenase + Prom 22194 - 22253 3.6 15 9 Tu 1 . + CDS 22306 - 23643 1647 ## COG0166 Glucose-6-phosphate isomerase + Term 23661 - 23710 13.1 - Term 23643 - 23703 18.4 16 10 Tu 1 . - CDS 23745 - 24245 529 ## BT_2125 hypothetical protein - Prom 24383 - 24442 5.5 + Prom 24361 - 24420 5.9 17 11 Op 1 . + CDS 24440 - 25246 761 ## BT_2126 hypothetical protein 18 11 Op 2 . + CDS 25276 - 25950 678 ## COG0637 Predicted phosphatase/phosphohexomutase + Term 26103 - 26141 -0.4 19 12 Tu 1 . + CDS 26184 - 28823 1748 ## COG2605 Predicted kinase related to galactokinase and mevalonate kinase + Term 28849 - 28890 6.2 - TRNA 28886 - 28958 51.4 # Arg CCG 0 0 + Prom 28977 - 29036 8.0 20 13 Op 1 . + CDS 29178 - 30215 1040 ## COG2502 Asparagine synthetase A 21 13 Op 2 . + CDS 30221 - 30883 521 ## COG0692 Uracil DNA glycosylase + Term 30923 - 30966 5.3 - Term 30911 - 30954 1.5 22 14 Op 1 . - CDS 31005 - 33716 2107 ## BT_2133 hypothetical protein - Prom 33741 - 33800 5.2 23 14 Op 2 . - CDS 33807 - 34343 444 ## COG1418 Predicted HD superfamily hydrolase - Prom 34391 - 34450 5.4 + Prom 34189 - 34248 5.7 24 15 Tu 1 . 
+ CDS 34419 - 35582 889 ## COG3876 Uncharacterized protein conserved in bacteria + Term 35650 - 35688 -0.8 - Term 35473 - 35504 -0.6 25 16 Tu 1 . - CDS 35571 - 36929 1094 ## COG1409 Predicted phosphohydrolases - Prom 36979 - 37038 5.4 Predicted protein(s) >gi|226332195|gb|ACIC01000125.1| GENE 1 1 - 1140 656 379 aa, chain + ## HITS:1 COG:no KEGG:BT_2110 NR:ns ## KEGG: BT_2110 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 379 31 409 409 764 100.0 0 DGEQIYVGKLADVNIQPGFQRMMIKGSMKYLATAKTCIIELVGYDKVFTTDIDRTQPEFS YEIKDVEEGNYYVKITTKDKEGNTSLSETYNVDVYGTEHIATYYPKRITDIQFVIADNSL NLIWNQADNVVEAIVKYYDSNEQLLTLTVKGDAASTNLPNWKATSILTVETRILPSETAL DIVSLPISEYTLPDDPIVEIPRTNFANANMPSDVVNGYGGSVEKLWNNDTWNYDSGYHAG DNQGVPHHLTIDMGLTATITECELFFTGYERTDWMPRLFQIWGIEDVEDLESHEPDVPSI NPAWEESAKAKGWKQLTTKNISVSGWQPSIKFACDKEHENIRYIRYRIVESLSAPHVGMG AYGSASEIKFWGKKVSLLQ >gi|226332195|gb|ACIC01000125.1| GENE 2 1247 - 4375 1756 1042 aa, chain + ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 303 1040 46 770 790 267 29.0 1e-70 MKRIPLSLFTFLFLLSGLCAAPTKTVWQIGEKNHSASELALSPDKYKEFLANDFGWEDRF YIIGLSSPQKDFPYVLPGPADHWGGTSYTAGIRTHVLNILFGMKEVPETDSWKLIIDLLD THHELPALFKITINGKSWKYALPKGSGKSIDGDYSQIKPHTIEIPLNSGVIRKGSNSIAL TSLEGSWILFDDIRLMGPDNAELNEVNKSVYLRDVKAADFQTTSPVAQPLLVDIEHLSGH PLLEVKVDGKKILEQRLEKGRYILEAPMPVVKSPKTSHYIISADGAILDKGMIRRAPHNT ITLADYIDTRMGTAHSRWMIAPGPWMPFSMVKLSPDNQDSGWQSGYDPSFESIGTFSHIH EWTMAGLGIMPANGPLKTEIGSQSSLVKDANSYRSAIDKTSEETKVGYYKVDLTDYQIKA ELTATSRCGFQRYTYPQDKDARVMIDLKIPSEYDYQIVEGSVKQTGARRIEGFSKQLSKN VWSADADQNYTIYFVIEFNKDIKKFGGWHDHTLWETDTMTAHYPQRFGCYAEFDTTDHPE VMVRSGISYVDMAGASNNLSNEITEPFGWNFEAVHKHQSDSWNNILNRVRIYSNDYREKV RFYTNLYRAFCRNTFSDADRRWVDAAGNIQKLDDPDAVALGCDAFWNTFWNLNQVWNLIA PEWSSRWVKSQLAMYDANGWLAKGPSGMKYIPVMVGEHEIPLLVSTYQMGIRNYDAEKMF 
RAIVKMQTTPAQRVANGFAGNRDLETYLQHQYVPADKGRFSNTLEYSYDDWTVSQLAKAL GKEEYYRTFSNRGNWWKNAINPATGYAQMRYSNGEWEKNFDPFKSGANHHYVEGNAWQLT FFVPQNVPALTEIIGKELFIDRLSWGFKASEPWRYNAPGDQYWDFPVVQGNQQSMHFAFL FNWAGQPWQTQRWSRSIIDRYYGFDTSNAYLGDEDQGQMSAWFVMAALGLFQTDGGCNAE PVYELGSPLFEKIKIDLGKHFGRGKQFIIKASNTSRENKYIQSAKLNGKKLNSFKFPAKE LLKGGTLELEMGDRPNKNWGIK >gi|226332195|gb|ACIC01000125.1| GENE 3 4399 - 5436 687 345 aa, chain + ## HITS:1 COG:no KEGG:BT_2112 NR:ns ## KEGG: BT_2112 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 345 1 345 345 701 100.0 0 MKKILLSLITAAVLLSCQSSRDTSHKYTAITPGVEWLDTNNEKINAHGGGILYHEGIYYW YGECKSDSTYWNPNVPGWECYRTEAGGVNCYSSKDLLNWKYEGLVLQPEISDTQSDLHPS KVLERPKVIYNDKTKKFVMWLHVDSDDYNKAAAGVAVSDSPTGVFTYLGSMHPNGQISRD MTLFKDDDGKAYHIYSSEQNSTLYISLLSDDYLKPSGTFTRNFAGKFREAPAVFKHNGLY YVISSGCTGWSPNEAEYAVATQILGPWTVKGNPCVGKDADKTFYGQSTHVLPIQGKKDAY IAMFDKWNKTDLIHSTYIWLPIQFREDGSMAIQWLDSWSPDTYEF >gi|226332195|gb|ACIC01000125.1| GENE 4 5480 - 8905 1741 1141 aa, chain + ## HITS:1 COG:MT0676 KEGG:ns NR:ns ## COG: MT0676 COG0383 # Protein_GI_number: 15840051 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Mycobacterium tuberculosis CDC1551 # 296 798 94 593 1398 64 20.0 1e-09 MRKIKIKEIVCIIGLLGTLTAHSQNTVWQIGNKDKSAKEFKFYKNDYRNYINDVYLYEVG RNHSADFPYFLPGPSDAWAGGISGQAVIGFSLAEAPSPTTQIQLRLLFAETHPSSPPRLE VRLNSFEKSIQTPKGTNPEYLDTKETTSKELSVTLDIPADEFRAGNNTLTIRNISGSWVA FDQIKLTSDSPLQTNNLTTKASLVSSAFSPVLVVGKKKELRQPVKIKAVNWHHKKRPVEL QVGNFPPERYVLSPGINNIETSIPEVTAPCELPIILKEQGKTVSKINEKATPVKKWTVYL VQHTHTDIGYTKPQTEILSEHLRYIDYAIEYCEQTKDYTNDAKFRWTCESAWAVDEYLKN RPEEQVNKLKKYIAEGQIEVASMYFNMSEIVDENSFKTFLQPVKEFRKHGIPSALAMQND VNGIAWCLADYLPDLGIKYLWMGEHHYKSQVPFNMPTVFQWESPSGKPILTYRADHYNTG NFWGIEQGDIQKTEPKLLHYLSELERKHYPFDAVGVQYSGYFTDNSPPSIIECKLIREWN EKYAYPKLRSATASEFMDYVTERYGDKIPAYRAAYPDWWTDGFGSAARETAASRKTHSDM TAVEGLLSMAVLKDKCLPQATHQQIEHIHESLLFYDEHTFGASESVSDPLCENSQVQWGE KSAYAWEAVKRTQMLYETSVGLLQGDLRRGKNPTLTIFNTLNWKRSEMLTVYIDFEVIPR 
NQFFEITDFQGHSLKVQPIRYRREGCYYAIFAEDIPPMGYKTFEIVFKQPSTDTGTITIN NASIENHFYKLQLNPDKGTIQSLYDKELNCELVDSSSPWELGAFIYEKLGNRDQLAQYRL DDYNRTGLSECRIIDSSNGPIYQRILLQGKSPGTDEAFGVNLEIKLYHDTKRIELEYAIK RLPETDPSGIYVAFPFQLPEGKLAFDVQGGTVISGKNQLEGTSTDWNTVQNFVSVQNNQA QIIIGSSLIPLYQLGDINLGKFQYQKQYERPHVYSWVMNNYWNTNFKASQEGEFRWSYHL TSSVDPSNNLATKFGWSSRIPLYARVMPTGTENHKPTEYSAFHFEKDNFLMTSCTPSKEE GYILLNIREIDGKRTNLRILNEDGQTIPFTIVNAIEEKLENETSEHIFEGYQNKFIKIKR F >gi|226332195|gb|ACIC01000125.1| GENE 5 9084 - 9827 388 247 aa, chain - ## HITS:1 COG:no KEGG:BT_2114 NR:ns ## KEGG: BT_2114 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 247 1 238 258 448 100.0 1e-125 MSYWLTTKQMSERHGVQEATLRNWANLGYITSCRMGNQLFLDDESLTAYLEAHKRLGLQA DYLAKIVEEKKLERDFIISRYDDLLYVLRTQKTCKPLYEIIIRELSQLIVHPGARDIFYS ISMGESIEKVAGRHRITYDRALQIYNSHLRGLKVRKNVLATYRKHIIDARFQSLADKSKN INLNQEERVLQLSVGKVADTRLTNVLYKEEIRTVGQLLELVSGKGWRWLLKMEGVGRISY DRLLSNL >gi|226332195|gb|ACIC01000125.1| GENE 6 10255 - 10806 428 183 aa, chain - ## HITS:1 COG:no KEGG:BT_2116 NR:ns ## KEGG: BT_2116 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 183 1 183 183 340 100.0 2e-92 MDRQLIETTYEMRKLDTSEKIESFDCGDTDLNDFILNESFLYREALLAVSYVLENKVNQR SVAYFSLANDRISLSDFENKTEFNRFRKHRFVNEKRLKSYPAAKICRLGVDLSVKGQSIG SFLLNFIKSYFIIENKTGCRFLTVDAYAAAVPFYMKNGFVPLNDEDVDSATRLLYFDLND IVK >gi|226332195|gb|ACIC01000125.1| GENE 7 10809 - 10961 113 50 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MARPIKETPILFGEDARRFEERMKEKRRETSEQRAKRLKDYELAMKIFKK >gi|226332195|gb|ACIC01000125.1| GENE 8 11296 - 12663 1436 455 aa, chain - ## HITS:1 COG:Cj0365c KEGG:ns NR:ns ## COG: Cj0365c COG1538 # Protein_GI_number: 15791732 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Campylobacter jejuni # 28 429 64 443 492 70 24.0 6e-12 MKKEFLVAVSVLFGMAASLSVSAQEKLSLKQCREMALKYNKEMAAADKQTEAARLLSLSY 
KANFFPNFTASGTGLYSTADGSLGIRGGNLPVFLPDPVTGELATSGYAYFPGLNLDYKVG TVYTGGVQVEQPLYMGGKIRAAYKMSLLGKEMAHLNEELTASGVILNTDKAYVQLVKAKE MKKVAEKYHVLLNELFRNVKSAHQHGMKPQNDVLKVQVKLNESELALRKADNALRLAGMN LCHYIGRPLTAGIDISDEFPEVEQEWKTQVADISARPEYGILNKQIAMAEQQVKLNRSEL LPRVGVKGSYDYFHGLEINDETLMKKGSFSVFLNVSVPLFHFGERMNKVRSAKVKLEQTR LEQENANEKMLLELTQATNNLDEARLECELSERSLEQAEENMKVSGKQYEVGLETLSDYL EAQVMWQQAYQTKVDAHFQLYVNYVAYLKAAGQLQ >gi|226332195|gb|ACIC01000125.1| GENE 9 12660 - 15800 2733 1046 aa, chain - ## HITS:1 COG:VC1673 KEGG:ns NR:ns ## COG: VC1673 COG0841 # Protein_GI_number: 15641677 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Vibrio cholerae # 9 1042 26 1034 1037 622 35.0 1e-178 MIDISKWAFENKKLIYFLIAVLIVGGAYSSYEMSKLEDPEVTVKVAMVVTTYPGASAHQV ELEVTDVLEKNIRTMGNIDNVESYSYNDLSLIQVELKTTVKEADVEQCWDLLRRKVANAQ AELPEGAATPLVKDDFGNVYGMFYALTGDGLQDRELSDYAELIKREVSELDGVERVDLYG KREECINISLLQDRMANLGVKPAEVLATLNGQNKTTYTGYYDNGDNRIRVTVSDKFKTVE DIGRMLIQGHDDDQLRLSDIARIEKDYENPTRNELFYDRQRAMGILIAASSGSDIIKIGK RVEQKLDELKAERLPTGVECHKVFYQPERVSGSLGTFILNLIESVIIVVVILMITMGFKS GVIIGISLVVTVFGSFLFLYFVDGTMQRVSLASFVLAMGMLVDNAIVIIDGILVDLKAGK SRMEAMTAIGRQTAMPLLGATLIAIIAFLPIFMSPDTAGVYTRDLFIVLAVSLLLSWVLA LVHVPLMANRILHPEVSKEASTTGKRVYEGKIYAVLRSLLRFSLAHRWSFVFAMVALVAL SAFSYRFMKQGFFPDMVYDQLYMEYKLPEGTNYTRVTRDLEEIEAYLKTRPEVTHVTASV GGTPGRYNLVRNIANPSLAYGELIIDFTSPDALVDNMPEIQQYLSSRYPDAYVKLNRYNL MFKKYPIEAQFTGPDPVVLHQLADSARSIMESCPDVYLITTDWEPQIPVLTIEYDQPTAR AIGLSRNDVSLSLLSATSGIPIGSFHEGIHRDNIYLRCLDEQGRPIEDLDNTQIFSSLPS LNGLLTQEMMMKLKTGTLSKDELVETLMGTTPLKQISKRIDVQWEDPVVPRYNGQRSQRV QCSPVPGIETEKARQSIAAQIEQIPLPDGYKLQWQGERNASTKSMQYLFKNFPFAIILMI SILIMLFKDYRKPIIIFCCIPLVFVGVVAVMLLTGKTFNFVAIVGTLGLIGMIIKNGIVL MDEITLQISQGVEPVTALIDSSQSRLRPVMMASLTTILGMIPLLPDAMFGSLAASIMGGL LFGTLITLLFIPILYALFFHIKKTDK >gi|226332195|gb|ACIC01000125.1| GENE 10 15797 - 16867 1019 356 aa, chain - ## HITS:1 COG:PA3677 KEGG:ns NR:ns ## COG: PA3677 COG0845 # Protein_GI_number: 15598873 # Func_class: M Cell wall/membrane/envelope biogenesis # 
Function: Membrane-fusion protein # Organism: Pseudomonas aeruginosa # 41 350 42 360 367 127 31.0 3e-29 MKRFNFLLTGILVVLLASCSQRTQITDSKPTVKIDTVLSAGGQTALQFPGRVKAAQDISL SFRVSGTILRMYVNDGTYVRKGQLLAELDPADYQIQLDAAEAEYQSVKAEAERVMALYKE NVTTPDANDKAMYGLKQITAKYNHAKDQLEYTRLYAPFSGYVQKRLFDSHETIAAGMPVV SIISEGTPEVEINLPAVEYIRRQQFTGYTCTFDIYPGQEYALDLINVTPKANANQLYTMR LQLKKGEAQALPSPGMNTMVTIECTEGEVRHLSVPTGAVLRDKGMASVFLYDASSRTIRR CDVTVTRLLGSGRCLVTAEGLKPGDLIVASGVHHVKDGEQVEPLLPASKTNVGGLL >gi|226332195|gb|ACIC01000125.1| GENE 11 17049 - 17915 538 288 aa, chain + ## HITS:1 COG:lin0157 KEGG:ns NR:ns ## COG: lin0157 COG2207 # Protein_GI_number: 16799234 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 173 277 165 271 277 61 27.0 2e-09 MEQQIPFSYVIISPENATEKEGYMISTIDKCGFFFCQKGEVEVALNDKSYLISKGSVCIY MTGSLLRIQRISKDIKGIMLEVDLNYIIPIVNKIVNSENLLYLRENPCFSITEYQYNYLE QLIKALQQRMDIKAHDIPLQRQHLISELIKSWGQTLCYELLNVYFTNQPLKPLSQDKKDK IFQNFVITLFRYYQQERDVTFYASKQYLSSRYFSAVIKEKSGSTALQWIVQMVITEAKQL LENSDLTIKEIATKLNFPTQSFFGKYFKQYVGVSPKEYRNKQIRQLPF >gi|226332195|gb|ACIC01000125.1| GENE 12 17896 - 19248 1255 450 aa, chain - ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 7 430 4 428 448 322 39.0 8e-88 MTGQKTPTALGTEKIGKLLMQYAIPAIIAMTASSLYNMVDSIFIGHGVGAMAISGLALTF PLMNLAAAFGSLVGVGAATLVSVKLGQKDYDTAQRVLGNVLVLNIIIGLAFTVLTLIFLD PILYFFGGSEATVGYARDYMVVILWGNVITHLYLGLNAVLRSAGHPQKAMYATIATVVIN TILDPIFIYGFGWGIQGAAIATITAQVIALLWQFKLFSNKDELLHFHKGIFRLKKKIVFD SLAIGMAPFLMNLAACFIVILINKGLKQHGGDLAIGAFGIVNRLVFLFVMIVMGLNQGMQ PIAGYNFGAKQYPRVTQVLKITIYAATIVTTIGFLMGMFIPQLAVSIFTTHEELVNISAK GLRIVVMFFPIVGFQMVTSNFFQSIGMASKAIFLSISRQVLVLIPCLLILPRFYGQLGVW ISMPISDLIASLIAGGMLWWQFRQFRKATA >gi|226332195|gb|ACIC01000125.1| GENE 13 19354 - 21084 2126 576 aa, chain + ## HITS:1 COG:CAC3197 KEGG:ns NR:ns ## COG: CAC3197 COG1190 # 
Protein_GI_number: 15896444 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Lysyl-tRNA synthetase (class II) # Organism: Clostridium acetobutylicum # 13 501 33 510 515 455 49.0 1e-128 MNILELSEQEIIRRNSLNELRAMGIDPYPAAEYVTNAFSTDIKAEFKDEDEPRQVSVAGR IMSRRVMGKASFIELQDSKGRIQVYITRDDICPGEDKELYNSVFKRLLDLGDFVGIEGFV FRTQMGEISIHAKKLTVLAKSIKPLPIVKYKDGVAYDSFEDPELRYRQRYVDLVVNEGIK ETFEKRATVVRTLRNALDEAGYTEVETPILQSIAGGASARPFITHHNSLDIDLYLRIATE LYLKRLIVGGFEGVYEIGKNFRNEGMDRTHNPEFTCMELYVQYKDYNWMMSFTEKLLERI CIAVNGSTETVVDGKTISFKAPYRRLPILDAIKEKTGYDLNGKSEEEIRQICKELKMEEI DDTMGKGKLIDEIFGEFCEGSYIQPTFITDYPVEMSPLTKMHRSKPGLTERFELMVNGKE LANAYSELNDPLDQEERFKEQMRLADKGDDEAMIIDQDFLRALQYGMPPTSGIGIGIDRL VMLMTGQTTIQEVLFFPQMRPEKVIKKDPAAKYMELGIAEDWVPVIQKAGYNTVADMQDV NPQKLHQDICGINKKYKLELTNPSVNDVTEWINKLK >gi|226332195|gb|ACIC01000125.1| GENE 14 21179 - 22174 877 331 aa, chain + ## HITS:1 COG:BS_gpsA KEGG:ns NR:ns ## COG: BS_gpsA COG0240 # Protein_GI_number: 16079340 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Bacillus subtilis # 6 314 3 312 345 153 32.0 5e-37 MKLPGKIAIMGGGSWATAIAKMCLAQEDSINWYMRRDDRIADFKRLGHNPAYLTGVKFDT KRITFSSNINDVVKESDTLIFVTPSPYLKAHLKKLKTKIKDKFIITAIKGIVPDDNVIVS EYFTKEYGVPPENIAVLAGPCHAEEVALERLSYLTIACPDKDKARIFARRLGSSFIKTSV SDDVAGIEYSSVLKNVYAIAAGICSGLKYGDNFQAVLISNAIQEMNRFLNTVHPLNRNVD DSVYLGDLLVTGYSNFSRNRTFGSMIGKGYSVKSAQIEMEMIAEGYYGTKCIKEINKHHH VNMPILDAVYNILYERISPMIEIKLLTDSFR >gi|226332195|gb|ACIC01000125.1| GENE 15 22306 - 23643 1647 445 aa, chain + ## HITS:1 COG:BH3343 KEGG:ns NR:ns ## COG: BH3343 COG0166 # Protein_GI_number: 15615905 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate isomerase # Organism: Bacillus halodurans # 2 445 5 449 450 487 53.0 1e-137 MISLNIEKTFGFISKEKVSAYEAEVKAAQEMLEKGTGEGNDFLGWLHLPSSISKEHLADL NATAKVLRDNCEVVIVAGIGGSYLGARAVIEALSNSFTWLQDKKTAPVMIYAGHNISEDY LYELTEYLKDKKFGVINISKSGTTTETALAFRLLKKQCEDQRGKETAKKVIVAVTDAKKG 
AARVTADKEGYKTFIIPDNVGGRFSVLTPVGLLPIAVAGFDIDKLVAGAADMEKACGSDV PFAENPAAIYAATRNELYRQGKKIEILVNFCPKLHYVSEWWKQLYGESEGKDNKGIFPAS VDFSTDLHSMGQWIQEGERSIFETVISLDKVDHKLEVPFDEANLDGLNFLAGKRVDEVNK MAELGTQLAHVDGGVPNMRIVLPELSEYNIGGLLYFFEKACGISGYLLGVNPFNQPGVEA YKKNMFALLNKPGYEEESKAIQAKL >gi|226332195|gb|ACIC01000125.1| GENE 16 23745 - 24245 529 166 aa, chain - ## HITS:1 COG:no KEGG:BT_2125 NR:ns ## KEGG: BT_2125 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 166 22 187 187 300 100.0 1e-80 MKKKVLFLLPLFLGACSDSGESPVITIEKLYVKVDQGDGETIREDYANRMSEMPALKVGD KVDAFLILDGNGAELKSFQLQNGEEMKTELRYEKTEVTTEGNLTDEEKGQLRFKDGVTKT KVLVHATIEHVDKNGDVQLSFYLSSKAECEGAQEVIDLKTEVVENE >gi|226332195|gb|ACIC01000125.1| GENE 17 24440 - 25246 761 268 aa, chain + ## HITS:1 COG:no KEGG:BT_2126 NR:ns ## KEGG: BT_2126 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 268 1 268 268 475 100.0 1e-133 MEKLVRRPNAVILFISLSLLMMLTTGCQEIKLKAGIEAANKQCPIDMGETGKITSIVYDG ENVVYTFYLNEEAANIKTLQKNPESMKTSLKIMFQNPNKEVKSLLDLVVKCKAGLQMIYI GKDSGEQVVCELTTDEIKEILNKDVDTTESDLAKLETQVQMANLQFPIQASEDVLIEKIE LSDEAVIYICRVDEDLCDMNQIKANAAEVKQGIIETLGNQTDLATQLFIKSCVNCDRNIV YRYIGKQSEAQHDVVITVPELKDMLKKE >gi|226332195|gb|ACIC01000125.1| GENE 18 25276 - 25950 678 224 aa, chain + ## HITS:1 COG:MA0451 KEGG:ns NR:ns ## COG: MA0451 COG0637 # Protein_GI_number: 20089342 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Methanosarcina acetivorans str.C2A # 5 217 2 206 218 113 35.0 3e-25 MRKKLKAVLFDMDGVLFNSMPYHSEAWHQVMKTHGLDLSREEAYMHEGRTGASTINIVFQ RELGKEATQEEIESIYHEKSILFNSYPEAERMPGAWELLQKVKSEGLTPMVVTGSGQLSL LERLEHNFPGMFHKELMVTAFDVKYGKPNPEPYLMALKKGGLKADEAVVIENAPLGVEAG HKAGIFTIAVNTGPLDGQVLLDAGADLLFPSMQALCDSWDTIML >gi|226332195|gb|ACIC01000125.1| GENE 19 26184 - 28823 1748 879 aa, chain + ## HITS:1 COG:CAC3055 KEGG:ns NR:ns ## COG: CAC3055 COG2605 # Protein_GI_number: 15896306 # Func_class: R 
General function prediction only # Function: Predicted kinase related to galactokinase and mevalonate kinase # Organism: Clostridium acetobutylicum # 514 816 2 275 364 88 26.0 5e-17 MLHAGGQSRRLPGYAPSGKILTPIPVFSWERGQKLGQNLLSLQLPLYERIMKQAPKGLNT LIASGDVYIRSEKPLQDIPEVDVVCYGLWVNPSLATHHGVFVSDRKKPEVLDFMLQKPSL EELEGLSKTHLFLMDIGIWILSDRAVEVLMKRSLKEGTNDISYYDLYSDYGLALGEHPQT TDDEVNKLSVAILSLPSGEFYHFGTSRELISSTLAIQDKVRDQRRIMHRKVKPNPAIFIQ NSFTQVKLSAENANLWIENSHVGEGWKLGSRQIITGVPENHWNINLPDGVCIDIVPMGDA AFVARPYGLDDVFKGDLRNDSTTYLGNSFTQWMKEREIGLEDIKGRTDDLQAAPVFPVTT SIEELGILIRWMTAEPQLKQGKELWLRAEKLSADEISAQANLERLYAQRSAFRRDNWKGL SANYEKSVFYQLDLQDAANEFVRLNLEVPAVLKEDAAPMVRIHNRMLRARILKLQGNEGC KEEQAAFQLLRDGLLEAVAGKKNYPKLNVYSDQIVWGRSPVRIDVAGGWTDTPPYSLYSG GSVVNLAIELNGQPPLQVYVKPCHEFHIVLRSIDMGAVEVIRSYEELQDYKKVGSPFSIP KAALTLAGFSPLFAAESHASLEKHLKAFGSGLEITLLAAIPAGSGLGTSSILASTVLGAI NDFCGLAWDRNDICNYTLVLEQLLTTGGGWQDQYGGVFPGVKLLQSESGFEQHPLVRWLP DQLFVQPEYRDCHLLYYTGITRTAKGILAEIVSSMFLNSGVHLSLLAEMKAHAMDMSEAI LRGNFETFGNLVGKSWIQNQALDSGTNPPAVAAIIEQIKDYTLGYKLPGAGGGGYLYMVA KDPQAAGCIRRILTEQAPNPRARFVEMTLSDKGLQVSRS >gi|226332195|gb|ACIC01000125.1| GENE 20 29178 - 30215 1040 345 aa, chain + ## HITS:1 COG:FN0776 KEGG:ns NR:ns ## COG: FN0776 COG2502 # Protein_GI_number: 19704111 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthetase A # Organism: Fusobacterium nucleatum # 10 345 3 327 327 353 51.0 2e-97 MSYLIKPKNYKPLLDLKQTELGIKQIKEFFQLNLSSELRLRRVTAPLFVLKGMGINDDLN GIERPVSFPIKDLGDAQAEVVHSLAKWKRLTLADYNIEPGYGIYTDMNAIRSDEELGNLH SLYVDQWDWERVITNEDRNVDFLKEIVNRIYAAMIRTEYMVYEMYPQIKPCLPQKLHFIH SEELRQLYPNLEPKCREHAICQKYGAVFIIGIGCKLSDGKKHDGRAPDYDDYTSTGLNNL PGLNGDLLLWDDVLQRSIELSSMGVRVDKEALQRQLKEENEEERLKLYFHKRLMDDTLPL SIGGGIGQSRLCMFYLRKAHIGEIQASIWPEDMRKECEELDIHLI >gi|226332195|gb|ACIC01000125.1| GENE 21 30221 - 30883 521 220 aa, chain + ## HITS:1 COG:PA0750 KEGG:ns NR:ns ## COG: PA0750 COG0692 # Protein_GI_number: 15595947 # Func_class: L Replication, recombination and repair # Function: Uracil DNA 
glycosylase # Organism: Pseudomonas aeruginosa # 3 220 8 226 231 273 59.0 2e-73 MNVQIEESWKTHLQPEFEKDYFRTLTEFVKSEYSQYQIFPPGKLIFNAFNLCPFDKVKVV IIGQDPYHGPGQAHGLCFSVNDGVPFPPSLVNIFKEIKADIGTDAPTTGNLTRWAEQGVL LLNATLTVRAHQAGSHQNRGWEAFTDAAIRALAEEREHLVFILWGAYAQRKGAFIDRNKH LVLSSAHPSPLSAYNGFFGNKHFSRANDYLKANGETEIIW >gi|226332195|gb|ACIC01000125.1| GENE 22 31005 - 33716 2107 903 aa, chain - ## HITS:1 COG:no KEGG:BT_2133 NR:ns ## KEGG: BT_2133 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 903 1 902 902 1617 96.0 0 MAPLRAKIIISFILLIVLLMLPDEATSQRRRRGMIASNTAQTDSLNRKDSLGADTVVVRV DSVAPTKKQPLDAPVIYESNDSTTFTLGGAATLYGSGKVNYQNIELAAEVISMNLDSSTV HAYGIKDTTGTIKGKPVFKEGETAYDTETISYNFKSKKAGITDIITQQGEGYVTGSRAKK GANDEIFMEHGRYTTCDHHDHPHFYMQLTRAKVRPKKNVVTGPAYLVVEDVPLPLAVPFF FFPFSSSYSSGFIMPTYMDDSSRGFGLAEGGYYFAMSDIMDLKLTGDIFTKGSWRLSGLT NYNKRYKYSGTLQADYQVTKTGDKGMPDYTVAKDFKVVWNHRQDAKASPNSTFSASVNFS TSSYERSNINNLYNSQLLTQNTKTSSISYSRSFPDIGLTLSGTTNIAQTMRDSSIAVTLP DLNITLSRLFPFKRKKAAGAERWYEKISVSYTGRLTNSIRTKDDRLFKTGISGWENAMNH NIPISATFTLFKYLQVSPSVNYTERWYTRKMNQTYNMEEHKLDPNLNDTVNGFYRVSNYS ASISLSTKLYGMYKPLFMKKKEIQIRHVVTPQVSLSGAPGFSKYWEEYTDYNGNTQYYSP FTGQPFGVPSREGSGTVSFSLSNNLEMKYYDAKKDTLKKVSLIDDLSANMSYNMAAKERP WSDLSLNIRMKLTKNYTFNMNASFATYAYAFDKNGNVVTSNRTEWSYGRFGRFQGYGSSF NYTFNNDTWKKWFGPKEDAEQDKNKKDSEDGDGEDSEGTEDGTTTKKVEKAQADPDGYQV FKMPWSLSFSYSFNIREDRTKPINRYSMRYPFTYTHNINMNGNIKISNNWSLSFNSGYDF QAKEITQTSCTISRDLHCFNLSASLSPFGRWKYYNVTIRANASILQDLKYEQRSQTQSNI QWY >gi|226332195|gb|ACIC01000125.1| GENE 23 33807 - 34343 444 178 aa, chain - ## HITS:1 COG:MJ0778 KEGG:ns NR:ns ## COG: MJ0778 COG1418 # Protein_GI_number: 15668959 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Methanococcus jannaschii # 24 161 21 149 169 73 36.0 2e-13 MNPIEIIDKFYPQETEQRHILLIHSLSVAQKALKIVDAHPNLPINRSFVREAALLHDIGI FLTDAPTIQCFGEHPYIAHGYLGADLLRKEGFERHALVCERHTGAGLTLEEIIERQLPVP HREMVPVTLEEQIICFADKFFSKTHLDEEKTVEKARKSIAKYGEEGLNRFDGWCSLFL 
>gi|226332195|gb|ACIC01000125.1| GENE 24 34419 - 35582 889 387 aa, chain + ## HITS:1 COG:BS_ybbC KEGG:ns NR:ns ## COG: BS_ybbC COG3876 # Protein_GI_number: 16077233 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 24 387 36 414 414 246 40.0 5e-65 MQARSSLILFSLFIILCSTYASGKVITGAERMDQYLPLIKGKRVGMVVNHTSIVGTEQIH LLDTLLKQKIHVVKVFAPEHGFRGNADAGETVKDGKDSRTGVTIVSLYGDNKKPTAAQLK DIDVILFDIQDVGARFYTYISTMYYVMEACAENKKEMIILDRPNPCDYVEGPVLKAGYKS FVGMLPLPVLHGCTIGELAQMINGEGWMTTQAKTCPLTVIPVKGWKHGQPYVLPVKPSPN LPNAQSIRLYASLCPFEATRVSVGRGTTFPFQVLGAPNKKYGDFAFTPRSLPGFDKNPMH KGVTCYGEDLRNVTDVNGFTLRYFLRFYRLCGEGAAFFSRPRWFDLLMGTDSVRKAILKG ESEEEIRDSWQKELQTYRDMRQKYLLY >gi|226332195|gb|ACIC01000125.1| GENE 25 35571 - 36929 1094 452 aa, chain - ## HITS:1 COG:all1686 KEGG:ns NR:ns ## COG: all1686 COG1409 # Protein_GI_number: 17229178 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Nostoc sp. PCC 7120 # 248 400 147 291 303 66 29.0 1e-10 MKITKKKIAWGIVLLILIAGGVICKIRWKVWFHNIPEIRYVLTEDPQRVFLTFGNDGELS RNVSWVCGGVSKQAKLEYTKIGSGDTLLVDGSSKFLRTLGGFGYANWAKFKELSYGDSYS YRVWNDDRSSDWYSFTMQPDSTDHFSFVFIGDVQDTLRGKTRGFMENVRHRYPQADFYMF AGDFAERPMNCYWDEAYQSVDSIAPTKPVLVSPGNHEYVKGLVRVLEKRFAYVFSYLLES RYKNNNVYSIDYNDATIITLDSNRDPWFLFSQREWLEKTLKASKKKWKIVMLHHPVYSIK GKTNNLAVRWMFDGLFREYGVDLVLQGHEHNYARMTNKNDEGEMTTPLYLVSHASPKSYR LSFNDKYDRFGTNRRFYQHIDVTGDTLRMQAYLENDSLYDDVRIVKNASGTQIIDNAKDI PEILEMPARLSGKKAEEFERNAEKWRNRSLVK Prediction of potential genes in microbial genomes Time: Thu May 12 02:34:41 2011 Seq name: gi|226332194|gb|ACIC01000126.1| Bacteroides sp. 1_1_6 cont1.126, whole genome shotgun sequence Length of sequence - 14621 bp Number of predicted genes - 20, with homology - 19 Number of transcription units - 6, operones - 4 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 51 - 110 3.7 1 1 Op 1 . + CDS 219 - 398 80 ## 2 1 Op 2 . 
+ CDS 451 - 1278 725 ## CA_C1149 hypothetical protein 3 1 Op 3 . + CDS 1298 - 2236 816 ## gi|237717756|ref|ZP_04548237.1| predicted protein + Term 2257 - 2296 7.4 4 2 Op 1 . + CDS 2316 - 5318 2152 ## COG0827 Adenine-specific DNA methylase 5 2 Op 2 . + CDS 5334 - 5531 66 ## gi|301308008|ref|ZP_07213962.1| hypothetical protein HMPREF9008_01202 6 2 Op 3 . + CDS 5545 - 6042 395 ## BVU_2473 putative anti-restriction protein 7 2 Op 4 . + CDS 6055 - 6375 191 ## gi|253570894|ref|ZP_04848302.1| conserved hypothetical protein 8 2 Op 5 . + CDS 6383 - 6670 144 ## gi|253570895|ref|ZP_04848303.1| conserved hypothetical protein + Term 6679 - 6717 6.2 9 3 Op 1 . + CDS 6760 - 7692 668 ## gi|293370320|ref|ZP_06616877.1| hypothetical protein CUY_4157 10 3 Op 2 . + CDS 7725 - 8228 357 ## BDI_3866 hypothetical protein 11 3 Op 3 . + CDS 8260 - 8859 457 ## BVU_1586 hypothetical protein 12 3 Op 4 . + CDS 8856 - 9488 385 ## gi|237717748|ref|ZP_04548229.1| predicted protein 13 3 Op 5 . + CDS 9490 - 9699 230 ## gi|237717747|ref|ZP_04548228.1| conserved hypothetical protein 14 3 Op 6 . + CDS 9739 - 10731 618 ## BVU_1584 hypothetical protein 15 3 Op 7 . + CDS 10742 - 11443 441 ## BVU_1583 hypothetical protein 16 3 Op 8 . + CDS 11445 - 12158 406 ## BDI_3871 hypothetical protein 17 4 Tu 1 . - CDS 12214 - 12558 456 ## BDI_3872 hypothetical protein + Prom 12612 - 12671 2.4 18 5 Op 1 . + CDS 12882 - 13172 244 ## BVU_1580 hypothetical protein 19 5 Op 2 . + CDS 13217 - 14143 732 ## gi|237717741|ref|ZP_04548222.1| predicted protein + Term 14173 - 14224 6.1 - Term 14160 - 14211 6.1 20 6 Tu 1 . 
- CDS 14220 - 14456 143 ## BVU_1573 glycoside hydrolase family protein Predicted protein(s) >gi|226332194|gb|ACIC01000126.1| GENE 1 219 - 398 80 59 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTVIPKPSSPNSQAKVPRVLLVQARSSPPGVVEKSSSPGGLAVFFPPTLHVRDTAFHSL >gi|226332194|gb|ACIC01000126.1| GENE 2 451 - 1278 725 275 aa, chain + ## HITS:1 COG:no KEGG:CA_C1149 NR:ns ## KEGG: CA_C1149 # Name: not_defined # Def: hypothetical protein # Organism: C.acetobutylicum # Pathway: not_defined # 10 234 2 222 267 149 38.0 1e-34 MLQQATKEKYGIETLKERNVSYDHEHGLTQKDVDMANNHVQLIERTRSEVTPQIGDRLVY VTEHGDYYGNALIDGKSTKEGYLSICEQPYVPFVWEEGGSIRLSVSGGAFHSVNPKDMKF LKWTEGAFKDWGHCGACANGSVTFIAKVPLWFYAEPNSRYGDFTTETYRKFYLNKRDESE SGNLYQGFDIAFRDEAEFWQFLKDYEGTVFKGNWENQIVLWCFRREYVFLPSAEWKTVDA PAEERRLNFHPEQVKIVKDMEKHITYFYRIKPDNF >gi|226332194|gb|ACIC01000126.1| GENE 3 1298 - 2236 816 312 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717756|ref|ZP_04548237.1| ## NR: gi|237717756|ref|ZP_04548237.1| predicted protein [Bacteroides sp. 2_2_4] # 1 312 1 312 312 606 100.0 1e-172 MQTTVTKAGNAPDLLSGILSVQVRNEDKITEQDRLYCQMQQELLYKTLDQIDRWYAVFKE EAEQYQAERKFHYEENGKVSMRDFYTYHNNRDDYSHNEFKPFDLINDLVDKNRNANANFA GRIISYFNRTYNVSVPEYKIDEKTLQMGSRPVYGTYVDVVIEHLGGKSFRETAVEELLAR VARVVKPSCWSKVKTELKKDRIVFPEIIRFDDYYIQYNNRCKISYNYGGDLETLCAGIAY GADDILNGNSKMIIRFDDNDVSVTDWYDLTTTNAEQIRFYKNGRIDVRFKDSAAAGNCFK RLRLDEITQREN >gi|226332194|gb|ACIC01000126.1| GENE 4 2316 - 5318 2152 1000 aa, chain + ## HITS:1 COG:jhp0928_1 KEGG:ns NR:ns ## COG: jhp0928_1 COG0827 # Protein_GI_number: 15611993 # Func_class: L Replication, recombination and repair # Function: Adenine-specific DNA methylase # Organism: Helicobacter pylori J99 # 95 246 553 711 729 63 32.0 2e-09 MYAIIPQQIPQGMRAEVNEKILFAIDSGKDLIPAESIYNCYTGIGGLHNLKQSDFANYHE YAEAKKESEMGQFFTPHEVCRDMADMLSPTSSEMILDMCCGMGNFFNHLPNLHNAYGFDI DGKAVSVARYLYPDAHIEKCDLRQYYPEQRFDIVIGNPPFNQKFDYKLSQEYYMDKAYDV LNPAGILMIIVPGSFMQSGFWEKTRIAGINSNFSFVGQTKLAPSAFAATGVHDFNTKIMV FLRKSVHIGMRAYSAEEFITVEELKKRIGGARAMKHRLRFDLMRETNRIDKEELELFEYR 
LAKYMYELKVHAKLNRYIGKTEALVTKFRNQKPPGNATREQVNQWEKNKLTPKKVLAVIR RYITSQNTVPRKEVALVKTSYGFKLKQYAPRLLDKVPHKAASINDLVLERAELPMPEVPT EKNMRQIRAAEKLIRRKRREYEMQDRQFPEMEEDDRLKEYLDRTTFINKDGDVCEFTTLQ KHDLNLVLQKRYALLNWQQGSGKTAAVYHRAKYLLKYRKVRNAVILAPAIATNMTWIPFL SMNREQFRVARCNADLETVPEGVFLILSTSMLSKLKRGLARFVKRTSRKLCLVFDESDEI TNPSSQRTRHILCLFRRLRYKILDTGTTTRNNIAELYSQFELLYNNSVNMICWSGRVYHD NKDKEIEEDTNPHYGEPFSAFRGHVLFRACHCPGKSTVFGIEKQNQDVYNKEELAELIGK TVITRKFRDFAGEKYRIRTHTVSPSDGEREVYRVIIEEFCRICELYYNSTGDTKKDAGLR LMRQIKLLIKACSVPHLIEGYSGDGIPNKTKYIERLVRKIPGKVAVGCTSIAAFDLYEKR LRECFPERPVFVVKGDVAFKKRQSVVTEFDSTVNGILVCTQQSLSSSVNIPTCNDVILES LQWNIPKMEQFYFRFIRLDSKEQKDVHYVTYKDSVEQNLMALVLTKERLNEFIKMGEVKE QSEIFEEFDVTMSVIESLLVRECDSEGRIHISWGSQRIMN >gi|226332194|gb|ACIC01000126.1| GENE 5 5334 - 5531 66 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|301308008|ref|ZP_07213962.1| ## NR: gi|301308008|ref|ZP_07213962.1| hypothetical protein HMPREF9008_01202 [Bacteroides sp. 20_3] # 1 62 5 66 70 92 93.0 7e-18 MGNRRFHSTGKGSPPPYRQGKVRPQAVFGEIILAGGSGISPKNPALPGCGPFGACGMKSP VPIIN >gi|226332194|gb|ACIC01000126.1| GENE 6 5545 - 6042 395 165 aa, chain + ## HITS:1 COG:no KEGG:BVU_2473 NR:ns ## KEGG: BVU_2473 # Name: not_defined # Def: putative anti-restriction protein # Organism: B.vulgatus # Pathway: not_defined # 3 116 6 119 177 84 42.0 1e-15 MDLNQAEVAVTTQHLIDTRQEKDNWLQMSDFGDMGEFLCTCSELFAEEETPEYRYMKWED IPDSLINMEWLCPNFFEIREAMEQLEEPDKDCFFDWCDRYGHDISTEDAHLLVAHYIELF GNTAYTGDEPCPDSGDDSLPYYPGISSNYFDTGIPHFEVFDDNYD >gi|226332194|gb|ACIC01000126.1| GENE 7 6055 - 6375 191 106 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570894|ref|ZP_04848302.1| ## NR: gi|253570894|ref|ZP_04848302.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 106 1 106 106 205 100.0 8e-52 MEINFKGPVMPVDPYSQMAFVEILNILLTARHIVDVNRFLINRNTNPQFGSLSGYFRWSF SGNHFTLWQRMEYNSPVCFSRRIFSIHFGILASRNRERNKDSLTLN >gi|226332194|gb|ACIC01000126.1| GENE 8 6383 - 6670 144 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570895|ref|ZP_04848303.1| ## NR: gi|253570895|ref|ZP_04848303.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 95 1 95 95 189 100.0 5e-47 MDTITISNREIALMAFDRLRKDDRKDSALKLARCMLHGTSISLGISDIDWEIDRAIQQCG GVPRTGYRYTAYFHFNRNTEMAKEIYDKIVKELYG >gi|226332194|gb|ACIC01000126.1| GENE 9 6760 - 7692 668 310 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293370320|ref|ZP_06616877.1| ## NR: gi|293370320|ref|ZP_06616877.1| hypothetical protein CUY_4157 [Bacteroides ovatus SD CMC 3f] # 1 310 1 310 310 592 97.0 1e-167 MDEQKTLTLDFIKSLMEPAYTLVWTDYDDNLDNHRGLIQKCLDSKSREHLWEEADVWYSD AEWEAVRGIIAKLKEECTVFNDFDEEDVDDFFDEHEDEIRDEIYSRNDSDVIKELIMHTD DIPIRVEMLSNYDCINSHWFESQGGYRYEESYFGDMVDSLNLNPARVKKILTKHGYKAYG RFPNRKNRNGREQVSYEQFYEELINSCCGANLLTYIGRVNLKELYEAGFSLEEVVIPKGN CCGLFSSTYGGGSLLEMELKKDVRLKLEVKDYHGFRFRLDDERSKYECSIRHVYGVDDSF FGERISLVAS >gi|226332194|gb|ACIC01000126.1| GENE 10 7725 - 8228 357 167 aa, chain + ## HITS:1 COG:no KEGG:BDI_3866 NR:ns ## KEGG: BDI_3866 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 158 1 158 165 188 58.0 6e-47 MEKKYVIILSEGKEYLCCHEDGCYYDVSCPMRSFTEGEEDFEIMDSGQNRHGKTYPYHKR KLKLVPGFYPNGWLALSLEVPKTGEAYTVLTVNLEDFPAFGIPDKAFVDINNNPEAMDFL IRYNLAEDTGYRRRSGWVEYPMVKLNLPELYRISPAYFEESGQQSIM >gi|226332194|gb|ACIC01000126.1| GENE 11 8260 - 8859 457 199 aa, chain + ## HITS:1 COG:no KEGG:BVU_1586 NR:ns ## KEGG: BVU_1586 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 199 1 199 199 270 64.0 2e-71 MAKYEVRVRYAFEGTYMVAAGNREEAKRMVEEDCGLVLGGNIHTTRDDDEVTDWNFGIHP ESQILSIKETGRKARAKYIAMNFSDRIEELRNGIIEAIRQLLQAHSLAELAFTNREDDPV WIIWFGKNGDPYECRVTGVRIMEGSLTVIAEEKESGDEVECYGPFELGARNIDWLHEMYD AAWRQLEETKDVEPKTEEP 
>gi|226332194|gb|ACIC01000126.1| GENE 12 8856 - 9488 385 210 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717748|ref|ZP_04548229.1| ## NR: gi|237717748|ref|ZP_04548229.1| predicted protein [Bacteroides sp. 2_2_4] # 1 210 1 210 210 399 100.0 1e-110 MKYQAENAVSSFFYYMWNAWSKEECKVVFGDMYRHFWDKWSVSADNAIFGAAERFFAGLS ENYQKLLVERAVTLYDGRAFRKEPDDSDILVCKECGSRQLEIQVWINANTDERISYVYDD NDGHWCDGKWCEECVDQTFFCTKAEFTQKMQSWWESCGLESKEQITGLKVCDCPPAESPQ TFVDAAGRWWNSRDYEYKREIYNKHTSNNE >gi|226332194|gb|ACIC01000126.1| GENE 13 9490 - 9699 230 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717747|ref|ZP_04548228.1| ## NR: gi|237717747|ref|ZP_04548228.1| conserved hypothetical protein [Bacteroides sp. 2_2_4] # 1 69 42 110 110 132 100.0 7e-30 MQTNIIELISNSCGCSQTEAQEYLDSEIRYLRELQEADDLRGDDIGMACSNLGLDLDYQE YFINRLAGA >gi|226332194|gb|ACIC01000126.1| GENE 14 9739 - 10731 618 330 aa, chain + ## HITS:1 COG:no KEGG:BVU_1584 NR:ns ## KEGG: BVU_1584 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 330 10 339 339 617 93.0 1e-175 MADLKKEYRCLALQHHPDKGGDTAIMQRVNTEFEKLYGIWKDKPDVSAASTGYEHDYPGA TAKEYTEYVYNEYRWKGRNYKGQHAPEIVELVRAWLKETYPGYKFSVRRENCHSIHIRLM KADFEAFTKESGKVQGDVNHHHIHSDKSLTDRAKEVMMNICDFIMSYNFDDSDPMTDYFH TNFYLTLGIGSYKQPYKVEPPRLGSKDKPEVFKHPEGPAHKAMRRALGKARFGFIESRKY AGEIILGEDCFGSRGELYFWPKEYSSAKMAQKRIDKLEEAGIRCELTGYNGGYIRLLGYT PEMRDSLERERQEYAAAYQAWYSKQNLKTI >gi|226332194|gb|ACIC01000126.1| GENE 15 10742 - 11443 441 233 aa, chain + ## HITS:1 COG:no KEGG:BVU_1583 NR:ns ## KEGG: BVU_1583 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 233 1 233 233 465 95.0 1e-130 MDTNNLDKWWYGLPENTRQAIGDDGIWEKLDMPSRSALHRYSLLRIYGTAKDRDEERTLL NEIACGLGDLALVRKNGIALEEMCNGNGEFYDEYQEQFNILYDNYGHIIENISWPDWIGH TIPTNRELVRLLEHHGYKRMEIDTDRRIPKTFYVFRRGLHINASEDLSYHIIPQQDSFGL GRFAVCATKDGESSQLGTDCARLFLRRFLAFLKGERSGKEIIDEIFNNRQTER >gi|226332194|gb|ACIC01000126.1| GENE 16 11445 - 12158 406 237 aa, chain + ## HITS:1 COG:no KEGG:BDI_3871 NR:ns ## 
KEGG: BDI_3871 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 237 1 237 237 387 90.0 1e-106 MKAKVFKYKSDGNTVVAPYMELEPYAENVYLSLSRKNEDGNEDDDCFHVVCRIENVYFSS GQYSRRFLKGEGCREEAATYCRNWIADTLQDAERGTFVRLISVRVFEALGLDATPLVQAR EAYKRKQEQKRREWEEKEAEERRAREEQHQRLLDEQKQKFLDGERITGGMFLEITGRDGF DIHIRTKGTFNRHVMGINKEGSISYRKIKGHRTPDFTGCHKAVSAYLAFITEKEGKQ >gi|226332194|gb|ACIC01000126.1| GENE 17 12214 - 12558 456 114 aa, chain - ## HITS:1 COG:no KEGG:BDI_3872 NR:ns ## KEGG: BDI_3872 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 114 16 129 129 205 93.0 5e-52 MENYIKKAADAFLVERPYGMRVDYRKKGFVLFNRNLNVLGNAKQTRLEELPLERFNVEEI PLEGEVVEEHAGFTDVFFYTDLTNPYAGYVLNLQKLKAYNRFMFPLAMALNREL >gi|226332194|gb|ACIC01000126.1| GENE 18 12882 - 13172 244 96 aa, chain + ## HITS:1 COG:no KEGG:BVU_1580 NR:ns ## KEGG: BVU_1580 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 96 1 96 96 193 95.0 2e-48 MKILNEEHFENVKRYAESIGDISLQKCLERLKSWEENPDCPSEISLYYDHAPYSFGFTQR YPDGRTGIVGGLLYHGIPDRSFAVTLQPFHGWQIHT >gi|226332194|gb|ACIC01000126.1| GENE 19 13217 - 14143 732 308 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717741|ref|ZP_04548222.1| ## NR: gi|237717741|ref|ZP_04548222.1| predicted protein [Bacteroides sp. 
2_2_4] # 1 308 1 308 308 586 96.0 1e-166 METTLAVMERQQQFDFQKNGIEVMNFETLQRTYKENDIYNNPVQGIYHYQVIRRMMDICE KYNLDYEVEEIFAAQNRNKTQPGVSILPQVEQTHGEKAVEAHILRRIFATIRIKDWETDE LTTTLVIAYHQDGIQAAIGPCVLICHNQCILSSERSVCNYGKKKVSTEEVFETVDGWLAN FEVNMNEDIERIQRLKRRIISMEEIYMYIGLLTALRVSHDSSDRDLSSSVETYPLNQGQI SIFAEEVLKLAMSKGQITAWELYNIATEIYKPGKTDFPALIPQNGAMAELLLSRLPEELE VQDAVQVN >gi|226332194|gb|ACIC01000126.1| GENE 20 14220 - 14456 143 78 aa, chain - ## HITS:1 COG:no KEGG:BVU_1573 NR:ns ## KEGG: BVU_1573 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: B.vulgatus # Pathway: not_defined # 1 76 82 157 159 103 68.0 2e-21 MFSYLGRDSLLAAVLSYNVGPYRLKGYGKRPKSRLLKKLESGDRNIYKEYVSFRCYKGKV VPSIERRRKVEFMLLFEE Prediction of potential genes in microbial genomes Time: Thu May 12 02:36:23 2011 Seq name: gi|226332193|gb|ACIC01000127.1| Bacteroides sp. 1_1_6 cont1.127, whole genome shotgun sequence Length of sequence - 8258 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 1, operones - 1 average op.length - 10.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 77 - 110 -0.9 1 1 Op 1 . - CDS 149 - 655 399 ## PGN_0056 probable conserved protein found in conjugate transposon 2 1 Op 2 . - CDS 695 - 1561 666 ## PGN_0057 probable conserved protein found in conjugate transposon TraP 3 1 Op 3 . - CDS 1575 - 2147 495 ## BT_0085 conjugate transposon protein 4 1 Op 4 . - CDS 2150 - 3133 819 ## BF0116 hypothetical protein 5 1 Op 5 . - CDS 3157 - 4425 707 ## PGN_0060 conserved protein found in conjugate transposon TraM 6 1 Op 6 . - CDS 4418 - 4723 238 ## gi|253570913|ref|ZP_04848321.1| conserved hypothetical protein 7 1 Op 7 . - CDS 4736 - 5359 576 ## BT_0089 conjugate transposon protein 8 1 Op 8 . - CDS 5383 - 6387 860 ## BF0120 hypothetical protein 9 1 Op 9 . - CDS 6410 - 7039 754 ## BF0121 hypothetical protein 10 1 Op 10 . 
- CDS 7080 - 8129 617 ## gi|237717730|ref|ZP_04548211.1| predicted protein Predicted protein(s) >gi|226332193|gb|ACIC01000127.1| GENE 1 149 - 655 399 168 aa, chain - ## HITS:1 COG:no KEGG:PGN_0056 NR:ns ## KEGG: PGN_0056 # Name: not_defined # Def: probable conserved protein found in conjugate transposon # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 6 159 2 153 153 185 59.0 4e-46 MNALNKKRGLVGMTAILFLGLVACLLSACNDEVDVQQSYPFKVETLPVPTRIVKGETVEI RCELKREGRFSDARYTIRYFQPDGKGTLRMDDGMVLLPNDRYPLDREVFRLYYTPESEDQ QTIDIYFEDNSEPAQLCQLTFDFNNETEDEDSVVTADSKELPVTTVRH >gi|226332193|gb|ACIC01000127.1| GENE 2 695 - 1561 666 288 aa, chain - ## HITS:1 COG:no KEGG:PGN_0057 NR:ns ## KEGG: PGN_0057 # Name: traP # Def: probable conserved protein found in conjugate transposon TraP # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 1 286 1 278 284 293 51.0 7e-78 MTIEEIRDIPIAVFLARMGYEPARRRGDEYWYLAPYREERTASFQLNVRKDIWHDFGTGQ GGDIFTLAGEFIGSGDFKAQARFITEIWGGLAPEHKTVFRSGENDREDSHRQESFTKVQS GPLHNRVLLRYLAERGISGDVAMPNCKEIRYTLHGKRYFAIGFRNVSGGYEVRNRFFKAS LSPKDISLMDNGSDTCNLFEGFIDCLSWMQLELGCGDDYLVLNSVALLERSFPVLDRYER VNCYLDRDEAGRRTLEALRKRYADKIVDCSSLYKGYKDLNEYLQNKFL >gi|226332193|gb|ACIC01000127.1| GENE 3 1575 - 2147 495 190 aa, chain - ## HITS:1 COG:no KEGG:BT_0085 NR:ns ## KEGG: BT_0085 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 189 2 192 192 213 58.0 3e-54 MKKYLFLFVTVVSLALFSGRAHAQRYLPGMKGVELRGGFANGSKSPLNYYTGFAVSGYTP KANRWVIGAEYLMKNYGYRNASVPRAQFTAEGGYYLKFLSDPTKTVFLSVGGSALAGYET VNWGDRMLFDGSTLLAKDAFVYGGAITLELEAYLTDRIVLLAGIRERVLWGSSLDLFTTQ FGLGVKFIIH >gi|226332193|gb|ACIC01000127.1| GENE 4 2150 - 3133 819 327 aa, chain - ## HITS:1 COG:no KEGG:BF0116 NR:ns ## KEGG: BF0116 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 327 1 328 328 489 72.0 1e-137 MKKFFVMLALAWGTVGAFAQNEADSVAAVNNAITLAKDVYPQKEEDGDLYHGLTRKLTFD RMIPPYGLEVTYGKTTHIIFPSSVRYVDLGSPNLIAGKADGAENVIRVKATVKDFREETN 
MSVITESGSFYTFNVKYAEEPLLLNIEMKDFIHDGSEVNRPNNALDIYLKELGCESPKLV HLISKSIHKDNRRHIKHIGCKAFGIQYLLRGLYTHNGLLYFHTQIKNRSNVPYEVDFVTF KIVDKKLMKRTAIQEQVIFPLRAYDYVTSVAGKKDGRTVFVFDKFTIPSDKVLVVEMHEK GGGRHQTFTVESEDIVRAGVINELKVK >gi|226332193|gb|ACIC01000127.1| GENE 5 3157 - 4425 707 422 aa, chain - ## HITS:1 COG:no KEGG:PGN_0060 NR:ns ## KEGG: PGN_0060 # Name: traM # Def: conserved protein found in conjugate transposon TraM # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 9 422 21 455 455 337 47.0 5e-91 MGNDVNKLKQRLEIRKYLVFAGMFLLFLGAMWLIFAPSEEEKRKEEKNAGFNTELPDPRG AGIEGDKIAAYEQADMKRKQTEKLRTLEDFSEMADGNSGTVVELSPDADAKDGSSREGDR RNEACSSSVSAYNDINRTLGNFYESPQEDPEKEALKAEVEQLKQAVATQNAQPGYEEQVA LLEKSYELAARYMPGNGGVNTDSRSDNESPGRNGKAKAVPVGLVSTPVVSSLPQPVSDSI QLTGMAQPERFHTPIGNKEERNVRNTIRACIHGDQTVISGQGVRMRLLEPMRVGKHILPK NSLLTGEGRIQGERLHVNILQVEYGGIIIPVELAVYDNDGQGGIFIPGSMEANAVREVAA NLGQNLGTSISITNQSAGDQLLSELGKGAIQGVSQYISRKMREEKVHLKSGYTLMLYQDN DQ >gi|226332193|gb|ACIC01000127.1| GENE 6 4418 - 4723 238 101 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253570913|ref|ZP_04848321.1| ## NR: gi|253570913|ref|ZP_04848321.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 101 1 101 101 180 100.0 3e-44 MKKMFVKIKESVEEKLRKLCSGLSPEKRVPAIVVLAALFAIGNFYMIFRAIHDIGREDVR PEIIEIPPMEIPDIVPADTLPDKRVQEMEEFFNRFNQKENG >gi|226332193|gb|ACIC01000127.1| GENE 7 4736 - 5359 576 207 aa, chain - ## HITS:1 COG:no KEGG:BT_0089 NR:ns ## KEGG: BT_0089 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 207 1 207 207 310 71.0 3e-83 MEFKSLKNIETSFRHLRLFGIVYLCVCTLLVGFSVWKAYSFAEAQREKIYVLDEGKSLML ALSQDLEQNRPVEAREHVRRFHELFFTLSPDKNAIEGNIRRAMFLSDRSAYAHYVDLAEQ GYYNRIISGNVNQRVGIDSVKCDFHTYPYEVVTYAKLSIIREKSVTERSLVTRGRLLNST RSDNNPHGFILEAFRVVENRDIRVYDR >gi|226332193|gb|ACIC01000127.1| GENE 8 5383 - 6387 860 334 aa, chain - ## HITS:1 COG:no KEGG:BF0120 NR:ns ## KEGG: BF0120 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 6 316 1 313 334 466 70.0 1e-130 MLLLAIDFDNLHQILQNLYVEMMPLCSKMTGVARGLAGLGALFYVAYRVWQALARAEPVD VFPLLRPFALGLCIMFFPTLVLGTLNSILSPVVKGTHTILESQTFDMNEYRAQKDKLETE AMKRNPETAYLVDKETFDNRLDELGAFDAIEACGMYVDRAMYNMKRAVQNFFRELLELLF NAAALVIDTLRTFFLIVLSILGPVSFAISCWDGFQASLSQWFVRYISIYLWLPVSDLFSS VLARIQVLMLQRDIEQLSDPDFIPDSSNGVYITFLIIGIIGYFTIPTVSNWIVQAGGGAG NYGKNVNQAASKTGSVVAGTAGAAVGNIAGRLIK >gi|226332193|gb|ACIC01000127.1| GENE 9 6410 - 7039 754 209 aa, chain - ## HITS:1 COG:no KEGG:BF0121 NR:ns ## KEGG: BF0121 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 209 1 209 209 287 69.0 2e-76 MKKKILMLCMGCFLASLTAKAQWVVSDPGNLAQGIINAAKNIVHTSSTATNMANNFQETV KIYKQGKEYYDGLRKVKNLVRDARKVQQTVLMVGDITDIYVTNFERMLSDPYFTPEELSA IALGYTKLLEESAHLLNDLKTVVNENGLSMNDKERMDIIDRCYNDMLQNRSLVQYYTNKN IGVSYLRAKKRNDLDRVMALYGSPNERYW >gi|226332193|gb|ACIC01000127.1| GENE 10 7080 - 8129 617 349 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237717730|ref|ZP_04548211.1| ## NR: gi|237717730|ref|ZP_04548211.1| predicted protein [Bacteroides sp. 
2_2_4] # 1 349 1 349 349 693 100.0 0 MKFTEWIRHRIVGRFREPDHRELQEIAQKAKRLAGVYNHGELAKLIREYSHSEKAIEQIA LLIAKSEPFDSPISTMNKESVDMMDVLTDTPFMAKHGSQLTDIPPWEATSVHGLFAMYTI MRDWGCTKYPMNRIERPPHSEVISAIRILDFQRKSRDVSELCELASYSMVPSTYVKLRYG IEDKCSTYEWLESYLMDRDDCCIEPDVRMNIPRIEAEMCAAAEKAVENIPGVRLPDFYLE GLDRELDKLGRIAVSPDASNDLINVQPDFLVKYGIDKTAAPEEQSRQAQKAYKELDNRLV RVTSRRPYADEFFASLRHGKEKVAEVDRPRPVHKPILRNPPSKGRKMGL Prediction of potential genes in microbial genomes Time: Thu May 12 02:37:40 2011 Seq name: gi|226332192|gb|ACIC01000128.1| Bacteroides sp. 1_1_6 cont1.128, whole genome shotgun sequence Length of sequence - 119271 bp Number of predicted genes - 129, with homology - 125 Number of transcription units - 46, operones - 29 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 336 217 ## COG3344 Retron-type reverse transcriptase - Prom 358 - 417 2.0 2 2 Op 1 . - CDS 905 - 3301 1777 ## COG3451 Type IV secretory pathway, VirB4 components 3 2 Op 2 . - CDS 3298 - 3630 83 ## BF0124 hypothetical protein 4 2 Op 3 . - CDS 3632 - 3928 333 ## Fjoh_3006 hypothetical protein - Prom 3999 - 4058 4.9 - Term 4426 - 4470 8.2 5 3 Op 1 . - CDS 4498 - 4728 167 ## gi|237717727|ref|ZP_04548208.1| predicted protein 6 3 Op 2 . - CDS 4768 - 5556 532 ## BDI_3900 hypothetical protein 7 3 Op 3 . - CDS 5559 - 5978 347 ## gi|237717725|ref|ZP_04548206.1| predicted protein 8 3 Op 4 . - CDS 5986 - 6738 638 ## BT_2609 conjugate transposon protein 9 4 Op 1 . + CDS 7354 - 7806 413 ## Fjoh_3620 hypothetical protein 10 4 Op 2 . + CDS 7808 - 9061 860 ## BVU_0678 putative mobilization protein 11 4 Op 3 . + CDS 9097 - 11124 1395 ## COG3505 Type IV secretory pathway, VirD4 components + Prom 11262 - 11321 4.7 12 5 Op 1 . + CDS 11341 - 11604 258 ## gi|253570928|ref|ZP_04848336.1| conserved hypothetical protein 13 5 Op 2 . + CDS 11616 - 11876 288 ## gi|237717717|ref|ZP_04548198.1| predicted protein 14 5 Op 3 . + CDS 11881 - 12378 428 ## BVU_0658 hypothetical protein 15 5 Op 4 . 
+ CDS 12390 - 12683 206 ## BT_2585 hypothetical protein + Term 12726 - 12773 9.0 - Term 12713 - 12761 3.2 16 6 Tu 1 . - CDS 12799 - 14055 724 ## COG1373 Predicted ATPase (AAA+ superfamily) - Prom 14174 - 14233 4.1 + Prom 14868 - 14927 3.7 17 7 Op 1 . + CDS 14973 - 16739 1813 ## BT_2642 hypothetical protein 18 7 Op 2 . + CDS 16788 - 17051 378 ## BT_2643 hypothetical protein + Term 17089 - 17142 1.2 19 8 Tu 1 . + CDS 17149 - 19245 1296 ## COG0550 Topoisomerase IA - Term 18911 - 18952 -0.6 20 9 Tu 1 . - CDS 19111 - 19428 87 ## - Prom 19450 - 19509 3.9 21 10 Tu 1 . + CDS 19444 - 19680 218 ## + Term 19745 - 19794 11.0 + Prom 19830 - 19889 2.0 22 11 Op 1 . + CDS 20068 - 20286 131 ## gi|255015017|ref|ZP_05287143.1| hypothetical protein B2_14008 23 11 Op 2 . + CDS 20324 - 21286 738 ## BVU_1522 hypothetical protein 24 11 Op 3 . + CDS 21291 - 21512 306 ## BVU_0712 hypothetical protein 25 11 Op 4 . + CDS 21553 - 22641 649 ## BVU_0713 hypothetical protein 26 11 Op 5 . + CDS 22645 - 23364 681 ## BT_2648 hypothetical protein 27 11 Op 6 . + CDS 23361 - 24149 618 ## BVU_0715 ThiF family protein, ubiquitin-activating enzyme + Prom 24175 - 24234 5.0 28 12 Tu 1 . + CDS 24259 - 25386 605 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Term 25519 - 25569 1.5 + Prom 25508 - 25567 7.3 29 13 Op 1 . + CDS 25623 - 26723 1091 ## BF3849 hypothetical protein 30 13 Op 2 . + CDS 26740 - 27627 953 ## BF3848 hypothetical protein + Term 27641 - 27679 -1.0 31 13 Op 3 . + CDS 27714 - 28601 798 ## gi|253570946|ref|ZP_04848354.1| conserved hypothetical protein 32 13 Op 4 . + CDS 28636 - 30438 1486 ## DSY3857 hypothetical protein 33 13 Op 5 . + CDS 30502 - 32199 1187 ## Swol_1076 Ig-like domain-containing protein + Prom 32331 - 32390 3.7 34 14 Op 1 . + CDS 32428 - 32925 306 ## BF3844 hypothetical protein 35 14 Op 2 . + CDS 32947 - 33648 526 ## BF3842 hypothetical protein + Term 33663 - 33700 3.4 36 15 Op 1 . 
+ CDS 33722 - 34375 -24 ## HCH_03798 endonuclease/exonuclease/phosphatase family protein 37 15 Op 2 13/0.000 + CDS 34406 - 36703 1563 ## COG0642 Signal transduction histidine kinase 38 15 Op 3 . + CDS 36734 - 38041 684 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains + Term 38236 - 38291 0.2 + Prom 38925 - 38984 4.0 39 16 Op 1 . + CDS 39017 - 39310 298 ## BF2920 hypothetical protein 40 16 Op 2 . + CDS 39330 - 39506 246 ## gi|253570958|ref|ZP_04848366.1| conserved hypothetical protein 41 16 Op 3 . + CDS 39518 - 39877 321 ## BF2919 hypothetical protein 42 16 Op 4 . + CDS 39861 - 40118 177 ## BF2918 hypothetical protein + Term 40248 - 40281 2.5 + Prom 40719 - 40778 3.7 43 17 Tu 1 . + CDS 40802 - 41182 450 ## BF2915 putative single strand binding protein + Term 41198 - 41239 8.8 - Term 41158 - 41190 2.0 44 18 Op 1 . - CDS 41270 - 41551 313 ## COG0776 Bacterial nucleoid DNA-binding protein 45 18 Op 2 . - CDS 41619 - 41723 82 ## - Prom 41750 - 41809 5.9 46 19 Tu 1 . - CDS 41827 - 42438 270 ## BF2906 serine type site-specific recombinase - Prom 42464 - 42523 6.3 47 20 Op 1 . + CDS 42456 - 42638 76 ## gi|253570967|ref|ZP_04848375.1| conserved hypothetical protein 48 20 Op 2 . + CDS 42677 - 43726 769 ## BF2905 hypothetical protein 49 20 Op 3 . + CDS 43748 - 44200 513 ## BF2904 hypothetical protein 50 20 Op 4 . + CDS 44221 - 44598 306 ## BF2903 hypothetical protein 51 20 Op 5 . + CDS 44589 - 45179 295 ## BF2902 hypothetical protein 52 20 Op 6 . + CDS 45169 - 46014 485 ## BF2901 hypothetical protein + Prom 46026 - 46085 1.6 53 21 Op 1 . + CDS 46232 - 47689 1248 ## BF2900 hypothetical protein 54 21 Op 2 . + CDS 47744 - 49144 690 ## BF2899 putative outer membrane protein + Term 49165 - 49212 6.7 55 22 Op 1 . 
- CDS 49225 - 49821 267 ## BT_1949 hypothetical protein 56 22 Op 2 35/0.000 - CDS 49818 - 50576 227 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 57 22 Op 3 33/0.000 - CDS 50573 - 51553 844 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 58 22 Op 4 . - CDS 51554 - 52693 854 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 59 22 Op 5 . - CDS 52698 - 54776 1201 ## BT_1953 putative TonB-linked outer membrane receptor 60 22 Op 6 . - CDS 54773 - 55216 166 ## Slin_4541 transposase IS200-like protein 61 22 Op 7 . - CDS 55270 - 56349 1091 ## BT_1954 putative surface layer protein - Prom 56371 - 56430 4.4 62 23 Op 1 . - CDS 56434 - 58488 1108 ## BT_1955 putative cell wall biogenesis protein 63 23 Op 2 . - CDS 58497 - 60272 1477 ## BT_1956 putative cell surface protein - Prom 60332 - 60391 5.7 - Term 60337 - 60390 8.5 64 24 Tu 1 . - CDS 60400 - 61368 713 ## BF3033 hypothetical protein - Prom 61389 - 61448 6.2 + Prom 62103 - 62162 2.8 65 25 Tu 1 . + CDS 62233 - 62856 259 ## BF2893 hypothetical protein + Prom 62958 - 63017 4.6 66 26 Op 1 . + CDS 63141 - 63608 293 ## BF2897 hypothetical protein 67 26 Op 2 . + CDS 63605 - 63859 178 ## BF2891 hypothetical protein 68 26 Op 3 . + CDS 63874 - 64248 358 ## BF2890 hypothetical protein + Term 64252 - 64303 8.3 - Term 64240 - 64291 3.5 69 27 Op 1 . - CDS 64301 - 65989 1275 ## COG4227 Antirestriction protein 70 27 Op 2 . - CDS 65995 - 66603 582 ## BF2888 hypothetical protein 71 28 Op 1 . - CDS 66876 - 68432 1132 ## COG1705 Muramidase (flagellum-specific) 72 28 Op 2 . - CDS 68445 - 70706 1680 ## BF2885 putative DNA primase 73 28 Op 3 . - CDS 70734 - 72533 1455 ## BF2884 hypothetical protein 74 28 Op 4 . - CDS 72542 - 72823 189 ## BF2883 hypothetical protein 75 28 Op 5 . - CDS 72847 - 73677 686 ## COG0739 Membrane proteins related to metalloendopeptidases 76 28 Op 6 . - CDS 73703 - 74368 626 ## BF2881 hypothetical protein 77 28 Op 7 . 
- CDS 74382 - 75065 555 ## BF2880 hypothetical protein 78 28 Op 8 . - CDS 75079 - 75768 475 ## BF2879 hypothetical protein 79 28 Op 9 . - CDS 75749 - 76246 269 ## BF2878 hypothetical protein 80 28 Op 10 . - CDS 76243 - 77631 916 ## BF2877 hypothetical protein 81 28 Op 11 . - CDS 77635 - 79212 1258 ## BF2876 hypothetical protein 82 28 Op 12 . - CDS 79229 - 79453 298 ## BF2875 hypothetical protein 83 28 Op 13 . - CDS 79472 - 80230 765 ## BF2874 hypothetical protein - Prom 80272 - 80331 4.4 + Prom 80199 - 80258 10.1 84 29 Tu 1 . + CDS 80450 - 81103 298 ## COG0739 Membrane proteins related to metalloendopeptidases - Term 80833 - 80875 -0.5 85 30 Tu 1 . - CDS 81109 - 82278 720 ## COG1373 Predicted ATPase (AAA+ superfamily) - Prom 82372 - 82431 4.3 86 31 Op 1 . + CDS 82668 - 82892 269 ## BF1096 hypothetical protein 87 31 Op 2 . + CDS 82940 - 83572 463 ## gi|253571012|ref|ZP_04848420.1| conserved hypothetical protein + Term 83580 - 83628 7.1 - Term 83763 - 83821 5.2 88 32 Tu 1 . - CDS 83929 - 84147 91 ## - Prom 84171 - 84230 4.0 89 33 Op 1 . + CDS 84173 - 84514 192 ## BF2912 hypothetical protein 90 33 Op 2 . + CDS 84525 - 84887 355 ## gi|253571014|ref|ZP_04848422.1| conserved hypothetical protein + Prom 85086 - 85145 7.8 91 34 Op 1 . + CDS 85168 - 85923 726 ## COG1192 ATPases involved in chromosome partitioning 92 34 Op 2 . + CDS 85931 - 86185 266 ## BF2825 hypothetical protein + Term 86234 - 86277 5.4 93 35 Tu 1 . + CDS 86623 - 88350 1376 ## COG1475 Predicted transcriptional regulators + Term 88370 - 88414 7.1 94 36 Op 1 . + CDS 88792 - 89067 350 ## BF2819 hypothetical protein 95 36 Op 2 . + CDS 89075 - 89347 311 ## gi|253571022|ref|ZP_04848430.1| conserved hypothetical protein 96 36 Op 3 . + CDS 89381 - 89761 560 ## BF2817 hypothetical protein + Term 89783 - 89828 6.2 - Term 89771 - 89816 8.3 97 37 Op 1 . - CDS 89864 - 90061 98 ## HMPREF0424_0734 hypothetical protein 98 37 Op 2 . 
- CDS 90076 - 90252 200 ## gi|253571025|ref|ZP_04848433.1| conserved hypothetical protein - Prom 90425 - 90484 4.3 + Prom 90289 - 90348 1.8 99 38 Tu 1 . + CDS 90401 - 91621 823 ## COG4804 Uncharacterized conserved protein + Term 91663 - 91719 12.2 - Term 91651 - 91706 13.6 100 39 Op 1 . - CDS 91709 - 93895 1604 ## BF2815 putative mobilization protein 101 39 Op 2 . - CDS 93924 - 94580 499 ## BF2813 hypothetical protein 102 39 Op 3 . - CDS 94606 - 95091 420 ## BF2812 hypothetical protein 103 39 Op 4 . - CDS 95115 - 95966 837 ## BF2811 conjugate transposon protein TraN 104 39 Op 5 . - CDS 96004 - 96789 501 ## COG0863 DNA modification methylase 105 39 Op 6 . - CDS 96801 - 97949 1002 ## BF2808 conjugate transposon protein TraM 106 39 Op 7 . - CDS 97960 - 98466 394 ## BF2807 hypothetical protein 107 39 Op 8 . - CDS 98453 - 98833 285 ## BF2806 hypothetical protein 108 39 Op 9 . - CDS 98846 - 99460 423 ## BF2805 conjugate transposon protein TraK 109 39 Op 10 . - CDS 99464 - 99859 434 ## BF2804 hypothetical protein 110 39 Op 11 . - CDS 99901 - 101034 860 ## BF2803 hypothetical protein 111 39 Op 12 . - CDS 101047 - 101814 646 ## BF2802 hypothetical protein 112 39 Op 13 . - CDS 101833 - 102558 715 ## BF2800 hypothetical protein 113 39 Op 14 . - CDS 102555 - 105101 1675 ## BF2799 hypothetical protein 114 39 Op 15 . - CDS 105113 - 107821 2235 ## BF2797 hypothetical protein - Term 107889 - 107921 -1.0 115 40 Op 1 . - CDS 107933 - 108229 327 ## BF2796 hypothetical protein 116 40 Op 2 . - CDS 108245 - 108622 477 ## BF2795 conjugate transposon protein TraE 117 40 Op 3 . - CDS 108677 - 109003 158 ## BF2794 hypothetical protein 118 40 Op 4 . - CDS 109005 - 109448 491 ## BF2793 hypothetical protein - Prom 109478 - 109537 5.6 - Term 109634 - 109663 -0.3 119 41 Tu 1 . - CDS 109692 - 110579 651 ## BF2792 DNA primase - Prom 110608 - 110667 8.2 120 42 Op 1 . - CDS 110682 - 111782 837 ## BF2791 hypothetical protein 121 42 Op 2 . 
- CDS 111760 - 112155 352 ## BF2790 putative excisionase - Prom 112274 - 112333 3.0 - Term 112180 - 112219 6.5 122 43 Op 1 . - CDS 112335 - 112979 164 ## BF2789 hypothetical protein - Prom 113126 - 113185 3.4 - Term 113149 - 113184 0.4 123 43 Op 2 . - CDS 113239 - 114384 934 ## COG4973 Site-specific recombinase XerC - Prom 114514 - 114573 8.5 + Prom 114505 - 114564 7.4 124 44 Tu 1 . + CDS 114748 - 115164 249 ## BVU_0680 hypothetical protein + Term 115199 - 115234 3.5 + Prom 115207 - 115266 6.0 125 45 Op 1 . + CDS 115371 - 115694 349 ## BT_2651 hypothetical protein 126 45 Op 2 . + CDS 115707 - 116000 109 ## BF1983 hypothetical protein 127 45 Op 3 . + CDS 115997 - 116308 238 ## BT_2448 hypothetical protein + Term 116321 - 116359 4.5 128 46 Op 1 . - CDS 116333 - 117658 599 ## BF1987 tyrosine type site-specific recombinase 129 46 Op 2 . - CDS 117679 - 118902 665 ## COG4974 Site-specific recombinase XerD - Prom 119019 - 119078 5.5 Predicted protein(s) >gi|226332192|gb|ACIC01000128.1| GENE 1 3 - 336 217 111 aa, chain - ## HITS:1 COG:YPMT1.75c KEGG:ns NR:ns ## COG: YPMT1.75c COG3344 # Protein_GI_number: 16082867 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Yersinia pestis # 21 109 21 112 156 66 38.0 1e-11 MRNPENVLNSLSKHSGNLNYKFERLYRVLFNEEMFYVAYQNIYSKTGNMTAGADGKTIDG MSIDRVEQLIGSLKNETYQPNPSKRTYIPKKNGKKRPLGIPSFDDKLVQEV >gi|226332192|gb|ACIC01000128.1| GENE 2 905 - 3301 1777 798 aa, chain - ## HITS:1 COG:MYPU_3830 KEGG:ns NR:ns ## COG: MYPU_3830 COG3451 # Protein_GI_number: 15828854 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Mycoplasma pulmonis # 426 770 486 839 853 62 25.0 3e-09 MKNVMKTATLESKFPLLAVENGCIISKDADITVAYRVELPELFTLTRAEYESMHSTWAKA VKVLPNYSIVHKQDFFIEEGYKPDICKEDLSFLSRSFERHFNERPYLQHTCYLFLTKTTK EHSRTTSSFNALTRGFIIPKEMQDKETVTRFMECCGQFERIVNDSGLLRIIRLTDEEIIG 
SNNSAGIIEKYFSMSQEDATCLQDLSLGAGEMKVGDNYLCLHTLSDPEDLPSNVSTDCRY ERLSTDRSDCRLSFAAPIGILLTCNHVVNQYLFIDDSAEILRKFEQTARNMHSLSRYSRS NQINREWIEEYLNEAHSKGLVSIRAHCNVMAWSDDREKLKRIKNDVGSQLALMEAKPRHN TVDVPTLFWAAIPGNAGDFPSEESFHTFIEQALCLFIGETSYKDSLSPFGIRMVDRLTGK PVHLDISDLPMKNGTITNRNKFILGPSGSGKSFFTNHMVRQYYEQGAHVLLVDTGNSYQG LCSLIHARTHGEDGIYFTYEEKDPIAFNPFYVEDGIFDIEKKESVKTLILTLWKRDDEPP TRAEEVALSNAVNLFLEKIRRDSSIKPSFNTFYEFIRDEYQDILKEKRTREKDFDVWGFL NVLEPYYKGGEYDYLLNSDKQLDLQSKRFIVFELDNIKDNKVLFPIVTIIIMETFINKMR KLKGIRKMILLEEAWKAISKEGMAEYLRYLFKTVRKFFGEAVVVTQEVEDIISSPIVKGT IINNSDCKILLDQRKYMNKFDEIQALLGLTDKERAQILSINMANNPSRKYKEVWIGLGGS QSAVYATEVSLEEYMVRP >gi|226332192|gb|ACIC01000128.1| GENE 3 3298 - 3630 83 110 aa, chain - ## HITS:1 COG:no KEGG:BF0124 NR:ns ## KEGG: BF0124 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 109 1 109 110 151 66.0 7e-36 MAEFNVNKGIGRSPEFKGLKSQYLFIFAGGLLALFVIFVVMYMAGIDQWVCIGFGVVSAS VLVWGTFRMNSKYGEWGLMKLHALRSHPRYIISRRKFLRLFSPTIKNRKA >gi|226332192|gb|ACIC01000128.1| GENE 4 3632 - 3928 333 98 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_3006 NR:ns ## KEGG: Fjoh_3006 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 2 98 31 127 127 143 80.0 2e-33 MKKRFFLAAMTLLVSMGASAQGNGIGGITEATNMVTSYFDPGTKLIYAIGAVIGLIGGVK VYSKFSSGDPDTSKTAASWFGACIFLIVAATILRSFFL >gi|226332192|gb|ACIC01000128.1| GENE 5 4498 - 4728 167 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237717727|ref|ZP_04548208.1| ## NR: gi|237717727|ref|ZP_04548208.1| predicted protein [Bacteroides sp. 
2_2_4] # 1 76 10 85 85 125 100.0 8e-28 MPKSEAESFGIKRNPAIRAPRIYLSEEEVIRIMRKEAAIKKKIGNLIDSMLCDDSQISAM DNIKNKTEAFDIDEFV >gi|226332192|gb|ACIC01000128.1| GENE 6 4768 - 5556 532 262 aa, chain - ## HITS:1 COG:no KEGG:BDI_3900 NR:ns ## KEGG: BDI_3900 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 97 240 6 152 178 64 32.0 4e-09 METTLIYFSVKAVSATYILYKVWMFIFSPKVYKFWNNQLRYMRIARIRLWKSRKKRMAEK ARKARYRARLDKAEAWIAQAENDILKVGHEKNTETQPNPLNEYNEVIGKTKIVYLEDPEV ARKTPTRSEPMKKEPIEEDEDINPDDVIDDFTPQKGLTESEKRELMSNDGCVPDPDFSRA LTFEELDNVADVLISGTDDRKKIRNAAETLYQLQDTDLFRFFSTELSTQSQLEKLLRENM PNGKETSKQSLEQIGIDWNKYM >gi|226332192|gb|ACIC01000128.1| GENE 7 5559 - 5978 347 139 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237717725|ref|ZP_04548206.1| ## NR: gi|237717725|ref|ZP_04548206.1| predicted protein [Bacteroides sp. 2_2_4] # 1 139 1 139 139 234 100.0 2e-60 MAKKIVKVNEEKIRGYMVGDIPDAIENEDIIVVEVPEEGNNERQDTEKETARKNSKSCNM PKSDSNFRQKYLVNTPMPGRIQVYLNRKLYDEIKNYLNVIAPEVSIASYISNIIAEHIEL NIEEITRMYKDRFSPPKIQ >gi|226332192|gb|ACIC01000128.1| GENE 8 5986 - 6738 638 250 aa, chain - ## HITS:1 COG:no KEGG:BT_2609 NR:ns ## KEGG: BT_2609 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 244 1 246 252 197 42.0 2e-49 MKKNPLFVAVSNQKGGVGKSTMLITLASLLNYSMGKNVAIVDCDSTQRSLFNLRERDMEM VEINKKYMVLLEEQRLRGCRIYPIRQAKPENARQVAGELAAKADFDIVFIDLPGSMDISG VLQTIFNVDYVLTPIAADNFVMDSSFVFAKSVMKCAENRNNIPLKDVFLFWTKVKKRSNT EVLDNYMALMKKQGLKILDSMIPDLCRYDKELSSRTRTYFRCSLLPPPAGQLKGSGLQEL ANELIVKFNL >gi|226332192|gb|ACIC01000128.1| GENE 9 7354 - 7806 413 150 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_3620 NR:ns ## KEGG: Fjoh_3620 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 3 144 2 143 145 133 54.0 2e-30 MEENKNRKARKGVGRKSKSDPAVYRYGIKFNSRENAQFELSFQKSGMAYRARFIKSVLLN KEIKVVRIDKAAMDYYVRLTNFYHQFQAVGNNYNQAVKAIKTNFGEKRAYALLRNLEKVT IDLVVLSKRIIMLTREFEETYLIKKQREEE >gi|226332192|gb|ACIC01000128.1| 
GENE 10 7808 - 9061 860 417 aa, chain + ## HITS:1 COG:no KEGG:BVU_0678 NR:ns ## KEGG: BVU_0678 # Name: not_defined # Def: putative mobilization protein # Organism: B.vulgatus # Pathway: not_defined # 1 400 1 400 415 416 56.0 1e-114 MVAKINRGASLYGAVIYNQQKVNESTARIISGNRMIADVTGNPEQVMRNTLWAFENYLLA NRNTEKPVLHISLNPSVDDRLTDSQFADLAREYMQRMGYGDQPYIVYIHEDIGRRHIHIV STCVNEKGEKIDDAYEWNRSMKACRELERKFGLKQVEDKRRELLEPYLKKADYQNGDVKQ QVSNILKSVFSTYRFQSFGEYSALLSCFNIEAKQVRGEFEGTPYNGIVYTMTDNTGKPVC TPIKSSLIGKRFGYEGLEKRIGYNTREYKNKKRQPKIRNDIALAMHGCRGNREDFIRLLN GRGIDVVFRENGEGRIYGATFIDHRNREVYNGSRLGKEFSANAFERLFNNPGNIPNLDAP MPDTGLQSGFSSDMESAIEQAFGIFDFEANGPDPQEEALARRLYHKKKKKRRSRGIS >gi|226332192|gb|ACIC01000128.1| GENE 11 9097 - 11124 1395 675 aa, chain + ## HITS:1 COG:alr7539 KEGG:ns NR:ns ## COG: alr7539 COG3505 # Protein_GI_number: 17158675 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Nostoc sp. PCC 7120 # 194 558 137 495 608 91 26.0 6e-18 MQQEDDLRGLAKVMEFMRAISIVFVVIHVYWFCYRAFVDAGINIGVVDKILLNFQRTAGL FSNLLVTKVFAVIFLALSCLGTKGVKNQKMTWRKIYTAFLSGLVLFFMNWWMFDLPFSPT VDAAIYTVTLTAGYILLLMSGVWISRMLKHNLMEDVFNTANESFMQETHLMENEYSVNLP TKFVYQGKEWDGWINVVNVFRASIVLGTPGSGKSYAVVNNYIKQMISKGFALYLYDYKFD DLSVIAYNELLKNIDKYRVRPEFYVINFDDPRRSHRCNPINPKFMADISDAYESAYTILL NLNKTWIQKQGDFFVESPIILLAAIIWYLRIYKDGKYCTFPHAIEFLNKPYADIFTILTS YPSLENYLSPFMDAWQGGAQDQLQGQIASAKIPLSRMISPQLYWVMTGDDFTLDLNNPEH PKILCVGNNPDRQNIYSAALGLYNSRIVKLVNKKGQLKSSIIIDELPTIYFRGIDNLIAT ARSNKVAVCLGFQDFSQLTRDYGEKEAKVIQNTVGNIFSGQVVGETAKSLSERFGKILQQ RQSISINRQDTSTSINTQLDSLIPASKIANLSQGTFVGSVADNFGEEIDQKIFHARIIVD NEKVAAETKAYKKIPVINEFKDADGNDIMQQQIERNYSRIKADVLQIIEDEMQRIKTDPD LQHLIPKEDSDRKEE >gi|226332192|gb|ACIC01000128.1| GENE 12 11341 - 11604 258 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570928|ref|ZP_04848336.1| ## NR: gi|253570928|ref|ZP_04848336.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 87 1 87 87 168 100.0 9e-41 MSRSNFTPMKRFHEIIGRYGLRLMEVGTNHLRVFSEGRKLFDYYPLRMKLFDYRQWKQLT YPSLIDGTDKWETELDNIIKELMVSQQ >gi|226332192|gb|ACIC01000128.1| GENE 13 11616 - 11876 288 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237717717|ref|ZP_04548198.1| ## NR: gi|237717717|ref|ZP_04548198.1| predicted protein [Bacteroides sp. 2_2_4] # 1 82 23 104 107 162 98.0 7e-39 MNGNNSKTLVWDNIPEWAIFALEYGTREELFLSDEDKKMITKFIAENFPNGYTMSVDWES YKEFDTNPAFGKACKTYKVTFVIPKE >gi|226332192|gb|ACIC01000128.1| GENE 14 11881 - 12378 428 165 aa, chain + ## HITS:1 COG:no KEGG:BVU_0658 NR:ns ## KEGG: BVU_0658 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 56 157 45 139 303 68 37.0 7e-11 MRKITLSEYNAIPEDYRGIWTVERWDLPDWADLREKHIGKRTMMAYDNGTCLLVEGMGFE IVDDSSWKKTDEVRQEIGCHYLKFYSGQGRDPHYADCVIRWRDTLETEEARIALAMDSDT EKDDEIFFYCDSLDDLKSLADKGGEDFTVAGCLGFGIYEELLQTT >gi|226332192|gb|ACIC01000128.1| GENE 15 12390 - 12683 206 97 aa, chain + ## HITS:1 COG:no KEGG:BT_2585 NR:ns ## KEGG: BT_2585 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 97 1 97 97 128 64.0 7e-29 MKIRCQEHYDKVAEYAKSIGDTTFQNCIGRLKQWEKNSNGRYEIEFYWDFAPYSFGFAER TPDGRNGIVGGLLYHGRPDESFAVMIGGPFHGWSIHT >gi|226332192|gb|ACIC01000128.1| GENE 16 12799 - 14055 724 418 aa, chain - ## HITS:1 COG:FN1715 KEGG:ns NR:ns ## COG: FN1715 COG1373 # Protein_GI_number: 19705036 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 1 410 1 422 430 342 49.0 6e-94 MIIPRDKYLQELKSVMHNGMIKIITGVRRCGKSYLLFELFKQSLLENGVNEEHIIQVDLE DRRSKHLREPDILLDYIDGHITDNDMYYILLDEIQLVPEFEDVLNSYLKIKNADVYVTGS NSRMLSSDVKTEFRGRGYEIRVHPLSFSEFLKAGQYTNELSALQDYMIYGGMPQVVSFSN KKEKENYLKSLFQGTYIRDIKERYNIRNDDDLSELIDIIASNIGCLTNPTNLENTFKSVK GHSISDTTIQSYLGMLQDAFMLEKAIRYDIKGKHYINTPSKYYFEDVGLRNARLNYRQID GGHLMENIIYNELRIRGYSVDIGQVEVRITNDDGRKMRKLLEVDFICNSGDKRVYIQSAL 
DMPTQEKIDQETNSLRHIKDGFPKIVIVGGLTPSHVNADGISIINVIDFLKDTEGNLL >gi|226332192|gb|ACIC01000128.1| GENE 17 14973 - 16739 1813 588 aa, chain + ## HITS:1 COG:no KEGG:BT_2642 NR:ns ## KEGG: BT_2642 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 581 15 590 601 533 50.0 1e-150 MAKKNGRDDPPKPQVAENEQMSDIILILDKMELILQAVSQIDKDGRYKTVPADKEHTNSF LKIDRYASMFENFLKNFWSQLKDPTRFGILSVKEKALDDPKVRQAIEDLAAGKKTDAVEE FLKQYEIVPRDKENQSINHQNQEEMAKENETQQQADQGGGTQQQPQYRYNESMINWEQLK NFGLSREELQERGLLDQMLRGYKTNQVVPISMNFGSAVLRTDARLSFQQSRAGDIVLGIH GIRQKPDLDRPYFGHIFSDEDKKNLLETGNMGRVVELKNRNGEYVPSFVSIDKLTNEVVA MKAENVFIPREISGVKLTEQEQNDLREGKKIFIEGMTARSGNPFDAHIQVNAERRGVEFI FENDKLFNRQYLGGVELTKKQIDDLNGGKAIFVEGMKRKDGELFSSYVKLDEATGRPSYT RYNPDSPEGAREIYIPNEIGGVKVTPEEQKELREGRPIFLNGMVNRRGEEFSSFIKADLE TGRLSYSRTPDGFDQREEFKIPAKVWDVELTRKQRADLQSGKAVLVEGIKGYDGKTISQY VKANFNQGRLDFYNENPDRRRDAAQRNVVSAAQRQGEENRQSRGASIA >gi|226332192|gb|ACIC01000128.1| GENE 18 16788 - 17051 378 87 aa, chain + ## HITS:1 COG:no KEGG:BT_2643 NR:ns ## KEGG: BT_2643 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 87 1 86 91 65 45.0 4e-10 MEMNKENNTPFKAEDVNWDELAGIGILKDELEMSGELDTLLRGEKTKVMPLSLVLLGVDV VMDATLQLVRKDGDPLLEILGIKSVGE >gi|226332192|gb|ACIC01000128.1| GENE 19 17149 - 19245 1296 698 aa, chain + ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 2 638 4 654 709 394 37.0 1e-109 MIAIIAEKPSVGQEIARVVGATEKKDGYITGNGYMVTWALGHLVSLALPGTYGYTRTTAE DLPMLPEPFRLVERQIRTDRGMVTDIAAGRQLKVIDGVFSECDSIIVATDAGREGELIFR WIYSHLGCTKPFKRLWFSSLTDEAIRKGMADLREGYEYDSLYAAADSRAKADWLVGMNAS RALAAVSGSANNSIGRVQTPTLAMICARFKENRNFVSTPYWQLHITLKQGEAHRLFIHPE HFGDKKAAETAYGRIIPGSAVTITKVERGTVFQQAPLLYDLTSLQKDCNIHHDLTADKTL SIAQSLYEKKLVSYPRTGSRYIPEDIMANIPALLEKIIAMPLFREYGRTFDFSALNTRSV 
DTTKVTDHHALIITGVAPEELSEAESAVYTLIAGRMLEAFSPPCEKERLLMECMCGGMDF RSRSAFIVNTGWRAVFARKEDREKDEPEDGGGTALFAEGGLVPLSGRGLSQKKTLPRPLY TEATLLAAMENCGKDITDGEARETVKDSGIGTPATRAAIITTLFKRDYVERSGKSILPTE KGLYLYESVKDMMVADAELTGTWERALARIEGRTLDPDSFMLSIREYTGKVTGEILRLKF PEPSSHAFTCPKCKAGNVVVKSKVAKCDREGCGLLVFRRFLNKELTDRHLEQLFSSGSTR LIKGFKGKKGCSFDAALTFDDSFNLKLSFPKPKGAKGK >gi|226332192|gb|ACIC01000128.1| GENE 20 19111 - 19428 87 105 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGETCHFQFRGPESGPPPQARFSGKIPERGEDDFPRNLQGPTLQVETGGSATFARRIGMA GSFAFCTFGLGKRKFQVKTVVKGECRIERTSFLTLETLDKPGGAG >gi|226332192|gb|ACIC01000128.1| GENE 21 19444 - 19680 218 78 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKTREYIGKLTTGKWTRMTVLAIAAFIVFGYLYSRGVSPVWTAVAIVCFRGFFRFLYKVA CFLVAAAILFCILSYLVF >gi|226332192|gb|ACIC01000128.1| GENE 22 20068 - 20286 131 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|255015017|ref|ZP_05287143.1| ## NR: gi|255015017|ref|ZP_05287143.1| hypothetical protein B2_14008 [Bacteroides sp. 2_1_7] # 1 72 1 72 72 93 90.0 3e-18 MNQAIISRPPMAPVQIPVPIPARRKYPVPEPTVKFPPRERSGPVHISTLLDPVLEICSHP DRNRLLAEFFNR >gi|226332192|gb|ACIC01000128.1| GENE 23 20324 - 21286 738 320 aa, chain + ## HITS:1 COG:no KEGG:BVU_1522 NR:ns ## KEGG: BVU_1522 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 262 1 269 351 146 45.0 1e-33 MFFTAINQMMTEGVDLTLVIRKANGQLAVSTLPKSNGLKDEAQNHIVPLTVNGKPEELDA GFLQTVARPIQKAVGLITNMAQFEVQAEKAASESKAAKDAKAKETKEEKERREKYEKHLK KAEELIAAKNHKDAVTALSQARLHAKPQDQKKIDEMMEQQKKAMNKGSLFELMDEPVPQA QSQPRPQPMAATAQQRQQPQPQPMAVSVQQPPYPPQPQVPGSMAGQPQQPSMWPPQQPPH RPVQHQTPPQPQYAQQPSVQPQNYGGQEVTHWQEPEFTPEDYRMQYPADEEINYNPKDYE EYPDFPQSMLEPKYSPYQTV >gi|226332192|gb|ACIC01000128.1| GENE 24 21291 - 21512 306 73 aa, chain + ## HITS:1 COG:no KEGG:BVU_0712 NR:ns ## KEGG: BVU_0712 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 73 1 73 73 86 60.0 3e-16 MALEITGMTRSFTFKKGSGMVTLDDPNPSDSPEMVMNYYSNFYPELTTATVHGPVLKDDK AVYEFKTTVGTKG 
>gi|226332192|gb|ACIC01000128.1| GENE 25 21553 - 22641 649 362 aa, chain + ## HITS:1 COG:no KEGG:BVU_0713 NR:ns ## KEGG: BVU_0713 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 41 361 43 365 375 194 35.0 3e-48 MCTIGRTPAGTAGKAHHRADTEGNPIRGGQERTDCGSKRGRNSFLSTALAPVALKSLVVE TYDGLQCGSINVITRENYEFLRDSFFKYALLLQKEAAHTPGNSIGEGIARLYDEMDALVG DELNVNIEQERGRLFFRLWKYHRWGSLTLYYFPVRFLESLSPVLRRIAITFIHKLMTANG ISTILDDDDAEFIFELLSEDGGDDPQEWKSRVKLVRSYQEGKIGRLLRRVAAKSYYKDLP GAIGSYTPQNGFERQLVDAMRSGLPFLTPERGIMGYGYDAYYSENPDFHPMYLQQQIRVV YDINDIVSEYLVDYYNSNSRETYDITPVTTCDLSSDTGELFRMDDYPERFFKWADKFINI IC >gi|226332192|gb|ACIC01000128.1| GENE 26 22645 - 23364 681 239 aa, chain + ## HITS:1 COG:no KEGG:BT_2648 NR:ns ## KEGG: BT_2648 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 238 2 232 233 247 50.0 2e-64 MLETNELTRKLRTLLHPRAALIAYAREDNSHYVYDNSYFIEVRDIDESGIMGEGRPVTAE FMNELVRGYSERHSATPYGRIPSNLLWCDPRKGSEKYIWYNSPRKRMMFFKEALKIENAE YHLPGIIYEAGESHLNVYACKDREPTEKTELYAAPFFNVTQASVCLGSARIEKPKDLTYA NLLEYWERKFWLTEFSHLGAHGNPTKSNLVLVTKAAKDRPFDLEELRPLNNLKLKDILK >gi|226332192|gb|ACIC01000128.1| GENE 27 23361 - 24149 618 262 aa, chain + ## HITS:1 COG:no KEGG:BVU_0715 NR:ns ## KEGG: BVU_0715 # Name: not_defined # Def: ThiF family protein, ubiquitin-activating enzyme # Organism: B.vulgatus # Pathway: not_defined # 1 262 1 263 263 330 58.0 4e-89 MKRVHYIDNYLIAPQHPVTVNLIGAGGTGSQVLTCLARLDVTLRALGHPGLSVTLYDPDI VSGTNIGRQLFSDSDIGLNKAKCLITRVNNFFGNDWKAEPRPYPSVLKEVKRDEIANITV TCTDNIKSRLDLWNVLGKMPPASYTDNATPLYWMDFGNTQTTGQVIMGTVLKKIRQPAST LYEAVGSLKVITRFVKYARVKEEDSGPSCSLAEALEKQDLFINSTLAQLGCNILWKMFRN GMIEHHGVYLNLATMKVNPISI >gi|226332192|gb|ACIC01000128.1| GENE 28 24259 - 25386 605 375 aa, chain + ## HITS:1 COG:BH3679 KEGG:ns NR:ns ## COG: BH3679 COG4753 # Protein_GI_number: 15616241 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # 
Organism: Bacillus halodurans # 280 372 164 255 257 65 35.0 2e-10 MEPSIYSYSLCIALPLMSFFGFYFLLAPTPEKAIFNNYLRSRRIMGVAILLLAANYSVHF FFGIRFKNTDAAILMNLSTYFLCYWLFSSALTTLLDRFYITKCRLRTHICLWILFSILSG IVLLLLPKGGLQTTVMFALAAWLVIYGLFLTRRLLRAYHRVIRIFDDTRADDIGAYIKWL SIFTYWAVTFGVGCGLLTFLPDEYIYIWILSSVPFYIYLFLCYLNYLLFYEQVENAMEDG MTSEEEDLCDTTNREQAQRQDTPFFHAEMAKKIKGWIDADGYIRPGLTIKELSDVLHTNR TYLSAYIKTTYKMTFREWITGLRLEYAKNILKEHPEINIQKLAESSGFLSRSNFIKSFTE KEGCTPGKWKKANLE >gi|226332192|gb|ACIC01000128.1| GENE 29 25623 - 26723 1091 366 aa, chain + ## HITS:1 COG:no KEGG:BF3849 NR:ns ## KEGG: BF3849 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 27 366 125 493 493 275 45.0 2e-72 MSRKITFLTLFLWLMTVTFPVIAQQKADTTYTFRFVPQKDMFYVPWNGNDTELARLLECI ENNKATILDGKLPLLVDGYCNSQGSEAENLATAKIRANRVKSELITRAEIKEENFITRNH ATEGDFVTVRLTVPVKETAVTDAEAEARRKAEAERLAAEKRAEQERLAEEQRKAEEARLA AEKAEAEKAAQQNTLADTPSETKITTDYHLSLRANLLRWATLTPDLGVEWRICPSWGIAV NGSWTSWSWSDKDRRYALWEVAPEVRYYMGEKKAWYLGAMFKAGQFNYKFSETGKQGDLM GGCITAGYQLRLNKALALDFNLGLGYLNADFEKYEVIDGVRVRRGNETKDWWGPINAGVT LVWKLF >gi|226332192|gb|ACIC01000128.1| GENE 30 26740 - 27627 953 295 aa, chain + ## HITS:1 COG:no KEGG:BF3848 NR:ns ## KEGG: BF3848 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 294 1 304 304 121 32.0 4e-26 MKARQYINMMGMAAAVLFSSCVKDTLYDTPHPDYGKIAVTADWSARGEGIDIPATWTVTM GDYTGTETSATHTPDHLFAPGSYTLAVWNPAEGITVNGTTATIAASTGTRAGTDAFVNNA PGWFFTYTEQVTIEKDRDYPLTAAMKQQVRELTLVVEPTGDAAGRITEIVAHLTGAAGTL DFATDTYGAASNVVLPFSKITEGDDAGKWKATVQLLGVTGTEQLLTGEIRYADGNPTPTT LKSDLTEALKEFNTGKGESLTLGGTLVETPEGMEVDGAEINGWEEVKGDDVNADL >gi|226332192|gb|ACIC01000128.1| GENE 31 27714 - 28601 798 295 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570946|ref|ZP_04848354.1| ## NR: gi|253570946|ref|ZP_04848354.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 295 1 295 295 525 100.0 1e-147 MKTRFFTLATLALALAACNNDNENLNGDPVAAQFTANIAPATRASGTTWTGGDRIGITDI GNDSQYGNVPFILKNGKFEAEGKVIYIEDTKTHTFRAYYPYNAAGGILAATTDATAQQNK PAIDFLFASGATGDKNNPVVSFTDKTAKGGEDNSFHHRMSQITLTFEAGDGVDFSVVKPE RYTLDGLLLTGTFNTADGIATADNGVQTGELAMNLADGNLTSSIILFPQTVASLPLVVNY KGQEYHATLTMPEGALLAGNNYTYTVKVRNKVLEVSEATIAKWNDIDGGDVGADL >gi|226332192|gb|ACIC01000128.1| GENE 32 28636 - 30438 1486 600 aa, chain + ## HITS:1 COG:no KEGG:DSY3857 NR:ns ## KEGG: DSY3857 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 284 529 1519 1766 3013 109 35.0 3e-22 MRHRLFIPAATALLFALAACTQDELADDNRLPEGEYPVVIRATGLSVEATPLAAHSTRAA VDGDWQGVTSVALKMGDAVKEYTVTASTDFKSATLSRENAPYYWTSRDPITVSAWWPFNN ADITQMPAVKVAEDQSKLADFQNSDFISAENRKVEFNNPTLEFTHRTARVTIELKPGTGF TSVAGATVSLVSLSADNGNPTAIKTYNASGNTYEALTAPQTVAAGKPFFRVELGSGTFYF RPQNNVVLEAGNRYKYTVKVNATGLTLEGCTIGDWADGGGESGAAEDLGYIYDSNTKTYT VYNANGLMNVAELVNGGKTDINITLTADIDLTGKGWTPIGTDYDNAYTGTFDGGGHTITG LTVTTNDEYAGLFGYLGNFGKFGTVKNVVMDGIQITCNHRLGYAGGVAGYSRGTIENCSV SGSVSGTVSVGGVVGAQRDGSITGCSSSATVKGTLNVGGVAGQTIFGATLTACYATGNVI IEIDRTQNISGGGLVGFNDGISLLSCYATGNVTSTGSSTGYVHIGGFLGDNYITVTACYW KNNHEQGIGYNRISSTKATKVDGTSVTWKNAVDAMNTALQNAGSEWRYELKGALPTLRKQ >gi|226332192|gb|ACIC01000128.1| GENE 33 30502 - 32199 1187 565 aa, chain + ## HITS:1 COG:no KEGG:Swol_1076 NR:ns ## KEGG: Swol_1076 # Name: not_defined # Def: Ig-like domain-containing protein # Organism: S.wolfei # Pathway: not_defined # 304 533 166 383 1937 97 36.0 1e-18 MRIRFFALAALALLLGACTQDEAGFLPEGAEGTPIVFTATGLNPAATAIAGTRAPVDGNW EGVQSVAVLMDGTVKAYDVTPSTADPTSATLTSTDPYYWTNHNDITVTAWWPYTAGETTP PAVKVKANQSTQKDFEGSDLIVADGQTVTYGSPTLRFTHRTARVTIVLTDYTEGLASVQL TGLSTEGDNPDIIVPYSKGSNTYTAIVAPQNVAAGTAFITCTFTNGKTFVYKMKNATDWQ AGGEYTYTVSLAAAKDLGYTIESNGSYTVYNADGLMNIAELVNGGKSDINITLDKNIDLT GKDWTPIGTDYDNSYKGTFDGGGHTITGLTFTTNDEYAGLFGWLNRAGTVKNVVMEGVQI TSNQIYGGSIGGVVGYSWGTIENCSVSGSVSGTVYVGGVVGAQIDGSITGCSSSATVKGM VDVGGVAGQTNSSATLTACYATGNVIIEMAPAKNIAGGGLVGMNAGSSLLACYATGNVTS 
TGSSTGYVHIGGFLGNNYTTVTAGYWKNNHEQGIGYNRESTGATKVDGTDVTWQNAVDAM NRALQNAGSKWRYELKGALPTLRKQ >gi|226332192|gb|ACIC01000128.1| GENE 34 32428 - 32925 306 165 aa, chain + ## HITS:1 COG:no KEGG:BF3844 NR:ns ## KEGG: BF3844 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 22 157 11 142 150 88 36.0 9e-17 MGLFHRGKKNNDDEPDSVQTNSFSDIMNGLQYAVNCAQDTLQNHQIQNLTRLFEGTNANN ANTFQSKKIMVGDKTIDIPLIALISHHYLAMDNVQIKFKAKVGSVESQIPENNLLLSSPQ RANLQMQMSNIKPDADDVMEVCVNFKVQETPESISRIIDDFVKNI >gi|226332192|gb|ACIC01000128.1| GENE 35 32947 - 33648 526 233 aa, chain + ## HITS:1 COG:no KEGG:BF3842 NR:ns ## KEGG: BF3842 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 51 231 1 185 188 168 53.0 1e-40 MAEEIKNQGQQENAENLNENVSGQAKEIISAALEETQNRETSPVLKADSNVTDKFKGLPM RELIAAPLIAAAEAQQELAATAWNFYQQIAFDGKSGNKARILEFDVERPIQQDGKMTTMS QSVKAPFIGLVPIPSLLIDRVDVDFQMEVTDTSNVKSTTNAEVEAKASAKHWFINAEISG KVTTARENTRMTNQTAKYQIHVTASQQPQTEGLSKLMDIMASCIEPINTESSK >gi|226332192|gb|ACIC01000128.1| GENE 36 33722 - 34375 -24 217 aa, chain + ## HITS:1 COG:no KEGG:HCH_03798 NR:ns ## KEGG: HCH_03798 # Name: not_defined # Def: endonuclease/exonuclease/phosphatase family protein # Organism: H.chejuensis # Pathway: not_defined # 9 217 2 214 215 122 41.0 1e-26 MHQTSKNSHMKICTWNSQGNPLNDAIKLNILNHLLTTEQCNVVMIQECGNFILPAQHSGR YHYVVVEHAGAYNCRCNTCIIADLNFVASIHYLISGTGRSAICLNYNGCNIYTLHCESGS GAVGDIRDLVRHAVSPFIIGGDMNSTPSELSEYLRIMTTGTRSRPGNSANFACCGMPTHF SGRELDYFLIDSRLQLRTCVRRYHMKGGDHYPVILEI >gi|226332192|gb|ACIC01000128.1| GENE 37 34406 - 36703 1563 765 aa, chain + ## HITS:1 COG:VC1831_1 KEGG:ns NR:ns ## COG: VC1831_1 COG0642 # Protein_GI_number: 15641833 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 279 513 343 584 584 144 35.0 6e-34 MSLRYKIFFGYAILMAVIGSMAAILIYERQRMREIEAETANINLVRRGINTAHRRITGLA TLGEGVVNWNKADYLYYRNHRFQADSLLNSLKRHCREYVRPEQIDTLRALLAEKETHLLH 
IMEMFERRTEADSVLVNQLPEVARRATHIRTIEQKKKGIAGFFGKKEEIQVMPSQKELHD FSDSLIAIHQRQANEMDIYADSLRMRNRELNRTLNKLINDLDEQAQTAFSQRELKMAEAE KMSFFLMAGVIGMAIILLIISHLIIMRDLNRRERDKAELEDTATQNRTLSDMRKKIIITL SHDIRGPLNAISGSAELAMDTRDRKRRNAYLGNILESSRHITRLANSLLDLSRLDDAKET LNEIPFHLESFLESIAEEYTRKANDKGLMFDKAFMGCGITVLGDADRIRQVVVNILENAV KFTRTGYIKFLASYEEDTLSVKVKDTGIGMGENTTQRIFQPFERAAPDLDSEGFGLGLSI TKGLVNLFGGRLSVSSQIGKGSEFKVEIPLRQTNEPARDKPETYTGNLRLPRRVLVVDDD PIQLRNTVEMMERNGISCRACTNAQEVVKALRTGEYDLLLTDIQMRGTGGFDLLHLLRLS NIGNSRTIPIAAMTARNDGDADRYIQAGFAGCIHKPFYTRDLLEFLSSLIGQDRTMDNHS PDFEALYVTTGDERWTLETLIEESNRNSSDLLDSLSQEKLDRKRIWETLHRMYPMWEQLG IAHELESYSYEEYVEDTDESAFRNDVERIVRRIDRLISETKSRLS >gi|226332192|gb|ACIC01000128.1| GENE 38 36734 - 38041 684 435 aa, chain + ## HITS:1 COG:aq_1117 KEGG:ns NR:ns ## COG: aq_1117 COG2204 # Protein_GI_number: 15606382 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Aquifex aeolicus # 5 432 3 438 439 279 39.0 8e-75 MKYRILIVEDNIMLAGQQKKRLEKSGYEAEITIDEPGARKLLKKESFDLVLSDVRLPEGD GISLLEWMRKERMDIPFIIMTGYASVPDAVQAIKLGAKDYLAKPVQMDELQRQLKDIFRP KSVICDKNKDLLPRNSLQMQEVEHLVSTVAPFDISVLILGPNGAGKESVAQRIHYIGERK DMPFVAVNCGVIPKELAPSLFFGHIKGTFTGADANKDGYFEVAKGGTLFLDEIGILSLDV QAMLLRVLQEGTYIPIGGNKEKRANVRIVAATNEDLQLAIQEKRFREDLYHRLCEFEIVL PSLHECPDDILSLAHHFRKKFSGELKRPTEGFSSEAEQLLLSYRWPGNVRELHNRIRRAV LMAKQPLIETADLNIKQEAATEEINLFPENDAEEKHSIIQALKTSHGSRKQAAGILHIDP STLYRKMKKYGLNDK >gi|226332192|gb|ACIC01000128.1| GENE 39 39017 - 39310 298 97 aa, chain + ## HITS:1 COG:no KEGG:BF2920 NR:ns ## KEGG: BF2920 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 97 1 97 100 137 80.0 8e-32 MANYATNIFYARTENKADLDKIEAFLDDTFDGFVNRHSDSVDAEFTSRWVYPEEEIDRLI ASLEAKDKTYIKILSYEFTDEYVSFRIFSQGKWDIKL >gi|226332192|gb|ACIC01000128.1| GENE 40 39330 - 39506 246 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570958|ref|ZP_04848366.1| ## NR: gi|253570958|ref|ZP_04848366.1| 
conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 58 1 58 58 102 100.0 1e-20 MYEYEENSEIIGAHCTLLTPYKGYSEGTVVGDFGNKIVVRLSSGKEVVEYRDEVIIYD >gi|226332192|gb|ACIC01000128.1| GENE 41 39518 - 39877 321 119 aa, chain + ## HITS:1 COG:no KEGG:BF2919 NR:ns ## KEGG: BF2919 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 109 1 109 109 192 90.0 3e-48 MQQIAMKFVQWDVPELEKLKDSRVYQLREQLDNGDKLSREDKNWITRNVKECIHFKRGIA LMGYFFDFSDVLKRYFVKQHGHIAEYYAIDKTALRSVLYGRIEDIVEVELKSKKHESND >gi|226332192|gb|ACIC01000128.1| GENE 42 39861 - 40118 177 85 aa, chain + ## HITS:1 COG:no KEGG:BF2918 NR:ns ## KEGG: BF2918 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 85 1 85 86 134 75.0 7e-31 MKATIEHSFCPYCDEVTELYFRIINTILFSTDEAELRTGMERLQEKTRLDDYFVFGYGKH HLWVCQRRPSNQKKIFEHRVIMAEF >gi|226332192|gb|ACIC01000128.1| GENE 43 40802 - 41182 450 126 aa, chain + ## HITS:1 COG:no KEGG:BF2915 NR:ns ## KEGG: BF2915 # Name: not_defined # Def: putative single strand binding protein # Organism: B.fragilis # Pathway: not_defined # 1 115 1 115 126 187 83.0 1e-46 MKKIENNFVVTGFVGKDAEIRQFTNASVARFPLAVSRLENNGEESKRVSAFMNFEAWRKN ENTGSFDQLTKGTMLTVEGYFKPEEWSDQSGVKHNRIVMVAVKFYPPVEKEETPEKPAKP AKKGKK >gi|226332192|gb|ACIC01000128.1| GENE 44 41270 - 41551 313 93 aa, chain - ## HITS:1 COG:BMEI0877 KEGG:ns NR:ns ## COG: BMEI0877 COG0776 # Protein_GI_number: 17987160 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Brucella melitensis # 1 89 3 91 93 60 37.0 6e-10 MTKAEIVAQISRQSGIEKTVVMTVIESFMENVKESMVAGNEVFLRGFGSFIIKRRAEKTA RNISKNTTIKIPAHNIPAFKPAKAFLNAVKENK >gi|226332192|gb|ACIC01000128.1| GENE 45 41619 - 41723 82 34 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGDIIILLLVFLVVGRLLRGVFGGFSKSSFRDDK >gi|226332192|gb|ACIC01000128.1| GENE 46 41827 - 42438 270 203 aa, chain - ## HITS:1 COG:no KEGG:BF2906 NR:ns ## KEGG: BF2906 # Name: not_defined # Def: serine type site-specific 
recombinase # Organism: B.fragilis # Pathway: not_defined # 1 203 1 202 214 258 61.0 7e-68 MAKVGYIFEANSYDAFDADKEWMRQYGCVQVVEESVGHETLRPRWKQLMSNLERGDELVM SKFSNAVRGLRELSALIELCRIKVVRIISIHDKIDTDNKLFPDTTPAEVLAMFGALPEEV AVLRKSSDKIIRLQQSISIPISKKSMSKTERDKKIVDMYNNGYSIRDIWKESGVQSKSTV YSILNKYKVQLNRTPGRVPGIRK >gi|226332192|gb|ACIC01000128.1| GENE 47 42456 - 42638 76 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570967|ref|ZP_04848375.1| ## NR: gi|253570967|ref|ZP_04848375.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 60 1 60 60 110 100.0 4e-23 MITSANIQVLFEYMNHTILKNENRTFPRVLSSLVFIGNHTIKYDIKKCTKYGFSDAKYQK >gi|226332192|gb|ACIC01000128.1| GENE 48 42677 - 43726 769 349 aa, chain + ## HITS:1 COG:no KEGG:BF2905 NR:ns ## KEGG: BF2905 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 349 1 349 349 399 57.0 1e-110 MKTLKSNLKQPIIFAKIIAVSIAIFLCASCDNGNGKRLAKAKNDPAGTYREYLSDIRRQK KLSIKELAGHLKQWQTLRDSVFLHLECDTLGRLHSTVREECEQIHDSIRMEFSRMVLSQT RTYKDILWVKEQISPHLEDKELHRAAETIRPFFASLDNRPVSQRNRQHILATYRTTLAET IDYGIHCMADLTTFIEKEDAVYRAFLSRLPDFDGENLSDITHDTERCCAQIFLAAERNDI TYRNAMIYLAMRTNRRLIQNIRTCIDDICNKKVKTPAQAQAYIWMLLQPYSSLDGFCLAL LSAEEREQLDMLASQTPDAFRELGRMLHPGDNRLDELPGMLMEAFIHTL >gi|226332192|gb|ACIC01000128.1| GENE 49 43748 - 44200 513 150 aa, chain + ## HITS:1 COG:no KEGG:BF2904 NR:ns ## KEGG: BF2904 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 150 1 150 150 238 82.0 6e-62 MNMLRHFYNDFMAFVPLQLPQLLDVTTMEEPQFYGDYVLLTFPLRTPYELDEVMDMFEDD MELITLYHHIPMRTEKFGHSTCAYSNPAFGQMFKMNAKTDTEGNVNSILVTIYDSLEQMY GDLCLDLELHSKSGFLKYKKDKSDVLMNFI >gi|226332192|gb|ACIC01000128.1| GENE 50 44221 - 44598 306 125 aa, chain + ## HITS:1 COG:no KEGG:BF2903 NR:ns ## KEGG: BF2903 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 123 1 123 123 160 68.0 1e-38 MVYCINAYRTWIEVADDNLYKEHIISRTDRTDYLVSRTLVLRAFKTNGIHAEGTTWTIPE 
HELDKALAIYRKQDITFKQRIKKAAMYFSPKDAETLIRLATYGIVQLELIVRPTPIPEKP YYLCY >gi|226332192|gb|ACIC01000128.1| GENE 51 44589 - 45179 295 196 aa, chain + ## HITS:1 COG:no KEGG:BF2902 NR:ns ## KEGG: BF2902 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 185 1 185 185 192 62.0 4e-48 MLLNILLLIFCAIPFGVSMSLYKNNKRFMTPFYMAMARSGNARKLYVQVWLICLLLFHYV YACGHMGEFGILLSTGVCAAMFSFRRTDNWLRRLLDRPRAFVTLASGALVIGFVPHLYTL AITIAYLLLAALFYPSVRVMSECKDTDTLSGWAKHPGMLSESYHENHHANLPHEADSGNT DISAQYESLKPNENEK >gi|226332192|gb|ACIC01000128.1| GENE 52 45169 - 46014 485 281 aa, chain + ## HITS:1 COG:no KEGG:BF2901 NR:ns ## KEGG: BF2901 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 277 3 280 288 298 57.0 1e-79 MKNNQKIPVSILADREVFEYLKEKGDERKTRTEAYCDLLDKSLAGFVSPFLRKKAYVLQP NQCHLTVSDLASEWHWHRATVRSFLSAMEAFGLLTRIQLPKSVVITMTVQSGQAAQPRNA QEQPDFARQLREVLSDWVIGKTTAAETGVICGQLVSLAKTEIADRDTGLCLDTHSNTTSA HSGTLVTEHRETALCCIAHAALQKILHKSRFDDSSPLVDFFRFDLGEEWAAFIESAKDLA GLILDTEASVTDFDMDEDQERLKSLRKPFLSLLAKAQAMVD >gi|226332192|gb|ACIC01000128.1| GENE 53 46232 - 47689 1248 485 aa, chain + ## HITS:1 COG:no KEGG:BF2900 NR:ns ## KEGG: BF2900 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 485 1 485 485 751 84.0 0 MADQKQVLDMQVSKGITAAQSNEHLRDRSERAEKYAMNKGNYDPTRKHLNFEIAPGGKVR PIDTSRNIPERMADILERRGIKDPNEGLAEPKYRTVVNFIFGGSRKRMHELAFGTQNVDF DEGANNTRIKRMSDIERWAKDVYSFVSGKYGEQNIAAFIVHLDELNPHVHCTLLPIKDGR FAYKEIFAGKDKFEYSARMKQLHTDFFAEVNTRWGMSRGTSISETGARHRTTEEYRRMLS EECTTIEENIDRHQKVLATLQSDIRLAERRVKGLTTMVDNLEKSKAEKEALLSAAEQDLK ANKGDAEQLAAQVKSLEKELAGINRQLADKQEKLQTADRQLAELKENMDAIEERTGELKE EAYKYSHDIHSKVDTLLKDVLLENVVGEYRNVSAQMDVAERQLFDGSLVQSIAEQGTDVM HCATMLFLGMVDDATTFAETRGGGGGGSDLKWGRDEDEDNRAWALRSMRMASRMMRPAIG KKPKR >gi|226332192|gb|ACIC01000128.1| GENE 54 47744 - 49144 690 466 aa, chain + ## HITS:1 COG:no KEGG:BF2899 NR:ns ## KEGG: BF2899 # Name: not_defined # Def: putative outer membrane protein # Organism: 
B.fragilis # Pathway: not_defined # 1 466 1 476 480 687 70.0 0 MTKQIFFILAVLCTLQAQASIQPVQVDTVQHTPYYNVSEELQPIQPVYLDGVVLPASLTG NWFVSIAGGTSAFLGTPLGCEDLFGRLKPSYSFAVGKWFTPSVGARINYSGVQFKDGTLS NQDYHHIHADLLWNVLGCRYARQEQVRWNLAPFAGVGLLHNASNGNNPFTVSYGVQGEYR ISKRVSAMLELSGTTTFQDFDGYGRPNRLGDHMVSLTAGFTFHLGKVGWKRAVDATSYIR QNEWLVDHANILSGENKRYKDWYDRNRRTVAELKKILEIEGLLDKYGHLVNDDETDRRQG YPRNNYSGLNSLRARLKNRHWDGISPLDSASIGYGNNGDDAEKPGIIVSERTELIQAGNC IGSPVYFFFNLNTAHLTDASQMLNLDELARVAKKYGLSLRVTGAADSSTGTPVLNGSLST SRADYIVAELKKRGIPIERIVKVSRGGIADHVPVEVNRHTKVELFF >gi|226332192|gb|ACIC01000128.1| GENE 55 49225 - 49821 267 198 aa, chain - ## HITS:1 COG:no KEGG:BT_1949 NR:ns ## KEGG: BT_1949 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 183 4 182 188 285 71.0 6e-76 MTKEYIIENFTANISVDEYISRFRDEKRFVEFCKQCPNYGNSWGCPPFDFDTGEFLRQYE YAHLMATKIIPVEKNIPIDRTQELIKPERLRIERELLEMEHRYGGQAFAYVGKCLYCPDS ECARKCNRPCLHPDKVRPSLEAFGFDMTRTLSELFGIELLWGKDGILPEYLVIVSGLFHN SAENIISHTKRNQDSGNL >gi|226332192|gb|ACIC01000128.1| GENE 56 49818 - 50576 227 252 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 208 1 209 245 92 30 1e-17 MIKLHDFSIGYGERTLLCEVETTIEKGRLTALIGRNGTGKSTLLRAIAGLNRRYTGRILL DGHNAADMRAAEMARTLAFVTTERTRIANLKCKDVVAIGRAPYTNWIGKMQEVDKEIVMR SLASVGMEAYAERTMDKMSDGECQRIMIARALAQDTPIILLDEPTSFLDMPNRYELCTLL ARLAHEENKCILFSTHELDIALSLADAIALIDPPQLSYMPTEEMRRSGCIERLFRNNCVT FDATTGFIKVGQ >gi|226332192|gb|ACIC01000128.1| GENE 57 50573 - 51553 844 326 aa, chain - ## HITS:1 COG:alr4032 KEGG:ns NR:ns ## COG: alr4032 COG0609 # Protein_GI_number: 17231524 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Nostoc sp. 
PCC 7120 # 6 323 22 356 362 241 49.0 1e-63 MRSRSAILFAMLAALTLFLFLLDLAVGAVAVPLGDVWAALTGGDCPRATAKIILNIRLIK AVVALLAGAALSVSGLQMQTLFRNPLAGPYVLGISSGASLGVALVVLAGFGSSIGIAGAA WLGAALVLVVIAAVGHRIKDIMVILILGMMFSSGVGAIVQILQYLSKEESLKAFVIWTMG SLGDVTFDQLAVLVPSIIAGLLLAVVTIKPLNLLLFGEEYAVTMGLNIRRSRGLLFLSTT LLAGTVTAFCGPIGFIGLAMPHVTRMLFRNSDHRVLVPGTVLSGAAVLLLCDLVSKMFTL PINAITALLGIPIVVWVVLRNKSVTA >gi|226332192|gb|ACIC01000128.1| GENE 58 51554 - 52693 854 379 aa, chain - ## HITS:1 COG:alr4031 KEGG:ns NR:ns ## COG: alr4031 COG0614 # Protein_GI_number: 17231523 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Nostoc sp. PCC 7120 # 55 379 97 426 426 209 34.0 5e-54 MKALKNLSLLLLLVLAFTGCHNKSSKLADFNRTVYTPEYASGFDIKGADGKKSVLVTVTN PWQGADSITTNLFIARDDEEVPADFTGQMLKGDAERIVCMSSTHIAMLDAIGETGRVVGV SGIDYISNPDIQARRDSVGDVGYEGNINYELLLSLDPDLVLLYGVNGASSMEGKLKELDI PFMYVGDYLEESPLGKAEWLVALSEVIGKRAEGEKVFAEIPVRYNVLKKKVADNILDAPS VMLNTPYGDSWFMPSTESYVARLIKDAGGDYIYKKNTGNASAPIDLEEAYLLASQADMWL HVGMANTLDELKAACPKFIDTRCFRGGQVYNNNARTNAAGGNDYYESAVVNPDLVLRDLV KIFHPELVEEDFVYYKQLK >gi|226332192|gb|ACIC01000128.1| GENE 59 52698 - 54776 1201 692 aa, chain - ## HITS:1 COG:no KEGG:BT_1953 NR:ns ## KEGG: BT_1953 # Name: not_defined # Def: putative TonB-linked outer membrane receptor # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 684 6 688 692 1049 74.0 0 MTYTKYLLFSILVVCPSVLSAQGITRRIHQIDEVTVWGKRPMKEIGVQKTKFDSLALKEN IALSMADILTFNSSVFVKSYGRATLSTVAFRGTSPSHTQVTWNGMRINNPMLGMTDFSTI PSYFIDQASLLHGTSSVNETGGGLGGLVKLGTAPEVAEGFNAQYVQGIGSFKTFDEFARF TYGSERWHVSTRAVYSSSPNDYKYTNHDKKINIYDEDKNIVGQYHPKERNRSGAFKDLHL LQEVYYNTGKGDRFGLNAWYINSNRELPMLTTDYGDATDFENRQREQTFRSVLSWDHMKS NWKLGVKGGYIHTWMAYDYKREVAPDNWASMTRSRSKVNTFYGQAEGEYSLDKKWFFTAN VSAHQHLVRSEDKNIILQDGGKAIVGYDKGRVELSGSVSAKWQPIDRLGMSVVLREEMYG SDWIPLIPAFFIDGIISPKGNVMLKASISRNYRFPTLNDLYFLPGGNPNLKNEQGFSYDA GVSFDVGKKGIYKLSGGANWFDSYIDDWIIWLPTTKGFFSPRNVKKVHAYGVEVKANFAV 
QPAKDWLIDLNGSYSWTPSINEGEKMSPADQSVGKQLPYVPKHSASLTGRLSWRTWAFLY KWAFYSERYTMSSNDYTLTGHLPEYFMSNVSLEKNLFFKPVDIQLKFAVNNLFNEDYLSV LSRPMPGINFEFFIGITPKFGKNKKKSENTNM >gi|226332192|gb|ACIC01000128.1| GENE 60 54773 - 55216 166 147 aa, chain - ## HITS:1 COG:no KEGG:Slin_4541 NR:ns ## KEGG: Slin_4541 # Name: not_defined # Def: transposase IS200-like protein # Organism: S.linguale # Pathway: not_defined # 26 120 44 138 148 65 30.0 4e-10 MTIPMTTKGMILDQMVQIFTRKGCHVYACNAFLNHVHILVEIPSPTDFAKIINKVKSSTG VVYRKYPEYADFSGWAAGFDSFSVSFNDLNRVKHHIEKQETVHQELSFEDEYDQLLEENG FNAYQDFMRASFSAYAAVNNHIKNKRK >gi|226332192|gb|ACIC01000128.1| GENE 61 55270 - 56349 1091 359 aa, chain - ## HITS:1 COG:no KEGG:BT_1954 NR:ns ## KEGG: BT_1954 # Name: not_defined # Def: putative surface layer protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 23 359 31 366 366 590 83.0 1e-167 MTYHQLKYLLWSVVVGVTLSLTSCMKWDYGDAVEDFNATGAGLFITNEGNFQYGNATLSY YDPATKQVQNEIFFRANGMKLGDVAQSMTIYDNKGWVVVNNSHVIFAIDLNTFKEVGRIE NLTSPRYIHFLSDEKAYVTQLWDNRIFIINPKKYEITGYIQVPDMTMESGSTEQMVQYGK YVYCNCWSYQNRIIKIDTETDQVVDELKVGIQPTSLVMDKYNKMWTVTDGGYEGSPYGYE APSLYRIDAETFTVEKQFKFKLGDWPSEVQLNGDRDKLYWINKAIWSMDVTASHVPVRPF LEYSGTIYYGLTVNPANGEVYIADAIDYQQQGMIYRYSPEGKLIDEFYVGIIPGAFCWK >gi|226332192|gb|ACIC01000128.1| GENE 62 56434 - 58488 1108 684 aa, chain - ## HITS:1 COG:no KEGG:BT_1955 NR:ns ## KEGG: BT_1955 # Name: not_defined # Def: putative cell wall biogenesis protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 680 2 692 696 211 28.0 6e-53 MNKIYHFILFCFATLCLAACSDDDPEVSGIDGKDHFISEFALTVDGITYQAMIVGDKITV EIPYNTSLKGATVEYALCEGASINPNPSTIEDWENEWKFVVTSKMQDSKVYSYTYQYTDI EQSGSVVLATQAEVDNFAKTGINKIEGSLTIGTADGEEITNLDGLANLKQISNSLVINPS YKGTDLTGLDNLEQLGSFKLGSTTSASKNIMLKTVNLPSLLGVTGDFVVNSSVIEKISIP KVEFIGEDMYITSDALLDLDANAVESVGASLIVKGSVAQKESATTEAIVFSALKQVGNEL TIQYFPKLQGIYLPALESVAGTASFSDMSSIGSLAMTELHSVGGLTIKNCKEISIVELPG LISCGETSVDANKVNKLNIASLKDVLGDMTLTNLLIEELDLSQINFNGNTLTLQCKQLNK IVGSETFNGSLFLLPKDCRLTEFTLEGISNIQGDFQCIDYFYVKEFVMPFIRVAGDMTIA 
LNSGSVNTAAEIEFPKLQEIGGTLTLGTNRNANNITFPLLKKILGSCSVTTYKLKNDIEF TNLESIGTDGADAQIKFEIEATNILCPKLKTINGKFDIATSSFMFDMEVDKVSYPNVESI SENLSITCPYSDFGSNGILSIDFSGLKSAKGISISGQGDVTDFSSFKYLFENNVLTGESQ WSVKECGYNPTFQEMKDGKYKLAE >gi|226332192|gb|ACIC01000128.1| GENE 63 58497 - 60272 1477 591 aa, chain - ## HITS:1 COG:no KEGG:BT_1956 NR:ns ## KEGG: BT_1956 # Name: not_defined # Def: putative cell surface protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 584 5 590 593 537 49.0 1e-151 MKKYLLLFSLTIFLFACNKDEEISQDVILPPVIELDSEDGIYVVKIGKEVVIEPTYQNVD YAVYSWKCNGRIISDEPQLKYIFNECGSYYVTLRVDTRDASTEEEIRVDVNELAPPVISL VTPSIGLKVVAGREYILTPDIQNAEGATYLWTLNGNEVGTENTYTFKQDELGTYELTLTV TNEDGQSKKTVSIEVVDKLPIEIVVPSSLYFTENNTKYVELGRTLFVRPFVSISAEPSYQ WSLDGQPIEGANSLVYGFKPAKTGEHTLTFTVKYDNQDTKAVLTRNISVSGVDEVSVNIP VKCCEAAGKRPFAAGNSIYSNKVYEFVPAPGQFVNETNTAGFNGESTHEAACAYAQKRLD NEQYVSLGGWGGYIVVGFDHSIENKGGYDFSIKGNAFDSSNEPGIVWVMQDVNGDGLPND EWYELKGSEYGKPETIQDYAVTYFRPGPNMDTQWQDNKGNKGAIDRLGNYHPQEFYYPLW IEEDSYTLYGTCLKARTEQSPSTGMWSNNPFGWGYADNIGDDMPNKDNPNAGALGNYFKI SDALNIDGTSANLSHIDFIKVQTGVNVKAGWLGENSTEVFKFCDENNNNDK >gi|226332192|gb|ACIC01000128.1| GENE 64 60400 - 61368 713 322 aa, chain - ## HITS:1 COG:no KEGG:BF3033 NR:ns ## KEGG: BF3033 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 318 1 304 305 124 35.0 4e-27 MKKNFRFLAMAVVAMAAAVFTGCSSDDDFLAPYEESAIQTRAISSTNALIDFDNVPSSVM ASDQYGNNLYSATANGKQVTTGYITQIGQTGTYIQFPINYLEQEWVSGQPWEYEFWNGGF AISNFHNLTQGDYQNQCSVYWPNGGHSGKNFAVAFGYSDSYNDSQATYDKCAKIYLTDAT GYRVVTTNTPVKGTPKYGKFNSVWVCNTTYTYLVMKDGNSFTQGSLSAQKGWFKVVFVAL DATGKPTGKEVEYYLANFDSSKDAESGLTNKIRTGWNQVDLSGLGDSVCTVAINFEGSDS SAYGLNTPAYVAIDDIDVTVNE >gi|226332192|gb|ACIC01000128.1| GENE 65 62233 - 62856 259 207 aa, chain + ## HITS:1 COG:no KEGG:BF2893 NR:ns ## KEGG: BF2893 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 126 1 127 161 197 71.0 3e-49 MHSRIFQISTEPIDKENYLNEDTLQQGDGSFYDYCSEIDEEDRKEDIANLVNHALPKGMF 
ELISDDTMRYNGGIEQWKEEYVANIKKRANALTADNMLEWGSTYYLKQAVENPLDVAYHF YLDGDGCSLLPNSPLRLWSLFVGLNRERYSTSEELLITTSDVCPKHWQPPASKVGGCSLF CTGSPVTFINMLCRASATGFTAGIFLF >gi|226332192|gb|ACIC01000128.1| GENE 66 63141 - 63608 293 155 aa, chain + ## HITS:1 COG:no KEGG:BF2897 NR:ns ## KEGG: BF2897 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 149 1 152 156 200 61.0 1e-50 MSKYDFIKQGNLLFWHTADNDIECRIISTPEKVDSDSIILISTSSSETEVLASELLPIGS SRSHKEEFMRWKKEREAEGMEFFSRLSEVMETDSDLAVGDMVAFTNDYGVVFGPKEVLAF RKPWNGYRCVYIDSDAYWFPDRPEQLTILSKGGTE >gi|226332192|gb|ACIC01000128.1| GENE 67 63605 - 63859 178 84 aa, chain + ## HITS:1 COG:no KEGG:BF2891 NR:ns ## KEGG: BF2891 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 7 83 18 94 94 125 76.0 4e-28 MTTNDILKRLCGNIAAGRFNWRKYCTPQSYFGWEICVTPLHCSYGQIGYTVHFPYTNIPE VEYDWEMGKLTIDGEKWKSYLRNE >gi|226332192|gb|ACIC01000128.1| GENE 68 63874 - 64248 358 124 aa, chain + ## HITS:1 COG:no KEGG:BF2890 NR:ns ## KEGG: BF2890 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 122 1 122 123 209 88.0 2e-53 MAQNFYTKWQDAILADAGDYVSKEYRSFQTALVREISKYAAAVGAKVASNSKGHYDTSCF IERNGKFVYISHSSGLSRMGSGVRIELDSFLIRTAQNGKDYRGGCNQYCDIANLQSMIDG LLGK >gi|226332192|gb|ACIC01000128.1| GENE 69 64301 - 65989 1275 562 aa, chain - ## HITS:1 COG:XF2061_1 KEGG:ns NR:ns ## COG: XF2061_1 COG4227 # Protein_GI_number: 15838653 # Func_class: L Replication, recombination and repair # Function: Antirestriction protein # Organism: Xylella fastidiosa 9a5c # 23 340 224 514 522 92 27.0 2e-18 MAGYRKKNADGPNSEDKALDLFAEMMIEKIEGIQKDWKKPWFTEGTLQWPRNLHGREYNG MNAFMLLLHCEKEGYKIPRFCTFDCVQKLNKPGKDGEELPRVSVLRGEKSFPVMLTTFTC IHKETKEKIKYDDYKKLSDDEKEQYNVYPKMQVFRVFNVAQTNLQEARPELWQKLERENS RPAIEEGEHFSFAPVDTMIRDNLWICPITPKYQNDAYYSITKNEIIVPEKEQFKSGESFY GTLFHEMTHSTGAENVLDRFKPTTFGSPEYAREELVAELGSALVAQRYGMTKHIKEDSCA YLKGWLDELKESPQFIKTTLLDVKRATSMITQKVDKIAQELEQNVGEKQENGAAAKENTF 
YSSVAYLQFSDDTRQLDELREKGDYEGLLTLAKEYYDGNGINEQHTYLSATNNKGDSLIA EDENFAVVYNGSVGGTYEVMLKFTEQEIRDHIRRYGVDIAGDTIKGVAREMAAEQFSALA YQKIPAFEMPNGEVLYVEYNKESDTLDVGQPTNAGLVAQHRFPYDHNVGLDANLQAVNEK LNELEEYRAELQEAEYGSGMRR >gi|226332192|gb|ACIC01000128.1| GENE 70 65995 - 66603 582 202 aa, chain - ## HITS:1 COG:no KEGG:BF2888 NR:ns ## KEGG: BF2888 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 202 1 203 203 313 77.0 2e-84 MIKCNVTVCGVIGRDASVRKNKEEKEFLVFPLRVQIPATGGGHTIEVDVRKEGCQEEAAG YRNGSRVEVRGTMYLKRRGDKLYFNLFADEICNAATDDADCVKGELVFRGKVGQNIEEKR DKKDQPYTVFSAFSAEKVDDGFEYQWVRFFCFGKEREAWLQPGVKVDAKGEMNLSAHNGK INLSCKMEELTQYVADSSNYNQ >gi|226332192|gb|ACIC01000128.1| GENE 71 66876 - 68432 1132 518 aa, chain - ## HITS:1 COG:SPy1438_1 KEGG:ns NR:ns ## COG: SPy1438_1 COG1705 # Protein_GI_number: 15675348 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Muramidase (flagellum-specific) # Organism: Streptococcus pyogenes M1 GAS # 22 161 18 163 174 79 38.0 1e-14 MTKNQEYALQYADYAMAQMRRYGIPASVTLAQGILESSNGQSRLARNENNHFGIKATSSW IAEGGKYGIYTDDKPNEKFCSYDSVGDSYEHHSRFLKENSRYAGCFKLSPDDYKGWAQSI EKAGYATGGKYAENLQKIIEQNGLQKYDRQVMQEMTAQGRQFGVEHNPLQTSESAEHGTG YSFPVEREEFLFITSPFGTRQDPLDSTKQQMHKGVDIRCKADAVLATESGGKVVAVNQNK NTPGGKSLTVEYARTDGSKVQCTYMHLKEISVKVGDTVQAGGRLGTSGNTGTRTTGEHLH FGVKNIYADGTKRDIDPAAYLAEIAQKGHIKLQMLHNGNDLLVKYKAAEDTVPEKNLSPD GWMKKLLSSEDSGVGISGCNDPIVEMAMTAFASLMLLATQIDNREEEEQKTAISAAMDLR TIDLKPLLPNMKNCDLTIGENGKAILKADNGELHVSRELTASELNRLSATLNNGTLTEEA KQMRVTGMLNTVILSEAASQNFEQGMTRQQGQTENLRR >gi|226332192|gb|ACIC01000128.1| GENE 72 68445 - 70706 1680 753 aa, chain - ## HITS:1 COG:no KEGG:BF2885 NR:ns ## KEGG: BF2885 # Name: not_defined # Def: putative DNA primase # Organism: B.fragilis # Pathway: not_defined # 18 753 1 732 732 1110 76.0 0 MKEKSQIEKKAEEKQTELLSAALGGASNAGGHWLNVSGKGFPRLYPQGVSASPFNALFMA LHSDNNGCKTNLFTLYSETKARGAAVREHEQGVPFLFYNWNKYVNRNNPNETIDRTAYLQ 
LDEEQKAQFKGVHNREIRTLFNIDQTTLPYVDKPAYEDAVKQDGSVQERGYAEADNRRLR TRFNDFLLKMRDNLVPVRSDGSGVPHYETDKDAVYMPRQKDFEHYHDYVQEALRQIVSAT GHQQRLAREGMVMKNGVAPSEDAVKYERLVVELASGIKMLELGLPARLSDASLKTVDYWC REFKENPCIMDALESDVNNALDVIRKAERGEKIEYATLRNRRQTTTMQEQMPKHYFVANE IRQHPDKAAKSIVLVIDREAKSADVILPAGASTEANNEIPGMNKGRIERALQKEGIEQVR FYNTDGALGYRPDDSYFNEKMVTLARLRNYTLEKLSTLDVSEAVRRANEVGFDAVQMIQD DKNRWALYIKPENKEGYSVYPDKEDVNRFFSTLKQAMDNIDKVRMELAHKYYALAEVKPD LKVDLFGSDTPEIDLNRIQRVAIFKTRQDGIQCVATIDGQKLQPRSVTPQQWQRMWVAED RDGYKRHLAATLFADVLQKGQSVGEQKAEEQVRQQNEVVAENRSEETNDENMSPKRQFWD NLKEKHPDALYLIRAGEVYRLYNEDAAKGADILGITMKKYPERGFSAFAEFPRTQLDSYL PKLVRAGERVAISEIELQDKRQDEQETHRGIHR >gi|226332192|gb|ACIC01000128.1| GENE 73 70734 - 72533 1455 599 aa, chain - ## HITS:1 COG:no KEGG:BF2884 NR:ns ## KEGG: BF2884 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 599 1 596 596 776 69.0 0 MDGIRHSRQFAEIERLVSDYFHCHLAPVMSKTQAFLTKKQGEEMREYSTSLGGILSAMAS AAQPMGDPYQVLKVTGEWNSKTTEDYIGMCKAEITGSKEIQQDLAYMAGQWRDTVVREIG RERYNELSEQLGGDLAYAYMDYRVEELMIDRLVKERMPKSSADYIIRKAAESSLLGLSQT LNRSPLAEEIEARGEAAYRPKGWEKGAGRVLGATADAVMMGGAGSWATLAKFVGADVAIS AVASRFEPEKSPKLSVEQCISKGVFGSERNVFTDFRKEAATVQTGDSALIVAANEQLKKK IPVMNFNFLEWMQTPKFTPFQMSEKPGHPEKKNERAERYKDVPLIVAPGQEEAYLKHLEK YKNATAVKSTKESTQPEMEQREKVEKEERQVVIPHEEEKQEREAVQTNTNGWSGLLGTLG LEGLGDITGNLGYVMAMLPDILLGAFTGKTQSLRFGDNLLPIASIVAGLFVRNPLLKMLL VGLGGANLLNKAGHEALGRPMPSADVHTENQYRRYPDEPLNSRIVNPVLQGSTLIATIDR VPCTIQLTPTVAEAYRAGALPLNTLANAVLAKSDQLRHIASQNYDDGQRETIVRTRGIQ >gi|226332192|gb|ACIC01000128.1| GENE 74 72542 - 72823 189 93 aa, chain - ## HITS:1 COG:no KEGG:BF2883 NR:ns ## KEGG: BF2883 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 44 88 1 45 50 70 84.0 2e-11 MAELDIDIQSFDIPRLVTVYPDRAGVRWWTKAWFNNREEGETSVEIEREQAIRFMQDNIE KDAWLEEFFPRQMEVYHNAIEQTKEQLLKQINI >gi|226332192|gb|ACIC01000128.1| GENE 75 72847 - 73677 686 276 aa, chain - ## HITS:1 COG:TM0409 KEGG:ns NR:ns ## COG: TM0409 COG0739 # 
Protein_GI_number: 15643175 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Thermotoga maritima # 29 135 140 254 271 63 33.0 3e-10 MKYTEEMILQSDSGYCMPFEEQQGKDVELSLGYGEQTDPATGEKFFHHGIDFNVRCYMLS ALASGIVSGVGNDSGHGICQTIRYGEYEVTYGNLSNVFAQFGQRVKAGQTVALSGDKLHM TVRFKGEELNPLEFLTMLYGNIQAMRQAGGHETDYLSGLEMELKTDYERDKREIEELMLH FLPHYMEDLRHGAYTLPRNTEQSLRHIFTVGAMKEYFYENMPSMANPLGLGHKAMPLACK VQNLLIADFLHYLALRHDVYLSTASSDIKKNSMTKP >gi|226332192|gb|ACIC01000128.1| GENE 76 73703 - 74368 626 221 aa, chain - ## HITS:1 COG:no KEGG:BF2881 NR:ns ## KEGG: BF2881 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 221 1 214 214 286 78.0 4e-76 MGVWSTLLKYGGKAMKGVGTATATTGKSMGKAVLHPSQTLRGAGQAVKTATIGGAVGYVG WEKLTTDKSVVRIVSDAVIGEPATNTLAETADGVRELTGKAGEAVSSVSGAVTGIDNKLN GVSNFLRQASGGGGLDMFGNFFRNLGSGNVSGLSIAGLVAAAFLIFGRFGWLGKIAGAFL GMMLIGNNAGVFRTPETENVQRTRTPELPAEEQTHGGGMRR >gi|226332192|gb|ACIC01000128.1| GENE 77 74382 - 75065 555 227 aa, chain - ## HITS:1 COG:no KEGG:BF2880 NR:ns ## KEGG: BF2880 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 227 1 228 228 380 84.0 1e-104 MKRTIWIGILLLCIGIGKVRAQNDPVLAGMILLYTDKAEKELKNQEKVMLIQTTGHLWTK EEVKATTDLQREFNNYLDSFRSIICYAAQIYGFYHEISRLTDNMGDFTRQVSRNSPNALA VALSTQRNRIYRELIMNSVEIVNDIRMACLSDNKMTEKERMEIVFGIRPKLKMMNKKLQR LTKAVKYTTMGDIWREIDEGARPAADKRDIVEAAKRRWRQIGRNVRP >gi|226332192|gb|ACIC01000128.1| GENE 78 75079 - 75768 475 229 aa, chain - ## HITS:1 COG:no KEGG:BF2879 NR:ns ## KEGG: BF2879 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 229 1 231 231 369 84.0 1e-101 MKSRIISLVIFALCLMPHWAKAQITASNPLEWMALAEGNEVINDQIEKQINGQTKTAMLQ NSIAAEFNRIHKWEKQYNSYLKTVSGYASSLKACTHLYNDGVRIFLTLGKLGNAIRNNPQ GIIASMSMNNLYIETATELVSVFTLLNDAVAKGGKENMLTGAERSKTLWALNDKLSVFSR KLHLLYLSIRYYTLNDVWNNVTAGMLDRNNGEAARMAMSRWRRAAVLAR >gi|226332192|gb|ACIC01000128.1| GENE 79 75749 - 76246 
269 165 aa, chain - ## HITS:1 COG:no KEGG:BF2878 NR:ns ## KEGG: BF2878 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 11 165 7 161 161 204 72.0 7e-52 MMTDAPTGHFQSIVLQPHAGQFVIDELPAIVLCCAAWVYGGMEGLPLTALAVSVAALLSL ALLYRFIYLRRTRYHIGSEQLISRHGVLSRKTDYMEQYRIVDFVEHQSLMQQLCGLKTVR IFSMDRNTPRLDLVGIRHNFDVVTLIRERVEYNKRKKGIYEITNH >gi|226332192|gb|ACIC01000128.1| GENE 80 76243 - 77631 916 462 aa, chain - ## HITS:1 COG:no KEGG:BF2877 NR:ns ## KEGG: BF2877 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 310 1 310 563 521 80.0 1e-146 MREGELTYDDFLRRLNIQDVLIDAGYHLNRRDGLRYPSYVRLDSEGRRIRNDKFIVTQQG KCCFKPQQQKSYNIISFIKEHPHFFAEYHAGVSPDRLVNLVCNRLLNHPVADRDTRIIQP KRDVKPFDMADYDIHQFNPQDRATQKKFYPFFKHRGIDLYTQYAFHRNFCLATKHREDGM KYTNLAFPLTVPKDTGQVVGLEERGRPRMDGSGSYKGKAEGSNSSQGLWIASPAKTTLTE AKHIYWFESAYDAMAYYQLHQANDKDLRKAVFISTGGNPTVEQMRGVLTLSLPAKQHICF DTDLAGIEFAKNLQQEMYRAVRSTIEETPERKPYLDSVADGKNLDEGDIDLLPDALRSSY GKYESAWEEAMSMRSSGLCHPDDIREQTDIMNGNYKEFREGLREFLGLDKANDASFVREQ PTYPNKDWNEQLLAGQKQEETVDETQAREQSPEEEQQTHFRR >gi|226332192|gb|ACIC01000128.1| GENE 81 77635 - 79212 1258 525 aa, chain - ## HITS:1 COG:no KEGG:BF2876 NR:ns ## KEGG: BF2876 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 525 1 522 522 872 83.0 0 MAIRTNTNPRQMDLQPEMRDMLMRNGLQAHVALDDAGYRLIVQGHDSPLLVYPITERQML ALTDWGTNTANKKAYNVLTSIVGKDFYMPKNFVHARNANGRVAMGLHGYRIVIGEYGRMG RMGMPPPFLGWTPRNQLGFHLRRVGGQLFFPGPSIVPERPDGRMKPGELQSGGYGFYYKG HQQEQPVQQRDVLQDLQAAITPMVSRPRSKEPARPYKELIASPVYFSNEKWAECLASHGL VVDAEKQTLTVQSESVQADMVYDLTEEEVRTLVAAPIEEQPVEKRLELLNGIIGADFSDK VTMEALNSDQRIAIGLHPEVQQDLKQRQRQEQEAFMPVKTSLQQQEESVQGNIGAVVDGR DLQFLNENKGWYREEKHGREVEVSQIAVQPAQTEGKYRMTAVIDGQAISHEITQRDYDKF LAVDDYHRMKLFSKIFSEVDMKTRPEAQRGLGTKIFAALTAGTVVAAGVAHGIHHHCHAP EFYGECFGGPPRPYFKPGVDTPRDVAIRCFEAEMNREINDMRMGR >gi|226332192|gb|ACIC01000128.1| GENE 82 79229 - 79453 298 74 aa, chain - ## HITS:1 COG:no KEGG:BF2875 NR:ns ## KEGG: BF2875 
# Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 74 1 74 74 100 78.0 2e-20 MAQNEYYPEDVLVGKMQSGEYGWLDYVNHFSADWQEEYARYCEEKGLAVGNESAAEFVRF KDEQLEAAMESGDA >gi|226332192|gb|ACIC01000128.1| GENE 83 79472 - 80230 765 252 aa, chain - ## HITS:1 COG:no KEGG:BF2874 NR:ns ## KEGG: BF2874 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 252 1 252 252 425 80.0 1e-118 MKKVQIEFDELPFSTLERFGLTREMIEDLPMRVLEDICNGRHSPVLPVRVTDEHGGQIES RSRFAFIRMDDGQVDVVFYPALKSSPLERYDEAQQKQLLDGKAIVADVEMSDGRSSKAFV QIDAETKQVMYVPTPIIARNLKVLAEVMRLGTVEVNGMQHGEPLTVVVGGEPATVGIDLH AKTGIRICSGDAQQWRNQPKREWDKYTFGCYGCWVMDDDGNLDYVSEEDYTEELWNEQKK SGERNRATALHK >gi|226332192|gb|ACIC01000128.1| GENE 84 80450 - 81103 298 217 aa, chain + ## HITS:1 COG:TP0864 KEGG:ns NR:ns ## COG: TP0864 COG0739 # Protein_GI_number: 15639850 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Treponema pallidum # 79 195 423 540 546 99 45.0 3e-21 MRKILLIAISGISFCTMPVQAQFNTIAKTPERYKVEALQEGMKKPEPTPESMAPAQETST KPADESKKLWIDRYLSVSYPLQRIRITSPYGYRKDPFTGKRKFHGGIDLHARGEQVLAMM EGVVVKVGQDKTSGKYVTLRHGNYTVSYCHLSRVLAAKGTVVRPRDAVGITGSTGRSTGE HLHVTCKLNGKNINPSVLFDYIKSMQQECVSALAGLL >gi|226332192|gb|ACIC01000128.1| GENE 85 81109 - 82278 720 389 aa, chain - ## HITS:1 COG:MT0627 KEGG:ns NR:ns ## COG: MT0627 COG1373 # Protein_GI_number: 15840000 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Mycobacterium tuberculosis CDC1551 # 6 350 4 373 411 157 28.0 5e-38 MEAKYIHRELSAVIEEAYRYFSVITVTGPRQSGKTTLLRNLFSYLPYYSLENLDVRSFAE NDPVAFLNQHTEGMILDEVHNAPNLLSYIQGMVDNDADRRFILSGSSQFAMLKKVTQSLA GRTAVFELLPLSYSEIREQITDTPLDNLLFNGFYPAIYSGRNIPKFLYPAYMKTYLDKDV RDLLQIKDMMQFHTFIRLCAGRIGSLFKASELANEIGVSSHTVTAWLSVLQASYIVFLLP PYFENTRKRLTKTPKLYFTDTGLACHLLGIESPEQLARDKMRGALFENFIVTEALKRRYN QGKESNLYFYRDSNQNEVDLLLKKHSGLYGIEIKSAMTYHADFEKALKQMDGWVKETILG 
KAVAYAGTLENTAGEIKLLNYSHLDEVLA >gi|226332192|gb|ACIC01000128.1| GENE 86 82668 - 82892 269 74 aa, chain + ## HITS:1 COG:no KEGG:BF1096 NR:ns ## KEGG: BF1096 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 74 1 74 74 117 72.0 1e-25 MVEEIRQDDKVILSSEDGFSVPMIFNNLCGKNFIGKEYKDYIRHIAFEEMGLKPGIVSHY RDGVLYKNGTIPEL >gi|226332192|gb|ACIC01000128.1| GENE 87 82940 - 83572 463 210 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253571012|ref|ZP_04848420.1| ## NR: gi|253571012|ref|ZP_04848420.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 210 1 210 210 426 100.0 1e-118 METTTIHPAEAYLRNEQNSTSLYVRIAGQRRRLFINRNENVIGIIAPRKRKSGYIFTDWA SIEKIYYPSSSPEDAADIGKKQVLKYQKLARLATHTNDWLRKIANADLDKSLYENRITTG TRIDGKCIGLATIEKYCSSWDMARFRTALKQGEKFSTGRFDFCGYDGTLWCEPRENGDMA AGFSKEYRNCGNGYYYLLINDEYMIGYDID >gi|226332192|gb|ACIC01000128.1| GENE 88 83929 - 84147 91 72 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNQDYYFPLFGLFATAGKIFLLSEGGGEQSDKADPEKYARQRRGRFLRVICGSMPALAGL EPPVTFAENSVP >gi|226332192|gb|ACIC01000128.1| GENE 89 84173 - 84514 192 113 aa, chain + ## HITS:1 COG:no KEGG:BF2912 NR:ns ## KEGG: BF2912 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 18 89 11 82 104 84 56.0 1e-15 MEILKIVIPKWNIAEITGYDPMTTFWQDFSMADKFGNEAIADTYRKVKAEWKDNYKHWTE LCLVLNHKIWQWHERDNQKATLYDRRMARRAAGVLLSNNGLKQSVGRCEVSPL >gi|226332192|gb|ACIC01000128.1| GENE 90 84525 - 84887 355 120 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253571014|ref|ZP_04848422.1| ## NR: gi|253571014|ref|ZP_04848422.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 120 11 130 130 218 100.0 7e-56 MTAQEKIQKVTEISQSKGWSISVDDKNKSNIQFDFQRYTNYGQDFNFSAEMKCEDIDTLI ADMEQYFEGFDPDYEAYLWIGNDGHGKNGAPYHIKDIVSDMEEAEKQIHDLLEALETEFI >gi|226332192|gb|ACIC01000128.1| GENE 91 85168 - 85923 726 251 aa, chain + ## HITS:1 COG:Rv1708 KEGG:ns NR:ns ## COG: Rv1708 COG1192 # Protein_GI_number: 15608846 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Mycobacterium tuberculosis H37Rv # 3 248 65 313 318 191 41.0 1e-48 MTKIIAVLNHKGGVGKTTTTINLAAALQQKKKRVLLIDMDGQANLTESCGLSIEEERTVY GAMKGEYTLPVFELENGLSVVPSCLDLSATESELINEPGRELILKGLIAKLLETRKFDYI LIDCPPSLGLLTLNALTSADFLIIPVQAQFLAMRGMAKITNVVGIVKERLNPNLNIGGIV ITQFDKRKTLNKSVAELISESFCDKVFKTVIRDNVSLAEAPIKGMNIFEYNKNSNGAKDY MELAKEVLKLK >gi|226332192|gb|ACIC01000128.1| GENE 92 85931 - 86185 266 84 aa, chain + ## HITS:1 COG:no KEGG:BF2825 NR:ns ## KEGG: BF2825 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 84 1 84 84 117 83.0 2e-25 MGKSDSLKNGMRSGLDGLLSSTGKSTQKKEPAPVKTEKEPAVHCNFVINKSIHTRMKYLA IEKNMSLRDIVNEAMKEYLEKHEK >gi|226332192|gb|ACIC01000128.1| GENE 93 86623 - 88350 1376 575 aa, chain + ## HITS:1 COG:PA5562 KEGG:ns NR:ns ## COG: PA5562 COG1475 # Protein_GI_number: 15600755 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Pseudomonas aeruginosa # 5 215 28 229 290 132 38.0 1e-30 MATTAVQAVEKNITSVALADIQPSNYNPRKNFDETSLAELAESIRQQGVLQPIGVRPIAD NRFEIVFGERRYRASLMAELAEIPAIVMEISDETAEEMAITENLQRKDVTPIEEANAYQK LIDSGRHDVQSLTVQFGKTEAYIRTRLKFVSLIPEIALLLEQDEITISVASEICRYGEEI QREVYDQHLKEGVQYNSWRGMKASEVAQSIERQYTADLNRYSFDKTLCLSCPHNTNNMML FCEGGCGNCANRACLVEMNTSHLTEKAMRLMEQHPAVPLCHESYNYNEAVIDRLTAMGYE VESLKTYATKYPESPQAPQKEDYDTTEEYEDAEKDYGQELNGYTEKCEAIRTRSEAGEIS LYLRIESNDITLCYVANTATTVNGTATEMPLSPIEKLEKQDKRNKEIALEKTVEDTKKRI LEVDMSERKFGQDEEKMVYFFLLSSLRKEHFNEVGIEDKGSYYYLTSEDKMRIIENLTAK QKAVIRRDYLIANFKDAFGNNATASLLLGFAQKHMPEELANIQDGYNEVYEKRHQRIKEK 
KAALQEQATQEAEQPDEPQPEAEAQTEPQTEEIAA >gi|226332192|gb|ACIC01000128.1| GENE 94 88792 - 89067 350 91 aa, chain + ## HITS:1 COG:no KEGG:BF2819 NR:ns ## KEGG: BF2819 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 91 3 93 93 156 84.0 2e-37 MDNIKESKEYQLAKDWERAVNDYGFNPKRFAAAIPEMHPTLQQSLYRLVKECIVVMADET RNYDDRNRASHEEAKCIMEYLKANGRHIPLR >gi|226332192|gb|ACIC01000128.1| GENE 95 89075 - 89347 311 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253571022|ref|ZP_04848430.1| ## NR: gi|253571022|ref|ZP_04848430.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 90 1 90 90 159 100.0 6e-38 MKTKELQFDGNIYICRIVKSNEGEELLIGSTALLDALHPGSFEDESEGFASKEAEQIYDE VFFFADAKTLKLPDDELITELKEDNPEWFN >gi|226332192|gb|ACIC01000128.1| GENE 96 89381 - 89761 560 126 aa, chain + ## HITS:1 COG:no KEGG:BF2817 NR:ns ## KEGG: BF2817 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 111 1 109 118 177 74.0 1e-43 MDKEYVMHIAQTIKEQLLSFTPIPVFMSWGVSEFVATVFQELPALRLKVNGRLHAGYVVI ALNGSDYYEVYLLKEDDSNAKCVNEEVCFDELGDVIDRAIESGTDKEEYDKFCDRQLAEL LSGTRA >gi|226332192|gb|ACIC01000128.1| GENE 97 89864 - 90061 98 65 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0424_0734 NR:ns ## KEGG: HMPREF0424_0734 # Name: not_defined # Def: hypothetical protein # Organism: G.vaginalis # Pathway: Homologous recombination [PATH:gva03440] # 1 60 262 321 322 61 45.0 9e-09 MEFCVEPQSLSDILQHLGLKDRENLMEVYINPMIGAGVLEMTEPDNPTSRNQMYVTVKVE QEFQK >gi|226332192|gb|ACIC01000128.1| GENE 98 90076 - 90252 200 58 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253571025|ref|ZP_04848433.1| ## NR: gi|253571025|ref|ZP_04848433.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 58 3 60 60 113 100.0 4e-24 MCLGRAEKAGSGVDKIVSGWQSLGWPLPTVAEETRPDYVVLTLQLGMKTRQENLASRI >gi|226332192|gb|ACIC01000128.1| GENE 99 90401 - 91621 823 406 aa, chain + ## HITS:1 COG:MA2994 KEGG:ns NR:ns ## COG: MA2994 COG4804 # Protein_GI_number: 20091812 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 15 385 13 328 345 120 25.0 4e-27 MYMEQQVIPSDYTQYAEAVEIIKHAIERCRYRSAAAVNKETLSLYFGVGKFVSENSRIGC WGTKALPTISKLLQRELPGLHGFSESGLKRMRSFYEEWRTFLIRPTVLGELENHSSEKGT ALIHPTALGELDIDEHLLLQLIGQPVNTEFTWDDFVKISFSHHIEILTRSKDIAERLFYI HCAAQNAWSLSSLKNYLKEDIYSNRGSLPSNFLQVLPEAIYAVKATLAFKDEYMLEMVNL ENVGEREQDWNEKVIENQIVTNIKQFILRFGNDFTFIDSQHRLIVAGEEMFADLVFFNRE LNASVIVELKRGKFRPNYLGQLSGYLTVYDMTDKKPHENPSIGIVLCQDANRQFVEIMVR DYDKPMGVATYRTAQEMPENLRKTLPDIDKLQNLLSENDSKSETVR >gi|226332192|gb|ACIC01000128.1| GENE 100 91709 - 93895 1604 728 aa, chain - ## HITS:1 COG:no KEGG:BF2815 NR:ns ## KEGG: BF2815 # Name: not_defined # Def: putative mobilization protein # Organism: B.fragilis # Pathway: not_defined # 1 728 1 728 728 1368 92.0 0 MEESKELQGFYKIFRAVIYISVLMEFFEYAIDLAMLDHWGGILIDIHGRIKRWMIYNDGN LVYSKIATFLLICITCIGTRNKKHLEFDARRQVLYPLICGLFLIVFSVWLYHHTMETRLY TLPLNIIFYMAATLVGVILVHIALDNISKFIKEGLGKDRFNFENESFEQSEEKDENQYSV NIPMRYYYKGKFRKGWVSISNCFRGTWVVGTPGSGKTFSIIEPFIRQHSAKGFAMVVYDY KFPTLATKLYYHYKKNQKLGKVPKGCKFNIINFVDVEYSKRVNPIQAKYINNLAAASETA ETLLESLQKGKKEGGGGSDQFFQTSAVNFLAACIYFFVNYEREPYDANGKKLYAEKRQDP QTKFWKPTGVVRDREGGSIVEPAYWLGKYSDMPHILSFLNESYQTIFEVLETDNEVAPLL GPFQTAFKNKAMEQLEGMIGTLRVYTSRLATKESYWIFHKDGDDFDLKVSDPKSPSYLLI ANDPEMESIIGALNALILNRLVTRVNTGQGKNIPVSIIVDELPTLYFHKIDRLIGTARSN KVSVTLGFQELPQLEADYGKVGMQKIITTVGNVVSGSARAKETLEWLSNDIFGKVVQVKK GVTIDRDKTSINLNENMDSLVPASKISDMATGWICGQTARDFVKTKTGTGGSMNIQESEE FKTTKFFCKTDFDMREIKAEEAAYVPLPKFYTFKSREERERILYKNFIQVGQDVKDMIAD VLNKRGAK >gi|226332192|gb|ACIC01000128.1| GENE 101 93924 - 94580 499 218 aa, chain - ## HITS:1 COG:no KEGG:BF2813 NR:ns ## KEGG: BF2813 # Name: not_defined # Def: 
hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 217 1 217 218 355 83.0 8e-97 MIRKIQWFAMAVTAVLCAACDAHIDVPDTAVRPGHILCEDGTALSYVQYEQSGKRAIAVV FDTEHREGTEGNGYAVYLWDIAPAAFADSLGVAQGTSADIEALDGNMNTFALYDTRETAS PMAEAVFDLWRYGQSAYIPSVAQMRLLYAVRETVNPVIERCGGHPLPLDENDCWYWTSTE VTGQETAKAWLYSTGSGAMQETPKTQAHKVRPIITLNR >gi|226332192|gb|ACIC01000128.1| GENE 102 94606 - 95091 420 161 aa, chain - ## HITS:1 COG:no KEGG:BF2812 NR:ns ## KEGG: BF2812 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 10 161 19 170 170 270 82.0 2e-71 MLACMCVFMNASAQRNSGRLSLGVGLLYENGMDVTLAYEHEMNYRHAWEFFANGYLKWTE CKSCGHICPESFWRNYRTYGFGVAYKPCVVRGRNHYGNLRIGASAGSDTNKFLAGIHVGY EHNYVLRSGCTLFWQVKSDMMIKGADLLRTGIVLGVKLPIK >gi|226332192|gb|ACIC01000128.1| GENE 103 95115 - 95966 837 283 aa, chain - ## HITS:1 COG:no KEGG:BF2811 NR:ns ## KEGG: BF2811 # Name: not_defined # Def: conjugate transposon protein TraN # Organism: B.fragilis # Pathway: not_defined # 11 283 9 281 281 437 85.0 1e-121 MNKKIIVTAFLLAAGLFATRNAQAQRTYEEMERLTVNEQVTTVITATEPVRFVDISTDKV AGDQPIENIIRLKPKETGHEDGEVLAIVTIVTERYRTQYALIYTTRISEAVADKEIQLQE RDAYNNPTVSMSTADMVRFARRVWNSPAKIRNVATKAHRMVMRLNNIYSVGDYFFIDFSI ENKTNIRFDIDEIRVKLSDKKLSKATNAQTIELTPALVLEHGKTFKHGYRNVIVVKKMTF PNDKLLTIEMTEQQISGRNISLNIDYEDVLSADSFNTALLEEE >gi|226332192|gb|ACIC01000128.1| GENE 104 96004 - 96789 501 261 aa, chain - ## HITS:1 COG:SMc00021 KEGG:ns NR:ns ## COG: SMc00021 COG0863 # Protein_GI_number: 15964679 # Func_class: L Replication, recombination and repair # Function: DNA modification methylase # Organism: Sinorhizobium meliloti # 4 247 20 265 376 89 30.0 9e-18 MIEIDNIYNMDCIEGMKLMANGSVDAVIADLPYGVLNRSNKAAHWDRQIPLEALWKQYRR ITKPGSPVILFAQGIFSAQLMLSQPRMWRYNLVWRKDRVTGHLNANRMPLRQHEDIIVFY DRQPVYHPQMTPCPPERRNHGRRKTDGFTNRCYGEMKLAPVRVAEDKYPTSVISIPKEHK TGAFYHPTQKPVALIEYLIRTYTNEGDVVLDNCIGSGTTAIAAIRTGRHYIGFEIEPTYC EIVGRRIREELERGHGLKKAK >gi|226332192|gb|ACIC01000128.1| GENE 105 96801 - 97949 1002 382 aa, chain - ## HITS:1 COG:no KEGG:BF2808 
NR:ns ## KEGG: BF2808 # Name: not_defined # Def: conjugate transposon protein TraM # Organism: B.fragilis # Pathway: not_defined # 1 382 10 389 390 545 79.0 1e-153 MKIFDKINFREPKYMLPAVLYIPLLVASYFIFDLFHTETAEIPDKTLQTTEFLNPDLPDA RLKGGDGIGSKYENMAKSWGKIQDYSAVDNIERDEPDDNKEEYESQYTVDDIALLDEQQQ EKAAAAQIADAKTREQEALAELEKALAEARLRGRNEVLPATDTDSTASAQPPTATIEVKG KIEEESRSVKAPSENEPPSEVVRKVKTASDYFNTLAVNAREPKLIKAIIDEDIKAVDGSR VRLRLLDDVEINECVVKRGSYLYATMSGFSSGRVKGNITSILIEDELVKVSLSLYDTDGM EGLYVPNSQFRETSKDVASGAVSGNLNMNTGSYGNSLSQWGMQAATNAYQKTSNAIGKAI KKNKVKLKYGTFVYLVNGREKQ >gi|226332192|gb|ACIC01000128.1| GENE 106 97960 - 98466 394 168 aa, chain - ## HITS:1 COG:no KEGG:BF2807 NR:ns ## KEGG: BF2807 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 9 168 1 160 160 301 86.0 8e-81 MNRNKGVLMNTRFEKSVRSSDEWYTPKEVLKALGRFDLDPCAPIRPLWPTAEVMYDRNMD GLSLKWEGRVWLNPPYSRPLIEQFVRKLAEHGNGIALLFNRCDSKMFQDVIFEKATGMKF LRHRIRFYRPDGTRGDSPGCGSILIAFGVENAEVLKNCSIEGKYVQLN >gi|226332192|gb|ACIC01000128.1| GENE 107 98453 - 98833 285 126 aa, chain - ## HITS:1 COG:no KEGG:BF2806 NR:ns ## KEGG: BF2806 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 122 1 122 123 172 74.0 4e-42 MNIKGFRRMLFGEKMPDKNDPKYKDRYEREVSAGRKFAQATRIDKAAAKVQGFANAHRIL FLVIVFGFAIGGFTWNIYRITMAYRNSRPTRTATEMQDSVLRERHKRLQGGEIRENRNEN KKYEPQ >gi|226332192|gb|ACIC01000128.1| GENE 108 98846 - 99460 423 204 aa, chain - ## HITS:1 COG:no KEGG:BF2805 NR:ns ## KEGG: BF2805 # Name: not_defined # Def: conjugate transposon protein TraK # Organism: B.fragilis # Pathway: not_defined # 1 204 1 204 204 382 91.0 1e-105 MVIKNLENKIRLVGIICTAFLVGCVIISLSSIWTARTMVSDAQKKVYVLDGNVPILVNRT TMDETLDMEAKSHVEMFHHYFFTLPPDDKYIRYTMEKAMYLVDETGLAQYNTLKEKGFYS NILGTSSVFSIYCDSVAFNKEKMEFTYYGRQRIERRSNILMRELVTAGQLKRVPRTENNP HGLLIVNWRTLLNKDIEQKTKINY >gi|226332192|gb|ACIC01000128.1| GENE 109 99464 - 99859 434 131 aa, chain - ## HITS:1 COG:no KEGG:BF2804 NR:ns ## KEGG: BF2804 # Name: not_defined # Def: 
hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 130 11 139 139 135 50.0 6e-31 MKLQEKIKSWCKDEKFMSFAQERARKEVCEVTENHRIDPQYEELDEAFEYDDRYIAPLVT YLTYKLRLALLQRNAGKRKRGIWWVLVHVEMQGYYVEIFSAEFENLLTELRDAVIPMLHT EYVQMLNGKRE >gi|226332192|gb|ACIC01000128.1| GENE 110 99901 - 101034 860 377 aa, chain - ## HITS:1 COG:no KEGG:BF2803 NR:ns ## KEGG: BF2803 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 377 1 377 377 701 92.0 0 MANGDILSDFGINLLEEEIDDVIFQTNEFLTDATFTGAQGPFWWILQMCMALAALFSIVM AAGIAYKMMVKHEPLDVMKLFRPLAVSIIICWWYPPADTGMAGSRNNWCFLDFLSYIPNC IGSYTHDLYEAEASQISDRFEEVQQLIHVRDTMYTNLQAQADVAHTGTSDPNLIEATMEQ TGVDEVTSMEKDAAKLWFTSLTAGVIVGIDKIIMLIALVVFRIGWWATIYCQQILLGMLT IFGPIQWAFSILPKWEGAWAKWLTRYLTVHFYGAMLYFVGFYVLLLFDIVLCIQIENLTA ITASEQTMAAYLQNSFFSAGYLMAASIVALKCLNLVPDLAAWMIPEGDTAFSTRNFGEGV AQQAKMTATGGLGGIMR >gi|226332192|gb|ACIC01000128.1| GENE 111 101047 - 101814 646 255 aa, chain - ## HITS:1 COG:no KEGG:BF2802 NR:ns ## KEGG: BF2802 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 238 1 238 256 366 81.0 1e-100 MNRKLLLMAVAVTVTTAVHAQYVTYNHDSPKQNQVTVMETGTGALSPDLYYSVLHNKYKK SAAAKNKLSFRTLAGINLYNQVDEAEAIDSALVKRAKVEALNVADRQADIAWLAEGDKVS RQMDRFRRNIDRILLSGGTPADKERWTEYYHVYQCAINATKDAYMPNAQRKKEYLRIYED VARQNEILVSYLAKRQNATATSTLLNATDNRTLHKGGIVRNAMSRWQESRLAVRGSQSGG NGNGEDDNESVNRGK >gi|226332192|gb|ACIC01000128.1| GENE 112 101833 - 102558 715 241 aa, chain - ## HITS:1 COG:no KEGG:BF2800 NR:ns ## KEGG: BF2800 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 241 1 241 241 357 79.0 2e-97 MKRTIQIAVVLFALLPGIAKAQWTFDIVSVEAYINDHKKQRSLLLARSTLEYSNKLLHEY SCKEIGDYKELNIDLDKYTRAFDAIDVMYQSLRTVLNVKNTYTSVSDRIGDYKSLLEDFN AKILKRGRIESADTLIISINARGLRAIAREGEQLYKSVSDLVLYATGAAACSTSDLLMVL EAINRSLDNIERHLNRAYFETWRYIQVRIGYWKAKVYRSRTMREILDDAFGRWRGAGRLD Y >gi|226332192|gb|ACIC01000128.1| GENE 113 102555 - 105101 1675 848 aa, chain - ## HITS:1 COG:no KEGG:BF2799 
NR:ns ## KEGG: BF2799 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 848 1 843 843 1377 80.0 0 MKRLLLILTIASVTLCQTVHAQYYSVNYDTRTVAAMVAAFGTEAVAEGYYREQVDDVLKH YTAAEVATAGIFAAKFLEHKALSDLGIWNSRTENYYYRRIYRMVAEKIMPKIWVVAKQML HSPQTAIYWGSYLMKVCDDTKSLCMQFESVVTNSTLTFSDIAFLEINPEIAPLLKLSETG NIDWQRMMDNFARIPGNFTHENLKSDLDNLYNTGVGLATAGIANLGDALLQSSSFHDLLG GKVEEIGNLYEHYGSLFEQAEHDISGLLIDMVGGPDNVAGLFNFSNYNLTAWMTDYLDEA MGNYYTQRWYIARRDQGSVSLCDYYPPTDDNSILNGGAWVRFNTSDPNFYPDASQREQVL ANSEGYAGWSRNRVQQLNNQNDGFSYSINYWMSAYIISKKNKQTKKAYAYEIHVSKNWNK EEVVYEEVFDSYSMDLNTFRAQLNVRLAEFNENEDGYTYYISSGARNYYQATDAAKLKGC ESVTISVTCSDGVTLGQGSTQYKCRKCGSSLNAHTKECAMQTSVTENDLDLSELDALLNE ANSQAANIEAQIGALENENAVLLKKISTASIEDAANYRQQYNANKTRIDRLKGELAEWKK KQSDYAQAKSDAADDNSVATDDYYRIPAIMQDCKAAYGLTWQDGGAWNGYSYVRKATMPN INGIITFRATVSIARKPKYFLGIKIHRTIIQIKWELTSEYTDTHVADVLTLDPGLSDAEK AKLVNDRIAEIAREHPSCKITTEYARSAPTEETPSGDVYHLLWSSDRLEIAREVDSRITK IYADLVSLEKMMHYKRSILDVMKDVQPGLDTDEGRRLTLVEECHDRWVENARTLRSGNSR NSRKEVRP >gi|226332192|gb|ACIC01000128.1| GENE 114 105113 - 107821 2235 902 aa, chain - ## HITS:1 COG:no KEGG:BF2797 NR:ns ## KEGG: BF2797 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 895 1 897 903 1518 83.0 0 MTLYIILCFVALCAGMALSVYAFGTGGKRKRIFQDIYFSAEETDGVGVLYTKTGEYSAVL KIENPVQKYSADIDSYYDFTHLFTALAQTLGEGYAIHKQDIFVRKQFASEPTDGQEFLSS SYFRYFKGRPYTDSLCYLTITQEAKKSRLFSFDSKKWRDFLVKIRKVHDQLRDGGVQARF LNKAEASEYVDRYFAMNFKDRTVSMTNFKADDETVSMGDKRCKVYSLVDVDCAALPSQIR PYTNIEVNNTEMPVDLVSVVDSIPNAETVVYNQIIFLPNQKRELSLLDKKKNRHASIPNP NNQMAVEDIKRVQEVIARESKQLVYTHFNMVVAVSAGADLQKCTNHLENAFGRMGIHISK RAYNQLELFVGSFPGNCYTLNEEYDRFLTLSDAAMCLMYKESVLHSEETPLKIYYTDRQG VPVAIDITGKEGKNKLTDNSNFFCLGPSGSGKSFHMNSVVRQLHEQGTDVVMVDTGNSYE GLCEYLGGKYISYTEERPITMNPFRINREEYNIEKIDFLKNLILMIWKGSDSQIPEIEFR IVEQIIIDYYDAYFNGFTRYTDEQREVLLKNLFAAASRKNPNKPPREVDEMVRKQIEVLE ARRAALKVSELNFNSFFDYSFDRLEQICTENDITTISYSTYSTMLQPFYKGGAYEKILNE NVDSALFDETFIVFEVDAIKENKKLFPIVTLIIMDVFLQKMRIKKTRKVLVIEEAWKAIA 
SPLMAEYIKFMYKTARKFWASVGVVTQEIQDIIGSEIVKEAIINNSDVVMLLDQSKFKER FDEIRKILGLTEVDCKKIFTINRLENKDGRSFFREVFIRRGTTSGVYGVEEPHECYMTYT TERAEKEALKLYKKELRCSHQEAIEAYCRDWDASGIGKALPFAQKVNETGRVLNLRPVHE SK >gi|226332192|gb|ACIC01000128.1| GENE 115 107933 - 108229 327 98 aa, chain - ## HITS:1 COG:no KEGG:BF2796 NR:ns ## KEGG: BF2796 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 5 98 4 97 97 137 78.0 1e-31 MKGKDERYPDYPLFKGLQRPLEFLGIQGRYIYWAAVTTCGAIVGFVAAYCLLGFIAGLVV LAAVVSAGIVLILLKQRKGLHSKKVVPGVYVYAHSRKI >gi|226332192|gb|ACIC01000128.1| GENE 116 108245 - 108622 477 125 aa, chain - ## HITS:1 COG:no KEGG:BF2795 NR:ns ## KEGG: BF2795 # Name: not_defined # Def: conjugate transposon protein TraE # Organism: B.fragilis # Pathway: not_defined # 1 125 1 125 125 154 76.0 1e-36 MFQKTKQLCRKAFGFVNGIPTKVMMFSFMLLSGMVAKAQNSAGDYSAGTSALSTVAEEIA KYVPIMVKLCYAIAGVVAIVGAISVYIAMNNEEQDVKKKIMMVVGACIFLIAAAKALPLF FGIAA >gi|226332192|gb|ACIC01000128.1| GENE 117 108677 - 109003 158 108 aa, chain - ## HITS:1 COG:no KEGG:BF2794 NR:ns ## KEGG: BF2794 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 108 23 129 129 144 78.0 1e-33 MSKAKKILCALCLPFPYTTFAKSGSVNYSWGADALATMHDFVVTMMLYVQYICCAIAGVY VIVSVCQIYIKMNTGEDGITKSIMTLVGACLFLIGAFYVFPVFFGYRI >gi|226332192|gb|ACIC01000128.1| GENE 118 109005 - 109448 491 147 aa, chain - ## HITS:1 COG:no KEGG:BF2793 NR:ns ## KEGG: BF2793 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 147 51 200 200 169 62.0 4e-41 MNSYFIFAIVLTVAYIIYYAVIIAHDIYGKKGTDKPNEEVFDLGAPEEEESVAVTESETG FSIGSENYETESTATSSETSLEDVKDKPGTAQEKLERLKAEAEEQMEETTPYLSDARTSE EMYKAMISKGRLDNRPEIKWNPIQDRL >gi|226332192|gb|ACIC01000128.1| GENE 119 109692 - 110579 651 295 aa, chain - ## HITS:1 COG:no KEGG:BF2792 NR:ns ## KEGG: BF2792 # Name: not_defined # Def: DNA primase # Organism: B.fragilis # Pathway: not_defined # 1 295 1 295 295 457 74.0 1e-127 
MTIAEAKQVRIVDFLAQLGHHAQHIKSEQYWYFSPLRNERTPSFKVNDRINEWYDFGEAT GGDLVELAKYICRTDCVSEALAYIERLVNGASLPRTRMPTAPPRPVEAEMKDVIVIPLRH HALFSYLQSRLIDADIGRMYCKEVHYELRGRHYFALAFGNTSGGYEVRNAYYKGCLNNKD ISLIRHLTEETQENVCVFEGFMDFLSYMTLKLAGDRTVCLAMPCDYLVMNSVNNLKKTLA RLQEYSVIHCYLDNDLAGQRTTETIAGMYDGRVSDESCHYAEYKDLNDYLRGKKR >gi|226332192|gb|ACIC01000128.1| GENE 120 110682 - 111782 837 366 aa, chain - ## HITS:1 COG:no KEGG:BF2791 NR:ns ## KEGG: BF2791 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 366 1 368 368 627 82.0 1e-178 MENERRIDRTTSMGMDEHRLSDILHASQIKATDIYETPPQIIWIDNSTIATLGNFSASTG KAKSKKTFNVSALVAASLAGKQVLNYRAHLPEGKQRILYVDTEQSRFHCRSVLERILRLA GLPTTTDPENLDFFCLREYSPSVRIEVIDYALRQQKGYGLVIIDGIRDLMLDINNAGESV EVINRMMEWSSRYDLHIHCVLHLNKGDNNVRGHIGTEMSNKAETVLVISKSNENPGISEV HALHIREKEFKPFAFTINETGLPVIAEVHSFGEPPKPKARTGFTELSIEQHREALSAAFG EKPIRGFDNLLQSLMVSYEAIGFKRGRSVMIKLMQYLIDNLKLIIKRDKLFYYDMTPTEA MLFDEE >gi|226332192|gb|ACIC01000128.1| GENE 121 111760 - 112155 352 131 aa, chain - ## HITS:1 COG:no KEGG:BF2790 NR:ns ## KEGG: BF2790 # Name: not_defined # Def: putative excisionase # Organism: B.fragilis # Pathway: not_defined # 3 131 4 132 132 184 76.0 9e-46 MPNRMTFMERMSGRLAAIESVLKKLEPVESLLERITLLENTIFTTKRVFTFQEACMYIGV SESMLYKLTSSKEIPHYKPRGKMVYFAKEELDEWLLQNYEPTMNEAVRRVTEAAATEPFL NKRRYGKRKKD >gi|226332192|gb|ACIC01000128.1| GENE 122 112335 - 112979 164 214 aa, chain - ## HITS:1 COG:no KEGG:BF2789 NR:ns ## KEGG: BF2789 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 3 214 43 256 257 260 62.0 3e-68 MLGGKQYHRYVDDFVNGHRYIDCDHAACRNCHEMNIHIVKGLLTECASSVQPCFAAPDFT FNECMKLKRMYDTSESLSPLGPPRINRPAALSLSFGCNFTPEQMKSIVACANTYHLFCVS VRIEDMEALFACKKGFSIRVNNIRRVVILFDALLENSFIQSRWQNVLGKGAFLQSKDGTR SVSVSTLSSALSSIKNNMTSVAYSIRKVIDRLKE >gi|226332192|gb|ACIC01000128.1| GENE 123 113239 - 114384 934 381 aa, chain - ## HITS:1 COG:XF1483 KEGG:ns NR:ns ## COG: XF1483 COG4973 # Protein_GI_number: 15838084 # Func_class: L Replication, 
recombination and repair # Function: Site-specific recombinase XerC # Organism: Xylella fastidiosa 9a5c # 127 364 43 273 294 81 27.0 2e-15 MARVKNHTKVKEPIRLRMKSLSNGSKSLYLDIYRDGKRTYEYLKMYIIPETDSNSRRQNQ TTMDAANAIKSKRIIELTSNEAGIVFRKDKTFLLDWMQVYMEAQESAGKKDGSQIKIAMR ILKDYAGEMVTLDQIDGDFCRGYITYLLTEYHPKGKDISNYTLHNYYRALNGALNSAVRK KKMKANPFNELEKSEKIRKPESMRSYMTIEEVQALIDTPMPHEEYEIVKCAYLFSCFCGL RISDIIKLKWNDVFVDRGQYRLAVSMKKTKEPIYLPLSPEALKWMPERGGKSSEDNVFDL PSANTIRMQLKPWAKAAGISKRFSYHTSRHTFATMMLTLGADLYTVSKLLGHADVKMTQV YAKIINKKKDEAVNLVNGLFH >gi|226332192|gb|ACIC01000128.1| GENE 124 114748 - 115164 249 138 aa, chain + ## HITS:1 COG:no KEGG:BVU_0680 NR:ns ## KEGG: BVU_0680 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 133 1 134 138 105 39.0 6e-22 MAKLQVLIAMTLDGSIPAEDDPLLQWMRDDKDGFSYGRDKCTRRLYPGYPLVDFMCEKDM SDPSILYQAEIHDEESIELLRGLSVYHLIDEMIIFLFPSTRPNHKSVSKHLPRGEWKTVK SKTFKNGICRLVYCKTLQ >gi|226332192|gb|ACIC01000128.1| GENE 125 115371 - 115694 349 107 aa, chain + ## HITS:1 COG:no KEGG:BT_2651 NR:ns ## KEGG: BT_2651 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 93 1 95 97 85 48.0 6e-16 MIQINRETFQMMLHQIMERFDRIDDRLNRMNRQTAALEGDKLLDNQDMCELLGVTKRTLA RYRQKKLVTYYMIDGRTYYKASEVQDFLSRKGKVLPAKIKKELGIQF >gi|226332192|gb|ACIC01000128.1| GENE 126 115707 - 116000 109 97 aa, chain + ## HITS:1 COG:no KEGG:BF1983 NR:ns ## KEGG: BF1983 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 89 1 89 94 90 42.0 1e-17 MEIICIDKRTFDELVVRFSMIEKKVTGICNPAKDAGLKKWMDNQEVCGILRISKRTLQVY REKGLLPFTRVKNKFFYKPEDVQNMLESSYHPQKRKP >gi|226332192|gb|ACIC01000128.1| GENE 127 115997 - 116308 238 103 aa, chain + ## HITS:1 COG:no KEGG:BT_2448 NR:ns ## KEGG: BT_2448 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 95 4 94 97 78 46.0 6e-14 MSYDLIDRKDQRIDTIFKGLENMERMIDAIRTAPRPAFHSDYFLTDEELSKLLKVSRRTL 
QEYRTLGVIPYYLVQGKALYKESDIQKVLDDAYKRCREEQRWV >gi|226332192|gb|ACIC01000128.1| GENE 128 116333 - 117658 599 441 aa, chain - ## HITS:1 COG:no KEGG:BF1987 NR:ns ## KEGG: BF1987 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: B.fragilis # Pathway: not_defined # 4 435 2 435 435 358 42.0 2e-97 MKRNTDNMEIKRRSTFAILFYINRTKIRKDGTCQLLCKISIDAKWEQIGTKVSVNPAVWN PEKGRADGRSENAITVNRAIDDLTKEIKEHYRRIKNSLGFITAEQVKNAVMGVGQKPLTL LALFREHNEEFKKRIGVDRIKESYDSYLRSYKHLSAFVQEKRGIEDVLLRNLDRVFYDDF ELFLRTNRNLSPKTVHEHLYRLKKMTMRAVSQGTIRRDPYCRLHPELPKRKSRHLKLEDL KTLLTTPVEKPQLQFVRDMFIFSTFTGLAYADLKRLTVNDIIQSEDGSWWIHIQRQKTGT LSSVRLLDIPLKIIEKYREQRHDDKVFNLYKREYFIMLTRKLGEVYGFELTFHKARHNFG THITLSMGIPIETVGKMMGHMRIETTQLYAKVTDKKVDEDMKRLKAAGLSQTSGLYEEDI IVRKQRRKSQSQLSENKETTL >gi|226332192|gb|ACIC01000128.1| GENE 129 117679 - 118902 665 407 aa, chain - ## HITS:1 COG:lin2069 KEGG:ns NR:ns ## COG: lin2069 COG4974 # Protein_GI_number: 16801135 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Listeria innocua # 129 384 19 282 297 71 26.0 3e-12 MNQTDVKVSFYLKKSEADAKGNCPVMARLIVGKYSETAFSVKLRVPQSLWLSGRACGKSA AARDINNRLDEIRAAAFSIYAEQSANREVVTAEEVKHQLLGMASEQETLLSYFRLFMRNF EKRVGINRTESSLNGYRNSYDHLVRFLQSQYKLSDIPFAALDRSFIEKFDLHLRTECHLA PGTIVNLTVRLKTIVGEAIADGIITAFPFVGYEPTHPQSEQKYLTTEELNRIMTTPLHSR TLYHVRDLFLFSCYTGIPYSDMCMLTNENLSLAEDGIWWIRSSRKKTGVDFEIPLMELPL HIIEKYRDVAPEGKLLPMYSNSSLNHYLKQIAVLCGIERKLVFHVARHTYATEITLSHGV PLETVSKMLGHSRIGTTQLYAKVTDNKIDTDTKALDKKIAERFSVVI Prediction of potential genes in microbial genomes Time: Thu May 12 02:44:43 2011 Seq name: gi|226332191|gb|ACIC01000129.1| Bacteroides sp. 1_1_6 cont1.129, whole genome shotgun sequence Length of sequence - 1846 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 174 - 210 -0.9 1 1 Tu 1 . 
- CDS 438 - 1541 275 ## PROTEIN SUPPORTED gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 - Prom 1565 - 1624 5.3 Predicted protein(s) >gi|226332191|gb|ACIC01000129.1| GENE 1 438 - 1541 275 367 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 [Haemophilus parasuis 29755] # 3 353 4 323 339 110 26 9e-25 MDINQNYREAVKTIKEAILRSQYRAATSVNKEQLSLYYGIGRYVSKNSRIGFWGKGAIEQ ISSLLQKELPGLRGFSTSNIKNMRVFYEEWEPVLNRQPLAGDLVLDEKLLLSVIRQPLAD EFNWSDFFSIGFSHHIEIISKAKTLEARLFYIHECAIRYWSKYTLRDYLKADLYSHRGTL PNNFAQTLPDTKQALKAVCSFKDEYLLDFINVEELDEQEEDLDEKIVEKSIVANVKKFIM TFGQDFSFIGNQYRVEVAGEEMFIDLLFFNRELNSLVAVELKSGKFRSSYLGQLNTYLSA LDSYVRKPHENPSIGIILCREMNQTFVEFAVRDYNKPMGVATYRTSKDMPERLRNALPDI EELRKLL Prediction of potential genes in microbial genomes Time: Thu May 12 02:45:41 2011 Seq name: gi|226332190|gb|ACIC01000130.1| Bacteroides sp. 1_1_6 cont1.130, whole genome shotgun sequence Length of sequence - 233862 bp Number of predicted genes - 176, with homology - 168 Number of transcription units - 70, operones - 41 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 41 - 613 384 ## BT_3517 RNA polymerase ECF-type sigma factor + Prom 636 - 695 3.4 2 2 Tu 1 . + CDS 727 - 1929 1050 ## COG3712 Fe2+-dicitrate sensor, membrane component + Prom 1962 - 2021 4.5 3 3 Op 1 . + CDS 2066 - 5521 2964 ## BT_3519 hypothetical protein 4 3 Op 2 . + CDS 5534 - 7462 1455 ## BT_3520 hypothetical protein 5 3 Op 3 . + CDS 7491 - 8777 1070 ## BT_3521 alpha-1,6-mannanase 6 3 Op 4 . + CDS 8803 - 9963 911 ## BT_3522 hypothetical protein 7 3 Op 5 . + CDS 9978 - 11453 1297 ## BT_3523 hypothetical protein + Term 11497 - 11541 11.4 + Prom 11463 - 11522 6.5 8 4 Op 1 . + CDS 11675 - 12937 1119 ## COG4833 Predicted glycosyl hydrolase 9 4 Op 2 . + CDS 12994 - 15159 2028 ## BT_3525 hypothetical protein 10 4 Op 3 . 
+ CDS 15189 - 17666 2735 ## BT_3526 glutaminase 11 4 Op 4 1/0.000 + CDS 17679 - 19970 2330 ## COG3537 Putative alpha-1,2-mannosidase 12 4 Op 5 . + CDS 19974 - 21437 1594 ## COG3538 Uncharacterized conserved protein 13 4 Op 6 . + CDS 21447 - 22580 1225 ## COG2017 Galactose mutarotase and related enzymes + Prom 22604 - 22663 2.2 14 5 Op 1 . + CDS 22720 - 24681 1473 ## COG3537 Putative alpha-1,2-mannosidase 15 5 Op 2 . + CDS 24740 - 26803 1963 ## COG3533 Uncharacterized protein conserved in bacteria + Prom 26939 - 26998 7.7 16 6 Tu 1 . + CDS 27018 - 28151 1179 ## COG2017 Galactose mutarotase and related enzymes + Term 28240 - 28285 8.1 17 7 Op 1 . - CDS 28764 - 30377 133 ## BT_3533 hypothetical protein 18 7 Op 2 . - CDS 30399 - 31079 284 ## BT_3534 hypothetical protein - Prom 31126 - 31185 2.8 19 8 Tu 1 . - CDS 31187 - 31891 367 ## BT_3535 hypothetical protein - Prom 32060 - 32119 9.7 + Prom 32187 - 32246 11.7 20 9 Op 1 . + CDS 32298 - 33284 644 ## BT_3536 hypothetical protein 21 9 Op 2 . + CDS 33335 - 33553 206 ## 22 9 Op 3 . + CDS 33566 - 33796 79 ## BT_3537 hypothetical protein + Term 33839 - 33876 -0.4 + Prom 33808 - 33867 6.6 23 10 Tu 1 . + CDS 33900 - 34712 211 ## BT_3538 hypothetical protein 24 11 Op 1 . - CDS 34709 - 34960 176 ## BT_3539 hypothetical protein 25 11 Op 2 . - CDS 34993 - 35700 485 ## BT_3540 hypothetical protein - Prom 35810 - 35869 8.1 + Prom 36277 - 36336 6.7 26 12 Tu 1 . + CDS 36473 - 38353 912 ## BT_3542 hypothetical protein + Prom 38442 - 38501 7.0 27 13 Op 1 . + CDS 38530 - 39252 476 ## BT_3543 hypothetical protein 28 13 Op 2 . + CDS 39331 - 39513 115 ## + Prom 39515 - 39574 10.0 29 14 Op 1 . + CDS 39606 - 41195 326 ## BT_3545 hypothetical protein + Prom 41215 - 41274 1.5 30 14 Op 2 . + CDS 41295 - 44192 2512 ## BT_3546 glutaminase 31 14 Op 3 . + CDS 44271 - 44918 511 ## BT_3547 hypothetical protein 32 14 Op 4 . + CDS 44940 - 45155 160 ## + Term 45253 - 45299 11.3 + Prom 45376 - 45435 6.4 33 15 Tu 1 . 
+ CDS 45459 - 46673 917 ## COG1373 Predicted ATPase (AAA+ superfamily) + Term 46885 - 46943 -0.9 34 16 Tu 1 . - CDS 46681 - 47748 794 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases - Prom 47953 - 48012 6.6 35 17 Op 1 . - CDS 48183 - 48938 354 ## gi|253571095|ref|ZP_04848502.1| predicted protein 36 17 Op 2 . - CDS 49026 - 50006 702 ## BF3416 hypothetical protein 37 17 Op 3 . - CDS 50043 - 51242 970 ## BF3614 hypothetical protein 38 17 Op 4 . - CDS 51271 - 52383 980 ## BF3414 hypothetical protein 39 17 Op 5 . - CDS 52405 - 54024 1235 ## BF3612 hypothetical protein 40 17 Op 6 . - CDS 54044 - 57322 2306 ## BF3611 hypothetical protein - Prom 57443 - 57502 6.3 41 18 Op 1 6/0.000 - CDS 57510 - 58499 520 ## COG3712 Fe2+-dicitrate sensor, membrane component 42 18 Op 2 . - CDS 58557 - 59096 394 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 59119 - 59178 8.5 + Prom 59076 - 59135 7.0 43 19 Op 1 . + CDS 59330 - 59464 83 ## + Term 59506 - 59544 -1.0 + Prom 59487 - 59546 3.7 44 19 Op 2 . + CDS 59645 - 61450 1589 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) + Term 61481 - 61541 5.2 - Term 61474 - 61523 7.0 45 20 Op 1 14/0.000 - CDS 61582 - 63861 2321 ## COG0612 Predicted Zn-dependent peptidases 46 20 Op 2 . - CDS 63716 - 64501 700 ## COG0612 Predicted Zn-dependent peptidases - Prom 64588 - 64647 6.2 + Prom 64531 - 64590 9.8 47 21 Op 1 . + CDS 64764 - 65723 1132 ## COG1186 Protein chain release factor B + Prom 65726 - 65785 3.4 48 21 Op 2 . + CDS 65807 - 65941 103 ## + Prom 66030 - 66089 6.5 49 22 Op 1 . + CDS 66169 - 67221 855 ## BT_3553 hypothetical protein 50 22 Op 2 . + CDS 67242 - 67709 496 ## COG2954 Uncharacterized protein conserved in bacteria 51 22 Op 3 . + CDS 67760 - 69052 1142 ## COG3174 Predicted membrane protein + Term 69138 - 69180 0.7 + Prom 69057 - 69116 5.9 52 23 Tu 1 . 
+ CDS 69198 - 69524 463 ## BT_3556 transcriptional regulator + Term 69627 - 69664 -0.9 - Term 69606 - 69662 9.5 53 24 Op 1 . - CDS 69738 - 70877 639 ## COG1864 DNA/RNA endonuclease G, NUC1 54 24 Op 2 . - CDS 70892 - 71923 626 ## BT_3559 hypothetical protein - Prom 72063 - 72122 7.4 + Prom 71984 - 72043 7.3 55 25 Op 1 . + CDS 72123 - 74663 2024 ## BT_3560 hypothetical protein 56 25 Op 2 . + CDS 74695 - 75528 753 ## BT_3561 hypothetical protein 57 25 Op 3 . + CDS 75550 - 77598 1135 ## COG4085 Predicted RNA-binding protein, contains TRAM domain + Term 77619 - 77657 8.8 58 25 Op 4 . + CDS 77658 - 78530 465 ## COG1864 DNA/RNA endonuclease G, NUC1 + Term 78566 - 78630 16.1 59 26 Tu 1 . - CDS 78705 - 79145 352 ## BT_3564 hypothetical protein - Prom 79345 - 79404 6.2 - Term 80467 - 80522 0.2 60 27 Tu 1 . - CDS 80674 - 80841 93 ## - Prom 80978 - 81037 3.7 + Prom 80650 - 80709 6.0 61 28 Tu 1 . + CDS 80840 - 83398 1814 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Term 83366 - 83409 1.8 62 29 Tu 1 . - CDS 83426 - 84790 1324 ## COG5368 Uncharacterized protein conserved in bacteria - Prom 84811 - 84870 2.7 63 30 Op 1 . - CDS 84913 - 87228 2560 ## COG1472 Beta-glucosidase-related glycosidases 64 30 Op 2 . - CDS 87259 - 88788 1592 ## BT_3568 hypothetical protein 65 31 Tu 1 . - CDS 88892 - 91957 3059 ## BT_3569 hypothetical protein - Prom 92054 - 92113 8.0 + Prom 92076 - 92135 7.0 66 32 Tu 1 . + CDS 92156 - 93781 550 ## BT_3570 TPR repeat-containing protein + Prom 93853 - 93912 4.5 67 33 Tu 1 . + CDS 93933 - 94289 348 ## BT_3571 hypothetical protein + Term 94320 - 94358 0.1 + Prom 94342 - 94401 5.6 68 34 Op 1 . + CDS 94484 - 95248 332 ## BT_3572 hypothetical protein 69 34 Op 2 . + CDS 95272 - 95619 129 ## BT_3573 hypothetical protein + Term 95628 - 95681 12.1 70 35 Tu 1 . + CDS 95686 - 96333 386 ## BT_3574 hypothetical protein + Prom 96374 - 96433 6.8 71 36 Op 1 . + CDS 96458 - 96832 191 ## BT_3575 hypothetical protein 72 36 Op 2 . 
+ CDS 96822 - 96989 102 ## BT_3573 hypothetical protein + Term 96998 - 97052 11.5 - Term 96976 - 97050 19.8 73 37 Tu 1 . - CDS 97073 - 97963 843 ## COG0524 Sugar kinases, ribokinase family - Prom 98026 - 98085 2.8 - TRNA 98028 - 98109 68.2 # Leu TAG 0 0 - Term 97964 - 98012 11.3 74 38 Op 1 . - CDS 98182 - 99204 820 ## COG0793 Periplasmic protease 75 38 Op 2 . - CDS 99253 - 100113 719 ## BT_3578 hypothetical protein 76 38 Op 3 . - CDS 100152 - 102797 2949 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit - Prom 102861 - 102920 5.8 - Term 103458 - 103493 5.1 77 39 Op 1 . - CDS 103674 - 105017 1079 ## COG0673 Predicted dehydrogenases and related proteins - Prom 105040 - 105099 5.2 78 39 Op 2 . - CDS 105105 - 105344 155 ## BT_3582 hypothetical protein 79 39 Op 3 . - CDS 105348 - 106673 1043 ## COG0673 Predicted dehydrogenases and related proteins 80 39 Op 4 . - CDS 106677 - 107447 334 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis 81 39 Op 5 . - CDS 107503 - 108909 974 ## BT_3585 putative oxidoreductase 82 39 Op 6 . - CDS 108919 - 109923 835 ## BT_3586 putative dehydrogenase 83 39 Op 7 12/0.000 - CDS 109930 - 110715 510 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase 84 39 Op 8 . - CDS 110696 - 111883 851 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase 85 39 Op 9 . - CDS 111915 - 113231 1208 ## COG0738 Fucose permease - Prom 113310 - 113369 3.9 - Term 113307 - 113350 3.3 86 40 Op 1 . - CDS 113375 - 115573 1409 ## BT_3590 alpha-N-acetylglucosaminidase precursor 87 40 Op 2 . - CDS 115601 - 116782 855 ## BT_3591 hypothetical protein 88 40 Op 3 . - CDS 116791 - 118773 555 ## PROTEIN SUPPORTED gi|90021240|ref|YP_527067.1| ribosomal protein S32 89 40 Op 4 . - CDS 118780 - 119856 498 ## BT_3593 hypothetical protein - Prom 119878 - 119937 5.7 - Term 119870 - 119919 9.2 90 41 Op 1 . - CDS 119948 - 121558 981 ## BT_3594 hypothetical protein 91 41 Op 2 . 
- CDS 121596 - 123611 947 ## BT_3595 hypothetical protein 92 41 Op 3 . - CDS 123647 - 125893 1404 ## BT_3596 hypothetical protein 93 41 Op 4 . - CDS 125913 - 127337 486 ## BT_3597 sialic acid-specific 9-O-acetylesterase 94 41 Op 5 1/0.000 - CDS 127353 - 128966 1139 ## COG3525 N-acetyl-beta-hexosaminidase 95 41 Op 6 . - CDS 128985 - 131270 1142 ## COG3250 Beta-galactosidase/beta-glucuronidase 96 41 Op 7 . - CDS 131332 - 131748 241 ## BT_3599 beta-mannosidase precursor 97 41 Op 8 . - CDS 131755 - 132651 387 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase 98 41 Op 9 . - CDS 132655 - 133704 348 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains - Prom 133724 - 133783 3.4 - Term 133730 - 133792 9.4 99 42 Op 1 . - CDS 133834 - 135693 971 ## BT_3602 hypothetical protein 100 42 Op 2 . - CDS 135718 - 137403 1242 ## BT_3603 hypothetical protein 101 42 Op 3 . - CDS 137418 - 140573 2398 ## BT_3604 hypothetical protein 102 42 Op 4 1/0.000 - CDS 140620 - 141786 996 ## COG2942 N-acyl-D-glucosamine 2-epimerase 103 42 Op 5 . - CDS 141819 - 143225 904 ## COG0477 Permeases of the major facilitator superfamily 104 42 Op 6 . - CDS 143231 - 144328 843 ## BT_3607 hypothetical protein - Prom 144352 - 144411 10.4 - Term 144529 - 144563 3.2 105 43 Tu 1 . - CDS 144715 - 145893 659 ## BT_3608 hypothetical protein - Prom 145936 - 145995 2.2 106 44 Tu 1 . - CDS 146008 - 147153 654 ## COG1609 Transcriptional regulators - Prom 147363 - 147422 4.4 + Prom 147123 - 147182 7.6 107 45 Tu 1 . + CDS 147381 - 147851 499 ## BT_3610 hypothetical protein + Prom 147866 - 147925 1.9 108 46 Op 1 . + CDS 147955 - 149496 1654 ## COG0423 Glycyl-tRNA synthetase (class II) 109 46 Op 2 . + CDS 149508 - 150098 488 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 + Term 150118 - 150176 11.1 110 47 Tu 1 . - CDS 150140 - 151183 1006 ## COG1609 Transcriptional regulators + Prom 151284 - 151343 3.1 111 48 Op 1 . 
+ CDS 151363 - 152295 1025 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 112 48 Op 2 . + CDS 152311 - 153234 793 ## BT_3615 hypothetical protein 113 48 Op 3 . + CDS 153276 - 154532 1116 ## COG0738 Fucose permease 114 48 Op 4 . + CDS 154556 - 155572 931 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases + Term 155633 - 155673 7.3 - Term 155619 - 155660 9.1 115 49 Op 1 . - CDS 155873 - 156724 582 ## BT_3618 hypothetical protein 116 49 Op 2 . - CDS 156721 - 157839 881 ## COG4299 Uncharacterized conserved protein 117 49 Op 3 . - CDS 157882 - 159189 874 ## COG0477 Permeases of the major facilitator superfamily 118 49 Op 4 . - CDS 159207 - 160586 832 ## COG2385 Sporulation protein and related proteins 119 49 Op 5 . - CDS 160597 - 163041 1317 ## BT_3622 putative glycosyltransferase 120 49 Op 6 . - CDS 163043 - 164488 930 ## COG0591 Na+/proline symporter 121 49 Op 7 . - CDS 164451 - 164624 144 ## 122 49 Op 8 . - CDS 164630 - 165895 666 ## BT_3624 hypothetical protein - Prom 166052 - 166111 5.2 - Term 166057 - 166113 -0.2 123 50 Op 1 . - CDS 166152 - 167186 941 ## BT_3625 hypothetical protein 124 50 Op 2 . - CDS 167230 - 167898 687 ## BT_3626 hypothetical protein 125 50 Op 3 . - CDS 167942 - 168829 223 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 126 50 Op 4 . - CDS 168852 - 169418 478 ## BT_3628 hypothetical protein 127 50 Op 5 . - CDS 169405 - 169992 626 ## BT_3629 hypothetical protein 128 50 Op 6 . - CDS 170050 - 171672 1347 ## BT_3630 hypothetical protein 129 50 Op 7 . - CDS 171708 - 173240 1193 ## BT_3631 hypothetical protein 130 50 Op 8 . - CDS 173209 - 174444 1118 ## BT_3632 hypothetical protein 131 50 Op 9 . - CDS 174449 - 177286 1997 ## BT_3633 hypothetical protein + Prom 178062 - 178121 6.0 132 51 Op 1 . + CDS 178166 - 178534 186 ## gi|253571187|ref|ZP_04848594.1| conserved hypothetical protein 133 51 Op 2 . 
+ CDS 178531 - 179103 164 ## Sala_2218 hypothetical protein + Term 179111 - 179152 -0.9 - Term 179022 - 179060 7.9 134 52 Op 1 . - CDS 179120 - 179680 330 ## BT_4275 hypothetical protein - Prom 179737 - 179796 2.4 135 52 Op 2 . - CDS 179866 - 180459 243 ## gi|253571190|ref|ZP_04848597.1| predicted protein - Prom 180530 - 180589 6.4 - Term 180481 - 180518 4.1 136 53 Op 1 13/0.000 - CDS 180610 - 181941 615 ## COG0845 Membrane-fusion protein 137 53 Op 2 . - CDS 181919 - 184120 228 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P - Prom 184174 - 184233 7.6 + Prom 184456 - 184515 6.8 138 54 Tu 1 . + CDS 184555 - 185082 223 ## gi|253571193|ref|ZP_04848600.1| predicted protein + Prom 185317 - 185376 7.4 139 55 Op 1 . + CDS 185599 - 185781 143 ## gi|253571194|ref|ZP_04848601.1| predicted protein 140 55 Op 2 . + CDS 185793 - 186446 180 ## gi|253571195|ref|ZP_04848602.1| predicted protein + Prom 186519 - 186578 3.8 141 56 Tu 1 . + CDS 186647 - 187168 253 ## gi|253571196|ref|ZP_04848603.1| predicted protein + Prom 187170 - 187229 5.2 142 57 Op 1 . + CDS 187312 - 187950 284 ## gi|253571197|ref|ZP_04848604.1| predicted protein + Prom 187957 - 188016 3.2 143 57 Op 2 . + CDS 188040 - 188807 306 ## BDI_0151 hypothetical protein 144 57 Op 3 . + CDS 188800 - 189864 532 ## BF4403 putative glycosyltransferase 145 57 Op 4 . + CDS 189864 - 190640 253 ## Pcar_2294 hypothetical protein 146 57 Op 5 . + CDS 190646 - 191197 221 ## BDI_3162 hypothetical protein 147 57 Op 6 1/0.000 + CDS 191202 - 192311 627 ## COG0562 UDP-galactopyranose mutase 148 57 Op 7 . + CDS 192319 - 194109 453 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Term 194259 - 194308 0.5 149 58 Tu 1 . - CDS 194120 - 194953 339 ## Swit_4855 hypothetical protein - Prom 195007 - 195066 7.4 150 59 Op 1 . - CDS 195215 - 196222 1079 ## COG0136 Aspartate-semialdehyde dehydrogenase 151 59 Op 2 . 
- CDS 196290 - 198029 1377 ## BT_3637 hypothetical protein - Prom 198052 - 198111 4.0 + Prom 198097 - 198156 6.1 152 60 Op 1 . + CDS 198214 - 200349 2176 ## COG0475 Kef-type K+ transport systems, membrane components 153 60 Op 2 . + CDS 200349 - 201065 642 ## COG1179 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 154 60 Op 3 . + CDS 201075 - 201731 368 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Term 201769 - 201811 4.4 - Term 201756 - 201799 0.8 155 61 Op 1 . - CDS 201934 - 202863 448 ## COG1555 DNA uptake protein and related DNA-binding proteins 156 61 Op 2 . - CDS 202863 - 204233 1161 ## COG0733 Na+-dependent transporters of the SNF family - Prom 204278 - 204337 8.4 + Prom 204215 - 204274 6.0 157 62 Tu 1 . + CDS 204337 - 204729 294 ## BT_3643 hypothetical protein + Term 204755 - 204826 9.2 - Term 204739 - 204811 7.3 158 63 Op 1 . - CDS 204883 - 206181 1238 ## COG0770 UDP-N-acetylmuramyl pentapeptide synthase 159 63 Op 2 . - CDS 206256 - 208271 1755 ## COG0642 Signal transduction histidine kinase - Prom 208368 - 208427 4.4 + Prom 208288 - 208347 5.0 160 64 Op 1 1/0.000 + CDS 208399 - 209262 826 ## COG0294 Dihydropteroate synthase and related enzymes 161 64 Op 2 . + CDS 209337 - 210101 801 ## COG1624 Uncharacterized conserved protein 162 64 Op 3 17/0.000 + CDS 210120 - 211661 1584 ## COG0312 Predicted Zn-dependent proteases and their inactivated homologs 163 64 Op 4 . + CDS 211693 - 213012 1432 ## COG0312 Predicted Zn-dependent proteases and their inactivated homologs + Term 213040 - 213096 12.3 - Term 213028 - 213084 8.5 164 65 Tu 1 . - CDS 213110 - 213307 87 ## - Prom 213443 - 213502 5.0 + Prom 213122 - 213181 4.8 165 66 Op 1 . + CDS 213306 - 213866 448 ## COG1704 Uncharacterized conserved protein 166 66 Op 2 . + CDS 213874 - 214905 738 ## BT_3651 hypothetical protein 167 66 Op 3 . + CDS 214934 - 215722 783 ## BT_3652 hypothetical protein 168 66 Op 4 . 
+ CDS 215760 - 216326 537 ## BT_3653 hypothetical protein + Term 216553 - 216583 -0.9 + Prom 216536 - 216595 8.4 169 67 Op 1 . + CDS 216624 - 218960 2412 ## COG1874 Beta-galactosidase 170 67 Op 2 . + CDS 219031 - 220002 794 ## BT_3655 arabinosidase 171 67 Op 3 . + CDS 220044 - 222479 2151 ## COG3507 Beta-xylosidase 172 67 Op 4 . + CDS 222550 - 225027 2186 ## COG3534 Alpha-L-arabinofuranosidase - Term 225854 - 225911 17.1 173 68 Op 1 . - CDS 225935 - 228409 2299 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases 174 68 Op 2 . - CDS 228494 - 231208 1625 ## COG4977 Transcriptional regulator containing an amidase domain and an AraC-type DNA-binding HTH domain - Prom 231364 - 231423 5.9 + Prom 231350 - 231409 7.0 175 69 Tu 1 . + CDS 231436 - 233379 1643 ## BT_3661 alpha-glucosidase + Prom 233457 - 233516 4.8 176 70 Tu 1 . + CDS 233551 - 233860 239 ## BT_3662 putative endo-1,4-beta-xylanase Predicted protein(s) >gi|226332190|gb|ACIC01000130.1| GENE 1 41 - 613 384 190 aa, chain + ## HITS:1 COG:no KEGG:BT_3517 NR:ns ## KEGG: BT_3517 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 190 1 190 190 333 100.0 1e-90 MASDIHTLSDSLLWKRFLEGDSSAYSQIYNQTVQELFRYGLLYTSDRELVKDCIHDVFVK IYTNRAKLTPTDNIIAYLMVALKNTLFNALKKTSDSFSLDEADEKEDQSEEHFSTPETIY INKEQEKNTHMKVHAMMSSLTTRQREIVYYRYIKDMSIDEISKITDMNYQSVSNSIQRAL GRVRNLFKRE >gi|226332190|gb|ACIC01000130.1| GENE 2 727 - 1929 1050 400 aa, chain + ## HITS:1 COG:RSc2919 KEGG:ns NR:ns ## COG: RSc2919 COG3712 # Protein_GI_number: 17547638 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Ralstonia solanacearum # 197 385 71 256 274 76 29.0 7e-14 MINSELKHTDFSRYTFEEFLQNDFFISSVKYPTEEIQEFWDRFEKSTPSNIDDYFAAREY IETISTSEGDLLSDQELGELWADIQTTNIKNDKIKHKNFFLIGLTAAASVAILVGSFFLL KNYQAVLAPDIATFAVQTKTELPATEETLLILAEDKVVSLKEKETTITYDSVAIKANEEN 
ISKKELAVYNQLVIPRGKRSVLTFSDGSKVWVNAGTRVIYPTEFEKDKREIYVDGEIYIE VARDEERPFYVRTKDMNVRVLGTKFNVTAYESEPIRSVVLAQGCVQVETTQTPKAILAPN QMFSSVEGKENISQVDVEQMISWVNGLYCFNSADLGIVLKRLSTYYGINVEFDSALSKIK CSGKIDLKDNFETVINGLTFVAPISYAYDGQYKTYRVVKK >gi|226332190|gb|ACIC01000130.1| GENE 3 2066 - 5521 2964 1151 aa, chain + ## HITS:1 COG:no KEGG:BT_3519 NR:ns ## KEGG: BT_3519 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1151 1 1151 1151 2220 99.0 0 MRNLDIVNMKNSIKGRKSKILKLFLCFLLLGVSYSFANNNNYSQLKTLSVNVNNKTLREV FQTIEKTSQFVFFYLDDAINLDRKVSIDSKDKKIDEILTELFEGTSCTYRISDRQVFISG KSAAVSAAQQQNARKITGRVTDTKGEPLIGVNVTVDGDTNGSITNMDGLFELRVSKKNAV LKFTYIGFKPSEVRINSSTNIYDVVLEEQVNELEETVIVGYGTQRKISNIGAQSSMKLED IKTPSASLTTTLAGRLAGVVAVQRTGEPGKDAADIWIRGIATPNTATPLILVDGVERAFN DIDPEDIESLTTLKDASATAVYGVRGANGVIIIKTKPGKIGKPTISADYYESFTRFTKMV DLTDGVSYMKAANEALRNDGLATKFTDSQIQNTILGKDQYLYPNVDWLNEIFNDWGHNRR VNVNVRGGSEKVSYYASVSYFNETGMTVTDKSINTFDSKMKYSRYNFTTNLSIDVTPTTK VEIGAQGYLGEGNYPAISSKDLYNAAMSISPVEYPKMFFINGEAYIPGRSTNNNFNNPYS QATRRGYDNLTKNQIYSNLRITQDLDMLTKGLKLTAMYAFDVYNEIHVHQDRAESTYYFL DTNVPYDLDGQPILQRIYTGTNVLSYKQETSGNKKTYLEASLNYDRAFGDHRVSGLFLFN QQQKLLYPKGTLEDAIPYRMMGIAGRATYSWKDRYFAEFNIGYNGAENFSPKNRYGTFPA YGVGWVISNEKFWEPLSKVVSFLKIRYTDGKVGNSDVSDRRFMYLNQMKENGDYGYKLGP NGTKYSGYETLNMAVDLIWEEARKQDLGIDLKLFNDDLSIIFDIFKERRENILLKRENSI PSFLGYNTSAPYGNIGIVENKGFDATVEYNKRFNKDWTLAIRGNITYNKDKWIEGELPEQ RYDWMNQYGQNILGKKGYIAEGLFTQAEIDDMARWESLSQANKATTPKPFASQFGTVKAG DIKYKDLNNDGQIDAYDKTYICRGDVPTMVYGFGFTLGWKNLSLGMMFQGTHDAERIMSG SSIQPFNGGGGSGNLYSNIDDRWTEDNPNQNAFYPRLSYGAETSNNINNFQPSTWWVRDF SFLRLKTMQISYNLPKDWVNKVRLKNAAVYVMGTNLFTLSKFKLWDPELNTDNGASYPNT TSYSVGVNFTF >gi|226332190|gb|ACIC01000130.1| GENE 4 5534 - 7462 1455 642 aa, chain + ## HITS:1 COG:no KEGG:BT_3520 NR:ns ## KEGG: BT_3520 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 642 1 642 642 1283 100.0 0 MKKYNYILLGMLSLFFVTTLTSCSDYFDQVPDDRLSLEEIFKTRDGALKYLSNVYTFLPD 
EFNQRQVHETSLYRTPGPWTGASDETEWTSSGNKAKLINNNSIDATENTMVLYRWKSWYS GIHESAVFTKYVDQAPLTASERAQWKAEAKALRAIYYFYLVRTYGPVPVLEDDYALDTPS NELQLSRSTVDRCFDFIVSELKEAQNAGLLEDASSDKTTGVGRIDKAIAQAFIIEALTYR ASWLFNGECTYYADMANPDGTRLFPSQPDAATIKADWQKVVTECQKFFADYGNRFQLMYT DKSGKSVAGPDAEGFNPYESYRRAIRTLFSEMGNNKEMIFYRMDNAAGTMQYDRMPNKSG NTNDYRGGSLLGATQEMVDAYFMANGMSPVTGYAADGVTPIINEASGYKEDGTSSSDYKS ADGTLYAPAGTRNMYVNREPRFYADITFSNSKWFSGTEGDYTVDFTYSGNCGKAQGNNDF TSTGYLVRKNMDSGDRNQNLVCVLLRLTNIYFDYIEALAYVDPSHADIWKYMNLIRERAG IPGYGTASLPKATTTSEIMKLIQKEKRIELSFENCRYFDVRRWGLTNEFFNKPVHGMNVN YDGNEFYKRTEITSRNFDRQYFFPIPQSEIDIDKNLVQNEGF >gi|226332190|gb|ACIC01000130.1| GENE 5 7491 - 8777 1070 428 aa, chain + ## HITS:1 COG:no KEGG:BT_3521 NR:ns ## KEGG: BT_3521 # Name: not_defined # Def: alpha-1,6-mannanase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 428 2 429 429 860 100.0 0 MKQYIFSALCLVSGAFCLSSCNDDKEARPYTPDYEIVPEYTNADTWKAYEAFNEHLLDQN KFIYKSSTADKAAVDRWNGAAAIWCQPTYWDMAMNAYKRAKAEGDTQKEQKFKQLCDDLF AGNKAHYANFDFDDNNENTGWFIYDDIMWWTVTLARAYELFGVEEYLSLSEESFGRVWYG SEKVGDTGSYADPEKGLGGGMFWQWQPIKNPNPNEADHGKMACINFPTVVAALTLYNNVP TGRTESTDSHPSYQTKEQYLAKGKEIYAWAVENLVDVTTGQVADSRHGNGNPAWKDHVYN QASYIGASVLLYKATGEKQYLDNAVMAADYTMNTISGTFDLLPFETGAEQGIYTAVFAQY IAMLVYDCDQTQYIPFVKRNINYGWANRDKTRDICGGDYTKLQVEGDAVESYSASGIPAL MLLFPTDK >gi|226332190|gb|ACIC01000130.1| GENE 6 8803 - 9963 911 386 aa, chain + ## HITS:1 COG:no KEGG:BT_3522 NR:ns ## KEGG: BT_3522 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 386 5 390 390 751 100.0 0 MNTKYSIKKVWYYLLCMIMTLQLIACTEETHESYTAAPEVEDIYIDQLEELINKMKDLQK NSEYGEKKGQYPTESRAILTDAIDDANRSVLLIKYQNPVPSEQEKQRYVASAKSAIDKFK GTIRTEDAETTPAELFVDGKGGNSYIDFGRSEEYVKFGEQGHQSFTVELWVKVTERGRWD NCLFLCSYMSDSSWRNGWMMYWRKDDNGVYRTTWGGLNTTNGDRDLWEPKFQISDDLNKW QHFVAVYSDEGLDGNSTLRAKLYLNGELKKEETVSPTTRVYQSGHYSDYSKPMTAFGRYM RVSDDLYEEGFSGYMKKIRIWKTAKGADYVKQSYEGTAEVTGKETDLAAGWDFTSKPSGT DNEIIDLTGRHTAKIIGTYKWERILE >gi|226332190|gb|ACIC01000130.1| GENE 7 9978 - 11453 1297 491 
aa, chain + ## HITS:1 COG:no KEGG:BT_3523 NR:ns ## KEGG: BT_3523 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 491 1 491 491 962 99.0 0 MKDMKQLINKWGESMLLSLLMALCLTWTFTACSDDKDNEYISDTQLSILEDNRTSLSYLL KNSTFGTAPGTFPEASKEILNNAIAELDQLITKVKGGEMFDEATFEATIAKVNQAIDEFK NSKYYNLSPEAQKFISDLMAKADELREMIANEALWGNHQGQYPVEGKATLESAAEDLESL ADRIKTSAITDMTQEIYDDAIAAADKKLQEVENSAWPEDNLVWNLFVDGNKGGYIDFGYS EDFVKFGDDNNQNFTIELWINIKEFCSKSGEDNSTFLAAFVNSPRSGWRVQYRKVNGGNE HWLRGSMAHWQNEGPKDPEWWEPRAIVNNPKDKWTHFAFAVADNGVPGFDPPQEHTKSCV FVNGSQSGEVIRVGEAWRTYINNGCIEEKMPMTAFCRLNTDKTTREEYFSGYIKYMRIWK GIRSRDDLRLSAMGQVDVDPNDPNLVAAWDFEVLGAQPTGTTITDITGRHVATLKGPEGT YQWVESTTIAQ >gi|226332190|gb|ACIC01000130.1| GENE 8 11675 - 12937 1119 420 aa, chain + ## HITS:1 COG:lin0763 KEGG:ns NR:ns ## COG: lin0763 COG4833 # Protein_GI_number: 16799837 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosyl hydrolase # Organism: Listeria innocua # 87 412 53 334 341 69 25.0 1e-11 MKSKILLAVALLLAISTTSVWAADSSEKTNQKTGSYTNEDVWAAYEGFNNTLLDPDKYIY KTTSAYEQAVDRGHGAAAIWCQPIYWDMSMNAYKLAKAQKDKKKRAYYKELCEKIFAGNK AQYCHFDFDNNNENTGWFIYDDIMWWTISLARAYELFGVDEYLKLSEESFSRVWYGSKKV GDTGSYDKENGGMFWQWQPIHNPKPNRPGDGKMACINFPTVVAALTLYNNVPKKRKESTE ESPKYQTREQYLAKGKEIYEWGVENLLDKKTGRIADSRHGNGNPAWKAHVYNQATFIGAS VLLYKATKEKRYLDNAILAADYTVNEMSAKHNLLPFERGIEQGIYTAIFAEYIAMLVYDC GQTQYIPFLKRNIESGWANRDKTRNVCGGEYEKALPAGAEIDSYSASGIPALMLLFPAEK >gi|226332190|gb|ACIC01000130.1| GENE 9 12994 - 15159 2028 721 aa, chain + ## HITS:1 COG:no KEGG:BT_3525 NR:ns ## KEGG: BT_3525 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 721 1 721 721 1449 99.0 0 MNNKHLLIALGLLLACNHATYAQKGKSKEAKTTFQTSEPWKPETDVRADATMVYGTLDKP GVTFEQRIQSWRDKGYLTEFMTGVAWGDYKDYFLGKWDGVDGHLKEGQRDRNGNEIAHGH LIPYIVPTESFIRYMQETQIKRVIDAGITSIYLEEPEFWMRGGYSEAFKSEWQKYYNFPW RAQHESPENTYLSNKLKYHLYYNALDKIFTYAKEYGKSKGLDIKCYVPTHSLINYTSWQI 
VSPEASLASLDCVDGYIAQVWTGTAREPNFYNGVQKERVFENAFLEYGCMKSMTAPLNRK MYFLTDPIEDRAKDWLDYKINYQATFAAQLMYPMVDTYEVMPWPDRIYQGLYRIAGTDQK ERIPRSYSTQMQTMVNTLNDIRTSDKKITGTQGIGVLMANSLMFQRFPNHNGYDDPQFSS FYGQTLPLLKRGIPVELVHMENTPFKETFKGLHILVMSYSNMKPMKLEYHNYLADWVKKG GILIYCGEDIDPYQTVLEWWNTDGNEYKAPSEHLFEKMNLSRNPGEGTYRYGKGTVIVMR EDPKYFVLKAGNDQKYFETIASAYQKKIGKEIETKNSFIVERGPYTIAAVMDESVSKEPL TLSGLYIDLFDKDLPVLTSKQIQPGEQGYLYDLNKVSGKIKAKVLCGASRIYDEKVSKQS YSFVAKSPINTTNVSRVLLPRKPEKIRVNGKEEQPEWDESSKTLLLSFENDPAGVNVSIE W >gi|226332190|gb|ACIC01000130.1| GENE 10 15189 - 17666 2735 825 aa, chain + ## HITS:1 COG:no KEGG:BT_3526 NR:ns ## KEGG: BT_3526 # Name: not_defined # Def: glutaminase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 825 1 825 825 1662 99.0 0 MKQQLMTLLLGAASVFCSCETQLEQHVKSELRAPAYPLVSIDPYTSAWSFTDNLYDGSVK HWSGKDFPLIGVAKVDGQTYRFMGTEELELRPLVKTSEQGNWTGKYTIQQPADGWQNVGF NDAAWKEGEGAFGTMENEHVAKTQWGEEFIWVRRVADIQEDLTGKNVYLEYSHDDDVIIY INGIKVVDTGNACKKHVQVKLSEEVVASLKQGENLIAAYCHNRGANGLLDFGLLVELDNN RYFNQTAQQTSADVQPMQTYYNFTCGPVDLNLTFTAPLFMDNLDLMTRPVNYISYEVISN DGQAHQVELYFEAAPQWALDLPHQESVADSFTDGDLLFLRTGSRNQDILKKKGDDVRIDW GHFYLAAEKENSTYAIGDGKKLRQSFTENKLEAPTTNGYDKLALVRSLGETKKANGHLLI GYDDIYSIQYFGENLRPYWNRTGSETIVSQFQKAEQEYKTQMQNCAAFDNKLMTEAEAAG GRKYAELCALAYRQALAAHKLVQAPNGDLVFLSKENFSNGSIGTVDLTYPGAPLLLLYNP ELVKATMNHIFYYSESGKWTKPFAAHDVGTYPLANGQTYGGDMPIEESGNMVVLAAAIAA VEGNADYAQKHWETLTTWTDYLVEYGLDPENQLCTDDFAGHFAHNANLSIKAIMGIASYG YMADMLGKKDIAEKYTKKAKEMAAEWVKMADDGDHYRLTFDKPGTWSQKYNLVWDKLLKL QIFPEKVAETEIAYYLTKQNKYGLPLDNRETYTKTDWIMWTATMAQDKATFEKFIEPVYL FMDETTTRVPMSDWVFTDNPIQRGFQARSVVGGYFIKMLEEKLTK >gi|226332190|gb|ACIC01000130.1| GENE 11 17679 - 19970 2330 763 aa, chain + ## HITS:1 COG:L135972 KEGG:ns NR:ns ## COG: L135972 COG3537 # Protein_GI_number: 15673483 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Lactococcus lactis # 34 756 11 715 717 424 33.0 1e-118 MKKLALLAFTLFSAWNMDAKTITGPVDYVSPLVGTQSKHALSTGNTYPAIALPWGMNFWV 
PQTGKMGDGWAYTYDADKIRGFKQTHQPSPWINDYGQFSIMPITGKAVFDQDQRASWFSH KAETATPYYYKVYLADHDVVTEIAPTERAAAFRFTFPENDHSYVVVDAFDNGSFVKVIPS ENKIIGYTTKNSGGVPANFKNYFVLEFDKPFTYTAAVANGNIDTNKLEANDKHAGALIGF KTRKGEQVNVRVASSFISPEQAELNLKELGKDNIDQIAAKGRKVWNDVLGRIEVEDDDID HLRTFYSCLYRSVLFPRSFYEIDAKGDIVHYSPYNGEVLPGYMFTDTGFWDTFRCLFPFL NLMYLSMNMKMQEGLVNTYKESGFLPEWASPGHRGCMVGNNSASVVADAYLKGLKGYDIE TLWEAVKHGANAVHPQVSSTGRLGYDYYNKLGYVPYNVGINENAARTLEYAYNDWCIYQL GKALNKPKKEIEIFAKRAMNYKNLYDPEHKLMRGKNEDGKFQSPFNPLKWGDAFTEGNSW HYTWSVFHDPQGLIDLMGGKDGFNQMMDSVFILPPVFDDSYYGGVIHEIREMQIMNMGNY AHGNQPIQHMLYMYNYSGQPWKAQHWIREVMDKLYTPAPDGYCGDEDNGQTSAWYVFSAM GFYPVCPGTDEYILGTPYFKQMKLHLENGKTVTISAPNNGDDKRYISSMTLNGKDHTKNY VTHEDLMNGASITFKMDSKPNEQRGTKETDFPYSFSNEFKKKK >gi|226332190|gb|ACIC01000130.1| GENE 12 19974 - 21437 1594 487 aa, chain + ## HITS:1 COG:XF0843 KEGG:ns NR:ns ## COG: XF0843 COG3538 # Protein_GI_number: 15837445 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Xylella fastidiosa 9a5c # 45 474 67 497 516 470 49.0 1e-132 MKKQVKYISAGMLAGMLLCGGQMQAANTMSGMHVCVTDAIQKDNRPEVSKRLFRSNAVEK EIIRVQKLLKNEKLAWMFANCFPNTIDTTVHFRKGADGKPDTFVYTGDIHAMWLRDSGAQ VWPYVQLANSDPELKEMLAGVILRQFKCINIDPYANAFNDGAVEDNHWMSDLTDMKPELH ERKWEIDSLCYPLRLAYHYWKTTGDASIFSEEWIQAITNVLKTFKEQQRKDGVGPYKFQR KTERALDTLNNDGLGAPVKPVGLIVSCFRPSDDATTLQYLVPSNFFAVSSLRKAAEILDK VNKKTALAKECKDLAKEVETALKKYAVYNHPKYGKIYAFEVDGFGNHFLMDDANVPSLLA MPYLGDVDVNDPIYQNTRKFVWSEDNPYFFKGKAGEGIGGPHIGYDMVWPMSIMMKAFTS KDDAEIKTCIKMLMDTDAGTGFMHESFHKDDPKNFTRAWFAWQNTLFGELILKLVNEGKV DLLNSIN >gi|226332190|gb|ACIC01000130.1| GENE 13 21447 - 22580 1225 377 aa, chain + ## HITS:1 COG:CC1418 KEGG:ns NR:ns ## COG: CC1418 COG2017 # Protein_GI_number: 16125667 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Caulobacter vibrioides # 10 376 11 378 378 288 42.0 1e-77 MKKLCVWAVAALLMAACTPKAEKTTDSGLLQSNFQMEVDGKKTDLYTLRNKNNMEVCVTN FGGRIVSVMVPDKDGQMRDVVLGFDSIQDYVSKPSDFGASIGRYANRINQGRFTLDGTEY 
QLPQNNYGHCLHGGPQGFQYRVFDAVQPNPQELELTYTAEDGEEGFPGNITCKVLMKLTD DNAIDIRYEAETDKPTIVNMTNHSYFNLDGDAARNEAHLLTIDADYYTPVDSTFMTTGEI APVEGTPMDFRTPTPVGARINDYDFVQLKNGNGYDHNWVLNTKGDITRKCATLESPLTGI VLDVYTNEPGIQVYAGNFLDGSLTGKKGITYNQRASVCLETQKYPDTPNKPEWPSAVLRP GEKYMSQCIFKFSVNKN >gi|226332190|gb|ACIC01000130.1| GENE 14 22720 - 24681 1473 653 aa, chain + ## HITS:1 COG:CC0533 KEGG:ns NR:ns ## COG: CC0533 COG3537 # Protein_GI_number: 16124788 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Caulobacter vibrioides # 27 648 57 748 770 231 28.0 3e-60 MTPSVAQNTKYVNLFIGTSGDNGQVAPGAAAPFGMVCVCPDNDPRSHAGYDYAVTKVSGI SVNRLSGVGCSGGGGNLRIRPVAPSQELHIKKSREKATPGYYSTAFTNGIKTELTATNAM AVERYKFPRSLSAALWIDFASTFEDVATCHYKRISETCIEGYVQAKNVCGHGRYKLYFSL NTSHPFQLEEQKETTACLTFGKKVRSVEVRIGLSALSSELASWECARWEKMDFEDVKSRT ADQWEKQLSAIDVKGGKKDDRVIFYTSLYRTYLSPADVSSPDGAYLGTDGKVYISEDFRY YSNWSLWDTFRTKFPLLVLTEPAKMRDMATSLIHLYATGKKDWSTGFESTPTVRTEHAVI LLLDAYRKGITNLDFRKGYAGMKQEMERLPMRSPDQKMESAYDLWAMAKIAEIIGEKADS EQYRQRSVSLFEETWKKEFMNVTPAFEVMKNNGLYQGTRWQYRWAAPQYIDKMIEWVGQD SLRSQLTYFFDHHLYNQGNEPDIHVPYLFNRLGAPEKTQQIVRSLMTEPMIHKYGGNSEF KTPYLGKAFKNAPEGYSPEMDEDDGTMSAWYVFGAMGFYPLLVGDEYYDLTSPLFDRVLL RLTNGNVLTIQTEGRKKKDAPIKSIHFNGKKIADYRISHNELIKGGELIYNYK >gi|226332190|gb|ACIC01000130.1| GENE 15 24740 - 26803 1963 687 aa, chain + ## HITS:1 COG:SMb20631 KEGG:ns NR:ns ## COG: SMb20631 COG3533 # Protein_GI_number: 16265291 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Sinorhizobium meliloti # 329 531 309 525 640 93 29.0 1e-18 MKQKRSFKIGVAGTLLTVGLLTAAFTTRTASESIRVMDRPDTESTNVNYVSYRAPLRPLN FIKLPVGSIQPEGWVKKYLELQRDGLTGHLGEISAWLEKDNNAWLTTGGDHGWEEVPYWL KGYGNLAYILNDPKMIAETKTWIEGVFASCQPNGYFGPVNERNGKRELWAQMIMLWCLQS YYEYSQDQRVIDLMTNYFKWQMTVPDDKLLEDYWENSRGGDNIISIYWLYNHTGDAFLLD LAKKIHRNTADWTKSTSLPNWHNVNIAQCFREPATYYMQTGDSAMLKASYNVHHLIRRTF GQVPGGMFGADENARLGYIDPRQGVETCGLVEQMASDEIMLCMTGDPMWAEHCEEVAFNS 
YPAAVMPDFKALRYITCPNHTVSDSKNHHPGIDNRGPFLSMNPFSSRCCQHNHAQGWPYF SEHLILATPDNGIAAAIYAACKATVKVGNGKEIVLHEETNYPFEEGIKFTVSTDEKVDFP FYLRIPSWTEGAEVRVNGKKISVKPVSGKYLCIEREWADGDKVEMTLPMSLSMRTWQVNK NSVSVDYGPLTLSLKIDEKYIEKDSRETAIGDSKWQKGADPKKWPTTEIYANSPWNYSLV LDKKEPLKNFKVVRKSWPADNFPFTVANVPLEVKATGRIIPEWQIDETGLCGVLPEEDAV KGAKEEITLIPMGAARLRISAFPNTKE >gi|226332190|gb|ACIC01000130.1| GENE 16 27018 - 28151 1179 377 aa, chain + ## HITS:1 COG:CC1418 KEGG:ns NR:ns ## COG: CC1418 COG2017 # Protein_GI_number: 16125667 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Caulobacter vibrioides # 29 376 24 378 378 286 43.0 5e-77 MKNLALWAVSALFVAACTPKAEQPTDSGLLRTNFQTEVGGKKTDLYTLRNKNNMEVCVTN FGGRIVSVMVPDKDGKMQDVVLGFDSIQDYISKPSDFGASIGRYANRINQGRFTLDSIEY QLPQNNYGHCLHGGPNGFQYRVFDAVQPNPQEVELTYVAADGEEGFPGNITCKVLMKLTD DNAIDIRYEAETDKPTIVNMTNHSYFNLDGDAANNADHLLMVDADYYTPVDSTFMTTGEI VPVEGTPMDFRTPTPVGARINDYDFVQLKNGNGYDHNWVLNAKGDISRKCASLESPKTGI VLDVYTNEPGVQVYAGNFLDGSLTGKKGITYNQRASVCLETQKYPDTPNKPEWPSAVLRP GEKYMSQCIFKFSVDKE >gi|226332190|gb|ACIC01000130.1| GENE 17 28764 - 30377 133 537 aa, chain - ## HITS:1 COG:no KEGG:BT_3533 NR:ns ## KEGG: BT_3533 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 537 1 537 537 1027 100.0 0 MKSIPVSSILYFLLSLGVLFVNANTFTDSQIFPKWMFMFTGLGVIGCFFSFYLFRGKRFI CNAKCCYYTVIISCFLQAGYGILQFFNILSSHSITYNVVGSFDNPAGFAGSLCAGLPFTF YFLQYSNKKVQWASWCALFVIVLGVLLSESRAGSISIFTVLVIYFIHQNKNKFHYKKHIG SFLLCLSFLLISIFVISVTVYHLKKDSADGRLLIWRCTWEMIKHEPFTGYGVGGFSAHYM DYQAQYFKEHPDSQFAILADNIKSPFNEYLSVGVQFGILTWVLLIAAGIFLILCHRKHPT KAGYVSLLSLLSIAVFSFFSYPFTYPFIWIISALSVSTLIGKAYENTLFQKTTIIKRGIA FFLLIGSLFLLIEVVSRIKAELEWGKVAHMSLCGRTNDMLPRYHNLLHTFRRNPYFLYNY SAELCVAEKYKECLAITAICCNYWSDYNLELVQAEAYIGLKQYDAAKHHLEKATLMCPVR FIPLYRLHYVYMKQGKAEKANRLAQLIIEKPIKKNSTIIQKIKAEMKHHLSRKTCLY >gi|226332190|gb|ACIC01000130.1| GENE 18 30399 - 31079 284 226 aa, chain - ## HITS:1 COG:no KEGG:BT_3534 NR:ns ## KEGG: BT_3534 # 
Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 226 1 226 226 438 98.0 1e-122 MNNILSTNHTSIYKLLFVLFFAGICNLPFKCNAQEYSKLDEVPQWVKEKVSHEEYKLWEV MSSVFQIDYSFLKKDISQERKKEILHLLQNTVSRIQNGSYNIEKGSLFCVADEFKIDSLT NWSTEILECRKKSGIIYSSRDGYDAHYKLTVIYGYDKMNKKVYPIKNELECISYSGIEIT CNDLPIVKYNEVEGKLQGTCYGNLFFKDEKGATHSEDFTKYFVLEP >gi|226332190|gb|ACIC01000130.1| GENE 19 31187 - 31891 367 234 aa, chain - ## HITS:1 COG:no KEGG:BT_3535 NR:ns ## KEGG: BT_3535 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 234 1 234 234 464 100.0 1e-129 MNRLRHLMSLCIFISLMACEQNEDWVVNEPMQSFEENPEYAPLNTIPDWVSEKVTPKEYE LWRTMSSRYEINYSFLKKDISEKRKKEIYDCINNICERIEKGQINKYEGFLNIADEDGTT LSDSQYFGRIATRSPEGGAEYKTNGCTLYTHSLGPYIKAAVTYKKSDDDVTITSSSVYTG SPYLGNDPSFSGASSVSYDKDKKLIAASCSGTLSFKDGSRKVEVTVQKTGFMIP >gi|226332190|gb|ACIC01000130.1| GENE 20 32298 - 33284 644 328 aa, chain + ## HITS:1 COG:no KEGG:BT_3536 NR:ns ## KEGG: BT_3536 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 328 2 329 329 610 100.0 1e-173 MKKILFLMVICVATSLTSILPASAQNSDQTDEYKTLLKKIMTLSGSSASSEAIMSQLMSS MKNGPFQQDEAYWKDFASKWTRKIEDKVMEVYAPIYQQHMTLDELKKVVAFYESPAGRKL GETATAVATESMPMIQQLSMEMVQEMMPKLQKSRELVRDDAAEQAKTRDQKLFDAAYTAP KDSIEIVADRTYEHGMGTKPLLHSIERRKDDTKVTFLQPIYFDWQWMYYSPGFKIVDKKT GDEYMVRGYDGGAPMDKLMIIKGCNRKYIYVSLLFPKLKKNVKEIDIIEALTDKEELPSN DDGKAKSYFDVKVDDYLVLSRENKKVYY >gi|226332190|gb|ACIC01000130.1| GENE 21 33335 - 33553 206 72 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVWIITIVLMSIDIYIVFMDPLERQYDLTQVVIAWYLVTLLLDFLFRHNKIYGVIRNIIA TVLLCVFYFVFP >gi|226332190|gb|ACIC01000130.1| GENE 22 33566 - 33796 79 76 aa, chain + ## HITS:1 COG:no KEGG:BT_3537 NR:ns ## KEGG: BT_3537 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 76 5 80 80 96 100.0 3e-19 MRYAVVLIINIILVLAGFFLLFGDPLNKQLPTKMIAIWFIVLVILDYPLRHNKIYKEIRI LVYIIFLALLYWLFPT 
>gi|226332190|gb|ACIC01000130.1| GENE 23 33900 - 34712 211 270 aa, chain + ## HITS:1 COG:no KEGG:BT_3538 NR:ns ## KEGG: BT_3538 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 270 1 270 270 491 99.0 1e-137 MNKRVTIQFADFIKSRLAGGLLCCLVLLAAYLFQMFFPGMYVDAFSSSFYLFAWLILFVG ALLWTLREKGTSGLPSEITDWQCIAAFLLIVANQIYVVMPKEVHVELAPVLSAYTLPVLV IVSLAIAGIVKLSISFYRSLEIERQKTILQEKRLQELLSSPPPVLEGVIFVRRLLENQSI YQLTASENLTLIEGCRAIDPDFFVWLKDKQIKIPPRHIVYCVLIRMRKTKKEILSIFDIS DGACRAMRSRVRASLGIEDEDMETFLQELH >gi|226332190|gb|ACIC01000130.1| GENE 24 34709 - 34960 176 83 aa, chain - ## HITS:1 COG:no KEGG:BT_3539 NR:ns ## KEGG: BT_3539 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 83 13 95 95 135 100.0 4e-31 MWGISITMCYFHRFIDDINYSLQDFLITFFELLAWIVLIIGAIDTFPQNKYSNKRVWFYY AIMGGFISAIHSFIGLINILEIT >gi|226332190|gb|ACIC01000130.1| GENE 25 34993 - 35700 485 235 aa, chain - ## HITS:1 COG:no KEGG:BT_3540 NR:ns ## KEGG: BT_3540 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 235 1 235 235 398 99.0 1e-110 MITKKKKYTLYYVMALACGLVASFFIYSCSADGYYSDEIEKNEVTNTRALSSKMINNGST LIDSIASSDEFWEFEMSSELLADKFHEYTSILSEEEYDKLMENLNDDDYVEDFMRKANLE NELQQLAKAKENLIKHTRFLRLSADERTQLFILYAESNELTKVKLLKTREEGGSTSSCEE QKQAAYKQAKADYDNAIANCQNGSMPSGCLIQAAAKYDRAKDIANKEYKECIANK >gi|226332190|gb|ACIC01000130.1| GENE 26 36473 - 38353 912 626 aa, chain + ## HITS:1 COG:no KEGG:BT_3542 NR:ns ## KEGG: BT_3542 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 242 626 1 385 400 817 100.0 0 MVSFLETALQQAGENRVELEKVLSHYKTDPADSLKYKAACFLIENMPYYTYYKGKQLDRY LTYYTLLQETRGLGISPQVVADSVCHMYGALYLDSLQSYRDIETVDSAYLCNNIEWSFKV WQEQPWGKHVSFADFCEYLLPYRIGDETLTSWRESIYQKYNPLLDSLRASGVLDKEDPIV AARCLLDSIRKGGVVFTTAVPASLPHVGPEVAQLKAGSCRELSDFVVYVCRALGIPCSID FMPIRGDENDSHQWVAFSDKYGILYYHEFPNGVSEVRKDAMCGMPKIKVYRNTFSLNRAM 
QEEMLKLDTAIVPLFKDPHIVDITFPYTKDFKKELHIPKDALYKGKPRSRIAYLCASKRM DWEPVAWTEFDGKNIVFTDIQKGPVMRVATYERGRLRFWTDPFEINVSNEFHFFTPSDSV QDVTLFAKYTLRADEMFLNRMIGGTFEGSNDPDFREKEVLYLINEKPKRLQTVVQSYSSK SYRYVRYIGPKDSHCNIAEAAFYTPNDTASLKGKVIGTPGCFQKDGSHEYTNVFDGDVTT SFDYIEPSGGWSGLDLGTPKQIGRIVYTPRSYDNYIRSGDDYELFYCARRNNWKSLGDQR SKADSLIYIKIPVNALLLLCNNTRGI >gi|226332190|gb|ACIC01000130.1| GENE 27 38530 - 39252 476 240 aa, chain + ## HITS:1 COG:no KEGG:BT_3543 NR:ns ## KEGG: BT_3543 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 240 1 240 240 426 100.0 1e-118 MISRKKSLFLSVSVLILFVALVAQSCSSENDDFVANPNSSLEMRANLESMDNRATIVNSI AESDELLDYGMNCMLLAEKLKSYTSTLTPEEFDELMNNLNNDDYMIELVSKVDIEKEALM VENARQELLANKSFKLLDESEKMNVFIRFSDNSQNTMQHLLLKSPGESAGGVTKAECKRR YDEDYAYASAIYLSSMLLCSCATGGLGVCLCAIGASAGYDYATTLADRSYYDCLSNAIDR >gi|226332190|gb|ACIC01000130.1| GENE 28 39331 - 39513 115 60 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MINDWYISGVYVLMWVFLFLYTVSKVPKKKKGLKEYLPSILAFLVVLLNVLDVLRRHVMC >gi|226332190|gb|ACIC01000130.1| GENE 29 39606 - 41195 326 529 aa, chain + ## HITS:1 COG:no KEGG:BT_3545 NR:ns ## KEGG: BT_3545 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 529 1 529 529 928 100.0 0 MSKVSITQIGFALLCIGSVFMYSTQITDPYIVSKWLYTILFVLIITIYCSIRMLLGKSVK FDTRLAGMSIVIVSSLQAIYGLSQCFNITTFNTFYKIMGSFENPTGFSACLCVSLPFFVV FQLLNENKQIRYLVCFLGIIVVIAIVLSYSRAGIISVAIVIAIFLFQKLKQKRIWKYLLL CSSLILLLFGCYWMKKNSADGRLLIWQCGINMVKDAPWIGHGLGSFEAHYMDYQANFFKQ CGQSRFSMLADNVKQPFNEYLGLLLNFGIIGLLVLLLLMVIIIYCYRKNPSVEKQIAFYS LSSIGIFSFFSYPFAYPFVWVVTFLSIFIITSEYIKPLFSNILIKKIACMFILTYSLFGS SKLFERIQAELDWGKASTLALCKSYNETLPTYERLEKMFVSNPYFLYNYAAVLQEMKQYT ESLEVALKCRQYWADYDLELIIGENYQQLNKPELAEKYYNSASMMCPSRFLPLYKLFHLY KTNGEKERSLAMAEAVISKPMKIKTTTIRMMKREMEREIQKMNMSIKLE >gi|226332190|gb|ACIC01000130.1| GENE 30 41295 - 44192 2512 965 aa, chain + ## HITS:1 COG:no KEGG:BT_3546 NR:ns ## KEGG: BT_3546 # Name: not_defined # Def: glutaminase # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 965 1 965 965 1950 98.0 0 MRIYKTLMNKTFIYLLIAVLAACKSTPYSDETTKSDLRAPAYPLLTLHPHVKLWSMTDEL NKQNMTFGGSTQLPFVGFLRVDGAMYRFMGSKELPMQAIAPMALDHEKWEGKFTSLVPDE GWEQPDFDDKYWQLSEGAFGTPGMWEARTQWTSSNIWVRREVEVDPYLLEHKKIYLRYSH DDVFQLYINGKQLVNTGYDWGANFKVEVPDSILQTMKSGKALIAAHCENRVGGGLVDFGL FAEEPTMPVEKVAPISYEKEWTGRYTMEQPQENWEAKEFDDTTWTEGQAAFGTDDQRNVH TPWFSPNIWVRRELTFDPALAKNKQLYLRYSHDDVFQLYVNGMQLVSTGYEWKSDLRVNI PDSIAETMKDGKAVIAAHCENRMGGALVDFGLYAELREAEQTSVDVQATQTHYTFECGDV ELKLSFTAPYLLDDLKTLARPVNYISYQAKALDGKEHDVAVYFEIDPHKAFRAGQSTQIY EKDGLALMKTGQENQKLWVDKEKDGRSWGYFYLGTKDDVTCAQGDAAEMRAYFMKEGDLK QMRKSAEKRYAAFAQKLDVGSDFPQHLTAAFDGLYTMAYFGEDLRPYWNKDGKTSIEDLY TEAEKDYKELMAKCYAFDRQLMADAYLAGGKEYAELCALAYRQSVSAFQMSEDSDGELLY FTPQVGPVDEYYPASPLYLRYNPDLVKAMLNPFFYYSESGKWGKPFPPHDLGGYPIVNGQ TIGGDMPVEEAGNGLIMTAAIAKMEKNASYAEKHWKTLTQWAEYLLENGTDTDNQLATDN FAGDCPHHANLSAKGILGIAAYARLAEMLNKKEEAKKYMNAAGEMAKEWEMAAYAGDHYR LAFDQPDSWGMKYNLVWDRLLGLNLFPERVIQKETDFYLTKMNEFGCPLDSRHNYTKVDW TVWSASLSADRMQFRKLMLPLHKFMNETTDRTPMADLINTDGKTIVGFTGRTVVGAYFIR LLEKK >gi|226332190|gb|ACIC01000130.1| GENE 31 44271 - 44918 511 215 aa, chain + ## HITS:1 COG:no KEGG:BT_3547 NR:ns ## KEGG: BT_3547 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 215 1 213 213 372 98.0 1e-102 MNMKSLFYLVVTMVVIIACTNEEFDANSEVKKEEVKTKIASLAEKYDVNIDFFEINPSEK TENEMSLEEIEELFRDIKQLREHPVELKMYKDKEENGCMFYSTKEQKPSVPLAKTRSEIY SFSDWIWNLTWFSVTLIEDNKNVTVMTDITGLSVYTYSQSYASGYVSGNSISFNSTGKIK ATITQVVGFNLSYVINSSGTYDKSAGKGSVTVSAY >gi|226332190|gb|ACIC01000130.1| GENE 32 44940 - 45155 160 71 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIMPPMFFEIVYVLILWIGILAVLKSHQLTKPYKYLWVILVLLFNLLGIIAFLIWKQYLN KMSNSGSHGNV >gi|226332190|gb|ACIC01000130.1| GENE 33 45459 - 46673 917 404 aa, chain + ## HITS:1 COG:FN1382 KEGG:ns NR:ns ## COG: FN1382 COG1373 # Protein_GI_number: 19704717 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: 
Fusobacterium nucleatum # 1 402 1 402 402 332 45.0 1e-90 MINRELYMEQITPFIDKPFVKVITGIRRCGKSVVLRLIREELLRRGVSEDYIIYMNFESF EWIDMKEAKALYAHIREATKAPGKYYILLDEIQEVTDWEKAVNAFLVDLDVDIYVTGSNS RLLSSEFSTYLAGRYVAFHIMTLSFREYLLFHDLPLTISVSDRKTEFLKYLRMGGFPAIH TANYDYNAIYKIVYDIYSSVILRDTVQRHSIRNVEMLERVVKFVFDNIGNKLNAKNIADY FKSQQRKVDINTIYNYLNALDSAFVIQRIPRYDIKGKEILQTNEKYFVSDLSLIYSVMGY RDRLIGGMLENLVCMELKRRGYEVYVGKQDEKEVDFVAIRREEKIYVQVTYQLGLQATID REFAPLLAIDDHYPKYVVSMDDLWQDNIEGVKHRHIADFLLDEW >gi|226332190|gb|ACIC01000130.1| GENE 34 46681 - 47748 794 355 aa, chain - ## HITS:1 COG:AGpA709 KEGG:ns NR:ns ## COG: AGpA709 COG0624 # Protein_GI_number: 16119709 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 66 349 69 378 387 106 28.0 9e-23 MKYDIPTMTAEAVSLLKSLISIPSISREETQAADFLQNYIEAEGMQTGRKGNNVWCLSPM FDLKKPTILLNSHIDTVKPVNGWRKDPFTPREENGKLYGLGSNDAGASVVSLLQVFLQLC RTSQNYNLIYLASCEEEVSGKEGIESVLPGLPPVSFAIVGEPTEMQPAIAEKGLMVLDVT ATGKAGHAARDEGDNAIYKVLNDIAWFRDYRFEKESPLLGPVKMSVTVINAGTQHNVVPD KCTFVVDIRSNELYSNEDLFAEIRKHIACDAKARSFRLNSSRIDEKHPFVQKAVKMGRIP FGSPTLSDQALMSFASVKIGPGRSSRSHTAEEYIMLKEIEEAIGIYLDLLDGLKL >gi|226332190|gb|ACIC01000130.1| GENE 35 48183 - 48938 354 251 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253571095|ref|ZP_04848502.1| ## NR: gi|253571095|ref|ZP_04848502.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 251 1 251 251 520 100.0 1e-146 MFFLFIIAYLFIPNYMFNILSNITLRNITTAGVAVVIYLCTSCRNNHAEVERIIRNHLEP TKELYALALQTGEVTLPLGGKYPHVTYPAPGGTETPISNVWIVNGNYMGKTYECHGCDSC MKCYQDAGLITYNIVRRDSTSVRSTVCLTEKGKQYLIERHISGFHPIINAWRRREQLEVV LVAKEKFDLEIYPSDTTEIYHCKVHRKLEITPFIEALGATGKENQPDYVRKLRVDLNNKE YPMVTYIEMEP >gi|226332190|gb|ACIC01000130.1| GENE 36 49026 - 50006 702 326 aa, chain - ## HITS:1 COG:no KEGG:BF3416 NR:ns ## KEGG: BF3416 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 323 5 340 343 147 30.0 5e-34 MKKIIQLLYILFAVAAFSACSDDNQGEVVSQDLSYQLDANYVFVDIPEKFLFDPSNILYL EGKTVMKLNAKSDGKQTTITEGTEFTFNVKLKKALNQNVKVRLKKNLDLLEGHTLTEFPD EAFELNDAIITAGSREGTITLNITNPDILNKLPGYVLPLRLEFVDAVSGVKISKQSYNVL IEINLKLEKDNIDPSNDEITGEYFNNNVVFESNKSDGLEFLYDGKNSGNTWYPNKNTFLT MTFPELTRILGVRINFDTGYYSLGSLNVYADEGNGFISYGKLTRNANGTIYLKFKEPADI RALKFDGMLTTSGGTGPDIYEITFIK >gi|226332190|gb|ACIC01000130.1| GENE 37 50043 - 51242 970 399 aa, chain - ## HITS:1 COG:no KEGG:BF3614 NR:ns ## KEGG: BF3614 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 8 398 10 390 393 293 39.0 7e-78 MKKHIALLFLSLGFFTACDMVNGEGEGADNAVYMGNTNSSGVISMVVSNDKGGSSVITPR LANITDQPVEITVEVDKQLLEEYNTKAGLTLEPMAIEDFIFITKDKKETHGKAVVTIEPG QYNASVEIKIPQIDETKYPYSKRFAIPVSITSSSKYKILSSPDFTIIRLSRELVTSVGQF SRAGSIALVPNAELRKPMDNWTMQVSMLYPRMTNSNQTVMSIQSGTGDFYTRITNDKGIQ IKNGRDGDDTWTNKPLSNGKWLNISLVHRNASSISVYVNGELQKTFETSPIYFSDMKKCC LFIGNTQYTNVYIREARMWNRALTDGEIIDKEYLPQDPTDPNLIMYMPFNTVENNEMKEL TGNWEISDFRTLGIWDEDPPQISYVENVQFPAEDLIIVE >gi|226332190|gb|ACIC01000130.1| GENE 38 51271 - 52383 980 370 aa, chain - ## HITS:1 COG:no KEGG:BF3414 NR:ns ## KEGG: BF3414 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 21 370 24 346 346 323 49.0 6e-87 MNNIKKCFVLSLMASFMFSCTDIETIDLEKEAVKDLYENRDKDKWAEEDAQKQQNYEDSV RIAEENKRLYELYLADLREYKETKHPVMFGWFNAWSAETPGEYSNLTLIPDSMDIVSIWG NCFNINEKRLKQMREVQSKGTKVIVGWIVENVGNGLSNIPEGGWSDDPTTGIKQYAQAIL 
DSIAKYGYDGFDIDYEPSYASPFKPGNHCGDWTNDWTDYRPIISCSSYDNKEYENLFFQT LRDGLDKLEAKDGKKKILNINGSIHYLSPEMAPLFSYFVAQSYNGSYSGWTSRITNRLGN DVKDQIIYTETFESGQANRISFENYANFVVNNLNREAGGIGAYHINADSFDKNEYRYVRN AISIMNPPIK >gi|226332190|gb|ACIC01000130.1| GENE 39 52405 - 54024 1235 539 aa, chain - ## HITS:1 COG:no KEGG:BF3612 NR:ns ## KEGG: BF3612 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 12 538 15 542 542 575 54.0 1e-162 MKIQIKKIVGVLLACAGIMTSTSCADFEEINKNPYLPDKDMEQKDGVLNGAYMPNLEKHI IPVPFKTDNTDLTNAYQISINLCGDSWIGYFSPRDNKWNEASNTTTFFFSEGWVNLMYEY AITNIFAPWIQLKNINMSGENPNKEMYAIAQISKIMGLHRSTDKFGPIPYKQVGNGSFTV EYDSQESVYRSFFEELDEAVTVLYDFYMKANNTVPMASDVVYEGDVTKWIRLANSLMLRL AIRVRYADEILARTYAEKAIKNPIGVMESIDDMAKMNRGANLQTKHPMYQIADPGQYNDS RMGATIQCYMKGYADPRISKYFQDQGKNAIRAGLPVTQKVYDGASLPNISEDDPVYWMRA SEVDFLRAEGALAGFDMGGGTPQQYYESGIRKSFAECGASIGSYLNSTTPPLAFTDPVNA AYNYAAPSIITVKWNEADDVEKKLERIITQKYLALFPDGQEAWSEWRRTGYPRQIPPVNN FTNSDVKKSDGHKDGVRRIPYPRNEYEHNGENLQKAIQQYLGGIDKASVNVWWDKKEKK >gi|226332190|gb|ACIC01000130.1| GENE 40 54044 - 57322 2306 1092 aa, chain - ## HITS:1 COG:no KEGG:BF3611 NR:ns ## KEGG: BF3611 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 25 1092 24 1089 1089 1350 62.0 0 MNLYLENTGKGILITFLLLIGSMLYAQNNSKITIKKKNISLQTALADIREQTKMSVSYNS SQLPKTRISLDINNQSLDQALKTILAGTGFTYTVKDTYIMIIPEQTAKKSKSRNVVGNVV DGKGAPLIGVTVVEKGTGNGTVTNMEGNYHITTQGATPVLVFSYIGYQSKEIVVKENTIN IVLEDESQALDEVVVTALGIKRAEKALSYNVQTVSNSELTVAKDANFMNSLNGKVAGVNI QKSSSGVGGATRVIMRGSKSIVGNNNALYVVDGMPIGNPSRGVIQTEYGAVAGSESISDF NPDDIESISVLTGPSAAALYGAAAANGVILINTKKGSEGKMKINFSSNTEFSKPMISPEF QNSYGNRDNSYRSWGEKLATPSSFDPMDFFKTGMNTINSLNISSGTKTNQTFISLASTNS KGVIKNNEYYRYNFSFRNTALMLDDKLHVDLGASYVIQAEQNMISGGRYFNPLFPLYLFP RGEDFENVKIFERYNEERRFPTQNWEYGDQGLSFENPYWIINREMFPTKKSRYMLHARVQ YDIFDWLNIAGRVRLDKTHSTEERKLHASTLELYTGSSKGSYTNKEEFYTQTYADVMANI NKRFGTDFSLTANVGGSFEDHYTRSIDVGGKLMTVPNLFSLANVEPASGKRDQGYLRTRN VAFFASAELGWKNMLYLSATGRCDWASQLVSNGDTPCIFYPSIGLSGIISEMVKLPEFIS 
YLKLRTSYTEVGSPITQTGITPGTVTDPMSGGVINPISTYPYPDFKPERTKSYEVGINAR FLNGRISFDGTFYQSNTYNQTFLSVMSPATGYSGFYVQAGNVRNRGVEMTLGYNDTFGKV NYSTHLTYTANQNKIVEMVHDYRNPIDNSLVNITELTLKEVGDVYLREGYSMSDVFTTGI LQRGRDGKLVEEGNGYQVDRSQRIRLGSADPDFTMGWRHDINYKNISLGLMITGRFGGIV TSQTQAFMDAFGVSKVSAEARDNGGVMLDGHRYDAERYYNTVGGQGLMSYYTYDATNVRL QELSLTYSLPKKWLGDVFNNATISFVGRNLFMFYRKAPFDPDMNGSTGTYNRSDFFMAPS LRNLGFSIKFSL >gi|226332190|gb|ACIC01000130.1| GENE 41 57510 - 58499 520 329 aa, chain - ## HITS:1 COG:AGpAbx251 KEGG:ns NR:ns ## COG: AGpAbx251 COG3712 # Protein_GI_number: 16119537 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 116 311 105 301 311 74 31.0 3e-13 MKNKLEKQLRSFIDFDYSKRTIDRFHRWMISSDSAEEKETALRNLWFKTKGKAEHDMEYS FRQVLDKIGIEYTPMVTDVNRWNLWKSVAAAAIIVVLSVTATLWISYNHFDRDNIAMVEH YVNNGTRETISLPDGTTVHLNSGSHVFYPENLEGKTRTIYLIGEAEFKVARNPKKPFIVR SSNMAITALGTEFNVKAYPEEDVITASLIEGKVRVDCNDTISYVLTPGYQVVYNKCTDDC QMLTANMKDVTAWMRGELVFDKVTLTEIVRTLERHYGITFHISTKKSNQDRYNFVFRKDA TLEETLEVMKVVIGQFDYRLEDSICYIIW >gi|226332190|gb|ACIC01000130.1| GENE 42 58557 - 59096 394 179 aa, chain - ## HITS:1 COG:RSp0849 KEGG:ns NR:ns ## COG: RSp0849 COG1595 # Protein_GI_number: 17549070 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Ralstonia solanacearum # 38 166 37 161 169 68 31.0 5e-12 MIITRMEDKVLRFKKFFDLNFPKVKTFAWQLLKSEEDAEDIAQDIFVKLWEKPDLWLERE KLDSYLYTVVRNHIYNFLKHKAVEYDYLDVAAEKMQMAERGLPTPDDEFCAHELELFVQM ALERMPEQRRRIFLMSREEGMTSPEIAAKLNISVRTVEQHIYKALQDLKKSFYFYFFSI >gi|226332190|gb|ACIC01000130.1| GENE 43 59330 - 59464 83 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFELQRKSSVFQILSISLTDKTKFQNDKERFRPNGPNLFNSNSS >gi|226332190|gb|ACIC01000130.1| GENE 44 59645 - 61450 1589 601 aa, chain + ## HITS:1 COG:VC2484 KEGG:ns NR:ns ## COG: VC2484 COG1022 # Protein_GI_number: 15642480 # Func_class: I Lipid transport and 
metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Vibrio cholerae # 5 601 7 597 601 493 41.0 1e-139 MTYHHLSVLVHRQAEKYGDRVALRYRDYETAQWIPITWNQFSGTVRQAANAFVALGVEEQ ENIGIFSQNKPEWFYVDFGAFANRGVTIPFYATSSPAQAQYIINDAQIRYLFVGEQYQYD SAFSIFGFCSSLQQLIIFDRSVVKDPRDVSSIYFDEFMAMGEGLPHNEVVEERTARASYD DLANILYTSGTTGEPKGVMLHHSCYLEQFHTHDERLTTMTDKDVSMNFLPLTHVFEKAWC YLCIHKGVQICINLRPADIQTTIKEIRPTLMCSVPRFWEKVYAGVQEKINETTGLKKALM LDAIKVGRIHNLDYLRLGKTPPVMNQLKYKFYEKTIYSLLKKTIGIENGNFFPTAGAAVP DEINEFVHSVGINMVVGYGLTESTATVSCTLPVGYDIGSVGVVLPGLEVKIGEDNEILLR GKSITKGYYKKAEATAAAIDADGWFHTGDAGYFKNGQLFLTERIKDLFKTSNGKYIAPQA LETKLVIDRYIDQIAIIADQRKFVSALIVPVYGFVKDYAKEKGIEYKDMAELLQHPKIVG LFRARIETLQQQFAHYEQIKRFTLLPEPFSMERGELTNTLKLKRSVVSENYKELIEKMYE E >gi|226332190|gb|ACIC01000130.1| GENE 45 61582 - 63861 2321 759 aa, chain - ## HITS:1 COG:sll0915 KEGG:ns NR:ns ## COG: sll0915 COG0612 # Protein_GI_number: 16330991 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Synechocystis # 12 276 249 514 524 95 25.0 5e-19 MSLTRDNSKVQEAIFSSLFPKHPYGTQTVLGTQENLKNPSITNIKNYYKQWYVPNNMAIC MSGDLDPDETIALIDKYFGGLKPNPELPKLNLPKEDPITAPVVKEVLGPDAESVALAWRF PGLASKDFEVLQVVSQVLYNGKAGLIDLDLNQQQKVLNSYGYPMGLADYSAFILGGLPKQ GQTLEEVKDLLLNEIKKLRAGEFDEKMLQANINNFKLYELQSMESNEGRADIFVNSFING TNWEDEVTAIDRMAKLTKEDIVAFADKYLKEDNYAVVYKKQGKDPNEKKMTKPEITPIVS NRDVASPFLTSIQESAVKPVEPVFLDFKKDMSQLTAKSDIPVLYKQNVTNDLFQLIYVFD MGNNNDKALGTAFDYLEYLGTSDMTPEELKSEFYRLACTFYVSPGNERTYVVLSGLNENM PAAMQLFEKLLADAQVNKEAYDNLVGDILKARADAKLNQGQNFSRLMNYAMYGPKSPATN LLTEAELASMNPQELVDRIHNQNNYKHRILYYGPSSSKDLLATIEQYHQVPATLKDIPAG NEYSYLETPATKVLVAPYEAKQIYMAQISNLDKKYDPAIEPTRELYNEYFGGGMNSIVFQ EMRETRGLAYSAWAGIMPPSYLKYPYTIRTQIATQNDKMIDAVNTFNDIINNMPESEAAF KLAKEGLINRMRTDRIIKSDIIWTYINAQDLGQSVDPRIKLYNDVQTMTLKDIVDFQKEW VKGRTYVYCILGDKKDLDMNKLKAVGPIEELTQEQIFGY >gi|226332190|gb|ACIC01000130.1| GENE 46 63716 - 64501 700 261 aa, chain - ## HITS:1 COG:sll0915 KEGG:ns NR:ns ## COG: sll0915 COG0612 # Protein_GI_number: 
16330991 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Synechocystis # 42 212 62 235 524 112 34.0 9e-25 MNKLLKLSCLSLFLALVMSSCSSQKKYSYETVPNDPLKARIYTLDNGLKVYLTVNKETPR IQTFIAVRVGGKNDPAETTGLAHYFEHLMFKGTDKYGTQDYAAEKPLLDQIEQQFEIYRQ TTDEAERKAIYHTIDSLSYEASKYAIPNEYDKLMAAIGSSGSNAYTWYDQTVYQEDIPSN QIENWAKIQADRFENNVIRGFHTELEAVYEEKNNVSDPGQQQSAGGDLLLSLPQASLWNT NRIGYTGESEEPFHHQYQELL >gi|226332190|gb|ACIC01000130.1| GENE 47 64764 - 65723 1132 319 aa, chain + ## HITS:1 COG:SA0709 KEGG:ns NR:ns ## COG: SA0709 COG1186 # Protein_GI_number: 15926431 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Staphylococcus aureus N315 # 7 318 23 330 330 243 44.0 3e-64 MKLVKDLQKWIDGYNEVKTLADELELSFDFYKEELVTEGDVDAAYAKASEAVEALELKNM LRDEADQMACVLKINSGAGGTESQDWASMLMRMYLRYAETNGYKATIANLQEGDEAGIKT CTINIEGDFAYGYLKGENGVHRLVRVSPYNAQGKRMTSFASVFVTPLVDDSIEVNILPAC ISWDTFRSGGAGGQNVNKVESGVRLRYQYKDPYTGEEEEILIENTETRDQPKNRENAMRQ LRSILYDKELQHRMAEQAKVEAGKKKIEWGSQIRSYVFDDRRVKDHRTNYQTSDVNGVMD GKIEEFIKAYLMEFSSQES >gi|226332190|gb|ACIC01000130.1| GENE 48 65807 - 65941 103 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGKSKRKVAHSKKEEEQAQRVVKIVFVSLIILALIMLIAYSFFG >gi|226332190|gb|ACIC01000130.1| GENE 49 66169 - 67221 855 350 aa, chain + ## HITS:1 COG:no KEGG:BT_3553 NR:ns ## KEGG: BT_3553 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 350 1 350 350 710 99.0 0 MSRITTTVALLLMATAACGQAKEDAGDGKQLDKQTMELKDYLPEIHGTIRGKYEFQTETN ESRFEVRNARFSVSGNVHPIVAYKAEIDLSDEGSIKMLDAYARVFPVKDLNFTIGQMRVP FTIDAHRSPHQQYFANRSFIAKQVGNVRDVGFTGCYTQKEGLPFILEGGLFNGSGLTNQK EWHKTLNYSIKAQLLPNKNWNLTLSTQMIKPEHTRINMYDAGIYYQNSRFHIEAEYLYKM YGHEAFKDVHAVNSFVNYDLPLKKVFNKISFLARYDMMTDHSNGLADSETGVLKINDYAR HRVTGGITLSLSKAFIADLRLNFEKYFYQKSGVPKESERDKIVIEFMTRF >gi|226332190|gb|ACIC01000130.1| GENE 50 67242 - 67709 496 155 aa, chain + ## HITS:1 COG:XF2357 KEGG:ns NR:ns ## COG: XF2357 COG2954 # 
Protein_GI_number: 15838948 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 1 155 1 161 165 135 46.0 2e-32 MAQEIERKFLVSGDFKSFAFAQSRIMQGYICSARGRTVRVRIRDDKGYLTIKGASNESGT SRYEWEKELPLSEAEELMKLCEPGMIDKTRYLVRSGNHVFEVDEFYGENEGLIVAEVELG TEDEAFVKPGFIGEEVTGDVRYYNSHLMKKPYKTW >gi|226332190|gb|ACIC01000130.1| GENE 51 67760 - 69052 1142 430 aa, chain + ## HITS:1 COG:MTH1451 KEGG:ns NR:ns ## COG: MTH1451 COG3174 # Protein_GI_number: 15679448 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Methanothermobacter thermautotrophicus # 188 402 3 212 236 79 30.0 1e-14 MEQLYNYLPRELITFVLVTLFSLLIGLSQRRISLKREGETTLFGTDRTFTFIGILGYLLY ILDPTDMRLFMGGGAVLGLLLGLNYYVKQSQFHVFGVTTIIIALITYCLAPIVSTQPSWF YVMVIVTVLLLTELKHTFTEFAQRMKNDEMITLAKFLAISGIILPMLPHKNLIPDINLTP YSIWLATVVVSGISYLSYLLKRYVFHESGVLVSGIIGGLYSSTATISVLARKSRKASEQE ANEYVAAMLLAVSMMFLRFMILILIFSREIFTSIYPYLLIMSVVAAVVAWFIHSRHRRSK DMADESEDEDSSNPLEFKVALIFATLFVVFTVLTHYTLVYAGTGGLNLLSFVSGLSDITP FILNLLQNTGSVAVLVVVACSMQAIISNILVNMFYALFFAGKGSKLRPWILGGFGTVIGV NLVLLLFFYL >gi|226332190|gb|ACIC01000130.1| GENE 52 69198 - 69524 463 108 aa, chain + ## HITS:1 COG:no KEGG:BT_3556 NR:ns ## KEGG: BT_3556 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 108 1 108 108 171 100.0 6e-42 MSLLYNFDVERPEELLAILAQNLQKRRLEKGLSREALTELSGVPTPTIAKFEQKHTISLA SYVALAKALGYSKAIKELLSEPLFSTMEELEMINKNKNRKKGRNEISK >gi|226332190|gb|ACIC01000130.1| GENE 53 69738 - 70877 639 379 aa, chain - ## HITS:1 COG:BB0411 KEGG:ns NR:ns ## COG: BB0411 COG1864 # Protein_GI_number: 15594756 # Func_class: F Nucleotide transport and metabolism # Function: DNA/RNA endonuclease G, NUC1 # Organism: Borrelia burgdorferi # 192 360 14 175 195 92 34.0 9e-19 MKYKFIQTLLLVLLPTLFAACGSDNNDPTDNIGGLSVSTDLKNNEVSAKGGSFFLQIKTD GKWTASSQDSWCTINNKEGNGNASTICSVSANDDDERYTVITVTSNGKSENITITQKGGN GEEPDPDPNPSGYAGRIEIPKLKGGNSNLFITHTTQYNGKEVITYSFEYDCTQKSSRWVA 
FTFNTSTPDNNVGRAGDFSDDPSIPSQYRTHDGDYTGSGYSRGHLAASSDRQYSVAANKQ TFYMSNMNPQIQNGFNGGIWASLEGKVQSWGKITNDQDTLYVAKGGTIDNNNIIKYLKAN NTIPVPKYFYMAILSLKNGQYKAIGFWFEHKSYSNSSYAAYALSIDELEEKTGIDFFHNL PDKIENEVERSYNKSDWGL >gi|226332190|gb|ACIC01000130.1| GENE 54 70892 - 71923 626 343 aa, chain - ## HITS:1 COG:no KEGG:BT_3559 NR:ns ## KEGG: BT_3559 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 343 1 343 343 696 100.0 0 MKKFLTVWGLLLFISITSYSQEKKYALYSVAFYNLENLFDTIHDAGKNDFEYLPNGKNKW NSMKYEAKLKNMSEILSQLSTDKLPLGPTIIGMSEVENRRVLEDLLKQPALSDRGYEIVH YEGPDRRGVDCAFFYNPKFFHLTASKLAPYIYENNDTTYKTRGFLIASGTLAGEKVHFIV NHWPSRAAASPARERAGEQVRALKDSLLNEDSNAKVIIMGDMNDDPMDKSMAVALGAKRK AQDTKEHDLYNPWWDTLKKGNGTLMYDGKWNLFDQIVFTGNLLGNDRSTLKYYRNEIFRR DYMFQKEGKYKGYPKRTHAGGVWLNGYSDHLPTIIYLIKEIKD >gi|226332190|gb|ACIC01000130.1| GENE 55 72123 - 74663 2024 846 aa, chain + ## HITS:1 COG:no KEGG:BT_3560 NR:ns ## KEGG: BT_3560 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 846 1 846 846 1580 99.0 0 MKQRLGIVIALFCLSPAIFAQQKAEKNAREDNASFTFTESQLNEDDDAAQSASAFVSSNN DVYLSNVGYLFSPMRFRVRGYNSQYSDTYINGVLFNDVETGRFSYGMIGGLNDATRNKEG IGAFEVNNFTFGPIGGATNINMRASQYAAGSKLSLSGCNRNYILRGMYTYSTGLLKNGWA FTGSLGYRWANEGVIEGTFYNAFSYFLAAEKVFNDKHSLSFATWGAPTERGQQGASTEEA YYLANSHYYNPNWGYQNGEKRNSRVVRSFEPSAIASWDFDINKEMKLKTSAGFKYSNYGT SALGWSGNAADPRPDYYKKLPSSIFNVYDKSTVPSEDELNLFNEVTERWKTSKSTRQIDW DQMYFANQQANALGKETLYYQEERHNDQLAFNFSSIFNHTIDQHNSYVVGLAVNTTKGMH YKKMKDLLGGDLYTDVDKFSVRDYGYNSYVIQNDLDNPNRRIGEGDKFGYDYNIFVNKQN VWARYQGDNDGHFNYFVSGKIGSAQISRDGKMRNGRAPKKSLGSSGTAKFLEGAVKAGFT YSINGNHSLILNAGYENRAPLAYNSFIAPRIKNDFAHGLRTEKIYNGELTYRFNTPIVSG RVTGYYTRFNDQVEMDAFYNDNEARFTYLSMSGIDKENWGVEVAATFKLMSNLSLTAIGT WSDARYMNNPTAVRTYESESESNIDRVYCKGLRDNGTPLSVYSLGLDYSVKGWFFNVTGN YYHRVYLDFSTYRRLGSVLGKYSDGAVDANGNPVGYNVPEQEELNGGFMLDASIGKYIRL KNGKSLSINLSVNNILNNTNLRTGGYEQNRDDSYDDGEARTYVFSKNSKYYYAPACNAFL NIGYRF >gi|226332190|gb|ACIC01000130.1| GENE 56 74695 - 75528 753 277 
aa, chain + ## HITS:1 COG:no KEGG:BT_3561 NR:ns ## KEGG: BT_3561 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 277 1 277 277 532 100.0 1e-150 MKKITFILSAALLLTLGVSSCMDDFDTPVTGNAYGNNTIKEQRTISIANLKEKYSSVISA NQFQTVTEETRISGVVVGDDESGNIYKQLIVADETGAIVVGINSTGIYANCPVGQKVVID CENLNVGGYGMQAQIGTTYKGAIGRMDLAVWLDHVRVINKPQLWYDELIPMELTGAQLKA YDKDLAPVLVMFKDVTIKEADGTATFAPEDLKDGGNGVNRTLVLDDNSTLTFRTSTYANF STEVMPTGKINVIGILSRYNSTWQIVARTYSDIQRNN >gi|226332190|gb|ACIC01000130.1| GENE 57 75550 - 77598 1135 682 aa, chain + ## HITS:1 COG:BH1015_2 KEGG:ns NR:ns ## COG: BH1015_2 COG4085 # Protein_GI_number: 15613578 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein, contains TRAM domain # Organism: Bacillus halodurans # 52 174 8 125 430 82 42.0 2e-15 MKKILNALFLTIITLVTFSCSDVPAPYDINGGGNGEGPALTGDGTKENPYDIASAMAKQD NSEAWVMGYIVGNITDKDIKTESVFAPPFTNPANILIAADADETDYKKCIPVQLVGGTDV RTALNLKDNEGNLGKAVVIKGQLTKYFGVAGLKNPTAAVLDGKDIGDGGGTDPEPPTGTV LFEETFAASQGAFTINNVQLPEGSTFVWQWSAYNESGYMKASAFVNNQNKAAESWLISPS VDLTKSSDATLVFDHAYKFAADKTKDLTLWVTETGKDAWAQIAIPNYSDGASWTFVSSGN ISLKDYVGKNIQFAFKYVSTTEAAGMWEVKNVKVVGDGDVPNPPVEGDKIFTETLGALVS ATTAFADFQNWDNKDLTFSATGKVDIRAIAHRTEENKTEKDNKVNNIWFPANGDTEFSIS KINAAGYKKFVLLYEAASNVFDAGTSIDLNVLKVAFNGTELTVPSKIVSKDNNDANVFYE MQVDINIAGTANSTLKFFAAGTDNTKGLRLYNIRLLGVEKGGTDPEPSTNLLVNPSFETW TETNPDNWGRTSVTNVTYEKSAIAKTGTNAVLLKGNGKNARFGSSDITLAAGTYTLSVFS KRISAAETELRMGYVPIKDDGTPNSSSYKYLDPADKLTDDWEQYTHEFTLDAETKVSIFF ASSKSNVAERDFLIDDISLTKK >gi|226332190|gb|ACIC01000130.1| GENE 58 77658 - 78530 465 290 aa, chain + ## HITS:1 COG:BB0411 KEGG:ns NR:ns ## COG: BB0411 COG1864 # Protein_GI_number: 15594756 # Func_class: F Nucleotide transport and metabolism # Function: DNA/RNA endonuclease G, NUC1 # Organism: Borrelia burgdorferi # 111 270 23 175 195 89 34.0 9e-18 MNRFNYLIIFLLGIHSLTFISCDSDEAENRLLEASGPIEFPALRNGADDIFLSPTTTFSG EKVTTYSMEYDKSKKHARWVAFKYYNVTGQTNWNRNDWKKTEWGGDPWQSDPNIPQADQR 
VQSDFGKQGYDRGHICASSDRLYSKDANEQTFYYSNMSPQKNYFNTGVWSDLEGKVRTWG RSSTFRDTLYVVKGGTIDKENQIWTYIGGDKSKPVPKYYFMALLCKKGETYKAIGFWLDQ STTAKPALSECAKTIDELEELTGLDFFHNLPDNLENAVESKYAISAWTGL >gi|226332190|gb|ACIC01000130.1| GENE 59 78705 - 79145 352 146 aa, chain - ## HITS:1 COG:no KEGG:BT_3564 NR:ns ## KEGG: BT_3564 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 146 1 146 146 247 98.0 1e-64 MPKWISIDEAAHKYGVKEEDICLWAEMEAITAYFTETTLIIDEKSLQRFMYLRKNLPTTG YIRTLEQLCINQSEVCKLYMEVIELQEKDLQYKKRRISVLERQYAMATEQNKLREKIITI TSDMLSKAESGWWEKLWMKISNRQKL >gi|226332190|gb|ACIC01000130.1| GENE 60 80674 - 80841 93 55 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFGIVLVFSLLDEVSFNFLQIRGKSRWFLYGYEIFGVKNRKRRDKKDYTKRETDF >gi|226332190|gb|ACIC01000130.1| GENE 61 80840 - 83398 1814 852 aa, chain + ## HITS:1 COG:CC0815 KEGG:ns NR:ns ## COG: CC0815 COG1629 # Protein_GI_number: 16125068 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Caulobacter vibrioides # 115 818 45 720 737 149 22.0 2e-35 MKHRLLLLLLFVALVMTGWAQNSATPSYTVKGVLLDSLTQDGEPYATIKITKKGAPDKAV KMAVTGANGRFQEKLNVGAGDYVITISSIGKAPVTKEFTLKPSVRVIDLGTMLSAEANNE LKGVEVVAQKPLVKVDVDKIEYNIEDDPDSKSNSILEMLRKVPLVTVDGEDNIQVNGSSS FKIHVNGKPNNMMSNNPKEVLKSMPANSIKYIEVITSPGAKYDAEGVGGILNIVTVGGGF EGYTATFRANASNYGAGAGGYAMVKQGKMTVSANYNFNYNDRPRGYSDSYRENYESDTEK YLESNSSSKSKGNFQYGNLEGSYEIDTLRLLTVAFGMYGSNSKNYGDGFTTMYGANHEDI AYSYRTGNHGDGSWYSINGNIDYQRTSRKNKQRMITFSYKINTQPQTSNSNTGYEDIHAK EEVDDLVKRLLLKNSQSDGKTNTMEQTFQVDYTTPIGELHTVEAGAKYIFRCNTSDNKLY EAAGGSDDYLYNEDRSSDYRHLNHILAAYLGYTLKYKDFTFKPGVRYEQTVQRVKYIVGP GENFHTNFSDLVPSVSLGMKLGKTQNMRVGYNMRIWRPGIWNLNPYFNNLNPMFITQGNS NLKSEKSHAFDLSYSNFSAKFNINVSLRHSFNNNSIENISRLITAPEGEMFDNDPTHIAP EGALYSTYANIGKSRNTGMSLYLNWNASPKTRLYVNGRGNYSDLKSEAQGLHNYGWNGSF YGGVQHTLPLKIRLSLNGGGSTPYINLQGKGSGYYYYSLGASRSFLKDERLSLNVYCSNI FEKYRSYNNHTEGVNFLSKSSSKWPSRSFGVSISYRIGELKASVKKAARSINNDDVKGGG GEGGNTGGGGGQ 
>gi|226332190|gb|ACIC01000130.1| GENE 62 83426 - 84790 1324 454 aa, chain - ## HITS:1 COG:AGl3503 KEGG:ns NR:ns ## COG: AGl3503 COG5368 # Protein_GI_number: 15891871 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 33 454 2 414 425 320 39.0 4e-87 MKKLSLYISLLCLILTAGACKNKTGGAATASSELTDDALMDTVQRRTFLYFWEGAEPNSG LAPERYHVDGVYPQNDANVVTSGGSGFGIMAILAGIDRGYVTREEGLARMERIVSFLEKA DRFHGAYPHWWYGDTGKVKPFGQKDNGGDLVETAFLMQGLLAVHQYYINGNEKEKALAAR IDQIWKDVDWNWYRNGDQNVLYWHWSPTYGWEMDFPIHGYNECMIMYILAAASPTHGVPA AVYHDGWAQNGAIVSPHKVEGIELHLRYQGTEAGPLFWAQYSFLGLDPVGLKDEYCPSYF HEMRNLTLVNRAYCIRNPKHYKGFGPDCWGLTASYSVDGYAAHSPNEQDDKGVISPTAAL SSIVYTPEYSMQVMRHLYNMGDKVFGPFGFYDAFSETDNWYPQRYLAIDQGPIAVMIENY RTGLLWKLFMSHPDVQAGLTKLGFNTNKQDVRQK >gi|226332190|gb|ACIC01000130.1| GENE 63 84913 - 87228 2560 771 aa, chain - ## HITS:1 COG:PA1726 KEGG:ns NR:ns ## COG: PA1726 COG1472 # Protein_GI_number: 15596923 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Pseudomonas aeruginosa # 13 770 19 763 764 679 48.0 0 MINKKIFFSLLLLAAGFLSAAAQKSPQDMDRFIDALMKKMTVEEKIGQLNLPVTGEITTG QAKSSDIAAKIKRGEVGGLFNLKGVEKIRDVQKQAVEQSRLGIPLLFGMDVIHGYETMFP IPLGLSCTWDMTAIEESARIAAIEASADGISWTFSPMVDISRDPRWGRVSEGSGEDPFLG AMIAEAMVLGYQGKNMQRNDEIMACVKHFALYGAGEGGRDYNTVDMSRQRMFNEYMLPYE AAVEAGVGSVMASFNEVDGVPATANKWLMTDVLRGQWGFNGFVVTDYTGISEMIDHGIGD LQTVSARAINAGVDMDMVSEGFVSTLKKSVQEGKVSMETLNTACRRILEAKYKLGLFDNP YKYCDLKRPARDIFTKAHRDAARRIAAESFVLLKNDNVTLRPGTPAEPLLPFNPKGNIAV IGPLADSRTNMPGTWSVAAVLDRCPSLVEGLKEMTAGKANILYAKGSNLISDASYEERAT MFGRSLNRDNRTDEQLLNEALTVANQSDIIIAALGESSEMSGESSSRTDLNIPDVQQNLL KELLKTGKPVVLVLFTGRPLTLTWEQEHVPAILNVWFGGSEAAYAIGDALFGYVNPGGKL TMSFPKNVGQIPLYYAHKNTGRPLAQGKWFEKFRSNYLDVDNEPLYPFGYGLSYTTFSYG DIDLSRSTIDMTGELTAAVMVTNTGTWPGSEVVQLYIRDLVGSTTRPVKELKGFQKIFLE PGQSEIVKFKIAPEMLRYYNYDLQLVAEPGEFEVMIGTNSRDVKSARFTLK >gi|226332190|gb|ACIC01000130.1| GENE 64 87259 - 88788 1592 509 aa, 
chain - ## HITS:1 COG:no KEGG:BT_3568 NR:ns ## KEGG: BT_3568 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 509 1 509 509 1026 99.0 0 MKLNKYLFSTFIAASTLLISSCSDFLDRSPQGQFTEDDNPNALVNGKIYNVYTMMRSFDI TAGTPAIAIHYLRSEDSEKGSIPSDGSDVAEMYDDFLYTPTNGLLGSYWGENYAIIYQCN DILDAIAEKETAGQTETEDIINKGEASFFRAYCYFNLVRAFGEVPLVTYKINDASEANIP KTTADKIYEQIDTDLKTAEESLPETWSTEYTGRLTWGAARSLHARTYMMRSDWDNMYKAS TDVINKGLYNLKTPYNEIFTDDGENSGGSIFELQCTATAALPQSDVIGSQFCEVQGVRGA GQWDLGWGWHMATQLMADAYETGDPRKNATLLYFRKTDDEPITPENTNEPYGESPVSPAM GAYFNKKAYTDPALRKEYTNKGFWVNIRLIRYSDVVLMAAEAANEKNIPGEAVDYLEMVR ARARGTNTNILPKITTNDQGELREAIRHERRVELGLEPDRFYDLVRWGIASEVLHAAGKV NYQDKNALLPLPQSEIDKSKGVLVQNPDY >gi|226332190|gb|ACIC01000130.1| GENE 65 88892 - 91957 3059 1021 aa, chain - ## HITS:1 COG:no KEGG:BT_3569 NR:ns ## KEGG: BT_3569 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1021 1 1021 1021 2013 99.0 0 MEKFNAMRTRLLQDLQKKTIRFKSIVALTCLLLTSVSAFAQTKTVTGTVTDAANEPLIGA SVLVQGTSTGTITDMDGKYSISVTPEDVLVFSYVGMTTQSVKVGAQNVINVTLKEDSQVL AETVVIGYGSAKKRDLTGSITNVKGEEIANKPATNPLSSLQGKVAGVQIVNSGRAGADPE IRIRGTNSINGYKPLYVVDGLFNDNINFLNPEDIESMEILKDPSSLAIFGVRGANGVIII TTKKAKEGQTLVNINTSFGFKKVVDKVDLVNGPQFQELYSEQLANQKDTPFDFSGWNANT NWQDEIFQTAFITNNNISITGASPKHSFYLGVGYSHEQGNIKHEKFSKVTINASNDYKIT DFLKVGFQFNGARTLPADSKQVLGALRAAPIAPVYNKEYGLYSVLPEFQKAQINNPMVDV DLKANTTKAENYRASGNIYGEVDFLKHFNFKAMFSMDYASNNGRTYLPVMKVYDDTAAGD VVTLGTGKTEVSQFKENETKVQSDYLLTYTNSFDHGNHNLTATAGFTTYYNSLSRLDGAR KQGVGLVIPDDPDKWFVSIGDAATATNGSTQWERTTVSMLARVIYNYKGKYLFNGSFRRD GSSAFSYTGNEWQNFFSLGGGWLMTEEEFMKDIKWLDMLKIKASYGTLGNQNLDRAYPAE PLLSNAYSAVFGKPSIIYPGYQLSYLPNPNLRWEKVEAWEAGFETNVLRNRLHFEGVYYK KRTKDLLAEVPGISGTVPGIGNLGEIENMGVEMAASWRDQIGDWGYSVSANLTTIKNKVK SLVQDGYSIIAGDKQQSYTMAGYPIGFFYGYKVAGVYQSQADIDASPENTLATVTPGDLK FADVNRDGKITPEDRTMIGNPTPDFTYGLSLGVNYKNWSLGIDMMGQHGNEIFRTWDNYN FAQFNYLSQRMDRWHGEGTSNSQPLLNSKHSINNLNSEYYIEDGSFFRIRNVQLAYSFDK 
ALLAKIRLQALKVYVNIQNLKTWKHNTGYTPELGGSATTFGVDDGSYPVPAVYTFGINLT F >gi|226332190|gb|ACIC01000130.1| GENE 66 92156 - 93781 550 541 aa, chain + ## HITS:1 COG:no KEGG:BT_3570 NR:ns ## KEGG: BT_3570 # Name: not_defined # Def: TPR repeat-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 541 26 566 566 1032 99.0 0 MKLLLSYIIGLFLILSLSACHDGGNATTLLHQADSLMQEFPDSALSLLESISHPEKLSGS ERADYAIFLTRARTKLYVHESSDSLIRFAVDYYKRSWNNERKMQAYYYRGCVYRDMRCMD LAVKDFLQALKVIPKESEYLYLGAIYENLAGCYAEQNLYKDAMHAHHKAHEIYIKQKKDD GLFYAVRGIGYVFMLQHQLDSALVYYQKALDVAENIGEDYYKSIILSELGILHNEKGEFQ KANQYLSASISSAPTGTNLFSEYLRKGCILRNIHQMDSARYYLNLSKSSPYIFNRGGSYG ELYKLEKEEKNYPAAIAAADSFIYYLDSIYDTTKAAEITRLADQYEIEFYQQKIAGRYKI EVLSLLLLFIIGGAISLWIDKRRKKKYLELQNQLMKSRTDILSGDFEDQGNAESDFMRML EPSLELCLQLFRRTETHEKLLSLEKKMGVATSLSIHEGQMICESIYETFGDIMLKLKIQY ADLTKEDLLHCVFLLLGCSKETILLCTRASEGAFKSRKSRMKIKLGEEFFEWMTIRQYLV S >gi|226332190|gb|ACIC01000130.1| GENE 67 93933 - 94289 348 118 aa, chain + ## HITS:1 COG:no KEGG:BT_3571 NR:ns ## KEGG: BT_3571 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 118 1 118 118 216 100.0 2e-55 MKAKLLLLLCCLFAIEGWANSNDIHLLKDSRSNPMGIPIQPTYEKCAILSNILNVSFGRA KDYAIITVTNKATGEIVHSKTYHNTSIVMIDMSSCEKGEYTIHIILNDCLLEGTFTVQ >gi|226332190|gb|ACIC01000130.1| GENE 68 94484 - 95248 332 254 aa, chain + ## HITS:1 COG:no KEGG:BT_3572 NR:ns ## KEGG: BT_3572 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 254 1 254 254 482 100.0 1e-135 MKAIFFMFLLCCICCLSCSDEQNTLSEEQNVIENKTFEKVSSLELKQQLLDYLNNSVKTR TTFNNAEMNYDLGMQEILRPQIQTFGEEMIVVRSKMDANNIMAFYKENGVIENCLIIECA QNAEDALEETMFTCYDSNNIPIFKAMVNKRNNTCNVLEIYDGLDDLTARGGNWGCNVSLG LAGALWSTAFGMVTAGAGFVVAVGWCVLQTWLCSSRVVSRPPTEIQIDTIEFKPMERDDD SIARDTIGKLLPLR >gi|226332190|gb|ACIC01000130.1| GENE 69 95272 - 95619 129 115 aa, chain + ## HITS:1 COG:no KEGG:BT_3573 NR:ns ## KEGG: BT_3573 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron 
# Pathway: not_defined # 1 115 1 115 115 183 100.0 1e-45 MRNNIKLALFYILILGLFSILLCTENSIWFSASIAMICTSTGVCIGQLIKKRKRFHASTD KLWVLSGLFLLSIIISLCIYFFFDNNPNKVTYILLMNLVLALGSILYIYITREKT >gi|226332190|gb|ACIC01000130.1| GENE 70 95686 - 96333 386 215 aa, chain + ## HITS:1 COG:no KEGG:BT_3574 NR:ns ## KEGG: BT_3574 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 215 1 215 215 426 100.0 1e-118 MRKLILFFLFVFSVTVGFAQSFSGDKLFISVDGGKGVLFGKSNLSPFGVNYRGEYNGGLT CNVKTLYRIDKFWVAGLKFNLSGTSANYTLDDETNVADNVELWYLGPQLGFKIPITERTL ISCVLGAGYLHYRDEGRSNSEFKCTSGALAGNMDFLVEYKLTDHLAVNGGFSVLSGDFKK IEMTADEKKETLRPNKLDRLYMRRLDFQLGLVFCY >gi|226332190|gb|ACIC01000130.1| GENE 71 96458 - 96832 191 124 aa, chain + ## HITS:1 COG:no KEGG:BT_3575 NR:ns ## KEGG: BT_3575 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 124 1 124 124 205 100.0 3e-52 MKFMKKNRPDVMIIILLILWGIYEIFVVPFEYDTLVSIFVGFNLVYLFPWCCKETSAEYL KIRNAALQYAFTLIIAVLCALKIVGVLCERMFDVDGLKMIFIGVFFQSVYFLIVKFKRSK IHER >gi|226332190|gb|ACIC01000130.1| GENE 72 96822 - 96989 102 55 aa, chain + ## HITS:1 COG:no KEGG:BT_3573 NR:ns ## KEGG: BT_3573 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 54 1 54 115 67 68.0 2e-10 MKGNVKLALFFVLLLGLFTILLCTESSRWFSASIAMIFTMSGVFLGQLIKNRRKK >gi|226332190|gb|ACIC01000130.1| GENE 73 97073 - 97963 843 296 aa, chain - ## HITS:1 COG:TM0415 KEGG:ns NR:ns ## COG: TM0415 COG0524 # Protein_GI_number: 15643181 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Thermotoga maritima # 9 285 5 277 286 125 30.0 7e-29 MKKHQLCCIGHITLDKVVTPQSTVYMPGGTAFYCSHAIRHFNDIDYALVTAVGATEMKVV DQLRGIGISVTALPSQHSVYFENIYGENPDDRTQRVLAKADPFTASQLQEIEAEIYHLGS LLADDFSLEVIKELSWKGLIAVDSQGYLREVRDTHVYPVDWTDKREALQYIHFLKVNEHE MEVLTGLSDPHEAARKLHEWGVKEVLVTLGSMGSLIYDGTDFYRIPAYKPGQVVDATGCG DTYTIGYLYQRVSGASIEEAGRFAAAMSTLKIEKSGPFCGNKEDVIRCMETAEQMY 
>gi|226332190|gb|ACIC01000130.1| GENE 74 98182 - 99204 820 340 aa, chain - ## HITS:1 COG:CC2028 KEGG:ns NR:ns ## COG: CC2028 COG0793 # Protein_GI_number: 16126271 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Caulobacter vibrioides # 15 321 110 434 462 69 24.0 7e-12 MRNRIIQLLLLLCCLPIQTGCIGEDDYADDPVGNFEQLWKIIDERYCFLDSKGIDWDAVH EKYSKLIVPGTSNDDLFDKLSEMLYILKDGHVNLSSAKRVSYYDAWYQGYPWNYREDILY QYYLGSASKDYYTSAGMKYKIFDNNIGYIRYESFSSGVGDGNLDEILVYLATCNGLIIDV RDNGGGNLTNSSRIAARFTNSKILTGFIQHKTGTGHSDFSQPEPIYLEPSNSIRWQKKVV ILTNRRCYSATNDFVNLMRSIDNGKIIQVGDQTGGGSGLPFTSELPNGWSIRFSASPHFD KNMQPLEEGILPDIAIDMTEDDQSKHRDTLIEKAFEILSE >gi|226332190|gb|ACIC01000130.1| GENE 75 99253 - 100113 719 286 aa, chain - ## HITS:1 COG:no KEGG:BT_3578 NR:ns ## KEGG: BT_3578 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 286 1 286 286 529 100.0 1e-149 MKKNLLYLGLTGYLLFALCTRLQAQTDSLQAHRYVTRATMYGVGFTNVFDTYLSPQEYKG IDFRISRETIRMTKLFDGNVSVQNFFQADVGYTHNRADNNNTFSGLVNWNYGLHYQFRLT ENFKLLAGGLFDVNGGFVYNLRNSNNPASARAYVNLDASGMAIWHLKIKRYPMVLRYQVN VPMIGLMFSPHYGQSYYEIFSLGNSSGVIQFTSLHNQPSLRQMLSVDLPIGYTKMRFSYL ADLQQSNVNGIKTHTYSHVFMVGFVKDLYRVSNKKGANLPPSVRAY >gi|226332190|gb|ACIC01000130.1| GENE 76 100152 - 102797 2949 881 aa, chain - ## HITS:1 COG:BB0035 KEGG:ns NR:ns ## COG: BB0035 COG0188 # Protein_GI_number: 15594381 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Borrelia burgdorferi # 37 615 11 583 626 439 41.0 1e-122 MSDEINEITEGHSDYKPADARDENVKHQLTGMYQNWFLDYASYVILERAVPHINDGLKPV QRRILHSMKRLDDGRYNKVANIVGHTMQFHPHGDASIGDALVQLGQKDLLIDCQGNWGNI LTGDGAAAPRYIEARLSKFALDIVFNPKTTEWKLSYDGRNKEPVTLPVKFPLLLAQGVEG IAVGLSSKILPHNFNELCDASISYLRGESFQLYPDFQTGGSIDVAKYNDGERGGAVKVRA KINKLDNKTLAITEIPYGKTTSTVIDSILKAVDKGKIKIRKVDDNTAANVEILVHLAPGT SSDKTIDALYAFTDCEVSISPNCCVIDDSKPHFLTVSKVLKKSADNTLGLLKQELEIKKG 
EILESLHFASLEKIFIEERIYKDKEFEQSKDMDAACAHIDDRLTPFYPSFIREVTKEDIL KLMEIKMGRILKFNTDKADELIARMKEEIAEIDDHLAHIVDYTVNWYQMLKNKYGKNFPR RTELRNFDTIEAAKVVEANEKLYINRDEGFIGTALKKDEFVANCSDIDDVIVFFRDGKYI VTPVADKKFVGKNILYVNVFKKNDKRTIYNITYRDGKEGTTYIKRFAVTGVVRDREYDVT QGTPDSRITYFSANPNGEAEIIKVTLKPNPRVRRIIFERDFSEISIKGRQAQGVILTRLP VHKITLKQKGGSTLGGRKVWFDRDILRLNYDGRGEYLGEFQSDDTILVVLNNGEFYTTNF DLSNHYEDNVSIVEKFDPNKIWTAALYDADQQNYPYLKRFCFEGSNRKQNYLGDNKNTRL ILLTDEYYPRLEVIFGGHDSFREAMVIDAEEFIAVKGFKAKGKRITTYTVDTINELEPTR FPEPTNEQPEQPEEEPENLDPDSDKSEGDIIDEITGQMKLF >gi|226332190|gb|ACIC01000130.1| GENE 77 103674 - 105017 1079 447 aa, chain - ## HITS:1 COG:MK0248 KEGG:ns NR:ns ## COG: MK0248 COG0673 # Protein_GI_number: 20093688 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Methanopyrus kandleri AV19 # 42 189 3 144 317 78 31.0 3e-14 MTTRRDFIKKTVAGTAALSLGSIIPGFGSNSYQEILGANEKIRIGVIGVNSRGNALAQGF AKMKGCEVTYLCDVDSRALERCQADIHKISGRTPKGEKDIRKMLESDDFEAVVIATPDHW HAKAAIMAMQAGKHVYLEKPTSHNPAENEMLVRATLKYNRIVQVGNQRRSWPNVIKAIEE IKSGAIGKVRYAKSWYVNNRPSIGTGKVVPVPDYLDWDLWQGPAPRVPNFKDNYIHYNWH WFWNWGTGEALNNGTHFVDILRWGLGVDYPTKVDSVGGRYRFQDDWQTPDTQLITFQFGD EASCSWEGRSCNSTPVDGYGVGTAFYGETGTLFISGGNEYKITDLQGKVIKDVKSNLKFE TGNLLNPSEKLDAYHFENWFDAIRKGGKLNSGIVDACISTQLVQLGNIAQRVGHSLDIDP GSGRILNDLEANKLWGREYEKGWEIRV >gi|226332190|gb|ACIC01000130.1| GENE 78 105105 - 105344 155 79 aa, chain - ## HITS:1 COG:no KEGG:BT_3582 NR:ns ## KEGG: BT_3582 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 79 1 79 79 105 98.0 5e-22 MKDIIITSQKIKRERNIYLICFLLSFAINIIAILVYTRPWIEIFTQLGYVLVISFFIYFI LWIPRILIIGLRHLLRKKK >gi|226332190|gb|ACIC01000130.1| GENE 79 105348 - 106673 1043 441 aa, chain - ## HITS:1 COG:BH1248 KEGG:ns NR:ns ## COG: BH1248 COG0673 # Protein_GI_number: 15613811 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Bacillus halodurans # 53 198 5 150 340 107 
40.0 5e-23 MRNKIEKGVDLNLRQFIKSLGYIAGGTALLASTPWLSSCTPEKLKEIRNEKARIALIGTG SRGQYHIHNLKEIPHARIVAVCDNYAPNLQQAVELCPGAKAYTDYRKLLESADIDGVIIS TPLNWHAPIVLDALAAGKHVFCEKAMARTLDECKAIYDAYQTTNQVLYFCMQRMYDEKYI KGMQMIHSGLIGDVVGMRCHWFRNADWRRPVPSPELEHQINWRLYKESSGGLMTELACHQ LEVCNWAAKRMPVSIIGMGDIVYWKDGREVYDSVNVTYRYSDGTKIAYESLISNKFNGME DQILGHKGTMEMAKGIYYLEEDHSTSGIRQLIGQVKDKVFAAIPTAGPSWRPETKMEYTP HFVIEGDINVNNGLSMIGADKDGSDIILSSFCQSCITGEKAQNVVEEAYCSTVLCLLGNQ AMEEERHILFPDEYKIPYMKF >gi|226332190|gb|ACIC01000130.1| GENE 80 106677 - 107447 334 256 aa, chain - ## HITS:1 COG:CAC2766 KEGG:ns NR:ns ## COG: CAC2766 COG1477 # Protein_GI_number: 15896021 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Clostridium acetobutylicum # 33 255 33 293 319 82 23.0 8e-16 MFHGFIPHVMGTRLDILMIHSNLPLLNMLWAHITDELERLDKMLNRFDATSEVSKLNSHT QQDSVSVSAELEDILRSCQYYYEKTLHLFDITLNDFSQIQIHGNHHISFSNFSVTLDFGG FAKGYALKKIQEILLRGNIENAFVDFGNSSIMGIGHHPYGDCWKVSLQNPYTQQTLDEFC LTDNTLSTSGNTEQYTGHIINPLTGIYNEQKKVTSILSDNPLDAEILSTVWMIADDQQRE QINENFKHIKGTIYTL >gi|226332190|gb|ACIC01000130.1| GENE 81 107503 - 108909 974 468 aa, chain - ## HITS:1 COG:no KEGG:BT_3585 NR:ns ## KEGG: BT_3585 # Name: not_defined # Def: putative oxidoreductase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 468 1 468 468 928 99.0 0 MRYTYRHIGILTISLIVASCSFSKKQANNNHDKNMNPNVKIVVLDPGHFHASLLQKNPLA SVNDTIRVYAPEGAEVKQYLNDINSYNQRAENPTSWKEEIYIGGDYLSRMLSDRQGDVVV LAGNNQKKTNYILEAIKAGYNVLSDKPLAINKKDFDLLIQAYQLAKERKLLLYDLMTERY DILNIIEKALLNNPDLFGELQKGSLNDPSVSMESVHYFFKNVSGKPLIRPVWYYDTEQQG EGIADVTTHLIDLVNWQCFPNETIRYQSDVEVLKARHWPTRITLPEFSQSTQADTFPAFL NKYINNNVLEVLANGSLNYTVKGIHIGMKVIWNYTPPSDGGDTFTSLKKGSKATLKTIQD KESGFVKQLYIQRAADSDHSEFESQLQKAIKQLQATYPFLSVKKMNEGLYLIDIPQADRL GHEAHFSKVAEAFLGYLHDKNMPEWENENTISKYYITTTAVELAKKEK >gi|226332190|gb|ACIC01000130.1| GENE 82 108919 - 109923 835 334 aa, chain - ## HITS:1 COG:no KEGG:BT_3586 NR:ns ## KEGG: BT_3586 # Name: not_defined # Def: putative 
dehydrogenase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 334 1 334 334 662 99.0 0 MKNILFILLAFLSVSFRLYAQDVIKIGIIGLDTSHSTAFTELLNGDSDDKFVKEFEVVAA YPYGSKTIQSSYERIPGYIEEVKKHGVEITSSIAELLDKVDCVMLETNDGRIHLEQAMEV FKSGKICYIDKPIGATLGQAIAIYEMAEKYNVPIFSSSALRYSPQNQKLRNGEFGKILGA DCYSPHKVEPTHPDFGFYGIHGVETLYTLMGTGCESVNRMSSQDADVVVGRWKDGRIGTF RGIKEGPAIYGGTAYTPKGSIAAGGYEGYKVLLDQILTFFKTGVAPISKEETIEIFTFMK ASNMSKEQNGKIVTMEEAYRKGLKDAQKLIKTYK >gi|226332190|gb|ACIC01000130.1| GENE 83 109930 - 110715 510 261 aa, chain - ## HITS:1 COG:all0727 KEGG:ns NR:ns ## COG: all0727 COG0363 # Protein_GI_number: 17228222 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Nostoc sp. PCC 7120 # 5 256 2 253 258 212 42.0 5e-55 MLSTINETYLFQQDLLTVKMFPSIQQMGKCAATEVTNKICELLKEKAEINMIFAAAPSQN EFLSHLIHSKQIDWSRINAFHMDEYIGIHPEAPQSFGNFLRQRIFDKVPFKTVNYLNGQA ENLEEECKRYSELLLRHPVDIVCLGIGENGHIAFNDPDVANFNDSHLVKVVELDPICRQQ QVNEKCFEAFDLVPAKALTLTIPALLKADWMFCIVPFKNKANAVYNTLYGEISEKCPASI LRKKENSCLYLDPESAERINL >gi|226332190|gb|ACIC01000130.1| GENE 84 110696 - 111883 851 395 aa, chain - ## HITS:1 COG:SA0656 KEGG:ns NR:ns ## COG: SA0656 COG1820 # Protein_GI_number: 15926378 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Staphylococcus aureus N315 # 1 391 1 389 393 226 36.0 6e-59 MERLIIINGELILPTGIETNKIMVCRNGKIEQIVSSEAYIPQADDRIIDANQQYVSPGFI DIHVHGGGGHDFMDGTVEAFLGVAETHARYGTTAMVPTTLTSTNEELMTTFAVYQKAKSL NKKGAQFIGLHLEGPYFSPKQCGAQDPNHLKTPHPDEYNTILEASQDIVRWSIAPELAGA IELGEKLNSCHILPSIAHTDAIYEEVVKAYEAGYTHITHLYSAMSTITRRNAYRYAGVVE AAYLIDGMTVEIIADGIHLPKPLLQFVYKFKGADKTALCTDAMRGAGMPDGESILGSLTN GQKVIIEDGVAKLPDRSAFAGSVATADRLVRTMISIAGIPLIDAIRMITLTPARILHVDS QKGSLEEGKDADIVIFDNQINVTTTISKGHVIYNQ >gi|226332190|gb|ACIC01000130.1| GENE 85 111915 - 113231 1208 438 aa, chain - ## HITS:1 COG:XF1462 KEGG:ns NR:ns ## COG: XF1462 COG0738 # Protein_GI_number: 15838063 # Func_class: 
G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Xylella fastidiosa 9a5c # 59 434 3 370 377 247 38.0 3e-65 MGNNQPSARKKTYYISLAILAGMFFIFGFVSWVNSILIPYFRISCELTHFESYFVAFAFY IAYFVMAIPSGILLKKVGFKKGIMYGFMLTALGAFIFVPAALARQFEIFLIGLFSIGTGL AILQTAANPYVTIIGPIDSAARRISIMGICNKVAGIISPLIFAALILKANDSELFALIES GALDEATKNAMLNELIQRVIIPYIILGIILLLTGIGIRYSVLPEINTDEQNATDEQDNKH TDKKSILDFPYLILGALAIFFHVGTQVIAIDTIINYANSMGMDLLEAKVFPSYTLGCTMI GYILGIILIPKYISQKNALIGCTLLGLALSFGVVWADFDMTLFGHQANASIFFLNALGFP NALIYAGIWPLSIHGLGKFTKTGSSLLIMGLCGNAILPLVYGHFADQYSLRIGYWVLIPC FIYLVFFAIKGHKINSWR >gi|226332190|gb|ACIC01000130.1| GENE 86 113375 - 115573 1409 732 aa, chain - ## HITS:1 COG:no KEGG:BT_3590 NR:ns ## KEGG: BT_3590 # Name: not_defined # Def: alpha-N-acetylglucosaminidase precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 732 1 732 732 1463 98.0 0 MNHKYLYLLLLFLIVSCQRNVETGPPVLKEMCQRLFPRHAQSFLFELLTDSIDTDRFILE SSQGKIRIKGNNRNSLAAGLNHYLKNYCHTHVSWYASETVEMPDVLPEIPQPVYIRSKCD NRFFLNYCTFGYTMPYWKWQDWERLIDWMALNGVTMPLAITGQESIWYKVWTDMGLSDEQ VRSYFTGPAHLPWHRMSNVDYWQSPLPQSWLKDQEELQKRILEREREFDMTPVLPAFAGH VPAELKTIYPNAKIYQMSQWGGFDEKYRSHFIDPMDSLYQVIQRRFLEEQTKVYGTDHIY GIDPFNEVDSPDWSEDFLANVSSKIYESIHQVDSAAQWLQMTWMFFYDKKKWTQPRIRSF LKAVPDDKLILLDYYCDHTEIWRNTEKYYGNPYIWCYLGNFGGNTMIAGNLNDIDFKIKR LFKEGGDNVYGLGATLEGFDVNPLMYEFVFDQAWDYPVTTDQWITNWSMCRGGDQDANII KAWRALHQNIYTEYAICGQSVLMNARPRLTGTKSWNTNPGIHYANNDLWQIWKELLKARN INNSDFRFDVINIGRQVLGNLFSEYRDQFTACYNRKDTTGMREWSTRMDNLLLDVDRLLS CDATLSIGKWLQDARDCGTTVSEKDYYEENARCILTVWGQQDTQLNDYANRGWGGLTRSF YRERWKRFTDGVIGAVSKNKPFDEDKFHQDITQFEYNWTLQKDSFPIVSEEDPIQIADSL ILKYDTDFTKAQ >gi|226332190|gb|ACIC01000130.1| GENE 87 115601 - 116782 855 393 aa, chain - ## HITS:1 COG:no KEGG:BT_3591 NR:ns ## KEGG: BT_3591 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 393 1 393 393 806 100.0 0 MKRILFILFLCCIGSGLAACQDEDHTVPTLPSGNDPEINDPVVEFYDWEKNRTELLTSTD MVLLYGGGHHRTPYTWDKERVSSYIRYVDTDNQSHWLFDSFLFLEIMDTGTGGANKMFAK 
GYNLESANQADWTKLIDYYFQSETGIGALDASVKEASAILGTPRQKRQIVISIPEPIVYQ HPEQASSSTKYWGKIDNQTLDFSNSADRIKACKWYIDQVRAKFNEKNYQHVELAGFYWLA EKATDTRDILNAVAIYLNKLKYSFNWIPYYGADGYNQWKSLGFNYAYFQPNYFFNESVPD SRLEDACQKALTYDMHMELEFDDNALNSRGKAYRLKNYMSAFKKHGIWEKKRLAYYQGSA SLLTLKKSGNAADTQLYHEFCNFVISRPIRSSH >gi|226332190|gb|ACIC01000130.1| GENE 88 116791 - 118773 555 660 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90021240|ref|YP_527067.1| ribosomal protein S32 [Saccharophagus degradans 2-40] # 278 660 33 407 408 218 34 2e-55 MQYIKQIMLLVILTISWKCQAEEVKEIWLDELGESSYYIQDWGLPRINKAVTMTPLTVKG IVYERGIGTHAISRMLFDIGKKAKTLSGLAGADDNTPFACNLQFKILGDRKELWRSGIMR KGDPAKPFNIDLSGIDKVLLLVEECGDGMMYDRADWLNVKFTTLGDVQPIPVWPKPIAKE KYILTPQSPDAPQINNPLTYGARPGNPFLMPIMVSGKRPMTYKAKGLPKGLKLNRKTGLI TGSTNTNGNFKVRLQATNEKGTDEKEITLKIGSEIMLTPPMGWNSWNCWRFAADDQKVRD AARIMHEKLQAYGWTYVNIDDGWEADERTPEGELPANEKFPDFKTLTDYIHSLGLKFGIY SSPGWTTCGRHIGSCQHELTDAKTWEKWGVDYLKYDYCGYAAIEKNSEEKTIQEPFIVMR NALDQIKRDIVYCVGYGAPNVWNWGAEAGGNLWRTTRDINDQWNIVMAIGCFQDVCAYVS APGKYNDPDMLVVGKLGPGWGAKSHDSDLTADEQYAHISLWSILSAPLLLGCDMTAIDDF TLGLLTNPEVIAVNQDPLVAPATKLTVPNGQIWYKKLYDGSYALGFFQMDPYFILWDQDK AVNIQQQKYNFNFALNQLGIQGKVKIRDLWRQKNLGIFSGSYETSIPYHGVSLIKITPIK >gi|226332190|gb|ACIC01000130.1| GENE 89 118780 - 119856 498 358 aa, chain - ## HITS:1 COG:no KEGG:BT_3593 NR:ns ## KEGG: BT_3593 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 358 1 358 358 712 100.0 0 MKRHIIILLITFLCLPEYLEAQLKYYDAAKFPLYGKATKNTENRYERLPDSLKHVSRKPV WNLSRNSAGLAIRFCSNSTTIATKWENLNNNLMNHMTPTGTKGLDLYTWVDGAWRFVNSG RPTGKVNQATIIANMEPVEREYMLYLPLYDGVISLSIGIDAEASISVPRTAIPVRTKPIV FYGTSILQGGCANRPGMAHTNIISRRLNREIVNLGFSGNALLDYEIAEVMSSVDAGVYVL DFVPNASSDQIYEKMETFYRILRDKHPRTPIIFIEDPVFTHALFDKKIAIEINKKNKAVN SVFKSLKTKGEKNIYIISSEKMLGEDGEATVDGIHFTDLGMMRYADLVCPVIKKVLKK >gi|226332190|gb|ACIC01000130.1| GENE 90 119948 - 121558 981 536 aa, chain - ## HITS:1 COG:no KEGG:BT_3594 NR:ns ## KEGG: BT_3594 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 536 1 536 536 1023 100.0 0 MKQIIFIAILLTCAFGACSSDNDGDEQSASVQAPSDITIERMGKTKVLLRWKDNSNNETG FSILLRKADTSENIEIAKVSANVTEYTIENGLEEGNIYYFGVRAFSATNTSRAIYELYRL VALGDEPSIAIIGSIKANSTCISSSYQVTNIAGQTNVKYGLCWSTENTPTINDQKQNGPE VAEDGKVFQVIPNTLLDYGKSYKVRAFLTTSTGTYYSAESTVSLETEPQAIQLTWNKLTK STLPAEIELYETTSNLNGSNFHAWYAIGDLSTGKVEVRVHIPSSPATIDTQSASFNGDCY LLVNGGYFYNGNHTGIAVINSIKSGSVSAVRGSLKTGDTEYNSMYNVTRGTFGVDASGKP NVVWTGTDASSNVFYFDRPLPSVKGENKYGIVTNENPTTAISWSPKYALSAGPVLLKDKK IPFDFTETSKGTDYYLSNYEIIPYDIFGANVTPDRTAIGYREDGKVVIFICDGRITASGG ATLTELAQIMKGLGCVGAINLDGGGSTGMVVGDEHLNDMTGGNRAVVSTIGFFKKN >gi|226332190|gb|ACIC01000130.1| GENE 91 121596 - 123611 947 671 aa, chain - ## HITS:1 COG:no KEGG:BT_3595 NR:ns ## KEGG: BT_3595 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 671 1 671 671 1402 99.0 0 MITGIISILCYLQCFGTLSASVTAKNENGNFVLKNKNVELVFANGKEFLFKEFRMDGMNI LPVNGSTTHPWQLIYRGPNGENPTLMPRWGEYKGGEIQKTQDASTLIFTWQMVIDAGPTC PVRILVTLGKDAELPEWRIEAEMPEGWVITESEFPRIAVNRPEGAKGILPVGFGTEYTIG NEGQLQSRYPSCTGTMQLVLMHHKGGTVYFAAQDKGGSGKVFRMKSEGKSLVFIQNVTTS YAWTQNKKFCIPWETVMGFTQKGWQDAAVQWYRPFTFETLWGAKTISERPIAEWIKNADM WLRPGEVNAETMEAVRKAMKYYGKGVGLHWYYWHNHRFDTKYPEYFPEKAGFKEMIKETQ ELGGFITPYINGRLWDPATDSYKTLNGKDASCRKADGTLYTEVYSSKVLNTVTCPASYIW RDVLKGVNKQILTELKTNGVYMDQIGCANSEPCYATNHGHAPGGGDWWPFAYRSLLTEMR TNLYKENQAMTTEENAECYIDLFDMMLVVNSPHNSYTKMVPLFPLIYSDRCIYSGYTYIP WRITDGSLNFMSMRSLLWGSQLGWVEPSLLMRPDAKREAKFLKNLTDFRRQQHDLFLGGR FIQEIIPTGDNPTQEIPNYEITPVVLAAEWASVSGEHVYLIVNMSEQEHKVTLPNKKQIT VKALDAIRISK >gi|226332190|gb|ACIC01000130.1| GENE 92 123647 - 125893 1404 748 aa, chain - ## HITS:1 COG:no KEGG:BT_3596 NR:ns ## KEGG: BT_3596 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 748 2 749 749 1550 99.0 0 MKRIYYTYPVLLVWLVSTLLPQFCIAMPNEQKENNDVVIENAEMRLIISNDGKARSLIHK ATGEECLITNADVPLCAITQYRPYDNENFLMFPAKPRTFPANKIERNGNELRIEFQDTYD 
IAIIELNITDYYIGFTLKQIDYRIEDFGVKRKTEIDEISLLQLPVRKRENFGEWLNVSWD EQTAICLLGTHPTTYIDAFANKEYTTMYAGLDFQVKLFNSGAALITTSKEKLLTCIDKVE RDYHMPLGVESRQRKEYQYSYYELRDVTTKNIDEHIVYAQKGGFKSIVVYYVDFAKACGH YEWRKEYPNGMKDLQEITNKIKAAGMIPGIHIHYSKVAVNDPYINNGIPDSRTNHVREFI LSEPLDDSSTIITIEGNPEGVRMEKGRRLLQIDNELVTYENYTTEPPYQFTGCVRGVFNS KAASHDKGQHFRLLDVDDWPLFIRVNQNTGIQKEIAERLGKIYHEAGFRFVYFDGAEDVP MPYWYNVSRSQMIVYNEMKPTPLFAEGALKSHYGWHILSRGNAFDIFPPERIRPAMKKYT LRCAEQIAKDFTSVNFGWVNYLAPNDKTIGMQPDMYEYICSKAVAWNSPISLVGNLKELQ NHPRTEDNLRVIKMWEEAKLQGVLTDKQKELLKNPEQEYLLMKDKKGNYQLYPYRQITKD DEKPIRAFIFQKAGRTCIIYWHMNGTGQLTLDIEKNKLSLMNESGKRIPIRSAGSKSILP AAGRLILETALPQEEVIKLFRKSIEIIK >gi|226332190|gb|ACIC01000130.1| GENE 93 125913 - 127337 486 474 aa, chain - ## HITS:1 COG:no KEGG:BT_3597 NR:ns ## KEGG: BT_3597 # Name: not_defined # Def: sialic acid-specific 9-O-acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 474 1 474 474 975 100.0 0 MKILIYFLIIVLWSINMEAKVKLPPLLSDNMVLQQKSNVRIWGKATPNSTVVVTPSWANK EVKTHADEKGCWELQITTPAASFEEYTLTLTDGEQSVTLQHLLIGEVWLCAGQSNMVMPL NGFDYCPISDSNNVIADAPNHPGIRMVTIKPTVKLSPQEYAEGSWQQPTTENAPKFSAAA YHYALTLQRTLQIPIGVITCAWGGSRVEGWLPKEILQTYKDEDLTLIGSDKTPVYLQSML MYNGILYPCHKYTIKGFIWYQGESNVRSSRTYAERLATMVKHWRSIWKQGDLPFYYVETV PFACEYKEDGYIGALLREAQQKALSLIPNSGMIGTNDLVETYEAPQVHPHNKKGIGERLA YMSLNKTYGYQGIESEYPSYKSMKINGETIEISFSHAEKGLSPWMNISGFEIAGSDKVFY PAEASLNQKDHTVIVKSPTVKSPVAVRYCFRNFLKGNLINTRNLPVRPFRTDKW >gi|226332190|gb|ACIC01000130.1| GENE 94 127353 - 128966 1139 537 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 3 494 1 507 757 288 34.0 3e-77 MKLLLTTLAIVYLSLFMSLQAQEQVRVIPYPAEVTMQAETCKLKSGQKIHYLDASLYKEA AFLEKKLADLNYKMKVIPGKKAAKGINLSLNNRLQNKEGYQLIIAPKQVLIKGGSPAGVF YGIQTLLQQLTNGDLRCGTIEDAPRYEWRGYMLDEARHFSGEKRVKQILDLMAYYKMNRF HWHLTDAQGWRIEIKQYPKLATIGGEGCHSDPDTPAQYYTQEQIRDIIAYAKERHIEIIP 
EIDMPGHATAANKAYPEYSGGGTEEHPEFTFNVGKEETYTYLTNILKEIAALFPSPYLHI GGDEVAYGIKAWETDPHVQALLKREGLQTVKEAERYFMHRMTDVVNSLGKTLVGWDELLD LNVKQDNTIIMWWRHDKPDYLRKSLTKGYSTIMCPRKPLYFDFVQYKDHKWGRIWDGFCP IEDVYAFPDKWFAEWGVSASDLSHVKGIQANTWTELMHTKDRVDFMIFPRLCALAESAWS APTVKDYDKFLSRMEDAFTLFDKLNIYYFDYRNPQHHPEPAGPVIKKKEKIQMDFRD >gi|226332190|gb|ACIC01000130.1| GENE 95 128985 - 131270 1142 761 aa, chain - ## HITS:1 COG:TM1624 KEGG:ns NR:ns ## COG: TM1624 COG3250 # Protein_GI_number: 15644372 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 179 586 243 642 785 81 23.0 7e-15 MFRRFKFDITDYIQPTQNCIAIKIYQVDHPGTPNPGTQFIAFGPNRGTASDLFKDETLKF TGGWDCAPVVRDRNMGIYQSVTLEATSQVTIEDPYIITTLPQKDTTVADITIKATVHNHS DKMISGKVKATIRLINELVYPSYTRKLAGSMKPISVTLPVTMLPNESKEIILSPQKFSVL SIQNPYLWYPNGYGEQYMHSLDLSFDMNNGKSSDREKVNFGIREVTSELMKNGEEYGRVF FINGKRIFCKGGWIQPDVLLDDSPKRIYDQARLMANANITLIGSEDMPSPSEDWLDSWDK YGLMNWHVFYQCYRMFPGRDNQHNPLNNDLAIACVKDMVKRYRNHPCIIAWFGVNEVLVD EELYHPTKEAVLSLDTTRPYIPTTSISWDVDKLTPYLKPDLPTGTTDDGAPDYNWAPSDY YFDKVEEVYLQMFRNELGMPSVPVYESLKKFIPTIDKPLDRRNPIYPLDSIWAEHGAWDT NNFCYRAYDNAIRTFYSDPVSSEDYAYKGQLVSSEGYRAMFEAANHRMWDITTGIMIWKL NSCWPDVCWQIYDWYLSPNASYYFSKKAMEPIHIQMNANTNRISVINATHQELKSVVAKA RIIDYSMKECWAYTDTISVGADQYKELETVPQRGDLSAVYYIKLELEDLNGNLISENLYW RYSQHQNFYWLVNMPKVTLKQDLKLQKQEKEYQITLTLTNDSDKLSFFNHLMIQREKTKE VVNPVFWSDNFISLFPNETRTITATVSIEDLHGENPIIIIK >gi|226332190|gb|ACIC01000130.1| GENE 96 131332 - 131748 241 138 aa, chain - ## HITS:1 COG:no KEGG:BT_3599 NR:ns ## KEGG: BT_3599 # Name: not_defined # Def: beta-mannosidase precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 131 1 131 920 260 100.0 1e-68 MKKLLILYPFLLFCLISKADNQLKLTNWEIKSTTEISENAEKVSMPSFQATDWYEATVPT TVLNALVKQNVYPDPRIGMNNFLIPDVSDEFNKKMDLAKYSYLGNGKNPWQEPYWYRTTF TLPKQYKNKKIMAEPQWN >gi|226332190|gb|ACIC01000130.1| GENE 97 131755 - 132651 387 298 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae 
D39] # 11 295 6 317 319 153 33 5e-36 MRIVYPKKIAIGVDIGGTKIKAGLVDINGQIIGIPESIRTLAHEPGEMIIEQLTLLIRRM IQQADGAELIGIGIGSTGPLDINKGIILECNNLPTLHNYPLHKKIESTFGLPVKLDNDAN AMMLGEALWGAGRNLNSILGITLGTGLGAAIVVNRKIIRGATGCAGEIWLSPYKEGMIED YVSGTGISNLYQRITKRKISGEEISKLAREGDINALKAWKEFTQALAYALSWTVNIVDPE VVIIGGSVMHSSDIFWDSMVSLFKKYICPQTAASIQLKPAGLKDNAGFMGAAALMFVE >gi|226332190|gb|ACIC01000130.1| GENE 98 132655 - 133704 348 349 aa, chain - ## HITS:1 COG:PH0243 KEGG:ns NR:ns ## COG: PH0243 COG0449 # Protein_GI_number: 14590174 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Pyrococcus horikoshii # 5 349 259 601 601 103 28.0 3e-22 MVQFFQEIQEQPQALLQTANFYKSVEGKSVLSQISELWCSGKYRNILLTGMGSSYFIANA TASLLNSYKIPAYALNAGELLHYQISLISPESLIICISQSGESYEVIKLIEKLSSNITVL SICNEKDSSLVKFSRYSLLCKAGKEEKTSTKTFITCYQVAYLLAMKLCNQEIDSTQWHKL SKIIENMVNGNTPWMSKAIELIDGSTFVQLIGRGPVFAAASQSALMFMEAAHTPASALLG GEFRHGPLEMVKKGFIAILFAHSQSETYEQSLSLVKDILKYEGKVIFITDSNLVFENDNL CAITIPSTNAELFTIPCIVPVQLLINTWATKCGIIPGEFTHGAKITSIE >gi|226332190|gb|ACIC01000130.1| GENE 99 133834 - 135693 971 619 aa, chain - ## HITS:1 COG:no KEGG:BT_3602 NR:ns ## KEGG: BT_3602 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 619 1 619 619 1139 99.0 0 MKIKYLFTGVLISMLAFMSCEDPDDLARTGSENVTGLTITGCLASDESTTYSAIVDEASN TITIQVPYYISDTEKIQGDLTQMKVSASLPVGAKFSPSISGIRDLVSGFQSTLVKEDGSK ITYTFKAAYVKSKLASISKVELTDYSRATIRIVEPESTGGTGKIIIYKTSSAIDAALKSA ALTVSPWATIESSSLDPATGIIDLSNQPTITVIAQNGTDKTVYQTSLELPDLLPQGVGYT ASLFGFQIYTDDTHGFEVGANRTMAVIDDYLIISNSTDYNKMIVMDRYNGKVLDVKVNTT GIDAGRSIHAITSDDAGHLVAMAYTSTLDANVTDPNVRAWVWPNGIENAPKSIVYANING STFANAPVGINGVKKLELGRTICVKGDLTSGDAIIATSTKNVPRAVFLMFKDGAMQGNAY VEWGGGASVSMWNATKVIPLTNTSPLGYIWASANFRQAINYTPIGTGARAIDFSLPTSHW WSGSATYDKNVRGIGYVEFNGTCLLGVQNGLSSNGVWSHRLYVSNITNNPGTSAMANGFI FDSREGSTTGTGSIPGTGYAVTGMTSSASFVSGKQVLGTNVDETGDVVFGRSADGNAVQV YMLTTDQGLFAYEITRFNL 
>gi|226332190|gb|ACIC01000130.1| GENE 100 135718 - 137403 1242 561 aa, chain - ## HITS:1 COG:no KEGG:BT_3603 NR:ns ## KEGG: BT_3603 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 561 1 561 561 1127 100.0 0 MKNIKKYIIGIAASICLTQCDVLDVDPTGWYSENVAYSSLENVDLYVKGLYSVLYDNATI NIDSYCLMDDGATDLLKYSWYGVNGGAMNKFFYQTNYVTETSNFRSNWSGMYTRIRQLNE YFYDRSKGYGDNLDQDEMKKRTAEVRFLRAFAYQELVLRHGGVILRTDEDYVDGPDERAK ARSSESDCWDFIINEYTQAVQDLPENWTGTNVGRITKGAAYGMKARAALYAKRWGDAIDA CNEVLKLNYSLLQGTTANDYYKIFTSVNNSELILPVYFQQGKNAKQHSFDIYVCPPYDWK AAGVTEGSVGAAVTPSDEYASSFDIKVNGSYQSFDWSNLSSYNNAPFTNREPRFYASILY NGATWKGRTLQLYVDGNDGYMDFATTGQDNVHKSTTGYLIRKFASDDTNINFSSILSGQY WIEMRLAEIYLIRSEAYARQNEFGKAYADLKTIRDRVGLPELAQQNNWSDYLEDLSKERI CELGMEGHRYFDLIRWGIAVRTLDHKRLHGVKITQAGGSFSYARVECDTQDRLFPEKYTI FPIPYTELQDNTLCEQNELWK >gi|226332190|gb|ACIC01000130.1| GENE 101 137418 - 140573 2398 1051 aa, chain - ## HITS:1 COG:no KEGG:BT_3604 NR:ns ## KEGG: BT_3604 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1051 1 1051 1051 2031 100.0 0 MMEQCFKQKSSFRTIRIILFSLSMLFAYATNGYAQNMVRGVVTDVTGDPLPGVSVVVKGT TTGTTTDIDGRYSINASNTATLEFSYIGMNKQEIKVNGRSTINIILKEDVANLDEVVVVG YGTQKKAHLTGSVATVSQKDMLKTTASNMSQTLVGKLPGLITQQSLGQPGSDDVSILIRG YSSFSGSGTVLVLVDGVERAMGQVDPNDVESVTILKDAASCAVYGMKAANGVVLVTTKHG QEGKTDITYRGSMTLSHATTLPKMMNGTQYMQWYNLARKLDNNGVDNPYFTDEEIAATYN GDPTDGFENTDWTSDLYKTTLMHQHNLSINGGSNKVRYFISGGYLHQDGIIKGNKNERSN FRSNIDVQATKDIKVSLNTAALIKDYYQPGGYSYGNQQAYSIFHQLLYSIPFVPKEYEGY PTSAYRGATSAANPIYGSANSGFQESRRVRIESSANVEYTAPFLKGLKANMFISWDWQDL ASKTFAYAYSVNAYDSSTKSYSYVQSANLLADGNLYQGDQKSQQVILRPTISYNNKFGLH DVGALFLYEQTQVKSSALNASRTDYDLFDLPEISLGDAQTATNSGSSGKSAYAAYVGRLN YAYANKYLAEASFRYDGSYKFAKGHRWGFFPSVSLGWVMSEEDFFKDALPKIDYFKLRGS FGILGSDNVSAFLYRKSYSYTNNGVVFGSTPNTQGTLSNTVAYPNERLTWEKTKSYNLGF DLSAWNGLLGVEFDVFYKYTYDILQSVSNIYPPSLGGHYPSSENTGTFDNRGFEIALKHR NRIGEFSYSLNGNLSYAHNRILSRTQADNTLPWQSVLGSSVGELWGLKALGLYQSEEEIA 
NSPQVSWNTPRVGDIKYADINGDGKIDSNDRIKISRGIRPEMMFALMADANYKGFDLSVQ FQGAALCDKMLQYSWQDLNGATDMTPMTRPWYANWDNAPLYLVENSWRPDHTNAEYPRLT VSSVSHSNNAQQSDFWKRNGAYLRLKNVTLGYTLPKAWTNKMGLSNIRVYANGTNLLTFT DFKYIDPESTNVATGYYPQQRTFSFGIDVRF >gi|226332190|gb|ACIC01000130.1| GENE 102 140620 - 141786 996 388 aa, chain - ## HITS:1 COG:all3695 KEGG:ns NR:ns ## COG: all3695 COG2942 # Protein_GI_number: 17231187 # Func_class: G Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Nostoc sp. PCC 7120 # 2 387 4 387 388 475 58.0 1e-134 MDLKKLANQYKDELLNNVLPFWLEHSQDHEFGGYFTCLDREGNVFDTDKFIWLQGREVWL FSMLYNKVEKKQEWLDCAIQGSEFLKKYGHDGNYHWYFSLDRAGNPLVEPYNIFSYTFAT MAFGQLSLATGNQEYADIAKKTFDIILSKADNPKGKWNKIHPGTRNLKNFALPMILCNLA LEIEHLLDKEYLEKTIETCIHEVMEVFYRPELGGIIVENIGVDGNLVDCFEGRQVTPGHN IEAMWFIMDLGKRLNRPDLIEKAKNVTLTMINYGWDKEYGGIYYFMDRKGCPPQQLEWDQ KLWWVHIETLISLLKGYQLTGDKQCMEWFEKIHEYVWTHFKDAQYPEWFGYLNRQGEVLL PLKGGKWKGCFHVPRGLYQCWKVLEELQ >gi|226332190|gb|ACIC01000130.1| GENE 103 141819 - 143225 904 468 aa, chain - ## HITS:1 COG:CAC1339 KEGG:ns NR:ns ## COG: CAC1339 COG0477 # Protein_GI_number: 15894618 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Clostridium acetobutylicum # 6 463 10 453 469 292 38.0 1e-78 MKTTINLGYIIFLSVVAALGGFLFGYDTAVISGTIAQVTHLFQLDTLQQGWYVGCALIGS IVGVLFSGILSDSIGRKRTMILSAILFSTSAIGCAFCIDFNQLVVYRIIGGIGIGVVSIV SPLYISEVSVAQFRGRMVSLYQLAVTVGFLGAYLVNYQLLAYSESGNHLPIAWLEKIVVT EVWRGMLGMETLPAIIFFIIIFFIPESPRWLIVKGQERKATYILEKIYNSFKEADFQLNE TKSVLVSETRSEWSILLKPGILKAVIIGVCIAILGQFMGVNAVLYYGPSIFENAGLSGGD SLFYQVLVGLVNTLTTILALLIIDKVGRKKLIYYGVSGMVVSLILIGSYFLFGNAWNISS LFLLAFFLCYVFCCAISICAVIFVLLSEMYPTKIRGLAMSIAGFALWIGTYLIGQLTPWM LQNLTPAGTFFLFAIMCVPYMLIVWKLVPETTGKSLEEIERYWTRSER >gi|226332190|gb|ACIC01000130.1| GENE 104 143231 - 144328 843 365 aa, chain - ## HITS:1 COG:no KEGG:BT_3607 
NR:ns ## KEGG: BT_3607 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 365 1 365 365 753 100.0 0 MEANNRRDFLKKAALAGASALMAPTLLAAEGKDGSEFVLSPSKRTDSKLIVPKNNGLKIT GTFLDEISHDIPHQNWGEKEWDLDFQHMKRIGIDTVIMIRSGYRKFITYPSKYLLGKGCY MPSVDLVDMYLRLAEKYNMKFYFGLYDTGHYWDTRDMSWEVEYNKYVIDEVWNTYGEKYK SFGGWYISTEISRNTKGAIGAFHTMGKQCKDVSNGLPTFISPWIDGKKAVMGTGKMTRED AVSVEQHEREWNEIFDGIHDVVDACAFQDGHIDYDELDAFFTVNKKLADKYGMQCWTNAE TFDRDMPIRFLPIKFDKLRMKLEAAKRAGYDKAITFEFSHFLSPQSAYLQAGHLYDRYRE YFEIK >gi|226332190|gb|ACIC01000130.1| GENE 105 144715 - 145893 659 392 aa, chain - ## HITS:1 COG:no KEGG:BT_3608 NR:ns ## KEGG: BT_3608 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 392 1 392 392 776 100.0 0 MEQKKIVLFILITHLAAFLAGCSGNKNSGNNDSSDLWNKLSSYFRPPAEYENVYGNFRSP LLYYNGDTVRTVEDWQRRRTEIKDRWMSLLGQWPPVITGQTFEILDTLHRENFMQYRVRF YWTPNEQTEGYLLVPDKEGKKPAVITTFYEPETAIGLGGKPYRDFAYQLTKRGFVTLSIG TTKTTENQTYSIYYPTIENATLQPLSALAYAAANAWEVLAKVQDVDSTRIGITGHSYGGK WAMFASCLYEKFACAAWGDPGIVFDETKEGYINYWEPWYLGYYPPPWENTWSKNGHDYAK GIYPKLRKEGYDLHELHTLMAPRPFLVSGGYSDGTDRWIALNHTIAVNRLLGYRNNVAMS NRVNHDPTPESNEIIYDFFKWYLHSANKSTKE >gi|226332190|gb|ACIC01000130.1| GENE 106 146008 - 147153 654 381 aa, chain - ## HITS:1 COG:YPO4034_1 KEGG:ns NR:ns ## COG: YPO4034_1 COG1609 # Protein_GI_number: 16124154 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Yersinia pestis # 3 245 7 255 265 82 23.0 1e-15 MIKILLLIDYSSEFDRKLLRGLVQYSKENGPWLFYRLPSYYSGMYGEKGILKWAKEWKAD AIIGQWNNDTVNLLKELNIPIVLQNYHHRSTTYSNLTGDYKGTGRMAAQFFAKRMFHNFA YFGINGVVWSDERCEGFRQEVKRIGGNFYCFESDKHEDEIRIEVSQWLQELPKPIALFCC DDSHALFISETCKISNIHIPEEISLLGVDNDDLICNISDPPISSIELEVERGGYSIGRLI HQQIKKEHEGTFNIVINPIRIELRQSTEKHNIKDPYILEVVKYIESHYNSDLTIESLLAQ IPLSRRNFEVKFKNAMHTSIYQYILNCRVNHLADLLLTTDRSLADIATEAGFKDYNNISR IFKKFKGCSPLEYREKKITNR >gi|226332190|gb|ACIC01000130.1| GENE 107 147381 - 147851 499 156 aa, chain + ## HITS:1 COG:no KEGG:BT_3610 NR:ns ## 
KEGG: BT_3610 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 156 1 156 156 290 99.0 1e-77 MAIPYVVRKKADLTSGERKELWYGVPSKIQDRGGVQNKELAQVVELRGGFHRGQVEGILV EVADAIRYLLSMGQSVTIDGLGTFQTALTSPGFERPEQVTPGQVSVSRVYFVANSALRDR MKKTKCMRIPFKYYMPESMLTKEMKKADQEQEQTEE >gi|226332190|gb|ACIC01000130.1| GENE 108 147955 - 149496 1654 513 aa, chain + ## HITS:1 COG:SA1394 KEGG:ns NR:ns ## COG: SA1394 COG0423 # Protein_GI_number: 15927145 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase (class II) # Organism: Staphylococcus aureus N315 # 10 502 8 460 463 470 50.0 1e-132 MAQEDVFKKLVSHCKEYGFVFPSSDIYDGLGAVYDYGQMGVELKNNIKKYWWDSMVLLHE NIVGIDSAIFMHPTIWKASGHVDAFNDPLIDNKDSKKRYRADVLIEDQLAKYDDKINKEV AKAAKKFGESFDEAQFRSTNGRVLEHQAKRDALHTRFSKALNDNNLDELRQIIVDEEIVC PISGTKNWTEVRQFNLMFSTEMGSTSDGAMKIYLRPETAQGIFVNYLNVQKTGRMKVPFG IAQIGKAFRNEIVARQFIFRMREFEQMEMQFFVKPGTELDWFKKWKEIRLKWHKALGFGD ASYRYHDHDKLAHYANAATDIEFLMPFGFKEVEGIHSRTNFDLSQHEKFSGKSIKYFDPE LNESYTPYVIETSIGVDRMFLSIMSAAYCEEQLENGESRVVLKLPAALAPVKLAVMPLIK KDGLPEKAREIIDALKFHFHCQYDEKDSIGKRYRRQDAIGTPFCVTVDHQTLEDNCVTLR NRDTMQQERVAISELNNIIADRVSITSLLKTLQ >gi|226332190|gb|ACIC01000130.1| GENE 109 149508 - 150098 488 196 aa, chain + ## HITS:1 COG:PA4572 KEGG:ns NR:ns ## COG: PA4572 COG0545 # Protein_GI_number: 15599768 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Pseudomonas aeruginosa # 83 187 113 203 205 68 34.0 5e-12 MSKKIYLFSLVLLALAFVSCSETEEVGKYDNWRARNEAFIDSLANVYATASGRGGLERIE MLTAPGNYIYYKEMEPMTDHVVKAGNPKYTDYVKVYYKGTNILGEYFDGNFKGDNPVVDG KDPSEGDSPTTIFQVSGVITGWGEVLQRMEVGDRWKVYIPWDYAYGSSGTTGILGYSALV FDITLLDFANTEAELK >gi|226332190|gb|ACIC01000130.1| GENE 110 150140 - 151183 1006 347 aa, chain - ## HITS:1 COG:YPO0108 KEGG:ns NR:ns ## COG: YPO0108 COG1609 # Protein_GI_number: 16120455 # Func_class: K Transcription # Function: Transcriptional regulators # 
Organism: Yersinia pestis # 1 335 1 337 342 179 31.0 1e-44 MENRKHTSLKDLAQALGVSIPTVSRALKDSPEISRELCAKAKKLAKEMNYRPNPFAMSLR KNAPRIIGVIVPDIVTHFFASILNGIENMAIDNGYFVIITTSHESFEHEKRNIENLVNMR VEGIIACLSQETTDFSHFAALKEINMPLILFDRVCLTDQFSSVIADGAQSAQMATQHLLD NGSKRVAFIGGANHLDIVKRRKHGYLEALRDNRIPIEKELVVCRKIDYEEGQIATETLLS LPQPPDAILAMNDTLAFAAMEVIKRHGLRIPDDIAIIGYTDEQHANYVEPKLSAVSHQTY KMGETACQLLIDQIKGDRTVKQVVIPTHLQIRESSIKENNKKNAVST >gi|226332190|gb|ACIC01000130.1| GENE 111 151363 - 152295 1025 310 aa, chain + ## HITS:1 COG:SMc02775 KEGG:ns NR:ns ## COG: SMc02775 COG0667 # Protein_GI_number: 15963772 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Sinorhizobium meliloti # 1 280 1 307 339 122 29.0 1e-27 MQYHEIGKTGMKVSSLSFGASSLGGVFHDLKEKEGIQAVFTAIEAGMNFIDVSPYYGHYK AETVLGKALKDIPRDRYYLSTKVGRYGKDGVNTWDYSAKRATESVYESMERLNIDFIDLI NVHDIEFADLNQVVNETLPALVELREKGVVGHVGITDLQLENLKWVIDRSPSGTIESVLS FCHYCLCDDKLADFLDYFESKEIGVINASPLSMGLLSERGVPVWHPAPKPLVDACRKAME HCKAKNYPIEKLAMQFSVSNPKIATTLFSTTNPENVKKNIGFIEEPVDWELVREVREIIG EQQRVSWANS >gi|226332190|gb|ACIC01000130.1| GENE 112 152311 - 153234 793 307 aa, chain + ## HITS:1 COG:no KEGG:BT_3615 NR:ns ## KEGG: BT_3615 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 307 1 307 307 630 100.0 1e-179 MDYTIIDAHAHLWLRQDTVVDGFPIRTLENGRSLFMGEIRQMVPPFMIDGVNSAEVFLSN MDYAQVSAAVITQEFIDGIQNDYLMEVVSRYPDRFFVCGMCEFRKPGYLEQAKELIGKGF KAIKIPAQRLLLKEGRVMLNCPEMMQMFRWMEERGVILSVDLAEGAIQVLEMEEIIQECP RLKIAVGHFGMVTLPDWKEQIKLARHPNVMIESGGITWLFNDEFYPFKGAIKAIREAAEL VGMEKLMWGSDYPRTITAITYKMSYDFVVKSPELSEAEKTLFLGGNARKFYGFAELPELP YIKNMSE >gi|226332190|gb|ACIC01000130.1| GENE 113 153276 - 154532 1116 418 aa, chain + ## HITS:1 COG:fucP KEGG:ns NR:ns ## COG: fucP COG0738 # Protein_GI_number: 16130708 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Escherichia coli K12 # 2 404 20 423 438 256 39.0 8e-68 
MKKNTYTIPLALVFSLFFLWAISSNLLPTMIRQLMKTCELNTFEASFTETAYWLAYFIFP IPIAMFMKRYSYKAGIIFGLLLAAVGGLLFFPAAMLKEYWAYLCIFFIIATGMCFLETAA NPYVTVLGAPETAPRRLNLAQSFNGLGAFIAAMFLSKLILSGTHYTRETLPVDYPGGWQA YIQLETDAMKLPYLILALLLLAIAVVFVFSKLPKIGDEGAEPASGKKEKLIDFDVLKRSH LRWGVIAQFFYNGGQTAINSLFLVYCCTYAGLPEDTATTFFGLYMLAFLLGRWIGTGLMV KFRPQGMLLVYALMNILLCGVVMLWGGMIGLYAMLAISFFMSIMYPTQFSLALKGLGNQT KSGSAFLVMAIVGNACLPQLTAYFMHVNEHIYYVAYGIPMICFAFCAYYGWKGYKVID >gi|226332190|gb|ACIC01000130.1| GENE 114 154556 - 155572 931 338 aa, chain + ## HITS:1 COG:STM1542 KEGG:ns NR:ns ## COG: STM1542 COG1063 # Protein_GI_number: 16764887 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Salmonella typhimurium LT2 # 1 334 3 336 341 198 34.0 1e-50 MKAVQIVNPSEMKVVELEKPTVGAGEVLVRIKYVGFCGSDLNTFLGRNPMVKLPVIPGHE VGAVIEEIGPDVPAGFEKGMNVTLNPYTNCGKCASCRNGRVNACEHNETLGVQRNGVMCE YAVLPWTKIIPAGNISSRDCALIEPMSVGFHAVSRAQVIDNEYVMVIGCGMIGIGAIVRA ALRGATVIAVDLDDEKLVLAKRVGASYAVNSKTENVHERIQEITAGFGADVVIEAVGSPV TYVMAVDEVGFTGRVVCIGYAKKEVAFQTKYFVQKELDIRGSRNALPADFRAVINYMKEG KCPVEELISKIAKPEDALEAMKEWAADPRKVFRILVEM >gi|226332190|gb|ACIC01000130.1| GENE 115 155873 - 156724 582 283 aa, chain - ## HITS:1 COG:no KEGG:BT_3618 NR:ns ## KEGG: BT_3618 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 283 1 283 283 580 100.0 1e-164 MILIADSGSTKTDWCVVLNGAVIKRLGTKGINPFFQSEEEIQQKLTASLLPQLPEGKFNA VYFYGAGCTPEKAPVLRRAIADSLPVIGNIKANSDMLAAAHGLCGQKAGIACILGTGSNS CFYNGKEIVSNISPLGFILGDEGSGAVLGKLLVGDILKNQLPATLKEEFLKQFDLTPPEI IDRVYRQPFPNRFLASLSPFIAQHLEEPAIRQLVMNSFIAFFRRNVMQYDYKQYPVHFIG SIAYCYKEILQDAARQTGIQIGKILQSPMEGLIQYHSQLSFHP >gi|226332190|gb|ACIC01000130.1| GENE 116 156721 - 157839 881 372 aa, chain - ## HITS:1 COG:all1887 KEGG:ns NR:ns ## COG: all1887 COG4299 # Protein_GI_number: 17229379 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. 
PCC 7120 # 9 372 2 375 375 190 35.0 4e-48 MNVTTSNKRLLALDVMRGITIAGMILVNTPGSWQHAYAPLKHAEWIGLTPTDLVFPFFMF IMGISTYISLRKYNFTFSVPAGLKILKRTVIIFLIGIGISWLSILCFQHDPFPIDQIRIL GVMQRLALGYGVTAIVALLMKHKYIPYLIAVLLISYFAILALGNGYVYDETNILSIVDRA VLGQAHIYGGQILDPEGLLSTISAIAHVLIGFCAGKLLMEVKDIHEKLERLFLIGTILTF AGFLLSYGSPICKKVWSPSFVLVTCGLGSSFLALLVWIIDIKGYKNWSRFFESFGVNPLF IYVLADILAITLAVIPMTYQGEATSLHGYIYSALLQPVFGDKGGSLVFALLFVLLNWAIG YILYKKKIYIKI >gi|226332190|gb|ACIC01000130.1| GENE 117 157882 - 159189 874 435 aa, chain - ## HITS:1 COG:YPO3162 KEGG:ns NR:ns ## COG: YPO3162 COG0477 # Protein_GI_number: 16123324 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 25 430 20 409 492 117 25.0 6e-26 MISTRNAKNIIPRNNRNPWSWIPTLYFAEGLPYVAVMTIAVIMYKRLGLSNTEIALYTSW LYLPWTIKPLWSPFVDLVKTKRSWIIAMQGLIAAGFAGIAFFIPTPHYVQLTLAFFWLMA FSSATHDIAADGLYMLGLNNKEQSFFVGIRNTCYRFANIFGQGILVMLAGWLETSHGNVP MAWSITFYLMAGLFLGLTLYHRFILPHPASDVKRPGLTPGKLLEDFFKTFVTFFEKKNLG LMFFFLLTYRLGESQLAKIASPFLLDGAEQGGLGLSTASVGVIYGTIGVAALLIGGIISG FLVSRDGFKKWIIPMALAINLPDLLYVWMAAATPDNLLLISICVAIEQLGYGFGFTAYML YLIYIADGEHKTAHYAIGTGFMALGMMLPGMPAGWIQEHLGYTNFFIWVCVCTIPGILAS WMIRNRLDSSFGKKR >gi|226332190|gb|ACIC01000130.1| GENE 118 159207 - 160586 832 459 aa, chain - ## HITS:1 COG:sll1283 KEGG:ns NR:ns ## COG: sll1283 COG2385 # Protein_GI_number: 16329811 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Sporulation protein and related proteins # Organism: Synechocystis # 88 456 127 388 391 96 26.0 1e-19 MEEPQITVGILSGKEIEFSFPESFTTPDGTKVSGAQKAVYQKEKIYWSGKEYDELIFYPQ PNAGIFFELKDVTIGINFHWERKEVQRFQGALKIIVEGETLTAINVISIEDYLTSVISSE MSATASLELLKAHAVISRSWLINKLRVENEKWKATIQPDSAANSPHSTLHSQLIKWYDHE AHTHFDVCADDHCQRYQGITRASTPQAIEAVSATRGEVLMYEGKICDARFSKCCGGAMEE FQNCWENVRHPYLTGKRDYIPESHASDSEKQITGPAIQLPDLTQEEEADRWIRTSPDAFC 
NTQDKKILSQVLNNYDQETTDFYRWKISYSQQELAELIHQRSGIDFGEIIDLIPVERGTS GRLIRLKIVGTLRTLTIGKELEIRRTLSPSHLYSSAFLVDKEKGEEGNVPTRFTLTGAGW GHGVGLCQIGAAVMGERGYYYKDILAHYYPGSRLEKQYE >gi|226332190|gb|ACIC01000130.1| GENE 119 160597 - 163041 1317 814 aa, chain - ## HITS:1 COG:no KEGG:BT_3622 NR:ns ## KEGG: BT_3622 # Name: not_defined # Def: putative glycosyltransferase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 814 1 814 814 1681 100.0 0 MKKKINCFIPFGNPKDTIQTVKELQASELVNKIYLLSADTNKECLPGCECLLINGLYSTS TIKTIADNADDASYLLLYTKQTPLKLGLYALERMTQIIQSSSENAMVYADHYQLVDGVLK QAPVIDYQQGSLRDDFDFGSVLLFNSEIFTLASSILEETVFQYAAFYEMRLMLGNLFQII HINEFLYTEIETDTRKSGEKQFDYVNPKNREVQIEMEKVCIKHLKEINAYVRPSSRTVNL DREAFEFEASVIIPVRNRIRTIRDAVNSALSQQTTFPFNVIVIDNHSTDGTTEALQEFSA DNRLIHIIPQENDLGIGGCWNVGVHHEKCGKFAVQLDSDDLYKDEHTLQKIVDTFYQQNC AMVIGTYLMTDFQMNEIAPGVIDHKEWTVEDGPNNALRINGLGAPRAFYTPILRKIKFPN TSYGEDYAIGLSISREYTIGRIYDVIYLCRRWEGNSDAALDIEKINRNNFYKDSIRTWEL QARIRMHSIDESFQRLVNEMIEKQKKDWKLAKKNYKELEQNLKKEKTLELKLGGDTKRVR FFPNPQRAISTMAQTDSQSIQERPCFLCNDNRPAEQTSLSLGHYEVCLNPYPIFRRHLTI IEEEHTPQTIKNRFEDMLFLAENMNEFLILYNGPECGASAPDHMHFQAAGKEEKIANPFG MTFISDMLSSEDSVHSHLENGFTTSIGISSPNRRGIIEMFEYLYDKIVSIYHDKEPLINV IAWYGLETVNRTGGDEIKQWNCIIFLRSKHRPECYYAQGKKGLLISPAIAEMCGVFPIVR EEDLDKITAKKITQIYKEVSLSQEQLKALTDQTS >gi|226332190|gb|ACIC01000130.1| GENE 120 163043 - 164488 930 481 aa, chain - ## HITS:1 COG:sll1087 KEGG:ns NR:ns ## COG: sll1087 COG0591 # Protein_GI_number: 16330938 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Synechocystis # 13 407 26 420 512 130 28.0 6e-30 MILFAISYIAGHKADNEGFFVGNRKSAWYIVAFAMIGSTISGVTFVSVPGMVQASGFSYL QMVLGFIVGQFIIAFILVPLFYRMNLVSIYEYLENRFGASSYKTGAWFFFISKMLGAAVR LFLVCLTLQLLIFEPFHIPFIINVILTVLIVWLYTFRGGVKSLIWTDVLKTFCLVVSVVL CIYYIASSLHLNFSGLVSTISDNDLSKMFFFDDVNDKRYFFKQFLAGVFTVIAMNGLDQD MMQRNLSCKNFRDSQKNMITSGISQFFVILLFLMLGVLLYTFTARQGIENPGKSDELFPM IATGNYFPGIVGILFIIGLIASAYSAAGSALTALTTSFTVDILNAHKKDEAALSKIRKHV 
HIGMAFVMGIVIFVFNLLNNTSVIDAIYTLASYTYGPILGLFAFGIFTKKQVYDPYIPLV AILSPALCYVLQRNSEAWFDGYQISYELLILNAAFTFLGLCLLIKRKSSVAPINNSLIKK H >gi|226332190|gb|ACIC01000130.1| GENE 121 164451 - 164624 144 57 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHLFDKFPTFVQFVGKAYLNIFTRERILSNRYESNYLIYYYHSILYDPFRYLLYSRT >gi|226332190|gb|ACIC01000130.1| GENE 122 164630 - 165895 666 421 aa, chain - ## HITS:1 COG:no KEGG:BT_3624 NR:ns ## KEGG: BT_3624 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 421 1 421 421 848 100.0 0 MKICKACTFLSFLIILLYSCALPPEDEKESLEPDSSTSELPWTGETEKFTINTKEGVHLN DPQEDAGTAYITIPSSSVRSTRWEFGVRLTFNPSANNYARFYLASSSEILSGDLNGYYIQ IGGTKDNVALYRQNGNQSKLLASGRELMKGNNSPKLFIKVECDANGYWTFWTRMETETEY TKEKQIKDASITKSICCGIYCVYTKSRCDGFTFHHIQLSDDVITSTEPDDIPDVPDEPDK PDTPESPEYPDNVRNMLLFNEVMYDNAKDGAEYIEFYNPADQDITVKSLRLFKMRATGEI FSTTILKQEDENTNLVIPGKGYICFTHSATTLIRKHKVDGKSITEISKFPQLSNDGGYLA LATDEEHPRTIDTCRFIDWMHDTSVTTGISLEKISPELPSLNKNWHSSRNDTGGTPGIKN S >gi|226332190|gb|ACIC01000130.1| GENE 123 166152 - 167186 941 344 aa, chain - ## HITS:1 COG:no KEGG:BT_3625 NR:ns ## KEGG: BT_3625 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 344 1 344 344 680 100.0 0 MKRFSTLLLYVTIAFLLLWQLPWCYNFFVVKPEKTPFTLYSFVIGDFALMGQEEGKGTVR RDAAGNIYSEAAFDSILPMFYFRQLMSDERFPDTINGIAVTPKIVQTENFNFRSVPTDIN APSIGLYPLLESMSGRVDLKMPDDVFRITSQGIEFIDMASNSVNAEKSLLFTEAMTKKNF RFPAIEIAGNPTVKKEYDEGYLLLDADRRLFHLKQVKGRPYVRAITLPEGLTLEHLYLTE FRNKKTLAFLTDINNALYVLKSRTYDVVKTGVPAFNPETDALTIIGNMFDWTVRVTSPSS DNYYALNADDYSLIKKLENNSNTHYMPGISFTSYTDKYVMPRFE >gi|226332190|gb|ACIC01000130.1| GENE 124 167230 - 167898 687 222 aa, chain - ## HITS:1 COG:no KEGG:BT_3626 NR:ns ## KEGG: BT_3626 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 222 1 222 222 385 100.0 1e-106 MYAIFYKEWIKTRWYFLLAVVTTLGFTGYCMLRINRVIEMKGAAHVWEVMMQRDAIFIDM LQYIPLIAGVLMAIVQFVPEMQRKCLKLTLHLPYPELKMTGNMLFSGLVLLLVCFASNFL 
LMEVYLSGVLAHELKNHILLTALTWYLAGISGYLLIAWICLEPAWKRRIINLIIAVLLLR IFFLSPTPEAYNKFLPYLLVYTLLTASFSWLSIVRFKAGKQD >gi|226332190|gb|ACIC01000130.1| GENE 125 167942 - 168829 223 295 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 21 291 26 304 309 90 25 6e-17 MEQIIECNNLTHYYGKRLIYENLSFTVPKGRILGLLGKNGTGKTTTINILSGYLKPRSGE CRIFGQEIQTMAPALRRNIGLLIEGHVQYQFMSITQIEKFYAAFYPNQWKKEAYYELMNK LKVATGQRISRMSCGQRSQVALGLILAQNPELLVLDDFSLGLDPGYRRLFVDYLRDYARS EGKTVFLTSHIIQDMERLVDDCIMMDYGKILIQKPIDELLKEGRRYTCTVPEGYELPASD DFYHPSVMRNTLETFSFLPPTEAEAKLKSMSVPYTDLHSEHVNLEDAFIGLTGKY >gi|226332190|gb|ACIC01000130.1| GENE 126 168852 - 169418 478 188 aa, chain - ## HITS:1 COG:no KEGG:BT_3628 NR:ns ## KEGG: BT_3628 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 188 1 188 188 365 98.0 1e-100 MKWSSSIRKWSRLIHRDLSFFFAGMVLIYAISGIVMNHRDTINPNFSIERKEYKIAEKLP GKEGMKRENVLPLLQPLGEEGNYTKHYFPKTDIMKVFLKGGSNLQVNVRTGEAVYESVTR RPLIGAMARLHYNPGQWWTYFADIFAVGLIIITLSGVIMLKGSKGIIGRGGIEMIVGIVI PILFLLFF >gi|226332190|gb|ACIC01000130.1| GENE 127 169405 - 169992 626 195 aa, chain - ## HITS:1 COG:no KEGG:BT_3629 NR:ns ## KEGG: BT_3629 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 195 1 195 195 336 100.0 3e-91 MKTKKLSIAIALLAVAVIGTSCGNKQQKSSSDTTTEQTTSSALEIDSLLANAENLAGKEV TIEGVCTHTCKHGAKKIFLMGSDDTQVIRVEAGTLGAFDPKCVNSIVRVTGTLKEQRIDE AYLQNWEAQLKAQAAEKHGTGEAGCDTEKKARGETANTPEARIADFRAKIADRKASSGKE YLSFYFMEANSYEVE >gi|226332190|gb|ACIC01000130.1| GENE 128 170050 - 171672 1347 540 aa, chain - ## HITS:1 COG:no KEGG:BT_3630 NR:ns ## KEGG: BT_3630 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 540 2 541 541 1013 99.0 0 MKKGILFVLTAAFLASCQQEENEGVASVDRVTITPIITRATEVNFEDQDQIGLSVTKEDG TVYATNELMTFNDGAFAGSLKWYPEGADKSSFVAYYPYSATGVPTSFTVHADQTTNYGIS 
DFMAASKSDVLPSVNSISMIFKHMLTKLVINVTNETNLDISSIVLKGSVPTANINLATMK TTVNESAAATDITAQQVTKNKTFRAIVVPQTAAFTLAVTTSDNNTLMQKLVSTDLVQGGQ YSVNIRVLPDNIIVTLSGEIENWTDEGEIKGDEPEVPFEEHDGYFIYDGITYNTVTLSNG TTWMAEPLRYVPDGYTPSSDPAADSHIWYPYELVTVDGTLTAKALQDEASIKQYGYLYDI HAALSGKAVTPENCYDFEGAQGICPKGWHIPTRAEYFSLVGNSTKDVDGNSLENGKDALF YDTAYDGGKIPTLNEAKFNYQFSGVRMSTGYSAVPKYQATAITSSNSTVTEWFGKPSMSY YMTSTAYKPIYSTSTGDLTNIQFFGLMSTFSMAKYPEGRLSLSYISVLSGQTVRCVKNQN >gi|226332190|gb|ACIC01000130.1| GENE 129 171708 - 173240 1193 510 aa, chain - ## HITS:1 COG:no KEGG:BT_3631 NR:ns ## KEGG: BT_3631 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 510 1 502 502 984 99.0 0 MKENNNPCMSNMKGIYIGCMLLCIAAVSSAQTYDVIERRNSWNAGTNVTGIMMDSVTVSY AELYGKNNHGDFRNYYEADKLWSAGAVAKSITHLKKYSLIGSFSFDHTSGKDMSGSMFIH PGFYPVDLLEFTPGRKDLQTYAFMGGIAADAGTNWRIGGKIDFTSANYSKRKDLRHTNYR LDLKVAPGIMYHSDHYAIGFSYIFSKNSESVSAEEIGTAATPYYAFLDKGLMYGAYEAWS GSGVHLSESGINGFPIKELSHGAAVQAQWGSLYGDVEYLHSSGTAGEKEVIWFKFPAHRI TSHLRYRFAKGSTEHFLRINLSWSRQTNDENVLGQEISNGITTTHVYGSNRIFERKALSV QPEYELIAPKGELRVGAEISSFKSMTTQMYPYVASQTMTCGRVYASGVLHAGRFDVKAAA SLSIGDFTEKSRTTETTAEPGDPPYRLTDYYHLQNEYATAPRATVNVGLRYHFHRGIYAE IQGGYTHGFNLEYIAGSSRWNETIKLGYTF >gi|226332190|gb|ACIC01000130.1| GENE 130 173209 - 174444 1118 411 aa, chain - ## HITS:1 COG:no KEGG:BT_3632 NR:ns ## KEGG: BT_3632 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 411 1 411 411 814 99.0 0 MKQQSIYIILLCVTSLLGGCTDIDKGNPYEDQLHTLQVTAVYPDDYAEYLREGITVQIED IDRGNSYKTVTDQNGTAQFALTNGIYRIQISDKADQHIFNGLADKVKLVNGDITLNVPLT HSKAGAIIIKEIYCGGCTKLPQEGTYQADKYIILHNNDSQTQYLDHLCLGTLDPYNSQST NVWVTQDEATGATIFPDFAPIVQCIWQFGGNGTSFPLAPGADAVIAVNGAIDHAAQYPLS VNLNKPGYFACYNNVYYWNTMYHPAPGDQISRDHYLDVVIKLGQANAYTFSIFSPATVIF KAQDTTIQEYVQQTGSVIQKPGSTVDQVIKIPLDWVIDGVEVYYGGSSSNKKRISPSIDA GYVSQSALYDGRSLHRRVDEEASKEAGYEILEDTNNSSADFYEREQQSLHE >gi|226332190|gb|ACIC01000130.1| GENE 131 174449 - 177286 1997 945 aa, chain - ## HITS:1 COG:no KEGG:BT_3633 
NR:ns ## KEGG: BT_3633 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 945 1 945 945 1785 99.0 0 MRFLTLKIMLCLLLSLIISRIEANTNNYSISGRISDERTGSPLPGASILIKGTYLWAVSD QKGEFTIQGIQEGKYQLEVSFLGYVPATVPVNVNNSIKGLKIQLKENTLALNDVIVTAQA PKSELNTTLNIGSNALEHLQISNVSDISALLPGGKTKVPDLTSNNIFSLRDGGSSAGNAA FGTAIEVDGVRIGNNSSFGNMTGIDTRSISVADIESVEVITGVPSAEYGDLNSGMVKIHT KKGKTPWNVLLSINPRTEQVSFSKGLDLGNDKGIVNISGEWTKATQKLNSPYTSYTRRGF SASYSNTFRKVLRFNIGLTGNIGGMNTKDDPDAYTGEYTKVRDNVFRANTSLAWLLNKSW ITNLKLDASVYYNDNKSHAHTPYSYASEQPAVHAEQEGYFIAGKLPYHFFADQIVDSKEL DYAASLKYEWNRRFKNITSNLKAGVQWKGTGNVGDGEYYQNPSLAPNGYRPRSYSAYPYM HNVSLYAEESLSVPVGSTMLRLMAGLRWENLFISGTQYDKLNTLSPRFNARWQLNENIAI RGGWGVTEKLPSYYVLYPRQEYRDIQTFGVSYNNNESAYVYYSQPYTLLHNEKLRWQKNQ NAEIGVDINVARTRISLVGYFNRTKMPYKYTSTYTPFAYNVLQLPEGFELSASPQITVDN QTGMGYIRDDENSYWTPLDVKVKDQTFVRSTSPDNGPDITRRGAEMIIDFPEITPLRTQF RVDAAYTYTKYIDNSLSYYYQNGWSHTSLANRSYQYVGIYANGDNSSTTANGKRTHSLDA NITAITRIPKARLIISCKLEASLIKRSQNLSEYNGTSYAFNVSENSNEKTGGSIYDGNSY TAIYPVAYLDLNNEVHPFTAAEAENPDFAHLIRKSGNAYTFAADGYDPYFSANISITKEI GDHVSLSFNAINFTNSRAYVKSYATGTSAIFTPNFYYGLTCRIKL >gi|226332190|gb|ACIC01000130.1| GENE 132 178166 - 178534 186 122 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253571187|ref|ZP_04848594.1| ## NR: gi|253571187|ref|ZP_04848594.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 122 1 122 122 229 100.0 6e-59 MDNLIFTSESVILYERKRIHALKYEEIICITTDRPYLVITTVESLQMFIQMSLSKVGELL PDYFCLCNQSTIINLTYAYLYEEHNGHFFVSLSAMLEPFEVSRRCKRNIKNKMLYINKNR GL >gi|226332190|gb|ACIC01000130.1| GENE 133 178531 - 179103 164 190 aa, chain + ## HITS:1 COG:no KEGG:Sala_2218 NR:ns ## KEGG: Sala_2218 # Name: not_defined # Def: hypothetical protein # Organism: S.alaskensis # Pathway: not_defined # 1 184 9 186 190 112 34.0 8e-24 MTNNELKLLQSAMDHSKSYLEFGSGNSTYMAVATDNIKKITVVESDVTFWETHLMSTPAI YEAVEKGRLHPYLVNIGVTGKWGYPMNDNNRDYWPAYHSCVFQSNHSYDTVLVDGRFRIS CILHACLYCPQDTKILIHDFFNRPNYYVVLPFLKLEERADTLGLFSIKQDTSLKHLAKEY ISIYEYLPGF >gi|226332190|gb|ACIC01000130.1| GENE 134 179120 - 179680 330 186 aa, chain - ## HITS:1 COG:no KEGG:BT_4275 NR:ns ## KEGG: BT_4275 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 110 186 147 222 252 62 42.0 6e-09 MWLKKIGLKHKKSKLIKLTDESEDFESLPGFQMHTGILSPTIKSDDQDFIAFFSIQPNNN RPPFILKDKVEYDGKEVYRYLFVVRAELAMSLNKKREDISFDEIKHYIEYKPASYAKDVF NADTVITYPLSLGSKCFRKKYYYCDVWLFQKNNIGHFYLYCFYTDKGYENKWKYEQELEK TLRFKK >gi|226332190|gb|ACIC01000130.1| GENE 135 179866 - 180459 243 197 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253571190|ref|ZP_04848597.1| ## NR: gi|253571190|ref|ZP_04848597.1| predicted protein [Bacteroides sp. 
1_1_6] # 12 197 12 197 197 359 100.0 4e-98 MKKLKKINLNKSFEDFTPLKEWELRTIVAGDGDPSWDCLFNAFIYTYEELYGITLNKDEM VQFYINYTNHNPEAEGGANIGSFMNFIQYYGYQGSQNIGGTMNSFSTTQWASFQLDSSGI LHFAIPKAAYRDANGNLMLRCYDPTLKSERDYSYESLRGVFDINTVPFIDSPISGSYDSY GSFGSYPITGSDGIYYG >gi|226332190|gb|ACIC01000130.1| GENE 136 180610 - 181941 615 443 aa, chain - ## HITS:1 COG:PA4142 KEGG:ns NR:ns ## COG: PA4142 COG0845 # Protein_GI_number: 15599337 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Pseudomonas aeruginosa # 210 440 175 413 418 61 22.0 3e-09 MIWERKNKEVIEGKTSKRSEEINDIIDKMPMTFGKWVARAVIFFFVLLLTMGWVIKYPDT VTGQIKINSNLSPVRLVANASGKMQLLNITPQENVKEGQYIAIIQNSATTVDVIRINNLI HAFDPNNLEMILHEKLFPKKVSLGELNLKYYAFLSALESYKRFLSGDSYKQQKRNLLTNI KWQHITATENSSGINISQKKLDVTEKWYRKYQTMNKKDAVAGHDVDNILNDFLSAKLNYQ NLKKENASIQLQIAENEYQLNRLEIEQNEKDSQLQMELLTTYYDLRDNLKVWENKYIFKA PFTGKVEFLNFWVDNQFIQSGEEVFSIIPPQNITVGQMLLPASGAGKVKNGSNVIIKLNN FPYTEFGSINGTVASISLITKEQKAQESSIDSYLITVDLPQGLKTNYGEVLEFQYELGGI ADIIIRDRRLIERLFDNLKSRVK >gi|226332190|gb|ACIC01000130.1| GENE 137 181919 - 184120 228 733 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 495 717 135 356 398 92 27 2e-17 MNLTTFPHEFQMDAKDCGPASLMIIAKFYGKYYSLQYLRDLCGITREGVSFLDISYAADK IGLRTVSVKASMHDIVDRISLPCIIHWNANHFIVVYKASKTKIYVSDPAKGLVSYSHEEF KDKWYKKGEETGVLMALEPMANFKQIEANEKIERLKSLENLLGYFLPYKKAFGILLFIML MATLLQAFLPFISKSVIDIGIHTRDITFIQMMLVANIILLLSVTLSNALRDWVLLHVSTR VNISLISDYLIKLMKLPATFFENKLVGDILQRANDHERIRNFVMNNSLGMLFSSMTFLVF SVILLIYNANIFFIFMGGSILYVLWIFFFLKIRKKLDWEYFELTSKDRSYWVETINNIQE IKINNYEDTKRWKWEAIQARLYRLNIKVLKVNNAQMLGSQFINNMQNMAVTFFCAVAVIN GDITFGVMISTQFIIGMLNGPITQFVGFVQSAQYAKISFMRINEIHQLKDEDEMASIISN NITLPQNRNLIVSNLSFQYSPNAPLVLKNLNFIIPAGKVTAIVGDSGCGKSTLLKLLLRL YMPSYGEVCIGNMNINTISLRQWRAKCGCVMQDGKVFNDTIQNNIVLNDEDVDYEALQQA TQIANIAQEIERMPLGYQTAIGEMGRGLSGGQKQRLLIARALYKKPEYLFLDEATNALDT INEQKIVQALNSVFQNRTVIVIAHRLSTIKKADQIIVMKDGMVVEMGTHDSLMKNKRKYY 
ELIQSQYDLGTEK >gi|226332190|gb|ACIC01000130.1| GENE 138 184555 - 185082 223 175 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253571193|ref|ZP_04848600.1| ## NR: gi|253571193|ref|ZP_04848600.1| predicted protein [Bacteroides sp. 1_1_6] # 1 175 21 195 195 351 100.0 1e-95 MRKLKKISLKELEKEAICLDESELRLYMGGYDPNDCWWRCIAYINSCGSNYSADDAMEMA REYYGHCGSAFNENKYGFTGSSSDNRQCFNYFFGSGVDCGSSSREIFVFNPNLMEGMGIS PSGEYHAIVITRHEGSVMEYFDPQNRTYGQITQEQLDDYTARNGKSSFFRAGRSS >gi|226332190|gb|ACIC01000130.1| GENE 139 185599 - 185781 143 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253571194|ref|ZP_04848601.1| ## NR: gi|253571194|ref|ZP_04848601.1| predicted protein [Bacteroides sp. 1_1_6] # 1 60 160 219 219 117 100.0 3e-25 MVYPYNMEGKKYEDNFTRVRVVAIEKNGIDVFLYFAMTDQGIKDFDKYLMDFKGVFLFSK >gi|226332190|gb|ACIC01000130.1| GENE 140 185793 - 186446 180 217 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253571195|ref|ZP_04848602.1| ## NR: gi|253571195|ref|ZP_04848602.1| predicted protein [Bacteroides sp. 1_1_6] # 1 217 1 217 217 406 100.0 1e-112 MIRKNLYIVVLLLCSFTNLSAQNESSQFEQDLELIGLNFSLSDQYDLSDNMRMIYVTDDY KSPLRRLGTVHSILKNHDGQCNIFVYLSGANGLRYGGIIRKNKSLFKNVSSLTYNRIKTD FKYGYPGATEQDIEDLKMMMTFYPHDTATVLFNADYMMVYPFNMEGKKYQDKFTRTRVVV TGKDGLDVFFYFMMTDEGIKNFDKYLMDFKGIFLFSK >gi|226332190|gb|ACIC01000130.1| GENE 141 186647 - 187168 253 173 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253571196|ref|ZP_04848603.1| ## NR: gi|253571196|ref|ZP_04848603.1| predicted protein [Bacteroides sp. 1_1_6] # 1 173 1 173 173 338 100.0 6e-92 MKKLSKLRLPLIERELSVLNEEELRAYIGGQKQSCVFNCFDYLDGSLHSANDYYNWTKQG LGYEPDEHGNVDISDVGTIGGYGGFNVSKVNSGESFILRSDGQTSNGERVMAVFAVGSES NEEGHAVVISGFYRTDDERIVYYYYDPTSLCSGSILSDDCDSLYLVKHENPNT >gi|226332190|gb|ACIC01000130.1| GENE 142 187312 - 187950 284 212 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253571197|ref|ZP_04848604.1| ## NR: gi|253571197|ref|ZP_04848604.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 212 31 242 242 423 100.0 1e-117 MILDTTNMRTIPQFEEDLKLIDMKFIPSAQYDIFYDINLVSPCKEQDCPFKKIGLIHAIL TSHDKQCKLFVYIAGAINAHLGDIIKYNKKSFGIDTLSKYNRVKHDFDYGRPYSSASEQD IEDLKIMMTFYPQDTARSFFNADYMMTYPYNMKGKACQGVYSRTRVVVIERNGLDVFFYF TMTDVGIKDFDKYLMDLKGVLGFNAMTRQDGH >gi|226332190|gb|ACIC01000130.1| GENE 143 188040 - 188807 306 255 aa, chain + ## HITS:1 COG:no KEGG:BDI_0151 NR:ns ## KEGG: BDI_0151 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 4 225 2 227 263 200 45.0 3e-50 MKTKIDLWDVTFNIILRLDSIERLENIIASITFLNRHFNTNVTVWECSYRDNGFLKKLLD NARVSYVFKQDDDPILFRTHYLNQMIQETTTPIVSIWDTDVIAPVNQIIDAVNLLRLQEA DFVYPYDKLFLDTSIIIRNLYLESEDISLLMNNTKKMKQMYPPKPVGGAFFCNLKAYKES GLENEKFYGWGMEDGERYCRWEKMGYRIKRIDGPLFHLSHPRGINSDMHHPDQHVIKMRE FKAANRFTKKVGTDV >gi|226332190|gb|ACIC01000130.1| GENE 144 188800 - 189864 532 354 aa, chain + ## HITS:1 COG:no KEGG:BF4403 NR:ns ## KEGG: BF4403 # Name: not_defined # Def: putative glycosyltransferase # Organism: B.fragilis # Pathway: not_defined # 31 161 278 408 449 87 39.0 5e-16 MFKEIADYLLNIDKEKASGLLVGNVGVSLFLYQVYRKTGQTEFEETADMMLNNILEKQTM LSSTNFRDGLAGIGWGIEYLLQNNYYEGDSEEILEDVDNVIFKSVYEARLPLLGVNEGLC GYLLYLNSRIKNTESSNSVAHRINCGLFKVLINKIDKIAPDHFQSLTKDIHFDLLWEFPA LLVVLRQSLDLNLYNEKIINMVNQWMVYLGSYLPSMHCHRLSLLLALYSINTYIHSIEME QHIKVLAYSIDFNKLKSEVDPYAINIMHGWPGVVCLLSMASQYLNADYPNYHLFDKTRCE ILHQYLNGLEEQIDNLSKGISENKSISIGIVNGITGIGIIDLCYPEVMSVNSNK >gi|226332190|gb|ACIC01000130.1| GENE 145 189864 - 190640 253 258 aa, chain + ## HITS:1 COG:no KEGG:Pcar_2294 NR:ns ## KEGG: Pcar_2294 # Name: not_defined # Def: hypothetical protein # Organism: P.carbinolicus # Pathway: not_defined # 53 257 18 218 224 145 38.0 2e-33 MNKTVFLFATHKLTDFVLEQYYRLKLATKNIGVLYLLIEDGELQQIPVDVKYYSFSVDTL NELNYEPIEESIIPGSNHFPVLQFYKEHPEFLYYWNIEYDVYFNGKWESFFDPFEMIISD FISSHIEWYTQRPKWDWWKSVQFKTLSIPQNRYLKSFNPIYRISNRALKALDGVLSLRNS GHHEVVIPTVLDYLGYTINDFGGNSEFTLCELNIPCYLSNSELNNWYTESTMRYRPVFNK EAITRKGIDNMLYHPIKF >gi|226332190|gb|ACIC01000130.1| GENE 146 190646 - 191197 221 
183 aa, chain + ## HITS:1 COG:no KEGG:BDI_3162 NR:ns ## KEGG: BDI_3162 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 110 1 110 214 82 39.0 6e-15 MEKNGHIKLQSIANILLINIQHIDKLGLWNGKMGVILFFYHYGRYVDKSFYEDVADSLLD LVIEAVSRSAHSDSYALLSSVGIGIEYLLAQQFVESDSDDLLEDFDKYLLRGGSRASSLM QSLYMHVRKEQNILNLSGLLELKNELSPYSSMSIIEEIAQYPLTFSELAWEGLNILNQIS NRE >gi|226332190|gb|ACIC01000130.1| GENE 147 191202 - 192311 627 369 aa, chain + ## HITS:1 COG:glf KEGG:ns NR:ns ## COG: glf COG0562 # Protein_GI_number: 16129976 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-galactopyranose mutase # Organism: Escherichia coli K12 # 10 367 2 363 367 390 53.0 1e-108 MVEYMNMKRYDYLLVGAGLFNAILAYELSRIGKKCLVIDKRSHLGGNMYCDHIAGIDIHS YGPHIFHTNNKYIWDYVNNLCEFKPFVYSPLACYKGKLYNLPFNMHTFYELWGVKTSVEA QKRIKEQLIPLKNDNLEEYALSTFGSEIYETLIKGYTEKQWGQEATMLPAFILKRLPLRF TFNNNYFEDFYQGIPIGGYNSLFVNAFRTCELMLDTDFIQNRDLSKLAETVIFTGMIDEY YDYCCGVLEYRSLCFETETIYMEDFQSSVVVNYTEREVPYTRIIEHKHFEGKCTPNTIIT KEYPQEWHKGREPYYPVNTKRNESIYKEYKMRSILDQNVFFAGRLGTYRYMNMDQIIKEA LELFKQLKK >gi|226332190|gb|ACIC01000130.1| GENE 148 192319 - 194109 453 596 aa, chain + ## HITS:1 COG:all2289 KEGG:ns NR:ns ## COG: all2289 COG0463 # Protein_GI_number: 17229781 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Nostoc sp. 
PCC 7120 # 5 118 9 122 329 84 36.0 6e-16 MEGFSIIMPTYNQAHFIRRAILSVLEQTYKCWELIIIDDGCTDSTERYIQDFLGNSHIRY LRNEVNQGIGYSINRGLEVAKYDLIAYLPSDDYYYKDHLLIHKKEYDNDLDLFLTFTKAS SKIEDSFMKQNCINAQRYEICNLFCSSPLQLVQTVHRNTSEKWETCNDVISKDYYRMYWS KLVMLGGFKSINFLTCCWSIYSFKEQIQKEEVCLLERKFLSGNIDKHQKDAEACDIADSI KTPLKILLVGELSHNPDRILALKDAGCDLFGLWVEKPLWSNIGPIAGITDISLTEWQSEI KKIKPDIIYGLLNYMSVPLAHEVLMNTKEIPFVWHFKEGPYVCQYRNLWDKLMDLYTLSD GQIYINEECQAWFEQYISCPNRETSFILDGDLPSGKYLNNNFCSKLSKTDGEIHILNSGR LVGLSLRDINYLCQQKIHIHSYGSNSPLVHCAQQANPTYFHVHGYCSPKDWVTEYSQYDA GLSHCFLSKNYNELVRTSWDDLNIPAKMNTFAIAGLPMIQRNNSGHIVATQRIADTFDCN ISFSTLEELVETLRNTKRMAELTHHIMKERELFVFDTYVPALMDFFKVIIKMKELH >gi|226332190|gb|ACIC01000130.1| GENE 149 194120 - 194953 339 277 aa, chain - ## HITS:1 COG:no KEGG:Swit_4855 NR:ns ## KEGG: Swit_4855 # Name: not_defined # Def: hypothetical protein # Organism: S.wittichii # Pathway: not_defined # 9 266 71 321 349 66 25.0 9e-10 MHVDDLKIVVSCPIKNDNEYLIEFVEHYLNLGFNAIYLYDNNDDDSIIPSEILASYINKD KVKIINYRKQVFNDVWHRKDFFSSYDFDWVLFVDDDEFLELKHQENIKTFLARFDENATK IAFNNLHYGDNDKLYYEDGNVQDRFPRPLSLNSGTKNYKFNCAVKSLLKKVEIQTIQTIN AHTLIDHLPYYNADNRIINMRSFWRMDEADVSYGTAYIKHYCTKSLEEFVKCKIKRAMSN NMAYNERFDINSYYYMYNERTRKKDELYKLFREQYLS >gi|226332190|gb|ACIC01000130.1| GENE 150 195215 - 196222 1079 335 aa, chain - ## HITS:1 COG:aq_1866 KEGG:ns NR:ns ## COG: aq_1866 COG0136 # Protein_GI_number: 15606903 # Func_class: E Amino acid transport and metabolism # Function: Aspartate-semialdehyde dehydrogenase # Organism: Aquifex aeolicus # 2 331 4 336 340 367 58.0 1e-101 MKVAIVGVSGAVGQEFLRVLDERNFPMDELVLFGSKRSAGTTYTFRGKQIEVKLLQHNDD FKGVDIAFTSAGAGTSKEFEKTITKYGAVMIDNSSAFRMDADVPLVVPEVNAADALERPR GVIANPNCTTIQMVVALKAIEQLSHIKTVHVSTYQAASGAGAAAMDELYEQYRQVLANEP VTVEKFAYQLAFNLIPQIDVFTENGYTKEEMKMYNETRKIMHSDVKVSATCVRVPALRAH SESIWVETERPISVEEAREAFAKGEGLVLQDNPAEKEYPMPLFLAGKDPVYVGRIRKDLT NENGLTFWIVGDQIKKGAALNAVQIAEYLIKVKNV >gi|226332190|gb|ACIC01000130.1| GENE 151 196290 - 198029 1377 579 aa, chain - ## HITS:1 COG:no KEGG:BT_3637 NR:ns ## KEGG: BT_3637 # 
Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 579 1 579 579 1183 100.0 0 MAKGIPYGISDFSRLIDQNYYYVDKTHFIPQIEEIANYLFLIRPRRFGKSVFLTMMRAYY DISQKDRFEERFGNLWIGKHPTPLQGIYQVLYFDFSKANLGRGSLEDNFNNYCCIQLNDF FSTYREYYQGEKELEQIIASTEAATKLHWLEATARRLGHLLYLIIDEYDNFTNVILSEQG KNLFRDLTHASGFYREYFKQFKGMFERIFLMGVSPVTLDDLSSGYNIDWNISVDPHFNMM MGFSETDVREMFGYYQKEGMLKGDIDAMINEMKPWYDNYCFAEASLKTDPRMFNCDMTLY YLRHRVNFDASPKELIDKNIRTDYSKLKMLARLDHENVSGEDRMSTIEEIAAKGEILVNL HTSFPAEKIVETDNFRSLLYYYGLLTMCGTRGDRIVMCIPNNCVREQYFGFLREYYQQRK EVFLPHLKDLIDDMAFDGQWRPLFEYIALAYKENSAIRDAIEGEHNLQGFFKAYLALASY YLVQPELEMNYGYCDFFLLPDKSRYPDTAHSYILELKYLPRTATDKELEEQAEEGRGQLK QYSRDKKVTTISEGTALHCILLQFKGWELIKCEEVSPKC >gi|226332190|gb|ACIC01000130.1| GENE 152 198214 - 200349 2176 711 aa, chain + ## HITS:1 COG:all3567_1 KEGG:ns NR:ns ## COG: all3567_1 COG0475 # Protein_GI_number: 17231059 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, membrane components # Organism: Nostoc sp. 
PCC 7120 # 8 404 22 412 413 303 44.0 6e-82 MNLFEFNLALPITDPTWVFFLVLIIILFAPMILGRLHIPHIIGMILAGVLIGEHGFHVLD RDSSFELFGKVGLYYIMFLAGLEMDMEDFKKNRMKSVVFGLLTFLIPMALGIWSSMSMLG YGFLTAVLLASMYASHTLIAYPIISRYGLSRLRSVNITIGGTAITVTLALIILAVIGGMF KGTVDGLFWVFLVAKVAFLGFLIIFFFPRIGRWFFRKYDDSVMQFVFVLAMVFLGGGLME FVGMEGILGAFLAGLVLNRLIPHVSPLMNRLEFVGNALFIPYFLIGVGMIIDVRSLFTGG EALKVAVVMTVVATFSKWLAAWITQKIYRMQPNERSMIFGLSNAQAAATLAAVLIGHEII MENGERLLNDDVLNGTVVMILFTCVISSLVTERSARRFALNEDAQPEDKGAKKAMEQILI PVANPETIENLVNLALVIKDAKQKNGMIALNVINDNNSSENKELQGKRNLEKAAMIAAAA DVPVTMVSRYDLNIASGIIHTIKEYEATDVVIGLHRKANIVDSFFGNLAESLLKGTHREV MIAKFLMPVNTLRRIIIAVPPKAEFETGFSKWVEHFCRMGSILGCRVHFFANERTLMRLQ QLVKKKFSGTPTEFSLLEEWGDLLLLTGQVNYDHLFVVISARRGSISYDPSFERLPAQLS KYFSNNSLIILYPDQFGDPQEIVSFSDPRGYNESQHYEKVGKWFYKWFKKS >gi|226332190|gb|ACIC01000130.1| GENE 153 200349 - 201065 642 238 aa, chain + ## HITS:1 COG:FN0725 KEGG:ns NR:ns ## COG: FN0725 COG1179 # Protein_GI_number: 19704060 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 # Organism: Fusobacterium nucleatum # 8 238 4 231 234 170 41.0 2e-42 MEENNWQQRTELLLGEEKMKRIRASHVLVVGLGGVGAYAAEMLCRAGVGRMTIVDADTVQ PTNMNRQLPAMHSTLGMPKAEVLAARYKDINPDIELTVLPVYLKDENIPELLDSAKFDFI VDAIDTISPKCFLIYEAMKRHIKIVSSMGAGAKSDITQIRFADLWDTYHCGLSKAVRKRL QKMGVKRKLPVVFSTEQADPKAVLLTDDERNKKSTCGTVSYMPAVFGCYLAEYVIKRI >gi|226332190|gb|ACIC01000130.1| GENE 154 201075 - 201731 368 218 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 215 1 217 245 146 40 9e-34 MIKLEGITKSFGSLQVLKGIDLEINKGEIVSIVGPSGAGKTTLLQIMGTLDEPDAGTVAI DGTVVSRMKEKELSAFRNKNIGFVFQFHQLLPEFTALENVMIPAFIAGVSSKEANERAME ILAFMGLTDRASHKPNELSGGEKQRVAVARALINHPAVILADEPSGSLDTHNKEDLHQLF FDLRDRLGQTFVIVTHDEGLAKITDRTVHMVDGTIKKD >gi|226332190|gb|ACIC01000130.1| GENE 155 201934 - 202863 448 309 aa, chain - ## HITS:1 COG:TM1052 KEGG:ns NR:ns ## COG: TM1052 COG1555 # Protein_GI_number: 
15643810 # Func_class: L Replication, recombination and repair # Function: DNA uptake protein and related DNA-binding proteins # Organism: Thermotoga maritima # 184 289 47 162 181 65 34.0 1e-10 MWKDFFYFTKTERQGIIVLVVLVIGLYTIPALLQAFSGPEKTDPAEQAKSEKEYNEFISS IKEAKQDKKYPVCRDRYSSPAYPKKEIKPAAFDPNTADSATFLSLGLPTWMAGNILRYRR KQGRFRRPEDFRKIYGLTEEQYRTLQPYIRIAETPVLQDTSRILVVQATAPYDTLMKYPP GTIIDLNQADTTELKKIPGIGSRIARSIVNRRRLLGGFYQIEQLGEIRLKAEKLRSWFSV DAGKIHRININKASVERMMHHPYISYYQAKVIAEYRKKKGKVRDLKQLMLYEEFTPADFE RMAPYVCYD >gi|226332190|gb|ACIC01000130.1| GENE 156 202863 - 204233 1161 456 aa, chain - ## HITS:1 COG:BH1128 KEGG:ns NR:ns ## COG: BH1128 COG0733 # Protein_GI_number: 15613691 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Bacillus halodurans # 6 449 9 446 453 327 45.0 3e-89 MTKIDRANFGSKLGVILASAGSAVGLGNIWRFPYETGNHGGAAFILIYLGCVFLLGLPIL IAEFLIGRRSRANTAGAYQKLAPGTHWRWVGRMGVLAAFLILSYYSVVAGWTLEYVYEAL TNGFTGKTPTEFISSFQQFSSSPWRPVLWLVLFLLVTHFIIVKGVEKGIEKSSKIMMPTL FIIILILVICSVTLPGAGAGIEFLLKPDFSKVDGNVFLSAMGQAFFSLSLGMGCLCTYAS YFSKETNLTKTAFSVGIIDTFVAVLAGFIIFPAAFSVGIQPDAGPSLIFITLPNVFQQAF SGVPILAYIFSVMFYVLLAMAALTSTISLHEVVTAYLHEEFNFTRGKAARLVTGGCIFLG ILCSLSLGVMKGFTIFGLGIFDLFDFVTAKIMLTLGGLCISIFTGWYLDKKIVWSEITND GSLKVPVYKLIIFILRYIAPIAISLIFINELGLIKL >gi|226332190|gb|ACIC01000130.1| GENE 157 204337 - 204729 294 130 aa, chain + ## HITS:1 COG:no KEGG:BT_3643 NR:ns ## KEGG: BT_3643 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 130 1 130 130 263 100.0 2e-69 MLSYIKKYPISLFIILTVIYLSFFKPPKTDLNEIPNLDKLVHICMYFGMSGMLWLEFLRA HRRDDAPLWHAWVGAFLCPILFSGCVELLQEYCTTYRGGDWLDFAANSVGAILASLVAYY VVRPRMMRRE >gi|226332190|gb|ACIC01000130.1| GENE 158 204883 - 206181 1238 432 aa, chain - ## HITS:1 COG:BS_murF KEGG:ns NR:ns ## COG: BS_murF COG0770 # Protein_GI_number: 16077524 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide synthase # 
Organism: Bacillus subtilis # 16 432 30 451 457 226 36.0 5e-59 MKLSALYQIFLDCQLVTTDSRNCPEGSLFIALKGESFNGNAFAGKALETGCAYAVIDESE YAIEGDQRYILVDDCLQTLQQLANYHRRQLGTRVIGITGTNGKTTTKELISAVLSRSHNI LYTLGNLNNHIGVPSTLLRLKTEHDLAVIEMGANHPGEIKFLSEIAEPDCGIITNVGKAH LEGFGSFEGVIKTKGELYDFLRKKEGSTVFIHHDNAYLMNIAEGLNLIPYGTEDDLYVNG RITGNSPYLTFEWKAGKDGKSYQVQTQLIGEYNFPNALAAITIGRFFGVEDAKINEALSG YTPQNNRSQLKKTDDNTLIIDAYNANPTSMMAALQNFRNMEVPHKMLILGDMRELGAESA AEHQKIADYLKECAFEKVWLVGDQFAAAEHSFQTYPNVQEVIKELEADKPKGYTILIKGS NGIKLCSVVDYL >gi|226332190|gb|ACIC01000130.1| GENE 159 206256 - 208271 1755 671 aa, chain - ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 429 667 64 310 328 138 33.0 4e-32 MSRLSRIILGYLVSFLFIVIPAKAQKEAEVYNVDSSLYAYYQRCQEHLLEPVVLSMSDTL FRMASGQNDQRMQAVALSTRLDYYYYQANNEDSIIFYTNKVKQFAKEIQQSKYYYFAWAN RLILYYLKTGRSNIALYEAEKVLKEAQAEDNKTGLMYCYNIMSQIYTIKNFDVMASEWRV KEIELTEKYKLENYNISNTYAQLANYYITHHQPEPALEALEKAVKTANSASHKILAKLTY VEYYSEFKDFSAAEKMLKECRELFDQDKRLDSLKKRFYRTEAFYYQRSKQFVKALEAVEK QKTEEQRLNEYAMSSGQYRLKGEIYRQMGRPDEAVKFLKKYIEVDDSLKIANEQLASSEY ATLLNVEKLNAEKKELMLRTKEKELSNKTTLIFSLIILLAILFLFLYRESRLNRRLKASE SELRNKNEELTYSREELRKARDQAEATSQMKTTFIQSMSHEIRTPLNSIIGFSQVLSDHY GNEETKEFVNIIKANTDDLLRLVTDVLALSELDQYNKLPTNIQTDINMICQLSIEVAKMQ KQDTVEFLFKPEREKLIILSNPERISQLLANLAHNAIKFTTHGSIELTYSVLEAEKKLEI SVTDTGIGIPKEKQEKVFERFYKMNSFSQGTGLGLSISRTIAEKLGGSLHIDSSYTDGCR MVLTLPLVYAE >gi|226332190|gb|ACIC01000130.1| GENE 160 208399 - 209262 826 287 aa, chain + ## HITS:1 COG:VC0638 KEGG:ns NR:ns ## COG: VC0638 COG0294 # Protein_GI_number: 15640658 # Func_class: H Coenzyme transport and metabolism # Function: Dihydropteroate synthase and related enzymes # Organism: Vibrio cholerae # 16 279 11 274 278 227 43.0 2e-59 MKLISPTYINVKGRLLDLATPQVMGILNVTPDSFYSGSRMQTQEEIAARARQIIDEGASI IDIGAYSSRPNAEHITAEEEMNRLRTGLEIVNRNHPDAIVSVDTFRADVAEQCVEEYGVA 
IVNDIAAGEMDNRMFETVARLGVPYIMMHMQGTPQNMQKEPHYDNLMKEVFMYFARKVQQ LRELGLKDIILDPGFGFGKTLEHNYELMAHLEEFSIFELPLLVGVSRKSMIYKLLGGTPQ DSLNGTTVLDTVALMKGANILRVHDVREAVEAVRIVDKLRTESEYGK >gi|226332190|gb|ACIC01000130.1| GENE 161 209337 - 210101 801 254 aa, chain + ## HITS:1 COG:BS_ybbP KEGG:ns NR:ns ## COG: BS_ybbP COG1624 # Protein_GI_number: 16077243 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 11 253 16 252 273 163 37.0 3e-40 MFFDVGIKDFIDVLLVALLLYYTYKLMKASGSIKVFTGILVFILIWLVVTQILEMKLLGS IFDTLMNVGVIALIVLFQDEIRRFLLTLGSHRHVSALAHFFSGARKEALKHDDIMPVVMA CLSMGKQKVGALIVIEHTTPLDEVVRTGEIIDAAISQRLIENIFFKNSPLHDGAMVISKK RIKAAGCILPVSHDLNIPKELGLRHRAAMGISQKSDAHAIIVSEETGAISVAYRGQFYLR LNAEELESLLTKEN >gi|226332190|gb|ACIC01000130.1| GENE 162 210120 - 211661 1584 513 aa, chain + ## HITS:1 COG:PM0343 KEGG:ns NR:ns ## COG: PM0343 COG0312 # Protein_GI_number: 15602208 # Func_class: R General function prediction only # Function: Predicted Zn-dependent proteases and their inactivated homologs # Organism: Pasteurella multocida # 56 513 19 482 482 336 40.0 6e-92 MDRRNFLKTGGIALLGSLATGSAMALTTPGALGGEGADKKAAVALAMNHFGVSEADLKKV LAAALEKGGDYADLFFEHTISNSIRLMDGAVNNSYSNIDYGVGVRVLTGDQSGYAYVENI TVEDMLKAARTAARIASANKGNKPLNLTEKELKKNYYTVASPWEEVSLKDKMPYLQKLND RIFALDKRVHKVQASQSDTTSHIFFCNSEGVMFYDYRPMVTLGAVCIMEEDGKTENGYAA RAYRRGFDFLSDEVVDVIAREAVDQTSLLFKAIKPKGGEMPVVMGAGGSGILLHEAIGHT FEADFNRKNVSIFADQLNKKVCNEHINVVDDGTIPFNRGSVNFDDEGAEGQKTYLIKDGV LTSYLHDRISAKHYGVEPTGNGRRESFRNMPIPRMRATYMEAGNVSESDIISSVKKGIFV DNFTNGQVQIGAGDFTFFVKSGYMIEDGKLTQPIKDINIIGNGPKALADITMVATNAKID NGTWTCGKDGQSCPVTCGMPSALVSKLTVGGEN >gi|226332190|gb|ACIC01000130.1| GENE 163 211693 - 213012 1432 439 aa, chain + ## HITS:1 COG:NMB0839 KEGG:ns NR:ns ## COG: NMB0839 COG0312 # Protein_GI_number: 15676735 # Func_class: R General function prediction only # Function: Predicted Zn-dependent proteases and their inactivated homologs # Organism: Neisseria meningitidis MC58 # 9 438 13 442 443 199 30.0 9e-51 
MITDENKKLAQWAMDYALKNGCQAAKVLLYSSSNTSFELRDAKMDRLQQASEGGLSLSLY VDGRYGSISTNRLNRKELETFIKNGIDSTRYLAKDEARVLADPSRYYKGGKPDLKLYDAK FASLNPDDKIEMAKAVAEEALGKDERIISVGSSYGDGEDFAYRLISNGFEGETKSTWYSL SADITIRGEGEARPSAYWYESSLYMNDLIKKGIGQKALERVLRKLGQKKVQSGKYTMVVD PMNSSRLLSPMISALNGSALQQKNSFLLNKLNEKIASDRLTLTDEPHLVKASGARYFDNE GIATERRSIFDKGVLNTYFIDTYNAKKMGVDPTISGSSILVMETGDKNLDGLIAGVEKGI LVTGFNGGNNNSSTGDFSYGIEGFLIENGKLTQPVSEMNVTGNLITLWNSLVATGNDPRL NSSWRIPSLVFEGVDFSGL >gi|226332190|gb|ACIC01000130.1| GENE 164 213110 - 213307 87 65 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLLLLCVYQFSIVISYLILTQAKVNSIPQISKQRGGINTGKQKPDNTLINKALVRQKTGY SRVYL >gi|226332190|gb|ACIC01000130.1| GENE 165 213306 - 213866 448 186 aa, chain + ## HITS:1 COG:SPy0330 KEGG:ns NR:ns ## COG: SPy0330 COG1704 # Protein_GI_number: 15674491 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pyogenes M1 GAS # 13 177 13 177 185 118 37.0 7e-27 MGIIYFGSGLILLIALWYIWAVNNLIAKRNRVKQCRSGICVALKQRNDMIPNLVAAVKSY MGHENEILTRITELRSHSFQPSQEAEQIKSGNELSGLLTKLQLSVENYPELKANEQFHRL QNSIEDMELQLQAIRRTYNAAVTDYNNTIEMFPSSVVARQQGHHQEELIDIPEPEQQNVN VAELLR >gi|226332190|gb|ACIC01000130.1| GENE 166 213874 - 214905 738 343 aa, chain + ## HITS:1 COG:no KEGG:BT_3651 NR:ns ## KEGG: BT_3651 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 43 343 1 301 301 610 100.0 1e-173 MDTIDFKKLSEQLRSELLRVNGWRKVVRTGSIAVYLFVFCWFMFVLFGGALVSYIGLENY MQFTQYIIPVFIGFIVVNFVFTRCMANFQERESEAMQHIMSTLFPTVYFSASSQVDSRIL RDSKLFSASFSDPALAANTYGYIQFPHGEHSLHVADIGVSYGLLNKLQYNPVLGYFVMIY RFVLRPLFASRLDSSPHNFRGMFGWCKIDKRFKGNIIILPDHLEQKIGYLAKNIQGLKKR YSARLVQLEDQEFENYFAVYADDEVEARMLLTPAMMRHMTALHQTFGCDIMLSFSRGTFY YAAVMPSGFLCVRPSALNDGKLLEDIYNDISLSCKVAEELRLN >gi|226332190|gb|ACIC01000130.1| GENE 167 214934 - 215722 783 262 aa, chain + ## HITS:1 COG:no KEGG:BT_3652 NR:ns ## KEGG: BT_3652 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 262 1 262 
262 528 100.0 1e-149 MTDMSWMPWVVGGVMVLSFLYMKVWPFVRTIIRAFRGPRFKSKSKLSVEQYKKLSIGSLY ALQQGGYLNTLSLDIKDKLPTILGEWWGINNAHDARETLDDLCRKGYDYYFPFVYEAFLL DDENAQDDIFQQNMESQEDYEKAVGQLQNLKEVYEELIAYEVITSKEDIARYGVIGWDAG RINFVARACCDMKYISEMEAWNYIDKAYELAHSSFTSWHDMAMSYVIGRAIWGGTNAHNL GMKGMADDLLSNPKSPWVQIKW >gi|226332190|gb|ACIC01000130.1| GENE 168 215760 - 216326 537 188 aa, chain + ## HITS:1 COG:no KEGG:BT_3653 NR:ns ## KEGG: BT_3653 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 188 1 188 188 377 100.0 1e-103 MEQKNNKYYSYISEYYAANPTKFEKRKNNLKPVLYLGLAVVGAVLAIFPGLLPLAGWLVR TAGIIMTLVCLIAAYLNNFDIYNFQSGGKVKSMGVKKFKRDETNPAKIVEAFLSRNFEYL ADLPGGRSEPVQLHIEEDATGREMYCLLTTYDSDSNIVGLADVITLSGNDYDDNIDLIKQ MFKDEEDN >gi|226332190|gb|ACIC01000130.1| GENE 169 216624 - 218960 2412 778 aa, chain + ## HITS:1 COG:XF0840 KEGG:ns NR:ns ## COG: XF0840 COG1874 # Protein_GI_number: 15837442 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Xylella fastidiosa 9a5c # 29 608 28 602 612 432 38.0 1e-120 MRHRFIALLVLFTVIFFSSAEAQTTARKFEAGKNTFLLDGKPFVVKAAELHYTRIPQAYW DHRIEMCKALGMNTICIYIFWNIHEQEEGKFDFTGQNDIAAFCRAAQKHGMYVIVRPGPY VCAEWEMGGLPWWLLKKKDVALRTLDPYYMERVGIFMKEVGKQLAPLQVNKGGNIIMVQV ENEYGSYGTDKPYVSAVRDLVRESGFTDVPLFQCDWSSNFTRNALDDLIWTINFGTGANI DQQFKKLKELRPETPLMCSEFWSGWFDHWGRKHETRPAKDMVQGIKEMLDRNISFSLYMT HGGTTFGHWGGANNPAYSAMCSSYDYDAPISEAGWTTEKYYLLRDLLKTYLPAGEALPEV PAAMPVIEVPEFHFTKVAPLFSNLPDAKQSVDIQPMEQFNQGWGTILYRTTLPEAVTSGT TLKITEVHDWAQIYADGKLLARLDRRKGEFTTTLPALKKGTQLDILVEAMGRVNFDKSIH DRKGITEKVELLSGNQVKELKNWTVYNFPVDYSFIKNKNYKDTKILPIMPAYYRSSFKLD KVGDTFLDMSTWGKGMVWVNGHAMGRFWEIGPQQTLFIPGCWLKEGENEILVLDLKGPTK SSIKGLKKPILDVLREKAPETHRKDGEKLKLTGEKVAHEGAFTPGNGWQEVRFAAPVKGR FFCLEALSPQANNNIAAIAEFDVLGADGKPVSREHWKIRYADSEETRSGNRTADKIFDLQ ESTFWMTVDNVAYPHQLVIDLSKVETVTGFRYLPRAEKDFPGMIREYRVYVKPSDFKY >gi|226332190|gb|ACIC01000130.1| GENE 170 219031 - 220002 794 323 aa, chain + ## HITS:1 COG:no KEGG:BT_3655 NR:ns ## KEGG: BT_3655 # 
Name: not_defined # Def: arabinosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 323 1 323 323 619 100.0 1e-176 MNMKKYTLLLALLLVGVLTGYSQQSAYLFVYFTGNRMSEEAIRMAVSPDGYNYYALNGNQ PVIDSREISSTGGVRDPHILRCEDGKTFYMVVTDMVSGNGWSSNRAMVLLKSKDLVNWTS NIVNIQKKYPNQEDLKRVWAPQTIYDKEAKKYMVYWSMQHGNGPDIIYYAYANKDFTDIE GEPKTLFLPKNGKSCIDGDIIYKDGLYHLFYKTEGDGNGIKKATTASLTSGQWTESEDYK QQTKEAVEGAGIFPLIGTDKYILMYDVYMKGKYQFTESTDLENFKVIDNAISMDFHPRHG TVMPITDKELKRLYKAYGKPDKM >gi|226332190|gb|ACIC01000130.1| GENE 171 220044 - 222479 2151 811 aa, chain + ## HITS:1 COG:CC0813 KEGG:ns NR:ns ## COG: CC0813 COG3507 # Protein_GI_number: 16125066 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Caulobacter vibrioides # 26 488 65 540 540 397 44.0 1e-110 MKKLLLFILLVCATVQAYSQEYPKVVLSGDYPDPSVMRDGEDYYMTHSPFYYAPGFLIWH SRDLMNWEPVCRVMPEYEGSAMAPDLLKYKGKYYIYYPAKGTNWVIWANDIKGPWSKPID LKVSGIDPGHIADQEGNRYLYVDKGEVIRLTDDGLATIGQKQKVYEGWRYPNHWETECMC LESPKLNYHNGYYYLTSAQGGTAGPATSHMAVAARSKSVTGPWENSPYNPVVHTYSAHDN WWSKGHGTLIDDVNGNWWIVYHAYAKGYHTLGRSTLIEPIEWTADGWYRTKSTATPIKTD PSIKHGLSLSDDFEGPEPGLQWTFWKEYAPQSLSFKKQTLWIDAKGSTPSDARLLLATAE DKNYETQVEVNVGKGNTAGLLLYYSEKAYAGVVSDGKNFTIYRNAENSFTLPNKLGKRFL AKIQNQGNSVRIAVSKDGKEWTTLVENMDVSQLHHNNYGGFYALRIGLLSSGKGSAGFRQ FRYRNAIPQEKDMGAYLMVFHKDETHSLYMAVSDDGYTFTALNDGKPVIAGDTIALQKGI RDPHIFRGPDGAFYLSMTDLHIYAQKDGFRDTEWERDGKEYGWGNNRGLVLMKSWDLINW KRTNARFDLLSAGLGEIGCVWAPEVTYDDKKGKLMIYFTMRFKNEANKLYYVYVNDDFDR IETLPQILFEYPNEKISAIDGDITKVGDRYRMFYVSHDGGAGIKQAVSDRINGDYEYDPR WYDFEPRACEAPNLWKRIGEDKWVLMYDVYGINPHNFAFIETSDFVNFKNLGRFNEGVMK TTNFSSPKHGAVIHLTKEEAAKLRSHWENRK >gi|226332190|gb|ACIC01000130.1| GENE 172 222550 - 225027 2186 825 aa, chain + ## HITS:1 COG:CAC3436 KEGG:ns NR:ns ## COG: CAC3436 COG3534 # Protein_GI_number: 15896677 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Clostridium acetobutylicum # 206 718 51 549 835 412 42.0 1e-114 
MKHLLLLFSVLLLSLQPAAFAATHETTATPDSVSLFAYATRGDDGRSGLCFAWSMDGKHW FEIGRNYGYLRCDYSRWGSQKKMLDPNLKQLPGGEWLCVWKLNDHDGYGQARSKDLIYWE AQQYPRTTSDFEGTRVKAKIAGHEETGTVSQVPWSVVDGLTQTYERNQYRNSLYGERPVQ DKERFAGLKSVKATVTAQPEETKEISDLLMGIFFEDINYSADGGLYAELIQNRDFEYEPS DREGDKNWNSTHSWKLEGENATFTISTSDPIHPNNPHYAVLKTNQPGAALTNTGFDGIAL KAGEKYDFSLFARIPEGSKSGKLLVRLVDADGTVQGETTVTVSSRSWKTYKAVLTAKASA DTHLELHPQSAGEIELDMISLFPQNTFKGRKNGLRPDLAQTLADMHPRFVRFPGGCVAHG DGLKNIYQWKNTIGPLEARKPARNLWGYHQSMGLGYYEYFQFCEDIGAEPLPVLAAGVPC QNSACHGDLRGGQQGGIPMSEMGAYIQDILDLIEWANGDAKKTKWGKIRAESGHPKPFNL KYIGIGNEDLITDVFEERFTMIYLAIKEKYPEIIVVGTVGPFNEGTDYVEGWKLADKLGV PMVDEHYYQSPGWFLHNQDFYDKYDRSKKTKVYLGEYATHIPGRRANMETALTEALYLAA LERNGDVVHMSSYAPLLAKDGRTQWNPDLIYFNNREVRPTTGYYVQKLYGQNAGDHYIPS QINLDNQDGRVKLRVGSSIVRDSKTGDVIVKLVNLLPVSIETDVRLPGMDGIQSSATRTV LAGAPETTPLPVTDTIEAGTSFKQELPAYSFTVIRLKTQKGKNKQ >gi|226332190|gb|ACIC01000130.1| GENE 173 225935 - 228409 2299 824 aa, chain - ## HITS:1 COG:SSO3022 KEGG:ns NR:ns ## COG: SSO3022 COG1501 # Protein_GI_number: 15899728 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Sulfolobus solfataricus # 41 821 6 726 731 501 37.0 1e-141 MKMNLRKTGILLAWAMIIAANGTAQNVQRTSQGIKYAAQGMNVSVEFYSPSIIRVYKTPG ETASDKESLVVIKAPEQTPVSFGENGKNVTLSSQMIQVEVNPETGGIHFFDKLGQRLLTD KDYGTQFTPFNDAGVPSYNVRQAFLLDKDEVIYGLGQQQTGKVNQRNQKLFLRNQNMSIC IPFIHSIKGYALYWDNYSPTTFLDNPQETSFDSEVGDCADYYFIYGGNADGVIAGVRDLT GQAPLYPLWTLGFWQCRERYKSPDELCEVVDKYRELKVPLDGIIQDWQYWGCNENWNSMK FQNPRYINKMGDPEYMKFLPNGEDKNADYGTPRIKSPKEMIDYVHKQNAHIMISVWASFG PWTEMYQKMDSLKALLHFETWPPKAGVKPYDPYNPAARDIYWEAMKKNIFDLGMDAWWLD STEPDHMDIKDQDFNTQTYLGSFRRVHNAFPLMSNKGVYEHQRATTSDKRVFLLTRSSFL GQQRYASHSWSGDVTSEWSVMRKQLAAGLNYALCGIPYWNTDLGGFFAWRYNNNVNNIAY HELHVRWYQWGVFQPIMRSHNSSPVAVEIYQFGKKGDWSYDALEKYTHLRYRLLPYIYST SWEVTSKAGSIMRPLMMDFPKDKKVLDMDTEYMFGRNFLVRPVTDSLYTWQDKKQNGHQK DMSKIGKTDVYLPQGARWIDFWTGQTLEGGQTLQREVPIDIMPIYVRAGSILPWGTAVQY STEKKWDNLTIRIYPGANAEFTLYEDEFDNYNYEKGAYSTITLKWNDQDRTLTIDDRKGS 
YKGMLKSRKFNLIVVEPGKGCGDGDSTTFDKSISYRGKKVIAKL >gi|226332190|gb|ACIC01000130.1| GENE 174 228494 - 231208 1625 904 aa, chain - ## HITS:1 COG:mll6236 KEGG:ns NR:ns ## COG: mll6236 COG4977 # Protein_GI_number: 13475210 # Func_class: K Transcription # Function: Transcriptional regulator containing an amidase domain and an AraC-type DNA-binding HTH domain # Organism: Mesorhizobium loti # 799 900 223 323 356 64 34.0 1e-09 MAANNQKNTRILLFLLLCIRMTLTVDANDFLFTSINTAQGLSDNQIRYILQLPDGRMVFT TNGNVNLYDGVHFSYLHRTAEDISPLKQYEGCYRIYQCGDSLLWIKDYHKLMCIDLHQEK YIPHLDSYFRGKEIHEPVEDLFVDNSGRIWLLTSQELRELETGIRLTLPENEHRKLQDIH SDELAVYLFYHTGEVVCYDPKTRKQLYDRAAYPASEQEDFNRTSLVVKSQKGFYQLRNGR KGGFFYFNPQNQTWEKLFEQNYTLNTLIITPQGDKAYISCVHGFWIIDLQSREQQYIPVL ETKKGKLVSTEVSTIFQDAQGGLWVGTLNRGLLYHHPATHKLIHVSRTAFPASPEEDIAV KAFAEDDNGNIYLKSHSTIYQLAVNQDGSRTLLPVPALSVSAELAKELDQRGGNKYYRRQ SYTALCTDTRGWTWAGTADGLELFTDQEQTPDSKPTTQTADNHPASSNTQAHRIFYREDG LSNNFVQSIIEDRNNYIWVTTSNGISRIHINPENQEIGFTNFNQLDGALEEEYIQGAIFE SSDGTLYFGGIDGFTIFRPDHESATPDLPYKPVFTNLHLYGEKVNTGEKYGNRIILPQAA PYTKELELDYNQNFLTFECSALNYVNSERTYYRYQLEGIDKQWMNAFASGQGNATVGNGI LQASYTNLPPGEYTFKVAASDNQLQWNDNITKIHVIIHAPWWKTTTAYVLYTVIVLLIVL ASIRLYIYQTKKKMKQQHKEEILLLRIRNLIEQCNSYEAEQKARSEKKSHPSVCDRNETA DDNEKPNGIEPDSAESAFLVQAIELVEKNLDVNGYSVEQLSRDLCMERTGLYRKLVTLLD QSPSLFIRNIRLQRAAQLLAEGKLSVTEIAERTGFSSSSYLSKCFQEMYGCRPSEYTEKA KKST >gi|226332190|gb|ACIC01000130.1| GENE 175 231436 - 233379 1643 647 aa, chain + ## HITS:1 COG:no KEGG:BT_3661 NR:ns ## KEGG: BT_3661 # Name: not_defined # Def: alpha-glucosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 647 1 647 647 1376 100.0 0 MKKLVSTLTAILGISTLAAQDVVVKGPDEKLQLAVFVQNETKPCYSVSYNGKTMLEKSPL GMNTNIGDFTKNLKLTGHSVDKIDTVYQQTRIKVSNVHYRANELTCHLENEQGQKLGVIF RVSDNDVAFRYTLPHQGGKASVTVKEEQTGFRFPEQTTTFLCPQSDAMIGWKRTKPSYEE EYKADAPMSDRSQYGHGYTFPCLFRIGNDGWVLVSETGVDSRYCGSRLSDVSEGNLYTVA FPMAEENNGNGTVAPAFALPGATPWRTITVGDHLKPIVETTVPWDVVSPLYETKHDYRFG RGTWSWILWQDGSINYDDQVRYIDFASAMGYEYALIDNWWDTRIGHQRMKSLVEYARDKG 
VELFLWYSSSGYWNDIEQGPVNRMDNAIIRKREMKWLQSLGVKGIKVDFFGGDKQETMRL
YEDILSDADDHGLMVIFHGCTLPRGWERMYPNYVGSEAVLASENMVFNQHFCDEEAFNTC
LHPFIRNTVGSMEFGGCFLNKRLNRNNDGGTTRRTTDVFQLATTVLLQNPVQNFALAPNN
LKDVPAVCMDFMKRVPTTWDETRFVDGYPGKYVVLARRQGDTWYLAAVNAGKEPLKLKLD
LEMFAGKTVALYKDDKKGEPELTSLKVKENGKVQLEIRPQGGILCIK
>gi|226332190|gb|ACIC01000130.1| GENE 176 233551 - 233860 239 103 aa, chain + ## HITS:1 COG:no KEGG:BT_3662 NR:ns ## KEGG: BT_3662 # Name: not_defined # Def: putative endo-1,4-beta-xylanase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 103 1 103 461 215 100.0 4e-55
MQKQIFFSFVLVMMLVGKVKAQHNNPFGNALIPDMIADASIQEINGVFYCYATTDGYGQG
LKTSGPPVVWKSKDFVHWSFDGTYFPSAAKEKYWAPSKAIFAN

Prediction of potential genes in microbial genomes
Time: Thu May 12 02:54:57 2011
Seq name: gi|226332189|gb|ACIC01000131.1| Bacteroides sp. 1_1_6 cont1.131, whole genome shotgun sequence
Length of sequence - 49904 bp
Number of predicted genes - 34, with homology - 33
Number of transcription units - 15, operones - 11 average op.length - 2.7

   N  Tu/Op    Conserved    S      Start     End Score
               pairs(N/Pv)
   1   1 Op 1  .            + CDS     22 -  1062   660  ## BT_3662 putative endo-1,4-beta-xylanase
   2   1 Op 2  .            + CDS   1077 -  1175    63  ##
                            + Prom  1231 -  1290   3.1
   3   2 Op 1  .            + CDS   1317 -  2678   508  ## COG3507 Beta-xylosidase
   4   2 Op 2  .            + CDS   2690 -  4606   665  ## BT_3664 putative alpha-glucosidase
                            + Term  4613 -  4659   2.0
                            + Prom  4620 -  4679   6.9
   5   3 Op 1  .            + CDS   4725 -  6368   871  ## COG3669 Alpha-L-fucosidase
   6   3 Op 2  .            + CDS   6380 -  6769   365  ## BT_3666 hypothetical protein
                            + Term  6810 -  6847   1.1
   7   4 Tu 1  .            - CDS   6846 -  7706   375  ## COG2207 AraC-type DNA-binding domain-containing proteins
                            - Prom  7746 -  7805   9.9
                            + Prom  7738 -  7797   9.6
   8   5 Op 1  .            + CDS   7923 -  9329   782  ## BT_3669 hypothetical protein
   9   5 Op 2  .            + CDS   9395 - 12469  2302  ## BT_3670 hypothetical protein
  10   5 Op 3  .            + CDS  12494 - 14569  1325  ## BT_3671 hypothetical protein
  11   5 Op 4  .            + CDS  14601 - 15560   703  ## BT_3672 hypothetical protein
                            + Prom 15591 - 15650   6.0
  12   6 Tu 1  .            + CDS  15676 - 16527   630  ## BT_3673 TonB
  13   7 Op 1  1/0.000      - CDS  16656 - 18662  1632  ## COG3533 Uncharacterized protein conserved in bacteria
  14   7 Op 2  .            - CDS  18700 - 19701   698  ## COG3507 Beta-xylosidase
                            - Prom 19721 - 19780   8.4
                            + Prom 19751 - 19810   8.3
  15   8 Op 1  .            + CDS  19913 - 21673  1021  ## BT_3676 hypothetical protein
  16   8 Op 2  .            + CDS  21670 - 23691  1446  ## COG4289 Uncharacterized protein conserved in bacteria
                            + Term 23705 - 23753   7.4
                            + Prom 23699 - 23758   7.4
  17   9 Tu 1  .            + CDS  23925 - 28070  2051  ## COG0642 Signal transduction histidine kinase
                            + Term 28189 - 28219  -0.4
                            + Prom 28624 - 28683   5.9
  18  10 Op 1  .            + CDS  28709 - 30001   829  ## BT_3679 hypothetical protein
  19  10 Op 2  .            + CDS  30022 - 33123  2139  ## BT_3680 hypothetical protein
  20  10 Op 3  .            + CDS  33136 - 35151  1438  ## BT_3681 putative outer membrane protein
  21  10 Op 4  .            + CDS  35173 - 36060   425  ## BT_3682 hypothetical protein
                            + Term 36086 - 36140  10.3
                            + Prom 36232 - 36291   8.8
  22  11 Tu 1  .            + CDS  36353 - 38248  1353  ## COG2273 Beta-glucanase/Beta-glucan synthetase
                            + Prom 38261 - 38320   6.7
  23  12 Op 1  .            + CDS  38502 - 39569   788  ## BT_3685 hypothetical protein
  24  12 Op 2  .            + CDS  39573 - 40868   903  ## BT_3686 hypothetical protein
  25  12 Op 3  .            + CDS  40873 - 42018   934  ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins
                            + Term 42081 - 42128   1.7
  26  13 Op 1  .            - CDS  42001 - 43020   608  ## COG4552 Predicted acetyltransferase involved in intracellular survival and related acetyltransferases
  27  13 Op 2  .            - CDS  43034 - 43951   831  ## COG4866 Uncharacterized conserved protein
  28  13 Op 3  23/0.000     - CDS  44007 - 44702   861  ## COG1346 Putative effector of murein hydrolase
  29  13 Op 4  .            - CDS  44699 - 45142   324  ## COG1380 Putative effector of murein hydrolase LrgA
                            - Prom 45295 - 45354   7.1
                            + Prom 45072 - 45131   7.3
  30  14 Op 1  21/0.000     + CDS  45339 - 46358  1343  ## COG0280 Phosphotransacetylase
  31  14 Op 2  .            + CDS  46387 - 47586  1379  ## COG0282 Acetate kinase
                            + Term 47599 - 47657  13.1
                            + Prom 47604 - 47663   9.1
  32  15 Op 1  .            + CDS  47745 - 48836   742  ## COG0463 Glycosyltransferases involved in cell wall biogenesis
  33  15 Op 2  .            + CDS  48848 - 49543   549  ## COG2003 DNA repair proteins
  34  15 Op 3  .            + CDS  49591 - 49902   434  ## COG2151 Predicted metal-sulfur cluster biosynthetic enzyme

Predicted protein(s)
>gi|226332189|gb|ACIC01000131.1| GENE 1 22 - 1062 660 346 aa, chain + ## HITS:1 COG:no KEGG:BT_3662 NR:ns ## KEGG: BT_3662 # Name: not_defined # Def: putative endo-1,4-beta-xylanase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 346 116 461 461 693 100.0 0
MYPAVADKPEGPFKLARGKDEFYKPFTPSTLLQSKNPGGIDAEIFVDDDGQAYVFWGRRH
VAKLNEDMITVDSVVQVISTPRKEYSEGPIFFKRKGIYYYLYTIGGDEKYQYAYVMSRVS
PMGPFEAPEQDIISTTNYERGIFGPGHGCVFHPEGTDNYYFAYLEFGRRSTNRQTYVNQL
KFNEDGTIRPVELTMDGVGALKKVKSDKKMKIDTVYASSIEVPLKIEPMKDPTCLRTEYF
VPSFAVDGANGSRWMAAAEDSINPWIVADLGTVKKVRRSEIYFVRPTAGHAYVIEASMDG
KVWQEFAVHQDRKMCSPHTDVLNKRFRYLRIKILKGVPGIWEWNIY
>gi|226332189|gb|ACIC01000131.1| GENE 2 1077 - 1175 63 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no
MNIKMNMKVPTCIVVLIFTLSTVNVFAQNDYK
>gi|226332189|gb|ACIC01000131.1| GENE 3 1317 - 2678 508 453 aa, chain + ## HITS:1 COG:CAP0114 KEGG:ns NR:ns ## COG: CAP0114 COG3507 # Protein_GI_number: 15004817 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Clostridium acetobutylicum # 3 451 78 528 531 155 28.0 2e-37
MVIIHSKDLVNWTIKGHVVSDLRQISEEMNWTRMNRYGRGIWAGAIRYYQGKFYVYFGTP
DEGYFMSTATDPAGPWEPLHCVKAEKGWDDCCPFFDDDGQLYFVGTHFADKYKTYLYRMT
PDGKTLIENSKILINEGYGREASKLYKINGTYYHFFSEVKNGGRYIMMQRSSSITGPYLE
RKQLSHVQREYNEPNQGGLVEGPDRKWYFFTHHGTGDWAGRIASLLPVYWVDGWPIIGEV
GQDGIGTMVWQAAKPSNEYPVRTPQSSDDFSKSVLSPQWEWNYQPRDEMWSLTERPGNLR
LKAFRPLVTDNLLKAGNTLTQRCFRTNKNEVIVKMNIEGMVDGQKAGLCHYSKSYAMVGV
RQSGKDRNLEFQTNKEKITGPAIKGNKVWIKSVWGLDGKSQFSYSTDGRKFTPFGEVYQL
EWGYYRGSRIGIYCFNNIADAGYIDVDEFIYLY
>gi|226332189|gb|ACIC01000131.1| GENE 4 2690 - 4606 665 638 aa, chain + ##
HITS:1 COG:no KEGG:BT_3664 NR:ns ## KEGG: BT_3664 # Name: not_defined # Def: putative alpha-glucosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 638 1 638 638 1337 100.0 0 MKHSFNCLSYELLLYVLICIPFCSCSQKDTGVISPNGKICLSVEIKEDTEKDGFGKVFFN VEYKDGKTSRRVLSDTRIGLETQLMSFTDNIRLKSVSGTKLIEDDYVMLTGKRSHCQNRG NERIFRFENEKAETLDIVFRAYDDGIVFRYSLPSVQDKDSLTSEHTAYYIPEGTKRWIQN YSVGYEGFYPLKTRAERGKDNMWGYPLLLEPADSLFVLITEANILRGHSGSRLYNGNDMN QYQVRMTDDKLALGDSFTSPWRVLMIGSLSDIVESTLVTDVSEPNKISDTSWIMPGSASW VYWAHNHGSKNFKIVKDYIDLAVDMKWPYSLIDAEWNEMSNGGDIEDLITYSLSRNIKPM IWYNSSTAWMGPGPLYRLNSKENRIKEYTWLQKSGVVGVKIDFFPDDYSSNMDYYIDLLE DAAKYKLMVNFHGATIPRGWQRTYPHLMTMEAVYGAEWYNNKPILTDRAAEHNVTLPFTR NVIGSMDYTPGTFSDSQHPHITTCGHELALSVVFESALQHMPDRPEVYSSLPRKVREFLS GLPTAWDDTKLLCGYPGADIVLARRKGDVWYIGGLNGTNENKTLSFSLVPLQVEGKTMDV FKDGVDDKSFAIEEDIQLSNDKMDMSCLPRGGFVAVIK >gi|226332189|gb|ACIC01000131.1| GENE 5 4725 - 6368 871 547 aa, chain + ## HITS:1 COG:CC0795 KEGG:ns NR:ns ## COG: CC0795 COG3669 # Protein_GI_number: 16125048 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Caulobacter vibrioides # 38 514 36 501 538 438 45.0 1e-122 MKAKKLLLLSLLFIAETLSAQNYQVPVSEQDEPMMQGKFQPTWESLANYKIPEWFRNVKF GIWAHWGPQCVEGSGDWMARSLYMENSPEYRHHIANYGHPSEFGFKDIIPLWKAEKWDPD KLVAFYKKIGAQYFFALGNHHDNMDLWDSKYQPWNSVNMGPKKDILKGWERAARKHGLYF GVSLHADHAWSWYETSQRHDTQGPKKGIPYDGKLTKADGKGKWWEGYDPQDLYAQNHPLS QDSWDNGTIHRQWAWGNGVCLPTQEYCTNFYNRTLDVINRYNPDLLYFDVTVAPFYPISD AGLKIAAHFYNHNMSMQKGKLNAVMFGKILDANQRKALVWDVERGAPNKIIDEPWQSCSC LGGWHYNTSIYNNNQYKSAADVVKLLVDIVSKNGNLLLSVPLRADGTFDEKEEKILNEFG NWMNINKEAIYQTRPWKIFGEGPIADKDVQLNAQGFNEWAYSKADAKEIRFTQTDKNLYA TVLGWPEDGKILIRSLAEGNGLYPTKIGRVELLGYGKVSFTRTDKGLVVDIPAKSLKNIA PVLKIRK >gi|226332189|gb|ACIC01000131.1| GENE 6 6380 - 6769 365 129 aa, chain + ## HITS:1 COG:no KEGG:BT_3666 NR:ns ## KEGG: BT_3666 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 129 1 129 129 239 100.0 3e-62 
MSSIQGYKMKEYSMPTKRYCQTLSLKNNPILIEEYRKIHSEEKAWPEIRAGIRAVGILEM EIYILGSQLFMIIETPLDFDWEIAMDKLSNLPRQAEWEKYVSKFQDCPYLSSSAEKWKLM ERMFYLYED >gi|226332189|gb|ACIC01000131.1| GENE 7 6846 - 7706 375 286 aa, chain - ## HITS:1 COG:ECs0068 KEGG:ns NR:ns ## COG: ECs0068 COG2207 # Protein_GI_number: 15829322 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 181 281 176 277 292 82 38.0 1e-15 MASFVDKLHPLVLNIGLAIHNADWNWKNVNSPFTRLYYVTEGTAQVLISGKTQILKANHM YFIPAFTKHSYICNSYFSHYYLHIYEEHQSDSNLLDNWDFPIEIPAVEIDLALFQRLCTI NPHMTLQKSDPTSYDNNYTLMQNLLKNKQRTLSDKIESRGIVYQLLARFLKRAQVKMELK DNRIEEAIHYIRKHINETIDLRVLAENSCLSKDHFIRLFKKETGDTPLKYINRKKIEKAQ LILITDDMSIKNVALNLAFEDYSYFNRLFKKITGITPQEYRTSYHS >gi|226332189|gb|ACIC01000131.1| GENE 8 7923 - 9329 782 468 aa, chain + ## HITS:1 COG:no KEGG:BT_3669 NR:ns ## KEGG: BT_3669 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 273 468 1 196 196 415 100.0 1e-114 MKSMRKNTWLPLIGLRTFLFVTTAFTIVSCSDDKGEDGYNPNLPVRISELSPQEGGYFDK VILQGENFGNNPKKVRVFFNKKEAIVVGASGDRVLVHVPKLPGDDCKIGMLLNGNTTDTI FAEKHFAYEKNYQLIYVAGQLNSNTETFVEGSLETTTFANSMQHLACDPSGVIYMNHKNT GQSGSLIYINEPENYSKFLDYGASTENGGSPSTPYYDDKTEKVYFCAHHAPFFWEVDPKD SWSFVKRKLIAPDASYQAKGYRPVPSGQKLEYMCSYARATDANGESFIYCRTWQGQFFRF KLEDRVYDYVTTFAKSDACLATDPDDPTKIYCALNEYNKITCIDLTKNPEDSGFETDVCG IQGPGAYQDGHVSVARLNNPQQILSMHDPETGEKVIYICDANNNCVRMYNMETKLMSTVA GIGGKSGYAAGNPTVSRMNRPYGICITPENDIYVADAGNKVIMKLAFM >gi|226332189|gb|ACIC01000131.1| GENE 9 9395 - 12469 2302 1024 aa, chain + ## HITS:1 COG:no KEGG:BT_3670 NR:ns ## KEGG: BT_3670 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1024 23 1046 1046 2061 100.0 0 MLFFCSITIMAQAQQKFEISGIVQDETGQPVPGVSVYVKDKTGIGTNTDMDGKFVLKVDR GDKVQFSFIGYTRHEYLALKTEKKLVIQLVPDVSALDEVVVEAYGSKTRKITTTGAVTSV DVATLQTPATSLANMLGGRVAGIISTQGSGEPGKNISDFWIRGIGTFGANASALVLIDGL EGSLNDVDPADIESFSVLKDASATAMYGVRGANGVVLVTTKRGKQEKLSFTARANATISW 
LKRLPEYCDGYNYAKLANEALVVRGDDPKYTNQELDIIKYGLDTDLYPDVNWQDEVLKRS YWQQTYYLSARGGGSIARYFISLNGSNETSAYQQDKSSKYFKGLAYNTFGLRANIDIDLT KTTKLYFGVNANKTIRNLPGAADTNLIWSACSKLTPVTIPVVYSNGMLPSANGVTDEMSP YVLLNHTGLKQEEEFTSTYTLSLDQDLGMLIKGLKLNIQGAYTNDNAFHETRYIIPELWS YDKRNATGELAGKRQTERITTQYSKWEMQYRKYFLQATMNWAHDLGHGHTLTALAHYEMS DQKRSTDANDTMSAIPVRYQGLSAKVGYNYNDTYIVDGNFGYTGSENFQPGRQFGFFPSI SGAWVPSSYQWTKDKMPWLNYFKIRASYGLVGNDRITDKRFPYLTSINVNGSATTNWKYT KGGIWENSIGADNLEWEKAKKFDVGLEGKLFDKFEFVIDFFRDTRDGIFQERKQVPDFVG LVNMPFGNVGSMVSWGSDGNITFNQRINKDMEFTLRANFTYSTNKVNYYEEGDTKYEYTS ATGRPNGYMKGYIALGLFKDQEEINNSPKQDFGSYLPGDIKYKDVNGDGVVNSDDQVPVS YQDYPRLMYGFGGEFRWKKLTLGVRFSGIGNTDYYKIWTWGSNNVGYIPFYDGKLGNVLT IAANPANRWIPADYDDPSIPASMRENPNAMFPRLSYGSNQNNAQASTFWKGNRKYLRLDE ISLNYNCNCNLLKSIGINSIDLAVVANDLHTWDSVKLFDPELATSNGRAYPIPGRVSFQA IVHF >gi|226332189|gb|ACIC01000131.1| GENE 10 12494 - 14569 1325 691 aa, chain + ## HITS:1 COG:no KEGG:BT_3671 NR:ns ## KEGG: BT_3671 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 691 1 691 691 1415 99.0 0 MKKYILHRSINLFIVFLGILTTGGFTSCNFLRVDDYFDDTMKFDSIFTKYEYLVQYMWGT SDQFPDESRILTDPVTPGPMATDEAFSSQNPEHGLRRLSYILGTINADNIGNYSMNIWSS MYKVIRKCNYILARKGEAHMNTIQDEEITGYTHMMRGYAYYNLIQSYGPCIILGDDILPN NELPETYNYSRSTYDECVDYCCNELELAAKYMPIDLNPTFFGRPSRGAAFALIARLRLQQ ASPLYNGGQAARTAFGNWKRTVDGANYVNQNYDETRWAVAAAACKRVIEMNKYKLHTSPI DPSMPPRVLPASVTITDPDFQNIDYYRSYSEMFTGETIGSKNPEFIWGRQSDAMANMTQC SFPTEKAMGGWNNLCVTQKLVDAYYMVDGRDKNDASKEYPYHVANGTVTDDYFSTSAEEF SGYTIPMGVYGMYLNRENRFYATIGYSGRYWAARSNTQEQYGPYRAWYHQMVTSGRDLFS GKNSAITNPLDYPATGYVLTKWIHADDAWIGTGSMRTTKYFPIIRYAEILLGYVEAINHL SKSYTVELPAINGGNNAPQSYTVERIPTEIQKYFNPIRNRVGLPGLSDTEAASDETTLDN IIKREYMIEFACENRRYFDVRRWGIYEETEKTGVYGMRLGGDKYTYYQNPIPVNQVNNRN RVIDKRLILLPLPKAEVRRVENLDQNPGWEN >gi|226332189|gb|ACIC01000131.1| GENE 11 14601 - 15560 703 319 aa, chain + ## HITS:1 COG:no KEGG:BT_3672 NR:ns ## KEGG: BT_3672 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 319 
1 319 319 594 100.0 1e-168 MKTKYIYIAFLTILCGGSLVACSNGDEFFKDERYKKMIYAISDNEQIFHAEFELNNEEDI IGVQPFAVSGTNPIDQDVHLTIEKDPELLTEYNYNTYMDETDKYARELKDGDYSLLSSIV EIKAGVNPSYEVGKLQVKIKTSIIEKLSVDSAYFLSFRIKEATPYEINEEKRNVLFRIYK KNQYASQKTPTYYASSGFMDDVVMPFDSKIIMPLAYNKIRTYIAGQIYDANDTKETINKN SMVITVDADNQVLISAFDSENGLKVEMLSPSSDPNDGSYGYRNYYNPEEKQFYLYYRFDN GEGWKVVRETLRAESIISK >gi|226332189|gb|ACIC01000131.1| GENE 12 15676 - 16527 630 283 aa, chain + ## HITS:1 COG:no KEGG:BT_3673 NR:ns ## KEGG: BT_3673 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 283 1 283 283 557 100.0 1e-157 MNTKLFGLFIALLVCLPSLAQQKPVEKVDSDGVYLMPDQMPEFPGGMQAMMKFLTTNIKY PVEAQKKGVSGRVIVQFVIMEDGTLDQAKVVRGVDPLLDEEALRVVKLMPKWKPGMDRGE AVKVRFTAPIMFNLSRSEAQRPTFPELVVPIGQEVENRSLQGVWQSCVVQPGEHGYKILL LPVLKIVSPDQTFMNIMTAGMNGRSNAIIYCQGEYSLPSDGTYVEMVEKSVDPVFIQGVK NEISVERLHDNLIKLSFTVPGQGRKVTEYWFRAPSPDVKIMAD >gi|226332189|gb|ACIC01000131.1| GENE 13 16656 - 18662 1632 668 aa, chain - ## HITS:1 COG:TM0280 KEGG:ns NR:ns ## COG: TM0280 COG3533 # Protein_GI_number: 15643049 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Thermotoga maritima # 30 643 18 618 620 456 40.0 1e-128 MKNILTATFLTCICITGSAQINHGYPIDPVPFTSVKVTDNFWGQRLQASREVTIPLAFSK CEETGRYENFVKAAHPSDTYKVEGFSFDDTDVYKTIEGASYSLQTYPDKKLQKYIDSVLV IVAGAQEPDGYLYTARTMNPKHPHNWAGKERWVAVENLSHEFYNLGHMIEGAVAHYQATG KRNFLDIAIKYADCVCREIGNGPQQKKYVPGHQIAEMALVKLYMATGDKKYLDQAKFFLD TRGYTSRKDTYSQAHKPVVEQDEAVGHAVRAVYMYSGMADVAAITGDSSYIKAIDKIWDN IVSKKIYITGGIGAHHAGEAFGNNYELPNLSAYCETCAAIGNVYMNYRLFLLHGDAKYFD VLERTLYNGLISGVSLDGGSFFYPNPLSSNGKYSRKPWFGCACCPSNVSRFIPSLPGYVY AVKNDQVYVNLYLSNKAELKVDKKKILLEQETGYPWNGDIRLKITQGNQDFTMKLRIPGW VRGNVLPGDLYSYADNQKPAYQVSVNGQTVESDVNDGYLSIARKWKKGDVVEVHFDMIPR IVKANPKVEADHGRVAVERGPIVYCAEWPDNRFNVHSILLNQHPQFKVTDKPELLYGIRQ ITTDAQALSYDKAGKLVTKDVELTLIPYYAWAHRGEGDMEVWLPIDVSATSAQPQEAGKW EDNGFFKN >gi|226332189|gb|ACIC01000131.1| GENE 14 18700 - 19701 698 333 aa, chain - ## HITS:1 COG:yagH KEGG:ns NR:ns 
## COG: yagH COG3507 # Protein_GI_number: 16128256 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Escherichia coli K12 # 30 295 5 263 536 60 26.0 4e-09 MKHLFTRIFLFHLLLIGTVQVTAQNKKSGNPILPGFHADPEVLYSHQTKRYYIYPTSDGF PGWGGSYFKVFSSKNLKTWKEETVILEMGKNVSWANGNAWAPCIEEKKIDGKYKYFFYYS ANPTTNKGKQIGVAVADSPTGPFTDLGKPIITSSPTGRGQQIDVDVFTDPVSGKSYLYWG NGYMAGAELNDDMLSIKEETTVVLTPKGGTLQTYAYREAPYVIYRKGIYYFFWSVDDTGS PNYHVVYGTAQSPLGPIEVAKEPIVLIQNPKEEIYGPAHNSILQVPGKDKWYIVYHRINK NHLNDGPGWHREVCIDRMEFNPDGTIKQVIPTP >gi|226332189|gb|ACIC01000131.1| GENE 15 19913 - 21673 1021 586 aa, chain + ## HITS:1 COG:no KEGG:BT_3676 NR:ns ## KEGG: BT_3676 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 586 1 586 586 1211 100.0 0 MRAKLILPFLILYLFCHLLSVQGHEVDIVNAGSDKNVASSQVYEVTSEGAWCWFADPRAL HYENEKGTINKTYVGYIDIHGNIKAMQYDFKKKRQDEVLIRSYFQPDDHNNPTFLVLPDE RIMIFYSRHTDEPCFYYRISRLPGDITTLGEEKVIKTKDNTTYPSPFILSDDPEHIYLCW RGIGWHPTIAKLSLPDQNDQVAVEWGPYQIVQSTGARPYAKYMSNGKDKIYFAYTTGHPD NENPNFLYFNYIDIHSLQLKDVKGNTLSTIADGTFKVNKTDDYARQYPSTLIDNPSARDW VWQVASDENDNPVIAMVRISSDKNSHDYYYAKWNGHEWKKTFLANAGGHFHQTPNSEKCY SAGMTIDPANTNHVYCSLPVEGKQGKVYEIVKFILNEVGEVVSTEAVTQDSQQNNVRPYI VPNSKNTPLRLTWMYGNYYDWIVSSRYPQGYCTGIACDFKGFPGAKKEKTVVTEKDFKFN PKKAFVLEQTVTLDADNYQGSLLQLGDLQYYLNGQTMKPEVRYNGKVYPSTNILGTSDCW KTTSRNTGGEWYTPQKYGSFKLKLEYKKGVLCVYINGLLDQKIEIQ >gi|226332189|gb|ACIC01000131.1| GENE 16 21670 - 23691 1446 673 aa, chain + ## HITS:1 COG:AGl3401 KEGG:ns NR:ns ## COG: AGl3401 COG4289 # Protein_GI_number: 15891818 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 96 400 67 365 627 123 30.0 1e-27 MRRKVLRDCVIGVLLLFVLPLAELSGAIAQESVFTVQQPDFQKSPYTGMTRQHWIQAGEY LLKGAFGYIHTLDDQMYFPKQLDKTYPNNDGQVPVAKLEGLARTLFIAAPLLKDNPELVM NGIRVADYYRHQLVGISNPKSPSFISHRKGGPSQTLLELGSLAISMKAAQAVLWDPLTKA QKDSLAATMLSYGEGPTIGSNWMFFNVFILSFLKDQGYAVNESYLESNLKKLLARYRGEG 
WYNDAPAYDYYSAWAYQTYGPIWAEMFGKKQFPQLAQQFLANQHDMVANYPYMFSRDGKM NMWGRSICYRFAATAPLSLWEYDKSGDVNYGWIRRIASSTLLQFLENPKFLEEGVPTMGF YGPFAPAVQIYSCRGSVYWCGKAFLSLLLPENSDFWSATENNGPWDKELKKGNVYNKFQP GTNLLITNYPNSGGAEMRSWCHETVAKDWQKFRSTENYNKLAYNTEFPWMADGKNGEISM NYGTKNQKGEWEVLRLYTFQSFKDGIYRRDAVLETDSTVRYQLADIPLPNGILRVDKVSV SEPTEICLGHYSLPRLNGVFKETSRRVGKLDIPVIDNGEYELAMIPLAGWDKLYTSYPKG LHPVSDECALIMASDKLAGSKIYVTLQLWKKNEGKNGFTKKELNPVRAIDISEDKKQVTV RLDTKEIKTVLFE >gi|226332189|gb|ACIC01000131.1| GENE 17 23925 - 28070 2051 1381 aa, chain + ## HITS:1 COG:CAC0323 KEGG:ns NR:ns ## COG: CAC0323 COG0642 # Protein_GI_number: 15893615 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 831 1109 363 647 654 158 35.0 8e-38 MKNARIVILYYKLSFLILFLLLKSALLLATEDVIYNLKFKQLSAPYSLPTNEVQKVYQDK DGFIWFATRNGLCQYNGYETTLYKSNLYSPDLLTTNSITCLVDDNNNHLWIGTSEGLNVM DKRTGEIKKYKAPYLSNNVVSALCVTRDNTVWVGTDNGLCRYVAEKDTFVVCGDDFGDSR LRYVTIKSLLEDSDGDLWIGTWAQGLFRYSPSTDKVEIYPKINDQYSSHVIYEDSNKDIW VGSWGYGLFKLQNPKDMQRVSYQRFLHENGDDSSLSDNIVYDIAEDINTHTLWVGTRSGL SILKLDEPDAFINYKSGKTDYRIPSDEVNSIMRDAQRNMWIGAIGGGVLMADTRQSTFAL YTLNFGDEDIPVTSVRTLFADSDQNLWIGVGTYGLACREYETGKLKMYSHIPEFSGLKDL PSLFAVTQRKKSGEIWFGMYNGGIYVYRKGEKVKHLTTKNSGFLTSDCVSALYEDYEGNC WVGTRGGIGVRLADGTDYRFETMNFNDSLSAGWIYVRDIIKDMDSSVWLATSNLGVIHIT GDVRQPSTLKYENYSFHNRKLITNAVLCIHLDRFGRLWAGTEGGGLYLYNRNKGEFEEKT RAYSIPGDVIVSIEEDKSGNLWLGTISGLVKLYVAAVGNDFSTRIYTSADGLQDNFIVNS SCSRNGELFFGGHKGYNSFFPDKMEIPSQETNFLITDIKIFNHSFSNLPVELQQKISPVM PTYTSKIELPHKYNNFSIEFAALTYKNPELNRYAYRLQNFDRDWQYTDADRRFAYYNNLP SGTYTFQLKATNENGEWSGYVRELTVVVLPPFWATWWAYLLYVIIVSVIGVSIYRTAKNR ILLRNALRLREIEKAKAEELNHAKLQFFTNITHELLTPLTIISATVDELKTQAPSHNDLY TVMNSNIQRLIRLLQQILEFRKAETGNLKLRVSPGDIAAFVKNAAESFQPLVKKRKIHFS FLCDPESMIGYFDMDKLDKILYNLLSNAAKYNKEDGFIQVSLSYDETDKDFILLKIKDNG KGISKEKQKNLFKRFYEGDYRKFNTIGTGIGLSLTKDLVELHGGTISVESEVDHGTEFMV RIPIERSYYDEEQIDDEAISLMQNPVNYEDTQEDADVETQEVTIKANTILLVEDNGELLH 
LMTKLLSREYNVFTAQNGKEGIAVLEKEDVDLIVSDVMMPEMDGIEFCKYVKGHLEMSHI PMILLTAKNKEEDRAEAYEIGADAFISKPFNLTVLHARIRNLLKYKERMARDFKNQIVFE VKDLNYTSLDEDFIQRAIHCVNNHLEDPNFDQAQFADEMRTSKSTLYKKLKSLAGLNTSS FIRNVRLKAACRIMEEKGTNVRVSELAYAVGFNDPKYFSSCFKKEFDMLPSEYIERFIQN V >gi|226332189|gb|ACIC01000131.1| GENE 18 28709 - 30001 829 430 aa, chain + ## HITS:1 COG:no KEGG:BT_3679 NR:ns ## KEGG: BT_3679 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 430 1 430 430 871 100.0 0 MRNCIFYSLVLALLFLIGSCKDSDDNKTTGATYDPNQPVTVESFMPVEGKLREKVIVKGS NFGTDKSKVKVYFVDEAAERLSTVIGIDNNTLYCLAPRQLPGGNRIKVIVDGKEVTTDGT FKYEQAQNVSTISGSASKDGNDDGDLASAKFKYMWGIAAVGNNTVLAYQRDDPRVRLISV DDNKVTTVHPGFKGGKPAVTKDKQRVYSIGWEGTHTVYVYMKASGWAPTRIGQLGSTFSG KIGAVALDETEEWLYFVDSNKNFGRFNVKTQEVTLIKQLELSGSLGTNPGPYLIYYFVDS NFYMSDQNLSSVYKITPDGECEWFCGSATQKTVQDGLREEALFAQPNGMTVDEDGNFYIV DGFKGYCLRKLDILDGYVSTVAGQVDVASQIDGTPLEATFNYPYDICYDGEGGYWIAEAW GKAIRKYAVE >gi|226332189|gb|ACIC01000131.1| GENE 19 30022 - 33123 2139 1033 aa, chain + ## HITS:1 COG:no KEGG:BT_3680 NR:ns ## KEGG: BT_3680 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1033 1 1033 1033 2071 99.0 0 MKRSILVLFWVFMSATMFAQGGIDVAGIVLDEQGQELIGVSVQIKGKQGVGVVTDFDGRF KMTGVPAGSTLVFSYIGYETREIKYTATKLKEKIALKEAVNEFDEVVVVGRDTQRKVSVV GAITNVDPAGIQAPAVSVSNMLGGRVPGIIAVTRSGEPGNNFSEFWIRGMSTFGASSSAL VLIDGIEGNINDLDPADIESFSILKDASATAVYGTRGANGVVVVTTKRGKAGKLHVNFKT NATYSYSPRMPEYADAYQYATLANEARSVRGDDPVYSATELELFKTGLDPDLYPNVNWRD VILKDHVINNQHHLSISGGGQSARYYMSLGILNSEALFKQDKSASKHDVNVNYHKYNFRT NIDADLTKTTLLSLNLEAVIKTQNAPGTGSSNKYLWESQANLPPTVVPVRYSNGQLPAYG TNLEDKSPYVRLNYMGYTTNETYSTKINVGLSQDLGMITEGLSVRGLFSFSMNGAHIVDR HMNPEQYYADPKDGRYLDGSLKTVRTVNKEDMTATQGSASNRELYFEAAANYKRLFNQDH RVTGLAHFYRQELTNVDWGNGVLVSIPKRYQALSFRATYSYKDTYLVEGNLGYTGSENFN KERRYGWFPSISGGWVPTQYDWYRNLLPFNNFLKFRASWGRVGNDRLKDENGNDIRFPYL TTLGNVSSTWGTGLAENRTGSMNLKWEVSTKTNFGIDARFFDDKVDMTVDFFHTKTTDIF 
QRRANIPDEGGLSNVLPYANIGSMKSWGMDGTLAYTHTFNKDMALTVRGNFTHAENEVIY WEQSGVNYPYQSNSGVPYKVQRGLIALGLFKDEDDIKSSPKQTFMDNYRPGDIKYKDVNG DGKIDKDDVVPLNYSAVPFIQYGFALDWNYKAFRVSILFEGVSKVQYFQGGRGFYPFLNE SRGNLLEMVADPRNRWIPREYAEVNGIDPALAENPNAKFPRLTYGENKNNNQESTFWLAD GKYLRLKNVDVSYRFTNNWLKSRVGVESATLSLIGENLHVWDKVKLFDPSQASGNGAEYP LQRMYTLQLNLTF >gi|226332189|gb|ACIC01000131.1| GENE 20 33136 - 35151 1438 671 aa, chain + ## HITS:1 COG:no KEGG:BT_3681 NR:ns ## KEGG: BT_3681 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 15 671 1 657 657 1308 100.0 0 MKKKTNIIKTLFLSMVCSITIGLQSCGDYLDVEEYIDQMTFLDSVFSRKELLDQYINGAA QYIPNEGKLWSNSSTPFQIASDENFMSWNDSQHYGMKFLLDEITPYDGYYNNYPKYYQGI RMALTALSRINEVPDVTDVERRELMGRCYFLVGYYYYLLMLQYGPVPIVPETPFNVDVPV AEMSLERGTYDECVLEIRKWMSLAIQFLPLEVESSTVVTLPTQWAAYATLSRITLYAASP WYNGNKFYADWQRTSDGANFISQENDNSKWGVSAAYSKYIIDSNKFELYWTPKEIDSKDL PTNEEFIKEIDPDYYEPYPKGAAGIDHYRSLTYTFSGEIPVMINPEFIYSCQMPTGDAPL VAAAPFKLGGWGGLNLVQDLIDAYQMVDGQDINESSQDYPYPDASVNFERIGGANQTFSG FTLLASTARMYNNREPRFYATIGFCHSFWPGTSSSENQYKNIEVTYYSDGYASANPDHPE DYNRTGYTCVKYRHLEDEMKKGTVKAKYFPVFRYAETLLNYVEAINELKEPYTLEMPNGS QVTVEHNTADIVKYFNLIRHRAGLPGITEAVAADRNKVRELIKHERRIEFACEGRRYHDL RRWGDAMEAYNRPISGCNVKARSNERQKFYTTTILNDKLTRRSFSYKHYFYPIPKSVLDK NKNLVQNPEWR >gi|226332189|gb|ACIC01000131.1| GENE 21 35173 - 36060 425 295 aa, chain + ## HITS:1 COG:no KEGG:BT_3682 NR:ns ## KEGG: BT_3682 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 295 1 295 295 576 100.0 1e-163 MKLIRTIYTLGLLVGLFASCSKESPLDGEQYYKQVYIVGAYEVVQQFDVAYGDGPQNAYV AVATGGSQNIDRNVEVVLTHNDATIEWYNNKYMLDAPLKYQKLGDQFCSFPSMSTTIKAG DVYSRLPFTVETSGLDCDKLYALTFKIESVSDYVKQPKDTVLIMNLNLTNDFSGIYQMVA IRYTLTDNDEELTPSSVNMQRTLKAVNKNQARFFNVTQNSDLSPTGNITSEDYFNSIDNN GVTFTRQPDGTFTVAGWKNLPVSNGTVSFEDGTFTFCYDYESGGKRYRLRGTMTK >gi|226332189|gb|ACIC01000131.1| GENE 22 36353 - 38248 1353 631 aa, chain + ## HITS:1 COG:TM0024 KEGG:ns NR:ns ## COG: TM0024 COG2273 # Protein_GI_number: 
15642799 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucanase/Beta-glucan synthetase # Organism: Thermotoga maritima # 23 281 210 457 642 155 38.0 2e-37 MKIKKIVNLMMGLSFSLCIFAQDDWQLVWSDEFNADGPLNSSVWNFEQGYARNEEAQWYQ SDNAVCKNGLLIIEARKEQNRKNPLYVSGSNDWRKKREFIDYTSSSVTTAGKKEFLYGRF EIKARIPVAKGAWPAIWTLGSNMEWPSCGEIDIMEYYQIKGTPHILANAAWGTDRQWHAK WDSQATPYSHFTDKDPDWASKFHIWRMDWDEEAIKLYLDDELLNEIPLSSTRNGSIGKGT NPFTKPQYLLLNLAIGGINGGPIDEAALPMKYEIDYVRVYQKEKGIASGKVWRDTDGNVI NAHGGGILFHEGKYYWFGEHRPASGFVTEKGINCYSSTDLYNWKSEGIALAVSEEEGHDI EKGCIMERPKVIYNAKTGKFVMWLHLELKGQGYGPARAAVAVSDSPAGPYRFIRSGRVNP GAYPLNMTRKERKMKWNPEEYKEWWTPKWYEAIAKGMFVKRDLKDGQMSRDMTLFVDDDG KAYHIYSSEDNLTLQIAELADDYLSHTGKYIRIFPGGHNEAPAIFKKEGTYWMITSGCTG WDPNKARLLTADSMLGEWKQLPNPCVGEDADKTFGGQSTYILPLPEKGQFFFMADMWRPK SLADSRYIWLPVQFDDKGVPFIKWMDRWNFD >gi|226332189|gb|ACIC01000131.1| GENE 23 38502 - 39569 788 355 aa, chain + ## HITS:1 COG:no KEGG:BT_3685 NR:ns ## KEGG: BT_3685 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 338 1 338 355 701 100.0 0 MVRMLLLVVVCIGLLPIQMRAQKNSYIIPGEVWKDTDGNPINAHGGGLLYHDGTYYWYGE YKKGKTILPDWATWECYRTDVTGVGCYSSKDLLNWKFEGIVLPAVKEDPNHDLHPSKVLE RPKVIYNKKTGKFVMWAHVESADYSKACAGVAVSDFPNGPFTYLGSFRPNNAMSRDQTVF VDDDGRAYQFYSSENNETMYISLLTDDYLKPSGRFTRNFIKESREAPAVFKHKGKYYMLS SGCTGWDPNVAEIAVADSIMGTWKTIGNPCTGPDADKTFYAQSTYVQPVIGKKNAYIAMF DRWKKKDLEDSRYVWLPVLIKDGAITIPWHEKWDLTVFDKQKKSDKYKKSDKQKK >gi|226332189|gb|ACIC01000131.1| GENE 24 39573 - 40868 903 431 aa, chain + ## HITS:1 COG:no KEGG:BT_3686 NR:ns ## KEGG: BT_3686 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 431 1 431 431 869 100.0 0 MKRTLLLTFLLLSVSALCFSQKLVEVAKGYSCTSVNTTIFRNNSLVTHGDEQYISYYDAD GYLVLGKRKLNSKQWTLHRTQYRGNVKDAHNIISIMVDGEGYLHVSFDHHGHKLNYCRSI APGSLELGDKMPMTGVDEGNVTYPEFYPLTDGDLLFVYRSGSSGRGNLVMNRYSLKDHKW ARVQDVLIDGEDKRNAYWQLYVDEKGTIHLSWVWRETWQVETNHDLCYARSFDNGVTWYK 
SDGEQYKLPITASNAEYACRIPQNSELINQTSMSADAGGNPYIATYWRSSDSEVPQYRIV WNDGKTWHNRQVTDRKTPFTLKGGGTKMIPVARPRIVVEDGEIFYIFRDEERGSRVSMAH TADVANGKWIVTDLTDFSVDAWEPSHDTELWKKQRKLNLFVQHTCQGDGERTAEIEPQMI YVLEANTNTKK >gi|226332189|gb|ACIC01000131.1| GENE 25 40873 - 42018 934 381 aa, chain + ## HITS:1 COG:YPO0840 KEGG:ns NR:ns ## COG: YPO0840 COG4225 # Protein_GI_number: 16121148 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Yersinia pestis # 56 373 47 350 352 165 32.0 1e-40 MKKLYATLFSALVVGCAVCAGCTTKKVSSSAEVVDIIHKVNGYWQTNHPEHGRSFWDNAA YHTGNMEAYFLTNKPEYLEYSKGWAEHNEWKGAKSDHKANWKYSYGESNDYVLFGDYQIC FQTYADLYNLEPDTHKIARAREVMEYQMSTPNNDYWWWADGLYMVMPVMTKLYNITKNPL YLEKLHEYLAYADSIMYDEEAGLYYRDGKYVYPKHKSVNGKKDFWARGDGWVLAGLAKVL KDLPETDKYRQEYIDRFRTLAKSVAACQQPEGYWTRSMLDAQHAPGPETSGTAFFTYGLQ WGVNNGFLDSAHYQPVVEKAWKYLSTVALQPDGKIGYVQPIGEKAIPGQVVDANSTSNFG VGAFLLAACERVRYLESLIQH >gi|226332189|gb|ACIC01000131.1| GENE 26 42001 - 43020 608 339 aa, chain - ## HITS:1 COG:BH1812 KEGG:ns NR:ns ## COG: BH1812 COG4552 # Protein_GI_number: 15614375 # Func_class: R General function prediction only # Function: Predicted acetyltransferase involved in intracellular survival and related acetyltransferases # Organism: Bacillus halodurans # 42 326 51 347 386 70 23.0 5e-12 MTTKEKVKALWQLCFDDNEEFVEMYFRLRYKNEINIAIESGDEIVSALQMIPYPMTFCGD TVQTSYISGACTHPDFRGNGVMRELLSQAFAKMLRNGTYFSTLIPAEPWLFDYYTRMGYA SVFQYSVKEITVPEFIPSKEITVTSEIGCQKEVYEYLNSKLSGRTCCIQHSFEDFQVVMA DLILSDGILVTARSENQINGLAIVYRRDKQLIISELFAETKDAEHSLLHHIKQFTGCRHM TQLLPPEKEQTQYPLGMARIINAKEVLQLYASAFPEDEMQLEVSDKQLSVNNGYYYLCNG KCMYSTERLPGAHIPMNISELTGRIFQALQPYMSLMLNK >gi|226332189|gb|ACIC01000131.1| GENE 27 43034 - 43951 831 305 aa, chain - ## HITS:1 COG:jhp0277 KEGG:ns NR:ns ## COG: jhp0277 COG4866 # Protein_GI_number: 15611347 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Helicobacter pylori 
J99 # 4 297 2 290 290 160 35.0 4e-39 MIPFKDITLADRDTITAFTMKSDRRNCDLSFSNLCSWRFLYDTQFAVIDDFLVFKFWAGE QLAYMMPIGNGDLKAVLRKLIEDADKEKHNFCMLGVCSNMRADLEAILPERFIFTEDRAY ADYIYLRSDLATLKGKKFQAKRNHINRFRNTYPDYEYTPITPDRIQECLDLEAEWCKVNN CDQQEGTGNERRALIYALHNFEALGLTGGILHVNGKIVAFTFGMPINHETFGVHVEKADT SIDGAYAMINYEFANRIPEQYIYINREEDLGIEGLRKAKLSYQPVTILEKYMACLKDHPM DMVRW >gi|226332189|gb|ACIC01000131.1| GENE 28 44007 - 44702 861 231 aa, chain - ## HITS:1 COG:NMB2004 KEGG:ns NR:ns ## COG: NMB2004 COG1346 # Protein_GI_number: 15677832 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative effector of murein hydrolase # Organism: Neisseria meningitidis MC58 # 10 229 11 229 230 178 46.0 9e-45 MSFLENNFFLLAVTFGVFFFAKLLQKKTGLVLLNPILLTIAVLIIFLKMANISYETYNQG GHLIEFWLRPAVVALGVPLYLQLEMIKKQLLPILLSQLAGCIVGVISVVLIAKFMGASQE VILSLAPKSVTTPIAMEVTKAIGGIPSLTAAVVVAVGLLGAICGFKTMKIMHVGSPIAQG LSMGTAAHAVGTSTAMDISSKYGAYASLGLTLNGIFTALLTPTILRLLGVL >gi|226332189|gb|ACIC01000131.1| GENE 29 44699 - 45142 324 147 aa, chain - ## HITS:1 COG:NMA0437 KEGG:ns NR:ns ## COG: NMA0437 COG1380 # Protein_GI_number: 15793442 # Func_class: R General function prediction only # Function: Putative effector of murein hydrolase LrgA # Organism: Neisseria meningitidis Z2491 # 1 109 3 111 114 96 49.0 2e-20 MIRQCAILFGCLALGELIVYLTGIKLPSSIIGMLLLTLFLKLGWIKLQWVQGLSDFLVAN LGFFFVPPGVALMLYFDVIAAEFWPIVIATIVSTALVLVVTGWVHQIVRKFRLARQIKLA RSLRLSDFHLSDKLHLKNKTNLMNKDK >gi|226332189|gb|ACIC01000131.1| GENE 30 45339 - 46358 1343 339 aa, chain + ## HITS:1 COG:CAC1742 KEGG:ns NR:ns ## COG: CAC1742 COG0280 # Protein_GI_number: 15895019 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Clostridium acetobutylicum # 2 332 1 329 333 320 52.0 2e-87 MLNLINQIVARAKADRQRIVLPEGTEERTLKAANQILTDEVADLILLGKPAEINELAVKW GLGNISKATIIDPETSPKHEEYAQLLCELRKKKGMTIEEARQLTNDPLFYGCLMIKSGDA DGQLAGARNTTGNVLRPALQIIKTAPGITCVSGAMLLLTHAPEYGKNGILVMGDVAVTPV PDPNQLAQIAVCTAQTAKAVAGIENPKVAMLSFSTKGSAKHEVVDKVVEATKIAKEMAPT 
LDLDGEMQADAALVPEVGASKAPGSPVAGEANVLIVPSLEVGNISYKLVQRLGHADAIGP ILQGIARPVNDLSRGCSIEDVYRMIAITANQAIAAKNNK >gi|226332189|gb|ACIC01000131.1| GENE 31 46387 - 47586 1379 399 aa, chain + ## HITS:1 COG:TM0274 KEGG:ns NR:ns ## COG: TM0274 COG0282 # Protein_GI_number: 15643044 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Thermotoga maritima # 1 399 1 400 403 464 58.0 1e-130 MKILVLNCGSSSIKYKLFDMTTKEVIAQGGIEKIGLKGSFLKLTLPNGEKKVLEKDIPEH TIGVEFILNTLISPEYGAIKSLDEINAVGHRMVHGGERFSESVLLNKEVLEAFAACNDLA PLHNPANLKGVNAVSAILPNIPQVGVFDTAFHQTMPDYAYMYAIPYEMYEKYGVRRYGFH GTSHRYVSKRVCEFLGVNPVGQKIITCHIGNGGSIAAIKDGKCIDTTMGLTPLEGLMMGT RSGDIDAGAVTFIMEKEGLNTTGISNLLNKKSGVLGISGVSSDMRELLAACANGNERAIL AEKMYYYRIKKYIGAYAAALGGVDIILFTGGVGENQFECRESVCKDMEFMGIKLDNNVNA KVRGEEAIISTADSKVKVVVIPTDEELLIASDTMDILKK >gi|226332189|gb|ACIC01000131.1| GENE 32 47745 - 48836 742 363 aa, chain + ## HITS:1 COG:RC0461 KEGG:ns NR:ns ## COG: RC0461 COG0463 # Protein_GI_number: 15892384 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Rickettsia conorii # 2 112 294 401 604 78 38.0 2e-14 MRYSVIIPVYNRPDEVDELLQSLTVQCFKDFEVVIVEDGSSIPCKEVADRYTDRLDIKYF SKPNSGPGQTRNYGAERSEGEYFIILDSDVILPEGYFDAIEKELQASPADAFGGPDRAHD SFTDIQKAINYSMTSFFTTGGIRGGKKKMDKFYPRSFNMGIRREVYEALGGFSKMRFGED IDFSIRIFKGGYTCRLFPDAWVYHKRRTDLKKFFKQVHNSGIARINLYKKYPDSLKLVHL LPAVFTLGVSLLLLLTLLGLGLVLLAFYLLYSKTVPGCATGEMALLFLGFAAMLAAPLPL LLYSLVVCVDSTIRNRSLWIGMLSITASFIQLIGYGTGFWRAWWQRCIRGKDEFEAFQKN FYK >gi|226332189|gb|ACIC01000131.1| GENE 33 48848 - 49543 549 231 aa, chain + ## HITS:1 COG:XF0148 KEGG:ns NR:ns ## COG: XF0148 COG2003 # Protein_GI_number: 15836753 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Xylella fastidiosa 9a5c # 1 231 1 232 232 134 36.0 1e-31 MESKHKLSINQWALEDRPREKMMEKGAAALSDAELLAILIGSGNTEESAVELMRRLLLSC DNNLNSLAKWEVCDYSRFKGMGPAKSITVMAALELGKRRKLQNIQERPRISCSRDIYDIF 
QPVMCDLEQEEFWVLLLNQATRLIDKVRISTGGIDGTYTDVRTILREALLQRATQIAVVH NHPSGNIQPSQPDRTLTEHIRRAADTMNIRLIDHVIVCEDKFYSFADEGLL >gi|226332189|gb|ACIC01000131.1| GENE 34 49591 - 49902 434 103 aa, chain + ## HITS:1 COG:CC1859 KEGG:ns NR:ns ## COG: CC1859 COG2151 # Protein_GI_number: 16126102 # Func_class: R General function prediction only # Function: Predicted metal-sulfur cluster biosynthetic enzyme # Organism: Caulobacter vibrioides # 5 100 20 115 118 102 52.0 2e-22 MEKFKIEEKIVAMLKTVYDPEIPVNVYDLGLIYKIDVSDNGEVALDMTLTAPNCPAADFI MEDIRQKVESVEGVTSATINLVFEPEWDKDMMSEEAKLELGFL Prediction of potential genes in microbial genomes Time: Thu May 12 02:56:40 2011 Seq name: gi|226332188|gb|ACIC01000132.1| Bacteroides sp. 1_1_6 cont1.132, whole genome shotgun sequence Length of sequence - 16957 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 5, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 39 - 803 835 ## COG2908 Uncharacterized protein conserved in bacteria + Term 878 - 950 23.0 - Term 871 - 932 13.6 2 2 Op 1 . - CDS 944 - 3022 1720 ## COG0366 Glycosidases - Prom 3048 - 3107 1.9 - Term 3055 - 3103 1.2 3 2 Op 2 . - CDS 3132 - 4589 1473 ## BT_3699 outer membrane protein SusF 4 2 Op 3 . - CDS 4615 - 5778 934 ## BT_3700 outer membrane protein SusE 5 2 Op 4 . - CDS 5813 - 7468 1472 ## BT_3701 SusD, outer membrane protein 6 2 Op 5 . - CDS 7490 - 10504 2854 ## BT_3702 SusC, outer membrane protein involved in starch binding - Prom 10546 - 10605 2.5 7 3 Tu 1 . - CDS 10659 - 12875 2227 ## BT_3703 alpha-glucosidase SusB - Prom 12903 - 12962 7.7 8 4 Tu 1 . - CDS 13072 - 14925 1568 ## COG0366 Glycosidases - Prom 15130 - 15189 6.1 + Prom 15083 - 15142 5.7 9 5 Tu 1 . 
+ CDS 15169 - 16767 1362 ## BT_3705 regulatory protein SusR + Term 16868 - 16917 8.3 Predicted protein(s) >gi|226332188|gb|ACIC01000132.1| GENE 1 39 - 803 835 254 aa, chain + ## HITS:1 COG:NMA0723 KEGG:ns NR:ns ## COG: NMA0723 COG2908 # Protein_GI_number: 15793700 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Neisseria meningitidis Z2491 # 1 247 1 240 240 89 30.0 8e-18 MKNVYFLSDAHLGSRAIEHGRTQERRLVNFLDSIKHKASAVYLLGDMFDFWYEFRLVVPK GYTRFLGKLSELTDMGVEVHFFTGNHDIWCGDYLSKECGVIIHREALTTEIYGKEFYLAH GDGLGDPDKKFKLLRWMFHSTTLQTLFSAIHPRWSVELGLTWAKHSREKRVDGKEPDYMG ENKEHLVLYTKEYLKSHPNINFFIYGHRHIELDLMLSATSRVLILGDWINYFSYAVFDGE NLFLEEYIEGETQV >gi|226332188|gb|ACIC01000132.1| GENE 2 944 - 3022 1720 692 aa, chain - ## HITS:1 COG:VC0911 KEGG:ns NR:ns ## COG: VC0911 COG0366 # Protein_GI_number: 15640927 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Vibrio cholerae # 64 177 17 131 562 128 50.0 3e-29 MNKHLHFLSLLWLSMLMAFMTACSDDKNITDPAPEPEPPVEGQWTALTASPDTWDETKRA DISYQLLLYSFADSDGDGYGDLNGVTQKLDYLNQLGVKALWLSPIHPCMSYHGYDVTDYT KVNPQLGTESDFDRLVTEAHNRGIKIYLDYVMNHTGTAHPWFTEASSSSESPYRNYYSFS EDPKTDIAAGKIAMITQEGAAGYNAAEWFQVSDETAAVKGLLKFTLDWSNAPSPILVVST GTKADEDNPDTGTDNAKYLYYGEDICKKFYDKGNNIYELTVDFESTWGLLIRTSNASFWP SGTKYGASSSSEKLALNKDFKLTNAGNPANIMFDSQQITYFHSHFCTDWFADLNYGPVDQ AGESPAYQAIADAAKGWIARGVDGLRLDAVKHIYHSETSEENPRFLKMFYEDMNAYYKQK GHTDDFYMIGEVLSEYDKVAPYYKGLPALFEFSFWYRLEWGINNSTGCYFAKDILSYQQK YANYRSDYIEATKLSNHDEDRTSSKLGKSADKCKLAAAVLLTSAGHPYIYYGEELGLYGT KDNGDEYVRSPMLWGDSYTTNYTDKTDATVSKNVKTVADQQADTHSLLNIYFSLTRLRNT YPALAEGNMTKHSVYNESQEKDYKPIAAWYMTKDNEKLLVIHNFGGTAMLLPLTDKIEKV LFVNGEAQQNTDSDSYTLKLGGYASVVFKLGN >gi|226332188|gb|ACIC01000132.1| GENE 3 3132 - 4589 1473 485 aa, chain - ## HITS:1 COG:no KEGG:BT_3699 NR:ns ## KEGG: BT_3699 # Name: susF # Def: outer membrane protein SusF # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 485 1 485 485 920 100.0 0 
MKKHLIYTGMFLAAIGFSACNEDFKDWADPQSNPQEESAGQLTATFTAGKDASIVMDAAT ADSVEIAKLSSTTAEEGSKIAVNSLTLNENHTIPFSMTEDHVFKVALAQLDSVTQEAYKS RASVVRELKISINASAVTPSGEGIQLVGNEVSITLQPATTPAVDPDGYYIVGDFTGWDGN SAQQMKKDALDENLYILEAEIESTSNFKIFPASAINGNDIDWTKALGSSVDGDDSGDNFV SWTNAGAINTALDGKIKISFDAFNYRFTVKDNSAPTELYMTGSAYNWGTPAGDPNAWKAL VPVNGTKGTFWGIFYFAANDQVKFAPQANWGNDFGFVDAISQESKDLAGLSDEGGNLKVG IAGWYLVYVSVIGDDKVIEFEKPNVYLMGDTSYNGWDAQLVEQDLFTVPGTADGEFVSPA FLKDGAVRICVNPKAVSAGDWWKTEFIIFDGQIAYRGNGGDQAAVQGKTGQKVYLNFGNG TGRIE >gi|226332188|gb|ACIC01000132.1| GENE 4 4615 - 5778 934 387 aa, chain - ## HITS:1 COG:no KEGG:BT_3700 NR:ns ## KEGG: BT_3700 # Name: susE # Def: outer membrane protein SusE # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 387 1 387 387 750 100.0 0 MKKISNILLAVTFALPLFTACETDNDSNPILNEPDTFTLNTPAYAANNVYDLKNAQTVEL TCSQPDYGFPAATTYTVQASFEQDFIEATDESKANYTVLESTSPTAKINVDASELNNALL DLWTAVNGEQAELPTEPVAVYIRLKANITSSGKGVCFSNVIELPNVLISKSTSSLTPPKT MFIVGSMLDTDWKVWKPMAGVYGMDGQFYSMIYFDANSEFKFGTKENEYIGINDNRVTVT DKAGAGVSGSDNFVVENAGWYLFYVKAAVKGDDYQFTITFYPAEVYLFGNTTGGSWAFND EWKFTVPATKDGNFVSPAMTASGEVRMCFKTDLDWWRTEFTLHDGEIFYRDFNLIDSWTE KGDGYSIQGSAGNVIHLNFTAGTGEKK >gi|226332188|gb|ACIC01000132.1| GENE 5 5813 - 7468 1472 551 aa, chain - ## HITS:1 COG:no KEGG:BT_3701 NR:ns ## KEGG: BT_3701 # Name: susD # Def: SusD, outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 551 1 551 551 1135 100.0 0 MKTKYIKQLFSAALIAVLSSGVTSCINDLDISPIDPQTGGSFDQQGVFVKGYAMLGVTGQ KGIDGSPDLDGQDEGESGFYRTTFNCNELPTDECLWAWQENQDIPQLTSISWSPSSQRTE WVYVRLGYDITQYNFFLDQTEGMTDAETLRQRAEIRFLRALHYWYFLDLFGKAPFKEHFS NDLPVEKKGTELYTYIQNELNEIEADMYEPRQAPFGRADKAANWLLRARLYLNAGVYTGQ TDYAKAEEYASKVIGSAYKLCTNYSELFMADNDENENAMQEIILPIRQDGVKTRNYGGST YLVCGTRVAGMPRMGTTNGWSCIFARAAMVQKFFSNLEDVPMLPADVEIPTKGLDTDEQI DAFDAEHGIRTEDMIKAAGDDRALLYSGVGGGRRKIQTDAISGFTDGLSIVKWQNYRSDG KPVSHATYPDTDIPLFRLAEAYLTRAEAIFRQGGDATGDINELRKRANCTRKVQTVTEQE LIDEWAREFYLEGRRRSDLVRFGMFTTNKYLWDWKGGAMNGTSVASYYNKYPIPVSDINN NRNMSQNEGYK >gi|226332188|gb|ACIC01000132.1| 
GENE 6 7490 - 10504 2854 1004 aa, chain - ## HITS:1 COG:no KEGG:BT_3702 NR:ns ## KEGG: BT_3702 # Name: susC # Def: SusC, outer membrane protein involved in starch binding # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1004 1 1003 1003 1936 99.0 0 MKKGNFMFKVLLMLIAGIFLSIDAFAQQITVKGIVKDTTGEPVIGANVVVKGTTTGTITD FDGNFQLSAKQGDIIVVSFIGYQPQELPVAAQMNVILKDDTEILDEVVVIGYGQVKKNDM TGSVMAIKPDELSKGITTNAQDMLSGKIAGVSVISNDGTPGGGAQIRIRGGSSLNASNDP LIVIDGLAIDNEGIKGMANGLSMVNPADIETLTVLKDASATAIYGSRASNGVIIITTKKG KNGQAPSVSYNGSVSFSKTQKRYDVLSGDEYRAYANQLWGDKLPADLGTANTDWQDQIFR TAVSTDHHVSINGGFKNLPYRVSLGYTDDNGIVKTSNFRRFTASVNLAPSFFEDHLKFNI NAKFMNGKNRYADTGAAIGGALAIDPTRPVYSNEDPYQFTGGYWQNINSTTGFSNPDWKY TSNPNSPQNPLAALELKNDKANSNDFVGNVDVDYKFHFLPDLRLHASIGGEYAEGTQTTI VSPYSFGNNYYGWNGDVTQYKYNLSYNIYVQYIKSLGANDFDIMVGGEEQHFHRNGFEEG QGWDSYTQEPHDAKLREQTAYATRNTLVSYFGRLNYSLLNRYLFTFTMRWDGSSRFSKDN RWGTFPSLALGWKIKEENFLKDVNVLSDLKLRLGWGITGQQNIGDDFAYLPLYVVNNEYA QYPFGDTYYSTSRPKAFNENLKWEKTTTWNAGLDFGFLNGRITGGIDGYFRKTDDLLNSV KIPVGTNFNAQMTQNIGSLENYGMEFSINAKPIVTKDFTWDLSYNITWNHNEITKLTGGD DSDYYVEAGDKISRGNNTKVQAHKVGYAANSFYVYQQVYDENGKPIENMFVDRNGNGTID SGDKYIYKKPAGDVLMGLTSKMQYKNFDFSFSLRASLNNYVYYDFLSNKANVSTSGLFSN NAYSNTSAEAVALGFSGQGDYYMSDYFIHNASFLRCDNITLGYSFQNLWKTQTYKGVGGR VYATVQNPFIISKYKGLDPEVKSGIDANPYPRAMTFLLGLSLQF >gi|226332188|gb|ACIC01000132.1| GENE 7 10659 - 12875 2227 738 aa, chain - ## HITS:1 COG:no KEGG:BT_3703 NR:ns ## KEGG: BT_3703 # Name: susB # Def: alpha-glucosidase SusB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 738 1 738 738 1521 100.0 0 MKKRKILSLIAFLCISFIANAQQKLTSPDNNLVMTFQVDSKGAPTYELTYKNKVVIKPST LGLELKKEDNTRTDFDWVDRRDLTKLDSKTNLYDGFEVKDTQTATFDETWQPVWGEEKEI RNHYNELAVTLYQPMNDRSIVIRFRLFNDGLGFRYEFPQQKSLNYFVIKEEHSQFGMNGD HIAFWIPGDYDTQEYDYTISRLSEIRGLMKEAITPNSSQTPFSQTGVQTALMMKTDDGLY INLHEAALVDYSCMHLNLDDKNMVFESWLTPDAKGDKGYMQTPCNTPWRTIIVSDDARNI LASRITLNLNEPCKIADAASWVKPVKYIGVWWDMITGKGSWAYTDELTSVKLGETDYSKT KPNGKHSANTANVKRYIDFAAAHGFDAVLVEGWNEGWEDWFGNSKDYVFDFVTPYPDFDV 
KEIHRYAARKGIKMMMHHETSASVRNYERHMDKAYQFMADNGYNSVKSGYVGNIIPRGEH HYGQWMNNHYLYAVKKAADYKIMVNAHEATRPTGICRTYPNLIGNESARGTEYESFGGNK VYHTTILPFTRLVGGPMDYTPGIFETHCNKMNPANNSQVRSTIARQLALYVTMYSPLQMA ADIPENYERFMDAFQFIKDVALDWDETNYLEAEPGEYITIARKAKDTDDWYVGCTAGENG HTSKLVFDFLTPGKQYIATVYADAKDADWKENPQAYTIKKGILTNKSKLNLHAANGGGYA ISIKEVKDKSEAKGLKRL >gi|226332188|gb|ACIC01000132.1| GENE 8 13072 - 14925 1568 617 aa, chain - ## HITS:1 COG:lin2231 KEGG:ns NR:ns ## COG: lin2231 COG0366 # Protein_GI_number: 16801296 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Listeria innocua # 114 542 121 520 591 168 31.0 3e-41 MKRNLLFIILLLLLPGLHQVFATSTIKKVAPTFWWAGMKNPELQILLYGDRISSADVSLS ADNITLQEVVKQENPNYLVLYLDLSKAAPQNFDIILKQGKKQTKIPYELKQRRPNASAVE GFDSSDVLYLIMPDRFANGNPSNDIIPGMLEGNVDRNEPFARHGGDLKGIENHLDYIADL GVTSIWLNPIQENDMKEGSYHGYAITDYYQVDRRFGSNEEFRKLTQEANAKGLKVVMDMI FNHCGSDNYLFKDMPSKDWFNFEGNYVQTSFKTATQMDPYASDYEKKIAIDGWFTLTMPD FNQRNRHVATYLIQSSIWWIEYAGINGIRQDTHPYADFDMMARWCKAVNEEYPKFNIVGE TWLGNNVLISYWQKDSRLAYPKNSNLPTVMDFPLMEEMNKAFDEETTEWNGGLFRLYEYL SQDIVYSHPMSLLTFLDNHDTSRFYRSEADTKNLDRYKQALTFLLTTRGIPQIYYGTEIL MAADKANGDGLLRCDFPGGWPNDTKNCFDAANRTPQQNEAFSFMQKLLQWRKGNEVIAKG QLKHFAPNKGVYVYERKYGDKSVVVFLNGNDREQTIDLVPYQEILPASSAFDLLTEKKVE LRNELTLPSREIYLLSF >gi|226332188|gb|ACIC01000132.1| GENE 9 15169 - 16767 1362 532 aa, chain + ## HITS:1 COG:no KEGG:BT_3705 NR:ns ## KEGG: BT_3705 # Name: susR # Def: regulatory protein SusR # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 532 51 582 582 921 100.0 0 MRNLTYYLLFLLFVYPLSLSAKHTDINKAALKKLDDIISKKETYQIRREKDITDLKVQLA HSTDPARNYELYASLFGAYLHYQADSALHYINRQMEILPQLNRPDLEYEIVINRATVMGV MGMYIEAMEQLEKIDPKKLNEWTLLSYYQTYRACYGWLADYTTNKTEKEKYLKKTDLYRD SIIAAMPPEENKTIVMAERCIVTGKADTAIGMLNDALKDMEDERQKVYIYYTLSEAYSMK KDVEKEVYYLILTAIADLESSVREYASLQKLAHLMYELGDIDRAYKYLSCSMEDAVACNA RLRFMEVTEFFPIIDKAYKLKEERERAVSRAMLISVSLLSLFLLIAIFYLYRWMKKISVM RRNLSLANKQMSAVNKELEQTGKIKEVYIARYLDRCVNYLDKLETYRRSLAKLAMSSRID 
DLFKAIKSEQFIRDERNEFYNEFDKSFLKLFPHFITSFNNLLVEEARVYPKSDELLTTEL RIFALIRLGVVDSNKIAHFLGYSLATIYNYRSRMRNKAAGDKDRFEQDVMNL Prediction of potential genes in microbial genomes Time: Thu May 12 02:57:34 2011 Seq name: gi|226332187|gb|ACIC01000133.1| Bacteroides sp. 1_1_6 cont1.133, whole genome shotgun sequence Length of sequence - 43298 bp Number of predicted genes - 40, with homology - 37 Number of transcription units - 21, operones - 12 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 174 - 1577 1575 ## COG1785 Alkaline phosphatase - Prom 1610 - 1669 4.1 - Term 1612 - 1659 4.9 2 2 Tu 1 . - CDS 1681 - 2247 721 ## COG0231 Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) - Prom 2364 - 2423 4.5 + Prom 2265 - 2324 3.8 3 3 Tu 1 . + CDS 2440 - 2601 277 ## PROTEIN SUPPORTED gi|29349118|ref|NP_812621.1| 50S ribosomal protein L34 + Term 2613 - 2673 17.1 + Prom 2938 - 2997 3.6 4 4 Op 1 . + CDS 3018 - 3668 661 ## BT_3711 hypothetical protein 5 4 Op 2 . + CDS 3694 - 4767 1118 ## COG0564 Pseudouridylate synthases, 23S RNA-specific 6 4 Op 3 . + CDS 4764 - 5738 1014 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes + Prom 5745 - 5804 2.6 7 5 Op 1 . + CDS 5833 - 6981 1025 ## BT_3714 hypothetical protein 8 5 Op 2 . + CDS 6978 - 7688 541 ## BT_3715 hypothetical protein 9 5 Op 3 . + CDS 7703 - 8095 348 ## COG0607 Rhodanese-related sulfurtransferase + Term 8265 - 8321 11.2 - Term 8251 - 8308 15.2 10 6 Op 1 . - CDS 8361 - 9317 916 ## COG0078 Ornithine carbamoyltransferase 11 6 Op 2 22/0.000 - CDS 9371 - 10624 1142 ## COG0014 Gamma-glutamyl phosphate reductase 12 6 Op 3 . - CDS 10651 - 11733 1041 ## COG0263 Glutamate 5-kinase - Prom 11754 - 11813 4.8 + Prom 11477 - 11536 3.9 13 7 Tu 1 . + CDS 11779 - 12990 962 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase + Term 12996 - 13028 -1.0 14 8 Op 1 . - CDS 12959 - 13192 284 ## BT_3721 hypothetical protein 15 8 Op 2 . 
- CDS 13266 - 14108 616 ## COG0796 Glutamate racemase - Prom 14128 - 14187 3.2 - Term 14133 - 14191 16.3 16 9 Op 1 . - CDS 14215 - 14730 623 ## BT_3723 putative outer membrane protein OmpH 17 9 Op 2 . - CDS 14786 - 15301 688 ## BT_3724 cationic outer membrane protein precursor 18 9 Op 3 1/0.000 - CDS 15325 - 17982 2731 ## COG4775 Outer membrane protein/protective antigen OMA87 19 9 Op 4 . - CDS 18013 - 18747 591 ## COG0020 Undecaprenyl pyrophosphate synthase 20 9 Op 5 . - CDS 18784 - 20226 717 ## BT_3727 hypothetical protein - Prom 20342 - 20401 6.9 21 10 Tu 1 . - CDS 20408 - 21445 582 ## COG0117 Pyrimidine deaminase - Prom 21666 - 21725 4.9 + Prom 21382 - 21441 3.4 22 11 Op 1 . + CDS 21499 - 22335 356 ## PROTEIN SUPPORTED gi|225874212|ref|YP_002755671.1| ribosomal protein L11 methyltransferase 23 11 Op 2 . + CDS 22332 - 22814 585 ## BT_3730 putative regulatory protein 24 11 Op 3 . + CDS 22892 - 23530 752 ## COG0461 Orotate phosphoribosyltransferase + Term 23562 - 23614 10.4 25 12 Op 1 . + CDS 23628 - 24038 422 ## BT_3732 hypothetical protein + Prom 24044 - 24103 3.8 26 12 Op 2 . + CDS 24140 - 25480 1325 ## COG0165 Argininosuccinate lyase + Term 25579 - 25645 30.0 + TRNA 25561 - 25633 84.5 # Gly GCC 0 0 + TRNA 25648 - 25734 61.5 # Leu TAA 0 0 + Prom 26041 - 26100 4.2 27 13 Tu 1 . + CDS 26254 - 27477 606 ## COG0665 Glycine/D-amino acid oxidases (deaminating) 28 14 Op 1 . - CDS 27615 - 27821 218 ## BT_1748 hypothetical protein 29 14 Op 2 . - CDS 27875 - 28165 286 ## BT_3736 hypothetical protein - Prom 28192 - 28251 3.9 - Term 28197 - 28253 8.8 30 15 Op 1 . - CDS 28274 - 28384 117 ## 31 15 Op 2 . - CDS 28463 - 28759 313 ## BT_3737 hypothetical protein - Prom 28859 - 28918 7.7 + Prom 28831 - 28890 6.2 32 16 Tu 1 . + CDS 29014 - 29115 66 ## + Prom 29520 - 29579 8.7 33 17 Tu 1 . + CDS 29655 - 33614 2464 ## COG5002 Signal transduction histidine kinase + Term 33695 - 33761 16.1 - Term 33685 - 33746 20.1 34 18 Op 1 . 
- CDS 33764 - 34117 345 ## COG0471 Di- and tricarboxylate transporters 35 18 Op 2 . - CDS 34102 - 35277 1136 ## COG0471 Di- and tricarboxylate transporters - Prom 35320 - 35379 8.3 + Prom 35305 - 35364 8.9 36 19 Op 1 . + CDS 35389 - 37383 1471 ## BT_3740 hypothetical protein 37 19 Op 2 . + CDS 37396 - 39204 1133 ## BT_3741 hypothetical protein 38 20 Op 1 . + CDS 39873 - 40004 154 ## 39 20 Op 2 . + CDS 39967 - 41481 1266 ## BT_3742 hypothetical protein + Term 41544 - 41598 9.1 + Prom 41647 - 41706 5.8 40 21 Tu 1 . + CDS 41767 - 43266 1011 ## BT_3742 hypothetical protein Predicted protein(s) >gi|226332187|gb|ACIC01000133.1| GENE 1 174 - 1577 1575 467 aa, chain - ## HITS:1 COG:TM0156 KEGG:ns NR:ns ## COG: TM0156 COG1785 # Protein_GI_number: 15642930 # Func_class: P Inorganic ion transport and metabolism # Function: Alkaline phosphatase # Organism: Thermotoga maritima # 1 466 1 421 434 226 35.0 1e-58 MKKLIYTLLFVLISVAANGQQAKYVFYFIGDGMGVNQVNGTEMYQAELQNGRIGVEPLLF TQFPVATMATTFSATNSVTDSAAAGTALATGKKTYNSAISVGEDKNPIETVAEKAKKAGK KVGVTTSVSVDHATPAAFYAHQADRNMNYEIAVDLTKANFDFYAGGGFLKPDKTYDKKDA PNIFPIFEEAGYTVARGYSDYKAKSKDAGKMILIQEEGKDPSCLPYAIDRKSDDLTLAQI TESAIDFLTKGKNKGFFLMVEGGKIDWACHGNDAATVFNEVKDMDDAIKVAYEFYKKHPK ETLIVVTADHETGGIVLGTGKYALNLKALQYQKHSADGLSRRISELRKSKGNKVTWEDMK EFLGEEMGFWKQFPISWEQEKKLRDEFEQSFVRNKVVFAESMYSKSEPMAARAKEVMDQI SMVGWVSGGHSAGYVPVFAIGAGSQLFGEKIDNTEIPKRIAKAAGYK >gi|226332187|gb|ACIC01000133.1| GENE 2 1681 - 2247 721 188 aa, chain - ## HITS:1 COG:MT2609 KEGG:ns NR:ns ## COG: MT2609 COG0231 # Protein_GI_number: 15842068 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) # Organism: Mycobacterium tuberculosis CDC1551 # 1 188 1 187 187 179 44.0 4e-45 MINAQDIKNGTCIRMDGKLYFCIEFLHVKPGKGNTFMRTKLKDVVSGYVLERRFNIGEKL EDVRVERRPYQFLYKEGEDYIFMNQETFDQHPIAHDLINGVDFLLEGAVLDVVSDASTET VLYADMPIKVQMKVTYTEPGMKGDTATNTLKPATVESGATVRVPLFISEGETIEIDTRDG SYVGRVKA 
>gi|226332187|gb|ACIC01000133.1| GENE 3 2440 - 2601 277 53 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29349118|ref|NP_812621.1| 50S ribosomal protein L34 [Bacteroides thetaiotaomicron VPI-5482] # 1 53 1 53 53 111 100 8e-24 MKRTFQPSNRKRKNKHGFRERMASANGRRVLAARRAKGRKKLTVSDEYNGQKW >gi|226332187|gb|ACIC01000133.1| GENE 4 3018 - 3668 661 216 aa, chain + ## HITS:1 COG:no KEGG:BT_3711 NR:ns ## KEGG: BT_3711 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 216 1 216 216 405 100.0 1e-112 MTIKEFFSFKTNKYFWVNILVMIVAVVLIVFGVMKWLDVYTRHGEAVVVPDVKGMTVDEA SKMFRNHGLVCVVSDSNYVKNMAAGIVLDVNPGIGQKVKEGRTIYLTINTLSIPLRAVPD VADNSSLRQAQAKVLAAGFKLTEIQLMNGEKDWVYGIKYQGRSLNAGDKVPLGAALTLMV GNGENEPLESDSMENVEGADEPVSSESPSSQDDSWF >gi|226332187|gb|ACIC01000133.1| GENE 5 3694 - 4767 1118 357 aa, chain + ## HITS:1 COG:BH2542 KEGG:ns NR:ns ## COG: BH2542 COG0564 # Protein_GI_number: 15615105 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthases, 23S RNA-specific # Organism: Bacillus halodurans # 26 346 1 300 305 245 44.0 8e-65 MIEEELPDELENDLDDIEPVGDESQLYEHFRVVVDKGQAMVRVDKYLFERIVNASRNRIQ KAAEDGFVMANGKPVKSSYKVKPLDVITVMMDRPRYENEIIPENIPLNIVYEDPYVMVVN KPAGLVVHPGHGNYHGTLVNALAWHMKDIPEYDANDPHVGLVHRIDKDTSGLLVIAKTPD AKTNLGIQFFNKTTKRRYRALVWGNVEQDEGTIVGSIARNPKDRMQMAVMSDPSMGKHAV THYRVLERLGYVTLVECILETGRTHQIRVHMKHIGHVLFNDERYGGHEILKGTHFSKYKQ FVNNCFDTCPRQALHAMTLGFVHPVTGEEMYFTSELPDDMTRLIEKWRGYISNREIE >gi|226332187|gb|ACIC01000133.1| GENE 6 4764 - 5738 1014 324 aa, chain + ## HITS:1 COG:HI1140 KEGG:ns NR:ns ## COG: HI1140 COG1181 # Protein_GI_number: 16273066 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Haemophilus influenzae # 2 321 5 303 306 176 31.0 4e-44 MKRTIAIVAGGDTSEFHVSLRSAQGIYSFIDKEKYTLYIVEMQGTRWEVQLPDGAKAPVD RNDFSFMLGTEKIRFDFAYITIHGTPGEDGRLQGYFDMMHIPYSCCGVLAAAITYDKFTC 
NQYLKAFGVRIAESLLLRQGQSVSDEDVMEKIGLPCFIKPSLGGSSFGVTKVKTKEQIQP AIVKAFEEAQEVLVEAFMEGTELTCGCYKTKDKTVIFPPTEVVTHNEFFDYDAKYNGQVD EITPARISDELTNRVQMLTSAIYDILGCSGIIRVDYIVTAGEKINLLEVNTTPGMTTTSF IPQQVRAAGLDIKDVMTDIIENKF >gi|226332187|gb|ACIC01000133.1| GENE 7 5833 - 6981 1025 382 aa, chain + ## HITS:1 COG:no KEGG:BT_3714 NR:ns ## KEGG: BT_3714 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 382 1 382 382 771 99.0 0 MDTTEFDEIRPYNDEELPQIFEELIADPAFQKAATDAIPNVPFELLAQKIRACKSKLDFQ ETFCYGILWKIAADHTDGLTLDHTAVPDKSKAYTYLSNHRDIILDSGFLSILLIDQGMDT VEIAIGDNLLVYPWIKKLVRVNKSFIVQRALTMRQMLESSARMSRYMHYTISEKKQSIWI AQREGRAKDSNDRTQDSVLKMLAMGGEGDLIDRLMEMNIAPLAISYEYDPCDFLKAQEFQ LKRDIEGYKKTTQDDLISMQTGLFGYKGKVHFQTAPCINDRLAQLDRSLPKQELFSSISA CIDQRIHSNYRIYSGNYVAYDLLKGTSEFAGHYTPEEKQRFVTYIEQQLGKIKIPNKDED FLRGKLLLMYANPLVNYLAACQ >gi|226332187|gb|ACIC01000133.1| GENE 8 6978 - 7688 541 236 aa, chain + ## HITS:1 COG:no KEGG:BT_3715 NR:ns ## KEGG: BT_3715 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 236 1 237 237 461 97.0 1e-129 MNKVKLHTLWLLGMLFLMTACGDDDYYYPSVKLEFVTVKAGADGLIQSLLPDKGELLTVA KDRTGSTISPNSARRVISNYEVNPEDATAVIYSLQSVVTPEPKGADDPAFEGGLKYDPVD VTSIWLGRDYLNMILNVKININSGKQHVFGMIEESVEAEGDETVVTLSLFHDANGDEENY NRRAYISVPLTKYVDEENPHQTIRIKFKYHTTDKSGVVVESSKYCDPGFEYVPGVN >gi|226332187|gb|ACIC01000133.1| GENE 9 7703 - 8095 348 130 aa, chain + ## HITS:1 COG:MA0746 KEGG:ns NR:ns ## COG: MA0746 COG0607 # Protein_GI_number: 20089631 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Methanosarcina acetivorans str.C2A # 27 125 39 146 151 65 37.0 3e-11 MFKMNQLVVGLFLFLSSLFSCQQKGDFKTVNVEDFDSIIQDEEVQRLDVRTLAEYSEGHI AKTININVMDDSFASMADSLLQKNKPVALYCRSGKRSKKAAAILSEKGYKVVELGKGFNA WQAAGKEIEY >gi|226332187|gb|ACIC01000133.1| GENE 10 8361 - 9317 916 318 aa, chain - ## HITS:1 COG:XF0998 KEGG:ns NR:ns ## COG: XF0998 COG0078 # Protein_GI_number: 15837600 # 
Func_class: E Amino acid transport and metabolism # Function: Ornithine carbamoyltransferase # Organism: Xylella fastidiosa 9a5c # 20 302 19 322 336 193 37.0 3e-49 MKKFTCVQDIGDLKSALAEAFEIQKDRFKYVELGRNKTLMMIFFNSSLRTRLSTQKAALN LGMNVMVLDINQGAWKLETERGVIMDGDKPEHLLEAIPVMGCYCDIIGVRSFARFEDRDF DYQETILNQFIQYSGRPVFSMEAATRHPLQSFADLITIEEYKKTARPKVVMTWAPHPRPL PQAVPNSFAEWMNATDYDFVITHPEGYELAPQFVGNAKVEYDQMKAFEGADFIYAKNWAA YTGDNYGQILSKDREWTVSDRQMAVTNNAFFMHCLPVRRNMIVTDDVIESPQSIVIPEAA NREISATVVLKRLIEGLE >gi|226332187|gb|ACIC01000133.1| GENE 11 9371 - 10624 1142 417 aa, chain - ## HITS:1 COG:SP0932 KEGG:ns NR:ns ## COG: SP0932 COG0014 # Protein_GI_number: 15900812 # Func_class: E Amino acid transport and metabolism # Function: Gamma-glutamyl phosphate reductase # Organism: Streptococcus pneumoniae TIGR4 # 9 416 8 419 420 345 46.0 1e-94 MTTNLNGTFAAVQAASRELALLSDDTINQILNAVADAAIAETPFILSENEKDLARMDKSN PKYDRLKLTEERLKGIAADTRNVATLPSPLGRILKETSRPNGMKLTKVSVPFGVIGIIYE ARPNVSFDVFSLCLKSGNACILKGGSDADDSNRAIISVIHKVLEKFHVNPHIVELLPADR EATAALLNATGYVDLIIPRGSSNLINFVRENARIPVIETGAGICHTYFDEFGDTRKGADI IHNAKTRRVSVCNALDCTIIHEKRLGDLPALCDQLKKSKVTIYADTQAYQALEGHYPAEL LQPATPESFGREFLDYKMAVKTVKSFEDALGHIQENSSRHSECIVTENKERAALFTKIVD AACVYTNVSTAFTDGAQFGLGAEIGISTQKLHARGPMGLEEITSYKWVIEGDGQTRW >gi|226332187|gb|ACIC01000133.1| GENE 12 10651 - 11733 1041 360 aa, chain - ## HITS:1 COG:BS_proJ KEGG:ns NR:ns ## COG: BS_proJ COG0263 # Protein_GI_number: 16078908 # Func_class: E Amino acid transport and metabolism # Function: Glutamate 5-kinase # Organism: Bacillus subtilis # 7 345 9 346 371 198 34.0 1e-50 MKQEFTRIAVKVGSNVLARRDGTLDVTRMSALVDQIAELNKSGVEIILISSGAVASGRSE IHPQKKLDSVDQRQLFSAVGQAKLINRYYELFREHGIAVGQVLTTKENFSTRRHYLNQKN CMTVMLENGVIPIVNENDTISVSELMFTDNDELSGLIASMMDAQALIILSNIDGIYNGSP SDPASAVIREIEHGKDLSNYIQATKSSFGRGGMLTKTNIARKVADEGITVIIANGKRDNI LVDLLQQPDDTVCTRFIPSTEAVSSVKKWIAHSEGFAKGEIHINECATDILSSEKAASIL PVGITHIEGEFEKDDIVRIMDFQGNQVGVGKANCDSAQAREAMGKHGKKPVVHYDYLYIE >gi|226332187|gb|ACIC01000133.1| GENE 13 11779 - 12990 962 403 aa, 
chain + ## HITS:1 COG:MA0636 KEGG:ns NR:ns ## COG: MA0636 COG0436 # Protein_GI_number: 20089523 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Methanosarcina acetivorans str.C2A # 27 400 13 390 394 421 53.0 1e-118 MLQKQAEKNYLCSLTDKRKTMKHTNPQVEQMTSFIVMDVLERANELQKQGVDVIHLEVGE PDFDVPACVAEAAKAAYDRHLTHYTHSLGDPELRREIAAFYQREYGVTVDPDCIVVTSGS SPSILLVLMLLCNSDSEVILSNPGYACYRNFVLAAQAKPVLVPLSEENGLQYDIEAIRKC VTPHTAGIFINSPMNPTGMLLDESFLKSVASLGVPIISDEIYHGLVYEGRAHSILEYTDK AFVLNGFSKRFAMTGLRLGYLIAPKSCMRSLQKLQQNLFICASSIAQQAGIAALRQADSD VERMKQIYDERRRYMISRLREMGFEIKVEPQGAFYIFADARKFTTDSYRFAFDVLENAHV GITPGIDFGTGGEGYVRFSYANSLESIREGLDRISQYLSRCGF >gi|226332187|gb|ACIC01000133.1| GENE 14 12959 - 13192 284 77 aa, chain - ## HITS:1 COG:no KEGG:BT_3721 NR:ns ## KEGG: BT_3721 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 77 1 77 77 139 98.0 3e-32 MKEEDYSRAIEVFSGSPWEAEIIKGLLESNDIRCVIKDGIMGTLAPYIAPSVSVLVTEDQ YEAATELIRSRNEKDTD >gi|226332187|gb|ACIC01000133.1| GENE 15 13266 - 14108 616 280 aa, chain - ## HITS:1 COG:BS_racE KEGG:ns NR:ns ## COG: BS_racE COG0796 # Protein_GI_number: 16079891 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glutamate racemase # Organism: Bacillus subtilis # 11 280 5 267 272 173 38.0 4e-43 MKQHLSHTPGPIGVFDSGYGGLTILNKIREALPEYDYIYLGDNARTPYGTRSFEIVYEFT LQAVNKLFEMGCHLVILACNTASAKALRSIQMNDLPGIDPERRVLGVIRPTVECIGDITQ SRHIGVLATAGTIKSESYPLEIHKLFPDIQVSGIACPMWVPLVENNESQNEGADYFIRKY IDQLLSKDPEIDTMILGCTHYPILLPKIQKYTPKHIRIVAQGEYVAESLKDYLSRHPEMN AKCTQNGSCLFYTTEAEEKFVESASSFLNQQIDVKRISLE >gi|226332187|gb|ACIC01000133.1| GENE 16 14215 - 14730 623 171 aa, chain - ## HITS:1 COG:no KEGG:BT_3723 NR:ns ## KEGG: BT_3723 # Name: not_defined # Def: putative outer membrane protein OmpH # Organism: B.thetaiotaomicron # Pathway: not_defined # 15 171 1 157 157 216 100.0 2e-55 MLKKIALVLMLALPMGVFAQNLKFGHINAQEIVSAMPEFAKAQSDIEALDKQLTSELQRT 
QEEFNKKYQEFQQAIAKDSLPANIAERRQKELQDMMQRQEQFQQEAQQQMQKAQADAMAP IYKKLDDAIKAVGAAEGVIYIFDLARTPVAYVNESQSINLTPKVKTQLGIK >gi|226332187|gb|ACIC01000133.1| GENE 17 14786 - 15301 688 171 aa, chain - ## HITS:1 COG:no KEGG:BT_3724 NR:ns ## KEGG: BT_3724 # Name: not_defined # Def: cationic outer membrane protein precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 171 1 171 171 271 100.0 6e-72 MRKSVLSIMLLFAISMAASAQKFALIDTEYILKNIPAYQSANEQLQEATKKYQSEVEVIA KEAQKMFQDYQAQSSTLSAAQKTKKEDEIVAKEKSAAELKRKYFGPEGELAKMQEKLINP IQDEIYGAVKELSQLHGYDLVLDRASAAGIIFANPRIDISDEVLRKLGYSN >gi|226332187|gb|ACIC01000133.1| GENE 18 15325 - 17982 2731 885 aa, chain - ## HITS:1 COG:RSc1412 KEGG:ns NR:ns ## COG: RSc1412 COG4775 # Protein_GI_number: 17546131 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein/protective antigen OMA87 # Organism: Ralstonia solanacearum # 46 885 30 765 765 141 22.0 7e-33 MHYRISFIFVTFICLFCFAATGFTQNTNTDEDSKPVILYSGTPKKYEIADIKVEGVKNYE DYVLIGLSGLSVGQTITVPGDEITGAIKRYWKHGLFSNVQITAEKIEGNKIWLKISLTQR PRIADVRYHGVKKSERTDLESKLGMVKGMQITPNTVDRAKTLIKRYFDDKGFKNAEVIIA QKDDPSNENQVIVDIDIDKKEKIKVHAIHITGNSAIKTSKLKKVMKKTNEKGKLLNLFRT KKFVPENFEADKQLIIDKYNELGYRDAMIVKDSVAQYDEKTVDVYMDIDEGQKYYLRNVT WVGNTLYPSEQLNFLLRMKKGDVYNQKLLGERTSTDDDAIGNLYYNNGYLFYNLDPVEVN IVGDSIDLEMRIYEGRQATINKINISGNDRLYENVVRRELRIRPGQLFSKDDLMRSLREI QQMGHFDPEKLQPDIQPDPVNGTVDIGLPLTSKANDQVEFSAGWGQTGIIGKLSLKFTNF SVANLLRPGENYRGILPQGDGQTLTISGQTNAKYYQSYSISFFDPWFGGKRPNSLSVSAF FSVQTDISSRYYNSSYFNNYYNSMYSGYGGYGMYNYGNYNNYENYYDPDKSIKMWGLSLG WGKRLKWPDDYFTLSAELAYQRYNLKDWQYFPVTNGKCNDLSLSLTLARNSIDNPIFPRT GSDFSLSVQLTPPYSLFDGKDYKGYFYDPTDDRGITQDNMNKLHRWVEYHKWKFKAKTYT PLMDYIAHPKCLVLMTRTEFGLLGHYNKYKKSPFGTFDVGGDGMTGYSSYATESIALRGY ENSSLTPYGKEGYAYARLGIELRYPLMLETSTNIYVLGFLEAGNAWHDISKFNPFDLKRS AGIGVRIFLPMIGMMGIDWGYGFDKINGSKEYGGSQFHFILGQEF >gi|226332187|gb|ACIC01000133.1| GENE 19 18013 - 18747 591 244 aa, chain - ## HITS:1 COG:STM0221 KEGG:ns NR:ns ## COG: STM0221 COG0020 # Protein_GI_number: 16763611 # 
Func_class: I Lipid transport and metabolism # Function: Undecaprenyl pyrophosphate synthase # Organism: Salmonella typhimurium LT2 # 14 239 18 245 252 242 48.0 6e-64 MSYIEKIDKNRIPQHIAIIMDGNGRWAKQRGKERTYGHQAGAETVHKIIEDAARLGVKYL TLYTFSTENWNRPQEEVAALMNLLVDSIEEETLMKNNIRFRIIGDIKKLPAEVQEGLSRC IEHTANNTGTCLVLALSYSSRWEMTEAVRQIATLAKTGEISPEQITDEYITAHLTTNFMP DPDLLIRTGGEIRLSNYLLWQCAYSELYFCDTFWPDFDKEEFCKAIYEYQQRERRFGKTS EQIS >gi|226332187|gb|ACIC01000133.1| GENE 20 18784 - 20226 717 480 aa, chain - ## HITS:1 COG:no KEGG:BT_3727 NR:ns ## KEGG: BT_3727 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 480 14 493 493 931 99.0 0 MVSFVITSCLDNDNEVNYSPDATIRAFELDTIGYGVNYKFTIDQVSCLIYNVDSLPVNAD TIINSILIKTLTTASGIVTMKDQNDQDSIVNINDSIDLTKYVNATEKNNFLVLKVWAPNM EVQNEYKVNIRMHTMVPDSLSWGKDPIANNPVRNTAEKQKVVTLGDKILLFAQNNEIYST AIPAGSPTDRLNYGQKWDKETTGKLPDGADVTSIIRFVDKLYLLTKNKEVYNSNDGLTWT KDEVLNSDGVSVTNLITSFSDSDGSNHKKINGIAGIVEINGEKYFSFAEKDVTWEKDIDK LTVVPAEFPINNLSADVYATESGTLNAIVVGNTEDGLDNDTATVVWASEDGKAWIPMEIP SNNNCPKLVDPSIIHYNDAFYICGKETKDDAKGFQKFYTSPTLLVWKGVDRMFMLPGILP PVKLEGGVTQHPSLHESSFKGKEVNYTMVVDRNHYIWMVGGQGIDKIWRGRVNKLGFLIQ >gi|226332187|gb|ACIC01000133.1| GENE 21 20408 - 21445 582 345 aa, chain - ## HITS:1 COG:BH1554_1 KEGG:ns NR:ns ## COG: BH1554_1 COG0117 # Protein_GI_number: 15614117 # Func_class: H Coenzyme transport and metabolism # Function: Pyrimidine deaminase # Organism: Bacillus halodurans # 1 143 1 141 143 159 52.0 8e-39 MEEEKYMRRCIELAKNGLCNVAPNPMVGAVIVCDGLIIGEGYHIRCGEAHAEVNAIRSVK DKSLLSRSTIYVSLEPCSHYGKTPPCADLIIEKQIPRIVIGCQDPFSQVAGRGIQKLRDA GREVTVGVLEKECRYLIRRFITFNTLHRPFITLKWAESADHFIDVERTDGNPVVLSSPLT SMLVHKKRAETDAIMVGRRTALLDNPSLTVRNWYGRNPVRIVLDRALSLPHSLRLFDGEV PTIVFTAEKHPDKKNVTYRTIEFTQDILPQIMKVLYQQKIQSLLVEGGSQLLQSFIDAEL WDEVYIEKCPCKLNSGVKAPEISDNFSYSTEKHFDRQIWHYVFEK >gi|226332187|gb|ACIC01000133.1| GENE 22 21499 - 22335 356 278 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225874212|ref|YP_002755671.1| ribosomal protein L11 
methyltransferase [Acidobacterium capsulatum ATCC 51196] # 27 278 42 290 294 141 38 6e-33 MNRITAYIRQSLQDIYPPEEVKALSMLICCDMLGVDALDIYMGKDIILSACKQRELENII FRLQKNEPIQYIRGYAEFCGRNFRVAPGVLIPRPETAELVDLIVKENPDARRLLDIGTGS GCIAISLDKNLPDAKVDAWDISEEALAIARKNNEELDAQVTFRRQDVFSADGIQGTSYDI IVSNPPYVTETEKTEMEANVLDWEPELALFVPDEDPLRFYRRIAELGRELLRPGGKLYFE INQAYGQDMIRMIEMNQYRDVRVIKDIFGKDRILTANR >gi|226332187|gb|ACIC01000133.1| GENE 23 22332 - 22814 585 160 aa, chain + ## HITS:1 COG:no KEGG:BT_3730 NR:ns ## KEGG: BT_3730 # Name: not_defined # Def: putative regulatory protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 160 1 160 160 289 100.0 2e-77 MSAQLTDEEALNRVASYCSAAEHCRAEVNEKLQRWGIAYETIARILERLETEKYIDDERY CRAFVNDKFRFAKWGKMKIAQGLYMKKIPSDVAWRHLNEIDEEEYLSILRDLLASKRKSI HAKDDYELNGKLMRFAVSRGFELKDIRRCIEIPEEEEQFS >gi|226332187|gb|ACIC01000133.1| GENE 24 22892 - 23530 752 212 aa, chain + ## HITS:1 COG:lin1945 KEGG:ns NR:ns ## COG: lin1945 COG0461 # Protein_GI_number: 16801011 # Func_class: F Nucleotide transport and metabolism # Function: Orotate phosphoribosyltransferase # Organism: Listeria innocua # 3 208 2 207 209 225 53.0 4e-59 MKNLERLFAEKLLKIKAIKLQPANPFTWASGWKSPFYCDNRKTLSYPSLRNFVKIEITRL ILERFGQVDAIAGVATGAIPQGALVADALNLPFVYVRSTPKDHGLENLIEGELRPGMKVV VVEDLISTGGSSLKAVEAIRRDGCEVIGMVAAYTYGFPVAEEAFKNAKVPLVTLTNYEAV LDVALRTGYIEEEDIATLNDWRKDPAHWDAGK >gi|226332187|gb|ACIC01000133.1| GENE 25 23628 - 24038 422 136 aa, chain + ## HITS:1 COG:no KEGG:BT_3732 NR:ns ## KEGG: BT_3732 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 136 1 136 136 246 100.0 2e-64 MANFESSVKVIPYSQERVYAKLSDLSNLESVKGRLPEDKIQDLSFDSDTLSFSVSPIGQL TLQIVEREPCKCIKLATTNSPLPFNMWIQLVSTAEEECKLKVTISMDINPFMKAMVQKPL QDGLEKMVEMLSMINY >gi|226332187|gb|ACIC01000133.1| GENE 26 24140 - 25480 1325 446 aa, chain + ## HITS:1 COG:XF1003 KEGG:ns NR:ns ## COG: XF1003 COG0165 # Protein_GI_number: 15837605 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate 
lyase # Organism: Xylella fastidiosa 9a5c # 1 422 6 430 445 248 34.0 2e-65 MAQKLWEKSVQVNKDIERFTVGRDREMDLYLAKHDVLGSMAHITMLESIGLLTKEELEQL LAELKTIYASVERGEFIIEEGVEDVHSQVELMLTRRLGDVGKKIHSGRSRNDQVLLDLKL FTRTQIREIAEAVEQLFHVLILQSERYKNVLMPGYTHLQIAMPSSFGLWFGAYAESLVDD MQFLQAAFRMCNRNPLGSAAGYGSSFPLNRTMTTDLLGFDSLNYNVVYAQMGRGKLERNV AFALATIAGTISKLAFDACMFNSQNFGFVKLPDDCTTGSSIMPHKKNPDVFELTRAKCNK LQSLPQQIMMIANNLPSGYFRDLQIIKEVFLPAFQELKDCLQMTTYIMNEIKVNEHILDD DKYLFIFSVEEVNRLAREGMPFRDAYKKVGLDIEAGKFTHDKQVHHTHEGSIGNLCNDEI SALMQQVVDGFNFCGMEQAEKALLGR >gi|226332187|gb|ACIC01000133.1| GENE 27 26254 - 27477 606 407 aa, chain + ## HITS:1 COG:mlr3020 KEGG:ns NR:ns ## COG: mlr3020 COG0665 # Protein_GI_number: 13472653 # Func_class: E Amino acid transport and metabolism # Function: Glycine/D-amino acid oxidases (deaminating) # Organism: Mesorhizobium loti # 1 402 5 403 403 246 32.0 6e-65 MDLHSGLPYWVVKNSLLDYFHPLEDDFSTDIVVVGSGITGALMVHELCSAGLRCCMVDKR SIATGSSIASTALLQYEIDVPLCEMAEIIGEDNAVSAYRASLASIADIEKVLKETGVDAD FEKRPSLFYASIPKDIELIEKEYVIRKKHNLPVRLLGKEEIKKLYNIEVPGNALLNRVSA QMDAYKATTGLLLYHMKKDGLEIFTHTGVTECVEMPEGYIIETDRGHKIECKYVIIAAGF EAGKFLSREIMDLTSTYALVSHPVDSKDLWPEQCLIWETAEPYLYIRTTRGNRIIVGGED EKFSDPERRDALLRKKTLVLEKKFRRLFPSIPFKTEMAWCGTFSTTKDGLPFIGNCPDKD RMFFDLGYGGNGITFSMIGAQIICKKLQGIDDERGRIFGYERIEKYW >gi|226332187|gb|ACIC01000133.1| GENE 28 27615 - 27821 218 68 aa, chain - ## HITS:1 COG:no KEGG:BT_1748 NR:ns ## KEGG: BT_1748 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 61 1 61 62 73 72.0 3e-12 MVGFVIWIVGLVLTIKAALEIWRIHAPIERRLIAIILIVLTSWIGLLFYYFYGKARMPQW LGTGVQKY >gi|226332187|gb|ACIC01000133.1| GENE 29 27875 - 28165 286 96 aa, chain - ## HITS:1 COG:no KEGG:BT_3736 NR:ns ## KEGG: BT_3736 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 96 1 96 96 101 100.0 8e-21 MKKLLFAIFVVTTIISFTSCRNKKEQDKAKQQVEKVKENVNDAVDKVSDKIEDGADAVKD AWKDTKKDVQNSAEKTKKEIKEGYNEVKKDVSKKLD >gi|226332187|gb|ACIC01000133.1| 
GENE 30 28274 - 28384 117 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNKKTEKKQEEAKGTKKTAQAKECKSSTSSKKTEKK >gi|226332187|gb|ACIC01000133.1| GENE 31 28463 - 28759 313 98 aa, chain - ## HITS:1 COG:no KEGG:BT_3737 NR:ns ## KEGG: BT_3737 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 98 1 98 98 150 100.0 2e-35 MENRTSDLWLGLGIGSVIGALVYRFSRTSKAKKLKKKVCDAFHKISGQVEEMLDTAKEKV LDTGAAVADKVADKTFDLAEKADDLKGKMHTIAADAKK >gi|226332187|gb|ACIC01000133.1| GENE 32 29014 - 29115 66 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIELTINLIHKFELILFMAVDFIDETMFKTYFF >gi|226332187|gb|ACIC01000133.1| GENE 33 29655 - 33614 2464 1319 aa, chain + ## HITS:1 COG:BH4026 KEGG:ns NR:ns ## COG: BH4026 COG5002 # Protein_GI_number: 15616588 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 789 1022 361 598 607 129 30.0 4e-29 MFIVLLISSAVSKGQIYKYIGLEDGLNNQKIYHIQKDRRGYMWFLTQEGVDRYDGKHIKH YTFSDDSMKLDSRIALNWLYMDQEDVLWVIGQKGRIFRYDSQHDKFELAYVHPELIRNKF QAFLDYGYLDKNDHIWLCYKDTITWYNIRTGTTQHMPKPVDGAISTIEQADGGHFFIGTG SGLFCVGIEKGELKQIPDEVVKSIVTPVHELYYHTVSKQLFVGSYKEGILIYDIGKTQKI TPCYAPNNVEVNQIVALNADELLIATGGKGVYKLNVNTCKSEPYITADYSSYNGMNGNNI NDIYVDEEERIWLANYPTGITIRNNRYGRYDLIKHSLGNNRSLVNDQVHDVIEDSDGDLW FATSNGISFYQTDTREWCSFFSSFDPVPDDENHIFLALCEVSPGVIWAGGFTSGIYKIEK KKGFKISYLSPAAIAGIRPDQYIYDIKKDSSGDVWSGGYYHLKRINLETKNVRLYPGVSS ITTILEKNARQMWIGTRMGLYQLDKQSGVYRYIDLPVESPYICALYQREDGILYIGTRGA GLLVYDINKKKFIHQYRTDNCALISDNIYTIIPRQDENLLMGTENGITIYSPKDHFFRNW TREQGLMSVNFNAGSATTYSNSTLVFGGNDGAVKFPTDIEIPEPYYSRLLLRDFMIAYHP VYPGDDGSPLKKDINETDCLELAYGQNTFSLDVASINYDYPSNILYSWKIDGYHKEWTRP SQDNRIVVRNLPPGSYTLQIRTVSNEEKYKTYETRSIQIIITPPVWASMWAMVGYAILVV LIMIIIFRVIMLHKQKKISDEKTRFFINTAHDIRTPLTLIKAPLEEVVENHMVAEKALPH MNMALKNVNTLLQLTTNLINFERIDVYSSTLYVSEYELNSYMNDVCATFRKYAEMKRVRF VYESNFDYLNVWFDSDKMGSILKNILSNALKYTPENGSVCICACEEGNTWSIEVKDTGIG IPSSEQKKLFRNCFRGSNVVNLKVTGSGIGLMLVYKLVKLHKGKIHIQSVEHQGTCVQIT 
FPKGNTHLHKAKFISPKTPNERMDAVVLGGTSDLPVLEAPQIKTSLQRILVVEDNDDLRN YLVDMLMAGYNIQSCSNGKDALVIIKEFNPDLVISDIMMPEMSGDELCSAIKTSVEMSHI PVILLTALGDEKNILGGLEIGADAYITKPFSVGILKVTIKNILANRELLRQVYNSIENKE KHLPVNCTNTLDWKFIASVKECIEKNMGDLDFGVDVLSNQHHMSRTSFYNKLKALTGYAP ADYIRMIRLQRAAQLLKQKEYTITEIAEIVGFSDAKYFREVFKKYYNVSPSKFVNSDQE >gi|226332187|gb|ACIC01000133.1| GENE 34 33764 - 34117 345 117 aa, chain - ## HITS:1 COG:VC1314 KEGG:ns NR:ns ## COG: VC1314 COG0471 # Protein_GI_number: 15641326 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Vibrio cholerae # 2 111 374 482 487 94 53.0 7e-20 MAPVLMIVGSGLICYAMANFISHTATAALLVPILAIAGISMRENLSSLGGVETLLIGVAI GSSLAMILPISTPPNALAHATGMIQQKDMEKVGIIMGIIGLILGYTMLIILGSNKLL >gi|226332187|gb|ACIC01000133.1| GENE 35 34102 - 35277 1136 391 aa, chain - ## HITS:1 COG:VC1314 KEGG:ns NR:ns ## COG: VC1314 COG0471 # Protein_GI_number: 15641326 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Vibrio cholerae # 33 391 22 376 487 331 51.0 2e-90 MYKIFHGFHLVEAYQDLKKAKRLAKNQTVARCIKLTIAITLSLILWFLPIDTFGIEGLTV IEQRLISIFIFATLMWVFEAVPAWTTSVLIVVLLLLTVSDSSLWFLTQNTPAEELGQTVK YKSILHCFADPIIMLFIGGFILAIAATKSGLDVLLARVMLRPFGTQSRYVLLGFILVTAV FSMFLSNTATAAMMLTFLTPVLKALPADGKGKIGLAMAIPVAANVGGMGTPIGTPPNAIA LKYLNDPEGLNLNIGFGEWMSFMLPYTIIVLFIAWFILLRLFPFKQKSIELQIEGEAKKD WRSIVVYITFAITVLLWMFDKFTGVNSNVVAMIPVAVFCITGVITKRDLEEISWSVLWMV AGGFALGVALQETGLAKHMIEAIPFSTWPLY >gi|226332187|gb|ACIC01000133.1| GENE 36 35389 - 37383 1471 664 aa, chain + ## HITS:1 COG:no KEGG:BT_3740 NR:ns ## KEGG: BT_3740 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 36 664 1 629 629 1242 99.0 0 MQMTCCIEKSSIFAAAFGKCIYKVSNIQYYFNQKNMNYSCRKTIVPIIIGTLLSGACSND EPTGGKGHQQTYSVVLKGITVAGEESSEELKDVSVFQFSDGNLYKEEQLTPGQGGQSEIS AVSGSRLYFLTGLEIPAGEKAKSEEEFRNTIIGEGLHDNSAPDFMAAVVELESGVVTRSN AEVNVIMKRGVARIDLNTTADSKTQIKEVIVENAPAETLPFLENVRASDKTVSYRKEFSS 
AFDGKQEGVFRLFESTRPVNIILRGTYGEVPIRLKVELPVVERNKVYELAVLNVGAEVTG VFEIKPWEEGETIVGKPDTNQRLLLNASKSRIPEGVKVDYENNVLEVPATGADDMTLAFV TDTRIDISSTEGAGSGTSVGNMSVSEEAEGIVSSFNVSVAAQGSGRLGYTVLVHLKNALL SGTYDYVEIRVAPSDKQIETVEIAGNVWMAFNARSRDLEDQIYPLDGATVEDMYHKSWIN TVGGLFQFGRLYMYVPWQGYNPSNNLGNQTADAPWVNDTHTPCPEGYRIPTGNEWQSLLP ADQEIPGRYKAGNGETIAATLHIGEGTLITPSSGVTGTQHYVKFTSEDTGRSLIIPLAGS KGDKSSSNNPAFGKRAVLWTNERNGLPGGYAWAYWLPFEGAESTVIKKQRLQMEAFASVR CVKK >gi|226332187|gb|ACIC01000133.1| GENE 37 37396 - 39204 1133 602 aa, chain + ## HITS:1 COG:no KEGG:BT_3741 NR:ns ## KEGG: BT_3741 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 602 1 602 602 1232 99.0 0 MKMLRIIMILLGALLLTNCSGDFEQETGIVPSHSGQVSFLFGLQQDDASVVQTRGSKPEQ ITRMWYAIANERGEIIKPLYQKLEENFSKLTIEGLSMGDYTVVFLATTSEEEDEAVKEPE KLSDDWLINHAENAPLDAVYFYKKIELHIGRDQASVSHTVVLERSVGRVDVDLNVSSDYM WRFIRKIDITFDDAEGIYATLGADGKYSGTRKIESYDITGKYSFYSLPGTKALSGFVTIE SDRSDGTQFVRKYRFTDCKIEAGRVSHISIDYLHPENQDGSLYVRKEDFFRFRTDTMFLA SEPREVFYDSRRRSFYANAPLQVSISDEHQLLVKFFSPVGIQDVKIMCRFNKFSMEFFEL AHFEQIYPFMEASFPLPVVDSERTFTTSSGRKIVVPAQPGLSNDDVTLVIRTEDPFMKKI EQIDSRWFIRFSSYSADNGHAYWRHMNPLLCRHGVALAVNMAFMFSSEEFNMEMNKYEGL LKDNGGNPINLDALRQRIRNHGGLVLGCVAGVGGLGGGNTYGLANYCYTGVYFDATPPDA HPHNYPRQAMFHEYGHCLGYNHSSTMTYGDQWTVLCATVFVNMGKNGKLPVCSKEIIAQL PM >gi|226332187|gb|ACIC01000133.1| GENE 38 39873 - 40004 154 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPRMALVVVILNLLLCVILCDKKEFIINYEDYEEKIVVFVYAY >gi|226332187|gb|ACIC01000133.1| GENE 39 39967 - 41481 1266 504 aa, chain + ## HITS:1 COG:no KEGG:BT_3742 NR:ns ## KEGG: BT_3742 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 504 4 505 505 639 71.0 0 MKKRLLYLFMLISSVSLFMSCSDDDDVKYPVDSELAGAYKGTMDVYYVGVSTPIASDMVQ KVYISKASDTAVKLELRNFTITVAGTELLIGDITVDNCALTKSGDAYKFSGNQTLSLVVG SCNTSVSGTVGKNAVDMVIDVDVEGGMKVKVNYKGTKLSGSENTEAKITDFTIDDEVITV APVIDDAKSAITFKVNDEATEEDLKALVPVIAVSEKATVTPASGVAQDFSGGKSVVYTVV 
AEDGTSRKYTASIEGVQNIMKYSLDEWSEFDAGNSYENYWTPEPAGFLATSNGGAKMLNG SSSTVKVGYPVMEESEGFSGKAAKLVTLDSRAHVLGSMAPITSGSLFTGTFSLNMLAPLK STKFGIAYDKEPKLLKGVYKYKAGTNYIDGSVKPAQEGLNIIDECSIAAVLYEAKDAAGK DVTLTGVDINTSEYRVAEARLKDGTDKDEWTTFELAFEYLPEKSYDSTKEYKLAIICSSS KEGDKFKGAANSTLIVDELEVIGE >gi|226332187|gb|ACIC01000133.1| GENE 40 41767 - 43266 1011 499 aa, chain + ## HITS:1 COG:no KEGG:BT_3742 NR:ns ## KEGG: BT_3742 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 499 4 505 505 531 60.0 1e-149 MKKSLLYLFMFVCSVSLFSSCGDDDDVKYPVDSELAGAYKGKMDVYYVGVSTPIASDMVQ KVYISKASDTAIKLELKNFVINVAGTDITIGDIAVDNCALKQDGEAFQFSGSQTLELVVG SCNTSVSGTIGNGTIDMVINVDVAGGGMKVKVNYRGSRLSGNESVEAKITSFTFDSELVT SQPVIDEENKTITFKVSEDATPEELKTLAPTITVSDKATVTPGSGVAQNFAGNVVYTVVA EDGTTNQYTVSIAAKTSVLKFSFEEWENVPGSPWANEYDKPLPTDVLATSAEGAAMLKLM GVTTMPVYKTDDKKEGEYAIKLVTMDTSAKANALVPAITSGSVFTGKFDMDFLEQGKLYC TRFGVLYDKKPVVFKGWYKYTPGEKFIDGTDVNNIVEVKDRIDECAIQAVLYKVDTDDEV LTGFDINTSEKRVAVAALSDKTAKVDYTYFEIPFEFLKDYEEGAKYKLAIVCSSSKEGDL FKGAGGSTLILDELEVMGE Prediction of potential genes in microbial genomes Time: Thu May 12 02:59:15 2011 Seq name: gi|226332186|gb|ACIC01000134.1| Bacteroides sp. 1_1_6 cont1.134, whole genome shotgun sequence Length of sequence - 111984 bp Number of predicted genes - 77, with homology - 77 Number of transcription units - 33, operones - 16 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 56 - 841 606 ## BT_3744 hypothetical protein 2 1 Op 2 . - CDS 854 - 1984 918 ## BT_3745 hypothetical protein - Prom 2053 - 2112 4.9 3 2 Tu 1 . - CDS 2147 - 2323 138 ## gi|298386851|ref|ZP_06996406.1| hypothetical protein HMPREF9007_03607 - Prom 2522 - 2581 6.0 + Prom 3100 - 3159 5.6 4 3 Tu 1 . + CDS 3327 - 3740 317 ## BT_3746 hypothetical protein + Term 3806 - 3846 4.2 - Term 3794 - 3834 4.2 5 4 Tu 1 . - CDS 3862 - 4062 161 ## BT_3747 hypothetical protein - Prom 4088 - 4147 5.1 + Prom 4297 - 4356 8.8 6 5 Tu 1 . 
+ CDS 4387 - 4929 457 ## BT_3748 RNA polymerase ECF-type sigma factor + Term 4989 - 5027 3.8 + Prom 4954 - 5013 4.3 7 6 Tu 1 . + CDS 5084 - 6052 723 ## COG3712 Fe2+-dicitrate sensor, membrane component + Prom 6147 - 6206 4.1 8 7 Op 1 . + CDS 6233 - 9565 3250 ## BT_3750 hypothetical protein 9 7 Op 2 . + CDS 9579 - 11144 1400 ## BT_3752 hypothetical protein 10 7 Op 3 . + CDS 11168 - 12271 1018 ## BT_3753 endo-beta-N-acetylglucosaminidase F2 precursor (mannosyl-glycoprotein endo-beta-N-acetyl-glucosaminidase F2) 11 7 Op 4 . + CDS 12302 - 13492 748 ## BT_3754 hypothetical protein + Term 13557 - 13595 4.2 - Term 13543 - 13582 8.2 12 8 Op 1 1/0.333 - CDS 13607 - 15262 1812 ## COG0365 Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases 13 8 Op 2 . - CDS 15270 - 15824 544 ## COG1396 Predicted transcriptional regulators 14 8 Op 3 . - CDS 15864 - 16637 954 ## COG0345 Pyrroline-5-carboxylate reductase - Prom 16663 - 16722 3.8 - Term 16690 - 16739 -0.9 15 9 Op 1 1/0.333 - CDS 16773 - 17894 1162 ## COG4992 Ornithine/acetylornithine aminotransferase 16 9 Op 2 . - CDS 17908 - 18876 838 ## COG0002 Acetylglutamate semialdehyde dehydrogenase 17 9 Op 3 . - CDS 18873 - 20081 1635 ## COG0137 Argininosuccinate synthase 18 9 Op 4 . - CDS 20095 - 20673 479 ## BT_3761 hypothetical protein 19 9 Op 5 . - CDS 20699 - 21172 630 ## COG1438 Arginine repressor - Prom 21193 - 21252 6.0 + Prom 21130 - 21189 5.7 20 10 Op 1 6/0.000 + CDS 21405 - 22862 1105 ## COG1070 Sugar (pentulose and hexulose) kinases 21 10 Op 2 . + CDS 22922 - 24178 1454 ## COG4806 L-rhamnose isomerase 22 10 Op 3 . + CDS 24182 - 25201 1106 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 23 10 Op 4 5/0.000 + CDS 25214 - 26023 898 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases + Term 26059 - 26110 11.7 24 10 Op 5 . + CDS 26119 - 27273 1433 ## COG1454 Alcohol dehydrogenase, class IV + Term 27305 - 27348 5.4 + Prom 27318 - 27377 6.5 25 11 Tu 1 . 
+ CDS 27415 - 28314 621 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 28360 - 28419 4.2 26 12 Tu 1 . + CDS 28439 - 28870 500 ## BT_3769 hypothetical protein + Term 28919 - 28961 3.1 - TRNA 29194 - 29266 70.6 # Met CAT 0 0 + Prom 29359 - 29418 5.7 27 13 Op 1 . + CDS 29506 - 30096 517 ## BT_3770 transcriptional regulator 28 13 Op 2 . + CDS 30113 - 30859 289 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 29 13 Op 3 . + CDS 30869 - 31540 203 ## PROTEIN SUPPORTED gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family + Prom 31610 - 31669 6.3 30 14 Tu 1 . + CDS 31716 - 33977 1928 ## COG3537 Putative alpha-1,2-mannosidase + Term 34018 - 34073 13.0 - Term 34111 - 34172 3.0 31 15 Tu 1 . - CDS 34242 - 37838 3187 ## COG0383 Alpha-mannosidase - Prom 37858 - 37917 6.1 32 16 Op 1 . - CDS 38434 - 39108 382 ## COG3774 Mannosyltransferase OCH1 and related enzymes 33 16 Op 2 . - CDS 39130 - 39831 432 ## COG3774 Mannosyltransferase OCH1 and related enzymes 34 16 Op 3 . - CDS 39837 - 40637 365 ## BT_3777 hypothetical protein - Term 40656 - 40686 -0.5 35 16 Op 4 . - CDS 40721 - 41665 480 ## BT_3778 hypothetical protein 36 16 Op 5 . - CDS 41749 - 43170 742 ## BT_3779 hypothetical protein - Prom 43335 - 43394 3.4 - Term 43318 - 43383 16.1 37 17 Tu 1 . - CDS 43418 - 44569 1222 ## COG2152 Predicted glycosylase - Prom 44591 - 44650 6.4 38 18 Op 1 . - CDS 45017 - 46462 1482 ## COG3538 Uncharacterized conserved protein 39 18 Op 2 . - CDS 46507 - 47670 1064 ## COG4833 Predicted glycosyl hydrolase 40 18 Op 3 . - CDS 47705 - 48652 788 ## COG3568 Metal-dependent hydrolase 41 18 Op 4 . - CDS 48711 - 50993 2185 ## COG3537 Putative alpha-1,2-mannosidase - Prom 51110 - 51169 5.4 42 19 Tu 1 . + CDS 51203 - 55240 2996 ## COG0642 Signal transduction histidine kinase + Prom 55535 - 55594 6.8 43 20 Op 1 . + CDS 55713 - 57077 1201 ## BT_3787 hypothetical protein 44 20 Op 2 . 
+ CDS 57101 - 60226 2524 ## BT_3788 hypothetical protein 45 20 Op 3 . + CDS 60244 - 62223 1674 ## BT_3789 hypothetical protein 46 20 Op 4 . + CDS 62246 - 63214 1011 ## BT_3790 hypothetical protein 47 20 Op 5 . + CDS 63236 - 64891 1510 ## BT_3791 hypothetical protein 48 20 Op 6 . + CDS 64911 - 66488 1392 ## COG4833 Predicted glycosyl hydrolase - Term 66508 - 66568 7.3 49 21 Tu 1 . - CDS 66620 - 66889 416 ## gi|253571364|ref|ZP_04848771.1| conserved hypothetical protein 50 22 Op 1 . - CDS 67560 - 69116 1487 ## COG3119 Arylsulfatase A and related enzymes 51 22 Op 2 . - CDS 69134 - 71215 1117 ## BT_3797 hypothetical protein 52 22 Op 3 . - CDS 71226 - 72608 1206 ## COG3669 Alpha-L-fucosidase 53 22 Op 4 . - CDS 72614 - 74083 965 ## COG3119 Arylsulfatase A and related enzymes - Prom 74122 - 74181 2.8 + Prom 74028 - 74087 6.2 54 23 Tu 1 . + CDS 74317 - 78414 2265 ## COG0642 Signal transduction histidine kinase 55 24 Tu 1 . - CDS 78419 - 79792 1222 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains - Prom 79839 - 79898 7.5 + Prom 79801 - 79860 4.3 56 25 Op 1 . + CDS 79881 - 80720 929 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family 57 25 Op 2 . + CDS 80726 - 81964 1073 ## COG0612 Predicted Zn-dependent peptidases 58 25 Op 3 . + CDS 82049 - 83407 1109 ## COG0534 Na+-driven multidrug efflux pump 59 25 Op 4 . + CDS 83427 - 85784 1963 ## COG0642 Signal transduction histidine kinase + Prom 86046 - 86105 3.4 60 26 Op 1 . + CDS 86180 - 90619 3889 ## BT_3806 hypothetical protein 61 26 Op 2 . + CDS 90622 - 92988 2079 ## BT_3807 hypothetical protein + Prom 93010 - 93069 5.1 62 27 Tu 1 . + CDS 93117 - 94028 1099 ## BT_3808 hypothetical protein + Term 94052 - 94095 9.4 + Prom 94053 - 94112 3.2 63 28 Op 1 . + CDS 94172 - 96127 2027 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains 64 28 Op 2 . 
+ CDS 96141 - 98177 2150 ## COG3590 Predicted metalloendopeptidase + Term 98205 - 98259 13.6 + Prom 98228 - 98287 8.1 65 29 Tu 1 . + CDS 98329 - 99852 1737 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) + Prom 99870 - 99929 3.6 66 30 Op 1 22/0.000 + CDS 99960 - 100982 1059 ## COG1077 Actin-like ATPase involved in cell morphogenesis 67 30 Op 2 . + CDS 101002 - 101844 719 ## COG1792 Cell shape-determining protein 68 30 Op 3 . + CDS 101844 - 102341 330 ## BT_3815 hypothetical protein 69 30 Op 4 19/0.000 + CDS 102344 - 104206 1705 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 70 30 Op 5 . + CDS 104187 - 105644 1022 ## COG0772 Bacterial cell division membrane protein 71 31 Op 1 . - CDS 105646 - 106125 197 ## BT_3818 hypothetical protein 72 31 Op 2 1/0.333 - CDS 106103 - 107698 1456 ## COG1774 Uncharacterized homolog of PSP1 73 31 Op 3 . - CDS 107713 - 108837 1123 ## COG2812 DNA polymerase III, gamma/tau subunits 74 31 Op 4 . - CDS 108839 - 109792 946 ## COG0685 5,10-methylenetetrahydrofolate reductase - Prom 109828 - 109887 2.5 75 32 Tu 1 . - CDS 109949 - 110479 267 ## BT_3822 hypothetical protein - Prom 110606 - 110665 4.6 - Term 110644 - 110703 8.6 76 33 Op 1 . - CDS 110732 - 111244 650 ## COG2193 Bacterioferritin (cytochrome b1) 77 33 Op 2 . 
- CDS 111270 - 111461 79 ## BT_3824 hypothetical protein - Prom 111561 - 111620 3.1 Predicted protein(s) >gi|226332186|gb|ACIC01000134.1| GENE 1 56 - 841 606 261 aa, chain - ## HITS:1 COG:no KEGG:BT_3744 NR:ns ## KEGG: BT_3744 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 261 1 261 261 517 99.0 1e-145 MKRYKLLNIIAVLLLVSTLSAQAQEERNQGIIWSYLHGWEYGIKAGFSIGGTSPLPLPEE IRKIDSYAPGGLAISIEGNATKWFDTKWGMTVGVRLENKNMTTEATVKNYGMKIINTNGG ELQGLWTGGVKTKVKNSYLTIPVVANYKVSKRWKVSAGPYVSYLIERNFSGHVYEGHLRT PDQTGSRVDFTGESIATYDFSDNLRKFQWGLQLGGEWRAFKHLNVYADLTWGLNDIFKKD FDTISFAMYPIYLNVGFGYAF >gi|226332186|gb|ACIC01000134.1| GENE 2 854 - 1984 918 376 aa, chain - ## HITS:1 COG:no KEGG:BT_3745 NR:ns ## KEGG: BT_3745 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 376 1 374 374 735 98.0 0 MKIKTLIACFILACAATSCIQDEALNSEAAIDACTGDDVQLANINADSKLINVYVNKGAD LSKQKLEFVIPEGATIKINDQVAGDTEATYDFSEEPHSRKFTVTSEDGQWKPVYTVNVVL AELPTSFNFEELLPSNDYDIFYEFQPGTSQEISKVLQWSSGNPGFKLTGMANSKTDYPTV QVANGFRGKGVKLETRDTGSFGAMVKMYIAAGNLFIGTFEVGNALTDPRKATNFGFQFYK RPKTLKGHYKFKAGDVYSVEGKPQEGVRDKCDIYAVMYEAENNSVMLNGDDVFTSDKLVS LARIKPEDVVESDQWTDFEIPFEPVPVKGRVIDDTKLKNGKYKLGIVLSSSVDGAYFKGA VGSTLYVDEVELICED >gi|226332186|gb|ACIC01000134.1| GENE 3 2147 - 2323 138 58 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|298386851|ref|ZP_06996406.1| ## NR: gi|298386851|ref|ZP_06996406.1| hypothetical protein HMPREF9007_03607 [Bacteroides sp. 
1_1_14] # 1 58 1 58 58 102 100.0 5e-21 MHNTALLLCNYRIIADKCTSAMDCLYENFNKLFKEYEHEYSTHSIYPTLVFWNELLVN >gi|226332186|gb|ACIC01000134.1| GENE 4 3327 - 3740 317 137 aa, chain + ## HITS:1 COG:no KEGG:BT_3746 NR:ns ## KEGG: BT_3746 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 137 1 137 137 240 98.0 1e-62 MDTENALTNAKESPNEIFFEEEARLWAEVYGKQSDKPDLESLQRMLENMNHLLLSLELPW ERFIPILYRSFALYMRRPDENTNNRKAYQLSAQLMDSVVYLSQNTNLIHHIHLFCNMQIA ALEKLKNEKVEIEQTED >gi|226332186|gb|ACIC01000134.1| GENE 5 3862 - 4062 161 66 aa, chain - ## HITS:1 COG:no KEGG:BT_3747 NR:ns ## KEGG: BT_3747 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 66 1 66 66 110 100.0 1e-23 MITSRKVSQVNVFAGSPWEVASVKSLLKAAYIEASMKDNGLNSILISVPCEYYTAAMRVI NKRKVS >gi|226332186|gb|ACIC01000134.1| GENE 6 4387 - 4929 457 180 aa, chain + ## HITS:1 COG:no KEGG:BT_3748 NR:ns ## KEGG: BT_3748 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 180 1 180 180 329 100.0 3e-89 MTAINFNSIYTTYYRRAFLFTLSYVHNDLVAEDIVSEAIIYLWELSKKQEIPSIEAVLIT YIRSKSLNYLKHLQVQENVYQNLTDKGQRELEIRISTLEACDPKEVMSEELRSKVKTLLA GMPEKTRIAFISDRLDGKSHKEIAEELGISVKGVEYHISKAVKLLRDNLKEYAPFLIFFI >gi|226332186|gb|ACIC01000134.1| GENE 7 5084 - 6052 723 322 aa, chain + ## HITS:1 COG:PA1364 KEGG:ns NR:ns ## COG: PA1364 COG3712 # Protein_GI_number: 15596561 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 107 295 74 255 280 69 28.0 9e-12 MNQNLLLKYISGKASQREKEEVAAWIDADAANLKEFMSLRKSYDALVWQDADELKTGRDK LLSLRTFTMKAMRIAAIFVLAFGLSYILIQTLQKENVEMQTVYVPAGQRTQVTLADGTMV WVNGKSTLTFPSQFASRTRKVELDGEAYFEVQKDPEKQFIVSTAHQSAIKVLGTKFNVKA YRDSEEITTTLIEGKVHFEFNNTAQKPQYITMAPGQKLIYYSQSGKTELYTTSGEGELAW KDGIIVFKQTSLQDALEILADRYDVEFIVRRNVPDDDLFSGTFTSRSLEQILNYIEASSK IRWRYLNSVQGSKEKMKIEIFI 
>gi|226332186|gb|ACIC01000134.1| GENE 8 6233 - 9565 3250 1110 aa, chain + ## HITS:1 COG:no KEGG:BT_3750 NR:ns ## KEGG: BT_3750 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 857 1 857 866 1584 99.0 0 MQNYCIVQFLGGLNLLRRSSRKISLSMKLLFILIVCSFGLAYASDGYAQKTSITLKANDC TIEEVLHKIELESGFGFFVNSKNLDLKRKVSVSVSNKNIFQVLEQVFKGVGIEYKVLDNK IVLAAKEKDVAQQRKERLIAGTVKDKNGEPLIGVSIREKSSGEGTITDMDGNYKLSTSSA NPVLVFSYVGYQSKEVPVKGNVANVVLEDATQELNEVVVTALGIKRSEKALSYNVQQLSG DELTTVKDANFISSLNGKVAGVNINSSNTTGGASRVVMRGVKSITSSNLALYVIDGVPMY NMMNGGGGGIYDQQGTDGAADINPEDIESISMLTGPSAAALYGNAAAAGVVLITTKKGSA DKTSVTVSNNTTFSKVSMMPEMQSKYGNLPNSLDSWGPIVNSDYDPRDFFQTGANVINSF ALSTGNKKNQAYLSASTTNTTNILPNSGYNRYNFTVRNTTSFLKDKMTLDAGASYILQND KNMMSQGYYYNPLPGLYRFPRSENFDDVRMFERYNSGMDLMEQYWPYGGTSSGLANPYWI QNRQLRENKKTRYMINASLKYQLFDWLDVVGRVKIDNYDNRSTYKAYASTGTLVSGDRGT YSDTSTQSKNTYADAIATLNKSFNDWSVNVNLGVSLSDSRYEMIGYEGGLKLINFFAVHN IDFNKAWKAKQSGWHDQTRAIFANAEIGWKSMIYLTATGRNDWDSRLAFSDYKSFFYPSI GLSAILTSMFNAPDWLTYLKVRGSYTEVGNSYGRFMTTVTYPYDEQTQSWTSTSSYPNTK LKPERTKSWEVGLDARILNDISLNLTYYRSNTYNQTFYADLSLSSGYSNIPIQSGNIMNE GLEMSLGYNKQWKDFSFNTNYTLTWNKNRVKRLADGVYNYATGQPIQMPELSPLTYGHTD ARLILRVGGSMGDVYGQRLLKRDLNGYILNEEGVGLAMENKETYLGSILPKMNMGWSLGF GYKGVNLGMTFTGRFGGIVISETQSVLDAAGVSKVSADVRDAGGVQINQSKVSAQTYYQT IQGTAAYYTYSATNVRLGELSLSYTLPKKWFDNKVGMTVGLVGKNLWMIYCKAPFDPELT TSAASNFYQGFDAYMLPSTRNFGFNVKFQF >gi|226332186|gb|ACIC01000134.1| GENE 9 9579 - 11144 1400 521 aa, chain + ## HITS:1 COG:no KEGG:BT_3752 NR:ns ## KEGG: BT_3752 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 521 1 521 521 1043 100.0 0 MNKIIKYIGASAVVCMMAGCTTNFEDFNTNPYQPSKVPASNLLSTMFNVYACPQQNACQE INCMWASFSGQVTATANWSFGKNIFAYYNASEGHNDSSWGRLYGYIYPSFFLVENSTEKK GVIYAMAQLTRVYGMQLLASLQGPIPYTQMKAGETEAPYDNEQTVWHAMFDDLDNAITIL KSAATFGVNQDLAVVDQFYKGDCSKWLKFANTLKLRMAIRISGVEPEYAQTKAQEAVLGG VMESVGDSSYDTTNGGINENGYAIVSGWPEVRANACLVSYMNGYNDPRRPAYFTPQTQTA 
AGGYVGVRSGSAEIPEPTVYANYSKLFIATDKTLPQPVMYAAEAAFLRAEGALKGWNMGG DAKTFYEKGVRLSFEEFGVSGADDYLADATSIPGNYVDNLIAGHTGNNYTNQSSITIKWE DGADDAKKLERVLTQKWIACYPDPMNGWADFRRTGYPRIFPATESMNADCNTGRGQRRLR FTRSEYNNNKANVEAAVSMLSNGKDSNGTDLWWAMKENGTY >gi|226332186|gb|ACIC01000134.1| GENE 10 11168 - 12271 1018 367 aa, chain + ## HITS:1 COG:no KEGG:BT_3753 NR:ns ## KEGG: BT_3753 # Name: not_defined # Def: endo-beta-N-acetylglucosaminidase F2 precursor (mannosyl-glycoprotein endo-beta-N-acetyl-glucosaminidase F2) # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 367 1 367 367 760 100.0 0 MKKLLYIIFGVMGALMFIQCSDWTEMEPKFTEPVNINGEDYYKALREYKKSDHPIVFGWY SEWTGTGTNMNNQLRGIPDSMDIVSLWGGAFNLTEAQKSDLKEVREKKGLRVLYCQHITD IGRSHTPASVENDFIVDGVQYNSKDEAMAAYWGWYGNYGDTSEEGQEKAIRKYARVIIDS INKYNYDGFDIDFEPNFGYSGNLSGNSDRMHIFLDELSKEFGPKSGTGRILMVDGEPQTL NKESGPLLDYYVVQAYYCRSDEGYSDALDGRFDRLLNKFGSIEDEATILSKTVWCEDFEK HKGDGGPEFTTRDGIVTYSLKGMAMYYRPGVDARIGGVGAYRFNLCRPVNDYFFMREVIQ VLNPANH >gi|226332186|gb|ACIC01000134.1| GENE 11 12302 - 13492 748 396 aa, chain + ## HITS:1 COG:no KEGG:BT_3754 NR:ns ## KEGG: BT_3754 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 396 1 396 396 809 100.0 0 MKMIKFSRYVVLGAALAVVTGCNDAKLSTIDNGVYIAEAAPSNTFSQQIEAQLVDEGDVI KTLTVRLVRAIDQDVTVTLDIDQQLIDEYNQQHEASYELLPEEFRSFERTVTIPAGEVSA PVINLTIKPFTTPNNEAYAIPVRITSVAGPIGLVGNANHILYLLTSPNKQKAVVLKSANK TSLNFKNEIPVTQWTIEYWIKFDNTTGRPTGDWVGPANIAFRRLIFADNSAPISFNGVLL RYWADGAKKIAPCLQCQLDGNYFDSEEFWYPDIWYHIAYTYDGSTISLYKDGTLNNSKAD SRDFTFNNISLAQSFGWNMQVELAQIRLWSKCLTENAIQEGMSRQISGDSDGLIGYWKCD EGKGNVLKDNSPNGNDITLTGIPAWSEQYNFYHPND >gi|226332186|gb|ACIC01000134.1| GENE 12 13607 - 15262 1812 551 aa, chain - ## HITS:1 COG:MA2912 KEGG:ns NR:ns ## COG: MA2912 COG0365 # Protein_GI_number: 20091733 # Func_class: I Lipid transport and metabolism # Function: Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases # Organism: Methanosarcina acetivorans str.C2A # 1 550 7 558 560 707 59.0 0 
MVERFLAQTSFASQEDFIKNLKINVPENFNFGYDVVDAWAAEQPDKNALLWTNDQGESRQ FSFADMKRYTDMTASYFQSLGIGRGDMVMLILKRRYEFWYSTIALHKLGATVIPATHLLT KKDIIYRCNAADIKMIVAAGEGIILQHIKDALPECPSVEKLVSVGPEVPEGFEDFHQGID NAAPFIRPRHANTNDDISLMYFTSGTTGEPKMVAHDFTYPLGHIVTGSFWHNLDENSLHL TIADTGWGKAVWGKLYGQWIAGANIFVYDHEKFTPAAILEKIQEYQVTSLCAPPTIFRFL IHEDLTKYDLSSLRYCTIAGEALNPAVFETFKKLTGIKLMEGFGQTETTLTVATMPWMEP KPGSMGLPNPQYDVDLIDSEGRSVEAGEQGQIVIRTSKGKPLGLFKEYYRDAERTHEAWH DGIYYTGDVAWKDEDGYLWFVGRADDVIKSSGYRIGPFEVESALMTHPAVIECAITGVPD EIRGQVVKATIVLSKDYKARAGEELIKELQNHVKKVTAPYKYPRVIEFVDELPKTISGKI RRVEIRENDEK >gi|226332186|gb|ACIC01000134.1| GENE 13 15270 - 15824 544 184 aa, chain - ## HITS:1 COG:MA2914 KEGG:ns NR:ns ## COG: MA2914 COG1396 # Protein_GI_number: 20091735 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Methanosarcina acetivorans str.C2A # 1 184 1 184 184 166 48.0 3e-41 MNDQIKQIAERLRGLRDVLELTSEDIARDCDISAEEYRLAETGQYDISVSMLQKIARTYN IALDTLMFGEEPKMSSYFVTRAGKGVSIERTRAYKYQSLASGFMNRTADPFIVTVEPKGD DEPIHYNHHNGQEFNLVVEGRMLINIEGKEIILNQGDSIYFNSKLPHGMKALDGKTVRFL AVIM >gi|226332186|gb|ACIC01000134.1| GENE 14 15864 - 16637 954 257 aa, chain - ## HITS:1 COG:lin0414 KEGG:ns NR:ns ## COG: lin0414 COG0345 # Protein_GI_number: 16799491 # Func_class: E Amino acid transport and metabolism # Function: Pyrroline-5-carboxylate reductase # Organism: Listeria innocua # 2 256 3 259 266 137 35.0 3e-32 MKIAIIGAGNMGGSIARGLAKGSLIADSDIIVSNPSAGKLEKLKEEFPGISTTLSNTEAA TGADVVILAVKPWFVESVMRELKLKSKQTLVSVAAGISFEELAHFVIAPEMPMFRLIPNT AISELESMTLIAARNTNDEQNLFMLRLFNEMGMAMLIPEDKIAATTALTSCGIAYVLKYI QAAMQAGIEMGIRPKDAMEMVAQSVKGAAALILNNDTHPSVEIDKVTTPGGITIKGINEL EHNGFTSAVIKAMKASK >gi|226332186|gb|ACIC01000134.1| GENE 15 16773 - 17894 1162 373 aa, chain - ## HITS:1 COG:BH2897 KEGG:ns NR:ns ## COG: BH2897 COG4992 # Protein_GI_number: 15615460 # Func_class: E Amino acid transport and metabolism # Function: Ornithine/acetylornithine aminotransferase # Organism: Bacillus halodurans # 3 373 4 377 384 253 39.0 3e-67 
MKLFDVYPLYDINIVKGQGCKVWDENGTEYLDLYGGHAVISIGHAHPHYVEMISNQVATL GFYSNSVINKLQQQVAERLGKISGYEDYSLFLINSGAEANENALKLASFYNGRTKVISFS KAFHGRTSLAVEATNNPTIIAPINNNGHVTYLPLNDIEAMKQELAKGDVCAVIIEGIQGV GGIKIPTTEFMQELRKVCTETGTILILDEIQSGYGRSGKFFAHQYNHIQPDIITVAKGIG NGFPMAGVLISPMFKPVYGQLGTTFGGNHLACSAALAVMDVIEQDNLVENAKAVGDYLLE ELKKFPQIKEVRGRGLMIGLEFEEPIKELRSRLIYDEHVFTGASGTNVLRLLPPLCLSME EADEFLARFKRVL >gi|226332186|gb|ACIC01000134.1| GENE 16 17908 - 18876 838 322 aa, chain - ## HITS:1 COG:AF2071 KEGG:ns NR:ns ## COG: AF2071 COG0002 # Protein_GI_number: 11499653 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate semialdehyde dehydrogenase # Organism: Archaeoglobus fulgidus # 2 319 1 329 332 209 37.0 4e-54 MIKAGIIGGAGYTAGELIRLLLNHPETEIVFINSSSNAGNRITDVHEGLYGETDLRFTDQ LPLDAIDVLFFCTAHGDTKKFMESHNVPEDLKIIDLSMDYRIKSDDHDFIYGLPELNRRA TCTAKHVANPGCFATCIQLGLLPLAKNLMLTGDISVNAITGSTGAGVKPGATSHFSWRNN NISIYKAFDHQHVPEIKQSLKQLQNSFDSEIDFIPYRGDFPRGIFATLVVKTKVALEEIV RMYEEYYAKDSFVHIVDKNIDLKQVVNTNKCLIHLEKHGDKLLIISCIDNLLKGASGQAV HNMNLMFNLEETVGLRLKPSAF >gi|226332186|gb|ACIC01000134.1| GENE 17 18873 - 20081 1635 402 aa, chain - ## HITS:1 COG:XF0999 KEGG:ns NR:ns ## COG: XF0999 COG0137 # Protein_GI_number: 15837601 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate synthase # Organism: Xylella fastidiosa 9a5c # 5 385 3 384 401 206 34.0 7e-53 MEEKKKKVVVAFSGGLDTSFTVMYLAKEKGYEVYAACANTGGFSEEQLKTNEENAYKLGA VKYVTLDVTQEYYEKSLKYMVFGNVLRNGTYPISVSSERIFQALAIARYANEIGADAIAH GSTGAGNDQIRFDMTFLVLAPNVEIITLTRDMALSRQEEIDYLNKHGFSADFTKLKYSYN VGLWGTSICGGEILDSAQGLPETAYLKHVEKEGSEQLRLTFEKGELKAVNDETFDDPIQA IQKVEEIGAAYGIGRDMHVGDTIIGIKGRVGFEAAAPMLIIGAHRFLEKYTLSKWQQYWK DQVANWYGMFLHESQYLEPVMRDIEAMLQESQRNVNGTAILELRPLSFSTVGVESEDDLV KTKFGEYGEMQKGWTAEDAKGFIKVTSTPLRVYYNNHKDEEI >gi|226332186|gb|ACIC01000134.1| GENE 18 20095 - 20673 479 192 aa, chain - ## HITS:1 COG:no KEGG:BT_3761 NR:ns ## KEGG: BT_3761 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 1 192 1 192 192 400 99.0 1e-110 MDTQQIDVMVADASHEVYVDTILETIRNAAAVRGTGIAERTHEYVATKMKEGKAIIALCG DTFAGFTYIESWGNKQYVATSGLIVHPDFRGLGLAKRIKQASFQLARLRWPKAKIFSLTS GAAVMKMNTELGYVPVTFNELTDDEAFWKGCEGCTNHDILVAKNRKFCICTAMLYDPTDP RNIKKEQERNNI >gi|226332186|gb|ACIC01000134.1| GENE 19 20699 - 21172 630 157 aa, chain - ## HITS:1 COG:BH2777 KEGG:ns NR:ns ## COG: BH2777 COG1438 # Protein_GI_number: 15615340 # Func_class: K Transcription # Function: Arginine repressor # Organism: Bacillus halodurans # 4 138 3 134 149 94 36.0 7e-20 MKKKANRLDAIKMIISSKEVGSQEELLQELNREGFELTQATLSRDLKQLKVAKAASMNGK YVYVLPNNIMYKRSTDQSAGEMLRNNGFISLQFSGNIAVIRTRPGYASSMAYDIDNNEFS EILGTIAGDDTIMLVLREGVATSKVRQLLSLIIPNIE >gi|226332186|gb|ACIC01000134.1| GENE 20 21405 - 22862 1105 485 aa, chain + ## HITS:1 COG:BS_yulC KEGG:ns NR:ns ## COG: BS_yulC COG1070 # Protein_GI_number: 16080172 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Bacillus subtilis # 5 470 3 466 485 395 42.0 1e-109 MKQNFFAVDLGATSGRTILGSFIEGGLNLEEINRFPNHLIEVGGHFYWDIYALYRHIIDG LKLVAHRGESIASIGIDTWGVDFVLLGKDGNLLRQPYAYRDPHTVGAPEAFFSRISRSEV YGKTGIQVMNFNSLFQLDTLRRNHDSALEAADKVLFMPDALSYMLTGKMVTEYTIASTAQ LVNAHTQRLEPELLKAVGLKEENFGRFVFPGEKIGTLTEEVQKITGLGAIPVIAVAGHDT GSAVAAVPALDRNFAYLSSGTWSLMGVETDAPVITAETEALNFTNEGGVEGTIRLLKNIC GMWLLERCRLNWGDTSYPELITEADSCEPFRSLINPDDDCFANPADMEQAIREYCRTTGQ PVPEQRGQIVRCIFESLALRYRQVLENLRALSPRPIETLHVIGGGSRNDLLNQFTANAIG IPVVAGPSEATAIGNVMIQAMTMGEATDVAGMRQLISRSIPLKTYHPQDMAAWDAAYIHF KNCVR >gi|226332186|gb|ACIC01000134.1| GENE 21 22922 - 24178 1454 418 aa, chain + ## HITS:1 COG:STM4046 KEGG:ns NR:ns ## COG: STM4046 COG4806 # Protein_GI_number: 16767312 # Func_class: G Carbohydrate transport and metabolism # Function: L-rhamnose isomerase # Organism: Salmonella typhimurium LT2 # 7 418 5 418 419 488 53.0 1e-137 MKKEEMIQKAYEIAVERYAAVGVDTEKVLKTMQDFHLSLHCWQADDVTGFEVQAGALSGG IQATGNYPGKARNIDELRADILKAASYIPGTHRLNLHEIYGDFQGKVVDRDQVEPEHFKS 
WIEWGKEHNMKLDFNSTSFSHPKSGDLSLSNPDEGIRQFWIEHTKRCRAVAEEMGKAQGD PCIMNLWVHDGSKDITVNRMKYRALLKDSLDQIFATEYKNMKDCIESKVFGIGLESYTVG SNDFYIGYGASRNKMVTLDTGHFHPTESVADKVSSLLLYVPELMLHVSRPVRWDSDHVTI MDDPTMELFSEIVRCGALERVHYGLDYFDASINRIGAYVVGSRAAQKCMTRALLEPIAKL REYEANGQGFQRLALLEEEKALPWNAVWDMFCLKNNVPVGEDFIAEIEKYEAEVTSKR >gi|226332186|gb|ACIC01000134.1| GENE 22 24182 - 25201 1106 339 aa, chain + ## HITS:1 COG:YPO0334 KEGG:ns NR:ns ## COG: YPO0334 COG0697 # Protein_GI_number: 16120671 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Yersinia pestis # 3 332 5 334 344 166 34.0 7e-41 MDILIGLLIIAIGSFCQSSSYVPIKKVKEWSWESFWLVQGVFAWLVFPFLGSLLGIPAGG SLFDLWGAGGAGMSIFYGVLWGIGGLTFGLSMRYLGVALGQSIALGTCAGFGTLFPAIFA GTNLFEGNGLILLLGVCITLAGIAIIGYAGGLRAQNMSEEEKRAAVKDFALTKGLLVALL AGVMSACFALGLDAGTPIKEAALAGGVDGLYAGLPVIFLVTLGGFMTNAAYCLQQNIANK SVGDYAKGKVWGNNLVFCALAGVLWYMQFFGLEMGKSFLTESPVLLAFSWCILMALNVTF SNVWGIILKEWKGVSAKTITVLICGLVVLIFSLVFPNLF >gi|226332186|gb|ACIC01000134.1| GENE 23 25214 - 26023 898 269 aa, chain + ## HITS:1 COG:rhaD KEGG:ns NR:ns ## COG: rhaD COG0235 # Protein_GI_number: 16131742 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli K12 # 26 267 21 265 274 123 31.0 5e-28 MKSILENRPALAKEVNKVAEVAGYLWQKGWAERNGGNITVNITEFVDDEIRRMEPISEVK SIGVTLPYLKGCYFYCKGTNKRMRDLARWPMENGSVIRILDDCASYVIIADEAVAPTSEL PSHLSVHNDLLSKNSPYKASVHTHPIELIAMTHCEKFLQKDVATNLLWSMIPETKAFCPR GLGIIPYKLPSSVELAEATIKELQDYDVVMWEKHGVFAVDCDAMQAFDQIDVLNKSALIY IAAKNMGFEPDGMSQEQMKEMSVAFNLPK >gi|226332186|gb|ACIC01000134.1| GENE 24 26119 - 27273 1433 384 aa, chain + ## HITS:1 COG:ECs3659 KEGG:ns NR:ns ## COG: ECs3659 COG1454 # Protein_GI_number: 15832913 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Escherichia 
coli O157:H7 # 2 384 4 383 383 414 56.0 1e-115 MNRIILNETSYFGAGCRSVIAVEAARRGFKKAFFVTDKDLIKFGVAAEIIKVFDDNHIPY ELYSDVKANPTIANVQNGVAAYKASGADFIVALGGGSSIDTAKGIGIVVNNPDFADVKSL EGVADTKHKAVPTFALPTTAGTAAEVTINYVIIDEDARKKMVCVDPNDIPAVAIVDPELM YSMPKGLTAATGMDALTHAIESYITPGAWAMSDMFELKAIEMIAQNLKAAVDNGKDTVAR EAMSQAQYIAGMGFSNVGLGIVHSMAHPLGAFYDTPHGVANALLLPYVMEYNAESPAAPK YIHIAKAMGVNTDGMTETEGVKAAIEAVKALSLSIGIPQKLHEINVKEEDIPALAVAAFN DVCTGGNPRPTSVAEIEVLYRKAF >gi|226332186|gb|ACIC01000134.1| GENE 25 27415 - 28314 621 299 aa, chain + ## HITS:1 COG:SMb21419 KEGG:ns NR:ns ## COG: SMb21419 COG2207 # Protein_GI_number: 16264994 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Sinorhizobium meliloti # 11 293 8 287 295 161 32.0 2e-39 MTEDNNLGLEFKYLIVNDMDRKFGLWVNTVGYQSIPPDSPYPLKEHPSGYYFNAEKGRVL REYQLVYITKGRGLFSSDSTPERQVCKGRLMVLFPGQWHTYYPLRQTGWTEYYIGFEGPA IDTIVGDAFLSQERQILEVGINEELVSLFSRALEVAEADKISAQQYLSGIVLHMIGMILS ISKNKVFEMSDVDQKIEQAKILMNENVSGNIDPEELAMRLNISYSWFRRVFKEYTGYAPA KYFQELKLRKAKQMLVGTSQSVKEISFFLGFQSTEYFFSFFKKRTGLTPLEYRSFGREE >gi|226332186|gb|ACIC01000134.1| GENE 26 28439 - 28870 500 143 aa, chain + ## HITS:1 COG:no KEGG:BT_3769 NR:ns ## KEGG: BT_3769 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 143 1 143 143 154 99.0 1e-36 MKRIVFWMVALLLMSGVAMAQGNRQGGRQQMDPKTRAERMTERMVKEYSLNEDQKQQLQD VNLAWVQKMAANQGGRSKDNKAAKMTKEEREKKMAEMKKSREDYDAQLKKIMTKEQYDSY VKKQAEREKQMKEGRQNRQKRQG >gi|226332186|gb|ACIC01000134.1| GENE 27 29506 - 30096 517 196 aa, chain + ## HITS:1 COG:no KEGG:BT_3770 NR:ns ## KEGG: BT_3770 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 196 1 196 196 361 98.0 1e-98 MTVSKTKAKLVDVARQLFAKMGVENTTMNDIALASKKGRRTLYTYFKSKDEIYLAVVESE LDILSDMMKRVADKNISPDDKLLELIYTRLDAVKEVVYRNGTLRAYFFRDIWRVEKVRKK FDAKEIQIFKTVLLEGQAKGVFHIDDVEMTADLIHYCVKGIEVPYIRGHIGAHLDEETRN KYVSNIVFGALHRTEI >gi|226332186|gb|ACIC01000134.1| GENE 28 30113 - 
30859 289 248 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 7 248 4 242 242 115 34 7e-25 MGLLDGKTAIVTGAARGIGKAIALKFAAEGANIAFTDLVIDENAEKTRVELEAMGVKAKG YASNAANFEDTAKVVEEIHKDFGRIDILVNNAGITRDGLMMRMSEQQWDMVINVNLKSAF NFIHACTPVMMRQKAGSIINMASVVGVHGNAGQANYAASKAGMIALAKSIAQELGSRGIR ANAIAPGFILTDMTAALSDEVRAEWAKKIPLRRGGTPEDVANIATFLASDMSSYVSGQVI QVDGGMNM >gi|226332186|gb|ACIC01000134.1| GENE 29 30869 - 31540 203 223 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family [Lactobacillus jensenii 269-3] # 3 210 83 279 287 82 29 7e-15 MTVVYEDNHIIVVNKTASEIVQADKTGDTPLSETVKQYLKEKYQKPGNVFIGVTHRLDRP VSGLVIFAKTSKALTRLNEMFRTSEVKKTYWAVVKNAPKEPEGELVHYLARNEKQNKSFA YDKEVPNSKKAILNYRLIGHSENYYLLEVDLKTGRHHQIRCQLAKMGCPIKGDLKYGAPR SNPDGSICLHARKVRFIHPVSKELIELEAPLPEGNLWKGFELL >gi|226332186|gb|ACIC01000134.1| GENE 30 31716 - 33977 1928 753 aa, chain + ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 25 748 39 777 790 570 40.0 1e-162 MNAFKLLMAASCFLGLATSGVAKQDYSKSEGLLQYVDPYIGSGFHGHVFVGTSVPYGMVQ LGPNNIHKGWDWCSGYHYSDSILIGFSHTHLSGTGCTDLGDILIMPLNEIRTPRGNQDDI RDGYASKYSHDNEIARPEYYSLILDRYNIKAELTATDRVGFHRYTYPEGKPASILIDLRE GNGSNAYDSYIRKVDDYTVEGYRYVRGWSPSRKVYFVLKSDQKIEKFTAYDDNTPKPWDQ LKVESVKSVLTFGNVKEVKIKVAISSVSCDNAAMNLQAELSHWDFDKVVKMSSDRWNKQL DKMTVESDNEAAKRVFYTAHYHTMIAPTLYCDVNGEYRGMNDMIYTDPKKANYTTLSLWD TYRALNPLMTIIQPEMVDNVINSMLSIYRQQDKLPIWPLMSGETNCMPGYSSVPVIADAY LKGFTGFDAEEALTAMKATATYERQNGVPYVMAKGYIPADKIHEATSIAMEYAVDDWGIA AMAQKMGKTADYETFSKRAHYYKNYFDSSIHFIRPKLEDGSWRTPYDPARSIHGVGDFCE GNGWQYTFFVPQDPYGLISLFGGDKPFTSKLDSFFTNNDSMGEGASSDITGLIGQYAHGN EPSHHIAYLYTYAGEQWKTAEKVRFIMSDFYTDQPDGIIGNEDCGQMSAWYLLSAMGFYQ VNPSDGVFAFGSPRFKKIEVKVRGGKVFTVEAPNNSKDNIYIQKVYLNGKPYHKSYITYD DIINGSTLKFEMGKKPAKNFGKASANRPIVLNK 
>gi|226332186|gb|ACIC01000134.1| GENE 31 34242 - 37838 3187 1198 aa, chain - ## HITS:1 COG:lin2123 KEGG:ns NR:ns ## COG: lin2123 COG0383 # Protein_GI_number: 16801189 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Listeria innocua # 29 846 238 1032 1032 301 29.0 7e-81 MNKKVIAVALALALAGGSYAQDDTAKKKVKAYMVSDAHLDTQWNWDIQTTINEYVWNTIS QNLFLLKKYPEYVFNFEGGVKYAWMKEYYPEQYEEMKKFIEEGRWHIAGSSWEASDVLVP SVEASIRNIMLGQTYYRQEFGKEGTDIFLPDCFGFGWTLPTIAAHCGLIGFSSQKLDWRN HPFYGKSKHPFTIGLWKGIDGKQVMLAHGYDYGRKWNNEDLSKNKDLEKLAQRTPLNTVY RYYGTGDIGGSPTLGSVRSVEQGIKGDGPVEVISATSDQLFKDYLPFNNHPELPVFDGEL LMDVHGTGCYTSQAAMKLYNRQNEQLGDAAERAAVAAEWLGTASYPQHTLTEAWKRFIFH QFHDDLTGTSIPRAYEFSWNDELISLKQFSQVLTSSVNAIAGQMDTRVKGTPVVLYNANA FPVSDLTEIILEQPKTPKGFTVYNAQGKKVASQMIGYENGRAHILVAASLPANSYAVYDV RTGGSEKTISPSAASAIENSVYKITLDKNGDIISLTDKRNNKELVKDGKAIRLALFTENK SYAWPAWEILKETIDREPVSITDGAKITLVENGALRKALCIEKKHGKSLFKQYIRLYEGS RADRIDFYNEIDWQSTNTLLKAEFPLNIENEKATYDLGIGSVERGNNVQTAYEVYAQQWA DLTDKNNSYGVSILNDSKYGWDKPDNNTVRLTLLHTPETKGNYAYQDRQDFGFHTFTYSL TGHDGALDKPATAIKAEILNQPIKAFSSPKHAGTLGKEFAFVRSSNDQVVIKALKKAEVS DEYVVRVYETGGAAPQQAAITFAGEIEKAVLADGTEKEIGSADFNKNQLNVSIAPYSIQT FKVKLKKKADLQAPACAYLPLDYDRRCFSWNAFRKEGNFESGNSYAAELLPDSILKADGI PFRLGEKEIANGLTCKGNVLQLPTGHSYNRIYFLAASAGEDAVATFSTGNNSQEITVPSY TGFIGQWEHLGHTEGFLKDAEIAYVGTHRHASDKDEAYEFTYMFKFGMDIPKGATTVTLP DHADIVLFAATLVNEKYPAVTPASELFRTALKAGNGEEATSKANLLKQAKLIKCSGETNE KEVARYAVDGDVKTKWCDTSTAPNYIDFDFGKEQTIRGWKLVNAGNEGSVFITHTCFLQG RNSPDEEWKTIDELSDNKKNTVVRQFKPTSVRYVRLLVTQSTQNNSLKAARIYELEVY >gi|226332186|gb|ACIC01000134.1| GENE 32 38434 - 39108 382 224 aa, chain - ## HITS:1 COG:YBR161w KEGG:ns NR:ns ## COG: YBR161w COG3774 # Protein_GI_number: 6319637 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannosyltransferase OCH1 and related enzymes # Organism: Saccharomyces cerevisiae # 1 223 61 262 376 69 23.0 4e-12 MIPKNIHLIFLRKDEPFPELFEKCKQEIQVNHPDWNIRLYNKDDAQDILNQDLPEYIEAY 
NAFYHNVQKADFLRLALVYLYGGFYMDLDMLSLKPLDELRKYNLVLGEEKIVCQAEQEAL NLRYRLRIANYMFGGIPKHPFLHRIMDEMAERATIILKSQQEILDITGPGLLTDTYWDNY GMYTDITLLRNTDKRCRQPYHNEISCHFGDYAAHLHAGTWRTGI >gi|226332186|gb|ACIC01000134.1| GENE 33 39130 - 39831 432 233 aa, chain - ## HITS:1 COG:SPCC4F11.04c KEGG:ns NR:ns ## COG: SPCC4F11.04c COG3774 # Protein_GI_number: 19075903 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannosyltransferase OCH1 and related enzymes # Organism: Schizosaccharomyces pombe # 7 179 82 243 345 109 35.0 5e-24 MKNSLRIPRIIHQTWKTKDVPSPLDQLPQTWKEYLPNWEYILWTDEMNREFVCKHFPDFL EKYDTYPCNIQRADAIRYLLLKVYGGLYVDMDFECLENIEFLLEGSDFIVGKEPDWHAKR FGFEYIICNAFMASTPDNDFINFVCQRLINHSGGKVVNNGFDILDSTGPFLLTHAFNAFP HKEDIRILESKTIYPIGQWEVEKIKNNQIPKEMEERINQAHAIHYFFGTWFGK >gi|226332186|gb|ACIC01000134.1| GENE 34 39837 - 40637 365 266 aa, chain - ## HITS:1 COG:no KEGG:BT_3777 NR:ns ## KEGG: BT_3777 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 266 1 266 266 519 100.0 1e-146 MKKVAILIILIMASGLSMYGQNSYKVTSLVKLSDIKEKNLQRVFYVFSKNEFLKLPINYD ISFFVYQDNNSISRDDFEHTRAVTQGTSYRISNSPFYANLDKRDKRPPIETVLSLIEDTN GKYIMGIVRTWSALHDMDKHWLVTFNFAGAIIDYIPICEWPGTCSKARTMEAQINKDFTV NVQQLNFPENNYIIQFDSIKRNFIHLDNLKGQRIDTKYQIMPDGKFKKLEEAYYQPQIYT PDMLKSNETLIRYRRENIKEKKIFQQ >gi|226332186|gb|ACIC01000134.1| GENE 35 40721 - 41665 480 314 aa, chain - ## HITS:1 COG:no KEGG:BT_3778 NR:ns ## KEGG: BT_3778 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 314 1 314 314 463 100.0 1e-129 MEKKSKAEKLKLRGINQTILISSYNSGAGTGPNDCMTEEVYQRLNAQGALDTTVYVCGIG WMTPDVTIYGSGSGFGSGSGSGSGSGSGEYWGSGHWGSGDWGSGSGWTGGGTSGGGTSGG GTSGDGPKPGGGDKPVPKDPIELMDKSRFVGWREGANCLSLCKETLKKYGLSNYGSSLNV FKLVDSANGLLTNWGNDPAQNYKNAIECIDKHLNAKRVIIVGVDYDLDLNPNIDGTDHFI VVTGRGYDTSRQQYYYTFMDNATSNSDDGCSNINRLYYKTENLKLEGSTKVANRYYTVTQ VRPNDGGKYDTTSL >gi|226332186|gb|ACIC01000134.1| GENE 36 41749 - 43170 742 473 aa, chain - ## HITS:1 COG:no KEGG:BT_3779 NR:ns ## 
KEGG: BT_3779 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 473 1 473 473 926 100.0 0 MKIVQTFWHGNNSLLDNSFGWMNPQYHLMSWALSCLSLKDNYQNIVLYTDSHGYEIFAEQ LKLPYTDIIIQYDNLTCPEPHWAYPKLLTYSLQKEPFIHVDGDVYLPNKLDNTIEMSELI AQNKEIGSSYYKSMMNHILQKDLLMPSFLKKELEKDSIMSYNAGIIGGNDLEFIAEYCRT AFNFIESNHLNDINSKDIHINNNILFEQILFYAQSNSYNKPVATVLDHSVRDNGYIYDDF CNFYLYDKAKLLHIIGGHKKNQRICDLLSKTLLNKYPDYYNRVVELFANNHKRMQNKAKN TTFPDLSVQMCIASYQDYLHSLSKKWEKLSYTDLYDWEKCSSSYFMFLNAGKEEQAIFTI NRNPYLSIYEIPEPWPTEAKKLLKERINKECHSDHFDIVCIPCLLYEGFKEVLINDLCYN ILILIEEEKTFECLFNELLPCFSSEISKDKEQTYKLILTEVEYLLYHGVICIN >gi|226332186|gb|ACIC01000134.1| GENE 37 43418 - 44569 1222 383 aa, chain - ## HITS:1 COG:PAB1622 KEGG:ns NR:ns ## COG: PAB1622 COG2152 # Protein_GI_number: 14521331 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Pyrococcus abyssi # 46 371 8 286 305 170 35.0 5e-42 MKSTFLFLVTTTMMTCTALGQPSNDKKNVLPDWAFGGFERPQGANPVISPIENTKFYCPM TQDYVAWESNDTFNPAATLHDGKIVVLYRAEDKSGVGIGHRTSRLGYATSSDGIHFKREK TPVFYPDNDTQKKLEWPGGCEDPRIAVTAEGLYVMTYTQWNRHIPRLAIATSRNLKDWTK HGPAFAKAYDGKFFNLGCKSGSILTEVVNGKQVIKKIDGKYFMYWGEEHVFAATSEDLVN WTPYVNTDGSLRKLFSPRDGHFDSQLTECGPPAIYTPKGIVLLYNGKNSASRGDKRYTAN VYAAGQALFDANDPTRFITRLDEPFFRPMDSFEKSGQYVDGTVFIEGMVYYKDKWYLYYG CADSKVGMAIYNPKKPAAADPLP >gi|226332186|gb|ACIC01000134.1| GENE 38 45017 - 46462 1482 481 aa, chain - ## HITS:1 COG:XF0843 KEGG:ns NR:ns ## COG: XF0843 COG3538 # Protein_GI_number: 15837445 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Xylella fastidiosa 9a5c # 37 468 61 497 516 467 49.0 1e-131 MNITKTLCLCAALSGAAGVQAMENREFVTQQDNTRVNNYQTNRPEASKRLFVSQEVERQI DHIKQLLTNAKLAWMFENCFPNTLDTTVHFDGKEDTFVYTGDIHAMWLRDSGAQVWPYVQ LANKDPELKKMLAGVINRQFKCINIDPYANAFNMNSEGGEWMSDLTDMKPELHERKWEID SLCYPIRLAYHYWKTTGDASVFSDEWLQAIANVLKTFKEQQRKDDAKGPYRFQRKTERAL DTMTNDGWGNPVKPVGLIASAFRPSDDATTFQFLVPSNFFAVTSLRKAAEILNTVNKKPA 
LAKECTALADEVEKALKKYAVCNHPKYGKIYAFEVDGFGNQLLMDDANVPSLIALPYLGD VKVTDPIYQNTRKFVWSEDNPYFFKGSAGEGIGGPHIGYDMIWPMSIMMKAFTSQNDAEI KTCIKMLMDTDAGTGFIHESFNKNDPKNFTRAWFAWQNTLFGELILKLVNEGKADLLNSI Q >gi|226332186|gb|ACIC01000134.1| GENE 39 46507 - 47670 1064 387 aa, chain - ## HITS:1 COG:lin0763 KEGG:ns NR:ns ## COG: lin0763 COG4833 # Protein_GI_number: 16799837 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosyl hydrolase # Organism: Listeria innocua # 139 386 90 341 341 89 31.0 2e-17 MRNICFVACMLFCLASASGKTVKNHPFVSIADSILDNVLNLYQTEDGLLTETYPVNPDQK ITYLAGGAQQNGTLKASFLWPYSGMMSGCVAMYQATGDKKYKTILEKRILPGLEQYWDGE RLPACYQSYPVKYGQHGRYYDDNIWIALDYCDYYRLTKKADYLKKAIALYEYIYSGWSDE LGGGIFWCEQQKEAKHTCSNAPSTVLGVKLYRLTKDKKYLDKAKETYAWTRKHLCDPDDF LYWDNINLKGKVSKDKYAYNSGQMIQAGVLLYEETGDKDYLRDAQKTAAGTDAFFRSKAD KKDPSVKVHKDMSWFNVILFRGFKALEKIDHNPTYVRAMAENALHAWRNYRDANGLLGRD WSGHNEEPYKWLLDNACLIELFAEIEK >gi|226332186|gb|ACIC01000134.1| GENE 40 47705 - 48652 788 315 aa, chain - ## HITS:1 COG:lin0348 KEGG:ns NR:ns ## COG: lin0348 COG3568 # Protein_GI_number: 16799425 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Listeria innocua # 27 307 2 254 257 159 34.0 6e-39 MKLRNLLFIVLAAIVFCNCQSYQPTSLTVASYNLRNANGSDSARGDGWGQRYPVIAQIVQ YHDFDIFGTQECFLHQLKDMKEALPGYDYIGVGRDDGKDKGEHSAIFYRTDKFDIVEKGD FWLSETPDVPSKGWDAVLPRICSWGHFKCKDTSFEFLFFNLHMDHIGKKARVESAFLVQE KMKELGRGKNLPAILTGDFNVDQTHQSYDAFVSKGVLCDSYEKCDYRYALNGTFNNFDPN SFTESRIDHIFVSPSFHVKRYGVLTDTYRSVRENSKKEEVRDCPEEITIKAYEARTPSDH FPVKVELVFDQRQQK >gi|226332186|gb|ACIC01000134.1| GENE 41 48711 - 50993 2185 760 aa, chain - ## HITS:1 COG:L135972 KEGG:ns NR:ns ## COG: L135972 COG3537 # Protein_GI_number: 15673483 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Lactococcus lactis # 30 758 3 715 717 408 32.0 1e-113 MKTHFSFKHLLFIGGAVLYSMQISAVKNPVDYVSTLVGTQSKFELSTGNTYPATALPWGM NFWTPQTGKMGDGWAYTYNADKIRGFKQTHQPSPWMNDYGQFSIMPITGGLVFDQDQRAS 
WFSHKAEIAKPYYYKVYLADHDVTTELVPTERAVMFRFTYPETKNAYVVIDAFDKGSYVK VIPEENKIIGYSTKNSGGVPENFKNYFVIQFDKPFTFTSGVKENNILPNETEVQGNHTGA IIGFATQKGEIVHARVASSFISYEQAELNLKELGKDSFDQLVTKGKDIWNREMSKVDVED DNIDNLRTFYSCLYRSMLFPRSFYEIDAKGQVVHYSPYNGKVLPGYMFTDTGFWDTFRCL FPFLNLMYPSMNQKMQEGLVNAYLESGFLPEWASPGHRDCMVGNNSASVVADAYIKGLRG YDIETLWEALKHDANAHLRGTASGRLAYDAYNKLGYVPNNIGIGQNVARTLEYAYNDWTI YTLGKKLGKPASEIDIFKQRALNYKNVYHPKRKLMVGKDDKGVFNPKFDAVDWSGEFCEG NSWHWSFCVFHDPQGLIDLMGGKKEFNNMMDSVFVIPGKQGMESRGMIHEMREMQVMNMG QYAHGNQPIQHMVYLYNYSGEPWKAQHWVREIMDKLYTAGPDGYCGDEDNGQTSAWYVFS ALGFYPVCPGTDQYILGTPLFKSAKLHLENGKTVTIKASNNNTDNRYVKDMKVNGKAFTR NYLTHDQLLKGANIQYQMSPTPNKQRGTTEKDIPYSLSFE >gi|226332186|gb|ACIC01000134.1| GENE 42 51203 - 55240 2996 1345 aa, chain + ## HITS:1 COG:evgS_2 KEGG:ns NR:ns ## COG: evgS_2 COG0642 # Protein_GI_number: 16130302 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 807 1060 160 414 420 101 28.0 1e-20 MKKHCILLFFYCLNLLMNTAGAYNLKQVADKEYMSNSSITSLCQDDKGLMWIGTCDGLNI YDGQEIEEFKTRDKEDYLSGNQIDNIIYTGNDTYWFQTYYSLIRLDRKTNSITRFNEFQK LFLMNKDNHGTLFIIKDTNCIYYFHKKEGIFKKINVTGIPISDVITFFFDKNNRMWVMMK GYNRCYDLQQDPLTENITLVSQKSGLDHHTPLICSFYDDAAVYYVDKEYNLYAFHIESKK NEFIANLGANIQAHDKISSIVKYHDSFFVGLMMDGVLTLEKQKGSGEYQIQTLPINSGTF SLIKDRFQDVVWIGTDGQGVYLYSNPLYSIKSVVLNNYTEKIERPVRALLLDKERTFWIG TKGNGILKIFDYEVQKNISDCRSEILTTSNSGLGSNAVYCIRESNRNLLWIGDEEGLNFY SYRERRIKKLPLWIDNEEFKYIHDIYETEDSELWLASVGMGVVRARIGGTPDHPVLEKLQ RYVVNGGEFGSNYFFTICKGDSLNLLFGNMGYGVYRFNETINGLEPLTTHKYENMNLNKV VPIIKDDADNYLIGTTYGVIKYASENSYQLFNAKDGFLNSTIHAILRHSSDNFWLSTNQG LINFDTKRNVFRSYGFGDGLKVVEFSDGASYRDPQTGILFFGGINGFVAIQADGRPEQLY MPPIYFDKLSIFGEQYNLGEFITRKKENEVLNLKYDQNFFAVSFTSVDYLNGNNCTYSYK LKGLSDQWINNGKESSVSFTNMAPGEYTLLVKYYNSVFDKESEVYSLLIRIGDPWYASWW AYLIYSLCLLMIAAMLVRSFILRTKRKKQELLNEIEKRHQKNVFESKLRFFTNIAHEFCT PLTLIYGPCGRILSSKGLSKFVVDYVQMIQTNAERLNNLIHELIEFRRIETGNREVRIEA LNVSSLMKNTAKTFVDMAKSRNITFLSKIPEQVSWNSDKGFLNTIIINLISNAFKYTPDG 
KSIKIEVDTNDENLLVLRVANEGSTIKEEDFQYVFNRYAILDSFENQDEKNFSRNGLGLA ISYNMAKLLNGTLKVENTPDGWVLFTLSLPPLEVTTGVSSSKRITSEYIPKIESQLVLKL PHYEFDKMRPTLLVIDDELEMLWFIGEIFSNDFNVVTLQEPERVEQVLNEVYPNVIICDI MMPGVGGIELTKRIKSVKETSHIPIIMVSGRHEMEQQMAALSAGAEMYITKPFNAEYLRI SVCQMLERKEVLKDYFSSAISSFEKTDGKLTHKESKKFQQSVLKIINDHIQEKELTPRFI ADHLAISVRSLYRKMEEMGEESPTIMIKECRLHVAKDLLLKTKKTIDEIVFEAGFSNKVT FFKVFREKYGCTPKEFRTKHLEEVQ >gi|226332186|gb|ACIC01000134.1| GENE 43 55713 - 57077 1201 454 aa, chain + ## HITS:1 COG:no KEGG:BT_3787 NR:ns ## KEGG: BT_3787 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 454 1 452 452 706 73.0 0 MKPNLFMKRKLEIGLTLLIGMSAFWGCNNDLAPIRQNQALPPDLSKHSVFESFSPDSGGI GTQLIIRGKNFGSDPSYVKVTVNNKEAAIVGMDDEVIYAIVPARADTGYVRLFIGKDENI EEYASETKFRYQFKRNVTTMVGQHGMNGREDGSYAASKLQRTWFLLTDKDGSVFFIDEGR GQTQNGALRRARNGEVETLVQCSSGPFQSPTCLAFSPAQDTLYISQYSYTDESNTKTDFN IIYVTREGGFVDVRGLCRAKKVGTTGLAVHPKTGEVFFCNKGTGYVYRFDGPSYEDFTPL FRINNATEIETRMTFNTEGTILYVAVCNRHCIYQVPYDAATHTFGNPVLFVGAWDESGYI NGTGATVRLNKPEQMAFDEDGNMFVPERNNHIIRKITPAGSASLYAGQPEQSGFGDGLPE EAKFNQPECVTVYPDNSVYVADRDNHVIRRVTVE >gi|226332186|gb|ACIC01000134.1| GENE 44 57101 - 60226 2524 1041 aa, chain + ## HITS:1 COG:no KEGG:BT_3788 NR:ns ## KEGG: BT_3788 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1041 1 1033 1033 1637 76.0 0 MNNIQKCIVLLLTVMLLVPDSAWAQESKQEFVIKGVVEDNLGPVAGASIVAKNQPGVGTI TNLDGEFTIKVGAYDVLQVTFIGYQTVEIPVLSIKDRNNLKVKIVEDNRKIDEVVITASG IQKKKTLTGAITNVDIKQLNAPGANLSNALAGVVPGIIAMQTSGEPGENMSEFWIRGMST FGAKSGALILVDGVERSFNEIPVEDIESFSVLKDASATAIYGQRGANGVVLITTKRGEKG KVKINVKTGFSWNTPVKVPEYANAYDWASLANEALEGRYESPLYTSEEMNIIRYGLDSDL YPNVDWRDLMLKSGAPSYYANINFSGGSDNVRYFVSGQYTSEDGRYRTSSSENKYNTNTT YERWNYRANVDMNLTRSTILKVSVGGWLVNRNSPQSDADDIWKSFAYYTPLSSPRKWSTG QWPVVNGMTTPEYQMTQTGYRTIWESKVESSVALDQDLSFITKGLKFSGVFAFDTYNKNT IRRSKAQELWSAEYSRDANGNLVLKRVQNAAAMSQTKKVEGDKRYYLQASLDYSRLFAQK HRVGVFAMVYQQETTDVNFDESDLMGSIPHRNLAYSGRFTYAFQDKYMAEFNWGYTGSEN 
FEHGKQFGFFPAVSAGWVVSEESFVKKAMPWLDLFKIRASYGEVGNDQLRTSLTDDKARR FPYISLVSTDDGGSYTFGEFGTNKVQGYRIKTLGTSNLTWEVAKKYDIGVDFSIFNGKVT GTVDYFVDKRDDIFMQRNQMPLTTGLADQTPMANVGKMKSVGWDGNIAFTQQIGQVSLQL RGNFTYQKTDILDMDEAANELWYKMNKGFQLNQSRGFIALGLFKDQEDIDRSPSQASLAN KTILPGDIKYKDVNGDGVITTDDEVPLGYRQQPGLQYGIGLSASWKNWNFSMLLQGTGKC DFFVGGSGPHAFHDGRTGNILQVMVDGNRWIPKEISGTEATENPNADWPRLTYGNNNMNN RASTFWLKDRKYLRLRNVELSYDFPQVWTRKFLVSNLRLGFIGQNLFTWAPFDWWDPEGT NESGSSYPINKTFSCYLQFSF >gi|226332186|gb|ACIC01000134.1| GENE 45 60244 - 62223 1674 659 aa, chain + ## HITS:1 COG:no KEGG:BT_3789 NR:ns ## KEGG: BT_3789 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 659 1 659 659 1073 80.0 0 MKRILYTILLAIGTLSFSSCTDYINVDKYFYDQVSLDSAFSKRVYVDGWLSSAYSVMDYI GEYREPFRWASDDLYHPDMKDYVEGNYSADNQLSDKDQNESRLWKYYEGIRKASTFIQNV DRCSELTMDEIADMKGQARFLRAYCYWALIRVYGPVPLIPLEGLDVNLSYEELSLPREHF DNLVDFIDQELAESARSLPTKRTVNNLGRPTRGAALGLRSRVLLYAASPLFNGNTDFFNV KDCYGNQLVSQTYDETKWAKAAAAAKDVIELAKASGLYELYVVAPKATVLPSQRPPHNAL YSDKNYPEGWADVDPLLSYKSNFDGTILGSKNPELIFTRTRIGTGHINDWAYQSTPKTLK GNNRLAVTQKQVDAYAMNDGRSITEAEATGDYVTQGFTTQAYAVANPFLPAKVNLMYNNR EPRFYASIAYNGSVWEASSASESEFRDQQIFYYRGLNDGKQGFKEECPLTGVTLKKFYNS EDSRTEGGYLVDKTEMTIRYGEILLIYAEALNELTSGQVYHLTTYTGADVEIQRSVDEMR YAIKRIRMRAGVPDYAEETYNNPNDFRVKLKRERQIELLGENSMRYFDLRRWKDAMTEEN QLLQGCNINISDDETRIADFYKQTIITSVHKVFEQKMYLWPFPTYELKRNVNMTQNPEW >gi|226332186|gb|ACIC01000134.1| GENE 46 62246 - 63214 1011 322 aa, chain + ## HITS:1 COG:no KEGG:BT_3790 NR:ns ## KEGG: BT_3790 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 26 322 26 324 324 366 64.0 1e-100 MKKIISIYGIMGSLLIAMLPGVSSCDKLEDELFKKNTYIIHNGWQDYSLEVADDNTTVLP VYFGVNGTSGNDKEIHITMEVDPDTLDAYNKDKYKSQTDLYYKILPEGTYTFDADSWNIK SGELNAAAYIHIDLNKIREVGNLYNDYVLPLRITSSTGEEMGANKYTKVLAHIGFKNDYS GIYSGKGVVTQQGTTYTTETTSTQLYAINNNTCYMFVGEKTRSNTTDYLNYVVEIERDDF GDITLTSHVDGLKFKPYSAKLSRKYTYNYTDQRYYTEITTIELAYEYQDSAQGESLMMSF EGTFSMSRDVLRVDYPNVDVEE 
>gi|226332186|gb|ACIC01000134.1| GENE 47 63236 - 64891 1510 551 aa, chain + ## HITS:1 COG:no KEGG:BT_3791 NR:ns ## KEGG: BT_3791 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 551 1 545 545 848 76.0 0 MKVMKYFVLASLLCLSACSDDNDNKNNETPQNLAELTVDASEISIAQGDTRTVKITSGNG EYEVTSANEEVVTAEVDGDIVTLTAVEGHNNAQGVVYVKDKYYQRAKILVNTAAEFDLKL NKTLFTLYSQVEGADEAFVKIYTGNGGYSLEVIDENNCIEVDQSTLEDSESFTVKGIAQG NAEVKITDRKGKEAFVNLNVIAPKQITTDADEKGVLINANQGSQQVKILSGNGEYKILDA GDTKVVRLELYGNVVTVTGRKAGETSFTLTDAKGQVSQPIQVKIAPDKRWCMNLGRDYAV WTHFGEMTGEGVEALKAATNDFKLKKMTWELTCRIDNTYWLQTIMGKEGYFILRGGDDGE KGKEGGNQWKVIDLVGTGDKLQLRTGHNAIKLGEWMHLALVVDCDVAQSNPSEKYKLYIN GSRVAWGEIKRNDLNFSEIDLCTGNDGGKISIGKASDNNRFLGGAVLEARIWSVCRTEAQ LKANAWDFVEENPDGLLGRWDFSAGAPVAYIEDGTDSDHELLMHVCKYDSFNATEFPMSR FEEAPIVVPFK >gi|226332186|gb|ACIC01000134.1| GENE 48 64911 - 66488 1392 525 aa, chain + ## HITS:1 COG:lin0763 KEGG:ns NR:ns ## COG: lin0763 COG4833 # Protein_GI_number: 16799837 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosyl hydrolase # Organism: Listeria innocua # 256 415 90 242 341 87 34.0 4e-17 MKAIFKLFILNLLALALLPSCSDDDQSNSEQNDPISGNISPVGSFAVEATNNENELLVKW TNPSNRDVDMVELSYRDVEASLSRATDFSPGHIIIQVERDVTQEYLLKVPYFATYEVSAV AISKAGKRSVPESRIMMPYHEKVDEPELKLPEMLDRAHSYMTSVIGYYFGKSSRSCWRGN YPYDGKGYWDGDALVWGQGGGLSAFVAMRDATKESEVENIYGAMDDMMFKGIQYFCQLDR GILAYSCYPAAGNERFYDDNVWIGLDMVDWYTETKEMRYLTQAKVVWRYLIDHGWDETCG GGVHWRELNEHTTSKHSCSTGPTAVMGCKMYLATQEQEYLDWAIKCYDYMLDVLQDKSDH LFYDNVRPNKDDPNLPGDLEKNKYSYNSGQPLQAACLLYKITGEQKYLDEAYAIAESCHK KWFMPYRSKELNLTFNILAPGHAWFNTIMCRGFFELYSIDNDRKYIDDIEKSMIHAWSSS CHQGNNLLNDDDLRGGTTKTGWEILHQGALVELYARLAVLERENR >gi|226332186|gb|ACIC01000134.1| GENE 49 66620 - 66889 416 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253571364|ref|ZP_04848771.1| ## NR: gi|253571364|ref|ZP_04848771.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 89 1 89 89 115 100.0 6e-25 MKKLVLVVAMFMFVCGGSFLVKAQNSAEAATALAEVNATAIVNDTVVKDTVTKSEEPVKT ELSAPTAIVNDTVVTDTASKDKPAEPVKE >gi|226332186|gb|ACIC01000134.1| GENE 50 67560 - 69116 1487 518 aa, chain - ## HITS:1 COG:CC1172 KEGG:ns NR:ns ## COG: CC1172 COG3119 # Protein_GI_number: 16125424 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Caulobacter vibrioides # 34 503 33 491 521 144 27.0 4e-34 MNKKLLFPLALVPLAATSLQAQKSAQTQRVDKRPNIILFMVDDMGWQDTSLPFWTQKTDY NKLYETPNMERLAKQGMMFTQAYASSISSPTRCSLITGTNAARHRVTNWTLQKNTKTDRK DKVLDVPDWNYNGVSQVPGTNNTFVGTSFVQLLKDSGYHTIHCGKAHFGAIDTPGEDPHH WGFEVNIAGHAAGGLASYLGEENYGHNKDGKPISLMAVPGLEKYWGTETFVTEALTLEAI KALNKAKKYNQPFYLYMSQYAIHVPLDKDKRFYDKYKKKGMTDHEAAYATLIEGMDKSLG DLMDWLEKSGEADNTIIIFMSDNGGLAAESYWRDGKLHTQNHPLNSGKGSTYEGGIREPM IVSWPGVVAPGSKCNDYLLIEDFYPTILEMAGIKKYKTVQPIDGISFMPLLKQTRNPSKG RSLFWNMPNNWGNDGPGINFNCAVRKGDWKLIYYYGTGKKELFNIPDDIGESNDLSAQHP DIVKRLSKELGTYLRKVDAQRPTVKATGKPCLWPDEIE >gi|226332186|gb|ACIC01000134.1| GENE 51 69134 - 71215 1117 693 aa, chain - ## HITS:1 COG:no KEGG:BT_3797 NR:ns ## KEGG: BT_3797 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 693 1 693 693 1425 98.0 0 MKRIIHICLWLLTGIFAEVSLSAQNPFIYHKGKTYKDVAAYRMEEIIDLNTHTPSTPILC EQQVPEHQSTLSYKVALPLYVRGIFFSRDSRPGDYQWPNNTNRLLPWMFNRLEDLTRSDY AGIPSNALPSASGDALLLELADGEYLFAKAIAGSNSLSWFQVNQDGTLTLYVSTLGEDAL TGRLPLLIFRKSSSVYHVFSDAYDSLIAKKAVSALRKRADKEYFNAFDYLGWCTWEHYHY DIDETKILNDIDAIEASGIPVRYVLIDDGHIANKNRQLTSLVPDKKRFPNGWSRIMKRKQ ADKIRWIGLWYSLSGYWMGISAENDFPPEIRQVLHSYNGSLLPGTSTEKIETWYEYYVRT MKEYGFDFLKIDNQSFTLPLYMGGTQVIRQAKDCNLALEHQTHRMQMGLMNCMAQNVLNI DHTLYSSVTRASIDYKKYDENMAKSHLFQSYTNTLILGQTVWPDHDMFHSCDTVCGSLMA RSKAISGGPVYLSDSPSEFIADNIRPLIDETGKIFRPAAPAIPTPESILTNPLQSGKAYR VFAPTGDEALSVICYNLNTSPAYREVESFVKQEDYLLRESTGKSADSSSDNILAFNWEKQ SAEVLNASERKIKLSGFTDSLFHLCPIRKGWAVIGIQEKYLSPATVQILKRTTDKLILDV HCTGTLRIWADSHGKQELRSIPIKKAGRIEIKK >gi|226332186|gb|ACIC01000134.1| GENE 52 
71226 - 72608 1206 460 aa, chain - ## HITS:1 COG:XF0106 KEGG:ns NR:ns ## COG: XF0106 COG3669 # Protein_GI_number: 15836711 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Xylella fastidiosa 9a5c # 1 459 10 453 460 185 28.0 1e-46 MKKLCYLSLFILYSVCSFAQQQAIPVPKPHQLKWHEAEMGAVFHYDLHVFDGIRYGQGNN RINPIEDYNIFNPTELNTDQWIQAAKAAGCKFAVLTATHETGFGLWQSDVNPYCLKAVKW RDGKGDIVRDFVNSCRKYGLQPGIYIGIRWNSLLGIHNFKAEGEGAFARNRQAWYKRLCE KMVTELCTRYGDLYMIWFDGGADDPRADGPDVEPIVNKYQPNCLFYHNIDRADFRWGGSE TGTVEYPCWSTFPVPCSHHKRIESSIDQLELLKHGDKNGKYWVPAMADTPLRGANGRHEW FWEPDDENNIYPLNTLMDKYEKSVGRNATLILGLTPDPTGLIPAGDAQRLKEMGDEINRR FSSPIARISGQKKSLTLKLGKEQSVNYCIIQENIKNGERIRQYQIEAKVNGKWQTVCKGE SVGHKRIEKFEPVEATALRLTVSESVALPDIINFSAYSVK >gi|226332186|gb|ACIC01000134.1| GENE 53 72614 - 74083 965 489 aa, chain - ## HITS:1 COG:YPO0829 KEGG:ns NR:ns ## COG: YPO0829 COG3119 # Protein_GI_number: 16121138 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Yersinia pestis # 18 478 31 501 517 398 43.0 1e-110 MIIRIPKTTYYHKWFIGATLVVPVLPLTAQTSPNLVFIMADQYRGDALGCLGKEPVKTPC LDHLASEGVLFTNAVSSYPVSSPARAMLMTGMYPLHNKVTGNCNSQTAPYGVELPQEARC WSDVLKDMNYRTGYIGKWHLDSPYKPYVDTYNNRGKVAWNEWCPPERRHGFEYWTAYGTY DYHLRPMYWDTTAPRDSFYYVNQWGPAYEADKAIEYIQTQKENKQPFALVVSMNPPHTGY ELVPDKYKAMYKDIDVEALCAHRPDIPAKGTEMGNYFRNNIRNYYACITGVDEQVGRIIE TLKSNGLFENTIVVFTSDHGICMGAHNNAGKDIFYEESMRIPMIISWPAKIKPRKDDHLM IAFADLYPSLLSLMGFQKEIPETVQTFDLSRHILGKNKKEVVQPYYYVQFDNHATGYRGL RTSTHTFAVHATNGKIDETVLYDRTKDPYQMDNIAGQSPRLVRQFNKQLKAWLLHTNDSF AHYLETVTK >gi|226332186|gb|ACIC01000134.1| GENE 54 74317 - 78414 2265 1365 aa, chain + ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 836 1118 2 293 294 130 32.0 2e-29 MKHLNLILIILILSCCPLHPAYAFFDKDIRLLTMRDGLADNTIPCIYKDEDGFMWFGTDD GLSRYDGKTIQNFKPADTYASVAAIVRLSDDFLGIVSDGYLYYFNRKQETFLPIHTESDT AIGVFNVLPADNSSFWGISGNRLILCQWEIRQGKEEEKTAVRVRIQKTFQLLEEKGEVFS SLCYHEKAGNGIWLTTNRGNLILFHPDTPEAYQKIPLTDGKPLQVTSILNRDGVVWISTI AKGIVRYHTKSGHMDRISYGGTGKENLLSHTDVYQTIFIGNNRYLAVTWGGYTLLMPDEK QPEELTSEIYNNTHTQIYRNLETRMISACYDPNGLLWIGTNGGGVMYSDLRSQFYDRYYQ DRHNEICDILMDDSHRLWLATYHKGIMRSDIPYAKGKKLSFSKVDTGSGKSEETMLCALK DVDGNLWFGNINGTLTVYHVRTGRFVTHSLLVDGVANRASVWALYMDDKNRLYVGTGDGL LLYNRQTGVCAKLSAAHYLNEKRQPFIRAIAQTKDGTIWLGTSNMGVCRVVESASGGISV ENGYERKMNMEYRSVRSLLASSDGNLYIGYTDGFAILSPQQNRISARYTTNDGLCSNFIG CIAEDSEGHIWLGSNSGISRYGRRQHLFYNYYIAGSNRSAVFSDKTLFFGNNQSLTYFNP EDINAYPTNGRVLITGLEVDNHPVGTGEKINNQIILNRVISFTDHIVLDNDNKDFSLTFN NLSYSDEQQKYNYRLLPYQGNWLISDEGEKASYTNLPAGEYVFEVKNIHPNGQTGAVTSL KIVILPHWSQTFAFRLFVSLLLLGGVAYGVRLIRIRQKRMEHEMQMKHELLTVSMEREKE RQIRVERENFFTGVAHELRTPLTLILAPLQELMKQCNPLEDFYKKLQVMYKNAASLHTLV DQLLYVQKIGAGMVRLQLSEIDLVQLVCNVSEPFRQMAEIKGLQFDVDLPEGKFLYWVDE GKIASAVQNLLSNAFKYTPSGGRVLLSVSHLMKSGQGYCRITVSDTGAGIPEWLQKYAFE SFVTGDNVPEMSTKAGIGLHIVKNTMDLHHGLVTLKSVPDEGSTFALYIPEGKAHFAKDS YELIDSRQEKTKQNEELKPESFLSITENKTRETAAKKSLLIIEDNEDVREYIRSLFISRY VVSEACNGEEGVRIAKEQLPDLIISDVMMPVKDGFACCREIRMQQETAHIPILILTARAE DADILQGCNSGADDYMMKPFNPEILKTKVDNLILQRERLKRIYTKALMLKQESEEGEKED AFLQQLIHIIEANLSNADFNVKMLAEQLNMSQPTLYRKVKQRSDLSVIDMIRSIRISKAA TLILENRYSIQEISEMVGYSDTRTLRKHFMEQFGVSPSKYMGSTD >gi|226332186|gb|ACIC01000134.1| GENE 55 78419 - 79792 1222 457 aa, chain - ## HITS:1 COG:hydG KEGG:ns NR:ns ## COG: hydG COG2204 # Protein_GI_number: 16131834 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 5 450 8 439 441 322 39.0 9e-88 MMKSILIIEDDITFGMMLKTWLSKKGFEVSSVSNIARAQKHIEGQGADLILSDLRLPDHD GIDLLKWMNEKGIDIPLIIMTGYADIQSAVQAMKLGARDYVAKPVNPEELLKKMSEALQA KEAPTTHTAAKTSAKKGSSVSSGNATETHHAYLEGESDAAKQLYNYVGLVAPTNMSVLIN 
GSSGTGKEYVAHRIHQLSKRNDKPFIAVDCGSIPKELAASEFFGHVKGSFTGALTDKTGA FVAANGGTIFLDEIGNLSYEVQIQLLRALQERKIRPVGSTQEVSVDIRLVSATNENLEQA IEKGTFREDLFHRINEFTLRMPDLKERKEDILLFANFFLDQANKELDKHLIGFDSKASQA LLNYHWPGNLRQMKNIVKRATLLAQGSFITLLELGTELLETSTVCSASMALRNEETEKEH ILEALRQTGNNKSKAAQLLNIDRKTLYNKLKLYNIDL >gi|226332186|gb|ACIC01000134.1| GENE 56 79881 - 80720 929 279 aa, chain + ## HITS:1 COG:SPy0457 KEGG:ns NR:ns ## COG: SPy0457 COG0652 # Protein_GI_number: 15674576 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Streptococcus pyogenes M1 GAS # 32 278 75 261 268 97 31.0 2e-20 MKKIAYLLLTISFCGLTACKTGTKKGGNMDNETLVKIETTLGDIKVKLYNETPKHRDNFI KLAEDGVYEGTLFHRVIKDFMIQAGDPDSKNAPKGKMLGAGDVGYTLPAEFVYPKYFHKK GALSAARQGDNVNPKKESSGCQFYIVTGKVYNDSTLLGMESQMNENKINVIFNTLAQKHM KEIYKMRKENDENGLYELQEKLFAEAEAEVAKQPEFHFTPEQIEAYTTVGGTPHLDGEYT VFGEVIEGMDIVDKIQQVKTDRSDRPEEDVKIVKVEVLD >gi|226332186|gb|ACIC01000134.1| GENE 57 80726 - 81964 1073 412 aa, chain + ## HITS:1 COG:CC3584 KEGG:ns NR:ns ## COG: CC3584 COG0612 # Protein_GI_number: 16127814 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Caulobacter vibrioides # 5 407 47 454 948 199 32.0 1e-50 MKINRHILDNGLRLVHSQDESTQMVALNVLYNVGARDEHPEHTGFAHLFEHLMFGGSVNI PDYDMPLQLAGGENNAWTNNDITNYYLTVPRQNVETGFWLESDRMLSLDFSERSLEVQRG VVMEEFKQRCLNQPYGDVGHLLRPLAYRAHTYQWPTIGKDLSHIANATLEEVKAFFFRFY APNNAVLAVTGNISFEEAVALTEKWFGPIPRREVPQRNLPQEPEQTEERRLTVERNVPLD SLFMAYHMCSHDHPDYYAFDILSDLLSNGRSSRLNQRLVQQKQLFSSIDAYISGSVDAGL FHISGKPAAGVSLEQAEAAVREELERLQQEPVDEQELEKVKNKFESTQIFGNINYLNVAT NLAWFELLGRAEDMEKEVERYRAVTAGQLQKVAQSAFRKENGVILYYKKQIS >gi|226332186|gb|ACIC01000134.1| GENE 58 82049 - 83407 1109 452 aa, chain + ## HITS:1 COG:VC1540 KEGG:ns NR:ns ## COG: VC1540 COG0534 # Protein_GI_number: 15641548 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Vibrio cholerae # 1 450 1 455 
461 178 29.0 2e-44 MRNIYSIYKEHYKALISLGLPIVIGQIGVIVLGFADTLMIGHHSTIELGAASFVNNVFNL VIIFSTGFSYGLTPIVGGLYGTHQYASAGQALRNSLLANLLVAFLLTIGMTVLYFNVGNL GQPEELIPLIKPYYLVLLASLVFVMLFNGFKQFTDGITDTKTAMWILLGGNVLNIIGNYI LIYGKLGLPEMGLLGAGISTLFSRIMMVVVFIIVFMRSPRFLRFKLGFYRLGWSKAIFGR LNSLGWPIAFQMGMETASFSLSAVMIGWLGTIALASHQVMLAISQFTFMMYYGMGAAVAV RVSNFNGQGDILNVRRSAYAGFHLMMALGVVLSSIVFLCRNYLGGWFTDSTEVAAMVTSL ILPFLVYQFGDGLQITFANALRGISDVKLMMLIAFIAYFLISLPVGYFCGFVLEWGIIGV WMAFPFGLTSAGLMLWWRFHHKTKLPPSIQNS >gi|226332186|gb|ACIC01000134.1| GENE 59 83427 - 85784 1963 785 aa, chain + ## HITS:1 COG:rcsC_1 KEGG:ns NR:ns ## COG: rcsC_1 COG0642 # Protein_GI_number: 16130155 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 244 493 424 675 700 146 33.0 1e-34 MASFRSTKLKITIGYILLLAILLFSLVFVHQEMETLSAADDQQNLRTDSLLSLLHEKDQN TLQMLRVLSEANDSLLSASEIEEIISEQQDSVITQQRVQHRVITKRDSLITTPKKKGFFK RLGEVFAPSKKDTAVLVNTSLEIATDTILEPVKSADSLQHKIRMATEEKRLQRRRTIRRT STRYQRMNTQLTARMDSLIKGYEEEMSARARQDAELQQEVRMRSANIIAGIAIGAVFLSA FFLILIMRDITRSNRYRRQLEEANKRAEDLLVAREKLMLAITHDFKAPLGSIMGYTELLS RLTEDERQRFYLDNMKSSSEHLLKLVSDLLDFHRLDLNKAEVNRIVFNPAQLFEEIRVCF EPLTAAKGLVLQCHIAPELSEKYISDPLRIQQIVNNLLSNAVKFTQQGKITLTVGYNASK LTIAVEDTGKGMASEDRERIFQEFTRLPGAQGEEGFGLGLSIVHKLVLLLEGTIDVQSTL GKGSCFTVVLPLFPVGKSIVENKPFGSRIKKPSKDTDVSPEETDAVPAMKVIRVLLIDDD KIQLNLTAAMLKQHNIDAVCCEQLEELIEQLRSSTFDVLLTDIQMPAINGFDLVKLLRAS NIPQARNIPVIAVTARSEMDEKALHEHGFAGCLHKPFTVKELLVTLNEGQMSADEAHITH DMQLIADALPEDTLFNFSALTAFSEDDPEAACSIIRTFIEETGKNADRMQQALTDREVDG IAAMAHKLLPLFTLIGASESVASLRWLESCRGEEFSEEIEKTTLETLEAVRKVVRAAEEY SFGLH >gi|226332186|gb|ACIC01000134.1| GENE 60 86180 - 90619 3889 1479 aa, chain + ## HITS:1 COG:no KEGG:BT_3806 NR:ns ## KEGG: BT_3806 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1479 1 1479 1479 2821 99.0 0 MVLLYIPPVQNLLRREVTAYASKATGMQIQVERIDLRFPLNLLVRGVEVIQQPDTLLSLE SLNVRVQAWPLLKGKVEVDEVTLSQVAVNSANLIEGMQIKGVLGRFFLQSHGVDLSNETA 
VVNRTELSDTHIQLLMTDTTTTPKDTTASAPVNWKVNLHQLKLKNVSFGMKLPADSMKMA AYIGEATIDDAKADLKNQFYGLKQFLLSGASASYDTGEVQPSEGFDASHMAVRNIRIGLD SLLYEGRNMNAVIREITMEERSGLSITSLTGRLFSNDSIIRIPELKLQTPHSEIDLSAQT YWELVNIPTTGRLSASFNAYIGKEDVMLFAGGLPQTFKEAYPFRPLVIRAGTDGNLKEMQ ISRFTVDLPGAFSLKGGGILENLTDSLTRSGTIGLDMVTRNLNFLTGLTGVTPDGSLVIP DSMSLKMKMEMNGPQYKAKLDLKEGKGSMNVNAALNSSTEVYTADLKINDLQLHHFLPKD SIYELSLSADAKGRGLDVMSFHSLANLNLSLDQLHYARYHLSNVHLTGNLKGALATAQLT SDNELLKMTADAEYNLAHSYPDGKVTVDVTQLDLYELGLMPKPMKHPLAFNLSGEARQNR VFTHFTAGDMKLNLSARAGVYPLIRQSTHFVDVLMKQVDEKLLDHAALREALPSAIFSFS AGKENPLAYYMAMKNISFHDASMKFGTAPDWGINGKAAIHALKVDTLQLDTVFFTVKQDT TRMNLRAGVINGPKNPQFSFSTILTGEIRDRDAELLAEYKNEKGKTGVLLGVNARPLVGG RGKGDGLAFTLIPEEPIIAFRKFHFNEDHNWIYLHKNMRVYANVDMWDDEGMGFRVHSVQ GDTVSLQNIDVEIRRIRLDEITSVLPYFPEITGLFSAEAHYIQTEKDLQLSAEASIDELT YERQRIGDITLGATWLPGEQGKQYLNAYLNHDKVEVLIADGKLLPTSTGKDSLEVNTTLE HFPLRIANAFIPDELVTLAGDMDGDLNITGSTEQPLINGELILDSVSVLSRQYGANFLFD NRPVQLKNNRLIFDKFAIYTTGKNPFTIDGYVDFRDMSRPMASLNLLAENYTLLNAKRTR ESLVYGKVFADLRATIKGPLDGLNMRGNLNLLGNTDVSYVLTDSPLTVQDRLGSLVTFTS FSDTTTVVRQEVPTVSLGGLDMLMMVHIDPSVRVKVDLDASDNRIELEGGGDLSMKYTPQ GDLTLTGRYTLSGGLMKYSLPIIAVKEFAIDNGSYVDWTGNPMDPMLNFKATDRIRASVS EGENGGTRMVNFDVSIVVKNRLDNLSFAFDVAAPEDATIQNELTAMGAEERGKQALYIML MKTYLGTGPIGGGGGGLGKLNMGSALNSVLSSQINSLMGNLKNASVSVGVEDHDASDTGG KRTDYSFRYSQRLFNNRFQIVIGGKVSTGENATNDAESFIDNISLEYRLDRTGTRYVRLF YDKNYESVLEGEITETGVGLVLRKKLDKLSELFIFKKKK >gi|226332186|gb|ACIC01000134.1| GENE 61 90622 - 92988 2079 788 aa, chain + ## HITS:1 COG:no KEGG:BT_3807 NR:ns ## KEGG: BT_3807 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 788 1 788 788 1600 100.0 0 MKDIRINILAGCLIGIVLFSACSVTKHLPEDEILYTGGKTVIVNKSSTRVGETALTEINA ALAKTPSTTLLGGFLPIPFKMWMYNDFVKYKKGFGKWMFNRFAANPPVFISTVNPEVRVK VATNLLRDYGYFNGKVTYETLVDKKDSLKASLLYTVDMKNPYFIDTVYYQRFTPQTLKIM ERGRRMSYITPGEQFNVVDLDEERSRISTLLRNRGYFYFRPDYMTYQADTTLVPGGHISL RLIPLPGLPAAAQRSYYVGDASVYLFGKNGEAPNDSILYKNLNIHYYDKLQVRPNMLYRW 
MNYQQFVRSKQMRASNRTRLYSQYRQEQVQEKLSQLGIFSYMDMQYAPRDTTAACDTLDV TMQATFAKPLDAELELNVVTKSNDQTGPGASFGVTRNNVFGGGESWNVKLKGSYEWQTGG GEKSSLMNSWEMGLSTSLTFPRVVFPHLGKREFDFPATTTFRLYINQLNRAKYYKLLSFG GNATYDFQPSRTSRHSITPFKLTFNVLQHQSEDFKEIAEANPALYVSLRNQFIPAMEYTY TYDNASGRRIKNPIWWQSTVTSAGNITSLIYRAFGKPFNEEDKSLLGAPFAQFVKLNTEL RHLWNIDKNNAIASRMAVGALFTYGNATIAPYSEQFYVGGANSIRAFTVRSVGPGGYHPE EGKYSYLDQTGTFRFEANVEYRFRIFKSFWGATFLDAGNVWLLRKDRKLLADGKEARPNS QLRLKTFPKQIALGTGVGIRYDMDILVFRLDFGIPLHLPYDTERSGYYNVTGAFFKNLGI HFAIGYPF >gi|226332186|gb|ACIC01000134.1| GENE 62 93117 - 94028 1099 303 aa, chain + ## HITS:1 COG:no KEGG:BT_3808 NR:ns ## KEGG: BT_3808 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 303 1 303 303 621 100.0 1e-177 MKPTLFLLAAGMGSRYGGLKQLDGLGPNGETIMDYSIYDAINAGFGKLVFVIRKDFEQDF REKIISKYEGHIPCELVFQSIDDLPEGFTCPADRTKPWGTNHAVMMGADVISEPFAVINC DDFYGRDSFQVMGKFLSALPENSKNVYSMVGFRVGNTLSESGTVSRGICSTDANNLLTSV VERTKIQRLDGEVKYIDDNGEWTATPDTTPVSMNFWGFTPDYFAYSEEFFKTFLSDPKNM ENLKSEFFIPLMVDKLINDGTATVEVLDTTSKWFGVTYPEDRQSVVDKIQALVDAGEYPA KLF >gi|226332186|gb|ACIC01000134.1| GENE 63 94172 - 96127 2027 651 aa, chain + ## HITS:1 COG:all4183 KEGG:ns NR:ns ## COG: all4183 COG0488 # Protein_GI_number: 17231675 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Nostoc sp. 
PCC 7120 # 1 532 1 531 564 403 42.0 1e-112 MISVEGLSVEFNATPLFADVSYVINKKDRIALVGKNGAGKSTMLKILAGLQNPTRGVVAV PKDVTIGYLPQVMILSDVRTVMEEAELAFEHIFELQANLERMNQQLADRTDYESEEYHQL IDRFTHENDRFLMMGGTNFQAEIERTLLGLGFSREDFNRSTSEFSGGWRMRIELAKLLLR RPDVLLLDEPTNHLDIESIQWLENFLSTRANAVVLVSHDRAFLNNVTTRTIEITCGQIYD YKVKYDEFVVLRKERREQQLRAYENQQKQIQDTEDFIERFRYKATKAVQVQSRIKQLEKI DRIEVDEEDNSSLRLKFPPASRSGNYPVICEDVRKAYGQHVVFHDVNLTINRGEKVAFVG KNGEGKSTLVKCIMGEIDFDGKLTIGHNVQIGYFAQNQAQMLDENMTVFDTIDRVATGDI RLKIRDILGAFMFGGEASDKKVKVLSGGERTRLAMIKLLLEPVNFLILDEPTNHLDMRSK DVLKEAIREFDGTVILVSHDRDFLDGLATKVYEFGGGTVKEHLGGIYDFLQKKKIDSLNE LQKGVSLSTSPTTSAKGNEADTEQPSENRLSYEAQKELNKKIKKLERQVADCEASIEETE SAIAILEAKMATPEGASDMQLYERHQKLKQQLDTTVEEWERVSMELEEVKN >gi|226332186|gb|ACIC01000134.1| GENE 64 96141 - 98177 2150 678 aa, chain + ## HITS:1 COG:MA2001 KEGG:ns NR:ns ## COG: MA2001 COG3590 # Protein_GI_number: 20090849 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted metalloendopeptidase # Organism: Methanosarcina acetivorans str.C2A # 32 678 16 665 665 538 40.0 1e-152 MKVTKYLPILAVCLMTTGCNSKKEAVLTSGIDLANLDTTAMPGTCFYQYACGGWIESHPL TDEYSRFGSFDMLHEKSREQLKELIAELAAKKDNAPGSAAQKVGDLYNIAMDSVKLNKEG AAPIKEEMAAIDALKDKEEIYTYIAESQKKGIRPYFTMFVSADDMNSSMNMVQTYQGGLG MGQRDYYLENDEQTKSIRDKYKEHLVKMFQLAGYDEATAQKAMVAVMNIETRLAKAARSQ VELRDPHANYNKMDMETLKKNFPTFNWDAYFTTSGLNDLKEVNVGQPAAMKEVADVINTV PLEDQKFYLQWNLIDAAASFLSDDFVAQNFDFYSRTMSGKKEMQPRWKRAVSTVDGSLGE VVGQMYVEKYFPAAAKERMVGLVKNLQTSLGERIKALEWMSEPTKVKALEKLATFHVKIG YPDKWKDYSALEIKDDSYWVNIERASQWDFNDMIAKAGKPVDKDEWLMTPQTVNAYYNPT TNEICFPAAILQAPFFDMNADDAMNYGAIGVVIGHEMTHGFDDQGRQYDKDGNLKDWWTE EDAKKFEERAQVMVNFFDSIEVAPGVHANGELTLGENIADHGGLQVSFAAFKKAMETAPL EVVDGFTPEQRFFLAYANVWAGHIRPEEILRLTKLDPHSLGKWRVDGALPHIQNWYEAFN ITEHDPMFVAKDKRVSIW >gi|226332186|gb|ACIC01000134.1| GENE 65 98329 - 99852 1737 507 aa, chain + ## HITS:1 COG:aq_1963 KEGG:ns NR:ns ## COG: aq_1963 COG0138 # Protein_GI_number: 15606962 # Func_class: F Nucleotide transport and metabolism # Function: AICAR 
transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Aquifex aeolicus # 10 507 3 506 506 429 46.0 1e-120 MSESKRIKTALVSVYHKEGLDEIITKLYEEGVEFLSTGGTRQFIESLGYPCKAVEDLTTY PSILGGRVKTLHPKIFGGILCRRDLEQDIQQIEKYEIPEIDLVIVDLYPFEATVASGASE ADIIEKIDIGGISLIRAAAKNYNDVIIVASQAQYKPLLDMLMEHGATSSLEERRWMAKEA FAVSSHYDSAIFNYFDAGEGSAFRCSVNNQKQLRYGENPHQKGYFYGNLDAMFDQIHGKE ISYNNLLDINAAVDLIDEYEDLTFAILKHNNACGLASRPTVLEAWTDALAGDPVSAFGGV LITNGVIDKAAAEEINKIFFEVIIAPDYDVDALEILGQKKNRIILVRKEAKLPKKQFRAL LNGVLVQDKDMNIETVADLRTVTDKAPTPEEVEDLLFANKIVKNSKSNAIVLAKGKQLLA SGVGQTSRVDALKQAIEKAKSFGFDLNGAVMASDAFFPFPDCVEIADKEGITAVIQPGGS VKDDLTFAYCNEHGMAMVTTGIRHFKH >gi|226332186|gb|ACIC01000134.1| GENE 66 99960 - 100982 1059 340 aa, chain + ## HITS:1 COG:CAC1242 KEGG:ns NR:ns ## COG: CAC1242 COG1077 # Protein_GI_number: 15894525 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Clostridium acetobutylicum # 1 335 1 332 335 284 45.0 2e-76 MGLFSFTQEIAMDLGTANTIIITNGKIVVDEPSVVALDRRTDKMIAVGEKAKLMHEKTHE NIRTIRPLRDGVIADFYACEQMMRGLIKQVNTRNHLFSPSLRMVIGVPSGSTEVELRAVR DSAEHAGGRDVYLIFEPMAAAIGIGIDVEAPEGNMIVDIGGGSTEIAVISLGGIVSNNSI RIAGDDLTADIQEYMSRQHNVKVSERMAERIKINVGAALTELGDDAPEDYIVHGPNRITA LPMEVPVCYQEVAHCLEKSISKVETAILSALENTPPELYADIVHNGIYLAGGGALLRGLD KRLTDKINIPFHIAEDPLHAVAKGTGVALKNVDRFSFLMR >gi|226332186|gb|ACIC01000134.1| GENE 67 101002 - 101844 719 280 aa, chain + ## HITS:1 COG:lin1582 KEGG:ns NR:ns ## COG: lin1582 COG1792 # Protein_GI_number: 16800650 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell shape-determining protein # Organism: Listeria innocua # 26 268 44 278 295 90 29.0 3e-18 MRNLINFLLKYNYWFLFIVLEVASFVLLFRFNNYQQSTFFTSANVVVGAVYEVSGGISSY FHLKSVNADLLDRNMVLEQQITNLEKALREHQVDSVTVNSIKEIPLTDYQLFKAHVIKNS LNQADNYITLDQGSSSGIRPEMGVVDGNGIVGIVYETSSSYSLVISVLNSKSNISCKIVG SDYFGYLKWEHGDSRYAYLKDLPRHAEFNLGDTVVTSGFSTVFPEGIMVGTVDDMSDSHD GLSYLLKIKLATDFGKVSDVRVIARNGQQEQKELENKAVK 
>gi|226332186|gb|ACIC01000134.1| GENE 68 101844 - 102341 330 165 aa, chain + ## HITS:1 COG:no KEGG:BT_3815 NR:ns ## KEGG: BT_3815 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 165 1 165 165 266 100.0 2e-70 MIITYIHRIGWFIGLVLLQVLILNNVHIAGYATPFLYIYFILKFASGTSRNELMLWAFFF GLTIDIFADTPGMNAAATVLLAFLRPSLLRLFTPRDNLDSIIPSFKSMGITPFLKYTTAS VFVHSLALLSIEFFSFTSIWLLLLRVISCTLLTLTCILAVEGIRR >gi|226332186|gb|ACIC01000134.1| GENE 69 102344 - 104206 1705 620 aa, chain + ## HITS:1 COG:RSc0062 KEGG:ns NR:ns ## COG: RSc0062 COG0768 # Protein_GI_number: 17544781 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Ralstonia solanacearum # 7 600 11 633 801 281 32.0 4e-75 MAKDYRLEKRKFVIGGIALTIVLIYLIRLFVLQITTDDYKKNADSNAFLNKIQYPSRGAI YDRNGKLLVFNQPAYDITLVPKEVENLDTLDLCESLGITRAQFLKIMSDMRDRRRNPGYS RYTNQLFMSQLSAEECGVFQEKLFKFPGFYIQRRTIRQYSYNAAAHALGDIGEVSVKDME ADEEGYYIRGDYVGKLGVERSYEKYLRGEKGVEILLRDAHGRIQGRYMDGEYDRPSIPGK NLTLSLDIDLQILGERLLKNKIGSIVAIEPETGEILCLVSSPNYDPHLMIGRQRGKNHLM LQRDKQKPLLNRALMGVYPPGSTFKTAQGLTFLQEGIITEQSPSFPCSRGFHYGRLTVGC HAHGSPLPLIPAIATSCNSYFCWGLFRMFGDRKYGSPQNAITVWKDHMVSQGFGYKLGVD LPGEKRGLIPNAQFYDKAYRGHWNGLTVISISIGQGEILSTPLQIANLGATIANRGHFTT PHIVKEIQDAQLDSIYRHPRNTTIERRHYESVVEGMRAAATGGTCRMLSVMVPELEACGK TGTAQNRGHDHSVFMGFAPMNQPKIAIAVYVENGGWGATYGVPIGALMMEQYMKGKLSPE NELRAEEISNRIILYGNEER >gi|226332186|gb|ACIC01000134.1| GENE 70 104187 - 105644 1022 485 aa, chain + ## HITS:1 COG:TP0501 KEGG:ns NR:ns ## COG: TP0501 COG0772 # Protein_GI_number: 15639492 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Treponema pallidum # 52 479 46 430 433 167 32.0 4e-41 MVTRNDSLWKTLDWVTIGIYLLLIVGGWFSVCGASYDYGDRDFFDFSTRAGKQFVWIICS FGLGFVLLMLEDRMYDMFAYIIYVGMIVLLIVTIFIAPDTKGSRSWLVMGPVSLQPAEFA KFATALALAKYMNSYSFSIKKEKCAFVLGFIILLPMLLIIGQRETGSALVYLAFFLVLYR 
EGMPGVVLFAGLCAVIYFVVGIRFDEVFIADTPTPLGEFIVLLLILLFAGGMVWVYPKKW EPTRNIIGGSLIILLIAYLISEYGIHFSLVWVQWGLCVLVIGYLLYLALRERQRSYFLIA LFAIGSIGFLYSSNYVFDNILESHQQIRIKVVLGMEEDLTGAGYNVNQSKIAIGSGGLTG KGFLNGTQTKLKYVPEQDTDFIFCTVGEEQGFVGSAAVLLAFLILILRLIALSERQTSIF ARVYGYSVVSIFLFHLFINVGMVLGLTPVIGIPLPFFSYGGSSLWGFTILLFIFLRLDAG RGRRL >gi|226332186|gb|ACIC01000134.1| GENE 71 105646 - 106125 197 159 aa, chain - ## HITS:1 COG:no KEGG:BT_3818 NR:ns ## KEGG: BT_3818 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 159 1 159 159 338 100.0 4e-92 MKSLLRNSLFCLFGACLMAACNENTVYHSYQSLPNEGWGKSDTLSFQCPVTDSIPGTLRL FAEVRNRSEYPYRNLYLFIHENLLDSTVWRTDTIAVNLADSTGRWTGNGWGSIYQTAVFV KSVRPLHPGNYTVKIVHGMQDEKLTGLNDVGIRIEKQGE >gi|226332186|gb|ACIC01000134.1| GENE 72 106103 - 107698 1456 531 aa, chain - ## HITS:1 COG:BS_yaaT KEGG:ns NR:ns ## COG: BS_yaaT COG1774 # Protein_GI_number: 16077100 # Func_class: S Function unknown # Function: Uncharacterized homolog of PSP1 # Organism: Bacillus subtilis # 41 275 3 231 275 166 38.0 1e-40 MEYKLHNGSGGLCCKGCSRQDKKLNTYDWLADIPGNAEECDMVEVQFKNTRKGYYRNSNK IKLEKGDIVAVEAAPGHDIGVVTLTGRLVPLQMKKANFKTDVEIKRIYRKAKPVDMEKFN EAKAKEHATMIRARQIALNLNLDMKIGDVEYQGDGNKAIFYYIADERVDFRQLIKVLAEA FRVRIEMKQIGARQEAGRIGGIGPCGRELCCATWMTSFVSVSTSAARFQDISLNPQKLAG QCAKLKCCLNYEVDCYVEAQKRLPSKEIELETKDGTFYFFKADILSNHVSYSTDKNFPAN LVTISGKRAFEVIAMNKKGMKPDSLLEAEVKQESKKPIDLLEQESLTRFDRNRNNKDGNN GNRNKKKRKGNNDNRPQAQAESGNRPQQPQRGENENRPQPSENGNRGERGDRENRPRNNN NNNRNRGQNQGRNNENRRPERGQNQERPQNPNRPQNQERSQNQERPVNQERNQERNQERP QNRERQPKQERPQNQERPQEQGRQPNQERIPRPERNSNQEKPQNNEKPAQE >gi|226332186|gb|ACIC01000134.1| GENE 73 107713 - 108837 1123 374 aa, chain - ## HITS:1 COG:DR2410 KEGG:ns NR:ns ## COG: DR2410 COG2812 # Protein_GI_number: 15807400 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Deinococcus radiodurans # 3 238 13 209 615 102 31.0 1e-21 MFFRDVIGQEEIKQRLIQEVSEGRIPHAQLICGPEGVGKMPLAIAYARYISCTNRGETDA 
CGVCPSCVKFNKLVHPDVHFVFPIVKSSKGKREVCDDYIADWRPFVLKNPYFNLNHWLRE MDAENAQAIIYAKESDEILKKLSLKSSEGGFKITILWLPEKMHPVCANKLLKLLEEPPEK TIFLLVSEAPEMILPTILSRTQRMNVRKIDEPAIDRVLQSKYGIQPADSLSIAHLANGNF IKALETIHLNEENQLFFDLFVSLMRLSYQRKIREMKLWSEQVAGMGRERQKNFLEYCQRM IRENFIFNLHQRDLTYMTINEQNFASRFAPFVNERNVIGIMDELSEAQLHIEQNVNAKMV FFDFSLKMIVLLKQ >gi|226332186|gb|ACIC01000134.1| GENE 74 108839 - 109792 946 317 aa, chain - ## HITS:1 COG:aq_1429 KEGG:ns NR:ns ## COG: aq_1429 COG0685 # Protein_GI_number: 15606607 # Func_class: E Amino acid transport and metabolism # Function: 5,10-methylenetetrahydrofolate reductase # Organism: Aquifex aeolicus # 14 316 13 287 296 174 32.0 3e-43 MRVIDLIHNSEKTAFSFEILPPLKGTGIEKLYQTVDTLREFDPKYINITTHRSEYVYKDL GNGLFQRNRLRRRPGTVAVAAAIQNKYNITVVPHILCSGFTREETEYVLLDLQFLNITDL LVLRGDKAKHESVFTPEGDGYHHAIELQEQINNFNKGIFVDGSEMKVTNSPFSYGVACYP EKHEEAPNIESDLFWLKKKVEAGAEYAVTQLFYDNKKYFEFVEQAKAAGINVPIIPGIKP FKKLSQLSMVPKTFKVDLPEDLVKEALKCKNDKEAEQVGIEWCVAQCKELMAHGVPSIHF YSIGAVDSIKEVAKIIY >gi|226332186|gb|ACIC01000134.1| GENE 75 109949 - 110479 267 176 aa, chain - ## HITS:1 COG:no KEGG:BT_3822 NR:ns ## KEGG: BT_3822 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 176 1 176 176 322 100.0 3e-87 MANQLSRTDSSSNQDIRQRFNVKKTIIGVIVLLIGLCHLPEAILLIMDGISLDSILYLIG NILLCLLGIYQLKFRSREMRYLLTKSVVKEKNYSFNLKYLEPLKEMIESGNFSNNINIEK GTGGNLRLDVLMSADKKFAAVRLLQFIPYSYIPVMDMQYLRNDKIAALENFLEHSK >gi|226332186|gb|ACIC01000134.1| GENE 76 110732 - 111244 650 170 aa, chain - ## HITS:1 COG:PA4880 KEGG:ns NR:ns ## COG: PA4880 COG2193 # Protein_GI_number: 15600073 # Func_class: P Inorganic ion transport and metabolism # Function: Bacterioferritin (cytochrome b1) # Organism: Pseudomonas aeruginosa # 14 161 32 177 177 78 34.0 6e-15 MARESVKILQGKLDVESLISQLNAALSEEWLAYYQYWVGALVVEGAMRADVQGEFEEHAE EERRHAQLLADRIIELEGVPVLDPKKWFELARCKYDAPQGFDSVSLLKDNVASERCAILR YQEIADFTNGKDFTTCDIAKHILAEEEEHEQDLQDYLTDIARMKKSFLDK >gi|226332186|gb|ACIC01000134.1| GENE 77 111270 - 111461 79 63 aa, 
chain - ## HITS:1 COG:no KEGG:BT_3824 NR:ns ## KEGG: BT_3824 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 63 32 94 94 109 100.0 3e-23 MAHEHPPLNQSIYLPYWHFSILYTYILQEKSHFNACIPCIYKITVKSLLTLKQKHLPSVT YPT Prediction of potential genes in microbial genomes Time: Thu May 12 03:02:00 2011 Seq name: gi|226332185|gb|ACIC01000135.1| Bacteroides sp. 1_1_6 cont1.135, whole genome shotgun sequence Length of sequence - 5275 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 140 - 199 7.6 1 1 Tu 1 . + CDS 326 - 1978 781 ## BT_1868 hypothetical protein + Term 2012 - 2055 -0.4 + Prom 2090 - 2149 10.3 2 2 Tu 1 . + CDS 2267 - 2632 302 ## BT_1867 hypothetical protein + Prom 2670 - 2729 4.4 3 3 Op 1 . + CDS 2808 - 3020 180 ## BT_1866 hypothetical protein 4 3 Op 2 . + CDS 3082 - 3693 432 ## BT_1865 putative fiber protein 5 3 Op 3 . 
+ CDS 3701 - 4486 201 ## BT_1864 putative methyltransferase + Term 4505 - 4555 4.1 Predicted protein(s) >gi|226332185|gb|ACIC01000135.1| GENE 1 326 - 1978 781 550 aa, chain + ## HITS:1 COG:no KEGG:BT_1868 NR:ns ## KEGG: BT_1868 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 550 1 550 550 1051 99.0 0 MKHLSYVLTLVLIFFVACRRQQSVSTPLNAAERILKDNPDSAFRIIQSIPYPDKLDKEDL VRWCLIAGKAADKLNTSLPPSYYFDHAYSWLIKYGITKDQVDIGLFWGRSLVADGEYDKA MSIYAEVLAIAKEKELYNEAGYICTYIADLYDFRDMQKEARHQYEEAQGLFKKAGNIKSH VFALQNIVREWAFFDSLECALKYMGIADSIAQQLNDNEVSSSLDNAFGNIYLAMNQYSKA EYHFLKSTSLDKENELYNTNCLIKLYLQVGDILKARELLYTLPLDSSEYEYGIERAFYRV TKAEGSYKEALKHLETCESISDSITILQNNTKILEVEKKFNNLRLKEKNHQLELTQQKYM IIICICLIFGIAITLLYLLYRQKTKNEMNKQALELNNIKLELANAFIELDKKKNQLVVSQ KENESSQSRLENEIKNLTSNYKKLQRRRIVTSIIFRKLVNIAERSTNCNEPLLTEQLWFS IVSEITETYPNLKMYLLERYPNLSSQEWEYCCLCMFNFDSKTEARLLGINPSSVSTKRLR FRQKLGISAL >gi|226332185|gb|ACIC01000135.1| GENE 2 2267 - 2632 302 121 aa, chain + ## HITS:1 COG:no KEGG:BT_1867 NR:ns ## KEGG: BT_1867 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 121 1 121 121 228 100.0 5e-59 MKKFELVLLVMLLIHSFSVMIEAKQHVKTKSDWVIQRTPVDPEWLKVYVDDESKRLCLNF KDSFAPITVEVKDIEKQIVFQSIIFPVAAGEYTLYLGDLSLGQYELYMYNASVKVVGNFT L >gi|226332185|gb|ACIC01000135.1| GENE 3 2808 - 3020 180 70 aa, chain + ## HITS:1 COG:no KEGG:BT_1866 NR:ns ## KEGG: BT_1866 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 70 1 70 70 126 100.0 3e-28 MRKLFMLFSFSMMILCGCTDESEEVNTNNETTTGITTRAVGVVVWEAQPGTVDIPFGQAV TIGLDGSVTK >gi|226332185|gb|ACIC01000135.1| GENE 4 3082 - 3693 432 203 aa, chain + ## HITS:1 COG:no KEGG:BT_1865 NR:ns ## KEGG: BT_1865 # Name: not_defined # Def: putative fiber protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 203 1 203 203 377 100.0 1e-103 MKSLFFKSILLVVFLAVAVLANAQIKYYSNGKLTIGNTEPYSFYHTTFSGNGIYMKCNAS 
NFFQIDCSPAATRLASHYDQVVFYNTASGVYNSIQVKNVYNYSDARAKININPLGYGLNV LSKLNAVSYDFKDKNEPAAAAFRVGGDGKEIGLLAQEVEKVLPNIVLTDPDGNKLINYTA IIPIMIQSIKELKAEVEALKANK >gi|226332185|gb|ACIC01000135.1| GENE 5 3701 - 4486 201 261 aa, chain + ## HITS:1 COG:no KEGG:BT_1864 NR:ns ## KEGG: BT_1864 # Name: not_defined # Def: putative methyltransferase # Organism: B.thetaiotaomicron # Pathway: not_defined # 37 261 1 225 225 457 100.0 1e-127 MQQRHIDRYSYFNELAITSRDFYIDYLEKFVAIKRGMNILEIGCGEGGNLLPFAEKECNV TGIDRSEERISQAISYFKLLGFNGRFIYSDFFDFSSEEDMNKYDIILIHDVIEHIDKCRK VEFILHAKEFLSETGIIFWGFPAWQMPFGGHQQICRNGFCSKMPFIHLLPLSVYNKILFI FGENDSCIKELLNIKAAGLSIEHFESIIKESNGRIVDRILWFINPHYKQKFKLKPRKLMP ILSSIRYIRNYFSTSCFYITK Prediction of potential genes in microbial genomes Time: Thu May 12 03:02:36 2011 Seq name: gi|226332184|gb|ACIC01000136.1| Bacteroides sp. 1_1_6 cont1.136, whole genome shotgun sequence Length of sequence - 73040 bp Number of predicted genes - 95, with homology - 91 Number of transcription units - 38, operones - 22 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 62 - 406 138 ## PGN_0055 probable lysozyme + Prom 413 - 472 4.8 2 1 Op 2 . + CDS 495 - 1703 997 ## BDI_2229 transposase + Term 1704 - 1740 3.2 3 2 Op 1 . + CDS 2209 - 2394 173 ## BDI_2226 putative transcriptional regulator UpxY-like protein 4 2 Op 2 . + CDS 2375 - 2782 285 ## BDI_3243 putative transcriptional regulator UpxY-like protein 5 2 Op 3 . + CDS 2882 - 3559 522 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase + Term 3665 - 3709 1.2 + Prom 3566 - 3625 4.9 6 3 Tu 1 . + CDS 3753 - 4358 17 ## COG0477 Permeases of the major facilitator superfamily + Prom 4360 - 4419 5.9 7 4 Tu 1 . + CDS 4472 - 5935 1152 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Term 6040 - 6078 6.6 + Prom 5964 - 6023 2.1 8 5 Op 1 . + CDS 6167 - 7006 371 ## CKR_2318 hypothetical protein 9 5 Op 2 . 
+ CDS 7003 - 7662 278 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 10 5 Op 3 . + CDS 7675 - 8409 345 ## COG0030 Dimethyladenosine transferase (rRNA methylation) + Term 8449 - 8495 6.6 + Prom 8834 - 8893 7.0 11 6 Op 1 . + CDS 8913 - 9254 157 ## COG4584 Transposase and inactivated derivatives 12 6 Op 2 . + CDS 9133 - 9369 89 ## gi|57118026|gb|AAW34151.1| transposase 13 6 Op 3 23/0.000 + CDS 9439 - 9732 207 ## COG2963 Transposase and inactivated derivatives + Prom 9827 - 9886 4.3 14 6 Op 4 . + CDS 9993 - 10607 280 ## COG2801 Transposase and inactivated derivatives + Term 10666 - 10701 2.1 + Prom 10646 - 10705 3.1 15 7 Op 1 . + CDS 10744 - 11415 274 ## Ccel_2811 DNA polymerase beta domain protein region 16 7 Op 2 . + CDS 11412 - 12152 555 ## PROTEIN SUPPORTED gi|239830964|ref|ZP_04679293.1| Ribosomal protein L11 methyltransferase 17 7 Op 3 . + CDS 12217 - 12684 352 ## Dtox_3136 metal dependent phosphohydrolase 18 8 Tu 1 . - CDS 12734 - 12919 85 ## gi|212694249|ref|ZP_03302377.1| hypothetical protein BACDOR_03775 - Term 13104 - 13132 -1.0 19 9 Tu 1 . - CDS 13337 - 13639 184 ## gi|189466910|ref|ZP_03015695.1| hypothetical protein BACINT_03292 - Prom 13848 - 13907 6.7 + Prom 13578 - 13637 3.2 20 10 Tu 1 . + CDS 13783 - 13926 59 ## gi|189466909|ref|ZP_03015694.1| hypothetical protein BACINT_03291 + Term 14130 - 14163 3.0 - Term 14311 - 14341 -0.5 21 11 Tu 1 . - CDS 14368 - 14544 57 ## 22 12 Tu 1 . - CDS 14659 - 15945 445 ## BVU_1598 transposase - Prom 16052 - 16111 5.5 + Prom 16012 - 16071 6.7 23 13 Op 1 . + CDS 16094 - 16582 382 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains 24 13 Op 2 . + CDS 16579 - 16767 142 ## gi|212694256|ref|ZP_03302384.1| hypothetical protein BACDOR_03782 + Term 16810 - 16842 3.1 + Prom 16878 - 16937 3.3 25 14 Op 1 . + CDS 17002 - 18144 701 ## gi|212694257|ref|ZP_03302385.1| hypothetical protein BACDOR_03783 26 14 Op 2 . 
+ CDS 18135 - 18692 423 ## gi|212694258|ref|ZP_03302386.1| hypothetical protein BACDOR_03784 + Term 18878 - 18928 5.0 27 15 Tu 1 . - CDS 18855 - 19127 181 ## COG0739 Membrane proteins related to metalloendopeptidases - Term 19299 - 19332 5.2 28 16 Tu 1 . - CDS 19481 - 19675 104 ## gi|254882509|ref|ZP_05255219.1| predicted protein - Prom 19712 - 19771 2.6 + Prom 19553 - 19612 4.2 29 17 Op 1 . + CDS 19705 - 20466 762 ## BF2874 hypothetical protein 30 17 Op 2 . + CDS 20490 - 20714 269 ## BF2875 hypothetical protein 31 17 Op 3 . + CDS 20785 - 22347 1233 ## BF2876 hypothetical protein 32 17 Op 4 . + CDS 22378 - 23520 732 ## BF2877 hypothetical protein 33 17 Op 5 . + CDS 23517 - 24002 230 ## BF2878 hypothetical protein 34 17 Op 6 . + CDS 23983 - 24684 672 ## BF2879 hypothetical protein 35 17 Op 7 . + CDS 24702 - 25385 677 ## BF2880 hypothetical protein 36 17 Op 8 . + CDS 25479 - 26141 627 ## BF2881 hypothetical protein 37 17 Op 9 . + CDS 26157 - 26984 555 ## COG0739 Membrane proteins related to metalloendopeptidases 38 17 Op 10 . + CDS 27010 - 27282 117 ## gi|253571436|ref|ZP_04848842.1| conserved hypothetical protein 39 17 Op 11 . + CDS 27292 - 29091 1215 ## BF2884 hypothetical protein + Prom 29093 - 29152 1.6 40 18 Op 1 . + CDS 29174 - 31630 2172 ## COG0249 Mismatch repair ATPase (MutS family) 41 18 Op 2 . + CDS 31635 - 33191 1097 ## COG1705 Muramidase (flagellum-specific) 42 18 Op 3 . + CDS 33188 - 33370 188 ## gi|212694274|ref|ZP_03302402.1| hypothetical protein BACDOR_03800 + Term 33385 - 33427 0.3 43 18 Op 4 . + CDS 33443 - 34048 345 ## BF2888 hypothetical protein 44 18 Op 5 . + CDS 34055 - 35977 1480 ## COG4227 Antirestriction protein + Term 36040 - 36084 7.1 - Term 36026 - 36071 11.1 45 19 Op 1 . - CDS 36084 - 36449 319 ## BVU_3418 hypothetical protein 46 19 Op 2 . - CDS 36452 - 36745 239 ## gi|212694278|ref|ZP_03302406.1| hypothetical protein BACDOR_03804 + Prom 37036 - 37095 7.2 47 20 Tu 1 . 
+ CDS 37140 - 38402 758 ## COG1373 Predicted ATPase (AAA+ superfamily) + Term 38449 - 38494 4.6 - Term 38430 - 38485 14.6 48 21 Op 1 . - CDS 38543 - 39949 731 ## BF2899 putative outer membrane protein 49 21 Op 2 . - CDS 40004 - 41455 953 ## BF2900 hypothetical protein - Prom 41589 - 41648 5.4 50 22 Op 1 . - CDS 41664 - 42548 545 ## BF2901 hypothetical protein 51 22 Op 2 . - CDS 42628 - 43122 258 ## BF2902 hypothetical protein 52 22 Op 3 . - CDS 43112 - 43525 351 ## BF2903 hypothetical protein 53 22 Op 4 . - CDS 43604 - 44050 470 ## BF2904 hypothetical protein 54 22 Op 5 . - CDS 44075 - 45187 700 ## BF2905 hypothetical protein - Prom 45284 - 45343 4.7 + Prom 44887 - 44946 4.5 55 23 Tu 1 . + CDS 45122 - 45319 72 ## + Prom 45330 - 45389 4.3 56 24 Op 1 . + CDS 45442 - 46065 575 ## BF2906 serine type site-specific recombinase 57 24 Op 2 . + CDS 46082 - 46765 467 ## Cag_0377 hypothetical protein 58 24 Op 3 . + CDS 46769 - 47305 271 ## Cag_0376 hypothetical protein + Prom 47408 - 47467 7.0 59 25 Op 1 . + CDS 47500 - 47799 229 ## COG0526 Thiol-disulfide isomerase and thioredoxins 60 25 Op 2 . + CDS 47840 - 48385 514 ## Mmc1_1123 hypothetical protein 61 25 Op 3 . + CDS 48400 - 48645 298 ## gi|212694292|ref|ZP_03302420.1| hypothetical protein BACDOR_03818 62 25 Op 4 . + CDS 48626 - 48829 109 ## gi|224026325|ref|ZP_03644691.1| hypothetical protein BACCOPRO_03081 63 25 Op 5 . + CDS 48832 - 49266 230 ## Mmol_0134 hypothetical protein 64 25 Op 6 . + CDS 49263 - 49469 89 ## gi|212694295|ref|ZP_03302423.1| hypothetical protein BACDOR_03821 65 25 Op 7 . + CDS 49479 - 49715 121 ## BVU_3408 hypothetical protein 66 25 Op 8 . + CDS 49740 - 49919 238 ## Shel_03840 hypothetical protein + Term 49926 - 49969 6.0 67 26 Op 1 . - CDS 50085 - 50198 118 ## gi|224026330|ref|ZP_03644696.1| hypothetical protein BACCOPRO_03086 68 26 Op 2 . - CDS 50220 - 50711 439 ## COG4474 Uncharacterized protein conserved in bacteria 69 26 Op 3 . 
- CDS 50715 - 51281 437 ## BF2868 putative ribose phosphate pyrophosphokinase - Term 51781 - 51825 9.0 70 27 Op 1 . - CDS 51833 - 52111 226 ## BF2825 hypothetical protein 71 27 Op 2 . - CDS 52119 - 52874 692 ## COG1192 ATPases involved in chromosome partitioning - Prom 52903 - 52962 8.0 - Term 52920 - 52965 7.6 72 28 Op 1 . - CDS 52994 - 53224 151 ## BF2827 putative single stranded DNA binding protein 73 28 Op 2 . - CDS 53244 - 53420 202 ## BF2827 putative single stranded DNA binding protein 74 28 Op 3 . - CDS 53474 - 53830 112 ## BT_4618 hypothetical protein 75 28 Op 4 . - CDS 53878 - 54129 214 ## gi|212694304|ref|ZP_03302432.1| hypothetical protein BACDOR_03830 76 28 Op 5 . - CDS 54117 - 54671 547 ## gi|212694305|ref|ZP_03302433.1| hypothetical protein BACDOR_03831 - Prom 54886 - 54945 2.8 77 29 Tu 1 . + CDS 55942 - 56418 107 ## - Term 56170 - 56207 2.2 78 30 Tu 1 . - CDS 56224 - 56610 473 ## BF2915 putative single strand binding protein - Prom 56748 - 56807 3.7 79 31 Tu 1 . + CDS 56647 - 56832 93 ## gi|254884355|ref|ZP_05257065.1| predicted protein - Term 57386 - 57435 8.2 80 32 Op 1 . - CDS 57440 - 57622 217 ## gi|212694313|ref|ZP_03302441.1| hypothetical protein BACDOR_03839 81 32 Op 2 . - CDS 57638 - 58222 529 ## gi|212694314|ref|ZP_03302442.1| hypothetical protein BACDOR_03840 - Prom 58242 - 58301 3.0 + Prom 58180 - 58239 6.5 82 33 Tu 1 . + CDS 58288 - 58500 90 ## gi|212694315|ref|ZP_03302443.1| hypothetical protein BACDOR_03841 83 34 Op 1 . + CDS 59228 - 59773 279 ## BT_4616 hypothetical protein 84 34 Op 2 . + CDS 59770 - 61032 507 ## COG0582 Integrase + Term 61103 - 61148 7.4 + Prom 61263 - 61322 4.0 85 35 Op 1 . + CDS 61347 - 61700 301 ## BT_4618 hypothetical protein 86 35 Op 2 . + CDS 61700 - 62296 340 ## BT_4619 hypothetical protein 87 35 Op 3 . + CDS 62247 - 64052 868 ## BT_4620 hypothetical protein 88 36 Op 1 . + CDS 64154 - 64522 177 ## BT_4621 mobilization protein BmgB 89 36 Op 2 . 
+ CDS 64506 - 65405 457 ## BT_4622 mobilization protein BmgA 90 36 Op 3 . + CDS 65405 - 66058 406 ## BT_4623 hypothetical protein 91 37 Tu 1 . - CDS 66089 - 66286 61 ## PRU_0358 DNA mismatch endonuclease Vsr family protein - Prom 66356 - 66415 3.4 - Term 66417 - 66459 -0.9 92 38 Op 1 . - CDS 66626 - 69106 652 ## RCIX2535 hypothetical protein 93 38 Op 2 . - CDS 69108 - 70118 342 ## RCIX2534 hypothetical protein 94 38 Op 3 . - CDS 70102 - 72882 837 ## RCIX2532 endonuclease 95 38 Op 4 . - CDS 72885 - 73034 79 ## Predicted protein(s) >gi|226332184|gb|ACIC01000136.1| GENE 1 62 - 406 138 114 aa, chain + ## HITS:1 COG:no KEGG:PGN_0055 NR:ns ## KEGG: PGN_0055 # Name: not_defined # Def: probable lysozyme # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 1 111 59 169 171 102 44.0 4e-21 MGYGHQLQKGERFPKTLTRQRADLLLRTDLLRHLRLYARYGRDAYLLATLSYQIGPAKLL GNGRYPKASLLTRLERGDRDILPLYLSYCKWRGKAVASIRKRRWVEYQLLYAER >gi|226332184|gb|ACIC01000136.1| GENE 2 495 - 1703 997 402 aa, chain + ## HITS:1 COG:no KEGG:BDI_2229 NR:ns ## KEGG: BDI_2229 # Name: not_defined # Def: transposase # Organism: P.distasonis # Pathway: not_defined # 1 402 1 404 404 434 53.0 1e-120 MASIKLKFRTSSVQEKEGRLYYQVIHNRVARQIHTEYRIYSSEWDADHSNIILPTSVTPQ RQAYLLSLKDTLDADRKKLLLVIARLDKEGQSYTADTVVDNFHEGKELHGIIGYTLELNE KLRRIGKKRMVARYKTTLNSLQRYLKGGDVPLEEVDGTTIQGYEQWLKDSGLCRNTTSFY IRNLRTIYNHAVDDGLVISSSPFKHVYTGIDKTVKRALPLEIIKQLKELDLSLNPRLELA RDMFLFSFYTRGMSFIDMAQLTPSNLHGSTLIYRRQKTSQQLHIKWEPAMQEIVAKYKTD DSPYLLPIAKGEGAIFWRQYKNAYSRITKQLKKIGEMIGLSVPLTTYVARHSWASIAKSK NVPVSTISEALGHDSEKTTQIYLSSLDTSVVDNANNLIINSL >gi|226332184|gb|ACIC01000136.1| GENE 3 2209 - 2394 173 61 aa, chain + ## HITS:1 COG:no KEGG:BDI_2226 NR:ns ## KEGG: BDI_2226 # Name: not_defined # Def: putative transcriptional regulator UpxY-like protein # Organism: P.distasonis # Pathway: not_defined # 6 52 10 56 186 62 59.0 5e-09 MVNVLDKDKILWYVMRAYKNENTAEDRLSNETYGLEYFIPKQKVLRTVKGKKVFFYGTRH P >gi|226332184|gb|ACIC01000136.1| GENE 4 2375 - 2782 285 135 aa, chain + ## HITS:1 
COG:no KEGG:BDI_3243 NR:ns ## KEGG: BDI_3243 # Name: not_defined # Def: putative transcriptional regulator UpxY-like protein # Organism: P.distasonis # Pathway: not_defined # 1 134 60 185 186 112 43.0 3e-24 MVPVIHSLVFVHASHQKIVDFKHNYYYDLQFVTWKSGGELVYLTVPDEDMTNFIKVCKQA DEEVHFYKISEINKEKDKINIEKGKRVRVHGGPFDQVEGYFIKVAKKRGRQLVVIIPDLL AVSAEVKPEYIQIID >gi|226332184|gb|ACIC01000136.1| GENE 5 2882 - 3559 522 225 aa, chain + ## HITS:1 COG:BS_tagO KEGG:ns NR:ns ## COG: BS_tagO COG0472 # Protein_GI_number: 16080606 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Bacillus subtilis # 6 187 6 173 358 73 33.0 4e-13 MSLIWIVNILSVFFLCVFFAGIVIPQILLIAFRRRLFDEPDERKIHQCVVPRLGGMAFKP VVFFSFVLLLAVNVSTGHDELLKEIGAEALPLAYAFCAIIMLYLVGIADDLIGVRYRAKF FIQIVCGIMLVAGGVELSDLHGMLFIHSMPSWISIPLTVFVTVFIINAINLIDGIDGLAS GLCSIAFLFYGMTFIWFHQYLYAMLAFATLGVLIPLRAAYEMCRL >gi|226332184|gb|ACIC01000136.1| GENE 6 3753 - 4358 17 201 aa, chain + ## HITS:1 COG:RSp0310 KEGG:ns NR:ns ## COG: RSp0310 COG0477 # Protein_GI_number: 17548531 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Ralstonia solanacearum # 2 172 246 415 450 67 29.0 2e-11 MKEGMAVLRQNKGLFALLLVGTLYMFVYMPINALYPLITMEYFNGTPMHISIKEIAYASG MLIGGLLLGLFGNYQKRILLITASIFMMGISLTISGLLPQSGFFIFVVCCAIMGLSVPFY SGVQTALFQEKIKLEYLGRVFSLTGSIMSLAMPIGLILSGFFADRIGVNHWFLLSGILII CIAIVCPMITEIRKLDLKQNS >gi|226332184|gb|ACIC01000136.1| GENE 7 4472 - 5935 1152 487 aa, chain + ## HITS:1 COG:BH1900 KEGG:ns NR:ns ## COG: BH1900 COG0488 # Protein_GI_number: 15614463 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Bacillus halodurans # 3 472 2 465 538 303 38.0 3e-82 
MELILKAKDIRVEFKGRTCLDIDELEVYDYDRIGLVGANGAGKSTLFKVILGELTPPGCK MNRLGELAYIPQLDKVTLQEEKDFGLVGKLGVNQLDIQTMSGGEETRLKIAQALSAQVHG ILADEPTSHLDREGIDFLIGQLKYFIGALLVISHDRYFLDEVVDKIWELKDGKITEYWGN YSDYLRQKEEERKSQSAEYEQFIAERARLERAAEEKRKQARKIEKKAKGASKKKSTEGGG RLAHQKSMGSKAKKMHNAAKSLEHRIAALGKVEAPEDIRRIRFRQSKALELHNPYPIAGT EINKIFGDKVLFENVSFQIPLGAKVALTGSNGTGKTTLIQMILNHEDGISISPKAKIGYF AQNGYKYNSNQNVLEFMQEDCDYNVSEIRSVLASMGVKQNDIGKSLSVLSGGEIIKLLLA KMLMGRYNILLMDEPSNFLDIPSLEALEILMKEYAGTIVFITHDKRLLDNVADVIYEIKD KKLNLVR >gi|226332184|gb|ACIC01000136.1| GENE 8 6167 - 7006 371 279 aa, chain + ## HITS:1 COG:no KEGG:CKR_2318 NR:ns ## KEGG: CKR_2318 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 1 270 1 274 289 394 70.0 1e-108 MIDNIIQSVTEKLSSLPYIEGIVLGGSRARGTHTEDSDIDFGIYYNSELFDITVINQLAT ELDDENRSNLIVPPGAWGDWINGGGWLVINGYHVDLILRDIKRVVQIIKDTEQGIATANY QTGHPHGYISAMYRGELAISKILYANDENFYEFKKQAERYPTALQKGLTEFFMFEAGFSL MFAENNIDKDDVSYVCGHCFRSISSLNQVLFAVNKEYCINEKKAVKMIEDFKIKPSDYKE RVDKVISLISTNVDCTRKGIEILQRLVNEVEYLKGAHIQ >gi|226332184|gb|ACIC01000136.1| GENE 9 7003 - 7662 278 219 aa, chain + ## HITS:1 COG:PM1197 KEGG:ns NR:ns ## COG: PM1197 COG0110 # Protein_GI_number: 15603062 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Pasteurella multocida # 1 202 1 201 207 239 56.0 2e-63 MTGPDKKKLYPNENIKTVCYISNLPKRPNVEIGEYTYYSDNNKSPEKFYDNIEHHYEFLG DKLIIGKFCAIAAGVKFIMNGANHRMDGITTYPFNIFGCGWEKVTPTIEQLPFKGDTVIG NDVWICQNVTIMPGVKIGDGAIIAANSTVVKSVEPYSIYGGNPAKFIKKRFSDEKIEFLL KLEWWNWSEEEIFDNLEILTSEAGLEELMNKYSKRDEIN >gi|226332184|gb|ACIC01000136.1| GENE 10 7675 - 8409 345 244 aa, chain + ## HITS:1 COG:NMB0066 KEGG:ns NR:ns ## COG: NMB0066 COG0030 # Protein_GI_number: 15676003 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Dimethyladenosine transferase (rRNA methylation) # Organism: Neisseria meningitidis MC58 # 1 243 1 243 244 315 70.0 3e-86 
MNKVNIKDSQNFITSKYHIEKIMNCISLDEKDNIFEIGAGKGHFTAELVKRCNFVTAIEI DSKLCEVTRNKLLNYPNYQIVNDDILKFTFPSHNPYKIFGSIPYNISTNIIRKIVFESSA TISYLIVEYGFAKSLLDTNRSLALLLMAEVDISILAKIPRYYFHPKPKVDSTLIVLKRKP AKMAFKERKKYETFVMKWVNKEYEKLFTKNQFNKALKYARIYDINNISFEQFVSLFNSYK IFNG >gi|226332184|gb|ACIC01000136.1| GENE 11 8913 - 9254 157 113 aa, chain + ## HITS:1 COG:SA1624 KEGG:ns NR:ns ## COG: SA1624 COG4584 # Protein_GI_number: 15927380 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Staphylococcus aureus N315 # 5 102 1 97 166 63 42.0 9e-11 MVINMIYMININTEIFLRSVKDLNKLKLLVEVNNWDRPNFSAIARELGVDRRTVKKYYDG DIKKVRKSKKSKIDDFYDIISSLLSAETDQIFYYKSHLYRYLVREKRIRLFKK >gi|226332184|gb|ACIC01000136.1| GENE 12 9133 - 9369 89 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|57118026|gb|AAW34151.1| ## NR: gi|57118026|gb|AAW34151.1| transposase [Campylobacter jejuni] # 30 77 103 150 167 88 91.0 1e-16 MISMISFRVCFQQKRTKYFIINLISIDIWLEKKGLDCSRSNFNYYILKHEKFAEYFKPNK KKDAVKSETSFGKQAQWC >gi|226332184|gb|ACIC01000136.1| GENE 13 9439 - 9732 207 97 aa, chain + ## HITS:1 COG:pli0025 KEGG:ns NR:ns ## COG: pli0025 COG2963 # Protein_GI_number: 18450308 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Listeria innocua # 5 93 9 90 97 65 44.0 2e-11 MGKGRRYDQEYKDMIVELFKSGMSLAELSSEYGIAKSTINGWVKDVKEIKVDENEVMTLK EVKALKKEMARIKEENEILHQRRALAKKAMAIFATRN >gi|226332184|gb|ACIC01000136.1| GENE 14 9993 - 10607 280 204 aa, chain + ## HITS:1 COG:L0434 KEGG:ns NR:ns ## COG: L0434 COG2801 # Protein_GI_number: 15672639 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Lactococcus lactis # 1 197 84 275 279 158 43.0 8e-39 MTELGLCAVTVKKYKPHSSKKVAEDLEDVLKRDFTTTSINEKWVGDITYIYTIKDGWCYL ASVLDLYSKKIIGYAFDQRMTNDLVVKALKNAYYSQFPDKNKQLIFHSDLGSQYTSNDLR ELCKEFNIIQSFSKKGCPYDNACIESFHSLIKKEEIYRNTYRTFEEANMAIFKYIEGWYN RKRLHSSINYMTPDQCELLARSAA 
>gi|226332184|gb|ACIC01000136.1| GENE 15 10744 - 11415 274 223 aa, chain + ## HITS:1 COG:no KEGG:Ccel_2811 NR:ns ## KEGG: Ccel_2811 # Name: not_defined # Def: DNA polymerase beta domain protein region # Organism: C.cellulolyticum # Pathway: not_defined # 1 220 62 281 289 444 97.0 1e-123 MDDENRNNLVVPPGAWGDWINGGGWLVMNGYHVDLILRDIKRVEQIIKDTEQGIVTANYQ TGHPHGYISAMYRGELAISKIQYAKNESLCELKNQAEIYPGALKKSLINFFLFEAEFSLM FVKANAGAEDKYYIAGHVFRIISCLNQVLFACNNAYCINEKKAIKLLETFEYKPKKYAER VNHIFEVLGLSLFECYDMTEKLYKEVKKIATEINNFLNEGEFR >gi|226332184|gb|ACIC01000136.1| GENE 16 11412 - 12152 555 246 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|239830964|ref|ZP_04679293.1| Ribosomal protein L11 methyltransferase [Ochrobactrum intermedium LMG 3301] # 1 243 1 243 245 218 45 7e-56 MKENKYDDNIFFQKYSQMSRSQQGLAGAGEWETLRKLLPDFKDKRVLDLGCGYGWHCIYA MEHGASSVVGVDISHKMLEVAKEKTHFPQVEYKCCAIEDVEFPEESFDVILSSLAFHYVA DYEILVKKIYRILKSGGKLVFTVEHPVFTAYGTQDWHYNEKGEILHFPVDNYYYEGKRTA VFLGEKVTKYHRTLTTYLNTLLSNGFIINHIVEPQPPENMMDIPGMQDEMRRPMMLIVSA NKKVDR >gi|226332184|gb|ACIC01000136.1| GENE 17 12217 - 12684 352 155 aa, chain + ## HITS:1 COG:no KEGG:Dtox_3136 NR:ns ## KEGG: Dtox_3136 # Name: not_defined # Def: metal dependent phosphohydrolase # Organism: D.acetoxidans # Pathway: not_defined # 1 155 10 165 165 222 77.0 4e-57 MKEVFKEIPFGIEHTLKVLKNAEDIMNGENIGEEEKEFISIIAILHDIGAVEAQKKYGSI DGVYQEKEGPAVAKEILKKVGYNKNIDRICFIIGNHHTPSKIDGLDFQIQWEADLLENLT VMDKEKEQEKIKKCIGENFKTNTGKRIAYNRFILD >gi|226332184|gb|ACIC01000136.1| GENE 18 12734 - 12919 85 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|212694249|ref|ZP_03302377.1| ## NR: gi|212694249|ref|ZP_03302377.1| hypothetical protein BACDOR_03775 [Bacteroides dorei DSM 17855] # 1 61 1 61 61 81 100.0 2e-14 MFADIYEGVHELTVVEKIWSIIIDVIAIVAEIFTIDENELMEQIIRDNRRFKAMQLIAQT A >gi|226332184|gb|ACIC01000136.1| GENE 19 13337 - 13639 184 100 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|189466910|ref|ZP_03015695.1| ## NR: gi|189466910|ref|ZP_03015695.1| hypothetical protein BACINT_03292 [Bacteroides 
intestinalis DSM 17393] # 1 100 1 100 100 189 100.0 4e-47 MSKITKIPSEIRNFFTGKRINPAFSKFMTLLEGMRISDKDLGGKKRENCQLTNLQLFQII LILPFMAVPGFSHYAGSSISKLFGGKKDLLYSFMAKDNID >gi|226332184|gb|ACIC01000136.1| GENE 20 13783 - 13926 59 47 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|189466909|ref|ZP_03015694.1| ## NR: gi|189466909|ref|ZP_03015694.1| hypothetical protein BACINT_03291 [Bacteroides intestinalis DSM 17393] # 1 47 1 47 47 83 100.0 4e-15 MLRAFVQYTLNVDLNAPVIKDLVEKYKDIEQFNRSITDKYVAICKRF >gi|226332184|gb|ACIC01000136.1| GENE 21 14368 - 14544 57 58 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRERVHVNRTMKDLVADLRLMKSVLLSQYMPFLFAQRQEIGFPSRTEIDVPSPKTTQQ >gi|226332184|gb|ACIC01000136.1| GENE 22 14659 - 15945 445 428 aa, chain - ## HITS:1 COG:no KEGG:BVU_1598 NR:ns ## KEGG: BVU_1598 # Name: not_defined # Def: transposase # Organism: B.vulgatus # Pathway: not_defined # 1 428 1 429 429 652 72.0 0 MAKKAIKSEKLTPFGGIFAMMEQFDSTLSYVIDTTLGLRCRLYGYQYSEIIRSLMSIYFC GGSCIEDVTTHLMPHLSLHPTLRTCSSDTILRAIKELTQENISYTSDTGKNYDFNTADTL NTLLLNCMCAAGQLKEGEMYDVDFDHQFIETEKYDAKHTYKKFLGYRPGVAVIGDLIVGI ENSDGNTNVRFHQKDTLKRFFERFEQNNLVINRFRADCGSCSEEIVEEVATHCKTFYIRA NRCSSLYNDLFALRGWKRGEINGIEFELNSILVEKWKGKTYRLVIQRRKRMDGELDLWEG EYTYRCILTNDYEPSVREIVEFYNLRGGKERIFDDMNNGFGWSRLPKSFMAENTVFLLLT ALIRNFYKVIIQRLDVKRFGPNKTSRIKAFVFRFVSVPAKWIRTSRWYVLNIYTCNNAYA DVFQTDFG >gi|226332184|gb|ACIC01000136.1| GENE 23 16094 - 16582 382 162 aa, chain + ## HITS:1 COG:aq_301 KEGG:ns NR:ns ## COG: aq_301 COG0449 # Protein_GI_number: 15605831 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Aquifex aeolicus # 2 162 432 592 592 190 56.0 1e-48 MLGLADKIKNLSKIYTYARNFLYLGRGYNYPTALEGALKLKEISYIHAEGYPAAEMKHGP IALVDSEMPTVVIAPTDSLYDKIISNVRQVKARGGTVVAIITKGNDAMKDIADHCLEVPD VPECLTPLVVSVPLQLLAYYIAINKDRNVDQPRNLAKSVTVE >gi|226332184|gb|ACIC01000136.1| GENE 24 16579 - 16767 142 62 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|212694256|ref|ZP_03302384.1| ## NR: gi|212694256|ref|ZP_03302384.1| hypothetical protein BACDOR_03782 [Bacteroides dorei DSM 17855] # 1 62 1 62 62 97 100.0 2e-19 MNRILSLIVTILFALCIEAGIFMLIAHWWGMTAAITVIAIECVLALWTVYEMRHASDEPV DE >gi|226332184|gb|ACIC01000136.1| GENE 25 17002 - 18144 701 380 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212694257|ref|ZP_03302385.1| ## NR: gi|212694257|ref|ZP_03302385.1| hypothetical protein BACDOR_03783 [Bacteroides dorei DSM 17855] # 1 380 17 396 396 720 100.0 0 MKVEINKKRLEYLLALYKMSVDDLLSLLNKGRKRITGAADISGDSIDLGVLKRIGEIFDK EVSFFQDYSKLSTNASSSIFFRKTSFGTELNLESIRTVNRFESLKNALDAYNKLSQLDVK FDIEHYTLQDDPKTVAVRARDFFYPGETVNHRQFLVKMIEKIADHGIFVFEYIETWNKKE KTNIDGFYLKPNVIVLKHHKHYKREIFTLAHELGHCLLGIEEVESVDMMDISAQTSYSDV ERWCNDFAYQFIMGQEAETLANIACVDSRNDYCIDLFKAISARTHISRLALYTRMYIDRK ISYSSYSNVKSELAEEYRRREEQEKLKNSEKRGGRTPKPIISPLFLKTMQYAYFNGVVNE TTFCSRLNVKPANFERELWR >gi|226332184|gb|ACIC01000136.1| GENE 26 18135 - 18692 423 185 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212694258|ref|ZP_03302386.1| ## NR: gi|212694258|ref|ZP_03302386.1| hypothetical protein BACDOR_03784 [Bacteroides dorei DSM 17855] # 1 185 1 185 185 335 100.0 1e-90 MAIVVDTCSLVMIAKNYLPLDKDGQLYSFLEEAFSRKDLMLLDVILDESKRTSKGIAVEK MPFLKDKKLVIPTKDLFPCAPERFSNMIDNNFCVRLKKQELTEEEYIEQKEEYLKTGDAK IIIYALNVRHSDAIHLEEMQVMTEETRQQNDGKLFKKLPLLCEQIGIGTLTVSEYLHRNG FFIDK >gi|226332184|gb|ACIC01000136.1| GENE 27 18855 - 19127 181 90 aa, chain - ## HITS:1 COG:ZyebA KEGG:ns NR:ns ## COG: ZyebA COG0739 # Protein_GI_number: 15802269 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Escherichia coli O157:H7 EDL933 # 4 65 312 374 419 67 50.0 9e-12 MMHGEVIKVGKDKRSGLYVTLRHGDFTVSYCHLSQTLVTKGTHVRPGIIIALTGNSGRST GPHLHLTLKDTKKGRAIDPSILLNLIKHPL >gi|226332184|gb|ACIC01000136.1| GENE 28 19481 - 19675 104 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|254882509|ref|ZP_05255219.1| ## NR: gi|254882509|ref|ZP_05255219.1| predicted protein [Bacteroides 
sp. 4_3_47FAA] # 1 64 1 64 64 104 100.0 2e-21 MQSYIPIQRQAELHKIMRVIFDDYYYLSLISAQKSNSIAILCYNLLVFPLFFMPVKELLV TFAP >gi|226332184|gb|ACIC01000136.1| GENE 29 19705 - 20466 762 253 aa, chain + ## HITS:1 COG:no KEGG:BF2874 NR:ns ## KEGG: BF2874 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 253 1 252 252 341 66.0 2e-92 MKQVRFEESEVPYQTLARFGLTQEKIEDLPMWALEDIGQGRRSPLLPIQVNNDEGETLKS RTRFALVRMEDGKVDVVFYPQLEKSPLEAFTQEQQEDLLAGKAILADVKDADGRSSKAFV QIDTETNQVMSVPTPVIGRNLEVLKDELKLSSAELTVMQKGEPLTLIMEDEQVTVGIDLN DKTGIRINQGDSQKWKENTKREWDKYTFGCYGCWVMGDDGNLDYVPEEEYTEELWNEQKK NGERNRASFSMHK >gi|226332184|gb|ACIC01000136.1| GENE 30 20490 - 20714 269 74 aa, chain + ## HITS:1 COG:no KEGG:BF2875 NR:ns ## KEGG: BF2875 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 74 1 74 74 83 64.0 2e-15 MASNKYYPEDVLVEKIQSGEYGWLEYVNHHSAEWQEEYESYCLNNGLCICEESAEQFVHH KDEELEKAIENGEA >gi|226332184|gb|ACIC01000136.1| GENE 31 20785 - 22347 1233 520 aa, chain + ## HITS:1 COG:no KEGG:BF2876 NR:ns ## KEGG: BF2876 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 519 12 521 522 683 67.0 0 MDLQPEMRDLLMRNGMQAHVVSNGSGYQLVVQGHDSPLLTYNITEKQMLALTDWGTNHAN KLAYNTFTGIVGNDFYMPKNFVHARNANGRVAMGLHGYRVGIGEYGRMGRIDMPPPFLGW TPRSQEGFHLRRVGGRLFFPEAPIVPERWDGRMKPGELQSGGYGFYYKGQQQSYSPQQQD VLKNLEGIIMPSISRTRSTEPAKPYKELITSPVYFTNEKWQECLASHGIIIDENAKTLTI QSNSVNADMVYDLTDEELKKLTSNSIKEVPVVQRLELLNGIIKDDFADKILMDTLNSKER IDLHLHPELEQELQQRQQQDEERLLPLEEDETVRRDVLQGDAMVRGEDLALINENKGWYR EGAHGREVKVGDIMVEKVPPMEGDKKGEDKYRMTAIINGEAITHEIKQKDYDKFLAVDDY HRMKLFSKVFSEVDMKTRPETNAGLGTKILAALTAGAVVTAEVAHDISHHHHHPAPEFYA EHHGGPRPYFKPGVDSPQEIAARYFEAEANHVATEIRRGY >gi|226332184|gb|ACIC01000136.1| GENE 32 22378 - 23520 732 380 aa, chain + ## HITS:1 COG:no KEGG:BF2877 NR:ns ## KEGG: BF2877 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 4 315 5 322 563 438 65.0 1e-121 
MSSELTYDDFLDRLSIQEVLVDAGYHLNKRDGLRYPSYVRTDSEGRRVRGDKFIVTQNGK CCFQPPQQKVYNVISFIKEHPEMFSENKVGMSPDRLVNLVCNRLLNQPIEDRPSKITEPR KDGKPFNLNDYDIHKFNPQDRETQKRFYPYFKFRGIDLYTQYAFHRHFCLATKHRTDGLT FANLAFPLVLPKTPDKTVGFEERGRPKMDGSGGYKGKAEGSNSSEGLWIANLTGEPLDKA SEIVWFESAYDAMAEYQINPVKMVYVSTGGTPTEGQMRGLLSVTPNARHYLGFDKDDAGR QFVANFRKVAAEMGFRHEHVQAYHPLGCYKDWNDALLNKKSAELIAKGEPDTFDYAEFIA AGKAEKQREKEEKNTYHRSV >gi|226332184|gb|ACIC01000136.1| GENE 33 23517 - 24002 230 161 aa, chain + ## HITS:1 COG:no KEGG:BF2878 NR:ns ## KEGG: BF2878 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 160 1 160 161 153 55.0 2e-36 MRQEDNSIITLCPSIGQFFINELPLLLLCVAMLLIGGLPGCAYSTLLLVFSLLFSLCLLY RFIYLRSIRYHIGSEQLICEHGVFQRSVNYMELYRVVDFAEHQTLIQQLCGLKSVTVLSM DRTTPKLEMTGISNSYDVVSVIRTRVETNKRRKGVYEITNR >gi|226332184|gb|ACIC01000136.1| GENE 34 23983 - 24684 672 233 aa, chain + ## HITS:1 COG:no KEGG:BF2879 NR:ns ## KEGG: BF2879 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 11 229 10 229 231 321 75.0 1e-86 MRLRIDKIWFIVVALVLSVPSAHAQIVTANPLEWMALAEGNEAINGEIEKEIKGQTKTAL LQNTIAAEFTKIHEWEKKYNSYLKTASGYASSLKACTYLYDDGVKIFITLCKMRKAINNN PQGIVATLSMNNLYMETATELVSVFTLLKDAVAKGGKENMLTGAERSKTLWALNDKLSAF SKKLHRLYLSIRYYTMTDVWNGVTAGMIDRSNGEIATQALSRWRRAGRMTISD >gi|226332184|gb|ACIC01000136.1| GENE 35 24702 - 25385 677 227 aa, chain + ## HITS:1 COG:no KEGG:BF2880 NR:ns ## KEGG: BF2880 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 226 1 226 228 322 72.0 9e-87 MKYFISIAIFMFISSFAGVRAQNDPTLAGMILMYTNKAEKELKNQEKVMLLQSTGHIWTK EEVEATTDLQREFNNYLDSFRSIVSYAAQIYGFYHEIGQLVDNMGGLVAQLDAHPANGLA VALSAKRNKIYRELIMNSIEIVNDIRTVCLSGNKMTEKERVEIVFGIRPKLKKMNKQLKR LTRAVKYTTMGDVWMEIDEGARPTKANKAEIAAAAKRRWKQVGKNVN >gi|226332184|gb|ACIC01000136.1| GENE 36 25479 - 26141 627 220 aa, chain + ## HITS:1 COG:no KEGG:BF2881 NR:ns ## KEGG: BF2881 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined 
# 1 220 1 214 214 187 53.0 2e-46 MGFWSSLLKYGGKAARGTGHVIGVTGKTLGSAALHPQRTLMGAGKAMQTAAVGSAAGYVT WEKLTTDKSVARIVGDAVIGENATEAVAQTANDVQRLTGKAGEAVDAVNTVAGKMDTQFS GISDFFRNMFGGVGCDMIGNLFGNIGKGNVSGMSIVGLVAAAFMVFGRFGWMSKIAGALL GAMIIGNNSKVAQTLPNRGGTSSHGQNMKEAETQSRGMKR >gi|226332184|gb|ACIC01000136.1| GENE 37 26157 - 26984 555 275 aa, chain + ## HITS:1 COG:HI0409 KEGG:ns NR:ns ## COG: HI0409 COG0739 # Protein_GI_number: 16272358 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Haemophilus influenzae # 22 132 322 443 475 62 34.0 7e-10 MRYTEEMILRSSSGYCMPFEEPIKQDVQMSLGYGEQKNPVTGETFFHHGIDFSVNHYLLS AVATGTVSAIGSDAEHGIYQTIRYGKYEVTYRHLSNVFANFSQSVKAGQTIAISGDLLHM EVRFDGEEINPLEFLTMLYGNIKALEHNSQMGINELDMNIPSPYDTDRQEIEALMLRFLP SYFEDLQQGAYSLPEHTEQSLRNIFSLGASKHYFYETMPSMANPLGIGNRCMPLATKVQN LLIADFLNYLAMRHQVYLSTMDEVLKKTPRASPDA >gi|226332184|gb|ACIC01000136.1| GENE 38 27010 - 27282 117 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253571436|ref|ZP_04848842.1| ## NR: gi|253571436|ref|ZP_04848842.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 90 1 90 90 164 100.0 1e-39 MDVDIQSFDIPRIVSVYPDRAGVRWWTKAWFNGKEEGEPSVEIEERMAVQFIHCQVDKDA WLEEHYPKQMEIYHNAIEQTKEQILQQYNI >gi|226332184|gb|ACIC01000136.1| GENE 39 27292 - 29091 1215 599 aa, chain + ## HITS:1 COG:no KEGG:BF2884 NR:ns ## KEGG: BF2884 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 599 3 596 596 598 53.0 1e-169 MAVKNSQRFAEIERLMGDYFGCHLAPVMGRVQRDLSNRQADEMRAYSNSTAGILRSLATA NMPGDDPYATLKITGKWNSKTTEDYLAMCKEAIVGNKDMQRDLAVMAEEWRKAVVTEIGR ERYDALSKQLGCDLAYAYIDHRMEQLMIEKMVKDRMPTSSADYIIRKAAQSSLFGLPYEL NKSPLAAEIEAKGEAAYKPSRTEKAAGWAAGAGADVVAMGGIGSWATFAKFVGIDMVMNV GMEHLSKKGSSQEVCVEECISKGVFGTGTNVFASFRKQAKVIKNHENSYIKATNSHFKNK IPTSNITFMDWTKQTEQTSFPWNKGTFLDPQRNEATKRTGKYKDVPLIVAPGQEDEYLEV KARLDAEAEKEKKAEPKETPDSSTEEATPSVVQQEETALQTNENGWDGLLQSFGLNGFSD IGKNLGYVLAMLPDVLVGLFTGKTKSLNMDNSLLPLASIVAGMFVKNPILKMLLMGMGGA NLLNKAGHEVLGRKQQEGIAPTVSPAASRIQYRQYPDEPLNPRISNPMLQGRCLVANIDH VPCTIQLPDSVIGAHQSGALPLNTLANAILAKNDQMRQMAAMQYEEQARESVVRPRGIQ >gi|226332184|gb|ACIC01000136.1| GENE 40 29174 - 31630 2172 818 aa, chain + ## HITS:1 COG:ZmutS KEGG:ns NR:ns ## COG: ZmutS COG0249 # Protein_GI_number: 15803252 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Escherichia coli O157:H7 EDL933 # 686 809 9 130 853 114 44.0 6e-25 MKEKSEFEKRTAEKQVSLLTEALTSAVDAKGHWLNASGKLYPKLYPKGFSVSPFNALVLA LDSDAKGCKSNLFTQFSEAKARGESVREHEKGVPFLYYNWNKYVNRNNPDDVITKEAYAE LSEQDKQQYKGVKNREIRVLFNIDQTLLPMANETAYTTALKKDGTVEDRGYGDKEDKQLH GCVNGFLQKMKDNLVNVRQDGTGVAHYDTEKDVIYLPRQRDFEHYNDYVQEMMRQLVSAT GHQQRLAREGMVMKNGKAPSEDAVKKERLITEIASGVKMLELGLPARLSKDSLSMVDYWT RELKEDPCLIDAIESEVNGALKVLKKAELGEKVEYATDQHQRETAQIQVQLPNHYFVADE IKQHPNEEQRTVVIVRDDASKTADVVLPQGASLEVNNEIKGMNKQRFTNALQKEGYDNVR FYNPDGALGFRPDDSYFADKKITVSRLRNWSIEDLSSLDASEAVTHSRDIGFDNVQLVKD DKERWALYIKPEGKEGFAVYPDKGDLNHFFTTLKQSMDNIERVREELAQKYYAMAEAKPD LKVDIFGGNEQEVDLNRIQRVAVFRAKSGECLCAATIDGKKLQPRSVSPSQWQRLWVAPD 
RDSYKQNLAASLFADVLQKDNNVEQSTQEKQEEATEVKQAETAIVDKHDEQKEVVLKEDK SSEQREEKKQEEKKEEKKDEKAVKAAVSPLVQQYLDLKKKHPDAILLFRCGDFYETYKDD AVKASKILGITLTKSNGRKDDEGKPLAMAGFPYHALDTYLPKLIRAGERVAICDQLEMPK QTTSSKRGITEMVSPGKETGKQMAQESQETEQHTSLRR >gi|226332184|gb|ACIC01000136.1| GENE 41 31635 - 33191 1097 518 aa, chain + ## HITS:1 COG:lin1178 KEGG:ns NR:ns ## COG: lin1178 COG1705 # Protein_GI_number: 16800247 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Muramidase (flagellum-specific) # Organism: Listeria innocua # 3 148 54 205 289 70 36.0 6e-12 MGKNAAYAAQYAEEAKEQMRMYGIPASVILAQAILESSNGQSQLSRECNNHFGIKATASW LKNGGEYGVYTDDRPNEKFCKYKSVGDSYEHHSQFLKQNKRYAQCFTLSPDDYKGWTKGI ERAGYATGGGYAASLQRIIEANGLDKYDSEVMAEMRAEGRSFGVENNPRREMPAAHVPQS AGYSFPVEREEFLFITSPFGNRQDPMDATKQQMHKGIDIKTNHEAVLATENGGKVVAVNH NANTAGGKSVTVEYSREDGSKVQCSYLHLSDIAVKVGDVVNASQKLGVSGNTGTRTTGEH LHFGVKSVSADGSKRDMDPAAYLAEIGQKGNIKLQALHNGNDLLAKYKSNAPQEQKVSSE DWMKKILSSEDSGVGISGTNDPILDMAMTTFMSLMMLAAQIDSNDEERQKGLISAMAANR LVDLTSLVPGMRQCTITVGENGKTVLHAESGTVKIDRELTSSEMNRLSLILGNANLSEES KRTKIAGMVTGIVATQQASQNFEQGMEAESQQQNLQRK >gi|226332184|gb|ACIC01000136.1| GENE 42 33188 - 33370 188 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212694274|ref|ZP_03302402.1| ## NR: gi|212694274|ref|ZP_03302402.1| hypothetical protein BACDOR_03800 [Bacteroides dorei DSM 17855] # 1 60 1 60 60 92 100.0 8e-18 MKRIASHLCTLLVGMVAVAGHFMLTQWLFGQTVAIIYLGFQLIIAAWIVYELIRALVYDD >gi|226332184|gb|ACIC01000136.1| GENE 43 33443 - 34048 345 201 aa, chain + ## HITS:1 COG:no KEGG:BF2888 NR:ns ## KEGG: BF2888 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 200 1 200 203 201 51.0 1e-50 MIKCNVTVCGTVSKAATCRTNKEGKAFVSFAMNVVIPAKSGINKTIEVSVIKDGTLTEVG SCNIGERIEVAGVLVPRKWGDVLYFNLSASSISHQPDEAEDCIKGVMEFRGKVGKSIEDK TDKNGVPYCQFSAFSAEKVQDGFEYIWVSFFLFDGKCEAWLQPGVKANIKGALSVSVFND KLDFSCRVSEMSEYVPQPYNG >gi|226332184|gb|ACIC01000136.1| GENE 44 34055 - 35977 1480 640 aa, chain + ## HITS:1 
COG:XF2061_1 KEGG:ns NR:ns ## COG: XF2061_1 COG4227 # Protein_GI_number: 15838653 # Func_class: L Replication, recombination and repair # Function: Antirestriction protein # Organism: Xylella fastidiosa 9a5c # 21 353 224 520 522 83 26.0 1e-15 MGKPYAKEGPSAEDKALDLFADMMIERIQSLSGKDGWKKPWFTEGALQWPKNLNGREYNG MNAMMLLLHCEKEGYKIPRFCTFDRIQQFNKTGKKDEEQKPRVSVLKGEHSFPVMLTTFT VVNKETKEHIKWEDYKLLSQEEREKYNVYPKLQTYHVFNVAQTNLKEVRPEFWEKLEQEY SMPKVEKDEQFAFEPVDRMIADNRWICPIKPMFGDSAYFSISKNEIVMPEKRQFKDGESF YSNLFHEMGHSTGAEGQLDRIKPATFGSAEYAREELVAELTAALTAQRYGMTKHLKGDSA AYLKSWLDSLKESPQFIKTTLLDVKKATSMLTQHIDKIAMEIDQEKKAEQENGQGKSYLS IDDGDHAVLAYNGSAVYIQHHEKEDSVKIAVPTSNGLEVKLSVPYDHGKDLDTNYQEAFA QYKSLTEPSQSKENVYYASIAYLQSTDDTSELDKLKEKGDYQGLLTLAKEYYDGNGMDEE QTYRKPCQNRGDDLLIEDKDFAVVYNGSVGGTYEVFLKHTEQEVRDHITRYGIGRASEDV KAVAREMTAEEFSELAQRKMPIFQMPNGGLLNLQYNKDKDSLDVGTVTNAGLSVKHTFPF SHNHSMDANISSAYEQLLDMEEYQKEEVQEEHVAKSAFRR >gi|226332184|gb|ACIC01000136.1| GENE 45 36084 - 36449 319 121 aa, chain - ## HITS:1 COG:no KEGG:BVU_3418 NR:ns ## KEGG: BVU_3418 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 4 120 3 118 120 135 57.0 4e-31 MASRLITPVLEEILKSYPLYSQDGKGKDAVCVAIFFIGHVRWFVLEGQPEGNDTTLFTIV CGLHETEYGYTSVNEMESVKVDGSKYGVDEIFQVEQLDGFKPVKLKSIPDEDLQAFLHNM E >gi|226332184|gb|ACIC01000136.1| GENE 46 36452 - 36745 239 97 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|212694278|ref|ZP_03302406.1| ## NR: gi|212694278|ref|ZP_03302406.1| hypothetical protein BACDOR_03804 [Bacteroides dorei DSM 17855] # 1 97 1 97 97 181 100.0 2e-44 MELHRLSENEQAFVECFSRFVNGQMGSAAKVGNALADDHRYLINEKGKVVFAFLERLAND YQKGRYDQRDEWVCRLAAEAIEHLVENRMYYRTLNND >gi|226332184|gb|ACIC01000136.1| GENE 47 37140 - 38402 758 420 aa, chain + ## HITS:1 COG:FN1715 KEGG:ns NR:ns ## COG: FN1715 COG1373 # Protein_GI_number: 19705036 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 1 411 1 423 430 315 42.0 1e-85 
MIVNRYRYIEQLSRSKNNGLIKIITGLRRSGKSFLLKKLFHQHLLDEGVREDHILVIDME SRKNREFKNPDYLLDWVEKMMIDYETYYIIIDEVQEVEDFVEVLSSLSVTEGADVYVTGS NSRFLSSDLVTEFRGRGDEIHVWPLSFKEFMTVYDGSKEDGWAEYRLYGGLPQLLTQVGD EKKADFLRRLYRTVYLRDIYERNNIELRPEFEELSKTVASSIGAPVNALNIANTFKSVSN VQSITDKTVSAYLEYMQDAFLIEKSERFDIKGRKYIGSLSKYYYQDVGLRNAILSFRQSE PTHIMENVIYNEMRMRGWLVDVGNVYHRVRNTEGKQQRVTLEVDFVCNKGSERIYIQSAW RMPDAEKMEQEKRSLRLVDDSFRKLLIVGEHTKQWSDENGIQIMSIYDFLLDWSSTEKHG >gi|226332184|gb|ACIC01000136.1| GENE 48 38543 - 39949 731 468 aa, chain - ## HITS:1 COG:no KEGG:BF2899 NR:ns ## KEGG: BF2899 # Name: not_defined # Def: putative outer membrane protein # Organism: B.fragilis # Pathway: not_defined # 1 468 1 476 480 452 46.0 1e-125 MRKILILSLGFALSLGAQAKDYHRTADTTTVVCRLSADGMLRPLKPSYMKGALVASPWTD NWFVQAAGGTTVFLGKPLGCNDLFGRMKPAFSVSIGKWFTPSVGGRLNYGGMQFNDCNNS SQDYQYLRADFMWNVLGNLYKDDVHTLARWSVIPYVGVGMLHNKVNAHKPFAISYGIQGQ YHLSPRIAVTAEIGNMTTMQDFDGYGKAHRLGDHLLSASLGLSVRIGKTGWKRVIDARPY IAQNEWLSAYASSLSDSNSRYHAQHDRDCQTLEQLRKILAIEGLLDKYGHLFSDDAASSV TNGYPRNDYSGLNSLRARMRDKYGNGQKTKVMESASCTEEYGNAEGEPEDDYLSLIHSGK ACIGAPIYFFFELGTDRLLNRSQLVNLDEVARIAKKYGLKVKVIGAADSATGSEAINDNL SRSRARFIAEELMKRGMENGTISQIFDGGIDDYSPNEANRHTRILLYL >gi|226332184|gb|ACIC01000136.1| GENE 49 40004 - 41455 953 483 aa, chain - ## HITS:1 COG:no KEGG:BF2900 NR:ns ## KEGG: BF2900 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 483 1 485 485 550 62.0 1e-155 MATPKQVMDFRPSKGITTAQSNEHQRRWTEKGWGSAESTGNYDRSRERLNFEVRGGKVCP IDKSRSIPERMADILRSRGIKDPNEGLAEPRFRTVVNFIFGGSRERMTELAFGDQKVNLT HGADNSHLTRCKDIEEWAKDVYRFVADKYGEENIVAFILHLDETNPHVHCTLLPIKDGKF AYKQIFAGKDKYEFSARMKALHSEFADVNKRWGMERGTSVSETGARHRTTEEYRRQLSEQ CTTIEQSVATHQRTLASLQSDIRLAERRVKGLTSMVNNLLQEKAEKEAALAELHRQILAG HDNPDALRQQVQELEQELSAIQSKLDDKQEKLSEADRKLDALREEMTAMKERTEELRQEA HRYSDTIRQGADTVLRNALLENMVSEFSERVARLSNEGKDVFDGSLLQAFAENGTEVMYC ATLLFLGYVDDATTFAEGHGGGGSSSDLKWGRDDDEDERAWARRCMMKAARMMRPSSGKR KKR >gi|226332184|gb|ACIC01000136.1| GENE 50 41664 - 42548 545 294 aa, chain - ## HITS:1 
COG:no KEGG:BF2901 NR:ns ## KEGG: BF2901 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 3 283 4 280 288 111 28.0 3e-23 MKKQKHRTSSGKMSERMSLLEFLKERSGIRLSKLEAYLDLVDKASVQYIPKDLCKQEFSL SNGQFVITITELAGCWHWHRATVRTFIEQLEKMNQISVTRLTKSQIITIPMLAETPAVSP IDAALDVFRQKMCTALDEWRSGKMSASACASECEQLYEDATEEVAIILQKADNGNSIGKV PCGRNIPESVGHAFCMTALTAVCEATFHQVLSQETDNALVTSLLPFFYKDLGGDWLSFIE AAKAISELALDGSSPALHQESAAIKSQFQSLCRPFLAIVANQEGSCPSKGCINL >gi|226332184|gb|ACIC01000136.1| GENE 51 42628 - 43122 258 164 aa, chain - ## HITS:1 COG:no KEGG:BF2902 NR:ns ## KEGG: BF2902 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 145 1 145 185 72 40.0 4e-12 MTSKILFLILMSAFFFVTMILHRSRRQFMTRFYLRMTALVTARKLYRLMLLILLYVFHFL YLCVHYNDIGVVASTIAFAIFFVFMDVERWLQRLHEERTPFCIAALAAVVFVFTPHLFTL AVTISFVLLASLFYPSRIVISLWKNKADRKMLLEDTEMLIIYYY >gi|226332184|gb|ACIC01000136.1| GENE 52 43112 - 43525 351 137 aa, chain - ## HITS:1 COG:no KEGG:BF2903 NR:ns ## KEGG: BF2903 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 17 128 1 114 123 69 33.0 3e-11 MPQAPDHLMKETLYMKIIHLLDRHRTWLEIESIATVRNHTIVRNGRMTDILSRVLVVKAI HHHFPYTRGQVWQIAEYDLEQAIKSLRTTDGAFRQRIIKGELTLEDVERIISTATHGVVQ PDLSPLPLFTCYTYYDK >gi|226332184|gb|ACIC01000136.1| GENE 53 43604 - 44050 470 148 aa, chain - ## HITS:1 COG:no KEGG:BF2904 NR:ns ## KEGG: BF2904 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 148 3 150 150 168 58.0 5e-41 MLEQFCDDFLAVVPLQLPELLDKRKMEKPVKYDDYVLLTFQLNTPFTIEEVMDMLEDEME MIILYHHIPSRHTEFGHSCCAYSNPSFGRMFKVNGSTDERGMVSQIKVTIYDSLEHMSAD VCLDLSLHCKNGFFKYMKPKEEVLLDFI >gi|226332184|gb|ACIC01000136.1| GENE 54 44075 - 45187 700 370 aa, chain - ## HITS:1 COG:no KEGG:BF2905 NR:ns ## KEGG: BF2905 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 37 370 18 349 349 234 35.0 3e-60 MKYETQLFGGQTNLHCMKQILHNAKSNSHGCIRLAICIATAFAVVILTACSDGNGTKSFH 
SSDEAIREYHGFLTTLRQNGKVSIQTLVKIVNEWRVLDDSVTSCISRDTVRKAHSYPFTV YHELNDSIHIELCRMAMSKQRTFHDLLYLREQTSSHVGDEELQQAVKEAQPFFASLDSLP IYNKGGKQAVLKRYLLFLQKSAKQGIHGKEDLLAFIKEEHLYFKSFLQYLPDFADDDIGD IRRNTEHCCREILRAADRKDLSHKDAMIYLSMRTNHRLLRNAQAAIEDLNSGRVKDEHTM HAYLLMMMQPFMTMDDLSVSVLSDKDKADLYKIADALPKEMDGLAKKLHLDKQRLSDMPM LMMKIYVTRL >gi|226332184|gb|ACIC01000136.1| GENE 55 45122 - 45319 72 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQYLLHAMEVGLSSEKLGFIFHCCHYFESDFEIDGAKLMRFSDIFARMTVSLILFVYHTC EYDFQ >gi|226332184|gb|ACIC01000136.1| GENE 56 45442 - 46065 575 207 aa, chain + ## HITS:1 COG:no KEGG:BF2906 NR:ns ## KEGG: BF2906 # Name: not_defined # Def: serine type site-specific recombinase # Organism: B.fragilis # Pathway: not_defined # 1 205 1 203 214 247 58.0 2e-64 MAKVGYIFIATNGEEYAEDKAWMQQYGCVQVIEELSEHERLRPMWKQLISSLERGDELVV SRFSNALRGTRELATFIEYCRVKVVRIISIQDRIDTFDELFPDTKPSQVIRMFGSLSEEC AVLRKASAHIIHLQQNIRPPKKSERALSKLEREKNIVNMYNEGHSIDDIFAISGFTSRSS VFRILNKHGVTLNRGPHSGPLKKRNKE >gi|226332184|gb|ACIC01000136.1| GENE 57 46082 - 46765 467 227 aa, chain + ## HITS:1 COG:no KEGG:Cag_0377 NR:ns ## KEGG: Cag_0377 # Name: not_defined # Def: hypothetical protein # Organism: C.chlorochromatii # Pathway: not_defined # 1 227 2 223 223 223 52.0 5e-57 MNKTFNIYCDESTHMVHDGHPYMLLGCTSIAYTQIRMAKDAIKDIKKKHGYSDELKWTNV HEATYKVYAELIDWFFMNDMEFRAVVVDKSQIDEKREDYTFNDFYFRMYYQLLHHKMDMD YTYNIYMDIKDTCSSDKLERLRKIMEYNSSIGRFQFIRSHESVFIQLADVLMGAINYNLR YEKGEVEGRVRAKMKLIEKIKKHSNISLNTTTPKFRKKFNLFFIALK >gi|226332184|gb|ACIC01000136.1| GENE 58 46769 - 47305 271 178 aa, chain + ## HITS:1 COG:no KEGG:Cag_0376 NR:ns ## KEGG: Cag_0376 # Name: not_defined # Def: hypothetical protein # Organism: C.chlorochromatii # Pathway: not_defined # 4 178 3 178 178 143 48.0 3e-33 MSSLNIIKKYPELLELAYLSEREREHDLHAIFKRDIEDNCQFSFRGWRIYPIKTDGEIDM ARLFKHLTCEEIMVENEDGTTYPKRVFEMARSQRLHWINHHVRELTPDNLDVFTIEERDG KKRKVKKTYIYDKVEKYVIVLEQQRSNGFYLLTAYHLNKEYGLKALEKKMKKRLQTPL >gi|226332184|gb|ACIC01000136.1| GENE 59 47500 - 47799 229 99 aa, chain + ## HITS:1 
COG:RSc1188 KEGG:ns NR:ns ## COG: RSc1188 COG0526 # Protein_GI_number: 17545907 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Ralstonia solanacearum # 6 98 16 108 108 95 45.0 3e-20 METFKDVISSDQLVLVDFFATWCQPCKMMHPILEQVKEVLGDRIRIIKVDVDKYGVTASQ YGIQSVPTLMLFRRGEVLWRTSGVMQKSELLATIDPFLR >gi|226332184|gb|ACIC01000136.1| GENE 60 47840 - 48385 514 181 aa, chain + ## HITS:1 COG:no KEGG:Mmc1_1123 NR:ns ## KEGG: Mmc1_1123 # Name: not_defined # Def: hypothetical protein # Organism: Magnetococcus_MC1 # Pathway: not_defined # 4 161 21 180 192 92 37.0 7e-18 MSMDKSEIIKMVLELYSARKEAKEYLDFYAEPNEGQKLEEYKHIIREEFYPSRNREPKTR FSVCRKALSDFKKLKPSEDSVAELMVFYMENACQFTYDYGDMWEQFYDSVESNFDKTLRH IVLYDLWDKYDSRIKQCLRWASPCGWGFPDALNDMYEEMKAQNEELRKKYRNFKMPINAD Y >gi|226332184|gb|ACIC01000136.1| GENE 61 48400 - 48645 298 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212694292|ref|ZP_03302420.1| ## NR: gi|212694292|ref|ZP_03302420.1| hypothetical protein BACDOR_03818 [Bacteroides dorei DSM 17855] # 1 81 1 81 81 125 100.0 9e-28 MGKPKKKHKKAKQLPLSKEEQIFQAARDLAAEIGISYSEALGFTLGIKDVTYGWEEDYTE EEFQALIDHAVGDTDYEHTLY >gi|226332184|gb|ACIC01000136.1| GENE 62 48626 - 48829 109 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|224026325|ref|ZP_03644691.1| ## NR: gi|224026325|ref|ZP_03644691.1| hypothetical protein BACCOPRO_03081 [Bacteroides coprophilus DSM 18228] # 1 67 1 67 67 124 100.0 2e-27 MSTPSIESSTREERLDYVLNEWRCLHNCELCGKCHILKGRSEEILYADYIDGKRSYMDIT LEIRSNR >gi|226332184|gb|ACIC01000136.1| GENE 63 48832 - 49266 230 144 aa, chain + ## HITS:1 COG:no KEGG:Mmol_0134 NR:ns ## KEGG: Mmol_0134 # Name: not_defined # Def: hypothetical protein # Organism: M.mobilis # Pathway: not_defined # 28 138 3 113 116 158 63.0 6e-38 MSQKYIYPSLFQEEEPQESVPGDKKEYDLTNLFERLAKSDFRSRFHLSKQDREYVMEKGL PTIRKHAEDFVAKRLAPAVIPNDGKQTPMRGHPVFLAQHATGCCCRGCFFKWHHISAGRA LTKEEQEYAVAVLMAWIEKQMNKG 
>gi|226332184|gb|ACIC01000136.1| GENE 64 49263 - 49469 89 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212694295|ref|ZP_03302423.1| ## NR: gi|212694295|ref|ZP_03302423.1| hypothetical protein BACDOR_03821 [Bacteroides dorei DSM 17855] # 1 68 1 68 68 128 100.0 1e-28 MKIEEAIVYVMVKRNGGMTTDQIADAINRHRLHLRKDGQPVTSKQVYATICRFPEMFTKE AGRIMLMI >gi|226332184|gb|ACIC01000136.1| GENE 65 49479 - 49715 121 78 aa, chain + ## HITS:1 COG:no KEGG:BVU_3408 NR:ns ## KEGG: BVU_3408 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 75 2 76 77 129 82.0 3e-29 MTIDTTNMCSHLQKKLFEPEGVYYPIWQAMQDDETLTAVVRSRQLHIYRNGKKILVLAGK AQPKIIREDKIQELITRL >gi|226332184|gb|ACIC01000136.1| GENE 66 49740 - 49919 238 59 aa, chain + ## HITS:1 COG:no KEGG:Shel_03840 NR:ns ## KEGG: Shel_03840 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 14 59 60 105 105 74 80.0 1e-12 MDAKEKVLATMKEAGQPLNAGKIAELSGLDRKEVDAAMKQLKAEGAIVSPVRCKWAPAE >gi|226332184|gb|ACIC01000136.1| GENE 67 50085 - 50198 118 37 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|224026330|ref|ZP_03644696.1| ## NR: gi|224026330|ref|ZP_03644696.1| hypothetical protein BACCOPRO_03086 [Bacteroides coprophilus DSM 18228] # 1 37 3 39 39 62 100.0 8e-09 MDDIVAIIVCYFVFGALGNVLRALFGGERRNDFRRDR >gi|226332184|gb|ACIC01000136.1| GENE 68 50220 - 50711 439 163 aa, chain - ## HITS:1 COG:BH1768 KEGG:ns NR:ns ## COG: BH1768 COG4474 # Protein_GI_number: 15614331 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 15 152 21 155 189 62 25.0 5e-10 MTKYDKAVSVCFSGHRSVPFAKRRELKQCLKSEIAKAYADGYRYFYCGMAMGFDLLAAEA ALSLQCELKDLQVIAVVPFRGQSDRWSKEEQAKYDAILRIVDDVVVLSEQYYNGCLLRRN DYMVNRSSRLIAYFDGNPKGGTFYTVREAKRQGLDIVNLHNSV >gi|226332184|gb|ACIC01000136.1| GENE 69 50715 - 51281 437 188 aa, chain - ## HITS:1 COG:no KEGG:BF2868 NR:ns ## KEGG: BF2868 # Name: not_defined # Def: putative ribose phosphate pyrophosphokinase # 
Organism: B.fragilis # Pathway: not_defined # 1 188 1 188 188 293 75.0 1e-78 MAHKNDYQIQKQLDKPQTWFCKYYPARIRNVGEKEIADRQLVFDFKDGRAYEEVAQRTAE NMLTLYGKGCVNIVFAPVPASTSESNELRYKAFCHRVCELTGAENGYEHVRVCGERKTIH DNRKNEDEVRKANVVEFDEPFFDNKMVLVFDDVITKGLSYAKYANQLERLGACVIGGMFL ARTHYKVR >gi|226332184|gb|ACIC01000136.1| GENE 70 51833 - 52111 226 92 aa, chain - ## HITS:1 COG:no KEGG:BF2825 NR:ns ## KEGG: BF2825 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 90 1 82 84 84 57.0 1e-15 MSKSSMLKEGMKGGLNGLLSSTKKPSKEQIVTETTPAPPVSVSETTEVPVHCNFLINKSI HTRMKYLAIKKGMSLRDIVNEAMTEYLKRHDT >gi|226332184|gb|ACIC01000136.1| GENE 71 52119 - 52874 692 251 aa, chain - ## HITS:1 COG:BS_soj KEGG:ns NR:ns ## COG: BS_soj COG1192 # Protein_GI_number: 16081149 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Bacillus subtilis # 1 247 1 249 253 200 45.0 2e-51 MTQIIAVLNHKGGVGKSTTAVSLAAALQLSKKNVLAIDMDGQANLTEALGLSIEEEQTVY GAMCGQYTLPLVKLHNGITVSPSCLDLSAAELELISEPGRELILKGLITKAIAKEHFDYI IIDCPPSLGLLTLNALTAADYIIIPVQAQYLAMRGMAKLMDIIRIVQERLNSNLKVGGIV ITQFDRRKTLNRSVREIVNDSFHEKVFKTVIRDNVALAEAPINGKTIFEYNPKSNGASDY MSLAKEVLNLK >gi|226332184|gb|ACIC01000136.1| GENE 72 52994 - 53224 151 76 aa, chain - ## HITS:1 COG:no KEGG:BF2827 NR:ns ## KEGG: BF2827 # Name: not_defined # Def: putative single stranded DNA binding protein # Organism: B.fragilis # Pathway: DNA replication [PATH:bfr03030]; Mismatch repair [PATH:bfr03430]; Homologous recombination [PATH:bfr03440] # 1 76 65 137 137 67 43.0 1e-10 MVLVEGTLTSSIWTDRNGENHIQHSIIADSIAFVNAGKKDATDTPATKARKAAKAEGEAP VPPADAPQDKSEDLPF >gi|226332184|gb|ACIC01000136.1| GENE 73 53244 - 53420 202 58 aa, chain - ## HITS:1 COG:no KEGG:BF2827 NR:ns ## KEGG: BF2827 # Name: not_defined # Def: putative single stranded DNA binding protein # Organism: B.fragilis # Pathway: DNA replication [PATH:bfr03030]; Mismatch repair [PATH:bfr03430]; Homologous recombination 
[PATH:bfr03440] # 1 44 1 43 137 73 75.0 2e-12 MIYTHTIGRIGAKDCRVITGTHGTFMAVDIAVDDYSKGKQITTWGKGEEQQGEPYPFG >gi|226332184|gb|ACIC01000136.1| GENE 74 53474 - 53830 112 118 aa, chain - ## HITS:1 COG:no KEGG:BT_4618 NR:ns ## KEGG: BT_4618 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 115 1 114 117 89 37.0 4e-17 MEKQILSFNDLPSALSLVINKLEILEEKFSTLMTQIQPDKGLEWLSVSELSEYLPTHPVE HTIYCWTSNKQIPFHKKGKRIMFLKSEIDEWMQDNKSRSRAEIMNEAIAYVQSTRKVI >gi|226332184|gb|ACIC01000136.1| GENE 75 53878 - 54129 214 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|212694304|ref|ZP_03302432.1| ## NR: gi|212694304|ref|ZP_03302432.1| hypothetical protein BACDOR_03830 [Bacteroides dorei DSM 17855] # 1 83 1 83 83 146 100.0 4e-34 MDEVGACMTQCIVYDETTKESETFCSLTAAKKWMRERIKQKHEVSGQKYRVYSDGEFVNC GSISLTGSNKSFVANTRQTKKGY >gi|226332184|gb|ACIC01000136.1| GENE 76 54117 - 54671 547 184 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|212694305|ref|ZP_03302433.1| ## NR: gi|212694305|ref|ZP_03302433.1| hypothetical protein BACDOR_03831 [Bacteroides dorei DSM 17855] # 1 184 15 198 198 356 100.0 3e-97 MKIIILHDADARIEYLDVADHLLGSDIEEFLTRQGFSVNNITWLVTSADHIPVVYHKYDI DCKTGEATHTKREAELQDLTIHGQLQALQHREQDELKAALRKYGTEVDGGFEVHFEGEQP IVAGYLFDEPRDIVIDAARLDADGNLSLLGEDKEVRDGQYDIEPSDIFGGQLDYVTSSIG AWMK >gi|226332184|gb|ACIC01000136.1| GENE 77 55942 - 56418 107 158 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAVLSRSDLGILSFLANSDFVAISREYLYANHLGDVSQMVRDEVVRIKEDGVAVMVNKCT WLLLEKIITRYNSRLMSLSKTKATRRSPCYGEERLFLAFLGCWLGRLFRRSLLLVGTLVE LGCNHDDSGVLHAILVCPLLWLEETLDGEQRALGELVE >gi|226332184|gb|ACIC01000136.1| GENE 78 56224 - 56610 473 128 aa, chain - ## HITS:1 COG:no KEGG:BF2915 NR:ns ## KEGG: BF2915 # Name: not_defined # Def: putative single strand binding protein # Organism: B.fragilis # Pathway: not_defined # 1 114 1 114 126 167 75.0 1e-40 MKKIENNFTVSGFVGKDAEIRQFANASVARFSLAVGRQEKNGEETNRVSAFMNMEAWRKN EHTESFDKLTKGTLLTVEGFFKPEEWTDKDGVKHSRIIMVATKFYECPDKEETPAEEPAK PATKKGKK 
>gi|226332184|gb|ACIC01000136.1| GENE 79 56647 - 56832 93 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|254884355|ref|ZP_05257065.1| ## NR: gi|254884355|ref|ZP_05257065.1| predicted protein [Bacteroides sp. 4_3_47FAA] # 1 61 7 67 67 107 100.0 3e-22 MHSKAGMWHHVGSTMKIVRNTLFVVGGKGEKGRLRTILLVTAIAATQEGGSPSQHHATFA W >gi|226332184|gb|ACIC01000136.1| GENE 80 57440 - 57622 217 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|212694313|ref|ZP_03302441.1| ## NR: gi|212694313|ref|ZP_03302441.1| hypothetical protein BACDOR_03839 [Bacteroides dorei DSM 17855] # 1 60 1 60 60 103 100.0 2e-21 MATIDMEQLDGVVGCFCTLLYPRRGQTEGTIVEDLGCEVIVQLINGKEVTEYKDDVVICE >gi|226332184|gb|ACIC01000136.1| GENE 81 57638 - 58222 529 194 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|212694314|ref|ZP_03302442.1| ## NR: gi|212694314|ref|ZP_03302442.1| hypothetical protein BACDOR_03840 [Bacteroides dorei DSM 17855] # 1 194 1 194 194 367 100.0 1e-100 MANICDTQYKVTGSRKAVADLWNTLQELEVNSNNVYLYLLAEHYGIDYEKKGISVRGHIY WAEYEENVEDDYALLSFDTESAWSSCDLFFEEVNKALGDELSISWREVEPGCDIFYTHDE NDFFPEECYVTAYGELFEDCEGAYSTFGDAIKLWCEKTGVSQDGRSEQKMIDFINEYEYE AEDTNFCINPITFG >gi|226332184|gb|ACIC01000136.1| GENE 82 58288 - 58500 90 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|212694315|ref|ZP_03302443.1| ## NR: gi|212694315|ref|ZP_03302443.1| hypothetical protein BACDOR_03841 [Bacteroides dorei DSM 17855] # 1 70 1 70 70 136 100.0 3e-31 MLPQKVLWLINARFCRQILPLPQSERGDFTDKPQAPTLQDERHDRYLCNTILIKARWGGT WQTAWYSSVK >gi|226332184|gb|ACIC01000136.1| GENE 83 59228 - 59773 279 181 aa, chain + ## HITS:1 COG:no KEGG:BT_4616 NR:ns ## KEGG: BT_4616 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 181 1 181 181 334 98.0 8e-91 MPRGNYTIQRSCEECGKIFTPPTLVSKYCCPACSKRAYKKRQVAKEKEAIRQALIRRIPS NKGYLTVKEAMLIYGISKDVLYRMIRQGLIPSYNFGQRLIRLSRQYMDEHFKTKAGSRKR KKEALSFEPKDCYTIGEIAKKFHINDSSVFKHIRRHSIPTRQIGNYVYVPKSEIDKLYKS L >gi|226332184|gb|ACIC01000136.1| GENE 84 59770 - 61032 507 420 aa, chain + ## HITS:1 COG:CPn0024 
KEGG:ns NR:ns ## COG: CPn0024 COG0582 # Protein_GI_number: 15617948 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Chlamydophila pneumoniae CWL029 # 144 414 45 309 312 67 25.0 4e-11 MKKALPNTKVTVKLRRSNYKEEWYLIIESYPVYKRGSTRASRVVESINRTISTPIWDKSS IARILPDGTFNYKPKRDLNGIIQCRSTIDQEACIYADNVRKLRQHEYDSAILYTDKENEI AAQNERSEQDFIKYFNRIISTRHPNSSDSIIVNWRRVGELLKMYSQGQPIPFKAISVKLL EDIKMFLLRAPMGGNKKGTISQNTASTYFSILKAGLKQAFIDEYLTVDISAKVKGITNIE KPRVALTMNEVQMLVDTPCKDDVLKRAFLFSILTGLRHSDIQTLKWKQIQQTSKGTWQAV VIQQKTKRPDYKPVTQQALQLCGIRPDDDEALVFEGLTDASWISRPLKVWIEASGIKKHI TFHCGRHSYASLLLENGVDIYTIKSLMGHTNVKTTQIYTHIVNEQKEKAANTLHIENLDL >gi|226332184|gb|ACIC01000136.1| GENE 85 61347 - 61700 301 117 aa, chain + ## HITS:1 COG:no KEGG:BT_4618 NR:ns ## KEGG: BT_4618 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 117 1 117 117 210 99.0 1e-53 MAYKVNSLEEMPNALSYLIESVEALQSKVNALQHKQASNSPKWMDIDELCAYLPSHPAKQ TVYGWVSTKQIPVHKINKALAFLQSEIDDWLKNKSHKTQDDLMEEARRFVESKKIIR >gi|226332184|gb|ACIC01000136.1| GENE 86 61700 - 62296 340 198 aa, chain + ## HITS:1 COG:no KEGG:BT_4619 NR:ns ## KEGG: BT_4619 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 198 2 199 199 405 94.0 1e-112 METTDYCFSFFRKPIQNIKPIRAVGIVDVYRYIIGHYAQPQTENLRLMQSSPEAKRYKAT HFDYCTFSGLFRKRNEKELIMHSGLMCLDFDHVENIGELKQQLLNHEYFDTELLFVSPSG NGLKWIIPVDLKGWEHSRYFKAVANCIKATGLPLVDMSGSDVARSCFLSYDPQAFINHKY KDDVEENIFRPRLGECPF >gi|226332184|gb|ACIC01000136.1| GENE 87 62247 - 64052 868 601 aa, chain + ## HITS:1 COG:no KEGG:BT_4620 NR:ns ## KEGG: BT_4620 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 221 601 1 380 380 748 93.0 0 MSKKIFSAQDWENVPSEIQQAHTPSIVPIYNKVEDDVESVVREIERRAIDIAPHYKDWVE LGFALVDGLGENGREYYHRISRFYPTYQREETDKQYTHCLQSKGQGITIRSFFHLANQAG ILLASFNKERLSILPNIQNGKTGKWIKSEEELPAFPECVFEHLPPFLNEVVNNSISVDDR 
DTILIGAIVCLSVCFHNVCGVYDERIVYPNLYLFVVADAGMGKGALTLCRELVAPINCHL HELSKRLEQEYKEAMNTYIKGKKIDGMTMPVEPPMRMLVIPANSSASSFLKILGDNDGIG LLFESEGDTLSQTLKSDYGNYSDVLRKAFHHELVSLSRRKDREYCEVTNPRVSVALAGTP EQVRRLIPDAENGLMSRFCFYIIRFKRGIRNVFATSDISQSKNAMFKLLGDKFCHLYEEF VRQGNYSFSLPFDLQEHFIEYLSRVNEECCDEVDNRMQGVVRRMGLIAYRIMMVLTAVRH LENVFHESSSSDETVQLICHEFDYSIAMSICDTLLYHAVFVYQNMSGNQSKRFNTATLET GVYARRNALYNMLPDTFTKKDYDAAVLTLGENGSTANKWIEAFIKDGKLSRIEQGKYRKI F >gi|226332184|gb|ACIC01000136.1| GENE 88 64154 - 64522 177 122 aa, chain + ## HITS:1 COG:no KEGG:BT_4621 NR:ns ## KEGG: BT_4621 # Name: not_defined # Def: mobilization protein BmgB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 122 1 122 122 219 95.0 3e-56 MKEKKLGGRPKLASYQKRTKCFRVMFTENDYIYIQSKAEQAGMPVNEFCHQAAMDCQVCQ RISPEMVSAIRDLSGIANNVNQIAHQMHTSGLEAVKQQCFSIISEVSRIITQVKNNCHDS ED >gi|226332184|gb|ACIC01000136.1| GENE 89 64506 - 65405 457 299 aa, chain + ## HITS:1 COG:no KEGG:BT_4622 NR:ns ## KEGG: BT_4622 # Name: not_defined # Def: mobilization protein BmgA # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 299 1 299 299 524 93.0 1e-147 MIAKIKTRADFGGIVNYANDQKNKKKSATLLAHEGVCAINNKAIADSFQIQASMRPKVKI PVKHVSLAFSSQDISRFPDDEEGDALMVEIAKKWMEQMGIRNTQYIIARHHDTKHPHCHL VFNRIDNEGNLISDSNERIRNAKVCRALTKEYGLYFAPKNSKARNKKRLRPHQLRKYTLR SSVLDARANSRSWNDFFSILKGQGIDMRFNHAENSDKIRGISFCMDEFSIAGSKLDRDLS FNNLCTMLGDIAAELIIQPHQAITSGGGGGTNSEQGWRDDKDKDNPRNEPFYKPSKRRR >gi|226332184|gb|ACIC01000136.1| GENE 90 65405 - 66058 406 217 aa, chain + ## HITS:1 COG:no KEGG:BT_4623 NR:ns ## KEGG: BT_4623 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 207 1 207 215 332 81.0 8e-90 MKGDVVAANLANLRQSINELKECIEKQNATVSKSEQSVKVDFDEKIIYNGVAKSFCTCWN EALSVVKKHIWKQQSDTLTFSLWFPKLIELFKQKSKLLEYLYRHVCDYNLNRMNIEANTD SILRRQNEILAKINELKNPVTIIPPNINGMFIRGYYIKLRYVVAFVVVIMIWAVAASISS MKYKEESFAHYSMYRAVREQHQYLMGKIGLEPQRNTE >gi|226332184|gb|ACIC01000136.1| GENE 91 66089 - 66286 61 65 aa, chain - ## HITS:1 COG:no KEGG:PRU_0358 NR:ns ## KEGG: 
PRU_0358 # Name: not_defined # Def: DNA mismatch endonuclease Vsr family protein # Organism: P.ruminicola # Pathway: not_defined # 1 55 120 174 178 70 56.0 1e-11 MGWHCITVWECQLKPKVRIQTLESLAYTLNHIFLEDRKIRTYDLPDINNTPMVAEPEVTY GRKEQ >gi|226332184|gb|ACIC01000136.1| GENE 92 66626 - 69106 652 826 aa, chain - ## HITS:1 COG:no KEGG:RCIX2535 NR:ns ## KEGG: RCIX2535 # Name: not_defined # Def: hypothetical protein # Organism: RC-I_MRE50 # Pathway: not_defined # 1 587 3 568 607 348 36.0 4e-94 MNNDINKYYLSLIQEISTRQLSNEDGDNQEQTFTRYVFDVLSEAGETENATVAFDEKDLG TKKQHKINGYAIADNYETVDLFISIYGCEESIYSTPKTEIDRASTRIINFFRKAVYGDYA NEVAESSEIFEFANTLSSYQELKDNLVRVNAFILTNGEYKGEFPANVDICGYKVFFRVID LRYLYQITVESRVPIEINFKEEGYTVPCLSAASENPDYQAYVAIIPGTCLANLYERFGAR LLEQNVRSFLQFTGKINKGIRDTIKKEPHMFLAFNNGLAATADHIELDSTGRNIVKISNL QIVNGGQTTASIYHTWKKDKADISGIFVQAKISVIKRAEEFADIVSRISRYANTQNKVND ADFTANNAHLVAFEKLSRYILTPITEDSNIQTYWFFERARGQYKNFRQKEGFTKSRQAAF DLKYPKKQVITKVELAKYINAYEEVYDGKKIAIGPFIVVRGNEKNYSKFINNNLPENIKN LNSIYYEDAIAKTILFRTTDKRYGTKVSGNNIGELKQVVVPYAISLLTTITKGKLNLYKI WKNQCLSNALSDFIYELMKQINDFILKESPVSHYIEWSKKEECWEKVKAYTWSYDLNDIK PDLINSSTPARKGFSLSSDENDEDVAHDMEIIESIPYSLWRKIAEWGKETDCLSINYQSA AQETAHKLKFNHKFTDSDRRKAINIYNIVCEKNIDLLFEADKLASEDNRASSAIHSSSTD YDNDNITIELVQKMVEWDRRRRVLKDWQWKVMDEIAKGKRPLDERMKRGMYMNYIALQKK RVYRIKRLCRFKNEVLIWQRNKSTIKQKAIEKICENIKPNLIEHYK >gi|226332184|gb|ACIC01000136.1| GENE 93 69108 - 70118 342 336 aa, chain - ## HITS:1 COG:no KEGG:RCIX2534 NR:ns ## KEGG: RCIX2534 # Name: not_defined # Def: hypothetical protein # Organism: RC-I_MRE50 # Pathway: not_defined # 87 336 39 286 288 152 34.0 2e-35 MSIENKYQIIELWNSLKSLATIGLVKKLYDTTLPIQVYVTYSYPDDIIGIAVSFSKEFKI DVTSLSNLSELKIRQLVDTSMPGQKMLHVQLMRNEYQRAFAALCEDLVTTLKPLSTPKKM AQEVVNQLYRWKNLFGKMKFDGLSKEEQQGLYGELVFLRKMLSRSTNDTYASTLKLWTGV EKTNKDFQGDNWAVEVKTTSTNNAQFITINGERQLDNSLVAHLFIYHLVLEVSKINGESL PMIVAEIKNLLSSNVPALCIFEEKLIEVKYIASHESLYAERFYKKRNEKCYKVLADFPRI MENDLRNGVSNVVYAISIGMCDEYLVSEGVLFNTIK >gi|226332184|gb|ACIC01000136.1| GENE 94 70102 - 72882 837 926 aa, chain 
- ## HITS:1 COG:no KEGG:RCIX2532 NR:ns ## KEGG: RCIX2532 # Name: not_defined # Def: endonuclease # Organism: RC-I_MRE50 # Pathway: not_defined # 15 914 21 916 917 617 41.0 1e-175 MQNQIINMCRVLIGPNAVVTDDQINDAIERVSLIFPNMDIAQVKSELLSLYGVRIQDFQI LEGNERRQPWLRDFKAERKSTWDFWVRYKQYLSEQKGFAPLVIQKLDEMSDRILDNLFNP QLTNITIDKKGLVVGQVQSGKTANYTGLICKAADAGFNFIVVLAGILNNLRSQTQSRIDE GFLGFDTQYERAYNINNTTKIGVGLIPGFDSAIANSYTTSLDSGDFNSRAANTAGFNFNA PQPIILVVKKNASVLKRLNTWLKAQAGGHKIANKSLLLIDDEADNASINTKKDKDLDPTA INKGIRTLIGQFNRSAYVGYTATPFANIFIAQDESDLFPRDFIINLPAPTNYIGPEKVFG TSMDVEEEEDLLPIVVPISDYLTFIPEGHKKNDPKPTFADIPESLKTAIKSFILTCAIRL ARGQENKHNSMLIHVSRYQLWQNEIKELVAQQFSYYKQEIEANDPSVLSEFQDLLENDYT ETTNKIKNSTLSNIDHCLQVHQWEDIKPLLFKVVQKIEVKSINGSSGDIVDYQLNSKNGV SVIAIGGDKLSRGLTLEGLSISYFLRASKMYDTLMQMGRWFGYRPGYVDLCRLFTSAELN EWYRHIAIASAELREEFNYLAESRSTPDKYALKVRTHPGCLQITALNKMRNTKEIQVSWA SRLIETYQLPIDKGLKNRNLVATDELLAGLGEPIIKDGDYLWKNVPANQICDYFNEFTVA ESLKKVNLELIVKYINELQGHNELTNWSVALMNKSAGSGPNKEYRFCNKYKVNCFYRNRA IDTDYKTYFIRKNHIVGNQADEFVDLEKDMLNEALETTKRESVDNGKIWSKAYPKPLVVR SSFRPKTQPLLIIYPLNPIGANVMKNGVPAEGSTVFVESDDPFVGFAISFPATDTSFAVN YKVNMVGEYADIEDNFDNENDNEYRE >gi|226332184|gb|ACIC01000136.1| GENE 95 72885 - 73034 79 49 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGINPSIITDMMRTMYDNQIKTGKTSEQAKAYLKTVEPFNLFEDLIENL Prediction of potential genes in microbial genomes Time: Thu May 12 03:08:07 2011 Seq name: gi|226332183|gb|ACIC01000137.1| Bacteroides sp. 1_1_6 cont1.137, whole genome shotgun sequence Length of sequence - 107700 bp Number of predicted genes - 86, with homology - 85 Number of transcription units - 37, operones - 22 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 1315 456 ## RCIX2531 hypothetical protein 2 1 Op 2 . - CDS 1281 - 2591 514 ## COG0270 Site-specific DNA methylase 3 1 Op 3 . - CDS 2599 - 2808 85 ## BT_3153 hypothetical protein - Prom 2917 - 2976 2.4 4 2 Op 1 . - CDS 3380 - 4087 364 ## BT_4629 hypothetical protein 5 2 Op 2 . 
- CDS 4092 - 5669 1218 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen - Prom 5758 - 5817 7.8 - Term 5939 - 5991 14.3 6 3 Op 1 . - CDS 6076 - 7569 1147 ## COG3119 Arylsulfatase A and related enzymes 7 3 Op 2 . - CDS 7680 - 9014 880 ## BT_4632 putative galactose oxidase precursor 8 3 Op 3 . - CDS 9053 - 10936 1277 ## BT_4633 hypothetical protein 9 3 Op 4 . - CDS 10952 - 14341 2026 ## BT_4634 hypothetical protein - Prom 14420 - 14479 5.5 10 4 Tu 1 . - CDS 14531 - 15376 522 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 15406 - 15465 6.3 - Term 15762 - 15798 -1.0 11 5 Tu 1 . - CDS 15828 - 16421 319 ## BT_4636 RNA polymerase ECF-type sigma factor - Prom 16479 - 16538 3.8 12 6 Op 1 21/0.000 - CDS 16673 - 17692 895 ## COG0306 Phosphate/sulphate permeases 13 6 Op 2 . - CDS 17709 - 18191 530 ## COG1392 Phosphate transport regulator (distant homolog of PhoU) - Prom 18324 - 18383 5.3 14 7 Tu 1 . - CDS 18436 - 19077 350 ## COG0586 Uncharacterized membrane-associated protein - Prom 19325 - 19384 4.7 15 8 Tu 1 . - CDS 19780 - 20217 354 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Prom 20331 - 20390 5.3 + Prom 20211 - 20270 8.4 16 9 Op 1 . + CDS 20357 - 21613 934 ## BT_4642 hypothetical protein 17 9 Op 2 . + CDS 21664 - 22215 371 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 18 9 Op 3 . + CDS 22215 - 23054 580 ## BT_4644 putative anti-sigma factor 19 9 Op 4 . + CDS 23070 - 24611 876 ## BT_4645 hypothetical protein 20 9 Op 5 . + CDS 24673 - 24849 268 ## BT_4646 hypothetical protein + Term 24889 - 24934 12.1 + Prom 24902 - 24961 7.4 21 10 Op 1 . + CDS 25048 - 25584 389 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 22 10 Op 2 . + CDS 25589 - 25948 402 ## BT_4648 hypothetical protein 23 10 Op 3 . 
+ CDS 25981 - 26565 606 ## BT_4649 hypothetical protein + Term 26705 - 26754 13.1 + Prom 26596 - 26655 4.8 24 11 Tu 1 . + CDS 26881 - 28455 1183 ## COG0038 Chloride channel protein EriC 25 12 Tu 1 . - CDS 28472 - 29305 717 ## COG0648 Endonuclease IV - Prom 29530 - 29589 4.9 - Term 29611 - 29662 0.4 26 13 Op 1 . - CDS 29687 - 32305 2204 ## BT_4652 hypothetical protein 27 13 Op 2 1/0.000 - CDS 32310 - 33575 1325 ## COG0738 Fucose permease 28 13 Op 3 . - CDS 33613 - 34566 470 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase 29 14 Op 1 . - CDS 34727 - 36355 1394 ## BT_4655 hypothetical protein 30 14 Op 2 . - CDS 36377 - 37987 1585 ## COG3119 Arylsulfatase A and related enzymes 31 15 Op 1 . - CDS 38100 - 40100 1908 ## BT_4657 heparinase III protein 32 15 Op 2 . - CDS 40156 - 41451 1103 ## BT_4658 glucuronyl hydrolase - Prom 41581 - 41640 4.9 - Term 41585 - 41631 13.5 33 16 Op 1 . - CDS 41645 - 43291 1126 ## BT_4659 hypothetical protein 34 16 Op 2 . - CDS 43336 - 46479 2121 ## BT_4660 hypothetical protein 35 16 Op 3 . - CDS 46504 - 48684 1363 ## BT_4661 hypothetical protein 36 16 Op 4 . - CDS 48703 - 50811 1365 ## BT_4662 heparinase III protein, heparitin sulfate lyase - Prom 50838 - 50897 8.2 37 17 Tu 1 . - CDS 51238 - 55299 3338 ## COG0642 Signal transduction histidine kinase - Prom 55393 - 55452 2.2 - Term 55396 - 55438 4.0 38 18 Tu 1 . - CDS 55470 - 56840 1413 ## COG1350 Predicted alternative tryptophan synthase beta-subunit (paralog of TrpB) - Prom 56927 - 56986 10.9 + Prom 56842 - 56901 4.1 39 19 Op 1 17/0.000 + CDS 56986 - 58815 1403 ## COG0168 Trk-type K+ transport systems, membrane components 40 19 Op 2 . + CDS 58820 - 59506 655 ## COG0569 K+ transport systems, NAD-binding component + Term 59649 - 59710 12.9 - Term 59900 - 59971 27.2 41 20 Op 1 . - CDS 60066 - 62507 2409 ## COG3250 Beta-galactosidase/beta-glucuronidase 42 20 Op 2 . - CDS 62524 - 63585 963 ## COG3867 Arabinogalactan endo-1,4-beta-galactosidase 43 20 Op 3 . 
- CDS 63596 - 65308 1064 ## BT_4669 hypothetical protein 44 20 Op 4 . - CDS 65320 - 66933 821 ## BT_4670 hypothetical protein 45 20 Op 5 . - CDS 66945 - 69923 1909 ## BT_4671 hypothetical protein 46 20 Op 6 . - CDS 69926 - 70150 184 ## BT_4672 hypothetical protein 47 21 Tu 1 . - CDS 70266 - 73502 2214 ## COG0642 Signal transduction histidine kinase - Prom 73528 - 73587 2.1 + Prom 73831 - 73890 5.9 48 22 Tu 1 . + CDS 73944 - 74495 484 ## BT_4674 hypothetical protein + Term 74565 - 74606 9.1 49 23 Tu 1 . + CDS 74631 - 75809 1113 ## BT_4675 heparin lyase I precursor + Term 75851 - 75896 7.1 + Prom 75952 - 76011 3.7 50 24 Op 1 . + CDS 76031 - 76474 581 ## BT_4676 putative periplasmic protein 51 24 Op 2 . + CDS 76520 - 77020 511 ## BT_4677 hypothetical protein + Term 77030 - 77078 7.5 - Term 76964 - 76997 -0.1 52 25 Op 1 . - CDS 77160 - 78281 1007 ## COG1760 L-serine deaminase 53 25 Op 2 . - CDS 78299 - 79351 836 ## COG0598 Mg2+ and Co2+ transporters 54 25 Op 3 . - CDS 79440 - 79832 327 ## COG1193 Mismatch repair ATPase (MutS family) 55 25 Op 4 . - CDS 79804 - 81942 2075 ## COG1193 Mismatch repair ATPase (MutS family) - Prom 82106 - 82165 80.3 + TRNA 82075 - 82162 48.9 # Ser TGA 0 0 56 26 Op 1 . + CDS 82487 - 83692 344 ## COG0477 Permeases of the major facilitator superfamily 57 26 Op 2 . + CDS 83717 - 84229 363 ## Cfla_0433 aminoglycoside-2''-adenylyltransferase 58 27 Tu 1 . - CDS 84215 - 84445 120 ## - Prom 84615 - 84674 6.5 59 28 Tu 1 . + CDS 84413 - 84604 86 ## COG3153 Predicted acetyltransferase + Term 84633 - 84675 4.2 - Term 84619 - 84661 4.2 60 29 Op 1 . - CDS 84676 - 85029 280 ## Desal_1939 protein of unknown function DUF1486 - Prom 85055 - 85114 3.1 61 29 Op 2 . - CDS 85118 - 86521 937 ## BVU_1439 mobilization protein 62 30 Tu 1 . - CDS 86733 - 87689 653 ## BVU_1440 DNA primase - Prom 87778 - 87837 3.0 - Term 87798 - 87845 0.1 63 31 Op 1 . - CDS 87902 - 88978 682 ## BVU_2466 hypothetical protein 64 31 Op 2 . 
- CDS 88993 - 89301 248 ## BVU_2467 hypothetical protein - Prom 89441 - 89500 6.8 - Term 89445 - 89486 6.2 65 32 Op 1 . - CDS 89526 - 90569 775 ## BVU_2468 hypothetical protein 66 32 Op 2 . - CDS 90578 - 91873 294 ## PROTEIN SUPPORTED gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 67 32 Op 3 . - CDS 91937 - 93220 1142 ## BVU_2469 tyrosine type site-specific recombinase - Prom 93311 - 93370 6.9 68 33 Op 1 . + CDS 93666 - 94904 1061 ## BF1219 putative transposase + Term 94912 - 94954 6.5 69 33 Op 2 . + CDS 94979 - 95521 616 ## BVU_2471 hypothetical protein + Term 95547 - 95581 4.0 + Prom 95694 - 95753 5.4 70 34 Tu 1 . + CDS 95781 - 96233 437 ## COG2003 DNA repair proteins + Prom 96425 - 96484 3.9 71 35 Op 1 . + CDS 96642 - 97175 569 ## COG4734 Antirestriction protein 72 35 Op 2 . + CDS 97219 - 97431 126 ## BF1223 hypothetical protein 73 35 Op 3 . + CDS 97456 - 97671 304 ## BF1224 hypothetical protein 74 35 Op 4 . + CDS 97692 - 98171 395 ## BF1225 hypothetical protein + Term 98179 - 98210 3.9 - Term 98257 - 98302 13.1 75 36 Op 1 . - CDS 98473 - 98790 305 ## BF1231 hypothetical protein 76 36 Op 2 . - CDS 98804 - 99253 263 ## BT_0084 conjugate transposon protein 77 36 Op 3 . - CDS 99275 - 99853 395 ## BT_0085 conjugate transposon protein 78 36 Op 4 . - CDS 99856 - 100764 572 ## BT_0086 conjugate transposon protein 79 36 Op 5 . - CDS 100798 - 102183 976 ## BT_0087 conjugate transposon protein 80 36 Op 6 . - CDS 102158 - 102463 68 ## BF1236 hypothetical protein 81 36 Op 7 . - CDS 102467 - 103090 377 ## BF1237 hypothetical protein 82 36 Op 8 . - CDS 103103 - 104128 676 ## BF1238 putative transmembrane conjugate transposon protein 83 36 Op 9 . - CDS 104147 - 104689 446 ## BF1239 hypothetical protein 84 36 Op 10 . - CDS 104673 - 106355 733 ## COG3344 Retron-type reverse transcriptase - Prom 106455 - 106514 2.6 85 37 Op 1 . - CDS 106979 - 107191 205 ## BF1239 hypothetical protein 86 37 Op 2 . 
- CDS 107196 - 107552 187 ## BF1240 hypothetical protein - Prom 107576 - 107635 3.2 Predicted protein(s) >gi|226332183|gb|ACIC01000137.1| GENE 1 1 - 1315 456 438 aa, chain - ## HITS:1 COG:no KEGG:RCIX2531 NR:ns ## KEGG: RCIX2531 # Name: not_defined # Def: hypothetical protein # Organism: RC-I_MRE50 # Pathway: not_defined # 15 437 29 455 519 372 46.0 1e-101 MKMNYEELKTANATPAAASMVETFRAMGYSLETAMADILDNSISAGANNIWISRIWKGGQ SIITIKDDGIGMNHQELIEAMRPGSQNPLEERSKSDLGRFGLGLKTASFSQCRRLTVYSK KADYKPVYWTWDLDYVAKTNQWELLQWIPDEYINALDDVSHGTLVIWSGLDRIINPETSE IDINSKVKFSSLLDNVKRHISMTFHRFIEEKVVRIFWCEHEIEPWNPFCENEPKTQIRPT ENIIGGISVKGYVLPHKSAFSSEAAYAKAEGINGWAAQQGFYVYRGKRLLLAGDWLGLFR KEEYYKLVRIKIDLPNTLDTEWQIDIKKSRAYPPMQNRSQLESYAKAVRSIGCEVYRHRG KILKQKAGTSFQPLWCEKKKDNKWSFVVNREHDMIKQIKSMALDNPEKAINTLLSFIEET LPTKTIYINEAKGEESQL >gi|226332183|gb|ACIC01000137.1| GENE 2 1281 - 2591 514 436 aa, chain - ## HITS:1 COG:MTH495 KEGG:ns NR:ns ## COG: MTH495 COG0270 # Protein_GI_number: 15678523 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Methanothermobacter thermautotrophicus # 5 423 6 408 413 308 41.0 2e-83 MKYRFIDLFAGAGGLSEGFIRAGYEPIAHIEMDHYACDSLKTRAAFHYLKENGKLEIYEE YLKNKKEKTDGSWLWNQVPKSVIDSVIQEAIGKETLPSLFERVDKLCNGKPVDMIIGGPP CQAYSVAGRARLGKKIEEDPRNDLYKFYVEFLKRYKPKMFVFENVLGIFTAKGGEPFRDL IRLVRELDYNIDFREQIASQHGVLQKRHRVIIVGWQNKRDTQEDTSYHYPYLLEEQMPYK MMRDLFCDLPIVKAGEGTLCGIVHYTKPLSDMEYLKKSGIRGVLSFTTQHIARPNNPTDR EIYKQAVEQWNEGKRLRYDKLDPSLQKHKNTQTFLNRFCVVDPNGVCHTVVAHIAMDGHY YIYPTPNPTTDNVRSITIREAARIQSFPDDYFFEGSRSSAFKQIGNAVPVVLAEKIALEI KKILAHEDELRRTQNR >gi|226332183|gb|ACIC01000137.1| GENE 3 2599 - 2808 85 69 aa, chain - ## HITS:1 COG:no KEGG:BT_3153 NR:ns ## KEGG: BT_3153 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 67 46 109 114 69 53.0 3e-11 MSVKQIPNRLKAALADNHKTSKWLADQLGKSDMTVSRWCTNRSQPSVHQLIKIANLLDVD VSELLNKTK >gi|226332183|gb|ACIC01000137.1| GENE 4 3380 - 4087 364 235 aa, chain - ## HITS:1 
COG:no KEGG:BT_4629 NR:ns ## KEGG: BT_4629 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 232 1 232 234 403 94.0 1e-111 METSFIPLCAIIAVLIIAFLFASLPQVSHKTRYRVLYAIAIIMLLAVIPISEYMAGDTQN SSNNYLLVLIFDIAVGYFCMYVAALLKFNVLKRKNQALENALTEKQQENVAILLEHQNEK QQALQQRELEWLAGKIKMFTEEEQEAILSSALSFAEHDLIVAPSISIQPKETCSQQELMY FVCSAFYNMDKSRSEIVSFLFQVFPLYFPAGESALAKKMPGLEKVRERRGKEECL >gi|226332183|gb|ACIC01000137.1| GENE 5 4092 - 5669 1218 525 aa, chain - ## HITS:1 COG:UU038 KEGG:ns NR:ns ## COG: UU038 COG2865 # Protein_GI_number: 13357594 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Ureaplasma urealyticum # 9 517 8 459 463 306 34.0 6e-83 MQIHNNTLIAECSAYDFKEMLERKKVKSWLKSVSAFANTDGGSLFYGVNDDGVIVGLENP QADADFISEMIKARLDPVPEVQLIPIEHEGHTLLEVKVKAGTLTPYYYYQDGTRTAYVRV GNESVECNSQQLLSLVLKGTHMTWDSLPTQVDASKHSFIILANTFREQTHQEWNDKYLES FGLVTPDGKLTNAGLLFVDNCTVFQSRIFCTRWTGLYKDDAISSVEHRANLVLLLKYGMD FIKNYTMSGWVKMPNYRLNLPDYSDRAIFEGLVNHLIHRDYTVMGGEVHIDIYDDRVELV SPGAMLDGTQIQDRDIYKVPSMRRNPVIADVFTQLDYMEKRGSGLRKMRELTEKLPNFLQ GKEPQYQTEATSFYTTFYNLNWNESGRIPVEEVANRVNSTLEKYPVNEESSVEKFVVNSK SSEKTFGDMQESSEKGSEITPKGSEKKFGDSKNKSKSIGKTAQKIIDFVISDPSMSAEAM AYKIGISSRAVEKQISKLRSMGILSREGADHGGYWRIIVKPYTEQ >gi|226332183|gb|ACIC01000137.1| GENE 6 6076 - 7569 1147 497 aa, chain - ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 13 484 30 467 497 164 26.0 4e-40 MPGKVQKKEKKTKPNVIIILADDLGYGDLECYGTTRVHTPNVNRLASEGIRFTNVHATAS TSTPSRYALLTGEYAWRKKGTGVAAGNAGMIIRPEQYTIADMFKSADYTTGAIGKWHLGL GDKTGTQDWNGTISPALKDIGFDYSYIMAATADRVPCIYIENGKVADYDSTAPIEVSYQK PFEGEPTGRKNPELLYNLKPSHGHDMAIVNGISRIGYMKGGGKALWKDENIADTITSHAI RFIEENKERPFFLYFATNDVHVPRFPHERFRGKNPMGLRGDAIVQFDWSVGEIMKTLDRL 
GLTENTLIILSSDNGPVLDDGYDDKAVELAGSHKPGGPFRGGKYSAFEAGTCVPAIVRYP AQVKKNQTLNTLLSQIDWIQSLASLVNVTIPQSKAPDSQNHLDSWLGKSKKDRPWVIEES NILALSVRKGKWKYIEPSNGSPMITWGPKIETGYAPYDQLFDMNKSEFESENLAPKYPAI VKEMKDILVQERAKGKK >gi|226332183|gb|ACIC01000137.1| GENE 7 7680 - 9014 880 444 aa, chain - ## HITS:1 COG:no KEGG:BT_4632 NR:ns ## KEGG: BT_4632 # Name: not_defined # Def: putative galactose oxidase precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 444 1 444 444 873 100.0 0 MKTIHTLLAAMTIAMLAISCENEFDVSDKADSLISVTVNTLNNQAAYNVGETAQAELWIQ RGGLKDASGKVQFVVDPLLLDSLNQEDKTNYELLPENCYEMTNTEFTVDKNELCGYITYH PEKILALSAYNETKYVLPLRLISNDLAINPARNTSFLAFTILEPIVHISNAGVYNINPDL TSTMDIQIGVPFTNKWDILCNLTEDLSLIDEYNQINKVNFTLLPENAYTAPESVTLQEGV SQITASYQLKNNLVPGNYILPIKIGSITASQGGVPNNSLVIDEESNVLFCIVKEGNKINK SGWEVIECSSEHAGNEATYMIDDNESTYWHCKFKNEAGSSVPPFHFIIDMKKEITIAQID LLNRGDGAANNIKWVEFYASNNNSEWDKIGAGDFNDEGTRATFRYYVKSTKARYLKLIIP EGYGNNCPPAAIRELTVYGLEGNN >gi|226332183|gb|ACIC01000137.1| GENE 8 9053 - 10936 1277 627 aa, chain - ## HITS:1 COG:no KEGG:BT_4633 NR:ns ## KEGG: BT_4633 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 627 1 627 627 1283 99.0 0 MKTKKKLYRKVLIVAFTLCNLCSCNYLDVSDELAGNLQSLDEIFDNVAYTKRWYANIFTA IPDYSGIVAAEGNVTGFKNPWAGMCDELTVGYGDNKLYNKTEKNAANMGFHRWGNCYKQI RQANIFLANAKPIAANGTHVDVLTEEELIEMKANVRFMRAYYHYLLFEQYGPIPLVKDAI YERDDDLDIPRSPIDEIIAYIDQELKEVSNELSQEALHEDDQHNAWPTKGVALAVRAKLW VYAASKLFNGEYKEALSITNPDGQRLFPDKDPGKWEKAVSALKDFMTFAEDENNYELLEN EENPSQALYDLFQTYNREIIWATAANTWGGMDGDAFDRRCTPRSEQNGMGCTGVTQELVD AFYMNDGYPVQETSFLKQSTLYTTVGTDTYKEKVVTSNNKKVSDAKNVSNRFMNREPRFY NTVFFQNRRWHVSNNVTQFHKGSPNELSGTIYTLTGYMLYKRFNREVNKKSPGVTSKFRP SIIFRLADFYLLYAEAVNEIDPNDARVLTYLNKVRHRAGLENIEILNPDIAGKQDLQRLA IQRERRIELATEGQRYFDVRRWMIADQEGEGRQYGYVHGMNMNGTEDKFLQEVEASPIVF RRKMYLYPIPDSEMKKSDGQLVQNPGW >gi|226332183|gb|ACIC01000137.1| GENE 9 10952 - 14341 2026 1129 aa, chain - ## HITS:1 COG:no KEGG:BT_4634 NR:ns ## KEGG: BT_4634 # Name: not_defined # Def: hypothetical protein 
# Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1129 1 1129 1129 2182 100.0 0 MKFFSLFLLIGIGNCLATNTYSQNKLFTIKSSQKTITEVFHEIEKNSEYIIFYMDQLIDT NRKVNINVRKQQVSAILDQLFAGTDNTYSINDRQITIYRKGEQPTPQQEKNKFVITGVVT DSQGESVIGASVRIKDTTNGVITDMDGRYTITAPGKNSILVISYIGYSTEEVKINDRRNI NIRLREDTKALDEVVVVGYGQQKKESVVVSMSSVKVSDITAPTRNLTNNLAGQVSGLIAI QRSSEPGFDDAEFWIRGISTFASNSAASTPLVLVDGIPRKITDIEPDEIETFSILKDAAA TAIYGAEGANGVILVTTKRGKDEKPKITFKTEHSISSPQRLPEFVGSADYLGLYNEALRN DGEPALFSDETIEKFRNSTDPDLYPNTDWIKELLKKTTNNHRYTLNVRGGSARAKYFVSG AYYNESGIYKGNPTDKYDTNIGLDRFNLRSNIDMDVTSTTKISVDLAGQYLIANYPGSSS SSIFRSMLITPPYCFPAVYSDGTVATYEQERGVNMRNPYNLLMNSGYTRQWRTGIQSKVG VQQNLKFITKGLSAKMNISYDFDATFKSIRSYNPSRYHATGRDENGNLQFAQVVSGTPDL SDLKDNGIEANKKIYIDASINYKRTFAEKHDVTGMLLYMQKETQYKTEPLPYRKQGVVGR VSYAYDGRYFIEGNFGYTGSEAFAKNHRFGFFPAVGVAYYLSNEPFYPKAIKKIANKIKI RASVGKTGNDTTDKRFVYRATYNMAAGSWSQGIGSNGGTNAIGNAIIEGFPETLDIGWEI ETKQNYGFDLGLFDNKVDFVFDYFRSERSHILMQRNTTPTVGGFRVNQYANYGIVNNHGV DMSLNIHQQIGKVKLSARGTFTYARNKIVEYDELPQRYPWMAQTGMRINENYLYIAERLY TKEDFIVSKNGSGIETYTLRSELPQSTLGGLLGPGDIKYKDINGDGIIDSYDKVRGVGHP KVPEIVYGFGLNIEYKGFYASAFFQGAGNCSVLLGGNTPEGWFPFSWGVDQSNYRTFALD RWTEKNPSQNVLMPRLHKDNTNAANNSVASTWWLRNGGFLRLKNVELGYQLPKKLLSKIN LQAARIYLMGYNIALWDDLKYFDPEAGNANGGNTYPKARTFTLGIDFTF >gi|226332183|gb|ACIC01000137.1| GENE 10 14531 - 15376 522 281 aa, chain - ## HITS:1 COG:RSc2919 KEGG:ns NR:ns ## COG: RSc2919 COG3712 # Protein_GI_number: 17547638 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Ralstonia solanacearum # 77 280 70 270 274 78 28.0 2e-14 MQEKTELEITAVKRPNATPTDNIQLVLSKKNKIDIDGKESQLQYDHQGKINVNSQTIIQE TDDKEKTNIYNQLIVPAGKRSSITFTDGTCIWLSANSRIVYPVEFKQHKREIYIEGEAFL SVSHDPNRPFIVKTSQMDIQVLGTTFNVSAYENKQTQSVVLVSGKIKVKTSKNESKTLSP NNLLSYNEQEGIHIQSVDVQKYIAWKDGFYLFQTEKLKDIATKLSDYYGKKIMIDSPLKT ITCSGKLDLKEDLDEVLQTLIRTVPARIEKSDGIIHIYVKH >gi|226332183|gb|ACIC01000137.1| GENE 11 15828 - 16421 319 197 aa, chain - ## HITS:1 COG:no 
KEGG:BT_4636 NR:ns ## KEGG: BT_4636 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 197 1 197 197 352 100.0 3e-96 MFTQDESYIKWKLFLEGDDQAYSWIYTHYIQVLYNYGLQITPDSEIVKDCIQDVFVKIYK AKKKLTVPQNPKVYLMIALKNNIYNTFNQERLQKNYAFSLYQTEEQLTVKNEFIDQEARH EEMNNIKRMMKILTPRQREVIYYRFIEELSYDDICQIMGLNYQSAYNLLQRSLQKIREAY GVTGIWMLILHQLTYLH >gi|226332183|gb|ACIC01000137.1| GENE 12 16673 - 17692 895 339 aa, chain - ## HITS:1 COG:RSc1313 KEGG:ns NR:ns ## COG: RSc1313 COG0306 # Protein_GI_number: 17546032 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate/sulphate permeases # Organism: Ralstonia solanacearum # 12 328 19 326 336 258 47.0 2e-68 MELLVTIIILALIFDYINGFHDAANSIATIVSTKVLTPFQAVLWAAFFNFVAFFIAKYII GGFGIANTVSKTVVEQYITLPIILAGIIAAIFWNLITWWKGIPSSSSHTLIGGFAGAAIM ANGFEAIQLNIILKIAAFIFLAPFIGMIIAFGFTLLVLYICRRTNPHTAEMWFKRLQLVS SAMFSVGHGLNDSQKVMGIIAAAMIAAHSMGLGMGINSISDLPDWVAFSCFTAISLGTMS GGWKIVKTMGTRITKVTPLEGMVAETAGAFTLYITEMLKIPVSTTHTITGAIIGVGATKR LSAVRWGVTKSLMTAWVLTIPVSGLLAACIYCVVSLFLK >gi|226332183|gb|ACIC01000137.1| GENE 13 17709 - 18191 530 160 aa, chain - ## HITS:1 COG:CAC3094 KEGG:ns NR:ns ## COG: CAC3094 COG1392 # Protein_GI_number: 15896345 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate transport regulator (distant homolog of PhoU) # Organism: Clostridium acetobutylicum # 1 158 49 208 210 90 35.0 1e-18 MEREGDRLTHLIFDELSTTFITPFDREDIHDLASCMDDVIDGINSSAKRIVIYNPRPISE SGKELSRLIHEEAINIGKAMDELETFRKNPKPLRDYCTQLHDIENQADDVYELFITKLFE EEKDCIELIKIKEIMHELEKTTDAAEHVGKILKNLIVKYS >gi|226332183|gb|ACIC01000137.1| GENE 14 18436 - 19077 350 213 aa, chain - ## HITS:1 COG:STM2367 KEGG:ns NR:ns ## COG: STM2367 COG0586 # Protein_GI_number: 16765694 # Func_class: S Function unknown # Function: Uncharacterized membrane-associated protein # Organism: Salmonella typhimurium LT2 # 3 208 6 211 219 267 65.0 1e-71 MEFILDFILHIDQYTISIVQDYHTWAYAILFIIIFCETGLVVTPFLPGDSLLFVAGAISA 
LPGMPIDIHILVLILFAAAVLGDSCNYMIGHFFGRKLFHNPNSRIFKQSYLDKTHEFYKK YGGKTIILARFVPIVRTFAPFVAGMGKMHYYYFMIYNLIGGAIWVVLFCYAGYFFGDLPF VQENLKLLIIAIIFISILPAIIEILRAKLGSKQ >gi|226332183|gb|ACIC01000137.1| GENE 15 19780 - 20217 354 145 aa, chain - ## HITS:1 COG:FN1295 KEGG:ns NR:ns ## COG: FN1295 COG0454 # Protein_GI_number: 19704630 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Fusobacterium nucleatum # 16 142 3 129 135 102 41.0 2e-22 MIRKLDKAEFQQATTLALEVYLQCGQDDFEDEGLESFKSFIFNDKLMNELCIYGAFEENK LIGIMGTKNEGKHISLFFIKKEFHRKGIGKQLFDYSQCDCPANEITVNSSTYAIRFYESL GFEKTNDRQQTNGISYTPMKLISNK >gi|226332183|gb|ACIC01000137.1| GENE 16 20357 - 21613 934 418 aa, chain + ## HITS:1 COG:no KEGG:BT_4642 NR:ns ## KEGG: BT_4642 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 418 1 418 418 823 100.0 0 MKKSVILLVGWLLAFLWIGSADIWAQDAGDYFTIVGMVKDKQNKRTLENVNVSVQGSNIG TVTNAEGEFALKVRKEEVPRELEISHIGYINSHVSLDKNNSSKLTVWMIPHTNQLNEVVV YANNPRTIIEKAIEKIPVNYSANRNMLTSFYRETVQKGRRYISVSEAVLDVSKTAYTNRT TDYDKLQVLKGRRLLSQKVSDTLAVKVMGGPNISVVLDIVKNKEALLEPEELNNYEFWMA ESALIDNRIQYVINFRPRVLLPYALFHGKLYVDCDKLSFTRIEMNLDMQNKSKAIAAILH QKPFGLRFKPQELSFLITYKTIDGRTYLNYIRNDIRFKCDWKRRLFSTSYAVTSEMVVTD RRESPSEIIPRNKVFTSNQIFYDKVGNYWSEDFWGNYNIIAPTESLEHAVDKLKKQSN >gi|226332183|gb|ACIC01000137.1| GENE 17 21664 - 22215 371 183 aa, chain + ## HITS:1 COG:PA0149 KEGG:ns NR:ns ## COG: PA0149 COG1595 # Protein_GI_number: 15595347 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 44 169 38 161 181 63 30.0 1e-10 MLNELLILTKIKAGDIKAFEELFRCYYSPLCWYAASITGRMEVAEEIVEELFYVLWKDRE QLQIFQSVKNYLYRATRNQSIQYCEHEEVKERYRESVLTASSSEQVTDPHQQMEYEELQK LINNTLEKLPERRMQIFKMHRTEGKKYAEIAVQLSLSVKTVEAEMTKTLRTLRKEIENYI QMK >gi|226332183|gb|ACIC01000137.1| GENE 18 22215 - 23054 580 279 aa, chain + ## HITS:1 COG:no KEGG:BT_4644 
NR:ns ## KEGG: BT_4644 # Name: not_defined # Def: putative anti-sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 279 2 280 280 536 100.0 1e-151 MKTDIHKIKTEQAWNRLYDRLDKDHLLVEDHRMPKIPMWVRYGTVAAMVVGLVFSTLYWG FGQKEELPDFITQENQDVPTLVTTLEDGSVVFLAKETSIRYPEHFVSDKREVSLQGDAFF DVAKKQKQPFWIDTKEVKIEVLGTAFSVKSVKDTPFRLSVQRGTVRVTLKKGNKECYVKA GETVVLQSQQLLLSSTENAGELNRYLKHVCFKDESLGHILKVMNMNAGSSQIRVASPALE KRKLTVEFSNESPETVATLIAYALNLKCTRQGDTFMLTE >gi|226332183|gb|ACIC01000137.1| GENE 19 23070 - 24611 876 513 aa, chain + ## HITS:1 COG:no KEGG:BT_4645 NR:ns ## KEGG: BT_4645 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 513 1 513 513 1029 100.0 0 MRLVYRYMVILCLFFIVPDTLRADGEDVLERMISLPKMKGTVYSLLGNISQQSGYLFIYD SKVVDNDVTVKIRKGERTIRQAIYEITGDTSLEFKVIGTHILITSSSPTKQQQAKPSSVH PVNLMLTGTLLDKETGMPVASASVGVRRTSVGIVTNQEGEFRLSLPDSLQNDSVVFSHIG YVSQTLEVSVLAGRHHILSLEPKVVPLQEVVVQWVDPYKLLKEMGRQREQNYSHSPAYLT TFYREGVLLKNKVQNLTEAVFKVYKIASHSPVSDQAKLLKMSRLSNVEAKDSLLVKVKSG IQACFQMDIMKDMPSFLIPDAGDNGYLYTSQGVTFIDDRCVNVIHFAQKKEIIEPLYCGD LYIDAETNALLQARFEVDPQRVKKASEMFVERRTHGIQIIPQKVVYTISYKPWQGIYYIH HIRGDLSFKVKRRRLLSASPQMQIWFEMITCKVDTEQVTAFPRAERLPTRTIFSDLNFKY DEDFWRDFNVIPLEEELGKLIERISLKIEEIGP >gi|226332183|gb|ACIC01000137.1| GENE 20 24673 - 24849 268 58 aa, chain + ## HITS:1 COG:no KEGG:BT_4646 NR:ns ## KEGG: BT_4646 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 58 1 58 58 110 100.0 2e-23 MELPKDPMVLFSVINMKLRDCYSSLDELCEDMDVSREDIVSQLKAAGFEYSAEHNKFW >gi|226332183|gb|ACIC01000137.1| GENE 21 25048 - 25584 389 178 aa, chain + ## HITS:1 COG:PA0762 KEGG:ns NR:ns ## COG: PA0762 COG1595 # Protein_GI_number: 15595959 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 16 173 19 184 193 82 32.0 3e-16 MINEDKIRKACSSDRERGFRLLVDSFQEPIYNYIRRMVVSHEDAEDVLQEVFIRVFRHLD 
QFRNESSLSTWIYRIATNECLRLLNSRKEEVISAEDVQEELMSKLKASDYVDYEDELAVK FQEAILTLPEKQRIVFNLRYYDELEYEDIARVLDSKVETLKVNYHYAKDKIKEYILNR >gi|226332183|gb|ACIC01000137.1| GENE 22 25589 - 25948 402 119 aa, chain + ## HITS:1 COG:no KEGG:BT_4648 NR:ns ## KEGG: BT_4648 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 119 1 119 119 189 99.0 3e-47 MDKDFDFDNIGKRTPYRTPDNFFEETQRKILERTVDEQRKKRRLKRIIPTVIAVAAVLAG ILFTPSLRYMNTDTPSTSNILAVDKNNVTTDPVDKWIKELSDEELEELVSFSENDIFLN >gi|226332183|gb|ACIC01000137.1| GENE 23 25981 - 26565 606 194 aa, chain + ## HITS:1 COG:no KEGG:BT_4649 NR:ns ## KEGG: BT_4649 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 194 1 194 194 261 98.0 8e-69 MKTKFIYVVLMALFLGSQMTLSAQNKENKERKQRPTPEQMMQMQTNQMVKALMLDDATAA KFTPIYQKYLKELRECRMMNFKPRARKDAAQGTEANTAKETPKPVMTDAEIAKMLKDQFA QSRKMLDIREKYYNEFSKILSQKQIMKIYQQEKSNMNKFRKEFDRRKGQKPGQGHKPGQD RQRQRQHTPHQTQK >gi|226332183|gb|ACIC01000137.1| GENE 24 26881 - 28455 1183 524 aa, chain + ## HITS:1 COG:FN1727 KEGG:ns NR:ns ## COG: FN1727 COG0038 # Protein_GI_number: 19705048 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Fusobacterium nucleatum # 19 523 10 520 521 254 33.0 3e-67 MEFRLIKKLKDRGRWRIFKLKLIDARLYFVSIFVGLLTGLIAVPYHYLLQFFFNLRHDFF DSHPKWYWYLPLFLLMWGILIFVSWLVKKMPLITGGGIPQTRGVINGRVAYKHPFIEVIA KFVGGILALSTGLSLGREGPSVQIGSYVGCLVSKWGRVLAGERKQLLAAGAGAGLAAAFA APLASSLLVIESIERFDAPKTAITTLLAGVVAGGVASWLFPINPYFQIDAIVPGMTFWSQ VKLFLLLAAVVSVFGKFFSVTTLQVKRIYPAIKHPEYVKMLYLLFIAYLISMAEVNLTGG GEQFLLAQAMHPDTHILWIVGMMLLHFVFSTFSFSSGLPGGSFIPTLVTGGLLGQIVALI LVQQGVIAYENISYIMLICMSAFLVAVIRTPLTAIVLITEITGHLEVFYPSIVVGGLTYY FTEMLQIKPFNVILYDDMINSPAFKEEARYTLSVEVMSGSYLDGKIVDELRLPERCIIIN VHRDRKNWPPKGQKLMPGDQVQIEMDSQDIEKLYEPLVSMANIY >gi|226332183|gb|ACIC01000137.1| GENE 25 28472 - 29305 717 277 aa, chain - ## HITS:1 COG:STM2203 KEGG:ns NR:ns ## COG: STM2203 COG0648 # Protein_GI_number: 16765533 # 
Func_class: L Replication, recombination and repair # Function: Endonuclease IV # Organism: Salmonella typhimurium LT2 # 1 275 1 277 285 382 67.0 1e-106 MKYIGAHVSASGGVEFAPVNAHEIGANAFALFTKNQRQWVSKPLTEENIRLFKENCTKYN FQTDYILPHDSYLINLGHPEEEGLEKSRAAFLDEMQRCEQLGLKLLNFHPGSHLNKISIE DCLALIAESINLTLEKTKGVTAVIENTAGQGSNLGSEFWQLRYIIDRVNDKSRVGICLDT CHTYTAGYDIVNDYDKVFDEFEKEVGFEYLRGMHLNDSKKELGSHVDRHDNIGQGLIGSA FFERLMKDSRFDNMPLILETPDESKWAEEIAWLRSVE >gi|226332183|gb|ACIC01000137.1| GENE 26 29687 - 32305 2204 872 aa, chain - ## HITS:1 COG:no KEGG:BT_4652 NR:ns ## KEGG: BT_4652 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 872 1 872 872 1802 99.0 0 MKQRYYIFLLFVAMLSYSGYAQKSILRLSQQTLMHEVRETPSPLDGQHIAVNPPRFMWPD KFPHLGPVLDGVEEEDHKPEVTYRIRIARDPEFKSEVMTAERNWAFFNPFKLFEKGKWYW QHAYLDKDGKEEWSPVYHFYVDEQTRTFNPPSLQEVLAKFSQSHPRILLDAKDWDQIIER NKNNPEAQLYIQKARKCLNHPLKHLEEEIDTTQVVKLTNIVQYRSALIRESRKIVDREEA NIEAMVRAYLLTKDEVYYKEGIKRLSEILSWKDSKYFAGDFNRSTILSMSTSAYDAWYNL LTPAEKQLLLETISENAHKFYHEYVNHLENRIADNHVWQMTFRILNMAAFATYGELPMAS TWVDYCYNEWVSRLPGLNTDGGWHNGDSYFHVNLRTLIEVPAFYSRISGFDFFADPWYNN NALYVIYHQPPFSKSAGHGNSHETKMKPNGTRVGYADALARECNNPWAAAYARTILEKEP DIMKKSFLGKAGDLTWYRCITDKALPKEEHSLAELPMTKVFNETGIATMHTSLGDIEKNA MLSFRSSPYGSTSHALANQNAFNTFYGGKAIFYSSGHRTGFTDDHCMYSYRNTRAHNSIL VNGMTQTIGTEGYGWIPRWYEGEKISYMVGDASNAYGKITAPIWLKRGELSGTQYTPEKG WDENKLKMFRRHIIQLGNTGVYVIYDELEGKEAVTWSYLLHTVELPMEMQELPDEVKVTG KNKDGGISVAHLFSSAKTEQAIVDTFFCAPTNWKNVTNALGKAVKYPNHWHFSSTTIPCK TARFLTVMDTHGNNRADMKVVRQGNTVQVGDWIITCNLTEKGKAAISVTHQAEKVSLKYD AGKKEGATIITDQVQGKQVNKVLTDYLPDFEI >gi|226332183|gb|ACIC01000137.1| GENE 27 32310 - 33575 1325 421 aa, chain - ## HITS:1 COG:BMEII1053 KEGG:ns NR:ns ## COG: BMEII1053 COG0738 # Protein_GI_number: 17989398 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Brucella melitensis # 2 402 18 407 412 134 29.0 3e-31 MKKNLGMLALIMAFWFTISFITNILGPLIPDIIHNFNLSDLAMAGFIPTSFFLAYAIMSI 
PAGLLIDRFGEKPVLFCGFLMPFIGTILFACMHTYPMLLASSFIIGLGMAMLQTVLNPLQ RTVGGEENYAFIAELAQFMFGIASFLSPLAYTYLIRELDPATYTAGKSLILDLLADMTPQ DMPWVSLYWVFTLLLLVMLIAVGVSHFPKIELKEDEKSGSKDSYLALFKQKYVWLFFLGI FCYVSTEQGTSIFMSTFLEQYHGVNPQTEGAQAVSYFWGLMTAGCLVGMILLKLIDSKRL LQVSGVLTIVLLLAALFGSKDVSLIAFPAIGFSISMMYSIVFSLALNSASQHHGSFAGIL CSAIVGGAGGPMIVSTLADATSLRTGMLSILLFVGYITFIGFWARPLINNKTISLKELLK K >gi|226332183|gb|ACIC01000137.1| GENE 28 33613 - 34566 470 317 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 7 312 6 318 319 185 35 7e-46 MEKEYAIGIDLGGTSVKYALIDNNGVFHFQGKLPSNADVSAEAVIGQLVKAVNEVKTFAE AKEYTIAGIGIGTPGIVDCTNRIVLGGAENIQGWENLKLADRMEKETGLPTQLGNDANLM GLGETMYGAGNGATHVVFLTVGTGIGGAVIIDGKLFNGYANRGTELGHVPLIANGEPCAC GSIGCLEHYASTAALVRRFSKRIAEAGISYPNEEINGELIVRLYKQGDKIAAESLNEHCD FLGHGIAGFINIFSPQRVVIGGGLSEAGDFYIQKVSEKALRYAIPDCAVNTEIMAASLGN KAGSIGAASLFLNRKPN >gi|226332183|gb|ACIC01000137.1| GENE 29 34727 - 36355 1394 542 aa, chain - ## HITS:1 COG:no KEGG:BT_4655 NR:ns ## KEGG: BT_4655 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 38 542 1 505 505 1069 99.0 0 MDRRNFIKNTGWSFLGLAASGSLLGSCAAGSKEAKKKMPSASDLKMYWGDLHNHCNITYG HGDMRDAFEAAKGQLDFVSVTPHAMWPDIPGADDPRLKWVIDYHTGAFKRLREGGYEKYV AMTNEYNKEGEFLTFVGYEAHSMEHGDHVALNYDLDAPLVECTSIEDWKQKAKGHKVFIT PHHMGYQGGYRGYNWKCFTEGDQTPFVEMYSRHGLAESDQGDYPYLHDMGPRQWEGTIQY GLELGNKFGIMASTDQHSGYPGSYGDGRIGVLAPSLTRDAIWEALRTRHVCAATGDKILI DFRLNDAFMGDVVRGNSRRIYLNVTGESCIDYVDIVKNGQILARMNGPLTPVAPEGDTVR CKVKVDFGWNREEQYVHWQGKLSVDKGRILSVTPCFRGAAFTSPQEGETEFHTHVNRIVS AGEKETELDMYSSKNPNTTTAAMQAVILEVEMPKDGKVIADFNGKKFEHTLGELLEGSRS HFMIGWLSEAILFNRAMPESCFTVEHYMEDKEPQRDTDYYYVRVRQRDGQWAWSSPIWAE RV >gi|226332183|gb|ACIC01000137.1| GENE 30 36377 - 37987 1585 536 aa, chain - ## HITS:1 COG:PA0183 KEGG:ns NR:ns ## COG: PA0183 COG3119 # Protein_GI_number: 15595381 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: 
Pseudomonas aeruginosa # 28 513 2 520 536 130 24.0 6e-30 MKSNPSTLLLPLAALSLASCANQKKEETKRPNIIFMMTDDHTTQAMSCYGGNLIQTPNMD RIANEGIRFDNCYAVNALSGPSRACILTGKFSHENGFTDNASTFNGDQQTFPKLLQQAGY QTAMIGKWHLISEPQGFDHWSILSGQHEQGDYYDPDFWEDGKHIVEKGYATDIITDKAID FLENRDKNKPFCMMYHQKAPHRNWMPAPRHLGIFNNTIFPEPANLFDDYEGRGKAAREQD MSIEHTLTNDWDLKLLTREEMLKDTTNRLYSVYKRMPVEVQDKWDSAYAQRIAEYRKGDL KGKALISWKYQQYMRDYLATVLAVDENIGRLLNYLEKIGELDNTIIVYTSDQGFFLGEHG WFDKRFMYEECQRMPLIIRYPKAIKAGSTSNAISMNVDFAPTFLDFAGVEVPSDIQGASL KPVLENEGKTPADWRKAAYYHYYEYPAEHSVKRHYGIRTQDFKLIHFYNDIDEWEMYDMK ADPREMNNIFGKAEYAEKQKELMQLLEETQKQYKDNDPDEKETVLFKGDRRMMENR >gi|226332183|gb|ACIC01000137.1| GENE 31 38100 - 40100 1908 666 aa, chain - ## HITS:1 COG:no KEGG:BT_4657 NR:ns ## KEGG: BT_4657 # Name: not_defined # Def: heparinase III protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 666 1 666 666 1347 99.0 0 MNKTLKYIVLLTFACFVGQGYAQELKSEVFSLLNLDYPGLEKVKALHQEGKDEDAAKALL DYYRARTNVKTPDINLKKITIGKEEQQWADDGLKHTFFVHKGYQPSYNYGEDINWQYWPV KDNELRWQLHRHKWFTPMGKAYRVSGDEKYAKEWAYQYIDWIKKNPLVKMDKKEYELVSD GKIKGEVENVRFAWRPLEVSNRLQDQTTQFQLFLPSPSFTPDFLTEFLVNYHKHAVHILA NYSDQGNHLLFEAQRMIYAGAFFPEFKEAPAWRKSGIDILNREINVQVYNDGGQFELDPH YHLAAINIFCKALGIADVNGFRNEFPQEYLDTIEKMIMFYANISFPDYTNPCFSDAKITE KKEMLKNYREWSKLFPKNETIKYLATDGKEGALPDYMSKGFLKSGFFVFRNSWGMDATQM VVKAGPKGFWHCQPDNGTFEMWFNGKNLFPDSGSYVYAGEGEVMEQRNWHRQTSVHNTVT LDNKNLETTESVTKLWQPEGNIQTLVTENPSYKNFKHRRSVFFVDNTYFVIVDEVSGSAK GSVNLHYQMPKGEIANSREDMTFLTQFEDGSNMKLQCFGPEGMSMKKEPGWCSTAYRKRY KRMNVSFNVKKDNENAVRYITVIYPVKKSADAPKFDAKFKNKTFDENGLEIEVKVNGKKQ SLKYKL >gi|226332183|gb|ACIC01000137.1| GENE 32 40156 - 41451 1103 431 aa, chain - ## HITS:1 COG:no KEGG:BT_4658 NR:ns ## KEGG: BT_4658 # Name: not_defined # Def: glucuronyl hydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 431 4 434 434 869 100.0 0 MKKKLVLFASVAIALASCQTTPKEDYSWIKKGLDVASAQLQLTAEEIDGTGKLPRSIRTG YDMDFLCRQLERDPSTFQDSLRPQPTAEQMGKRRLCVVYDWTSGFFPGSLWYAYELTGND TLKADAIQYTNLLNPVRYYKGTHDLGFMINCSYGNAERLAPNDTIKAVMKETADNLSGRF 
NDSIGAIRSWDFGSWNFPVIIDNMMNLDLLFTVSKWTGDNKYKDVAIKHAITTMKNHFRP DYTCWHVVSYNNDGTVERKQTHQGKNDDSSWSRGQAWAVYGYTSCYRETNDTTFLNFAVN IADMIMERVKTDDAIPYWDYDAPVTEETPRDASAAAVTAAGFIELSTMVPDGKKYLDYAE KILKSLSSDAYLAKVGENQGFILMHSVGSLPNGSEIDTPLNYADYYYLEALKRFMDLKKL RLENGEVKVIE >gi|226332183|gb|ACIC01000137.1| GENE 33 41645 - 43291 1126 548 aa, chain - ## HITS:1 COG:no KEGG:BT_4659 NR:ns ## KEGG: BT_4659 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 548 10 557 557 1092 100.0 0 MSVMCLLPFSCSLEEETRTEVEKKNYMNNAEEAKDVLLGVYRTNTLDAMYGYYLSILFNL GTDISQVEGSGNENFRIIPTNSFPTTQSEVQQTWAALYTGIYRANDFLERISNKIGSYTT TDKKLATLYIAEARALRGMFYFELVRRFGNVVLMTSTQMSNQNPATYVQSAPEKVYEYIE DDLLYACDILPYATDDQYRESNDYRFSKGAALGLLTKVYATWAGYPVKDESKWEAAAKTA RILVESGKHGLLKDYEQLWKNTCNGTWDPTESLIEISFYSPTVSGNSDPVGRIGKWNGVK TTAIAGVRGSCAANVKVVHTFVLDWREDVSDIRRDLSIANYQYTDTKKSLWVAGASDTDE SAAEKDADPTKAQKNKQNYTPAKWDIQKYVTTNSFINNDKSNVNWYFLRYADVLLLYAEA LNEWKHGPDAEAYNAINAVRRRGYGNPSNTSACDLPQGLDETSFREAVRKERSYELSFEG HRRQDLIRWGIYYKTVQATAKELGYWWEGTGSPNYSVATYTEEGKHELFPIPQRDMDLCI QFNQNPKW >gi|226332183|gb|ACIC01000137.1| GENE 34 43336 - 46479 2121 1047 aa, chain - ## HITS:1 COG:no KEGG:BT_4660 NR:ns ## KEGG: BT_4660 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1047 1 1047 1047 1977 100.0 0 MYYTNIIKSISARGLFTLLALIMSISLHAQNATVKGVIVDETDTPLIGATVQVKGTATGS ITDFDGNYTIKANKGAVITFSYIGYKTQEIKFTGQSPLNVKMIPDNQTLDEVVVVGYGTM KRSDLTGSVASIAAKDVEGFKTSSVAGALGGQIAGVQITSTDGTPGAGFSINIRGVGTLT GDSSPLYIVDGFEVDDIDYLSNSDIESIEVLKDASSSAIYGARAANGVVLITTKSGKTGR PTITYNGSASYRKISKKLDVLSPYEFVKLQGEVNSKYSDSYFKPGNDDNDIPYRYQSLDD YMGVKGVNWQDETFNPTWSQDHSLSIMGGTDDSKYNASFSRYIENGIFKNSGFDKTTGKF RLDQKLSKSLSFNITVNYALTNRKGVGTSADSGRFNMLAQILSARPTGGLKLTDDELLDS AIDPEMLESGESLAQVNPVKQTESVTNTKRAEMWSGNGSISWQIIKGLTFKTAGTYNTTN NRTNIFYKDGSKEAYRNGQKPYGRTQMGRDVRWTNFNNLTWKQKVKKHNYDVMLGHEVSF RSTEYLLGEAMDFPFDNLGNDYLGLGATPSKVESSYSEKMLLSFFARGNYNYDNRYLLTA 
TVRADGSTVFSNKNKWGFFPSFSAAWRVSEEAFMKDVDWVSNFKVRLGWGIVGNDRISNY LSMDLYEASKYGVGNNTVTVLTPKQLKNANLKWEGSSTINLGVDLGFLDNRLNVTADFFV KNTKDLLLAQSLAHVTGFDSQMQNIGKIQNKGIELSLNSTNIQTRDFSWQTNFNISFIKN TLKGLASGVESMYARSGFDSNFTAYDYIATVGQSLGLIYGYEFDGVYQSSDFYTTPDNQL ILKEGVTNNARYGTVKPGVVKYKDQDGDGIITTNDRTVIGNAMPKWFGGITNTFDYKGID FSFMLQFNYGNDIYNATRLYSTQSRNGRRNMLAEVADRWSPTNTSNLVPSQDGYIVNDVY SRFIEDGSFLRLKNVTLGYTLPHKWTRKFHVSRLRLYATGQNLFCVSGYSGYDPEVNSAS NTPMTPGLDWGAYPKSRVFTFGIDLQF >gi|226332183|gb|ACIC01000137.1| GENE 35 46504 - 48684 1363 726 aa, chain - ## HITS:1 COG:no KEGG:BT_4661 NR:ns ## KEGG: BT_4661 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 726 1 726 726 1383 99.0 0 MRSKLLNITSRLLGVLIVIAVTIVNSSCSDTETTDSTKFTIYYTGMTDIGPSMTGVISSP TYKGGTPYDFAITRITLDGEPFSDSIFAIDSETGKITLNSTSNTPVGLYKLSVACYSNNN RYEYTDIVEINMMKPVPDGIKTDPEKLQVEYADIIDTESSNELPTSQIRTEGNHISISNY TIASAMWNGVAVESPEDYFAVSDKGEISIIKGNQNIQPGKYILSFKLTTAATGEDPEKGI FENALEINVTSRPLSLIYTPDEGKIEEEGERSPETTFQSNIPALKGSAEGLVYSISSVSP NTDKITIDPTTGVLSVAAHHGFKDGEKYQISVKAINEFSPEGVVFENVFTLNTVEFIEPI ANFGYADVNDVQAVEIDINKNENFKGDEVKYEFVNLPTDLQGELALDLDGNIAIKKGNKI PVGQYTVQVMATNTKGSETATFTLTITANPNYFTYFRYGNNLGLTPIENYADQFRIEAGG KLNSVKPVPTATDAKDGLSSLKWEVELKHNPNNTKATINESTGQITITGLKQGQCGMVMV TATAGEGKTAVSVKQPVFFHFSMISDSNVQLEYTPFVFQVNPARGGESIAPSLGAGIDKS TFRLDYRRDFFYYNIAGPDSHISGALAQKVDNFLSEMWNSYDATAGTSRKPMSYFENTTN LSKALGYIDQTDFKVHINPNLWRNKDGYANGAMIGQITYDITGKDPQAATSGARVSPIFI WFDTKF >gi|226332183|gb|ACIC01000137.1| GENE 36 48703 - 50811 1365 702 aa, chain - ## HITS:1 COG:no KEGG:BT_4662 NR:ns ## KEGG: BT_4662 # Name: not_defined # Def: heparinase III protein, heparitin sulfate lyase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 702 1 702 702 1375 100.0 0 MKNIFFICFCALFAFSGCADDDDDLLTGGNVDIDLLPDAKPNDVVDPQVFEAINLNYPGL EKVKEFYEAGEHYYAANALLEYYRTRTNVTNPNLSLINVTISEAEQAKADYALVDYRFHV NNFYEDKETLKPYSVKQDGGINWEYSPKDASDEYQKQLHRHQWFIPQAKAYRVSGDEKYI QSWIEVYKNWIENNPKPTTGPNTTSWWQLQVSTRIGDQVQLLEYFKNSVNFTPEWLSTFL 
VEFAEQADFLVDYPYESGGNILISQANALATAGTLMPEFKNAEKWMNTGYQILSEEVQNQ IMSDGWHKEMSLHYHIGIVADFYEAMKLAEANQLSSKLPSDFTEPLRKAAEVVMYFTYPN YFIKGSDNVVPMFNDSWSRTRNVLKNTNFKQYVEMFPDSEELKYMQTAGNGGTAQGRTPN NDMKLFDQAGYYVLRNGWTPASTVMILSNNKSNDASNSLSAYSHNQPDNGTFELYHNGRN FFPDSGVCTYYTSGGDNDLRYWFRGIDKHNTLSIGKQNIKKAAGKLLKSEEGATELVVFE NQGYDNLKHRRAVFYVNKKFFVLVDEGIGNAEGTINLSFNLCEGTASEVVMDTDKNGVHT AFSNNNNIIVRTFANKAVTCSPFTGRIAYLVDGAYNTRQSYTIDMNKSADETARYITVIL PVNGSTDTSSISAKFIDSGYSENSASVEVSVNGETHTLSYTL >gi|226332183|gb|ACIC01000137.1| GENE 37 51238 - 55299 3338 1353 aa, chain - ## HITS:1 COG:CAC0323 KEGG:ns NR:ns ## COG: CAC0323 COG0642 # Protein_GI_number: 15893615 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 823 1051 376 616 654 136 34.0 3e-31 MSKLKKIVIYLFFLCLGMHSAFSETPEQITFSYISINEGLSQSTVFSIDQDKRGNMWFAT YDGVNKYDGYAFTVYQHNEDDPNSIANDISRIVKTDSQGRVWIGTRDGLSRYDEEKDIFQ NFFYEKNGKHLQVNGIEEISPEQLLISTPEGLIMFDIKESKFRDDSFSTAMHKTIASTLY RQDDQIYIGTSTDGLYTYSITQKTFEKVIPILGTKQIQAILQQSPTRIWVATEGAGLFLI NPKTKEIKNYLHSPSNPKSISSNYIRSLAMDSQNRLWIGTFNDLNIYHEGTDSFASYSSN PVENGSLSQRSVRSIFMDSQGGMWLGTYFGGLNYYHPIRNRFKNIRNIPYKNSLSDNVVS CIVEDKDKNLWIGTNDGGLNLYNPITQRFTSYTLQEDESARGIGSNNIKAVYVDEKKSLV YIGTHAGGLSILHRNSGQVENFNQRNSQLVNENVYAILPDGEGNLWLGTLSALVRFNPEQ RSFTTIEKEKDGTPVVSKQITTLFRDSHKRLWIGGEEGLSVFKQEGLDIQKASILPVSNV TKLFTNCIYEASNGVIWVGTREGFYCFNEKDKQIKRYNTTNGLPNNVVYGILEDSFGRLW LSTNRGISCFNPETEKFRNFTESDGLQSNQFNTASYCRTSVGQMYFGGINGITTFRPELL LDNPYTPPVVITKLQLFNKVVRPDDETGILTKNISETKSITLKSWQTAFSIEFVVSNYIS GQHNTFAYKLEGYDKEWYYLTDSRTVSYSNLPQGTYQFLVKAANSDGKWNPIPTALEIIV LPIWYKTWWALLIFFATFAGFITFVFRFFWMRKSMEAQLEIERRDKEHQEEINQMKMRFF INISHELRTPLTLILTPLQEIINKISDRWTRNQLEYIQRNANRLLHLVNQLMDYRRAELG VFELKAKKGNAHQLIQDNFLFYDKLARHKKITYTLHSELEDKEVLFDANYLELIVNNLLS NAFKYTESGQSITVTLKEENGWLLLQVSDTGIGIPINKQGKIFERFYQIESEHVGSGIGL SLVQRLIELHHGRIELDSEENKGSTFSVYLPQDLSVYKPSELASNNEQNEEEQVYSTNSK AMYFIDTEKVENESVESGDKKRGTILIVEDNNEIRRYLNNGLADLFNTLEAGNGEEALEK 
LKDNEVDVIVTDVMMPVMDGIKLCKNVKQNIRTCHIPVIILSAKTDIKDQMEGLQMGADD YIPKPFSLAILTTKIQNMMRTRRRMLDKYAKSLEVEPEKITFNAMDEALLKRAMAIVEKN MDNIEFSTDEFAREMNMSRSNLHLKLKAITGESTIDFIRKIRFNEAAKLLKDGRYTVAEV STMVGFNTPSYFATSFKKYFGCLPTEYIKKSKG >gi|226332183|gb|ACIC01000137.1| GENE 38 55470 - 56840 1413 456 aa, chain - ## HITS:1 COG:TM0539 KEGG:ns NR:ns ## COG: TM0539 COG1350 # Protein_GI_number: 15643305 # Func_class: R General function prediction only # Function: Predicted alternative tryptophan synthase beta-subunit (paralog of TrpB) # Organism: Thermotoga maritima # 10 426 7 421 422 487 58.0 1e-137 MSDKRKRYILPEEEIPHYWYNIQADMVNKPMPPLHPGTKQPLKAEDLYPIFAKELCHQEL NQTDAWIEIPEDVREMYKYYRSTPLVRAYGLEKALGTPAHIYFKNESVSPIGSHKLNSAL AQAYYCKEEGVTNVTTETGAGQWGAALSYAAKVFGLEAAVYQVKISYEQKPYRRSIMQTF GAQVTPSPSMSTRAGKDILTAHPTYQGSLGTAISEAIELAQMTPNCKYTLGSVLSHVTLH QTIIGLEAEKQMEMAGEYPDVVIGCFGGGSNFGGISFPFMRHNILEGKKTRFVAAEPASC PKLTRGKFQYDFGDEAGYTPLLPMFTLGHNFAPAHIHAGGLRYHGAGVIVSQLLKDNLME AVDIQQLESFEAGCLFAQAEGIIPAPESSHAIAAAIREAKACKETGEEKVILFNLSGHGL IDMASYDKYLAGDLVNYALTDEDIQKNLDEIGDLSK >gi|226332183|gb|ACIC01000137.1| GENE 39 56986 - 58815 1403 609 aa, chain + ## HITS:1 COG:BH0598 KEGG:ns NR:ns ## COG: BH0598 COG0168 # Protein_GI_number: 15613161 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Bacillus halodurans # 154 606 18 445 448 227 36.0 6e-59 MKIYHKFLLYQNKLLKPYVRILLGLVEALTYLASLLLIVGVVYEHGFPLSPGEVQQIQIL YKAVWIIFLIDVTLHISLEYRNTKKQYRRLAWILSVLLYLTLVPVIFHRPEEEGAILQVW EFLHGKFYHLILLLILSLLNLSNGLVRLLGRRTNPSLILAVSFMAIILIGTGLLMLPRCT VNGITWVDSLFTATSAVCVTGLVPVDVSTTFTTSGLVVIILLIQIGGLGVMTLTSFFAMF FMGNTSIYNQLVVRDMVSSNSLGSLLSTLLYILGFTLVIEGIGMVSIWFSIHGTLGMNLE EELAFAAFHSISAFCNAGFSTLSGNLGNSMVMTNHNWLFISVSLLIILGGIGFPILVNFK DIVLYHLRHFWTFVRTRKLERHKMQHLYNLNTKIVLIMTFLLLLIGTLAILAFEWNASFA GMPVADKWTQAFFNATCPRTAGFSSVDLASLSVQTLLVYLFLMWVGGGSQSTAGGVKVNA FAVVVLNLVAVLRGSERVEVFGRELSHDSIRRSNATVVMSLGVLFLFIFILSILEPKASL 
LALTFECVSALSTVGSSLNLTPQLCDASKLLVSLLMFIGRVGLITLMLGIVKQKKNTKYR YPSDNIIIN >gi|226332183|gb|ACIC01000137.1| GENE 40 58820 - 59506 655 228 aa, chain + ## HITS:1 COG:aq_1503 KEGG:ns NR:ns ## COG: aq_1503 COG0569 # Protein_GI_number: 15606658 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Aquifex aeolicus # 3 181 5 183 218 104 32.0 1e-22 MKYIIIGLGNYGHVLAEELSALGHEIIGADISESRVDSIKDKVATAFVIDATDEQSLSVL PLNSVDIVIVAIGENFGASIRVVALLKQKKVPRIFARAIDAVHKAVLEAFSLEKILTPEE DAARSLVQLLDFGATMEAFRIDQDYYVVKFTVPKKFVGYFVNELNMDEEFHLKLIGLKRA NKVTNCLGISLMELHVKNELPGDEKVEEGDELVCYGKYRDFQSFWKAI >gi|226332183|gb|ACIC01000137.1| GENE 41 60066 - 62507 2409 813 aa, chain - ## HITS:1 COG:SP0648_2 KEGG:ns NR:ns ## COG: SP0648_2 COG3250 # Protein_GI_number: 15900551 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pneumoniae TIGR4 # 28 798 59 871 871 444 33.0 1e-124 MKHQKQLFTVLLMSAACMLQAQRSETLLEKNWKFSKGDFPEAVQTDYNDTKWESVVIPHD WAIFGPFDMNNDLQNVAVTQNFEKKASLKTGRTGGLPYVGTGWYRTSFDVPADKEVTLLF DGAMSEARVYVNGQEACFWPFGYNSFHCNVTPFLNKDGKNNTLAVRLENRPQSSRWYPGA GLYRNVHLIVTEKVHIPVWGTQITTPHVSDEFAAVRLQTKIENASEKTEIRIETEILSPD GKVVTRKENKGRINHGQPFEQNFIVNAPQLWSPESPALYKAVSKVYADNQLVDTYSTRFG IRSIEYIADKGFYLNGKHRKFQGVCNHHDLGPLGAAINVAALRHQLTLLKDMGCDAIRTS HNMPAPELVELCDEMGFMMMIEPFDEWDIAKCENGYHRYFNEWAERDMVNMLHNYRNNPC VVMWSIGNEVPTQCSPVGYKVAKFLQDICHREDPTRPVTCGMDQVTCVLANGFAAMIDIP GLNYRANRYQEAYDKLPQNLILGSETASTVSSRGVYKFPVKDKKSAQYDDHQCSSYDVEA CSWSNIPDEDFALADDHHWTIGQFVWTGFDYLGEPSPYDTDAWPNHSSMFGIIDLASLPK DRYYLYRSIWNKEAETLHILPHWTWPGREGEVTPVFVYTNYPAAELFINGKSYGKQSKNN SSLKSRYRLMWMDAIYEPGEVKVVAYDKDGKAVAEKSVRTAGKPHHIELVSNRNELTADG KDLAYVTVKVVDKDGNLCPADSRLINFSVKGAGKYRAGANGDPTSLDLFHLPKMHAFNGM LTAIVQTNETAGEIILTAKAGGVKAGSIRLLTK >gi|226332183|gb|ACIC01000137.1| GENE 42 62524 - 63585 963 353 aa, chain - ## HITS:1 COG:CAC2570 KEGG:ns NR:ns ## COG: CAC2570 COG3867 # Protein_GI_number: 15895830 # Func_class: G 
Carbohydrate transport and metabolism # Function: Arabinogalactan endo-1,4-beta-galactosidase # Organism: Clostridium acetobutylicum # 44 336 34 340 360 227 39.0 2e-59 MKIASILSIVALIASMATNGCSEDGPVTNPRQEEPQKVVKEEGFARGADVSWLTQMEAEG LKFYTPDENRQEMECMELLRDYCGVNSIRLRVWVNPKDGWNNMNDVIVKAKRAERLGLRT MIDFHFSDTWADPGHQEMPEAWKELSFDDLKIALGEHVKSVLTALKAVGVTPEWVQVGNE TTPGMMLPVGSVDNPEQLTALNNVGYDAVKAICPDAKVIVHLDAGNDQWVYNRMFDILQA NGGKYDMIGMSLYPYWAEQEGKTGGWLKVADDCIANIKHVKQKYNKPVMICEIGMPYDQA EACKQLITKMMQADVEGIFYWEPQAPNGYNDGYNLGCFDNNAPTIALDAFKIQ >gi|226332183|gb|ACIC01000137.1| GENE 43 63596 - 65308 1064 570 aa, chain - ## HITS:1 COG:no KEGG:BT_4669 NR:ns ## KEGG: BT_4669 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 570 1 570 570 1110 98.0 0 MKVFSTIVKLILLSAVMTVATTSCDSDIDPVYVLSGDSMELMGGSNEIILTPDNPQALAL TVYWSGDGRLALSDTLLQAPVNVAEITIQFSKDEHFTAPLNIAVDKNIHSRQFLCEELNS LLGRLGYEANEKAPLWIRIRSVLAANIAPEYSNVMEVLVQSYRIELVLAQVLDKDWKNTS MMLASPAEDGIYKGFMGVNGWENWWLREANNVMWGNLGEEGKTFYASSEDSHWNFWFPNP SGCYYTTVNTVEGWWSALHVASLSVSGDVTGEMAYNKQSNQWTLPVNLATQSTLTISVSG KGSLYNRETTDIGPAIEQTVAFGGNSQNLLFGESADNISVTMPAGENLLVLDLSNPLQFT IGIGEDVPQPEVSQYLYFSGITNWEGFDDYLTLYDEAGLCYGGAHWIDSEWGYRAYTEQA WEPAYNAADGSTDLSGTLILAESKDNIPAPAQGLYVMDFNMKSLTYELTQVQTVTFTGLN DNWSESPMTQSAENPEIFTTEFIKEKKSPWGVKVLINNNWNLFFGGGSGILHLKHSESSA GFDGDDELEIGKTYVLTVDLGHQTYSYSLK >gi|226332183|gb|ACIC01000137.1| GENE 44 65320 - 66933 821 537 aa, chain - ## HITS:1 COG:no KEGG:BT_4670 NR:ns ## KEGG: BT_4670 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 537 1 524 524 1022 98.0 0 MKKINILTIIVWCMTLVASSCTDVLDQMPTTEDTSKDVYAEAANYKRVLAKIYASYVLAG QGQGGDNGDLTSINKQDFSRGYFNLQEAATDEVANTWLSGNNLSDLTYISWDAKDSWVSD SYYWLFFNIALCNEFLRHCSDEEIARFSEAEQTEIRTFRSEARFMRSFSYWMVLDLYRQG PKVDENTPVIGYIPEAYDGIGMFDFIESELKELVAEGVSSALPATNEYGRASKPVAWALL ARLYLNSAVYRGNVDTKYYTECITACKQVISNPAFYLEPEYAKLFNADNHKRTHEILFAM VCDATTSVTWGGGTYIICGSCGNSSTQDPAKYGLSNGWGMFRARGEMTAKFGDIATTKDS 
RAMFYTDGQAQFFTGAIDNQAEGYFFEKFSNLTDEGVAASNSAATGCSTDYPLLRLADIY LMAAEAQLRGGTGLSRTEALELVNKVRLRAYNQDESGKISDSDFTLDFILDERARELYLE SVRRTDLIRFNKFVSADYIWQWKGAVLDGRAVNKKYNIYPIPDTDLTANPNLHNPLY >gi|226332183|gb|ACIC01000137.1| GENE 45 66945 - 69923 1909 992 aa, chain - ## HITS:1 COG:no KEGG:BT_4671 NR:ns ## KEGG: BT_4671 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 992 1 992 992 1937 99.0 0 MNCNILNLRHLTVLLILCTAFLGGAIPAFAQQGDKKMTGQVIDENKEPMIGVSILIVGTS TGTVTDFDGNYTLNVPKDSKELQFSYVGYETKVITIPVNSNVLNVQMKSDSQVLSDVVII GYGTQRKSDLTGSVASVGTKDFNKGMVSSPEELVNGKIAGVQIVNGGGSPTSVSTIRIRG GASLNASNDPLIVLDGVPMEVGGSISGGGNFLSLINPNDIESMTVLKDASSTAIYGSRAS NGVIIITTKKGSGSDIKVSFQTTNSIATKTKTSDMLNTDEFINIVNQYGTEHQKSLLGDY RTNWNDEIYQTAFGTDNNLSVSGLALPWLPFRVSTGMYYQDGILKTDNTKRFTGNVNLTP SFFHNELRFNIGLKGTYSKNRFADTDAIWAGSTLNPTIPVYSGNDTFGGYNEAIDANGVP VTGALANAVGRLNQYDSTSDVYRFIGSASVDWNVKWVKGLRLHTTGGYDWSKGKGHIYVP KEAVSYYTTGGRDYTYGPQKNYNKLLTIYANYHNDFDAIHSGIDVTAGYDYQFWKYTTPF YAILSADGVQQSTSAATDQRHSLMSYYGRLNYTFMDRYLLTATMRRDGSSRFASDNRWGT FPSVALAWRVSQEHFFEPLRTVMNDVKLRVSYGITGQQDGITNYGYIPVYTPGLDGAQYL FGGNPIYTYRPEAYNPELKWETTKSWNYGIDLAFLENRFTFSADFYTRKTENLLATVPMP AGTNFDKLMLQNVGNVDSKGLELSVTGHIINTKNWSWTASANAAWQKVRIKNLTLTPGAP SPDTEVGPWIDAYQMQVFSTDYAPYSFYLYKQLYDAETGQPIEGLYADLDGDGEITNKDR YHHHSPAPDWILGFSTSLRYKKWTLSTSLRANIGGYIFNGMAMNTGAWETMSYNDYQLNN LNRSFLDTRFTKRQFLSDHYLENASFLKMDNLQLSYDFGRIYKTIGLHASAMVQNVFTVT KYKGVDPETANGVDTSVYPRPRTYSITIGLNF >gi|226332183|gb|ACIC01000137.1| GENE 46 69926 - 70150 184 74 aa, chain - ## HITS:1 COG:no KEGG:BT_4672 NR:ns ## KEGG: BT_4672 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 74 1 74 74 144 100.0 8e-34 MKRQFILTCICPLFAFVSTQGKTTSTPFAHAYRALGYLRADRKAAIDKDVYHFSRPNDTI NRYFNKINLLPITQ >gi|226332183|gb|ACIC01000137.1| GENE 47 70266 - 73502 2214 1078 aa, chain - ## HITS:1 COG:MA3405_3 KEGG:ns NR:ns ## COG: MA3405_3 COG0642 # Protein_GI_number: 20092217 # Func_class: T Signal transduction 
mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 718 940 16 242 256 119 36.0 5e-26 MGTDNGILVYNYRADRYEQPETDFPTDVRTMALQGDTLWLGALNGLYTYQLQSRKLTSFD TRRNGLPHNTIYSIIRTRDNQIYVGTYNGLCRYIPSNGKFEGIPLPVHSSQSNLFVNSLL EDTVRQCIWIGTEGYLFQYFPSTGQMKQTEAFHNNSIKSLALDGNGDLLAGTDNGLYVYH NDTTPLQHIIHDSRNIQSLTNNIIWNIFADQEHNIWLGTDYGISLSRYNSALQFIPISQI TGTGDGNQFYSLFRDSKGFYWFGGTNGLIRFTDPAGERHDTIWYRMGDKTYPLSHNRIRH IYEDKEQQLWIATDGSINRYDYATRQFIHYNIVDSTGMYNTNWTYYMFEDTAGQLWISTC LGGIFVVDKHKLMQSASGQYVAEQNYSVHNGLSGMFINQIIPDNEGNVWVLLYNNKGIDK INPRTREVTKLFADELSGEKSPNYLLCDEDGMLWVGFHGGVMRINPKDESQQSISFGSFS NNEILSMTSVKNSIWISTTNGLWIIDRKTMDARQQNMTDKRFTSLLFDPKEDCIYLGGAD GFGISHPDIQAMHQPERPILLTALYINNQLVSPRTRDDVPNIRYTNSIELKYDQNNLSFE LSDLPYSLDEKNKFVYRLEGMDKEWNFLKSNINRITYSNLSYGNYQLIISKLERDGQPSD HPYILNIRILPPWYYTIWAKAIYILLLLSLIAWTINFFRVKNRLKAERREKEKILEQSRQ KMAFFTNLSNELKTPLSRIIAPISQLLPSTEEVHEKQTLEEVQRNAMKINSLIHQVLNFN RIEDNKDSLLILSRIELVSFSRSLFSVYEEDKRFTFHFEANKAKIYADMDAIKLGVILDN LLSNAVKFTSEGGNIRLSLFYRQETGLLEICVSDTGAGIPQQDIPYIFQRFFQSPHSGSK EGTGIGLYLVKTYTELHGGSIDGVTSEKGKGTSIGLSIPVIAVEEDEIPVIASKKQLESL PILKPIEAESQDEKFLSNIIRLIEDHLSDADLNVNALCELSGISNKQIYRKIKQLTGMSP VEYIKSIRMKKAAMLLQQKKFTVAEVMYMVGFSNHSYFSKCFQAEFGKTPRQYLNDGL >gi|226332183|gb|ACIC01000137.1| GENE 48 73944 - 74495 484 183 aa, chain + ## HITS:1 COG:no KEGG:BT_4674 NR:ns ## KEGG: BT_4674 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 183 19 183 183 310 99.0 2e-83 MKKLIYLLLLPLALAVVACGGKNGSSNKESVLARQDSVDAHGLQRMQSSKAETDIKFKGR DYHSLVSRTPDESLPHVSNDMGDTYVDNKIVLRITRGSENVLNKTFTKNDFSSVVDAKFL SKSILEGIVYDKTTPEGIVYAASVCYPQTDLYVPLSITVTADGKMSIKKVDMLEDDYTEE TPE >gi|226332183|gb|ACIC01000137.1| GENE 49 74631 - 75809 1113 392 aa, chain + ## HITS:1 COG:no KEGG:BT_4675 NR:ns ## KEGG: BT_4675 # Name: not_defined # Def: heparin lyase I precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 392 1 376 376 770 98.0 0 
MKKYILAIYMMMAGCAMLTAQTKNTQTLVPLTERVNVQADSARINQIIDGCWVAVGTNKP HAIQRDFTNLFDGKPSYRFELKTEDNTLEGYAKGETKGRAEFSYCYATSDDFKGLPADVY QKAQITKTVYHHGKGACPQGSSRDYEFSVYIPSSLDSNVSTIFAQWHGMPDRTLVQTPQG EVKKLTVDEFVELEKTTFFKKNAGHEKVVRLDKQGNPMKDKNGKPVYKAGKLNGWLVEQG GYPPLAFGFSGGLFYIKANSDRKWLTDKDDRCNANPGKTPVMKPLTSEYKASTIAYKLPF ADFPKDCWITFRVHIDWTVYGKEAETIVKPGMLDVRMDYQEQGKKVSKHIVDNEKILIGR NDEDGYYFKFGIYRVGDSTVPVCYNLAGYSER >gi|226332183|gb|ACIC01000137.1| GENE 50 76031 - 76474 581 147 aa, chain + ## HITS:1 COG:no KEGG:BT_4676 NR:ns ## KEGG: BT_4676 # Name: not_defined # Def: putative periplasmic protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 147 1 147 147 278 98.0 5e-74 MKKILSLLVMAIVAIQFSFAGDVITKDMNQLPLPARNFINRHFTKPQVAHIKIDKDLLES AKYEVLLTDGTEIDFDSKGNWEEVSAGKGHAVPASVVPSFAASYLKEHNFTEPVTKVERD RKGYEVELSTGVSFKFDKKGKFLKADD >gi|226332183|gb|ACIC01000137.1| GENE 51 76520 - 77020 511 166 aa, chain + ## HITS:1 COG:no KEGG:BT_4677 NR:ns ## KEGG: BT_4677 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 166 1 166 166 322 98.0 3e-87 MLKFKFGAWAVVLMLTALSFSACDDNDDETYNPPANITEALKQLYPNAQNVEWEMKGEYY VADCWVTGDELDVWFDANANWVMTENELDSIDQLVPAVYTGFRNSNYSSWVVTDVFVLTY PQHPTESVIQVKQGNLRFALYFSQEGGLLHERDITNGDDTNWPVME >gi|226332183|gb|ACIC01000137.1| GENE 52 77160 - 78281 1007 373 aa, chain - ## HITS:1 COG:FN1106 KEGG:ns NR:ns ## COG: FN1106 COG1760 # Protein_GI_number: 19704441 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Fusobacterium nucleatum # 1 373 1 373 408 401 52.0 1e-111 MKSIKELYRIGTGPSSSHTMGPRKAAEMFVERHPDATSFKVTLYGSLAATGKGHMTDVAI IDTLQPVAPVEIVWQPKVFLPFHPNGMTFAALDASNKILENWTVYSIGGGALAENNDNPT IESPEVYGMNNMTEILQWCERTGKSYWEYVKECESEDIWDYLAEVWDTMKDAIHRGLEAE GVLPGPLNLRRKASTYYIRATGYKQSLQSRGLVFSYALAVSEENASGGKIVTAPTCGSCG VMPAVLYHLQKSRDFSDMRILRALATAGLIGNIVKFNASISGAEVGCQGEVGVACAMASA AANQLFGGSPAQIEYAAEMGLEHHLGMTCDPVCGLVQIPCIERNAYAAARALDANLYSAF TDGMHRVSFDKVV >gi|226332183|gb|ACIC01000137.1| GENE 53 78299 - 
79351 836 350 aa, chain - ## HITS:1 COG:CAC1852 KEGG:ns NR:ns ## COG: CAC1852 COG0598 # Protein_GI_number: 15895127 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Clostridium acetobutylicum # 2 349 11 353 354 222 37.0 1e-57 MKNNLLSEKLIYTGDSLTPTHLHLCTYNATEMQESSGDTFQSVKETLDNERINWLQVHGL KDTETIREICSHFEIDFLVLQDILNANHPTKIEEHDKYIVLILKIFYPNEHKEDDDLDGL LQQQVCIILGNNYVLTFLEKETDFFDEINAALRNDILKIRSRLTDYLLSVLLNSIMANYI STISSIDDALEDLEEELLMITNENDIGIQIQGLRRQYMLMKKAILPLKEQYVKLLRAENL LIHKVNRAFYNDVNDHLQFVLQTIEICRETLSSLVDLYISNNDLRMNDIMKRLTIVSTIF IPLTFLVGVWGMNYKWMPELDWRYGYLFAWIIMAVIGIIVYLYFRKKKWY >gi|226332183|gb|ACIC01000137.1| GENE 54 79440 - 79832 327 130 aa, chain - ## HITS:1 COG:Cj1052c KEGG:ns NR:ns ## COG: Cj1052c COG1193 # Protein_GI_number: 15792379 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Campylobacter jejuni # 6 127 623 734 736 71 35.0 4e-13 MQPSLSEAIKTTVKLDRLERSNAAPKTEGIAKSTFVSSQTHDQMYEKKLNFKQDIDVRGM RGDEALQAITYFVDDAILVGMDRVRILHGTGTGILRTLIRQYLSTVPGVSHYADEHVQFG GAGITVVDFD >gi|226332183|gb|ACIC01000137.1| GENE 55 79804 - 81942 2075 712 aa, chain - ## HITS:1 COG:FN1581 KEGG:ns NR:ns ## COG: FN1581 COG1193 # Protein_GI_number: 19704902 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Fusobacterium nucleatum # 13 653 11 632 778 315 32.0 1e-85 MIYPQNFEQKIGFDQIRQLLKDKCLSTLGEERVSDMSFSEQFEEVEEKLNQVTEFVRIIQ EEDGFPDQFFFDVRPSLKRVRIEGMYLDEQELFDLRRSLETIRDIVRFLHRNEEEEENDV PYPSLKRLAGDIAVFPQLIGKIDGILNKYGKIKDNASSELARIRRELSNTMGSISRSLNS ILRNAQSEGYVDKDVAPTMRDGRLVIPVAPGLKRKIKGIVHDESASGKTVFIEPAEVVEA NNRIRELEGDERREIIRILTEFSTILRPSIPDILQSYEFLAEIDFIRAKSYFAIQTNSLK PTVEKEQLLDWTMAVHPLLQLSLAKHGKKVVPLDIELNQKQRILIISGPNAGGKSVCLKT VGLLQYMLQCGMLIPLHERSHAGMFSSIFIDIGDEQSIEDDLSTYSSHLTNMKIMMKSCN ERSLILIDEFGGGTEPQIGGAIAEAVLKRFNQKGTFGVITTHYQNLKHFAEDHEGVVNGA MLYDRHLMQALFQLQIGNPGSSFAVEIARKIGLPEDVIADASEIVGSEYINADKYLQDIV 
RDKRYWEGKRQTIRQREKHMEDTIARYQTEMEELQKSRKEIIRQAKEEAERMLQESNARI ENTIRTIKEAQAEKEKTRLVRQELNDFRTSLETMTSKEQEEKIARKMEKLKEKQERKKNK KNEPKTAASQTAATPKVMPITVGENVKIKGQTSVGQVMEINGKNATVAFGSN >gi|226332183|gb|ACIC01000137.1| GENE 56 82487 - 83692 344 401 aa, chain + ## HITS:1 COG:RSp0310 KEGG:ns NR:ns ## COG: RSp0310 COG0477 # Protein_GI_number: 17548531 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Ralstonia solanacearum # 4 398 45 439 450 136 30.0 6e-32 MNHWKSTLAVIGIGQLISILTSTIVGFSIIFWISNEFKSPTALSLAILAGFLPQFVLGLF TGVYVDRWNRKKTMFYSDLFIAFCTLCLFIVITKGYKDLSFFYLLTACRSIGSTFHAPAL QASIPLLVPKHHLVRVSGLYHSIQSFSEVIAPVVGASLVVWLPIQYILLIDVIGAVAACL TLLCVQIPSLQKTKVLPDFKKELTECWHTLRRTMGILPLFVCFTLVTFVLMPVFTLFPFM TLLHFNGNILQMGVVEMGWGSGALLGGLVLACKALKSKQTLVMHTAYVILGLYLISASYL PSSAFIGFVCLTFTGGIAYSIYHALFIAIIQQNLASDMLGRTFSLIFSLSTFPSMLGIVA SGYWVEAWGITSVFMISGWVIFLIGVGANFISSIKQLDNYA >gi|226332183|gb|ACIC01000137.1| GENE 57 83717 - 84229 363 170 aa, chain + ## HITS:1 COG:no KEGG:Cfla_0433 NR:ns ## KEGG: Cfla_0433 # Name: not_defined # Def: aminoglycoside-2''-adenylyltransferase # Organism: C.flavigena # Pathway: not_defined # 8 151 2 143 157 120 40.0 2e-26 MTKKEHTTITELFQVLDLLESLDMQFWLDGGWGVDVLYGQQTRLHRDIDIDFDANYTDQL LDLLQERGYQIETNWLPTRVELYSKELGYIDIHPFVLNADGTSKQADLDGGWYEFQPDYF GTAVFEGRSIPCISAKGQQVFHSGYDLREKDIHDLSIIKQCITTMSLTIR >gi|226332183|gb|ACIC01000137.1| GENE 58 84215 - 84445 120 76 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPQYDCCAVTQKCCPLMGLLNQHPPPIPLRWNSGKTANGATLRDVTDWVSLSIFTSVSKI YQTICFSVIFLFLSNC >gi|226332183|gb|ACIC01000137.1| GENE 59 84413 - 84604 86 63 aa, chain + ## HITS:1 COG:lin1847 KEGG:ns NR:ns ## COG: lin1847 COG3153 # Protein_GI_number: 16800914 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Listeria innocua # 2 62 108 168 172 58 44.0 4e-09 
MGYGTAVVLGHKEYYPRFGYRKAIDLGIEFPFEVSHEYCMVAELIPGATENVKGMVCYPT DFK >gi|226332183|gb|ACIC01000137.1| GENE 60 84676 - 85029 280 117 aa, chain - ## HITS:1 COG:no KEGG:Desal_1939 NR:ns ## KEGG: Desal_1939 # Name: not_defined # Def: protein of unknown function DUF1486 # Organism: D.salexigens # Pathway: not_defined # 1 116 4 119 122 178 65.0 5e-44 MKPKEVLEKWIDCFNKADAYHIAELYATNAVNHQVANEPIIGKESIYKMFVNEFATAKMV CMVENIFEDGEWAIMEWKDSLGLRGCGFFHVKDNKIVFQRGYWDKLSFLKQHNLPIE >gi|226332183|gb|ACIC01000137.1| GENE 61 85118 - 86521 937 467 aa, chain - ## HITS:1 COG:no KEGG:BVU_1439 NR:ns ## KEGG: BVU_1439 # Name: not_defined # Def: mobilization protein # Organism: B.vulgatus # Pathway: not_defined # 1 467 1 467 467 735 85.0 0 MATKSSIHIKPCNIASSEAHNRRTAEYMRNIGESRIYVIPELSTDNEQWINPDFGTPELR THYDNIKQMVKEKTERAMQEKERERKGKNGKIIKVAGCSPIREGVLLIRPDTTLADVRKF GEECQRRWGITPLQIFLHKDEGHWLNGQPEAEDKESFQVGNRWFKPNYHAHVVFDWMNHE TGKSRKLNDEDMATMQTLASDILLMERGQAKAVTGKEHLERNDFIIEKQKAELQRIEETK RHKEQQVSLAEQELKQVKAEIRTDKLKKTATNAATAIASGVGSLFGSGKLKELEHHIEQL HQEITKRDKATDELKIQIQQIQEQHSRQIRNLQRIHNQELEAKDKEISRLSTLLEKAHKW FPMFKEMLRMEKLCAVIGFTKDMIENLLTKKESIQCSGKIYSEEHRWKFDIKNDIFRVEK HPMDSGKLMLTINRQPIVQWFKEQWEKLQQNLRNSVQREQKSRGFRI >gi|226332183|gb|ACIC01000137.1| GENE 62 86733 - 87689 653 318 aa, chain - ## HITS:1 COG:no KEGG:BVU_1440 NR:ns ## KEGG: BVU_1440 # Name: not_defined # Def: DNA primase # Organism: B.vulgatus # Pathway: not_defined # 1 318 1 318 318 594 91.0 1e-168 MDIQTAKQIRIADYLHSLGYSPVKQQGVNLWYKSPLREEAEASFKVNTEREQWYDFGLGK GGGIIELAAHLYATDHVPYILKRIAEQTPHVRPVSFSFGKQNSSEPSFQQLEIVPLSSTA LLAYLQGRGINIKLAKRECSEARFTHNGKRYFAIAFPNGSCGYEIRNRYFKGCIAPKEIS HIRQSGKARNTCYVFEGFMDYLSFLTLRQESCPNYPELDGQDYIVLNSVSNVNKALYPLG NYERIHCFFDNDHAGLEALRQIRMEYGRERYIRDASQIYRGCKDLNEYLQKQIERKRQLQ SAKGVRSQSPEKKNGFQL >gi|226332183|gb|ACIC01000137.1| GENE 63 87902 - 88978 682 358 aa, chain - ## HITS:1 COG:no KEGG:BVU_2466 NR:ns ## KEGG: BVU_2466 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 358 1 
357 357 517 74.0 1e-145 MDYMKETKEISAEEAIILWQASRLSLSKSYEKAPEILKVHDSVIGTLGNFSASIGKAKSK KTFNVSAIVAAALKNGTVLRYVAELPEDKRKVLYVDTEQSPYHCLKVMTRILRMAGLPDD RDNENLEFLALRKYTPEQRIRIVEQAIYNTPEIGLVIIDGIRDMVYDINSPSESTRIISK LMQWTDDRQIHIHTILHQNKGDENARGHIGTELNNKAETVLQVEKDKGNGDISHVSAIHI RAMDFEPFAFRINDKALPELIEGYKPETKKPGRPEEEKFDPYRHITEQQHRIALEAVFGL KEEYGYKELEDALIKTYMSVGVKLNHKKAVSLITMLRNKRMIVQETGRKYTFMPDFHY >gi|226332183|gb|ACIC01000137.1| GENE 64 88993 - 89301 248 102 aa, chain - ## HITS:1 COG:no KEGG:BVU_2467 NR:ns ## KEGG: BVU_2467 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 4 102 3 102 102 148 74.0 6e-35 MKNLQELLSKPVWQMTGEEFIFLSKHASRQTETQPQPITDTERKYVYGILGIAKLFGCSL PTANRIKKSGKIDKAITQIGRKIIVDVELALELAGKKTGGRK >gi|226332183|gb|ACIC01000137.1| GENE 65 89526 - 90569 775 347 aa, chain - ## HITS:1 COG:no KEGG:BVU_2468 NR:ns ## KEGG: BVU_2468 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 347 1 338 338 258 45.0 3e-67 MKDVITTLIPRYGELNRIYKDWFDNKGFSFEKQKFITKFYLDYNDVTVLETAILELVFHL PQEQYTLILNSLKKEVCENISYYKKGCMPDEQTVYNICFRVSEIYKEAIEEQQYKTTKLH APLNEAYHRYDSIGYREHTAEDEKRAEMEYERCKDEYEEKKNKLTELYDLQKQTREKALQ YAKSRFNEIYRLGCHLKETLAKYVFDETNEPEEPAEQEPNAPNIQEEVLGTKQTEYFNME LLSLIHVTCVGEQFENISEHDFYTCMNLLPSDTKPQIRPREKIRTCYLIFLMSEKLPKQD RENWKRNILKILDIEESYYKSKYKEPVSDFPSDSNRKFAKEMDGIFR >gi|226332183|gb|ACIC01000137.1| GENE 66 90578 - 91873 294 431 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 [Haemophilus parasuis 29755] # 185 419 101 327 339 117 30 2e-25 MSNKESNIEKTNFDTFIQAVGSEIEQAQVRLIVAANAQMLFHYWKIGNYILYHQQLHGWG SKIIKQLAKAIRLHYPEKKGYSERNLTYMCQFAKAYPLRALQSFIETDASLSVPTIQSVT NEVLKLNDKQFTQELTAQIQSIDCQSLAITQEVPAQFEDVGKTVSAIYRMGIREIEEVFL TSPIAKINWTSQMTILDSSLPLGLSYWYMKQSVEIGWSSNVLKMQIDSDLYSRQISNNKV NNFTTTLPAPQSDLANYLLKDPYIFDLAGAKEKADERDIEEQLVKHVTRYLLEMGNGFAF VARQKHFQVGNSDFFADLILYSIPLHAYIVVELKATPFKPEYAGQLNFYINVVDDKLRGE 
NDNKTIGLLLCKGKDEVVAQYALTGYDQPIGISDYQLSKAIPEKLKSALPSVEEVEEELA SFLDKDNNTQK >gi|226332183|gb|ACIC01000137.1| GENE 67 91937 - 93220 1142 427 aa, chain - ## HITS:1 COG:no KEGG:BVU_2469 NR:ns ## KEGG: BVU_2469 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: B.vulgatus # Pathway: not_defined # 1 427 1 430 430 696 80.0 0 MNIKRNIIFALESRKKNGVPIVENVPIRMRVIFASQRIEFTTGYRIDVAKWDADKQRVKN GCTNKLKQSAAEINTDLLKYYAEIQNIFKEFEVQEVMPTTQQLKEAFNMRMKDTSEEQPE EAPVSFWEVFDEFVKECGNQNNWTASTYEKFAAVRNHLKEFKEDATFNYFNEFGLNEYVN FLRDTKDMRNSTIGKQMGFLKWFLRWSFKKGHHQNIAYDTFKPKLKTTSKKVIFLTWDEL NKLKDYQIPKDKQYLERVRDVFLFCCFTSLRYSDVRNLKRSDVKSDHIEITTVKTADSLT IELNKYSKAILDKYKDIHFENYMALPVISNQKMNDYLKELGELAEINEPVRETYYKGNER IDEVTPKYALLSTHAGRRTFICNALALGIPAQVVMKWTGHSDYKAMKPYIDIADDIKANA MNKFNQL >gi|226332183|gb|ACIC01000137.1| GENE 68 93666 - 94904 1061 412 aa, chain + ## HITS:1 COG:no KEGG:BF1219 NR:ns ## KEGG: BF1219 # Name: not_defined # Def: putative transposase # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 412 1 412 412 785 99.0 0 MKSTFNVLFFVKKDKQKINGSYPIFVRITIDGVASRFNSKLDVQPKLWDGKAGKAAGRSA EAIRINRMLDDINASLNTIYHEMQRRDNYVTAEKVKNEFLGHSENHDTILNLFQKHNDDV KQLVGISKTIATYRKYEVTRRHLAEFIQSKYNLSDISIKEITPMFITDFELYLRTTCKCG FNTTAKFMQFFKRIILIARNNGILIGDPFANYKIRLEKVDRGYLTEDEIKIILKKKMVSE RLEQVRDVFIFSCFSGLAYVDVANLKEDNIRKSFDGNLWIITKRQKTNIDVNVPLLDIPK MILEKYKGKLPNGKVLPIISNQKLNAYLKEIAEVCGIKKNLTFHLARHTFATTTTLAKGV PIETVSKMLGHTNIETTQIYARITNNKISNDMQGLDKKFVGIEKIYKEVSIK >gi|226332183|gb|ACIC01000137.1| GENE 69 94979 - 95521 616 180 aa, chain + ## HITS:1 COG:no KEGG:BVU_2471 NR:ns ## KEGG: BVU_2471 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 180 1 180 180 318 98.0 8e-86 MDLQIIQSKIYEIRGCRVMLDSDLAALYQVETKALKQAVKRNIERFPEDFMFELTKEEVE CLRSQIVTLKNNPDETEEETSSKRGKHTKYLPYVFTQEGVAALSGVLRSPIAIQVNISIM RAFVALRQMITGYQELLKRIEELEESTDAQFSEVYQALTRLLSKPEPKPRKPIGYRTYDE >gi|226332183|gb|ACIC01000137.1| GENE 70 95781 - 96233 437 150 aa, chain + ## HITS:1 COG:aq_1610 
KEGG:ns NR:ns ## COG: aq_1610 COG2003 # Protein_GI_number: 15606726 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Aquifex aeolicus # 24 150 108 231 231 87 34.0 1e-17 MKTTEFTMPEITISYKDNVKASERVKILSSETSYSYLKPFYSECMEHHEESYVMFLNRAN KALGVSLISKGGMAETVMDVKIILQTALKVHASGIILSHNHPSGNLRPSEPDKQITSKIK EACKVLDLHLLDHIILTEESYYSFADEGLI >gi|226332183|gb|ACIC01000137.1| GENE 71 96642 - 97175 569 177 aa, chain + ## HITS:1 COG:YPMT1.61c KEGG:ns NR:ns ## COG: YPMT1.61c COG4734 # Protein_GI_number: 16082851 # Func_class: R General function prediction only # Function: Antirestriction protein # Organism: Yersinia pestis # 5 174 4 166 168 110 42.0 1e-24 MEAVTLSEARVYVGTYNKYNNGSLFGKWLDLSDYSDMDEFLEACRELHKDDQDPEFMFQD YENIPEALISESWLSEKFFELRDAIEKLSETQQEAFFVWCDHHNSDISEEDADDLISSFE DEYQGEYKDEEDYAYEIVEQCYDLPEFAKTYFDYSAFARDLFITDYWMDNGFVFRCA >gi|226332183|gb|ACIC01000137.1| GENE 72 97219 - 97431 126 70 aa, chain + ## HITS:1 COG:no KEGG:BF1223 NR:ns ## KEGG: BF1223 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 70 1 70 70 71 92.0 1e-11 MKVNNKIATFLRWFTFAFIIYQAATSTAGMWIALIIGFFIIRFVLRLFISAVYLFCMAII FIVLLSFLII >gi|226332183|gb|ACIC01000137.1| GENE 73 97456 - 97671 304 71 aa, chain + ## HITS:1 COG:no KEGG:BF1224 NR:ns ## KEGG: BF1224 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 71 1 71 71 125 94.0 7e-28 METLNKNGVSITQTPGEEKYVKCCLGAFRGQIYFQYDYRHTDMELFSTVAKTLDECRKQR DEWIAKKEKKQ >gi|226332183|gb|ACIC01000137.1| GENE 74 97692 - 98171 395 159 aa, chain + ## HITS:1 COG:no KEGG:BF1225 NR:ns ## KEGG: BF1225 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 159 1 159 159 291 91.0 7e-78 MKTTEVNKKIIGRRCRCIFTGLLVTGVIEDTTEDKYTVSVKVRFDTPHQWGDELYSYDWS FGRKADDFGSLKYLELLPDKTTFDAMIVTFGKPIDTLNDIFEDVKAWGVCSLKGWIDSYE STRFTPIDVDKAVITSEYNMECVKEWLEHNTPVKNIIIG >gi|226332183|gb|ACIC01000137.1| GENE 75 98473 - 
98790 305 105 aa, chain - ## HITS:1 COG:no KEGG:BF1231 NR:ns ## KEGG: BF1231 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 104 1 104 105 161 85.0 9e-39 MVKIELDINGISFFVSTTWKTDAVPAVGDIVIVDKESISPSDRIALRKTPSNQAFKWADE EDDSPALNYFDYDTEMVVKKRTWKFDSEDEEMVCILKVIFLLHEG >gi|226332183|gb|ACIC01000137.1| GENE 76 98804 - 99253 263 149 aa, chain - ## HITS:1 COG:no KEGG:BT_0084 NR:ns ## KEGG: BT_0084 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 147 1 147 147 282 93.0 2e-75 MYMRKILSRILMSCYIMAAILFVGACNDDVDIQQSYPFSIETMPVPKKLKVGETAEIRCQ LHRDGRYEETKYFIRYFQPDGAGTLKMSDGTVLLPNDLYPLPGETFRLYYTSASTDQQMV DVYFQDSFGQLEQLTFSFNNDNSKEEENN >gi|226332183|gb|ACIC01000137.1| GENE 77 99275 - 99853 395 192 aa, chain - ## HITS:1 COG:no KEGG:BT_0085 NR:ns ## KEGG: BT_0085 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 192 1 192 192 381 96.0 1e-105 MKRICCMMFLFVLCLTFNQAHAQRCLPGMKGLQVTGGMADGVHWSSKSDFAYYFGAALST YTKNGNRWVIGGEYLEKHYPYKDLQIPVSQFTGEGGYYMNFLSDRKKTFFLSLGLSALAG YETSNWGDKLLPDGSTLTDKDGFVYGGALTLELESYITDRVVFLINARERCLFGSSVGKF HTQFGIGLKIIM >gi|226332183|gb|ACIC01000137.1| GENE 78 99856 - 100764 572 302 aa, chain - ## HITS:1 COG:no KEGG:BT_0086 NR:ns ## KEGG: BT_0086 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 302 1 302 302 580 97.0 1e-164 MKKLMILFALIMGVVSVKSQSNDLYQGITKKLPYRQMVTPYGVQVTFAKTVHLIFPSAVK YVDLGSNYIIAGKADGAENVVRVKATTEGFPGETNFSVICEDGSFYSFNAKYAHEPEMLN IEMKDFLENEDTTDFSHTRMNIYFRELGNESPLLVKLIMQSIYKNNDREIKHLGCKRFGV QFLVKGIYSHNGLFYFHTQVKNSSNVPFDTDFIRFKIVDKKVAKRTAIQETVIDPVRSYN EILVIGGKSTVRTVYTVPQFTIPDDKILVIELVEKNGGRHQTIRVENSDIVAAKVINELK IK >gi|226332183|gb|ACIC01000137.1| GENE 79 100798 - 102183 976 461 aa, chain - ## HITS:1 COG:no KEGG:BT_0087 NR:ns ## KEGG: BT_0087 # Name: not_defined # Def: conjugate transposon protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 461 1 461 461 765 96.0 0 MEEKEVKNEVPAPETATGKEPEKQGKEKKELTPQQIQQRKKMLVYPLMGLVFLGSMYLIF APSDKDEAKVESVGGFNADIPQPKGDGIISDKKTAYEQEQMENKQADKMRSLQDFAFSLG EENGKGEDLTLIDDAPAGKPKTAMIDFGAGAPNNSRSSIQSSAAAYRDMNRQLGSFYETP KEDKEKEELKRQVEELTARLDAKENGAGSIDEQVALMEKSYELAAKYMNGQNGQSVPPQP EQIAQVAPSATVQGKGNATPVKSVSDRTVSGLQQPMSNVEFITEYSKPRNYGFNTAVGSG YSMGKNTIRACVHNDQTLMDGQTVKLRLLEPLQAGNVIVPKNSLVSGSAKVQGERLDILV SSLEYAGNIIPVELAVYDSDGQKGLSVPSSLEQEAAKEAMANIGAGLGTSISFAQSAGQQ VAMDITRGLMQGGSQYLAKKFRTVKVHLKANYQVMLYAKQQ >gi|226332183|gb|ACIC01000137.1| GENE 80 102158 - 102463 68 101 aa, chain - ## HITS:1 COG:no KEGG:BF1236 NR:ns ## KEGG: BF1236 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 101 1 101 101 194 100.0 6e-49 MIRKILSPVNQVITCIQDWADEKLRHLCGRMTPEIRLAVILIMFLFFSGLSIYFTVSSIY RIGKEDGETIRIEHIRQLQLQSKDSTNIFNQSDNGRKRSEE >gi|226332183|gb|ACIC01000137.1| GENE 81 102467 - 103090 377 207 aa, chain - ## HITS:1 COG:no KEGG:BF1237 NR:ns ## KEGG: BF1237 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 207 1 207 207 400 100.0 1e-110 MEFKSLKNIETSFKQIRLFGIAFVVMCTLITGYAVWNSYTFAEAQRQKIYVLDGGKSLML ALSQDLTQNRPVEAREHVKRFHELFFTLSPDKNAIESNIKRSLFLADKSAFNYYRDLSEK GYYNRIISGNISQTIQIDSVSCNFDVYPYAVATYARQMIIRESSLTERSLVTRCRLLNAV RSDNNPHGFIMESFEITENKDLNTIKR >gi|226332183|gb|ACIC01000137.1| GENE 82 103103 - 104128 676 341 aa, chain - ## HITS:1 COG:no KEGG:BF1238 NR:ns ## KEGG: BF1238 # Name: not_defined # Def: putative transmembrane conjugate transposon protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 341 1 341 341 618 99.0 1e-175 MLLAIEFDNLHQILRSLYTDMMPLCGNMAGVAKGIAGLGALFYVAAKVWQSLASAEPIDV YPILRPFVIGFCIMFFPTFVLGTINGVMSPVVKGCNNMLETQTFDMNAYRAQKDKLEYEA MVRNPETAYLVSDEAFDKQIDELGWSAKDIATMGGMYMDRAAHNIKQSVRNWFRELLELL FQSAALVIDTIRTFFLIVLAILGPIAFAISVYDGFQATLTQWITRYISVYLWLPVSDLFS SILARIQVLMLQKDIQELSDPNFIPDGSNSVYIIFMIIGIIGYFTIPTVSNWIIQAGGMG 
NMSRNINNSANKVGGAAGAVAGAATGNAGGRVGGKLIKGNQ >gi|226332183|gb|ACIC01000137.1| GENE 83 104147 - 104689 446 180 aa, chain - ## HITS:1 COG:no KEGG:BF1239 NR:ns ## KEGG: BF1239 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 38 180 67 209 209 259 99.0 3e-68 MSRFHNRNFELLEPYDAKVSCTVLRGERGSNTPDLPDKEYYDRLKSVHNLVKDARKVQKS ILLIGEISDIYVNSFQKMLSDENYTPDELSAIAYGYTQLLQESSDVLEEMKSVVNINGLS MSDKERMDVIDRTYNAIRNYRDLVSYYTRKNISVSYLRAKKKKDTDRVMALYGSADERYW >gi|226332183|gb|ACIC01000137.1| GENE 84 104673 - 106355 733 560 aa, chain - ## HITS:1 COG:MA3645 KEGG:ns NR:ns ## COG: MA3645 COG3344 # Protein_GI_number: 20092445 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Methanosarcina acetivorans str.C2A # 3 483 22 494 512 403 44.0 1e-112 MNEIRTSCAPTDQTISSWESIDWTKCELEVRKLQARIVKVQKEGRYGKVKALQWLLTHSF AAKALAVKRVTSNKGKNTSGVDKVLWSTPIAKANAITELKRRDYNPMPLKRVNIRKSNGK LRPLGIPTMKDRAMQALYLMALDPVAETTADNHSYGFRKERCTGDAIHQCYINLSKESSP QWILEGDIKGCFDHINHEWLLNNIPMDKVMLRKWLKSGFIFNKQLFPTEEGTPQGGIISP TLANMALDGLQTMLEAKFHRVDLYSPKRSYYPKVHLIRYADDFIITSISKEMLEQEIMPM VKEFLQARGLTLSEEKTKITHIDEGFDFLGFNIRKYKGKFLITPSKESQKKFQRKINEIV NSHKTIPQESLIRLLNPIITGSANYYQHVVSGKVFQKMDFHIYQKLLQWSLRRHPAKGKW WVAERYFHKHRGRSWVFAAPFDKDGRQELYPIKWLTDTKITRYAKLKCDANPYDPDWTEY FEKRETRLMLQSAKGRNTIVRIWKRQQRKCPYCGEPITRNTPWRVSEKVVHGKTERMLVH SYCGRNNVQPFNNEEYEPVS >gi|226332183|gb|ACIC01000137.1| GENE 85 106979 - 107191 205 70 aa, chain - ## HITS:1 COG:no KEGG:BF1239 NR:ns ## KEGG: BF1239 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 66 1 66 209 126 100.0 2e-28 MRTKIIMLLVVCSLFTGKVSAQWVVSDPGNLAQGIINASKNIVQTSSTAQNMIKNFQETV KIYQQGVRHV >gi|226332183|gb|ACIC01000137.1| GENE 86 107196 - 107552 187 118 aa, chain - ## HITS:1 COG:no KEGG:BF1240 NR:ns ## KEGG: BF1240 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 118 1 118 118 210 96.0 
2e-53 MKHLKFLFPVLLCTLLLGFSSCKETNADRLRAMRGDWKSVKNRPAFTLFEENGHYRVTTY RKTYWGTVQTETYQISEQDGNLFIETGLSVLLTYDKENDRILLSPGGEYKRSKQPIKK Prediction of potential genes in microbial genomes Time: Thu May 12 03:12:29 2011 Seq name: gi|226332182|gb|ACIC01000138.1| Bacteroides sp. 1_1_6 cont1.138, whole genome shotgun sequence Length of sequence - 65948 bp Number of predicted genes - 48, with homology - 48 Number of transcription units - 23, operones - 14 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 2376 1283 ## BF1241 hypothetical protein 2 1 Op 2 . - CDS 2373 - 2705 211 ## BF1242 putative transmembrane conjugate transposon protein 3 1 Op 3 . - CDS 2717 - 3013 265 ## BT_0095 conjugate transposon protein - Prom 3096 - 3155 1.9 4 2 Op 1 . - CDS 3180 - 3860 387 ## BT_0096 hypothetical protein 5 2 Op 2 . - CDS 3862 - 4323 519 ## BT_0097 conjugate transposon protein 6 2 Op 3 . - CDS 4328 - 5089 520 ## COG1192 ATPases involved in chromosome partitioning 7 2 Op 4 . - CDS 5086 - 5463 109 ## BF1247 hypothetical protein - Prom 5681 - 5740 3.9 8 3 Op 1 . + CDS 5533 - 5802 85 ## BT_0099 hypothetical protein 9 3 Op 2 . + CDS 5821 - 6258 497 ## BT_0100 hypothetical protein 10 3 Op 3 . + CDS 6243 - 7481 757 ## BT_0101 hypothetical protein 11 3 Op 4 . + CDS 7502 - 9481 1354 ## COG3505 Type IV secretory pathway, VirD4 components + Term 9501 - 9556 17.1 - Term 9496 - 9537 11.3 12 4 Op 1 . - CDS 9544 - 11814 425 ## ECA2180 hypothetical protein 13 4 Op 2 . - CDS 11811 - 12650 408 ## gi|253571593|ref|ZP_04848999.1| conserved hypothetical protein - Prom 12889 - 12948 5.7 + Prom 12650 - 12709 11.9 14 5 Op 1 . + CDS 12928 - 14457 1302 ## BF1253 hypothetical protein 15 5 Op 2 . + CDS 14454 - 14642 144 ## BT_0104 hypothetical protein 16 5 Op 3 . + CDS 14639 - 16720 1331 ## COG0550 Topoisomerase IA + Term 16790 - 16830 7.2 17 6 Tu 1 . - CDS 16824 - 17345 360 ## COG0262 Dihydrofolate reductase - Prom 17578 - 17637 3.9 - Term 17662 - 17701 6.1 18 7 Op 1 . 
- CDS 17727 - 18005 278 ## BF1257 hypothetical protein 19 7 Op 2 . - CDS 18018 - 20069 1643 ## BF1258 hypothetical protein 20 7 Op 3 . - CDS 20032 - 20391 373 ## BF1259 hypothetical protein - Prom 20423 - 20482 2.0 + Prom 21157 - 21216 6.4 21 8 Op 1 . + CDS 21276 - 22136 28 ## COG0534 Na+-driven multidrug efflux pump 22 8 Op 2 . + CDS 22148 - 22978 229 ## COG2207 AraC-type DNA-binding domain-containing proteins 23 9 Op 1 . - CDS 23017 - 23865 583 ## BF1264 hypothetical protein 24 9 Op 2 . - CDS 23924 - 24529 278 ## BF1265 hypothetical protein 25 9 Op 3 . - CDS 24526 - 25080 167 ## BF1266 AraC family transcription regulator - Term 25723 - 25786 12.1 26 10 Op 1 . - CDS 25793 - 26464 409 ## COG3871 Uncharacterized stress protein (general stress protein 26) 27 10 Op 2 3/0.000 - CDS 26513 - 27160 404 ## COG0546 Predicted phosphatases 28 10 Op 3 . - CDS 27163 - 27762 306 ## COG0110 Acetyltransferase (isoleucine patch superfamily) - Prom 27850 - 27909 6.3 + Prom 27789 - 27848 7.3 29 11 Tu 1 . + CDS 27889 - 28731 374 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 28737 - 28806 13.5 - Term 29001 - 29041 1.3 30 12 Op 1 . - CDS 29279 - 30232 771 ## BT_3672 hypothetical protein 31 12 Op 2 . - CDS 30289 - 32307 1410 ## BT_3671 hypothetical protein 32 12 Op 3 . - CDS 32327 - 35452 2111 ## BT_3670 hypothetical protein 33 12 Op 4 . - CDS 35483 - 36838 678 ## BT_2627 cell surface protein - Prom 37038 - 37097 9.6 + Prom 37804 - 37863 5.5 34 13 Tu 1 . + CDS 38029 - 39300 527 ## BT_4744 putative multiple inositol polyphosphate histidine phosphatase 1 35 14 Op 1 . - CDS 39308 - 40885 1123 ## COG3525 N-acetyl-beta-hexosaminidase 36 14 Op 2 . - CDS 40943 - 43381 1972 ## BT_4682 hypothetical protein - Prom 43402 - 43461 8.3 - Term 43498 - 43547 11.6 37 15 Op 1 . - CDS 43585 - 45153 1180 ## COG3119 Arylsulfatase A and related enzymes 38 15 Op 2 . - CDS 45210 - 47897 2094 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 47927 - 47986 9.0 39 16 Op 1 . 
- CDS 48096 - 50117 1556 ## Coch_1140 RagB/SusD domain protein 40 16 Op 2 . - CDS 50128 - 53580 2885 ## BF4448 hypothetical protein - Prom 53667 - 53726 4.9 41 17 Tu 1 . - CDS 53827 - 54852 609 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 54965 - 55024 6.9 + Prom 54873 - 54932 6.7 42 18 Tu 1 . + CDS 54987 - 55565 340 ## BF1060 RNA polymerase ECF-type sigma factor + Term 55689 - 55735 0.8 - Term 55904 - 55956 -0.3 43 19 Op 1 . - CDS 56159 - 57223 902 ## COG0668 Small-conductance mechanosensitive channel 44 19 Op 2 . - CDS 57280 - 59247 1420 ## COG1523 Type II secretory pathway, pullulanase PulA and related glycosidases + Prom 59418 - 59477 9.1 45 20 Tu 1 . + CDS 59545 - 60990 1250 ## COG0366 Glycosidases + Term 61014 - 61068 14.1 + Prom 60999 - 61058 2.7 46 21 Tu 1 . + CDS 61258 - 62031 279 ## COG3129 Predicted SAM-dependent methyltransferase + Term 62044 - 62103 2.5 + Prom 62072 - 62131 8.0 47 22 Tu 1 . + CDS 62277 - 65312 1504 ## BT_4692 hypothetical protein + Term 65355 - 65396 8.8 + Prom 65399 - 65458 2.3 48 23 Tu 1 . 
+ CDS 65550 - 65946 237 ## BT_4693 cation efflux system protein Predicted protein(s) >gi|226332182|gb|ACIC01000138.1| GENE 1 3 - 2376 1283 791 aa, chain - ## HITS:1 COG:no KEGG:BF1241 NR:ns ## KEGG: BF1241 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 791 1 791 834 1595 98.0 0 MRNILKATTLENKFPLFTVENGCIVSKDADITVAFRVELPELFTVTAAEYEAIHSAWNKA VKVLPDYSIVHKQDWFIKENYAPDIQKDDLSFLSRSFERHFNERPFLNHTSYLFLTKTTK ERSRMQSNFSTLCRGFLVPKEIRDRETVTKFLEAVGQLESIMNDSGFMTLTRLTSDEITG TKETAGIVEKYFSLSQTDTTTLKDISLGADEMKIGDNILCLHTLSDAEDMPGKVGTDTRY EKLSTDRSDCRLSFASPVGVLLSCNHIYNQYIFIDDHTENLKQFEKMARNMHSLSKYSRA NQINKAWIEEYLNEAHSQGLISVRCHCNIMAWSDDRDELKHIKNDVGSQLALMECKPRHN TTDTPTLFWAGIPGNQADFPAEESFYTFIEQALCLFTEETNYKSSLSPFGIKMVDRVTGK PLHIDISDLPMKRGIITNRNKFILGPSGSGKSFFTNHMVRQYYEQNAHVLLVDTGNSYLG LCEMINRKTHGEDGIYFTYTTENPIAFNPFYVEDGVFDIEKKESIKTLILTLWKRDDEAP TRAEEVALSNAVSSYIGLITKDCSVTPCFNTFYEYVRNDYRAHLQEKNVREKDFDIDNFL NVLEPYYKGGEYDYLLNSDKELDLLHKRFIVFELDNIKDHKILFPITTIIIMEVFISKMR KLKGIRKLILIEEAWKAIASANMADYIKYLYKTVRKYFGEAIVVTQEVEDIISSPIVKES IINNSDCKILLDQRKYLNKFDSIQNLLGLTDKERSQVLSINLANHPNRKYKEVWIGLGGT QSAVYATEVSL >gi|226332182|gb|ACIC01000138.1| GENE 2 2373 - 2705 211 110 aa, chain - ## HITS:1 COG:no KEGG:BF1242 NR:ns ## KEGG: BF1242 # Name: not_defined # Def: putative transmembrane conjugate transposon protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 110 1 110 110 211 99.0 5e-54 MADYPINRGIGKPVEFRGLKSQYLFIFAGGLLALFVLFIIMYMVGINQWVCIIFGVTSAT LLVWLTFRLNEKYGTHGLMKLSARKSHPFHIINRKAISRLFTKNQKQASK >gi|226332182|gb|ACIC01000138.1| GENE 3 2717 - 3013 265 98 aa, chain - ## HITS:1 COG:no KEGG:BT_0095 NR:ns ## KEGG: BT_0095 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 98 1 98 98 145 98.0 5e-34 MKKKVLFSAVALLAASTTFAQGNGMAGITEATNMVTSYFDPATKLIYAIGAVVGLIGGVK VYGKFSSGDPDTSKTAASWFGACIFLIVAATILRSFFL >gi|226332182|gb|ACIC01000138.1| GENE 4 3180 - 3860 387 226 aa, chain - ## HITS:1 COG:no 
KEGG:BT_0096 NR:ns ## KEGG: BT_0096 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 226 1 226 226 400 93.0 1e-110 MLKWFIIGTLLYYAVLLWRYRDSLGTWFTGGKREPEQPENKPTVKTGNSDCLVGASRYRM GQMRTNGDILGHLSKGVDNASIFVLQSDETVTETPAAQAIETEFEMEFETEDVQETDVSP DEIEAEEIACYMGDGEPEMAQGITLGELGQMVQVIQVKQASDMEERQAVQTICRTETNLF HSLVEQINGGRSRVAELLQKHEIPVPAIVPAAGSNEMADFDMNDFL >gi|226332182|gb|ACIC01000138.1| GENE 5 3862 - 4323 519 153 aa, chain - ## HITS:1 COG:no KEGG:BT_0097 NR:ns ## KEGG: BT_0097 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 153 1 145 145 212 88.0 3e-54 MAKQSGGKPQIDEDFMKEIISQGLPVKKQETPTVAVETKIETPDKPDDKPETADKPEIKT EPKEEKAVKEPARRKKNAPGDYRETYFMRVDLTDRQPLYVSRTTHEKLMKIVTVIGGRKA TVSSYVENILLRHFEQFQDEINELYESRFEKPF >gi|226332182|gb|ACIC01000138.1| GENE 6 4328 - 5089 520 253 aa, chain - ## HITS:1 COG:DR0013 KEGG:ns NR:ns ## COG: DR0013 COG1192 # Protein_GI_number: 15805054 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Deinococcus radiodurans # 8 196 51 247 296 61 31.0 2e-09 MKKETLYVAFSTQKGGVGKTTFTVLVASYLYYLKGYNVAVVDCDYPQHSISAMRKRDAEQ VNGDEYYKQLAFHQFKTLGKKAYPVLCSSPDEAIKTADEFLASAGSDYDVVFFDLPGTVN SEGVINSLSGVDYIFTPIAADRVVLESSLSFAVAIDKLLVKNEACRLKGLYLFWNMVDGR EKTDLYTLYEQTIGELELPLMKVFIPDTKRFKKELDAQRKTVFRSTLFPADKRLVKGSNM EELITEIAYLIKL >gi|226332182|gb|ACIC01000138.1| GENE 7 5086 - 5463 109 125 aa, chain - ## HITS:1 COG:no KEGG:BF1247 NR:ns ## KEGG: BF1247 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 125 24 148 148 152 73.0 2e-36 MDGWKDTVKCGRKESGFENLKTRFQDCNLSGKQENKLSGWNENRLPFNQESKQSAKKASC LCGMNENKKERKKATCLCRKIARSKEREIAVIPAFVLSVWNERKPSVKLYGWNEIKMYHK KELLQ >gi|226332182|gb|ACIC01000138.1| GENE 8 5533 - 5802 85 89 aa, chain + ## HITS:1 COG:no KEGG:BT_0099 NR:ns ## KEGG: BT_0099 # Name: not_defined # Def: hypothetical 
protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 89 5 93 93 138 92.0 7e-32 MSPFVLICPITRYFTAFLIPVTLQPKGVGQANQAPGIGSVPISANSNPSAGGFLFALANS KVCSALLEILSAALNALPGLSRDRVTPKS >gi|226332182|gb|ACIC01000138.1| GENE 9 5821 - 6258 497 145 aa, chain + ## HITS:1 COG:no KEGG:BT_0100 NR:ns ## KEGG: BT_0100 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 145 1 145 145 270 99.0 9e-72 MKQKNENAPRPGGAGRKPKADPAVFRYSISFNAIDHARFLALFDQSGMRTKAHFITARIF GEPFKVIKIDKAAVEYYTRLTALYSQYRGIAVNYNQVVKALNTNFSEKKALAFLYKLEKA TMELADLNRQIIELTREFETRWLQR >gi|226332182|gb|ACIC01000138.1| GENE 10 6243 - 7481 757 412 aa, chain + ## HITS:1 COG:no KEGG:BT_0101 NR:ns ## KEGG: BT_0101 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 412 1 412 412 725 95.0 0 MVAKISVGSSLFGALSYNQNKVDEEQGKVLLSNRMFESEDGNFSIRRCMECFDMHLPADL KTEKPIIHISLNPHPDDVLSDSQLADIAKEYMDKLGYGNQPYMVYKHEDIARHHIHIVSI RVDDTGKKINDKFEHIRSKQITRELEQKYGLHPAEKKQAADRPELKKVDYRAGDVKHQLS NTVKALVGSYRFQSFTEYKALLSIYNVQAEEVKGEVNGKPYNGIVYSATNDKGEKQGNPF KSSSLGKSVGYEAIQRHIKKSAKDIQDKNLKELTRRTVGAVMKSARSREQFITDLKSKGI DVLFRQNDTGRIYGVTFIDHENRTVLNGSRLGKDFSANVFNDLFSGSRTLTGNSKQEMQE HTPEYNPTGHLENTGKTIAGLFSLLSGGDDAPPDNSQVPPPKKKKKKKQRRI >gi|226332182|gb|ACIC01000138.1| GENE 11 7502 - 9481 1354 659 aa, chain + ## HITS:1 COG:alr7213 KEGG:ns NR:ns ## COG: alr7213 COG3505 # Protein_GI_number: 17233229 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Nostoc sp. 
PCC 7120 # 201 558 117 466 589 96 26.0 2e-19 MQNEDDLRGLAKVMDFMRAISILFVVINIYWFCYQSFREWGINIGVVDKILLNFQRTAGL FSNILYTKLFSVVFLALSCLGTKGVKEEKITWNKIYVFLTIGFILFFLNWWLLALPLPLA ANTGLYIFTMTAGYLSLLAAGVWISRLLKNNLMDDVFNLENESFTQETRLIENEYSINLL TRFYYKKKWNDGWINVVNPFRASMVLGTPGSGKSYAIVNNYIKQQIEKGFAVYIYDYKFP DLSEIAYNHLISHLDGYKVKPKFYMINFDDPRKSHRCNPMNPAFMTDISDAYEASYTIML NLNRSWITKQGDFFVESPIILLASIIWFLKIYQGGKYCTFPHAIEFLNKKYADTFTILTS YPELENYLSPFMDAWESNAQDQLQGQLASAKIPLSRMISPTLYWVMTGDDFSLDINNPEE PKILVVGNNPDRQNIYSCALGLYNSRIVKLINKKHQLKSSVIIDELPTIYFRGLDNLIAT ARSNKVAVCLGFQDFSQLRRDYGDKESKVIENTVGNIFSGQVVSETAKTLSDRFGKVLQK RQSMTINRNDKSTSISTQMDSLIPPSKISNLTQGMFVGAVSDNFDERIEQKIFHCEIVVD NDKVNRETANYKKLPQIIDFRDEEGNDRMQEEIQANYNRVKQEVQQIVTDEIERINNKS >gi|226332182|gb|ACIC01000138.1| GENE 12 9544 - 11814 425 756 aa, chain - ## HITS:1 COG:no KEGG:ECA2180 NR:ns ## KEGG: ECA2180 # Name: not_defined # Def: hypothetical protein # Organism: E.carotovora # Pathway: not_defined # 1 746 1 740 756 431 35.0 1e-119 MNFIFITYDAIVEFASDRFFQSAEYLESSMLCDVLQGASTYWEHNKIVFQITEKGCFFYT HKLDKKQGLIFDLSTFKGFELSSREQIISIFQKTVKYAVRYFEKLPVATCERLLPGLPTT IVFPFPFTATKDVNKILIDRNSSKQDRKERNYLTVYFFGNDDKAKVSFTNLNKALGELDK LQYSPTQLNTLEPKISAALAVTDLNSLNLSIDSKIGYDNWQYYLTENQKSFVTSPISGAE RLEGAAGTGKTLSMILKCIHLLKEKIEKNEEYHIIFVTHSLATKERIIDIFLNNWSAFKD YQEKDGTRPYISIFVTTLQEWSADHLGTNSITENEYLDKDASDSKAMQILYIEEALDKVL KSQWTAFKIICSPEFSSFLSHTPKENLLEMMQQEIAVLIKGRASSIWDRYKELTRPKYSI PLKNDADKNFMFLIFNQYQKSLERVGQYDSDDIVLSALGQINTPIWSRRVNREGYNACFI DETHLFNINELSIFHYINKPDCRTNIIFAIDKSQAVGDWGVDDQVISNTFNINIDTKEKH KFGTVFRSSPDIVQLAFNILSSGVTLFTNFENPLDYSSFNFTKEDERKCERPLYKLCSDD EKTIEEAFIWADTYCKEKNCLKSNVLIITTTDFLLKQAESYAKSSHKAFEVLKSRSDSKT IRMAFEGNKYILGGIDYVGGLEFEAVVIIGVDEGRVPPSKSINGDASHFMNYAWYNRMYV AVTRAKYAILLLGVQGRGRSSLLESSIYNEVINFEE >gi|226332182|gb|ACIC01000138.1| GENE 13 11811 - 12650 408 279 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253571593|ref|ZP_04848999.1| ## NR: gi|253571593|ref|ZP_04848999.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 279 1 279 279 537 100.0 1e-151 MRQIFEIKKNYSQIDQGSIITGCIAEDYPECEAWGCVITPRCDLAHEGKVSTIHYLPIIN IDDWVRNEARNVLKRRWIEGLHKYLNNKLKEVGVNDGFLDTGLSDDDILKVAKTLIKKEK AYQEFEEKYFLYRLQTEENFKKSLLDNQQKGILKTLIKDLVKGNVHSFYFIESWNNSDKY PYKVILLRDVRRIRYDIAMKIAKGLFEEEMLQEEIKFNDLSLSKNKDNLFYINSQIKSPF IEHILQSFSYNFCRVGVEDLDTDNIIDNLVDKSLEILTS >gi|226332182|gb|ACIC01000138.1| GENE 14 12928 - 14457 1302 509 aa, chain + ## HITS:1 COG:no KEGG:BF1253 NR:ns ## KEGG: BF1253 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 509 1 509 509 824 97.0 0 MDVNTANQSTTDEQMMDILLVLDKEKKTISAVKGVDENGELQTVPPENNSELLKFDRHGD FFSNFFSNMMNQLKNPTRFNFFKIPKIELPKITPIIRDNFDHPTAKNEAEIERYRVTPET LKQGVTNGQPNQTQQQVQQPQQEATAQAPQQPDKSKYIIDSDKVDWEALKNFGLSKEQLE KAKALEPMLRGYKSPGAFSIAGNYNSAIMKLDARLSFRHDKDGNVVLAIHGIRQKPELER PFFGHEFSKEDKANLLETGNMGRIVNLKNYITGEMIPSFISVDKVTNELVSMRASSVQIP DEIKGVKLNKEQKEALREGKGVFLENMISNRKNPFSATVQVNADKKSLEFIYPNAKQSQE QRQKQEQPNNLVTSDGVTIPKSISGIELTRQQQQDLVNDKTIFVAGLKDKRGVEYDAYIQ VNHDKKKLGFYSDNPSFDRSAMKEITPASKNRTQVAVNSEGKTYEATKKVKEPLKKGQDK PTEKQKTKQDKKEKQEQMNNPKQSRGRKR >gi|226332182|gb|ACIC01000138.1| GENE 15 14454 - 14642 144 62 aa, chain + ## HITS:1 COG:no KEGG:BT_0104 NR:ns ## KEGG: BT_0104 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 62 1 62 62 98 82.0 9e-20 MITVLILIPVIGFVLFLFACYKTDWKTIDEQNQQYYIDGYHIYYDRKILRQKEVEQLKSK LE >gi|226332182|gb|ACIC01000138.1| GENE 16 14639 - 16720 1331 693 aa, chain + ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 2 649 4 677 709 410 37.0 1e-114 MIAVIGEKPSVARDIARILGASEKQDGYLSGNGYLVTWAFGHLVGLAMPEAYGIQSFRRE SLPIIPDSFQLTPRQVKAEKGYKADPGALKQLKVIKEVFDQSDKIIVATDAGREGELIFR YIYQYIGCNKPFVRLWISSLTDKAIREGLQNLKAGSLYDNLFRSAQARSEADYLIGINAT QALSVAAGQGIFSLGRVQSPTLAMICTRFLENKNFVPQKYWQLKLQSAKDNIAFTALSAD 
KYDMQQAAIDTLQRIKEAKTVQVKTVERKEVNQEPPLLYDLTTLQKEANAKLNFSADKTL SIAQKLYEGKLISYPRTGSRYISQDVFEEIPERLVNLEQYARFAGYAAGMKGKALNSRSV NDGRVTDHHALIVTENLPGKMETDEQAIYELIAGRMLEAFSEKCVKDVTSVTLECAGSLF TVKGSVIKSAGWRAVFAEKEDGEDNATLPVMQDGDTLPLSGIELLEKQTKPKPLHTESSL LSSMETAGKELENADLKASMKDTGIGTPATRAAIIETLFFRQYIVREKKNLVPTEKGLAV YNIIRDKKIADVEMTGMWENTLAKIESGEMNPDTFRKGIEVYARQITAELLGVQLSFASG SGCICPKCKTGRILFYPKVAKCSNVDCSVTIFRNKSDKQLTDKQITELVTTGKTGLIKGF KSKNGKVFDASLTFDEQFNVTFVFPEKKGKPKK >gi|226332182|gb|ACIC01000138.1| GENE 17 16824 - 17345 360 173 aa, chain - ## HITS:1 COG:MA4540 KEGG:ns NR:ns ## COG: MA4540 COG0262 # Protein_GI_number: 20093324 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Methanosarcina acetivorans str.C2A # 4 172 8 181 198 86 30.0 3e-17 MKHLKTCIAVSLDGFIATKDNELDWMPENVRKEISAAYEQPGVLLAGVNTYTYIFEHWGG WPYKSKKTFVVSHYDTNVSEKENVSFLTDMPLRAVNELKSSSETDIQVIGGGKFITSLIE ASLLDEITLYIIPVMLGNGIKFIGKTFGSKWELTQNKILDNQVVCLTYQYKGE >gi|226332182|gb|ACIC01000138.1| GENE 18 17727 - 18005 278 92 aa, chain - ## HITS:1 COG:no KEGG:BF1257 NR:ns ## KEGG: BF1257 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 92 1 92 92 162 97.0 3e-39 MYIDKDIFTAWMERIMDRFDMQDKKIDRVIKGRNCLDGEELLDNQDLCLLLKVAPRTLTR YRKKGILPFLMLDGRCYYRATDVHKLIREKTD >gi|226332182|gb|ACIC01000138.1| GENE 19 18018 - 20069 1643 683 aa, chain - ## HITS:1 COG:no KEGG:BF1258 NR:ns ## KEGG: BF1258 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 683 1 683 683 1320 96.0 0 MLRKEEILNRTNNGLLVFKHYLPGDWRIGRHFLNPLYQDRKASCNIYFDRHSGMYKMKDF GNDNYSGDCFFFVGQLKGLDCNQAADFVEILEIIDRDLGLGLGLAAGTSIAIPTTVNNRN AADKTEETPEKPVKPYQFREQKFPLAELMYWQQYGITPELLEHYKVCSLREYNSKTVDGK PYAYVSSVTEPMYGYKGKQYIKLYRPFSTPRFLYGGSFGENYCFGLEQLPAKGDTLFITG GEKDVLSLAAHGFHAICFNSETVTIPPTLVYRLTFRFKHIVLLFDMDKTGRESSCKQEKL LEEFGVKRLLLPLPGTKEEKDISDYFKVGNTREDFLKLFIEFLDNLYSDTLIMLKSCEID 
FNNPPAKAQEIISAGDVPLGTQGNLFGITGGEGTGKSNYVAAIVAGCICQPDKEVDTLGI QIAANSKHKAVLLYDTEQSEVQLFKNVSNLLARAKQLTKPEELKAFCLTGMSRKERLHAI VQSMDKFYYQYGGIQLVVIDGIADLVKSANDEVESVAVIDELYRLAGIYNTCILCVLHFV PNGLKLRGHLGSELQRKAATILSIEKDEDPTQSVVKALKVRDGSPLDVPLMLFAWDKTAR MHLYKGEKPREEKEKRKEKELVGVARDVFGRQEHITYIDLCEQIQQILDVKERTAKSYIR FMRERDIIIKDPSNQSYFMIGLI >gi|226332182|gb|ACIC01000138.1| GENE 20 20032 - 20391 373 119 aa, chain - ## HITS:1 COG:no KEGG:BF1259 NR:ns ## KEGG: BF1259 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 119 1 119 119 201 100.0 6e-51 MEIITFESKAYKELDNKITAIADYIFNHLDENKTDEDEIWVDSYEVCTFLKISGRTLQRL RAAGMISYSDIKGHYFYKIKEVKRMLEERLIKRDKECINELITNHQKYVKERRNTKQDK >gi|226332182|gb|ACIC01000138.1| GENE 21 21276 - 22136 28 286 aa, chain + ## HITS:1 COG:PAB0243 KEGG:ns NR:ns ## COG: PAB0243 COG0534 # Protein_GI_number: 14520582 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Pyrococcus abyssi # 1 248 164 414 463 66 26.0 7e-11 MVAINVILNYILIFGALGIPVLGISGAAIASVISEAVAVIHFIIYTQKNVSSCKYGLKRR ITIDCTILKKILSISIWMMLQHGLVFGSWFVFFITMEHFGERSLAITNIVRNISSFLFLF VQAFASVVSSLVGNLIGENKGNSILYVCKKVIFLCYMTILPIMVLFLFFPSSALRIYSDN TVLVFEAIPTLKVMLTSYLIAVPTFVFFSAVSGIGHAVASLLIAFISLVVYIVYVEIISH FSSSVPLLWTSEHIYFSCVFILSLYHIHHSWMSFINQDKKLTTNIS >gi|226332182|gb|ACIC01000138.1| GENE 22 22148 - 22978 229 276 aa, chain + ## HITS:1 COG:CC2573 KEGG:ns NR:ns ## COG: CC2573 COG2207 # Protein_GI_number: 16126811 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Caulobacter vibrioides # 73 257 72 254 270 68 28.0 1e-11 MIYLCYMYKEYSPCSVLAPFIYHFWEYKGETRNGLKFNIPPHGCSDFVFIVGNAADCIRN SLIMKPYHSYFFGPMNTFTELVAHTNFIHIIGVRFRPCGLFRFIDIPLNELTNQGIDSHE FPVLFSQSFIYQLCESNQNPIDAIEHELVRQVYKNNTDTEKQISYAVSLISRQKGIVSIK ELATEACLGLRQFERKFKFYTGYSPKEYSRIVKFWNVIHLLKSNTSFDNFLSVAIQAGYY DTPHLYREVKRLSGNTPNAFLSLSTNEKVEVLHIEL >gi|226332182|gb|ACIC01000138.1| GENE 23 23017 - 23865 583 282 aa, chain - ## 
HITS:1 COG:no KEGG:BF1264 NR:ns ## KEGG: BF1264 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 282 1 282 282 524 100.0 1e-147 MKEVVDRILEQVEADISEIDLYGYNIIETSLSMVHRLQTVLNDLRTKIQTYVFLTKEDEI LFFKTQKPEILGRLLFFYKIYKIETQCPNGSDEAIRNYINRELDNLTYFFNRNLDFYQYY RSHSTVYDEYYFVRGKADLRLCTDSAQFDKDPNFSTGYDYKVAKIIANEMLRIYLNKRLV KLGTNNQIEDNLQKCFKYPFRFTGKKAYLIELGYSLVSSGDINNGNVEIKEMMNFLSTIF HIDLGDYYASYIAMKERKDRTAYLHHLIDSLIKRMNEDDMKC >gi|226332182|gb|ACIC01000138.1| GENE 24 23924 - 24529 278 201 aa, chain - ## HITS:1 COG:no KEGG:BF1265 NR:ns ## KEGG: BF1265 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 201 1 201 201 377 96.0 1e-103 MKNEIIQFLRENIIGKTLLTGAIYKLENGNLEGEYNDKMTFSNLVTTENGFKFNMTTVTQ ELVYNLDDKGARTTIAKDYTGTSVFCYELAMRKSTKQITGYMRCVSTTVQDSTTEAVVCG IFDVTFDGKELKWQENQLLYRDNPVGEDKYKPVAFHSKVRFYLDNEKVIFEYLPTLWDIS PDTLEKRLSKDDYPPYISKEQ >gi|226332182|gb|ACIC01000138.1| GENE 25 24526 - 25080 167 184 aa, chain - ## HITS:1 COG:no KEGG:BF1266 NR:ns ## KEGG: BF1266 # Name: not_defined # Def: AraC family transcription regulator # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 184 199 382 382 361 98.0 6e-99 MSIPAFRKTYSSVACENKIIKQSDCSIKRIKEFKVAYLKFERTHRNKQAYSTLWGQIIKF AKKYNLTDKGFKYVSISLDSLDITEIDKCRFLIGITVPYSMEVPKGFGTLNIQAGLYSVF NIKGGYHELNKIYRNVYLDWLPNSKYRLREQMTFEIYANTPDKTSTDELVTEIYLPIEAK QKIK >gi|226332182|gb|ACIC01000138.1| GENE 26 25793 - 26464 409 223 aa, chain - ## HITS:1 COG:CAC3491 KEGG:ns NR:ns ## COG: CAC3491 COG3871 # Protein_GI_number: 15896728 # Func_class: R General function prediction only # Function: Uncharacterized stress protein (general stress protein 26) # Organism: Clostridium acetobutylicum # 99 208 13 124 145 78 33.0 8e-15 MKQRICQSCGISMLTDDLLGTHGNGCLCTEYCCHCFQKGFFTNNSLEEQIELNTQPESLA AFNEVSGSNFTKEEAIEGLRKFLPTLKRWMPIRQQAEWVLEQCGYITLSTISENGYPRPV AIDLLRHTDISTLWMTTALSTEKVKHIRQNSKAGVCFVYEADSVTLTGKIEIITDAETRQ TFWQDYMLHYFPQGVNDPDYCILCFHAKEAVLWIDRKFERIVL 
>gi|226332182|gb|ACIC01000138.1| GENE 27 26513 - 27160 404 215 aa, chain - ## HITS:1 COG:sll1349 KEGG:ns NR:ns ## COG: sll1349 COG0546 # Protein_GI_number: 16330157 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Synechocystis # 2 207 3 211 221 63 25.0 2e-10 MIKLVAFDLDGTIGNTLPMCIQAFKKATQPYIGHELSDEEMVHTFGLNEEGMIGCVVDEP YRQQALNDFYVIYKELHQEMCPTPFEGIRELIALLKQKGIIVALITGKGINSCDITLKQF NMKDSFAKVITGNAERNIKSETLKELLHDYHLAANEIVYVGDALSDITECRKANVMCLSA AWSIPQNEVTALEYSNPKNVFYSIPSLFKYLNIKT >gi|226332182|gb|ACIC01000138.1| GENE 28 27163 - 27762 306 199 aa, chain - ## HITS:1 COG:all1011 KEGG:ns NR:ns ## COG: all1011 COG0110 # Protein_GI_number: 17228506 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Nostoc sp. PCC 7120 # 2 195 9 191 192 173 46.0 2e-43 MKTELEKCLSGEWYDCHAPVFMGFKRKTHELLLKYNSLPYDYKEEKYALLKEMFESVGKK VSVGHSFICDYGCNISIGNNVSINTGCTFVDCNKIIIGNNVLIAPNVQIYTATHPVELNE RLIPTETEDGTAYIRHTYALPVTIEDGCWIGGGVIILPGVTIGQGSVIGAGSVVTKSIPA NSLAVGNPCKVIREINTSI >gi|226332182|gb|ACIC01000138.1| GENE 29 27889 - 28731 374 280 aa, chain + ## HITS:1 COG:CAC1451 KEGG:ns NR:ns ## COG: CAC1451 COG2207 # Protein_GI_number: 15894730 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 17 280 28 290 295 76 23.0 5e-14 MKRENLHQPFEICFSELDESQLKEHDHTFFELVYILSGTGIQWINNNKFPYHDGHLFMIT PGDIHSFDIHTTTKFVYIKFNDIYIHSAVFGAENIQRLEFILQHANHRPGCILRNQTDKL LVKPMIEAIIREYVNRNIYSSEIITQLINTIVIVVARNIAMFLPEQIDENSEEKSLNILQ YIQSNICYPEKIKAKAISQHFGISPNYLGRYFKKQTNETMQQYILNYKMKMVESRLLHSE MRISEIVEELGFTDESHLNRLFKKYKDCNPTDFRKKHRGQ >gi|226332182|gb|ACIC01000138.1| GENE 30 29279 - 30232 771 317 aa, chain - ## HITS:1 COG:no KEGG:BT_3672 NR:ns ## KEGG: BT_3672 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 315 1 317 319 154 34.0 5e-36 
MKTYKISILSLLLALLFLAGCSDRKKFFEDERYEKMVYIVSDNNNIHNLVYDLDGKDGNI RNLSIACSGTNPTEEPVTISLTEDTVLLDEYNYTNFIEDYSRYALKMDPKDYAIESSTVT YPTGEPYTLVPIKIDISVIESLDPDKIYFIPIAIADATPYPIVKKKGNALLQIQKKNKYT SSAEPASYNASGYEGSGYFVITKTMVPLTKNRVRINIGTKRLPAEPENQPDFIKKNTIVI EVADDNSLTVEPYDPEEMEVETVDVKVAHPGDNAYALYNNRLDPTTTTFCVCYKYRLKTG SWTEVREAIRLPVIVIE >gi|226332182|gb|ACIC01000138.1| GENE 31 30289 - 32307 1410 672 aa, chain - ## HITS:1 COG:no KEGG:BT_3671 NR:ns ## KEGG: BT_3671 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 670 1 689 691 603 48.0 1e-171 MRKLILHSILLLSVFSISSCNYLDVDKWFDDTMKYDSIFSRKEYLERYMWNVASMFPDEG DFLNSPATAGPLATDEAFETFNAETYTGMAYVTGAINEENITNKDAKKKKSIYQWDTMYK IIRKANIILSRMDEVTDMSNLEKPDIWAYTYFMRAYAYYHLIMDFGPVILVGDQVYPSNE QPEAYANVRATYDESVDYACEQFELAAQYLPETQPNSSFGRPTKGAAYALIARLRLIQAS PLYNGRGQTYFGNWKRSSDGVFYISQQYDDRKWAVAAAACERVMDMGRYELHTVPADKDS FVPPTNENQAFPNGVGGIDPLKSYSYMFNGETDGRKNKEFIWGRTTGNLCIRESFPHSMD GYNGMGVTQKMVNMFYMNDGTDNYPEDAKPYKEDGTYDNTNFSTEDETFSEYVLKSGVFN MYRNREMRFYANIGFSGRFWKARSYSGSDAKKEQMIKYDNSSVTDGKHSSSGNVNNYPAT GYVITKYVHDDDAFSSPGGILMEKFFPIIRYAEILLAYSEALNNLQGSYSIELKGSNGET KVYEESRDIYKIKQAFNPIRYRVGLPGVKDDELASVDGFNKILQRERAIEFLYENQRYYD VRRWGIYEESEKEAIQGMDLSKDEMSGYFYPVVVDHPWIRHRVVDRKMVFLPIGKYEIRK VDGLDQNPGWGD >gi|226332182|gb|ACIC01000138.1| GENE 32 32327 - 35452 2111 1041 aa, chain - ## HITS:1 COG:no KEGG:BT_3670 NR:ns ## KEGG: BT_3670 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1041 12 1046 1046 1229 59.0 0 MKKYILLSLLFFCNASIMMGQEKQLTISGHVFDESNESLPGVSVYLKDRAGVGTSTDING AFKLSAQKGDMIIFSFIGYKKMEHIVLKDEKNLRIMLKSDSQQMDEVVITGMGTKEKKVN LTGAVTNVDISQIQTPATSLTNMLGGRVPGIISVQSSGEPGSNMSEFWIRGIGTFGASSS ALVLIDGLEGDLNSIDPADIESFSVLKDASATAVYGVRGANGVVLVTTKHGTSDKMEVNV RANVKISYLTKLPEYLRAYDYAVLANEASAMRGAQPIYDDIELKAIQYHLDPDFYPDVNW QDEILKRTSLQQTYYASARGGGSIARYFVSMNMSNESAIYKQDENSKYIKDVHYNTYGMR LNVDFNLTKTTELYFGSDIYLSSQALPGQGASTDQLWDAQAKLTPLTIPVRYSTGELPTY 
KSGAKDVYSPYAMLNYTGLKNINTYSGTYTIRLQQDFSFLLRNLALNIQGAYSNSSYFEE TRSVKPDMYWAERRDENGNLELIHTIKKEGTKYTKNRDSQYRKYFLQANLTWKAILNDEH RISALIHYEMSDSKNTKDLNKDGMAAIPLRYQGLSGKVSYGFRDTYLLDFNIGYTGSENF EKGKRFGFFPAISGGWVPTQYEFIQEKLPWLNFLKIRASYGIVGNDRISDKRFPYLTIIN ESAEKGWGYNEAGITENQMGADNLKWEKAKKIDVGIEGELFNSKLSFVIDFFKDIRDGIF QQRQQVPEYIGLVSMPFGNVGSMKSYGSDGNVTYIQNIGKDMSVTFRGNFTLSNNRVNYF EEADTKYEYNSATGRPYGYQKGLIALGLFKDDEDIANSPKQTFGSYLPGDIKYKDVNGDG VVNGDDKVPLSFSSTPRFMYGFGVEFRYKRLSAAVLFKGTGNTDVYHVGVPASNGALYDE GYIPFYGKQTGNVLSIVKNPANRWISREYAEKMGLDPSLSENPNARFPRMSYGNLENNSQ TSSFWKNNAKYLRLSEVNINYSVNVPKTIKNLGINSIDLSLVGNNLCVWDKIDICDPEQA LYNGRRYPIPASVTLQAYIKF >gi|226332182|gb|ACIC01000138.1| GENE 33 35483 - 36838 678 451 aa, chain - ## HITS:1 COG:no KEGG:BT_2627 NR:ns ## KEGG: BT_2627 # Name: not_defined # Def: cell surface protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 22 451 20 434 434 191 30.0 5e-47 MKKKITQTLLPIVMGLSLITQFACSDDDKSGGNYNGPLKLTTFYPDSGYLSSKIIIEGEN LGTDASKLSVYFNKKKGYISQASGNILMVYAPKLPGDTCIISVVKGNDSLTFDNKFRYIS RFTVENVCGKTGSGYNIGGDLASTTFEAWRLKVGCCDPEGNYYSCYSSFGNNGGLALISE KKNQSKKIISEMVNDVMYHNVTEKLYAVGTQKNVIYEIDPSNDWKVKRRYLKPQDPPAKQ IDYNSTACIAYCTTDGYFYGRTATKQLFRFKLEDMVCEYIKNDVYNSTTPETENSYIVSM MFDPTEPTHLYTSYGASSVITMQDVSKPESEEIIYAGHLNVRADNTSTTQVINGYRTDCL FNWNNQMTIIVDESGQKNMYIADAGTQTIRKIDMNTGMVTTVVGRQNVKGNQSGTPLEAT FNWPKGVGLTAEGDLYISDCGSGCVRKLSLR >gi|226332182|gb|ACIC01000138.1| GENE 34 38029 - 39300 527 423 aa, chain + ## HITS:1 COG:no KEGG:BT_4744 NR:ns ## KEGG: BT_4744 # Name: not_defined # Def: putative multiple inositol polyphosphate histidine phosphatase 1 # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 420 1 420 425 592 68.0 1e-168 MKGLLLIGIFMFYSISSVWGQDVIKQYAGTAMLYPVQEDSFHLSTSDMVPFYINHLGRHG ARFPTSGKALDKAKEALILGQQENRLTTDGKILLSTLQHLSDSFEGQWGKLSVLGELEQK GIAGRMMRHYPQLFSNSAKVEAIATYVPRCINSMDAFLTRLMLDNPSLRIQRNEGRQYDD ILRFFDLNKSYVNYKNNGDWLSIYEAFVRNKISSASIMKKFFIEPERETDEEAEEIIMAL FSIAAILPDTGLLTNLDDLFTMEEWRSYWQTQNLRQYMSKSSAPVGRMLPVAISWPLLSD 
FIYTADEVIKGKSDNAANFRFAHAETVIPFVALMGIENTDVQISNPDSVSRYWKDYEISP MAANVQWIFYHDKARGVWVKILLNEKEAKLPIATSRFPYYPWETVCAYFKERIEMAKRIL IKN >gi|226332182|gb|ACIC01000138.1| GENE 35 39308 - 40885 1123 525 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 21 520 27 527 757 247 32.0 4e-65 MKIYTCLCLLFLAGYLVAQNNTSALLPMPNQITPVKGKPFTVRSGKTAIYLNQPELQFTA KTLQTILQDRMQAKVPLASESGRADIRLLVDPAMDGKEHYRIEITSKGITISGATAGAVY YGVMTMDQLLLGDACATAHKEISPIRIDDAPRFPHRALMLDPARHFLPVNDVKFFIDQMA RYKYNILQLHLTDDQGWRVEIKKHPKLVGKDYYTQEQLAEIIQYAAQRNIEVIPELDIPG HTVAILAAYPELGCTHTDTLPKIVGKTTDLMLCANNEKVYSVYQDIIKEISSLFPSDYIH LGGDEAVIEKNWTQCSRCQAMMKELGYQKASQLMIPFFSRMLSFVQENNKTPILWCELDN IYPPANDYLFPYPKNVTLVSWRGGLTPTCLELTRKHGNPLIMAPGEYAYLDYPQLKGDFP EFNNWGMPVTTLEKSYQFDPGYGVPAEDQAHITGVMGTLWGEAIRDINRATYMAYPRAFA LAEAGWTQMKYRNWESFKQRLYPNLTNLMKKGVSVRVPFEIADKK >gi|226332182|gb|ACIC01000138.1| GENE 36 40943 - 43381 1972 812 aa, chain - ## HITS:1 COG:no KEGG:BT_4682 NR:ns ## KEGG: BT_4682 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 810 1 810 812 1635 96.0 0 MKKKHSLLFCALFTIFTGHAESTDYTKGLSIWFDSPNTLQGKEVWHSAQQDASWESQSLP IGNGSIGANILGSVEAERITFNEKTLWRGGPNTAKGADYYWNVNKQSAHILDEIRKAFVE GDQKKAEKLTRENFNSEVPYEFSREKPFRFGNFTTMGEFYVETGLSTIGMSDYKRILSLD SAMAVVQFKKDDVAYQRNYFISYPANVMVMRFSADQPSKQNLTFRYAPNPVSTGQFSTDG NNGLVYTASLDNNGMKYAVRIQATVNGGTLNNADGRITVKEADEVIFYVTADTDYKMNFA PDFTDPKTYVGVNPLETTQQWMKDAVAKGYANLLNEHYKDYASLFNRVKLELNPTVKIAN LPTAQRLKNYRKGQPDYYLEKLYYQFGRYLLIASSRPGNMPANLQGIWHNNIDGPWRVDY HNNINIQMNYWPACSTNLDECMLPLIDFIRTLVKPGEKTAQSYFGARGWTASISANIFGF TTPLESQDMSWNFNPMAGPWLATHVWEYYDYTKDLKFLKETGYELIKSSANFTVDYLWHK PDGTYTAAPSTSPEHGPVDQGATFVHAVVREILLDAIQASKELGIDKKERKQWEHVLANL VPYKIGRYGQLLEWSTDIDDPKDEHRHVNHLFGLHPGHTVSPITTPELAEAAKVVLVHRG DGATGWSMGWKLNQWARLQDGNHAYTLFGNLLKNGTVDNLWDTHPPFQIDGNFGGTAGIT 
EMLLQSHMGFIQLLPALPDAWKDGSIHGVCAKGNFEIDMIWKDGLLQEATLLSKAGENCT VKYAGKTISFKTTKGRSYQLKYDKENGLTRAQ >gi|226332182|gb|ACIC01000138.1| GENE 37 43585 - 45153 1180 522 aa, chain - ## HITS:1 COG:STM3122 KEGG:ns NR:ns ## COG: STM3122 COG3119 # Protein_GI_number: 16766422 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 23 502 23 540 579 249 32.0 1e-65 MNRIVYTSFLLGASFASTEAQVNTPPNIVLILCDDMGFSDLGCYGSEIQTPNIDRLAENG VRFSLFKNTGRSCPSRAALLTGHYQHEAGMGWMTAVDEHRPGYRGQITKNIPTIAEVMKA NGYSTYMSGKWHVTVDGAFDAPNGSYPAQRGFDKYYGCLSGGGSYYKPTPVYSNLARITE FPDDYYYTTAITDSAVSFVKQHPTDKPMFMYVAHYAPHLPLQAPADRVEKCRDRYKVGYD VLRKQRFDRLKELKFISDEMDYPVYQKEFGGKRPSWGTLTPKQQEQWITDMATYAAMIEI VDSGIGELVETIKEKGILDNTVFIFLSDNGATKEGGYLGQLMADLSNTPYRSYKSQCFQG GTSTPFILSYGDAGKNKMKGQICGQPAHIIDILPTCMDIATATYPSEFKENLPGKSLLPS IHGKKIKPRELYFEHQSSCAIISDHWKLVRGSRNEPWELIDLSTDPFETKDLSAQYPKIV KKLEVKWNKWAEQCNVFPLENKPWTERINYYLKQNPDQSGKE >gi|226332182|gb|ACIC01000138.1| GENE 38 45210 - 47897 2094 895 aa, chain - ## HITS:1 COG:SMb21655 KEGG:ns NR:ns ## COG: SMb21655 COG3250 # Protein_GI_number: 16263752 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sinorhizobium meliloti # 45 815 3 734 755 209 27.0 3e-53 MKTTSTWWKILLSGLLLAGTLVAYADAYQPSYSTAGFYQLSNTGRTAYSMNPAWRFYKGH IEGAEQPEFNDKDWNIVSLPDGIEYLPTEASGCINYQGEIWYRKHFTPEESWKGKQQFLH FEAIMGKSKIWVNGKLLKEHFGGFLPVIVDVTSSLKYGEDNVITVWADNSDDPSYPPGKP QDALDFAYFGGIYRDCWMIVHNKVFITDPNYENETAGGGLFVSFDKVSEQSAQLKLDAHI RNTSGKSFGGKILYELYDRDDKRVLSKEAGLSVSKGLAKQISCKVTVETPHLWSPDSPYL YKLHIYIKDKAGNTIDGYYRRIGIRSIEFKGKDGFWLNGKPYPYPLMGANRHQDFAVIGN ALPNSLHWRDAKKLRDAGMRVIRNAHYPQDPAFMDACDELGLFVIVNTPGWQFWNDQPIF AQRVYSDIRNMVRRDRNHPSVWMWEPILNETWYPADFAKNVVDILNEEYPYPYCYAGCDV TARGHEYFPIHFTHPMNGGGGAFNTENLDPKISYFTREWGDNVDDWNSHNSPSRVNRGWG EVPMLIQAQGYAKNDYQLTSYDGLYRTTRQHMGGCLWHSFDHQRGYHPDPFYGGIMDAFR QPKLSYYMFCSQRPAEPNKELIADNGPMIYIANAMTPFSPKDVTIYSNCDEVRLTYCKGG 
KEYTYHKPANEAGMPSPVITFKDVFDVMYDKKLSRQKKQADSYLLAEGLMVGKVVATHKV TPTRRPSKLLLWADDEKVQMKADGSDIVTVIAAIADENGNIKRLNNYEVKFEIEGQGQLV ADEETFTNPRPVLWGTAPVLVRSTTTPGEIKIRASVVWQGKHTPVPAELIIPTFPSEHTL VADKDELTQAQSASKDAGNKVNAASSDCEKRVLELQQELNRLKLKEVEKQQSDFE >gi|226332182|gb|ACIC01000138.1| GENE 39 48096 - 50117 1556 673 aa, chain - ## HITS:1 COG:no KEGG:Coch_1140 NR:ns ## KEGG: Coch_1140 # Name: not_defined # Def: RagB/SusD domain protein # Organism: C.ochracea # Pathway: not_defined # 1 669 1 648 650 573 47.0 1e-162 MKKIIKYLFLLTGGSFILASCNDFLDREPLDSVTPDNFFFTENDLAAYAVKHYNFTTHEG FNAGIWKNDNATDNQAATDYDKKWIPGQWKVPEAYDNPASNDPWYFSAIREANYFLETVV PRFEKGEIEGSETNIKHYIGEMYFLRAKNYLSKLETFGDFPIIKNTLPDESASLIQASKR EPRHKVARFILEDLDLAIGLLTNTISGGKNRITQNAALLLKSRAALYEASWLTYHRGTAL VPGGSGWPGKAEDIAGFNIDTEIAFFLKEAKAAAKEVIGNAALVQNTAKDCMDETVKTDE DKYKMSNPYFAQFSANSLEGYSEILLWRAYNLLDYKIVHSAPFYIRVGGNTGFTRQYVES FLCRDGKPIYATDQYKGDESLSDVRKNRDLRLQLFLMTSGETLSPNVMNGTPDLLPEVPQ LLDITEKRCVTGYQVRKGLSGNWYRDGNTAIEGCPVYRVAEAYLNYIEADCMEHNGTSIG SEAAGYWGDLRERAGLPRDYTVTVNNIDLSKELDWAVYSAGKTVSPLLYNIRRERRCELL AEGLRMLDLKRWRALDQVKQFVIQGVNLWESDLKDKYMQDGKNLLIQEGTEGQTSNVSSY ANSGKYLCPYRAVKTNNLMYDAGYTWCEAHYLNPIAITHFRITASNPNDLSTSIIYQNPG WPIQANEGPIGIK >gi|226332182|gb|ACIC01000138.1| GENE 40 50128 - 53580 2885 1150 aa, chain - ## HITS:1 COG:no KEGG:BF4448 NR:ns ## KEGG: BF4448 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 67 1150 1 1092 1092 1005 48.0 0 MKLLFVFCFCFVGLLCANDSYAQRTMISVKAQNLTVKEVLSQIEAQSDFSFFYNDHHIDV DRRVSVVTEQSNIFTLLNEVFKNTNVKYAVRNKKIILSTQIVDAPAVKQTVQIKGKVTDM FGDPVIGATVMEDGTQNGTITDLDGNFILNTASANVTITVSYVGYMSQTVKAQSGKSLSV ILKEDTKTLDEIVVVGFGTQKKVNLTGSVSTVNAEDLLSRPVSNVSQALQGLVPGMNFSY ASDGNGGEIGADMKVNIRGAGTIGDGSNASPLILIDGMEGNMNMLNPNDIESISVLKDAA ASSIYGSRAPFGVVLITTKKGKAGKVNINYNNSFRWSEAINLPDVADAYTYAMYFNKMQL NDGKTAVQFDDTRLQAIKDYASGAITTTTQPNRNTPTIWDWIGNTDTDWYDVVFGGTAFS QEHSLSVSGGTEKIQYYFSGNYMGQEGMMAIRRDKLQRYSVSSKINAQLYPWLNMNYSMK YMRKDYSKPTAMTDNTLYQNIAKRWPMEPTVDPNGYPMGNTIIRPILYGGDNNSQTDWLY 
QQFQVVIEPIKDWKIFGEINYKVIDAFTHTDYLKVPQMNVAGEPYSGDTWKTSKVTEGAE RTNYFNANVYSEYYRSFADAHHLKAMVGFQAEVNKWRQLQASKLDLISESVPSINAATGE STIDKSKLTHWATAGFFGRLNYDYKERYLLEVNLRYDGTSRFAKDKRWNLFPSFSAGWNV AREAFMEPASHIINTLKIRGSWGELGNQNTENLYPYIQLMKFVAQDPNSNWLINNSRPNT ANAPDLISALLGWETMRSWNIGFDLGMLNNRLTVSFDWFNRKTINMVGPAPELPVTLGAN VPKMNNADMQSTGFELDLGWRDRIKDFDYGVHLLLSDDRQKILKYPNTTGKINTWYVGKY NGDIWGFETIGIAKSQEEMDAHLASLPNGGQDALGVKWGAGDIMYRDLNGDGKISDGATL NDPGDKKVIGNSSPRFRFGLDLNASWKGIDVRLFFQGVAKRDAWLNDNMFWGATGGIWQS ACFTSHLDFFREEGDDFWSANKDAYFPRILVDNFAKNQKTQTRYLQNAAYVRLKNLQIGY TIPAQITRKIGVSNLRVFFSGENLFTLTGLPGGFDPETINTGYGAKDGSNSSGKTYPLSR TFSTGFSVNF >gi|226332182|gb|ACIC01000138.1| GENE 41 53827 - 54852 609 341 aa, chain - ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 110 312 111 304 331 73 26.0 8e-13 MEMHHSTIEDLLISYYDGKATADEIQEIEAWIKLSDENKKKAMDIYTLLLMTDTQQITEK MDMNEELSKVKGRMQENKYHISWWGWIQRAAVALLIPMAITILVLLNQPQSTAPVHAQIF EVRTQPGMITSFRLPDSTLVYLNSGSVLKYPSIFTGNIREVSLNGEAYFEVAKDPEHKFI VSTPQKSKVEVLGTHFNLEAFDEMDEVITTLVEGKVEFVYEKDGQGSKILMRPGQKVIYN NKDGQILSYNTNGESELAWMDQKIIFDKTPFKAALHILKKRYNVDFIVNTSKFDKYTFTG AFTEQYIEEILENFKISSHIRWRNVKLTNNQSEERRQIEIY >gi|226332182|gb|ACIC01000138.1| GENE 42 54987 - 55565 340 192 aa, chain + ## HITS:1 COG:no KEGG:BF1060 NR:ns ## KEGG: BF1060 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.fragilis # Pathway: not_defined # 4 180 6 182 191 192 55.0 7e-48 MEYSNEDAYLLMALRRGEMIAFDSIFKKYYPILCAYGNRFVDYEESKEIAGDAILWLWEH RETLQVETSLGRYLLKSVYHRSLNCIKQKQLKNHADTMYYEEMEKLIHDAETYQFQELSK RVKEAIDALPASYREAFVMHRFTKKSYKEIAESSGVSVKTIAYRIQQATKLLRKDLADLL ITMIVCVLSVVR >gi|226332182|gb|ACIC01000138.1| GENE 43 56159 - 57223 902 354 aa, chain - ## HITS:1 COG:MJ0170 KEGG:ns NR:ns ## COG: MJ0170 COG0668 # Protein_GI_number: 15668342 # Func_class: M Cell 
wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Methanococcus jannaschii # 5 347 4 349 350 172 34.0 8e-43 MLENELWGNTIENWGISILIILGAIIIVKLLSLLGKKVIKPFVTGTDNHLDDVIFYSLEA PVKFAIILLGIWIAIHRLVYPDSFVKVVDNAYSILIVLDITWFFGRLFSSLLQVYWGKQS NGQANKMMPIIKRTILVIVWLIGIVMALSNVGVNISALLGTLGIGGIAFALAAQDTVKNV FGTFTILTDKPFSIGDTIRVDSYEGTVVDVGVRSTKIMNYDKRIITFPNYKITDTSIVNI SSEPMRRVVLNLGLTYDTTSEKMKEALELLKSIPKRVENVSSNPSDIVAVFTEYSDSALV IMYIYFIEKQGDILGVTSNMNMEILAAFNKAGLNLAFPTRTVYIQKDESLKQES >gi|226332182|gb|ACIC01000138.1| GENE 44 57280 - 59247 1420 655 aa, chain - ## HITS:1 COG:TM1845 KEGG:ns NR:ns ## COG: TM1845 COG1523 # Protein_GI_number: 15644588 # Func_class: G Carbohydrate transport and metabolism # Function: Type II secretory pathway, pullulanase PulA and related glycosidases # Organism: Thermotoga maritima # 13 655 216 843 843 513 43.0 1e-145 MRELKNHPSPDQLDSYDKYPCYYENDLELVYTPQQSIFTLWAPSADNVRLNLYSSGEAGD PEEQLEMDIADNGTWCIHIDRDLKGSFYTFQIEKDGKWLDETPGIWAKAVGINGNRAAVI DWNETNPEGWESDQSPELKMYSDIILYELHHRDFSIDPNSGIEHKGKFLALTETGTKTPE GESSGLDHLKELGVTHIHILPSFDFATIDERKLEKNKYNWGYDPKNYNVPDGSYSTDPVT PTTRIREFKEMVKSLHQNGFRIVLDVVYNHTASTEHSNFDLTVPGYLYRQNPDGSYSNAS GCGNETASEREMVRHYIIESVKFWAKEYHIDGFRFDLMGIHDIDTMNQLREELLKIDPTI FVYGEGWVAADSPLPFEQRAVKENVRKMKGISVFNDEFRDGLKGSTFDEQEPGYASGNIN GHFEPVKYGIVGGTQHPQVDYSGLLYCDGPYAAAPSQMINFVSCHDGYTLVDKLKLSVQG EHSEEELVPIDKLIHTVLLTSQGIPFIRGGEEIMQDKQGEPNSYKSSDAVNQIDWALKAK NRDIFNYIRTLIALRKKHPAFRIPTAEGLEKWLHFLDTGDSGVIAYTLGEYANGDEWKEI LVAYNGNRNQADINIPEQDWIIVCHNGQIDLDSQEHLSGGDISIAASSALILYRQ >gi|226332182|gb|ACIC01000138.1| GENE 45 59545 - 60990 1250 481 aa, chain + ## HITS:1 COG:SP1382 KEGG:ns NR:ns ## COG: SP1382 COG0366 # Protein_GI_number: 15901236 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Streptococcus pneumoniae TIGR4 # 1 477 1 479 484 481 49.0 1e-135 MENGVMMQYFEWHLPNDGKLWKQIKEDALHLHDIGVTAVWIPPAYKADEQQDEGYATYDL YDLGEFDQKGTIRTKYGTKDELKKMIDELHKYHIAVYLDVVLNHKAGGDFTEKFMVVEVD 
PKERTKALGEPFEIQGWTGYSFHGRKDKHSDFKWHWYHFSGTGFDDAQKRSGVFQIQGEG KAWSEGVDSENGNYDFLLCNDIDLDHPEVVSELNRWGKWVSNELNLDGMRLDAIKHMKDQ FVAQFLDAVRSERGNDFYAVGEYWNGDLEALDAYIEAVGHKVNLFDVPLHYNMFQASQEG KDYDLRDILKDTLVEHHPDLAVTIVDNHDTQRGSSLESNVGDWFKPLAYGLILLMKEGYP CLFYGDYYGIKGEKSPHTRIIDILLDARRKYAYGDQIEYFDHPSTIGFIRTGDEEHNGSG LVFLMSNDEAGSKIMSLGEKHKGEVWHEITGSISEEITLDEEGNGEFSVESRNLAVWVKK D >gi|226332182|gb|ACIC01000138.1| GENE 46 61258 - 62031 279 257 aa, chain + ## HITS:1 COG:YPO2519 KEGG:ns NR:ns ## COG: YPO2519 COG3129 # Protein_GI_number: 16122739 # Func_class: R General function prediction only # Function: Predicted SAM-dependent methyltransferase # Organism: Yersinia pestis # 1 256 73 322 336 264 51.0 1e-70 MKALNKALLISYYGIRYWDIPKNYLCPPIPGRADYVHYIADLIDPERVSNTANEENGDKP KRQCRCLDIGVGANCIYPIIGHVEYGWMFVGSDIDPVSIENARKIVTCNPVLAHKIDLRL QKDNRRIFDGIIAPDEYFDVTICNPPFHSSKKEAEEGTLRKLSSLKGEKVKKTKLNFGGN ANELWCEGGELRFLLNMISESRKYRKNCGWFTSLVSKEKNLDKLYAKLKAVHVSEYKIIR MCQGTKNSRILAWRFLE >gi|226332182|gb|ACIC01000138.1| GENE 47 62277 - 65312 1504 1011 aa, chain + ## HITS:1 COG:no KEGG:BT_4692 NR:ns ## KEGG: BT_4692 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1011 1 1011 1011 1999 99.0 0 MFVVRLRNYLHLSFLVIFLLFSSCIDEFYKDTPKEGYITITGINTRAYTGAYPGDGLDDK VETLRILAFNKTSRICESNTFYYGDVLGGNTLRHPINKGEYDFVFLANEPHNTIVKAVLD NITDYDALKSISYPAEFFDSERVIPMIAENNSIKVLADGNLEVNGVASTKLTVGLRRLAA RVDVVLKSKVDFGDASSSEFEGITFSNIPDRVPLVYGLPSDCLPSSWAYADPVLPYGGTA ITRNVERKLTLADNADCFKIDPTLLTTEDKNNDLVWAVKVKKVILPSSFFSSKSDETNAI NFTVNLIDKYSPSCKLKILFDPDYTLPANAKLDLTGIIREPLEVNIQPSPWIDDGDDWEI SGVRVLNVSHIQAKMTDMNGVRISFWSNMPVVKVLGTVKKEGEANERATNTVFNCLAVDD NNPDPYRFFYDSASGSGYMDLLVDGTNTTGFGTHDRTENMSGTYTLTLSAAEDTNGKNAL QRKIKVTVTQEGLRFVHNPTANQHGLFNAAFFKHNQKGERIITGQHRLDQTWSVSVPVAY QNWLVVSATPSFDPNIGTESPGDAEHYPVTLNPYRGEDGTSISGLTGRIYFRIGVKENAD FVPTKESDPKFGYINLSYYPGWRTTMKIYIRQGEAGAYIYAPGDAIPRTVWGAWNGGTDI TTHVREDMRLLSTNARRGAAKFSALNLTSERLAGNTNPDYENVNVKGARFVDFPSQAGAF 
FQWAVDLKGTTSGMTDYYRRAFNPSKSYLVSGFPWGYGEFPIMWDGDVSANIPAYKEQFE VCPPGYRRPTDGYTDKISYNGYYDYLPDEGTTGTATNYKDQIEYSELRVSLFNVPFAGNA ASSADYATAFTTPFGGTGNTGPGTYPYGANKDMLARKQLKGTTYTFYSDGFFDRRPIKEA SNGNYGVSLGNSNVAYQGVLYFNPDSGTNASVFFPSAGRLNNTSGALESRGSTGYYWSAS VGPPYMRSELQNPPYDRRIRYGAWSLESAYNSHNFRLSYQGFAQSIRCVKE >gi|226332182|gb|ACIC01000138.1| GENE 48 65550 - 65946 237 132 aa, chain + ## HITS:1 COG:no KEGG:BT_4693 NR:ns ## KEGG: BT_4693 # Name: not_defined # Def: cation efflux system protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 132 1 130 367 253 100.0 1e-66 MMMKRELHTFIIGSVLFFTACGDKQESEQKSGYRIQGDTVYIADPLLLEKIKVSVSELKP YSKKVITSGVVRPIPPRYAYVAPPFAGRVTKSYVRIGQSVRQGTPLFEISSPDFTSAQKE YFQALSSRELAK Prediction of potential genes in microbial genomes Time: Thu May 12 03:15:11 2011 Seq name: gi|226332181|gb|ACIC01000139.1| Bacteroides sp. 1_1_6 cont1.139, whole genome shotgun sequence Length of sequence - 15833 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 3, operones - 2 average op.length - 5.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 11/0.000 + CDS 1 - 717 607 ## COG0845 Membrane-fusion protein 2 1 Op 2 1/0.000 + CDS 724 - 3831 2600 ## COG3696 Putative silver efflux pump 3 1 Op 3 . + CDS 3809 - 5059 1052 ## COG1538 Outer membrane protein + Term 5068 - 5119 3.4 4 2 Tu 1 . + CDS 5396 - 6163 626 ## BT_4696 hypothetical protein + Prom 6395 - 6454 5.6 5 3 Op 1 . + CDS 6542 - 8311 1245 ## BT_4697 transcriptional regulator + Term 8323 - 8362 -1.0 6 3 Op 2 . + CDS 8382 - 8744 356 ## COG3189 Uncharacterized conserved protein 7 3 Op 3 . + CDS 8829 - 10061 1222 ## COG0561 Predicted hydrolases of the HAD superfamily 8 3 Op 4 10/0.000 + CDS 10058 - 11170 648 ## COG1169 Isochorismate synthase 9 3 Op 5 1/0.000 + CDS 11190 - 12857 1171 ## COG1165 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase 10 3 Op 6 . 
+ CDS 12861 - 13685 1059 ## COG0447 Dihydroxynaphthoic acid synthase 11 3 Op 7 4/0.000 + CDS 13689 - 14714 683 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 12 3 Op 8 . + CDS 14714 - 15817 728 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II Predicted protein(s) >gi|226332181|gb|ACIC01000139.1| GENE 1 1 - 717 607 238 aa, chain + ## HITS:1 COG:RSp1041 KEGG:ns NR:ns ## COG: RSp1041 COG0845 # Protein_GI_number: 17549262 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Ralstonia solanacearum # 1 237 145 380 382 122 32.0 4e-28 KKNLKRKEDLIRNGVSSQRELEEAQNALLIADKEFENASAALRVYQVDNPVEMILGQPMI VRSPIPGEVIKDNIVTGQYLKDDTEPIAIVADLSKVWIAAQVKEKDIRFINEGSSLGIEI SALPGTVIKGNVYHVEEEVDEETRSIQVLSVCDNLNGHLKLGMYTTVHFLSVPVGQIQIP EKALLQGEKDSYVFVQIAPLVFVRTSVTVETTENGIAVISEGLRPGDKIISEGGYYLK >gi|226332181|gb|ACIC01000139.1| GENE 2 724 - 3831 2600 1035 aa, chain + ## HITS:1 COG:RSp1040 KEGG:ns NR:ns ## COG: RSp1040 COG3696 # Protein_GI_number: 17549261 # Func_class: P Inorganic ion transport and metabolism # Function: Putative silver efflux pump # Organism: Ralstonia solanacearum # 1 1022 1 1019 1038 1173 56.0 0 MKKELMLSIIQKRWVMLCLFVMMCIFGYYSWKQLSIEAYPDIADVTSQVVTQVPGLAAEE VEQQITIPLERAINGLPGMHVMRSKSTFGLSMITIVFKDGTEDYWSRQRVQERLNEVELP YGAVPGLDPLTSPVGEVYRYIIESDQHSLRELTDLQNWVIIPRIKEVSGVADVTNFGGIT TQFQVEVDPSKLEQYNLSLSQVVEAIENNNANVGGSVLNRGDLGYVVRGIGLIGNLDDLG HIVVSTTSGVPVFLNDIGSLKYGNLERKGVLGYTDRTRNYSESLEGIVLLLKHENPSKVL EDIHVAVDELNNETLPEGVRIHTFLDRTNLVDTTLSTVSHTLLMGMALVVLVLILFLGNW RGALLVSITIPVSLLIAFILMHLTDIPANLLSLGAIDFGIIVDGAIVMLETILKKREDNP KQYLEEKSMAQRAKEVGRPILFSTIVIITAYLPLFAFERVERKLFTPMAFTMSYAMIGAL LVALLLIPGLAYAIYRKPHKIYKNKWLDRLKDKYIRTITGFLEAPLKAIIPSGLILTAGI VLSVVVGKDFLPELDEGSIWLQVNLPPGISVEKSRELSDTLRARTMKYSEVTYIMVQAGR NDDGTDPFTPSHFEVSIGIKPYDEWPKGKTKKDLIHELEEEYKLLPGFKVGFSQPMIDGV MDKIAGAHSELVVKVYGEDFRETRRIAEEITRALGTVRGAVDLDIDQEPPLPQLQISMNR 
DAIARYGLNVSDVADLIEVAVGGKAIAQLYQGDRQYGITCKYKEEARNTPEKIAGLMLTS STGAKIPLSQVADVRLSVGESTITREMNRRHLTVKLNLRGRDLTSFLNEAQNTINEKVHY DKNKYTIKWGGAFENQKRAYTRLAVIIPLTLTGMFILLYVTFGKFRQAGLVLAIVPLALF GGMLALNVRGMTLNVSSAVGFIAMFGLSIQNGIIMVSQVNGLRRGGMELRKSVIEGARQR FRPILMTSVTTILGLFPASLATGIGSDVQRPLATVIVYGLMFSTLISMFLLPAFYYLVEN NVLNKKKNDEKEYTI >gi|226332181|gb|ACIC01000139.1| GENE 3 3809 - 5059 1052 416 aa, chain + ## HITS:1 COG:PA2522 KEGG:ns NR:ns ## COG: PA2522 COG1538 # Protein_GI_number: 15597718 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Pseudomonas aeruginosa # 13 410 17 420 428 106 25.0 8e-23 MKKNIRYNIVLLLLMVEAVLIPTFAQMPEIKQGQSISFEEYLNRVGKKNLSYLAEKLNVS IADAEVIARKIFPDPELEFEAGNETFSLGVSYSLEMGNKRGARIKLARTQAELEKLFLQQ GFQDLRAEAADLFLEAILQRELLEVQKSSYEYMYQLSQSDSLRYVSGEITENDVRQSKLE TVTLLNTVYSQEAAYHSALVLLNKHMGMSADTLHIPLGNWEELSRDFALSDLVKAGLDNR IDLLVAQKSTEVTTREYKLTRAERRPDIGVSVSYERNWNGFLPPSRSATAGVSVPLTFSN INKGAVKAAKFRIAQSEIMERDMELQIQTEITQAWFNYEAEKKKIAQYKAGVLEDSQKVL DGMVYKYKRGETSILDVLVAQRTYNEVQQEYLETMKGYAASLVALERACGIWDICF >gi|226332181|gb|ACIC01000139.1| GENE 4 5396 - 6163 626 255 aa, chain + ## HITS:1 COG:no KEGG:BT_4696 NR:ns ## KEGG: BT_4696 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 255 1 255 255 533 100.0 1e-150 MEAKYITLTKENIEKEHICCAFSDKKCKDSYELKKTWLKNEFENGYVFRRLDERAKVFIE YGPAEKAWVPVNAPNYLMINCFWVSGKYKGCGHGKALLQSAVEDAKAQGRDGLVTVVGTS KFHFMGDAKWLLRQGFETIEKLPYGFSLLALKINPAAPDPSFNGTVSSGECEEKEGVVVY YTHRCPFAEFHVRNSLVSVTENKGIPLKIVRLETMAQAQNAPTPATIFSLFYKGKFVTTD LSACMEQRFDKALGL >gi|226332181|gb|ACIC01000139.1| GENE 5 6542 - 8311 1245 589 aa, chain + ## HITS:1 COG:no KEGG:BT_4697 NR:ns ## KEGG: BT_4697 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 589 1 589 589 1093 100.0 0 MKSIRSVICFCLFLFVFNQLLFADKTIENDSLYTSEYISRIYMAEPERALSLLDGAESKK 
TIPLRIIHELRSRVYRNMYMTKLAFLYAKKSYLLDSVSQKDTKHLLTMTVDLAELAVLMS DHKESMRYALDGIKLAQKEKDKGAESKLLFCIGENKWQLSFKEEAYDYFDKAIKLLQGTS SKLEMAMLSYFYGMKMDYLLNDSRSKEALEVGLKREKLLKDIAKLPEKTKGFLDLQYTYL YAKMSYICYLEGKYDQAEKYYQQYLSTENSQTPDGKTYAIPYLLVSKQYQKVVDQCQDFK KLMQEQDTVNLQYISILQKEVKAYKGLKDFEKVAALRESIISIIDGINSKDKQNAALELN AIHKADEQEEYIAEQTLQLRIRNISLAFLGCVTCLILFVLWRIWRHTIIVKYKNKMLAKF INEKLAGKVENKQLFIDGDAEDPIAVPLDLEPETGFSEKDDLSPDEVVESREEEDENKKI FKELNRIVVQDQLYLSPELSREDLAQIVHLNNARFARMIKENTGTNFNGYINELRINYAI QLLKQHPNYTIRAIADEAGFNSTPILYSMFKKKTGMTPYEFKKTQESLG >gi|226332181|gb|ACIC01000139.1| GENE 6 8382 - 8744 356 120 aa, chain + ## HITS:1 COG:SMb21412 KEGG:ns NR:ns ## COG: SMb21412 COG3189 # Protein_GI_number: 16264987 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Sinorhizobium meliloti # 1 115 22 137 141 99 45.0 1e-21 MVQVRIKRVYEDFSETDGYRVLVDKLWPRGIKKEWLKYDYWAKDITPSAALRRCFHEDIP GHWNDFVVMYQKELEASQAVADFLTLIKPHPVITLLYASKEPVYNHARILRDYLEMHLKE >gi|226332181|gb|ACIC01000139.1| GENE 7 8829 - 10061 1222 410 aa, chain + ## HITS:1 COG:SP0923 KEGG:ns NR:ns ## COG: SP0923 COG0561 # Protein_GI_number: 15900803 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Streptococcus pneumoniae TIGR4 # 4 269 3 269 269 160 37.0 3e-39 MKYKLLVLDVDGTLLNDAKEISKRTLAALLKVQQMGVRIVLASGRPTYGLMPLAKSLELG NYGGYILSYNGCQIINAQNGEILFERRINPEMLPYLEKKARKNNFALFTYHDDTIITDTP ENEHIQNEARLNNLKVIKEEEFSVAIDFAPCKCMLVSDDEEALVSLEGHWKRRLNGALDV FRSEPYFLEVVPCAIDKANTLGALLEELDVKREEVIAIGDGVCDVTMIQLAGLGVAMGHS QDSVKVCADYVTASNEEDGVALAVEKAIIAEVRAAEIPLDQLNAQARHALMGNLGIQYTY ADEDRVEATMPVDHRTRQPFGILHGGATLALGETVAGLGSMILCQPDEIVVGMQVSGNHI SSAHEGDTVRAVATIVHKGRSSHVWNVDVFTSTNKLVSSIRVVNSVMKKR >gi|226332181|gb|ACIC01000139.1| GENE 8 10058 - 11170 648 370 aa, chain + ## HITS:1 COG:VNG1083G KEGG:ns NR:ns ## COG: VNG1083G COG1169 # Protein_GI_number: 15790177 # Func_class: H Coenzyme transport and metabolism; Q Secondary metabolites biosynthesis, transport and 
catabolism # Function: Isochorismate synthase # Organism: Halobacterium sp. NRC-1 # 118 353 191 432 441 118 30.0 2e-26 MIDEKSNLTTIDALIQRKQPFAVYRVPGEKYPRLLTEDVGAVRLIFDLKELNGQRGFVIA PFRIDKSCPIVLIQSDRTGQPLPMEIVAEEEQDLQSYPEESFHTLCTGKYATCFHTFIEA LRDATFDKLVLSRSLTIGKNPEFSPSAVFRAACQRYIHSYIYLCYTPQTGVWLGSTPEII LSGEKNEWNTVALAGTQPLQNGKLPQVWDDKNRQEQDYVASYIRRQLLSLGIRSTESGPY PAYAGALSHLKTDFHFSLKDNKNLGDLLKVLHPTPAVCGLPKEEAYRFILENEGYDRKYY SGFIGWLDPEGRTDLYVNLRCMHIEDEQLTLYAGGGLLASSELNDEWQETEKKLQTMRRI LVSAPIMMNH >gi|226332181|gb|ACIC01000139.1| GENE 9 11190 - 12857 1171 555 aa, chain + ## HITS:1 COG:BS_menD KEGG:ns NR:ns ## COG: BS_menD COG1165 # Protein_GI_number: 16080134 # Func_class: H Coenzyme transport and metabolism # Function: 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase # Organism: Bacillus subtilis # 19 473 21 494 580 190 30.0 6e-48 MYSDKKNILQLVALLEAHGITKVVLCPGSRNAPIVHTLSTHPGFTCYAMTDERSAGYFAI GLALNGGHPAAVCCTSGTALLNLHPAVAEAYYQNIPLVVISADRPAAWIGQMAGQTLPQP GVFQTLVKKSVNLPEIQTEEDEWYCNRLVNEALLETNHHGKGPVHINIPISEPLFQFTVE SLPEVRVITRYQGLNVYDRDYNDLIERLNRYQKRMIIVGQMNLIYLFEKRHTKLLYKHFV WLTEHIGNQTVPGIPVKNFDAALYAMPEEKTGQMTPELLITYGGHVVSKRLKKYLRQHPP KEHWHVSADGEVVDLYGSLTTVIEMDPFEFLEKIAPLLDNRVPEYPRVWENYCKTIPEPE FGYSEMSAIGALIKALPESCALHLANSSVIRYAQLYQVPSTIEVCCNRGTSGIEGSLSTA VGYAAGSDKLNFIVIGDLSFFYDMNALWNINVRPNLRILLLNNGGGEIFHTLPGLDMSGT SHKYITAVHKTSAKGWAEERGFLYQRVENEEQLAEAMKTFTQPEAMEQPVLMEVFSNKNK DARILKDYYHQLKQK >gi|226332181|gb|ACIC01000139.1| GENE 10 12861 - 13685 1059 274 aa, chain + ## HITS:1 COG:BS_menB KEGG:ns NR:ns ## COG: BS_menB COG0447 # Protein_GI_number: 16080132 # Func_class: H Coenzyme transport and metabolism # Function: Dihydroxynaphthoic acid synthase # Organism: Bacillus subtilis # 6 274 3 271 271 404 68.0 1e-112 MSTQREWTTIREYEDILFDYYNGIARITINRERYRNAFTPTTTAEMSDALRICREEADID VIVITGAGDKAFCSGGDQNVKGRGGYIGKDGVPRLSVLDVQKQIRSIPKPVIAAVNGFAI GGGHVLHVVCDLSIASENAIFGQTGPRVGSFDAGFGASYLARVVGQKKAREIWFLCRKYN AQEALDMGLVNKVVPLEQLEDEYVQWAEEMMLLSPLALRMIKAGLNAELDGQAGIQELAG 
DATLLYYLTDEAQEGKNAFLEKRKPNFKKYPKFP >gi|226332181|gb|ACIC01000139.1| GENE 11 13689 - 14714 683 341 aa, chain + ## HITS:1 COG:AGpA707 KEGG:ns NR:ns ## COG: AGpA707 COG4948 # Protein_GI_number: 16119707 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 125 238 67 179 299 86 35.0 7e-17 MDCKIDIIPRVLHFKQPAGTSRGTYTTRKVWYLHFTAPEFPNWVGIGECAPLPNLSCDDL PDYEEVLAKICRQVENQGGKLDMEALCKYPSILFGLETAIRHFFEGSWALWDTPFSRGEA GIPINGLIWMGDFNRMLAQIEKKMEAGFRCIKLKIGAINFEEELALLQHIRSHYSSKEIE LRVDANGAFSPTDAMEKLKRLSELDLHSIEQPIRAGQWEEMARLTSESPLPIALDEELIG YNTWEEKQRLLSAIRPQYIIIKPSLHGGLAGGEEWIAEAEKLNIGWWITSALESNIGLNA IAQWCATFQNPLPQGLGTGLLFTDNVEMPLEIRKDCLWFCK >gi|226332181|gb|ACIC01000139.1| GENE 12 14714 - 15817 728 367 aa, chain + ## HITS:1 COG:Cgl0445 KEGG:ns NR:ns ## COG: Cgl0445 COG0318 # Protein_GI_number: 19551695 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Corynebacterium glutamicum # 51 351 58 370 376 96 27.0 7e-20 MIFDRQKQRLLLEGKEYTPGDIHSLVAEGEGNHPSAIWDLYLFLNEWFNDDPVITVHTSG STGAPKELLVRKDQMIQSARLTCEFLDLKQGETALLCMNLRYIGAMMVVVRSLIAGLNLI VRRASGHPLADVDTPLRFAAMVPLQVYNTFQIPEEKEKLKQTEILIIGGGAVDKALEEKI RNLSNAVYSTYGMTETLSHIALRRLNGAAASDRYYPFSSVELFLSSENTLMINAPLVCDD TLQTNDIARIYPDGSFTILGRKDNVINSGGIKVQAEEIERLLQSSIPVPFVITSVPDRRL GQAVTLLIEGQMEISALGTKLESVLASYYRPKYIYTVNHIPQTGNGKINRKECRVLAESL QLLGRQE Prediction of potential genes in microbial genomes Time: Thu May 12 03:15:37 2011 Seq name: gi|226332180|gb|ACIC01000140.1| Bacteroides sp. 
1_1_6 cont1.140, whole genome shotgun sequence Length of sequence - 69422 bp Number of predicted genes - 52, with homology - 51 Number of transcription units - 29, operones - 14 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 439 305 ## BT_4705 RNA polymerase ECF-type sigma factor - Prom 548 - 607 10.9 + Prom 456 - 515 8.0 2 2 Tu 1 . + CDS 652 - 1587 746 ## COG3712 Fe2+-dicitrate sensor, membrane component + Prom 1697 - 1756 2.6 3 3 Op 1 . + CDS 1837 - 5136 3040 ## BT_4707 hypothetical protein 4 3 Op 2 . + CDS 5161 - 6756 1567 ## BT_4708 hypothetical protein 5 3 Op 3 . + CDS 6799 - 7740 770 ## BT_4709 glycosyl hydrolase 6 3 Op 4 . + CDS 7761 - 8972 825 ## BT_4710 hypothetical protein 7 3 Op 5 . + CDS 8990 - 9952 539 ## BT_4711 hypothetical protein + Term 9963 - 10021 14.0 8 4 Op 1 . + CDS 10027 - 10761 553 ## BT_4712 hypothetical protein 9 4 Op 2 . + CDS 10838 - 12721 1336 ## COG3669 Alpha-L-fucosidase + Term 12814 - 12857 5.1 + Prom 12844 - 12903 3.4 10 5 Tu 1 . + CDS 13079 - 15430 1822 ## COG1472 Beta-glucosidase-related glycosidases + Term 15475 - 15520 9.1 - Term 15268 - 15316 2.5 11 6 Tu 1 . - CDS 15536 - 16012 698 ## COG0783 DNA-binding ferritin-like protein (oxidative damage protectant) - Prom 16128 - 16187 4.2 + Prom 15988 - 16047 9.4 12 7 Tu 1 . + CDS 16149 - 17075 900 ## COG0583 Transcriptional regulator - Term 17003 - 17032 -0.2 13 8 Tu 1 . - CDS 17082 - 18014 985 ## BT_4717 integral membrane protein, putative permease - Prom 18102 - 18161 8.2 + Prom 18053 - 18112 7.5 14 9 Tu 1 . + CDS 18136 - 18825 794 ## COG0580 Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) + Term 18864 - 18901 3.3 - Term 18850 - 18891 5.0 15 10 Tu 1 . - CDS 18944 - 19666 759 ## BT_4719 hypothetical protein - Prom 19731 - 19790 4.0 + Prom 19619 - 19678 5.5 16 11 Op 1 . + CDS 19846 - 20394 403 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 17 11 Op 2 . 
+ CDS 20403 - 21659 1039 ## BT_4721 hypothetical protein + Term 21682 - 21741 16.0 + Prom 21664 - 21723 5.5 18 12 Op 1 6/0.000 + CDS 21840 - 22427 321 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 19 12 Op 2 . + CDS 22501 - 23478 757 ## COG3712 Fe2+-dicitrate sensor, membrane component + Term 23501 - 23555 2.5 + Prom 23495 - 23554 3.2 20 13 Op 1 . + CDS 23649 - 27068 2737 ## BT_4724 hypothetical protein 21 13 Op 2 . + CDS 27081 - 28424 1074 ## BT_4725 hypothetical protein 22 13 Op 3 . + CDS 28446 - 30977 1954 ## COG0584 Glycerophosphoryl diester phosphodiesterase 23 13 Op 4 6/0.000 + CDS 30992 - 31912 943 ## COG0584 Glycerophosphoryl diester phosphodiesterase 24 13 Op 5 . + CDS 31928 - 33286 850 ## COG2271 Sugar phosphate permease + Term 33334 - 33388 14.1 + Prom 33311 - 33370 2.8 25 14 Op 1 . + CDS 33404 - 33538 233 ## 26 14 Op 2 . + CDS 33556 - 33810 155 ## BT_4730 hypothetical protein + Prom 34230 - 34289 6.9 27 15 Op 1 . + CDS 34400 - 34732 222 ## BT_4732 hypothetical protein 28 15 Op 2 . + CDS 34713 - 35018 371 ## BT_4733 hypothetical protein + Term 35025 - 35075 7.5 - Term 35015 - 35057 6.2 29 16 Op 1 . - CDS 35088 - 37202 1435 ## COG5545 Predicted P-loop ATPase and inactivated derivatives - Prom 37224 - 37283 1.6 30 16 Op 2 . - CDS 37289 - 37519 248 ## gi|253571672|ref|ZP_04849078.1| conserved hypothetical protein - Prom 37680 - 37739 6.4 + Prom 37508 - 37567 6.0 31 17 Op 1 . + CDS 37713 - 38216 619 ## BT_4735 hypothetical protein 32 17 Op 2 . + CDS 38276 - 38443 196 ## gi|253571674|ref|ZP_04849080.1| predicted protein 33 17 Op 3 . + CDS 38430 - 38879 385 ## COG3023 Negative regulator of beta-lactamase expression + Term 38905 - 38956 9.8 - Term 38753 - 38800 4.3 34 18 Op 1 11/0.000 - CDS 38872 - 39570 682 ## COG1180 Pyruvate-formate lyase-activating enzyme 35 18 Op 2 . 
- CDS 39614 - 41842 2680 ## COG1882 Pyruvate-formate lyase - Prom 41911 - 41970 4.5 + TRNA 42083 - 42167 49.7 # Ser TGA 0 0 + TRNA 42277 - 42364 62.1 # Ser GGA 0 0 + Prom 42430 - 42489 7.7 36 19 Tu 1 . + CDS 42539 - 43546 884 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III - Term 43528 - 43573 10.1 37 20 Op 1 1/0.000 - CDS 43597 - 44073 513 ## COG1905 NADH:ubiquinone oxidoreductase 24 kD subunit 38 20 Op 2 1/0.000 - CDS 44092 - 45858 1667 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain 39 20 Op 3 . - CDS 45870 - 47777 1568 ## COG1894 NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit - Prom 47892 - 47951 9.4 + Prom 48254 - 48313 4.8 40 21 Tu 1 . + CDS 48358 - 51114 2366 ## BT_0126 hypothetical protein + Prom 51170 - 51229 5.7 41 22 Tu 1 . + CDS 51259 - 54552 1996 ## BT_0127 putative transmembrane protein + Prom 54658 - 54717 3.4 42 23 Tu 1 . + CDS 54763 - 55134 404 ## BT_0128 hypothetical protein + Term 55178 - 55221 6.3 - Term 55165 - 55209 4.0 43 24 Op 1 3/0.000 - CDS 55240 - 56427 965 ## COG1312 D-mannonate dehydratase 44 24 Op 2 . - CDS 56440 - 57255 204 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 - Prom 57415 - 57474 5.4 + Prom 57377 - 57436 8.4 45 25 Tu 1 . + CDS 57483 - 59417 1336 ## BT_0132 alpha-glucosidase, putative - Term 59289 - 59332 -1.0 46 26 Op 1 . - CDS 59401 - 60234 720 ## COG2103 Predicted sugar phosphate isomerase 47 26 Op 2 . - CDS 60245 - 60628 331 ## BT_0134 hypothetical protein 48 26 Op 3 . - CDS 60606 - 61388 668 ## BT_0135 hypothetical protein - Prom 61480 - 61539 4.6 + Prom 61442 - 61501 8.4 49 27 Op 1 . + CDS 61541 - 63151 977 ## BT_0136 hypothetical protein 50 27 Op 2 . + CDS 63256 - 65181 1565 ## COG3533 Uncharacterized protein conserved in bacteria + Term 65238 - 65295 11.3 + Prom 65296 - 65355 3.5 51 28 Tu 1 . + CDS 65403 - 68150 1967 ## COG2207 AraC-type DNA-binding domain-containing proteins 52 29 Tu 1 . 
+ CDS 68898 - 69413 506 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog Predicted protein(s) >gi|226332180|gb|ACIC01000140.1| GENE 1 1 - 439 305 146 aa, chain - ## HITS:1 COG:no KEGG:BT_4705 NR:ns ## KEGG: BT_4705 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 146 1 146 189 259 100.0 3e-68 MKISFSRQTKERAFKQLYEDYYAPFCLYAKRFVDDKEVREDIVSDVFTSLWDKLDTDSFD LQSETALGYIKMCVKNSCLNFLKHQEYEWSYAENIQKKAPLYETETDSVYTLDELYRMLY ETLNKLPENYRTVFMKSFFEGKTHAE >gi|226332180|gb|ACIC01000140.1| GENE 2 652 - 1587 746 311 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 74 282 105 308 331 77 30.0 3e-14 MKENMENMNPESLLRKAQALGEDIKEMESIDVMGAYQQAQTQIKTNRRRSMYNQLMRYAA FLTIPLLLSSLILGYLYWGATDTEEKYAEVMAATGSVIRYELPDHSVVWLNSGSTLRYPT VFKKDNRNVELKGEAYFEVEADRKRPFYVNTPAGLSVYVYGTKFNVNAYEDDNSIETVLE KGKVNVISPDGKTTVQLAPGERLLYNKVDQKLLKGKVDVYEKVAWKDGKLIFRNAELGEI FKRLARHFNVDIQFNNISGKEYKYRATFRNETLPQILDYLAKSAALKWRTEEAVQQADDT FTKKKIIVDLY >gi|226332180|gb|ACIC01000140.1| GENE 3 1837 - 5136 3040 1099 aa, chain + ## HITS:1 COG:no KEGG:BT_4707 NR:ns ## KEGG: BT_4707 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1099 1 1099 1099 2031 100.0 0 MRISLALLFAVVLQLSAENGYAQRTHVAISMNNVSVEQVLNKIEEASDYVFLYNDKTIQK NRIVSVRNKSGKILDILDDIFKGTDITYTVVDKQIILSTNKMQLVQQEGQIQIKGVVKDA GGDPLIGVNVKVKDSTVGTITDINGNFTLQTRKGDILEISYVGYATKTVKVQNAQVLNIV LTEDTEVLNEVVVTALGIKKEAKSLSYNVQQVSNAEITRIADANFVNNLNGKVAGVTINS SSAGVGGSSRVVMRGTKSLNGNNNALYVVDGIPMSDMSAASTQPTDSYEGAGQSGDPISG LNPEDIESISVLSGPSAAALYGSAAANGVVMITTKKGREGRTSVSISNNTTFSAPLVLPE FQNTYGQTEVGSYYSWGSKLNTLSSYDPKDFFQTGVNVTNAASLSTGTDKNQTYLSLGTT NAKGIIHNNDYERYNVTVRNVAKMLKDKLTLDLSFMLSSVKEQNMTSQGLYYNPLVPLYL 
FPAGDDFSKVQAYQRYDSERNLLTQYWPYSTSLALQNPYWITEHIKIPNHKNRYMATASA KYEFADWINVTARAKMDRNNERRERMYDAGTNTLFASKYGYYSKSNIENQQIYGELLLNI NKYFVDNTLNVTANVGGNFENNDYQSDYFGGKLKSVANLFTFGNVDTSEKNLANQYGYHL KKRAIFGSAQIGYKSMAYLDVTARNDWSSTFKGTNTGSFFYPSIGLSGIITDIFRCSTDI MPYMKVRISYSEVGNSPDVFLAIPTYALVDGVPVTQSRRPNPNLKPERTKSWEAGFNFVF FKNRLRLDGSIYQSRTYNQFFERTLSSTTGYKSEVVNGGRVDNKGIELSLRFEDKWGNFG WSSYLTYSLNRNKVVELLRNYEDPYTHELTSLDRIDMGGTSMYKMILTEGGSIGDIYVNT LRTDEHGAIYVHPSDQVVVTQPDEFVKAGNSAPKYNLGWGNTFSYKGLSLGFLFTARVGG VVVSQTQATMDAYGSSKATAIARDNGGAIVNGRPIGAEDYYTKIGGAGAQGGIGSMYTYS ATNVRLAELSLGYDIPINKYVDWIKGLNVSFIGKNLFFLYRKAPFDPELTASTGTYFQGI DMFMSPSLRNLGFSVKVNF >gi|226332180|gb|ACIC01000140.1| GENE 4 5161 - 6756 1567 531 aa, chain + ## HITS:1 COG:no KEGG:BT_4708 NR:ns ## KEGG: BT_4708 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 531 1 531 531 1062 100.0 0 MILNKILSWNKLSILGISLMLLNNISCIDYEDNVNPNEVTEEMMEVDNLKTGAFFSQMLR RVVIINDGDKLDSDYQIAQNLSHDLYSGYIAATLGSTNHNGQYNFQEQWVNATFDYAYTG IMAPWQSIHKIATEQELPEVDALATIVKVEGMHRIADTFGPIPYVNYGSSSLYDALDLVY NKFFEELDNAIEVLSNYVNGNVDAKLMSDYDCVYGGDVEKWVKFANTLRLRLAIRVSYAN PTLAEAQAKKSMENMFGFIESKAERAELSHNSLEYHHPLHEIAYNFNSGDCRPGATIVAY LNGLNDSRISSYFTAADDGEYHGVRIGITTSNMSNYQGNKISNLNINRASTPVVWMTAAE SFFLRAEGALRGWDMGGTAKSFYEQGVRMSFEENNAAGVDDYLADHTSTPGAFADNVGSD NYTFSSRVTPAWDDNAGFEAQLERIITQKWIANYPDGPEGWSEYRRTGYPEVIPVVRNSS NGTIDTQLQVRRLPYTRDEKINNASGVASGISALGGQDTGGTKLWWDKKSR >gi|226332180|gb|ACIC01000140.1| GENE 5 6799 - 7740 770 313 aa, chain + ## HITS:1 COG:no KEGG:BT_4709 NR:ns ## KEGG: BT_4709 # Name: not_defined # Def: glycosyl hydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 313 1 313 313 648 100.0 0 MKTKITGLLTFLIGTFFFYSCDTDVEALEIQKLKTYDEQYFENLRAFKKSDHEISYAYYE AWSPIEGVTGYKDPASWGERMVGLPDSIDIVNLWMGIPTAETHPIAYADMQYCQRKLGTR FVMHADASHYRHQFTVDGVDYDLSQNKDDEAMAAYAKWIVNQVLEPGLDGVDVDWEGWSG SDLVRLIKELGKYFGPEGEQPDKLLIVDYFSGTPPTDIIPYCDYIVQQAYSDQVGFLTQP SNFPPEKMIYCESFGVFYADGGQLMNYARWEPSKGRKGGCGVFYLGRNYYSSSGIPYNEF 
REAIQIMNPAVNE >gi|226332180|gb|ACIC01000140.1| GENE 6 7761 - 8972 825 403 aa, chain + ## HITS:1 COG:no KEGG:BT_4710 NR:ns ## KEGG: BT_4710 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 403 1 402 402 803 99.0 0 MMKTYIFKHSLLGVILGLSILGCTEGDKFDYNKNMAFITGTETTPVTKFVVEDTPSSYAV TASTTDKVDKDVNVKFAIDRSLVETYNNEHNTKYYAIPEGAVTLEDADAVIQTGKAFSTP ATVKVISTEDFAEGRVYVVPVTMVEVDGLEILKPSKTIFLQISRVIHFTSLNISNTNLYS NFIFSDDKKQELSNFTYEIKCYSQEWHRIARLCSFTSADEQRSSMLRFGENGLDINALQW VSPSGSIVSSTRFSTDRWYMISLTYDGSKLTMYVDGVKDAQGDGDGKPVDFQRFELGMSW TGYRGSQYFRGRIAEVRVWNRALSTGELQMNLCGVDSQSEGLVAYWKMNEGEGHIFKDAT GHGYDMDWTNTAREINEGAGLTYNLNYSSAIAWDSDDNNRCNE >gi|226332180|gb|ACIC01000140.1| GENE 7 8990 - 9952 539 320 aa, chain + ## HITS:1 COG:no KEGG:BT_4711 NR:ns ## KEGG: BT_4711 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 320 1 320 320 565 98.0 1e-160 MKINTHILGLLVVVGLFSLQSCDDESYDIEGSNQNYVYINVNRWTSTEYPQNTFVYEVLR TPIGSSLESGPEIVKVGVRSTKVASKDITVTLDIDNSASIGEFSSFPDGVKVTLDKKELV IPAGSMLSSDSVTIEIEDGKWSEFVNTSYLLPLKVTSVSNAALSETHSSAYLAVSTSFTN CVPGATSVEGTLISDRSAWTATNNGVDVGTTLFDGNTRTYPSQDASSTVIVNLNSVYQNI TGIRLQYYSRNYSLSSAAVYTSEIGGEDYEYQGNVTFSRATPQYIRFYGAVSAQYVKLEL LPYSSGYGIVLSEFDMYQNE >gi|226332180|gb|ACIC01000140.1| GENE 8 10027 - 10761 553 244 aa, chain + ## HITS:1 COG:no KEGG:BT_4712 NR:ns ## KEGG: BT_4712 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 244 5 248 248 496 99.0 1e-139 MVLRFIILICCINLYLVASYAQTLSDELVFEERVFDFGEIKEENGLVSHEFEFKNVSQKV ISINGVTSGCGCVQFEFPKEPLRPNATGKVKVTYNPAYRPGFFSKEVVVLSNNNANYNRI WVKGTVIPCKHPVSENYPYEYGSGLWMNFEVMAFGTIGKGGTKTMKLKYVNDTDNDIQLM FVVIGGNTDLKFTSPRQVKAREEGVMPVEYQYSGSFPTETRVYPVINGRALMKPLKITCT NPVD >gi|226332180|gb|ACIC01000140.1| GENE 9 10838 - 12721 1336 627 aa, chain + ## HITS:1 COG:SP2146 KEGG:ns NR:ns ## COG: SP2146 COG3669 # Protein_GI_number: 15901959 # Func_class: G Carbohydrate transport and metabolism # 
Function: Alpha-L-fucosidase # Organism: Streptococcus pneumoniae TIGR4 # 33 487 9 451 559 286 35.0 8e-77 MKRKPIATSISTVVFSLLLTGCQSVSAPEAILPVPQEKQVNWQKMETYAFVHFGLNTFND REWGYGDSDPKTFNPARLDCEQWVQTFVNSGMKGVILTAKHHDGFCLWPTQLTEYCIRNT PYKDGKGDIVRELSDACKKYGIKFAVYLSPWDRHQANYGTPEYVDYFYKQLHELLTNYGD VFEIWFDGANGGDGWYGGAKDARTIDRKTYYDYPRAYKMIDELQPQAVIFSDGGPGCRWV GNENGFAGATNWSFLRAGEVYPGYPKYRELQYGHADGNQWVAAECDVSIRPGWFYHPEED DKVKTVDQLTDLYYRSVGHNATLLLNFPVDRNGLIHPTDSLNAVSFHQRVQKELADNLLS SAKVSASDERGGQFKVRGVTDGKYDTYWATNDGVTTADLTFTFSQPTKMNRVMIQEYIPL GQRVKSFVVEYKEGDQWLSVKCNEETTTVGYKRLLRFEMIETEELRIRFTDARACLCINE VGAYYAPDATENYTPATSELKSFPFTILGVDTEEAKKCSDKDDQTAALISGKEIMIDLGE NRTIHSFYYLPDQSEYSKGLISSYELSAGITEEAMQVVAQGEFSNIRNNPILQNMYFSPV EARYFKLKATRMVDESDSLGIAEIGCR >gi|226332180|gb|ACIC01000140.1| GENE 10 13079 - 15430 1822 783 aa, chain + ## HITS:1 COG:SSO3032 KEGG:ns NR:ns ## COG: SSO3032 COG1472 # Protein_GI_number: 15899739 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Sulfolobus solfataricus # 39 779 4 733 754 545 41.0 1e-154 MMTKRLYILIVGLAMNVVAFAQSSLLPYKNPVLSVDERVKDLLSRMTLEEKVGQLLCPLG WEMYEIKDDEVHPSEKFKRLMKEKNAGMLWATYRADPWTKKTLENGLNPELAAKAGNALQ KYVIENTRLGIPLFLAEEAPHGHMAIGTTVFPTGIGMAATWSPVLIEEVGNVIAKEIRSQ GAHISYGPVLDLSRDPRWSRVEETFGEDPVLSGRLGAAMVIGLGSGDLSREYATIATLKH FLAYAVPEGGQNGNYASVGTRDLHENFLPPFQEAIDAGALSVMTSYNSIDGIPCTANYYL LTQLLRNEWRFRGFVVSDLYSIEGVHESHFVAPTIEEAAMQVVSAGVDIDLGGNAFMNLT HAVQSGKISEAVIDTAVCRVLRMKFEMGLFEHPYVNPKSATKVVRSEEHIRLAHKVAQSS IVLLKNKNSILPLNKKIKKVAVVGPNADNRYNMLGDYTAPQEDENIKTVLDGVISKLSPS KVEYVRGCAIRDTTVNEIAEVVEAASRSEVIIAVVGGSSARDFKTSYQETGAAIADEKSI SDMECGEGFDRATLTLLGKQQDLLNALKATGKPLIVVYIEGRPLDKVWASEYADALLTAS YPGQEGGYAIADVLFGDYNPAGRLPVSVPRSVGQIPVYYNKKAPCNHDYVEQAASPLYTF GYGLSYTTFEYSDLQVIRKSPCYFEVSFKVKNTGSYDGEEVAQLYLRDEYASVVQPLRQL KCFERFFLKRGEEKEIFFTLTEKDLSIIDRNMARVVETGDFRIMIGASSDDIRLTKDIFV ESQ >gi|226332180|gb|ACIC01000140.1| GENE 11 15536 - 16012 698 158 aa, chain - ## HITS:1 COG:PM0817 KEGG:ns NR:ns ## COG: 
PM0817 COG0783 # Protein_GI_number: 15602682 # Func_class: P Inorganic ion transport and metabolism # Function: DNA-binding ferritin-like protein (oxidative damage protectant) # Organism: Pasteurella multocida # 7 154 5 152 159 155 52.0 4e-38 MKTLEFIKLNESGANNVVASLQQLLADFQVYYTNLRGFHWNIKGHDFFVLHSQFEKMYDD TAEKVDEIAERILMLGGTPANKFSDYLKVANINEVDKVSNGEQALDNILQSISYLIGEER KILSIASQAGDEVTVSMMSDYLKEQEKLVWMLTAYNSK >gi|226332180|gb|ACIC01000140.1| GENE 12 16149 - 17075 900 308 aa, chain + ## HITS:1 COG:HI0571 KEGG:ns NR:ns ## COG: HI0571 COG0583 # Protein_GI_number: 16272514 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Haemophilus influenzae # 1 301 1 296 301 196 36.0 6e-50 MTIQQLEYILAVDQFRHFARAAEYCRVTQPTLSAMIQKLEDELGVKLFDRTVQPVCPTAI GEKIIDQARVILAQTAQVKEIISEEKQSLAGVFHLGVLPTIAPYLLPRFFPQLMEKYPEL DIRVTEMKTQNIQQALHAGDIDAAIIASKLEDTFLKEETLFYETFFGYVSCKEPLFKHDV IRTSDITGERLWLLDEGHCFRDQLVRFCQMETVKVNQMAYHLGSMETFMRMVESGKGITF IPELAVSQLNEEQKKLVRPFAIPRPTRQIVLATNRDFIRHSLLNVLKEEILAAVPKEMQS LQSIQYLL >gi|226332180|gb|ACIC01000140.1| GENE 13 17082 - 18014 985 310 aa, chain - ## HITS:1 COG:no KEGG:BT_4717 NR:ns ## KEGG: BT_4717 # Name: not_defined # Def: integral membrane protein, putative permease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 310 1 310 310 538 100.0 1e-152 MKKGLFYAILASVLWAIVNPFIKQGLSYDFSPMNFAGIRFTTVGIILFAYTWHKGMWKEI RQHSKLFLNLILINMFMGYTAFYFGVDFVSGAISSIIMGMTPLINVLLAHLLASNDRLNV HKIISLIVSLIGLLLIVGMGSNGAPLDWKGITGIVLLLLSIIFQGYSAISVSEDKGKINP IFLNAVQMFFGGLLIYMIGLGTEGYHSFIGKPAGFYISLSILVFISVFAFSFWFIALQSK GAKVSDINMCRLINPILGAILSWIMLPGEYPTFSTVAGMIIIVSSLIIYFKGAEIGRWLK RHTAGSSTQE >gi|226332180|gb|ACIC01000140.1| GENE 14 18136 - 18825 794 229 aa, chain + ## HITS:1 COG:slr2057 KEGG:ns NR:ns ## COG: slr2057 COG0580 # Protein_GI_number: 16330455 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) # Organism: Synechocystis # 1 220 1 233 247 162 49.0 4e-40 
MKKYIAEMIGTMVLVLMGCGSAVFAGSMAGTVGAGVGTVGVALAFGLSVVAMAYAIGGIS GCHINPAITLGVFLSGRMNGKDAGMYMLFQVIGAIIGSAILYALVTTGGHDGPTATGSNG FGDGEMLQAFIAEVVFTFIFVLVVLGSTDPKKGAGAFAGLAIGLSLVLVHIVCIPITGTS VNPARSIGPALFQGGEALSQLWLFIVAPFVGAAVSALVWNYFGDKNEKK >gi|226332180|gb|ACIC01000140.1| GENE 15 18944 - 19666 759 240 aa, chain - ## HITS:1 COG:no KEGG:BT_4719 NR:ns ## KEGG: BT_4719 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 240 1 240 240 430 100.0 1e-119 MKKFKFIAFIIAVMTTMPILQSCLDDDDSSSDSLVISTINLISPDSKDFYFTLDDGKTMF PSNGNGWISDKNKDGQRAFVIFKELEEPVKGYDYNIQVREIKEILTKEIVTMGEGENTEE KIGDDKINSTYMWITQDKKYLTIEFQYYGTHSEDKKHFLNLVINNKEAEPAADDANDEYI DLEFRHNDEGDTADRLGEGYISFKLDKIQDQMKGKKGLRIRVKTLYNGEKFYEVKFPSSN >gi|226332180|gb|ACIC01000140.1| GENE 16 19846 - 20394 403 182 aa, chain + ## HITS:1 COG:PA0762 KEGG:ns NR:ns ## COG: PA0762 COG1595 # Protein_GI_number: 15595959 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 2 175 6 187 193 74 27.0 9e-14 MEEFELSEQCRLGNNRARKELYEHYGGRLLGVCLRYTGDRDTAQDLLHDGFIKIFSSFDK FTWRGEGSLRAWMERVMVNTALQYLRKSDVISQSTPLEEVPEDYEEPDASAVEMIPQAVL MRFIEELPAGYRTVFNLYTFEEKSHKEIAQLLGINEKSSASQLFRAKSVLAKRVKEWIMN NG >gi|226332180|gb|ACIC01000140.1| GENE 17 20403 - 21659 1039 418 aa, chain + ## HITS:1 COG:no KEGG:BT_4721 NR:ns ## KEGG: BT_4721 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 418 1 418 418 765 100.0 0 MEEKELWMNKLKEKLGDYSEPLPASGWEQLEKELMPPVERKIYPYRKWTVAAAAVILLAL GSSVSLYFLGTPAADEIRHAKTPALASVPDVLPDAQQPDMTGTTIEPVVRPVVKNRIAKA ERNIPQPTANIDEPVKKEEQPSELNAQTGDRKEKEEVEPVEETKAIRHKPADTEQPRNKP RRPSSRDKLHIPAEKASSQKGTWSMGLSVGNSGGASTELGSGIPSYMSRVSMVSVSNGLL SIPNDQQLVFEDGVPYLRQANQVVDMEHHQPISFGLSVRKSLAKGFSVETGLTYTLLSSD AKFADSDQKTEQKLHYLGIPLKANWNFLDKKLFTLYVSGGGMIEKCVYGKLGTEKETVKP LQFSVSGAVGAQFNATKRVGIYVEPGVAYFFDDGSDVQTIRKENPFNFNIQAGIRLTY >gi|226332180|gb|ACIC01000140.1| GENE 18 
21840 - 22427 321 195 aa, chain + ## HITS:1 COG:PA2896 KEGG:ns NR:ns ## COG: PA2896 COG1595 # Protein_GI_number: 15598092 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 6 179 13 188 194 62 25.0 5e-10 MVDSINEKRLLTELKNGSFQAFERLYNMYSGKLYNFIMRISSGNQYMAEEVVQSAFIRVW EVRERVEPESSFISFLCTIAKNLLMNMYQRQTVEYVYNEYLKNTGVDRDSQTEESIDLRF LNEYIDSLAEELPAQRKKIFILSKRQNYTNKEIAEMMGISESTVATQLSLAVKFMREQLM KHYDKIVALLFAFFC >gi|226332180|gb|ACIC01000140.1| GENE 19 22501 - 23478 757 325 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 130 319 132 322 331 77 28.0 3e-14 MDKIHYKELIEKYFEGNIADTEIKELSDWIKNDRQLQNWWEQEFTKSDAAIDPILRDKLF ARIKEGTLKHTSPRTKGVRTLPMIPWRWVAAILLPVCIAFFTYYLIDSSQMTSAPFIVKA DKGDKATVELPDGTNVVLNSASQLSYLNNFGEKVRRVQLNGEAYFKVAPDEKHAFIVQVG DLEVKVLGTSFNVSAYEDAKDITVVLLEGKVGIYTQETSRMMKPGDKIEYNKTTHQLVAT QVHPNDYIEWTKGNIYFEKESLENIMKTLSRIYDVEIRFDSNKLPKEYFTGTIPSGGIQN ALNILMLTSPFYYEMDGSVIVLKEK >gi|226332180|gb|ACIC01000140.1| GENE 20 23649 - 27068 2737 1139 aa, chain + ## HITS:1 COG:no KEGG:BT_4724 NR:ns ## KEGG: BT_4724 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1139 1 1139 1139 2174 99.0 0 MENRTPFVKLPQSTAGRSKNWRTLRAICLLLFMSISLTAYSQITVDLKGISLRASLKKIE QVSNYKFFYSESLPELSWKVSLNVRDVTIDQTMTRLLEGMELTYKKEQENVIVLIRKTQS KQLTKKVTGTVVDANGEPIIGASIVIKGESHGTITDFDGKFALPDVPEKAVLTISYIGYK TVNLATTDQTLVKVVLEEDSKMIDEVVVVGYGVQSQKLVTTSISKVKMENIDQGNDYNPI KMLQGRVAGVNISSASGTPGETPNITVRGIGSVSGGSSPLYVVDGIPSEKYPNLNPNDIE SMEVLKDASAAAIYGSRANAGVVLITTKSGQQGKTKIEVSGRYGFASLASDIEMANSTEY MNTMQAAIDNYNVQMGTNLQLYIPSQIQETDWVKEISRKNSKTGTGSISISGGNEKTTFF ASLGANTQEGYLNKSKYDQYNMRAKFSHKINSIFKLNMNLAGSASRSDLLEETSTSLKVL 
RTAREEQPWYSPYKEDGTSYKVNGTDILRHNPLMLINEEDWVAKKYQLSGVFSIDVTPFK GFKYTPTVSAYGILDNVSKKLSDKHDARKNSSGWGALAQQKDQSFRYVIDNVFSYNNEWN KLIYSVMLGHSFEKYTYEQFGAKSDNYANGAYPSSSFDLINAGPNIYAGDISYTSYALES YFGRIALNWDNKYILNASLRSDGSSRFAKNKRYGYFPSASFAWRASNEGFFPKNKYVNDA KLRLSWGMTGSMAGVSNYAPLSLISAGGASYNGSAGFQISQDARALTWEKASQFNIGFDI EMFQSRLTLNVDMFYQKTTDLLFKKPVNASTGYTTLQSNIGSLENKGLELALNGKILTGK FKWDLGGNISFVKNKLLSLIEGSEMYVVPSSGSNLLGGSMHALINGEPISTFYMLKMEGI YQRDDEVPAKLYAKGVRAGDVRYFDYNEDGDITDADRVNVGKAIPDFYGGITSNFSYKGF DLSLFGQFSVGGKVMAAWRGVNGSEGTDHLGLALSNVKVGDRGESVEQFFNISKEVANGY WRGEGTSNSIPRPVRIGVHTGYDYDYNVQTSTRYLEDASYFKLKTVTLGYTLPESITKKI HVNSLRVYVSADNLLTFTKYSGYDPETSFSGSPGDSNYGVDFGLQPVLRTFIFGLNLNF >gi|226332180|gb|ACIC01000140.1| GENE 21 27081 - 28424 1074 447 aa, chain + ## HITS:1 COG:no KEGG:BT_4725 NR:ns ## KEGG: BT_4725 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 447 1 447 447 911 100.0 0 MKRFYMINYLLCGICLLWMTGCSNMLDEMRPKDKIPQDALSESDLTKLLNGVYAEMEELV FKFYMDGDVKGENFKAGPGFSMNDPMSMAPSSKDVLGQWQKCFTALKQVNFLVETYEASS NKDSQVVKQTGGTGYYFRALIYYHLVTRWGGAPILRKRTYDVVPISPEADVWNFIKEDLG KAESLLPEFTDRFYVSLSVCDALNAKVCLALKDYTNAAIYADRVITKSNFALSTTSAEYA NAFISNSNSKELIFALANKRSTGLLLFYQSVNDIDPTWDYSPSADCYSHLYADTSVKKQD IRAKAVFGADNSRIIKFPNGSTGQFVTNEQPSQTPIVVARVAEMYLIKAEALGATNGLST LKEFMNKRYATVSLPSSMSDTEFQNQILDERHRELYAEGQRWYDLKRTNRLDLFTSLNGR NYLMYYPVPQSERDLAGAENYPQNDGY >gi|226332180|gb|ACIC01000140.1| GENE 22 28446 - 30977 1954 843 aa, chain + ## HITS:1 COG:AGl598 KEGG:ns NR:ns ## COG: AGl598 COG0584 # Protein_GI_number: 15890416 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 600 700 41 145 306 87 40.0 8e-17 MKNIKIFATFCICLLLFSACSDDWKENALTAKFSFDKSLYYVGDEVRITNETVGGEGNYT YEWDLGDGKTSTDPNPVVTYQTNGAYTVTLHVKDAKGTYAMAHKLLTIDSEPLPEVGNVK LKWVGGHVLGEVRSTAPAVSDDNGVYMTSNDHYLRKFSAATGDQLWEFDLWTSADGDAPS GNTHTTPSIDIDGTIYVGTGDTSGKVGRVYAINPDGSKKWLVAGDAEKGFWNKGNASTPR 
INYLTCAIGENHVYMGNGGSTGSVLAVDKVTGYRVGYVANADNSGGPSGGVSAGIVLANN TLVWGGGKNGLFGASASALNAGGNVMWAWQVFSSGDDKPSENMNGSPAVDEAGTIYGTAT FAGMGSSAFAMGSDGVEKWRTPLGNVGTLDQGGVVIGLDGSIIVTVKRAPGEATGGIISL SPGGAIQWHYGIAEDVSGCAAIDQAGNIHFGTQSGNYYIIKPEESDEQLILKKDLAALIS ESDSPLKGDWEAGIGKIWSSPTIGPDGAIYIGVTNTVDPTKSVLVALEDEGITGAATSAW PMKGKDRRHSGAQSGGNGENPGGEEGGQLPMTGNLKADLKSLFESTSYKVWLCAHRGNTQ KGMKEGIPENSLPAIEHSVKAGVEMIELDARPTSDGVLVLMHDNTIDRTTNGSGAVGDFT YQQLQQFYLKDASGNITGERIPTLEEAMKKGKGKVYYNLDIVNKNVAVNTIVALLKKLDM EGSTLLYVSNNRNYAFDLKTANSSLLLHPMAKAADDITYFSSSYTDNVQMMQLSTSDAMG GTMVNEIKDQGWLLFSNIVGANDTNMLSENYSGLVGMINKRINIVQTDYAEVAAKYLKSK GYR >gi|226332180|gb|ACIC01000140.1| GENE 23 30992 - 31912 943 306 aa, chain + ## HITS:1 COG:CC3172 KEGG:ns NR:ns ## COG: CC3172 COG0584 # Protein_GI_number: 16127402 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Caulobacter vibrioides # 50 297 21 270 295 130 34.0 4e-30 MKKYIMIGILFAFPFCLSADEPVKAIYAKITNPDNKEITVVAHRADWRYAPENSLAAIES SIRLGADVVELDVQKTKDGQLILMHDKTLDRTTTGKGKVAEWTLDSIRTLYLKNGAALKT KHRVPTLEEALWVAKGRVMVNLDKAYPIFDEIFPILEKTGTVDQIIMKGSKTVADVKNDL GKYLDRIIYMPIIHLDKPGAMKQLDDFMTELHPVAFELLFVSDTCQVPKQVKTKLKGKSK IWYNTLWDTMAGGHDDDKSLENPDEGYGYLIDTLGAAIIQTDRTAYLLEYLQSRKKRSEN GQDAYH >gi|226332180|gb|ACIC01000140.1| GENE 24 31928 - 33286 850 452 aa, chain + ## HITS:1 COG:VCA0707 KEGG:ns NR:ns ## COG: VCA0707 COG2271 # Protein_GI_number: 15601463 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Vibrio cholerae # 5 444 1 439 459 362 43.0 1e-100 MGFKLLDFYKISPPVSGGETDSERSVRFKRIRWATFLSATTGYGIYYVCRLSMNVIRKPI VEDGVFTETQLGIIGSCLFFVYAVGKLTNGFLADRSNVKRFMSTGLLCSALINLCLGFTN SFFAFVLLWGLNGWFQSMGAASGVVSLTRWYSSKERGTFYGFWSASHNLGEALTFISIAL LVSWMGWRYGMIGAGVIGLLGFLMMLAFMRDTPQSQGFLLDRRGTSDAHSVSGKQTEEFN KAQKAVLKNPAIWILALSSAFMYISRYAVNSWGVFYLEAQKGYSTLDASFIISISSVCGI VGTVFSGIISDKFFAGSRNVPALIFGLMNVSALCLFLLVPGVHFWVDALAMVLFGLGIGV 
LICFLGGLMAVDIAPCNASGAALGVVGIASYIGAGLQDVMSGILIEGQKTVQNGVEVYDF TYINWFWIGAALLSVVFALLVWNAGEKRSEIN >gi|226332180|gb|ACIC01000140.1| GENE 25 33404 - 33538 233 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MENSLSDELFCSLKNELREVDAQISVHSEKLLELIDQKKELSSN >gi|226332180|gb|ACIC01000140.1| GENE 26 33556 - 33810 155 84 aa, chain + ## HITS:1 COG:no KEGG:BT_4730 NR:ns ## KEGG: BT_4730 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 84 1 84 84 159 100.0 2e-38 MTPTVIEYRGDPKRYISVILGAIDRGRLTYNGDANCEQTFRSLSSVIDVISPKNGKALSI ETLVSYEKKERAGEFSDYSEGDFK >gi|226332180|gb|ACIC01000140.1| GENE 27 34400 - 34732 222 110 aa, chain + ## HITS:1 COG:no KEGG:BT_4732 NR:ns ## KEGG: BT_4732 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 110 1 110 110 167 100.0 1e-40 MKREIIAYKGYFKEFFENLDAGTQDKILYVLMLLQTQDRIPLKFMRLIEEGLYELRIEYQ SNIYRIFFCFDEGRIVILFNGFQKKTEKTPKKEIDKAKILRKEYYGSKNK >gi|226332180|gb|ACIC01000140.1| GENE 28 34713 - 35018 371 101 aa, chain + ## HITS:1 COG:no KEGG:BT_4733 NR:ns ## KEGG: BT_4733 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 101 1 101 101 166 100.0 2e-40 MEAKTNEEFFNVSALIDERFGKEGTASRAEAEEKAYAFYTGQIIEDARKKAKITQAELAR RIGSDRSYISRVESGQTEPKVSTFYRIMNALGCKIEFSMIL >gi|226332180|gb|ACIC01000140.1| GENE 29 35088 - 37202 1435 704 aa, chain - ## HITS:1 COG:all8519 KEGG:ns NR:ns ## COG: all8519 COG5545 # Protein_GI_number: 17232892 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Nostoc sp. 
PCC 7120 # 390 627 353 592 836 69 25.0 2e-11 MKHFTIFQGFYYAIAEMTEEEIVSTIGSFTYREKVEEIRRIFAEQGEKAANEKKKELPAI AFSASYRGRRTKVNLVKYLGHIVIDIDHLSKEELARILPIIKRCDYTRIAFISPKGMGVK IIVRACHPDETLPETLQEIEDFHHAAYTRLVSFYTELCRIEIDTSGQDVARTCLFSYDPE IYFNPNADAFLVDQPQASYKISNRKNLSGSKQQTPPDGTPTNEDTALNAHSANASLVLTL TYYHNKSEKYIAGNRNNYLHHLSCTFNRYGIPQEETSAFIKSQFTDFPADETVSLINSAY AHTDEFNTCKLNGTQKRILRIEQYISEHYETRYNEVLHIMEYRRRRPDTEKPEPFRILDE MMENSIWIEINELGYPCTVKTIENLIYSDFSQSYHPIREYFELLPEWDGTDYIRILADSV QTSHQEFWAECLERYLVAMCAAATQENIVNHTVLLLCSEIQNIGKTTFINNLLPPELRAY LSTGLINSNNKDDLAKITQAMLINLDEFEGMSGRELNAFKDLVTRKVISFRLPYARRSQN FPHTASFAGTCNYQEVLHDTTGNRRFLCFHPYSIQFITINYAQLYAQIKYLLNKPGYQYW FTQADNERLEENNEAFIFHSPEEEMVLTHIRKPERFEKVYYLTVSEIAELIRERTGYQYS IGSKIQLGKVMTKHHFESKKGNNGRRYAVFIIDIEQVKSNRLYE >gi|226332180|gb|ACIC01000140.1| GENE 30 37289 - 37519 248 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253571672|ref|ZP_04849078.1| ## NR: gi|253571672|ref|ZP_04849078.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 76 1 76 76 143 100.0 3e-33 MNYKEDSFLIRTYTKAELAHLYNPHVCLKVALQILRRWIIYNLPLLHELEQEGYRARNRL LSPKQVATIIHFLGEP >gi|226332180|gb|ACIC01000140.1| GENE 31 37713 - 38216 619 167 aa, chain + ## HITS:1 COG:no KEGG:BT_4735 NR:ns ## KEGG: BT_4735 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 167 1 167 167 278 100.0 7e-74 MAMTVSYSVVPRKNPAKKSEPAKYYAQAQASGELDFEELCEAITSRSTCTETDVRAAISG ILYEVKRALKAGRIARLGDLGSLQIGLNSEGAASVKEFSGSMIKGAHLIFRPGKTLAELM KILSYQQVLTRAVAQAGAGDGGGEDPDKNPDSGGDGSGDEEAPDPTV >gi|226332180|gb|ACIC01000140.1| GENE 32 38276 - 38443 196 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253571674|ref|ZP_04849080.1| ## NR: gi|253571674|ref|ZP_04849080.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 55 1 55 55 78 100.0 1e-13 MKEIVNRILDVIMYLIPFFGKRKRDKVVREVRYHTTCKEVCKVKTTEREKEHEKD >gi|226332180|gb|ACIC01000140.1| GENE 33 38430 - 38879 385 149 aa, chain + ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 47 142 3 98 116 90 41.0 1e-18 MRKISLIVIHCSATRVDCDFTAKDVDTAHRYRGFSCWGYHYYIRKSGEIEPMRDEDTVGA HARGYNAISLGVCYEGGLDENGKAADTRTPRQKEALHRLVHELLQRYPEAKVVGHRDLSP DTNYNGIVDPWERIKECPCFEVIGEFSIN >gi|226332180|gb|ACIC01000140.1| GENE 34 38872 - 39570 682 232 aa, chain - ## HITS:1 COG:VC1869 KEGG:ns NR:ns ## COG: VC1869 COG1180 # Protein_GI_number: 15641871 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Vibrio cholerae # 2 229 14 244 246 207 44.0 1e-53 MGTFDGPGLRLVVFLQGCNFRCLYCANPDTIAGKGGTPTPPEEIVRMAMSQRPFFGKRGG ITFSGGEPTFQAKALVPLVRELKEKGIHVCLDSNGGLWNEDVEELFKLTDLVLLDIKEFN PNRHQTLTGRSNEQTIRTAAWLEEQGKPFWLRYVLVPGYSDFEEDIRALGEALGKYKMIQ RVEILPYHTLGVHKYEAMGQEYKMKGVKENTPEQLEKAAEVFKEYFTTVVVN >gi|226332180|gb|ACIC01000140.1| GENE 35 39614 - 41842 2680 742 aa, chain - ## HITS:1 COG:CAC0980 KEGG:ns NR:ns ## COG: CAC0980 COG1882 # Protein_GI_number: 15894267 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Clostridium acetobutylicum # 7 742 8 743 743 936 61.0 0 MELNKIFKDGLWSSEINVRDFVSHNITPYYGDASFLEGPTERTKAVWNRCLEALAEEREN NGVRSLDNVTVSTITSHKAGYIDKENELIVGLQTDELLKRAIKPFGGINVVSKACHENGV EVDDRVKDIFTHYRKTHNDGVFDVYTEEIRSFRSLGFLTGLPDNYARGRIIGDYRRMALY GIDRLIEAKKEDLHNLTGPMTDARIRLREEVAEQIKALKDMKVMGEYYGLDLSRPAYTAQ EAVQWVYMAYLAAVKEQDGAAMSLGNVSSFLDIYLEYELSKGTITESFAQELIDQFVIKL RMVRHLRMQSYNDIFAGDPTWVTESLGGRLNDGRTKVTKTSFRFLQTLYNLGPSPEPNLT VLWSPELPEGFKEFCAKVSIDTSSIQYENDDLMREVRQSDDYGIACCVSYQEIGKQIQFF GARCNLAKALLLAINGGRCENTGTVMVKNIPVLTSDTLKFEEVMDNYKKVLIEIARVYNE 
AMNIIHYMHDKYYYEKAQMALVDTNPRINLAYGVAGLSIALDSLSAIKYAKVTARRNDIG LTEGFDIEGEFPCFGNDNDKVDHLGVDLVYFFSEELKKLPVYKNARPTLSLLTITSNVMY GKKTGATPDGRAKGVAFAPGANPMHGRDKNGAIASLSSVAKLRYRDSQDGISNTFSIVPK SLGATDEDRIENLVTMMDGYFTKGAHHLNVNVLNRDMLYDAMEHPENYPQLTIRVSGYAV NFVKLSREHQLEVISRSFHERM >gi|226332180|gb|ACIC01000140.1| GENE 36 42539 - 43546 884 335 aa, chain + ## HITS:1 COG:aq_1099 KEGG:ns NR:ns ## COG: aq_1099 COG0332 # Protein_GI_number: 15606369 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Aquifex aeolicus # 8 327 5 309 309 275 42.0 6e-74 MDKINAVITGVGGYVPDYVLTNDEISKMVDTTDEWIMGRIGIKERHILNEEGLGTSYMAR KAAKQLMQRTKSRPDDIDLVIVATTTSDYRFPSTASILCERLGLKNAFAFDMQAVCSGFL YAMETGANFIRSGKYKKIIIVGADKMSSVIDYTDRATCPIFGDGAAAFMLEPTTEEVGIM DSVLRTDGKGLPFLHIKAGGSVCPPSYYSLDHHLHYIYQEGRTVFKYAVANMSDSCEAII ARNHLTKEEVDWVIPHQANQRIITAVAQRLEVPSEKVMVNIERYGNTSAGTLPLCIWDFE KKLKKGDNLIFTAFGAGFAWGAVYVKWGYAPKEDA >gi|226332180|gb|ACIC01000140.1| GENE 37 43597 - 44073 513 158 aa, chain - ## HITS:1 COG:TM0012 KEGG:ns NR:ns ## COG: TM0012 COG1905 # Protein_GI_number: 15642787 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 24 kD subunit # Organism: Thermotoga maritima # 28 156 42 171 176 144 52.0 9e-35 MSDIKLACDVVEQVKTICDKHGNNAGELINILHEAQHLHGYLPEEMQRIIASKLRIPVSK VYGVVTFYTFFTMTPKGKHPISVCMGTACYVRGSEKLLEEFKRVLGIEVGETTPDGKYSL DCLRCVGACGLAPVVMIGEKVYGRLQPVDVKKIIEELE >gi|226332180|gb|ACIC01000140.1| GENE 38 44092 - 45858 1667 588 aa, chain - ## HITS:1 COG:TM0201_2 KEGG:ns NR:ns ## COG: TM0201_2 COG4624 # Protein_GI_number: 15642974 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 222 586 5 364 372 381 53.0 1e-105 MEEKQITLQIDGHFITVPEGSTILEAACKIGINIPTLCHIDLKGTCIKNNPASCRICVVE VAGRRNLAPACATRCTEGMVVKTSTLRVMNARKVVAELILSDHPNDCLTCPKCGNCELQT LALRFNIREMPFNGGELSPRKREVTSSIVRNMDKCIFCRRCESVCNDVQTVGALGAIRRG 
FNTTIAPAFDRMMKDSECTYCGQCVAVCPVGALTERDYTNRLLDDLADPDKIVIVQTAPA VRAALGEEFGLPPGTLVTGKMVYALRELGFDYVFDTDFAADLTIMEEGSEILNRLTRYLD GDKSVRLPILTSCCPAWVNFFEHHFPDMLDIPSTARSPQQMFGSIAKSYWAEKMGIPREK LVVVSIMPCLAKKYECDRDEFKVNGVPDVDYSISTRELARLIKRANIGFTLVLDSPFDNP MGESTGAGVIFGTTGGVMEAALRSVYEIYTGQPLKNVNFEQVRGLSGVRRATIDLNGFEL KVGIAHGLGNARHLLEDIRNGHNEYHVIEIMACPGGCIGGGGQPLHHGNSDVLYARANAL YREDANKPLRKSHDNPYIQKLYEEYLGKPLGEKSEMLLHTHYFNKSID >gi|226332180|gb|ACIC01000140.1| GENE 39 45870 - 47777 1568 635 aa, chain - ## HITS:1 COG:TM0010_1 KEGG:ns NR:ns ## COG: TM0010_1 COG1894 # Protein_GI_number: 15642785 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit # Organism: Thermotoga maritima # 46 566 8 527 527 680 61.0 0 MKILSIHDLQIIRKRAEHHLSLREESNEKVSEKCCGLASGTTHLQILICGGTGCKASSSQ GITENLQKAIERNGITDKVDVITVGCFGFCEKGPIVKIIPDNTFYTQVTPEDAEEIISEH IIGGRRIERLLYIDPKTEQTVSDSKHMDFYRKQLRIALRNCGFIDSENIEEYIAREGYFA LADCLLNKQPADVIDIIKRSGLRGRGGGGFPTGLKWEFASKQVSNVKYVVCNADEGDPGA FMDRSIMEGDPHSIVEAMCICGYSIGSSKGLVYIRAEYPLAINRLKKAIEQAREYGLLGD HILGTDFSFDIEIRYGAGAFVCGEETALIHSMEGKRGEPTLKPPFPAESGYLGKPTNVNN VETLANIPIILTKGAEWFASIGTERSKGTKVFALAGKINNVGLIEVPMGTTLREVIYEIG GGIKGGKKFKAVQTGGPSGGCLTEKHLDTPIDFDNLLAAGSMMGSGGMIVMDEDDCMVSV ARFYLDFTVEESCGKCTPCRIGNKRLLELLNKITEGRGTEKDLDTLATLGRVIKDTALCG LGQTSPNPVLSTLDNFRDEYLEHVRDKTCRAKQCKSLLTYTINPELCIGCHLCAKNCPAD AISGLVRKPHVIHPEKCIKCGMCMARCKFKAILVC >gi|226332180|gb|ACIC01000140.1| GENE 40 48358 - 51114 2366 918 aa, chain + ## HITS:1 COG:no KEGG:BT_0126 NR:ns ## KEGG: BT_0126 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 918 1 918 918 1919 99.0 0 MRRILLGLCVLFFLNAHGQEIPLPEKMPQDHPRVLTTPEGKKETWKLIKKEAWAQDVFNK LKERTEVYTRRTESQPDWLLSRLAMYWKSHATEVYVKGEVFDHAGGAKAPAPTVRYTGTR GTAATHGRPKLEDVVPYDDSAEGNVTFCNNALKGRPLESVHPSKTGRNIESLNCEILGIA RDAAFLYWMTGEEKYARLAAGVFDTYMTGIYYRNVPVDLNHGHQQTLVGLTSFEVIHEDA LHIVVPLYDFLYHYLQSNYPDKMMIYAGALKKWADNIIANGVPHNNWDLLQARYIMNVGL 
VLEDNKEYADGKGREYYIDYVMNRSSIRQWSLTKLADYGFDSETGIWAECPGYSSVVIND YANFTHQFDHNLQYDLVKAMPVLAKAVATTPQYLFPNRMICGFGDTHPSYLSTNFFIRMI QNAQANGKKEQERYFTALLKCLNPEEGSEKSGKKNVRASVNSFFEDKPLVLDPKVEAGKI EDYVSPLFYAPNVSWLVQRNGMHLRNSLMISLNASEGNHMHANGISMELYGKGYVLGPDA GIGLFLYSGLDYAEYYSQFPSHNTVCVDGISSYPVMKSNHSFDLLSCFPASAEPGKGFTS VTYSQVAFREPESRADQTRLMGIVTTGPETGYYVDVFRSRKERGGDKMHDYFYHNLGQTM TLTAADGTDLNLQPTEELAFAGAHLYAYSYLYDKKVATTGQDVKVTFTIDMKDKGGDDIS MNLWMKGEPEREVFTALSPMTEGLSRTPHMPYNIKEQPTLTFVARQHGEAWNRPFVAVYE PSTQKEPSAIQSVSYFDAEEPGLKDFAGICVESKNGRTDHIFSLTDSSQTATYQGMKVKA DYAVISNEYAGNRTLFIGNGTQLIASGISIQTSEAANVLLEKKQGKWYILSSAPCKMVID GKAVQSGITTELTLLAVQ >gi|226332180|gb|ACIC01000140.1| GENE 41 51259 - 54552 1996 1097 aa, chain + ## HITS:1 COG:no KEGG:BT_0127 NR:ns ## KEGG: BT_0127 # Name: not_defined # Def: putative transmembrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1097 1 1097 1097 2234 100.0 0 MKKYWFLLLAALLGGATCIFAKDTLATWKAPAGVALNSDFTVKVRLQDGVWHTLSSYLIK VDEVRDTRHYVENASMAIFDFTGKVEVAVTYNLGEVQTAKVRPLSYDIPFQIDGNTVTFT LEHPRNLSVEVNGDIFHNFHLFTGSPERTIPDKDNPEVIYFGPGIHTVKNGELRVPSGKT VYLAGGAVLMGRVLIENVHDVKLLGRGIIDHSIKGGIRIANSRDVYVEGIVATQCATGGS ENVTIRNVKSISYYGWGDGMNVFASNNVLFDGVFCRNSDDCTTVYGTRLGFEGGCRNITM QNSTLWADVAHPIFIGIHGNSKAPEVLEDLNYINIDILDHREKQVDYQGCMAINAGDNNL IRNVHFEDIRVENFRQGQLVNLRIFYNEKYCTAPGRGIENVLFKNISYTGENAELSIIEG YDEKRKVKNIRFENLKINGKLIDDNMPDKPRWYKTSDMARIYVGPHVENIVFTSDVAQSQ RRFVHPGITYTQGDLDRMKAMVEARQEPYYSTFLKLKESSYSSLDAPVVNRGEQIKEGRF NATIGVDGRRAHDLALLWHLTGEEAYARKAVEYLNANSYYTNTSSRGTGPLDNGKIYLLI DAAEMMRDYSGWTRQDQQRFKDMLVYPGYSNTENYSAKYANYLDDTKNGVTFYWNIYNFD AARFGNQGLFAARSMMAMAIYLDNEIMYDRAYRYLLGMKHRKDDLPYPSGPAISSDQPIH VSPTMIDYKLLQRKNDIQDYGYDEQLQYYIYPNGQCQESSRDQGHVLAGLHNYVAIAEMA WNQGDSLYSSLDNRLLLGLEWSYRYNLSSIQSYKKQETPWEPTGLTKDMNEVTFDNGKYL QIKSRSGRWESVNISSHGRGDVAGTGGTREMALAHYAVRSGLPAEKYTWLQRYRDYMIER YGCENWGVAPNWFYEWTGWGTLTKRLTPWMAGDPVTFSTGKRVSGLHQLPSTILAADYDY YCISENPEGHTYHNIGTVRGNEYRPDGAVELQKIDNKYVVVQVEDGEWMNYTVNIPKSGA 
YAVYLTYSANSSSHVAMASDQGLEISSSIPSSKKWKETKLGELSLSAGACVLRLRVDKAG QKLCLSAFRLEKVERDR >gi|226332180|gb|ACIC01000140.1| GENE 42 54763 - 55134 404 123 aa, chain + ## HITS:1 COG:no KEGG:BT_0128 NR:ns ## KEGG: BT_0128 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 123 10 132 132 232 100.0 4e-60 MKLLIFISLFGGLLLASCGSSSSQKNASTQEVNPADTIYLGDLREKFEGDSIFFKVVAPD LMLMDYQYFWAATESEAVEKGLTKEYYKRVKKEISETNEAIKKGVMKGADVKRIPDFQEG QKK >gi|226332180|gb|ACIC01000140.1| GENE 43 55240 - 56427 965 395 aa, chain - ## HITS:1 COG:uxuA KEGG:ns NR:ns ## COG: uxuA COG1312 # Protein_GI_number: 16132143 # Func_class: G Carbohydrate transport and metabolism # Function: D-mannonate dehydratase # Organism: Escherichia coli K12 # 2 381 1 383 394 356 46.0 5e-98 MMEKTWRWFGKKDKITLPMLRQIGVEGIVTALHDIPNGEIWTIEAINALKSYIESYGLRW SVVESLPVCEAIKYGGTEREQLIENYKVSLTNLGKCGIKTVCYNFMPVIDWIRTDLQYLC PDGTSSLYYDRIRFAYFDMKILEREGAEKDYTEEELHKVSELDQVITEKEKDDLIDTIIV KTQGFVNGNIKEGDKEPVVLFKRLLTLYKDINRDILRENMCHFLSAIMPVCEEYGVNMCV HPDDPPFQVLGLPRIVTNEEDIAWFLNAVDNPHNGLTFCAGSLSAGEHNDTRELAKKFAK RTHFIHLRSTAAMPGGNFIESSHLAGRGHIIDLIRIFEKENPGLPMRIDHGRMMLGDEDK GYNPGYSFYGRMLALAQVEGMMTVVDDEIKRQMKL >gi|226332180|gb|ACIC01000140.1| GENE 44 56440 - 57255 204 271 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 10 263 4 238 242 83 28 4e-15 MNELFSIAGKVAVITGAGGVLGGNIAQHFVQQGAKVVAIDIRQEQLDNRVAELKQYGQDV IGIIGDVLDIASLEKVAEEIVAQWGQIDILLNIAGGNMPGATLESDQNFYDMDISCWEKV TSLNMNGTVYPSMIFGKVMAEQKKGCIINVSSMAAYSAITRVPGYSAAKTAVANFTQWLA SEMALKYGDGIRVNAIAPGFFIGDQNRRVLINPDGSLTDRSKKVLAKTPMNRFGDIKELN GAVQFLCSEAASFVTGAMLPIDGGFSAFSGV >gi|226332180|gb|ACIC01000140.1| GENE 45 57483 - 59417 1336 644 aa, chain + ## HITS:1 COG:no KEGG:BT_0132 NR:ns ## KEGG: BT_0132 # Name: not_defined # Def: alpha-glucosidase, putative # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 644 1 644 644 1330 100.0 0 
MKKSILLLLLVIPTLTKAQNVMTETRRELTSPDGAYRFTFYQRSFGEDNARMYYTLIYKN RPVVEEGELGVQIENQLFESALGVPNDTCHFWCENLKLTDTDHRKNDATWKPVYGERAEV RDCYNEMTLKFRKGEGQGMTEGGYDKRKNYFMNIIVRAYNEGVAFRYHFPETTNGLFLHI TGERTSFTMPEGTMAYYERWAQGPYELRPLSGWGKEESERPLTMKLPDGLTVALLEAEMV DYARGKFRLSAEKPSTLETSLYSSVDIISPYSTPWRVIMAAERPVDLINHNDLALNLNTP CRISDISWIKPGKVFRSGDLKQEKVKAAIDFAAERGIQYVHMDAGWYGPEMKMSSDATTV SPDKDLDISALCKYAESRGIGLMVYVNQRALVQQLDSLLPLYKKWGLKGIKFGFVQIGNQ HWSTWLHDAVRKCAEYEMMVDIHDEYRPTGFSRTYPNLMTQEGIRGNEEMPDATHNTILP YTRFLAGAGDYTLCYFNSRVKNTKAHQLAMAAVYYSPLQFMFWYDNPAMYKGEEELEFWK AIPTVWDESRALDGEIGEYIVQARRSGKEWFVGAMTNTEARTITLTTDFLKPGTKYIVNL YEDDDKLNTRTKVRTTHKKIKAGDKLTLKLKSSGGAALHFTLAE >gi|226332180|gb|ACIC01000140.1| GENE 46 59401 - 60234 720 277 aa, chain - ## HITS:1 COG:YPO2925 KEGG:ns NR:ns ## COG: YPO2925 COG2103 # Protein_GI_number: 16123112 # Func_class: R General function prediction only # Function: Predicted sugar phosphate isomerase # Organism: Yersinia pestis # 15 276 17 279 295 244 51.0 1e-64 MEFIKITEQPSLYDDLDKKSVKEILEDINTEDHKVADAVQKAIPQIEKLVTLIIPRVKKG GRIFYMGAGTSGRLGVLDASEIPPTFGMPPTVVIGLIAGGDTALRNPVENAEDDMSRGWE ELLQHHINSEDTVIGIAASGTTPYVIGAMRTAREHGILTGCITSNPNSPMATEADVPIEV IVGPEYVTGSSRMKSGTAQKMILNMISTTIMIELGRVQGNKMVNMQLSNQKLIDRGTRMI IEELHLDYEKAEALLLLHGSVKSAIEAYRRHNSTQQE >gi|226332180|gb|ACIC01000140.1| GENE 47 60245 - 60628 331 127 aa, chain - ## HITS:1 COG:no KEGG:BT_0134 NR:ns ## KEGG: BT_0134 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 127 1 127 127 256 100.0 2e-67 MQERHLYEYATIRFVPKVEREEFINVGIVLFSKRCKYLKSLYTIDENKLKLFSSELDMNC LKEGLHVFDRICSGSKEGGMIASMDVPDRFRWLTAVKSSCIQVSRPHPGFSEDLDATLER LFKELVL >gi|226332180|gb|ACIC01000140.1| GENE 48 60606 - 61388 668 260 aa, chain - ## HITS:1 COG:no KEGG:BT_0135 NR:ns ## KEGG: BT_0135 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 260 1 260 260 518 100.0 1e-146 MDLRTANVTRYILPLREGGSLPALAEADDEFKYVVKFRGAGHGTKALIAELIGGEIARTL 
GFRVPEIIFLNLDEAFGRTEADEEIQDLLQWSRGLNIGLHFLSGALTFDPVVHQVDGKTA SQIVWMDAFLTNVDRTIKNTNMLMWHKELWLIDHGASLYFQHSWTNWQKQALTPFVQIKD HVLLPFADQLKETDMAFRQLLTSDKIWEIVNTVPDDWLNWTEGQETPQDLRDIYIQFLEE RIKHSELFVNEAQHARKALI >gi|226332180|gb|ACIC01000140.1| GENE 49 61541 - 63151 977 536 aa, chain + ## HITS:1 COG:no KEGG:BT_0136 NR:ns ## KEGG: BT_0136 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 536 1 536 536 1132 99.0 0 MKLLRFLVFGILVFMLWPECIAAEWQWSVQIPGIISNETDDHPQAFLWIPSDCAQVRGVM IGTHNMTEETLFENLLFREKMSEMGMGLIWITPGWDQKWDVSAGSQRAFEQMLEDFAAIS GYNELKYVPIVPFGHSAMATYPWNFAAWNPERTLAVISFHGDAPRTNLTGYGRDNMEWGK RTIDGIPGLMIEGEYEWWEDRVNPALVFRLMYPRSCISFLCDTGRGHFDIADRTAVYLAL FLKKAMEYRLPETYDVDKPVMLKKLNPENGWLAERWHPDQKRRAKAAPFKQYKGDPHDAF WYFDKEMADMTEERYRQERGKKPQYLGFVQENSLLAYHPKSHVKVAARFLPEEDGLTFHL KAVFTDSLHTTISDEHSSTFPEITRICGPVQKVNDTTFTVRFYRMGMYNKRRTGDICLLA SHDGDNQYKSTVQELSFRIPYRNTEGKRQYILFPGIGDVKAGVESVTLRATSDCGLPVYY YVKEGPAEIDGNKLVFTPIPPRSKYPLKVTVVAWQYGLAGKVQTAEPIERSFFIYQ >gi|226332180|gb|ACIC01000140.1| GENE 50 63256 - 65181 1565 641 aa, chain + ## HITS:1 COG:BH1877 KEGG:ns NR:ns ## COG: BH1877 COG3533 # Protein_GI_number: 15614440 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 49 639 4 591 758 395 37.0 1e-109 MIMNKKNLVVIAWLAVSTMMSAQSVYPGQHQGKMKKETVAPIRVQSFDLKDVRLLASRFR DNMLRDSAWMTSLDVNRLLHSFRTNAGVFAGREGGYMTVKKLGGWESLDCELRGHTTGHL LSAYALMYAATGSEIFKLKGDSLVNGLTEVQNALKGGYLSAYPEELINRNIQGKSVWAPW YTLHKLYSGLIDQYLYADNQQALSVVTKMGDWAYNKLKPLSEETRRLMIRNEFGGINESF YNLYAITGDERYRWLAEYFYHNDVIDPLKELRDDLGTKHTNTFIPKVIAEARNYELTQNE TSKKLSEFFWHTMIDHHTFAPGCSSDKEHFFDPKKCSKHLTGYTGETCCTYNMLKLSRHL FCWTGDSSIADYYERALYNHILGQQDPETGMVTYFLPLLSGSHKLYSTKENSFWCCVGSG FENHAKYGEAIYYHNDKGIYVNLFIPSQVTWKEKGLTLLQETDFPKEETTRLTLRAEKPR HTTIYLRYPSWSKNVKVLVNGKKVSVKQKPGSYIAITREWKDGDRIAATYPMQIELEATP DNPNKVALLYGPLVLAGERGTEGMQAPAPFSDPALYNDYYTYNFHVPADLRTSLKIDVKH PERTLHRVGKDLKFTTEQGDVIRPLYDLHHQRYVVYWDLQD 
>gi|226332180|gb|ACIC01000140.1| GENE 51 65403 - 68150 1967 915 aa, chain + ## HITS:1 COG:PA3423 KEGG:ns NR:ns ## COG: PA3423 COG2207 # Protein_GI_number: 15598619 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 793 908 130 243 247 72 31.0 3e-12 MCWVCGENLQAQPYMFRNVVMSDGLSGLLVNAIYKDSEGFIWLGTDNGLDRFDGVKVKHF EFRGVDSGRKKRVNCITETDNKQLWIGNGIGLWRLNRSGSELQRIVPEKIDCAVNALLAD GDVLYIGTERGLFIQKDGQLLQVQTDKNMLAACNRIMDLCLNEDKSVLWLATVQGLFSYS LKDGQINSWHFRENVPEADYFRCLTRIGETLYLGTMSQGVVCFDIPKQTFAHTVSLGCDV ISDISGDGKETVYIATDGNGVHFLSHKDRKVVRRFFHDVNDKEGIRSNSIYSLLVDDRGA VWVGHFQAGLDYSLYQNGLFRTYAYLPQFNSANLSIRSFVNRGPEKVIGSRDGLFYINEA TGIVKSFVKPVLTSDLILTICFYEGEYYIGTYGGGMMVLNPQTLSLKYFTQGDTELFQKG HIFCVKPDNRGNLWIGTSQGLFCYNGQTKQIKNFTSTNSQLPEGNVYEVSFDSTGKGWIA TETGMCIYDPASQSLRSNVFPEGFVHKDKVRTIYEDAEHNLYFIREKGSLFTSTLTMDHF RNQSVLSTLPDNSLMSITEDNQGWLWVGCNDGLLRIKEEGEEYDSFTFNDGVPGPTFTNG AAYKDERGLLWFGNTKGLIYVDPKQVDEVRGKVRPIVFTDILANGVSINKSSLKYNQNNL TFCFTDFAYGLPSALLYEYKLEGADSDWKLLTAQSEVSYYGLSSGTYTFCVRIPGNEQSM AIYKVTVQPMIPWWGWVILAISIILIVVLVRYYVRKSAPLVTEVHEISSEEEEEEEKTAE PHPAEEKYKANRMSEAECKELHDRLVAYVEKEKPYINPDLKMGELAAALHTSSHSLSYLL NQYLNQSYYDFINEYRVAQFKKMVEDSDYSRYTLTALAELCGFSSRASFFRSFKKSTGVT PNEYIRSIGGTAKDE >gi|226332180|gb|ACIC01000140.1| GENE 52 68898 - 69413 506 171 aa, chain + ## HITS:1 COG:PA1300 KEGG:ns NR:ns ## COG: PA1300 COG1595 # Protein_GI_number: 15596497 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 12 156 9 158 175 63 25.0 1e-10 METTAYTSDNIITRSYEEYHQVILNYITYRIAHRYEAEDLTQDVFVRLMDYKQMLRPDTV KYFLFTIARNLVTDYIRRYYKKQEIDSYLYDFTVTSSNDTEEKIIADDLMAMERTRLAAM PEQRRLIYTLNRFEDKSSPEIASELELSCRTVENHLFLGRRDMRDFFRNCI Prediction of potential genes in microbial genomes Time: Thu May 12 03:17:58 2011 Seq name: gi|226332179|gb|ACIC01000141.1| Bacteroides sp. 
1_1_6 cont1.141, whole genome shotgun sequence
Length of sequence - 37623 bp
Number of predicted genes - 28, with homology - 27
Number of transcription units - 14, operones - 10 average op.length - 2.4
N Tu/Op Conserved S Start End Score pairs(N/Pv)
+ Prom 10 - 69 3.1
1 1 Op 1 . + CDS 127 - 3357 2486 ## BT_0140 hypothetical protein
2 1 Op 2 . + CDS 3368 - 5107 1123 ## BT_0141 hypothetical protein
3 1 Op 3 . + CDS 5137 - 6522 998 ## COG5492 Bacterial surface proteins containing Ig-like domains
4 1 Op 4 . + CDS 6532 - 7908 1035 ## BT_0143 putative transmembrane protein
+ Term 7931 - 7976 9.7
+ Prom 7992 - 8051 4.8
5 2 Op 1 . + CDS 8113 - 9762 1356 ## COG3507 Beta-xylosidase
6 2 Op 2 . + CDS 9823 - 11043 1081 ## BT_0146 unsaturated glucuronyl hydrolase
+ Term 11085 - 11131 10.1
- Term 11073 - 11118 13.7
7 3 Op 1 . - CDS 11143 - 11619 519 ## BT_0147 hypothetical protein
8 3 Op 2 . - CDS 11679 - 12968 1524 ## COG1253 Hemolysins and related proteins containing CBS domains
- Prom 12989 - 13048 4.7
- Term 13097 - 13163 11.7
9 4 Op 1 . - CDS 13187 - 13969 815 ## COG0501 Zn-dependent protease with chaperone function
10 4 Op 2 . - CDS 14008 - 16386 1964 ## BT_0150 putative ferric aerobactin receptor
- Prom 16457 - 16516 6.6
+ Prom 16404 - 16463 4.1
11 5 Op 1 . + CDS 16484 - 16918 368 ## BT_0151 hypothetical protein
12 5 Op 2 . + CDS 16945 - 17763 710 ## COG0627 Predicted esterase
+ Term 17847 - 17886 6.1
13 6 Op 1 . - CDS 18128 - 19003 438 ## BT_0153 hypothetical protein
14 6 Op 2 . - CDS 19010 - 20464 1190 ## BT_0154 putative periplasmic protease
- Prom 20547 - 20606 2.6
+ Prom 20584 - 20643 5.8
15 7 Tu 1 . + CDS 20791 - 21861 862 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain
+ Term 21969 - 22011 6.3
16 8 Tu 1 . - CDS 21891 - 22553 528 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain
- Prom 22575 - 22634 3.0
- Term 22560 - 22613 8.4
17 9 Op 1 . - CDS 22639 - 23229 653 ## BT_0157 hypothetical protein
18 9 Op 2 . - CDS 23297 - 24724 1137 ## COG1757 Na+/H+ antiporter
19 10 Tu 1 . + CDS 24684 - 24848 103 ##
+ Term 24902 - 24936 -0.4
20 11 Tu 1 . - CDS 24819 - 25691 935 ## COG1814 Uncharacterized membrane protein
- Prom 25713 - 25772 3.4
21 12 Op 1 . - CDS 25813 - 26247 344 ## BT_0160 hypothetical protein
22 12 Op 2 . - CDS 26295 - 27470 674 ## COG0477 Permeases of the major facilitator superfamily
23 12 Op 3 7/0.000 - CDS 27490 - 29808 1536 ## COG4953 Membrane carboxypeptidase/penicillin-binding protein PbpC
24 12 Op 4 . - CDS 29854 - 35502 5056 ## COG2373 Large extracellular alpha-helical protein
- Prom 35525 - 35584 6.0
+ Prom 35469 - 35528 4.0
25 13 Op 1 . + CDS 35599 - 36000 307 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1
26 13 Op 2 . + CDS 36016 - 36573 391 ## COG0526 Thiol-disulfide isomerase and thioredoxins
+ Term 36576 - 36610 2.2
27 14 Op 1 . - CDS 36703 - 37041 176 ## BT_0167 hypothetical protein
28 14 Op 2 .
- CDS 37034 - 37333 307 ## BT_0168 hypothetical protein - Prom 37377 - 37436 5.2 Predicted protein(s) >gi|226332179|gb|ACIC01000141.1| GENE 1 127 - 3357 2486 1076 aa, chain + ## HITS:1 COG:no KEGG:BT_0140 NR:ns ## KEGG: BT_0140 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1076 22 1097 1097 2106 99.0 0 MKRRGLIFNSQPNKAVIVALGLMLMLCPASRTWAYANVNDSRTIAQSQKQVKGQVVDATG EPVIGASILEKGTTNGVISDIDGNFSLNVSSPNAVIVISYIGFKSMELPASDPKLRKIIM KEDTEVLDEVVVVGYGTQKKESLTGAVTVVGAKQLENKGTMSSPLQAMQGTVPGVLITRN SGAPGDESWGMKLRGASSSNSTDPLIIVDGVEYSDGINGMRNLNPDDIESINFLKDASAA IYGSKAAGGVVLITTKKAKAGKTVVQYNGSFTGKVVGLQPELMSLDQWADAVITAQTNDG YSDSNWIRYARLAKLYKNQYIDLSHSAHPIPEGFKDVEDFVFMDNDWQDILWGNSWSTQH DLSVSGGTEKNLFRLSLGYMYDNSTLKWGNNNNQRYNMRLNNQFKLSDAVMLTSSIGYNR QDQVSPSMIGKVLSQSSPQPGLPASTIDGRPYGWGTWRALNWWAEEGGDNKLKVSAINIS ESLNWKIYSDLDAVVNVGYNTSTATREKVEKSIDWYNYAGTKLLATEPTQEKSKYSDSFS RTDYYMVSGYLNWHKTLAEVHNLSAMAGTQYNYTQYKYTFVSVKDINPSLEIPNGAGEVL IKDGDSKPAKWHEAMMSYFGRLNYDYKQRYLVEGNLRYDGSSKFRPENRWQFFWGVSGGW RLSEESFMQPLSSLVSNLKLRLSYGVLGNQSGVDRYDGTQLYNFSSSSGAYIGSGKVSTI DTNGKIVTTDRTWERIHSYNLGLDFGFFNNRLTGTVELFMKKNNNMLIDAQYPGVLGDNA PTMNLGKFEAKGWEGNMTWSDKIGPVQYHIGGTITYTTNKLIDLGATSVLKSGFVGKQQG YPLNSYFGLRYVGKIQTQEELEKYKYYYLDGNGIGMQDNLRLGDHMFEDVNGDGKLDQND YVYLGTDDPKLSYSINIELEWKGFDLSAIFQGVGRRTVFRGGEGNETWRVPMSAIYLNTT TQSIGNTWNPENRNAYYPSYTSIGSINNYNYQCSSWSVENGAYLRLKNLTLGYTLPASWL AKTNAISKLRIYFTGADLWEHSKLRDGWDPEASRKTKDLGRYPFNRTFTVGVNATF >gi|226332179|gb|ACIC01000141.1| GENE 2 3368 - 5107 1123 579 aa, chain + ## HITS:1 COG:no KEGG:BT_0141 NR:ns ## KEGG: BT_0141 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 579 1 579 579 1157 100.0 0 MKRSKITILLAIVLGMTFHSCLDLEPQDQLGGKNMWTSVNDYKQFANTFYSWTRDFSSVV YDGTHSDKRSDLITYQSYNEFSRGINSIPSSDANYTDNYKHIRRTNLLLQNAEAYAKPED IKQYIGEAYFFRAYSYFDLLQLYGDVIITKKPLDITDPEMKVKRNDRSEVVDLIIDDLNH AVENLPAFKELTTEEEGRISKEGAQAFLSRVALYEGTWQKSRNGNQNTQRSANLLDIAAK 
AARTVIDAKYGYTFRLFGTDSETKILGDSAQKYMFILENEKSNPAGIKKSSNHEYIFARR HDQVLASIGKNITQECLANVQWVTRKFANLYLCDDGLPIEKSGRFQKYDKKVSEFLNRDN RMRYTLLKPGTRYWGNKFGRTSWQWDETDLKTSKVYDPASGTCYGNQKWSAERVVPDTQE GYDFPLIRYAEVLLNYAEAIYERDDKIENEDLNISLNLVRQRVNTNMPALTNELTQAHGL DMRTEIRRERTVELFNEGFRIDDLKRWKTAENEMPQTMLGIKWKGTEYESWNTTFSLNDE GCVVVETGRQWADKNYLYPLPSDQLQLNPTLGQNPGWGE >gi|226332179|gb|ACIC01000141.1| GENE 3 5137 - 6522 998 461 aa, chain + ## HITS:1 COG:CAC2367 KEGG:ns NR:ns ## COG: CAC2367 COG5492 # Protein_GI_number: 15895634 # Func_class: N Cell motility # Function: Bacterial surface proteins containing Ig-like domains # Organism: Clostridium acetobutylicum # 117 333 141 358 752 81 29.0 4e-15 MKHIAFLSVKWMSVGLGMLSLLLSVASCGDKEYGDAMNEAQLMNDIEVNVGSSLPLAVGM DFVLDYKPVPENVTNPEITLTSSDENVVSVSQDGRVTAKMIGKAYINLSQSTAFETLKTI EVQVMPVATAIELENVELFEGTNKKVIVNVTPSDGYNVFDWKSDNEEVATVADDGTITGK KPGTANISVSSQDGSQLTATAVVTVKEVIPIDKITLSEPGYDMMIGDKTLINCLLEPIDA SVGLLSWSTTNDRVATVDADGLVTAVGAGEAEIIAQDPLSGLSASIAVKVVGEGVVSLSL SYVRNQDELKALGWGFGQTPASVNFDAEGMTVNMSLQSNSKYRADLKMASNDRPVVLNIG TYRYLAFRMDVPGNGSLKLDTNKGDYGNNPTGVLAEDSQVIYYDLQAKPYFPTDAPSDKL TTFQLKIADVTVQPYSYKVYWVRTFKTLEDLKVYVEKENNK >gi|226332179|gb|ACIC01000141.1| GENE 4 6532 - 7908 1035 458 aa, chain + ## HITS:1 COG:no KEGG:BT_0143 NR:ns ## KEGG: BT_0143 # Name: not_defined # Def: putative transmembrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 458 1 458 458 929 100.0 0 MKTKYISLFFSFVIAFGMMSCSSEIPNGDTSKFSDLKSPEEDMVKKDYLPLKHPCMLHTQ ADIDRVKSNLTRSPWKDAYAQLEASDYAQSSYTEKTSALLDGYLKRMDKNNWSGKYPDYS NYTSCMYDAAAAYQLALRYQLSGNTVFADAAVKLFNAWATNCKGILRMEGYTNNIPDPNL YLIPIQAHQWANAAELLRDYNGWDRNDFEKFKTWMKDTFYSVSNMFLKNHNGGQGNMHYW LNWDLAQMTSILSIGILCDDNVLINQAIVYFKNEEGRYKEAGNIKNAVPFLHQDPDSDEI LGQCEESGRDQGHATLCVSLMGAFCQMAYSIGEDLFAFDNYRAVAMAEYVGKYNLIKDEA FNKGTLVGDDFVYDSNSFPYTSYSNPSYTNTTISTDQRGTKRPAWELFYGYCKAKGISSI YSGKWAEQMRPDGGGGNYGPNSGGFDQLGFGTLMYYRE >gi|226332179|gb|ACIC01000141.1| GENE 5 8113 - 9762 1356 549 aa, chain + ## HITS:1 COG:CAP0114 KEGG:ns NR:ns ## COG: 
CAP0114 COG3507 # Protein_GI_number: 15004817 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Clostridium acetobutylicum # 3 548 10 529 531 204 30.0 3e-52 MRKKILFIAGMMACCTWAGAQEISQTWVADKGNGTYQNPVLYADYSDPDVCAAGDDFYMT ASSFNCIPGLPILHSRDLVNWSLVNYALPVQEPKEFFDKAQHGKGVWAPAIRFHKGEFYI YWGDPDYGIYMIKTRDPKGSWSKPVLVKAGKGMIDPTPLWDEDGKVYLIHAYAGSRSGVN SILVICELNAEGTEVISDPVMVFDGNDGKNHTVEGPKLYKRNGYYYIFAPAGGVATGWQL VLRSKNIYGPYESKIVMAQGKTTINGPHQGGWVDTNTGESWFVHFQDQGAYGRVIHLNPM KWVNEWPVIGVDKDKDGCGEPVTTYKKPNVGKMYPVTTPPESDEFDTRHLGLQWQWHANK QDTYGFTTDLRYIRLYAGSLSKEFVNFWEVPNLLLQKFPAEEFTATTKLTFTAKQDGEQA GMIVMGWDYSYLSVRKAGDKFILQQVVCKDAEQQHPEQVKELASFPVEYLKMPGVADNEW KTVYLRVKVAKGAVCTFAYSLDGKKYTAAGEPFTARQGKWIGAKVGLFCVTPNDGNRGWA DVDWFRVTK >gi|226332179|gb|ACIC01000141.1| GENE 6 9823 - 11043 1081 406 aa, chain + ## HITS:1 COG:no KEGG:BT_0146 NR:ns ## KEGG: BT_0146 # Name: not_defined # Def: unsaturated glucuronyl hydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 406 1 406 406 823 99.0 0 MKKIVAGLVVVLGFCACAHRTSGTLDMNKALDYCAEQTQRSLVELKGDSGIDYTMMPRNV MADEHHWNCRKATKEEWCAGFWPGVLWYDYEYTQDKHILEEAEKFTASLEFLSRIPAYDH DLGFLVFCSYGNGYRLTKNPAYKQVILDTADSLATLFNPVVGTMLSWPREVEPRNWPHNT IMDNMINLEMLFWAAKNGGNPYLYDIAVSHADKTMKCQFRPDYTSYHVAVYDTITGNLIK GVTHQGYADSTMWARGQAWAIYGYTVVYRETKDPKYLDFVQKVADVYLERLPEDKVPYWD FSAPGIPDVPRDASAAAVVASALLELSAYLPNGKGKHYKDAAIEMLTSLSSDNYQSGKSN PAFLLHSTGHWPAHSEIDASIIYADYYYIEALLRLKRLQEGEGVLG >gi|226332179|gb|ACIC01000141.1| GENE 7 11143 - 11619 519 158 aa, chain - ## HITS:1 COG:no KEGG:BT_0147 NR:ns ## KEGG: BT_0147 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 158 11 153 153 233 100.0 2e-60 MKKVIMLAAVVAALASCQSKANKAAEAEADSLSIEMAPITEVTEVYEGTLPAADGPGIDY VLTLNAATDGVDTTYTLDMTYLDAEGQGKNKTFSSKGKQQTIQKVVNKKPTKAVKLTPND GEAPMYFVIVNDTTLRLVNDSLQEAVSDLNYDIIKVKK >gi|226332179|gb|ACIC01000141.1| GENE 8 11679 - 12968 1524 429 aa, chain - ## HITS:1 COG:alr5216 KEGG:ns NR:ns ## COG: alr5216 
COG1253 # Protein_GI_number: 17232708 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Nostoc sp. PCC 7120 # 1 423 6 429 442 306 40.0 5e-83 MEFLIILLLLVLNGIFAMYEIALVSSSKARLETLAGKGSKSARGVLKQLEEPEKFLSTIQ IGITLIGIVSGAYGGVAIADDLVPLFSLIPGAEAYARNLAMITTVVIITYLSLIIGELVP KSIALSNPERYATLFSPIMILLTKVSYPFVWFLSISTRLLNRIIGVKNEERPMTQEEIKM ILHQSSEQGVIDKEETEMLRDVFRFSDKRANELMTHRRDLVIFHPDDTKDKVMKTIEEEH YSKYLLVDERKDEIIGVVSVKDIILMVGNKKEFNLREIARPALFIPESLYANKILELFKK NKNKFGVVVNEYGSTEGIITLHDLTESIFGDILEEDDTEEEDIVRRQDGSMLVEASMNIK DFMEEMGILSYEDLEDEDFTTLGGLAMFLLGGIPKAGDIFTYKNLQFEVVDMDRGRVDKL LVIKRDEEE >gi|226332179|gb|ACIC01000141.1| GENE 9 13187 - 13969 815 260 aa, chain - ## HITS:1 COG:ECs3811 KEGG:ns NR:ns ## COG: ECs3811 COG0501 # Protein_GI_number: 15833065 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Escherichia coli O157:H7 # 2 253 40 286 294 150 37.0 3e-36 MKKQIALVALLFLGMSCFDASAQFKKFNLGKAIQAGKDAAQAISLSDADIAAMSKEYMEW MDNHNPLAKPDSEYGKRLEKLTGNIKEVEGMQVNFGVYEVVDVNAFACGDGSVRICAGLM DVMTDEEVMAVIGHEIGHVIHTDSKDAMKNAYFRSAVKNAAGAASSQVAKLTDSELGAMA EALAGAQYSQKQEYAADAYGVEFCVKNNIDPYGMYKSLNKLLELSNGAPASSYMQRMFSS HPDTAKRVARAKELADKYKQ >gi|226332179|gb|ACIC01000141.1| GENE 10 14008 - 16386 1964 792 aa, chain - ## HITS:1 COG:no KEGG:BT_0150 NR:ns ## KEGG: BT_0150 # Name: not_defined # Def: putative ferric aerobactin receptor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 792 1 792 792 1591 99.0 0 MKRLLSLLLFCNITMSFLLAQPVHQVKGTVIDKSSRQPLEFINVMIVGLNKGGVTNAEGK FSIGQVPPGIYRLQASAIGYKTVTTPEYILSTRDLHIQIEMEENQTELEGVTVTASPFRR DIESPVSLRIIGLQEIEKSPGANRDISRIVQSYPGVAFSPIGYRNDLIVRGGSPSENRFY LDGVEIPNINHFSTQGASGGPVGILNADLIREVNFYTGAFPTDKGNALSSVLDFKLRDGD MERNSLKATLGASEVSLASNGHLGKKTSYLVSVRQSYLQFLFDMLGLPFLPTFTDAQFKL KTRFDARNELTVLGLGGIDKMKLNTKADDEDNEYILSYLPKIQQETFTLGAVYRHYAGAH VQSVVASHSYLNNRNTKYQQNDESDPDHLMLRLRSTEQNTQLRLENSSSFRNWKVTVGTS 
LDYSQYSNTTFQKVYTDRAQTFDYHTYLGIMRWGLFGTVNYTSIDERFTASLGLRADANN YSAAMKDLSDQLSPRLSLSYQLTEHWSLSGNAGLYYQLPPYTALGFKNNNGLYANKYALR YMQVSQGSVGLNWRKGDTFEVSVEGFYKDYDKIPLSVADGIPLACKGNDYGVIGNELLTS TAQGRSYGAELLLKWLIAKKLNLASSFTLFKSEYRTDKESEYIASAWDNRFIFNLRGTYN LPRHWSVGMKVSCIGGAPYTPYDADKSSLVTAWNAQGKPYYDYTRYNEERLPAFTQVDIR IDKTFYLKRCMLGFYIDLQNIAGSKLKQADVLMSTGVIKNPDAPITEQRYVMKSLKQESG TLLPTLGITFEY >gi|226332179|gb|ACIC01000141.1| GENE 11 16484 - 16918 368 144 aa, chain + ## HITS:1 COG:no KEGG:BT_0151 NR:ns ## KEGG: BT_0151 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 144 1 144 144 291 99.0 4e-78 MKTVKLITCNDAMKAHILQGALENEGIESILHNENFSTLYKSCVSSIAGVDILVADEDYE KAVQVLRQNQSWPEELTLCPYCGSSDIKFVLRKGHKLRAVGAAVLSMLAAAPPGDNHWEY TCQQCHQAFETPVAEFQPSGEDEE >gi|226332179|gb|ACIC01000141.1| GENE 12 16945 - 17763 710 272 aa, chain + ## HITS:1 COG:PM1451 KEGG:ns NR:ns ## COG: PM1451 COG0627 # Protein_GI_number: 15603316 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Pasteurella multocida # 4 270 2 263 269 184 39.0 1e-46 MKKKKLLLIALLLVGAASSFAAKVDTLLVKSPSMNKDIKVVVVTPDAALGKKATACPTVY LLHGFGGHAKTWIEIKPNLPQIADEKGIIFVCPDGSTSWYWDSPKDPSFRYETFVSSELV KYIDGHYKTITDRKGRAITGLSMGGHGGLWTAIRHKDTFGACGSTSGGVDIRPFPKNWDM AKRLGDYESNKEIWDTHTVINQIDKIQNGDLAIIFDCGEADFFIQVNKDLHNRLLEKKID HDFITRPGGHTGEYWNNAIDYQILFFDKFFRK >gi|226332179|gb|ACIC01000141.1| GENE 13 18128 - 19003 438 291 aa, chain - ## HITS:1 COG:no KEGG:BT_0153 NR:ns ## KEGG: BT_0153 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 291 1 291 291 503 93.0 1e-141 MKKIALLLCAIVMFGCSDDEEKAGTEVPDPEISKMLFSEFCKSDGIGVSDVTKKEEFHYE DAWLTGYTYTQEVSIADLTEIISHPLTIQYGNNRSSVTFTDEIGTERKYTLNEEGYATQC EYASLDQKRQYTFSYTEGYLTQVNEKIMPREGSSDAVVSHTLSLQYDKGDLISTTSPSLT NESSTGYGKFQTNYEAGEDINYYRLPCMLVADTYPLSFHREALFAGMLGKPTQHLTTASC PNESSDTYTERTEYSYNFDKNKKPVSLKISTKYSNGKSTSYLNRTISITIE >gi|226332179|gb|ACIC01000141.1| GENE 14 19010 - 20464 1190 484 
aa, chain - ## HITS:1 COG:no KEGG:BT_0154 NR:ns ## KEGG: BT_0154 # Name: not_defined # Def: putative periplasmic protease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 484 1 484 484 914 99.0 0 MTNLRKILLIPALIVVSVSSFFSCGVDRWPEYAHLTELDTWMYDIMQQNYLWYQYMPGYD EVNLFQDPATFLSKAKWDKDSYSFVDSVLEAPLPTYGFDYSLVKSQDNDTAYNALITYVI PESPAEKAGLQRGDWIMKVDTSYISKKYETQLLQGTIARELSMGVWKEVEEEPEEGEEAP EEPVMVYKVVPNGVTLDLGAAQSIEDQPVHKYEILTLDNGAKVGYLMYNSFTAGTNNDPE KYNNELRKVSTIFQEANVQATILDLRYNEGGSLDCVQLLATILVPNARMGTPMAYLEYND KNLNKDATINFDQEVLKTGVNLNQNTLVAITSGTTAGAAEMLLTSLYKEDQAPNIIVMGS ASKGQTVATEQFINETYRWSVNPVVCTVYNSDHDAGEDAFVLVPSDKFKISETSINGTTD YSQFLPFGNPKERMLSIAIQTLEGTYPPKDEDKEEETRSTKSIKIEKSVSSPASRRFAGG LRIK >gi|226332179|gb|ACIC01000141.1| GENE 15 20791 - 21861 862 356 aa, chain + ## HITS:1 COG:BB0682 KEGG:ns NR:ns ## COG: BB0682 COG0482 # Protein_GI_number: 15595027 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Borrelia burgdorferi # 2 354 1 350 355 306 44.0 3e-83 MMNIAALLSGGVDSSVVVHLLCEQGYKPTLFYIKIGMDGAEYMDCSAEEDIEMSTAIARK YGLSLEVVDLHKEYWENVAAYAIDKIKKGLTPNPDVMCNKLIKFGCFEQQVGKNFDFTAT GHYATTIRQDGKTWLGTAKDPVKDQTDFLAQIDYLQVSKLMFPIGGLMKQEVREIASRAG LPSARRKDSQGICFLGKINYNDFVRRFLGEREGAIIELETGKKVGTHRGYWFHTIGQRKG LGLSGGPWFVIKKDIEENIIYVSHGYGVETQFGSEFRINDFHFITENPWKDAGKEIDITF KIRHTPEFTKGKLVQEEGGQFRILSSEKLQGIAPGQFGVIYDEEAGICVGSGEITR >gi|226332179|gb|ACIC01000141.1| GENE 16 21891 - 22553 528 220 aa, chain - ## HITS:1 COG:RSc0292 KEGG:ns NR:ns ## COG: RSc0292 COG2197 # Protein_GI_number: 17545011 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Ralstonia solanacearum # 1 214 1 210 210 98 29.0 8e-21 MREYIIADNQDISKAGMMFLLSRQKDVSVLLEADNKAELIQQLRLYPQAVVILDYTLFDF AGADELIVLQERFKEADWILFSDELSINFLRQVLFSSMAFGVVMKDNSKEEIMSALQCAS 
RKQRYICNHVSNLLLSGTASSAASPATVDDHLLTQTEKNILKEIALGKTTKEIAAEKNLS FHTINSHRKNIFRKLGVNNVHEATKYAMRAGIVDLAEYYI >gi|226332179|gb|ACIC01000141.1| GENE 17 22639 - 23229 653 196 aa, chain - ## HITS:1 COG:no KEGG:BT_0157 NR:ns ## KEGG: BT_0157 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 196 1 196 196 311 100.0 1e-83 MKKFIALLALVLVSASTMMYAQESNAAIRRADRKAERDAERAKLRAEEQVQDMAAYQQAV QALKNKQFVLEANQVIFRNGMSSFVTSNTNFVLMNGNRATVQTAFNTPYPGPNGIGGVTV DGNSSDMKMNIDKKGNVNCSFSVQGIGISAQVFINMSSGNNNASVSISPNFNNNNLTLNG NIVPLDQSNIFKGRAW >gi|226332179|gb|ACIC01000141.1| GENE 18 23297 - 24724 1137 475 aa, chain - ## HITS:1 COG:FN1422 KEGG:ns NR:ns ## COG: FN1422 COG1757 # Protein_GI_number: 19704754 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Fusobacterium nucleatum # 11 474 1 445 473 267 36.0 3e-71 MKKAPSPLISLLPIVVLVILLFATIRVFGSDALNGGSQISLLTTTAICILIGMAFYKIPW KDYELAITNNVAGVTTAIIILLIIGALSGIWMISGVVPTLIYYGMQIIHPSFFLTSTCII CVLISVMTGSSWTTIATIGIALMGIGKAQGFEEGWIAGAIISGAYFGDKVSPLSETTILA ASVTDTPLFRHIRYMMITTVPSLIITLTIFTVAGLSHDASNTQHIAEVAAALNEKFHITP WLLIIPIATGILIARKVPSIITLFLSTLLAGIFALIFQPDLLREISGAAVSNFDSLFKGL MMTIYGKTSLQTDNAVLTDLIATRGMSGMMNTIWLILCAMCFGGAMTASGMLGSITSLFV RFMKKTVSVVSATVCSGLFLNLATADQYISIILTGNMFRDIYAKKGYESCLLSRTTEDSV TVTSVLIPWNTCGMTQATILSVPTLVYLPYCFFNIISPLMSIAVAAIGYKIVKRP >gi|226332179|gb|ACIC01000141.1| GENE 19 24684 - 24848 103 54 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGNSDMRGDGAFFILNITLYYLIWWFRFLYYVTTGPSMMSDALLISGIDSGKCL >gi|226332179|gb|ACIC01000141.1| GENE 20 24819 - 25691 935 290 aa, chain - ## HITS:1 COG:TM0497 KEGG:ns NR:ns ## COG: TM0497 COG1814 # Protein_GI_number: 15643263 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Thermotoga maritima # 8 289 1 283 284 237 47.0 2e-62 MIPTQQDIAEFTRFQRNEITESILYERLASIEKDENNRKTLRLIAAEEKSHYAILKKYTG KEIGPDYKRIARFYFLARILGITFAIKLMESSEENAHNNYDKYAHIPDLQRLAHEEEVHE 
QKLISLINEERLEYMGSVVLGLNDALVEFTGALAGFTLALSDSKLIALTGSITGIAAALS MASSEYLSTKSEGDDKKHPVKAAVYTGIAYLITVVSLVTPFILISNVIVALGVMLTMALI IIALFNYYYSVARGESFRKRFTEMAVLSFSVAGISFLIGYALKTFTGIDA >gi|226332179|gb|ACIC01000141.1| GENE 21 25813 - 26247 344 144 aa, chain - ## HITS:1 COG:no KEGG:BT_0160 NR:ns ## KEGG: BT_0160 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 144 1 144 144 254 100.0 9e-67 MKYGVNKEILLITAGTVWIIAGANILRIGIVTWLNNSEGWMFKIGEATIVFLLFFVLIFK RLYYKHTQRIERKKEQKNCPFSFFDVKSWIVMIFMICMGITIRSFHLLPESFISVFYTGL SIALILTGVLFIRYWWLRRKTNPV >gi|226332179|gb|ACIC01000141.1| GENE 22 26295 - 27470 674 391 aa, chain - ## HITS:1 COG:CAC3482 KEGG:ns NR:ns ## COG: CAC3482 COG0477 # Protein_GI_number: 15896719 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Clostridium acetobutylicum # 15 390 19 394 394 255 39.0 9e-68 MKQSLKENGGLPASILWTLAIVAGVSVANIYYIQPLLNMIRHELGISEFRTNLIAMVTQI GYAAGLLFITPLGDLYQRKKIILVNFFVLIFSLLTIALADNIHLILLASFFTGACSMIPQ IFIPIAAQFSRPENKGRNVGIVLSGLLTGILASRVVSGFVGELFGWREMYYIAAAMMFVC AIVVLKVLPDIQTNFRGKYGDLMKSLLALVKEFPQLRIYSIRAALNFGSLLAMWSCLAFK MGQAPFHANSDIIGLLGLCGVAGALTASFVGKYVKRVGVRRFNFIGCGLILFSWLLFFVG ENTYLGIILGIIIIDIGMQCIQLSNQTSIFELSPRASNRINTIFMTTYFVGGSLGTFLAG TFWQLYGWHGVIGIGATLTCISLLITTLSKK >gi|226332179|gb|ACIC01000141.1| GENE 23 27490 - 29808 1536 772 aa, chain - ## HITS:1 COG:FN0580 KEGG:ns NR:ns ## COG: FN0580 COG4953 # Protein_GI_number: 19703915 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase/penicillin-binding protein PbpC # Organism: Fusobacterium nucleatum # 26 762 1 712 724 390 34.0 1e-108 MGSKILNFFKRLSVTKKVILCILAFLVTGYIFCLPRHLFHVPYSTVVTDRNEELLGARIA SDGQWRFPPRNTTPEKIKECLITFEDKHFYHHWGVNPFAIGRAFYQNVKNKRIVSGGSTL TMQTIRLARNESRTFREKLIEMIWATRLEFRASKEEILSMYISHAPFGGNVVGLDAAAWR 
YFGHSADDLSWAESAMLAVLPNAPAMIHLSKGRKTLLDKRNRLLKQLLEKKTIDSSTYEL AISEPLPDEPHALPQIAPYLVSRFYQERNGEYSRSTINKGIQTQVEDLAERWSNEFGRSD IRNLAILVIDIPSNQVVAYCGNVHFDRKQGGNQVDVIQAPRSTGSILKPFLYYAMLQEGS LLPDMLLPDVPVNINGFTPQNFSMQFEGAVPASEALARSLNIPAVTMLQRYGVPKFHSFL QQIGLKTINRSSSHYGLSLILGGAEATLWDVTNAYAMMGRSLLQLPQRSCSLLLPTSRIT ESTDPFQPGAVWQTFDALKEVNRPEEIDWKSIPSMQTIAWKTGTSYGFRDAWAVGVTPRY AVGVWVGNATGEGKPGLVGAQTAGPVLFDIFNLLPSSSWFTRPAGIFVEAEVCRKSGHLK GRFCDETDTLLVLPAGLRTEACPYHHLVTLSANESQRIYENCANTEPTLRKSWFTLPPVW EWYYKQHHPEYKPLPPFKAGCGEDTFQPMQFIYPPMNARIKLPKQLDGSKGFLTVELAHN NPNATVFWHLDETYQAQTQDFHKISLQPAAGKHSLTAVDGEGNTISTTFFVE >gi|226332179|gb|ACIC01000141.1| GENE 24 29854 - 35502 5056 1882 aa, chain - ## HITS:1 COG:FN0579 KEGG:ns NR:ns ## COG: FN0579 COG2373 # Protein_GI_number: 19703914 # Func_class: R General function prediction only # Function: Large extracellular alpha-helical protein # Organism: Fusobacterium nucleatum # 263 1875 54 1604 1611 395 24.0 1e-109 MGQIKTRCSAAAGLFLILLTVIAGFSSCKSNQKDIIPSAEYAPYVNAYTGGVISQNSTIR IELTQDQPMVDLNQELKDNPFSFSPSLKGKTYWVSNNTIEFVPEEGALKPGAFYEGTFHL GDFVDVDKKLEEFNFSFRVQERNFSIHTDPITVTATQPDQVTVTGEIRFSDVVKKEEVEK MLTAGSEKNKSYPIEITQTDHPTRYAFSISQIIKEAEDYQLEITAKGNPAGIDRTQNKSI LIPAKNSFRFLSAVRIDQPENGIEIIFSDPVSNTQDLKGLIDVPEVSSSIFQIKENKVFV YFETGKLNKLTLNIHEGIRNSQDKPLGTSHSISFSELNLKPQVEMATSAAILPDSKSLII PFRAVNLYAVDLSVIRIFENNVLMFMQNNSLSSANELRRSGRLVYKKTLWLAKDSSKDVH RWEDYSIDLAGLIHQEPGAIYRVILSFRQEYSAYPCGGSENKEMQFADNKSSDNLTKVSG ETLSEDDEAVWDTPETYYYYNGSVPMDWSQYRWTERDNPCHPSYYMNSDRIAACNILASN LGMIVKRNSLNKLWIAVNNILDTKPVAKAQVTIYNFQLQPIGKGETNGEGLVEITPKGVP FIAVAEADKQKAYVRVVDGEEQSVSRFDVGGKDIQKGLKGFIYGERGVWRPGDTLHISFM LEDREKRIPDKHPVALEIYNPRGQFYTKMISTQGTNGFYTFAVPTQADDPTGLWNAYVKV GGTAFHKSLRIETIKPNRLKITLALPTILQASSKDVYAPLTSSWLTGATASRLKAKVEMS LSKVNTQFKNYGQYLFNNPATDFTTVRADVFNGVLDAEGRAGVNIQLPVATGAPGMLNAT LTTRVFEPGGDASIYSQTVPFSPFTSYVGINLNQPKGKYIETDKDHVFDIVTVNDQGQPV NRSNLEYKIYRISWSWWWENGEESFGTYINNSSITPVASGNLQTTGGKASFKFRINYPDW GRYLVYVKDRESGHATGGTVYIDWPDWRGRSNKTDPSGIKMLAFSLDKDSYEIGETATAI 
IPAAAGGRALVSLENGSTVLQQQWLEVSDQGDTKLTFKITPEMAPNVYLHISLLQPHAQT VNDLPIRMYGIAPVFVTNRQTILQPQIKMPEVLRPETDFNVTVSEKSGKPMTYTLAIVDD GLLDLTNFKTPDPWNEFYAREALGIRTWDMYDDVLGASGGRYSSLFSTGGDASLKPADAK ANRFKPVVKFIGPFYLAKGKQQTHTLKLPMYVGSVRAMVVAGQDGAYGNAEKTAFVRTPL MLLSTLPRVLSTQEEITVPVNVFAMENQVKNVTVSLEASGAGVQITGNRQQSLTFDQPGD QLAYFTLKTGSKTGKATIHLTASGNGQQTKETIEIEVRNPNPVVTLRNSQWIEAGQEAEL SYTLAGSSSANNQVQLEVSRIPSVDISRRFDFLYNYQHHCTEQLTSKALPLLFVSQFKAV DEQEAEKIKTNVQEAIRQIYARQLPNGGFVYWPGNAVADEWITSYTGMFLTLAQEKGYAV HPNVLNKWKRFQRAAAQNWRMPQEASNWQIWQSELQQAFRLYTLALAGAPEYGAMNRMKE QPGLSIQAKWRLAAAYALTGKMKPAGELVYNAETTVIPYSSMNLIYGSSDRDEAMILETL ILMKRDRDALQQAKKVSQNLAQENWFSTQSTAFALMAMGRLAKQLSGTLDFTWSWNGKQQ PAVKSAKAVFEKEIATSPKSGTVSVKNQGKGALSVDLITRTQLLNDTLPAIADNIRLDVK YTDMAGSPISVEDIRQGTDFMSAVTLSNISGTSDYSNLALTHIIPSGWEIYNERMIVPEV SSSSTNEANVPESSAGKYTYKDIRDDRVLTYFDLRRGESKTFTVRLQATYAGNFILPAIQ CEAMYDAAVQARTKAGRTTVSR >gi|226332179|gb|ACIC01000141.1| GENE 25 35599 - 36000 307 133 aa, chain + ## HITS:1 COG:CC3636 KEGG:ns NR:ns ## COG: CC3636 COG0545 # Protein_GI_number: 16127866 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Caulobacter vibrioides # 3 132 37 167 177 124 45.0 5e-29 MGRKQEYKETNLEFLKELSSQEGVYALPCGIYYKVLETGTGTVFPNVRSIVTVHYKGSLI NGRVFDNSYERNCPEAFRLCDVIDGWQLALQRMRAGDKWIIYIPYTMGYGTRASGPIPAF STLVFEVELLGVA >gi|226332179|gb|ACIC01000141.1| GENE 26 36016 - 36573 391 185 aa, chain + ## HITS:1 COG:TP0100 KEGG:ns NR:ns ## COG: TP0100 COG0526 # Protein_GI_number: 15639094 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Treponema pallidum # 40 171 63 200 200 58 27.0 7e-09 MKVEIRSKVVILLCIVLAVFSSCITDDDDDRGDFALVAGDALPQFSVEMSDGGVLNTQSF SGKVGVIVFFHTDCPDCQKELPVIQKVYDTYKENPDVLLSCISRSESGSEVQAYWEKHSF SLPFSAQNDDAVFSLFASHTIPRILITDKEGIIRAVYTDDPLATYQELAKDIDAVLIDKL SGNER 
>gi|226332179|gb|ACIC01000141.1| GENE 27 36703 - 37041 176 112 aa, chain - ## HITS:1 COG:no KEGG:BT_0167 NR:ns ## KEGG: BT_0167 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 112 1 112 112 207 100.0 9e-53 MDDLIKKHLQDILTAIEEVEGFFGNAPKVYDDFYSNLCLRRAVERNIEIIGEAMNRILKV DKDIAITNSRKIVDARNYIIHGYDSLSVDILWSMVINHLPKLRNEVITLLKT >gi|226332179|gb|ACIC01000141.1| GENE 28 37034 - 37333 307 99 aa, chain - ## HITS:1 COG:no KEGG:BT_0168 NR:ns ## KEGG: BT_0168 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 99 1 99 99 167 100.0 1e-40 MKLIENNIQKIIDLCKKHKVHKLFVFGSILTNRFNDNSDIDLVVDFNKAEVSDYFDNYFD FKYALENLFGREVDLLEEQTIKNPYLKKNVDATKTLIYG Prediction of potential genes in microbial genomes Time: Thu May 12 03:19:08 2011 Seq name: gi|226332178|gb|ACIC01000142.1| Bacteroides sp. 1_1_6 cont1.142, whole genome shotgun sequence Length of sequence - 7413 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 7, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 941 503 ## BT_0169 hypothetical protein - Prom 995 - 1054 7.4 2 2 Tu 1 . - CDS 1061 - 1414 346 ## BT_0170 hypothetical protein - Prom 1449 - 1508 3.6 3 3 Tu 1 . - CDS 1586 - 2869 469 ## BT_0171 hypothetical protein - Prom 2891 - 2950 2.5 - Term 2875 - 2925 6.2 4 4 Tu 1 . - CDS 2984 - 3379 207 ## BT_0172 hypothetical protein - Prom 3440 - 3499 4.8 5 5 Tu 1 . - CDS 3518 - 3730 77 ## gi|253571727|ref|ZP_04849133.1| conserved hypothetical protein - Prom 3923 - 3982 6.6 - Term 4336 - 4389 12.5 6 6 Op 1 . - CDS 4406 - 5416 1101 ## BF3207 hypothetical protein 7 6 Op 2 . - CDS 5436 - 6137 894 ## COG0822 NifU homolog involved in Fe-S cluster formation - Prom 6232 - 6291 4.5 + Prom 6113 - 6172 6.8 8 7 Tu 1 . 
+ CDS 6376 - 6780 242 ## BT_0175 hypothetical protein + Term 6916 - 6958 6.4 Predicted protein(s) >gi|226332178|gb|ACIC01000142.1| GENE 1 2 - 941 503 313 aa, chain - ## HITS:1 COG:no KEGG:BT_0169 NR:ns ## KEGG: BT_0169 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 313 1 313 351 646 100.0 0 MKKNLLLLLCLFCLVNVCHAKAKDTFTVFQLNLWHGCTKVPNGDQGIIDVLDQMDADVVF LCEIRDGKQFIPHVIEELEKRGKHYYGETFDLAIGVLSKFKPDSWTKCCIVPGDEGRAMV KMVATIEGQPVSFYSCHLDYRHYQCYMPRGYNGTTWKKMDKPITDEEEVLKANRQSFRDE TIRAFIQEVQSDIQQGRPIIMGGDFNEPSHLDWQADTKDLWDHNGAIIHWDCSMMLSKAG FKDAYREKFPNTVRYPGFTFPAGNKLAEEAKLEKLAWAPEADERDRIDFIYYYPLESMLS MKDSKLVGPSETV >gi|226332178|gb|ACIC01000142.1| GENE 2 1061 - 1414 346 117 aa, chain - ## HITS:1 COG:no KEGG:BT_0170 NR:ns ## KEGG: BT_0170 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 117 1 117 117 181 100.0 5e-45 MLLTFLNILATIIGVISLLIVTYGVFVGFIAFLRNEFKRINGTYTINNVRQLRADFGSYL LLGLEFLIASDILKTVVDPTLDELAILGGVVIVRTVLSVFLNKEIKELAEEENSKST >gi|226332178|gb|ACIC01000142.1| GENE 3 1586 - 2869 469 427 aa, chain - ## HITS:1 COG:no KEGG:BT_0171 NR:ns ## KEGG: BT_0171 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 427 1 427 427 860 100.0 0 MKIVYNTILFTLLFLATSCDHSEKIDDIKLEISQNIFRNIDNNGGKITVAITSNSEWIIA NSADWCIPDKYQGEGNDILTIKILANTKHANRQTNLIISAQGINQTIKISQQKGEVNPDL DKIHYQLPVIFHVLYQNENDINQYIKEDHLKDVLVRTNHFYQSEKCGIDINLEFVLATKD KNGTTLPEPGVERVPFNELPIDCEKFMRDNTGKYTSLLWEPNEYINIMVYPFTDSQILGI STFPYSLKDFFLEGTQQVSASWITLENLSFPYCISINSSYIYEQSTDERINQNDAAITLA HELGHYLGLRHVFSEGGTTSMCTDTDYCKDTPSYNRSEYEQWLNNLDKQNKYQLKDLAKR YSCEKGEYEAHNIMDYAYCFYNEITLEQRKRIRHILNYSPLIPGPKESRIDTRGVQDRLN LPICIMK >gi|226332178|gb|ACIC01000142.1| GENE 4 2984 - 3379 207 131 aa, chain - ## HITS:1 COG:no KEGG:BT_0172 NR:ns ## KEGG: BT_0172 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 131 1 131 131 189 100.0 
2e-47 MKQFRILSMLMIVLISSLSFVSCGDDDDETSGDFSASIAGVYTGKLKVGTNVTADAYVVT VTKVSSSVVKVTADFYTDNGSENYNVTKEGNQYILSSESSSGINITVTGKAMTISFLNNA GSMTTFTGTRD >gi|226332178|gb|ACIC01000142.1| GENE 5 3518 - 3730 77 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253571727|ref|ZP_04849133.1| ## NR: gi|253571727|ref|ZP_04849133.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 70 1 70 70 117 100.0 3e-25 MTHIGQIIEKELHRQERSVTWFARRLYCDRTNVYNIFRRQSLDTELLLRISIILEYNFFQ IYSDIYNNRT >gi|226332178|gb|ACIC01000142.1| GENE 6 4406 - 5416 1101 336 aa, chain - ## HITS:1 COG:no KEGG:BF3207 NR:ns ## KEGG: BF3207 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 336 1 336 336 648 98.0 0 MIREVKFESQDRRIKGIIEALNANGIKDIEEANAICEAAGLDPYKTCEETQPICFENAKW AYVVGTAIALKKNCTNAAEAAEAIGIGLQAFCIPGSVADDRKVGIGHGNLAAMLLREETK CFAFLAGHESFAAAEGAIKIAAKADKVRKEPLRCILNGLGKDAAQIISRINGFTYVQTQF DYFTGELKVVREIAYSDGPRAKVKCYGADDVREGVAIMWKEGVDVSITGNSTNPTRFQHP VAGTYKKERVLAGKPYFSVASGGGTGRTLHPDNMAAGPASYGMTDTMGRMHSDAQFAGSS SVPAHVEMMGFLGIGNNPMVGCTVACAVDVAQALAK >gi|226332178|gb|ACIC01000142.1| GENE 7 5436 - 6137 894 233 aa, chain - ## HITS:1 COG:CAC2565 KEGG:ns NR:ns ## COG: CAC2565 COG0822 # Protein_GI_number: 15895825 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster formation # Organism: Clostridium acetobutylicum # 1 233 1 230 230 371 80.0 1e-103 MTYSHEVEHMCVVKKGPNHGPAPIPEEGKWVKSKEIVDISGLTHGIGWCAPQQGACKLTL NVKEGVIQEALVETIGCSGMTHSAAMAAEILPGKTILEALNTDLVCDAINTAMRELFLQI VYGRTQSAFSEGGLIIGAGLEDLGKGLRSQVGTLYGTLAKGPRYLEMAEGYIKNIFLDKN DEICGYEFVHMGKFMDEIKKGTDANEALKKVTGTYGRVTAEQGAVKHIDPRHE >gi|226332178|gb|ACIC01000142.1| GENE 8 6376 - 6780 242 134 aa, chain + ## HITS:1 COG:no KEGG:BT_0175 NR:ns ## KEGG: BT_0175 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 134 1 134 134 226 100.0 2e-58 MELNQVDIHYLIAAICVISSALVFYTIGVWGERLQKKLKLWHIIFFLLGLVSDAVGTSLM 
EHIAELTHLHDEIHTVTGTIAILLMFVHASWAIWTYVKGSAEAKRHFNRFSIVVWCIWLI PYLIGMAIGMHLHA Prediction of potential genes in microbial genomes Time: Thu May 12 03:19:40 2011 Seq name: gi|226332177|gb|ACIC01000143.1| Bacteroides sp. 1_1_6 cont1.143, whole genome shotgun sequence Length of sequence - 24450 bp Number of predicted genes - 23, with homology - 23 Number of transcription units - 11, operones - 7 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 22 - 81 2.8 1 1 Tu 1 . + CDS 131 - 631 550 ## BT_0176 hypothetical protein + Term 634 - 673 2.8 + Prom 633 - 692 3.4 2 2 Op 1 . + CDS 714 - 1499 435 ## BT_0177 hypothetical protein 3 2 Op 2 . + CDS 1526 - 2071 373 ## BT_0178 hypothetical protein 4 2 Op 3 . + CDS 2081 - 2533 435 ## COG0691 tmRNA-binding protein + Prom 2544 - 2603 3.4 5 3 Op 1 . + CDS 2623 - 5370 2509 ## COG1410 Methionine synthase I, cobalamin-binding domain 6 3 Op 2 . + CDS 5383 - 5982 484 ## COG0778 Nitroreductase + Term 6088 - 6124 1.3 + Prom 5990 - 6049 4.5 7 4 Tu 1 . + CDS 6176 - 7726 1105 ## COG0591 Na+/proline symporter 8 5 Op 1 . - CDS 7814 - 9223 1103 ## COG4623 Predicted soluble lytic transglycosylase fused to an ABC-type amino acid-binding protein 9 5 Op 2 . - CDS 9220 - 9837 510 ## COG0572 Uridine kinase - Prom 9864 - 9923 4.4 10 6 Tu 1 . - CDS 10050 - 10214 69 ## gi|253571741|ref|ZP_04849147.1| predicted protein - Prom 10268 - 10327 4.9 + Prom 10190 - 10249 1.5 11 7 Op 1 . + CDS 10299 - 11522 738 ## COG4974 Site-specific recombinase XerD 12 7 Op 2 . + CDS 11545 - 12852 900 ## COG0582 Integrase + Term 12863 - 12914 4.1 - Term 12843 - 12906 6.1 13 8 Op 1 . - CDS 12912 - 13223 325 ## BT_1930 hypothetical protein 14 8 Op 2 . - CDS 13220 - 13513 258 ## BF1793 hypothetical protein 15 8 Op 3 . - CDS 13519 - 13848 340 ## BT_2651 hypothetical protein - Prom 13995 - 14054 5.4 - Term 14015 - 14056 1.3 16 9 Tu 1 . - CDS 14063 - 14479 193 ## BVU_0680 hypothetical protein - Prom 14692 - 14751 2.7 - Term 14675 - 14714 5.1 17 10 Op 1 . 
- CDS 14784 - 16100 840 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 18 10 Op 2 . - CDS 16075 - 16317 205 ## gi|218130573|ref|ZP_03459377.1| hypothetical protein BACEGG_02162 19 10 Op 3 . - CDS 16415 - 18400 1254 ## COG0642 Signal transduction histidine kinase - Prom 18458 - 18517 3.1 - Term 18407 - 18474 1.7 20 11 Op 1 9/0.000 - CDS 18522 - 19880 467 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 21 11 Op 2 27/0.000 - CDS 19881 - 22994 2676 ## COG0841 Cation/multidrug efflux pump 22 11 Op 3 . - CDS 22991 - 24112 1046 ## COG0845 Membrane-fusion protein 23 11 Op 4 . - CDS 24156 - 24449 167 ## PROTEIN SUPPORTED gi|157690935|ref|YP_001485397.1| ribosomal protein acetyltransferase Predicted protein(s) >gi|226332177|gb|ACIC01000143.1| GENE 1 131 - 631 550 166 aa, chain + ## HITS:1 COG:no KEGG:BT_0176 NR:ns ## KEGG: BT_0176 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 166 1 162 162 295 100.0 6e-79 MIEVMESALQKAAGEGMDEFIQVFTDKYKEIIGGELTADTMPLLTGEQHSLLAYQIFRDE IMFGGFCQLIQNGYGGYIFDNPFAKVMRLWGAEEFSKLVYKAKKIYDANRKDLEKERTDD EFMAMYEQYEAFDDLEEAYLDMEEQVTTLVASYVDNHLELFAKIIK >gi|226332177|gb|ACIC01000143.1| GENE 2 714 - 1499 435 261 aa, chain + ## HITS:1 COG:no KEGG:BT_0177 NR:ns ## KEGG: BT_0177 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 261 3 263 263 466 99.0 1e-130 MKQMKFLLVALMTVLMGMSVTSCMNGDDNTIYTGPAFAKCTNYFPATFELANGQKLVVNE LSMSNLNIGEIYFFYYQFDTAQQPGNSQTLDVTLYAGSTPTSISAKPTEGPEKAADYNEA TAPLYTFNSDTSTQPGILFDQYLVIPIMYWVKVESTDEKQKEELNKHSFILTYDFDELAS GSTELVLNLNHVIKDGSEETVTRDKYTSTYKAYNLSSVIYAFKQATRVEPTKIKVNAKTN SSKNSLDGAANSTWSETLKTN >gi|226332177|gb|ACIC01000143.1| GENE 3 1526 - 2071 373 181 aa, chain + ## HITS:1 COG:no KEGG:BT_0178 NR:ns ## KEGG: BT_0178 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 181 1 181 181 331 
100.0 9e-90 MKLISSPAKAWEEISMEEDRRKVFMAFVYPMIGLCGLSVFIGSLLTNGWGGPQSFQIAMT NCCAVAVALFGGYFLAAYAINEMGVRMFGMPANAPLTQQFAGYALVVPFLLQIVTGLLPD FRIIAWLFQFYIVYVVWEGAPILMQVEEKVRLKYTLFSSVLLILCPTVIQIVFSKLIAIL N >gi|226332177|gb|ACIC01000143.1| GENE 4 2081 - 2533 435 150 aa, chain + ## HITS:1 COG:TM0254 KEGG:ns NR:ns ## COG: TM0254 COG0691 # Protein_GI_number: 15644629 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: tmRNA-binding protein # Organism: Thermotoga maritima # 10 149 16 155 158 134 47.0 7e-32 MKQPSVNIKNKRATFDYELMDTYTAGIVLTGTEIKSIRLGKASLVDTFCYFAKGELWVKN MHIAEYFYGSYNNHTARRDRKLLLNKKELEKIQRGMKDPGFTTVPVRLFINEKGLAKLVV ALAKGKKQYDKRESIKEKDDRRDMARMFKR >gi|226332177|gb|ACIC01000143.1| GENE 5 2623 - 5370 2509 915 aa, chain + ## HITS:1 COG:VC0390_2 KEGG:ns NR:ns ## COG: VC0390_2 COG1410 # Protein_GI_number: 15640417 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I, cobalamin-binding domain # Organism: Vibrio cholerae # 324 914 1 590 899 709 59.0 0 MKKTISQIVSERILILDGAMGTMIQQYNLKEEDFRGERFAHIPGQLKGNNDLLCLTRPDV IQDIHRKYLEAGADIIETNTFSSTTVSMADYHVEEYVREINLAATRLARELADEYTAKSP DKPRFVAGSVGPTNKTCSMSPDVNNPAFRALSYDELAASYQQQMEAMLEGGVDAILIETI FDTLNAKAAIFAAGQAMKVTGIEVPVMLSVTVSDIGGRTLSGQTLEAFLASVQHANIFSV GLNCSFGARQLKPFLEQLASRAPYYISAYPNAGLPNSLGKYDQTPADMAHEVKEYIQEGL VNIIGGCCGTTDAYIAEYQALIAGAKPHVPAPKPDCMWLSGLELLEVKPEINFVNIGERC NVAGSRKFLRLVNEKKYDEALSIARQQVEDGALVIDVNMDDGLLDARTEMTTFLNLIMSE PEIARVPVMIDSSKWEVIEAGLKCLQGKSIVNSISLKEGEEVFLEHARIIKQYGAATVVM AFDEKGQADTAARKIEVCERAYRLLVDKVGFNPHDIIFDPNVLAVATGIEEHNNYAVDFI EATGWIRKNLPGAHVSGGVSNLSFSFRGNNYIREAMHAVFLYHAIQQGMDMGIVNPGTSV LYSDIPADTLEKIEDVVLNRRPDAAERLIELAEALKETMGGTSGQAAVKQDAWREESVQE RLKYALMKGIGDYLEQDLAEALPLYDKAVDVIEGPLMDGMNYVGELFGAGKMFLPQVVKT ARTMKKAVAILQPIIESEKVEGSSSAGKVLLATVKGDVHDIGKNIVAVVMACNGYDIVDL GVMVPAETIVQRAIEEKVDMIGLSGLITPSLEEMTHVAAELEKAGLDIPLLIGGATTSKM HTALKIAPVYHAPVVHLKDASQNASVASRLLNSQMKAELINELDAEYQALREKSGLLRRE TVSLEEAQKNKLNLF >gi|226332177|gb|ACIC01000143.1| GENE 6 5383 
- 5982 484 199 aa, chain + ## HITS:1 COG:MA0330 KEGG:ns NR:ns ## COG: MA0330 COG0778 # Protein_GI_number: 20089228 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Methanosarcina acetivorans str.C2A # 37 199 9 174 179 159 45.0 2e-39 MKKIFSFLCMLMAMTAMISCSSSKEEKGTTGTGNAALDNIFERKSVRTYLNKGVEKEKID LMLRAGMSAPSGKDVRPWEFVVVSDRAKLDSMAAALPYAKMLTQARNAIIVCGDSARSFY WYLDCSAAAQNILLAAESMGLGAVWTAAYPYEDRMEVVRKYTHLPENILPLCVIPFGYPA TKEQPKQKYDEKKIHYNQY >gi|226332177|gb|ACIC01000143.1| GENE 7 6176 - 7726 1105 516 aa, chain + ## HITS:1 COG:MTH1856 KEGG:ns NR:ns ## COG: MTH1856 COG0591 # Protein_GI_number: 15679844 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Methanothermobacter thermautotrophicus # 1 512 1 511 526 483 50.0 1e-136 MNTFTLGLIVIAYLLSLAYLGFLGYKKTTNTSDYLVGGRQMNPIVMALSYGATFISASAI VGFGGVAAAFGMGIQWLCFLNMFVGVVIAFIFFGLRTRRMGAKLNVSTFPQLLGRHFRSR NIQVFIAAVIFIGMPLYAAVVMKGGAVFIEQIFQIDFNISLLIFTLVIAAYVIAGGMKGV MYTDALQAVIMFGCMLFLLFSLYQVLGMGFTEANKELTNIAPLVPEKFKALGHQGWTAMP VTGSPQWYSLVTSLILGVGIGCLAQPQLVVRFMTVESSKQLNRGVFIGCFFLIITVGAIY HAGALSNLFFLKTEGAVATEVIQDIDKIIPYFINKAMPDWFAALFMLCILSASMSTLSSQ FHTMGASVGSDIYGTYKPRSRGKLTNVIRLGVLFSILVSYIICYMLPHDIIARGTSIFMG ICAAAFLPAYFCALYWKKATKQGVMASLWVGTLGSAFALIFLHQKEAAAMGICKFLFGKD VLIETYPFPVIDPILFALPLSILAIVVVSLLTGRNK >gi|226332177|gb|ACIC01000143.1| GENE 8 7814 - 9223 1103 469 aa, chain - ## HITS:1 COG:VC0866 KEGG:ns NR:ns ## COG: VC0866 COG4623 # Protein_GI_number: 15640882 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted soluble lytic transglycosylase fused to an ABC-type amino acid-binding protein # Organism: Vibrio cholerae # 37 463 29 460 530 171 29.0 2e-42 MKDKEFRMSAKTLILPLFLILLCCVFGCRNKHHTEKNETAHDLPQIKDSGELVVLTLYSS TSYFIYRGQEMGFQYELSEQFAKSLGLKLRIEVARNVRELIQKLQSGEGDMIAYNLPITK EWKDSLLYCGEDIITHQVIVQQGNGKHKPLKDVTELIGKDIYVKPGKYYDRLVNLNNELG GGIHIHKITGDSVTAEDLITQVAQGKIPYTVADNDLAKLNKTYYPNLNIDLSISFDQRSS 
WAVRKDSPELAAAATNWHKENMTSPAYTASMKRYFENSKMMPHSPILSLKEGKISHYDDL FKKYAQEIGWDWRMLASLAYTESNFDTTAVSWAGAKGLMQLMPATARAMGVPAGKEQNPE ESIKAAVKYIAATDRSFNMIPDKQERLNFILASYNAGLGHIYDAMALAKKYGKNNLVWKD NVENFILLKSNEEYFTDPVCKNGYFRGIETFNFVRDIQSRFESYKKKIK >gi|226332177|gb|ACIC01000143.1| GENE 9 9220 - 9837 510 205 aa, chain - ## HITS:1 COG:BH1275 KEGG:ns NR:ns ## COG: BH1275 COG0572 # Protein_GI_number: 15613838 # Func_class: F Nucleotide transport and metabolism # Function: Uridine kinase # Organism: Bacillus halodurans # 2 198 6 202 211 216 54.0 3e-56 MLIIGIAGGTGSGKTTVVRKIIESLPVGEVVLLPQDSYYKDSSHVPVEERQNINFDHPDA FEWSLLSKHVMMLKEGKSIEQPTYSYLTCTRQPETIHIEPREVVIIEGILALCDKKLRNM MDLKIFVDADPDERLIRVIQRDVIERGRTAEAVMERYTRVLKPMHLQFIEPCKRYADLIV PEGGSNAVAIDILTMYIKKHLKNEE >gi|226332177|gb|ACIC01000143.1| GENE 10 10050 - 10214 69 54 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253571741|ref|ZP_04849147.1| ## NR: gi|253571741|ref|ZP_04849147.1| predicted protein [Bacteroides sp. 1_1_6] # 1 54 6 59 59 94 100.0 3e-18 MRNIQNPYFFPLAKKALTLYSQNMSETAFFTVMSVAVHRPFSNIVFDLEQATDK >gi|226332177|gb|ACIC01000143.1| GENE 11 10299 - 11522 738 407 aa, chain + ## HITS:1 COG:BH1529 KEGG:ns NR:ns ## COG: BH1529 COG4974 # Protein_GI_number: 15614092 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Bacillus halodurans # 214 388 105 288 299 62 32.0 2e-09 MEQANVKVSFYLKKSEADADGMCPVMARLNIGKYSEAAFSLKLRVPQAIWSSGRASGKSV KAKEINNRLDEIRAMALGIYAELSVVRDSVTADDVKSLLLGMAGEQTTLLSYFRTFIENF AKRVGVNRTEGSLRSYRNAYNHVERFMREKYNLSDIPFSALTLSFIQDYDSHLRTDCRLS PGTIINLTVQLKIIVGEAVADGIITTYPFTGYEPVRPKQKRRYLTSEELQRLMTMPLHRP NLYLTRDLFLFSCYTGIPYSDMRLLSKEHLSLADDGTWWIRSSRRKTGVEFEIPLLDLPL HIMEKYRDTAPDGKLLPMYSNSTMNLNLKRIAKLCDIDCPLVFHAGRHTYATEITLGHGV PLETVSKMLGHARIETTQIYAKVTDDKINADTRVLNERIAERFSVVI >gi|226332177|gb|ACIC01000143.1| GENE 12 11545 - 12852 900 435 aa, chain + ## HITS:1 COG:PH1826 KEGG:ns NR:ns ## COG: PH1826 COG0582 # Protein_GI_number: 14591576 # Func_class: L Replication, recombination and 
repair # Function: Integrase # Organism: Pyrococcus horikoshii # 312 407 189 277 285 59 35.0 1e-08 MKKKSEHADKTIRHRSTFAILFYINRTKMRKDGTCQLLCKVSIDAEWEQIGTKVSVNPDI WNPEKGLANGRSANAVTVNRAIDELTEEITGHYNRIKNSLGFITAELVKNAVMGVGLKPL TLLALFREHNEDFRRRVGLDRIKETLDSYLRSYKHLSAFIKDKKGVEDVTLRSLDKNFYD DFELFLCKDCHMMPKTVHEHLYRLKKMTKLAVSQGTLRRDPYCRLHPALPRRKSRHMKLE DLKKLMETPVEKPQLQFVRDMFLFSTFTGLAYADLKRLKTSDITQSEDGAWWIHIRRQKT DTLSSVRLLDIPLRIIEKYRNQRQGDNVFNVYRRGYFILLTRELGKVYGFDLTFHQARHN FGTHVTLSLGVPIETVSRMMGHMSISTTQLYAQVTDKKVDEDMKALKASGFSSTTELCEE DFTARKGRKRPPRAI >gi|226332177|gb|ACIC01000143.1| GENE 13 12912 - 13223 325 103 aa, chain - ## HITS:1 COG:no KEGG:BT_1930 NR:ns ## KEGG: BT_1930 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 93 5 92 97 75 44.0 5e-13 MSYHFLERKDPRVDVLFQGLDNMERLIAAMEDTPKSVFHGERFLTDEELSKILRVSRRTL QEYRTLGVVPYYLVQGKALYKESDILKILEDSYKRCKEDMRWV >gi|226332177|gb|ACIC01000143.1| GENE 14 13220 - 13513 258 97 aa, chain - ## HITS:1 COG:no KEGG:BF1793 NR:ns ## KEGG: BF1793 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 88 1 88 97 87 50.0 1e-16 MEIVCIDKQTFDELRVRFGKLEEKVMGMCRPVEDLGLKKWLDNQEVCEILRISKKTLQVY RDKGILPYSRIKHKIFFKTEDVHKLLESNYYHLKREL >gi|226332177|gb|ACIC01000143.1| GENE 15 13519 - 13848 340 109 aa, chain - ## HITS:1 COG:no KEGG:BT_2651 NR:ns ## KEGG: BT_2651 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 93 1 95 97 93 50.0 3e-18 MLQINSESVQIMLMQIMERFDRIDRTLERMNKLKECLEGDTLLDNYDLCQLLGITKRTLA RYRQKKYVTYYMIDGRTYYKASEVEAFLNQKGKVLPPKLKNRMDVQLKK >gi|226332177|gb|ACIC01000143.1| GENE 16 14063 - 14479 193 138 aa, chain - ## HITS:1 COG:no KEGG:BVU_0680 NR:ns ## KEGG: BVU_0680 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 135 1 137 138 130 48.0 1e-29 MARIQVVTAITLDGFLPEPAGALVSWVRNDRRGFPFWRERCSALILPHSILDLLCEKDNR 
DDSFIYLAEIIEPESVELLRGLFLYDLVDELVVYQLPFSAGQGIPVLNTFRPQHWELHKT TSFSNGICRLIYRKSCKM >gi|226332177|gb|ACIC01000143.1| GENE 17 14784 - 16100 840 438 aa, chain - ## HITS:1 COG:STM2562 KEGG:ns NR:ns ## COG: STM2562 COG2204 # Protein_GI_number: 16765882 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Salmonella typhimurium LT2 # 5 437 9 438 445 281 38.0 2e-75 MKRKILIVEDNVGLSQIQKDWLSRAGYDAVTAMSEPIARSLIRKTQFDLILSDVRLPEGD GISLLEWLRKEKKDIPFIITTEFVSVPDVVRTIKLGARDYLPKPVHREHLLELAEDVFHP VATVRKQERQLFRRISPMILKVEKFARLVAPSDMSVMILGANGTGKESVAQTIHDNSERY GKPFVAVNCGALPRELAASLFFGHEKGAFTGADTAKTGYFDLAKGGTLFLDEIGTMPHEI QSMLLRVLQENVYTPIGSGRERISDVRVISATNENMEQAIKEGRFREDLYHRLNEFEIRQ PSLEECPEDIIPLAEFFRERFSKELKRTTQGFTEETKQRMLAYRWPGNVRELRNRVKRAV LVSESPMLDVDGLEITVCVCRNEETDTLAILPLKGEAFEKESIIRALKACNGHREQAAGM LNINPATLYRKMKKYGLK >gi|226332177|gb|ACIC01000143.1| GENE 18 16075 - 16317 205 80 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|218130573|ref|ZP_03459377.1| ## NR: gi|218130573|ref|ZP_03459377.1| hypothetical protein BACEGG_02162 [Bacteroides eggerthii DSM 20697] # 1 80 714 793 793 132 87.0 7e-30 MKNGDRHKLREITHRMQPMWEFLRMEEPLLAYRALLKDSKTSDKELKEYTRQIIDSTAML IKAAEAEIKRLTNETEDTDS >gi|226332177|gb|ACIC01000143.1| GENE 19 16415 - 18400 1254 661 aa, chain - ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 293 660 298 674 676 154 32.0 7e-37 MKILLQHKIFIGYFLLMAIIGSMVAIVLHERSRVQKIENESIAIFQTQHNINTTHRYVTT LVTYGESVMVWNDEDSGAYRERRVRTDSMLQTLRTQCKDFIQPEQIDSLRTLLAAKEDHL FQIMEATREQKKTDSLLFHQKPTVTTQTTTRTVTRKKKGIAGFFGGKETVQMPVVTTRQT APDKELISLLNKRKRDIETYTDSLRLCNRELNRKLRLLITSLDEQTWNAFRSKEERLKAS YEHSTLVITGLIIFSIILLFISYLVIQRDIKVKAKNRKHLEETIEQNIALLEMRKNIILT ISHDIRAPLNVISGSAELAVDTREKKRRNTHLNNIRIVCRHVVHLLNNLLDVYRLNEAKE 
TRNDVPFNLNALLERIAFGFSHVINNKGILFNHDFTGTDVKLCGDVDRIEQILDNLLSNA VKFTETGTISLNARYNKGELVLEIKDTGIGMSEDALLRIFRPFERLGSVRNAEGFGLGLP ITKGLVNLLGGTIDVTSGIDQGSTFRVTLPLKTTDETVESENLIIPHPAHLPRNVLVIDD DTMLLNVIKEMLERNGMNCTTCVTSKDVVKAMRGKDYDLLLSDIQMPGTNGIDLLTLLRN SNIGNSRTIPIVAMTARGDRDKEAFLHAGFTDYIYKPFSSSELLGLLSRMKTDRREKKPE G >gi|226332177|gb|ACIC01000143.1| GENE 20 18522 - 19880 467 452 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 3 451 2 455 460 184 28 5e-46 MKKTVIYIILSGWMFAGCGTYSRYHRPDLSTENLYRDVPADIDTTTIASLSWREMFTDPK LQSLIETGLERNTDLNVARLRVEATEAVLMTARLSYLPSLGLTAEGNANKHDGATAKTYN VGASASWELDIFGKLTAAKRGAAAALQGSRAYRQAVQTQLVATIADSYYTLAMLDAQMAI SNRTLENWRTTVRTLEALKKVGKSNEAGVLQAKANVMRLEASLLSIRKSISETENALSAI LAMPSHSIGRSNLAEAAFPDTVSIGVPLQLLSNRPDVRQAEMELAQAFYATNAARAAFYP NITLSGTLGWTNNGGGVIVNPGQWLLNAIGSLTQPLFNRGTNIANLKIAKSRQEEAKLLF RQSLLNAGKEVNDALTAWQTAKSQIEINARQVETLCDAVRKTESLMRHSNATYLEVLTAQ QSLLEAEVQQLQTRFERIQSVIKLYHVLGGGM >gi|226332177|gb|ACIC01000143.1| GENE 21 19881 - 22994 2676 1037 aa, chain - ## HITS:1 COG:BMEI1629 KEGG:ns NR:ns ## COG: BMEI1629 COG0841 # Protein_GI_number: 17987912 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Brucella melitensis # 4 1024 3 1022 1051 736 40.0 0 MNLRFFIDRPVFSGVISVVIVLLGMISMFSLPVEQYPDIAPPTINVFATYPGANAETVQK AVITPLEEAINGVEDMTYMTSTASNTGDASINIYFKQGTNADMAAVNVQNRVNGALSQLP AEATKTGVTTEKQQNAELMTFALYSPDDRFDQTFLNNYVKINVEPRLKRISGVGKAQLFG SNYSMRLWLRPDKMAQYGLIPDDISAVLARQNIEAATGSFGANHPTANEYTMKYRGRLSG AEEFGELVVKSLPGGNVLRLKEVADVELGDEYYNYSSEVNGHPAAMMLINQKAGSNASST IKEIHDVLDDLSRDLPEGTEFVVLTDTNKFLYASIHSVLRTLLEAILLVIVVVYVFLQDI KSTLIPTISIFVSIIGTFAVMSMIGFSINLLTLFALVLAIGTVVDDAIVVVEAVQAKFDE GYQSAVLAADDAMKGVSSAILTSTIIFMAVFFPVAMMGGTSGAFYTQFGITMAVAVGISA VNAFTLSPALCALLLKPYIDEQGNTKNNFAARFRKAFNAVFDSLSRRYVRGVMFIIHRRW LLWSIIGISFGLLVLLVNVTKTGLIPEEDTGTVMVSMNTKPGTSMAQTSKVMERINSRLD SIGEIEYSGAVAGFSFSGSGPSQAMYFVTLKDWEDRKGEGQSVNDVIGKIYAATSDIPDA 
TVFAMSPPMIAGYGMGNGFELYLQDKAGGNIAAFKEEADKFVEALSQRPEIGEVYSSFAT DYPQYWVDIDAAKCEQSGVSPADVLSTLSGYYTGQYVSDFNRFSKLYHVTMQAPAEYRVN AESLHHMYVRASDGGMSPLSRFVRLTKTNGPSDLTRFNLFNAISISGSPAQGYSSGQVLE AIGETAREVLPSNYTYEFGGISREESKTTNNATLIFLLCMVLVYLILCALYESVFIPFAV LLSVPCGLMGSFLFAWLFGLENNIYMQTGLIMIIGLLAKTAILLTEYAGKRRSEGMTLAQ AAYSAAKVRLRPILMTVLSMVFGLVPLMMAHGVGANGSRSLATGVIGGMIVGTLALLFLV PSLFIVFQYIQERVKHN >gi|226332177|gb|ACIC01000143.1| GENE 22 22991 - 24112 1046 373 aa, chain - ## HITS:1 COG:XF2093 KEGG:ns NR:ns ## COG: XF2093 COG0845 # Protein_GI_number: 15838684 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Xylella fastidiosa 9a5c # 12 363 23 376 408 124 29.0 2e-28 MNKQFIITNFAAFALMLFLPTGCRQADGKQDAVQSYRVIKVAASPVEISESYSAAIRGRQ DVDILPQISGRIIRLKVKEGERVKTGQVLAVIDQVPYRAALRTAQANVSAAQAKVETARI ELRGKQALFDEKVISDYELSLARNQLAVACAELEQAKAQESDARNNLSYTEIKSPSNGVV GTLPYRIGALVGPNMAQPFTVVSDNAEMYAYFSISENMLRRYSARYGSIDSMIAGTPEVG LQLNDGSLYKAKGRIETVSGVVDPVTGTVQIKALFPNPDRELLSGSIGNVILQNPKTEAV TIPMTATVELQDKIIAYRLKNGQAEAAYLTVDRLNDGNRFIVKEGLSVGDTIVAEGVGLV REGMSITPKNETK >gi|226332177|gb|ACIC01000143.1| GENE 23 24156 - 24449 167 97 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157690935|ref|YP_001485397.1| ribosomal protein acetyltransferase [Bacillus pumilus SAFR-032] # 2 91 86 169 171 68 40 3e-11 NVGWHFNKRFEGKGFACEAAAGLLDYLFREAGARRIYGFVEDDNIRSKRLCERLGMRREG CFKEFVTFVNNPDGSPKYEDTCVYAILEKEWNTIRQW Prediction of potential genes in microbial genomes Time: Thu May 12 03:20:11 2011 Seq name: gi|226332176|gb|ACIC01000144.1| Bacteroides sp. 1_1_6 cont1.144, whole genome shotgun sequence Length of sequence - 17488 bp Number of predicted genes - 21, with homology - 20 Number of transcription units - 7, operones - 5 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 31 - 1320 805 ## BVU_1598 transposase + Prom 1440 - 1499 2.5 2 2 Op 1 . + CDS 1526 - 1687 99 ## gi|218130561|ref|ZP_03459365.1| hypothetical protein BACEGG_02150 3 2 Op 2 . 
+ CDS 1668 - 2423 700 ## COG3279 Response regulator of the LytR/AlgR family 4 3 Op 1 . - CDS 2511 - 3299 503 ## BVU_0715 ThiF family protein, ubiquitin-activating enzyme 5 3 Op 2 . - CDS 3296 - 4015 585 ## BT_2648 hypothetical protein 6 3 Op 3 . - CDS 4019 - 5107 614 ## BVU_0713 hypothetical protein 7 3 Op 4 . - CDS 5157 - 5378 280 ## BVU_0712 hypothetical protein 8 3 Op 5 . - CDS 5383 - 6363 721 ## BVU_1522 hypothetical protein 9 3 Op 6 . - CDS 6395 - 6616 194 ## gi|255015017|ref|ZP_05287143.1| hypothetical protein B2_14008 - Prom 6860 - 6919 2.5 - Term 6886 - 6932 8.3 10 4 Op 1 . - CDS 6984 - 7220 116 ## 11 4 Op 2 . - CDS 7417 - 9579 1507 ## COG0550 Topoisomerase IA 12 4 Op 3 . - CDS 9610 - 9870 404 ## gi|212695114|ref|ZP_03303242.1| hypothetical protein BACDOR_04652 13 4 Op 4 . - CDS 9911 - 11680 1791 ## BT_2642 hypothetical protein - Prom 11716 - 11775 6.7 + Prom 12485 - 12544 5.8 14 5 Op 1 . + CDS 12628 - 13197 238 ## PRU_1548 hypothetical protein 15 5 Op 2 . + CDS 13194 - 13988 202 ## PRU_1547 hypothetical protein + Term 14061 - 14104 4.0 16 6 Op 1 . - CDS 14248 - 14541 202 ## BVU_1580 hypothetical protein 17 6 Op 2 . - CDS 14553 - 15050 486 ## ETAE_p030 hypothetical protein 18 6 Op 3 . - CDS 15062 - 15316 272 ## gi|253571770|ref|ZP_04849176.1| conserved hypothetical protein 19 6 Op 4 . - CDS 15328 - 15594 385 ## gi|253571771|ref|ZP_04849177.1| conserved hypothetical protein 20 6 Op 5 . - CDS 15628 - 16161 392 ## Nmag_4223 hypothetical protein - Prom 16314 - 16373 1.8 21 7 Tu 1 . 
- CDS 16379 - 17293 813 ## BT_2614 putative mobilization protein - Prom 17423 - 17482 3.7 Predicted protein(s) >gi|226332176|gb|ACIC01000144.1| GENE 1 31 - 1320 805 429 aa, chain + ## HITS:1 COG:no KEGG:BVU_1598 NR:ns ## KEGG: BVU_1598 # Name: not_defined # Def: transposase # Organism: B.vulgatus # Pathway: not_defined # 1 429 1 429 429 624 70.0 1e-177 MAKVQIKSEKLTPFGGIFSIMEKFDSMLSPVIDSTLGQRCRSIIGYQFSEIVRSLMSVYF CGGSCVEDVTSQLMRHLSYHPTLRTCSSDTILRAIKELTKENISYTSDQGKTYDFNTADK LNTLLINALVSTGELKEIEEYDVDFDHQFLETEKYDAKPTYKKFLGYRPGVYVIGDKIVY IENSDGNTNVRFHQADTHKRFFALLESQNIRVNRFRADCGSCSKEIVSEIEKHCKHFYIR ANRCSSLYNDIFALRGWKTEEINGIQFELNSILVEKWEGKCYRLVIQRQRRNSGDLDLWE GEYTYRCILTNDYKSSTRDIVEFYNLRGGKERIFDDMNNGFGWSRLPKSFMAENTVFLLL TALIHNFYKTIMSRLDTKAFGLKKTSRIKAFVFRFISVPAKWIMTARQYVLNIYTENRAY AKPFKTEFG >gi|226332176|gb|ACIC01000144.1| GENE 2 1526 - 1687 99 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|218130561|ref|ZP_03459365.1| ## NR: gi|218130561|ref|ZP_03459365.1| hypothetical protein BACEGG_02150 [Bacteroides eggerthii DSM 20697] # 1 53 311 363 363 100 92.0 2e-20 MSPIESTGLGLKNITERYALLCDKKVKIENAENFYSVSLPIIKNTIPYEHTDS >gi|226332176|gb|ACIC01000144.1| GENE 3 1668 - 2423 700 251 aa, chain + ## HITS:1 COG:SA0251 KEGG:ns NR:ns ## COG: SA0251 COG3279 # Protein_GI_number: 15925964 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Staphylococcus aureus N315 # 1 235 1 231 246 94 31.0 1e-19 MNILILEDEPRNAKRLIRLLNDIDRTFIVEGPLASIKETVEFFQSGKTTNLILADIRLTD GLSFEALKYAPATIPIIFTTAYDEYAVQAFKFNSFDYLLKPLDSDELEAAIDKAAKAGKN YTDENLQQLFDALQKNKFRYRERFLLPYRDGYKTVRVSDINHIETENKIVYLRLNNGTSE VVNVSMDELEHQLNPDYFFRANRQYIINVEHVLFLGNYFGGKLIVRLKGYPKTEIQVSKE KAQRLKEWIDR >gi|226332176|gb|ACIC01000144.1| GENE 4 2511 - 3299 503 262 aa, chain - ## HITS:1 COG:no KEGG:BVU_0715 NR:ns ## KEGG: BVU_0715 # Name: not_defined # Def: ThiF family protein, ubiquitin-activating enzyme # Organism: B.vulgatus # Pathway: not_defined # 1 262 1 263 263 326 57.0 
4e-88 MKRVHYIDNYLLNPQHPVTVNLIGAGGTGSQVLTCLARLDVTLRALGHPGLFVTLYDPDI VTDANIGRQLFGCSDLGLNKAQCLITRVNNFFGNDWKAVPDIFPTVLKDARRDDMANITV TCTDNIKSRLDLWNVLKAVPTSNYRDYETPLYWMDFGNTQSTGQVVLGTIPKKIKQPASE LYKTVDSLKVITRFVKYARVKEADSGPSCSLAEALEKQDLFINSTLAQLGCNILWKMFRN GMLEHHGLFLNLETMKVNPIMI >gi|226332176|gb|ACIC01000144.1| GENE 5 3296 - 4015 585 239 aa, chain - ## HITS:1 COG:no KEGG:BT_2648 NR:ns ## KEGG: BT_2648 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 238 2 232 233 263 53.0 5e-69 MLETNELTQKLRTLLHPRAALIAYSGEKDNSYASDNNYFVEIRDIDDNGMMGEGRPVTVD FMNELVRGYSESHSTTPYGRIPPNMLWCDPRKGSERYIWYNSPQKRMMFFKESLQVENAK YNLPGVIYVAGGSSLSIYAYKDKKLTEKTELYAAPFFNVTGANVCLGSAKIDKPRDLTYK NLLEYWEKKFWLTEFSHLGGGGNPTRSNLILVTKAAKDKPFNLDELKPLNNLKLKDILK >gi|226332176|gb|ACIC01000144.1| GENE 6 4019 - 5107 614 362 aa, chain - ## HITS:1 COG:no KEGG:BVU_0713 NR:ns ## KEGG: BVU_0713 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 31 362 33 366 375 199 35.0 1e-49 MCTIGRTTVKAAGEAHHRPDAEECTVRREQGRTDCGSKRGRNSFLSTAFIPLAQNNLTVD CYDSGECNSINVITRDNYEFLRDSYFRYSLLLQKEGVHMPGNSIGEGIANLYDEMNTLVG DELHVNIEQEAGRLFFRLWKYHKWGSFTLYYFPVKFLESLNPILRRIAITFIHKLMKANG IDTFVDYEESDFLFEMLSEDDGSDPEEWKERMKLLDSYQEGKIGRLLKRVESKSYYKNLP KALDTYEPQNGFEQPLVDAMKRGLPFLTPARGIMQYAYDAYYSENPDFHPMYLQQQIRVV YDINDIVSDYLVDYYNSCSRETYDITPVTALDLSPDTEELFSMDDYPERFFRWADEFISL IS >gi|226332176|gb|ACIC01000144.1| GENE 7 5157 - 5378 280 73 aa, chain - ## HITS:1 COG:no KEGG:BVU_0712 NR:ns ## KEGG: BVU_0712 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 73 1 73 73 85 60.0 5e-16 MALEVTGMARSFTFKKGSGMVTLDDPNPSDSPEMVMSYYSNFYPELTTATVHGPVIKDDK AVYEFKTTIGTKG >gi|226332176|gb|ACIC01000144.1| GENE 8 5383 - 6363 721 326 aa, chain - ## HITS:1 COG:no KEGG:BVU_1522 NR:ns ## KEGG: BVU_1522 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 326 1 349 351 151 40.0 3e-35 
MFFTAIHQMMTEGVDLTLVIRKANGQLTVSLLPKSNGLKDEAQNHLVPLTVSGHPQELDA GFLQAVARPIQKTTGLISNMAQFEAQAEKAASESKAAKETKAKETKEEKEKREKYEKHFK KAEELIAAKNHKEAVTALGQARLYAKPQDQKKIDEMVAEQTKAMNKGSLFELMEEPAPQP QQPQAQPAAAPVQQPRQAPQYPQQPQAQRPMAAQAPGGQQRPAANPQQQQTMWPPQQPPQ GPAQYYAQPQPQPYMGQQPVPQQAPQPAHGGYHGTHWQEPVQHPEELSPHYHAGEDEPNY RPEDYEEYPDFPQSMLENHVSQYQAV >gi|226332176|gb|ACIC01000144.1| GENE 9 6395 - 6616 194 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|255015017|ref|ZP_05287143.1| ## NR: gi|255015017|ref|ZP_05287143.1| hypothetical protein B2_14008 [Bacteroides sp. 2_1_7] # 1 71 1 71 72 68 63.0 1e-10 MNQAFATSAPPAPMRMPMPVTRRKRTVEPEPAVKFPPRGTTGPVHISTLLNPILEISRHP DRNRLLAKLFSEE >gi|226332176|gb|ACIC01000144.1| GENE 10 6984 - 7220 116 78 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKTREYIDKIQANRWVRTIALGFAAFIVFSYMYTRGISPVWAAVAIVCFRGFFRFLYRIA CLLVAAAILFCILSFLVS >gi|226332176|gb|ACIC01000144.1| GENE 11 7417 - 9579 1507 720 aa, chain - ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 26 660 6 654 709 414 39.0 1e-115 MSNRRFPFSGSGSGERLKFYPNMKAIIAEKPSVGMDIARVVGATDKKDGYCTGNGYMVTW ALGHLVSLAMPGAYGYTKTSAENLPMLPDPFRLVSRQVRTDKGMVTDIAAAKQLKIIDSV FDRCDSIIVATDCAREGELIFRWIYSYLGYTKPFKRLWISSLTDEAIREGLANLRDGSDY DSLYAAADCRAKADWLVGMNASRALAVASGSANNSIGRVQTPTLAMVCARFKENRNFVST PYWQLHLTLKRQDAHRMFIHTEEFKEKDAAEAAYRKITSGSVATVTGSEHKRTFQQAPLL YDLTTLQKDCNVHYDLTAEKTLSIAQSLYEKKLISYPRTGSRHIPEDVMRHIPSLLGKVV SMPEFREYGQSFDMSDLNTRSVDDTKVTDHHALIITGISPEGLSEAESTVYTLIAGRMLE AFSPPCEKELLVMECACEGMAFRSRSSSIVRPGWRGVFARKEDREKDEPERDGGTAEFAE GETVPVMGHGMAQKKTLPKPLYTEATLLAAMETCGKNITDEQAKEAIKELGIGTPATRAA IITTLIKRDYIARSGKSIIPTEKGMYIYEAVKDMRVADVELTGSWEKTLAQVERHTLDTE TFMQSILDYTRRATEEILRLDFPAMQERAFTCPKCKTGKIILRSKVARCDHDGCGLLVFR RILNKELTDTHMEQLFSSGTTRLIKGFKGKKGVPFDAAVTFDAEYNTVFSFPKTGNGKKK >gi|226332176|gb|ACIC01000144.1| GENE 12 9610 - 9870 404 86 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|212695114|ref|ZP_03303242.1| ## NR: gi|212695114|ref|ZP_03303242.1| hypothetical protein BACDOR_04652 [Bacteroides dorei DSM 17855] # 1 86 11 96 96 129 89.0 6e-29 MNVNNKNNTPFKAEDVNWEELAGIGILKDELEMSGELDTLLKGEKTGVMSLSLVLLGVDV VMDATLQLVRKDDGALIEILGVKPVA >gi|226332176|gb|ACIC01000144.1| GENE 13 9911 - 11680 1791 589 aa, chain - ## HITS:1 COG:no KEGG:BT_2642 NR:ns ## KEGG: BT_2642 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 586 15 600 601 531 50.0 1e-149 MAKKNVRDEPLKPQVSENEQMSDIVFILDKMELILQAVSKIGKDGKYSTVPADKEHSNSF LKIDRYANMFENFVKNFWSQLKDPTRFGILSVKADTLDSPEVKQAIEDLAAGKQTKAVED FLKKYEIVPRDKENQSINNQNQEEMAKKNETQQQATQGDGTQQPKYRYNESMINWEELKN FGLSREYLMERGLLDQMLRGYKTNQVVPISMNFGSAVLRTDARLSFQQSVGGPIVLGIHG IRQKPELERPYFGHIFSEEDKKNLLETGNMGRVVELKGRNGEYIPSFISIDKLTNEVVAM RAENAYIPQEIKGVKLTDQEINDLREGKKVFIEGMISNNGKEFDAHIQVNAERRGIEYIF ENDKLFNRQSLGGVELTKQQIEDLNAGKAIFVEGMERKDGELFSSYVKLDEATGRPSYTR YNPDSPEGAREIYIPNEIGGVKITAEEQQQLREGKVIFLNDMVNRKGEEFSSFIKADLET GRLSYSRTPDGFEQRAEFKIPEKVWDVKLTRNQRADLQSGKAVLVEGIKGYDGKTISQYV KANFNQGRLDFYNENPDRKRDASQRNVVANAQKQGQEQAGRKSKGASIA >gi|226332176|gb|ACIC01000144.1| GENE 14 12628 - 13197 238 189 aa, chain + ## HITS:1 COG:no KEGG:PRU_1548 NR:ns ## KEGG: PRU_1548 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 189 69 257 257 109 38.0 5e-23 MSYLNKPYYISLLNAAEIHGAAHQRPQKFSVMSVFPKSSVSQSKNNTLVWVYRKEIPTDF LLSKNSETGVIYYSNAELTALDIVQYEQYIGGLSRASTILEELTEKLDFRGASNKLFGYT SIATIQRLGYILDEVLNAAEIADTLYMELTSYVKRFKYIPLTINKPKDCAERNNRWKIFV NTIIETDEI >gi|226332176|gb|ACIC01000144.1| GENE 15 13194 - 13988 202 264 aa, chain + ## HITS:1 COG:no KEGG:PRU_1547 NR:ns ## KEGG: PRU_1547 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 22 263 2 243 247 370 73.0 1e-101 MINRTAITQWNKIVPWNDNAQVEQDLIISRALVAIFSDEFLASQLAFRGGTALHKLYLTP QPRYSEDIDLVQITPGPIKPIMFRLGEVLDFLPDRVTKQKRYNNTMLFRMESEIPPTIPL 
RLKIEINCFEHFNELGLVKIPFEMENSWFSGKCGVTTYRLNELLGTKLRALYQRKKGRDL FDLYVALSEASVDVEEIMRCYHRYMSFVVQQPPTYKQFINNMEEKMSDSEFLGDTQNLIR PEREFNPQLGYDLVRSQLIDRLQK >gi|226332176|gb|ACIC01000144.1| GENE 16 14248 - 14541 202 97 aa, chain - ## HITS:1 COG:no KEGG:BVU_1580 NR:ns ## KEGG: BVU_1580 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 97 1 96 96 125 62.0 3e-28 MKIKCQEHYENTVKYAKSIGDETLSKCIERLKQWEVNSNGRYEIELYTDFAPYSFGFAEV AKDGTRGVVGGLLYHGKPDRSFAVMIGGPFHGWSIHT >gi|226332176|gb|ACIC01000144.1| GENE 17 14553 - 15050 486 165 aa, chain - ## HITS:1 COG:no KEGG:ETAE_p030 NR:ns ## KEGG: ETAE_p030 # Name: not_defined # Def: hypothetical protein # Organism: E.tarda # Pathway: not_defined # 4 64 69 126 126 70 54.0 2e-11 MRKITLSEYNSIPKDYRGIWTVERWDLPNWAEIREKHMGKRTMMVYDKGTCLLVEGLGFE IVDDSIWKKADEVKKEIGSHFLKFYCEQGREPHYADCTIRWCDTLETEEVRIALAMDSDN EKDDEIFFYCDSLENLKSLADKGKEDFIIAGCIGFGIYEELLQTT >gi|226332176|gb|ACIC01000144.1| GENE 18 15062 - 15316 272 84 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253571770|ref|ZP_04849176.1| ## NR: gi|253571770|ref|ZP_04849176.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 84 1 84 84 169 100.0 4e-41 MHKKNRQTLVWDNIPEWAIFALEYGIEEELFLPNEDLEMISRFIGENFPNGYTMSVDWES CTEFNPRPAFGKPCKTHKVTFVTN >gi|226332176|gb|ACIC01000144.1| GENE 19 15328 - 15594 385 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253571771|ref|ZP_04849177.1| ## NR: gi|253571771|ref|ZP_04849177.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 88 1 88 88 172 100.0 7e-42 MKKRSNFTPMKRFHEILDNYGLKLMEVGTNHLRVFFGNRKLFDYYPLRMKLFDYRQWQQL TYPSVMDGTDKWETELEGIINSLMVSPQ >gi|226332176|gb|ACIC01000144.1| GENE 20 15628 - 16161 392 177 aa, chain - ## HITS:1 COG:no KEGG:Nmag_4223 NR:ns ## KEGG: Nmag_4223 # Name: not_defined # Def: hypothetical protein # Organism: N.magadii # Pathway: not_defined # 2 95 4 111 144 68 36.0 6e-11 MSNTRVVNIRKESCDVYIGRAGQGKDGYFGNPFRLEATMTRGGTLDRYRKYFYYRLSTDE KFRRRIGELQGKTLGCFCKPNPCHGDIIKEYLERMEGCTDEIAIEKTYWKGVAYPVREIQ VGNDIFRVSVKSLCDELVNDMHNGIYEAMEASEEIDGYCTDEELCTLTDDDLYRMCC >gi|226332176|gb|ACIC01000144.1| GENE 21 16379 - 17293 813 304 aa, chain - ## HITS:1 COG:no KEGG:BT_2614 NR:ns ## KEGG: BT_2614 # Name: not_defined # Def: putative mobilization protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 301 372 673 681 507 81.0 1e-142 MDAWQGGAQDQLQGQIASAKIPLSRMISPQLYWVMTGDDFTLDLNNPEHPKILCVGNNPD RQNIYSAALGLYNSRIVKLVNKKGQLKSSIIIDELPTIYFRGIDNLIATARSNKVAVCLG FQDFSQLTRDYGEKEAKVIQNTVGNIFSGQVVGETAKNLSERFGKILQQRQSISINRQDT STSINTQMDSLIPASKIANLSQGTFVGSVADNFGEEIDQKIFHARIIVDNEKVAAETKAY KKIPVINEFKDADGNDIMQQQIDRNYSQIKADVVRIIEDEMTRIKNDPDLKHLIPQEEND KDNE Prediction of potential genes in microbial genomes Time: Thu May 12 03:21:34 2011 Seq name: gi|226332175|gb|ACIC01000145.1| Bacteroides sp. 1_1_6 cont1.145, whole genome shotgun sequence Length of sequence - 15967 bp Number of predicted genes - 23, with homology - 20 Number of transcription units - 12, operones - 6 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 916 548 ## BF1250 putative transmembrane mobilisation protein 2 1 Op 2 . - CDS 953 - 2206 1040 ## BVU_0678 putative mobilization protein 3 1 Op 3 . - CDS 2196 - 2663 483 ## BT_2612 hypothetical protein + Prom 3226 - 3285 3.8 4 2 Op 1 . + CDS 3305 - 4069 761 ## BT_2609 conjugate transposon protein 5 2 Op 2 . + CDS 4066 - 4521 298 ## BVU_1559 hypothetical protein 6 2 Op 3 . 
+ CDS 4524 - 4979 214 ## gi|253571779|ref|ZP_04849185.1| conserved hypothetical protein 7 2 Op 4 . + CDS 4982 - 5731 569 ## gi|253571780|ref|ZP_04849186.1| conserved hypothetical protein + Term 5762 - 5800 5.2 + Prom 5769 - 5828 4.5 8 3 Tu 1 . + CDS 5883 - 6062 206 ## gi|253571781|ref|ZP_04849187.1| predicted protein + Term 6272 - 6313 -0.7 - Term 5920 - 5965 -0.8 9 4 Tu 1 . - CDS 6023 - 6325 152 ## BVU_3757 BexA, putative cation effux pump - Prom 6393 - 6452 3.9 - Term 6398 - 6452 10.1 10 5 Tu 1 . - CDS 6612 - 7169 235 ## CDR20291_3472 putative abc transporter, permease protein - Prom 7230 - 7289 4.9 - Term 7224 - 7279 2.0 11 6 Tu 1 . - CDS 7313 - 7531 62 ## gi|298388108|ref|ZP_06997653.1| ABC-2 type transporter superfamily - Prom 7578 - 7637 3.9 + Prom 7261 - 7320 9.5 12 7 Op 1 . + CDS 7488 - 7628 58 ## 13 7 Op 2 . + CDS 7689 - 7919 109 ## - Term 7873 - 7909 0.2 14 8 Op 1 . - CDS 7971 - 8195 132 ## CDR20291_3470 abc transporter, atp-binding protein 15 8 Op 2 . - CDS 8250 - 9197 327 ## YpsIP31758_0331 hypothetical protein 16 8 Op 3 . - CDS 9213 - 9842 227 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Prom 9868 - 9927 4.1 17 9 Tu 1 . - CDS 9966 - 10556 452 ## CD1752 TetR family transcriptional regulator - Prom 10585 - 10644 5.0 18 10 Op 1 . - CDS 10671 - 10961 172 ## BT_1905 hypothetical protein - Prom 10984 - 11043 2.0 19 10 Op 2 . - CDS 11210 - 11983 353 ## COG0500 SAM-dependent methyltransferases - Prom 12007 - 12066 10.3 + Prom 12374 - 12433 4.0 20 11 Tu 1 . + CDS 12483 - 12665 80 ## + Term 12684 - 12736 15.2 + Prom 12779 - 12838 4.8 21 12 Op 1 . + CDS 12966 - 13262 285 ## Fjoh_3006 hypothetical protein 22 12 Op 2 . + CDS 13264 - 13596 196 ## BF0124 hypothetical protein 23 12 Op 3 . 
+ CDS 13593 - 15966 2013 ## COG3451 Type IV secretory pathway, VirB4 components Predicted protein(s) >gi|226332175|gb|ACIC01000145.1| GENE 1 1 - 916 548 305 aa, chain - ## HITS:1 COG:no KEGG:BF1250 NR:ns ## KEGG: BF1250 # Name: not_defined # Def: putative transmembrane mobilisation protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 305 1 303 666 477 73.0 1e-133 MQQEDDLRGLAKVMEFMRAISIVFIVIHVYWYCYRSFVDMGINIGVVDRILLNFQRTAGL FSNLLITKVFAVIFLALSCLGTKGVKNQKMTWQKIYASFLAGLVLFFMNWWMLDLPFSPM VNATLYTATLTAGYILLLMSGVWISRMLKHNLMEDVFNTANESFMQETRLMENEYSVNLP TKFVYQGREWNGWINVVNVFRASIVLGTPGSGKSYAVVNNYIKQQIEKSFAMYIYDYKFP DLSEIAYNHLLKHTQNYKVKPEFYVINFDDPRRSHRCNPINPKFMADISDAYESAYTIML NLNKT >gi|226332175|gb|ACIC01000145.1| GENE 2 953 - 2206 1040 417 aa, chain - ## HITS:1 COG:no KEGG:BVU_0678 NR:ns ## KEGG: BVU_0678 # Name: not_defined # Def: putative mobilization protein # Organism: B.vulgatus # Pathway: not_defined # 1 400 1 400 415 399 52.0 1e-110 MVAKINRGASLYGAIIYNQQKVNDNTARIIYGNRMIANVSGNTERVMQQTLWAFDNYLLA NKNTEKPVLHISLNPSLDDRLTDLQFTELAREYMQRMGYGNQPYIVYMHEDIDRRHIHIV STCVNEKGEKIDDAYEWNRSMKACRELEKKFGLKQVDDKRKELLEPYLKKADYQNGDVKR QVSNIVKSVFSTYRFQSFGEYSALLSCFNIEAKQVKGEFEGTPYNGIVYTMTDDSGKPIC TPVKSSLIEKRYGYEGLEKRIGYNAREYKNKKWQPKIRNEVALAMHGCRGNKEEFTRLLA GKGIDVVFRENDEGRIYGATFIDHKNREVYNGSRLGKEFSANAFERLFNSPTNMPDLDMP MPALNRQGVFSSEMESSIEQAFGIFGIDANGPDPQEEALARRLQRKKKKKRRSRGIS >gi|226332175|gb|ACIC01000145.1| GENE 3 2196 - 2663 483 155 aa, chain - ## HITS:1 COG:no KEGG:BT_2612 NR:ns ## KEGG: BT_2612 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 146 2 140 140 142 58.0 3e-33 MRMEENKKKNSRKSAGRKPKSDPAVHRYVVRLNSEENGRFDIQFQKSGLKERSKFIKTMI FGKEIKVVRIDKATMDYYIRLTNFYHQFQSIGNNYNQTVKAVKTNFGEKRAFVLLRNLEK ATIDLVLLSKRIIMLTREFEENYLIKQRKEEENGG >gi|226332175|gb|ACIC01000145.1| GENE 4 3305 - 4069 761 254 aa, chain + ## HITS:1 COG:no KEGG:BT_2609 NR:ns ## KEGG: BT_2609 # Name: not_defined # Def: conjugate transposon protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 251 1 252 252 236 47.0 5e-61 MSKEPLFVAVSNQKGGVGKSALTVVLSSYLHFEKNLNVAIIDCDSPQHSLVRMRERDKKA IANSAYFQQLLQQQWNRVPKKAYPIVGSTSEKAREAADELAASGDYDLIFVDLPGTVEST GVFRTIVNMDYVLTPTTPDLIIMQSTLAFSTTVLDYVKNTKDVPLKDILFFWNKLKKRTN VEIYKSYAEVMRELHLTVLDTKLPDLCRYEKEITHAKRDFFRCTLLPPPPKQLEGSGLKE LAEELIVKLKLEHA >gi|226332175|gb|ACIC01000145.1| GENE 5 4066 - 4521 298 151 aa, chain + ## HITS:1 COG:no KEGG:BVU_1559 NR:ns ## KEGG: BVU_1559 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 2 143 3 139 149 68 31.0 6e-11 MTKKKVEVNEDILKNIIVGDIPVFGRAQPMNENTEQPGQESVPVKNTGPAPATEPVEIQS EKTSPVKPRKKKDEAGSYRDKYLVNIPASNRSHVYINREVAECIKRILPVIAPDMSISGY ISNILVDHIQQHWEEISELYNKEYYKPLKPF >gi|226332175|gb|ACIC01000145.1| GENE 6 4524 - 4979 214 151 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253571779|ref|ZP_04849185.1| ## NR: gi|253571779|ref|ZP_04849185.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 151 1 151 151 300 100.0 2e-80 MTQIIFHIAAIACIVYTFCRFGLFIHKRMRKNKNIGKHVHVPPPETSGKAEPDTWEDTAC TSALKDVINTSSIQIVEDPSYRTTYLVNHPIGNRIQVYINRDSYAYIKRFLAVVAPETSM SGYVSTIIDEHLTKYEDEMSELYTEAINKPL >gi|226332175|gb|ACIC01000145.1| GENE 7 4982 - 5731 569 249 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253571780|ref|ZP_04849186.1| ## NR: gi|253571780|ref|ZP_04849186.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 249 1 249 249 426 100.0 1e-118 MEAKIYFTIKLVCAVFIFRQLWIFIFNREMYGIWEKAYRLMRIVRINLWKYRKKRSEQKA REANKKARRREKTDAHNRHTAQPSSAVTEDVIGKTKIVYLEDPEVARKTPTRSEPMEKES IEEDEEIGPDDVIQEEKGLTKEEKEELMAPVDAEPDPDFDTAMTFEDINNVAEVLISENP DEEKAIRAATTIHHKMHDTVILSFLTDKLSNQEKVNRLVSECLDESGRPLAKRKTASKKV EAFNIDKFV >gi|226332175|gb|ACIC01000145.1| GENE 8 5883 - 6062 206 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253571781|ref|ZP_04849187.1| ## NR: gi|253571781|ref|ZP_04849187.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 59 3 61 61 80 100.0 4e-14 MAVTKAILEKWMVEQKRHRLFDKQVQMARELGLNPGKLGKIDNHKQDHEKKSQKATRKR >gi|226332175|gb|ACIC01000145.1| GENE 9 6023 - 6325 152 100 aa, chain - ## HITS:1 COG:no KEGG:BVU_3757 NR:ns ## KEGG: BVU_3757 # Name: not_defined # Def: BexA, putative cation effux pump # Organism: B.vulgatus # Pathway: not_defined # 1 92 1 92 429 148 84.0 5e-35 MDYTYKKIWLIAFPVMMSILIEQLINITDALFLGHVGDVELGASAIAGIWFLAIYMLGFG FSLGLQVVIARRNGEQQYSETGRTFCRTHLFLVAFWLFFS >gi|226332175|gb|ACIC01000145.1| GENE 10 6612 - 7169 235 185 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_3472 NR:ns ## KEGG: CDR20291_3472 # Name: not_defined # Def: putative abc transporter, permease protein # Organism: C.difficile_R20291 # Pathway: not_defined # 2 153 45 198 232 118 42.0 8e-26 MLKDFSLYYPILDLFLAIMCPYMICFASVLVVLDETDMKINRYITITPLGKKGYLISRLI IPVLFAAIVSFVLLSFCSVSDMSLWTTFIISILATILSVVAAMIILAYAGNKVEGMALAK VSALVMVGLIIPFVITDSTQYVFSFLPSFWIAKFLISNNYWFILPAIFLSGGLICILYNR YSTKI >gi|226332175|gb|ACIC01000145.1| GENE 11 7313 - 7531 62 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|298388108|ref|ZP_06997653.1| ## NR: gi|298388108|ref|ZP_06997653.1| ABC-2 type transporter superfamily [Bacteroides sp. 
1_1_14] # 1 72 149 220 220 84 91.0 2e-15 MLIMVPTLIISVAPISVYTMGYKSGVMLLHPSISLIELMSGNISVISLMAISIWCIAMYI FSCLSVKKMMTI >gi|226332175|gb|ACIC01000145.1| GENE 12 7488 - 7628 58 46 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGATDIIKVGTIISIKLFMVEMKAESDIPTSVNNAEPNIIPTARCM >gi|226332175|gb|ACIC01000145.1| GENE 13 7689 - 7919 109 76 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEITFDKMYACIDMGETANACITFSLFSNNMIAPININPISTGSENIRVVVHSSFHCIGR MRNTAIYTSVNSVYRA >gi|226332175|gb|ACIC01000145.1| GENE 14 7971 - 8195 132 74 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_3470 NR:ns ## KEGG: CDR20291_3470 # Name: not_defined # Def: abc transporter, atp-binding protein # Organism: C.difficile_R20291 # Pathway: ABC transporters [PATH:cdl02010] # 1 73 234 305 306 71 50.0 1e-11 MDTPHNLIMSRRYNEVTYSYLENQQELNTTIALDKISTDKKLQSLIQANALLSIHSNEPT LDHIFSEITGRKLL >gi|226332175|gb|ACIC01000145.1| GENE 15 8250 - 9197 327 315 aa, chain - ## HITS:1 COG:no KEGG:YpsIP31758_0331 NR:ns ## KEGG: YpsIP31758_0331 # Name: not_defined # Def: hypothetical protein # Organism: Y.pseudotuberculosis_IP31758 # Pathway: not_defined # 20 310 20 282 290 86 31.0 2e-15 MEHFQEVPDINRLIISPLYLREGLEFAKNQGYNDILISTDDIGISGVSCKHTLNVSLICE YDFIETLIISGYDFTIEPCNLNQLSVLPHLKKLGLWIDKVFTIDFSLFPKLEELKYYHTK QTENVDTLINLKRLHIYNLKSEDLKELKSMNLLEELTLWDAKNINLNGLCQLTNIRTLEI VRSRKMIDIRGLCNSNRLKELSLYYCNNITDISVLNRLSRVKILRLRNCKLLQDLTQLVP NNTIKQISISSLKDLKFLAGMKKLKFLHFDDVIDGDISPLLDSSLEFVGMVSKKKYSHTE KEVNLILERNRLNNK >gi|226332175|gb|ACIC01000145.1| GENE 16 9213 - 9842 227 209 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 206 1 217 245 92 26 2e-18 MITVKNLSFSYSQLPFIENMNFQVRSGEIFGFLGPSGAGKSTLQKILIGMLPKYKGSVSI NNNEIRNKPDHFYESIGVDFEFPSFYEKFTALENLKFFGSLYDSKLIPAEQLLEMVGLSQ HANKKVAGFSKGMKSRLNFARSIIHQPKVLFLDEPTSGLDPSNSQVMKDIIRELKNKGVT IILTTHNMHDATELCDRVAFIVDGNISSN >gi|226332175|gb|ACIC01000145.1| GENE 17 9966 - 10556 452 196 aa, chain - ## HITS:1 COG:no KEGG:CD1752 NR:ns ## KEGG: CD1752 # Name: 
not_defined # Def: TetR family transcriptional regulator # Organism: C.difficile # Pathway: not_defined # 1 195 1 196 196 144 39.0 2e-33 MAKAFDDNERKLIKDKLKEGALLFIQQQGVRKTSVDELVKYANISKGAFYLFYTSKELLF FDTIIDYHKKLEKEFLNAINKHTDNITVDTLTDIISDLLINNKPYFVSVFVNSDVEYLSR KLPQEVLSQHVDDDVMLANEVLKFIPESKSIDTKVFAGALRAAFLTILNEKTIGTDIYNE VFKFIVRGIVKQLLKD >gi|226332175|gb|ACIC01000145.1| GENE 18 10671 - 10961 172 96 aa, chain - ## HITS:1 COG:no KEGG:BT_1905 NR:ns ## KEGG: BT_1905 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 72 32 102 121 88 50.0 9e-17 MVAEYNNEKMACFVQDGQTVGTISYAPNFKPSEHGVLIHFNCEDIDRSIRQVMKQGGSIV IPRLKFKPMGNGILPYLLTLRAITLDFTLTNLCRKI >gi|226332175|gb|ACIC01000145.1| GENE 19 11210 - 11983 353 257 aa, chain - ## HITS:1 COG:slr1117 KEGG:ns NR:ns ## COG: slr1117 COG0500 # Protein_GI_number: 16329224 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Synechocystis # 15 256 9 252 253 187 40.0 2e-47 MNNNNNTIYGFDVNLICDFFLNTKRQGPGSPDVTLKALSFIDNLSDTAQIADIGCGTGGQ TMVLAQNVSGTITGLDFFSGFIDKFNLDALKLNLQNRVKGIVGSMDNLPFEKESLDLIWS EGAIYNIGFERGINEWRDYLKPNGYIAVSESSWFTEKRPAEIHDFWMQGYQEIDTIPVKV AQMQRAGYLPVASFVLPEKCWTEHYYQPLSEARKLMLKKYPHNETVKKIIEFQHLEEALY NKYKGFYGYVFYIGKKI >gi|226332175|gb|ACIC01000145.1| GENE 20 12483 - 12665 80 60 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPLPQLIEHIYFKRFKREKPETVKPLKQILKEMETKKKLQKEKKEERRKQRALSSDSAAE >gi|226332175|gb|ACIC01000145.1| GENE 21 12966 - 13262 285 98 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_3006 NR:ns ## KEGG: Fjoh_3006 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 2 98 31 127 127 147 83.0 1e-34 MKKRFFSAAFMLLATVGAFAQGNGIGGITEATNMVTSYFDPGTKLIYAIGAVVGLIGGVK VYSKFSSGDPDTSKTAASWFGACIFLIVAATILRSFFL >gi|226332175|gb|ACIC01000145.1| GENE 22 13264 - 13596 196 110 aa, chain + ## HITS:1 COG:no KEGG:BF0124 NR:ns ## KEGG: BF0124 # Name: 
not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 107 1 107 110 158 71.0 6e-38 MAKYPVNKGIGRSPEFKGLKSQYLFIFAGGLLALFVVFVIMYMVGIDQWVCIGFGVVSAS VLVWATFRMNAKYGEWGLMKLHALRSHPRYIINRRKFLRLISPTAKKGKV >gi|226332175|gb|ACIC01000145.1| GENE 23 13593 - 15966 2013 791 aa, chain + ## HITS:1 COG:XFa0007 KEGG:ns NR:ns ## COG: XFa0007 COG3451 # Protein_GI_number: 10956718 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Xylella fastidiosa 9a5c # 262 754 254 751 815 67 23.0 9e-11 MRNVMKTATLESKFPLLSVEHGCIVSKDADITVAYKVELPELFTLTKAEYEAVHSTWAKA VKVLPNYSIVHKQDFFIEESYRPDICKDDLSFLSRSFERHFNERPYLQHTCYLFLTKTTK ERNRTTSSFNALTRNFIIPKEMTDRETITRFMESCEQFERIVNDSGLLRMTRLTDEEITG TDTSAGIIEKYFSLSQEDTACLQDLSLGAGEMKIGDNYLCLHTLSDPEDLPSNVATDCRY ERLSTDRSDCRLSFAAPIGILLTCNHVVNQYLFIDDSAESLRKFEQTARNMHSLSRYSRS NQINREWIEEYLNEAHSKGLTSIRAHCNVMAWSDDREKLKRIKNDVGSQLALMEAKPRHN TVDVPTLFWAAIPGNAGDFPFEESFHTFIEQALCLFIGETSYKDSLSPFGIRMVDRLTGK PVHLDISDLPMKNGTITNRNKFILGPSGSGKSFFTNHMVRQYYEQGTHVLLVDTGNSYQG LCNLIHARTHGEDGIYFTYKEEDPIAFNPFYVEDGVFDIEKKESIKTLILTLWKRDDEPP TRAEEVALSNAVNLFLANIRKDRSIKPSFNSFYEFIRDEYQDILKEKRTREKDFDVFNFL NVLEPYYRGGEYDYLLNSDKQLDLLGKRFIVFELDNIKDNKVLFPIVTIIIMETFINKMR KLKGVRKMILLEEAWKAISKEGMAEYLKYLFKTVRKFFGEAVVVTQEVEDIISSPIVKST IINNSDCKILLDQRKYMNKFDEIQALLGLTDKERAQILSINMANNPSRKYKEVWIGLGGT QSAVYATEVSL Prediction of potential genes in microbial genomes Time: Thu May 12 03:22:52 2011 Seq name: gi|226332174|gb|ACIC01000146.1| Bacteroides sp. 1_1_6 cont1.146, whole genome shotgun sequence Length of sequence - 7886 bp Number of predicted genes - 11, with homology - 10 Number of transcription units - 1, operones - 1 average op.length - 11.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 115 102 ## 2 1 Op 2 . + CDS 136 - 756 376 ## gi|253571794|ref|ZP_04849200.1| conserved hypothetical protein 3 1 Op 3 . + CDS 798 - 1427 797 ## BF0121 hypothetical protein 4 1 Op 4 . 
+ CDS 1447 - 2451 896 ## BF0120 hypothetical protein 5 1 Op 5 . + CDS 2479 - 3102 604 ## BT_0089 conjugate transposon protein 6 1 Op 6 . + CDS 3123 - 3419 292 ## gi|253571798|ref|ZP_04849204.1| conserved hypothetical protein 7 1 Op 7 . + CDS 3412 - 4710 715 ## PGN_0060 conserved protein found in conjugate transposon TraM 8 1 Op 8 . + CDS 4731 - 5711 807 ## BF0116 hypothetical protein 9 1 Op 9 . + CDS 5714 - 6286 599 ## BF0115 TraO 10 1 Op 10 . + CDS 6300 - 7175 719 ## PGN_0057 probable conserved protein found in conjugate transposon TraP 11 1 Op 11 . + CDS 7201 - 7752 406 ## PGN_0056 probable conserved protein found in conjugate transposon Predicted protein(s) >gi|226332174|gb|ACIC01000146.1| GENE 1 2 - 115 102 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no TEETEKLELFRLSEKLGGNIELAIKQLSESKREKNIL >gi|226332174|gb|ACIC01000146.1| GENE 2 136 - 756 376 206 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253571794|ref|ZP_04849200.1| ## NR: gi|253571794|ref|ZP_04849200.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 206 1 206 206 386 100.0 1e-106 MYSDLESDQRKREEVISSLYWSLMQNWDIPKSIYDHYGFTEDYRLFHQLEELEPAEYKRK RETGEVPDILEVDARLTRTVEKVFESLCGKPPAPYLDKMNEELEKLGQIAALPDSVHDIL HITPAFLVKYGIDKNASATERSCQAEKAYRALDARFVKMTGRRPYADELFASLRQRKEKT PEAKRPKQVHKPILRNSPSKGRKMGL >gi|226332174|gb|ACIC01000146.1| GENE 3 798 - 1427 797 209 aa, chain + ## HITS:1 COG:no KEGG:BF0121 NR:ns ## KEGG: BF0121 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 209 1 209 209 293 70.0 2e-78 MKKKILMLCMGCFFISLTAKAQWVVSDPGNLAQGIINAAKNIVHTSSTASNMLNNFQETV KIYKQGKEYYDGLRKIKNLVRDARKVQQTILMVGDITDIYVNSFERMLSDPYFTPEELSA IAIGYTKLLEESAHLLNDLKTVVNENGLSMNDKERMDIIDRCYTDMLQYRSLVQYYTNKN IGVSYLRAKKRNDLDRVMALYGSPNERYW >gi|226332174|gb|ACIC01000146.1| GENE 4 1447 - 2451 896 334 aa, chain + ## HITS:1 COG:no KEGG:BF0120 NR:ns ## KEGG: BF0120 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 6 316 1 313 334 467 70.0 1e-130 MVLLAINFDNLHQILQNLYTDMMPLCSQMTGVAKGLAGLGALFYVAYRVWQSLARAEPVD VFPLLRPFALGLCIMFFPTIVLGTLNSVMSPIVKGTHSILEAQTFDMNEYRAQKDRLEFE AMKRNPETAYLVDKESFDNKLDELGVMDAIEACGMYVDRAMYNMKKAVQNFFRELLELMF NAAALVIDTLRTFFLIVLSILGPISFALSCWDGFQASLSQWFVRYISIYLWLPVSDLFSS VLARIQVLMLRQDIEQLSDPNFIPDGSNGVYITFLIIGIIGYFTIPTVANWIIQAGGGAG NYGKNVTQTASKTGSVVAGAGGAAVGNIAGRLIK >gi|226332174|gb|ACIC01000146.1| GENE 5 2479 - 3102 604 207 aa, chain + ## HITS:1 COG:no KEGG:BT_0089 NR:ns ## KEGG: BT_0089 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 207 1 207 207 312 72.0 5e-84 MEFKSLKNIETSFRHLRLFGIVYLCACTLLVGYSVWKAYGFAEAQRQKIYVLDEGKSLML ALSQDLEQNRPVEAREHVRRFHELFFSLAPDKSAIEGNIQRAMFLSDRSAYAHYRDLAEQ GYYNRIISGNMSQRIGIDSVKCDFNSYPYEVVTYARLSIIREKSVTERSLVTRGRLLNST RSDNNPHGFILEAFRVVENRDIRVYDR >gi|226332174|gb|ACIC01000146.1| GENE 6 3123 - 3419 292 98 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253571798|ref|ZP_04849204.1| ## NR: gi|253571798|ref|ZP_04849204.1| conserved hypothetical protein [Bacteroides 
sp. 1_1_6] # 1 98 4 101 101 176 100.0 6e-43 MLVKIRDLAEEKLRAACAGLSPEKRVITIVVLTVLFALGNFYMIFRAIYDIGREDAKREV IEITPLDIPDFIQADTLTDSKIREMEEFFNQFNKEDNE >gi|226332174|gb|ACIC01000146.1| GENE 7 3412 - 4710 715 432 aa, chain + ## HITS:1 COG:no KEGG:PGN_0060 NR:ns ## KEGG: PGN_0060 # Name: traM # Def: conserved protein found in conjugate transposon TraM # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 10 432 22 455 455 328 45.0 2e-88 MSNDANKLRQRQEIRKYLVFAGMFLLFVGCMWLIFAPSKEERQREERNAGFNSELPDPRG AGIEADKIAAYEQADMKRRQEEKMRTLEDFSALADENRQNETSSPVVEIPQERESESASS YRGSGNRRNGAISSSTSAYNDINATLGSFYEAPREDPEKEALKAEVEQLKQAAAVQQPAQ MTYEEQVALLEKSYELAAKYTPGKDGAGTEKREETEPAANGRKARAVPVGQVSTPVVSSL PQPVSDSVLLARMAQTGYAGFHTAVGKTADGHTRNTIRACVHGDQTIRSGQSVRLRLLEP MRVGRYVLPRNSLVTGEGRIQGERLGIGIIQVEHDGIIIPVELAVYDNDGQEGIFIPGSM EANAAKEVAANLGQNLGTSISITNQSAGDQLLSELGRGAIQGVSQYISRKMREEKVHLKS GYTLMLYQNDNQ >gi|226332174|gb|ACIC01000146.1| GENE 8 4731 - 5711 807 326 aa, chain + ## HITS:1 COG:no KEGG:BF0116 NR:ns ## KEGG: BF0116 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 326 1 328 328 493 73.0 1e-138 MKKIFVMFALMTGAVSAFAQNAADSISAETGRVTMTRELYPGQEDGDLYHGLTRKLTFDR MVPPYGLEVTYDKTTHIIFPSAVRYVDLGSPNLVAGKADGAENVIRVKAVVRNFRDETNM SVITESGSFYTFNVKYADEPLLLNIEMKDFIHDGSKVNRPNNALDIYLKELGSESPKLVQ LINKSIHKENKRHVKHIGSKAFGIQYLLRGIYTHNGLLYFHTQVRNQSNVPFEVDFVTFK IVDKKVMKRTAIQEQIVFPLRAYNYATLVAGNKDERTMFTFDKFTIPAGKVLVVELNEKS GGRHQSFTVESEDIVRAKVINELKVK >gi|226332174|gb|ACIC01000146.1| GENE 9 5714 - 6286 599 190 aa, chain + ## HITS:1 COG:no KEGG:BF0115 NR:ns ## KEGG: BF0115 # Name: not_defined # Def: TraO # Organism: B.fragilis # Pathway: not_defined # 1 190 1 191 191 218 58.0 6e-56 MRKVIYMLVAVVSLALFSGQAYAQRCLPGMKGVELRGGFAEGSKSPLNHYAGFAVSGYTK KANRWIVGAEYLLKNYEYRNVSVPRAQFTAEGGYYLKFLSDPSKTLFLSIGGSALAGYET VNWGNKMLYDGSKLLAKDAFIYGGAITLELETYVTDRIVLLASVRERAMWGGSLSVFTMQ FGLGVKFIIN >gi|226332174|gb|ACIC01000146.1| GENE 10 6300 - 7175 719 291 aa, chain + ## HITS:1 COG:no KEGG:PGN_0057 NR:ns 
## KEGG: PGN_0057 # Name: traP # Def: probable conserved protein found in conjugate transposon TraP # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 1 282 1 275 284 308 55.0 1e-82 MNIEDIRKIPITDFLARMGHEPTARKGNEWWYSAPYREERTPSFRVNILKNVWQDFGIGR GGDIFSLAGEIIGSGDFKSQAKFISESLGGIVPEIVFRPKEKCFDTTPGEENCFVNVRIE PLNNKILLNYLKERGICSDVALPNCEEVRYTLHGKRYFSIGFRNISGGYELRSRLFKGSM SPKDISLIDNGSDTCNIFEGFIDYLSWMVLGLGCGDDYLVLNSVALLERSYGFLDKYDRV NCYLDRDEAGRRTLEALRKRYGNKIEDCSSLYKGFKDLNEYLQYWEGIIEQ >gi|226332174|gb|ACIC01000146.1| GENE 11 7201 - 7752 406 183 aa, chain + ## HITS:1 COG:no KEGG:PGN_0056 NR:ns ## KEGG: PGN_0056 # Name: not_defined # Def: probable conserved protein found in conjugate transposon # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 1 156 1 150 153 183 57.0 2e-45 MKKKNKMNAFAAIFSAMLATVVAIALVSCDNDLDVQQGYPFTVETMPVPKRIVKGETVEI RCELKREGRFSDARYTVRYFQPDGKGSLRMDDGMVLLPNDRYPLDREVFRLYYTSECEDQ QSIDIYFEDNSAPAQLFLLSFDFNNEKKDDAEDDTGGTGRIPRTGGELVKNPLVNIETVT VCE Prediction of potential genes in microbial genomes Time: Thu May 12 03:23:40 2011 Seq name: gi|226332173|gb|ACIC01000147.1| Bacteroides sp. 1_1_6 cont1.147, whole genome shotgun sequence Length of sequence - 901 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 773 658 ## COG1596 Periplasmic protein involved in polysaccharide export Predicted protein(s) >gi|226332173|gb|ACIC01000147.1| GENE 1 2 - 773 658 257 aa, chain - ## HITS:1 COG:aq_505 KEGG:ns NR:ns ## COG: aq_505 COG1596 # Protein_GI_number: 15605977 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Aquifex aeolicus # 2 177 172 351 725 94 34.0 2e-19 MGEVVQPGTYALSSFSTVFHALYRAGGVSDIGSLRNVQLVRNGKNIATIDVYEFIMKGNT QDDIRLQEGDVVIVPAYDVLVKISGKVKRPMRFEMKKDENLATLIKYAGGFEADAYTRSL RVVRQNGEEYEVNTVKDMDYSIYTMRNGDVVTAEAILNRFTNKLEIRGAVYRPGIYQLSG KLNTIRELVHEAQGLTGDAFLNRAVLYRQREDLTSEVVQIDIKSIMDGTSPNLALMKNDI LYIPSIHDLEDRGNVTV Prediction of potential genes in microbial genomes Time: Thu May 12 03:23:42 2011 Seq name: gi|226332172|gb|ACIC01000148.1| Bacteroides sp. 1_1_6 cont1.148, whole genome shotgun sequence Length of sequence - 3267 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 18 - 170 64 ## 2 2 Tu 1 . - CDS 341 - 1291 559 ## BT_1726 integrase - Prom 1398 - 1457 3.9 - Term 1378 - 1420 11.2 3 3 Op 1 . - CDS 1465 - 2121 862 ## COG0176 Transaldolase 4 3 Op 2 . 
- CDS 2189 - 3241 1042 ## COG1830 DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes Predicted protein(s) >gi|226332172|gb|ACIC01000148.1| GENE 1 18 - 170 64 50 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFIRYFLGNSQTKDKRNRELFSTKTSEEDWYSGNKAEVKVFFINWNKKKT >gi|226332172|gb|ACIC01000148.1| GENE 2 341 - 1291 559 316 aa, chain - ## HITS:1 COG:no KEGG:BT_1726 NR:ns ## KEGG: BT_1726 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 313 1 313 316 592 97.0 1e-168 MNKNGFSRCADIYIGRLREEGRYSTAHVYQNALLSFSKFCGVHSVSFRQVTRDRLRRYEQ HLYACGLKPNTISTYMRMLRSIYNRGVEAGSAPYVHRLFHEVYTGVDVRQKKALPVVALR RLLYEDPHSDRLRRTQAIAALMFQFCGMSFADLSHLEKSALDSNVLRYNRVKTKTPMSVE VLDSAQEMLEQLRNRQSPRPGCPDYLFGILQGDKKRKDEKAYREYQSALRRFNYCLKSLA KRLRLNFPVTSYTLRHSWATTAKYRGVPIEMISESLGHKSIKTTQIYLKGFELKERTEVN RMNLSYVKRCGSGLYV >gi|226332172|gb|ACIC01000148.1| GENE 3 1465 - 2121 862 218 aa, chain - ## HITS:1 COG:TM0295 KEGG:ns NR:ns ## COG: TM0295 COG0176 # Protein_GI_number: 15643064 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Thermotoga maritima # 1 215 1 211 218 241 51.0 5e-64 MKFFIDTANLEQIQEAYDLGVLDGVTTNPSLMAKEGIKGTENQREHYIKICKIVNADVSA EVIATDYEGMIREGEELAALNPHIVVKVPCIADGIKAIKYFTEKGIRTNCTLVFSVGQAL LAAKAGATYVSPFVGRLDDICEDGVGLVGDIVRMYRTYDYKTQVLAASIRNTKHIIECVE MGADVATCPLSAIKGLLNHPLTDSGLKKFLEDYKKVNG >gi|226332172|gb|ACIC01000148.1| GENE 4 2189 - 3241 1042 350 aa, chain - ## HITS:1 COG:ECs2900 KEGG:ns NR:ns ## COG: ECs2900 COG1830 # Protein_GI_number: 15832154 # Func_class: G Carbohydrate transport and metabolism # Function: DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes # Organism: Escherichia coli O157:H7 # 1 350 25 374 374 485 66.0 1e-137 MSKVVDLLGDKTSYYLDHTCKTIDKSLIYIPSPDTIDKVWIDSDRNIKVLNSLQTLLGHG RLANTGYVSILPVDQDIEHTAGASFAPNPIYFDPENIVKLAIEGGCNAVASTFGILGSVA RKYAHKIPFVVKLNHNELLTYPNTYDQVLFGTVKEAWNMGAVAVGATIYFGSEQSRRQLV EIAEAFEYAHELGMATILWCYLRNSDFKKGAIDYHAAADLTGQADRLGVTIKADIVKQKL 
PTNNGGFKAIGFGKTDERMYTELTSEHPIDLCRYQVANGYMGRVGLINSGGESHGASDLR DAVITAVVNKRAGGMGLISGRKAFQKPMNKGVELLNAIQDVYLDPAITIA Prediction of potential genes in microbial genomes Time: Thu May 12 03:24:04 2011 Seq name: gi|226332171|gb|ACIC01000149.1| Bacteroides sp. 1_1_6 cont1.149, whole genome shotgun sequence Length of sequence - 51660 bp Number of predicted genes - 45, with homology - 45 Number of transcription units - 21, operones - 12 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 26 - 772 676 ## COG0588 Phosphoglycerate mutase 1 - Prom 810 - 869 7.8 + Prom 770 - 829 7.0 2 2 Op 1 . + CDS 916 - 1086 115 ## gi|253571811|ref|ZP_04849216.1| conserved hypothetical protein 3 2 Op 2 . + CDS 1017 - 2816 1317 ## COG0642 Signal transduction histidine kinase 4 3 Op 1 . - CDS 2904 - 4358 165 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 5 3 Op 2 . - CDS 4395 - 6401 1522 ## COG1523 Type II secretory pathway, pullulanase PulA and related glycosidases 6 3 Op 3 . - CDS 6485 - 7021 561 ## COG0817 Holliday junction resolvasome, endonuclease subunit 7 3 Op 4 . - CDS 7048 - 7350 200 ## BT_1665 hypothetical protein - Prom 7490 - 7549 79.9 + TRNA 7471 - 7547 67.2 # Ala GGC 0 0 - Term 7543 - 7578 1.7 8 4 Tu 1 . - CDS 7633 - 8412 440 ## COG0561 Predicted hydrolases of the HAD superfamily - Prom 8459 - 8518 6.7 - Term 8439 - 8480 2.6 9 5 Op 1 . - CDS 8523 - 8735 337 ## BT_1667 hypothetical protein 10 5 Op 2 . - CDS 8798 - 9850 671 ## BT_1668 hypothetical protein - Prom 9965 - 10024 5.2 + Prom 9825 - 9884 5.6 11 6 Tu 1 . + CDS 9972 - 10991 1169 ## COG0016 Phenylalanyl-tRNA synthetase alpha subunit 12 7 Op 1 1/0.000 + CDS 11094 - 12338 1063 ## COG0477 Permeases of the major facilitator superfamily 13 7 Op 2 . + CDS 12335 - 13012 562 ## COG0177 Predicted EndoIII-related endonuclease + Prom 13027 - 13086 5.3 14 7 Op 3 . 
+ CDS 13106 - 14365 1637 ## COG0126 3-phosphoglycerate kinase + Term 14395 - 14438 8.5 + Prom 14452 - 14511 2.8 15 8 Op 1 . + CDS 14571 - 15425 728 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components 16 8 Op 2 . + CDS 15437 - 16480 952 ## BT_1674 hypothetical protein 17 8 Op 3 . + CDS 16529 - 18733 1891 ## COG0457 FOG: TPR repeat + Term 18742 - 18785 8.9 - Term 18662 - 18703 6.2 18 9 Op 1 . - CDS 18728 - 19309 613 ## COG0424 Nucleotide-binding protein implicated in inhibition of septum formation 19 9 Op 2 . - CDS 19340 - 19861 491 ## COG1778 Low specificity phosphatase (HAD superfamily) 20 9 Op 3 . - CDS 19842 - 20636 668 ## BT_1678 hypothetical protein 21 9 Op 4 . - CDS 20644 - 20979 322 ## BT_1679 hypothetical protein 22 9 Op 5 . - CDS 20982 - 21518 523 ## COG0778 Nitroreductase - Prom 21730 - 21789 6.3 + Prom 21471 - 21530 2.4 23 10 Tu 1 . + CDS 21618 - 21803 153 ## gi|253571832|ref|ZP_04849237.1| conserved hypothetical protein + Term 21847 - 21888 2.2 - Term 21747 - 21781 1.0 24 11 Tu 1 . - CDS 21917 - 22459 361 ## COG0288 Carbonic anhydrase - Prom 22489 - 22548 5.9 - Term 22569 - 22614 11.0 25 12 Op 1 . - CDS 22646 - 24601 1592 ## BT_1682 hypothetical protein 26 12 Op 2 . - CDS 24613 - 27807 3115 ## BT_1683 hypothetical protein - Prom 27881 - 27940 7.3 + Prom 27969 - 28028 8.2 27 13 Tu 1 . + CDS 28056 - 29054 693 ## BT_1684 hypothetical protein + Prom 29061 - 29120 6.8 28 14 Op 1 2/0.000 + CDS 29228 - 29632 492 ## COG0346 Lactoylglutathione lyase and related lyases 29 14 Op 2 . + CDS 29663 - 31216 1673 ## COG4799 Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) 30 14 Op 3 . + CDS 31248 - 32168 792 ## BT_1687 hypothetical protein 31 14 Op 4 . + CDS 32193 - 32627 497 ## COG1038 Pyruvate carboxylase 32 14 Op 5 . + CDS 32629 - 33789 1373 ## COG1883 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit + Term 33801 - 33856 15.5 + Prom 33815 - 33874 4.1 33 15 Tu 1 . 
+ CDS 33964 - 35298 1530 ## BT_1690 TPR domain-containing protein + Term 35325 - 35363 5.5 - Term 35313 - 35351 9.3 34 16 Tu 1 . - CDS 35378 - 36382 1008 ## COG0191 Fructose/tagatose bisphosphate aldolase - Prom 36426 - 36485 5.7 + Prom 36581 - 36640 7.2 35 17 Tu 1 . + CDS 36783 - 37037 437 ## PROTEIN SUPPORTED gi|29347102|ref|NP_810605.1| 50S ribosomal protein L31 type B + Term 37057 - 37109 14.4 + Prom 37568 - 37627 6.4 36 18 Op 1 27/0.000 + CDS 37683 - 38735 926 ## COG0845 Membrane-fusion protein 37 18 Op 2 9/0.000 + CDS 38732 - 41770 3224 ## COG0841 Cation/multidrug efflux pump 38 18 Op 3 . + CDS 41813 - 43093 1188 ## COG1538 Outer membrane protein + Term 43129 - 43174 2.7 39 19 Op 1 4/0.000 - CDS 43209 - 44444 1440 ## COG1883 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit 40 19 Op 2 . - CDS 44444 - 46273 2043 ## COG5016 Pyruvate/oxaloacetate carboxyltransferase 41 19 Op 3 . - CDS 46302 - 46562 387 ## BT_1698 hypothetical protein - Prom 46743 - 46802 10.6 - Term 46707 - 46750 2.5 42 20 Op 1 . - CDS 46838 - 48400 1428 ## BT_1642 hypothetical protein - Prom 48466 - 48525 2.5 43 20 Op 2 . - CDS 48565 - 48714 206 ## gi|253571854|ref|ZP_04849259.1| predicted protein - Prom 48741 - 48800 7.8 + Prom 48684 - 48743 5.9 44 21 Op 1 . + CDS 48868 - 49494 402 ## BT_1702 hypothetical protein 45 21 Op 2 . 
+ CDS 49561 - 51399 1121 ## BT_1703 hypothetical protein + Term 51474 - 51516 -0.3 Predicted protein(s) >gi|226332171|gb|ACIC01000149.1| GENE 1 26 - 772 676 248 aa, chain - ## HITS:1 COG:STM0772 KEGG:ns NR:ns ## COG: STM0772 COG0588 # Protein_GI_number: 16764136 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglycerate mutase 1 # Organism: Salmonella typhimurium LT2 # 3 248 5 250 250 286 57.0 3e-77 MKKIVLLRHGESAWNKENRFTGWTDVDLTEKGVAEAEKAGVTLREYGFNFDKAYTSYLKR AVKTLNCVLDKMNLDWIPVEKSWRLNEKHYGDLQGLNKAETAEKYGEKQVLIWRRSYDIA PNPLSESDLRNPRFDFRYHEVPDAELPRTESLKDTIDRIMPYWESDIFPALRDAHTLLVV AHGNSLRGIIKHLKHISDEDIIKLNLPTAVPYVFEFDENLNVANDYFLGNPEEIRKLMEA VANQGKKK >gi|226332171|gb|ACIC01000149.1| GENE 2 916 - 1086 115 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253571811|ref|ZP_04849216.1| ## NR: gi|253571811|ref|ZP_04849216.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 56 1 56 56 94 100.0 1e-18 MNIQPPVKTSQTLETSKQKEYTFLQKLLNKANMGWWEADLKQKVMYVPSLSPDYSG >gi|226332171|gb|ACIC01000149.1| GENE 3 1017 - 2816 1317 599 aa, chain + ## HITS:1 COG:CC2501_1 KEGG:ns NR:ns ## COG: CC2501_1 COG0642 # Protein_GI_number: 16126740 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Caulobacter vibrioides # 235 594 124 486 538 168 32.0 2e-41 MVGSGLKTESYVCSEFISRLLGIDENGVISFEDFNKRILREDQHHTTSYSFDKLQQSIEE VYLLHTPHGEVWIRSKICFQETDREGNTKIYGIAETQDGPHMASAYQALQNSERILHNIY KNLPVGIELYNKDGYLLDLNKKELEMFHITHKEKVIGINIFENPALPEEIKLKIRDNEEV EFTFQYDFSKIKGYYESEKTSGFINLTTRITTLYDHDRQPINYLLINVDKTENTIAYNKI QEFKNFFELVGDYAKVGYAHFDALSRNGYALSSWYKNVGEEEGTPLPEIIGIHSHFHPED RAVMLDFLAKVVKGESTKLCRDVRIRRADGSYTWTRVNVLVRNYRPQDNVIEMLCINFDI TQLKETEQMLIKAKEKAEEADCLKSAFLANMSHEIRTPLNAIVGFSSMLEEAEDQEEKHQ YITIIEDNNKLLLQLISDILDLSKIEAETFDIIPERVNAKQLCNDLFQAIQMKTSPQVEL RLKDNLPELTFTSDKNRLYQVLLNFVTNALKFTSEGNITIDYQIDGNEVKFSVQDTGMGI EPEKQEAIFTRFVKLNSFIPGTGLGLSICQSIVTQLGGKIGVESEPGKGSCFWFTHPIN >gi|226332171|gb|ACIC01000149.1| GENE 4 2904 - 4358 165 484 aa, 
chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 266 478 1 209 245 68 27 9e-11 MIQNTFNMAGGVARNPLVRLAAPITATISAGEHIAIVGPNGGGKSLFVDTLIGKYPLREG TLQYDFSPSATQTVYDNVKYIAFRDTYGAADANYYYQQRWNAHDQDEAPDVREMLGEIKD EQLKQELFELFRIEPMLDKKIILLSSGELRKFQLAKTLLTAPRVLIMDNPFIGLDAPTRE LLFSLLDRLTKMSSVQIILVLSMLDDIPSFITHVIPVDKMMVYSKMEREAYLEAFRSRDV AVSFDELQQRIVDLPYGGNNYDSDEVVKLNKVSIRYDDRTILKELDWTVRRGEKWALSGE NGAGKSTLLSLVCADNPQSYACDISLFGRKRGTGESIWEIKKHIGYVSPEMHRAYLKNLP AIEIVASGLHDSIGLYKRPQPEQMAICEWWMDIFGIAELKDKPFLQLSSGEQRLALLARA FVKDPELLILDEPLHGLDTYNRRMVKKIIEAFCHRKDKTMIMVTHYESELPGTITDRIFL KRNR >gi|226332171|gb|ACIC01000149.1| GENE 5 4395 - 6401 1522 668 aa, chain - ## HITS:1 COG:TM1845 KEGG:ns NR:ns ## COG: TM1845 COG1523 # Protein_GI_number: 15644588 # Func_class: G Carbohydrate transport and metabolism # Function: Type II secretory pathway, pullulanase PulA and related glycosidases # Organism: Thermotoga maritima # 47 637 229 812 843 496 44.0 1e-140 MKMGTNYLAILGVTTVTTVMSCTTAKKEYMSYELYPVRTGSLIEMEYTPEATKFTLWSPT ADEVRLMLYEAGEGGHAYETVKMQSGEEGTWTAVVSKDLIGKFYTFNVKIDDKWQGDTPG INARAVGVNGKRAAIIDWQSTNPDGWESDTRPPLKSPADMIIYEMHHRDFSVDSTSGVKN KGKYLALTEHGTMNSDKLLTGIDHLIELGVTHVHLLPSFDYASVDETRLNENSYNWGYDP QNYNVPDGSYATDPYQPATRVKEFKQMVQALHKAGIRVIMDVVYNHTFNTDESNFERTVP GYFYRQKEDKTLANGSGCGNETASERLMMRKFMVESVLYWIKEYHVDGFRFDLMGIHDIE TMNEIRKAVNAVDPTICIYGEGWAAEAPQYPADSLAMKGNIAQIPGVAVFSDELRDGLCG PVGDKRKGAFLAGIPGGEMSVKFGIAGAIEHPQVQCDSVNYTQKPWAKQPVQMISYVSCH DGLCLVDRLKASMPDITPEQLIRLDKLAQTVVFTSQGIPFIYAGEEIMRDKQGVDNSYKS PDAVNAIDWRRKTTSADVFMYYKRLIDLRKSHPAFRMGDAGQVRKHLEFLPVEGSNLIAF RLKDHANGDHWEDIIVAFNSRPTPARLTIPVGKYTVVCKDGVIDVRGLGIQNGPEVIIPG QSALILYK >gi|226332171|gb|ACIC01000149.1| GENE 6 6485 - 7021 561 178 aa, chain - ## HITS:1 COG:VC1847 KEGG:ns NR:ns ## COG: VC1847 COG0817 # Protein_GI_number: 15641849 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, endonuclease subunit # Organism: Vibrio cholerae # 1 
150 5 150 173 107 40.0 8e-24 MGIDPGTTIMGYGVLRVKGTKPEMIAMGIIDLRKFANHYLKLRHIHERVLSIIESYLPDE LAIEAPFFGKNVQSMLKLGRAQGVAMAVALSRDIPITEYAPLKIKMAITGNGQASKEQVA DMLQRMLHFPKEEMPTFMDATDGLAAAYCHFLQMGRPAMEKGYSSWKDFIAKNPDKVK >gi|226332171|gb|ACIC01000149.1| GENE 7 7048 - 7350 200 100 aa, chain - ## HITS:1 COG:no KEGG:BT_1665 NR:ns ## KEGG: BT_1665 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 100 1 100 100 204 100.0 1e-51 MLIYNTTFQVDDAVHDNFLIWIKESYIPEVQKHGTLKAPRICRILSHRDDGSSYSLQWEV ESSGLLHRWHMEQGVRLNDELTKIFKDKVVGFPTLMEIVE >gi|226332171|gb|ACIC01000149.1| GENE 8 7633 - 8412 440 259 aa, chain - ## HITS:1 COG:lin1028 KEGG:ns NR:ns ## COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 1 259 1 256 256 122 31.0 6e-28 MIKVLLLDVDGTLLSFETHKVSQSSIDALKKVHDSGIKIVIATGRAASDLHEIDDAVPYD GVIALNGAECVLRDGSVIRKVAIPAQDFRKSMELAREFDFAVALELNEGVFVNRLTPTVE QIAGIVEHPVPPVVDIEEMFERKECCQLCFYFDEETEQKVMPLLSGLSATRWHPLFADVN VAGTSKATGLSLFADYYRVKVSEIMACGDGGNDIPMLKAAGIGVAMGNASEKVQSVADFV TDTVDNNGLYKALKHFGVI >gi|226332171|gb|ACIC01000149.1| GENE 9 8523 - 8735 337 70 aa, chain - ## HITS:1 COG:no KEGG:BT_1667 NR:ns ## KEGG: BT_1667 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 70 1 70 70 91 100.0 1e-17 MDDMINRHDSIAEENIEPNGRPAKNEFEEWSTEVTDRADNVFKGDTKDGPIKDREKRIKE MDEVIKKDLE >gi|226332171|gb|ACIC01000149.1| GENE 10 8798 - 9850 671 350 aa, chain - ## HITS:1 COG:no KEGG:BT_1668 NR:ns ## KEGG: BT_1668 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 350 1 350 350 639 96.0 0 MKKIVISLLLLTLSFRLSAQIDYLEPVKPFTTYTGELGEYYRNVFSLLNTGFQQRPYARF VAIPSFSPEYAMSVEKKNGRCLLIANTLSRTYWQAEKGTVKVETKSVEISQSLYQSLGAI ARLVTSQIQDLDGSTAGLDGVVYYFSSTDAKGKEMMGRKWSPMKGTLMERLVLVCQSAYI LSQGENILEQALAEEATALLKELENRTKEQPDAYKRPMYVGIYSVGPKLKTHSGKQIEEL 
PHLTDVCVQEYAAGQMVYPAELLKNNVSGYALCEFTIDKEGVILRPHILKSTHPEFAEEA LRIVKEMPNWTPALVGGKAVESDYTLYVPFRPQLYKEQLQIRERELSKKH >gi|226332171|gb|ACIC01000149.1| GENE 11 9972 - 10991 1169 339 aa, chain + ## HITS:1 COG:BS_pheS KEGG:ns NR:ns ## COG: BS_pheS COG0016 # Protein_GI_number: 16079916 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase alpha subunit # Organism: Bacillus subtilis # 1 338 1 344 344 320 45.0 4e-87 MIAKIEQLLKEVEALHASNAEELEALRIKYLSKKGAINDLMADFRNVAAEQKKEVGMRLN ELKTKALDKINALKEQFESQDNSCDGLDLTRSAYPIELGTRHPITIVKNEVIDIFARLGF SIAEGPEIEDDWHVFSALNFAEDHPARDMQDTFFIEAHPDVVLRTHTSSVQTRVMETSKP PIRIICPGRVYRNEAISYRAHCFFHQVEALYVDKNVSFTDLKQVLLLFAKEMFGTDTKIR LRPSYFPFTEPSAEMDISCNICGGKGCPFCKHTGWVEILGCGMVDPNVLESNGIDSKVYS GYALGMGIERITNLKYQVKDLRMFSENDTRFLKEFEAAY >gi|226332171|gb|ACIC01000149.1| GENE 12 11094 - 12338 1063 414 aa, chain + ## HITS:1 COG:STM2280 KEGG:ns NR:ns ## COG: STM2280 COG0477 # Protein_GI_number: 16765607 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Salmonella typhimurium LT2 # 18 404 2 381 396 138 25.0 2e-32 MAYHRSDQQIERSLIMAKDRLITSGYCFILAANFLLYFGFWLLIPVLPFYLSEFFQAGNS TIGIVLSCYTVAALCIRPFSGYLLDTFARKPLYLFAYFIFMLMFAGYLIAGSLTLFIIFR IIHGVSFGMVTVGGNTVVIDIMPSSRRGEGLGYYGLTNNIAMSIGPMFGLFLHDGGVSFA TIFCYALGSCMLGFLSASLVKTPYKPPVKREPISLDRFILLKGMPAGLSLLLLSIPYGMT TNYVAMYAREIGIHTQTGFFFTFMAIGMAISRIFSGKLVDRGKITQVIAAGLNLVIISFF LLASCVYLIQWNDAACTILFFGIALLMGIGFGIMFPAFNTLFVNLAPNNQRGTATSTYLT SWDVGIGIGMLTGGYIAEICSFDKAYLFGACLTVVSAVYFKLKVTPHYHKNKLR >gi|226332171|gb|ACIC01000149.1| GENE 13 12335 - 13012 562 225 aa, chain + ## HITS:1 COG:RP746 KEGG:ns NR:ns ## COG: RP746 COG0177 # Protein_GI_number: 15604580 # Func_class: L Replication, recombination and repair # Function: Predicted EndoIII-related endonuclease # Organism: Rickettsia prowazekii # 9 
213 8 210 212 196 46.0 4e-50 MRKKERYEKVIAWFQANVPVAETELHYNNPYELLIAVILSAQCTDKRVNMITPPLYKDFP TPEALAASTPEVIFEYIRSVSYPNNKAKHLVGMAKMLVNDFNSKVPDNMDDLIKLPGVGR KTANVIQSVVFNKAAMAVDTHVFRVSHRIGLVPDSCTTPFSVEKELVKNIPEKLIPIAHH WLILHGRYVCQARTPKCDTCGLQMMCKYFCNTYKVTKEAPKAKNK >gi|226332171|gb|ACIC01000149.1| GENE 14 13106 - 14365 1637 419 aa, chain + ## HITS:1 COG:all4131 KEGG:ns NR:ns ## COG: all4131 COG0126 # Protein_GI_number: 17231623 # Func_class: G Carbohydrate transport and metabolism # Function: 3-phosphoglycerate kinase # Organism: Nostoc sp. PCC 7120 # 8 419 13 399 400 387 52.0 1e-107 MQTIDKFNFAGKKAFVRVDFNVPLDENFNITDDTRMRAALPTLKKILADGGSIIIGSHLG RPKGVADKFSLKHIIKHLSELLGVEVQFANDCMGEEAAVKAAALQPGEVLLLENLRFYAE EEGKPRGLAEDATDEEKAAAKKAVKESQKEFTKKLASYADCYVNDAFGTAHRAHASTALI AKYFDTDNKMFGYLMEKEVKAVDKVLNDIQRPFTAIMGGSKVSSKIEIIENLLNKVDNLI IAGGMTYTFTKAMGGKIGISICEDDKLDLALDLIKKAKEKGVNLVLAVDAKIADAFSNDA NTQFCAVDEIPDGWEGLDIGPKTEEIFANVIKESKTILWNGPTGVFEFENFTHGSRTVGE AIVEATKNGAFSLVGGGDSVACVNKFGLASGVSYVSTGGGALLEAIEGKVLPGIAAINE >gi|226332171|gb|ACIC01000149.1| GENE 15 14571 - 15425 728 284 aa, chain + ## HITS:1 COG:AF0088 KEGG:ns NR:ns ## COG: AF0088 COG0715 # Protein_GI_number: 11497708 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Archaeoglobus fulgidus # 1 253 29 282 300 117 31.0 3e-26 MLPTLDGLPFHIAKAQGIYDSLGLDLTILSFNSAYDRDAAFQFKSMDGMITDYPSAVTLQ AIHHTDLGIILKHDGYFCFIVSKESGINQLQELKEKNIAVSHNTIIEYATGQLLNKAGIS QAEVNKPEIAQLPLRLQMLQYDQIDASFLPDPAASIAMNARHRSLISTQELGIDFTVTAF SREAINEKRREIELLITGYNLGIDYIKMHPQKEWKQVLIEIGVPENLTGLIALPVYRKAE RPSADALDKAVTWLKTNNRISQTYSEYKNLIDTTFTKTNSTTIK >gi|226332171|gb|ACIC01000149.1| GENE 16 15437 - 16480 952 347 aa, chain + ## HITS:1 COG:no KEGG:BT_1674 NR:ns ## KEGG: BT_1674 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 347 1 347 347 693 98.0 0 MRKTLFRLSMIAFTLLGMQNTLTAQEKTPLNQVVNTLKERISLAGYAQLGYTYDDAANPD 
NTFDIKRIIFMAHGKITDRWTCDFMYDFYNGGMLLEVYTDYRILSGLTARIGEFKVPYTI ENELSPTTVELINCYSQSVCYLAGVSGSDKCYGMTSGRDIGMMIHGKLFHDLLQYKFAVM NGQGLNTKDKNSQKDVVGNLMVYPNKWLSVGGSFIRGTGHAIGDSQYSGIKTGENYAKKR WSLGGVVTTSAFNFRTEYLAGKDRNVKSEGFYATGSVRLLQNFDFIASFDYFNPNKAADF KQNNYIAGLQYWFYPRCRVQAQYTFCDKKGDGQKDSNLIQAQVQVRF >gi|226332171|gb|ACIC01000149.1| GENE 17 16529 - 18733 1891 734 aa, chain + ## HITS:1 COG:all1322_2 KEGG:ns NR:ns ## COG: all1322_2 COG0457 # Protein_GI_number: 17228817 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. PCC 7120 # 505 700 64 257 292 60 26.0 9e-09 MNEKTINEQYTYIRSLLEEKRLKEALMQLESLLWQCPDWDLRTRLEQLQTSYKYMLEYMR QGANDPERWNVYRKLVADTWEIADRSRLLMLDNASSRYYHEVRRTPRPESLSAYTLKKLL HMLESFNDDLAVSGLLSDEKMDEVLKRHEETLKYMFLQTWTNSAWTPEEEEDAQSMLTSE LLPVNDLCLFISAVTLSLMECFDLRKIMWLLDAYRHPDVNAGQRALVGVIFIFHIYRNRL SLYNDLVKRVDLMDEIPPFKEDVARIYRQMLLCQETEKIDKKMREEIIPEMLKNVSSMRN MRFGFEENEDENDDKNPDWADAFEQSGLGDKLREMNELQLEGADVYMSTFAALKSYPFFR EVQNWFYPFSKQQSDVIKQLKQEGNEKNTLLDLILQSGFFSNSDKYSLFFTIRQLPKAQQ DMMLSQLGEQQVAELSEKSSAETMKKFNERPGTVSNQYLHDLYRFFKLSVRRHEFRDIFK EKLDLHHIPALSNVLYSEDILFPIADFYLKKERWNEAIEVYEEMETIGALQGRGAEYYQK LGYALQKNKKYAEAIDAYLKADTLKPDNIWNNRHLAICYRLNRNYQAALSYYKKVEEATP EDTNVIFHIGSCLAELGQYEEAANYFFKLDFIESNCIKAWRGIGWCSFISRKYEQAMKYY EKIIEQKPLAIDYMNAGHVAWTMGDIQKATALYGKSITANGNRERFLEMFRKDKEALLKQ GIQEEDIPLMLDLL >gi|226332171|gb|ACIC01000149.1| GENE 18 18728 - 19309 613 193 aa, chain - ## HITS:1 COG:BS_maf KEGG:ns NR:ns ## COG: BS_maf COG0424 # Protein_GI_number: 16079857 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Nucleotide-binding protein implicated in inhibition of septum formation # Organism: Bacillus subtilis # 10 191 5 183 189 135 42.0 6e-32 MLGNLDKYQIILASNSPRRKELMSGLGVDYVVRTLPDVDESYPADLAGAAIPEYISREKA DAYRSIMKPGELLITADTIVWLDGKVLGKPEGREGAVEMLRSLSGKSHQVFTGVCLTTTE WQKSFTAASDVEFDVLSEEEIRYYVDKYQPMDKAGAYGVQEWIGYIGVKSISGSFYNIMG LPIQKLYGELKKL >gi|226332171|gb|ACIC01000149.1| GENE 19 19340 - 19861 491 173 
aa, chain - ## HITS:1 COG:FN0213 KEGG:ns NR:ns ## COG: FN0213 COG1778 # Protein_GI_number: 19703558 # Func_class: R General function prediction only # Function: Low specificity phosphatase (HAD superfamily) # Organism: Fusobacterium nucleatum # 8 165 1 158 168 116 35.0 2e-26 MSTINYDLSRIKALAFDVDGVLSSTTVPLHPSGEPMRTVNIKDGYAIQLAVKKGLHIAII TGGRTEAVRIRFAALGVKDLYMGSAVKIHDYRNFRDKYGLSDDEILYMGDDVPDIEVMRE CGLPCCPKDAVPEVKSVAKYISYADGGRGCGRDVVEQVLKAHGKWMAEDAFGW >gi|226332171|gb|ACIC01000149.1| GENE 20 19842 - 20636 668 264 aa, chain - ## HITS:1 COG:no KEGG:BT_1678 NR:ns ## KEGG: BT_1678 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 264 1 264 264 508 100.0 1e-143 MKRSIEDTPVVFLGAGNLATNLAKALYRKGFRIMQVYSRTEESARTLANEVEAEYITDLK DVSNEARLYIISLKDAAFVELLPQITDGKQHALLVHTAGSIPMNIWEGHAERYGVFYPMQ TFSKQREVDFREVPFFIEAKRPEDTELLKAVAGTLSDKVYEADSEQRRSLHLAAVFTCNF TNHMYALAAELLEKYHLPFDVMLPLIDETARKVHELAPRDAQTGPAVRYDENVMNKHLSM LADSQALQEIYKLMSKSIHEHHQL >gi|226332171|gb|ACIC01000149.1| GENE 21 20644 - 20979 322 111 aa, chain - ## HITS:1 COG:no KEGG:BT_1679 NR:ns ## KEGG: BT_1679 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 111 1 111 111 139 100.0 4e-32 MGAVKANNNKDTYKATAVAFIVIGTLFLINKLLPFASIGLPWVMNKDNLLLYASISFLIF KRDKSVGFVLLGLWLVMNIGLVMSLLGSLSGYLLPLALLIIGIILFWVSKR >gi|226332171|gb|ACIC01000149.1| GENE 22 20982 - 21518 523 178 aa, chain - ## HITS:1 COG:CAC3555 KEGG:ns NR:ns ## COG: CAC3555 COG0778 # Protein_GI_number: 15896791 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 6 175 3 172 174 142 39.0 4e-34 MENFSELIKNRRSMRKFTDEELTQDEVVALMKAALMSPSSKRSNSWQFVVVDDKEKLKEL SHCKEQASSFIADAALAIVVMADPLASDVWIEDASIASIMIQLQAEDLGLGSCWVQVRER FTATGMPSDEFVHGILDIPLQLQILSVIAIGHKGMERKPFNEEHLQWEKIHINKFGGK >gi|226332171|gb|ACIC01000149.1| GENE 23 21618 - 21803 153 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253571832|ref|ZP_04849237.1| ## NR: 
gi|253571832|ref|ZP_04849237.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 61 7 67 67 92 100.0 5e-18 MKKRITQDDYIKANRKASREAEIKEHGHPVCYKRVHESKKVYNRRKIKAADKKLPYFFHL G >gi|226332171|gb|ACIC01000149.1| GENE 24 21917 - 22459 361 180 aa, chain - ## HITS:1 COG:BS_ytiB KEGG:ns NR:ns ## COG: BS_ytiB COG0288 # Protein_GI_number: 16080121 # Func_class: P Inorganic ion transport and metabolism # Function: Carbonic anhydrase # Organism: Bacillus subtilis # 1 180 3 182 187 206 51.0 2e-53 MIEEMLAYNREFVKNEGYKEYITNKYPDKKIAILSCMDTRLTALLPAALGIKNGDVKMIK NAGGVISHPFGSVIRSLLVAIFELGVEEIMVIAHSDCGACHMHSEEMLEKMKARGINADY IHMMSFCGVDFHSWLDGFEDTEKSVRGTVDFIVHHPLIPSDVKVHGFIIDSTTGELTRIV >gi|226332171|gb|ACIC01000149.1| GENE 25 22646 - 24601 1592 651 aa, chain - ## HITS:1 COG:no KEGG:BT_1682 NR:ns ## KEGG: BT_1682 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 651 1 651 651 1251 100.0 0 MKKIYSLVLLVAGVFSWGSCSLDEVNPNNSGQFSDGSYTLETYKAFTNSCYTALINNIYQ SSDYLVCTEAGTDIWEEPKSGSSYQRYHSYNGMKTDDSYYYKVWQYSYLCINSCNIVIDG AENVTGDETEINECVAQARCLRAYYYMNLVEQFGNVDLQLKAADSENISFDAHRSTVPEI YAAIIEDLKFAVENLPVSFSDYYSRVTKKSAMGLLARAYINGAGYDLKDTDGVSFLEKAY DTATTMINNKAIYEWYMHPAFADVFNENNNRNNEEALFIAAGAERNSDAYTNGNYSQSEM FRHFLPSLGTYTDLGLVDKTSNFVYGRPNSNIFLPSKYLMDCFAADMNDSRFRYSFISAY SSYSIPAWGATYEYGGSACAKEITSTLATKFGIPASNIGKKVYPHFNLESNSTADANYCQ LAIWNADGTAKTTQDKTDGNILHPAMPLDPAEAHQYAVYCSLKTLTEEEKAQYPGLVLNV FDLYDENGTARATYDKPSAASALWLSIYPALSKFNMPGNKFVGRDVQRKTVDVMIMRMAE VYLIAAEASVRLNKGDAAKYINVLRTRAGAGTVSESQVDLNYIFDEYARELCGECARWYL LKRNAAFETRLAAYNKRAAEHFKKEFYLRPIPTKYLDAINNPAEFGQNAGY >gi|226332171|gb|ACIC01000149.1| GENE 26 24613 - 27807 3115 1064 aa, chain - ## HITS:1 COG:no KEGG:BT_1683 NR:ns ## KEGG: BT_1683 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1064 1 1064 1064 2045 100.0 0 MRENAFVEAWKTQHRRVSMILGLTLLCAPSAISVYAENGVNDITVQAVMQTKTVKGTVLD ENGEPLIGVSIVVKGTSTGTITDFDGKFSINLPAGSKELVVSYIGYKEQAIIVSGNAPLN 
IKMVPDTQALDEVVVIGYGTVKKRDLTGAVASVKSEDITMNPGVNPMDALQGKIAGLDIT NSSGQAGSSPTVQLRGTRSLQLDEDGNISSDAFKPTYIIDGMPGDIATLNPNDIESIEVL KDASSTAIYGSSGANGVIIVTTKGSKSGKPVINFNAYAGTTMGARVPKMRTGDSYINYIK DSYKPVGTTNEEEIFGERYDAIKNNQWVNWGDEVLHSGLKQNYSISVAGGTEKTKAYFSL NYTDEKSMYENDDYKVYSTRVRIDQEINKWMNAGINVQASFSNKNSRNAVLERALFATPL GTPYNEDGSITEFPIPGSSSDPNPLADEQDGVYKNNTKAGRAYVDAYLEWKPVKGLSVKS QLGGSYSQSRAGKFMGEGSYNVLKGSTVAYGEATNKTGYNYKWENIVTYHNTFNKDHDLT VTGVTSWNYNQSEEYYVYGENPATNDMLWYALQNADNKKLNSKYLMSKGMGFIGRVNYSY KGKYLFSASSRYEADSRLSKDNRWNLFPAVSVGWRISDEAFMAGTKSWLDNLKLRVGYGV AGTTAGIAEYSSMASLENVTTSLGGMTVPSAKFTEYITNRNLTWEKTHDLNIGIDASFLN NRIDLTVDLYRTKTTDVILSQALPTSIMGNYTGSTTYKMNKNAAETENRGIELALNTRNI VKKDFTWSSTVTFTANKEKIKSLIEGQDMMFNAKKGDMVFKVGEGVGSFYRYKVLGIWQY SEKETAALFGCEPGDIKLDLPSVKKDGNGYYYMNPEGERVDITAENPYGVRADYDRQVIG RNTPDWTLGFKNNFRYKDFDLSVFLYARWGQMMNYGSVIGKYSPNPDYNIPTYFTYYDKT IEADQDVLFYAIDKSKDRSAYEGYDSMYYVDGSFFKLKNVTLGYTLPGKLTKRFGISNLR VYATMTNLFTYSPNKYVKNYDPEMNGSINFPLSRDCIFGLNLTF >gi|226332171|gb|ACIC01000149.1| GENE 27 28056 - 29054 693 332 aa, chain + ## HITS:1 COG:no KEGG:BT_1684 NR:ns ## KEGG: BT_1684 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 332 1 332 332 674 100.0 0 MPLKLTTYYRGKDIPDLPGTNTFHSKELFQIYESTPGYTPLLIVATEEGRPVARLLAAIR KTKKWLPSSLVKQCVVYGAGEFLDTSLPASKEKEEEIFGEMLEHLTQEASRTCILIEFRN LDNSMFGYRFFRKNDFFPVNWLRVRNSLHSTQKAEDRFSPSRIRQIRKGLKNGAKVVEAH TVEEIKEFSRMLHKVYSSRIRRYFPANDFFRHMNSMLIRGKQAKIFIVKYKEKIIGGSVC IYSGENAFIWFSGGMRKTYALQSPGILAVWKALEDAHERGFRHMEFMDVGLPFRKHGYRD FVLRFGGKQSSTRRWFRISWSWLNNLLVKFYV >gi|226332171|gb|ACIC01000149.1| GENE 28 29228 - 29632 492 134 aa, chain + ## HITS:1 COG:PH0272 KEGG:ns NR:ns ## COG: PH0272 COG0346 # Protein_GI_number: 14590197 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Pyrococcus horikoshii # 6 133 8 133 136 122 55.0 2e-28 MKISHIEHLGIAVKSIEEALPYYENVLGLKCYNIETVEDQKVRTAFLKVGETKIELLEPT 
CPESTIAKFIENKGAGVHHVAFAVEDGVANALAEAESKEIRLIDKAPRKGAEGLNIAFLH PKSTLGVLTELCEH >gi|226332171|gb|ACIC01000149.1| GENE 29 29663 - 31216 1673 517 aa, chain + ## HITS:1 COG:RC0960 KEGG:ns NR:ns ## COG: RC0960 COG4799 # Protein_GI_number: 15892883 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) # Organism: Rickettsia conorii # 11 517 12 514 514 655 62.0 0 MSNQLEKIKELIERRAVARIGGGEKAIAKQHEKGKYTARERIAMLLDEGSFEEMDMFVEH RCTNFGMEKKHYPGDGVVTGCGTIEGRLVYLFAQDFTVTAGSLSETMSLKICKIMDQAMK MGAPCIGINDSGGARIQEGINALAGYAEIFQRNILASGVIPQISGIFGPCAGGAVYSPAL TDFTLMMEGTSYMFLTGPKVVKTVTGEDVSQENLGGASVHSTKSGVTHFTAKTEEEGLAL IRTLLSYIPQNNLEEAPYVDCTDPIDRLEDSLNDIIPDSPNKPYDMYEVISAIVDNGEFL EIQKDYAKNIIIGFARFNGQSVGIVANQPKFLAGVLDSNASRKGARFVRFCDAFNIPIVS LVDVPGFLPGTGQEYNGVILHGAKLLYAYGEATVPKVTITLRKSYGGSHIVMSCKQLRGD MNYAWPTAEIAVMGGAGAVEVLYAREAKDQENPAQFLAEKEAEYTKLFANPYNAAKYGYI DDVIEPRNTRFRVIRALQQLQTKKLSNPAKKHGNIPL >gi|226332171|gb|ACIC01000149.1| GENE 30 31248 - 32168 792 306 aa, chain + ## HITS:1 COG:no KEGG:BT_1687 NR:ns ## KEGG: BT_1687 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 306 1 306 306 592 100.0 1e-168 MNKTKIGIFLSLLLLIGLTSCGEQKSNNKLLLNEVLIDNQNNFQDDYGLHSAWIEIFNKS YGSADLAACLLKVSSQPGDTVTYFIPKGDILTLVKPRQHALFWADGEPNRGTFHTSFKLN PETANWVGLFDSGRKLLDQVVIPAGTLGPNQSYARISDGAADWEVKGGSKDKYVTPSTNN KTLDSNAKMEKFEEHDSVGIGMSISAMSVVFCGLILLFIAFKVVGKVAVNLSKRNAMKAK GIDKVEAKELSQAPGEVYAAISMALHEMQDEVHDVEETVLTITRVKRSYSPWSSKIYTLR ENPNRK >gi|226332171|gb|ACIC01000149.1| GENE 31 32193 - 32627 497 144 aa, chain + ## HITS:1 COG:SA0963 KEGG:ns NR:ns ## COG: SA0963 COG1038 # Protein_GI_number: 15926699 # Func_class: C Energy production and conversion # Function: Pyruvate carboxylase # Organism: Staphylococcus aureus N315 # 62 144 1064 1146 1150 68 42.0 4e-12 MKEYKYKINGNSYKVTIGDIEDNIAHVEVNGTHYKVEMEKQPKTAPKPAVVRPMPNSPAA PTTPVVKPAAPSTGKSGVKSPLPGVILDIKVNVGDTVKRGQTIIILEAMKMENNINADKD 
GKVTAINVNKGDSVLEGNDLVIIE >gi|226332171|gb|ACIC01000149.1| GENE 32 32629 - 33789 1373 386 aa, chain + ## HITS:1 COG:TM0880 KEGG:ns NR:ns ## COG: TM0880 COG1883 # Protein_GI_number: 15643642 # Func_class: C Energy production and conversion # Function: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit # Organism: Thermotoga maritima # 12 385 17 383 384 354 54.0 1e-97 MGDFINFLGNNLADFWTYTGFANATGGHIAMIIIGLGFIYLAVAKEFEPMLLIPIGFGIL IGNIPFNMDAGLKVGIYEEGSVLNILYQGVTSGWYPPLIFLGIGAMTDFSALISNPKLML IGAAAQFGIFGAYMIALEMGFDPMQAGAIGIIGGADGPTAIFLSSKLAPNLMGAIAVSAY SYMALVPVIQPPIMRLLTTKHERVIRMKPPRAVSHTEKVIFPIIGLLLTCFLVPSGLPLL GMLFFGNLLKESGVTRRLANTASGPLIDTITILLGLTVGASTQASEFLTLDSIKIFALGA LSFIIATASGVIFVKIFNLFLKKGNKINPLIGNAGVSAVPDSARISQVVGLEYDPTNYLL MHAMGPNVAGVIGSAVAAGILLGFLM >gi|226332171|gb|ACIC01000149.1| GENE 33 33964 - 35298 1530 444 aa, chain + ## HITS:1 COG:no KEGG:BT_1690 NR:ns ## KEGG: BT_1690 # Name: not_defined # Def: TPR domain-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 444 1 444 444 810 99.0 0 MKRIILLAFLCTTTLVMAQRINHVQEAIANYDYEAALTLIAKEKPTIPLLLQKGKAQRGL GMNTEALSTYQEIITNDTANTRAYIEAAECCRSLAKYQQALKYYEQALDLNPENKYVRIQ YIGLLLSLQKFQDALGESSLMTEKDSSAIALHLQAQSFEGMGELLPATGCYYNIQEKYPD DYLAAAKLAALNIAGSYFNEAIEATEKYRQIDTTNIAVNRQNALAYCLNKDYPTAIQRYE YLVSQGDSSFHTCYYLGISYYAEEKYYEAHDFLEAARKHDPENVNLLYYLGRACAKTSWK KLGVEYLEKALDLTIPKDSNMVRLYIGMTDCYKMAQMPKEQIESIRERYRKYDKQNHKLL YDMAFIYFYSLKDKKNTERCLEAFLKTRPKEDKEEEAKLNERGELVLGTKNYYNAAANWL KDIQSKQKIEDFFLGGNVPAQKPE >gi|226332171|gb|ACIC01000149.1| GENE 34 35378 - 36382 1008 334 aa, chain - ## HITS:1 COG:TP0662 KEGG:ns NR:ns ## COG: TP0662 COG0191 # Protein_GI_number: 15639649 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Treponema pallidum # 1 328 1 328 332 450 69.0 1e-126 MVNYKDLGLVNTRDMFAKAIKGGYAIPAFNFNNMEQMQAIIKAAVETKSPVILQVSKGAR QYANATLLRYMAQGAVEYAKELGCAHPEIVLHLDHGDTFETCKSCIDSGFSSVMIDGSHL 
PYEENVALTKKVVEYAHQFDVTVEGELGVLAGVEDEVSSDHHTYTDPEEVIDFATRTGCD SLAISIGTSHGAYKFTPEQCHIDPATGRMVPPPLAFEVLDAVMEKLPGFPIVLHGSSSVP EEEVETINKFGGALKAAIGIPEEELRKAAKSAVCKINIDSDSRLAMTAAIRKTFAEKPAE FDPRKYLGPARDNMEKLYKHKILNVLGSDNKLAE >gi|226332171|gb|ACIC01000149.1| GENE 35 36783 - 37037 437 84 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29347102|ref|NP_810605.1| 50S ribosomal protein L31 type B [Bacteroides thetaiotaomicron VPI-5482] # 1 84 1 84 84 172 100 3e-42 MKKGLHPESYRPVVFKDMSNGDMFLSKSTVATKETIEFEGETYPLLKIEISNTSHPFYTG KSTLVDTAGRVDKFMSRYGDRKKK >gi|226332171|gb|ACIC01000149.1| GENE 36 37683 - 38735 926 350 aa, chain + ## HITS:1 COG:VC1756 KEGG:ns NR:ns ## COG: VC1756 COG0845 # Protein_GI_number: 15641760 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 10 346 22 355 364 154 30.0 2e-37 MKNIYVIALAAILFQGCGQKKEMTAPATRPVKTTIVESRSVIRKDFSGIVEAVEYVKLAF RVSGQIISLPVIEGEKVKKGQLIAAIDPRDIALQYAATKSAYETASAQVERNKRLLSRQA ISVQEYEISLSNYQKAKSEYELSSNNMRDTKLTAPFDGSIEKRLVENYQRVNSGEGIVQL VNTQNLRIKFTIPDAYLYLLRAKDPRFLVEFDTFKGHVFKARLEEYLDISTEGTGIPVSI TIDDPSFDRDLYAVKPGFTCSIRFTADVGPLVQDSWTIIPLSAVFGESDGNNMYVWVVED NKVHKRKIVVNAPTGEAQVLVSEGLKPGEQIVIAGVYQLVEGESIKSIDK >gi|226332171|gb|ACIC01000149.1| GENE 37 38732 - 41770 3224 1012 aa, chain + ## HITS:1 COG:VC1757 KEGG:ns NR:ns ## COG: VC1757 COG0841 # Protein_GI_number: 15641761 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Vibrio cholerae # 1 1009 1 1012 1016 593 33.0 1e-169 MNLAKYSLDNTKIIYFFLAVLLIGGITSFGKLGKKEDAPFVIKSAVIMTRYPGAEPAEVE RLITEPISREIQSMSGVYKIKSESMYGLSKITFELQPSLSASSIPQKWDELRRKVLNIQP QLPSGASTPTVSDDFGDVFGIYYGLTADDGFTYEEMRNWAERIKTQVVTADGVMKVALFG VQTEVVNIFVSTNKLVGMGIDPKQLANLLQSQNQIINTGEIRAGVQQLRVTANGMYANID DIRNQVITTKAGQVKLGDIAVIEKGYLDPPSNIMHVNGKRAIGIGVSTDPQRDVVQTGEN VKVKLNELLPLMPVGLELQSLYLENEIANEANNGFIINLIESILIVIVIIMLVMGLRAGL LIGSSLIFSIGGTLLIMSFFGVGLNRTSLAGFIIAMGMLVDNAIVVTDNAQIAIARGVNR 
RKALIDGATGPQWGLLGATFIAICSFLPLYLAPSAVAEIVKPLFVVLAISLGLSWVLALT QTTVFGNFILKAKTGDSTKDPYDKPFYHKFAFVLGVLIRKKAVTLVSMVILFIISLIIMG TMPQNFFPSLDKPYFRADVFYPDGYSINDVVKEMKSVEEHLAKQPEVKKVSITFGSTPLR YYLASTSVGPKPNFANVLVELTDSKYTKEYEEDFDGYMKANYPSAITRTSLFKLSPAVDA AIEIGFIGPNVDTLVSLTNQAIAIMHQNPDLINIRNSWGNKIPVWKPVYSPERAQPLGVS RQGMAQSIQIGTTGMTLGEYRQGDQVLPILLKNNQLDSFRINDLRTLPVFGTGNETTSLE QVVSEFDFQYRFSNVKDYNRQMVMMAQCDPRRGTNAISAFNQVWSQVQQEIKIPEGYTMK YFGEQESQVESNEALAKNLPLTFFLMFVTLLFLFKTYRKPTVILLMLPLIFIGIVLGLLV LGKSFDFFSILGLLGLIGMNIKNAIVLVDQIDLETAAGKKPLDAVISATTSRIIPVAMAS GTTILGMLPLLFDAMFGGMAATIMGGLLVASALTLFVLPVAYCAIHRIKGEQ >gi|226332171|gb|ACIC01000149.1| GENE 38 41813 - 43093 1188 426 aa, chain + ## HITS:1 COG:RSc0009 KEGG:ns NR:ns ## COG: RSc0009 COG1538 # Protein_GI_number: 17544728 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Ralstonia solanacearum # 140 412 190 465 514 60 25.0 5e-09 MKEMKTQLILFSLLLLGSTAWAQQPCLSQDAYREKVEAYSQILKQQKLKTLASSEARKIA HTGFLPKIDVNADGTLNMSDLSAWNEPVGEYRNHTYQGVFVVSQPLYTGGALNAQHQIAK ADEKLNQLNEELTLDQIHYQSDAVYWNASASKAMLQAADKYQSIVKQQYDIIQDRFDDGM ISRTDLLMISTRLKEAELQYIKARQNYTLALQQLNILMGEAPNTPVDSLYNIGMISAPVR ILPLEDVLQRRADYASTEVNIMKSQAQRKAALSQFNPQLNMYFSGGWATATPNLGYDVSF NPIVGVNLNIPIFRWGARFKTNRQQKAYIGIQKLQQSYMTDNINEELSAALTKLTETESQ VKTAKENMSLANENLDLATFSYNEGKASMVDVLSAQLSWTQAHTNLINAYLAEKMAVAEY KKVTSE >gi|226332171|gb|ACIC01000149.1| GENE 39 43209 - 44444 1440 411 aa, chain - ## HITS:1 COG:AF2084 KEGG:ns NR:ns ## COG: AF2084 COG1883 # Protein_GI_number: 11499666 # Func_class: C Energy production and conversion # Function: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit # Organism: Archaeoglobus fulgidus # 24 411 5 354 354 294 48.0 2e-79 MNEIFENLYDMTAFSNIIAEPQFLIMYAIAFVLLYLGIKKQYEPLLLVPIAFGVLLANFP GGDMGVIQADENGMVMINGVMKNIWEMPLHDIAHELGIMNFIYYMLIKTGFLPPIIFMGV GALTDFGPMLRNLHLSIFGAAAQLGIFTVLLVAILMGFTPQEAASLGIIGGADGPTAIFT 
TIKLAPHLLGPIAIAAYSYMALVPVIIPLVVKLFCTKKELSINMKEQEKKYPSKTEIKNL RVLKIIFPIVVTTIVALFVPSAVPLIGMLMFGNLVKEIGSNTFRLFDAASNSIMNAATIF LGLSVGATMTAEAFLNWTTIGIVIGGFLAFALSITGGIFFVKLVNLFSKKKINPLIGATG LSAVPMASRVANEIALKYDPKNHVLQYCMASNISGVIGSAVAAGVLISFLA >gi|226332171|gb|ACIC01000149.1| GENE 40 44444 - 46273 2043 609 aa, chain - ## HITS:1 COG:AF1252m KEGG:ns NR:ns ## COG: AF1252m COG5016 # Protein_GI_number: 18677784 # Func_class: C Energy production and conversion # Function: Pyruvate/oxaloacetate carboxyltransferase # Organism: Archaeoglobus fulgidus # 9 515 9 477 480 184 30.0 4e-46 MKREVKFSLVFRDMWQSAGKYVPRVDQLVKVAPAIIEMGCFARVETNGGGFEQVNLLFGE NPNRAVREWTKPFHEAGIQTHMLDRALNGLRMSPVPADVRKLFYKVKKAQGTDITRTFCG LNDVRNIAPSITYAKEAGMISQCSLCITHSPIHTVEYYTNMALELIKLGADEICIKDMAG IGRPVSLGKIVANIKAAHPEIPIQYHSHAGPGFNMASILEVCEAGCDYIDVGMEPLSWGT GHADLLSVQAMLKDAGYQVPEINMEAYMKVRALIQEFMDDFLGLYISPKNRLMNSLLIGP GLPGGMMGSLMADLESNLESINKYKAKHNLPFMTQDQLLIKLFNEVAYVWPRVGYPPLVT PFSQYVKNLAMMNVMAMEKGKERWGMIADDIWDMLLGKAGRLPGTLAPEIIEKAEREGRK FFEGNPQDNYPDSLDKYRKLMKENKWEVGEDDEELFEYAMHPAQYEAYKSGKAKEDFLAD VAKRRAEKDKSPEEDAKPKTLTVQVDGQAYRVTVAYGDAELLAAPVAAAPAGEGKDVLSP LEGKFFLVKNAQESALKVGDAVKEGDVLCYVEAMKTYNAIRAEFSGTITAICANPGDTVS EDDVLMKIG >gi|226332171|gb|ACIC01000149.1| GENE 41 46302 - 46562 387 86 aa, chain - ## HITS:1 COG:no KEGG:BT_1698 NR:ns ## KEGG: BT_1698 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 86 1 86 86 113 97.0 3e-24 MENIETAILLMVVGMATVFVILLIVIYLGKLLITLVNKYAPEEVIPAKKEALQGPAPIPG NIMAAITAAVNVVTLGKGKITKVEKL >gi|226332171|gb|ACIC01000149.1| GENE 42 46838 - 48400 1428 520 aa, chain - ## HITS:1 COG:no KEGG:BT_1642 NR:ns ## KEGG: BT_1642 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 520 1 520 521 942 88.0 0 MEDLQRLYPIGIQTFSKIREGNFLYVDKTEYVYRMTHSASSYMFLSRPRRFGKSLLTSTL HSYFSGREELFRGLAMEKLEKEWIQYPVLHFDMSMAKHVDKDKLMNLLDFMLTEHERTLG IDAAGTDPNLRLTNLIKRAYEQTGRKVVVLIDEYDAPLLDVVHERENLGVLRDVMRNFYS 
PLKACDPYLRYVFLTGITKFSQLSIFSELNNIKNISMDEPYAAICGITENEILTQMKEDV DRMAVKLNITAEEVLAKLKENYDGYHFTYPSPDIYNPFSLLNAFADGKFNSYWFGSGTPT YLIKMLDKFGVEPSEIGNKLAAVEEFDAPTETMSSITPLLYQSGYVTIKDYDKELELYTL DIPNKEVRVGLMRSLLPYYVTNDTREATNMVAFISRDIRKGDMDAALRRLQTFLSTIPQC DHTKYEGHYQQVFYIIFSLLGYYVDVEVRTSRGRVDVVLRTEATLYVMELKLDKNATVAM EQINLKNYPERFALCGLPVVKVAVNFDSESGTLGDWMIEK >gi|226332171|gb|ACIC01000149.1| GENE 43 48565 - 48714 206 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253571854|ref|ZP_04849259.1| ## NR: gi|253571854|ref|ZP_04849259.1| predicted protein [Bacteroides sp. 1_1_6] # 1 40 1 40 49 69 100.0 6e-11 MENKIVHPKLGEFIEKVLKVAHVNLSKMCQEIHMGPATYQKVKKGKKKL >gi|226332171|gb|ACIC01000149.1| GENE 44 48868 - 49494 402 208 aa, chain + ## HITS:1 COG:no KEGG:BT_1702 NR:ns ## KEGG: BT_1702 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 208 1 208 208 410 100.0 1e-113 MQNFFRMSYFMPPIAPIRNEQGQTVTPATLTPFCEVSVEQVYQMITCNENLKALTEQVRS AEDIRMAKTSLLPYVTPCGTFIRRNSKFFASPSGLVVVDIDNLDSYQKAVEMRRTLFDDP FLHPVLAFISPSNRGVKTFVPYSNLYTDDQSRNVRESMSWAMEYVEMTYGSEMNNSIETV QKAVDTSGKDIVRACFLSHDPQALFREY >gi|226332171|gb|ACIC01000149.1| GENE 45 49561 - 51399 1121 612 aa, chain + ## HITS:1 COG:no KEGG:BT_1703 NR:ns ## KEGG: BT_1703 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 612 1 612 612 1211 99.0 0 MTDIESLRLLTEAVETAGADIAPTYAEYVQLAFAIATDCGEAGREFFHRLCRISAKYQSA HAERMFSNALTKQQGAIHLGTAFHLAESTGVKICREDRKEVMNNRTGTVGTESSPHNFST HAHVYNKVEEEKSESEELLEGSDPHHPLPTFTLEDWPKLLLRIISYGTTATQKDVLLLGA LTALGATMERYVRCHYAGKYQSPCMQSFIVAPAASGKGVLSLIRLLVMPIHDDIRQQVEK EMNVYKKAKVAYEMMGKERAKAEIPEIPLNRMFLISGNNTGTGILQNIMDNNGTGLICET EADTISTAIGSEYGHWSETLRRAFDHDWLAYNRRTNQEYRENKKSYLSLLLSGTPAQVKP LIPSVENGLFSRQLFYYMHGIYQWINQFDENETDLEAIFTSIGLEWKKLLNLLKEHGLHT LRLTDEQKQEFNDLFSELFTRSTIANGREMNSSIARMAVNICRIMSVVAMLRAFENPQPY QYQASSHPLLTPDKEIATDNIKDGIITRWDMTITPEDFKAVLNLVKPLYQHATHILSFLP PSEVSHRANADRDAFFEALGIQFTRAQLLEQATTMGIKPNTALTWLKRLVKQGLIVNLDG 
KGTYAHARVCVC Prediction of potential genes in microbial genomes Time: Thu May 12 03:25:28 2011 Seq name: gi|226332170|gb|ACIC01000150.1| Bacteroides sp. 1_1_6 cont1.150, whole genome shotgun sequence Length of sequence - 1483 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 268 62 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 343 - 402 6.3 - Term 354 - 391 -1.0 2 2 Tu 1 . - CDS 407 - 949 246 ## BT_1710 hypothetical protein - Prom 1187 - 1246 9.2 Predicted protein(s) >gi|226332170|gb|ACIC01000150.1| GENE 1 1 - 268 62 89 aa, chain - ## HITS:1 COG:HI1695 KEGG:ns NR:ns ## COG: HI1695 COG0463 # Protein_GI_number: 16273582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Haemophilus influenzae # 6 89 3 85 267 86 50.0 1e-17 MSKLIFSVLISVYYNESISYFKKSLDSILYQTLLPAEVVLVKDGILTDDLNCIVKEYSQK YPILKVISLPVNQGLGKALNEGLKHCSYD >gi|226332170|gb|ACIC01000150.1| GENE 2 407 - 949 246 180 aa, chain - ## HITS:1 COG:no KEGG:BT_1710 NR:ns ## KEGG: BT_1710 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 180 191 370 370 251 100.0 6e-66 MQKSVVTLIFISLIFVMLGVSQKILLLGTKIPVFGEFVNYYILDFYGRNVDASYGFSVGM IVNLFLFLFLYFGMRSVYNDISNHLKIFVNLLLYSFILSCFFNAYAVFVERLVSVTNMTL LFILPYVLHKIFIGKVNKTIAFACIVVYAMLMFTKTLYTPVPLGGKYEYQFIPYQYSFVI Prediction of potential genes in microbial genomes Time: Thu May 12 03:25:34 2011 Seq name: gi|226332169|gb|ACIC01000151.1| Bacteroides sp. 1_1_6 cont1.151, whole genome shotgun sequence Length of sequence - 13206 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 2, operones - 2 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 34 - 1017 504 ## BT_1711 hypothetical protein 2 1 Op 2 . 
- CDS 1056 - 2234 501 ## COG0438 Glycosyltransferase 3 1 Op 3 1/0.000 - CDS 2247 - 2741 274 ## COG1778 Low specificity phosphatase (HAD superfamily) - Prom 2773 - 2832 3.6 4 1 Op 4 1/0.000 - CDS 2834 - 3874 526 ## COG2089 Sialic acid synthase 5 1 Op 5 . - CDS 3874 - 4623 276 ## COG1861 Spore coat polysaccharide biosynthesis protein F, CMP-KDO synthetase homolog 6 1 Op 6 . - CDS 4632 - 5891 295 ## COG1887 Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC 7 1 Op 7 . - CDS 5845 - 6432 107 ## BT_1717 putative lipopolysaccharide biosynthesis protein - Prom 6452 - 6511 1.5 - Term 7573 - 7612 0.5 8 2 Op 1 . - CDS 7672 - 8793 711 ## COG0075 Serine-pyruvate aminotransferase/archaeal aspartate aminotransferase 9 2 Op 2 . - CDS 8803 - 9927 772 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] 10 2 Op 3 . - CDS 9944 - 11245 877 ## COG2513 PEP phosphonomutase and related enzymes 11 2 Op 4 . - CDS 11266 - 11985 336 ## COG1213 Predicted sugar nucleotidyltransferases 12 2 Op 5 . 
- CDS 12049 - 13197 687 ## BT_1722 putative protein involved in capsular polysaccharide biosynthesis Predicted protein(s) >gi|226332169|gb|ACIC01000151.1| GENE 1 34 - 1017 504 327 aa, chain - ## HITS:1 COG:no KEGG:BT_1711 NR:ns ## KEGG: BT_1711 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 327 15 341 341 616 99.0 1e-175 MAKAIIAKENLDCHNVVYLIDRNYANKYHNNIDIEIEDISFWMNKFYPMKIKTILNLRRI IKSFDCYISCLIADSFTVYIPQLTTPIFQLLITHHSCVGYHFMEEGLAYYKDQLYKASPN KFPLLVELLFKVYNIFCRRVQLNYPFLKPYKKSRFQPKYYLLNNKYNVLRENVSLVSWIK EETFFTSIPSGASVLVLSPIVEYRLATPEKFYASMNLLFNYISDSHVFIKAHPYQASSVV NTLVGLLEQHGKTWTFIPNDEPFEQILLSVENLCVFGTESSLMFYATLLGNNNKIISNRN NLAFNDIIFRKYAQNSESLVVNSYIQI >gi|226332169|gb|ACIC01000151.1| GENE 2 1056 - 2234 501 392 aa, chain - ## HITS:1 COG:SP1366 KEGG:ns NR:ns ## COG: SP1366 COG0438 # Protein_GI_number: 15901220 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Streptococcus pneumoniae TIGR4 # 165 389 141 365 367 87 26.0 4e-17 MKKVDIIHGGWLKAPNGASSVLRTLAESTSKFEEYGIQLSVYSMDLIVLKSFENIAAINQ RKGLRYFIKKNANANSILAFISIYAIYLRHARKIVKYYFDQGLEESRILFIHDIFTCYYY LKYRSRRQKTVLVLHNNGDTFNMLRQYYPVLNKSRLYSILLYIERKVLSEVDKVLFVAEN PKDTFVQLHPDIPIEKVGFVYNGILSKEMHICPEKHLGPLEICCVGSVSKRKGQDMIVEA LVKMSPVQREKVHFTIVGDGTLRGELEKLCFEKGISKYIDFIGVSNQVENYLLRSDIFML PSRDEGFPISILEAMRAGLPIISTNIAGIPEMVFSGINGIVISPCLEDIYDILCHIESYN WSAMGKLSYELYQQKFTLDSMIESYANILNAI >gi|226332169|gb|ACIC01000151.1| GENE 3 2247 - 2741 274 164 aa, chain - ## HITS:1 COG:MA3766_2 KEGG:ns NR:ns ## COG: MA3766_2 COG1778 # Protein_GI_number: 20092564 # Func_class: R General function prediction only # Function: Low specificity phosphatase (HAD superfamily) # Organism: Methanosarcina acetivorans str.C2A # 1 153 7 159 163 131 45.0 6e-31 MKEIKLILTDIDGVWTDGGMFYDQTGNEWKKFNTSDSAGIFWAHNKGIPVGILTGEKTEI VRRRAEKLKVDYLFQGVVDKLSAAEELCNELGINLEQVAYIGDDLNDAKLLKRVGIAGVP ASAPFYIRRLSTIFLEKRGGEGVFREFVEKVLGINLEDFIAVIQ >gi|226332169|gb|ACIC01000151.1| GENE 4 
2834 - 3874 526 346 aa, chain - ## HITS:1 COG:Cj1327 KEGG:ns NR:ns ## COG: Cj1327 COG2089 # Protein_GI_number: 15792650 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sialic acid synthase # Organism: Campylobacter jejuni # 1 340 1 332 334 162 32.0 1e-39 MKNTYIIGEIGQNHNGSVDLAKLIIELISRPVHEDVFGMDFMPMNAVKMTKRDLSEELAV SQMNQIYDNPNSFGRTYGEHRAFLELDDQAHFEIYKYAKSLGLDFIETLCAKGCLSLLKL FTPDRLKVASRDLTNLPLLEALAETHIPIILSTGMAGQRELDNALDVITRYHSNIAILHC VSQYPTHPNNLNLRTITYLKKHYNKFEIGFSDHTIGISAATAAVAMGAEIIEKHVTIDRH MKGTDQLGSLGPDGVNRMIRDIRIAECWLGTEDLYIEKGVEKSKIKLERSIATRKYIPAG SIIQESDIHLLSPGDGYKWIDKKNVIGHFAKQDISANEIIYPDFIE >gi|226332169|gb|ACIC01000151.1| GENE 5 3874 - 4623 276 249 aa, chain - ## HITS:1 COG:BS_spsF KEGG:ns NR:ns ## COG: BS_spsF COG1861 # Protein_GI_number: 16080837 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Spore coat polysaccharide biosynthesis protein F, CMP-KDO synthetase homolog # Organism: Bacillus subtilis # 5 235 6 232 239 117 31.0 2e-26 MNYAFVVQARLGSTRLPGKILKPFYGNQSILDLMVHKLSAISNIPVIIATTNSVINEPIE KKALALGVKCFRGEENDVLKRFIDVAEYFDIQGIFRICSDNPFLDVHAARQLVEIAMKSC NDYISFDIDGTPSIKTHFGFWGEFVTLDALKRVIGFTDDLLYREHVTNYIYSHPELFNIQ WISGSPVVSKHHNIRLTIDTLEDFSVAQRIYRDLQEKRVEISIEVIVDYVRRHAEYIQLM KQQIKKNSK >gi|226332169|gb|ACIC01000151.1| GENE 6 4632 - 5891 295 419 aa, chain - ## HITS:1 COG:MTH365 KEGG:ns NR:ns ## COG: MTH365 COG1887 # Protein_GI_number: 15678393 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC # Organism: Methanothermobacter thermautotrophicus # 177 414 166 406 409 125 32.0 2e-28 MKILKKIFQVIVKIINHCVPVDSRSLYFIPHPNCRTDKYDIINYSSDNVLVLLNYLLKKD SAEYYKIYVEIYDSSRISEYQEYIYGINPNIRCTFLCACVGRFFFNKVSFKDMLYAFFCF CRASKCFTATFYYDFSFKKRSQQIICLGYYTPFKDDYHMGDHSYDKFRNTTCNSFDYSIA TSCLSARIISIDCGISYDKFKVLGFPRNDLLISKNNCNVIKREISKFAGYDVTKYIVYTP TFRDYETVTEGNLRSILGYVDCDLLKLSKILLKFNAALILKLHPLQNKTVLKKDLPKGIL 
VFEQTYKYGLYDLLSFSDGMITDYTSTYFDFLLVNKPVIFNFYDIEEYRRVRGFSFEPIE FFCAGDIVYNYNELIDAVMGLLAGKDIHAEHRRHISLLMNQFQDDNSTKRICDIFLNEI >gi|226332169|gb|ACIC01000151.1| GENE 7 5845 - 6432 107 195 aa, chain - ## HITS:1 COG:no KEGG:BT_1717 NR:ns ## KEGG: BT_1717 # Name: not_defined # Def: putative lipopolysaccharide biosynthesis protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 195 300 494 494 329 100.0 4e-89 MVGLLVLANPIISLLLTDKWSGVVILLQILCLDWMFDHLSQINLNLLWVKGRSDLSLRLE IIKKTIAIFILFVSIPFGLEVMCWGRVIYSLIATYLNTYYTNSLIDLTLKLQVKDVFPSL ILSFIMGGIVFICISFFETDIVKIVIGCITGMSFYILLSFLFKLDSFFCILTLIIKTKNI DENIKKDFSSNSKDN >gi|226332169|gb|ACIC01000151.1| GENE 8 7672 - 8793 711 373 aa, chain - ## HITS:1 COG:STM0431 KEGG:ns NR:ns ## COG: STM0431 COG0075 # Protein_GI_number: 16763811 # Func_class: E Amino acid transport and metabolism # Function: Serine-pyruvate aminotransferase/archaeal aspartate aminotransferase # Organism: Salmonella typhimurium LT2 # 7 363 7 363 367 234 35.0 2e-61 MKIERKILLNPGPATTTDSVKMAQVVPDICPREKEFAGMMKQLRDDLVRVAHGNLEKHTA VLFCGSGTINIDICLNSLLPEDKRVLIVNNGAYSTRAVEVCQYYGLPHINLEFSVYERPD LSVVENVLQENPDIAMVYTTHHETGTGILNPIREMGAIAHKYQAKFVVDTTSSLGMIPFD IEKDNVDFCMASAQKGLQAMTGLSFIIGNEEQIKFSKKYPKRSYYCNLYLQYENFERTGE MHFTPPVQTIYATRQALNEYFEVGEEAKFARHKRVFEAIHSGINEIGLKSVIKREWQSGL VVSVQYPEDPNWDFEKIHDYCYERGFTIYPGKISTEDTFRICALGEIEVDDIENFFKVLK AALNHYNVTLPMS >gi|226332169|gb|ACIC01000151.1| GENE 9 8803 - 9927 772 374 aa, chain - ## HITS:1 COG:MJ0256 KEGG:ns NR:ns ## COG: MJ0256 COG0028 # Protein_GI_number: 15669880 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Methanococcus jannaschii # 182 332 2 145 188 87 36.0 5e-17 MLSVEKVYETFLSHDVDFFTGVPDSLLKNICAYITDHTSREKHIIAANEGAAVGIASGYY MASGKVPVVYMQNSGLGNTVNPLLSLADEQVYSFPMLLMIGWRGEPGTKDEPQHKKQGEV 
TLDLLKAMRIPYIILDSNENDAFIQLHDIIQSAKTKNIPHAIIIRKDVFGKYKLQKEFGC NYPLSREDALQIVVDYLPENSVVVSTTGKLSRELFEYREMKEQGHEHDFLTVGSMGHSSS ISLGIAIAKQDRPVYCLDGDGAFIMHLGAISNIGDLSPKNYYHILFNNGAHESVGGQPTL GFSLDIPAIVRGSGYKHTYTVCTKVEIEEAMKQLPKLCGPVLLEIKVKIDSRENLGRPTT TPIENKEHFMDFLK >gi|226332169|gb|ACIC01000151.1| GENE 10 9944 - 11245 877 433 aa, chain - ## HITS:1 COG:mlr9115 KEGG:ns NR:ns ## COG: mlr9115 COG2513 # Protein_GI_number: 13488216 # Func_class: G Carbohydrate transport and metabolism # Function: PEP phosphonomutase and related enzymes # Organism: Mesorhizobium loti # 148 429 20 287 318 149 33.0 9e-36 MSKTVYIGMTADIMHPGLIRIINEATKYGDVIIGLLTDKAIAEHKRLPYLTYEQRKEVVQ NIKGVCKVVPQEEWSYVENLKRIKPDYIIHGDDWKTGPLREERVRVFEVMNEQGGKVIEI PYTLGINSSSLDKDIKAIGTTPDVRLKSLRRLINAKPVVRILEAHDGLCGLIIENQEILK GDKREVFDGMWSSSLTDSTSKGKPDIEAVDLTTRLQDLNNILECTTKPIIFDGDTGGKIE HFVFTVRTLERHGISAIIIEDKVGLKKNSLFGTDVIQTQDTIEGFCNKIKAGKASQITDD FMIIARIESLIAGKPVSDALERAFAYVQAGADGIMIHSKNKSGEDIKEFCLAFRKQYAHV PIVVVPTTYDHIYESELCDWGVNIIIYANHMLRAAYPAMMNVAKIILENERALEVRELCM PIKEILELIPGTK >gi|226332169|gb|ACIC01000151.1| GENE 11 11266 - 11985 336 239 aa, chain - ## HITS:1 COG:FN1670_1 KEGG:ns NR:ns ## COG: FN1670_1 COG1213 # Protein_GI_number: 19704991 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted sugar nucleotidyltransferases # Organism: Fusobacterium nucleatum # 5 237 76 297 334 85 29.0 9e-17 MIRTAMIMAAGMGTRFGQYTELIPKGFVKVGGIPMIVRSIDTLLSCDIERIVIGTGYKQE VYEELKTDYPMLETCFSPRYAETNSMYTLYNTREILGNDDFLLLESDLIFEKQAIMSLLE CPAADAMLITPVTKFQDQYYVEHDDNFRLSSCSVNKNKLNAKGELVGIHKLSGSFYKIMC ADYASIVDEQPNLGYEYELLRISCSQSPVYVHKVEGLKWYEIDDISDLEYAEKYIVPYC >gi|226332169|gb|ACIC01000151.1| GENE 12 12049 - 13197 687 382 aa, chain - ## HITS:1 COG:no KEGG:BT_1722 NR:ns ## KEGG: BT_1722 # Name: not_defined # Def: putative protein involved in capsular polysaccharide biosynthesis # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 382 1 382 382 719 100.0 0 MIEKQSYQEPFSEVDKRKDDEMEIDLMAIFYKIITVRGVLCKAAGVGVIIAIVVALSIPK 
QYTVKVTLSPEMGNAKGSNGLAGLAASFLGDGALIGESTDALNASLSADIVSSTPFLLEL LEMEVSVSKKDDKMTLGSYLDEESSPWWDYLIGFPGIVIGGVKSLFGEDTVPASGGRQGT IELTKEVNEKINFLKKKISASIDKKTAITSITVTLQNPQITAVIADSVVHKLQEYIIGYR TSKVKEDCAYLERLFKERQQEYYAAQRKYADYVDAHDNVVLQSVRAEQERLQNDMSLAYQ IYSQVANQLQVSRAKVQEEKPVFAVVEPAVVPLKPSGMGLKIYILLFVFLSVSGTLVWVF FGKSLLASLRKELKNKEIEQDK Prediction of potential genes in microbial genomes Time: Thu May 12 03:25:47 2011 Seq name: gi|226332168|gb|ACIC01000152.1| Bacteroides sp. 1_1_6 cont1.152, whole genome shotgun sequence Length of sequence - 1081 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 318 362 ## BT_1723 polysialic acid transport protein KpsD precursor 2 1 Op 2 . - CDS 378 - 692 200 ## BT_1724 hypothetical protein 3 1 Op 3 . - CDS 757 - 1074 288 ## BT_1725 putative transcriptional regulator Predicted protein(s) >gi|226332168|gb|ACIC01000152.1| GENE 1 3 - 318 362 105 aa, chain - ## HITS:1 COG:no KEGG:BT_1723 NR:ns ## KEGG: BT_1723 # Name: not_defined # Def: polysialic acid transport protein KpsD precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 105 1 105 789 193 100.0 1e-48 MLGSGSLMAQSMSDSQVLEYVKDGIRQGKEQKQLASELARKGVTKEQAMRVKQLYEQQNN VNTSKSTGTDINESRLREETKKNTSDMLEDHPTTQDLARGDQVFG >gi|226332168|gb|ACIC01000152.1| GENE 2 378 - 692 200 104 aa, chain - ## HITS:1 COG:no KEGG:BT_1724 NR:ns ## KEGG: BT_1724 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 104 19 122 122 196 100.0 3e-49 MLLPGGGSAGCVYLDDLSALQRSIHEKINDLYSQRGETPEQDATLCLAILQGYNVSMYAN PEDEERKQAVLTRSLSLLDVLPPSLLKQQLSAVCHGMQELCEIN >gi|226332168|gb|ACIC01000152.1| GENE 3 757 - 1074 288 105 aa, chain - ## HITS:1 COG:no KEGG:BT_1725 NR:ns ## KEGG: BT_1725 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 105 88 192 192 202 100.0 3e-51 
MEVLSFSTVSRYMVMRGESSPAVIPDEQMARFRFMLDYSEESISMNSSPLARGEKVRVIK GPLTGLVGELVNVDGKSKIAVRLNMLGCACVDMPVGYVEPIMAAV Prediction of potential genes in microbial genomes Time: Thu May 12 03:25:59 2011 Seq name: gi|226332167|gb|ACIC01000153.1| Bacteroides sp. 1_1_6 cont1.153, whole genome shotgun sequence Length of sequence - 22892 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 5, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 341 - 1291 510 ## BT_1726 integrase - Term 1727 - 1790 4.1 2 2 Op 1 . - CDS 1861 - 2712 844 ## BT_1727 putative transmembrane sensor 3 2 Op 2 . - CDS 2722 - 3312 327 ## BT_1728 RNA polymerase ECF-type sigma factor 4 2 Op 3 . - CDS 3389 - 4963 1567 ## COG4108 Peptide chain release factor RF-3 5 2 Op 4 . - CDS 4969 - 5817 733 ## COG1091 dTDP-4-dehydrorhamnose reductase 6 2 Op 5 . - CDS 5818 - 6363 578 ## BT_1731 hypothetical protein 7 2 Op 6 . - CDS 6375 - 7028 555 ## BT_1732 hypothetical protein - Prom 7060 - 7119 5.7 + Prom 7069 - 7128 2.7 8 3 Op 1 3/0.000 + CDS 7158 - 10862 4019 ## COG0046 Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain + Term 10878 - 10940 13.9 + Prom 10998 - 11057 4.7 9 3 Op 2 . + CDS 11134 - 15174 2903 ## COG0642 Signal transduction histidine kinase 10 3 Op 3 7/0.000 + CDS 15171 - 15728 493 ## COG2059 Chromate transport protein ChrA 11 3 Op 4 . + CDS 15760 - 16308 614 ## COG2059 Chromate transport protein ChrA + Term 16501 - 16540 -0.8 - Term 16343 - 16379 5.4 12 4 Op 1 . - CDS 16415 - 17923 1571 ## COG1649 Uncharacterized protein conserved in bacteria 13 4 Op 2 . - CDS 17988 - 19922 1498 ## COG0642 Signal transduction histidine kinase - Prom 20010 - 20069 9.1 + Prom 19974 - 20033 5.2 14 5 Tu 1 . 
+ CDS 20105 - 22876 2716 ## COG0178 Excinuclease ATPase subunit Predicted protein(s) >gi|226332167|gb|ACIC01000153.1| GENE 1 341 - 1291 510 316 aa, chain - ## HITS:1 COG:no KEGG:BT_1726 NR:ns ## KEGG: BT_1726 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 316 1 316 316 611 99.0 1e-173 MNKNGFSRCADIYIGRLREEGRYSTAHVYQNALLSFSKFCGVHSVSFRQVTRERLRRYEQ HLYECGLKPNTISTYMRMLRSIYNRGVEAGSAPYVHRLFHEVYTGVDVRQKRALPVVALR RLLYEDPQSDRLRRTQAIAALMFQFCGMSFADLSHLEKSALDSNVLRYNRVKTKTPMSVE VLDSAQEMLDQLRNRQSPRPGCPDYLFGILHGDKKRKDERAYREYQSALRRFNYCLKSLA KRLRLNFPVTSYTLRHSWATTAKYRGVPIEMISESLGHKSIKTTQIYLKGFELKERTEVN RMNLSYVKRCGVGQCI >gi|226332167|gb|ACIC01000153.1| GENE 2 1861 - 2712 844 283 aa, chain - ## HITS:1 COG:no KEGG:BT_1727 NR:ns ## KEGG: BT_1727 # Name: not_defined # Def: putative transmembrane sensor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 283 1 283 283 554 99.0 1e-156 MENRKELLEQYMEDGKLPLKGELFARKSVLGGKLEEFREEKRADALTSRPNAAGRQKLIV LRSSCIYWRVAALFILLFGIGGYYYLSEEKITSEAVAVDYKLPDGSSVKLMQNSTLSYNK VSWLWGRKLNLLGSAFFDVTPGKTFTVRTEAGDVTVLGTKFLVEQEGKTITVNCEEGTVK VETPIGEQTLNAGESVRCDENKIGPVQEKEELPEVLGYEDDPLVNVVADIQHIFDVEVVG CEKYNELYYNGTILTRDLHETLKKVFGSLGINYQLSGQKVILE >gi|226332167|gb|ACIC01000153.1| GENE 3 2722 - 3312 327 196 aa, chain - ## HITS:1 COG:no KEGG:BT_1728 NR:ns ## KEGG: BT_1728 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 196 1 196 196 348 100.0 5e-95 MNTTFWEKIQQGDEEAFRQLYNEYSDLLYGYGMKIAGDDNLVTESIQSLFVYLYEKRQSC SEPQSISAYLCVALKRMLLNELKKTANGVFTSLDEVNSSEYRFDLEIDIETAIVRSELER EQLEVLQKEINGLTKQQREVLYLKYYKKLDSDEIAEVMGLTSRTVYNTTHMAISRLRERL SKSFLLTVAANLWIFN >gi|226332167|gb|ACIC01000153.1| GENE 4 3389 - 4963 1567 524 aa, chain - ## HITS:1 COG:NMA0836 KEGG:ns NR:ns ## COG: NMA0836 COG4108 # Protein_GI_number: 15793806 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptide chain release factor RF-3 # Organism: Neisseria meningitidis 
Z2491 # 6 523 8 526 531 530 49.0 1e-150 MADNTEIQRRRTFAIIAHPDAGKTSLTEKLLLFGGQIQVAGAVKSNKIKKTATSDWMEIE KQRGISVTTSVMEFDYRDYKINILDTPGHQDFAEDTYRTLTAVDSVIIVVDGAKGVETQT RKLMEVCRMRKTPVIIFVNKMDREGKDPFDLLDELEEELMIQVRPLSWPIEQGARFKGVY NIYEKKLDLYQPSKQVVTEKVEVDIHTEELDKQIGKPLADKLRGDLELIEGVYPELDVES YLAGDCAPVFFGSALNNFGVQELLNCFVEIAPSPRPVQAEEREVKPDEPKFTGFIFKITA NIDPNHRSCVAFCKICSGKFVRNAPYQHVRHGKTMRFSSPTQFMAQRKTTIDEAYAGDII GLPDNGTFKIGDTLTEGELLHFRGLPSFSPEMFKYIENADPMKQKQLAKGIDQLMDEGVA QLFVNQFNGRKIIGTVGQLQFEVIQYRLLNEYNASCRWEPVSLYKACWVESDDPAELEAF KKRKYQYMAKDREGRDVFLADSGYVLQMAQMDFKHIKFHFTSEF >gi|226332167|gb|ACIC01000153.1| GENE 5 4969 - 5817 733 282 aa, chain - ## HITS:1 COG:CAC2315 KEGG:ns NR:ns ## COG: CAC2315 COG1091 # Protein_GI_number: 15895582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Clostridium acetobutylicum # 1 282 1 278 280 251 51.0 8e-67 MRILVTGANGQLGNEMQVLAKENPQHTYYFTDVQELDICNKDAVWAYIAEKRIELIVNCA AYTAVDKAEDDSELAYKLNSEAPKTLACAAQFNGAAMIQVSTDYVFDGTAHIPYTEECDP CPNSVYGTTKLEGEYEVLNHCEKSVVIRTAWLYSTFGNNFVKTMIRLGKERDSLGVIFDQ VGTPTYANDLAQAIFAIINKGIVRGVYHFSNEGVCSWYDFTVAIHRLAGITSCKVKPLHT AEYPTRANRPAYSVLDKTKIKTTFGIEIPHWEESLKRCIDTL >gi|226332167|gb|ACIC01000153.1| GENE 6 5818 - 6363 578 181 aa, chain - ## HITS:1 COG:no KEGG:BT_1731 NR:ns ## KEGG: BT_1731 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 181 1 181 181 313 100.0 1e-84 MNQIARQLKEKNIAEYLIYMWQEEDLIRANHGELEEIEANVIARYPEDQRPALREWYGNL ITMMNEEGVREKGHLQINKNIIINLTELHNALTSSPKFPFYSAAYFKALPFIVELRNKNG KKEEPELETCFEALYGLLLLRLQKKPVSEGTMKAVEAISSFLSMLANYYDKDLKGELKLD E >gi|226332167|gb|ACIC01000153.1| GENE 7 6375 - 7028 555 217 aa, chain - ## HITS:1 COG:no KEGG:BT_1732 NR:ns ## KEGG: BT_1732 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 217 1 217 217 362 100.0 5e-99 MIQIETIFDIIVKGFIIGVVVSAPLGPVGVLCIQRTLNKGRWYGFVTGLGASLSDIAYAL 
LTGYGMSFVFDYINKNIFYLQLLGSIMLLLFGIYTFRSNPVQSIRPASSNKGSYFHNFIT AFFVTLSNPLIIFLFIGLFARFAFVQPGVVVFEEITGYLAIASGALVWWLGITYFVNKVR TKFNLRGIWILNRVIGSIVMLVSVAGLIYTLLGESLY >gi|226332167|gb|ACIC01000153.1| GENE 8 7158 - 10862 4019 1234 aa, chain + ## HITS:1 COG:HI0752_1 KEGG:ns NR:ns ## COG: HI0752_1 COG0046 # Protein_GI_number: 16272693 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain # Organism: Haemophilus influenzae # 15 928 65 1011 1011 504 37.0 1e-142 MILFFRTPSKSVIAVESNHQLTPDESNKLCWLFGEAVTESEENLKGCFVGPRREMITPWS TNAVEITQNMGLEDILRIEEYFPVKDENADYDPMLQRMYKGLDQNVFTTNRQPEPIVSIE DLEAYNEKEGLALSKEEMDYLKKVENDLGRKLTDSEVFGFAQINSEHCRHKIFGGTFIID GVEQESSLFQMIKKTTQENPNKIISAYKDNVAFAEGPVVEQFAPADHSKPDFFQVKDIKS VISLKAETHNFPTTVEPFNGASTGTGGEIRDRMGGGKGSWPIAGTAVYMTSYPRTDEGRE WEDILPVRKWLYQTPEQILIKASNGASDFGNKFGQPLICGSVLTFEHTENNEVYGYDKVI MLAGGVGYGTQRDCLKGTPEAGNKVVVIGGDNYRIGLGGGSVSSVDTGRYSSGIELNAVQ RANAEMQKRANNVVRALCEEDVNPVVSIHDHGSAGHVNCLSELVEECGGLIDMSKLPIGD KTLSAKEIIANESQERMGLLIKEEAIEHVRKIAERERAPMYVVGETTGDHRFSFQQADGV RPFDLAVEQMFGSSPKTYMIDKTVERHYEMPEYELPKLHEYLTNVLQLEAVACKDWLTNK VDRSVTGKVARQQCQGELQLPLSDCGVVALDYRGEKGIATSIGHAPQAALADPAAGSILS VSEALTNLVWAPMAEGMDSISLSANWMWPCRSQEGEDARLYTAVKALSDFCCALQINVPT GKDSLSMTQKYPNGEKVISPGTVIVSAGGEVSDVKKVVSPVLVNNEKTTLYHIDFSFDEL KLGGSAFAQSLGKVGDDVPCVQDAEYFRDAFLAVQELVNKGLILAGHDISAGGLITTLLE MCFANVEGGMEISLDKMKEQDIIKILFAENPGIVIQISDKHKEEVKKILEDAGVGFVKLG KPTDERHILVSKGDATYQFGVDYMRDVWYSSSYLLDRKQSMNGCAKKRFENYKMQPVEFA FMPGFKGKLSQYGITPDRRTPSGVRAAIIREKGTNGEREMAYSLYLAGFDVKDVTMTDLI SGRETLEDVNMIVYCGGFSNSDVLGSAKGWAGGFLFNPKAKEALDKFYAREDTLSLGICN GCQLMMELGLINPEHEKKGKMLHNDSHKFESTYVGLTIPTNRSVMFGSLSGSKLGIWVAH GEGKFSLPYDEDQYNVVAKYSYDEYPGNPNGSDYSIAGLASADGRHLAMMPHLERAIFPW QNGCYPADRKNSDQITPWIEAFVNARKWVEEKTK >gi|226332167|gb|ACIC01000153.1| GENE 9 11134 - 15174 2903 1346 aa, chain + ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction 
mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 791 1022 5 227 294 119 33.0 4e-26 MKKWLFLLLLFPLTCVAQTYQYLGVEDGLSNRRVYYIQKDNVGYMWFLTHEGIDRYNGKE FKRYKLMDGDKELNSLLNLNWLYIDPEGTLWEIGKKGKIFRYDRIHDTFELVYKLPIEDF KDLPAPITFSWIDHNNHIWLCNEETIFLYNTRTEEVTHVKNCLSEAINDIEQIDESHYFI GTEMGIHHAQLKDNTLQLLPCDKLDNVNIQVNDLHFDPKIRKLFIGTFQRGVMVYDMNTK TITQPEVSLKDVSISRIKPLNAKELLIATDGGGIYKMNVDTYQTVPFIVADYNSYNGMNG NSINDIYIDDEERIWMANYPIGITIQNNRYPSYKWIKHSVGNKQSLINDQVNSIIEDSEG DLWYGTNNGISLQDSKTGKWRSFLSTFENVQNSKNHIFTTLCEVSPGIIWAGGYFSGLYQ IDKRTSKTTYFTPASYAHESIRPDKYIRDIRKDSRGYVWSGGYYNLKRINVQTKDIRLYH GLHSVTAIIEKDDKSMWIGTATGLFLLDIESGKYERIQLPVESTYVYSLYQTKQGSLYIG TSGSGVLIYDPQTKLFTHYYTGNCAMISNNIYTILSDEDNEILLSTENGLTSFYPKQKTF YNWTKEMGLMTTHFNALSGTLRRNNNFIFGSSDGAIEFNKDMKLPRTYSSKMIFSDFKLF YQTVYPGDKNSPLEKDINETKELRLKYNQNIFSLMVSSINYDYPSNVLYSWKLEEFYEEW SKPGNESTIRYTNLAPGKYVLRVRAISNEDQRVMLEERSIDIIIAQPFWLTPWAMILYAI LISLIAVIILRVLILKKQRKVSDEKIHFFINTAHDIRTPLTLIKAPLEDLREKEALSKEG IANMNTAIRNVNALLRLTTNLINFERADVYSSELYISEHELNTFMTEIFNAFQPYANIKH INFTYESNFRYMNVWFDKEKMESILKNIISNALKYTPENGNVQIFVSENNDSWSVEVKDT GIGIPASEQKKLFKLHFRGSNAINSKVTGSGIGLMLVWKLVRLHKGKINLSSVENQGSVI KISFPKDSKRFHKAHLATPSKRRQEITSTTNVPASIYENVHKEQNPNHQRILIVEDNDEL RNYLSQTLAEEYTVQNCCNGKEALTIIPEYKPELVISDIMMPEMRGDELCDAIKNNIETS HIPVILLTALNNEKDILSGLQIGADEYIVKPFNIGILKATVSNLLINRALLRSKYGSLDM DDEDDHEDECINCSQDIDWKFIANVRKNIEDNIDNPALTVDVLCNLMGMSRTSFYNKLRA LTDQAPGDYIRLIRLKRSVKLLKEGTHSITEIAEMTGFSDAKYFREVFKKHFNVSPSQYG KEGKPHKEGKAGKENKSDKEGGEKEE >gi|226332167|gb|ACIC01000153.1| GENE 10 15171 - 15728 493 185 aa, chain + ## HITS:1 COG:FN0712 KEGG:ns NR:ns ## COG: FN0712 COG2059 # Protein_GI_number: 19704047 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 2 169 4 171 186 158 51.0 6e-39 MNIYLESFGIFFKIGAFTIGGGYAMVPLIENEIVTKRKWIAQEDFIDLLAISQSAPGILA VNISIFIGYKLRGIRGSIVTALGTILPSFIIILAIALFFHSFKDNPIVERIFKGIRPAVV 
ALIAAPTFTMGRSAKINRYNLWIPVVSALLIWLLGFSPIWIIIAAGVGGFLWGKFKKVES EHPRL >gi|226332167|gb|ACIC01000153.1| GENE 11 15760 - 16308 614 182 aa, chain + ## HITS:1 COG:FN0713 KEGG:ns NR:ns ## COG: FN0713 COG2059 # Protein_GI_number: 19704048 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 1 182 1 173 176 116 45.0 2e-26 MIYLQLFYTFFKIGLFGFGGGYAMLSMIQGEVVTRYGWVTSQEFTDIVAISQMTPGPIGI NAATYVGFTSTGSVWGSIVATFAVVFPSFILMLTISKFFLKYQKHPVVESIFAGLRPAVV GLLASAALVLMNVENFGSPTEDTYTFVISIIIFLIAFIGTRKYKANPILMIIACGIAGLL LY >gi|226332167|gb|ACIC01000153.1| GENE 12 16415 - 17923 1571 502 aa, chain - ## HITS:1 COG:BS_yngK KEGG:ns NR:ns ## COG: BS_yngK COG1649 # Protein_GI_number: 16078889 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 8 501 11 510 510 381 40.0 1e-105 MNLKNYLFLLVLLLAAGVRAQAPSGNAYPKREFRAAWIQAVNGQFRGIPTEKLKQTLVGQ LNSLQGAGINAIIFQVRPEADALYASQHEPWSRFLTGTQGQMPSPMWDPMQFMIEECHKR NMEFHAWINPYRVKTSLKNQLAPTHIYHEHPEWFVTYGDQIYFDPALPESREHICKIVSD IVSRYDVDAIHMDDYFYPYPIKGVDFPDNASFARYGGGFSNKADWRRGNVNILIKKLHET IRGIKPWVKFGISPFGIYRNQKTDPLGSNTNGLQNYDDLYADVLLWAREGWIDYNIPQIY WEIGHKAADYETLVDWWAKHTENRPLFIGQAVMNTIQHADPKNPSINQLPRKMALQRSYQ TIGGSCQWYAAAVVENAGKYRDALVQEYHKYPALIPVFDFMDDKAPGKVRKMKKVWTEDG YILFWTAPKAETEMDKAIQYVVYRFGPKEKVNLDDPSHIVAITRNPFYKLPYETGKTKYR YVVTALDRIHNESKSVSKKVKL >gi|226332167|gb|ACIC01000153.1| GENE 13 17988 - 19922 1498 644 aa, chain - ## HITS:1 COG:alr1285_1 KEGG:ns NR:ns ## COG: alr1285_1 COG0642 # Protein_GI_number: 17228780 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 410 643 216 454 483 121 30.0 5e-27 MKNIRMVISLFILCFASTLAFSSTNKEQQLEVLSDSISKKIGQKDFIPFYQEYMRLARQQ NDTARIDDAYSLITSHYYRLRNTDSLKVVAYEYMDWCLKHGNVNNRYTQWRQYIQLLTEK GLQDEAMRETELLQKDADAAKSAFGMACSEMCIGYNHRVFSNNVKLCLEYYSSALKHFSD AGFYEDAYVVSLNIIQTHLARGEYANAIEYLNCMPGLIEKMKRKDIPVRPELTMRYYQFQ VIATLALKGKKEAQPFIAETDKFYHENINAFPREGWLGYKILCARTLNDNQMALNYLDSL MDYHHSVGSCYPANHLLKAQFMEQMGRFEDACEVYKSYAHINDSIRSAELDEQLSKYTVQ FEVDKLEHDKLELRAEVNRNLFITSVIVGCLVLILLIVITFYYMRSLSLNRKLDAANKAV IKASHMKSSFIQHITHEIRTPLNSIIGFSSLMAAGGLTQEEMEEYARQMESSNAYLLDLV NNVIDIADMDSQTDDIPKKPVSVDACCQECLGLIRQNLKEGIKLQYMPSSTPVEVCTVEI WMKRVLLGLLNNANKFTETGFIRLSYEEDKPNRLVRFIIEDSGPGIEEHFRDAIFERFAK ADTFTQGTGLGLSIIRQIMELVDGNVYLDTSYAGGARFVVEWPQ >gi|226332167|gb|ACIC01000153.1| GENE 14 20105 - 22876 2716 923 aa, chain + ## HITS:1 COG:MTH443 KEGG:ns NR:ns ## COG: MTH443 COG0178 # Protein_GI_number: 15678471 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Methanothermobacter thermautotrophicus # 7 923 11 948 962 851 48.0 0 MSENKHISIKGARVNNLKNIDVDIPRNKLVVITGLSGSGKSSLAFDTLYAEGQRRYVESL SSYARQFLGRMSKPECDFIKGIPPAIAIEQKVNSRNPRSTVGTSTEIYEYLRLLYARVGR TYSPVSGEEVKKHSTEDIVNCMLSHPEGTRYTVLTPIRLREDRTLQQQLEIDLKQGFNRV EVNGEMKRIDEYKPVAGDEVYLLVDRMAVASTKDAISRLTDSAETAMYEGDGTCMLRFYL PDGTSKLYSFSTKFEADGITFEEPNDQMFSFNSPIGACPVCEGFGKVIGIDEHLVVPNRS LSVYEGAIVCWRGEKMGEWKDELIHNADKFDFPIFTPYYELTDEQRRTLWEGNQYFHGIN DFFKMLEENQYKIQYRVMLARYRGKTICPKCHGTRLKPEAGYVRVDGKSISELVDLPITE LKEFFDHLNLNEHDSNVARRILIEINSRIRFLIDVGLGYLTLNRLSNSLSGGESQRINLA TSLGSSLVGSLYILDEPSIGLHSRDTDRLIRVLRQLQELGNTVVVVEHDEEIIRAADYII DIGPNAGRLGGEVVYQGDMKDLKKGSNSHTVRYLLGEEEIPVPEHRRPWNNYIELTGARE NNLKGVNVKFPLNVMTVVTGVSGSGKSTLVRDIFYRALKRELDECSDRPGEFSSISGSLR NLRNIEFVDQNPIGKSSRSNPVTYIKAYDEIRKLWAEQPLAKQMGYTAGHFSFNSEGGRC EECKGEGTITVEMQFMADLVLECESCHGKRFKADTLEVKFQDKNIYDVLEMTVNQAVEFF TEHGQKKIVKRLLPLQDVGLGYIKLGQSSSTLSGGENQRVKLAFYLSQEKADPTLFIFDE PTTGLHFHDIRKLLDAFDALIRRGHSIVIIEHNMDVIKCADYVIDLGPEGGDKGGNIVAV GTPEEVATCGASYTGQFLKEKLA Prediction of 
potential genes in microbial genomes Time: Thu May 12 03:26:18 2011 Seq name: gi|226332166|gb|ACIC01000154.1| Bacteroides sp. 1_1_6 cont1.154, whole genome shotgun sequence Length of sequence - 2295 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 58 - 543 506 ## BT_1740 hypothetical protein 2 1 Op 2 . - CDS 548 - 778 227 ## BT_1741 hypothetical protein 3 1 Op 3 . - CDS 775 - 2205 1367 ## COG1966 Carbon starvation protein, predicted membrane protein - Prom 2225 - 2284 1.9 Predicted protein(s) >gi|226332166|gb|ACIC01000154.1| GENE 1 58 - 543 506 161 aa, chain - ## HITS:1 COG:no KEGG:BT_1740 NR:ns ## KEGG: BT_1740 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 161 1 161 161 309 100.0 2e-83 MKKLALAICLLVVSIAAQAQFEKGKMILNPSVTGLDFSYSKNDKAKFGIGAQAGTFLADG IALMVNVGADWSKPVDEYTLGTGMRFYFNSTGVYLGGGLDWNRFRWSGGHHQTDWGLGIE AGYAYFLSRTVTIEPAVYYKWRFNDSDMSRFGVKVGFGFYF >gi|226332166|gb|ACIC01000154.1| GENE 2 548 - 778 227 76 aa, chain - ## HITS:1 COG:no KEGG:BT_1741 NR:ns ## KEGG: BT_1741 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 76 1 76 76 99 100.0 5e-20 MKKIKKSTGVAIAFLIYVSVTAAYLLPRNTEVGQTEKILTVVGSYVIVLLLWLVLRKKEQ MRERRKKDEQSIHLKK >gi|226332166|gb|ACIC01000154.1| GENE 3 775 - 2205 1367 476 aa, chain - ## HITS:1 COG:MA1905 KEGG:ns NR:ns ## COG: MA1905 COG1966 # Protein_GI_number: 20090754 # Func_class: T Signal transduction mechanisms # Function: Carbon starvation protein, predicted membrane protein # Organism: Methanosarcina acetivorans str.C2A # 1 448 1 447 479 419 52.0 1e-117 MITFTLCLLALIAGYFTYGRLMERVFGPDDRKTPALTKADGVDYIPLPTWKIFMIQFLNI AGLGPIFGAIMGAKFGTSSYLWIVLGSIFAGAVHDYFAGMLSLRHGGESLPEIIGRYLGL TTKQIMRGFTVLLMILVGSVFVAGPAGLLAKLTPESLDATFWIVVVFLYYILATLLPVDK IIGKIYPLFAVALLFMAVGILVMLYVNHPVLPELWDGLQNTNPDASELPIFPIMFVSIAC 
GAISGFHATQSPLMARCMTSERHGRPVFYGAMITEGIVALIWAAAATYFFHENGMEESNA SVIVDAITKDWLGAIGGVLAILGVIAAPITSGDTAFRSARLIVADFLGLEQKSMRRRLYI CIPMFAVAIGLLLYSLRDADGFNMIWRYFAWANQTLAVFTLWAITVYLVVAKKPYWITLI PALFMTSVCSTYICIAPEGLGLSHIVSYCIGIGCVVVAIAWFFIWLGKQKTRKLSE Prediction of potential genes in microbial genomes Time: Thu May 12 03:26:23 2011 Seq name: gi|226332165|gb|ACIC01000155.1| Bacteroides sp. 1_1_6 cont1.155, whole genome shotgun sequence Length of sequence - 3027 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 138 - 1661 1158 ## BT_1743 hypothetical protein - Prom 1784 - 1843 5.7 + Prom 1737 - 1796 6.3 2 2 Tu 1 . + CDS 1847 - 3026 899 ## COG1649 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|226332165|gb|ACIC01000155.1| GENE 1 138 - 1661 1158 507 aa, chain - ## HITS:1 COG:no KEGG:BT_1743 NR:ns ## KEGG: BT_1743 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 507 1 507 507 1026 100.0 0 MKKLLSVSLCLLFAASLTAQTGKIRVATVGNSITGGTNDYGYYAMPLAEMLGDDYEVTKF GKGSSGVFIKLREDATTPENPNEYQFAYINSEQCAAALEYKPNIVIIKFGANDANKKNFE KYGKETFKADYKKLIAKFQALSTKPDIYICTPPPMYYGDGGFLGSFDDNVAKNYIQPAVR EIAEELNLVCVDLYNPLKGHPEFMPGGNDWVHPDHRGHYIIAKEVYKAITGEFVMNPGGI TVAASDISLSGSKAKMKNGAIEKAKNGTRLSFDVHFGSLPTYEKVEVDVQLNKKKSGYLD FYLDTETIPFASVDVSGADSKNFTTQSALFDRRIKGKHKVTVQWRGQDAKLKSVTIKEKY MPYVTDNVSQVYLVNKATGMVLDCNPDSKVISAAKYDSEKKSQLFCIENLTYHILRVRNI ATNLHVMNNGDKVIVGKPGDDWRVHDPKYALFLTPTDDEGYYSLGLSPEARIGLSSANST EVVGNRSGQIEDLDKWKIVTADEMKKQ >gi|226332165|gb|ACIC01000155.1| GENE 2 1847 - 3026 899 393 aa, chain + ## HITS:1 COG:BS_yngK KEGG:ns NR:ns ## COG: BS_yngK COG1649 # Protein_GI_number: 16078889 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 6 393 20 420 510 223 33.0 5e-58 MHRTIFILSLLFLILPAKAQPKHEVRAAWITAVYGLDWPRTRATTPQGIRKQKEELVDIL 
DKLKAANFNTVLFQTRTRGDVLYPSSIEPFNSILTGKVGGNPGYDPLAFAIEECHKRGME CHAWMVTIPLGNKKHVASLGKQSVTKRVKDICVPYKNEYFLNPGHPATKEYLMRLVREVV ERYDIDGVHFDYLRYPENAPLFPDKYDFHRYSKGRTLDQWRRDNISEIVRYIYKGVKAMK PWVKVSTCPVGKYRDTSRYSSRGWNAFYTVYQDPQGWLGEGIQDQIYPMMYFQGNSFYPF ALDWQEQSNGRQVIPGLGIYFLHPDEGKWTRDEVDRQINFIRNQKMAGEGHYRVKYLMDN TQGIYDELIENFYAYPALQPPMTWLDNVPPSAP Prediction of potential genes in microbial genomes Time: Thu May 12 03:26:34 2011 Seq name: gi|226332164|gb|ACIC01000156.1| Bacteroides sp. 1_1_6 cont1.156, whole genome shotgun sequence Length of sequence - 14691 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 6, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 289 - 340 -0.9 1 1 Tu 1 . - CDS 553 - 1707 424 ## BT_1745 hypothetical protein - Prom 1825 - 1884 5.3 + Prom 1725 - 1784 7.3 2 2 Op 1 . + CDS 1914 - 3092 1013 ## COG1373 Predicted ATPase (AAA+ superfamily) 3 2 Op 2 . + CDS 3126 - 6683 3622 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit + Term 6705 - 6760 12.6 4 3 Tu 1 . + CDS 6808 - 6996 205 ## BT_1748 hypothetical protein + Term 7063 - 7089 -0.6 5 4 Op 1 14/0.000 - CDS 7007 - 7867 887 ## COG2113 ABC-type proline/glycine betaine transport systems, periplasmic components 6 4 Op 2 16/0.000 - CDS 7895 - 8725 955 ## COG4176 ABC-type proline/glycine betaine transport system, permease component 7 4 Op 3 . - CDS 8722 - 9948 1132 ## COG4175 ABC-type proline/glycine betaine transport system, ATPase component - Prom 10172 - 10231 7.0 8 5 Tu 1 . + CDS 10421 - 11707 574 ## BT_1752 hypothetical protein + Term 11766 - 11805 -0.2 9 6 Op 1 . - CDS 11670 - 12002 192 ## COG3695 Predicted methylated DNA-protein cysteine methyltransferase 10 6 Op 2 . 
- CDS 12030 - 14690 2299 ## COG1879 ABC-type sugar transport system, periplasmic component Predicted protein(s) >gi|226332164|gb|ACIC01000156.1| GENE 1 553 - 1707 424 384 aa, chain - ## HITS:1 COG:no KEGG:BT_1745 NR:ns ## KEGG: BT_1745 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 384 15 398 398 792 100.0 0 MALASGETNYMAPASARQMASDLALLPMWYAEAGSAVLAPSAYNADFLKTKSELLGMDVA LLTEPEVADGKDRKFSPWGWDPALRKRLMTLGADQTELPSADYMNILREHSHRLQAVKLL PGLRLNEYFCGESFYLSTLAECSAFVEGREACLLKAPLSGSGKGLNWCKGIFTTFISGWC ARVAASQGGVVGEPIYNKVEDFAMEFYADGRGRVVFAGYSVFHTGGSGMYAGNDLLSDEK ILQKLSAYVPQEEFIRLRTRLEEELSALFGGFYHGYLGVDMMICHFPDEAPVYRIHPCVE INLRMNMGVVAWLLTDRYLAADAEGVFRIDYYPLAGQALEEHRQMSASFPLSVENNRVCA GYLPLVPVTPQSRYRAFLLLTLPQ >gi|226332164|gb|ACIC01000156.1| GENE 2 1914 - 3092 1013 392 aa, chain + ## HITS:1 COG:MJ1637 KEGG:ns NR:ns ## COG: MJ1637 COG1373 # Protein_GI_number: 15669833 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Methanococcus jannaschii # 33 290 73 342 473 80 28.0 6e-15 MESFYRTHAYLVEHTNAPVRRDLMDEIDWSDRLIGIKGTRGVGKTTFLLQYAKEKFGTDR SCLFINMNNFYFSGHSLVDFANEFQKRGGKVLLIDQVFKHPEWSKELRMCYDRFPSLKIV FTGSSVMRLKEENLELRDIAKSYNLRGFSFREFLNLQTGMKFRAYSLEEILSTHEQIAKG VLSKVRPLDYFQDYLHHGFYPFFLEKRNFSENLLKTMNMMVEVDILLIKQIELKYLSKIK KLLYLLAVDGPKAPNVSQLASDVQTSRATVMNYIKYLADARLINLVYPKGEEFPKKPSKI MMHNSNLMYSIYPVKVEEQDVLDTFFVNTLWKDHKVHKGDKNVSFMVDEVMPFKICAEGT KIKNNPDVTYALQKAEIGRGNQIPLWMFGFLY >gi|226332164|gb|ACIC01000156.1| GENE 3 3126 - 6683 3622 1185 aa, chain + ## HITS:1 COG:FN1170_1 KEGG:ns NR:ns ## COG: FN1170_1 COG0674 # Protein_GI_number: 19704505 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Fusobacterium nucleatum # 5 405 3 403 410 590 72.0 1e-168 MTKQKKFITCDGNQAAAHISYMFSEVAAIYPITPSSTMAEYVDEWAAAGRKNIFGETVLV 
QEMQSEGGAAGAVHGSLQAGALTTTYTASQGLLLMIPNMYKIAGEFLPCVFHVSARTLAS HALCIFGDHQDVMSARQTGFAMLAEGSVQEVMDLAGVAHLATIKSRVPFMNFFDGFRTSH EIQKIEMLENDDLAPLIDQEALAEFRARALNPMKPVARGMAENPDHFFQHRESCNNYYEA VPAIVEEYMNEISKITGRKYGLFDYYGAEDAERVIIAMGSVTEAAREAIDHLVANGEKVG MVAVHLYRPFSAKHFLAAVPKTAKSIAVLDRTKEPGANGEPLYLDVKDCFYGAENAPVIV GGRYGLGSKDTTPAQIISVFENLAMPMPKNHFTIGIVDDVTFTSLPQKEEIALGGEGMFE AKFYGLGADGTVGANKNSVKIIGDNTDKHCQAYFSYDSKKSGGFTCSHLRFGDTPIRSTY LVNTPNFVACHVQAYLHMYDVTRGLRKNGSFLLNTIWEGEELAKNLPNKVKKYFAQNNIS VYYINATQIAMEIGLGNRTNTILQSAFFRITGVIPVDQAVEQMKKFIVKSYGKKGEDVVN KNYAAVDRGGEYKTLTVDPAWANLPDDAKAENNDPAFINEVVRPINAQDGDLLPVSAFKG IEDGTWYQGTSKYEKRGVAAFVPEWNAENCIQCNKCAYVCPHASIRPFVLDAEEQKGAKF EQLKAVGKAFDGMTFRIQVDVLDCLGCGNCADICPGNPKKGGKALTMKHLESQLAEADNW TYCAENVKSKQHLVDIKANVKNSQFATPLFEFSGACSGCGETPYVKLISQLFGDREMVAN ATGCSSIYSGSVPSTPYTMNEKGHGPAWANSLFEDFCEFGLGMELANEKMRARLVKVMNE AIASDCCPAEYKEVFAEWINNMLDAEKTKELAEKIIPMVEAAKDKCNHCKQIADLQQYLV KRSQWIIGGDGASYDIGYGGLDHVIASGKDVNILVLDTEVYSNTGGQSSKATPVGAIAKF AAAGKRVRKKDLGLMATTYGYVYVAQIAMGADQAQTLKAIREAEAYPGPSLIIAYAPCIN HGLKAGMGKSQEEEEKAVKCGYWHLWRYNPALEEEGKNPFQLDSKEPNWEDFQGFLKGEV RYASVMKQYPAEAEELFKAAEENAKWRYNSYKRLARENWGADTAE >gi|226332164|gb|ACIC01000156.1| GENE 4 6808 - 6996 205 62 aa, chain + ## HITS:1 COG:no KEGG:BT_1748 NR:ns ## KEGG: BT_1748 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 62 1 62 62 84 98.0 2e-15 MIGFVIWIIGLILTFRAAIEIWNINAAVEKRLIAIVLIVLTSWLGLLFYYFYGKERMAGW LK >gi|226332164|gb|ACIC01000156.1| GENE 5 7007 - 7867 887 286 aa, chain - ## HITS:1 COG:MA2147 KEGG:ns NR:ns ## COG: MA2147 COG2113 # Protein_GI_number: 20090990 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport systems, periplasmic components # Organism: Methanosarcina acetivorans str.C2A # 28 285 55 312 315 216 42.0 3e-56 MNKIKMIAAMFIAGILLTFVSCGDNGKSKKISIAYANWSEGIAMTNLAKAIFEDQGYDVK LLNADLAPIFTSISRKKADVFMDVWMPVTMEDYMKQYGDKLEVIGDIYDGARIGLVVPDY 
VTINSIEELNAEKERFSGQIVGIDAGAGIMKATDQAIKDYGLDYKLMTSSGPAMTASLKK AIDKKDWIVVTGWTPHWMFDRFKLKVLQDPKLIYGNTESIHTIAWKGFSEKDPFAAELLG NIRLTDAEISSLMTALEEAQTTERQAARQWMEEHQELVNSWIPEKD >gi|226332164|gb|ACIC01000156.1| GENE 6 7895 - 8725 955 276 aa, chain - ## HITS:1 COG:BMEII0549 KEGG:ns NR:ns ## COG: BMEII0549 COG4176 # Protein_GI_number: 17988894 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport system, permease component # Organism: Brucella melitensis # 3 265 6 268 301 270 57.0 2e-72 MINIGQYIETAINWLTEHFASFFDALSMGIGGFIDGFQHVLFGIPFYITIAVLAALAWFK SGKGTAVFTLLGLLLIYGMGFWEETMQTLALVLSSTCLALLLGVPLGIWTANSDRCNKIM RPVLDFMQTMPAFVYLIPAVLFFGLGTVPGAFATIIFAMPPVVRLTGLGIRQVPKNVVEA SRSFGATPWQLLYKVQLPLALPTILTGINQTIMMSLSMVVIAAMISAGGLGEIVLKGITQ MKIGLGFEGGIAVVILAIVLDRITQGMADNRKSKKK >gi|226332164|gb|ACIC01000156.1| GENE 7 8722 - 9948 1132 408 aa, chain - ## HITS:1 COG:lin1013 KEGG:ns NR:ns ## COG: lin1013 COG4175 # Protein_GI_number: 16800082 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport system, ATPase component # Organism: Listeria innocua # 1 394 1 392 397 391 53.0 1e-108 MSKIEIKDLYLVFGNEKQKALKMLKEGKTKSEILKATGCTVAVKDANLSINEGEIFVIMG LSGSGKSTLLRCINRLIRPTSGEVIINGTDIAKVSDKELLQIRRKELAMVFQNFGLLPHR SVLHNIAFGLELQGVKKGEREQKAMESMQLVGLKGYENQMVSELSGGMQQRVGLARALAN NPEVLLMDEAFSALDPLIRVQMQDELLTLQSKMKKTIVFITHDLSEAIKLGDRIAIMKDG EIVQIGTSEEILTEPANAYVERFVENVDRSKIITASSVMVDKPIVARLKKEGPEVLIRKM RERNLTVLPVVDSNNLLVGEVRLNDLLKLRKEQIRSIESVVRHEVHSVLGDTVLEDILPL MTKTNSPIWVVNENREFEGVVPLSSLIIEVTGKDKEEINEIIQNAIEL >gi|226332164|gb|ACIC01000156.1| GENE 8 10421 - 11707 574 428 aa, chain + ## HITS:1 COG:no KEGG:BT_1752 NR:ns ## KEGG: BT_1752 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 45 428 1 384 384 648 99.0 0 MFNRAITIFLGKDNCHYTTKEKQTGTATLDDKSQVQFLSKKNPKMPFFYSRRNLYMIGFL LAFIVTLLEFIRGRHQNFMVFSDATQFFWQGISPYTTEFVEAHGRYFLYSPVFTVLFAPF 
AYLPAWLGPFAWNLFNYSLFFMAIFTLPQQFTQQQKCRMFLFLLLILGQSLLSFQYNITV AYLFLFAYTLLEKDRGFLAVLLIMISGCTKVYGIFELALLLCYPHVWRNFGYAVTMGIVL LALPLIKIAPADLLPYYEEWCHSLAVHQSAGAYDSFFYARPIAAWTLPHFRALQIGMLGL LTLLFLGNFRKWSSFAFRAQALGILMGWVVLLSDSAEKHTYIIALAGFMLWYWSRPTRTA TDKILFWCCFVLLCIVPIDIFVPIPIRDFITRTLWLHVWVFFIVWIRMIWLTFLASFIHP RATNVLSD >gi|226332164|gb|ACIC01000156.1| GENE 9 11670 - 12002 192 110 aa, chain - ## HITS:1 COG:lin0580 KEGG:ns NR:ns ## COG: lin0580 COG3695 # Protein_GI_number: 16799655 # Func_class: L Replication, recombination and repair # Function: Predicted methylated DNA-protein cysteine methyltransferase # Organism: Listeria innocua # 15 107 6 98 98 111 54.0 3e-25 MKDYKVDKESFSESFRKEVYEVVSQIPVGKVSTYGGIALLLGAPQCSRMVGRALKEVPDE LSIPCHRVVNSTGRLVPGWVEQKQLLLAEEVSFKANGCVNLKEHLWHVDE >gi|226332164|gb|ACIC01000156.1| GENE 10 12030 - 14690 2299 886 aa, chain - ## HITS:1 COG:SMb20671 KEGG:ns NR:ns ## COG: SMb20671 COG1879 # Protein_GI_number: 16265126 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 8 272 46 313 322 188 41.0 4e-47 GVAQCSDDSWRHKMNDEILREAMFYNGVSVEIRSAGDDNSKQAEDVHYFMDEGVDLLIIS ANEAAPMTPIVEEAYQKGIPVILVDRKILSDKYTAYIGADNYEIGRSVGNYIASSLKGKG NIVELTGLSGSTPAMERHQGFMAAISKFPDIKLIDKADAAWERGPAEIEMDSMLRRHPKI DAVYAHNDRIAPGAYQAAKMAGREKEMIFVGIDALPGKGNGLELVLDSVLDATFIYPTNG DKVLQLAMDILEKKPYPKETVMNTAVVDRTNAHVMQLQTTHISELDKKIETLNGRIGGYL SQVATQQVVLYGSLIILLLVAGLLLVVYKSLRSKNRLNKELFKQKQQLEEQRDKLEEQRD QLIQLSHQLEEATHAKLVFFTNISHDFRTPLTLVADPVEHLLADKTLSGDQHRMLMLIQR NVNILLRLVNQILDFRKYENGKMEYTPVTVDVLSSFEGWNESFQAAARKKHIHFSFDSMP DTDYHTLADMEKLERIYFNLLSNAFKFTPENGKIAVRLSSLSKEDKRWIRFTVANTGSMI SAEHIRNVFDRFYKIDMHHTGSGIGLALVKAFVEMHGGMISVESDEKQGTVFTVELPVQS CEAVAAEPDTTLVSADSRTTDVLLAEEEELEKGYDSSKPSVLIIDDNEDIRSYVHTLLHT DYTVIEAADGSEGIRKAMKYVPDLIISDVMMPGIDGIECCRRLKSELQTCHIPVILLTAC SLDEQRIQGYDGGADSYISKPFSSQLLLARVRNLIDSHRRLKQFFGDGQTLAKEDVCDMD KDFVERFKSLIEEKMGDSGLNVEDLGKDMGLSRVQLYRKIKSLTNYSPNELLRIARLKKA 
ASLLASSDMTVAEIGYEVGFSSPSYFAKCYKEQFGESPTDFLKRKG Prediction of potential genes in microbial genomes Time: Thu May 12 03:26:47 2011 Seq name: gi|226332163|gb|ACIC01000157.1| Bacteroides sp. 1_1_6 cont1.157, whole genome shotgun sequence Length of sequence - 1829 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 380 - 411 1.8 1 1 Tu 1 . - CDS 443 - 1669 899 ## COG2849 Uncharacterized protein conserved in bacteria - Prom 1736 - 1795 4.0 Predicted protein(s) >gi|226332163|gb|ACIC01000157.1| GENE 1 443 - 1669 899 408 aa, chain - ## HITS:1 COG:FN2119 KEGG:ns NR:ns ## COG: FN2119 COG2849 # Protein_GI_number: 19705409 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 79 335 80 333 338 83 29.0 9e-16 MFFFLIVPLFLALFYKPIAYLCGRFLNNKSKRENYFMKHISNFLRSKSWNVFLFLYLALP LFAQKEYKIDQVSVVNVGDGRLLFQELKTEKALNGEHRIIDGYHSAYVLASFKDGFYDGG YKEYVDNILITEGSYKEGRKDGLFKINSKFDGKLKEEKSYKEGKLDGTSKSYFTTGKVES ERNFRMGKEHGKQLSYESDGTLRKEHNYKDGKQVGKQYTFLKGTYDLYETVYFNEEGFQD GEYSSVFTFGSPRILGSYKNGKKNGRWTTFAESGDTLMIETYLDGKEDGLHVSFANGSGV RQKEYYMKNDRKDGLYREYNLENGELRYEATYQAGRLHGKERRLIVSNRFDYWETSTYVN GRQNGPFEARYVKNDQLRECGEYKNGHRVGRWKRYTIDGKLEKEWDEN Prediction of potential genes in microbial genomes Time: Thu May 12 03:26:51 2011 Seq name: gi|226332162|gb|ACIC01000158.1| Bacteroides sp. 1_1_6 cont1.158, whole genome shotgun sequence Length of sequence - 12472 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 3, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.000 - CDS 22 - 909 1119 ## COG0524 Sugar kinases, ribokinase family 2 1 Op 2 . - CDS 944 - 2113 1184 ## COG0738 Fucose permease - Prom 2318 - 2377 4.9 - Term 2224 - 2276 -0.9 3 2 Tu 1 . 
- CDS 2393 - 4225 1667 ## COG1621 Beta-fructosidases (levanase/invertase) - Prom 4283 - 4342 3.0 - Term 4264 - 4304 5.7 4 3 Op 1 . - CDS 4355 - 5926 1545 ## BT_1760 glycosylhydrolase 5 3 Op 2 . - CDS 5942 - 7327 1367 ## BT_1761 hypothetical protein 6 3 Op 3 . - CDS 7354 - 9066 1868 ## BT_1762 hypothetical protein 7 3 Op 4 . - CDS 9094 - 12219 3104 ## BT_1763 hypothetical protein - Prom 12257 - 12316 4.6 Predicted protein(s) >gi|226332162|gb|ACIC01000158.1| GENE 1 22 - 909 1119 295 aa, chain - ## HITS:1 COG:MA1840 KEGG:ns NR:ns ## COG: MA1840 COG0524 # Protein_GI_number: 20090690 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Methanosarcina acetivorans str.C2A # 9 294 35 322 326 189 39.0 4e-48 MNNIIVGMGEALWDVLPEGKKIGGAPANFAYHVSQFGFDSRVVSAVGNDELGDEIMEVFK EKQLKNQIERVDYPTGTVQVTLDDEGVPCYEIKEGVAWDNIPFTDELKRLALNTRAVCFG SLAQRNEVSRATINRFLDTMPDIDGQLKIFDINLRQDFYTKEVLRESFKRCNILKINDEE LVTISRMFGYPGIDLQDKCWILLAKYNLKMLILTCGINGSYVFTPGVVSFQETPKVPVAD TVGAGDSFTAAFCASILNGKSVPEAHKLAVEVSAYVCTQSGAMPELPVILKDRLL >gi|226332162|gb|ACIC01000158.1| GENE 2 944 - 2113 1184 389 aa, chain - ## HITS:1 COG:NMB0535 KEGG:ns NR:ns ## COG: NMB0535 COG0738 # Protein_GI_number: 15676441 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Neisseria meningitidis MC58 # 4 383 24 417 426 92 26.0 2e-18 MENTKNASLSKLVPVMLCFFAMGFVDLVGIASNYVKADLGLTDSQANIFPSLVFFWFLIF SVPTGMLMSRIGQKKTVLLSLIVTFASLLLPVFGDSYALMLISFSLLGIGNALMQTSLNP LLSNIVRGDRLASSLTFGQFVKAIASFLAPYIAMWGATQAIPSFDLGWRVLFPIYMVIAI LAILLLNATQIEEEKEEGKPSTFGQCLALLGKPFILLCFIGIMCHVGIDVGTNTTAPKIL MERIGMTLDDAAFATSLYFIFRTAGCFLGSFILRQMSPKSFFGISVVMMLAAMVGLFIFH DKAVIYACIALIGFGNSNVFSVIFSQALLYLPGKKNEVSGLMIMGLFGGTVFPLAMGVAS DTSMGQNGAIAVMTVGVLYLLFYTFRIKK >gi|226332162|gb|ACIC01000158.1| GENE 3 2393 - 4225 1667 610 aa, chain - ## HITS:1 COG:BS_sacC KEGG:ns NR:ns ## COG: BS_sacC COG1621 # Protein_GI_number: 16079757 # Func_class: G Carbohydrate transport and metabolism # 
Function: Beta-fructosidases (levanase/invertase) # Organism: Bacillus subtilis # 143 610 32 511 677 470 50.0 1e-132 MNVFLPVKQLQTFLICFILMTISVSTRAADSPLLIKNLGEGHCLVRVNTSQNYLLLPVED ASPDVRISMIVNNKEVKNFDVRLAIHKVDYFVPVDLSDFSGKTVSFKFKMNSNDPIRVNL SPDNTACCKEMKLSDTFDTTNREKFRPTYHFSPLYGWMNDPNGMVYKDGEYHLFYQYNPY GSKWGNMNWGHAISKDLVNWEHRPVAIAPDALGTIFSGSAVVDHNNTAGFGAGAIIAIYT QNSDRQVQSIAYSTDNGRTFTKYENNPVLVSEARDFRDPKVFWYEATKRWIMVLAVGQEM QIFSSPNLKDWAFESSFGEGYGAHGNVWECPDLFELPVEGTNEKKWVLLCSLGDGPFGDS ATQYFIGSFDGKKFSCDNQPNVTKWMDWGKDHYATVTWSDAPDNRRIAIAWMSNWQYAND VPTSQYRSPNSIPRDLSLFAIDGGIYLQSAPSPELLKLRGVSKKRSFKVNGTRIVKDLIP NNEGAYEIELSLKNQQAEIIGFRLYNDKGEEVDMQYDMKEKKFSMDRRKSGEVNFNENFP MLTWTAIEEDKDEMKLRLFVDRSSVEAFGDGGRFAMTNQVFPSEPYNHISFYSKGGAYKV DSFVVYKLKK >gi|226332162|gb|ACIC01000158.1| GENE 4 4355 - 5926 1545 523 aa, chain - ## HITS:1 COG:no KEGG:BT_1760 NR:ns ## KEGG: BT_1760 # Name: not_defined # Def: glycosylhydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 523 1 523 523 1060 100.0 0 MMKNMILPIAFTALIASMTACSDETDPILTQKNWDGTATYFQSSDEHGFSMYYKPQVGFV GDPMPFYDPVAKDFKVMYLQDYRPNPEATYHPIFGVATKDGATYESLGELISCGGRDEQD AAIGTGGTIYNPADKLYYTFYTGNKFKPSSDQNAQVVMVATSPDFKTWTKNRTFYLKGDT YGYDKNDFRDPFLFQTEDGVYHMLIATRKNGKGHIAEFTSADLKEWESAGTFMTMMWDRF YECPDVFKMGDWWYLIYSEQASFMRKVQYFKGRTLEDLKATTANDAGIWPDNREGMLDSR AFYAGKTASDGTNRYIWGWCPTRAGNDNGNVGDVEPEWAGNLVAQRLIQHEDGTLTLGVP DAIDRKYTSAQEVKVMAKDGNMIESGKTYTLGEGASVIFNRLKVHNKISFTVKTASNTDR FGISFVRGTDSASWYSIHVNADEGKANFEKDGDDAKYLFDNKFNIPADNEYRVTIYSDQS VCVTYINDQLSFTNRIYQMQKNPWSLCCYKGEITVSDVQVSTY >gi|226332162|gb|ACIC01000158.1| GENE 5 5942 - 7327 1367 461 aa, chain - ## HITS:1 COG:no KEGG:BT_1761 NR:ns ## KEGG: BT_1761 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 461 1 461 461 904 100.0 0 MKSIIKQLYTILLVTVACLTVTGCSDDFKSGLRLDGDVWVNSIRLDEYAGTVDYQNKAIV VGVPYDYDITRMVVTEMNLSEGAKASIAIGETIDFSLPVSLTVKNGDVQMSYTITVKRDE AKILTFKLNDTYVGKVDQLSKTISVVVPLTVDITQLKGTFTVTDGATVTPASGSIQDFTN 
PVTYTATYRSAVTPYVVTVTQGNVIPTAFVGTASSVSLLTSPEEKAAAQWMMDNVSMSEY ISFKDVVDGKVDLGKYTAIWWHFHADNGDNPPLPDDAKAAAEKFKVYYQNGGNLLLTRYA TFYIANLGIAKDERVPNNSWGGNEDSPEITSAPWSFLITGSESHPLFQDLRWKDGDKSTV YTCDAGYAITNSTAQWHIGTDWGGYDDLNAWRNLTGGIDLAHGGDGAVVIAEFEPRSNSG RTLCIGSGCYDWYGKGVDASADYYHYNVEQMTLNAINYLCK >gi|226332162|gb|ACIC01000158.1| GENE 6 7354 - 9066 1868 570 aa, chain - ## HITS:1 COG:no KEGG:BT_1762 NR:ns ## KEGG: BT_1762 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 570 1 570 570 1134 100.0 0 MKKIIYIATIGITLLTTSCDDFLDRQVPQGIVTGDQIASPEYVDNLVISAYAIWATGDDI NSSFSLWNYDVRSDDCYKGGSGTEDGGVFNALEISKGINTTDWNINDIWKRLYQCITRAN TALQSLDQMDEKTYPLKNQRIAEMRFLRGHAHFMLKQLFKKIVIVNDENMEPDAYNELSN TTYTNDEQWQKIADDFQFAYDNLPEVQIEKGRPAQAAAAAYLAKTYLYKAYRQDGADNAL TGINEEDLKQVVKYTDPLIMAKGGYGLETDYSMNFLPQYENGAESVWAIQYSINDGTYNG NLNWGMGLTTPQILGCCDFHKPSQNLVNAFKTDSQGKPLFSTYDNENYEVATDNVDPRLF HTVGMPGFPYKYNEGYIIQKNDDWSRSKGLYGYYVSLKENVDPDCDCLKKGSYWASSLNH IVIRYADVLLMRAEALIQLNDGRITDAISLINEVRSRAAGSTMLIFNYKEDYGVNFKVTP YDLKAYAQDEAMKMLKWERRVEFGMESSRFFDLVRWGEAKDVINAYYVTEASRCSIYKNA GFTENKNEYLPVPFEQISASNGNYTQNFGW >gi|226332162|gb|ACIC01000158.1| GENE 7 9094 - 12219 3104 1041 aa, chain - ## HITS:1 COG:no KEGG:BT_1763 NR:ns ## KEGG: BT_1763 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1041 1 1041 1041 2026 99.0 0 MPGIMKNKKLLCSVCFLFAFMSALWGQNITVKGNVTSKTDGQPIIGASVVETTATTNGTI TDFDGNFTLSVPVNSTLKITYIGYKPVTVKAAAIVNVLLEEDTQKVDEVVVTGYTTQRKA DLTGAVSVVKVDEIQKQGENNPVKALQGRVPGMNITADGNPSGSATVRIRGIGTLNNNDP LYIIDGVPTKAGMHELNGNDIESIQVLKDAASASIYGSRAANGVIIITTKQGKKGQIKIN FDASVSASMYQSKMNVLNTEQYGRAMWQAYVNDGENPNGNALGYAYNWGYNADGNPVLYG MTLSKYLDSKNTMPVADTDWFDEITRTGVIQQYNLSVSNGSEKGSSFFSLGYYKNLGVIK DTDFDRFSARMNSDYKLIDDILTIGQHFTLNRTSEVQAPGGIIETALDIPSAIPVYASDG SWGGPVGGWPDRRNPRAVLEYNKDNRYTYWRMFGDAYVNLTPFKGFNLRSTFGLDYANKQ ARYFTYPYQEGTQTNNGKSAVEAKQEHWTKWMWNAIATYQLEVGKHRGDVMIGMELNRED DSHFSGYKEDFSILTPDYMWPDAGSGTAQAYGAGEGYSLVSFFGKMNYSYADRYLLSLTL 
RRDGSSRFGKNHRYATFPSVSLGWRITQENFMKELTWLDDLKLRASWGQTGNQEISNLAR YTIYAPNYGTTDSFGGQSYGTAYDITGSNGGGVLPSGFKRNQIGNDNIKWETTTQTNVGI DFSLFKQSLYGSLEYYYKKTTDILTEMAGVGVLGEGGSRWINSGAMKNQGFEFNLGYRNK TAFGLTYDLNGNISTYRNEILELPETVAANGKFGGNGVKSVVGHTYGAQVGYIADGIFKS QDEVDNHATQEGAAVGRIRYRDIDHNGVIDERDQNWIYDPTPSFSYGLNIYLEYKNFDLT MFWQGVQGVDIISDVKKKSDFWSASNVGFLNKGTRLLNAWSPTNPNSDIPALTRSDTNNE QRVSTYFVENGSFLKLRNIQLGYTVPAVISKKMRMDRLRFYCSAQNLLTIKSKNFTGEDP ENPNFSYPIPVNITFGLNIGF Prediction of potential genes in microbial genomes Time: Thu May 12 03:27:26 2011 Seq name: gi|226332161|gb|ACIC01000159.1| Bacteroides sp. 1_1_6 cont1.159, whole genome shotgun sequence Length of sequence - 14834 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 6, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 51 - 923 602 ## COG2017 Galactose mutarotase and related enzymes - Term 954 - 996 3.1 2 2 Tu 1 . - CDS 1128 - 3008 1530 ## COG1621 Beta-fructosidases (levanase/invertase) - Prom 3099 - 3158 4.8 3 3 Tu 1 . - CDS 3638 - 5152 1127 ## BT_1766 putative ribonucleoprotein-related protein - Prom 5221 - 5280 3.1 + Prom 5110 - 5169 6.2 4 4 Tu 1 . + CDS 5267 - 6289 991 ## BT_1767 hypothetical protein + Prom 6309 - 6368 5.1 5 5 Op 1 . + CDS 6465 - 9722 2844 ## COG3250 Beta-galactosidase/beta-glucuronidase 6 5 Op 2 . + CDS 9740 - 11995 2035 ## COG3537 Putative alpha-1,2-mannosidase + Term 12017 - 12088 15.8 + Prom 12084 - 12143 3.7 7 6 Tu 1 . 
+ CDS 12212 - 14779 1794 ## BT_1770 hypothetical protein Predicted protein(s) >gi|226332161|gb|ACIC01000159.1| GENE 1 51 - 923 602 290 aa, chain + ## HITS:1 COG:lin1322 KEGG:ns NR:ns ## COG: lin1322 COG2017 # Protein_GI_number: 16800390 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Listeria innocua # 1 288 1 289 290 197 39.0 2e-50 MKTISNEQLTIQVSPHGAELCSIFANGKEYLWQADPAFWKRHSPVLFPIVGSVWENQYRN EGIAYTLSQHGFARDMEFTLVSEKEDEVRYHLVSNEETLRKFPFPFCLEIGYRIQGKQIE VIWEVKNTGEKEMYFQIGAHPAFYWPDFNADTQNRGFFGFDKEEGLKYILISEKGCADPS TEYSLELTDGLLPLDTHTFDKDALIIENEQIHKVTLYTKEKKAYLSLHFHAPVVGLWSPP AKNAPFVCIEPWYGRCDRAHYTGEYKDKDWMQHLQPEETFRGGYTIEIDE >gi|226332161|gb|ACIC01000159.1| GENE 2 1128 - 3008 1530 626 aa, chain - ## HITS:1 COG:BS_sacC KEGG:ns NR:ns ## COG: BS_sacC COG1621 # Protein_GI_number: 16079757 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Bacillus subtilis # 123 624 22 509 677 386 42.0 1e-107 MKTTPWIKLCKGAVLALTVSFGLTYCQSTKSTFTLEQKGDSLTIVHITHPTNYILLPIEE EADESQVRLDTGNAADTDMDIRLAQTKVDYFVPFALPADTKVATLRIQKKSKDALCWEEI KLSDTFDTTNTDKFRPVYHHTPLYGWMNDANGLVYKDGEYHLYYQYNPYGSKWGNMHWGH SVSKDLMHWEHLAPAIARDTLGHIFSGSSIVDQENVAGYGAGSILAYYTSASDKNGQIQC LAYSKDNGRTFTKYEKNPVLRPSDGLKDFRDPKVFWYAPESKWVMIVSADKEMRFYDSHN LKDWNYLSSFGEGYGVQPCQFECPDMVELPVDGDINYKKWALIVNVNPGCYFGGSATQYF IGDFDGTKFICDNQPNVTKWLDWGKDHYATVCFSNTGDRVVAVPWMSNWQYCNIVPTRQF RSANALPRELGLYTQDNDIYLSAAPVAETKNLRKESKDVPSFTVDKDYHIESLLTDNEGA YELSLNIEAGKAEIMGFSLFNDKGEKVDIYFNLPEKKLVMDRTKSGIVDFGKNSSPHEIE AHDRRKTTSINYIDDFALATWAPIQKENEYKLDVFVDKCSVEIFLDGGKIAMTNLIFPTE PYNRMCFYSKGGTFAVDSFSVYRLGL >gi|226332161|gb|ACIC01000159.1| GENE 3 3638 - 5152 1127 504 aa, chain - ## HITS:1 COG:no KEGG:BT_1766 NR:ns ## KEGG: BT_1766 # Name: not_defined # Def: putative ribonucleoprotein-related protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 504 1 495 495 995 99.0 0 
MKFNWINKGMNKVLNKESAPAYRLTPEWELYTSVVTTSLNNSFYEQEEERIERVRTLIGK CNPLFVAQLAAYARETMNLRSIPLVMAVELARIHQGDNLVKRVTARTVRRADEITELLAC YQQANRRTGTKKLNRLSKQLQAGLQDAFNTFDAYQFAKYNRSAEVCLKDALFLVHPKAKD ERQQEIFNGIVNDNLPVPYTWETELSALGQRTFATEEERRKAFRAKWEELIDSGKLGYMA LLRNLRNILEAEVSADHILTVGKRLSSEKAVENSRQLPFRFLAAYRELSKTPSLYATMLM TALERAVQVSARNITGFDESTRVLIACDVSGSMQCPVSAKSKVLYYDIGLLLGMLLKSRC KQVMTGIFGDRWKIINLPDTGILSNVDAFYKREGEVGYSTNGYLVIKDLIDRKAQMDKIM MFTDCQLWNSHSDLQITDLWRKYKKICPAAKLYLFDLSGYGNTPLDITRDDVFLITGWSD KIFDILSAIDKGNDALQEIKKIVV >gi|226332161|gb|ACIC01000159.1| GENE 4 5267 - 6289 991 340 aa, chain + ## HITS:1 COG:no KEGG:BT_1767 NR:ns ## KEGG: BT_1767 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 340 1 340 340 662 99.0 0 MPVNRNALIRYKTIDNCLRNPYRRWTLEDLVDACSDALYEYEGIDKGISKRAVQMDIQMM RSEKLGYNAPIVVYENKYYKYEDPEYSITQTPLNEQDLKTMSEAVEVLRQFKGFSYFTEM SEIVNRLEDHVASARMKTTPVIDFEKNESLKGLDYLDTIYHAIVNEHPLQLKYRSFKARS ANSFIFYPYLLKEYRNRWFVFGVRKSGRILQNLALDRIHSLEVLSREPFIKNTFFDPNTF FDDLVGVTKNSGSKAEKVGFKVAANEAPYILTKPIHHSQRTIETLEDGSVILEIEVVINH ELERVFFGYANGIQVLYPQTLVELIEKKLRRAVEQYEKPK >gi|226332161|gb|ACIC01000159.1| GENE 5 6465 - 9722 2844 1085 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 54 496 35 444 1087 124 27.0 1e-27 MNMKRSLILLCGLLTGFTLQSTKVNAQTQTVANDQKLVYPYTFAPSEGLVNQTEKEHRQE ICINGYWDFQPVRLPKEYVQGKGIAPELPLPENDSEWSKTRIKIPSPWNINAFAYRDLEG PDHRNYPSYPKEWEQVKMAWMKKNVTIPADWSGQQIKLHFEAVAGYSEVYVNKEKVGENF DLFLPFSFDITDKVTAGENIEILVGVRSQSLFENNATIGRRIVPGGSMWGYQINGIWQDV YLLALPKIHLEDVYVKPLVSKNTLEIEVTLQNRTEKKTDVQLQGNICEWVNCAGTDVNSA PVPNWTLGNEALKVAPTKLSLPAKASQKVTLQVPVDKDILKYWTPEHPNLYALLLSVNNQ KQTVDTKYERFGWREWTLQGTTQYLNGEPYALHGDSWHFMGIPQMTRRYAWAWFTAIKGM NANAVRPHAQVYPRFYLDMADEMGICVLNETANWASDGGPKLDSDLFWEASKEHLKRFVL RDRNHASVFGWSISNENKPVILHVYNRPELMPVQKKAWEEWRDIVHQYDPTRPWISADGE 
DDGDGILPVTVGHYGDINSMKRWIEIGKPWGIGEHSMAYYGTPEQVAKYNGERAYESQEG RMEGLANECYNLIVNQRQMGASYSTVFNMAWYALKPLPLGKKDKSKAPTVEADGVFFGDY KEGVPGVQPERVGPYCTTFNPGYDPTLPLYQEWPMYSALRAANAPGQPAWSPYAVIDKQQ YDHRKTTTPSKKYKEVIFIGSQNSKLKQLMDAQGVIFAKKANTPSLLLYIVDGSQKLDAS TKDEMQKQIAKGADVWIWGLTPETVAAYNEILPLPVSLDNLQRSSFLPVQKSWMHGLNNS DFYFCELQKTNASSYSLKGAFVEEGEVLLNACKTDWRKWNKRPEEIKTAGTIRSEYECTA ATPVFVKCQQNASTFYLNTLTEFANSEKGFNTLSAILKNAGIAFQKPEVNIDEVFFLRDE QINFPVATKEKLVKDSDGEWTLELYVFSPRPLDDLLIEPNMPKLSLFVKAKKRALLINDK PYTAVSHDGRNEIIYKELPLLQGWNKLVIKLGAGDRNDFTGYFKCDNKKDFLPLLKAAFV NPETK >gi|226332161|gb|ACIC01000159.1| GENE 6 9740 - 11995 2035 751 aa, chain + ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 92 748 134 780 790 266 27.0 9e-71 MKLTHILASAAIISTLSACNGSLQTADRTPVDYVNPYIGNISHLLVPTFPTIQLPNSMLR VYPERADYTSELLKGLPLIVTNHRERSAFNFSPYQGEKLRPVITYNYDNEHITPYSFDVE LDDNRMKAEYALSHQSAIYRITYEADKPAYLIVNSRNGSIHANENFISGRQQLNDNTNVY VYIEAQEKPISAGILENGTIETSKDNAEGANACAAWRFADGTTTVNLRYGISFISEEQAE KNLHRELKDYNIKALAEAGRQIWNETLGRIQVEGGTEDDKTVFYSSFYRTFERPICMSEG GRYFSAFDGKVHEDNGTPFYTDDWIWDTYRAAHPLRTLIDQQKEEDIIASYLRMAEQMGN MWMPTFPEVTGDTRRMNSNHAVATVADALAKGLKVDTAKAYEACRKGIEEKTLAPWSGAP AGWLDNFYRENGYIPALRVDEPENDPNVHPFEKRQPVAVTLGTSYDQWCLSRIAQALNKK EEAEYYLKCSYNYRNLYNKETAFFHPKDKEGQWIEPFDYRFPGGMGAREYYGENNGWVYR WDVPHNVADLISLMGGNEQFIANLDRTFTEPLGRSKYAFYAKLPDHTGNVGQFSMANEPS LHVPYLYNYAGQPWKTQKRIRQMLKTWFRNDLMGIPGDEDGGGMTSFVVFSSLGFYPVTP GLPAYTIGSPLFTDAKITLSNGAVFEIEAKNASDDNKYIQSATLNGKEWNKSWFSHDDLM SGGKLVLVMGNKPNKTWASGAEDVPPSLEIK >gi|226332161|gb|ACIC01000159.1| GENE 7 12212 - 14779 1794 855 aa, chain + ## HITS:1 COG:no KEGG:BT_1770 NR:ns ## KEGG: BT_1770 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 855 1 855 855 1639 99.0 0 MKYQLQLLIFILLCLAGRLDASPLYDYGLYLKSHAVPAPERSTLYLDDNQPFSVKNDLTI 
SFQIYIRANEADYGSILHLKTDKGQIIRFSFVAGEQNHAPALMLNDEIIIIDKPIELEKW INVSLNLRQKDNVIEIEYDKKKMSSTFPLQETNSVTITFGQMLGYQAEVAPVNLRDINII QDGKLTREWKLWKHNDNLCYDEKEGAVARAVQPLWLIDNHIEWKTINKITTSSRVGIAFD ARCALFYVVSPESVKVLDEDGRLKQETAVRGGYPAVEYPNHLLYDTLSNALVSYSLTENI ISRFSFADGMWSNEVRNTKEPNNYNHAKAFNPADSSFYFFGGYGFYKYRNDLFRMKSGSE IMEQIKYDHPLYPRYSAAMAVVGDELYIFGGKGNKYGKQELSTHYYLGLYAINLKSKQSR TIWEKKDDNKETIMASSMYFEPADSSFYAVSTDNGGTLWKISMKSPVYTEVSKPINNRLD YQDCDFNLYYSPTHRKLFLVLDKILNNRTHDIKIYSINMPLVNEIDIRQSVDEMGSGKWW NLLYVIGVLAILGCGAWLFYRSRSKRQPIQSSAISKETPQSVTAPKAISENQEKVTPTMP EQENEPAPKEIVHYYDRSRSSISLLGCFNVRDKEGNDITANFTPRLKHLLILLILYTEKN PNGILTSKTTGILWPDKEETAARNNRNVNLRKLRVLLESIGDVEVVTENNFMHIKWGENV FCDYREVLARIQQFKEKENSELLNRILELLLYGPLLPNTILDWLDDFKETYSSHSIDLLK DLFEIEKQRNHPDMILRLADIMFLHDPLNEEALAAKCAVFSAQGKKGIARNLYDRFCKEY RDSMGEDYKTPFADL Prediction of potential genes in microbial genomes Time: Thu May 12 03:27:53 2011 Seq name: gi|226332160|gb|ACIC01000160.1| Bacteroides sp. 1_1_6 cont1.160, whole genome shotgun sequence Length of sequence - 39387 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 8, operones - 5 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 682 - 742 23.4 1 1 Op 1 . - CDS 774 - 2921 1847 ## BT_1771 putative cell surface protein 2 1 Op 2 . - CDS 2924 - 4204 1173 ## BT_1772 hypothetical protein 3 1 Op 3 . - CDS 4226 - 6232 2129 ## BT_1773 hypothetical protein 4 1 Op 4 . - CDS 6244 - 9549 3245 ## BT_1774 hypothetical protein 5 1 Op 5 . - CDS 9586 - 11298 1478 ## BT_1775 hypothetical protein - Prom 11506 - 11565 7.4 + Prom 11829 - 11888 5.0 6 2 Tu 1 . + CDS 12120 - 15815 3223 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 15857 - 15901 9.2 - Term 16325 - 16388 18.5 7 3 Op 1 . - CDS 16448 - 19507 2103 ## BT_1777 hypothetical protein - Term 19596 - 19632 -0.9 8 3 Op 2 . - CDS 19722 - 22283 2263 ## COG1472 Beta-glucosidase-related glycosidases 9 3 Op 3 . 
- CDS 22334 - 24052 1286 ## BT_1779 sialic acid-specific 9-O-acetylesterase 10 3 Op 4 . - CDS 24083 - 26947 2864 ## COG1472 Beta-glucosidase-related glycosidases 11 3 Op 5 . - CDS 27014 - 29107 1632 ## BT_1781 xylosidase/arabinosidase - Prom 29201 - 29260 5.5 - Term 29766 - 29799 0.4 12 4 Op 1 . - CDS 29994 - 30839 850 ## COG4667 Predicted esterase of the alpha-beta hydrolase superfamily 13 4 Op 2 . - CDS 30817 - 33042 1571 ## COG0475 Kef-type K+ transport systems, membrane components - Prom 33065 - 33124 8.1 + Prom 33027 - 33086 6.6 14 5 Tu 1 . + CDS 33259 - 33750 305 ## BT_1784 hypothetical protein + Term 33751 - 33796 12.6 - Term 33782 - 33824 5.5 15 6 Op 1 . - CDS 33834 - 35039 887 ## BT_1785 hypothetical protein 16 6 Op 2 . - CDS 35036 - 36373 893 ## BT_1786 hypothetical protein - Prom 36533 - 36592 4.9 + Prom 36480 - 36539 2.4 17 7 Op 1 . + CDS 36562 - 37659 898 ## BT_1787 hypothetical protein 18 7 Op 2 . + CDS 37652 - 37993 117 ## BT_1788 hypothetical protein + Term 37999 - 38054 9.9 + Prom 38044 - 38103 4.6 19 8 Tu 1 . 
+ CDS 38173 - 39357 1290 ## COG3579 Aminopeptidase C Predicted protein(s) >gi|226332160|gb|ACIC01000160.1| GENE 1 774 - 2921 1847 715 aa, chain - ## HITS:1 COG:no KEGG:BT_1771 NR:ns ## KEGG: BT_1771 # Name: not_defined # Def: putative cell surface protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 715 1 715 715 1431 100.0 0 MKAINFINWKYIGLTGCLVASLLTACGDDETGPEVVTPLEPKLEIVNEASTFSFSSAGNS AQVLTFNANQDWHIVLDDSEGVDNSWVVLFEREGTAGEGNKVWVAAAENVEPDARRANFS LVSGGFKYDFVIYQAQKDAVVITDPKAYENLDAGEHIISVEFGINTGEYKTSFLYSGDTG WIMPTDERPAGEPETRAMENHTLYFKVLPNQKFDIRRGTIEISSKDNDNVKATMNIFQTG LPKPVITVDNKEMFTNLASKEDSILLKYKATNTNGPRDLTIDIDKNDQDWISFKPAIDAK GDSTFLVSVKENTGGERTATIALCAAADKKVREEFIVTQAQASDVELVITNKNDFRTSLD KLGSAATVKYSVQSTLTDPKNEILVDIVYPEESGYTAENGWLHMANNSMPERVIFTYDVN KVLRERQATVYIYRKGYENKKDYMVIRQAAATQIEIPAPGGLTNVLQSLIDDEIYKDWES ITSLELKGRLNDTDLNLLKNMMTAGKGYNLKTLDMTEVENETLKNGVFNGCNLLENISFP TGLQYVPREACRNCTKLRTVVVNEGPTYIGRHAFGNSVVNEAWFPSTMTYVYGYIFDNVT ALKTLHLKSLPFQCKEVARSDADEGATQPTTWCTVFKTWPSTVYVPMEYLEYYKNPDPKH VVSMHFQDLLNTLTSESEEWTKGPKVQQPYLKSNSALKSNFTWGSSATTIMGESN >gi|226332160|gb|ACIC01000160.1| GENE 2 2924 - 4204 1173 426 aa, chain - ## HITS:1 COG:no KEGG:BT_1772 NR:ns ## KEGG: BT_1772 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 426 1 426 426 872 100.0 0 MKKYKIFNKTGLAICLTLGLVAGSACSPTDDGPSIDDHFLNYEIPQIRPSSDIPVGAIYW NLGSTGVDEKKYARLIGEYNQSGQYPQLCPNVRPVLGRYSMDINKAETADLIQQHLTWAN NAGINFLILPNIGLDTSKGDLLNEGNVNFVNYMAGLNPNSEGIEWGGLRYAVSMDMNNFA NGLNNTAMIEDDADENGVSARCEQLYSFFVNLTSRFCTNNDLYYTVDGKPMIVVWNADKL YARDSEKLYNTIRERVRENVKDGNGNGLEIYILARQERWTPPARWHNFFLSGKVDAVYMD NMYNQTDWFRPTCYPQCIDQNFKYNREYEWANYGVDFVPSVSPSFNQWIDGDGTQFYNFP VVFKDEDMFRKMCNVAKMNLGKRPMVIIDSFNRWNVDQAIEPTDPNYGNGYGMKYLNIVK EEFKVK >gi|226332160|gb|ACIC01000160.1| GENE 3 4226 - 6232 2129 668 aa, chain - ## HITS:1 COG:no KEGG:BT_1773 NR:ns ## KEGG: BT_1773 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 668 1 
668 668 1348 100.0 0 MKLKNKILPLLLLASTAVGCTDGFETINSDPNKLYEVNLQSIFPGTVYKTMNAISELNNN RMWSYSRYITNVAFQTPWNERGDGFYRKFYVEILRDLEELAKAYDDGSLPNSLAVVKTWK AFTYYQLLSLYGPVGLSHAGMDGGTDKRIFDYDSEADAYYSILGWLDQAVDEFDESSTDK ILKDPVYNGDVQKWRKLANTLRLEIAMNIQNISEEKAREYAAKSMGHEEWLFSSLDDAFA PKYGTVIDKDCSWYYTRLYKEEIQTKANWGLIPSMNEYFAIYLFSYNDPRMETYFDGSNV YWPERQPYRMTDVLTRSHDCDVSGCSASDQQEHLQWMIDGYEVRDSLRVRYSIPYVPTPD NPGARTPFGWRRPIDPTDPNGTREIPDPLSMNEDNRCYINPRYYAMDATMPLLRWADACF LCAEAKVKFDLGSKDAQTYYEEGIRASFEENGLSASAAAYMAQEGIAWGTSHKGFYDTRK LMTADINGADGKDGQLEQIYKQRWFADFLDGMSAWRMERRTRALNLPPLFLNGSQAYEEG GNMYYGWPERLYFPDAERQTNAEAYYRAIDLLQKNSPKPNAARWGDNIFTMLQFAKPVPN QEEMIAGWENSDYVQFNMDFQAKKYGQTYEEFLKNARELTGIKEAEEALDKAYNFLIEAT LLVYKVEK >gi|226332160|gb|ACIC01000160.1| GENE 4 6244 - 9549 3245 1101 aa, chain - ## HITS:1 COG:no KEGG:BT_1774 NR:ns ## KEGG: BT_1774 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1101 1 1101 1101 2187 100.0 0 MEKLYLKRSWRIGTCLMVLLTFFALSLSAQNRKVTGVVVDEFGDPLPGANITVVGNQNLA TATNMEGQFTLNNVPKDAILNVRFIGYAVKTIRLATLKNNDLLRIVLSEDVKELNDVVVT ALGISRESKSLGYARQSIDTESLLDTRDPNLLNSLSGKVSGVNFISNGGPLSSTRVEIRG NNSLTGNNQPLYVIDGVPIMNPMGEDGDLDYGNPASAISPDDIESIEVLKGANAAALYGS DAANGVILITTRKATKKSGLGVSYGYNMQFTFLREFPAYQNVYGSASLPAVGVGDGFNYY GSNSKNGYAYDPSLPYGIFVFNWANPNQRSWGLPMLGFDLVGRNNQIRSYVPNKDGITDM YEMGVAMTHSVSVNKVFNGAGIRFNYTGIDYDGMLKNFDNMKRHTFDLRVNMDLAKWLSS EISVNYQLETADNRDYKGDSNRNPMRAIMNMPRDVVMEELLPWKRETGEAFTRGNGFYNP YWLLNEISNGDGRNNFRGNLTLNIKPIQGMNIRLRASVEKANKNGWKFDNYYTMWDIDGQ YETFREASNNYNYEGVISYNRNIKKVSLSANLGTSMQKNDWYKLWNQVSQLAAPDVKSLS NNASILSGTESVNAKEKQSVFGMLSLGYHGLFVDGTFRNDWSSSLPKANNSYFYASGSAS AVLTEIFPKLKSKTFSFAKLRGSIARVGNDAGFDRLINSYSYGGLFRNDMAWFQGDAVRK NPNLKPETTISKEIGAEVRLLDGRLTADVTYYDKYTRDQIVESELSYLSNYQRVIINSGK ISNKGWEISVSGVPVKVKNFEWKTIFNWSKNNSMVESLPEGIDKIQIGSGVYNVKSYAEV GKPYGAIYAKTFKRDAEGYILCQLDGSPKEGQDYEYLGCVQADWRGGWNNVFRLGNFSFS VMFDFQKGGKFFSQTSIQSSVDGQSVKSLEGRDADFFSRKILGESDEERYGFMRPQNANT 
PTANGQIYPDWGRPKGVVLPNCRYDEDVEGLAGQQVLGYCTPERYWMHYTSRDISRFIYD ASYVKLREITVSYDLPKKWLRKTPFQTFRVAAVGRNIATLFANTPHGLDPQATSTTGNAQ GFERGFNLPSATYGFDIKVSF >gi|226332160|gb|ACIC01000160.1| GENE 5 9586 - 11298 1478 570 aa, chain - ## HITS:1 COG:no KEGG:BT_1775 NR:ns ## KEGG: BT_1775 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 570 1 570 570 1147 100.0 0 MGLVLASMIVSCSDEDNTIVPDTNPDPVIPTEGTFILPAPAVESNIPEHSFLMGALFSYQ KNLRNGEDAASGGTSSTTFYNKPLFEVKSFGATETEWWDNLVEEMVYSGLDYVAPNCRGR LPKADTDPAYDRDHGDPTRIKDFVAALQRRNEENLKIAIFDDCPASWAAARNFDLYQTYA TFISAETQQNKGLTDEEVMYPLDNLDDIYKYIWDYNIKLAFDNFYGENKANNKYLLRIDG KPVLFLWSINGFLNVDYAALGNKKIDCKGKLKAILNKIHADFKSAYGEDLFICADRAFQD RDKEVDESVVESLNDWFIASEQAPTRSSWTVRTFNGKTVGVAVPAFYTNDKSGTRMFFDA DHGKRLTDALDDMVAKNADLVFLEGFTDMAENAAYWRSTDNIYYDFPNQRLNILRKYSSS RAWAESFRVEMEACDYFFDVSAKNSGNQYRTGSLDVAKIAGVAPDKNTEWYVTSTEAGEW MEWKELPYAAGDITVKVCYAAKEDAKIRFDFGEGTKRRQGPVVELHATGGEWVTVDAFTM KSDVNGWRRTVLNIVAGKPDLNYFTITVEK >gi|226332160|gb|ACIC01000160.1| GENE 6 12120 - 15815 3223 1231 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 247 698 43 444 1087 109 24.0 3e-23 MKQISKVFLACACSLSGLTTQAQDQTVWSNDFSNASQPLNMVGRGVCRVNDGVFHSRSAY ALFGNPEWKDYTLSFKARAPKDAEQVQIWAGFRTHNRFDRYVVGIKGGLQDDLYLMRTGY MGTDEFMGVRPLGFHPVPGEWYKVKVEVCGNRIRVFVNDEKLPHIDLEDKNSNLAPTGEV ALGGGWIDTEFDDLTVTPLAEDALKDVKVQEYRMALTTQEKEKIRREERAQYTSVKLDKL TADRTELSLDGNWLFMPDYQLNDKTKAISMQTNDNDWHIMPVPAFWNPIRIWLHGETMPS PTGPQHKGVSDTYYQQETDRCENYTFNYRKTGAAWYRQWLELPADVKGKQLTLSFDAVSK MAEVYINGESAGSNIGMFGDFQVDATRFLRPGRNLIAVKVTRDINGKAAQTSDAMENYYS SVRKEVEDNQNDQQANKAVLTDIPHGFYGDNPAGIWQPVKLIVSNPLKVEDIFIKPALDG ASIDVTIKNHAPKKKNFDIRVDIIDKQTGETLYGAPVRSKLSLATSAQETFTCDIKGLKP KLWEPATPNLYDFRVSLTEGKNKVTDCITVTSGFRTFESKDGFLYLNGRKYWLRGGNHIP FALCPNDSLLADKFMKLMKEGNVQSTRTHTTPWNELWVSAADRNGIAISFEGTWSWLMIH 
STPIPDKGVLDLWSSEWLRVMKKYRNHPSIFFWTVNNEMKFYDLDADTERAKQKFRIVSD VVKDMRKTDPTRPVCFDSNYLHNKASKRFGDDFLKTVDDGDIDDNHAYYNWYDYSVFRFF NGEFQKQFKTPGRPLISQEMSTGYPNAETGHPTRSYQLIHQNPYSLIGYEAYDWGNPASF LNTQSFITGELAETLRRTNEQASGIMHFAYMTWFRQCYDHRNIQPYPTYYAMQRAMQPVL VSAELWGRNLYAGEKLHTRIYVVNDNEEGRDLKPMSLAWSIVDETNKVLASGTEQFPAVE YYGRKYIEPNIHMPSNLPADKVNVKLKLTLTESGVTLSQNEYGLLLARKEWNIGQVTASK KILLLDKDHMKATLDFLNIACQTVPSIKELLNAKQKANLCIISGLKECTDEEARLLREYQ SKGGRILFLNSKEAAQKVYPEYITGWIIPTEGDIVVMERDDAPVFDGIGALELRYFNNNK REIPLACTATLKAVRHENVKELAAQMKIHAYIDGGKPEERIARIESMRGLTLLQIADNKG KSLVSTLCTEKATTDPIAGKLLVNMVNELLK >gi|226332160|gb|ACIC01000160.1| GENE 7 16448 - 19507 2103 1019 aa, chain - ## HITS:1 COG:no KEGG:BT_1777 NR:ns ## KEGG: BT_1777 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1019 1 1019 1019 2090 99.0 0 MRNKYLRFADVLFLLSFVAGTTMTACSEQPYQPTLKATYNKPAKNWESEALPIGNGYMGA MIFGDVYVDVIQTNEHTLWSGGPGEDPSYNGGHLRTPEVNKDYLHKARVMLQQKMNDFTA NRSAYIDENGKLITHNYDGDGDGTELRNLIDNLAGTKEHFGSFQTLSNIIVETVNPGIPV LIKEAVQTNYDNTKNQSQSIGSLFDQSTTSKWFADNDRFSSFGSLPCVIKWAYTHAPKAV SYSLTSANDMPGRDPKSWKLYGSADGKSYDLLDQQSGTFWGDDKDGKGSRNKTLSFPLKT DKYTFFKLEITELIDNKQKPQLAELSIDASTELPYSDYTRTLDIDNAIHTVMYKENGITF KREYFMSYPDNVMVMRLTSDSKKGKLSRIISLESLHTDKTITADGHTITMTGYPTPVSGD KRVGDAWKNGLIYAQQLVVKNKGGKISVVDGTKLKVEDADEIIVLMSAATNYVQCMDDSY NYFSQEDPLEKVQATLHKVADKKYTALLATHQKDYHSLYDRMRLNLGNLPEAPVAPTDSL LKGMDENTNSEQENQYLEMLYFQFGRYLLISSSREGSLPANLQGVWGERLSNPWNADYHT NINIQMNYWPTQPTNLSPCHLPMVEYVRSLVPRGKYTAQQYYCKPDGGNVRGWVTHHENN IWGNTAPAKKSTPHHFPAGAIWMCQDIWEYYQFNLDKDFLKKYYDTMLDAALFWVDNLWT DERDGTLVANPSHSPEHGEFSLGCSTSQAMICEMFDMMIKASKELGRDKDPEIIEIATAM SKLSGPKIGLGGQFMEWKDEVTKDVTGDGGHRHTNHLFWLHPGSQIVIGRSEQDDKYADA MKVTLNTRGDEGTGWSKAWKLNFWARLHDGNRSHKLLRSAMKLTVPGSHVGGVYTNLFDA HPPFQIDGNFGCTAGIAEMLMQSQGGYIELLPALPDAWKNGSFKGMKARGNFEVDAAWTD GKITAIEILSNSGAECVIKYPNAKELNVSGAKVKVLADDQISFATVKGKTYIIKNVSSL >gi|226332160|gb|ACIC01000160.1| GENE 8 19722 - 22283 2263 853 aa, chain - ## HITS:1 COG:XF0845 KEGG:ns NR:ns ## COG: 
XF0845 COG1472 # Protein_GI_number: 15837447 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Xylella fastidiosa 9a5c # 44 840 37 858 882 534 38.0 1e-151 MIMMKRYKKILLTGVLSALSCSFVHAQELYKNENAPVHERVMDLISRLTVEEKISLLRAT SPGIPRLGIDKYYHGNEALHGVVRPGRFTVFPQAIGLAATWNPELQKRVATVISDEARAR WNELDQGREQKEQFSDVLTFWSPTVNMARDPRWGRTPETYGEDPFLSGIMGTAFVNGLQG DDPHYLKIVSTPKHFAANNEEHNRFVCNPQISEKQLREYYFPAFEMCVKEGKAASIMSAY NALNDVPCTLNPWLLQKVLRQDWGFQGYVVSDCGGPALLVNAHKYVKTKEAAATLSIKAG LDLECGDDVYDGPLLNAYKQYMVSDADIDSAAYHVLTARMKLGLFDSGERNPYTKISPSV IGSKEHQQIALDAARQCIVLLKNQKNRLPLNADKLKSIAVVGINAGKCEFGDYSGAPVVE PVSILQGIRNRVGDRVKVVYAPWKSAADGLELIQGENFPEGLQAEYFDNTRLEGTPRVRK ESWINFEPANQAPDPFLPKSPLSVRWTGKLKPTVSGRYTFSFTSDDGCRLSIDNQMLIDA WQAHAVSTDSASIYLEAGKEYQLKAEYYDNRDYAIVKLQWKVPQIGKATRLDLYGEAGKA VRECETVVAVMGINKSIEREGQDRYDIQLPADQREFLQEIYKVNPNIIVVLVAGSSLAIN WMDEHIPAIVNAWYPGEQGGTAVAEVLFGDYNPAGRLPLTYYKSLDELPPFDDYDITKGR TYKYFKGDVLYPFGYGLSYSSFTYSDLQVKDGVGEVTVSFRLKNTGKRNGDEVAQVYVRI PETGGIVPLKELKGFRRVPLKSGESRRVEIKLNKEQLRYWDVEKGQFVVPKGAFDVMVGA SSKDIRLQTVIDL >gi|226332160|gb|ACIC01000160.1| GENE 9 22334 - 24052 1286 572 aa, chain - ## HITS:1 COG:no KEGG:BT_1779 NR:ns ## KEGG: BT_1779 # Name: not_defined # Def: sialic acid-specific 9-O-acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 572 1 572 572 1065 100.0 0 MKRLLFILWIFLLSVSFGLTKMKADVRMPLIFGDHMVLQQDTKIAIFGWADAGETVEVVF AGQKVKTTADTDGTWRVNLKPIKTKKEGQTLTIAGKNKLVYSDVLVGDVWVASGQSNMEW GIKVRKEYADDISASEDPLLRLFFVPKNTSLEPLTDIEVPQGTSNPERAARWVLCTPEML AKINGQGFSATAYYFARDMRAANGRPLGVIQSAWGGTRAEAWTSLSGLKQEPALAHYVAA YEKNVKDNPEILATYPQKQKEFDEAVREWDKTVGKEWNQAQKEWAIAVQAAQAAGKPAPP KPQPSVPRPPNPRKPNGGNNGPSNLFNAMISPLIPLSIKGVIWYQGEFNSGGSGKEYATL FSRMITDWREKWGIGDFPFVYVQLPNFEPVDPEPSVEGNGWRWVREGQLKALNLPNTAMA VTIDVGDPFDLHPVDKYDVGHRLALAARKLAYGEKIVGMGPLYKKMSVKGNKIILEFTNQ GKKLVMGTSPYIPAGKPPHPKPTKLTGFGIAGADRKFVWADAVIEGNKVIVSSSEVAVPL AVRYGFSNSPRCNLYNEKGLPASPFRTDSWDK >gi|226332160|gb|ACIC01000160.1| GENE 10 24083 - 
26947 2864 954 aa, chain - ## HITS:1 COG:SSO3032 KEGG:ns NR:ns ## COG: SSO3032 COG1472 # Protein_GI_number: 15899739 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Sulfolobus solfataricus # 209 846 82 734 754 468 41.0 1e-131 MKNKMKWILAVGLLSCSVAMAQQQSDILSVSASANAENAALAFDRNVKTMWTIPSQALKA EQWLMFTIQQPGDVCELDLQMQGINKNELKEVLDIFVTYDPMNLGTPVNYRIEGSDKQMK VKFTPKYGAHVKLNFKSGKLDKPFSLKEISVLVAEKVLTDSQGKVTDRRYMDASLPVEER VESLLAVMTPEDKMELIREGWGIPGIPHLYVPPITKVEAVHGFSYGSGATIFPQALAMGA TWNRKLTEEVAMVIGDETVAANTKQAWSPVLDVAQDARWGRCEETFGEDPVLVSQIGGAW IKGYQSRGLFTTPKHFGGHGAPLGGRDSHDIGLSEREMREIHLVPFRHAIRNYDCQSLMM AYSDYMGVPVAKSKELLQQILRQEWGFNGFIVSDCGAIGNLTARKHYTAKDKIEAANQAL AAGIATNCGDTYNNKEVIQAAKDGRINMEDLDNVCRTMLGTMFRNELFEKNPCKPLDWKK IYPGWNSDSHKEMARQAARESIVMLENKDNLLPLSKTLRTIAVLGPGADDLQPGDYTPKL LPGQLKSVLTGIKGAVGKQTKVLYEQGCDFTNPDETNIPKAVKAASQSDVVIMVLGDCST SEATNDVRKTCGENNDWATLILPGKQQELLEAVCATGKPVILILQAGRPYDILKASEMCK AILVNWLPGQEGGPAMADVLFGDYNPAGRLPMTFPRHVGQLPLYYNFKTSGRRYEYVDME YYPLYRFGFGLSYTSFEYSNLKIQEKANGNVEVQATVKNVGSRAGDEVAQLYVTDMYASV KTRVMELKDFARIHLQPGESKTVSFEMTPYDISLLNDRMDRVVEKGEFKIMVGGMSPDYV AKNEIKHSVGYSDNKKGVTGMLNYTHEFGANFDLAVSKVEENLVKNQKTVWVSVKNTGTL MDIGKVEMFVDGKKVGDAVHYELGAGEEKLIPFKLAKENKQPVAFTTKYKMVSM >gi|226332160|gb|ACIC01000160.1| GENE 11 27014 - 29107 1632 697 aa, chain - ## HITS:1 COG:no KEGG:BT_1781 NR:ns ## KEGG: BT_1781 # Name: not_defined # Def: xylosidase/arabinosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 697 1 679 679 1387 99.0 0 MVKRFDWLLVCLLFSIGVMAQSGGKQYNSYKGLVMAGYQGWFNTPGDGSGRGWHHYNGRN GFRPGSCSVDLWPEVSEYKKLYKTEFSFADGKPAYVFSSHDESTVDVHFKWMQEYGLDGV FMQRFITEIRNESGLKHFNKVLNSAMKFANKYERAICVMYDLSGMQPGEEQLLLKDIAEI AERYSLKDHAKNPSYLYHNGKPLVTVWGVGFNDNRRYGLKEAAHIIDGLKSQGFSVMLGV PTQWRTLNGDTESDPRLHELIRKCDIMMPWFVGRYNETTYPKYQKLVEEDIQWAKKNQVD YAPLVFPGFSWGNLKGKDHNSFIPRNKGSFLWTQLMGAIRAGAEMIYVAMFDEIDEGTAI FKCAKKVPVGKSTFVPLEEGVESDHYLKLVGEAAKILRKEKAVAFSAKLDTKSPNPFIRH 
MYTADPSAHVWEDGRLYVYASHDIAPPRGCDLMDRYHVFSTDDMINWTDHGEILSSDQVP WGRKEGGFMWAPDCAYRNGTYYFYFPHPSETDWNDSWKIGVATSDKPAEGFKVQGYIEGM DPMIDPCVFVDDDGQAYIYNGGGGTCKGGKLKDNMIELDGPMRTMEGLSDFHEATWIHKY NGKYYLSYSDNHDDGEKHNRMCYAISDSPLGPWEYKGIYMEPTDSYTNHGSIVEFKGQWY SFYHNSALSGHDWLRSICVDKLYYNPDGTIKMVKQTK >gi|226332160|gb|ACIC01000160.1| GENE 12 29994 - 30839 850 281 aa, chain - ## HITS:1 COG:CAC2424 KEGG:ns NR:ns ## COG: CAC2424 COG4667 # Protein_GI_number: 15895690 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Clostridium acetobutylicum # 11 277 5 272 283 234 43.0 1e-61 MKDLKLDEHTGLVLEGGGMRGVFTCGVLDYLMDHDIRFPYAIGVSAGACNGLSYASRQRG RAKFSNIDLLEKYNYIGLKHLLRKRNILDFDLLFNEFPEHILPYDYDTYFASPERFVMVT TNCMTGEANYFEEKRDKSRVIDIVRASSSLPFVCPVTYVDGVPMLDGGIVDSIPLQRAIA DGYTRNVVILTRNRGYRKDTKDIRIPSFVYRKYPKLREALSRRCAVYNEQLEMVEQMEDE GKIIVIRPQKPVMVDRIERDVQKLTEFYEEGYDCARNLFEF >gi|226332160|gb|ACIC01000160.1| GENE 13 30817 - 33042 1571 741 aa, chain - ## HITS:1 COG:SPAC105.01c KEGG:ns NR:ns ## COG: SPAC105.01c COG0475 # Protein_GI_number: 19114377 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, membrane components # Organism: Schizosaccharomyces pombe # 63 464 29 429 898 281 37.0 4e-75 MQNKARKNYIIYVLMLLLFGGLIYVAIEEGDRFSHHAANTLNTVQDGPFAMFLQFMQDNL HHPLTTLLIQIIAVLLMVRLFGYLFSLIGQPGVIGEIVAGIVLGPSVLGLFFPEAFHFLF PVHSLTNLELLSQVGLILFMFVIGMELDFSVLKNKINETLVISHAGILVPFFLGILSSYW IYETYAADHTPFLPFALFIGISMSITAFPVLARIIQERNMTKTPLGTLAIASAANDDVTA WCLLAVVIAISKAGSFASALYSVGLAVAYIAVMFLVVRPFLKKVGEVYANKEAINKTFVA FILLILIISSCLTEIIGIHALFGAFMAGVVMPSNLGFRKVMMEKVEDISLVFFLPLFFTF TGLRTEIGLINSPELWMVCLLLVTVAIVGKLGGCAIAARLVGESWKDSLTVGTLMNTRGL MELVALNIGYEMGVLPPSIFVILVIMALVTTFMTTPLLHLVERFFVHREEKLSLKRKLIF CFGRPESGRSLLSIYDLLFGKQLKKEHVIAAHYTVGTDLNPLDAQHYESESFALLNQRAV ELNIQVDNHYRVTDKLVQEMIHFIRKEHPDMLLLGAGSHYRPEMPGTPGAILWLTLFRDK IDDIMEQVKCPVAVFVNRNYREGALVSFVLGGMIDVFLLSYLNIMLQNGHSVRLFLFETD 
DEEFHQCIDDLLAHYADQIVVVWFAGVEELVTKERDGLLVMSHLSYTKLSEDELVMRELS SLLVIRRNKNKGEKNERLEIG >gi|226332160|gb|ACIC01000160.1| GENE 14 33259 - 33750 305 163 aa, chain + ## HITS:1 COG:no KEGG:BT_1784 NR:ns ## KEGG: BT_1784 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 163 11 173 173 308 99.0 5e-83 MPELIEEILHSDKWQTSVKEEISGRTTVVIRDQAYGSEATIEIYAQSIEIKTAWSKYFYR IFVANDLVWCEYNGAYRGLLEQVLLPTITPKESLLDSDVTESSLYGREHKKLREYAEDNL KLKQFRRENFNEQRNGTAAFDHPKRVYDEFIKEDYVVTPKGNK >gi|226332160|gb|ACIC01000160.1| GENE 15 33834 - 35039 887 401 aa, chain - ## HITS:1 COG:no KEGG:BT_1785 NR:ns ## KEGG: BT_1785 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 401 1 401 401 717 99.0 0 MKRRVLSILILGANLLNVAAQNMTSSPVSMFGLGELSTGEGGIYSGLGGVGIALRGENVI NSANPASLTGLLPQYFFFDLGVSGSYLKYSQSGASNHSLNGNLNNLAVGFRIAPRWYGAI FMAPVSSVGYAISLDQDVAGTDGSTVTSLFQGEGGLSKMGISVAHELWKGLSLGANFSYV GGSISQTETQGSATESNSSHKHTIYADFGLQYTYKLDHYRSAVAGVVYGYSQDLMQDNDH VVSSSSSSGSIEEKGKKYRTCLPQFIGFGASYTTLRWMASIDYKFVDWSRLESSHSSISF RNQHRLMLGGSYTLGNPYSKPVRLLLGAGMGNSYISVHDKTTTNYYLSTGLNFEFRSRST LSLGVKYTDQLKVTSGRFKEQKLSFFLNLTFSEKTYKAKLK >gi|226332160|gb|ACIC01000160.1| GENE 16 35036 - 36373 893 445 aa, chain - ## HITS:1 COG:no KEGG:BT_1786 NR:ns ## KEGG: BT_1786 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 445 1 445 445 853 99.0 0 MRLFITLFYSLLLFCTIFSCRDESSSIGSKWVESSFINVVTDTCMVTLSTLLADSLATSG DTICQIGRFKDDLYGEIKTSFYAEYQVPSHSLNETTDYQFDSITFKWYTSGDYLGDTLVY HRIDLYSLSQGLSLEDNGYLFNKTNVSYNQNNHLGSVYIRPTPGYRNELHETRLPDEWGK KWFNLMLEDDECMESQDRFRAFFKGIAFIPDENGTCINGFQVNDSSFCITLYYHELKEKP TEQTLVFNASTSLNYTKVEHDRTNTPLENLKAGTDYELPSSESEHQVYLQGLTGMYINVE FPYLNNLNQTGRLVSIESAVLKLYPIKGTYNGKYPLPKSLSLYTADQNNATQSVITDISG NTVQTGSLVEDNVYYEDTHYSFDITSFIQNNLGTAGESREKLQIFLSDNAFYNTVQGVVL GDSKHPVNDHDNNVSLTIYYKTYTK >gi|226332160|gb|ACIC01000160.1| GENE 17 36562 - 37659 898 365 aa, chain + ## HITS:1 COG:no KEGG:BT_1787 
NR:ns ## KEGG: BT_1787 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 365 1 365 365 663 99.0 0 MEIMKKLQLLLVSIAMLSMFSCSDDDDDTLGVWYRRSDFDGRAREDAAGFVIDNRGYLCG GYRGKDQRERDCWEYNIDNDWWTQCADLPEEAAARNGAVGFAINSKGYVTTGYTVYRDDD PLHTGGYAYLKDTWEYEPATDTWKQMDDYPGDARINAIAFAIGNYGYVGTGQSKDDKQTK DFYRFDPTAASGNQWTIVNGFGGQKRTGGLSFLIDNVAYIVGGTNNGKDVDDFWKFDPSK SEENKWVRLRDITDQSSDDYDDDYKSITRTYGCAFVIDDQAYITLGQTASGSFRSNYWIY DPAKDLWRSSETEDDFDYTPFEGSSRIKAVCFATGRRGIITTGGSSSYYYDDTWELHPYE WEEND >gi|226332160|gb|ACIC01000160.1| GENE 18 37652 - 37993 117 113 aa, chain + ## HITS:1 COG:no KEGG:BT_1788 NR:ns ## KEGG: BT_1788 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 113 1 113 113 207 99.0 7e-53 MTKRKILGYSVAGILIVIACLSFSKTSFSKTDSSASSFSMEVIEVNGGYGYQISHNNHIT IFQPFIPSISGKKPFMEKRDAEQVGQLVMKRMKSGENYTVTLDDLESLGIKIK >gi|226332160|gb|ACIC01000160.1| GENE 19 38173 - 39357 1290 394 aa, chain + ## HITS:1 COG:L130687 KEGG:ns NR:ns ## COG: L130687 COG3579 # Protein_GI_number: 15673860 # Func_class: E Amino acid transport and metabolism # Function: Aminopeptidase C # Organism: Lactococcus lactis # 33 386 54 414 436 60 23.0 6e-09 MKKSILIAALGLFSLSAMAQDAKPEEGFVFTTVKENPITSIKNQNRSSTCWSFSALGFLE SELLRLGKGEYDLSEMFVVHKTMEDRGTNYVRYHGDSSFSPGGSFYDIIYCMKNYGLVPQ EAMPGIMYGDTLPVHNELDAVAEGYINAIAKGKLTKLTPVWKKGLSAIYDTYLGACPENF TYKGKEYTPKTFAESLGLKAEDYVSLTSYTHHPFYSQFAIEIQDNWRNGLSYNLPLDEFM AVMDNAVKQGYTFAWGSDVSEQGFTRDGVAVMPDAAKGAELTGSDMARWTGLTAADKRKE LTSKPLPEMEITQEMRQTAFDNWETTDDHGMVIYGIAKDQNGKEYFMVKNSWGTSGKYKG IWYASKAFVAYKTMNILVHKDALPKEIAKKLGIK Prediction of potential genes in microbial genomes Time: Thu May 12 03:29:26 2011 Seq name: gi|226332159|gb|ACIC01000161.1| Bacteroides sp. 
1_1_6 cont1.161, whole genome shotgun sequence
Length of sequence - 14303 bp
Number of predicted genes - 9, with homology - 9
Number of transcription units - 4, operones - 3
average op.length - 2.7

N Tu/Op Conserved pairs(N/Pv) S Start End Score
1 1 Op 1 9/0.000 - CDS 1 - 706 501 ## COG3279 Response regulator of the LytR/AlgR family
2 1 Op 2 . - CDS 706 - 1911 679 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain
    - Prom 1960 - 2019 4.8
3 2 Tu 1 . - CDS 2064 - 2915 583 ## COG2207 AraC-type DNA-binding domain-containing proteins
    - Prom 3100 - 3159 8.3
    + Prom 3014 - 3073 6.4
4 3 Op 1 . + CDS 3166 - 4512 1189 ## BT_1798 hypothetical protein
5 3 Op 2 2/0.000 + CDS 4550 - 7399 2533 ## COG1629 Outer membrane receptor proteins, mostly Fe transport
    + Term 7426 - 7465 9.1
    + Prom 7408 - 7467 6.4
6 4 Op 1 10/0.000 + CDS 7643 - 9493 1447 ## COG0642 Signal transduction histidine kinase
7 4 Op 2 10/0.000 + CDS 9490 - 11451 1533 ## COG0642 Signal transduction histidine kinase
8 4 Op 3 . + CDS 11456 - 13513 1583 ## COG0642 Signal transduction histidine kinase
9 4 Op 4 .
+ CDS 13606 - 14286 450 ## BT_1803 hypothetical protein Predicted protein(s) >gi|226332159|gb|ACIC01000161.1| GENE 1 1 - 706 501 235 aa, chain - ## HITS:1 COG:SA0251 KEGG:ns NR:ns ## COG: SA0251 COG3279 # Protein_GI_number: 15925964 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Staphylococcus aureus N315 # 1 232 1 239 246 108 31.0 9e-24 MKCIAIDDEPLALKQLTDYISRVPFLSLVKSCQDAFEAMKVLADEDIDLIFVDINMPDLN GLDFIRSLISRPMVIFTTAYSEYAVDGFKLDAVDYLLKPFEFQDLLKAADKARKQYEYRM LEQQGELGNSSQIKGDSLFVKSDYKVVRIDVKNIRYIEGMSEYVRIFIEGEDKPVITLAS LQKMEERLPTYFMRVHRSYIVNLRRITEVSRLRIIFDKNTYIPVGDNYKERFAEH >gi|226332159|gb|ACIC01000161.1| GENE 2 706 - 1911 679 401 aa, chain - ## HITS:1 COG:BH2727 KEGG:ns NR:ns ## COG: BH2727 COG2972 # Protein_GI_number: 15615290 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 202 401 374 586 597 102 31.0 1e-21 MERLRKQQLLEQSIYGAIWTVIFLLPLIGGYFAVSGGLEKEEIRVIVYDSWLSILPFFVL FLLNNYGLVPYFLFKKKYWYYIISLVFLISTACWVIPDPSMERFPKEFRYGDLRKGEGKI QRDQIIKMREKARQEESVHWETPRANDPALEKMQRPVGFPKLTLYPIPPFTIRYLIHCVI AFLMVGFNIAVKLFFKSFRDEEMLKELEHQRLQSELQYLKYQINPHFFMNTLNNIHALVD IDTGKAKSTIVELSKLMRYVLYEASNKTILLSREVQFLENYVTLMSLRYPEKVSIEKNFP LEVPEVQIPPLLFISFVENAFKHGVSYRKESFVHVVMQLEEGNRLAFRCTNSTGTSSDEQ HHGIGLENIRKRLRLLFGNDYTLSITEEYDKFDVLLIIPLL >gi|226332159|gb|ACIC01000161.1| GENE 3 2064 - 2915 583 283 aa, chain - ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 179 282 183 286 288 73 31.0 4e-13 MIIDDALPKFDLPVDFVTDDSITGDILNQYGRFPCHIKAGVFVLCTRGTVRATINLSEYT ITHNDFVTVLPGSFIQIHEVSSDTRVCFAGFSSEFISRVSYVETYLDFLPMILDNPIMTL QEEVAQLYRNAFSLLIRAYSLPNTLDNKEILMSIFTIFFQGVGELYKRCKPTTNEPIKRE HELYRQFIQLLMTHYTQEHEVSFYAKKCGVTPAHFSGAIRKASGHSPLAIITGIIIMNAK 
AQLKSTRLPVKEIAFSLGFNNLSFFNKYFRKHVEMTPQEYREC >gi|226332159|gb|ACIC01000161.1| GENE 4 3166 - 4512 1189 448 aa, chain + ## HITS:1 COG:no KEGG:BT_1798 NR:ns ## KEGG: BT_1798 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 448 1 448 448 857 96.0 0 MKTLRKLFYVACTTFFLASCEETYNDKLFWPGELSQEYGSYIKPSTLDLTYSGEKLIGKT VNFQTEDSKKGTLTLNDIIPGEKETSFRINLSEQEDNYTFSGETVSCAGATVKYAGSITP KTMKLDLNVTMPQNQWIKTYQMSELTRGRGKDVVRNQTTGEYEWGESDNQILTAALYTDM DLEMVKDAGSLYATVSVIIKGMGGYLLPQLLKSVTLESDGNITAEYTSDELQLGEQKFSE IDMDNPASQQQLINFIMMKLMFNTLSADDITDATQGRNYAKSPRGLAFWYLKNDLLYVKL NLPGIISLTMQGQGQTVDAHLIAGIADAILKSNPFLLKTLLGIVSESLDNSLLSMIANMD HQSFQMFFSWIKEGIPMQIEKENGHTHIYLNREALSPLIAFIPNLQPAMEGIPNFGPMLY NSYIGPLYDNWSIITQLDLGLDLTDKNE >gi|226332159|gb|ACIC01000161.1| GENE 5 4550 - 7399 2533 949 aa, chain + ## HITS:1 COG:PA1613 KEGG:ns NR:ns ## COG: PA1613 COG1629 # Protein_GI_number: 15596810 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Pseudomonas aeruginosa # 110 319 37 244 702 79 28.0 3e-14 MKKLYILTLFICLAGFGTSFAQTLKGHIYDANTNEPLVGAAVTYKLHGNQGTVSDINGAY EIKLPEGGVDLVFSYIGYEDVLMPIVIGKREVITKDVYMKESTKLLEEVVVSAGRFEQKL SNVTVSMDLVKAGDIARQAPTDITSTLRTLPGVDIVDKQPSMRGGSGWTYGVGARSQILV DGMSTLNPKTGEINWNTVPLENIEQIEVIKGASSVLYGSSALNGIINIRTARPGLTPKTR FSAYIGIYGDAENDEYQWSDKSFWKDDKYSVKPILRGNLLSGIRNPIYEGFDLSHSRRIG NFDVSGSINLFTDEGYRQQGYNKRFRMGGNLTYHQPDMGMKILNYGLNVDFLSNQYGDFF IWRSPTEVYKPSPFTNMGREENNFHIDPFINYVNPENGTSHKIKGRFYHSADNIVKPSQG NSITDILGNMGTNAQTIQNIAGGDYSSLYPALVGIGSGLINNNLEDAMNGVFTSLGNIFP NATTADYCDLISWVMDNGLPSDLMNGIQNGQVPSDLIPWLSNVMNPTRNNAKTKTDKNYN YYLDYQFNKKWDGGAQITTGMTYEHVRYNSSIMDQVYKSDNVAAFFQYDQRFWDRLSVSA GVRAEYYRVNNHHREAETKIFGAKVPFRPVFRAGLNYQLADYSFIRASAGQGYRNPSINE KYLRKDIGGVGIYPNLDIKPEKGYNAELGFKQGYKIGNFQGFVDVAGFYTEYRDMVEFQF GLFNNADYSMINSISDAIQMVTDSKGFGIGAQFHNVSKAQIYGMEISTNGVYDFNKNTKL FYNLGYVYTEPRDADYKERNEIEDLYTDALQMKEKSNTGKYLKYRPKHSFKATVDFQWKR 
INLGANFAWKSKILAVDYLMMDEREKQQQDLMDYVRTILFGKSRGETLATYWKKHNTDYA TVDLRFGVKATKEVAFQFMVNNLLNKEYSYRPMAVAAPRTFVVKMDITF >gi|226332159|gb|ACIC01000161.1| GENE 6 7643 - 9493 1447 616 aa, chain + ## HITS:1 COG:MA4377_3 KEGG:ns NR:ns ## COG: MA4377_3 COG0642 # Protein_GI_number: 20093164 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 371 616 14 266 311 146 36.0 1e-34 MKRNIYILFTIGLFIRILPSHSVNLNTERLSNLKRLIENNIAYDSIAPIDSVIAWGQQLS PILEKENKMELSFSIRQLVVYIYSLRGDIGKAIDEARQMYEKAETMRYDLGIALSSAAIG DAYFCSNMPEEATDSYKEAIRYPASPSENNHYKEMTILKLIQVLILTQRTEEAEKYRKIL SESKSIQTHQTLQFLTLATDVSYYIQKNDLRNANNCLLQAEQIYLSDRQPYYRTTYNYMQ GRYNEATGNYNLALQYYNEILTGIRQKTRSIIYLQVAYAKANLLIEMGSKVEAAHLYEEI SIVTDSVVAPSYAHRINSLRASYEENRMKVENKAEFNRIFMGGILIGVIVLGIMIYLVIH ILKQNKKIAESKIRMEQSRLNAENAMQTKSLFLSNMSHEIRTPLSALSGFSSLLTEQALD AETRRQCGDIIQQNSDLLLKLINDVIDLSNLEIGNMKFNFTYCDAIAICNNVIDTVNKVK QTQAELRFNTSLASLKLYTDDSRLQQLLINLLINATKFTPQGSITLEVRQESEEFALFSV TDTGCGIPLEKQSSIFNRFEKLNEGAQGTGLGLSICQLIIERIGGSIWIDPAYTTGCRFY FTHPINPTKEGKEAQS >gi|226332159|gb|ACIC01000161.1| GENE 7 9490 - 11451 1533 653 aa, chain + ## HITS:1 COG:AGc3465 KEGG:ns NR:ns ## COG: AGc3465 COG0642 # Protein_GI_number: 15889187 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 435 567 255 389 511 88 37.0 5e-17 MKRLWIFFILIVLAGGTGKALSSNNEAAKDSLLQILDTLPADSSRLEMLYSLAYLDPMSP SCVYYLGKLLEEATTQDNKYYQCLALYAHVVYYFNHQDEENTVIWMDKLSPVALKNNSYS LYFEGKRAEITIHIIKHKIEYSITQAEEMFKLAQKLNNPQGMSSAKLCLMTAYMMTARFK EGEEAGFEAYRLLPPAASLESRKSVLQEIALSCSATRNKNFLKYLQEYKKVLDTLSEAKQ TPKAYSYLLLESLYADYYLNEGTLDEARPHLKKMDEYFSPTTYIPCRGLYYNVYSHYYRI TKEYEKALSYSQNAIELLSEVSDNEGLNYKIEHASILTEAGQADEAIPLFQSLLAKKDSF YRALSISQTNEIYQMRNMDNLLLEKEQYKAMVHYAGLTLIAIALLILIPSTIRIYYVRKK LRKEEEEIRKMSQIAEEANEVKSRFLVNMSYNIRIPLNNVLGFSQLMTTDPESMDADQWK EYSEIIQTNSAELIQLVNDVLDLSRLEAGRTKWQIQEHEIISLCSDVLGMVRMRCGDKIQ 
ADFHTEIESQPFQVDTARFTQLILSTLIYTDPCEEKRKVSLYLERDTQRELFVFRVVNSP LADPTLQTQKAEVRHSINRLTIEYFKGTYTIEPNTPEGPVLTFTYPYSKTNNI >gi|226332159|gb|ACIC01000161.1| GENE 8 11456 - 13513 1583 685 aa, chain + ## HITS:1 COG:CC0586_2 KEGG:ns NR:ns ## COG: CC0586_2 COG0642 # Protein_GI_number: 16124840 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Caulobacter vibrioides # 467 676 14 227 232 119 34.0 3e-26 MTKNLITRKIQFLFLIGAILLAPSIARASAVTASVDSLQNLLQNDTDAGQRAIIYIHLAD IHVDSLNISSQYWDKALTEAAKAKDEYMMKLALDNLIQRYASKNKEKVEKYITFARQYLP EEHNTLFRYYLYCYNIWAEMRKNNSLETIEQELDKLKKENHGQMTPEEQIQWEYLTGVSL DYSAVLTHSYNEVSKAIPYVERALEILAPYPLQDRVHFEILCHYELADLYTTVENKKAVD EVNKMIELHKQWNHLNTTFDRRFLDESHYYMAKYSQIIFMADLISKEEIKDYYQKYIQIA NKKKVVKSTYETTARYYQTIGEYKKAIAYIDSATQTDHFQPVNLVPILNAKAGLYYKLND YKNAYLTLKESNKNRMSDKSQKREQQMIEMQTRFDVNKLELEKSKLANKNKQIALTGTFI LLLAAIGWSIYQRTMVKRLKKMHCALMTAHEEVRKQSMKATESEKMKTAFLNSICHEIRT PLNSIAGFSELIFDESLDTATRQEFRQLIQSNSTALASLMDNMLELSQLVSSELPLPVES TDVYGLCVEEMAKLKETLSKPNIECIITGDKDGMTAQTNAFYLSRVIGNLLSNSAKFTES GTITLTCNIDKEKGQLVISVTDTGIGIPADKQEWVFERFTKVDDFKPDTGLGLYICRIII QRLGGQIRIDTGYTSGCKMVVTLPV >gi|226332159|gb|ACIC01000161.1| GENE 9 13606 - 14286 450 226 aa, chain + ## HITS:1 COG:no KEGG:BT_1803 NR:ns ## KEGG: BT_1803 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 226 1 226 226 441 99.0 1e-122 MKKRILLIALLFCSVLAQAQDVFVTADFVSSYIWRGMDSGNASVQPSLGVNWKGLTAYVW GSTEFRHKNNEIDLSLEYEYRNLTLYANNYFTQTEEEPFKYFNYSSHSTGHTFEVGAGYM ISEKFPLSISWYTTFAGNDYRENGKRAWSSYCELSYPFSIKKVDLALEAGFTPWDGMYSD KFNVVNIGLSATKELKITSTFSLPIFGKVIANPYEEQVYFVVGLTL Prediction of potential genes in microbial genomes Time: Thu May 12 03:29:50 2011 Seq name: gi|226332158|gb|ACIC01000162.1| Bacteroides sp. 
1_1_6 cont1.162, whole genome shotgun sequence
Length of sequence - 58816 bp
Number of predicted genes - 45, with homology - 44
Number of transcription units - 24, operones - 11
average op.length - 2.9

N Tu/Op Conserved pairs(N/Pv) S Start End Score
1 1 Op 1 29/0.000 + CDS 675 - 1547 1104 ## COG2086 Electron transfer flavoprotein, beta subunit
2 1 Op 2 3/0.000 + CDS 1550 - 2569 1083 ## COG2025 Electron transfer flavoprotein, alpha subunit
3 1 Op 3 . + CDS 2576 - 4282 2089 ## COG1960 Acyl-CoA dehydrogenases
    + Term 4309 - 4346 4.1
    + Prom 4390 - 4449 5.1
4 2 Op 1 . + CDS 4503 - 6263 1282 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid)
5 2 Op 2 . + CDS 6272 - 6691 382 ## BT_1808 hypothetical protein
6 2 Op 3 . + CDS 6722 - 11752 4966 ## BT_1809 hypothetical protein
7 2 Op 4 . + CDS 11760 - 12230 322 ## PROTEIN SUPPORTED gi|15902812|ref|NP_358362.1| hypothetical protein spr0768
    + Term 12262 - 12306 8.4
    - Term 12276 - 12334 3.2
8 3 Op 1 . - CDS 12343 - 12657 298 ## BT_1811 hypothetical protein
9 3 Op 2 . - CDS 12672 - 13637 1201 ## COG2214 DnaJ-class molecular chaperone
    - Prom 13680 - 13739 4.2
10 4 Op 1 . - CDS 13830 - 15533 1255 ## COG2194 Predicted membrane-associated, metal-dependent hydrolase
11 4 Op 2 . - CDS 15520 - 16716 786 ## BT_1814 hypothetical protein
12 4 Op 3 . - CDS 16807 - 17820 794 ## COG2008 Threonine aldolase
13 4 Op 4 . - CDS 17873 - 18247 384 ## BT_1816 hypothetical protein
14 4 Op 5 . - CDS 18258 - 18806 496 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
    - Prom 18861 - 18920 5.3
    + Prom 18846 - 18905 2.0
15 5 Tu 1 . + CDS 18962 - 19378 423 ## BT_1818 hypothetical protein
    + Term 19389 - 19438 2.2
    - Term 19377 - 19426 6.0
16 6 Tu 1 . - CDS 19462 - 19680 246 ## BT_1819 hypothetical protein
    - Prom 19890 - 19949 6.6
    + Prom 19740 - 19799 4.5
17 7 Tu 1 . + CDS 19917 - 21656 1693 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase]
    + Prom 21749 - 21808 6.3
18 8 Op 1 . + CDS 21835 - 22752 644 ## COG4984 Predicted membrane protein
19 8 Op 2 . + CDS 22736 - 23692 518 ## BT_1824 putative permease
20 8 Op 3 . + CDS 23679 - 24182 408 ## COG4929 Uncharacterized membrane-anchored protein
21 9 Tu 1 . - CDS 24564 - 25655 853 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins
    - Prom 25678 - 25737 6.1
    - Term 25658 - 25717 12.9
22 10 Tu 1 . - CDS 25755 - 28814 2986 ## BT_1502 hypothetical protein
    - Prom 28836 - 28895 4.3
23 11 Tu 1 . - CDS 29188 - 30126 353 ## BT_1827 transposase
    - Prom 30170 - 30229 5.0
    - Term 30744 - 30780 -0.7
24 12 Tu 1 . - CDS 30832 - 32394 1091 ## BT_1828 hypothetical protein
    - Prom 32419 - 32478 3.7
    - Term 32486 - 32527 11.1
25 13 Op 1 41/0.000 - CDS 32560 - 34197 1673 ## PROTEIN SUPPORTED gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28
26 13 Op 2 . - CDS 34243 - 34515 396 ## COG0234 Co-chaperonin GroES (HSP10)
    + Prom 34501 - 34560 7.8
27 14 Op 1 . + CDS 34778 - 35806 908 ## BT_1831 hypothetical protein
28 14 Op 2 . + CDS 35850 - 36566 501 ## BT_1832 hypothetical protein
    - Term 36575 - 36606 -0.4
29 15 Op 1 . - CDS 36621 - 39269 2124 ## COG0642 Signal transduction histidine kinase
30 15 Op 2 . - CDS 39348 - 39509 160 ## gi|253571978|ref|ZP_04849383.1| conserved hypothetical protein
    - Prom 39701 - 39760 5.8
31 16 Tu 1 . + CDS 39445 - 39612 84 ##
    + Term 39617 - 39676 5.5
    + Prom 39670 - 39729 4.0
32 17 Op 1 . + CDS 39766 - 41235 1204 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain
33 17 Op 2 . + CDS 41216 - 42253 780 ## COG0502 Biotin synthase and related enzymes
34 17 Op 3 . + CDS 42258 - 43676 1219 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes
35 17 Op 4 . + CDS 43740 - 44957 787 ## COG1160 Predicted GTPases
    + Prom 44959 - 45018 3.2
36 17 Op 5 . + CDS 45046 - 47583 2152 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases
    + Term 47620 - 47660 -0.2
    - Term 47586 - 47647 0.7
37 18 Tu 1 . - CDS 47668 - 49068 1066 ## BT_1839 hypothetical protein
    - Prom 49160 - 49219 6.4
    - Term 49455 - 49514 11.7
38 19 Tu 1 . - CDS 49679 - 51043 1683 ## COG0124 Histidyl-tRNA synthetase
    - Prom 51122 - 51181 8.1
    - Term 51127 - 51174 8.7
39 20 Tu 1 . - CDS 51195 - 51875 618 ## COG2738 Predicted Zn-dependent protease
    - Prom 51900 - 51959 4.9
    - Term 51923 - 51966 5.1
40 21 Tu 1 . - CDS 51995 - 53311 1260 ## COG3669 Alpha-L-fucosidase
    - Prom 53334 - 53393 2.4
    - Term 53338 - 53383 2.0
41 22 Op 1 . - CDS 53429 - 54700 1474 ## COG0104 Adenylosuccinate synthase
42 22 Op 2 . - CDS 54697 - 55185 323 ## COG0735 Fe2+/Zn2+ uptake regulation proteins
43 23 Op 1 . - CDS 55349 - 56014 545 ## BT_1845 hypothetical protein
44 23 Op 2 . - CDS 56023 - 58071 2197 ## BT_1846 putative dipeptidyl-peptidase III
    - Prom 58100 - 58159 5.0
    - Term 58092 - 58140 6.2
45 24 Tu 1 .
- CDS 58166 - 58630 494 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 58680 - 58739 5.9 Predicted protein(s) >gi|226332158|gb|ACIC01000162.1| GENE 1 675 - 1547 1104 290 aa, chain + ## HITS:1 COG:mll5862 KEGG:ns NR:ns ## COG: mll5862 COG2086 # Protein_GI_number: 13474882 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, beta subunit # Organism: Mesorhizobium loti # 3 286 1 265 283 179 40.0 6e-45 MSLKIVVLAKQVPDTRNVGKDAMKADGTINRAALPAIFNPEDLNALEQALRLKDAHPGST VTILTMGPGRAADIIREGLFRGADNGYLLTDRAFAGADTLATSYALATAIRKIGECDIII GGRQAIDGDTAQVGPQVAEKLGLTQITYAEEILEVGDGKIKVKRHIDGGVETVEGPLPIV ITVNGSAAPCRPRNAKLVQKYKHAKTVTEKQQGNLDYTDLYDKRDYLNLPEWSVADVNGD LAQCGLSGSPTKVKAIQNIVFQAKESKTISGSDRDVEELIVELLANHTIG >gi|226332158|gb|ACIC01000162.1| GENE 2 1550 - 2569 1083 339 aa, chain + ## HITS:1 COG:CAC2709 KEGG:ns NR:ns ## COG: CAC2709 COG2025 # Protein_GI_number: 15895966 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, alpha subunit # Organism: Clostridium acetobutylicum # 4 335 9 332 336 262 44.0 7e-70 MNNLFVYCEIEEGIIADVSLELLTKGRSLANELNCQLEAVVAGTGLKEIEKQILPYGVDK LHVFDAEGLYPYTSLPHTSILVNLFKEEQPQICLMGATVIGRDLGPRVSSALTSGLTADC TSLEIGDHEDKKEGKVYKNLLYQIRPAFGGNIVATIVNPEHRPQMATVREGVMKKEIVSP AYQGEVIRHDVKKYVADTDYVVKVIERHVEKAKNNLKGSPIIIAGGYGVGSKENFDLLFS LAKELHAEVGASRAAVDAGFAEHDRQIGQTGVTVRPKLYIACGISGQIQHIAGMQESGII ISINNDPDAPINTIADYVINGTIEEVVPKMIKYYKQNSK >gi|226332158|gb|ACIC01000162.1| GENE 3 2576 - 4282 2089 568 aa, chain + ## HITS:1 COG:CC3393 KEGG:ns NR:ns ## COG: CC3393 COG1960 # Protein_GI_number: 16127623 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: Caulobacter vibrioides # 49 445 51 459 603 201 33.0 3e-51 MANFYTEIPELKYHLNNPMMKRICELKERNYRDKDEFDYAPLDFEDALDSYDKVLEITGE ITGEIINANAEGVDEEGPHCANGRVEYASGTKENLDAMVKAGLNGMTMPRRFGGLNFPIT PYTMCAEIVAAADAGFGNIWSLQDCIETLYEFGNADQHSRFIPRVCQGETMSMDLTEPDA 
GSDLQAVMLKATYSEKDGCWLLNGVKRFITNGDADIHLVLARSEEGTRDGRGLSMFIYDK RQGGVDVRRIEHKLGIHGSPTCELVYKKAKAELCGDRKLGLIKYVMALMNGARLGIAAQS VGLSQAAYNEGLAYAKDRKQFGKAIIDFPAVYDMLAIMKGKLDAGRALLYQTARYVDIYK ALDDISRERKLTPEERLEQKKYAKLADSFTPLAKGMNSEYANQNAYDCIQIHGGSGFMME YACQRIYRDARITSIYEGTTQLQTVAAIRYVTNGSYLATIREFETVPCSPEMEPLMSRLK KMADQFEASTNAVKEAQDQELLDFTARRLMEMAADIIMCHLLIQDASKAADLFSKSAHVY LNFAEAEVTKHTNFIKNMDKEDLAFYKK >gi|226332158|gb|ACIC01000162.1| GENE 4 4503 - 6263 1282 586 aa, chain + ## HITS:1 COG:TM0584 KEGG:ns NR:ns ## COG: TM0584 COG0705 # Protein_GI_number: 15643350 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Thermotoga maritima # 165 388 13 235 235 127 34.0 5e-29 MAFRLRPLHEDKLHFADLSNTQILILALEASQKLEWNIEGIALREVIFYVPMDMRSQGEE VTFTIEEGNSGEISVRSQCASVQLVDYGKNRKNIQKLQETMEEIKSTLTPEELAQKANEL EEDLTRPLTEEERRLQAESEKESSFIHFFIPRKGFIATPVLIDINILVFILMAATGAGIL EPSTLALLNWGADFGPLTLTGDWWRAVTCNFVHIGAFHLLMNMYAFIYIGIWLEHLIGTR RMFVSYLLTGLCSAVFSLYMHAETISAGASGSIFGLYGIFLAFLLFHRIERSQRKALLTS ILIFVGYNLIYGIRAGVDNAAHIGGLLSGFLLGFIYVFGERMKKPEAGRTVSIIGELIIF SVFLFSFLSLCRNVPSTYQEIRNEWKSGLVEAYYKEQEEEQKKSASRRVTGSPRKSSTSE QFPYTPMSDEDTWLSCYDAVSKFSCQYPTNWYKITGTKAPTPDSEPPLLKLVNGGSQLTV TSNSYDTQDEFERMKKLLLTLPRNEQGKPSEDYKQSKVNINGLPMTKTTNPLRIGHPDEE GEEMQQTVLLYFQENKRRVFAIVMLVADEKAQADLDAITSSIQIEK >gi|226332158|gb|ACIC01000162.1| GENE 5 6272 - 6691 382 139 aa, chain + ## HITS:1 COG:no KEGG:BT_1808 NR:ns ## KEGG: BT_1808 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 139 1 139 139 264 99.0 6e-70 MKAYIQVKQVGKRKCSIEKMPVDFPTPPASVQELIEAVVCWQVKEYNERLQQSEMLKYLT CEEMEDKAAAGKVGFEANYNGRPAVETEAIINALQSYEDGIFRIFLDDAELGELSSSVQL KEESTLTFIRLAMLSGRLW >gi|226332158|gb|ACIC01000162.1| GENE 6 6722 - 11752 4966 1676 aa, chain + ## HITS:1 COG:no KEGG:BT_1809 NR:ns ## KEGG: BT_1809 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1676 1 1676 1676 
3246 99.0 0 MRITYNDKLNKLLNPYLESLLKEKSSLEPETLTLFDLFMEDLDMGGRYGIFSEILEPRLE EIIREENIESIANLMDGRLNTLFRYMLGDEFAGLFHTYLQLEARCPHTIGYDRRAQRSAH PSNHLDHARDAWLQFLKLRATGFSVEAILKGGNTPEDTEEFSGYMNSSYWLAAQIAQGNE KVIEYLNQVLTSENNVHRLQRWYLHAIAISGHRPLLELEGKLLLAAKLQEGLRQAIVETM DEGCPESYLYLFSIIYNNGLQRFASVKRSIAVCTGIGEQDSSERITNKYVELIWNFLNHP ETARNTLQSKDTVELYLALWSIGFYDTDEIRAIVPDIIRKGAKHQIQTLLYFLRCTQSSQ MNHLISKDAFEVWHDDPKVVAAILPLYMDGLYLSRYSDNKEAPTLSDYFETPEEAVRHYG YLKQVYQSISAKEIYSPYVFPWDCVVLTRSDVVLKMAYIAWMTNDIRLREELCTYLPALE SYNRASYIGIVLARTESKVEQEYVLQSLGDRSADIRDEAYKVLSEMSLSPEQYQHIEELL RFKYSEMRINAINLLMKQPEAQLAASIRRLLSDKNAERRLAGLDMMKSIRNVDFLKDSYQ KLLSTVREIQKPNAKEKILIESLIGDGTEQSPTSNYTRENGFGLYDPALEVSLPPITPDA GFNVKKAFEFIRLGKAKAIFDKLNKYIEKHKEDEFTDKYGEVRLVGNSVLLNWYKHYDGL SELGLPELWQNFYQQEIGSYDKLLMMKFMLASTGAPNEIEEDEDDEFDEEEQEDKEAAIQ SLNTFEPIINKMYAGFTYRGLQKSLRKLTYYRQIEDIIDGLAHEYRNEATYQQFSVNMLL QLLPLLNTKNIFRQYTNKHTWLRDKQEYGAREIVYPIHNNKFVRFWLDAPQHPINDALFT RYFTVRYQLYKLTNYMEHTPELEETDVYLQSMDFAHAWMLGLIPTEEIYRELMGRVNSPT RIKDITSALDERNHSLFHSLTQKVVNRILEIELQRGDSETQVTRLAEELHRVYGAETLIR ILQAFGKDTFIRDSYNWRNTKRGVLSSLLHACYPSPDDDSDTLKSLASQADISHIRLVEA AMFAPQWLELTEKATGWKGLESAAYYFHAHTSECFDDKKKAIIARYTPIAIEDLQEGAFD IDWFKEAYKTIGKERFEVVYNAAKYISLSNTHTRARKFADAVNGKTKAADAKKEIIAKRN KDLLMSYGLIPLGRKADKELLERYQFLQKFLKESKEFGAQRQESEKKAVTIALQNLARNS GYGDVTRLTWSMETELIKEITPYLTPKEIEGVEVYVQVNNEGKPEIKQVRAGKELNSLPP KLKKHPYVEELKAVHKKLKEQHARSRIMLEQAMEDCTRFEENELRKLMKNPVIWPLLRNL VFTSNGRTGFYTDGLLITADGICLPLTPKEELRIAHPTDLYASGNWHAYQKFLFDKAIRQ PFKQVFRELYIPTTEEESATQSRRYAGNQIQPQKTVAVLKGRRWVADYEDGLQKIYYKEN IIATIYAMADWFSPADIEAPTLEFVCFHNRKDYKPMKISEIPPVVFSEVMRDVDLAVSVA HAGSVDPETSHSTIEMRRALVELTLPLFHFSNVRVKGNFAYIEGKLGKYNIHLGSGVIHK AGGAQIAVLPVHSQSRGRLFLPFVDEDPKTAEILTKIIFFAEDDKIKDPSILNQIK >gi|226332158|gb|ACIC01000162.1| GENE 7 11760 - 12230 322 156 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15902812|ref|NP_358362.1| hypothetical protein spr0768 [Streptococcus pneumoniae R6] # 12 150 6 145 165 128 45 6e-29 MAENLLIHTGNKEEKYRELLPQLQALVSSETNRIANLANIAAALKQTFHFFWVGFYMVEG 
NELVLAPFQGPIACTRIRFGRGVCGTAWKEAQTLIVPDVELFPGHIACSSDSRSEIVVPI IKEGNVIGVLDIDSDTTDSFDETDARYLEEICTYIR >gi|226332158|gb|ACIC01000162.1| GENE 8 12343 - 12657 298 104 aa, chain - ## HITS:1 COG:no KEGG:BT_1811 NR:ns ## KEGG: BT_1811 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 104 4 107 107 182 100.0 2e-45 MQTELIIVSEYCQKCHIEPSFIEMLEEGGLINVRTEAGKHYLLVSELPNVERYSRMYYDL SINMEGIDAIHHMLERMDDMRREITSLRKQLLLFREREIEEADW >gi|226332158|gb|ACIC01000162.1| GENE 9 12672 - 13637 1201 321 aa, chain - ## HITS:1 COG:slr0093 KEGG:ns NR:ns ## COG: slr0093 COG2214 # Protein_GI_number: 16331768 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone # Organism: Synechocystis # 3 319 6 314 332 177 38.0 3e-44 MAYIDYYKILGVDKSASQDDIKKAFRKLARKYHPDLNPNDPSAKDKFQEINEANEVLSDP EKRKKYDEYGEHWKHADEFEAQKKARQQAGGFGGAGGFGGFGGGGQGFSDGDGTYWYSSD GGGGFSGGNAGGFSDFFESMFGHRGRGQSSAGFRGQDFNAELHLSLRDAAQTHKQILTVN GKQVRITIPAGVADGQVIKLKGYGAEGVNGGPAGDLYITFVIAEDPVFKRLGDDLYVDVE VDLYTAVLGGDKVVDTLDGKVKLKIKPETQNGTKVRLKGKGFPVYKKEGQFGDLIVTYSV KIPTNLTEAQKELFRQLQSMN >gi|226332158|gb|ACIC01000162.1| GENE 10 13830 - 15533 1255 567 aa, chain - ## HITS:1 COG:RC0454 KEGG:ns NR:ns ## COG: RC0454 COG2194 # Protein_GI_number: 15892377 # Func_class: R General function prediction only # Function: Predicted membrane-associated, metal-dependent hydrolase # Organism: Rickettsia conorii # 177 504 173 498 522 150 29.0 5e-36 MKLFKNIKKWFENQEHLFYLFLIILIMPNIVLCFTEPLPLMAKVSNVLLPLACYYLLMTL SRNCGKMLWILFIFLFFGAFQIVLLYLFGQSIIAVDMFLNLVTTNSSEAMELLDNLLPAI VTVVILYIPALILGMISIIRKRRLSVTFIRRERKRVFILLGISLLALGGAYWEDSRYELK SDLYPLNVCYNVGLAFQRTALTQNYHHTSKDFTFHAQATHPADKREVYVMVVGETSRALN WQLYGYERETNPLLSRQPGIVAFSKVLTESNTTHKSVPMLMSDITAYNFDSIYHQKGIIT AFKEAGYQTAFFSNQRYNHSFIDFFGKEADTFDFIKEDSLDSAYNPSDDELLALVAKELS KGNQKQFIVLHTYGSHFNYRERYPSEDAFFLPDYPVEAEVKYRDNLVNAYDNTIRYTDSF LSRLIHMLEEQHVDAAMLYTSDHGEDIFDDSRHLFLHASPVPSYYQLHVPFLIWMSDTYR 
EAYPEHWQTVTGNKDKDVSSSCSFFPTMLELGGVQTSYRNDSSSVVSPLYTMKPRVYLND HNEPRPLDDLGMKQPDFQKCQVLGIKY >gi|226332158|gb|ACIC01000162.1| GENE 11 15520 - 16716 786 398 aa, chain - ## HITS:1 COG:no KEGG:BT_1814 NR:ns ## KEGG: BT_1814 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 398 1 398 398 763 99.0 0 MMDARRNGLVLLALLLLTQTTISAQTDSMGKDVPIDSVSVATSDSLQTKRSFFKKFLDYF NDANKEKKNKKFDFSVIGGPHYSSDTKFGLGLVAAGLYRTDRNDSILPPSNVSLYGDVST VGFYLLGVRGNHLFPQEKYRLNYNLYFYSFPSLYWGRGYDNGANSDNESDYKRFQAQVKV DFMFRVAKSFYIGPMAVFDYIDGRDFEKPELWEGMSARTTNTSLGLSLLYDSRDFLTNAY QGYYLRIDQRFSPAFLGNKYAFSSTELTTSYYHPVWKGGVLAGQFHTLLNYGNPPWGLMA TLGSSYSMRGYYEGRYRDKCAMDAQIELRQHIWKRNGVAVWAGAGTVFPRFSDFTSKHIL PNYGFGYRWEFKKRVNVRLDLGFGKHQTGFIFNINEAF >gi|226332158|gb|ACIC01000162.1| GENE 12 16807 - 17820 794 337 aa, chain - ## HITS:1 COG:RSc0808 KEGG:ns NR:ns ## COG: RSc0808 COG2008 # Protein_GI_number: 17545527 # Func_class: E Amino acid transport and metabolism # Function: Threonine aldolase # Organism: Ralstonia solanacearum # 1 328 1 330 345 256 39.0 4e-68 MRSFASDNNSGVHPLVMEALSRANADHALGYGDDRWTEEAVAKIKEIFTPDCVPLFVFNG TGSNVVALQVMTRPYHSIFCAETAHIYVDECGSPVKMTGCQIRPIATTDGKLTPELMQPY LHGFGDQHHSQPRALYISQCTELGTIYTPEELKRLTDFAHLNGMYVHMDGARIANACAAL HLSLKELTVDCGIDILSFGGTKNGLMMGECVIIFNKDLQREARFVRKQSAQLASKMRYLS CQFTAYLTDDLWLKNAAHANAMASKLYQALKELPEVTFTQKPESNQLFLTMPRPTIDRML ESYFFYFWNEANHEIRLVTSFDTTEEDVNQFIRLLRR >gi|226332158|gb|ACIC01000162.1| GENE 13 17873 - 18247 384 124 aa, chain - ## HITS:1 COG:no KEGG:BT_1816 NR:ns ## KEGG: BT_1816 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 124 1 124 124 186 100.0 2e-46 MEEERKINENTGNMLKQALERRQTERLSSNFSYRMMERVHQEAEKQTKRKTRIGWAALLI SALALVGLGVYVLTFYLEFNFADVMPQMNVRQDSSLFAFYVYIALLALVLLGLDYWLRKK YIWK >gi|226332158|gb|ACIC01000162.1| GENE 14 18258 - 18806 496 182 aa, chain - ## HITS:1 COG:BS_sigW KEGG:ns NR:ns ## COG: BS_sigW COG1595 # Protein_GI_number: 16077241 # Func_class: K Transcription # 
Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus subtilis # 10 175 9 180 187 84 31.0 1e-16 MEQKDESYYIERILDGETEYFSVFLDRYSRPLYTLVVQIVGCPEDAEELLQDIFLKAFRN LNRYKGECRFSTWIYRIAYNAAISATRKKKQEFLYIEENTINNVPDEMADNVLAPAETEE QLERLEMAIDQLSGEEKALITLFYYEEKSMEEIGEVLKLSISNVKVRLHRTRKKICVLMN NK >gi|226332158|gb|ACIC01000162.1| GENE 15 18962 - 19378 423 138 aa, chain + ## HITS:1 COG:no KEGG:BT_1818 NR:ns ## KEGG: BT_1818 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 138 1 138 138 231 96.0 5e-60 MMDFITIPLVVGTITLGIYKLFELFVCRKERIAMIEKLADRVNTGEINSNLSLNLNYSRS RFTFGSLKSGLLMLGIGLGLLVAFFICVNSFPGYTASKNWEVERQASVVYGACVLLFGGA GLLAAFLIEMKIQKKEKE >gi|226332158|gb|ACIC01000162.1| GENE 16 19462 - 19680 246 72 aa, chain - ## HITS:1 COG:no KEGG:BT_1819 NR:ns ## KEGG: BT_1819 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 72 1 72 72 132 98.0 3e-30 MDRKQIGVYAGKVWQLLSNNEKWGYGTLKRKSGLKDKELGAALGWLSRENKIEFDQCDEE LYVYLCVNVYIG >gi|226332158|gb|ACIC01000162.1| GENE 17 19917 - 21656 1693 579 aa, chain + ## HITS:1 COG:STM0935 KEGG:ns NR:ns ## COG: STM0935 COG0028 # Protein_GI_number: 16764297 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Salmonella typhimurium LT2 # 1 572 1 570 572 555 48.0 1e-158 MAKKIAEQLIDTLVESGVERIYAVTGDSLNEVNEAVRKNNKIQWIHVRHEETGAYAAAAE AQLTGRIGCCAGSSGPGHVHLINGLYDAQRSGAPVIAIASTIPSGEFGTEYFQETNTIKL FNDCSYYNEVATTPGQFPRMLQSAIQTAITRKGVAVVGLPGDLAKASSVSVDSSVKNYPA PPEVCPAEEDLIQLAELLNNHKRITLFCGIGCRGAHEEVIALSEKLNAPVVYTFKGKMEV QYENPYEVGMTGLLGMPSGYYSMHEAEVLLMLGTDFPYSAFLPDDIKIAQIDIKPERLGR RAKVDIGLCGDVRMTIQALLRMLDPKTDDTFLLKQLKRYEGVKKDLAAYTENKGDIKKIH PEYVMSEIDKLASDDAIFTVDTGMTCVWGARYLQATGKRHMLGSFNHGSMANALPQAIGA 
ALAYPDRQVVALCGDGGLSMTLGDLETVVQYKLPIKIIVFNNRSLGMVKLEMEVDGLPDW QTNMLNPDFAQVAEAMGMTGFNVSNPEEVLTTLLNAFELDGPVLVNIMTDPNALAMPPKI ELGQMVGFAQSMYKLLINGRSQEVIDTINSNYKHIREVF >gi|226332158|gb|ACIC01000162.1| GENE 18 21835 - 22752 644 305 aa, chain + ## HITS:1 COG:YPO2801 KEGG:ns NR:ns ## COG: YPO2801 COG4984 # Protein_GI_number: 16122999 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Yersinia pestis # 18 147 49 175 735 85 45.0 1e-16 MEKPDSSLLFRQPMYADKKQWKQFLSIFLLAAGVGFTVAGIIFFFAYNWEDLPKFAKLGI VQTLLVASVLLTVFTRWNILIKQIILTGATFLVGTLFAVFGQIYQTGADAYDLFLGWTLF TILWAVAIRFTPLWLTFIGLLCTTIWLYAMQIVPDNQWAVTLLTSAVTWICASATVVTEW MSIKGTLSRQNRWFVSLLSLATIVHVTYLMMAVICEKDAIVSIPLTSTVLLFSAGLWFGW RQRNLFYLSAIPFAILMILLSLFICHSNLRDVNIFLLSGIIVITGTTLLIYAILHLKKQW YGTEE >gi|226332158|gb|ACIC01000162.1| GENE 19 22736 - 23692 518 318 aa, chain + ## HITS:1 COG:no KEGG:BT_1824 NR:ns ## KEGG: BT_1824 # Name: not_defined # Def: putative permease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 318 1 318 318 479 97.0 1e-134 MAQKSNLTIEVLSIIGGVLTAIFFLGFLALSSILRSETSCLIVGSILIITTLFVNRLLTK PFLDAMNITCYIAGCILAGYGMNRNMDVLFIVLIGISVVTMLLSKGFILTFLSVISFYMA LFGEITNLFSSLNPLNVAAVPIIAIFLFVNLSETKILSYTNGDFSKYKPIHSGLFVSCVL SLAGLSVNYLTKSTNDWIISVFMLVGILLMVYKIMQVMQVKSPVHQVCIYLLCILICLPS LHAPYLSGSILLILICFHYGYKAESAVALLLFIYSISKYYYDLDITLLTKSITLFFTGIV LLIAWYIFTQKKTRHEKI >gi|226332158|gb|ACIC01000162.1| GENE 20 23679 - 24182 408 167 aa, chain + ## HITS:1 COG:YPO2802 KEGG:ns NR:ns ## COG: YPO2802 COG4929 # Protein_GI_number: 16123000 # Func_class: S Function unknown # Function: Uncharacterized membrane-anchored protein # Organism: Yersinia pestis # 8 162 7 174 176 96 36.0 2e-20 MKKYSRILIIANLILLLGYFNWSVYQKEQTLKEGQLVLLQLAPVDPRSLMQGDYMRLNYK EANSELINRQEAKRGYAVLKLDKNHVGEIIRLQESLEPVNENEIVLKYKTINGRLFLGAE SFFFEEGQDSVYNRAAYGGLRVDNKGQSLLVGLYDGDFRQIKPYRGK >gi|226332158|gb|ACIC01000162.1| GENE 21 24564 - 25655 853 363 aa, chain - ## HITS:1 COG:VC2213 KEGG:ns NR:ns ## COG: VC2213 COG2885 # Protein_GI_number: 
15642211 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Vibrio cholerae # 210 363 161 314 321 59 33.0 8e-09 MKRNVLIVFAILCTTMAFAQSEEQRIKEPGKIVFHPHWFIQTQIGAAHTVGEAKFADLIS PAAALNVGYKFAPAFGARVGVSGWQAKGGWINPEQVYQYKYLQGNVDIIADFSTLFCGFN PKRVFNGYLFAGAGLNRGFDNDEANALDTRTYEMEYLWQEGKFLVTGRLGLGCDLRLNDR LSINIEANANALSDKFNSKKAGNCDWQLNALVGLSIKLGKSYTKTAPIYYEPEPVVVEQP KPAPVVEQPQPKKEVVAEPMKQNIFFALNSAKIQDDQQAKLVSLIEYLETHPAAKISITG YADVNTGNPKINSKLSEKRAVNVAEALKVKGIAADRIKIDFKGDTVQPYATPEENRVSIC IAE >gi|226332158|gb|ACIC01000162.1| GENE 22 25755 - 28814 2986 1019 aa, chain - ## HITS:1 COG:no KEGG:BT_1502 NR:ns ## KEGG: BT_1502 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 235 1019 194 944 945 261 31.0 9e-68 MKRKFVKVMFFGALALSTVTYVGCKDYDDDINAINERIDANDAKLAEIQKVLDSGQWVKD VKPIANGIQVILGNEKTYEITNGTNGKDGIVYTIGDDGYWYKGTEKTQWKAAGDTPYIKD GNWWIGNTDLKVQAEGKDGNSTNGTDGLTPYIKDGNWWIGEGDAAKNLGVQAAADIYTPD ASGIWYVNGKPTDPAQSWKAEGTITAAVNEDGNIVISGMKEYPEGFVLKTNTAVLSSLVF EPQSYLGGIQAMKAISLNYNEWTVKTSDATPTKTGEVWQRNTTKVNVLTPKIVAYYHVNP QNISESQIVEKSVHYVADNKEVVTRSQEFNPTVTSYKLVDKEIDGKNCRMLKVTFSATSE EIKALDAKEVSTLALQVSVQLKDEKEPRVITSDYAAIYKTNWSDFVLAFKDKVNNNATND GHLYGADDNDSKIGQALEAIEAEANFALTATVDAKDPKKLVLKDKFTFHYKETADKTDAD PSEEKVFEGDLADLDLEIAFAQSNYISGDNKTPQNAFLKLEDGEAISTVYGEPGVAAADR QPMVRVMLREIATKKVLNAGWVKINIVKDDVDGKEIDYPKHEEIYVGCDDAAVTTDVQFM NVDVYNALGLKKDEFYAIYKLNDAKTNPAIGDNSVGKVVELQDPEGTQTILVDWTIDNAD LLAAKEGDTFTKKVYYSATGRKDVVIILTTGKVHKASGTIGAKNTNLWSGNSIEIDAKEP IATNKATLADFDFYSVFMGNKIVTTVDSKFTDFQPTQLNDPKFVFAEENGVVTAKQKITI DGVDYYVYRNDAGDKLMGAKGTGVAEEVAVIVGSKVTYNKTAVAKALLNKSAYNENPFTA TIEVVITNKGCNTKLPLTNGKFDAKFLRPVNIMAGPVSTIQDAETGGAKVALNQLISLTD WRGHDFEMGTNNFYTYYGISAFAIATDNIKTNMGQTDPNKFVLWSEVAPLYPNVVTYAAG TDTSNNLTAIENGDFGMVTYDNKGVTVKTSFKLQIPVTVSYPYGDVTKVITVTVLPTHG >gi|226332158|gb|ACIC01000162.1| GENE 23 29188 - 30126 353 312 aa, chain - ## HITS:1 COG:no KEGG:BT_1827 
NR:ns ## KEGG: BT_1827 # Name: not_defined # Def: transposase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 312 1 312 312 591 98.0 1e-167 MGNLISFFKKVADGLRAEGNFGTAHVYRSTLNIVTTFHGSKYLDFHQVNPEWLKDFEVYL RSRGSSWNTVSTYLRVLRAVYNRAVDLRKAEYIPHLFRSVYTGTRADRKRALEDEDMRKV FSRLSKSSTIPSDLRHTKELFILMFLLRGLPFVDLAYLRKSDLRGNVITYRRRKTGRSLS VTLTPEAMLLLEKHISRDLSSPYLFSILHSIEGSQEAYREYQLALRSFNQHLELLGQWLG LESKLSSYTARHTWATTAYYCEIHPGIISEAMGHSSITVTETYLKPFRSRKIDEANNRIV DFVKRSIGGIIV >gi|226332158|gb|ACIC01000162.1| GENE 24 30832 - 32394 1091 520 aa, chain - ## HITS:1 COG:no KEGG:BT_1828 NR:ns ## KEGG: BT_1828 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 519 1 519 520 1005 98.0 0 MSSKIYPIGIQNFEKIRKDGYFYIDKTALIYKLVKTGSYYFLSRPRRFGKSLLLSTLEAY FKGRKELFEGLAMESLEKDWIEYPVIHLDLNAKKFDTENDLIRLIDRQLLVYEAQYGSCS SDETIDDRLVTLIRLAAEKTGQRVVILVDEYDKPMLQAIGRDELQEEYRNTLKAFYGVMK SMDGYIKFAMLTGVTKFGKVSVFSDLNNLNDISMWNQYIDICGVSDQELHDNLEVELSEF ADARGMTYNEICVALREYYDGYHFTHNSIGMYNPFSLLNAFLRNEFGSYWFETGTPTYLV ELLKKHHYDLQRMAHEETSAEVLNSVDSTSDNPIPVIYQSGYLTIKGYDERFGIYRLGFP NREVEEGFVKFLLPFYANTNAVESSFEIQKFVREIEAGDYDSFFRRLQIFFADTPYELIR DLELHYQNVLFIVFKLIGFYVKAEYHTSGGRIDLVLQTEKFVYVMEFKLDGTAEEALLQI NEKHYAQPFESNDRELFKIGVNFSAKTRNIEKWVVEKGGK >gi|226332158|gb|ACIC01000162.1| GENE 25 32560 - 34197 1673 545 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 [Haemophilus parasuis 29755] # 2 545 3 547 547 649 61 0.0 MAKEILFNIDARDQLKKGVDALANAVKVTLGPKGRNVIIEKKFGAPHITKDGVTVAKEIE LADAYQNTGAQLVKEVASKTGDDAGDGTTTATVLAQAIVAEGLKNVTAGASPMDIKRGID KAVAKVVESIKAQAETVGDNYDKIEQVATVSANNDPVIGKLIADAMRKVSKDGVITIEEA KGTDTTIGVVEGMQFDRGYLSAYFVTNTEKMECEMEKPYILIYDKKISNLKDFLPILEPA VQTGRPLLVIAEDVDSEALTTLVVNRLRSQLKICAVKAPGFGDRRKEMLEDIAILTGGVV ISEEKGLKLEQATIEMLGTADKVTVSKDYTTIVNGAGVKENIKERCDQIKAQIVATKSDY DREKLQERLAKLSGGVAVLYVGAASEVEMKEKKDRVDDALRATRAAIEEGIIPGGGVAYI RAIDSLEGMKGDNADETTGIGIIKRAIEEPLREIVANAGKEGAVVVQKVREGKGDFGYNA 
RTDVYENLHAAGVVDPAKVARVALENAASIAGMFLTTECVIVEKKEDKPEMPMGAPGMGG MGGMM >gi|226332158|gb|ACIC01000162.1| GENE 26 34243 - 34515 396 90 aa, chain - ## HITS:1 COG:RC0969 KEGG:ns NR:ns ## COG: RC0969 COG0234 # Protein_GI_number: 15892892 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Co-chaperonin GroES (HSP10) # Organism: Rickettsia conorii # 1 89 5 98 99 100 59.0 7e-22 MNIKPLADRVLILPAPAEEKTIGGIIIPDTAKEKPLKGEVVAVGHGTKDEEMVLKVGDTV LYGKYAGTELEIEGTKYLIMRQSDVLAILG >gi|226332158|gb|ACIC01000162.1| GENE 27 34778 - 35806 908 342 aa, chain + ## HITS:1 COG:no KEGG:BT_1831 NR:ns ## KEGG: BT_1831 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 342 7 348 348 679 96.0 0 MNKISFGISIAFCVLYSCTSHTGHKTSEETLQADSLAQDTIAETVAEPVVKKITPEEIQI TKELLYDKYTLEDTYPYKDTTRSFQWDKIKEQLALLENIQLQPSQWAILQNYKNRNGEAP LVKNFKRNAYGRVADTLGVERYQSVPLYLLTDTIAPERYGQDGELTRFIEDGENFVKAEP IFTEGEWMIPRKYVKVIGDTVVFNKAIFVDRHNQNITALERTEKGKWVVRSMNPSTTGLH RPPYAQETPLGMFVLQEKKVKMIFLKDGSKETGGYAPYASRFTDGAYIHGVPVNEPRKTQ IEYSWSLGTTPRSHMCVRNATSHAKFIFDWAPVNETIIFVLE >gi|226332158|gb|ACIC01000162.1| GENE 28 35850 - 36566 501 238 aa, chain + ## HITS:1 COG:no KEGG:BT_1832 NR:ns ## KEGG: BT_1832 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 24 238 1 215 215 402 98.0 1e-111 MLQKVILFLSFLFIPFSHVSVDTMAVPASDSPTPVSELAAGEQLFEEMELGGIVNFIAFR QAVAGYNRIKEKSKPILTLIDFSKPSTEKRFFVFDMEKKQLLFSSVVSHGRNSGGNYATS FSNQNGSYKSSLGFYLTENTYQGGNGYSLILNGLEKGINDKAKERSIVVHGASYANPTVA ASGRLGRSLGCPALPTKLAKPIINTIKDGSVMFIYANNSSYLAQSDILKNVSAPMPAL >gi|226332158|gb|ACIC01000162.1| GENE 29 36621 - 39269 2124 882 aa, chain - ## HITS:1 COG:slr2098_3 KEGG:ns NR:ns ## COG: slr2098_3 COG0642 # Protein_GI_number: 16330584 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 626 876 10 267 280 187 41.0 6e-47 
MEEKEKAWKTVSSKYLFRRPWLTVRCEDMLLPNGNHIPEYYILEYPDWVNTIAITKDGKF VFVRQYRPRLGRTSYELCAGVCDKEDASPLVSAQRELWEETGYGKGNWQEYMVISANPST HTNLTHCFLATNVEPIDHQHLEDTEDLSVHLLTFEEVKQLLENNEIMQSLNAAPLWKYVA EHAADFEQKSVEKVEYEKTETISHRINDLLYYQNSISRSLSRFLKDEEVETGIYEILKDI LAFYRAGRAYIFETDEENHFYNCTYEVVAEGVKAEINKLQEIPVDFMPWWTSQILGKKPI LFETLKPMPGMGQGEYEVLSRQGIKALMATPLVVNDHVYGFMGVDLVDGSASWSDEDYRW LSSLANVISICLELRRAKEKVIFEQAALARSERLFKNIFANIPAGVEIYDKEGNLVDLNE RDMDILGISDKSEVIGLNFFENPNVDAQILESIRKSSITDFRARYSFECARHTGYYRPLK AGVIELYTKVRKLYDNHGNLTGYILINMDNTERIDALKPISDFENLFLLISDYAKVGYAK LNLLNRQGYAIKQWFKNMGEDENTPLSDVVGVYSKMHPDDRSRMLAFFEEAKKGKAKAFK GEMRILRPGTKNEWNWVRTNVVLNLYEPEKGQVELIGVNYDITALKETEAKLIEAKEKAE ESDRLKSAFLANMSHEIRTPLNAIVGFSSLMVDTEDMEERRQYMDIVEENNDLLLQLISD ILDLSKIEAGTFDFTEREVDVNLLCEDIVLAMRMKARPNVEILFDRHLPECRIMSDRNRL HQVISNFVNNALKFTEEGNIRVGYDQLDEAHLRFYIADTGIGIEPEMQNEIFERFVKLNS FVHGTGLGLSICRSIVEQLGGEIGVDSEPEKGSCFWFTLPIK >gi|226332158|gb|ACIC01000162.1| GENE 30 39348 - 39509 160 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253571978|ref|ZP_04849383.1| ## NR: gi|253571978|ref|ZP_04849383.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 53 1 53 53 73 100.0 3e-12 MSSSMLEQRSRFAMIGALMIIISLMFWYYMGSSLVSATKKYLEQVQTIEITCE >gi|226332158|gb|ACIC01000162.1| GENE 31 39445 - 39612 84 55 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MMIINAPIIAKRLLCSNILLLISLLLSYFMFYKDNTEMLQKIDKNVSKTLNIEFI >gi|226332158|gb|ACIC01000162.1| GENE 32 39766 - 41235 1204 489 aa, chain + ## HITS:1 COG:CAC3230 KEGG:ns NR:ns ## COG: CAC3230 COG4624 # Protein_GI_number: 15896476 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Clostridium acetobutylicum # 173 411 95 340 450 156 37.0 1e-37 MAFTNNIMIVRHKLLADLVRLWKNDQLVEKIDRLPIELSPRKSKPLGRCCVHKERAVWRY KTFPLMGLDMTDEHDEVTPLSEYARMALNRPEPSKENIMCVIDEACSSCVQINYEITNLC RGCVARSCYMNCPKDAIRFKKNGQAMIDHDTCVSCGICHKSCPYHAIVYIPVPCEESCPV KAISKDEHGVEHIDESKCIYCGKCMNACPFGAIFEISQTFDVLQHIKKGEKLVAIIAPSI LGQFKTSIEQVYGAFKEIGFTDIIEVAEGAMMTTSNEAHELLEKLEEGQKFMTTSCCPSY IELVEKHMTEMKPYVSTTGSPMYYAARIAKKKHPDAKIVFVGPCVAKRKEVRRDDAVDYI LTFEEVGSILDGMDIQLEQVNSFSILHTSVREAHGFAQAGGVMGAVKAYLKEEADKINAI QVSDINKKNIALLRACAKTGKAAGQFIEVMACEGGCITGPSTHNDIVSGRRQLAQELLKR KESYETMDR >gi|226332158|gb|ACIC01000162.1| GENE 33 41216 - 42253 780 345 aa, chain + ## HITS:1 COG:CAC1631 KEGG:ns NR:ns ## COG: CAC1631 COG0502 # Protein_GI_number: 15894909 # Func_class: H Coenzyme transport and metabolism # Function: Biotin synthase and related enzymes # Organism: Clostridium acetobutylicum # 5 345 8 343 350 262 40.0 9e-70 MRQWIDKLQEERTLQPEEFRQLLTECDGESLRYINKQAQEVSLRHFGNRIFIRGLIEVSN CCRNNCYYCGIRKGNPNLERYRLSTENILNCCKQGYDLGFRTFVLQGGEDPVLTDERIED IVSTIRRSYPDCAITLSLGEKSREAYERFFQAGANRYLLRHETYDKEHYQQLHPTGMSCE HRLQCLRDLKDIGYQTGTGIMVGSPGQTIKHLIQDILFIEQLRPEMIGIGPFLPHGETPF AQSPSGTVEQTLLLLSIFRLMHPSALIPATTALATLTPDGRERGILAGANVVMPNLSPQE ERKKYNLYNNKASLGAESVEGLNILQQQLEKIGYQISFSRGDYKQ >gi|226332158|gb|ACIC01000162.1| GENE 34 42258 - 43676 1219 472 aa, chain + ## HITS:1 COG:CAC1356 KEGG:ns NR:ns ## COG: CAC1356 COG1060 # Protein_GI_number: 15894635 # Func_class: H Coenzyme transport and 
metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Clostridium acetobutylicum # 1 472 1 472 472 723 75.0 0 MYKIDSPQAEEFIHHEEILETLEYARNHKNNRAFIEQLIEKAALCKGLTHREAATLLECD QPDLIERIFHLAKEIKQKFYGNRIVMFAPLYLSNYCVNGCVYCPYHAKNKTIARKKLTQE EIRKEVIALQDMGHKRLALEAGEHPTLNSLEYILESIRTIYSIRHKNGAIRRVNVNIAAT TVENYRRLKDAGIGTYILFQETYHKKNYEALHPTGPKSNYAYHTEAMDRAMEGGIDDVGM GVLFGLNTYRYDFVGLLMHAEHLEARFGVGPHTISVPRICSADDINAGDFPNAISDDIFS KIVAVIRIAVPYTGMIISTRESQESRKKVLELGISQISGGSRTSVGGYAETELPEDNSAQ FDVSDTRTLDEVVNWLLESGYIPSFCTACYREGRTGDRFMSLVKSGQIANCCGPNALMTL KEYLEDYASEDTRIKGMKLIAKETDRIPNPKIREIAIRNLKDIAEGKRDFRF >gi|226332158|gb|ACIC01000162.1| GENE 35 43740 - 44957 787 405 aa, chain + ## HITS:1 COG:CAC1651 KEGG:ns NR:ns ## COG: CAC1651 COG1160 # Protein_GI_number: 15894928 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Clostridium acetobutylicum # 3 400 4 391 411 323 43.0 5e-88 MNLVHTPNANRLHIALFGKRNSGKSSLINALTGQDTALVSDTPGTTTDPVQKAMEIHGIG PCLFIDTPGFDDEGELGNRRIERTWKAVEKTDIALLLCAGGSSAEETGEPDFTEELHWLE QLKAKNIPTILLINKADIRKNTASLAIRIKETFGSQPIPVSAKEKTGIELIRQAILEKLP EDFDQQSITGNLVTEGDLVLLVMPQDIQAPKGRLILPQVQTMRELLDKKCLIMSCTTDKL RETLQALSRPPKLIITDSQVFKTVYEQKPEESRLTSFSVLFAGYKGDIRYYVKSASAIGS LTESSRVLIAEACTHAPLSEDIGRVKLPHLLRKRIGEKLSIDIVAGTDFPQDLTPYSLVI HCGACMFNRKYVLNRIERARLQNVPMTNYGVAIAFLNGILNQIEY >gi|226332158|gb|ACIC01000162.1| GENE 36 45046 - 47583 2152 845 aa, chain + ## HITS:1 COG:PAB1300 KEGG:ns NR:ns ## COG: PAB1300 COG1506 # Protein_GI_number: 14521796 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Pyrococcus abyssi # 484 830 291 629 631 168 32.0 3e-41 MKKRTMALAIFLATGVTMQAQDRLTQYKVRNAISVRTPIMNDSINPKGEKHTKKMLLQTP VVLHLPDAPMQSLTADTAGYLSFEKADKDNKLYLVKTQIRAERFLKGKLKITSPVRWEVF IDGASKQVKDAAEDSITSGSSRDIALSLEPERDYEIIIKLLSASDDKAAPTLKCELIKDE KFKDIACNLDPEAKKRFSLDNTVYGNRAISVSISPSGKYLLTRYWDNHAAKRSRTYCELT 
ELKSGKVLLTNLRDGMSWMPKSDKLYYTVTALTGNDVITLDPATLREETILQSIPEQSFR WSPNEDFLIYYPREEGVKEDGPLRRIVSPADRIPNSRGRSFLAKYNLADGVSERLTYGNH STYLQDISPDGKYMLYSTSKENITQRPFSLSSLYQVDLETLKVDTLFYEDRFIGGAGYSP DGKQLLLTGSPEAFGGIGKNCGNHPIANDFDTQAFIMDLSTRKIQPITKDFHPTVSPLQW NHGDGCIYFNTDDGDCKNIYRYSPKDGKFEKLNLETDVTSAFALSEYNPGIAAYIGQSDY NAGVAYLYDMKKKTSSLLADPMKPILEKIELGKTEPWNFTASDGTVITGKMCLPPNFDPN KKYPMIVYYYGGTTPTTRGISNPYCAQLFASRDYVVYVIQPSGTIGFGQEFSARHVNAWG KRTADDIIEGTKQFCKEHPFVDEKKIGCLGASYGGFMTQYLQTQTDIFAAAVSHAGISDV TSYWGEGYWGYSYNAIAAADSYPWKDPDLFTKQGSLFNADKINTPLLLLHGTVDTNVPIG ESIQLFNALKILGKTVEFVTVDGENHFISDYDKRIKWHNSIMAWFARWLQDQPEWWKEMY PERHL >gi|226332158|gb|ACIC01000162.1| GENE 37 47668 - 49068 1066 466 aa, chain - ## HITS:1 COG:no KEGG:BT_1839 NR:ns ## KEGG: BT_1839 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 466 10 475 475 930 96.0 0 MKTIRNTLFLGIICAASIASSTAHELETNENSLKKSVKAQTRKEIVTFFSESLSGKTTTF DAGKPLKLSEVEHYRDVVWQLWKEANDGFKEEKLIALDTLSSASTGYWELPDSLEPNAVM PYYWGMKGTDEAGNELPLFLYLHGSGPKEREWSTGLNICRNFDDAPSVYFIPQIPNEGEY YRWWQKAKQFAWEKLLRQSLLLGKIDPNRVYVFGISEGGYGSQRLASFYADYWAAAGPMA GGEPLKNAPAENCSNIAFSLRTGDKDTGFYRNTLTGYVREAFDSLAHQHPGYFTHKIELI PGMGHSIDYRPTTPWLKQYVRNPYPKYVSWENFEMDGLYRKGFYNLYVKERSDEEGKSRT YYEMSISGNHISLKVDDVVYEATEKDQRWGIEMKFAKKYAQVHKGKVVIYLCDELVDLTE KVTLTVNGKKVFEGKVKADLKNMVNSCAVFFDPQRLYPAAIEVELK >gi|226332158|gb|ACIC01000162.1| GENE 38 49679 - 51043 1683 454 aa, chain - ## HITS:1 COG:HP1190 KEGG:ns NR:ns ## COG: HP1190 COG0124 # Protein_GI_number: 15645804 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Histidyl-tRNA synthetase # Organism: Helicobacter pylori 26695 # 5 440 4 422 442 247 33.0 4e-65 MAAKPSIPKGTRDFSPVEMAKRNYIFNTIRDVYHLYGFQQIETPSMEMLSTLMGKYGDEG DKLLFKIQNSGDYFSGITDEELLSRNAVKLASKFCEKGLRYDLTVPFARYVVMHRDEITF PFKRYQIQPVWRADRPQKGRYREFYQCDADVVGSDSLLNEVELMQIVDTVFSRFNIRVCI KINNRKILSGIAEIIGEADKIVDITVAIDKLDKIGLENVNAELKEKGISDEAIAKLQPII LLSGTNTEKLATLKSVLAASETGMKGVEESEFILGTLETMGLKNEIELDLTLARGLNYYT 
GAIFEVKALDVQIGSITGGGRYDNLTGVFGMAGVSGVGISFGADRIFDVLNQLELYPKEA VNGTELMFVNFGDKEAAFSMSMLAKVRAAGIRAEIFPDAAKMKKQMSYANAKSVPFVAIV GENEMNEGKAMLKNMETGEQNLVSVEELIAALRN >gi|226332158|gb|ACIC01000162.1| GENE 39 51195 - 51875 618 226 aa, chain - ## HITS:1 COG:BH1677 KEGG:ns NR:ns ## COG: BH1677 COG2738 # Protein_GI_number: 15614240 # Func_class: R General function prediction only # Function: Predicted Zn-dependent protease # Organism: Bacillus halodurans # 2 226 1 223 224 177 42.0 2e-44 MMSYWVLFIGIAVASWLVQMNLQNKFKKYSKIPTGNGMTGRDVALKMLHDNGIYDVQVTH TPGQLTDHYNPANKTVNLSEGVYESNSIMAAAVAAHECGHAVQHARMYAPLKMRSALVPV VNFSSSIMTWVLLGGILLFNSFPQLLWIGIILFATTTLFSFITLPVEINASKRALVWLSS SGITNSYNHTQAEDALRSAAYTYVVAALGSLATLIYYIMIAMGRRD >gi|226332158|gb|ACIC01000162.1| GENE 40 51995 - 53311 1260 438 aa, chain - ## HITS:1 COG:TM0306 KEGG:ns NR:ns ## COG: TM0306 COG3669 # Protein_GI_number: 15643075 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Thermotoga maritima # 28 433 8 430 449 136 28.0 8e-32 MKTHFITFLLLVGMSLGISSRLHAQSSYQPGEENLKAREEFQDNKFGIFLHWGLYAMLAT GEWTMTNNNLNYKEYAKLAGGFYPSKFDADKWVAAIKASGAKYICLTSRHHDGFSMFDTQ YSDFNIVKATPFKRDIIKELAAACSKQGIKLHFYYSHLDWTREDYPWGRTGRGTGRSNPQ GDWKSYYQFMNNQLTELLTNYGPVGAIWFDGWWDQDGNPGFNWELPEQYAMIHKLQPGCL IGNNHHQTPFAGEDIQIFERDLPGENTAGLSGQSVSHLPLETCETMNGMWGYKITDQNYK STKTLIHYLVKAAGKNANLLMNIGPQPDGELPEVAVQRLKEMGEWMNQYGETIYGTRGGA VAPHDWGVTTQKGNKLYVHILNLQDKALFLPLADKKVKKAVLFKNGTPVRFTKNKEGVLL EFTEIPKDIDYVVELTID >gi|226332158|gb|ACIC01000162.1| GENE 41 53429 - 54700 1474 423 aa, chain - ## HITS:1 COG:PM0938 KEGG:ns NR:ns ## COG: PM0938 COG0104 # Protein_GI_number: 15602803 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate synthase # Organism: Pasteurella multocida # 5 417 6 424 432 382 47.0 1e-106 MKVDVLLGLQWGDEGKGKVVDVLTPKYDVVARFQGGPNAGHTLEFEGQKYVLRSIPSGIF QGDKVNIIGNGVVLDPALFKAEAEALEASGHPLKERLHISKKAHLILPTHRILDAAYEAA KGDAKVGTTGKGIGPTYTDKVSRNGVRVGDILHNFEQKYGAAKARHEQILKSLNYEYDLT 
ELEKAWMEGIEYLKQFHFVDSEHEVNNYLKDGKSVLCEGAQGTMLDIDFGSYPFVTSSNT VCAGACTGLGVAPNRIGEVFGIFKAYCTRVGAGPFPTELFDETGDKMCTLGHEFGSVTGR KRRCGWIDLVALKYSVMINGVTKLIMMKSDVLDTFDTIKACVAYKVDGEEIDYFPYDITE GVEPVYAELPGWKTDMTKMQSEDEFPEEFNAYLTFLEEQLGVEIKIVSVGPDRAQTIERY TEE >gi|226332158|gb|ACIC01000162.1| GENE 42 54697 - 55185 323 162 aa, chain - ## HITS:1 COG:Cj0400 KEGG:ns NR:ns ## COG: Cj0400 COG0735 # Protein_GI_number: 15791767 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Campylobacter jejuni # 1 154 1 157 157 75 31.0 6e-14 METQNVKDTVRQIFTEYLTANGHRKTPERYAILETIYSIDGHFDIDMLYSRMMDQENFRV SRATLYNTIILLINARLIIKHQFGTSSQYEKSYNRETHHHQICTQCGKVTEFQNEELQHA IENTKLSRFQLSHYSLYIYGLCSKCDRANKRKKVSNNNKKEK >gi|226332158|gb|ACIC01000162.1| GENE 43 55349 - 56014 545 221 aa, chain - ## HITS:1 COG:no KEGG:BT_1845 NR:ns ## KEGG: BT_1845 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 221 1 221 221 425 100.0 1e-118 MDIKEQLKDIKTQLRLSMNGAVSQSMREKGLVYKLNFGVEIPRIKMIAEGYEKNHDLAQA LWKEDIRECKIMAAMLQPVDTFYPEIADIWVESIRNIEIAELTCMNLFQHLPYAPAKSFH WMADEQEYVQTCGFLTAARLLMKKGDMTERASGELLDQALCAVHSESYHVRNAALLAIRK YMQHNEEHAFRVCRMVEGMADSTVEAEQMLYNMVKEEVENK >gi|226332158|gb|ACIC01000162.1| GENE 44 56023 - 58071 2197 682 aa, chain - ## HITS:1 COG:no KEGG:BT_1846 NR:ns ## KEGG: BT_1846 # Name: not_defined # Def: putative dipeptidyl-peptidase III # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 682 1 675 675 1363 100.0 0 MKKHFISMAVTATILASCGGAKTTTAEADKFDYTVEQFADLQILRYKVPEFETLTLKQKE LVYYLTQAALEGRDILFDQNGKYNLRIRRMLEAVYTNYKGDKSAPDFKNMEVYLKRVWFS NGIHHHYGMEKFVPGFSQDFLKQAVLGTDAQLLPLSEGQTAEQLCDELFPVMFDPAILAK RVNQADGEDLVLTSACNYYDGVTQQEAESFYGAMKDPKDETPVSYGLNSRLVKEDGKIQE KVWKVGGLYTQAIEKIVYWLKKAETVAENDAQKAVISKLIQFYETGSLKDFDEYAILWVK DLDSRIDFVNGFTESYGDPLGVKASWESLVNFKDLDATHRTEIISSNAQWFEDHSPVDKS FKKEKVKGVSAKVITAAILAGDLYPATAIGINLPNANWIRAHHGSKSVTIGNITDAYNKA AHGNGFNEEFVCNDEERQRIDQYGDLTGELHTDLHECLGHGSGKLLPGVDPDALKAYGST 
IEEARADLFGLYYVADPKLVELKLVPDAEAYKAEYYTFLMNGLMTQLVRIEPGNNIEEAH MRNRQLIARWVFEKGAPDKVVEMVKKDGKTYVVVNDYEKVRQLFGELLAEIQRIKSTGDF EGARTLVENYAVKVDPALHAEVLARYKKLNLAPYKGFINPVYELVTDKDGNITDVTVSYN EDYVEQMLRYSKDYSPLPSVNN >gi|226332158|gb|ACIC01000162.1| GENE 45 58166 - 58630 494 154 aa, chain - ## HITS:1 COG:VCA0926 KEGG:ns NR:ns ## COG: VCA0926 COG2207 # Protein_GI_number: 15601680 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Vibrio cholerae # 56 152 272 365 365 61 34.0 5e-10 MSDLENKKAEEAPKKRPYNLREKKEKKAAYRSLIRPELADELYDKILNIVVVQKKYKDPD YSAKDLAKELKTNTRYLSAVVNSRFGMNYSCLLNEYRVKDALHLLTDKRYADKNVEEISA MVGFANRQSFYAAFYKNVGETPNGYRKKHAESKK Prediction of potential genes in microbial genomes Time: Thu May 12 03:31:27 2011 Seq name: gi|226332157|gb|ACIC01000163.1| Bacteroides sp. 1_1_6 cont1.163, whole genome shotgun sequence Length of sequence - 16707 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 6, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 1806 1437 ## COG0514 Superfamily II DNA helicase + Term 1898 - 1942 12.1 - Term 1885 - 1929 12.1 2 2 Op 1 . - CDS 1955 - 3043 529 ## COG3274 Uncharacterized protein conserved in bacteria 3 2 Op 2 . - CDS 3092 - 3910 776 ## COG0457 FOG: TPR repeat 4 2 Op 3 . - CDS 3921 - 4871 739 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 - Prom 4942 - 5001 5.8 + Prom 4822 - 4881 3.6 5 3 Op 1 . + CDS 4960 - 6795 1235 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily 6 3 Op 2 5/0.000 + CDS 6835 - 7512 561 ## COG0671 Membrane-associated phospholipid phosphatase 7 3 Op 3 . + CDS 7521 - 8150 291 ## COG0500 SAM-dependent methyltransferases 8 4 Tu 1 . 
- CDS 8156 - 9475 651 ## BT_1856 hypothetical protein - Prom 9606 - 9665 3.6 - Term 9609 - 9653 8.1 9 5 Op 1 11/0.000 - CDS 9679 - 10740 1144 ## COG0473 Isocitrate/isopropylmalate dehydrogenase - Prom 10767 - 10826 4.9 10 5 Op 2 . - CDS 10831 - 12360 1501 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases 11 5 Op 3 30/0.000 - CDS 12336 - 12938 679 ## COG0066 3-isopropylmalate dehydratase small subunit 12 5 Op 4 6/0.000 - CDS 12981 - 14375 1526 ## COG0065 3-isopropylmalate dehydratase large subunit 13 5 Op 5 . - CDS 14453 - 15949 1587 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases - Prom 16076 - 16135 6.3 - Term 16277 - 16318 9.5 14 6 Tu 1 . - CDS 16347 - 16589 180 ## BT_1862 hypothetical protein - Prom 16629 - 16688 5.0 Predicted protein(s) >gi|226332157|gb|ACIC01000163.1| GENE 1 1 - 1806 1437 601 aa, chain + ## HITS:1 COG:ECs4752 KEGG:ns NR:ns ## COG: ECs4752 COG0514 # Protein_GI_number: 15834006 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Escherichia coli O157:H7 # 2 601 16 605 611 522 44.0 1e-148 MKETLKSYFGYDSFRPLQEEIIRHLLNKQDSLVLMPTGGGKSICYQLPALLSEGTAVVVS PLISLMKDQVETLQANGIAAGALNSSNDETENANLRRACIEGRLKLLYISPEKLIAEKDY LLRDMSISLFAIDEAHCISQWGHDFRPEYTQMGMLHQQFPQVPIIALTATADKITREDII RQLHLIQPRTFISSFDRPNISLDVKRGFQAKEKNKAILEFIHRHREESGIIYCMSRNKTE TVAQMLQKQGIRCGVYHAGLSPQHRDETQNDFINDRIQVVCATIAFGMGIDKSNVRWVIH YNLPKSIESFYQEIGRAGRDGLPSDTVLFYSLGDLILLTKFATESNQQTINLEKLQRMQQ YAEADICRRRILLSYFGETSTEDCGNCDVCKNPPQRFDGTVIVQKALSAIARAEQQISTG ILIDILRGSYSAEVTAKGYQELKTFGAGRDIPPRDWQDYLLQMLQLGYFEIAYNENNHLK ITPSGSNILFGKAKAMLAVIHREEIVTGKGKKKKVVITKELPFGIPGGENEDLFEALRGL RKQIADQDGLPAYIILSDKVLHLLSISRPTTIEAFGEISGIGEFKKKKYGKEFVNLIKQF V >gi|226332157|gb|ACIC01000163.1| GENE 2 1955 - 3043 529 362 aa, chain - ## HITS:1 COG:RSc3292 KEGG:ns NR:ns ## COG: RSc3292 COG3274 # Protein_GI_number: 17548009 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Ralstonia 
solanacearum # 4 251 1 228 336 60 24.0 7e-09 MKRIVFLDYVRVFACFLVMVVHASENFYGAAGSTDMAGPQSFLASEADRLWVAVYDGFSR MAVPLFMIVSAYLLVPMKEGQTSWQFYRRRFTHILPPFFIFMILYSILPMLWGQIDSETS IKDMSRIFLNFPTLAGHLWFMYPLISLYLFIPIISPWLSKATAKEERFFIGLFLLSTCMP YFNRWFGEVWGQCFWNEYHMLWYFSGYLGYLVLAHYIRVHLKWDRSKRFIVGLISMVAGA ALTIYSFYIQAIPGITHSTPVIEIGWAFCTINCVLLTAGTFLLFTCINRPEAPRFVTDMS KLSYGMYLMHIFWLGLWATVFKNTLELPTVSAIPCIAVTTFVCCYITTKIISFIPGSKWI IG >gi|226332157|gb|ACIC01000163.1| GENE 3 3092 - 3910 776 272 aa, chain - ## HITS:1 COG:alr0622 KEGG:ns NR:ns ## COG: alr0622 COG0457 # Protein_GI_number: 17228118 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. PCC 7120 # 26 254 292 513 547 81 26.0 1e-15 MIRIILILFLSFPVALCAQTYQQLSEKAIEYIEMDSLSKAEELLLQALKLEPKNAKNAML FSNLGLVQRRLGEYDKALESYSFALNFAPLAVPILLDRAAINLEMGNTDRAYTDYCQVLD EDKQNKEALLMRAYIYVLRRDYPAARIDYNRLLEIDPQSYSGRLGLATLEQKEGKFKEAL EILNKMITATPEDATLYIARADVEREMKHEDLALVDLEEAIRLNPSSADAYLLRGNIYLT QKKKALAKMDFEKAISLGIPPADLHEQLKQCK >gi|226332157|gb|ACIC01000163.1| GENE 4 3921 - 4871 739 316 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 4 311 3 307 308 289 50 1e-77 MAKIANKLTDLVGNTPLMELSGYSGKYGLNQNIIAKLEAFNPAGSVKDRVALSMIEDAEA RGALKPGATIIEPTSGNTGVGLAMVATIKGYHLILTMPETMSLERRNLLKALGAQIVLTD GLGGMAASIAKAQELRDSIPGSVILQQFENPSNAAVHERTTGEEIWRDTDGEVAVFVAGV GTGGTICGVARALKKHNPDVHIVAVEPASSPILAGGQAASHRIQGIGANFIPKLYDASVV DEVIGVPDDEAIRAGRELAATEGLLAGISSGAAVYAARQLAQRPEFRNKKIVALLPDTGE RYLSTELFAFDAYPLD >gi|226332157|gb|ACIC01000163.1| GENE 5 4960 - 6795 1235 611 aa, chain + ## HITS:1 COG:VCA0802 KEGG:ns NR:ns ## COG: VCA0802 COG1368 # Protein_GI_number: 15601557 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Vibrio cholerae # 190 605 196 632 657 181 30.0 3e-45 MKKRILQFLTTYFLFVLLFVLQKPIFMAYYHELYTDASIGDYFSVMWHGLPLDFSLAGYL 
TAIPGFLLIASAWTKSSILRRIRQGYFGIIAFVMSCIFIIDLGLYGFWGFRLDATPIFYF FSSPKDAMASVSFWFILLGILAMLIYAAILYGIFYAVLIREKAPLKIPYQRQYVSLVLLL LTAALFIPIRGGFSVSTMNLSKVYYSQNQRMNHAAINPAFSFMYSATHQNNFDKQYRFMD PKVADDLLAEMLDKPVAATDSIPQILNTQRPNIIFIILESFSTHLMETMGGQPNVAVNMD KFGKEGVLFTNFYANSFRTDRGLASIISGYPGQPSTSIMKYPEKTDGLPSIPRSLKNAGY SLEYYYGGDADFTNMRSYLVSSGIEKIISETDFPLSERQGKWGAPDHTLFQRFLKDLKEE KQQEPFFKIVQTSSSHEPFEVPFYRLDDKVLNAFAYADSCVGDFVRQYKETPMWKNTLIV LVPDHLGAYPRPVENPLEGHTIPLILIGGAVKEPRVVDTYASQIDIAATLLSQLGLPHDD FTFSKNIFNPSSPHFGYFTEPTLFGMVTAENQLVYNLDANTVQIDEGTEKGANLEKGKAF LQKLYDDLAKR >gi|226332157|gb|ACIC01000163.1| GENE 6 6835 - 7512 561 225 aa, chain + ## HITS:1 COG:MJ0374_2 KEGG:ns NR:ns ## COG: MJ0374_2 COG0671 # Protein_GI_number: 15668550 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Methanococcus jannaschii # 5 183 1 164 168 62 31.0 8e-10 MIEFLSDIDTQLLLFFNGIHSPFWDYFMSAFTGKVIWVPMYASILYILLKNFHWKVALCY VVAIALTITFADQMCNSFLRPLVGRLRPSNPENPIADLVYIVNGRRGGGFGFPSCHAANS FGLAIFLICLFRKRWLSIFIVLWAFTNSYTRLYLGLHYPGDLVAGAIIGGFGGWLFYFIA HKLTARLQSDTPVPGKSAGMKQTEVMIYTGLLTLAGIIIYSIVQS >gi|226332157|gb|ACIC01000163.1| GENE 7 7521 - 8150 291 209 aa, chain + ## HITS:1 COG:CAC0567 KEGG:ns NR:ns ## COG: CAC0567 COG0500 # Protein_GI_number: 15893857 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Clostridium acetobutylicum # 15 206 12 209 209 133 37.0 3e-31 MGKRNIIHKLFHNLKKPEGFWGRVILREMNKRHTLLSEWGMSHIVWNKEWNVLDIGCGGG ANLTQLMHRCPQGKAYGIDISPESVLFAQKKNKKYLSTRCFIEQGTVDTLPYTDEMFDVV TAFETVYFWNDLPKAFTEVTRVLKRNGHFLICCELNNLSVKTWTNLIDGMIIRSCDELKS ILLQSGFVSIASYKHEKGPLCIVARRRAE >gi|226332157|gb|ACIC01000163.1| GENE 8 8156 - 9475 651 439 aa, chain - ## HITS:1 COG:no KEGG:BT_1856 NR:ns ## KEGG: BT_1856 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 439 1 439 439 890 99.0 0 
MTKNYLFLLLGLLLAACSSDDKDEQPILVTNVVMPASGTVFKPGEKVTIMAKGFQDNDEI MFDIRWPLQDEVLHEGYAKGGRGVITEKTATSITFLAPGHWPASTTEILLRRSGQMMSLG KISVADGQAPKDFQLYGIINSRSNTHRPHAIEYINLEKPQTAEIVRLADNQDFSCVVNLP GSWSLSGVWTKDDRRTTGLYDLSMNYWEKPGADQLVTMGIATANSVFGVYQGGDRLFVKT VNVMPYTRMYVPEKPDYGFLLPEGMKAEALSRYPCIQMSDGNILCSADNGDGTFSPVVLN GQNIEEKSIYVGEPIEAVALIPFWIVKPVEGMGTAKYTRVGGYIVSKRNGATITEGDGTE FRLWNPTTKMLDEPFTTFPNAACSVATLVSDDFKKQELYVLFDGSRNGRLIYVYDLLKGS WESLYPGGGFPYSEIVLAR >gi|226332157|gb|ACIC01000163.1| GENE 9 9679 - 10740 1144 353 aa, chain - ## HITS:1 COG:aq_244 KEGG:ns NR:ns ## COG: aq_244 COG0473 # Protein_GI_number: 15605790 # Func_class: C Energy production and conversion; E Amino acid transport and metabolism # Function: Isocitrate/isopropylmalate dehydrogenase # Organism: Aquifex aeolicus # 3 352 4 358 364 348 49.0 1e-95 MDFKIAVLAGDGIGPEISVQGVDVMSAVCEKFGHKVSYEYAICGADAIDKVGDPFPEETY EVCKNADAVLFSAVGDPKFDNDPTAKVRPEQGLLAMRKKLGLFANIRPVQTFKCLIHKSP LRAELVENADFICIRELTGGMYFGEKYQDNDKAYDTNYYTRPEIERILKVAFEYAMKRRK HLTVVDKANVLASSRLWRQIAQEMAPNYPEVTTDYMFVDNAAMKMIQEPAFFDVMVTENT FGDILTDEGSVISGSMGLLPSASTGESTPVFEPIHGSWPQAKGLNIANPLAQILSVAMLF EYFDCKEEGALIRKAVDASLDENVRTPEIQVADGAKYGTKEVGQWIVDYIKKA >gi|226332157|gb|ACIC01000163.1| GENE 10 10831 - 12360 1501 509 aa, chain - ## HITS:1 COG:MK0391 KEGG:ns NR:ns ## COG: MK0391 COG0119 # Protein_GI_number: 20093829 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Methanopyrus kandleri AV19 # 7 507 4 493 499 244 33.0 2e-64 MGKQGVKIEIMDTTLRDGEQTSGVSFVPHEKLMIARLLLEDLKVDRVEVASARVSEGEFE AVKMICDWAARRNLLQKVEVLGFVDGHTSVDWIQRTGCRVINLLCKGSLKHCTQQLKKTP EEHIADIINVVHYADEQDIGVNVYLEDWSNGMKDSPEYVFQLMDGLKQTSIRRYMLPDTL GILNPLQVIEYMRKMKKRYPNTHFDFHAHNDYDLAVSNVLAAVLSGVRGLHTTINGLGER AGNAPLSSVQAILKDHFNAMTNIDESRLNDVSRVVESYSGIVIPANKPIVGENVFTQVAG VHADGDNKNNLYCNDLLPERFGRKREYALGKTSGKANIRKNLEDLGLELDEDAMRKVTER IIELGDKKELVTQEDLPYIVSDVLKHGAIGEKVKLKSYFVNLAHGLKPMATLKIEINGKE 
YEESSSGDGQYDAFVRALRKIYKVTLGRKFPMLTNYAVSIPPGGRTDAFVQTVITWNYDE QVFRTRGLDADQTEAAIKATMKMLNLLEE >gi|226332157|gb|ACIC01000163.1| GENE 11 12336 - 12938 679 200 aa, chain - ## HITS:1 COG:PA3120 KEGG:ns NR:ns ## COG: PA3120 COG0066 # Protein_GI_number: 15598316 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase small subunit # Organism: Pseudomonas aeruginosa # 10 196 8 201 212 163 44.0 2e-40 MAKTKFNIITSTCVPLPLENVDTDQIIPARFLKATTREEKFFGDNLFRDWRYNADGSLNK DFVLNDPTYSGQILVAGKNFGSGSSREHAAWAIAGYGFRVVVSSFFADIHKNNELNNFVL PVVVTEGFLQELFDSIFADPKMEVEVNLPEQTITNKATGKSEHFEINAYKKLCLMNGLDD IDFLLSNKNKIEEWENKASK >gi|226332157|gb|ACIC01000163.1| GENE 12 12981 - 14375 1526 464 aa, chain - ## HITS:1 COG:NMB1036 KEGG:ns NR:ns ## COG: NMB1036 COG0065 # Protein_GI_number: 15676923 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase large subunit # Organism: Neisseria meningitidis MC58 # 3 461 5 466 469 517 57.0 1e-146 MNTLFDKIWDAHVVTTVEDGPTQLYIDRLYCHEVTSPQAFAGLRERGIKVLRPEKVFCMP DHNTPTHDQDKPIEDPISKTQVDTLTKNAKDFGLTHFGMMHPKNGIIHVVGPERALTLPG MTIVCGDSHTSTHGAMGAIAFGIGTSEVEMVLASQCILQSRPKTMRITVDGELGKGVTAK DVALYMMSKMTTSGATGYFVEYAGSAIRNLTMEGRLTLCNLSIEMGARGGMVAPDEVTFE YIKGRENAPQGEAWDQAMEYWKTLKSDDDAVFDQEVRFDAADIEPMITYGTNPGMGMGIT QNIPTTEGMGEAAQVSFKKSMEYMGFQPGESLLGKKIDYVFLGACTNGRIEDFRAFASLV KGRRKADNVIAWLVPGSWMVDAQIRKEGIDKILTEAGFAIRQPGCSACLAMNDDKIPAGK YSVSTSNRNFEGRQGPGARTLLASPLVAAAAAVTGVITDPRELM >gi|226332157|gb|ACIC01000163.1| GENE 13 14453 - 15949 1587 498 aa, chain - ## HITS:1 COG:VC2490 KEGG:ns NR:ns ## COG: VC2490 COG0119 # Protein_GI_number: 15642486 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Vibrio cholerae # 1 494 1 496 516 463 51.0 1e-130 MDNRLFIFDTTLRDGEQVPGCQLNTVEKIQVAKALEALGVDVIEAGFPISSPGDFNSVIE ISKAVTWPTICALTRAVQKDIDVAVDALKFAKHKRIHTGIGTSDSHIKYKFNSNREEIIE RAVAAVKYARRFVDDVEFYAEDAGRTDNEYLARVVEAVIKAGATVVNIPDTTGYCLPSEY 
GAKIKYLIDHVDGIDNAILSTHCHNDLGMATANTIAGVLNGARQVEVTINGIGERAGNTA LEEIAMIIKSHHEIDIQTNINTQKIYPTSRMVSSLMNMPVQPNKAIVGRNAFAHSSGIHQ DGVLKNVETYEIIDPHDVGIDDNSIVLTARSGRAALKNRLSLLGVNLDQEKLDKVYEEFL KLADKKKDINDDDVLVLAGADRSQNHRIKLEYLQVTSGVGVRSVASLGLNISGEKFEACA SGNGPVDAAIKALKKIVDRHMTLKEFTIQAISKGSDDVGKVHMQVEYDNQIYYGFGANTD IIAASVEAYIDCINKFKA >gi|226332157|gb|ACIC01000163.1| GENE 14 16347 - 16589 180 80 aa, chain - ## HITS:1 COG:no KEGG:BT_1862 NR:ns ## KEGG: BT_1862 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 80 1 80 80 87 100.0 1e-16 MKTKLFLAAVAVTFSFAVISCSGNKANNAAASEGEEATVETVEAVAVEATDSCCQAKDSC ATACDKKADCTEKKECCDKK Prediction of potential genes in microbial genomes Time: Thu May 12 03:31:41 2011 Seq name: gi|226332156|gb|ACIC01000164.1| Bacteroides sp. 1_1_6 cont1.164, whole genome shotgun sequence Length of sequence - 22620 bp Number of predicted genes - 19, with homology - 18 Number of transcription units - 13, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 834 697 ## BT_2140 putative sodium-dependent transporter - Prom 939 - 998 5.6 2 2 Op 1 . + CDS 930 - 2192 1121 ## COG0860 N-acetylmuramoyl-L-alanine amidase 3 2 Op 2 . + CDS 2201 - 3094 916 ## BT_2142 hypothetical protein 4 3 Tu 1 . + CDS 3446 - 4858 1046 ## COG0593 ATPase involved in DNA replication initiation - Term 4880 - 4918 0.2 5 4 Tu 1 . - CDS 5000 - 5737 543 ## COG0778 Nitroreductase + Prom 5892 - 5951 7.2 6 5 Op 1 1/0.000 + CDS 6022 - 7179 559 ## COG0209 Ribonucleotide reductase, alpha subunit 7 5 Op 2 . + CDS 7001 - 8560 1541 ## COG0209 Ribonucleotide reductase, alpha subunit + Term 8598 - 8638 5.1 + Prom 8578 - 8637 13.7 8 6 Tu 1 . + CDS 8693 - 11374 2018 ## COG1640 4-alpha-glucanotransferase + Prom 11377 - 11436 4.7 9 7 Op 1 . + CDS 11587 - 12600 680 ## COG3594 Fucose 4-O-acetylase and related acetyltransferases + Prom 12603 - 12662 4.6 10 7 Op 2 . 
+ CDS 12695 - 13666 616 ## COG0438 Glycosyltransferase + Prom 13694 - 13753 3.5 11 8 Tu 1 . + CDS 13810 - 14178 382 ## COG1539 Dihydroneopterin aldolase - Term 14437 - 14476 5.0 12 9 Op 1 . - CDS 14508 - 15032 465 ## COG1803 Methylglyoxal synthase 13 9 Op 2 . - CDS 15037 - 16023 583 ## COG1216 Predicted glycosyltransferases 14 9 Op 3 . - CDS 16068 - 16961 694 ## COG1560 Lauroyl/myristoyl acyltransferase 15 9 Op 4 . - CDS 16951 - 18282 782 ## PROTEIN SUPPORTED gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 - Prom 18344 - 18403 5.7 + Prom 18230 - 18289 6.2 16 10 Tu 1 . + CDS 18392 - 20065 1510 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) + Prom 20073 - 20132 3.0 17 11 Tu 1 . + CDS 20244 - 20447 232 ## - Term 20463 - 20502 5.8 18 12 Tu 1 . - CDS 20515 - 21369 619 ## COG1082 Sugar phosphate isomerases/epimerases - Prom 21424 - 21483 5.0 19 13 Tu 1 . - CDS 21569 - 22441 815 ## BT_2157 hypothetical protein - Prom 22476 - 22535 2.0 Predicted protein(s) >gi|226332156|gb|ACIC01000164.1| GENE 1 3 - 834 697 277 aa, chain - ## HITS:1 COG:no KEGG:BT_2140 NR:ns ## KEGG: BT_2140 # Name: not_defined # Def: putative sodium-dependent transporter # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 277 1 277 297 496 99.0 1e-139 MLKFLKNWTLPIAMLVGAVGYPIFIYFSFLTPYLIFTMLLLTFCKVSPHDLKPKPLHAWL LLIQIGGAFLAYLLLYRFNKIVAEGVMVCIICPTATAAAVITSKLEGSAASLTTYTLLGN IGAAIAVPILFPLIEVNPDVSFWGAFWVILSKVFPLLICPFLAAWLLSKFLPKVHQKLLG YHELAFYLWAVSLAIVTAQTLYSLLNDPADGFTEIMIAVGALIACCLQFFLGKTIGSVYN DRISGGQALGQKNTILAIWMAHTYLNPLSSVAPGSYV >gi|226332156|gb|ACIC01000164.1| GENE 2 930 - 2192 1121 420 aa, chain + ## HITS:1 COG:aq_1681 KEGG:ns NR:ns ## COG: aq_1681 COG0860 # Protein_GI_number: 15606778 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Aquifex aeolicus # 45 270 130 353 359 128 37.0 2e-29 MSFLQNSTTFAPYLNMKPHRTYILYISICLWLLASPLCIGNLWGKDFVVVIDAGHGGHDP GAIGKISKEKNINLNVALKLGNLIKKNCDDVKVIYTRSRDVFIPLDRRAEIANNAKADLF 
ISVHTNALAKNRTAKGASTWTLGLAKSDANLEVAKRENSVILYESDYQTRYAGFNPNSAE SYIIFEFMQDKYMEQSVHLASLVQKQFRHTCKRVDRGVHQAGFLVLKASAMPSILVELGF ISTPEEERYLNSEAGANTMAKGIYHAFLNYKREQEIRLTGASKTIIPTEEPATSAPEQRE LMAEASTTPATATKTAPKRPIVSESATNDSEITFKIQILTSSKPLTKNDKRLKGLKDVDY YKEKGLYKYTYGASSDYNKVLRTKRTISAQFKDAFIIAFRNGEKMNVNEAIAEFKKRRNK >gi|226332156|gb|ACIC01000164.1| GENE 3 2201 - 3094 916 297 aa, chain + ## HITS:1 COG:no KEGG:BT_2142 NR:ns ## KEGG: BT_2142 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 297 1 297 297 564 99.0 1e-159 MKYITKEVRIGIAGIVALCVLVYGINWLKGIHMFQPSSYFYAKFQNVNGLTKSSPVFADG VRVGIVRDIEYDYVNPGNVIVEVELDTELRIPKGSSAELVSELMGGVRMNILLANNPREK YAIGDTIPGTLNNGMMESVAKLMPQIESMLPKLDSILTSLNNIVGDKSIPSMLHSIETTT ANLAVVSSQMKGLMSKDIPQLTGKLNTIGDNFVAISGNLKEVDYSAIFQKVDATLANVKM ITEKLNSKDNTIGLLFNDPALYNNLNATTENAASLLEDLKAHPKRYVHFSLFGKKDK >gi|226332156|gb|ACIC01000164.1| GENE 4 3446 - 4858 1046 470 aa, chain + ## HITS:1 COG:BS_dnaA KEGG:ns NR:ns ## COG: BS_dnaA COG0593 # Protein_GI_number: 16077069 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication initiation # Organism: Bacillus subtilis # 5 467 3 446 446 292 34.0 1e-78 MIESNHVVLWNRCLEVIKDNVPETTYNTWFAPIVPLKYEDKTLIVQIPSQFFYEILEEKF VDLLRKTLYKAIGEGTKLMYNVMVDKTSIPNQTVNLEASNRSTAVTPKSIVGGNKAPSFL KAPAVQDLDPHLNPNYNFENFIEGYSNKLSRSVAEAVAQNPAGTAFNPLFLYGASGVGKT HLANAIGTKIKELYADKRVLYVSAHLFQVQYTDSVRNNTTNDFINFYQTIDVLIIDDIQE FAGVTKTQNTFFHIFNHLHQNGKQLILTSDRAPVLLQGMEERLLTRFKWGMVAELEKPTV ELRKNILRNKIHRDGLQFPPEVIDYIAENVNESVRDLEGIVIAIMARSTIFNKEIDLDLA QHIVHGVVHNETKAVTIDDILKVVCKHFDLEASAIHTKSRKREVVQARQIAMYLAKNYTD FSTSKIGKFIGNKDHATVLHACKTVKGQLEVDKSFQAEVQEIESLLKKKA >gi|226332156|gb|ACIC01000164.1| GENE 5 5000 - 5737 543 245 aa, chain - ## HITS:1 COG:BH1048 KEGG:ns NR:ns ## COG: BH1048 COG0778 # Protein_GI_number: 15613611 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Bacillus halodurans # 1 199 5 202 244 107 30.0 2e-23 
MFETVKNRRTIRKYQQKDIAPDLLNDLLETSFRASTMGGMQLYSVIVTRDAEMKEKLSPA HFNQPMVKGAPVVLTFCADFRRFSKWCEQRNAVPGYDNLMSFMNASMDTLLVAQTFCTLA EEAGLGICYLGTTTYNPQMIIDTLQLPKLVFPITTITVGYPDGTPAQVDRLPLEAAVHQE VYHDYTPEDIDRLYEYKESLPENKQFIEENKKETLAQVFTDVRYTKKDNEFMSENLLKVL RQQGF >gi|226332156|gb|ACIC01000164.1| GENE 6 6022 - 7179 559 385 aa, chain + ## HITS:1 COG:AF1664 KEGG:ns NR:ns ## COG: AF1664 COG0209 # Protein_GI_number: 11499254 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, alpha subunit # Organism: Archaeoglobus fulgidus # 25 320 7 291 752 169 36.0 6e-42 MEKQIYSYDEAYEESLRYFQGDELAARVWVNKYAVKDSFGNIYEKSPEDMHWRIANEVAR IEAKYPNALTAKELYDLLDHFKYIVPQGSPMTGIGNDYQVASLSNCFVIGVDGAADSYGA IIKIDEEQVQLMKRRGGVGHDLSHIRPKGSPVKNSALTSTGLVPFMERYSNSTREVAQDG RRGALMLSVSIKHPDSEAFIDAKMTEGKVTGANVSVKLDDAFMQAAVDEKPYIQQYPIES ANPTTTKEIDASTLWKKIVHNAWKSAEPGVLFWDTIIRESVPDCYADLGYKTVSTNPCGE IPLCPYDSCRLLAINLYSYVVNPFKPDAYFDFELLQKTRCAGSTHHGRHHRPRTGEDRTY HGKDRSRPGRRRGKTCRTQLVEQDI >gi|226332156|gb|ACIC01000164.1| GENE 7 7001 - 8560 1541 519 aa, chain + ## HITS:1 COG:TM0118 KEGG:ns NR:ns ## COG: TM0118 COG0209 # Protein_GI_number: 15642893 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, alpha subunit # Organism: Thermotoga maritima # 13 518 358 826 827 136 27.0 9e-32 MPILISNCYKKHVALAQRIMDDIIDLELEKIERIMEKIDQDPEGEEVKHAERNLWNKIYK KSGQGRRTGVGITAEGDMLAALGLRYGTEEATEFSEKVHKTVALGAYRSSVEMAKERGAF EIYSNEREQNNPFIQRLAEADPELYAEMKKYGRRNIACLTIAPTGTTSLMTQTTSGIEPV FLPVYKRRRKVNPNDTNVHVDFVDETGDAFEEYIVFHHKFVTWMEANGYDPAKRYSQEEI DELVAKSPYYKATSNDVDWLMKVKMQGRIQKWVDHSISVTINLPNDVDEDLVNRLYVEAW KSGCKGCTVYRDGSRSGVLISTKSDKKETLPPCKPPTVVETRPRILEADVVRFQNNKEKW VAFVGLLDGHPYEIFTGLQDDDEGILLPKSVTSGRIIKNIDEDGTKRYDFQFENKRGYKT TIEGLSEKFNKEYWNYAKLISGVLRYRMPIEQVIKLVGSLQLNSESINTWKNGVERALKK YIQDGTEAKGKKCPNCGNETLVYQEGCLICTTCGASRCG >gi|226332156|gb|ACIC01000164.1| GENE 8 8693 - 11374 2018 893 aa, chain + ## HITS:1 COG:L94405 KEGG:ns NR:ns ## COG: L94405 COG1640 # Protein_GI_number: 15672678 # 
Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Lactococcus lactis # 399 890 3 488 489 442 45.0 1e-123 MTVSFNIEYRTSWGEEVRIAGLLPESIPMHTTDGIYWTADVELEVPKEGMTINYSYQIEQ NQIIIRKEWDSFPRRLFLSGNSKKKYQIKDCWKNIPEQLYYYSSAFTEALLAHPDRAEIP HCHRKGLVIKAYAPRINKDYCLAICGNQKALGNWDPDKAIPMSDANFPEWQIELDASKLK FPLEYKFILYHKEEKKADCWENNPNRYLADPELKTNETLVISDRYAYFDIPVWKGAGIAI PVFSLKSENSFGVGDFGDLKRMIDWAVSTQQKVIQILPINDTTMTHAWTDSYPYNSISIY AFHPMYADIKQMGTLKDKSAAAKFNKKQKELNGLPAMDYEAVNQTKWEYFRLIFKQEGEK VLASGEFGEFFNANKEWLQPYAVFSYLRDAFQTPNFREWPRHSVYNAQDIEKMCRPESVD YPHIALYYYIQFHLHLQLVAATKYAREHGVVLKGDIPIGISRNSVEAWTEPYYFNLNGQA GAPPDDFSVNGQNWGFPTYNWDVMEKDGYRWWMKRFQKMSEYFDAYRIDHILGFFRIWEI PMHAVHGLLGQFIPSIPMSREEIESYGLPFREEYLIPYIHESFLGQVFGPHTDYVKQTFL LPAETPGVYHMKPEFTTQREVESFFAGKNDENSLWIRDGLYTLISDVLFVPDTKEKDKYH PRIGIQRDFIFRSLNEQEQNAFNRLYDQYYYHRHNEFWRQQAMKKLPQLTQSTRMLVCGE DLGMIPDCVSSVMNDLRILSLEIQRMPKNPMHEFGYLNEYPYRSVCTISTHDMSTLRGWW EEDYLQTQRYYNTMLGHYGTAPTVATPELCEEVVRNHLKSNSILCILSLQDWLSIDGKWR NPNVQEERINVPSNPRNYWRYRMHLTLEQLMKAEELNDKIRELIKYTGRAPKK >gi|226332156|gb|ACIC01000164.1| GENE 9 11587 - 12600 680 337 aa, chain + ## HITS:1 COG:CAC3042 KEGG:ns NR:ns ## COG: CAC3042 COG3594 # Protein_GI_number: 15896293 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose 4-O-acetylase and related acetyltransferases # Organism: Clostridium acetobutylicum # 7 280 2 268 337 70 25.0 7e-12 MGDNQHRRIDFVDLTKGVCIILVVMAHIGGAFDQLDKHSMLSCFRMPLYFFISGIFFKPY EGLYGFIIRKTNKLIIPFIFFYVSAFLLKYIVWKIAPETFHLPVSWRELLFVFHGHDLIK FNPPIWFLLALFNCNILFYLIHYLRDKHLSLMFAATLLIGCTGFFLGKFHIELPLYIDVA MTALPFYVAGFWVRRYNFFLFPSHRFDKIIPSFVLLALVVMYFTATTPGMRTNNYPGNIF QVYIAAFAGIFMIMLICKKIKRISIVSYLGRYSIITLSIHGPIIHFARPVVSHYIHNDWA EASALLLLTLTICILFTPILLKIIPQLVAQKDLLKVK >gi|226332156|gb|ACIC01000164.1| GENE 10 12695 - 13666 616 323 aa, chain + ## HITS:1 COG:alr1000 KEGG:ns NR:ns ## COG: alr1000 COG0438 # Protein_GI_number: 17228495 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
Glycosyltransferase # Organism: Nostoc sp. PCC 7120 # 7 301 10 341 360 97 26.0 4e-20 MNITIVLGDKIPSIKYGGTQRVMWYLGKELNKRGHKVTFIAGKGSSCPFAKVIELDPSVN LNTQIPTDADIVHFNIPVPEGITKPHLLTIHGNGIPANADRNIVFVSRNHAQRLGSESFV YNGLDWDDYGPANLTLSRKNYHFLGKAAWKVKNIKGAIAVVQSIKNAQLDVLGGYRLNLK MGFRFTWNPRIHFHGMVDDSKKKVIIEQSKGLIFPVTWHEPFGLAITESLYFGAPVFGTP YGALPELVSPEVGFLTAHGQEMAEHIQSAQYSPKVCHEYARDLFNSSVMAEAYLKKYETV LNGEPLNKEVPHIIDIARRLPWD >gi|226332156|gb|ACIC01000164.1| GENE 11 13810 - 14178 382 122 aa, chain + ## HITS:1 COG:SA0473 KEGG:ns NR:ns ## COG: SA0473 COG1539 # Protein_GI_number: 15926192 # Func_class: H Coenzyme transport and metabolism # Function: Dihydroneopterin aldolase # Organism: Staphylococcus aureus N315 # 8 118 5 116 121 87 39.0 4e-18 MKINNSYILLKDICCFAYHGVAPQENIIGNEYIINLKLKVDISQAIQTDDVVDTVNYAEI HEAVKAEMSIPSKLLEHVCGRIAKRLLAEFPAIEEIELRLSKRNPPMGADIDSAGVELHC SR >gi|226332156|gb|ACIC01000164.1| GENE 12 14508 - 15032 465 174 aa, chain - ## HITS:1 COG:TM1185 KEGG:ns NR:ns ## COG: TM1185 COG1803 # Protein_GI_number: 15643941 # Func_class: G Carbohydrate transport and metabolism # Function: Methylglyoxal synthase # Organism: Thermotoga maritima # 6 165 15 166 166 132 48.0 3e-31 MEPKVRRGIGLVAHDAMKKDLIEWVLWNSELLMGNKFYCTGTTGTLILEALKEKHPEVEW DFTILKSGPLGGDQQMGSRIVDGQIDYLFFFTDPMTLQPHDTDVKALTRLAGVENIVFCC NRSTADHIISSPLFMDPDYERTHPDYSSYTKRFENKSVVTEAVESVNKRKRKKR >gi|226332156|gb|ACIC01000164.1| GENE 13 15037 - 16023 583 328 aa, chain - ## HITS:1 COG:CAC2327_1 KEGG:ns NR:ns ## COG: CAC2327_1 COG1216 # Protein_GI_number: 15895594 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Clostridium acetobutylicum # 15 244 34 261 378 119 32.0 6e-27 MLRTFLPSVIRYSKSEEVEVCVADNGSTDASVEMLREEFPCVRIIVLDQNHGFADGYNLA LQQVEAEYVVLLNSDVEVTEHWLEPMISYLDGHPEVAACQPKIRSQRQKEYFEYAGAAGG FIDKYGYPFCRGRIMGVVEKDEGQYDTILPVFWATGAALFIRHADYREAGGLDGRFFAHM EEIDLCWRLRSRGREIVCIPQSTVYHVGGATLKKENPRKTFLNFRNNLVMLYKNLSDEEL NKVMRIRTCLDYVAAFTFLLKGQLDNARAVMRARKEYKQICPSFSSSRKENLRKTSLNPI 
PERIKSSILWQFYVGGCKRFSQLSDLKG >gi|226332156|gb|ACIC01000164.1| GENE 14 16068 - 16961 694 297 aa, chain - ## HITS:1 COG:NMA1630 KEGG:ns NR:ns ## COG: NMA1630 COG1560 # Protein_GI_number: 15794524 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lauroyl/myristoyl acyltransferase # Organism: Neisseria meningitidis Z2491 # 4 288 2 279 289 90 26.0 4e-18 MKSRFIYWLVYSGMWLFSALPFRILYLLSDFNYLLMYRIGRYRRKVVRENLKKSFPEKDK KERLQIERRFYRYLSDYMLEDLKMLHMSPEDLYKRMTYKNTEQYLELTEKYGGIIVMIPH YANYEWLIGMGSIMKPGDVPMQVYKPLKDKYLNELFQRIRSRFGGYNVPKHSTAREIIKL KREGKKMVVGLITDQWPSGNEKYWTTFLGQETAFLNGAERIAKMMNFPVFYCDLTKPGRG YCVAEFKLMTEKPKETGEGEITEMFADYLEQTIRREPAYWLWSHKRWKASKAECERG >gi|226332156|gb|ACIC01000164.1| GENE 15 16951 - 18282 782 443 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 [Bacillus subtilis subsp. subtilis str. 168] # 11 423 3 423 451 305 37 1e-82 MIDTTVFQNKTAVYYTLGCKLNFSETSTIGKILREAGVRTARKGEKADICVVNTCSVTEM ADKKCRQAIHRLVKQHPGAFVVVTGCYAQLKPGDVAKIKGVDVVLGAEQKGDLLQYLGDL HKHEEGEAFTTTTKDIRSFSPSCSRGDRTRFFLKVQDGCDYFCSYCTIPFARGRSRNGTV ASMVEQARQAAAEGGKEIVLTGVNIGDFGKTTGETFFDLVKALDQVEGIERYRISSIEPN LLTDEIIEFVSRSRRFMPHFHIPLQSGCDEVLKLMRRRYDTALFASKVKKIKEVMPDAFI GVDVIVGTRGETEEYFEQAYQFISGLDVTQLHVFSYSERPGTQALKIDYVVSPEEKHQRS QRLLTLSDEKTRAFYTRHIGQTMQVLMEKSKAGTPMHGFTANYIRVEVENDESLDNQMIN VRLGEFNEDMTALKGTILMNYEV >gi|226332156|gb|ACIC01000164.1| GENE 16 18392 - 20065 1510 557 aa, chain + ## HITS:1 COG:aq_999_1 KEGG:ns NR:ns ## COG: aq_999_1 COG1022 # Protein_GI_number: 15606303 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Aquifex aeolicus # 57 554 41 503 600 240 32.0 5e-63 MIKENFIKLYENSFRENWDLPCYTDYGESVHYTYGQVAGEIARMHLLFKHCSLRRGDKIA VIGKNNAHWCIAYMATITYGAIIVPILQDFTPNDVHHIVNHSESVFLFTSDSIWESLEEE KLTGLRGVFSLTDFRCLYQRDGETIQKFLVHLNDEMYAAYPKGFTREDIQYTTLSNDKVM LLNYTSGTTGFSKGVMLTGNNLAGNVTFGIRTELLKKGDKVLSFLPLAHAYGCAFDFLTA 
TAVGTHVTLLGKTPSPKIIMKAFEEVKPNLIITVPLVIEKIYKNVIQPLINKKGMKWALN IPLLDTQIYNQIRKKLIDALGGRFKEIIIGGAAMNQEVEEFFYKIKFPFTIGYGMTECGP LISYAPWNEFILGSSGKVLDIMEARIYKENPEAETGEIQVRGENVMVGYYKNPEATQEVF TEDGWLRTGDLGTMDASGNIFIRGRLKSMILSSSGQNIFPEELEAKLNNLPFILESLVIE RNKKLVALVYADYEALDSLGLNNPDNLKTIMDENLKNLNNNVAAYEKVSKIQLYPTEFEK TPKRSIKRYLYNSIAVD >gi|226332156|gb|ACIC01000164.1| GENE 17 20244 - 20447 232 67 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAVAIVAVSFASCGNKAADAAKAQQDSIRIADSIAAVEAAAAEAEQAAAQAADSLNADST ATEAVAE >gi|226332156|gb|ACIC01000164.1| GENE 18 20515 - 21369 619 284 aa, chain - ## HITS:1 COG:lin2265 KEGG:ns NR:ns ## COG: lin2265 COG1082 # Protein_GI_number: 16801329 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Listeria innocua # 11 272 2 239 246 110 29.0 4e-24 MAQSKTKAKAKKEVAIQLYSVRDILNKVDNKDGKCDAAYITLLKNLAKMGYTSVEAANYN NGKFYDRTPDQFKKDVESAGLKVLSSHCTRGLSKEELASGDFSSSLQWWDQCIADHKAAG MSYIVAPWMDVPKTLKELDTYCAYYNEIGKRCKQQGMSFGYHNHAHEFQKVEDKVMYDYM IEHTNPEYVFFQMDVYWVVRGQNSPVDYFNKYPGRFKMFHIKDHREIGQSGMVGFDAIFK NAKTAGVKHLVAEIESYSMPVEKSVEVSLDYLLDAPFVKSSYAK >gi|226332156|gb|ACIC01000164.1| GENE 19 21569 - 22441 815 290 aa, chain - ## HITS:1 COG:no KEGG:BT_2157 NR:ns ## KEGG: BT_2157 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 290 1 290 290 585 100.0 1e-166 MKKVFYPLACCLAAGVLVSCSGQKKAGSAQEEQSANEVAVSYSKSLKAAEMDSLQLPVDA DGYITIFDGKTFNGWRGYGKDRVPSKWTIEDGCIKFNGSGGGEAQDGDGGDLIFAHKFKN FELEMEWKVSKGGNSGIFYLAQEVTSKDKDGNDVLEPIYISAPEYQVLDNDNHPDAKLGK DNNRQSASLYDMIPAVPQNAKPFGEWNKAKIMVYKGTVVHGQNDENVLEYHLWTKQWTDL LQASKFSQDKWPLAFELLNNCGGENHEGFIGMQDHGDDVWFRNIRVKVLD Prediction of potential genes in microbial genomes Time: Thu May 12 03:32:02 2011 Seq name: gi|226332155|gb|ACIC01000165.1| Bacteroides sp. 
1_1_6 cont1.165, whole genome shotgun sequence Length of sequence - 5169 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 3, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 50 - 1168 870 ## COG0673 Predicted dehydrogenases and related proteins - Prom 1274 - 1333 8.2 - Term 1244 - 1289 -0.8 2 2 Op 1 . - CDS 1421 - 3106 1078 ## BT_2160 putative regulatory protein 3 2 Op 2 27/0.000 - CDS 3148 - 3591 712 ## PROTEIN SUPPORTED gi|29347571|ref|NP_811074.1| 50S ribosomal protein L9 4 2 Op 3 11/0.000 - CDS 3606 - 3878 462 ## PROTEIN SUPPORTED gi|29347572|ref|NP_811075.1| 30S ribosomal protein S18 5 2 Op 4 . - CDS 3881 - 4225 578 ## PROTEIN SUPPORTED gi|29347573|ref|NP_811076.1| 30S ribosomal protein S6 - Prom 4249 - 4308 12.3 + Prom 4217 - 4276 11.8 6 3 Op 1 . + CDS 4386 - 4832 428 ## COG1846 Transcriptional regulators + Prom 4834 - 4893 7.3 7 3 Op 2 . + CDS 4972 - 5148 229 ## gi|298383732|ref|ZP_06993293.1| hypothetical protein HMPREF9007_00288 Predicted protein(s) >gi|226332155|gb|ACIC01000165.1| GENE 1 50 - 1168 870 372 aa, chain - ## HITS:1 COG:lin2932 KEGG:ns NR:ns ## COG: lin2932 COG0673 # Protein_GI_number: 16801991 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Listeria innocua # 7 196 1 188 333 77 25.0 6e-14 MDAQFSLDTKIKVGIIGFGRMGRFYWEAMRKSGRWEIAYICDTDPTTRELAKKISPESLV VENDQKIFEDESVQVVGLFTLADSRLEQIEKAIRYGKHIIAEKPVADTMEKEWKVVDMIE NSNVLSTVNLYLRNSWYHNLMKEYIQKGEIGELAIIRICHMTPGLAPGEGHEYEGPAFHD CGMHYVDIVRWYAESEYRTWNAQGVNMWNYKDPWWVQCHGTFQNGVVFDITQGFVYGQLS KDQTHNSYVDIIGTEGIVRMTHDFKTAVVDLHGVHQTLRVEKPFGGKNIDVLCDLFADSI LTGKRNPRLPLMYDSTIASEYAWKFLQDARMNDLPAIGNLQTLEQIRERRKNMKNGYGLL HSNPPQITNSLK >gi|226332155|gb|ACIC01000165.1| GENE 2 1421 - 3106 1078 561 aa, chain - ## HITS:1 COG:no KEGG:BT_2160 NR:ns ## KEGG: BT_2160 # Name: not_defined # Def: putative regulatory protein # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 1 561 1 561 561 1001 100.0 0 MDKSLFFIIFANNMKFNRCYLKRIVLLFLVALGIGNALYANNELKILADSLHKMIDAKPL FVQKKEQRIARIKCLLKDSGLTPDREYKVNLQLYNEYKKFNIDSAIHYVDRNLEIARQLN RNYLKYQSSLQLSLVYSMCGRYRDAELLLEKMKPSEFPRSLLATYYDTYARFWEYYSISA TNNQYGKKREAYQDSLYALMDHTSFDYKLSRAYSYAGHDSTKAIKILDELLNAEEVGTPN YAMITHSYAMLSRYLKREDDAKKYLMMSAIADIQNATRETASLQALALIQYEENNLADAF KFTQSAIDDVVSSGIHFRAMEIYKFYSIINTAYQTEEARSKSNLITFLISTSVSLFLLIV LVVFIYIQMKKTLRMKRALAQSNEELLRLNDKLNSMNSELNDKNDELCEINNIKEHYIAQ FFDVCFSYIHKMEKYQNMLYKIAINKCYEELIKKLKSSALIDDELDALYTRFDRVFLNLY PTFVSDFNALLKDDEKIILKQDALLNRELRIYALLRLGITDSGKIANFLRCSTSTVYNYR TKMRNKAAVDRDEFENEIMKI >gi|226332155|gb|ACIC01000165.1| GENE 3 3148 - 3591 712 147 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|29347571|ref|NP_811074.1| 50S ribosomal protein L9 [Bacteroides thetaiotaomicron VPI-5482] # 1 147 1 147 147 278 100 5e-75 MEIILKEDIVNLGYKNDIVNVKSGYGRNYLIPTGKAIIASPSAKKMLAEDLKQRAHKLEK IKKDAEALAAKLEGVSLTIATKVSSTGTIFGSVSNIQIAEALAKLGHEIDRKIIVVKDAV KEVGNYKAIVKLHKEVSVEIPFEVVAE >gi|226332155|gb|ACIC01000165.1| GENE 4 3606 - 3878 462 90 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|29347572|ref|NP_811075.1| 30S ribosomal protein S18 [Bacteroides thetaiotaomicron VPI-5482] # 1 90 1 90 90 182 100 5e-46 MAQQTQSEIRYLTPPSVDVKKKKYCRFKKSGIRYIDYKDPEFLKKFLNEQGKILPRRITG TSLKFQRRIAQAVKRARHLALLPYVTDMMK >gi|226332155|gb|ACIC01000165.1| GENE 5 3881 - 4225 578 114 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|29347573|ref|NP_811076.1| 30S ribosomal protein S6 [Bacteroides thetaiotaomicron VPI-5482] # 1 114 1 114 114 227 100 2e-59 MNQYETVFILTPVLSDVQMKEAVEKFKGILQAEGAEIINEENWGLKKLAYPIQKKSTGFY QLVEFNADPTVIDKLELNFRRDERVIRFLTFKMDKYAAEYAAKRRSVKSNKKED >gi|226332155|gb|ACIC01000165.1| GENE 6 4386 - 4832 428 148 aa, chain + ## HITS:1 COG:FN2010 KEGG:ns NR:ns ## COG: FN2010 COG1846 # Protein_GI_number: 19705306 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Fusobacterium nucleatum # 25 148 17 141 160 65 32.0 4e-11 
MIEQFNFDIRLIFAILNGKVSAAINRKLYRNFRQNGLEISPEQWTVLIFLWEKDGVTQQE LCNATFKDKPSMTRLIDNMERQHLVVRISDKKDRRTNLIHLTKDGKELEEKARIIAGQTL KEALHGITLEELSIGQEVLKKVFYNTKD >gi|226332155|gb|ACIC01000165.1| GENE 7 4972 - 5148 229 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|298383732|ref|ZP_06993293.1| ## NR: gi|298383732|ref|ZP_06993293.1| hypothetical protein HMPREF9007_00288 [Bacteroides sp. 1_1_14] # 1 58 1 58 58 63 100.0 5e-09 MKGLLKNLGLILILIGVVILLACSFTGNVNNNAVLGSSVFLVVLGLISYIVINKKIAD Prediction of potential genes in microbial genomes Time: Thu May 12 03:32:42 2011 Seq name: gi|226332154|gb|ACIC01000166.1| Bacteroides sp. 1_1_6 cont1.166, whole genome shotgun sequence Length of sequence - 112584 bp Number of predicted genes - 94, with homology - 93 Number of transcription units - 43, operones - 24 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 40/0.000 - CDS 35 - 736 775 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 2 1 Op 2 . - CDS 738 - 2249 1202 ## COG0642 Signal transduction histidine kinase - Prom 2395 - 2454 8.1 - Term 2476 - 2528 10.1 3 2 Tu 1 . - CDS 2551 - 4707 2017 ## COG0480 Translation elongation factors (GTPases) - Prom 4884 - 4943 4.6 4 3 Op 1 . + CDS 4965 - 6095 769 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases 5 3 Op 2 . + CDS 6115 - 6663 621 ## BT_2169 RNA polymerase ECF-type sigma factor + Prom 6695 - 6754 5.7 6 4 Op 1 . + CDS 6798 - 7232 375 ## BT_2170 hypothetical protein 7 4 Op 2 . + CDS 7257 - 8129 658 ## COG3712 Fe2+-dicitrate sensor, membrane component 8 4 Op 3 . + CDS 8135 - 10828 1699 ## BT_2172 hypothetical protein 9 4 Op 4 . + CDS 10835 - 11848 745 ## BT_2173 hypothetical protein 10 5 Op 1 . - CDS 11877 - 12554 379 ## COG1137 ABC-type (unclassified) transport system, ATPase component 11 5 Op 2 . - CDS 12533 - 12838 193 ## BT_2175 hypothetical protein - Prom 12874 - 12933 5.4 + Prom 12831 - 12890 4.6 12 6 Op 1 . 
+ CDS 13033 - 13320 295 ## BT_2176 hypothetical protein 13 6 Op 2 . + CDS 13317 - 13925 441 ## COG2431 Predicted membrane protein + Prom 13936 - 13995 6.3 14 7 Tu 1 . + CDS 14218 - 14574 450 ## BT_2178 hypothetical protein + Term 14593 - 14643 7.7 - Term 14584 - 14629 7.2 15 8 Tu 1 . - CDS 14704 - 15768 1290 ## BT_2179 putative DNA mismatch repair protein - Prom 15822 - 15881 5.7 + Prom 15737 - 15796 4.6 16 9 Tu 1 . + CDS 15895 - 16590 558 ## COG1011 Predicted hydrolase (HAD superfamily) - Term 16514 - 16541 0.1 17 10 Tu 1 . - CDS 16631 - 17404 542 ## BT_2181 transcriptional regulator - Prom 17472 - 17531 3.2 + Prom 17407 - 17466 5.4 18 11 Tu 1 . + CDS 17492 - 18157 685 ## COG3506 Uncharacterized conserved protein + Term 18314 - 18349 3.1 - Term 18481 - 18527 -0.7 19 12 Tu 1 . - CDS 18562 - 19617 759 ## BT_2183 hypothetical protein - Prom 19637 - 19696 9.4 + Prom 19683 - 19742 7.3 20 13 Op 1 . + CDS 19812 - 20306 293 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 21 13 Op 2 . + CDS 20293 - 20883 414 ## BT_2185 hypothetical protein 22 13 Op 3 . + CDS 20924 - 21781 826 ## COG0128 5-enolpyruvylshikimate-3-phosphate synthase 23 13 Op 4 . + CDS 21852 - 22157 212 ## COG0128 5-enolpyruvylshikimate-3-phosphate synthase 24 13 Op 5 . + CDS 22171 - 22740 433 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases 25 13 Op 6 . + CDS 22815 - 23531 575 ## COG0300 Short-chain dehydrogenases of various substrate specificities - Term 23522 - 23552 -0.5 26 14 Tu 1 . - CDS 23555 - 24553 828 ## COG2234 Predicted aminopeptidases - Prom 24597 - 24656 4.3 27 15 Op 1 . - CDS 24714 - 28298 2931 ## COG3250 Beta-galactosidase/beta-glucuronidase 28 15 Op 2 . - CDS 28317 - 29771 1011 ## COG3669 Alpha-L-fucosidase 29 15 Op 3 . - CDS 29798 - 31471 1357 ## BT_2193 hypothetical protein 30 15 Op 4 . - CDS 31489 - 32439 908 ## BT_2194 sialidase precursor, exo-alpha-sialidase 31 15 Op 5 . 
- CDS 32492 - 34234 1720 ## BT_2195 hypothetical protein 32 15 Op 6 . - CDS 34263 - 37625 3101 ## BT_2196 hypothetical protein - Prom 37741 - 37800 5.1 33 16 Op 1 . - CDS 37821 - 38303 289 ## BT_2197 putative anti-sigma factor 34 16 Op 2 . - CDS 38254 - 38787 301 ## BT_2197 putative anti-sigma factor - Prom 38838 - 38897 7.3 - Term 39022 - 39069 3.2 35 17 Tu 1 . - CDS 39192 - 39773 372 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 39835 - 39894 3.2 36 18 Tu 1 . - CDS 39922 - 42198 1829 ## COG3537 Putative alpha-1,2-mannosidase 37 19 Op 1 . - CDS 42301 - 44190 1975 ## BT_2200 hypothetical protein 38 19 Op 2 . - CDS 44224 - 45867 1711 ## BT_2201 hypothetical protein 39 19 Op 3 . - CDS 45882 - 48872 3176 ## BT_2202 hypothetical protein 40 19 Op 4 . - CDS 48895 - 49959 977 ## BT_2203 hypothetical protein - Prom 49986 - 50045 3.2 - Term 49982 - 50026 -0.1 41 20 Tu 1 . - CDS 50094 - 52667 1646 ## BT_2204 hypothetical protein - Prom 52696 - 52755 4.7 42 21 Tu 1 . - CDS 52774 - 52887 114 ## - Prom 53008 - 53067 5.8 + Prom 52905 - 52964 6.3 43 22 Tu 1 . + CDS 53003 - 53428 410 ## BT_2205 hypothetical protein + Prom 53459 - 53518 8.8 44 23 Op 1 . + CDS 53539 - 54348 811 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components 45 23 Op 2 . + CDS 54363 - 54776 499 ## COG0802 Predicted ATPase or kinase 46 23 Op 3 . + CDS 54773 - 54997 163 ## BT_2208 hypothetical protein + Term 54998 - 55037 1.2 47 24 Op 1 6/0.000 + CDS 55074 - 55367 264 ## COG1669 Predicted nucleotidyltransferases 48 24 Op 2 . + CDS 55351 - 55728 234 ## COG2361 Uncharacterized conserved protein + Term 55860 - 55896 -1.0 - Term 55663 - 55701 -0.6 49 25 Op 1 23/0.000 - CDS 55723 - 57033 837 ## COG1721 Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) 50 25 Op 2 . - CDS 57093 - 58067 1048 ## COG0714 MoxR-like ATPases 51 25 Op 3 . - CDS 58079 - 59353 967 ## BT_2213 hypothetical protein 52 25 Op 4 . 
- CDS 59350 - 59964 476 ## BT_2214 hypothetical protein 53 25 Op 5 . - CDS 59985 - 60917 925 ## BT_2215 hypothetical protein 54 25 Op 6 . - CDS 60898 - 61860 748 ## COG1300 Uncharacterized membrane protein - Prom 61931 - 61990 4.7 - Term 61990 - 62036 10.3 55 26 Tu 1 . - CDS 62058 - 63443 706 ## BT_2217 hypothetical protein - Prom 63465 - 63524 2.5 - Term 63701 - 63738 1.8 56 27 Op 1 . - CDS 63859 - 65268 677 ## BT_2219 hypothetical protein 57 27 Op 2 . - CDS 65325 - 66599 474 ## BT_2220 hypothetical protein 58 27 Op 3 . - CDS 66614 - 66931 153 ## BT_2221 hypothetical protein - Prom 66965 - 67024 6.7 59 28 Tu 1 . - CDS 67155 - 68942 705 ## BT_2223 TPR repeat-containing protein - Prom 68981 - 69040 6.7 + Prom 68954 - 69013 6.3 60 29 Tu 1 . + CDS 69044 - 69769 435 ## COG1714 Predicted membrane protein/domain 61 30 Tu 1 . - CDS 69779 - 70384 442 ## BT_2225 hypothetical protein - Term 70705 - 70755 12.1 62 31 Op 1 . - CDS 70789 - 71124 522 ## BT_2228 hypothetical protein 63 31 Op 2 . - CDS 71168 - 71482 174 ## PROTEIN SUPPORTED gi|124485582|ref|YP_001030198.1| ribosomal protein L12E/L44/L45/RPP1/RPP2-like protein 64 31 Op 3 . - CDS 71551 - 75354 3922 ## COG0587 DNA polymerase III, alpha subunit - Prom 75472 - 75531 4.6 + Prom 75425 - 75484 7.0 65 32 Op 1 14/0.000 + CDS 75516 - 76202 570 ## COG0688 Phosphatidylserine decarboxylase 66 32 Op 2 . + CDS 76213 - 76920 436 ## COG1183 Phosphatidylserine synthase 67 32 Op 3 . + CDS 76987 - 77286 283 ## BT_2233 hypothetical protein + Term 77300 - 77348 4.3 - Term 77225 - 77261 -0.8 68 33 Op 1 . - CDS 77341 - 77784 402 ## COG0590 Cytosine/adenosine deaminases 69 33 Op 2 . - CDS 77784 - 78017 229 ## BT_2235 hypothetical protein - Prom 78048 - 78107 5.7 + Prom 77999 - 78058 8.5 70 34 Op 1 . + CDS 78085 - 78450 346 ## COG0792 Predicted endonuclease distantly related to archaeal Holliday junction resolvase 71 34 Op 2 . + CDS 78501 - 78857 404 ## COG2315 Uncharacterized protein conserved in bacteria 72 34 Op 3 . 
+ CDS 78841 - 79605 400 ## COG0340 Biotin-(acetyl-CoA carboxylase) ligase 73 34 Op 4 . + CDS 79672 - 81369 822 ## BT_2239 carboxyl-terminal protease + Term 81395 - 81441 11.4 - Term 81137 - 81173 -0.9 74 35 Tu 1 . - CDS 81411 - 82094 544 ## BT_2240 TPR domain-containing protein - Prom 82199 - 82258 5.8 + Prom 82086 - 82145 5.3 75 36 Op 1 . + CDS 82206 - 83525 976 ## COG0534 Na+-driven multidrug efflux pump 76 36 Op 2 . + CDS 83599 - 84309 922 ## COG0528 Uridylate kinase + Term 84333 - 84398 6.3 77 37 Tu 1 . + CDS 84449 - 86929 2305 ## BT_2243 hypothetical protein + Term 86954 - 87009 10.8 - Term 86836 - 86870 0.1 78 38 Op 1 . - CDS 86935 - 88020 940 ## BT_2244 hypothetical protein 79 38 Op 2 . - CDS 88027 - 89082 980 ## BT_2245 hypothetical protein 80 38 Op 3 . - CDS 89103 - 91151 1649 ## BT_2246 hypothetical protein 81 38 Op 4 . - CDS 91155 - 91457 363 ## BT_2247 putative ryanodine receptor - Prom 91477 - 91536 2.1 + Prom 91757 - 91816 3.4 82 39 Op 1 . + CDS 91868 - 92779 707 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 83 39 Op 2 . + CDS 92831 - 93391 718 ## COG0233 Ribosome recycling factor 84 39 Op 3 . + CDS 93486 - 94418 930 ## COG1162 Predicted GTPases + Prom 94506 - 94565 8.2 85 40 Op 1 27/0.000 + CDS 94734 - 95816 872 ## COG0845 Membrane-fusion protein 86 40 Op 2 9/0.000 + CDS 95832 - 98864 2710 ## COG0841 Cation/multidrug efflux pump 87 40 Op 3 . + CDS 98861 - 100207 1301 ## COG1538 Outer membrane protein + Term 100230 - 100276 3.3 + Prom 100269 - 100328 1.7 88 40 Op 4 . + CDS 100358 - 101452 807 ## BT_2254 putative pectate lyase + Term 101486 - 101550 14.2 - Term 101473 - 101538 21.1 89 41 Op 1 . - CDS 101575 - 102735 1002 ## COG3274 Uncharacterized protein conserved in bacteria - Prom 102756 - 102815 8.4 - Term 102762 - 102805 4.3 90 41 Op 2 . - CDS 102882 - 104516 499 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase - Prom 104572 - 104631 4.3 - Term 104659 - 104714 -0.3 91 42 Op 1 . 
- CDS 104828 - 106819 1638 ## BT_2257 hypothetical protein - Term 106825 - 106868 4.1 92 42 Op 2 . - CDS 106876 - 108135 1226 ## COG2262 GTPases - Prom 108166 - 108225 6.4 - Term 108173 - 108219 10.7 93 43 Op 1 . - CDS 108242 - 109702 1237 ## BT_2259 hypothetical protein 94 43 Op 2 . - CDS 109722 - 112559 2198 ## BT_2260 outer membrane protein Omp121 Predicted protein(s) >gi|226332154|gb|ACIC01000166.1| GENE 1 35 - 736 775 233 aa, chain - ## HITS:1 COG:lin2728 KEGG:ns NR:ns ## COG: lin2728 COG0745 # Protein_GI_number: 16801789 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Listeria innocua # 6 226 3 221 225 141 38.0 9e-34 MDEKLRILLCEDDENLGMLLREYLQAKGYSAELYPDGEAGYKAFLKNKYDLCVFDVMMPK KDGFTLAQDVRAANAEIPIIFLTAKTLKEDILEGFKIGADDYITKPFSMEELTFRIEAIL RRVRGKKNKESNIYKIGKFTFDTQKQILATEGKQTKLTTKESELLGLLCAHANEILQRDF ALKTIWIDDNYFNARSMDVYITKLRKHLKEDDSIEIINIHGKGYKLITPEVES >gi|226332154|gb|ACIC01000166.1| GENE 2 738 - 2249 1202 503 aa, chain - ## HITS:1 COG:CAC1701 KEGG:ns NR:ns ## COG: CAC1701 COG0642 # Protein_GI_number: 15894978 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 268 498 341 564 566 124 36.0 6e-28 MGLSFLSLLYLQVSYIEEMMKTRKEQFDSAVRNSLDQVSKDVEYAETMRWLVEDISDAER KALTENNTSLGQNNVVRGTQRFQVKSRDGKIISDFELRVMKIKPSELPKAMVRGRNTIPQ TSNSMLEAIKNRYMYQRALLDEVVLDMIRRASDKSIGDRVRFRDLDDYLKSNLYNNGIDL PYHFAVIDKDGREVYRCADYEAKGSEESYQQALFKNDPPAKMSILKVHFPGKKDYIFDSI SFMIPSLIFTLVLLVTFIFTIYIVFRQKKLTEMKNDFVNNMTHEFKTPISTISLAAQMLK DPAVAKSPQMFQHISGVINDETKRLRFQVEKVLQMSMFDRQKATLKMKEIDANELISGVI NTFALKVERYNGKITSNLEAADPVIFADEMHLTNVIFNLMDNAVKYKKAEEDLELKVKTW NESGKLMISIQDNGIGIKKENLKKIFEKFYRVHTGNLHDVKGFGLGLAYVKKIITDHKGT IRAESDLNVGTKFIIALPLLKNN >gi|226332154|gb|ACIC01000166.1| GENE 3 2551 - 4707 2017 718 aa, chain - ## HITS:1 COG:FN1546 KEGG:ns NR:ns ## COG: FN1546 COG0480 # 
Protein_GI_number: 19704878 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Fusobacterium nucleatum # 1 710 3 685 690 441 36.0 1e-123 MKVYQTNEIKNIALLGSSGSGKTTLVEAMLFESGVIKRRGSVAAKNTVSDYFPVEQEYGY SVFSTVLHVEWNNKKLNIIDCPGSDDFVGSTVTALNVTDTAIILLNGQYGVEVGTQNHFR YTEKLNKPVIFLVNQLDNEKCDYDNILEQLKEAYGSKVVPIQYPIATGPGFNALIDVLLM KKYSWKPEGGAPVIEDIPAEEMDKAMEMHKALVEAAAENDEGLMEKFFEQDSLTEDEMRE GIRKGLIARGMFPVFCVCGGKDMGVRRLMEFLGNVVPFVSEMPKVENTDGKEVAPDVNGP ESLYFFKTSVEPHIGEVSYFKVMSGKVREGDDLLNADRGSKERIAQIYVVAGGNRVKVEE LQAGDIGAAVKLKDVKTGNTLNGKDCDYKFNFIKYPNSKYSRAIKPVNEADVEKMMTILN RMREEDPTWVIEQSKELKQTLVHGQGEFHLRTLKWRLENNEKLQVKFEEPKIPYRETITK AARADYRHKKQSGGAGQFGEVHLIVEPYKEGMPVPDTYKFNGQEFKITVRGTEEIPLEWG GKLVFINSIVGGSIDARFLPAIMKGIMSRLEQGPLTGSYARDVRVIVYDGKMHPVDSNEI SFMLAGRNAFSEAFKNAGPKILEPIYDVEVFVPSDRMGDVMGDLQGRRAMIMGMSSEKGF EKLVAKVPLKEMSSYSTALSSLTGGRASFIMKFASYELVPTDVQDKLIKDFEAKQTEE >gi|226332154|gb|ACIC01000166.1| GENE 4 4965 - 6095 769 376 aa, chain + ## HITS:1 COG:lin1513 KEGG:ns NR:ns ## COG: lin1513 COG0635 # Protein_GI_number: 16800581 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Listeria innocua # 2 373 12 385 385 227 33.0 3e-59 MAGIYLHIPFCKTRCIYCDFYSTTRSELKTHYVHTLCRELEMRKEYLKGEPVETIYFGGG TPSQLEEADFKHIFETIRENYGMEHCREITLEANPDDLSQEYLKMLSSLPFNRISMGIQT FDDTTLQLLKRRHSSQTAVEAVRRCREAGFQNISIDLIYGLPGETKERWVNDLRQAIRLD VEHISAYHLTYEEDTPIYNMLKQHQIEEVDEDSSLQFFTLLIEHLQNAGYEHYEISNFCR PDKYSRHNTSYWRGIPYLGCGPSAHSFNGTTREWNVSSIDLYIKGIEGNQRDFETENLDQ TTRYNEFIITTIRTVWGTPIEKLKQEFGNELWEYCRKMSAPYLENGKLEIHEGALRLTRE GIFISDSIMSDLLWVN >gi|226332154|gb|ACIC01000166.1| GENE 5 6115 - 6663 621 182 aa, chain + ## HITS:1 COG:no KEGG:BT_2169 NR:ns ## KEGG: BT_2169 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 182 1 182 182 302 100.0 5e-81 
MTESEVRKLLRQMKELDSQTAFRDFYNMTYDRLFRIAYYYVKQEEWSQEIVLDIFLKLWK QRDTLLDVKNIEDYCFILVKNASLNYLEKESKYTTIHSDSLPEPQEQSYSPEESLISEEL FAIYVKALDRLPERCREIFIRIREEKQSYAQVAEELDISIKTVDAQLQKAVSRLKEMISS SG >gi|226332154|gb|ACIC01000166.1| GENE 6 6798 - 7232 375 144 aa, chain + ## HITS:1 COG:no KEGG:BT_2170 NR:ns ## KEGG: BT_2170 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 144 1 144 144 254 100.0 6e-67 MKKFNDANVGLFVLLTACLSLFSCNNDNDNYPKDYVGFEKSTRTVECDKNQSESELQIKI IATDKSKEDRTVLLATPALPAGQAPIMKLTETKVTIKAGQKSATTTIKLYPKKMVLKQQN ITLSCTPQWKEGSVSKLTILLKRN >gi|226332154|gb|ACIC01000166.1| GENE 7 7257 - 8129 658 290 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 74 253 115 290 331 84 33.0 3e-16 MKYTDRELEDILNKLIASTRSPRGRFSAASSYPQLEKKLNFRNRRLYLTRTFAAAAAVVL LCLSVWTAYLYMQPATIQTVSTLAETRTVHLPDGTSVTLNHYSSLSYPERFKSDNREVEL SGEAYFEVSKDPKHPFIVQTETIDVQVLGTHFNVDAYPDNPDVKTTLLTGSVAVSNKNNS VRMVLKPNEAAIYNKVEQKLTRKVLENAGDEISWRHGEFIFDDLPLQEIARELSNSFGTT IHIADSTLRNYRITARFRNGEDLDAILSVLHNAGYFNYSRNTQQITITKN >gi|226332154|gb|ACIC01000166.1| GENE 8 8135 - 10828 1699 897 aa, chain + ## HITS:1 COG:no KEGG:BT_2172 NR:ns ## KEGG: BT_2172 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 897 1 897 897 1746 100.0 0 MRTIHQGHLPIRTLVLIGMVILITAQAYAQNANARLTITLRNATLKEFVKLIENSTGYSF IYSEEISISHKINLKVKDMPLHEILDLVFKDEPISYKFSERYILLQKKRVQKPVSRKFTI SGYVTDGTSSETLIGTNIIESHQYQGTTTNPYGFYSITLPEGETELRFSYLGYTTETHHF TLAQDTLLNIRMKGNAQLEEVVVVSDKAEAGTVATQMGAVEIPMVQIKNTPSILGESDVM KAIQLMPGVQAGVDGSAGLYIRGGSPDQNLILLDGTPVYNVDHMFGFFSVFTPEAIKKVT LFKSSFPARFGGRLSSVIDVRTNDGDMKKYHGTLSIGLLTSKINLEGPIVKDKTSFNISA RRSYIDLIAKPFMPDDEKYNYYFYDINAKINHKFSDRSRLYLSAYNGKDHFATQYDDNSD 
FKDGSKMNWGNTIISARWNYIFNNRLFSNTTVSYNNYLFNVSAYSNNQYSLTNNTTFINR YSSDYRSGINDWNYQIDFDYNPSPMHHIKFGAGYIYHRFRPEVMTSKISDKVDSEAFRDT TYHSMNNSRIYAHEVSAYAEDNLKIGTRLRLNLGLHFSLFQVQKQSYFSLQPRVSTRYQL SKDVILKASYTKMSQYVHLLSSMPIAMPTDLWVPVTKKIKPMRSHQYSLGGYYTGIKGWE FSVEGYYKDMYNVLEYKDGVSFFGSSTGWESKVEMGKGRSMGIEFMAQKTLGKTTGWLSY TLSKSDRKFAKGAINNGERFPYKYDRRHIINLTLNHKFSERIDIGASWVFYTGGTSTIPE EKTTIIRPGNGANNGFILGFDNYYNPIYNNSPNVGEANYIEHRNNYRLPASHRLNIGINF NKKTKHGMRIWNISLYNAYNSMNPAWVFRSHNDDGKSVIKKYTLLPCIPSFTYTYKF >gi|226332154|gb|ACIC01000166.1| GENE 9 10835 - 11848 745 337 aa, chain + ## HITS:1 COG:no KEGG:BT_2173 NR:ns ## KEGG: BT_2173 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 337 1 337 337 646 100.0 0 MKTYIYHPLILLVIILVTVSCENELPFNIKDNPPKLVMNAFINADSLTNVLFLNLTGKDY ANHIENATVEVHVNGELRETLRPLPMETEYDKQCRFNITGKFASGEVVRIDAMTDDGKYH AWAEVTVPQRLDKIENIDTLTVPLIQNGHTQDYMRYKITFKDRPNEANFYRIVVDKQMRL WGYNHEEGGEDYLHWTKHITYSFIGREDIVLTDGQPSTGDDEDNGLFDTAKNIYGVFDDS RFRDTSYTMTVYNDPAIPYNNNYNGYYEGMDVIIRLLSITETEYYYLKALNLVDSDVYDE TINEPLRYPSNVHGGTGMIGISTEVSQTVRIRENNPK >gi|226332154|gb|ACIC01000166.1| GENE 10 11877 - 12554 379 225 aa, chain - ## HITS:1 COG:CC3600 KEGG:ns NR:ns ## COG: CC3600 COG1137 # Protein_GI_number: 16127830 # Func_class: R General function prediction only # Function: ABC-type (unclassified) transport system, ATPase component # Organism: Caulobacter vibrioides # 5 215 11 232 252 97 30.0 2e-20 MHHLLEIDSIIKNFGTRQLLTDVYLKINTGDVIGLFGRNGTGKSVLMQIIFGTMKADRKF IRLDECKVLSAPYQHARTIAFLPQQGYLPKHEKINKLVDCYLDANVVPYFYEDDEVAQSC MERRVSQLSGGERRYLEAKLLLLGNAKFVLLDEPFDYLSFHLVDKLIGLIKKHSEEKGIV IADHNYEKVLEVVNRLVLIREGVLMELTDKRGLVEQGYLLDESYL >gi|226332154|gb|ACIC01000166.1| GENE 11 12533 - 12838 193 101 aa, chain - ## HITS:1 COG:no KEGG:BT_2175 NR:ns ## KEGG: BT_2175 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 101 18 118 118 163 100.0 2e-39 MRRLKLYYLFFHSTLKINVPLSILGALIVSKADWSLFWEAFPYLLGGWGIVASLLYKEFL 
EKEAYFFYYNSGILKRNLIVFVFAVYWSVLWIVKLCITCLK >gi|226332154|gb|ACIC01000166.1| GENE 12 13033 - 13320 295 95 aa, chain + ## HITS:1 COG:no KEGG:BT_2176 NR:ns ## KEGG: BT_2176 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 95 1 95 95 129 100.0 3e-29 MFIIIGIMLTGMLFGFLLRNKRLSWIHKIITLLIWVLLFLLGIDVGGNEAIIKGLHTLGL EALIITLAAVTGSILCAWGLWYLLYIRNKGKETEV >gi|226332154|gb|ACIC01000166.1| GENE 13 13317 - 13925 441 202 aa, chain + ## HITS:1 COG:FN1083 KEGG:ns NR:ns ## COG: FN1083 COG2431 # Protein_GI_number: 19704418 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 5 200 2 195 198 117 39.0 1e-26 MKGSLIIVSFFILGTLCGFYHLIPYDFTDSKLSYYALCGLMFCVGISIGNDPNTLKSFRS LNPRLMFLPVMTILGTLAGCAVAGIFMSQRTSSDCMAVGAGFGYYSLSSIFITEYKGPEL GTIALLSNIMREIIALLGAPLLVKYFGKLAPISVGGATTMDTTLPIITRCSGKEFVIISI FHGFVVDFSVPFLVTFLCSISF >gi|226332154|gb|ACIC01000166.1| GENE 14 14218 - 14574 450 118 aa, chain + ## HITS:1 COG:no KEGG:BT_2178 NR:ns ## KEGG: BT_2178 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 118 1 118 118 196 100.0 2e-49 MKRLGLTLVAALCLAASTFAAGNQPTTAKWEGNINVNKLGKYLQLNSVQSEEVANICDYF SEQMSRATTAKKDKEAKLRNAVYGNLKLMRKTLSAEQYAKYAALMNLTLLNKGIDLSK >gi|226332154|gb|ACIC01000166.1| GENE 15 14704 - 15768 1290 354 aa, chain - ## HITS:1 COG:no KEGG:BT_2179 NR:ns ## KEGG: BT_2179 # Name: not_defined # Def: putative DNA mismatch repair protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 354 1 354 354 689 100.0 0 MKIGDKVRFLSEVGGGIVTGFKGKDFVLVEDADGFDIPMPIRECVVIEADDYNIKRKPAA TAPKQEEPAKPAKPEMPVIQRQPEVRGGDTLNVFLAYVPEDAKAMMTTPFEAYLVNDSNY YLYYTYLSAEGKAWNNRSHGLVEPNTKLLLEEFTKDVLNEMERVAVQLIAFKDGKPAAIK PAVSVELRIDTVKFYKLHTFSASDFFEEPALIYDIVKDDVPAKQVYVSAEEIQSALLQKK FVDKPKSQPIMKPNHGQSGKNGIIEVDLHIDSLLDDTHGMSNSEILNYQLDKFREVIEAN KEKREQKVVFIHGKGDGVLRKAIIDELKRKHSNYRYQDASFQEYGFGATMVTIK >gi|226332154|gb|ACIC01000166.1| GENE 16 15895 - 16590 558 231 aa, 
chain + ## HITS:1 COG:mlr6523 KEGG:ns NR:ns ## COG: mlr6523 COG1011 # Protein_GI_number: 13475450 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Mesorhizobium loti # 5 230 7 231 238 208 45.0 7e-54 MKELIKVIAFDADDTLWSNEPFFQEVEKQYTDLLKPYGTSKEISAALFQTEMNNLQILGY GAKAFTISMVETALQISNGKIAADIIRQIVDLGKSLLKMPIELLPGVKETLKTLKETGKY KLVVATKGDLLDQENKLERSGLSPYFDHIEVMSDKTEKEYLRLLSILQIAPSELLMVGNS FKSDIQPVLSLGGYGVHIPFEVMWKHEVTETFAHERLKQVKRLDDLLSLLG >gi|226332154|gb|ACIC01000166.1| GENE 17 16631 - 17404 542 257 aa, chain - ## HITS:1 COG:no KEGG:BT_2181 NR:ns ## KEGG: BT_2181 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 257 1 257 257 511 100.0 1e-143 MDVLQKEIDEVYATHPTAHEALDNGIVEQHQQFVRSLTEVNGGCAVISDLSNRKSYVTVH PWANFLGLTPEEAALSVIDSMDEDCIYRRIHPEDLVEKRLMEYKFFQKTFSMSPGERLKY RGRCRLRMMNEKGVYQYIDNLVQIMQNTPAGNVWLIFCLYSLSADQRPEQGIYATITQME RGEVETLSLSEEHRNILSEREKEILRCIRKGLSSKEIAATLYISVNTVNRHRQNILEKLS VGNSIEACRAAELMKLL >gi|226332154|gb|ACIC01000166.1| GENE 18 17492 - 18157 685 221 aa, chain + ## HITS:1 COG:all7165 KEGG:ns NR:ns ## COG: all7165 COG3506 # Protein_GI_number: 17233181 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. 
PCC 7120 # 28 204 1 176 183 159 43.0 3e-39 MKHLKIKIMSIALLAAATSSMAQSLDKMNWLNEPQQWEIKEGKALVMDVPAKTDFWRISH YGFTVDDGPFYYATYGGEFEVKVKITGNYVTTFDQMGLMLRIDHENWIKAGVEYVDGKQN VSAVVTHRTSDWSVVQLPDAPRSLWIKAVRRLDAVEIFFSRDDKEYTMIRTCWLQDNCPV MVGLMGACPDGKGFTATFEEFKVTPLADQRRLEWAKKQADK >gi|226332154|gb|ACIC01000166.1| GENE 19 18562 - 19617 759 351 aa, chain - ## HITS:1 COG:no KEGG:BT_2183 NR:ns ## KEGG: BT_2183 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 351 1 351 351 689 100.0 0 MNIPNIRLLNQQLLNPLFREPKELVSWMGAMQAQNYSMVKWAVGMRLKSATIQTVEKALR EGEILRTHVMRPTWHLVAAEDIRWMLKLSAQRIISANDSFAKGYDLDIPNELYTKAHDLL EKILCGKKSLTKQEIAEHFNRSGIVADNRRMTRFMARAEQEGIICSGEDRGSKCTYALLE ERVPPMPELTKDESLARLARSYFRSHSPAVLQDFIWWSGLSVSDAKQAVYLIASELITEQ WKEQTWYIHDICRTRGKLSGHIHLLPSYDEYLLGYKDRTDVLPLEHYSKAFTNNGLFFPI VLHNGQVIGNWDKSVKKKSVDLKYSWFRQVADMNEETLERERQKFTRFLEK >gi|226332154|gb|ACIC01000166.1| GENE 20 19812 - 20306 293 164 aa, chain + ## HITS:1 COG:CC3310 KEGG:ns NR:ns ## COG: CC3310 COG1595 # Protein_GI_number: 16127540 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Caulobacter vibrioides # 24 155 26 155 166 82 33.0 2e-16 MEEEFIELINQYQGILQKICNIYFYQHPFREDYYQEILIRLWKAYPKFKQESTVSTWLYR VTLNTAIDLVRKESVQPVYKELSTQEYAIHDSHPEENTATDKKEQLYQAINRLSEIEKAI IILYLESYEYKEIAAVIGISESNVGVKINRIKKQLIKQLNNGRQ >gi|226332154|gb|ACIC01000166.1| GENE 21 20293 - 20883 414 196 aa, chain + ## HITS:1 COG:no KEGG:BT_2185 NR:ns ## KEGG: BT_2185 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 196 1 196 196 369 100.0 1e-101 MEDNNLKECWKEIHTGTEIKPMNIKEVIRKKHCRTISQTLRRQKTLISLFAILLAFSVAA SIWDTIIMGSASISLWAGSAFLLFLLVSAIGHYQLLTRSADTYSLKESGAMLKKQLERRI NIDFIIYLIFFYGTAVRFVIQYFNDYEGLKGLTFILMLFTGILLAIPWLMRYQQKHRYRY YFNSLDKSQKLLEVSE >gi|226332154|gb|ACIC01000166.1| GENE 22 20924 - 21781 826 285 aa, chain + ## HITS:1 COG:PM0839 KEGG:ns NR:ns ## COG: PM0839 COG0128 # 
Protein_GI_number: 15602704 # Func_class: E Amino acid transport and metabolism # Function: 5-enolpyruvylshikimate-3-phosphate synthase # Organism: Pasteurella multocida # 10 272 14 308 440 175 37.0 1e-43 MMLYKLISPSMVKATIQLPASKSISNRALIINALGKGIYPPENLSDCDDTQVMIKALTEG KETIDIMAAGTAMRFLTAYLSATSGERIITGTARMQQRPIQILVNALRELGAEIEYTHNE GYPPLRIKGAELKGNEITLKGNVSSQYISALLMIGPVLKDGLTLHLTGEIISRPYINLTL QLMQDFGAKAAWTSPSSISVAPQPYQSVPFTVESDWSAASYWYQIAALSPEAEIELLGLF RNSYQGDSRGAEVFSRLGITTEFTPKGVKIQKNRQDTGTTGRRLR >gi|226332154|gb|ACIC01000166.1| GENE 23 21852 - 22157 212 101 aa, chain + ## HITS:1 COG:NMB1432 KEGG:ns NR:ns ## COG: NMB1432 COG0128 # Protein_GI_number: 15677291 # Func_class: E Amino acid transport and metabolism # Function: 5-enolpyruvylshikimate-3-phosphate synthase # Organism: Neisseria meningitidis MC58 # 3 88 340 423 433 70 41.0 6e-13 MQSLKIKETDRIAALRTELKKLGYLIEEENDSVLMWNGERCEPEAVPVIATYEDHRMAMA FAPAVITFPKLLIADPQVVSKSYPGYWEDLKLAGFQVINEG >gi|226332154|gb|ACIC01000166.1| GENE 24 22171 - 22740 433 189 aa, chain + ## HITS:1 COG:all4541 KEGG:ns NR:ns ## COG: all4541 COG0664 # Protein_GI_number: 17232033 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Nostoc sp. 
PCC 7120 # 2 186 3 187 193 109 34.0 4e-24 MEDFNKYTSHINYSPIVDFTLQQGKETSYKKGEYFVRQGEACKIMGFVVSGSFRYCCVNS LGESSIVGYTFDHSFVGNYPAFQLQDQSNVDIQALCDCTVYVINHSQLADFYDINNDHQK LGRRIAETLLWEIYDRMISMYSLTPEERYLDIINRCPDLLKLITLKELASYLLIRPETLS RIRRKVVRK >gi|226332154|gb|ACIC01000166.1| GENE 25 22815 - 23531 575 238 aa, chain + ## HITS:1 COG:CAP0051 KEGG:ns NR:ns ## COG: CAP0051 COG0300 # Protein_GI_number: 15004755 # Func_class: R General function prediction only # Function: Short-chain dehydrogenases of various substrate specificities # Organism: Clostridium acetobutylicum # 1 237 1 240 240 169 38.0 4e-42 MKKIIIVGATSGIGRGLAEQYAREDCLIGITGRRENLLKEICAQDKDKLFYQVCDITHTE TPLLSLETLVQEMGGMDMLIICAGTGELNPQLSYQLEEPTLLTNVVGFTNIAGWGFRYFE QQKGGHLINISSVGGTRGSGVAPAYNASKAYQINYMEGLRQKAAKLPFAVYTTDIRPGFI DTAMAKGEGLFWVTPVDKAVKQIKKAISDRKKVAFISKRWKYVAWLFKLIPSALYCKM >gi|226332154|gb|ACIC01000166.1| GENE 26 23555 - 24553 828 332 aa, chain - ## HITS:1 COG:CC2502 KEGG:ns NR:ns ## COG: CC2502 COG2234 # Protein_GI_number: 16126741 # Func_class: R General function prediction only # Function: Predicted aminopeptidases # Organism: Caulobacter vibrioides # 39 302 34 272 309 115 31.0 8e-26 MKQRIILFGLILVGTVTFAQSPEEKGLNTINRSSAEATIGFLASDELQGREAGFHGSYVS SEYIASILQWMGIQPLNESYFQPFDAYRKERQKKGRLEVHPDSVAKLKQEVHQKLSMRNV LAMIPGKNTKEYVIVGAHFDHLGIDPALDGDQIYNGADDNASGVSAVLQIARAFLASGQQ PERNVIFAFWDGEEKGLLGSKYFVQTCPFVSQIKGYLNFDMIGRNNKPEQPKHVVYFYTA AHPSFGDWLKKDIKKYGLQLEPDYRAWDNPVGGSDNGTFAKAGIPIIWYHTDGHPDYHQP SDHADRLNWDKIVEITKASFLNMWKMANEASW >gi|226332154|gb|ACIC01000166.1| GENE 27 24714 - 28298 2931 1194 aa, chain - ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 185 1189 7 983 1087 695 39.0 0 MMQRLKILGSALIALSFSASAVVAQQFDPKQGYEIHTVNGLVLDNQESLDTNTKIFISKR EADKESQVWNLIPCEQEGCYTITSPLTQMNVDNSGKGKVECSVIQWSADPKNPNQQWKIT 
ALPNGNYVLTNVGTGYNLGFPDAGLVGEPVFQLEPEASKSNQQWQIRKSNLKVVAEALKT TSHNDWENERIYAINKEEGRATFIPFANSEEMKADPAYTRPWERTRSSRYLLLNGNWKFN WVKQPSERPVDFYKTSYDVSGWKEIPVPSNWEMHGYGTPIYTNITYPIRNNPPFIQGQRG YAVEKEPNAVGSYRREFALPADWKDKEVFIHFDGIYSAAYVWINGKKVGYSQGSSNDAEF RITPYVKAGNNTVAVEVYRWCDGSFLEDQDMFRLSGIHRDVYLVASPKVRLRDIHLTSQI SDRLDKAELKVKTDVHNYGKKVQEATVRVSLLNTEGKPVSSFIIPTGKITGGQENVCEGT TTIRDPRLWSAETPSLYTVQLELLDAAGNVLEATSQQYGFRKIEIRNNKVYINNALILFK GANRHDIHPQFGKAVPVESMIEDILLFKRFNLNTIRTSHYPNDPKMYSLYDYYGLYVMDE ADIECHGNMSLSNRESWEGAYVDRMVRMVERDKNHPSVIFWSMGNESGGGRNFEATYQAA KAIDDRYIHYEGMNDVADMDSRMYPSIESMIEQDEQPRNKPYFLCEYAHAMGNAIGNLEE YWDYIEHHSKRMIGGCIWDWVDQGINMPGQPADHYYFGGSFGDRPNDNDFCCNGIVTADR QVTPKLWEVKKVYQYITLEPNAENSIGVRNRYAFLNLHDFNLRYVILKDGVPIAEEEFSL PDGKPGEHRAVQIPYSRYLTSEGEYYLNLEIKLKKDCVWAKAGHIVATEQLLLQKSPNTG LQPVAVSASSKESLFKVVEEEKRYLFFRRPGVEITFDKKEGKLTGIRYNGDNMLHLREGW SLNTFRFINNDVRKWQDTQTEVISFDWKWSEDNQSAIVTIQLQETVGDVKVPYTLIYRLY GNGEIDVDASFTTNDDFNLPRLSLQAFFNPSLEQLEWYGRGPIENYRDRKNAAYVGKYQS AVNDMKESYARSQTMGGRCDTRWLTLTNKAGKGIKITAADTFDFSALHYTDKDLFEIKYG HALPDIYRAEVVLNLDCIQRGLGNASCGPGPRPAYEIQKNTVYKYAFRMSPFSK >gi|226332154|gb|ACIC01000166.1| GENE 28 28317 - 29771 1011 484 aa, chain - ## HITS:1 COG:SP2146 KEGG:ns NR:ns ## COG: SP2146 COG3669 # Protein_GI_number: 15901959 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Streptococcus pneumoniae TIGR4 # 27 477 6 449 559 272 36.0 1e-72 MNIKVLVASAVAAFSSLPVIAQQAPAPCGLVPSARQLEWYNREMIAFFHFGINTFEEYVN EGDGKASTAIFNPTALDCRQWMQTLKAAGIPAAILTAKHADGFCLWPSKYTDYSVKNAAW KNGKGDVVREFVDACEEYGLKAGIYLGPHDRHEHLSPLYTTERYKEYYAHQLGELMSDYG KIWETWWDGAGADELTTPVYRHWYKIVREKQPDCVIFGTKNSYPFADVRWMGNEAGEAGD PCWATTDSVAIRDEAQYYKGLNEGMLDGDAYIPAETDVSIRPSWFYHAEEDSRVKSVREL WDIYCTSVGRNSVLLLNFPPDRRGLIHSTDSLHAALLKQGIDETFSTNLLRGAKVKATNV RGAKYSPEKMLDNEKNTYFAGKDGEVKADIIFTLPKTIEFDCLMIEEVIELGHRTTKWSV EYTVDGKNWITIPEATDKQAIGHKWIVRLAPVKAKQVRLRIQDGKACPAIHTFGVYKQSP VFKL >gi|226332154|gb|ACIC01000166.1| GENE 29 29798 - 31471 1357 557 aa, chain - ## HITS:1 
COG:no KEGG:BT_2193 NR:ns ## KEGG: BT_2193 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 557 1 557 557 1114 100.0 0 MKKYIYILLTFCLLISCMEDPTAGVVVPKYTPSTENPGGPGEEPGDGLIDPAEPLDRGFM HLKGRELKSLNSISITGLNDGEKVILSTLAGLAARVTGDQVYINEGGPSSVWLKQMQNKY GIPVNTYNALAPLVQHYVETGVIKGYIVYTPYSEGQSHSINVATSLCGLLRGIAVPESLV DKVKAMGVTTELMDVRSYDEKWLYENYKDQLDKSLAADMKPEIFHHLRDYITMTNAFAFY DYNARRDWSWRTSILKDLDKGAYCFGYYDLDEWGMVNNASQLGVSMLPTDQAANLATLSS IYDTTGLKQRPATKEVVTEENVHYVTFLVSDGDNIAFNLWGQQGYMDHDLHGQFPLGYTI SPSLYDLAPAALRWYYENSKEGDYFVAGPSGSSYIFPSKMSDADLDDYLAKLNEYVDKSG LNICNILDQKIMDNPKVYNKYLAQPNIDAIFYTGYGEKGDGRIKFSDNGKPVIEQRSVLW EGIDGGSNRGEESTVISQINSRSANPHSADGYTFVFVHCWTKNQQSIKTVIDGLNDNVRV VPVDQFVQLVKQNLGPK >gi|226332154|gb|ACIC01000166.1| GENE 30 31489 - 32439 908 316 aa, chain - ## HITS:1 COG:no KEGG:BT_2194 NR:ns ## KEGG: BT_2194 # Name: not_defined # Def: sialidase precursor, exo-alpha-sialidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 316 1 316 316 632 99.0 1e-180 MAASAVLCSFALLSGCSDDDVCKVYIPEAKTLQIKNFDLGEELREVFHVAVTASDYVTRS GKEVGGDVKVQLTVDEAKVQEYNSLCGTDYPLLPSDCYTLTTETVIPAGESSSADLVLTV NAQGKIQPFDSYLLPVTITSVEGAQADHIQQTVYYLLSGAEDVWAMPLADRSAWQVVEVS SEETSGEGSDNGHAIHAFDGLKGTFWHTQWKGNEPQPPHHIVVDMGQEVKMLGFQYVSRD HGEAWPQEMTMETSLDGSKWESAGTYSDLPAGAKEEFRSYFPGFKQARYFRLTITAVYSG KWATAVAEINAISIIK >gi|226332154|gb|ACIC01000166.1| GENE 31 32492 - 34234 1720 580 aa, chain - ## HITS:1 COG:no KEGG:BT_2195 NR:ns ## KEGG: BT_2195 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 580 1 580 580 1156 100.0 0 MRIHNIIKTVCLAGSLALLSACEDWLEIEPKDRFGDTTVWGSEENADMFLNDIYNQLPHL NNETQNLDQYSDNSYVGAEWMNARTTIYTGALSPTSWIPGPWDMWKWGRQNNDDAKGQYE RIRSCNLFITKVTESDFSAEYKKERLAEVRFLRAWFYHYLWMAYGGVPIITEVLDNNVST DIFYPRETAQKTFEFIDKELDEIKDDLPPRRSGSDLGRASKGAILTLKGWVELFHASELR NPGKDKKRWEAAAATLKDVIDLQVYHLQPTILDLWTEATNNNDEVIFDFQMSKQNGGRRE GLFGPVFVKGVQSSWGNMQPTQELVDDYCMANGLPITDPASGYNKNNPYKNREKRFYQSI 
LYDGSMWQGEEIITRVGVGSPNEIDTSSDSDVTNTGYYTRKTIDESVNGADNLQMSNGMA NYIFFRYADVLLMYAEASLEAGDKPTAIEYLDMVRTRGDNMPSIGDTYPQGITENQLREI IRRDRRIELAFEDKRWWDILRWKICDGENGVMNKPIGGMKIEDTNGDGVWEYNYHEVGKR TFLPRMYYQPIPQYVIDKNPVIREQNGGEDGWVNGQNPGY >gi|226332154|gb|ACIC01000166.1| GENE 32 34263 - 37625 3101 1120 aa, chain - ## HITS:1 COG:no KEGG:BT_2196 NR:ns ## KEGG: BT_2196 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1120 1 1120 1120 2232 100.0 0 MKKIFSLLALVLFINLFTYQAQAQTAVVNINKTSVSLKELIQEVEKQAGYLFVYGKDINI EQKVKINARNKQVKALLENVLPQIGLTYEYADNYISLKKTQVEKKSIGKKIIVKGTVVDK SGEPIIGANIMEQGSASNGTITDIDGKFTLSLPSNSVLAVTYVGFSTKNIPVNNQNAIQI VMEEDSEILEEVVVVAYGTQKKISSTAAVSSMKTKDVAQKPVINISNSLAGRMAGVIAKQ GSGEPGADGSDLRIRGVATLGNQSPLVVVDGVPRDFSRIDPNMIEDITILKDAAAVAPYG MAGANGVVLVTTKKGKSGAPVLSYNGYIGFQNPTRITDQVNSYEYALMKNEAAMNAGYPN YYAYSQHDLEMYRKTCAGDPDADPDRYPNSNGLRDLVQNNSVITNHNLQLSGGSDKFQYY VSLGYSYQEGMWSTTDYQRFNLLANLSVDATKYTKISLSLGGWHEKKNYPGADTGDIMYQ AYRTPPISAIQYSNGLWGQYVGKSLFGLTYRSGYRQEPADQLNTSLSIEQQLPFIQGLSI KGVINYDPYRVKKKKYLTPIPVYTLDASQTPYQFVEGFQGPEKPNLEQSFEESVSMTYQG MINYSRTFGKHTVTGLGVIEARERNVWNMSAKRLNYNINIDEINAGSSDPADISNGGTSW KERQVGYAFRLGYNYDNRYMAEVSGRYDGHYYFAPGKRFGFFPAFSLGWNLREESFMKDV EWMDKLKLRLSYGESGNLAGSSYQYMSDYGFGNAVNFGGVPMMGMWENLQGNPNITWEKA KKFDFGVEFSVLNGMFSLEADYFYEKRSNMLMAPNALVPAEYGIPLSQVNAGSMHNQGID LSLNFNKRIGKDWMISAKGTFTFARNILDEVFETEATFNNPNRRRTGRPDGTMFGYNALG YFTYDDFNPDGSLKAGIATQPWGQVYPGDLRYEDLSGPNGVPDGKIDEHDQTVIGFSNWA PEIIFGLAPTVQWKNFDLNALIQGATRTSISLGETLVMPFFDSGSATKLQFSDHWTPDNT NAPYPRLTSEVTVNNHRQPSSWWVRDATYIRLKSIEIGYTLPRSALNFIGISSARVYASG QNLWTWTPFMKELIDPEAKSANGKYYMQQAVYAFGLNITF >gi|226332154|gb|ACIC01000166.1| GENE 33 37821 - 38303 289 160 aa, chain - ## HITS:1 COG:no KEGG:BT_2197 NR:ns ## KEGG: BT_2197 # Name: not_defined # Def: putative anti-sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 160 163 322 322 300 99.0 8e-81 MTKDASRPFTVQTQKYDVRVLGTEFNVYAYSNSEKFETDLLSGKVRVSSTGFPEESVNLL 
PDEKVSLVDGKLVKSSSHFGGKEYREQGIYDFEELPLGEVLERLEQWYDIHFTVDDPSLL SKIISGKFRQSDQIETILKAISRADLFEYKILSQREITIY >gi|226332154|gb|ACIC01000166.1| GENE 34 38254 - 38787 301 177 aa, chain - ## HITS:1 COG:no KEGG:BT_2197 NR:ns ## KEGG: BT_2197 # Name: not_defined # Def: putative anti-sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 146 2 147 322 291 100.0 5e-78 MNEKWNQYFFGEPTEEEKRELFQELEKNEDMKREFAEMQNIVGLSGLLPREDDSLKGERN LEAMMNRQEKKLRRKRVLQIVRYTTSAAAMIALTWMLAWYMFVGSETPSYTEITVPKGQR VHLTLPDGSEAWLSSLSTLKWPSVFSFRCPYGRTGWRGVFHCDKGCFPSFYSADAEV >gi|226332154|gb|ACIC01000166.1| GENE 35 39192 - 39773 372 193 aa, chain - ## HITS:1 COG:PA0149 KEGG:ns NR:ns ## COG: PA0149 COG1595 # Protein_GI_number: 15595347 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 19 181 15 168 181 61 25.0 1e-09 MIHSLPSSEPLTAQKLFSKLYVSYYARLVRFASLYVGAMGDAENIVQDFFLYLWERKEIL PELQQPDAYLFSAVKHRCLNFLRSQLSIVDRRQPLSDIMEQEFKLKLYSLQLLDDSQMSI DEVEKQICRAIDSLPERCREIFVMSKLKGMKYREIAESLGISQNTVEGQMAIALKKLREE LRHCLPLLLLLSV >gi|226332154|gb|ACIC01000166.1| GENE 36 39922 - 42198 1829 758 aa, chain - ## HITS:1 COG:L135972 KEGG:ns NR:ns ## COG: L135972 COG3537 # Protein_GI_number: 15673483 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Lactococcus lactis # 29 758 3 716 717 432 33.0 1e-120 MVTRLFFLSICFPFLLTSCQQSKKTVEFVDYVNPLMGTESTFAFSHGNTYPAVAVPWGMN FWSPQTGENGSGWMYTYTDSLMRGFRQTHQPSPWINDYGTFSIMPLAGELKMSHKERLVP FSHQQEKATPYNYSVTFNNGLQTSLSATSRGAVFEVSFPEKEDQYVVVDAYNGGSSITIE PEKRLVKGATRYNNGGVPDNFANYFMMEFSHPVIEYGTYNGDTLLHHQTDVAADYTCAYL KFDVPAGEKLTIRTASSFISPEQAAINFNREVADADVQLISGKAREQWNNYLGRVEAEGG TDEQLRTFYSCLYRTLLFPREFYEFDSQGNPVYYSPYDGNVHDGYMYTDNGFWDTFRAVH PLFTLLYPEVSERVTQSIINAYNESGFMPEWASPGHRGCMIGNNSVSLLVDAWMKGIQTV DAEKALEAMIHQTQARHAEIASVGRDGFEYYDKLGYVPYPEVPEATAKTLEYAYADWCIA RFAQSLGKQDIADQYYQKAQNYRNLYYPEHGFMWTKDAKGNWRDRFDATEWGGPFTEGSS 
WHWTWSVFHDPEGLSELMGGHEPMVARLDSMFVAPNTYNYGTYGFVIHEIAEMVALNMGQ YAHGNQPVQHAIYLYDYIGQPWKTQYHLRNVMDKLYNSGSKGYCGDEDNGQTSAWYVFSA MGFYPVCPGMPEYAVGSPLFKKVTLHLPEGKNFVVSAADNAADRPYIRKALLNGQEFTRN YLTHDELKQGGELNLSMDSVPNQQRGTQPADFPYSYSK >gi|226332154|gb|ACIC01000166.1| GENE 37 42301 - 44190 1975 629 aa, chain - ## HITS:1 COG:no KEGG:BT_2200 NR:ns ## KEGG: BT_2200 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 629 6 634 634 1275 99.0 0 MKTTICGAALLLSLTMASCEVYESPEINQGLNEENNTSLYTDAYDPISNVGQYYMPEMKV KPKKYWNCVEVVTVGSRNELGQTEKVRGLQYHLMCQSLAGLANRAVEQGTSEIAVWLHDH GGSDSYKLSKQALEDMGIHEQGMQSGLELARNDYGPSDGVTIQLKGMFDGYVLTDIEHNP ESGVVASVASHVYNSIIVDVRDKEYYEEAGYTMKYDARSKTTAQAWAEFKDKCSNKALVI MPVQTGELREFAIKNELFVLNLNKRQGTSIAGQNTALLKEILAWLEPNAPVYGWEQGVSE DAFVDLVSKSGHPMIPCDWSYNHSLTSLLYSQRQKSTLARVKNPQFLDYTKKKNFVSFFL SDGDNIQWMMNDFKDFYNAAESEEVRMTYGIAASVLPMMAPAQFDNLLSQQKPNCSILEM LGGGYYYVDNYSENGDRAKNLKVVAERLSAHMRQHRVKLLGVMAMNVKSEAAKEAFQAYV DANDQLEGIVALQYSPYAGGEGDVIWVTNKAGYDIPVITVKYSLWNFGNRNAEREGTPAY IAGRLKQEAQQESFSVVCVHAWSNFSDHGQTEDPLIENQSGDIRGAGAAKLCAGHLNDSF EVVNMQELVWRLRMSQRPEQTKKYLSEVF >gi|226332154|gb|ACIC01000166.1| GENE 38 44224 - 45867 1711 547 aa, chain - ## HITS:1 COG:no KEGG:BT_2201 NR:ns ## KEGG: BT_2201 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 547 1 547 547 1082 99.0 0 MKRVKEQIKYLALLAICLLSGCSDFLDSYNPSAVTDDFYNTKEGQKKLLTDINARYRNVF NTGELQYYGTDIYMAISEEPAERMFNGYDKTFNSTAPIVGDYWKVLYKIVQESNILLNRC TPDIAGNDYTSLTAQARFFRAMAYYYLVETFGPVPLLTEENTSVIQQAERTGEQAIYTFI ITELNDIKDKLDMTTISAGKVTNAAVIHFLGKMYLTRAYKSFAETDDFSNAATTFDLLVE NPASGYALQENYAALFDENNQANSEVIWAIQYGLDKNYRGDGNPQQAQFGFNIVALEPDL FIKNQNDYSSVSRKYWVNPKAHELYTDPEADTRYDATFKREYYINNPDNADYGKLGIYFP RWNDKSGMSNNAKRFYPFKQDDEYVWYPQSTALPVLETASDRMPMVRKFSDTKIQWGEGG SREDVIFRLGDTYLLCAEAYLGAGNKKLALERINSIRKRAAKDATAYEAMKLTDLDINVI MDERARELMGEHDRWFDLKRTQTLLTRVPAYNPFVVKYDNLNKNHLVRPIPQDERNKVDG LSQNEGY >gi|226332154|gb|ACIC01000166.1| GENE 39 45882 - 48872 
3176 996 aa, chain - ## HITS:1 COG:no KEGG:BT_2202 NR:ns ## KEGG: BT_2202 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 996 1 996 996 1958 99.0 0 MKNNTIKIKAFLCLLLLMCGWVSVQAQQVLTVKGVVVDALKEPLPGASVQVVGMSTGTVT DLDGNFSLQVPKGKTIAVSFIGYVTQELTVNQNQSNLTIQLKDDSKQLNEVVVIGYGTVE KKDLTGSIQSVTSKELSKLVTTDVTETLNGRVGGVLVNKTSNRPGSDTKIEIRGINSFNF SNEPLYVIDGVPSQTGMRHLNSADIESIDILKDASSSAIYGSRGANGVVIITTKGANKRQ GFNIDYSGYVGFKTPTRIPEMIGNMGNGLEYVDYRTALWKKKYGDASLSRPDFLTDDEKR RIKHGEYYDWLRELSQNALTTSHSVSATGGTDKLSFSFGLGYLKDDGMVGDESFERITAN IGLEYRFSDKFKTGINSYVSLNNTNEGANDALINAYFLPPTVSPYDKDGSYLFNCQPTSS KINPFVQIENNKREKEANYTNFSGYLEYQPIKGLSFKSQIAVQYDSDVYGEWVGTMTQAK GGLNAPEAYRKEGRNMNWVWDNIITYDKTWKNIHRLNAIGLYSVQKETHKGSEMRGDGLP YNSDWHAIETAEEIRDVKSYYWESSMLSFMGRVNYTLMDRYLFTVTGRYDGTSRLATGNQ WGFMPSAAVGWQMKNENFLKNVDWLNSLKLRVSWGKSGNNSIDHDITWTKLDLTHYIYGG KGENAFGLGDRKGNKDLRWEMTSEWNYGIDFGFLNNRINGTIDVYNRTTKDLIFARSVGN LNGYGSILENIGTSSNKGVEIGLNTVNVSKKDFTWKTNLTFSLNRNKIVDLYGDKKDDLA NRWFIGQPMKVIYDLEKVGIWQTEDKELAAKYGQVPGHIRVADLDENYVIDERDYKVLGS PSPDWTAGMTNTFTYKNWDLSFYMYARIGGVYNDDFTYMFTAWDNEHWNKLNVKYWTPEN HSNEYQQIGAQSYHTQVLGKISGSFLKIQNITLGYTLPENWMKKMKMKNARAYINVQNPF TFTDYLGPDPETIGEDVYKSLSLYPMTFTFGVNLTF >gi|226332154|gb|ACIC01000166.1| GENE 40 48895 - 49959 977 354 aa, chain - ## HITS:1 COG:no KEGG:BT_2203 NR:ns ## KEGG: BT_2203 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 38 354 1 317 317 638 99.0 0 MKTRFFMKLCLFSLLVWCISSCKSDNGDSVRPIALECMYTHLTDMTNMLDTAKIGTSDGT FPLANAQNLQKAVEELQTGISKGMAGYFVLQYEIDNYCIAAEKAIAEFQDSYQQTLQPGT PAELKVFGIDGKGRIEFGSDPAYGGGNTFTVESWVKYDAGFFESGIGSFLSTFDGKQPNE GWMINFLGSNLRTTIGMGPQEGRVLEEGRAYPDNFGKWNHVVTVWDNTLSEGQLKMYVNG ELFFSKTNDVKNDAGVLQNYMPNTRNQNMWAFQEPTDNSRCMTGFIKKFRMWSTAKSANE VKTLMNSDVTGTESGLVCAWDFTTVAEDVTNIPDKTGKHVAKIVGNYKWFKVEN >gi|226332154|gb|ACIC01000166.1| GENE 41 50094 - 52667 1646 857 aa, chain - ## HITS:1 COG:no KEGG:BT_2204 NR:ns ## KEGG: BT_2204 # Name: not_defined # Def: 
hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 857 13 857 857 1702 99.0 0 MIRRRLCLLFLLTGMTLFAYSQGLLFQANDKEIKERTSLQIFQEGEIPCFTKNFQLSFEL SIRDFDTFGYVFLLKEDQGKTKYSFTYTYLDGENSTFKFNTDGKENHYSLNLRNDALAYQ WIPVSFAFDLQQDVLTIRIGDNEKKITSLGLKDTFCPHLFFGRYDYILDMPTFAIRNLKL EGDRSHSYTFPLNENEGEEVHTSTGKVLGTVVNPVWLINGSYHWEKLFEYSFQTPSGITF EPDSQRLIIFSQDSLLTYNLLKRQPQKYSYSNKLPVKLQLATHFMNTTDGKLYVYELNNL PLGDATVAALDLNNQEWKQTGVAALPVQLHHHDGFWDETTGKYLVFGGFGNKRFNNTFLE YDIEADRWDTLSYSGDRIIPRYFSGMVVNKNRERIYVFGGMGNESGEQSVGRNYLHDLYL LDRKQQSVRRLWQNASDHRLVVARDMILTPDEKYIYALCYPEYLSDTYLQLYRLTVDDGT MKALGDSIPMRSEEIMTNANLYYNSLTHEYYCTTTEFDKKGHTVIRTYVLSAPPVSLDEI RSYGSRSSLEIRWLWIMAGIGVLLLVGGVLFVRRKRGKQRNAVPESSSVLMSPPVGREPD KSVLGKEMPAKEDFESSIVRPNAVYLFGPFTVIDRNGRDITHLFSSRLRQVFIYILLHST HNGVLSASLNEVFWADKPDDKVKNLKGVTINQIRKNLAELDGVELVHDKGYFRLVFTDCY CDYFRFRTLKNAEEVENELGILLMRGKFLDGMDAGMMDHFKQKVEEFLSSFLPLEIERLY QQHKYDAVIRFCNVLFRVDPVNELALAYGMHALNHTGSSQEAILQYSLFVREYRQMMNEE YSTSYAELMSKNPPFHR >gi|226332154|gb|ACIC01000166.1| GENE 42 52774 - 52887 114 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKHKKWLTVLFIHGNRRRYQVSIWAEGNGWVIEKGAD >gi|226332154|gb|ACIC01000166.1| GENE 43 53003 - 53428 410 141 aa, chain + ## HITS:1 COG:no KEGG:BT_2205 NR:ns ## KEGG: BT_2205 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 141 1 141 141 259 100.0 2e-68 MWILIISLVLLGIIALIAGYIRNKRLQQKIEKGELDRMPEVKEADIECCGQHEVCEKDSL LAAVSKKIEYYDDEELDQFIGKEANAYTDEETNQFRDVLYTMQDIDVAGWVRSLQLRGIE LPDDLKDEVFLIIGERRNGER >gi|226332154|gb|ACIC01000166.1| GENE 44 53539 - 54348 811 269 aa, chain + ## HITS:1 COG:MA0025 KEGG:ns NR:ns ## COG: MA0025 COG1108 # Protein_GI_number: 20088924 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Methanosarcina acetivorans str.C2A # 2 261 3 262 274 210 46.0 2e-54 MDLLRYTFFQHALLGSLLASIACGIIGTYIVTRRLVFISGGITHASFGGIGLGLFAGISP 
ILSAAVFSVLSAFGVEWLSRRKDMREDSAIAVFWTLGMALGIMFSFLSPGFAPDLSAYLF GNILTINQADLWMLGILALILTGFFYLFIRPIVYIAFDREFARSQKIPVEIFEYVLMMFI ALTIVACLRMVGIVLAISLLTIPQMTANLFTYSFKKIIWLSIGIGFLGCLGGLFISYHWK VPSGASIIFFSILIYAICKIGKSCCKKQS >gi|226332154|gb|ACIC01000166.1| GENE 45 54363 - 54776 499 137 aa, chain + ## HITS:1 COG:BS_ydiB KEGG:ns NR:ns ## COG: BS_ydiB COG0802 # Protein_GI_number: 16077658 # Func_class: R General function prediction only # Function: Predicted ATPase or kinase # Organism: Bacillus subtilis # 27 135 30 134 158 90 41.0 6e-19 MEIKIQSLESIHEAAREFIAAMGDNTVFALYGKMGAGKTTFVKALCEELGVSDVISSPTF AIVNEYRSDETGELIYHFDFYRIKKLSEVYDMGYEDYFYSGALCFIEWPELVEELLPGDA VKVTIEELEDGSRVIRL >gi|226332154|gb|ACIC01000166.1| GENE 46 54773 - 54997 163 74 aa, chain + ## HITS:1 COG:no KEGG:BT_2208 NR:ns ## KEGG: BT_2208 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 74 1 74 74 116 100.0 3e-25 MTGQYIVQGIFALAGITSLLASLLNWNWFFTTRNAQTIVRNVGRGRARLFYGILGVIIIG MAVFFFIETRKAIL >gi|226332154|gb|ACIC01000166.1| GENE 47 55074 - 55367 264 97 aa, chain + ## HITS:1 COG:MJ1215 KEGG:ns NR:ns ## COG: MJ1215 COG1669 # Protein_GI_number: 15669400 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferases # Organism: Methanococcus jannaschii # 1 76 5 81 86 62 46.0 1e-10 MKTTNEYLTKIRQFKQQCAEKYGIISIGIFGSVARGEQHEGSDLDVFVELKEPDPFVMFD IKEELEHICNCKIDLLRLRKNLRSLISQRIEKDGIYA >gi|226332154|gb|ACIC01000166.1| GENE 48 55351 - 55728 234 125 aa, chain + ## HITS:1 COG:MA1296 KEGG:ns NR:ns ## COG: MA1296 COG2361 # Protein_GI_number: 20090160 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 18 125 14 113 116 57 33.0 7e-09 MESTLKEEILDKFLQLSESISIIEDRCKNIQNVDDFLLSPWGMTVLDACIMRIQVIGETI KAIDDKTQRSFFKDYPQVPWAKVIGLRNIISHEYANIDYEIIWVVITKHLPPLKETVENI IKDLS >gi|226332154|gb|ACIC01000166.1| GENE 49 55723 - 57033 837 436 aa, chain - ## HITS:1 COG:slr2013 KEGG:ns NR:ns ## COG: 
slr2013 COG1721 # Protein_GI_number: 16329852 # Func_class: R General function prediction only # Function: Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) # Organism: Synechocystis # 1 436 1 435 435 171 27.0 2e-42 MYLTRRFYIALILVILLLGSGYLFAPFFVIGQWALFALFVLVSADGYILYRTQGIQAFRQ CSDRFSNGDDNEVSIRVESTYSRPLSLEIIDEIPFIFQNRDVCFRTTLQPNEGKTISYHL RPTRRGVYSYGQIRVFVTDKIGLLSRRYTCGQPQDIKVYPSYLMLHRYELLAMSDNLTEL GIKRIRRVGHQTEFEQIKEYVKGDDYRTINWKASARRHELMVNVYQDERSQQIYSVIDKG RVMQQAFRGMTLLDYAINASLVLSYVAMRKEDKAGLVTFDEHFDTFVPASKQPGHMQTLL EKLYSQQTTFGETDFSALCVHLNKHVNKRSFLVLYTNFASISSMNRQLAYLQQLNRQHRL LVVFFEDADLKAYIESPARDTEDYYRHVIAEKFAFEKRLIVSTLKQHGIYSLLTTPENLS IDVINKYLEMKSRQLL >gi|226332154|gb|ACIC01000166.1| GENE 50 57093 - 58067 1048 324 aa, chain - ## HITS:1 COG:BH0731 KEGG:ns NR:ns ## COG: BH0731 COG0714 # Protein_GI_number: 15613294 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Bacillus halodurans # 28 324 11 308 308 280 48.0 4e-75 MEENIEQRVDLTLFSEKIQELKDQIASVIVGQEQTVDLVLAAILANGHVLIEGVPGVAKT LLARLTARLIDADFSRIQFTPDLMPSDVLGTTVFNMKTNDFDFHQGPIFADIVLVDEINR APAKTQAALFEVMEERQISIDGTTHKMGELYTILATQNPVEQEGTYKLPEAQLDRFLMKI TMDYPSLEEEVDILERHHTNASLIKLDDIKPAITKEELLSLRAFMNQVFVDRTLLQYIAL IVQQTRTSKAVYLGASPRASVAMLQASKAYALLQGRDFVTPEDIKFVAPYVLQHRLILTA EAEMEGYSPVKVTQRLIDKVEVPK >gi|226332154|gb|ACIC01000166.1| GENE 51 58079 - 59353 967 424 aa, chain - ## HITS:1 COG:no KEGG:BT_2213 NR:ns ## KEGG: BT_2213 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 424 1 424 424 850 100.0 0 MRGSRWFIFFVLAFLLLMFAIEYHLPKKFVWVPTFSHYDEQPFGCAVFDSLLTVSLPSGY TLSRKTFYQMEQEDTVHNKGILLIATNLPFGRVDIEALLKMADRGNKIMLVSSSFTKILE DTLKFDCTYSYFRSVDLKKYAASLLKRDSIYWIGDPEVYSRQVFRFYPQFCKSYFRRYDS LPVRKLAEINLASDMGHALDELDSTTVSRNYHPLVAMVRPWGKGEIILVSTPLLFTNYGV LDEKNATYIFRILSQMGELPIVRTEGYMKETAQTQRSPLRYFLSQRPLRWAIYLSMFAIL LFMVFTARRRQRAIPVIQEPENKSLEFTKLIGTLYYQKNDHANLVHKKFTYFAEVLRREI 
QVDVEEVADDERSFHRIAQKTGMEVEEISRLIREIRPVIYGGRVLSGEEMKGFIDKMNEI INHI >gi|226332154|gb|ACIC01000166.1| GENE 52 59350 - 59964 476 204 aa, chain - ## HITS:1 COG:no KEGG:BT_2214 NR:ns ## KEGG: BT_2214 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 204 1 204 204 372 100.0 1e-102 MLTSPADTLVCDTVQIAKWQSESAYDYNRELITPEINIFEWFRRQFGELLRKIFGSRFAE EYSELILICLAIIILLLIIWFVYKKRPELFMRSPKNKLPYEVGEDTIYGVDFSGGIADAL SRSDYREAVRLLYLQTLKRLSDEKRIDWQPYKTPTQYINEVRIPVFRQLTNHFLRVRYGN FEATEELFNSMKSLQEEIGKGGGS >gi|226332154|gb|ACIC01000166.1| GENE 53 59985 - 60917 925 310 aa, chain - ## HITS:1 COG:no KEGG:BT_2215 NR:ns ## KEGG: BT_2215 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 310 1 310 310 499 100.0 1e-140 MESQKPKVAMYVKRSFGDKLNASFDFIKENWKILLKFTTYLLLPVSLIQALSLNGLMGGA FAMTAMSKTATVPDTASLLGFMSYYGLYMIVFMIGSILLTSMIYALIRTYNEREERLEGI TLGILKPLLFRNIKRLLVMTLFSILVMLFVGLVVGLLAFLSLFTLFLTIPLLIAFVVPLA LWAPIYLFEDITVMESFKKTFRLGFATWGGVFLISLIMGFIANVLQGVTMMPWYIATLVK YFFSLSDVGSETTVSAGYSFILYLLGIVQAFGAYLSMIFTFVGMAYQYGHASEVVDSVTV ESDIDNFDKL >gi|226332154|gb|ACIC01000166.1| GENE 54 60898 - 61860 748 320 aa, chain - ## HITS:1 COG:alr1808 KEGG:ns NR:ns ## COG: alr1808 COG1300 # Protein_GI_number: 17229300 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Nostoc sp. 
PCC 7120 # 31 315 39 317 318 125 30.0 1e-28 MKEVTFIRRNIEKWKGTEKVVEQAANLSPDQLADAYTELTADLAFAQTHFPTSRITIYLN NLASALHNEIYRSKREKWTRIITFWTREVPQTMHDAQRELLISFIIFAVSALIGAVSAAN DQEFVRLIMGNQYVDMTLDNIARGEPMAVYNGSPEAPMFLGITINNIKVSFLCFAAGILT SFGTGLILLQNGIMLGSFQMFFYQHDLLWESALAVWLHGTLEIWAIIVAGAAGLALGNSW LFPGTYSRLESFRRGAKRGLKIVIGTVPVFIMAGFIEGFITRHTELPDMLRLGIILTSLA FIIFYYIYLPNRKKHGITET >gi|226332154|gb|ACIC01000166.1| GENE 55 62058 - 63443 706 461 aa, chain - ## HITS:1 COG:no KEGG:BT_2217 NR:ns ## KEGG: BT_2217 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 461 1 461 461 873 98.0 0 MKLKYYFLLVILALFTSCSNELGMQDNLNDSKILLSSDEYISIAYDNPGIITSEEAKQLV LDFNPTGLQTRNNNVKIMAISTYNVSSSSGLKTRVGSTNLTTIPVYKIEFLSDEGKGVAY VPADERFAKIMAYLPKTDLKDSVKYVDSKSMLYLSELSLLEDAKYYEKVKSKLRAKTLAK IAKTLNVKAVRYDEIQDIIIVKDEILSRSGPTFPVGNPIGFAGLYCSSTEWNQDAPYNDL LQKENCDLYDTGSPTYEAVPAGCAVISIAQIMASLEPDLTVDGLKIDWNILKAQKQIIAP SSWQTPSSETTREMVAKLIKYIYIATNTQPVRNSSGYVTGSGTYSSDTSSFFYRFFSTSG LKNGWDGNAAFQSLQLAKLVWVAATSNRGGSHAFILDGFAYWRPQTRYLVKNFDFYFHTN MGWGGYYDGFYLVNKDLSITFDTSNGSYDRNFSMIYNIINK >gi|226332154|gb|ACIC01000166.1| GENE 56 63859 - 65268 677 469 aa, chain - ## HITS:1 COG:no KEGG:BT_2219 NR:ns ## KEGG: BT_2219 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 469 1 469 469 973 99.0 0 MKLKYYFLFFVLSLLASCNDEEGLDLHNKPIESNSIVLSPDEYISIAYDNPRVITPDEAK QLALDFGVNRVQTRGNENIVVKFRSSFNIEEKGSNAVTRSVTDKSILLPVYEMEIVSDAG KGLAYVSADERVAKVMAYLPKVNLEDTAKIIGAKAMLGLSEQSLLEEAKHYEKMKLEFKD ETLKRISKELNVAAVNFNEIKDRILVEDCLDSRATPQDPAGNAIGWAGFFCMATNWDQEP VYNDLLEKAYCAPNVGKEPVYKAVPAGCSVVAMAQIMASLEPNLTIDGIKIDWEALKAQP QILMPLPWKPNQGSPEIVRTMVAKLFKHIYDLSGTHPEYDKTTNWVTGSGTSGPQTRDFL RRYFTTGDLINGWDGNAAFLSLQKAKLVWVGVSDASKKASHAFVIDGFAYWRPQTRYLVK NFDFYFRANMGWAGSYDGFYLVNKDMSISFEAGGYELNTGFNMICDIIL >gi|226332154|gb|ACIC01000166.1| GENE 57 65325 - 66599 474 424 aa, chain - ## HITS:1 COG:no KEGG:BT_2220 NR:ns ## KEGG: BT_2220 # Name: not_defined # Def: 
hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 424 1 424 424 803 98.0 0 MRKIFIQFYLLVVLIIIASCSNENIIDAERNAIKQVDPNALSENEVISIVESFQKSIKSK TDTRGEITENPIFAISEEYLISSRNSEITRSLPNDVKALVYEVEIRNRYEQGKAIVSGDR RFPKVLAYIPSYNDSIFATQIGANAMIQMSKNALLDEIANNSEAITRSRPITDLDGQVWM MIEPFCTTEWGQKEPYSWKFVNTWVEIPMDISRKICTWERRPVGLTNTAIAQIMASLEPE LTCEGISMDWSYLKESKSISYDDPYEKWNMVASLFKYVYDSTSSSPVWGTAYTDVWPDSE SKLVSCITAISTPLSNVYKYLNSGSGITNCGNIQKWSIETVKNSVIKICPVLVECNGSAF IVDGYAMTKNESSSSNAYFHCNFGKTGNSDGYYLVNNDGSISFETGGNTYWDTQLSVIPD IRKR >gi|226332154|gb|ACIC01000166.1| GENE 58 66614 - 66931 153 105 aa, chain - ## HITS:1 COG:no KEGG:BT_2221 NR:ns ## KEGG: BT_2221 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 105 32 136 136 140 80.0 1e-32 MELSCSSRKIILKGKGSKPGETTRIPAIYPVEASINNNVLCLDFFSKVPSVTVSVINLDT NETVYLNTFINPVSLITLDLNLVEGTAYRLELLSKDYDLSGDFEL >gi|226332154|gb|ACIC01000166.1| GENE 59 67155 - 68942 705 595 aa, chain - ## HITS:1 COG:no KEGG:BT_2223 NR:ns ## KEGG: BT_2223 # Name: not_defined # Def: TPR repeat-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 595 1 595 595 1005 91.0 0 MNVNYVYGIIFLFLTSCLSPTDRVIPSLLRADSLIEVGCADSALSLLERMNLAELSSIQS RASYALLLTQAKDQNYVTHTDDSLIRVAVDYYDTSDDITLRAKAHYYLGRVCQDRGDIEG TVREFLVAMPLAEKAENYDLNILLKSNLGLLFWRHGLQEEADSLYRQAVELAEAHHDSLR LAVVLVHCADICMERGEEYYTDAEKYLKRALKLVKDTDKQHTEELVFSSLSYLLEYQGKS REAILYANKGMRLVLDSSERYGYYLVIGSAYLQLEQYDSAFVYLNRGIPSDNYYAKTNAY MKLSEMAMRLGKQNEALEYETLYTIYKDSMKLVEQPVEVVSSLKDVLYRQSTERYESFLT RYRFCLLLFVLLFIITIYFFLQRRRKRKKEIARLVDKRQLLYKSIEAIKKELEEKKLEIK EIQQHCECLESDVNSKVQLDSCLNELLEQYHSMQEDLERRLVERDEEVRRLRNLNLKFIL MSSPIYQMLIALCEYNKLNPDGMKKITNDEWVILLHEIDMASLGFVERLSTEYEYLLEED IRFCCLVRLDFKYADIAYLWGCTSAAVYKRSWSVLEKMGLNKDKKVKLVDILRKV >gi|226332154|gb|ACIC01000166.1| GENE 60 69044 - 69769 435 241 aa, chain + ## HITS:1 COG:BH0734 KEGG:ns NR:ns ## COG: BH0734 COG1714 # Protein_GI_number: 15613297 # Func_class: S Function unknown # Function: Predicted 
membrane protein/domain # Organism: Bacillus halodurans # 3 149 8 164 266 68 31.0 9e-12 MAESTIITGQFVRISQTPASIGERLLALVIDYSLIVIYLYSVTTLIVKLHLSSGVGTVFF LCLVYLPVLFYSFLCEMFNHGQSFGKRLMNIRVVKSDGSTPSISAYLLRWLLFIIDGPGT GGLGLLVVLLTKNSQRLGDLAAGTMVIKEKNYRKIHVSLDEFDYLTKNYHPTYPQSADLS LEQVNVITRTLESGEKDRVRRVTLLAKKVQEILSVTPRENNQEKFLQTVLRDYQYYALEE I >gi|226332154|gb|ACIC01000166.1| GENE 61 69779 - 70384 442 201 aa, chain - ## HITS:1 COG:no KEGG:BT_2225 NR:ns ## KEGG: BT_2225 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 201 1 201 201 355 91.0 6e-97 MSIYYDLYENPDKEGTEEQKLLHARVVPSGTYTTKEFVERVSMMHHIPHAQLVGAVEAIT DELQTLLSRGYIVEFGDLGHFSLTLSLEKEITDRKEVRSPSVHLKNIKLKVKQTYKRELN SKMDLERISSPTRSNMNISEEECLKRLQDFLEKHPCINRADYCAITGMNKTQAIRQLNLF IEKGVIQKYGIGKTVVYIRVV >gi|226332154|gb|ACIC01000166.1| GENE 62 70789 - 71124 522 111 aa, chain - ## HITS:1 COG:no KEGG:BT_2228 NR:ns ## KEGG: BT_2228 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 111 1 111 111 155 100.0 3e-37 MGLEDDFLLEDADDEKTVEFIKNYLPQELKEKFEDDELYYFLDLIDEYYSESGILDAQPD NDGYVNIDLEEVVAYIVKEAKKDEMGEYDPEEILFVVQGEMEYGNSLGQVD >gi|226332154|gb|ACIC01000166.1| GENE 63 71168 - 71482 174 104 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|124485582|ref|YP_001030198.1| ribosomal protein L12E/L44/L45/RPP1/RPP2-like protein [Methanocorpusculum labreanum Z] # 3 103 18 117 120 71 32 2e-11 MALEITDSNYKEILAEGKPVVVDFWAPWCGPCKMVAPIIEELAAEFEGQVIIGKCDVDDN SDVAAEYGIRNIPTVLFFKNGEIVDKQVGAVAKPVFVEKVKNLL >gi|226332154|gb|ACIC01000166.1| GENE 64 71551 - 75354 3922 1267 aa, chain - ## HITS:1 COG:CAC0516 KEGG:ns NR:ns ## COG: CAC0516 COG0587 # Protein_GI_number: 15893807 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Clostridium acetobutylicum # 2 1230 9 1133 1167 707 36.0 0 MQDFVHLHVHTQYSLLDGQASVARLVDKAMKNGMKGIAVTDHGNMFGIKEFTNYVNKKNS GPKGEIKDLKKRIAGIEAGTVECEDKEAEIAACKAKIVEAENKLFKPIIGCEMYVARRTM 
DLKEGKPDQSGYHLIVLAKNEKGYHNLIKLVSHAWTRGYYMRPRTDRSELERYHEGLIVC SACLGGEVPKRITAGQFDEAEEAIQWYKNLFGDDYYLEMQRHKATVPRANHECYPLQVNV NKYLQEYARKYNIKLICTNDVHFVDEENAEAHDRLICLSTGKDLDDPSRMLYTKQEWMKT REEMNELFADVPEALSNTLEILDKVEYYSIDHAPIMPTFAIPEDFGTEEGYRAKYTEKDL FDEFTQDENGNVVLSEEEGQAKIKRLGGYEKLYRIKLEGDYLAKLAFDGAKRIYGDPLSE EVKERLNFELYIMKTMGFPGYFLIVQDFINAARSELGVSVGPGRGSAAGSAVAYCLGITK IDPIQYDLLFERFLNPDRISLPDIDVDFDDDGRGEVLRWVTNKYGQEKVAHIITYGTMAT KLAIKDVARVQKLPLSESDRLAKLVPDKIPDKKLNLRNAIEYVPELQAAEASSDPLVRDT LKYAKMLEGNVRGTGVHACGTIICRDDITDWVPVSTADDKETGEKMLVTQYEGSVIEDTG LIKMDFLGLKTLSIIKEAVENIRLSRSLEIDVDQIDITDPATYKLYSDGRTIGTFQFESA GMQKYLRELQPSTFEDLIAMNALYRPGPMDYIPDFIDRKHGRKPIEYDIPVMEKYLKDTY GITVYQEQVMLLSRLLADFTRGESDALRKAMGKKLRDKLDHMKPKFIEGGRKNGHDPKVL EKIWTDWEKFASYAFNKSHATCYSWVAYQTAYLKANYPSEYMAAVMSRSLSNITDITKLM DECKAMGIQTLGPDVNESNLKFTVNHDGNIRFGLGAVKGVGEAAVQSIVEERNTNGPFKG IFDFVQRVNLNACNKKNMECLALAGGFDSFPELKREQYFAVNSKGEVFLETLMRYGNRYQ EDKRAAINSLFGGANVVDIATPEIPQGVERWGDLERLNKERDLVGIYLSAHPLDEFAIVL DHVCNTRMADLEDKAALAGREITMGGIVTSVRRGVSKNGNPYGIAKIEDYSGSTEIPFWG NDWVTYQGYLNEGTFLYIKARCQAKQWRQDELEVKITSMELLPDVKEELVQKITIIIPLS VLNSGLVTELATLTKEHPGNTELYFKVTDDTDSRMSVDLISRPVKLSVGRDLITYLKERP ELGFRIN >gi|226332154|gb|ACIC01000166.1| GENE 65 75516 - 76202 570 228 aa, chain + ## HITS:1 COG:RSc2074 KEGG:ns NR:ns ## COG: RSc2074 COG0688 # Protein_GI_number: 17546793 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine decarboxylase # Organism: Ralstonia solanacearum # 12 227 10 214 215 151 41.0 9e-37 MGRLKKLKKIRIHREGTHILWASFLLLLLINAALYWGIDCKIPFYVVAVASIAVYLLMVN FFRCPIRLFGKDTEKIVVAPADGKIVVIEEVDENEYFHDRRLMISIFMSIVNVHANWYPV DGTIKKVAHHNGNFMKAWLPKASTENERSTVVIETPEGVEVLTRQIAGAVARRIVTYAEV GEECYIDEHMGFIKFGSRVDVYLPLGTEVCVNMGQLTTGNQTVIAKLK >gi|226332154|gb|ACIC01000166.1| GENE 66 76213 - 76920 436 235 aa, chain + ## HITS:1 COG:SMc00552 KEGG:ns NR:ns ## COG: SMc00552 COG1183 # Protein_GI_number: 15964875 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine synthase # Organism: 
Sinorhizobium meliloti # 9 189 42 221 289 91 32.0 1e-18 MTNVIKNNIPNTVTCLNLFSGCIACVMAFEARYDLALLFIVLSAVFDFFDGMLARILNAH SIIGKDLDSLADDVSFGVAPSLIVFSLFKEMYYPANMEFLAPYLPYAAFLISVFSALRLA KFNNDTRQTSSFVGLPVPANALFWGSFVVGAHDFLVSENCHPVYLILLVCLFSWLLVSEI PMFSLKFKNLSWNTNKISFIFLIVCIPFLVFLGISSFAAIIVWYILLSLFTRKSK >gi|226332154|gb|ACIC01000166.1| GENE 67 76987 - 77286 283 99 aa, chain + ## HITS:1 COG:no KEGG:BT_2233 NR:ns ## KEGG: BT_2233 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 15 99 15 99 99 161 100.0 6e-39 MHILLFLLLFLAAIVIFGLSIVGFVLRAIFGLGRGSSSRTKQARSGQNGQQQGRPSYNQN TDSRADNEGEIFAENSPRTKHKKIFTQDDGEYVDFEEVK >gi|226332154|gb|ACIC01000166.1| GENE 68 77341 - 77784 402 147 aa, chain - ## HITS:1 COG:SA0516 KEGG:ns NR:ns ## COG: SA0516 COG0590 # Protein_GI_number: 15926236 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Staphylococcus aureus N315 # 5 146 3 149 156 130 48.0 6e-31 MMTLDDTYFMKQALIEAGKAAERGEVPVGAVVVCKERIIARAHNLTETLNDVTAHAEMQA ITAAANVLGGKYLNECTLYVTVEPCVMCAGAIAWAQTGKLVFGAEDDKRGYQRYAAQALH PKTVVVKGILADECATLMKDFFASKRR >gi|226332154|gb|ACIC01000166.1| GENE 69 77784 - 78017 229 77 aa, chain - ## HITS:1 COG:no KEGG:BT_2235 NR:ns ## KEGG: BT_2235 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 77 1 77 77 113 100.0 2e-24 MRKRSKKCLKMHAEYTKRERRMSILLSEDEQLIVDRYLEKYKITNKSRWLRETILMFIHK NMEEDYPTLFGEHDMRR >gi|226332154|gb|ACIC01000166.1| GENE 70 78085 - 78450 346 121 aa, chain + ## HITS:1 COG:CAC1763 KEGG:ns NR:ns ## COG: CAC1763 COG0792 # Protein_GI_number: 15895040 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease distantly related to archaeal Holliday junction resolvase # Organism: Clostridium acetobutylicum # 8 116 9 120 122 60 36.0 6e-10 MAEHNELGKAGENAAVAYLEEHGYLIRHRNWRKGHFELDIVAAKENELVIVEVKTRSNTL 
FAQPEEAVDLPKIKRTVRAADAYMKLFQIDVPVRFDIITVIGENGNFRIDHIKEAFYPPL F >gi|226332154|gb|ACIC01000166.1| GENE 71 78501 - 78857 404 118 aa, chain + ## HITS:1 COG:DR2400 KEGG:ns NR:ns ## COG: DR2400 COG2315 # Protein_GI_number: 15807390 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Deinococcus radiodurans # 3 113 8 125 132 75 43.0 3e-14 MNVETIREYCLSKKGVTESFPFDDVSLVMKVLDKMFALIDLEGANSISLKCDPERAIELR EHYAGIEGAYHFNKKYWNGVYFDRDVDDKLIKELVDHSYEEVIKKFTKKLRAEYDALP >gi|226332154|gb|ACIC01000166.1| GENE 72 78841 - 79605 400 254 aa, chain + ## HITS:1 COG:aq_566 KEGG:ns NR:ns ## COG: aq_566 COG0340 # Protein_GI_number: 15606020 # Func_class: H Coenzyme transport and metabolism # Function: Biotin-(acetyl-CoA carboxylase) ligase # Organism: Aquifex aeolicus # 13 183 5 164 233 87 36.0 2e-17 MMPSPDMSFPVPLIHISETNSTNSYLQTLCAKQQGVAAFTTVVADFQTSGRGQRGNSWES EPKKNLLFSFVLFPDFLEARRQFLISQIVSLAIKEELDSYADDFSIKWPNDIYWKDKKIC GMLIENDLMGRNISQSISGIGINVNQEVFHSTAPNPVSLRQITGKQYDIFEILKNIMLRI QSDYELLRNGDTELIAHRYEKALFRKEGMHRYKDADGEFFARIICVEPEGKLILEDDAQK KRGYMFKEVEYLLI >gi|226332154|gb|ACIC01000166.1| GENE 73 79672 - 81369 822 565 aa, chain + ## HITS:1 COG:no KEGG:BT_2239 NR:ns ## KEGG: BT_2239 # Name: not_defined # Def: carboxyl-terminal protease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 565 1 565 565 1041 97.0 0 MKKIQYVKLVGLLTILLFLNISCKDDDTLLRGSGITEQSWSTNQTYFASAEQTLTFTFTT LSSWTAQNSSTALLSLDNTAGNSGENTIKVTVHKSSQEQGTITIKVNGYSSASNIKIQLS DDDVQGYEINYSVDQYLREKYLWNDDYKLLTPNFRQAYDEFLRNTLLSMTTNTLDKKRNS NGTYSLFSFIQKLDPDLQTSRSAKEKKTLEYNYGFVNFIAVGNRNTSNYGLVIQGVHKGS SADKEGLKRGMEITEIDNQRITTANVQACYSKLIKPSSPTSIKVKDKDGKVYTINSGPIY ANPIIHHQVKEKIGYLVYSAFESGFDQELFDVFKEFKSQNITELILDLRYNGGGDVTSAN LMSSCIAGDFCVDKTFASYRYNDERMKALGNQRPIQKFAYSQYDNLSTSLSAGGLKLQKI YCLVTDDSASASELVINALRGIDIEVILIGTTTHGKNVGMEGVELTAGTDKYLLFPITFQ AYNAKGFGDFENGFTPDYEINENQPNGEYFEGYGDFGTESDPLYAKAISLISGSKVTTPT RAVNQAKEQMLVIATPRLNRIGMIK >gi|226332154|gb|ACIC01000166.1| GENE 74 81411 - 82094 544 227 aa, 
chain - ## HITS:1 COG:no KEGG:BT_2240 NR:ns ## KEGG: BT_2240 # Name: not_defined # Def: TPR domain-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 15 227 15 227 227 372 99.0 1e-102 MKTLTLLLFSLFSLPLMLNAQTADEMLHKVTAAIEAGQHGQAVSYFRQAISLNIDRSEMF YWTSIDKSSEISSKLSNELALAYKKNRNYDKAFLFYKELSQKDPDNVDYLEAVAEMQVCR GQEKDALRMYENILKLDADNLAANIFLGNYYYLMAEQEKKKLESDYKKLSSPTKMQYARY RDGLSKLFATGYEKARSSLQKVVLRFPSTEAQKTLDKILRIEKEVNR >gi|226332154|gb|ACIC01000166.1| GENE 75 82206 - 83525 976 439 aa, chain + ## HITS:1 COG:VC0090 KEGG:ns NR:ns ## COG: VC0090 COG0534 # Protein_GI_number: 15640122 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Vibrio cholerae # 5 424 12 431 454 275 38.0 1e-73 MINKKTTSENRRILQIAVPSIISNITVPLLGLIDVTIVGHLGSPAYIGAIAVGGMLFNII YWIFGFLRMGTSGMTSQAYGQHDLNEINRLLIRSVGVGLFIALCLLILQYPILNAAFTLI QTTEEVKQLATTYFYICIWGAPAMLGLYGFAGWFIGMQNSRFPMYIAITQNIVNIIASLS FVYLLDMKVAGVAAGTLIAQYAGFFMAILLYMRYYSTLRKRIVWKDIIQKQAMYRFFRVN RDIFFRTFCLVIVTMFFTSAGAAQGEVVLAVNTLLMQLFTLFSYIMDGFAYAGEALTGRY IGARNQTALRNTVNHLFYWGIGLSTAFTLLYAIGGKGFLGLLTNDVSVISASDTYFYWAL AIPLTGFSAFLWDGVFIGATATRQMLYSMLVASVSFFIIYYVFHNLLGNHALWLAFITYL SLRGIVQTFIGREIVKKAI >gi|226332154|gb|ACIC01000166.1| GENE 76 83599 - 84309 922 236 aa, chain + ## HITS:1 COG:FN1622 KEGG:ns NR:ns ## COG: FN1622 COG0528 # Protein_GI_number: 19704943 # Func_class: F Nucleotide transport and metabolism # Function: Uridylate kinase # Organism: Fusobacterium nucleatum # 4 234 6 236 239 262 58.0 4e-70 MAKYKRILLKLSGESLMGEKQYGIDEKRLAEYAAQIKEIHEQGVQIGIVIGGGNIFRGLS GANKGFDRVKGDQMGMLATVINSLALSSALVAAGVKARVLTAVRMEPIGEFYSKWKAIEC MENGEIVIMSGGTGNPFFTTDTGSSLRGIEIEADVMLKGTRVDGIYTADPEKDPTATKFS DITYDEVLKRGLKVMDLTATCMCKENNLPIVVFDMDTVGNLKKVISGEEIGTVVHN >gi|226332154|gb|ACIC01000166.1| GENE 77 84449 - 86929 2305 826 aa, chain + ## HITS:1 COG:no KEGG:BT_2243 NR:ns ## KEGG: BT_2243 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 826 1 826 826 1469 
99.0 0 MKTKNACFWIFLLLSGSAYAQMSQKGTVQIFNSNKSPLPGVQLTATDAPATDTDANGNFQ FNFSKQKPGMAIASPNVYKKGYELVNKDMINGWILSEKRPMSIVMAPEGTIEESKNKYYS ISVAHFSKKQEKAIEEINQLYIAQKISLQERSDRLKAMAEESRSFMNQIDQYAEKFARIN PDDINAIEKQVLELVNAGKLSEAIELYNKSGIITQAREKLSQKTKAEEDIDKLAETMYRY ADLCALTGGMENEKKANDAYKFVAEALPDRFTYVFKYALQKIGLEDSDTMEWLDKCQKLS FDEKSLVQVLNIKSTLARHHQKDYIKALEYDINALEILSNKNISMPSGDYYAIYHQTFYS MGKTYEAMNEFKEAKEIYEDQIKEISEEIADSDNQLFINIQSSQLANIYSSLTDIYKKEG NVSKVNELSEKLFELNKKNAGDNEEAILEAEIGRQEYLFHQAVEKADYKLAAGSIKEILA KAEKLYKMRPLSRAYWYALWNFAYISVSTETEPENFLTTIKAFEDKVANELKINSEEQKT SLLYNIEKQYILYYSKNGNKPEERKHVEQAYQYAEKLNLLNPARFVLEIISAQASYIDML LELSEHDKALKAAIDLEALYTMQGVWGFQDLRTENSIGTAMVCGGLYELGVEHLEKVKKD REKHLKQKPNDVDMMGSITTTYNNLSLGYGKLKKYAKALEVQKKAYEIIKTLYPHNKAQL GTNYLLMTLNTSIAYYQNSNNAEALKFLQEAENIANELKELNQLFSSYPLVVRFAKGDLL AKLSQPGSEELLKTGLEYKSGTLPNDAVLLYLINDYRTNNNSIYRK >gi|226332154|gb|ACIC01000166.1| GENE 78 86935 - 88020 940 361 aa, chain - ## HITS:1 COG:no KEGG:BT_2244 NR:ns ## KEGG: BT_2244 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 361 1 361 361 659 99.0 0 MKKIKIFIASSAELNEDKQMFDLYFSDKNKLYRDRNIDFDQRTWMDFSSSLNEGRLQDRY NDYIRECDIVIFLFHTRMGRYTKEELEVAHEIYLKTKAAKPKIFVYFKEEGIVDESLKDF KSYCEKNLGHFCDLYTNYDDLRLKFDKQLQILENEGFIKPDPVDVKRTLRFVLLYVLVPV LVVALAFFAFYYYSPVTSTVRLTDTSKSSLPFYGADITLEYADKSETRHVDRLSDEVVFK EIHTKYLGENARLKIESKGYVTVDTVLSLEKNVTLGISRDSSLAMIFGTVKDEDNRPLAD ATVQVLDMKTVSDGMGNFQLPIPAEKQKEEQRVTVYKDGYQLWDFTGPVSDKVPWKIILR K >gi|226332154|gb|ACIC01000166.1| GENE 79 88027 - 89082 980 351 aa, chain - ## HITS:1 COG:no KEGG:BT_2245 NR:ns ## KEGG: BT_2245 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 33 351 1 319 319 553 98.0 1e-156 MGNHQKEIFLVLSIFLTGFQCVWAQTTQKGIVVEMSSNNKPVAGAEIKVAGASPTDSDQE GRFILNFTASLPGDPLMINDIYKKGFKIVNYEKVANWNISSASELKIVLGRTEVISALRK KYYDIGESNSEKEYRKTLAELEELKKQNALSAVEYDQKVDSMSKSMMEWQKRLEIYALKF ACINRDELDAMEKQAMELLDHGDVHGAIRLYEEMKLDSAMTLKIAVRQEAKEDMKLLLPS 
LVNNFQLLKQADDKVACDSVAHLIYEMATDIKLKLMSVEWFFQRNDPSEVLDQYSLIVKE TQSMQEIELVENSLQQSLKEVKLKGELKKKAQLVFERIEDRKKWISIKEKI >gi|226332154|gb|ACIC01000166.1| GENE 80 89103 - 91151 1649 682 aa, chain - ## HITS:1 COG:no KEGG:BT_2246 NR:ns ## KEGG: BT_2246 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 682 1 682 682 1410 99.0 0 MNARWFDRIIYGGAWKQIRFLIIIVVSLIVLSCLGVHWGSKYQMTPSEEMTVLATDSAAN HSFQKTLWNVYNNFVDSGNLISISPEDRPWALIISLLGSVVLGGLLISTLSNIIERRVEN CRNGLIHYKLSDHFVIIGADAMLPCLIRQLCQREKDCTLVIQTSKDVNEVRMELFSNLTK DEEKRIVLVHAMRDSKEELKKLYVADAKEVFILGDSGELDDVEYYHDSMNVDCLNLIGEL CKEENRKPPLKCNVLFEYQSTFAVFQFSDIDDDIKEYIDFCPFNFYETWAQKVFVRNACS IREINYLPLDYQPVTYESEKYVHLVIVGMSRMGIALAVEAAHIAHYPNFIRDKKKKTRIT FIDNEAMREMNSFKQAYENLFDVSYSTFIDTENGMVRRDEPAEVYAHLGTDFIDIEWQFV QGTIESPEVRDLITGWCEDEDALMTVAVCLNLTHQSISSAVYLPRCVYEKGVPVLVQQRI TSAIIEKLSGNPLKGKGGTNQRFKNLRPFGMLDDCFDLCMADEMYAKRVNAVYEKCEGDK VLTELPSAKEMDELWHNPKFKTVKKWSNIYNANAIPTKLRSIGYTKEHWDNGKQLSEKQV AILAEVEHNRWNVEELLLGYRPVTKKEQEEIEQKAALKNKKRDEEYAHYDIRPYNDLRNG SEKYDIALTRHLLLIVKQDGKL >gi|226332154|gb|ACIC01000166.1| GENE 81 91155 - 91457 363 100 aa, chain - ## HITS:1 COG:no KEGG:BT_2247 NR:ns ## KEGG: BT_2247 # Name: not_defined # Def: putative ryanodine receptor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 100 1 100 100 174 100.0 1e-42 MKENKLDYIPEPMDLSLVDLPESLIQLSERIAENVHEVWAKARIDEGWTYGEKRDDIHKK HPCLVPYDELPEEEKEYDRNTAMNTIKMVKKLGFRIEKED >gi|226332154|gb|ACIC01000166.1| GENE 82 91868 - 92779 707 303 aa, chain + ## HITS:1 COG:PAB0040 KEGG:ns NR:ns ## COG: PAB0040 COG0697 # Protein_GI_number: 14520295 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Pyrococcus abyssi # 7 283 23 291 295 74 27.0 2e-13 MNNKTKGFIYGAIAAASYGMNPLFTLPLYAAGMSVDTVLFYRYAFAVIVLGVLMKLQGQS FTLRKADILPLVIMGLLFSFSSLLLFMSYNYMDAGIASTILFVYPVMVAVIMGAFFKEKI 
SAITVFSILLALSGIALLYQGDGNKPLSTVGIIFVLLSSLSYAIYIVGVNRSSLKTLPTT KLTFYAILFGLSVYIVRSNFCTELQIIPSPLLWADVLALAILPTAVSLICTALAIQSIGS TPTAILGALEPVTALFFGVLLFHEKLTPRLMVGILMIITAVTLIIIGKSLIKKMSVLLQI SKK >gi|226332154|gb|ACIC01000166.1| GENE 83 92831 - 93391 718 186 aa, chain + ## HITS:1 COG:RSc1407 KEGG:ns NR:ns ## COG: RSc1407 COG0233 # Protein_GI_number: 17546126 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome recycling factor # Organism: Ralstonia solanacearum # 10 186 9 186 186 160 50.0 9e-40 MVDVKTCLDNAQEKMDMAVMYLEEALAHIRAGKASTRLLDGIRVDSYGSMVPISNVAALS TPDARSITIKPWDKSMFRAIEKAIIDSDLGIMPENNGEVIRIGIPPLTEERRRQLAKQCK AEGETAKVSVRNARRDGIDALKKAVKDGLAEDEQKNAEAKLQKIHDKYIKQIEDMLADKD KEIMTV >gi|226332154|gb|ACIC01000166.1| GENE 84 93486 - 94418 930 310 aa, chain + ## HITS:1 COG:TM1717 KEGG:ns NR:ns ## COG: TM1717 COG1162 # Protein_GI_number: 15644464 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Thermotoga maritima # 2 298 6 286 295 214 40.0 1e-55 MKGLVIKNTGSWYQVKTDDGQFIECKIKGNFRLKGIRSTNPVAVGDRVQIILNQEGTAFI NEIEDRKNYIIRRSSNLSKQSHILAANLDQCMLVVTVNYPETSTIFIDRFLASAEAYRVP VKLVFNKVDAYNEDELRYLDALINLYTHIGYPCFKVSAKEGTGVDAIKKDLEGKITLFSG HSGVGKSTLINAILPGTKVKTGEISTYHNKGMHTTTFSEMFSVDGDGYIIDTPGIKGFGT FDMEEEEIGHYFPEIFKVSADCKYGNCTHRHEPGCAVREAVEKHLISESRYTSYLNMLED KEEGKYRAAY >gi|226332154|gb|ACIC01000166.1| GENE 85 94734 - 95816 872 360 aa, chain + ## HITS:1 COG:XF2384 KEGG:ns NR:ns ## COG: XF2384 COG0845 # Protein_GI_number: 15838975 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Xylella fastidiosa 9a5c # 24 354 20 374 411 130 27.0 3e-30 MNKKTKWGIIILIGAGIIGGGIYSQLPKTNDELAAADKVKNTQKNGKKILNVNAKVIKPQ LLTDEYTTTGVLLPDEEVDLSFETSGKVIEINFEEGTPVKKGQLLAKVNDRQLQAQLQRL VSQLKLAEDRVFRQDALLKRDAVSKEAYEQVKTDLATLNADIEIVKANIALTELRAPFDG IIGLRQISVGSYASPTTIVAKLTKITPLKVEFSVPERYASQIKKGTNLNFRIEGKLDAFS AKVYAVESTIDPNLHQFTARALYPNTNRALLPGRYTSIQLKKEEIPNAIAIPTEAIVPEM 
GKDKVFLYKSGKAEPVEVTTGIRTDAEVQIVRGLQVGDTILTSGTLQLRLGLPVTLDNID >gi|226332154|gb|ACIC01000166.1| GENE 86 95832 - 98864 2710 1010 aa, chain + ## HITS:1 COG:VC0914 KEGG:ns NR:ns ## COG: VC0914 COG0841 # Protein_GI_number: 15640930 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Vibrio cholerae # 1 1001 1 1009 1036 650 37.0 0 MNISELSIRRPVLATVLTIIILLFGFIGYSYLGVREYPSVDNPIISVSCSYPGANADVIE NQITEPLEQNINGIPGIRSLSSVSQQGQSRITVEFELSVDLETAANDVRDKVSRAQRYLP RDCDPPTVSKADADAMPILMVALQSDKRSLLELSEIADLTVKEQLQTISDVSSVSIWGEK RYSMRLWLDPVKMAGYGITPIDVKNAVDNENVELPSGSIEGNTTELTIRTLGLMHTADEF NNLIVKEDNNRIVRFSDIGRAELGPADIKSYMKMNGVPMVGVVVIPQPGANHIEIADAVY KRMEQMKKDLPEDVTYSYGFDNTKFIRASINEVKETVYVAFILVIIIIFLFLRDWRVTLV PCIVIPVSLIGAFFVMYLAGFSINVLSMLAIVLAVGLVVDDAIVMTENIYIRIEKGMSPK EAGIEGAKEIFFAVISTTITLVAVFFPIVFMQGMTGRLFREFSIVISGSVIISSFAALTF TPMLATKLLIKREKQNWFYTKTEPFFEGMNRWYSRSLAAFLRKRWLALPFTFITICLIGF LWNAIPAEMAPLEDRSQISINTRGAEGVTYEYIRDYTEDINQLVDSILPDAEAVTARVSS GSGNVRITLKDMKDRDYTQMEVAEKISKAVQKKTMARSFVQQQSSFGGRRGSMPVQYVLQ ATNLEKLQEVLPQFMAKVYENPVFQMADVDLKFSKPEARIRINRDKASVMGVSTKNIAQT LQYGLSGQRMGYFYMNGKQYEILGEINRQQRNKPADLKAIYIRSSNGDMIQLDNLIELEN GIAPPKLYRYNRFVSATISAGLADGKTIGQGLDEMDKIAKETLDDTFRTALSGDSKEYRE SSSSLMFAFILAILLIYLILAAQFESFKDPLIIMLTVPLAIAGALVFMYFGDITMNIFSQ IGIIMLIGLVAKNGILIVEFANQKQEAGEDKMSAIKDAALQRLRPILMTSASTVLGLIPL AFASGEGANQRIAMGTAVVGGMVVSTLLTMYIVPAIYSYISTNRIKITKE >gi|226332154|gb|ACIC01000166.1| GENE 87 98861 - 100207 1301 448 aa, chain + ## HITS:1 COG:VC1565 KEGG:ns NR:ns ## COG: VC1565 COG1538 # Protein_GI_number: 15641573 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Vibrio cholerae # 15 427 6 384 419 65 21.0 2e-10 MKQKGICMKRIIYIVTACVFVSVSTAKAQRIYSLRDCLEEGLQNNYSLRIVHNEEQISKN NATLGNAGYLPTLDFSAGYTGNLDNIETKARATGEVTKNNGVYDQTVNVGLNLNWTIFDG FNISTTYKQLKELERQGETNTRIAIEDFIADLASEYYNFIQQKIRLKNFHYAMSLSKERL 
RIAEASHLVGKFSGLDYQQAKVDFNADSAQYIKQQELLHSSRIQLNELMANNNVNQIIVI KDSTIDVHSDLLFDDLWNATLSTNASLLKADQNTVLAQLDYKKINSRNYPYLKLNTGYGY TFNKYDINANSRRGNLGFNAGVTVGFNIFDGNRRREKRNATLAFKNRRLERQELELALRS DLSNLWQAYRNNLQLLNLERQNLVTAKDNHDIAMDRYIQGDLSGFEVREAQKSLLDAEER ILSAEYNTKLCEISLLQISGKITKYLEQ >gi|226332154|gb|ACIC01000166.1| GENE 88 100358 - 101452 807 364 aa, chain + ## HITS:1 COG:no KEGG:BT_2254 NR:ns ## KEGG: BT_2254 # Name: not_defined # Def: putative pectate lyase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 364 1 364 364 720 100.0 0 MRQTAFFFLLLFIQTGISAQATDYRKQQNYKEWVHLAPKFDDAFFRTEEAQRIGDNVLLY QQVTGGWPKNIYMPAELTEQEYAKARNAKEDVNQSTIDNNATTTEIQYLARLYQATQKEK YKEGVLKGIQYLLKAQYDNGGWPQFYPRPTGYYVQITYNDNAMVRVMQQLREIYEKQSPY TFIPDETCQQARKAFDKGIECILRTQVRQNGELTVWCAQHDRITLEPCKARAYELPSLSG QESDNIVLLLMSLPHPSEKVIESIESAVKWFKKSEIKGVQKEYFTNSDGKRDYRMIPCED CPVLWARFYELDTNRPFFSDRDGIKKYDISEIGHERRNGYSWYNKDGNKVLAKYEKWKKE LNRK >gi|226332154|gb|ACIC01000166.1| GENE 89 101575 - 102735 1002 386 aa, chain - ## HITS:1 COG:RSc3292 KEGG:ns NR:ns ## COG: RSc3292 COG3274 # Protein_GI_number: 17548009 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Ralstonia solanacearum # 16 367 1 319 336 84 24.0 4e-16 MKPPVMNLSNQKNQHIVWLDVVRFIAMFTVVCCHCTDPFNFYPGTAPNIGEIKLWGAIYG SVLRPCVPLFVMITGALLLPVRGDASTFYKKRIPRVFYPFLIWSVLYNLFPWITGLLGLN PQIILDFFPYAGEEVMRQSFSVSLEYILMIPFNFSILAVHMWYIYLLIGLYLYLPVFSAW VEKASERAKLMFLLAWGVTLLLPYYYQFVSNYLWGTCSWNSFGMLYAFAGFNGYLLLGHY LKNLEWSLKKTLTIGIPMFAVGYAVTFLGFRHITALPEYTDEMLELFFTYCSLNVVMMTI PVFMLAKKVKVNSERMKKALANLTVCGFGIYMIHYFFTGPSVVLMRAIDMPIGLQIPVAA ILAFAVSWGLVWLIYRAGKVAKYIVG >gi|226332154|gb|ACIC01000166.1| GENE 90 102882 - 104516 499 544 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 76 531 38 482 508 196 32 3e-49 MATPPFKYQPMFEKGKDTTEYYLLTKDYVSVSEFEGNPILKIEKEGLTAMANAAFRDVSF MLRRSHNEQVAKILSDPEASENDKYVALTFLRNAEVAAKGVLPFCQDTGTAIIHGEKGQQ 
VWTGYTDEEALSLGVYKTYTEENLRYSQNAPLNMYDEVNTKCNLPAQIDIEATEGMEYEF LCVTKGGGSANKTYLYQETKAILNPGTLVPFLVEKMKTLGTAACPPYHIAFVIGGTSAEK NLLTVKLASTHFYDNLPTTGNEFGRAFRDIELEKQVLEEAHKIGLGAQFGGKYLAHDVRI IRLPRHGASCPVGLGVSCSADRNIKCKINKDGIWIEKLDSNPGSLIPAELRQAGEGDVVK IDLNRPMAEILKELTKYPVSTRLSLNGTIIVGRDIAHAKLKERLDRGEDLPQYIKDHPIY YAGPAKTPAGMACGSMGPTTAGRMDSYVELFQSHGGSMVMLAKGNRSQQVTDACQKYGGF YLGSIGGPAAILAQNNIKSIECVEYPELGMEAIWKIEVEDFPAFILVDDKGNDFFKQIKP RCAK >gi|226332154|gb|ACIC01000166.1| GENE 91 104828 - 106819 1638 663 aa, chain - ## HITS:1 COG:no KEGG:BT_2257 NR:ns ## KEGG: BT_2257 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 663 1 663 663 1368 99.0 0 MKDYRKLTEDEVLQLKSQSCLADDWGNVLVAEGFNCEYVHHTRFSGEVKLGVFDAEFTLP GGIRKHSGLRHVTLHNVVVGDNCCIENIQNYIANYEIGNDTFIENVDIILVDGLSTFGNG VEATVLNETGGREVLINDKLSAHQAYILALYRHRPELINRMKAIADYYSNKHASAVGSIG DHVMILNTGSIKNVRIGDYCHICGTCRLTNGSVNSNVTAPVHIGHGVICDDFIISSGSEV DDGTMLTRCFVGQSCKLGHNYSASDSLFFSNCQGENGEACAIFAGPFTVTHHKSTLLIAG MFSFMNAGSGSNQSNHMYKLGPIHQGTMERGAKTTSDSYILWPARVGAFSLVMGRHVNHA DTSNLPFSYLIEQRNTTYLVPGVNLRSVGTIRDAQKWPKRDKRKDPNRLDYINYNLLSPY TIQKMFKGRSILKELKRVSGETSEIYSYQSAKIKNSSLNNGIRFYEIAIHKFLGNSIIKR LEGINFQSNEEIRQRLKPDTEIGTGEWVDMSGLIAPKSEIDRLLDGIENGSVNRLKSINA SFAEMHENYYTYEWTWAYNKIQEFYGLNPDEITAQDIIRIVKAWKEAVVGLDKMVYDDAR KEFSLSSMTGFGADGSHDEMKQDFEQVRGDFESNTFVTAVLKHIEDKTALGNELIKRIGS IQE >gi|226332154|gb|ACIC01000166.1| GENE 92 106876 - 108135 1226 419 aa, chain - ## HITS:1 COG:XF0088 KEGG:ns NR:ns ## COG: XF0088 COG2262 # Protein_GI_number: 15836693 # Func_class: R General function prediction only # Function: GTPases # Organism: Xylella fastidiosa 9a5c # 19 403 13 371 450 258 40.0 2e-68 MKEFIISEAKVETAVLVGLITQMQDERKTNEYLDELAFLAETAGAEVVKRFTQKLPTANS VTYVGKGKLEEIRQYIRTEEEEEREVGMVIFDDELSAKQIRNIEAELKVKILDRTSLILD IFAMRAQTANAKTQVELAQYKYMLPRLQRLWTHLERQGGGSGSGSGKGGSVGLRGPGETQ LEMDRRIILNRMSLLKERLAEIDKQKATQRKNRGRMIRVALVGYTNVGKSTIMNLLSKSE VFAENKLFATLDTTVRKVIIDNLPFLLSDTVGFIRKLPTDLVDSFKSTLDEVREADLLVH 
VVDISHPGFEEQIEVVNKTLADIGGGGKPMILIFNKIDAYTYVEKAPDDLTPRTKENLTL EELMKTWMAKMEDNCLFISARERINIDELKDVVYQRVKELHVQKYPYNDFLYQTYEEEE >gi|226332154|gb|ACIC01000166.1| GENE 93 108242 - 109702 1237 486 aa, chain - ## HITS:1 COG:no KEGG:BT_2259 NR:ns ## KEGG: BT_2259 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 486 3 488 488 1001 100.0 0 MKKKILAYALSALTCGLFTSCSDWLDINHDPNTAEKVDPGYLFNYVAVNWAGTRTGGDFY IPLSMSSQCQVDGGLDYGGWDESVYTISPYSTGNTWKHYYSVGGNNLMLAIKNAEEADPV NHNAIAQCKILLAEHMYEATMLWGDIPFTESWNGTIKYPKFDSQESVLNGVLSLLDEALQ IMDLNDANAIDEYDIYYKGDMNKWMTLAKSLKFRTLMVMVDKDPSKATAIGTLLQAGGMV SSASDNLVFPYSAEPGNQNPKYELIELVGGTQILFFASNYMLKPMQERNDPRIPCYFEPG ADGVYRGLGNREPAVTDDKDNMLSSVVSSYLFRKDAPELIYSCQEQLLLEAEAYARGLGV AQNLSKANELYKKGIREACAFYGVAEADIDTYVTGLPELTALTQEKALYEIHMQQWIDLM DRPFEEFVQWRRSGTAGNEVPTLQVPEDATSKELIRRWEYSPEEMTANINAPKESPKIWE KLWFDL >gi|226332154|gb|ACIC01000166.1| GENE 94 109722 - 112559 2198 945 aa, chain - ## HITS:1 COG:no KEGG:BT_2260 NR:ns ## KEGG: BT_2260 # Name: not_defined # Def: outer membrane protein Omp121 # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 945 15 959 959 1804 99.0 0 MLKSDSQNLEEVVVTAMGIKRSEKSLGYAVSSVKGDEITKARESNVLNSLSGKIAGLQIA QSSGTVGGSSSIQIRGASSIGTVSSPLFIIDGLPIDNGSYNPDRTNGIVDVGNRAGDINS DDIESINVLKGAAATALYGARAKDGAIVITTKKGKKNSPVFVTVNSSTRFENVLKLPDFQ NEYAQGSYGVYNVKMLNGWGPKISSVQDQQFTDYKGDKVTLKANPDNVKDFYETGMSYIN NVSVAGGGEKADFRLSFTSTNQTGVVPGSDYNKYAFSVNAGMNFTSNFTGRISAQYIRSD SEGRPAQGANDSNLLIPLINGLPRTIDIHDIKQNWIDENGKQVTLDPEGKSNNPYWIINK NKFTNNLDRMIGNITLTYKPIEGLTITNNAGTDFYTEGRRKLYAKGTVGALNGKFQTWNL YKRIINNDLMVSYEKTFANDYHFKAMVGHNLYQEEWKNENVQAQNLVVDGLYTYTNAKST TPVNYYEKKRLVGLYGDISLGYKDMLFINVTGRNDWSSTLPINNRSYFYPSISGSFIFTE LMENKDILNYGKIRLSYANVGSDEDPYNLAFKYTPASTYFLQYLGNVNTFPHMGLVGFTG PRVLPNENLKPQNQSSFEVGADLRFFGGKIRLDMTYYSNITKNQIVSIDVPLSTGYFANN INAGKIANKGVEVTLGLTPVETRNFKWDLDATFASNKQTVEELAEGLDEYTLTSGLSGLQ IKAAKGDSFGLYGTAWKRDDQGNYVINSKTGLRETVNNVRLGDVYPDFTMGINNTLTYKG 
LTFSFLIDIRHGGSLYSETVANLRSSGLAAETVAHREDASFIEPGVILQDDGTYRPNDVP VKSMQDYWQHVANSSNNEGNIYNASFVKLREVQLSYSFPRKWFKSFFVKSLDLGFEARNL WIIKDHVPHIDPEANFFGPSQIGGGVEFNSIPSTRSFGFNLRLTL Prediction of potential genes in microbial genomes Time: Thu May 12 03:36:42 2011 Seq name: gi|226332153|gb|ACIC01000167.1| Bacteroides sp. 1_1_6 cont1.167, whole genome shotgun sequence Length of sequence - 2679 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 473 - 532 2.8 1 1 Tu 1 . + CDS 577 - 777 61 ## BT_2266 hypothetical protein 2 2 Tu 1 . - CDS 839 - 2029 714 ## BT_2267 integrase protein - Prom 2052 - 2111 5.2 3 3 Tu 1 . + CDS 2480 - 2678 106 ## BT_2268 hypothetical protein Predicted protein(s) >gi|226332153|gb|ACIC01000167.1| GENE 1 577 - 777 61 66 aa, chain + ## HITS:1 COG:no KEGG:BT_2266 NR:ns ## KEGG: BT_2266 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 66 1 66 66 107 100.0 2e-22 MALPFKAIRCSSVSNDLFLLKERIVPLKGKSPDIALFMYITVIYMRQIICKLHILTLIYA LFLPYC >gi|226332153|gb|ACIC01000167.1| GENE 2 839 - 2029 714 396 aa, chain - ## HITS:1 COG:no KEGG:BT_2267 NR:ns ## KEGG: BT_2267 # Name: not_defined # Def: integrase protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 395 1 395 396 723 99.0 0 MKKISYSLVFNRKKKLNKKGMALVQVEAYLNRRKMYFSTRVYLKPDQWDVKRRMVKNHPN SDALNRMLYDFIADIEQKELGLWQQRRSISLDSLKDSIEKPENNGNSFLTFFKEEVNNSS LKESTRQNHLSTLELLQEYKKDIVFTDLTFEFVSSFDNYLQSKGYHLNTIAKHMKHLKRY VNVAINKEYMDIQKYAFRKYKIKSIEGSHTHLSPEELNKMEEVNLEGKFTKLQKSKDAFL FCCYAGLRYSDFINLTAANIVELHQETWLIYKSVKTGIDVRLPLYLLFEGKGLRVLENYK DDLNGFFRLKDNSNVNKDLNALAKLAEIDKRISFHTARHTNATLLIYSGANITTVQKLLG HKSVKTTQVYANIMDMTIVHDLEKAAYSKLANRPKG >gi|226332153|gb|ACIC01000167.1| GENE 3 2480 - 2678 106 66 aa, chain + ## HITS:1 COG:no KEGG:BT_2268 NR:ns ## KEGG: BT_2268 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 6 66 1 61 1026 124 100.0 1e-27
MKRKLMLLLACLFVGISLVTAQTQKVTGVVISEEDGQPVVGASVLVKGTTQGTITDIDGN
FNLANV

Prediction of potential genes in microbial genomes
Time: Thu May 12 03:36:53 2011
Seq name: gi|226332152|gb|ACIC01000168.1| Bacteroides sp. 1_1_6 cont1.168, whole genome shotgun sequence
Length of sequence - 10186 bp
Number of predicted genes - 9, with homology - 9
Number of transcription units - 4, operones - 4 average op.length - 2.2

N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Op 1 . + CDS 14 - 2842 2346 ## BT_2268 hypothetical protein
2 1 Op 2 . + CDS 2862 - 4271 1242 ## BT_2269 hypothetical protein
+ Term 4294 - 4341 14.2
- Term 4281 - 4329 14.4
3 2 Op 1 . - CDS 4400 - 5059 698 ## COG2095 Multiple antibiotic transporter
4 2 Op 2 . - CDS 5121 - 5813 613 ## COG1011 Predicted hydrolase (HAD superfamily)
- Prom 5835 - 5894 5.5
- Term 5836 - 5892 12.0
5 3 Op 1 . - CDS 5916 - 6770 980 ## BT_2272 hypothetical protein
6 3 Op 2 . - CDS 6805 - 7479 530 ## COG0313 Predicted methyltransferases
7 3 Op 3 . - CDS 7486 - 8211 684 ## BT_2274 hypothetical protein
- Prom 8264 - 8323 4.9
+ Prom 8192 - 8251 6.1
8 4 Op 1 . + CDS 8347 - 8946 548 ## COG1435 Thymidine kinase
9 4 Op 2 .
+ CDS 8966 - 10099 1054 ## COG0628 Predicted permease Predicted protein(s) >gi|226332152|gb|ACIC01000168.1| GENE 1 14 - 2842 2346 942 aa, chain + ## HITS:1 COG:no KEGG:BT_2268 NR:ns ## KEGG: BT_2268 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 942 85 1026 1026 1848 99.0 0 MVKVTLKSDAQNLDEVIVVAYGTAKKSSFTGSAAVIKADKIVSGSKESFDKALSGKMAGV RTASATGDPGSMAEINIRGVGSISASKSPLYVIDGVVTKADDDMNYYGKTQSVLSTLNPE DIESMTVLKDAAAASLYGSRAANGVIIITTKKGKEGKTNVSYSGEVGWNKMAVNAFNMMG SADLIDYTRESLANCLVTYGITDSKQAALNNIDNGGDMFIPLLDSPATVADFIHDPSGKV NTNWKKEIYRTAFTQDHQVSINGGSSKTQFYAGVGYNKSEGIVLGSDFERISGRLNVNHK VNNWLNVALKQMIAATSQDGFRDQGDQAQGMGTSSPIGILFAMDPTAPVKNEDGSYNKNA AWGKVTNPHLMLGGKDSDTALEWIQTKMFRSMTNADVTIKFCDKLSLNSIFGYDYVDNKH FEYWDRNSVNGGSVSGMGSRYTFESRVATSSSTLNYTDTFKDMHNLNLMAGFEVENRDLL QIVTVAKRYSSHYPELANGQPDQAASSTKGAGMMSYFASGNYNYDNKYYLSASFRRDGSS RLSEDNRWASFWSVSGAWRMSKEGFMQDMPLFTDFKIKASYGTNGNLPSDYYGYMDLYTG SGYGSAPAIYWSRMANDKLSWEKSKNFNMGIEWNMYDRVNLSLEYYNKKTTDLLFEVPTS LITGFDSRWENLGALKNDGFELELNSKNISNKNFTWTSNFNLTYQRALVDKLPEGKDIQY GDGEMYLHREGESMYTFYLPEWKGVNPETGLGEFWLDPEDHSKGVVNDYSEAGKGIVGKA LPDVIGGFSNTFTYKDFDLSFLITYQFGGDMFDYPGYFSHHDGVRIGSMNLSEDVAGNYW KNPGDKVDNPMPIYANPYRWDRFSSRTIKSTDNIRLREMTVGYTLPVLKKHISNFRIYFR ANNLAMLWSKTKNIDPDVAINGYRQADTPALRSCVFGINIKL >gi|226332152|gb|ACIC01000168.1| GENE 2 2862 - 4271 1242 469 aa, chain + ## HITS:1 COG:no KEGG:BT_2269 NR:ns ## KEGG: BT_2269 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 469 1 469 469 912 100.0 0 MKKIYLPILLLTLAVSSCDTFDKVPSSEWPSEGAIKTLEDLQFAVNGVYESQTSSIDAGT NPRGSYAGDFFLYADQRGSDFKAIGSNKQTVDVYAYQATKNSSDSYYFYKRFYLSLARIN KVLEGVKKSGLEGAEVDVQVGELYALRALFHFDLARLFAKLPCNAQASDPGIVLSTEVFE SGYTAERATIAKTYETILDDLKTALDNLPETSKVVTAGHIDYWGARALRARAYLYMNENS KALEDAKYVIEKSPYKLYTRDEYETVWTKVGSSESIFEFLITSLYNAQRNSLGFYTHAEG YAEAGITEGFKTFLQERPEDVRSTLIAEESDGGDNEGWYIQKYPGRDGEIYVNNPKVIRL SEVYLIAAEAALKAGGADPASYINDLRKQRIADYEDVASVTIDDILTERRLELYGEGHNA 
WDTWRNKKAVTNAQVPAGPINYDDYRTVMPIPVSEINVSNGKLKQNEGY >gi|226332152|gb|ACIC01000168.1| GENE 3 4400 - 5059 698 219 aa, chain - ## HITS:1 COG:BS_yvbG KEGG:ns NR:ns ## COG: BS_yvbG COG2095 # Protein_GI_number: 16080438 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Multiple antibiotic transporter # Organism: Bacillus subtilis # 5 203 2 203 211 127 38.0 2e-29 MDSTLLPFALLCFTSFFTLTNPLGTMPVFLTMTHGMTDKERQTIVRRATIVSFITLMVFV FAGQFLFKFFGISTNGFRIAGGVIIFKIGFDMLQARYTPMKLKDEEIKTYADDISITPLG IPMLCGPGAIANAIVLMQDSHTYAMKGILIGTIALIYLLTFFILRASTRLVKVLGETGNN VMMRLMGLILMVIAVECFVSGAKPILADIVREGLSGICK >gi|226332152|gb|ACIC01000168.1| GENE 4 5121 - 5813 613 230 aa, chain - ## HITS:1 COG:YPO2295 KEGG:ns NR:ns ## COG: YPO2295 COG1011 # Protein_GI_number: 16122519 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Yersinia pestis # 1 230 1 222 224 112 30.0 8e-25 MKYKNLFFDLDDTIWAFSRNARDTFEEVYQKYSFDRYFDSFDHYYTLYQRRNTELWLEYG EGKVTKEELNRQRFFYPLQAVGVEDEALAERFSEDFFAIIPTKSGLMPHAKEVLEYLAPQ YNLYILSNGFRELQSRKMRSAGVDRYFKKIILSEDLGVLKPWPEIFHFALSATQSELRES LMIGDSWEADMTGAHGVGMHQAFYNVTERTVFPFQPTYHIHSLKELMNLL >gi|226332152|gb|ACIC01000168.1| GENE 5 5916 - 6770 980 284 aa, chain - ## HITS:1 COG:no KEGG:BT_2272 NR:ns ## KEGG: BT_2272 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 284 1 284 284 441 99.0 1e-122 MKKLAVLFMCAAMLASCDFKGGSKDLKAENDSLLMELTQRNAELDDMMGTFNEVQEGFRK INAAESRVDLQRGTITENSASAKQQIASDIEFISKQMEENKAQIAKLEAQLKNSKYNSTQ MKKAVEALTAELKVKQQRIEELQTELASKNIRIQELDAAVSDLSAAKESLAAENEAKAKT VAEQDKSLNAAWFVFGTKSELKAQKILQSGDVLKSADFNKDYFTQIDIRTTKEIKLYSKR AELLTTHPTGSYELVKDDKGQLTLKITNPAEFWSVSRYLVIQVK >gi|226332152|gb|ACIC01000168.1| GENE 6 6805 - 7479 530 224 aa, chain - ## HITS:1 COG:all4680 KEGG:ns NR:ns ## COG: all4680 COG0313 # Protein_GI_number: 17232172 # Func_class: R General function prediction only # Function: Predicted methyltransferases # Organism: Nostoc 
sp. PCC 7120 # 2 221 8 228 285 231 50.0 1e-60 MGKLYVVPTPVGNLEDMTFRAIKVLKEVDLILAEDTRTSGILLKHFEIKNAMQSHHKFNE HKTVESVVNRIKAGETVALISDAGTPGISDPGFLVVRECVRNGIEVQCLPGATAFVPALV ASGLPNEKFCFEGFLPQKKGRQTRLKALAEEHRTMVFYESPHRLLKTLTQFAEYFGTERQ ATVSREISKLHEETVRGSLAELIEHFTATEPRGEIVIVLAGIDD >gi|226332152|gb|ACIC01000168.1| GENE 7 7486 - 8211 684 241 aa, chain - ## HITS:1 COG:no KEGG:BT_2274 NR:ns ## KEGG: BT_2274 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 241 1 241 241 287 94.0 2e-76 MEMKQKLLTDIELDVHELKLLMNTFSKEPTQTLSELLKRSILRMQERLEQLSEEISAVPV EASSSPVAEAEIEAPIVEEQAPVIEEIEYPVIGEPVVEENEATASGEDEPVIVQETQTAV EETVMEEPVVEDEMEEKEAEDESEDDESLLIEEPKAAVLGESIKMAAGLRHSISLNDSFR FSREIFGGDPELMNRVIEQISVMSSYKTAVAFLASKVSVNEENEAMADFLELLKKYFNQS A >gi|226332152|gb|ACIC01000168.1| GENE 8 8347 - 8946 548 199 aa, chain + ## HITS:1 COG:BH3779 KEGG:ns NR:ns ## COG: BH3779 COG1435 # Protein_GI_number: 15616341 # Func_class: F Nucleotide transport and metabolism # Function: Thymidine kinase # Organism: Bacillus halodurans # 9 188 1 187 204 184 48.0 8e-47 MVLFSEDHIQETRRRGRIEVICGSMFSGKTEELIRRMKRAKFARQRVEIFKPAIDTRYSE EDVVSHDSHSIASTPIDSSASILLFTSEIDVVGIDEAQFFDDGLIDVCRQLANNGIRVII AGLDMDFKGNPFGPMPQLCAIADEVSKVHAICVKCGDLASFSHRTVKNDKQVLLGETAEY EPLCRECYLRARGEDGQKI >gi|226332152|gb|ACIC01000168.1| GENE 9 8966 - 10099 1054 377 aa, chain + ## HITS:1 COG:RSc2624 KEGG:ns NR:ns ## COG: RSc2624 COG0628 # Protein_GI_number: 17547343 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Ralstonia solanacearum # 37 345 36 338 356 101 25.0 3e-21 MERKKITFDSFIRGSIGCVLVVGILMLVERLSGVLLPFFIAWLIAYMVYPLVKFFQYKLR LKSRIVSIFCALFLITLVGVSLFYLLVPPMVSEIGRMNDLLVTYLTNGAGNNVPKNLSEF IHENIDLQALNRILSEENILAAIKDTVPRVWALLAESLNILFSILASFIILLYVIFILLD YEVIAEGWLHLLPNKYRTFASNLVHDVQDGMNRYFRGQALVAFCVGILFSIGFLIIDFPM AIALGLFIGALNMVPYLQIIGFLPTVLLAILKAADTGENFWIIIACALAVFAIVQIIQDT FLVPKIMGKITGLNPAIILLSLSIWGSLMGMLGMIIALPLTTLMLSYYQRFIINKERIKY DEVETTDNQETSDKEEK Prediction 
of potential genes in microbial genomes
Time: Thu May 12 03:37:20 2011
Seq name: gi|226332151|gb|ACIC01000169.1| Bacteroides sp. 1_1_6 cont1.169, whole genome shotgun sequence
Length of sequence - 7091 bp
Number of predicted genes - 14, with homology - 12
Number of transcription units - 10, operones - 3 average op.length - 2.3

N Tu/Op Conserved S Start End Score pairs(N/Pv)
+ TRNA 8 - 80 73.9 # Lys TTT 0 0
- Term 83 - 146 11.7
1 1 Op 1 . - CDS 246 - 458 181 ## BT_2337 hypothetical protein
- Prom 525 - 584 3.3
2 1 Op 2 . - CDS 598 - 759 201 ## BF0653 hypothetical protein
- Prom 1000 - 1059 9.8
3 2 Tu 1 . - CDS 1163 - 1855 268 ## BT_1888 LuxR family transcriptional regulator
+ Prom 1727 - 1786 7.7
4 3 Op 1 . + CDS 1948 - 2430 203 ## COG4635 Flavodoxin
5 3 Op 2 . + CDS 2434 - 2670 173 ## BT_2974 hypothetical protein
6 3 Op 3 . + CDS 2757 - 2987 227 ## BF0648 hypothetical protein
+ Prom 3048 - 3107 3.2
7 4 Tu 1 . + CDS 3142 - 3375 121 ## BF0647 hypothetical protein
+ Term 3397 - 3439 6.4
8 5 Tu 1 . + CDS 3505 - 3870 290 ## BF0646 hypothetical protein
+ Term 3944 - 3982 -0.7
+ Prom 4014 - 4073 3.0
9 6 Op 1 . + CDS 4109 - 4693 367 ## BF0644 clindamycin resistance transfer factor BtgA
10 6 Op 2 . + CDS 4698 - 5618 561 ## BDI_1256 clindamycin resistance transfer factor BtgB
- Term 5469 - 5513 -0.6
11 7 Tu 1 . - CDS 5641 - 5832 84 ##
- Prom 5903 - 5962 9.8
+ Prom 5663 - 5722 9.4
12 8 Tu 1 . + CDS 5745 - 6605 418 ## CHU_1176 hypothetical protein
13 9 Tu 1 . - CDS 6709 - 6879 172 ##
- Prom 6979 - 7038 6.4
+ Prom 6706 - 6765 5.0
14 10 Tu 1 .
+ CDS 6797 - 7075 57 ## gi|253572153|ref|ZP_04849557.1| predicted protein Predicted protein(s) >gi|226332151|gb|ACIC01000169.1| GENE 1 246 - 458 181 70 aa, chain - ## HITS:1 COG:no KEGG:BT_2337 NR:ns ## KEGG: BT_2337 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 46 1 46 97 79 80.0 5e-14 MKRTKENYPFFNLFSIVGTWESINLNPTVIIYRNDRDYLFSIIYVSYIVFKVTKLCILFQ SKSMQKAMNQ >gi|226332151|gb|ACIC01000169.1| GENE 2 598 - 759 201 53 aa, chain - ## HITS:1 COG:no KEGG:BF0653 NR:ns ## KEGG: BF0653 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 53 1 53 104 97 98.0 1e-19 MEVVTIEKRTFSYICERFTEFAKRIESLCSTHTQKVENWLDSQEVCLLLGFSK >gi|226332151|gb|ACIC01000169.1| GENE 3 1163 - 1855 268 230 aa, chain - ## HITS:1 COG:no KEGG:BT_1888 NR:ns ## KEGG: BT_1888 # Name: not_defined # Def: LuxR family transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 229 31 257 258 190 44.0 3e-47 MNLEKYKLYVELLSRVNNSCVFLIEYNNRFLYTSPNFNTFFGYDIEKLKDPSIEHNYLEK YIHPDDFLIFSTIQKRLLGFYYSQPIECRKDYKHIFEFRILNAKKEYVRVISQHQVLEID EIGNPFLVLGVVDLSPDQKDMDEIKFRLVNNKTGEMTPFPLTEETNIKLTKREVEILKLV NKGMFSKEISDSLSISVHTVNNHRQNILQKMNTDNVVEAINYARKLGLLD >gi|226332151|gb|ACIC01000169.1| GENE 4 1948 - 2430 203 160 aa, chain + ## HITS:1 COG:PM1499 KEGG:ns NR:ns ## COG: PM1499 COG4635 # Protein_GI_number: 15603364 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism # Function: Flavodoxin # Organism: Pasteurella multocida # 1 136 1 140 170 67 30.0 2e-11 MKTAIIYYSKHGTTERVAHLIGEKLSPELEYISLKEFHNPDIQGYDRIILGTSIYAGHPG RLMSKFCNKNRAQLEQKIIALFICGMNDTQEVEQLKKAFPEYLHSNAVAETILGGEFLFD KMNFIEKFITRKISKVVCSVSNLRYDAITVFLDKMNNIRK >gi|226332151|gb|ACIC01000169.1| GENE 5 2434 - 2670 173 78 aa, chain + ## HITS:1 COG:no KEGG:BT_2974 NR:ns ## KEGG: BT_2974 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 78 2 72 72 63 50.0 2e-09 
MMKVSNNKRNILIFRTSITTKLEIERIKILFAQYSQIHKWNLDFEDWEKVLRIESHGITE TDVINILQAINIYISELE >gi|226332151|gb|ACIC01000169.1| GENE 6 2757 - 2987 227 76 aa, chain + ## HITS:1 COG:no KEGG:BF0648 NR:ns ## KEGG: BF0648 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 76 19 93 93 86 68.0 3e-16 MNKLLRTELFNLLIEKSKEEVNNNVMQNAYDEFIEKIRDISNENDYSTTYRILVATRIEI ASLETILLYGQGGKCA >gi|226332151|gb|ACIC01000169.1| GENE 7 3142 - 3375 121 77 aa, chain + ## HITS:1 COG:no KEGG:BF0647 NR:ns ## KEGG: BF0647 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 77 67 143 143 114 72.0 1e-24 MELVSGLFLSKRVINHAGKEAPLTEIGRAFEHLFNIKFGDIHKKYESVICRQANKRIEFL DILHKAITEENQKKGYL >gi|226332151|gb|ACIC01000169.1| GENE 8 3505 - 3870 290 121 aa, chain + ## HITS:1 COG:no KEGG:BF0646 NR:ns ## KEGG: BF0646 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 109 1 109 110 145 73.0 3e-34 MYIDNENFDKWMERLSKKLGEIGQNLQSLINTDKVLDENDRLLDNQDLAFLLKVSFRTLQ RYRASGKLPFFMISHKTYYRASDIRAFVQENADCKTYERFKKENQLDKQTDAKKADNSPV L >gi|226332151|gb|ACIC01000169.1| GENE 9 4109 - 4693 367 194 aa, chain + ## HITS:1 COG:no KEGG:BF0644 NR:ns ## KEGG: BF0644 # Name: not_defined # Def: clindamycin resistance transfer factor BtgA # Organism: B.fragilis # Pathway: not_defined # 1 194 1 194 194 313 94.0 2e-84 MPNNSRKTIFTTISIDKETAALVEKICKRYSLKKSEVVKLAFGYIDKAHINPSEAPESVK SELAKINKRQDDIIRFIRHYEEEQLNPMIRATNSIALHFEAIGKTLETLILSQLEASQKR QTAVLKKLSEQFSNHADVINNQSKQINALYQIHQRDYKKLLHLIQLYSKLSACGVMDSKR KENLKAEIINLINT >gi|226332151|gb|ACIC01000169.1| GENE 10 4698 - 5618 561 306 aa, chain + ## HITS:1 COG:no KEGG:BDI_1256 NR:ns ## KEGG: BDI_1256 # Name: not_defined # Def: clindamycin resistance transfer factor BtgB # Organism: P.distasonis # Pathway: not_defined # 1 294 1 294 306 417 79.0 1e-115 MHIDFAPPSKGTYNNAGSSRQLASYMEHEDLERMEKEIYTDGFFNLTEDNIYKSKVVKDI DTNIGQLLKTDAKFFAIHVSPSERELRAMGNTEQEKAEAMRRYIREIFIPEYAKNFNKGL 
SEADIKFYGKIHFNRNRSDNELNMHCHLIVSRKDQSNKKKLSPLTNHKNTKNGVVKGGFD RVNLFQQAEQGFDKLFDYHRQQSESFDYHNTIKNGSIFEQLEQQSQSFTMEKKKAVFQSS EKENNISCNLDNKVADKQSNNQQNNSGSDSLLSIFSLGDGNNYDTTLTEELHTQKRKKKK KKGRKL >gi|226332151|gb|ACIC01000169.1| GENE 11 5641 - 5832 84 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYMLHCQWKAELLSLVLIEIYNIQIYMDTYSLSFIRTLANVQKKYIFMKKNIFILAISYA LSH >gi|226332151|gb|ACIC01000169.1| GENE 12 5745 - 6605 418 286 aa, chain + ## HITS:1 COG:no KEGG:CHU_1176 NR:ns ## KEGG: CHU_1176 # Name: not_defined # Def: hypothetical protein # Organism: C.hutchinsonii # Pathway: not_defined # 6 262 9 270 287 165 38.0 2e-39 MYPYKFEYYRFQLVPKKVVQLSIDNVAYTYDEIKAKKNEYFSEVLTKTTFKGKKGNLPYR IVYEKNNIFVLFLSNPKPYSYIHDFQKQQGTTEPFSIIVIDNNPENQLMAISRNTEAFSD TKTVVKILSQTINKHLDHYNVVLHIEPIFQKEEFWKIVGGKKEEISMIRFELIKPNLTNI SGCLKDELKRVIDTTNSHKTVVEFNAPARAALEDITPSNNDINGLVDYSANGGGNIGVKF KHDRKKYQTAESIKVELETTEVEIQNANPAQLDAFVDSICSKLKRR >gi|226332151|gb|ACIC01000169.1| GENE 13 6709 - 6879 172 56 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRQNTNIIKANLSSDCVVSNAYLLFFANSNSLFVINEYIANNNAVKEISKAKKLDL >gi|226332151|gb|ACIC01000169.1| GENE 14 6797 - 7075 57 92 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253572153|ref|ZP_04849557.1| ## NR: gi|253572153|ref|ZP_04849557.1| predicted protein [Bacteroides sp. 1_1_6] # 1 92 66 157 157 106 98.0 6e-22 MAKKSKYAFETTQSELKLAFIILVFCLIISIPLLLLFSIDENTETWIKLRFVIFSALNTI LILILHIIIDMGKSVFLITNALSKIEENSQKQ Prediction of potential genes in microbial genomes Time: Thu May 12 03:37:57 2011 Seq name: gi|226332150|gb|ACIC01000170.1| Bacteroides sp. 1_1_6 cont1.170, whole genome shotgun sequence Length of sequence - 2168 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 28 - 87 4.5 1 1 Tu 1 . + CDS 108 - 776 391 ## Tmz1t_0114 hypothetical protein + Term 972 - 1012 1.3 + Prom 835 - 894 5.7 2 2 Tu 1 . 
+ CDS 1028 - 1927 340 ## Tmz1t_0113 ATP-binding region ATPase domain protein Predicted protein(s) >gi|226332150|gb|ACIC01000170.1| GENE 1 108 - 776 391 222 aa, chain + ## HITS:1 COG:no KEGG:Tmz1t_0114 NR:ns ## KEGG: Tmz1t_0114 # Name: not_defined # Def: hypothetical protein # Organism: Thauera # Pathway: not_defined # 3 220 2 218 224 115 31.0 1e-24 MKFRYIGTPKNITDLRSIASKHKQLERKNVKIAIVDNESFPMIEILQRHKFDIDKFDDIE NIESLNGYDIILCDIQGVGTKFNEVFQGAYLVKEIYKRYPFKIIIAYTGSRYDPRYNEYL KYAEYNIIKDASSEEWVEKLDSALELASNPEHRWNRVRRYLLNKGVPLFELTLLEDDFVC RFLENKSFDDFPNNKIAKTLDDDIRAILQSFTANALFKILVG >gi|226332150|gb|ACIC01000170.1| GENE 2 1028 - 1927 340 299 aa, chain + ## HITS:1 COG:no KEGG:Tmz1t_0113 NR:ns ## KEGG: Tmz1t_0113 # Name: not_defined # Def: ATP-binding region ATPase domain protein # Organism: Thauera # Pathway: not_defined # 56 283 36 258 258 117 33.0 4e-25 MVQRRLSEKDFMPRISLANYEKSINNISILTNSSVDFYKKENEEIISKETYIDKIEILDN TFHELRKLNQELKLQTEHLIYQSNNFSWNNVDDIKYLSQNIFSTSQLISIRLNTYDFGVN PNLSLYEEKSPIQIHKKFVKVAHCLREYANKKQIKIQITGESYSSIMANDVLELLPYLIL DNAIKYSLENKNIDIKFSEKSSILEVVIKSFSVRPHEHELRKLTERGVRSTRIDSQIQGQ GIGLYLANYICELHNISMDFRIGKERFFESGIPYSDFYVSLTFYDIIKDESVDYTFDID Prediction of potential genes in microbial genomes Time: Thu May 12 03:38:22 2011 Seq name: gi|226332149|gb|ACIC01000171.1| Bacteroides sp. 1_1_6 cont1.171, whole genome shotgun sequence Length of sequence - 68456 bp Number of predicted genes - 48, with homology - 47 Number of transcription units - 31, operones - 11 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 83 - 142 3.8 1 1 Op 1 . + CDS 243 - 1046 176 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 1049 - 1108 4.2 2 1 Op 2 . + CDS 1128 - 1496 214 ## BT_2357 hypothetical protein - Term 1641 - 1681 4.3 3 2 Tu 1 . - CDS 1727 - 3124 502 ## BpOF4_06510 hypothetical protein - Term 4104 - 4169 16.3 4 3 Op 1 . - CDS 4217 - 4627 400 ## BT_2360 transcriptional regulator 5 3 Op 2 . 
- CDS 4665 - 5072 287 ## BT_2361 hypothetical protein
+ Prom 5129 - 5188 6.7
6 4 Op 1 . + CDS 5403 - 8489 2436 ## BT_2362 hypothetical protein
7 4 Op 2 . + CDS 8508 - 10340 1493 ## BT_2363 hypothetical protein
+ Term 10371 - 10422 1.9
+ Prom 10454 - 10513 4.3
8 5 Op 1 . + CDS 10583 - 13660 2516 ## BT_2364 hypothetical protein
9 5 Op 2 . + CDS 13694 - 15187 1412 ## BT_2365 hypothetical protein
+ Term 15260 - 15315 10.0
+ Prom 15349 - 15408 4.1
10 6 Tu 1 . + CDS 15450 - 16013 329 ## BT_2366 hypothetical protein
+ Prom 16015 - 16074 3.0
11 7 Tu 1 . + CDS 16123 - 17271 781 ## COG1835 Predicted acyltransferases
12 8 Tu 1 . - CDS 17440 - 17664 398 ## BT_2368 hypothetical protein
- Prom 17688 - 17747 3.5
13 9 Tu 1 . - CDS 18198 - 18827 512 ## BT_2369 hypothetical protein
- Prom 18850 - 18909 8.2
+ Prom 18806 - 18865 6.0
14 10 Tu 1 . + CDS 19025 - 19495 375 ## BT_2370 hypothetical protein
+ Term 19527 - 19572 10.9
- Term 19507 - 19563 -0.4
15 11 Tu 1 . - CDS 19640 - 20218 688 ## COG0693 Putative intracellular protease/amidase
- Prom 20352 - 20411 5.4
+ Prom 20736 - 20795 4.6
16 12 Tu 1 . + CDS 20919 - 21392 446 ## COG3449 DNA gyrase inhibitor
+ Term 21457 - 21502 7.6
- Term 21445 - 21490 10.6
17 13 Op 1 . - CDS 21574 - 23136 1312 ## BT_2373 hypothetical protein
18 13 Op 2 . - CDS 23143 - 25098 1593 ## BT_2374 hypothetical protein
- Prom 25123 - 25182 6.1
+ Prom 25425 - 25484 7.5
19 14 Tu 1 . + CDS 25631 - 26056 458 ## BT_2376 hypothetical protein
- Term 26306 - 26358 5.1
20 15 Tu 1 . - CDS 26561 - 27934 886 ## COG0534 Na+-driven multidrug efflux pump
- Prom 27977 - 28036 3.8
- Term 27963 - 28021 4.9
21 16 Tu 1 . - CDS 28089 - 29537 992 ## BT_2378 endo-polygalacturonase
- Prom 29592 - 29651 7.4
+ Prom 29903 - 29962 5.9
22 17 Tu 1 . + CDS 30161 - 30904 555 ## COG0110 Acetyltransferase (isoleucine patch superfamily)
+ Term 30909 - 30966 17.3
- Term 30897 - 30953 12.5
23 18 Tu 1 .
- CDS 30978 - 31601 472 ## COG0110 Acetyltransferase (isoleucine patch superfamily)
- Prom 31747 - 31806 7.4
+ Prom 31585 - 31644 6.7
24 19 Tu 1 . + CDS 31879 - 32748 664 ## BT_2385 hypothetical protein
- Term 32650 - 32699 -0.3
25 20 Tu 1 . - CDS 32796 - 33266 319 ## COG1522 Transcriptional regulators
- Prom 33435 - 33494 6.0
+ Prom 33314 - 33373 4.8
26 21 Op 1 . + CDS 33445 - 34731 1269 ## COG2873 O-acetylhomoserine sulfhydrylase
+ Prom 34777 - 34836 2.4
27 21 Op 2 . + CDS 34856 - 35095 277 ## BT_2388 hypothetical protein
+ Term 35134 - 35188 14.6
- Term 34874 - 34904 2.0
28 22 Op 1 . - CDS 35088 - 35261 64 ##
29 22 Op 2 . - CDS 35194 - 36105 1024 ## COG0668 Small-conductance mechanosensitive channel
- Prom 36151 - 36210 4.6
+ Prom 36055 - 36114 6.8
30 23 Tu 1 . + CDS 36337 - 38568 1919 ## BT_2390 hypothetical protein
31 24 Tu 1 . - CDS 38622 - 42755 1695 ## COG0642 Signal transduction histidine kinase
- Prom 42790 - 42849 9.7
+ Prom 42924 - 42983 6.5
32 25 Op 1 . + CDS 43116 - 44438 1050 ## COG3391 Uncharacterized conserved protein
33 25 Op 2 . + CDS 44455 - 47538 2395 ## BT_2393 hypothetical protein
34 25 Op 3 . + CDS 47551 - 49530 1713 ## BT_2394 putative outer membrane protein
35 25 Op 4 . + CDS 49585 - 50481 819 ## BT_2395 hypothetical protein
+ Term 50499 - 50560 22.1
+ Prom 50553 - 50612 4.9
36 26 Op 1 2/0.143 + CDS 50721 - 51287 520 ## COG3201 Nicotinamide mononucleotide transporter
37 26 Op 2 . + CDS 51284 - 51907 509 ## COG1564 Thiamine pyrophosphokinase
+ Term 51949 - 51987 1.5
38 27 Op 1 . - CDS 52626 - 53198 416 ## PROTEIN SUPPORTED gi|157164512|ref|YP_001467500.1| 50S ribosomal protein L24 (BL23; 12 kDa DNA-binding protein; HPB12)
39 27 Op 2 . - CDS 53233 - 54501 1370 ## COG0498 Threonine synthase
40 27 Op 3 . - CDS 54531 - 55754 1231 ## COG3635 Predicted phosphoglycerate mutase, AP superfamily
41 27 Op 4 . - CDS 55770 - 58205 2358 ## COG0527 Aspartokinases
- Prom 58229 - 58288 2.9
+ Prom 58522 - 58581 6.0
42 28 Tu 1 .
+ CDS 58606 - 59643 784 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D + Term 59653 - 59715 10.9 + Prom 59665 - 59724 3.9 43 29 Op 1 . + CDS 59745 - 60944 701 ## COG1373 Predicted ATPase (AAA+ superfamily) 44 29 Op 2 . + CDS 60957 - 62324 1183 ## COG1066 Predicted ATP-dependent serine protease 45 29 Op 3 . + CDS 62345 - 63997 1233 ## COG2509 Uncharacterized FAD-dependent dehydrogenases 46 29 Op 4 . + CDS 64013 - 64615 525 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain + Prom 64631 - 64690 2.7 47 30 Tu 1 . + CDS 64724 - 67123 1973 ## COG1629 Outer membrane receptor proteins, mostly Fe transport + Term 67183 - 67230 3.1 48 31 Tu 1 . - CDS 67135 - 68034 390 ## COG2207 AraC-type DNA-binding domain-containing proteins Predicted protein(s) >gi|226332149|gb|ACIC01000171.1| GENE 1 243 - 1046 176 267 aa, chain + ## HITS:1 COG:mlr1196 KEGG:ns NR:ns ## COG: mlr1196 COG2207 # Protein_GI_number: 13471273 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Mesorhizobium loti # 62 249 90 270 276 85 28.0 1e-16 MYKEYLPCHILAPYVDRYWEFKGQTEQGMQIKLSTDGCTDFMFILENTVNHGIIMQPYHS YFIGPMNVCSALITSSGTINTFGVRFRPCGLSRFMKISLSELTNIRLSVNDLNTIFSDSF AERLFEEQGIQCKINLIERYLKEHLYKSSHTMDAQIAYAVNQINAHEGKISISHLMDEVN LCQRHFERKFKLHTGFTPQKYNSIVRFKNAIKILKNESFDNLLSVAVKAGYYDASHLSRE VKKLSGSTPNFFLSLPPDNETTIIYTK >gi|226332149|gb|ACIC01000171.1| GENE 2 1128 - 1496 214 122 aa, chain + ## HITS:1 COG:no KEGG:BT_2357 NR:ns ## KEGG: BT_2357 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 119 1 119 119 213 97.0 2e-54 MKLWKYSGTFLTATGIIHTIYALFIGKDAFSEMLRNGLVNSIGENYSQGFAFWFLICGVI LILLGQTLQYYIRKEQKPAPVFLGYSILLLTIIGCIIEPTSGFWLFLPQAIIIIYSNKKR GL >gi|226332149|gb|ACIC01000171.1| GENE 3 1727 - 3124 502 465 aa, chain - ## HITS:1 COG:no KEGG:BpOF4_06510 NR:ns ## KEGG: BpOF4_06510 # Name: not_defined # Def: hypothetical protein # Organism: 
B.pseudofirmus # Pathway: not_defined # 95 343 8 228 239 133 34.0 1e-29 MNIGDSDILYSFDRARLIDRARNGFMRIDGLTFKRARDYMDKYSARDYLMQCPLDLSTKE LVSGMKDYCLQRRAEMLEPYRKKRYSIHGDPIHHLYIIGNGFDRYHGADSTYMDFRSYLL KHNDFVVKMFELFFGPRSMMNNFDDYNDFLLCLQYGRKLPAPKNTWAKDYLWKDFEKYLS ELNRERIFDFVDENLPRLYEDDENFSYAEYLGPIDIVADVVSSCTFEMQYQFHRWINTIH YKKGFRKNMLYLDPNAVYLNFNYTLFLETEYNISREHILYIHGDRRQKFGSLVLGHNVED NEVAFDEWVHKHKNRRRYRPNLKDKKGKYFANDKLVYLAFFLKDMKKGNWKNPIRYYAVD HIEERLENYYAKNIKHSNDIIDHNLGFFESLNDLKEITLLGHSLGDVDFPYFKAIVENVR NVNDLIWDFSYYSDNDIINIRRFCRHLNIPQGKNVRHFKMSDIKR >gi|226332149|gb|ACIC01000171.1| GENE 4 4217 - 4627 400 136 aa, chain - ## HITS:1 COG:no KEGG:BT_2360 NR:ns ## KEGG: BT_2360 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 136 1 136 136 190 79.0 1e-47 MKPETYTNALHIGRKIERVRRLRGMTQTDLGDLLGITKQAVSKMEQTEKLDDERVKQVAE ALGVTEEGLKKFTEETVLYYTNNFYENSNATATNIGTISNLENINHFSMDQAVKLFEELL KIEREKYSGGKEESAK >gi|226332149|gb|ACIC01000171.1| GENE 5 4665 - 5072 287 135 aa, chain - ## HITS:1 COG:no KEGG:BT_2361 NR:ns ## KEGG: BT_2361 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 131 1 127 129 150 74.0 1e-35 MDVEIKDKASQRHVGRNLQRIRVYLGMKQEALAADLGISQQEISKIEKQDEIEDKLLTQI ATALGVSAEVIRDFDVERAIYNINSYKDATISPGATATVYAHTQQINPLDKIVELYERLL QSEREKIELLKNANK >gi|226332149|gb|ACIC01000171.1| GENE 6 5403 - 8489 2436 1028 aa, chain + ## HITS:1 COG:no KEGG:BT_2362 NR:ns ## KEGG: BT_2362 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1028 1 1028 1028 1948 99.0 0 MKSFSDKTNIGTDTTLKTVYMIRFYLLFLLTGIISVVGATQVYGQQSRSISGVVKNSQGE TIIGANIVEKGTNNGTITNVDGLFTLRVSPNAVLKVSYMGYIEQEVSTRNKSKLEITMVE DARLIDEVVVVGYGSVKKRDLTGAVTSVKSAEVLAAPTSNVMEALQGKIPGMDITKTSGQ VGGEVSILLRGSRSIYGNNEPLFIIDGIPGSYSQVNPSDIESVDVLKDASSTAIYGSAGA NGVVIITTKRGKEGKATINFDAYYGFSGSPNYKHGMTQDEWVTYQKEAYRYKNGDFPSDM SALLGKQAFTDAYNAGKWIDWVDEVSGNTATTQKYSLSVSSGTEKTRIFASTSYNREEGL 
LSNENLNRYSLRLNIDQQLFSWAKVGFTSNLVYRDLNSGVKNTFTRSLSAFPLGDAYDEE GKINHEYITGQYSPMGDFIENQYANNTRSTYLNMSGYVELTPLKDFTFTSRINGTLNHSR RGQYWGNQCNANRPSYAGSPHASITNDNAWNYTWENILAYSTTLAKAHTVGGSLITSWNK NQSESNMAAASGQMVDQWSFWRLTSGSSPHVESDFAQTQKMSFAFRLNYSYKSKYLFNFS TRWDGVSQFSAGHKWDAFPAGAIAWRISDEAFMEKTRSWLDNLKLRVSYGITGNSGGTDA YSTTTQAYVYSASGVSVNGKIVPFTQYSETYGSTDLGWEKSYNWNIGLDFGILNSRIDGS IEWFRTTTKGLLFKRILPITSGLTGWGSPLSIWQNIAQTSNQGVEVILNSHNIQHKDFTW NTTLSATWSKEKIDKLPDGDLISENLFTGEPIHAIYGYKYAGIWGSDTPKETLDAYGVNP GFIKIETVDKNGDGGVHKYSTDDRQILGHTNPDWIIGLNNSFTYKNFDLSVFAMARYGQT INSDLLGYYTAEQSVTKNQLAGVDYWTEDNQGAYFPRPGTGDEQKTVYPSLRVHDGSFIK IKNITLGYTLPANISRKVLMEKCRIYATAYNPFIFVKDKQLKDTDPETNGSDAFPTYRQF VFGVNLTF >gi|226332149|gb|ACIC01000171.1| GENE 7 8508 - 10340 1493 610 aa, chain + ## HITS:1 COG:no KEGG:BT_2363 NR:ns ## KEGG: BT_2363 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 610 1 610 610 1151 98.0 0 MKSIHLKTVFCSVLLSALFLHSCSLEEYNPGAFTNETLATSTEGYETLINQCYFAMERFY YGAADWMSLTEGDTDLWTYKANQSTSYTEWFWFFAGTSPNTTYTNNWWNGTYDGIGSCNE AIALGDKPPYATEEERNAKVAEARFMRAVYYFNAVEQFGGVTMLTEPETSLNYSPVRTDP MTIYKEVILPDLRFASEWLSIGNHATTTTPTKKAALGFLAKACLQTYEYGSTEYLQEALD AALQLISDCESGGGKYNTYMYPTYSEVFEESNNWENKEALWKHRWYAGTDGHGSSNGNYK LNRNDEYFLCDINKFGAREDNQETRLTWEGSISGIFMPTQHLLNLYVQNDGTLDPRFHQS FTTEWKANKSYTWDESAVHMYDKDETVIGKALKKGDPAIKFIMPQDADYSTEKQNEHKTD YLLIDYNHVYSDTNNNVNMNYSYTNVTGSYKNDGTSENLFRYYYPSLNKHNSSNYYVANA SKQRNGNLNATFIMRMAEVYLIAAEADIYLNGGANAAKYLNKVRERAGADPLTGSMTVRD VLDERGRELCGEYCRFYDLKRTGMFKSSDYLKDTHPDLARFFDPNYALRPISTTFTATII NGSEYQNPGY >gi|226332149|gb|ACIC01000171.1| GENE 8 10583 - 13660 2516 1025 aa, chain + ## HITS:1 COG:no KEGG:BT_2364 NR:ns ## KEGG: BT_2364 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1025 1 1025 1025 1932 98.0 0 MKINQRLTLCFSLLFCLLSISYASAQQLNISGVIISSEDNEPLIGATVSVKGNPTKGVIT DINGQFQLSVETGTKMLVVSYVGMKTQEIRVPRKEKELKIVLISETMDIDEVVVTGYGNF 
SKSSFTGSANTLRTDMLKEVPVMSVEQKLQGMTTGVQITSSSGQPGANQSIRIRGMGSFN ASQEPLFVIDGVPVTSGSMSSGGPDAAYMNNSKTNIMSTLNPSDIENVTVIKDAAAASLY GSRAANGVILITTKKGNTGKVRVDLNVSGGFSHAAVDFRPTLNGDQRKELLYEGLLNYAI DNGMESPNEYANSNIGTYAYKPGMGYTDWRKELLRTAMHQSYEASVSGGNDRSTFYASLG YNNQEGLAKNSSLDRYSARLNMTQKVGKYGEVGANMMFSQMNQEMNEERGSSINPFLCVA MTMTPSMVVRDEEGNYVGAYDGTSLNPLRDILTDYNRVRMTRMFSTGYAAIEPIKGLKLK ETLSYDYTIQKDSRYYNPLSSAGPKSGSDAQTAKGFIEYGKLISSTSLNYVRTFARKHHL DVLAAYEIESYQTDHASGEKSKLPSDKLTEPDNAAVLNSFKSATQAYRMISYLSRLNYDY DDRYYIAGSYRRDGSSRLSPNNRWGNFWSVSGMWHLGNENFMKAVKPVLSDVKIRASYGV NGNQPGSYYGYKGLYSYGENYMEAAGSYESAQPNNLLTWEKNYSLNLGIDLSFINRIFAS LEYYNRDTKDLLYSLPISATTGFTSYLSNIGRLNNKGVEFELRTLNVVSNDFNWTSVFNL SHNRNKIVSLNGLLDQTIEGTWFIHKVGLPYHTFYVKEFAGVDPLTGSAQYYLNTKNEDG SYNRELTTDAAKAESIPYKTATPKVSGGFTNILNYKWIDLTFTLTYSLGGYSFDKLGTYI ENGSSSIYSSRYNLPAYMMNRWQKPGDQTDTPRFVYGEPATSTNSSRYIHSTDHLRLKNF TLGFTLPNQWTQKLMIDKIRVYFSGNNLLTWAKWKQYDPETPVNGEVFCEAPAMRTFSFG AQLSF >gi|226332149|gb|ACIC01000171.1| GENE 9 13694 - 15187 1412 497 aa, chain + ## HITS:1 COG:no KEGG:BT_2365 NR:ns ## KEGG: BT_2365 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 497 1 497 497 973 98.0 0 MKRIHIYILTAMALLNSSCSADWLDLNTTSSVETGQAIVTLDDAQIALNGIYRLASGHSY YGDNYWYYGDCRATDVQARITKGDGKRVSPYYEYNVLASDNLNIVLPWNTVYKVIRQTNN LIQKIESGSIQSSDTKELNRIKSEALVMRGLSLFNLTRLFGMPYTNDKGASLGVPIETSP SDPTHKPSRSTVAQCYEQVVSDMSNALSGLIQETSNGYINYWAAQALLSRVYLNMGEYQK AYDAATDVIKNNSGRYQLYSYEEYPNVWGQDFQSESLFELYITLSEPSGGTGGEGAPMVY ANEATVDWNNLILSEDFLNLLNEDPKDVRHCLTKESVIENNTGLPAAAMHEKVYLAKFPG KTGDDPKTNNICIIRLSEVYLNAAEAGLKKGTDLEEAQSYLNDIISRRTTDTSQQVSTET FTLDRILKERRKELVGEGEVFYDYLRNGLAIERKGSWHLETLKASNAQKIEATDLRIALP IPQSEIDANPNIQQNPK >gi|226332149|gb|ACIC01000171.1| GENE 10 15450 - 16013 329 187 aa, chain + ## HITS:1 COG:no KEGG:BT_2366 NR:ns ## KEGG: BT_2366 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 187 1 187 187 365 99.0 1e-100 
MRNIIDKMNSLYPISDETIQILKENTVLCHFPKRHQLIEADKFCKSAYFIEEGMTRSFWL VNGEEITTSFACEGAIVFSMDELYYNKMSEEFVETLEDVVAYRISLTDLLRLFQTNIELA NWGRVIHQNEYRRLHRSHKDRLTLSAKERYEAFKLQFPQMCQRIQLGYIASYLGITLSTL SRLRAYK >gi|226332149|gb|ACIC01000171.1| GENE 11 16123 - 17271 781 382 aa, chain + ## HITS:1 COG:AGl3365 KEGG:ns NR:ns ## COG: AGl3365 COG1835 # Protein_GI_number: 15891802 # Func_class: I Lipid transport and metabolism # Function: Predicted acyltransferases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 5 378 4 362 372 116 29.0 6e-26 MQKISSSAFADSKPHYVLLDGLRGVAALLVIWYHVFEGFATSPIDQKFNHGYLAVDFFFI LSGFVIGYAYDDRWKTTMTQKEFFKRRLIRLHPMVVMGAVLGAITFCIQGCEQWDGTRVS ISMVMVAMLLNLFLIPAVPGTGPEVRGNGEMYPLNGPSWSLFFEYIGNILYALFIRRMST KALTILVVIAGIGLASFSICNLSGSHHLGVGWSMIDYNLIGGFLRMLFAFSIGLLMSRIF KPVHIKGAFWICSLSILILLTMPYVGGETSPWMNGIYDAVCTILIFPLLVYLGASGKTTD KGTAKICKFLGDISYPVYIIHYPFMYLFYAWLWSKEPHITFSQSWPVALCVFFGSIVLAY LCLKLYDEPVRKWLSKKFLTKK >gi|226332149|gb|ACIC01000171.1| GENE 12 17440 - 17664 398 74 aa, chain - ## HITS:1 COG:no KEGG:BT_2368 NR:ns ## KEGG: BT_2368 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 74 1 74 74 139 100.0 4e-32 MDKKIVGANAGKVWHALNEADGISIPELARKVNLSVESTALAVGWLARENKVVIERKNGL IEIYNEGHFDFSFG >gi|226332149|gb|ACIC01000171.1| GENE 13 18198 - 18827 512 209 aa, chain - ## HITS:1 COG:no KEGG:BT_2369 NR:ns ## KEGG: BT_2369 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 209 1 209 209 424 99.0 1e-117 MRTVLAALGVLCFMGSCSSGMTSSKGIVCDATMNTVAIVTDKNDTLSFSTMNANKEEVDG LLLNDTLEVFYTGKYTPGMQASKLVQYPQSLKLGGDRDEHGCIGSAGYVWCEVQQDCIRL FEKGIRTEAVDGSTASAFIVFSPDSTRLELFFSDEQPNEILERRGLPSGGYAWNVEDDDT KNVRFIDGLWTISQRDKVIYREKKTKAGE >gi|226332149|gb|ACIC01000171.1| GENE 14 19025 - 19495 375 156 aa, chain + ## HITS:1 COG:no KEGG:BT_2370 NR:ns ## KEGG: BT_2370 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 156 1 156 156 286 98.0 2e-76 
MPEFDFSPITEGIYNVIETEEKVLISLPEETITQKRNRQNRTIKQILGHLIDSASNNHQR MIRLQYSKDLLFFPDYRQDNDLWIALQDYQHTDWNNLIQLWKFFNLHIIQVIKSADQTKL DSYWCDFEGTKVTLKEMIEGYLDHLNLHIQEIHELI >gi|226332149|gb|ACIC01000171.1| GENE 15 19640 - 20218 688 192 aa, chain - ## HITS:1 COG:CAC3350 KEGG:ns NR:ns ## COG: CAC3350 COG0693 # Protein_GI_number: 15896593 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Clostridium acetobutylicum # 2 192 3 194 195 177 49.0 1e-44 MKLLVFLAKGFETIEFSGFIDVMGWAKTDFGCDVEVVTGGFNEKVISSFNIPVLVDKTID EISVDEYDALAIPGGFEVFGFYEEAYEEKLLNLIRQFDARKKWIATVCVGALPVGKSGVL KDRKATTYHLGGAVKQKVLQSFGAIIVNEPIVVDDNIITSYGPQTASGVALLLLEKLTSH REMSLVKEAMGF >gi|226332149|gb|ACIC01000171.1| GENE 16 20919 - 21392 446 157 aa, chain + ## HITS:1 COG:lin1814_2 KEGG:ns NR:ns ## COG: lin1814_2 COG3449 # Protein_GI_number: 16800881 # Func_class: L Replication, recombination and repair # Function: DNA gyrase inhibitor # Organism: Listeria innocua # 5 157 1 158 159 87 31.0 1e-17 METKIEVKEMPDMKAVYCRHMGAFKEIVKAYEKLIKWAEPRGLYIPNVTKSATVTHDDPS VTELSKVRQSACIIVEGDVKGEGEIGNLVIPGGKYAVGHFELGTEDFEKAWNTMCKWFTE SGYQQGDGCTYELYHNSHRTHPENKHIVDICIPIKPL >gi|226332149|gb|ACIC01000171.1| GENE 17 21574 - 23136 1312 520 aa, chain - ## HITS:1 COG:no KEGG:BT_2373 NR:ns ## KEGG: BT_2373 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 520 1 520 520 946 96.0 0 MKNTYYYIVTLAMLILAFPVKAASEAEFGKLSKAYTLHADGSQEMRVQKELTLFTHAAMN RVYGESFIIYNPEFQTLKIHDSYTRQKDGTIVKTPENALLEVLPSAAADAPAYNGLKEMV VVHTGLELGATIYLDYSVVTRPGYLPELDVCEQVEELSPIREYVFSLSVPESKPLHYEWL NGKAAPVVKTAGGMKTVTWTLKNVQPRPYSLDVSLPAGNVQAVVASTYASKADALRVIKQ QLESDGKDVSELAQKLTVGAQTIEQKKELLTAYVEGLGNCRLTLSQTGYRLRPASEVIRS AYGTEAEKAALLAALQQAIGIRAEIKAAFPKTEDKDAAGLAAVSGLFVTNKGIADIQNFV SVVDLNAQPIILEKVSHVVSRTDTLRVSDKTGKALADGYRKFDLPQARGGWASYDGRVTA LNTTRPVNLLLRYLPDEAYTYIVKIDAGMKPVVVPTNKKIENAVGIMEVTVKKAEDKIEI FRTLKLKKQLITPTEYPAYYRLMAEWMDTNGNSLLFQTAK >gi|226332149|gb|ACIC01000171.1| GENE 18 
23143 - 25098 1593 651 aa, chain - ## HITS:1 COG:no KEGG:BT_2374 NR:ns ## KEGG: BT_2374 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 651 1 651 651 1305 97.0 0 MRKQLSVSLLYCLLVSATSLAQSWKPYEQTAKGKTYEASDCVTLLDSTLVSVQPTGQGSF AVCKVIKVQTPRGAVDNRVIKYDYDPLTAYAEFKRITIHRANGKVDELDVRKTCDYAAPA RAIYWGARQIMIEVGALQPGDIVDYEIAKKGFTYALLAAGNEDESRFIPPMRGQFYDIVP FWSSTPTVRKVYVVSIPMEKELQFQFYQGECASSMRYEDGRKKYSFAMDDMMPFAKEPNM VDLFDAAPKLMMSTTPQWKDKSLWFNKVNEDYGSFDPLPEAQKKVDELIKGKKTEMEKIA VLTHWVADNIRYSGISMGKGEGFTLHNTKMNYTDRCGVCKDIAGTLISFLRMAGFEAYPA MTMAGSRVESIPADHFNHCVAVVKLSNGTYMPLDPTWVPFCRELWSSAEQQQNYLPGVPE GSDLCITPVSAPENHYVRIKADNRLDADGTLRGTFTLTAEGQSDSNIRRIFTTGFQSEWK STMERQLLAISPKARLLSVDYGKNPKYYQAAPIKITFRYEIPDYALRGEGEMFFRPMVMN NLYNQVRSYMWIDTSVENRKYGFKDGCSRLVELDETIQLPAGYKLVSAAKDETMQGVGAD FEGSLVQKGNKVLLHNRLALKKRVYEASDWESFRNAVNAHKAYGEYLVIKK >gi|226332149|gb|ACIC01000171.1| GENE 19 25631 - 26056 458 141 aa, chain + ## HITS:1 COG:no KEGG:BT_2376 NR:ns ## KEGG: BT_2376 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 141 38 178 178 273 98.0 1e-72 MKEPFQIPTSKDIENLIGTDLYDVWNSLCQCIEKSYEMEQLWNRGGKAWTYEYKYRKGGK TLCALYAKEKTLGFMVILGKDERAKFEIQRGQFSNEVQMIYDAATTFHDGKWIMFELKDT KLFNDMERLLLIKRKPNRKAE >gi|226332149|gb|ACIC01000171.1| GENE 20 26561 - 27934 886 457 aa, chain - ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 9 430 7 428 448 264 37.0 4e-70 MNNKYEERLGTERMLPLVFKMALPAVAAQFVNLLYSIVDRIYIGHIPGIGTDALAGVGVT TSLIILISSFSAIVGGGGAPLAAIALGQGDRVRAGKILGNGFVLLILFTLLTSVIAYTFM EPILLLTGASENTLEYAVDYLSIYLLGTIFVEISTGLNSFINAQGRPAIAMFSVLIGALL NIILDPIFIFWFDMGVKGAALATVLSQACSAVWVVSFLFSRRASLPLEKRYMGLDRKIIL SILALGVSPFIMASTESLVGFVLNSSLKEFGDIYISALTILQSSMQFASVPLTGFAQGFV PIISYNFGHGDKQRVKDCFRIVLVTMFSFNLILMLFMILFPSTVASAFTSDERLIETVRW 
TMPVFLGGMTIFGLQRACQNMFVALGQARISIFIALLRKAILLIPLALILPNFMGVTGVY AAEAISDATAAICCTLLFFWQFPKILGRIKGNTLRVE >gi|226332149|gb|ACIC01000171.1| GENE 21 28089 - 29537 992 482 aa, chain - ## HITS:1 COG:no KEGG:BT_2378 NR:ns ## KEGG: BT_2378 # Name: not_defined # Def: endo-polygalacturonase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 482 1 482 482 945 99.0 0 MNYPKAILLSLALLAMHSLNAEVVTYPAGVGVKTLNDFSVEVRQGSGPWLPVDVYPVKVD RVDEKGHNVEVASIAYFDFDGMVDVRVISNKERVNSARVRPLSYKITPDCVGDTVTFSLS RPRNLSVEVNGDIFHNLHLFANPIDKFRPSDKEIQRALKKKKGSNLIYFGPGVHNLPNDT LFVPSGTTVYIDGGARVYGNIFTEGAHDVNIFGRGEVHPDGRGAGVWVRRSKNVRIDGIV VSQLPIGQCDSVELTNVKSISYYGWGDGMDVFSSSNVILDGVFCRNSDDCAAVYASTQGF KGGSNNVLVKNATLWADVAHPINIGGHGDPNGMDTVQNVTFRNIDILDQAEKQVDYQGCL AINPGDNTLVRNITFENIRIEDFRNGQLVNFRISFNPKYCVSTGRGIQNVLVKDVTYNGS GENLSIIAGYDENHKISGIRFVNLVVNGRKITDDMPGKPKWYKTGDMAGIFVGEHVENLS FE >gi|226332149|gb|ACIC01000171.1| GENE 22 30161 - 30904 555 247 aa, chain + ## HITS:1 COG:VC0238 KEGG:ns NR:ns ## COG: VC0238 COG0110 # Protein_GI_number: 15640268 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Vibrio cholerae # 107 235 56 183 188 64 35.0 2e-10 MENITGQNNEFTLIFPHNRIDCTSSQNDKFNKIINQANITITGNNNRISMYFDTEESAEE LLLSNGFLLIVKGDNNIVNMGTIILRYSTILGMTGLKLIIGQLPGLGAGVSRTANNCRID IGNRVVINGVTLYLQEDDSYVSIGDDSQLSWGIDIWCTDAHTITDLKGEPINFAKCIEIG KHVWIGKDSKIGKNVKIADNSIVGWGSIVTKEFNEPNVIIAGIPAKIVKRGINWDRRCIN KYLKERL >gi|226332149|gb|ACIC01000171.1| GENE 23 30978 - 31601 472 207 aa, chain - ## HITS:1 COG:CAC0777 KEGG:ns NR:ns ## COG: CAC0777 COG0110 # Protein_GI_number: 15894064 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Clostridium acetobutylicum # 5 206 9 210 210 279 64.0 3e-75 MNTNIYPRTNDLQTVYLNAVINNPHIKVGDYTIYNDFVNDPVQFEKNNVLYHYPVNGDRL IIGKFCSIACGAKFLFNSANHTLNSLSNYPFPIFFEEWQLDKGNITSAWDNKGDIVIGND VWIGYEAVVMAGVHIGDGAIIASRAVVTKDVPPYTIVGGTPAKKIRMRFDEDTIAQLQEL KWWDWSTDEIAHYLPHIMNGDMEELMK 
>gi|226332149|gb|ACIC01000171.1| GENE 24 31879 - 32748 664 289 aa, chain + ## HITS:1 COG:no KEGG:BT_2385 NR:ns ## KEGG: BT_2385 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 289 1 289 289 567 99.0 1e-160 MATYTHFGKQPDVLKHLILCEVLRNEHPQVYVETNSACAIYPMQQTSEQQYGIYYFLEKA VEEDNQVLKDSIYYKIENAEMQRGYYLGSPALAMEVLGRQAQRFLFFDIEKSALDNVERY AKQAELQTSVRLYNTDSLEGVMKLLPSLPKDSFIHIDPYEIDKKGTSGLTYLDILIEATQ LGMKCLLWYGFMTQHDKSHLNSYITTRLEEARIKDYICAELIMSSIRQDTVLYNPGIIGS GILGTNLSQKSNTAILDYSDILVRLYQYAKYKDHDGSLYRDIIGNPPET >gi|226332149|gb|ACIC01000171.1| GENE 25 32796 - 33266 319 156 aa, chain - ## HITS:1 COG:NMB0573 KEGG:ns NR:ns ## COG: NMB0573 COG1522 # Protein_GI_number: 15676478 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Neisseria meningitidis MC58 # 7 150 33 175 187 114 38.0 9e-26 MDTFEKLDKVDLQILRTLQENARLTTKELAARVSLSSTPVFERLKRLESGGYIKKYIAVL DAEKLNQGFVVFCSVKLRRLNRDIAAEFTRIIQDIPEVTECYNISGSYDYLLKIHAPNMK YYQEFILNVLGTIDSLGSLESTFVMAEVKHRYGIHI >gi|226332149|gb|ACIC01000171.1| GENE 26 33445 - 34731 1269 428 aa, chain + ## HITS:1 COG:L75975 KEGG:ns NR:ns ## COG: L75975 COG2873 # Protein_GI_number: 15672055 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Lactococcus lactis # 1 428 1 426 426 526 62.0 1e-149 MERKNLHFETLQVHVGQEQADPATDARAVPIYQTTSYVFHNSAHAAARFGLQDPGNIYGR LTNSTQGVFEERVAVLEGGVAGLAVASGAAAITYAFENITRAGDHIVAAKTIYGGSYNLL AHTLPNYGVTTTFVDPSDLSYFEKAIQENTKAVFIETLGNPNSNIIDIEAVSEIAHRHKI PLIIDNTFGTPYLIRPIEHGADIVVHSATKFIGGHGSSLGGVIVDSGKFDWVASGKFPQL TEPDPSYHGVRFVDAAGPAAYVTRIRATLLRDTGATISPFNAFILLQGLETLSLRVERHV ENALKVVNFLNNHPKVKKVNHPSLSDHPDHALYQRYFPNGAGSIFTFEVKGGQEEAHRFI DSLEIFSLLANVADVKSLVIHPASTTHSQLNAQELAEQEIYPGTVRLSIGTEHINDLIAD LEQALAKI >gi|226332149|gb|ACIC01000171.1| GENE 27 34856 - 35095 277 79 aa, chain + ## HITS:1 COG:no KEGG:BT_2388 NR:ns ## KEGG: BT_2388 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # 
Pathway: not_defined # 1 79 1 79 79 137 100.0 1e-31 MEKYLIHSNELHLIDAGKIHQAVEKMVESLDLAAGSTSNFDLYQVVESYFKDLEKRRKIN HVLGIKEDRYEFAEDFGIK >gi|226332149|gb|ACIC01000171.1| GENE 28 35088 - 35261 64 57 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKPLMKKESTSRSRNLPCIRETDHYKIYTKTREFHFGETLSFYMNKLSGVIGKFLFI >gi|226332149|gb|ACIC01000171.1| GENE 29 35194 - 36105 1024 303 aa, chain - ## HITS:1 COG:PA4394 KEGG:ns NR:ns ## COG: PA4394 COG0668 # Protein_GI_number: 15599590 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Pseudomonas aeruginosa # 35 299 6 269 278 216 45.0 5e-56 MLLLFQATATQLADSTQIAADKLMEEAIANADGLDKLSLITQQLVDFGIRAGERILIATL VFIVGRFLISMLNRFVGRLMDRRKVDISIKTFVKSLVNILLTILLIISVVGALGVETTSF AALLASAGVAVGMALSGNLQNFAGGLIVLLFKPYKVGDWIESQGVSGTVKEIQIFHTILT TGDNKVIYIPNGAMSSGVVTNYNTQTTRRVEWIVGVDYGEDYNKVQQIVRDILTADKRIL TDPAPFIALHALDASSVNVVARVWVNTADYWGVYFDINKTIYETFNEKGINFPFPQLTVH QGN >gi|226332149|gb|ACIC01000171.1| GENE 30 36337 - 38568 1919 743 aa, chain + ## HITS:1 COG:no KEGG:BT_2390 NR:ns ## KEGG: BT_2390 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 743 8 743 743 1414 98.0 0 MKKMMMMAVALLGAGFPVHAQTSAKDSLKVINLQEVQVVSTRATAKTPVAFTNIGKAELK KVNFGQDIPYLLSMTPSTLTTSDAGSGIGYTTLRVRGTDGTRINITVNGIPMNDAESHNL FWVNMPDFSSSVKDMQVQRGAGTSTNGAGAFGASVNMQTEGASMKPYAEFNGSYGSFNTH KETVKVGTGLLNNHWTFDARLSNIGTDGYIDRASVNLNSYYLQGGYFAENTSVKLIAFAG KEKTYHAWGYATKAEMKEHGRRYNPCGEYTGDDNEKHYYADQTDNYLQKNYQLLFNHTFS TAWNLNVALHYTKGDGYYEEYKEDRSFVEYGLKPFTTDGKEISESDLVRQKKMDNKFGGG VFSLNYTNHRLTASLGGGINQYRGNNFGKVTWVKNYIGALSPAHEYYRNQSKKTDGNIYL KASYDLTGGLSAYADLQYRHIDYTIDGANDKYDWNKSALRLLAVDKKFDFFNPKVGLNWN INSNHRVYASFSVAQKEPTRNNYTDGDPDSYPKAEKLLDYEAGYTFANHWLTAGANFYYM DYTDQLVLTGALNDIGEALTENVPDSYRMGIELMLGIKPCKWFQWDINATWSKNRIKNFV ENIPIYDGWNLLDVVSVSHKSTRIAFSPDFLLNNRFAFTYQGFEAALQSQFVGKQYMTNA EVEALTLDKYFVSNLNLAYTFKPRKVVKEVTLGVTVYNLFNEEYENNGWASSSYDKDVNH RIDSAGYAAQAGTNVMGHVSFRF >gi|226332149|gb|ACIC01000171.1| 
GENE 31 38622 - 42755 1695 1377 aa, chain - ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 845 1069 66 289 318 109 30.0 5e-23 MKRTVRYTVLVLIAIINSLVITTSLYAVLPYNELHQLNMSNGLVDNIITCIYQDQDRFMW FGSTNGLSRYDGKQIRNFSVDNARMYVSDIKETSDDKLWVITNEYLYCFDRRKEKFVLPS FQDGKKAISVSAMEITGDSLFWIVKGGQLQCLKRHYKLVKGDLQIEMTVEAGYPFFLDEG ESFSNLCASQDGHFLYLVTDKGNLLFFDKIAGKVVRKFKYSINPSANATTIMSEGEYIWI SSIVGGVTRLHIPTGKSDYYQYNEDARLSSLSHSDAYGVVALDNDSYIAVTWNGYTLLAP EENDPSKLSATPYTNTSFLQYRNVETRMISVYYDKEGILWIGTRGGGIVYFDFRQHSYMQ YHSKKHNEISAQVADKDGRIWLGTYHEGIMRSDQPYAKSRPLNFSPVGNQKEVPVFCATK DSCSNLWFGNASGNLICYDWSSDSFHIYPLNYLGKKVNSYIVALMIDTRYRFWVCTSAGL YLFDRQTGHFELFSLREALKEDAEPWVTAICEDKQRNIWIGTAKGIVRLSQVNVRPLKMV HGYEEKENIGARVVSALLTGTDGTVYVGYKNGFGIIPVGEDRIASFYTVKDGLCNDCIDC IVEDEKKRIWLGSISGISRYSRQQHVFYNYYISSSNKSVMLFKNTLFWGNNKSLTYFEPE ILTSATIASKTLLTGLEVDNKPINIGEKIKGQVVLDSNIASIDHLELVNANRDFSLLFNN LAFSKDLQKYSYRLYPYQKDWIVSEAGKVSFTNLSAGYYIFEVKTLFPNNTEGDVTALPI TILPHWSQTVWFRLLLIFSVLFLVGYIFYSLLRRQRRFKKMIQLKHELTIANLERNAERH IREERENFFMNAAHELRTPLTLILAPIHDIMKSITPSDNWFDSFSRLHKNCLSLQTLVDR LLYVQKIEGGMIKLHLSESDIKEIVSRVANPFLQMAMVKKREFLVQVDTVPLYLWVDVAK IESAVQNLLSNAFKYTSQNGRIELAVSEAEIDGRPYCLVTVSDNGVGIPDDLQQHVFDSF VTGKRIPQYSTSIGLGLHIVKHTMGLHHGFVTLTSRVGEGSRFVLHIPVGLSHFAQGEYE MVPDPMKGDLLEECPAEERPMEEVVEEKNETMEEAVNRVKNNKEYLLIIEDHDEMREYLC SLFKEDYNVIEAENGEEGVAMADKYIPKLIISDIMMPVKDGFECSREIRENKRTFHIPII FLTAKAEDADRLKSLQIGVDDYIMKPFNPELLKEKVKALIEQRDLLRKLYAKTLMLDEEV LESSEDVQDVFMPKMLQIIEENLSNRNFTIKVLTDKLNMSQPTLYRKVKQKTGLSIIEVI HGVRMSKAASIIMSGRYSSLTEVAEMVGYDSMISFRKQFVAQFGVLPSKYMEEKMRK >gi|226332149|gb|ACIC01000171.1| GENE 32 43116 - 44438 1050 440 aa, chain + ## HITS:1 COG:MA3842_1 KEGG:ns NR:ns ## COG: MA3842_1 COG3391 # Protein_GI_number: 20092638 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans 
str.C2A # 354 437 49 124 318 60 40.0 9e-09 MRKSFNLIWMVFGFLLLIVGCNEDKKSTAFDPNQPVKFTEFMPDSGGIRTKFIVKGSNFG EDKSQVKVYFKDEVGNEREALVLGVKPDVIYAQVPKQAGGESHVRVEIAGKEAELSNAEK TFKYIVTSSVSTVVGKAKEGGNKDGTLGETTFNTPRYVAVDNDDNVFIFDSDGRTRLSSI EQNKTITLLDGMVIDQPLFIDKEKKQLFGPCDNANFGCFLFDANVSWVADKMGQLLANGG WMHSVVLDPVDSTFVIYRQNTGQLWTQPFDKNRRTLNPNKAKRIGTLYNTGSNGLCAYNP VDKYVYCVLHSKSAVYRFKLTRDADGWPALDGDIDEYIPGAGAGFRDGDVQEAQFKEPRG IAIDKEGNLYIADVGNNRIRKVDTKLNVVTTIAGSGAAGYKDGDPLEAQFNQPWGVYLDK NEFLYIADQNNHCIRKLAIE >gi|226332149|gb|ACIC01000171.1| GENE 33 44455 - 47538 2395 1027 aa, chain + ## HITS:1 COG:no KEGG:BT_2393 NR:ns ## KEGG: BT_2393 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1027 1 1027 1027 2021 98.0 0 MKTLYFKLTLCFFLLLAGNATQGIWAQQEPIQIAGIVLDEHGDPLPGANISVKGKEKTGT ITDMDGNFSMKSLAPKTTLIVSFMGYKSKEILVTQTDKKMRISLEPDSESLDEVVVVGMG TQRKVSVVGAVTSVNPAAISAPSTSVANMLGGVVPGIIAVDRSGEPGQDVSEFWIRGIST FGANQSALVLIDGIEGNLNDLDPSDIESFSILKDASATAVYGVRGANGVVLVTTRSGQEG RTKVTWKSSMTLSYSPRMPEYLEAYDYASLANEARVVSNMDPLYSPTELEIIKAGLDNDL YPNVNWQKEILKDVTINHQHYLNVSGGGKVARYFVSVGGTFKDAIFKQDKVNKYNTNVKW AKYNFRAKVDMNLTPSTIMGLAMDGAIVEDRAPGFGTNNDALWAAQAYLTPLTVPLRYSN GMLPAYGKNGEQISPYILLNHTGFKKGNRTTMNINFTLRQDFGKWIKGLTARGMFSYNNN NIHNVVRSKMPDLYKAFGRYNDGSLMTQRTVSASNIQFAKSAASDRRYYFESQLAYERLF NQEHRVTGLIHYYMQSTETTEANDEISSIPKRYQALSARATYSYKDAYLIEGNVGYTGSE NFEPGKQFGLFPAIALGWIPTQYEFMQNHAGWLNFLKIRASYGEVGNDQIGGGRRFPYLT LINFGGGSRWGSNGLTEKQIGANNLHWEVAKKYNLGIDFQFLNDKIGGTVDIFKDTRDNI FQERKMMPSEVGVVTNPYTNVGRMRSSGVDGNIYLNQKIDKDNSFTVRANMTYAVSKVVH WDQDAVRYPYQSYSDVNYGVMRGLVALGLFQSQDEINRSPKQTYGEVRPGDIRYKDVNAD GKIDDDDVVPIAHSNVPQIQYGFALEYRYKRWTISALFTGAAKVNFLYGGSGFYPFSGGE VGNILSIVNDPKNRWTPAWYSGDPSTENPNARFPRLTYGENKNNNRASTFWLADGSYLRW KSLDISYRIGTNKALRTVGISDMNLQFVGQNLAVWDSVKLWDPGQASKNGAVYPLQRTFS LQLTATF >gi|226332149|gb|ACIC01000171.1| GENE 34 47551 - 49530 1713 659 aa, chain + ## HITS:1 COG:no KEGG:BT_2394 NR:ns ## KEGG: BT_2394 # Name: not_defined # Def: putative outer membrane protein # 
Organism: B.thetaiotaomicron # Pathway: not_defined # 1 659 1 659 659 1339 99.0 0 MKKILKKSLHILCTTLGIIPFMLSSCNYLAVDEYFNDLTPLDSIFARQDYLERYVWGTAA LMPAQGNLFSGSYGPYETAVDEVLLSWQKAEYAGTYLYADKITQNDSYYNMWGQYYKGIR KCNTIFARIDECKDLKGLDRREIMGLTHFMRASFYFYLLELYGPAVILPEEPLSVDENIE NLSFERNTYDECVEYICKDLEEAYSLLDPTRPSTQFERPTKYAAAALMSRVRLYQASKWY NGNQYYADWTTSDGRNMIAQQYDPVKWAKSAAASKRIIDSGIYSLYTVPSDENTYASSQA SQAEFPLGVGGIDPYHSYIDMFNGEALAVKNPELIYATPLNNNIISIAFPLKLGGWNGLG ITQKLIDAYYMKDGEDYVQQPDYYEEAGTVPTIATGYELRPTVAKMYLDREPRFYASIGF CECFWPATSVTGTEAPNVTNFTAGYYVNGNCAKQAANPEDYNLTGYTLKKYIHPEDNCTS HTGAKIKPKTFPVFRYAEILLNYVEALNELKGEPEYTEAADNTTYHVLYNPEEIMYYFNM IRYRAGLPGITLADASDQVKTRQLIIRERMIEFACEGRRYHDLRRWGLAETEENKPVQGM DVTKKTTERDQFYTVVNVVHKYALRSFDRKMYFYPIPRAVLDKNAKLKQNPGWDGFGDW >gi|226332149|gb|ACIC01000171.1| GENE 35 49585 - 50481 819 298 aa, chain + ## HITS:1 COG:no KEGG:BT_2395 NR:ns ## KEGG: BT_2395 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 298 1 298 298 500 97.0 1e-140 MRSIYKIVVCSLLTLSITSCEKDLLEKEQYQKEIYLIGAYNRVWTTEVSYSNEEVKTYFT VSSSGTLALDRDVNVKMKINEELVDIYNKKYWTVLNEDKYYKSLDTDLYSIPSLENTVIK HAEGISTEVPVLIKTASLKIDQSYVIPVEIESTTGYPVSTSGYKMLVLLKLKNEYSGSYQ MSGHTTLEGETPKTIQKPKTIKPTGVNTVRLFYAMNNESDEKADIQTGTIELTITDQIVE GTNDVKKVSIKAWDAENGPAIIDSGESTYNTTAKKFSLKYTIGNTLYEEQLTKEKEIL >gi|226332149|gb|ACIC01000171.1| GENE 36 50721 - 51287 520 188 aa, chain + ## HITS:1 COG:PA1958 KEGG:ns NR:ns ## COG: PA1958 COG3201 # Protein_GI_number: 15597154 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinamide mononucleotide transporter # Organism: Pseudomonas aeruginosa # 3 183 1 181 191 92 29.0 3e-19 MEMNYLEIFGTFIGIIYLWLEYRASIYLWLAGIIMPAIYIFVYYDAGLYADFGINIYYLI AAIYGWFFWMWGHGEKKSLPIIHTPWKCYLPLFLVFILSFVGIARILIEYTDSNVPWLDS FTTALSIVGMWMLARKYIEQWFAWILVDIVCCGLYIYKDLYFTSALYGLYSIIAIFGYFK WKKLMSIQ >gi|226332149|gb|ACIC01000171.1| GENE 37 51284 - 51907 509 207 aa, chain + ## HITS:1 COG:HP1291 KEGG:ns NR:ns ## COG: HP1291 COG1564 # Protein_GI_number: 
15645904 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine pyrophosphokinase # Organism: Helicobacter pylori 26695 # 9 205 2 199 204 129 36.0 4e-30 MINEHYIPEAVILANGEYPAHELPLRLLAEAQFVVCCDGAANEYISRGHTPDVIIGDGDS LLPEYKKRFSSIILQISDQETNDQTKAVHYLQSKGIRKIAIVGATGKREDHTLGNISLLM EYMKSGMEVRTVTDYGTFIPVSDTQSFASYPGQQVSIINFGAKGLKAEGLFYPLSDFTNW WQGTLNEATADEFTIHCTGEYLVFLAY >gi|226332149|gb|ACIC01000171.1| GENE 38 52626 - 53198 416 190 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164512|ref|YP_001467500.1| 50S ribosomal protein L24 (BL23; 12 kDa DNA-binding protein; HPB12) [Campylobacter concisus 13826] # 8 190 3 184 185 164 45 9e-40 MQDIVNGRCGWCGTDELYVKYHDQEWGNLVTDDKTLFEFLVLESAQAGLSWITILRKREG YRKAFCDFDAERVAQMTDEDVERLMHFEGIVKNRLKIKSTITNARQFLAIQKEFGSFYDY TLSFFPDRKPIVNTFCSLSEIPASSPESDAMSKDMKKRGFKFFGTTICYAHLQASGFIND HLKDCICRKK >gi|226332149|gb|ACIC01000171.1| GENE 39 53233 - 54501 1370 422 aa, chain - ## HITS:1 COG:PM0115 KEGG:ns NR:ns ## COG: PM0115 COG0498 # Protein_GI_number: 15601980 # Func_class: E Amino acid transport and metabolism # Function: Threonine synthase # Organism: Pasteurella multocida # 3 421 14 424 424 368 46.0 1e-101 MASLQEAVVKGLAADRGLFMPMTIKPLPQEFYDTIDTLSFQEIAYRVADAFFGEDIPTDT LKQIVYDTLSFDVPLVKVTDNIYSLELFHGPTLAFKDVGGRFMARLLGYFIKKEGQKDVN VLVATSGDTGSAVANGFLGVEGIHVYVLYPKGKVSEIQEKQFTTLGQNITALEVDGTFDD CQALVKAAFMDKELNEHLSLTSANSINVARFLPQAFYYFYAYAQLKRAGKADNAVICVPS GNFGNITAGLFGKKMGLPVKRFIAANNRNDIFYQYLQTGKYNPRPSVATIANAMDVGDPS NFARVLDLYGGSHADISAEISGTTYTDEQIRETVKETWKEHHYLLDPHGACGYRALQEGL KQGETGVFLETAHPAKFLETVESIIGEAVEIPAKLQEFMKGEKKSLQMTKEFADFKKYLL SV >gi|226332149|gb|ACIC01000171.1| GENE 40 54531 - 55754 1231 407 aa, chain - ## HITS:1 COG:MA0132 KEGG:ns NR:ns ## COG: MA0132 COG3635 # Protein_GI_number: 20089031 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted phosphoglycerate mutase, AP superfamily # Organism: Methanosarcina acetivorans str.C2A # 1 397 1 391 397 383 49.0 1e-106 
MKHIIILGDGMADWPVKSLGDKTLLQYAKTPYMDKLARMGRNGRLITVAEGFHPGSEVAN MSVLGYNLPKVYEGRGPLEAASIGVDLKPGEMAMRCNLICVEGEILKNHSSGHISTEEAD VLIQYLQEKLGNDRVRFHTGVQYRHLLVIKGGNKELDCTPPHDVPLKPFRPLMVKPLVPE AQETADLINDLILKSQELLKNHPLNLKRISEGKDPANSIWPWSPGYRPQMPTFSETFPQV KKGAVISAVDLINGIGYYAGLRRIAVEGATGLYNTNYENKVAAALEALKTDDFVYLHIEA SDEAGHEGDIDLKLLTIENLDKRAVGPIYEAVKDWDEPVAIAVLPDHPTPCELRTHTSDP IPFLIWYPGIEPDEVQTYDEISACDGSYGVLKEDEFIKEFMNQKNIQ >gi|226332149|gb|ACIC01000171.1| GENE 41 55770 - 58205 2358 811 aa, chain - ## HITS:1 COG:MJ0571 KEGG:ns NR:ns ## COG: MJ0571 COG0527 # Protein_GI_number: 15668751 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Methanococcus jannaschii # 3 454 4 467 473 268 38.0 2e-71 MKVMKFGGTSVGSVNSILSVKRIVESASEPVIVVVSALGGITDKLINTSKMAAAGDSAYE GEFREIVYRHVEMIKEVIPAGAGQVALQRQIGELLNELKDIFQGIYLIRDLSPKTSDTIV SYGERLSSIIVAELIDEAKWFDSRTFIKTEKKHNKHTIDADLTNQLVKEAFHSIPKVSLV PGFISSDKVSGDVTNLGRGGSDYTAAIIAAALDAGSLEIWTDVDGFMTADPRVISTAYTI SELSYVEATELCNFGAKVVYPPTIYPVCHKNIPIIIKNTFNPDGVGTVIKQETSNPQSKA IKGISSINDTSLITVQGLGMVGVIGVNYRIFKALAKNGISVFLVSQASSENSTSIGVRNA DADLACEVLNEEFAKEIEMGEISPILAERNLATVAIVGENMKHTPGIAGKLFGTLGRNGI NVIACAQGASETNISFVVDSKSLRKSLNVIHDSFFLSEYQVLNLFICGIGTVGGSLVEQI RCQQQKLMMENGLKLHVVGIIDAAKAMFSREGFDLANFREELQKKGKDSNLQTIRDEIVG MNIFNSVFVDCTASPDIASLYKDLLQHNVSVVAANKIAASSAYENYRELKTIARQRGVKY LFETNVGAGLPIINTINDLIHSGDKILKIEAVLSGTLNYIFNKISADIPFSRTIKMAQEE RYSEPDPRIDLSGKDVIRKLVILAREAGYHIEQEDVEKNLFVPNDFFEGSLDDFWKRVPS LDADFEARRQVLEKEHKHWRFVAKLEDGKASVGLQEVGANHPFYGLEGSNNIILLTTERY KEYPMMIQGYGAGAGVTAAGVFADIMSIANV >gi|226332149|gb|ACIC01000171.1| GENE 42 58606 - 59643 784 345 aa, chain + ## HITS:1 COG:YPO2161 KEGG:ns NR:ns ## COG: YPO2161 COG0252 # Protein_GI_number: 16122393 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Yersinia pestis # 7 344 5 335 338 284 46.0 2e-76 
MTPLNTSVLLIYTGGTIGMIENTATGALENFNFEQLQRHIPELQKFNFPIDTYQFDPPMD SSDMEPDMWRKLVHIIHENYDLYHGFVILHGTDTMAYTASALSFMLEGLDKPVILTGSQL PIGVLRTDGKENLMTSIEIAAAQDKEGKALVPEVCIFFENHLMRGNRTTKMNAENFNAFR SFNYPVLAEAGIHIKYNQAQIHVNKSKQELVPHYLLDTNIVVLKLFPGIQENVVATMLGT KGLKAVVLETYGSGNAPRKEWFIRRLCQASAQGIVIVNVTQCNAGMVEMERYETGYQLLQ AGVVSGYDSTTESAVTKLMFLLGHGYTPDEVRDRMNRSIAGEITL >gi|226332149|gb|ACIC01000171.1| GENE 43 59745 - 60944 701 399 aa, chain + ## HITS:1 COG:TM1265 KEGG:ns NR:ns ## COG: TM1265 COG1373 # Protein_GI_number: 15644021 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Thermotoga maritima # 31 399 37 387 387 112 27.0 1e-24 MDTLIRRYKRLLTATSTTYIRSLMDTINWDNRLIAIRGARGVGKTTLMLQYLKLHYANDS QSALYTSLDSLYFTQHTLSELAEQFYLKGGKCLFLDEVHKYPSWSKEIKNIYDEFPELKI VFTGSSLLQLLNAEADLSRRCISYNMQGLSYREYLKLFHQIHIRPYTLEEILESSDGICN EVNSQCRPLAHFEDYLKHGYYPFYLEGNAEYYTRIENITNLILEIELPQQCGVDISNVRK LKSLLGILSSEVPFMVDITKLSALAELSRTTILAYLQYLDRAKLIHLLYSDNDSIKKLQK PDKIYMENTNLLYALTFKDVNKGTLREVFMVNQLTYLHRVEYCTRSADYTIDSKYTIEVG GKSKDGKQIAGSKKAFIAADDIEYSAGNKIPLWAFGFLY >gi|226332149|gb|ACIC01000171.1| GENE 44 60957 - 62324 1183 455 aa, chain + ## HITS:1 COG:BS_sms KEGG:ns NR:ns ## COG: BS_sms COG1066 # Protein_GI_number: 16077155 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent serine protease # Organism: Bacillus subtilis # 1 455 1 457 458 477 51.0 1e-134 MAKEKTVYVCSNCGQESPKWVGKCPSCGEWNTYVEEIVRKETTNRRPVSGIETQKAKPVI LSEIEADDEPRINMHDDELNRVLGGGLVPGSLVLIGGEPGIGKSTLVMQTVLRMPDKKIL YVSGEESARQLKLRADRLSEVSSDCLIVCETSLEQIYVHIKNTSPDLVIIDSIQTISTEN IESSPGSIAQVRECSASILRFAKETHTPVLLIGHINKEGSIAGPKVLEHIVDTVLQFEGD QHYMYRILRSIKNRFGSTAELGIYEMRQDGLRQVSNPSELLLSQDHEGMSGVAIASAIEG VRPFLIETQALVSSAVYGNPQRSATGFDIRRMNMLLAVLEKRVGFKLAQKDVFLNIAGGL KVNDPAIDLAVISAILSSNMDAAVEPEVCMAGEIGLSGEIRPVNRIEQRIGEAEKLGFKR FLLPKYNLQGIDTQKLKIELVPVRKVEEAFRALFG >gi|226332149|gb|ACIC01000171.1| GENE 45 62345 - 63997 1233 550 aa, chain + ## HITS:1 
COG:L195271 KEGG:ns NR:ns ## COG: L195271 COG2509 # Protein_GI_number: 15673161 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Lactococcus lactis # 19 543 17 524 535 364 39.0 1e-100 MTQEYQLRILPEIAANEQRLKEYISKEKGINLRNITATRILKRSIDARQRTIFVNLKVRA YINEMPKEDEYERTIYNNVEGKPQVIVVGAGPGGLFAALRLIELGLRPVIVERGKDVRER KKDLAQISREHTINPESNYSFGEGGAGAYSDGKLYTRSKKRGNVDKILNVFCQHGASAAI LADAHPHIGTDKLPRVIENMRNTIIECGGEVHFQTRMDALIIENNEIKGIETNTGKTFLG PVILATGHSARDVYRWLAANGVTIEAKGIAVGVRLEHPAMLIDQIQYHNKNGRGKYLPAA EYSFVTQAEGRGVYSFCMCPGGFIVPAASGPEQVVVNGMSPSNRGSRWSNSGMVVEIQPE DLINGEWKMENGEWAAQQNEQLLAINPLLSNSQLSTLNTQLTPLHFQEELERQCWLQGGR RQTAPAQRMLDFTRKKLSYDLPESSYSPGLISSPLHFWMPDFISKRLSLGFQQFGRSSHG FLTNEAVMIGVETRTSSPVRIVRDKETLQHVTVRGLFPCGEGAGYAGGIVSAGVDGERCA EAVANYINQQ >gi|226332149|gb|ACIC01000171.1| GENE 46 64013 - 64615 525 200 aa, chain + ## HITS:1 COG:BMEI1582 KEGG:ns NR:ns ## COG: BMEI1582 COG2197 # Protein_GI_number: 17987865 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Brucella melitensis # 129 192 143 206 213 68 59.0 8e-12 MNNKPEIAIIESNTLTCLGLKSILEEIIPMATIRTFHSFNELMDDTPDMYAHYFISAQIY VEHNAFFLPRKRKTIVLASDSPQFQLSGVPVLNIYEPEEKLVKSLLKLHQHAHHNGYPVK DMPPIAMPEVHQEILSAREIEVLVLITKGLINKEIADKLNISLTTVITHRKNITEKLGIK SVSGLTIYAVMNGYIEADRI >gi|226332149|gb|ACIC01000171.1| GENE 47 64724 - 67123 1973 799 aa, chain + ## HITS:1 COG:YPO1011 KEGG:ns NR:ns ## COG: YPO1011 COG1629 # Protein_GI_number: 16121312 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Yersinia pestis # 29 799 34 690 690 159 23.0 3e-38 MIALALLMTGTTWAEDFPKDSLKVVDIEEVVVIATPKENRKLRELPAATTVLSQKDMQAN QVNSVKRLSGLIPNIFIPEYGSKLTTSIYIRGIGSRINTPAIGLYVDNIPYIDKSAFDFN YSDIERIDVLRGPQGTLYGRNTMGGLIKVHTKSPFTYQGTDIRMGAATYNDYNVSLTHYH 
RISDQFAFSTGGFYEHTGGFYQNSARNNERVDKGNAGGGRFRGIYLPKDNLKLDLNVSYE YSDQGGYPYFYTGITQEGLNKGKTEEREEMIGKIAYNDRSNYYRNLLNAGFNVEYQAKNF ILSAVTGYQHLKDRMFLDQDFTEKNIFNLTQKQKLNTISEEIVLKSKPNRKWEWTTGIFG FYQTLNTDGPVTFKEDGVKETIEGNTNSIFENLGNKAPKMSMSVLNPTLRVSGNFDTPIW NGAIFHQSTFNNLFTKGLSFTIGLRLDYEKMSMKYNSASDPLNFDFNFAMGPMVITAKDL VADAAYNGKLSEDYVQLLPKFALQYEWSKGNNVYATVSKGYRSGGYNVQMFSDIITGQQA HSMVEAIKKSAEFEKYSTLIEGMIGDKMPAIPEVKDATTYKPEYSWNYEVGTHLTLWEGK LWADLAAFYMDTRDQQLSQFIGSGLGRTTINAGKSNSYGAEASLRASVTNELSLNASYGY TYATFTDYIINEADKDGNLTVKADYNGKYVPFVPKHTLNIGGEYAITCSSRSIFDRIVFQ ANYNAAGRIYWTEQNDVSQAFYGTLNWRANLEKGNAMISFWARNFLDKDYAAFYFETMNK GFMQKGRPVQFGIDLRCRF >gi|226332149|gb|ACIC01000171.1| GENE 48 67135 - 68034 390 299 aa, chain - ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 165 293 156 281 288 70 31.0 3e-12 MDHNIFLLDISHLTSIDTSFFIKGEIVLSDNIDTSPLEPVEQLPQNNFPVQAGMSIVLFS LEGEMHIRISLKEYVLRPNMFCVIITGMIFEVLSINYDFRGFMIATRTDFMPTTEKTTQV MSFYKCLQNRHCFTFAEKEAGEFVRIYRSAKATLQESDHPFTIPMLQSYVQILYYRMLPI VIKEEESQTKYSRTRQEEIFQRFIGEVEKHYRRERSVKFYADSLCISPKYLSTVVYKVSR QLAGQWIDAYVILEAKTLLKSGKLTIQQISEQLNFSNQSFFGKFFKRCAGMSPKDYMNS Prediction of potential genes in microbial genomes Time: Thu May 12 03:40:30 2011 Seq name: gi|226332148|gb|ACIC01000172.1| Bacteroides sp. 1_1_6 cont1.172, whole genome shotgun sequence Length of sequence - 20296 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 6, operones - 4 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 197 - 688 463 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 2 1 Op 2 . - CDS 729 - 1763 530 ## BT_2411 hypothetical protein 3 1 Op 3 . - CDS 1760 - 3589 1541 ## COG0826 Collagenase and related proteases 4 1 Op 4 . 
- CDS 3586 - 4503 871 ## COG1897 Homoserine trans-succinylase - Prom 4547 - 4606 4.5 + Prom 4442 - 4501 3.8 5 2 Tu 1 . + CDS 4684 - 4854 179 ## BT_2414 ferredoxin + Term 4876 - 4913 6.2 - Term 4929 - 4980 7.2 6 3 Op 1 . - CDS 4999 - 6192 1270 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 6213 - 6272 3.3 7 3 Op 2 . - CDS 6275 - 7489 1393 ## COG0807 GTP cyclohydrolase II 8 3 Op 3 . - CDS 7495 - 9393 1306 ## BT_2417 hypothetical protein 9 3 Op 4 . - CDS 9470 - 9856 287 ## BT_2418 hypothetical protein - Prom 10098 - 10157 77.0 + TRNA 10081 - 10154 84.7 # Met CAT 0 0 - Term 10248 - 10298 9.2 10 4 Tu 1 . - CDS 10321 - 11628 1593 ## COG0519 GMP synthase, PP-ATPase domain/subunit - Prom 11649 - 11708 6.5 11 5 Op 1 . - CDS 11814 - 13253 1495 ## COG0642 Signal transduction histidine kinase 12 5 Op 2 . - CDS 13273 - 14406 1072 ## COG2205 Osmosensitive K+ channel histidine kinase - Term 14418 - 14479 4.3 13 5 Op 3 . - CDS 14499 - 15245 684 ## BT_2422 hypothetical protein - Prom 15266 - 15325 3.1 14 6 Op 1 18/0.000 - CDS 15361 - 15930 672 ## COG2156 K+-transporting ATPase, c chain 15 6 Op 2 20/0.000 - CDS 15947 - 17980 2028 ## COG2216 High-affinity K+ transport system, ATPase chain B 16 6 Op 3 . 
- CDS 17999 - 19705 1490 ## COG2060 K+-transporting ATPase, A chain - Prom 19946 - 20005 8.5 Predicted protein(s) >gi|226332148|gb|ACIC01000172.1| GENE 1 197 - 688 463 163 aa, chain - ## HITS:1 COG:CAC2751 KEGG:ns NR:ns ## COG: CAC2751 COG0454 # Protein_GI_number: 15896008 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 2 163 1 167 167 102 33.0 3e-22 MMEIRPTELKDLPLVMEIYDYARAFMRANGNATQWIDGYPSESFIRQEIEDGHSYVCTDE QGEILGTFCFILGEDPTYLNIYEGAWLNDEPYGVIHRMAASGKRKGVSEACLNWCFEHFE NIRVDTHRDNKVMQHILTKYGFQRCGIIYVKNGTERIAYQRIL >gi|226332148|gb|ACIC01000172.1| GENE 2 729 - 1763 530 344 aa, chain - ## HITS:1 COG:no KEGG:BT_2411 NR:ns ## KEGG: BT_2411 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 344 1 344 344 667 97.0 0 MKRFPGRDMQSVTELVFGGDKFRTALCCSVVGLLLTLLFLSSCHYTRPDLTSEELSEKTK DSLNYLYDRHYTWNTNLEVTTDSVTMECLPIKDTYIALYKGDRVVVAEFAIHPADSVDSV WVKLAHTQEEQGWIRETELKESFVPTDSISQAIHFFSDTHASYFMIIFALFVGVYLFRAF RRKQLQLVYFNDIDSVYPLFLCLLMAFSATVYESMQVFVPDTWEHFYFNPTLSPFKVPFI LSVFLLGIWLFLIVALAVLDDLFRQLTPAAAVFYLLGLMSSCIFCYFFFILMTHIYIGYL FLTFFIWVFAKKVHRNISYKYRCGHCGEKLKEKGICPHCGAINE >gi|226332148|gb|ACIC01000172.1| GENE 3 1760 - 3589 1541 609 aa, chain - ## HITS:1 COG:ECs2039 KEGG:ns NR:ns ## COG: ECs2039 COG0826 # Protein_GI_number: 15831293 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Escherichia coli O157:H7 # 2 605 17 632 667 523 46.0 1e-148 MIKQRKIELLAPAKNLECGIEAINHGADAVYIGAPKFGARAAAVNSLEDIEALVKHAHLY NARIYVTVNTILKEEELKEAEEMIHALYRVGVDALIVQDMGITKLNLPPIPLHASTQMDN RTPEKVKFLWEAGFRQVVLARELSIREIRKIHEICPEVPLEVFVHGALCVSYSGQCYVSQ ACFGRSANRGECAQFCRLPFSLVDSEGKVIVKDKHLLSLKDMNQSDELERLLDAGASSFK IEGRLKDVSYVKNVTAAYRQKLDAIFARRPEYVRASSGTCRFDFKPQLDKSFSRGFTHYF LHGRSEDIFSFDTPKSLGEEMGTMKEARGNFLTVAGLKSFNNGDGVCYIDEQGRLQGFRI 
NRVDGNKLYPQEMPRIKPRTKLYRNFDQEFERILSRKSAERKIAVNILLADNNFGFSLTL TDEDDNSVTLTLPREKEPARTPQADNLKTQLAKLGNTPFEAEKIEISFSENWFLPASVLA DFRRQAIEKLVAARRINYRQELAVWKPTSHAFPQAALTYLGNVMNTCAASFYREHGVQQV AEAYEKERVEDAVLMFCKHCLRYSMGWCPIHQRVRSPYKEPYYLVSNDGKRFRLEFDCKN CQMKVKAAQ >gi|226332148|gb|ACIC01000172.1| GENE 4 3586 - 4503 871 305 aa, chain - ## HITS:1 COG:CAC1825 KEGG:ns NR:ns ## COG: CAC1825 COG1897 # Protein_GI_number: 15895101 # Func_class: E Amino acid transport and metabolism # Function: Homoserine trans-succinylase # Organism: Clostridium acetobutylicum # 1 301 1 301 301 387 58.0 1e-107 MPLNLPDKLPAIELLKEENIFVIDTSRATQQDIRPLRIVILNLMPLKITTETDLVRLLSN TPLQVEISFMKIKSHTSKNTPIEHMQTFYTDFEQMRNEKYDGMIITGAPVEQMDFEEVTY WEEITEIFDWARTHVTSTLYICWAAQAGLYHHYGVPKYPLDQKMFGIFEHRVLEPFHSIF RGFDDCFYVPHSRHTEVRREDILKVPELTLLSESKDAGVYMAMARGGREFFVTGHSEYSP YTLDTEYRRDLGKGLPIEIPRNYYVDDDPDKGPLVRWRAHANLLFSNWLNYFVYQETPYN INDIE >gi|226332148|gb|ACIC01000172.1| GENE 5 4684 - 4854 179 56 aa, chain + ## HITS:1 COG:no KEGG:BT_2414 NR:ns ## KEGG: BT_2414 # Name: not_defined # Def: ferredoxin # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 56 21 76 76 88 100.0 6e-17 MAYVISDDCIACGTCIDECPVEAISEGDIYSINPDVCTDCGTCADVCPSEAIHPAE >gi|226332148|gb|ACIC01000172.1| GENE 6 4999 - 6192 1270 397 aa, chain - ## HITS:1 COG:AGc3991 KEGG:ns NR:ns ## COG: AGc3991 COG0436 # Protein_GI_number: 15889474 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 397 1 400 400 376 49.0 1e-104 MNQLSDRLNSLSPSATLAMSQKSNELKAQGIDVINLSVGEPDFNTPDHIKEAAKKAIDDN FSRYSPVPGYPALRNAIVEKLKKENGLEYTAAQISCANGAKQSVCNAILVLVNPGDEVIV PAPYWVSYPEMVKMAEGTPVIVSAGIEQDFKITPEQLEAAITPKTKALILCSPSNPTGSV YSKEELAGLAAVLAKYPQVVVIADEIYEHINYIGAHQSIAQFPEMKERTVIVNGVSKAYA MTGWRIGFIAGPEWIVKACNKLQGQYTSGPCSVSQKAAEAAYVGTQEPVKEMQKAFERRR DLIVKLAKEVPGFEVNVPQGAFYLFPKCSYFFGKSNGERKIENSDDLAMYLLEDAHVACV GGTSFGAPECIRMSYATSDENIVEAIRRIKEALAKLK >gi|226332148|gb|ACIC01000172.1| 
GENE 7 6275 - 7489 1393 404 aa, chain - ## HITS:1 COG:BH1556_2 KEGG:ns NR:ns ## COG: BH1556_2 COG0807 # Protein_GI_number: 15614119 # Func_class: H Coenzyme transport and metabolism # Function: GTP cyclohydrolase II # Organism: Bacillus halodurans # 209 402 1 194 197 238 59.0 2e-62 MEEIKMDRIEDAIADFKEGKFVIVVDDEDRENEGDLIIAAEKITPEKVNFMLKHARGVLC APITVSRCKELDLPHQVSDNTSVLGTPFTVTIDKLEGCSTGVSASDRAATIQALADPAST PATFGRPGHINPLYAQEKGVLRRAGHTEATIDMARLAGLYPAGALMEIMSDDGTMARLPE LRAMADEHGLKLISIHDLIAYRLKQESIVEKGVEVNMPTEHGMFRLIPFRQKSNGLEHMA IFKGTWNEDEPILVRVHSSCATGDILGSHRCDCGEQLHKAMEMIEKEGKGVVVYLNQEGR GIGLMEKMKAYKLQEDGMDTVDANLCLGHLADERDYGVGAQILRELGVHKMKLLTNNPVK RVGLEAYGLEIVENVPVETVPNPYNERYLRTKKERMGHTLHFNK >gi|226332148|gb|ACIC01000172.1| GENE 8 7495 - 9393 1306 632 aa, chain - ## HITS:1 COG:no KEGG:BT_2417 NR:ns ## KEGG: BT_2417 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 632 1 616 616 1233 100.0 0 MLRIKKLDIFIVKSFLMLFIGTFFICLFIFMMQFLWRYVDELVGKGLDMSVIAQFFFYSA LTLVPISLPLAILLASLITFGNFGERYELLAMKAAGISLLKIMRPLAFFVCGLVAVSFYF QNVVGPIAQTKLYTLIISMKQKSPELDIPEGVFYSEIPDYNLKIAKKNRKTGMMYDVLIY NLKDGFENAHIIYADSGRMEMTADKQHLWLHLYSGDLFENLRSQSMKSQNVPYRREEFRE KHTIIEFDSDFNMADASIMSNTSAAKDMSMLQASIDSMKMVGDSIGRQYYREVSEGNFRS TYGLTKEDTVKIMEANITEYNIDSLYEVAPLMQKQKVISSAASRAENLSSDINFKSYTMG SHDYSMRKHEIEWHKKITISLSCLLFFFIGAPLGGIIRKGGLGMPVIVSVMVFIIYYIID NTGYKMARDGRWVVWMGVWTSSAVLAPLGAFLTYKSNKDSVVLNADAYINWFKKIVGIRN IRHLFKKEVVIHDPDYTRIPGDLQVLTDECKEYIAKNRLMKAPNYFKLWMSGEKDDEMIS INEKLETLIEEMSNTKSVTLLNTLNNYPIISVSAHIRPFHTYWLNLAAGVIFPIGLFFYF RIWAFRVRLAKDMERIIKNNEQIQFIIQNINK >gi|226332148|gb|ACIC01000172.1| GENE 9 9470 - 9856 287 128 aa, chain - ## HITS:1 COG:no KEGG:BT_2418 NR:ns ## KEGG: BT_2418 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 128 1 128 128 231 100.0 6e-60 MKKEKIHLEYLLNATSKNILWAAISTPTGLEDWFADKVVSDDKIVEFHWGKTEQRNAEII AIRSFSFIRFRWQDDENERDYFEIKMTYNELTSDYVLEITDFAEADEVADMKELWESQVA KLRRTCGF 
>gi|226332148|gb|ACIC01000172.1| GENE 10 10321 - 11628 1593 435 aa, chain - ## HITS:1 COG:BH0607_2 KEGG:ns NR:ns ## COG: BH0607_2 COG0519 # Protein_GI_number: 15613170 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Bacillus halodurans # 121 435 1 315 315 413 61.0 1e-115 MKQDMIVILDLGSHENTVLARAIRALGVYSEIYPHDITVEELKALPNVKGIIINGGLNNV IDGVAIDVNPSIYTMGIPVMAAGHDKATCAVKLPAFTDDIEAIKAAIKSFVFDTCQAEAN WNMANFVNDQIELIRRQVGDKKVLLALSGGVDSSVVAALLLKAIGENLVCVHVNHGLMRK GESEDVVEVFSNQLKANLVYVDVTDRFLDKLAGVEDPEQKRKIIGGEFIRVFEEEARKLD GIDFLGQGTIYPDIVESGTKTAKMVKSHHNVGGLPEDLKFQLVEPLRQLFKDEVRACGLE LGLPYEMVYRQPFPGPGLGVRCLGAITRDRLEAVRESDAILREEFQIAGLDKKVWQYFTV VPDFKSVGVRDNARSFDWPVIIRAVNTVDAMTATIEPVDWPILMKITDRILKEVKNVNRV CYDMSPKPNATIEWE >gi|226332148|gb|ACIC01000172.1| GENE 11 11814 - 13253 1495 479 aa, chain - ## HITS:1 COG:PA5484 KEGG:ns NR:ns ## COG: PA5484 COG0642 # Protein_GI_number: 15600677 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Pseudomonas aeruginosa # 39 477 158 593 595 162 26.0 2e-39 MNIKTKLILGIGMLAGMIILLVTLSVVNLQLLTATEPDSPAAMPALERALLWISVTGGIC ILTGLTLLYWLPRTISKPIKELKQGILEIANHNYEERLDMRNNQEFREVAESFNRMAERL AEYRASTLADILSAKKFIEAIVNSIDDPIIGLNMDREILFINEEALNVLNLKRGNVIRKS AEELSLKNDLLRRLIRELVTPGDQKEPLKIYADDKESYFKASYIPIINTDAGKDEPSKLG DVILLKNITEFKELDSAKTTFISTISHELKTPIAAIMMSLQLLEDKRVGELNDEQEQLSK SIKENSERLLSITGELLNMTQVEAGKLQLMPKITKPIELIEYAIKANQVQADKFNIQIEV EYPEEKIGKLFVDSEKIAWVLTNLLSNAIRYSKENGRVVIGARQEGDMIDLYVQDFGKGI DPRYHKSIFDRYFRVPGTKVQGSGLGLSISRDFVEAHGGTLTVESELGKGSRFVMRLKA >gi|226332148|gb|ACIC01000172.1| GENE 12 13273 - 14406 1072 377 aa, chain - ## HITS:1 COG:AGl2094 KEGG:ns NR:ns ## COG: AGl2094 COG2205 # Protein_GI_number: 15891164 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 2 345 7 349 900 260 39.0 4e-69 
MDMDREQSVQHFLDLIKKSHRGKFKIYIGMIAGVGKSYRMLQEAHELLENGVDVQIGYIE THGRAGTEALMQGLPVIPRRKIFYKGKELEEMDVDAIIRLHPEIVIVDELAHTNVEGSLN EKRWQDVMTLLDEGINVISAINIQHIESVNEEVQEITGIEVKERVPDSVLQEADEVVNID LTAEELISRLKAGKIYRPEKIQTALDNFFRTENILQLRELALKEVALRVEKKVENEVVMG VSVGIRHEKFMACISSHEKTPRRIIRKAARLATRYNTTFVALYVQTPRESMDRIDLASQR YLLNHFKLVAELGGEVIQVQSKDILGSIVKVCRDKQISSVCMGTPNLRFPHAICSILGYR KFLNNLSQANVDLIILA >gi|226332148|gb|ACIC01000172.1| GENE 13 14499 - 15245 684 248 aa, chain - ## HITS:1 COG:no KEGG:BT_2422 NR:ns ## KEGG: BT_2422 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 248 1 248 248 487 100.0 1e-136 MKSFFSKKIVVGALVAATVCVSGITDAKAQEFAVQGDIVSSYVWRGMYQGGGAAFQPTLG FGFDNFSVTAWGSTNFSGGNKELDLTLAYKFGKAGPTLTVADLWWEGEGAYKYFNFKSHE TGHHFEAGLAYTLPVEKFPLSVAWYTMFAGKDKKLNDSGELKQNYSSYLELNYPFSVKNV DLNVTCGAVPYKAEGIYTNSGFAVTNVALKGMTEIKINDKFSLPIFAQAIWNPCVEDAFL VFGVTLRP >gi|226332148|gb|ACIC01000172.1| GENE 14 15361 - 15930 672 189 aa, chain - ## HITS:1 COG:AGl2092 KEGG:ns NR:ns ## COG: AGl2092 COG2156 # Protein_GI_number: 15891163 # Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, c chain # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 29 186 30 184 188 148 47.0 5e-36 MKTLFKSLKITLVFCVFFSVFYILILWLFAQVAGPNKGNAEVATLDGKVVGAANVGQMFT KDIYFWGRPSCAGDGYDASSSSGSNKGPTNPEYLAEVEARIDTFLVHHPYLSRKDVPAEM VTASASGLDPNITPQCAYVQVKRVAQARGLTENQVKEIVDQSVEKPLLGIFGTEKINVLK LNIALEENK >gi|226332148|gb|ACIC01000172.1| GENE 15 15947 - 17980 2028 677 aa, chain - ## HITS:1 COG:DRB0083 KEGG:ns NR:ns ## COG: DRB0083 COG2216 # Protein_GI_number: 10957402 # Func_class: P Inorganic ion transport and metabolism # Function: High-affinity K+ transport system, ATPase chain B # Organism: Deinococcus radiodurans # 8 674 10 671 675 775 63.0 0 MKNMKSASLFQKEQVIESLKQSFVKLNPRLMIKNPIMYTVEVATVVMLFVTLYSIINPSQ GSFAYNIAVFIILFITLLFANFAEAIAEARGKAQADSLRKTREETPAKKVEGNQIVTVSS SQLKKGDVFVCEAGDVIPSDGEIIEGLASIDESAITGESAPVIREAGGDKSSVTGGTKVL 
SDHIKVMVTTQPGESFLDKMIALVEGASRQKTPNEIALTILLAGFTLVFVIVCVTLKPFA DYSNTVITIASLISLFVCLIPTTIGGLLSAIGIAGMDRALRANVITKSGKAVETAGDIDT LLLDKTGTITIGNRKATHFHTAPGIDLHAFVENCLLSSLSDETPEGKSIVELGRESGIRM RSLNTTGARMIKFTAETKCSGVDLPDGTQIRKGAFDAIRKIVETAGNKFPEEVEKVIAAI SSNGGTPLVVCVNRKVTGVIELQDIIKPGIQERFERLRKMGVKTVMVTGDNPLTAKYIAE KAGVDDFIAEAKPEDKMEYIKKEQQSGKLVAMMGDGTNDAPALAQANVGVAMNSGTQAAK EAGNMVDLDNDPTKLIEIVEIGKQLLMTRGTLTTFSIANDVAKYFAIVPALFMVAIPELA ALNIMQLHSPESAILSAVIFNAIIIPILIPLALRGVQYKPIGASALLRRNLLIYGVGGVI APFIGIKLIDLLVGLFF >gi|226332148|gb|ACIC01000172.1| GENE 16 17999 - 19705 1490 568 aa, chain - ## HITS:1 COG:pli0052 KEGG:ns NR:ns ## COG: pli0052 COG2060 # Protein_GI_number: 18450334 # Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, A chain # Organism: Listeria innocua # 11 568 3 571 573 451 45.0 1e-126 MNTEILGVVAQVALMVILAYPLGRYIAKVYKGEKTWSDFMAPIERVIYKVCGIDPKEEMN WKQFLKALLILNAFWFVWGMVLLVSQGWLPLNPDGNGPQTPDQAFNTCISFMVNCNLQHY SGESGLTYFTQLFVIMLFQFITAATGMAAMAGIMKSMAAKTTKTIGNFWHFLVVSCTRIL LPLSLIVGFILILQGTPMGFDGKMKVTTLEGQEQMVSQGPAAAIVPIKQLGTNGGGYFGV NSSHPLENPTYLTNMVECWSILIIPMAMVLALGFYTRRKKLAYSIFGVMLFAFLVGVCIN VSQEMGGNPRIDELGIAQDNGAMEGKEVRLGAGATALWSIVTTVTSNGSVNGMHDSTMPL SGMMEMLNMQINTWFGGVGVGWMNYYTFIIIAVFISGLMVGRTPEFLGKKVEAREMKIAT IVALLHPFVILVFTAISSYIYVHHPDFVESEGGWLNNLGFHGLSEQLYEYTSCAANNGSG FEGLGDNTYFWNWTCGIVLILSRFLPIIGQVAIAGLLAQKKFIPESAGTLKTDTLTFGIM TFVVIFIVAALSFFPVHALSTIAEHLSL Prediction of potential genes in microbial genomes Time: Thu May 12 03:41:07 2011 Seq name: gi|226332147|gb|ACIC01000173.1| Bacteroides sp. 1_1_6 cont1.173, whole genome shotgun sequence Length of sequence - 58439 bp Number of predicted genes - 54, with homology - 53 Number of transcription units - 27, operones - 12 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 1252 - 1308 0.5 2 2 Op 1 . - CDS 1311 - 3017 1764 ## COG0457 FOG: TPR repeat 3 2 Op 2 . 
- CDS 3019 - 3762 396 ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 4 2 Op 3 . - CDS 3796 - 5463 2063 ## COG0497 ATPase involved in DNA repair 5 2 Op 4 . - CDS 5501 - 6703 1117 ## COG0452 Phosphopantothenoylcysteine synthetase/decarboxylase 6 2 Op 5 . - CDS 6703 - 7482 779 ## COG0847 DNA polymerase III, epsilon subunit and related 3'-5' exonucleases - Prom 7525 - 7584 3.4 7 3 Tu 1 . - CDS 7616 - 8740 1131 ## COG0592 DNA polymerase sliding clamp subunit (PCNA homolog) - Prom 8851 - 8910 5.7 + Prom 8684 - 8743 4.1 8 4 Tu 1 . + CDS 8893 - 9267 397 ## BT_1365 hypothetical protein + Term 9294 - 9334 3.0 9 5 Op 1 . - CDS 9988 - 11565 1095 ## BT_1366 hypothetical protein - Prom 11586 - 11645 4.9 10 5 Op 2 . - CDS 11649 - 12407 525 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I 11 5 Op 3 . - CDS 12404 - 13399 833 ## COG0812 UDP-N-acetylmuramate dehydrogenase 12 5 Op 4 . - CDS 13419 - 14267 681 ## BT_1369 hypothetical protein 13 5 Op 5 . - CDS 14264 - 15220 812 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Prom 15410 - 15469 2.8 + Prom 15196 - 15255 5.5 14 6 Tu 1 . + CDS 15358 - 16545 1508 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes + Prom 16587 - 16646 3.2 15 7 Tu 1 . + CDS 16666 - 17073 448 ## BT_1372 hypothetical protein + Term 17108 - 17151 1.4 - Term 17094 - 17141 10.2 16 8 Tu 1 . - CDS 17172 - 17651 714 ## COG1528 Ferritin-like protein - Prom 17671 - 17730 3.8 17 9 Tu 1 1/0.250 - CDS 17773 - 18933 1298 ## COG0019 Diaminopimelate decarboxylase - Term 18953 - 18996 9.4 18 10 Op 1 . - CDS 19012 - 20331 1220 ## COG0527 Aspartokinases 19 10 Op 2 . 
- CDS 20345 - 21070 237 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 20 10 Op 3 24/0.000 - CDS 21055 - 21666 674 ## COG0139 Phosphoribosyl-AMP cyclohydrolase 21 10 Op 4 23/0.000 - CDS 21729 - 22484 696 ## COG0107 Imidazoleglycerol-phosphate synthase - Prom 22510 - 22569 4.9 22 10 Op 5 25/0.000 - CDS 22593 - 23312 662 ## COG0106 Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase - Prom 23410 - 23469 6.7 23 10 Op 6 . - CDS 23484 - 24074 576 ## COG0118 Glutamine amidotransferase 24 10 Op 7 . - CDS 24074 - 24931 744 ## COG0788 Formyltetrahydrofolate hydrolase - Prom 25014 - 25073 5.5 + TRNA 25375 - 25448 82.2 # Asp GTC 0 0 + TRNA 25502 - 25575 85.5 # Asp GTC 0 0 + Prom 25376 - 25435 77.7 25 11 Op 1 . + CDS 25600 - 25827 113 ## 26 11 Op 2 . + CDS 25831 - 26424 510 ## COG0693 Putative intracellular protease/amidase + Term 26560 - 26608 9.1 - Term 26454 - 26493 1.3 27 12 Tu 1 . - CDS 26631 - 27467 859 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase - Prom 27531 - 27590 4.2 - Term 27487 - 27556 2.1 28 13 Op 1 1/0.250 - CDS 27621 - 28040 338 ## COG3871 Uncharacterized stress protein (general stress protein 26) 29 13 Op 2 . - CDS 28134 - 29012 459 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 29075 - 29134 3.6 30 14 Op 1 . - CDS 29136 - 29684 492 ## BT_1386 hypothetical protein 31 14 Op 2 . - CDS 29722 - 31086 960 ## COG0534 Na+-driven multidrug efflux pump - Prom 31186 - 31245 8.4 - Term 31124 - 31181 15.0 32 15 Op 1 . - CDS 31317 - 31643 207 ## COG2076 Membrane transporters of cations and cationic drugs - Prom 31668 - 31727 3.6 33 15 Op 2 . - CDS 31735 - 32298 433 ## BT_1389 hypothetical protein 34 15 Op 3 . - CDS 32307 - 32768 274 ## BT_1390 hypothetical protein - Prom 32867 - 32926 5.6 - Term 32910 - 32956 9.1 35 16 Tu 1 . - CDS 32980 - 34137 1203 ## BT_1391 hypothetical protein - Prom 34176 - 34235 3.1 36 17 Tu 1 . 
- CDS 34923 - 35276 449 ## BT_1392 hypothetical protein - Term 35605 - 35653 11.1 37 18 Tu 1 2/0.250 - CDS 35881 - 37086 1062 ## COG0599 Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit - Prom 37153 - 37212 6.2 38 19 Op 1 . - CDS 37218 - 37613 387 ## COG1359 Uncharacterized conserved protein 39 19 Op 2 . - CDS 37623 - 38666 671 ## COG4925 Uncharacterized conserved protein - Prom 38781 - 38840 6.8 + Prom 38641 - 38700 5.0 40 20 Tu 1 . + CDS 38842 - 39759 571 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 39608 - 39649 3.1 41 21 Op 1 . - CDS 39739 - 40686 664 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 42 21 Op 2 . - CDS 40757 - 41914 754 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities 43 21 Op 3 . - CDS 41920 - 42783 739 ## BT_1399 hypothetical protein 44 21 Op 4 4/0.125 - CDS 42833 - 43864 978 ## COG1073 Hydrolases of the alpha/beta superfamily - Prom 43887 - 43946 3.5 45 22 Op 1 . - CDS 43995 - 44957 755 ## COG2814 Arabinose efflux permease 46 22 Op 2 . - CDS 44972 - 45199 213 ## BT_1401 putative sugar transport protein 47 22 Op 3 . - CDS 45214 - 45909 456 ## COG0655 Multimeric flavodoxin WrbA - Prom 45976 - 46035 5.2 + Prom 46243 - 46302 6.3 48 23 Tu 1 . + CDS 46332 - 48089 1785 ## COG1154 Deoxyxylulose-5-phosphate synthase + Term 48120 - 48176 12.4 49 24 Tu 1 . - CDS 48176 - 49066 553 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 49196 - 49255 4.0 + Prom 48952 - 49011 4.7 50 25 Tu 1 . + CDS 49223 - 49894 488 ## COG2364 Predicted membrane protein + Term 50106 - 50144 1.2 51 26 Op 1 1/0.250 - CDS 49902 - 50498 387 ## COG1309 Transcriptional regulator 52 26 Op 2 . - CDS 50512 - 54735 2491 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) - Prom 54780 - 54839 4.5 53 27 Op 1 . - CDS 54855 - 55514 448 ## COG0793 Periplasmic protease 54 27 Op 2 . 
- CDS 55508 - 58141 2315 ## COG0793 Periplasmic protease - Prom 58272 - 58331 5.2 Predicted protein(s) >gi|226332147|gb|ACIC01000173.1| GENE 1 187 - 759 358 190 aa, chain - ## HITS:1 COG:no KEGG:BT_1358 NR:ns ## KEGG: BT_1358 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 190 1 190 190 367 100.0 1e-100 MIFSKKDIQEVLPADVTGRSVAHPKRWLVALVRISHEKKTSERLSKMGIENFLPVQQEVH QWSDRRKVVERVLLPMMIFVHVDMYEQKEVLTLGSISRYMVLRGESTPAVIPDEQMHRFK FMLDYSDEAINMSTTPLSPGTKIKVIKGPLSGLQGELVTVNGKSKVAVRLTMLGCAFVDI PVGCVEPLEK >gi|226332147|gb|ACIC01000173.1| GENE 2 1311 - 3017 1764 568 aa, chain - ## HITS:1 COG:FN1787 KEGG:ns NR:ns ## COG: FN1787 COG0457 # Protein_GI_number: 19705092 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 248 564 42 337 628 72 26.0 3e-12 MKKLLILPLALFLLVQSGLAQTPKWVEKAKRAVFSVITYDKNDKMLNTGNGFFVSEDGLA LSDYSLFKGAERAVIVTAEGKQMPVSLILGADDMYDVIKFRVAITEKKVPSLVVATTAPA AGADAWMLPYSTQKSIACVSGKVKDVSKIAGEYHYYTLSMQMKDKMVSCPVMNAEGQVFG ISQKSSGADTVTTCYAAGAAFAMSQKISALSLGDVALKNIGIRKGLPEAEDQALVYLFMA STQMSADEYEKLLDDFIRQFPSSTDGYIRRANYYVAKGKDDQSYFDKAVADFNQALKVAA KKDDVYYNIAKLIYGYQLSKPETTYKDWTYDTALKNLRQAMAIDPLPVYTQLEGDILFAQ QDYAGALAAYEKVNASNLASAASFFSAAKTKELLKADAKEVLALMDSCIARCPQPVTANF APYLLERAQMYMNNDQARNAMLDYDAYYKAVNGQVNDLFYYYREQAALKARQYQRALDDI VKAIELSPKDLTYRAEHAVVNLRVGRYEESMKILNEILKDEPKYGEAYRLLGLCQIQLKK TDEACGNFNKAKELGDPNVDELIKKYCK >gi|226332147|gb|ACIC01000173.1| GENE 3 3019 - 3762 396 247 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 [Bacillus selenitireducens MLS10] # 1 244 1 246 255 157 34 2e-37 MTDKSEMIFGVRAVIEAIQAGKEIDKILVKKDIQSDLSKELFAALKGLLIPVQRVPVERI NRITRKNHQGVVAFISSVTYQKTEDLVPFLFEQGKNPFFVMLDGITDVRNFGAIARTCEC AAVDAVIIPARGSVSVNADAMKTSAGALHTLPVCREQNLRATLQYLKDSGFRIVAATEKG DYDYTKADYTGPMCIIMGAEDTGVSYDHLALCDEWVKIPMLGTIESLNVSVAAGILIYEA VKQRNND >gi|226332147|gb|ACIC01000173.1| GENE 4 3796 - 5463 2063 555 aa, chain 
- ## HITS:1 COG:PA4763 KEGG:ns NR:ns ## COG: PA4763 COG0497 # Protein_GI_number: 15599957 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Pseudomonas aeruginosa # 1 551 1 552 558 327 38.0 4e-89 MLRSLYIQNYALIEKLDIGFETGFSVITGETGAGKSIILGAIGLLLGQRADVKAIRRGAS KCIIEARFDISAYGMRPFFEENELEYDDEECILRREVQASGKSRAFINDTPASLAQVKEL GEQLIDVHSQHQNLLLNKEGFQLNVLDILAHNDDALEKYHSLYNEWRQLDRELSELTALA EQSRTDEDYLRFQLEQLEEAHLVEGEQSELEQEAETLSHAEEIKAGLYRIEQSFFSDEGG LLSFLKDNLNTLNNLKKVYQPAGELSDRMESAYIELKDISQEVSAQGESIEFNPTRLDEV NDRLNLIYSLQQKHRVQTLEELMALADQYRSKLSEITSYDDRIAELTARKEAQYERVKKQ AAVLTKARAAAAREVEKQLAERLVPLGMPNVRFQVEMGLKKEPGIQGEDTVNFLFSANKN GTLQSISSVASGGEIARVMLSIKAMIAGAVKLPTIVFDEIDTGVSGEIADRMADMMQEMG DCNRQVISITHLPQIAARGRAHYKVYKKDSDTETNSHIRRLTDEERVEEIAHMLSGATLT EAALSNAKALLARTS >gi|226332147|gb|ACIC01000173.1| GENE 5 5501 - 6703 1117 400 aa, chain - ## HITS:1 COG:BH2510 KEGG:ns NR:ns ## COG: BH2510 COG0452 # Protein_GI_number: 15615073 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantothenoylcysteine synthetase/decarboxylase # Organism: Bacillus halodurans # 1 391 1 390 404 333 45.0 3e-91 MLKGKKIILGITGSIAAYKACYIIRGLIKQGAEVQVVITPAGKEFITPITLSALTSKPVI SEFFAQRDGTWNSHVDLGLWADAMLIAPATASTIGKMANGIADNMLITTYLSAKAPVFIA PAMDLDMYAHPSTQKNLDILRSYGNRIIEPATGELASHLVGKGRMEEPENIIRVLDEFFS ASTELQGRKIMITAGPTYEKIDPVRFIGNYSSGKMGFALAEECARRGAEVTLIAGPVQLK TQHSRIHRIDVESAQEMYEAAQTCFPTSDVGILCAAVADFRPETVAGKKIKREKEEELTL HLQATQDIAASLGKIKEDNQLLVGFALETNNEQQNAEGKLERKNFDFIVLNSLNDAGAGF RHDTNKISIIDRKGREDYPLKPKQEVAQDIIDHLVSVLTH >gi|226332147|gb|ACIC01000173.1| GENE 6 6703 - 7482 779 259 aa, chain - ## HITS:1 COG:CT261 KEGG:ns NR:ns ## COG: CT261 COG0847 # Protein_GI_number: 15604982 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases # Organism: Chlamydia trachomatis # 9 239 4 214 232 80 31.0 2e-15 MKLNLKNPIVFFDLETTGTNINTDRIVEICYLKVYPNGNEETKTMRINPEMHIPEASSAV 
HGIYDADVADCPTFKEVAKNVARDIEGCDLAGFNSNRFDIPVLAEEFLRAGVDIDMTKRK FVDVQVIFHKMEQRTLSAAYKFYCEKSLEDAHTAEADTRATYEVLKAQLDRYPDLQNDIA FLAEYSSFTKNVDFAGRMVYDDNGVEVFNFGKYKGMSVAEVLKKDPGYYSWILNSDFTLN TKAVLTKIRLREMSNLITK >gi|226332147|gb|ACIC01000173.1| GENE 7 7616 - 8740 1131 374 aa, chain - ## HITS:1 COG:BMEI1942 KEGG:ns NR:ns ## COG: BMEI1942 COG0592 # Protein_GI_number: 17988225 # Func_class: L Replication, recombination and repair # Function: DNA polymerase sliding clamp subunit (PCNA homolog) # Organism: Brucella melitensis # 1 370 26 395 397 156 30.0 7e-38 MKFIVSSTALSSHLQAISRVINSKNALPILDCFLFELEDGTLSVTVSDSETTMVTTVEVN ESDTNGRFAVVAKTLLDALKEIPEQPLTFYVNTDNYEITVQYQNGKYSLMGQNADEFPQS AVLGENAVRVEMEAGVLLGGINRSVFATADDELRPVMNGIYFDITTEDITMVASDGHKLV RCKTLAAKGNERAAFILPKKPATLLKNLLPKEQGTVTIEFDERNAVFMLESYRMVCRLIE GRYPNYNSVIPQNNPHKVTVDRQQLIGALRRVSIFSSQASSLIKLRMQENQIVISAQDID FSTSAEETQVCQYSGAAMSIGFKSTFLIDILNNISADEVVIELADPSRAGVIVPVEQEEN EDLLMLLMPMMLND >gi|226332147|gb|ACIC01000173.1| GENE 8 8893 - 9267 397 124 aa, chain + ## HITS:1 COG:no KEGG:BT_1365 NR:ns ## KEGG: BT_1365 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 124 1 124 124 216 99.0 2e-55 MEFTGKIIAILPPRGGVSKTSGNEWKSQEFVIENHDQYPRKMCFDVFGADKIEQFNIQMG EELTVSFDVDARQWNDRWFNSIRAWKVERVGAGAPMAPGAPVPPPAPSATPDFTMGDAKD DLPF >gi|226332147|gb|ACIC01000173.1| GENE 9 9988 - 11565 1095 525 aa, chain - ## HITS:1 COG:no KEGG:BT_1366 NR:ns ## KEGG: BT_1366 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 525 1 525 525 1045 99.0 0 MKLRTIVKIAITSSVVLLCSGFALYSFFRLSAAEDQKDFDLYTLVPSSVSAVFATEDVAE FVMEVDELTCSRDHQYLYVSKLFSYLKQYLYSLLEDAPHGLSRQMNQMLISFHEPDNDRN QVLYCRLGNGDRALIDRFVQKYVSSGYPPKTFDYKGEDIIIYPMADGDFLACYLTEDFLV LSCQKKLIEEVIDIRKTGKSLAADSAFKEVRAPKKSPTVATVYTRLAGMMGWTEFDMKLK DDFIYFSGVSHYVDTCFNFINVIRQQESVKGFPGETLPSTAFYFSKQGITDWASLLSYGD TQEYATAGRTSEVLNRDRELSRYLAENTGQDLVACLFQREDSLMGPAAVLSLSVMDVAEA 
ERMLRSLVNTAPAEEGTDRNSRITFCYTPAKAYPVYRLPQTTLFTQLTSFVEPSLHVFAT FYGGRLLLAPDEDSLSRYIRQLDNDEVLDGALAYRAGTDGLSDSYHFMMMADFGHVLEQS GHQVHYVPEFFLRNSEFFRNFILFAQFTCADGMVYPNIVLKYKSE >gi|226332147|gb|ACIC01000173.1| GENE 10 11649 - 12407 525 252 aa, chain - ## HITS:1 COG:BB0533 KEGG:ns NR:ns ## COG: BB0533 COG1235 # Protein_GI_number: 15594878 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Borrelia burgdorferi # 6 250 6 251 253 200 40.0 2e-51 MKVRILGSGTSTGVPQIGCTCPVCTSADPKDNRLRASAIVETEDARILIDCGPDFRAQVL HLPFEKIDGVLITHEHYDHVGGLDDLRPFCRFGAVPIYAEEYVARALRLRMPYCFVDHRY PGVPDIPLQEIEIGHAFSIHHTEVVPLRVMHGRLPILGYRIGRLGYITDMLTMPEESYEQ LEGIDVLIMNALRIAPHPTHQNLEEALKAAERIRAKETYFIHMSHDMGLHKKVEKELPEN IHLTYDGMEIIF >gi|226332147|gb|ACIC01000173.1| GENE 11 12404 - 13399 833 331 aa, chain - ## HITS:1 COG:PM1589 KEGG:ns NR:ns ## COG: PM1589 COG0812 # Protein_GI_number: 15603454 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate dehydrogenase # Organism: Pasteurella multocida # 1 329 1 328 341 256 42.0 4e-68 MYSLLSHNTFGIDVYAERFQEYASVEELKTLIAQGALTTPFLHIGGGSNLLFVKDYEGLV LHSRIEGIEVTEEDERSVAVRVGAGVVWDDFVGYCVEHGWYGTENLSLIPGEVGASAVQN IGAYGVEVKDLITSVETVNIHGEERVYAVDECGYAYRDSIFKRPENKSVFVTYVCFRLSK EERYTLDYGTIRQELEKYPELTLPILRKVIIDIRESKLPDPKVMGNAGSFFMNPIVPREK LEALQQEFPRIPFYELNDGRVKIPAGWMIDQCGWKGKALGPAAVHDKQALVLVNRGGAKG SDVLALSDAVRASVRAKFGIDIHPEVNLVGK >gi|226332147|gb|ACIC01000173.1| GENE 12 13419 - 14267 681 282 aa, chain - ## HITS:1 COG:no KEGG:BT_1369 NR:ns ## KEGG: BT_1369 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 282 1 282 282 550 99.0 1e-155 MKKLIAGVLVLTILGSCGNKKTNIDPFASITQEVDSIRHKADSIHQEELPEGPKPTEADE SFDDFIYNFASDDALQRQRVKFPLPYYKGDEKTNIEERNWKHDDLFTKQHYYTLLFDKEE DMDLVGDTSLTSVQVEWIFVKTRMVKKYYFERIKGAWILEAINLRPVERNENEDFVEFFS HFAADSLFQSRRVQEPLAFVTSDPDDDFSILETSLDLNQWFAFKPALPTDRLSNINYGQR NDDNSPTKILALKGIGNGFSNILYFRRKAGEWQLYKFEDTSI 
>gi|226332147|gb|ACIC01000173.1| GENE 13 14264 - 15220 812 318 aa, chain - ## HITS:1 COG:SA0511 KEGG:ns NR:ns ## COG: SA0511 COG0451 # Protein_GI_number: 15926231 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Staphylococcus aureus N315 # 1 315 1 314 321 342 53.0 7e-94 MKHILIIGATGQIGSELTMELRRRYGNTNVVAGYIQGAEPKGELKESGPSAVVDVTNGEM IESVVKEYHIDTIYNLAALLSVVAESKPKLAWKIGIDGLWNVLEVAREQGCAVFTPSSIG SFGASTPHTQTPQDTIQRPRTMYGVTKVTTELLSDYYFNKYGVDTRAVRFPGIISNVTPP GGGTTDYAVDIYYSAVRGEKFVCPIKEGTLMDMMYMPDALNAAIMLMEADPTKLIHRNAF NIASMSFDPETIFHAIKKHVPAFEMIYDVDPLKQRIADSWPDSLDDTCAREEWGWKPAYD LESMTVDMLEKLREKLKK >gi|226332147|gb|ACIC01000173.1| GENE 14 15358 - 16545 1508 395 aa, chain + ## HITS:1 COG:YPO0059 KEGG:ns NR:ns ## COG: YPO0059 COG0156 # Protein_GI_number: 16120412 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Yersinia pestis # 7 394 12 402 403 475 59.0 1e-134 MYGKMKEHLSQTLAEIKEAGLYKEERLIESTQQAAITVKGKEVLNFCANNYLGLSNHPRL IKASQEMMERRGFGMSSVRFICGTQDIHKELEAAISDYFQTEDTILYAACFDANGGVFEP LFSEEDAIISDSLNHASIIDGVRLCKAKRYRYANADMKDLERCLQEAQAQRFRIVVTDGV FSMDGNVAPMDQICDLAEKYDALVMVDESHSAGVVGATGHGVSELYKTYGRVDIYTGTLG KAFGGALGGFTTGRKEIIDLLRQRSRPYLFSNSLAPGIIGASLEVFKMLKESNALHDKLV ENVNYFRDQMTAAGFDIKPTQSAICAVMLYDAKLSQIYAARMQEEGIYVTGFYYPVVPKE QARIRVQISAGHEKAHLDKCIAAFIKVGKELGTLK >gi|226332147|gb|ACIC01000173.1| GENE 15 16666 - 17073 448 135 aa, chain + ## HITS:1 COG:no KEGG:BT_1372 NR:ns ## KEGG: BT_1372 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 135 23 157 157 201 100.0 9e-51 MEELTLTTPALLFSAVSLILLAYTNRFLSYAQLVRQLRDRYMENPTDITVAQIENLRKRL NLTRTMQGLGIASLFFCVVSMFLIYIGLQLLSVYVFGLALILLIASLGVSFREIQISTRS LEIYLGAMEKGKIKK >gi|226332147|gb|ACIC01000173.1| GENE 16 17172 - 17651 714 159 aa, chain - ## HITS:1 COG:MTH158 KEGG:ns NR:ns ## COG: MTH158 COG1528 # 
Protein_GI_number: 15678186 # Func_class: P Inorganic ion transport and metabolism # Function: Ferritin-like protein # Organism: Methanothermobacter thermautotrophicus # 1 159 1 161 171 150 43.0 7e-37 MISEKLQKAINEQITAEMWSSNLYLAMSFYFEKEGFSGFAHWMKKQSQEEMGHAYAMADY IIKRGGTAIVDKIDVVPNGWGTPLEVFEHVYKHECHVSALVDKLVDVAAAEKDKATQDFL WGFVREQVEEEATAQGIVDKIKKAGDTGIFFVDSQLGQR >gi|226332147|gb|ACIC01000173.1| GENE 17 17773 - 18933 1298 386 aa, chain - ## HITS:1 COG:mlr3508 KEGG:ns NR:ns ## COG: mlr3508 COG0019 # Protein_GI_number: 13473029 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Mesorhizobium loti # 15 376 27 388 422 279 43.0 5e-75 MKGIFPIDKFRTLQTPFYYYDTKVLRDTLSAINQEVAKYPSYSVHYAVKANANPKVLTII RESGMGADCVSGGEIRAAVRAGFPANKIVFAGVGKADWEINLGLEYGIFCFNVESIPELE VINELAAAQNKIANVAFRINPDVGAHTHANITTGLAENKFGISMQDMDRVIDVALEMKNV KFIGLHFHIGSQILDMGDFIALCNRVNELQDKLEARRILVEHINVGGGLGIDYGHPNRQS VPDFKSYFATYAGQLKLRPYQTLHFELGRAVVGQCGSLISKVLYVKQGTKKKFAILDAGM TDLIRPALYQAYHKMENITSEEPVEAYDVVGPICESSDVFGKAIDLNKVKRGDLIALRSA GAYGEIMASGYNCRELPKGYTSDELV >gi|226332147|gb|ACIC01000173.1| GENE 18 19012 - 20331 1220 439 aa, chain - ## HITS:1 COG:VC0391 KEGG:ns NR:ns ## COG: VC0391 COG0527 # Protein_GI_number: 15640418 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Vibrio cholerae # 3 439 34 479 479 256 35.0 7e-68 MKVLKFGGTSVGSAQRMKEVAKLITDGEQKIVVLSAMSGTTNTLVEISDYLYKKNPEGAN EIINKLESKYKQHVDELFATQEYKQKGLEVIKSHFDYIRSYTKDLFTLFEEKVVLAQGEL ISTAMVNFYLQECGVKSVLLPALEFMRTDKNAEPDPVYIKEKLQTQLELYPDMEIYITQG FICRNAYGEIDNLQRGGSDYTASLIGAAVKASEIQIWTDIDGMHNNDPRIVDKTAPVRQL HFEEAAELAYFGAKILHPTCIQPAKYANIPVRLLNTMDPDAPGTLISNDTEKGKIKAVAA KNNITAIKIKSSRMLLAHGFLRKVFEIFESYQTSIDMICTSEVGVSVTIDNTKHLNEILD DLKKYGTVTVDKDMCIICVVGDLEWENVGFEAKALDAMRDIPVRMISFGGSNYNISFLIR DCDKKTALQSLSDMLFNNK >gi|226332147|gb|ACIC01000173.1| GENE 19 20345 - 21070 237 241 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding 
protein [Acinetobacter baumannii AYE] # 1 202 1 197 311 95 29 5e-19 MEEALIQYKNVEIHQQELCVLSDVNLELHKGEFVYLIGKVGSGKTSLLKTFYGELDVIDG EAEVLGYNMRSIKRKHIPQLRRKLGIVFQDFQLLTDRTVYNNLEFVLRATGWKNKREIQE RIEEVLDLVGMSNKGYKLPNELSGGEQQRIVIARAVLNSPAIILADEPTGNLDVETGKSI VELLHNICESGSSVVMTTHNLQLLKEYPGSVYRCSEHRITDVTDEYMPRQRTIEIDLNID N >gi|226332147|gb|ACIC01000173.1| GENE 20 21055 - 21666 674 203 aa, chain - ## HITS:1 COG:hisI_1 KEGG:ns NR:ns ## COG: hisI_1 COG0139 # Protein_GI_number: 16129967 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosyl-AMP cyclohydrolase # Organism: Escherichia coli K12 # 2 100 9 107 112 146 66.0 3e-35 MELDFDKMNGLVPAIIQDNETRKVLMLGFMNKEAYDKTVETGKVTFFSRTKNRLWTKGEE SGNFLHVVSIKADCDNDTLLIQADPAGPVCHTGTDTCWGEKNEEPVMFLKALQDFIDKRH EEMPQGSYTTSLFESGINKIAQKVGEEAVETVIEATNGTNERLIYEGADLIYHMIVLLTS KGYRIEDLARELQERHSSTWKKH >gi|226332147|gb|ACIC01000173.1| GENE 21 21729 - 22484 696 251 aa, chain - ## HITS:1 COG:aq_181 KEGG:ns NR:ns ## COG: aq_181 COG0107 # Protein_GI_number: 15605750 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate synthase # Organism: Aquifex aeolicus # 1 250 1 250 253 290 58.0 1e-78 MLAKRIIPCLDIKDGQTVKGTNFVNLRQAGDPVELGRAYSEQGADELVFLDITASHEGRK TFTELVKRIAANINIPFTVGGGINELSDVDRLLNAGADKISINSSAIRNPQLIDDIAKNF GSQVCVLAVDAKQTENGWKCYLNGGRIETDKELLAWTKEAQERGAGEILFTSMNHDGVKT GYANEALASLADQLSIPVIASGGAGLKEHFRDAFLVGKADAALAASVFHFGEIKIPELKS YLCGEGITIRG >gi|226332147|gb|ACIC01000173.1| GENE 22 22593 - 23312 662 239 aa, chain - ## HITS:1 COG:PM1203 KEGG:ns NR:ns ## COG: PM1203 COG0106 # Protein_GI_number: 15603068 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase # Organism: Pasteurella multocida # 3 234 5 241 249 191 39.0 1e-48 MIEIIPAIDIIDGKCVRLSQGDYDSKKVYNENPVEVAKEFEANGVRRLHVVDLDGAASHH VVNHRVLEQIATRTSLIVDFGGGVKSDEDLKIAFESGAQMVTGGSVAVKDPELFCHWLEV YGSEKIILGADVKEHKIAVNGWKDESACELFPFLEDYINKGIQKVICTDISCDGMLKGPS 
IDLYKEMLEKFPNLYLMASGGVSNVDDIIALNEAGVPGVIFGKALYEGHITLKDLRIFL >gi|226332147|gb|ACIC01000173.1| GENE 23 23484 - 24074 576 196 aa, chain - ## HITS:1 COG:YPO1545 KEGG:ns NR:ns ## COG: YPO1545 COG0118 # Protein_GI_number: 16121818 # Func_class: E Amino acid transport and metabolism # Function: Glutamine amidotransferase # Organism: Yersinia pestis # 1 196 1 196 196 177 43.0 1e-44 MKVAVVKYNAGNIRSVDYALKRLGVEAVITADKEILQSADKVIFPGVGEAGTTMNHLKAT GLDELIKNLRQPVFGICLGMQLMCRHSEEGEVDCLNIFDVDVKRFVPQKHEDKVPHMGWN TIGKTNSKLFEGFTEEEFVYFVHSFYVPVCDFTAAETDYIHPFSAALHKDNFYATQFHPE KSGKTGERILRNFLDI >gi|226332147|gb|ACIC01000173.1| GENE 24 24074 - 24931 744 285 aa, chain - ## HITS:1 COG:alr1623 KEGG:ns NR:ns ## COG: alr1623 COG0788 # Protein_GI_number: 17229115 # Func_class: F Nucleotide transport and metabolism # Function: Formyltetrahydrofolate hydrolase # Organism: Nostoc sp. PCC 7120 # 4 284 5 283 284 324 54.0 9e-89 MMTTAKLLLHCPDKPGILAEVTDFITVNKGNIIYLDQYVDHVENIFFMRIEWELKDFLVP QEKIEDYFRTLYGQKYEMDFRLYFSDVKPRMAIFVSKMSHCLFDMLARYTAGEWNVEIPL IISNHPDLQHVAERFGIPFYLFPITKETKEEQERKEMELLAKHKITFIVLARYMQVISEQ MINAYPNKIINIHHSFLPAFVGAKPYHAAFQRGVKIIGATSHYVTTELDAGPIIEQDVVR ITHKDSIEDLVNKGKDLEKIVLSRAVQKHIERKILAYKNKTVIFS >gi|226332147|gb|ACIC01000173.1| GENE 25 25600 - 25827 113 75 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MITLIIKDFFFYALHETKIANSYSLRLRLTDILNILLPAIGHQTYKVIYKKDNNSLHHLG YSILNSTFAYIKKIE >gi|226332147|gb|ACIC01000173.1| GENE 26 25831 - 26424 510 197 aa, chain + ## HITS:1 COG:BS_ydeA KEGG:ns NR:ns ## COG: BS_ydeA COG0693 # Protein_GI_number: 16077578 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Bacillus subtilis # 1 194 1 184 197 125 38.0 4e-29 MKEVIFVLLNEFADWEGAFIAACLNQGVMPGSPVPYKVKTLSITKEPVSSIGGFRVLPDY DLKDMPEDYAGLILIGGMSWFSPEAELIVPLVEKAIKDKKVVGGICNASVFLAMHGFLNN VKHTSNTIDYLKQHVGERYTGDSNYVDQQAVRDGKIITANGTGQLEFCREILYALEADTA EAIEKSYLFYKNGFCPE >gi|226332147|gb|ACIC01000173.1| GENE 27 26631 - 27467 859 278 aa, chain - ## HITS:1 
COG:TM1009 KEGG:ns NR:ns ## COG: TM1009 COG0656 # Protein_GI_number: 15643767 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Thermotoga maritima # 4 278 6 284 286 335 58.0 5e-92 METVRLNNGVEMPILGYGVYQVTPEECERCVLDAISVGYRSIDTAQAYYNEEGVGNAVRK CGVPREELFITTKVWISNGGYEKAKVSIDESLRKLQSDYVDLLLIHQPFNDYYGTYRAME EAYKDGKARAIGVSNFYPDRFIDLAEFCEIKPAVNQVETHVFNQQVKPQEIMKKYGTKVM SWGPFAEGRNNFFSNEVLKAIGERYGKSVAQVALRFLIQRDIIVIPKSTRKERMIENFDV FDFTLSAKDMEEIAGLDKKESLFFSHYDPEMVNFLINL >gi|226332147|gb|ACIC01000173.1| GENE 28 27621 - 28040 338 139 aa, chain - ## HITS:1 COG:CAC3491 KEGG:ns NR:ns ## COG: CAC3491 COG3871 # Protein_GI_number: 15896728 # Func_class: R General function prediction only # Function: Uncharacterized stress protein (general stress protein 26) # Organism: Clostridium acetobutylicum # 11 139 13 140 145 119 42.0 2e-27 MRDAEKTVGNMIDKLKTAFIGSIDSEGFPNMKAMLQPRKREGIKTIYLTTNTSSMRVAQY RENNHACIYFCDTRFFRGVMLRGTMEVLTDPASKEMIWQEGDTMYYPEGVTDPDYCVLKF TAISGRFYSNFKSESFAVE >gi|226332147|gb|ACIC01000173.1| GENE 29 28134 - 29012 459 292 aa, chain - ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 186 288 184 286 288 74 33.0 2e-13 MKNLIPRYTFYKNKYGSELLIDVVELKYVKKFLSQSSVHTLTYYDITFVTEGEGKFSIDN QTNEAASRDVFFSKPGEIRNWDTRHIVNGYALIFEDEFLSSLFKDSLFVQHLSFFQSGKT SSKLQLPDELYMRILQILHNIKTEIDSYRQNDVYVLRALLYEVLMLLDREYKKVNMEEET TSKEVGNIHIDKFMKLVESHLKEQHSVQYYADKLCITPNYLNEIVTSTKGISAKQYIRNK VMDEAKRLLTYTDFPISDIAFELHFSTVSYFIRSFRQYTGTTPLLYRRTHKP >gi|226332147|gb|ACIC01000173.1| GENE 30 29136 - 29684 492 182 aa, chain - ## HITS:1 COG:no KEGG:BT_1386 NR:ns ## KEGG: BT_1386 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 182 1 182 182 323 100.0 2e-87 MEAYLKKLCFEYQVEEKEVKELLARMERVYLDKGETIASATMPEQSLYIIVSGILHTFTT 
QQGEDKTVRFLSEGDAILCYNTSKHSVKTLTRCVAYYISEDEIEELCATSITFSNLIRQM MEYQFYLKEEMSVNARKLTIRERYLSLLSEIPDILYRVPLKYITYYLGADVTSLGYLAGS CK >gi|226332147|gb|ACIC01000173.1| GENE 31 29722 - 31086 960 454 aa, chain - ## HITS:1 COG:L170983 KEGG:ns NR:ns ## COG: L170983 COG0534 # Protein_GI_number: 15672149 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Lactococcus lactis # 5 396 3 396 446 164 30.0 3e-40 MKELNLTQGSVPKVLLQFAVPFLIANVLQALYGGADLFVVGQYDDSASVAAVAIGSQVTQ TITGIILGITTGTTVLIAIATGAKDDKKVAFTIGSSVWLFSIIGVLLTLVMVVFHGRIAE LMHTPIEAMADTKSYILVCSAGILFIVGYNVVCGILRGLGDSKTPLYFVALACVINIVLD FILVGYFHLGATGAAVATITAQGVSFMISLWFLYRHGFHFEFTRKDIRLNRNLAKKVMTL GAPIALQDALINISFLIITVIVNQMGVIASASLGVVEKIIVFAMLPPMAISSAVATMTAQ NYGAGLIQRMNRCLASGIGIAFVFGVSVCVYSQFLPETLTAFFTKDAAVVSMAAEYLRGY SIDCIVVSFVFCINSYFSGQGNSLFPMIHSLIATFLFRIPLSYWFSQIDSSSLFIMGFAP PLSTVVSLLICIWYLRYTRKKVYLKGTMMPAMSN >gi|226332147|gb|ACIC01000173.1| GENE 32 31317 - 31643 207 108 aa, chain - ## HITS:1 COG:BS_ykkD KEGG:ns NR:ns ## COG: BS_ykkD COG2076 # Protein_GI_number: 16078375 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane transporters of cations and cationic drugs # Organism: Bacillus subtilis # 1 107 2 105 105 73 42.0 7e-14 MNWILLILAGCLELVFTFCLGKINRVVGAEKYMWSFGLLISLCVSMLLLIRVTKALPVGT AYAIWTGIGAAGTVLIGIFFFKEPASWGRIFFIFTLICSIVGLKLVSH >gi|226332147|gb|ACIC01000173.1| GENE 33 31735 - 32298 433 187 aa, chain - ## HITS:1 COG:no KEGG:BT_1389 NR:ns ## KEGG: BT_1389 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 187 1 187 187 368 100.0 1e-101 MDINDIIDSIYKLPDASKEALLSDAPEVAYPKGFHLFRANHKNSKFYLMKKGMLHAYTYQ SDKKVTFWFGKEGDAIFPLQTLHDNRAGYENIELLEDSVLYELSVDKLHNLYLEDIHIAN WGRKFSERECIKSEKLFISRQFKTSLERYQELISEYPDILQRVPLGIIASYLGISQVNLS RIRAKIR >gi|226332147|gb|ACIC01000173.1| GENE 34 32307 - 32768 274 153 aa, chain - ## HITS:1 COG:no KEGG:BT_1390 NR:ns ## KEGG: BT_1390 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 153 1 153 153 326 100.0 2e-88 MMKEQNIPQDSLIAGYLPADYCDSFSRNVVSEKAITSDEFFDMAFNHSPGWVNGLMKLRN AIVKPLGLEVGMRFADTICEQNAQEVVFGMPDKHLTFHASLWCGEKEPGGQTFTITTVVK FNNRLGRLYFFFIRPFHKVIIRSMLKRVAKRLK >gi|226332147|gb|ACIC01000173.1| GENE 35 32980 - 34137 1203 385 aa, chain - ## HITS:1 COG:no KEGG:BT_1391 NR:ns ## KEGG: BT_1391 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 385 1 375 375 713 100.0 0 MKHYNLKRMIMKKSIILLAMVLGAFTAKAQSVVEGTKFADNWSVEVNAGAITPLTHSAFF KNARPAFGVGLSKQLTPVFGLGFQGMGYINTTPSKTAFDASDVSLLGKVNLMNLFAGYNG TPRLFEVEAVAGAGWLHYYVNGDGDQNSWSTRFGLNLNFNLGESKAWTLGLKPAIVYDMQ GDFNQAKSRFNANNAAFELTAGLTYHFKSSTGNRYFTEVRVYNQGEIDDLNASINALRGQ VNNKDGELNSANQKISGLQQELEECRTKVVPVETVVKTARVPESIITFRQGKSSVDASQL PNVERVASYLKKYADAKVVIKGYASPEGSAEVNARIAAARAEAVKTILVNKYKINASRIT AEGQGVGDMFTEPDWNRVSICTIED >gi|226332147|gb|ACIC01000173.1| GENE 36 34923 - 35276 449 117 aa, chain - ## HITS:1 COG:no KEGG:BT_1392 NR:ns ## KEGG: BT_1392 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 117 1 117 117 228 100.0 7e-59 MDIIDRLKGLRRNLPIGIQSFERLRRDGYLYVDKTAFVYELASTGNPYFLSRPRRFGKSL LLSTLEAYFCGKKELFEGLAIEQYETEWNEHAVLHLNFNFNAEDYSDIEGLKNGLEL >gi|226332147|gb|ACIC01000173.1| GENE 37 35881 - 37086 1062 401 aa, chain - ## HITS:1 COG:MA0409 KEGG:ns NR:ns ## COG: MA0409 COG0599 # Protein_GI_number: 20089302 # Func_class: S Function unknown # Function: Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit # Organism: Methanosarcina acetivorans str.C2A # 153 400 3 250 250 336 65.0 6e-92 MRKLFFIFVTVIGFLQIPMQSINAQSMKKEEVPQNISAFPLGKENTGFKQYFTGESWLAP LTGNKDLNVPMSNVTFEPGCRNNWHSHTGGQILIAVGGVGYYQERGKAARRLLPGDVVEI APDIEHWHGAAPDSWFSHLAIGCNPQTNKNIWLEQVDDQQYAEATKDNGGTGLSATDPEL DAIFGNFTKEVQQYGNLDTKTRLMVTLASNIASQAQTEYRMMLESALNAGITPIEIKEIL YQAVAYAGMAKVMDFVGITNEALLAHGVRLPLEGQAVVSAETRFDKGLALQKSIFGERID 
QMHKNAPENQKHIQRYLSANCFGDYQTRGGLDVKTRELLTFSILVSLGGCESQVKGHIQG NVNVGNNKDTLLAVVTQLLPYIGYPRTLNAIACLNEVIPEK >gi|226332147|gb|ACIC01000173.1| GENE 38 37218 - 37613 387 131 aa, chain - ## HITS:1 COG:SMa0558 KEGG:ns NR:ns ## COG: SMa0558 COG1359 # Protein_GI_number: 16262744 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Sinorhizobium meliloti # 1 120 43 159 170 93 42.0 8e-20 MKKLLIIFLLFAATASANVIRPVEPSAAENNMVRLSRIIIDPERLEEYNAYLKEEIEVSM RLEPGVLVLYAVAEKERPNHVTILEIYADEAAYKSHIATPHFKKYKEGTLDMVQMLELID ATPLIPGLKMK >gi|226332147|gb|ACIC01000173.1| GENE 39 37623 - 38666 671 347 aa, chain - ## HITS:1 COG:XF1751 KEGG:ns NR:ns ## COG: XF1751 COG4925 # Protein_GI_number: 15838352 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Xylella fastidiosa 9a5c # 219 346 40 164 167 94 42.0 4e-19 MKKILLIPLMMLTMMPMAACDGSNADPILPEQPGQPGNSDGGDEDDSTNPTNPIPGGNGR YLVLYCSRTGSTERMAQQIQQTLDCDILEVEPQTPYESDYNGMLNRAQEELAAIRQGNYP AIKTSVEDFGNYDIVFVGYPIWYGSMATPMQTFLHNHASKLAGKRIALFASSGSSGISAS VDEARTLCSGATFTETLLLTSSTLSQMGNRIRTWLETLGASRENNYPSTSTNVKITVGNR TITATMEDNAAAQDFLSRLPLEVTLNDYNNITEKIFYPSPALTTAGVPRGCAPVPGDITI YVPWNNVAIFCKSGSQSNDLIKIGRIDGNGIDALNVPGNVAVKFERQ >gi|226332147|gb|ACIC01000173.1| GENE 40 38842 - 39759 571 305 aa, chain + ## HITS:1 COG:BMEII0641 KEGG:ns NR:ns ## COG: BMEII0641 COG2207 # Protein_GI_number: 17988986 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Brucella melitensis # 183 300 182 292 307 65 28.0 1e-10 MEENTKEMIVIDSVDRYNEIFGLETRHPLVSIIDLAKSTTWPTRAWFRYEVYALYLKNVK CGDIKYGRQYYDYQDGTIVCFAPGQITDLEMLPNIQPNAHGILFHPDLIRGTALGQEIKK YSFFSYETNEALHISEEERQTVMDCLQKITIELEHSIDKHSRRLICANIGLLLDYCMRFY ERQFDTRNGVNKDIIVRFEHLLNEYFEGDAPQNQGLPSVKYFADKVFLSANYFGDMVRKQ TGKTVSEYIQDKMIELAKEQLMSSNKTMSQIAYEIGFQYPQHLSRMFKRIVGMTPNQFRL TGYCR >gi|226332147|gb|ACIC01000173.1| GENE 41 39739 - 40686 664 315 aa, chain - ## HITS:1 COG:YPO1203 KEGG:ns NR:ns ## COG: YPO1203 COG0697 # 
Protein_GI_number: 16121495 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Yersinia pestis # 7 302 5 287 296 109 30.0 9e-24 MKSQNPSKRNLPGVIIAYFLIYVVWGSTYFFIGVALKDFSPFLLGALRFTTAGLILLGIC YLRGEQIFKKSLVKRSAVSGIVLLFVDMAVVMLAQRYLTSSLVAIIASSTALWILLLDVP MWRTNFRNPLTIAGGGIGFGGVVMLYAEQLNAEWLHPQSERGILLLVFGCISWALGTLYA KYRSSREEEVNAFAGSAWQMLFASVMFWICAVVNGDVREADLSTVSVTSWFSLLYLISFG SLLAYSAYVWLLKVRPAAEVGTHAYVNPFIAVLLGVSFGNEQVTFVQLSGLLVILLGVML ISRKRNKMKSTAISR >gi|226332147|gb|ACIC01000173.1| GENE 42 40757 - 41914 754 385 aa, chain - ## HITS:1 COG:CAC2970 KEGG:ns NR:ns ## COG: CAC2970 COG1168 # Protein_GI_number: 15896223 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Clostridium acetobutylicum # 3 385 2 382 384 402 48.0 1e-112 MKYNFDEIVPRRGTNSYKWDSAGDADVLPMWVADMDFRTAPSVVEALKRRVEHGIFGYVR VPDAYYEAITRWFAGRHGWQIEKEWIIYTTGVVPALSAVIKALTTPGDKVIVQTPVYNCF FSSIRNNGCEVVANPLIYMNGTYQIDFIDLERKAADPSVKVLLLCNPHNPAGRVWTKQEL TRLGEICLRNNIWVVADEIHCELVFPGHTYIPFASVSEEFLMHSVTCTSPSKAFNLAGLQ IANIVSADTDIRMQIDKAININEVCDVNPFGVEALIAAYNDGEEWLEELNQYLFANYHYL RAYFDEYLPEFPVLPLEGTYLVWVDCSALKQSSEDIVKTLLEKEKLWVNEGNLYGEAGER FIRINIACPRQRLIEGLNRLRRALK >gi|226332147|gb|ACIC01000173.1| GENE 43 41920 - 42783 739 287 aa, chain - ## HITS:1 COG:no KEGG:BT_1399 NR:ns ## KEGG: BT_1399 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 287 1 287 287 603 100.0 1e-171 MKQIKNTHYEGERPLFASHGLYLEEVTIHAGESALKECSDIEAVNCRFEGKYPFWHNDGF VVKNCLFTEGARAALWYSRNLKMKDTLVEAPKMFREMDGVELENVQLPNALETLWYCRNV ELNNVKIDKGDYLFIHGEDIRIKNFVLNGNYSFQYCKNVEIRHAEIHSKDAFWNTENVTV YDSVLDGEYLGWHSKNLRLVNCKISGTQPLCYAHDLIMENCTMADDADLAFEHSSVKATI KGPVHSVKNPRTGNIVAESFGTIILDENLKAPGNCELKLWDGLTCFD >gi|226332147|gb|ACIC01000173.1| 
GENE 44 42833 - 43864 978 343 aa, chain - ## HITS:1 COG:RSc0206 KEGG:ns NR:ns ## COG: RSc0206 COG1073 # Protein_GI_number: 17544925 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Ralstonia solanacearum # 10 342 6 342 342 484 70.0 1e-136 MKRILLLTVVFIMMLGTSFSFAQADADNFYKSDWVKVEKVSFSNQYKMKVAGNLFLPKTM KEGEKYPAIIVGHPMGAVKEQSANLYATKMAERGFVTLSIDLSFWGGSEGEPRNAVLPEV YAEDFSAAVDFLGTRPFIDRNNIGVIGICGSGSFAISAAKIDPRLKAIATISMYNMGTAS RNGLKHALTLEQRKQIMAEAAEQRYAEFLGGETKYTGGTVHEITENSGPIEREFYEFYRT QRGEFTPEGATLTTTTHPTLSSNVKFMNFYPFEDIETISPRPMLFITGENAHSREFSEDA YRLAADPKELYIVPGAGHVDLYDRVGLIPFDKLESFFKEYLKK >gi|226332147|gb|ACIC01000173.1| GENE 45 43995 - 44957 755 320 aa, chain - ## HITS:1 COG:Cgl0650 KEGG:ns NR:ns ## COG: Cgl0650 COG2814 # Protein_GI_number: 19551900 # Func_class: G Carbohydrate transport and metabolism # Function: Arabinose efflux permease # Organism: Corynebacterium glutamicum # 5 277 118 390 424 100 26.0 4e-21 MLLALGLFTLSNIISMLTTNFTVLLVARILPAFLHPVYVSMAFTVAAASVSKEKAPKAVS RVFVGVSAGMVLGVPVTSFIASEVSFSMSMLFFTVVNALVLIATILFIPSMPVKEKVSYG AQLSVLKRTVMWNSIIAVTLINGAMFGFFSYMSDYLKTVTEVSYSVISVVLLVYGLANIV GNVLAGKLLSINARRSIIIMPFALLVSYICLFITGEWITAMVIIILILGVLAGIASNNMQ YMIAEAAPEAPDFANGMFLTSANLGTTIGTAVCGAFITEMGTRYSVFGALFFLIASIVFV YLRVRVAHSQKQVKMNMATE >gi|226332147|gb|ACIC01000173.1| GENE 46 44972 - 45199 213 75 aa, chain - ## HITS:1 COG:no KEGG:BT_1401 NR:ns ## KEGG: BT_1401 # Name: not_defined # Def: putative sugar transport protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 73 1 73 401 124 100.0 7e-28 MEQKNQSKLHSGGLLVFILTVGVFGIINTEMGVVGILPLIAETFDVSVPQAGWTVSIFAL VVAASAPVTPLLFWE >gi|226332147|gb|ACIC01000173.1| GENE 47 45214 - 45909 456 231 aa, chain - ## HITS:1 COG:MA0418 KEGG:ns NR:ns ## COG: MA0418 COG0655 # Protein_GI_number: 20089311 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 51 228 3 176 179 98 33.0 1e-20 
MDRKGFLKSTMIATGSLLLGGAGICRFLNDKEPADGPFPRTIEKIHCGNMKKVLVIMSAG TKLGNTDRLTDAYIKGLVERGHSVTKVYLGSMRIEGCRGCGVCQRLAHQCAVRDGMQDIY PLFAECDTVVMASPLYFWTITSQLKAFIDRLYAISADDKYPQKDTVLLMTAGDDNENTFD QPKQYFRLLSQALGWNEVGIYCAGGCTGCEKLARQIDKVHLENVYKMGLEL >gi|226332147|gb|ACIC01000173.1| GENE 48 46332 - 48089 1785 585 aa, chain + ## HITS:1 COG:CAP0106 KEGG:ns NR:ns ## COG: CAP0106 COG1154 # Protein_GI_number: 15004809 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Clostridium acetobutylicum # 1 584 1 585 586 835 70.0 0 MYLENIYSPADVKKLSFQELNDLSHEIRASLLQKLSAHGGHFGPNFGMVEATIALHYVFN SPKDKIVYDVSHQSYVHKMLTGRKDAFLHPAEYDHVSGYSEPQESEHDFFVIGHTSTSVS LASGLAKGRDLTGGNENIIAVIGDGSLSGGEAFEGLDYVAELGTNMIIIVNDNQMSIAEN HGGLYKNLKDLRDSNGQCECNFFKAMGLDYMYVNDGNCVEALIEAFSKVKDIQHPIVVHI NTLKGKGYEPAEQDKETYHWRTPFDLETGKSKMNDDAEDYSEVTAQYLLKKMKEDKRVVT ITSGTPAVLGFTPDRRQEAGKQFVDVGIAEEHAVALASGIAANGGKPVYGVYSTFIQRSY DQLSQDLCINNNPAVLLVFWGTLSGMNDVTHLCFFDIPLISNIPNMVYLAPTCKEEYLAM LEWSIHQNEHPVAIRVPATDVISCGEPVESDYSNLNRYKVAHRGSKVAILALGSFFGLGQ SVLSLLKEKANIDATLINPRYITGVDSELMDELKADHELVITLEDGVLDGGFGEKIARYY GATDMKVLNYGAKKEFVDRYDIQELLRANHLTDEQIVEDILSLIG >gi|226332147|gb|ACIC01000173.1| GENE 49 48176 - 49066 553 296 aa, chain - ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 173 293 164 287 288 81 37.0 2e-15 MVKSRIFDNRQLLKEHPEIPHYKEEVVCFMSEYKDRSYPENERYFNRELYMILVLEGRSE ILLNGEFIVIEPDMLLVHGANYLTDHLYSSPDIKFITLSISESMRTDDSYLTQITAILLA TMRQNKQYTIQLTAYEAQIIRNELEVLMRLLNIKHQFLFRRIQAACNALFLDIADFLSRK TIIKKEVSRKDHVLQEFHALVTRNFREEHFVSFYADKLAISEQYLARIVRAGTGKSVNTI INELLIMEARTLLSSTKSTVGDIASRLGFSDAAGFCKFFKRNMGQTPLNYRKGLWI >gi|226332147|gb|ACIC01000173.1| GENE 50 49223 - 49894 488 223 aa, chain + ## HITS:1 COG:CAC0198 KEGG:ns NR:ns ## COG: CAC0198 COG2364 # Protein_GI_number: 15893491 # Func_class: 
S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 1 207 1 203 227 160 44.0 1e-39 MGMKDVFKRYLVFVIGLYFLAAGIVLIVRSALGTTPISSINYVLSLNSPLSLGTCTFLIN MLLILGQFWVIRKNRTRQDTIEILLQLPFSFIFSAFIDFNMALTSNLHPDNYGMSIALLL TGCMIQSIGVVLELKPKVAMMSAEAFVKYASRRYNKEFGKFKVCFDVTLVTLAVILSLLL SQCIEGVREGSLIAACITGYIVSFLNQKIMTRKTLYKLLPVLK >gi|226332147|gb|ACIC01000173.1| GENE 51 49902 - 50498 387 198 aa, chain - ## HITS:1 COG:MA2782 KEGG:ns NR:ns ## COG: MA2782 COG1309 # Protein_GI_number: 20091605 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Methanosarcina acetivorans str.C2A # 4 165 14 169 201 59 26.0 4e-09 MEKKIIEAAKELFIENGFAETSMSDIAAKVGINRPALHYYFRTKDKMFQAVFGNIILSFI PKIHDIIIQQDKPVSESIGEIVDAYYKVFTENPCLPLFMVREMHRDMNHLSDTIKELRLE QYLHKIMDVLQEEMNDGRLKTVPLPFVFYTFYSALTVPFLTKNLSASFFLEKDDSFEDIL VQWKPYIVRQMETLLVVS >gi|226332147|gb|ACIC01000173.1| GENE 52 50512 - 54735 2491 1407 aa, chain - ## HITS:1 COG:L104115_1 KEGG:ns NR:ns ## COG: L104115_1 COG1924 # Protein_GI_number: 15674220 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Lactococcus lactis # 3 634 6 634 634 563 47.0 1e-160 MIKIGIDVGSTTAKLVAVDENDNLLFSKYERHNAKAKETILSFLKELLSEIGDKDISVRI TGSIGMGISEKCSLPFVQEVVAAAKAIQKDYSHVTSMIDIGGEDAKVVFFKDAEATDLRM NGNCAGGTGAFIDQMAIILGVNIDELNQLAMNATQVYPIASRCGVFCKTDIQNLIAKNVS RENIAASIFRAVAVQTVVTLAHGCDITTPVLFCGGPLTFIPALRKAFIDYLSLDEKEIIL PANGTLLPALGAALFHLDKEDYCKLSHLIDKIDTTLNGNGNLASTGLEPVFASNEEYEAW KERISRHKMISSGLKAGTQDVFIGIDSGSTTTKIVVLDEDSRLLYSYYNINGGNPIKAVE EGLNQLNDECLKQNAVLNIKGSCSTGYGEDLIRSAFQLHSGIIETIAHYMAAHYLNKDVS FILDIGGQDMKAIFVNNGVIDRMEINEACSSGCGSFLETFAKSLGYSAQEFSLAACHSEA PCDLGTRCTVFMNSKVKQVLREGATINDIAAGLAYSVVKNCLYKVLKLKDISVLGKDIVV QGGTMRNDAVVRAIELLSGAEVARCDTPELMGAFGCALYAMKHQGESVSLDEIINKAQYS ARSLYCKGCDNRCLVIRYEFESGKSYYSGNRCEKVFTNGESSNRKGLNVYRQKEELLFHR SAEIAAPEQIIGIPRCLNMYEEYPFWHTLFTSCGIQVCLSDPSNFRKYEHNARMVMSDNI 
CFPAKLVHSHVQDLIEKKVDRIFMPFVIFERKGMEQNSYNCPIVTGYSEVIKSVQSEGIS VDSPAITFKDRNLLFKQCREYLSGLSVDDRTIQQAFQKAESEQHSFINTLVDYNKNILQQ TGKGETLTVMLAGRPYHTDSLIQHKVSDMLSDMGVNVITDDLVRQMDIPTGDAHFVAQWA YTNRILKAAKWCATQGKNIQFVEMTSFGCGPDAFLVDEVRDLLMRHNKSLTLLKLDDINN IGSMKLRVRSMIESLKLANADGTEGDVKDFTTVPVYDKSYRDRKILVPYFTPFISPLIPA IMKVAGYDAENLPLSDNDSSEWGLKYANNEVCYPATLIVGDIIKALKSGKYDISKIAVAI TQTGGQCRASNYISLIKKALVDAGYTDIPVISISVGSDIDNDQPAFKVNWMKVVPITFHA VLYSDCIAKFYYASVVREKEAGASAKLRDKYLQLAPEVILRRNIKGLNSLLQSAITEFNE VCRAVDTPKVGIVGEIFLKFNPFAQKEIIDWLIGRKIEVVPPILADFFMQGFVNRKVNTD SYIKNRQLPEFIYDWIYKFVKKQIDKINETASKFRYFAPFNDIFEEAEGAKSVISLNAQF GEGWLLPAEIVSYARQGVNNVVSLQPFGCIANHIVSRGVEKRIKSFYPDINLLSLDFDSG VSDVNITNRMLLFIDNLKNATMNYDNN >gi|226332147|gb|ACIC01000173.1| GENE 53 54855 - 55514 448 219 aa, chain - ## HITS:1 COG:VCA0045 KEGG:ns NR:ns ## COG: VCA0045 COG0793 # Protein_GI_number: 15600816 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Vibrio cholerae # 4 202 180 379 394 166 43.0 3e-41 MVKPVTSGQLNGLLYKRWVKQRAEDVEKWSGGRLGYVHIQSMGDGSFRTVYSDILGKYNN CDGIVIDTRFNGGGRLHEDIEILFSGQKYFTQVVRGREACDMPSRRWNKPSIMLQCEANY SNAHGTPWVYKHQNIGKLVGMPVPGTMTSVSWETLQDPSLVFGIPIIGYRLPDGSYLENT QLEPDIKVANSPETVVKGEDTQLKMAVDELLKEIDSRKK >gi|226332147|gb|ACIC01000173.1| GENE 54 55508 - 58141 2315 877 aa, chain - ## HITS:1 COG:VCA0045 KEGG:ns NR:ns ## COG: VCA0045 COG0793 # Protein_GI_number: 15600816 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Vibrio cholerae # 724 859 23 159 394 120 44.0 1e-26 MKKIIISLALVLTALSSYAITPLWMRDARISPDGKEIVFCYKGDIYKVPVQGGTATQLTT QASYEANPVWSPDGKQIAFASDRNGNFDLFIMPADGGTARRLTYHSASEIPSAFTPDGKF VLFSASIQDPAQSALFPTGAMTELYKVPVTGGRTEQVLATPAEWVCFDKSGKNFLYQDRK GFEDEWRKHHTSSITRDVWLYDASTGKHTNLTNREGEDRNPVYAPDGNSVYFLSERNGGS FNVYNFALNTPQEVKAITTFRTHPVRFLSISDKGTLCYTYDGELYTQEPNARPKKVNVDL VRDDEKEIATLRFSQGATSASVSPDGKQVAFIVRGDVFVTSTDYATTKQITNTPAKEAGV SFAPDNRTLVYASERTGNWQLYTAKIARKEEANFPNATLIEEEVLLPSKTIERAYPQYSP 
DGKELAFIEDRNRLMVLNLKTKKVRQVTDGSTWYNTGGGFDYEWSPDGKWFTLEFIGNRH DPYSDIGIVSAQGGTITNLTNSGYISGSPRWVLDGNAILFQTERYGMRAHASWGSQQDVM LVFLNQDAYDRYRLSKEDFELLKEFEKEQKKAKEKDGDKKKDGSKSKKEKADKEGDKEDT EEDKADKKDIVVELSGIEDRIVRLTPNSSDLGSAILSKDGENLYYFSAFEEGYDLWKINL REKDTKRLHKLNSGWSSLMLDKKGDIFLLGSRNMQKMDAKSDALKSISYQAEMKMDLAAE REAMFDHVYKQHQKRFYNLNMHGVDWDAMTAAYRKFLPHIDNNYDFAELLSEWLGELNVS HTGGRYSPRGNGDVTSNLGLLFDWNYLGKGMQIAEIVEKGPFDHSRTKVKAGCIIEKING EEITPDKDITDLLNNKAGKKHWFRCTTHRIRNVGKKW Prediction of potential genes in microbial genomes Time: Thu May 12 03:42:20 2011 Seq name: gi|226332146|gb|ACIC01000174.1| Bacteroides sp. 1_1_6 cont1.174, whole genome shotgun sequence Length of sequence - 96304 bp Number of predicted genes - 85, with homology - 84 Number of transcription units - 45, operones - 21 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 690 327 ## COG2040 Homocysteine/selenocysteine methylase (S-methylmethionine-dependent) 2 1 Op 2 . - CDS 733 - 1554 376 ## COG1237 Metal-dependent hydrolases of the beta-lactamase superfamily II - Prom 1598 - 1657 9.7 + Prom 1560 - 1619 5.9 3 2 Tu 1 . + CDS 1768 - 2313 332 ## COG0350 Methylated DNA-protein cysteine methyltransferase + Term 2358 - 2411 -1.0 - Term 2429 - 2464 -0.6 4 3 Tu 1 . - CDS 2509 - 3048 543 ## COG0655 Multimeric flavodoxin WrbA - Prom 3286 - 3345 6.2 + Prom 3054 - 3113 6.2 5 4 Tu 1 . + CDS 3315 - 4004 455 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases + Term 4011 - 4061 9.0 - Term 4003 - 4045 10.4 6 5 Op 1 . - CDS 4094 - 5380 1207 ## BT_1414 hypothetical protein 7 5 Op 2 . - CDS 5388 - 6179 619 ## COG0755 ABC-type transport system involved in cytochrome c biogenesis, permease component 8 5 Op 3 . - CDS 6176 - 7411 1065 ## BT_1416 hypothetical protein 9 5 Op 4 1/0.077 - CDS 7423 - 8904 1374 ## COG3303 Formate-dependent nitrite reductase, periplasmic cytochrome c552 subunit 10 5 Op 5 . 
- CDS 8948 - 9544 363 ## COG3005 Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c subunit - Prom 9615 - 9674 6.8 - Term 9574 - 9615 7.6 11 6 Op 1 . - CDS 9702 - 10196 419 ## BT_1419 hypothetical protein 12 6 Op 2 . - CDS 10196 - 12352 1699 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 12372 - 12431 4.0 13 7 Tu 1 . - CDS 12439 - 12753 71 ## BT_1421 hypothetical protein + Prom 13091 - 13150 4.4 14 8 Tu 1 . + CDS 13205 - 14488 922 ## gi|253572290|ref|ZP_04849693.1| predicted protein + Term 14531 - 14574 6.5 15 9 Op 1 . - CDS 14499 - 14945 322 ## gi|253572291|ref|ZP_04849694.1| conserved hypothetical protein 16 9 Op 2 . - CDS 14954 - 16300 844 ## gi|253572292|ref|ZP_04849695.1| predicted protein 17 9 Op 3 . - CDS 16355 - 16564 139 ## gi|253572293|ref|ZP_04849696.1| predicted protein 18 9 Op 4 . - CDS 16602 - 18314 1449 ## BT_4445 hypothetical protein 19 9 Op 5 . - CDS 18392 - 19009 455 ## BT_1424 hypothetical protein - Prom 19050 - 19109 5.9 - Term 19145 - 19191 -0.8 20 10 Tu 1 . - CDS 19367 - 20065 646 ## BT_1425 hypothetical protein 21 11 Op 1 . - CDS 20138 - 20668 316 ## COG4332 Uncharacterized protein conserved in bacteria - Prom 20739 - 20798 2.4 22 11 Op 2 . - CDS 20839 - 21678 354 ## BT_1427 tetracycline resistance element mobilization regulatory protein RteC - Prom 21914 - 21973 12.4 + Prom 21890 - 21949 4.6 23 12 Tu 1 . + CDS 22074 - 22349 392 ## BT_1428 hypothetical protein + Term 22398 - 22447 2.2 - Term 22384 - 22433 5.2 24 13 Tu 1 . - CDS 22492 - 22893 375 ## COG3871 Uncharacterized stress protein (general stress protein 26) - Prom 22918 - 22977 5.9 25 14 Tu 1 . - CDS 22999 - 23817 452 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 23838 - 23897 5.9 + Prom 23707 - 23766 4.3 26 15 Tu 1 . 
+ CDS 23917 - 24459 409 ## COG0847 DNA polymerase III, epsilon subunit and related 3'-5' exonucleases + Term 24491 - 24544 6.6 - Term 24479 - 24532 11.2 27 16 Op 1 3/0.000 - CDS 24564 - 25733 1216 ## COG1312 D-mannonate dehydratase 28 16 Op 2 4/0.000 - CDS 25768 - 26580 1005 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 29 16 Op 3 . - CDS 26601 - 27671 838 ## COG1609 Transcriptional regulators - Prom 27696 - 27755 6.4 - Term 28036 - 28088 8.5 30 17 Op 1 . - CDS 28097 - 28384 356 ## BT_1435 hypothetical protein 31 17 Op 2 . - CDS 28429 - 28674 303 ## BF4188 hypothetical protein - Prom 28696 - 28755 7.5 32 18 Tu 1 . - CDS 28767 - 29900 597 ## COG1929 Glycerate kinase - Prom 29979 - 30038 2.8 + Prom 29904 - 29963 2.9 33 19 Tu 1 . + CDS 29998 - 31206 1105 ## COG0477 Permeases of the major facilitator superfamily - Term 31201 - 31245 9.2 34 20 Op 1 . - CDS 31289 - 32770 1323 ## BT_1439 hypothetical protein 35 20 Op 2 . - CDS 32784 - 35912 3007 ## BT_1440 hypothetical protein - Prom 36046 - 36105 4.2 + Prom 35947 - 36006 3.7 36 21 Tu 1 . + CDS 36155 - 37189 874 ## BT_1441 hypothetical protein + Prom 37224 - 37283 3.3 37 22 Op 1 6/0.000 + CDS 37311 - 39725 1931 ## COG0161 Adenosylmethionine-8-amino-7-oxononanoate aminotransferase 38 22 Op 2 5/0.000 + CDS 39845 - 40999 922 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes 39 22 Op 3 5/0.000 + CDS 40996 - 41655 425 ## COG2830 Uncharacterized protein conserved in bacteria 40 22 Op 4 9/0.000 + CDS 41666 - 42451 566 ## COG0500 SAM-dependent methyltransferases 41 22 Op 5 . + CDS 42448 - 43095 644 ## COG0132 Dethiobiotin synthetase 42 23 Tu 1 . - CDS 43120 - 45303 1952 ## COG0642 Signal transduction histidine kinase - Term 45484 - 45544 13.5 43 24 Op 1 . - CDS 45568 - 46092 721 ## COG1038 Pyruvate carboxylase 44 24 Op 2 2/0.000 - CDS 46144 - 47655 1558 ## COG0439 Biotin carboxylase 45 24 Op 3 . 
- CDS 47711 - 49255 1699 ## COG4799 Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) - Prom 49313 - 49372 9.1 + Prom 49285 - 49344 5.9 46 25 Op 1 . + CDS 49501 - 50088 766 ## BT_1451 hypothetical protein 47 25 Op 2 . + CDS 50113 - 51570 1643 ## BT_1452 hypothetical protein + Term 51626 - 51671 5.6 + Prom 51716 - 51775 7.9 48 26 Op 1 . + CDS 51808 - 53307 1451 ## COG1620 L-lactate permease 49 26 Op 2 . + CDS 53348 - 53866 351 ## COG3153 Predicted acetyltransferase + Term 53878 - 53918 4.9 - Term 53862 - 53907 4.6 50 27 Op 1 5/0.000 - CDS 53927 - 54547 532 ## COG1186 Protein chain release factor B 51 27 Op 2 . - CDS 54552 - 55949 1339 ## COG1690 Uncharacterized conserved protein 52 28 Tu 1 . + CDS 56442 - 56936 621 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Term 56970 - 57017 6.2 + Prom 56968 - 57027 4.1 53 29 Op 1 . + CDS 57177 - 58394 1073 ## COG0535 Predicted Fe-S oxidoreductases 54 29 Op 2 . + CDS 58415 - 59674 809 ## COG1819 Glycosyl transferases, related to UDP-glucuronosyltransferase 55 29 Op 3 . + CDS 59688 - 59972 247 ## BF3050 hypothetical protein 56 30 Op 1 9/0.000 - CDS 59977 - 60696 567 ## COG3279 Response regulator of the LytR/AlgR family 57 30 Op 2 1/0.077 - CDS 60693 - 61799 597 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 58 30 Op 3 . - CDS 61796 - 62884 723 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 59 30 Op 4 . - CDS 62933 - 65410 2171 ## BT_1460 hypothetical protein - Prom 65526 - 65585 6.1 + Prom 66068 - 66127 9.5 60 31 Tu 1 . + CDS 66285 - 66728 362 ## BF1389 hypothetical protein 61 32 Tu 1 . - CDS 67152 - 67742 526 ## COG0110 Acetyltransferase (isoleucine patch superfamily) - Prom 67832 - 67891 3.8 + Prom 67584 - 67643 4.3 62 33 Tu 1 . + CDS 67883 - 68341 257 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Term 68007 - 68049 -0.7 63 34 Tu 1 . 
- CDS 68251 - 68427 81 ## - Prom 68465 - 68524 6.8 - Term 68641 - 68682 8.1 64 35 Op 1 36/0.000 - CDS 68695 - 71115 1969 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 65 35 Op 2 24/0.000 - CDS 71112 - 71828 363 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 66 35 Op 3 . - CDS 71844 - 73091 1353 ## COG0845 Membrane-fusion protein 67 35 Op 4 . - CDS 73152 - 74456 901 ## BT_1468 putative outer membrane efflux protein - Prom 74645 - 74704 4.8 68 36 Op 1 8/0.000 + CDS 74786 - 76147 1052 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 69 36 Op 2 . + CDS 76153 - 77403 1093 ## COG5000 Signal transduction histidine kinase involved in nitrogen fixation and metabolism regulation + Prom 77408 - 77467 2.1 70 37 Op 1 . + CDS 77491 - 77838 469 ## BT_1471 hypothetical protein 71 37 Op 2 . + CDS 77843 - 78517 451 ## COG0628 Predicted permease 72 37 Op 3 . + CDS 78574 - 79746 1012 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Term 79809 - 79849 4.1 73 38 Op 1 1/0.077 - CDS 79758 - 80210 188 ## PROTEIN SUPPORTED gi|116624156|ref|YP_826312.1| SSU ribosomal protein S18P alanine acetyltransferase 74 38 Op 2 . - CDS 80213 - 81679 2566 ## PROTEIN SUPPORTED gi|29346884|ref|NP_810387.1| ribosomal protein S6 modification protein-related protein - Prom 81740 - 81799 4.8 - Term 81710 - 81760 1.9 75 39 Tu 1 . - CDS 81810 - 83054 1062 ## COG4591 ABC-type transport system, involved in lipoprotein release, permease component - Prom 83173 - 83232 6.3 76 40 Tu 1 . - CDS 83247 - 84446 1166 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 84528 - 84587 4.3 + Prom 84492 - 84551 6.2 77 41 Tu 1 . + CDS 84627 - 85862 948 ## COG2407 L-fucose isomerase and related proteins + Term 85895 - 85939 13.0 - Term 85883 - 85927 8.4 78 42 Op 1 . 
- CDS 85952 - 86794 795 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family 79 42 Op 2 . - CDS 86866 - 88734 2194 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains 80 42 Op 3 . - CDS 88763 - 89569 742 ## BT_1480 hypothetical protein 81 42 Op 4 . - CDS 89621 - 90814 1013 ## BT_1481 hypothetical protein - Prom 91003 - 91062 9.5 + Prom 90967 - 91026 5.6 82 43 Op 1 40/0.000 + CDS 91048 - 91737 821 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 83 43 Op 2 . + CDS 91747 - 93528 1649 ## COG0642 Signal transduction histidine kinase 84 44 Tu 1 . - CDS 93560 - 95131 1366 ## BT_1484 hypothetical protein - Prom 95244 - 95303 5.0 + Prom 95252 - 95311 5.5 85 45 Tu 1 . + CDS 95363 - 96298 826 ## BT_1485 hypothetical protein Predicted protein(s) >gi|226332146|gb|ACIC01000174.1| GENE 1 3 - 690 327 229 aa, chain - ## HITS:1 COG:slr1189 KEGG:ns NR:ns ## COG: slr1189 COG2040 # Protein_GI_number: 16332297 # Func_class: E Amino acid transport and metabolism # Function: Homocysteine/selenocysteine methylase (S-methylmethionine-dependent) # Organism: Synechocystis # 14 222 51 265 351 102 33.0 4e-22 MNFKDCIDSYPFILMEGALGERLKREFDLDISGTVAMADLVYQQKGRLALETLWNEYIDI AYRYQLPFLATTPTRRANKERIYMAGYNESIIADNVDFLRSIKEAANVDMYIGGLMGCKG DAYTGEEALNLEEAIDFHSWQAGLFKSAKVDFLYAGIMPVLTEAIGMAVAMSDTDIPYII SFTIQRDGKLIDGHTIDDAIHCIDNHVSNKPVCYMTNCVHPDIVYEALS >gi|226332146|gb|ACIC01000174.1| GENE 2 733 - 1554 376 273 aa, chain - ## HITS:1 COG:MTH1101 KEGG:ns NR:ns ## COG: MTH1101 COG1237 # Protein_GI_number: 15679112 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily II # Organism: Methanothermobacter thermautotrophicus # 4 270 2 260 260 144 37.0 1e-34 MNYKITTLVENCVYGRKLQAEHGLSLYIETPENRLLFDTGASDLFIRNARLLNIDLQKVD YLVLSHGHSDHTGGLRHFLELNAEATVVCKREVFFPKFKDERENGMKYTRSLDLSRFRFI DELTELVPGVFLFPSVDIIDRKDTHFERFWIQKEDGSRVPDTFPDELAMVLVEPEGLSIL 
SACSHRGITNILRTVRTAFPELPCNLLLGGFHVHNAEESKYQVIADYLHEYLPQKIGVCH CTGVDKYAFFYKDFGDRIFYNHTGKLIQTDSLL >gi|226332146|gb|ACIC01000174.1| GENE 3 1768 - 2313 332 181 aa, chain + ## HITS:1 COG:NMB1528 KEGG:ns NR:ns ## COG: NMB1528 COG0350 # Protein_GI_number: 15677380 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase # Organism: Neisseria meningitidis MC58 # 19 177 106 263 269 147 48.0 1e-35 MEKEAKGKVNTVQIQYYQSPCGKLILGSFENKLCMCDWVVSEERRATIDKRIQKALNAQY AIENSEVITQTISQLDEYFSRQRTTFDIPLLLVGTEFQKSVWNELLNIPYGTTISYAQLS QRLDNPKAIRAVASSNGANSISILIPCHRVIGSDHKLTGYAGGLVAKKTLLELESNERRL L >gi|226332146|gb|ACIC01000174.1| GENE 4 2509 - 3048 543 179 aa, chain - ## HITS:1 COG:MA0418 KEGG:ns NR:ns ## COG: MA0418 COG0655 # Protein_GI_number: 20089311 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 1 179 1 179 179 243 63.0 2e-64 MAKKVLIISSSPRKGGNSDMLCDEFMKGVLETGNEVEKIFLKEKAIHPCTGCSVCSMYGK PCPQKDDAAGIVEKMIAADVIVMATPVYFYTMCGQMKIMIDRCCARYTEITNKEFYFIIA AAENDKGMMERTIDGFRGFLDCLENPQEKGVIYGIGAWKVGEIKDTPYMQEAYEMGKMV >gi|226332146|gb|ACIC01000174.1| GENE 5 3315 - 4004 455 229 aa, chain + ## HITS:1 COG:CAC0884 KEGG:ns NR:ns ## COG: CAC0884 COG0664 # Protein_GI_number: 15894171 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 19 229 14 225 229 82 25.0 7e-16 MKKIQLTNAHKEKLFQIPLFRDLPLNIKESLLDKLDFVIYAADKKEIVVTQGTPCNKLYV LLEGKLRTDIIDGLGNEVMIEYIIAPRTFATPHLFNSNNTLPATFTALEDSVVLMATKDS TFKVISQDPQVLHNFLCIAGNCNICTVSRLKPLSRKTVRERFIVYLFEHKKKDSLTVEIM HTQSQLAEYLNVSRPALSKEINKIIKEGLITMEGKKIEILNKMALEKYL >gi|226332146|gb|ACIC01000174.1| GENE 6 4094 - 5380 1207 428 aa, chain - ## HITS:1 COG:no KEGG:BT_1414 NR:ns ## KEGG: BT_1414 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 1 428 1 428 428 828 99.0 0 MKTFYWSFVLMLLPSMAYTQTTEKENEFSMSMKIRPRAEYRNGALVPRNEGQKAASFINN RARLSLDYKRSDLEIKMSAQHVGVWGQDPQIDKNGRFILNEAWAKLDFGKGFFAQLGRQT LIYDDERILGGLDWNVAGRYHDALKLGYANKNNEIHAILAFNQNDEKTAGGTYYNSSIGQ PYKNMQTVWYHYKADKIPFGASLLFMNLGLETGNQLTQDSHTRYLQTMGTYLTYKNSGWN LDGAFYYQTGKNKDAESVSAFMASATAAYAFNKTWGMVVSFDYLSGNEEGSSKFKAFDPL YGTHHKFYGSMDYFYASAFNKGFAPGLIDGRLGARFRASAKVDMELNYHYFATATEVDFK EDLKKSLGSEVDYQINWSVMKDVKLSAGYSFMLGTKTMDAVKGGNHKSWQDWGWVSVNIN PKVFFTKW >gi|226332146|gb|ACIC01000174.1| GENE 7 5388 - 6179 619 263 aa, chain - ## HITS:1 COG:all0936 KEGG:ns NR:ns ## COG: all0936 COG0755 # Protein_GI_number: 17228431 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome c biogenesis, permease component # Organism: Nostoc sp. PCC 7120 # 166 260 253 347 351 85 42.0 1e-16 MSWEHFIWFAIAALICWSLGAFAAWKGKKPVWTYGFTLLGLVIFFSFILGMWISLERPPM RTMGETRLWYSFFLPLAGLITYVRWKYKWILSFSCILSLVFICINIFKPEIHNKTLMPAL QSPWFAPHVIVYMFAYAMLGAAVVMAVYLLWFKKKDIERKEMDLCDNLTYVGLAFMTLGM LTGAIWAKEAWGHYWAWDPKETWAAATWFSYLVYIHFRLGRPLKSRPALVILLVSFVLLQ MCWYGINYLPSAQGVSVHTYNLN >gi|226332146|gb|ACIC01000174.1| GENE 8 6176 - 7411 1065 411 aa, chain - ## HITS:1 COG:no KEGG:BT_1416 NR:ns ## KEGG: BT_1416 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 411 1 411 411 792 97.0 0 MWSKPWSYKEGLVIGAGLLVIGILLQMAVGAINWSLFACPVNVIVLVVYIIALVAMHLLR KRVYLFGWLSHYSAAVSSLVWVVGMTVIMGLIRQAPSGHASNDILGFSQMISSWTFVLLY FWMVTALGLTILRASFPFRIGRLSFLLNHVGLFVALITATLGNADMQRLKMTTRMGNAEW RATDDKGKLIELPLAIELKDFTIDEYPPKLMLIDNETGGVLPEKSPVHLLLEDGVSEGSL LDWDLFVEQSIPMAASVATEDTLKFTDFHSMGATYAAYLKAVNRKNQQAKEGWVSCGSFL FPYKALRLDSLTSLVMPEREPQRFASEVKVYTQEGTITESTIEVNRPMEIAGWKIYQLSY DESKGRWSDISVFELVRDPWLPVVYAGIIMMMLGAICLFVNAQKRKEEDKE >gi|226332146|gb|ACIC01000174.1| GENE 9 7423 - 8904 1374 493 aa, chain - ## HITS:1 COG:PM0023 KEGG:ns NR:ns ## COG: PM0023 COG3303 # Protein_GI_number: 15601888 # Func_class: P 
Inorganic ion transport and metabolism # Function: Formate-dependent nitrite reductase, periplasmic cytochrome c552 subunit # Organism: Pasteurella multocida # 52 490 68 506 510 447 46.0 1e-125 MEKKLKSWQGWLLFGGSMVVVFVLGLCVSALMERRAEVASIFSNRKTVIKGIEARNELFK DNFPREYQTWTETAKTDFESEFNGNVAVDALEKRPEMVILWAGYAFSKDYSTPRGHMHAI EDITASLRTGSPAGPHDGPQPSTCWTCKSPDVPRMMEALGVDSFYNNKWAAFGDEIVNPI GCSDCHDPETMNLHISRPALIEAFQRQGKDITKATPQEMRSLVCAQCHVEYYFKGDGKYL TFPWDKGFTVEDMEAYYDEAGFYDYIHKLSRTPILKAQHPDYEIAQMGIHGQRGVSCADC HMPYKSEGGVKFSDHHIQSPLAMIDRTCQTCHRESEETLRKNVYERQRKANEIRNRLEQE LAKAHIEAKFAWDKGATEDQMKDVLALIRQAQWRWDFGVASHGGSFHAPQEIQRILSHGL DRALQARLAVSKVLAKHGYTEDVPMPDISTKAKAQKYIGLDMDAERAAKEKFLKTTVPAW LEKAKANGRLAQK >gi|226332146|gb|ACIC01000174.1| GENE 10 8948 - 9544 363 198 aa, chain - ## HITS:1 COG:Cj1358c KEGG:ns NR:ns ## COG: Cj1358c COG3005 # Protein_GI_number: 15792681 # Func_class: C Energy production and conversion # Function: Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c subunit # Organism: Campylobacter jejuni # 31 165 30 167 171 83 33.0 2e-16 MMKLPFINRIFPSYRSRIVAVIIGGIIVGGGALFMYMLRAHTYLGDDPAACVNCHIMSPY YATWFHSSHARDATCNDCHVPHENIVKKWTFKGMDGMKHVAAFLAKSEPQVIQAHEASSQ VIMNNCIRCHTQLNTEFVKTGKIDYMMSMVGEGKACWDCHRDVPHGGKNSLSATPAAIVP LPESPVPEWLRKMVNNKE >gi|226332146|gb|ACIC01000174.1| GENE 11 9702 - 10196 419 164 aa, chain - ## HITS:1 COG:no KEGG:BT_1419 NR:ns ## KEGG: BT_1419 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 164 1 164 164 283 98.0 1e-75 MKTKLYLPIISLLAMSVFAFISCDDSDSDTTKPVIELSEPEEGQELKIGDEHGVHFEMDL SDDVMLKSYMIEIHSNFDHHSHGKSRAAATDEATVDFSFNRSYDISGKKTAHIHHHDIII PANATPGDYHLMVYCTDAAGNETYIARNIVLSTTAEEDDHHHDE >gi|226332146|gb|ACIC01000174.1| GENE 12 10196 - 12352 1699 718 aa, chain - ## HITS:1 COG:PA0781 KEGG:ns NR:ns ## COG: PA0781 COG1629 # Protein_GI_number: 15595978 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Pseudomonas 
aeruginosa # 31 714 39 687 687 119 24.0 2e-26 MRIGIISGVIGLFVTLSVHAQKSDSIKSMLLPDVVVTESYQQRQAKKSALTVEVIEQDFL RKHFTGNFMQAMENIPGVQAMDIGSGFSKPMIRGMGFNRIAVLENGIKQEGQQWGADHGL ELDAFNIEAVNVLKGPSSLLYGSDAMGGVIDIASPPIPSANMLFGDVTLLGKSVNGTLAG SLMLGLKKNAWYAQIRYSEQHFGDYRIPTDTIVYLTQKMPVYGRKLKNTAGVERNIGLFT QYQRKGYKADYSISNVYQKTGFFPGAHGVPDASRVEDDGDSRNIELPYSKVNHLKVTTHQ QYAWERLILSGDLGFQNNHREEWSAFHTHYGFQPAPEKDPDKELAFDLNTYSASVKARFI GSSSWEHTLGWDGQHQQNDISGYSFLLPEYHRSTTGLLWLTTYKPNNILSVSGGVRYDYG YIGISPHEDVYLADYLRRQGYDDEQVESYKWNSHAVRKHFGDYSFSLGLVWTPSVQHMVK VNIGRSFRLPGANELAANGVHHGTFRHEQGDASLKSEQGWQMDASYNLRYHGLSVSVSPF VSWFSNYIFLRPTGEWSVLPHAGQIYRYTGAEALFAGTEATIDINFLRHFNYRISGEYVY TYNCDEHIPLSFSPPFGMRNTLTWQRKHCMLYAEWQLIARQNRVDRNEDRTPGVNLFHLG GSLNIPVGRTNEIEITLTARNIFNTRYYNHLSFYRKVEIPEPGRNFQLLIKIPFKKLL >gi|226332146|gb|ACIC01000174.1| GENE 13 12439 - 12753 71 104 aa, chain - ## HITS:1 COG:no KEGG:BT_1421 NR:ns ## KEGG: BT_1421 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 104 1 104 104 183 97.0 2e-45 MYHKRKIRAYIAWMLFMTFIPFFVVKTFHYHGSEDETSCSHAEHSRSHAEDCAICKYSLS LFTEPQSVEFHCILTLVPYEPITYQDKVVCKRTYSHHLRAPPAA >gi|226332146|gb|ACIC01000174.1| GENE 14 13205 - 14488 922 427 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253572290|ref|ZP_04849693.1| ## NR: gi|253572290|ref|ZP_04849693.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 427 1 427 427 825 100.0 0 MKKEQTTDKNSWNFHLTRSIADLYDVTLEIHTEFWLSTLQIWFRGYQTPEGYKATIWGKK VDLHIAIAPLGTPSETLPVIKENTTRSKNAQLPSEQQIYVNELQKKIKSLKKHLPPKVDE VLEQRCLDEMNADRIKTIIRECDTIWGDKGLSVEEKINRLVPYKIEIYNLVSMLQLPDEL VRADTNISILMATILYYAQSVEKNARKYKIRIPKLVRQLVKLVDGIITRMNETQNKLNGV ERDMTKEEYKTYDAYLDIKIGAKSAFCSFEKQLELYEQLWEMPSLSTDTKIECLNEAVKL VKKQYGKKTESRCPHAPLVRKHLRAISGYLNELEKEGEATWQLRMADELLPTANAWREDC DFPALSKEGFASQIELQSVHIKTKEKEEGSIHYELELFFQDTEDTFAGHFLYATIEDGKV EEITLMG >gi|226332146|gb|ACIC01000174.1| GENE 15 14499 - 14945 322 148 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253572291|ref|ZP_04849694.1| ## NR: gi|253572291|ref|ZP_04849694.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 148 1 148 148 271 100.0 1e-71 MSFIFRKFKMVLLLLLVAAYSYGQARDAKNFIITPQTVVKTNLPKLDSCLFKLQEILRER FGQSAIIGGRRASGNSVIELWTDFELEGKEHYILNISAKKLSIRGATQKAIQYGLKTLDK ILQEETNNTANKQIAPRRIENVSDSNCP >gi|226332146|gb|ACIC01000174.1| GENE 16 14954 - 16300 844 448 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253572292|ref|ZP_04849695.1| ## NR: gi|253572292|ref|ZP_04849695.1| predicted protein [Bacteroides sp. 1_1_6] # 1 448 1 448 448 884 100.0 0 MSMHAIESLVEYSVITVATALPVPPLAQSICHSLYHLQNQLDCGYTVLRVRDELEKVGYL SLLSPEQLPEPERSEAMELAAEGGFLKGGGIYVDRRSGKCCVTAGCVLWKKLLDMSVIPA SPEAELRLLDPLELAEQIVSLASKALAGGDKRGADTLGHWYVFFPLFCAIEGWDDANAPE PERIQALLRLLDVPEAFEVAASYGNELDVDYEEEEMPFLVGWEQPYRKWLKERKNDEGIQ EGELDSFHRNVMYQYIQRHNFEEADRYASLIADENSRLLQRCVVGYACHQWLKTQEPGTL PPSCLLSLFEVKEGFERLSGLPLPEQELATCRVYLLQTVVLLGDYPAVIEMQQALFTEAI GKLEQYPEGETRQMQQIALALSYYQMLYVNLPDEYPSKKELMRKRFPGLMELSDVKRICG ELLPEKPQMADTLQENMEQCNALMQYLN >gi|226332146|gb|ACIC01000174.1| GENE 17 16355 - 16564 139 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253572293|ref|ZP_04849696.1| ## NR: gi|253572293|ref|ZP_04849696.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 69 1 69 69 127 100.0 2e-28 MEELFRLVRDENDIDKALSMICSGYILEPSYKSEYNCSLLFHAVYRKSLVLVKTLVEQGA IKQYLFELP >gi|226332146|gb|ACIC01000174.1| GENE 18 16602 - 18314 1449 570 aa, chain - ## HITS:1 COG:no KEGG:BT_4445 NR:ns ## KEGG: BT_4445 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 570 1 570 571 1020 88.0 0 MAKITIPYAVADFIEMRERGFYYVDKTDYISKLEEYKAPVFFRPRRFGKSLLVSTLACYY DRTKAHRFEELFGGTWIGSHPTKEHNQYMIIRYDFSKMVMANSIEGLAQNFNDLNCGPVD LMVEHNRDLFGDFQFTTRGDASKMLEEVLNYARSHGFPKVYILIDEYDNFTNQLLTAYND PLYEEVTTNDSFLRTFFKVIKAGIGEGSIRTCFCTGVLPVTMDDLTSGYNIAEILTLEPG FMNMLGFTYEETEVYLRYVLDKYSTGQDRYNEIWQLIVSNYDGYRFRPNGDRLFNATILT YFFKKFAANAGSIPDELVDENLRTDINWIRRLTLSLDNAKAMLDALVIDDELSYNVADLS SKFNKKKFFDKEFYPISLFYLGMTTLKNSYRMVLPNLTMRSVYMDYYNQLNRIEGNAQRY VPTYERYDADRRLEPLVQNYFEQYLGQFPAQVFDKMNENFIRCSFYELVSRYLSSCYTFA IEQNNSVGRSDFEMTGIPGTDYYTDDRVVEFKYYRAKEAEKMLTLTEPLAEHIVQVKKYG EDTKRKFPNYHVRTYVVYICANKGWKCWEV >gi|226332146|gb|ACIC01000174.1| GENE 19 18392 - 19009 455 205 aa, chain - ## HITS:1 COG:no KEGG:BT_1424 NR:ns ## KEGG: BT_1424 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 205 1 205 205 371 94.0 1e-102 MSTYYDLYETPDVQNTGEKQPLHARIVPSGTYSKKDFIERVSRHQPFPHNIIEGVLGAVV DELAEALADGYIVELGELGHFSISLKCTHKVMTKKEIRAESIRFDNVHLRTSKEFKKKIK REIELERVENSKVKTHDTKISIEDRLQMLQEFLKKNGGITRIEYSRLTGLPRLKAIDDLN AFIKEGTLRKRGAGRTVFYVWKQEE >gi|226332146|gb|ACIC01000174.1| GENE 20 19367 - 20065 646 232 aa, chain - ## HITS:1 COG:no KEGG:BT_1425 NR:ns ## KEGG: BT_1425 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 232 1 232 232 415 99.0 1e-115 MRKLFLLAIVLCVWSVVADAQESEGRYIEVTGSSEIEVVPDEIHLLIQIKEYWKEEFVPK SKPEDYKTKVPLEWIEKDLRRVLSRAGISDEAIRVQEIGDYWRQKGKEFLIGKQLDIRFT DFEKINAVIKKIDTKGIESMRIGELKHKDLPNYRKQGKIEALKAAREKASYLVEAMGQKL GEVIRIVEPASSYVSPYSLYQAQSNVSMGTAATEQYRVIKLRYEMTARFAIE >gi|226332146|gb|ACIC01000174.1| GENE 21 20138 - 20668 316 176 aa, chain - 
## HITS:1 COG:CAC0055 KEGG:ns NR:ns ## COG: CAC0055 COG4332 # Protein_GI_number: 15893352 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 155 37 183 196 81 32.0 8e-16 MNAQKKNIDVWLIYRCVKCDNTCNITLLSRTKPDLIDKVLFHSFSMNDRKAAWKYAFSAE LAGRNHLKTDYDSVEYEVMDNFSKEDIIRMSDAIIKIQIKCEFEFNLKLSSLLRRNFLLS STQLRRLFEQGVISLLSGKEPQKYKVKDGDILLIDKEHLLVMMDFVDSFMVKTGID >gi|226332146|gb|ACIC01000174.1| GENE 22 20839 - 21678 354 279 aa, chain - ## HITS:1 COG:no KEGG:BT_1427 NR:ns ## KEGG: BT_1427 # Name: not_defined # Def: tetracycline resistance element mobilization regulatory protein RteC # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 279 1 279 279 509 97.0 1e-143 MENFVRNSKLNIEKEIKRIEQQPIAPLERIKQIIDVIQTSLTLLKRAVAEYQFQNPEEEI QFFKVWKPQISGLLMFYIRLYQIEKNRVGKSLSAQCKYLKMELENIQKSFLNNSFYDYYR AGQTELDNHYFIRENYDILSDVHCHLLDRDPSFTTLHDSSVAMILANSHLIEYVSDEIDS LSDKLHLKFTSIVDSKLLQWTDSKVALVEFIYAIYAGKCFNNGNTSLKDIAFCCEVLFNI EIGDFYRIFLEIRNRKKNRTQFLDKLKDKIQKMMDELDR >gi|226332146|gb|ACIC01000174.1| GENE 23 22074 - 22349 392 91 aa, chain + ## HITS:1 COG:no KEGG:BT_1428 NR:ns ## KEGG: BT_1428 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 91 1 91 91 155 100.0 4e-37 MIRLNVFVRVNETNREKAIEAAKELTACSLKEEGCIAYDTFESSTRRDVFMICETWQNAE VLAAHEKTAHFAQYVGIIQELAEMKLEKFEF >gi|226332146|gb|ACIC01000174.1| GENE 24 22492 - 22893 375 133 aa, chain - ## HITS:1 COG:CAC3491 KEGG:ns NR:ns ## COG: CAC3491 COG3871 # Protein_GI_number: 15896728 # Func_class: R General function prediction only # Function: Uncharacterized stress protein (general stress protein 26) # Organism: Clostridium acetobutylicum # 4 133 9 140 145 61 28.0 3e-10 MKEKAAELLQKCEVVTLASVNKEGYPRPVPMSKIAAEGISTIWMSTGAASLKTIDFLSNP KAGLCFQEKGDSVALMGEVEVVTDEKLKQELWQDWFIEHFPGGPTDPGYVLLKFTANHAT YWIEGTFIHKKLD >gi|226332146|gb|ACIC01000174.1| GENE 25 22999 - 23817 452 272 aa, chain - ## HITS:1 COG:mlr1196 KEGG:ns NR:ns ## COG: mlr1196 
COG2207 # Protein_GI_number: 13471273 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Mesorhizobium loti # 88 256 107 273 276 77 28.0 3e-14 MYTEYQPSRLLAPYIDNYWEFKGTPEYGMRIHILPDGCTDFIFTLGEAVSKVKEETLVMQ PYRSYFVGPMTKYSELVTYAEFVHQFGVRFLPCGLSGFTKLPLHEFANYRVSTNEMQAVF DSTFIERLCEQNDVRGRIQVVEEYLLAYLAHNYQPVDTQVAAAVNIINQSAGKRSVRSLM DEVCLCQRHFERKFKYNTGFTPKEYSRIVKFKNAVELLRNTTFVNLLTTAIDAGYYDLAH FSKEIKTLSGNTPSSFLSLTVPEDITLTYIEP >gi|226332146|gb|ACIC01000174.1| GENE 26 23917 - 24459 409 180 aa, chain + ## HITS:1 COG:CAC0738 KEGG:ns NR:ns ## COG: CAC0738 COG0847 # Protein_GI_number: 15894025 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases # Organism: Clostridium acetobutylicum # 4 168 3 165 306 128 41.0 5e-30 MKDFAAIDFETANGKRTSVCSVGVVIVRNGKIVKKIYRLIRPCPNYYTQWTTAVHGLTYA DTEEAEDFPDVWAEIKPLIDGLPLVAHNSPFDEGCLKAVHELYEMTYPNYKFYCTCRTSR KVFGKELPNHQLHTVAARCGYDLTNHHHALADAEACAQIALLIIPEPKKPKVDLHTGSLF >gi|226332146|gb|ACIC01000174.1| GENE 27 24564 - 25733 1216 389 aa, chain - ## HITS:1 COG:STM3135 KEGG:ns NR:ns ## COG: STM3135 COG1312 # Protein_GI_number: 16766435 # Func_class: G Carbohydrate transport and metabolism # Function: D-mannonate dehydratase # Organism: Salmonella typhimurium LT2 # 5 389 2 392 394 533 62.0 1e-151 MYLCEQTWRWYGPNDPVSLWDIKQAGATGIVNALHHIPNGEVWTVEEIMKRKQMIEEVGL TWSVVESVPVHEHIKTQTGDFLKYIENYKESIRNLAKCGVMVVTYNFMPVLDWTRTDLAY TMPDGSKALRFERAAFLAFDLFILKRPNAEKDYTPEEIAKAKARFEQMSEDDKKLLVRNM IAGLPGSEESFTVEQFQEALDRYNDIDAEKLRANLIFFLKEIAPVADEVGVKLVIHPDDP PYTILGLPRILSTEEDFKKLIEAVPNESNGLCLCTGSFGVRADNDLAGMMERFGDRVNFV HLRSTQRDEEGNFYEANHLEGNVDMYNVMKSLILLQQRRKCSIAMRPDHGHQMIDDLKKK TNPGYSCLGRLRGLAELRGLEMGIAKSIL >gi|226332146|gb|ACIC01000174.1| GENE 28 25768 - 26580 1005 270 aa, chain - ## HITS:1 COG:BH1067 KEGG:ns NR:ns ## COG: BH1067 COG1028 # Protein_GI_number: 15613630 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, 
transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Bacillus halodurans # 6 269 7 279 281 223 43.0 2e-58 MNELFNVKDKVVVITGGAGILGKGIAAYLAKEGAKVVVLDRSEEAGKALVESIKAEGNEA MFLYTDVMDKEVLEGNKVEIMKAYGRIDVLLNAAGGNMAGATIAPDKTFFDLQIDAFKKV VDLNLFGTVLPTMVFAEIMVEQKKGSIVNFCSESALRPLTRVVGYGAAKAAIANFTKYMA GELALKFGNGLRVNAIAPGFFLTDQNRALLTNPDGSLTDRSKTILAHTPFNRFGEPEDLY GTIHYLISDASNFVTGTVAVIDGGFDAFSI >gi|226332146|gb|ACIC01000174.1| GENE 29 26601 - 27671 838 356 aa, chain - ## HITS:1 COG:SP1999 KEGG:ns NR:ns ## COG: SP1999 COG1609 # Protein_GI_number: 15901822 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 8 228 7 213 336 89 32.0 8e-18 MDKEFSNIRIVDIAKMAGVSVGTVDRVIHNRGRVSEENRKKVQTILEMVHYQPNLMARSL ASKKQYHFVAITPSFTHGEYWEAISEGIDKAASEMESYNITITKLFFDQYNNKTFDDIVR NLLDEKVDGVLIATLFTDSVIRLSQELDRNEIPYVYVDSNIEGQHQLAYFGTESYDAGVI AARLLTDKISSTSDILMARIIHSGKNDSNQGKNRREGFCHYLKETGFTGNLHEVELKIND SVYNFIKLDEIFEANTTIEGGIIFNSTCYILGNYLKARGMQKVKLVGYDLIERNTQLLSD GVITALVAQRPERQGYDGIKSLCNHLLFKQNPEKVNLMPIDILLKENLKYYLNNKL >gi|226332146|gb|ACIC01000174.1| GENE 30 28097 - 28384 356 95 aa, chain - ## HITS:1 COG:no KEGG:BT_1435 NR:ns ## KEGG: BT_1435 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 95 1 95 95 142 100.0 4e-33 MECRHGNFWIGLGIGSILGAVAYRLSRTAKARQLESDIYNAIHRIGRDAEIAAAEAERKA MNLGLKAVETGAKVADKVAEEADKAVGKAKEKWEK >gi|226332146|gb|ACIC01000174.1| GENE 31 28429 - 28674 303 81 aa, chain - ## HITS:1 COG:no KEGG:BF4188 NR:ns ## KEGG: BF4188 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 81 1 81 81 67 80.0 1e-10 MGFIWYIIIGIIAGFLAGKIMRGGGFGLIINLLLGILGGVLGGWVFALFGLAASGLIGSL ITSTVGAILVLWIASLFSKSK >gi|226332146|gb|ACIC01000174.1| GENE 32 28767 - 29900 597 377 aa, chain - ## HITS:1 COG:STM0525 KEGG:ns NR:ns ## COG: STM0525 COG1929 # 
Protein_GI_number: 16763905 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerate kinase # Organism: Salmonella typhimurium LT2 # 3 377 2 375 382 291 44.0 2e-78 MKKVVVAMDSFKGCLSSSEAEKAAEEGIKLVCPDCEVIRFPIADGGEGILTVLIEATRGT YQKIIANDPLMRPIETTYGISGDKHIAFIEMAAVNGLPLLSETERNPMLTTTYGTGELIL HAIEQGYREFVIGIGGSATNDAGVGLLQALGARFLDKDGAVLGKGGEILHRIAAIDFSSV HPALKDTRFTIACDVRNPFCGPEGAAHVFARQKGADDTMIEKLDVGMQSFSRLIHSTTGR EITHVPGAGAAGGLGGAFLAFLNAELKPGIDLLLQTLKFSEKIKGADLIITGEGRTDRQS LMGKVPSGILEEAKRQGIPVIVVAGSIEDTEILNQAGFQGVFSIIPSPMSLETAMKPEVA KKNICRTVAQIISLVNL >gi|226332146|gb|ACIC01000174.1| GENE 33 29998 - 31206 1105 402 aa, chain + ## HITS:1 COG:ECs0532 KEGG:ns NR:ns ## COG: ECs0532 COG0477 # Protein_GI_number: 15829786 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 10 394 20 401 406 376 55.0 1e-104 MVQNQEQPVGRITFSILIALSLSHCLNDLLQSVISAAYPLFKEDLGLNFAQIGLITLVYQ LSASVFQPITGIIFDKHPVAWSLPIGMSFTMIGMINLAFSNNLYWMLISVFLIGIGSSVL HPEASRITFLASGGKRGLAQSLFQVGGNLGGSLGPLLVALLVAPYGREHLAVFTLFALAA IGVMYPICKWYKSYLNRMKEQKATVKKAVHLPLPMDKTALSIGILLILIFSKYIYMASLT SYYTFYLIHKFNVTVQESQLYLFIFLVATAIGTLLGGPIGDRVGRKYVIWASILGAAPFS LLMPHATLAWTIILSFCVGLMLSSAFPAILLYAQELLPTKLGLISGLFFGFAFGVAGIAS AVLGNMADKFGIESVYNVCAYMPLLGLVTFFLPNLKKQKIQV >gi|226332146|gb|ACIC01000174.1| GENE 34 31289 - 32770 1323 493 aa, chain - ## HITS:1 COG:no KEGG:BT_1439 NR:ns ## KEGG: BT_1439 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 493 1 493 493 973 98.0 0 MKKIKIQSILTAVAGLFLATSCSSSFLDTEPTDAVSSDQVAVAGNAERLFNGAWYNLFEY GTTYANIGYRALQCQDDMMASDVVSRPKYGFNSSYQFNDVAIPSDGRTSFAWYLIYKTID NCNTAISIKGDSEELRQAQGQALALRAFCYLHLVQHYQFTYLKDKDAPCVPIYTEPTTSS TEPKGKSTVAQVFQQIFDDLNLAQDYLTNYVRKGDGQKFKPNTDVVNGLLARAYLLTGQW 
GEAAKAAEAARKGYSLMTTTAEYEGFNNISNKEWIWGSPQTLSQSDASYNFYYLDATYVG AYSSFMADPHLMDTFAKDDIRLPLFQWMREGYLGYKKFHMRSDDTADLVLMRSAEMYLIE AEAKARDGVALDQAVAPLNTLRTARGAGNYDVTGKTKEQVIDEILMERRRELWGEGFGIT DILRNQKAVERMALSEDMQKTEVDCWQEGGSFAKRNPLGHWFLNFPDGKAFSANSSYYLY AIPEKEINANPNL >gi|226332146|gb|ACIC01000174.1| GENE 35 32784 - 35912 3007 1042 aa, chain - ## HITS:1 COG:no KEGG:BT_1440 NR:ns ## KEGG: BT_1440 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1042 1 1042 1042 1967 99.0 0 MRIYLRLLVAFLLLSAGNVVYAEAQQEKRVTGTVTSEGEPLPGVSVQVKGASSGTITDID GNYSIEAPANGTIVFRFVGLRTVEQAVNNRNVINVTMESESKELEEVMVVAYATAKKYSF TGAASTMKAGEIEKLQTSSVSRVLEGTVSGVQASASSGQPGTDAEIRIRGIGSINASSAP LYVVDGVPFDGSVNSINPDDIASMTVLKDAASAALYGSRGANGVIIITTKQGQQDSKATV KVKATLGGSSRAVRDYDRVNTNQYFELYWEALRNQYAKSSDYTPATAAAQASKDLVTKLM GGGPNPYGTQYPQPVGTDGKLAAGARPLWNSDWSDAMEQQALRTELNLSVSGGGKANQYF FSAGYLNDKGIALESGYQRFNLRSNVTSEMTSWLKGSINLSFAHSMQNYPVSSDSKTSNV ITAGRTMPGFYPIYEMNTDGSYKLDDNGDRIYDFGSYRPSGSMANWNLPATLPLDKSERM KDEFSGRTYLEATIIEGLKFKTSFNFDLINYNTLDYTNPKIGPALENGGGSSRLNSRTFS WTWNNIASYDKTIGDHHFNVLAGMEAYSYRYDELTASRTKMAQPDMPELVVGSQLTGGSG YRIDYALVGYLTQALYDYQNKYFFSASFRRDGSSRFAPETRWGNFWSLGTSWRIDRESFM ASTTNWLSALTLKMSYGAQGNDNLGTYYASKGLYTIVSNLGENALVSDRMATPNLKWETN LNFNVGVDFSLFNNRFSGSFDFFTRRSKDLLYSRPIAPSLGYGSIDENVGALKNTGIEMV LNGTIINQNGWVWKLGMNLTHYKNKVTELPLKDMPQSGVNKLQVGRSVYDFYMKEWAGVD PDNGNPLWYMDEKDANDNLTGKRVTTSDYASASYYYVNKSSLPKVYGGFNTSLSWKGFDL SAIFAYSIGGYIYNRDITMILHNGSLEGRDWSTEILKRWTPENRYTDVPALSTTSNNWNS ASTRFLQNNSYMRLKNLTLSYDLPKQWISKLALSSVQVFVQGDNLFTIHRNQGLDPEQGI TGITYYRYPAMRTISGGINLSF >gi|226332146|gb|ACIC01000174.1| GENE 36 36155 - 37189 874 344 aa, chain + ## HITS:1 COG:no KEGG:BT_1441 NR:ns ## KEGG: BT_1441 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 344 27 370 370 642 99.0 0 MSDSLQTDSVIPAKPSAFKRAIRKFMNFSDFDTLYISPNRYNYALMTTHFSNFEYYSVTS NLPQPQKLSFSPNPHNKIGLYFGWRWIFLGWSVDVDDIYRKTNRRNKGTEFDLSLYSSKL 
GVDIFYRRTGNNYKIHKINGFPEDVPANYSEKFNGIKVDIKGLNLYYIFNNRKFSYPAAF SQSTNQRRNAGSFIAGFSISKHNLEFDYAELPDFILEAMNPAMKVNNIKYTNANISFGYA YNWVFARNCLACLSLTPAIAYKASDVDAETHEGKTWYGNFNLDFLIRAGVVYNNGKYFVG TSFVGKNYNYHRNNFSLDNGFGTLQIYVGFNFNLRKEYRRKNTR >gi|226332146|gb|ACIC01000174.1| GENE 37 37311 - 39725 1931 804 aa, chain + ## HITS:1 COG:NMB0732 KEGG:ns NR:ns ## COG: NMB0732 COG0161 # Protein_GI_number: 15676630 # Func_class: H Coenzyme transport and metabolism # Function: Adenosylmethionine-8-amino-7-oxononanoate aminotransferase # Organism: Neisseria meningitidis MC58 # 387 803 14 430 433 590 64.0 1e-168 MKKQRHIQTTRSLLSRFRYWGRKNYAAFASMGREFQIGHLHTNVVDVALRKQNAKVTIPY HTFMTLQEIKDQVLAGFDISSAQATWLANMADSEALYAAAHEITVTCASHEFDMCSIINA KSGRCPENCKWCAQSSHYKTQAEIYDLLPAEECLRQAKYNESQDVNRFSLVTSGRKPSPK QISQLCDAARLMRKHSSIQLCASLGLLNEEELRALHTAGITRYHCNLETAPSYFPTLCST HTQEQKLATLDAARRVGMDICCGGIIGMGETMEQRIEFAFTLAELNVQSIPINLLSPIPG TPLENEKALSEEEILRTIALFRFINPTAFLRFAGGRSQLTPEAMRKALFVGINSAIVGDL LTTLGSKVSDDKKMILEEGYHFADSQFDREHLWHPYTSTTDPLPVYKVKRADGATITLED GRTLIEGMSSWWCAVHGYNHPVLNQAAKDQLDKMSHVMFGGLTHDPAIELGKLLLPLVPP SMQKIFYADSGSVAVEVALKMAVQYWYAAGKPDKNNFVTIRSGYHGDTWNAMSVCDPVTG MHSLFGSSLPVRYFVPAPSSRFDGEWNPDDIIPLRETIEKHSKELAALILEPIVQGAGGM WFYHPQYLREAEKLCKEHDILLIFDEIATGFGRTGKLFAWEHAGVEPDIMCIGKALTGGY MTLSAVLASNQIADTISNHAPKAFMHGPTFMGNPLACAVACASVRLLLDSGWAENVKRIE AQLKEELAPARKFPQVADVRILGAIGVIQTERSVSMAYMQRRFVEEGIWVRPFGKLVYLM PPFIISPEQLSKLTSGVLKIVREM >gi|226332146|gb|ACIC01000174.1| GENE 38 39845 - 40999 922 384 aa, chain + ## HITS:1 COG:PM1901 KEGG:ns NR:ns ## COG: PM1901 COG0156 # Protein_GI_number: 15603766 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Pasteurella multocida # 8 379 7 379 387 437 56.0 1e-122 MILDSINQELLTLKEKKNYRSLPQLIHDGRDVTVDGRRMLNLSSNDYLGLANEVSLREAF LKTITPETFLPTSSSSRLLTGNFTAYQELEQQLATMFGAESALLFNSGYHANTGILPAVS DTRTLILADKLVHASLIDGIRLSSAKCIRYRHNDLAQLRRLLEENHGMYEKMIIVTESIF 
SMDGDEADLQALVRLKHDYSNLLLYVDEAHAFGARGEKGLGCAEEQNCINDIDFLVGTFG KAAASAGAYIVCRQTIREYLINKMRTFIFTTALPPVNIQWTAWVLKHFVDFRSKREHLLQ ISRKLKEALTEKGYNCPSVSHIVPMVVGASEDTIRKAEELQRKGFYALPVRPPTVPEGTS RIRFSLTADITEKEIDTLIEIING >gi|226332146|gb|ACIC01000174.1| GENE 39 40996 - 41655 425 219 aa, chain + ## HITS:1 COG:NMA2012 KEGG:ns NR:ns ## COG: NMA2012 COG2830 # Protein_GI_number: 15794892 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Neisseria meningitidis Z2491 # 1 206 1 201 215 146 38.0 2e-35 MKQTFIIRNNEKHLLLFFAGWGMDETPFRHIHPAECDWMICYDYRSLEFDTTLIQAYSKI TLIAWSMGVWAASQVMKQYPSLPVSQSTAINGTLYPIHETEGITPSVFEGTLQGLNEQTL LKFQRRMCGSAADYKVFQTMAPKRPVEELKEELAAIRQQYLSSLPSEFVWQTAIIGDNDR IFLPDHQEQAWRNKADSLLHVEAAHYQQELFNEVIMNIK >gi|226332146|gb|ACIC01000174.1| GENE 40 41666 - 42451 566 261 aa, chain + ## HITS:1 COG:PM1903 KEGG:ns NR:ns ## COG: PM1903 COG0500 # Protein_GI_number: 15603768 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Pasteurella multocida # 1 257 4 249 251 175 38.0 8e-44 MNKTIIAERFSKAISTYPREANVQRQIADKMIRLLQKHIPSPCPKVIEFGCGTGIYSRML LRTLRPEELLLNDLCPEMRYCCEDLLREKQVSFLSGDAETIPFPDKSTLITSCSALQWFE SPEEFFKRCNTLLHSQGYFAFSTFGKKNMKEIRELTGKGLPYRSKEELEAALSFHFDILY SEEELIPLSFEDPMKVLYHLKQTGVNGLSAQSSLYPKHEKQTWTRRDLQHFCERYTQEFT QGTSVSLTYHPIYIIAKKKKV >gi|226332146|gb|ACIC01000174.1| GENE 41 42448 - 43095 644 215 aa, chain + ## HITS:1 COG:NMA0943 KEGG:ns NR:ns ## COG: NMA0943 COG0132 # Protein_GI_number: 15793901 # Func_class: H Coenzyme transport and metabolism # Function: Dethiobiotin synthetase # Organism: Neisseria meningitidis Z2491 # 3 208 2 207 215 234 54.0 7e-62 MKQNVYFVSGIDTDAGKSYATGYLAREWNKNGKRTITQKFIQTGNVGHSEDIDLHRRIME IPFTWEDQEGLTMPEIFSYPASPHLASRLDHRPIDFDKIKHATEVLSERYDCVLLEGAGG LMVPLTTELLTIDYIAQEKYPLIFVTSGKLGSINHTLLSLEAIQKRGIVLATVLYNLYPT VEDKTIQDDTMEFIRTWLEKYFPETKFLLVPEIGK >gi|226332146|gb|ACIC01000174.1| GENE 42 
43120 - 45303 1952 727 aa, chain - ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 281 580 1 308 328 178 38.0 4e-44 MLTVFNHNEVGIEVVSNEETNHVGISNKDFEGMHMRQMVPPEAYQNIHANMQKVIATRTV SVAHHDMDFNGSHHYYENRIFPLDEEYVLIMCRDITERVATQQQLEIFKSVLDKVSDSIL AVSSDGTLVYANKQFMEEYGVTQQMGTQKIYDLPVSMRTPELWEAKLKDIRDNGGSFSYR AAYVRKGEEKNRVHQVSTYLTQEDDTELIWFFTQDITDVIKKRDELRELNLLLDGILNNV PVYLYVKDPEDDFRYLYWNKAFAEHSKIPASRAIGRTDFEIFPERADAERFRRDDLELIR THERIDMQETYVAATGETRIVQTLKALVPMEGREPLLIGISWDITNFQNIEQELIKARIK AEQSDRLKTAFLANMSHEIRTPLNAIVGFSNLLPSAETLEEEKLYSSIINQNSEILLQLI NDILDLSKIEAGTLEYVRQPMNLGEVCRNIYQIHKDRVQEGVTLILDNEDDDLIIEEDRN RIAQVITNFLTNAGKFTLSGEIRFGFKVDNQCIRFYVKDTGIGIAPDKVGHIFDRFVKLN SFAQGTGLGLAICRMIIEKIGGEIGATSEVGKGSTFFFTIPYKEHSDDRIAFSEVSETKY ISHTIKRVQKIKKILVAEDVDSNFVLIKNMIGKDYTLLWAKDGFEAVEMYKQFQPDLILM DIKMPRMGGLEATRIIRSYSKEIPVIALTAYAFEADKEQALEAGCNDFVTKPVSKAALEE ALGKCSG >gi|226332146|gb|ACIC01000174.1| GENE 43 45568 - 46092 721 174 aa, chain - ## HITS:1 COG:YGL062w KEGG:ns NR:ns ## COG: YGL062w COG1038 # Protein_GI_number: 6321376 # Func_class: C Energy production and conversion # Function: Pyruvate carboxylase # Organism: Saccharomyces cerevisiae # 80 174 1066 1169 1178 66 41.0 2e-11 MGTSLATYYAKLLDMPDSEYKVEILEDGPIKKIAVNGKIYEVDYNMGGDSIHSIIIDHHS HGVQISPSSNNSYTIMNKGELYQIELQGEMEKIHNARAGTEAVGRQVVQAPMPGVILKIY VKKGDEVKRGDPLCVLVAMKMENEIRSVTDGVVKEVFVEGGMKVGLNDRIMVIE >gi|226332146|gb|ACIC01000174.1| GENE 44 46144 - 47655 1558 503 aa, chain - ## HITS:1 COG:MA0675 KEGG:ns NR:ns ## COG: MA0675 COG0439 # Protein_GI_number: 20089560 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxylase # Organism: Methanosarcina acetivorans str.C2A # 1 478 1 493 493 495 51.0 1e-140 MIKKILVANRGEIAMRIFRTCRVMNISTVAVYTNVDRGALHVRYAEEAYCISSSPEDTSY LKPELILSIAKKTGAAIHPGYGFLSENADFARRCEEEGVIFIGPGADIIAKMGIKTEARK 
IMREAGLPIVPGTETPVQGIEEVKKVAKEVGYPIMLKALAGGGGKGMRLVRSEEEAETAL RLSQSEAGTSFGNDAVYIEKYIENPHHIEVQIMGDKYGNVVHLGERECSIQRRNQKVIEE SPSPFVKDETRKKMLKVAVEACKRIGYYSAGTLEFMMDKDQNFYFLEMNTRLQVEHPVTE ECTGVDLVRDMITVAAGNPLPYKQEDIKFSGAAIECRIYAEDPENNFMPSPGVITVREAP EGRNLRLDSAAYAGFEVSLHYDPMIAKLCCWGRTRESAISNMARALREYKILGIKTTIPF HQRVLKNAAFLEGKYDTTFIDTRFDKEDLKRRQNTDPTVAVIAAAVRHYEREKEAASRAT TLPVVGESLWKYYGKLQMTANNY >gi|226332146|gb|ACIC01000174.1| GENE 45 47711 - 49255 1699 514 aa, chain - ## HITS:1 COG:BMEI0801 KEGG:ns NR:ns ## COG: BMEI0801 COG4799 # Protein_GI_number: 17987084 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) # Organism: Brucella melitensis # 1 514 1 510 510 673 64.0 0 MKELIKNLEELNRKAEKGGGDARIEKQHSVGKLTARERIDLLLEKGSFIELDKLVTHRCT DFGMEKQKFSGDGVVTGYGMIGKRLVYVFAQDFTVFGGALSETHAKKICKVMDMAMQMGA PIIGLNDSGGARIQEGVRSLAGYAEIFLRNSMASGVIPQISAIMGPCAGGAVYSPALTDF ILMVKNSGYMFITGPDVVKSVTQEEVSKEDLGGVGIHMTKSGVAHLSAENDIECINYIRE LISYLPGNNMEEPPFVVTNDSPTRLTPELSDLVPTNPNQPYNIKEMIEAVADDNNFFELQ AEFAANIVTGYIRLNGKTIGVVANQPLVLAGTLDINASIKAARFVRFCDAFNIPLLTLVD VPGFLPGVDQEYGGIIRNGAKLLYAYCEATVPKVTVITRKAYGGAYDVMSSKHIRGDVNL AFPTAEIAVMGPDGAVNILFKKEIEKSQTPEERRKELQSDYREKFANPYRAAELGYVDEV IDPAVTRLRLIRSFEMLANKRQSNPPKKHSNIPL >gi|226332146|gb|ACIC01000174.1| GENE 46 49501 - 50088 766 195 aa, chain + ## HITS:1 COG:no KEGG:BT_1451 NR:ns ## KEGG: BT_1451 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 195 1 195 195 382 98.0 1e-105 MRTIKLLLAAGFLLVLGTSAYAQVQRNETYLPKFAIKTNALYWATSTPNLGFEVGLAKKI TLDVSGNYNPWKFGDDRQIKHWLVQPELRYWLCERFNGSFFGLHGHYGEMNVSNLNIFGM GHDRYDGSLYGAGISYGYQWIISKRWSMEATIGVGYARLKYDKYARGDGGEKLGHNTRNY FGPTKIGLSFIYVIK >gi|226332146|gb|ACIC01000174.1| GENE 47 50113 - 51570 1643 485 aa, chain + ## HITS:1 COG:no KEGG:BT_1452 NR:ns ## KEGG: BT_1452 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 484 1 484 484 902 
99.0 0 MKRKLIYLFIALAATILPAHAQKFLNDALTLSNVSLWQQGNSLYIGMTVDMANLTIGSAR SLSLIPLLTDGQHNVPLQEIIVNGKRREKAYLRGLAISKQEPTAIIVPYNKRETFNYTQV IPYQPWMANASLQLVENLCGCGNYQEMNAQELITNDVSTEAKRLSAMSPIVAYIQPTVEV VKNRSEQYEAHLDFPVNKSVILTDFMNNHSELVNIHSMFDKIQNDKNLTVQSISIEGFAS PEGPLAFNEQLSKKRAEALKDYLVKNEKASAKLYKVTFGGENWDGLVKALESSSMKDKET FLNIIKNTTNDVKRKQEIMKVGGGAPYRTMLKEIYPGLRKVNCKIDYTVVNFDVEQGRII IRENPKYLSLNEMYQVANSYPKGSKDFVDVFDIAVRMYPTDAVANLNAAAVALSQKDINT ALKYMEKADHTTAEFLNNTGVYNFLNGDIQRATAAFEQAAKLGNDAAQANLKQLQQIMNM KMSKK >gi|226332146|gb|ACIC01000174.1| GENE 48 51808 - 53307 1451 499 aa, chain + ## HITS:1 COG:BB0604 KEGG:ns NR:ns ## COG: BB0604 COG1620 # Protein_GI_number: 15594949 # Func_class: C Energy production and conversion # Function: L-lactate permease # Organism: Borrelia burgdorferi # 3 497 6 499 500 293 39.0 6e-79 MTLILAIIPVLLLIVLMAFFKMSGDKSSIISLIVTMLIALFGFAFSVDNLFYSFLYGALK AVSPILIIILMAIFSYNVLLKTEKMEIIKQQFASISTDKSIQVLLLTWGFGGLLEAMAGF GTAVAIPAAILISLGFKPIFSATVSLIANSVATAFGAIGTPVLVLAKETNLDVLQLSTNV VLQLSVLMFLIPLVLLFLTNPKLKALPKNIFLALLVGGVSLVGQYLAARYMGAESPAIIG SILSIIVIVLYGKLTASKEEKARKSTLKAKDILNAWSIYLLILFLIILTSPLFPSLRSTL ENNWITRISLPVNASTVNYTISWLTHAGVLLFLGTFIGGLIQGAKVKELFIVLWNTVKQL KKTFITVICLVGLSTIMDTSGMIAVIATALATATGSLYPLFAPVIGCLGTFITGSDTSSN ILFGKLQASVAGQIHVSPDWLSAANTVGATGGKIISPQSIAIATSAGNQQGKEGEILKAA IPYALAYVVITGIIVYIFS >gi|226332146|gb|ACIC01000174.1| GENE 49 53348 - 53866 351 172 aa, chain + ## HITS:1 COG:lin1847 KEGG:ns NR:ns ## COG: lin1847 COG3153 # Protein_GI_number: 16800914 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Listeria innocua # 6 169 3 162 172 97 36.0 1e-20 MKNFKIRQETKEDLDEVYQLIKTAFETAKVKDGDEQDFAVKLREGKNFIPELSMVAETDG KLIGHIMMTQTPVLQSNGERYTALLVAPLSVQLEYRDLGVGSALMKEGLRLAAEMGYQAV FLTGDPNYYSRFGYQSSSRFGITCPGIPDQYVLAYELVPHALDKVEGTIGEW >gi|226332146|gb|ACIC01000174.1| GENE 50 53927 - 54547 532 206 aa, chain - ## HITS:1 COG:STM0315 KEGG:ns NR:ns ## COG: STM0315 COG1186 # Protein_GI_number: 16763697 # Func_class: J 
Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Salmonella typhimurium LT2 # 5 202 2 197 204 150 41.0 1e-36 MKEKVYLQITSGRGPAECCRVVALVLERIVRQAQASGLKVEMIEREVGPVNRTLLSATIA LQGAASGELADEWEGTVQWIAQSPYRIYHKRKNWFVGVHSFVLSESQEATERDFRYETLR ASGPGGQHVNKTESAVRAVHIPSGMSVVASDQRSQWQNKKLATERLLVKLSSWTMEQAMI QAQENWSNHNHLQRGNPVKVIREPLI >gi|226332146|gb|ACIC01000174.1| GENE 51 54552 - 55949 1339 465 aa, chain - ## HITS:1 COG:DR0430 KEGG:ns NR:ns ## COG: DR0430 COG1690 # Protein_GI_number: 15805457 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Deinococcus radiodurans # 42 463 40 464 470 384 50.0 1e-106 MGIRLKDLSKLGYRDNVARSLVVDIVGKYCKHDTKEQIAMTLSNILEHPESYKNNEIWSK LAERLSPTILAKEFLAYDLREEPLMYKTYGGKFIETLAKQQMNLAMRLPVTVGGALMPDA HAGYGLPIGGVLATDNAVIPYAVGVDIGCRMSLTVFDAKADFLKRYSYQIKEALKDFTHF GMDGGLGFEQEHEVLDREEFRLTPLLKDLQGKAVRQLGSSGGGNHFVEFGEIALQANNVL NLSEGSYVALLSHSGSRGLGAAIAKHYSLLAREVCKLPREAQHFAWLSLNSEEGQEYWMS MNLAGDYARACHERIHLNLSKALGLKPVANVNNHHNFAWKEEIAPGRMAIVHRKGATPAQ KGQAGLIPGSMATAGYLVCGKGVEESLCSASHGAGRAMSRQKAKESFTQSALKKMLSQAD VTLIGGSIEEMPLAYKDIDRVMYTQETLVEVQGRFMPRIVRMNKE >gi|226332146|gb|ACIC01000174.1| GENE 52 56442 - 56936 621 164 aa, chain + ## HITS:1 COG:BB0061 KEGG:ns NR:ns ## COG: BB0061 COG0526 # Protein_GI_number: 15594407 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Borrelia burgdorferi # 43 156 3 115 117 124 43.0 1e-28 MKLIKSIMSAFAIVLATTACAGNGGENKKSNEPTKEDNKMEVVSLNKAEFLKKVYDFEAN PNDWKFEGKRPAIVDFYATWCGPCKALHPVLEELSKEYSGKVDIYQIDVDQEKELAAAFG IRSIPTLLLIPMKEEPRITQGALPKDQLKKAIDEFLLKQNNEAK >gi|226332146|gb|ACIC01000174.1| GENE 53 57177 - 58394 1073 405 aa, chain + ## HITS:1 COG:TVN0299 KEGG:ns NR:ns ## COG: TVN0299 COG0535 # Protein_GI_number: 13541130 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductases # Organism: 
Thermoplasma volcanium # 30 184 36 174 336 72 32.0 1e-12 MKTKRTRSSFRSMSSERKLESIFLFVTGKCNAKCAMCFYANDMAKKERDLTFEEIRKISE TAGEINKLWVSGGEPTLREDLPEILEMFYQNNHIKDVNMPTNGLKPDRVIEWVERFRTNC PECNINVSISLDGFGETHDTQRGVPGNFYKAIDTIRKVSEHFRNDGKVLLNVATVITKYN IDQINDFMVWMYGRFHLSTHTIEAARGVTREDGVKALDESTLRLIQDEAAPIYGAYAKRM VANTSGLRKPITKFFYTGIVRALYNIRASNIDQPTQWGMDCTAGETTLVIDYDGRFRSCE LREPLGNVKDYDCNVQRIMQSDVMKKEVGAIGHGYKANCWCTHGCFITASLIFNPRKMIK QVYKGYREVSKLDKPIDISEASLIKMEDTYHLNKEKLQELGIAAY >gi|226332146|gb|ACIC01000174.1| GENE 54 58415 - 59674 809 419 aa, chain + ## HITS:1 COG:ML2348 KEGG:ns NR:ns ## COG: ML2348 COG1819 # Protein_GI_number: 15828264 # Func_class: G Carbohydrate transport and metabolism; C Energy production and conversion # Function: Glycosyl transferases, related to UDP-glucuronosyltransferase # Organism: Mycobacterium leprae # 1 411 1 410 421 103 24.0 5e-22 MKILLVTRGSQGDIYPYLNIASALIKRGHKVTLNLPQIFEKEAQAYQLDYVLQDFDDIQG MVSKAGNKSEGAKPYLKWMRDVIDAQFKQLVPLLKEHDVLIATNSEFAAASVADYCGKPF IRTAFAPFIPGKNIPPPIFPYPKPHPVFTPRMIWRMLNIGNKYMTQKTINKNRKQLGLQP LKNCGYYATERAFNYLLYSRHLGSTDADWKYKWDIGGYCFNDTLHYDTEAYRKLTDFIQQ EQRPVIFFTLGSCSAKESNDFCHRLINVCWQLNFRLIIGSGWSGTGKQLANEKDVFLLTH TIPHSLIFPHCDAVMHHGGSGTTHSVARAGKPQVVMPLIIDQPYWAYRVQQLKIGPARIK IGKVSDKELKEKVYDLVTNPVYKKNAEELGEKIRSEQSTDNFCDFIESMISPKEISEGK >gi|226332146|gb|ACIC01000174.1| GENE 55 59688 - 59972 247 94 aa, chain + ## HITS:1 COG:no KEGG:BF3050 NR:ns ## KEGG: BF3050 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 92 1 93 106 121 70.0 7e-27 MQESTEKFKNEELILRDYLAIERTKLANVRTLFSYIRTSLYLLTAGIGIFQIESISRLDG LAWVCVIIGVILFILGFVRYFQMCKQLKGYVKQS >gi|226332146|gb|ACIC01000174.1| GENE 56 59977 - 60696 567 239 aa, chain - ## HITS:1 COG:FN0219 KEGG:ns NR:ns ## COG: FN0219 COG3279 # Protein_GI_number: 19703564 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Fusobacterium nucleatum # 1 220 2 224 240 105 28.0 6e-23 
MNCIIVDDEPLAREAMKLLIEEAECLQLVGSFNSAATASDFMEQHVVDLVFLDIQMPGIT GIEFARTISKRTLVIFTTAYTEYALDSYEVDAIDYLIKPVEAERFQKAVEKAQSYHSLLL QEEKEAIETIVAAEYFFVKAERRYFKVNFSDILFVEGLKDYVILQLGEQRIITRMSLKAV FDLLPKDSFLRVNKSYIVNTAHIDSFDNNDIFIKSYEIAIGNSYRDDFFEGFVMKQQRW >gi|226332146|gb|ACIC01000174.1| GENE 57 60693 - 61799 597 368 aa, chain - ## HITS:1 COG:RSc1351 KEGG:ns NR:ns ## COG: RSc1351 COG2972 # Protein_GI_number: 17546070 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Ralstonia solanacearum # 148 345 137 352 414 94 29.0 3e-19 MRISTNDKNTLDRNPGDGTMNGKSVTAFLLSPHCRIYRHILLQVAVLLITINVFWYEPLQ VVSLLKRFGGCLVYFLSMNAVIYTNLYVLVPSFLLKNRLGGYVLAAIVTNLVVIFFLSVT QGMLFEVILPVRNPDGFATFINTFSGILTMGFVTAGSAAISLFIHWLRYNLRIDQLESTT LQSELKFLKSQINPHFLFNMLNNANVLIKRNPEEASKVLFKLEDLLRYQINDSSRERVAL ASDIRFLNDYLNLEKIRRDHFQFTMEQEGDIDSIWIQPLLFIPFVENAVKHSFDSEHPSY VHLSFKVEENRLEFRCENTMPAVAVSNKVGGIGLANIQRRLGLLYPDRYELEQIENENRY MVTLCITL >gi|226332146|gb|ACIC01000174.1| GENE 58 61796 - 62884 723 362 aa, chain - ## HITS:1 COG:BS_yesM KEGG:ns NR:ns ## COG: BS_yesM COG2972 # Protein_GI_number: 16077762 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 152 348 351 556 577 59 27.0 8e-09 MKHMATETTAAGNESAFLYRLLVSPAFRWMRYLILVMVLGTISFNQVFIIFMDYRDILGV WIYVFTFIYMLTYVGVICLNLFWLFPKYLLKRHYMTYLSVLSVAMVIALLIQMVIEYLAY SHWPQLHARGSYFSVPMLMDYISSFMLSTLCMIGGTMTLLLKEWMIENQRVSQMEKAHVL SEVEQLKEQVSPELLFKTLHHSGELTLTEPEKASKMLMKLSQLLRYQLYDCSRQKVLLSS EIAFLTNYLTLEQSSLPQFRYQFTAEGEVNRMLVPPLLFIPFVQHIMELAHEQQTLLPVS LDIHLKAEKGTIVFTCMCRQLNLSVNRGLERIRQRLDLLYGDRYGLSLTTGCIRLELNGG EQ >gi|226332146|gb|ACIC01000174.1| GENE 59 62933 - 65410 2171 825 aa, chain - ## HITS:1 COG:no KEGG:BT_1460 NR:ns ## KEGG: BT_1460 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 825 1 825 825 1652 99.0 0 
MKMNKTMLLGEVANRSGIPADECGVVLKTFERVLCEELTRKLYRYGGWILLLIGLFVSAF AFGQTSERKGRPVQTIRGMVVDGDSKYPIPYATVRLSDKEGMGTTTDSLGRFTIPQVPVG RHTVEAAFMGYEPGIFREILVTSAKEVYLEIPLKESVNELSEVVVRARTNKEEAMNKMAT TGARMLSVEEASRYAGGFDDPARLVSAFAGVAPSVSSNGISIHGNAPHLLQWRLEDVEIP NPNHFADIATLGGGILSSLSSQVLGNSDFFTGAFPAEYGNAVSGVFDMKLRNGNNQKNEN TIQVGIMGIDVASEGPLSKKHKASYIFNYRYSTTGLLNLDGGTMDYQDLNFKLNFPTKMA GTFSVWGTSLIDKFKSDMERNPEKWEYLGDRSESRDKQYMAAGGISHRYFFNSDASLKTT VAATYSQLDGGASMFNHALESTPYMDLESRHTNLILTSTFNRKFSNRFTNKAGFTYTAMF YDMNLAIAPYEAQLLETVSKGDGNTSLISAYNSSSVGLSDRWTLNAGIYGQYLTLNNKWS VEPRAGLKWQATPKATFALAYGIYSRMEKMDVYFVKTKSTGNQSVNKNLDFTKAQHIMLS FGYKISDRMNLKIEPYVQFLHDVPVMADSSYSVLNRSDFYVEDALVNKGRGRNIGIDITL ERFLEKGLYYMISGSLFDSRYRGGDGVWYNTKFNRNYIINGLIGKEWMLGRNKQNILSIN LKLTLQGGDRYSPIDMEATMKHPDKEVQYDERKAFSKQYSPMFIGNYTVSYRINKKKVSH EFAVKGLNFMGTKEHYGHEYNVKTGKIDATDGSTVLTNVSYKLEF >gi|226332146|gb|ACIC01000174.1| GENE 60 66285 - 66728 362 147 aa, chain + ## HITS:1 COG:no KEGG:BF1389 NR:ns ## KEGG: BF1389 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 4 141 39 176 182 222 85.0 4e-57 MFDYKEIERTIKDNHLPFQKIEDADKEILTQKLYDTFVFGNPRALWLSFKYVPYSIDCNM EDPYFHLLDIIDKNIVLYFFIDYWNKDFVIYKARMSDIHKFIGDCEGLDEYYLVTENFME LYSITDHDDLLYIDVRKNEMKKLITNE >gi|226332146|gb|ACIC01000174.1| GENE 61 67152 - 67742 526 196 aa, chain - ## HITS:1 COG:L120883 KEGG:ns NR:ns ## COG: L120883 COG0110 # Protein_GI_number: 15673269 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Lactococcus lactis # 1 191 1 191 203 194 49.0 1e-49 MENKELRERMLSGKMYNDLSVELVQRREQAVFLTNDYNSTYGKPKEVREALLRKLLRCIG SSVHFEPNFRCEFGFNISIGNNFFANFDCIMLDGNLITIGDNVLLGPRVGLYTANHALDP QERLMGGCYAHPIVIEDNVWVGAGVHIMGGVTVGRNSVIGAGSVVTKSIPENVIAAGVPC KVIREITDEDKTGFLP >gi|226332146|gb|ACIC01000174.1| GENE 62 67883 - 68341 257 152 aa, chain + ## HITS:1 COG:L69304 KEGG:ns NR:ns ## COG: L69304 COG0454 # Protein_GI_number: 15673990 # Func_class: K Transcription; R General function 
prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Lactococcus lactis # 3 147 4 148 152 167 52.0 8e-42 MNLVSLRDSPQYLERAIAYFQSKWGNENTKMVYDNCFRHSLNAENPIPQWYLLMNGDEII GCAGLTSNDFNSRADLYPWLVALFIEEKYRGNHYANLLIEQAKKDTLKFGFSKLYLSTSH TSYYERLGFTYIGDCWHPWGESSRVYEIILKF >gi|226332146|gb|ACIC01000174.1| GENE 63 68251 - 68427 81 58 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRTIMLTLLLSVAAISSFAQTNTYRECTSQNFKIISYTRELSPQGCQQSPIYVKPKRS >gi|226332146|gb|ACIC01000174.1| GENE 64 68695 - 71115 1969 806 aa, chain - ## HITS:1 COG:NMB0549_2 KEGG:ns NR:ns ## COG: NMB0549_2 COG0577 # Protein_GI_number: 15676455 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Neisseria meningitidis MC58 # 129 367 136 349 395 63 23.0 2e-09 MIGNYWNSAYRNLMKRKKFSFINVFGLAIGMASALLILTYVTFEFSFDKMHTKYPHIYRV QSTFHEGEVLTDNWASSSFGYGSAMKENLAGIEDYTRIGSLLQPEQIVKYGELTLRENQI AYADPGFFRLFDFELLKGDKKTSLSMPGQVVITERIARKYFKEEDPVGKILIFTGPYYKI SCEVTGVMEEMPSNSHIQYNFLISYTSLPKFIHEYWYKHEAYTYVLLDSPEREAEIEREF PVMAEKYKTEEALKNKTWGVDLVPLADIHLTPQLGYEMETKGNRSAMIALVFAAIAILAI AWINYINLTVARSMERAREVGVRRVVGAFRKQLIHQFLFEALVMNLIAFILAVGLIELIL PYFNQLVGRTVTFSVWLMDYWWVLLVLVFIAGIFLSGYYPALALLNRKPITLLKGKFLHS KSGERTRQVLVIIQYTASMILLCGTLIVFAQLNFMRSQSLGVKTDQTLVVKFPGRTDGLN VKLEAMKKAIMRLPLVHSVAASGAVPGEEVATFLSNRRTNDALKQNRLYEMLACDPDYIE AYGLQVIAGRGFSEEYGDDVDKLVINETAVRNLGFASNDEAIGELVTVECTDAPMQIIGV VKDYHQQALSKNYTPIMLIHKDKIDWLPQRYISIVMTSGNPRELVSQVQEIWNRYFADSS FDYFFLDQFFDHQYRQDEVFGVMIGSFTGLAIFISCLGLWVLVMFSCSTRTKEMGIRKVL GASRWNLFYQLVKGFFLLILIAVVIALPVAWFSMNAWLSHYAFRTDLEAWFFIVPVLLML FISFVTVAFQTMKIIMSKPARSLRYE >gi|226332146|gb|ACIC01000174.1| GENE 65 71112 - 71828 363 238 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 211 4 214 223 144 40 2e-33 MIKTEKLSMLFTTEEVQTKALNEVTLRVEQGEFVAIMGPSGCGKSTLLNILGTLDSPTSG 
SYFFEGKQVDKMNENQLTALRKNNLGFIFQSFNLIDELTVYENVELPLVYMGIKAAQRKE KVNKVLEKVNLLHRANHYPQQLSGGQQQRVAIARAVVTDCKLLLADEPTGNLDSVNGVEV MELLSELNRQGTTIIIVTHSQRDATYAHRVIRLLDGQIVSENINRPLEKSTSSKNEAV >gi|226332146|gb|ACIC01000174.1| GENE 66 71844 - 73091 1353 415 aa, chain - ## HITS:1 COG:YPO1498 KEGG:ns NR:ns ## COG: YPO1498 COG0845 # Protein_GI_number: 16121771 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Yersinia pestis # 1 399 1 404 420 110 23.0 5e-24 MDTPVERKPGINRKHLYTIGGVALGIAVILYFVFRDTASSMTVEKDRLTIATVKQAEFND YIRVIGQVMPSRIIYMDAIEGGRVEERLKEEGAMVKAGDVILRLSNPLLNIGIMQSEADL AYQENELRNTRISMEQERLQLKQERIGLNKELTLKKRRCEQYRRLVEEQLIAREDFRQAE EEYEAAKEQLAVIDERIRQDNIFRESQISSLDENIRNMKRSLTLVRERLENLKIKAPIDG QVGNLNAQVGQSISAGEHIGQIITPDLKVQAQIDEHYVERVLPGLPADFTRDGGTYKLEV TKPYPEVKEGQFRTDLAFISERPENIRAGQTYHINLQLGDPAQAILVPRGGFFQITGGRW MYIVDESGKFATRRPVKIGRQNPQYYEVTEGLSSGEKVIISGYELFGDNEKLILK >gi|226332146|gb|ACIC01000174.1| GENE 67 73152 - 74456 901 434 aa, chain - ## HITS:1 COG:no KEGG:BT_1468 NR:ns ## KEGG: BT_1468 # Name: not_defined # Def: putative outer membrane efflux protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 22 434 1 413 413 758 98.0 0 MKNFFFLSLFLLPHFVHAQSGMTLDECIRLAWKQNPSVRNSVIDIKETRADYMAAVGAFL PRAAVNAETGKRFGRSIDPDTNGYTNETFEEGTVGLDMTLSLFEGFSRIHQVRFRKMNKE RSEWALKNKQNELAYQVTDAYYKLILERKLLNLALEQSRLSERYLKQTEAFVELGLKSAS DLQEVKARREGDIYRYKSRENTCRLALLHLEQLMNFQPGDTLVIQDTIVEANQLPLLSVP STETLYAQSLEILPAMRMIDLKRKAARKEYAMAGGAFSPSVYARFTVGSNFYNTVFSARQ LRDNIGKYVGIGISFPLLSGLERLTHQRKQKLNLYRLKNEEELEKQQLYTDIEQTLLSLH AGFSEHQQALQQLDAEALVLKESERKWEEGLISVFQLMEARNRFISAKAELVRVRLQVEM MRKLEKYYREGTFL >gi|226332146|gb|ACIC01000174.1| GENE 68 74786 - 76147 1052 453 aa, chain + ## HITS:1 COG:STM4174 KEGG:ns NR:ns ## COG: STM4174 COG2204 # Protein_GI_number: 16767428 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Salmonella typhimurium LT2 # 6 
450 8 441 441 316 39.0 5e-86 MENGTILIVDDNKSVLASLELLLENEFGTVRTAANPNQITTLLTTTSIDVVILDMNFSAG INNGNEGLYWLKQIHEIRPSLPVVMLTAYGDVELAVKALKNGATDFLLKPWDNQILIRKI KEAYKSNNSKAKSASKATGKQGDTEKPSKPEMLVGHSPAMLQLIKVVTKVAKTDANILIT GENGTGKEMLAREIHRLSPRNSRQMLGIDMGAISESLFESELFGHERGAFTDAYESRPGK FEAANGSSLFMDEIGNLSIALQAKLLTVLQNRNVTRIGSNKAIPVDIRLISATNKDIPEM VKQGLFREDLFYRINTIHLEIPPLRERGDDILLFIDTFLRRFTSKYQRPDIRMHEQTIEK LRSYHWPGNIRELQHAIEKAVILCEGNVIRPKDILVKQSWKPQTASVVPNLEEVERQAIE TAILQNNGNLTAAAEQLGISRQTLYNKLKRFKP >gi|226332146|gb|ACIC01000174.1| GENE 69 76153 - 77403 1093 416 aa, chain + ## HITS:1 COG:mlr0399 KEGG:ns NR:ns ## COG: mlr0399 COG5000 # Protein_GI_number: 13470633 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase involved in nitrogen fixation and metabolism regulation # Organism: Mesorhizobium loti # 182 406 464 697 738 105 28.0 2e-22 MIFSRHIYWAILLRILLILATSGTGLWLIISQRGIIIGTLLIICSLFQIGGIVTYLNNIN RKLRLFFDAIEDRDNTFSYPEQRISEEQQQLNRSLNRINALLAQTKMDYQKQEHFYRSLL EKVPNGIIAWNNSKKIIFVNNTALALLNIESLALYSQLENILQNEKNRKRLSLSQSQMKL QDETITLLSIQDIDDRLNENESESWSKLSHVLTHEIMNTIAPIISLSQTLASYPEISEKG IRGLHIIQAQSERLMEFTESFRHLSYLPQPEKRLFSLSDLLHNLEELLQTNFQENNISFT LRCSPESIITEGDEKQLSQVFLNLLKNAMQALEGHPHGELSLQAEQNEHIVIDITDNGPG IPPEIQDKVFIPFFTTKSEGTGIGLSLCKEIIRRHEGHLSVKESKAGKTVFHIELP >gi|226332146|gb|ACIC01000174.1| GENE 70 77491 - 77838 469 115 aa, chain + ## HITS:1 COG:no KEGG:BT_1471 NR:ns ## KEGG: BT_1471 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 115 1 115 341 196 95.0 2e-49 MSIKEQYWKYSLIVIILFMGIIIFRQITPFLGGLLGALTIYILVRTQMAHLTERWKLKRS VAALLITAETVMVFLVPLGLTVWLLVSKLQDINLDPQTFIAPMQQVAEFIKEKTG >gi|226332146|gb|ACIC01000174.1| GENE 71 77843 - 78517 451 224 aa, chain + ## HITS:1 COG:VC0624 KEGG:ns NR:ns ## COG: VC0624 COG0628 # Protein_GI_number: 15640644 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Vibrio cholerae # 48 214 186 351 361 95 35.0 8e-20 
MLGKDTLSFIVSILPRIGQIIMESISSLAINLFVMVFVLYFMLIGGKKMEAYVNDILPFN EANTQEVIHEINMIVRSNAIGIPLLAIIQGGVAMIGYLIFGAPNILVLGFLTCFATIIPM VGTALIWFPVAAYLAVTGDWFNAIGLAAYGGIVISQSDNLIRFILQKKMADTHPLITIFG VVIGLPLFGFMGVIFGPLLLSLFFLFVDMFKKEYLDLRNNLPAR >gi|226332146|gb|ACIC01000174.1| GENE 72 78574 - 79746 1012 390 aa, chain + ## HITS:1 COG:Ta1048 KEGG:ns NR:ns ## COG: Ta1048 COG0463 # Protein_GI_number: 16082079 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Thermoplasma acidophilum # 57 296 7 212 256 68 23.0 1e-11 MEAFTFNTVELILLSAAGILFIIQLIYYFGLYNRIHTRNKAVRKEAVHFTQEMPPLSVIL CARNEADNLRKILPAILEQNYPQFEVIVINDASTDETEDVLGFMEEKYPHLYHSFTPDSA RYISHKKLALTLGIKASKHDWLVFTETNCMPASKDWLRLMARNFTSQTQVVLGYSGYDRT KGWLHKRVAFDTLFQSLRYLCFALAGKPYMGIGRNMAYRKELFFQRKGYSTYLNLQRGED DLFINQIATPSNTRVETDINATMRIQPVYSYKEWKEEKISYMATARFYHGVQRYLLGFET FSRLLFYAACIAGLVFGILNNHWLAAGIALLIWLFRYTVQAIIINRTAKEMGGDRKYYFS LPVFDILQPIQSLKFKLYRSLRRKGDFMRR >gi|226332146|gb|ACIC01000174.1| GENE 73 79758 - 80210 188 150 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116624156|ref|YP_826312.1| SSU ribosomal protein S18P alanine acetyltransferase [Solibacter usitatus Ellin6076] # 1 142 1 144 152 77 35 3e-13 MEPTIHLRQAKVGDIPAILEIEQECFQEDSFSREQFVYLICRSKGTFYVVVEQERMIAYV SLLFHAGTRYLRIYSIAVHPDCRGRKLGQLLMERTIETAFDCRAVKITLEVKESNTPAIK LYMKNGFIPVGVKPNYYHDGSDAIYMQRHM >gi|226332146|gb|ACIC01000174.1| GENE 74 80213 - 81679 2566 488 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|29346884|ref|NP_810387.1| ribosomal protein S6 modification protein-related protein [Bacteroides thetaiotaomicron VPI-5482] # 1 488 1 488 488 993 100 0.0 MNNVLILLDNPDDWKPYCETKSILTVSDYLKNKPVEKDRKLVINLSNDYSYNSEGYYCSL LAQTRGQRVIPDVDIINKLETGTGIRMDRSLQALCYQWIQKNGIKSDIWYLNIYFGKCRE KGLERVARFIFENYPCPLLRVALNTHPKNQIESIQFLPLNRLDDEEQDFFANTLDNFNKK IWRAPKSAKAPRYSLAVLIDPKEKFPPSNKGALHKLSEVAKKMNIHVEMITEDDAMRLLE FDALFIRTTTSLNHYTFHLSQLAAQNGMVVIDDPLSIIRCTNKVYLKELFEKERIPAPKS 
TLIFQSNENSFEQISEQVGAPFILKIPDGSYSIGMKKVSNEEELKTSLELLFEKSAILLA QAFTPTEFDWRVGLLNGVPLYACKYYMAKGHWQIYCHYDSGRSRCGLVDTIPIYQVPRVV LDTAIKAANLIGKGLYGVDLKMVDDKAYVIEINDNPSIDHELEDAIIGDEMYYRLLNHFE QALEMKHY >gi|226332146|gb|ACIC01000174.1| GENE 75 81810 - 83054 1062 414 aa, chain - ## HITS:1 COG:PA2988 KEGG:ns NR:ns ## COG: PA2988 COG4591 # Protein_GI_number: 15598184 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ABC-type transport system, involved in lipoprotein release, permease component # Organism: Pseudomonas aeruginosa # 3 414 5 416 416 113 23.0 7e-25 MSLSLFIARRIYRESDGGKQVSRPAVLIAMAGIAIGLAVMIIAVAVVIGFKSEVRNKVIG FGSHIQIANLDVVNSYETHPIAVGDSMMTALSGYPQVSHVQRYSTKPGMIKTDDAFQGMV LKGVGPEFDPSFMQEYLVEGEIPAFSDSASSNQVLISKALATKMKLKLGDKIYTYYIQDN IRARRLTIAGIYQTNFSEYDNLFLLTDLYLVNRLNGWEPEQVSGVELQVRDYDKLEDITY EIAVDTDSQRDKFGGVYYVRNIEQLNPQIFEWLNLLDLNVWVILFLMIGVAGFTMISGLL IIIIERTNMIGILKALGADNFTIRKTFLWFAVFLIGKGMLWGNAIGLAFCFIQSQFGIFK LDPENYYVDTVSVSFNVWFFLLINAGTLLASVLMLIGPSYLITKINPASSMRYE >gi|226332146|gb|ACIC01000174.1| GENE 76 83247 - 84446 1166 399 aa, chain - ## HITS:1 COG:CAC1001 KEGG:ns NR:ns ## COG: CAC1001 COG0436 # Protein_GI_number: 15894288 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Clostridium acetobutylicum # 5 394 4 393 395 361 46.0 2e-99 MPTISIRGNEMPASPIRKLAPLADAAKQRGVHVFHLNIGQPDLPTPQAAIDAIRNIDRKV LEYSPSAGNRSYREKLVGYYAKFNINLTADDIIITSGGSEAVLFSFLSCLNPGDEIIVPE PAYANYMAFAISAGAKIRTIATTIEEGFSLPKVEKFEELINERTKAILICNPNNPTGYLY TRREMNQIRDLVKKYDLFLFSDEVYREFIYTGSPYISACHLEGIENNVVLIDSVSKRYSE CGIRIGALITKNKEIRDAVMKFCQARLSPPLIGQIAAEASLDAPEEYSRETYDEYVERRK CLIDGLNRIPGVYSPIPMGAFYTVAKLPVDDSDKFCAWCLSDFEYEGQTVFMAPASGFYT TPGSGINEVRIAYVLKKEDLTRALFILQKALEAYPGRTE >gi|226332146|gb|ACIC01000174.1| GENE 77 84627 - 85862 948 411 aa, chain + ## HITS:1 COG:APE1887 KEGG:ns NR:ns ## COG: APE1887 COG2407 # Protein_GI_number: 14601699 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and 
related proteins # Organism: Aeropyrum pernix # 53 398 72 417 433 146 29.0 7e-35 MTIHLISFASILHKQASLRNSHEAILSELEKYYTVKLIDYQDMDKLGSDDFKIIFIATGG VERLVIQHFERLPRPAILLADGMQNSLAAALEISTWLRGRGMKSEILHGELPSIIQRVHI LYNNFRAQRSLFGKRIGVIGTPSSWLVASNVDYLLAKRRWGIEYIDIPLERVYEHFKQIT DDQVGASCAAVASQALACREGTPEDLIKAMRLYRAIKRICEEEKLEALTLSCFKLIDQID TTGCLALSLLNDDGIMAGCEGDLQSIFTLLAVKALTGKEGFMANPSTINTRTNELILAHC TIGLKQTERYIIRNHFETEKGIAIQGLLPTGDVTIVKCGGECLDEYYLSTGTLTENTNYI NMCRTQVRIRMNTPADYFLKNPLGNHHIMIHGNYEEALNEFLLSNACKRTE >gi|226332146|gb|ACIC01000174.1| GENE 78 85952 - 86794 795 280 aa, chain - ## HITS:1 COG:SPCC1672.01 KEGG:ns NR:ns ## COG: SPCC1672.01 COG1387 # Protein_GI_number: 19075372 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Schizosaccharomyces pombe # 8 265 6 271 306 77 25.0 2e-14 MTNLTNYHSHCLYCDGRANMEDFIRFAISEGFSSYGISSHAPLPFSTAWTMEWDRMDDYL SEFSRLKGKYADKIELAIGLEIDYLNEESHPAIPRFQELPLDYRIGSVHMLYSPLGKVVD IDTPADIFRQLIDAHFEGDLDYVVHLYYKNLLRMVELGGFDILGHADKMHYNASCYRPGL LDEPWYDALVRDYFAEIARHGYIVEINTKSYHDLGTFYPNERYFPLLKELGIRVQVNSDA HYPERINNSRAEALAALKKVGFDTVVEWHGGQWKDMPLLQ >gi|226332146|gb|ACIC01000174.1| GENE 79 86866 - 88734 2194 622 aa, chain - ## HITS:1 COG:sll0912 KEGG:ns NR:ns ## COG: sll0912 COG0488 # Protein_GI_number: 16331003 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Synechocystis # 6 619 4 634 636 484 43.0 1e-136 MAVPYLQVDNLTKSFGDLVLFENISFGIAEGQRVGLIAKNGSGKTTLLNIIAGKEGYDSG NIVFRRDLRVDYLEQDPQYPEELTVLEACFHHGNSTVELIKEYERCMETEGHPGLEDLLA RMDQEKAWEYEQKAKQILSQLKIRNFDQQVKQLSGGQLKRVALANALITEPDLLILDEPT NHLDLDMTEWLEDYLRRTNLSLLMVTHDRYFLDRVCSEIIEIDNQQIYQYKGNYSYYLEK RQERIEAKSVEIERANNLYRTELDWMRRMPQARGHKARYREDAFYELEKVAKQRFNNDNV KLEVKASYIGSKIFEADHLFKSYGDLKILDDFSYIFARYEKMGIVGNNGTGKSTFIKILM GQVKPDSGTVDVGETVRFGYYSQDGLQFDEQMKVIDVVQDIAEVIELGNGKRLTASQFLQ 
HFLFTPETQHSYVYKLSGGERRRLYLCTVLMRNPNFLVLDEPTNDLDIITLNVLEEYLQN FKGCVIVVSHDRYFMDKVVDHLMVFNGQGDIRDFPGNYSDYRDWKEAKAQKEKEAEKPQE EKTARVRLNDKRKMSFKEKREFEQLEKEIAELEAEKAQIEELLCSGTLSVDELTEKSKRL PEVNDLIDEKTMRWLELSEIEG >gi|226332146|gb|ACIC01000174.1| GENE 80 88763 - 89569 742 268 aa, chain - ## HITS:1 COG:no KEGG:BT_1480 NR:ns ## KEGG: BT_1480 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 268 1 268 268 524 99.0 1e-147 MQGKKFISPGAWFSMNYPSDWNEFEDGEGSFLFYNPDVWTGNFRISAFKGNASYGKDAIR QELKENDSASLVKIGTWDCAYSKEMFQEEGTYYTSHLWITGTGNIAFECSFTVPKGGSTK EAEEVIATLEARKEGEKYPAELIPVRLSEIYQINEGYEWVVSTVKQELKKDFQGVEEDLE KIQQVIDSGKISPKKKDEWLAIGITVCAILTNEVEGMEWKTLIDGNREVPVLEYQGRTID PMKIAWSKVKAGQPCNIAEAYQSAIDHH >gi|226332146|gb|ACIC01000174.1| GENE 81 89621 - 90814 1013 397 aa, chain - ## HITS:1 COG:no KEGG:BT_1481 NR:ns ## KEGG: BT_1481 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 397 1 397 397 770 99.0 0 MIKIDLKRYRREVVGGVIVLLLLIGGISVFKYTSFNSGFEIVDDLGGNIFPSAILSVATT DAQVITPSDSTCLGNPKSCIAVRVKSRTAYSRVRIEVAETPFFSRSVSEFVLNKPRTEYT IYPDIIWNYEALKNNAQAEPVSVAVKVEMNGKDLGQRVRTFSVRSVNECLLGYVANGTKF YDTSIFFAAYVNEENPMIDQLLREALNTRIVNRFLGYQSTAKGAVDKQVYALWNILQKRK FRYSSVSNTSLSSNVVFSQRVRTFDDALESSQINCVDGSVLFASLLRAINIEPILVRTPG HMFVGYYTDNSHKDMNFLETTMIGDVDLDDFFPDEQLDSTMVGKSQNEMSLLTFEKSKQY ANKKYKDNEAGIHSGKLNYMFLEISKEVRRKIQPIGK >gi|226332146|gb|ACIC01000174.1| GENE 82 91048 - 91737 821 229 aa, chain + ## HITS:1 COG:TM1655 KEGG:ns NR:ns ## COG: TM1655 COG0745 # Protein_GI_number: 15644403 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Thermotoga maritima # 1 226 9 240 247 169 43.0 5e-42 MNDYRILVVDDEEDLCEILKFNLENEGYEVDTANSAEEALKMDISSYHLILLDVMMGEIS GFKMANMLKKDKKTARVPIIFITAKDTENDTVTGFNLGADDYISKPFSLREVIARVKAVL RRTATMEAEKAPERLTYQSLVIDITKKKVSIDDEEVQLTKKEFEILLLLVQNKGRVFSRE 
DILARIWSDEVYVLDRTIDVNITRLRKKIGEYGKCIVTRLGYGYCFEAE >gi|226332146|gb|ACIC01000174.1| GENE 83 91747 - 93528 1649 593 aa, chain + ## HITS:1 COG:CAC1701 KEGG:ns NR:ns ## COG: CAC1701 COG0642 # Protein_GI_number: 15894978 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 35 589 30 563 566 160 27.0 5e-39 MNLPVKQKHFLSFSRKLFLSVISLFLVFAFCFIAYQYQREREYKVELLNTQLQNYNSRLY ERLNSNPAIEETTEKYIRDHALEDLRVTLIDLQGNVIYDSYQTTDQQLENHLNRPEVQKA LKDGTGFDVRRTSETTGLPYFYSATRYGDYIIRSALPYNVSLINNLQADPHYLWFTVIVS LLLMVIFYKFTNKLGTSISQLREFAMRADRNEPIEMAMQSAFPHNELGEISQHIIQIYKR LHETKEALYIEREKLITHLQISHEGLGIFTKDKKEILVNNLFTQYSNLISDSNLETTEEV FAISELKDIIHFINKNQQQRSRGKDEKRMSVTINKNGRTFIVECIIFQDASFEISINDVT QEEEQVRLKRQLTQNIAHELKTPVSSIQGYLETIVNNENISRDKINTFLERCYAQSNRLS RLLRDISVLTRMDEAANMIDMERVDISVLVGNIINEVSLELEEKHISIVDSLKKGIQIKG NYSLLYSIFRNLMDNAIAYAGTNIQININCFREDENYYYFSFADTGIGVSPEHLNRLFER FYRVDKGRSRKLGGTGLGLAIVKNAVIIHGGNISAKNNQGGGLEFVFTLAKEK >gi|226332146|gb|ACIC01000174.1| GENE 84 93560 - 95131 1366 523 aa, chain - ## HITS:1 COG:no KEGG:BT_1484 NR:ns ## KEGG: BT_1484 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 523 1 523 523 994 98.0 0 MNAPVKTLRKLPIGIQSFEFLRSEGYLYVDKTELVYRLVTKGKPYFLSRPRRFGKSLLLS TLEAYFKGKKELFEGLAIEKLEQNWFEYPVLHLSLNAEKYDSRERLERMLELQLESWEEM YGVDKGTMTYSGRFITIIRRAYEQTGRRVVVLVDEYDKPMLQTFDKPDLQEDFRKTLTAF YTVLKDADPYLQFVFITGVTKFAQMGIFSTLNQLNDISFDLEYNTLCGMTRTEIETVFVP ELQELALQAETNYDEAIEQLTRQYDGYRFTPEKGFTPMFNPFSVLNALAKSRFGDYWFAS GTPTFLVEILKKTNFDLRELDGIEVSSASLSDDRANISNPVPMIYQSGYLTIKSFDQRFR TYTLGFPNEEVKYGFLNFVAPFYTPVASTDTTFYIGKFVHELETGETEAFLTRLSCFFAD FPYELNEKTERHYQVVFYLVFKLMGQFTQAEVRTAIGRADAVVKTADYIYVFEFKLTGTA EEALKQIDEKGYLLPYTVDGRRLVKVGVSFDAAKRNLGEWLIV >gi|226332146|gb|ACIC01000174.1| GENE 85 95363 - 96298 826 311 aa, chain + ## HITS:1 COG:no KEGG:BT_1485 NR:ns ## KEGG: BT_1485 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 1 311 1 311 311 581 99.0 1e-164 MKTNISLILILCLLLGACKNGNASSQNKSETPQDTIKAIKMPAIPQMMTAPEQRADFLAK HYWDNVNFADTNYIHHPEVTEQAWADYCDLLNHVPLETAQQAMRNVIDRTNVDKKVFTYI TDLADKYLYDPNSPMRNEEFYIPVLEAMIASPVLNETEKIRPQARLKLAQKNRIGTKALN FTYTLASGAQGSLYQLKAEYLLLFINNPGCQACTETIEGLKNAPIINQLLQEKKLVLLSI YPDEELDEWKKHLSEFPNEWINGYDKKFTIKEEQVYDLKAIPTLYLLDKDKTVLLKDATA PAIEAYLMQHQ Prediction of potential genes in microbial genomes Time: Thu May 12 03:44:58 2011 Seq name: gi|226332145|gb|ACIC01000175.1| Bacteroides sp. 1_1_6 cont1.175, whole genome shotgun sequence Length of sequence - 15464 bp Number of predicted genes - 13, with homology - 12 Number of transcription units - 5, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 765 - 824 4.3 1 1 Op 1 . + CDS 952 - 2337 1061 ## BT_1486 hypothetical protein 2 1 Op 2 . + CDS 2358 - 3704 951 ## BT_1487 hypothetical protein 3 1 Op 3 . + CDS 3719 - 4561 872 ## BT_1488 hypothetical protein + Prom 4893 - 4952 2.7 4 2 Op 1 . + CDS 5038 - 7056 1470 ## BT_1489 vitamin B12 receptor, outer membrane 5 2 Op 2 . + CDS 7087 - 9078 1393 ## BT_1490 putative surface protein 6 2 Op 3 . + CDS 9103 - 9879 635 ## BT_1491 hypothetical protein 7 2 Op 4 . + CDS 9894 - 10070 64 ## + Prom 10184 - 10243 4.3 8 3 Tu 1 . + CDS 10315 - 11694 1426 ## COG3033 Tryptophanase + Prom 11723 - 11782 3.7 9 4 Op 1 . + CDS 11804 - 12106 265 ## COG2388 Predicted acetyltransferase 10 4 Op 2 . + CDS 12115 - 12339 250 ## BT_1494 hypothetical protein + Term 12350 - 12402 13.2 - Term 12339 - 12389 14.4 11 5 Op 1 . - CDS 12412 - 13047 331 ## BT_1495 siderophore (surfactin) biosynthesis regulatory protein 12 5 Op 2 . - CDS 13113 - 14465 1190 ## COG1253 Hemolysins and related proteins containing CBS domains - Term 14480 - 14532 11.2 13 5 Op 3 . 
- CDS 14540 - 15016 438 ## COG0629 Single-stranded DNA-binding protein - Prom 15062 - 15121 6.8 Predicted protein(s) >gi|226332145|gb|ACIC01000175.1| GENE 1 952 - 2337 1061 461 aa, chain + ## HITS:1 COG:no KEGG:BT_1486 NR:ns ## KEGG: BT_1486 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 461 36 496 496 926 98.0 0 MNCKKLFKMLLFIITAFVIVSCSKDDCDLDHIDKLQGLPALKAGTFPEEDLTLNVGEQYV YAPKASSPLDIYYQWYQNGEDMSTDPSFTFNAEHPGRSKVILELSNDLGKVTLENKVMVP GADYSKGCLIINEGWFGHGSGSISFYNYEKNSIEHWCYKNQNFGDVLGVTSQSATLWNGK LYVCSKEDNQLVVMDPKTLYAENSCGKLANYQAYEFIGLNDDYGIITHGGYFSRINLKTF ETITLVSVGNTYTGTGSGIAYNGKLILNVNSSGWGSPKVYTIDIAELCSPDLKPTDKVAF QELDITTYGGTRFVQCKDGNIYTVETTKDGKNNLVRINADFSLKKVAMRDDYSPSSFGAY REASFCGTPEGIFYYIAGGKIYKATFDNPAPEETLTEYTKEGYGFYGAGIRVNPKTNELL AMYLTGDYQKNLLVRFNAATGEKISEIAYDGYYFPATFIFN >gi|226332145|gb|ACIC01000175.1| GENE 2 2358 - 3704 951 448 aa, chain + ## HITS:1 COG:no KEGG:BT_1487 NR:ns ## KEGG: BT_1487 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 448 1 448 448 869 100.0 0 MNKLYTTLLIACLAAGFTACNDDDCEDLHLGNLAHYPNVLKGTFPTESQVLELGETLEIT PELLNPEGATYSWLVNGKEYSTEPTFSYKIDNPCRADLSCIIKNKYGKVEMSTSFSSNHN FSKGFFYVADGTFNFYDTEKKTAYQDCYASLNAGKTLGIGNYDSANIIHSNGKFYLLVGT STSNRDHFYIVDAKTLYYENSAVVGANLSGLTILNEQYGLVTGDGIRRIDLKSLNNVRIK NERLLCFYNSIIYNGKVLSNDTYKDESKVKYYDVNELIAAKEGEAPAVTELDIIQKQKIN FVLAKDGNVYTLESADNGCNIVKIKNDFTLEKVFANFQPAKGPYHSSPTIGMVASETENI IYLVSTDGAIYKYILGDSDSLKAPFIAAESGVSITAPLQLNQQSGELYVTYTEERKDESK IVVYSKDGKVLHTVDCGESVPSQILFNN >gi|226332145|gb|ACIC01000175.1| GENE 3 3719 - 4561 872 280 aa, chain + ## HITS:1 COG:no KEGG:BT_1488 NR:ns ## KEGG: BT_1488 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 280 1 273 273 530 98.0 1e-149 MKRYWYLMAIAATLLASCNKDEEETEIQGFKVLEYRPAPGQFINEGFDCQTMEEANAYAE ERFNKKLYVSLGSFGGYITIKMPKEIKNRKGYDFGIIGNPFSGSSEPGIVWVSEDANGNG KADDVWYELKGSDEPERDYSVTYHRPDAAGDIPWEDSKGESGVIKYLPQYHDQMYYPNWI 
KEDSYTLKGSMLEARTEQEGGIWKNKDFGKGYADNWGSDMAKDDNGNYRYNQFDLDDAVD QNGNPVTLERIHFVKVQSAILKNVESIGEVSTEVVGFKAF >gi|226332145|gb|ACIC01000175.1| GENE 4 5038 - 7056 1470 672 aa, chain + ## HITS:1 COG:no KEGG:BT_1489 NR:ns ## KEGG: BT_1489 # Name: not_defined # Def: vitamin B12 receptor, outer membrane # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 672 10 681 681 1301 99.0 0 MSVLLCCQALSLSLFAQQQKVDTAHIYSIPEIMVSDPYQTREVRSASPLQVFNKEELKNL QALQVSDAVKHFAGVTVKDYGGIGGLKTVSIRSLGAQHTAVSYDGITVSDCQTGQVDIGR FSLNNVDRLSLNNGQSDNIFQPARFFASAGILNIQTLTPHFKEDKPTNIAAEFKTGSWGL VNPSLFLEQQLNKKWSMTANGEWMSSDGHYPFTLRYGNDADVQVSKEKRRNTDVENLRAE ISAFANLSDKEQWRLKAYYYQSSRGLPNATTLYYDFSRQNLQDKNTFIQSQYKKEFSRKW VFQTSAKWNWSYQNYQDPDALTSVGGTDNSYYQQEYYLSASALYRIWNNLSFSLSTDGSI NTMNANLQNFVSPTRYSWLTAFAGKYVNEWVTLSASALATVINEKAKNGGSAGNHRKLSP NVSISLKPFHNEELRFRFFYKDIFRLPSFNDLYYDKAGNINLKPESATQYNIGITYSKAI NNFIPYLSATVDAYHNKVTDKIVATPTKNLFIWSMVNLGKVDIKGIDATASLSLQPLDKL RINLSGNYTYQRALDVTNSNPNSPEGKVYKHQIAYTPRVSASGQAGIETPWLNLSYSFLF SGKRYMLGQNISDNRLDSYSDHSISAYRDFKIQKVTASLNLEVLNLMNRNYEIVKNFPMP GRSVRVTIGVRY >gi|226332145|gb|ACIC01000175.1| GENE 5 7087 - 9078 1393 663 aa, chain + ## HITS:1 COG:no KEGG:BT_1490 NR:ns ## KEGG: BT_1490 # Name: not_defined # Def: putative surface protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 663 1 663 663 1300 98.0 0 MQKGLLYNMWLRLGKCFFLFPLFFIACDDLEDKPSIVPESNGDVFETGTAEMYILSEGLF NQNNSSLARYSFNRQRCTNNYFSANNQRGLGDTANDIAIYGNKIYVVVNVSSTVEVIDFP TGKSIRQISMLRDNGSSRQPRAIAFDKDKAYICSYDGTVARIDTTSLEIEEIVTVGRNAE DICVQNGKLYVSNSGGLDYSGPGVDTTVSVIDITTFKETKKIEVGPNPGKILPGLEEAVY VVTRGTDIEAGDYHLVKIDSRTDAVATTYDEKVLSFAIDGPIAYLYTYDYQTKDSAIKVF DLNAGTVIRDNFITDGTAIQTPFSIQLNPFSGNVYITEAYNYTVKGDVLCFNQQGQLQYR LNDIGLNPNTVVFSDKASQNEAGDTPEDPNAPSAFANKVFEYIPAPGQFINTTTSAYEDG FSAKQVLERATEKLKKKSVISLGGFGGTITVGFHQSIRNSKGEYDFRILGNASYNQNTGT GALGGSAEPGIVLVSKDENGNGLPDDEWYELAGSEYGKDTETRNYEITYYRPQPANGDIR WTDNQGGEGFVYRNSYHQQDSYYPNWIKEDEITFRGTRLKDNAINEGGTWVGYCYPWGYA DNHPNRSEFSQFKIDWAVDQNGNHVELDKIDFVKIYTAVNQNVGWMGEISTEVMTVEDLH FEN 
>gi|226332145|gb|ACIC01000175.1| GENE 6 9103 - 9879 635 258 aa, chain + ## HITS:1 COG:no KEGG:BT_1491 NR:ns ## KEGG: BT_1491 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 258 1 256 256 185 45.0 2e-45 MKAKMKKLSLFLLVSVFAFCGCSDENDEVKGGDDNKGGENEKKEVVVSFENQLTESESEF TSTSTEVSGYYFKDTFTDPESYVTFDHYYSDWGQGYSFAGFTYTNKTSHFQNCQPNCGNI KTGEVYLGVYSDENTTASMTIIDSQYSIKGLWITTSKNAYIGMTEGDSYARPFKKGDWYE VTATGYDNKGKKIAETKIKLADYKTDTDKPVNTWIWFDLTPLKDASKITLIPSSSDSGEF GMNTGKYFCIDDLTLIEK >gi|226332145|gb|ACIC01000175.1| GENE 7 9894 - 10070 64 58 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFDSNRGSNYHKQTLRLSQTDRSIAPNRPLVYPQSTSRLQTIDRISPLNQPHDLKEGA >gi|226332145|gb|ACIC01000175.1| GENE 8 10315 - 11694 1426 459 aa, chain + ## HITS:1 COG:PM0811 KEGG:ns NR:ns ## COG: PM0811 COG3033 # Protein_GI_number: 15602676 # Func_class: E Amino acid transport and metabolism # Function: Tryptophanase # Organism: Pasteurella multocida # 6 455 6 455 458 469 49.0 1e-132 MELPFAESWKIKMVEPIRKSTREEREQWIKEAHYNVFQLKSEQVYIDLITDSGTGAMSDR QWAGMMLGDESYAGATSFFKLKDTITRITGFEYIIPTHQGRAAENVLFSYLVHEGNIVPG NSHFDTTKGHIEGRHAIALDCTIDEAKNTQLEIPFKGNVDPAKLEKALSEYADRIPFIIV TITNNTAGGQPVSMQNLREVRAIADKYNKPVLFDSARFAENAYFIKMREEGYQDKSIKEI TREMFDLADGMTMSAKKDGIVNMGGFIATRRKEWYEGAKGFCVQYEGYLTYGGMNGRDMN ALAIGLDENTEFDNLETRIKQVEYLAQKLDEYEIPYQRPAGGHAIFVDASKVLTHVPKEE FPAQTLTVELYLEAGIRGCEIGYILADRDPITHENRFNGLDLLRLAIPRRVYTDNHMNVI AAALRNVFERRETITRGVRIAWEAPLMRHFTVQLERLSV >gi|226332145|gb|ACIC01000175.1| GENE 9 11804 - 12106 265 100 aa, chain + ## HITS:1 COG:DR1844 KEGG:ns NR:ns ## COG: DR1844 COG2388 # Protein_GI_number: 15806844 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Deinococcus radiodurans # 10 96 7 93 93 70 40.0 9e-13 MAEDYKLIDNEEKHRYEFQIDDKIAKIDYIKSNNGEIYLVHTEVPASLGGRGVGSQLAEK TLADIERQGLRLVPLCPFVAGYIHKHPEWKRIVMRGIHIK >gi|226332145|gb|ACIC01000175.1| GENE 10 12115 - 12339 250 74 aa, chain + ## HITS:1 COG:no KEGG:BT_1494 NR:ns ## KEGG: 
BT_1494 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 74 1 74 74 144 98.0 8e-34 MGKKVEYTNGELTIVWQPEMCQHAGICVKMLPNVYHPKERPWVQIENATTEELIAQISKC PSGALSYRLNKKDK >gi|226332145|gb|ACIC01000175.1| GENE 11 12412 - 13047 331 211 aa, chain - ## HITS:1 COG:no KEGG:BT_1495 NR:ns ## KEGG: BT_1495 # Name: not_defined # Def: siderophore (surfactin) biosynthesis regulatory protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 211 1 200 201 392 99.0 1e-108 MGLFLQHKTVDMQWAVWKMEESLDTLFSLLPDVRRVSCEQEMQRFTSDRRKLEWLSVRVL LYSMLQEDKEIGYSSEGKPHLTDNSSFISISHTKGYVAVILSSVAPVGIDIEQYGQRVKR VYDRFIRSDERVEPYQGDETWSMLLHWSAKETIFKSMENADADLRKLCLSHFIPQSEGVF QVREYATERQQLFTVGYRICPDFVLTWTVTI >gi|226332145|gb|ACIC01000175.1| GENE 12 13113 - 14465 1190 450 aa, chain - ## HITS:1 COG:FN1486 KEGG:ns NR:ns ## COG: FN1486 COG1253 # Protein_GI_number: 19704818 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Fusobacterium nucleatum # 40 445 17 425 426 211 35.0 2e-54 MDSDGYLSQLAEIFNGITVNTPSISAIIAIVLAGVLLLASGFASASEIAFFSLSPSDLND IDEHNHPADEKISALLGDSERLLATILITNNFVNVTIIMLCNFFFMNVFVFHSPLAEFII LTVILTFLLLLFGEIMPKIYSAQKTLAFCRFSAPGIYFLEKLFHPVASMLVRSTTFLNKH FAKKNHNISVDELSHALELTDKAEISEENNILEGIIRFGGETVKEVMTSRLDMVDLDVRT PFKEVMQCIIENAYSRIPIYSGSRDNIKGVLYIKDLLPHVNKGDNFRWQSLIRPAYFVPE TKMIDDLLRDFQANKIHIAIVVDEFGGTSGLVTMEDIIEEIVGEIHDEYDDEERTYVVLN DHTWIFEAKTQLTDFYKIAKVDEDEFEKVVGDADTLAGMLLEIKGEFPALHEKVTYHNYE FEVLEMDSRRILKVKFTILPKEMDESESKE >gi|226332145|gb|ACIC01000175.1| GENE 13 14540 - 15016 438 158 aa, chain - ## HITS:1 COG:VC0397 KEGG:ns NR:ns ## COG: VC0397 COG0629 # Protein_GI_number: 15640424 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Vibrio cholerae # 3 158 6 177 177 106 42.0 2e-23 MSVNKVILIGNVGQDPRVKYFDTGSAVATFPLATTDRGYTLANGTQIPERTEWHNIVASN RLAEIVDKYVHKGDKLYLEGKIRTRSYSDQSGAMRYITEIFVDNMEMLSPKGANTGAGAP 
QPSATQAQQPQQMQQPQQQAQPSQAQPIQDNPADDLPF Prediction of potential genes in microbial genomes Time: Thu May 12 03:45:42 2011 Seq name: gi|226332144|gb|ACIC01000176.1| Bacteroides sp. 1_1_6 cont1.176, whole genome shotgun sequence Length of sequence - 3706 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 773 484 ## PROTEIN SUPPORTED gi|46129221|ref|ZP_00155777.2| COG1194: A/G-specific DNA glycosylase - Prom 964 - 1023 5.0 + Prom 808 - 867 5.4 2 2 Tu 1 . + CDS 956 - 1231 289 ## COG0776 Bacterial nucleoid DNA-binding protein + Term 1382 - 1440 12.3 + Prom 1426 - 1485 2.9 3 3 Tu 1 . + CDS 1514 - 3088 1582 ## COG1530 Ribonucleases G and E + Term 3096 - 3133 1.1 - Term 3174 - 3216 5.3 4 4 Tu 1 . - CDS 3416 - 3706 210 ## BT_1501 major outer membrane protein OmpA Predicted protein(s) >gi|226332144|gb|ACIC01000176.1| GENE 1 3 - 773 484 257 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|46129221|ref|ZP_00155777.2| COG1194: A/G-specific DNA glycosylase [Haemophilus influenzae R2846] # 6 257 3 260 378 191 37 1e-48 MLDDRSKSGMNIFSEAIVEWYKEYKRDLPWRESSDPYRIWISEIILQQTRVAQGYDYFLR FIKRFPDVKALADADEDEVMKYWQGLGYYSRARNLHAAAKSMNGVFPETYPEVLALKGVG EYTAAAICSFAYGMPYAVVDGNVYRVLSRYFGIDTPIDSTEGKKLFAALADEMLDKKQPA LYNQGIMDFGAIQCTPQSPDCLFCPLADSCSALSTGRVTQLPVKQHKTKTTNRYFNYIYV RAGAHTYINKRTANDIW >gi|226332144|gb|ACIC01000176.1| GENE 2 956 - 1231 289 91 aa, chain + ## HITS:1 COG:SA1305 KEGG:ns NR:ns ## COG: SA1305 COG0776 # Protein_GI_number: 15927054 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Staphylococcus aureus N315 # 1 90 1 90 90 58 35.0 2e-09 MTKADIVNEITKKTGIDKTTVLTTVEAFMEAVKDSLSNDENVYLRGFGSFVVKKRAQKTA RNISKNTTIIIPEHNIPAFKPAKTFTISVKK >gi|226332144|gb|ACIC01000176.1| GENE 3 1514 - 3088 1582 524 aa, chain + ## HITS:1 COG:CPn0959 KEGG:ns NR:ns ## COG: CPn0959 COG1530 # Protein_GI_number: 
15618866 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonucleases G and E # Organism: Chlamydophila pneumoniae CWL029 # 1 517 1 515 515 275 31.0 1e-73 MTSEIVIDVQPKEVSIALLEDKSLVELQSEGRNISFSVGNMYLGRIKKLMPGLNACFVDV GYEKDAFLHYLDLGPQFNSLEKFVKQTLSDKKKLNSISKATLLPDLDKDGTVANTLKVGQ EVVVQIVKEPISTKGPRLTSEISFAGRYLVLIPFNDKVSVSQKIKSSEERARLKQLLMSI KPKNFGVIVRTVAEGKRVAELDGELKVLIKHWEDAMAKVQKATKYPTLIYEETSRAVGLL RDLFNPSFENIHVNDEAVYNEIKDYVTLIAPDRANIVKLYKGQLPIYDNFGITKQIKSSF GKTVSYKSGAYLIIEHTEALHVVDVNSGNRTKNANGQEGNALEVNLGAADELARQLRLRD MGGIIVVDFIDMNEAENRQKLYERMCANMQKDRARHNILPLSKFGLMQITRQRVRPAMDV NTTEICPTCFGKGTIKSSILFTDTLESKIDYLVNKLKVKKFSLHVHPYVAAFINQGLVSL RRKWQMKYGFGIKIIPSQKLAFLQYVFYDTHGEEIDMKEEIEIK >gi|226332144|gb|ACIC01000176.1| GENE 4 3416 - 3706 210 96 aa, chain - ## HITS:1 COG:no KEGG:BT_1501 NR:ns ## KEGG: BT_1501 # Name: not_defined # Def: major outer membrane protein OmpA # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 96 268 363 363 177 98.0 7e-44 VNSAKIQDDQQAKIASLVEYLEKNATAKVSVTGYADKGTGNARINSKLSEKRANNVVEAL KAKGIAADRITVAYKGDTVQPYNTPEENRVSICIAE Prediction of potential genes in microbial genomes Time: Thu May 12 03:45:45 2011 Seq name: gi|226332143|gb|ACIC01000177.1| Bacteroides sp. 1_1_6 cont1.177, whole genome shotgun sequence Length of sequence - 3977 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 552 536 ## BT_1501 major outer membrane protein OmpA - Term 564 - 607 8.4 2 1 Op 2 . 
- CDS 630 - 3467 2471 ## BT_1502 hypothetical protein - Prom 3487 - 3546 7.8 Predicted protein(s) >gi|226332143|gb|ACIC01000177.1| GENE 1 3 - 552 536 183 aa, chain - ## HITS:1 COG:no KEGG:BT_1501 NR:ns ## KEGG: BT_1501 # Name: not_defined # Def: major outer membrane protein OmpA # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 183 1 183 363 388 100.0 1e-107 MSIRRKLLIGIMAISVLPISAQEQRIKEPEKISFKPHWFIQAQIGAAHTVGEAKFTDLIS PAAALNVGYKFAPAFGARVGVSGWQAKGGWVNPQQTYQYKYLQGNVDIMADLSTLFCGFN PKRVFNGYLFAGAGLNRGFDNDEANALDTRTYEMEYLWQEGKFLVAGRFGLGCDLRLNDR LSI >gi|226332143|gb|ACIC01000177.1| GENE 2 630 - 3467 2471 945 aa, chain - ## HITS:1 COG:no KEGG:BT_1502 NR:ns ## KEGG: BT_1502 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 945 1 945 945 1623 100.0 0 MNKKFLNAVLFGALLASSTGTFTSCKDYDDDINGLSERVDAVEKTLADLNTKFGALAYVK SVSFANGVLTVTDQSGTPTTYTIPDNDTNTTYTLDVSQDGNKATITLTDDKGNKQTKTIT FTDTDTDTKFDATKLTVGEDGVVKYDGVATGVTIPKQGTLTITEMTTDGVTYGWAIKYGN AEPVNLMICDVLPITSFGFEPDAYLGGVPSMKALKVVYKAWGETTSCNPATATGEVWEKA TTNSYVRPVIAGTYYLNPSSATKEQIKKITVLSADKEYINKTRAAASNPEVESYEVKDGV LTVYFKATTELIEAISSSKITVLAIRVETEAGNTITDEYRAIYATDYENVILGDKAKKAA AAAQHHLYGTTSGKADDAIKADADYEVAYNSTTGIDLADKVITCYDDVAANKTDLTMSAA ELAKLGLKYDFHLCAYIDGTNQTNQSDFARLTGSVLYPKLFNETATPYAAVGRQPLVRVQ LLDTKANNKVVSTGWIKVMITKDKAPAIDAPFTFDPFTNQCDDKVFALTVEQMNVKVYNK LGMDYKMFKTIYEAANPLYTGDGVVTEVADAGEVTQTDLLKWTISQADMKLALAKTSDVG SLKAVVTYKPKAGYEDSYSDVTITLSTKVNAIAAVTIPASNKIAEYWDANKTYVRLNVVV PGTLTDDCAFAVDLDNTFEGNKPIITGATAYKYIFASKNVNRKEKGLSGTEYTLSVSDDG LTLKATAGAATQNVAVIDADGVVTYQNTDFAKDLLNIASHNSVPSAGFYAWINIKATTGE CALELPITNGEYMAYFLRPIDVIAGEGKFQDAVDNGSTVNMLDLLSFSDWRNQAFSTTVK ANYFGYYGIELITVDIPNITTDLNGNDINSKKLSEVTSQLVITQTGTTVNPIPAAPAKDT YGTVTYTNNGNAVGAFNLRLPVTVTYKWGTVKSYVIIPVEKTLGN Prediction of potential genes in microbial genomes Time: Thu May 12 03:46:00 2011 Seq name: gi|226332142|gb|ACIC01000178.1| Bacteroides sp. 
1_1_6 cont1.178, whole genome shotgun sequence Length of sequence - 4894 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 549 583 ## BT_1505 hypothetical protein 2 1 Op 2 . - CDS 611 - 1102 296 ## BT_1506 hypothetical protein - Prom 1122 - 1181 1.6 3 1 Op 3 . - CDS 1189 - 4359 2644 ## BT_1507 hypothetical protein - Prom 4382 - 4441 5.5 Predicted protein(s) >gi|226332142|gb|ACIC01000178.1| GENE 1 3 - 549 583 182 aa, chain - ## HITS:1 COG:no KEGG:BT_1505 NR:ns ## KEGG: BT_1505 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 182 1 182 362 389 100.0 1e-107 MNIKKILVAVYALSTTIAFAQEQRIKEESKTIFKPHWFIQAQIGAAHTVGEGEFTDLISP AAALNVGYKFAPAFGARVGVSGWQAKGGWVNPRQNYQYKYLQGNVDIMADLSTLFCGFHP KRVFNGYLFGGVGLNRGFDNDEANALDTRTYEMEYLWQEGKFLVAGRFGLGCDFRLNDRL SI >gi|226332142|gb|ACIC01000178.1| GENE 2 611 - 1102 296 163 aa, chain - ## HITS:1 COG:no KEGG:BT_1506 NR:ns ## KEGG: BT_1506 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 163 1 163 163 315 100.0 4e-85 MKKKDLLYICVLVAVSCFLFSCGEKKKQLTPKEEFLMGLTAEDTVQVLTISRSCMDTLKA GNIDEALKMLFILRDGKAIPLPAEKEQQLRKKFKYFPVVDYKLDYYSFSSTDNNDVKFQI EFFKHTSSDDHTPNTIGFMFNPVKIDGVWYLAVKEATKEATDK >gi|226332142|gb|ACIC01000178.1| GENE 3 1189 - 4359 2644 1056 aa, chain - ## HITS:1 COG:no KEGG:BT_1507 NR:ns ## KEGG: BT_1507 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1056 1 1056 1056 1968 99.0 0 MNKKFLSVVVFGALLASSAGTFTSCKDYDDDIDAVNGRIDELAKSLSDLQAKVGSFVKSV TYDPATGKLTVVDGENNSVSYTIGQNLPTYSISVDKEGKISLLKDGEVVSSGTITFPDAP TTPTIPDAFDPSKLTVDAATGKVMYDGKETGVTIPGDGVLSIEDKKNAEGVVIGYTIKYK DQAVTFALNDVLTLKGLVFKSDLFVDGIEAIEYPYLDYTYKAGSTAATTQWEDEQAEKVL CKVITDSKEWNYTGGTGAQYNPIEYINYHLNPSSAKVSKEDLSFVSRDVEVISSRASVAL 
PEVADMKPADEGILPVGLRAIGKDIKQGGEGSILALQAIVKAKSDAQADTTVTSDYALLY ASKVTPQAIAFNDATLAAEDCRESANPDELFKTVEGAITHAPTLTVAYNSFIDLKNLLTI HYNREGQPEATKNDGTHKVWAYGDEAKYGLKYDFAFIQYTSGQNVTSDSKYADANTIKEG VLTPRIVNSAGETLNEQGVSSVGKRPLVRVRVTDADGRVVLAGFIKIEIAKQVDDIVTSV FDKGTHNFGCDEADAILTWSEISYQLLEKAAVQSKDEFDALYEFDVDDSGVAKQFAKSAD GKSFVPAAIAQVIGTVKEKVDETGTTNTVLHWTLTTAEQSGIYEMAGHTATIYVRYVSKL NTTSHAPIYMPLQIAVAKPQGTVVTKLTNYWYSDGDVAKDGQNTRLNVPYPQDNGNTLNY VVDLNQVWEKGMPTFTPPANFASYTDVIFANQNGAAGGYKYYFDADQNVLTVDGTKYTLS VDNATAPCITGGSYKATAQNMIDHALKVDAGVFTNTKLYANGTVIATIDQNTGKITYENN DTSKKLLNAYSHSAAKHFAKIGICAYSPCNIAMSLTNNTYNAYFLRPIDAVGTDGGEFVD AHANGSTLDIAKLFNFQDWRNVKFVDGTDYSNSWLYAFYGLNKVEVKIADATTTLSGGKL GETLLSSKTEKIVLTQTDKDGNKVTSATLNLSSYNTEASGTQATYDAIVAAMGKIKYVNN GNNVQTFELRIPVEFTYTWGTVKTTVDCTVKSTMGN Prediction of potential genes in microbial genomes Time: Thu May 12 03:46:18 2011 Seq name: gi|226332141|gb|ACIC01000179.1| Bacteroides sp. 1_1_6 cont1.179, whole genome shotgun sequence Length of sequence - 6928 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 86 - 145 4.7 1 1 Tu 1 . + CDS 228 - 1052 729 ## BT_1510 hypothetical protein + Term 1104 - 1161 8.2 - Term 1313 - 1348 -0.8 2 2 Tu 1 . - CDS 1508 - 2602 1086 ## BT_1511 outer membrane protein OmpA - Prom 2663 - 2722 3.9 - Term 2694 - 2732 8.1 3 3 Tu 1 . 
- CDS 2782 - 6402 3160 ## BT_1512 putative surface protein - Prom 6428 - 6487 7.7 Predicted protein(s) >gi|226332141|gb|ACIC01000179.1| GENE 1 228 - 1052 729 274 aa, chain + ## HITS:1 COG:no KEGG:BT_1510 NR:ns ## KEGG: BT_1510 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 21 274 1 254 254 464 100.0 1e-129 MTKFINPFTDVGFKKIFGQEMTKDLLIDFLNDLLVDEKHIIDITFLDKEILPEYMGDRGV IYDIYCTAENGEQFIVEMQNKQHVHFRERALYYLSKTVARQGEKGAEWKFDLKAVYGVFF MNFRLDNLPHKLRTDIVLTDRDTHEQFSDKLRFIFIELPAFSKEEQECETDFERWIYVLK NMETLKRLPFKARKSVFEKLEKIVDIASLSKEERMKYDESIKVYRDNLVTLEFAEQKGRA EGKEETARNLKKMGVSLEIISKATGLSIEKIEAL >gi|226332141|gb|ACIC01000179.1| GENE 2 1508 - 2602 1086 364 aa, chain - ## HITS:1 COG:no KEGG:BT_1511 NR:ns ## KEGG: BT_1511 # Name: not_defined # Def: outer membrane protein OmpA # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 364 1 364 364 738 100.0 0 MKIRNLLVAILAMSGTVVFAQEQRQIKEEGKTVFKPHWFIQAQVGAAHTVGEAKFTDLMS PAAAINIGYKFAPAFGARIGGSGWQAKGGWVTPEQTYQYKYLQGNIDIMADLSTLFCGFN PKRVFNGYLFGGVGLNRGFDNDEANALDTRTYELEYLWQEGKFLVAGRFGLGCDLRLNDR LAINIEANANALSDKFNSKKAGNCDWQLNALVGLTIKLGKSYTKTAPVYYEPEPAVVEQP KPEPVVAKEPEPKPVVVVEPMKQNIFFALNSALLQNDQLAKIDAMIKYMEKNPTSKVAVT GYADKETGNPKINMTLSEKRAKNVVEALKAKGIAADRIITSYKGDTVQPYQKPEENRVCI CIAE >gi|226332141|gb|ACIC01000179.1| GENE 3 2782 - 6402 3160 1206 aa, chain - ## HITS:1 COG:no KEGG:BT_1512 NR:ns ## KEGG: BT_1512 # Name: not_defined # Def: putative surface protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1206 1 1206 1206 2158 99.0 0 MNKKYLSVVLFGALLAASAGTFTSCKDYDDDIKGLQEQIDKSGSTITDLQTQLTTLKAAA EAAQATADAAKAAAVEAKTAADAAKAAGDQAKADAADALAKAKAAEAAAATAKTEAIAEA TKLVEALRNSMQEAIDKKLDVTAFEEASKLLGARIDGIEAGLSTLENGAVKENTEAIKTA MDLIETLQTADANFATTLKELDEYVKITLEAKIDANAEDIKAAQDDIKAAQDDLAALWKE IDGKGGLKDLIGANKTAISDLETKITDELDAIKKNVKDIQASIKEINTKIGNINTNLTGL HTLVTCRLSSISLAPDLFVDGIEAVRFTSLQYSPMDVKDENASIPAKSYQFSTAALATAS YHFNPASFKLANADYSYIDRTAEVVETTRAIAASKLIEIVGEPVANPATGTVDFQLLRLN 
SHTTQPELDRTNLIALQATLKGDAIDQDEKNAVITSPYVAVYDNILAAEDVRIADKETLF EGGNAAHYATTFDACTKEEPRYETPYDQVFDLKKLVATCLNNENVSTGHDEFPIEAYKLS YRFAVATTDFNLEEGKTETNQQKWVKCNDVEKGLFQAEGFNPEAFGRTPILKVELVDENG NVVRRGFVKIKFTAEKQPDFTVGNPAEELVFKCADTEATYTITEEYIRENVYRKITDGTN IGMSHETFWNTYDVSTAVASVKKNGKLVTGVDFPKIVAGATSVGTATKKVVWSFKHGQLG AISATGSEFVATITVQNKLQSSKYPDAITFKFAVNVKLPKASLDAVKNELYWQTIDGDMK FFKVNAVVPNTPEDPADGCQIHQELNLAYDKYNVKGLPNCVTDRYVVTKTYSNGVATSKV LAGVKISGKTISLDKNDADVKLALNSAGGLQASVAHIYTLESGDQITVNEFMVNFIRPVS LNMPSGVTLTDAITGGDVANFQWNGLLMDWSGNAIVSPSVVETEDISSYWKQVCTTEYEW VPEHSYVKTPASLNVTYGKVDFVTASDLTMYSGTSKVNYWKPGTSTEDVIEKTYSVGAML TQAEAMAMLEVQEREGTPEGYVNFGSTYNFTEIPVIKGSHVEYTYVANIDYVPAEIVVVP GDYVAKKHEHTPMPTFDGESYGQKSGCWEWTKTTFGSSVINLGQYWFYYGEFSDVKLDIT KVTTDLKYNGGKLPAKTTLEQVGNTVKYVNVDSPIEYAYKIFIPATVNYGWGTLSSTLTI TVNPKN Prediction of potential genes in microbial genomes Time: Thu May 12 03:46:50 2011 Seq name: gi|226332140|gb|ACIC01000180.1| Bacteroides sp. 1_1_6 cont1.180, whole genome shotgun sequence Length of sequence - 30728 bp Number of predicted genes - 30, with homology - 30 Number of transcription units - 15, operones - 6 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 202 - 261 11.1 1 1 Tu 1 . + CDS 332 - 631 242 ## BT_1514 hypothetical protein + Term 869 - 913 4.4 2 2 Tu 1 . + CDS 1075 - 1584 237 ## BT_1515 hypothetical protein 3 3 Tu 1 . + CDS 1699 - 3129 1320 ## COG0305 Replicative DNA helicase + Prom 3159 - 3218 8.1 4 4 Tu 1 . + CDS 3264 - 3776 571 ## BT_1517 hypothetical protein + Prom 4021 - 4080 2.8 5 5 Op 1 . + CDS 4128 - 4433 356 ## BT_1518 hypothetical protein 6 5 Op 2 . + CDS 4447 - 4893 322 ## COG3023 Negative regulator of beta-lactamase expression + Term 4937 - 4999 10.1 + Prom 4946 - 5005 2.3 7 6 Tu 1 . + CDS 5032 - 5358 154 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen - Term 5170 - 5202 -0.9 8 7 Op 1 . 
- CDS 5361 - 6347 644 ## BT_1521 hypothetical protein 9 7 Op 2 . - CDS 6347 - 7282 714 ## BT_1522 putative aureobasidin A resistance protein 10 7 Op 3 . - CDS 7300 - 7968 768 ## COG0558 Phosphatidylglycerophosphate synthase 11 7 Op 4 . - CDS 8008 - 8472 257 ## BT_1524 hypothetical protein 12 7 Op 5 . - CDS 8487 - 8966 338 ## COG1267 Phosphatidylglycerophosphatase A and related proteins - Prom 9031 - 9090 2.9 13 8 Tu 1 . - CDS 9115 - 10404 1452 ## COG1260 Myo-inositol-1-phosphate synthase - Prom 10433 - 10492 4.7 14 9 Tu 1 . - CDS 10513 - 13065 2246 ## BT_1527 hypothetical protein 15 10 Op 1 13/0.000 - CDS 13173 - 14465 1117 ## COG0642 Signal transduction histidine kinase 16 10 Op 2 . - CDS 14487 - 15836 1322 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 17 10 Op 3 . - CDS 15905 - 17383 1338 ## BT_1530 putative outer membrane protein OprM precursor - Prom 17426 - 17485 3.1 18 11 Op 1 . - CDS 17515 - 20154 1713 ## COG0642 Signal transduction histidine kinase 19 11 Op 2 . - CDS 20210 - 21463 979 ## BT_1532 ABC transporter permease 20 11 Op 3 . - CDS 21496 - 22785 793 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 21 11 Op 4 . - CDS 22823 - 23449 563 ## BT_1534 hypothetical protein 22 11 Op 5 . - CDS 23479 - 24144 324 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 23 11 Op 6 . - CDS 24173 - 25057 966 ## BT_1536 ABC transporter permease 24 11 Op 7 . - CDS 25068 - 25421 393 ## BT_1536 ABC transporter permease - Prom 25460 - 25519 7.3 - Term 25564 - 25605 1.5 25 12 Op 1 . - CDS 25665 - 27056 1273 ## COG1252 NADH dehydrogenase, FAD-containing subunit 26 12 Op 2 . - CDS 27111 - 28016 875 ## COG1705 Muramidase (flagellum-specific) - Prom 28153 - 28212 4.6 + Prom 28045 - 28104 4.8 27 13 Op 1 . + CDS 28128 - 28610 544 ## COG0295 Cytidine deaminase 28 13 Op 2 . 
+ CDS 28614 - 29453 785 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 29463 - 29522 3.7 29 14 Tu 1 . + CDS 29542 - 30123 612 ## COG3059 Predicted membrane protein + Term 30128 - 30153 -0.8 + Prom 30142 - 30201 4.6 30 15 Tu 1 . + CDS 30227 - 30728 288 ## COG1249 Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide dehydrogenase (E3) component, and related enzymes Predicted protein(s) >gi|226332140|gb|ACIC01000180.1| GENE 1 332 - 631 242 99 aa, chain + ## HITS:1 COG:no KEGG:BT_1514 NR:ns ## KEGG: BT_1514 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 99 1 99 99 196 98.0 2e-49 MIKVEKEVSNDYIIKGFKHFTQLATEYFQGHPDCNAARKRMRDTIDANLQLKAELTAASY TEHTTLLSPKMQQIIWAHWGPPIISLSENHANESDNKRI >gi|226332140|gb|ACIC01000180.1| GENE 2 1075 - 1584 237 169 aa, chain + ## HITS:1 COG:no KEGG:BT_1515 NR:ns ## KEGG: BT_1515 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 140 1 140 212 265 95.0 4e-70 MQKMDCQAIGYGYIIIPKALLKEQLIDCSPHEGEIEAFLKLLVKVNYSETKLTDHQRRTI VCKRGESLHSYRSWSAIFHWPTSRTYRFIQQLKTNGIIEIIPHNDTTALHIRVVNYENWT NISTLLNEKKEKQHKKSATKSSASSGTTTTTSCSSPKRTSPRRSVSGKS >gi|226332140|gb|ACIC01000180.1| GENE 3 1699 - 3129 1320 476 aa, chain + ## HITS:1 COG:BS_dnaC KEGG:ns NR:ns ## COG: BS_dnaC COG0305 # Protein_GI_number: 16081096 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Bacillus subtilis # 6 400 6 398 454 263 38.0 4e-70 MNMNTDNRVSPQAPEIEEAILGACLIEQEAMPLVADTLRPEMFYSMRHQVIYAALLAMYQ AGMKIDILTVKEELAHRGKLEEAGGAFGITQLSSKVATSAHIEYHMQIVHEKYLRREMIL GLNKLLACSLDETMDIADTLIDAHNLLDRLEGEFGHNDHMRDMDTLMTDTMEEAEQRIAK SANGITGIPTGLTDLNRMTSGLQNGELIAIAARPSVGKTAFALHLARQAAMQRFAVAVYS LEMQGERLADRWLLSAGDINPYRWRTGIPNPHEVAEAHTAAAALARLPIHVDDSTSISMD HVRSSARLLKSKDACDLVIIDYLQLCDMSTGQKNRNREQEVAQASRKAKLLAKELNVPVV LLCQLNRESEGRPGGRPELFHLRESGAIEQDADVVMLLYRPALARLTTDRESGYPTDRTG 
GSHRSQTTQWRDRERILQPQPFSNENYGLYTTIGSITETSKIKTESKQYYITQYCL >gi|226332140|gb|ACIC01000180.1| GENE 4 3264 - 3776 571 170 aa, chain + ## HITS:1 COG:no KEGG:BT_1517 NR:ns ## KEGG: BT_1517 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 170 1 176 176 272 94.0 3e-72 MNVSVERYQRRKYVSQENSPMLYYVRQKSGTVRVMDVEKLADAIEANSSLTAGDVKHAIE AFVEQLRLSLTQGDKVKIDGLGTFHITLCSEGTEKEKDCTVRSIRKVNVRFVADKALHLV NASHTSTRSENNVEFVLASKGDTEGGNAGGGNAGGGGNSGGDEEAPDPTV >gi|226332140|gb|ACIC01000180.1| GENE 5 4128 - 4433 356 101 aa, chain + ## HITS:1 COG:no KEGG:BT_1518 NR:ns ## KEGG: BT_1518 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 101 1 101 101 159 100.0 4e-38 MLDKIIEIIMTILPFLGNRKKRKAMAQDVKEFSELVKDQYTFLMEQLEKVLKDYFDLSAR VKEMHTEIFSLREQLAQAATLQCVNKECLQRTQSEASISEA >gi|226332140|gb|ACIC01000180.1| GENE 6 4447 - 4893 322 148 aa, chain + ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 46 141 2 97 116 95 45.0 2e-20 MRTINLIVIHCSATREDKDFTEYDLDVCHRRRGFNGTGYHFYIRKNGDIKSTRPIEKVGA HAKGFNRESIGICYEGGLDSKGHPKDTRTEWQKHSLRVLVLTLLKDYPGCRVCGHRDLSP DRNENGEIEPEEWVKACPCFDAENNWKA >gi|226332140|gb|ACIC01000180.1| GENE 7 5032 - 5358 154 108 aa, chain + ## HITS:1 COG:MA2121 KEGG:ns NR:ns ## COG: MA2121 COG2865 # Protein_GI_number: 20090964 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Methanosarcina acetivorans str.C2A # 53 105 397 449 458 56 47.0 9e-09 MLSLQDAVNSGIFIDFMLQEIYETLKKRQGDPIVTMKATKDVGINVGINEQKVLELLRKN NQITAKEIAGLLGISLRHSERLITSLKQKGMIQRVGSNKNGYWEIIVK >gi|226332140|gb|ACIC01000180.1| GENE 8 5361 - 6347 644 328 aa, chain - ## HITS:1 COG:no KEGG:BT_1521 NR:ns ## KEGG: BT_1521 # Name: not_defined 
# Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 20 328 1 309 309 541 98.0 1e-153 MKNKYRNIFLAFGIVAVLIMIFTFDMDYQELWANLKRAGVYLPLVLLLWLFVYLINTTSW YIIIRSGGKPGFSFARLYKFTVTGFALNYVTPVGLMGGEPYRIMELKPYIGVERATSSVI LYVMMHIFSHFCFWLSSVLLYLCLYPVGWVMGTILGAVTVFCLLVAVLFVKGYQHGMAVA FVRLGSHLPFLKKKVIRFADSHQEQLENIDKQIALLHQQKKGAFYAALFLEYTARVVSCL EIWLILNVLTRSVSFADCCLIAAFSSLLANLLFFLPMQLGGREGGFALAVGGLSLSGAYG VYAALITRVREMVWIVIGLALMKVGNKK >gi|226332140|gb|ACIC01000180.1| GENE 9 6347 - 7282 714 311 aa, chain - ## HITS:1 COG:no KEGG:BT_1522 NR:ns ## KEGG: BT_1522 # Name: not_defined # Def: putative aureobasidin A resistance protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 311 1 305 305 563 99.0 1e-159 MIKTIQMPSKKETLTVIVIMALFLLLTAACIGLRSEHLLMAALYLVLFFAGLPTRKLAVA LLPFAIFGISYDWMRICPNYEVNPIDVAGLYNLEKSLFGVMDNGVLVTPCEYFAVHHWAV ADVFAGIFYLCWVPVPILFGLCLYFKKERKTYLRFALVFLFVNLIGFAGYYIHPAAPPWY AINYGFEPILNTPGNVAGLGRFDEIFGVTIFDSIYGRNANVFAAVPSLHAAYMVVALVYA IIGKCRWYVIALFSVIMAGIWGTAIYSCHHYIIDVLLGISCALLGWLLFEYGLMKIRGFR NFFDRYYQYIK >gi|226332140|gb|ACIC01000180.1| GENE 10 7300 - 7968 768 222 aa, chain - ## HITS:1 COG:MA0525 KEGG:ns NR:ns ## COG: MA0525 COG0558 # Protein_GI_number: 20089414 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphate synthase # Organism: Methanosarcina acetivorans str.C2A # 25 216 9 185 194 92 34.0 4e-19 MNYRDYLQQVIYKIINPLIHGMIKIGITPNFITTTGFILNVVAAGMFVYAGIYSGQNDLS ILGWAGGVILFAGLFDMMDGRVARLGNMSSKFGALYDSVLDRYSELMTFFGICYYLSVKD YFIYAIIAYFALIGSLMVSYVRARAEGLGIECKVGFMQRPERVVLTSLGALFCGVFKDIT AFDPMLILMVPLTIVAVFANITAFARVRHCYKVMTANEKKND >gi|226332140|gb|ACIC01000180.1| GENE 11 8008 - 8472 257 154 aa, chain - ## HITS:1 COG:no KEGG:BT_1524 NR:ns ## KEGG: BT_1524 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 154 1 154 154 270 100.0 1e-71 MISQKGGVFMFLRAQLSAQMATIADFLVTILLVRLFDVYYVYATLAGAIYGGIINCIINY KWTFKSKGKKTHVAVKFIIVWVCSIWLNTWGTYTLTESLAKIPWVRDTLSQYFGDFFIIP 
KVVVAVIVALFWNYNMQRFFVYRNIDIRSLFKRN >gi|226332140|gb|ACIC01000180.1| GENE 12 8487 - 8966 338 159 aa, chain - ## HITS:1 COG:YPO3179 KEGG:ns NR:ns ## COG: YPO3179 COG1267 # Protein_GI_number: 16123341 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphatase A and related proteins # Organism: Yersinia pestis # 10 153 16 154 161 81 41.0 6e-16 MKRPSFLPVLIGTGFGSGFSPFAPGTAGALLASIIWIALYFLLPFTALLWTTAALVVLFT FAGIWAANKLESSWGEDPSRVVVDEMVGVWIPLLAVPDNGQWYWYVIAAFALFRIFDIVK PLGVRKMESFKGGVGVMMDDVLAGVYSFILIAVARWVIG >gi|226332140|gb|ACIC01000180.1| GENE 13 9115 - 10404 1452 429 aa, chain - ## HITS:1 COG:YJL153c KEGG:ns NR:ns ## COG: YJL153c COG1260 # Protein_GI_number: 6322308 # Func_class: I Lipid transport and metabolism # Function: Myo-inositol-1-phosphate synthase # Organism: Saccharomyces cerevisiae # 11 421 87 541 555 185 29.0 1e-46 MKQEIKPATGRLGVLVVGVGGAVATTMIVGTLASRKGLAKPIGSITQLATMRMENNEEKL IKDVVPLTDLNDIVFGGWDIFPDNAYEAAMYAEVLKEKDLNGVKDELEAIKPMPAAFDHN WAKRLNGTHIKKAATRWEMVEQLRQDIRDFKAANNCERVVVLWAASTEIYIPLSDEHMSL AALEKAMKDNNTEVISPSMCYAYAAIAEDAPFVMGAPNLCVDTPAMWEFSKQKNVPISGK DFKSGQTLMKTVLAPMFKTRMLGVNGWFSTNILGNRDGEVLDDPDNFKTKEVSKLSVIDT IFEPEKYPDLYGDVYHKVRINYYPPRKDNKEAWDNIDIFGWMGYPMEIKVNFLCRDSILA APIALDLVLFSDLAMRAGMCGIQTWLSFFCKSPMHDFEHQPEHDLFTQWRMVKQTLRNMV GEKEPDYLA >gi|226332140|gb|ACIC01000180.1| GENE 14 10513 - 13065 2246 850 aa, chain - ## HITS:1 COG:no KEGG:BT_1527 NR:ns ## KEGG: BT_1527 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 850 1 850 850 1701 99.0 0 MRLIIKGAVALFLLLAISIPLSAQYIVQGVVTDSLTREPLSYVSVRLKNTTEGTTTGSDG RFYFKTHRSEGVLVISTVGYNEYSRLIHPARNLSYKVALSPATYALNEVLVKPKREHYRK KDNPAVEFVRRMIEHRDDHSPNEKDFWQRDRYEKTTFALNNFDEEKQKKWLYRKFDFLTE YVDTSAVTGKPILTVSARELLATDYYRKSPHSEKQWVKGRKQAGVDEFLSKQGMQAAINE VFKDVDIYENNISLFTNKFVSPLSRIGTGFYKYYLMDTLQIGGETCADLAFTPFNSESFG FNGHLYVTLDSTYFVKRAVLNFPKKINLNFVDYMLLEQEFKRAEDGTRLLDHESITVEFK LTEGQDGIFARRVADYSRYSFLPTEEADKAFTKPERIIEETEALSRPETFWAENRPQAAI 
SQQENSVDRLMAQLRGYPVYYWTEKVLSILFTGYIPTSKEAPLFYIGPMNATISGNTLEG PRIRAGGMTTAWLNPHLFGKGYIAYGFKDERLKGLAEVEYSFKKKKEYANEFPIHSLKVR YESDVNQYGQNYLYTSKDNVFLALKREKDDRIGYYRQAEMTYTNEFYSGFSFQLTARRRT DESSYLIPFLRKDGEVYSPVKDFSTSAAELKLRYAPNEKFFQTQWNRFPVSLDAPVFTLS HTIAGKGVLGSDYTYNHTEAGVQKRFWFSAFGYTDIILKAGKVWDKVPFPLLIMPNANLS YTIQPESYSLMNAMEFMNDEYFSWDVTYFLNGWLFNRIPLLKKLKWREIVSCRGLYGHLS DKNNPDMTPGLFAFPIASTRTMGKTPYVEAGVGIENIFKVLRLDYVWRLTYRDSPDIDKS GLRISLHMTF >gi|226332140|gb|ACIC01000180.1| GENE 15 13173 - 14465 1117 430 aa, chain - ## HITS:1 COG:BH1920 KEGG:ns NR:ns ## COG: BH1920 COG0642 # Protein_GI_number: 15614483 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 104 430 188 537 548 107 28.0 3e-23 MRIKGFFGILVFLLLVLGGGLLFLSSRLNMIYFYIGEGLVLFILCYLPFFYRKIVKPLNS IGSGMELLREQDFSSRLSPVGQYEADRIVNVFNRMMEQLKNERLRLREQNNFLDLLIKAS PMGVILTTLDEDLSELNPMAQKMLGVRQEDVLGKKMNEIDSPLAAELANVPKGETATVRL NDSNIYRCTHSSFIDRGFQHPFFLIESLTDEVMKAEKKAYEKVIRMIAHEVNNTTAGITS TLDTVEQALSTEEGMDDICDVMRVCTERCFSMSRFITRFADVVKIPEPTLTPVDLNDLAF TCKRFMEGMCTDRNIKLRLEIDETLKEVKMDASLFEQVLVNIIKNAAESIEKDGEIIVRT LSPAIVEVVDNGKGISKEVEAKLFSPFFSTKPNGQGIGLIFIREVLMRHGCTFSLRTYAD GLTRFRILFP >gi|226332140|gb|ACIC01000180.1| GENE 16 14487 - 15836 1322 449 aa, chain - ## HITS:1 COG:atoC KEGG:ns NR:ns ## COG: atoC COG2204 # Protein_GI_number: 16130157 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 2 447 7 456 461 339 41.0 8e-93 MILIIDDDSAVRSSLSFMLKRAGYEAQTVSGPREAMEVVRSVAPDLILMDMNFTLSTTGE EGLTLLRQVKIFRPETPVILMTAWGSIQLAVQGMQAGAFDFITKPWNNAALLQRIETALE LAGSPKETTQEQNDAFDRRHIIGRSQGLTEVLNTIARIAKTNASVLITGESGTGKELIAE AIHINSQRAKQPFVKVNLGGISQSLFESEMFGHKKGAFTDASSDRIGRFEMANKGTIFLD EIGDLDPSCQVKLLRVLQDQTFEVLGDSRPRKTDIRVVSATNADLRKMVGERTFREDLFY RINLITVRLPALRERREDIPLLARHFADRQAEANGFPRTEFSADALQFLSRLPYPGNIRE 
LKNLVERTILVNGKPTLDAADFDAQYIRHEDQKVAEGASFAGMTLDEIERQTILQALERH KGNLSQVATALGISRAALYRRLEKFGISV >gi|226332140|gb|ACIC01000180.1| GENE 17 15905 - 17383 1338 492 aa, chain - ## HITS:1 COG:no KEGG:BT_1530 NR:ns ## KEGG: BT_1530 # Name: not_defined # Def: putative outer membrane protein OprM precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 492 1 492 492 903 100.0 0 MKKKSILFVLAAICLPLAVSAQKEREITLNEAIAMARMQSVDAAVALNELKTSYWEYRTF RADLLPEINLKGTLPNYNKSYGSYQNPDGSYSFVRNSALGLSGELSIDQNIWLTGGQLSL TSSLDYLKQLGNSGDRHLMSVPVTLKLTQPILGVNNVKWNRRIEPVRYAEAKAEFITATE EVTMRAITYYFDLLLAKETLGTARQNLTNANQLYEVAIAKRKMGQISENELLQLKLSALN AKAALTEAESDLNAKMFQLRAFLGVGEDEILRPVVPESVDCGKMEYNMVLNKALERNSFA QNIRRRQLEADYAVATARGNLRSINLFASVGYTGESRELSSVYRNLQDNQIVQVGVQIPI LDWGKRRGKVRVAKSNRDVVLSKIRQEQINFNQDIFLLVEHFNNQAQQLEIAKEADGIAQ QRYKTSIETFLIGKINTLDLNDAQNAKDEARRKHISELYYYWYYFYQIRSLTLWDFQANT ELEADFDEIVRQ >gi|226332140|gb|ACIC01000180.1| GENE 18 17515 - 20154 1713 879 aa, chain - ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 622 875 52 311 328 162 39.0 3e-39 MLKTMLVFLLMNRQYYCLFVVLFFVVFARAAAAEHTYNILFIQSYTSQTPWHNDLNEGVA KGFKENGLTVNITTEYLDADFWSFSSEKVIMQRFCQRARDRKTDMIITASDEAFYTLFSS GDSLPYQVPVVFFGIKYPDKELIASLPNVCGFTANPDFYVILKQAQKVFPQRKEVICVID NSFLSNKGREDFKQEWQLFQEENPDYKMKLYNTQHESTNHIIAAICYPRNSYARVVVAPK WSPFLSFVGKNSKAPVFSSQNVGLMNGVFAAYDADAYTSAMSAAKKASRVLKGESPLKLG VTETEQGFIYDYKQLDFFHIDPDKVSSSGSVVNQPYWERYKLLFILLYPSILALLIVSIV WLIRVNRREAKRRIHAQTRLLVQNKLVEQRNEFDNIFHSIRDGVITYDTDLHIHFTNRSL LKILHLPSESGGRNYEGMMAGSIFKIYYNGQDILYKILKKVGETGQSVIIPDGAFMKEVH SEHYFPVSGEVVPIHSGGQITGMALSARNVSDEEMQKRFFDMAVEESSIYPWQFDVESNS FIFPQGFLVRFGYDESITSITRDEMDRMVYPEDLKEMRILFDKTLAGKETNTRLNFRQRN ANGEYEWWEYRSSVISGLTRDSLYNILGVCQSIQRYKTAEQEMREARDKALQADKLKSAF LANMSHEIRTPLNAIVGFSDLLSDTSGFTEEEIGQFIATINKNCGLLLALINDILDLSRI 
ESGTMEFMFANHNLPLLLKTVHDSQQLNMPPRVELLLRMPDNEKKYLVTDNVRLQQVVNN LINNAAKFTTYGSITFGYEEDEDPEYTRIFVEDTGVGISEEGIRHIFERFYKVDNFTQGA GLGLSICQTIVERLRGTIFVTSEVGRGTRFTVRIPNFCE >gi|226332140|gb|ACIC01000180.1| GENE 19 20210 - 21463 979 417 aa, chain - ## HITS:1 COG:no KEGG:BT_1532 NR:ns ## KEGG: BT_1532 # Name: not_defined # Def: ABC transporter permease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 417 1 417 417 822 100.0 0 MQLLKQIWNERRSNGWLWAELLIVFVVLWYIVDWTYATARTYYEPLGYDITNTYYLELSL KNNKSDSYIPRGQKETTTGQDIVELTNRLRRLPEVEAVSISVNARPYIGSNSGIRFRLDT LVRSPLRRPVTPEFFQVFRYQSADGQGYQPLMQAIKNGNAVLSENIWPDGYKGDRTLLGK EVVNVDDSTDVYKIAAVSTKVRYNDFWPNYSDRYMAIPLTEAQLAEINDDLYPLNTEVCL RVKPNTAPGFPERLMEQSDTQFSVGNLFILKVHDYEMIREDYNRTNVNQMQTRFWMMGFL LVNILLGIVGTFWFRTQHRRGELGLRVAVGSTRMGLWHRLNKEGLLLLALAALPAAIICY NIGYLELTEGGIEWSIIRFLITFCVTSLLMALMILIGIWFPARQAIRIQPAEALREE >gi|226332140|gb|ACIC01000180.1| GENE 20 21496 - 22785 793 429 aa, chain - ## HITS:1 COG:BS_yknZ KEGG:ns NR:ns ## COG: BS_yknZ COG0577 # Protein_GI_number: 16078501 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Bacillus subtilis # 99 347 100 331 397 71 24.0 4e-12 MIKLYFKQAFHLLKENKLLSSISIIGTALAIAMIMVIVITLRATIAPFAPESYRDRMLIF RYAGFQYKTNENWQSNGPVSYKTAKACFKEMAAPEAVTITSSFSETMLAAKPAGEKVSCS VLQVDDDFWKVFKFDFLSGYPFDKATFDAGATKAVISEDIARQLFGTSDVVGKTFLLNHS AYMICGVVRPVSKLARYAYAQIWIPLSSTDAFTASWGEYGIMGMVSVYILAKSQDDFPAI RMEAERLRDRYMEGYPDYELLYRDQPDTYFVAAQRYSANNPPAVKQAVRQYIITLIILLI VPAVNLSGLTLSRMRKRLSEIGVRKAFGAPRRELMIQVLSENMLYSLLGGVLGLILSYGA TFFLGSMLFSIDFMGNGVTDLRTMCMDLLFDPVVFLLAFFACFALNLLSAAIPAWRVTRT NIVDAINER >gi|226332140|gb|ACIC01000180.1| GENE 21 22823 - 23449 563 208 aa, chain - ## HITS:1 COG:no KEGG:BT_1534 NR:ns ## KEGG: BT_1534 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 208 1 208 208 395 100.0 1e-109 MTKNIFITAVAVLLSVALCPLWAQNVSEDCKVGEFSAINLRSVGNIIFTQSDTYSCRMEG 
PFEYVDKTKVTVKGETLVIEYKQKNVKNVKNLTFYITAPDLKSVKIDGVGNFNAKETLKL KNVTFKLDGVGNCEVKKLHCDELELMVDGVGNMNVNVNCNTVKAKVDGVGNIKLSGKADR ASLKRDGVGRINHKGLKCPDVSTKGWGF >gi|226332140|gb|ACIC01000180.1| GENE 22 23479 - 24144 324 221 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 201 4 200 223 129 36 2e-29 MITLTSLSKIYRTDEIETVALENVNLTVDRGEFLSIMGPSGCGKSTLLNIMGLLDAPTTG TVEINGTHTEGMKDKQLAAFRNKTLGFVFQSFHLINSLNVMDNVELPLLYRRISSGERKC LAKEVLEKVGLSHRMNHFPTQLSGGQCQRVAIARAIIGHPEIILADEPTGNLDSKMGAEV MELLHRLNKEDGRTIVMVTHNEEQAKQTSRTIRFFDGRQVQ >gi|226332140|gb|ACIC01000180.1| GENE 23 24173 - 25057 966 294 aa, chain - ## HITS:1 COG:no KEGG:BT_1536 NR:ns ## KEGG: BT_1536 # Name: not_defined # Def: ABC transporter permease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 294 122 415 415 499 99.0 1e-140 MRRYKLDQLRVNNQTKLSDMAMQIKVSAMKLSRMKVELRNEHYLDSLGAGTTDKVRQAEL SYNVAQLEYEQLQQQYKNEKEVAAAELKVQELDFNIFRKSLAEMKRTLDDAQIRSPRKAI LTYINNQIGAQISQGGQVAIISDLSHFKVEGEIADTYGDRVAAGGKAVVKIGNDKLEGTV SSVTPLSKNGVISFTVQLKDDNNHRLRSGLKTDVYVMNAVKEDVMRMANASFYVGRGEYD LFVRTSDDELVKRKVQLGDSNFEYVEVVSGLQPGDVVVVSDMSSYKNKNKLKMR >gi|226332140|gb|ACIC01000180.1| GENE 24 25068 - 25421 393 117 aa, chain - ## HITS:1 COG:no KEGG:BT_1536 NR:ns ## KEGG: BT_1536 # Name: not_defined # Def: ABC transporter permease # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 116 1 116 415 202 100.0 4e-51 MDREIPKEVRNKERNKKIIRYSSIGVAGIVVIGVLISVMRTGVEKKDLVLSTVDKGMIEV SVSASGKVVPAFEEIINSPINSRIVEVYRKGGDSVEIGTPILKLDLQSTETEYKKLA >gi|226332140|gb|ACIC01000180.1| GENE 25 25665 - 27056 1273 463 aa, chain - ## HITS:1 COG:all2964 KEGG:ns NR:ns ## COG: all2964 COG1252 # Protein_GI_number: 17230456 # Func_class: C Energy production and conversion # Function: NADH dehydrogenase, FAD-containing subunit # Organism: Nostoc sp. 
PCC 7120 # 11 423 5 424 442 275 36.0 1e-73 MSLNIAKDSKKRVVIVGGGFGGLKLANKLKKSGFQVVLIDKNNYHQFPPLIYQVASAGME PTSISFPFRKIFQHRKDFYFRMAEVRAVFPEKNMIQTSIGKAEYDYLVLAAGTTTNFFGN KHIEEEAMPMKNVSEAMGLRNALLANLERAVTCSNKQEQQELLNIVVVGGGATGVEVAGV LSEMKKFVLPNDYPDMPSSLMHIYLIEAGPRLLAGMSEDSSSHAEQFLREMGVNILLNKR VTDYRDHKVTLEDGSEIATRTFIWVSGVTGVSFGNMNTSVIGRGGRIKVDAFNRVEGTNN VFAIGDLCILSGDEDYPNGHPQLAQVAIQQGELLAKNLIRLEKGKEMKPFHYRNLGSMAT VGRNRAVAEFSKVKMQGWFAWVMWLVVHLRSILGVRNKVVVLLNWVWNYFTYDQSMRMIV YARKAKEIRDREAIEATTHWGKDLMQEPQPAKQEPQPNSKEIV >gi|226332140|gb|ACIC01000180.1| GENE 26 27111 - 28016 875 301 aa, chain - ## HITS:1 COG:lin1064_1 KEGG:ns NR:ns ## COG: lin1064_1 COG1705 # Protein_GI_number: 16800133 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Muramidase (flagellum-specific) # Organism: Listeria innocua # 33 171 54 201 201 90 42.0 5e-18 MENKLYRLVFLTAIIFFAVGVQAQRRNARYVEYINKYSDLAVEQMKLHKIPASITLAQGL LESGAGNSQLARKSNNHFGIKCGGSWRGRSVRHDDDARNECFRAYKHPRESYEDHSDFLK RGARYAFLFKLDITDYKGWARGLKKAGYATDPSYANRLITIIEDYDLYKYDSKGVYSERK LRKNPWLLSPHPVYIANDIAYIVARSGDTFKDLGKEFDISWKKLVKYNDLQRDYTLVAGD IIYLKSKKKKASKPHTVYIVKDGDSMHAISQKYGIRLKNLYKMNRKDGEYIPEVGDRLRL R >gi|226332140|gb|ACIC01000180.1| GENE 27 28128 - 28610 544 160 aa, chain + ## HITS:1 COG:SP0844 KEGG:ns NR:ns ## COG: SP0844 COG0295 # Protein_GI_number: 15900731 # Func_class: F Nucleotide transport and metabolism # Function: Cytidine deaminase # Organism: Streptococcus pneumoniae TIGR4 # 21 153 2 125 129 90 38.0 1e-18 MKDLTITAIIKVYQYDELNEADRSLMKTAMEATARSYAPYSHFSVGAAALLNNGTIVTGT NQENAAYPSGLCAERTTLFYANSQYPDQPVNTLAIAARTEKDFIETPIPPCGACRQVILE TEKRYGQPIRILLYGRECIYEVKSIGDLLPLSFDASAMEE >gi|226332140|gb|ACIC01000180.1| GENE 28 28614 - 29453 785 279 aa, chain + ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 116 272 130 285 288 84 31.0 
2e-16 MLQQYHTKLKGTLALTDSYLTEKALQKEKGLYKFIWVRQGSITVEIDHQEMVLSKDDVIS LTHLQHLEFKSIDGEYMTLLFNSNFYCIYGNDHEVSCSGFLFNGSSHLVRFTLNEAERRQ LDQITEAMENEFTVSDSLQEEMLRILLKRFIIQCTRIARQRLDITREKESGFEIVRQYYN LVDEHYRTKKQVQDYADMLHKSPKTLSNIFSTCKLPSPLRVIHERVEAEAKRLLLYSNKS AKEIADILGFEDQASFSRFFKNMTGQSAVQFRNTEVGKN >gi|226332140|gb|ACIC01000180.1| GENE 29 29542 - 30123 612 193 aa, chain + ## HITS:1 COG:HI0219 KEGG:ns NR:ns ## COG: HI0219 COG3059 # Protein_GI_number: 16272180 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Haemophilus influenzae # 20 191 27 213 213 175 50.0 4e-44 MKEKFIALLTFTSSLKNFGIKFIRVAILVVFVWIGGLKYFHYEADGIVPFVANSPFMSFF YAKDAPEYKDHKNPEGAFVPENRAWHEANRTYTFSYGLGALIMSIGILVFLGIFFPKVAL VGDTLAIIMTLGTLSFLVTTPEVWVPNLGGDEFGFPLLSGAGRLVIKDIVILAGAVVLLS DSSQRVLKTLKKS >gi|226332140|gb|ACIC01000180.1| GENE 30 30227 - 30728 288 167 aa, chain + ## HITS:1 COG:ykgC KEGG:ns NR:ns ## COG: ykgC COG1249 # Protein_GI_number: 16128289 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide dehydrogenase (E3) component, and related enzymes # Organism: Escherichia coli K12 # 1 166 10 163 450 159 45.0 3e-39 MKQYDAIIIGFGKAGKTLAAELSNRGWQVAVVEQSDEMYGGACPNVACIPTKTLIHESEI STLLYHNDFDKQSNMYRQAIARKNKLTSFLRENNYEKLSKRPNVTIYTGKGSLVSANTVK VALPEEEIELQGKEIFINTGSTPIIPNIEGIQQSRNVYTSTTLLELD Prediction of potential genes in microbial genomes Time: Thu May 12 03:48:03 2011 Seq name: gi|226332139|gb|ACIC01000181.1| Bacteroides sp. 1_1_6 cont1.181, whole genome shotgun sequence Length of sequence - 95301 bp Number of predicted genes - 80, with homology - 76 Number of transcription units - 48, operones - 18 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 227 - 260 4.0 1 1 Tu 1 . - CDS 385 - 909 207 ## BT_2429 hypothetical protein - Prom 961 - 1020 5.8 + Prom 960 - 1019 3.3 2 2 Tu 1 . + CDS 1070 - 2278 1118 ## COG5026 Hexokinase - Term 2409 - 2452 -0.7 3 3 Op 1 . 
- CDS 2485 - 5418 2082 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 5442 - 5501 4.2 - Term 5435 - 5483 3.1 4 3 Op 2 . - CDS 5508 - 6848 1197 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains - Prom 6884 - 6943 3.1 + Prom 6861 - 6920 2.8 5 4 Op 1 . + CDS 6976 - 9456 2151 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 6 4 Op 2 . + CDS 9487 - 9855 404 ## BT_2435 MarR family transcriptional regulator + Term 9902 - 9953 8.2 - Term 9886 - 9944 8.5 7 5 Tu 1 . - CDS 9953 - 11257 947 ## BT_2436 putative secreted tripeptidyl aminopeptidase - Prom 11371 - 11430 8.9 + Prom 11240 - 11299 4.1 8 6 Op 1 . + CDS 11391 - 12023 699 ## BT_2437 hypothetical protein 9 6 Op 2 . + CDS 12039 - 12665 697 ## BT_2438 hypothetical protein + Term 12697 - 12743 8.4 - Term 12685 - 12731 11.6 10 7 Op 1 . - CDS 12737 - 15742 2629 ## COG1472 Beta-glucosidase-related glycosidases 11 7 Op 2 1/0.000 - CDS 15739 - 16626 291 ## PROTEIN SUPPORTED gi|145635642|ref|ZP_01791339.1| 30S ribosomal protein S16 12 7 Op 3 . - CDS 16662 - 17465 761 ## COG0737 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases - Prom 17582 - 17641 6.8 - Term 17731 - 17770 1.2 13 8 Tu 1 . - CDS 17940 - 19034 770 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins - Prom 19067 - 19126 2.6 - Term 19068 - 19113 10.7 14 9 Tu 1 . - CDS 19153 - 22044 2506 ## BT_2444 hypothetical protein - Prom 22087 - 22146 4.5 15 10 Tu 1 . - CDS 22416 - 22835 138 ## BT_2445 hypothetical protein + Prom 22624 - 22683 5.2 16 11 Op 1 . + CDS 22900 - 24129 946 ## COG4974 Site-specific recombinase XerD 17 11 Op 2 . + CDS 24149 - 25360 697 ## BT_2447 integrase - Term 25366 - 25422 15.0 18 12 Op 1 . - CDS 25445 - 25738 264 ## BT_2448 hypothetical protein - Prom 25843 - 25902 3.3 19 12 Op 2 . - CDS 25954 - 26073 92 ## gi|253572437|ref|ZP_04849839.1| conserved hypothetical protein - Prom 26096 - 26155 3.3 20 13 Op 1 . 
+ CDS 27429 - 28268 248 ## BT_2450 hypothetical protein 21 13 Op 2 . + CDS 28301 - 29581 340 ## BT_2451 putative pyrogenic exotoxin B 22 13 Op 3 . + CDS 29587 - 29982 268 ## BT_2452 hypothetical protein + Term 29984 - 30031 9.0 23 14 Tu 1 . + CDS 30333 - 30590 179 ## BT_2454 hypothetical protein + Term 30812 - 30842 -0.5 24 15 Tu 1 . - CDS 30690 - 31208 248 ## COG3344 Retron-type reverse transcriptase - Prom 31305 - 31364 4.0 + Prom 31173 - 31232 5.7 25 16 Tu 1 . + CDS 31293 - 31442 92 ## 26 17 Tu 1 . - CDS 31382 - 31870 235 ## COG3344 Retron-type reverse transcriptase - Prom 32031 - 32090 3.6 27 18 Op 1 . - CDS 32755 - 33951 715 ## BT_2457 putative purple acid phosphatase 28 18 Op 2 . - CDS 34016 - 35071 685 ## BT_2458 putative pyridine nucleotide-disulphide oxidoreductase 29 18 Op 3 . - CDS 35052 - 35897 612 ## BT_2458 putative pyridine nucleotide-disulphide oxidoreductase 30 18 Op 4 . - CDS 35913 - 37880 958 ## COG3525 N-acetyl-beta-hexosaminidase 31 18 Op 5 . - CDS 37953 - 39671 1265 ## BT_2460 hypothetical protein 32 18 Op 6 . - CDS 39683 - 43087 2554 ## BT_2461 hypothetical protein - Prom 43127 - 43186 2.6 33 19 Op 1 6/0.000 - CDS 43248 - 44183 507 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 44215 - 44274 4.4 34 19 Op 2 . - CDS 44286 - 44807 235 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 44889 - 44948 6.8 35 20 Tu 1 . - CDS 45068 - 45232 169 ## - Prom 45349 - 45408 3.9 + Prom 46097 - 46156 3.6 36 21 Tu 1 . + CDS 46208 - 47122 718 ## BT_2465 hypothetical protein + Term 47360 - 47386 -1.0 37 22 Tu 1 . - CDS 47405 - 47605 56 ## 38 23 Tu 1 . + CDS 47848 - 48258 250 ## BT_2467 hypothetical protein + Prom 48326 - 48385 3.8 39 24 Op 1 . + CDS 48407 - 48526 150 ## 40 24 Op 2 . + CDS 48527 - 48727 137 ## PRU_1111 hypothetical protein 41 24 Op 3 . + CDS 48739 - 49032 219 ## BT_2468 hypothetical protein + Term 49052 - 49108 13.6 - Term 49043 - 49092 4.2 42 25 Tu 1 . 
- CDS 49191 - 49451 149 ## BT_2469 integrase - Prom 49482 - 49541 4.6 + Prom 49837 - 49896 9.0 43 26 Op 1 . + CDS 49973 - 50182 127 ## BT_2470 hypothetical protein 44 26 Op 2 . + CDS 50189 - 50407 65 ## BT_2471 hypothetical protein - Term 50557 - 50590 -0.3 45 27 Tu 1 . - CDS 50653 - 50826 261 ## BT_2472 hypothetical protein - Prom 50857 - 50916 5.8 - Term 51293 - 51338 6.1 46 28 Tu 1 . - CDS 51430 - 53277 947 ## BT_2473 hypothetical protein 47 29 Op 1 . - CDS 53602 - 55662 1043 ## COG0755 ABC-type transport system involved in cytochrome c biogenesis, permease component 48 29 Op 2 . - CDS 55649 - 56803 1131 ## COG0251 Putative translation initiation inhibitor, yjgF family 49 29 Op 3 . - CDS 56816 - 57550 429 ## BT_2476 hypothetical protein 50 29 Op 4 . - CDS 57570 - 58781 1143 ## BT_2477 hypothetical protein 51 29 Op 5 . - CDS 58827 - 60260 1286 ## COG3488 Predicted thiol oxidoreductase - Prom 60326 - 60385 3.1 52 30 Op 1 . - CDS 60389 - 61507 1073 ## BT_2479 iron-regulated protein A precursor 53 30 Op 2 . - CDS 61548 - 62918 1260 ## BT_2480 hypothetical protein - Prom 62938 - 62997 5.4 54 31 Tu 1 . - CDS 63033 - 63341 184 ## BT_2481 hypothetical protein - Prom 63507 - 63566 6.7 + Prom 63371 - 63430 5.5 55 32 Tu 1 . + CDS 63450 - 63839 328 ## BT_2482 transcriptional regulator + Prom 64907 - 64966 2.5 56 33 Tu 1 . + CDS 65062 - 65253 82 ## BT_2483 hypothetical protein 57 34 Tu 1 . - CDS 65250 - 66026 434 ## BT_2484 hypothetical protein - Prom 66198 - 66257 8.1 - Term 66351 - 66390 1.2 58 35 Op 1 . - CDS 66481 - 67575 837 ## BT_2485 major outer membrane protein OmpA - Prom 67599 - 67658 4.1 - Term 67600 - 67643 8.4 59 35 Op 2 . - CDS 67666 - 70716 2497 ## BT_2486 hypothetical protein - Term 71022 - 71080 -0.6 60 36 Tu 1 . - CDS 71085 - 72032 575 ## BT_2488 integrase + Prom 72245 - 72304 5.9 61 37 Tu 1 . + CDS 72334 - 72687 585 ## PROTEIN SUPPORTED gi|29347899|ref|NP_811402.1| 50S ribosomal protein L19 + Term 72702 - 72769 22.1 62 38 Op 1 . 
- CDS 73460 - 74041 354 ## BT_2491 hypothetical protein 63 38 Op 2 . - CDS 74056 - 74526 292 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 74553 - 74612 3.7 - Term 74625 - 74667 7.5 64 39 Tu 1 . - CDS 74697 - 75677 498 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase - Prom 75697 - 75756 9.9 + Prom 75761 - 75820 6.5 65 40 Op 1 36/0.000 + CDS 75856 - 76572 351 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Prom 76583 - 76642 5.7 66 40 Op 2 10/0.000 + CDS 76686 - 77945 1205 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 67 40 Op 3 13/0.000 + CDS 77951 - 79192 308 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 68 40 Op 4 13/0.000 + CDS 79262 - 80362 1140 ## COG0845 Membrane-fusion protein 69 40 Op 5 . + CDS 80373 - 81710 1197 ## COG1538 Outer membrane protein + Term 81738 - 81794 14.5 + Prom 81750 - 81809 6.1 70 41 Tu 1 . + CDS 81887 - 82915 943 ## COG0229 Conserved domain frequently associated with peptide methionine sulfoxide reductase + Term 82955 - 83009 3.0 - Term 82944 - 82996 12.2 71 42 Tu 1 . - CDS 83047 - 83565 569 ## BT_2500 hypothetical protein - Prom 83810 - 83869 5.8 72 43 Tu 1 . + CDS 83880 - 85829 1379 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) + Term 85886 - 85932 -0.7 - Term 85706 - 85753 12.8 73 44 Op 1 . - CDS 85852 - 87051 292 ## BT_2502 hypothetical protein 74 44 Op 2 . - CDS 87066 - 88082 460 ## BT_2503 hypothetical protein 75 44 Op 3 . - CDS 88115 - 88795 642 ## BT_2504 hypothetical protein - Term 88829 - 88867 5.2 76 45 Tu 1 . - CDS 88909 - 89748 226 ## PROTEIN SUPPORTED gi|212640476|ref|YP_002316996.1| Uncharacterized protein conserved in bacteria containing two ribosomal protein S1-like RNA-binding domains - Prom 89771 - 89830 2.3 + Prom 89720 - 89779 7.0 77 46 Tu 1 . 
+ CDS 89934 - 90935 1093 ## COG0039 Malate/lactate dehydrogenases + Term 90958 - 91009 11.5 - Term 90946 - 90997 11.4 78 47 Op 1 . - CDS 91035 - 91454 277 ## BT_2511 putative transcription regulator 79 47 Op 2 . - CDS 91461 - 93413 1670 ## COG2217 Cation transport ATPase + Prom 93496 - 93555 5.0 80 48 Tu 1 . + CDS 93591 - 95030 846 ## COG3177 Uncharacterized conserved protein Predicted protein(s) >gi|226332139|gb|ACIC01000181.1| GENE 1 385 - 909 207 174 aa, chain - ## HITS:1 COG:no KEGG:BT_2429 NR:ns ## KEGG: BT_2429 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 174 1 174 174 276 95.0 2e-73 MKERSPVKKNILKLIPFIEIIAGIIFLCLEIHDYRTLYSMAEADAMYGGLADFCKYKENV FCPAFFWIIVILTGISYWKNKKLYWILTQALVFMVFFRAMFPFYAIFIELNPVVFYILPL LYSTVFFVVERKLFKIKQIDSVDITVKTKIMGIGIGILCTIIYFLLFLTVFCLI >gi|226332139|gb|ACIC01000181.1| GENE 2 1070 - 2278 1118 402 aa, chain + ## HITS:1 COG:SPAC4F8.07c KEGG:ns NR:ns ## COG: SPAC4F8.07c COG5026 # Protein_GI_number: 19114777 # Func_class: G Carbohydrate transport and metabolism # Function: Hexokinase # Organism: Schizosaccharomyces pombe # 4 278 14 299 455 123 32.0 7e-28 MEKNIFKLDNEQLKAIVCSFRDKTEEGLKTENAEIQCIPTFIAPKTTHIKGKSLVLDLGG TNYRVAIVDFDKATPTVHPNNGWKKDMSIMKSVGYTREELFKELADMIIGIKREEEMPIG YCFSYPAESVPGGDAKLLRWTKGVDIKEMVGEFIGKPLLDYLNERNKIKFTGIKVVNDTI ASLFAGLTDNSYDAYIGLIVGTGTNMATFIPADKIEKLDQSCNAHGLIPVNLESGNFHPP FLTAVDDTVDAISGNPGKQRFEKAVSGMYLGDILKTAFPLEEFEEKFDAQKLTSIMNYPD IYKDVYVQVAQWIYGRSAQLVAASLTGLIMLLKSYNKDIRKVCLVAEGSLFWSENRKDKN YNILVMEKLRELLQLFGLKDIEVDIKSMNNANLIGTGIAALS >gi|226332139|gb|ACIC01000181.1| GENE 3 2485 - 5418 2082 977 aa, chain - ## HITS:1 COG:TM1624 KEGG:ns NR:ns ## COG: TM1624 COG3250 # Protein_GI_number: 15644372 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 63 698 22 678 785 133 26.0 2e-30 MNIWKTTLLLVLLFTGITVHAQKVKISLNSDGNLTIWKVKAQAEIEHPSDMMTTRYDTRD 
WVKATVPGTVFGSYVLEGLEPDPNFADNAYKVDKAKYDRNFWYRTEFNSPEVENGERVWL NFEGVNRKGEIFFNGTRLGLLDGFMHRGNFDITDLLSKNGKNVLAVLVHWVGTPVPNYAS PTYMSCAGWDWMPYVPGLLTGITDDVYLTKNGEVSIVDPWIRSKVPSKNKAVLSLQLELR NHTDIEQKGVLKGIIQPGNIEFTEDLVIEAGKQRTFLLDDSKFSQFIIQNPALWWPNGYG QPNLYTCELTYMVNGKASDKQNITFGIREYGSELVDGVLHLKINGEPVYVKGGNWGMSEY MLRCRGEEYDLKLKLHNEMHFNMIRNWIGSVTDDEFYEACDKYGIMVWDDFWLNSNSNLP DDVFAFNMNAVEKIKRLRNHACIAVWCGDNEGYPLQPLNKWLEEDVRTYDGGDRAYHANS HSDGLSGSGPWTNSHPNWYFTKAPYGYGANITKGWGFRTEIGTAVFTTFDSFKKFVPEKD WWPRNEMWDKHFFGNSAGNASPDKYFSTVEFNYGKAKGIEDFCRKAQLVNVEVNKAMYEG FQHHIWEDASGILTWMGQSAYPSLVWQTYDYYYDLTGAYWGIRKACEPVHIQWSYADNSV KVINTTLKEQKGLTATGKVYNLDGKEMGRYSQSVVLDAAANKDSYCFHLNFTTDNLAFGK KAVASSISADAGEPSAAIDASDGSRWASEPRDEEWIYVDLGEPTEIASVILNWEAAHAKA YKLLISDDAINWKEIYINEDSKGGVEEIKIKPVCTRYVKMQGLKQATMWGYSLYEFELYG RKKKPTDLTPVHFIKLELNDENGQLLSDNFYWRSNKPGDYKALNNLPKAKLRTTSSLVDS NGKKVIKAKIENVGSSVAFAVHVQAIRSSDGERILPALMNDNYFTLLKGESKDIEIEFDS DLLPDGNYSLSVIPYNK >gi|226332139|gb|ACIC01000181.1| GENE 4 5508 - 6848 1197 446 aa, chain - ## HITS:1 COG:atoC KEGG:ns NR:ns ## COG: atoC COG2204 # Protein_GI_number: 16130157 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 1 446 4 456 461 329 39.0 7e-90 MSKILIIDDEVQIRSLLARMLGLEGYEVCQAGDCKAAIRQLEIQQPDVALCDVFLPDGNG VDLVLNIKKTAPNVEVILLTAHGNIPDGVQAIKNGAFDYITKGDDNNKIIPLISRAVEKA KMNVRLEKLEKKVSQLYSFDSILGESKVLKEAVMLAQKVSVTDVPVLLTGETGTGKEVFA QAIHYNSKRSKQSFVAVNCSSFSKELLESEMFGHKAGAFTGALKDKKGLFEEANNGTIFL DEIGEMAFELQAKLLRILETGEYIKIGDTKPTHVNVRIIAATNRNLTEEIKAGRFREDLF YRLSVFQVHLPALRERTGDIRILATAFAKSFSEKLSYTVNEMTPAFLHALEQQPWKGNIR ELRNVIERSLIVCEGGHLDVCDLPLEIQNTHYECSDDNIGSFELAAMERRHIARVLEYTK GNKTETARLLKIGLTTLYRKIEEYKI >gi|226332139|gb|ACIC01000181.1| GENE 5 6976 - 9456 2151 826 aa, chain + ## HITS:1 COG:YPO3001_1 KEGG:ns NR:ns ## COG: YPO3001_1 COG0446 # Protein_GI_number: 16123182 # Func_class: R General function prediction only # Function: 
Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Yersinia pestis # 18 456 20 458 459 430 51.0 1e-120 MKIIIIGGVAGGATTAARIRRTNEAAEIILLEKGKYISYANCGLPYYIGGVIKEREKLFV QTPEAFSTRFRVDVRTENEVIFIDRKRKTVTVKQSSGDTYEESYDKLLISTGASPVRPPL PGIDLNGIFTLRNVADTDRIKEYINSHAPRKAVVIGAGFIGLEMAENLHAQGAKVSIVEM GNQVMAPIDFSMASLVHQHLMDKGVNLYLEQAVASFERDGKGLKVIFKNGQSISADIVIL SIGVRPETTLARAAELKIGEAGGIAVNDYLQTSDESIYAIGDTIEFRHPITGKPWLNYLA GPANRQGRIVADNIQGAHIPYEGAIGTSIAKVFDMTVASTGLPGKRLRQAEINYMSSTIH PSSHAGYYPDAMPMSIKITFDPKTGKLYGGQIVGYDGVDKRIDEIALVIKHEGTIYDLMK VEQAYAPPFSSAKDPVAVAGYVAENIITGKVQPVYWRQLRDIEMENNFLLDVRTPDEFSL NTLPGAVNIPLDELRDRLDELPKDKMIYTFCAVGLRGYLAYRILVQHGFEKVRNLSGGLK TYHAAAAPIILREESENEVPAQPNANIQQPADAQPQKTVSPVEAVKTIRVDACGLQCPGP ILKMKKTMDSLVSGDRVEITSTDPGFPRDAAAWCNSTGNQLISKETAGGKSVVVIEKGEP KSCNIVTSCEGKGKTFIMFSDDLDKALATFVLANGAAATGQKVTIFFTFWGLNVIKKLHK PNVEKDIFGKMFGMMLPSSSRKLKLSKMSMGGIGGKMMRYIMNRKGIDSLESLRQQALEN GVEFIACQMSMDVMGVKQEELLDEVTIGGVATYMERADNANINLFI >gi|226332139|gb|ACIC01000181.1| GENE 6 9487 - 9855 404 122 aa, chain + ## HITS:1 COG:no KEGG:BT_2435 NR:ns ## KEGG: BT_2435 # Name: not_defined # Def: MarR family transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 122 1 122 122 201 100.0 6e-51 MKTICVMRDVFKAMARFEDAFEKVYQVSLNEAMILCALQEASPNNMTATSLSKRTELTPS HASKMLRILEEKELIVRTLGEEDRRLMQFHLSQSGKKLVRQLALEKVEIPELLKPLFESV DS >gi|226332139|gb|ACIC01000181.1| GENE 7 9953 - 11257 947 434 aa, chain - ## HITS:1 COG:no KEGG:BT_2436 NR:ns ## KEGG: BT_2436 # Name: not_defined # Def: putative secreted tripeptidyl aminopeptidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 434 21 454 454 894 99.0 0 MKHIRIPLFVWFSIVLLIAVAPVSLSAQESFIQKIEKNKSVSGIKSLDTSRFPEKYVMYL TQPLDHRHPEKGSFRQRVIVGHVGYDRPTVIVTEGYGAGYALRPTYREELSELFDANMIF VEHRYFLESTPEPCDWQYLTAENSAEDLHAMTTAFKTLYPGKWISTGISKGGQTSLLYRV FFPDDVDVSVPYVAPLCYAREDGRHEPFLRRVGTEADRKKIEDFQLEVLKRKARLLPRFE KMCTEKNYTFRAPLEEIYDFCVLEYSFSIWQWGTDIRSIPETSASDDTLLDHLLAISGPS 
YFIVDSPTLSFFVQAARELGYYGYDIAPFKPYLSIKTSKDYLRRLMLPEDMRKMKFDKTL SNKIVRFLKKNDPKMIFIYGQNDPWTAAGVTWLKNKKNIHVFVEPGGSHLARIGTMSEDQ KQKVMSLLRGWLEE >gi|226332139|gb|ACIC01000181.1| GENE 8 11391 - 12023 699 210 aa, chain + ## HITS:1 COG:no KEGG:BT_2437 NR:ns ## KEGG: BT_2437 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 210 1 210 210 385 100.0 1e-106 MKKLIPILLAVFALASCEKDPDMGKLDDNYLVYTNYDKQANFKDFSTFYLADKILVISDS KEPEYLEGEGAEQILAAYTENMEAKGYQPAADKESADLGIQVSYIASTYYFTGYTQPEWW WGYPGYWGPSYWGNWGGWYYPYAVTYSYSTNSFITEMVNLKADEGEGKKLPVVWTSYLTG FETGSKAINRTLAIEAVNQSFTQSPYLTNK >gi|226332139|gb|ACIC01000181.1| GENE 9 12039 - 12665 697 208 aa, chain + ## HITS:1 COG:no KEGG:BT_2438 NR:ns ## KEGG: BT_2438 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 208 1 208 208 373 100.0 1e-102 MKTRKNIYFKVVALAAIAIAFAMPAKAQLSDNGYANIDWQFNVPLSNNFADKASGWGMNF EGGYFVTPNLGLGLFLNYHSNHEYIPRETFKIGAGDVTTDQQHTMFQLPFGAAARYQWNR GGAVQPYVSAKLGAEYAKIRSNFNMLEARENSWGFYASPEVGVNVFPWVYGPGLHFALYY SYGTNKADVLHYSVDGLNNFGFRVGVSF >gi|226332139|gb|ACIC01000181.1| GENE 10 12737 - 15742 2629 1001 aa, chain - ## HITS:1 COG:BH0675 KEGG:ns NR:ns ## COG: BH0675 COG1472 # Protein_GI_number: 15613238 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Bacillus halodurans # 45 417 131 528 686 238 35.0 4e-62 MRLLTYILAILLFLPVKARGEAIDTPVEPLLLYKAKQDVRCQQWVDSIMNKLTLKEKVGQ LFIYTIAPVDTKRNQILLKDAVDTYKVGGLLFSGGKMQTQAELTNRAQRMAKLPLMITFD GEWGLAMRLRGTPVFPRNMVLGCIQDNKLIYEYGREMARQCVQLGVQVNFAPVADVNINP KNPVINTRSFGEDPVRVADKVVAYASGLEGGGVLSVCKHFPGHGDTDVDSHKTLPVLPFT RERLDSVELHPFKEAIRAGLSGMMVGHLQVPVIEPIGGLPSSLSRNIVYDLLTEEFAFKG LIFTDALAMKGVSGNGNVSLQALKAGNDMVLAPRNLKEEIAAVLDAIDKGELTREDIEEK CRKVLTYKYVLGLKKKPFVQLSGLGQRINTPQTRDLIRRLNLAAITVLNNENHVLPLHTN QKETMALLEVGDAGETNALAEQLSKYTSLSRFRLRPGMTEEANQRLRDSLAAYKRIIVAV SEQKLSPYQTFFGKFTTETPVIYAFFTQGKMMLQIQRAVARASAVVLGHSPNEDVQRQVA 
DVLFAKATADGRLSASLGELFQAGDGVTITPKTPLHFIPEEHGLSSVALQRIDSIALDGV RQGAYPGCQVIVMKEGHVMVDKTFGTHTGTGSARVQPTDIYDLASLSKTTGTVLALMKLY DKGRFNLTDRIADYLPFLQRTNKKDITIQELLYHQSGLPPGIAFYREAIDEDSYEGRLFM SRKDARHPLQLGTSTWANPNFAFKKEYVSKVKTGDYTLQICDSLWLNPSFFKEMEKKIAD APMKPKTYRYSDVGFILLRLLVEKLAGMPMDAYLQREFYEPMGLERTGYLPLRRFPKSEV VPSSVDRFLRKTTLQGFVHDEAAAFQGGVSGNAGLFSNAREVAQVYQMLLNGGELNGQRY LSKETCQLFTTEVSKISRRGLGFDKPDTRNPEKSPCGKHTPAKVYGHTGFTGTCAWVDPD NGLVYVFLSNRIYPDVTNRKLSRLEIREQIQDAIYKAIQSK >gi|226332139|gb|ACIC01000181.1| GENE 11 15739 - 16626 291 295 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145635642|ref|ZP_01791339.1| 30S ribosomal protein S16 [Haemophilus influenzae PittAA] # 28 288 12 298 603 116 30 4e-25 MKSQYPMTTHNHITSVLNIFKGRTGKWLMLFLLLLPALATAQETKKLIILQTSDVHSRIE PMTQEGDRNYGQGGFVRRASFLQQFRKENPDVLLFDCGDISQGTPYYNMFKGEVEVTLMN EMGYDAMTIGNHEFDFGLDNMARLFKLAKFPVVCANYDLDATVLKDIVKPYVILDRFGLK IGVFGLGAKPEGLIQANKCEGVIYKDPIAVSNDIAAQLKEEGCDVIVCLSHLGIQMDEKL VANTRNIDVILGGHSHTFMESPKNYLNIDGKNVPVMHTGKNGIYVGRLDLTLNKK >gi|226332139|gb|ACIC01000181.1| GENE 12 16662 - 17465 761 267 aa, chain - ## HITS:1 COG:STM4104 KEGG:ns NR:ns ## COG: STM4104 COG0737 # Protein_GI_number: 16767370 # Func_class: F Nucleotide transport and metabolism # Function: 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases # Organism: Salmonella typhimurium LT2 # 53 253 306 506 518 93 28.0 4e-19 MKRNYVKYTSGAILTGLILFTSCQSTREVQANYEVAQIEGTRICMDSTWDAHPDEKAAAL LKPYKEKIDKMMYEVIGVSEMTMDKGRPESLLSNLVAEVLRLSAERVLKHPADIGIMNMG GLRNILPQGNITVDAVFEILPFENSLCVLTMKGTEIKRLMEVIASLHGEGLSGAHLEITK DGKLLKATVQEKEIEDNKDYTVATIDYLADGNDGLTPFINADKRECPDGLTLRGLFLDYV RQQTAAGKKITSKLDGRITVVPASDRK >gi|226332139|gb|ACIC01000181.1| GENE 13 17940 - 19034 770 364 aa, chain - ## HITS:1 COG:VC2213 KEGG:ns NR:ns ## COG: VC2213 COG2885 # Protein_GI_number: 15642211 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Vibrio cholerae # 267 346 212 290 321 59 
42.0 8e-09 MNIKKLWMPVFALLLSSTVISAQEQRIKEEGKTEFKPHWFMQVQVGAAHTLGEAKFSDLI SPAAAINVGYQFAPAWRARVGVSGWQAKGGWVSPQQDYQYKYLQGNIDIVSDLSTLFCGF NPKRVFNGYVFLGGGLNRGFDNDEANALDTRTYEMEYLWRDGKFLIAGRMGLGCNLRLND RLSINIEGNANVLSDKFNSKKAGNADWQFNALIGLNIKFGKGYTETKPVYYEAEPVVVEQ PKPVPVVEQPQPKKEVVVQPMKQDIFFALNSAKIQDDQKSKIDMLVEYLQNNPAAKVKVT GYADVNTGNPKINKSLSEKRAGNVADALKAKGITTDRIMVDSKGDTVQPYKAPEENRVSI CIAE >gi|226332139|gb|ACIC01000181.1| GENE 14 19153 - 22044 2506 963 aa, chain - ## HITS:1 COG:no KEGG:BT_2444 NR:ns ## KEGG: BT_2444 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 448 1 440 440 798 100.0 0 MKKKFVRVMLFGALALATGASFVGCKDYDDDIDAVNARVDGIEKTLSELQAQVGGYVKSV VYDASTGKLTVTGADSKEYTIPMPEELPSYTLDVTKDGKITLKKDGKEVSSGTIDIPEIP EIPTYKEFDPLLLTVKDGYVYYDGTKTSVAIPETAKGSITAILDEKGVVTGYRIVTIENG KEVTGEFSVIDAIPLKGLVFEPTCYVGGISALKANNLTYNAWSQNAYAKPSETGETYKSN PTLSYITPQIWAYYHMNPASVSMAQIESLKFLSGDKDYYPVSRAAAMDPTANIKKSDVVT EDGQRYLKVAMDADAEKIPALNSGKVAVYALQAQTKAIGADEAKVITSDYASIYKVLMSD FVLETKIDAEDVILYGQSISRTGSTPNANAGKAQEAIDANADYTVATDGTLKLADKISTY YKENNGTELVKMTNIKDNGLKYVFTASNYIGGANNETEQNSFFNGSDAAKGIINPEYKGQ DSEATEHREPLLRVELVDTTKVPNQVVAVGWIKTKIVKGTIDGFGKVFTKDTYYYGCENN VAKLTYLNMNDVYAEAKLNKPDFHTAYTLKKDDGGAIALGEGSTGTVTEVTDPEAIATTV VEWTVTPAEALALVKKSKTEMFATVTYESVDGTRGDITITMKASIAMPSGVIKNDQKITE AWSPDKSYVKADVKEPVTVAVTDFKLDLFDVFEGRKVVVSDVTEANFPSFTKDKLTAKFI FSGAKDVTIGTDKYTFTVTANGTQLNASKNSATATKIATIDADGIITYNKADENAHALLN KSAYNVDPFTVGIEIVVTNECDEILPLTGNKFDVKFLRPINVVDLKSAELQDATTGTAKI DMSKLLGLTDWRGQAFAVAPIDFYTYYGVTGSAISIDKDQIMTTLNASEAAPKLLKDVAP NMSINYTCDLQKIGGDITYGDLVYANGKFTINAPFKLFIPVTVKYTFGEVTSTVTLTITP THQ >gi|226332139|gb|ACIC01000181.1| GENE 15 22416 - 22835 138 139 aa, chain - ## HITS:1 COG:no KEGG:BT_2445 NR:ns ## KEGG: BT_2445 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 139 7 145 145 248 100.0 7e-65 MNSEYKSALFLSIKIGYLSSFQKQAANSLVTETLQYFVFFCRPAYSAILVKCLIINVLRF 
ILAIPFFTAFLNITYTARHTWATMAYYSEVHPGIISEAMGHSSITVTETYLKPFKNKKID EANATVLSPFRKTFLKISC >gi|226332139|gb|ACIC01000181.1| GENE 16 22900 - 24129 946 409 aa, chain + ## HITS:1 COG:BH1529 KEGG:ns NR:ns ## COG: BH1529 COG4974 # Protein_GI_number: 15614092 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Bacillus halodurans # 175 391 64 288 299 64 26.0 4e-10 MKVEKFKVLLYLKKSEPDKNGKAPIMGRITLNRTMAQFSCKLSCTPELWNARESRLNGKS RKAVETNEKIERLLLAVHSAFNSLMERKKDFDAAAVRDMFQGNAGIQMTLLKLLDRHNGE MKARVGVDRAPTTLSTYLFTYRTLSEFIKAKFKVSDLAFGQLNEQFIRDYQDFILMEKGH AVDTLRGYLAILKKICRIAYKEGHSEKYHFCHFKLPKQKESTPKALSRENFEKLRDLEIP EKRRSHVITRDLFLFACYTGTAYADVVSITRENLFTDEENNLWLKYRRKKTNYLGRVKLL PEALVLIEKYRDDARMTLFPPQDYHTLRANMKSLRLMAGLSQDLVYHTARHSFASLITLE EGVPIETISKMLGHSNIKTTQIYARVTPKRLFEDMDRFIEATRDLKLIL >gi|226332139|gb|ACIC01000181.1| GENE 17 24149 - 25360 697 403 aa, chain + ## HITS:1 COG:no KEGG:BT_2447 NR:ns ## KEGG: BT_2447 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 403 1 403 403 808 100.0 0 MRSTFRLMFYINRNKVKSDGTTAILCRISIDGRKSAVTTGIYCKPSDWDSNKGEIRMNKE NNRLTAFRSRLEEAYGNLLKNQGVVTAELLKATVSNTVSIPEFLLQTGEAERERLRIRSV EINSTSTYRQSKTTQLNLRQFIESRGMRDIAFSDITEEFAESFKIFLKKELGHGNGHVNH CLCWLNRLIYIAVDREVLRANPIEDVAYEKKDSPKLKHIGRNELKLIMETPMPDPMMELA RRTFIFSSFTGLAYVDTRRLHPRHIEKSSLGRRYIRIRRAKTGAEAFIPLHPVAEQILEL YNTTDNEKPVFPLPVRDILWYEVHGLGVALGMKENLSYHMARHSFGTLMMTSGIPIESIA KMMGHTNINSTQVYAQVTDQKISSDMDWLMKRRKRKDTDWVKS >gi|226332139|gb|ACIC01000181.1| GENE 18 25445 - 25738 264 97 aa, chain - ## HITS:1 COG:no KEGG:BT_2448 NR:ns ## KEGG: BT_2448 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 97 1 97 97 181 100.0 7e-45 MEGIIGKESESILRFFILLENIQVKMDQLMEGNRPPFNGERFLTDRELSGLLKISRRCLQ DYRDQGRLPYVQLGGKVLYKTSDIEKLLEGNYHRALI >gi|226332139|gb|ACIC01000181.1| GENE 19 25954 - 26073 92 39 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253572437|ref|ZP_04849839.1| ## NR: 
gi|253572437|ref|ZP_04849839.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 39 1 39 39 73 100.0 4e-12 MEIYYIEAGIFDRMLGCVESLSTRVDRLYEKNANKGVGE >gi|226332139|gb|ACIC01000181.1| GENE 20 27429 - 28268 248 279 aa, chain + ## HITS:1 COG:no KEGG:BT_2450 NR:ns ## KEGG: BT_2450 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 279 1 279 279 514 100.0 1e-144 MRKVFILLFISLALLSCQKEENNRKTEYQVLTSKDASKPFSCLGEKRIITITIIKKTLID DVLSSEVPIIPRDVSVEFDKTLFSDIEIKVEGDQVVLNITSNINKEDKILNADLQISYST INGIKVEKIPLIIDKGKLTFVYKIHSEQNPFILPAEGGRFELPFTCKKQTYLNGQFIEET YSSLNGLRFKTISSGNVWFLTVRKDGEKIGFYKFSFVGEGPYNQKTDPECYFNIYTHDAD LITDNPTEIFRQDFIQPQTPGEDYYKPSRSSYKHGTFDF >gi|226332139|gb|ACIC01000181.1| GENE 21 28301 - 29581 340 426 aa, chain + ## HITS:1 COG:no KEGG:BT_2451 NR:ns ## KEGG: BT_2451 # Name: not_defined # Def: putative pyrogenic exotoxin B # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 426 1 426 426 837 100.0 0 MKQHAHLLILTIMFLTSCSDDTEVMKQESSPPSLTEKAPTYRTKENAIIEVELFIKNNRN NTRNSSFLNYSINNEIYFYRDTITNQTYPSFYIANAEGGNGYAIVSANLYTTPIIAYSES GNLSLSDTLQYQELSFFFDLVQNYISNNKKYEIEFKEENDDDNSLDTSQTRGRRRPIYIK KPGEWEETERVQPLISVKWGQRSPYNNAAPLIEGQRALTGCVATAIAQVMAYHEKPSGYN GVSYNWSEMKQFPTTPAVAHLFRSIGDLVKMDWGTDTSGAKRKNIPQCFEKMGYRKPNNP QIYSQWDVITSIKAKCPVIICGNSVRKKILGIKYYQNGHAWVSDGYFHRERNVDVYRKGS DKVHHSYTEKEDYLHLNWGWNGNSNGYYLAGIFNGGEGPTFPSTRAAGKGNYPYNVEIIP YINIIK >gi|226332139|gb|ACIC01000181.1| GENE 22 29587 - 29982 268 131 aa, chain + ## HITS:1 COG:no KEGG:BT_2452 NR:ns ## KEGG: BT_2452 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 131 1 131 131 238 100.0 4e-62 MRQTYYFLSFVLLLFVFISCSNDVVGKGTDTIGLETKEVFFKSSKDSIELRTRKGDDWWL TEISTKDTIYRTHSLNVQVIEGGGLYVEKTGRKSIFIRIDSNLTGLERRFTTVIQAGNYY NTITIIQKAKE >gi|226332139|gb|ACIC01000181.1| GENE 23 30333 - 30590 179 85 aa, chain + ## HITS:1 COG:no KEGG:BT_2454 NR:ns ## KEGG: BT_2454 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 85 1 85 85 167 100.0 1e-40 MRRVQSNTLDVGIDVHLKNWSTAILSKHSVLRRFLQVDCEGPTLISYIVAKHLAHASRGT ISPYRTNGSAYGGFAIYMYLLIKMR >gi|226332139|gb|ACIC01000181.1| GENE 24 30690 - 31208 248 172 aa, chain - ## HITS:1 COG:CAC3514 KEGG:ns NR:ns ## COG: CAC3514 COG3344 # Protein_GI_number: 15896751 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Clostridium acetobutylicum # 12 172 239 426 470 140 45.0 2e-33 MIGKHYEPSIEGTPQGAPLSPLLSNILLNELDKELERRGHCFVRYADDSMILCKSKRSAE RVCSSITDFSFYFTKGKCRLCVHKTTKEKFKRKVKSLTRRSNSMGYAQRKEILWQTFRGW IGYFKYADMRSLLIPLDQWYRRHLRMCIWKCWKRVKTRFSNLQKCGIPKGKA >gi|226332139|gb|ACIC01000181.1| GENE 25 31293 - 31442 92 49 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVYRIKESNKVYAYSIFVAFACITAHLEQCYMRTTSGRNPKLLGLNWRS >gi|226332139|gb|ACIC01000181.1| GENE 26 31382 - 31870 235 162 aa, chain - ## HITS:1 COG:CAC3514 KEGG:ns NR:ns ## COG: CAC3514 COG3344 # Protein_GI_number: 15896751 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Clostridium acetobutylicum # 1 157 7 163 470 119 44.0 3e-27 MHKIRVIESTLGYPQKNRMESESYEGVQTYMDMKDGSLIAVQPNKDDLMSQILSPDNLNR AYLQVVRNKGVGGVDRMDFKQLLPYLHEHKSELIESIRIGRYKPNSVRRVDIAKDNGKKR LIGIPTVVDRLIQQAIVQVLSPIYERQFSPSSFGFRPDVVRM >gi|226332139|gb|ACIC01000181.1| GENE 27 32755 - 33951 715 398 aa, chain - ## HITS:1 COG:no KEGG:BT_2457 NR:ns ## KEGG: BT_2457 # Name: not_defined # Def: putative purple acid phosphatase # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 398 1 389 389 814 100.0 0 MIRKIIFLEMLLLGIATFLYSQDEISKITITHGPYLQNVGSNEATIVWITDKPSIGWVEL ASDGNGSFYAKEHPRYFDTSNGIKNTSTIHAVKVKGLTPGKQYRYRVFAQEVLKHTGYKI IYGSYASTDVYYRKPLTFHTCNPQAPATSFVMVNDIHGDNKLLEDLMSRCNLTQTDFVLF NGDMLSFINSEDQLFKGFMDTAVRLFASEIPMYYARGNHETRGVFATEIQRYFSPCQEHL YYAFRQGPVYCIVLDTGEDKPDSDIEYAGITQYDLYRTEQSEWLASILESTEYKEAPFKI IVAHIPPAVTEAGPDEDWHGNVEVEQKFMPLLRQAYPDLMLCGHLHRFVRHDATDKTSFP 
VVVNSNTSLLRIYATTTQMKIEVMDRDGKMLDEFIIKK >gi|226332139|gb|ACIC01000181.1| GENE 28 34016 - 35071 685 351 aa, chain - ## HITS:1 COG:no KEGG:BT_2458 NR:ns ## KEGG: BT_2458 # Name: not_defined # Def: putative pyridine nucleotide-disulphide oxidoreductase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 351 276 626 626 735 100.0 0 MEWLQAETNVSLFLNYRAFSVKKEEDRIISITACHIESGEEIEFYGRLFADCTGDGTIGY LAGADYRMGRESRSEYGETIAPEIADSLVMGTSVQWYSVEDTKTSYFPEFRYGIEFNEET CEPVTYGEWTWETGMNKNQINDSEQIRDYGMLVIYSNWSYLKNQSERRKYYKKRSLEWVA YIAGKRESRRLLGDYVLKEDDLTKHVAHEDASFTTTWSIDLHRPDPENTRYFPGREFKAT TDHVVIYPYPVPYRCLYSRNIDNLFMAGRNISVTHVALGTVRVMRTTGMMGEVVGMAASL CKKYQATPRDIYHYYLEELKSLMQKGVNKKGLPNNQRYNEGGRLNQIPKVK >gi|226332139|gb|ACIC01000181.1| GENE 29 35052 - 35897 612 281 aa, chain - ## HITS:1 COG:no KEGG:BT_2458 NR:ns ## KEGG: BT_2458 # Name: not_defined # Def: putative pyridine nucleotide-disulphide oxidoreductase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 275 1 275 626 578 100.0 1e-164 MKIVMIKKIIYISLIWMVFHGASLHAASLLVEAESFTEKGGWVVDQQFMDLMGSPYLLAH GLGSPVENASTNVQLPENGTYYIFVRTYNWTSPWQEGEGPGLFGLSVNGKKISYRLGIIG NQWIWQYAGQYQATEKNIHIVLHDLKGFDGRCDAIYFTTRKDDIPPSDMAALNNFRRAKL GLLAPPKTESYDLVVIGAGIAGMSTAVSAARLGCKVALINDRPVVGGNNSSEIRVHLGGA IEIGKYPELGGLQKEFGPVKEGNAQPAGNYEDHKKNGMAAS >gi|226332139|gb|ACIC01000181.1| GENE 30 35913 - 37880 958 655 aa, chain - ## HITS:1 COG:VC0613 KEGG:ns NR:ns ## COG: VC0613 COG3525 # Protein_GI_number: 15640633 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Vibrio cholerae # 27 506 137 632 637 270 33.0 9e-72 MIGKSYIFHLFLFLLILPDSIYSQDNICLIPQVESMVRKKGTLSIERLESIHFPDEWKNT GNLLVSDLKELANLSVMVNASNPSIHVKKVKMQEPEMYMLEITKQGIIIEAGDQTGMIHA FSTLLQLILGSEGKELPRLIIHDKPRFSYRGVMIDCSRHFWTIEQLKKYTKQLAFFKLNT LHLHLTDNQGWRLYLDQYPDLAFKGTYYRTFEDLSGHYYRKSELQELINYAAMYGIEIIP EIDLPGHCLALLAALPQLSCKGGKFEAYPEELDGQKRKRADENMLCIGNPETYRFVEKLV AELTDLFPSSFIHLGGDEVSTHLWEQCPKCQKIYKQENMTSWHELQDYFTKRVSEIVRSK 
GKRMIGWDEINDRNAADISDVIMIWQRDGREQQQKALKRGLSVIMSPKDPCYFDFGYSRN STRRLYEWEPVGKECTNTQAHLVKGGQANLWTEFITTSDEVERMLYPRTCALAETLWNTK EKKEWEGFRQRISKFGAIMEKLNICYFKDEDWDNTGFVPQSEQRPRLVCPARIDTNMKGI KYYMPEYAFDGDIQTFFATPYSLKKGDYFTLTLEKRQAVQEIRIVFDVSKEHPEHVQLSV SEDGTIFKKVAADNKNGELSASFSTLAMIKALKMELTTPLMARLTIKEIILRYYE >gi|226332139|gb|ACIC01000181.1| GENE 31 37953 - 39671 1265 572 aa, chain - ## HITS:1 COG:no KEGG:BT_2460 NR:ns ## KEGG: BT_2460 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 572 1 572 572 1172 100.0 0 MKKLLLYIGAGILLLSACHDLNLNPLSSGSTENWYSTETEVQMAVDELYRYDFWPEDGQQ QTDWSDDYTSRDLLTSFDDGTLNGQNAWVTSLWGNQYKAISRANAVIEKSHRAIENGASA TTINRLIAEAKFHRAAAYAKLIVKFGDLPLVLTDVSTEEGLTMGRTDKSIVLAQIYKDFD DAISVLPTSYTDQQRATQGAALALKARVALIMEDWETAADAAKAVMDLNIYELHPDFGNL FLSTTKNSNEFIFKIPRDATYDIYLGTGNGGVNQVYNDLPRNVGGWTQTCPSWELLAAFT CTDGLLINESPLFDPHEPFANRDPRLAETIVPFGSNHLGYEYNPHPEALTCMNYNTGEIV TNYDTRANAQYASFDGLVWKKGIDETWLQNGFKVDQDIIYCRYAEILLTYAEAKIELNEI DNSVLDALNEVRARAYGVDKSDKASYPAFTSNDQSTLRHQLRVERRMEMAKEHMRYTDII RWKLAEKVMNRKTYGILYPASLCIERVTSQGDWFWAFTPEMDEDGCADFSKLEEAGKCQA LAVRSWNDRQYLWPIPTTELQINENLKQNPGY >gi|226332139|gb|ACIC01000181.1| GENE 32 39683 - 43087 2554 1134 aa, chain - ## HITS:1 COG:no KEGG:BT_2461 NR:ns ## KEGG: BT_2461 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1134 1 1134 1134 2237 100.0 0 MKNSLYQELFIENKQLFRIMKIIVYVCFLCVGNLLAIGSYAQTAHVTIVSNHITIGQVIN EIEKQTDYLFVFNVKDINIKRNVKVNAKDKAVNEVLNDIFKDTGIRHAIEGKNIMLINST DKEKGAQQKDFLVTGIVKDENGEPIIGANIQVVGKSAGTITDMNGRFSISASANSTLQIT YIGYQTQTVNIGEQRNINIILQEDNAQLDEVVVVGYGTMKKKDLTGSVTAVKGDELAARR TTQLSTALQGALSGVMVSRDNGAPGSSASIQIRGVTTIGDSSPLVIIDGIPGDINSVNPE DVESMSVLKDAASASIYGSRAAAGVIVITTKRAKSGDVALSYNFEYGWEIPTKLPTYVGA QRYMEMVNETRYNDNPDGGWYQTYTEDEITNWSINNATNPDKYPMTDWGDILLKSSAPRQ THTLSIVGGGKIIRTKASFRYDKTDGLYVNKEYDRFMVRVNNDFNINKYISANLDLNFSR SKALSPNSNPMGAGGRNIPPIYAATWTNGLWGDVKDGENMLAKITDGGTFTNWGTNIGGK 
LGIDITPVNGLKISAAVAPNFNNVKKKTFVKQIPYTRSEDPNTTVGYMGGYRTTNLTENR NDSHDITTQFFANYAKTLGRHDFSVMIGYEDYYAFWENLDASRDQYQLTSYPYLDLGSED YRDNGGNAEEYAYRSLFGRMTYSYADRYLIQANLRRDGSSRFASNCRWANFPSVSLGWVI SEEKFFKNINMDWFSYLKLRGSWGKLGNERIGAYDDDGNFIYNYYPYQAAIDYSTALFQN SQGIVSSVTTAAQQKYAVRDISWETTESWDIGLDANFLDNRLHFSADYYRKNTKDMLLAL EIPHFIGYDNPEVNAGNMHTTGYDIEVGWRDKVGDFTYSISANLSDFTSKMGNLNGTQFL GNQVKMEGSEYNEWYGYVSDGLFLTQEDVDKSPKLNNSVKVGDIKYKDISGPDGVPDGVI SPEYDRVLLGGSLPHYMFGLTFNAAYKGWDFGLTLQGVGSQKSQISAAMIEGLRDNWLNF PTILDGNYWSPKNTDEQNATAKYPRLTRTNRDANFCMSDYWLYNGRYIRLKNLSVGYTLP KAWTTKACMNTVRLYVSGSDLFTISNAPKGWDPEVSTEGYPITTSLIFGVSINF >gi|226332139|gb|ACIC01000181.1| GENE 33 43248 - 44183 507 311 aa, chain - ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 10 282 25 298 331 70 23.0 3e-12 MDELLQKYIAGNASEKESQRIMEWLREDEQHLREYKRQRKLYDITLWQTKSPVDIQQEKK DPLRRVLDVIVRIAAVIVFTVATTYFYTHHVLQDKEENMQTVVVPAGQHAELYLADGTHV WLNSGSRLTFPGRFSKKVRHVELDGEGYFKVSSNIKQPFIVGTNRCNIRVLGTEFNVLAY EKDSIWETALLEGAVEILQKKSEVSLMKLKPGDMARLSKNQLTKEKIHTTDYFRWKEGLI CFNDISLRDIMEKLKLYYDVNFVINNQQILDAHYTGKFRTHDGIEHVIRVLALNNKFTYI KDNESNTITIN >gi|226332139|gb|ACIC01000181.1| GENE 34 44286 - 44807 235 173 aa, chain - ## HITS:1 COG:PA0149 KEGG:ns NR:ns ## COG: PA0149 COG1595 # Protein_GI_number: 15595347 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 10 159 27 168 181 61 29.0 1e-09 MRYYKKSFLFVKSYIHNEMVAEDIASESLIKLWQWIQDNPVENIEPMLLSILRNKALDYL RHESMKQQVITRISEKQNEELALRLSSLEDCNPNEIFSKEVMDIVQRTLQSLPEQTSRIF TLSRFGNKTNREIACELNISIKDVEYHISKSLKALRKTLKDYLPLFYFFFYHM >gi|226332139|gb|ACIC01000181.1| GENE 35 45068 - 45232 169 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDCKTIAEDKNKIFYTYFNTLDIILANVLNRFSSIHNLNGNKFYITILLVPSLE 
>gi|226332139|gb|ACIC01000181.1| GENE 36 46208 - 47122 718 304 aa, chain + ## HITS:1 COG:no KEGG:BT_2465 NR:ns ## KEGG: BT_2465 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 304 1 304 304 639 100.0 0 MNKQVRSILAQETTKTSKIRQLYLLGIPRAEIARMVTNGNYGFVVNALRRMNEREGGLNI HPATAALDYTFTRKFGIEIEAYNCSRERLARELREAGIEVMVESYNHTTRPHWKLVTDGS LNGNDTFELVSPILVGEAGLQELEKGCWVLDLCDVKVNGSCGLHVHIDAAGFSMETWRNL SLSYKHLEPVIDRFIPASRRDNYYCRGLGHVSDGMIRSARMVDELKGRIGDRYHKVDLEA YSRHKTIEFRQHSGTTCFTKMRNWVLFLHKLVTFATRGQVPVATALQNIPFLDSEQKLYY KFKD >gi|226332139|gb|ACIC01000181.1| GENE 37 47405 - 47605 56 66 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEILTAYILGIMVHQATDIVTAYVIGIVICQAMKIAGTHVGKIAIYEETAGREHYGDEKK NPCWSM >gi|226332139|gb|ACIC01000181.1| GENE 38 47848 - 48258 250 136 aa, chain + ## HITS:1 COG:no KEGG:BT_2467 NR:ns ## KEGG: BT_2467 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 136 1 136 136 236 100.0 2e-61 MDSIERKKTKPYNINEKEILDIVAAKGSYLSSISENLEEVPPHKESSSRAESKKTITDEE VKKYIEALLNNFSSNRRKPLHIDAQVYECISDIVWAVRRKDFTVSGVISRILVEHIKENA GMIKKITGRDYKLLQP >gi|226332139|gb|ACIC01000181.1| GENE 39 48407 - 48526 150 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEVYYIEAGIFEEMLARADSLSAQVDRLYEKNREKKPGE >gi|226332139|gb|ACIC01000181.1| GENE 40 48527 - 48727 137 66 aa, chain + ## HITS:1 COG:no KEGG:PRU_1111 NR:ns ## KEGG: PRU_1111 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 63 42 104 113 69 58.0 4e-11 MDNQDVCLRLDISPRTLQTLRDTGRLAFTQIQRKIYYRPEDVEKLMVYAAMKRKEKAVRE KERMNN >gi|226332139|gb|ACIC01000181.1| GENE 41 48739 - 49032 219 97 aa, chain + ## HITS:1 COG:no KEGG:BT_2468 NR:ns ## KEGG: BT_2468 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 97 1 97 97 173 100.0 2e-42 MEGIISKETCSVRRFFGLLDNIQTKLERLAEDNRPLFNGERFLSDKELSDLLKISRRCLQ DYRDQGRISYIRLGGKILYKVSDIEKLLEDNYHEALI 
>gi|226332139|gb|ACIC01000181.1| GENE 42 49191 - 49451 149 86 aa, chain - ## HITS:1 COG:no KEGG:BT_2469 NR:ns ## KEGG: BT_2469 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 86 1 86 86 115 100.0 4e-25 MSSVVFSPLYHYQYSFSYLYFEVILYAAHHIMLYHYQHIPVYPYSDVTPYTARHSFGTLM LSSGIPIESIAKMMGHTNINSTQVYA >gi|226332139|gb|ACIC01000181.1| GENE 43 49973 - 50182 127 69 aa, chain + ## HITS:1 COG:no KEGG:BT_2470 NR:ns ## KEGG: BT_2470 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 69 1 69 69 114 100.0 1e-24 MYISNNAYSGLNSLRGGLKHSYWSGSSLLDTTTFQSESGKPSCNYTISRNMQFGFQDTLS MDSNVFFIR >gi|226332139|gb|ACIC01000181.1| GENE 44 50189 - 50407 65 72 aa, chain + ## HITS:1 COG:no KEGG:BT_2471 NR:ns ## KEGG: BT_2471 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 72 1 72 72 131 100.0 7e-30 MYWYTYLFLFCFQYGVLKDTSQRLNFNELARVTKKYSLSVRVTGAADSSTGTSGINDSLN ISRACLVATELE >gi|226332139|gb|ACIC01000181.1| GENE 45 50653 - 50826 261 57 aa, chain - ## HITS:1 COG:no KEGG:BT_2472 NR:ns ## KEGG: BT_2472 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 57 51 107 107 87 100.0 1e-16 MEKTFAQISELFDQFSKDASLQVEKGNKAAGTRARKASLELEKLLKRFRKESLEASK >gi|226332139|gb|ACIC01000181.1| GENE 46 51430 - 53277 947 615 aa, chain - ## HITS:1 COG:no KEGG:BT_2473 NR:ns ## KEGG: BT_2473 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 615 92 706 706 1173 100.0 0 MGTQTWSGLDTPASFSQALDKVTTGTADEALTSEAQTFMMLPQRFPEGAQIEVLFTDDTH TGHTLIADIKGSEWPMGKTVTYKISSSSLNWTYTLDVTALADFTYTGGTQQYCVTSYRQN AQGEKEAAEWTAQYAEDGTTWTDTKPVWLTAFTASGTGGEFAQPCDATVEAQTGISNDLH ENALKAATAKGSETTPYNLSNNTGGNTVENTANCYVVNAPGYYSLPLVYGNAIKNSATNA SAYTSTVTGTNILNPFINHAGNGITDPYISGNGCTPAKAELVWQDAMNLVTDIKYNADSN GGNISFKVDRSSIRQGNAVIAIKDVSDAILWSWHVWVTDEDINNVIEITNHQNVKYNFMP 
VNLGQCDGNTITYEERSCKVKFIAGDQSKEITIKQLANVIATGSNAPFYQWGRKDPLYPS NGMGNTTKIWYDKEGIPSTANPMKGTFSAGNDCIKNCILNPNLMHNRYNGDYTYYNLWSA DNKTTSANDNPVNKTIYDPCPVGFKLPASNAFTGFTTTGESVSSSTQVNGTWSSSEKGWY FYTNSEKTQSIFFPALGYRSYSTMRPGSIGTQGCFWSAHPVAKGNNYYLRTSSVDVQPTY MVDRGFGYALRPCQD >gi|226332139|gb|ACIC01000181.1| GENE 47 53602 - 55662 1043 686 aa, chain - ## HITS:1 COG:jhp1003_2 KEGG:ns NR:ns ## COG: jhp1003_2 COG0755 # Protein_GI_number: 15612068 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome c biogenesis, permease component # Organism: Helicobacter pylori J99 # 448 686 13 248 285 165 41.0 3e-40 MHLFNLKKSLVSLYICLVALLAVVTFVEHVRGTEFVEKYVYHTVWFCCLWGVLAALAVVV LVKRQLWRHLPALLLHGSFLFILVGAMITFSCSKKGYMHLTVGTEVGTFIDQDSKRVIEL PFTLCLDSFRVEYYPGTEAPADYVSYIRDAEPVSMNRILSRQGYRFYQSSFDDDKEGSWL SVNYDPWGIGVTYAGYILLGISMLWMLVGRSGEFRRLLRHPLLRKGGMFVWLLMAVVTVV QAENRSLPALALRQADSLAFKQVIYHDRVVPFNTLARDFVLKLTGKPSYGGMTPEQVVGG WLLRPEVWQNEPMIYIKSAELRHLLRLSSSYARLTDLFDGQNYRLQEFWKGGQKPHMKMT SLEKAIMETDEKVGLILMLRSGTLIHPLPEDGSIKPLSDVKVQAEILYNRIPFSKLLFMF NLTVGMLAFFYLLYCSMHRSAGKAWSVFTVALYAAFLFQLFGYCLRWYVGGRIPLSNGYE TMQFMALCTLLLACIFRCRFSFTLSFGLLISGFALLVAYLGQNNPQITPLMPVLLSPWLS IHVSLIMVSYALFAFMMLNGLLAFCIGGWRKKAIDSEIQEQRKVRVEQLMLFSRLMLYPA VFCLGAGIFIGAVWANVSWGRYWAWDPKEVWALISFLIYGAAFHGPSLAVFRQPRFFHAY MVLAFLTVLMTYFGVNYLLGGMHSYA >gi|226332139|gb|ACIC01000181.1| GENE 48 55649 - 56803 1131 384 aa, chain - ## HITS:1 COG:PAB0825 KEGG:ns NR:ns ## COG: PAB0825 COG0251 # Protein_GI_number: 14521450 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Pyrococcus abyssi # 278 374 28 126 127 62 33.0 2e-09 MNNKKQPTNQSENTLTEIFKYEVKEGVSEFHVMIHSTRPEDTYEEQLNAVANAYNDLLAG ELKGAMAVFKRYFLSDTANQADLLLAITTESSDCALSVVEQPPLDGTKIALWAYLQTNVQ TQVLHNGLFEVKHNAYRHLWGGSVFNRAANSEYQTRLLLNDYAMQLMEQGCKLADNCIRT WFFVQNVDVNYAGVVKARNEVFVTQNLTEKTHYIASTGIGGRHADPKVLVQMDTYAVAGL 
KPEQIHFLYAPTHLNPTYEYGVSFERGTYVDYGDRRQVFISGTASINNKGEVVYPGDIRR QTERMWENVEALLKEAECTFDDLGHMIVYLRDIADYAVVKSMYDKCFPDTPKVFVHAPVC RPGWLIEMECMGVKALKNKEYAPF >gi|226332139|gb|ACIC01000181.1| GENE 49 56816 - 57550 429 244 aa, chain - ## HITS:1 COG:no KEGG:BT_2476 NR:ns ## KEGG: BT_2476 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 244 1 244 244 495 100.0 1e-139 MKKCILFFAAIWMTLCFCSCDDGRIYEKEIEVQRGGRVLKLTGRFSGISNWTDDYSVVVA GFNDKSEYAVITKSLPVNVTDGTDIVMTLGGISDEVKSLKLCVISRLRECIVEFKTMEDE ELAAITDTILMDVGTLDVGMFSSIQSQVFDKRCVACHGQTGSASGNLFLTEGKSYHALVN QPAHKNSDILLVKPGSAEESFLHLVLNRAGDTSMNHTDMLSEDEQPLLKLIDNWINEGIF LNNE >gi|226332139|gb|ACIC01000181.1| GENE 50 57570 - 58781 1143 403 aa, chain - ## HITS:1 COG:no KEGG:BT_2477 NR:ns ## KEGG: BT_2477 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 403 1 403 403 778 100.0 0 MKKLILMVVMAALSLAAMAQHEEDTENGVVSLAGREGFTIETKKGDFVFKPYLLVQTCAN FNWYDDEGLDKAYNQDNVANSGFSIPYAVLGFTGKAFGKVAFNLSINAAASGGALLQQAW FDVQLKKQFAVRVGKFKTPFSHAYLTTLGETLLPQLPVSLTSAIILPYSLNAVTPNIGTG FDLGVEIHGLLADKFGYEVGLFNGTGSSVNTATKTLSDDWHIPSLLYAGRFTYMPKGVMP STQGNPNRLNEDKIMFGVSASINVESENESTNDTRVGVEFAMLKNKLYLGAEAYYMNVGF TDRQKINETYNYLGGYVQGGYFVAPRLQAALRYDFFNRNGTSDDGFLNMPAVGVNYFFKG CNLKLQAMYQYIARTGHDTQLDRDNDNLGLATHSATVMLQYTF >gi|226332139|gb|ACIC01000181.1| GENE 51 58827 - 60260 1286 477 aa, chain - ## HITS:1 COG:PA4371 KEGG:ns NR:ns ## COG: PA4371 COG3488 # Protein_GI_number: 15599567 # Func_class: C Energy production and conversion # Function: Predicted thiol oxidoreductase # Organism: Pseudomonas aeruginosa # 27 477 25 473 473 242 35.0 1e-63 MNGLLKYSFPICFSLLALFACEDDGIDVDNIEVPAGFALSAGTATNFLTSSYAYDRSADW ITGAYDKRFTRGDKLYDDIRTSSNGIGGGLGPVYAGYSCGSCHRNAGRTQPTLWSEGGSG SSGFSSMLVYISRKNGAFFQDYGRVLHDQAIYGVKPEGKLKVEYTYETFQFPDGETYELC KPNYSIYEWYADSIKPEDLFCTVRIPLRHVGMGQMMALDPTEIEALAAKSNYPEYGISGR CNYISERGMRSLGLSGNKAQHADLTVELGFSSDMGVTNSRYPEEICEGQAQVNQGSMMGL 
SYAQLDVSTEEMENVDLYMQSLSVPARRNVNNEQVIKGEQNFYKAKCHLCHVTTLHTKPR GSILLNGTRLPWLGSQTIHPYSDFLLHDMGSEIMGVGLNDNYVSGLARGNEWRTTPLWGI GLQETVNGHTYFLHDGRARNYIEAIMWHGGEGEASKNLFKKMSKEDRNALVAFLKSL >gi|226332139|gb|ACIC01000181.1| GENE 52 60389 - 61507 1073 372 aa, chain - ## HITS:1 COG:no KEGG:BT_2479 NR:ns ## KEGG: BT_2479 # Name: not_defined # Def: iron-regulated protein A precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 372 1 372 372 671 100.0 0 MKKFFYLSALSLGMMCSITACSDDDTTTIDAKNLDYTAENASSWGNYMRVVAQLLVNDAT ALYDDWAVKYNEGGSYADFFKNQDALTSVEQLIDGCVDIANEVGTAKIGDPYDLFIHNNE EKALYAVESWYSWHSREDYRNNIYSIRNAYYGTRTGAISELSLSKAVAAVNANLDTEVKK AIDDAAAAIWAIPSPFRNNINSPEAVSAMEACATLEGVLKGSLKSCIEGIDKTVLAEVVK NYVDVVVLPTYSDLKAGNQALFDAVETFRTSPSNANFKACATAWLAARTPWETSEAFLFG PVADKGLDPNMDSWPLDQDGIVQILTSGNYSDLNWDGDYDEEDDKIAGAQALRGYHTLEY LIFKDGEARTIQ >gi|226332139|gb|ACIC01000181.1| GENE 53 61548 - 62918 1260 456 aa, chain - ## HITS:1 COG:no KEGG:BT_2480 NR:ns ## KEGG: BT_2480 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 456 1 456 456 895 99.0 0 MKKLTCLFALGFLLFNNTLRANEPEKVTSTEEVSETAAKKERIKQVTKDDDYEKFRFGGY GEMVAKFMNYGTNRFYGGVDNSDHRNTIAIPRFVLAFDYKFNSKWILGAEIEFEAGGVGL ETELENSENGEYETEMEKGGEVALEQFHITRLIHSAFNIRAGHLIVPMGLTNAHHEPINF FGTSRPEGETTIIPSTWHETGLEFFGSFGKGYARFDYQAMIVAGLNADGFGRDNWVAGGK QGLFEQDNFTSPAYVARLDYKGVSGLRAGVAFYYCNDVTANADKNYKYSSVGRSSVKIYS ADAQYKNKYVTARGNIIYGDLENSSKISKVTLSNNSNYYHGAMRNVAKNALCYGLEAGLN LSAFFSQKKCPVIYPYARYEYYNPQEEAEGSATMEKRCQVSKWTAGVNWFALPNLVVKAD YTTRHIGTNKVFGSTKYNNENEFAIGIAYVGWFTKR >gi|226332139|gb|ACIC01000181.1| GENE 54 63033 - 63341 184 102 aa, chain - ## HITS:1 COG:no KEGG:BT_2481 NR:ns ## KEGG: BT_2481 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 102 14 115 115 183 100.0 2e-45 MVKLTNHKKTIDRPLGCKRWGIIAIIIIIFAGSLSFLLYSYFSHDADNPQCNCLLCMLNH VPEEGEYIYNRYLIFISFIWIAVIFAAIFLVNYCFVKKNKKK >gi|226332139|gb|ACIC01000181.1| GENE 55 63450 - 63839 328 129 aa, chain 
+ ## HITS:1 COG:no KEGG:BT_2482 NR:ns ## KEGG: BT_2482 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 129 1 129 129 250 100.0 1e-65 MGISENADSPLLCSREIKLMQYSKDDIQKEILKAAEKVFLENGFPKASMREIAQEAQVGL SNIYNYFKSKDDIFCTVVRPVISAFERMLHEHHGRYGADIMEMYSTESLCKNLRRYIELL VLRIFMERN >gi|226332139|gb|ACIC01000181.1| GENE 56 65062 - 65253 82 63 aa, chain + ## HITS:1 COG:no KEGG:BT_2483 NR:ns ## KEGG: BT_2483 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 63 1 63 63 100 100.0 1e-20 MKNGVKTCKSYRRYLLIVSVFGIFVLKKSFFELMQNKTEYCNSLVSKVLPIIYKVSDNYF IFS >gi|226332139|gb|ACIC01000181.1| GENE 57 65250 - 66026 434 258 aa, chain - ## HITS:1 COG:no KEGG:BT_2484 NR:ns ## KEGG: BT_2484 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 258 1 258 258 498 100.0 1e-139 MGNLISFMKDVADGLRESGNYGTAHIYRSSMRAVISFHGSGNLPFRKVNQEFLKNFETYL RGKDCSWNTVSTYMRTLRAVYNRAVDRHLAPYVPHHFRYVYTGTRADRKRALDKEDMERL MAKLPNQLYSGIRDLQRTRAFFLLMFMLRGIPFVDLAYLKKRDIEGNVLTYRRRKTGRML TVTLLPEAMKLIRQYMNTDPASPYLFSLIASKEGTEEAYREYQLALRYFNSQLVILKGVL GLTADLSSYTAKHNQFSI >gi|226332139|gb|ACIC01000181.1| GENE 58 66481 - 67575 837 364 aa, chain - ## HITS:1 COG:no KEGG:BT_2485 NR:ns ## KEGG: BT_2485 # Name: not_defined # Def: major outer membrane protein OmpA # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 364 3 366 366 724 100.0 0 MNKKKLWIAAFSLLFGSIAVSAQEQRIKEEGKTEFKSHWFMQVQAGAAHTVGEAKFTDLI SPAVALNVGYQFAPAWSARVGVSGWQAKGSWVAPRQDYKYKYLQGNVDIVSDLSTLFCGF NPKRVFNGYIFAGAGLNRGFDNDEAVAINAAGYPMGYLWTEGKFLVAGRLGLGCNLRLND RLAINIEGNANVLSDKFNSKKAGNADWQFNALVGLTIKFGKGYKEIPPVYYEPEPVVVEQ PKPAPVVEQPQPRKEVVVQPMKQDIFFALNSAKIQDDQKSKINMLVEYLQKNPSAKVKVT GYADANTGNSKINKTLSEKRAANVAEVLKTKGITPDRIIADSKGDTVQPYKTPEENRVSI CIAE >gi|226332139|gb|ACIC01000181.1| GENE 59 67666 - 70716 2497 1016 aa, chain - ## HITS:1 COG:no KEGG:BT_2486 NR:ns ## KEGG: BT_2486 # Name: not_defined # Def: hypothetical 
protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1016 1 1016 1016 1891 100.0 0 MKKKFIRVMFFGALALTTFSYVGCKDYDDDIDRLEQKITENANAIAEINKLIEGGSVITK VENITGGVEVTLSNGKSFTIKNGDAGKDGTQWTIGEDGYWYQDGKKTEFKAKGDKGEPGT PGEPGTPGKDGDTYIPGEDGFWYKNEVKPENKTDKTWIVPGVVSVADMGDYYIFSNLKNE AGEATSATVYKYSATFVSSLVHVPSNINKDLGDVVFLPVIAYDNNSKNTAVYEYNSSNQP LYRTLVNGWAEYEYKLNPANVNPQYYTGISFVEQTATTTTRATEVGLPLAKVVTTTDRGR LIVKASAGDNANNPNAFKMYKDDSEQGTAYDNPDGKPMKESTKYGEAGKVNMLAYMVKNT NEQLGETANINVVSDYVATKRMVVEQEETTIGFAVDADKYADPTVKAPALPLLSDMYKKA ETKVNDYLASLPVNLKFLYTADPLDLEPLLTGLATEFHTNTGFDADESQYVLTDLGFENV TFQYEAVSYMDAEVDQTKRYLKLDGSKISVIDKQSASIGRQAVLKVKMYVDGKFVMSKLM KIEFVEQEQQTKDFVNTDLKEQTLACTALYGGGAKLAAATAYQTATVVEGLKIDFDQYFN ALQVSKAKFVEYYGGGNPTVEVTKDGVKFDGWSFDSEEAGKNLTIDYSNIEGSAITDNNN VYFGLKNQLRAGTYVIKFTYSTTNTAPTYKAFTITATLVVKDPVHTWTYNPSMWEGTEYM LGYGQPTSGDYGAPFLMKATIEDGFMDSWDGCGQWDFELVTTAPASEMTLTTDIATNKHV LNIFDKKWMGKKIRVRAVEYIETKDAAKNNRKINNIVSATKTSEFDIMFVNPVVYAQVNA AAAWYLVDKANTTVLKEGAYLPIYRFVKLYDIKRGETNGMMFDGSQEEPWFERNEVEIGK GLFKMHKVEVSYKLSDKNDAIVQRHASILKSTDGKEVGLLKWNNDEGVLVGDVDQPIIID VTIKNSWMGQPEQTTTQTQEITVYVKKADTKYTVPATGFLGTLAAPEGLGSYADLN >gi|226332139|gb|ACIC01000181.1| GENE 60 71085 - 72032 575 315 aa, chain - ## HITS:1 COG:no KEGG:BT_2488 NR:ns ## KEGG: BT_2488 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 315 1 315 315 630 100.0 1e-179 MGNLISFMKKVADGLRESGNYGTAHIYRSSMSAILAFNESGNLPFRKVTPEFLKSFEAYL RGRNCSWNTVSTYMRTLRAVYNRAVDRRIAPYVPHHFRYVYTGTRADKKRALEKEDMERL MKDIPKQLHSGNRELQRARGFFILMFLLRGMPFVDLAYLKKHDIDGNVLTYRRRKTGRML TVTLLPEAVKLIKQHMNTDPAFPYLFSLISSKEGTEEAYKEYQLALRNFNYQLMILKQVL GLTTDLSSYTARHTWATMAYYCEVHPGVISEAMGHSSITVTETYLKPFKNKKIDEANVMV FSSFRKSFSVGNCLN >gi|226332139|gb|ACIC01000181.1| GENE 61 72334 - 72687 585 117 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29347899|ref|NP_811402.1| 50S ribosomal protein L19 [Bacteroides thetaiotaomicron VPI-5482] # 1 117 1 117 117 229 100 3e-59 MDLIKIAEEAFATGKQHPSFKAGDTVTVAYRIIEGNKERVQLYRGVVIKIAGHGDKKRFT 
VRKMSGTVGVERIFPIESPAIDSIEVNKVGKVRRAKLYYLRALTGKKARIKEKRVNN >gi|226332139|gb|ACIC01000181.1| GENE 62 73460 - 74041 354 193 aa, chain - ## HITS:1 COG:no KEGG:BT_2491 NR:ns ## KEGG: BT_2491 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 193 1 193 193 357 100.0 1e-97 MGMEEYIDTIELKQMKEQIAILNSKLDAEVVVNEKLLRKVIKNKVSGMNRYVGIMNSLAL LLIPFYIWACPYLGISWWFCSVFCLFLFIAVMHGYFTHKRLRTNDLMSEDLLVVARKLME IKSRYSIWRKFSIPFIIILLCWLFIELQLAGNSNIIIFCVSLIFSFREGYKQYHMMQRKL DEIQQEIDEIMKE >gi|226332139|gb|ACIC01000181.1| GENE 63 74056 - 74526 292 156 aa, chain - ## HITS:1 COG:DR0180 KEGG:ns NR:ns ## COG: DR0180 COG1595 # Protein_GI_number: 15805216 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Deinococcus radiodurans # 5 154 59 220 229 66 25.0 2e-11 MKKLEIEFEQVVKQYKNTIYTVCFMFSKDSREVNDLFQDVLVNLWKGFDTFKGESNIGTW IWRVSLNTCISSDRKKKIVSVPLIMGIDLFEDRDEDTTQIKMLYNRISRLKHFDRAIVLL WLENFSYEEIAAIVGISVKNVSVRLFRIKEELKKNV >gi|226332139|gb|ACIC01000181.1| GENE 64 74697 - 75677 498 326 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 10 317 5 316 319 196 37 4e-49 MNSNMEKPYVVGIDIGGTNTVFGIVDARGTIIASSSIKTAGYPTAAEYADEVCKNLLPLI IANGGVDKIRGIGVGAPNGNYYTGTIEFAPNLPWKGILPLAAMFEERLGIPTALTNDANA AAIGEMTYGAARGMKDFIMITLGTGVGSGIVINGQMVYGHDGFAGELGHVIARRDGRVCG CGRKGCLETYCSATGVARTAREFLAARTDASLLRNIPAENITSKDVYDAAVQGDKLAQEI FEFTGNILGEALADAIAFSSPEAIVLFGGLAKSGDYIMKPIQKAIDDNILNIYKGKTKLL VSELKDSDAAVLGASALAWELKDLKE >gi|226332139|gb|ACIC01000181.1| GENE 65 75856 - 76572 351 238 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 219 1 220 245 139 38 4e-32 MIKLTDINKTYNNGAPLHVLKGINLDIQRGEFVSIMGASGSGKSTLLNILGILDNYDTGD YYLNNVLIKDLSETKAAEYRNRMIGFIFQSFNLISFKDAVENVALPLFYQGVSRKKRNAL ALEYLDRLGLKEWAHHMPNEMSGGQKQRVAIARALITQPQIILADEPTGALDSKTSVEVM 
QILKDLHRMGMTIVVVTHESGVANQTDKIIHIKDGIIERIEENLDHDASPFGQNGIMK >gi|226332139|gb|ACIC01000181.1| GENE 66 76686 - 77945 1205 419 aa, chain + ## HITS:1 COG:NMB0549_2 KEGG:ns NR:ns ## COG: NMB0549_2 COG0577 # Protein_GI_number: 15676455 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Neisseria meningitidis MC58 # 15 419 16 395 395 114 26.0 3e-25 MIDIWQEIYSTIKRNKLRTFLTGFAVAWGIFMLIVLLGAGNGLIHAFEQSASERAMNSIK IFPGWTSKSYDGLKEGRRIRLDNKDLNTTNKNFPDHVIKAGATVYQGGVNLSFGPEYVNL NLSGVYPNHTEVEVVKLFKGRFINEIDIKERRKVIVLHKKTAEILFDKTHTDPIGQFVNA GNVVYQVVGLYNDKGDSGDSDAYLPFTTLQTIYNKGDKLNNLIMTTKNLETIESNEEFEA HYRKVIGANQRFDPTDHSAIWIWNRFTNYLQQQQGSAMLRIAIWVIGIFTLLSGIVGVSN IMLITVKERTREFGIRKALGAKPLSILWLIIVESVTITTIFGYIGMVAGIGATEWMNNAF GNQTMDNGIWSETVFLNPTVDIGIAIQATLTLVIAGTLAGLFPARKAVSIRPIEALRAD >gi|226332139|gb|ACIC01000181.1| GENE 67 77951 - 79192 308 413 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 21 413 22 413 413 123 25 4e-27 MRIDIDTCEEILITITRNKTRSLLTAFGVFWGIFMLVALIGGGQGLQDMMQKNFEGFATN SGFLAAQKTGEAYKGFRKGRWWDLEAIDIERLRSQVKDVEVITPSVARWGSNAIYGDKKY DCSVKGLYPDYLHIESQEMAYGRFINDVDIREARKVCVIGKRVYESLFKPGEDPCGKYVR VDGIYYQVIGMSSSEGDMNIQGRASEAVTLPFTTMQQTYNLGGQIDVICFTVKRGVKVSD VQPQMEEIIKAAHYIAPNDKQALMYLNAEAMFSMVDNLFTGIHILIWMVGLGTLLAGAIG VSNIMMVTVKERTTEIGIRRAIGARPKDILQQILSESMVLTTIAGMFGISFAVMVLQLVE MGANSNGGDAHFQVSFGLAVGTCALLIALGMLAGLAPAYRAMAIKPIEAIRDE >gi|226332139|gb|ACIC01000181.1| GENE 68 79262 - 80362 1140 366 aa, chain + ## HITS:1 COG:VC1563 KEGG:ns NR:ns ## COG: VC1563 COG0845 # Protein_GI_number: 15641571 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 37 316 44 335 338 133 27.0 6e-31 MKKYLKITLLVIIAIILVGTFVFLYQKSKPKVIVYETLAAEVTDLEKTTVATGKVEPRDE ILIKPQISGIVDEVFKEAGQSVKKGEVIAKVKVIPELGQLNSAESRVRLAEINAKQAETD FTRIKKLYEDQLISREDYEKGEVAVKQAREENQTAKDNLEIIKEGITKNSASFSSTMIRS 
TIDGLILDVPVKAGNSVIMSNTFNDGTTIATVANMNDLIFRGNIDETEVGRIHEGMPIKL TIGALQNLTFDAILEYISPKGVETNGANQFEIKAAITVPDTVQIRSGYSANAEIVLQRAS KVLAVPESTIEFSGDSTFVYIMTDSVPQQKFERTQVTTGMSDGIKIEIKKGVTAQTKLRG AEKKDK >gi|226332139|gb|ACIC01000181.1| GENE 69 80373 - 81710 1197 445 aa, chain + ## HITS:1 COG:CC1318 KEGG:ns NR:ns ## COG: CC1318 COG1538 # Protein_GI_number: 16125567 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Caulobacter vibrioides # 1 428 1 416 483 82 22.0 1e-15 MKVINKMMAIILLSGTGIGSVQAQDTTPQTWTLRQCIDYAIEHNIDIRQTANEAEQNKIS VNTAKWARLPNLNGGISQGWSWGRTASPKDNSYSDTNSSSTGVNLSTNVPLFTGLQLSNQ YSLAKLNLQAAIEDLNKAKEDIAINVTSAYLQVLFNLELNKVAQNQVELSKDQLKRIKGL HDVGKASPAEVAEAQARVAQDEMTAVQADNTYKLSLLSLSQLLELPTPEGFVLENPKEEL KFEPLTAPDDIYIQAIAYKPGIKAAEYRLQGSLKNIRIAQSEFYPQLSFSAGLGSSYYTL NGEAESGFARQLKNNLSKSISFNLSVPIFNRFSTRNRVRTARLQQSNLALQLDNTKKVLY KEIQQAWYNAVAAESKYNSSEVAVKANEESFRLMSEKFNNGKATFVEYNEAKLNLTKALS DKLQAKYDYLFRTKILDFYKGEIIE >gi|226332139|gb|ACIC01000181.1| GENE 70 81887 - 82915 943 342 aa, chain + ## HITS:1 COG:FN0803_3 KEGG:ns NR:ns ## COG: FN0803_3 COG0229 # Protein_GI_number: 19704138 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Conserved domain frequently associated with peptide methionine sulfoxide reductase # Organism: Fusobacterium nucleatum # 198 342 3 147 147 204 64.0 2e-52 MKHIIITVILIISNVFLGIAKPMKSQAEIYFAGGCFWGTEHFLKQIRGVESTQVGYANST VANPSYEQVCSGKTNAAETVKVVYDPEEVNLSLLLNLYFQTIDPTSLNRQGNDRGTQYRT GIYYINKADLSTINQAIQALATQYNKPIAVEVEPLTNFYPAEVYHQDYLDKNPGGYCHIN LALFEMARKANAPKTATYQKPDDATLRKKLSAEQYAVTQKNATEPAFHNEFWNEHRPGIY VDITTGEPLFVSTDKFDSGCGWPSFSKPIQKELIAEKKDTTHGMIRTEVRSKTGDAHLGH VFTDGPKDKGGLRYCINSASLRFIPKEKMKEEGYGEYLPLIK >gi|226332139|gb|ACIC01000181.1| GENE 71 83047 - 83565 569 172 aa, chain - ## HITS:1 COG:no KEGG:BT_2500 NR:ns ## KEGG: BT_2500 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # 
Pathway: not_defined # 1 172 1 172 172 308 98.0 4e-83 MKTTVLFRMMVLVAMVVAGIANTELKAQDNNFITNEEKVDDLVVSKVIYRLDGSLYRHMK YDFTYDDQKRVTAKEAFKWDSSTEKWIPYFKIDYTYSSNEITLVYARWNNSHRAYDDSVE KSVYELNDANMPIAYMNYKWNDYKWIEESAGNWAMNIQIPVTDEADLLLAGR >gi|226332139|gb|ACIC01000181.1| GENE 72 83880 - 85829 1379 649 aa, chain + ## HITS:1 COG:all2981 KEGG:ns NR:ns ## COG: all2981 COG0744 # Protein_GI_number: 17230473 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Nostoc sp. PCC 7120 # 432 618 88 267 640 125 39.0 2e-28 MKKNSKKSILLLSIGGGLFICLISIYLSRNMLLQSITNKRTTHIEQTYGLQIHYQNLQMK GCSEITLQGLSIVPDQRDTLLTLQSVNVRLNFWKLLKGNIEVRNVHMNGLAIAFIKRDSA ANYDFLFSGHHPEATTEPVIETNYAHRINRILNLIYGFFPENGQLTQLNITERKDSNFVT VNIPTFTIENNRFQSTIKIKEDTLTQQWKAAGELNRKVHTLQAELFATEQKKVSLPYINR RFGAEVTFDTLYYSMTKENRTENQLQLDGTAKVSGLDVFHKALSPEVIHLDRGQLTYQMN IGKQTLELDSTTTVLFNQIKFHPYLRAEKNENQWHFTAATDKSWFPADELFSSLPKGLFS NLEGIKTSGELAYHFLLDIDFARLDSLKFESELKEKDFRIIEYGATSLSKMSEEFVYTAY ENGVPVRTFPVGPSWEHFTPLDSISPLLRMSVMQSEDGAFFYHKGFLPDAMREALIYDLQ VERFARGGSTITMQLVKNVFLNRNKNFARKLEEALIVWLIETERLTSKERMYEVYLNIAE WGPLVYGIQEASAYYFGKRPSQLTTEESIFLASIIPKPKHFRSSFAENGRLKENMEGYYK LIAGRLAKKGLISEIEADSIRPDIQVTGDALNSLVGETPESSSPTAEEQ >gi|226332139|gb|ACIC01000181.1| GENE 73 85852 - 87051 292 399 aa, chain - ## HITS:1 COG:no KEGG:BT_2502 NR:ns ## KEGG: BT_2502 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 343 1 343 399 541 99.0 1e-152 MKRILFILLVVTASTTVMAGMSTSKVRKETRFLTDKMAYELDLNNPQYNDVYEINYDFIY SLRNIMDYVVRGDEWALDDYYEALDIRNDDLRWVLSDAQYRRFLGAEYFYRPIYVTGGRW SFRIYINYPNRSLFYFGVPYHYRTYCGAHYRPHIHHVSYYRGRYNNLVHYPTPYRVRDQR VFHSYRRSDFGSVRFRPNTSVRPHNAPTRPGTSGRPGTTSRPNTSGRPGNTSRPRPETDH RPSTSGRPSTSGRPSTSGRPSNSERPSVNERPSIPSRPSTPGRNDKPNKEVTPGRGHNRG EGDRHSGSRPGTSTSTPNSNRRPETNSGSSNSSRRPTSSGTSTNRGSSSGRNSNSGSSSS RGNVSGRSTERSSSSESSRSNSNRGSSSRSSERGSSSRR >gi|226332139|gb|ACIC01000181.1| GENE 74 87066 - 88082 460 338 aa, chain - ## HITS:1 
COG:no KEGG:BT_2503 NR:ns ## KEGG: BT_2503 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 338 1 338 338 631 98.0 1e-179 MKNIHIILFFTLWVILGSCSTYYVSNKQVMEVQQGMSWQNVEKILGKPDYRRFDGDGEEW EYHRISSVLYGNSIKIIVYFADGRVTSMDTFNGDEILLPPPPVVMSPSVVVTNTDPVRPP RYEESPRRRTLMMRDEFDRFLSDFRMVIMSDEQIKYIDDVLLDCNFTSAQCGKIIDQISG SDAQMTVMKRMYPQIVDKENFASVVNKLFSSFDRDKMKEYIQAYHGDQRPGDIGYVRPRA MSSADFDRFFIEYRGKSFESDRTRMLDEVVPPSGFTCAQCRKLVDMCTFQTDKKNMIKKL YPKIADKKNFSILTDVFTFEMDKREMREFAESYDSGRN >gi|226332139|gb|ACIC01000181.1| GENE 75 88115 - 88795 642 226 aa, chain - ## HITS:1 COG:no KEGG:BT_2504 NR:ns ## KEGG: BT_2504 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 226 1 226 226 460 99.0 1e-128 MRKIIISFCILLAALSLKAQSVSGIRIDGGDTPILVYFGGRQMCYPTTTCFVANLKPGNY TIEVYASRPTRPGERVWKGERLYNDRVYFNGNEVKDIIVEERGDIRPGRPGRPGTGQGGH RPDYDRYDRVMNDQLFKKFFDSVKNEPFEKDRMGLITTALANSDFTSEQCLQLVKFYTFD NERLKIMKMMYPNIVDKEAFFTVIGTLTFSSNKTKMNDFIKEYEGR >gi|226332139|gb|ACIC01000181.1| GENE 76 88909 - 89748 226 279 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|212640476|ref|YP_002316996.1| Uncharacterized protein conserved in bacteria containing two ribosomal protein S1-like RNA-binding domains [Anoxybacillus flavithermus WK1] # 5 267 2 270 285 91 27 1e-17 MSIELGKFNQLEVVKEVDFGLYLDGGDEGEILLPTRYVPEDCKVGDMLNVFLYLDIDERL IATTLTPLVQVGQFACLEVSWVNQFGAFLNWGLMKDLFVPFSEQKMKMQVGRSYVVHAHV DEESYRIVASAKVERYLSKDMPDYAPGAEVDILIWQKTDLGFKAIIDNKYSGLLYENEIF RALETGMQMKAFVKQVREDGKVDLILQKPGFEKVDDFSKTLLEYIREHGGRINLNDKSPA EDIYDTFGVSKKTFKKGVGDLYKKRLISLHENGITLAES >gi|226332139|gb|ACIC01000181.1| GENE 77 89934 - 90935 1093 333 aa, chain + ## HITS:1 COG:TVN1097 KEGG:ns NR:ns ## COG: TVN1097 COG0039 # Protein_GI_number: 13541928 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Thermoplasma volcanium # 4 270 1 267 325 107 29.0 2e-23 MEFLTNEKLTIVGAAGMIGSNMAQTAMMMKLTPNICLYDPFAPALEGVAEELYHCGFEGV 
NLTYTSDIKEALTGASYIVSSGGAARKAGMTREDLLKGNAEIAAQFGKDVRQYCPNVKHI VVIFNPADITGLITLLYSGLKPSQVSTLAALDSTRLQNELVKYFHIPASDIQNCRTYGGH GEQMAVFASTTKVKGEPLTDFIGTTRLPLTEWEALKVRVIQGGKHIIDLRGRSSFQSPAY LSIEMIAAAMGGQPFRWPAGTYVSNGKFDHIMMAMETSITKDGVTYKEVAGSPSEQEELE KSYEHLCKLRDEVIEMGIIPAIKDWHSLNPNID >gi|226332139|gb|ACIC01000181.1| GENE 78 91035 - 91454 277 139 aa, chain - ## HITS:1 COG:no KEGG:BT_2511 NR:ns ## KEGG: BT_2511 # Name: not_defined # Def: putative transcription regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 139 1 139 139 264 100.0 7e-70 MEEKVYLDKLAKRDIKPTAIRLLILKSMMEAGRAVSQLDLETLLDTVDKSTISRTITLFL SHHLIHSIDDGSGSWKYAVCDDSCNCVLKDLHSHFYCEKCHRTFCLEKIHIPVIDLPKGF TLHSVNYVVKGVCAECSKV >gi|226332139|gb|ACIC01000181.1| GENE 79 91461 - 93413 1670 650 aa, chain - ## HITS:1 COG:PAB0626 KEGG:ns NR:ns ## COG: PAB0626 COG2217 # Protein_GI_number: 14521140 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Pyrococcus abyssi # 9 645 68 687 689 551 47.0 1e-156 MGHCSCCAHTHECAPEKHIEKKESIFAEYWKVGLSFILLISGIIMNALELPFFREGYFSL IWYVVAYLPVGLPVMKEAWESMKDKDYFSEFTLMFVATLGAFYIGEYPEGVAVMLFYSVG ELFQEKAVDKAKRNIGALLDVRPEEAAVVRDERVVIENPQSVKVGETIEIKTGGRVPLDG MMLNEVAAFNTAALTGESVPRSIRMGEEVLAGMIVTDKVIRIKVIRPFDKSALARILELV QNASERKAPAELFIRKFARVYTPIVIGLAVLIVLLPFIYSLITPQFLFTFNDWLYRALVF LVISCPCALVVSIPLGYFGGIGAASRLGILFKGGNYLDAVTKINTVVFDKTGTLTKGTFE VQFCNCESGVSEEELIRMIASVESSSTHPIAKAVVNYAGQRDIELSSVTDSKEYAGLGLE AAVNGIQVLAGNGRLLSKFQIEYPPELLSITDTIVVCAIGNKYAGYLLLSDSLKEDAKIA IQNLKALGIQNIQILSGDKQSIVSNFAEKLGISEAYGDLLPDGKVKHLEELRQHTENQVA FVGDGMNDAPVLALSNVGIAMGGLGSDAAIETADVVIQTDQPSKVAEAIKVGKLTRRIVW QNISLAFGVKLLVLILGAGGLATLWEAVFADVGVALIAIMNAVRIQKMIK >gi|226332139|gb|ACIC01000181.1| GENE 80 93591 - 95030 846 479 aa, chain + ## HITS:1 COG:mlr2757 KEGG:ns NR:ns ## COG: mlr2757 COG3177 # Protein_GI_number: 13472455 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 20 232 22 214 263 109 34.0 2e-23 
MGISNIKQLYSEWKSLQPLKPEDLKRWNDKFKLEFNYNSNHLEGNTLTYGQTKLLLMFGE TSGNASLKDYEEMKAHNVGLEMIKQEAQDKERPLTESFIRELNRTILVQDYWKNAKTPDG QDIRMQIKVGEYKSRPNSVLTATGEVFSYASPEETPAFMTSLVDWYNLEADKGILTPVEL AALLHYRYIRIHPFEDGNGRIARLLVNFVLHRYGYPMIVIHSEDKSNYLNILHQCDVEAG LTPSDGANATLNDILPFVNYLSSCLIRSLTLAIKAAKGESIEEEGDFDKKIAMLQRRYSD KAIEKSSRSVEQARSAFFELAVYVEQKISGLQKLFDRTFITNTPTWNMARKINTPNPDEP IIQSHILYTKSKEYSFVEDIIRLSKKSQVDYFKLKYDAVTFHFNHCRYAGDNTFDFPFCI YIQYLSDGCEISCDITSDSVKLSYNPDVLAEEGKEYMDTACNELLKLLEEKMNDKSPEN Prediction of potential genes in microbial genomes Time: Thu May 12 03:51:34 2011 Seq name: gi|226332138|gb|ACIC01000182.1| Bacteroides sp. 1_1_6 cont1.182, whole genome shotgun sequence Length of sequence - 49471 bp Number of predicted genes - 42, with homology - 40 Number of transcription units - 21, operones - 10 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 99 - 158 7.8 1 1 Tu 1 . + CDS 337 - 1029 571 ## BT_2516 hypothetical protein + Term 1126 - 1175 8.6 - Term 1583 - 1636 18.0 2 2 Op 1 . - CDS 1661 - 3502 1707 ## COG0821 Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis 3 2 Op 2 . - CDS 3503 - 4018 700 ## COG0041 Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase - Prom 4038 - 4097 2.5 4 3 Op 1 . - CDS 4100 - 4480 547 ## COG0509 Glycine cleavage system H protein (lipoate-binding) 5 3 Op 2 . - CDS 4511 - 5182 516 ## BT_2520 hypothetical protein 6 3 Op 3 . - CDS 5223 - 6707 1632 ## COG1508 DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog - Prom 6753 - 6812 4.3 + Prom 6658 - 6717 2.3 7 4 Tu 1 . + CDS 6846 - 8219 1543 ## COG0006 Xaa-Pro aminopeptidase + Term 8264 - 8309 9.2 + Prom 8269 - 8328 6.9 8 5 Op 1 . + CDS 8421 - 11879 2777 ## BT_2523 alpha-rhamnosidase 9 5 Op 2 . + CDS 11885 - 14530 1903 ## BT_2524 alpha-rhamnosidase + Prom 14534 - 14593 2.1 10 5 Op 3 . + CDS 14613 - 15971 881 ## COG3458 Acetyl esterase (deacetylase) 11 6 Tu 1 . 
- CDS 16141 - 16479 160 ## BT_2526 hypothetical protein - Prom 16528 - 16587 7.0 12 7 Tu 1 . - CDS 16741 - 18414 693 ## PROTEIN SUPPORTED gi|39938628|ref|NP_950394.1| ribosomal protein L13 - Prom 18488 - 18547 6.8 + Prom 18196 - 18255 6.4 13 8 Tu 1 . + CDS 18502 - 20199 1743 ## COG1283 Na+/phosphate symporter + Term 20221 - 20284 19.1 14 9 Op 1 . + CDS 20561 - 21631 927 ## BT_2529 hypothetical protein 15 9 Op 2 . + CDS 21715 - 23040 1005 ## BT_2530 hypothetical protein 16 9 Op 3 . + CDS 23056 - 26184 2574 ## BT_2531 hypothetical protein 17 9 Op 4 . + CDS 26205 - 28232 1787 ## BT_2532 hypothetical protein 18 9 Op 5 . + CDS 28253 - 29155 749 ## BT_2533 hypothetical protein + Prom 29184 - 29243 2.1 19 10 Tu 1 . + CDS 29382 - 29990 673 ## BT_2534 hypothetical protein + Prom 30025 - 30084 5.1 20 11 Op 1 . + CDS 30170 - 30505 313 ## BT_2535 hypothetical protein 21 11 Op 2 . + CDS 30561 - 31049 87 ## BT_2536 hypothetical protein 22 12 Tu 1 . - CDS 31155 - 31538 175 ## BT_2537 hypothetical protein + Prom 31799 - 31858 2.7 23 13 Tu 1 . + CDS 31878 - 32360 607 ## BT_2538 hypothetical protein + Term 32404 - 32443 6.2 - Term 32384 - 32437 9.1 24 14 Tu 1 . - CDS 32464 - 32628 219 ## COG1773 Rubredoxin - Prom 32652 - 32711 6.9 + Prom 32625 - 32684 10.2 25 15 Op 1 1/0.000 + CDS 32763 - 34025 971 ## COG0642 Signal transduction histidine kinase + Prom 34030 - 34089 8.2 26 15 Op 2 . + CDS 34112 - 36802 2568 ## COG0474 Cation transport ATPase 27 15 Op 3 . + CDS 36810 - 37433 666 ## COG1011 Predicted hydrolase (HAD superfamily) 28 15 Op 4 . + CDS 37444 - 38406 410 ## PROTEIN SUPPORTED gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 + Term 38463 - 38523 -0.3 - Term 38262 - 38301 -0.9 29 16 Tu 1 . - CDS 38349 - 38483 59 ## - Prom 38642 - 38701 4.4 + Prom 38462 - 38521 10.4 30 17 Op 1 . + CDS 38631 - 38720 56 ## 31 17 Op 2 . + CDS 38771 - 39574 551 ## COG1266 Predicted metal-dependent membrane protease 32 17 Op 3 . + CDS 39587 - 40243 670 ## BT_2545 hypothetical protein 33 17 Op 4 . 
+ CDS 40266 - 40490 253 ## PROTEIN SUPPORTED gi|163756262|ref|ZP_02163377.1| 50S ribosomal protein L20 - Term 40466 - 40523 14.7 34 18 Op 1 . - CDS 40542 - 40967 498 ## COG2166 SufE protein probably involved in Fe-S center assembly 35 18 Op 2 . - CDS 40998 - 41993 1037 ## BT_2548 leucine aminopeptidase precursor 36 18 Op 3 . - CDS 42065 - 42973 947 ## COG1619 Uncharacterized proteins, homologs of microcin C7 resistance protein MccF 37 18 Op 4 . - CDS 42984 - 43790 848 ## COG2273 Beta-glucanase/Beta-glucan synthetase - Prom 43868 - 43927 7.1 + Prom 43751 - 43810 6.3 38 19 Op 1 3/0.000 + CDS 43970 - 44914 913 ## COG0280 Phosphotransacetylase 39 19 Op 2 . + CDS 44945 - 46006 963 ## COG3426 Butyrate kinase + Term 46224 - 46267 7.6 40 20 Op 1 . - CDS 46316 - 48067 1288 ## BT_2553 hypothetical protein 41 20 Op 2 . - CDS 48097 - 48540 463 ## BT_2554 hypothetical protein - Prom 48717 - 48776 7.6 + Prom 48512 - 48571 6.4 42 21 Tu 1 . + CDS 48753 - 49067 208 ## BT_2555 hypothetical protein + Term 49107 - 49151 8.1 Predicted protein(s) >gi|226332138|gb|ACIC01000182.1| GENE 1 337 - 1029 571 230 aa, chain + ## HITS:1 COG:no KEGG:BT_2516 NR:ns ## KEGG: BT_2516 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 230 1 230 230 369 100.0 1e-101 MDLEETLALKRTNHEKLIRNMDKAIRNEMLKYEEAEFYIRLQSECFNLYPIVVKALALQI IDNKRRSIFCSIVKGHKLKRLADFHKQTPEEIAIEFRSIVCELRCKINNGAFTAKESVNL RLKMERDILEHKIRDYDELCQRLQLKNKILHDQLDMLRDNQKRHSKDEQEITHEKEQEII RKTRKALLEELQRKMEIQIEEQTKNLHHESFVMRCMQWLKNALRLPTVSH >gi|226332138|gb|ACIC01000182.1| GENE 2 1661 - 3502 1707 613 aa, chain - ## HITS:1 COG:CPn0373 KEGG:ns NR:ns ## COG: CPn0373 COG0821 # Protein_GI_number: 15618288 # Func_class: I Lipid transport and metabolism # Function: Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis # Organism: Chlamydophila pneumoniae CWL029 # 5 605 9 598 613 390 39.0 1e-108 MDLFNYFRRETSEVNIGAVPLGGPNPIRIQSMTNTSTMDTEACVEQAKRIVDAGGEYVRL 
TTQGVKEAENLMNINIGLRSQGYMVPLVADVHFNPKVADVAAQYAEKVRINPGNYVDAAR TFKKLEYTDEEYAQEIQKIHDRFVPFLNICKEIHTAIRIGVNHGSLSDRIMSRYGDTPEG MVESCMEFLRICVAEHFTDVVISIKASNTVVMVKTVRLLVAVMEQEGMSFPLHLGVTEAG DGEDGRIKSALGIGALLCDGLGDTIRVSLSEAPEAEIPVARKLVDYVLLRQDHPYIPGME APEFNYLSPSRRKTRAVRNIGGEHLPVVIADRMDGKTEVNPQFTPDYIYAGRTLPEQREE GVEYILDADVWEGEAGTWPAFNHAQLPLMGECSAELKFLFMPYMAQTDEVIACLKVHPEV VVISQSNHPNRLGEHRALVHQLMTEGLENPVVFFQHYAEDEAEDLQIKSAVDMGALIFDG LCDGIFLFNQGSLSHAVVDATAFGILQAGRTRTSKTEYISCPGCGRTLYDLEKTIARIKA ATSHLKGLKIGIMGCIVNGPGEMADADYGYVGAGRGKISLYKGKVCVEKNIPEEEAVERL LEFIRNDRKELEL >gi|226332138|gb|ACIC01000182.1| GENE 3 3503 - 4018 700 171 aa, chain - ## HITS:1 COG:PAB1077 KEGG:ns NR:ns ## COG: PAB1077 COG0041 # Protein_GI_number: 14521838 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase # Organism: Pyrococcus abyssi # 5 164 7 166 174 165 54.0 4e-41 MKMTPIVSIIMGSTSDLPVMEKAAQLLNDMHVPFEMNALSAHRTPEAVEEFAQNARSRGI KVIIAAAGMAAALPGVIAANTTLPVIGLPVKGSVLDGVDALYSIIQMPPGIPVATVAING AMNAAILAIQMLALSDAALAEAFAAYKEGLKKKIVKANEELKEVKFEYKTN >gi|226332138|gb|ACIC01000182.1| GENE 4 4100 - 4480 547 126 aa, chain - ## HITS:1 COG:BH3484 KEGG:ns NR:ns ## COG: BH3484 COG0509 # Protein_GI_number: 15616046 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system H protein (lipoate-binding) # Organism: Bacillus halodurans # 2 125 3 126 128 132 51.0 2e-31 MNFPQNLKYTNEHEWIRVEGDIAYVGITDYAQEQLGDIVFVDIPTVGETLEAGETFGTIE VVKTISDLFLPLAGEILEQNEALEENPELVNKDPYGEGWLIKMKPADASAAEDLLDAEAY KAVVNG >gi|226332138|gb|ACIC01000182.1| GENE 5 4511 - 5182 516 223 aa, chain - ## HITS:1 COG:no KEGG:BT_2520 NR:ns ## KEGG: BT_2520 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 223 1 223 223 325 100.0 7e-88 MDKEKEHIIADKTMLRVARITSIVFTPFSIPFLAFLVLFLFSYLRIMPILYKGIVLGIVY CFTILTPTITIFLFRKINGFARQELSERKKRYVPILLTIISYVFCLLMMRKLNIPWYMTG IIFVSLVISIICILVNLKWKLSEHMAGMGGIIGGLVSFSALFSYNPVVWLCLFILIAGIL 
GSARIVLGHHTLGEVLSGFVVGLVCSFLILHPAYNLIFRVFLF >gi|226332138|gb|ACIC01000182.1| GENE 6 5223 - 6707 1632 494 aa, chain - ## HITS:1 COG:RSc0408 KEGG:ns NR:ns ## COG: RSc0408 COG1508 # Protein_GI_number: 17545127 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog # Organism: Ralstonia solanacearum # 19 494 15 497 499 210 32.0 7e-54 MAQGSRQIQSQAQQQIQTLSPQQILVVKLLELPAVELEDRIHAELLENPALEEGKEDAAT DEYADGVSPEEGMDNDANDYDSLSDYLTEDDIPDYKLQENNRSRDEQAEDIPFSDATSFY EILKEQLRERNLTEHQCELVEYLIGSLDDDGLLRKSLESICDELAIYAGIESTEEELEEA LCILQDFDPAGIGARNLQECLMIQILRKKSEEKKPSPILNLEERIISDCYEEFTRKHWEK IIKKLDVREETFNEAIAEITKLNPRPGASLGETIGRNLQQIVPDFLVDTYDDGTINVSLN NRNVPELRMSRDFTEMVEEHTKNKANQSKESKEAMMFLKQKMDAAQGFIDAVKQRQNTLM TTMQAIIDLQRPFFLEGDESLLKPMILKDVAERTGLDISTISRVSNSKYVQTNFGIYPLK FFFSDGYTTEDGEEMSVREIRKILKECIDGEDKKKPLTDDELADILKEKGYPIARRTVAK YRQQLNIPVARLRK >gi|226332138|gb|ACIC01000182.1| GENE 7 6846 - 8219 1543 457 aa, chain + ## HITS:1 COG:FN1949 KEGG:ns NR:ns ## COG: FN1949 COG0006 # Protein_GI_number: 19705251 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Fusobacterium nucleatum # 1 454 1 454 462 409 47.0 1e-114 MFAKETYMQRRALLKKNLGSGVLLFLGNDECGLNYEDNTFRYRQDSTFLYYFGLSCAGLS AIIDIDEDKEIIFGDELSIDAIVWMGSQPTLREKCERVGIREMMASADIVGYLHKCVQSG KAIHYLPPYRPEHKLKLMDWLGIPPARQEGSVPFIRAVIAQRNYKSAEEIVEIEKACDVT ADMHITAMKVLRPGMYEYEVVAEMNRVAESNNCELSFATIATINGQTLHNHYHGNKVKPG DLFLIDAGAEVESGYAGDMSSTIPADKKFTTRQREVYEIQNAMHLESVKALRPGIPYMDV YELSARVMVDGMKTLGLMKGNTEDAVREGAHALFYPHGLGHMMGLDVHDMENLGEIWVGY NGQPKSTQFGRKSQRLAIPLEPGFVHTVEPGIYFIPELIDMWKAEKKFTNFINYDKVETY KDFGGIRNEEDYLITETGARRLGKKIPLTPEEVEALR >gi|226332138|gb|ACIC01000182.1| GENE 8 8421 - 11879 2777 1152 aa, chain + ## HITS:1 COG:no KEGG:BT_2523 NR:ns ## KEGG: BT_2523 # Name: not_defined # Def: alpha-rhamnosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1152 1 1152 1152 2248 98.0 0 MARTTLLLLSILFLLPTNAAIKKLQVEYLTNPIGLDITAPRFSWQLESAERGVRQTAYQI 
TVATDAACLNPVWTSGKVASDESLHICYAGPALTPSTRYYWKVTVWNNKTGEETSTEKAF FETGLLSDGWSGAQWIKATQINKNSKINPEDKKQTKARMLLEMDVTLTSGNASVLFGARD ASNVFMWSVNTLDNEKEPLIRRHIYDRGRLQSSDTPIGKFFTKSDLLNKEHHLAIEAKDG VVKTYIDKVLVDTYTDTDSKLSNGYIGFRAFRGNNTNETAMFDNIVLTEYEQKSDKEEAK VVLKEDFEKPQSAFEGGEIVSVGGNRKLNMVSGSGDYRVLQVDMSGVPMFRKEFKAKKKI ASARIYSSALGVYDLFINGQRVGNKMEDGSIRYDELKPEWTDFSKTAHYQTYDITDLLRK GENAVGAQVSSGWWNSDVCHGEYGSHEVGFIAKILLKYTDGTSETVVTDLSWLSSMDGAI RMGDIYHGETYDARKESAWTKPGYNTANWNKTAVNPHFKGELIAFAGPTVQVRPHLSRIP LSTTVYQGEKDGKINVVSVTDKPAPIRLKKGETAVYNLGQNMVGWVRFKVKGASGTEMKL RFGEMLNDTGDKSRGDDGPAGSIYTANLRSAKATLKYILKGSKEGESFHPSMTFFGFQYC EITASEDIEVLSLIGEVVGSATEEGASFVTSSRSINQLYSNVMWGQRGNYLSIPTDCPQR DERLGWTGDTQVFCRAASYNANVSAFFEKWMRDMRDGQRSDGAYPDVAPHSWVGYGQAAW ADAGVIVPWTIYLMYDNKKILQDNYASMEKYMEFLSRQKGDGYNYNGAGTNYGDWLSYED TERRYVSVCYYAYTAQLMAKISEALKTDDCDAYASKAKAYRKLAQEIKKEFQTRYVDADG DLKQKSQTAYLLALKLDLFPTEEARKKGVETLVRKIAGNGNRLSTGFVGTAILNQTLSQF GESNTAYDLLLQRNNPSWLYSIDQGATTIWERWDSYTKEKGFGPVSMNSFNHYSYGAVSE WMYRTMGGIDIDETRPGFKHIVLQPVPDNRPEVAAGQERIDWVNASFPSCYGDIKSSWKK ENDGTVSYQVTIPANTTATLHLLLPTLDYVVEESGKTAVKAEGVSSVTFMNGKAVLELQS GTYQFVVKKENK >gi|226332138|gb|ACIC01000182.1| GENE 9 11885 - 14530 1903 881 aa, chain + ## HITS:1 COG:no KEGG:BT_2524 NR:ns ## KEGG: BT_2524 # Name: not_defined # Def: alpha-rhamnosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 881 1 881 881 1800 99.0 0 MKKIYLVTALACCALQPTLAQKQEECKPVNLHCDHLINPLGIDNANPRLSWMLDDARQGA RQTAYQIIVSTDSLKANNENGEIWNSGKKESDQILVTYPEKNLQPFTKYYWKVNVWDKDG KKATSDINSFETGMMGMENWQGAWIGDNRDINYKPAPYFRKTFDTQKKVKSARAYITVAG LYELYINGEKIGNHRLDPLYTRFDRRNFYVTYDVTNQLQKGKNAIGVLLGNGWYNHQSKA VWDFDRAPWRNRPAFCMDLRITYEDGTTEVIRSERDWKTSSGALIFNSIYTAEHYDARLE QKGWNTADFDDSKWKEAGYRAVPSQNVVSQQVQPIRIVETIPAKALKKVNDTTYVFDFAR NMSGVTRIKVSGEEGTVVRLKHGERIYDNGRVNMSNIDVYHRPIDDKDPFQTDILILSGK GEDEFMARFNYKGFRYVEVTSSKPVALDQNSLTAYFVHSDVPQKGEINTSNPLVNRLWWA TNNAYLSNLMGYPTDCPQREKNGWTGDGHFAIETALYNYDGITVYEKWLADHRDEQQPNG VLPDIIPTGGWGYGTDNGLDWTSTIAIIPWNIYLFYGDSKLLADCYENIKRYIDYVDRTS 
PSGLTSWGRGDWVPVKSHSSKELTSSVYFYVDTKILANAAKLFNKQEDYKHYQALAEKIR QAINDKYLNRETGIYASGVQTELSVPLMWGIVPKDMKAKVARNLAKKVEEAGFHLDVGVL GAKAILNALSENGEAETAYKVAAQDTYPSWGCWIANGATTLLENWDLNATRDISDNHMMF GEIGGWFYKGLGGIFPDPQQPGFKHILLRPNFPSDLKQFEARHRSPYGEIQSQWERKKKS VVYSVTIPANSSATLYVPDTVKGERVIELEAGKHTFEWKLL >gi|226332138|gb|ACIC01000182.1| GENE 10 14613 - 15971 881 452 aa, chain + ## HITS:1 COG:TM0077 KEGG:ns NR:ns ## COG: TM0077 COG3458 # Protein_GI_number: 15642852 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acetyl esterase (deacetylase) # Organism: Thermotoga maritima # 144 433 12 303 325 114 28.0 4e-25 MENRMRKRMTVILNSKMNMRRFYVSAALLLTTLLAVAENNPYRSDVFWVTVPDHADWLYK TGEQANVEVQFYKYGIPGDSIAINFEIGGEMMPADTKGTVIMRKGKATIPVGTMKKPGFR DCRLTTTVDGKKYSHHVKVGFSPEKLRPYTTMPADFQQFWENEKAELAKFPLTYTKEHVK KYSTDQIDCYLIKLQVNQRGQSIYGYLFYPKKEGKYPVVLCPPGAGIKTIKEPLRHKYYA EQGCIRFEIEIHGLNPEMSEEEFKEISAAFNGRENGYLSNGLDSRDNYYMKRVYLACVRS IDLLTSLPEWDGKNVIVQGGSQGGALALVTAGLDQRVTACVANHPALSDMAGYKAGRAGG YPHFFRNTVDMDTPEKIRTMAYYDVVNFAQLIRADTYMTWGFNDDVCPPTTSYIVYNVLN CPKEALITPINEHWTSSDTEYGHLLWIKKHLK >gi|226332138|gb|ACIC01000182.1| GENE 11 16141 - 16479 160 112 aa, chain - ## HITS:1 COG:no KEGG:BT_2526 NR:ns ## KEGG: BT_2526 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 112 21 132 132 179 100.0 3e-44 MIGGLFFTRYKYLEVWEEMELRIHARQFWECFLLTLIPALGLSLWFSWWWMVLPFMTYHL LYWFEKAISSHSVFNWEALTYCGDAVYMRKRKSYAWMKWYGKKTFPKSEWED >gi|226332138|gb|ACIC01000182.1| GENE 12 16741 - 18414 693 557 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|39938628|ref|NP_950394.1| ribosomal protein L13 [Onion yellows phytoplasma OY-M] # 39 557 31 546 546 271 30 5e-72 MKQMLQICCKNNNIYKEFPIGSSLLDIYYGFNLNFPYQVVSAKVNNRSEGLNFKVYNNKD IEFLDVRDSSGMRTYVRSLCFVLFKAVSELFPEGKLFVEHPVSKGYFCNLRIGRPITLED VTRIKQRMQEIIAENISYHRIECHTAEAVRVFSERGMNDKVRLLETSGSLYTYYYTLGDT VDYYYGNLLPSTGYIKLFDIVKYYDGLLLRIPNKENPDVLEEVVKQEKMLDVFKEYLNWS YIMGLNNAGDFNLACEEGHATDLINVAEALQEKKIAQIADTIFHRGENGDRVKLILISGP 
SSSGKTTFSKRLSIQLMTNGLKPFPISLDNYFVDREETPLDENGNYDYESLYALDLELFN QQLQALLRGEEVELPRFNFSLGKKEYKGDKLKIEDNTILILEGIHALNPELTPHIPDEKK FKIYVSALTTISLDDHNWIPTTDNRLLRRIIRDFNYRGYSARETISRWPSVRAGEDKWIF PYQENADVMFNSALLFEFAVLRLHAEPILMGVPRNCPEYCEAYRLLKFIKYFVPVQDKEI PPTSLLREFLGGSSFKY >gi|226332138|gb|ACIC01000182.1| GENE 13 18502 - 20199 1743 565 aa, chain + ## HITS:1 COG:TP0771 KEGG:ns NR:ns ## COG: TP0771 COG1283 # Protein_GI_number: 15639758 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Treponema pallidum # 8 552 47 585 593 265 33.0 2e-70 MEYSFYDFLKLIGSLGLFLYGMKIMSEGLQKVAGDRLRSILTAMTTNRVTGVLTGVLITA LIQSSSATTVMVVSFVNAGLLTLAESISVIMGANIGTTVTAWIISIFGFKVDMAAFALPL LAIALPLIFSGKSNRKSIGEFIFGFSFLFMGLSYLKANAPDLNANPEMLAFVQNYTDMGF FSILLFLFIGTILTMIVQASAATMAITLIMCANGWISLELGAALVLGENIGTTITANLAA LTANTQAKRAALAHFVFNVFGVIWVLIIFHPFMELVNWVVDTFFQSNNPEVAISYKLSAF HSIFNICNVCILIWGVKLIERTVCALIHPKEEDEEPRLRFITGGMLSTAELSILQARKEI HLFAERTHRMFGMVQDLMHTEKDDDFNKLFSRVEKYENISDNMELEIANYLNQVSEGRLS SESKLQIRAMLREVTEIESIGDSCYNLARTINRKRQTNQDFTEKQYEHIHFMMKLTDDAL AQMIVVVEKPEHQSIDINKSFNIENEINNYRNQLKNQNILDVNNKEYDYQMGVYYMDIIA ECEKLGDYVVNVVEASSDVKEKKAS >gi|226332138|gb|ACIC01000182.1| GENE 14 20561 - 21631 927 356 aa, chain + ## HITS:1 COG:no KEGG:BT_2529 NR:ns ## KEGG: BT_2529 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 356 1 356 356 760 99.0 0 MGVALAAMCLMSAQAQRRNEIQVPNLNGYTTLKCDFHMHSVFSDGLVWPTVRVDEAYREG LDAISLTEHIEYRPHKKDIIADHNRSYELSQKQAKKLGILLIRGSEITRSMPPGHFNAIF LNDSNPLEQKAYKDAFNEAKKQGAFIFWNHPGWARQQPDSTLWWPEHTQLYNDGCMHGIE VANGGLFMPEAIQWCLDKNLTMIGTSDIHQPIQTDYDFSKGEHRTMTFVFVKERSPEGIR EALDNRRTAVYYRELVIGREEILRPFFEKCVDIKEVKRTEKEVTFSVMNATDLVLKLKKT AHDPSLVYFREMTLKPHTQHTISVKFDNGIKGGDCNFEVTNFIVAPDKGLDYTIKL >gi|226332138|gb|ACIC01000182.1| GENE 15 21715 - 23040 1005 441 aa, chain + ## HITS:1 COG:no KEGG:BT_2530 NR:ns ## KEGG: BT_2530 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 1 441 1 441 441 900 100.0 0 MNRYLSLACSLVAFFYLSSCESYSEPTEKAHQSDQPIVLTSFYPKEGGARDKILLDGKNF GTDVSKIKVYFNNARASVVSSSGSRIYAIVPRLPGDNPRISVVVDTDSVAYEETYAYHTQ ALVSTVTGNGETSFLAGTLSTAQVYGKYLDLDAEGNLFMSWRDGGTFGIARINEQQNIVT PLIEADAANRILYANGLTVDRATGMLTAAHESVKEVTFTFDPREAWYPRQRNIKYSQSDL NSIVDADRYKNFVTFCPYDGYLYTRYRDGKIAKIDPNTFEGHIIHQGPSGSQYGQAINPA KPWLLYITLHSNAPTAYRQGISVLDLRDPDGTGGFKRLNAPGGSAFRDGPIEDALFNYPK DIKFDNDGNMFVADYGNHCIRMISADNIVTTVAGQPGVAGYKDGGPVESLFKNPWGVAVN EQGDIYIADWGNARIRKLVIE >gi|226332138|gb|ACIC01000182.1| GENE 16 23056 - 26184 2574 1042 aa, chain + ## HITS:1 COG:no KEGG:BT_2531 NR:ns ## KEGG: BT_2531 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1042 1 1042 1042 2058 99.0 0 MKIKQILYLMSFCLLLSATTFAQQVQEFTVSGTVVDKDKLSMPGVSVYIKDKPSKGTTTD NDGNFKLKVVYGDKMIFSFIGFKPSEHVAIKPQTDLTVTLLEDNNSLDEVVVVALGNVQR KISSVGAITTVSTKDIQSPSPSISNLLGGRAAGVISLQSSGEPGQNIADFWVRGIGTFGA NSSALVLIDGLEGDLNTIDPADVESFSILKDASATAVYGVRGANGVVLVNTKKGLSGKIQ ITGRFNTTLSVLNRLPKYLGAYEYAQLANEARAVRNETPLYDDTEMGIIRDGLDPDLYPN VNWQDEILNKTFWRHTYYVSGRGGSDVARYFLSLGGKNESAAYKVDKNSIYSSNVSYNTY NYRINLDVNLTKSTKVYLGSDGFLSQLNQPGVANTEYIWGAQSRLTPLSIPTQYSNGMLP GRGAGELSSPYVMINHTGKAANEVYKGKSTLAINQDFSELINGLKLRVQGAYDIHSYFNE RRSVQPALYNALGRASDGSLIMQETVQEKKASYSKSTRQYRKYHFEATLNYDRLFGTDHR TSALVYYYISDSKDTDDATSNLSAIPLRYQGVSSRFTYGYKDTYLLDVNFGYTGSENFQP GRQYGFFPSVALGWVPTGYKFIQEALPWLDYLKIRASYGSVGNDRITDVRFPYLTKVNEG TGSAWGVPDIEIIGETRIGADNLAWEKAIKSNVGIEGKLFNNKIDFVVDIFHDQRNGIFQ QRVQVPEFVGVVSNPYANVGKMKSYGADGNISFTQDFTPDFGFTLRGNFTYSKNKVQNWE QAYLEYPYLEYNNFPYNSIRGYQAVGLFKDEDDIKYGPKQTFGEVMPGDIKYKDINGDGM VDKLDMVPLTHSTYPLLMYGLGAEIRYKNLTLGVLFKGTGKTSFFYVGQKTTVNDETRLN GMGYMPFFEGNTGNVLTLAANPANRWIPKDYAIAHGIDPSLAENKDARFPRLQYGNNSNN SQLSTFWQGDSRYIRLQEITLNYHLAAAFLRRIGVSSLDIQFVGNNLYVWDKVKLFDPEQ ARWNGRVYPIPSTYSLQLYVNL >gi|226332138|gb|ACIC01000182.1| GENE 17 26205 - 28232 1787 675 aa, chain + ## HITS:1 COG:no KEGG:BT_2532 NR:ns ## KEGG: BT_2532 # Name: not_defined # Def: hypothetical protein # 
Organism: B.thetaiotaomicron # Pathway: not_defined # 1 675 1 675 675 1342 99.0 0 MKTSLIKKTARTFFLVQAVMLVTSCSNFLNIDNYFDDEFNIDSVFTNARYMEAYMWGAAA MFPDEAQTIRNGYTPGPMATDEGFNGLTGSGTANIYHGMDFANGKITQDFLGDGSPNLNQ WGKYYKIIRKCNSILQNLDRPKDLTNSERLRLEGYTRFIRAFAYYNLLVDYGPAILLGDE VVNTNEPIEYYDRPRATYDETMNYTCEEFEKAARLLPSPQDVSTLDFGRPTKGAALGLVA RLRLIHASPLFNGGPVASSYFGNWKRKTDGVHYVSQQYDESRWAVAAAAAKRVMDMGAYK LYTYAKDDNTPELPAGVTSDPNFYDAYPNGADGIDPFKSYSDIFTGEAVASINPELIWGR NTTYLNETISKGSFPPSMGGWGRFCVTQKVVDAYLMDDGRTKEEAAADGYYSETGFTSEP RNFSGYPLNAGVYKMYANREMRFYASVGFNEAVWQALSSSTLNNHTAKYYYQDEDGRGGV TATSPNYPITGYVIKKYNNPMDAWTGTGARHIQKAYPIIRYAEILLSYAEALNNLTGPHT VELDGQPYTMSRDKEEIKKAFNQVRYRAGLPGLTANQLNSTPEVQKQIERERMVEFLWEN RRFYDVRRWGIYEETEQEPIRGMNPDGATKETYYQRVIPSSSSFLTRVVDKRSVWVPIPR NEMRRLPSLDQNPGY >gi|226332138|gb|ACIC01000182.1| GENE 18 28253 - 29155 749 300 aa, chain + ## HITS:1 COG:no KEGG:BT_2533 NR:ns ## KEGG: BT_2533 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 300 1 300 300 566 99.0 1e-160 MKRDFITILALMTALSSCNDDALFEKEMYKNVVALISSDYYNTFEEVVTLSPTGEEATGY IAACTGGTHAPKQDMVIGLEEDLTSLEFYNRSLYDVDEACYAKFLSSDKYEIVDYKIQIN AGERTGRTMIKIRPEGLSPDSTYFVGLKATDISGVEINESKNTILYQILIKNDYATQGEN VYYSMTGLADGMATAGNKKMFPLTHNSVRMIAGTESFESNVDHINKTAIILQVEENNHVT IKPYKDIEVTQIDGDSKYPNTFKVEESFGHTYNVFLLSYRYTKDGKSKVMQEELRLEINN >gi|226332138|gb|ACIC01000182.1| GENE 19 29382 - 29990 673 202 aa, chain + ## HITS:1 COG:no KEGG:BT_2534 NR:ns ## KEGG: BT_2534 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 202 1 202 202 407 99.0 1e-112 MSAQYDLYETPDIKQTGEKQPLHPRIVPKGTIGQDEFLDRVHKFTGISRSLLAGAMQSFQ NELKDLLANGWIVELGEIGYFSVSLQGPPVMHKKDVRAQSIKLKNINYLPTKQFKREVGY DMRLERTESLTRPKGNGRSEAECLALITAHLEKYPCMTRTDYCHMTGHDKKRALKELNAF IEKGILMRYGAGKQVVYAKKMI >gi|226332138|gb|ACIC01000182.1| GENE 20 30170 - 30505 313 111 aa, chain + ## HITS:1 COG:no KEGG:BT_2535 NR:ns ## KEGG: BT_2535 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 111 1 111 111 205 100.0 4e-52 MRKEVEQLALMGAMPDETDEPTDVLIDKYADLLGKITKPITLDEAHILIKLFPPTALYGI EWTLLHLIESVYPETEFIKYKELINECNSVEFREMLVQRLNNSQQNKVTNK >gi|226332138|gb|ACIC01000182.1| GENE 21 30561 - 31049 87 162 aa, chain + ## HITS:1 COG:no KEGG:BT_2536 NR:ns ## KEGG: BT_2536 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 162 1 162 162 263 100.0 2e-69 MNKRIISFICAGLIAISCVSSLCFYDYFSTDGITPNAIYWIFYFFQISILLGQAGMGIYT INALLRQKKEAVSFLSYFAAHLLLQNMTPLLQTWAVPSITEYPVWLLMCQVIFSILLLLS LLINEPSSLFRKSEYIPRILMATFYINCTYLALTIIGTIVAL >gi|226332138|gb|ACIC01000182.1| GENE 22 31155 - 31538 175 127 aa, chain - ## HITS:1 COG:no KEGG:BT_2537 NR:ns ## KEGG: BT_2537 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 127 1 127 127 231 100.0 9e-60 MISCLLRIHSLRFFDGGLHEVPVEEVMTVALIAIFIFVPCLWLVRYLKDKRLDSDEKDKF IQIHIISTIVLTIAGGVFARMSGLVNKWIDEQPLIIAVYIALALSSFGIIVGKMIDFWVY CLRKKMF >gi|226332138|gb|ACIC01000182.1| GENE 23 31878 - 32360 607 160 aa, chain + ## HITS:1 COG:no KEGG:BT_2538 NR:ns ## KEGG: BT_2538 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 160 1 160 160 284 100.0 6e-76 MALRYVVKKRMIGFGKEKTEKYVAQNFITNTVNFKDLCDEISKVGMVPSGAVKFVLDALI DTLNINLNKGISVQLGDFGQFRPGISSESQETEAAVDSDTIRRVKIVFTPRYKFKDMLKK ASIQKLVTGSEIVDNGNTPTPPNPDGGGDGGDEEAPDPTV >gi|226332138|gb|ACIC01000182.1| GENE 24 32464 - 32628 219 54 aa, chain - ## HITS:1 COG:CAC2778 KEGG:ns NR:ns ## COG: CAC2778 COG1773 # Protein_GI_number: 15896033 # Func_class: C Energy production and conversion # Function: Rubredoxin # Organism: Clostridium acetobutylicum # 1 53 1 53 54 80 75.0 8e-16 MKKYICTVCEYIYDPAQGDPESGIEPGTAFEDIPDDWTCPLCGVGKEDFEPYEG >gi|226332138|gb|ACIC01000182.1| GENE 25 32763 - 34025 971 420 aa, chain + ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal 
transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 165 418 55 315 328 180 38.0 6e-45 MLIALLSILLILCAVLLLSYLLERRKCLRMQKRLFRESRKLEHTNHIASAILKNVHAYIL LIDNDFKVLKTNYYQLTGTQKGLEEKRVGDLLQCRNALSAEGGCGTHAYCGSCPIRCAIR QAFEQRRGFTDLNATLNVLTSEKKSVECDAVISGSYFLLNEEENMVLTVHDITQLKQAEK QLALAKEKAENADLSKSTFLANMSHEIRTPLNAITGFAEILASANTEEEKAQYQEIIKMN ADLLLQLVNDILDMSKIEAGTLEFVYTKVDINLLLSDLRQLFQMKVNDAGGNIQIIAEPS LPSCSIETDRNRVAQVLSNFTTNAIKFTQEGTISIGYEARDTELYFYVTDTGAGIPADKL PEVFGRFVKLNKDKKGTGLGLSISKTIVNKLEGQIGADSVEGKGSTFWFTIPYRTCGRPE >gi|226332138|gb|ACIC01000182.1| GENE 26 34112 - 36802 2568 896 aa, chain + ## HITS:1 COG:MTH1001 KEGG:ns NR:ns ## COG: MTH1001 COG0474 # Protein_GI_number: 15679019 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Methanothermobacter thermautotrophicus # 13 889 24 830 844 417 33.0 1e-116 MSTTKKDDYYHVGLTDDEVLQSREKNGVNLLTPPKRPSLWKLYLEKFEDPVVRVLLVAAV FSLIISIIENEYAETIGIIAAILLATGIGFFFEYDANKKFDLLNAVNEETLVKVIRNGHV QEIPRKDVVVDDIIILETGEEIPADGQLLEAISLQVNESNLTGEPVINKTVIEADFDEEA TYASNLVMRGTTVVDGHGTMRVLHVGDATEIGKVARQSTEENLEPTPLNIQLTKLANLIG KIGFTVAGLAFLIFFVKDVLLYFDFSSLNGWHEWLPVFERTLKYFMMAVTLIVVAVPEGL PMSVTLSLALNMRRMLSTNNLVRKMHACETMGAITVICTDKTGTLTQNLMQVHEPNFYGI KNGGNLGDDDISALVAEGISANSTAFLEEAATGEKPKGVGNPTEVALLLWLNSQGRNYLE LREHARILDQLTFSTERKFMATLVESPIIGKKVLYIKGAPEIVLGKCKEVVLDGRRVDAV EYRSTVEAQLLNYQNMAMRTLGFAFKIVGENEPNDCTELVSANDLNFLGVVAISDPIRPD VPAAVAKCQSAGIGIKIVTGDTPGTATEIARQIGLWQPETDTDRNRITGVAFAELSDEEA LDRVMDLKIMSRARPTDKQRLVQLLQQKGAVVAVTGDGTNDAPALNHAQVGLSMGTGTSV AKEASDITLLDDSFNSIGTAVMWGRSLYKNIQRFIVFQLTINFVALLIVLLGSMIGTELP LTVTQMLWVNLIMDTFAALALASIPPSETVMQEKPRRSTDFIISKAMRNNILGVGTIFLA VLLGMIYYFDHSAQGMDVHNLTIFFTFFVMLQFWNLFNARVFGTTDSAFKGLSKSYGMEL IVLAILVGQFLIVQFGGAVFRTEPLDWQTWLLIIGVSSTVLWVGELIRLVQRSIHK >gi|226332138|gb|ACIC01000182.1| GENE 27 36810 - 37433 666 207 aa, chain + ## HITS:1 COG:L111950 KEGG:ns NR:ns ## COG: L111950 COG1011 # Protein_GI_number: 15672092 # 
Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Lactococcus lactis # 6 196 3 193 207 118 35.0 7e-27 MKSKGIKNLLIDLGGVLINLDRERCIENFKKIGFQNIEEKFCTHQLDGIFLQQEKGLITP AEFRDGIREMMGKMVSDKQIDAAWNSFLVDIPTYKLDLLLKLREKYVVYLLSNTNDIHWK WVCKNAFPYRTFKVEDYFEKTYLSYEMKMAKPEPEIFKAVTEDAGIDPKETFFIDDSEIN CKVAQELGISTYTPKAGEDWSHLFRKK >gi|226332138|gb|ACIC01000182.1| GENE 28 37444 - 38406 410 320 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 [Bacillus selenitireducens MLS10] # 11 310 15 309 317 162 33 3e-39 MQIINDTSAMIPEPCVATIGFFDGVHMGHRYLIQQVKEIAAAKGLRSALVTFPVHPRKVM NAAYHPELLTTPEEKTNLLAGTGVDYCLMLDFTPDISRLTAKEFMTQILKERYQVKYLVI GYDHRFGHNRSEGFDDYVRYGQAIGIEVIRAQAYTDDIQIDTTQSAPVSSSLIRKLLHQG DVDVAARCLGYEYFLDGTVVGGYQVGRKIGFPTANLSVDDPDKLIPADGVYAVWVTFDGK TYMGMLNIGVRPTIGNGPNRTIEVNILHFHSDIYDKFIRLTFVKRTRPELKYDSIDELIA QLHKDAEETEAILLAAKQEK >gi|226332138|gb|ACIC01000182.1| GENE 29 38349 - 38483 59 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFFDINGLVAAEKRLINVVNLEKKGDHFSCLAASRMASVSSASL >gi|226332138|gb|ACIC01000182.1| GENE 30 38631 - 38720 56 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKSLMSISMLVVSISMIAMFLCSFIASK >gi|226332138|gb|ACIC01000182.1| GENE 31 38771 - 39574 551 267 aa, chain + ## HITS:1 COG:RSc3402 KEGG:ns NR:ns ## COG: RSc3402 COG1266 # Protein_GI_number: 17548119 # Func_class: R General function prediction only # Function: Predicted metal-dependent membrane protease # Organism: Ralstonia solanacearum # 119 221 141 243 285 67 40.0 3e-11 MRTAIKLILLNLLIAQIIAPILVMIPCVIYLVATTGNLDKDTLITMLMIPAQLAGQLMMG IYLWKAGYISTKKITWSPVSSSYLFFSALAVLTCGFVVSALAELMKWIPNIMEQSFDILQ SGWGGILAIAVIGPVLEELLFRGAITKALLQQYSPTKAILLSAFLFGVFHINPAQILPAF LIGILFAWTYYKTASLIPCILMHILNNSLSVFLSTKYPEAENMSDLMDTSSYLIAIFVAV LILAGVIWAMRRTTVRYPWKEETNITD >gi|226332138|gb|ACIC01000182.1| GENE 32 39587 - 40243 670 218 aa, chain + ## HITS:1 COG:no KEGG:BT_2545 NR:ns ## KEGG: BT_2545 # Name: not_defined # Def: hypothetical protein # 
Organism: B.thetaiotaomicron # Pathway: not_defined # 1 218 1 218 218 405 99.0 1e-112 MKRRLNILSILVFIVLGLSLYTTGYQFGTGMKAGMSLAKEQHKESKACDRPMLAGDFRFV DVVPTIAMIQPDTIINAQNNEKVPVMYTQMAVRTGKEVNYSYLIISSSCSLMNALLTISA LIIFVMLILSINKSQIFEWKNVRRLRWLGSLLIMSFVFYLIPQVVNYWGLKEVFALEHYI IAPLALQTTDLLLGLGCLIVAETFAIGLKMKEEQELTI >gi|226332138|gb|ACIC01000182.1| GENE 33 40266 - 40490 253 74 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163756262|ref|ZP_02163377.1| 50S ribosomal protein L20 [Kordia algicida OT-1] # 1 67 1 67 67 102 71 6e-21 MAIIVNLDIMMARRKISLGELAEKIDITPANLSILKTGKAKAIRFSTLEAICKVLDCQPA DILEYQEEEKENER >gi|226332138|gb|ACIC01000182.1| GENE 34 40542 - 40967 498 141 aa, chain - ## HITS:1 COG:XF0994 KEGG:ns NR:ns ## COG: XF0994 COG2166 # Protein_GI_number: 15837596 # Func_class: R General function prediction only # Function: SufE protein probably involved in Fe-S center assembly # Organism: Xylella fastidiosa 9a5c # 14 137 23 146 146 105 43.0 2e-23 MSINELQDEVIAEFSDFDDWMDRYQLLIDLGNEQEPLDEKYKTEQNLIEGCQSRVWLQAD DVDGKIVFNAESDALIVKGIIALLIKVLSGHTPDEILNADLYFIDKIGLRDHLSPTRSNG LLSMVKQIRMYALAFKAKEGK >gi|226332138|gb|ACIC01000182.1| GENE 35 40998 - 41993 1037 331 aa, chain - ## HITS:1 COG:no KEGG:BT_2548 NR:ns ## KEGG: BT_2548 # Name: not_defined # Def: leucine aminopeptidase precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 331 1 331 331 663 99.0 0 MKKKSMIVLTSAFVLLSAFSCGGGSKATGTNEQSEKVVVNVPQFDADSAYLYVKNQVDFG PRVPNTKEHVACGNYLAGKLEAFGAKVTNQYADLIAYDGTLLKARNIIGSYKPESKKRIA LFAHWDTRPWADNDADEKNHHTPILGANDGASGVGALLEIARLVNQQQPELGIDIIFLDA EDYGTPQFYEGKHKEEAWCLGSQYWSRNPHVQGYNARFGILLDMVGGENSVFLKEGYSEE FAPDINKKVWKAAKKAGYGKTFIDERGDTITDDHLFINRLARIKTIDIIPNDPETGFPPT WHTIHDNMDHIDKNTLKAVGQTVLEVIYNEK >gi|226332138|gb|ACIC01000182.1| GENE 36 42065 - 42973 947 302 aa, chain - ## HITS:1 COG:CAC0293 KEGG:ns NR:ns ## COG: CAC0293 COG1619 # Protein_GI_number: 15893585 # Func_class: V Defense mechanisms # Function: Uncharacterized proteins, homologs of microcin C7 resistance protein MccF # Organism: 
Clostridium acetobutylicum # 9 273 6 285 306 151 33.0 2e-36 MNIQLPSFLQKGDEVVIVSPSSKIDKEFLKRAKKRLESWGLKVSVGKYAGGSSGRYAGTV RQRLQDLQSAMDDPKVKAILCSRGGYGAVHLVDKIDFTAFREHPKWLLGFSDITALHNLF QKNGYASLHSLMARHLSVEPEEDPCVAYLKDILFGNLPVYTCEKHKLNRQGTAEGILRGG NMAVAYGLRGTPYDIPAEGTILFLEDVSERPHAIERMMYNLKLGGVLEKLSGLIIGQFTE YEEDCSLGKELYPALADLVKEYDYPVCFNFPVGHVTHNLPLINGAKVELTVGKKNVELKF IC >gi|226332138|gb|ACIC01000182.1| GENE 37 42984 - 43790 848 268 aa, chain - ## HITS:1 COG:CC0380 KEGG:ns NR:ns ## COG: CC0380 COG2273 # Protein_GI_number: 16124635 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucanase/Beta-glucan synthetase # Organism: Caulobacter vibrioides # 10 266 5 299 301 108 28.0 8e-24 MKQFRNLLFISTLLLCWVLVSCKTTRVSSSGWSLVWEENFNQKKGFDPQVWSKIPRGKAD WNNYMTDFDSCFAMRKGKLVLRGIANQTLPNDTAPYLTGGVYTKGKKAFLDGRIEICAKL NAAKGAWPAIWLLPENAKWPSGGEIDVMERLNDDSIAYQTVHSHYTYTLGIKDHPKSHST GVIHPDRYNVFAVEMYPDSLSFYVNDMHTFTYPRIQTQKEGQFPFDQPFYLLIDMQLGGS WVGAVDPKDLPVEMEVDWVRFYQRKSLK >gi|226332138|gb|ACIC01000182.1| GENE 38 43970 - 44914 913 314 aa, chain + ## HITS:1 COG:CAC3076 KEGG:ns NR:ns ## COG: CAC3076 COG0280 # Protein_GI_number: 15896327 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Clostridium acetobutylicum # 4 303 2 299 301 173 34.0 3e-43 MEPIQNFAQLTAHLKQQNRRKRIAVVCANDPNTEYAITRALEEGIAEFLMIGDSAILEKY PALKQYPDYVKTIHIEDSDEAAREAVRIVREGGADILMKGIINTDNLLHAILDKEKGLLP KGKILTHLAVMEIPTYHKLLFFSDAAVIPRPTLQQRIEMIWYAICTCRHFGIEQPRVALI HCTEKVSAKFPHSLDYVNIVELAEAGEFGNVIIDGPLDVRTACEQASGDIKGIVSPINGQ ADVLIFPNIESGNAFYKSVSLFANADMAGLLQGPICPVVLPSRSDSGLSKYYSIAMACLQ VAGCECREKLNLNR >gi|226332138|gb|ACIC01000182.1| GENE 39 44945 - 46006 963 353 aa, chain + ## HITS:1 COG:CAC3075 KEGG:ns NR:ns ## COG: CAC3075 COG3426 # Protein_GI_number: 15896326 # Func_class: C Energy production and conversion # Function: Butyrate kinase # Organism: Clostridium acetobutylicum # 2 353 3 355 355 356 49.0 3e-98 MKILVINPGSTSTKIAVYENETPLLVRNIKHTVEELSVYPQVIDQFEFRKNLVLQELEAN 
GIPFAFDAVIGRGGLVKPIPGGVYAVNEAMKQDTLHAMRTHACNLGGLIAAELAASLPDC PAFIADPGVVDELEDVARISGSPLMPKITIWHALNQKAIARRFAKEQGTKYEELDLIICH LGGGISIAVHQHGKAIDANNALDGEGPFSPERAGTLPAGQLIDICYSGQFTKDELKKRIS GRAGLTAHLGTTDVPAIIKAIEEGDKKAELILDAMIYNVAKAIGGAATVLCGKVDAILLT GGIAYSDYIISRLKKRISFLAPIHVYPGEGEMESLAFNALGALRGELPVQIYK >gi|226332138|gb|ACIC01000182.1| GENE 40 46316 - 48067 1288 583 aa, chain - ## HITS:1 COG:no KEGG:BT_2553 NR:ns ## KEGG: BT_2553 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 583 1 583 583 1159 100.0 0 MMKQIPYGLTDFARIQKDNYYYVDKTMFIERIEMQPAYLFLIRPRRFGKSLTLAMLEAYY DVVYANDFDELFGHLYIGQHPTPKHNCYLIMRFNFSEVSSNVNEVERSFKLHCCSKLRDF VFKYEDLLGKEIWDVLDEEIQQDPGAFLSAINSYASRKGNLPIYLLIDEYDNFTNTILST YGTEYYQKATHGEGFVRGFFNVIKAATTGTGSALQRMFITGVSPVTMDDVTSGFNIGTNI TTDPWFNDLVGFSEAELREMLTYYKEQGVLMQTVDETIVMMKPNYDNYCFSRSRLADCMF NSDMVLYFMKSFVLHGEKPDEIVDPNIRTDFNKLAYLIRLDHGLGENFSVIKEIAEQGEI TTDIATHFSALEMTDVRNFKSLLFYFGLLSIKGVDMVGRPILHVPNLVVREQLFSFLIQG YIKHDIFKIDMNRMTMLFENMAFRGDWKPLFNFIAEAIREQSRIREYIEGEAHIKGFLLA YLGMYRYYQLYPEYELNKGFADFLFKPSPSVPVMPPLTYLLEVKYAKAGASEKEIRALAD GAREQLLRYSQDELVTEARAKGGLKLITIVWCSWELVLLEEVL >gi|226332138|gb|ACIC01000182.1| GENE 41 48097 - 48540 463 147 aa, chain - ## HITS:1 COG:no KEGG:BT_2554 NR:ns ## KEGG: BT_2554 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 147 1 147 147 232 100.0 3e-60 MEAEELTVGRVHHGHNIRRFRIEKNMNQEVLSQLVHLSQSAVSKYEQMRVIDDEMLHRFS RALGVPFEYLKSLEEDAQTVVFENNTVNNNDQASANIGGYVEENNRVNNYNPIEKITELY ERLLKEKDEKYAALERRLQNIEQSLQK >gi|226332138|gb|ACIC01000182.1| GENE 42 48753 - 49067 208 104 aa, chain + ## HITS:1 COG:no KEGG:BT_2555 NR:ns ## KEGG: BT_2555 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 104 1 104 104 160 99.0 1e-38 MIKVLTETEYKFLKEFIALIEKYNLLFTTDEKGEIEVTVHRQQLATETNQNSDLLHTPIH LGDCFDETELNYILDQTDERIKQIAEEYQSEEIFTQHLSKNNKS Prediction of potential genes in microbial genomes Time: Thu 
May 12 03:53:28 2011 Seq name: gi|226332137|gb|ACIC01000183.1| Bacteroides sp. 1_1_6 cont1.183, whole genome shotgun sequence Length of sequence - 30755 bp Number of predicted genes - 24, with homology - 24 Number of transcription units - 15, operones - 5 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 107 - 150 8.2 1 1 Op 1 . - CDS 237 - 1880 1644 ## BT_2559 hypothetical protein 2 1 Op 2 . - CDS 1899 - 5201 3688 ## BT_2560 hypothetical protein - Prom 5392 - 5451 5.7 3 2 Tu 1 . - CDS 5480 - 6418 870 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 6438 - 6497 2.2 4 3 Tu 1 . - CDS 6524 - 7084 429 ## BF4327 putative RNA polymerase ECF-type sigma factor - Prom 7148 - 7207 4.5 - Term 7157 - 7199 6.5 5 4 Tu 1 . - CDS 7226 - 9352 1543 ## PROTEIN SUPPORTED gi|62291006|ref|YP_222799.1| polynucleotide phosphorylase/polyadenylase - Prom 9430 - 9489 3.8 + Prom 9334 - 9393 9.8 6 5 Tu 1 . + CDS 9548 - 10699 1149 ## BT_2564 hypothetical protein + Term 10802 - 10850 11.7 + Prom 10731 - 10790 5.8 7 6 Op 1 . + CDS 10880 - 11344 666 ## COG0782 Transcription elongation factor + Term 11347 - 11390 6.5 8 6 Op 2 . + CDS 11397 - 11789 344 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases + Term 11810 - 11851 7.1 - Term 11796 - 11837 4.0 9 7 Op 1 8/0.000 - CDS 11865 - 12785 695 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 10 7 Op 2 . - CDS 12792 - 14024 947 ## COG0477 Permeases of the major facilitator superfamily - Prom 14231 - 14290 9.6 + Prom 13990 - 14049 3.9 11 8 Tu 1 . + CDS 14278 - 14781 526 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Term 14849 - 14888 5.0 + Prom 14811 - 14870 2.6 12 9 Op 1 . + CDS 15041 - 16486 1558 ## COG0076 Glutamate decarboxylase and related PLP-dependent proteins 13 9 Op 2 . + CDS 16495 - 17460 1234 ## COG2066 Glutaminase + Term 17488 - 17526 6.1 - Term 17668 - 17727 6.9 14 10 Tu 1 . 
- CDS 17879 - 18619 572 ## BT_2572 putative potassium channel subunit - Prom 18805 - 18864 3.9 + Prom 18585 - 18644 8.4 15 11 Tu 1 . + CDS 18836 - 20548 1292 ## COG0531 Amino acid transporters + Term 20570 - 20630 14.2 - Term 20562 - 20613 9.7 16 12 Tu 1 . - CDS 20615 - 21874 1045 ## COG0642 Signal transduction histidine kinase + TRNA 22080 - 22152 82.1 # Phe GAA 0 0 + TRNA 22164 - 22239 75.3 # Pro CGG 0 0 17 13 Tu 1 . + CDS 22479 - 24116 1449 ## BT_2662 alpha-galactosidase precursor - Term 24140 - 24181 4.1 18 14 Op 1 . - CDS 24210 - 25676 1474 ## BT_2663 TPR repeat-containing protein 19 14 Op 2 . - CDS 25698 - 26639 672 ## COG0226 ABC-type phosphate transport system, periplasmic component 20 14 Op 3 . - CDS 26644 - 27456 818 ## BT_2665 TonB 21 14 Op 4 . - CDS 27483 - 28136 716 ## BT_2666 hypothetical protein 22 14 Op 5 . - CDS 28152 - 28757 426 ## BT_2667 hypothetical protein 23 14 Op 6 . - CDS 28787 - 29584 921 ## COG0811 Biopolymer transport proteins - Prom 29604 - 29663 7.1 24 15 Tu 1 . - CDS 29737 - 30180 317 ## BT_2669 hypothetical protein - Prom 30330 - 30389 5.1 - 5S_RRNA 30509 - 30606 97.0 # CP000140 [D:147281..147431] # 5S ribosomal RNA # Parabacteroides distasonis ATCC 8503 # Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; Porphyromonadaceae; Parabacteroides. 
Predicted protein(s) >gi|226332137|gb|ACIC01000183.1| GENE 1 237 - 1880 1644 547 aa, chain - ## HITS:1 COG:no KEGG:BT_2559 NR:ns ## KEGG: BT_2559 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 547 1 547 547 1093 99.0 0 MKKNLYIANLLVAGLMTFTGCTDGFESDNANKAGYTPELQEYDLQKYVLNLGVMQQGIYF NYDWGKGTDWTFQTIQNLGHDMFAGYFHDMNNSFNDKNSVYALNDGWTGSAWTYTYGYIM TAAQKSEKINQEQKGLLGVTKILKVELMHRIADTYGPIVYTKFGQEEGDNVDSQEAAYRQ FFKDLDEGVKLINEYKKDNAALEPFAAADILMPAGKRTLSQWVKFANSLRLRLAIRVSMS APDLARAEVAKAMDATAGGVLESADETVAVSTESGYKNPLGTVNEGWGEVFMGATMESIL VGYNDPRLIKYYSKAEGGDIKDEKGNLIIAERNKIAGTYKGVPQGTGVTVKDENRYRLHS KSTITKQTDAILMTAAEVWFLRAEAALRGFTSENVKDCYEKGITVSCQQWGVNVGDYLTS DATPSDYVDAFEAKYDVKALIKVTPKWDGGASPEEQLERILTQKWIACYPEGYEAWTEQR RTGYPQLFKVFVNNSGGAIDTDIRIRRLPYPSDIQKNNPTQYSALKKALGGEDNGGTRLW WDTGRNF >gi|226332137|gb|ACIC01000183.1| GENE 2 1899 - 5201 3688 1100 aa, chain - ## HITS:1 COG:no KEGG:BT_2560 NR:ns ## KEGG: BT_2560 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1100 1 1100 1100 2113 99.0 0 MKITLFLLLFVTFQAYCENGYSQSTKISIPRSTLKVSELLTQIESQTDYLFVYNKKSVDT KRVVNVDATNQPVSEILDKAFEGTGIHYVVEGNNIVLTKNVTERASTQQQKQITVKGVVT DMRGEPIIGASVVEKGTTNGTITDLDGNFTLKVPSDAIINITYVGYQPQSLSVSGQTTFK IKMEEESMALEQVVVTAMGIKKKAASLTYSTQQLGGDELTRAKDPNMIAALAGKSAGVQI SKSASGLGGSAKVSIRGIRSANEDGNNQPLYVIDGVPMLNSTTESVSTVMGGKNDGVNRD AGDGISNLNPDDIESMSILKGASAAALYGSQAANGVILITTKKGKAGVQRITYSSGLTVD HAISLPEFQNSYGAKAGSSWGEKENVKDYDNAGNWFSNGVTAINSVSVMTGNEKLQTYFS YANTTAKGIVDSNKLRKHNLTLRESASLFNDRLKLDANANLMVQTIKNSPTSGGIYLNPL VDVYGFPRGQDLSEYATNFEKFDENRNMPVQSWFTTPSEWTQNPYWVKNRILNNNKRYRA LASLTANLKATDWLNLQARGNVDYVSDKFDQKMYASTSPAIVGNNGRYVYADTQNFLIYG DVMAMFNKHFGDFSVNAALGGSINVQTENSLTLDSKTASLYKPNLFTVPNIVMNTSASAE QVIDRRRTIQSVFATAQVGWKDALFLDLTARNDWSSTLSHTNSVDKGFFYPSVGLSWVIN ESFKLPQWISFGKIRGSWAQVGNDLPIGITDLSDIIGAGGKLQAVDVYNRGDLKPEISNS IEFGLEWRLFNSRVDFDFTYYKTDTKNQLLKVPTSAGEDYAYRYINAGKIRNKGFEITLG 
ATPVLTEDFRWKTTFNYSQNRNKVVELAPNYTSFVYGDPGFSMAYMMQIKEGGSLGDIYG NTFARDEQGKIKVSDEGKPISESGNKTYLGNCNPDFLLSWANTFTYKGFSLYFLIDMRKG GDVMSLTQATLDGRGVTKATAEARDRGYVEVDGTRFENVEGFYSVVGDRNGISERYMYSA TNIRLRELSLSYSFPSTLLAKTKVFTGIDVSLVGRNLFFFYKDAPFDPDAILSVGNTNQG VDVFGMPTTRNIGFNVKFTF >gi|226332137|gb|ACIC01000183.1| GENE 3 5480 - 6418 870 312 aa, chain - ## HITS:1 COG:CC1130 KEGG:ns NR:ns ## COG: CC1130 COG3712 # Protein_GI_number: 16125382 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Caulobacter vibrioides # 83 306 80 302 307 72 26.0 9e-13 MNQDLLHKYFKGETSVDEEKRILNWVDESEENRKTLQKERMLFDIALFTDSKRKMREKAG AGARIIPMLRWSARIAAAVIVAVSCGFLINEYQYDKLAQLQTVTVPAGQRAQITLADGTK VWLNSESTLSYRSDFGRRDRDVELDGEAYFEVAKNKEIPFYVNTETNQVRVVGTHFNVCA YKGSNEFETTLVEGIVDIYAYDNERPLTRLTKNEFFGSYQGKYKKAVLPSYDYLRWREGL YCFDDSPFSCILTKLEKYYNVKITVENPKVLNYRCTGKFKEQDGVEHILKVIQKDHPFTY SISEEGDSIRIE >gi|226332137|gb|ACIC01000183.1| GENE 4 6524 - 7084 429 186 aa, chain - ## HITS:1 COG:no KEGG:BF4327 NR:ns ## KEGG: BF4327 # Name: not_defined # Def: putative RNA polymerase ECF-type sigma factor # Organism: B.fragilis # Pathway: not_defined # 1 186 1 186 186 250 74.0 2e-65 MHQTNNISSIQLFSEFFHENQEKFLSFAYSYIRYRQEAEDILMESMITLWENRDKWEEDS NLHGLLLTIIKNKALNYLAHLQVRLRAEEEINSHSQRELDLRISTLEACEPDVIFDSEIQ HIVQKALKRMPNQSRQIFILSRYQNTPNKKIAEQLGISVKSVEFHITKALKILRTELKDY LVSILF >gi|226332137|gb|ACIC01000183.1| GENE 5 7226 - 9352 1543 708 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|62291006|ref|YP_222799.1| polynucleotide phosphorylase/polyadenylase [Brucella abortus bv. 1 str. 
9-941] # 14 701 13 694 714 598 47 1e-170 MINPIVKTIELPDGRTITLETGKLAKQADGSVMLRMGNTMLLATVCAAKDAVPGTDFMPL QVEYKEKFAAFGRFPGGFTKREGRASDYEILTCRLVDRALRPLFPDNYHAEVYVNIILFS ADGVDMPDALAGLAASAALAVSDIPFNGPISEVRVARIDGQFVINPTFDQLEKADMDLMV AATYDNIMMVEGEMHEVSEAELLEAMKVAHEAIKVHCKAQMELTEEVGKTVKREYNHEVN DEELRKAVREACYDKAYAVAASGNNNKHERFDAFDAIREEFKAQFSEEELEEKAPLIDRY YHDVEKEAMRRSILDEGKRLDGRKTTEIRPIWCEVGPLPGPHGSAIFTRGETQSLTSVTL GTKLDEKIIDNVLEHGKERFLLHYNFPPFSTGEAKAQRGVGRREIGHGHLAWRALKGQIP ADYPYVVRVVSEILESNGSSSMATVCAGTLALMDAGVKIKKPVSGIAMGLIKNPGEEKYA VLSDILGDEDHLGDMDFKVTGTRDGITATQMDIKVDGLSYEILERALNQAKEGRMHILDK ITETIAEPRADLKDHAPRIETMTIPKEFIGAVIGPGGKIIQGMQEETGAVITIEETDGMG RIEVSGTNKKCIDDAMRMIKAIVAVPEVGEVYKGKVRSIMPYGAFIEFLPGKDGLLHISE IDWKRLETVEEAGIKEGDEIEVKLIDIDPKTGKFKLSRKVLMPRPEKK >gi|226332137|gb|ACIC01000183.1| GENE 6 9548 - 10699 1149 383 aa, chain + ## HITS:1 COG:no KEGG:BT_2564 NR:ns ## KEGG: BT_2564 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 383 1 383 383 746 100.0 0 MKKITFVALAALTITACSSGPEFEVNGDISGADGKMLYLEASGLEGIVPLDSVKLKGEGT FKFKQPRPESPEFYRLRVDNKVINFSVDSIETLQINAPYVDFSTAYTVEGSENSSKIKEL TLKQINLQKNVDEQLNALRANKLGHDTFEENLATLLKNYKEDVKVNYIFAAPNTAAAYFA LFQKLNDYLIFDPLNNKDDVKCFAAVATSLNNTYPDAVRSKNLYNIVIKGMKNTRQPQSK SLEIPQDKIIETGIIDIALRDVKGNTRKLTDLKGKVVLLDFSVFQSPAGAPHNMMLRELY NKYAKDGLEIYQVSLDADEHYWKTAAGNLPWVCVRDGNGVYSTNVAVYNVRQVPSIFLIN RNNELKLRGEDIKDLEASVKSLL >gi|226332137|gb|ACIC01000183.1| GENE 7 10880 - 11344 666 154 aa, chain + ## HITS:1 COG:RC1332 KEGG:ns NR:ns ## COG: RC1332 COG0782 # Protein_GI_number: 15893255 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Rickettsia conorii # 4 152 51 199 206 117 46.0 1e-26 MAYMSEEGYKKLMAELKELETVERPKISAAIAEARDKGDLSENAEYDAAKEAQGMLEMRI NKLKTVIADAKIIDESKLKTDSVQILNKVELKNVKNGMKMTYTIVSESEANLKEGKISVN TPIAQGLLGKKVGDIAEITVPQGKIALEVVNISI >gi|226332137|gb|ACIC01000183.1| GENE 8 11397 - 11789 344 130 aa, chain + ## HITS:1 COG:MA0811 KEGG:ns NR:ns ## COG: MA0811 COG0537 
# Protein_GI_number: 20089695 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Methanosarcina acetivorans str.C2A # 4 101 17 119 150 89 44.0 1e-18 MATIFSRIIAGEIPCYKIAENDRFFAFLDINPLVKGHTLVVPKQEVDYIFDLSDEDLAAM HIFAKKIARAIEKAFPCKKVGEAVIGLEVPHAHIHLIPIQKESDMLFSNPKLKLSDEEFK SIAQAISSSL >gi|226332137|gb|ACIC01000183.1| GENE 9 11865 - 12785 695 306 aa, chain - ## HITS:1 COG:BS_yyaM KEGG:ns NR:ns ## COG: BS_yyaM COG0697 # Protein_GI_number: 16081133 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus subtilis # 17 269 15 267 305 77 25.0 3e-14 MKNKKLEANLSMAVSKIFSGLNMNALKYLLPLWMSPLTGATLRCTFAAAAFWVIGWFMPP EKSSAKDKWLLFLLGAFGLYGFMFLYLAGLSKTTPVSSSIFTSLQPIWVFLIMIFFYKEK ATWKKILGISIGLVGALVCILTQQSDDLASDAFVGNMLCLISSIVYAVYLILSQRILSSI GAMTMLRYTFSGAAVSAIIVTFITGFDAPVFSMPFHWTPFLILMFVLIFPTTISYMLLPV GLKYLKTTVVAIYGYLILIVATIASLALGQDRFSWTQTCAIIFICIGVYLVEVAESKDKS PDPLKK >gi|226332137|gb|ACIC01000183.1| GENE 10 12792 - 14024 947 410 aa, chain - ## HITS:1 COG:AGc4286 KEGG:ns NR:ns ## COG: AGc4286 COG0477 # Protein_GI_number: 15889635 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 21 339 15 324 400 74 27.0 4e-13 MKIQTGRGTIPLITLIAIWSISALTSLPGLAVSPILGDLTKIFPKATDLDIQMLTSLPSL LIIPFILLGGKLTEKVDFVRILKIGLWLFAASGILYLISNKMWQLIVVSALLGIGSGLII PLSTGLISKYFVGTYRVKQFGLSSAITNFTLVIATAVTGYLAEVSWHLPFLVYLLPLISI LLVGHLKESRSDAAVEPSSQSTTPSEQTAAADTGGSKYGIHIRHLLQIMLFYGVTTFIVL AVIFNLSFLMEKHHFSSGNSGLMISLFFLAIMAPGFCLDKIVDELKERTKAYSLLSMAVG 
LALIWIAPIEWLIIPGCILVGLGYGIIQPMLYDKTTHTALPQKATMALAFVMMMNYLAIL LYPFIIDFLQNLFHTQSQEFPFVFNLLITIVTFFWAYLRRDTFLFNDQLK >gi|226332137|gb|ACIC01000183.1| GENE 11 14278 - 14781 526 167 aa, chain + ## HITS:1 COG:mll3697 KEGG:ns NR:ns ## COG: mll3697 COG1595 # Protein_GI_number: 13473184 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 1 164 1 161 183 99 37.0 4e-21 MKSVSFRKDLVGVQDELLRFAYKLTTDREEANDLLQETSLKALDNEDKYMPDTNFKGWMY TIMRNIFINNYRKVVRDQTFIDRTDNLYHLNLPQETGFESTEKTYDLKEMHRVVNSLPKE YRVPFAMHVSGFKYREIAEKLNLPLGTIKSRIFFTRQKLQEELKDFR >gi|226332137|gb|ACIC01000183.1| GENE 12 15041 - 16486 1558 481 aa, chain + ## HITS:1 COG:sll1641 KEGG:ns NR:ns ## COG: sll1641 COG0076 # Protein_GI_number: 16329656 # Func_class: E Amino acid transport and metabolism # Function: Glutamate decarboxylase and related PLP-dependent proteins # Organism: Synechocystis # 30 443 36 448 467 444 50.0 1e-124 MEDLNFRKGDAKTDVFGSDRMLQPSPVERIPDGPTTPEVAYQMVKDETFAQTQPRLNLAT FVTTYMDDYATKLMNEAININYIDETEYPRIAVMNGKCINIVANLWNSPEKDTWKTGALA IGSSEACMLGGVAAWLRWRKKRQAQGKPFDKPNFVISTGFQVVWEKFAQLWQIEMREVPL TLDKTTLDPEEALKMCDENTICVVPIQGVTWTGLNDDVEALDKALDAYNAKTGYDIPIHV DAASGGFILPFLYPDTKWDFRLKWVLSISVSGHKFGLVYPGLGWVCWKGKEYLPEEMSFS VNYLGANITQVGLNFSRPAAQILGQYYQFIRLGFQGYKEVQYNSLQIAKYIHGEIAKMAP FVNYSENVVNPLFIWYLKPEYAKTAKWTLYDLQDKLSQHGWMVPAYTLPSKLEDYVVMRV VVRQGFSRDMADMLLGDIKNAIAELEKLDFPTPTRMAQEKNLPVEAKMFNHGGRRHKTVK K >gi|226332137|gb|ACIC01000183.1| GENE 13 16495 - 17460 1234 321 aa, chain + ## HITS:1 COG:ECs0538 KEGG:ns NR:ns ## COG: ECs0538 COG2066 # Protein_GI_number: 15829792 # Func_class: E Amino acid transport and metabolism # Function: Glutaminase # Organism: Escherichia coli O157:H7 # 9 312 6 308 310 275 46.0 1e-73 MDKKVTLAQLKEVVQEAYDQVKTNTGGKNADYIPYLANVNKDLFGISVCLLNGQTIHVGD TDYRFGIESVSKVHTAILALRQYGAKEILDKIGADATGLPFNSIIAILLENDHPSTPLVN AGAISACSMVQPIGDSAKKWDAIVENVTDLCGSAPQLIDELYKSESDTNFNNRSIAWLLK 
NYNRIYDDPDMALDLYTRQCSLGVTALQLSVAAGTIANGGVNPVTKKEVFDASLAPKITA MIAAVGFYEHTGDWMYTSGIPAKTGVGGGVMGVLPGQFGIAAFAPPLDGAGNSVKAQLAI QYVMNKLGLNVFSDNHLIVVD >gi|226332137|gb|ACIC01000183.1| GENE 14 17879 - 18619 572 246 aa, chain - ## HITS:1 COG:no KEGG:BT_2572 NR:ns ## KEGG: BT_2572 # Name: not_defined # Def: putative potassium channel subunit # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 243 1 243 243 417 98.0 1e-115 MKSALSDFILGKKGIYGILHIIILLMSLFLVISISIDTFKGIPFYTQSSYMKIQLWICIW FLFDFVLEFFLAKHKWRYIRTHFIFLLVAIPYQNIIAYYGWTFSPEVTYLLRFIPLLRGG YALAIVVGWLTYNRASSLFVSYLTMLLATVYFASLAFFVLEHKVNPLVTDYGDALWWAFM DVTTVGSNIIAMTTTGRVLSVLLAALGMMMFPIFTVYITNLIQQSNQKKKQYYAEEEEKK EPAQGT >gi|226332137|gb|ACIC01000183.1| GENE 15 18836 - 20548 1292 570 aa, chain + ## HITS:1 COG:BMEII0909 KEGG:ns NR:ns ## COG: BMEII0909 COG0531 # Protein_GI_number: 17989254 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Brucella melitensis # 9 482 23 497 510 512 59.0 1e-145 MANIKNAVKLGVFTLAIMNVTAVVSLRGLPAEAVYGMSSAFYYLFAAIVFLIPTSLVAAE LAAMFQDKQGGVFRWVGEAYGKKLGFLAIWVQWIESTIWYPTVLTFGAVSIAFIGMNDVH DMSLANNKYYTLAVVLIIYWLATFISLKGMSWVGKVAKIGGMVGTIIPAALLIILGIIYL ATGGQSNMDFHSSFIPDFTNFDNVVLAASIFLFYAGMEMGGIHVKDVENPSKNYPKAVFI GALITVLIFVLGTFALGVIIPAKDINLTQSLLVGFDNYFKYIHASWLSPIIAIALAFGVL AGVLTWVAGPSKGIFAVGKAGYMPPFFQKTNKLGVQKNILFVQGIAVTVLSLLFVVMPSV QSFYQILSQLTVILYLIMYLLMFSGAIALRYKMKKLDRPFRIGKSGNGLMWFIGGLGFCG SLLAFILSFIPPSQISTGSNTVWFSVLIIGAIIVVVAPFIIYAAKKPSWKDPDTDFEPFH WEVQPQTVTAGVTANSANAARPVSASTSSASPASSATAKPASTPTSGPASAGSTTSGSAP ASNPASSGGSTTASNGSPDSGNKDAGAPKS >gi|226332137|gb|ACIC01000183.1| GENE 16 20615 - 21874 1045 419 aa, chain - ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 137 408 26 310 328 182 38.0 1e-45 MALQFITVISLAAFIYVLIKYQSVKKNMNLKASVMLRESNELTSILQNINVYYLLIDRDF 
IVHNTNYYTLNHLTPAQGERKKVGDLLHCRNAIAAGECGKHEQCKLCCIRTSISKAFYEK QDFKKLDASMKLLSADEKTVISCDVSVSGAYLKINGEERMVLTVYDVTELRNMQRLLDVE RENAISADKLKSAFIANMSHEVRTPLNAIVGFSGLMATATSEEEKKMYTDIISENNERLL RLVNDIFDLSQIDAGVLNFVYSEFDANDLLRELKGLFGMRLNENPLVTLVCEAHLEPIMM HSEKQRIIQVLTNLLHNAMKFTKTGEIRFGCHMEGTDEVCFYVSDTGIGIPEEEQEKIFS RFTKLDREMQGTGLGLTLSQIIVHNLGGRCGVESEVGKGSTFWFILPLVTKTEITTDKI >gi|226332137|gb|ACIC01000183.1| GENE 17 22479 - 24116 1449 545 aa, chain + ## HITS:1 COG:no KEGG:BT_2662 NR:ns ## KEGG: BT_2662 # Name: not_defined # Def: alpha-galactosidase precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 39 545 1 507 507 1057 99.0 0 MKNRFHLLATAVVSLLCVSCAETQVSQSGSEKAVNPPIMGWSSWNAFRVDISEDIIKNQA DLMVKKGLKDAGYHYINIDDGFFGERDGNGKMQTNKNRFPNGMKPVADHIHSLGMKAGIY TDAGNNTCGSIWDNDHAGVGAGIYGHEQQDAQLYFGDWGFDFIKIDYCGGDVLGLNEQER YTSIRNSIDKVNKDVSVNICRWAFPGTWAKDVATSWRISGDINAHWGSLRYVVGKNLYLS AYAKDGHYNDMDMMVIGFRDNSKVGGKGLTPTEEEAHFGLWCIMSSPLLIGCNLESLPES SLELLTNKELIALNQDPLGLQAYVAQHENEGYVLVKDIEQKRGNVRAVALYNPSDTVCSF SVPFSALEFGGNVKVRDLAKRSDLGNFSDVFEQTLPAHSAMFLRMEGETRLEPTLYEAEW AYLPMFNDLGKNPKGIIYAADQEASGKMKVGFLGGQPENYAEWPEVYSADGGRYQMTVHY SFGKGRQLEIDVNGIVTKIDSLGEDDKHNQVTIPVDLKAGYNHIRMGNSYNWAPDIDCFT LTKGM >gi|226332137|gb|ACIC01000183.1| GENE 18 24210 - 25676 1474 488 aa, chain - ## HITS:1 COG:no KEGG:BT_2663 NR:ns ## KEGG: BT_2663 # Name: not_defined # Def: TPR repeat-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 488 1 483 483 784 99.0 0 MKRVQMFLAGAFIAVGSLYAQSSDAEWQAGVAKLKETIQTNPAQANEEAEQLIKGKNKKN VELLVAIGEVYLKAGKLPEAQEYAALAKKVNGKSALASVLEGDIAFEQKNAGLASQKYEE AIYFDPSCTEAYLRYADIYKSANAALAIEKLEQLKTREPSNTAVDKKLAEIYYLKNDFSK AAEAYSRFAMGPTATEEDLVKYAFALFLNHDFEKSLEVANMGLKKNARHAAFNRLAMYNY TDLKRFDEAIKAADAFFTESDKADYSYLDYMYYGHLLEALKKYDEAVEQYEKAIQLDPTK TDLYKNISSAYEQKNDYKKAISAYQKYYASLDKEHQTPDLQFQFGRLYYGAGTQTDSLTI NAEERKQALMAADSVFHSIAEAAPDSYLGNFWRARANSALDPETTLGLAKPFYEEVATLL ESKNDPHYNSALIECYSYLGYYYLLAIENPALKAEAMANKEKSKEYWSKILAIDSTNATA KRALDGIK >gi|226332137|gb|ACIC01000183.1| GENE 19 
25698 - 26639 672 313 aa, chain - ## HITS:1 COG:TM1264 KEGG:ns NR:ns ## COG: TM1264 COG0226 # Protein_GI_number: 15644020 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Thermotoga maritima # 37 301 23 271 274 85 24.0 9e-17 MKRQFWLVGIVLLAVLSACRSKSDGPTDTYSSGVISIAADESFEPIIQEEIDVFESLYPL AGIVPRYTTEVEAINLLLKDSVRLAIATRTLTEEEMNSFHSRKFFPREIKLATDGLALIV NRDNPDSLLTVRDFSRILTGEAKDWKDINPNSRLKSIQVVFDNKNSSTVRYTMDSICGGK PLATDNVSALKTNQQVIKYVAENPGAMGVIGVNWLGNRSDTTNLSFTEEIRVMAVSAEDV ATPANSYKPYQAYLYYGNYPLARPIYALLNDPRSALPWGFASFMTSDKGQRIILKSGLVP ATQPVRIVHVKDE >gi|226332137|gb|ACIC01000183.1| GENE 20 26644 - 27456 818 270 aa, chain - ## HITS:1 COG:no KEGG:BT_2665 NR:ns ## KEGG: BT_2665 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 270 1 270 270 474 100.0 1e-132 MAKLDLASSEWCQLIFEGKNQAYGAYRMRANSPRRHTIAMLIVVAIAIVGFTIPTLLKLA TPEQKEVMTEVTTLSKLAEPEIKQEEMKRVEPVAPPPPALKSSIKFTAPVIKKDEEVHED DEIKSQEELTSTKVAISIADVKGNDEANGADIADLKQVVTQAAEPEKVFDMVEQMPTFPG GQQELMSYLGKNIKYPTIAQENGTQGRVIIQFVVERDGTITDVHVARGVDPYLDKEAVRV VQSMPKWIPGKQNGKAVRVKFTVPVMFRLQ >gi|226332137|gb|ACIC01000183.1| GENE 21 27483 - 28136 716 217 aa, chain - ## HITS:1 COG:no KEGG:BT_2666 NR:ns ## KEGG: BT_2666 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 217 1 217 217 355 100.0 6e-97 MSAEVQESSGKRGKGKQKKMTVRVDFTPMVDMNMLLITFFMLCTTLSKPQTMEISMPSND KDITEQQKSMVKASQAITLLLGADNKLYYYEGEPNYKDYTSLKETSYGASGLRAVLLQKN AAAVNQVRALKQQKLDLKITDDEYKKKVSEIKSGKNTPTVIIKATDDASYMNLIDALDEM QICNIGKYVITDIAEADEFLIKNYDAKGELSQNLADK >gi|226332137|gb|ACIC01000183.1| GENE 22 28152 - 28757 426 201 aa, chain - ## HITS:1 COG:no KEGG:BT_2667 NR:ns ## KEGG: BT_2667 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 201 1 201 201 363 100.0 2e-99 MGRAKIKKKDTFIDMTAMSDVTVLLLTFFMLTSTFVKKEPVQVTTPASVSEIKIPETDVL 
QILVDQEGKIFMSLDKQQDMQAVLESMGEEYGIKFTPEQAKRFTVASTFGVPIRSMQKFL DLPEDQRDKILKNEGIPCDSTDNQFKSWVRNARAANADLRIAIKADASTPYSVIKNVMNS LQDLRENRYNLITSLKAESEN >gi|226332137|gb|ACIC01000183.1| GENE 23 28787 - 29584 921 265 aa, chain - ## HITS:1 COG:FN1312 KEGG:ns NR:ns ## COG: FN1312 COG0811 # Protein_GI_number: 19704647 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Fusobacterium nucleatum # 52 261 1 200 202 89 30.0 5e-18 METTKKSQIVGIKNAGIVIICCFIVAVCIFQFLLGNPSNFMNNDPNNHPLNMLGTIYKGG VIVPVIQTLLLTVLALSIERFFALRSAFGKGSLAKFVANIKEALAAGDLKKAQEICDKQR GSVANVVTATLRKYEEMEKNTALPKEQKLLAIQKELEEATALEMPMMQQNLPIIATITTL GTLMGLLGTVIGMIRSFAALSAGGGADSMALSQGISEALINTAFGILTGALAVISYNYFT NKIDKLTYGLDEVGFSIVQTFAATH >gi|226332137|gb|ACIC01000183.1| GENE 24 29737 - 30180 317 147 aa, chain - ## HITS:1 COG:no KEGG:BT_2669 NR:ns ## KEGG: BT_2669 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 147 1 147 147 231 99.0 8e-60 MKGSVRFILAIVLLMLSYSIGGNAMNITSCISQNECSCQLSESTSDTNSNSSFKTRSLGY DRSSQSLSISDVELGFKSVTETSSNNYRLRRIIESNDFFKDVMYKFFLLRENSLVLDQSK SYYSDKDPHYSIICSDYYVFALRRILI Prediction of potential genes in microbial genomes Time: Thu May 12 03:54:26 2011 Seq name: gi|226332136|gb|ACIC01000184.1| Bacteroides sp. 1_1_6 cont1.184, whole genome shotgun sequence Length of sequence - 11305 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 5, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 5 - 862 329 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 + Term 955 - 1017 11.0 + Prom 1195 - 1254 6.7 2 2 Op 1 . + CDS 1428 - 3026 1208 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Prom 3028 - 3087 4.9 3 2 Op 2 . + CDS 3113 - 3901 752 ## COG2816 NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding 4 2 Op 3 . 
+ CDS 3908 - 4471 335 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases + Prom 4474 - 4533 3.2 5 3 Op 1 . + CDS 4556 - 5605 600 ## COG1835 Predicted acyltransferases 6 3 Op 2 . + CDS 5615 - 6790 1041 ## COG2311 Predicted membrane protein + Term 7020 - 7062 8.1 - Term 7007 - 7050 12.1 7 4 Tu 1 . - CDS 7096 - 8841 2012 ## COG1109 Phosphomannomutase - Prom 8889 - 8948 3.4 8 5 Op 1 . - CDS 8974 - 10611 1511 ## COG4690 Dipeptidase 9 5 Op 2 . - CDS 10632 - 11303 655 ## BT_1550 hypothetical protein Predicted protein(s) >gi|226332136|gb|ACIC01000184.1| GENE 1 5 - 862 329 285 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 1 279 170 444 458 131 27 3e-30 LIIVGGGYIGLEFASMYAGFGSKVTLLEAGNRFMPRNDSDIAKSVREVMEKKGVEIRLNV RTQSIHDTHDGVTLTYSDTSDGTPYFVDGDAILIATGRKPMIEGLNLQAAGVEVDAHGAI VVNDQLHTNAPHIWAMGDVKGGAQFTYVSLDDFRIIRDQLFGDKKRDINDRDPLPYAVFI DPPLAHIGITEEEALRKGYSFKVSRLPATSVVRSRTLQQTDGMLKAIINSHSGKIMGCTM FCTDAPELINMVAMAMKTGQTSTFLRDFIFTHPSMSEGLNQLFDV >gi|226332136|gb|ACIC01000184.1| GENE 2 1428 - 3026 1208 532 aa, chain + ## HITS:1 COG:SMa2385 KEGG:ns NR:ns ## COG: SMa2385 COG0488 # Protein_GI_number: 16263742 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Sinorhizobium meliloti # 1 532 1 525 529 289 33.0 1e-77 MPISIQQISYIHPDKEVLFSDLNFAISKGQKLGLVGNNGCGKSTLLQIIAGQLAPSSGVI VRPDDLYYIPQHFGQYDSLTIAQALQIDHKQKALHAILAGDASIENFSILNDDWNIEERS VAALDSWGLGQFPLSYPMNLLSGGEKTRVFLAGMDIHNPSVILMDEPTNHLDSSGRQRLY DWVEKWRSTLLVVSHDRTLLNLLPEICELEKHQISYYGGNYEFYKEQKTLMQEALQQRIE EKEKALRTARKVARETAERRDKQNVRGEKRNLKKGVPRIVLNGLQSKSEKSTAKLTGAHQ EKAEKLTDERNQLRSSLSPTAALKTDFNNSGLHTGKILVTAEEINFGYHPDQPQLWQAPL SFQLKSGDRLRIEGGNGSGKTTLLKIITGQLQPLKGTLIQANFTYVYLNQEYSIINDRNS ILEQAYAFNGRNLPEHEIKIILNRYLFPASEWDKSCRKLSGGEKMRLAFCCLMISNNMPD MFILDEPTNNLDIQSIEIITATIKNYAGTVIAISHDNYFIQEIGVEQSIFLS 
>gi|226332136|gb|ACIC01000184.1| GENE 3 3113 - 3901 752 262 aa, chain + ## HITS:1 COG:MA1439 KEGG:ns NR:ns ## COG: MA1439 COG2816 # Protein_GI_number: 20090298 # Func_class: L Replication, recombination and repair # Function: NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding # Organism: Methanosarcina acetivorans str.C2A # 3 258 24 279 285 189 39.0 3e-48 MNQTAESLWFVFYKDELLLEKEGNGTFAIPCGEIPLILIKDKTTVHNITTLEGRNCKAYS VSQPIEETEQYAMIGLRASYEYLPLGHYQAAGKAHEILHWDRNSLFCSACGTPMEQKESI MKRCPNCGREVYPAISTAILVLVRKGDSILLVHARNFKGRFNSLVAGFLETGETLEECVA REVKEETGLDVKNITYFGNQPWPYPSGLMVGFIADYAGGEIKLQEEELSSGDFYTRDNLP ELPRKLSLARKMIDWWLECSHK >gi|226332136|gb|ACIC01000184.1| GENE 4 3908 - 4471 335 187 aa, chain + ## HITS:1 COG:CAC3336 KEGG:ns NR:ns ## COG: CAC3336 COG0664 # Protein_GI_number: 15896579 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 31 186 38 194 199 66 29.0 3e-11 MENIIKGIRRYYPVSDSSLEMLFSHMKKLELPKKHLLIHGGVSDRHVYFIEKGFCRSYCL RDGEEITIWFSREGDITFAMKDLYHNQPGYEYVELLEDCKLYAIRIDELNQIYETNIEIA NWGRVIHQECLLYMDIHHINRLYLPAKERYEQLLREQPDVIHRAQLGYIASFLGMTPQHL SRLRSES >gi|226332136|gb|ACIC01000184.1| GENE 5 4556 - 5605 600 349 aa, chain + ## HITS:1 COG:Rv0517 KEGG:ns NR:ns ## COG: Rv0517 COG1835 # Protein_GI_number: 15607658 # Func_class: I Lipid transport and metabolism # Function: Predicted acyltransferases # Organism: Mycobacterium tuberculosis H37Rv # 36 349 77 407 436 83 23.0 7e-16 MINTLTSLRIFFALMVFGAHCYVLDTSFDAHFFKEGFVGVSFFFVLSGFIIAYNYQEKLL TKTTTKRTFWVARIARIYPLHLLTLLIAACIGGYVQYSDTTDWIKHFVASTFLLQPFFPS ADYFFSFNSPSWSLGCEQLFYFCFPFVIPFLNSRRKLLVILSICLPVMLAGMYLTADEQI KAYWYVNPITRLPDFFVGVLLYQIYQALHNKKISYSTGTLSEVASVALFLLFYLCAADIP KVYRYSCYYWLPVSLMILIFAFQKGGISRLLSNRFLIIGGEISYSFYLIHLFIILTYTKM AALYQWQVSWMISVPLIFGITITLSLLSYYYFEKPANKWVKRILTKKQS >gi|226332136|gb|ACIC01000184.1| GENE 6 5615 - 6790 1041 391 aa, chain + 
## HITS:1 COG:BS_yrkO KEGG:ns NR:ns ## COG: BS_yrkO COG2311 # Protein_GI_number: 16079697 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 9 386 17 385 405 95 25.0 1e-19 MELLKKTPRIEVVDALRGFAVMAILLVHNLEHFIFPVYPESSPEWLSVLDAGVFNAIFSL FAGKSYAIFALLFGFTFYIQSHNQQLKGKDFGYRFLWRLVLLAGFATLNAAFFPAGDVLL LFVVVGIILFIVRKWSDKAILITAILFSLQPIEWFHYIMSLFNPAYTLPDLNVGAMYSEV ADYTKNGTFWEFLIGNVTLGQKASLFWAIGAGRFLQTAGLFLFGLYIGRKELFVTSESHL KFWTKALIIAAISFAPLYSLKEQIMQSDSSLIQQTVGTAFDMWQKFAFTIVLIASFILLY QKAKFQKTVSNLRFYGKMSLTNYITQSIMGAIIYFPFGFYLAPYCGYTLSLIIGIILFLL QVQFCKWWLSKHKQGPLETIWHKWTWLGAKK >gi|226332136|gb|ACIC01000184.1| GENE 7 7096 - 8841 2012 581 aa, chain - ## HITS:1 COG:CAC2337 KEGG:ns NR:ns ## COG: CAC2337 COG1109 # Protein_GI_number: 15895604 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Clostridium acetobutylicum # 12 553 5 549 575 459 44.0 1e-129 MENQELIKQVTEKAEKWLTPAYDAETQAEVKRMLENEDKTELIEAFYKDLEFGTGGLRGI MGVGSNRMNIYTVGAATQGLSNYLKKNFKDLPQISVVVGHDCRNNSRLFAETSANIFSAN GIKVYLFDDMRPTPEMSFAIRHLGCQSGIILTASHNPKEYNGYKAYWDDGAQVLAPHDAG IIDEVNNIASAADIKFKGNPDLIQIIGEDIDKIYLDMVKTVSIDPEAIARHKDMKIVYTP IHGTGMMLIPRALKMWGFENVFTVPEQMIKDGNFPTVISPNPENAEALSMAVNLAKEIDA DLVMASDPDADRVGIACKDDKGEWVLINGNQTCMMYLYYILTQYKQLGKIKGNEFCVKTI VTTELIKKIADKNNIEMLDCYTGFKWIAREIRLREGKKKYIGGGEESYGFLAEDFVRDKD AVSACCLIAEVAAWAKDNGKSLYQLLLDIYVEYGFSKEFTVNVVKPGKSGAEEIKAMMEN FRANPPKELGGSKVILSKDYKTLKQTDAEGKVTDLDMPETSNVLQYFTEDGSKVSVRPSG TEPKIKFYMEVQGEMGCRNCFASADAAAMEKIEAVKKSLGI >gi|226332136|gb|ACIC01000184.1| GENE 8 8974 - 10611 1511 545 aa, chain - ## HITS:1 COG:MA3377 KEGG:ns NR:ns ## COG: MA3377 COG4690 # Protein_GI_number: 20092191 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidase # Organism: Methanosarcina acetivorans str.C2A # 22 500 2 538 574 179 27.0 1e-44 MKKRIILSAVLFLAALANSFACTNLIVGKNASTDGSTIVSYSADSYGLFGELYHYPAATY PKGTMLKVYEWDTGKYLGEIEQARQTYNVTGNMNEFQVTIGETTFGGRPELADSTGIIDY 
GSLIYIGLQRSRSAREAIRIMTDLVQQYGYYSGGESFTIADPNEIWIMEMIGKGPGVRGA VWVAVRVPDDCISAHANQSRIHQFDMNDKENCITSPDVISFAREKGYFNGVNKDFSFAEA YAPLDFGARRFCEARVWSYFNKYTDHGNDYLPYIEGKTDTPMPLFVKPNRKLSVQDVKDM MRDHYEGTPLDISNDFGAGPYKTPYRLSPLNFKVGDKEYFNERPISTQQSGFVFVAQMRA NKPDPIGGVLWFGVDDANMAVFTPVYCCTTKAPLCYTRVDGADYITFSWNSAFWIFNWVS NMVYPRYDLMIGDVRATQKEMETTFNNAQEGIEEMAAKLLAKDKNAAIAFLTNYTNMTAQ STFDTWKQLGTFLIVKYNDGVVKRVKDGQFERNSIGQPAGVIRPGYPKEFLEEYVKQTGE RYLVK >gi|226332136|gb|ACIC01000184.1| GENE 9 10632 - 11303 655 223 aa, chain - ## HITS:1 COG:no KEGG:BT_1550 NR:ns ## KEGG: BT_1550 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 223 328 550 550 466 100.0 1e-130 TVRGKKQPIEPLNHREKYFFAWDKTNYLQEPGLSLVVPKGMLYDNVPLQYQIKADSGAVA FTYQLNDKPVPLHASCELRIGLRRKPVADTTKYYVARVTPKGTKYSVGGKYEDGFMKASI RELGTYTVALDTVPPEIIPVNKNLWGRNGKIVYRLKDEGAGIASYRGTIDGEYALFGRPN IVKSYWECVLDPKHVKKGGKHTVEMTVTDACGNQTVSKETFVW Prediction of potential genes in microbial genomes Time: Thu May 12 03:54:39 2011 Seq name: gi|226332135|gb|ACIC01000185.1| Bacteroides sp. 1_1_6 cont1.185, whole genome shotgun sequence Length of sequence - 30280 bp Number of predicted genes - 29, with homology - 29 Number of transcription units - 13, operones - 8 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 20 - 2650 1981 ## BT_1551 hypothetical protein 2 1 Op 2 . + CDS 2684 - 6028 1946 ## BT_1552 hypothetical protein 3 1 Op 3 . + CDS 6068 - 7372 879 ## BT_1553 hypothetical protein + Term 7406 - 7445 5.0 + Prom 7395 - 7454 3.0 4 2 Tu 1 . + CDS 7478 - 8584 1102 ## COG0686 Alanine dehydrogenase + Term 8619 - 8672 6.2 - Term 8609 - 8656 7.1 5 3 Op 1 . - CDS 8669 - 9604 884 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase - Term 9626 - 9662 2.0 6 3 Op 2 . - CDS 9678 - 10406 667 ## BT_1556 hypothetical protein 7 4 Op 1 . - CDS 10573 - 11082 420 ## BT_1557 hypothetical protein 8 4 Op 2 . - CDS 11109 - 12161 830 ## BT_1558 hypothetical protein 9 4 Op 3 . 
- CDS 12145 - 12696 231 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 12836 - 12895 3.9 - Term 12810 - 12871 8.5 10 5 Op 1 . - CDS 12899 - 13747 835 ## PROTEIN SUPPORTED gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 11 5 Op 2 . - CDS 13740 - 14132 307 ## BT_1561 hypothetical protein - Prom 14201 - 14260 4.1 + Prom 14181 - 14240 3.4 12 6 Tu 1 . + CDS 14260 - 14733 413 ## COG1576 Uncharacterized conserved protein + Term 14873 - 14910 0.6 13 7 Op 1 . - CDS 14762 - 15400 500 ## BT_1563 hypothetical protein 14 7 Op 2 . - CDS 15432 - 15887 424 ## COG0780 Enzyme related to GTP cyclohydrolase I - Prom 16002 - 16061 3.0 - Term 16354 - 16403 1.1 15 8 Op 1 . - CDS 16513 - 17172 634 ## COG0603 Predicted PP-loop superfamily ATPase 16 8 Op 2 . - CDS 17229 - 17909 644 ## COG1738 Uncharacterized conserved protein - Prom 17954 - 18013 4.5 - Term 18006 - 18042 -0.4 17 9 Op 1 . - CDS 18053 - 18340 258 ## COG1846 Transcriptional regulators 18 9 Op 2 . - CDS 18325 - 18915 443 ## BT_1569 hypothetical protein 19 9 Op 3 . - CDS 18975 - 21152 1244 ## BT_1570 hypothetical protein 20 9 Op 4 . - CDS 21157 - 21552 392 ## BT_1571 hypothetical protein - Prom 21572 - 21631 8.3 - Term 21689 - 21736 2.3 21 10 Tu 1 . - CDS 21780 - 22283 556 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 22303 - 22362 7.6 + Prom 22262 - 22321 4.4 22 11 Tu 1 . + CDS 22447 - 23736 1286 ## BT_1573 hypothetical protein + Term 23786 - 23824 3.0 - Term 23766 - 23813 13.5 23 12 Tu 1 . - CDS 23842 - 25275 1271 ## COG2067 Long-chain fatty acid transport protein - Prom 25457 - 25516 5.2 + Prom 25345 - 25404 4.8 24 13 Op 1 . + CDS 25485 - 26486 1045 ## COG1052 Lactate dehydrogenase and related dehydrogenases 25 13 Op 2 . + CDS 26547 - 27251 668 ## COG1741 Pirin-related protein 26 13 Op 3 . + CDS 27286 - 27927 382 ## COG0259 Pyridoxamine-phosphate oxidase 27 13 Op 4 . 
+ CDS 27996 - 28727 663 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold 28 13 Op 5 . + CDS 28742 - 29875 1157 ## BT_1579 hypothetical protein 29 13 Op 6 . + CDS 29889 - 30269 190 ## PROTEIN SUPPORTED gi|148984704|ref|ZP_01817972.1| 50S ribosomal protein L20 Predicted protein(s) >gi|226332135|gb|ACIC01000185.1| GENE 1 20 - 2650 1981 876 aa, chain + ## HITS:1 COG:no KEGG:BT_1551 NR:ns ## KEGG: BT_1551 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 876 1 876 876 1777 100.0 0 MRMYVKVMAAVIACILLSGEAYSALATTPATENLSWFKKKKKKNSEEERVKSDYEKLVEG SSVKKGMFAVYQKKNDYYFEVPTSLLGRDLLVVNKLQRVPAELNDAGVNRGVNYENQMIC MEWDKATGKLMFRQQRPLPLAPQTDAIFRSVKDNFISPLIAAFKIEAVNQDSTALVIKIN DIYDGTETSINNVFTNINLGTSAIKNLSRILSVKSFPNNVVATSELTTKVTEGTTSVYVT VEVSSSILLLPETPMMGRFDNQKIGYFTNPLLSFSDAQQRTDKKQFITRWRMEPKPEDRE AYLKGKVVEPAKPIVFYIDNSTPYQWRKYIKRGIEDWQIAFEKAGFKNAIIAIEVTDSME IDMDDVNYSVLTYAASEKKNAMGPSLLDPRSGEILEADIMWWHNVLSMVSEWITVQTGTV CPEARSVQLPDSLLGDAIRFVACHEVGHSLGLRHNMMGSAAFPTDSLRSATFTSRLNSTA SSIMDYARFNYIAQPGDGVKVLSPHIGPYDIFAIEYGYRWYGKNSPEEEKDILFDFLSKH TDRLYKYSEAQDVRDAVDPRAQNEDLGDDPVRSSLLGIDNLKRIVPQILQWTTTGEKGQT YEEASRLYYAIINQWNNYLYHVLANIGGIYIENTVIGDGVQTYTFVEKEKQQASLKFLTD EVLTYPKWLFDTEVGKYTYLLRNTPVGQQENAPTQILKNAQAYILWDLLGNTRLMRMIEN ESVNGKKAFTVVELMDGLHKNIFGITERGGIPNVMERSLQKNFLDALLTAAAEPEAVKIN KKIADDHFLLDHATPFCSCYAAEQRSLRQEDKMGAHRVLNFYGSQLNRISDAISVKRGEL LRIKKLLQSRLGTSDTATRYHYEDMILRINTALGIK >gi|226332135|gb|ACIC01000185.1| GENE 2 2684 - 6028 1946 1114 aa, chain + ## HITS:1 COG:no KEGG:BT_1552 NR:ns ## KEGG: BT_1552 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1114 1 1114 1114 2159 99.0 0 MNKKHLCIIICFLVNALAMASAANRTITGVVISGEDNEPLIGASVYVHTDELKKAGASQT SLGTITDIDGKFSLSVPDKVTRIHCSYIGFEEQVIVLQGDKKSYRIVLQTSAHTLGDVVV TGYQTLERRKLTAAISKVEVSDAMVGAAKSIDQALTGQIAGVSVTNTSGAPGSPAKIRIR GTASMNGTQDPLWVLDGIPLEGTDIPKMDNKSSDNDIVNIGQSSIAGLSPNDIESITILK 
DAAATAIYGARAANGVIVVTTKRGKTGKPVVNFNTKLTYTPNLETSRLNLLNSEEKVNLE LQMMKEAPFNKWGFYPIPVYSEKGGVAAILKQYNLMDIYREKGWEGLTPEAQNAINRLKS INTDWNDILFRDAITQEYNISISGGSEKVTYYNSLGYTLENGNIPGVSLSRFNLTSKTSY QINKLLKVGVSIFANRRKNTTFLTDTYGLTNPVYYSRIANPYFEPFDNNNNYLYDYDVVT GSIPDLLQGYNILEERENTSNKSVTTAINSIFDIELRFNDQWKVTSQVGVQWDQLSREEY AGTNSFNLRNQRENSAYYKGNDRIYLIPEGGMLKSTNSTTSQITWKVQGEYKNTFNDIHN IQIMAGSEIRKNWYENQASTGYGYDPKTLTFKNLEFRDSKQANEWKLKTKSFKENAFASF YANGSYTLMDRYTLGGSVRMDGSDLFGVDKKYRYLPIYSVSGLWRISNEPLINQFKWIDN LALRLSYGLQGNIDKNTSPFIVGTYGNISILPGSNEESITINSAPNSKLRWEKTASYNTG IDFSVFNQAINLSVDYYYRKGTDLIGNKMLPLENGFTSMAVNWASMKNQGIEINLQTRNI ATKDFSWYTSFNFAYNQNKVLKVMTEKDQVTPSLEGYPVGAIFTLKTKGIKSETGQIYIE NKDGEAVTIEELFQMADKGNLDGEYSIGVDKETERGFYSYAGTSDAPYTGGFMNTFNYRN WELNLNFSYNLGAHVKTDPTYYIADPDPGRNMNRDILDRWTPENKNGKFPALTSLNTNPA DYYLFSTRRDLYKSLDIWVKKLSYIRLQNIRLAYHFPSEWLHKLNIGGATVGLEARNLFV FGSSYKNYMDPESMSNLYSTPVPKSVTFNLSLNF >gi|226332135|gb|ACIC01000185.1| GENE 3 6068 - 7372 879 434 aa, chain + ## HITS:1 COG:no KEGG:BT_1553 NR:ns ## KEGG: BT_1553 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 434 9 442 442 846 100.0 0 MVVIGSILLSSCDSYLDIQPVGQVIPTTLTEYRALLTTAYGLKLIDKSVTEFRTDIAMVT DNSNSQNFYSEIEKWNDMSPKDGTREFGWAVYYSVIYYANAIIANKDNIKEGSQEDIDQL VGEAYLLRGYMHFILANLYGQPYTKEGAPETKSIPIKWDLDLEVVLPRNTVKEVYTAILS DIESARGLMHQKEWEAVYAYRFSTLSVDAMESRVRLYMGNWKEAYDAAERILVQKSTLED LNDPEAKLPNNYTSAEMITAYEFFNDDVTSASILIPAFMEKYDANKDLRRNKYYSLSGDK YISKKSGKAEHNCTFRTGEIYLNAAEAAAQMNKLPEARTRLLQLMEKRYTPEGYAEKKKA VDAMNQADLIKEILEERALELAFEGHRWFDLRRTTRPRIEKVLNGQAYLLEQDDSRYTLR IPQSAIAANPGLLN >gi|226332135|gb|ACIC01000185.1| GENE 4 7478 - 8584 1102 368 aa, chain + ## HITS:1 COG:VC1905 KEGG:ns NR:ns ## COG: VC1905 COG0686 # Protein_GI_number: 15641907 # Func_class: E Amino acid transport and metabolism # Function: Alanine dehydrogenase # Organism: Vibrio cholerae # 1 363 1 363 374 382 58.0 1e-106 MIIGVPKEIKNNENRVGMTPSGVAEVVKQGHQVYIQHTAGINSGFPDEAYLSVGAQVLPT 
IEDVYATADMIVKVKEPIAPEYHLIRKGQLLFTYFHFASDKELTLAMIDNKSICLAYETV EKEDHSLPLLIPMSEVAGRMSIQEGARFLEKPQGGKGILLGGVPGVKPAKVLILGGGIVG SNAAQMAAGMGADVTVADINLSRLRYLSETLPKNVKTLYASELRIKKELPDVDLVIGSVL IPGDKAPHLITRNMLAMMQPGTVLVDVAIDQGGCFETSHPTTHSAPTYVIDGIVHYAVAN IPGAVPYTSTLALTNATLPYVIALANKGWKKACKDDPALALGLNVVEGKVVYKAVADVFD LKYENINL >gi|226332135|gb|ACIC01000185.1| GENE 5 8669 - 9604 884 311 aa, chain - ## HITS:1 COG:CAC3576 KEGG:ns NR:ns ## COG: CAC3576 COG2070 # Protein_GI_number: 15896810 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Clostridium acetobutylicum # 7 306 9 302 310 224 43.0 2e-58 MNRITSLFGIQYPIIQGGMVWCSGWRLASAVSNAGGLGLLGAGSMHPETLREHIRKCRAA TDRPFGVNIPLMYPQIEEIMAIVAEEGVKIVFTSAGNPKAWTGWLHERGIIVAHVVSSSR FAMKCEEAGVDAVVAEGFEAGGHNGREETTTFCLIPAVRNATTLPLIAAGGIGTGEGIFA AMVLGAEGVQIGTRFALTDESSASPVFKEYCLGLGEGDTKLLLKKLAPTRLVRNNFRDAV ERAEDGGASAEELRTLLGRGRAKKGIFEGDLEEGELEIGQVSAIVSRQQSVAEVMNELVE AMQRAAEKKYW >gi|226332135|gb|ACIC01000185.1| GENE 6 9678 - 10406 667 242 aa, chain - ## HITS:1 COG:no KEGG:BT_1556 NR:ns ## KEGG: BT_1556 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 242 1 242 242 481 99.0 1e-134 MSELIKRLITAMTLCLTALLVHAQEWTKEDSLRLRKLLDSDQELNLNQDAVRQIDFGSAV GTPRMSVEKKWMLPDESLPEALPKPKVVLSLMPYKANTPCNWDPVFQKKIRMDKNTWRGD PHYEMRHQRSYSNWARNPMAGGVRKSLEEIRASGVRFRQLSERANGMMVNSVVMDSPIPL FGGSGVYINGGTIGGLDLMAVFTKNFWDKKGKERRERTLEVLRTYGDSTTVLINRPIEQI AR >gi|226332135|gb|ACIC01000185.1| GENE 7 10573 - 11082 420 169 aa, chain - ## HITS:1 COG:no KEGG:BT_1557 NR:ns ## KEGG: BT_1557 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 169 1 169 169 345 100.0 2e-94 MMLWMAVSLMVSLVGCSSEEMDYENPDVTVFVKQLKAGTYNMKNEKGVVEVPHFTEEDIP ELLKYAEDLTIIPSFPSVYNTNNGKIRLGECMLWVIESIRQGTAPSLGCRMVLANAENYE AIYFLTDDEVLDAAACYRRWWEGRKYPKTTWTIDPCYDEPLCGSGYRWW >gi|226332135|gb|ACIC01000185.1| GENE 8 11109 - 12161 830 350 aa, chain - ## 
HITS:1 COG:no KEGG:BT_1558 NR:ns ## KEGG: BT_1558 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 350 1 350 350 683 100.0 0 MKKENDEITDLFRTRLADAGMTVRDGFWEELSQDIPVACQHRRRLIFFRVAAAASVLLVL AASSATFWYLSPKEEMEEAFTKIVVANSGRMDGDGVRANQLPTAMEPVLPKPAPKSFGLL SQYPEEGDSVSISFSMSFSFSATTTTGNGNRYGNQGRNGYWQANNGNTEPSAAQEEKYTT GTPAPVENVKKHRWALKAQMGTALPAEDGAYKMPISAGVTVERKLNDYLGIETGLLYSNL RSEGQHLHYLGIPLKANITLMDTKKIDLYATVGGVADKCIAGAPDNSFKEEPIQLAVTAG IGITYKINDRLAVFAEPGVSHHFKTDSKLATVRTKRPTNFNLLCGLRMTY >gi|226332135|gb|ACIC01000185.1| GENE 9 12145 - 12696 231 183 aa, chain - ## HITS:1 COG:SMb20592 KEGG:ns NR:ns ## COG: SMb20592 COG1595 # Protein_GI_number: 16265252 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Sinorhizobium meliloti # 14 174 16 188 227 72 27.0 3e-13 MENEIELIKGCRAGQDSARKELYTLYAKQMLAVCYRYTGDMDAAHDVLHDGFIKIFTNFS FRGESSLCTWITRVMVTQSLDYLRREKRVNQLVVHEEQLPDIPDISSSGGGAGISEEQLM VFVAELPDGCRTVFNLYVFEEKSHKEIAGMLHIKEHSSTSQLHRAKYLLAKRIKEYRNHE ERK >gi|226332135|gb|ACIC01000185.1| GENE 10 12899 - 13747 835 282 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 [Kordia algicida OT-1] # 3 281 9 284 286 326 59 1e-88 MNKEKELIDKLIDLAFAEDIGDGDHTTLSCIPATAMGKSKLLIKEAGVLAGIEVAKEIFN RFDPSMKVEVFINDGTEVKPGDVAMIVEGKVQSLLQTERLMLNVMQRMSGIATMTRKYAR QLEGTHTRVLDTRKTTPGMRILEKMAVKIGGGVNHRIGLFDMILLKDNHVDFAGGIDKAI NRAKDYCKEKGKELKIEIEVRNFDELQQVLDLGGVDRIMFDNFTPEMTKKAVDMVAGKYE TESSGGITFDTLRDYAECGVDFISVGALTHSVKGLDMSFKAC >gi|226332135|gb|ACIC01000185.1| GENE 11 13740 - 14132 307 130 aa, chain - ## HITS:1 COG:no KEGG:BT_1561 NR:ns ## KEGG: BT_1561 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 130 1 130 130 214 99.0 6e-55 MKKRVLLGMASLLLSVSLLMAQEIPAGVITAFKRGSSQELSKYMGDKVNLVLQGSSANVD KKKATVMMQEFFTENKVNAFDVNHQGKRDESSFVIGTLTTTKGKFRVNCFLKKVQTQYLI HQIRIDKINE 
>gi|226332135|gb|ACIC01000185.1| GENE 12 14260 - 14733 413 157 aa, chain + ## HITS:1 COG:SA0023 KEGG:ns NR:ns ## COG: SA0023 COG1576 # Protein_GI_number: 15925729 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Staphylococcus aureus N315 # 1 155 1 158 159 105 37.0 4e-23 MKTTLLVVGRTVEQHYITAINDYIQRTKRFITFDMEVIPELKNTKSLSMEQQKEKEGELI LKALQPGDVIVLLDEHGKEMRSLEFADYMKRKMNTVNKRLVFIIGGPYGFSEKVYQVANE KISMSKMTFSHQMIRLIFVEQIYRAMTILNGGPYHHE >gi|226332135|gb|ACIC01000185.1| GENE 13 14762 - 15400 500 212 aa, chain - ## HITS:1 COG:no KEGG:BT_1563 NR:ns ## KEGG: BT_1563 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 212 1 212 212 395 99.0 1e-109 MTHNIPTELQRYVQERIIPQYAGFDKAHQIDHAEKVIEESLKLALHYEVDSAMVYTVAAY HDLGLCEGREFHHIVSGKILLADETLRQWFTDEQMLQMKEAIEDHRASNRDAPRSIYGKI VAEADRIIDPEVTLRRTVQYGLSHYPEMDKEQQYERFRKHLADKYAEGGYLKLWIPQSDN AGRLAELRQLMENEEELRTVFDKLYLVENSAG >gi|226332135|gb|ACIC01000185.1| GENE 14 15432 - 15887 424 151 aa, chain - ## HITS:1 COG:NMB0317 KEGG:ns NR:ns ## COG: NMB0317 COG0780 # Protein_GI_number: 15676234 # Func_class: R General function prediction only # Function: Enzyme related to GTP cyclohydrolase I # Organism: Neisseria meningitidis MC58 # 8 151 11 155 157 226 74.0 1e-59 MAELKDQLSLLGRKTEYKQDYAPEVLEAFDNKHPENDYWVRFNCPEFTSLCPITGQPDFA EIRISYIPDIKMVESKSLKLYLFSFRSHGAFHEDCVNIIMKDLIKLMNPKYIEVTGIFTP RGGISIYPYANYGRPGTKFEQMAEHRLMNRE >gi|226332135|gb|ACIC01000185.1| GENE 15 16513 - 17172 634 219 aa, chain - ## HITS:1 COG:CAC3627 KEGG:ns NR:ns ## COG: CAC3627 COG0603 # Protein_GI_number: 15896861 # Func_class: R General function prediction only # Function: Predicted PP-loop superfamily ATPase # Organism: Clostridium acetobutylicum # 1 213 5 217 222 311 65.0 5e-85 MNRETALVVFSGGQDSTTCLFWAKRNFKKVYALSFLYGQKHQKEVELAREIARKAEVEFD VMDVSFIGTLGHNSLTDTTMVMDQEKPAGSVPNTFVPGRNLFFLSIAAVYARERGINHLV TGVSQTDFSGYPDCRDAFIKSLNVTLNLAMDEQFVIHTPLMWIDKAETWALADELGVLDL 
IRNETLTCYNGIQGDGCGHCPACTLRREGLEKYLKKKNQ >gi|226332135|gb|ACIC01000185.1| GENE 16 17229 - 17909 644 226 aa, chain - ## HITS:1 COG:Cgl0234 KEGG:ns NR:ns ## COG: Cgl0234 COG1738 # Protein_GI_number: 19551484 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Corynebacterium glutamicum # 9 213 43 249 250 127 34.0 1e-29 MKEKVSVPFMLLGILFNVCLIAANLLETKVIQVGSLTVTAGLLVFPISYIINDCIAEVWG FKKARLIIWSGFAMNFFVVGLGLIAVAIPAAPFWEGEEHFDFVFGMAPRIVAASLMAFLV GSFLNAYVMSKMKIASQGRNFSARAILSTIVGETADSLIFFPIAFGGIIAWKELLIMMGL QIVLKSMYEVIILPVTIRVVKVIKKVDGSDVYDTNISYNVLKVKDI >gi|226332135|gb|ACIC01000185.1| GENE 17 18053 - 18340 258 95 aa, chain - ## HITS:1 COG:CC2206 KEGG:ns NR:ns ## COG: CC2206 COG1846 # Protein_GI_number: 16126445 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Caulobacter vibrioides # 5 90 10 95 103 77 41.0 5e-15 MFKELNPILHSQLRLAIMSILLTVEEAEFVYLKEKTQSTAGNLSVQLDKLSEAGYIEVEK SFVGKRPRTACRITAGGRKAMEEYVETLKEYLSGL >gi|226332135|gb|ACIC01000185.1| GENE 18 18325 - 18915 443 196 aa, chain - ## HITS:1 COG:no KEGG:BT_1569 NR:ns ## KEGG: BT_1569 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 196 1 196 196 372 99.0 1e-102 MEDKRISEKESLELIARMIRETQDNTARYAAYPLLIWGYTTVAISLVVWYFYLQTGVWQI NFLWFALPVIAGPLTIFFNRKDKNKGAKNYIDRVTGQIWAVFGVVGFCLSCMAFVVRIDI LFVISLLMGMGATLTGLVCKYKPLSIAGMTGIALSFSMLFIHGSGVYLVFAAIFIVMMIV PGHIMNKQMKKQCLKN >gi|226332135|gb|ACIC01000185.1| GENE 19 18975 - 21152 1244 725 aa, chain - ## HITS:1 COG:no KEGG:BT_1570 NR:ns ## KEGG: BT_1570 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 725 1 725 725 1456 99.0 0 MRYLIFLILAGFTGVTAWGQTSTTARPMAVNDTLTEDSTYALQEVVVRTSPVKRKADRFI LSVPPAQNKDGVELLQQAPGVWLSDERISINGSSGTKVFVDNREIKLTGEMLIGYLRSLK SDDIARVEVLPMAGADKDADAQGGAIHIIMRRHTDKGFQGNLSMNASFASSLQSYQPSGS LNYHSGKWDTYGFASGTLVPQNKGDLYVTRDYVAGDKGFSSLTRMKQPSRYGTIRMGTVY 
TMDSSNSLGVELEYVRRGYIWPSQSYSTLSVGPLDMESQGVYRQKETYNMYTATANYIHK LDKDGSVLKLVTDYISKDLHGRNQYQIFQEIGALNKDTVYRSRSNATYQIATADLSWKQQ LHKKSFFQIGMKYTYTGMKDDACYEGLEPDESWKPNVAYGYELDYHENIGALYGTYSLDM KRGSVHIGLRGEYTQTSNETERRTRKYWDWFPHIDGNFYFDEIHKWMLIGQYGRYIERPT FSALNPNRIQTSEYSYLIGNPMLRPTYINKFSITLVYNFRYTLTVGGNLHHDLIRQFGKE DAENSDISYVINENHNRENHWFVAITAPWQPLNWLNLNASFIGVRQDIRMYREDDYFGHF LYFANANATVLLPSDYSLEAQYNGVSRLYSGNSGIDPRHTFNLHLRKKWKDGRCVATLGV DNIFNRYNSYFSNVPSYSSQNRFELASSGRLMKLTFTWNFNHGKKSGAVTVEKKSASERS RIEGK >gi|226332135|gb|ACIC01000185.1| GENE 20 21157 - 21552 392 131 aa, chain - ## HITS:1 COG:no KEGG:BT_1571 NR:ns ## KEGG: BT_1571 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 131 1 124 124 243 100.0 1e-63 MIGGKEKMKRLLFIALGIICYHVVTFAQEVKVEVRGIRSAKGAIMVMAQQDSESKPVYAM ATAVKDTVTVVLKDVPWEKFLISLFHDENGNWELDMNEQGIPVEGYAREKCKKEADASAT VKMKMYYPVND >gi|226332135|gb|ACIC01000185.1| GENE 21 21780 - 22283 556 167 aa, chain - ## HITS:1 COG:mll3697 KEGG:ns NR:ns ## COG: mll3697 COG1595 # Protein_GI_number: 13473184 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 6 164 8 161 183 92 35.0 3e-19 MEKVDFTQGILAMESDLHRFAYKLTSDRDSANDLVQDCVLQALDNHEKFTHAKNLKGWMF TIMRNIFVNNYRRTVREMNLIDDTYSINQQSLIEDEEGDRFEFAYDMKQLYRVIHSIPED MKVPFQMFVAGFKYREIAEKLGLPMGTVKSRLFFIRKRLKEELKDFS >gi|226332135|gb|ACIC01000185.1| GENE 22 22447 - 23736 1286 429 aa, chain + ## HITS:1 COG:no KEGG:BT_1573 NR:ns ## KEGG: BT_1573 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 429 1 429 429 865 100.0 0 MKTSFKMVAMLLGIGIFPVCAHAQKKVVIEDEEPNSIMFVSRDKAGDEIIRIMNDRSQMR FHDPNAPRFLLTDQKGKFALGIGGYVRATAEYDFNGIVDDVDFYPALIGQPGKGNFAKNQ FQMDITTSTLFLKLVGRTKHLGDFVVYTAGNFRGDGKTFELQNAYAQFLGFTIGYSYGSF MDLSALPATIDFAGPNGSAFYRTTQLSYMCDKLKNWRFGVAMEMPSVDGTTNSDVSINTQ RMPDFAASAQYNWNSNSHIKLGAIVRSMTYSSNVHDKAFSTTGFGLQASTTFNVTKKWQV 
FGQFNYGKGIGSYLNDLSNLNVDIVPDPDNEGKMQVLPMLGWYAGLQYNICPNIFVSGTY SLSRLYSENNYPSDNPEAYRKGQYFVANAFWNVTSNMQVGVEYLRGWRTDFNSSTRHANR LNMLVQYSF >gi|226332135|gb|ACIC01000185.1| GENE 23 23842 - 25275 1271 477 aa, chain - ## HITS:1 COG:FN1003 KEGG:ns NR:ns ## COG: FN1003 COG2067 # Protein_GI_number: 19704338 # Func_class: I Lipid transport and metabolism # Function: Long-chain fatty acid transport protein # Organism: Fusobacterium nucleatum # 219 477 10 273 273 64 25.0 4e-10 MRKNFLIGFVMLIVSIPTFAGDYLTNTNQNAAFLRMIARGASIDIDGVYSNPAGLAFLPQ NGLQVALTIQSAYQTRDIAATSPLWTMDGQTSVRNYEGKASAPVIPSVHAVYKNGDWAFS GSFAIVGGGGKASFNTGLPMFDAAAIGLVNSTSDMLKPNMYNINSAMEGRQYIYGLQLGA SYKINEHFSVFAGARMNYFTGGYKGFLNIALKEGVAGQIGAAIVQQIMGANPNLSLEQAQ QIAQEQSAPMLQKLNDTKLELDCDQTGWGLTPIIGVDAKFGKLNLAAKYEFKANMNIENN THKLEFPDAAAAYMAPYQHGVNTPSDLPSMLSVAASYEFLPSLRASVEYHFFDDKNAGMA DNKQKTLKHGTHEYLAGVEWDINKIFTVSGGYQKTDYGLSDAFQTDTSFSCDSYSVGLGG RINLSKALSLDVAYFWTTYSDYTKENPRGLGEAMASMDKDVYSRTNKVFGVSVNYKF >gi|226332135|gb|ACIC01000185.1| GENE 24 25485 - 26486 1045 333 aa, chain + ## HITS:1 COG:FN0511 KEGG:ns NR:ns ## COG: FN0511 COG1052 # Protein_GI_number: 19703846 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Fusobacterium nucleatum # 5 331 6 331 335 352 57.0 8e-97 MAYTIAFFGTKPYDEASFNDKNKEFGFEFRYYKGHLNKNNVLLTQGVDAVCIFVNDTADA EVIHAMAANGVKLLALRCAGFNNVDLNAAATAGITVVRVPAYSPYAVAEYTVALMLSLNR KIPRASWRTKDGNFSLHGLMGFDMHGKTAGIIGTGKIAKILIHILKGFGMNILAYDLYPD YNFAREEQIVYTSLDELYHSSDIISLHCPLTEATKYLINDYSISKMKDGVMIINTGRGQL IHTNALIEGLKNKKIGSAGLDVYEEESEYFYEDQSDRIIDDDVLARLLSFNNVIVTSHQA FFTREAMGNIAMTTLQNIKDFINHKSLLNEVKR >gi|226332135|gb|ACIC01000185.1| GENE 25 26547 - 27251 668 234 aa, chain + ## HITS:1 COG:sll1773 KEGG:ns NR:ns ## COG: sll1773 COG1741 # Protein_GI_number: 16330260 # Func_class: R General function prediction only # Function: Pirin-related protein # Organism: Synechocystis # 5 216 4 214 232 
174 40.0 2e-43 MKKVIHKADTRGHSQYDWLDSYHTFSFDEYFDSDRINFGALRVLNDDKVAPGEGFQTHPH KNMEIISIPLKGHLQHGDSKKNSRIITVGEIQTMSAGTGIFHSEVNASPVEPVEFLQIWI MPRERNTHPVYKDFSIKELERPNELAVIVSPDGSTPASLLQDTWFSIGKVEAGKKLGYHL HQSHGGVYIFLIEGEIVVDGEVLKRRDGMGVYDTKSFELETLKDSHILLIEVPM >gi|226332135|gb|ACIC01000185.1| GENE 26 27286 - 27927 382 213 aa, chain + ## HITS:1 COG:all1248 KEGG:ns NR:ns ## COG: all1248 COG0259 # Protein_GI_number: 17228743 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxamine-phosphate oxidase # Organism: Nostoc sp. PCC 7120 # 2 213 3 214 214 205 49.0 5e-53 MKNIADIRQEYTKSGLRESELPCDPLSLFSRWLQEAIDANVEEPTAVIVGTVSPEGRPST RTVLLKGLHDGKFIFYTNYESRKGRQLAQNPYISLSFVWHELERQVHIEGTAAKVSPEES DEYFRKRPYKSRIGARISPQSQPIASRMQLIRAFVKEAARWLGKEVERPDNWGGYAVTPT RMEFWQGRPNRLHDRFLYTLKTDGKWEINRLSP >gi|226332135|gb|ACIC01000185.1| GENE 27 27996 - 28727 663 243 aa, chain + ## HITS:1 COG:FN1387 KEGG:ns NR:ns ## COG: FN1387 COG2220 # Protein_GI_number: 19704722 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Fusobacterium nucleatum # 5 232 4 228 237 155 37.0 8e-38 MTLDYIYHSGFAIEAEGVTVIIDYYKDSSETEHNRGIVHDYLLQRPGKLYVLATHFHPDH FNREILTWKEAHPDIRYIFSKDILKSHRAKPEDATYIKKGETYEDETIRIEAFGSTDVGS SFLIHLQDWNIFHAGDLNNWHWSEESTEAEIRKANGDFLAEVKYLKEKAPKIDLALFPVD RRMGKDYMKGAKQFIEQIKTTIFVPMHFSEDYEGGNALRDFAENAGCRFVSITHRGESFE ITK >gi|226332135|gb|ACIC01000185.1| GENE 28 28742 - 29875 1157 377 aa, chain + ## HITS:1 COG:no KEGG:BT_1579 NR:ns ## KEGG: BT_1579 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 377 1 377 377 714 99.0 0 MNKLTILFLTMLLTCLPMAMRAESHKEKRDDTRYLAGAVPVVDGKVVFSKEFQIPGMSQK QIYDTVMKWMNERLKENNNPDSRVVFSDEAQGTIAGVGEEWITFYSSALSLDRTWVNYQI TVTCKPGSCLVELEKIRFTYRETEKYKAEEWITDEYALNKAKTKLVRGLAKWRRKTVDFA DDIFMDVAVAFGAPDTRPKSEKKKKEEEAQKPSIVAAAGPLVIGQGGKVTTAEADQTTTP AATLTPATPVGKASADMPGYTEIDLKQIPGDVYALMGNGKLVISIGKDEFNMTNMTANAG 
GALGYQNGKAVAYCTLSADQPYEAIEKAETYTLKLYAPNQTTPSAVIECKKLPSQTTPQA GQPRTYVGEIVKLLMKK >gi|226332135|gb|ACIC01000185.1| GENE 29 29889 - 30269 190 126 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148984704|ref|ZP_01817972.1| 50S ribosomal protein L20 [Streptococcus pneumoniae SP3-BS71] # 5 125 3 126 126 77 36 8e-14 MEIKSRFDHFNINVTNLERSIAFYEKALGLKEHSRKEASDGSFILVYLTDNETGFLLELT WLKDHTSPYELGENESHLCFRVAGDYDAIRQYHKEMNCVCFENTAMGLYFINDPDDYWIE ILPQKK Prediction of potential genes in microbial genomes Time: Thu May 12 03:55:54 2011 Seq name: gi|226332134|gb|ACIC01000186.1| Bacteroides sp. 1_1_6 cont1.186, whole genome shotgun sequence Length of sequence - 30341 bp Number of predicted genes - 28, with homology - 25 Number of transcription units - 16, operones - 7 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 30 - 482 160 ## BT_1581 hypothetical protein 2 1 Op 2 . - CDS 483 - 668 198 ## - Prom 742 - 801 3.1 3 2 Tu 1 . + CDS 768 - 1499 618 ## COG0500 SAM-dependent methyltransferases + Term 1578 - 1630 10.5 - Term 1565 - 1618 14.5 4 3 Tu 1 . - CDS 1659 - 2306 676 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 2378 - 2437 76.8 + TRNA 2363 - 2436 72.9 # Thr TGT 0 0 - Term 2442 - 2468 -0.7 5 4 Tu 1 . - CDS 2474 - 3361 493 ## BVU_3167 hypothetical protein + Prom 3389 - 3448 10.8 6 5 Op 1 . + CDS 3527 - 4135 460 ## gi|253572627|ref|ZP_04850028.1| predicted protein 7 5 Op 2 . + CDS 4186 - 4653 284 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 8 5 Op 3 . + CDS 4643 - 5104 345 ## BVU_0773 hypothetical protein 9 5 Op 4 . + CDS 5151 - 5630 302 ## gi|253572630|ref|ZP_04850031.1| predicted protein + Term 5673 - 5718 4.3 10 6 Tu 1 . + CDS 5730 - 5897 79 ## + Prom 5956 - 6015 9.7 11 7 Op 1 . + CDS 6046 - 6780 604 ## BT_1585 hypothetical protein + Prom 6783 - 6842 7.3 12 7 Op 2 . + CDS 6862 - 7434 380 ## BT_1586 hypothetical protein - Term 7411 - 7463 6.3 13 8 Op 1 . 
- CDS 7486 - 8238 361 ## BT_1587 hypothetical protein - Prom 8265 - 8324 2.1 14 8 Op 2 . - CDS 8327 - 8797 145 ## BT_1588 hypothetical protein 15 8 Op 3 . - CDS 8794 - 9051 100 ## - Prom 9080 - 9139 4.0 16 9 Tu 1 . - CDS 9192 - 9368 146 ## BT_1589 hypothetical protein - Prom 9502 - 9561 6.5 + Prom 9759 - 9818 12.0 17 10 Op 1 . + CDS 9936 - 10973 274 ## BT_1590 hypothetical protein 18 10 Op 2 . + CDS 10994 - 13726 1226 ## BT_1591 hypothetical protein - TRNA 14215 - 14286 51.7 # Cys GCA 0 0 - Term 14169 - 14207 6.2 19 11 Op 1 . - CDS 14338 - 14553 74 ## BF1354 hypothetical protein 20 11 Op 2 . - CDS 14550 - 17036 2513 ## COG0370 Fe2+ transport system protein B - Prom 17191 - 17250 4.1 21 12 Tu 1 . + CDS 17420 - 18700 802 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control - Term 18631 - 18671 0.9 22 13 Tu 1 . - CDS 18685 - 19644 596 ## BT_1594 putative ferredoxin - Prom 19728 - 19787 4.9 + Prom 19472 - 19531 6.2 23 14 Tu 1 . + CDS 19758 - 21929 2341 ## COG1158 Transcription termination factor + Term 21971 - 22016 3.2 + Prom 21987 - 22046 4.8 24 15 Tu 1 . + CDS 22198 - 23664 1086 ## COG3119 Arylsulfatase A and related enzymes + Term 23670 - 23723 11.6 + Prom 23694 - 23753 4.9 25 16 Op 1 10/0.000 + CDS 23988 - 25604 1255 ## COG0642 Signal transduction histidine kinase 26 16 Op 2 . + CDS 25601 - 27568 1549 ## COG0642 Signal transduction histidine kinase 27 16 Op 3 . + CDS 27608 - 28930 1248 ## COG0534 Na+-driven multidrug efflux pump 28 16 Op 4 . 
+ CDS 28989 - 30311 1784 ## COG0541 Signal recognition particle GTPase Predicted protein(s) >gi|226332134|gb|ACIC01000186.1| GENE 1 30 - 482 160 150 aa, chain - ## HITS:1 COG:no KEGG:BT_1581 NR:ns ## KEGG: BT_1581 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 150 1 150 150 266 100.0 2e-70 MGTIVCILALLLAGHGLGLIIKMRMFCLLIKRVETKYIKKIDGIVVDVDTYVKDGKLLMR PSVEFTLEGEGEKCCRSINPNIIVEDITQFSYYPKDFHIGDKVTVWYNLDNDAMFVEQKM QLTRLFVRLFIPGCFFIISSILGMCYVMNS >gi|226332134|gb|ACIC01000186.1| GENE 2 483 - 668 198 61 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSIEDAIMGGIVFKGKKDKPQEEEKVKTKAKKATYIRGQHGSGAAKMKADIRKKRANRHK K >gi|226332134|gb|ACIC01000186.1| GENE 3 768 - 1499 618 243 aa, chain + ## HITS:1 COG:Ta0580 KEGG:ns NR:ns ## COG: Ta0580 COG0500 # Protein_GI_number: 16081683 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Thermoplasma acidophilum # 10 226 5 210 227 101 31.0 1e-21 MATITLSAEKDPMGAAISDYFNHHRADRLRVFSSQFEEDEIPVKELFRSIQSMPVLERTA LQMATGRILDVGAGSGCHALALQEMGKEVCAIDISPLSNEVMKQRGVKDSRLINLFDETF TETFDTVLMLMNGSGIIGTLNNMPAFFQRMKRILRPGGCILMDSSDLRYLFEEEDGSMLI DLAGDYYGEVDFQMQYKDVKGDTFDWLYIDFQTLSLYASECGFKAELIKEGKHYDYLAKL SLA >gi|226332134|gb|ACIC01000186.1| GENE 4 1659 - 2306 676 215 aa, chain - ## HITS:1 COG:SP1000 KEGG:ns NR:ns ## COG: SP1000 COG0526 # Protein_GI_number: 15900873 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Streptococcus pneumoniae TIGR4 # 51 194 23 167 185 75 34.0 7e-14 MYFFYICSLTNDCKKMKKKIMNVCGVALLAVSLLACSGQKKGAEAAESVSDTVKVATEAM SAQADSTGYIVRIGEIAPDFTITLTDGKQVTLSSLRGKVVMLQFTASWCGVCRKEMPFIE KDIWLKHKDNADFALIGIDRDEPLEKVLAFAKSTGVTYPLGLDPGADIFAKYALRDAGIT RNVLIDREGKIVKLTRLYNEEEFASLVQQINEMLK >gi|226332134|gb|ACIC01000186.1| GENE 5 2474 - 3361 493 295 aa, chain - 
## HITS:1 COG:no KEGG:BVU_3167 NR:ns ## KEGG: BVU_3167 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 291 1 292 295 321 66.0 2e-86 MNRIKGIFYAAVSSSTFGLAPFFSITLLLVGFSAFEVLSYRWGIASVVLVIFGLFSGCDF RLKRKDLAVVFTLSLLRAITSFSLIIAYQNIASGVASTIHFMYPLAVALVMMFVFREKKS VWVIVAVLMSLSGASVLSLGGVDVKDGNTPMGLVAACVSVFSYAGYIIGVRKTRAVKINS TVLTCYVMSLGALFYIIGACCTSGLRIVTDGYVWLIILGLALPATAISNITLVQAIKNAG PTLTSILGAMEPLTAVVIGVFVFNELFTLNTLIGILLILIAVSMVVFREHRMKRI >gi|226332134|gb|ACIC01000186.1| GENE 6 3527 - 4135 460 202 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253572627|ref|ZP_04850028.1| ## NR: gi|253572627|ref|ZP_04850028.1| predicted protein [Bacteroides sp. 1_1_6] # 1 202 1 202 202 386 100.0 1e-106 MKQKLLSGLIALFLCLSNVSAGDNLNNIFKKYQKYKGEEIICEQLDLPKELRKAKAGKTS KDINIDELQQSIDMGFRKAQILVCFSLENVFASEPITEVFTSKSAKEFMSDFAHLRKRML NDTKLTHFTVNKKTCNVDCFLYIDEKPTTEEGYLFLKMINGSINCIIHFYGELKTEDIIK SIEDGTFIGISGFEPTFTLVED >gi|226332134|gb|ACIC01000186.1| GENE 7 4186 - 4653 284 155 aa, chain + ## HITS:1 COG:BH0672 KEGG:ns NR:ns ## COG: BH0672 COG1595 # Protein_GI_number: 15613235 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus halodurans # 1 143 11 157 285 64 29.0 8e-11 MSKLLYRRAFRLTANVQDAEDLVQETFLRLWRRHDTLENVNNPEGLACEILKNAAIDLLR TRHIHSELTLEHEPEDNYSTVHETEVCDALILANKLIKQLKPQQQRILTLRSREECEMAE IANLTGETEDNVRAILSRSRKKLKEMFLKKMNHGL >gi|226332134|gb|ACIC01000186.1| GENE 8 4643 - 5104 345 153 aa, chain + ## HITS:1 COG:no KEGG:BVU_0773 NR:ns ## KEGG: BVU_0773 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 149 1 150 176 83 32.0 3e-15 MDYKEVEQLLDKFYNAHTTCPEEQKLYDWLCSEECSEELFIDREIIRTYIQQHPPVDIPK DIELKMEVLIDQWADSEHQSIRKKIYPKWKYIAGTAAIAALVIGATFYFHSPQKGVYVDT CQTPEEAYVEVQKALSLISETMQKGIEPIIDHQ >gi|226332134|gb|ACIC01000186.1| GENE 9 5151 - 5630 302 159 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253572630|ref|ZP_04850031.1| ## NR: 
gi|253572630|ref|ZP_04850031.1| predicted protein [Bacteroides sp. 1_1_6] # 1 159 1 159 159 297 100.0 2e-79 MKWLICILICWCSIGNTTAQHYSFNRYADIDGITRVYISKTLLKAIIKSENPAMKISGNK VNLNNKELLSKMDGILVLTGTNTRIANMMHEDATKLSHTKEYEHILYRNEKELAIDVFTR ESKGVILEIILIDKTDQNNRIIQLMGQFTPDDILSMIKI >gi|226332134|gb|ACIC01000186.1| GENE 10 5730 - 5897 79 55 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYNVFDYYPIMPPYFTLLPLKYKNKSNIISFYILYEFADKKTSFYYSFLNKRVYF >gi|226332134|gb|ACIC01000186.1| GENE 11 6046 - 6780 604 244 aa, chain + ## HITS:1 COG:no KEGG:BT_1585 NR:ns ## KEGG: BT_1585 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 244 1 244 244 458 99.0 1e-128 MALFVNPGKDWEKNMSEEDIAQMESQGYDVTELRAKRAKSTEEEEKERLREKEERENFKN PTNLNKLAPYLQTPRDMSTSFFKATAGSAPWLFKDRWKRKYTEAPIVYAAVVQANTALWM PGNNDYYPAVFVFALDQKHIHDTEWLKQIAEEINVLQDADQIPGDCRKLIQTLRDDTSEF CFRIGKSVCGDANAWCATYKFDKQTALPRKALPSDGIVPFLLKSAPVENQFVDFKLIPTE FYIG >gi|226332134|gb|ACIC01000186.1| GENE 12 6862 - 7434 380 190 aa, chain + ## HITS:1 COG:no KEGG:BT_1586 NR:ns ## KEGG: BT_1586 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 190 1 190 190 360 98.0 1e-98 MHAYIIQQLIRIILFITIGLPIGLKSFAQETKRFYMELDTPRNGAKAGQELELKYISTAD FDSVSPPDFGTLIETVEGATPHKAGHTVKNGILTDIYEQGFSYRIRFKKPGNTKLPLASI KANGKEYETPLTSVWVHPVDTNIDSVKCSIQLEDSYRKGVFTAIVICLLIAWLLIRLSFQ KQKKIKRQDK >gi|226332134|gb|ACIC01000186.1| GENE 13 7486 - 8238 361 250 aa, chain - ## HITS:1 COG:no KEGG:BT_1587 NR:ns ## KEGG: BT_1587 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 250 1 250 250 515 98.0 1e-145 MNTDFVNLTKENLSDEHLCCIIRSKKPHAGIDAKRQWLSDRLNEGHVFRKLNAKATVFIE YAPLEAAWVPIIGDNYYYLYCLWVLGSPKGNGYGRALMEYCLTDAKEKGKSGVCMLSSKK QKNWLSDQSFAKKFGFEVVDATDNGYELLALSFDGTMPKFAQNAKKMKIESEDLTIYYDM QCPYIYQKIEMVKQYCDTNNVPVSLIQVDTLQKAKDLPCVFNNWGVFYKGNFETVNLLLD VEHLKRILKK >gi|226332134|gb|ACIC01000186.1| GENE 14 8327 - 8797 145 156 aa, chain - ## HITS:1 
COG:no KEGG:BT_1588 NR:ns ## KEGG: BT_1588 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 156 1 156 156 278 99.0 3e-74 MKIKDKITQSLASITVELQEISPDFYVIGASAMILSGIEVGETADIDILTTEMNSCKLQH LLKTYMEISPETKEDDLFRSNFARFKLPLMDIEVMGDLQIKKNDIWQSVCVKEYKEIFIG NLIIKIPTIEEQKRILSLFGREKDLKRILILNHYLL >gi|226332134|gb|ACIC01000186.1| GENE 15 8794 - 9051 100 85 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQPRIGSDQSLNIFYQTLTLEYRSTEPCSCYNKAIDEICRRLELHRKRGILSIVALITFI NGIIKACSFTFLKYILYFCYILLEI >gi|226332134|gb|ACIC01000186.1| GENE 16 9192 - 9368 146 58 aa, chain - ## HITS:1 COG:no KEGG:BT_1589 NR:ns ## KEGG: BT_1589 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 58 22 79 79 102 98.0 6e-21 MIESISLMNVDIIPVYPVKDSDILNYRKGLIAFYEMEDYSLYTDYFLDRQIERIKEIE >gi|226332134|gb|ACIC01000186.1| GENE 17 9936 - 10973 274 345 aa, chain + ## HITS:1 COG:no KEGG:BT_1590 NR:ns ## KEGG: BT_1590 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 345 1 345 345 667 98.0 0 MKTINYFSGSLLALLFYALFAACNNDLQTIAVTEIKGIEQTLLNESNALISSDVATDSTM LGKDNTISNSTSATNAFFTKQEYEEGMDWWKDREGALPALYEYQVSCRDFPSGHGPDTCI YYMTIIWTCYPPITEANWMDYQYMQVQHRLISGTTAFPWRYLGEGGPGNYNYGIVKVNGF QDPMDLNATHLPTRVQLRYRLLHKDFPGKADKSINDKEKWYNKNLATGWYYKDYNQPTYN NPYGYNSQDFNNMESLKFIINDPKCGSSFSVRVLVDGYLVFPQLVGGRYEVTVPKFRKTG EYLITAEYAGNVSEEFKTAVRHGIYQEYTSKEIYIMFSCNEFSYD >gi|226332134|gb|ACIC01000186.1| GENE 18 10994 - 13726 1226 910 aa, chain + ## HITS:1 COG:no KEGG:BT_1591 NR:ns ## KEGG: BT_1591 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 910 1 910 910 1815 99.0 0 MKRILLILFCSLHSILFLRSQNASSDIIFKQFFDNAVAFADTYPREKVYLHFDNSSYYVG DTIWFKAYTVYAENNMPSTISKPLYVEMLDQTGHITQRQIIELNSGEGHGQIILNESTMS GYYEIRAYTRWMLAFSEPSYFSRTFPIYQSAQGERPERRISTYNLNPSMKQRPKNVTDKL AIQFFPEGGTLVKGISSRVAFKAESREDGNVILEGAIYTKDGEKLTEIKTLHDGMGIFTY 
TPEEKPAVAKVTYKEKEYKFNLPDALSAGYVLNVNNSSGAIVGNVLSNENTPDADIVAFI SHEGRPYSYWILRMRSGGNQTFLLKTRDLPGGIYQVSLLDKTGNMLCERFTFVQPNKLNS IQLNGIKDIYRPFEPIRCEIQVTDQKGNPLQGSLSISVRDAIRSDYAEYDNNIFTDMLLT SGLKGYIDKPGYYFADITLRKLQELDVLLMVHGWRQYDLSQLISGKNEKLLQQSAEKELL LQGQIRSSLLKKEMKDMEVSVMAKVDNTFVAGNTFTDENGKFQLPVTSFEGEVEAVFQIR CRGSKHKKDASVMLDRNFAPTPRAFSYEEEHPQWMDKNSWITLSNRIDSLYVDSISKTNN TYLLSEVEIAKKRKNKNITTQVFEKSVDAYYDVTRLVDELRDQGIVINTIPDLLSKVNPN FSYDVQDGSSRYKEKTICLIVGKQVLDTLTAWTLWNEIDGIKQIMICEGSNSYTNEVLNS IHGSNMTKGQNVKDMMTAMNPPRTFDDSFYLYGDAAKKKISLDDTEKAKSEPLFRKKSSG INNNVNVNIDFSRFGQYALFYITPHFDSNYSRLTQKSMKAAHGTRRTIIQGYSRPLAFYS PVYKDKIPTANVNHRRRTLYWNPTIQTDKNGKISIECQNGMYANPVIIHAEMLKGGIPCS ITIIGEKKKK >gi|226332134|gb|ACIC01000186.1| GENE 19 14338 - 14553 74 71 aa, chain - ## HITS:1 COG:no KEGG:BF1354 NR:ns ## KEGG: BF1354 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 3 57 2 56 70 91 76.0 1e-17 MNNWQEWVVGVLIVLCIARVIYGIYLFFRRTRESENPCDSCVSGCDLKDMMEKKRQECGV KKKSTKKNCCG >gi|226332134|gb|ACIC01000186.1| GENE 20 14550 - 17036 2513 828 aa, chain - ## HITS:1 COG:MA3477 KEGG:ns NR:ns ## COG: MA3477 COG0370 # Protein_GI_number: 20092288 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Methanosarcina acetivorans str.C2A # 112 824 11 665 670 556 41.0 1e-158 MRLSELGTGEKGVIVKVLGHGGFRKRIVEMGFIKGKTVEVLLNAPLKDPIKYKVLGYEIS LRRQEAEMIEVVSEEEAKKQAEKAVYHEGLPEDVSVKEEDMKRLALGKRRTINVALVGNP NSGKTSLFNLASGAHEHVGNYSGVTVDAKEGYFDFEGYHFRIVDLPGTYSLSAYTPEEIY VRRHIVDETPDVIINVVDSSNLERNLYLTTQLIDMNVRMVVALNIYDELESSGNTLDYHL LSKLFGVPMVPTVSKKNRGLDTLFHVVINLYEGVDFFDKQGNMNPEVLKDLTEWHDSLED RKNHEEEHLEDYVREHKRKGRVFRHIHINHGPDLEKAIDAVKEEVSKNEFIRHKYSTRFL SIKLLENDPDIESFVRTLPNAGEIFRIRDKMAKRVQETMNEDCESAITDAKYGFISGALK ETFTDNHLEQAQTTKVLDAIVTHRIWGYPIFFLFMYLMFEGTFVLGEYPMMGIEWLVGQI GDLIRNNMSEGPLKDMLVDGIVGGVGGVIVFLPNILILYFCISLMEDSGYMARAAFIMDK IMHKMGLHGKSFIPLIMGFGCNVPAIMASRTIENRKSRLITMLINPLMSCSARLPIYLLL 
VGAFFPNNASLVLLSIYAIGIFLAVLLARLFSKFLVKGDDTPFVMELPPYRMPTAKSIFR HTWEKGAQYLKKMGGIIMIASIIIWFLGYYPNHEAYETVAEQQENSYIGQLGRAIEPVIK PMGFDWKLGIGLISGVGAKELVVSTLGVLYVDDPEADEASLAERIPITPLVAFCYMLFVL IYFPCIAALAAIKQESGSWKWALFAACYTTGLAWLVAFSVYQIGGMFV >gi|226332134|gb|ACIC01000186.1| GENE 21 17420 - 18700 802 426 aa, chain + ## HITS:1 COG:CAC3204 KEGG:ns NR:ns ## COG: CAC3204 COG0037 # Protein_GI_number: 15896451 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Clostridium acetobutylicum # 6 425 5 457 461 208 32.0 2e-53 MIQQRVIRYIEKEHLFSPDDKLLVALSGGADSVALLRVLHTAGYQCEAAHCNFHLRGEES NRDEQFVRQLCQKYGIRLHTIDFNTTQYATEKRISIEMAARELRYNWFEKIKEECGAHVI AVAHHQDDSVETMLLNLIRGTGITGLLGIRPRNGAIVRPLLCINREEIIRYLQQIGQDFV TDSTNLEDEYTRNKIRLNLLPLMQEINPSVKNSLIETSIHLNDVATIYNKVIDEAKTRII TPEGIRIDALLDEPAPDAFLFETLHPLGFNSAQIKDIANSLHGQSGKQFVSKEWRVIKDR NLLLLETIQPEDGPALPYQLIKEEREFTPDFRIPREKETACFDADKLNEEIHCRKWQAGD TFIPFGMTGKKKISDYLTDRKFSISQKERQWVLCCGERIAWLIGERTDNRFRIDETTKRV IIYKIV >gi|226332134|gb|ACIC01000186.1| GENE 22 18685 - 19644 596 319 aa, chain - ## HITS:1 COG:no KEGG:BT_1594 NR:ns ## KEGG: BT_1594 # Name: not_defined # Def: putative ferredoxin # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 319 1 319 319 656 98.0 0 MYVLNRRMFQRGKDNKYFWFHEQSNTFFSVKTYLCCIKENSMTVNEARLIYFSPTHTSKQ VAEAIVHGTGIKNVVSMNLTLQTVEETVIPTSALAVIVVPVYGGHVAPLAMERLESIRGL DTPAVLVVVYGNRAYEKALMELDAFAIPHGLKVIAGATFIGEHSYSTDKCPIADGRPNES DLDYAEDFGKKIMEKIQAADGPDTLYQVDVRAIKRPSQPFFPLFRFLRKVIKLRKSGTPL PRTPWIEDESLCTHCGICAARCPAGAITKGDELNTNAEKCIKCCACVKACANKARKYDTP FASLLSECFKKQKLPQTIL >gi|226332134|gb|ACIC01000186.1| GENE 23 19758 - 21929 2341 723 aa, chain + ## HITS:1 COG:AGc5136 KEGG:ns NR:ns ## COG: AGc5136 COG1158 # Protein_GI_number: 15890078 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 357 723 54 421 421 465 64.0 1e-130 
MYNIIQLNDKNLSELQVIAKELGIKKADSFKKEELVYKILDEQAIAGATKKVAAEKLKEE RKGDKNKRSRTAAPKKEEKVAPAAKNAEVTKNKENAPAAKPQQQPKEEAANKAKEAPVAE PKAEKAAPKRKVGRPRKDANIAEKAENKEVENAKPIVKPTEEKAVAEKTVVAPAAEKATP TQETEKKVKENKPAVAEKPVIAKPQKKSAPVIDEESTILSSEDDDDFIPIEDLPSEKIEL PTELFGKFEATKAETAQAAPEQAPQPQQQQHSQPQQRQRIVRPRDNNNNNAGNNNANANN NNNFQRNNNNNQRPPMQQRPAQQQNNAAENLPPVQQQPERKVIEREKPYEFDDILSGVGV LEIMQDGYGFLRSSDYNYLSSPDDIYVSQSQIKLFGLKTGDVVEGIIRPPKEGEKYFPLV KVSKINGRDAAFVRDRVPFEHLTPLFPDEKFRLCKGGYSDSMSARVVDLFAPIGKGQRAL IVAQPKTGKTILMKDIANAIAANHPEVYMIMLLIDERPEEVTDMARSVNAEVIASTFDEP AERHVKIAGIVLEKAKRLVECGHDVVIFLDSITRLARAYNTVSPASGKVLSGGVDANALH KPKRFFGAARNIENGGSLTIIATALIDTGSKMDEVIFEEFKGTGNMELQLDRNLSNKRIF PAVNITASSTRRDDLLLDKTTLDRMWILRKYLADMNPIEAMDFVKDRLEKTRDNDEFLMS MNS >gi|226332134|gb|ACIC01000186.1| GENE 24 22198 - 23664 1086 488 aa, chain + ## HITS:1 COG:yidJ KEGG:ns NR:ns ## COG: yidJ COG3119 # Protein_GI_number: 16131548 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Escherichia coli K12 # 21 471 2 458 497 122 24.0 2e-27 MKTIYPFMGLALCGAAAQAQEKPNFLIIQCDHLTQRVVGAYGQTQGCTLPIDEVASRGVI FSNAYVGCPLSQPSRAALWSGMMPHQTNVRSNSSEPVNTRLPENVPTLGSLFSESGYEAV HFGKTHDMGSLRGFKHKEPVAKPFTDPEFPVNNDSFLDVGTCEDAVAYLSNPPKEPFICI ADFQNPHNICGFIGENAGVHTDRPISGPLPELPDNFDVEDWSNIPTPVQYICCSHRRMTQ AAHWNEENYRHYIAAFQHYTKMVSKQVDSVLKALYSTPAGRNTIVVIMADHGDGMASHRM VTKHISFYDEMTNVPFIFAGPGIKQQKKPVDHLLTQPTLDLLPTLCDLAGIAVPAEKAGI SLAPTLRGEKQKKSHPYVVSEWHSEYEYVTTPGRMVRGPRYKYTHYLEGNGEELYDMKKD PGERKNLAKDPKYSKILAEHRALLDDYITRSKDDYRSLKVDADPRCRNHTPGYPSHEGPG AREILKRK >gi|226332134|gb|ACIC01000186.1| GENE 25 23988 - 25604 1255 538 aa, chain + ## HITS:1 COG:MA4377_3 KEGG:ns NR:ns ## COG: MA4377_3 COG0642 # Protein_GI_number: 20093164 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 284 527 12 255 311 158 38.0 3e-38 MAVQALITEGNISLAVNNANSMYQQAKAIDYPFGTALALRALADTYQSSGNPQSAIESYE 
ESLKIMQKIPASIPYLKTSMFHLILSKLKYRQMTDIEKDFAYLESLYHKESGLPDDFYLP CSYAYYYIQTNNLPKALEYLKQLDSIYEKYPYPYYSSISNYMYAGYHIESKEYDKALKEY EELLTITKKTALFRHVQLLQERAKVLVLMNQKQEACKIYEEINHLKDSLDAQSYLSQINE LHTLYQIDKSELNYINIQKNLYYWSLSVILVIVVLIIIAIFRIKRTNNRLLQSQQEQEKA KKQAEKSIHTKSLFLSNMSHEIRTPLNALSGFSAILTEESIDNETKQQCNDIIQQNSELL LKLINDVIDLSSLEIGKMQFKFNECDIVALCRNVIDMVEKIKQTQAEVRFSTSLSSLKLT TDSARLQQVLINLLINATKFTAQGTITLELVKQTEDTALFSVSDTGCGISKENQNKIFNR FEKLDENAQGTGLGLSICQLIIEQLGGKIWIDPDYDKGARFLFTHPIRHAQIKKEEAR >gi|226332134|gb|ACIC01000186.1| GENE 26 25601 - 27568 1549 655 aa, chain + ## HITS:1 COG:all0638_1 KEGG:ns NR:ns ## COG: all0638_1 COG0642 # Protein_GI_number: 17228134 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 421 653 343 596 615 94 27.0 1e-18 MKRLILLIITCYGLLLGYANTDTASDSLLQMLQTLPHDTTRLSTLNAIIKIEQNNYKCIQ YSDTLMLEALKLKNDKYASLAAYYHLLYYYNRCEQDSVAKWIVKMEPLVQKSGLWDYFFD ARRFQIDLYTFTEQYELAISEANKMKQKALDIDNNRGMVAAYQCLSNAYIGSQRWDEGLK ALEEAYRLLPKNGNAVVRISVLSQLISVTKEMKDNNRQLKYLQELENVLCKFIIDNPSLK DGFADVFIFNEIFYAHYYLNTDQPQLAYSHIEKSKKYLTENTYFMYKVLYYDIYAKYYQS IKQYQQASAYIDTTLTMLKKDFTSDYAEQLLKQAKIWVEAGDNNRATTLYQQALAIKDSA AMALSNTQMEQIKKSYNLDKIELEQQKQTNQIRLISLIVISAILIVLFISLFRLSKIRKA LKYSESEIRKAAETVRVTNEIKNRFLSNMSYNIRTPLNNVVGFSQLIASEPNIDEKTREE YSAIIHQSSERLMRLVNDVLDLSRLEAKMMKFQIQDYDAVSLCNEVCYMARMNNEKTGIQ VRFTPEVESLSLRTDTTRLGYALLSTLTYPHEHETEGQQEERIIRFTLSRKGEMLYFRIL NSPLADEAFTSQETGIRHEINQLLLAYFGGSYQVNARGTEGPEIVFTYPIASESE >gi|226332134|gb|ACIC01000186.1| GENE 27 27608 - 28930 1248 440 aa, chain + ## HITS:1 COG:PAB0243 KEGG:ns NR:ns ## COG: PAB0243 COG0534 # Protein_GI_number: 14520582 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Pyrococcus abyssi # 5 393 6 397 463 117 24.0 5e-26 MYTNKQIWSVSYPILLSLLAQNVINVTDTAFLGHVSEVALGASAMGGLFYICIFTIAFGF STGSQIVIARRNGEGRYSDVGPVMIQGIMFLLLIAILMFGFTKAFGGNIMRLLVSSESIY EGTMEFLDWRIYGFFFSFVNVMFRALYIGITRTKVLTINAVVMALTNVILDYALIFGKFG 
LPEMGIKGAAIASVLAEASSILFFLIYTYISVDLKKYGLNRLRSFDPALLMRILSISCFT MLQYFLSMATWFVFFVAVERLGQRELAIANIVRSIYVVLLIPVNALATTTNSLVSNAIGA GGIQYVMPLINKIARFSFFIMLGLVGLSVLFPQFLLSVYTSEAALITESVPSVYVICCAM LIASVANVVFNGISGTGNTQAALLLETITIAIYGSYIVFVGMWLKAPIEICFTIEILYYT LLLITSYIYLKKAKWQNKKI >gi|226332134|gb|ACIC01000186.1| GENE 28 28989 - 30311 1784 440 aa, chain + ## HITS:1 COG:BS_ffh KEGG:ns NR:ns ## COG: BS_ffh COG0541 # Protein_GI_number: 16078661 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal recognition particle GTPase # Organism: Bacillus subtilis # 2 433 3 437 446 448 56.0 1e-125 MFDNLSERLERSFKILKGEGKITEINVAETLKDVRKALLDADVNYKVAKTFTDTVKEKAI GQNVLTAVKPSQLMVKIVHDELTQLMGGETVEINLESRPAVILMSGLQGSGKTTFSGKLA RMLKTKKNRKPLLVACDVYRPAAIEQLRVLAEQIEVPMYSELDSKNPVEIAQHAIQEAKA KGYDLVIVDTAGRLAVDEQMMNEITAIKEAINPDEILFVVDSMTGQDAVNTAKEFNERLD FNGVVLTKLDGDTRGGAALSIRSVVNKPIKFVGTGEKLDAIDQFHPARMADRILGMGDIV SLVERAQEQYDEEEAKRLQKKIAKNQFDFNDFLSQIAQIKKMGNLKELASMIPGVGKAIK DIDIDDNAFKSIEAIIYSMTPAERSNPEILNGSRRTRIAKGSGTTIQEVNRLLKQFDQTR KMMKMVTSSKMGKMMPKMKK Prediction of potential genes in microbial genomes Time: Thu May 12 03:57:20 2011 Seq name: gi|226332133|gb|ACIC01000187.1| Bacteroides sp. 1_1_6 cont1.187, whole genome shotgun sequence Length of sequence - 55799 bp Number of predicted genes - 36, with homology - 36 Number of transcription units - 19, operones - 12 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 310 - 369 5.5 1 1 Op 1 . + CDS 425 - 1303 670 ## BT_1604 hypothetical protein 2 1 Op 2 . + CDS 1287 - 2072 517 ## COG0755 ABC-type transport system involved in cytochrome c biogenesis, permease component 3 1 Op 3 . + CDS 2093 - 3475 1511 ## COG1858 Cytochrome c peroxidase + Term 3486 - 3531 1.6 + Prom 3640 - 3699 4.8 4 2 Tu 1 . + CDS 3727 - 4608 988 ## COG0190 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase + Term 4615 - 4657 8.1 + Prom 4650 - 4709 3.4 5 3 Tu 1 . 
+ CDS 4763 - 5899 612 ## COG2843 Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) + Term 6058 - 6107 5.4 + TRNA 5964 - 6042 71.9 # His GTG 0 0 + Prom 5967 - 6026 80.4 6 4 Tu 1 . + CDS 6117 - 6353 242 ## COG1983 Putative stress-responsive transcriptional regulator + Term 6367 - 6415 4.3 7 5 Tu 1 . - CDS 6424 - 8283 1406 ## COG2812 DNA polymerase III, gamma/tau subunits - Prom 8351 - 8410 3.9 + Prom 8286 - 8345 5.1 8 6 Op 1 . + CDS 8442 - 8744 480 ## BF3072 putative septum formation initiator-related protein 9 6 Op 2 . + CDS 8741 - 9082 424 ## BT_1612 hypothetical protein - Term 9130 - 9163 4.9 10 7 Op 1 . - CDS 9178 - 9780 585 ## BT_1613 hypothetical protein 11 7 Op 2 . - CDS 9807 - 10040 285 ## BT_1614 hypothetical protein - Prom 10148 - 10207 7.4 + Prom 10088 - 10147 2.5 12 8 Tu 1 . + CDS 10173 - 11633 1538 ## COG2195 Di- and tripeptidases + Term 11659 - 11712 8.6 + Prom 11659 - 11718 8.3 13 9 Op 1 . + CDS 11854 - 12810 560 ## BT_1616 hypothetical protein + Prom 12817 - 12876 1.9 14 9 Op 2 6/0.000 + CDS 12902 - 13465 477 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Term 13487 - 13521 1.0 15 9 Op 3 . + CDS 13544 - 14551 775 ## COG3712 Fe2+-dicitrate sensor, membrane component + Prom 14580 - 14639 3.4 16 10 Op 1 . + CDS 14665 - 18117 3135 ## BT_1619 hypothetical protein 17 10 Op 2 . + CDS 18131 - 19987 1741 ## BT_1620 hypothetical protein + Term 20196 - 20246 4.3 18 11 Op 1 . + CDS 20581 - 21858 854 ## COG3525 N-acetyl-beta-hexosaminidase 19 11 Op 2 . + CDS 21855 - 23357 1131 ## COG3119 Arylsulfatase A and related enzymes - Term 23409 - 23442 4.0 20 12 Tu 1 . - CDS 23524 - 24699 1102 ## COG0668 Small-conductance mechanosensitive channel - Prom 24724 - 24783 7.7 - Term 24755 - 24809 10.0 21 13 Op 1 . - CDS 24834 - 26381 1660 ## COG3119 Arylsulfatase A and related enzymes - Prom 26401 - 26460 1.8 22 13 Op 2 . 
- CDS 26468 - 28288 1802 ## COG3669 Alpha-L-fucosidase - Prom 28311 - 28370 4.8 23 14 Op 1 1/0.000 - CDS 28453 - 31521 3104 ## COG3250 Beta-galactosidase/beta-glucuronidase 24 14 Op 2 . - CDS 31550 - 33880 1810 ## COG3525 N-acetyl-beta-hexosaminidase 25 14 Op 3 . - CDS 33896 - 35479 1637 ## COG3119 Arylsulfatase A and related enzymes - Prom 35524 - 35583 3.6 - Term 35583 - 35634 1.8 26 15 Op 1 . - CDS 35681 - 37462 1407 ## BT_1629 hypothetical protein 27 15 Op 2 . - CDS 37508 - 39325 1649 ## BT_1630 hypothetical protein 28 15 Op 3 . - CDS 39339 - 42698 3175 ## BT_1631 hypothetical protein 29 15 Op 4 . - CDS 42729 - 44393 1653 ## BT_1632 chitinase - Prom 44524 - 44583 5.0 - Term 44584 - 44625 4.1 30 16 Op 1 . - CDS 44653 - 45396 508 ## COG1208 Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) 31 16 Op 2 . - CDS 45405 - 46835 1131 ## COG1660 Predicted P-loop-containing kinase - Prom 46857 - 46916 5.4 + Prom 46885 - 46944 6.3 32 17 Op 1 . + CDS 46967 - 50932 3056 ## COG0642 Signal transduction histidine kinase 33 17 Op 2 . + CDS 50971 - 52500 1245 ## COG3119 Arylsulfatase A and related enzymes 34 18 Tu 1 . - CDS 52778 - 52927 188 ## gi|253572683|ref|ZP_04850084.1| predicted protein - Prom 52956 - 53015 3.4 + Prom 52918 - 52977 6.9 35 19 Op 1 . + CDS 53084 - 53710 341 ## BT_1637 hypothetical protein 36 19 Op 2 . 
+ CDS 53776 - 55623 1106 ## BT_1638 hypothetical protein + Term 55724 - 55764 1.3 Predicted protein(s) >gi|226332133|gb|ACIC01000187.1| GENE 1 425 - 1303 670 292 aa, chain + ## HITS:1 COG:no KEGG:BT_1604 NR:ns ## KEGG: BT_1604 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 292 1 292 292 552 100.0 1e-156 MYYVYVAGYILLAAAMQLFTGNFPVSFLAFPLNIIFAAIWLFLLWILYKEYNNLRITRFL GSSKASVLSISLFIGGCLIIGLFPQLSEPEAEMRKGISASLGCYNFMTSWIFISILLLLL SNLAMITIHACRHRKQARWRFILNHTGLWLALFAGFLGSSDTQTLRIPLYKGEPKHEAFD MNGASYYLDYDMELNSFAVEYYPNGRPSRFSANVRLGNENVLLEVNHPYSYRLGEDVYLT GYDVTKGNESNYCILQVVKQPWKYVMVAGILMMLAGAVLLFINGAKAYDKLG >gi|226332133|gb|ACIC01000187.1| GENE 2 1287 - 2072 517 261 aa, chain + ## HITS:1 COG:RSc2985 KEGG:ns NR:ns ## COG: RSc2985 COG0755 # Protein_GI_number: 17547704 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome c biogenesis, permease component # Organism: Ralstonia solanacearum # 116 253 244 391 395 85 35.0 8e-17 MINWDNFYVFAAVSICLWLAGAVFALRSSSKSKVAIGFTSGGIIILGVFIAGLWIFLQRP PLRTMGETRLWYSFFMGIAGLLTYIRWKYRWILSFSTLLSTVFVIINLMKPEIHDQSLMP ALQSVWFIPHVTVYMFSYSVLGCAFIIALTGLFRHKEEYLHTADNLVYAGVAFLSIGMLL GSLWAKEAWGNYWSWDPKETWAAITWMGYLLYIHLRLFRRTGRKTLYVLLIVSFLALQMC WYGVNYLPAAQQSVHLYNRNN >gi|226332133|gb|ACIC01000187.1| GENE 3 2093 - 3475 1511 460 aa, chain + ## HITS:1 COG:PM0939 KEGG:ns NR:ns ## COG: PM0939 COG1858 # Protein_GI_number: 15602804 # Func_class: P Inorganic ion transport and metabolism # Function: Cytochrome c peroxidase # Organism: Pasteurella multocida # 38 458 48 467 468 436 49.0 1e-122 MKRSTKLIVAFLVVVAALAVTYRLMHRVPSADLEANVQMQQIITDAGCLRCHTSTPELPF YANMPVAGKIVMEDVSKAYRAFDMTQMEKDMKEGRPLNPVDLAKVEKVILDGKMPQAKYY LVHWGASFNDAKKEVALSWVKNHRMGLYTDVAVAPDFINEPIRPIADSISVDVRKVVLGN LLYHDTRLSADNTVSCASCHGLDTGGVDNKQYSEGVGGQLGGVNAPTVYNAAYNFVQFWD GRAGTLAEQAAGPPLNPVEMACESFDQIISKLAEDKNFVVAFNEVYPDGLSEKNITNAIQ 
EFEKTLLTPNSRFDRYLKGQKDAITADEIAGYDLFKKYDCATCHVGEILGGQSYELIGVQ HDYFADRQAEMTEEDNGRFKQTKAERDRHRFKVPGLRNIELTAPYFHDGSMATMDDAVRA MAKYQLGIDLPQPEVDKIVAFLRTLTGEYKGKLLTNKNMI >gi|226332133|gb|ACIC01000187.1| GENE 4 3727 - 4608 988 293 aa, chain + ## HITS:1 COG:lin1397 KEGG:ns NR:ns ## COG: lin1397 COG0190 # Protein_GI_number: 16800465 # Func_class: H Coenzyme transport and metabolism # Function: 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase # Organism: Listeria innocua # 3 289 4 279 284 271 51.0 1e-72 MTLIDGKAISEQVKQEIAAEVAEIVARGGKRPHLAAILVGHDGGSETYVAAKVKACEVCG FKSSLIRYESDVTEEELLAKVRELNNDDDVDGFIVQLPLPKHISEQKVIETIDYRKDVDG FHPINVGRMSIGLPCYVSATPNGILELLKRYEIETSGKKCVVLGRSNIVGKPMASLMMQK AYPGDATVTVCHSRSKDLVKECQEADIIIAALGQPNFVKEEMVKEGAVVIDVGTTRVPDA SKKSGFKLTGDVKFDEVAPKCSFITPVPGGVGPMTIVSLMKNTLLAGKKAIYK >gi|226332133|gb|ACIC01000187.1| GENE 5 4763 - 5899 612 378 aa, chain + ## HITS:1 COG:SPy0818 KEGG:ns NR:ns ## COG: SPy0818 COG2843 # Protein_GI_number: 15674859 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) # Organism: Streptococcus pyogenes M1 GAS # 4 314 50 365 430 146 31.0 8e-35 MKELLVSLLLVTQFSCISKAQENDTTTQILTTDSLQTDTLSLLFAGDLMQHQGQINAART ATGGYDYSSYFEYVKDEIQSADFAIANLEVTLGGKPYKGYPAFSAPDEYLTAIHNAGFNV LITANNHSLDRGRKGLERTIQLIDSLKIPHAGTYLNAEERENKYPLLLEKKGFRIAILNY TYGTNGIPVTPPNIVNYIDTTIISKDIEASKTLNPDAIIACMHWGIEYQSLPDKEQKFLT YWLLQKGVNHIIGCHPHVVQPIEVREDSLTNEKHLVVYSLGNYISNMSARRTDGGLMVKM ELVKDSTTRLNHCEYSLVWTARPVQSKKKNHQLLPINFPTDSISVNARNSLRIFANDARA LFNKHNRGIKEYLFYKKK >gi|226332133|gb|ACIC01000187.1| GENE 6 6117 - 6353 242 78 aa, chain + ## HITS:1 COG:TM1156 KEGG:ns NR:ns ## COG: TM1156 COG1983 # Protein_GI_number: 15643913 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Putative stress-responsive transcriptional regulator # Organism: Thermotoga maritima # 5 61 4 59 129 70 59.0 5e-13 
MENNKKLTRSRKERMIAGVCGGLAEYLGWDPTLVRIVYALATIFTAFAGIIIYLILWIIM PEERYSDGYSGRMNQRLH >gi|226332133|gb|ACIC01000187.1| GENE 7 6424 - 8283 1406 619 aa, chain - ## HITS:1 COG:BS_dnaX KEGG:ns NR:ns ## COG: BS_dnaX COG2812 # Protein_GI_number: 16077087 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Bacillus subtilis # 3 365 2 363 563 286 41.0 6e-77 MENYIVSARKYRPSTFESVVGQRALTTTLKNAIATQKLAHAYLFCGPRGVGKTTCARIFA KTINCMTPTTDGEACNECESCVAFNEQRSYNIHELDAASNNSVDDIRQLVEQVRIPPQIG KYKVYIIDEVHMLSASAFNAFLKTLEEPPRHAIFILATTEKHKILPTILSRCQIYDFNRI SVEDTVNHLSYVAAKENITAEPEALNVIAMKADGGMRDALSIFDQVVSFTGGNITYKSVI ENLNVLDYEYYFRLTDCFLENRVSDALLLFNDVLNKGFDGSHFITGLSSHCRDLLVSKDA ATLPLLEVGASIRQRYQEQAQKCPLQFLYRAMKLCNDCDLNYRASKNKRLLVELTLIQVA QLTTEGDDVSGGRGPKQTIKPVFTQPAAAQQPQVASATSAQQTTVHTAPSPATAPSAANT AVRQPQVSVASGAAAPVNTASAPSQSAGISSISKEERKIPVMKMSSLGVSIKNPQRDQAA QNTAKNNVAQVQIQPEEDFIFNDRDLNYYWQEYAGQLPKEQVAIAKRMQVIRPVLLNNST TFEVVVDNEIAAKDFTALIPELQTYLRGRLKNSKVAMTVRVSEPTETVRAVGRVEKFQMM AQKNQALMQLKDEFGLELY >gi|226332133|gb|ACIC01000187.1| GENE 8 8442 - 8744 480 100 aa, chain + ## HITS:1 COG:no KEGG:BF3072 NR:ns ## KEGG: BF3072 # Name: not_defined # Def: putative septum formation initiator-related protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 100 1 100 100 142 76.0 5e-33 MDKLITIWSFICRRKYLITVVAFAVIIGFLDENSLFRRLGYEREISQLKEEIEKYRADYE ENTKRLNELNSNPDAIEQVAREKYLMKKPNEDIYVFEDNK >gi|226332133|gb|ACIC01000187.1| GENE 9 8741 - 9082 424 113 aa, chain + ## HITS:1 COG:no KEGG:BT_1612 NR:ns ## KEGG: BT_1612 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 113 1 113 113 198 100.0 6e-50 MKQLVPALFAVGAIMALTGAAVYITGWNYAPYIYTIGAGFIALAQVNTPVKGKSKILKRL RIQQIFGALALILTGAFMFTTRGNEWIACLTIAAVLELYTAFRIPQEEAKEEK >gi|226332133|gb|ACIC01000187.1| GENE 10 9178 - 9780 585 200 aa, chain - ## HITS:1 COG:no KEGG:BT_1613 NR:ns ## KEGG: BT_1613 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 8 200 1 193 193 320 100.0 2e-86 MNRINYLMNGLAALAFIVLFSQCAGKADNQVATTPASANAELSGMKIAYVEIDTLLAKYN FCIDLNEAMVKKSENVRMTLNQKMTALNKEKQEFQKKYESNAFLSPERAQQEYNRLAKME QDLQELSNKLQNGLMEDNNANSLLFRDSINAFLKEYNKTRGYSLIFSNTGFDNLLYADSA YNITKEIVDGLNARYSSAKK >gi|226332133|gb|ACIC01000187.1| GENE 11 9807 - 10040 285 77 aa, chain - ## HITS:1 COG:no KEGG:BT_1614 NR:ns ## KEGG: BT_1614 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 77 1 77 77 140 100.0 1e-32 MVETILITLLIVAISLLLLGVKVFFTKGGKFPNGHVSGNKELRKKGIGCAQSQDREAQKK SRFSIDELEKALNDSMN >gi|226332133|gb|ACIC01000187.1| GENE 12 10173 - 11633 1538 486 aa, chain + ## HITS:1 COG:VC2279 KEGG:ns NR:ns ## COG: VC2279 COG2195 # Protein_GI_number: 15642277 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Vibrio cholerae # 2 485 51 533 534 465 47.0 1e-131 MEKKDLKPAGVFKYFEEICQVPRPSKKEEKIIAYLKAFGAKHNLETNVDEAGNVLIKKPA TPGKENFQTVILQSHIDMVCEKNNDVQHDFLTDPIETEIDGEWLKAKGTTLGADNGIGVA TELAILADDSIEHGPLECLFTVDEETGLTGAFELKEGFMSGDILLNLDSEDEGEIFIGCA GGIDSVAEFSYKEVEVPAGYFFFKVEVKGLKGGHSGGDIHLGRGNANKILNRFLSRMANR QDLYLCEINGGNLRNAIPREAYAICAVPEDAKHDVRTELNIFTSEVENELAVTEPDLKLV LESETPRKMAIDQDTTTRLLKALYAAPHGVYAMSQDIPGLVETSTNLASVKMKPNHIIRI ETSQRSSILSARNDMANTVRAVFQLAGADVTFGEGYPGWKPNPHSAILEVAAESYKRLFG VEAKVKAIHAGLECGLFLDKYPTLDMISFGPTLTGVHSPDERMHIPSVEKFWKHLLDILA HVPAKK >gi|226332133|gb|ACIC01000187.1| GENE 13 11854 - 12810 560 318 aa, chain + ## HITS:1 COG:no KEGG:BT_1616 NR:ns ## KEGG: BT_1616 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 318 56 373 373 655 100.0 0 MSWNVENLFDTHHDTLKNDNEFLPDAIRHWNYTKYKKKLADMARVITAVGEWNPPALVGL CEVENDSVLRDLTQRSPLKELGYRYVMTSSPDLRGIDVALLYQRDLFKLLSFRSIPIPSF KHHRPTRDLLHVSGLLLTGDTLDVIVCHLPSRSGGAKESEPYRLHAARILRTEADSLLNI RLHPQLVIMGDFNDYPTNKSIKEILEATAPKHSVTFPNPRKLYHLLARKATSRHFGSYKY HGEWGLLDHLIVSGNLLDASSKFFTGEDKANVARLPFLLTEDKKYGDDEPFRTYKGMKYQ 
GGISDHLPIYADFELIIY >gi|226332133|gb|ACIC01000187.1| GENE 14 12902 - 13465 477 187 aa, chain + ## HITS:1 COG:all2193 KEGG:ns NR:ns ## COG: all2193 COG1595 # Protein_GI_number: 17229685 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Nostoc sp. PCC 7120 # 6 180 18 193 201 58 23.0 7e-09 METFDEKQLLKAISEGDEKAFKTFFLYYYPRIKGFINGLLQSQEEAEDLSQDIFLTLWNN RSSLHTINNLKPYLFRISKNAVYRHIERALLFRNYQQKETEKYSPPQESNETDDTIHLKE LELLVTMVVEKMPPQRQKIYKMSRESGMNNEEIARELGINKRTVENHLSQALTDIRKILF ITFILFF >gi|226332133|gb|ACIC01000187.1| GENE 15 13544 - 14551 775 335 aa, chain + ## HITS:1 COG:SMc04204 KEGG:ns NR:ns ## COG: SMc04204 COG3712 # Protein_GI_number: 15965785 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Sinorhizobium meliloti # 18 291 62 320 354 87 28.0 2e-17 MSNAHKIIKNFTSDKFSPELKEKLWKWLVDSSEQAEKEEALMELWEDQNFKADAGTERSY QNFRRKIAPRQRKATARYTLRRWAQVAAVLLIPLLSIVASYLYIQSNEEHTELVEYYVPR GEQKQITLPDGTTAYLNSGTLLVYPQKFTGDIRSVYLIGEANFDVKKDKQHPFIVKTNHL KVKVLGTKFNVHAYAEDEKTTTTLESGSVVVQKANNEDIITLTPNEQLEYDNPSGEFNKK IIDASVYSGWSRGELNFAAMTLSDIFITIERIYDIHIIVPPHLATTDVYTIKFKQKAPIK EIMNIVTKTIGNIDYKVEDENILLIYSPLNKKGGR >gi|226332133|gb|ACIC01000187.1| GENE 16 14665 - 18117 3135 1150 aa, chain + ## HITS:1 COG:no KEGG:BT_1619 NR:ns ## KEGG: BT_1619 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1150 1 1150 1150 2288 99.0 0 MKFYKLSGIGYPVKIHKFYILLFALCFYNIAYGQNQPITVSMKNQPLLKVFEIIEDQTEF SIAYNQTKLDVKQKVSANFVREAVSSVLNSVLKGTGFTYRQEGKHIIIIPVPAKAEAAPN NTTSTQQSIKIRGTVTDAQGEPLIGANVLVDGSKQATITDMNGEFSLEVPANSKLRVTYI GYVTQEVTVKNKTLFNIQLQEDTQTMDEVVVIGFGTQKKVNMTGAVASVNIKESLGDRPI TNVSAALQGVVPGLKIESTTGTPGDDMTYNIRGTTSINGGEPLVLVNNVPMDINMIDPQD IESVSILKDAASAAIYGARAAFGVILITTKQGKKDMAPRFNYNNNFSFSKASELPQKASP LESVLAYKEMGWANDTYVDGKNITQWEGYIRDYQANPSNYPNGYIFDDQGNLFLMRENDM 
FADMMDNFGFMQNHSFSVSGGSQRTSYRLSLGYTGEDGILVTDKDKFDRINMSSFLSVDV NKWLTTQLDIRYANSTQNKVEQGGRNGVWGSAMYLPSYHNILPYEQDGIEYPAETSATFV RYGEPRVIKKTNLRTLGRVIISPLKGLKITGEYTYNRITEYNRMYVNKYKYIGFNFTGLL NNVENSRYALTQGFTNYNAINAFANYDFSIGKHDIAIMGGYNQEESHKESQWSQRTDVLL ENLPSLSGSTGTASVTDSFDEYAIRGLFYRVNYTYDGKYMFEANGRYDGTSRFPKDSRFG FFPSFSAGWRISEEAFMKNTKSFLSNLKLRASWGSIGNQIILKPDNTPENYPYIPSMSPY LTEWLVDGQKTTTLNAPAMVSSSFSWEKVYTLDFGVDFGFFDNRFNGTFDWYRRDTKGML APGMDLPWVVGATAAKQNAADLKTYGWELELNWRDRINKDWSYRIGFNLYDSQSEITKYN NETNLLGDKIYRKGMKMGEIWGYVTDRFYTEDDFNADGTLKPGIPIPKGAGKVFPGDVLY KNFDDDTETIWSGEGTADNPGDQRIIGNSTPRFHYGITAGISWKGLDLSIFLRGVGKRDY WRTDQIAWPTGGWGSLFKETLDFWTPTNTNAYYPRVYSNDGVNTSYNHWKQSKYLANASY LKLQNITLSYTLPKVWSQRLYFDDVKVFFSGENLYTWDHLPEGLETDMLSKGAWEYPFMR KFSFGINVTF >gi|226332133|gb|ACIC01000187.1| GENE 17 18131 - 19987 1741 618 aa, chain + ## HITS:1 COG:no KEGG:BT_1620 NR:ns ## KEGG: BT_1620 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 618 1 618 618 1277 99.0 0 MRKYLIFICTVLLASCLNNDFLERYPLGDPTAETAFETYDNFKAYAWGLYETLPSLGYGS ENSTDDISYNQTRGTSESNWIRKLVTIPDKKDNTSWNYYSYIRRVNLMLDRIDGSKMTDV EKANWRSVGYFFRSYRYLSLLSAYGGVPWIDHVLSDDETELIKGPRASRDEIAGHILEDL QFAEKNINVNGEGRNTINKACVQALLSRFCLFEGTWRKYHGLNNAETYLRECKRVSAELI QTYPNVADCYDDLFCSLELKDVTGVILYREYSDAVGVVHAVSIGGTTATSFYNPTRDLVD SYLCTDGKPRWTSDAYLGDKDIYDEFSNRDHRLWLHVTPPYRVDRSASSTAWDNKWKFTD NSKDRSFIDSLTVRMGIGYGTSKERQKLLPFRQGYDGGILGASPHFDFYLENQPWYKSAF GYNNWKYYCCYLSMGSQRNEETDMPIFRIEEVMLNYAEAMCELGEFDQTVADVTINKLRP RANVKLMKVSEINSAFDPKRDLGNPDYPNDYEVSPLLWEIRRERRIELFSEGFRFDDLRR WKKCHYALKKKLGQYVRASDFTAGTNVTIDGGGSEGYLEFHPKQNHLWPDYYYLNPIPRN ERVLNPQLEQNPGWEEGN >gi|226332133|gb|ACIC01000187.1| GENE 18 20581 - 21858 854 425 aa, chain + ## HITS:1 COG:XF0847 KEGG:ns NR:ns ## COG: XF0847 COG3525 # Protein_GI_number: 15837449 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Xylella fastidiosa 9a5c # 3 393 180 580 841 247 37.0 3e-65 
MVISASDKGGFTYAVQTLRQWAAGSAGSITFACASVTDYPRTQWRCFLLDSGRQFQKIAT IRKYIDMASLLKMNYFHWHLTEGLGWRIEIKQYPHLTRTGGSVGKGEEQQGFYTQEEIRD IIEYARQRNITIVPEIDMPGHAEAALSAYPELGCFGLPVEIPQSGFTQNIFCAGKDGTLR FLKNVLDEVCALFPSPYIHLGGDEAPKGNWDQCPDCRKRITTEGLKDSHDLQLWFSAQMA NYLKSKGRKAIFWGDVVYHDGYPLPDNTVIQWWNYRGHKDLALRNAVKHHYPVICSSNYY TYLNFPVTPWKGYTEARTFDLKDVYLNNPSDKAISEKNPLILGMSCALWTDDGVTERMID RRLFPRILALSEQMWHEGEALDFDRFYRNILHRKAWFEEAGFEFGPALKEDVTKDYKWEL KKRTS >gi|226332133|gb|ACIC01000187.1| GENE 19 21855 - 23357 1131 500 aa, chain + ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 12 487 13 467 497 168 30.0 2e-41 MNKRNLLIAIPALCSGYLHGQTQPASPNVIYILMDDLGYGDIGCFGQDKIETPHIDRLCS EGIKLTQHYSGSPVSAPARCVLMTGMHSGHAQIRFNNELAERGAVNNYDSVYVHKELEGQ FPLQANTMTIGRMMQQAGYTTGCFGKWGLGYPGSEGTPNKQGFDRFYGYNCQRQSHTYYP PFLYNDEERVYLSNKVTDPHRSPLDKGADPNDLASYAKYTQKEYANDLIFDELMGFVDAN KRKPFFLMWTTPLPHVSLQAPERWVQHYVEKFGDEKPYTGQAGYLPCRYPHATYAAMISY FDEQIGQLIEKLKAEHLYENTLIVFTSDNGPTFNGGSDSPWFNSGGPFNSAYGWGKCFLH EGGIRVPAIITWPGKIKPGTQSDHICAFQDVMPTLAELAGITCPPTDGISFLPTLLGKKG KQKEHTYLYWEYPDPRIGNKAIRMGKWKGIITDIRKGNTQMQLYNLETDIREEHDVAAQH PDIVKRFERLMKEARNGPDF >gi|226332133|gb|ACIC01000187.1| GENE 20 23524 - 24699 1102 391 aa, chain - ## HITS:1 COG:VC0265 KEGG:ns NR:ns ## COG: VC0265 COG0668 # Protein_GI_number: 15640294 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Vibrio cholerae # 35 388 28 400 412 247 35.0 2e-65 MLVVDLGKWMNKILIDWGIDAKVADRFDETIIAILMIAIAVGVDYLCQAILVGGMRQYTR RKPHLWNTLLMKRKVFHNLIHTIPAILVYALLPMAFMRGKELLVISQKACAIYIIFSLLL AINGILLMIMDIYDGKETMKNRPMKGFIQVLQVLLFFIGGIVIISILVNKSPASLFAGLG ASAAILMLVFKDSILGFVAGIQLSANDMVRPGDWITLPSGAANGTVQEITLNTVKIQNFD NTISTVPPYTLVSSPFQNWRGMKDSGGRRVMKNITLDLTTLQFCTPEMLDRYRKEIPLMA DYQPEEGVVPTNSQVYRVYIERYLCSLPVVNQDLDLIISQKEATMYGVPIQVYFFSRNKV 
WKEYERIQSDIFDHLLAMVPKFDLKVYQYSD >gi|226332133|gb|ACIC01000187.1| GENE 21 24834 - 26381 1660 515 aa, chain - ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 21 505 21 473 497 184 29.0 3e-46 MRKEFYGILPLALTSCLASQATAQEKQTQRPNVVFIYADDIGYGDLSCNGAKTIHTPNVE RLAKMGVRFTNAHSAAATSTPSRYAMLTGEYAWRKAGTGIAAGDAAAIIRPERYTMANLF KDAGYNTGVVGKWHLGLGDKGGEQDWNKPLQPGTNDIGFEYSFIMAATGDRVPCVFVEND QVINLDPNDPIQVSYKANFPGEPTGKDNPELLKMHPSHGHDQSIVNGISRIGYMKGGKSA LWQDEKIAETLTGKAVSFIEGHKSAPFFLYFATQDAHVPRVPSPQFAGKSGMGPRGDCLL EFDWSVGEILNALERLGLDKNTLVILSSDNGPVVDDGYKDQAVELLGDHTPGGIYRGGKY SSFEAGTRIPCIWSWQGVIRPGTVSDALLCQIDWFATFAEMLNVRLPEGAAPDSEPMLKA WTGKQKKGREWLVLQNAQNNLSVTDGRWKYLRPGNGPAYLKAVNIELGNSKEPQLYDLKK DPKEKINVAGQNPELVKKMAAQLEKIVDGRYGLPL >gi|226332133|gb|ACIC01000187.1| GENE 22 26468 - 28288 1802 606 aa, chain - ## HITS:1 COG:SP2146 KEGG:ns NR:ns ## COG: SP2146 COG3669 # Protein_GI_number: 15901959 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Streptococcus pneumoniae TIGR4 # 51 478 10 448 559 308 38.0 2e-83 MMNKLLTSLFLSSVITLGGCGLTKENYYVKHVEFPQDATLEQKIDMAARLVPTPQQAAWQ QMELTAFLHFGINTFTGREWGDGQEDSAIFNPTELDAEQWVRTLKDAGFKMVLLTAKHHD GFCLWPTATTKHSVASSPWKNGQGDVVKELRNACDKYDMKFGVYLSPWDRNAECYGDSPK YNEFFIRQLTELLTNYGEVHEVWFDGANGEGPNGKKQIYDWDAFYKTIQQLQPKAVMAIM GDDVRWVGNEKGLGRETEWSATVLTPGIYARSEENNKRLGVFSKAEDLGSRAMLEKATEL FWYPSEVDVSIRPGWFYHAEEDSKVKSLKHLADIYFQSVGYNSVLLLNIPPDRRGLINEA DVQRLNEFAAYREKIFTNNRIEKGRKDWEAVSGSETVYSLKPESEINVVMLQEDITKGQR VESFTVEALTEQGWQEVAKGTTVGYKRMVRFPAVKATQLRVKINECRLTAHISQVAAYYA DPLEEENRTENWNNLPRASWKQVAASPLTIDLGKEVELSAFTYAPLKAEAKPTMAFRYKF YVSADGKNWKEVPANGEFSNIMHNPLPQTVTFGQKEKARFIKLEATTPTATTAQVEMNEI GVTVAP >gi|226332133|gb|ACIC01000187.1| GENE 23 28453 - 31521 3104 1022 aa, chain - ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 
15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 29 993 7 957 1087 664 40.0 0 MKLKKRTFLILMAALTATFASAQKQPLPEWQSQYAVGLNKLAPHTYVWPYADASDIEKPG GYEQSPYYMSLNGKWKFNWVKNPDNRPKDFYQPSYYTGGWADINVPGNWERQGYGTAIYV NETYEFDDKMFNFKKNPPLVPSAENEVGSYRRTFKVPADWKGRRVVLCCEGVISFYYVWV NGKLLGYNQGSKTAAEWDITDVLSEGENVVALEVYRWSSGAYLECQDMWRLSGIERDVYL YSTPKQYIADYKVSASLDKEKYKEGIFNLEVTVEGPSATAGSIAYTLKDASGKAVLQDAI NIKSRGLSNFIAFDEKKIAEVKAWNAEHPNLYTLVLELKDAQGKVTELTGCEVGFRTSEI KDGRFCINGVPVLVKGTNRHEHSQLGRTVSKELMEQDIRLMKQHNINMVRNSHYPTHPYW YQLCDRYGLYMIDEANIESHGMGYGPASLAKDSTWLTAHMDRTHRMYERSKNHPAIVIWS QGNEAGNGINFERTYDWLKSVEKGRPVQYERAELNYNTDIYCRMYRSVDEIKAYVGKKDI YRPFILCEYLHAMGNSCGGMKEYWDVFENEPMAQGGCIWDWVDQNFREIDKNGKWYWTYG GDYGPEGIPSFGNFCGNGLVNAVREPHPHLLEVKKIYQNIKATLSDRKNLKVCIKNWYDF SNLNEYILRWNVKGEDGTVLAEGTKEVDCEPHATVDVTLGAVKLPNTVREAYLNLSWSRK EATPLVDTDWEVAYDQFVLAGNKNTTAYRPQKAGETAFVVDKNTGALSSLTLDGKELLAA PITLSLFRPATDNDNRDRNGARLWRKAGLNNLTQKVVSLKEEKTSATVRAEILNGKGQKV GMADFVYALDKNGALKVRTTFQPDTAIVKSMARLGLTFRMADAYNQVSYLGRGDHETYID RNQSGRIGLYDTTVERMFHYYATPQSTANRTDVRWAKLTDQAGEGVFMESNRPFQFSIIP FSDVLLEKAHHINELERDGMITIHLDAEQAGVGTATCGPGVLPQYLVPVKKQSFEFMLYP VK >gi|226332133|gb|ACIC01000187.1| GENE 24 31550 - 33880 1810 776 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 29 616 29 594 757 354 34.0 4e-97 MTRTILILLIGFLSSIVAGCSDTNLTHDIPLVPRPAQIVPGSGNYLFSGKTVFAVENEEQ AEVARSFIALFTRAAGFTPKLTVGDAEKGNVRFQTDATLKSEAYTLQVSPKEIIIEASDA KGFFYALQTIRQLLPASIEKEEVSDKKVKWSIPAVSIQDEPRFGYRALLLDASRFFIPKE NVLRIIDCMAMLKINTLHFHLTDDNGWRVEIKKYPRLTEVGAWRVDRTDLPFPARRNPEP GEPTPVGGFYTQEEIKEMVAYAAERQIEVVPEIDTPAHSNSALAAYPHLACPVVKEYIGV LPGLGGRNSEIIYCAGNDSVYAFLQDVMDEILELFPSRYIHIGGDEARKTYWEKCPLCQA RMKKEKLANEEDLQGYFMNRMSEYVRSKGREVIGWDELTNSSFLPDDAIILGWQGYGQAA 
LKAAEKGHRFIMTPARIMYLIRYQGPQWFEPLTYFGNNTLKDVYDYEPVQKDWKPEYADL LMGVQACMWTEFCNKPEDVDYLVFPRLAALAEVAWTQPEKKDWTSFLKGMDSFNEHLSAK GIVYARSMYNIQHTVRPEDGALKVKLECVRPDVEIRYTMDGSEPTAASPLYEQPLLVKEA QTVNAATFADGQQMGKRLTLPVHWNKATAKPLLGNKVNEAVLNNGVRGSLKQTDFEWCSW GSSDRISFTVDLLQKEKMNTLTIGCITNYGMGVHKPKSIRVAVSDDNATYRDINELEYTP EEIFREGTFIEDLSIDMRGTVARYVRVTTEGAGECPANHVRPGQESRVCFDELIIE >gi|226332133|gb|ACIC01000187.1| GENE 25 33896 - 35479 1637 527 aa, chain - ## HITS:1 COG:PM0598 KEGG:ns NR:ns ## COG: PM0598 COG3119 # Protein_GI_number: 15602463 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pasteurella multocida # 50 518 1 456 467 132 24.0 2e-30 MENLQSNLFYSLAGVAAVASLASCTNKQKTTEQKPLNIVYIMTDDHTAQMMSCYDTRYME TPNLDRIAADGVRFTQSFVANSLSGPSRACMITGKHSCANKFYDNTTCVFDSSQQTFPKL LQKVGYQTALVGKWHLESLPSGFNYWQIVPGQGDYYNPAFITQDNDTIQKHGYITNLITD DAIDWMENKRDPEKPFCLLIHHKAIHRNWMADTCNLALYEDKTFPLPDNFFDDYEGRPAA AAQEMSIVKDMDMIYDLKMLRPNEKSRLRSLYERFLGRMDEGQRAAWDKFYAPIIDDFYK QNLTGKELANWKFQRYMRDYMKTVKSLDDNVGRVLDYLKEKGLLDNTLVVYTSDQGFYMG EHGWFDKRFMYEESMRTPLIMRLPKGFDRRGDITEMVQNIDYAPTFLELAGAEIPADIHG VSLLPLLKGEHPKDWRKSLYYHFYEYPAEHMVKRHYGVRTERYKLIHFYNDINWWELYDM QADPSEMHNLYGEAEYEPVVKELKDELLKLQEQYNDPVRFSPERDKE >gi|226332133|gb|ACIC01000187.1| GENE 26 35681 - 37462 1407 593 aa, chain - ## HITS:1 COG:no KEGG:BT_1629 NR:ns ## KEGG: BT_1629 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 593 1 593 593 1181 99.0 0 MKNILYKILPVCLLTIGLTGCEDETKYRPLPEAVPLTMSINEKAFVMGEHLKVDIKVEPD ADGNEVVANEDFDIYFTAKAGTEDVANVFEPFSSIVTFPKGEKQIQVDFPVKTSGLVGTT TMEFVAFARGYKMANSSQGIKVSDYYRISMSLENNTENVVTEGGKFVLVAKVDKPSSVPL EVTITPKEGEEGRYDNLPSTLTIPAGRTSVKSAAVTIKQDYEMTGDLQLVLNLKSNSSSN PMTAPALTITMTDLESMADPDLYDMTTVYENPNIMFVSYDDDWFTGKETAKMDEGTAHPN AELGSQWKFDYAIEFHKNSTSGDYQKLGNATDANRGNILKIDWTKYAKVTDEGELNICVG VEGTNYGTAGIHCCKSLGQMWAQNVTRIYPGMRIEMKVRLGGNRTGFVPMIEVKNPATAT TCKEAKQSICILKNVSGSAITQSVRGEVVSDAKSVVSAIPKVEDYNIYWVELVDENTIKL 
GINGSTTLEVTRDMLDSWPFTKASTGTSVGAKGLYLVMRMDLFGEGNSVSSELPAGWDTE LKSINPANYATEGPRMIIDWIRFYVNDNYKREANELVNRDYKVSIPGKFYQFY >gi|226332133|gb|ACIC01000187.1| GENE 27 37508 - 39325 1649 605 aa, chain - ## HITS:1 COG:no KEGG:BT_1630 NR:ns ## KEGG: BT_1630 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 15 605 1 591 591 1186 100.0 0 MKKIYVLLVVGASFMMSACSDFLDREPLTTPNSETFLSNASAVNNYINGLYIALPSFGTY DMGVRGEEKNSDNIVAEVYDKRLNGELQENGGGTTEWQKGYQNLRNVNYFFEYYKVPETE ETKDVLSMKGEAYFFRAYWHFYLLTRFGSIPVMDRFWDGNATVGGLQIPPRDRSAVAQFI LDDLNTAKGLLHSRSQYKGLRVCKEAAIIMAMRVALYEGTWEKYHKGTDFAAAEDKSADL LGQVLTLGDELFGMGLALNTKATDKNAVNIEDAYAHIFNSKDLSDMTEVVFWKKYSIADG VIHNLSSNLGAGYVDNSGPAGLSQSLVDNYLNADGTPINPADGIFKDFNLTFKGRDGRLL ATVMHSNCKFKSTSPESKSKAMLVEEYSEENKQIVRPPYLTEGGPARNATGYHIRMSIDT TYVSGQGETSLPVIRYAEALLAYAEAAEELGKCTPAVLEKTLKPLRERAGVTYKDPSEID PNFTDFGYTISANLQEIRRERRAELALQGFRLDDLMRWRAHKLIQGKRGTGAYFGTDGVL YKAFDPKNAADLKTILTTDGWLDPQKELLPRGYQFDANRDYLLPVPPSEISLNHELKQNP GWQRK >gi|226332133|gb|ACIC01000187.1| GENE 28 39339 - 42698 3175 1119 aa, chain - ## HITS:1 COG:no KEGG:BT_1631 NR:ns ## KEGG: BT_1631 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1119 1 1119 1119 2154 100.0 0 MEKKKFVRKQERFVLFLMLLLLFPLGALAQQKLIKGQVVDEMGEPIIGATVMVKGVTGGT ITDIDGNFSIQGKVGSTLSVTYVGYAPLQVKVTKQEGNRLVMKEDAKVLDEVVVVGMDTQ KRNTITAAVATLKDDAIVNRPVTDVTSALQGNIAGLNFASDAVGGGVGGEIGADIKFSIR GIGSINGGEPYVLVDGVEQSMQNVNPADIASISVLKDASASAVYGARAAYGVVLVTTKSG KKERASVTYRGTVGFSAPINMPKMMNALEYAAYNNQQYDNGGASSGLQKISDKTIEKIKG FMQNPYSAEFPGIEANTTGDDWAGAYYNQYGNTDWFEYYFKDKSVRHSHNLSVQGGSDKV NYYIGMGYTYQEGLLDKVQDDLSKYNLNTKLQFKTSDWLRFNLNNNITLQMIKRPMANQM ILYNKIGSHRPTQVTELPVESEYNIPSWNEMLYLKNSNYQRNRISDALSFSATVTPLEGW DITGEMKIRFDVENNNLKMKDDQKYETPAGTFKPGDATNQRQGFAYPGISWKNMYFGSYT RGSMFNYYLSPNLSSSYTHQWGDHFFKAMAGYQMELQEYSEEYMYKDGMLSNDIYSFDNA SGNVIAGEARSHWATMGFYTRLNWNYNNIYFLEFSGRYDGSSRFAPGHRWGFFPSFSAGY DIARTPYFQQLALPVSQLKVRVSYGRLGNQNGAGLYDYIAQMPLDAQGTNAWLLPGISSS 
TASKGTIAKTPKMISPYITWEKVDNANLGFDLMLLKNRLSITADFYQRTTRDMIGPAESI PSISGIASDDRAKVNNATLRNRGWELSVSWNDKLKCGFSYGVGFNVFNYKAVVTKYNNPE GIIYNNHTGLAANKGYYEGMDIGEIWGYRADDLFLSNREIDDYLRNVNLTAFKSNDLWRR GDLKYIDSNGDGRVDGGKGTLADHGDLQIIGNTTPKYSFGINLNLGYKGFEVSTLLQGVA KRDFPISGSNYMFGGNNYNFKEHLDYFSTENPNGYLPRLTGWKDDKDFLVNTGYNTTRYL LNAAYMRMKNLTVAYTFNKKQLKHIGISNLKVYVTCDNLFTVTKLPKQFDPETLNQVNMS AGGDAQNTAPGLTSPMKQNGNGMVYPMNRNFVFGLDFTF >gi|226332133|gb|ACIC01000187.1| GENE 29 42729 - 44393 1653 554 aa, chain - ## HITS:1 COG:no KEGG:BT_1632 NR:ns ## KEGG: BT_1632 # Name: not_defined # Def: chitinase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 554 1 554 554 1073 100.0 0 MRSKFLFLCLFLFGMLTACKDTKWVDVPTGTPPEGSVTGKPGETEEPDDPDPTDKTFINC SYIRGDFFETNRISGASMGACNDLIYLTARPYADGDLTFDLPVNDATLTGAATYAASYNG RNGVLKLDGTGKMNAGDGLLHSPDGAFKKFTFGTYIYVSEWVDGACLFKKVDGGSVVIAF QLGATEGNLKLTVGSATATVTNSALKSGAWHYVAVTYDGGAAKLYIDTNNTATDFTGSLP ASVPNTRADFVMGENLKGYLDETFVNSLLMGTLGRNPISFDNWNNTKTLAYWKYDDAAKP GKDSHTWAIRLEQIRTALNGQAGDRKIRLGIAGGEWLKMVGNATARTNFANNVKKVIEQY NLDGADLDFEWAYSGTDLSNYSKAIVQLREVLGKDVFLTVSLHPVSYKISAEAIAAVDFI SLQCYGPSVELFSMERFKSDGKAAVDYGIPQDKLVMGVPFYGTTGTAGEQAAYFDLVGKG NLTNTSADTWSYEGKNYTLNSQNTIRQKTQYVCENGFGGIMSWDLATDVDVTHEMSLLKA VKEELDYYANPTVE >gi|226332133|gb|ACIC01000187.1| GENE 30 44653 - 45396 508 247 aa, chain - ## HITS:1 COG:aq_718_1 KEGG:ns NR:ns ## COG: aq_718_1 COG1208 # Protein_GI_number: 15606116 # Func_class: M Cell wall/membrane/envelope biogenesis; J Translation, ribosomal structure and biogenesis # Function: Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) # Organism: Aquifex aeolicus # 1 130 4 129 384 100 36.0 4e-21 MIFAAGLGSRLKPLTDTMPKALVPVAGRPMLEHVILKLKASGFTEIVINIHHFGEQIIDF LKANNDFGLTLHISDERDLLLDTGGGIRKARRFFENSDEPFLVHNVDILSDMNLKELYDF HLRNGSVATLLASRRKTSRYLLFDAEQRLCGWINKDTGQVRPEGFLYDESLYREYAFSGI HVFSPAVFQLMEVPCWEGKFSIMDFYLATCGQTDYCGYLTEKLELIDIGKPETLARAEEF LRENIVS 
>gi|226332133|gb|ACIC01000187.1| GENE 31 45405 - 46835 1131 476 aa, chain - ## HITS:1 COG:YPO3586 KEGG:ns NR:ns ## COG: YPO3586 COG1660 # Protein_GI_number: 16123728 # Func_class: R General function prediction only # Function: Predicted P-loop-containing kinase # Organism: Yersinia pestis # 337 464 156 277 284 90 39.0 1e-17 MITEELQKLYQEYTGVPAENITELPSSGSNRRYFRLTGAKTLIGVCGTSVEENDAFLYMA AHFRKSGLPVPEVHIVSENKSYYLQEDLGDTLLFHAIEKGRATSVFSEEEKELLRKTVRL LPAIQFAGADGFDFSRCYPQPEFNQRSILWDLNYFKYCFLKATGMEFQEDKLEDDFQKMS DVLLRSSSATFMYRDFQSRNVMIKDGEPWFIDFQGGRKGPFYYDVASFLWQAKAKYPDSL RQELLKEYIDALRKYQPIDEPYFYSQLRHFVLFRTLQVLGAYGFRGYFEKKPHFIQSVPF AIQNLRDLLKEAYPEYPYLCNVLRELTELKQFTDDLKKRQLTVKVMSFAYKKGIPDDPTG NGGGYVFDCRAVNNPGKYERYKPFTGLDEPVISFLEEDGEILRFLDSVYALVDASVKRYM ERGFSNLSVCFGCTGGQHRSVYSAQHLAEHLNRKFGVKVELVHREQNIERTFEATV >gi|226332133|gb|ACIC01000187.1| GENE 32 46967 - 50932 3056 1321 aa, chain + ## HITS:1 COG:slr1393_3 KEGG:ns NR:ns ## COG: slr1393_3 COG0642 # Protein_GI_number: 16329802 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 796 1029 45 296 301 124 32.0 1e-27 MKARLYIVLCIAFITQMAGVLPLQAGHYYYKQISLKEGLPSTVRCILTDEQGFVWIGTRS GLGRFDGHELKKYIHHADDPHSLPHDLVLQIAEDEQYNIWILTDKGVARYQRQSDDFYLP TDEKGKNITAYSTCPTPEGLLFGAKNKIYFYSYRDGSFRLLQEFDQTPDFRNFNITILSL WDEETVLCCSRWQGLLLLNLKTGAYSRPPFDCGKEIMSMIIDSKNRIWIAPYNNGLYCFD RNGKQLASYTTRNSSLSNNVILSLAERENKLWIGTDGGGINILEPETGQLSLLEHIPGRD NYSLPANSILSLYNDRNNNIWAGSIRNGLISIREVSMVTYTDVVPGNDRGLSNNTILSLY QQSDDKIWIGTDGGGVNLFNPLTEKFTHYLSTWEDKVASICQFTPDKLLISLFSQGVFVF NPSTGEKQPFTIIDDETTTRLCNRGKAVNLYQNSPDNVLLLGDHVYQYDLNKRTFRIATE QEGQDIIGALLPITNHENYTYLNDTKRIYQLDKRNNRLTSLFRCFNDTVINSVARDEYGD FWIGSNYGLIHYNPATKVRTPFTTTLFTEVTLIVCDQQGKVWFGTNDMLFAWLIKEKKFV LFGESDGAIQNEYLSKPRLLSSHGDVYMGGVKGLLHINSNLPPATSELPKLQLSDVIVNG ESVNNELCGDPTGISVPWNSNISIRIMSKEEDIFRQKVYRYQIEGLNDQQIESYNPELVI RSLPPGDYQIMASCTAKDGSWIPSQQILELTVLPPWYRTWWFTLGCALLVTGMIVETFRR TLKRKEDKLKWAMKEHEQQVYEEKVRFLINISHELRTPLTLIHAPLSRILKSLSPADTQY 
LPLKAIYRQSQRMKNLINMVLDVRKMEVGESKLLIQPHPLNEWIEHVSQDFVSEGEAKNI QIHYRLDPRIKIVSFDKDKCEIILSNLLINALKHSPQDTEITITSELLPEEKRVRISITD QGCGLQQVDTQKLFTRFYQGMGEQSGTGIGLSYSKILVELHGGSIGARDNSESGATFFFE LPLKLESEEIICQPKAYLNELMSDDSGKQSQEEDNFATAPYSILVVDDNPDLTDFLKKAL GEYFKRILTASDGVEALQLIKSHTPDIIVSDVMMPRMNGYELCKNIKEDIAISHIPVILL TARDDKQSQMSGYKNGADAYLTKPFEVEMLMELIRNRLKNREHTKKRYLNTGLIPAPEES TFSQADETFLLKLNKIIQENLDSSRLDIPFICKEIGMSRASLYNKLKALTDMGANDYINK FRMEQAILLITGTEMSFTEIAEKVGFTTSRYFSTTFKQYTGETPTQYKEKHKRETSKTIN Q >gi|226332133|gb|ACIC01000187.1| GENE 33 50971 - 52500 1245 509 aa, chain + ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 6 487 12 467 497 174 26.0 3e-43 MNYKLFSLVSGTLILSSLSASGQKNNTKPNILFILCDDMGYGDLGCYGQPFIRTPHLDAM ASEGMRFTQAYAGSPVSAPSRASFMTGQHTGHCEVRGNKEYWTNAPTVMYGNNKEYAVVG QHPYDPDHVILPEIMKENGYTTGMFGKWAGGYEGSCSTPDKRGIDEYFGYICQFQAHLYY PNFLNRYSKALGDTGVVRIIMDENIKYPMYGADYQKRPQYSADMIHQKAMEWLDEQDGKQ PFFGVLTYTLPHAELIQPEDSILNEYKEKFNPDKSYKGSEGSRYNAITHVHAQFAGMITR LDYYVGEVLKKLKEKGLDENTLVIFSSDNGPHEEGGADPTFFGRDGKLRGLKRQCYEGGI RIPFIARWPGRVPAGTVNDHICAFYDLMPTFCEIIGEKNYVKKYANKDKEVDYFDGISFA PTLLGKKKQKEHDFLYWEFNETNQIGVRMGDWKMVVKKGIPFLYNLATDIHEDNNVADQH PEIVEKMKAVIFAQHTPNPNFSVTLPEKK >gi|226332133|gb|ACIC01000187.1| GENE 34 52778 - 52927 188 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253572683|ref|ZP_04850084.1| ## NR: gi|253572683|ref|ZP_04850084.1| predicted protein [Bacteroides sp. 
1_1_6] # 1 49 1 49 49 78 100.0 1e-13 MKKEYHHFAFGLFIEEVLKCEKVGISAMCQAIGMSKETYEMLKKGMISV >gi|226332133|gb|ACIC01000187.1| GENE 35 53084 - 53710 341 208 aa, chain + ## HITS:1 COG:no KEGG:BT_1637 NR:ns ## KEGG: BT_1637 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 208 1 208 208 420 100.0 1e-116 MTNNFRMSYFMPPIAPIRNEQGQTVTPATLTPFCEVSVEQVYQMITCNENLKALTEQVRG AGDLRMAKASLLPYVTPCGTFIRRSSKFFASPSGLIVVDIDNLDSYQKAVEMRRTLFDDL FLCPILTFISPSGRGVKAFVPYSKLYADDQTRNVKESINWAMQYIEITYGSEMDDSTETP PKAVDTSGKDIVRACFLSHDPQALFREY >gi|226332133|gb|ACIC01000187.1| GENE 36 53776 - 55623 1106 615 aa, chain + ## HITS:1 COG:no KEGG:BT_1638 NR:ns ## KEGG: BT_1638 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 615 1 615 615 1201 98.0 0 MTNIESLRLITEAVETAGTDIAPSYAEYVQLAFAIATDCGEAGREFFHRLCRISAKYQRE HAERMFSNALIKQHGEIHLGTAFHLAESTGVKICREEVMNSRKNAENALNAPYNFPTHRR AYNKVEEEENDNDENTNAPEELSTDSDPLQPLPTFPEADWPKILMLIMSYATSPTQRDIL LLGALTALGATMERYVRCSYSGKYQSPCLQTFFVAPPASGKSGLSLIRLLVEPIHDKIRQ QVDEEMKAYRKEKKNYELLGKERVNTEAPQMPPNKMFLISGNNSGTGILQNIMDANGTGL ICETEADTIASAISSEYGHWSDTLRKAFDHDRLSYNRRTDQEYREVKRIFLAVLLSGTPA QVRALIPSAENGLFSRQLFYYMHGIYTWADQFACGEIDLDEIFRSIGRDWQLKLDILKEH GIHTLRLTDEQKKEFNALFSDLFFRSDIANGNEMRSFIARLAVNICRIMSTIAMLRVLEI PQPYQLKSSNRYAPVPDKEIPTDNVKDGIITRWDITITPEDFKAVLGLVKPLYRHATHIL SFLPSSEIPHRANADRDAFFDALGDEFTRTQLTEQATAMGIKPNTALSWLRRLVKKGLFV MKEKGTYVRARVCVC Prediction of potential genes in microbial genomes Time: Thu May 12 03:58:50 2011 Seq name: gi|226332132|gb|ACIC01000188.1| Bacteroides sp. 1_1_6 cont1.188, whole genome shotgun sequence Length of sequence - 11948 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 2, operones - 2 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 2 - 254 96 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 286 - 345 4.7 - Term 263 - 308 -0.2 2 1 Op 2 . 
- CDS 347 - 1054 329 ## COG3774 Mannosyltransferase OCH1 and related enzymes 3 1 Op 3 . - CDS 1065 - 2087 283 ## PGN_1242 hypothetical protein 4 1 Op 4 11/0.000 - CDS 2110 - 3084 173 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 5 1 Op 5 11/0.000 - CDS 3114 - 4100 181 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 6 1 Op 6 . - CDS 4093 - 5097 181 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 7 1 Op 7 . - CDS 5146 - 6261 408 ## BVU_2943 hypothetical protein 8 1 Op 8 . - CDS 6313 - 7482 162 ## COG1035 Coenzyme F420-reducing hydrogenase, beta subunit 9 1 Op 9 . - CDS 7470 - 8489 256 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 10 1 Op 10 . - CDS 8492 - 10006 353 ## Shal_1490 polysaccharide biosynthesis protein - Prom 10076 - 10135 3.3 11 2 Op 1 . - CDS 10152 - 11255 586 ## BT_1653 hypothetical protein 12 2 Op 2 . - CDS 11261 - 11947 780 ## COG1596 Periplasmic protein involved in polysaccharide export Predicted protein(s) >gi|226332132|gb|ACIC01000188.1| GENE 1 2 - 254 96 84 aa, chain - ## HITS:1 COG:PM0512 KEGG:ns NR:ns ## COG: PM0512 COG0463 # Protein_GI_number: 15602377 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Pasteurella multocida # 2 84 3 84 266 90 54.0 7e-19 MFSVLISVYNKENSLSLRQSLTSVFRQKLPPTEVVLLKDGPLTEELDKVIAEYVMRYPEL KIVSLPVNQGLGKALNEGLKHCSY >gi|226332132|gb|ACIC01000188.1| GENE 2 347 - 1054 329 235 aa, chain - ## HITS:1 COG:FN1241 KEGG:ns NR:ns ## COG: FN1241 COG3774 # Protein_GI_number: 19704576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannosyltransferase OCH1 and related enzymes # Organism: Fusobacterium nucleatum # 1 204 1 212 243 102 35.0 4e-22 MIPKTIHFCWLSGETYPELVCKCIKSWELILPDYEIILWDTQKIDIYSNLWLEQSYKKKK YAFAADYIRFYALYYYGGIYLDADVEVLKSFNPLLGEHYFLGEEAGGDIEAAVIGAEKGS SWVKECLDYYRDRPFIKSNGRLDTRPVPLLINSVVGGKGLAIKPYYYFSPKDYNIGKIDI 
KESTYCIHHFDGKWLKRGLKYSVKKNIHKALYLIMGRVKHNKVIRIIRIIKDRFI >gi|226332132|gb|ACIC01000188.1| GENE 3 1065 - 2087 283 340 aa, chain - ## HITS:1 COG:no KEGG:PGN_1242 NR:ns ## KEGG: PGN_1242 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 32 326 28 330 347 107 30.0 8e-22 MIGLIFFIICVILGVWSIESTQFKRMNLNIIVFWVLVIIAGLRYNVGVDYPVYEDIYNDP YSVYTYSIEPVWLIINKILYTLGLNSRAFFFLTSLMIMGGFYVGIRRMSPNFYISILLFV MCGFYFDSMNLVRQYVAISLLFVAFTFFLNGENIKYLLCVLLAALFHYSVLVIFPFILLS RFRYSNWLLAIILCISYFGGTYLLNLIVSYVMPSLMELGRYQYTIEDFDSGINTGILKLF YNLLGVFILLLYTKRVQTQYVFVNMVIIGLVLYNTFYLFMPARRLYLYFFPYIVVLFPYY LQKFKLTSQAIVLVGCCLGFLSFLLKSNWGIPYSFDVLFL >gi|226332132|gb|ACIC01000188.1| GENE 4 2110 - 3084 173 324 aa, chain - ## HITS:1 COG:BS_yveT KEGG:ns NR:ns ## COG: BS_yveT COG0463 # Protein_GI_number: 16080481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 5 257 5 249 344 118 28.0 2e-26 MNVCLSVIVPVYNIAPYLRRCIESVLCQSFSDFELLIIDDGSTDDSATICDEYAGKEKCI RVIHQINAGVTAARRRGVEEAKGDWICFVDGDDVLPQNAFCDLYRHTSDVDIVIGRIHLV DRKGRILQRSCQEERFLDFINYLKALLEHKVPLSPWGRIFRKSLFDPSILDLPPTIRRGE DYIMNVRLAIKSEKIRIIDRHVYDYIQYSDSCLHRFRNTWEYEKLFNSFLLRSIMDHHLE EECKESIVHAHIHLLMGVLDDPNLNRKDAFYLQIKKEALEIQITLKEKLLLNFVVFPFYI RKFIYRTLKKVYHSYRQLILLITV >gi|226332132|gb|ACIC01000188.1| GENE 5 3114 - 4100 181 328 aa, chain - ## HITS:1 COG:BS_yveT KEGG:ns NR:ns ## COG: BS_yveT COG0463 # Protein_GI_number: 16080481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 5 221 3 220 344 131 36.0 1e-30 MNKNPLVSVIVPIYGVEPYIEKCARSLFEQSLENMEFIFVNDCTPDKSVEILRQVIEEYP RRYLQIQIIEHEENRGLAMARNSGLLIAKGEYIIHCDSDDWVELDMYEKMYEKALEKNAD IVICDYYAEYSNKRIYHIQREPGEKEEYLMKILQGCLHNGMWNKLVRRELYQHLSFWYKE GVDMWEDVSIMPRLVFYAENIVSMHKAFYHYSQVNINAYTKCWKTESLQNVIDVVNVIES FLLENKNGRYNLELLYLKLRAKYCLLRYSSGKQRNVYRNLYSEADSLVFSHSVLPIHDKI 
IIWCWLHSMDWGAAIILSFIERMKRWLR >gi|226332132|gb|ACIC01000188.1| GENE 6 4093 - 5097 181 334 aa, chain - ## HITS:1 COG:BS_yveT KEGG:ns NR:ns ## COG: BS_yveT COG0463 # Protein_GI_number: 16080481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 8 136 3 126 344 117 44.0 5e-26 MRKMESPPLVSVIVPVYRAEKFIQRCARSLFEQTLEEMEFVFVNDCTPDSSMVILREIIK EYPNRARQIRIVEHDQNKGSATARNTGLDNAHGSYIIYCDSDDWVEREMYEKMIKKAFDD KADIVGCDFYYDYVDRLVIHRQHFPNDNRQCIVELLEGKLHGSMCNKLINRKLYSLGHVR FFDGINMWEDIVAIISLCFYAGKIAYVSQPFYHYVQYNSASIIRDVTIKQMEDMIYACRN IESFLVQNNSFDKFKFHFISLVLGVKFYLVLRKEVRDYNRFHCLWPEFDKDIWKIAQRWD ICLIYWLASRKKYRLSRFIFRGKEKIRRLIFRHE >gi|226332132|gb|ACIC01000188.1| GENE 7 5146 - 6261 408 371 aa, chain - ## HITS:1 COG:no KEGG:BVU_2943 NR:ns ## KEGG: BVU_2943 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 370 1 368 371 275 41.0 2e-72 MRICILTQPLCTNYGGLLQAYALQVVLKRMGHEVWTENRKENPLSLISKFKLFIKRILAP IRGIYYGTNEQKKVISQYTELFIRNYITITDPVTSNTKEVLRRYAFDAYIVGSDQVWRPC YSYYLPNYFLDFAMGDKVKRIAYAASFGTSEWEFTAEQTEQCAALAKSFDAISVREDSGV ELCSKYLGVNAVCLLDPTLLLRKEDYVYLVEKEQVTAFDSKLMTYVLDQSEEKQRIIQEL SYKLGLTPIVVMPKFTFEKTGPKEISNCIFPPVTKWLRGFMDAEYVVTDSFHGTVFSIIF NKPFIVIANKERGLGRFTSLLRMLGLESRLVYTFDDITDQLIYTSIDYTQVNRILRRERD RALEFLKASLR >gi|226332132|gb|ACIC01000188.1| GENE 8 6313 - 7482 162 389 aa, chain - ## HITS:1 COG:MA3732 KEGG:ns NR:ns ## COG: MA3732 COG1035 # Protein_GI_number: 20092529 # Func_class: C Energy production and conversion # Function: Coenzyme F420-reducing hydrogenase, beta subunit # Organism: Methanosarcina acetivorans str.C2A # 3 283 7 271 346 62 23.0 1e-09 MLQIINKEDCCGCSACVQICPKCCISMYEDNEGFLYPEINKDICVNCHLCENVCPVLHQG NPHRTLRTYAAKNKKEDIRSQSSSGGVFSLLAEYIIDRGGVVFGARFNEKWEVIHDYVEV KEDIAAFRGSKYVQSQIGDSYKKVEFFLKQSREVLFSGTPCQIAGLNYYLRKKYDNLLTV DLVCHGVPSPGVWRKYLQDQILNKDQSRISNIQFRDKRLGWKNFSFTIWGYSNINKNVPT 
ILLTESLKNNVFMKGFLKNLYLRPSCYNCPCRNFKSGSDITIGDYWGVENYYHTYDDDKG VSLVCVNTSYGMKIYSLLMVDGFETPYEYLLSANSCLEKSVTIPKLRNLFWLSFQHEGFN SIERICGKMNPNFIHRAFSWIRSRIQYNK >gi|226332132|gb|ACIC01000188.1| GENE 9 7470 - 8489 256 339 aa, chain - ## HITS:1 COG:PA0705 KEGG:ns NR:ns ## COG: PA0705 COG0463 # Protein_GI_number: 15595902 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Pseudomonas aeruginosa # 5 133 18 139 299 80 34.0 4e-15 MNRKLVSVITPCYNSGKFIHRLLETILEQDYPCLEMYVIDDGSVDNTKEVIKNYITKFDK RGYTLTYIFQENEGQSVAVNRALKSIKGEYLVWPDSDDFYANSSAISKMVSILDKSDDSV SMVRCLPIYLNEETLALDHKTPNVVEIYNEYLFEDCLFGKKYFWFVPGDYMVKMSILDKC IRNREIYTEKNAGQNWQLMLPLLYQYRCITIGDYLYNVVIREESHSRGQYKTFERVCDKL QSYENTLICTLNNMLFLSLDEREKYIYAIRKKYLLERLNVCLKYHRKEDARIIRKRLEKD YQVTLSLTRKLDYLCCLIPGYFLVKQFLSSLKYKYLCCK >gi|226332132|gb|ACIC01000188.1| GENE 10 8492 - 10006 353 504 aa, chain - ## HITS:1 COG:no KEGG:Shal_1490 NR:ns ## KEGG: Shal_1490 # Name: not_defined # Def: polysaccharide biosynthesis protein # Organism: S.halifaxensis # Pathway: not_defined # 8 499 11 500 512 404 47.0 1e-111 MIEDNGKNKRIAKNTFYLYIRMLFSMVISLYTSRVVLNTLGVEDFGIYNVVAGVISMFSF LNTSMSGAVSRFFAYAIGTNNWKHLQNTFSSALTIHLIIALFIMVLSETVGLWFLQNKLV IPENRILAANIIYQFSILTSMIAIIQVPYVAMIMAYERMSVYACIEIINVLLKLVIVILL IYITFDKLIIYGLLLLISTLIISVIYGTYCTKKIESCRFCLLGDKKLIYSMFTFSGWDLY GNASVLARTQGVNILLNMFFGAVLNAASGIASQIQTAVMSFAGSVLSAVRPQMIKSYAVG DYNRMIYLIYKASIYTSILLLLFTIPLLIETDFILALWLSEVPIYAGILCRYVLLFNLFA NLSSVVVSAIHATGKIKRPSLINGTFYLSVIPFAYIAFYVGSQPQLAYIYNIFAVLAGLL SNVWTLHLYIPCFSFKQYAWLVLCRTVCVGSLTFFATYFFSKLFDAGWIRLICVGLFSTI LLLVTTYFILMTNRERKLLKLRLL >gi|226332132|gb|ACIC01000188.1| GENE 11 10152 - 11255 586 367 aa, chain - ## HITS:1 COG:no KEGG:BT_1653 NR:ns ## KEGG: BT_1653 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 361 1 362 365 513 75.0 1e-144 MSEQSVEKEPEDMKIDLLDIFSRIIAIRKKLYKATIIGLLIAIIVWASIPKKYTVTVMLS 
PEMSSDKGSNGLSGLAASFLGGGSSFSSGTDALNASLSSDIVASTPFLLELLNIEVVDNG EVKRLSDYLDTESFPWWSYIIGAPSRIIAGVQSLFINERKFSSEKLQDGIIILNKKVKGQ IDVLKKNITATIDKKTAITNVSVTLQNPRITAVVADSVIHRLQEHIIDYRTSKAKEDCVY LEKLFEERKQEYYAAQKKYAKYVDTHDNLVLQSIRVEQERLQNDMSLAYQVYSQVINQLQ IARAKVQEEKPVFAVVEPPVVPLKPSGIGVKLYVLLFVFFSLFITIGWELLGRNILEWLK KQTRNEC >gi|226332132|gb|ACIC01000188.1| GENE 12 11261 - 11947 780 228 aa, chain - ## HITS:1 COG:Cj1444c KEGG:ns NR:ns ## COG: Cj1444c COG1596 # Protein_GI_number: 15792762 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Campylobacter jejuni # 99 175 429 505 552 66 37.0 5e-11 GYQAQQNVVIDGEILFGGNYAMTNREERLSDLVNKAGGPTSLAYLRGAKLTRVASAGEKK RMGDVIRLMSRQLGEAMIDSLGIRVEDTFTVGIDLEKALSDPKGNADLVLREGDVISIPK NNNTVTINGAVMVPNTVSYMQGKDVDYYLNQAGGYSDNAKKSKKFIVYMNGQVTKVKGSG KKQIEPGCEIIVPSKAKKKANIGNILGYATTFSTLGMMVASIANLIKK Prediction of potential genes in microbial genomes Time: Thu May 12 03:59:28 2011 Seq name: gi|226332131|gb|ACIC01000189.1| Bacteroides sp. 1_1_6 cont1.189, whole genome shotgun sequence Length of sequence - 71940 bp Number of predicted genes - 67, with homology - 66 Number of transcription units - 27, operones - 15 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 21 - 66 8.8 1 1 Op 1 . - CDS 168 - 2357 1710 ## BF2815 putative mobilization protein - Term 2380 - 2425 6.0 2 1 Op 2 . - CDS 2438 - 3346 605 ## gi|212694227|ref|ZP_03302355.1| hypothetical protein BACDOR_03753 3 1 Op 3 . - CDS 3413 - 4066 477 ## BF2813 hypothetical protein 4 1 Op 4 . - CDS 4075 - 4620 327 ## BF2812 hypothetical protein 5 1 Op 5 . - CDS 4620 - 5465 813 ## BF2811 conjugate transposon protein TraN 6 1 Op 6 . - CDS 5535 - 6677 1250 ## BF2808 conjugate transposon protein TraM 7 1 Op 7 . - CDS 6661 - 7080 278 ## BF2806 hypothetical protein 8 1 Op 8 . - CDS 7093 - 7707 409 ## BF2805 conjugate transposon protein TraK 9 1 Op 9 . 
- CDS 7711 - 8112 400 ## BF2804 hypothetical protein 10 1 Op 10 . - CDS 8178 - 9308 869 ## BF2803 hypothetical protein 11 1 Op 11 . - CDS 9308 - 9709 251 ## gi|212694218|ref|ZP_03302346.1| hypothetical protein BACDOR_03744 12 1 Op 12 . - CDS 9725 - 10555 673 ## COG3617 Prophage antirepressor 13 1 Op 13 . - CDS 10599 - 11363 688 ## BF2802 hypothetical protein 14 1 Op 14 . - CDS 11391 - 12116 640 ## BF2800 hypothetical protein 15 1 Op 15 . - CDS 12131 - 14641 1950 ## BF2799 hypothetical protein - Prom 14674 - 14733 2.9 - Term 14780 - 14835 2.6 16 2 Tu 1 . - CDS 14892 - 17597 2421 ## BF2797 hypothetical protein - Term 17612 - 17660 4.9 17 3 Op 1 . - CDS 17720 - 18019 342 ## BF2796 hypothetical protein 18 3 Op 2 . - CDS 18068 - 18433 499 ## BF2795 conjugate transposon protein TraE 19 3 Op 3 . - CDS 18492 - 18821 328 ## BF2794 hypothetical protein 20 3 Op 4 . - CDS 18821 - 19273 512 ## BF2793 hypothetical protein - Prom 19365 - 19424 4.0 - Term 19428 - 19460 2.3 21 4 Op 1 . - CDS 19495 - 20265 445 ## BF1071 hypothetical protein 22 4 Op 2 . - CDS 20262 - 21041 563 ## BLD_1325 hypothetical protein 23 4 Op 3 . - CDS 21029 - 22000 584 ## HCH_01177 hypothetical protein - Prom 22028 - 22087 3.5 24 5 Tu 1 . - CDS 22157 - 23056 685 ## BF2792 DNA primase - Prom 23095 - 23154 1.7 25 6 Op 1 . - CDS 23257 - 24369 894 ## BF2791 hypothetical protein 26 6 Op 2 . - CDS 24366 - 24707 291 ## BF2790 putative excisionase 27 6 Op 3 . - CDS 24751 - 25119 205 ## BT_4618 hypothetical protein - Prom 25140 - 25199 4.5 - Term 25172 - 25220 1.4 28 7 Tu 1 . - CDS 25253 - 25807 145 ## gi|212694180|ref|ZP_03302308.1| hypothetical protein BACDOR_03706 - Term 26135 - 26186 9.5 29 8 Op 1 . - CDS 26218 - 27465 498 ## COG0582 Integrase 30 8 Op 2 . - CDS 27462 - 28028 193 ## BF1226 hypothetical protein - Prom 28061 - 28120 4.4 - Term 28147 - 28202 14.1 31 9 Tu 1 . - CDS 28230 - 30146 2257 ## COG0443 Molecular chaperone - Term 30407 - 30466 13.6 32 10 Tu 1 . 
- CDS 30483 - 31271 598 ## COG3187 Heat shock protein - Prom 31379 - 31438 5.5 + Prom 31244 - 31303 5.8 33 11 Tu 1 . + CDS 31417 - 32700 961 ## BT_4613 hypothetical protein + Term 32716 - 32752 -0.8 + Prom 32723 - 32782 10.8 34 12 Op 1 . + CDS 32806 - 33999 1345 ## COG1748 Saccharopine dehydrogenase and related proteins 35 12 Op 2 . + CDS 34014 - 34469 557 ## COG1225 Peroxiredoxin 36 12 Op 3 . + CDS 34494 - 35528 1276 ## COG0468 RecA/RadA recombinase + Term 35560 - 35605 7.4 - Term 35600 - 35663 19.2 37 13 Op 1 . - CDS 35673 - 36665 955 ## COG2855 Predicted membrane protein - Prom 36690 - 36749 4.0 38 13 Op 2 . - CDS 36783 - 37157 206 ## COG3304 Predicted membrane protein 39 13 Op 3 . - CDS 37169 - 37612 396 ## COG5579 Uncharacterized conserved protein - Prom 37655 - 37714 6.1 - Term 37669 - 37727 14.0 40 14 Tu 1 . - CDS 37757 - 38941 643 ## BT_4606 hypothetical protein - Prom 38981 - 39040 5.5 41 15 Op 1 . - CDS 39042 - 41501 1130 ## BT_4605 hypothetical protein 42 15 Op 2 . - CDS 41513 - 42541 373 ## BT_4604 hypothetical protein 43 15 Op 3 . - CDS 42544 - 43740 514 ## BT_4603 hypothetical protein 44 15 Op 4 . - CDS 43743 - 46121 1109 ## COG0443 Molecular chaperone - Prom 46243 - 46302 9.4 + Prom 47089 - 47148 6.6 45 16 Tu 1 . + CDS 47231 - 47869 582 ## BT_4601 hypothetical protein + Prom 48041 - 48100 4.9 46 17 Tu 1 . + CDS 48236 - 49957 1108 ## BT_4600 hypothetical protein 47 18 Op 1 . - CDS 49959 - 50855 662 ## COG0583 Transcriptional regulator 48 18 Op 2 . - CDS 50913 - 51497 551 ## BT_4598 hypothetical protein - Prom 51717 - 51776 6.0 + Prom 51576 - 51635 6.3 49 19 Tu 1 . + CDS 51721 - 54309 1873 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 + Term 54325 - 54385 14.2 50 20 Tu 1 . - CDS 54369 - 54617 200 ## BT_4596 hypothetical protein - Prom 54663 - 54722 5.0 - Term 54746 - 54801 11.1 51 21 Op 1 . - CDS 54839 - 55273 488 ## BT_4595 hypothetical protein 52 21 Op 2 . 
- CDS 55299 - 55913 364 ## COG0237 Dephospho-CoA kinase 53 21 Op 3 . - CDS 55903 - 56730 444 ## BT_4593 hypothetical protein 54 21 Op 4 . - CDS 56768 - 56917 89 ## 55 21 Op 5 . - CDS 56939 - 57256 421 ## COG1862 Preprotein translocase subunit YajC 56 21 Op 6 . - CDS 57297 - 58223 928 ## COG0781 Transcription termination factor - Prom 58352 - 58411 7.3 + Prom 58141 - 58200 7.0 57 22 Op 1 . + CDS 58392 - 58796 436 ## BT_4590 hypothetical protein + Prom 58816 - 58875 3.9 58 22 Op 2 22/0.000 + CDS 58933 - 59523 976 ## PROTEIN SUPPORTED gi|29349997|ref|NP_813500.1| 50S ribosomal protein L25/general stress protein Ctc + Term 59558 - 59618 13.0 + Prom 59580 - 59639 4.6 59 23 Op 1 . + CDS 59659 - 60228 638 ## COG0193 Peptidyl-tRNA hydrolase + Prom 60251 - 60310 3.8 60 23 Op 2 . + CDS 60330 - 60752 572 ## COG1188 Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) + Term 60758 - 60819 15.0 - Term 60742 - 60810 18.2 61 24 Op 1 . - CDS 60827 - 61252 524 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases 62 24 Op 2 . - CDS 61302 - 63575 1706 ## COG0475 Kef-type K+ transport systems, membrane components - Prom 63649 - 63708 5.3 - Term 63656 - 63706 14.1 63 25 Op 1 . - CDS 63726 - 64811 1062 ## COG0404 Glycine cleavage system T protein (aminomethyltransferase) 64 25 Op 2 . - CDS 64861 - 66084 1272 ## COG2195 Di- and tripeptidases - Prom 66104 - 66163 6.3 - Term 66142 - 66184 11.1 65 26 Op 1 . - CDS 66204 - 67613 1176 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase - Prom 67650 - 67709 5.9 - Term 67673 - 67726 12.2 66 26 Op 2 . - CDS 67778 - 69937 1848 ## BT_4581 alpha-glucosidase - Prom 70180 - 70239 10.7 - Term 70237 - 70279 5.1 67 27 Tu 1 . 
- CDS 70485 - 71678 367 ## COG5433 Transposase - Prom 71797 - 71856 5.8 Predicted protein(s) >gi|226332131|gb|ACIC01000189.1| GENE 1 168 - 2357 1710 729 aa, chain - ## HITS:1 COG:no KEGG:BF2815 NR:ns ## KEGG: BF2815 # Name: not_defined # Def: putative mobilization protein # Organism: B.fragilis # Pathway: not_defined # 1 726 1 725 728 1239 85.0 0 MEESKELQGFYKIFRAVIYVSILMEFFAYALEPEQLDFLGGVVTDIHTRIKRWMIYHDGN LVYSKVATFLLICITCVGTRNKKHLEFDARRQVLYPLLSGVLLIVQSVWLFGHSIMPCFY TLRLNTWLYMLTSIVGVVLVHIALDNISKFLKEGLLKDRFNFENESFEQCRELQENKYSV NIPMRYYYKGKFRKGWVNIVNPFRGTWVVGTPGSGKTFSIIEPFIRQHSEKGFAMVVYDY KFPTLATKLYYHYLKNKNAKDSKMPHGMKFNIINFVDVEYSRRVNPIQLKYINNLAAASE TAETLLESLQKGKKEGGGGSDQFFQTSAVNFLAACIYFFCNYGKEPYDKDGKMLIAERRE DPKTKRLIPTGRVFDHSGAEVQPAYWLGKYSDMPHILSFLNESYQTIFEVLQTDNEVAPL LGPFQTAFANKAMEQLEGMIGTLRVYTSRLATKESYWIFHKDGDDFDLKVSDPKNPSYLL IANDPEMESIIGALNALILNRLVTRVNTGQGKNVPVSIIVDELPTLYFHKIDRLIGTARS NKVSVALGFQELPQLEADYGKVGMQKIITTVGNVVSGSARAKETLEWLSSDIFGKVVQLK KGVTIDRDKTSINLNENMDSLVPASKISDMPTGWICGQTARDFVVTKTGMNGSMNIQESE EFKTSKFYCKTDFDMTEIKKEESEYVPLPKFYKFKSKEERERILYKNFVQVGEDVKAMIK EIQQFRTMM >gi|226332131|gb|ACIC01000189.1| GENE 2 2438 - 3346 605 302 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|212694227|ref|ZP_03302355.1| ## NR: gi|212694227|ref|ZP_03302355.1| hypothetical protein BACDOR_03753 [Bacteroides dorei DSM 17855] # 1 302 1 302 302 546 100.0 1e-154 MKKNLFLMLGLGAMLLTSCDQDIVNEIYLPGGGGSVVVTDSTSTKQLQVFTNLEMLQPAV STRAVDNQWELNDMIGISSTGMINNMKFTRSGGTNQFVSANKVFFTDTDTHTFNAYYPYT ANPTNDLIEFQVPEAAVQSEQKVNDFLFASGAASYANPSLTLNFTHQMVRVIIKVYTSPE YGFTTNDFAGSLLTFVGYFDSGTFNIKTGKVECSDESVTTYYFGDPQSDHMEYTLYVPAQ VMPDMTLQLSNGENFVFLTNDTWKAGHSYSYSLKPVRNTLEVISATISPWTVESEKTING YE >gi|226332131|gb|ACIC01000189.1| GENE 3 3413 - 4066 477 217 aa, chain - ## HITS:1 COG:no KEGG:BF2813 NR:ns ## KEGG: BF2813 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 20 215 21 216 218 282 67.0 6e-75 MKKLFSILAVAATVLLTACDEHQDFPDTAMKVTHILTTDGKVMPYETYEQSCKQAIAVVF 
NVNQREEMEGNGYAVYLWDIAPEAFADSIGVEQDTSCDLTAYDGNKNTFALYGTTDVKSP LAERVFDMWRYGQSAYIPSVAQMRLLYHAKDIINPYILKCGGTPIPDEDDDCWYWTSTEV EGQATAKAWLYSLGSGAIQETPKWQPHKARPIITIMD >gi|226332131|gb|ACIC01000189.1| GENE 4 4075 - 4620 327 181 aa, chain - ## HITS:1 COG:no KEGG:BF2812 NR:ns ## KEGG: BF2812 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 31 181 20 170 170 236 68.0 3e-61 MRIRNEELGVRNWLVHTLLVVICFMASMTPAMAQQNSDRISLGFGCLYERGLDVTLSYEY ETKYHNAWEYFANGYIKWNECASCGHVCPESFWNNYRSYGFGFAYKPCVARGRNHHGNMR IGASAGSDTDRFLGGIHLGYEQNYTLRHGWKLFWQVKTDVMIKGEDLFRTGIVLGVKLPV K >gi|226332131|gb|ACIC01000189.1| GENE 5 4620 - 5465 813 281 aa, chain - ## HITS:1 COG:no KEGG:BF2811 NR:ns ## KEGG: BF2811 # Name: not_defined # Def: conjugate transposon protein TraN # Organism: B.fragilis # Pathway: not_defined # 6 281 2 281 281 401 76.0 1e-110 MNCFQKMNVCALLSLSVLSAKAQTTYMEMEQLTVNDQVTTVITASEPIRFVDISTDKVVG DQPINNTIRLKPKDNVYADGEVLAIVTIVTERYRTQYALLYTTRMQEAVTDKEIECSERN AYNNPAVSLSTADMTKYARQIWSSSAKYRNVATKMHRMVMRLNNIYSVGEYFFIDFSVEN KTNIRFDIDEMRIKLSDKKQSKATNAQIVELTPSLVLDDAKTFKYGYRNVIVVKKMTFPN DKVLTIELSEKQISGRTISLSIDYEDVLSADSFNHILLEEE >gi|226332131|gb|ACIC01000189.1| GENE 6 5535 - 6677 1250 380 aa, chain - ## HITS:1 COG:no KEGG:BF2808 NR:ns ## KEGG: BF2808 # Name: not_defined # Def: conjugate transposon protein TraM # Organism: B.fragilis # Pathway: not_defined # 1 379 10 389 390 461 68.0 1e-128 MRITDKINFRQPKYMLPAIVYIPLIATGYFVFDMFNTEVAETQDKNLQTTEFLNPNLPGA QIKNGDGIGGKYENMAKSYGRIADYSAVDNIERDNEEEKEAYESKYTEEDLAELVRQAEE KDDAAEVADAKAREAEALEALNKALAEARLRGQAAVAPVAQDTTQAGPPKEQIEVKGQIA EDSKAVKALDEEDKAQEVVKKVKVTSDYFNTLTQNEHEPKLIKAIIDEDVKATDGSRVRL RLLDDIEIGETVVKKGSYLYATMSGFGSQRVKGSINSILVDDELIKVSLSLYDTDGLEGL YVPGSQFRETTKDVASGAMQSNMNIDQSGANNSFSQWGMQAVTNAYQKTSNAISKAIKKN KVKLKYGTFVYLVNGKEKRK >gi|226332131|gb|ACIC01000189.1| GENE 7 6661 - 7080 278 139 aa, chain - ## HITS:1 COG:no KEGG:BF2806 NR:ns ## KEGG: BF2806 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # 
Pathway: not_defined # 1 103 1 103 123 122 60.0 4e-27 MSVKGIRRMLVGEDMPDKNDPKYKKRYEKEVAAGRKFAKTARIDRAAAKVQGFANAHKTL FLVIVFGFVATCFGINIYRMVRYYGHRTEAASAIERQEGRMKQMMNAAHCITPPVTHRTN EENRDKSVTSKSDSDENNR >gi|226332131|gb|ACIC01000189.1| GENE 8 7093 - 7707 409 204 aa, chain - ## HITS:1 COG:no KEGG:BF2805 NR:ns ## KEGG: BF2805 # Name: not_defined # Def: conjugate transposon protein TraK # Organism: B.fragilis # Pathway: not_defined # 1 204 1 204 204 378 89.0 1e-104 MVIKNLENKIKLVGIICSAFLMGCIIISVSSIWTARCMVTDAQQKVYVLDGNVPILVNRT TMEETLDVEAKSHIEMFHHYFFTLAPDDKYIRYSMEKAMYLVDETGLAQYNTLKEKGFYN NIMGTSAVFSIFCDSIKFNKEKMEFTYYGRQRIERRSNILMRELVTAGQVKRVPRTENNP HGLLITNWRTLLNKDIEQKSKSNF >gi|226332131|gb|ACIC01000189.1| GENE 9 7711 - 8112 400 133 aa, chain - ## HITS:1 COG:no KEGG:BF2804 NR:ns ## KEGG: BF2804 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 128 11 138 139 87 33.0 1e-16 MKAKNEIERWLKDEKFMAFANKRAKEEFFNSENNYIDPQYEEMAEGFEDNDEYVVPMVDY LSYRLHRAKIYRNRRRRERDIWWVWIQLKYEGIYVEACIKYYAKLVEEVEKDIYTILHRE YVRMKRNQTSNKQ >gi|226332131|gb|ACIC01000189.1| GENE 10 8178 - 9308 869 376 aa, chain - ## HITS:1 COG:no KEGG:BF2803 NR:ns ## KEGG: BF2803 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 376 1 377 377 677 91.0 0 MADNILSDFGINILEEEIDDVIFQTNEFLTDATFTGSQGPFWWILQMCMALAALFSIIVA ASMAYKMMVKNEPLDVMKLFKPLVVSIILCWWYPPADTGMTNSGSSWCVLDFLSYIPNCI GSYTHDLYEAEATQISDKFEEVQQLVYVRDSMYTDLQAQADVAHKGTSDPNLIEATMEQT GVDEVTNMEKDASKLWFTSLTAGAVVGIDKIVMLIALIVFRIGWWATIYCQQILLGMLTI FGPIQWAFSLLPKWEGAWAKWLTRYLTVHFYGAMLYFVGFYVLLLFDIVLCIQVENLTAI TASEQTMAAYLQNSFFSAGYLMAASIVALKCLNLVPDLAAWMIPEGDTAFSTRNFGEGVA QQAKMTATGTMATVMR >gi|226332131|gb|ACIC01000189.1| GENE 11 9308 - 9709 251 133 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|212694218|ref|ZP_03302346.1| ## NR: gi|212694218|ref|ZP_03302346.1| hypothetical protein BACDOR_03744 [Bacteroides dorei DSM 17855] # 1 133 1 133 133 254 100.0 2e-66 
MESISKIQLRLYAAKRKNGKWQLEMSRMPKRISVIGRTPIVDEHYMPSDLEVVSMSKLHK YVGSYYGKIVKTLKEEGIITKEYGMWKLREDLQDKGIAVYVTGRMRCFYHFYLSWTPKGI EFIKEIINNRTRH >gi|226332131|gb|ACIC01000189.1| GENE 12 9725 - 10555 673 276 aa, chain - ## HITS:1 COG:SPy0980_1 KEGG:ns NR:ns ## COG: SPy0980_1 COG3617 # Protein_GI_number: 15674990 # Func_class: K Transcription # Function: Prophage antirepressor # Organism: Streptococcus pyogenes M1 GAS # 3 123 2 123 123 129 54.0 5e-30 MNDIQIFNNEEFGAVRTTGTPEQPLFCLADVARVLGLKTSKLVQRLSDDVLSKYPISDSL GREQVTNFINEDGLYDVILDSRKPEAKRFRKWVTSEVLPSIRKHGAYMTQQTIEKALAEP DFLIRLAVNLKEERQKRLLVEQECEHQRTRIVELGSKVDDLQQEVTEMKDKVSYLDIILA TKSSVLVTQIAQDYGESSIRFNRRLKEMNIQYQRGKQWILYADYKDCGYVTSETYLIKHK DGTEDVRMNTKWTQKGRRFLYEKLKSVGVIPVIERT >gi|226332131|gb|ACIC01000189.1| GENE 13 10599 - 11363 688 254 aa, chain - ## HITS:1 COG:no KEGG:BF2802 NR:ns ## KEGG: BF2802 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 5 233 6 234 256 271 60.0 2e-71 MKKTMILIMAVATTIKATAQSVTYNHDSSKQNQITVMETGGGSLTPEFYYWLLHNNYKKT AAEKNKLGFRTLAGINLYNQVDDAEKIDSALTKRAEVEALNVADRQIDLAWLAESSKING QLDKMKANIDHIIPTGGTINDKRRWEELYNMYQCAVKATKDAYMPNAQRKRQYLSIYADL TTQNETLLKYLVQLNTQSQTTALLAATNDRVVHKGSIISDAKSRWQENMKGVRSSTGTDG DDANSGEGEESVNR >gi|226332131|gb|ACIC01000189.1| GENE 14 11391 - 12116 640 241 aa, chain - ## HITS:1 COG:no KEGG:BF2800 NR:ns ## KEGG: BF2800 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 16 241 16 241 241 322 71.0 6e-87 MRHTIIIALLLITLFPNTALAQWGFDAVSVEAYINDHKKQRSLLLARSTLEYSNQLLHEY SSEEVGKYKELNVDLDKYTRAFDIIDVMYQSLRTVLNVKSTYQTVSDRISDYKNLLEDFN DKVIKRKHIELADTMLISINKKAIDNIYNEGSQLYHSVNDLVLYATGAAACSTSDLLMVL DNINNSLDLIEKHLNQAYFQTWRYVQLRMGYWKEKVYRTRTKEEILEDAFSRWKVAGKLD Y >gi|226332131|gb|ACIC01000189.1| GENE 15 12131 - 14641 1950 836 aa, chain - ## HITS:1 COG:no KEGG:BF2799 NR:ns ## KEGG: BF2799 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 10 826 19 835 843 1184 72.0 0 
MCLTLVVGRVHAQYYSVNYDKQIVAAMAAAFGAGAMAESYYNEQVGEILKHYNAAEVATA GIFAAKFLERKAFTDLGIWSSSTENYYYRRIHNMVANKIMPKIWTVAGMMLKSPQTALYW GSYLMKICADTKSLCMQFESVVTNSSLTFADIQFLEINQEIAAILKLSEIGNVDWQQLLD DIASVPGNFTVENLTADIETLYNMGASLATAGINGASGALLQQSAFNDLFNGKVSKIIDI YDNYHDLYEQAENGIGSTLLSLVGGEDNVAALFDLSNYNITSWMTDYLSETQGNYYTQRW YIARRDQGSVSLCDYYPPTDDNSILNGDHWYRINTSDPNFYPNSTQREQILANSENHAGW SRSRVQQLNAQSDGFTYTMNYWMSSYIISRGGKQTKKAYAYEIHVTKSWNNVEEVYEDVF DSYSMDLNTFKRQLQARLSEYNDNEEGYVYYIGSDARRYYQATDAAKLKGVESVTISVTC SDGVTLGQGSTQYKCRECGGSLNAHSKECAMRTSVSENNLDLSELDEKEREANSKIALLE AQISQLETENRNLLTQIANASVEEAPALRQQHNVNKTKIDNLKKELATWQQQLADIQNAK SEAANDNATSTDDYYRIPAIMQDVKSAYNLTWQGAGSWNGYTYVRTATMPNINGIITFKA TLSIARKPKYFLGIKIHRAILQISWELTSEYTDTQVVDVLTLDPNASDEEKTRQVNSRIS EIAQEFPSCSITTEYAKSEATQTADNDDVIHLLWSSDRLEIAREVDSRITKIYADLVSLE KMMHYKRSIIDVLKDILPSIDADQGRRRTLIEECYERWRENADPNRGNGEDNENEN >gi|226332131|gb|ACIC01000189.1| GENE 16 14892 - 17597 2421 901 aa, chain - ## HITS:1 COG:no KEGG:BF2797 NR:ns ## KEGG: BF2797 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 894 1 898 903 1400 75.0 0 MTLYIILIFSAICVGMAISVKAFGTGGKRKRIFQDIYFSIEEVDGIGVLYTKTGEYSAVL KMENPVQKYSANIEAYYEFTHLFAALAQTLGEGYALHKQDVFVRKAFKEENGGNHEFLSE SYFRYFNGRPFTDSVCYLTITQENKKSRLMSFDNKKWRDFLVKIRKVHDQLRDAGVKSRF LGKTEACEYVDRFFSMNFRDKVVSMTNFKVDDETIGMGDRSCKVYSLVDVDYANLPSVIR PYANIEVNNTSMPVDLVSIVDSVPGADCVVFNQMVFVPNQKRELALLDKKKNRHASMPNP SNLMAVEDIKRVQEVIARESKQLVYTHYNLVVAVSGDTDIQKCTNHLENSFSRMGIHISK RAYNQLELFVNSFPGNCYGMNADYDRFLTLGDAATCLMYKERIVHNEDTPLKIYYTDRQG VPVAIDITGKEGKEKLTDNSNFFCLGPSGSGKSFHMNSVVRQLWEQNTDIVMVDTGNSYE GLCEYVGGKYIAYTEDKPITMNPFNISKRELNIEKIDFLKNLILLIWKGSETQIPELEFR VVEQLVTEYYDFYFNGVQPYPSSQKETLRKNLSTMEKRRGTELTQIHDKVEKLIKGLEER RMALSVKTLSFDSFYEFACERLDQICIENNITTIDCDNFAYMLQNFYRGGKYDKILNENV DSTLFDETFIVFEVDAIKENKQLFPIVTLIIMDVFLQKMRLKKNRKCLVIEEAWKAIASP LMAEYIKYLYKTARKFWASVGVVTQEIQDIIGSPIVKEAIINNSDVVMLLDQSKFRERFD EIKAILGLTDVDCKKIFTVNRLDNKEGRSFFREVFIRRGSTSGVYGVEEPHECYMTYTTE 
RAEKEALKLYKHELKCRHQEAIERYCRDWDASGIGKSLAFAQKVNEAGHVLNLTDDGATR R >gi|226332131|gb|ACIC01000189.1| GENE 17 17720 - 18019 342 99 aa, chain - ## HITS:1 COG:no KEGG:BF2796 NR:ns ## KEGG: BF2796 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 8 99 6 97 97 124 67.0 9e-28 MADTKQERFPDYPIFKGLQRPLEFFGLQGRYIYWAAATAGGAVAGFILGYCIFGFVAGLV LLVLAVAVGGVLIVMKQRKGLHSKKSDNGIFIYAYSKKV >gi|226332131|gb|ACIC01000189.1| GENE 18 18068 - 18433 499 121 aa, chain - ## HITS:1 COG:no KEGG:BF2795 NR:ns ## KEGG: BF2795 # Name: not_defined # Def: conjugate transposon protein TraE # Organism: B.fragilis # Pathway: not_defined # 2 119 5 123 125 137 73.0 1e-31 MFERINKKVKGFFSSQRMKTLALMLLVGTTAAMAQNAAGDYSAGTTALTTVTEEIAKYVP IVVKLCYAIAGVVAIVGAISVYIAMNNEEQDVKKKIMMVVGACIFLIAAAQALPLFFGIG G >gi|226332131|gb|ACIC01000189.1| GENE 19 18492 - 18821 328 109 aa, chain - ## HITS:1 COG:no KEGG:BF2794 NR:ns ## KEGG: BF2794 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 109 23 129 129 119 64.0 3e-26 MSKTRSAFLSLYAALTPALASAKCGNVDYSWGADALASAHDYAVTMMLYIVYLCYAVAGI VVIVSALQIYIKMNTGEEGVKKNIMMLVGACLFLIGATIVFPAFFGYQI >gi|226332131|gb|ACIC01000189.1| GENE 20 18821 - 19273 512 150 aa, chain - ## HITS:1 COG:no KEGG:BF2793 NR:ns ## KEGG: BF2793 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 141 51 192 200 90 39.0 1e-17 MKSFVIFAIVVTIIYVIYYTVIIVQDLYGKPKDEKSQGESFDVSDMTDEEESIAVSESDG GFSVGDNQYETAYEEKQLAEPTEEAAATAEESKPHVLEKIQSAIEKKMEEVNTTYSDPMY SEELNNTIIARGLRHNGRDMVKVESLNNEI >gi|226332131|gb|ACIC01000189.1| GENE 21 19495 - 20265 445 256 aa, chain - ## HITS:1 COG:no KEGG:BF1071 NR:ns ## KEGG: BF1071 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 4 256 1 255 255 241 49.0 2e-62 MTNMKIQYASDLHLEFADNWRYLKAHPLEVTGDILLLAGDIGYLGDDNYSKHPFWDWASE NYKEVHCCMGNHEFYKYYDVATLPDGYLLEVRPNVFSHYNGIARIGDTDIILSTLWSRIP 
LEDAYFTEQVISDFRRILYKGELMTHAQFNAEHERCLTFIKDAVAYSQAAHKIVVTHHVP SFRMLHPKFQGSKANVAFTVELEDYITDSGIDYWIYGHSHTNIDARIGNTQCLSNQLGYV FSNEHQDFSHGKYLTI >gi|226332131|gb|ACIC01000189.1| GENE 22 20262 - 21041 563 259 aa, chain - ## HITS:1 COG:no KEGG:BLD_1325 NR:ns ## KEGG: BLD_1325 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_DJO10A # Pathway: not_defined # 2 196 3 192 261 157 43.0 2e-37 MEGLVKFREAFAEYSENYVVIGGAACDITMTNTVVRPRATHDIDMIVIVENMTEAFANRF WQFVREAGYRPEKRKQEAGEPPRYEMYRFLDGKDGYPEMIELLSRHPDVLGEPKGFVIEP IPTDEDVSSLSAIIMDDDYYHFTIAHSQLTDGIRHANSAALIALKARAYLNLMADKRDGK HVNTKDIKKHRSDILKNVVIMTEDNIEAPASIVACIREFVASIRADWSTLAEPLSKSLGQ DEAFVTGLLDQLDELFIEA >gi|226332131|gb|ACIC01000189.1| GENE 23 21029 - 22000 584 323 aa, chain - ## HITS:1 COG:no KEGG:HCH_01177 NR:ns ## KEGG: HCH_01177 # Name: not_defined # Def: hypothetical protein # Organism: H.chejuensis # Pathway: not_defined # 32 321 42 349 354 92 28.0 1e-17 MELKKEIRILGRRFEVEPRAKESLKGITVGEKLSYRFYDGTYQGTPLLFVEPKKGNPSPR TCAITGKRLTEALGLPAVFILAPGPTYERHRLADKGVFFVMSEEYAHLPGIIALEKTSNR KIAEVLTPVAQYILLYHLQVGSIEGMSPRDIAPLLPYSYESVTLGVTCLEDVGLCQKIQN GQRSKVVHFELKGKELWDKAQNVLLSPVENRIFCDDIRLDAEYPVCGINALAHYSMLNRD REEMIMMTSKEYRAVKSADVMENPNIYDGNYIIEVWKYPVVSKMGDKSQWVDRLSLVLSL RDDDDPRVEKEVERIISEQKWKD >gi|226332131|gb|ACIC01000189.1| GENE 24 22157 - 23056 685 299 aa, chain - ## HITS:1 COG:no KEGG:BF2792 NR:ns ## KEGG: BF2792 # Name: not_defined # Def: DNA primase # Organism: B.fragilis # Pathway: not_defined # 1 295 1 295 295 287 47.0 3e-76 MNIEQSKKLSIIDFLDKENVTLKKKKGNAYWYLSPFRDEKTASFKVSKKENLWYDYAIKE GGDLVELVKRMYNKQSVSDALAYLASKSIATVDKAIETAIAAKEYTTTKMNDVKLLPLSN HSLLSYFSSRRIDITIGRMYCREIHYKVEQKHYYGIAFGNLSEGHEVRNPYFKGCIGHKD ITLLAHTFNEWQNGCLVFEGFMDFLAYMTLVKQQDRWFVVESPCDYMILNSVANLKQALH YLDRYTHIHCFLDNDQAGRKTVESISNVFEYRVTDESFRYADYKDVNDYLMRKKRTASE >gi|226332131|gb|ACIC01000189.1| GENE 25 23257 - 24369 894 370 aa, chain - ## HITS:1 COG:no KEGG:BF2791 NR:ns ## KEGG: BF2791 # Name: not_defined # Def: hypothetical protein # Organism: 
B.fragilis # Pathway: not_defined # 22 369 20 368 368 504 71.0 1e-141 MKMEENNKHVQPNSKEEGVQRLNRILSESLIKATDTYKTPPQIIWVDNSSIATLGNFSAS TGKAKAKKTFNVSALVAASLANGKVLNYRASLPEGKRKILYVDTEQSRYHCHNVLERILK LAGLPTSIDNENLDFICLREYTPSVRIEVIDYALAQDQSYGLVIIDGIRDLLLDINNAGE SVEVINKMMEWSSKYDLHIHCVLHQNKGDNNVRGHIGTEMNNKAETVLVITKSTTNPDIS EVKAMHIREKEFKPFAFTVNEEGLPEIVEHTPEKEEGDKQPSRFTYQDLTSEQHNEALTA AFKEKPIKGFDRMVEELTQAYADIGFKRGRSVIIKMLKYLINEQKLIVKRDNHYYFGYTP AEIDLFHEEE >gi|226332131|gb|ACIC01000189.1| GENE 26 24366 - 24707 291 113 aa, chain - ## HITS:1 COG:no KEGG:BF2790 NR:ns ## KEGG: BF2790 # Name: not_defined # Def: putative excisionase # Organism: B.fragilis # Pathway: not_defined # 7 100 11 104 132 109 57.0 3e-23 MNERHFRLYERIVAIEDSLEALGPIDKLIERIEELEKMVKQTKTVLGFDEACKYIGVSES LLYKLTAAKEVPHYKPRGKMLYFNREEIDKWLLQNKQEVIGMVTKIEIDNPKE >gi|226332131|gb|ACIC01000189.1| GENE 27 24751 - 25119 205 122 aa, chain - ## HITS:1 COG:no KEGG:BT_4618 NR:ns ## KEGG: BT_4618 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 115 8 114 117 88 36.0 8e-17 MEENSIRFEDIPNAITGVLKKLSSLEDKIDGIYELVQSEKEETWFTVAELCAYLPTHPVE HTIYCWTSNREIPFHKRGKRIMFLKSEIDEWLQGTKGKSKNEIQREAEEYVLSTQRKNRR LI >gi|226332131|gb|ACIC01000189.1| GENE 28 25253 - 25807 145 184 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|212694180|ref|ZP_03302308.1| ## NR: gi|212694180|ref|ZP_03302308.1| hypothetical protein BACDOR_03706 [Bacteroides dorei DSM 17855] # 1 184 59 242 242 335 100.0 9e-91 MAIGIIGTLFRDSKCVSIIKKKEDYSKQELIELFLQHVGTGLPILTRKKSSILTLGCQLS DRQMDLLVELVQSHDIFDFADNSDVRSELCRLFKCDLDASIRVKNVRNVAVLFDAMAQYH LINNNWQYVMGEGRFLTSIKKDGTEKFITSSCLSSSLSRIRRNVSMTASQYAICKSIEQI LREE >gi|226332131|gb|ACIC01000189.1| GENE 29 26218 - 27465 498 415 aa, chain - ## HITS:1 COG:Cgl1981 KEGG:ns NR:ns ## COG: Cgl1981 COG0582 # Protein_GI_number: 19553231 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Corynebacterium glutamicum # 161 405 58 307 315 63 26.0 7e-10 
MKRLDYTKVSVKLKKSEWRDEWFIYLEAYPVYETGNDKPKRVREYLRRSITTPIWDKRRS ERAVAGRIKYKPKRDDNGVIQCKSKQDMETCLYADGVRVLRQKEYDNMALFTDQQMEMAE QSERSKCNVLEYIEKLIKEREETASESIVVNWRRLHTLLSMFAKCDYIQFSQIDMKYIEA FRSFLIKAPQGGSKKGTISRNTASTYFSIFKAALKQAFVDGYLNCDIAAKAKNIMFQSAR REYLSLEELNILAKTPCDDILKRAALFSALTGMRHSDIQKLKWSEVEEYNGGYRLNFTQQ KTKGVEYMPISPQAYKLCGERKKDGELLVFAGLPDPSWISRPLERWVKASGITKHITFHC FRHTYATLQLANGTDIYTVSKMLGHTNVKTTQIYAKVIDKKKDEATEAFKLDIDE >gi|226332131|gb|ACIC01000189.1| GENE 30 27462 - 28028 193 188 aa, chain - ## HITS:1 COG:no KEGG:BF1226 NR:ns ## KEGG: BF1226 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 183 1 183 183 163 45.0 3e-39 MAVAKYKIVRKCPVCGEEFFARTLESWYCSPKCSKVAWKRKHDEEKRQLELDKIVSNMPK SKEYISITEAYAMFGASRSTIYRLIYMKKISFIEPEKGIRLVCKGELMNLFPLRQSPLDT KPRKPVTMYRMEPEDCYTIGEISKKFHLDDSTVYAHIRKYSIPTRQIGNYVYAHKESIDK LYKDIKPL >gi|226332131|gb|ACIC01000189.1| GENE 31 28230 - 30146 2257 638 aa, chain - ## HITS:1 COG:ECs0014 KEGG:ns NR:ns ## COG: ECs0014 COG0443 # Protein_GI_number: 15829268 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Escherichia coli O157:H7 # 1 638 1 635 638 684 61.0 0 MGKIIGIDLGTTNSCVAVFEGNEPVVIANSEGKRTTPSVVAFVDGGERKVGDPAKRQAIT NPTRTIFSIKRFMGENWDQVQKEVTRVPYKVVKGDNNTPRVDIDGRLYTPQEISAMILQK MKKTAEDYLGQEVTEAVITVPAYFSDSQRQATKEAGQIAGLEVKRIVNEPTAAALAYGLD KAHKDMKIAVFDLGGGTFDISILEFGGGVFEVLSTNGDTHLGGDDFDQVIINWLVQEFKN DEGADLTQDPMALQRLKEAAEKAKIELSSSTSTEINLPYIMPVNGVPKHLVKTLTRAKFE SLAHGLIQACLEPCKKAMSDAGLSNSDIDEVILVGGSSRIPAVQKLVEDFFGKTPSKGVN PDEVVAVGAAVQGAVLTDEIKGVVLLDVTPLSMGIETLGGVMTKLIDANTTIPARKSETF STAADNQSEVTIHVLQGERPMAAQNKSIGQFNLSGIAPARRGVPQIEVTFDIDANGILKV SAKDKATGKEQAIRIEASSGLSKEEIEKMKAEAEANAEADKKEREKIDKLNQADSVIFQT ENQLKELGDKLPADKKAPIEAALQKLKDAHKSQDLAAIDTAMAEINTVFQTASAEMYAQG GAQGGAQAGPDMDAGQNAGQDNSKHGDNVQDADFEEVK >gi|226332131|gb|ACIC01000189.1| GENE 32 30483 - 31271 598 262 aa, chain - ## HITS:1 COG:DR1940 KEGG:ns NR:ns ## COG: DR1940 COG3187 # 
Protein_GI_number: 15806938 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Heat shock protein # Organism: Deinococcus radiodurans # 30 233 182 375 403 69 28.0 5e-12 MKKVFVSICIASTVLAMFSCRSVEKAVPLASINGEWNIIEVNGSKVTPGESRTLPFITFD TATGRVSGNSGCNRMMGSFDVNAKPGSMELKGMASTRMMCPDMTTERNVLGALAQVKGYK KAGKDKMFLCNESNRPVVVLEKKEADVKLSVLNGEWKIKEVNGEAITSGMEKQPFIAFDV KKKTLHGNAGCNLINGGFETSTTNAKSISFPGVASTMMACPDMEVEGKILKAMNEVKSFD KLSGGGIGLYDANNALVIVLEK >gi|226332131|gb|ACIC01000189.1| GENE 33 31417 - 32700 961 427 aa, chain + ## HITS:1 COG:no KEGG:BT_4613 NR:ns ## KEGG: BT_4613 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 427 1 427 427 863 100.0 0 MRTIVLFLILFGCGSSHITAQNDYLVTTISNQEGPLSEEEQFIKENFPLQLLCKWTPGMK FMFIPSGRDMFLPTLLVYETEKGADNSVLKHKILTFAGYEEKSQELSNDMSYSTRLIFEC DGEKYYYEIKNTRPDEICEKNPRAYINGLVNLKDVDTAKELLTGRKIYVQSDMARVDDAN SYSGYRKVSIPVNTEATITAIGVGSQSYPVKIIFQDSQGHSYYLEVALSRTNSGMDVSDF QAEKKMRYFPNAISFSSKKSGNLETLKDKYIDLTVYPKRTLAVKRVFSLENKQMENRIHL PRYTVLKIKEIKISGPGSLVILSLEDWHGSLYETEVDLKYDVIVKNENYIEDLFGFGDIH QKYPAITDKHWAIIARGDLEEGMNTDECRLSIGDPVEIQLKKDSRFETWFYNGKTLEFES GILQRFK >gi|226332131|gb|ACIC01000189.1| GENE 34 32806 - 33999 1345 397 aa, chain + ## HITS:1 COG:slr0049 KEGG:ns NR:ns ## COG: slr0049 COG1748 # Protein_GI_number: 16331467 # Func_class: E Amino acid transport and metabolism # Function: Saccharopine dehydrogenase and related proteins # Organism: Synechocystis # 1 392 1 391 398 556 65.0 1e-158 MGRVLIIGAGGVGTVVAHKVAQNADVFTDIMIASRTKEKCDNIVKAIGNPNIKTAKVDAD NVDELVALFNDFKPEMVINVALPYQDLTIMEACLKAGVNYLDTANYEPKDEAHFEYSWQW AYHDRFKEAGLTAILGCGFDPGVSGIYTAYAAKHYFDEIQYLDIVDCNAGNHHKAFATNF NPEINIREITQNGRYYENGQWVTTGPLEIHKDLTYPNIGPRDSYLLYHEELESLVKHFPT IKRARFWMTFGQEYLTHLRVIQNIGMARIDEVDYNGTKIVPLQFLKAVLPNPQDLGENYE GETSIGCRIRGIKDGKERTYYVYNNCSHQEAYKETGMQGVSYTTGVPAMIGAMMFFKGEW QRPGVNNVEEFNPDPFMEQLNKQGLPWHEVFDKDLEL >gi|226332131|gb|ACIC01000189.1| GENE 35 34014 - 34469 557 151 aa, chain + ## HITS:1 
COG:HI0254 KEGG:ns NR:ns ## COG: HI0254 COG1225 # Protein_GI_number: 16272212 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Haemophilus influenzae # 3 150 5 150 155 141 45.0 4e-34 MKNIGDKAPEVLGINEKGEEIRLSAYKGKKIVLYFYPKDSTSGCTAQACSLRDNYSELRK AGYEVIGVSVDSEKSHQKFIDKNNLPFTLIADTDKKLVEEFGVWGEKKLYGRAYMGTFRT TFLINEEGIIERIITPKEVKTKEHASQILNQ >gi|226332131|gb|ACIC01000189.1| GENE 36 34494 - 35528 1276 344 aa, chain + ## HITS:1 COG:BMEI0787 KEGG:ns NR:ns ## COG: BMEI0787 COG0468 # Protein_GI_number: 17987070 # Func_class: L Replication, recombination and repair # Function: RecA/RadA recombinase # Organism: Brucella melitensis # 4 334 20 348 378 426 64.0 1e-119 MAKKDELNFETDNKMASSEKLKALQATMEKIEKNFGKGSIMKMGDDSVEQVEVIPTGSIA LNVALGVGGYPRGRIIEIYGPESSGKTTLAIHAIAEAQKAGGIAAFIDAEHAFDRFYAAK LGVDVDNLWISQPDNGEQALEIAEQLIRSSAIDIIVIDSVAALTPKAEIEGDMGDNKVGL QARLMSQALRKLTGTVSKTRTTCIFINQLREKIGVMFGNPETTTGGNALKFYASVRIDIR GSQPIKDGEEVLGKLTKVKVVKNKVAPPFRKAEFDIMFGEGISHSGEIIDLGAELGIIKK SGSWYSYNDTKLGQGRDAAKVCIKDNPELAEELEGLIFEALNKK >gi|226332131|gb|ACIC01000189.1| GENE 37 35673 - 36665 955 330 aa, chain - ## HITS:1 COG:SPy1056 KEGG:ns NR:ns ## COG: SPy1056 COG2855 # Protein_GI_number: 15675048 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pyogenes M1 GAS # 45 325 34 331 339 134 33.0 2e-31 MISSATKTLQTNNKTIYVAILSTLTFFLFLDYIPGLQAWSAWVTPPVALFLGLIFALTCG QAHPKFNKKTSKYLLQYSVVGLGFGMNLQSALASGKEGMEFTVISVVGTLLIGWFIGRKI FKIDRNTSYLISSGTAICGGSAIAAIGPVLRAKDSEMSVALGTIFILNAIALFIFPAIGH ALDMTEHQFGTWAAIAIHDTSSVVGAGAAYGEEALKVATTIKLTRALWIIPMAFATSFIF KSKGQKISIPWFIFFFILAMIANTYLLNGVPQLGAAINGIARKTLTITMFFIGASLSLDV LRSVGVKPLIQGVLLWVVISLSTLAYIYFV >gi|226332131|gb|ACIC01000189.1| GENE 38 36783 - 37157 206 124 aa, chain - ## HITS:1 COG:MT0892.1 KEGG:ns NR:ns ## COG: MT0892.1 COG3304 # Protein_GI_number: 15840283 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Mycobacterium 
tuberculosis CDC1551 # 1 123 1 122 129 89 43.0 2e-18 MGCLMNLLWLLLGGIFTAVEYLISSILMMLTIVGIPFGMQTLKLAGLALWPFGKEVRSGG RSSGCLYILMNVLWIFLGGIWICLSHLVFGAVLCITIIGIPFGLQHFKLAALALSPFGKD IVGA >gi|226332131|gb|ACIC01000189.1| GENE 39 37169 - 37612 396 147 aa, chain - ## HITS:1 COG:SMc01703 KEGG:ns NR:ns ## COG: SMc01703 COG5579 # Protein_GI_number: 15964216 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Sinorhizobium meliloti # 11 146 6 142 143 140 48.0 1e-33 MWYEKIYMVDYKLQRFLDAQQGVYNQALAEVKNGKKYSHWIWYIFPQLKGLGMSYNSQYY GICGKEEAEAYLAHPILGVRLREITTVFLQLKGKTAVDVFGGLDAMKVLSCMTLFNEVAS DDLFQKVIDLYFQGRLDETTKKILNKY >gi|226332131|gb|ACIC01000189.1| GENE 40 37757 - 38941 643 394 aa, chain - ## HITS:1 COG:no KEGG:BT_4606 NR:ns ## KEGG: BT_4606 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 394 1 394 394 743 99.0 0 MKTFRFIYQFVLLLMCTTACNEYDDSELWDRVNSLDDRVTSIENRLKALNNDITSISALV GVLQNRLYLTNVTTLGDGYTLTFSDGSHITVSDGKDGINGKDAPVINVRYYNGRYYWVQT INGETTWLNDSDGNKIPASGTDAITPLLKVDSDGYWIISYDKGYTYSKLLDEYGKAVKAS GMDGDSFFESVEATEDELRLILKDGTEIVIPLGEQSLYKAVNLGLSVRWASFNLGAITSS EKGGYFLWGDVDNVGVMPDYQAPNIDNICGSKYDIARAMWGGSWRLPNKAEQIDLIKKCT WTRATINGVSGMKVTGSNGNSIFLPSTGYMLPASGPAGATQLVDENSGYYWVGESYGDTY GRFGYVFYYNSTSYYYNASWNASMIKMAIRPVKE >gi|226332131|gb|ACIC01000189.1| GENE 41 39042 - 41501 1130 819 aa, chain - ## HITS:1 COG:no KEGG:BT_4605 NR:ns ## KEGG: BT_4605 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 819 1 819 819 1463 99.0 0 MNRKFKISRITLLLMCLCLVFLFSCGEENEKNEVRVETEIVPTSECETIFSQGINVSSEG ERATISFSTNKDWSVSLAETQNGDNWCTVSPSSGKAGNVTLTIVVASNTGYDDRNVVLKL SVDELTKSIMINQKQKDALTLTSNRFEVDKDGGIINLQVKSNIDYNVTVAENSRTWIEPV ISERTRGLSTSFYSFAISPSEEYDKREGEIIITSGESDETVKVYQTGSAILILSQNEYTF GCDGGRASIDISSNFEYEIDMPNVDWIQSANLSRAVSSHTLVYEISANNTYADREAVIVF KDANGNKKESVSIKQNQKDAILLSNNKVEVPQNGGTFFVDVNSNVEYSIEIPPSCNSWIS RANSSVNVHRGLSKTTSAFSVASSDEYDKREGEIYFKYNDISDTLKVYQSGGAILVLTQS 
SYNIDGNTTSINVQLKSNIDYGVSISNDWITEIPTRSISNSMKKFNIMSNTTGKSRTGKI TFTSSDGKKTETVTVIQATVVKVTALDILFNGTLYAYVGDSYNFFVTCKPSDAVTDYEWS SSDIRVLTVSGSGGNATVKIVGYGEANVVVKDKNSGVSKEYTVHSRISDFSWSNTGETYS IYPMLTLAVGESQKISYTSKQGSSVLNIFGNLSDFVFYEPVNVVSEPSVISIDADGTATG LKTGIVGIKPTGYITRLSSGNERVYVNVIPEYIESEYNNNFSNANTIKDGQSMKFSLSST RDVDVFKFNRKSTYMHINLEYLGSFNDVNYSKRLRYEVYDSNYKLTGSGTLSLTGTGQIY DPMLRYVGSGDVAYVQFYFSGDYIQNFPDGYFKVSIAAQ >gi|226332131|gb|ACIC01000189.1| GENE 42 41513 - 42541 373 342 aa, chain - ## HITS:1 COG:no KEGG:BT_4604 NR:ns ## KEGG: BT_4604 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 342 1 342 342 643 100.0 0 MAKMRYEYLGIIHRNDLNILFKKGYIVLCTIHVKTISGNDSVPEEYIRELLKNVSPFDYT SEYVFIKFLRERKWLKRDCKNNIEYKEVQSIIPLDLVAKKDMEMSFNKMIKFVEPLWGTY VDDFSQSLFSENMCKGASACLEILGIKVEKPLKDLDDEDLIIKVTNYRFQKENLDENSSI WQYLLMYERHEPYPSNCLGYFYDSVHVFVNYTFKKEYLTMPKTEILKVLNLIDRQSRYDF EYIVCELKNNKCAERYIEKCTRKGIRQYILIPIYFYLLNLFSLPNYQSLMKDYCRNSFKR LYEKEYKLAVYLVGLRLGFDSINEIYYQKLEKDMESHQQSLF >gi|226332131|gb|ACIC01000189.1| GENE 43 42544 - 43740 514 398 aa, chain - ## HITS:1 COG:no KEGG:BT_4603 NR:ns ## KEGG: BT_4603 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 398 1 398 398 786 100.0 0 MSTRIPTEESLDTQQKDFLKKLKDNHYVGQNIWIKGFPGSGKSVLLVYAIKYLLNNEATQ NAKILLVVFTHSLIELCKAELAELNITNVKIVTMYEFLKTKRKYDYIFCDEVQDLTSRIL IKMKDQTRYMLIVAGDSNQSIYSKDPQYQEAVVIPEKITQLIDSEDWGLLYVYRLTRSII TAIQKMIPSMNIFSAKRYMLKRDTQIRLCKAEFVDQEVEYIMREARKFVNRNQSCAILFP TKDKVLEFCDTVLQLEHKPVWEKWIDQYYEPYFKRLNDHLKENQVPIECVVNSYGSLLAA ENDKRIVIMTYHGVKGLDFDNVFIPFANTDLYIPGDSNITQKLFMVAMSRSRNNLYITYT GMPLSYVSLFSSDSTICANISISNELHPNLSDDINIDF >gi|226332131|gb|ACIC01000189.1| GENE 44 43743 - 46121 1109 792 aa, chain - ## HITS:1 COG:SA1409 KEGG:ns NR:ns ## COG: SA1409 COG0443 # Protein_GI_number: 15927160 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Staphylococcus aureus N315 # 26 503 
5 400 610 89 27.0 2e-17 MPILLTVENLVTADLSTIHKEQATYVGIDFGTSTTVVSYSYFDNEKRIVVTEVMNLRQKQ SDGADFTGEKVPTVIALYHNRVLVGEGAANLKYELEKNKDVWYSFKMELGEDLGAKYCNS ILGKDKSVRIQNAKDATILFFRFLKKEIESYVEQKGLCQNIKYAVSIPASFEANQRRDLV DALISNQMDVSKQSLIDEPNAAFLNYIHESEMNNEAVVIPKDINPKMLVFDFGAGTCDIS ILEIGVDYKGVYSKNLSISKFEKLGGNDIDRYIAYEILYPELLSHNHLDMDIFIMSEKKK IINALLTTAENLKIRICKAISLMRDSFDLSGLEQEYKVSVKDIRINTSKGVLYMDEIFLT SIQFAELMKVFTRKKSDVIQKEKQIYNSIYDCVKTSIRKARLEVDDVDYVLLVGGSSLNP YVQAALKSWLPYSKLLLFPDLQTCVSKGASIHSFIYNSFGQNIVQPITSEPICIITKNDQ LMTLVPAGTVMPTDTIEIDNLAIVKEGQKTVEFPICISNENKILTNLVITSNEQNGFSLD TPIKLQLELTADKLLTINAKVGSETCLVSAINPFANKELTSQERIALRAEKKAYNTTANN RGVPSKECLLELAKAYENATLNLKAAETYEECNELYPGTISYNKLGCCFSSAGNNRKALR YGEKAYEYSKDAVSAFNLGYRYKNIDTSMFLKYIHESLAMCPNKPHPLFELGKYEKKEGK KIEGVEKIKKAYNIWKAQYDNGTLSDVDYTWLPSAAEELGESIFAEEVRKKAKRISLDDF YNSNNTVAVKGD >gi|226332131|gb|ACIC01000189.1| GENE 45 47231 - 47869 582 212 aa, chain + ## HITS:1 COG:no KEGG:BT_4601 NR:ns ## KEGG: BT_4601 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 212 1 212 212 411 100.0 1e-114 MARYIMEEMPDIQKTGKRITYPKFARIDNASIKELAQRVGDVSGFSAGDIEGVLLQTAIE MAHLMAEGRSVKIDGIGTFTPSLALCRDKEREEAEEGAKHRNAQSICVGGVNFRADRTMI RNINERCRLERAPWKPQRSSKKFTPEQRLTLALKFLDEHAFLTVRNYQRLTGLLQTAATK ELKQWADLPESGIDIAGRGSHRVYVKKKVATE >gi|226332131|gb|ACIC01000189.1| GENE 46 48236 - 49957 1108 573 aa, chain + ## HITS:1 COG:no KEGG:BT_4600 NR:ns ## KEGG: BT_4600 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 573 1 563 563 1139 100.0 0 METVKQIPYGMSNFEDVIMQNRYYVDKTMYIPLLEDQANYLIFIRPRRFGKSLLLSMLRS YYDLSQKDKFQQLFGNLWIGQHTTPLQGKYQILYLDFSKIGGSIDELPQRFDSYSAVQLD GFLNRYREYYTDEFIERFSAADKGIDKLHILDDEARRLGYPLYLIIDEYDNFTNVVLNEK GNEIYRAITHASGFYRDAFKNYKGMFDRIFMIGVSPVTLDDLSSGYNIGWNISTSPVFNQ MLGFSEKDVRTMFQYYKETGQLHGDIEEMILEMKPWYDNYCFARQSLSSDPKMFNCDMVL YYLRNRIQLGASPEQMIDPNTRTDYNKMKKLIQLDRLDGNRKGVIKRIAEEGKIITNLFQ 
SFSADQITNPEIFPSLLFYYGMLTIIGTRGNLTILGIPNTNVRKQYYEYILEEYQNHHYI NLIDIEILFNNMAFDGEWRPALEFIAKAYKENSSVRSSIEGERNIQGFFTAYLSVNAYYL TTPEVELNHGFCDMFLMPDLQRYAEIAHSYIVELKYLPKEKFDAQSAEQWEEAVAQIHGY AASPKVRLLCQGTQLHCIVIQFCGWEMVRMEEV >gi|226332131|gb|ACIC01000189.1| GENE 47 49959 - 50855 662 298 aa, chain - ## HITS:1 COG:BS_ywfK KEGG:ns NR:ns ## COG: BS_ywfK COG0583 # Protein_GI_number: 16080817 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus subtilis # 6 275 6 272 299 129 28.0 5e-30 MSDFRLKVFQSVAKNLSFTKASQELFVSQPAITKHIQELETYYQARLFERQGSKISLTVA GELLLKHSEKILDDYKRLEYEMHLLHNEYIGELKLGASTTIAQYVLPPLLAHFIAKFPQV NLSVLNGNSRGVEVALQEHRIELGLVEGIFRLPNLKYTLFLQDELVAVVHVNSKLNIQEE ITPAELPNIPLVLRERGSGTLDVFERALSQHNLKLSSLNVLMYLGGTESIKLFLEHTDCM GIVSIRSVHKELVAGTLRVVEIKGMPMLREFNFVQLQGQEGGLSQVFMRFAGHHSKNL >gi|226332131|gb|ACIC01000189.1| GENE 48 50913 - 51497 551 194 aa, chain - ## HITS:1 COG:no KEGG:BT_4598 NR:ns ## KEGG: BT_4598 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 194 1 194 194 368 100.0 1e-101 MNDKPQIKLSETVVLIDAAFLNFVITDMKGYFEETLRRSLQDIDLSMLTTYLTLDAGITE GKNEVQFLFVYDKESDQLVHCHPSDLVKELNGVAFQSPYGEYSFASVPSEGMVSREDLFL DLLSIVADSADVKRMIVVSFNEEYGKKVFDALHEVKGKEIIQFRMNEPEVQVEYKWDMLA FPIMQALGIKAEEL >gi|226332131|gb|ACIC01000189.1| GENE 49 51721 - 54309 1873 862 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 861 1 810 815 726 45 0.0 MNFNNFTIKSQEAVQEAVNLVQSRGQQAIEPAHILHGVMKVGENVTNFIFQKLGMNGQQV ALVVDKQIESLPKVSGGEPYLSRESNEVLQKATQYSKEMGDEFVSLEHILLALLTVKSTV STILKDAGMTEKELRNAINELRKGEKVTSQSSEDNYQSLEKYAINLNEAARSGKLDPVIG RDEEIRRVLQILSRRTKNNPILIGEPGTGKTAIVEGLAHRILRGDVPENLKNKQVYSLDM GALVAGAKYKGEFEERLKSVVNEVKKSEGDIILFIDEIHTLVGAGKGEGAMDAANILKPA LARGELRSIGATTLDEYQKYFEKDKALERRFQIVQVDEPDTLSTISILRGLKERYENHHH VRIKDDAIIAAVELSSRYITDRFLPDKAIDLMDEAAAKLRMEVDSVPEELDEISRKIKQL EIEREAIKRENDKPKLEIIGKELAELKEVEKSFKAKWQSEKTLMDKIQQNKVEIENLKFE 
AEKAEREGDYGKVAEIRYGKLQALDKEIEDTQEKLRGMQGDKAMIKEEVDAEDIADVVSR WTGIPVSKMMQSEKDKLLHLEDELHQRVIGQDEAIEAVADAVRRSRAGLQDPKRPIGSFI FLGTTGVGKTELAKALAEFLFDDESMMTRIDMSEYQEKHSVSRLVGAPPGYVGYDEGGQL TEAIRRKPYSVVLFDEIEKAHPDVFNILLQVLDDGRLTDNKGRVVNFKNTIIIMTSNMGS SYIQSQMEKLSGSNKEEVIEETKKEVMNMLKKNIRPEFLNRIDETIMFLPLTETEIRQIV LLQIKGVQKMLAENGVELEMTDAALNFLSQVGYDPEFGARPVKRAIQRYLLNDLSKKLLS QEVDRSKAIIVDSNGDGLVFRN >gi|226332131|gb|ACIC01000189.1| GENE 50 54369 - 54617 200 82 aa, chain - ## HITS:1 COG:no KEGG:BT_4596 NR:ns ## KEGG: BT_4596 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 82 1 82 82 164 100.0 7e-40 MDVITNKEVNVYLDNEKGTLCLTGMLVGTMVFLYDSQGELKEKHRFALPSLTLEISRKGT YVLVMSHPNCQPEVRRIIYSGI >gi|226332131|gb|ACIC01000189.1| GENE 51 54839 - 55273 488 144 aa, chain - ## HITS:1 COG:no KEGG:BT_4595 NR:ns ## KEGG: BT_4595 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 144 1 144 144 254 100.0 4e-67 MLKTILSISGKPGLYKLISQGKNMLIVESISADKKRFPAYGNEKIISLADIAMYTDDAEV PLYDVLEAMKKKENTAVTSIDPKKATPEQLREYLGEVLPNFDRERVYVADIKKLISWYNI LISNGITEFKSEPEAEEEVATDEK >gi|226332131|gb|ACIC01000189.1| GENE 52 55299 - 55913 364 204 aa, chain - ## HITS:1 COG:DR1892 KEGG:ns NR:ns ## COG: DR1892 COG0237 # Protein_GI_number: 15806892 # Func_class: H Coenzyme transport and metabolism # Function: Dephospho-CoA kinase # Organism: Deinococcus radiodurans # 4 181 15 190 207 101 37.0 1e-21 MAIKIGITGGIGSGKSVVSRLLEIMGIPVYISDIEAKRITHTNDVIRRELCALVGQDVFL NGELNRPLLASYIFGSPEHAKKVNAVIHPQVKEDFRRWVKGKGDIAMVGMESAILLEAGF KQEVDFVVMVYAPLEVRVERAIRRDYSSRELIMKRIEAQMSDEVKRNHADFVIVNDDETP LIPQVLKFISLLSKNNHYLCSAKK >gi|226332131|gb|ACIC01000189.1| GENE 53 55903 - 56730 444 275 aa, chain - ## HITS:1 COG:no KEGG:BT_4593 NR:ns ## KEGG: BT_4593 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 275 63 337 337 520 100.0 1e-146 MKDVPNNIVLTSEPPSELRVRVKDKGTVLLNYMLGKSFFPVNLSFPDYKGQNNHVKIFAS 
EFEKKILSQLNASSKILSIKPDTLDYIYSTGKSKLVPVHFQGKVTAGLQYYVSDTICSPD SVLVYAPAGILDTITTAYTQEVNLENISDTIRQRVALDNKKGVKFVPASVELTFPVDIYT EKTVEVPLRGINFPADKVLRAFPSKVQITFQVGLKRFRSIKASDFVINVSYEELLKLGSD KYTVKLKSAPRGINQIRIVPEQVDFLIEQVSSDGY >gi|226332131|gb|ACIC01000189.1| GENE 54 56768 - 56917 89 49 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLDRRKLRYTYLRLSKRIKDFLLSDKSREFLIFFIFFPDSQWILADSDS >gi|226332131|gb|ACIC01000189.1| GENE 55 56939 - 57256 421 105 aa, chain - ## HITS:1 COG:XF0224 KEGG:ns NR:ns ## COG: XF0224 COG1862 # Protein_GI_number: 15836829 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YajC # Organism: Xylella fastidiosa 9a5c # 10 93 21 107 120 71 41.0 4e-13 MNLLTVLLQAPAAGGGSMMWIMLIAMFVIMYFFMIRPQNKKQKEIANFRKSLQVNQSVIT AGGIHGTIKEITDDYIVLEIASNVKIKIDKNSIFADASAAGNQSK >gi|226332131|gb|ACIC01000189.1| GENE 56 57297 - 58223 928 308 aa, chain - ## HITS:1 COG:TM1765 KEGG:ns NR:ns ## COG: TM1765 COG0781 # Protein_GI_number: 15644510 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Thermotoga maritima # 187 298 22 133 142 64 34.0 3e-10 MINRVLIRLKIIQIVYAYYQNGSKNLDSAEKELFFSLSKAYDLYNYLLMLMMALTEYAQK RIDAAKAKLAPTKEELYPNTKFVDNKFVAQLEVNKQLTEFISNQKRTWANDQDFVKELYE KIVESDIYKDYMASGDNSYEADRELWRKLYKTYIFNNDSLDQVLEDQSLYWNDDKEIVDT FVLKTIKRFEEKNGANQELLPEFKDDEDQEFARRLFRRTILNADYYRHLVSENTKNWDLD RIAFMDIIIMQTALAEILSFPNIPVSVSLNEYVEIAKLYSTAKSGSFINGTLDGIVNQLK KEGKLTKN >gi|226332131|gb|ACIC01000189.1| GENE 57 58392 - 58796 436 134 aa, chain + ## HITS:1 COG:no KEGG:BT_4590 NR:ns ## KEGG: BT_4590 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 134 1 134 134 194 100.0 1e-48 MEDFKKKMNADMNDKEIVFSKSIKAGKRIYYLDVKKNRKDEMFLAITESKKVVMGEGDDS QVSFEKHKIFLYKEDFQKFMAGLTEAISFINHSDMNDYISRLNEEADIKREQEALAEQAQ EDKLENEIKIDIDF >gi|226332131|gb|ACIC01000189.1| GENE 58 58933 - 59523 976 196 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29349997|ref|NP_813500.1| 50S 
ribosomal protein L25/general stress protein Ctc [Bacteroides thetaiotaomicron VPI-5482] # 1 196 1 196 196 380 100 1e-104 MKSIEVKGTARTIAERSSEQARALKEIRKNGGVPCVLYGAGEVVHFTVTNEGLRNLVYTP HIYVVDLDIDGKKVNAILKDIQFHPVKDNILHVDFYQIDEAKPIVMEVPVQLEGLAEGVK AGGKLALQMRKIKVKALYNVIPEKLTVNVSHLGLGKTVKVGELSFEGLELISAKEAVVCA VKLTRAARGAAAAAGK >gi|226332131|gb|ACIC01000189.1| GENE 59 59659 - 60228 638 189 aa, chain + ## HITS:1 COG:slr0922 KEGG:ns NR:ns ## COG: slr0922 COG0193 # Protein_GI_number: 16331675 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptidyl-tRNA hydrolase # Organism: Synechocystis # 1 189 1 190 194 138 42.0 7e-33 MIKYLIVGLGNIGPEYHETRHNIGFMTVEALARINNAPPFIDGRYGFTTSFSIKGRQLIL LKPSTFMNLSGLAVRYWMQKENIPLENVLIVVDDLALPFGTLRLKGKGSDAGHNGLKHIA AILGTQNYARLRFGIGNDFPKGGQIDYVLGHFTDEDRKTMDERLETAGEIIKSFCLAGID ITMNQFNKK >gi|226332131|gb|ACIC01000189.1| GENE 60 60330 - 60752 572 140 aa, chain + ## HITS:1 COG:Cgl2072 KEGG:ns NR:ns ## COG: Cgl2072 COG1188 # Protein_GI_number: 19553322 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) # Organism: Corynebacterium glutamicum # 5 121 10 122 126 87 41.0 8e-18 MAEARIDKWMWAVRIFKTRTIAAEACKMNRVTINGAYVKASRMIKPGDVVQVKKPPITYS FKVLQAIEKRVGAKLVPDVMENVTTPDQYELLEMSKISGFIDRARGTGRPTKKDRRSLEE FTAPEFMDDFDFDFDFDEDK >gi|226332131|gb|ACIC01000189.1| GENE 61 60827 - 61252 524 141 aa, chain - ## HITS:1 COG:RSc0455 KEGG:ns NR:ns ## COG: RSc0455 COG0537 # Protein_GI_number: 17545174 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Ralstonia solanacearum # 30 106 27 103 147 62 35.0 3e-10 MKSDPKECLYCQNNETLHNLMIEIAQLSVSRVFLFKEQTYRGRCLVAYKDHVNDLNELSD EDRNAFMADVARVTRAMQKAFQPEKINYGAYSDKLSHLHFHLAPKYVDGPDYGGIFQMNP GKVYLTDAEYQELIDAVKANL 
>gi|226332131|gb|ACIC01000189.1| GENE 62 61302 - 63575 1706 757 aa, chain - ## HITS:1 COG:PA5529 KEGG:ns NR:ns ## COG: PA5529 COG0475 # Protein_GI_number: 15600722 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, membrane components # Organism: Pseudomonas aeruginosa # 7 443 6 440 585 353 44.0 1e-96 MSQLPTLIADLALILICAGVMTLLFKKLKQPLVLGYVVAGFLASPHMPYTPSVMDTANIK TWADIGVIFLLFALGLEFSFKKIVKVGGSAIIAACTIIFCMILLGIGVGMGFGWHRMDSL FLGGMIAMSSTTIIYKAFDDLGLRKKQFTGLVLSILILEDILAIVLMVMLSTMAVSHNFE GTEMLESIGKLLFFLILWFVVGIYLIPEFLKRCRKLMGEETLLIVSLALCFGMVVMAANT GFSAAFGAFIMGSILAETIEAESIDRLVKPVKDLFGAIFFVSVGMMVDPAMIIEYAIPII VITIAVILGQATFGTFGVILSGKPLKTAMQCGFSLTQIGEFAFIIASLGVSLHVTSDFLY PIVVAVSVITTFLTPYMIRLAEPASSFVDAHLPASWRKFLMRYSSGSQTVLNHENLWKKL LLAMVRITVVYSIVSISIIALSFRFVVPFFKENLPHFWASLLGAVFIILCISPFLRAIMV KKNHSVEFMTLWHDNRANRAPLVSTIVIRIMIAALFVIFVISGLFKASIGLIIGVAVLAV LLMVWSRRLKKQSILIERRFFQNLRSREVRAEYLGEKKPEYAGRLLSHDLHLADLEIPGE STWAGKTLMELNLGKKYGIHIASILRGKRRINIPGGSVRLFPMDKIQVIGTDEQLNVFSS AMQSGAKIDWEMYEKGEMTLKQFIIDADSVFLGKTIRESGIRDKYHCMIAGVEREDGTLM VPDVNAPFEEGDVVWVVGEKENVYQLVDQKNEKIQVK >gi|226332131|gb|ACIC01000189.1| GENE 63 63726 - 64811 1062 361 aa, chain - ## HITS:1 COG:BH2816 KEGG:ns NR:ns ## COG: BH2816 COG0404 # Protein_GI_number: 15615379 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system T protein (aminomethyltransferase) # Organism: Bacillus halodurans # 1 361 4 362 365 325 46.0 1e-88 MKTTPFTEKHIALGAKMHEFAGYNMPIEYSGIIDEHLTVCNAVGVFDVSHMGEFWVKGPQ ALAFLQKVTSNNVAALVPGKIQYTCFPNEEGGIVDDLLVYCYEPEKYLLVVNAANIEKDW NWCVSHNTEGAELENSSDNMAQLAVQGPKAILALQKLTDIDLSAIPYYTFTVGRFAGKEN VIISNTGYTGAGGFELYFYPDAAEAIWKAVFEAGEEFGIKPVGLGARDTLRLEMGFCLYG NDLDDKTSPIEAGLGWITKFVEGKEFINRPMLEKQKSEGTTRKLVGFEMIDRGIPRHGYE LVNEEGEGIGVVTSGTMSPTRKIGIGMGYVKPEYAKVGTEICIDMRGRKLKAIVVKPPFR K >gi|226332131|gb|ACIC01000189.1| GENE 64 64861 - 66084 1272 407 aa, chain - ## HITS:1 COG:CAC0476 KEGG:ns NR:ns ## COG: CAC0476 COG2195 # Protein_GI_number: 
15893767 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Clostridium acetobutylicum # 3 407 4 408 408 503 59.0 1e-142 MTLVDRFLKYVSFDTQSDESTGLTPSTPKQMVFAEYLKTELESLGLEDITLDEHGYLFAT LPANIDKKVPTIGFIAHMDTSPDMTGKDVTPRIVKGYDGTDIVLCAEENIILSPAQFPEL LDHKGEDLIVTNGKTLLGADDKAGIAEIVSAIVYLKEHPEIKHGKIRIGFNPDEEIGEGA HKFDVGKFGCEWAYTMDGGEVGELEFENFNAAAAKITFKGRNVHPGYAKNKMINSIRVAN QFIAMLPSTETPEQTEGYEGFYHLISIQGDVEQSTVSYIIRDHDRAKFEKRKEEIKRLVA QVNTEYGEGTATLELRDQYYNMREKIEPVMHIIDTAFAAMEAVGVKPNVKPIRGGTDGAQ LSFKGLPCPNIFAGGLNFHGRYEFVPIQNMEKAMNVIVKIAELVASK >gi|226332131|gb|ACIC01000189.1| GENE 65 66204 - 67613 1176 469 aa, chain - ## HITS:1 COG:MJ0204 KEGG:ns NR:ns ## COG: MJ0204 COG0034 # Protein_GI_number: 15668376 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Methanococcus jannaschii # 1 465 1 456 471 182 29.0 2e-45 MGGFFGTVSKSSCVTDLFYGTDYNSHLGTKRGGLATYSQEKGFIRSIHNLESTYFRSKFE GELDKFKGNAGIGIISDTDAQPIIINSHLGRFAIVTVAKISNIQELEEDLLNQNMHFAEL SSSNTNQTELIALLIIQGRTFTEGIENVFKHIKGSCSLLILTEDGSIIAARDQWGRTPVV IGKKDGAYAATSESSSFPNLDYEIEKYLGPGEIVRMYPDHIEQLRQPNEGMQICSFLWVY YGFPTSCYEGKNVEEVRFTSGMKMGQKDESEVDCACGIPDSGVGMALGYAEGKGVPYHRA ISKYTPTWPRSFTPSNQEMRSLVAKMKLIPNRAMLQGKRLLFCDDSIVRGTQLRDNVKIL YDYGAKEVHMRIACPPLIYACPFVGFSASKNALELITRRIIKELEGDENKNLEKYATTGS PEYEKMVSIIAERFGLTSLKFNTLETLIEAIGLPKCKVCTHCFDGSSHF >gi|226332131|gb|ACIC01000189.1| GENE 66 67778 - 69937 1848 719 aa, chain - ## HITS:1 COG:no KEGG:BT_4581 NR:ns ## KEGG: BT_4581 # Name: not_defined # Def: alpha-glucosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 719 1 719 719 1489 100.0 0 MKNMKILTTAVLCLLLSFSMGATAMAESITSPDGQLKLDFSVNVQGEPVYELTYKGKGVI KPSKLGLELKNDPGLMNGFACIDTKTSTFDETWEPVWGEVKSIRNHYNEMAVTLNQKAQD RNMIIRFRLYNDGLGFRYEFPQQKNLNYFVIKEEHSQFAMTGDHTAFWIPGDYDTQEYDY TESRLTEIRGLMKGAITDNASQTSFSPTGVQTSLQMKTADGLYINLHEAALVDYSCMHLN LDDKKLIFESWLTPDAMGDKGYMQTPCQSPWRTVIVSDDARDILASKLTLNLNEPCAYQD 
VSWIKPVKYIGVWWEMITGKSSWAYTDDVYSVKLGQTDYSKTSPNGRHAANNDKVKRYID FAAQHGFDQVLVEGWNEGWEDWFGKSKDYVFDFVTPYPDFDVKMLNAYAKEKGVKLMMHH ETSASVRNYERHLDTAYQFMVDNGYNAVKSGYVGNIIPRGEHHYGQWMNNHYLYAVTKAA DYKICVNAHEAVRPTGLCRTYPNLIGNESARGTEYEAFAGNKPFHTTLLPFTRQIGGPMD YTPGIFDTRISFLDGKHSVVRTTLAKQLALYVTMYSPLQMAADLPESYERYPDAFQFIKD VALDWDDSKYLEAEPGDYITVARKAKGTNHWFVGGITDENSRTAAFTLDFLEPDKEYVAT LYADGKEADYEKNPTSYQIKKGIVTNKTKMSVKLARSGGFALSLIEAAPSDRKSIKKWK >gi|226332131|gb|ACIC01000189.1| GENE 67 70485 - 71678 367 397 aa, chain - ## HITS:1 COG:ydcC KEGG:ns NR:ns ## COG: ydcC COG5433 # Protein_GI_number: 16129419 # Func_class: L Replication, recombination and repair # Function: Transposase # Organism: Escherichia coli K12 # 22 389 12 368 378 215 38.0 1e-55 MKQETKRRIEISNLHEFADSLILIDNRIERCKKHQASTIVLIAISAVICGADTWNSIEDF GKSKESFFAAKLSIFNGIPSHDTFNRFFSALDPLKFEESYRQWVQSILKCYSGHIAIDGK TIRGAYESEEDKRRRKQGVLPDSNTGKYKLHVISAFATELGISLGQLCTQEKENEIVVIP ELLDMLCIKDCIITIDALGCQRTIAEKVIKGEGDYIFIVKDNQPKLKETVLSATESIVSK GTTVRFDKYETHEEGHGRNESRICYCCNDPGFLGADIRKKWKNIRSFGYIENTRNTNKGT TVEKRCFISSLEPDAQKILKNSREHWEIENNLHWQLDVNFHEDNTRRRNISALNFSVLAK IALATLRNNKREIPINRKRLIAGWDNEFLWELILHDL Prediction of potential genes in microbial genomes Time: Thu May 12 04:02:37 2011 Seq name: gi|226332130|gb|ACIC01000190.1| Bacteroides sp. 1_1_6 cont1.190, whole genome shotgun sequence Length of sequence - 29660 bp Number of predicted genes - 22, with homology - 22 Number of transcription units - 10, operones - 7 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 18 - 77 3.0 1 1 Op 1 . + CDS 326 - 1561 1181 ## BF0152 tyrosine type site-specific recombinase 2 1 Op 2 . + CDS 1574 - 1936 266 ## BF0151 hypothetical protein + Term 2108 - 2149 -0.8 - Term 1978 - 2034 2.5 3 2 Op 1 . - CDS 2039 - 4159 518 ## COG1204 Superfamily II helicase 4 2 Op 2 . - CDS 4146 - 5087 281 ## Xaut_0229 hypothetical protein 5 2 Op 3 . - CDS 5140 - 5343 164 ## BT_4529 hypothetical protein - Prom 5368 - 5427 6.8 - Term 5468 - 5515 3.1 6 3 Op 1 . 
- CDS 5534 - 5836 345 ## BF0150 hypothetical protein
7 3 Op 2 . - CDS 5871 - 6170 240 ## BF0149 hypothetical protein
8 4 Op 1 . + CDS 6451 - 6807 347 ## BF0147 hypothetical protein
9 4 Op 2 . + CDS 6811 - 7161 450 ## BF0146 hypothetical protein
10 4 Op 3 . + CDS 7182 - 8750 1261 ## BF0145 hypothetical protein
11 4 Op 4 . + CDS 8812 - 10899 1606 ## COG0550 Topoisomerase IA
+ Term 11053 - 11089 5.5
+ Prom 10913 - 10972 2.1
12 5 Op 1 . + CDS 11139 - 11591 443 ## BF0143 hypothetical protein
13 5 Op 2 . + CDS 11581 - 17421 4283 ## COG4646 DNA methylase
+ Prom 17878 - 17937 10.4
14 6 Op 1 . + CDS 18081 - 20006 665 ## COG0480 Translation elongation factors (GTPases)
15 6 Op 2 13/0.000 + CDS 20006 - 22324 999 ## COG0642 Signal transduction histidine kinase
16 6 Op 3 . + CDS 22317 - 23639 979 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains
+ Term 23864 - 23898 2.0
+ Prom 23668 - 23727 2.6
17 7 Tu 1 . + CDS 23919 - 24341 189 ## BF0137 hypothetical protein
+ Prom 24430 - 24489 7.1
18 8 Tu 1 . + CDS 24580 - 25182 448 ## BF0136 tetracycline resistance element mobilization regulatory protein RteC
+ Term 25196 - 25236 8.1
+ Prom 25184 - 25243 3.2
19 9 Op 1 . + CDS 25323 - 26774 906 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen
20 9 Op 2 . + CDS 26784 - 27947 203 ## PROTEIN SUPPORTED gi|134101692|ref|YP_001107353.1| acetyltransferase, including N-acetylase of ribosomal protein
21 9 Op 3 . + CDS 27964 - 28530 66 ## gi|302483533|gb|EFL46535.1| hypothetical protein HMPREF9296_0844
+ Term 28533 - 28575 7.1
- Term 28515 - 28567 14.5
22 10 Tu 1 .
- CDS 28572 - 29660 851 ## BF0133 putative mobilization protein Predicted protein(s) >gi|226332130|gb|ACIC01000190.1| GENE 1 326 - 1561 1181 411 aa, chain + ## HITS:1 COG:no KEGG:BF0152 NR:ns ## KEGG: BF0152 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: B.fragilis # Pathway: not_defined # 1 411 1 411 411 814 97.0 0 MKSTFSVIYYLKRQVVKKDGTVPVMGRITVDGSQTQFSCKLSVDPKLWDTKGGRVTGRSA AALETNRMLDKMRVRINRHYQEIMERDNFVTAEKVKNAFLGLEHRYHTLMQVFRRHNENY EKQVEAGMKAKGTLEKYRVVYKHLQEFLDIRYHVKDIALKELTPAFISDFEMFLRTDKHC CTNTVWLYVCPLRTMVFIAINNEWLTRDPFREYEIKKEETTRSFLTKDEIRLLMEGKLKN AKQELYRDLYLFCAFTGLSFADMRNLTEENIRTYFDEHEWININRKKTGVVSNIRLLDIA NRIIGKYRGLCENGRIFPVPHYNTCLAGIRAVAKRCGITKHITWHQSRHTAATTVFLSNG VPIETVSSMLGHKSIKTTQIYAKITKEKLNQDMENLAARLNNVEEFAGCTI >gi|226332130|gb|ACIC01000190.1| GENE 2 1574 - 1936 266 120 aa, chain + ## HITS:1 COG:no KEGG:BF0151 NR:ns ## KEGG: BF0151 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 120 1 120 120 213 88.0 2e-54 MKRDIIIIEDKAVSVTGNDVWMTATEIAGLFHTTVPAVNAVIKAVRKSDVLNDYKVCRYM QLENGLYADVYAPEIIILIAFRLNTYNTYLFRMWLVGKVLSQEKRQTYVMFIQNGKTAYC >gi|226332130|gb|ACIC01000190.1| GENE 3 2039 - 4159 518 706 aa, chain - ## HITS:1 COG:yfjK KEGG:ns NR:ns ## COG: yfjK COG1204 # Protein_GI_number: 16130545 # Func_class: R General function prediction only # Function: Superfamily II helicase # Organism: Escherichia coli K12 # 42 445 54 458 729 178 31.0 4e-44 MQTNDIYNTCLDISNVLNAGDISNARSKVITLLHEINGTNNNSYMELVNHLIREVGLLPY IDTYTASWEDRFVCEVFKVNIGERKPCVLHTAQSQVLKKLLEGKSVAVSAPTSFGKSFVI DAFIAIKQPINVVILVPTVALADETRRRICRKFSHQYKIITTTDVELAEKNILVFPQERA FAYIDKLPSIDILIVDEFYKASTNFDLERSSTLLSSMVELGKKAKQRYYLAPNIHEIEEN VFTEGMTFLRMDFKTVVTHAWRLYKSRPKNEDKQSFKTRKLIELINTGIGKALIYTGTYK GIEEVTTILNRNLYNKDSELLANFSDWLECNYGEKYILKDLVKHGIGIHNGQLHRSLSQI QIKLFEEQNGLDYLVSTSSIIEGVNTQAESVVLWSNKNGAHKIDYFTFRNIIGRAGRMFR YFVGRVYMLEEPPSQENTKLRLEFPDDVVKKLDGNDPGIKLNNEQYVKIQRYQDEMIELL GTDIWHRIERIPQIRSCKPSMLKIIAEKLKTDSNWPTNCDALQNNNTWEWRDALGDIIEI 
LEYHRKGHLRYYACACSNGWKMTIKELYNTVKDYGITYEDIFTFERYVSFNLSSIIAVIN IIRQELYPNSSNIANFVYKASNAFLPKIVFQLEEYGLPRMISKKIQNAGLINLEDDSKEI TIVIQEFNTIGIEYLEQKIPNLHSFDKYILKHFMNGIRCITTNQKN >gi|226332130|gb|ACIC01000190.1| GENE 4 4146 - 5087 281 313 aa, chain - ## HITS:1 COG:no KEGG:Xaut_0229 NR:ns ## KEGG: Xaut_0229 # Name: not_defined # Def: hypothetical protein # Organism: X.autotrophicus # Pathway: not_defined # 8 311 64 367 367 303 49.0 6e-81 MSNKIWNFDILVDDCFVNFNTKKQEINPDHNDRLLSVANGFEDGSWRYRQFKEFVFSNIA ETALSAQEREKLIDNDYGRLIEAAKHLRLVDKEQNGKGSEIAEIILYGIMKNHYKALSAI PKIFYKQNDNDNAKGSDSVHIVIDPNGGFQLWLGEAKFYNSLEDARLYEPINSVEQMLRK PIMKKECGIMTNLNELDKQIENQTLLKKIKECFDENTSIDEIKPKLHIPILLLHECQITA STTEMSDEYKNSIISFHRDRAHSFFKKQINKLSSVHKYSEINFHLILFPVPDKTKIVNWF ISQAKNLKNDADE >gi|226332130|gb|ACIC01000190.1| GENE 5 5140 - 5343 164 67 aa, chain - ## HITS:1 COG:no KEGG:BT_4529 NR:ns ## KEGG: BT_4529 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 67 22 88 89 108 80.0 7e-23 MAILNNIKNVLSEKGVMSKWLAKQLQKDPATVSKWCNNHSQPDLYTFVRIANLLDVDVHE LICHTKK >gi|226332130|gb|ACIC01000190.1| GENE 6 5534 - 5836 345 100 aa, chain - ## HITS:1 COG:no KEGG:BF0150 NR:ns ## KEGG: BF0150 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 100 2 101 101 176 94.0 2e-43 MNETNDVFTMEDEPIASVMQDMRKGSKWLSVFLESYRPPLDGERYLTDGEVAELLRVSRR TLQEYRNNRVLSFILLGGKVLYPETGLREVLEANYRKPLE >gi|226332130|gb|ACIC01000190.1| GENE 7 5871 - 6170 240 99 aa, chain - ## HITS:1 COG:no KEGG:BF0149 NR:ns ## KEGG: BF0149 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 99 15 113 113 182 97.0 4e-45 MNMEIVSIEKKTFEMMVAAFGALSEKVAALRRKSDTGRMERWLTGEEVCGQLRISPRTLQ TLRDRRLIGYSQINRRFYYKPEEVKRLIPLIGTLYSHGR >gi|226332130|gb|ACIC01000190.1| GENE 8 6451 - 6807 347 118 aa, chain + ## HITS:1 COG:no KEGG:BF0147 NR:ns ## KEGG: BF0147 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 
118 22 141 141 114 59.0 1e-24 MKVITMESSAFRSLTEQIAEIAAHVRAASGDKKAASPDRLLTTREAAHLLNVSTRTLQRM RSEQRIGYVVLRGKCRYRQSEIDRLLEACAVNEDAATPQELKRNHTLRTGGKPKGRRT >gi|226332130|gb|ACIC01000190.1| GENE 9 6811 - 7161 450 116 aa, chain + ## HITS:1 COG:no KEGG:BF0146 NR:ns ## KEGG: BF0146 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 116 1 116 116 213 100.0 1e-54 MELLTRNNFEGWMQKLMERLDRQDELLLAMKAEGKQPTITESIRLFDNQDLCMLLQISKR TLQRYRSVGALPYKTLGKKTYYSEEDVLTFLSNHIKDFKKEDIAFYKARIHNFFHK >gi|226332130|gb|ACIC01000190.1| GENE 10 7182 - 8750 1261 522 aa, chain + ## HITS:1 COG:no KEGG:BF0145 NR:ns ## KEGG: BF0145 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 522 1 523 523 781 90.0 0 MAKKKDEKDVLVVRDEKTGEISVVAGLNADDTPKRTPAKAENAQSFLQFDRHGDVLDNFF KNFFRQCKEPSRFGFYRIAADQVDSMLGVMKDLLQHPYQNKQILAPHKIDTFPYQQQVQE EMAAQKTEKQEPQKQENMEQQKEQQESPQQTQGRQGYQPINESKINWQELEDRWGVKRDD LEKSGDLTKMLHYGKSDLVKVKPTFGGESFELDARLSFKKDGEGNVSLVPHFIRKEQKLD EYKEHKFSDDDRKNLRETGNLGRVVDIVDRETGEIIPSYISIDRKTNEITDIPANKVRIP ERIGKTEITKQEQDMLRAGLPVRDKLIERNDGRKFVTTLQVNVEQRGVEFVPGTGRSPRT AQTQEAKGDTSKSQAQGGENAAQTKKEQRRNTWMNEDGSIRPISKWSGVNFTDKQKADYV AGKAVKLENVTDKQGFHATMYIKFNPEKGRPYRYETNPDNAQQVAPSNESRTQVAVNNEG KTNEATKNLKEPLQKGQTAPKDDRQQQQQEKPKKKNNKGMKM >gi|226332130|gb|ACIC01000190.1| GENE 11 8812 - 10899 1606 695 aa, chain + ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 4 625 6 647 709 406 38.0 1e-113 MKTIIAEKPSVAREIARIVGATKREEGYFEGGGYAVTWAFGHLVQLAMPDGYGVRGFVRD NLPIIPDTFTLVPRQVRTEKGYKPDSGVVSQIKVIKRLFDTSEQIIVATDAGREGELIFR YLYHYTGCTTPFVRLWISSLTDKAIREGLRNLEDGSRYDNLYLAAKARSESDWLVGINGT QALSIAAGHGTYSVGRVQTPTLAMVCERYWENRRFTSEAFWQLHIATDGCDGEVVKLSSS EKWKSKEPATELYNKVKAAGSATVTKAERKEKTEETPLLYDLTTLQKEANAKHGFTAEQT LEIAQKLYEKKLITYPRTGSRYIPEDVFAEIPKLLAFIGTQPEWKDKVRAKAIPTRRSVD 
DGKVTDHHALLVTGEKPLFLSKEDSTIYQMIAGRMIEAFSEKCVKDVTAVTAECAGVEFT VKGSVVKQAGWRAVYGEEKEETTIPGWQDGDTLTLKATSITEGKTKPKPLHTEATLLSAM ETAGKEIEDDALRQAMKDCGIGTPATRASIIETLFKRGYMERCKKSLVPTEKGLALNSVV KTMRIADVAMTGEWEKELARIERGELPADTFRKEIEAYTREITSELLSCDKLFGSRDSGC ACPKCGTGRMRFYGKVVRCDNTECGLPVFRQKAGRTLSDDEIKDLLTDGHTKPLKGFKSK QGKNFDAIVAFDGEYNTTFVFPEKKCKSSYPKKRK >gi|226332130|gb|ACIC01000190.1| GENE 12 11139 - 11591 443 150 aa, chain + ## HITS:1 COG:no KEGG:BF0143 NR:ns ## KEGG: BF0143 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 150 46 195 195 236 92.0 2e-61 MNNKKKNEGQTDFSYYGLYLLDYLRTNKFEQATDTAFIRERADRAAETYEKARLEGYPAD GAQEQAMDTLLRGLRYSRYAILREVVESEFFDEVPEEKQEAFILKLMPLVGNVFSVYDLS DDNFALSSDYDLLYTELTGATVLYIGEYGV >gi|226332130|gb|ACIC01000190.1| GENE 13 11581 - 17421 4283 1946 aa, chain + ## HITS:1 COG:AGpT188_2 KEGG:ns NR:ns ## COG: AGpT188_2 COG4646 # Protein_GI_number: 16119916 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA methylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 606 1696 42 1144 1315 355 27.0 4e-97 MAFNRKQKLRDNIEAIRTAFILDRENRTATTEERAILQRYCGFGGLKCILNPAKELTDAV RWAKSDLELFAPTVELHRLIRENSKDETEYKRFVDSLKASVLTAFYTPKEITDTIADVLA DYSVRPARMLEPSAGVGVFVDSMLRHNPNADVMAFEKDLLTGTILRHLYPGKKTRTCGFE KIERPFNNYFDLAVSNIPFGDIAVFDPEFQRSDSFGRRSAQKAIHNYFFLKGLDAVRDGG IVAFITSQGVLNSTKTSVRNELFSKADLVSAIRLPNNLFTDNAGTEVGSDLIVLQKNLSK KEMSQDERLMTVIQTDTKTDLTDNAYFIHHPERIVHTTAKLDTDPYGKPAMVYLHEGKAA GIAGDLHRMLDEDFHYRLAMRLYSGSIRQSGTEEKVTVQKEVERTAIKMETTSSMQAVET PTEKPQPTEEKPEIEPRPKYSDGVQLSLLDLWGMTEEVSQQPKTAKKKKEAKKESSARRV LPKPQVHVTQNVTAVPTATTPKTVTENKEAKTENTAKPADPDDIYATLDWDTNPPINGFY EMMMDLTPERRKELRELARQHNEKQAAAEKMEVKAVPDTPREQPRQEETQPEAVTAPAVT DTPPEAVATSLFPDIEAEKPKEEVVDLSPRAYHRTPEMHLREGSLVADRGRHNIGYLKDI TPYGATFQPLDLKGYQKEKALQYVLLRDAYERLYRYESNLHEANVPWREHLNTCYDEFVM RYGNLNAKQNVKLVMMDAGGRDILSLERAENGKFVKADIFERPVSFSVESHANVGSPEEA LSASLNKFGTVDLDYMREITDSTAEDLLTALQGRIYYNPLVTGYEIKDRFIAGNVIEKAE 
RIEAWMGENPESERMPEVKQALEALKDAEPPRIAFEDLDFNFGERWIPTGVYAAYMSRLF DTEVKIAYSASMDEFSVACGYRTMKITDEFLVKGYYRNYDGMHLLKHALHNTCPDMMKSI GRDEHGNDIKVRDSEGIQLANAKIDEIRNGFSEWLEEQSPQFKERLTTMYNRKFNCFVRP KYDGSHQTFPDLNLKGLASRGIRSVYPSQMDCVWMLKQNGGGICDHEVGTGKTLIMCIAA HEMKRLNLAHKPMIIGLKANVAEIAATYQAAYPNARILYASEKDFSTANRVRFFNNIKNN DYDCVIMSHDQFGKIPQSPELQQRILQAELDTVEENLEVLRQQGKNVSRAMLKGLEKRKH NLEAKLEKVEHAIKSRTDDVVDFKQMGIDHIFIDESHQFKNLTFNTRHDRVAGLGNSEGS QKALNMLFAIRTIQERTGKDLGATFLSGTTISNSLTELYLLFKYLRPKELERQDIRCFDA WSAIFAKKTTDFEFNVTNNVVQKERFRYFIKVPELAAFYNEITDYRTAEDVGVDRPNKNE ILHHIPPTPEQEDFIQKLMQFAKTGDATLLGRLPLSETEEKAKMLIATDYARKMALDMRM IDPHYEDHPDNKASHCAKIIAEYYQKYDAQKGTQFVFSDLGTYQPGDGWNVYSEIKRKLT EDYGIPPSEVRFIQECKTDKARKAVIDAMNAGTVRVLFGSTSMLGTGVNAQKRCVAIHHL DTPWRPSDLQQRDGRGVRAGNEIAKHFAGNNVDVIIYAVEKSLDSYKFNLLHCKQTFISQ LKSGAMEARTIDEGAMDEKSGMNFSEYMALLSGNTDLLDKAKLEKRIASLEGERKSFNKG KRDSEFKLESKTRELGNNTAFIDAMTEDWNRFLSVVQTDKEGNHLNIIKVDGVDSADEKV IGKRLQEIAKNATTGGLYTQVGEFYGFPIKVVSERILKEGLEFTDNRFVVEGNYKYTYNN GHLALADPLAAARNFLNAIERIPSIIDQYKAKNEVLEREIPQLQEIAGKVWKKEEELKQL KSELAALDRKIQLELAPPTPEITEKEHEGQQVKPEAKGVRNGIRQYPEDTSPQIRNPSES IIANHTITGHPGLYAKEETRSKGLKI >gi|226332130|gb|ACIC01000190.1| GENE 14 18081 - 20006 665 641 aa, chain + ## HITS:1 COG:CAC1448 KEGG:ns NR:ns ## COG: CAC1448 COG0480 # Protein_GI_number: 15894727 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Clostridium acetobutylicum # 3 619 4 626 652 474 39.0 1e-133 MNIINLGILAHIDAGKTSVTENLLFASGATEKCGRVDNGDTITDSMDIEKRRGITVRAST TSIIWNGVKCNIIDTPGHMDFIAEVERTFKMLDGAVLILSAKEGIQAQTKLLFNTLQKLQ IPTIIFINKIDRDGVNLERLYLDIKTNLSQDVLFMQTVVDGLVYPICSQTYIKEEYKEFV CNHDDNILERYLADSEISPADYWNTIIDLVAKAKVYPVLHGSAMFNIGINELLDAISSFI LPPESVSNRLSAYLYKIEHDPKGHKRSFLKIIDGSLRLRDIVRINDSEKFIKIKNLKTIY QGREINVDEVGANDIAIVEDMEDFRIGDYLGTKPCLIQGLSHQHPALKSSVRPDRSEERS KVISALNTLWIEDPSLSFSINSYSDELEISLYGLTQKEIIQTLLEERFSVKVHFDEIKTI YKERPVKKVNKIIQIEVPPNPYWATIGLTLEPLPLGTGLQIESDISYGYLNHSFQNAVFE 
GIRMSCQSGLHGWEVTDLKVTFTQAEYYSPVSTPADFRQLTPYVFRLALQQSGVDILEPM LYFELQIPQAASSKAITDLQKMMSEIEDISCNNEWCHIKGKVPLNTSKDYASEVSSYTKG LGVFMVKPCGYQITKGDYSDNIRMNEKDKLLFMFQKSMSSK >gi|226332130|gb|ACIC01000190.1| GENE 15 20006 - 22324 999 772 aa, chain + ## HITS:1 COG:MA2256_2 KEGG:ns NR:ns ## COG: MA2256_2 COG0642 # Protein_GI_number: 20091095 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 284 517 45 284 345 122 33.0 3e-27 MERSGNFYKAIQLGYILISILIGCMAYNSLYEWQEIEALELGNKKIDELRKEINNINIQM IKFSLLGETILEWNDKDIEHYHARRMAMDSMLCRFKATYPAERIDSVRSLLEDKERQMFQ IVRLMDEQQSINKKIANQIPVIVQKSVQEQSKKPKRKGFLGIFGKKEGTKPTTTTTTLRS SNRNMVNEQKAQSRRLSEQADSLAARNAELNRQLQGLICQIEKKVQSDLQNRESEITAMR KKSFMQIGGLMGFVLLLLVISYIIIHRDAKNIKRYKRKTTDLIEQLEQSVQQNEVLITSR KKAVHTITHELRTPLTAITGYTELLRKECNSGNNGQYIRNILQSSDRMRDMLNTLLDFFR LDNGKEQPRLSPCRISAITHTLETEFIPVAVNKGLSLSVKTGHDAIVLTDKERIIQIGNN LLSNAVKFTEEGGVSLITEYDNGVLTLVVEDTGTGMTEEEQKQAFGAFERLSNAAAKEGF GLGLAIMRNIVSMLGGTIRLDSKKGKGSRFTVEISMQEAEEQLGYTSNTPVYHNNKFHDV VAIDNDEVLLLMLKEMYSQEGIHCDTCTDAAELMEMIRQKEYSLLLTDLNMPGINGFELL ELLRSSNVGNSPTIPVVVATASGSCNKGELLAKGFAGCLFKPFSISELMEVSDRCAIKET PDGKPDFSALLSYGNEAVMLEKLMTETEKEMQTIREAATEKDLQKLDSLTHHLRSSWEVL RADQPLNVLYRLLHGDVLPDGEALSHAVTAVLDKGAEIIRLAEEERRKYEDG >gi|226332130|gb|ACIC01000190.1| GENE 16 22317 - 23639 979 440 aa, chain + ## HITS:1 COG:hydG KEGG:ns NR:ns ## COG: hydG COG2204 # Protein_GI_number: 16131834 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 2 428 4 439 441 269 38.0 8e-72 MDKTTIIVVEDNIVYCEFVCNMLAREGYRTVKAYHLSTAKKHLQQATDNDIVVADLRLPD GNGIDLLRWMRKEGKMQPFIIMTDYAEVNTAVESMKLGSIDYIPKQLVEDKLVPLIRSIL KERQAGQRRMPVFARDGSAFQKIMHRIRLVAATDMSVMIFGENGTGKEHIAHHLHDKSKR AVKPFVAVDCGSLTKELAPSAFFGHVKGAFTGADCAKKGYFHEAEGGTLFLDEVGNLALE TQQMLLRAIQERRYRPVGDKADRSFNVRIIAATNEDLEAAVSEKRFRQDLLYRLHDFGIT 
VPPLRDCQEDIMPLAEFFRDMANRELECSVSGFSSEARKALLTHAWPGNVRELRQKVMGA VLQAQEGVVMKEHLELAVTKPTSTVNFALRNDAEDKERILRALKQANGNRSVAAELLGIG RTTLYSKLEEYGLKYKFKQS >gi|226332130|gb|ACIC01000190.1| GENE 17 23919 - 24341 189 140 aa, chain + ## HITS:1 COG:no KEGG:BF0137 NR:ns ## KEGG: BF0137 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 140 1 140 140 283 96.0 2e-75 MGKVQILAVLTMDGCLSSELYYKAHQDLCLDRCGLDEIRKNALYRVTPDYSISMLHEWRK DGTNIRYLAEATPDTADYINGLLRMHAVDEIILYTVPFISGSGRHFFKSALPEQHWTLSS LKSFPNGVCRIIYILDKKAR >gi|226332130|gb|ACIC01000190.1| GENE 18 24580 - 25182 448 200 aa, chain + ## HITS:1 COG:no KEGG:BF0136 NR:ns ## KEGG: BF0136 # Name: not_defined # Def: tetracycline resistance element mobilization regulatory protein RteC # Organism: B.fragilis # Pathway: not_defined # 1 200 1 201 201 373 93.0 1e-102 MNYFLLAETEFFRRINEAGDCNMEKAYTAFATQVIELCNGGMDMNLTVIALAYIEIELQH HPVRNLSEERREIAAYVSKALSFVRKMQKFLATPQVPPLISANNATETTASLLWTGNAID LVELIYGIDEISCINNGNMPLKQLAPILYKIFGIESKDCYRFYTDIKRRKNESRTYFLDK MQEKLNERMLRDEELERMRR >gi|226332130|gb|ACIC01000190.1| GENE 19 25323 - 26774 906 483 aa, chain + ## HITS:1 COG:MA2370 KEGG:ns NR:ns ## COG: MA2370 COG2865 # Protein_GI_number: 20091202 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Methanosarcina acetivorans str.C2A # 9 387 3 373 458 162 31.0 1e-39 MNIDNLDIVKQLIAEKENGQVEFKETTGQLERGMETLCAFLNSEGGTVLFGVTDKGKIIG QEVSDKTKRDIAEAIRRFEPFATLEVSYISIQNTDKSVIALSADSQRYMRPFSYKGRAYL RLESVTSSMPQDVYNQLLMQRGGKYAWEAMTNPDIKVTDLDEHAIMGAVRGGIRCGRLPE ATIREDLPTILEKFNLLHDGKLNNASAVLFGRDFYFYPQCLLRLARFKGTTKDEFIDNQR TTGNIYTLLDTAMSFFFKHLSLSGKVEGLYREEELEIPYKALRECCTNALCHRSYHRPGS SVGIAIYDDRVEIENSGTFPPDITMEKLLSGHNSEPQNLIIANVLYKSEVLESWGRGIGL MISECRRVGIPDPEFHTDGNSVWVIFRYTRKTVGHDPTITRQLPHSHPTVTPQVEKVLSA IGTQTLSTKEIMCVIGLKDKSNFLELYLYPAIRQNLVEPIYPENPKHPRQKYRLTDKGKE LLI >gi|226332130|gb|ACIC01000190.1| GENE 20 
26784 - 27947 203 387 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|134101692|ref|YP_001107353.1| acetyltransferase, including N-acetylase of ribosomal protein [Saccharopolyspora erythraea NRRL 2338] # 215 357 5 149 186 82 36 2e-15 MVAKKKKQQGHYCRICSEYKANEQFSGKGHSRHICKECRSLPDDVKADMVRCNEVERAVF KCPMSRQDWELLEKYAKKYKDKESGQFAQDMLDMKRGNQTPDEDMEEDDVLIEGIYEEET IPFAELEDDIRYQLEELLADNINEFMIHKNYIPEGKELKDINEWVMKEARDTFFIKVIPD AAYDSLVEETINRLVKEWKEDGFEIKTYSASLVVMETERLLIRRITRKDMDALLAIMGKP EVMYAWEHGFTKKDVRKWINRQLIRYRKDGFGYFAVILKESGALIGQAGLMNSTLNGNET VELGYILDNTYWHNGYGTEAARACLEYAFGELELKTVCCSIRPENVASIRVVERLGMTLC DNHTIIYNEKEMPHQIYVATNSKSYIL >gi|226332130|gb|ACIC01000190.1| GENE 21 27964 - 28530 66 188 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302483533|gb|EFL46535.1| ## NR: gi|302483533|gb|EFL46535.1| hypothetical protein HMPREF9296_0844 [Prevotella disiens FB035-09AN] # 1 188 1 188 188 363 100.0 3e-99 MKTNNINLFCDIVTQRSGEHSCAINILLQQQLYGQVISILRQELDSMVRVMFLLSISDLN LREHFINQTLEGIKWSYPNTKKVVTDKQMVDLADKFYGWPFFVYKLGCAFIHLSAMVYYK NSNPFLLLSVSERNDITRFLHQYHSFPLELELNLENIIPYLDKVFNKVSSNLACYIEDLR QNKLLEEY >gi|226332130|gb|ACIC01000190.1| GENE 22 28572 - 29660 851 362 aa, chain - ## HITS:1 COG:no KEGG:BF0133 NR:ns ## KEGG: BF0133 # Name: not_defined # Def: putative mobilization protein # Organism: B.fragilis # Pathway: not_defined # 1 359 308 666 672 714 97.0 0 VQKQGDFFVESPIILFASIIWYLKIYQNGKYCTFPHAIEFLNRRYEDIFPILTSYPELEN YLSPFMDAWLGGAAEQLMGQIASAKIPLSRMISPQLYWVMSDSEFTLDINNPEEPKILCV GNNPDRQNIYGAALGLYNSRIVKLINKKGMLKSSVIIDELPTIYFKGLDNLIATARSNKV AVCLGFQDFSQLVRDYGDKEAKVVMNTVGNIFSGQVVGETAKTLSERFGKVLQKRQSISI NRQDVSTSINTQMDALIPPSKISGLTQGMFVGSVSDNFNERIEQKIFHCEIVVDADKVKR EESAYKKIPVITDFTDADGNDRMKETVQANYRRIKEEVKQIVQEELERIAGDDNLKHLLQ QK Prediction of potential genes in microbial genomes Time: Thu May 12 04:03:28 2011 Seq name: gi|226332129|gb|ACIC01000191.1| Bacteroides sp. 
1_1_6 cont1.191, whole genome shotgun sequence
Length of sequence - 11027 bp
Number of predicted genes - 13, with homology - 13
Number of transcription units - 3, operones - 3 average op.length - 4.3

N Tu/Op Conserved pairs(N/Pv) S Start End Score
1 1 Op 1 . - CDS 1 - 919 639 ## BF0133 putative mobilization protein
2 1 Op 2 . - CDS 950 - 2197 980 ## BF0132 hypothetical protein
3 1 Op 3 . - CDS 2176 - 2604 293 ## BF0131 hypothetical protein
4 2 Op 1 . + CDS 3314 - 4075 746 ## BF0129 hypothetical protein
5 2 Op 2 . + CDS 4078 - 4518 369 ## BF0128 conjugate transposon protein
6 2 Op 3 . + CDS 4533 - 4874 308 ## BF0127 hypothetical protein
7 2 Op 4 . + CDS 4891 - 5619 540 ## BF0126 hypothetical protein
8 3 Op 1 . + CDS 5821 - 6138 289 ## BF0125 hypothetical protein
9 3 Op 2 . + CDS 6149 - 6481 185 ## BF0124 hypothetical protein
10 3 Op 3 . + CDS 6478 - 8982 2133 ## COG3451 Type IV secretory pathway, VirB4 components
11 3 Op 4 . + CDS 9017 - 9397 353 ## BF0122 hypothetical protein
12 3 Op 5 . + CDS 9421 - 10050 641 ## BF0121 hypothetical protein
13 3 Op 6 .
+ CDS 10055 - 11027 832 ## BF0120 hypothetical protein Predicted protein(s) >gi|226332129|gb|ACIC01000191.1| GENE 1 1 - 919 639 306 aa, chain - ## HITS:1 COG:no KEGG:BF0133 NR:ns ## KEGG: BF0133 # Name: not_defined # Def: putative mobilization protein # Organism: B.fragilis # Pathway: not_defined # 1 306 1 306 672 604 94.0 1e-171 MSQQEDDLRALAKIMDFLRAVSIILVVMNVYWFCYEVIRLWGVNIGVVDRILLNFDRTAG LFHSILYTKLFAVLLLALSCLGTKGVKGEKITWGRIWTALAAGSVLFFLNWWILALPLPI EAVMGLYVLTIGTGYVCLLMGGLWMSRLLKHNLMEDVFNNENESFMQETRLIESEYSVNL PTRFYYKKRWNNGWINVVNPFRASIVLGTPGSGKSYAVVNNFIKQQIEKGFLMYVYDFKF SDLSTIAYNHLLNHPDGYKVKPKFYVINFDDPRRSHRCNPIHPDFMEDITDAYESAYTIM LNLNKT >gi|226332129|gb|ACIC01000191.1| GENE 2 950 - 2197 980 415 aa, chain - ## HITS:1 COG:no KEGG:BF0132 NR:ns ## KEGG: BF0132 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 415 1 415 415 726 90.0 0 MVAKISVGNSLYGAIAYNGEKINEAQGRLLTTNRIYNDGSGKVDINKAMEGFLTFMPPQM KVEKPVVHISLNPHPEDVLTDAELQDIAREYLEKLGFGNQPYLVFKHEDIDRHHLHIVTV RVDDNGKCISDKNNYYRSKQITRELEKKYGLHGAERRNRRLDTPLSKVDASAGDVKKQVA GTVKALNGQYRFQTMGEYRALLSLYNMTVEEARGNVRGREYHGLVYSVTDDKGNKVGNPF KSSLFGKSAGYEAVQRKFIRSKSEIKDRKLADMTKRTVLSVLQGTYDKDKFVSQLKEKGI DTVLRYTDEGRIYGATFIDHRTGCVLNGSRMGKELSANALQEHFTLPYAGQPPIPLSIPM DAADKAYGQTAYDREDVSGGMGLLTPEGPAVDAEEKAFIRAMKRKKKKKRKGLGL >gi|226332129|gb|ACIC01000191.1| GENE 3 2176 - 2604 293 142 aa, chain - ## HITS:1 COG:no KEGG:BF0131 NR:ns ## KEGG: BF0131 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 142 1 142 142 245 99.0 4e-64 MKEKRKSKAGRNPKLDPAVYRYTVRFNEEEHNRFLAMFGKSGVYARSVFLKAHFFGQPFK VLKVDKTLVDYYTKLSDFHAQFRAVGTNYNQVVKELRLHFSEKKAMALLYKLEQHTVELV KLSRRIVELSREMEAKWSQKSV >gi|226332129|gb|ACIC01000191.1| GENE 4 3314 - 4075 746 253 aa, chain + ## HITS:1 COG:no KEGG:BF0129 NR:ns ## KEGG: BF0129 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 253 97 349 349 505 99.0 1e-142 
MSKEIFVAFATQKGGIGKSTVTALAASYLHNVKGYNVAVVDCDDPQHSIHGLREHEMGLI DSSTYFKALACDHFRRMKKNAYTIVKSNAVNALDDAERMIATEDVKPDVVFFDMPGTLRS NGVIKTLSQMDYIFTPLSADRFVVESTLKFVTMFRDRLMTTGQAKTKGLYLFWTMVDGRE RNDLYGIYEEVIAEMGFPVLSTRLPDSKKFRRDLSEERKSVFRSTIFPMDTALLKGSGIR EFSEEISDIIRPQ >gi|226332129|gb|ACIC01000191.1| GENE 5 4078 - 4518 369 146 aa, chain + ## HITS:1 COG:no KEGG:BF0128 NR:ns ## KEGG: BF0128 # Name: not_defined # Def: conjugate transposon protein # Organism: B.fragilis # Pathway: not_defined # 1 146 1 146 146 247 99.0 1e-64 MGSRKVNTEGIDEELLLASIGRRTQDGTLRPAQEVPAAAPTEEDTAAPEPPPVQPVTREK AQRESGRRKRQDEDYNELFLRRNEIKTRQCVYISRDVHGKILRIVNDIAGGEISVGGYVD TVLRQHLEQHKERINELYKKQREDLI >gi|226332129|gb|ACIC01000191.1| GENE 6 4533 - 4874 308 113 aa, chain + ## HITS:1 COG:no KEGG:BF0127 NR:ns ## KEGG: BF0127 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 113 5 117 117 218 95.0 7e-56 MTPNEKRPQQDCGGMFAQVQASVEILSPVPVSGKCCEKDYERLFIRDSEVKAREGKMAYV RPEYHERIMCITRVIGHDRLMLSAYIDHVLTHHFNQCEDAIKSLYARNYNSVF >gi|226332129|gb|ACIC01000191.1| GENE 7 4891 - 5619 540 242 aa, chain + ## HITS:1 COG:no KEGG:BF0126 NR:ns ## KEGG: BF0126 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 7 242 1 236 236 426 99.0 1e-118 MNYTISMTDILLAVSVGCNLWFLFLLLYERIMDTRIVRFFKGIVGLWRSLDGNEAKRIAA HEEVPAEKADIIGKSRFRMASTRTTAAIPTQEAATMEKGIELSEEEATFDDGKTGNASRP AQVPEEKLDETFTSIPPEELGYGDDEPEEDASDTPRASGSSFDEIDDACKTAKNPDATQA EREKAAKVFTDMEGTELYEKMMEGSSEIGIRIKGLIEIRLKKPEKEFVVPDNIEEFDIRN YV >gi|226332129|gb|ACIC01000191.1| GENE 8 5821 - 6138 289 105 aa, chain + ## HITS:1 COG:no KEGG:BF0125 NR:ns ## KEGG: BF0125 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 105 1 105 105 167 100.0 8e-41 MNKNILKNRKAILSAALVIAATASAFAQGNGIAGINEATSMVSSYFDPGTKLIYAIGAVV GLIGGVKVYGKFSSGDPDTSKTAASWFGACIFLIVAATILRSFFL >gi|226332129|gb|ACIC01000191.1| GENE 9 6149 - 6481 185 110 aa, chain + ## HITS:1 COG:no KEGG:BF0124 NR:ns ## 
KEGG: BF0124 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 110 1 110 110 211 99.0 5e-54 MAEYPINKGIGRPVEFKGLKAQYLFIFCGGLLALFVLFVILYMVGIDQWICIGFGAASSS LLVWQTFALNARYGEHGLMKLGAARSHPRYLINRRRIPRLFKRQRKEERQ >gi|226332129|gb|ACIC01000191.1| GENE 10 6478 - 8982 2133 834 aa, chain + ## HITS:1 COG:PSLT088_2 KEGG:ns NR:ns ## COG: PSLT088_2 COG3451 # Protein_GI_number: 17233453 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Salmonella typhimurium LT2 # 432 744 184 483 593 84 25.0 1e-15 MRNTSKMTTLENRFPLLAVEHGCIISKDADITVAFEVELPELYTVTGAEYEAIHSCWCKA IKVLPDYSVVHKQDWFIKERYKPELQKDDMSFLSRSFERHFNERPYLKHTCYLYLTKTTK ERNRMQSNFSTLCRGHIIPKELDRETTTKFLEACEQFERIMNDSGLVRLRRLSTDEIVGT EGNTGLIERYFSLMSEGDTTLQDIELSAREMRIGDNRLCLHTLSDAEDLPGKVATDTRYE KLSTDRSDCRLSFASPVGLLLSCNHIYNQYVLIDNSEETLHKFEKSARNMQSLSRYSRSN SINREWIDQYLNEAHSYGLTSVRAHFNVMAWSDDAEELKHIKNDVGSQLASMECVPRHNT IDCPTLYWAAIPGNAADFPAEESFHTFIEQAVCLFTEETNYRNSLSPFGIKMVDRLTGKP LHLDISDLPMKRGITTNRNKFVLGPSGSGKSFFMNHLVRQYYEQGAHVVLVDTGNSYQGL CGMIRRKTGGADGVYFTYTEEKPISFNPFYTDDYIFDVEKKDSIKTLLLTLWKSEDDKVT KTESGELGSAVSAYIERIQSDRSIVPSFNTFYEYMRDDYRKELAQRDIKVEKSDFNIDNM LTTMRQYYRGGRYDFLLNSTENIDLLGKRFIVFEIDSIKENRELFPVVTIIIMEAFINKM RRLKGVRKQLIVEEAWKALSSANMAEYLRYMYKTVRKYYGEAIVVTQEVDDIISSPVVKE SIINNSDCKILLDQRKYMNKFDQIQALLGLTEKEKGQILSINMANNPSRLYKEVWIGLGG TQSAVYATEVSAEEYLAYTTEETEKVEVYRLAEKLGGDIEAAIRQLAEKRRTKS >gi|226332129|gb|ACIC01000191.1| GENE 11 9017 - 9397 353 126 aa, chain + ## HITS:1 COG:no KEGG:BF0122 NR:ns ## KEGG: BF0122 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 126 1 126 126 251 96.0 6e-66 MNLSKVKMLQVSKCLIGLAVMMLQSCDVVDNRRDMLCGNWESVEGKPDVLIYKEGEAYKV TVFRRSGLRRKLKPETYLLQEENDNLFMNTGFRIDVSYNEATDVLTFSPGGDYVRVKPQP GHPTEE >gi|226332129|gb|ACIC01000191.1| GENE 12 9421 - 10050 641 209 aa, chain + ## HITS:1 COG:no KEGG:BF0121 NR:ns ## KEGG: BF0121 # Name: 
not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 209 1 209 209 386 99.0 1e-106 MRTRITMIICLCLLFAGRASAQWVVSDPGNLAQGIINASKNIIHTSKTATNMVSNFQETV KIYQQGKKYYDALKSVNNLVKDARKVQQTILMVGDITDIYVNSFQRMLRDGNFRPEELSA IAFGYTKLLEESNEVLTELKNVVNITTLSMTDKERMDVVERCYSKMKRYRNLVSYYTNKN ISVSYLRAKKKNDLDRIMGLYGNMNERYW >gi|226332129|gb|ACIC01000191.1| GENE 13 10055 - 11027 832 324 aa, chain + ## HITS:1 COG:no KEGG:BF0120 NR:ns ## KEGG: BF0120 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 313 1 313 334 625 99.0 1e-178 MEFDNLHQILRSLYEQMMPLCGDMAGVAKGIAGLGALFYVAYRVWQSLARAEPIDVFPML RPFAIGLCIMFFPTVVLGTINSILSPVVQGTAKMLEAETLDMNRYREQKDKLEYEAMVRN PETAYLVSNEEFDKQLEELGWSPSDMVTMAGMYIDRGMYNMKKSIRDFFREILELLFQAA ALVIDTVRTFFLVVLAILGPIAFALSVWDGFQSTLTQWICRYIQVYLWLPVSDMFSTILA KIQVLMLQSDIERMQADPNFSLDSSDGVYIVFLCIGIIGYFTIPTVAGWIIQAGGMGGYG RNVNQMAGRAGSMAGSVAGAAAGN Prediction of potential genes in microbial genomes Time: Thu May 12 04:04:08 2011 Seq name: gi|226332128|gb|ACIC01000192.1| Bacteroides sp. 1_1_6 cont1.192, whole genome shotgun sequence Length of sequence - 5509 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 1, operones - 1 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 64 - 687 475 ## BF0119 hypothetical protein 2 1 Op 2 . + CDS 727 - 1014 243 ## BF0118 hypothetical protein 3 1 Op 3 . + CDS 995 - 2341 1192 ## BF0117 hypothetical protein 4 1 Op 4 . + CDS 2389 - 3375 947 ## BF0116 hypothetical protein 5 1 Op 5 . + CDS 3378 - 3953 376 ## BF0115 TraO 6 1 Op 6 . + CDS 3961 - 4863 433 ## BF0114 hypothetical protein 7 1 Op 7 . 
+ CDS 4860 - 5363 515 ## BF0113 hypothetical protein + Term 5448 - 5484 3.0 Predicted protein(s) >gi|226332128|gb|ACIC01000192.1| GENE 1 64 - 687 475 207 aa, chain + ## HITS:1 COG:no KEGG:BF0119 NR:ns ## KEGG: BF0119 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 207 1 207 207 398 98.0 1e-110 MEFKSLKNIESSFRQIRLFGIVFLSLCTVVTVWSVWNSYRFAEKQREKIYVLDNGKSLML ALSQDLSQNRPAEAREHVRRFHELFFTLSPEKSAIEHNVKRALLLADKSVYHYYSDFAEK GYYNRIIAGNINQVLKVDSVVCDFNAYPYRAVTYATQKIIRQSNVTERSLVTTCRLLNAS RSDDNPNGFTIEGFTIIENKDLQTIKR >gi|226332128|gb|ACIC01000192.1| GENE 2 727 - 1014 243 95 aa, chain + ## HITS:1 COG:no KEGG:BF0118 NR:ns ## KEGG: BF0118 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 95 8 102 102 174 90.0 9e-43 MWGVYWKLHDKRKRLVARLKGYLDGLPPETRRRIVLGMLAAFAVLALYTFGRAVYDIGRN DGLRMETGHAGRVELPTPVETDNHLTPYLYGTDEE >gi|226332128|gb|ACIC01000192.1| GENE 3 995 - 2341 1192 448 aa, chain + ## HITS:1 COG:no KEGG:BF0117 NR:ns ## KEGG: BF0117 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 447 1 449 450 761 93.0 0 MEQTKNEPTKENKAAPDTGKPKKGREPLTEAQRLKRQKMIVLPAMVLVFIGAMWLIFAPS SGKEQQPGTDGYNTEMPDADKANRQIIGDKLKAYEHGEMEERLESRNRAIGQLGDMFDRE IAGTEDGTDFDLANPGGKEEKAQPATPQTIQSSAAAYRDLNATLGNFYEQPKNDNAEMDE LLERIASLESELESEKGRTSSMDEQVALMEKSYELAAKYMGGQNGGKPEQAAEPSTVQKG KKNTATPVRQVTRQVVSSLAQPMSNAEFVATFSQERNRGFNTAVGTTEVSDRNTIPACVH GAQSVTDGQTVRLRLLEPMAVAGRTIPRNAVVVGTGKIQGERLDIEVTSLEYDGTIIPVE LAVYDTDGQPGIFIPNSMEMNAVREVAANMGGSLGSSINISTNAGAQLASDLGKGLIQGT SQYIAKKMRTVKVHLKAGYRVMLYQEKD >gi|226332128|gb|ACIC01000192.1| GENE 4 2389 - 3375 947 328 aa, chain + ## HITS:1 COG:no KEGG:BF0116 NR:ns ## KEGG: BF0116 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 328 1 328 328 655 98.0 0 MRKVIIMFALAMGIATANAQENVTVETTKGSEQPTLTKEVYPQKEADGDLYHGLSRKLTF DRMIPPHGLEVTYDKTVHVIFPAEVRYVDLGSPDLIAGKADGAENVIRVKATVRNFPNET 
NMSVITEDGSFYTFNVKYAAEPLLLNVEMCDFIHDGSTVNRPNNAQEIYLKELGSESPML VRLIMKSIHKQNKREVKHIGCKRFGIQYLLKGIYTHNGLLYFHTEIKNQSNVPFDVDYIT WKIVDKKVAKRTAVQEQIILPLRAQNYATLVPGKKSERTVFTMAKFTIPDDKCLVVELNE KNGGRHQSFVIENEDLVRASTINELQVR >gi|226332128|gb|ACIC01000192.1| GENE 5 3378 - 3953 376 191 aa, chain + ## HITS:1 COG:no KEGG:BF0115 NR:ns ## KEGG: BF0115 # Name: not_defined # Def: TraO # Organism: B.fragilis # Pathway: not_defined # 1 191 1 191 191 371 96.0 1e-102 MRKYIAIIIASLALFTGQAHAQRCLPKMHGIEVRADMADGFNPGGKDGGYSFGASLSTYT KRGNKWVFGGEYLLKNNPYKNTKIPVAQFTAEGGYYFKILSDARKIVFVYAGASALAGYE SVNWGTKVLHDGSTLHDRDAFIYGGALTLDVECYVADRIALLANLRERCLWGGDTRKFHT QFGVGIKFIIN >gi|226332128|gb|ACIC01000192.1| GENE 6 3961 - 4863 433 300 aa, chain + ## HITS:1 COG:no KEGG:BF0114 NR:ns ## KEGG: BF0114 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 300 1 300 300 567 91.0 1e-160 MERTEIDAVRRMPLTDFLARLGHEPVRRSSNELWYLAPYRGERTPSFRVNVAKQLWYDFG LGKGGDIFTLAGEFLQSDDFMKQAKFIAEAANMTVAGWEKPAYLPKPTEPVFEDVEVAPL LRSPLTDYLAERGIPIAITSRHCCRLNYSVRGKRYFAVGFLNMAGGYEVRSRFFKGCIPP KDVSLARTKEIPADECLVFEGFMDFLSAVTLGVTGNADCLVLNSVANVEKAAELLDGYGR IGCFLDRDEAGRRTLDALAKRYGARVADRSSLYDGCKDLNEYLQRTTKKQKNNHLKIEEQ >gi|226332128|gb|ACIC01000192.1| GENE 7 4860 - 5363 515 167 aa, chain + ## HITS:1 COG:no KEGG:BF0113 NR:ns ## KEGG: BF0113 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 167 1 167 167 301 92.0 4e-81 MNILNNRNKSTNIFKVVALCLIAAVSFTLVSCDDDMDIQQSYPFTVEVMPVPNKVTKGQT VEIRCELKKEGDFANTLYTIRYFQFEGEGKLKMDNGITFLPNDRYLLENEKFRLYYTAEG EEAHNFIVVVEDNFGNSFELEFDFNNRNVKDDGLTIVPIGNFSPLLK Prediction of potential genes in microbial genomes Time: Thu May 12 04:04:38 2011 Seq name: gi|226332127|gb|ACIC01000193.1| Bacteroides sp. 
1_1_6 cont1.193, whole genome shotgun sequence
Length of sequence - 20647 bp
Number of predicted genes - 13, with homology - 13
Number of transcription units - 5, operones - 4 average op.length - 3.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
- Term 116 - 161 8.0
1 1 Op 1 . - CDS 266 - 2254 1819 ## BT_1871 putative alpha-glucosidase
2 1 Op 2 . - CDS 2287 - 4566 2014 ## COG1472 Beta-glucosidase-related glycosidases
- Prom 4591 - 4650 6.7
3 2 Op 1 . - CDS 4760 - 5743 723 ## BT_1873 endo-arabinase
4 2 Op 2 . - CDS 5802 - 7760 1473 ## BT_1874 hypothetical protein
5 2 Op 3 . - CDS 7772 - 11131 3075 ## BT_1875 hypothetical protein
- Prom 11162 - 11221 4.3
6 3 Op 1 6/0.000 - CDS 11403 - 12416 805 ## COG3712 Fe2+-dicitrate sensor, membrane component
7 3 Op 2 . - CDS 12487 - 13077 363 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
- Prom 13116 - 13175 2.9
- Term 13121 - 13188 13.1
8 4 Tu 1 . - CDS 13205 - 15397 2212 ## COG3537 Putative alpha-1,2-mannosidase
- Prom 15595 - 15654 5.6
+ Prom 15561 - 15620 5.4
9 5 Op 1 . + CDS 15656 - 17131 1335 ## COG0616 Periplasmic serine proteases (ClpP class)
10 5 Op 2 . + CDS 17115 - 17435 316 ## COG0616 Periplasmic serine proteases (ClpP class)
11 5 Op 3 . + CDS 17446 - 18588 711 ## COG1663 Tetraacyldisaccharide-1-P 4'-kinase
12 5 Op 4 . + CDS 18527 - 19336 914 ## COG0005 Purine nucleoside phosphorylase
13 5 Op 5 .
+ CDS 19352 - 20386 1007 ## COG0611 Thiamine monophosphate kinase + Term 20429 - 20474 6.1 + TRNA 20552 - 20627 82.7 # Phe GAA 0 0 Predicted protein(s) >gi|226332127|gb|ACIC01000193.1| GENE 1 266 - 2254 1819 662 aa, chain - ## HITS:1 COG:no KEGG:BT_1871 NR:ns ## KEGG: BT_1871 # Name: not_defined # Def: putative alpha-glucosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 662 1 662 662 1403 99.0 0 MKKLTFLLLCVLCTLSLQAQKQFTLASPDGNLKTTITIGDRLTYDITCNGRQILTPSPIS MTLDNGTVWGENAKLSGTSRKSVDEMIPSPFYRASELRNHYNGLTLRFKKDWNVEFRAYN DGIAYRFVNQGKKPFRVVTEVSDYCFPSDMTASVPYVKSGKDGDYNSQFFNSFENTYTTD KLSKLNKQRLMFLPLVVDAGDGVKVCITESDLENYPGLYLSASEGANRLSSMHAPYPKRT VQGGHNQLQMLVKEHEDYIAKVDKPRNFPWRIAVVTTTDKDLAATNLSYLLGAPSRMSDI SWIKPGKVAWDWWNDWNLDGVDFVTGVNNPTYKAYIDFASANGIEYVILDEGWAVNLQAD LMQVVKEIDLKELVDYAASKNVGIILWAGYHAFERDMENVCRHYAEMGVKGFKVDFMDRD DQEMTAFNYRAAEMCAKYKLILDLHGTHKPAGLNRTYPNVLNFEGVNGLEQMKWSSPSVD QVKYDVMIPFIRQVSGPMDYTQGAMRNASKGNYYPCYSEPMSQGTRCRQLALYVVFESPF NMLCDTPSNYMREPESTAFIAEIPTVWDESIVLDGKMGEYIVTARRKGDVWYVGGITDWS ARDIEVDCSFLGDKSYHATLFKDGVNAHRAGRDYKCESFPIKKDGKLKVHLAPGGGFALK IK >gi|226332127|gb|ACIC01000193.1| GENE 2 2287 - 4566 2014 759 aa, chain - ## HITS:1 COG:PA1726 KEGG:ns NR:ns ## COG: PA1726 COG1472 # Protein_GI_number: 15596923 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Pseudomonas aeruginosa # 21 756 25 761 764 697 50.0 0 MKIRNLVFGAICGIIFFSCSESGGGKEVEMDRFVTDLMGKMTIREKLGQLNLPSGGDLVT GSVMNSELSDMIRKEEIGGFFNVKGIKKIYDLQRLAVEENRLKIPLIVGADVIHGYETIF PIPLALSCSWDTLAIQRMARISAIEASADGICWTFSPMVDICRDARWGRIAEGSGEDPYL GSLLAKAYVHGYQGDSMQGKDEILSCVKHFALYGASEAGKDYNTVDMSHLRMYNEYFAPY RAAVEAGVGSVMSSFNIVDGIPATANKWLLTDVLRDEWGFQGLLVTDYNSIAEMSIHGVA PLKEASVRALQAGTDMDMVSCGFLNTLEESLKEGKVTEAQIDAACRRVLEAKYKLGLFAD PYKYCDTLRAEKELYTPEHRAVAREVAAETFVLLKNENHLLPLEEKGKIALIGPMADARN NMCGMWSMTCTPSGHGTLLEGIRSAVGDKAEILYAKGSNVYYDAEMEKGAVGIRPLERGN DQQLLAEALRTAARADVIVAAVGECAEMSGESPSRTNLEIPDAQQDLLKALVKTGKPVVL 
LLFTGRPLILNWESEHIPSILNVWFGGSETGDAVADVLFGKAVPCGKLTTTFPRSVGQLP LFYNHLNTGRPDPDNRVFNRYASNYLDESNEPLYPFGYGLSYTDFVYGDLQLSSETLPKN GNLTASVTVTNKGNYDGYETVQIYLRDIYAEIARPVKELKGFDRIFLKKGESREVKFVLT EDDLKFYNSGLQYIYEPGEFDVMIGTNSRDVQTKRFIAE >gi|226332127|gb|ACIC01000193.1| GENE 3 4760 - 5743 723 327 aa, chain - ## HITS:1 COG:no KEGG:BT_1873 NR:ns ## KEGG: BT_1873 # Name: not_defined # Def: endo-arabinase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 327 1 327 327 669 99.0 0 MKKLKHSILSLLLLMGACIVSCSNDDGQTSSTDPDEVGGESFTIPVSSLRLRDPFILVDK KTSMYYLHFNNNLKIRVYKSKDLSTWKDEGYSFIAKTDFWGQQDFWAPDVYEYEGRYYLF ATFSNAGVKRGTSILVSDSPKGPFTPLVNKAITPSGWMCLDGSLYIDKEGNPWLLFCREW LETIDGEIYAQRLAKDLKTTEGDPYLLFKASEAPWVGSITSSGVTGNVTDAPFIYRLDDG KLIMLWSSFRKTDGKYAIGQAVSASGNVLGPWVQEPETLNSDDGGHAMVFKDLKGRLMIS YHAPNSQTEHPVITPIYIKDGKFVALN >gi|226332127|gb|ACIC01000193.1| GENE 4 5802 - 7760 1473 652 aa, chain - ## HITS:1 COG:no KEGG:BT_1874 NR:ns ## KEGG: BT_1874 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 652 12 652 652 1300 99.0 0 MKKLAKKLLLLFISIQFVACEGSLEPHLYGVVDTGNAFNTPADLSAATVALYYELRQKGW GPYLFCDGSSFVMDEVSTEEWTTKWSWVAFLNGSWTDADQMVKGFYNWTAPNITRCTYTI AKMEESPVAQSVKDLYIAEVRALRAFFMFDLYRLYGPMPMILEADQAINPDPDYKPYRPT SEEVGTFLTTELRAAADALPVEQAEYGRITKGAALHYLLKYYMHEKQWQNALETANEIIG LNYYELEKDYASIFSAQNEGNKELMFVVRAEPLADYGNHTYANILPGDYASPYGNIVEGW SGHRMPWEFYDTFDENDRRRALAQAEYTSKSGATVDLRASGDVGALPLKYGIDPEATGTW AGNDKVLDRYAEVLLFKAEALNELNGPNQGSVDLINDIRKRAFGFGTSLPAIPVFKESFD GEFVDNVIGIFSMNNYDQAGGSAWKYDVDKNNTLNNGNSLHVEVESSGTEFWTLQMRTEP LVAKGRKYSIKMKLKASKDIQFEIRVEGPLSHMESISLKAGEVKEFSTQTGKATEDQNCA LFLALGNSGSGYELWIDEIEFTAMEQAADGGDAIIKQLSDFPDKESLRDWILKERGWEFW YEGKRREDLIRMGKYVETGKKYSTNFSEKNLLFPIPTSVIIENSHIEQNPHY >gi|226332127|gb|ACIC01000193.1| GENE 5 7772 - 11131 3075 1119 aa, chain - ## HITS:1 COG:no KEGG:BT_1875 NR:ns ## KEGG: BT_1875 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1119 1 1119 1119 2201 99.0 0 
MNYRKKAILMVMALFCLNIAILAQAVSLKMDNVSVKEAMTQLKNKSGYSFVYKVGDLDTK KIITVKAKELNEAIDQILYGQDVVYEVKGKNVIVQKGQARQNTSKDGKKRKITGVVNDVN GEPIIGVTIKEKGTTNGTVSDINAKFTLEVSPGAVLELSYIGYQPLEVKVGNQTDLSIVL YEKQELLDEVVVVGYGTMKKSDLNASIVSVKSEKLNKAASPNFSEMLMGRAAGLTVKQGS AQPGGGIEVLIRGAASTGAGNDPLYVVDGFPIVNAGVNPGSGNQWTAGSNSPLSSINPND IESIEILKDASATAIYGARGANGVVLITTKRGTQDTKVEYNMNMSFQTINKRPELLTAGE LMTTQNSYYKEQYLMQNLVYPYGNTDPGAVSPYVPMYTDEEIAKAGKGTNWYDMITRTGL IQQHNLTVTYGNEKMRSLISLNYFDQKGVVKTSGYKRYSFRYNLDHKITKWWDYGISATA SYVVDQNATLGGGYDSTAGIIESALSYSPTVKAERDLTTGKWMEDPKQALLNHPLSYLDI EDETKTKRFLATAFTNFYFIKDVFWLKLSAGADIRDGSRHSYYPMTTKYGSSVNGDANIN SANREDYVADMVFNFQKTFKERHKLLALLGYSYQVQNDDGAYARAMGFMSDALMYYKLQA GETRPVVSSYKNKHVLASYFGRAQYSFKDKYLFTFTARIDGSDRFGKNNRYAFFPSGAFA WRMNQEDFLKDVNWLSDAKLRVSMGQVGNENLPNDAASEYFAFDGRNYYLGDTEKRGVNL GKFGNPDLKWETTTEVNFGLDFGFFRNRINGSVDVFFKEVKDLLSWRSLPHTAVTTGIWS NIGKTKSTGFELTLNTVNLEGPLHWESTLTYTSYRDRWKERDPKVILAPYVKENDPVSAI YTLIPDGIKQAGEDTPAMPDLLPGQRKYKDVNGLDEEGNLTGKPDGKIDQADVVYLGTRA PKFTMGFSNDFRYKGFDLNIYLYASVGGYSYPYTQVEHGVYGGNGIQRLKDNNNFLADIK NRWTSDNMSSAMPSGEVNSYDSYGAPNWEKNTYLRLKSVTLGYDLSRLFKADKLKVRFYF SGQNLLTFTGYKGLDPEVENDRASYPQQKTFSFGLDVKF >gi|226332127|gb|ACIC01000193.1| GENE 6 11403 - 12416 805 337 aa, chain - ## HITS:1 COG:PA1301 KEGG:ns NR:ns ## COG: PA1301 COG3712 # Protein_GI_number: 15596498 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 132 303 124 292 327 63 27.0 5e-10 MEEENKHIDELIANYLTEGLDKNALDELKTWIAASAENQQYFIRQREIWFSAVSREAASV YDKDKAFENFRNRVESQKEIQSTSRRGFSLSALWRYAAVVAIIIAVGCISYWQGEVNVKD TFADISVEAPLGSKTKLYLPDGTLVWLNAGSRMTYSQGFGVDNRKVELEGEGYFEVKRNE KIPFFVKTKDLQLQVLGTKFNFRDYPEDHEVVVSLLEGKVGLNNLLREEKEAVLSPDERA VLNKANGLLTVESVTASNASQWTDGYLFFDEELLPDIAKELERSYNVKIHIANDSLKTFR FYGNFVRREQNIQEVLEALASTEKMQYKIEERNITIY >gi|226332127|gb|ACIC01000193.1| GENE 7 12487 - 13077 363 196 aa, chain - ## HITS:1 COG:SMc04203 KEGG:ns NR:ns ## COG: SMc04203 COG1595 # 
Protein_GI_number: 15965784 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Sinorhizobium meliloti # 26 183 5 155 159 60 31.0 2e-09 MENTETLIVEQLKIGNENAYRYIYDHHYALLCYVANGYLKDQFLSETIVGDTIFHLWEIR ETLDISVSIRSYLLRAVRNRCINYLNSEREKREIAFSALMPDEITDDKIILSDSHPLGIL LERELENEIYKAIDKLPDECRRVFAKSRFEGKSYEEISGELGISINTVKYHIKNALASLH AHLSKYLISLLLFFFR >gi|226332127|gb|ACIC01000193.1| GENE 8 13205 - 15397 2212 730 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 23 729 37 761 790 488 39.0 1e-137 MKRKSPVMAVALAGLFMYTSCSPTEQAETKDYTQYVNTFIGAADNGHTFPGACYPFGMIQ TSPVTGAVGWRYCSEYTYQDSLIWGFTQTHLNGTGCMDLGDILVMPVTGTRARAWDAYRS HFPKDKEAATPGYYTVELSDPQVKAELTASIHAALHRYTYHKADSASLLIDLQHGPAWRE EQYHSQVNSCEVNWEDAQTLTGHVNNTVWVDQDYFFVMKFNRPVIDSLYLPMGETEKGKR IIATFDLKPGDELMMKVALSTTSVEGAKKNLQAEIPDWNFDGVKLAAHDEWNIYLSRIDV EGTDDEKTNFYTCFYHALIQPNQISDVDGMYRNAADSIVKAGTGTFYSTFSLWDTYRAAH PFYTLMVPERVDGFVNSLIEQGEVQGFLPIWALWGKENFCMIGNHGVSVIAEAYRKGFRG FDAERAFNMVKKTQTVSHPLKSDWEVYTKYGYFPTDLTKAESVSSTLESVYDDYAAADMA RRMGKEEDAAYFAKRADYYKNLFDSQTNFMRPRKADGTWKSPFNPSDVGHAESTGGDYTE GNAWQYTWHVQHDVPGLIALFGGEEPFLNKLDSLFTVKLETTQADVTGLIGQYAHGNEPS HHVTYLYALAGRPERTQELIREIFDTQYKNKPDGLCGNDDCGQMSAWYMFSAMGFYPVDP VSGDYVFGAPQLPKIVLHLADGKTFTVIAENLSKEHKYVDSITLNGEPYTKNTISHEDIL KGGTLVYKMK >gi|226332127|gb|ACIC01000193.1| GENE 9 15656 - 17131 1335 491 aa, chain + ## HITS:1 COG:all4590 KEGG:ns NR:ns ## COG: all4590 COG0616 # Protein_GI_number: 17232082 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Nostoc sp. 
PCC 7120 # 38 489 42 494 609 296 38.0 8e-80 MKDFLKFTLATVTGIILSSIVLFIIGMVTLFGIVSTADTETIVKKNSVMMLDLNGVLVER TQESPLGILSQLFSDDSNTYGLDDILSSIKKAKENENIKGIYLQASMLGTSYASLQEIRN ALLDFKESGKFIIAYGDSYTQGLYYLSSVADKVLLNPKGMIEWKGIASAPLFYKDLLQKI GVEMQIFKVGTYKSAVEPFISTEMSPANREQVTAFINSIWGQVTEGVSASRSLPVDSLNA LADRMLMFYPAEESVQCGLADTLIYRNDVRNYLKQWVDLKEDDRLPVLGLSDMINVKKNM PKDKSGNIVAVYYASGEITDYSGSSTSEEGIVGTKVIRDLRKLKDDEDVKAVVLRVNSPG GSAFASEQIWHAVKELKTEKPVIVSMGDYAASGGYYISCVADTIVAEPTTLTGSIGIFGM VPNVKELSEKIGLTYDVVKTNKFSDFGNIMRPFNQDEKTLMQMMITQGYDTFVNRCAEGR HMSKEAIEENS >gi|226332127|gb|ACIC01000193.1| GENE 10 17115 - 17435 316 106 aa, chain + ## HITS:1 COG:sll1703 KEGG:ns NR:ns ## COG: sll1703 COG0616 # Protein_GI_number: 16330327 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Synechocystis # 1 101 496 603 610 58 34.0 2e-09 MKKIAEGRVWTGEAAKELGLVDVLGGIDTALEIAVRKAGIEGYTVVSYPAKQDLLSSLLN TKPTNYVESQILKSKLGEYYQQFGMLKNLKERSMIQARIPFELNIK >gi|226332127|gb|ACIC01000193.1| GENE 11 17446 - 18588 711 380 aa, chain + ## HITS:1 COG:aq_1656 KEGG:ns NR:ns ## COG: aq_1656 COG1663 # Protein_GI_number: 15606758 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Tetraacyldisaccharide-1-P 4'-kinase # Organism: Aquifex aeolicus # 12 323 6 285 315 129 32.0 1e-29 MDEHFIKIHKWLYPVSWIYGAVVTVRNKLFDWGFLRSKSFGVPVICIGNLSVGGTGKTPH TEYLIKLLRDNYHVAVLSRGYKRHSRGYVLATPQSTARSIGDEPYQMHTKFPSVTLAVDE NRCHGIEQLLSIKEPSIEVVLLDDAFQHRYVKPGLSILLTDYHRLFCDDTLLPAGRLRES VNGKNRAQIVIVTKCPQDIKPIDYNIITKRLNLYPYQQLYFSSFRYGNLQPVFPSANSEI DSTVNELPLSALTNTDILLVTGIASPAPILEELKMYTDQIDSLSFDDHHHFSHRDIQQIK ERFGKLKGEHKLIVTTEKDATRLIHHPVLSEELKPFIYALPIEIEILQNQQDKFNQHIIG YVRENTRNSSFSERENAHQS >gi|226332127|gb|ACIC01000193.1| GENE 12 18527 - 19336 914 269 aa, chain + ## HITS:1 COG:BH1532 KEGG:ns NR:ns ## COG: BH1532 COG0005 # Protein_GI_number: 15614095 # Func_class: F Nucleotide transport and 
metabolism # Function: Purine nucleoside phosphorylase # Organism: Bacillus halodurans # 3 266 6 270 275 287 53.0 1e-77 MLEKIQETAAFLKGKMHTSPETAIILGTGLGSLADEITEKYEIKYSDIPNFPVSTVEGHS GKLIFGKLGNKDIMAMQGRFHYYEGYSMKEVTFPVRVMRELGIKTLFVSNASGGTNEAFE IGDLMIITDHINYFPEHPLRGKNIPYGPRFPDMSEAYDKELIRKADEIAQEKGIKVQHGI YIGTQGPTFETPAEYKLFHILGADAVGMSTVPEVIVANHCGIKVFGISVITDLGVEGKIV EVSHEEVQKAADAAQPKMTTIMRELINRA >gi|226332127|gb|ACIC01000193.1| GENE 13 19352 - 20386 1007 344 aa, chain + ## HITS:1 COG:MTH1396 KEGG:ns NR:ns ## COG: MTH1396 COG0611 # Protein_GI_number: 15679395 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate kinase # Organism: Methanothermobacter thermautotrophicus # 1 340 1 324 327 161 34.0 2e-39 MRTEIATLGEFGLIDRLTEGIKPENESTKYGVGDDAAVLSYPSEKQILVTTDLLMEGVHF DLTYVPLKHLGYKSAVVNFSDIYAMNGTPRQITVSLALSKRFSVEDMEELYSGIRLACQQ YHVDIIGGDTSSSLTGLAISITCIGDADKDKVVYRNGAKETDLICVSGDLGAAYMGLQLL EREKTVLKGEKDVQPDFTGKEYLLERQLKPEARKDIIEKLAAANIVPTSMMDISDGLSSE LMHICKQSNTGCRVYEEHIPIDYQTAVMAEEFNMNLTTCAMNGGEDYELLFTVPIADHEK VSQMEGIRLIGHITKPELGCALITRDGQEFELKAQGWNPLKEDK Prediction of potential genes in microbial genomes Time: Thu May 12 04:05:10 2011 Seq name: gi|226332126|gb|ACIC01000194.1| Bacteroides sp. 1_1_6 cont1.194, whole genome shotgun sequence Length of sequence - 1279 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 27 - 1253 1224 ## COG4833 Predicted glycosyl hydrolase Predicted protein(s) >gi|226332126|gb|ACIC01000194.1| GENE 1 27 - 1253 1224 408 aa, chain - ## HITS:1 COG:lin0763 KEGG:ns NR:ns ## COG: lin0763 COG4833 # Protein_GI_number: 16799837 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosyl hydrolase # Organism: Listeria innocua # 44 406 3 341 341 134 29.0 4e-31 MKNLLITVFFLSLTLQLSAETGGSNPVIVEKTTASLAEKTPVYWQKMADGMSQALIKHFW GANFKGYENRFYFNYGSDLSNMTTNHYWPQAHAMDVMVDAYMRTGSKQYLNIYPLWWEGA PKFNFAGREEDPWWNVFVDDMEWIALAQIRMFESTKNTKYLKKARQTYDDWVWSTWGPED EAPWFGGITWKTDVAKSKNACSNGPAALIATRLYNFYDAMGKKAGKPKQAYLNEAIKIYT WEKNNLFDRQTGAVYDNMNGEGKITKWVFSYNSGTFLGAAHELYKITGDKQYLTDAVKAA NFVIDHLSTNEGVLSDAEGGDGGLFHGIFFRYFVKLINDPALDSANYNKFRDYITHCATV MAEQGVNQKTMLYSGRWRKAPADDESVGLTSHLTGCMLMEAMCVLKHR Prediction of potential genes in microbial genomes Time: Thu May 12 04:05:12 2011 Seq name: gi|226332125|gb|ACIC01000195.1| Bacteroides sp. 1_1_6 cont1.195, whole genome shotgun sequence Length of sequence - 5609 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 137 - 175 7.0 1 1 Op 1 . - CDS 248 - 706 477 ## BT_1884 cold shock protein, putative DNA-binding protein 2 1 Op 2 . - CDS 761 - 1885 874 ## COG0513 Superfamily II DNA and RNA helicases - Prom 1914 - 1973 5.3 3 2 Tu 1 . - CDS 1980 - 3053 791 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase - Term 3367 - 3399 3.3 4 3 Tu 1 . - CDS 3417 - 3719 241 ## COG0724 RNA-binding proteins (RRM domain) - Prom 3883 - 3942 9.4 + Prom 3861 - 3920 7.7 5 4 Tu 1 . + CDS 4039 - 4815 609 ## BT_1888 LuxR family transcriptional regulator + Term 5039 - 5071 -0.2 - Term 4568 - 4598 2.0 6 5 Tu 1 . 
- CDS 4816 - 5400 395 ## BT_1889 hypothetical protein - Prom 5421 - 5480 4.3 Predicted protein(s) >gi|226332125|gb|ACIC01000195.1| GENE 1 248 - 706 477 152 aa, chain - ## HITS:1 COG:no KEGG:BT_1884 NR:ns ## KEGG: BT_1884 # Name: not_defined # Def: cold shock protein, putative DNA-binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 152 1 152 152 238 100.0 7e-62 MAKSITVGKRENEKKRLAKREEKQKKKDSKKLSSKSSFDDMIAYVDENGMITSTPPAENI KKEEINLDEIIIATPKKEDEEPVILRGRVEFFNEARGFGFIKDLAGVDKYFFHVNNVVGN ISEGNIVTFDLERGVKGMNAVNICLEKKPVEN >gi|226332125|gb|ACIC01000195.1| GENE 2 761 - 1885 874 374 aa, chain - ## HITS:1 COG:ECs0875 KEGG:ns NR:ns ## COG: ECs0875 COG0513 # Protein_GI_number: 15830129 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Escherichia coli O157:H7 # 1 369 1 370 455 396 55.0 1e-110 MTFKELNITEPILKAIEEKGYTVPTPIQEKAIPVALAKKDILGCAQTGTGKTASFAIPII QHLHLNKGEGKRSGIKALILTPTRELALQISECIEDYSKYTRIRHGVIFGGVNQRPQVDM LHKGIDILVATPGRLLDLMNQGHIRLDNIQYFVLDEADRMLDMGFIHDIKRILPKLPKEK QTLFFSATMPDTIIALTNSLLKNPLKIYVTPKSSTVDSIKQLVYFVEKKEKSLLLISILQ KSEDRSVLIFSRTKHNADKIVKILGKAGIGSQAIHGNKSQAARQSALGNFKSGKTRVMVA TDIASRGIDINELPLVINYDLPDVPETYVHRIGRTGRAGNAGMALTFCSQEERKQINDIQ KLTGKKLNRADFTI >gi|226332125|gb|ACIC01000195.1| GENE 3 1980 - 3053 791 357 aa, chain - ## HITS:1 COG:CAC3580 KEGG:ns NR:ns ## COG: CAC3580 COG2070 # Protein_GI_number: 15896814 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Clostridium acetobutylicum # 6 345 8 347 355 291 46.0 1e-78 MKSFFIGNIEIKVPVIQGGMGVGISLSGLASAVANEGGVGVISCAGLGLLYPKEKGTYTE KCISGLKEEIRKSRMKTKGIIGVNVMVALSNYADMVRTAINEKIDVIFSGAGLPLDLPFY LTTGSITKLVPIVSSSRAAKIICDKWQKNYNYLPDAIVVEGPKAGGHLGFKKEQLQDQNY ALDVLIPEVVAIAASYKEQKHIPVIAAGGISTGEDIAHFMELGASGVQMGSIFVTTLECD ASETFKEVYIHSKSEDVLIIESPVGMPGRAIDGEFIHSVNNGLEKPRKCSFHCIKTCDYT 
KSPYCIIKALYNAAKGNMKKGYAFAGSNAFLADKISSVKEVMNTLEREFFLATHKLV >gi|226332125|gb|ACIC01000195.1| GENE 4 3417 - 3719 241 100 aa, chain - ## HITS:1 COG:all2777 KEGG:ns NR:ns ## COG: all2777 COG0724 # Protein_GI_number: 17230269 # Func_class: R General function prediction only # Function: RNA-binding proteins (RRM domain) # Organism: Nostoc sp. PCC 7120 # 1 98 1 94 99 77 47.0 6e-15 MNIYISGLSYGTNDADLTNLFAEYGEVSSAKVIFDRETGRSRGFAFVEMTNDAEGQKAID ELNGVEYDQKVISVSVARPRAEKPSYGGNRGGGYNNSRRY >gi|226332125|gb|ACIC01000195.1| GENE 5 4039 - 4815 609 258 aa, chain + ## HITS:1 COG:no KEGG:BT_1888 NR:ns ## KEGG: BT_1888 # Name: not_defined # Def: LuxR family transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 258 1 258 258 513 100.0 1e-144 MREIKLTDITREELWAKQRLSFTDIDYAVWERNKSMLHQFSKMNRNCTFVVDVYKCRYAY ASPNFVDLLGYDAHKIATLERQGDYLESRIHPDDREQLLILQIKLSQFIYSLPPEQRNDY SNIYSFRVLNARQQYVRVVSRHQVLEQTIDGKAWLVIGNMDISPDQQEAETVDCTVLNLK NGEMFSPSPSLQTFNPLTKREAEILRLIQKGLLSKEIADKLCVSIHTVNIHRQNLLRKLG VQNSIEAIRIGYETGILS >gi|226332125|gb|ACIC01000195.1| GENE 6 4816 - 5400 395 194 aa, chain - ## HITS:1 COG:no KEGG:BT_1889 NR:ns ## KEGG: BT_1889 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 194 1 194 194 383 100.0 1e-105 MDILDVARRNQQKAWEIIEKVNVIPIWESIGAQVNLVGSLRMGLLMKHRDIDFHIYTSSL SLADSFRAMAKLAENTSVKKIECVNLLHTVEACVEWHAWYQDSDNALWQIDMIHIRKGSR YDGYFEKVAERISSVLTDEIKRTILQLKNETPESEKIMGVEYYQAVIRDGVRTYAGFEEW RKEHPVGGVLEWMP Prediction of potential genes in microbial genomes Time: Thu May 12 04:05:27 2011 Seq name: gi|226332124|gb|ACIC01000196.1| Bacteroides sp. 1_1_6 cont1.196, whole genome shotgun sequence Length of sequence - 22640 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 8, operones - 5 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 13/0.000 + CDS 3 - 2195 1031 ## COG0642 Signal transduction histidine kinase 2 1 Op 2 . 
+ CDS 2188 - 3510 767 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains
+ Term 3568 - 3599 -0.7
3 2 Tu 1 . + CDS 3792 - 4205 294 ## BF0137 hypothetical protein
+ Term 4296 - 4343 0.9
+ Prom 4207 - 4266 3.6
4 3 Tu 1 . + CDS 4416 - 4970 320 ## BF0136 tetracycline resistance element mobilization regulatory protein RteC
+ Term 5029 - 5088 15.7
+ Prom 5026 - 5085 9.2
5 4 Op 1 . + CDS 5317 - 6447 837 ## COG1373 Predicted ATPase (AAA+ superfamily)
6 4 Op 2 2/0.000 + CDS 6467 - 8248 912 ## COG3593 Predicted ATP-dependent endonuclease of the OLD family
7 4 Op 3 . + CDS 8253 - 10034 450 ## COG0210 Superfamily I DNA and RNA helicases
+ Term 10035 - 10092 12.2
- Term 10097 - 10155 1.3
8 5 Op 1 . - CDS 10179 - 12188 1689 ## COG3505 Type IV secretory pathway, VirD4 components
9 5 Op 2 . - CDS 12221 - 13471 1003 ## BF0132 hypothetical protein
10 5 Op 3 . - CDS 13450 - 13875 275 ## BF0131 hypothetical protein
11 6 Op 1 . + CDS 14586 - 15347 774 ## BF0129 hypothetical protein
12 6 Op 2 . + CDS 15350 - 15790 262 ## BF0128 conjugate transposon protein
13 6 Op 3 . + CDS 15793 - 16140 293 ## BF0127 hypothetical protein
14 6 Op 4 . + CDS 16162 - 16890 503 ## BF0126 hypothetical protein
15 7 Op 1 . + CDS 17093 - 17404 209 ## BF0125 hypothetical protein
16 7 Op 2 . + CDS 17416 - 17748 218 ## BF0124 hypothetical protein
17 7 Op 3 . + CDS 17745 - 19304 899 ## BF0123 hypothetical protein
18 7 Op 4 . + CDS 19362 - 19862 143 ## gi|253572851|ref|ZP_04850250.1| predicted protein
19 7 Op 5 . + CDS 19870 - 21678 377 ## COG3344 Retron-type reverse transcriptase
20 8 Tu 1 .
+ CDS 21779 - 22640 701 ## BF0123 hypothetical protein Predicted protein(s) >gi|226332124|gb|ACIC01000196.1| GENE 1 3 - 2195 1031 730 aa, chain + ## HITS:1 COG:MA2553_2 KEGG:ns NR:ns ## COG: MA2553_2 COG0642 # Protein_GI_number: 20091380 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 250 479 190 428 428 127 32.0 7e-29 NKKIDEFRKEINNINIQMIKFSLFGETILEWNDKDIEHYHAQRMAMDSMLCRFKATYPAE RIDSVRHLLEDKERQMCQIVQILEQQQAINDKIPRQVPVIAQKSVQEQPKKPKRKGFLGI FGKKEEAKPTVTTTMHRSFNRNMRTEQQAQSRRLSVHADSLAARNAELNRQLQGLVVQID GKVQTDLQKREAEITAMRERSFIQIGGLTGFVILLLVISYIIIHRNANRIKRYKQETADL IERLQQMAKRNEALIASRKKAVHTITHELRTPLTAITGYAGLMRKDCNTDKTGTYIRNIQ ESSDRMREMLNTLLNFFRLDNGKEQPNFSACRISAITHTLETEFMPIAINKGLTLTIHCH KDAFVCTDKERILQIGNNLLSNAIKFTENGSVSLRADYDNGLLKLIVEDTGTGMTEEEQQ RVFGAFERLSNAAAKDGFGLGLSIVQRIVTMLGGTIRLESDKGKGSRFTVEIPMQTAEEL PERINQTQVHHNHTFHDVVAIDNDEVLLLMLKEMYAQEGIHCDTCTNAAELMELIRKKEY SLLLTDLNMPDINGFELLELLRTSNVGNSRDIPVIVTTASGSCGKEELIEHGFTECLFKP FSISELMEISNKCAMNAAQNEKPGFTSLLSYGNEAVMLEKLITETEKEMQSLRYAEQRKD LPELDALTHHLRSSWEILRADRPLRELYKLIHHNGTPDDKAIGNAVRAVLDKGSEIIRLA KEERKKYNNG >gi|226332124|gb|ACIC01000196.1| GENE 2 2188 - 3510 767 440 aa, chain + ## HITS:1 COG:ECs4927 KEGG:ns NR:ns ## COG: ECs4927 COG2204 # Protein_GI_number: 15834181 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli O157:H7 # 2 428 4 439 441 281 40.0 2e-75 MDKIKIIVVEDNIVYCEFVCNLLAREGFRTVQAFHLSTARKLLQQAADGDIVVSDLRLPD GNGIDLLRWMRKEGMTQPFIIMTDYAEVHTAVESMKLGSLDYIPKQLVEDKLVPLLRTIL KERNIGRSRMPVFARDGSAFRKIMHRIRLVAPTDMSVLIFGENGTGKEHIAHHLHDKSKR SGKPFVAVDCGSLNKELAPSAFFGHVKGAFTGADSAKKGYFHEAEGGTLFLDEVGNLAPE TQQMLLRAIQERRYRPVGDRTDRSFNVRIIAATNENLEKAVNEKRFRQDLLYRLHDFDIT VPPLRDCQEDIMPLAEFFREIANNELECKVSGFSSEARKALLTHSWPGNVRELRQKIMGA VLQAQTGLVTKEHLELGITETTSITGFSLRNDEEDKERILRALKQADGNRKVAAELLGIG RTTLYNKLEEYGLKYKFEQP 
>gi|226332124|gb|ACIC01000196.1| GENE 3 3792 - 4205 294 137 aa, chain + ## HITS:1 COG:no KEGG:BF0137 NR:ns ## KEGG: BF0137 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 133 1 133 140 192 68.0 5e-48 MGKVQILAVLTMDGCQSSELCCKAYKELRLEDCGINKIRENALYHITPDYSISMLDEWRK STTDICYLAEVTPEKADYINGLLRMRVVDEIILYTIPFIAGTGKRFFQSALPQGQWTLTS QKVYRNGVVRHIYKACV >gi|226332124|gb|ACIC01000196.1| GENE 4 4416 - 4970 320 184 aa, chain + ## HITS:1 COG:no KEGG:BF0136 NR:ns ## KEGG: BF0136 # Name: not_defined # Def: tetracycline resistance element mobilization regulatory protein RteC # Organism: B.fragilis # Pathway: not_defined # 1 181 1 177 201 314 85.0 1e-84 MNYFLLAETEFFRRINEAGDCNMETAYTAFATQVIELCSGNVDTNRTIIALAYIEIELQH HPVRNLPEEKREVAAYISKALSFVRKMQKFLAAPQVPPLIPIRTSSDNTTENPASPLQWT GNAIDLVELIYGINEMGCINNGNMPLKQLAPLLYKIFGVESKDCYRFYIDIKRRKNESRT YRRN >gi|226332124|gb|ACIC01000196.1| GENE 5 5317 - 6447 837 376 aa, chain + ## HITS:1 COG:MT0627 KEGG:ns NR:ns ## COG: MT0627 COG1373 # Protein_GI_number: 15840000 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Mycobacterium tuberculosis CDC1551 # 1 335 1 348 411 107 28.0 3e-23 METVNRILQEKITARIAPNKAVLIFGARRVGKTVMMRKIVDNYSGRTMMLNGEDYDTLAL LENRSIANYRHLLDGIDLLAIDEAQNIPQIGSILKLIVDEIPGISVLASGSSSFDLLNKT GEPLVGRSTQFLLTPFSQREIAQIETPFETRQNLEARLIYGSYPEVVMMENYERKTDYLR DIVGAYLLKDILAIDGLKNSSKMRDLLRLIAFQLGSEVSYDELGKQLSMSKTTVEKYLDL LEKVFVIYRLGAYSRNLRKEVTKAGKWYFYDNGIRNAIIGAFSPLAIRQDVGALWENYII GERRKANFNEGLHKEFYFWRTYDKQEIDLIEENPNNLTALEFKWGNKMPTVPKVFQEAYP HVEFHVVNRENYLEFV >gi|226332124|gb|ACIC01000196.1| GENE 6 6467 - 8248 912 593 aa, chain + ## HITS:1 COG:MA1866 KEGG:ns NR:ns ## COG: MA1866 COG3593 # Protein_GI_number: 20090716 # Func_class: L Replication, recombination and repair # Function: Predicted ATP-dependent endonuclease of the OLD family # Organism: Methanosarcina acetivorans str.C2A # 6 589 1 610 613 320 34.0 4e-87 
MYLSRLHISKFRVFDDITLYFKNGINILIGENNSGKTAIIDALRICLGCGKPDNFIYVQD GDLHVNPENPSEINTVIQFDLIFEFGDASIERECFYDFISQDKDNPDKQTIQLHLKFIQE NNGKKKYFKRIIWGGDNEGQQVPYESLQEIFYTYLSPLRDAVSCLRPYSYDNKTSQLFNQ LTKYDKGNESIPLNEEKKKSLAKNLYQIFENDAYDWKHILTTGKSKVNEHLEGTGITLKH PDIEMRYVGREFSDVVRGIELKCPVYKTVEAGQEQKYFTLSQNGLGENNLIFTSVVLGDL INRCEDHALEIYNALLVEEPEAHLHPQYQNTFFEYLNELQSKGLQVFVTSHSPTITAKSD VNNISILQRKQSIIQSFSFDELSEDDYPKESKRHLRKFLDTTKAQLFFANGVLLVEGVAE AIIIPILAKKFLTEKIDLCKSGIELVNIGGVAFNHFGLLFNNDDERKRLLSKCAIITDSD PKDNGDISDRAQKAKDLEKHNLKVCLATHTLEHDLFEQSERNKAIMRDVYRKIHAQTDDL SGDFNVSTLMKKLKSNKDKAEFALQLCDRLETEVAFDVPDYIKDAILFIAPSE >gi|226332124|gb|ACIC01000196.1| GENE 7 8253 - 10034 450 593 aa, chain + ## HITS:1 COG:MA1864 KEGG:ns NR:ns ## COG: MA1864 COG0210 # Protein_GI_number: 20090714 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Methanosarcina acetivorans str.C2A # 218 593 3 397 418 197 32.0 7e-50 MDIQLSAQQQAVVGCNEPRIVVKACPGSGKTFSVAARMAKLLRENNLSRHRGIAAISFTN TACEVIQKELKETFGCRNIGYPSFIGTIDSFINTYIFLPYAHLVMGCDCRPEIVGTAFNK WFDYDPTQTRYIRGKHGERIITSRDTNYYFDLISFGLNDQLLRLAPYQSYHFGKADWDNP NKKDGSPKKIISELKEMKWKHFNAGKFNQADAIYFTFRILDKYPSITQNLVRRFPILIID EAQDTTELQMAIIDKLSQYGAESIMLIGDPDQAIFEWNTASPHLFMEKYNNPDWHSLDLS ENRRSSEKICQLANRFSGNEMCSIASDKDYTDEPCIKGHLDTPESVNQITNHFIEKCNEM GLDESEYAVVFRGQKFGETNFGLVNNDSSSRQNSPWINGHYGVRDIVYGKYLIDHGKYTD GLSLIEKGCYKITKRVRYVPASTIRKEIGEVGFRRHRAEMVEFIQKLPSTNNTLSNWITE LQQTGINLTVDLGKANVRIESLFRTVNHLFRQERPYLKTIHSVKGMTLEAILVFLSKKAV STNYATILNNPAKYSVDNNEELRIVYVACTRPKKMLWIAVPSDDIDCWRNKLF >gi|226332124|gb|ACIC01000196.1| GENE 8 10179 - 12188 1689 669 aa, chain - ## HITS:1 COG:alr7213 KEGG:ns NR:ns ## COG: alr7213 COG3505 # Protein_GI_number: 17233229 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Nostoc sp. 
PCC 7120 # 202 559 117 466 589 96 26.0 1e-19 MSQQEDDLRALAKIMDFLRAVSIILVVMNVYWFCYEAIRLWGVDIGVVDRILMNFDRTAG LFRSILYTKLFAVLLLALSCLGTKGVKGEKITWGRIWTALAAGFVLFFLNWWILALPLPV EAVTGLYVLTVGTGYVCLLMGGLWMSRLLKHNLMDDVFNNENESFMQETRLIESEYSVNL PTRFYYRKRWNNGWINVVNPFRASIVLGTPGSGKSYAVVNNFIKQQIEKGFSMYVYDFKF SDLSTIAYNHLLNHPEGYKVKPKFYVINFDDPRRSHRCNPIHPDFMEDITDAYESAYTIM LNLNKSWVQKQGDFFVESPIILFAAIIWYLKIFQNGKYCTFPHAIEFLNRRYEDIFPILT SYPELENYLSPFMDAWLGGAAEQLMGQIASAKIPLSRMISPQLYWVMSDSEFTLDINNPE EPKILCVGNNPDRQNIYGAALGLYNSRIVKLINKKGMLKSSVIIDELPTIYFKGLDNLIA TARSNKVAVCLGFQDFSQLVRDYGDKEAKVVMNTVGNIFSGQVVGETAKTLSERFGKVLQ KRQSISINRQDVSTSINTQMDALIPPSKISGLTQGMFVGSVSDNFNERIEQKIFHCEIVV DAEKVKREEKAYKPIPIITDFTDEDGKDCMKETVQANYRRIKEEVKQIVQEELERIANDE NLKHLLQQK >gi|226332124|gb|ACIC01000196.1| GENE 9 12221 - 13471 1003 416 aa, chain - ## HITS:1 COG:no KEGG:BF0132 NR:ns ## KEGG: BF0132 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 416 1 415 415 703 86.0 0 MVAKISVGKSLYGALAYNGEKINEAKGRLLTTNRIYNDGTGTVDIRKAMEGFLACMPEHT RVEKPVLHISLNPHPDDVLTDTELQDIAREYLEKLGYGNQPYLVVKHEDIDRHHLHIVTI NVDEKGRRLNQDFLFRRSDRIRRELEQKYGLHPAERKNQRIENPLRKVDASAGDVKRQIG NTVKALNGQYRFQTMGEYRALLSLYNMTVEEARGNVRGREYHGLVYSVTDDAGNKTGNPF KSSLFGKSAGYEAVQKKFARSKQEIKDRKLADMTKRAVLSVLEGTYDKEKFVSRLREKGI DTVLRYTEEGRIYGATFIDHRTGCVLNGSRMGKELSANALQEHFTLPYAGQPPIPLSVPV ETGEDMRRQTASDHEDTVGGVGLLTPEGPAVDAEEEAFIRAMQRKKKKKRRKGLGM >gi|226332124|gb|ACIC01000196.1| GENE 10 13450 - 13875 275 141 aa, chain - ## HITS:1 COG:no KEGG:BF0131 NR:ns ## KEGG: BF0131 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 141 1 142 142 227 91.0 9e-59 MKEKRNKTGRNPKLDPAVFRYTVRFNEEEHNRFLAMFEKSGVYAKSVFIKAHFFGQPFRV LKVDRTLVDYYTRLSDFHAQFRAVGTNYNQVVKELRLHFSEKKAMALLYKLEQQTVELVK LSRQIVELSREMEAKWSQKSV >gi|226332124|gb|ACIC01000196.1| GENE 11 14586 - 15347 774 253 aa, chain + ## HITS:1 COG:no KEGG:BF0129 NR:ns ## KEGG: BF0129 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # 
Pathway: not_defined # 1 253 97 349 349 441 83.0 1e-123 MSNETFVAFATQKGGIGKSTVTALAANYLHNVKGHNVAVIDCDAPQHSIHGLRERETGLI GGSLYFKALACDHFRKIRKNAYPVIASDALNALDDAERMLAEEKVKPDVVFFDMPGTLKS NGVVKTLSQMDYIFAPMSADRFVVESTLQFAVMFRDNLMTTGQAKTKGLYLFWTMVDGRE KNGLYDLYGDVIAEMGLPVLSTRLPDSKKFRRDLSEERKSVFRSTIFPMDASLLKGSGIR EFSEEISRIIRPQ >gi|226332124|gb|ACIC01000196.1| GENE 12 15350 - 15790 262 146 aa, chain + ## HITS:1 COG:no KEGG:BF0128 NR:ns ## KEGG: BF0128 # Name: not_defined # Def: conjugate transposon protein # Organism: B.fragilis # Pathway: not_defined # 1 146 1 146 146 197 80.0 8e-50 MGSRKVNTEGIDEELLIASIGRRRQDGTLYHAQEPPAPAPEEESVPETEPPPVQYTAKEK PQRDTVRRKRQEDDYSGLFLRRNEIKTRQCVYISRDVHSKILKIVNDIAGREISVGGYVD TVLRQHLEQHKEKINELYKKQREDLI >gi|226332124|gb|ACIC01000196.1| GENE 13 15793 - 16140 293 115 aa, chain + ## HITS:1 COG:no KEGG:BF0127 NR:ns ## KEGG: BF0127 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 115 4 117 117 190 81.0 1e-47 MEKNQKNRSPQHDGGGMLAQVQASVEILSPVPLCGKCGEKDYERLFIREAEVKAREGKMA YVRPEYHDRIMRITRVIGHDRLSLSAYIDHVLTHHFNQCEEAIKSLYARNYDAVF >gi|226332124|gb|ACIC01000196.1| GENE 14 16162 - 16890 503 242 aa, chain + ## HITS:1 COG:no KEGG:BF0126 NR:ns ## KEGG: BF0126 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 7 242 1 236 236 333 77.0 2e-90 MSETISMTDILLTVSVGCNLWFLFLLLYDRIMETRLVRFLRSIAGVWRSLGGTAAKPEAV REIPPADTPDIIGKSRFRMASTRTTAAIPTQEAATSEKGIELTEEEATFDDGNTETVSRP AQIPEDKLDETFTSLTPSELEFGEDEPEEETPDAPRASGSSFDEIDDAVRTAKNPEATTT ERERAAKVFTDMEGTELYEKLMTGSSEMSIRIKGLIEIRLKKPKKDFVVPDNIEDFDIRN YV >gi|226332124|gb|ACIC01000196.1| GENE 15 17093 - 17404 209 103 aa, chain + ## HITS:1 COG:no KEGG:BF0125 NR:ns ## KEGG: BF0125 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 103 1 105 105 154 91.0 1e-36 MNKNNRQKLILSAALLVAVTASAFAQGNGIAGINEATSMVSSYFDPGTKLIYAIGAVVGL IGGVKVYGKFSSGDPDTSKTAASWFGACIFLIVAATILRSFFL >gi|226332124|gb|ACIC01000196.1| GENE 16 17416 
- 17748 218 110 aa, chain + ## HITS:1 COG:no KEGG:BF0124 NR:ns ## KEGG: BF0124 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 108 1 108 110 207 95.0 9e-53 MAEYPINKGIGRPVEFKGLKAQYLFIFCGGLLALFVLFVILYMVGIDQWVCIGFGVASSS VLVWQTFALNARYGEHGLMKLGATRSHPRYLINRRRITRLFKRKRKEETT >gi|226332124|gb|ACIC01000196.1| GENE 17 17745 - 19304 899 519 aa, chain + ## HITS:1 COG:no KEGG:BF0123 NR:ns ## KEGG: BF0123 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 512 1 512 834 1026 95.0 0 MRNTSKMTTLENKFPLLAVEQGCIISKDADITVAFEVELPELYTVTGAEYEAIHGCWCKA IKVLPDFSVVHKQDWFIKEKYRPELQKEDMSFLSRSFERHFNERPYLKHTCYLYLTKTTK ERNRMQSNFSTLCRGHIIPKELDKETAAKFMEAAEQFERIINDSGFVRLRRLSTDEIVGT EGKTGLIERYFSLMPEGDATLQDIDLSAREMRIGDNRLCLHTLSDAEDLPGTVATDTRYE KLSTDRSDCRLSFASPVGLLLSCNHIYNQYVIIDNSEENLQKFEKSARNMQSLSRYSRSN SINREWIDRYLNEAHSYGLTSVRAHFNVMVWSDDAEELKHIKNDVGSQLASMECVPRHNT IDCPTLYWAAMPGNAADFPAEESFHTFIEQAVCLFTEETNYRSSLSPFGIKMVDRLTGKP LHLDISDLPMKRGITTNRNKFVLGPSGSGKSFFMNHLVRQYYEQGAHVVLVDTGNSYQGL CEMIRRKTGGTDGVYFTYTEEKPISFNPFYTDGAPIMEA >gi|226332124|gb|ACIC01000196.1| GENE 18 19362 - 19862 143 166 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253572851|ref|ZP_04850250.1| ## NR: gi|253572851|ref|ZP_04850250.1| predicted protein [Bacteroides sp. 
1_1_6] # 46 166 1 121 121 224 100.0 2e-57 MKRKPTPSQLILKGNPRGSAACWKSDKSTNCHVATELRGGTIDTRMKPDWLNDSSVANEE AMAKDNYIRHTVIPFLSRSGEEARTNYSTTVINGEKSERRIRQSITLIVITLTKWGLPKP KRRKAIEFALEYSAWQRSFHSSPSRGKPCTWRREAVDNFNINQRKT >gi|226332124|gb|ACIC01000196.1| GENE 19 19870 - 21678 377 602 aa, chain + ## HITS:1 COG:Q0050 KEGG:ns NR:ns ## COG: Q0050 COG3344 # Protein_GI_number: 6226520 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Saccharomyces cerevisiae # 28 601 255 826 834 298 35.0 2e-80 MRNPEKVLNTLCLHSKVSDYKYERLYRILFNEDMFYVAYQRIYAKQGNMTSGADGKTIDQ MSIQRIEHIIESLKNETYQPKPAKRVYIPKKNGGNRPLGIPSFEDKLVQEVARMILEAIY EGHFESTSHGFRPYKSCHTALTHIQHQFTGTRWFIEGDIKGFFDNINHNILIDTLRERIS DERFLRLVRKFLNAGYVDNWKYNRTYSGTPQGGIISPILANIYLDKFDKYINEYIERFNK GEKKRRNAVYFQKNTQAVNLRKKLKVETDPCVKAELTEKAKKLQIEMRNIPCSHDMDENY RRMKYVRYADDFIIGVIGSKVDCEKIKADITKYMSDVLNLELSAEKTLITHAQDTAKFLG YELSVRKSYAMKRNRKGVLQRDFNGRVVLTLPIETVKKKLKEYDAISFEQANGKEIWKPK SRSHLTAMQPHDILAQFNWEIRGFYNYYSIANNVSATCSKFGYIMEYSMYLTLAQKLRST ISKLKKKYETNKQFIIPYKDDKGRQQYRILYDDGFKRQTTNANTNCDTQPFTVIVPPPTL VERLKTNRCELCGVESPTVMHHVRTLKTLSKEYEWNRIMLNRNRKTLAVCPSCNAKIQEH EK >gi|226332124|gb|ACIC01000196.1| GENE 20 21779 - 22640 701 287 aa, chain + ## HITS:1 COG:no KEGG:BF0123 NR:ns ## KEGG: BF0123 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 6 287 514 795 834 534 96.0 1e-150 MPSLHYVFDVEKKDSIKTLLLTLWKSEDDKVTKTESGELGSAVSAYIERIRADRSIVPSF NTFYEYMRDDYRRELAERDIKVEKEDFNIDNMLTTMRQYYRGGRYDFLLNSAENIDLLGK RFIVFEIDSIKDNRELFPVVTIIIMEAFINKMRRLKGVRKQLIVEEAWKALSSANMADYL RYMYKTVRKYFGEAIVVTQEVDDIISSPVVKESIINNSDCKILLDQRKYMNKFDQIQALL GLTEKEKSQILSINMANNPSRLYKEVWIGLGGTQSAVYATEVSAEEY Prediction of potential genes in microbial genomes Time: Thu May 12 04:06:14 2011 Seq name: gi|226332123|gb|ACIC01000197.1| Bacteroides sp. 
1_1_6 cont1.197, whole genome shotgun sequence Length of sequence - 2162 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 66 - 125 2.5 1 1 Op 1 . + CDS 153 - 533 439 ## BF0122 hypothetical protein 2 1 Op 2 . + CDS 557 - 1186 706 ## BF0121 hypothetical protein 3 1 Op 3 . + CDS 1190 - 2162 760 ## BF0120 hypothetical protein Predicted protein(s) >gi|226332123|gb|ACIC01000197.1| GENE 1 153 - 533 439 126 aa, chain + ## HITS:1 COG:no KEGG:BF0122 NR:ns ## KEGG: BF0122 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 126 1 126 126 235 87.0 5e-61 MSISKIKMLQVSKCLIGLAVMVLQSCDVAENRRDLLCGKWESVEGKPDVLIYKEGEAYKV TVFKRSGIRRRLKPETYLLQEENGNLFMNTGFRIDVAYNEATDVLTFSPNGDYVRVKPQP ETPAEK >gi|226332123|gb|ACIC01000197.1| GENE 2 557 - 1186 706 209 aa, chain + ## HITS:1 COG:no KEGG:BF0121 NR:ns ## KEGG: BF0121 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 209 1 209 209 369 93.0 1e-101 MRTRITFVICLCLLFAGRASAQWVVSDPGNLAQGIINASKNIIHTSKTATNMVNNFQETV KIYEQGKKYYDALKSVNNLVKDARKVQQTILMVGDITDIYVTSFQKMLRDDNFTVEELGA IAFGYTKLLEESNDVLTELKNVVNITTLSMTDKERMDVVERCHSKMKRYRNLVSYYTNKN ISVSYLRAKKKNDLDRIMGLYGSMNERYW >gi|226332123|gb|ACIC01000197.1| GENE 3 1190 - 2162 760 324 aa, chain + ## HITS:1 COG:no KEGG:BF0120 NR:ns ## KEGG: BF0120 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 313 1 313 334 600 93.0 1e-170 MEFDNLHQILRSLYEQMMPLCADMAGVAKGIAGLGALFYVAYRVWQSLARAEPIDVFPML RPFAVGLCIMFFPTVVLGTINSVLSPVVQGTAKLLETQTLDMNKYREQKDRLEYEAMVRN PETAYLVSNEEFDKQLEELGWSPSDMVTMAGMYIDRGMYKMKKGIRDFFREILELMFQAA ALVIDTIRTFFLVVLAILGPIAFAISVWDGFQSTLTQWICRYIQVYLWLPVSDIFSTILA KIQVLMLQSDIERMQTDPNFSLDSSDGVYIVFMIIGIIGYFTIPTVAGWIIQAGGMGGYG RNVNQMAGRAGSMAGSVAGATAGN Prediction of potential genes in microbial genomes Time: Thu May 12 04:06:26 2011 Seq name: 
gi|226332122|gb|ACIC01000198.1| Bacteroides sp. 1_1_6 cont1.198, whole genome shotgun sequence Length of sequence - 5470 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 1, operones - 1 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 18 - 641 571 ## BF0119 hypothetical protein 2 1 Op 2 . + CDS 648 - 968 217 ## BF0118 hypothetical protein 3 1 Op 3 . + CDS 937 - 2307 1387 ## BF0117 hypothetical protein 4 1 Op 4 . + CDS 2352 - 3338 851 ## BF0116 hypothetical protein 5 1 Op 5 . + CDS 3341 - 3916 422 ## BF0115 TraO 6 1 Op 6 . + CDS 3916 - 4827 458 ## BF0114 hypothetical protein 7 1 Op 7 . + CDS 4824 - 5327 568 ## BF0113 hypothetical protein Predicted protein(s) >gi|226332122|gb|ACIC01000198.1| GENE 1 18 - 641 571 207 aa, chain + ## HITS:1 COG:no KEGG:BF0119 NR:ns ## KEGG: BF0119 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 207 1 207 207 391 96.0 1e-108 MEFKSLKNIESSFRQIRLFGIVFLSLCAVITVWSVWSSYRFAERQREKIYVLDNGKSLML ALSQDLSQNRPAEAREHVRRFHELFFTLSPEKSAIEHNVKRALLLADKSVYNYYSDFAEK GYYNRIIAGNINQVLKVDSVVCDFNGYPYRAVTYATQKIIRQSNVTERSLVTTCRLLNSS RSDDNPNGFTIEGFTIIENKDLQTIKR >gi|226332122|gb|ACIC01000198.1| GENE 2 648 - 968 217 106 aa, chain + ## HITS:1 COG:no KEGG:BF0118 NR:ns ## KEGG: BF0118 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 102 1 102 102 158 75.0 5e-38 MKKIKKSFWGAYWKLHDKKKMLVARLRGYLDGLPPKTRRRIVLAMLAAFATLALYTFGKA VYDIGRNDGSRMVTDHAGQVELSVQQKTENHLIPYLYGTDKERNED >gi|226332122|gb|ACIC01000198.1| GENE 3 937 - 2307 1387 456 aa, chain + ## HITS:1 COG:no KEGG:BF0117 NR:ns ## KEGG: BF0117 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 453 1 447 450 654 81.0 0 MEQTKNETKIENKTESKPENKTAPGNDKPKKERKPLTEAQRLKRQKMIVLPAMVLVFIGA MWLIFAPSSGKEQEPGTSGYNIEMPDADKENRRIIGDKAKAYEQGAMEERQENRSRAMQQ LGDLFDRETAAADGDSDFDLANPGGTEEAVKPAPKTIQSSAAAYRDLNATLGNFYEQPKN 
DNAEMDELLERIATLESELESGKERSSTMDEQVALMEKSYELAAKYMGGQNGTQAAGQTT EPSPVQKAGKNTAKPVRQVTHQVVSSLGQPVSNAEFVASFSQERNRSFNTAVGITTVSDR NTIPACVYGAQSVTDGQAVRLRLLEPMAVADRIIPRNAVVVGAAKIQGERLAIEITSLEH DGTVIPVELEVYDTDGQPGIFIPNSMEMNAVREVAANMGGSLGSSINISTNAGAQLASDL GKGLIQGTSQYIAKKMRTVKVHLKAGYRVMLYQGVN >gi|226332122|gb|ACIC01000198.1| GENE 4 2352 - 3338 851 328 aa, chain + ## HITS:1 COG:no KEGG:BF0116 NR:ns ## KEGG: BF0116 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 328 1 328 328 634 94.0 1e-180 MKKVIMMFALAMGIATANAQENVTVGQSNGSDQPTLTKEVYPQKEADGDLYHGLTRKLGF DRMVPPHGLEVTYDKTVHVIFPAEVRYVDLGSPDLIAGKADGAENVIRVKATVRNFPNET NMSVITEDGSFYTFNVKYAAEPLLLNVEMCDFIHDGEAVNRPNNAQEIYLKELGSESPML VRLIMKSIHKQNKREVKHIGCKRFGIQYLLKGIYTHNGLLYFHTEIRNQSNVPFDVDYIT WKIVDKKVAKRTAVQERVILPLRAQNYATFVPGKKSERTVFTMAKFTIPDGKCLVVELNE KNGGRHQSFVIENEDLVRANTINELQVR >gi|226332122|gb|ACIC01000198.1| GENE 5 3341 - 3916 422 191 aa, chain + ## HITS:1 COG:no KEGG:BF0115 NR:ns ## KEGG: BF0115 # Name: not_defined # Def: TraO # Organism: B.fragilis # Pathway: not_defined # 1 191 1 191 191 356 92.0 3e-97 MRKHIAIIIASLALFTGQAHAQRCLPKMQGIEVRANLADGFKPGGNDGGYSFGAALSTYT KKGNKWVFGGEYILKNNPYKDTSIPVAQFTAEGGYYFKILSDARKIVFVYAGASALAGYE SVNWGEKVLHDGSTLHDRDAFVYGGALTLDVEFYVADRIALLANLRERCLWGGDTKKFHT QWGMGIKFIIN >gi|226332122|gb|ACIC01000198.1| GENE 6 3916 - 4827 458 303 aa, chain + ## HITS:1 COG:no KEGG:BF0114 NR:ns ## KEGG: BF0114 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 4 303 1 300 300 469 75.0 1e-131 MRRMERTDIDTMRQISLADFLARLGHEPVRRSGNELWYHAPYRNERTPSFRVNVAKRLWY DFGLGKGGDIFTLAGEFARSGDFMAQAGFIAETVRMPFVSAKKPLYLPEPSEPAFEGVEA VPLLRSPLTDYLAERGIPYAVASRHCCRLNYGVRGKRYFTVGFPNVAGGYEIRSRYFKGC IPPKDVSLIKPEDTASDVCSVFEGFMDFLSADTLGIGGNGDSLVLNSVANVGKAVKHLDG YGRIDCFLDRDESGRRTLEVLKGHYGGRVCDRSALYDGCKDLNEYLQRTAKKEMNNNLKI KGQ >gi|226332122|gb|ACIC01000198.1| GENE 7 4824 - 5327 568 167 aa, chain + ## HITS:1 COG:no KEGG:BF0113 NR:ns ## KEGG: BF0113 # Name: not_defined # 
Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 165 1 165 167 287 86.0 8e-77 MNILNNKNKRTSIFKALALCLFAAVSLTLASCDDDMDIQQSYPFTVETMPVPNKVTKGQT VEIRCELKKTGEFANTLYTIRYFQFEGEGTLKMDNGITFLPNDRYLLENEKFRLYYTAEG DEAHNFIVVVEDNFKNSYELEFNFNNRNVKDDIQTVIPIGNYKPLPR Prediction of potential genes in microbial genomes Time: Thu May 12 04:06:51 2011 Seq name: gi|226332121|gb|ACIC01000199.1| Bacteroides sp. 1_1_6 cont1.199, whole genome shotgun sequence Length of sequence - 853 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 768 408 ## BVU_2295 hypothetical protein Predicted protein(s) >gi|226332121|gb|ACIC01000199.1| GENE 1 3 - 768 408 255 aa, chain - ## HITS:1 COG:no KEGG:BVU_2295 NR:ns ## KEGG: BVU_2295 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 12 247 42 274 322 75 27.0 1e-12 MKCLNFGSEEAPKWYFATTILINGEELEIIPAFTDYSLKHKNFEKDKVFKAIDKSKLLPT VFCFCDAKPFYAENVPLYDYVGNKIKNLPDNAPAVMVYLDKADNVNLLGLTDEPLQAELV ECDSVSDAYRRVATTAYLSQCPSKDEWIGYAGIVTGDELFLNIRKFGIMYSMSGTAVQGY FGISTTVSLLQSKALAMSSSLFKEEYRTYAQAEQLMKATVQAFGVKAAKQTRYIKAINYC ISQYDFDTVCNVLNS Prediction of potential genes in microbial genomes Time: Thu May 12 04:07:08 2011 Seq name: gi|226332120|gb|ACIC01000200.1| Bacteroides sp. 1_1_6 cont1.200, whole genome shotgun sequence Length of sequence - 45241 bp Number of predicted genes - 48, with homology - 47 Number of transcription units - 18, operones - 9 average op.length - 4.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 1454 558 ## Slin_5872 integrase family protein
- Term 1771 - 1819 4.9
2 2 Op 1 32/0.000 - CDS 1867 - 2556 897 ## COG0704 Phosphate uptake regulator
3 2 Op 2 41/0.000 - CDS 2615 - 3373 211 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16
4 2 Op 3 38/0.000 - CDS 3381 - 4256 771 ## COG0581 ABC-type phosphate transport system, permease component
5 2 Op 4 . - CDS 4258 - 5454 1013 ## COG0573 ABC-type phosphate transport system, permease component
- Prom 5549 - 5608 7.4
+ Prom 5552 - 5611 6.9
6 3 Op 1 . + CDS 5631 - 6443 777 ## COG0226 ABC-type phosphate transport system, periplasmic component
7 3 Op 2 . + CDS 6495 - 8234 1832 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases
+ Prom 8254 - 8313 4.9
8 4 Op 1 . + CDS 8439 - 9875 1473 ## COG0457 FOG: TPR repeat
9 4 Op 2 . + CDS 9906 - 10547 622 ## COG0586 Uncharacterized membrane-associated protein
+ Term 10763 - 10815 8.0
10 5 Tu 1 . - CDS 10656 - 11540 874 ## BT_1328 hypothetical protein
- Prom 11614 - 11673 3.6
11 6 Tu 1 . - CDS 11693 - 12196 526 ## COG2077 Peroxiredoxin
- Prom 12315 - 12374 5.1
+ Prom 12165 - 12224 8.9
12 7 Op 1 . + CDS 12334 - 12897 651 ## COG3247 Uncharacterized conserved protein
13 7 Op 2 . + CDS 12981 - 13460 719 ## BT_1331 hypothetical protein
14 7 Op 3 . + CDS 13467 - 14135 628 ## COG0325 Predicted enzyme with a TIM-barrel fold
15 7 Op 4 . + CDS 14168 - 15148 1027 ## COG0167 Dihydroorotate dehydrogenase
- Term 14966 - 15007 -0.8
16 8 Op 1 1/0.000 - CDS 15255 - 15425 131 ## COG1875 Predicted ATPase related to phosphate starvation-inducible protein PhoH
17 8 Op 2 . - CDS 15394 - 16581 1276 ## COG1875 Predicted ATPase related to phosphate starvation-inducible protein PhoH
- Prom 16696 - 16755 6.8
+ Prom 16480 - 16539 5.6
18 9 Tu 1 . + CDS 16713 - 18188 1061 ## COG0285 Folylpolyglutamate synthase
- Term 18085 - 18133 -0.9
19 10 Tu 1 . - CDS 18156 - 18530 404 ## COG0251 Putative translation initiation inhibitor, yjgF family
- Prom 18568 - 18627 3.9
- Term 18566 - 18610 11.2
20 11 Tu 1 . - CDS 18634 - 19107 441 ## COG3152 Predicted membrane protein
- Prom 19216 - 19275 6.7
21 12 Tu 1 . + CDS 19438 - 19590 80 ##
- Term 19948 - 19992 6.0
22 13 Tu 1 . - CDS 20092 - 20424 151 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes
- Prom 20482 - 20541 5.1
- Term 20820 - 20866 7.0
23 14 Op 1 12/0.000 - CDS 20970 - 21881 571 ## COG3958 Transketolase, C-terminal subunit
24 14 Op 2 . - CDS 21878 - 22687 739 ## COG3959 Transketolase, N-terminal subunit
25 14 Op 3 . - CDS 22677 - 23168 286 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis
- Prom 23250 - 23309 3.6
- Term 23268 - 23305 3.2
26 15 Tu 1 . - CDS 23345 - 24103 328 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases)
- Prom 24145 - 24204 2.5
- Term 24352 - 24406 -0.1
27 16 Op 1 . - CDS 24563 - 25102 233 ## COG1045 Serine acetyltransferase
28 16 Op 2 . - CDS 25107 - 26078 225 ## EcE24377A_2321 hypothetical protein
29 16 Op 3 . - CDS 26091 - 27599 295 ## BCE_5386 polysaccharide transport protein, putative
30 16 Op 4 . - CDS 27617 - 28732 100 ## gi|253572895|ref|ZP_04850293.1| predicted protein
- Prom 28759 - 28818 2.7
31 17 Op 1 25/0.000 - CDS 28852 - 29967 319 ## COG0438 Glycosyltransferase
32 17 Op 2 . - CDS 29960 - 31123 549 ## COG0438 Glycosyltransferase
33 17 Op 3 11/0.000 - CDS 31138 - 32022 693 ## COG1209 dTDP-glucose pyrophosphorylase
34 17 Op 4 . - CDS 32028 - 32972 516 ## COG1091 dTDP-4-dehydrorhamnose reductase
35 17 Op 5 . - CDS 32932 - 33180 299 ## gi|253572900|ref|ZP_04850298.1| predicted protein
36 17 Op 6 . - CDS 33183 - 33755 400 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes
37 17 Op 7 3/0.000 - CDS 33796 - 34518 386 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases)
38 17 Op 8 . - CDS 34515 - 35729 319 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II
39 17 Op 9 . - CDS 35737 - 35961 301 ## BDI_0568 putative acyl carrier protein
40 17 Op 10 1/0.000 - CDS 35965 - 37011 235 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III
41 17 Op 11 . - CDS 37015 - 37761 289 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17
42 17 Op 12 . - CDS 37765 - 38001 302 ## BDI_3629 putative acyl carrier protein
43 17 Op 13 . - CDS 37989 - 38549 388 ## PHZ_c0012 acetyltransferase, GNAT family
44 17 Op 14 . - CDS 38551 - 39681 865 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis
- Prom 39708 - 39767 6.4
45 18 Op 1 8/0.000 - CDS 39772 - 41085 1003 ## COG1004 Predicted UDP-glucose 6-dehydrogenase
46 18 Op 2 3/0.000 - CDS 41106 - 42158 519 ## COG0451 Nucleoside-diphosphate-sugar epimerases
47 18 Op 3 . - CDS 42155 - 44077 890 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases
48 18 Op 4 .
- CDS 44106 - 45227 673 ## BT_1355 hypothetical protein Predicted protein(s) >gi|226332120|gb|ACIC01000200.1| GENE 1 2 - 1454 558 484 aa, chain - ## HITS:1 COG:no KEGG:Slin_5872 NR:ns ## KEGG: Slin_5872 # Name: not_defined # Def: integrase family protein # Organism: S.linguale # Pathway: not_defined # 18 433 10 428 433 138 28.0 5e-31 MNNKAQTGQIFFNEVQAKFNLREPKSNKPTNIYLVCRIDRKQIRLSTGVKVYPEHWNVKK QEAYISCRLSELDNLNNTIVNTKIIEIKNRFLQYKRYLCDNPNEIGNSVKILKKFIYKDT MEKEKQNINAVHWLRNTLALDRTIKDSTRADYVKQIKFFEAFLNEVGKYPISFSDINLPL IKDYESYLFNKEVGKGKTTKTTTVGNKVEKIICILKRAEQQGMIDIHESKLDKYKKPQSR QGDENEIYLTEDEIDKIYALRLTGREEEVRDLFVLQCWIGQRFSDTQAINEGIIKEAPNG KGKVIEIVQEKKTHRVSIPLLPVAIDILNKYKNGFPIYTNQTALNYLKNIGEKAGITRLH NVTEDRGGEVVTTQVKAYELIGTHTARRSFICNMLKHGYDSHIIMKITGHNDAKSFKKYV RLTSEDAALLMLETESTKVRQSDKVPTTISQEGNKEAINILKQYQNTINGITFDTLLDTQ FLAS >gi|226332120|gb|ACIC01000200.1| GENE 2 1867 - 2556 897 229 aa, chain - ## HITS:1 COG:RSc1533 KEGG:ns NR:ns ## COG: RSc1533 COG0704 # Protein_GI_number: 17546252 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate uptake regulator # Organism: Ralstonia solanacearum # 6 221 11 225 235 102 35.0 4e-22 MVKFIESELILLKKEVDEMWTLVYNQLDRAGEAVMTLDKELAQQVMVRERRVNAFELKID SDIEDIIALYNPVAIDLRFVLAMLKINTNLERLGDFAEGIARFVIRCKEPVLDADLLTRL RLGEMQAEVLSMLELAKRALNEENIEMATVVFGKDNLLDDINAEATGILADYIREHPETA HTCVDLVGVFRKLERSGDHITNIAEEIVFFIDAKVLKHSGKVDENYPKS >gi|226332120|gb|ACIC01000200.1| GENE 3 2615 - 3373 211 252 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 1 241 7 237 318 85 26 4e-16 MDTIKIDARDVNFWYGDFHALKGISMEIEEKSVVAFIGPSGCGKSTFLRLFNRMNDLIPA TRLEGEIRIDGHNIYAKGMEVDELRKNVGMVFQRPNPFPKSIFENVAYGLRVNGVKDNAF IRQRVEETLKGAALWDEVKDKLKESAYALSGGQQQRLCIARAMAVSPSVLLMDEPASALD PISTAKVEELIHELKKDYTIVIVTHNMQQAARVSDKTAFFYLGEMVEYGDTKRIFTNPEK EATQNYITGRFG >gi|226332120|gb|ACIC01000200.1| GENE 4 3381 - 4256 771 291 aa, chain - ## HITS:1 COG:MA0889 KEGG:ns NR:ns ## COG: MA0889 COG0581 # 
Protein_GI_number: 20089773 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Methanosarcina acetivorans str.C2A # 13 289 30 306 307 268 51.0 9e-72 MEIRSNNKAKHRSQRLAFGIFRLLSLCIVLILFAILGFIIYKGAGAISWDFITSAPTDGM TGGGIWPAIVGTFYLMVGSALFAFPVGVMSGIYMNEYAPKGKLVRFIRVMTNNLSGIPSI VFGLFGMALFVNYMGFGDSILAGSLTLGLLCVPLVIRTTEEALKAIPDSMREGSRALGAT RLQTIWHVILPMGMPNIITGLILALGRVSGETAPILFTCAAYFLPQLPTSILDQCMALPY HLYVISTSGTDMEAQLPLAYGTALVLIVIILLVNLLANALRKYFEKRVKTN >gi|226332120|gb|ACIC01000200.1| GENE 5 4258 - 5454 1013 398 aa, chain - ## HITS:1 COG:MA0888 KEGG:ns NR:ns ## COG: MA0888 COG0573 # Protein_GI_number: 20089772 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Methanosarcina acetivorans str.C2A # 138 392 36 290 296 260 53.0 3e-69 MKKIFEKIIEGILTCSGFVTSITILLIVLFLFTEAFGLFNSKVIEEGYVLALNKGNKVNT LSPAQIKDVFDEEITNWKELGGEDLPIRVFRLEDITEYYTEEELGPAYEYAGERITQLVE KTPGIVAFVPQKFIVQPDAVHFIKDNTISVKDVFAGAEWFPTATPAALFGFLPLIAGTLW VSLFAILFALPFGLSVSIYMSEVANPKVRNWLKPIIELLSGIPSVVYGFFGLIVIVPLIQ KFFDLPVGESGLAGSIILAIMALPTIITVTEDAMRNCPRSMREASLALGASQWQTIYKVV IPYSISGITSGVVLGIGRAIGETMAVLMVTGNAAVIPTTILEPLRTIPATIAAELGEAPA GGPHYEALFLLGVVLFFITLIINFSVEYISSKGIKRSK >gi|226332120|gb|ACIC01000200.1| GENE 6 5631 - 6443 777 270 aa, chain + ## HITS:1 COG:MA0887 KEGG:ns NR:ns ## COG: MA0887 COG0226 # Protein_GI_number: 20089771 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Methanosarcina acetivorans str.C2A # 23 269 70 315 317 192 46.0 5e-49 MKVRRNLLIALSLLSLSANAQRIKGSDTVLPVAQQTAERFMNQHPDARVTVTGGGTGVGI SALMDNTTDIAMASRPIKFSEKMKTKEAGQEVDEVIVAYDALAVVVHPSNPVKQLTRQQL EDIFRGKITNWKQVGGDDRKIVVYSRETSSGTYEFFKENVLKNKNYMAGSLSMPATGAII QSVSQTKGAIGYVGLAYVSPRTKTLSVSYDGSHYAAPTVENATNKTYPIVRPLYYYYNVK NKEQVAPLINFILSPEGQEIIKKSGYIPVK >gi|226332120|gb|ACIC01000200.1| GENE 7 6495 - 8234 
1832 579 aa, chain + ## HITS:1 COG:STM0686 KEGG:ns NR:ns ## COG: STM0686 COG0008 # Protein_GI_number: 16764056 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Salmonella typhimurium LT2 # 10 576 4 552 555 585 52.0 1e-167 MTDIKNEDTGDKKSLNFIEQAVEKDLKEGKNGGKVQTRFPPEPNGYLHIGHAKAICLDFG IAARHGGVCNLRFDDTNPTKEDVEYVEAIKEDIQWLGYQWGNEYYASDYFQQLWDFAIRL IEEGKAYIDEQTAEQIAQQKGTPTQAGVNSPYRDRPIEESLELFKKMNSGEIEEGAMVLR AKIDMANPNMHFRDPIIYRVVKHPHHRTGTTWKAYPMYDFAHGQSDFFEGVTHSLCTLEF VVHRPLYDLFIDWLKEGKDLNDNRPRQTEFNKLNLSYTLMSKRNLLTLVKEGLVNGWDDP RMPTICGFRRRGYSPESIHKFIDKIGYTTYDALNDIALLESSVRDDLNSRATRVSAVINP VKLIITNYPEGQVEELEAINNPEDPEAGSHMIEFSRELWMEREDFMEDAPKKYFRMTPGQ EVRLKNAYIVKCTGCKKDENGVITEVYCEYDANTRSGMPDANRKVKGTLHWVSCNHCLQA EVRLYDRLWKVENPRDELAAIREAKNCEALEAMKEIINPDSLKVLPNCYIEKFAATLPVL SYLQFQRIGYFNIDKDSTPEKLVFNRTVGLKDTWGKINK >gi|226332120|gb|ACIC01000200.1| GENE 8 8439 - 9875 1473 478 aa, chain + ## HITS:1 COG:FN1787 KEGG:ns NR:ns ## COG: FN1787 COG0457 # Protein_GI_number: 19705092 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 179 411 74 298 628 70 27.0 6e-12 MGRKNSPSANKELNELIAQYETAKAENRQLYLDGDQLADIADRYAAERKFDEAQEVITYG LHLHPDSTDLLVEQAYLYLDTGKIPLAKKVAESITDDYITDVKMLKAELLLNEGQLEAAR STLDTIEDTDELETIINIIYLYMDMGYPEAAKEWLDKGTPRFGKKEDFIAVMADYLAGTN ELEAASTYYNQLIDMDPYNASYWVGLAKCRFAAEDSEKAIEACDFALAADETFGEAYAYR GHCYFYLNNSDAAIENYTKAIEYKAFPPEMGYMFLGMAYSNKGAWQEADDCYQRVIDRFV ADGAGNSPLLIDTYTNKAVAASQLGKHEEAHLLCKKAKKIQPDDPGIHLTEGKLYMKEGQ KKKAVKAFDKALVMEPSAEMWYLVASAYSDAEYLYQAKLCFEESYRIDPNYADVTEKLSI LSLMHNEIDDFFKYNSESAHPISEDIILDLLSRPNQTEEGEQMLKEVWKRMKKEKGNK >gi|226332120|gb|ACIC01000200.1| GENE 9 9906 - 10547 622 213 aa, chain + ## HITS:1 COG:Cj1168c KEGG:ns NR:ns ## COG: Cj1168c COG0586 # Protein_GI_number: 15792492 # Func_class: S Function unknown # Function: Uncharacterized membrane-associated protein # Organism: Campylobacter jejuni # 2 166 3 161 200 132 44.0 
5e-31 MESVAFIQWCLEHLNYWTITLLMTIESSFIPFPSEVIIPPAAYKAAVNDELNIYLVVLFA TIGADIGALINYYLAKWLGRPIIYKFANSRIGHMCLIDEAKVQHAESYFDKHGALSTFIG RLIPAVRQLISIPAGLARMKMHTFLLYTTIGAGLWNTILAVIGYYLSTVPGIESEEQLIA KVTEYSHEIGYCFIAIGIFIVGFLVYKGMKRNK >gi|226332120|gb|ACIC01000200.1| GENE 10 10656 - 11540 874 294 aa, chain - ## HITS:1 COG:no KEGG:BT_1328 NR:ns ## KEGG: BT_1328 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 294 1 294 294 561 97.0 1e-159 MDKIRFFFIALLGMGAAHLSAQTADSTQTSPWTTEGFAGLKLTQVSLTNWSAGGDNSVAF DLQGTYQANYKKGKHIWNNRLELAYGLNKTGEDGMRKANDKIYMNTNYGYSIAKNWYASA FTTFQTQFSPGYDYSVNKDVAISEFMSPAYLTTGLGFTYDPGKIFTVVLSPASWRGTFVL NDRLSDEGAFGVDPGKHLLSSFGANLKGEVKYEFLKNMTVYSRLDLYSDYLHKPQNIDVN WEVQINMAINKWFSTTLTTNMVYDDDIKIVQKDGSKGSRLQFKEVLGVGVQFNF >gi|226332120|gb|ACIC01000200.1| GENE 11 11693 - 12196 526 167 aa, chain - ## HITS:1 COG:Cgl1062 KEGG:ns NR:ns ## COG: Cgl1062 COG2077 # Protein_GI_number: 19552312 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Corynebacterium glutamicum # 1 165 4 167 168 181 56.0 6e-46 MATTNFKGQPVKLIGEFIQVGKVAPDFELVKTDLSSFSLKDLNGKNVILNIFPSLDTSVC ATSVRKFNKMGAGLKDTVVLAISKDLPFAHARFCTTEGIENVIPLSDFRFSDFDESYGLR MADGPLAGLLARAVVVIGKDGKVAYTELVPEITQEPDYEKALAAVKE >gi|226332120|gb|ACIC01000200.1| GENE 12 12334 - 12897 651 187 aa, chain + ## HITS:1 COG:RSp0426 KEGG:ns NR:ns ## COG: RSp0426 COG3247 # Protein_GI_number: 17548647 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Ralstonia solanacearum # 13 181 7 174 186 63 26.0 3e-10 METVFNEIKHSVKNWWTSLLLGIVYIIVALWLMFSPLSSYVALSIVFSISMLISGILEII FSLSNRKGVPSWGWYLVGGIIDLILGIYLIAYPMVSMEVIPFIIAFWLMFRGFSSTGYSI DLKRYGTRDWGWYMAFGILAIICALIILWQPAVGALYVVYMISFTFFIIGLFRVMLSFEL RNLHKRK >gi|226332120|gb|ACIC01000200.1| GENE 13 12981 - 13460 719 159 aa, chain + ## HITS:1 COG:no KEGG:BT_1331 NR:ns ## KEGG: BT_1331 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 159 1 159 159 300 100.0 1e-80 MAMHTWFECKIRYEKTMENGMQKKVTEPYLVDALSFTEAEARIIEEMTPFITGEFTVSDI KRANYSELFPSDEESADRWFKCKLIFITLDEKSGAEKKTSTQVLVQAADLRDAVKKLDEG MKGTMADYQIGMVSETPLMDVYPYSAGPNDKPEFDPSKA >gi|226332120|gb|ACIC01000200.1| GENE 14 13467 - 14135 628 222 aa, chain + ## HITS:1 COG:CAC2121 KEGG:ns NR:ns ## COG: CAC2121 COG0325 # Protein_GI_number: 15895390 # Func_class: R General function prediction only # Function: Predicted enzyme with a TIM-barrel fold # Organism: Clostridium acetobutylicum # 1 222 1 218 221 164 41.0 1e-40 MSIADNLKQVLAELPQGVRLVAVSKFHPNEAIEEAYQAGQRIFGESKVQEMTAKYETLPK DIEWHFIGHLQTNKIKYMIPYVAMIHGIDTYKLLTEVNKQAAKAGRIVNCLLQIHVAQEE TKFGFSPEECKDMLHAGEWKELSHVRICGLMGMASNTDDVEQINREFCLLNRLFQEIKAN WFADSDTFRELSMGMSHDYHEAIAAGSTLVRVGSKIFGERNY >gi|226332120|gb|ACIC01000200.1| GENE 15 14168 - 15148 1027 326 aa, chain + ## HITS:1 COG:alr1912 KEGG:ns NR:ns ## COG: alr1912 COG0167 # Protein_GI_number: 17229404 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Nostoc sp. 
PCC 7120 # 3 319 2 323 343 214 37.0 2e-55 MTDLKTTFAGLSLRNPIIISSSGLTNSAGKNKRLAEAGAGAIVLKSLFEEQIMLEADQLK DPAFYPEASDYLEEYIREHKLAEYLTLIKESKKECNIPIIASINCYSDAEWIDFAKQIQE AGADALEINILALQSDVQYTYGSFEQRHIDILRHIKRTVSIPVIMKLGDNLTNPVALIDQ LYANGAAAVVLFNRFYQPDINIENMEQISGAVFSTSADLATPLRWIGIASSVVDKIDYAA SGGVSNPEAVVKAILAGATAVEVCSAVYQNTNAFIGESTRFLSAWMERKGFESIAQFKGK LNIKDVQGINTFERTQFLKYFGKKEN >gi|226332120|gb|ACIC01000200.1| GENE 16 15255 - 15425 131 56 aa, chain - ## HITS:1 COG:BH2629 KEGG:ns NR:ns ## COG: BH2629 COG1875 # Protein_GI_number: 15615192 # Func_class: T Signal transduction mechanisms # Function: Predicted ATPase related to phosphate starvation-inducible protein PhoH # Organism: Bacillus halodurans # 1 56 387 442 442 66 57.0 1e-11 MVFTGDIQQIDQPYLDSQSNGLVYMIDRMKDQNIFAHVNLVKGERSQLSELASNLL >gi|226332120|gb|ACIC01000200.1| GENE 17 15394 - 16581 1276 395 aa, chain - ## HITS:1 COG:TM0495 KEGG:ns NR:ns ## COG: TM0495 COG1875 # Protein_GI_number: 15643261 # Func_class: T Signal transduction mechanisms # Function: Predicted ATPase related to phosphate starvation-inducible protein PhoH # Organism: Thermotoga maritima # 5 375 3 352 418 223 38.0 5e-58 MGTKKNFVIDTNVILHDYNCLKNFQENDIYLPLVVLEELDKFKKGNEQINFNAREFVREL DLLTSDELFSKGVSLGEGLGRLFIVPGNVDAPKVHESFPVKKPDHLILAAVEYLAGKYPK TPAILVTKDVNLRMKARSIGITSEDYITDKVSNVDIFEKSNEIFENVDPALIDRIYSSKE GIDLSEFDFKDVIHPNECFVLKSDRNSVLARYNPFTHSICRVTKGRNYGIEPRNAEQSFA FEILNDPNVKLVALTGKAGTGKTLLALAAALGKLTDYKQVLLARPVVALSNKDIGFLPGD AQEKVAPYMQPLFDNLNVIKRQFAANSTEVKRIEDMQKSEQLVIEALAFIRGRSLSEMYC IIDEAQNLTPNEIKTNHYPCRRRYEDGVYGRYPAD >gi|226332120|gb|ACIC01000200.1| GENE 18 16713 - 18188 1061 491 aa, chain + ## HITS:1 COG:CAC2398 KEGG:ns NR:ns ## COG: CAC2398 COG0285 # Protein_GI_number: 15895664 # Func_class: H Coenzyme transport and metabolism # Function: Folylpolyglutamate synthase # Organism: Clostridium acetobutylicum # 1 482 1 426 431 214 31.0 2e-55 MDYQNTLKYLYESAPMFQQIGGKAYKPGLETTHKLDEHLGHPHQEFRTIHIAGTNGKGSC 
SHTIAAVLQAAGYKVGLFTSPHLIDFRERIRINGEMIPEEYVVNFVEDHRSFFEPLHPSF FELTTAMAFRYFADQKVDVAVIEVGMGGRLDCTNIITPDLCVITNIGFDHMQYLGHTLTK IAKEKAGIIKEDVPVVVGKVSGAVKRVFTMKAKAMNTFITFSEELKSFTHVQYLSYDNLK ESRADLQELLEEEIKLQQIQTLPMKMMQFVSLINPADAIRAIDRIMDKRKDAISIHNSSF MTGLYYELSGLYQTENCLTILAALDILKNLGYEICNKDYHTGFSNVCEMTGLMGRWQKLQ SYPDLICDTGHNADGFKSIRKQLKYIHEKLHQELHIVFGMVSDKDISSVLELLPKDATYY FTKASVKRAMPEDELMKMASEAGLKGTSYPTVVDAVRAAKENCPPKDFIFVGGSSFIVAD LLANRDTLNLH >gi|226332120|gb|ACIC01000200.1| GENE 19 18156 - 18530 404 124 aa, chain - ## HITS:1 COG:PH0854 KEGG:ns NR:ns ## COG: PH0854 COG0251 # Protein_GI_number: 14590714 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Pyrococcus horikoshii # 1 124 12 136 137 133 55.0 9e-32 MKKVICSEKAPGAIGPYSQAIEANGMVFVSGQLPIDAATGKMAEGIEEQARQSLENIKHI LEETGLTMGNIVKTTVFLQDMSFFAGMNGVYATYFDGAFPARSAVAVKALPKDALVEIEC IAVR >gi|226332120|gb|ACIC01000200.1| GENE 20 18634 - 19107 441 157 aa, chain - ## HITS:1 COG:PA0563 KEGG:ns NR:ns ## COG: PA0563 COG3152 # Protein_GI_number: 15595760 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Pseudomonas aeruginosa # 6 120 13 114 117 68 36.0 4e-12 MFKAPFSFDGRIRRIEYFLSGIIGGIVFGVAYSLGLATLFLGAAAGSAGGSLFGILIGIV AGIASIWFSLAQGVKRLHDLNKSGWLILICCVPIIGWVFSLYMLFADGTVGPNQYGEDPK NRMPYQPQPTSVNVTVNVSRETPAEAPAEEEKTEKAE >gi|226332120|gb|ACIC01000200.1| GENE 21 19438 - 19590 80 50 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHSSAFDTSLYPSLSKRQKDKKAHDSFEDGNEEKEADQQGKKQKAKAMFS >gi|226332120|gb|ACIC01000200.1| GENE 22 20092 - 20424 151 110 aa, chain - ## HITS:1 COG:MA3780 KEGG:ns NR:ns ## COG: MA3780 COG1898 # Protein_GI_number: 20092576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 1 104 74 177 183 142 60.0 2e-34 MRVIKGAVLDVAVDIRKGSPTFGQYVSVELTGENHRQFFIPRGFAHGFSVLSEGMIFQYK 
CDNFYSPQSEGAIAWNDPDLNIDWRIPAEKVVLSEKDGKHPRLKDWQNVF >gi|226332120|gb|ACIC01000200.1| GENE 23 20970 - 21881 571 303 aa, chain - ## HITS:1 COG:TM0953 KEGG:ns NR:ns ## COG: TM0953 COG3958 # Protein_GI_number: 15644625 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Thermotoga maritima # 1 300 5 303 311 204 40.0 1e-52 MRDTFVRTLIEIAKADKDVHLLTGDLGFGVLKPYWEQLPEQFTNVGIAEQNMTGIAAGMA LSGKKVFTYSIGNFPTLRCLEQIRNDCAYHDANVKIVCVGGGFVYGSLGMSHHATEDLAI MRALPSMVVMAPGDLTEAAACTHAIYEHSGTCYLRLGRGGEQKIHDSIVNFQIGKALKIR DGEKVAIFSTGGIFDEALGAVELLNKKGICPALYTFPTVKPIDKELITHCSHEYDLIVTV EEHNIIGGLGSAVSEVLSELPSRARQIKIGLNDTYSCIVGSQKYLRNEYGMSAEKIAERI EMF >gi|226332120|gb|ACIC01000200.1| GENE 24 21878 - 22687 739 269 aa, chain - ## HITS:1 COG:TM0954 KEGG:ns NR:ns ## COG: TM0954 COG3959 # Protein_GI_number: 15644626 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Thermotoga maritima # 7 267 16 276 286 273 52.0 2e-73 MTSEELAWKIKRHGVEMTHISHGSHIGAILSVSDIIAVLYADILNIDPQDPRKADRDRLI LSKGHAGAAIYAALAEKGFFPVGELATHYRDGSRLSGHVSHKGVPGVEFSTGSLGHGLPV AVGMALEAKMERKDYQVYIVMGDGECDEGSVWEASMFAHHHKLDNLTVVVDYNKMQSLTF CEDTLSLAPLDKKFESFGYNVLSVDGHNHDELKKAFRTTFTNGNPKLILANTIKGKGISF MENNILWHYRTPQGEEYEAAVKELEEQRP >gi|226332120|gb|ACIC01000200.1| GENE 25 22677 - 23168 286 163 aa, chain - ## HITS:1 COG:FN1695 KEGG:ns NR:ns ## COG: FN1695 COG2148 # Protein_GI_number: 19705016 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Fusobacterium nucleatum # 13 137 77 201 230 102 42.0 4e-22 MVKDAEGKGKWNVGDNDKRITRFGHFLRKSKLDELPQLINVLIGDMSLVGPRPELRYYVE MYIEREKKILKMKPGITDWASVTHFKQFVSFTQSDDPDKEYVERIRPLKLELQLYYYEHR SLGMDIRIIVFTVSKMIFRNLNLPKDIQEVITQFNNKIQKYDK >gi|226332120|gb|ACIC01000200.1| GENE 26 23345 - 24103 328 252 aa, chain - ## HITS:1 COG:BH0932 KEGG:ns NR:ns ## COG: BH0932 COG1028 # Protein_GI_number: 15613495 # Func_class: I 
Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Bacillus halodurans # 1 250 1 246 252 114 32.0 1e-25 MNLEIRNKNVLIIGASKGIGREISLAFAKEGAMVTAIARNEKLLVSLQHEMKQIYEADHK YYVCDLMDMDLTVLANKLLIEKGPFDIIVHNVGGSLVSRNALGTLDEWEYAWKFNAGIAI ALNNILIPPMIEKKWGRVIHISSISAQMLRGNPLYASAKAFLNAYVTTVGRTIASSGVVM SAVMPGAVAFPGSYWDEYTRTDPERCNDFLRHHQAAGRFGTTKEIADVVLFMASEQASFM QAAIIPVDGANM >gi|226332120|gb|ACIC01000200.1| GENE 27 24563 - 25102 233 179 aa, chain - ## HITS:1 COG:MA3442 KEGG:ns NR:ns ## COG: MA3442 COG1045 # Protein_GI_number: 20092254 # Func_class: E Amino acid transport and metabolism # Function: Serine acetyltransferase # Organism: Methanosarcina acetivorans str.C2A # 2 171 24 207 232 127 40.0 1e-29 MISSKNEYELFLRADSVANKFERMPMKWINIRFRYLRCLRRYEYVVNCKPTLFIVRKQLL RLILSQLSVKSGIQIPINTFGKGLYLPYHGTIVVNETARFGDFCVVQAGVNVSANVCGGN HIYFGTGCKIMKNVEIADDVIIGANAVVTKSVFEQNIVVAGIPAKKINNNGYRDRKNAI >gi|226332120|gb|ACIC01000200.1| GENE 28 25107 - 26078 225 323 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_2321 NR:ns ## KEGG: EcE24377A_2321 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 103 285 61 259 295 71 30.0 4e-11 MSQIIYFIENSGYYKYVIAAGEGDSDVFFPLGFKNHYIEKLFKLHLAWPINKRSDAPYVK IWFNSILKGINIDPKQDLYILLAESYHLSYSNNFLNYLKSKYPDAKLIFCFSNPAGDYNL KKLQKVILNYDAVLTFHKKDAELYGYFYCDDMPYRLPNIDQDDLDNSDVFFVGANKGRLR LLLSIYDKLSNAGLKCDFFICGVEKKDQIFREGIVYNKRISYDEVLKHVKASKCVLEVLQ NGNNYVSIRTNEAIQYNKKLLTTNSEIINTSFYNKELVQIFSGDNIDTNFFVKEVASDVY NNINSNRSFLAMKEYLKTNFKRK >gi|226332120|gb|ACIC01000200.1| GENE 29 26091 - 27599 295 502 aa, chain - ## HITS:1 COG:no KEGG:BCE_5386 NR:ns ## KEGG: BCE_5386 # Name: not_defined # Def: polysaccharide transport protein, putative # Organism: B.cereus_ATCC10987 # Pathway: not_defined # 3 486 4 488 508 148 25.0 6e-34 
MERSKIFIKNTAWELGYYLLIIALGFFAPRFIILTYGSDVNGLSSTITQILNIILLLQSG ATTAAVYSLYKPIADHNWENISEKVTAANLYFKKMSYVFLGIMFVVAGVTAWNINSGIPE ITIFIAFIIMGFKSFLDLYFTSKYRIVFTAFQEKFIMSIATLLEQTIYYVLVFTTIFFKW NFLFLFFWLFFGCIVKIMVLKIVYVKRYPDIKRIGNLFKSATIHGKNYSLVNEVSHSIVT TSTAIMISFMYGLDEVSVFSIYVMVFSALNLFSTALNSSFGPTFANLCAKGETKRAKDVF TIFQFLYIMMNTILVMCAFYLIIPFMRLYIEGKTEVSYINITLAILYSISGLLSACRIPY NLIVSSYGFFKETWLQPVITAIISLLISYFFGRIQYAFILIGPIIFYLVNFVYQYFKLKR LTPHLISSKVFIIFAISLVGIVLVYFAMKGIEVPHGFWAWLLAAVFTLFAVTIYIFVASR IFLSKEFYLSMAYIRSLCKAYV >gi|226332120|gb|ACIC01000200.1| GENE 30 27617 - 28732 100 371 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253572895|ref|ZP_04850293.1| ## NR: gi|253572895|ref|ZP_04850293.1| predicted protein [Bacteroides sp. 1_1_6] # 1 371 27 397 397 566 99.0 1e-160 MIAVLTLLFLGILFCFNDGESGDHYLYKLGFENQNGGMAVEGIYDYFVEEIKIMGVESYN VFLGIIYVICLVLFHLGIRSNIYNIHAFIPVILPYIYPSCTVTIRFTLAFSLFVFSLRYL FRGKILLYFLLVILAGMMHYSLFFALLFVYCAKINANERNKVVKTEKDLDEKSNLSYYYR KNIFAKLVVIISLLLVIIIYVTRSLPFLDQIYVIFNILDLGIDNKLDANMATMTRLGAFI FIPAYIFSFYMSIKMRKYVIKAILNEEGIAEVYYLININYIINAISAAIIPLLVLNLSFS RLLFLPTIVNIIAYERILKYNKCEEPSIHLNGCNILLSLIMLAWFIPAIFKINSISPSTL LETASQFLDSI >gi|226332120|gb|ACIC01000200.1| GENE 31 28852 - 29967 319 371 aa, chain - ## HITS:1 COG:NMA0640 KEGG:ns NR:ns ## COG: NMA0640 COG0438 # Protein_GI_number: 15793628 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Neisseria meningitidis Z2491 # 37 307 39 318 387 89 26.0 1e-17 MSKICVCISCFNCYETRMKGVVEYFNDNHYSTKYFISNFNHYDKKYYTVNYPNTTQIQVP RYAKNLSISRLRSHYVFSKEVFALLKKTKPDVIYCIFPPNTLVRSVIKYKKITQCKVILD CYDTWPESFPTKRFSKLLSWPFSIWANIRNRYISEADLILCVSQVGVDFIKKIAPNTPAK LLMPVIDAQEFLQYNDDTSKLTFCYLGNINHITDINLLNLILKGVAKRKKVCLHIIGEGQ NYLALKELLTDSGIDVIGHGIVFDIKEKMKIYSLCNLAINVPKEEIQSTMSLKSVEYISM GLPFINSAGGDTQCLVENYQIGINIDKNNLARTIEQIAILTPKELRCLHDNTVYLYNSKF SKQDLNIILKD >gi|226332120|gb|ACIC01000200.1| GENE 32 29960 - 31123 549 387 aa, chain - ## HITS:1 COG:MK1083 KEGG:ns NR:ns ## COG: MK1083 COG0438 # Protein_GI_number: 
20094519 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Methanopyrus kandleri AV19 # 207 324 305 427 491 68 32.0 2e-11 MIKGIFCHDLPIYKDIDGNYCSTTLTDSLFLRYFNVVDELVVATRVYKIDKTYKEAHQEK ITLPNIKFLEFPNLNTFKGIFTLIPKAKKVLKKEMQKVDLVFIRGGIIATLGVTVARELN KPYLLENAGCAWDGYWNHSLAGKLIAPFMEYRAKIDTRDAAFVIYVTEKWLQNRYPTKGI STYASNVILDTMDDIILNKRQQKNNLNKSDLIVVGTTAAINNKAKGQQYFIEAMSKLKKY NIRYELVGSGDSNYLRSCALKNGVADRVVFKGQMKHQQVLEWIDSLDIYIQPSMQEGLPR ALIEAMSRACPAIGSSTGGIPELLPSDAIFKRGNVDSLIKTFKTIIESDLDAKAKVNFNK AKEYELEKLNSRRNVVYEQFKNYIINE >gi|226332120|gb|ACIC01000200.1| GENE 33 31138 - 32022 693 294 aa, chain - ## HITS:1 COG:rfbA KEGG:ns NR:ns ## COG: rfbA COG1209 # Protein_GI_number: 16129979 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Escherichia coli K12 # 2 284 5 285 293 380 65.0 1e-105 MKGIILAGGAGTRLYPLTMVTSKQLLPVYDKPMIFYPLSTLMLAGIKEILIISTPVDLPN FERLLGDGSQFGIQLSYKAQPSPDGLAQAFLLGEDFIGDDWCAMILGDNIFYGSGLKEKL KSAVSNAENGYATIFGYYVNDPERFGIMEFNEDGKILSVEEKPKNPKSNYCVTGLYFYDK RVVSLAKDVKPSARGELEITSLNNLYLADESLKGIILGRGFAWLDTGTMDSLIEASEFVQ MTEKRQGVKISAPEEIAFRNGWIDKETLMTSALRYGKSPYGQHLKNVAEGKFIY >gi|226332120|gb|ACIC01000200.1| GENE 34 32028 - 32972 516 314 aa, chain - ## HITS:1 COG:alr4490 KEGG:ns NR:ns ## COG: alr4490 COG1091 # Protein_GI_number: 17231982 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Nostoc sp. 
PCC 7120 # 21 302 1 278 294 215 41.0 1e-55 MALLKSMSGLWSTNNYSSDIINMNILVTGANGQLGQAIRAQSHRLQNHNIVFTDVISNDM LETILLDITSEDAVRSVCKSAQINVIINCAAYTDVEKAETDFEMANLINCDAVRNLATVA KECDITLIHISTDYVYDSRKAAPYVETDEKHPINVYASTKYAGEVAIHEVGCKFILFRTA WMFSGFGSNFMKTMLRLTAEKETLNVVYDQVGTPTYAPELVSIIFKVIDEDKLDMTGEYN FTNEGSVSWYDFAVAINYLSGHNCKVNPCSSEEYGSKVIRPNYSVLDKSKVKKTFGITIP HWLISLEQCINNKQ >gi|226332120|gb|ACIC01000200.1| GENE 35 32932 - 33180 299 82 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253572900|ref|ZP_04850298.1| ## NR: gi|253572900|ref|ZP_04850298.1| predicted protein [Bacteroides sp. 1_1_6] # 1 82 1 82 82 148 100.0 9e-35 MKFEDLSMTLEQVVDFAKNHNMFIEDDDHKHFSNDGILEYLKANRDRITFKPLYYYGDTI IGIQIEENVDGALKKYEWSVEY >gi|226332120|gb|ACIC01000200.1| GENE 36 33183 - 33755 400 190 aa, chain - ## HITS:1 COG:MA3780 KEGG:ns NR:ns ## COG: MA3780 COG1898 # Protein_GI_number: 20092576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 1 170 1 170 183 209 59.0 2e-54 MNIIKTAIESVLIIEPKVYQDERGYFFESFNANEFSEKTGMDIHFVQDNESMSSYGVMRG LHYQKMPYTQSKLVRCVKGAVLDVAVDIRYGSPTFGQHVAVELTEENHRQFFVPRGFAHG FAVLSETALFQYKCDNFYHPEADAGINIKDESLGIDWKVPVEHAILSEKDLKHACLKDAS YDFDINNKLY >gi|226332120|gb|ACIC01000200.1| GENE 37 33796 - 34518 386 240 aa, chain - ## HITS:1 COG:fabG KEGG:ns NR:ns ## COG: fabG COG1028 # Protein_GI_number: 16129056 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Escherichia coli K12 # 4 238 9 240 244 115 34.0 9e-26 MNILVTGCSRGVGLEICRVLLEKEYVVFGIARTYTPEFELLEKTYPNRLYFKRIDLCESE SVHKKVFKDFLNNQVPLHGFVNNAAMAYDDIITNMNLERLKDMYASNVFTPMVLTKYAIR NMLLHRIKGSIIHISSISAHTGYKGLAMYASSKGSLEAFSKNTAREWGELGIRSNVVVPG FMETAMSASLSDEQKDRIYKRTSLKCPTDIHSVAETVAFLLSENAISITGQNIHVDSGTI 
>gi|226332120|gb|ACIC01000200.1| GENE 38 34515 - 35729 319 404 aa, chain - ## HITS:1 COG:BH2006 KEGG:ns NR:ns ## COG: BH2006 COG0318 # Protein_GI_number: 15614569 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Bacillus halodurans # 4 404 25 501 513 95 25.0 2e-19 MDFLIDKSERYTYEDLLNIVNNADRYIPFFKEKSLYHFFCNLIVGLVSNQPLVLLDSDIK LTELEEGVESKINKSKLISKHYFSTMADLINAIHDSRSEITIFTSGTTGQPKKVIHSVMT LTRTVRIAERYKDKIWAFAYNPTHMAGLQVFFQALSNCNTLINVFNASRTDIYNAIKEWR ITHISATPTFYRLLLPVESSYVSVERVTLGGEKSDNHLYLSIKQIFPSAKINNIYASTEA GSLFAAKGEYFQLSIDLCDRVKVVDDELLLHKSLLGKSDDFSFSDGYYHTGDIIEWINQE ERLFKFKSRKNELINVGGYKVNPGEVENVILRMNGIQQALVYGKTNSVLGNILCAEIKIE PNVNLTELEIRQYLSGMLQDFKIPRKIKFVESFSMTRSGKLKRS >gi|226332120|gb|ACIC01000200.1| GENE 39 35737 - 35961 301 74 aa, chain - ## HITS:1 COG:no KEGG:BDI_0568 NR:ns ## KEGG: BDI_0568 # Name: not_defined # Def: putative acyl carrier protein # Organism: P.distasonis # Pathway: not_defined # 1 72 1 72 73 75 68.0 6e-13 MKDKILEIINIIRISKELQPLKDIKSTDSLRDDIEFTSFDLAELTVRIEDEFDIDIFEDG LVSTVGEVYAKLIK >gi|226332120|gb|ACIC01000200.1| GENE 40 35965 - 37011 235 348 aa, chain - ## HITS:1 COG:Cj1303 KEGG:ns NR:ns ## COG: Cj1303 COG0332 # Protein_GI_number: 15792626 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Campylobacter jejuni # 1 337 1 338 353 175 33.0 1e-43 MKTIIDNIQIAAVTSWLPLNNLEMTSLSSEYGEKEVASIIKATGVERVRIADKNDTSSDM CFRAAEHLIQQEHIDKNEIDGLVFVSQTVDYILPSTSIILQERLKLNKDVVCLDIHYGCS GYIYGIFQAALWIASKACRNVLVLAGDTTSRMIHPKDKSLRMVFGDCGTATLVTIGKSPL GVTLHSDGAGYDRLIVRAGGFRNPVSEETSLLQYDNDNNGRTDNDLYMDGISIFNFAVNN VYKDIDLLLGLMCWEKDDIGIFALHQANNFMVNYVRKKIKVAAEKVPMNVTNYGNTGPAT IPLLLSDVCSTQKYNLDKVIMSGFGVGLSWGSIAANLSKTRFYSPLNK >gi|226332120|gb|ACIC01000200.1| GENE 41 37015 - 37761 289 248 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal 
protein L17 [Phaeobacter gallaeciensis BS107] # 10 243 4 238 242 115 32 4e-25 MYNPFSLAGKTILITGASSGIGKATAIECSKLGAKVIITGRNRLRLEETFSSMSGNAHLL IEADLQYPSDIDMLIDSLPMLDGVVNNAGMTKTIPMAFISEEILLDVLRINTISPILITQ KLLKRRKLAKSASIVFTSSISGNSVGVLGNTIYSTTKAAVNGFVKNAALELAPKNIRVNT VCPGMIDTGILSSGIITEEQLDEERKKYPLKRFGKPEEIAYAIIYLLSDASSFVTGSNLF IDGGFTLL >gi|226332120|gb|ACIC01000200.1| GENE 42 37765 - 38001 302 78 aa, chain - ## HITS:1 COG:no KEGG:BDI_3629 NR:ns ## KEGG: BDI_3629 # Name: not_defined # Def: putative acyl carrier protein # Organism: P.distasonis # Pathway: not_defined # 4 76 5 77 78 82 60.0 4e-15 MGNLNEFIQNFANQFDETDASEFQATTEFRQLDEWSSLMALSIIAMVDEEYNVQLKGDDI VKSQTIDDLFRIVSSKTI >gi|226332120|gb|ACIC01000200.1| GENE 43 37989 - 38549 388 186 aa, chain - ## HITS:1 COG:no KEGG:PHZ_c0012 NR:ns ## KEGG: PHZ_c0012 # Name: not_defined # Def: acetyltransferase, GNAT family # Organism: P.zucineum # Pathway: not_defined # 1 149 1 152 190 84 32.0 2e-15 MVYEGILEGRFVNLRPVTKEDAEFILGIRNNPEISKYLPSLNVTVDQQQSWIDKQRNDIN SYYFIIEDKTKMRIGTISVYNIIENHAEVGRFCSFGNSIQNSEAALMLDDFIFQKLGVDY LDIWVYKDNKPVLALNKGLGCVWEGESEDKGGFPYLYGTLTKEGFEKKSQKIRTNLSKIK DIIWEI >gi|226332120|gb|ACIC01000200.1| GENE 44 38551 - 39681 865 376 aa, chain - ## HITS:1 COG:MJ1066 KEGG:ns NR:ns ## COG: MJ1066 COG0399 # Protein_GI_number: 15669255 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Methanococcus jannaschii # 13 375 16 382 386 191 33.0 2e-48 MSIQVLKPKFEIEECLEAVKECLEQGWTGMGFKTVQFEEAWKDYTGHKYAYYLNSNTVGL HLAVKILKMQNGWQDGDEIISTPITFVSTNHAVMYENMHVTFADVDEYLCLDPIDVEKKI SDKTKAIIFVGYGGRVGKLDKIIEICKKHDLKLILDAAHMSGTRINGVTPGTWDGVDVTV YSYQAVKNLPTGDSGMICFANEKDDKLARQLAWLGINKDTYARSNKGTYAWKYDVDYVGY KYNGNAIMAAIALVQLKYLDRDNARRREIVEMYSKAFENNAKIQIVKAPHADECSFHIYE LIVPDREGLLGKLAENDIYGGVHYRDNTEYSMYQYAKGTCPRAHDASEHLITLPLHMWLT DEEVDRIIAVVNDYVK >gi|226332120|gb|ACIC01000200.1| GENE 45 39772 - 41085 1003 437 aa, chain - 
## HITS:1 COG:XF1606 KEGG:ns NR:ns ## COG: XF1606 COG1004 # Protein_GI_number: 15838207 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Xylella fastidiosa 9a5c # 1 437 1 444 450 510 56.0 1e-144 MKIAIVGTGYVGLVTGSCFSEMGIDVTCVDIQKCKIENLKKGIIPIYEPGLEDMVHRNYS AGRLKFATDLKECLDEVEVIFSAVGTPPDEDGSADLKYVLEVARTIGRNINKYVLVVTKS TVPVGTAQKVKSVIQEELDKRNVLIDFDVASNPEFLKEGDAVDDFMKPDRIVVGIESERA KEIMERLYKPFMMNNYRLIFTDIPSAEMIKYAANSMLATRISFMNDIANLCELVGADVNM VRKGIGADSRIGSKFLYPGCGYGGSCFPKDVKALIKTAEKNGYSMYVLKSVEEVNERQKS ILFEKLLKHYNGDIQGKQITLWGLAFKPETDDMREAPSLVLIQKLLAAGATVKVYDPIAM DECRRRVGEKVFYATDMYDSVLDSDALMLVTEWKEFRMPSWGVIKKTMKSAVVIDGRNIY DKKELLEIDFTYYCIGK >gi|226332120|gb|ACIC01000200.1| GENE 46 41106 - 42158 519 350 aa, chain - ## HITS:1 COG:BH3709 KEGG:ns NR:ns ## COG: BH3709 COG0451 # Protein_GI_number: 15616271 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Bacillus halodurans # 1 350 1 335 343 407 57.0 1e-113 MKILVTGAAGFIGSYVCKYLLSRGDEVVGLDNINSYYDINLKYGRLLTLGIEENAVNWYL FVESNVYEKFRFIRMNLEDKQAMQMLFANERFDKVVNLAAQAGVRYSIENPYAYVESNID GFLNVLEGCRHYRVKHLIYASSSSVYGLNGKVPFSENDSVAHPVSLYAATKKSNELMAHT YSHLYAIPTTGLRFFTVYGPWGRPDMSPFLFASAILNNRPIKVFNNGDMLRDFTYIDDIV EGVLRVIDHVPEPNLNWNDQNPEPSSSKAPYKIYNIGNSHPVKLMDFIEAIEKAIGHPAD KIYFPMQPGDVYQTNADTTALERELGFKPNKSIIEGVRNTIDWYRSFYQL >gi|226332120|gb|ACIC01000200.1| GENE 47 42155 - 44077 890 640 aa, chain - ## HITS:1 COG:SA0147 KEGG:ns NR:ns ## COG: SA0147 COG1086 # Protein_GI_number: 15925856 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Staphylococcus aureus N315 # 243 598 228 577 607 342 48.0 1e-93 MKKSLEKFLQYTSRNFFSCWIILGVDTLIAGLCTLLTFIGIHYVTGTGWDIIQLVQVIAL ALGASIIGDLLFNTFRNTIRYSEIKDLWRVICSVLFKATCLAIYVFFFLKEDKLFGSQNL 
VFVISDGMLTLISLIGIRIIVIVMYESLIGIVRKTDMKVFIYGTDDKSVALKTRLRNSSH YKIAGFCIYSPDNSRRILVDSRVYSFSDKVSFNILIKKHHIQGILFARYESIREEENRLL EYCKSSGVKTLIAPSISEADVNGFFHQFVRPIRIEDLLGREEIKINMQEIASEYCDKIIL VTGAAGSIGSELCRQLAQIGVRKLILFDSGETPLHNVRLEFEKNYPNIDFIPVIGDVRVK QRVRMVFEQYHPQIVFHAAAYKHVPLMEENPCEAVYVNVIGSRQVADMAVEYGAEKMIMI STDKAVNPTNVMGASKRLAEIYVQSLGTAIKEGKAKGKTKFITTRFGNVLGSNGSVIPRF KEQIENGGPVTVTHPDIIRYFMTIPEACRLVMEAGMMGEGNEIFVFEMGEPVKIVDLATR MIELAGYRLDEDIKIEYTGLRPGEKLYEELLSTEENTLPTSNKKIKIAKVRRYEYNDILE SYADFEILARNVEIMDTVALMKRIVPEFISRNSRFEILDV >gi|226332120|gb|ACIC01000200.1| GENE 48 44106 - 45227 673 373 aa, chain - ## HITS:1 COG:no KEGG:BT_1355 NR:ns ## KEGG: BT_1355 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 371 1 376 379 496 71.0 1e-139 MSEQQLNRNVSSEQDKLLNDEMEIDLMDILRKIISIRKTLYKAAGIGLIMGIIVAFSIPK QYTVKVTLSPEMGNSKGTNGIAGLAASFLGSGATIGDGTDALNASLSSDIVSSTPFLLEL MDVNVPTDKGDMLTLETYLDSQSSPWWNYIIGIPGLVINGVKSLFTDENDNSNEASHGTI ELTEKESKKIDALKKKIIANVDKKTAITNVSVTLQEPKVAAVVADSVVRKLQEYIIGYRT SKAKDDCVYLEKLFKERQDEYYDAQKKYADYVDTHDNLILQSVRTEQERLQNDMSLAYQV YSQVANQLQVARAKVQEEKPVFAVVEPAVVPRYPSGMGKKSYLIIFLFLSVVATAGWLLF GKDYLDKIRKELV Prediction of potential genes in microbial genomes Time: Thu May 12 04:08:12 2011 Seq name: gi|226332119|gb|ACIC01000201.1| Bacteroides sp. 1_1_6 cont1.201, whole genome shotgun sequence Length of sequence - 21392 bp Number of predicted genes - 31, with homology - 31 Number of transcription units - 11, operones - 6 average op.length - 4.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 3.4 1 1 Op 1 . + CDS 83 - 334 212 ## pBF9343.03 putative DNA-binding protein 2 1 Op 2 . + CDS 369 - 584 141 ## gi|189464110|ref|ZP_03012895.1| hypothetical protein BACINT_00445 + Term 605 - 641 3.5 3 2 Tu 1 . + CDS 656 - 1045 398 ## gi|253572917|ref|ZP_04850314.1| conserved hypothetical protein + Prom 1047 - 1106 6.7 4 3 Op 1 . + CDS 1142 - 1360 239 ## pBF9343.04 hypothetical protein 5 3 Op 2 . 
+ CDS 1379 - 1696 293 ## gi|189464105|ref|ZP_03012890.1| hypothetical protein BACINT_00440 6 3 Op 3 . + CDS 1757 - 2212 365 ## pBF9343.09 hypothetical protein 7 3 Op 4 . + CDS 2202 - 2414 106 ## pBF9343.10 hypothetical protein + Prom 2476 - 2535 4.9 8 4 Tu 1 . + CDS 2578 - 3735 652 ## pBF9343.12 putative replication protein + Term 3830 - 3875 5.0 + Prom 4706 - 4765 5.5 9 5 Tu 1 . + CDS 4880 - 5236 205 ## pBF9343.13 putative mobilization protein + Prom 5541 - 5600 12.3 10 6 Op 1 . + CDS 5671 - 7233 689 ## pBF9343.14 putative mobilization protein 11 6 Op 2 . + CDS 7249 - 7872 536 ## pBF9343.15 putative ParA-related protein 12 6 Op 3 . + CDS 7931 - 8365 284 ## pBF9343.16 hypothetical protein + Term 8459 - 8491 2.1 13 7 Op 1 . - CDS 8656 - 9042 300 ## pBF9343.17c putative regulatory protein 14 7 Op 2 . - CDS 9035 - 10990 526 ## COG3505 Type IV secretory pathway, VirD4 components 15 7 Op 3 . - CDS 10994 - 11218 114 ## BT_1941 hypothetical protein - Prom 11279 - 11338 5.4 - Term 11314 - 11357 0.4 16 8 Tu 1 . - CDS 11392 - 11835 329 ## pBF9343.20c hypothetical protein - Prom 12022 - 12081 5.0 17 9 Op 1 . - CDS 12178 - 12456 236 ## pBF9343.22c putative HU-like DNA binding protein 18 9 Op 2 . - CDS 12468 - 12821 201 ## pBF9343.23c hypothetical protein 19 9 Op 3 . - CDS 12818 - 13165 214 ## pBF9343.24c putative topoisomerase-related protein 20 9 Op 4 . - CDS 13162 - 13653 294 ## pBF9343.24c putative topoisomerase-related protein 21 9 Op 5 . - CDS 13654 - 14361 404 ## pBF9343.25c hypothetical protein 22 9 Op 6 . - CDS 14342 - 14845 192 ## pBF9343.26c putative integral membrane protein 23 9 Op 7 . - CDS 14842 - 15951 481 ## pBF9343.27c hypothetical protein 24 9 Op 8 . - CDS 15956 - 17515 1068 ## pBF9343.28c putative integral membrane protein 25 9 Op 9 . - CDS 17517 - 18296 325 ## pBF9343.29c hypothetical protein - Prom 18358 - 18417 5.2 + Prom 18197 - 18256 8.1 26 10 Tu 1 . 
+ CDS 18277 - 18942 363 ## COG0739 Membrane proteins related to metalloendopeptidases + Term 18952 - 18986 3.6 - Term 18938 - 18976 1.5 27 11 Op 1 . - CDS 18989 - 19855 431 ## pBF9343.31c hypothetical protein 28 11 Op 2 . - CDS 19842 - 20102 313 ## pBF9343.32c hypothetical protein 29 11 Op 3 . - CDS 20124 - 20603 329 ## pBF9343.33c hypothetical protein 30 11 Op 4 . - CDS 20600 - 21049 348 ## pBF9343.34c hypothetical protein 31 11 Op 5 . - CDS 21121 - 21348 142 ## gi|189464068|ref|ZP_03012853.1| hypothetical protein BACINT_00403 Predicted protein(s) >gi|226332119|gb|ACIC01000201.1| GENE 1 83 - 334 212 83 aa, chain + ## HITS:1 COG:no KEGG:pBF9343.03 NR:ns ## KEGG: pBF9343.03 # Name: not_defined # Def: putative DNA-binding protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 83 1 83 83 127 98.0 2e-28 MKKEFTLRLDDAACAELENLKKLSGEKTDAKAIRFAIQNYATLNDRYIETQRQIRTWKDK YAALNKAVDDYLSSFEALKKTTD >gi|226332119|gb|ACIC01000201.1| GENE 2 369 - 584 141 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|189464110|ref|ZP_03012895.1| ## NR: gi|189464110|ref|ZP_03012895.1| hypothetical protein BACINT_00445 [Bacteroides intestinalis DSM 17393] # 1 71 1 71 71 120 100.0 2e-26 MIGVEKKIFRLSSSVLNRTFKQEKSIFCSMVLRLMDTENYANNYCDSLSLVLELFPEIDR EKLEKELDKYV >gi|226332119|gb|ACIC01000201.1| GENE 3 656 - 1045 398 129 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253572917|ref|ZP_04850314.1| ## NR: gi|253572917|ref|ZP_04850314.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 129 1 129 129 236 100.0 3e-61 MAVDMTDETIAQYMERIEKMSIKEIQKEIEFLETPGYNCEGLVCADGVITPRTKMHRKVL WYKRMNQKSLTALQWAKEGYVVNPDATGRKWKNNPHNSKAIIYYVSDEVHEDQEAAKEFL KSRRKEKKE >gi|226332119|gb|ACIC01000201.1| GENE 4 1142 - 1360 239 72 aa, chain + ## HITS:1 COG:no KEGG:pBF9343.04 NR:ns ## KEGG: pBF9343.04 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 72 1 72 72 90 61.0 2e-17 MNERTIQIDVIGKIEGTQFMKCKLYTNENIVIIMMNEFDYERLKEEGIFIRDGKSRDSAG VLNTTNTFIEKN >gi|226332119|gb|ACIC01000201.1| GENE 5 1379 - 1696 293 105 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|189464105|ref|ZP_03012890.1| ## NR: gi|189464105|ref|ZP_03012890.1| hypothetical protein BACINT_00440 [Bacteroides intestinalis DSM 17393] # 1 105 1 105 105 189 100.0 3e-47 MIMDKKEKNFATYKEFAKMLREVANIYSKLGDEPLLKEGYEYNAIRDAVQYVTNKHDFGY FIQPWKDEFLRMPFDVTKRKKWADYVAECHATGKEIDYDNYDWDK >gi|226332119|gb|ACIC01000201.1| GENE 6 1757 - 2212 365 151 aa, chain + ## HITS:1 COG:no KEGG:pBF9343.09 NR:ns ## KEGG: pBF9343.09 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 151 13 163 163 232 86.0 3e-60 MFIEYKVYRRVSDLKPFISRDELPSCQMIGKKKFVGKKAKMEAVYRLTGKRLPEDYTTEQ VNNYLTVELFNTSLWHKYRKIYNEVSNEKEIVVENYSYQYTLVVELANKSNLSLDEGKIV HFVMCELLGNPCETYKGMKNPIISLRKDYDR >gi|226332119|gb|ACIC01000201.1| GENE 7 2202 - 2414 106 70 aa, chain + ## HITS:1 COG:no KEGG:pBF9343.10 NR:ns ## KEGG: pBF9343.10 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 70 1 70 70 133 100.0 3e-30 MIDKAKTLDECFKELILKRGWAKNSPYDRRTASRHKKQFLEGSLPDEFKRVYLQSAGYTI VQPELWRQEL >gi|226332119|gb|ACIC01000201.1| GENE 8 2578 - 3735 652 385 aa, chain + ## HITS:1 COG:no KEGG:pBF9343.12 NR:ns ## KEGG: pBF9343.12 # Name: not_defined # Def: putative replication protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 385 1 385 385 757 100.0 0 MVNREIKVRKRAVKEEKEEIDGEIVRIRKGHPRTNLPVELPENPTWLKQPNVVTLMAGDF 
KTVQIRILIAVIEKLQNVIELSIQHIDKYGTSIPCEQLSLFQEYSDRIRVDIAYRDLGVS PDQYKEVKSMVRKLISIPVEFDVKDPITGEESWAITGLFTKANIPKTPYSRGFSLEMDRE VAKVFINVDRGFTRYIKEIALRAQSRYTIRMYMLISSWKEKGGFSIYVDRFRKFLKLEDK YPEFKDLYKRVIRPVYDDLFEQADCWFEMAEVYRNSGDSQPYKLNFKVIKSALSKKEEDL LKGQKKMITNFCTLHFAMQDEHLQQFIPHITLNNHAAVLNKMLYLGEYIRENWNKISNKA EYCLAVLLKEVETLPGIIGEEKENE >gi|226332119|gb|ACIC01000201.1| GENE 9 4880 - 5236 205 118 aa, chain + ## HITS:1 COG:no KEGG:pBF9343.13 NR:ns ## KEGG: pBF9343.13 # Name: not_defined # Def: putative mobilization protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 118 1 118 118 194 100.0 6e-49 MKANKRDRIVKVRLTNLEYEELTKASKSSKCMSDYFRQRVFRKGVGLIDPKEFIRSMDEI CLEMKRIGNNINQLARYVNIHKEEVNQEVLNEVEKKMADYVIMQDKLNVTWRKLMSIK >gi|226332119|gb|ACIC01000201.1| GENE 10 5671 - 7233 689 520 aa, chain + ## HITS:1 COG:no KEGG:pBF9343.14 NR:ns ## KEGG: pBF9343.14 # Name: not_defined # Def: putative mobilization protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 520 1 520 520 917 99.0 0 MVVVVLKAATSFAGIDYNERKNEEGKSELLVAENFAMNPDNLKKSDYIAYMESVCRTNPA VKAKQFHAVISCKGREYSADDLKNVALQYINKMGYGNNPYMIYFHSDTENNHVHIVSTRV QKNGQKVKDNMEAVRSQKVINQIMNVDLDLKAKDDISKYMEFSFSTVQQYKLLLEQSGWK LREKDGLLILYKGGEKHASIQLEQIKDKAKQYTPDEERRKQITALLFKYKQGLSYTELQT LMKDKFGINLVFHTGKGHTKPYGYTLIDYRNKRVLKGGEVMDLKELLVEPNKQAKVEHCN SIINLLLKDGTKYTMDSFKKLMLDYGYRFSMNGTIFINGDDKALLVLDKKLLKELRYNSR VYEANKFNVATLKEAELLSRIYHVKKDDILIQEHMDKKNTVYSDMINSYLAHSSDLHSTL REKGILFVEDDYLVYLIDKKNKVIVSTDDLGINIIKDGYSKDRITLVTLDKMERINVEQD SELARGFNLIDAICDILSQHINVQQDKSPNRKKKRGQQQN >gi|226332119|gb|ACIC01000201.1| GENE 11 7249 - 7872 536 207 aa, chain + ## HITS:1 COG:no KEGG:pBF9343.15 NR:ns ## KEGG: pBF9343.15 # Name: not_defined # Def: putative ParA-related protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 207 1 207 207 387 99.0 1e-106 MIILFGNQKGGCGKTTNCIQFANYLVEKGKEVLVLDLDFQRSLSDRRKEDIATYDNEPKY EVIQTDISNVAKIISDFTKVEQGNLLIDLPGKIDDDALGTILKAADIIICPFKYDKFTMD STGFFIQVLQYLNVKAKIFFLPNNMRTAVRYETKEQVVDILKQVGFVTDEIPASVAMERI 
NTLAISNECIDKVKNAYDYIIQEGAIK >gi|226332119|gb|ACIC01000201.1| GENE 12 7931 - 8365 284 144 aa, chain + ## HITS:1 COG:no KEGG:pBF9343.16 NR:ns ## KEGG: pBF9343.16 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 144 1 144 144 248 100.0 6e-65 MADKLRKKQNEGISSVLEIAKMIRGEELHGSSINEAPIQEAKYEVEEEKEPLHVEMETTS SEVFNFNGSSEMDLTDLLEYIKGKNYNCKEVLYVDTDVKEVFGLLKAKAKIPISSLVSFI LEDWLTKHRSDISSLIKQKKNRFL >gi|226332119|gb|ACIC01000201.1| GENE 13 8656 - 9042 300 128 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.17c NR:ns ## KEGG: pBF9343.17c # Name: not_defined # Def: putative regulatory protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 128 1 128 128 211 97.0 7e-54 MYNITLPILGTRLKQIREHIGYSQAQLAEQLNCKQNAISNLELGKGGSLKLLFQVLNFYS NYVFIDLIFSEKFYLISNTEEDEAQKANYNSVIIEIIRQSEKNYEDAMSNAKKELEANLQ KAINLLQP >gi|226332119|gb|ACIC01000201.1| GENE 14 9035 - 10990 526 651 aa, chain - ## HITS:1 COG:alr7213 KEGG:ns NR:ns ## COG: alr7213 COG3505 # Protein_GI_number: 17233229 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Nostoc sp. 
PCC 7120 # 196 577 117 487 589 101 25.0 4e-21 MEESKELQKLYGFMQAFIYITVCIEILLFVHFPFSENITPFLSKLAKIPIYSNILYSKLF TFFIIMVTCIGTRSKKDLELDPTKQIIFPLIFGFIMFFGSVCFLFFKEKGENGLEWYNIA YIILSIIGAILINSALDNISKRIKSNFMKDRFNIENESFEQSKDKVETEYSVNIPMKFFY NRKWHNGWLNICNCFRGTFVIGTPGSGKSFSVINSFIRQHSAKGFAEVVYDFKFPELAKI AYYNYQKNKQLGKIPNNFKFNVINFSDIEYSRRINPLKREYIEILADATETAEALYESLQ KGDKGSGGNSDFFKTSAVNLLAASIYFWSRYENGKYSDLPHVLAFLNQEYDVLFKVLFSE PELKSLVSPFEAAYKSGAVDQLEGQMASLKVQLSRLATKESFWVFSGNDFNLKVSDKTDP SYLIIANNPKTQSMNSALNALIINRLTRLVNTKGNYPTSIIVDECPTLYFYQLATLLSTA RSNKVSICLGLQELPQLEEQYGKATAKTITSIIGNTLSGQAKAPETLDWLQKLFGKVKQV KEGVTIRRSETTINMNEQMDFVIPASKISSLQAGTLVGQVALDFGQEDNFPTAMYHCKTN LDLKKIKKEEEAYKELPKVYNFGTADNREKLLQKNFKRIYDEVETVIEQYV >gi|226332119|gb|ACIC01000201.1| GENE 15 10994 - 11218 114 74 aa, chain - ## HITS:1 COG:no KEGG:BT_1941 NR:ns ## KEGG: BT_1941 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 74 1 75 276 66 45.0 3e-10 MKDQKRLLHKCLLEDIPAFVICGTDICSVQTMEAYYQIAVEKGCNSIFLEDLKLAIEDFK AFQCEEPEKVKIPD >gi|226332119|gb|ACIC01000201.1| GENE 16 11392 - 11835 329 147 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.20c NR:ns ## KEGG: pBF9343.20c # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 147 39 185 185 283 97.0 1e-75 MKKYLSLVMIGLSLMLSGCSKDDNNDPDTPPTGDTGKVRYEATVSDPENFKLLVLYTVGV DFSSESPEKAAKEIVVESPFTFEQEAKQGTYLWLSAYPIPKDEENLENLQLPKKVEVKMF IDDKLHKSDSGENYAVVQYIFGQEKYQ >gi|226332119|gb|ACIC01000201.1| GENE 17 12178 - 12456 236 92 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.22c NR:ns ## KEGG: pBF9343.22c # Name: not_defined # Def: putative HU-like DNA binding protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 3 92 1 90 90 143 100.0 2e-33 MQMNKTDLIKMVAEKANKPTKEVAEIIDSFFKELSQNLREKDVSIKDFGTFKKIIQPARI ARNPATGEPVEVPEKEVVKFKPSKNVLNMKWL >gi|226332119|gb|ACIC01000201.1| GENE 18 12468 - 12821 201 117 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.23c NR:ns ## KEGG: pBF9343.23c # Name: 
not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 117 1 117 117 194 84.0 1e-48 MKIEFIIYSHFFKERGMKVKGDWNFPHLPRIGEEISPHIIMFQNEFTYQNLLEYLTDEAK SDFNKFNDGEDDLEGNFKAWVYDVICEVNIVESIHYRPDTEDYTQIIPEICLSDLSN >gi|226332119|gb|ACIC01000201.1| GENE 19 12818 - 13165 214 115 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.24c NR:ns ## KEGG: pBF9343.24c # Name: not_defined # Def: putative topoisomerase-related protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 115 164 278 278 218 90.0 5e-56 MTGFKSREGHDFSSRLVLTENLDISFDNTLCKCPKCGGNLYINKKAYNCSNYRNEAIKCD FVIWREMSGRSITPEEAIELCEKKETPVLTGFHDKNGQPMERKLVLNDDFKIKLI >gi|226332119|gb|ACIC01000201.1| GENE 20 13162 - 13653 294 163 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.24c NR:ns ## KEGG: pBF9343.24c # Name: not_defined # Def: putative topoisomerase-related protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 160 1 160 278 323 94.0 9e-88 MDNDTIKDLGLCPICQKGHIMKGSLGYSCNYFKNMNDKCTFNIYHSYWGKEITEEIARQL ITTGKTDIFHDFHNKKGVPFSAYLIIENGIVIPSFVNEVLETPCPVCGREIEILLNGYAC KGYSQKDKDNNRVCNLYIPKTIAQREIPLEAAEILARGKKHRL >gi|226332119|gb|ACIC01000201.1| GENE 21 13654 - 14361 404 235 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.25c NR:ns ## KEGG: pBF9343.25c # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 235 1 234 234 383 96.0 1e-105 MNLQTDNDMKRVLLFFSLMLLLGNIKGLAQSIVIDPVMIGTLVYSHQEQQGVLKDIQSEE TKIRNFQILIQQKMNQIQSLQEKTYNYLSTVNAVVKNGKDIIYASTLARDIAKYQSEAAK YAVGDPKLLTIIAKTEYELISRSVDLMIYINNIALQGGEKNLMDNKQRIDLCIHVVNELR RMRGLAYAVCRQMKSAKRAGVLKTLVPGQFKYVNSGKQKVDNILNGIKWISKGGY >gi|226332119|gb|ACIC01000201.1| GENE 22 14342 - 14845 192 167 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.26c NR:ns ## KEGG: pBF9343.26c # Name: not_defined # Def: putative integral membrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 167 1 167 167 273 95.0 2e-72 MNDGTNIASDDIILKPKFRYWLMKNFLLITLAIFVFIFSKYIREHELLKFISGTILVLLI 
FIIFYRYISMLLCTKWIITREQIKIYQGVLSKRINYIELYRVYDYEEKQSFIQSLINNTN IYIYSGDKSTPELLMNGLKANSDIIQTIRNRVEEQKKKKGIYEFTNR >gi|226332119|gb|ACIC01000201.1| GENE 23 14842 - 15951 481 369 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.27c NR:ns ## KEGG: pBF9343.27c # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 369 1 369 369 681 96.0 0 MLTFDDYKAKISIIQILEDLGYKQDISKGKVSPVFKLTDGAGNKLDEIIIKNPHSVQEHY YDRNYKGGDLIQFIKNHINDFPQFQHQNTFVRINMILGHYANEPYSPKYEAFKVVKAENK SFDRDRYKEFPTTLADLRFLTNERNINHQVVEKFLPFITRVKDLQGNGNYYNIGFPYINP KDKDNKVTNYELRNYGFKGMAAGGDKSNSLWIADFCPHPQMAKHIYFAESALDAMSFYQL NANKIKLEESVFCSVGGYISVNQIKNTLLRYPQAKVHTCFDNDLNGNLYDIKVSGIISNT EMTIKENKDDVLFKTKGREFTINKNDVSLESFREKSKIIAPMISHKAEKAKDFNEILMKQ HEQKKSIKL >gi|226332119|gb|ACIC01000201.1| GENE 24 15956 - 17515 1068 519 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.28c NR:ns ## KEGG: pBF9343.28c # Name: not_defined # Def: putative integral membrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 519 1 519 519 868 99.0 0 MEEKKKFEALQYSYENAGQFRAIAAILGYSEEYRNGDITFSKGNETYTYPLNTLKLGNLG DNRESLLEQSKERIISFFDKEQASNNFQEYSQYLRENHNVAIIKWDDIKQGTNKNEGYKD GFTVIDLENKIAYKGEDLYRYAYEQNQTLDGKGSYVDIDWNKFNEVGVKPENLSPEDIAN IKNGKKTGMLNFSIEDTPGNRTLLDNEKVSYKTENGKLHFEGKATTLKYITAENTPENKS KLKTNEIDFKEEGKRIKIDGINARKLAIAAITVVYPIAGIAILLIPKRQEIKNDLSFTKD EIKALKADNVVVKTNSKGERTLHQRDKDTNEIVSIKAKDIHIPQKLGGIELTPMQQENLK NGKEITIVNEELNKAAKVKLDLNARNGLSIKDANTIEIKATEKKEQTIGKERYISDKERL EFVAQKGAKGIDEIFKDKPTEMAAFLEKHKLSKDYASYKEVEKTYSSSREATKQTVGEQI STQMDKIDSSIKATAKQEASILGYGRTYGKNNDTPTMKL >gi|226332119|gb|ACIC01000201.1| GENE 25 17517 - 18296 325 259 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.29c NR:ns ## KEGG: pBF9343.29c # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 21 259 1 239 239 401 97.0 1e-110 MINLFFILIFLLLLGKNIHYMDKKQSIFNENDIPYKELELIGISKKQIWSLDKANITALL SGKRTSLLDLSFHDNNGEEISMKGKISLYWKDSNNAGVKVHPVRPEIMNDINLKPKELER 
LQDNEIITKTINNEKYLVQLDPETNELLKTKIKSISIPSNIKGVELDKQQKETLKSGKEL ILNVDKEKIAIRLDLNNPRGIKFLDFEQKQKIAYDRHNPQIIGTIHTDKNRNEYIEYMKG QKTTLGNESQSKVEHKFKL >gi|226332119|gb|ACIC01000201.1| GENE 26 18277 - 18942 363 221 aa, chain + ## HITS:1 COG:mll8577 KEGG:ns NR:ns ## COG: mll8577 COG0739 # Protein_GI_number: 13477076 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Mesorhizobium loti # 37 192 264 420 434 96 35.0 4e-20 MKNKFIMFFTYSLFFSVNGYCQFNTVLQKKDIPKAEPVLATTENKLIEEKEKAVVVNDAL TTERLAYWKQRRYLSLPIDSMVITSHYGKRKDPFTGKIANHRGIDLKGNNDYVYSIMPGM VVKTGKNKGLGNYVEVRHGDFTSIYGHLYSVLVNAKQAVEAGQPIGISGSTGRSTGEHLH FQMEYKDKTIDPKPILDYINEVIRTVKGQISQQIDNELRRK >gi|226332119|gb|ACIC01000201.1| GENE 27 18989 - 19855 431 288 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.31c NR:ns ## KEGG: pBF9343.31c # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 288 1 288 288 540 100.0 1e-152 MQRYEVEAIFEQKKDGLFEDINISSQKPIAIILGGQPACGKSTLINVAKKDHPNLDFLTV NGDLYRQFHPNKELIKDPIKYPIETQIFSSVFTEKLIEEAIKRKCNIIIEGTMRNPDVPL KTAQKFKDAGFRVEAYIIAAPKEFTQLGLYNRYQEEVLSKGQGRLADIDSHNKAVNGLMK SANQLYSDKAVDKISIHTYLAKERIKDFNLVNGEWSCKSMPSIFIDESRSKQMKNKEILN TNIQRGKELIESVTNPEVKKGMKEALSQLQSSLEKVQRQEKGLKKGFF >gi|226332119|gb|ACIC01000201.1| GENE 28 19842 - 20102 313 86 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.32c NR:ns ## KEGG: pBF9343.32c # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 86 1 86 86 131 100.0 9e-30 MNKEMSLDVALDIIGTLRMMKIDEISEEKDENRKKILQKELSVLNTEEKIANGLLQFEVS ENVRLSVMDKIQNYYAPKLKAYYATL >gi|226332119|gb|ACIC01000201.1| GENE 29 20124 - 20603 329 159 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.33c NR:ns ## KEGG: pBF9343.33c # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 159 1 159 159 313 100.0 1e-84 MKLYHGSTVTVKSPNIQKGRKATDFGKGFYTTTNFEQAKKWAILKRNREHGKKAVVSVYE 
VPDNILDREFSVLRFDGATKEWLEFVVNNRRGKGKNSYDLIMGPVANDQLYATIRLYEQG VVTADAAIEMLKTHKLFNQLSFHTVKVIPLLKFTESIEV >gi|226332119|gb|ACIC01000201.1| GENE 30 20600 - 21049 348 149 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.34c NR:ns ## KEGG: pBF9343.34c # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 149 1 149 149 248 100.0 4e-65 METNQTYQNELGSAMLPFVMRELVDTVMKRKTLPLEDALYYIYSSNLYKALLDENTKLWY SSTLSLYEALEKEKTEQKKVQKDNPKILLFQMFCAENYRETKNISAKETLLLFSNHGVFE FLYENFEMLHTQDTEYILDTIITYINKKA >gi|226332119|gb|ACIC01000201.1| GENE 31 21121 - 21348 142 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|189464068|ref|ZP_03012853.1| ## NR: gi|189464068|ref|ZP_03012853.1| hypothetical protein BACINT_00403 [Bacteroides intestinalis DSM 17393] # 1 75 15 89 89 104 100.0 2e-21 MGGIFVLAIYAYSELVYLIFICIAIVGLTIYYRKSIIKMYAEIFNELGINKYVIQIKNLK KYKEPIYKLINTIGL Prediction of potential genes in microbial genomes Time: Thu May 12 04:09:49 2011 Seq name: gi|226332118|gb|ACIC01000202.1| Bacteroides sp. 1_1_6 cont1.202, whole genome shotgun sequence Length of sequence - 14592 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 3, operones - 1 average op.length - 16.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 1 - 44 10.0 1 1 Op 1 . - CDS 64 - 705 220 ## PROTEIN SUPPORTED gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 2 1 Op 2 . - CDS 708 - 1367 252 ## pBF9343.36c hypothetical protein 3 1 Op 3 . - CDS 1370 - 1873 292 ## pBF9343.37c hypothetical protein 4 1 Op 4 . - CDS 1881 - 2657 459 ## gi|253572953|ref|ZP_04850350.1| predicted protein 5 1 Op 5 . - CDS 2676 - 3017 317 ## gi|253572954|ref|ZP_04850351.1| conserved hypothetical protein 6 1 Op 6 . - CDS 3036 - 3875 479 ## pBF9343.38c putative plasmid transfer protein 7 1 Op 7 . - CDS 3877 - 4986 580 ## pBF9343.39c putative plasmid transfer protein 8 1 Op 8 . - CDS 4990 - 5421 168 ## pBF9343.40c putative integral membrane protein 9 1 Op 9 . 
- CDS 5421 - 6035 387 ## pBF9343.41c putative plasmid transfer protein 10 1 Op 10 . - CDS 6068 - 7090 559 ## pBF9343.42c putative plasmid transfer protein 11 1 Op 11 . - CDS 7083 - 7739 249 ## pBF9343.43c hypothetical protein 12 1 Op 12 . - CDS 7741 - 8415 326 ## pBF9343.44c putative plasmid transfer protein 13 1 Op 13 . - CDS 8427 - 11267 946 ## pBF9343.45c plasmid transfer protein 14 1 Op 14 . - CDS 11273 - 11572 98 ## pBF9343.46c putative plasmid transfer protein 15 1 Op 15 . - CDS 11574 - 11888 265 ## pBF9343.47c putative plasmid transfer protein 16 1 Op 16 . - CDS 11908 - 12249 322 ## pBF9343.48c putative integral membrane protein - Prom 12335 - 12394 4.5 + Prom 12255 - 12314 5.4 17 2 Tu 1 . + CDS 12395 - 12988 227 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs + Term 13159 - 13218 4.1 + Prom 13222 - 13281 8.7 18 3 Tu 1 . + CDS 13413 - 14423 431 ## COG4227 Antirestriction protein + Term 14431 - 14463 3.0 Predicted protein(s) >gi|226332118|gb|ACIC01000202.1| GENE 1 64 - 705 220 213 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 [Kordia algicida OT-1] # 49 206 203 344 347 89 37 1e-17 MKKKIIILAAFVITFFTACTSTHMITFSNAELSDKEKDMIENNSNYSTASIVGGSAKLVK EERDKAIQEKIEARRNKLKAIDGLSISSYKDKNGKEQFKATAQSEILFKFDSYELNEEAQ KMLSNLCNIITEIPNTKIKIIGHTDNIGEKQYNIILSKNRAAAVGNYLRTAGIETNNITE DGKGYSEPIADNTSEAGRAKNRRVEIFITNDEY >gi|226332118|gb|ACIC01000202.1| GENE 2 708 - 1367 252 219 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.36c NR:ns ## KEGG: pBF9343.36c # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 219 1 219 219 415 99.0 1e-115 MKKIYIILLSIIGVISSCTKHEDIIDTSLQPGSILCSDGSIVHPSLFSPSEKEAIGVVFW CNDGNNPNIKETAYAVSLEDLEENPLIDTDEDIANVSEDENSFDGAANTAAIMNFAIKDS LAYPAAQKAIEYAPKGVTGWFIGSIAQNKAISNNLEKVYSSFSIIGGTHFDGWYWSSTED GAGKDTPKVFALISSLTKGRATSTSKRNSFKIRPIIAIR >gi|226332118|gb|ACIC01000202.1| GENE 3 1370 - 1873 292 167 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.37c NR:ns ## 
KEGG: pBF9343.37c # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 166 1 166 167 319 95.0 2e-86 MKKIIFLIMISFISLKSMAGDGDKFFNISGGWQWKNTVNAVVGLEFEGKYHNAYELYIDL ATAYDKCPVCNKVCSDSFWSYKTFGIGAAYKPTISRGKNSNLRWRFGADLGANRKGFQAS IDIGLEYSYSFRNGMQVFVMQKNDFVFWTRDHFRNGLLVGVKFPINK >gi|226332118|gb|ACIC01000202.1| GENE 4 1881 - 2657 459 258 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253572953|ref|ZP_04850350.1| ## NR: gi|253572953|ref|ZP_04850350.1| predicted protein [Bacteroides sp. 1_1_6] # 1 258 1 258 258 481 100.0 1e-134 MNILNEKKEVTPIITPSLGYKLRSINWNEDFYFDNPIDKEKINSIPKAIEISKDIFLAIE KSGMQYRETYKWDEYLYIKMTAENIFAATIYYYCKHYPKEYCNLAYIIATVCNPDISILS EMLKRDRETEIFIRGLNECHTLNVGSEFTGIKNELTTYLGRLYTKEFFYLFTEGKYSPNT SQFILLHYFDNVIWQVEQLQKNGFLFLESNIKKENDEESLYDSYDDGIPIKFLINKGRLD FSKIMYESTEKNETNNIK >gi|226332118|gb|ACIC01000202.1| GENE 5 2676 - 3017 317 113 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253572954|ref|ZP_04850351.1| ## NR: gi|253572954|ref|ZP_04850351.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 113 1 113 113 156 100.0 4e-37 METNSHLEAAASPVKQGQYSPRKRKDTTPKVNPQIAVELVYQELKRVEVYTKRIENAMEK GSLLEKSTEELIKTLKEEAKIREKLYIIKQVLLFIQILLLGFLILIAIYKFWF >gi|226332118|gb|ACIC01000202.1| GENE 6 3036 - 3875 479 279 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.38c NR:ns ## KEGG: pBF9343.38c # Name: not_defined # Def: putative plasmid transfer protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 279 1 279 279 504 100.0 1e-141 MRKIFIALCTMVSVITGNAQNTPIGENIELAGENPEELKVIYINKDVSTHFIAMEDIKYV DISVNDIVGDIPTGNSLRIKPTKEGASGVITIVTERFFVQYMLVYSSDLAKAYTRFNIPY ADLRSYMNPEVNLTKAQMYDYAHRMFISKNKFYDVSSTSNLMKIVLNNIYTLDKYFFIDI SMINKTKIRYDIDQIRFKIEDKKQTKATNFQSIEILPLMQVNKNLVFKKNYRNIFVFEKF TFPDEKVLTIEISEKQISGRTITLRIDYADILHADAFTE >gi|226332118|gb|ACIC01000202.1| GENE 7 3877 - 4986 580 369 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.39c NR:ns ## KEGG: pBF9343.39c # Name: not_defined # Def: putative plasmid transfer protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 369 1 369 369 578 100.0 1e-163 MKQIDIKKPKYIIPILILPFILGIGWLIKDMVNSAPAEETTLVETEELNLDIPDANLEKR EIKSKFESLKGAFKKSSDYSSIQTIDKEEEVSEIESYGSLYSNDEIRQIDSLNQASKIRE KELEQQIKGLPTIDESSQTETRKAPEKSKMQEEMELFKMQMAYIDSLQNPRPRTPQKKQD EKPKEKAIEVVKAKNPAEVYFNTVGKEQKTSLITAILDETLKVTDGSRVRIRLLDDIMIN DALLTKGTYLYGNVSGFKAQRVHINISSIMVDGKQMKVDLSVYDNDGQEGFFVPSSAFRD LSKDVGSQIGSQTIQMNSQSEGVEQFALGALQDVYRSTTQALSKNIKKNKAKLKYNTQVF LVNNKDKEQ >gi|226332118|gb|ACIC01000202.1| GENE 8 4990 - 5421 168 143 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.40c NR:ns ## KEGG: pBF9343.40c # Name: not_defined # Def: putative integral membrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 143 1 143 143 237 99.0 1e-61 MKLFKHKKNDRKEKIKTKAEQATVILLTKIFQRFGVNNACERLLAWADIHRKVMFGITIS FLSLVTFLSIVMRPSRKAMQVFNEEKNKRQISIDSRLQKKQIGVGDLLEVIKMQNEINEL RKNGKLTPEDTLRIKDLYNKLNK >gi|226332118|gb|ACIC01000202.1| GENE 9 5421 - 6035 387 204 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.41c NR:ns ## KEGG: pBF9343.41c # Name: not_defined # Def: putative plasmid transfer 
protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 204 1 204 204 387 100.0 1e-106 MIIKNLENKIKLAGILSIGCFATSIVISGMVLAFCFSLVKAERKKIYVLDNDVPVLVKQT GTEVNLEVECKSHINLFHTLFFTLPPDDEFIKYNMEKAMYLIDDSGLKQYNNLKEKGYFN TILASSATVTIMTDSIKIDMNNLSFKYYGIQRIERETSILKRQLVTTGKLRQIPRTENNP HGLIITDWKTILNKDLDYKVKKNF >gi|226332118|gb|ACIC01000202.1| GENE 10 6068 - 7090 559 340 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.42c NR:ns ## KEGG: pBF9343.42c # Name: not_defined # Def: putative plasmid transfer protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 340 1 340 340 573 100.0 1e-162 MNELIDQYLFELFDGLRDKTTVLFGEFIADAQALAAIFMLLYFGVESFKMMSGDKKLEII PLLRPFALGLVLMFWIPFINLISYPGEVLTAQSKAMFTNQLDEVELLSRNRYALIDSVAV ELLHTSILVERAENEVQDKKWYDFSIDFSAIGDKIAGLYVYVVAKAKMIMFNIIEFIVVT FWQVCTYFVFFLQIIFTGILVILGPLAFAFSVLPAFRDAYIQWIARFVSVSLYSCIAYIV LSISLVVMQYGIEREIEILEYALRNEAAFVMYVGMTSGGVNSFLLTALLGAFSMLTIPFV STWIVSTTGVGQAVGGMVGGAAIATKAIAAPATGGASAAM >gi|226332118|gb|ACIC01000202.1| GENE 11 7083 - 7739 249 218 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.43c NR:ns ## KEGG: pBF9343.43c # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 218 1 218 218 393 99.0 1e-108 MKRVLLFFSLMLIIGTSNVFSQMPVKDKGKWSVLNNMEFMDRVRLQPEYYHYWIWYKKVL GIKIPLVGLGLHDKYGKQDRRNFQLQEVPMMAVVKYNKSVTEKEGSKVDTIYKQELFKFG DKTIDYQHTLTKNRRNDILNDINKKLVEYSSTNGNKEHIKVITDEVTRIKKNIEIIHDSH MSNSKKREAYLDFDKELIEVLSLITRLNNINKTIMNHE >gi|226332118|gb|ACIC01000202.1| GENE 12 7741 - 8415 326 224 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.44c NR:ns ## KEGG: pBF9343.44c # Name: not_defined # Def: putative plasmid transfer protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 224 11 234 234 395 100.0 1e-109 MTKILFILVVASTLTQFAYSQKAVLDITCMETLIANHKVQHTSFSKVKDNETQISLIQKQ ISEKMVQIEFFQSKFYNSLKSVEAIIKTGKDIIYCTDIAADIGKYQKQMVELAVGDPALL LVSAKTQLELVNRTADLTQYIYQVAIVGTDVNLMDNKQRIDLLKYVINELRNMRGIAYSV CRQMRTAKRNGVLQTLAPGVFKYKDNRSKLVDDLLQNYKVKPKR >gi|226332118|gb|ACIC01000202.1| GENE 13 8427 - 11267 
946 946 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.45c NR:ns ## KEGG: pBF9343.45c # Name: bctA # Def: plasmid transfer protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 946 1 946 946 1785 99.0 0 MKKKSEFEAPYIGIETIDNTPIFYNRRGDYSVIIKCENPIIQYSADMDAYYDFHHLFTNI LKVLGTGYTIQKQDILCKKSFLPPQNRKNDYLSNRYFEHFKGRIYTDISTYLVITGEVER SKFFSFDPRRFDTFIRNITKVLGLFANRGIRAKLLNENEIEIYIKRFLSINFNQQTVSLK NIKAREENLIIGEKNVQCISLVDIDEVNFPSIIKPYKEVNIGLRFPVDLLSFLHDTPSID TIIYNQVINIPDQRNEANKLEGKKKKHKGMPDPANDLCVEDIERVQSDIAREGQMLVYAH YNIILAGLDDISKAINYVETSLFDCGIIINKQCFNQLELFECALPGNAINLNSYDKFLTT SDAAICLLFKEKLQVTENSPFLTYFTDRQGLPVGIDMSGKEGEKKYTNNSNFFVLGPSGS GKSFYVNSKVRQWVLDNTDIVLVDTGHSYSGMCEYYHGKYITYSESKPISMNPFRITEEE YNVEKKNFLKSLIFLIWKGANGEVKKQEEEIMDITIEKYYSFYFHPFNGYSDAEKEAIRE NLLLEFQVNAEEFENEREKEERKILSEKIDKLQQLVEKGEGGEKTNAERAIQNILIGKGF TRQELDNPETRLLSVIERRIQKKEDILKEIKVESLSFNSFYEFSLRIIPIICKENKIDFD ITNYKFLLKKFYKGGQLEKTLNEDFDTSLFEEPFIVFEIDAIKDDPELFPIVTLIIMDVF IQKMRLKKNRKALIIEEAWKAIASPMMAGYILYLYKTVRKFWGMAMVVTQELEDIISNPV VKNSIISNSDIICLLDQSKFIDKYQEIANLLSLTEVNQKQIFTINQLPNKENRNRFNEVF IKRGNYGNVFGVEVSLHEYFTFTTERIEKDAVGYYHIIYDSFQTGLDNFIIDLKASKLKN TDWVLQVNKVLGYHSDEGNLKDILNIIGQQPLYKYIPDKYKWLTNR >gi|226332118|gb|ACIC01000202.1| GENE 14 11273 - 11572 98 99 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.46c NR:ns ## KEGG: pBF9343.46c # Name: not_defined # Def: putative plasmid transfer protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 99 1 99 99 174 100.0 1e-42 MSKEHLSYNVYKGLQKPLIFKGFKGKFIYIGGACIISALLLCAIVSTLASFMWGGITLVI VMFGGLGITSKLQRKGLHKKDKRKGIYIVSRTFVRDRKK >gi|226332118|gb|ACIC01000202.1| GENE 15 11574 - 11888 265 104 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.47c NR:ns ## KEGG: pBF9343.47c # Name: not_defined # Def: putative plasmid transfer protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 104 1 104 104 152 100.0 3e-36 MKKFFSKIAQRAFALAISIMVVTNTFAQGAAGIDAASSELATYMDPLGNMMMILGGIIGL IGAVRVYLKWNSGDQDVQKAVMGWMGSCIFLVVSGVVVKAFFGI >gi|226332118|gb|ACIC01000202.1| GENE 16 11908 - 
12249 322 113 aa, chain - ## HITS:1 COG:no KEGG:pBF9343.48c NR:ns ## KEGG: pBF9343.48c # Name: not_defined # Def: putative integral membrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 113 1 113 113 144 100.0 7e-34 MSYSQFIMAVFAIYVVYYAANILYDAFLKQSKTNTGEEEEIISIGEEEEIPHKVVDEDFE TGSYEKEDEKKESEQNYVSVNESIEMEVETQGIPMEQLLAKGKSMFSHVNFNK >gi|226332118|gb|ACIC01000202.1| GENE 17 12395 - 12988 227 197 aa, chain + ## HITS:1 COG:XF2028 KEGG:ns NR:ns ## COG: XF2028 COG1961 # Protein_GI_number: 15838622 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Xylella fastidiosa 9a5c # 3 184 2 179 185 164 51.0 1e-40 MEIIGYARVSTREQNLDLQLDALKEAGCKLIFEEKVSGVKDRPELDKALAYLREGDTFVI WKLDRLGRSLKDLVYIVDCLQKRKVAFKSIVDGIDTNSALGRCQFGIFASLAEYEREIIV ERTRAGLQAAKERGKLTGRPIGLSEDAKRKAIAAKRLYENRDYSIDEICRILHIGSKATL YRYLRYEKVRLMNRRNK >gi|226332118|gb|ACIC01000202.1| GENE 18 13413 - 14423 431 336 aa, chain + ## HITS:1 COG:mlr6154 KEGG:ns NR:ns ## COG: mlr6154 COG4227 # Protein_GI_number: 13475143 # Func_class: L Replication, recombination and repair # Function: Antirestriction protein # Organism: Mesorhizobium loti # 10 314 27 303 320 104 30.0 2e-22 MKSTFDKNAEKFVNLMVEKIESLSMNWQKPWFSKVNSKQNFLPQNLTGRTYSGGNAFLLY FLCEKYNYQTPVFLTFNQARNEGINVLKGAVAFPVYYTLFCAYHRQTNEKISCDEYNKLS EEEQKEYRLAAYTKYFQVFNLDQTNFAEKYPNRWDILKAKFSGEEQPQEEKEMYVNPILD EMNKNQNWVCPIQTVFSDSAFYSISNDSITLPLKSQFKDGESFYGTELHEMGHSTGVKNR LNRKGFYENDKFNYGREELVAELISALSGVYLGISVTVREENAAYLKSWCKAIKEEPKFL FTVLADAIKGSKFIAQHLNIRLDVEEVDEEKNTKVA Prediction of potential genes in microbial genomes Time: Thu May 12 04:10:52 2011 Seq name: gi|226332117|gb|ACIC01000203.1| Bacteroides sp. 1_1_6 cont1.203, whole genome shotgun sequence Length of sequence - 726 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Thu May 12 04:10:56 2011 Seq name: gi|226332116|gb|ACIC01000204.1| Bacteroides sp. 
1_1_6 cont1.204, whole genome shotgun sequence Length of sequence - 16219 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 3, operones - 2 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 215 - 274 8.1 1 1 Tu 1 . + CDS 343 - 1692 848 ## COG1672 Predicted ATPase (AAA+ superfamily) 2 2 Op 1 . - CDS 1781 - 3157 1164 ## COG3119 Arylsulfatase A and related enzymes 3 2 Op 2 . - CDS 3172 - 4764 1127 ## COG3119 Arylsulfatase A and related enzymes 4 2 Op 3 . - CDS 4772 - 6457 1301 ## COG3119 Arylsulfatase A and related enzymes 5 2 Op 4 . - CDS 6469 - 7830 885 ## COG3119 Arylsulfatase A and related enzymes - Prom 7882 - 7941 7.2 - Term 7955 - 8007 9.3 6 3 Op 1 . - CDS 8053 - 9090 668 ## BT_3485 hypothetical protein 7 3 Op 2 . - CDS 9107 - 11155 1233 ## BT_3484 hypothetical protein 8 3 Op 3 . - CDS 11172 - 14312 2251 ## BT_3483 hypothetical protein 9 3 Op 4 . - CDS 14324 - 15865 668 ## BT_3482 hypothetical protein 10 3 Op 5 . - CDS 15883 - 16218 231 ## BT_3481 putative glutaminase A Predicted protein(s) >gi|226332116|gb|ACIC01000204.1| GENE 1 343 - 1692 848 449 aa, chain + ## HITS:1 COG:PAB1371 KEGG:ns NR:ns ## COG: PAB1371 COG1672 # Protein_GI_number: 14521702 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Pyrococcus abyssi # 3 414 10 428 472 222 32.0 1e-57 MKFYNREKEVAELKRVQELAFTQNSRMIVVTGRRRIGKTSLIKEALKGTPTVYLFIGRKA ESILVTDFVKIVRETLSIFVPEGIPTFTNLMQYLFEVGKTRAFNIVIDEFQEFYNIRPDV FSDMQNLWDEYKQETKINLVISGSVYSLMQKIFTDHNEPLFGRADNILCLRAFKIGVLKQ IMEDLASGYSNDDLLALYTLTGGIPKYVELFCDNQALTIDKMYDFVFSENSLFIDEGRNL LITEFGKNYGTYFSILSEIANGHYTQGEIESALGGTAIGGYLSKLENVYNLITRERPIFA KPGTKKNVRYVIGDNFLNFWFKYIEQNRSYIELQNFDDLRQLAKADYPTYSGRVLEKYFK QKLAETGGFKEIGSWWETKSSIGKQRINSYEIDIVALKSSGKKALIAEVKRVPENNYSHT VFMEKVEHLKQKEMSKYDIETQVLGLKDM >gi|226332116|gb|ACIC01000204.1| GENE 2 1781 - 3157 1164 458 aa, chain - ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # 
Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 1 429 9 474 497 186 32.0 1e-46 MNKLILGASMVVSATTLSFAQQVERPNIVIVLADDLGWGDVGFHGSEIKTPCLNALAAEG VVLDRFYTAPISTPTRAGLMTGRYPNRFGIRTTVIPPWREDGLDENEETLADMLARNGYS NRAIIGKWHLGHTRKVHYPINRGFSHFYGHLNGAIDYFDHMREGELDWHNDWETCYDKGY STELITQEAVRCINTYEKEGPFLLYVAYNAPHTPLQAQEKDIELYCDDFGSLTPKEQKRV TYQAMVSCMDRGIGTIVDALKKKGIMDNTFLIFFSDNGPAGVPGSSSGKLRGRKFDEWDG GVRVPAVFYWKRAESNYKNLSSQVTGFVDIVPTLKELVGDKNRPERAYDGISILPLLIGS TSCIDRNFYLGCGAVVNKDYKLIRKGRKPGLNLPQDFLVDYQTDPYEKKNASNGNEQIVR SLYQVALKYDTITPCLPEIPYGKGREGFKAPVEWKVTR >gi|226332116|gb|ACIC01000204.1| GENE 3 3172 - 4764 1127 530 aa, chain - ## HITS:1 COG:PA0183 KEGG:ns NR:ns ## COG: PA0183 COG3119 # Protein_GI_number: 15595381 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 22 517 3 520 536 301 34.0 2e-81 MVRTNFLMTAPLLMASFCSAQEKPNVVLILVDDMGYSDIGCYGGEILTPNIDALASKGVR FTQFYNTSRSCPARASLMTGLYQHQAGIGQMSEDPFANPDQKSPNDWGVQGYKGYLNRNC VTIAEVLKEDGYHTYMAGKWHLGMHGEEKWPLQRGFERFYGILSGACSYLRPSGGRGLTL DNTKLPAPDAPYYTTDAFADYAIQFVDEQKDDRPFFLYLAFNAPHWPLHAKEKDIQKFTK IYRKKGWDEIREARRKRMAKMGIIDSNTDFAEWENRKWDELTEQEKDQVAYRMAVYAAQV HCVDYNIGKLIDCLKKNHKMDNTLILFMSDNGACAEPYEELGGGKVEEVNNPASSFAPTY GRAWAQVSNTPFRKYKCRAYEGGISTPLILSWKGALGNKKGKWCRVPGYLPDIMPTILEA TGAKYPETYHGGNKIYPLVGSSLFPAVHKKTDSLHEYMYWEHQNNRAIRWRDWKAIRDEK GKEWELYDVVKDRTERFNVAEQHPEVLTRLVTEWERWAHANFVLPKHPKK >gi|226332116|gb|ACIC01000204.1| GENE 4 4772 - 6457 1301 561 aa, chain - ## HITS:1 COG:PA0183 KEGG:ns NR:ns ## COG: PA0183 COG3119 # Protein_GI_number: 15595381 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 29 546 3 520 536 300 36.0 6e-81 MNSKAFLLPAALMIAGNSVANSKGKKTDKRPNILVILADDLGYSDLGCYGSEIHTPNLDK LAQQGVRFNHFYNASRSCPTRASLLTGLYQHQAGIGRMTFDDNLPGYRGTLSRNAVTIAE 
VLKESGYTTSMIGKWHVAETPLRKDQREWLAHHVYHDTYSDLCHYPVNRGFDSHYGTIYG VVDYFDPFSLVEGEVPVKEVPEGYYITQALSDRAAEEVTEYAKDDKPFFMYLAYTAPHWP LHALPEDIEKYKDTYKVGWEAIRNARYERQKQLGIFPGMDDFLSERQFKDRWEDNAHAEW DARAMAVHAAMIDRMDQGIGQVIDALEKTGQLDNTLILFLSDNGCSNENCQNYSPGENDR PDMTRKGEKMVYPHNKEVLPGPQTTYASLGARWANVANTPFRFWKAKSYEGGICTPMIAH WPKGIKKNVGGMTPEIGHVMDIMATCIDMAGATYPAKYKGNDIIPMAGKSLLPIFKTGHR EGHDYLGFEHFNERAFLAKDGWKLVRPGENAKWELYNLNEDRSEQHNLADKYPEKKNEMV KAYEEWAKRCMVEPYPGQKKK >gi|226332116|gb|ACIC01000204.1| GENE 5 6469 - 7830 885 453 aa, chain - ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 11 440 17 474 497 204 31.0 3e-52 MNIKKHLPWLGALLPIINAQADTNPNVVIIYIDDMGIGDIGCYGGKFVATPNIDKLAQDG LLFNQYYSSAPVSSPSRCGLTTGLFPIEVGINTFLNDKASNKRCEQRNYLDDKLPSMARA FQNAGYATGHIGKWHMGGGRDVHNAPSIKNYGFDEYLSTYESPDPDPAITASKWIWCDND SIKRWKRTEYFVDKSIDFIKRHKDSPFFLNLWPDDMHTPWVPEFKQKERKSWETKEAFSP VLGEMDKQIGRFIKALDDMGLSENTIIIFTSDNGPAPSFKAVRSAYLRGTKNSLYEGGIR MPFIVKYPKKIKPGRVNNSSVLCAVDLYPTLCSVAGIKTEKNYKGDGQNYAKVLLGKSEA KRKTDLMWDFGRNKHFGFPGNPYDRSPHLAIRSGKWKLLVNGDGSDAQLYDMELDKFEKN NIAGEHPELVAKLSKKVCEWYAQNKDKGLQTDL >gi|226332116|gb|ACIC01000204.1| GENE 6 8053 - 9090 668 345 aa, chain - ## HITS:1 COG:no KEGG:BT_3485 NR:ns ## KEGG: BT_3485 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 345 1 345 345 662 100.0 0 MKKIYTYLMMLTVSALIISCNEEWSGEQYEHYISFKAPINSDGVSRINVRYKANEATTYQ VPLIVSGTTLNDKNYTVRIGLDPDTLVVLNKERYSTRIDLYYQVLPATFYSLPETVEIKA GENTALLPIDFKLNNIDLSEKWVLPLTVEESSDGSYTPNYRKHYRKALLRVMPFNDYSGN YSAVNYKCYIKGAESEAPFVKNYVMTYVVDDQSLFFYAGIIDETRVDRKGYRILAKFINN DNGEIGNVELSPADSDNKMKFQIGKDDYGKEVTPTYSVTETMDPIRPYLKRKTIIIRNIN YTYVDYTTIPGVETEYHVIGSLALERQINTQIPDEDQAIDWEAGN >gi|226332116|gb|ACIC01000204.1| GENE 7 9107 - 11155 1233 682 aa, chain - ## HITS:1 COG:no KEGG:BT_3484 NR:ns ## KEGG: BT_3484 # Name: 
not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 682 1 682 682 1305 100.0 0 MKKYILLIIIIGFGTTSCSDYLDSEYIFKDRESLEKIFTDYDKTEQWLARAYSFLTGQCC DVGSKRNTPFVFDDCMYYGDDDVTIDASKGGNMSYNKFHEGGYDEGAFQDTWDRCYNGIR QATIFMQNVDMNDKYKPEEIVDLKAQARFVRAYFYWILLRKYGPIPLVPENGFNYNDSYD DVATPRSSYDVCANYIASEMVIAAKDLPLTREQLSINRPTRGAALATRALALLYAASPLT NGNNDEYAQQLLDDKGNRLLSPEYDESKWAKAAAACKDVMELNVYKLYTTGINTVNTGAF PATIAPPHHDVYSNEDFPNGWANIDPYESYRALFNGTVSSVDNPELIFTRGQNIGGERIK DMVIHQLPSVAKGWNTHGMTQKQVDAYYMNTGDDCPGMNSQYAGYQAYKDRVDNRPRPVG YTSNAQDYKPLSANVSLQYANREPRFYASVAFNGARWYLGNESQQANQNKQVFYYRGGGN GYSNNMFWLRTGIGVMKYVNPSDTYENSDGAKIIDKDEPAIRYADILLMYAEALNELTTS YEVSSWDGIKAYTINRDPEEMRKGIKPVRMRAGVPDYDTPIYDNPAKFRIKLKRERQIEL FAEGKRYYDLRRWKDAELEESVPVYGCYTFMTESQKDLFHTPVPISNLSATFSSKMYFWP IKHDELKRNKRLTQNPGWTYND >gi|226332116|gb|ACIC01000204.1| GENE 8 11172 - 14312 2251 1046 aa, chain - ## HITS:1 COG:no KEGG:BT_3483 NR:ns ## KEGG: BT_3483 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1046 1 1046 1046 2041 100.0 0 MKKLFVLCTMLLWVAINALAQQKISVSGIVTDENKEPVIGANVSVKDVPGLGTITDLNGR YVIKVEPYNRLVFSYIGYQSQEVLIKNQSKVNIVLKEDVTNAIDEVVITGTGAQTKLTQT GAITTVNVDHLKANPSSSIVNTLAGNVAGVLAMQTSGQPGKNVSEFWIRGISTFGASTSA YVLVDGFERSMDEINIEDIESFSVLKDASATAIYGSKGANGVVLITTKHGKAGKINITAK AETTYNMRTITPEFENGVNYANFLNEARITRNYEPVYQPEELDIIRLGLDPDLYPNVDWM DLLLKEGAWQHRVNLNMNGGGNTARYFVSFSYLDEQGMYNTDSSLKGDYNTNADYRRYNY RMNTDIDITKSTLLKLGVSGYLSKRNSPGLGDSDVWGELFGYTPIRTPVLYSNGYVPAVG TGNKTNPWVAATQTGFNENWKNTIQTNVTLEQKLDFITKNLKFVGRFGYDTYNDNTIKRH KWPEQWLAERARNEEGELVFKKISGAGNMAQESSSSGNRREFLDLTLNWNRAFKAHHPSA TLKYTQDAFVKTVALGDDLKEGVSRRNQALAGRFAYNYNYRYFVDFNFGYNGSENFAKGH RFGFFPAYSLAWNIAEEKIIKKHLKWMNMFKVRYSYGKVGNDNVGTRFPYLYSIADNYKE NDQTKYYAGYNWATYGSNKSYNGLRYTKLASNDITWEIATKSDLGIDMALFNDKFTATID YFDESRSGIYMERAFLPGMVGLDGNKPYANVGEVNSKGFDGNFSYKQKISDVKFTVRGNI TYSKNEIIERDEENNVYAYQMKKGYRVNQNRGLIALGLFKDYDDIRNSPTQKYGPVMPGD IKYKDVNGDGVVNDGDIVAIGSTSKPNLIYGLGLSATWKGLDVSLHFQGAGKSQFFTYGK 
CIWAFTEGQWGNILKGTLDNRWVDAETAQKLGIPANENPNASYPRLSYTDNGNGNTTIYE YKNNYRNSTYWLRNGSYVRLKTIDVGYTLPKKIVNKIHFNNVRIFMTGTNLLTWSSFKLW DPEMGASRGEQYPLAKSITLGLSVNL >gi|226332116|gb|ACIC01000204.1| GENE 9 14324 - 15865 668 513 aa, chain - ## HITS:1 COG:no KEGG:BT_3482 NR:ns ## KEGG: BT_3482 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 513 1 513 513 1062 100.0 0 MNLKSNKKSKFWCIPYVMLTLLAFFLGSCDDGKEVSPITGSGGTPFNPAFPIRIDSIYPK AGSAGQRVLIYGENFGNNPSQLEVYIGGQKSSVVNVKSTYMYCLMPSKAYEGDIQVTIKD GEKTYTSEKSDVRFEYQRLKVVSTLLGEVNPQISDKKKNFEIKDGPFDNCGGVANPCFMK WDPEYPELLYFAEDTSVDPEGVTGSGIRCIDFGRQQLYTILPRSEYGGYERGRSIDFFKD KDGVYHMVVATSQANATNPAVYMFDRVSDTTKPCGFKWSNRRTIVNYNGCACAAIHPQNG DLYFTHFQKGHLYRVKKDILQEFVDGTRTTAVSPGTGADYMFTVQDKEWEFNIIIHPTGN YAYLMVINKHYILRSDYNGESFTTPYVVAGSANQNGYVDAVGNLAKFDSAYQGVFVKNDN YKGEIDEYDFYIADKHNHAIRILTPLGQVSTFAGRGSASLNSNPWGYVNGDLRKEARFER PKAIAYDEKTGIVYVGDAYNHRIRKIALEEFND >gi|226332116|gb|ACIC01000204.1| GENE 10 15883 - 16218 231 111 aa, chain - ## HITS:1 COG:no KEGG:BT_3481 NR:ns ## KEGG: BT_3481 # Name: not_defined # Def: putative glutaminase A # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 111 64 174 174 215 100.0 4e-55 MWNLNLFPNNVIDKEVSYYLTKQNPYGLPLDSRKEYTKSDWIMWIAAMSPDQDTFEQFIN PLYKYINETTSRVPISDWHHTDSGKWVGFRARSVIGGYWMQVLMNKVMNNQ Prediction of potential genes in microbial genomes Time: Thu May 12 04:11:33 2011 Seq name: gi|226332115|gb|ACIC01000205.1| Bacteroides sp. 1_1_6 cont1.205, whole genome shotgun sequence Length of sequence - 14609 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 2, operones - 2 average op.length - 5.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 140 - 415 141 ## BT_3502 hypothetical protein - Prom 435 - 494 1.8 - Term 427 - 467 0.2 2 1 Op 2 . - CDS 512 - 1990 682 ## COG4833 Predicted glycosyl hydrolase 3 1 Op 3 . - CDS 2002 - 3873 997 ## BT_3500 hypothetical protein 4 1 Op 4 . 
- CDS 3883 - 4617 385 ## BT_3499 hypothetical protein 5 1 Op 5 . - CDS 4610 - 5281 436 ## BT_3498 hypothetical protein 6 1 Op 6 . - CDS 5308 - 6063 228 ## BT_3497 hypothetical protein - Prom 6085 - 6144 3.1 - Term 6074 - 6125 10.6 7 2 Op 1 . - CDS 6147 - 7145 689 ## BT_3496 hypothetical protein 8 2 Op 2 . - CDS 7166 - 9232 1291 ## BT_3495 hypothetical protein 9 2 Op 3 . - CDS 9246 - 12380 2222 ## BT_3483 hypothetical protein 10 2 Op 4 . - CDS 12399 - 13946 1136 ## BT_3492 hypothetical protein 11 2 Op 5 . - CDS 14024 - 14542 478 ## BT_3491 glutaminase A Predicted protein(s) >gi|226332115|gb|ACIC01000205.1| GENE 1 140 - 415 141 91 aa, chain - ## HITS:1 COG:no KEGG:BT_3502 NR:ns ## KEGG: BT_3502 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 91 1 91 91 114 100.0 2e-24 MNKKYFYKYAISKQQQIQKESNHLFQVALLFFFLKNLIFKITTPLGYCYKCLLNILFKLM KQFNYLLSLIVIIKNVIVSITLICKIILKYI >gi|226332115|gb|ACIC01000205.1| GENE 2 512 - 1990 682 492 aa, chain - ## HITS:1 COG:lin0763 KEGG:ns NR:ns ## COG: lin0763 COG4833 # Protein_GI_number: 16799837 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosyl hydrolase # Organism: Listeria innocua # 162 492 24 341 341 114 30.0 6e-25 MKQFAQTYYLLQFIICLICLSISCGEGGNDNYSETNDERNKEPEITVSPITNLEAKSTQI ANQLLITWQNPNTPNLLGVELSYLQKNGGTHNGKQIIQGAAGKTGNYTLQLPQYGTYEIS AIAIDNYGHRSSAVTVIATPAETTVPFSWATLADSCTYVLIEQFMNKSKGTFWSTPKDMS DESTYIYWQQAHAMDVVIYSYKRIKDTNKQLAATYRTYFERWYANHANNYHRNPSDETGF LNDFTDDMCWICLTLIHLSEATGDEKFAQTAKIVYDKYIITRAWTDDKGTGLPWNTTQND RNACTNSPGCLVAAKLYQRYEDGNYLSDAKKLYEYVVNNSYNADGRVEEPPLTYTQGTFG EACRQLYHITQESKYMDKAKQVINYAATSDRCLRNGILRDEGSSMDQSIFKSVYIPYAVN LALDEASSSIGKHLITFLKNNAETLRMNLNHAAYPAMYCNYWWGEMWKSSEPASMGAQVS GASLMEGIARLE >gi|226332115|gb|ACIC01000205.1| GENE 3 2002 - 3873 997 623 aa, chain - ## HITS:1 COG:no KEGG:BT_3500 NR:ns ## KEGG: BT_3500 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 42 623 1 582 582 
1191 100.0 0 MNLYKILPALLILSLLAACSSDKENDVYTTLPFSLSIYQVAMTIEGGEQEVNVSSGGAWQ SSIHCDWMTLSPSSGGIGITKVRIKVKKNTSGLPRNGRIAIISGKEEKGITVMQSSIEGE EDYLRLLNLTIDGVTTAVDYVNNSYYLPIDMDAVQPAKVKVEFGGIGVDFIKIGDKKIKS GENIALSLNPGQNIEMEAGNNTIDKIRSCRLIVTGVPVIEISAPEGIEDENKRPCDIVLT DPKRRTNNSVLRFESYAGIEFRGAGALRYAKKSFSFKLKNRQTNENQDAKLLGLREDQSW ILDAMWLDCSKMRNRVCFDLWNDFNTLYYSNTEPEAVNATHGYPVEMILDGAYHGLYILS DRIDRKQLKMKKKGGYLYKGKEWTDECKLQGINTPYSNSKQAWQGFESDYPDEVGEIEFK YLSDLIQFFAAASKEEFAAQYEERLDVSSLIDYFIFINLLSAYDNTGRNVFWGIYNVNLT LLPKFIIQPWDLDGTLGRTWDAIKLDPEKGLGFDNGLILRNDAGSKYFRPFERIMNENPN NIKRRIYERLMAVKDNVLSPANLASKVDYYKRQIVESGALERDRQRWTGYSMYGYANPEE EAEYIKSWYIARLKYLETIFNNF >gi|226332115|gb|ACIC01000205.1| GENE 4 3883 - 4617 385 244 aa, chain - ## HITS:1 COG:no KEGG:BT_3499 NR:ns ## KEGG: BT_3499 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 244 1 244 244 496 100.0 1e-139 MNKKLFLAGLFCLVSFALQAQKDDLGLWTSVGMEKRLFRDFDISLEGEFRSRDKLSEVGR WSGSAGVAYKITNWLKAATAYTYIYYNHPSEITNKGNVIPEYWQPKHRFYFQLTGKVSLN RFTFSLRERWQYTYRPSQSVSKFDGDDGSPKDDEYVKGKGKNVLRSRLQATYNIPKCSLT PYASCELTHLLNDKGAIDKTRWTLGVEWKLSKHHGLDFYYLYQNHADEDEANGHVIGAGY TFKF >gi|226332115|gb|ACIC01000205.1| GENE 5 4610 - 5281 436 223 aa, chain - ## HITS:1 COG:no KEGG:BT_3498 NR:ns ## KEGG: BT_3498 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 223 1 223 223 416 100.0 1e-115 MDELDFLDIIETPLIDGGDFLQLLLRFGFNFLVTGIIIHLFYYPKSKRRDYHFTFTLISI SIFLMIFLLGSVKLKVGFALGLFAIFGIIRYRTESMPVREMTYLFAIIAISVINALGVEV NYGGLIGANLLFVLCIWLCESNRWLKHVSCKLILYDRIELIKSSCEQELIADIKARTGLK IIRVEVGYIDFLRDAAMLKIYYEPMTNEINTIDTLTKIPRGNE >gi|226332115|gb|ACIC01000205.1| GENE 6 5308 - 6063 228 251 aa, chain - ## HITS:1 COG:no KEGG:BT_3497 NR:ns ## KEGG: BT_3497 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 251 1 251 251 481 100.0 1e-135 MTEILQTLKEYIPISLEEMSGIKLMNRTDTKYVTSYQILKEILLAAGHDYRVQEVNGEYN 
IAYHTIYLDTADRDMYLTHQNGRVVREKIRIRTYVDSDLTFLEVKNKNNKGRTDKKRIRI GSIDTIKEDGGEAFLRQHAWYEQSQLLPLLENSFRRITLVNKHKTERLTIDTGVTFCCLQ SGKISSLDNIVIIELKRDGHNSSPIKEILRKLHIQSSGFSKYCIGSALTVSHLKCNRLKP KLRMLNKMGVY >gi|226332115|gb|ACIC01000205.1| GENE 7 6147 - 7145 689 332 aa, chain - ## HITS:1 COG:no KEGG:BT_3496 NR:ns ## KEGG: BT_3496 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 332 1 332 332 662 100.0 0 MKGIYTLLFSLIVLVMFSACNDEWKEEQFEHYISFKAPINSNGVTRINLSYSETGKTTYR LPILVSGSTKNDRDITVKIGIDADTLRTLNIARFSSREDLWYREMPSKHYSFPETVQIKA GENVGYLDIEFDLTGLDLVDKWVLPLTIEDAPNGEYKPNYRKHYRKALLRVMPFNDYSGT YGGTNLQGKLVDAGAGENEPLVKSEIVTYAIDDETIFFYAGTIDESRQDRRNYKIIAHFV PNTNIVELTSDNSENNGFKINTRASSYIEEEMDAVRPYLLRRTVMIRDVDYEFEDYTLAP GARIKFRFEGTISMQRNINTQIPDKDQAIEWD >gi|226332115|gb|ACIC01000205.1| GENE 8 7166 - 9232 1291 688 aa, chain - ## HITS:1 COG:no KEGG:BT_3495 NR:ns ## KEGG: BT_3495 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 688 1 688 688 1328 100.0 0 MKKKLYILLISVLGLSIQSCSDYLDSDYIFRDRETIEKVFTDRDKTEAWLANAFSYLSDA CADVCNKRVTPHCFADDMYYGDDDGSIQESKEGELSYNEFKEGLYDENTKQGMWERCYQG IRQATIFIQYVDMNDKFESEAERLDYKAQARFVRAYYYWLLLRRYGPVPLLPDNGLDYDA SYDELATPRAPYEEIADYISQEMALAAKDLALTRGQNSAARPTRGAALSARALALIYAAS PLANGNNDEFAQLLVDDKGNRLLSPEYSEEKWAKAAAAAKDVMDLGVYELYTADRRTTDS PAEPATINPPYHEVYSNANYPEGWKDIDPFESYRSVFNGQIDILSNPELIFSRGRNIGGQ SIRDMVVHQLPLTATGWNTSGLTQKIVDAYYMNDGSNCPGMNSEYANVPEYNNNHRLDTR PRATGFTTTAEEHKPLAAGVSLQYADREPRFYASVAYNGAYWYLGNEKEIADRNKQIFYY RGKRDGYNAGMFWLRTGIGVMKYVHPDDTYQNNSIDNLRYKPEPAIRYADILLMYAEAIN EVSEGTYDIPSWDGTKTHSIHRSVAEMRKGIKPVRMRAGVPDFDNNIYADKDVFRIYLKR ERQIELMGEGKRYYDIRRWKDARVEEAMPVYGCNTLMTESDREMFHTPIAVWNLSATFSD KMWFWPISHGELKRNKRLTQNPGWTYND >gi|226332115|gb|ACIC01000205.1| GENE 9 9246 - 12380 2222 1044 aa, chain - ## HITS:1 COG:no KEGG:BT_3483 NR:ns ## KEGG: BT_3483 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 1044 1 1046 1046 
1608 75.0 0 MNNMKKYLSLCIMLLCLCSTVMAQQEKVVVKGVVTDEAKEPLIGVNVTVKDMPGLGSITD MNGHYSITMEPYRQLVFTYIGYESKEVMVKEQRTINVVMKESVTRAVDEVVITGTGVQRK LTQTGAITTVDVGELKRNPSSSIVNALAGNVPGIMARQTSGQPGKNISEFWIRGISTFGA NSSAYVLVDGFERSMDEINIEDIETFTVLKDASATAIYGSKGANGVVLITTKRGKAGKIK IDAKVETTYNTRTVTPKFEDGFTYASLMNESRITRNNEAVYKPEELDILRLGLDTDLYPN VDWMDMLLKDGAWSNRINLNMSGGGTTARYFVSASYVKEDGMYNTDESLKDDYNTNAAYH RWNYRLNTDIDITKTTLLKVGVAGFLSKRNSPGLGDGDVWGELFGYNPIKTPVMYSNGYI PAIGSGNKTNPWVAATQTGFNENWTNNIETNISLEQNFDFITKGLKFVGRFGYDSNNQNT IRRLKWPEQWLAERARDPETGELRWKKISGSQNMKQESSSSGNRREFLDMMLNWDRSFGA HHPSATIKYTQDSYIQTQNLGEDLKTGINKRNQDLAGRVAYNWNYRYFIDFNFGYNGSEN FAKGDRFGFFPAFSLAWNIAEEPFIKKHLKWMNMFKVRFSYGKVDNDNVGTRFPYLYTIA DRYKNGDNEIIYGGYDWAQYPSSYSFGGLGFADVASNGITWEVAEKNDLGIDLALFNDKF TATIDYFDEKRTGIYMVRNYLPQIVGLNGHNPAANVGAVSSKGFDGHFSYKQRINDVNLT VRGNITYSKNEVLERDEENNVYAYQMQRGYRVDQCKGLIAEGLFKDYDDIRNSPTQSWGK VQPGDIKYRDVNGDGVINDGDQVAIGATSRPNLIYGLGASASWKGLDVNVHFQGAGKSTF FTYGKCVWAFTEGEWGNIFKGMLDNRWVDADTAETLGIPANENPNASYPRLSYQGDNASN NNYRNSTFWLKNGRYLRLKTIDVGYTLPKSIVNKMHFNNIRIFLVGTNLLTWSSFKTWDP EMGDPRGESYPLTKSITMGISVNL >gi|226332115|gb|ACIC01000205.1| GENE 10 12399 - 13946 1136 515 aa, chain - ## HITS:1 COG:no KEGG:BT_3492 NR:ns ## KEGG: BT_3492 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 22 515 1 494 494 993 100.0 0 MNQQYNILLIINFLRMKKNKIMKDYWRVCWMALLLVALFFGSCSDDNDSNGDSDNAAFDP NIPVQVSGINPTTGGFGQRLVISGENFGNDPSIVNVFVGGKKAIVINVKNHSIYCLVPSQ AYSGEIEVQISNGDNPVVSTIAEAKFEYVRKQLVSTLCGSRRDDGGYDTKDGPFNDCGAF GSPNWLEFDPKYPHLVYVAADRGAGDENEGNGNMRILDLKNQYVGTALTEGDMGGTNRGR ALAFFDENHMAVAVDQGDELRAAVYGFTRNREAPEGDERNYKMWGNRLALVNFKACNTVT FHPVDGDMYFNSWDKGQFFKVEKQQIQEIFDGTRTTPADKEVLFQMDNGWEYNIRIHPTG NYAYIVSINKHYIQRTNYDWASKRFVTPFIVAGTAERAAYVDGIGTSARFNTPYQGVFVK NPEYAGQEDEYDFILCDKMGQCIRKITPQGKVSTFAGRGSASLNGNPWGYVDGDLRQEAR FDRPKGIAYDEATDTYYIGDGSNRRIRKIAYEGEE >gi|226332115|gb|ACIC01000205.1| GENE 11 14024 - 14542 478 172 aa, chain - ## HITS:1 COG:no KEGG:BT_3491 NR:ns ## KEGG: BT_3491 # Name: 
not_defined # Def: glutaminase A # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 172 1 172 172 330 100.0 9e-90 MGVAGYSEIAHILGLNSVAERYADIAKKMAMKWEEMANEGDHYRLAFDRNNTWSQKYNII WDKMWNLNIFPNNVISKEINYYLTKQNLYGLPLDSRRDYTKSDWIMWTAAMSPNREIFEK FVDPLYKYINETTSRVPISDWHDTKTGRMTGFKARSVIGGYWIQVLMDKMNH Prediction of potential genes in microbial genomes Time: Thu May 12 04:12:26 2011 Seq name: gi|226332114|gb|ACIC01000206.1| Bacteroides sp. 1_1_6 cont1.206, whole genome shotgun sequence Length of sequence - 4213 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 368 - 404 1.0 1 1 Tu 1 . - CDS 476 - 1759 496 ## BT_3478 integrase - Prom 1779 - 1838 2.5 + Prom 2216 - 2275 4.7 2 2 Tu 1 . + CDS 2301 - 4212 1091 ## BT_3477 glutaminase A Predicted protein(s) >gi|226332114|gb|ACIC01000206.1| GENE 1 476 - 1759 496 427 aa, chain - ## HITS:1 COG:no KEGG:BT_3478 NR:ns ## KEGG: BT_3478 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 427 1 427 427 835 100.0 0 MQHECNTDYTIAKYIPYIKPKYVTQEGTTVLYVRYNYNRNKRTLISTGYNIKPEHWDAKR KWIKRACPHYEEIDACLIKITSKLGDILTYAKVNGIDPTVDFVLLELEKNREYDLRPNRV DMFDALERYIVEKAPSVSADQIKDYRTLRKHLTAFKEYSSQPVTFRNLNLTFYNEFMDYL FHKVVKPNGTVGLLTNSAGKIIRLLKGFVNYQIVKGIIPPIDLKNFKVVEEETDAIYLNE KELAAIYELDLSDDKQLEEIRDVFITGCFTGLRYSDLSTLSPEHIDLDNENINLKQRKVH KAVVIPMIDYVPEILKKYNYDLPKIPRYMFNERVKELGRKIKLSQKIEVVRRKGKEREKR VYEKWEMISSHTCRRSFCTNMYLSGFPAAELMRISGHKSPSAFMRYIKVDNQQAAKRLKE LRAKLAR >gi|226332114|gb|ACIC01000206.1| GENE 2 2301 - 4212 1091 637 aa, chain + ## HITS:1 COG:no KEGG:BT_3477 NR:ns ## KEGG: BT_3477 # Name: not_defined # Def: glutaminase A # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 637 1 637 840 1269 99.0 0 MKKLLAIMALSVGLLANAQTVDLLKPAKDIALRAPSVPIIVSDPYFSIWSPYDKLMEGST EHWTSAKKPLLGALRVDGKVYRFLGKDKINLVPIAPMTNVERWEAAYTNSQPSNGWQEFQ FDDSNWKKGKAAFGSRDMERIHTEWKGDNTDIYIRRTFDFNEPNIAEDIYLIYSHDDVFE 
LYLNGEKLVSTGLVWKNNVYLKLSEEAKKKLRKGKNVIAAHCHNTTGGSYVDFGLFREKE NAVKFANEAVQKSVDVLATSTYYTFTCGPVELDVVFTAPQLIDDLDLLSTPINYVSYHVR SLDKKTHDVQFYMETTPELAINESNQPTVARTLSKNGISYVEAGSIDQPICDRRGDLICA DWGYAYLASTNGSGKSVSLGDYYGMKESFVKNGTLATTKTKWTTRKEEDNPAMAYVHNLG SVSNSGKEGFMMLGYDDIYSIEYMYEKRMGYWKHDGKVTIFDAFEKLRDNYQAIMERCRA FDELIYDDAEKAGGKKYAEICSASYRQVISAHKLFTDKEGNLLWFSKENNSNGCVNTVDL TYPSAPLFLVYNPELQKAMMTSIFEYSASGRWNKPFPAHDLGTYPIANGQVYGGDMPIEE GGNMVILAAAISKIEGNADYVKKYWDLLTTWTNYLVE Prediction of potential genes in microbial genomes Time: Thu May 12 04:12:47 2011 Seq name: gi|226332113|gb|ACIC01000207.1| Bacteroides sp. 1_1_6 cont1.207, whole genome shotgun sequence Length of sequence - 30953 bp Number of predicted genes - 36, with homology - 36 Number of transcription units - 14, operones - 6 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 150 - 1160 387 ## BT_p548208 mobilization protein C + Term 1189 - 1219 0.3 + Prom 1234 - 1293 8.6 2 2 Op 1 . + CDS 1424 - 1711 322 ## BT_p548209 hypothetical protein 3 2 Op 2 . + CDS 1711 - 2013 232 ## BT_p548210 hypothetical protein 4 2 Op 3 . + CDS 2025 - 4469 1341 ## BT_p548211 TraG-like protein 5 2 Op 4 . + CDS 4447 - 5016 437 ## BT_p548212 TraI-like protein 6 2 Op 5 . + CDS 5021 - 5794 485 ## BT_p548213 hypothetical protein 7 2 Op 6 . + CDS 5806 - 6450 499 ## BT_p548214 hypothetical protein 8 2 Op 7 . + CDS 6455 - 7345 603 ## BT_p548215 TraM-like protein 9 2 Op 8 . + CDS 7342 - 8439 724 ## BT_p548216 TraN-like protein 10 2 Op 9 . + CDS 8423 - 8989 219 ## COG0739 Membrane proteins related to metalloendopeptidases 11 2 Op 10 . + CDS 8976 - 9185 274 ## BT_p548217 hypothetical protein 12 2 Op 11 . + CDS 9196 - 9903 427 ## BT_p548218 hypothetical protein 13 2 Op 12 . + CDS 9908 - 10414 374 ## BT_p548219 hypothetical protein 14 2 Op 13 . + CDS 10433 - 10888 454 ## BT_p548220 hypothetical protein + Term 10900 - 10944 6.1 15 3 Tu 1 . 
- CDS 11017 - 12078 299 ## gi|253573010|ref|ZP_04850405.1| conserved hypothetical protein - Prom 12098 - 12157 6.3 + Prom 12119 - 12178 11.9 16 4 Op 1 . + CDS 12377 - 13609 428 ## BDI_0727 hypothetical protein 17 4 Op 2 . + CDS 13648 - 14526 486 ## BT_p548228 hypothetical protein 18 4 Op 3 . + CDS 14570 - 15679 921 ## BT_p548229 hypothetical protein - Term 15675 - 15738 7.1 19 5 Op 1 . - CDS 15856 - 16113 206 ## BT_p548230 hypothetical protein 20 5 Op 2 . - CDS 16118 - 16876 704 ## COG1192 ATPases involved in chromosome partitioning - Prom 17108 - 17167 11.8 + Prom 17085 - 17144 8.5 21 6 Op 1 . + CDS 17164 - 17493 421 ## BT_p548232 hypothetical protein 22 6 Op 2 . + CDS 17497 - 18327 359 ## COG3943 Virulence protein 23 6 Op 3 . + CDS 18341 - 18979 437 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 24 6 Op 4 . + CDS 19052 - 20617 776 ## COG3177 Uncharacterized conserved protein 25 6 Op 5 . + CDS 20640 - 21038 272 ## BT_p548236 hypothetical protein + Term 21188 - 21228 3.2 - Term 21009 - 21038 -0.2 26 7 Tu 1 . - CDS 21219 - 21602 112 ## BT_p548237 hypothetical protein 27 8 Tu 1 . + CDS 21670 - 21897 141 ## gi|253573023|ref|ZP_04850418.1| conserved hypothetical protein + Prom 21932 - 21991 6.2 28 9 Tu 1 . + CDS 22058 - 23074 792 ## COG4227 Antirestriction protein + Term 23087 - 23129 9.2 + Prom 23158 - 23217 3.9 29 10 Op 1 . + CDS 23247 - 23774 371 ## BT_p548201 hypothetical protein 30 10 Op 2 . + CDS 23771 - 24250 340 ## BT_p548202 hypothetical protein - Term 24193 - 24229 2.5 31 11 Tu 1 . - CDS 24323 - 24715 290 ## BT_p548203 hypothetical protein - Prom 24871 - 24930 5.1 + Prom 24830 - 24889 6.2 32 12 Tu 1 . + CDS 25031 - 25717 432 ## BT_p548204 hypothetical protein + Term 25787 - 25845 15.1 - Term 25727 - 25765 0.3 33 13 Tu 1 . - CDS 25993 - 27033 750 ## BT_p548205 putative replication protein B - Prom 27235 - 27294 8.9 + Prom 27702 - 27761 4.9 34 14 Op 1 . + CDS 27792 - 28412 493 ## BT_p548206 mobilization protein A 35 14 Op 2 . 
+ CDS 28417 - 30672 1563 ## BT_p548207 mobilization protein B 36 14 Op 3 . + CDS 30674 - 30953 193 ## BT_p548208 mobilization protein C Predicted protein(s) >gi|226332113|gb|ACIC01000207.1| GENE 1 150 - 1160 387 336 aa, chain + ## HITS:1 COG:no KEGG:BT_p548208 NR:ns ## KEGG: BT_p548208 # Name: not_defined # Def: mobilization protein C # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 336 144 479 479 680 100.0 0 MDDMLTAYQGKDGKRDEWFNGALGILRGVSIRFYNDYPQFCTIPHIVNFICSAGTVRITS FLEGKHQSRVLAGAFLDAKDSPKTQSSYLSSLTNSLSTLANEKKVCYVLSGNDFDFNLID PECPKLVVVSNAYQIENLISPVISLMLSISSRRFTLANKVPFFYFLDEATTFRIADFEKL PSVLREYLCSFVFLTQSAAKIEKIYGKYDRSSIESNFGNQFFGRTKDIEALKSYPLVFGK EERQRVSKTTGSSRGGENRSRTVSTQKEEIYDTNFFTSLKSGEFVGSAAHSNMRNFHLRF EMYEDKEDPLPIVHPVLASDIEENYQQIIRDIQGIE >gi|226332113|gb|ACIC01000207.1| GENE 2 1424 - 1711 322 95 aa, chain + ## HITS:1 COG:no KEGG:BT_p548209 NR:ns ## KEGG: BT_p548209 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 95 1 95 95 133 100.0 2e-30 MKTPLFILLQATGGIRNEVNTFLSDYAVPVIAMLLIVGVGIGVVMNYDKIIDRDGQGTRK EGIVNLLWVVGYIIIGLAIIAAVIALINSKLKMSL >gi|226332113|gb|ACIC01000207.1| GENE 3 1711 - 2013 232 100 aa, chain + ## HITS:1 COG:no KEGG:BT_p548210 NR:ns ## KEGG: BT_p548210 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 100 1 100 100 159 100.0 3e-38 MDFQVRKGLETPLKVHGMRTKFFCIYCIILAVIVLFVVGFLTSAMSGEGSFLAFIVSLVA GAFASVVLRIVFINLSMQRKFGKFKRQVFVISNKDLLNSL >gi|226332113|gb|ACIC01000207.1| GENE 4 2025 - 4469 1341 814 aa, chain + ## HITS:1 COG:no KEGG:BT_p548211 NR:ns ## KEGG: BT_p548211 # Name: not_defined # Def: TraG-like protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 814 1 814 814 1639 100.0 0 MKISAAQAYCILDTVGSSILTKSGEISVSYLLELPEAYSLDRSDLEERHNELFRAFQYVR TGFIHKQDVFLRRKFKPEYVIEGEGYIQTAERKYFDGREYLHHFCILSFTLSGLQSLEKA YQANPLAYSEKLAKADRDRLAEFLESVESAVSIIRNIRGTQIRPLNVAEVKQHLFRYVNG FHDDEGLRDIQCSDMIRIGQKKGIFFAVCDERYLPDRMKVYVTDSTLQEANSQLYMSMLE 
RIGVHVPCSHVVNQIWKFAGGSYREELAERVKLFGRYREFDKAIKSRYDGLASYEQEIIN EENVLCKTHLNVLLLEDDETILNRQTEQVKMIFTNAGFKYYIPSYEGLYNIFIGSIIGRE NNLNPDYLFLSDLHSSLCMGINYSTFRSDKEGILFNERLYQTPIRKDVWDADKKRIPARN MIIVASTGGGKSVTSLNIIQQNIEQNYKQIVVEFGKSFYQLSQLYPDRSLHIDYDGSSPL GINPFYTAGKQPDNEKIKTLVNLVLKFWRSKAIMEDTKQVVSLTKIICRYYEDIPDGHCF PDFYRYVQREGKKLYERLNILPEYFDIDSFLHVCSEFMPGGFYENVCKPSPLENDMQNRD FIVFELTKIKKDPFLISVIMTILFDTIENKILSDRSVRGMLIFDEYAESQSIKDTFSGAD IHSTVAFCYQKLRKENGAVGTIVQSLAQLPDNEYTKGIIANTQLLYVLPANEIVYDQTVE AFHIKNRSHVNLMKSIRNDFSGVRPYSEIFIRFMDSYATVVRLELSPEKLLAFQTDGEKW NKLQEIYKEAGSMETAIEKYKQLKQKSYETEMSM >gi|226332113|gb|ACIC01000207.1| GENE 5 4447 - 5016 437 189 aa, chain + ## HITS:1 COG:no KEGG:BT_p548212 NR:ns ## KEGG: BT_p548212 # Name: not_defined # Def: TraI-like protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 189 1 189 189 351 100.0 6e-96 MKRKCLCSLLVMMCFYCTAFGQWVVSDPALTQLSQITWAKELKQAYEQFQVLGESRDILT KSLDLYRQVNGVIKNSKMVLQVLSMQGEMLELAAKECTRSDVFTGQEAYGEYTNVLNKIM EESVLSFDLIRTIISPSVSMTDGERIKIIVDLDNKLKENRDKMLDERARFNTVNDAIKRI AALKSTAKK >gi|226332113|gb|ACIC01000207.1| GENE 6 5021 - 5794 485 257 aa, chain + ## HITS:1 COG:no KEGG:BT_p548213 NR:ns ## KEGG: BT_p548213 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 257 1 257 257 446 100.0 1e-124 MEVIDNVISAYEAVKMNGVSQGIIALCSFIAVIIVIQKFVTAWKEAFSEGDKPVDMKQFF KLFYIYIYVFAIIMVAPFAFTIVETALGNLQNELISHYQEDVDLSIDEAIVTFTKDYIED VQRRHNWVGQQIQEVIMLPFNIAAYTILLYATKYIFFFFASARYLYLILLEIVTPIAVIL YMDEKTRHYTHAYLKNLFVCYMTIPAFLIANALGSIIAENIMHMCGQNKYTMLGLLFAFV FKLFLFAKSVKFCRELF >gi|226332113|gb|ACIC01000207.1| GENE 7 5806 - 6450 499 214 aa, chain + ## HITS:1 COG:no KEGG:BT_p548214 NR:ns ## KEGG: BT_p548214 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 214 1 214 214 370 100.0 1e-101 MNKDDILKDFDTLAEAKRKGAVNLKLCLGFALVVVVLVLAWGLVVNFTALDKVVVVERSG EYLKTHAEDSEALFLALVKKTCAEATGYANSFDRLNLKKNQAHAAFYVNKNDLNAVFSKY 
YNDKAYFDAVQNGAVYKCEIDSIQTIAGDNEPYKVAFTSTLTIYGVSGQKLRFLIRTKGE IVRTTPQFPENVTGFFFNQFIQQIVRIEQEPEKK >gi|226332113|gb|ACIC01000207.1| GENE 8 6455 - 7345 603 296 aa, chain + ## HITS:1 COG:no KEGG:BT_p548215 NR:ns ## KEGG: BT_p548215 # Name: not_defined # Def: TraM-like protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 296 1 296 296 528 100.0 1e-149 MDKATKTKIILFASLGIFVLVVLAVTLALTGKGDNREEDGSVNTSLRAQQQSDFTLDDMM NTDGRPEYENFTDEVYRERSGAMYREDPEVIALQEQLRQNQRADSLKAAAAMKRRTAKAR PKKETPKKEEPERTVGRFFSGEEPENKGNTIEAIVSGDQKITNGSVIKLVTLQEMQLPGG MVINKGTAVFGVVKLAQDRINITVESVRIGNSIYEMQKTVYDRDGLPGIYVPLNIKAEAT KEAADEVVSDMNVTSYGTDVLSTGVNAVSNAAKSVFRKKNNQVVVTVKSNYKLYLK >gi|226332113|gb|ACIC01000207.1| GENE 9 7342 - 8439 724 365 aa, chain + ## HITS:1 COG:no KEGG:BT_p548216 NR:ns ## KEGG: BT_p548216 # Name: not_defined # Def: TraN-like protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 365 1 365 365 695 100.0 0 MKAICSFILALAVSSAGAQIRPVESLPIAVNYSKTIHLVFPSAVKYNQAVTDFVAVDNPE SVPNILRIKANRKSFSKQTTVSVATEGGFFYSFNVTYADSLEHTNYFLPDMSSIRPDTIY LNEVSQTHLIAPEKVIYIDYGDTCIQVSKAENTENIVRMIAATGKVEEFPRQTNVSLATE GGKFYTFNVDYRQQPEAFVYEIGEKRPEKKANVILTDNIIPAGERDQVMSRVYNAKRGIF NKGIVRNKIVFSVNNLHIYDNLLLFTFEIENKSKLPYDIDYIRYYIIDKKTAKLTASQEV DQQALFSENYSPRIEGNGRMKYVIAFDKFTIPDEKVFRIEINEKNGGRHVLFDLENSDIV NVEDI >gi|226332113|gb|ACIC01000207.1| GENE 10 8423 - 8989 219 188 aa, chain + ## HITS:1 COG:jhp1456 KEGG:ns NR:ns ## COG: jhp1456 COG0739 # Protein_GI_number: 15612521 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Helicobacter pylori J99 # 39 183 123 267 312 114 42.0 1e-25 MWRIFSLAILALIPVGRVGAQDKRSLIVEVLQTAGDYGDIAPVWGIVQKDGDIYRFVPSV APLKEKYRISDRYGYRTHPISGERQFHAGLDMAAVYAATVHAAASGTVTFAGETPGYGKT VVVTHRFGFQTRYAHLTLIYTRKGAKVEKGDVIGFVGSTGISTGNHLHYEVIKNQKRINP LNFIYGTK >gi|226332113|gb|ACIC01000207.1| GENE 11 8976 - 9185 274 69 aa, chain + ## HITS:1 COG:no KEGG:BT_p548217 NR:ns ## KEGG: BT_p548217 # Name: not_defined # Def: 
hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 69 1 69 69 115 100.0 7e-25 MELSNELKVERIRLSLTAKSVAEEMGISRQQLCNIEQSETAPVVVKYIAFLRSKGVDLNA LFDRIIVNK >gi|226332113|gb|ACIC01000207.1| GENE 12 9196 - 9903 427 235 aa, chain + ## HITS:1 COG:no KEGG:BT_p548218 NR:ns ## KEGG: BT_p548218 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 235 1 235 235 453 100.0 1e-126 MRKLVTALCALVFLSSCDDELKLVSRDFSVTLKSIDGNTAVVGKPINCTLTISDLDPDNG DQILTRFEVRDGDGVILVDNNEYSPGETFEYDFKANNRLDFDFIPATEGEAYIVMGVASE LVTRSDSIKLKVSSPEINIRFQNVPDLMLVSEEAEFYLQLDTELYGVKASARFVKGSGRV YISGYDATRGEGVALEKNNLVTFRPDATGQAVIEFTVSSRYGLPVKENVTIQVNQ >gi|226332113|gb|ACIC01000207.1| GENE 13 9908 - 10414 374 168 aa, chain + ## HITS:1 COG:no KEGG:BT_p548219 NR:ns ## KEGG: BT_p548219 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 168 1 168 168 308 99.0 5e-83 MRKIAVLSLFLCLVAVCRAQYIGSQYMNVKLGYIVDESQVQGTFGFGKVYKAFKVGANLN YRNLNRDLVKANTVTVGPELAYYALKGRKFSLLGIAAGTIGYQKAKAKSDLVYLAKSKAF VYGYEVGIRPEMLLSPKVALFAEYRFEMLFNSILRNNNHVGLGCVIYL >gi|226332113|gb|ACIC01000207.1| GENE 14 10433 - 10888 454 151 aa, chain + ## HITS:1 COG:no KEGG:BT_p548220 NR:ns ## KEGG: BT_p548220 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 151 1 151 151 277 95.0 8e-74 MRKYLYLSAVCVCMALCFVGCSKDDDEPGGKGAMYEVTIEQSGDFRSFIKSVVVVANGTR LKDGATGESLASPLILSDEELAVEKVTLTTTGKAIEFAVSGSVVDGEDGVVNEPMQWVVT VYKNGKEIEKKSLFFRDGKEIGMDDLNLYYN >gi|226332113|gb|ACIC01000207.1| GENE 15 11017 - 12078 299 353 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253573010|ref|ZP_04850405.1| ## NR: gi|253573010|ref|ZP_04850405.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 353 1 353 353 543 100.0 1e-153 MKFSTKAATFLSSIKTQTYDKKESEMIITYQQKRVFHLSLLMLALCAPIYIYSVPFPNEQ FYYINSVLFLFIIMCTLAYFKKRVNLTTTFSIILIAIHIEIFIEIIYCSICSGYEYSYQR ALIMSNITISLLFTMLSICAYMSNISILLSSLTIASYTICTLITDEPFLYSYLPLIIIIY TMIPLLGRSLHSNISSLLKSSNLLKEEEEMLLKRLQMKKEELFAFAELLSENNPEEKTSS LLNIIGEQSKENLFTALAAYQKKEKSKLDTIRRIYPNLSPSELNICRLILQDKTVSQICE LLHRSSGNITSQRANIRAKLGLKKSDNLKEALQERMRLYEEEHHRQEEFSAMR >gi|226332113|gb|ACIC01000207.1| GENE 16 12377 - 13609 428 410 aa, chain + ## HITS:1 COG:no KEGG:BDI_0727 NR:ns ## KEGG: BDI_0727 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 24 410 25 422 422 478 62.0 1e-133 MTKRCFFYIKCVLFVLVLSISGEISAQEKRDSVRIYFHQGKVNIDTCLLDNGNEMERFAK ICSALNDSVRLIRKIQIIGGASPEGGELLNGRLSEKRAEVLWRYISPYIKIPVLERDFHF SGSDWNGLITMVRADVNVPEREDVLRLLEKIVRLENQDSPYWGGELKRLKGGRPYSYLYK FHFPKLRSSMVKICYDSDPMNPVRDTVYIHTRDTLCIRDTVTVIAPVKKRPFCMAVKTNL LYDAVLIPDIGVEFCLGKNWSVAGNWMYAWWKSDRKHNYWRIYGGDVELRRWFGRRAVEK PFSGHHVGLYGQIVTYDFELGGKGYLGDKWSYGGGVAYGYSLPVGHRFNVDFTLGIGYLG GSYKEYIPLDGHYVWQTTKNRRWFGPTKAGISLVWLIGRGNYSRKKGGRQ >gi|226332113|gb|ACIC01000207.1| GENE 17 13648 - 14526 486 292 aa, chain + ## HITS:1 COG:no KEGG:BT_p548228 NR:ns ## KEGG: BT_p548228 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 276 31 306 322 578 100.0 1e-164 MGLTACEHKDLCYDHPHFATVRVVFDWTKISNHDKPEGMRVVFYPTDDESNTWIFDFPGG EDGEVELPENDYRVICFNYDTDGMVWKENGSYTLFTADTRDVRSPDNQTMAVTPPWLCGD HIDRVILKDIPEGSTKIIRLTPVNMVCHYTYEVNGIRGLDRVADLRAALSGMSGSLNMSG DSLPADLSESLLFDGMVSRNQIIGGFYTFGHSALEGEPNVFRLYLKNRSGSMSVLEQDVS DQVHDVPVAGHIGDVHLVLNFDYEVPSEPGSGGPGFDVDVDDWDDVNVDIVL >gi|226332113|gb|ACIC01000207.1| GENE 18 14570 - 15679 921 369 aa, chain + ## HITS:1 COG:no KEGG:BT_p548229 NR:ns ## KEGG: BT_p548229 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 369 1 371 373 683 96.0 0 MKKSTVMLWAIFGVLLMSCSEEEIANVETSSRNAIGFNVLSNAAETRATPTTNTNLKNTD 
FDVFAFTADGTAFMGKNDTEFEHDGVKIVYKNGKWDYDNASDLRYWPTEALDFYAFNPGT VSEDMMVFYSWEATKDVQKISYTCMDEYGAGTHANYDVMYAMAKGQTKDMNNGIVKFNFK HILSQVVFKAKTQYDNMQVDIDVIKIHNFKFAGAFTLPAAADGTGSWSSSDLAFPHAFTV VKNANITVNSNTEATDITTNTPMLNIPQELTAWTVSGASKTKKGADDAKQCYLEIACKIR QSGAYLLGSASEYKTIYVPFGDTWEQGKRHIYTLIFGGGYTDQGEAVLNPIQFDAETTGW VNAAKDVNV >gi|226332113|gb|ACIC01000207.1| GENE 19 15856 - 16113 206 85 aa, chain - ## HITS:1 COG:no KEGG:BT_p548230 NR:ns ## KEGG: BT_p548230 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 85 1 85 85 138 100.0 5e-32 MAKKNDLKNSMSAGLTGGLDSLIQSTAGQKEAQKPKKAKTVHCNFVMDETYHQNLKLIAI RKGDSLKSVLQEAISDYLDKNSSLL >gi|226332113|gb|ACIC01000207.1| GENE 20 16118 - 16876 704 252 aa, chain - ## HITS:1 COG:Rv1708 KEGG:ns NR:ns ## COG: Rv1708 COG1192 # Protein_GI_number: 15608846 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Mycobacterium tuberculosis H37Rv # 4 248 64 313 318 200 44.0 2e-51 MSKAKVISVLNHKGGVGKTTTTINLGGALRQKGYKVLLIDLDGQANLTESLGFSAELPQT IYGAMKGEYDLPIYEHKDGLRVVPSCLDLSAVETELINEAGRELILAHLIKGQKEKFDYI LIDCPPSLSLLTLNALTASDRLIIPVQAQFLAMRGMAKLMQVVHKVQQRLNSDLSIAGVL ITQYDGRKNLNKSVSELVQETFQGKVFSTHIRNAITLAEAPTQGQDIFHYAPKSAGAEDY EKVCNELLTEIK >gi|226332113|gb|ACIC01000207.1| GENE 21 17164 - 17493 421 109 aa, chain + ## HITS:1 COG:no KEGG:BT_p548232 NR:ns ## KEGG: BT_p548232 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 109 1 109 109 218 100.0 5e-56 MSTYSYKNPKFINSPKGVVEVVEVIYDGKDDPAYSLAIIKWENTYKLGIRWNIAYSEWDD YRKQNGQDECIGNPQSRGIPTWFVLPDDMMFGEKFSGAMQRLDELRKGK >gi|226332113|gb|ACIC01000207.1| GENE 22 17497 - 18327 359 276 aa, chain + ## HITS:1 COG:STM3755 KEGG:ns NR:ns ## COG: STM3755 COG3943 # Protein_GI_number: 16767039 # Func_class: R General function prediction only # Function: Virulence protein # Organism: Salmonella typhimurium LT2 # 4 121 12 132 345 93 37.0 4e-19 
MEQGEIILYQPDEAVKLEVRLEDETVWLTQEQIADLFGTKRPAITKHLNNIYKSGELDID STCSILEHMGNDGKQRYTTKYYNLDAILSIGYRVNSKNATLFRKWANSVLKDYLLKGYSI NKRLSELERTVAQHTEKIDFFVRTALPPVEGIFYNGQIFDAYKFATDLVKSARRSIVLID NYVDETVLLMLSKRSVGVSATIYTQRITQQLQLDLDRHNSQYPPIDIRTYRDSHDRFLIV DETDVYHIGASLKDLGKKMFAFSKLDIPAVVITDLL >gi|226332113|gb|ACIC01000207.1| GENE 23 18341 - 18979 437 212 aa, chain + ## HITS:1 COG:XFa0019 KEGG:ns NR:ns ## COG: XFa0019 COG1961 # Protein_GI_number: 10956730 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Xylella fastidiosa 9a5c # 14 202 9 178 188 58 29.0 8e-09 MNVVIYSRVSSQSARQSTERQVVDLERFAAGRGYEVTAVFEEKISGRKANIERPVLSRCL EYCTDPQNRVDMLLLTEISRLGRSTLEILKALDTLHTHKICVYIQNLNLETLRPDKTVNP LSSLITTLLGELAAIERQGIIDRLNSGRELYIQKGGRLGRKPGSRKTAEQRKEEYREAIA LLKKGYSIRNVAKLTGKAVSTIQQVKKDFINS >gi|226332113|gb|ACIC01000207.1| GENE 24 19052 - 20617 776 521 aa, chain + ## HITS:1 COG:MA2133 KEGG:ns NR:ns ## COG: MA2133 COG3177 # Protein_GI_number: 20090976 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 306 520 56 255 326 61 29.0 5e-09 MSKNEYLHENNQNLNKNGSMATMAEKMAASLEELRKLQEKDRCVVLQGTAEIGRTHLTRL LDNGWLQEVMKGWYIAARPGTEGDTTIWYTSFWYFIAKYAAVRLGERWCLTADQSLDLYS GKTTVPVQVVIKSPKGHNNTQKLMYDTSLLVFQSEIPDQVYKEPEYGLNLYPLAEALVYA TPRYFQVEKIAARTCLAMIRDAADILKVLTKNGASLRAGRIAGAFRNIGNGEIADSIVST MRGFGYDVREEDPFEDQPRTPLVYEVSPYVTRLRLMWENMRDKVVELFPEAPGKIDDVEG YLRSVDEKYSEDAYHSLSIEGYRVSPELIEKVRVGNWKPEEEDKEHKNALVARGYYQAFQ AVRGTISDILKGKNAGEAVRADHPVWYMQMWMPFVTVGILQREDLVGYRTGQVYIRGSQH IPLNPKAVRDAMPVLFDLLKNEPHPAVRAVLGHFFFVYIHPYMDGNGRMGRFVLNAMLAS GGYNWTVVPVERRKEYMKALEKASVEGDISEFAKVITSLVK >gi|226332113|gb|ACIC01000207.1| GENE 25 20640 - 21038 272 132 aa, chain + ## HITS:1 COG:no KEGG:BT_p548236 NR:ns ## KEGG: BT_p548236 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 132 1 132 132 259 100.0 1e-68 
MKSTEYIEWDKLEQIPFCLCRIAEDEENQEIDVYYLDKRVCHDYDHVGHYFRTAIIMFRR IRNITADWVNLKNLWLLRDCIRENFNHGLEVDDLIFGETFDGEDPETIKPLTKERLFKIK KVIQEKDPYATV >gi|226332113|gb|ACIC01000207.1| GENE 26 21219 - 21602 112 127 aa, chain - ## HITS:1 COG:no KEGG:BT_p548237 NR:ns ## KEGG: BT_p548237 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 127 1 127 127 230 98.0 1e-59 MRLNLGYGKSRLTFADKNPGQGKRKAATPEKLKYLSPLKGGLKPGSTIGYRSTRIVPISE KSGGFYPAGFTERLYPEIAYLNRVGGCIHALGIAVESPQPLGEDLEAESPTVYGNAQILQ KKSVLIY >gi|226332113|gb|ACIC01000207.1| GENE 27 21670 - 21897 141 75 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253573023|ref|ZP_04850418.1| ## NR: gi|253573023|ref|ZP_04850418.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 75 1 75 75 132 100.0 5e-30 MEKNVWGVLIQKPLGTQGKRKTTTPTAPKNQNEVVLIQKHRNLRHPQGFLGGERVRGGEF TIGKNQVVNTKFSIY >gi|226332113|gb|ACIC01000207.1| GENE 28 22058 - 23074 792 338 aa, chain + ## HITS:1 COG:XF2061_1 KEGG:ns NR:ns ## COG: XF2061_1 COG4227 # Protein_GI_number: 15838653 # Func_class: L Replication, recombination and repair # Function: Antirestriction protein # Organism: Xylella fastidiosa 9a5c # 11 312 221 500 522 138 34.0 1e-32 MKTDYFDKAAQQFANLMIEKIQQVEDNWQKPWITIAANTRNFFPQNLTGRRYAGGNAFLL LFLCEKFQYQTPVFMTFNQAKEAGISVLKGSKSFPVYYFLFYVYHKETRKKITFEEYKAL SREQQQEYNVIPTYKYYSVFNLDQTNFSDVRPEEWEALREKFRGGQAEQPDGTTGEAHPE IDAMLEAKSWFCPIREQQGDRAFYSPLADYIVVPLRSQFVDMQSFYETLLHEMGHSTGHP TRLNRDLAHPFGSEEYGKEELTAEFAAALAGMFFGIAEHIRTENAAYLKSWIKAIREEPK FILNVLADAMKIVKMIAGKLNISVESEETAADEAEQVA >gi|226332113|gb|ACIC01000207.1| GENE 29 23247 - 23774 371 175 aa, chain + ## HITS:1 COG:no KEGG:BT_p548201 NR:ns ## KEGG: BT_p548201 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 175 1 175 175 351 98.0 7e-96 MNKPKIIQIIDVVSNAIAGNRIDEDFIKSCIYGKVDAELYAHLLGKYRGYDGDFFQFYLG TDDRINRALLENLGIKVEPDKYPDYDSRIVAQVVQGKKRFDIYPFELEAFNRYAMFGNNN ALSCLKGISPTAGQTVRENGINEYGNALNWSLFWIKANPEDKALLVDHVLNIPER 
>gi|226332113|gb|ACIC01000207.1| GENE 30 23771 - 24250 340 159 aa, chain + ## HITS:1 COG:no KEGG:BT_p548202 NR:ns ## KEGG: BT_p548202 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 159 1 159 159 298 100.0 5e-80 MIAEYFIYRRKGDKEPFISLGEMPQYGLRPKQKFTGKKLKIEVIRRLSGVEIEQTATTPQ INAYIEANIYDTERWPEYRKLYRQVAGEVETVADIFTLQYILVAELEDQTRTGKDCQPQP TDPKDERLIHLIRCELMGEPLEMYKTMINPIIALKKRFV >gi|226332113|gb|ACIC01000207.1| GENE 31 24323 - 24715 290 130 aa, chain - ## HITS:1 COG:no KEGG:BT_p548203 NR:ns ## KEGG: BT_p548203 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 130 1 130 130 236 100.0 2e-61 MFPNYKDFQIAVYYTLGKALPKHIEEVQTEIIENFDIKYNSPMLAHPLLRTPIYEKIILR ILDTMEDLKEIRFSDDRTHVVLTGRGKHLLDEYENEMNQRLPFIISRKKFKRHTQEELAK AYRELNEYPD >gi|226332113|gb|ACIC01000207.1| GENE 32 25031 - 25717 432 228 aa, chain + ## HITS:1 COG:no KEGG:BT_p548204 NR:ns ## KEGG: BT_p548204 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 228 1 228 228 452 100.0 1e-126 MITTIQFKTLDDLYQFYNDAIFNGELSECIVNMSRHGGAFGFFAANRWRGDGQEKKVVHE ISINPDFMNREDRDWHSTLVHEMCHLWQEDFGRPSRGGYHNSQWADKMIQVGLMPSDTGE AGGKRTGQSITHYIIPGGKFEQVFNTLSREDLQNLRLRYKPTLAAVPTRPIRIAGSDGDE TEEPEDPDEGESKSGKRKKYTCGCGCNVWGKSGLVLRCGLCDTDFTEQ >gi|226332113|gb|ACIC01000207.1| GENE 33 25993 - 27033 750 346 aa, chain - ## HITS:1 COG:no KEGG:BT_p548205 NR:ns ## KEGG: BT_p548205 # Name: not_defined # Def: putative replication protein B # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 346 1 346 346 718 100.0 0 MATNDITLYIPPTEEEVMLIQPNNVTFGQYSVTEWQENLLTLISDQIQQHMTREKQLPKD LFNQPVVEIKCDEAGGKNNKSKVLKEAIAMMRKNFSFRWVHPNIHQTVETFGVVITTVHN IKGTNRVALTLNPWAIPFLLYYGTGVGGTRYGKNIALTLRGNYTKRLYKIICSQRDRREY YYSIDQFCKDLEIPASYSNAQIDQKVLRPAQTRIKESEADVWFDYKFICKTPKKGQKPKA DTIILFIHARNPQELRGDQQQQYSFVYRWVYGALSYPSDNTPVRALDKILESGRLKDVYD RCAYYDDKVCSGEMSTAHAQNSLAKMLREDFGIKARGKKKDPDLFG >gi|226332113|gb|ACIC01000207.1| 
GENE 34 27792 - 28412 493 206 aa, chain + ## HITS:1 COG:no KEGG:BT_p548206 NR:ns ## KEGG: BT_p548206 # Name: not_defined # Def: mobilization protein A # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 206 1 206 206 358 100.0 9e-98 MQTPIREIMATTSKDTSIRIKESTRFRLDMLKGNKSHDAFVAEMLLYFETTGITPQSNVM PPNIAAKEQASRVIEVVRGIEKSTNVRLKNIEQLLLSLVGEVKTPGDNPDEYMHISQVQE LLERSKQLEQEARENREKAGKLQTDLEIARQEKGTPAVGCNTHKILEIVERIDEVKKIPT FNDTVYEIDRNTLDMWVKRLKDELKR >gi|226332113|gb|ACIC01000207.1| GENE 35 28417 - 30672 1563 751 aa, chain + ## HITS:1 COG:no KEGG:BT_p548207 NR:ns ## KEGG: BT_p548207 # Name: not_defined # Def: mobilization protein B # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 751 1 749 749 1382 99.0 0 MFAKVHPAEDVKSGNTGSCHDLVRYLEKETGEGQRFFSHTEQDISPERVIMDIDGNKKAL GANDAKFFMLSLNPSQSEQMHLIGRKVDDFKELTPQEKKEVFQKLEAFTRSAMDEYALNF GRDNIRGGQDLMYYARVETERSYHPEDEEVKQGIARIGEPKPGLNLHVHVIVSRKSLDGK VKLSPGAKSAGNTWELEGRGTVKRGFSHEGWKVRVQECFNRKFDYQAKEGETYVRPQVSA EIGKITNPELKRILQDEQFTAANQIVAAMREQGYTHQVRKGVHSFSREGEVFQVEHRLLK AFEQPLSDEQLKSITERFDLTKYEANPAGYRENGLQVKDISFSTYVKDEAQKGEKALKEV AYKVVYDEQNHTTVSFATVRQFAYEHQINLVKSEPTAEAVLKKIKNPELKNLLENYRFTS ANQIVVAMKEQGYTHKVRKGVHSFSREGERVSIRHKDLKKFADPKLESRHMEGIIERFNL YKYKQEGVAYRENGLEAKNISFLTYQKVPIEPEEDKSVTGKPETKEEAQQPQQETPGSEA HPEETENEAETPMPEPDPVKYRKELKEVSYDVLFDRETKTYVPISAIRKYAYENEINLID RYKHGYAVRNEDLRECLANPEYRTVRQINKAMRERGYTIERDEAGNYTYIKGESSFFMER RDLLAFTGYAKDTGGRERERGTHRSADKTVGFIGGKAKQKLINEILGDSFRTERMLVGNV KKAVSLIQNPANIKMMLIKQIGSFLNPFKEL >gi|226332113|gb|ACIC01000207.1| GENE 36 30674 - 30953 193 93 aa, chain + ## HITS:1 COG:no KEGG:BT_p548208 NR:ns ## KEGG: BT_p548208 # Name: not_defined # Def: mobilization protein C # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 93 1 93 479 192 100.0 2e-48 MMMYIIIGIACLVGIIVVVLLLPNSGKQQQKGQKYRFELSAGGGRKITFADPFDNFLVYG GANSGKTKSIGKPLLSQYIQAGFAGFVYNYKDF Prediction of potential genes in microbial genomes Time: Thu May 12 04:14:44 2011 Seq name: gi|226332112|gb|ACIC01000208.1| Bacteroides sp. 
1_1_6 cont1.208, whole genome shotgun sequence Length of sequence - 22657 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 7, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 49 - 1962 1387 ## COG3507 Beta-xylosidase - Prom 1984 - 2043 4.1 + Prom 2042 - 2101 5.0 2 2 Tu 1 . + CDS 2129 - 3241 432 ## PROTEIN SUPPORTED gi|90020424|ref|YP_526251.1| ribosomal protein L11 methyltransferase - Term 3175 - 3219 8.2 3 3 Tu 1 . - CDS 3238 - 6186 2247 ## BT_3514 hypothetical protein - Prom 6275 - 6334 6.6 + Prom 6306 - 6365 5.9 4 4 Op 1 . + CDS 6481 - 8295 1737 ## COG3250 Beta-galactosidase/beta-glucuronidase 5 4 Op 2 . + CDS 8309 - 10822 2187 ## BT_3512 glutaminase A + Term 10852 - 10905 2.3 - Term 10840 - 10893 9.4 6 5 Op 1 . - CDS 10920 - 11423 435 ## BT_3511 hypothetical protein 7 5 Op 2 . - CDS 11468 - 12010 207 ## BT_3510 hypothetical protein 8 5 Op 3 . - CDS 12000 - 12593 500 ## BT_3509 hypothetical protein - Prom 12616 - 12675 3.7 + Prom 12884 - 12943 6.0 9 6 Tu 1 . + CDS 12964 - 15519 1542 ## BT_3508 hypothetical protein + Term 15589 - 15633 9.1 - Term 15572 - 15624 14.1 10 7 Op 1 . - CDS 15655 - 16716 600 ## BT_3507 hypothetical protein 11 7 Op 2 . - CDS 16739 - 18856 1407 ## BT_3506 hypothetical protein 12 7 Op 3 . - CDS 18872 - 21994 2397 ## BT_3505 hypothetical protein 13 7 Op 4 . 
- CDS 22011 - 22655 437 ## BT_3504 hypothetical protein Predicted protein(s) >gi|226332112|gb|ACIC01000208.1| GENE 1 49 - 1962 1387 637 aa, chain - ## HITS:1 COG:BS_abnA KEGG:ns NR:ns ## COG: BS_abnA COG3507 # Protein_GI_number: 16079933 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Bacillus subtilis # 346 578 42 284 313 113 36.0 1e-24 MKLRHLFLFCVICCSVSISAQNKSKLPKTSGNPIFPGWYADPEGIVFGDEYWIYPTYSAA YDDQIFMDAFSSKDLVNWTKHPKVLSKENISWLRRALWAPAVIHANDKYYFFFGANDIQN NNELGGIGVAVADNPAGPFKDALGKPLIDKIVNGAQPIDQFVFKDDDGQYYMYYGGWGHC NMVKMAPDLLSIVPFEDGTIYKEVTPQNYVEGPFMLKHNGKYYFMWSEGGWGGPDYSVAY AIADSPFGPFERVGKILQQDTNIATGAGHHSVVKGPGSDEWYIIYHRRPLGETAANSRAT CIDRMYFNKDGKIEPVRMTFEGVGASPLATPVRAKTYANPVINFSLPDPTIIKGDDEYYY LYATESIKNVPIHRSRDLVNWYYVGTAFTDATRPNFEPKGRIWAPDINKIGDKYVMYYSM STWGGEWTCGIGIATADKPEGPFTDLGKLFRSNEINIQNCIDPFYIEDNGKKYLFWGSFH GIYGTELTDDGLSLKKGAPIEQIAGTAYEGTYIHKRDGYYYLFASIGRCCEGLKSTYTTV VGRSKNLFGPYVDKKGRSMSDNHHEILIQKNKAFVGPGHNSELVTDKKGNDWMFYHAVSV ANPEGRVLMMDRVQWKGGWPFVKGAVPSLETAAPDFR >gi|226332112|gb|ACIC01000208.1| GENE 2 2129 - 3241 432 370 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020424|ref|YP_526251.1| ribosomal protein L11 methyltransferase [Saccharophagus degradans 2-40] # 38 354 5 314 314 171 34 5e-42 MRSRLFLLSIILLIGVSCQHKKATTEKETVAAGNTYTNPLLERGAEPWAIFHEGKYYYTQ GSENRVILWETDDITDLSHATQKDVWIPTDPSNSYHLWAPEIHRINNKWYIYFAADDGNM DNHQIYVVENEAANPMEGTFVMKGRIQTDKDNNWAIHASTFEHDGQRYMIWCGWQKRRID SETQCIYIATMKNPWTLSSDRILISKPEYEWECQWVNPDGSKTAYPIHVNEAPQYFESKN KDKACIFYSASGSWTPYYCVGLLTADAKANLLNPASWKKSPVPVLQQDPENNVYGPGGIS FTPSPDGKEWYMLYHARQIPNDAPGASDSRSPRLQKIDWDKDGMPVLGTPQKEEEPMARP SGSPIHKKQN >gi|226332112|gb|ACIC01000208.1| GENE 3 3238 - 6186 2247 982 aa, chain - ## HITS:1 COG:no KEGG:BT_3514 NR:ns ## KEGG: BT_3514 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 982 1 982 982 2026 99.0 0 MRILFLLVVLSVTSALQSQVVIKENAIQSKYAFPLVTSKAKAVVVYDTDDYLVVRKTAEL 
FVSDVESVTGQQLRLTDKLKGDKEIIIVGTVEKNRLIRQLAEKGKLDISSLEGTWERFLI QTVSRPFPGVSKALVVAGSDRRGAAYGLFSLSEMMGVSPWYWWADVPVKKHKTLYVDAPA TLSKTPSVKYRGIFLNDEDWGLKPWAAKTFEKERGNIGPRTYAKICELLLRLKANHLAPA MHPVSTAFYKIPENKLVADTFAIVMGSSHCEPLLLNTASEWDSKTMGPWDYNKNKDKINE VLSNRVKENCAYENVYTLALRGLHDAAMGGGDVPMREKVKMLESALNAQREIIARHIDKP VETIPQAFTPYKEVLEIYSNGLELPDDVTIIWADDNFGYMKRLSGPQEQKRSGRAGVYYH ISYLGVPHSYLWYSTTPPALMYEELRKAYDTTADRVWLANCGDLKGAETQISLFLDMAYD IDSFNADNVATYPARWLAKMFGEEYYDTLEDITCSHINLAFSRKPEYMGWGYWNNYWGGG EKRTDTEFSFANYNEAERRLTEYSRIGKKTEDLLASLDEKSKPAFYQLLYYPVKGAELMN HMTIKGQFYRQYVRQQRAAANRLKAQVKCYHDSLEIITDGYNSLLDGKWKHMMSLKQNYD GTSSYFMIPLMEEEYTPVGFPKLGLQAESENLDKGGMSFHTLPAYSTYSRKSHWIDIYNQ GTGELSWSITPSEDWILISQKSGKTSAENRIYISVDWDKVPAGEKVKGQIDITSGTQKET VLVSVFNPESPARTEVQGLYVEENGYISIPAADFHRKFESNDIRMSILPGLGFEGRSLQL GNPTAPLQMYRAGDVPRVEYDFYTFNAGIYDVYTYVLPTFPLHAERDYKLPEHTNSDTKY SVRIDDGSISTPSTSAIEYSQIWYDSVLKNCRVNKSTLYVKKPGKHTLQICCGDPGVVIQ KIVIDLGGMKRSYLGPQSTICN >gi|226332112|gb|ACIC01000208.1| GENE 4 6481 - 8295 1737 604 aa, chain + ## HITS:1 COG:BH2723 KEGG:ns NR:ns ## COG: BH2723 COG3250 # Protein_GI_number: 15615286 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Bacillus halodurans # 107 472 124 484 1014 89 22.0 2e-17 MKKNFLAVLFALALCSPTFAQWKPAGDKIKTPWGEQLNPKNVLPEYPRPIMERQDWKNLN GSWNYAITKKGEPAPNNYQGEILVPFAVESSLSGVGKTINEHQELWYQRTFDVPSAWKGK QILLHFGAVDWKADVWVNDVKVGQHTGGFTPFYFDITSALNKGNNQLVVKVWDPSDRGEQ PRGKQVERPEGIWYTPVTGIWQTVWLEPVAAQHIAQLKTTPDIDKKTVKVEVATNVCSPD KVEVKVFDGKNLVAKGAALNGVPVELTMPEDVKLWTPESPSLYNMEVTLYKDGKAVDQVK SYTALRKFSTHKDKNGITRLQLNNKDYFQFGPLDQGWWPDGLYTAPTDEALVYDLKKIKD FGYNMVRKHVKVEPARWYTYCDQLGLIVWQDMPNGGRGPAEWQMHKYYDGADAIRSAASE ANYRKEWKEIIDYLYSYPSIGVWVPFNEAWGQFKTPEITKWTKEYDPSRLVNPASGGNHY TCGDILDLHNYPGPNLYLYDPIRATVLGEYGGIGMALKGHLWLADKNWGYVKFNTPEEVT NKYIKYADHLLELIEKGFSAAVYTQITDVEEEVNGLVTYDRKVIKVDEPKIKAINQKICN SLNK >gi|226332112|gb|ACIC01000208.1| GENE 5 8309 - 10822 2187 837 aa, chain + ## HITS:1 COG:no KEGG:BT_3512 NR:ns ## 
KEGG: BT_3512 # Name: not_defined # Def: glutaminase A # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 837 1 837 837 1708 100.0 0 MNKLFSTIMSVCLACNVQATDLFKTSQDIALRAPSVPLITSDTYLAIWSPYNELNEGNTE HWTAATHPLLGALRVDGKVYRFMGKDKLNLETILPMTNTERWEAKFTMSQPAANWIQPQF DDSGWTKGKAAFGTKDMKRIGTEWNTEDIWVRRSFNLNQDLTNDIIYLRYSHDDVFELYL NGEKLVATDYSWNDDVTIELSASAKAKLRKGTNIIAAHCHNTTGGAYVDFGLFRENKQLS NFKEAAIQKSVDVLPTQTYYTFTCGPVELDLVFTAPLLMEDLDLISTPINYISYRVRSLD KKQHDVQVYIETTPQLAVHEPSQPTISEKISKNGMDYLKAGTIDQPYVKRKGDGVRIDWG YAYLGSNSAPNKDLSIGNYYDMKQAFITNGKLLPNSQDFITRSESDMPAMAYTENLGKVD NQGKSGYVMLGYDDIYSIEYFYERRMAYWKHNGQVSIFDAFERAQASYPSLMKRCRAFDQ QLMADAEKAGGRKYAELLALTYRHSITAHKLLTDKEGNLLFLSKENHSNGCINTVDLTYP SAPLYLIYNTELMKGMLNSIFYYSESGRWNKPYPAHDLGTYPIANGQLYGEDMPIEEAGN MILVTTAISMMEGNANYASKYWETLSTWANYLIENGLDPENQLCTDDFAGHLAHNANLSA KAIMAIAGYGEMARMLGKETTANKYIETAKRLAIEWEKMAFDDDHYKLAFDKPGTWSQKY NLIWDKVFNMNIFPQKVFDTEIPFYLTKQNKYGLLLDSRAQYTKSDWVLWSACMSPDDAT FQKFIDPIYTYANETTSRVPISDWHDTNTGKMMNFKARSVVGGYYMKLLMEKVKEQK >gi|226332112|gb|ACIC01000208.1| GENE 6 10920 - 11423 435 167 aa, chain - ## HITS:1 COG:no KEGG:BT_3511 NR:ns ## KEGG: BT_3511 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 167 1 167 167 284 100.0 7e-76 MKLRIVHLLLLQSVMLLLSCSNQSRKNMDTSAEPADVQANVSDSSRIEQEAIGMIEDFYE AYAASFMSTGKEALALGDSIKQKFLTKELIEKVDRLIEATDADPIIRAQDLGENDMKTLS VKHLNDNWYEVNYTSAKGSQYERAVSIPVRVVNVDGQYLIDDITPEN >gi|226332112|gb|ACIC01000208.1| GENE 7 11468 - 12010 207 180 aa, chain - ## HITS:1 COG:no KEGG:BT_3510 NR:ns ## KEGG: BT_3510 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 72 180 1 109 109 231 100.0 8e-60 MVNENIGIAGGLLHSYFVDGAILNDSCTLFVPDGVSPSEYRQSDAYQQASKNQQEHGGSL SLVADSLQYAYMKAQVEYIILPFKKRLDGFTYWQRELEPAIHPALKAYVKAENATGGVSS VMIFWEIARNTDHFCSDFKTDRDVRMFLTLYFWKYLCYFANIDFYTGQDKTEEILKGEAD >gi|226332112|gb|ACIC01000208.1| GENE 8 12000 - 12593 500 197 aa, chain - ## HITS:1 COG:no KEGG:BT_3509 NR:ns ## KEGG: BT_3509 # 
Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 197 1 197 197 352 100.0 3e-96 MRTLIILLLCTNTSFAIAQISPKAVEKNNQSVKTAGFFNDSDSLNKAIHLSDEAIALEPS YKLAYANKIKYLMALGQKEKALQTMLQMEKFSPDDPYYILGKGMMLEENAKKSLAMDAYK QAASLFKKRLKEKPTEADLMNYVFVLFLRDNKNYSLDEIEKEYPQIFSPAIRQHTKKLID ELSNKREDVIHEMLGGK >gi|226332112|gb|ACIC01000208.1| GENE 9 12964 - 15519 1542 851 aa, chain + ## HITS:1 COG:no KEGG:BT_3508 NR:ns ## KEGG: BT_3508 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 851 1 851 851 1724 100.0 0 MKNKPLLFLLSVCGLLAISARGENPPAKVIFEQYMNQAQTFADNYPREKAYLHFDNTSYY VGDTIWFKAYVTLAEKQVFSSISRPLYVELVDQAGHVTDKQIIKLSQGEGNGQFVLPQSM LSGYYEVRAYTRWMLAFSDPQYFSRTFPVYQLSHSDQLERSISTYELSPSMEKRPEETRE KLSLRFFPEGGQLVEGITSQVAFKAESKNEGNIQLSGTLYTKEGQEITSFETLHDGMGAF EYTPSALPAIAKVNFQGKQYEFTLPKALPSGYILKVGNNAGAISVTVSCNAATPQDTLAV FISHQGRPHAYQLIHCQANKPQQFTVLSRKLPAGVLQISLLNRAGNTLCDRFIFASPRAP LQISPKGLKEIYAPYAPIRCELQLNNAIGEPIPGKLSVSIRDAVRSDYMEYDNNIFTDLL LTSDLKGYIHQPGYYFTESSLRKQKELDILLMVHGWRKYDMTQQIGISPFTPLQLPESQL VLYGQVKSTILKNKLKDIALSVMVKRDAEIITGQTVTDENGHFSIPLEDFEGSMEAVIQT RKVGKERNKDASILIDRHFSPATRAYGYRELHPEWSNIAHWQQEAEKFDSLYMDSIRRVD GLYLLDEVEIKSKRRSQSTNMATKINEQSIDAYYDIRQAVDQLRDNGKVVTTIPEVMEKL NPLFYWDRSNDNCTYRQKPICYIMDNKILSSTEVNMMLTEIDGLASIIISKGTGGVDDEI IQNTKMSNSNDVDVSELDKYSIFYLIPLPRHDVLNKHETAALGTRQTVMQGYTPALEYYS PAYIDKELYMDKADKRRTLYWNPTVQTDENGKAVIECYNNQYSTPLIIQAETLSNDGKLG SVTYSTVAETK >gi|226332112|gb|ACIC01000208.1| GENE 10 15655 - 16716 600 353 aa, chain - ## HITS:1 COG:no KEGG:BT_3507 NR:ns ## KEGG: BT_3507 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 353 1 353 353 681 100.0 0 MKKIYTLAALAAMTMLGTSCNSEWEDEQYEHYISFSSQLDSKGVTNIYVPYSRHDAEGNY AEGGEGRSNYQLPILVSGSTDNPSNVTVHVAHDADTLNILNYARYATRTELYYEDMGAEG LAYASYPESLQIKAGENKGLLDLKFDFRNIDMSEKWVLPLQIVDDASYNYVAHPRKDYAK AILRIFPFNDYSGDYSGTGITNKVVTGYDGDGKPIETAESITKSSIRGYVIDEQTIFTYA 
GIVDEDYTDRRKYKIKFAFNGETNGSVTISCDNAEEIGFELNKDVTPSFRISSSMDDAKP YLEHRYVIINNVDYYFNYIPVEGTIIRYHVKGTLTLSRDINTQIPDEDQAIEW >gi|226332112|gb|ACIC01000208.1| GENE 11 16739 - 18856 1407 705 aa, chain - ## HITS:1 COG:no KEGG:BT_3506 NR:ns ## KEGG: BT_3506 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 705 1 705 705 1421 100.0 0 MIKKIYILLLTVLGMGVISSCSDYLNSEKYFKDRLTLEKTFESKDHVEEWLAYAFSFIKN ENYEVTTKGPSENSFCFSDDMYYGDRDKTIDATKNELSYNMFKLGEYDENTYNVGAWGAC YKGIFQASVFIHNVDRCQEMADWEILDYKGQARFVRAYYYWLLLRRYGPVPIMPDEGVDY TQSYDQIATPRSSYEEVAQYISDEMVQATKELQYDRRTDNYAIGRPTRGAALAVRAYALI FAASPFANGNNDEYAQQLVDDEGRRLLSSEYSEEKWAKAAAACRDVIDLDVYELNIVNKS TSDNGPSERPTVTPPADGEFSNQPWPKGWTNIDPLRSYRTIFDGTILPANNKELIFTRGA TNIDMLVLHQMPKDDGGWNCHGMTQKMLDAYYMNNGSNEPGMNSMYQGVANYQGIVDTRE RRTGFTDLQDLKDNKYPELGWKYDPKKGDNDQAKTGMNVSLQYVEREPRFYASVAYNGCT WYYLSQTESKPADVNQQVWYYFGSSDGYRNDGFYLRTGIGIKKFVHPNDYPGNYVAKAET AIRYADILLLYAEALNELTGTYNIPSWNEATTYTISRDKEQMERGIHPVRIRAGLPDYPD EFYLQSGVDDMRKALKRERMIELMGEGKRYFDIRRWKDAPVEESLQIYGCNVFVGEAKRD EFHSAIPVYNLPSTFSEKLWLWPIKHSELKRNSRLTQNPGWTMYD >gi|226332112|gb|ACIC01000208.1| GENE 12 18872 - 21994 2397 1040 aa, chain - ## HITS:1 COG:no KEGG:BT_3505 NR:ns ## KEGG: BT_3505 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1040 1 1040 1040 2047 100.0 0 MKKYLAIFLLMLVVPLTISAQQTITVTGTVTDTQDEPMIGVNITVKDVAGLGTITDINGN FSIKMEPYHRLVFSYIGYDDVEVLVKEQHTVNVKMKESEASVLDEVVVTGMGAQKKLTVT GAVTNVNVGDLKRFPTSNMSNALAGNVPGIIARQTSGQPGKSTSEFWIRGISTFGASSSA YILVDGFERSSLDELNIEDIESFTVLKDASATAIYGSKGANGVVLITTKRGKAGKINIDA KVETSYNTRTITPEFVDGLSYASLMNEALVTRNLGMAYQPEELELFRTQMDPDFYPNVDW MDLILKNGAWSYRANLNMNGGGNTARYFVSASYTEDQGMYNTDQTLRDDYDTNANYKKWN YRMNVDIDITKSTLLKLGIAGSLAKRNSPGLADNEMLWGMLFGYNPIATPVYYSNGYAPI SHRDNVNKLNPWVASTQTGYNEDWQNNVQTNVTLEQNFDFITKGLKFVGRFGYDTDNSNW INRHRQPDLYKANGRRQETGEIIYEKMFSAYDMTQSSGSSGKRREFLDLLLSWERAFGNH HGGVTFRYTQDSEKRTVDIGTDIKNGVSKRNQGLAGRFTYNWNYRYFVDFNFGYTGSENF 
APGNQFGFFPAFSLAWNVAEESFVKNNLKWMNMFKIRYSHGKVGNDNIGDNNRFPYLYTI ATTGYNSEGKPNYVYNWGFGDYGKSFIGTHYTQMASNGITWEVATKDDLGIDLSLFNDKF TATVDYFHEKRTGIFLTREFLPDITGLESKPKANVGEVKSQGFDGNFALKQKLGEVDMTI RGNITYSKNEVLEKDEENQVYSYLYQKGYRVDQVKGLIAEGLFADYDDIRTSPKQEFGTV QPGDIKYKDVNGDGVVNDNDKVAIGATTTPNLVYGIGASFAWKGIDVNVHFQGAGKSTFP IYGKCVYAFSESDWGNIFKDMISDRWVDSETAAKLGLHANENPNATYPRLTYGENKNNQQ TSTYWMRDGRYIRLKNLDIGYTLPKSIVNKLHFNNIRIYIAGSNLITWSKFKTWDPETGN PRGEAYPLTKSVTMGLSVNL >gi|226332112|gb|ACIC01000208.1| GENE 13 22011 - 22655 437 214 aa, chain - ## HITS:1 COG:no KEGG:BT_3504 NR:ns ## KEGG: BT_3504 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 214 305 518 518 440 100.0 1e-122 IKNGGSWDPIVKNNPNTFKQLFTIADPSWEFQIFIHPTGKYAYFGVINNHYFMRSDYDEI KKEFITPYNFVGGYKQSGYRDDVGTEARMNNPCQGVFVKNPDYTGEEEYDFYFVDRLNFC VRKVTPEGIVSTYAGRGASTSLADGNQWGTDDGDLREVARFRDVSGLVYDDVKEMFYVHD QVGHTIRTISMEQEENVAGDENIPEDESTVESNE Prediction of potential genes in microbial genomes Time: Thu May 12 04:15:50 2011 Seq name: gi|226332111|gb|ACIC01000209.1| Bacteroides sp. 1_1_6 cont1.209, whole genome shotgun sequence Length of sequence - 1278 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 907 510 ## BT_3504 hypothetical protein 2 1 Op 2 . 
- CDS 953 - 1222 97 ## BT_3503 glutaminase A Predicted protein(s) >gi|226332111|gb|ACIC01000209.1| GENE 1 1 - 907 510 302 aa, chain - ## HITS:1 COG:no KEGG:BT_3504 NR:ns ## KEGG: BT_3504 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 302 1 298 518 617 100.0 1e-175 MKCKMKKVLKEWRWILLALFTICICSCKDDDNVETGAFDPSKPVAISDFTPKEGGAYQKL LIYGENFGTDVSKVKVKIGGKDAIVINVKSTYVYCFVPSGAFSGEIEITVGEGENAVTTT ASTTFSYEKKMVVGTLCGYRNNRDDQGWRDGPFDGPEGVKCCGFSDNGRLAFDPLNKDHL YICYDGHKAIQLIDLKNRMLSSPLNINTIPTNRIRSIAFNKKIEGYADEAEYMIVAIDYD GKGDESPSVYIIKRNADGTFDDRSDIQLIAAYKQCNGATIHPINGELYFNSYEKGQVFRL DL >gi|226332111|gb|ACIC01000209.1| GENE 2 953 - 1222 97 89 aa, chain - ## HITS:1 COG:no KEGG:BT_3503 NR:ns ## KEGG: BT_3503 # Name: not_defined # Def: glutaminase A # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 89 83 171 171 171 98.0 9e-42 MTKQNPYGLPLDSRKEYTKSDWIMWTAAMSSDLETFKKFIDPLYKYINETTSRVPISDWH HTDSGEWVGFKARSVIGGYWMQVLMDKTR Prediction of potential genes in microbial genomes Time: Thu May 12 04:16:02 2011 Seq name: gi|226332110|gb|ACIC01000210.1| Bacteroides sp. 1_1_6 cont1.210, whole genome shotgun sequence Length of sequence - 20940 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 7, operones - 4 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 1.9 1 1 Op 1 1/0.000 + CDS 88 - 717 768 ## COG3404 Methenyl tetrahydrofolate cyclohydrolase 2 1 Op 2 . + CDS 714 - 2213 1411 ## COG2986 Histidine ammonia-lyase + Term 2254 - 2306 4.7 + Prom 2363 - 2422 9.1 3 2 Op 1 . + CDS 2545 - 3195 555 ## BT_2689 hypothetical protein 4 2 Op 2 13/0.000 + CDS 3245 - 4591 1338 ## COG1538 Outer membrane protein 5 2 Op 3 27/0.000 + CDS 4623 - 5639 1317 ## COG0845 Membrane-fusion protein 6 2 Op 4 . + CDS 5670 - 8828 3342 ## COG0841 Cation/multidrug efflux pump 7 2 Op 5 . + CDS 8841 - 9134 359 ## BT_2685 hypothetical protein + Term 9160 - 9208 13.4 + Prom 9137 - 9196 5.2 8 3 Op 1 . 
+ CDS 9356 - 10750 971 ## BT_2683 putative periplasmic protein 9 3 Op 2 . + CDS 10719 - 11645 698 ## BT_2682 putative periplasmic protein 10 3 Op 3 . + CDS 11632 - 13119 1125 ## COG1696 Predicted membrane protein involved in D-alanine export + Prom 13267 - 13326 3.5 11 4 Tu 1 . + CDS 13389 - 16394 2768 ## COG3250 Beta-galactosidase/beta-glucuronidase + Prom 16514 - 16573 3.7 12 5 Tu 1 . + CDS 16609 - 17616 811 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) + Term 17767 - 17801 0.2 + Prom 17953 - 18012 4.6 13 6 Op 1 . + CDS 18093 - 18323 134 ## gi|253573065|ref|ZP_04850458.1| conserved hypothetical protein + Prom 18328 - 18387 2.5 14 6 Op 2 . + CDS 18422 - 18907 571 ## gi|253573066|ref|ZP_04850459.1| conserved hypothetical protein 15 6 Op 3 . + CDS 18930 - 19562 642 ## BT_2676 hypothetical protein + Term 19593 - 19649 13.2 + Prom 19566 - 19625 4.3 16 7 Tu 1 . + CDS 19659 - 20384 416 ## BT_2675 hypothetical protein Predicted protein(s) >gi|226332110|gb|ACIC01000210.1| GENE 1 88 - 717 768 209 aa, chain + ## HITS:1 COG:FN0739 KEGG:ns NR:ns ## COG: FN0739 COG3404 # Protein_GI_number: 19704074 # Func_class: E Amino acid transport and metabolism # Function: Methenyl tetrahydrofolate cyclohydrolase # Organism: Fusobacterium nucleatum # 2 207 3 212 212 134 39.0 1e-31 MLADLTVKDFLDKVAGNDPVPGGGSIAALGGALASALATMVTGLTIGKKGYEASEEVMQH AQTLTTRFQKEFIALIDKDSEAYDEVFACFKLPKATDEEKAVRSAAIQEATRHAALIPME VARKALEVMPVIADIARLGNRNAITDACVAMMAARSAVLGALLNVRINLGVLKDKEFVQG LQAEADRMEQTACRKEKELLDAVNQDLRV >gi|226332110|gb|ACIC01000210.1| GENE 2 714 - 2213 1411 499 aa, chain + ## HITS:1 COG:FN1406 KEGG:ns NR:ns ## COG: FN1406 COG2986 # Protein_GI_number: 19704738 # Func_class: E Amino acid transport and metabolism # Function: Histidine ammonia-lyase # Organism: Fusobacterium nucleatum # 3 497 2 499 511 405 44.0 1e-113 MSKNVYQIGSGELTFEIIERIINENLKLELAPEAKLRIQKCRDYLDHKIASSEEPLYGIT TGFGSLCTKNISSGELGTLQENLIKSHACSVGEEIRPVIIKLMMLLKAHALSLGHSGVQL 
ITVQRILDFFNNDVLPIVYDRGSLGASGDLAPLANLFLPLIGVGDVNYKGKKCEAISVLD EFGWEPVRLMSKEGLALLNGTQFMSANGVFAILKAFRLSKKADLIAALSLEAFDGRIDPF MDCIQQIRPHPGQIETGEAFRKLLAGSELIERPKAHVQDPYSFRCIPQVHGATKDAIRYV SSVLLTEINSVTDNPTIFPDEDRIISGGNFHGQPLAISYDFLAIALAELGNISERRVSQL IMGLRELPEFLVANPGLNSGFMIPQYAAASMVSQNKMYCYAASSDSIVSSNGQEDHVSMG ANAATKLYKVMDNLEHILAIELMNAAQGIDFRRPLKTSPLLESFLHAYRKEVPFVKDDIV MYKEIHKTVAFLKRTKLEY >gi|226332110|gb|ACIC01000210.1| GENE 3 2545 - 3195 555 216 aa, chain + ## HITS:1 COG:no KEGG:BT_2689 NR:ns ## KEGG: BT_2689 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 216 1 215 215 375 100.0 1e-103 MMEHGRDASQKAELRERIIMTATEAFTLKGIKCITMDDIAAALGISKRTLYEVFADKESL LKECILQKQAERDKYLQEIYEQSNNVLEVILAVFQKSIEIFHQTNKRFFEDIKKYPKVYA MMKDRSESDSEKTMSFFKSGVEQGIFRADVNFEIVNLLVREQFDVLLNTDICNEYPFIEV YESIMFTYIRGISTEKGAKVLEEFISEYRKNRVEQQ >gi|226332110|gb|ACIC01000210.1| GENE 4 3245 - 4591 1338 448 aa, chain + ## HITS:1 COG:PA4144 KEGG:ns NR:ns ## COG: PA4144 COG1538 # Protein_GI_number: 15599339 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Pseudomonas aeruginosa # 39 447 56 464 471 72 19.0 1e-12 MNRLKRLTGKKMLLTAMALCAFGFAKAQTEQTTQNTLTLTLDKALEIALDENPTIKVAAE EIALKKVASKEAWQSLLPEASLNGSLDHTIKAAEMKLNDMSFKMGQDGTNTANAGLSINL PLFAPAVYRAMSMTKTDIELAVEKSRASELDLINQVTKAYYQLMLAQDSYEVLQGSYKLA EDNFNVVNAKYQQGAVSEFDKISAEVQMRSIKPNVISAANAVTLAKLQLKVLMGITADVD IKTDDNLTNYESMLFANQLKEEDMSLENNTTMKQFELNMKLLEKNVKSLKTNFMPTLSMS FSYQYQSLYNPNINFFDYTWSNSSSLMFNLSIPLYRASNFTKVKSARIQMRQLDWNRIDT ERKLNMQVVSYRNNMTASSEQVVSNKENVMQAEKAVQIAGKRYEVGKGTVLELNSSQVSL TQAQLTYNQSIYDYLVAKADLDQVLGKQ >gi|226332110|gb|ACIC01000210.1| GENE 5 4623 - 5639 1317 338 aa, chain + ## HITS:1 COG:VC0165 KEGG:ns NR:ns ## COG: VC0165 COG0845 # Protein_GI_number: 15640195 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 34 336 
39 353 368 111 28.0 2e-24 MKKGIQLVALLLTVVMSSCTGGKDKAAVKQEDEKPRVKVADVTARPVDQIQDYTATVEAE VKNNIAPSSPVRIDRILVEVGDRVSKGQKLVQMDAANLTQTKLQLDNQKIEFNRIDELYK VGGASKSEWDAAKMQLDVKQTAYDNLVENTFLLSPINGVITARNYDNGDMYSGGSPVLVV EQITPVKLMINVSETYFTKVKKGEPVDVKLDVYGDELFKGTISLIYPTIDATTRTFQIEI KLDNKDQRVRPGMFARATLNFGTADNVVVPDLAIVKQAGSGDRYVYVYKDGKVSYNKVEL GRRMGSEYELKSGVPDNSQVVVAGQARLINGTEVEIEK >gi|226332110|gb|ACIC01000210.1| GENE 6 5670 - 8828 3342 1052 aa, chain + ## HITS:1 COG:BH3816 KEGG:ns NR:ns ## COG: BH3816 COG0841 # Protein_GI_number: 15616378 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Bacillus halodurans # 7 1020 5 1005 1093 494 32.0 1e-139 MSLYEGAVKKPIMTSLCFLAVVIFGLFSLSKLPIDLYPDIDTNTIMVMTAYPGASASDIE NNVTRPLENTLNAVSNLKHITSRSSENMSLITLEFEFGNDIDVLTNDVRDKLDMVSSQLP DDVENPIIFKFSTDMIPIVLLSVQANESQAALYKILDDRVVNPLARIPGVGTVSISGAPQ REIQVYCDPNKLEAYNLSIETISSIIGAENKNIPGGNFDIGSETYALRVEGEFDDSRQLA DIVVGTHNGANVFLRDVARIVDTVEERAQETYNNGVQGAMIVVQKQSGANSVEISRKVEE ALPRLQKNLPSDVKLGVIVDTSDNILNTIDSLAETVMYALLFVVIVVFLFLGRWRATLII CITIPLSLIASFIYLAITGNTINIISLSSLSIAIGMVVDDAIVVLENVTTHIERGSDPKQ AAVHGTNEVAISVIASTLTMIAVFFPLTMVSGMSGVLFKQLGWMMCAIMFVSTVAALSLT PMLCSQLLRLQKRPSKMFKLFFTPIEKALDALDTGYAKMLNWAVRHRPVVIVGCIAFFVV SLLCAKGIGTEFFPAQDNARIAVQLELPIGTRKEIAQELSEKLTNQWLTKYKDIMKVCNY TVGQADSDNTWASMQDNGSHIISFNISLVDPGDRDISLEAVCDEMRQDLKGYPEFSKAQV ILGGSNTGMSAQASADFEIYGYDMTMTDSVAARLKRELLTVKGVTEVNISRSDYQPEYQV DFDREKLAMHGLNLATAGNYLRNRINGAVASKYREDGDEYDIKVRYAPEYRTSLESIENI LIYNAKGESVRVKDVGKVVERFAPPTIERKDRERIVTVSAVISGAPLGDVVAAGNKLIDK MDIPGEITIQISGSYEDQQDSFRDLGTLGILIVILVFIVMAAQFESLTYPFIIMFSLPFA FSGVLMALFFTGSTLSVMSLLGGIMLIGIVVKNGIVLIDYITLCRERGMAVINSVVTSGK SRLRPVLMTTATTVLGMIPMAIGGGQGSEMWSPMAIAVIGGLTVSTVLTLVLIPTLYCVF AGTGIKNRRRKLHRKRELDVYFQEHKDEIIKK >gi|226332110|gb|ACIC01000210.1| GENE 7 8841 - 9134 359 97 aa, chain + ## HITS:1 COG:no KEGG:BT_2685 NR:ns ## KEGG: BT_2685 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 97 1 97 97 206 100.0 
2e-52 MKSVLITFDQAYYERIMALLDRLNCRGFTYLEKVQGRGSKTGDPHFGSHAWPSMCSAILT VVDDNKVDPLLDTLHKMDLQTEQLGLRAFVWNIERTI >gi|226332110|gb|ACIC01000210.1| GENE 8 9356 - 10750 971 464 aa, chain + ## HITS:1 COG:no KEGG:BT_2683 NR:ns ## KEGG: BT_2683 # Name: not_defined # Def: putative periplasmic protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 464 1 464 464 959 99.0 0 MEIIKNYLKYSLWLVLIVFAALFGLHWLPAITIDGHTMRRVDLLSDLRYPKQEVAVADSD TIPLPPVVKPVFVDTCRAGMTCIEDYGDSTRRGMASFYEALDRTSSSHPDHDGLVRIAVF GDSFIEADIFTADLREMLQKRFGGCGVGFVTITSMTSGYRPTVRHTFGGWSSHAVTDSVY FDKKKQGISGHYFIPREGAYVELRGQSKYASLLDTCQQASIFFYNKDSVHLTARVNRGES KNYSLGPSGDLQKISVEGRIGSVRWMVDRADSTLFYGLAMDGKKGIILDNFSLRGSSGLS LRGIPQQMLRQFNEQRPYDLIILEYGLNVATERGRIYDNYQKGLLTSIEHLKNCFPQASI LLLSVGDRDYKTEDGELRTMPGVKNLIRYQQNIAAESGIAFWNMFEAMGGEGSMANLVHA KPSMANYDYTHINFRGGKHLAGLLYETLIYGKEQYDRRRAYEQE >gi|226332110|gb|ACIC01000210.1| GENE 9 10719 - 11645 698 308 aa, chain + ## HITS:1 COG:no KEGG:BT_2682 NR:ns ## KEGG: BT_2682 # Name: not_defined # Def: putative periplasmic protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 21 308 1 288 288 596 99.0 1e-169 MTGGVLMSKNNLLLGFILGLMGTFFLLSAGKAQDRIPACPPLGKTLKMIKPLREMNWAND TIAVQTSFPSAFRETGRNEIIDSIALLTPVFERLRQVRAGLSEDTVRILHIGDSHVRGHI YPQTTGTLLKETFGAVSYTDMGVNGATCLTFTHPGRIADIAAMKPELLILSFGTNESHNR RYNVNLHYNQMDELVKLLRDSLPDVPILLTTPPGSYESFRQRRRKRTYAINPRTVTAAET IRRYAKDHRLLVWDMYDVVGGKRRACTNWTEAKLMRPDHVHYLPEGYILQGNLLYQAIIQ AYNDYVSH >gi|226332110|gb|ACIC01000210.1| GENE 10 11632 - 13119 1125 495 aa, chain + ## HITS:1 COG:PA3548 KEGG:ns NR:ns ## COG: PA3548 COG1696 # Protein_GI_number: 15598744 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane protein involved in D-alanine export # Organism: Pseudomonas aeruginosa # 40 418 19 390 520 251 40.0 2e-66 MFPIDIDFSRLKEVLTYDPQAPMIFSSGIFLWLFAAFMIVYTLLQRKYTARILFVTLFSY YFYYKSSGTYFFLLALVTVCDFLLAQWMDRTEGRWKRKGLVTLSLGVNLGLLAYFKYTNF LGGVIASLMGGEFTALDIFLPVGISFFTFQSLSYTIDVYRKDIKPLTNILDYAFYVSFFP 
QLVAGPIVRARDFIPQIRKPLFVSQEMFGRGIFLILSGLFKKAIISDYISVNFVERIFDN PTLYSGLENLMGVYGYALQIYCDFSGYSDMAIGIALLLGFHFNLNFNSPYKSASITEFWR RWHISLSSWLRDYLYISLGGNRKGKFRQYLNLIITMFLGGLWHGASWNFVLWGTFHGIAL AVHKMWMSIIGRKKGEQSHGWRRVFGVIITFHFVCFCWIFFRNADFQNSMDMLRQIFTTF RPQLFPQLIEGYWRVFALMLLGFLLHFTPDSWENAACRGVIRLPFLGKALLMVALIYLVI QMKSSEIQPFIYFQF >gi|226332110|gb|ACIC01000210.1| GENE 11 13389 - 16394 2768 1001 aa, chain + ## HITS:1 COG:SP0648_2 KEGG:ns NR:ns ## COG: SP0648_2 COG3250 # Protein_GI_number: 15900551 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pneumoniae TIGR4 # 27 755 56 825 871 174 25.0 8e-43 MRHRILFFFLAFIGISQFVTSSDVRNKYNFNSDWLLYVGDVPEAKQPRFSDSEWKKVTLP HAFNEDEAFRLSIDELTDTIMWYRKHFRLPADSKNKKVFVEFEGVRQGADFYINGQNVGL HENGVMAVGFDLTPYIKYGQENVIAVRIDNDWNYKERATGTKYQWSNKNFNANYGGIPKN VWLHVTDRLYQTLPLYSNLKTTGVYIYAEDIRVKSRKAIIHAESEIRNEYNRDKQVSYQV ELIDRDGTSIQTFEGTKAVVKSGETITLKAASEVDNLHFWSWGYGYLYTVKTSLWVDGRK VDEVATRTGFRKTRFGKGMIWLNDRVIQMKGFAQRTSNEWPGVGMSVPAWLSDYSNHLMV EGNANLVRWMHVTPWKQDIESCDRVGLIQAMQAGDAEKDCEGRQWEQRTELMRDAIIYNR NNPSILFYECGNESISREHMIEMKAIRDLYDPHGGRAIGSREMLDIREAEYGGEMLYINK SAHHPMWAMEYCRDEGLRKYWDDYSYPYHKDGDGSNSYKSTVTNKVQKKVDTRVYNRNQD SFTIENIIRWFDYWRERPGTGDRVSSGGVKIIFSDTNTHYRGVENYRRSGVTDAMRIPKD PFFAHQVMWDGWVDTECPRIYIVGHWNYNDTVVKPVYVVSSADKVELFLNGKSLGNGERD YHFLYTFKDIAFVPGRLEAVGYDEKGKECCRTQLQTAGKPEQIQLSFIQSPEGWKADGAD MVLFQVEVMDKDGNRCPLANDLIHFEVEGPAEWRGGIAQGENNYILSKDLPVECGINRAL IRSLTTAGNVRITAKADGLKPAEISLTSSPVEVKNGLSNYIPGDELESRLTRGETPQTPS YKVSKVDVGIVSAIAGANNESTVNSFDDNELSEWRNDGKAATAWITYKLERAARVDEVCM KLTGWRMRSYPLEIYAGKELIWSGDTDKSLGYVHLDVKPVLTNEITIRLKGTSKEGDAFG QIVEVAAPVANELDLFKAKDGDKTGHELRIVEIEFKENLLK >gi|226332110|gb|ACIC01000210.1| GENE 12 16609 - 17616 811 335 aa, chain + ## HITS:1 COG:YPO2980 KEGG:ns NR:ns ## COG: YPO2980 COG0667 # Protein_GI_number: 16123161 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: 
Yersinia pestis # 8 332 4 328 329 384 58.0 1e-106 MDYPLNNQPATDRYDRMQYKYCGKSGLQLPLISLGLWHNFGSVDNFSVATDMIKYAFDHG ITHFDLANNYGPVPGSAETNFGRILKENFQGYRDEMIISSKAGHDMWAGPYGGNSSRKNL MASIDQSLRRTGLEYFDIFYSHRYDGVTPVEETIQTLIDIVKQGKALYIGISKYPPLQAR IAYEMMAKAGVPCLISQYRYSMFDRAVEAESLPLAAEYGSGFIAFSPLAQGLLTDKYLNG IPEGSRAARPSTFLQRSQVTPEKVEAARQLNEIARHRGQTLAEMALAWVLRDERMTSVIV GASSVNQLADNLQALNQLEFTAEELNGIERILCKV >gi|226332110|gb|ACIC01000210.1| GENE 13 18093 - 18323 134 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253573065|ref|ZP_04850458.1| ## NR: gi|253573065|ref|ZP_04850458.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 76 13 88 88 158 100.0 1e-37 MIHAGQLIERTLHEQGRTVTWFAGQLCCTRPNIYKIFKKENIDIHLLWRISCILDHDFFR DLSDNIRIGTSSCVSK >gi|226332110|gb|ACIC01000210.1| GENE 14 18422 - 18907 571 161 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253573066|ref|ZP_04850459.1| ## NR: gi|253573066|ref|ZP_04850459.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 161 1 161 161 310 100.0 2e-83 MKKLILIMMLVFGCVIGTYAQKTVFKFRDAQARAGDAVTEVCVKPTVVEVKILEEKGRIK DVWTLSREEVEVAMKGELDNIRAWGTYLSTIKNDCDVIMGATFKIEDDEKSGGYTVTVVG YPGVFTNWHTATTEDYEWIRIQKLSGADDRTKIAPVIKNKN >gi|226332110|gb|ACIC01000210.1| GENE 15 18930 - 19562 642 210 aa, chain + ## HITS:1 COG:no KEGG:BT_2676 NR:ns ## KEGG: BT_2676 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 210 1 210 210 414 100.0 1e-114 MKKVLSIVAALLLCVGTQAQIVSSRSAIVKTQKQASNTQWFLRAGLNIMNLNGDGVEADS NIGYNATFGYQKPMGGAGAYWGMEFGLGSRGFKADEMKCIAHNIQYSPFTFGWKIGVADN LSIDPHLGVFASYDYTSKMKESGESISWGDFADYLEVDYNHFDAGMNIGVGLWYDRFNLD FTYQRGFIDVFSDLDGIKTSNFMIRLGIAF >gi|226332110|gb|ACIC01000210.1| GENE 16 19659 - 20384 416 241 aa, chain + ## HITS:1 COG:no KEGG:BT_2675 NR:ns ## KEGG: BT_2675 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 241 1 241 241 478 99.0 1e-134 MITRLRLWVIVFLLGGVLPSLIAQTFTEQKNTYPLSADGSKYVVNGFIPFSPMNDESIYA 
NALLWTIKNVCSAQRDGITEVSIPAKSFSCDLVLTSQADAKQKNTYYCTAQFQVKDGKLV YYLSNIQIESSVVIMKKVTPMEKLQPEKKPSHQETMDDFVQIESQMLNKLFDFVSTNQLS PITHWNEINIGKPVKGMTEDECLLAFGKPQTISESNGEVQWMYSSSFYLFFKNGCVETII K Prediction of potential genes in microbial genomes Time: Thu May 12 04:16:41 2011 Seq name: gi|226332109|gb|ACIC01000211.1| Bacteroides sp. 1_1_6 cont1.211, whole genome shotgun sequence Length of sequence - 19023 bp Number of predicted genes - 17, with homology - 16 Number of transcription units - 11, operones - 6 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 672 - 736 15.2 1 1 Op 1 . - CDS 754 - 3279 1501 ## COG3291 FOG: PKD repeat - Prom 3307 - 3366 5.5 2 1 Op 2 . - CDS 3372 - 3776 213 ## gi|253573072|ref|ZP_04850464.1| conserved hypothetical protein - Prom 3849 - 3908 4.2 - Term 4093 - 4130 2.4 3 2 Tu 1 . - CDS 4133 - 5113 658 ## BVU_2684 integrase - Prom 5163 - 5222 6.9 + Prom 5097 - 5156 6.5 4 3 Tu 1 . + CDS 5190 - 5438 98 ## gi|298386629|ref|ZP_06996185.1| hypothetical protein HMPREF9007_03376 + Term 5586 - 5646 10.5 - Term 5580 - 5623 3.0 5 4 Tu 1 . - CDS 5644 - 6828 913 ## BF2740 clostripain-related protein - Prom 6853 - 6912 9.9 - Term 6883 - 6924 7.4 6 5 Op 1 . - CDS 6943 - 7344 269 ## BT_1309 hypothetical protein 7 5 Op 2 . - CDS 7421 - 8002 348 ## BT_1310 hypothetical protein - Prom 8034 - 8093 3.1 8 6 Op 1 1/0.000 - CDS 8139 - 8999 901 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) - Prom 9040 - 9099 4.5 - Term 9078 - 9143 11.4 9 6 Op 2 . - CDS 9182 - 10720 1697 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain - Prom 10773 - 10832 7.1 - Term 10736 - 10789 -0.3 10 7 Op 1 3/0.000 - CDS 10940 - 12091 975 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 11 7 Op 2 . - CDS 12124 - 13110 783 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) - Prom 13136 - 13195 4.1 + Prom 13523 - 13582 1.8 12 8 Tu 1 . 
+ CDS 13607 - 14926 1126 ## COG1295 Predicted membrane protein + Term 15063 - 15124 18.0 - Term 15050 - 15111 18.0 13 9 Op 1 . - CDS 15162 - 15698 592 ## COG0778 Nitroreductase 14 9 Op 2 . - CDS 15717 - 16319 753 ## COG0307 Riboflavin synthase alpha chain - Prom 16515 - 16574 9.9 15 10 Tu 1 . + CDS 16887 - 17060 81 ## gi|255690571|ref|ZP_05414246.1| conserved hypothetical protein + Term 17070 - 17111 1.2 - Term 17055 - 17102 4.2 16 11 Op 1 . - CDS 17128 - 17439 147 ## 17 11 Op 2 . - CDS 17442 - 18734 256 ## COG3291 FOG: PKD repeat - Prom 18962 - 19021 3.3 Predicted protein(s) >gi|226332109|gb|ACIC01000211.1| GENE 1 754 - 3279 1501 841 aa, chain - ## HITS:1 COG:MA4289 KEGG:ns NR:ns ## COG: MA4289 COG3291 # Protein_GI_number: 20093078 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 440 787 517 819 1734 147 31.0 8e-35 MGKLKHLGMSLLLATTFFSCGDDYDDTALRNDVNDLKSRVEKLESWCSTTNTQISALQGL VSALEQNDYVTGVTPIVEGSVEVGYTITFTKSKPITIYHGKDGKNGADGINGVDGITPLI GAEKDTDGIYYWTIKLGDADSAWLTDVDNNKIPTTGKDGENGNDGEPGNNGEPGKDGEPG HSPVISVDTFEGKLYWKVDDEWLLDSNSNKVAATGEKGDTGSAGKDGEKGPQGEQGDSVF KKDGVKIEDGKVIFTLANGKEFTLPMLVDGLAVGIGGSDLFYASPSDNSIDVTFASTMKE EDYKSIVATITNGNEADWIIKSRGAVDTWKVAVTEPTFTGTDGTYTPGSAKITITPPANV KLSDTALLTVTVTDANDGTISVSREVKYFDGEIVACTGGDLSIKGLDTSVKRLALKGSIT VEDFKYIRESLTALEVLDISMTDLTELPDRALRFGGDTPNTSLKAVRLPLSMKTIGYAAF TNCRALTSIDTENVEIIGEWAFEQCRGLVEVKLHDGLKEIRRQAFNNCISLTLIDIPGSV TDLGKNTEVSTSQYEGWVFENCTNLSTITLHEGLKKLYVSTFSRCGVVSINIPTTVTDIP DYAFQECQNLERVTWHNGITQIGEAAFLRCRSLKAITIPTGVTVLRNNLFDECANLQYVT LHHNITEIQVRAFSYCTLLKSEFFPYEDAYIDRAALPNALQTLGAGVFTDCKNMESINMQ RTQVKNILRNTYSGCSGITTFYYPNVVETIGEFAFSGCSALVGSFFPASLTSIETNAFHL CTKMSQVYCLGSTPPTLGTAAFDSSLKSISTLFVPEAAISAYSSSSWNPYFKEIKKDLLH P >gi|226332109|gb|ACIC01000211.1| GENE 2 3372 - 3776 213 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253573072|ref|ZP_04850464.1| ## NR: gi|253573072|ref|ZP_04850464.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 134 1 134 134 236 100.0 3e-61 MKQKNKFIKERLKHKQEQLSMQAIRQERLNRTMQNTKDDIDQLIYALHELEEAEKGNHKR HTIPYNEVFSFFTRICRLFDYLQHRCHKYFNYRLTCLVIHAHYRFRSRTRAPGREYLSPG TILTYFKHEWEMVV >gi|226332109|gb|ACIC01000211.1| GENE 3 4133 - 5113 658 326 aa, chain - ## HITS:1 COG:no KEGG:BVU_2684 NR:ns ## KEGG: BVU_2684 # Name: not_defined # Def: integrase # Organism: B.vulgatus # Pathway: not_defined # 2 315 3 305 309 249 44.0 1e-64 MKQHSFTDFVYEVIGELRQENRFATAYIYHYALQAFTASVGGGKIFWGGLNRRSLCKFQY YLEKEQKSYNTTSTYIRALRAIYNRAVDRGVVAGECRLFVNLKTGVSSEHKLALTAGQTS RLLGDVVDGKYSSGRLSSPVLPPEVYRAQDTMRLMLLLQGMPFVDLAHLRKVDLKGDLLI CRRQKTGTELCVKLMPQARQLIERYRSREESSPYLLDILTPTSFDREAFDDYRRCLRKLN FNLSRLPGLMGMEEVKVSSYTARHTWATLAKYCQVPEEVISEGLGHSSLEVTRTYLKSFE GEELNRANIVVNNYIITGEKRMWNEA >gi|226332109|gb|ACIC01000211.1| GENE 4 5190 - 5438 98 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|298386629|ref|ZP_06996185.1| ## NR: gi|298386629|ref|ZP_06996185.1| hypothetical protein HMPREF9007_03376 [Bacteroides sp. 1_1_14] # 23 82 1 60 60 119 98.0 8e-26 MFNRFFCRNKKECTLLTLITNAVLHGFSTPANDFYLCSVLIKHMCKVMKYSLYKADAEEE NEDSPKLDEKPGGGSTNEYTAT >gi|226332109|gb|ACIC01000211.1| GENE 5 5644 - 6828 913 394 aa, chain - ## HITS:1 COG:no KEGG:BF2740 NR:ns ## KEGG: BF2740 # Name: not_defined # Def: clostripain-related protein # Organism: B.fragilis # Pathway: not_defined # 3 394 12 390 393 299 44.0 2e-79 MGLAALFAACENNENEGPEGPEPREQVGRTVLVYIVGDNGVSELSSLFKVNFSDMKAGME EVDYSKCNLVVYSEMVNDVPHLISLKQKNGKVVADTLFTYDEQNPLDKEVMASVISQTVS YFPADSYGFVFLSHSSSWVPASNNANSRSIGYYRRTQMNIPDFREALSSAFPKPLKFILF DSCNMQSVEVAYELRDCAEYFIGSPTEIPGPGAPYKAVVPEMFTETNLATNIAEAYYGYY EKSYTGVAPVSNENWTGGVATSVIKSAALDNLAAATKTIIPKYIQEKREVDRSDILAYDF NDDANYDFEKLIRNLTGGVDNTDYRSWYTAFEAAVVYRKTTLKNYSGIIHRMFTMEGSEG LSTYIPTGASNSTMNTFYRTLSWYTAAGWDGTGW >gi|226332109|gb|ACIC01000211.1| GENE 6 6943 - 7344 269 133 aa, chain - ## HITS:1 COG:no KEGG:BT_1309 NR:ns ## KEGG: BT_1309 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 133 1 133 
133 234 99.0 8e-61 MKYVKILFAMALVLTMCSAFSLKKDHSKTVYAFGIAASFTDTVVYFTDIQILDSAKVSKE GFLTHRDLYSYQLKNYVEDNGLQQNSTCMIYFSENRKKLEKEATKILSKYKRNRNTTVTR IDTDKFHFIKPEE >gi|226332109|gb|ACIC01000211.1| GENE 7 7421 - 8002 348 193 aa, chain - ## HITS:1 COG:no KEGG:BT_1310 NR:ns ## KEGG: BT_1310 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 193 1 218 218 341 79.0 9e-93 MKKSALILFTLVGLGIGIVAKAEQPLREKALAAKTYCVEKGFNTNYCFLIDFSIPSGKKR FFVWDFKGDSIKYSSLCAHGYGKESTPKKPVYSNVEGSYCSSLGKYKVGIRSYSKWGINV HYKLHGLESTNSNAFKRYIVLHSYTPVPTLEIYPMHLPLGISQGCPVICDDVMRKVDALL KAEKKPVLLWIYE >gi|226332109|gb|ACIC01000211.1| GENE 8 8139 - 8999 901 286 aa, chain - ## HITS:1 COG:lin1491 KEGG:ns NR:ns ## COG: lin1491 COG0568 # Protein_GI_number: 16800559 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Listeria innocua # 21 285 108 373 374 219 44.0 4e-57 MRQLKITKSITNRESASLDKYLQEIGREDLITVEEEVELAQRIRKGDRVALEKLTRANLR FVVSVAKQYQNQGLSLPDLINEGNLGLIKAAEKFDETRGFKFISYAVWWIRQSILQALAE QSRIVRLPLNQVGSLNKISKAFSKFEQENERRPSPEELADELEIPVDKISDTLKVSGRHI SVDAPFVEGEDNSLLDVLVNDDSPMADRSLVNESLAREIDRALSTLTEREKEIIQMFFGI GQQEMTLEEIGDKFGLTRERVRQIKEKAIRRLRQSNRSKLLKSYLG >gi|226332109|gb|ACIC01000211.1| GENE 9 9182 - 10720 1697 512 aa, chain - ## HITS:1 COG:YPO3566 KEGG:ns NR:ns ## COG: YPO3566 COG0265 # Protein_GI_number: 16123710 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Yersinia pestis # 78 482 56 436 457 242 37.0 1e-63 MKQTTKNILGVGAIILLSSGVAGLTTYKLLQSNDAAKETSFNEMFQQNPNVKLAAFDAVN AQPVDLTQAAENSLHAVVHIKSTQEAKTRTVQQAPDIFDFFFGDGRGQQRQVQSQPRVGF GSGVIISKDGYIVTNNHVIEGADEISVKLNDNREFRGRVIGTDPSTDLALIKIESEDDLP TIPVGDSETLKVGEWVLAVGNPFNLNSTVTAGIVSAKARTLGVYNGGIESFIQTDAAINQ GNSGGALVNAKGELVGINSVLSSPTGAYAGYGFAIPTSIMTKVVADLKQYGTVQRALLGI KGASLGSSIMEDQSPIDKSGTTLRDKAKEFGVVDGVWVREIVDNGSAAGADIKVDDVIVG 
VDNKKVHNFADLQEALAKHRPGDKVTVKLVRAKKEKSVEVTLKNEQGTTKIVKEAGMEIL GAAFKELPDDLKKQLNLGYGLQVTGVSSGKMSDAGVRKGFIILKANDQPMRKVSDLEEVM KAAVKSPNQVLFLTGVFPSGKRGYFAVDLTQE >gi|226332109|gb|ACIC01000211.1| GENE 10 10940 - 12091 975 383 aa, chain - ## HITS:1 COG:all3532 KEGG:ns NR:ns ## COG: all3532 COG4948 # Protein_GI_number: 17231024 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Nostoc sp. PCC 7120 # 46 380 1 343 350 208 35.0 2e-53 MPNRRDFLKTAAFATLGSRIAVSQALAGESVASTIHINKQGMGGKMKMTFFPYELKLRHV FTVATYSRTTTPDVQVEIEYEGVTGYGEASMPPYLGETVESVMNFLKKVNLEQFSDPFQL EDILSYVDSLSPKDTAAKAAVDIALHDLVGKLLGAPWHKIWGLNKEKTPSTTFTIGIDTP DVVRAKTKECADRFNILKVKLGRDNDKEMIETIRSVTDLPIAIDANQGWKDRQYALDMIH WLKEKGIVMIEQPMPKEQLDDIAWVTQQSPLPVFADESLQRLKDVAALKGAFTGINIKLM KCTGMREAWKMVTLAHALGMRVMVGCMTETSCAISAASQFSPAVDFADLDGNLLISNDRF KGVEVVKGKITLNDLPGIGVMKI >gi|226332109|gb|ACIC01000211.1| GENE 11 12124 - 13110 783 328 aa, chain - ## HITS:1 COG:BH3007 KEGG:ns NR:ns ## COG: BH3007 COG0791 # Protein_GI_number: 15615569 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Bacillus halodurans # 42 299 71 331 336 97 30.0 3e-20 MKKNILFFYCLLVVAVVSLKAQEIRPMPADSAYGVVHISVCNMRDEGKFTSGMSTQALLG MPVKVLQYTGWYEIQTPDDYTGWVHRMVVTPMSKEQYDEWNRAEKIVVTSHYGFTYEKPN DDSQTVSDVVAGNRLKWEGSKGRFYKVSYPDGRQAYISKHISQPETKWRASLKQDVESII QTAYTMIGIPYLWAGTSSKGVDCSGLVRTVLFMHDIIIPRDASQQAYVGERIEIAPDFSN VQRGDLVFFGRKATADRKEGISHVGIYLGNKRFIHALGDVHISSFDPEDEYYDEFNTGRL LFATRFLPYINKEKGMNTTDHNPYYLYD >gi|226332109|gb|ACIC01000211.1| GENE 12 13607 - 14926 1126 439 aa, chain + ## HITS:1 COG:FN1154 KEGG:ns NR:ns ## COG: FN1154 COG1295 # Protein_GI_number: 19704489 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 55 341 19 300 396 160 34.0 4e-39 
MKKKITDIWKFITYDIWRITEDEVTRTKFSLYNIIKTIYLCINRFTKDRMANKASALTYS TLLAIVPILAILFAVARGFGFDNLMEHQFRSGFGGNIETTEAILSFVNSYLSQTKGGIFI GVGLVMLLWTVINLVSNIEITFNRIWEVKKARSMYRKITDYFSMFLLMPILIVVSGGLSL FVSTVLKQMDDFVLLAPVMKFMIRLIPFVLTWLMFTGLYIFMPNTKVKFKHALIAGILAG SAYQAFQFLYINSQLWVSKYNAIYGSFAALPLFLLWLQISWTICLFGAELTYAGQNIRSF SFDQDTRNISRRYRDFISILIMSLIAKRFEKNEPPYTAAEISEEHQIPIRLTNQVLYQLQ EIDLIHEVVSDQKSEDIGYQPSMDINQLNVAILLDRLDTYGSENFKIDKDEEFNDEWKVL TESREEYYKKASKVLLKDL >gi|226332109|gb|ACIC01000211.1| GENE 13 15162 - 15698 592 178 aa, chain - ## HITS:1 COG:CAC2311 KEGG:ns NR:ns ## COG: CAC2311 COG0778 # Protein_GI_number: 15895578 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 1 143 1 139 187 82 33.0 4e-16 MDFLQLVLSRQSDRAYDKERPVEAEKLERILEAARLAPSACNAQPWKFVVVTDHELASKV GKAAAGLGMNKFAKDAPVHILVVEESANITSLLGGKVKGKHFPLIDIGIAAAHITLAAEN EGLGSCILGWFDEKEIKELTGIPASKRLLLDIAIGYPVKEKRKKMRKPKEKVISYNQY >gi|226332109|gb|ACIC01000211.1| GENE 14 15717 - 16319 753 200 aa, chain - ## HITS:1 COG:L0164 KEGG:ns NR:ns ## COG: L0164 COG0307 # Protein_GI_number: 15672976 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase alpha chain # Organism: Lactococcus lactis # 1 196 1 192 216 153 41.0 3e-37 MFSGIVEEYATLVALVKDQENIHFTFKCSFVNELKIDQSISHNGVCLTVVSLTDDTYTVT AMKETLDRSNLGLLKVGDKVNVERSMMMNGRLDGHIVQGHVDQTATCVEIKDAEGSWYFT FKYAFDKEMAKRGYITVDKGSVTVNGVSLTVCNPTDDTFQVAIIPYTYEYTNFHTFEIGS VVNIEFDIIGKYISRMIQYR >gi|226332109|gb|ACIC01000211.1| GENE 15 16887 - 17060 81 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|255690571|ref|ZP_05414246.1| ## NR: gi|255690571|ref|ZP_05414246.1| conserved hypothetical protein [Bacteroides finegoldii DSM 17565] # 1 57 1 57 57 62 75.0 7e-09 MQVTPKKSTVISYLPQEIRETTYTDIIYRPVTREMLEQLFSNSAKKDKHKLVKKRRK >gi|226332109|gb|ACIC01000211.1| GENE 16 17128 - 17439 147 103 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSKEENFTVEQQISRLIRSSKCTFQEGSKVLNVKSIKNISILNYRKGDFGDWNTEYNGIA TLLVVDEKNSGNYYENLYNIHGYANVDNDTVVGISTIIFVCKR 
>gi|226332109|gb|ACIC01000211.1| GENE 17 17442 - 18734 256 430 aa, chain - ## HITS:1 COG:MA4289 KEGG:ns NR:ns ## COG: MA4289 COG3291 # Protein_GI_number: 20093078 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 211 383 624 796 1734 148 47.0 2e-35 MRTLRLISTTLLIVVLCLNFTSCSDDNESKKNVIILADDTQRELTFSAKEETKEIKFTSS ESWSAYIYYSSENDWITVTPTEGGAGENTISIKVNSNSQYSYRGVTLGIRIPKQETQIYI TQEGKEDILYTQNIEVAGTLYELLGTYDMHKLKLTGYLNGTDVATIRKMPLTEIDLSDVN IVGGGTYTVHYIGLGTGEYRGSTSDNVFPSYFFYDKTTIQSVVLPNTITMIDARAFAYCE NLSSIVIPNGVTEIGMEAFRECKGLTSITIPNSVTKIGYEAFQACESITSITIPNSVTKI ANGAFRFCTNLTSIIIPNSVIDIDDYAFQNCSGLTSIVIGNGIKRLPTYVFGGCTSLSSI IIPANVETIETKAFKDCSALKEIHVKNPLPPTVYDAFSTYSHVVLYVPIGSKEAYQNHSI WGKFSTIIEE Prediction of potential genes in microbial genomes Time: Thu May 12 04:17:18 2011 Seq name: gi|226332108|gb|ACIC01000212.1| Bacteroides sp. 1_1_6 cont1.212, whole genome shotgun sequence Length of sequence - 4390 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 37 - 96 3.8 1 1 Op 1 . + CDS 164 - 742 557 ## BT_3832 hypothetical protein 2 1 Op 2 . + CDS 755 - 940 339 ## PROTEIN SUPPORTED gi|29349241|ref|NP_812744.1| 50S ribosomal protein L32 + Term 961 - 1000 5.0 + Prom 947 - 1006 2.4 3 2 Op 1 . + CDS 1027 - 2034 794 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III 4 2 Op 2 . + CDS 2120 - 3001 1107 ## COG1159 GTPase 5 2 Op 3 . 
+ CDS 3050 - 4363 1194 ## COG1160 Predicted GTPases Predicted protein(s) >gi|226332108|gb|ACIC01000212.1| GENE 1 164 - 742 557 192 aa, chain + ## HITS:1 COG:no KEGG:BT_3832 NR:ns ## KEGG: BT_3832 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 192 1 179 179 353 100.0 2e-96 MGKFDKYKIDLKGMQTDSAKYEFVLDNLYFAHIDGPEVQKGKVNVTLTVKRTSRAFELSF QTDGIVWVPCDRCLDEMELQITSSDKLMVKFGHEYAEEGDNLIVIPEEEGEINVAWFMYE FVALSIPMKHVHAPGKCNKAVTSKLSKHLKTDANEDSDEVFDTGGDDIVVAEEMEEQIDP RWNELKKILDNN >gi|226332108|gb|ACIC01000212.1| GENE 2 755 - 940 339 61 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29349241|ref|NP_812744.1| 50S ribosomal protein L32 [Bacteroides thetaiotaomicron VPI-5482] # 1 61 1 61 61 135 100 7e-32 MAHPKRRQSKTRTAKRRTHDKAVAPTLAICPNCGEWHVYHTVCGACGYYRGKLAIEKEAA V >gi|226332108|gb|ACIC01000212.1| GENE 3 1027 - 2034 794 335 aa, chain + ## HITS:1 COG:lin2305 KEGG:ns NR:ns ## COG: lin2305 COG0332 # Protein_GI_number: 16801369 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Listeria innocua # 4 328 1 311 312 281 43.0 9e-76 MEKINAVITGVGGYVPDYILTNEEISKMVDTNDEWIMTRIGVKERHILNEEGLGSSYMAR KAAKQLMKKTGANPDDIDLVIVATTTPDYHFPSTASILCDKLGLKNAFAFDLQAACCGFL YLMETAANFIRSGRYKKIIIVGADKMSSMVNYTDRATCPIFGDGAAAFMMEPTTEDLGVM DSILRTDGKGLPFLHMKAGGSVCPPSYFTVDNKMHYLHQEGRTVFKYAVSSMSDVSAAIA EKNGLTKDTINWVVPHQANVRIIEAVAHRMELPMDKVLVNIEHYGNTSAATLPLCIWDFE DKLKKGDNIIFTAFGAGFTWGAVYVKWGYDGKKES >gi|226332108|gb|ACIC01000212.1| GENE 4 2120 - 3001 1107 293 aa, chain + ## HITS:1 COG:lin1499 KEGG:ns NR:ns ## COG: lin1499 COG1159 # Protein_GI_number: 16800567 # Func_class: R General function prediction only # Function: GTPase # Organism: Listeria innocua # 3 290 6 296 301 253 45.0 4e-67 MHKAGFVNIVGNPNVGKSTLMNALVGERISIATFKAQTTRHRIMGIYNTDDMQIVFSDTP GVLKPNYKLQESMLNFSTSALTDADVLLYVTDVVETPDKNNEFMGKVRQMTVPVLLLINK IDLTDQEKLIKLVEEWKELLPQAEIIPISAASKFNVDYVMKRIQELLPDSPPYFGKDQWT 
DKPARFFVNEIIREKILLYYDKEIPYSVEVVVEEFKEEAKKIHIRAVIYVERDSQKGIII GKQGKALKKVATEARRELERFFGKTIFLETYVKVDKDWRSSDKELRNFGYQLD >gi|226332108|gb|ACIC01000212.1| GENE 5 3050 - 4363 1194 437 aa, chain + ## HITS:1 COG:SPy0341 KEGG:ns NR:ns ## COG: SPy0341 COG1160 # Protein_GI_number: 15674498 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Streptococcus pyogenes M1 GAS # 5 437 6 435 436 375 46.0 1e-104 MNNLVAIVGRPNVGKSTLFNRLTKTRQAIVNEEAGTTRDRQYGKSEWLGREFSVVDTGGW VVNSDDIFEEEIRKQVLLAVEEADVILFVVDVMNGVTDLDMQVATILRRANSPVIMVANK TDNNELQYNAPEFYKLGLGDPYCISAITGSGTGDLMDLIVSKFNKETSEILDDDIPRFAV VGRPNAGKSSIVNAFIGEDRNIVTEIAGTTRDSIYTRYNKFGFDFYLVDTAGIRKKNKVN EDLEYYSVVRSIRSIENADVCILMLDATRGVESQDLNILSLIQKNQKGLVVVINKWDLIE DKTAKMMKEFEATIRSRFAPFVDFPIIFASALTKQRILKVLEEARNVYENRTTKIPTARL NEEMLPLIEAYPPPSNKGKYIKIKYITQLPNTQVPSFVYFANLPQYVKEPYKRFLENKMR EKWNLTGTPINIYIRQK Prediction of potential genes in microbial genomes Time: Thu May 12 04:17:26 2011 Seq name: gi|226332107|gb|ACIC01000213.1| Bacteroides sp. 1_1_6 cont1.213, whole genome shotgun sequence Length of sequence - 10947 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 7, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 47 - 814 281 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Prom 852 - 911 2.6 + Prom 776 - 835 3.7 2 2 Op 1 23/0.000 + CDS 890 - 1633 733 ## COG0767 ABC-type transport system involved in resistance to organic solvents, permease component 3 2 Op 2 . + CDS 1630 - 2400 306 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Term 2431 - 2467 4.1 - Term 2415 - 2457 3.4 4 3 Tu 1 . 
- CDS 2489 - 2734 282 ## COG0724 RNA-binding proteins (RRM domain) - Prom 2755 - 2814 9.4 + Prom 3053 - 3112 5.9 5 4 Op 1 29/0.000 + CDS 3135 - 4490 1904 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) + Term 4517 - 4575 6.4 + Prom 4555 - 4614 6.5 6 4 Op 2 24/0.000 + CDS 4634 - 5296 868 ## COG0740 Protease subunit of ATP-dependent Clp proteases 7 4 Op 3 . + CDS 5299 - 6543 248 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 + Term 6616 - 6659 0.1 + Prom 6625 - 6684 3.7 8 5 Tu 1 . + CDS 6713 - 8893 1877 ## COG0514 Superfamily II DNA helicase + Term 9062 - 9096 -0.5 9 6 Tu 1 . + CDS 9425 - 9598 105 ## gi|255690571|ref|ZP_05414246.1| conserved hypothetical protein + Term 9626 - 9671 3.8 - Term 9614 - 9659 7.6 10 7 Op 1 . - CDS 9667 - 10191 421 ## gi|253573101|ref|ZP_04850492.1| predicted protein 11 7 Op 2 . - CDS 10197 - 10655 460 ## gi|253573102|ref|ZP_04850493.1| conserved hypothetical protein - Prom 10866 - 10925 3.0 Predicted protein(s) >gi|226332107|gb|ACIC01000213.1| GENE 1 47 - 814 281 255 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 7 230 1 229 245 112 27 9e-25 MEESKMVLRTEDLVKKYGKRTVVSHVSIDVKQGEIVGLLGPNGAGKTTSFYMTVGLITPN EGRIFLDDLEITKFPVYKRAQTGIGYLAQEASVFRQMSVEDNIASVLEMTNKPKDYQKDK LESLIAEFRLQKVRKNKGNQLSGGERRRTEIARCLAIDPKFIMLDEPFAGVDPIAVEDIQ QIVWKLKDKNIGILITDHNVQETLSITDRAYLLFEGKILFQGTPEELAENKIVREKYLSN SFVLRRKDFQLNKDE >gi|226332107|gb|ACIC01000213.1| GENE 2 890 - 1633 733 247 aa, chain + ## HITS:1 COG:aq_355 KEGG:ns NR:ns ## COG: aq_355 COG0767 # Protein_GI_number: 15605864 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: ABC-type transport system involved in resistance to organic solvents, permease component # Organism: Aquifex aeolicus # 1 245 1 244 245 115 34.0 8e-26 MIKALRTVGRYIILMGRTFSRPERMRMFFRQYLNEMEQLGVNSIGIVLLISFFIGAVITI QIKLNIESPWMPRWTVGYVTREIMLLEFSSSIMCLILAGKVGSNIASELGTMRVTQQIDA 
LEIMGINSANYLILPKITAMVTVIPVLVTFSIFAGIIGAFCTCWFAGVMNAVDLEYGLQY MFVEWFIWAGIIKSLFFAFIIASVSAFFGYTVDGGSIAVGKASTDAVVSSSVLILFADLV LTKLLMG >gi|226332107|gb|ACIC01000213.1| GENE 3 1630 - 2400 306 256 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 237 1 239 245 122 31 1e-27 MIDIRGLYKSFEDKTVLSNINASFENGKTNLIIGQSGSGKTVLMKCIVGLLTPEKGEVLY DGRNLVLMGKKEKKMLRKEMGMIFQSAALFDSMSVLDNVMFPLNMFSNDTLRDRTKRAMF CLERVNLTEAKDKFPGEISGGMQKRVAIARAIALNPQYLFCDEPNSGLDPKTSLVIDDLI HDITQEYNMTTIINTHDMNSVLGIGEKVIYIYEGHKEWEGTKDDIFTSTNERLNNFIFAS DLLRKVKDMEVQNMEG >gi|226332107|gb|ACIC01000213.1| GENE 4 2489 - 2734 282 81 aa, chain - ## HITS:1 COG:asl4022 KEGG:ns NR:ns ## COG: asl4022 COG0724 # Protein_GI_number: 17231514 # Func_class: R General function prediction only # Function: RNA-binding proteins (RRM domain) # Organism: Nostoc sp. PCC 7120 # 1 80 1 80 94 86 56.0 1e-17 MNIYVGNLNYRVKEGDLQQVMEDYGAVSSVKVVMDRETGKSKGFAFIEMEDDAAAAKAIA ELNGAEYMGRTMVVKEARPRA >gi|226332107|gb|ACIC01000213.1| GENE 5 3135 - 4490 1904 451 aa, chain + ## HITS:1 COG:PA1800 KEGG:ns NR:ns ## COG: PA1800 COG0544 # Protein_GI_number: 15596997 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Pseudomonas aeruginosa # 1 445 1 425 436 73 21.0 1e-12 MNVSLQNIDKVSAQLTVKLEKADYQEKVDKSLKSFRQKAQMPGFRKGMVPMSLVKKMYGK SVIAEEVNKVLQEAVYNYIKENKVNMLGEPLPNEEKQQVIDFDTMEEFEFVFDIALAPEF KAEVSAKDKVDYYTIEVSDEMIENQVKMYTQRTGKYDKVDVYEDNDMLKGLLAQLDEEGN TKEGGIQVEGAVLMPSYMKNDDQKAIFANAKVNDVLVFNPNVAYDGHAAELGSLLKIDKE IAKDVKSDFSFQVEEITRFVPGELTQEVFDQAFGEGVVKTEEEFRAKIKEEIAARFVADS DYKFLIDIRKVMMDKVGKLEFSDALLKRIMLLNNEEKGEEYVAENYDKSIEELTWHLIKE QLVEANDIKVEQEDVLKMAKETTKAQFAQYGMLSVPEDVLDNYAQEMLKKKDTINNLVSR VVEVKLAAALKAQVTLENKNVSMEEFNKMFE >gi|226332107|gb|ACIC01000213.1| GENE 6 4634 - 5296 868 220 aa, chain + ## HITS:1 COG:sll0534 KEGG:ns NR:ns ## COG: sll0534 COG0740 # Protein_GI_number: 
16332068 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Synechocystis # 32 217 25 210 226 227 58.0 2e-59 MDDFRKYATKHLGMNGMVLDDVIKSQAGYLNPYILEERQLNVTQLDVFSRLMMDRIIFLG TQVDDYTANTLQAQLLYLDSVDPGKDISIYINSPGGSVYAGLGIYDTMQFISSDVATICT GMAASMAAVLLVAGAEGKRSALPHSRVMIHQPMGGAQGQASDIEITAREIQKLKKELYTI IADHSHTDFDKVWADSDRDYWMTAQEAKEYGMIDEVLIKK >gi|226332107|gb|ACIC01000213.1| GENE 7 5299 - 6543 248 414 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 154 404 248 453 466 100 31 6e-21 MADSKTKKKCSFCGRSENEVGFLITGMNGYICDSCATQAYEITQEALGEGRKRAGATKLN LKELPKPVEIKKFLDQYVIGQDDAKRFLSVSVYNHYKRLLQKDSGDDVEIEKSNIIMVGS TGTGKTLLARTIAKLLHVPFTIVDATVLTEAGYVGEDIESILTRLLQVADYNVPEAEQGI VFIDEIDKIARKGDNPSITRDVSGEGVQQGLLKLLEGSVVNVPPQGGRKHPDQKMIPVNT KNILFICGGAFDGIEKKIAQRLNTHVVGYTASQKTAVIDKNNMMQYIAPQDLKSFGLIPE IIGRLPVLTYLNPLDRNALRAILTEPKNSIIKQYIKLFEMDGIKLTFEDSVFEYIVDKAV EYKLGARGLRSIVETIMMDVMFEIPSESKKEYKVTLDYAKQQLEKANMARLQIA >gi|226332107|gb|ACIC01000213.1| GENE 8 6713 - 8893 1877 726 aa, chain + ## HITS:1 COG:alr0205 KEGG:ns NR:ns ## COG: alr0205 COG0514 # Protein_GI_number: 17227701 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Nostoc sp. 
PCC 7120 # 6 720 6 712 718 495 39.0 1e-139 MAGKINLTDELKKYFGFNKFKGNQEAIIQNLLDGKDTFVLMPTGGGKSLCYQLPSLLMEG TAIVISPLIALMKNQVDAMRNFSEEDGIAHFINSSLNKGAIDQVRSDILAGKTKLLYVAP ESLTKEENVDFLRSVKISFYAVDEAHCISEWGHDFRPEYRRIRPIINEIGKAPLIALTAT ATPKVQHDIQKNLGMVDAQVFKSSFNRPNLYYEVRAKTNNIDKDIIKFIKNNSEKSGIIY CLSRKKVEELAEILQANGINARPYHAGMDSMTRTKNQDDFLMEKVEVIVATIAFGMGIDK PDVRFVIHYDIPKSLEGYYQETGRAGRDGGEGQCLTFYTNKDLQKLEKFMQGKPVAEQEI GKQLLLETAAYAESSVCRRKTLLHYFGEEYTEENCGNCDNCLNPKKQVEAQELLCAVIEA IIAVKENFKADYIIDILQGRETSEVQAHLHEDLEVFGSGMGEEDKTWNAVIRQALIGGYL SKDVENYGLLKVTEEGHKFLKKPKSFKITEDNDFEETEEEVPARGGGSCAVDPALYSMLK DLRKKLSKKLEVPPYVIFQDPSLEAMATIYPVTLDELQNIPGVGAGKAKRYGEEFCKLIK RHCEENEIERPEDLRVRTVANKSKMKVAIIQAIDRKVALDDIALSKGIEFGELLDEVEAI VYSGTKLNIDYFLEEIMDEDHMLDIYDYFKESTTDKIDDALDELGDEFTEEEVRLVRIKF ISEMAN >gi|226332107|gb|ACIC01000213.1| GENE 9 9425 - 9598 105 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|255690571|ref|ZP_05414246.1| ## NR: gi|255690571|ref|ZP_05414246.1| conserved hypothetical protein [Bacteroides finegoldii DSM 17565] # 1 57 1 57 57 70 89.0 4e-11 MQVTPKKNNTIHYLPQEIRETRLTEVTYLPITREMLEMLFSNSAKKNKHKLVKKRRK >gi|226332107|gb|ACIC01000213.1| GENE 10 9667 - 10191 421 174 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253573101|ref|ZP_04850492.1| ## NR: gi|253573101|ref|ZP_04850492.1| predicted protein [Bacteroides sp. 1_1_6] # 1 174 1 174 174 320 100.0 2e-86 MRIRLFTILLTFALCISNVYGVNPERQKGMRYAVEGHFAVGASSVENAMNFGFTASIGYQ LNSNIFLGIGGGYINYPQYADGYVDNAFPIYADAKYNFNDKKVSPFIQLKAGYDVSCNNG FYTNPTVGTSFALGRNMFLNVGLGYIIDINEGYTGSTKENKACGSVNLKVGFEF >gi|226332107|gb|ACIC01000213.1| GENE 11 10197 - 10655 460 152 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253573102|ref|ZP_04850493.1| ## NR: gi|253573102|ref|ZP_04850493.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 152 1 152 152 273 100.0 2e-72 MKTLRLIGTTLLMVVLCLNFTACGDDDDDDTPTSEKLVGQWILVYEEGYEKNPAYPEYDE EWSRVPEGECYDYGHLTFRADGTFTEYELDNSVIGNGKWTLSNGLLSLRYGSDTDVYDLK VTELTSTKLVLECYEEKTDGDKEYSKMTYQKK Prediction of potential genes in microbial genomes Time: Thu May 12 04:17:49 2011 Seq name: gi|226332106|gb|ACIC01000214.1| Bacteroides sp. 1_1_6 cont1.214, whole genome shotgun sequence Length of sequence - 9243 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 6, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 160 - 219 6.2 1 1 Tu 1 . + CDS 242 - 1435 735 ## COG4974 Site-specific recombinase XerD + Term 1441 - 1492 8.2 + Prom 1458 - 1517 3.2 2 2 Tu 1 . + CDS 1547 - 2470 496 ## gi|253573105|ref|ZP_04850495.1| conserved hypothetical protein 3 3 Op 1 . + CDS 2602 - 2889 262 ## ZPR_1827 putative excisionase 4 3 Op 2 . + CDS 2895 - 3968 812 ## BF2791 hypothetical protein + Prom 3999 - 4058 4.2 5 4 Tu 1 . + CDS 4133 - 5098 509 ## BDI_3503 DNA primase + Term 5147 - 5181 1.8 6 5 Op 1 . + CDS 5244 - 5627 345 ## BF3284 mobilization protein 7 5 Op 2 . + CDS 5593 - 6513 542 ## BF3285 mobilization protein 8 5 Op 3 . + CDS 6518 - 7237 485 ## BF3286 hypothetical protein + Term 7248 - 7291 7.3 9 6 Op 1 . - CDS 7326 - 7616 242 ## BVU_3196 hypothetical protein 10 6 Op 2 . 
- CDS 7682 - 8842 628 ## BT_1668 hypothetical protein - Prom 8876 - 8935 2.1 Predicted protein(s) >gi|226332106|gb|ACIC01000214.1| GENE 1 242 - 1435 735 397 aa, chain + ## HITS:1 COG:lin2069 KEGG:ns NR:ns ## COG: lin2069 COG4974 # Protein_GI_number: 16801135 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Listeria innocua # 128 380 20 281 297 71 28.0 3e-12 MAKKKQEVKLKEPVRIRFKQLANGNQSIYLDYYTGDVIRKENYVGGKRQYEFLKLYLIPE KTREDKAKNEATLALAKAIQSKRIVELQNDAHGFQNTNKSKANVIDYLMNMRSQSKERGS LNYEKTVGNTIRELKLFRGDYIAFRDIDKDFLNSFVDFLKQAKKASKFGLLKAGGVLSNN SVIAYYGVLRTAINRAYKEGIITVNPTKEFDFASKVKAEVSRREYLTIEELKRLIGTECK YEIMKQAFLFSCLCGLRVSDIRKLKWNDLQKSGERIRIEIKMQKTKEPLYLPISDEALKW LPQQNEAKGDDLIFPLTHEGTINKILQKWAKDAGVIKHISFHVARHTHATMMLTLGADLY TVSKLLGHKNIATTQIYAKIVDKKKEEAISLIPNLTD >gi|226332106|gb|ACIC01000214.1| GENE 2 1547 - 2470 496 307 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253573105|ref|ZP_04850495.1| ## NR: gi|253573105|ref|ZP_04850495.1| conserved hypothetical protein [Bacteroides sp. 
1_1_6] # 1 307 1 307 307 557 100.0 1e-157 MTEEELKIKFDHIQGIFNRCINHASQVMIDNIASKSLYFDEEQADKLEQQEYVRTADELV QLYIRYSVLNDIPYFYSVSDFFWESGFYESLKSDEKRKYMSFNPLSFDYSRYEQDNTVYD EELPYFSVVVKTIVLERYSAYLRKKKESKVQAEMQPQQKEPQPIQDKYQEPKIISHIAET ENPFKSILNDRQIALLVDCINEVEIFNAPMTFEDLKAILSCKPKVIFRSNNNRQVAFLFS ELSNRGLITPNWQSVIAKNKLFVTKNIKKDKYLNQGDLATAANYVKGVEHEKDYVTISNY IKQLKKL >gi|226332106|gb|ACIC01000214.1| GENE 3 2602 - 2889 262 95 aa, chain + ## HITS:1 COG:no KEGG:ZPR_1827 NR:ns ## KEGG: ZPR_1827 # Name: not_defined # Def: putative excisionase # Organism: Z.profunda # Pathway: not_defined # 5 91 6 92 93 78 43.0 7e-14 METSIEKRVAELENLVFLSKNVLSFDEASKFLNLSKSYLYKLTSGNLIPHYKPQGKMLYF EKAELEAWLRQNPVKTQAQIEQEAQKYILNRPLKK >gi|226332106|gb|ACIC01000214.1| GENE 4 2895 - 3968 812 357 aa, chain + ## HITS:1 COG:no KEGG:BF2791 NR:ns ## KEGG: BF2791 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 352 1 356 368 384 53.0 1e-105 MENRKATEAGQDVTMQKEDFAALWKTIHLKVTDTYEVPPEMLWVNGSTIGTLGNFSASTG KAKSKKTFNISAIVAAALKNDEVLKYSACLPSNKRKILYVDTEQSKYHCHKVMERILRLA GLPTDKDMNDFVFIVLREQTPDKRKQIIGYMLENMPDVGLLIIDGIRDLMYDINSPSEST DLINLLMRWSSGYNLHIHTVLHLNKGDDNTRGHIGTELNNKAETVLQITKSQQDGNISEV KAMHIRDREFDPFAFRINDNALPEVVDGYVFKQPSQDRGFPLAELTEQQHRTALENGFGK QVIYGYENVLKTLKQGYASIGYERGRNIHVELNKFLVNKRMIVKEGKGYRYNPDFHY >gi|226332106|gb|ACIC01000214.1| GENE 5 4133 - 5098 509 321 aa, chain + ## HITS:1 COG:no KEGG:BDI_3503 NR:ns ## KEGG: BDI_3503 # Name: not_defined # Def: DNA primase # Organism: P.distasonis # Pathway: not_defined # 1 291 1 291 312 463 74.0 1e-129 MNIEDVKQIPIADYLHSLGYSPVKQQGNGLWYKSPLREEHEPSFKVNTDRNLWYDFGAGK GGNLIALAKELYCSDSLPYLLNRIAEQRPHVRPVSFSFPQRRTEPSFQHLEVRDLTHPAL LRYLQGRGINIELAKRECKELHFTNNGRPFFAIGFSNMAGGYEVRNSFFKGCIAPKDITH IQQQGEPREKCLVFEGFMDYLSFLTLRMRNCPAMPDLDRQDYVILNSTANVSKALDVMSP YERIHCLLDNDKAGFEATRAIELEYSYHVRDFSHNYRGYSDLNDYLCGRKQEQKNAASQV QETKQETGQRAAPRQKRGRDI >gi|226332106|gb|ACIC01000214.1| GENE 6 5244 - 5627 345 127 aa, chain + ## HITS:1 COG:no KEGG:BF3284 NR:ns ## 
KEGG: BF3284 # Name: bmgB # Def: mobilization protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 126 1 126 127 183 82.0 2e-45 MKKIQDKLGGRPAKKRTDKQKKVVSTKLTELQYYAIRKRAGEAGLRISEYVRQAVVSAEV IPRLNRQDADTIRKLAGEANNINQLAHRANAGGFALVAVELLKLKDRIVEIINHLSDDWK NKKGKRF >gi|226332106|gb|ACIC01000214.1| GENE 7 5593 - 6513 542 306 aa, chain + ## HITS:1 COG:no KEGG:BF3285 NR:ns ## KEGG: BF3285 # Name: bmgA # Def: mobilization protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 306 1 304 304 451 80.0 1e-125 MIGKIKKGSGFKGCVNYVLGKEQAVLLHADGVLTESRGDIIRSFCMQTGMNPDLKKPVGH IALSYSAVDALKLTDGKMIQLAQEYMREMKITDTQYIIVRHQDREHPHVHIVFNRIDNNG KTISDRNDMYRNEQVCKKLKAKHGLYFAKGKEQVKQHRLKEPDKSKYEIYTAVKNEIGKS RNWQQLQQRLAERGITVRFKRKGQTDEIQGISFSKGEYTFKGSEIDRSFSFSKLDKCFGD AGMNVAESQRQTTFAPVREQAPAPGKADSPLITGSLGLFSASSPPVDEEPNFNLRKKKKK KKQLKL >gi|226332106|gb|ACIC01000214.1| GENE 8 6518 - 7237 485 239 aa, chain + ## HITS:1 COG:no KEGG:BF3286 NR:ns ## KEGG: BF3286 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 239 1 241 241 227 57.0 2e-58 MEDNLILEGLLSMVTELKEKLEAQVTPASREETLERLGTIEQALSELHSNSAVPEEKLQV IQSQLDDIRSRMQGQQKNIEDTKKITLETYRCFKVMIDTLGSCRTDKEEATSLPFYQRIY NKVTSWVRPGLFVFSAVLVICSASIFLNVRLVTRMQQLQDNDIKYRYLLMQGQADGEVFD ILETKFNWQRDNGFIQSLTDSVIDFEYRSRKQAEALERARLLNEQAELLRKEADKLGKP >gi|226332106|gb|ACIC01000214.1| GENE 9 7326 - 7616 242 96 aa, chain - ## HITS:1 COG:no KEGG:BVU_3196 NR:ns ## KEGG: BVU_3196 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 79 1 79 356 94 51.0 2e-18 MGTLGCINDMLQRDKENRELRKLNKERMKEHHKYLVNKGKNINLSNISIETIEEIRKNTI AKEQSDQIYLFKAYIIFAICLLGLFLLGILCYKLFF >gi|226332106|gb|ACIC01000214.1| GENE 10 7682 - 8842 628 386 aa, chain - ## HITS:1 COG:no KEGG:BT_1668 NR:ns ## KEGG: BT_1668 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 157 265 235 340 350 79 39.0 3e-13 
MKHFLPILFIFSFNLVSPLLAQDKKCANMSALLENKIWKVQLPKDKQYAMEMEFRSAGWR TTFLYDEKQTKTFYSYSLHGDTIKVFESGKNYIIQELTDSTLTFLYLPESLTIGVTPVRC TTDNSIQGQRENEERLDSIWRKEDIWNKGTAPINKTQAIKEYPRWAVWDYDLEKYFVSQM KYPKELLKKNVAGHSVVMFSIDTLGLPREINILTTIHKEFDKEIIRLTKELPHCLPCRDK NGKRIECFHTVYVPFLPQHYRDRVKADSIAKEELKQSFVEWETVSYFEKANPYAVTNYIN ERLTYDPKLLNGKKEAKGIYTIRIDSYGEIIKVETLRSCGIPEWDKQVLQIIKGMPRWTP TINYYGKGEYHNSVWTVPVLFKNDNR Prediction of potential genes in microbial genomes Time: Thu May 12 04:18:32 2011 Seq name: gi|226332105|gb|ACIC01000215.1| Bacteroides sp. 1_1_6 cont1.215, whole genome shotgun sequence Length of sequence - 4977 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - LSU_RRNA 247 - 2899 99.0 # AE015928 [R:1626942..1629594] # 23S ribosomal RNA # Bacteroides thetaiotaomicron VPI-5482 # Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; Bacteroidaceae; Bacteroides. - SSU_RRNA 2962 - 4954 99.0 # EU136679 [D:1..1995] # 16S ribosomal RNA # Bacteroides thetaiotaomicron # Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; Bacteroidaceae; Bacteroides. - TRNA 3074 - 3147 84.8 # Ala TGC 0 0 - TRNA 3202 - 3275 78.1 # Ile GAT 0 0 Predicted protein(s) Prediction of potential genes in microbial genomes Time: Thu May 12 04:18:33 2011 Seq name: gi|226332104|gb|ACIC01000216.1| Bacteroides sp. 1_1_6 cont1.216, whole genome shotgun sequence Length of sequence - 4138 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 4, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 187 179 ## Clim_0726 addiction module toxin, Txe/YoeB family 2 1 Op 2 . + CDS 198 - 395 299 ## gi|189464126|ref|ZP_03012911.1| hypothetical protein BACINT_00461 + Term 415 - 460 2.1 + Prom 426 - 485 3.3 3 2 Op 1 . + CDS 577 - 876 79 ## gi|189464125|ref|ZP_03012910.1| hypothetical protein BACINT_00460 4 2 Op 2 . 
+ CDS 873 - 1667 576 ## BT_2995 hypothetical protein 5 2 Op 3 . + CDS 1624 - 2076 232 ## gi|255692735|ref|ZP_05416410.1| conserved hypothetical protein 6 2 Op 4 . + CDS 2096 - 2326 267 ## gi|237707884|ref|ZP_04538365.1| predicted protein + Term 2327 - 2374 10.1 - Term 2318 - 2359 5.5 7 3 Tu 1 . - CDS 2363 - 3388 597 ## CFPG_P2-1 replication protein A - Prom 3570 - 3629 5.8 + Prom 3723 - 3782 3.7 8 4 Tu 1 . + CDS 3802 - 4053 212 ## gi|189462399|ref|ZP_03011184.1| hypothetical protein BACCOP_03085 Predicted protein(s) >gi|226332104|gb|ACIC01000216.1| GENE 1 2 - 187 179 61 aa, chain + ## HITS:1 COG:no KEGG:Clim_0726 NR:ns ## KEGG: Clim_0726 # Name: not_defined # Def: addiction module toxin, Txe/YoeB family # Organism: C.limicola # Pathway: not_defined # 1 61 33 93 93 63 52.0 2e-09 KIYKELLEHPKTGLGHPEALRGGGDITWSRHITAHDRIIYDIYEEVVEVYILEVEGHYND K >gi|226332104|gb|ACIC01000216.1| GENE 2 198 - 395 299 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|189464126|ref|ZP_03012911.1| ## NR: gi|189464126|ref|ZP_03012911.1| hypothetical protein BACINT_00461 [Bacteroides intestinalis DSM 17393] # 1 65 1 65 65 111 93.0 2e-23 MVEYCVYWLENGEPMHEVFSNLAAAEMYSCAIRGKENVEWVEVSEEEAIDLDELKDMFPD DFCGV >gi|226332104|gb|ACIC01000216.1| GENE 3 577 - 876 79 99 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|189464125|ref|ZP_03012910.1| ## NR: gi|189464125|ref|ZP_03012910.1| hypothetical protein BACINT_00460 [Bacteroides intestinalis DSM 17393] # 1 99 1 99 99 148 100.0 1e-34 MELRRNEKITFRCTELEKDALAEQAARCSLSVSEYCRSLSLGGRPRERYTEEERQLLRDI AQLKGTLQRLNNYFGGRQYREVFEENRALITELKKILSR >gi|226332104|gb|ACIC01000216.1| GENE 4 873 - 1667 576 264 aa, chain + ## HITS:1 COG:no KEGG:BT_2995 NR:ns ## KEGG: BT_2995 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 242 1 261 403 105 29.0 2e-21 MIGKGKSISHGVAALEYDLAKEINGQAVATEIARHELYGCTGAEMVQEMKPYHIDFPNVK NNCLRFEVSPSIEESATFTDADWAELGNDFMQRMGLANHQYIIIRHSGTESKKEQAHLHI LANRVSLSGELYRDNWIGKKATEAANAIAKERNFVQSQDIGKVNKAEIKEAMDGVLKKMQ 
GFDFTKFKEELGKRGFKVREARASTGKLNGYYVTARSGTEYKASEIGKGYTLAHIERTQS KLKCNSMNISHGNKLTPGSGSFQR >gi|226332104|gb|ACIC01000216.1| GENE 5 1624 - 2076 232 150 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|255692735|ref|ZP_05416410.1| ## NR: gi|255692735|ref|ZP_05416410.1| conserved hypothetical protein [Bacteroides finegoldii DSM 17565] # 1 150 1 150 150 261 90.0 1e-68 METNSHPEAAASNVKQGQYSPRKRKDTTPRVNPQLAVELVYQELKRVEVYTKRIEDATAR KVQIDGKSLESAENRLKNVLADFERQGYRMKNGGYVDKRISFYSILCAVISLLFACFMCY LWTDAAKDRDNYKQYYEYYQEQAREQKGNK >gi|226332104|gb|ACIC01000216.1| GENE 6 2096 - 2326 267 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237707884|ref|ZP_04538365.1| ## NR: gi|237707884|ref|ZP_04538365.1| predicted protein [Bacteroides sp. 9_1_42FAA] # 1 76 35 110 110 145 100.0 7e-34 MGTTEHEEPRFFFILNKGAKSGGEITHAVLNGSIVSKPAGWDAFHGLALAREKLSSEEIQ QQMKELGVEMEIVPLI >gi|226332104|gb|ACIC01000216.1| GENE 7 2363 - 3388 597 341 aa, chain - ## HITS:1 COG:no KEGG:CFPG_P2-1 NR:ns ## KEGG: CFPG_P2-1 # Name: not_defined # Def: replication protein A # Organism: A.pseudotrichonymphae # Pathway: not_defined # 10 339 24 349 532 152 34.0 2e-35 MKKKLPITKNKDVVVSWVYTWSKQQDMSIHEQRIVLRILEACQAELKGVKLKDYAGTKRK FEHGLWDVDAQMHVSDVIFSGRDYNEIIAALDSLAGRFFTYEDDEEWWKCGFISNPKYKK HTGIITFRVSNDLWGVFTKFAKGYREFELNKALALPTGYSLRFYMLMSGQVYPLDISLDN LKERLGIPADKYKDKNGKDRIDNFEERVLKPAKAALDESCPYTFNYVKVRENPNNKRSKV TGFRFYPVYQPQFRDEELEGKDLQAKVTARYQIDSHVYEYLRYSCGFTSEEINRNKETFI TAQEKITDLIGELALLNGKSREKNNPKGWIINALKGKIKDK >gi|226332104|gb|ACIC01000216.1| GENE 8 3802 - 4053 212 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|189462399|ref|ZP_03011184.1| ## NR: gi|189462399|ref|ZP_03011184.1| hypothetical protein BACCOP_03085 [Bacteroides coprocola DSM 17136] # 1 83 1 83 83 142 100.0 5e-33 MEALSVREYRNNLAASFTKADNGEQVLIRRKNEIYALVKVGREDLMITPELQARIDKARE EIKSGKCVTLKSSEDIDAYFDSL Prediction of potential genes in microbial genomes Time: Thu May 12 04:19:10 2011 Seq name: gi|226332103|gb|ACIC01000217.1| Bacteroides sp. 
1_1_6 cont1.217, whole genome shotgun sequence Length of sequence - 2793 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 79 - 444 442 ## BVU_4025 hypothetical protein 2 2 Tu 1 . - CDS 399 - 638 56 ## - Prom 763 - 822 4.8 3 3 Op 1 . + CDS 570 - 770 93 ## BVU_4024 hypothetical protein + Prom 780 - 839 2.3 4 3 Op 2 . + CDS 861 - 2708 990 ## COG3436 Transposase and inactivated derivatives Predicted protein(s) >gi|226332103|gb|ACIC01000217.1| GENE 1 79 - 444 442 121 aa, chain + ## HITS:1 COG:no KEGG:BVU_4025 NR:ns ## KEGG: BVU_4025 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 121 1 120 120 92 39.0 3e-18 MFSEKDFERLWFLYKTEGEPNGVSINSFCIGNNIPYTAFYDWFKKTQKKVVPVEVEGIPE ELIRTGNEERQEVAKDKLKRTTPHKGSIMVTIRTREGLCIQKKGLDYQGLKTLVEKLEGL C >gi|226332103|gb|ACIC01000217.1| GENE 2 399 - 638 56 79 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKIIGIALIVYHANLALVFRHEHVNITIQWITVVSTSDYLAYTLSLASHIVKLRKVVEVV KTVDAQHKPSSFSTSVFSP >gi|226332103|gb|ACIC01000217.1| GENE 3 570 - 770 93 66 aa, chain + ## HITS:1 COG:no KEGG:BVU_4024 NR:ns ## KEGG: BVU_4024 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 66 45 110 111 63 45.0 3e-09 MFMSKDQRKVRMIHYERNAYYLHEKSFIKGYRFMRIERKDDVTVYKIDWKDLVTILETPV ITSIRI >gi|226332103|gb|ACIC01000217.1| GENE 4 861 - 2708 990 615 aa, chain + ## HITS:1 COG:SMc03298 KEGG:ns NR:ns ## COG: SMc03298 COG3436 # Protein_GI_number: 15966896 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Sinorhizobium meliloti # 269 610 188 528 537 83 26.0 1e-15 METEELIEHILRQKHLEEMMNRPEKEHTPLEGMDNEQIKRFALFLFEENQSKSKQLDEMI ARLDEIGKDLKESNKKIDSLTNALLKANSKADKVVLEYKLRDKEYKKLEKKYNALLERLS IMNTQTYASSKSLKGIDRKRVVKGKHDDKDDFDGTPTAPPSEVPQPDSSASCDTQDTPKA 
SLSKERPYRKGMTYNKTCVGTPIIHRSDYTMLPEDSVVISSSYRKIRSIVSHIEEHHFEV LKVKHADGRIESMFLPMKDDVRASLYDEIVPGTSITANMLSYLMFNRFQMSTPAYREAKN RLSDMDWNTSVQNLLNWADKGAMQLNKLIPALKKIALQDGANVNVDETWLRYHAYNKKRK TYMWCLVNRKARIVIFFYEDTTDDEGVQKHGGRNRNVLKEFLGDAKIKSLQSDGYNVYMY LDNELMDIEHLCCLAHARAKFKYAYDQGSLQARIFLELIAKLYGMEETYRREKLTADEIY SRRNGKETTEIIEKLRTELYDLLANPDESRSELMSKALNYLKNFWNQIFSYRNDGEYSID NMTAERAIRPITVQRKNSLFFGSVKGIQNSAIYNTFIETCKQAGVSFRNYFCKLLRELKK GRTDYENLLPMTICK Prediction of potential genes in microbial genomes Time: Thu May 12 04:19:19 2011 Seq name: gi|226332102|gb|ACIC01000218.1| Bacteroides sp. 1_1_6 cont1.218, whole genome shotgun sequence Length of sequence - 2419 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 34 - 71 -0.1 1 1 Tu 1 . - CDS 122 - 1774 794 ## COG3344 Retron-type reverse transcriptase - Prom 1852 - 1911 2.5 Predicted protein(s) >gi|226332102|gb|ACIC01000218.1| GENE 1 122 - 1774 794 550 aa, chain - ## HITS:1 COG:MA3645 KEGG:ns NR:ns ## COG: MA3645 COG3344 # Protein_GI_number: 20092445 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Methanosarcina acetivorans str.C2A # 6 484 24 496 512 385 43.0 1e-106 MNETKTSCAPADNRNLTWDGMNWSKCEAYVRKLQARIVKAQKEGRHNKVKALQWMLTHSF YAKALAVKRVTSNKGKKTSGVDKQLWDSPKRKYKAISELKRRGYNPQPLRRVHIKKKNGK LRPLGIPTMKDRAMQALYLMALEPIAETTGDRFSYGFRKKRRTMDAIRQIDTVLNRQHSP EWILEGDIKGCFDHISHDWLIENIPMDKTILRKWLKCGAVFNGKLFPTEEGTPQGGIISP TLANMVLDGLQPLLAKRFKRLWRNNKTFHYKVNLIRYADDFIITGRDKELLENEVKPIVI EFLKERGLTLSEEKTTITNIYDGFDFLGFNVRKFGKKLYTSPSKDAQKRFRAKISDIVKG HKMCKQESLIRMLNPVITGWGNYYRYGASTDAFHGCDYHIYNLTKKWALRRHPKKRKSWV ADRYWHEIRGRKWTFAWKYETKSMKVNYLTLKRLSDIHYTPYKQVKGEANPFDPEYDDYF SQRKEQQMLESLKGRKSLLYLWNKQNRICPLCGKEIDCTKAWNVNEISVGGSIVRQLVHN NCYKRNKRKC Prediction of potential genes in microbial genomes Time: Thu May 12 04:19:20 2011 Seq name: gi|226332101|gb|ACIC01000219.1| 
Bacteroides sp. 1_1_6 cont1.219, whole genome shotgun sequence Length of sequence - 1700 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 783 545 ## BT_1503 integrase - Prom 904 - 963 7.3 + Prom 1039 - 1098 1.8 2 2 Tu 1 . + CDS 1137 - 1400 89 ## BT_1504 hypothetical protein 3 3 Tu 1 . - CDS 1410 - 1700 244 ## BT_1505 hypothetical protein Predicted protein(s) >gi|226332101|gb|ACIC01000219.1| GENE 1 3 - 783 545 260 aa, chain - ## HITS:1 COG:no KEGG:BT_1503 NR:ns ## KEGG: BT_1503 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 260 25 284 336 497 100.0 1e-139 MLKVVAFMKQVAKGLQMGGNFGTAHVYRSSLNAIIAYRGKGDFTFNEVTPEWIKGFEIHL RGRGCSWNTVSTYLRTFRAVYNRAVDCRGAVYVPHLFRSVYTGTRADRKRALDTEDIQKV FTKLPQSSAVTSDMRRTQELFVLMFLLRGLPFVDLAYLRKSDLHDNVITYRRRKTGRPLS VTLTPEAMAILKRYMNRDTSSPYLFPLLNSREGTKEAYHEYQLALRNFNRQLMLLGEMLG LGDKLSSYTARHTWATTAYY >gi|226332101|gb|ACIC01000219.1| GENE 2 1137 - 1400 89 87 aa, chain + ## HITS:1 COG:no KEGG:BT_1504 NR:ns ## KEGG: BT_1504 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 87 1 87 87 157 100.0 7e-38 MPPGKKGQASSLGQNKVHFHKYNLILLPPDTISSHSSEERGLLWKHRLSQARTSYSSHVR ILVRIRKMWEKTPHMRIGNIFFKKREL >gi|226332101|gb|ACIC01000219.1| GENE 3 1410 - 1700 244 96 aa, chain - ## HITS:1 COG:no KEGG:BT_1505 NR:ns ## KEGG: BT_1505 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 96 267 362 362 189 98.0 3e-47 VNSAKIQDDQQAKISSLVEYLEKHSAAKVSVTGYADKETGNPNINMTLSEKRAKNVIEML KAKGVVADRIVIGYKGDTVQPYQKPEENRVCICIAE Prediction of potential genes in microbial genomes Time: Thu May 12 04:19:28 2011 Seq name: gi|226332100|gb|ACIC01000220.1| Bacteroides sp. 
1_1_6 cont1.220, whole genome shotgun sequence Length of sequence - 1699 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 24 - 1595 1119 ## COG3436 Transposase and inactivated derivatives Predicted protein(s) >gi|226332100|gb|ACIC01000220.1| GENE 1 24 - 1595 1119 523 aa, chain + ## HITS:1 COG:SMb20541 KEGG:ns NR:ns ## COG: SMb20541 COG3436 # Protein_GI_number: 16264268 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Sinorhizobium meliloti # 24 512 29 539 550 206 32.0 1e-52 MTQTETLELLVATLQQANSSQSESIERLTRQNEQLQNKLQELLAQVAWLNRQLFGRKSEK LAHLDPNQLSLFDPPVQPLEHEIPEEAAAQEPVCSTTPKKKVRQNRNMLDGLPVVEIVIE PEGVDPDKYKRIGEERTRTLEFEPGKLYVKEIIRPKYGLKDNISLPQGHQGSVIIAPLPL LPIYKGLPGASLLTEILLQKYEYHVPFYRQVREFHHLGLKISENTLQGWFKPACELLKPL YEELKKQVLKADYIQVDETTLPVINKQNHKAVKEYLWIVRAVMDGLVFFHYDDGSRSQET AWKLLQTFKGYLQSDGYAAYNIFEGKKEVCLVGCLAHIRRHYEVAKEENESLAGYVLAQI QQLYRIEQIADQEELTYEQRMLRRQEQALPILEQLEKWMETAYPKVLPKSRMGQAIAYAY QLWPRMRNYLKDGRLKIDNNLAENAIRPIALSRKNFLFCGNHEAAQNTAIICSLLASCKA SNINPREWLTEVIALLPYYAANKEKDLKELLPHCWESGNSKEL Prediction of potential genes in microbial genomes Time: Thu May 12 04:19:29 2011 Seq name: gi|226332099|gb|ACIC01000221.1| Bacteroides sp. 1_1_6 cont1.221, whole genome shotgun sequence Length of sequence - 1694 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 178 - 209 2.7 1 1 Tu 1 . 
- CDS 238 - 1584 545 ## Cphamn1_0956 transposase IS4 family protein - Prom 1634 - 1693 5.1 Predicted protein(s) >gi|226332099|gb|ACIC01000221.1| GENE 1 238 - 1584 545 448 aa, chain - ## HITS:1 COG:no KEGG:Cphamn1_0956 NR:ns ## KEGG: Cphamn1_0956 # Name: not_defined # Def: transposase IS4 family protein # Organism: C.phaeobacteroides_BS1 # Pathway: not_defined # 26 446 33 454 468 305 38.0 3e-81 MCKDTTNNGVSQEKNLFSFQDVLPDRKIQVDFNAPDISSNGGLVLVGLMKDSIARKIARL IPDYRNQLFVQHSYEEMVCQRVGQIMCGYEDANDCDRLRHDSALKMSVGRKASDPGLCSQ PTMTRLENHIDKRTLWKIAELFVKDYISSFDKAPRKIILDVDDTNANTYGAQQLSLFNDY YDEYCYMPMVIFDGMNGKLILPLLRPGRRNKSLNIFGILRRVIEYIHKEWPHTIIELRGD SHFCSHEFMDWVKTHLYVRFITGLSGNPALMKKIDKQLRRAKGDFELHHEDVRRYYSFEY KAKSWKYRQRVIAKIEVSDKGVNVRFIVTSNRNNKPETVYRRYCKRGTMELWIKDLKYFR ADRMSCSSFRANMFRLFLYGAAYVTAYRLRSKAFSNTEVGAFTMDSFMKRIMLSAVFIVE KKTFIRFSFSPHHRHLEALSQALTRLSA Prediction of potential genes in microbial genomes Time: Thu May 12 04:19:35 2011 Seq name: gi|226332098|gb|ACIC01000222.1| Bacteroides sp. 1_1_6 cont1.222, whole genome shotgun sequence Length of sequence - 1667 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 96 - 1550 661 ## HRM2_25410 putative transposase - Prom 1591 - 1650 5.9 Predicted protein(s) >gi|226332098|gb|ACIC01000222.1| GENE 1 96 - 1550 661 484 aa, chain - ## HITS:1 COG:no KEGG:HRM2_25410 NR:ns ## KEGG: HRM2_25410 # Name: not_defined # Def: putative transposase # Organism: D.autotrophicum # Pathway: not_defined # 28 420 31 426 483 150 27.0 1e-34 MEAKIEKISELSKLLSVKNRMSDDLFHLFGKFGIGRLLSQLSLEKQDGVSASELILSLCL FRIVGDSIHSICKHKIYELSNHGKNCFYRMMIRPSMDWRRLMNHFALRYMCLLRKYGEAP RPDTTTCFIIDDTVLEKSGVRMEGVSRVFDHVKGKCVLGYKLLLCAFFDGKTTIPFDFSL HQEKGKQGDCGLTKQQRRKAYHAKRNNGSPDYERFQECKKPKMEVAVDMLRRGWKMGLHA KYVITDSWFTCEQLMACVRSIGKGAMHFVGLAKLGKTKYTVSGRKKNAAELIAAYERERG KVCRKYRCRYIRLNGNLGDTPVRIFLIKYGRNSAWNVLLTTDTTMSFVKAFEVYQIRWNI EVMNKETKQYLGLGGYQGCDFNGQIADATLCYLTYTVMTLEKRFGEYQTMGELFSDMEDD LMALTLWKRVLACIERILRALGETLGVTPQQLMAAISVNDKEMGKILVMAEALEKWDDEH KRIA Prediction of potential genes in microbial genomes Time: Thu May 12 04:19:41 2011 Seq name: gi|226332097|gb|ACIC01000223.1| Bacteroides sp. 1_1_6 cont1.223, whole genome shotgun sequence Length of sequence - 1586 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 45 - 428 207 ## BT_1869 hypothetical protein + Prom 506 - 565 1.8 2 2 Tu 1 . 
+ CDS 602 - 1321 430 ## COG2801 Transposase and inactivated derivatives Predicted protein(s) >gi|226332097|gb|ACIC01000223.1| GENE 1 45 - 428 207 127 aa, chain + ## HITS:1 COG:no KEGG:BT_1869 NR:ns ## KEGG: BT_1869 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 127 13 127 127 189 100.0 2e-47 MKYSKKKYNYYSDDERMSYIREYLSSPESKSEFCKRHGFCAKLLTYWLNKYQIEDKDMGR SSKPVNSDAIDSSISELQKELSLLRAENRKLHRALADESLRHEACEELINLAESTYHIKV RKNSDAK >gi|226332097|gb|ACIC01000223.1| GENE 2 602 - 1321 430 239 aa, chain + ## HITS:1 COG:STM0947 KEGG:ns NR:ns ## COG: STM0947 COG2801 # Protein_GI_number: 16764309 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Salmonella typhimurium LT2 # 23 230 9 212 227 138 40.0 9e-33 MPKAGMRELYACCVSKFGPKMVIGRDRCYDIFRSNGLCQRTSRKRPKTTNSNHNYYIYPD LLNVAPKFVATRLGAMVVADITYVNTGQGWAYLSLLTDASSRAIVGYALYKTLETEGPLK ALEMAISFYEKYHIDMSTLIHHSDRGVQYCSNKYVERLKEHQINISMTQCGDPLHNALAE RMNNTIKNGWLFDCDDESFEQVSKRIEDAVYVYNHVRPHQGINMRTPMEVVSETGGLTA Prediction of potential genes in microbial genomes Time: Thu May 12 04:19:44 2011 Seq name: gi|226332096|gb|ACIC01000224.1| Bacteroides sp. 1_1_6 cont1.224, whole genome shotgun sequence Length of sequence - 1465 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 25 - 1347 504 ## BT_2558 transposase - Prom 1400 - 1459 7.6 Predicted protein(s) >gi|226332096|gb|ACIC01000224.1| GENE 1 25 - 1347 504 440 aa, chain - ## HITS:1 COG:no KEGG:BT_2558 NR:ns ## KEGG: BT_2558 # Name: not_defined # Def: transposase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 440 1 440 440 895 100.0 0 MVVHQSAIVNQFSKEHKEKMGAYRFLNNSSVSSDAILSGLIHTCCKNASGRQHLLCIQDT SEINYEAHVERMKKKTASPGIVGQKQCGTFLHPVLVVDASSHIPIGFSSVKQWNRSPAAL SREERNYRYQPIEEKESYRWIESGMAASEQMPRDAVKTIIGDREADIFELFSRIPTDNVH LLIRSVHERNCRLDDPDCSVHLNTLMEQAVLRAEYSFEVLPGSGRKKRVACMELRFERVT LCAPVNGPAKGSPPVSLYCIHVKEKSSSTPVNESPIEWRLLTTHVVETVEQAIECIGWYR CRWLIEELFRVLKRKGFMIEDAQLETVSALQKLILISLQAALQVMVLKLSFDKEDEKLSS EIYFTSKEIALLHIVGKKSEGNTKIQQNPYKKESMAWAAWIIARLGAWSAYKSQSIPGYI TFKNGLDRFYTQFELYELIS Prediction of potential genes in microbial genomes Time: Thu May 12 04:19:50 2011 Seq name: gi|226332095|gb|ACIC01000225.1| Bacteroides sp. 1_1_6 cont1.225, whole genome shotgun sequence Length of sequence - 1456 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 14 - 73 1.7 1 1 Tu 1 . 
+ CDS 139 - 1263 636 ## BVU_1734 hypothetical protein Predicted protein(s) >gi|226332095|gb|ACIC01000225.1| GENE 1 139 - 1263 636 374 aa, chain + ## HITS:1 COG:no KEGG:BVU_1734 NR:ns ## KEGG: BVU_1734 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 2 351 265 620 950 120 27.0 9e-26 MNIIKQTELNPEEVNIIVGNSDDNDRQIARIGEGFKRGRIPLKGETHKKFTFCTSTAYAG CDFYSTNAATFVISDCNRPNTAVDIATELVQIAGRQRLACNPFRQFLTFVYNVNAEEVEQ EAFNEHLCRKVNVTLDEIRDNNNAGEALRAKRIKDFRRIPDNVKYQDSYTMYDEQKGEFV FNRLAYVNEQYCFDVQKFNYQNGVIVKKLLQDSSFDVSENQTYAVYQEQLKHLIKKEPFV DRMQAYCEYRAKQGLIVNLAMSTLESKYPELRYYYEALGADRIKALNYKEKKLLNEIHIM KTKNKIRHELHGTIHIGDRILTTDIQQTLRVVYDRLGIDKSPKATDLNEFFEIHPVKIPT ANGRKNGFEIRGIL Prediction of potential genes in microbial genomes Time: Thu May 12 04:19:55 2011 Seq name: gi|226332094|gb|ACIC01000226.1| Bacteroides sp. 1_1_6 cont1.226, whole genome shotgun sequence Length of sequence - 1380 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 103 - 162 4.6 1 1 Tu 1 . 
+ CDS 225 - 1220 449 ## COG3547 Transposase and inactivated derivatives + Term 1304 - 1356 5.2 Predicted protein(s) >gi|226332094|gb|ACIC01000226.1| GENE 1 225 - 1220 449 331 aa, chain + ## HITS:1 COG:NMB1750 KEGG:ns NR:ns ## COG: NMB1750 COG3547 # Protein_GI_number: 15677594 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Neisseria meningitidis MC58 # 6 326 2 310 316 90 27.0 3e-18 MNYSHFVGLDVGKKTFDASLMSADEKELSHKSFDNTPTGIQSLLDWIAGYHLSLSKLLFC AENMGSYVTELSVSSVSMGFSLALVCPLTIKKSIGLQRGKNDRIDAKRIANYAVLHYRKL ELYKLPDKDLVRLRGWIIIRDNLVKQKVSSIKLLETFSWMAKLADVTESISFLEEQLKSI KERILEVEEDMEQLIAASTSLYTNYLLLRSIKGIGIINAIVLLCVTDNFQRFDNPRKFAC YCGVAPFEHTSGISIRGKTQTSSLANKEVKVYLTRAAITAISWDPQMKAYYKRKIAEGKH KASVINAVRAKIIARSFAVIRRQTPFVTLAV Prediction of potential genes in microbial genomes Time: Thu May 12 04:19:56 2011 Seq name: gi|226332093|gb|ACIC01000227.1| Bacteroides sp. 1_1_6 cont1.227, whole genome shotgun sequence Length of sequence - 1358 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 16 - 909 288 ## COG2801 Transposase and inactivated derivatives 2 1 Op 2 . 
- CDS 942 - 1304 359 ## BT_1891 hypothetical protein Predicted protein(s) >gi|226332093|gb|ACIC01000227.1| GENE 1 16 - 909 288 297 aa, chain - ## HITS:1 COG:PA0257 KEGG:ns NR:ns ## COG: PA0257 COG2801 # Protein_GI_number: 15595454 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Pseudomonas aeruginosa # 10 270 1 256 263 149 36.0 7e-36 MVSLCKLFGVTKQAFYKYVDHSTDSSARERFVLEFVKRVRSKDPGIGGMKLWLMYRNEFG TSQAFVGRDCFCAILSKYKLTIRKRFRAPRTTDSSHHLPQYPDLTRTLLLEHPDQLWVSD ITYITIWLPDGSYVFCYLSLVTDAYTKEIIGYCVGDTLGSCYTVEALEMAVRRIAAKEIK GLIHHSDRGVQYASADYIAILRHNGILPSMTEDGNPKDNAIAERVNGIIKNELLQGMRFS SIQEVRKAVATAVHFYNNERPHMSLDMLTPVQAGEMQGPIRKRWISYREKYLNVALA >gi|226332093|gb|ACIC01000227.1| GENE 2 942 - 1304 359 120 aa, chain - ## HITS:1 COG:no KEGG:BT_1891 NR:ns ## KEGG: BT_1891 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 120 1 120 120 197 100.0 8e-50 MKKSDMTFSPYQLELLGDFYRSNFSVSRFAQEKGIARITFWRWVRIFEDSNPEISAYMKK NKSPKSSDESSSITALRLENERLRAELKDAKMRAHAFDTMIDVAEEMFNLPIRKKAGTKQ Prediction of potential genes in microbial genomes Time: Thu May 12 04:19:59 2011 Seq name: gi|226332092|gb|ACIC01000228.1| Bacteroides sp. 1_1_6 cont1.228, whole genome shotgun sequence Length of sequence - 1339 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 16 - 75 5.4 1 1 Tu 1 . + CDS 162 - 575 189 ## COG3023 Negative regulator of beta-lactamase expression + Term 604 - 668 7.2 2 2 Tu 1 . - CDS 664 - 957 227 ## COG1669 Predicted nucleotidyltransferases 3 3 Tu 1 . 
- CDS 1093 - 1305 151 ## BT_1644 putative CPS biosynthesis glycosyltransferase Predicted protein(s) >gi|226332092|gb|ACIC01000228.1| GENE 1 162 - 575 189 137 aa, chain + ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 47 128 2 96 116 79 41.0 2e-15 MRTITLIIIHCSATPEGKSLSAEACRQDHIRHRGFRDIGYHFYITRDGEIHPGRPLEKVG AHCRNHNAHSIGICYEGGLDADGRTKDTRTLEQRGSLLALIRELRKRFPKALIVGHHDLN PMKECPCFRCVEEYKEL >gi|226332092|gb|ACIC01000228.1| GENE 2 664 - 957 227 97 aa, chain - ## HITS:1 COG:MA0100 KEGG:ns NR:ns ## COG: MA0100 COG1669 # Protein_GI_number: 20088999 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferases # Organism: Methanosarcina acetivorans str.C2A # 4 76 13 86 106 58 39.0 3e-09 MLSTIEYLSLLRKYMRENAVKYGISRMGIFGSVARGEQHDGSDVDICVEIDRPSIFTLVH IKEELEKLFKCPVDVVRLRNNMDELLRNCINKDGIYV >gi|226332092|gb|ACIC01000228.1| GENE 3 1093 - 1305 151 70 aa, chain - ## HITS:1 COG:no KEGG:BT_1644 NR:ns ## KEGG: BT_1644 # Name: not_defined # Def: putative CPS biosynthesis glycosyltransferase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 70 139 208 208 143 100.0 2e-33 MEHDARYVFLYQIRPGVTSYATLYNGYTDTMDKMLRRLRYDLFYLQHRSWWFDFKILVKT FINICFGKKF Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:01 2011 Seq name: gi|226332091|gb|ACIC01000229.1| Bacteroides sp. 1_1_6 cont1.229, whole genome shotgun sequence Length of sequence - 1306 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 878 733 ## BT_2158 putative dehydrogenase and related proteins 2 1 Op 2 . 
- CDS 944 - 1306 374 ## BT_2158 putative dehydrogenase and related proteins Predicted protein(s) >gi|226332091|gb|ACIC01000229.1| GENE 1 2 - 878 733 292 aa, chain - ## HITS:1 COG:no KEGG:BT_2158 NR:ns ## KEGG: BT_2158 # Name: not_defined # Def: putative dehydrogenase and related proteins # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 292 158 449 491 629 100.0 1e-179 MGNQGASGEGTDLVCEWIWNGEIGEVTKVECATDRPIWPQGLNAPEKADRIPNTLNWDLF TGPAKLNPYNNVYHPWNWRGWWDYGTGALGDMACHILHQPFRALKLGYPTKVEGSSTLLL SACAPQAQHVKMIFPARDNMPKVALPEVEVHWYDGGMMPERPKGFPEGKQLMGPGGGLTI FHGTKDTLICGCYGEQPFLLSGRVPNAPKVCRRVTCSHEMDWVRACKEDKSNRVMPKADF SESGPMNEMVVMGVLAIRLQGLNKTLEWDGANMCFTNIGDNETLRTCIKDGF >gi|226332091|gb|ACIC01000229.1| GENE 2 944 - 1306 374 120 aa, chain - ## HITS:1 COG:no KEGG:BT_2158 NR:ns ## KEGG: BT_2158 # Name: not_defined # Def: putative dehydrogenase and related proteins # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 112 15 126 491 233 100.0 2e-60 AALAGITIAPSSILGMSHGHVSPTDKLNLAAVGIGGMGHANINNVKGTENIVALCDVDWK YAKGVFDEFPNAKKYWDYRKMYDEMGKSIDGVIIATADHTHAIITADAMTMGNTYTARNR Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:09 2011 Seq name: gi|226332090|gb|ACIC01000230.1| Bacteroides sp. 1_1_6 cont1.230, whole genome shotgun sequence Length of sequence - 1195 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 684 474 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases 2 1 Op 2 . - CDS 761 - 991 128 ## BT_0598 putative nucleoside-diphosphate sugar epimerase/dehydrase 3 1 Op 3 . 
- CDS 1044 - 1193 83 ## BT_0377 hypothetical protein Predicted protein(s) >gi|226332090|gb|ACIC01000230.1| GENE 1 3 - 684 474 227 aa, chain - ## HITS:1 COG:RSp1004 KEGG:ns NR:ns ## COG: RSp1004 COG1086 # Protein_GI_number: 17549225 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Ralstonia solanacearum # 6 226 109 318 665 84 31.0 2e-16 MAIWLFLPQTGLHNSQKIVFVLFDGMLTFIAMVGFRIQLILVYELLLNMLNKKNMHILIY GIDDKSVALKVRLMNSSHYKVVGFCIYGTGDSIRRVADLPVYSFKDEECFNKLIHKKCIG GILFARYENTREEEDRLLQYCKRSGLKTLIAPSISEADENGSFHQWVRPIKIEDLLGRSE IHINMEEVMTEFCGKVVLVTGAAGSIGSELCRQLAQMNIKKLIMFDS >gi|226332090|gb|ACIC01000230.1| GENE 2 761 - 991 128 76 aa, chain - ## HITS:1 COG:no KEGG:BT_0598 NR:ns ## KEGG: BT_0598 # Name: not_defined # Def: putative nucleoside-diphosphate sugar epimerase/dehydrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 72 1 72 641 150 100.0 2e-35 MKHAFDRFVQYLQKNYFSYWIVLGIDTFIALICTWVSFIGIHYITETSKEITSLFHILAI SVISSVLGSFLFPYLP >gi|226332090|gb|ACIC01000230.1| GENE 3 1044 - 1193 83 49 aa, chain - ## HITS:1 COG:no KEGG:BT_0377 NR:ns ## KEGG: BT_0377 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 49 71 119 119 88 100.0 6e-17 MGYSVSMYANSEDEAKKKTVLRRSQMILKNQLPSPLKIQLHTIYDKLLS Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:13 2011 Seq name: gi|226332089|gb|ACIC01000231.1| Bacteroides sp. 1_1_6 cont1.231, whole genome shotgun sequence Length of sequence - 1154 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 35 - 94 2.6 1 1 Tu 1 . 
+ CDS 168 - 1148 362 ## PROTEIN SUPPORTED gi|148987750|ref|ZP_01819213.1| ribose-phosphate pyrophosphokinase Predicted protein(s) >gi|226332089|gb|ACIC01000231.1| GENE 1 168 - 1148 362 326 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148987750|ref|ZP_01819213.1| ribose-phosphate pyrophosphokinase [Streptococcus pneumoniae SP6-BS73] # 3 311 10 312 317 144 32 4e-35 MSKHITEEQRYAISMMLQIPMSKKAIAEAIGVDKSTVYREIKRNCDARSGSYSMELAQRK ADRRKQQKHRKEVLTPAMRKRIIKLLKKGFSPEQIVGRSRLEGIAMVSHETIYRWIWEDK RRGGKLHKYLRRQGRRYAKRGSKNAGRGFIPGRVDIDERPEIVELKERFGDLEIDTIIGK NHKGAILTINDRATSRVWIRKLSGKEAIPVAKIAVWALRKVKNLIHTITADNGKEFAKHE EIAQKLEIKFYFCKPYHSWERGANENTNGLIRQYIPKGKDFSEVTNKQIKWIENKLNNRP RKRLGYLTPNEKFKQIINQNSVAFAS Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:13 2011 Seq name: gi|226332088|gb|ACIC01000232.1| Bacteroides sp. 1_1_6 cont1.232, whole genome shotgun sequence Length of sequence - 1136 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 1.8 1 1 Tu 1 . + CDS 118 - 543 170 ## COG3291 FOG: PKD repeat 2 2 Tu 1 . 
- CDS 560 - 976 95 ## BT_4441 cell surface protein - Prom 1022 - 1081 2.6 Predicted protein(s) >gi|226332088|gb|ACIC01000232.1| GENE 1 118 - 543 170 141 aa, chain + ## HITS:1 COG:MA4285_2 KEGG:ns NR:ns ## COG: MA4285_2 COG3291 # Protein_GI_number: 20093074 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 6 140 742 874 1325 67 35.0 1e-11 MPTAKRIETSAFGGCSSLKKVLFPDSVYFLGTNIFYGCTSLESVNIPKGFADSKFPSNVF RDCISLTSLIEVPETVTIIGMATFNGCTSLAGVKLKGNVPPSLEYSVFGNSTFPIYVPEA AVNAYKSASGWTSLASRIMGY >gi|226332088|gb|ACIC01000232.1| GENE 2 560 - 976 95 138 aa, chain - ## HITS:1 COG:no KEGG:BT_4441 NR:ns ## KEGG: BT_4441 # Name: not_defined # Def: cell surface protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 138 104 241 241 285 99.0 3e-76 MANTKISKILIPKSVGLCDVATFKNCVNLTSIVFEDGGDVPLYVGGDLWLENTQVTILVL PFKTYRIRGYWRRGSNLNTLYVKSTIPPILEHGWGDNPDTCDLYVPIGCKEVYASATNWG SFRTITEYDFDLNPNNVH Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:17 2011 Seq name: gi|226332087|gb|ACIC01000233.1| Bacteroides sp. 1_1_6 cont1.233, whole genome shotgun sequence Length of sequence - 1125 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 22 - 81 6.9 1 1 Tu 1 . 
+ CDS 182 - 1124 699 ## COG0739 Membrane proteins related to metalloendopeptidases Predicted protein(s) >gi|226332087|gb|ACIC01000233.1| GENE 1 182 - 1124 699 314 aa, chain + ## HITS:1 COG:TM1660 KEGG:ns NR:ns ## COG: TM1660 COG0739 # Protein_GI_number: 15644408 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Thermotoga maritima # 25 272 19 256 323 77 27.0 5e-14 MKRYIAALLLACCVEGYAQEKKQAAFVPPFDFPLTLSGNFGEIRSNHFHGGLDFKTGGTI GKPVRALADGYISRIRVTNGSGYVLDVCYHNGYSTINRHLSAFLSPIAERVKKLQYENEN WEVEIIPEPDEYPVKAGQRIALSGNTGYSFGPHLHLDVFETETGDYIDPMPFFKKNLKDT RAPKADGIMLFPQLGKGVVSGSQENKTILPNSEHPVEAWGVIGTGIKAYDYMDGVNNHYG VYSVVLTVDGTEVFRSTVDRFSQEENRMINSWTYGQYMKSFIDPGNTLRLLKASNDNRGL VTIDEERDYQFQYT Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:17 2011 Seq name: gi|226332086|gb|ACIC01000234.1| Bacteroides sp. 1_1_6 cont1.234, whole genome shotgun sequence Length of sequence - 1020 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 356 - 415 5.8 1 1 Tu 1 . + CDS 603 - 1020 204 ## COG0582 Integrase Predicted protein(s) >gi|226332086|gb|ACIC01000234.1| GENE 1 603 - 1020 204 139 aa, chain + ## HITS:1 COG:mll6228 KEGG:ns NR:ns ## COG: mll6228 COG0582 # Protein_GI_number: 13475204 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Mesorhizobium loti # 1 139 107 244 342 92 37.0 2e-19 MVEWQSILSIKAKKTEISPPNYLSMEGIKLLLAQPDTTSWKGRRHLALLSLMYDTGARVQ EIADLTVDCVRIDTTPYTIRLTGKGRKTRIVPLAEAQVDILRSYMEENNLNDPNMMKKPL FFNGRHEKLTREGITYILK Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:18 2011 Seq name: gi|226332085|gb|ACIC01000235.1| Bacteroides sp. 
1_1_6 cont1.235, whole genome shotgun sequence Length of sequence - 1002 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 455 312 ## COG2942 N-acyl-D-glucosamine 2-epimerase 2 1 Op 2 . - CDS 529 - 975 426 ## COG2942 N-acyl-D-glucosamine 2-epimerase Predicted protein(s) >gi|226332085|gb|ACIC01000235.1| GENE 1 2 - 455 312 151 aa, chain - ## HITS:1 COG:all3695 KEGG:ns NR:ns ## COG: all3695 COG2942 # Protein_GI_number: 17231187 # Func_class: G Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Nostoc sp. PCC 7120 # 1 148 175 321 388 170 54.0 8e-43 MILCNLALEIEHLLDPDYLKLTMETCIHEVMNVFYRPELGGIIVENVDMDGNLVDCFEGR QVTPGHAIEAMWFIMDLGKRLDRPELIQQAMLTTLTMLDYGWDKQCGGIYYFMDRNGCPP QQLEWDQKLWWVHIESLISLLKGYQLTGDKR >gi|226332085|gb|ACIC01000235.1| GENE 2 529 - 975 426 148 aa, chain - ## HITS:1 COG:all3695 KEGG:ns NR:ns ## COG: all3695 COG2942 # Protein_GI_number: 17231187 # Func_class: G Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Nostoc sp. PCC 7120 # 2 145 4 146 388 170 56.0 1e-42 MDFKTLANQYRNELLDNVLPFWLEHSQDLEFGGYFTCLDRKGGVFDTDKFIWLQGREVWM FSMLYNKVEKRQEWLDCAVLGGEFLKKYGHNGDYNWYFSLNRSGRPLVEPYNIFSYTFAT MAFGQLSLATGSQEYADIAKKTFRNHTL Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:19 2011 Seq name: gi|226332084|gb|ACIC01000236.1| Bacteroides sp. 1_1_6 cont1.236, whole genome shotgun sequence Length of sequence - 962 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 380 298 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 2 1 Op 2 . 
- CDS 409 - 951 162 ## COG0463 Glycosyltransferases involved in cell wall biogenesis Predicted protein(s) >gi|226332084|gb|ACIC01000236.1| GENE 1 2 - 380 298 126 aa, chain - ## HITS:1 COG:BH3716 KEGG:ns NR:ns ## COG: BH3716 COG2148 # Protein_GI_number: 15616278 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Bacillus halodurans # 19 126 5 112 207 106 45.0 9e-24 MEQQTTFIPDGMNAFERNVKRIGDCIIAGILLIIFSPLFLICYIAVKREDGGPAIFKQER IGRFGRPFYIYKFRSMRLDAEKMGPALYRGGKDKRLTKVGKFLREHHLDELPQLWNVFCG QMAFIG >gi|226332084|gb|ACIC01000236.1| GENE 2 409 - 951 162 180 aa, chain - ## HITS:1 COG:HI1695 KEGG:ns NR:ns ## COG: HI1695 COG0463 # Protein_GI_number: 16273582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Haemophilus influenzae # 1 180 90 267 267 128 37.0 5e-30 MDTDDIAKPDRFEKQIRFFQEHPELDVVGAWIDEFEETTSNIISTRKLPEVHDDICQFAK KRNPENHPVIMFRKQAVLAAGGYQHFPLFEDYYLWIRMLQNGAKFYNIQESLLYFRFSPA MFKRRGGLKYVTTELRFQNQLRNLGFITSSEYLYNVFIRVITRMMPNTLRAILYKKALRK Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:20 2011 Seq name: gi|226332083|gb|ACIC01000237.1| Bacteroides sp. 1_1_6 cont1.237, whole genome shotgun sequence Length of sequence - 952 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 697 146 ## COG0582 Integrase 2 1 Op 2 . 
- CDS 690 - 950 90 ## GFO_1229 phage integrase family protein Predicted protein(s) >gi|226332083|gb|ACIC01000237.1| GENE 1 1 - 697 146 232 aa, chain - ## HITS:1 COG:mlr9323 KEGG:ns NR:ns ## COG: mlr9323 COG0582 # Protein_GI_number: 13488300 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Mesorhizobium loti # 53 224 47 211 316 74 31.0 1e-13 MAEQFNYSSVLSPYIKRMLEIRKSMGIVDSRVRWILKEFDDFANSIGLQEPHITEEFVKR WHKSRISDKEITIYGKYLVLRQLTSLMCRNGCVCYIPIIPKQPKSEFTPYIYTHDQISQL FTAADSSCLFNNCMKTAIISMPVILRLLYSTGMRVSEALYMRNEDVNLDSGYIHLRKTKN RCERLVPIGESMVIVLKQYIEYRNRMPIEKISHPNHLFFTKLDGTSFRVCTL >gi|226332083|gb|ACIC01000237.1| GENE 2 690 - 950 90 86 aa, chain - ## HITS:1 COG:no KEGG:GFO_1229 NR:ns ## KEGG: GFO_1229 # Name: not_defined # Def: phage integrase family protein # Organism: G.forsetii # Pathway: not_defined # 5 85 344 424 425 107 55.0 1e-22 NKIIRLSGVNIDKKRHGPHSLRHSLASNMLENGATMPIISEVLGHRNTETTMTYLKINLV ALRKCVLPVPPIPDSFYTQKGGAFYG Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:22 2011 Seq name: gi|226332082|gb|ACIC01000238.1| Bacteroides sp. 1_1_6 cont1.238, whole genome shotgun sequence Length of sequence - 937 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 935 351 ## COG3344 Retron-type reverse transcriptase Predicted protein(s) >gi|226332082|gb|ACIC01000238.1| GENE 1 2 - 935 351 311 aa, chain + ## HITS:1 COG:Q0050 KEGG:ns NR:ns ## COG: Q0050 COG3344 # Protein_GI_number: 6226520 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Saccharomyces cerevisiae # 15 247 466 708 834 97 32.0 4e-20 RKGDARYKLYEQRRYRLAKKLKNEKDVKVRKQMTAEIKRLREERNNYPARNEMDSSIKRL KYVRYADDFLIGITGNLEDCKTVKEDIKNYLNEALKLELSDEKTLITNAQKPAKFLGYDV FIRRCNDLRKDKYGKTVRAFGHKPVLYLNFETMRKKLFDYKAARIAVVNGKEVWKSIVRT YMLNLDDLEIVSQFNAEIRGFYNYYSIANNSYVINSFYHIMSYSMYKVFANKYKSSVKKI LLKYKKNKVFQVAYENSKGKTLYQSFYHDGFKRKKVAGNIYCDTIPRTVSITGGRNSLME RLKLQVCELCG Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:23 2011 Seq name: gi|226332081|gb|ACIC01000239.1| Bacteroides sp. 1_1_6 cont1.239, whole genome shotgun sequence Length of sequence - 937 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 935 395 ## COG3344 Retron-type reverse transcriptase Predicted protein(s) >gi|226332081|gb|ACIC01000239.1| GENE 1 2 - 935 395 311 aa, chain + ## HITS:1 COG:Q0050 KEGG:ns NR:ns ## COG: Q0050 COG3344 # Protein_GI_number: 6226520 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Saccharomyces cerevisiae # 39 247 496 708 834 98 35.0 2e-20 RKGDARYKLYEQRRYRLAKKLKNEKDVKVRELMTAEIKRLREERNNYPARNEMDSSIKRL KYVRYADDFLIGITGSLEDCKTVKEDIKNYLNEALKLELSDEKTLITNAQKPAKFLGYDV FIRRCNDLRKDKFGKTVRAFGHKPVLYLNFETMRKKLFDYKAARIVVVNGKEVWKSIVRT YMLNLDDLEIVSQFNAEIRGFYNYYSIANNSYVINSFYHIMSYSMYKVFASKYKSSVKKI LLKYKKNKVFQVAYENSKGKTLYQAFYHDGFKRKKVAGNIYCDTIPRTISITGGRNSLME RLKLQVCELCG Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:24 2011 Seq name: gi|226332080|gb|ACIC01000240.1| Bacteroides sp. 
1_1_6 cont1.240, whole genome shotgun sequence Length of sequence - 934 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 177 176 ## BT_1704 hypothetical protein - Prom 385 - 444 6.5 + Prom 172 - 231 5.7 2 2 Tu 1 . + CDS 397 - 897 667 ## BT_1705 hypothetical protein Predicted protein(s) >gi|226332080|gb|ACIC01000240.1| GENE 1 3 - 177 176 58 aa, chain - ## HITS:1 COG:no KEGG:BT_1704 NR:ns ## KEGG: BT_1704 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 58 1 58 72 116 100.0 3e-25 MIEFKIRAYGRMELAQLYSPELTGIAAYRKMNKWIVRCPGLQERLSDLGYQPQHRSYT >gi|226332080|gb|ACIC01000240.1| GENE 2 397 - 897 667 166 aa, chain + ## HITS:1 COG:no KEGG:BT_1705 NR:ns ## KEGG: BT_1705 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 166 1 166 166 289 100.0 3e-77 MIRYKIYQNQQKKGLNAGKWFARAVSDETFDLAKLAEHMSKHNSPYSSGVIKGVLTDMVD CIKELLLDGKSVKIDDLAIFGVGIRSKAAETLEEFSLEKNITGMRLKARATGNLSTTNLK LDSQLKQQAEYQKPTTPGGGGSGSGNTPDPKPNPDEGDEEAPDPTV Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:28 2011 Seq name: gi|226332079|gb|ACIC01000241.1| Bacteroides sp. 1_1_6 cont1.241, whole genome shotgun sequence Length of sequence - 915 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 499 - 558 2.4 1 1 Tu 1 . 
+ CDS 580 - 913 234 ## COG3344 Retron-type reverse transcriptase Predicted protein(s) >gi|226332079|gb|ACIC01000241.1| GENE 1 580 - 913 234 111 aa, chain + ## HITS:1 COG:YPMT1.75c KEGG:ns NR:ns ## COG: YPMT1.75c COG3344 # Protein_GI_number: 16082867 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Yersinia pestis # 21 109 21 112 156 67 39.0 4e-12 MRNPENVLNSLSKHSGNLNYKFERLYRVLFNVEMFYVAYQNIYSKTGNMTAGVDGKTIDG MSIDRVEQLIGSLKNETYQPNPSKRTYIPKKNGKKRPLGIPSFNDKMVQEV Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:29 2011 Seq name: gi|226332078|gb|ACIC01000242.1| Bacteroides sp. 1_1_6 cont1.242, whole genome shotgun sequence Length of sequence - 900 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 357 186 ## BT_0948 hypothetical protein - Prom 511 - 570 6.2 Predicted protein(s) >gi|226332078|gb|ACIC01000242.1| GENE 1 3 - 357 186 118 aa, chain - ## HITS:1 COG:no KEGG:BT_0948 NR:ns ## KEGG: BT_0948 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 118 1 118 338 241 100.0 5e-63 MESTLQIMPVQRTSRNFGEYAEEAVIIEEPIIKQKRPLFIEANTIEASLEHLRNDCIIPV FAKDNEATLSHVAFIEAVQDATNTFFSGEQIESPDIRVSHVIKGRIPEAIHKPANQLL Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:32 2011 Seq name: gi|226332077|gb|ACIC01000243.1| Bacteroides sp. 1_1_6 cont1.243, whole genome shotgun sequence Length of sequence - 898 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 355 200 ## BT_0948 hypothetical protein - Prom 509 - 568 6.2 Predicted protein(s) >gi|226332077|gb|ACIC01000243.1| GENE 1 1 - 355 200 118 aa, chain - ## HITS:1 COG:no KEGG:BT_0948 NR:ns ## KEGG: BT_0948 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 118 1 118 338 241 100.0 5e-63 MESTLQIMPVQRTSRNFGEYAEEAVIIEEPIIKQKRPLFIEANTIEASLEHLRNDCIIPV FAKDNEATLSHVAFIEAVQDATNTFFSGEQIESPDIRVSHVIKGRIPEAIHKPANQLL Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:34 2011 Seq name: gi|226332076|gb|ACIC01000244.1| Bacteroides sp. 1_1_6 cont1.244, whole genome shotgun sequence Length of sequence - 896 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 53 - 157 57 ## 2 1 Op 2 . + CDS 162 - 575 180 ## COG3023 Negative regulator of beta-lactamase expression + Term 585 - 631 4.4 - Term 578 - 614 7.3 3 2 Tu 1 . 
- CDS 650 - 862 161 ## BT_1644 putative CPS biosynthesis glycosyltransferase Predicted protein(s) >gi|226332076|gb|ACIC01000244.1| GENE 1 53 - 157 57 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSNSSSRSVWSFIIKVIITVATAVGGLIGVQSCM >gi|226332076|gb|ACIC01000244.1| GENE 2 162 - 575 180 137 aa, chain + ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 47 128 2 96 116 77 40.0 5e-15 MRTITLIIIHCSATPEGRALSAEACRQDHIRHRGFRDIGYHFYITRDGEIHPGRPLEKVG AHCRNHNAHSIGICYEGGLNAEGQAKDTRTLAQRGALLALLRELKKKFPEALIIGHHDLN PIKECPCYPCVEEYREL >gi|226332076|gb|ACIC01000244.1| GENE 3 650 - 862 161 70 aa, chain - ## HITS:1 COG:no KEGG:BT_1644 NR:ns ## KEGG: BT_1644 # Name: not_defined # Def: putative CPS biosynthesis glycosyltransferase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 70 139 208 208 140 98.0 1e-32 MEHDPRYVFLYQIRPGVTSYATLYNGYTDTMDKMLRRLRYDLFYLQHRSWWFDFKILVKT FINICFGKKF Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:40 2011 Seq name: gi|226332075|gb|ACIC01000245.1| Bacteroides sp. 1_1_6 cont1.245, whole genome shotgun sequence Length of sequence - 887 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 52 - 111 2.4 1 1 Tu 1 . + CDS 132 - 885 662 ## COG2942 N-acyl-D-glucosamine 2-epimerase Predicted protein(s) >gi|226332075|gb|ACIC01000245.1| GENE 1 132 - 885 662 251 aa, chain + ## HITS:1 COG:all3695 KEGG:ns NR:ns ## COG: all3695 COG2942 # Protein_GI_number: 17231187 # Func_class: G Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Nostoc sp. 
PCC 7120 # 2 251 4 251 388 278 54.0 5e-75 MDFKKLANQYRDELLDNVLPFWLEHSQDIEFGGYFSCLDREGKVFDTDKFIWLQGREVWM FSMLYNKVEKRQEWLDCAVQGGEFLKKYGHDGNYNWYFSLDRSGRPLVEPYNIFSYTFAT MAFGQLSLATGNQEYADIAKKTFKIILSKVDNPKSKWNKLHPGTRNLKNFALPMILCNLA LEIEHLLDPGYLEQTMETCIHEVMDVFYRPELGGIIVENVDMDGNLVDCFEGRQVTPGHA IEAMWFIMDLG Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:41 2011 Seq name: gi|226332074|gb|ACIC01000246.1| Bacteroides sp. 1_1_6 cont1.246, whole genome shotgun sequence Length of sequence - 853 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 768 538 ## BVU_2295 hypothetical protein Predicted protein(s) >gi|226332074|gb|ACIC01000246.1| GENE 1 3 - 768 538 255 aa, chain - ## HITS:1 COG:no KEGG:BVU_2295 NR:ns ## KEGG: BVU_2295 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 12 238 42 265 322 68 27.0 2e-10 MKLLNLGSAEAPKWYFVTTVNFNGEEQEIILAFTDYSIKRKNFEKNKTYESIDKSKLLPT LVCLCDAKPFYTENVPLYDYVGSKIENLPDNAPAVMVYLDKADNVNLLGLTDEPLQAELV ECDSVADAYRRVATTAYFSQCPSKDEWIGYAGIVTGDELLLNIRSFGEKYSISGTAVQGY FGISTTVSLLQSKALAMSSSLFKEEYRTYAQAEQLMKATVQAFGVKAAKQTRYIKAINYG ISEYGFDTVCDVLNS Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:45 2011 Seq name: gi|226332073|gb|ACIC01000247.1| Bacteroides sp. 1_1_6 cont1.247, whole genome shotgun sequence Length of sequence - 834 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 40 - 411 321 ## COG3436 Transposase and inactivated derivatives 2 1 Op 2 . 
- CDS 395 - 757 103 ## BT_2350 hypothetical protein Predicted protein(s) >gi|226332073|gb|ACIC01000247.1| GENE 1 40 - 411 321 123 aa, chain - ## HITS:1 COG:Z2127 KEGG:ns NR:ns ## COG: Z2127 COG3436 # Protein_GI_number: 15801566 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 EDL933 # 1 110 72 179 186 62 38.0 2e-10 MFCLNDTMRYFLCPGKTDMRKGMNSLCGVVHDKMGYDVRLGDVFIFINRQRTTMKLLHAE DGGLVLYIKRLEEGTFRLPSYDKESKSYPMQWRDLVLMVEGINDEPSKRLKRLKALRKSD MQY >gi|226332073|gb|ACIC01000247.1| GENE 2 395 - 757 103 120 aa, chain - ## HITS:1 COG:no KEGG:BT_2350 NR:ns ## KEGG: BT_2350 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 120 4 123 123 231 100.0 7e-60 MSKEEFIEILSRQQRSGLTIKDFCINEAYTESSFYYWKGKFGLSRRYHMDRHSSSLEEFA PVSLTSSPASHSACDSGAIQTGEIRIEFPGGIIAHFSGMAESQAAMQLLTQLCNRHVLPE Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:48 2011 Seq name: gi|226332072|gb|ACIC01000248.1| Bacteroides sp. 1_1_6 cont1.248, whole genome shotgun sequence Length of sequence - 804 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 198 70 ## BT_0948 hypothetical protein + Term 221 - 259 3.0 2 1 Op 2 . + CDS 283 - 408 72 ## + Term 427 - 468 7.1 + Prom 431 - 490 8.1 3 2 Tu 1 . 
+ CDS 590 - 803 195 ## BT_0947 integrase Predicted protein(s) >gi|226332072|gb|ACIC01000248.1| GENE 1 1 - 198 70 65 aa, chain + ## HITS:1 COG:no KEGG:BT_0948 NR:ns ## KEGG: BT_0948 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 65 274 338 338 139 96.0 4e-32 AYINDENFGSLGNDLSMWKFYNLLTGANKSSYIDSFLDRAYNATELATGICSALHGDDKY QWFLS >gi|226332072|gb|ACIC01000248.1| GENE 2 283 - 408 72 41 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLQALLVVISWTILLLVMCAISPVVFFLMIIWTIYKIITMK >gi|226332072|gb|ACIC01000248.1| GENE 3 590 - 803 195 71 aa, chain + ## HITS:1 COG:no KEGG:BT_0947 NR:ns ## KEGG: BT_0947 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 71 1 71 195 133 95.0 2e-30 MSLKYSSTTADYLQWNEAMNLVRKLARDSNYKMSLLIALGCFTGLRISDILALRWNQILD AEEFTITEIKT Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:55 2011 Seq name: gi|226332071|gb|ACIC01000249.1| Bacteroides sp. 1_1_6 cont1.249, whole genome shotgun sequence Length of sequence - 766 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 194 - 253 6.5 1 1 Tu 1 . + CDS 312 - 765 149 ## BT_4286 hypothetical protein Predicted protein(s) >gi|226332071|gb|ACIC01000249.1| GENE 1 312 - 765 149 151 aa, chain + ## HITS:1 COG:no KEGG:BT_4286 NR:ns ## KEGG: BT_4286 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 21 151 21 156 259 258 93.0 5e-68 MRVQQIIAVSSFLLLSILKVYSQEDKSLKELFLSDGYTLDRNQVRFNQSSDFVFHYIPGG DRLTCITSKQVRDYNRHYNCILSKDSNFIAYVSVPPIFCGDKDSIYADFSFKTGFMIEVN TYHLEIIKGDFFVNTGERVTSLCDLPLTYKS Prediction of potential genes in microbial genomes Time: Thu May 12 04:20:58 2011 Seq name: gi|226332070|gb|ACIC01000250.1| Bacteroides sp. 
1_1_6 cont1.250, whole genome shotgun sequence Length of sequence - 761 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 387 - 446 8.1 1 1 Tu 1 . + CDS 546 - 759 160 ## BT_0947 integrase Predicted protein(s) >gi|226332070|gb|ACIC01000250.1| GENE 1 546 - 759 160 71 aa, chain + ## HITS:1 COG:no KEGG:BT_0947 NR:ns ## KEGG: BT_0947 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 71 1 71 195 136 100.0 2e-31 MSLKYSSTTADYLQWSEAMNLIRKLARDSNYKMSLLIALGCFTGLRISDILALRWNQILD AEEFTIIEIKT Prediction of potential genes in microbial genomes Time: Thu May 12 04:21:00 2011 Seq name: gi|226332069|gb|ACIC01000251.1| Bacteroides sp. 1_1_6 cont1.251, whole genome shotgun sequence Length of sequence - 751 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 749 383 ## BT_1513 integrase Predicted protein(s) >gi|226332069|gb|ACIC01000251.1| GENE 1 2 - 749 383 249 aa, chain - ## HITS:1 COG:no KEGG:BT_1513 NR:ns ## KEGG: BT_1513 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 249 51 299 305 468 100.0 1e-130 IHLRGRGCSWNTVSTYLRTFRAVYNRAVDCRGAVYVPHLFRSVYTGTRADRKRALDTEDM QKVFTKLPQSSVVTSDMRRTQELFVLMFLLRGLPFVDLAYLRKSDLHDNVITYRRRKTGR PLSVTLTPEAMAILKRYMNRDSSSPYLFPLLNSREGTKEAYHEYQLALRNFNRQLMLLGE MLGLGDKLSSYTARHTWATTAYYCEIHPGVISEAMGHSSITVTETYLKPFRNKKIDEANK KVVDFIKRS Prediction of potential genes in microbial genomes Time: Thu May 12 04:21:04 2011 Seq name: gi|226332068|gb|ACIC01000252.1| Bacteroides sp. 
1_1_6 cont1.252, whole genome shotgun sequence Length of sequence - 719 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 112 - 152 0.9 1 1 Tu 1 . - CDS 365 - 718 379 ## gi|237718418|ref|ZP_04548899.1| conserved hypothetical protein Predicted protein(s) >gi|226332068|gb|ACIC01000252.1| GENE 1 365 - 718 379 117 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|237718418|ref|ZP_04548899.1| ## NR: gi|237718418|ref|ZP_04548899.1| conserved hypothetical protein [Bacteroides sp. 2_2_4] # 1 117 486 602 602 214 99.0 2e-54 INKASDMFERMGYVKNGKLYDYNSEISGIVKEIEAYMQSPASGLEVAHKYVERLSVGNLS NLRDELKLLIVKCIKIEVNVETVMQIVDKAFKMGILDNDSLNDMKEIVAGILKAKDK Prediction of potential genes in microbial genomes Time: Thu May 12 04:21:11 2011 Seq name: gi|226332067|gb|ACIC01000253.1| Bacteroides sp. 1_1_6 cont1.253, whole genome shotgun sequence Length of sequence - 682 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 613 430 ## BT_0568 hypothetical protein Predicted protein(s) >gi|226332067|gb|ACIC01000253.1| GENE 1 1 - 613 430 204 aa, chain - ## HITS:1 COG:no KEGG:BT_0568 NR:ns ## KEGG: BT_0568 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 204 1 204 222 375 100.0 1e-103 MKKLITLFFISILSLQVYCSAQEILKVDYPAIKEYVTNHNTEFQKLMQRFEENDTLLTRQ DHAMLYYGYSFTPAYKGSMDDFQDFRKLIKEEKYEDAYNIGKELLKKNPVSLQLLYNMYG IAGLLQKDIREIKHYSKRYAALLTMIALTGDGTSEETAFKVICVNDEYQLLNMLFKMENM KGQSLVNKCDLIEFDKCQYYEGNQ Prediction of potential genes in microbial genomes Time: Thu May 12 04:21:14 2011 Seq name: gi|226332066|gb|ACIC01000254.1| Bacteroides sp. 
1_1_6 cont1.254, whole genome shotgun sequence Length of sequence - 640 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 639 116 ## BT_4274 hypothetical protein Predicted protein(s) >gi|226332066|gb|ACIC01000254.1| GENE 1 3 - 639 116 212 aa, chain - ## HITS:1 COG:no KEGG:BT_4274 NR:ns ## KEGG: BT_4274 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 212 19 229 263 403 98.0 1e-111 SNAYPQKDKSLQNVFIPEGYALDRNQVQYNQPKDFISYIDSLGKNKVYLGCFLAKESKRL PSYYRKSFLSKDSNFIILTAIPVYFVRRNDSVYADFDLLPEMKVEMNTYHLTHIKADFLK NTGKRVAFLSELPITYKSSKYARESFNADTVITYPLKMWEKYENKYTHCLVILIQKRYRG CIPLYCLYTDEGAKKLNHYIKSLKKVFWYRNP Prediction of potential genes in microbial genomes Time: Thu May 12 04:21:18 2011 Seq name: gi|226332065|gb|ACIC01000255.1| Bacteroides sp. 1_1_6 cont1.255, whole genome shotgun sequence Length of sequence - 622 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 621 436 ## BT_1509 integrase Predicted protein(s) >gi|226332065|gb|ACIC01000255.1| GENE 1 3 - 621 436 206 aa, chain + ## HITS:1 COG:no KEGG:BT_1509 NR:ns ## KEGG: BT_1509 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 206 51 256 305 375 100.0 1e-103 IHLRGRGCSWNTVSTYLRTFRAVYNRAVDCRGAVYVPHLFRSVYTGTRADRKRALCDEDM QKVFAKLPSSPAVTPAMRRTQELFVLMFLLRGLPFVDLAYLRKSDLHDNVITYRRRKTGR PLSVTLTREAMVLLKRYMNRDSSSPYLFSLLESREGTKEAYREYQLALRSFNQQLLLLGQ LLGLGDRLSSYTARHTWATTAYYCEI Prediction of potential genes in microbial genomes Time: Thu May 12 04:21:21 2011 Seq name: gi|226332064|gb|ACIC01000256.1| Bacteroides sp. 
1_1_6 cont1.256, whole genome shotgun sequence Length of sequence - 603 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 228 135 ## BT_1357 hypothetical protein 2 1 Op 2 . + CDS 286 - 601 308 ## BT_1356 putative capsule polysaccharide export protein Predicted protein(s) >gi|226332064|gb|ACIC01000256.1| GENE 1 1 - 228 135 75 aa, chain + ## HITS:1 COG:no KEGG:BT_1357 NR:ns ## KEGG: BT_1357 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 75 48 122 122 140 98.0 2e-32 DLYSQRGKTPEQDATLCLAILQGYNVSMYANPEDEDRKRSVLQRSLTLLDVLPPSLLKQQ LSAVCHGMQELCETN >gi|226332064|gb|ACIC01000256.1| GENE 2 286 - 601 308 105 aa, chain + ## HITS:1 COG:no KEGG:BT_1356 NR:ns ## KEGG: BT_1356 # Name: not_defined # Def: putative capsule polysaccharide export protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 105 1 105 789 193 100.0 1e-48 MLGSGSLMAQSMSDSQVLEYVKDGIRQGKEQKQLASELARKGVTKEQAMRVKQLYEQQNN VNTSQSTGTDINESRLREETKENTSDMLEDHPTTEDLAREDQVFG Prediction of potential genes in microbial genomes Time: Thu May 12 04:21:25 2011 Seq name: gi|226332063|gb|ACIC01000257.1| Bacteroides sp. 1_1_6 cont1.257, whole genome shotgun sequence Length of sequence - 585 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 573 536 ## BT_1654 polysialic acid transport protein KpsD precursor Predicted protein(s) >gi|226332063|gb|ACIC01000257.1| GENE 1 3 - 573 536 190 aa, chain - ## HITS:1 COG:no KEGG:BT_1654 NR:ns ## KEGG: BT_1654 # Name: not_defined # Def: polysialic acid transport protein KpsD precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 190 488 677 789 362 99.0 3e-99 MTLEDLIIQAGGLKEAASTVRIDVSRRIKNPRSIADNDTIGQMYTFSLKDGFVIDGQPGF ILQPYDEVYVRRSPGYQAQQNVAIEGEILFGGNYAMTSREERLSDLVNKAGGPTNYAYLR GAKLTRVANASEKKRMGDVIRLMSRQLGEAMIDSLGIRVEDTFTVGIDLEKALSNPKSNA DLVLREGDVI Prediction of potential genes in microbial genomes Time: Thu May 12 04:21:29 2011 Seq name: gi|226332062|gb|ACIC01000258.1| Bacteroides sp. 1_1_6 cont1.258, whole genome shotgun sequence Length of sequence - 585 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 18 - 77 2.5 1 1 Tu 1 . + CDS 109 - 583 220 ## gi|288927180|ref|ZP_06421060.1| integrase/recombinase y4rA Predicted protein(s) >gi|226332062|gb|ACIC01000258.1| GENE 1 109 - 583 220 158 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927180|ref|ZP_06421060.1| ## NR: gi|288927180|ref|ZP_06421060.1| integrase/recombinase y4rA [Prevotella buccae D17] # 1 158 17 174 426 86 31.0 6e-16 MEKVEIKKLIEQCINYFYESGYAKGTIDYYKCLWTKGILQYMSDKGIDMYTPDVGAKFIE STQHQDMSNHECERIRSIHALNDIMTVGYIRKQCVRAAFYPLDGAIGKQMEKLVLHLISL RRGKNTLKHYRSCLGNFLYYLDMIGVQNIKQITEEHVI Prediction of potential genes in microbial genomes Time: Thu May 12 04:21:36 2011 Seq name: gi|226332061|gb|ACIC01000259.1| Bacteroides sp. 1_1_6 cont1.259, whole genome shotgun sequence Length of sequence - 584 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 18 - 77 2.4 1 1 Tu 1 . 
+ CDS 108 - 582 227 ## gi|288927180|ref|ZP_06421060.1| integrase/recombinase y4rA Predicted protein(s) >gi|226332061|gb|ACIC01000259.1| GENE 1 108 - 582 227 158 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927180|ref|ZP_06421060.1| ## NR: gi|288927180|ref|ZP_06421060.1| integrase/recombinase y4rA [Prevotella buccae D17] # 1 158 17 174 426 82 30.0 8e-15 MEKVEIKKLIEQCLNYFYESGYAKGSIDYYKCLWTKGILQYMSDKGIDMYTLDVGAEFIE STQHQGMSNHECERIRSIHVLNDIMTVGYVRKQCVRAAIYPLDGAIGKQMEKLVLHLISL RRGKSTLKHYRSCLGNFLYYLDMIGVQNIKQITEEHII Prediction of potential genes in microbial genomes Time: Thu May 12 04:21:44 2011 Seq name: gi|226332060|gb|ACIC01000260.1| Bacteroides sp. 1_1_6 cont1.260, whole genome shotgun sequence Length of sequence - 572 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 203 168 ## BF0666 hypothetical protein - Prom 273 - 332 7.7 Predicted protein(s) >gi|226332060|gb|ACIC01000260.1| GENE 1 2 - 203 168 67 aa, chain - ## HITS:1 COG:no KEGG:BF0666 NR:ns ## KEGG: BF0666 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 67 1 66 1060 102 74.0 5e-21 MKRKLMLLLACLLASIGLVIAQTPKKVTGVVISEEDDQPVVGASVLVKGTTMGTVTDIDG KFTINDV Prediction of potential genes in microbial genomes Time: Thu May 12 04:21:46 2011 Seq name: gi|226332059|gb|ACIC01000261.1| Bacteroides sp. 1_1_6 cont1.261, whole genome shotgun sequence Length of sequence - 568 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 180 - 239 5.8 1 1 Tu 1 . 
+ CDS 379 - 568 171 ## gi|237718420|ref|ZP_04548901.1| conserved hypothetical protein Predicted protein(s) >gi|226332059|gb|ACIC01000261.1| GENE 1 379 - 568 171 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237718420|ref|ZP_04548901.1| ## NR: gi|237718420|ref|ZP_04548901.1| conserved hypothetical protein [Bacteroides sp. 2_2_4] # 1 63 1 63 614 131 98.0 1e-29 MKKEVIKIPANIKYLTEREKFIEEFGKPFELPNGILNKEIPGCGATTVALTDEHKTIICS PRN Prediction of potential genes in microbial genomes Time: Thu May 12 04:21:51 2011 Seq name: gi|226332058|gb|ACIC01000262.1| Bacteroides sp. 1_1_6 cont1.262, whole genome shotgun sequence Length of sequence - 548 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 178 - 408 68 ## BT_0947 integrase - Prom 459 - 518 2.1 Predicted protein(s) >gi|226332058|gb|ACIC01000262.1| GENE 1 178 - 408 68 76 aa, chain - ## HITS:1 COG:no KEGG:BT_0947 NR:ns ## KEGG: BT_0947 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 76 120 195 195 142 97.0 5e-33 MLKEIKKKYRLQIGNFSCHSLRKTFGRQVYNMNNDNSELALVKLMELFNHSSVSITKRYL GLRQEELLNTYDCLSF Prediction of potential genes in microbial genomes Time: Thu May 12 04:21:53 2011 Seq name: gi|226332057|gb|ACIC01000263.1| Bacteroides sp. 1_1_6 cont1.263, whole genome shotgun sequence Length of sequence - 506 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 496 297 ## COG0477 Permeases of the major facilitator superfamily Predicted protein(s) >gi|226332057|gb|ACIC01000263.1| GENE 1 2 - 496 297 164 aa, chain + ## HITS:1 COG:CAC1339 KEGG:ns NR:ns ## COG: CAC1339 COG0477 # Protein_GI_number: 15894618 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Clostridium acetobutylicum # 3 159 300 453 469 100 36.0 2e-21 QVLVGLVNTLTTVLALVIIDRVGRKQLVYYGVSGMVVSLLLIGIYFLFGDSWGVSSLFLL VFFLFYVFCCAVSICAVVFVLLSEMYPTKVRGLAMSIAGFALWIGTYLIGQLTPWMLQNL TPAGTFFLFAVMCVPYMLIVWKLVPETTGKSLEEIERYWTRSEQ