Prediction of potential genes in microbial genomes
Time: Fri May 27 23:54:51 2011
Seq name: gi|283510618|gb|ACQH01000001.1| Prevotella sp. oral taxon 317 str. F0108 cont2.1, whole genome shotgun sequence
Length of sequence - 11728 bp
Number of predicted genes - 13, with homology - 13
Number of transcription units - 6, operones - 4 average op.length - 2.8

N Tu/Op Conserved S Start End Score
        pairs(N/Pv)
1 1 Tu 1 . - CDS 155 - 514 383 ## CCC13826_1047 putative putative periplasmic protein
2 2 Op 1 . - CDS 651 - 887 91 ## BF1967 hypothetical protein
3 2 Op 2 . - CDS 884 - 1273 209 ## BF1967 hypothetical protein
    - Prom 1359 - 1418 2.3
    - Term 1610 - 1650 11.3
4 3 Tu 1 . - CDS 1740 - 2117 353 ## gi|288927262|ref|ZP_06421109.1| hypothetical protein HMPREF0670_00003
    - Prom 2268 - 2327 4.7
5 4 Op 1 . - CDS 2582 - 2755 126 ## gi|288927263|ref|ZP_06421110.1| hypothetical protein HMPREF0670_00004
6 4 Op 2 . - CDS 2819 - 4765 998 ## gi|288927264|ref|ZP_06421111.1| hypothetical protein HMPREF0670_00005
    - Prom 4793 - 4852 2.0
7 5 Op 1 . - CDS 4972 - 6093 1107 ## PRU_1517 hypothetical protein
8 5 Op 2 11/0.000 - CDS 6090 - 7031 775 ## COG0463 Glycosyltransferases involved in cell wall biogenesis
9 5 Op 3 . - CDS 7028 - 7975 869 ## COG0463 Glycosyltransferases involved in cell wall biogenesis
    - Prom 8047 - 8106 4.4
10 6 Op 1 . + CDS 8285 - 9736 1161 ## PRU_1515 polysaccharide biosynthesis protein
11 6 Op 2 . + CDS 9757 - 10761 788 ## COG0463 Glycosyltransferases involved in cell wall biogenesis
12 6 Op 3 . + CDS 10801 - 11094 206 ## gi|288927270|ref|ZP_06421117.1| hypothetical protein HMPREF0670_00011
13 6 Op 4 .
+ CDS 11122 - 11535 352 ## Pecwa_2422 hypothetical protein Predicted protein(s) >gi|283510618|gb|ACQH01000001.1| GENE 1 155 - 514 383 119 aa, chain - ## HITS:1 COG:no KEGG:CCC13826_1047 NR:ns ## KEGG: CCC13826_1047 # Name: not_defined # Def: putative putative periplasmic protein # Organism: C.concisus # Pathway: not_defined # 1 115 1 115 116 151 65.0 7e-36 MFRLEFKAKAITASRNKKDNYYMVGLADDKFDYKNYILFQRPITLGEDDDPESELNGIYA ECNGDVTYNTCKSVSLTYETVVFEVQDSIIVVDLSDVKLNKRFVEYSKEIFKELLTTKL >gi|283510618|gb|ACQH01000001.1| GENE 2 651 - 887 91 78 aa, chain - ## HITS:1 COG:no KEGG:BF1967 NR:ns ## KEGG: BF1967 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 65 130 194 195 72 50.0 6e-12 MKDPKWKFLCQGEIQPFEDKELYQQKMIKRRLDKDALISYCIKLGIDIRDDAFGESQQSI LVERLSWLCNREGTHLHQ >gi|283510618|gb|ACQH01000001.1| GENE 3 884 - 1273 209 129 aa, chain - ## HITS:1 COG:no KEGG:BF1967 NR:ns ## KEGG: BF1967 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 121 1 121 195 121 48.0 1e-26 MRINQFEYFTFTCLDCPLEQIKGYIQKRWGNSEKYKITRTPFKFDLYETSPLKGGAHFEK LYFFTPRTCENKCIMFSNYSDGLSSLAYQISKALRIKAYCFQISTNDTHDAMNAFSYIEK GKKYVLSMQ >gi|283510618|gb|ACQH01000001.1| GENE 4 1740 - 2117 353 125 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927262|ref|ZP_06421109.1| ## NR: gi|288927262|ref|ZP_06421109.1| hypothetical protein HMPREF0670_00003 [Prevotella sp. oral taxon 317 str. F0108] # 1 125 1 125 125 230 100.0 2e-59 MKTLGFVGIVIIAIVLCVKLTLNSNEDKELEGIIWNKQYMEDEARKKAELFLNVRVTFNA DGTVTYSNSPEWSDAHSNLVDKTSKMEYTKGDNTRSKEGTSFNNENPENWDCYDVDLNNG VFWAD >gi|283510618|gb|ACQH01000001.1| GENE 5 2582 - 2755 126 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927263|ref|ZP_06421110.1| ## NR: gi|288927263|ref|ZP_06421110.1| hypothetical protein HMPREF0670_00004 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 57 1 57 57 110 100.0 3e-23 MKKNIYKKPTIGLLCIKVMFLCGSKGTSLSPGTERGDASQADAKRAVDFDNYENNDM >gi|283510618|gb|ACQH01000001.1| GENE 6 2819 - 4765 998 648 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927264|ref|ZP_06421111.1| ## NR: gi|288927264|ref|ZP_06421111.1| hypothetical protein HMPREF0670_00005 [Prevotella sp. oral taxon 317 str. F0108] # 1 648 1 648 648 1287 100.0 0 MNSKAKLMNSQTHQLINKLLLLLFFTVGTTAKAQNIIETFDYANHTNTDKESKHCQLIFP KGEGKGNPTVNPKTHTLLLNPGNRLLGKSLSNKCITKVVLHVAKGATPLRSEQFTTPSGY FQQEGQAWVGQSRQMEITNNSQEVLRLERIEVHLADVSPKQPTSISFVQPAYDFFKGDKL PVPHIEGADNNNLMFLSGNNKILELNADGTNATPIATGTTTLTAVYLGDERHAPSMTTCV MRCWSGIKGAANIREFNKKHDTESLYNLSLNAAEVVFARNKWAFVRDKTGCLALKFNNTM PFQTGDVLSGFVLGKFVRNYFPVLMVENDGTGHVQVEHKEPPTPTELDCTKLNLDDCVFG YQPNVAVKVRGRTGKAELMVNGHSLELLRAFNLTFTLPTKQTYHTLTGLFYTFKEGEIYF IPLQRDGIRKELVLDEGKPQNTIITTTNTAVVVKRKLKAERWNTICLPFDMSATELKEAF GDHTLMAFSAVSNNKLHFKPVTQLQAGVPYLLRTANNIKSWTMPNTDIIAPYASMVTHDG VSFCGTYNAKELAADDTEQFLQGNKLFLPANNARTMPGLRAYFKFSSAAAAKQYQFVANE TTVVELPHKEQRDEENETWYTLDGRCIKKEQRQAGQVYVRRNKKILIR >gi|283510618|gb|ACQH01000001.1| GENE 7 4972 - 6093 1107 373 aa, chain - ## HITS:1 COG:no KEGG:PRU_1517 NR:ns ## KEGG: PRU_1517 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 365 1 365 371 355 47.0 2e-96 MKILLLGEYSNVHNALAQGMRQLGHQVTVASNGDFWKDYPRDIDLKRTAGLWGKISFSLR LLWALPKLRGYDVVQLINPIFVEMKAERLFSLYRYLRKHNKRVFLCAFGMDYYWVNECRT RKSLRYSDFNLGNELRQNEDALKETADWIGTSKERLNKYIAHDCDGIITGLYEYWVCYQP LFPHKTVFIPYPIKMPCPPATIAPIGQKVKIFIGINKSRHAYKGTDIMLAAALRLVEKHT NEVELVKVESVPFAEYQRLMENSDLILDQLYSYTPSMNPLLAMSKGIVCVGGGEPEGYEL LGETHLRPIINVLPNEEDVYIKLEQAIADKEKLMRHKRQSVEYVAKHHDYVKVAQQYVQL YTAELADNVPSEE >gi|283510618|gb|ACQH01000001.1| GENE 8 6090 - 7031 775 313 aa, chain - ## HITS:1 COG:SP1771_1 KEGG:ns NR:ns ## COG: SP1771_1 COG0463 # Protein_GI_number: 15901601 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Streptococcus pneumoniae 
TIGR4 # 2 203 7 224 259 98 32.0 2e-20 MISFIIPYHNEPLNMLRQCLDSILSLSLSNEEREIIVVDDGSGYSPINDLRDIQDHIIYV RQKNGGLSHARNTGLRLATGEYVQFIDADDYLLRAPYEHCLDIMRYQEPDVVMFNQTQNT PKNVPTHYEGPIDGARYMRHNNLHAAAWGYLFKRQLAGSLRFRKGIYHEDEEFTPQLLLR AERVFVTNAHAYFYRKREGSITQRHDIKSVAKRLSDLEDVIMQLKKMAATGPTANRDALQ RRVAQLTMDYLYKIIVDTRSPRHLEKCVARLHRNGLFPLPDKPYSQKYQWFRRFTNFKAG RRILCRVLPKVVS >gi|283510618|gb|ACQH01000001.1| GENE 9 7028 - 7975 869 315 aa, chain - ## HITS:1 COG:SP1365 KEGG:ns NR:ns ## COG: SP1365 COG0463 # Protein_GI_number: 15901219 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Streptococcus pneumoniae TIGR4 # 3 215 6 220 328 127 31.0 2e-29 MQISVVIPVYNMLHTLDRCVESVLKQTFDNMEVLLVDDGSTDGSAQKCNAWAEKDSRITV LHQPNGGLSKARNTGIAHAKGDFLAFVDADDAIAPNTFQAAADAFAQHPQCNIVEFPVLK GWLSQHEQLLTFAEQEHHKPALYWYEEQGYLHAYAWNKLWRANLFANVRFPEGKVFEDLI TTARLLAKAGNILTINQGAYHYWLNPEGITQKASAYELAQLLVWNMRVRKSIIPPNGCTS EAQARFHLHLLNMQIDLYRKGDTTIRLARHPLSWHFVLKAPMPFATRAKAFWIKVFGIKS LCQIHLLLHRFMKTK >gi|283510618|gb|ACQH01000001.1| GENE 10 8285 - 9736 1161 483 aa, chain + ## HITS:1 COG:no KEGG:PRU_1515 NR:ns ## KEGG: PRU_1515 # Name: not_defined # Def: polysaccharide biosynthesis protein # Organism: P.ruminicola # Pathway: not_defined # 4 481 3 480 485 355 42.0 2e-96 MKQQKKTNGYAHILKYMGVFGSVQGVTIGLGIIRNKLVAVLIGPAGVGIISLFNSAITLL TNTSNLGLGVSGVRNVAEAYEANDGQRLDSAVSTLRFWAFLTGFLGMSLCLVLSVWLSQI TFSDSVHTVDFLLLSPIVALSCITVGETALLKGARMLRSLAQLTIASLALTILITIPLYY IYREAAIVPSLIVAALVQMGVTLACSFRAFPLRLNWNSEHFRSGSAMIKLGLAFTLAGIL GNGADLFIRSFLNKTASIETVGLYNAGYMLSVVYASTLFSALEADFFPRISAVNRFQYTM NVLVNRQIEVGLLIVGPLLVVFVLALPVLVPLLYSSAFAGVVVMVQVLSFNIFFRAISLP LEYIALAKGDSRTYLLLEVAYDLAFVAAVIGGMHLGGLWGVGLCLSVMSLLNLALVACVV QSKFGYCMRRSVFRNIGLQLPWLVLAYATTFVHNMSVYCLVGLALVGISSYLSVGALRKK TKH >gi|283510618|gb|ACQH01000001.1| GENE 11 9757 - 10761 788 334 aa, chain + ## HITS:1 COG:BS_yveR KEGG:ns NR:ns ## COG: BS_yveR COG0463 # Protein_GI_number: 16080483 # Func_class: M Cell 
wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 2 98 4 99 344 91 45.0 2e-18 MPKVSVLVAVYNAEKYLRACLDSLLNQSLKDIEVVCVDDASTDNSLALLNEYAANDSRLK VATLAKNSGISHTRNVALGMSTGEYVCMVDSDDWLSADALQLAAEVLDTEQGVDCVLFDF IIAERDDVLGTYTRQRRYNSMPFSRISGLRACELSLDWRIHGLYMIRRAIHLQYPYDESS PVYSDENTTRVHYWASREVVQCAGKYFYRQHAESATHAPNLNKYYRLDAYKSLKQFLQSH TDTRFLCARFENMRWLTVVDLYFEYYKTRRQLSQAEQQTVKKRLKAAWQDIDHSLLSGKD TRKFGYMPLRFSWLAFRIQEELYFFVRAMRDKMR >gi|283510618|gb|ACQH01000001.1| GENE 12 10801 - 11094 206 97 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927270|ref|ZP_06421117.1| ## NR: gi|288927270|ref|ZP_06421117.1| hypothetical protein HMPREF0670_00011 [Prevotella sp. oral taxon 317 str. F0108] # 1 97 1 97 97 181 100.0 2e-44 MDIINDFEVQFYKALRLLDLGKTEQASKILENVVAEAAKMQNNLFFIRASCVLGELLFAT GKYNEARRYLTQVIETPCQDDVVDYEKNLAEDILGRL >gi|283510618|gb|ACQH01000001.1| GENE 13 11122 - 11535 352 137 aa, chain + ## HITS:1 COG:no KEGG:Pecwa_2422 NR:ns ## KEGG: Pecwa_2422 # Name: not_defined # Def: hypothetical protein # Organism: P.wasabiae # Pathway: not_defined # 7 137 2 130 130 62 31.0 4e-09 MKTEKYKTTDQEFEAIIKLTPQQRYVYFVKRICDWEEIWTLYKDDCIVLNEDKHGVLFAL LFPFESFASFYATRTVSLQNDLCKCFTLDEFLSTIVNKLLANNIASALVFPVPNGFGLNV PLTRLIADIHKELENYE Prediction of potential genes in microbial genomes Time: Fri May 27 23:55:52 2011 Seq name: gi|283510617|gb|ACQH01000002.1| Prevotella sp. oral taxon 317 str. F0108 cont2.2, whole genome shotgun sequence Length of sequence - 26710 bp Number of predicted genes - 19, with homology - 15 Number of transcription units - 15, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 105 - 689 633 ## COG1739 Uncharacterized conserved protein + Prom 893 - 952 6.0 2 2 Tu 1 . + CDS 1071 - 2582 1689 ## COG1649 Uncharacterized protein conserved in bacteria + Term 2800 - 2870 20.0 + Prom 2900 - 2959 5.6 3 3 Op 1 . 
+ CDS 3084 - 6317 3326 ## BF2942 hypothetical protein
4 3 Op 2 . + CDS 6336 - 8270 1793 ## BF3103 hypothetical protein
5 3 Op 3 . + CDS 8295 - 9206 910 ## BF2940 hypothetical protein
    + Term 9406 - 9457 10.0
    + Prom 9423 - 9482 10.0
6 4 Tu 1 . + CDS 9699 - 9950 71 ##
    + Prom 10629 - 10688 5.0
7 5 Tu 1 . + CDS 10885 - 11094 131 ##
    + Term 11248 - 11282 -0.8
    - Term 11702 - 11760 2.8
8 6 Op 1 2/0.000 - CDS 11783 - 13036 1596 ## COG4198 Uncharacterized conserved protein
9 6 Op 2 6/0.000 - CDS 13045 - 13962 1188 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases
10 6 Op 3 . - CDS 14064 - 15131 1111 ## COG1932 Phosphoserine aminotransferase
    - Term 15577 - 15627 -0.8
11 7 Tu 1 . - CDS 15670 - 16179 454 ## COG0778 Nitroreductase
12 8 Tu 1 . + CDS 16493 - 18574 2262 ## COG0550 Topoisomerase IA
    + Term 18653 - 18689 -0.7
    - Term 18388 - 18420 1.0
13 9 Tu 1 . - CDS 18656 - 18859 63 ##
    + Prom 18628 - 18687 4.8
14 10 Tu 1 . + CDS 18874 - 20214 1530 ## COG1538 Outer membrane protein
    + Prom 20283 - 20342 5.7
15 11 Tu 1 . + CDS 20405 - 20590 103 ##
    + Term 20616 - 20654 -0.6
    + Prom 21330 - 21389 4.7
16 12 Tu 1 . + CDS 21410 - 22693 1006 ## COG0673 Predicted dehydrogenases and related proteins
    + Term 22781 - 22819 0.1
17 13 Tu 1 . - CDS 23043 - 23468 70 ## BVU_1598 transposase
    + Prom 23560 - 23619 3.7
18 14 Tu 1 . + CDS 23714 - 26020 731 ## BVU_0280 hypothetical protein
    + Term 26218 - 26274 -0.8
    - Term 26328 - 26364 5.4
19 15 Tu 1 .
- CDS 26482 - 26646 129 ## PGN_1826 hypothetical protein Predicted protein(s) >gi|283510617|gb|ACQH01000002.1| GENE 1 105 - 689 633 194 aa, chain + ## HITS:1 COG:NMB2153 KEGG:ns NR:ns ## COG: NMB2153 COG1739 # Protein_GI_number: 15677966 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Neisseria meningitidis MC58 # 1 176 1 176 203 159 43.0 4e-39 MNDEYKTIDDIGQGVYTEKRSKFLAYAHPVDSVEQIKDLLANYRKQHYDARHVCYAYMLG SERQEFRANDDGEPSSTAGKPILGQINSHELTNILVVVVRYYGGVNLGTGGLIVAYRTAA ALAIDNAPMVSRLVEETVSFAFPYVMMNGVMRVVKESEARILATNFENTCEIKLSIVRSK AEELRNKLNKLSFG >gi|283510617|gb|ACQH01000002.1| GENE 2 1071 - 2582 1689 503 aa, chain + ## HITS:1 COG:BS_yngK KEGG:ns NR:ns ## COG: BS_yngK COG1649 # Protein_GI_number: 16078889 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 25 499 28 507 510 325 38.0 1e-88 MKRLMMIATLLATMLFAGAPQLSAASLHSKREFRGAWIQTVNGQFTGMSTAAMQQTLRQQ LDELQRDGVNAIIFQVRPECDALYESKIEPWSRFLTGVQGRAPMPYWDPLRWMITQCHRR GMELHAWINPYRAKTAGTTALARNHVASLYPNRVFPYNGQLILNPGLPENREYICRVVDD IVTRYDVDGIHIDDYFYPYPAAGQTIADDKEFMRYNGGITNKNDWRRNNVNAFVQRLGQT IKDRKPWVKFGVSPFGIYRNKKSDPVNGSATAGLQNYDDLYADVLLWVNNGWVDYCVPQL YWKIGHPTADYQTLAIWWNKYAGNRPLYIGESVERTVKEPDLNNPKTNQMEAKFKLRRNL PNVQGAVLWYAKAVADNMGNYGSALRTYYWKTPALQPLMPFIDHKRPKKPKHLKAVWTSD GLVLFWEEPKGKRWGDFAQKYVVYRFAQGEDIDIDNPAKIVAITHDKWIKLPYDKGQVKY RYVVTALDRMSNESRKVKKKVWL >gi|283510617|gb|ACQH01000002.1| GENE 3 3084 - 6317 3326 1077 aa, chain + ## HITS:1 COG:no KEGG:BF2942 NR:ns ## KEGG: BF2942 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 52 1077 103 1125 1125 1450 67.0 0 MQSQTTLTRKLCCTVSGVALLMFAPALQTQVHASGWSLGMPWLSVAATPNGLKATLAPAQ NSTAPKIVGVVTDSNTGEPLVGATVKVVGTTEGTATDVDGRFEIRATKGSTLEVSYLGYT TKKIKVTDVKVLSITLSDQAQMLEGAVITAFGTAQKKETVTGSIQTVRPNDLKVPATSLS TAFAGRLSGVIAYQRSGEPGNNGADFFVRGVSTMNGATSPLIVMDGVEITKDDLNAIDPE IIESFSVLKDATATAMYGTRGANGVLIIKTKSGADLDRPVIGMRVESWINTPINVPKTVD 
GVTFMRMYNEAVTNQGSGDLLYSEDKINGTMRGLNPYVYPNVDWYKEIFKDATWNQRANF NVRGGTSKITYFMNINASHETSMLKDVSSKYFSYDNGTSYMKYAFQNNVDFNISKASKLS LHLNVQLTDMHGMLTDKKGGGTKEVFGSIMGANPVDFPIMYPKGEDDWYRWGGLLAGNYN PLNPMAEATKGYRDDFESTVVANLNFEQKLDFITKGLSFKTLFSFKNYTINYKYRVQDYN RYQLTDYAANPATDGGYDMTVKPIADPQKYILNNYFYTNGNRRYYFQTYLDYNRSFGDHT VSGMALFNIDEYNSNVNSNDNFLASLPKRRMGLAARLSYDYKYRYMVELNAGYNGSESFA KGHRWGFFPSVSLGWNVSEEPFFESLKKTVQQLKLRASYGLVGNDQVGQERFAYQAIVDL EKSPEFSTGYGSQSRNLKGPVFKRFENTNLTWEVGRKLNVGFDLTVKDVKLTVDVFREIR SNIFQKKGSIPNYFGTAKTDIYGNLAKVKNWGVDLAVDYGKNITKDLAVEFRGTFTFARN KVLEYDEAADTRPALRMVGKRLKSYWGYVANGLYIDEADIANSPKSTLGNIAIAPGDVKY ADQPDQNGEYDGKIDANDRVQLGYPYIPEIVYGFGPSISYKKWDLSFFFQGQANVSLMMN DFEPFGTQSKRNVLQWIADDYWSKDNQNPNARYPRLTKYNNHNNMQSSSYWLRNAAFLRL KNFEVGYKFKYGRIYASASNLVTFSAFKLWDPEMGGGAGMSYPLQRTFNIGLQLTFK >gi|283510617|gb|ACQH01000002.1| GENE 4 6336 - 8270 1793 644 aa, chain + ## HITS:1 COG:no KEGG:BF3103 NR:ns ## KEGG: BF3103 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 639 1 636 636 893 66.0 0 MKSIYRIFMAVAVTFAFVSCNFLDIVPDENATKEDTYSDHNKTEKYLYSCYAYLPQSNLA QGSLDFFTGDEVTTAFEHETFAAFPKGNYTASSPVISYWNDFFQGIRQCYMFLEGVDKTP DMDDEKKADYKAQGQFLIAYYHYMLARCYGPIILVKETPNIETKVEDFAHRTPYDECVQW ICDKLDAAAAGLPVRRDQIQYGLATSVAAKAVKAKMLLYAASPLFNERAPELFKNMRDPD NGTQLMPTAYDANKWVKARDAMKEAIDFAEQNGYHLYTKQDRFIGDASKNQWPAAGPVRC LRLLSLDYQEVNPEVLLAETRGEGWYGVLSKSLPFSASGWAWNGVCPTWAMLNRFYTKNG LPWDEDPEFKDTEKLQVVAIDSAHQDEGQPGKKTIVFNLDREPRYYAWVAFQGGYYEVQN DPSNPAYTMSNGQKNTTDSRLVCDFVLGGNCSLGTAALKRTGNYSPGGYLNKKFVSPNTA VSASGVGSREWMPFPVIRLADLYLGYAEACVETGDLETAKTYLNKVRQRAGIPDVETSWA RAGVTLGQAKLRQIVRQERMAEFYLECQNFWDMRRWMLAGECFNKKAQGLNINATTIEEF AGLKEIQYERKFESPTQYLLPIPIDDINKNPHLVNNPGYTGDHR >gi|283510617|gb|ACQH01000002.1| GENE 5 8295 - 9206 910 303 aa, chain + ## HITS:1 COG:no KEGG:BF2940 NR:ns ## KEGG: BF2940 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 300 3 299 300 286 48.0 1e-75 
MKNIILSIVCLCAVGAFCSCSKDDNGPAPTSLDAQTLRYDKAPGAVIIRWTIPQNPTYKY IKVTYQVPGEEQVRMRLASVYSDSIKIDNLLARYGEIEYSLQPFSEAGAGGDVAKITAQA GAATKTVYETFKDQLTFETNQVWSDDPESSEGPLSALIDGNENTYFHMSWSAPRPFPHYI VFDLGQERTALQFRYVCRNNNNKDNPKEMDVLVSDAFENTPEYYANETGTRKLASFANLP GDKKAVYESARSIKSDKPFRYVWFKIKSATSGKNWVAMAEWQVFKIREKVYDPETKETAI VEY >gi|283510617|gb|ACQH01000002.1| GENE 6 9699 - 9950 71 83 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSKYAASLSAIGGGGCNGTYVRSKKSLWAYQLGKAERKSYRSQNSETEVAFTFIVKHNCF WKIVVISLLTMHLWLSISNFAWL >gi|283510617|gb|ACQH01000002.1| GENE 7 10885 - 11094 131 69 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVQTKPLQTMKHMAKKGKMVVKKWGFRGKMLGLGLEKLRACNQTRKLKWCKMQVLSLKVD MIKGAEGWK >gi|283510617|gb|ACQH01000002.1| GENE 8 11783 - 13036 1596 417 aa, chain - ## HITS:1 COG:lin2955 KEGG:ns NR:ns ## COG: lin2955 COG4198 # Protein_GI_number: 16802014 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 1 415 1 410 413 406 50.0 1e-113 MATIRPFKGIRPPKELVEKVESRPYDVLSSEEARAEAGDNEMSLYHIIKPEIDFSPDTSE YDPKVYEQAARNFKKFQDKGWLVQDEKPAYYIYAQTMNGKTQYGLVVGAFVEDYLNGTIK KHELTRKDKEEDRMKHVRVNDANIEPVFFAYPDNKVLDELINKYVKNNAPEYDFIAPIDG FGHQFWVISDANDIDVITKEFAKMPALYIADGHHRSAAAALVGAEKAKNNPNHKGDEEYN YFMAVCFQASQLTILDYNRVVKDLNGLTSAQFLEKLEKNFDVQKKGKEMYHPTGLHNFSL YLDGEWYSLTAKKGTYNDADPIGVLDVDISSRLILDEILGIKDLRTDKRIDFVGGIRGLN ELKERVDSGEMKMALALYPVSMQQIMDIADSGKIMPPKATWFEPKLRSGLVIHKLSD >gi|283510617|gb|ACQH01000002.1| GENE 9 13045 - 13962 1188 305 aa, chain - ## HITS:1 COG:MJ1018 KEGG:ns NR:ns ## COG: MJ1018 COG0111 # Protein_GI_number: 15669207 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Methanococcus jannaschii # 1 290 2 298 524 150 35.0 3e-36 MKVLVATEKPFAANAVKAIKEEIEKAGHELVLLEKYTEKAQLLDAVKDVEAMIVRSDKIT PDVMDAAPKLKIIVRAGAGYDSIDTAYAKEKGIIVENTPGQNANAVAELVLGLLVYAVRA FFNGKSGSELIGKKLGLLAFGNVGRNVARIAKGFGMDVYAYDAFCPANVIEEAGVHAVAN 
QEALFDQCDIVSLHIPATPETKQSINYALVNRLPKGGILVNTARKEVINEAELLKLMAER EDLKFVTDIKPDADADFAKLEGRYFSTPKKMGAQTAEANFNAGLAAAKQINAYFADGCTK FQVNK >gi|283510617|gb|ACQH01000002.1| GENE 10 14064 - 15131 1111 355 aa, chain - ## HITS:1 COG:BS_serC KEGG:ns NR:ns ## COG: BS_serC COG1932 # Protein_GI_number: 16078066 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoserine aminotransferase # Organism: Bacillus subtilis # 5 352 6 355 359 339 47.0 6e-93 MKKYNFNAGPSILPREVIENTAKQILDFNGSGLSLMEISHRAKDFQPVVDEAVALIKELL SVPEGYSVIFLGGGASLQFTQVPCNFLIKKAAYLNTGVWAKKSMKEAKLYGEVVEVATSA DANFTYIPKNFDIPADADYLHVTTNNTIYGTEIRKEIDSPIPLVGDMSSDIFSRPVDVSK YDCIYAGAQKNLAMAGVTVIIVKNDKLGRAPRQIPTMLDYRTHVDKGSMFNTPPVVPIYC ALETLRWIKKSGGVEAMDKKAIERAKIIYDEIDRNRLFRGTVKEEDRSLMNICFVMNEDF AELEKPFMEFATQKGMVGIKGHRDVGGFRASCYNAMSIEGAEALVACMKEFEAKL >gi|283510617|gb|ACQH01000002.1| GENE 11 15670 - 16179 454 169 aa, chain - ## HITS:1 COG:CAC1484 KEGG:ns NR:ns ## COG: CAC1484 COG0778 # Protein_GI_number: 15894763 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 1 166 1 167 172 121 37.0 6e-28 MNFLNLAKERFSARAFSAQVVEEDKVNYLLACAQRAPSACNKQPWHFVVAQTEEARQKVQ QCYNRDWLKAAPLYIIIYAADDRAWVREEDGKNHADIDAAIAIEHLCLAATAVGLGSCWV CNFDVAALQAAFPLGKEWHPVAILPIGYPANTPSKPSPRKEIGEIVSRF >gi|283510617|gb|ACQH01000002.1| GENE 12 16493 - 18574 2262 693 aa, chain + ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 2 686 4 650 709 454 39.0 1e-127 MIVCIAEKPSVARDIASVIGATTQRDGYMEGNGYQVTWTFGHLCTLKEPNDYTDNWKRWS LGALPMIPQRFGIKLIEDRGIEKQFQVIERLMQNADEIVNCGDAGQEGELIQRWVMQKAQ AKCPVKRLWISSLTEEAIRQGFANLKDQSEYQPLYLAGLSRAIGDWILGMNATRLYTLKY GQNRQVLSIGRVQTPTLALIVNRQKEIDNFKPEAYWVLSTIYRDTLFTATKGKFTNKEEG ERAFATIADKDFEITGVVKKKGTEAPPHLFDLTSLQVECNRKFGYSAETTLNLIQSLYEK 
KLTTYPRVDTQYLSDDIYPKCPLILKGVRGYETFTQPLLGKALNKSKKVFDSSKVTDHHA IIPTGVPAGALSDMENNVYDLVTRRFISVFYPDCKFSTTTVTGEVDDIELKATGKEILEQ GWRVLYAKETNTNNNDEESTKQTTEERTLPTFTKGEKGPHTPTLSEKWTVPPKFYTEATL LRAMETAGKFVDDEELRSALKENGIGRPSSRAGIIETLFKRHYIRRERKNLLPTPTGIEL IGIIREELLKSCELTGIWEKKLRDIEHRNYDAQQFVNELKAQITQIVNEVLADNSNRHIT ITTEDDLKKTIAKKKASAPKAKTQANPKTKAQNATKGSATPEPTAPQTLQAHESIVGKPC PQCKQGVIIKGKGAYGCSNWKAGCTFRLPFGEV >gi|283510617|gb|ACQH01000002.1| GENE 13 18656 - 18859 63 67 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGLRVDELTSSRVYEFMCWQVDELSGWCACFAIGLLFNKAFYRCSKSAFCCQLDTKRAEK VYKDDGL >gi|283510617|gb|ACQH01000002.1| GENE 14 18874 - 20214 1530 446 aa, chain + ## HITS:1 COG:XF2586 KEGG:ns NR:ns ## COG: XF2586 COG1538 # Protein_GI_number: 15839175 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Xylella fastidiosa 9a5c # 105 438 90 426 452 83 24.0 1e-15 MKKWTLSLGILFALPTMLFAQTHQWTLKDCISHAMQHNISLQKQRIAVKSAEEEMLHSKA ELLPSVSASTSQNVNYRPWPETGAARVANGSVQSSVDKVFYNGTYVINAQWTLWNGNRNR NTIALNQLQMSKAEAEAMVSAATIQERIAQLYVQILYSAEAINVSKQALETSARNEERGK EMLNVGKMSKADLAQLTAQRAQEEYNVLSAQSTLADFKRQLKQLLQITDSAPFEVAVPAT TGEMALQPIPSVSEVYTQAISWRPELKAAQLAINGAETSIKVAKAQNLPTLSLGASMGTN TTSMSNNAWGTQIKTNFDMSAGLTLSIPLFDNRNKRTAVNRAQFERQSSMLELQDKQTSL YSNIEECWLQATNNQNKYKAAKVSVESAQQSYDLLNEQFNLGLKNIIELMTGKNNLVTAQ QNELQSKYMAILNIHLLNFYKTGEIK >gi|283510617|gb|ACQH01000002.1| GENE 15 20405 - 20590 103 61 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSAIFIHQPLTLPQKIDSREGCFWCKKGQHVNAGKMNIYFKTPSFAPISELFAAKCSAFC S >gi|283510617|gb|ACQH01000002.1| GENE 16 21410 - 22693 1006 427 aa, chain + ## HITS:1 COG:TM0585 KEGG:ns NR:ns ## COG: TM0585 COG0673 # Protein_GI_number: 15643351 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Thermotoga maritima # 35 305 3 260 360 72 23.0 2e-12 METMPAARPLETPVPTRVAGQRDVLSLALAPMPLLQVAFVGVGARGQMAVTRWCHIPNVQ 
IVAVCDVSRQAAEEAAQHVEQLGQPRPTAYWGENAYHQLCQQQGLNLVYVCTDWLSHVPI ALHAMQCGKCVAVEVPAALTLDHIWQLVDTAERTQQHCMMLENAVYDHFEMAVREMVREG LFGEVVHVEGGYAHPIGHRWTPWRMEYNRLTCGDVYPTHSIGPACQLLDIHRTDHLHYLT AMHTAPFLGKEVYKEVMGKASDSFANGDQTSTMIRTMKGKTILIQHNVMTPRPYSRMFQV VGTHGYAAKYPIPEMLLSPNAATKVGLDIAEGNASLSTLQIETLLNRYTTPFSAQTISMA KQLDSRGGMSYFMDLRLAQCLQQGLPLDMDVYDLAEWCCIAELSKLSIEHGSMPVMIPDF TRRNVLR >gi|283510617|gb|ACQH01000002.1| GENE 17 23043 - 23468 70 141 aa, chain - ## HITS:1 COG:no KEGG:BVU_1598 NR:ns ## KEGG: BVU_1598 # Name: not_defined # Def: transposase # Organism: B.vulgatus # Pathway: not_defined # 31 121 300 390 429 120 58.0 1e-26 MAKVQIKSDNFTAFGGVLTTETCIWQPWPLEEKYTYRCITTNDYEMSLLDVVHFYNQRGS SERIFDELNNGFGWRHLPKSFMAENTVFLIVTVLIRNLYKLLMSDVDIKCFDLKKKAESK TSSSNWSQSLPNGKGRHVGKS >gi|283510617|gb|ACQH01000002.1| GENE 18 23714 - 26020 731 768 aa, chain + ## HITS:1 COG:no KEGG:BVU_0280 NR:ns ## KEGG: BVU_0280 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 768 1 769 769 166 23.0 4e-39 MMNKFLGVLFLFVLFSFPTLAQSISGVVLAKDNKDAVGYATVSLITRDSTYINATVATED GKFTLDTKGCTDCLLKIIALGYKPSTFSLDSLRTVFFIEKETYQLKEVVKRASRRIVKMN RSGNLEVNVAGTYLANRNNIKELLLSVPGVVERDGNIITISGAAPLYVVNGRKVQSYAEI ATLSVKDIKNIAVNNNPGVEYGGNVKAVIMITTSVPLDRLTLNTQATVRKSRAWSHNEYL NAIYAFKDVTLYGTLQYASYKKKSFQDIAFSSIDNTSVPLFLSHTNLVSYPSDIRKLNYN FGVDCQLSSSHKLSLMFDGFVSRINDTGNATNDVINRSESKVSFHSLSKLNDKLDYKHVA LNYNYTDKRGLAFSLAADYAATHSKRTQNTEEVFSNSNEATNVANRVTNNLFSINPHFIY PLHQRLSAQLGGQFDYITNENRLEYSRSMQKRNSETKELISAAYMGLTYQGNHYQAQIGM RYENTRNTFQQDTQKTEQNSSQLLPYASLSFSLGNTSHQISFKYDMQRPPLGFIGGYSYY LNHYKFQEGNPMLRPQRNSTLDYALSWNNFYFSAKYQYVKSPIMALSTLVSSSDYDMVKS SWYNLEKQHNITVMMNYSKSFDWYKPSLTLAYIHHINYLATKHEHTETLSRPVPYLQMQN TLVLPWLDVNVNYEYTGKGYFRVFLTEERHLLNLSVQKRLLHETLSVSLYWNDVFRQDIS RYEVYYQGLLFRQTEDQDRQIIGLTVSYRFNNKQRNQKIKSSDQLLRL >gi|283510617|gb|ACQH01000002.1| GENE 19 26482 - 26646 129 54 aa, chain - ## HITS:1 COG:no KEGG:PGN_1826 NR:ns ## KEGG: PGN_1826 # Name: not_defined # Def: hypothetical protein # Organism: 
P.gingivalis_ATCC33277 # Pathway: not_defined # 1 47 183 229 232 70 63.0 2e-11 MRKRPKYLLFCPELEGMNTILYVINNEVFIYRIVEMEKYKLDDYMKNRTAIVDS Prediction of potential genes in microbial genomes Time: Fri May 27 23:56:47 2011 Seq name: gi|283510616|gb|ACQH01000003.1| Prevotella sp. oral taxon 317 str. F0108 cont2.3, whole genome shotgun sequence Length of sequence - 2163 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 30 - 83 4.1 1 1 Op 1 . - CDS 197 - 475 254 ## gi|260912074|ref|ZP_05918633.1| conserved hypothetical protein 2 1 Op 2 . - CDS 524 - 1183 428 ## gi|288927287|ref|ZP_06421134.1| hypothetical protein HMPREF0670_00028 3 1 Op 3 . - CDS 1207 - 1428 120 ## gi|288929197|ref|ZP_06423042.1| hypothetical protein HMPREF0670_01936 - Prom 1542 - 1601 4.6 + Prom 1411 - 1470 5.0 4 2 Op 1 . + CDS 1540 - 1809 141 ## gi|288930216|ref|ZP_06424022.1| conserved hypothetical protein 5 2 Op 2 . + CDS 1806 - 2159 198 ## Dacet_1491 transposase IS4 family protein Predicted protein(s) >gi|283510616|gb|ACQH01000003.1| GENE 1 197 - 475 254 92 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260912074|ref|ZP_05918633.1| ## NR: gi|260912074|ref|ZP_05918633.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. F0295] # 1 91 5 95 238 149 89.0 7e-35 MKRVYLIFIMGSLFFYISNAQTLIEQVERAYSALDSASYINKIVLSYAKSLEKNEEETYK LLYSPDSDSMKVAQWFNRADSMYLKYLQKIKF >gi|283510616|gb|ACQH01000003.1| GENE 2 524 - 1183 428 219 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927287|ref|ZP_06421134.1| ## NR: gi|288927287|ref|ZP_06421134.1| hypothetical protein HMPREF0670_00028 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 219 10 228 228 460 100.0 1e-128 MQTNEIDKMIDMKPILGFIFSVLICLCLTSCAGIYRTTYYITETQGWKKADYNGVNDIAY TSDTAPDSIMLNVRGPGAAISTGPLFFPFAFPTCLLFWIKHSPELEVEAKFYVCQPYTIY PQSIRFTDNEGNMMTPTMIESLDIVHEQDITKLDSTAIVQGIKTSGAIRLRWYFKGKQAK KKGFSVLMDAPAVNGKDGKPLLIWFRRKTVYEYVPLFTH >gi|283510616|gb|ACQH01000003.1| GENE 3 1207 - 1428 120 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929197|ref|ZP_06423042.1| ## NR: gi|288929197|ref|ZP_06423042.1| hypothetical protein HMPREF0670_01936 [Prevotella sp. oral taxon 317 str. F0108] # 1 73 38 110 110 131 98.0 2e-29 MKYYNIPSLRGTRHKRADICFVMTMEYDTFTFDAPTVFISFCPFCGVNLYDYYKSDEYVN EIEGETFKFFKDQ >gi|283510616|gb|ACQH01000003.1| GENE 4 1540 - 1809 141 89 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930216|ref|ZP_06424022.1| ## NR: gi|288930216|ref|ZP_06424022.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. F0108] # 1 61 1 61 208 118 100.0 1e-25 MEAKIEKISELSKLLSVKSRMSDDLFHLFGKFGIGHLLSRLSLEKHDGVSASELILSLCL FAFWVRASTAYANVRYMSFQIMVRTVSIG >gi|283510616|gb|ACQH01000003.1| GENE 5 1806 - 2159 198 117 aa, chain + ## HITS:1 COG:no KEGG:Dacet_1491 NR:ns ## KEGG: Dacet_1491 # Name: not_defined # Def: transposase IS4 family protein # Organism: D.acetiphilus # Pathway: not_defined # 4 97 90 179 466 64 32.0 1e-09 MMIRPQMDWRRLMNRFALRYMCLLRKYGEAPQSDATTCFIIDDTVLEKSGVRMEGISRVF DHVKGRCVLGYKLLLCAFFDGKTTIPFDFSLHQEKGKQGGCGLTRQQRRKACHTNEE Prediction of potential genes in microbial genomes Time: Fri May 27 23:57:23 2011 Seq name: gi|283510615|gb|ACQH01000004.1| Prevotella sp. oral taxon 317 str. F0108 cont2.4, whole genome shotgun sequence Length of sequence - 41611 bp Number of predicted genes - 35, with homology - 28 Number of transcription units - 27, operones - 5 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 133 - 192 5.6 1 1 Tu 1 . + CDS 212 - 763 338 ## BT_0225 hypothetical protein + Term 819 - 860 -0.7 - Term 854 - 903 12.2 2 2 Op 1 . - CDS 981 - 2603 1810 ## PRU_1026 hypothetical protein 3 2 Op 2 . 
- CDS 2754 - 3728 987 ## gi|288927292|ref|ZP_06421139.1| heterogeneous nuclear ribonucleoprotein A2
    - Prom 3786 - 3845 4.8
    + Prom 3785 - 3844 4.2
4 3 Tu 1 . + CDS 3975 - 4844 624 ## PROTEIN SUPPORTED gi|237727781|ref|ZP_04558262.1| ribosomal protein L11 methyltransferase
    + Term 4899 - 4932 -0.9
    - Term 4864 - 4902 4.4
5 4 Tu 1 . - CDS 4921 - 5358 503 ## PRU_0802 hypothetical protein
    - Prom 5378 - 5437 5.5
    + Prom 5097 - 5156 1.9
6 5 Tu 1 . + CDS 5254 - 5607 66 ##
    + Term 5678 - 5721 0.3
    - Term 5620 - 5657 5.1
7 6 Tu 1 . - CDS 5712 - 7277 1064 ## BT_4508 hypothetical protein
    + Prom 8043 - 8102 7.4
8 7 Tu 1 . + CDS 8337 - 10415 1646 ## Plav_1227 hypothetical protein
9 8 Tu 1 . - CDS 10657 - 11289 621 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain
    - Prom 11491 - 11550 2.7
    - Term 11702 - 11753 10.0
10 9 Tu 1 . - CDS 11775 - 12347 668 ## PROTEIN SUPPORTED gi|150005281|ref|YP_001300025.1| 30S ribosomal protein S16
    - Prom 12478 - 12537 4.9
    + Prom 12631 - 12690 4.2
11 10 Tu 1 . + CDS 12770 - 13570 849 ## PRU_1652 putative lipoprotein
12 11 Tu 1 . + CDS 13672 - 14292 790 ## COG0357 Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division
13 12 Tu 1 . + CDS 14426 - 15040 480 ## COG0491 Zn-dependent hydrolases, including glyoxylases
    + Prom 15077 - 15136 4.8
14 13 Op 1 . + CDS 15211 - 15783 716 ## COG0817 Holliday junction resolvasome, endonuclease subunit
    + Prom 15789 - 15848 2.9
15 13 Op 2 . + CDS 15873 - 16358 670 ## gi|288927304|ref|ZP_06421151.1| hypothetical protein HMPREF0670_00045
    + Term 16508 - 16554 6.4
    - Term 16220 - 16262 -0.3
16 14 Tu 1 . - CDS 16289 - 16495 167 ##
    - Prom 16536 - 16595 1.7
    - Term 16630 - 16665 -0.7
17 15 Op 1 24/0.000 - CDS 16711 - 19941 3890 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ)
18 15 Op 2 . - CDS 19957 - 21036 1210 ## COG0505 Carbamoylphosphate synthase small subunit
19 15 Op 3 .
- CDS 21043 - 22938 2207 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase
    - Term 22961 - 23002 10.6
20 16 Tu 1 . - CDS 23015 - 23809 803 ## PRU_1779 lipoprotein
    - Prom 23835 - 23894 3.7
21 17 Tu 1 . + CDS 24841 - 25386 647 ## COG0622 Predicted phosphoesterase
22 18 Tu 1 . + CDS 25480 - 25986 449 ## Plut_0901 hypothetical protein
    + Term 26032 - 26060 -0.9
    + Prom 26131 - 26190 2.5
23 19 Tu 1 . + CDS 26237 - 27040 754 ## COG0566 rRNA methylases
    + Term 27201 - 27241 7.3
24 20 Tu 1 . - CDS 27160 - 29490 2482 ## COG3525 N-acetyl-beta-hexosaminidase
25 21 Tu 1 . - CDS 29622 - 30101 400 ## gi|288927313|ref|ZP_06421160.1| hypothetical protein HMPREF0670_00054
    - Prom 30156 - 30215 5.8
26 22 Tu 1 . - CDS 30390 - 31028 238 ## gi|288927314|ref|ZP_06421161.1| hypothetical protein HMPREF0670_00055
    - Prom 31112 - 31171 6.2
27 23 Tu 1 . - CDS 31252 - 31935 544 ## gi|288927316|ref|ZP_06421163.1| hypothetical protein HMPREF0670_00057
    - Prom 31955 - 32014 4.8
    + Prom 33861 - 33920 4.3
28 24 Op 1 . + CDS 34131 - 35051 891 ## gi|288927317|ref|ZP_06421164.1| hypothetical protein HMPREF0670_00058
29 24 Op 2 . + CDS 35048 - 38788 3954 ## gi|288927318|ref|ZP_06421165.1| Cna protein B-type domain protein
    + Term 38811 - 38864 12.2
    - Term 38797 - 38852 12.6
30 25 Tu 1 . - CDS 39064 - 39798 147 ## gi|288927319|ref|ZP_06421166.1| conserved hypothetical protein
    - Prom 39839 - 39898 2.5
    - Term 40123 - 40160 0.4
31 26 Op 1 . - CDS 40276 - 40410 64 ##
32 26 Op 2 . - CDS 40349 - 40558 145 ##
33 26 Op 3 . - CDS 40576 - 40776 97 ##
34 26 Op 4 . - CDS 40788 - 40994 76 ##
    - Prom 41140 - 41199 4.7
    - Term 41183 - 41227 -0.9
35 27 Tu 1 .
- CDS 41235 - 41444 177 ## - Prom 41471 - 41530 4.0 Predicted protein(s) >gi|283510615|gb|ACQH01000004.1| GENE 1 212 - 763 338 183 aa, chain + ## HITS:1 COG:no KEGG:BT_0225 NR:ns ## KEGG: BT_0225 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 183 11 197 197 86 32.0 5e-16 MGWNKFLATLAMLIVSNAAFAEREQSFFVELFGASTSLGIHYDSRFSDNTRWGGRIGIAY TYSRSQDFFESATERTRGWSFPIAVNYLIGNGRHHGEIGIGISYGLYSCMYHDKAGREVE YDTSGSFGFIDLGYRYQSERGIMVRVGLCPGVALKTYDESGLRNKGVNRAAVFYPYVGVG YNF >gi|283510615|gb|ACQH01000004.1| GENE 2 981 - 2603 1810 540 aa, chain - ## HITS:1 COG:no KEGG:PRU_1026 NR:ns ## KEGG: PRU_1026 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 7 540 13 523 523 430 48.0 1e-119 MICSALLALPAMAQETYENSRLLGNDLNGTARYVGMGGAMEALGADLSTIRTNPAGIGLY RKSGVNLSFGFISQGDANARLGHSATKMSFDQVGFYRSMRMGERSFVNFAFNFSKGTNFN YFLSAAGKLSDASLGKQAYLKGLRGSLYNGGYNIRQDPKHNDVYIGYAESNPKNFDVSRN YGQLDYLYWNSRLIDKNDPTKAGYDAANGYDFNRSQHGYIGNYDFNISGNINERVYLGLT LGLRTVNYKGYSEYVESLVNKSDKYIGTATIADERKVTGTGFDLSAGAIFRPIEGSPFRI GVSINSPTWYDLKTENYTTILNEGENKGRYSNGRSTESYNFKLFTPWTFGASLGHTVGNF LALGASVSYADYATMDNRYIDGEYQDYYGYTHETSASDKVMNNHAKRTLNGVATLKLGME YKVLPQVAVRAGYNYVSPAYKKDAFKDGTLNSEGSYYSSAADYTNWKSTNRMTLGVGFIG KQWSADLAYQYSTTDGEFYPFTNGLTTVDPTTKKPLTNEASAVSVSNKRHQMLFTVGYRF >gi|283510615|gb|ACQH01000004.1| GENE 3 2754 - 3728 987 324 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927292|ref|ZP_06421139.1| ## NR: gi|288927292|ref|ZP_06421139.1| heterogeneous nuclear ribonucleoprotein A2 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 324 1 324 324 295 100.0 3e-78 MKRLGLLAIGLTLGTFTAMAQTDDVYFVPKKSDRINKEYQDKGERDADVYYSGSNRNVDE YNRHNRNGSSYQKIRRDSLGNVIVEEDVSAMSSDRNRSRRDWRDRRDYYDDDDYYYSRRV SRFYDPWFFGYYAYSPYYGYSRWDWDWYDPFYYGYGPRWRGYYSYWYNDWAYPYYDYGWR SPYYYGGYYGPYYGYYSPYYNYGGYYYSPGYYSYGGVTGTRNHGSMSSRAARDRTDYRYQ YNDNTRSSGRSYGNSYERSSNGDFYNTRSYDNNRNYNPPTRSYDSGSRGGSFGGGGSSRG GGSFGGGSSGGGSSSGGVTFGGRR >gi|283510615|gb|ACQH01000004.1| GENE 4 3975 - 4844 624 289 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237727781|ref|ZP_04558262.1| ribosomal protein L11 methyltransferase [Bacteroides sp. D4] # 1 286 1 278 281 244 44 5e-64 MEYLKAVFTLNVAEEDMDAAKDLLADLAAAAGFEAFEETDDKLVGYVQPQLLDKMALDEA IADFPIASASIAYEMVEAENRNWNEEWEANGFEPICIDGRCMIYDARHYSQADLPVGFDL QVAIEPRQAFGTGAHETTRMVVAQLLSMPLKGKRILDCGCGTGILGIVAAKLGADRVLGY DVDEWSVDNSLHNADLNGVQNMAVRKGDATTLNAGDGQFDVVVANINRNILLADMPRWVQ MMAQGGRIVFSGFYQTDVPMLVEAAADLGLRLISKREQEIWACLTFCKE >gi|283510615|gb|ACQH01000004.1| GENE 5 4921 - 5358 503 145 aa, chain - ## HITS:1 COG:no KEGG:PRU_0802 NR:ns ## KEGG: PRU_0802 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 144 1 144 146 148 53.0 6e-35 MKATEQTIMQVERAINKIAQKFPNQENNTALTDIHIRVSQDSGELLAFDDDDREITRCVV EQWIESKDEHFYHSVTHILQSCLRKLNNVVEQMGILRPFSFVLEDDEKNHVAELYLVDDD TSIITGELLQNLDEDLDSFMKDLMK >gi|283510615|gb|ACQH01000004.1| GENE 6 5254 - 5607 66 117 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDVGEGGVVFLVRKLLRYLVDGTFHLHDGLFRSFHNANCFNGIAKIVKNRKGKRVCIKNV RLLRIFLFGGRGWCACDVGLELMANGRQFCTAAPRFYRETALTHGAKALNGKPIEMC >gi|283510615|gb|ACQH01000004.1| GENE 7 5712 - 7277 1064 521 aa, chain - ## HITS:1 COG:no KEGG:BT_4508 NR:ns ## KEGG: BT_4508 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 151 521 32 408 408 79 24.0 3e-13 MTFKEFQETLKELDTFKLRRGKKRFALVNVTPETTTLTEVGSTKNMDVPTKMLYEAFKEL EVQGEYTTKDLTPYVQSTAAPACAALLNSVFDVDMTKELDRVTNEIENTYLKLQDLFVPD ELLLDSSGFDLEPTSYKKALGEMSPNMLEQEALLLGVPSIIKATRTPSMKKMQQDIYKFV 
TRYPEEWLLGLPMRDLYLLQEMVNGKLVNIDYSHTPPTLNWLGIVMDKATDGKRGEHIAI YDDLKEALQPLIRPAITAKLTLAEFNLETLFVGLMNTVGWISRKKAIKILSEMMREKMGR EMGIYVNTYFEHSILTAIFTCIASWDKQCGTLCAPRLKDMPNLGKSDLEDDDRPELSYND LMLRGLYPFIRPINAEEQDFFDLLTKKGFSEDEAFLHFTLIFHRIQEERMPNGKLISEII EVFPQRKLPSDKDINIITKFVNSVPRPHLNGYSPNQIANKRLRPNAHFTAATPRFDIDSP FDSGFKNPFGNLNLEQPKVGRNEPCPCGSGKKYKKCCGREN >gi|283510615|gb|ACQH01000004.1| GENE 8 8337 - 10415 1646 692 aa, chain + ## HITS:1 COG:no KEGG:Plav_1227 NR:ns ## KEGG: Plav_1227 # Name: not_defined # Def: hypothetical protein # Organism: P.lavamentivorans # Pathway: not_defined # 297 550 180 433 638 69 26.0 5e-10 MSFIATMTLRKAGRLNEALQQANADWQYEKSPHACMAMFWVQKDICDWLIKEQRLDEANQ RIALLEQLQPLMNDTEGIASKAIERLKEHALPFYPQIRHIEQMSKDGCERVAYDALLPFL GENLPTALHERVGWVVFRYLKRLPSECQSATARRALLNYIMLKNQRPSLLHSQMLLMAIA VKERFADFKLLNFIAAWDVNAFAEADFQPSQHEEQTFRPLAERVVEHCFDLGYGLQEVLN TFLPNVRFTDEKVKELYARSKYFAIIKLQKQDETAAIGLAEHYAQTISGTFVRNEYHSAI LHFYLRHLPKAHQRQALRFVKGWGMDNFRDEDWQRTTKDDRRYPALVEKTLVALLSACEK HELRRLNERPPAILQKALDAYADNESLMRLWVKVKLAACKDHEALETLRCLIRKQQRFYL WKELADITPDEQLKLSALCKAILLQPKDEFLGKVHFMLALLLKGQGLLSEAQAQINAYVE TYRRNGWTPTPEMQALANAMPPGIAPCADVSAFYNAHLDAAHDFLFPDLHWNLLVVMNFF TQINPKTGRKTDKVALADTNGNTLSVNRKQWPSGHSLAKFACVEAKIAVEGKRKNIVAMR PSAANGQDIWHLETVQIVRADEQRQRYVAKNEAGLTFVVPYYIIRSRVDCGIKARLWWVD AATKEGKGMRRTIAVEFVDAENKPKPRPEKNA >gi|283510615|gb|ACQH01000004.1| GENE 9 10657 - 11289 621 210 aa, chain - ## HITS:1 COG:SA1666 KEGG:ns NR:ns ## COG: SA1666 COG2197 # Protein_GI_number: 15927422 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Staphylococcus aureus N315 # 2 203 4 207 207 63 26.0 2e-10 MLILSDNQDITRAGLQWLCSEMKHNDIAIATDKTTLARLLEGQEHATVVLDYTLFDFTSF DELLIWQQRFHRVQWLLFSETLTSDFVRVAVASGPRFGIILKDAPLPLIRQALHDALRGE PYHYAPLVQMADEQWQEVERVKLTKTEMEVLRDIALGLTTKEIAEKRFSSFHTVNTHRKN IFRKLNVNNVHEATKYALRAGMVDAAEYYI >gi|283510615|gb|ACQH01000004.1| GENE 10 11775 - 12347 668 190 aa, 
chain - ## PROTEIN SUPPORTED ## NR: gi|150005281|ref|YP_001300025.1| 30S ribosomal protein S16 [Bacteroides vulgatus ATCC 8482] # 1 190 1 183 183 261 72 4e-69 MATKIRLQRGGRKGYAFYGIIIADARAPRDGKFTEKIGTYNPNTNPATVDLNFERALHWV ECGAQPTDTVRNILKGEGVYLMKHLRGGVKKGAFDEATAQKKFDAWKEEKNKGTEAIRNN EAKAKKDLEAKKLEAEKSVNAAIAKKVNEKKAAEAAAKAEAEAANAEVENAAEATAEAPV QEANEAPAEA >gi|283510615|gb|ACQH01000004.1| GENE 11 12770 - 13570 849 266 aa, chain + ## HITS:1 COG:no KEGG:PRU_1652 NR:ns ## KEGG: PRU_1652 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 10 266 20 269 277 119 30.0 1e-25 MKRILIAWAVALVAFAACTNKSATSAKIEGLDFDSIVVDTTVALTDLKLTPLCRISLNLQ YAKGNNADKINNALLHAGILMPDYLGLTNQKFSMKQAVDSFVKRMLNDYLHDYAALYRQD RENGSSYNYEYKVKTSTRNGAENIVVYTAKIYTYGGGAHGINQTLVRNIDVTTGKVLQLQ DVFVPGYEPTLKELLLKKVGERFNADGLDELNKKDIFADGHVYVPDNFAIDDDGFTFIYC EDEIAPHAVGEISVTLSRSELSRILK >gi|283510615|gb|ACQH01000004.1| GENE 12 13672 - 14292 790 206 aa, chain + ## HITS:1 COG:DR0014 KEGG:ns NR:ns ## COG: DR0014 COG0357 # Protein_GI_number: 15805055 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division # Organism: Deinococcus radiodurans # 13 146 26 165 253 91 38.0 8e-19 MDEIRKYFTELTATQLEQLTQLGELYRTWNERINVVSRKDIDNLYLHHVLHSLAIAKYVQ FKAGTRVLDFGTGGGFPGIPLAIVFPECRFKLIDRTAKKIRVTQEIAAAIGLTNATAEQK AGEEEWGEYDFVVSRAVMPLPDLMKIVKKNISREQRNALPNGLICLKGGNLDAETRTVKK VVDITPVGNWFEDEWFEEKNVVYVPM >gi|283510615|gb|ACQH01000004.1| GENE 13 14426 - 15040 480 204 aa, chain + ## HITS:1 COG:SMc01587 KEGG:ns NR:ns ## COG: SMc01587 COG0491 # Protein_GI_number: 15966085 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Sinorhizobium meliloti # 2 202 12 210 213 125 38.0 8e-29 MLQENCYVVSDETKECVIIDCGAFFEKDKSALTQYIADNELKPRRLLATHGHVDHNLGNA FVFATYGLRPEVHEDDGELLEHLPEQAAYLLNMPLEEQQPRVERYLTEADSIEWGNHRVT 
IIHTPGHTEGSCVFYCESEGVAFTGDTLFKGSIGRVDLPGGSMFKMMQSIRHLSQLPDTT RVFSGHGPETTIGYELAHNPYLDR >gi|283510615|gb|ACQH01000004.1| GENE 14 15211 - 15783 716 190 aa, chain + ## HITS:1 COG:VC1847 KEGG:ns NR:ns ## COG: VC1847 COG0817 # Protein_GI_number: 15641849 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, endonuclease subunit # Organism: Vibrio cholerae # 9 160 3 150 173 120 46.0 2e-27 MYKEETEKIIMGIDPGTTLMGYGVIRVVGKKAQMVAMGVIDLRKMPDPYLRLGHIFERVS GIIDAYLPDELAIEAPFFGKNVQSMLKLGRAQGVAIAAAIQRDIPIHEYAPLKIKMAITG QGQASKEQVAGMLKRMLKLSDDDMPKFMDATDALGAAYCHFLQMGRPETDTKYRGWKDFV SRNQDRVAGK >gi|283510615|gb|ACQH01000004.1| GENE 15 15873 - 16358 670 161 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927304|ref|ZP_06421151.1| ## NR: gi|288927304|ref|ZP_06421151.1| hypothetical protein HMPREF0670_00045 [Prevotella sp. oral taxon 317 str. F0108] # 1 161 1 161 161 259 100.0 3e-68 MKGTTMKKILILVLMACATAFTAQAQEVYKRILKVSKQTAADKSKSIDVRKVATFKVDEL NYMAMKSKELMPDSTVRMLDTQAYAMHEFINLFFKRLSEAKKKTQKELVMARFKNASINN SRFNDMDKELVLSYYDNGNYMTQFSLDTDWVKALAEIRSKK >gi|283510615|gb|ACQH01000004.1| GENE 16 16289 - 16495 167 68 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCSDSDNIVRFSKYLNQSNNPRLPHPFLYFTRKRNVTIEARQGKSHYFLLRISAKAFTQS VSKENCVI >gi|283510615|gb|ACQH01000004.1| GENE 17 16711 - 19941 3890 1076 aa, chain - ## HITS:1 COG:YJL130c_2 KEGG:ns NR:ns ## COG: YJL130c_2 COG0458 # Protein_GI_number: 6322331 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Saccharomyces cerevisiae # 7 1060 6 1053 1070 1175 56.0 0 MKDESIKKVLLLGSGALKIGEAGEFDYSGSQALKALREEGVKTVLINPNIATVQTSEGVA DQIYFLPVQPYLVERVIEKERPDGILLAFGGQTALNCGVELYQSGVLEKYGVKVLGTPVQ AIMNTEDRELFVEKLNEINVKTIKSEACANIEQARRAAASLGYPVILRAAYALGGLGSGF CDNEEELNKLAEKAFSFSPQVLVEKSLKGWKEIEYEVVRDRYDNCITVCNMENFDPLGIH TGESIVVAPSQTLSNSEYHKLRALSIKIIRHIGIVGECNVQYAFDPKSEDYRVIEVNARL 
SRSSALASKATGYPLAFVAAKLGMGYGLFELKNSVTKTTSAFFEPALDYVVCKIPRWDLS KFRGVDKELGSSMKSVGEVMAIGRTFEEAIQKGLRMIGQGMHGFVENKELKIDDIDAALR EPTDKRVFIISKAMHKGYTIDQIHQLTKIDKWFLEKLKHIIDIDEALQRCTSVNVLDKEL LRTAKVYGFTDFQIARAVGLEKEMSNMHKAALVVRNLRKSYGILPVVKQIDTLAAEYPAQ TNYLYLTYSGVASDVRFDNDRRSIVVLGSGAYRIGSSVEFDWCGVQALNTIRREGYRSVM INYNPETVSTDYDMCDRLYFDELTFERVMDVIDLETPNGVIVSTGGQIPNNLAMKLDEQH VPILGTSATDIDNAEDRAKFSSMLTRNGINQPEWSALTSMDKINEFIARVGFPVLVRPSY VLSGAAMNVCSNREELERFLQLAANVSEDHPVVVSRFIEHAKEIEMDAVSKDGEILAYAI SEHIEFAGVHSGDATIQFPPQKVYIETVRRIKRVSRQIAKELHISGPFNIQFMARDNDLL VIECNLRASRSFPFVSKVLKLNLIDLATRVMLGLPVEKPNKNLFDLDYVGIKASQFSFNR LQKADPVLGVDMASTGEVGCLGDDTNSALLTSMLSVGHRIPKKTVLLSTGGGKQKAEMLD AAKRLVNNGYELYATSGTSHFLTENGIANTTVYWPTDEGMEPKVLDVLREKKIDMVVNIP KDLTPRELTNGYSIRRAAIDLNIPLITNTRLASAFITAFTTMRVEDIEIKAWSEYK >gi|283510615|gb|ACQH01000004.1| GENE 18 19957 - 21036 1210 359 aa, chain - ## HITS:1 COG:SPAC22G7.06c_1 KEGG:ns NR:ns ## COG: SPAC22G7.06c_1 COG0505 # Protein_GI_number: 19113967 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase small subunit # Organism: Schizosaccharomyces pombe # 4 355 59 441 474 361 46.0 1e-99 MRNVTLVLQDGTKFHGKSFGYDAPVAGEVVFNTAMMGYPESLTDPSYAGQLVTLTFPLVG NYGVPPFTFGPEGLPTFMESDHIHASAIIVSDYSEQYSHWNANESLAEWLKREQVPGITG IDTRELTKVLREHGVMMGQIIFDDDPTNIPQAQYEGVNFVDRVSCKEIIRYNEGAGKRVV LVDCGVKANIIRNLIERGLEVVRVPWNYDYTGMEFDGLFLGNGPGDPDLCQDAVNILRQQ MNKSRKPICGICMGNQLMAKAGGANIYKLKYGHRSHNQPVRMVGTDKCYITSQNHGYAVD ASTLDKDWSELFVNMNDGSNEGVRHNTNPWFTSQFHPEACSGPVDTLFMFDLFVEKLTL >gi|283510615|gb|ACQH01000004.1| GENE 19 21043 - 22938 2207 631 aa, chain - ## HITS:1 COG:NMA0892 KEGG:ns NR:ns ## COG: NMA0892 COG0034 # Protein_GI_number: 15793861 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Neisseria meningitidis Z2491 # 36 507 17 421 514 133 27.0 9e-31 MEPLKHECGVAMIRLLKPLQYYQQKYGTWMYGLNKLYLLMEKQHNRGQEAAGMSCVKLKT 
TPGHEYMFRERVEGSNAITEIFSTVQKGFAKETAEHLADARYAERTLPFAGELYMGHLRY STTGKSGLSYVHPFLRRNNWRAKNLSLCGNFNMTNIGEIFERLTRQGQCPRIYSDSYIML ELMGHRLDREVERNFVAAQALGLTDTDITQYIEDNVNVANVLRTTMPHFDGGYVVCGVTG SGEMFSMRDPWGIRPAFYYKNDEVAVVASERPVLQTTFAIECDDVKELEPGTALIVKRDG DCRIERIMEQKADSSCSFERIYFSRGSDASIYKERKKLGEQLTNPILRAIDYDTKNTVLS YIPNTAEVAFYGLVAGFKRYMRQQHVAQIQALDHAPTAEELEEILSNTVRLDKVAWKDIK LRTFIAEGNSRNDLASHVYDVTYESIRAGKDNLVIIDDSIVRGTTLQKSILRILDRLHPK KIIVVSSAPQIRYPDYYGIDMPRLEEFCVFRAAIELLKEKNMHALIKQTYDNCRRELKKP VEEMDNCVRQIYKAVTVDEINRKIVQMLRPKGVKTPIELVFQSIEGLHEAIPQHKGDWYF TGNYPTPGGMRLCNQAFVNYIEKVYQNEQKF >gi|283510615|gb|ACQH01000004.1| GENE 20 23015 - 23809 803 264 aa, chain - ## HITS:1 COG:no KEGG:PRU_1779 NR:ns ## KEGG: PRU_1779 # Name: not_defined # Def: lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 20 264 21 266 266 262 54.0 8e-69 MNKILSSVVLLTFVLLLGSCGGSKDVVYFQNIDNTSLEPSKGLYEAKIMPKDQLSITVVT TDPAAAAPFNLTVGNSVGTKGQLAQGSNLQTYLVDNNGDIDFPVIGKVHVGGLNKNECQD LIKSKISGYLAASENPIVTVRMASYQVTVLGEVNKPTVIPVTSEKMSIVEALAQAGDLSI YGRRDNVMLLRENPDGRKEAHRIDLRDANLINSPYYYLQQNDVIYVEPNKAKAANSKTST STTIWFSVISSVLSLSSLIVNLVR >gi|283510615|gb|ACQH01000004.1| GENE 21 24841 - 25386 647 181 aa, chain + ## HITS:1 COG:ECs3184 KEGG:ns NR:ns ## COG: ECs3184 COG0622 # Protein_GI_number: 15832438 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Escherichia coli O157:H7 # 1 180 2 183 184 164 45.0 7e-41 MKYLLISDIHGCLPALQRALHFYKEQHCDMLCILGDIINYGPRNQLPEGLNPKGIVEQLN AMADDIVAIRGNCDSEVDQMLLNFPIMADYLLLVDGNKRLILTHGHVYNKEKRPLGKCDA LFYGHTHLWELTRTPQGVVCNLGSITFPKGGNPPTFATYEDGVISVYRTDGSLLQTMDLN E >gi|283510615|gb|ACQH01000004.1| GENE 22 25480 - 25986 449 168 aa, chain + ## HITS:1 COG:no KEGG:Plut_0901 NR:ns ## KEGG: Plut_0901 # Name: not_defined # Def: hypothetical protein # Organism: P.luteolum # Pathway: not_defined # 1 156 1 155 177 73 28.0 3e-12 MSKAKLKKHLLALSKEEIINVVLDLYDARTEAKAYFEFYLTPDSSTELEKLRVRIVHEFF PVRGLSENPSFAKCKKIVTDFAKMQASPDCVADLMLTVTEQGNSWAMTYGYCEGAYETAL 
ANSFARAMDFIFRNGLLLYFYERIERMLASAANGWGGDSLREIYRQYR >gi|283510615|gb|ACQH01000004.1| GENE 23 26237 - 27040 754 267 aa, chain + ## HITS:1 COG:Rv0881 KEGG:ns NR:ns ## COG: Rv0881 COG0566 # Protein_GI_number: 15608021 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Mycobacterium tuberculosis H37Rv # 1 262 13 276 288 151 37.0 1e-36 MSIERITSLTHPGVEVFSALTETQLRNRLQPDKGVFIAESPKVIRVALDAGYEPLSMMCE ERHIVGDAADIISRCGDIPVYTGERALLAQLTGYTLTRGVLCAMRRKPTRSVAEACADAT RVCVIDGVTDTTNIGAIFRSAAALGIDAVVLAKNACDPLNRRAVRVSMGSVFLVPWTWSE QPLAELSALGFRTAAMALTHNSISLDDAKLKAEPRLAIVMGTEGDGLPHQVIAQADYVVR IPMAHGVDSLNVAAASAVAFWELRVDK >gi|283510615|gb|ACQH01000004.1| GENE 24 27160 - 29490 2482 776 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 32 598 31 582 757 398 40.0 1e-110 MYFKQFILGALGLLTTTFAQAQTPTTKANFRVVPLPQEITNTSAGAFSLNAQTRIAYPKG NTALQKDAELLAQYLFAATKVRPALTTAAQGRNLIVLSATLKNNNKEAYRLEVTKERITI NGASAAGTFYGIQTLRKAIPQTGAQRVSFPSVTIADQPRFGYRGAMLDVARHFFSFEEVK TFIDILTLHNINRFHMHLTDDQGWRIEIKKYPKLTEVSSMRPETMVGHDASKFDGKPHGG FYTQEQARAIVKYAADRHIMVIPEIDMPGHMVAALAAYPELGCTGGPYKVRTTWGIAEEV LCAGNDRTLQFAKDVLSEVMDIFPAEYVYIGGDECPKVSWAKCPRCQARIKEKGLAADGK HTAEERLQSYFMSDIANFITARGRKVGGWDEILEGGIAPNATVLSWRGMEGGIEAARLGH DAIMCPVSHLYLDYYQTNDREHEPLAFNGFIPIEKTYDFNPVSPKLTAEEAKHIIGVQAN LWTEHMDNFPHVEYMALPRLAAASEVQWTMPEKKNFDDFANRMPQLLNIYNLNKYRYGRH LFNVNIALSPAEKPGSLNVTLTARKGATIYYTLDGTAPTAKSLRYTQPFPVKKSCKLRTI AVYPTFTSNENSEDLLISKATMAEVKYNVEPDAKYPGLSPRELTNGLQGSVIFNSGRWVG YVGKDLDLTVDLGEVKPIRLISVRTLISSAQWIMPHRGLYVWVSTDGKNFEDIYDLPADP MPANKPDYIDTDNVLLPMPKDARYVRIKMLSEKSMPAWHPHTGKPAWLFVDEVTIE >gi|283510615|gb|ACQH01000004.1| GENE 25 29622 - 30101 400 159 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927313|ref|ZP_06421160.1| ## NR: gi|288927313|ref|ZP_06421160.1| hypothetical protein HMPREF0670_00054 [Prevotella 
sp. oral taxon 317 str. F0108] # 1 159 1 159 159 306 100.0 2e-82 MKRCATTLCWVAMTWANTALAQTRCVVADMETHRPLKGVRVKTDTNTTIETDFTGTCILP ATFKSLTFVSYGYMRRMMNKEELTDTVLLLPTMLNEVVVYGKAPKPGFDVQEAARQSARI GARMAPGGFGFDFFRMFDKRTRRPSKKEREMNERILKTY >gi|283510615|gb|ACQH01000004.1| GENE 26 30390 - 31028 238 212 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927314|ref|ZP_06421161.1| ## NR: gi|288927314|ref|ZP_06421161.1| hypothetical protein HMPREF0670_00055 [Prevotella sp. oral taxon 317 str. F0108] # 1 212 1 212 212 413 100.0 1e-114 MKTIALWIIGCLALTAIGCTERTDKRAELPLNRIDTAKLKPELKAVVRAYIKAHRQYKTF LMVQGKVQEEEEWLDYKTPYYFCIGPALEQLFDGGEFTLGTSYPTSYFLLNGRTILVKTL QEEFVVPDKAVIDTYNRLAEPLDTFEKGTTLSYTSPATNYITKSWVIRIGREIPTKVVST RADTFLGRKRVEWTPPLTCVKRTKCKPTRANK >gi|283510615|gb|ACQH01000004.1| GENE 27 31252 - 31935 544 227 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927316|ref|ZP_06421163.1| ## NR: gi|288927316|ref|ZP_06421163.1| hypothetical protein HMPREF0670_00057 [Prevotella sp. oral taxon 317 str. F0108] # 1 227 1 227 227 445 100.0 1e-124 MKTIALWIIGCLALTAIGCTERTDIRTELPLNRIDTAKLRSELKAVVRAYIKVHPQYKTF LLAPTELPADDWSVYPETYFLIGPAFDGLYNGGEFGFVESYPTSYFDLDGHIVFVKTIQE DLTIPDKAAIDTYNRLAEPQDTFDKGTPTPYVHPIKSYIFKAWVVETGFKTTTKLLSTRA DTLATVKRVPCNLPNMELLYSNSKARKPRHATKKDTTLNDTAVADSI >gi|283510615|gb|ACQH01000004.1| GENE 28 34131 - 35051 891 306 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927317|ref|ZP_06421164.1| ## NR: gi|288927317|ref|ZP_06421164.1| hypothetical protein HMPREF0670_00058 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 306 1 306 306 572 100.0 1e-162 MRKAALALAAALTVGCWAQVFNVASLVPVELPSDVGSKVVAISGQGDFLLLTTDNNSGLT KLDLNTGKAQNITQAAGAGYDARVSPDGKRVVYRENSFTPGHLRMVSLRSVNLESGQHNE LVAPTRSLQGMALNNQAALPVTRGQVAAKGFEGKVSAKNAVVLSINNRQLMISRNGKTRN LSPNGKEKSYLWPSLSPDGTKILYYVGAEGAFVCNLDGSNVKPLGMMRAPQWWDDSTVVG MYDQDDGEFVYASRIVATNLKGDKQTLTPDSLIAMYPKVSAQAGKIAFSTPSGKAYIINV TKSQQP >gi|283510615|gb|ACQH01000004.1| GENE 29 35048 - 38788 3954 1246 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927318|ref|ZP_06421165.1| ## NR: gi|288927318|ref|ZP_06421165.1| Cna protein B-type domain protein [Prevotella sp. oral taxon 317 str. F0108] # 1 1246 1 1246 1246 2499 100.0 0 MKKILLFIAMAFTALAQAQTKNASELRIYLNPGHGCYGPNDRPMPTIPYPNLPETGRPGK KGFYESTTVLMRTLPMVDKLVKMGVKRDNIMLSRTDNGPYPYVTGDPENDKYDRPLSEIC EEVDANNMDFFISVHSNAATDGGNTNYPLILYRGRDKEGNDLVPGSRDMAIKMWEPHYMD ELDPQSYYSRTNMNVRGDISFYHSSSVRHGTHGDYEGYLGVLKHGVPGFLIEGYFHTYQP ARHRALNPDYCKQDAIRMSRGLADIFNLPAEKTGYIMGTVKDLHERIVNPVFHYAPRTND QWLPLNGATVTLFKGDKAVKTYQVDTLYNGIFVFEDLEPGEYTVRATLKGYKEQGKFTAD ATSTEYQNLVAQSMEKLVVKANQTTYTKLYLEVEGYEPPADTYHNYPDPEQPAYLTMPAA LNMKAEEPVTLAIKGEVKRAITREGKTVILTNDNGTPLLYLVNNATRKIEKQLSTNGLPA AETDNKGFHSRLNDIAFTADGQLVGVNSVECQFNDGEVETDKGYKRGTLRIFKWQDMDAD PIEWLTTQSSANFLNADMGKTVAVSGAAKSCKVIVGGTNANGVAKGVRNLVLYVENNTIT SSLFTEKTINASSNFTEVKLGNDYKLCASPFADDQWMVDGNVTSPMEFKPATTSNVNSEI LGRLPADILGSEGEVAAASGAIFFKYAKHTLLATPYLKDAKVAGLRLFDVSEGLEKAKLI KATTLNLATPLEKVGFMAATATVKDADITLTLITDSVLTNFTTKGVAQPAVKGVYAYNLR LAQAGERYTFSFDANDEPTAAKLVFTDAKTNAAVGELPLNGVKAGHNSFDFATDQLPGRL QQELNWAVSLTGNRIASINRINPEVATAYNRATVAIDKSTESDFFGRMYVGEANKKQLDV TGVYVCDANGARNNAAPYKGGQKLMGNYRMSVDATGKLYIAEFSDENPGVFIANPAQMDG KFEQFFVGKPDEDGLITNDGESVGSSASMVLPTGSGSNAKLYVCLEDMKAAIGVYNIGQP DGSVLTSWNKAPSLTLKVSGLINADDNLAAGPDGGLWVVQYRGAGNNTKGVPSLMYVDKD GKVTFNSGNADWVENLNGSRRSGFAVSDDGKSLVICDGSLALQFFNVAWNGSTPTLTKKY SYEGLGKEIYQMAFDPAGNLVCAGKEVCILSIPTEHNQTLTPAKRSLTVKRQPTTAVEQP TAGKRVVSIRYYNAAGLQSAQPFDGVNIVVTTYADGSKKTEKMMKK >gi|283510615|gb|ACQH01000004.1| GENE 30 39064 - 39798 147 244 aa, chain - ## HITS:1 COG:no 
KEGG:no NR:gi|288927319|ref|ZP_06421166.1| ## NR: gi|288927319|ref|ZP_06421166.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. F0108] # 8 244 1 237 237 316 99.0 1e-84 MKKSNNIVEIKEKQAAWHHQVPYLLPLLNRTSSLNNPSLAIVKSSNFLPFYHTSNLSNPS LAIVKSSNFLPFYHTSNLSNRSLAIVKSSNFLSFYHTSNLSNPSLSAIKSSNFLSFYHTS NLSNPSLSAIKSSNVLRFYRTSSLNTPSLAIIKSSNVMPFYRTSNLNNSSLSVIKSSNIL PLYRTSSLNNPSLCAIKSSNILPLYRTSSLNNPSLCVITCKLNNCWLVKWRSHFSPILVD KLTS >gi|283510615|gb|ACQH01000004.1| GENE 31 40276 - 40410 64 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDGALVRLVGRSKAIMWLSLNPCFNGWCTRTHKMAQLISIVQES >gi|283510615|gb|ACQH01000004.1| GENE 32 40349 - 40558 145 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDGALVHEVEKIFEPIHTWVLILVLMDGALVRVHTKTTIKKMKSLNPCFNGWCTRTLGGK VESYHVAES >gi|283510615|gb|ACQH01000004.1| GENE 33 40576 - 40776 97 66 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDGALVPRTATSSTRQDCLNPCFNGWCTRTRAHPHMVCPLCYSLNPCFNGWCTRTTSPQE GECNLR >gi|283510615|gb|ACQH01000004.1| GENE 34 40788 - 40994 76 68 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVHSYRKSNSSNMPSAMRLNPCFNGWCTRTKCFETLSSFSFNVLILVLMDGALVLSQSGL FIWHSIMS >gi|283510615|gb|ACQH01000004.1| GENE 35 41235 - 41444 177 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDGALVHIEKERNLSHNGGLNPCFNGWCTRTMLLYLVKLGYMIVLILVLMDGALVQIYSE MKKKDIIKS Prediction of potential genes in microbial genomes Time: Sat May 28 00:00:20 2011 Seq name: gi|283510614|gb|ACQH01000005.1| Prevotella sp. oral taxon 317 str. F0108 cont2.5, whole genome shotgun sequence Length of sequence - 23310 bp Number of predicted genes - 19, with homology - 13 Number of transcription units - 8, operones - 2 average op.length - 6.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 314 - 348 -0.5 1 1 Tu 1 . - CDS 491 - 670 66 ## - Prom 863 - 922 4.4 2 2 Tu 1 . - CDS 1070 - 1261 122 ## - Prom 1326 - 1385 5.6 3 3 Tu 1 . - CDS 1601 - 1990 287 ## - Prom 2106 - 2165 4.2 4 4 Tu 1 . - CDS 2227 - 2502 134 ## - Prom 2585 - 2644 2.1 5 5 Tu 1 . 
- CDS 2700 - 2936 160 ## - Prom 3077 - 3136 5.9 + Prom 4719 - 4778 5.0 6 6 Tu 1 . + CDS 4799 - 4993 91 ## + Term 5184 - 5219 0.8 - Term 4963 - 5024 2.2 7 7 Op 1 . - CDS 5054 - 5533 621 ## Mevan_0228 CRISPR-associated Cas2 family protein 8 7 Op 2 . - CDS 5544 - 6563 1229 ## gi|288927322|ref|ZP_06421169.1| VirE N- domain protein 9 7 Op 3 . - CDS 6560 - 7540 1237 ## COG1518 Uncharacterized protein predicted to be involved in DNA repair 10 7 Op 4 . - CDS 7568 - 9010 1404 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member - Prom 9233 - 9292 2.8 11 8 Op 1 . - CDS 9361 - 11100 1265 ## Metvu_1220 CRISPR-associated protein, TM1812 family 12 8 Op 2 . - CDS 11110 - 12216 1062 ## gi|288927326|ref|ZP_06421173.1| hypothetical protein HMPREF0670_00067 13 8 Op 3 . - CDS 12229 - 14022 1202 ## Fisuc_1562 hypothetical protein 14 8 Op 4 . - CDS 14030 - 14437 342 ## gi|288927328|ref|ZP_06421175.1| hypothetical protein HMPREF0670_00069 15 8 Op 5 . - CDS 14430 - 15926 1415 ## Fisuc_1560 protein of unknown function DUF324 16 8 Op 6 . - CDS 15938 - 17362 1134 ## CFF8240_1676 hypothetical protein 17 8 Op 7 . - CDS 17359 - 17913 589 ## VVA1540 hypothetical protein 18 8 Op 8 . - CDS 17917 - 19479 1684 ## Fisuc_1557 hypothetical protein 19 8 Op 9 . 
- CDS 19499 - 20176 783 ## Cthe_3205 hypothetical protein - Prom 20346 - 20405 3.4 Predicted protein(s) >gi|283510614|gb|ACQH01000005.1| GENE 1 491 - 670 66 59 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVHSYIEPMFGCWTDKFSLNPCFNGWCTRTRIGFHVEPSVWGCLNPCFNGWCTRTVMTT >gi|283510614|gb|ACQH01000005.1| GENE 2 1070 - 1261 122 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDGALVPQYGMSVQFTLKEGLNPCFNGWCTRTGCGCGIPVWDVSVLILVLMDGALVHKKK KGP >gi|283510614|gb|ACQH01000005.1| GENE 3 1601 - 1990 287 129 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDGALVPTFSEQYKEGFNVLILVLMDGALVLKISLNVIVPGVLILVLMDGALVQLFLNGT METYSVLILVLMDGALVPAYSRKARTIVGVLILVLMDGALVPNLDYQKHVISLKGLNPCF NGWCTRTWY >gi|283510614|gb|ACQH01000005.1| GENE 4 2227 - 2502 134 91 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVHSYIMSVCMAPKAASLNPCFNGWCTRTYPAFVSDKGERSLNPCFNGWCTRTFSTSLQK LRNKTSLNPCFNGWCTRTSAKETWWCARRRS >gi|283510614|gb|ACQH01000005.1| GENE 5 2700 - 2936 160 78 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MMHSYFIYSILLACITFSLNPCFNGWCTRTAHDTIINHVVNVLILVLMDGALVLGIIVPK YYFLEVLILVLMDGALVR >gi|283510614|gb|ACQH01000005.1| GENE 6 4799 - 4993 91 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRYERTSIVQVKVSCGLRRVCRSRRILVKRMRQRCDGCFVGTDVACADEKGGGERERKCP YALL >gi|283510614|gb|ACQH01000005.1| GENE 7 5054 - 5533 621 159 aa, chain - ## HITS:1 COG:no KEGG:Mevan_0228 NR:ns ## KEGG: Mevan_0228 # Name: not_defined # Def: CRISPR-associated Cas2 family protein # Organism: M.vannielii # Pathway: not_defined # 62 159 1 96 96 66 37.0 4e-10 MRKKAKPLPYIEVLRKLVRAGVAHSPEINRQVGNIEGLPTLQQRVNHVLGIVNLPQRKPT NMLFFVMYDIESDKVRRHVAKYLEQKGCTRVQRSIFLADLDTADYQEIKTDLAEVQSLYD NHDSIIVCPISTDQLRAMRIIGQQIDVDIITHNRNTLFF >gi|283510614|gb|ACQH01000005.1| GENE 8 5544 - 6563 1229 339 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927322|ref|ZP_06421169.1| ## NR: gi|288927322|ref|ZP_06421169.1| VirE N- domain protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 339 1 339 339 657 100.0 0 MISCGKNIQSAADPLLKIKEEQLYHSLINPRPDIEARIRQLRIVYAMDTKQYASLKRTLP YVVCGHFTPNFRKKENFAYTETFILDIDHVSEKNLDLAAVRQQIQADTRVLLCFASPGED GLKVMFRLSERCYDPDIYTLFYKAFARDFSLRYHLEQAIDNKTSDVARACFISIDRNAYF NPQCEAVNIKTFVNPDNPLNVADTRHELEQHQKMQKETFAATPEPRLKDPDAEVLQRIKQ QLNKDKAATKPISEAFVPERLNELVEPLKQHIQQTGLEVTEIANIQYAKKIKVRMGQKDA EVNLFYGKRGFSVVISPRRGTNEELNEVVAKLIEQFVNQ >gi|283510614|gb|ACQH01000005.1| GENE 9 6560 - 7540 1237 326 aa, chain - ## HITS:1 COG:alr0381 KEGG:ns NR:ns ## COG: alr0381 COG1518 # Protein_GI_number: 17227877 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Nostoc sp. PCC 7120 # 3 289 42 327 374 121 27.0 2e-27 MELILNTYGVSLNRDNQGFVITTADGRQRIPAAGIKSIQISKGAQITSDAVMLAVEQEIE VLFMDKAGNPIGRIWSPRYGSISTIRKGQLNFTFSSEAVEWIKGVIAQKIENQQALMLLF NTTDTPQVNVDKSIRRLEDYRNKVSAEQGDIVNDIAPTLRGWEGLASKIYFATINAFIPP QYRFESRSQHPATDVANALLNYGYGLLYGKIEGEMIKAGIDPYVGIMHRDDYNRPVLVYD VIELYRIWVDYVVYSLLAQNVVTDEYYSVRPDGSYWLEPLGRRVLIQSLNDYMDETVTIK GVTRSRHTSMKLYVQSLAQKFKTFDL >gi|283510614|gb|ACQH01000005.1| GENE 10 7568 - 9010 1404 480 aa, chain - ## HITS:1 COG:mll1421 KEGG:ns NR:ns ## COG: mll1421 COG0507 # Protein_GI_number: 13471448 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Mesorhizobium loti # 10 474 1 374 375 118 27.0 3e-26 MNNNANTPQITLTPTQQRVLDHMLRFIDSSQQVFILTGYAGTGKTTMMNAFVAELAKRDV QHILMASTGRAAKILSNRANCKATTVHSNIYTFNDFNRDIGKMAEQLEKQQLMESDGQLL LQFTPMSIEPTAQKHIYIIDEASMISDEEDRNPTQALFGSGRVLHDLLTYDPSGKFLFVG DECQLPPIGQPFSPALSPQYFREKFNIEAVHYQLTQIMRQQSDNDIIVAANKIRQLYQNP PRVKWGKFPLRNRKHVHLYPSQLQLLNAYIRNIKANGFNSATLICGTNRLCLSLTNLVRP ALGFNQSVLMPGELLLVTQNNCLSGLMNGDLVKVLQLGTRKRRAGLTFLEVEVEELFTHR VVKQMLIEDILYSSLPNLKQYDQQALFIDYYHRMKDRRISHKSQVFKDFMLTDPYLNAIR AVFGYALTCHKAQGGEWKDVYLDIPRKFSFETSKPQYQWVYTAVTRAADQLHIVDDFFIA >gi|283510614|gb|ACQH01000005.1| GENE 11 9361 - 11100 1265 
579 aa, chain - ## HITS:1 COG:no KEGG:Metvu_1220 NR:ns ## KEGG: Metvu_1220 # Name: not_defined # Def: CRISPR-associated protein, TM1812 family # Organism: M.vulcanius # Pathway: not_defined # 7 419 5 395 420 245 37.0 4e-63 MPRKVFISILGTGFYEECAYTKEIEEEHKPFKSSPTRFIQQASLELVGANGWTEEDRVII FLTKKARELNWNKGITERTPYGKNTPVPYKGLETVLQEMGLKATIEDKDIRDGMNEAEIW EVFQTIYDTLEEGDEVYMDLTHSFRFLPMLLLVLVNYAKFLKGITIENIVYGNYEARNIA ENEAPIISLLPLASLQDWTFAAANYLQNGRATQLQAMVSKELSPILKKAKGTDKDASALN NYFNALAELSGFVLNCRGKDIRVGKKIYTIHNEEKRIEKIIIPPMVPIFNKIHQPLLAFS PTDNILNGFHAARWCLNKQLHQQAITLLQETVVSLLCKENGLDLLDKNQRILINKAFTIV SDKIPESKWILSEDGAEASEKQKETIKHLIQHPVIIGLANTFKEITNIRNDFNHAGEDRG GARGVKSITSGIDKYLNITLDYLGISNSAATSPTQPQPQSALFINLSNHPSSTWQPAQLE AAKQYGEIIDIDFPAVDALCMPERVDQLASQYALDIINRGAPTCITAHVMGEMTLTFRIV ELLKAQGIRCVASTTERIVTNLPDNRKETQFTFVQFREY >gi|283510614|gb|ACQH01000005.1| GENE 12 11110 - 12216 1062 368 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927326|ref|ZP_06421173.1| ## NR: gi|288927326|ref|ZP_06421173.1| hypothetical protein HMPREF0670_00067 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 368 1 368 368 721 100.0 0 MSKIHIALVGGQPMPIHVGIEEFKADKLILVHSKESKELAEGIQQSVDLPCELQPFESVD YPKIYQAAEKLLGKCGQSETIINISGGTKPWTVAFVLLAQQCPNTSIIYVDQNNVGYNLT TSEKKSISSKLDTQTILRYNNYPLGSYANFKDITEADLKAAKDVEKIRKSNLKVFKELTI DNKEVEYKPTGRKETARGSYVEWDKKANTAYMVLKNKWGVNKLRCSSPHVIPIVFNSGWF EVQVAEMLSQWEKAQEVIVNATFDYSDKDTKNEIDVIVNMGNKLLFVECKTQIKNLTDLD KFTTAAKKYGGMGVKMLFVTKERMNNKAEEKCRENRIIPFSIQSGGLIDPQKALFALLDN EMSNINTK >gi|283510614|gb|ACQH01000005.1| GENE 13 12229 - 14022 1202 597 aa, chain - ## HITS:1 COG:no KEGG:Fisuc_1562 NR:ns ## KEGG: Fisuc_1562 # Name: not_defined # Def: hypothetical protein # Organism: F.succinogenes # Pathway: not_defined # 132 374 337 582 755 160 40.0 2e-37 MSNNNIIKSPYNFVPLSEEVYTPSWADLISQDVPFSDGVSGKIRLRITAETPIFIRNGQK QDKEKDRNKDGQTAKQEEEKKPQKFSQTPDGRFYIPATSIKGEVRNVLEIMSFGRMTVDE RAKFADRKGKIKKPFNNSVFDCLPKAHKDLQSLDLAECIFGHVKDKGMLKGRVQFGHAFS DNAKEEQPVRLTLSSPKASFYPIYIKQDNNIDKYKTYDDGQLAGWKRYVIRTGVCQNKTS TDNTDTTITPLKKGSVFTCEITYHNLLPIELGALLSALTFHNTPNCFHQLGQAKPYGYGK VKYDVDLISPEDKECSFFLEQFEKEMCEFKSNWLTSTEIQELIALVSHPVKPYENQFNYM DLKEFQNIKKNKTPFKPFSKIKKVTTSLQAIAQQEEQKKTARESELREQKRVEEINKLKK KLEERDKELCNEDESCSASQPSHIELLNKHIQECTDIREEEGNEDLKDIINKYLSKWKEE RSRLEKEIDEKRKVESDKNIFTDGFKAHLNKANSISTCFNQCDKWVRLAKKYENGRENLN EEELGALVQKLKELYKEASSKDKKDCNTKGGKFIKKFRDVIGDHNKTIELFNTITNQ >gi|283510614|gb|ACQH01000005.1| GENE 14 14030 - 14437 342 135 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927328|ref|ZP_06421175.1| ## NR: gi|288927328|ref|ZP_06421175.1| hypothetical protein HMPREF0670_00069 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 135 1 135 135 291 100.0 9e-78 MSKIEIKDIDFGLCYEGCLWMSNENKPRIFVPAAMIDRTLLEGLNPFVAEGYLYNKEQGV SISIKSPNGRPCAHRFLVNAADFHSKEVTPVEYPAHRMGGNRLLFLRYWTAKPDGACLDM PVLTLDKMVFVGFKK >gi|283510614|gb|ACQH01000005.1| GENE 15 14430 - 15926 1415 498 aa, chain - ## HITS:1 COG:no KEGG:Fisuc_1560 NR:ns ## KEGG: Fisuc_1560 # Name: not_defined # Def: protein of unknown function DUF324 # Organism: F.succinogenes # Pathway: not_defined # 7 496 9 467 467 320 39.0 1e-85 MQTTKYNHRLIARVTIEAETPLAIGSGKKSILTDATINRDANYLPFIPGTTLAGLIRHAI DEELADRLMGFIKKKNDKNGEYEVEGSRLIVTEAKLLNCKGKAIDGLLNLETACDDEDKA FLEDFKHAPIRQHAKINHRGITEDKGKFDEEVVPKGARFCFEMELIANPKSEEELAEYKQ NFKDLLGILVADGFRVGGKSRNGFGKIKVVGEACLYRELDLSLPADLDLYLKKSASLAEA WNGFEPLKLEKPQESRYTRYELKITPKDFLFFGSGFGNEDVDHSYIKERYITWDGEGNSG RWNSQDNSLVVPASSVKGALAHRTAYHYNKECGIFAENLSPEDFNKYVGKRNKAVFALFG SEGNEDETQPTEAPTGERTDGKRRGHVLFADIIRNKEEKTDKKIHNHVKIDRFTGGAIDG ALFDEEALIVHPDEPEEIEFELLVDVDELINEDQRIIQAFEKALKDVCEGMLPLGGNVNK GYGQFEGKLYKDGNCIYE >gi|283510614|gb|ACQH01000005.1| GENE 16 15938 - 17362 1134 474 aa, chain - ## HITS:1 COG:no KEGG:CFF8240_1676 NR:ns ## KEGG: CFF8240_1676 # Name: not_defined # Def: hypothetical protein # Organism: C.fetus # Pathway: not_defined # 1 443 1 422 451 193 32.0 1e-47 MKTIQLQCTLLSDVIISETSATTGNRHSLDFIPGNNFLGIAASKLYNEENEKTHLLFHSG KVRFGDAHPSLAGVRGVRIPAVVFRPKLKDGADEHYFHHLLPTELSKEMGEKQLKQCREG FYVFDGEQATKVKVDKTFALKSAYDSENRCSQDHQLFGYEAMNKGLILYFEVELDDDAAQ YAEELAAALCGTRHLGRSRTAQYGLVEITATSFEQLKSDAVVDCEEVLVYADGRLIFLDE HGLPTFQPTLQQLGFEGKGEILWHKSQVRTFSYAPWNFKRQAFDSQRCGIEKGSVFVVRA NAAPKKYAYVGDYQQEGFGRVVYNPEILSKNKADQEGKAKFKFEEQKKTQSEEPRKEVKE TSLLAYLKRAQEKDKRVFDVQKVVNKFVNKHADKFKGERFASQWGSIRSLAMVTSDAEQL KDRVNKFLDHGVAKGQWDENGRRECLEEVMNTPNICLQELLINLASEMAKECKD >gi|283510614|gb|ACQH01000005.1| GENE 17 17359 - 17913 589 184 aa, chain - ## HITS:1 COG:no KEGG:VVA1540 NR:ns ## KEGG: VVA1540 # Name: not_defined # Def: hypothetical protein # Organism: V.vulnificus_YJ016 # Pathway: not_defined # 1 181 1 194 200 87 33.0 2e-16 
MKNLKYSIQFYTNWHCGSGQAAGADVDALVIKDAQGLPFVPGKTIKGLVREALAYTLSQE NKSKGMLTNICGVFKDKDNCTKGSAFFTDATLHEDEKNVIVKDNLAGFLYQSVSATAICN DGIAKDNSLRKTQTTVPCTLYGQIMNLEDDEAHEVERALMMIKRLGAGRTRGYGRCKISM EEKQ >gi|283510614|gb|ACQH01000005.1| GENE 18 17917 - 19479 1684 520 aa, chain - ## HITS:1 COG:no KEGG:Fisuc_1557 NR:ns ## KEGG: Fisuc_1557 # Name: not_defined # Def: hypothetical protein # Organism: F.succinogenes # Pathway: not_defined # 1 518 1 501 505 354 42.0 5e-96 MEKFLYGASVQGIQSFIFQTNQLDDISGASELVESICTKAFEKAVGEGFNKKDNAVIMAA GNVKYVFDTQEQCENVVRNFPRKVQECAPGITFSQAVVKFDDEKNFGKAVDELEKKLKTQ RNLPQTPLEVGMLGCKRAAKTGMPVVKIENGEELDAATYAKRMARKDCGNKLHIKSFWGK RQDADKDLKLEDELDIKDLTGKNDWIAIVHIDGNGLGKVVQQLGKDRKKFYQFSTLLDQA TTKAANEAFKVIEETHGFSYNKLPICPVVLGGDDLTAIIRGDLAMPYAQKFITEFERETS QGEMKSLLSKANMTKLTACGGIAYIKASFPFYYGYKLAEELCHAAKTDAKAINKENVPSC LMFHKVQDSFVQSYKDIKARELELKPKDDGKDKTANNGQEEPAKIKKTLCFGPYYLDEQV GYHTISELEGMVNELGKTENEGLKTGVRQWLSLMHENEEAATQRLQRLESLTSNKKLLLN LTTPATRTHVNNAGKEEQEEHYAAYDVLAYYTINNQQTND >gi|283510614|gb|ACQH01000005.1| GENE 19 19499 - 20176 783 225 aa, chain - ## HITS:1 COG:no KEGG:Cthe_3205 NR:ns ## KEGG: Cthe_3205 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 4 218 3 214 222 106 34.0 8e-22 MNKLKTLTIQFETQLLQTDIQRFRGAIIKLLPQKDVLFHNHTKEGYRYAYPLIQYKRLNN KAALFCIGQGVENIGVFFASNNFNVRIGNKRHCLQIEEVNAKQVDIACEEGQVYAYQLTD WLPLNEENHTTFAQADNDEQRQEMLQRILTGNILSMLKGVGIRVEQRIEVCITGVKERPC VPFKGVKLYCANVDFVSNVRLPQHVGLGKHASVGFGTLTTTFNND Prediction of potential genes in microbial genomes Time: Sat May 28 00:02:01 2011 Seq name: gi|283510613|gb|ACQH01000006.1| Prevotella sp. oral taxon 317 str. F0108 cont2.6, whole genome shotgun sequence Length of sequence - 16798 bp Number of predicted genes - 10, with homology - 9 Number of transcription units - 8, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 3097 - 3137 5.1 1 1 Op 1 . - CDS 3185 - 4210 1494 ## COG0468 RecA/RadA recombinase 2 1 Op 2 . 
- CDS 4249 - 5487 1726 ## COG1748 Saccharopine dehydrogenase and related proteins 3 1 Op 3 . - CDS 5570 - 6172 671 ## PRU_0068 hypothetical protein - Prom 6251 - 6310 2.1 4 2 Tu 1 . - CDS 6673 - 7887 1379 ## PRU_1908 hypothetical protein 5 3 Tu 1 . - CDS 8494 - 10329 730 ## gi|288927338|ref|ZP_06421185.1| hypothetical protein HMPREF0670_00079 - Prom 10459 - 10518 7.2 + Prom 10078 - 10137 3.0 6 4 Tu 1 . + CDS 10233 - 10430 92 ## 7 5 Tu 1 . - CDS 11314 - 11718 238 ## Ppha_1247 filamentation induced by cAMP protein Fic - Prom 11790 - 11849 5.1 - Term 12002 - 12042 3.1 8 6 Tu 1 . - CDS 12067 - 13983 2477 ## COG0443 Molecular chaperone - Prom 14014 - 14073 3.6 + Prom 14379 - 14438 4.1 9 7 Tu 1 . + CDS 14526 - 15881 1448 ## COG1055 Na+/H+ antiporter NhaD and related arsenite permeases + Term 15922 - 15975 -0.7 10 8 Tu 1 . + CDS 16018 - 16593 516 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases Predicted protein(s) >gi|283510613|gb|ACQH01000006.1| GENE 1 3185 - 4210 1494 341 aa, chain - ## HITS:1 COG:mlr0030 KEGG:ns NR:ns ## COG: mlr0030 COG0468 # Protein_GI_number: 13470353 # Func_class: L Replication, recombination and repair # Function: RecA/RadA recombinase # Organism: Mesorhizobium loti # 4 334 3 336 365 444 68.0 1e-124 MAKEESKQLSPDEAKLKALQAAMSKIEKDFGKGSIMKMGDEQIENVEVIPTGSIGLDVAL GVGGYPRGRIIEIYGPESSGKTTLAIHAIAEAQKAGGIAAFIDAEHAFDRFYAEKLGVDV ANLWISQPDNGEQALEIADQLIRSSAIDILVVDSVAALTPKKEIEGDMGDNNVGLQARLM SQALRKLTSTISKTNTTCIFINQLREKIGVMFGNPETTTGGNALKFYASVRLDIRKVTSI KDGDNIVGNQVRVKVVKNKVAPPFRKTEFEITFGEGISKVGEILDLGVEYEIIKKSGSWF SYNEAKLGQGRDATKNLLKDNPELCEELEAKIMQAIKDKQD >gi|283510613|gb|ACQH01000006.1| GENE 2 4249 - 5487 1726 412 aa, chain - ## HITS:1 COG:DR1252 KEGG:ns NR:ns ## COG: DR1252 COG1748 # Protein_GI_number: 15806271 # Func_class: E Amino acid transport and metabolism # Function: Saccharopine dehydrogenase and related proteins # Organism: Deinococcus radiodurans # 1 410 1 402 405 502 58.0 1e-142 
MGKVLMIGAGGVATVAAFKIAQNKTHFGEFMIASRRKEKCDEIVDAIHAKGYDMDIKTAQ VDADDVEQLKALFNDFKPELVINLALPYQDLTIMEACLACGCNYLDTANYEPKDEAHFEY SWQWAYKQRFEEAGLTAILGCGFDPGVSGIYTAYAAKHHFDEIHYLDIVDCNAGNHHKAF ATNFNPEINIREITQKGLYYKDGEWLETEPLAVHQPITYPNIGPRESYLMHHEELESLVK NYPTIRQARFWMTFGQQYLTYLDVIQNLGMSRIDEIEYEAPLADGSGKHVKVNIVPLQFL KAVLPNPQDLGANYDGETSIGCRIRGVKDGKELTYYVYNNCKHQEAYKETGMQGVSYTTG VPAMIGAMMFFKGLWRKPGVWNVEEFDPDPFMEQLNIQGLPWHEEFGGDLEL >gi|283510613|gb|ACQH01000006.1| GENE 3 5570 - 6172 671 200 aa, chain - ## HITS:1 COG:no KEGG:PRU_0068 NR:ns ## KEGG: PRU_0068 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 9 197 11 199 201 259 62.0 3e-68 MEQAVKNNWSENVIVVDADYVDNVAFNLIVNFERIIGRRIPQADLARWLDCIALDGGVRN GENGLQVVMVHSKAKTQMENFTPANFADELNGKAFKDGLGEFAIGTYPVEDLVNATDFFL DVVQTVCAQSEVKRVMVVPNAENAELYEALRHQMHRLDDDTKRVTIFAMQPLPGGNFRQE ILGYSLMNALGIRSDEINAQ >gi|283510613|gb|ACQH01000006.1| GENE 4 6673 - 7887 1379 404 aa, chain - ## HITS:1 COG:no KEGG:PRU_1908 NR:ns ## KEGG: PRU_1908 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 3 404 1 420 420 423 51.0 1e-117 MKVNNLLATIALFLCLPLTAWAQGNNTGMGGEDDDTPSIAERVLKLEKKTDAFNLYLNFA AAARATEHNGTWKGAFVNRQLRLEIKGHITDKLFYRLRQHLNSGYEPQGEDNFAKATDMM MVGYTFNPKWEVQAGKIGQIWGGYEYDENPLFIYQYCDFVAHMECFVAGAQVTFKPIPTQ EFGLQITDSHNGTFEQEYAKDAKMAGPNFATDREQLTPSNIPLTYIANWNGNFFNDRLQT RWAWGLQTQAKGKYSRMLSLGQRLNLPKLQWYIDYYGAVDELDRLRIASGELTSLVNNDK DVFFGNVHYNALVTEANWQFAPRCNLKLRGSYETTSIPQYEALKNYRKYYSYVASLEYFP IKGQNLRLFLAYMGQKVDYNHACGLSDYHTNKLELGLMYRIKCY >gi|283510613|gb|ACQH01000006.1| GENE 5 8494 - 10329 730 611 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927338|ref|ZP_06421185.1| ## NR: gi|288927338|ref|ZP_06421185.1| hypothetical protein HMPREF0670_00079 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 611 226 836 836 1189 100.0 0 MVCRWQNSVPEDCFVSRLIGVARFAVKGNDFSLFQNVLYSLDNIIDTERSSCSREQKDDV IKESSVHYRTMRFFDELLITYKPIPNAFMQNESIIFKMVGAFDRTKYMSNADSFRLAICM RKMLDYGNEALLEKYIDYTQHYFKYLLYLPKVFYIKGGLTKNRDYVEKRSVNSWNYLCCF HYAVLAYAFDKCKYSLLKIHLEKNHYSDYNLYPKTGADVLIRYAYCVNNICFLNDNKLFE RRVDVKSLLRKYTAILLLLIPETEEIGYPMLENEPKKIIDTIDSSKSDLEKEMNIIKTNT TLLNLYPNLANTNSSKRFSYFLDVIKMFLTPSASREIEGNKECSTLSLVHSFIGRLLEKT TPNKTRTRRTNLYTQKLNKKITDDFASRFNQLNNDLSHFLPNGLFSSDSYGKNDEVFLNP CQLRIDKLYFLDMEYYSGGYDLYREYVEQFSTRLIYLALFSFRKMKLKEVNIKSPNFDIF FEKFTRGKRENFVLIGIDSPFEAILNITLNGRDMSYRKTTPYVNISSSRNPLLTDLDDYT YFKDSILVVAKKDLPKIVDIGENKDIMIDHKDCSDESNMQLSVRTTIDVHKKLIFNPNAK IAMVRLKRMNM >gi|283510613|gb|ACQH01000006.1| GENE 6 10233 - 10430 92 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKNHSPLLRNVQLLLALTQSSLLEHYFAIDIPYLLLIYKLYSQLCLYNGSNYEAEPCKAP IFVVA >gi|283510613|gb|ACQH01000006.1| GENE 7 11314 - 11718 238 134 aa, chain - ## HITS:1 COG:no KEGG:Ppha_1247 NR:ns ## KEGG: Ppha_1247 # Name: not_defined # Def: filamentation induced by cAMP protein Fic # Organism: P.phaeoclathratiforme # Pathway: not_defined # 9 130 5 126 293 140 56.0 1e-32 MAEHKEQNKKSIRFFNDREVRAVWDEKQNCWWFSATDIVRAINNEPDYTKAGNYWRWLKK KLKQEDIELVSATHGFKFEAPDGKLRVADVLNSEGVVLLAKNYPNNRANEFLDWFTYSDN TIDGQSKKKSLPTL >gi|283510613|gb|ACQH01000006.1| GENE 8 12067 - 13983 2477 638 aa, chain - ## HITS:1 COG:TP0216 KEGG:ns NR:ns ## COG: TP0216 COG0443 # Protein_GI_number: 15639209 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Treponema pallidum # 1 638 1 630 635 679 60.0 0 MGKIIGIDLGTTNSCVAVFEGNEPVVIANSEGKRTTPSVVGFVKDGERKVGDPAKRQAIT NPKNTVYSIKRFMGETYEQSRKEAEAMPYTVVDENGLPRVDIEGRKYTPQEISAMILQKM KKTAEDYLGQEVTDAVITVPAYFSDSQRQATKEAGQIAGLNVQRIVNEPTAAALAYGVDK GNKDMKIAVFDLGGGTFDISILEFGGGVFEVLSTNGDTHLGGDDFDQVIIKWLADGFKAD EGIDLTKDPMAMQRLKEAAEKAKIELSSTTSTEINLPYISAEGGVPKHLVKTLTRAQFEQ LAHNLIQACLVPCQNAIKDAKLSTSDIDEVILVGGSSRIPAVQTLVKNYFGKEPSKGVNP 
DEVVAVGAAIQGAILNKESGVGDIVLLDVTPLTLGIETMGGVMTKLIDANTTIPHKKSET FSTAVDNQTAVTIHVLQGERPMASQNKSIGQFNLEGIAPARRGVPQIEVTFDIDANGILN VSAKDKATGKEQKIRIEASSGLSKEEIERMKAEAEQNAAADKAEREKVDKLNQADSMIFT TENFLKDNGDKIPADQKPGIESALQQLKDAHKAADVAAIDAAINNLNSVMQAASAQMYQG AGGAQPDPNAGFQGAGGEQAQSDNTGDNVQDADFEEVK >gi|283510613|gb|ACQH01000006.1| GENE 9 14526 - 15881 1448 451 aa, chain + ## HITS:1 COG:CPn1015 KEGG:ns NR:ns ## COG: CPn1015 COG1055 # Protein_GI_number: 15618923 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/H+ antiporter NhaD and related arsenite permeases # Organism: Chlamydophila pneumoniae CWL029 # 4 450 2 418 420 282 40.0 9e-76 MTSLTLAIVVVFIMGYLCIALESLTKVNKAPVALLMCVACWTLFMVNPSEFVLPGMPELT GNAAGILAHVGESLREHLGETAETLFFLMGAMTIVEVVDTNGGFNFVRDSIQTHSKRGLL WRIAFMTFFLSAILDNLTTSIVMIMVLRKLVADKQDRMVYAALVIIAANSGGAFSPIGDV TTIMLWIAGSITTVGVITEILVPSLVSMLVPAFIMQYMLKGKLPAMVASAEDSTLAFTRV QRRIIFFLGVGGLMFVPIFRYLTDLPPYMGILLVLGVLWTATEIFHRRMHLGEDSMSARV IALLRKIDMGTILFFLGILMAVGCLAEIGVLTAMGRGLDSVSGGNHYLVTGIIGVLSSIV DNVPLVAGCMGMYPIAPTGDMAVDGIFWQLLAYCAGVGGSMLIIGSAAGVVVMGLEKITF GWYMKRITWVAFVGYLAGILVYWVERMLFFD >gi|283510613|gb|ACQH01000006.1| GENE 10 16018 - 16593 516 191 aa, chain + ## HITS:1 COG:lin0816 KEGG:ns NR:ns ## COG: lin0816 COG0454 # Protein_GI_number: 16799890 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Listeria innocua # 4 187 2 184 185 71 29.0 9e-13 MITIEQAQVSQAPDITRLIMEAMNHECCLYFAGEEHGLSGFHQLMTDLVCRDDSQYSYLN TLVALNAQREVIGVCTAYDGARLHQLRTAFVQGCLARFNRDFGNMDDETAAGELYVDSLA VDAHYRGKGIAKALLRATVQRARLLQLPAVGLLVDKGNPKAERLYASVGFQYVGDNQWGG HGMRHLQYVLG Prediction of potential genes in microbial genomes Time: Sat May 28 00:02:49 2011 Seq name: gi|283510612|gb|ACQH01000007.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.7, whole genome shotgun sequence Length of sequence - 71960 bp Number of predicted genes - 50, with homology - 48 Number of transcription units - 34, operones - 12 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 129 - 1145 927 ## PRU_2879 hypothetical protein - Prom 1168 - 1227 2.0 2 2 Tu 1 . - CDS 1437 - 2255 1019 ## COG3950 Predicted ATP-binding protein involved in virulence - Prom 2412 - 2471 3.9 + Prom 2376 - 2435 3.2 3 3 Tu 1 . + CDS 2469 - 3491 795 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Term 3471 - 3511 -0.6 4 4 Tu 1 . - CDS 3585 - 4550 750 ## COG1397 ADP-ribosylglycohydrolase - Term 5229 - 5288 2.4 5 5 Tu 1 . - CDS 5294 - 5479 57 ## - Prom 5647 - 5706 3.0 + Prom 5577 - 5636 2.1 6 6 Tu 1 . + CDS 5666 - 6613 821 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 7 7 Tu 1 . - CDS 6870 - 8273 1562 ## COG3579 Aminopeptidase C + Prom 8507 - 8566 4.1 8 8 Tu 1 . + CDS 8713 - 10728 1674 ## PRU_1115 hypothetical protein - Term 10592 - 10625 1.2 9 9 Op 1 . - CDS 10737 - 11303 240 ## PROTEIN SUPPORTED gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 10 9 Op 2 . - CDS 11334 - 12167 964 ## PRU_1992 hypothetical protein - Prom 12364 - 12423 4.3 + Prom 12237 - 12296 2.0 11 10 Tu 1 . + CDS 12408 - 13832 1298 ## PRU_1990 putative helicase + Term 13883 - 13921 0.0 12 11 Tu 1 . - CDS 14703 - 15215 600 ## COG0406 Fructose-2,6-bisphosphatase - Prom 15397 - 15456 2.5 + Prom 15400 - 15459 6.5 13 12 Op 1 . + CDS 15585 - 16358 677 ## PRU_1342 hypothetical protein 14 12 Op 2 . + CDS 16382 - 17281 578 ## COG0702 Predicted nucleoside-diphosphate-sugar epimerases + Term 17527 - 17562 -0.7 + Prom 17917 - 17976 4.6 15 13 Op 1 . + CDS 18042 - 18707 196 ## gi|288927356|ref|ZP_06421203.1| hypothetical protein HMPREF0670_00097 16 13 Op 2 . + CDS 18716 - 19222 245 ## gi|288927357|ref|ZP_06421204.1| hypothetical protein HMPREF0670_00098 + Prom 19526 - 19585 3.0 17 14 Op 1 . 
+ CDS 19830 - 20387 192 ## gi|288927358|ref|ZP_06421205.1| tetratricopeptide repeat protein 18 14 Op 2 . + CDS 20392 - 21825 135 ## gi|288927359|ref|ZP_06421206.1| hypothetical protein HMPREF0670_00100 + Prom 21864 - 21923 5.1 19 15 Tu 1 . + CDS 22037 - 22753 155 ## gi|288927360|ref|ZP_06421207.1| leucine Rich Repeat domain-containing protein - Term 22964 - 23034 23.8 20 16 Op 1 . - CDS 23074 - 24915 1797 ## COG3083 Predicted hydrolase of alkaline phosphatase superfamily - Prom 24995 - 25054 3.4 21 16 Op 2 . - CDS 25147 - 27768 2905 ## COG0013 Alanyl-tRNA synthetase - Prom 27952 - 28011 5.0 + Prom 27877 - 27936 6.3 22 17 Tu 1 . + CDS 28182 - 28532 396 ## COG0789 Predicted transcriptional regulators + Term 28607 - 28657 5.3 - Term 28594 - 28645 5.5 23 18 Tu 1 . - CDS 28711 - 29664 720 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 - Prom 29823 - 29882 4.1 - Term 30075 - 30116 -0.0 24 19 Tu 1 . - CDS 30123 - 30671 806 ## COG0386 Glutathione peroxidase - Prom 30692 - 30751 5.3 + Prom 30690 - 30749 4.6 25 20 Tu 1 . + CDS 30923 - 31471 683 ## gi|288927367|ref|ZP_06421214.1| hypothetical protein HMPREF0670_00108 + Term 31596 - 31633 6.1 26 21 Tu 1 . - CDS 33503 - 34399 1024 ## COG1864 DNA/RNA endonuclease G, NUC1 - Prom 34447 - 34506 2.1 - Term 34470 - 34531 14.2 27 22 Op 1 . - CDS 34566 - 34817 387 ## PROTEIN SUPPORTED gi|60280046|gb|AAX16386.1| 50S ribosomal protein L31 type B - Prom 34846 - 34905 3.1 28 22 Op 2 . - CDS 35111 - 36490 1312 ## gi|288927370|ref|ZP_06421217.1| lipoprotein - Prom 36525 - 36584 1.9 29 23 Tu 1 . - CDS 36626 - 37468 1020 ## BT_3561 hypothetical protein 30 24 Tu 1 . - CDS 37601 - 40141 2195 ## BT_3560 hypothetical protein - Prom 40171 - 40230 3.3 - Term 40469 - 40508 2.5 31 25 Op 1 . - CDS 40512 - 41294 853 ## COG1402 Uncharacterized protein, putative amidase 32 25 Op 2 . 
- CDS 41304 - 42491 1198 ## COG2942 N-acyl-D-glucosamine 2-epimerase 33 25 Op 3 4/0.000 - CDS 42518 - 43438 1159 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase - Prom 43520 - 43579 4.1 34 25 Op 4 . - CDS 43607 - 44842 1133 ## COG0477 Permeases of the major facilitator superfamily - Prom 44931 - 44990 3.1 - Term 45014 - 45050 -0.3 35 26 Tu 1 . - CDS 45063 - 46109 1034 ## COG3055 Uncharacterized protein conserved in bacteria - Prom 46172 - 46231 4.3 - Term 47119 - 47167 1.3 36 27 Op 1 . - CDS 47190 - 47666 133 ## EpC_08520 conserved uncharacterized protein 37 27 Op 2 . - CDS 47717 - 48478 159 ## Fjoh_3235 abortive infection protein - Prom 48542 - 48601 7.9 + Prom 48138 - 48197 2.7 38 28 Op 1 . + CDS 48417 - 48635 63 ## 39 28 Op 2 . + CDS 48667 - 49398 600 ## PRU_0815 hypothetical protein - Term 49689 - 49722 -0.2 40 29 Tu 1 . - CDS 49730 - 52009 2133 ## PRU_1227 putative thiol protease/hemagglutinin PrtT - Prom 52228 - 52287 3.1 - Term 52132 - 52172 -0.5 41 30 Op 1 . - CDS 52316 - 53578 583 ## COG1194 A/G-specific DNA glycosylase 42 30 Op 2 . - CDS 53592 - 54767 1194 ## gi|288927383|ref|ZP_06421230.1| tetratricopeptide repeat protein - Prom 55001 - 55060 2.0 - Term 55020 - 55078 8.7 43 31 Op 1 . - CDS 55101 - 56267 1074 ## COG0668 Small-conductance mechanosensitive channel 44 31 Op 2 . - CDS 56312 - 57595 1337 ## COG2027 D-alanyl-D-alanine carboxypeptidase (penicillin-binding protein 4) 45 31 Op 3 . - CDS 57595 - 57996 344 ## BVU_1111 hypothetical protein - Prom 58188 - 58247 4.5 46 32 Tu 1 . - CDS 59409 - 63383 1946 ## gi|288927387|ref|ZP_06421234.1| hypothetical protein HMPREF0670_00128 - Prom 63515 - 63574 3.9 47 33 Tu 1 . + CDS 63847 - 64572 579 ## gi|288927388|ref|ZP_06421235.1| hypothetical protein HMPREF0670_00129 + Term 64641 - 64668 1.5 + Prom 64615 - 64674 4.7 48 34 Op 1 . + CDS 64725 - 65864 667 ## gi|288927389|ref|ZP_06421236.1| hypothetical protein HMPREF0670_00130 49 34 Op 2 . 
+ CDS 65861 - 67666 1541 ## gi|288927390|ref|ZP_06421237.1| hypothetical protein HMPREF0670_00131 50 34 Op 3 . + CDS 67700 - 71960 3029 ## Hoch_2925 YD repeat protein Predicted protein(s) >gi|283510612|gb|ACQH01000007.1| GENE 1 129 - 1145 927 338 aa, chain - ## HITS:1 COG:no KEGG:PRU_2879 NR:ns ## KEGG: PRU_2879 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 329 1 329 332 372 54.0 1e-101 MPKGLKEHLSSSYFEAANRMTPKRARRRIVAYVESYDDIFFWRTVLSSFENANRYFEVML PSKENKLERGKKAALMSLIGEKVGNDMIACVDADYDYLLQGASPLSKAILNNPFVFHTYA YSIENLQCFAGGLHEVCVAATLNDHFVFDFPDFFARYSRIVFPLFVWNIWHYRTGRYGEF TITDFNRMIDFGRFSFHRTEEQLGKLQSRVDHKLRQLRQRHPNAKASWIEVKADLLRLGV TPETTYLYIQGHHLADTLVIPMLMRVCEYLVREREMEIRQQSVHSTQMRNELASYAHSVA DMAPMLKKSKVYVDAAPFLQIKKDVEEFLEKSEKQVTS >gi|283510612|gb|ACQH01000007.1| GENE 2 1437 - 2255 1019 272 aa, chain - ## HITS:1 COG:STM2746 KEGG:ns NR:ns ## COG: STM2746 COG3950 # Protein_GI_number: 16766058 # Func_class: R General function prediction only # Function: Predicted ATP-binding protein involved in virulence # Organism: Salmonella typhimurium LT2 # 171 259 316 410 427 65 35.0 2e-10 MEKYADYIKEIEIDSLWSGTRHIVWRLHKQVNVLSGVNGMGKSTILNRVVKCLPKGGDTT QSHVKGVHITMEPEDATCIRYDIIRSFDRPLLNAEIIAKLGLSITTELDFQLYQLQRRYL DYQVDIGNRIIHALQSGAPDAARVAQAHNEPKRMFQDMIDRLFAETGKTIIRTENEVRFS QIGETLQPYQLSSGEKQLLVVLLTVLVEDRLPFVLFMDEPEVSMHIEWQKQLIDLILQLN PNVQIILTTHSPAVIMDGWMDRVTEVSDITVE >gi|283510612|gb|ACQH01000007.1| GENE 3 2469 - 3491 795 340 aa, chain + ## HITS:1 COG:MA3243 KEGG:ns NR:ns ## COG: MA3243 COG0697 # Protein_GI_number: 20092059 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Methanosarcina acetivorans str.C2A # 43 296 1 256 291 65 24.0 1e-10 MDKAHIGESSSTSPSEQTHSVMQDNAQKDIGQPPSSNGGAPPLQAHAAMFLACLFWGLMA PLGKDAMNHGIDGITMVSFRVAGGAALFWLTSLFTRKEHVPRKDVLRFAGAAVFGLVCNQ 
CCFTIGLSLTSPINASIVTTSMPIFAMVFSAIILKEPITGKKATGVLMGCSGALILILTS AAATNRVQGDIRGDLLCLAAQFSYAFYLSLFNPLVRRYSVITVSKWMFFWATLLLLPFTG AHVAALPWAQIGGITWLETAYVVVICTFVCYCLIMVGQKTLRPTVVSIYNYVQPIVSVTV SVLMGIGVLKWAQGLAIILVFGGVWLVTKSKSKRDMEKGG >gi|283510612|gb|ACQH01000007.1| GENE 4 3585 - 4550 750 321 aa, chain - ## HITS:1 COG:MJ1187 KEGG:ns NR:ns ## COG: MJ1187 COG1397 # Protein_GI_number: 15669377 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ADP-ribosylglycohydrolase # Organism: Methanococcus jannaschii # 17 306 6 286 301 95 25.0 1e-19 MKNHGLEQANNNETLVDRIKGTIYGQAIGDALGLGTEGMTDEEMALNYPNSITHYGDIFQ DHHRKRWKIGDWTDDTDMMLCIANAVIKDEGVNLTTIAQNFKHWADGCPMGIGETTHKVL LFGDYVEKPFEVSKKIWEMSHCRAAANGGLMRTSVVGLFPKAVEVCAANICRLTHYDPRC VGSCVIVSLLIHSLVYEGKGLSYHQIIDLARKYDNRIIEYVDLSMDTDIRVLELQDKSSV GYTLRCLAAGLWAYWHAKSFADGLLAIVRAGGDADTNAAVACAILGAKFGYGTIPTEYID GLIYKKQLDDIIKGISSLMLS >gi|283510612|gb|ACQH01000007.1| GENE 5 5294 - 5479 57 61 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSGVFGHQPPTRPQKSNSREGLFGCKKGTVVDADKMNTHCIKPQFAPYFVLFAAKCSAFW C >gi|283510612|gb|ACQH01000007.1| GENE 6 5666 - 6613 821 315 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 4 311 3 307 308 320 53 1e-86 MAHIAKQLTELIGGTPLLELGKFSVSNNLQTPIVAKIESFNPGGSVKDRIALAMIEDAER SGVLKPGATIIEPTSGNTGVGLALVAAVKGYQLILTMPETMSVERRNLVKAYGAQVKLTP GKDGMKGAIEMAQQLREEIPGAVILQQFENLANPERHYETTAREIWADTSGQVDVFVAGV GTGGTVSGVGKYLKEKNPNVRIVAVEPADSPVLSGGKPGPHKIQGIGAGFVPKTYNGNVV DEVVQVSNDDAIRTGRQLAQQEGVLAGISSGAAAFAAAQLAKRPENAGKRIVTLLPDTGE RYLSTVLYAFEEYPL >gi|283510612|gb|ACQH01000007.1| GENE 7 6870 - 8273 1562 467 aa, chain - ## HITS:1 COG:SPy1651 KEGG:ns NR:ns ## COG: SPy1651 COG3579 # Protein_GI_number: 15675522 # Func_class: E Amino acid transport and metabolism # Function: Aminopeptidase C # Organism: Streptococcus pyogenes M1 GAS # 30 464 11 441 445 291 37.0 2e-78 MNNKTLIVGACLAVVGSLNAQTKDGGISLQMLQQIQQNGKQTTAERAIANAIATNSIDEL 
ARTPQSRADVDAYFSVETPKQNIHNQKSSGRCWMFTGLNVLRADFALRSDSLTVEYSHAY LFFYDQLEKANLMLQSVVDCAKKPINDPEVQFFFKHPLNDGGTFCGVADLAEKYGLVPKS VMPETYSSENSSRMSKLISSKLREFGLELRKMVAAKAKPTQIQERKTQMLATVYKMLALT LGEPVKAFQYAFKNKDGKAITPVETYTPKQFYDKTVGRSLNDNIIMVMNDPRNAYYKVYE VKHDRHTYDGHNWRYLNLPMEEIHKMAIASLKAGHKMYSSYDVGKQLDRSKGYLDVNNFD YGPLFGTTFPMNKADRIATFDSGSTHAMTLTAVDLDSQGKALKWEVENSWGDDNGHKGYL IMSNNWFNDFFFRLVVDKQYVSADILKMAAQAPIMLSYDDPVFADDL >gi|283510612|gb|ACQH01000007.1| GENE 8 8713 - 10728 1674 671 aa, chain + ## HITS:1 COG:no KEGG:PRU_1115 NR:ns ## KEGG: PRU_1115 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 3 671 17 686 687 451 37.0 1e-125 MRAQVATPTENADSVTSEKTLAEVNVIAKHLTREADRIVAMPTTEQRKHAHTGYDLVRNL MIPGVSVDRKTFKVNTPAGGATLYIDGREVDVREVRALRPRDVVKVEYIDMPTGKYAKDV AALNFVTKRLTHGGYTQIDALQGLGFLQGDYNAVSKFSVKNTNFNAWAGYAIQQPRNHII ESEHFSLASGDVDRMGVLPERSTLSINKYAQCSVSRMKDRATWMIKGGLNYMRTFENKEG EWYYTGPITGRLATLQSGGERSLRPELFLYGTWKIGENQSIETVYDGYYSRNHYLNNNVE GGDTYKSDVTENYLYSRLGVYYNLKLKHGNSLSFSLSEYLRLSKSHYHNLLPVWQRLRSS ETILFVDYTQRVGRLFFDFKPGLSYLNYRLKGYDGITHVTPRLSFSVANRLSDVQLLRAE LKLGNTYPTLNTVNYVTQQVDRVMLRRGNPDMHNSILYSPRLTYGLTFSRLAMNVSASYL FANHIITNDYHAEGDKMVNTFRSDTRYHAPSCSFSTTYRPSKMFNVKVDANWERNVLRGG TQLQHTKWSAALEANCYVNDFAFSASVKTPERQLVDNHKLVKTGWLYDIALSWNHNNIGV ELNCSNLFIRRNVTTTDLFAGVYAYHGERQSDWHNQYATLKMIWSVDYGKKTSKSVRNVR KEAESAIMKAE >gi|283510612|gb|ACQH01000007.1| GENE 9 10737 - 11303 240 188 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 [Bacillus selenitireducens MLS10] # 1 188 7 199 199 97 32 2e-19 QTNPPIMRIITGLYKGRHFDIPRSFKARPTTDFAKENIFNVLNAYVDFEDATALDLFAGT GSISLELLSRGCANVVSVETDREHANFIRQCTTKLGTDKSLLIRGDVFRFIKSCRQQFDF IFADPPYALPELEKIPDLIFQHQLLKPDGVFVFEHGKGNDFSQHAHFVEHRAYGSVNFSL FKAGEESE >gi|283510612|gb|ACQH01000007.1| GENE 10 11334 - 12167 964 277 aa, chain - ## HITS:1 COG:no KEGG:PRU_1992 NR:ns ## KEGG: PRU_1992 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 16 
271 7 265 266 228 44.0 2e-58 MQETAPDLHIQQTRLVIRATRNTLSFAAVDPTVQTNLFFEPYTVRSGISVAANLREAFKT STLLLRGYKRANLLVDTPVMLVPIDEFQEEDVRQLYHRTMVLADADTMMHHVLPDLNAVA AFAVNKDLKLVVDDHFADVRITPLMKGVWTHLHRRSFTGSRRKLYAYFHDKRLEVFSFGK NRFKFSNAFDTNSSRDAVYFILYVWKQLGFDSMQDELHLSGDIPDKEWTKTALLEFVKKT YVANPVADFNRASITAIEGLPYDLMTIFVRGVKEGVR >gi|283510612|gb|ACQH01000007.1| GENE 11 12408 - 13832 1298 474 aa, chain + ## HITS:1 COG:no KEGG:PRU_1990 NR:ns ## KEGG: PRU_1990 # Name: not_defined # Def: putative helicase # Organism: P.ruminicola # Pathway: not_defined # 15 474 2 455 461 602 62.0 1e-170 MITDEITLRLTQCFGHTPTDEQSSAIRLFARFLAHRNPHSLMLMRGSAGTGKSSLAGAFV RALGLLGQKTVLMAPTGRAAKVFSLNSGGHAAFTIHRKIYRLRAFAGVGGEYNLNDNLHA DTLFMVDEASMVANGGGAEVAFGSGRLLDDLVRYVYAGRNCRLVLIGDQAQLPPVGEEQS PALDTQFMQGYGMEVFACDLNEVVRQQSASGILFNATCLRQLMGRDEDTLLPRIRLEGFA DLQIVPGDELIETLNTSYAEVGMDETIVVTRSNKRANVFNQGIRNTVLGREELLTTGDLL MVVKNNYHWMEKERSTIGFLANGDRARVLRVRNEQRAHGLRFADVWLQLTDYENEELQVT VVLDSLLSESPALTAQQNEALYNNVLNDYADIPLKADRMKMVKNDAFFNALQVKFAYAVT CHKAQGGQWEHVFLDQGYVPEDTSRSDYLHWLYTAFTRAKSKLFLVNWPKNQVE >gi|283510612|gb|ACQH01000007.1| GENE 12 14703 - 15215 600 170 aa, chain - ## HITS:1 COG:L1889160 KEGG:ns NR:ns ## COG: L1889160 COG0406 # Protein_GI_number: 15673802 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Lactococcus lactis # 4 141 3 150 174 80 36.0 2e-15 MTTLLLVRHGETVANANQILQGQTQGELNETGVAQAEKLARELSDTPIDAFISSDLQRAE HTCSIIAAPHNKPFCTTPLLRERDWGGFTGAFIPSLKDKVWPEDVESLTAIKHRARLFLD MIAQRYPNQTVLAVGHGIINKAIQSVFYNKEMHEVEKMGNAEVRVLKVKR >gi|283510612|gb|ACQH01000007.1| GENE 13 15585 - 16358 677 257 aa, chain + ## HITS:1 COG:no KEGG:PRU_1342 NR:ns ## KEGG: PRU_1342 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 198 1 199 215 155 39.0 1e-36 MKTVQTAFIRAIIAVVMGLLLIKYREDTVTWITIIIGVLFIISGLISCIVYLVNRNAKPT AAVDVEGRPISLNKPTFPVVGIGSLVLGVVLAAFPNLVVNWMVYIFGAILIFGAVGQYVT LASVVKLSKLSLYFWLMPSFVFVVGLIALFKPSWIASAPLLFLGWAMVFYGVVECANAFK 
IMNIHRQVARIEAQQQAECEAMEADAVEVGTQDETETSSDLSGLSKQSDLSDQSTTPTTP INNTPEGQDNPNLGAGI >gi|283510612|gb|ACQH01000007.1| GENE 14 16382 - 17281 578 299 aa, chain + ## HITS:1 COG:all5305 KEGG:ns NR:ns ## COG: all5305 COG0702 # Protein_GI_number: 17232797 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate-sugar epimerases # Organism: Nostoc sp. PCC 7120 # 19 292 3 287 291 103 28.0 4e-22 MVEQADNTPLQAAEGCRVLLAGATGYLGSFVLRELQRRNYSTRVIVRNPSRMQSVSPNVD VRVGEVTQADTLKGVCEDIDVVISTVGITRQKDGMTYMDVDFQANANLVDEAKRSGVKRF IYVSVFNGANMRHLKICEAKERLGDYLKNSGLDYCIVRPTGFFSDMRDFLKMAKGGSVWL FGDGMLRMNPIHGADLARAVVDALHSQQHELNIGGPDVLTHNAIAELALRAYGHQPRVRH LPDIVRRSTLFLLRLFTPAKTYGPLEFFLTAMAMNMQAPTYGEEKLEDFFKREVERERK >gi|283510612|gb|ACQH01000007.1| GENE 15 18042 - 18707 196 221 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927356|ref|ZP_06421203.1| ## NR: gi|288927356|ref|ZP_06421203.1| hypothetical protein HMPREF0670_00097 [Prevotella sp. oral taxon 317 str. F0108] # 1 221 24 244 244 427 100.0 1e-118 MGKLPMFSQYVELPVSDIYDTDVMMAHMNAVRDANAVDRRLADEMKPIVTDLFEKYNRGE YRECKNRIDDIFSHIRFYKRQYWIYSPLYYLRGMSLMELGDKYDGIKDLVYAKEANNSDA TEALCNYFLQFCNNANQYFQTDRLNDCLHEIQLALSTSYCSAQIYMVAGKVYEKWNLFDT ARDYYKLAKKRGSTKASKLLKELRQHRKKYEKEIRNANLSR >gi|283510612|gb|ACQH01000007.1| GENE 16 18716 - 19222 245 168 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927357|ref|ZP_06421204.1| ## NR: gi|288927357|ref|ZP_06421204.1| hypothetical protein HMPREF0670_00098 [Prevotella sp. oral taxon 317 str. F0108] # 1 168 1 168 168 327 100.0 2e-88 MKKILITFFSIICLCDVSAQSLDDLESNPSFKGIDIGMPITAILGKVKYVKQVDAQTIYT IDDPSYNSVFGIGVDYVNVAVQGGRIYAIIAVKEVLNSSVVFNTSELDAIEAGLTRHYGK PTNFVGDGKHFGVQWISRTKRVDNIMTFFGTGVGYRILFAISENKEDY >gi|283510612|gb|ACQH01000007.1| GENE 17 19830 - 20387 192 185 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927358|ref|ZP_06421205.1| ## NR: gi|288927358|ref|ZP_06421205.1| tetratricopeptide repeat protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 185 201 385 385 392 100.0 1e-108 MKDFNKSLACGNSWIALYGSSWGQHVSDVYYYLGVSYMELKDFHEADKRFAEAIRLVSTA PILSAPLDECLTLSQYYSQKACNYLMGKAYSLSVEAFDIAIQYRMRGLGFTMDDLTGGKI KDPSIGRWLYSVSQIYVVFEHNEGAAERYVLLSAVCGYPDALTCCENIPKLKYMLDANKK CINEK >gi|283510612|gb|ACQH01000007.1| GENE 18 20392 - 21825 135 477 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927359|ref|ZP_06421206.1| ## NR: gi|288927359|ref|ZP_06421206.1| hypothetical protein HMPREF0670_00100 [Prevotella sp. oral taxon 317 str. F0108] # 1 477 6 482 482 978 100.0 0 MPFNRIYTFFSLLCVVLCVTSGCTDRKARVSFMKADFDNIYLTYTDTVPITLYFDGLDSM NVAGNDTFSLKDVMDRLSIEKGKRYAIAYDFCQKGYYALHIKANNKMGELLADTMLKITF PTVTGKCVEEPIVVRGDLCPYILQNSYNGDRKIPIRNWLFDADYPLDSLLISKMEGAVNY LCGIAYPTYCLSQEHNIPRLTSIQNSPVSVRTNMQADRYFLFAASSASALLEKIKQVVRN NHVKGLFKSRDVLTGIPLNSVGIGDVVSIFLVGINDDDTYSILPVGAYIVDKSAPISLRG RTWNYAGVNYLPSSSSPLIGTQRFSSLFFDFRNIGFIVRHEVDLQIDGCVVLSHGNFGGD MVDGVDIPFTLNSRGDIKRIEIFRTKRNEWGVRPGKRTIEVNGQTLSNYHFDYTLLLRFG DNYVPVIVTDMRGNKTKMDYYIPVARVSNNSDEDEIENLQSQYDDLEGRISELERNN >gi|283510612|gb|ACQH01000007.1| GENE 19 22037 - 22753 155 238 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927360|ref|ZP_06421207.1| ## NR: gi|288927360|ref|ZP_06421207.1| leucine Rich Repeat domain-containing protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 238 1 238 238 414 100.0 1e-114 MRKILFIISIALVVLFTRCKHSDVKPQTNKKPVAAIEAEASEPTEEEIEQEKKREEERER AAYRAADTTGIAPNSPYWYDPAISEPQFSEDGDTMKYFPRKRKGGHYVVPEGVTYLQERV FQCCVKIRSVEFPKSLIHMEMAVFDGCPKLRKVVFKSPIREVPFRGFTFCRRLRELHLVD RRPPVSPYEKDENVDEEEWFYTFGGVNAKKCVIFVPKGCAKRYKRHRLWRKFKHIVEE >gi|283510612|gb|ACQH01000007.1| GENE 20 23074 - 24915 1797 613 aa, chain - ## HITS:1 COG:VC2600 KEGG:ns NR:ns ## COG: VC2600 COG3083 # Protein_GI_number: 15642595 # Func_class: R General function prediction only # Function: Predicted hydrolase of alkaline phosphatase superfamily # Organism: Vibrio cholerae # 47 613 70 622 622 331 32.0 3e-90 MQNQTLRTALKRGFLFFSATVLLLTLQFLFVLFTSNTLEIMSPLGYLFFFASCLSHAACL TLPPYLIYMLIACTGCVRIARVVHFVLVALLVLLVFLDAQVYAIYRFHINGFVLNMVFGP GAGEIFTFDTWLYLKEVGLFLLLLGMVYGAYWLSGWVWRKRGKAYVAATVCALVGSTLFA HLTHIYGAFMSQASVVHSAKLLPYYFPTTAYSFMTNLGFKAPVDTRMLQKGGNGTLAYPI HALQTERPDSLPNIVLVLLDSWNKRSLTPQCMPNTYRWAQQQQWFDNHLSASNGTRSAVF GLFFGLTCYYWEDFEAARVSPVFIDRLQQLGYDIRVYPSAQFYNPNFAKVVFGKVKGVRT ETAGNTALERDQRICADFIGELPERLKSKRPLFSFVFFDLPHSFELDAKHNVPFAPAWPY ADYTKLNNDMDPTPFFNLYLNTCHQDDILLGKIFQTLEQRGILDNTIVILSGDHGQEFNE NKKNYWGHNGNFSVWQIGVPLICHFPGEKPQKYSHRTTHYDIVPTLMHNYLGVKSPVSDY SMGHLLTDNSSRKWHIVGSNLNYAFIIGGDSILEKNAEGGLDVYDARMNPVSNYRLDTRA FNKAMDKLNKFYK >gi|283510612|gb|ACQH01000007.1| GENE 21 25147 - 27768 2905 873 aa, chain - ## HITS:1 COG:ZalaS KEGG:ns NR:ns ## COG: ZalaS COG0013 # Protein_GI_number: 15803211 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Alanyl-tRNA synthetase # Organism: Escherichia coli O157:H7 EDL933 # 6 871 7 871 878 634 42.0 0 MMTANEIRDSFKQFFEGKGHKIEASAPMVIKDDPTLMFTNAGMNQWKDIILGTRDPEPRR RADSQKCLRVSGKHNDLEEVGHDTYHHTMFEMLGNWSFGDYFKREAIDYAWEYLVDVLHL NPQDLYVTVFEGSEEEGIPRDDEAAEYWSKHLPADHIINGNKHDNFWEMGETGPCGPCSE IHLDSRSAKEKAEVPGASLVNKDNPQVIEIWNIVFMQFNRKSDGSLQPLPMHVIDTGMGF ERLVRSLQGKTSNYDTDVFQPVIQEISQLSGLKYGEDEKVDVAMRVIADHLRAVAFSIAD GQLPGNAKAGYVIRRILRRAVRYAYTFLGQRSAFMFKLLPTFIHEMGEAYPELKAQRELI 
GRVMKEEEDAFLRTLEKGISMLNDEMERLKAEGKTTLDGTQAFRLFDTYGFPLDLTELIC RENGLQVDAAQFDVEMQKQKERARNAAAVENSDWVVLREGEQNFVGYDYTEYECRILRYR QVTQKKNTYFELVLDNTPFYGEMGGQVGDCGVLVNGEETVDIIDTKRENNQSIHIVKALP KDPKADFMACVDTDKREASAANHTATHLLDYALKAVLGEHVEQKGSLVAPDTLRFDFSHF QKVTDEELREVERLVNDLIRQDLPLDEHRNTPLEEAKAMGAVALFGEKYGDTVRVVRFGP SCEFCGGIHARSTGRIGMFKIVSESSVASGVRRVEALTGKRCEEAMYALEDTIRGIRNLF NNAKDLQGVIAKYMEEHDAMRKEIEKFSAQAVERLKDSLVANAKDVNGLKVVKAVLPINA EQAKNLVFKVREAIPQHLVCVVGSTANDKPLLSIMFSDDVVSEHGLNAGQIIREAAKLIQ GGGGGQPHYASAGGKNLDGISVAVDKAVELACQ >gi|283510612|gb|ACQH01000007.1| GENE 22 28182 - 28532 396 116 aa, chain + ## HITS:1 COG:RSc1584 KEGG:ns NR:ns ## COG: RSc1584 COG0789 # Protein_GI_number: 17546303 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Ralstonia solanacearum # 10 88 18 95 146 59 40.0 1e-09 MVLKTDRNLKLYYSIKEVSDIIGVNESTLRYWETELPQLKPRTAMGSKVRQYTERDIDLL KNIYTLVKVRGFKIAAARKMINENREGANKSTHVLNTLLSVRDDLRALKKELDGLV >gi|283510612|gb|ACQH01000007.1| GENE 23 28711 - 29664 720 317 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 10 313 5 306 306 281 48 5e-75 MEKREYTRCLIVGSGPAGYTAAIYAGRANLSPVQYCGMQTGGQLTQTTEIENFPGYPKGV DGNQMMVELREQAARFGADIRDGEITKVDFSSKPYVVTTDRGVEIEAETVIIATGASAKY LGLPDEEKYSGQGVSACATCDGFFYRNKTVAVVGGGDTACEEASYLAGLCSKVYMIVRKP FLRASDVMKKRVASNPKIEILYEHNTLGLYGEDGLQGAHLVKRKGESDEQRYDLPIDGFF LAIGHKPNTDVFKDWLDLDEIGYIKTVAGTPKTNVAGVFAAGDCADPVYRQAVVAAGSGC MAALEAERYLGELETQK >gi|283510612|gb|ACQH01000007.1| GENE 24 30123 - 30671 806 182 aa, chain - ## HITS:1 COG:FN2007 KEGG:ns NR:ns ## COG: FN2007 COG0386 # Protein_GI_number: 19705303 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione peroxidase # Organism: Fusobacterium nucleatum # 4 182 19 197 199 236 63.0 2e-62 MKTVYDFAVKDRKGGEVSLREYANEVILIVNTATKCGFTPQYEELEAIYEKYHAKGFTIL DFPCNQFGQQAPGTDESIHEFCKLTYGTEFPRFKKIKVNGDDADPLYKYLKEQKGFAGWD 
PNHKLTPILDEILSKEDPDYKEKADIKWNFTKFLVNKQGLVVARFEPTESLENVSKKIEE LL >gi|283510612|gb|ACQH01000007.1| GENE 25 30923 - 31471 683 182 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927367|ref|ZP_06421214.1| ## NR: gi|288927367|ref|ZP_06421214.1| hypothetical protein HMPREF0670_00108 [Prevotella sp. oral taxon 317 str. F0108] # 1 182 10 191 191 333 100.0 2e-90 MAALAVATILVTSCEIDNYYEDNTYRRYSWWDDSYEYPSNDLLAMAQTLRGHWDGRFVAR GVDALGYAGTKVYYTDIEFDQYNSNAIYGRGRQVDYEGRNDPSPFRRSFSWRIDTRTRAI VITYDNNYTMTIAYSELSLNDNAFEGVMRGANETDEFDFRRYTLAKKGTVDLSELTDTTS TK >gi|283510612|gb|ACQH01000007.1| GENE 26 33503 - 34399 1024 298 aa, chain - ## HITS:1 COG:BB0411 KEGG:ns NR:ns ## COG: BB0411 COG1864 # Protein_GI_number: 15594756 # Func_class: F Nucleotide transport and metabolism # Function: DNA/RNA endonuclease G, NUC1 # Organism: Borrelia burgdorferi # 96 277 8 175 195 95 34.0 1e-19 MIKKIMIFAVAFALVACEGEDLSKVNKPNGGGNTPTGENHNANDAKAQREYARLEFPRLK SGANRVLLHETPKDGLNFAIEWNETKKAQRWTCYQLFKSNMVKKTDRYKSDTNQYPRDPL LPKNLWFTSDPYWGTGYDHGHICPSADRLNSAEANYQTFFLTNMQPQVHGFNAGVWQNME TKVRDIASRYNYTFCDTLYICKGGTIDKPEQVLETTGKGLLVPKYFFMAVLRVKNGVYNA MGFWIEHKASNDKKLAKYAVTIRELEEKTGIDFFCNLPDNIEYAVENVPVNNTFWGLD >gi|283510612|gb|ACQH01000007.1| GENE 27 34566 - 34817 387 83 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|60280046|gb|AAX16386.1| 50S ribosomal protein L31 type B [uncultured murine large bowel bacterium BAC 31B] # 1 83 1 83 83 153 84 2e-36 MKKGIHPENYRPVVFKDMSNGDMFLSQSTCKTNDTVEFEGETYPVVKIEISSTSHPFYTG KSKLVDTAGRVDRFMNRYGKIKK >gi|283510612|gb|ACQH01000007.1| GENE 28 35111 - 36490 1312 459 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927370|ref|ZP_06421217.1| ## NR: gi|288927370|ref|ZP_06421217.1| lipoprotein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 459 1 459 459 805 100.0 0 MKKVIYAILIAALATAMLGGCQDVPEPYNNPNENKGGNQEVLEPGTPTGTGTEADPYNIT AALAKIKTLKGGENTGEIFVKGKIRSVKEVDTGNFGNATYFISDDITKKMDSLQIYRSKY LGNAKFTAKDQIKAGDEVVIRGIFVNFKGNTPESEPNKSWIHSLNGKTVEAETLVPGEPK GAGTETEPYNVTAALAKIKTLGEKDTLKNLYVRGKIRYIKEVNLEKYGSATYYISDDNTK KMDSLYIFGSKFLNNEKFKAKDQIKVGDEVVVVGSFVNFKGNTPGAAGGKTHLYMLNGKK ETGGTTPEKPETPQTGKGLSIEGQVVTLTNANAEVGTETITYAMSKLGNDDIAKVDPINL GDGLTLTVEQSDGKAAPTYVGKFKNLRIYANNVFTITGNKKIAKVILDCDKFKKDIFVGN STATVSFNGNSVTYKNVFSEPKGGGVQLRILKVTVVYAK >gi|283510612|gb|ACQH01000007.1| GENE 29 36626 - 37468 1020 280 aa, chain - ## HITS:1 COG:no KEGG:BT_3561 NR:ns ## KEGG: BT_3561 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 18 280 21 275 277 165 34.0 2e-39 MKIRYSLIAIACAALVGCMDKDWNAPNNSEAYGNPSIKETNVITISELKNNYANVIASQT DPYKQVENNIQIKGRITGNDIQGNIYNSVCLEDATGGILINIAQGGLFGYLPVGQEIIVN LKDLFVGSYGQQAAIGTPFTNSKDQTSVSRMNRYLWNEHFKYVGAADASKVQPEVFDVSK ISDTEYLRTHSGRLMTIKGVKFKDADGKKVFATAAEKDAANSVNRELEGFTNRQIVVRTS TYADFAAQPLPQGTVDITGIFTRFRNTWQILLRTESDVKK >gi|283510612|gb|ACQH01000007.1| GENE 30 37601 - 40141 2195 846 aa, chain - ## HITS:1 COG:no KEGG:BT_3560 NR:ns ## KEGG: BT_3560 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 846 1 846 846 649 42.0 0 MMQAKLKLALVALCCAPAAFAQTPKDDKQQQPTTVDESAFTFTEAQLGENDNMSQNVTII NSASNLYASQVGYLFSPVRFRYRAFNQKYNEVHVNGVPINDVVSGQFRFSNVGGLNQQTR NADFALPFETNLFAMPAMGGSNNYDFRPANQPAGHRLTLGGANRNYTLRGMYTYNSGLSE RGWAFSANLTYRWANRGYVEGTFYNALSYFFGVQKVFGSRTQHSLAFSTWGNPTERGTQG AATDESYWIANNNQYNPYWGYQNGKVRNSRVVNDFSPSAVMTWDWKIDDGLKLSTSLFGR YSMYKSTKLNYNNSDNPQPDYWKMLPSSYYDVWDETNKVARTDQALADWNTAYNYLTARK ENRQVNWDRLYAANRGVAAQGVDAMYFIQARRNDALNLSLASTLNMQVTKTSNWNLGYIA STNNARHYQTMEDLLGATTYHNVNTYAIGTYAPNSSQVQYDLNNPNALVGKGDVFGYDYH LLVNKAMAWTSYNVLLGRVNLMVAGKVGGVSMQRDGKMRNGLFSESSYGKSGTAHFLEGG GKASLIWSMGGGNTLQLGQGYQYNAPTPYVSFVAPELNNDFVRDLKNERVYSAELGFQHQ SGRIHLNLNGYFSHLSNVSDWQCFYFDDINSFSYVSITGQKKVFYGVEAAFKYKINSAFD 
VKLLGTMSEAKTINDAYVNYVNSTQGTYETDILLNKDMRESGTPLTAGSLILSYHKGGWF VDLCGNYYDRIYLGYSLYHRYKSVGEKRGNVDADGNVMRTPQEKGHGGFMLDGSIGKSLY LKHGQLSINLMFTNLLNNSRIVTGGYEQSRSDYTASGNARAYKFSRNPKKFYAWGTNAML NVTYRF >gi|283510612|gb|ACQH01000007.1| GENE 31 40512 - 41294 853 260 aa, chain - ## HITS:1 COG:MK0183 KEGG:ns NR:ns ## COG: MK0183 COG1402 # Protein_GI_number: 20093623 # Func_class: R General function prediction only # Function: Uncharacterized protein, putative amidase # Organism: Methanopyrus kandleri AV19 # 21 245 16 219 224 92 29.0 9e-19 MNKLVDLSVSNYGTTRQINYQLVVLPWGATEPHNYHLPYLTDCILSHDIAVDAAQRLLTK HGLHAMVMPPVGMGSQNPGQRELPFCVHTRYETQKAILTDIVDSLYHQGFRKLVIVNGHG GNGFKSMIRDLLTCYPDFLIAASDWFKMRNAKDFFDNPGDHADEIETSVMMYYHPELVNL ADAGEGKSNAFAIDALQEGRVWMPRNWGKVSKDTGIGSPKQSTAEKGKGFADAVCDAYVD FFADFIAVKDEDDPYTKGAW >gi|283510612|gb|ACQH01000007.1| GENE 32 41304 - 42491 1198 395 aa, chain - ## HITS:1 COG:slr1975 KEGG:ns NR:ns ## COG: slr1975 COG2942 # Protein_GI_number: 16330802 # Func_class: G Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Synechocystis # 8 392 7 387 391 271 37.0 1e-72 MDTKQYLQQWAQRYKDDLVNNVMPFWMKYGLDRQNGGVYTCVDRDGTLMDTTKSVWFQGR FGFIAAAAYNNIEQNPEWLAASKSCIDFIIDHCTAPDGRMYFEVTAEGTPLRMRRYLFSE CFAIIAMAEYAIASGDKSYAQRALDLFKLVQRYAYTPGLLPAKYLPNVQCQGHSLTMILI NTASRLRMAIDDPTLTEQIDTSLHAIKNYFMHPEYKALLETVGPNGEFIDTINGRTINPG HCIETAWFVLEESRHRNWDADLKQMGLEILDWSWDWGWDEPYGGIINFRDCKGLPSQDYA QDMKFWWPQTETVIAMLYAYLATGDEKYLERHKLINDWTYKHLPDPEFGEWYGYLHRDGT VAQPAKGNLFKGPFHIPRMLINSFLLCNEILKTKF >gi|283510612|gb|ACQH01000007.1| GENE 33 42518 - 43438 1159 306 aa, chain - ## HITS:1 COG:VC1776 KEGG:ns NR:ns ## COG: VC1776 COG0329 # Protein_GI_number: 15641779 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Vibrio cholerae # 1 299 1 296 298 202 36.0 5e-52 MEKIIGLIDAPFTPFHANGEVNLEPIERYAQMLQHNGLKGVFINGSSGEGYMLTTEERMQ 
LAERWVSVAPAGFKVIVHVGSCCLKESVALARHAEKIGAWGVGAMAPPFPKIGRVEELVD YCAAIAEAAPSLPFYYYHIPAFNGAFLPMIDLLKAVDGRIPNFAGIKYTYESLYEYNQCR LYKDGKYDMLHGQDETILPSLAQGGAKGGIGGTTNYNGRELVGIIDAWNRGDLETAREKQ NFSQAVINVICHFRGNIVGGKRIMKFLGFDLGPNRTPFRNMTDEEEAQMRRELEEIGFFE RCNVCK >gi|283510612|gb|ACQH01000007.1| GENE 34 43607 - 44842 1133 411 aa, chain - ## HITS:1 COG:CC0336 KEGG:ns NR:ns ## COG: CC0336 COG0477 # Protein_GI_number: 16124591 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Caulobacter vibrioides # 7 332 22 347 438 110 28.0 4e-24 MKANKNYPWIVVGLLWVVALLNYMDRQMLSTMQGAIKADIEELNTAEVFGALMAVFMWVY GCFSPFAGVIADKLSRKWLVVGSLFVWSAVTLSMGFATNFETLYVLRALMGVSEALYIPS ALALITDWHTGKSRSLAIGVHMTGLYVGQGLGGFGANLAHHFSWHQTFFALGLIGVLYSL LLMFLLHENPECSVKSRNTVAPGEKRESVWRGLSVVLSNWAFWIILFYFAVPSLPGWATK NWLPTLFSENLSLDMTAAGPMSTITISVSSFIGVIFGGILSDRWAQRNVRGRIYTSAIGL SLTIPSLLLLGFGHSVVGVVGAGLLFGIGFGIFDANNMPILCQFVSAKYRATAYGIMNMT GVFAGAAVTQVLGSWSDGGNLGLGFALLGGIVALALILQLTFLRPQTDNMP >gi|283510612|gb|ACQH01000007.1| GENE 35 45063 - 46109 1034 348 aa, chain - ## HITS:1 COG:FN1470 KEGG:ns NR:ns ## COG: FN1470 COG3055 # Protein_GI_number: 19704802 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 3 344 18 366 372 103 26.0 5e-22 MAQNSNPIQRLVGFPTEEQGFDKGVSACYCGVINGYLYIAGGCNFPDKPVAEGGKKRFYK AIYAAKLNAEGNRLEWKTVGQMPQPAAYGVSVTYENSLIFVGGNNETGGITTAIRLRPTA TGMQQEALPSLPHALDNTAGAVVGHILYVVGGNCEGVATQKVWSLDLKNTAKEGWKEEPS IPGIARVQPIAAALAGDLLGVWGGFAPKTDSKAAQLAMNGASYNAGCGTWTALPVPTDAL CEEVFTGGATAIATPQKGVVVVGGVNKDVFLAAINKLPEGYLLHEPEWYRFNSRVLCYRD GTWTQLLQHPSVARAGCALAYWDGWVYVVGGELKPGIRTPEIVRFRVD >gi|283510612|gb|ACQH01000007.1| GENE 36 47190 - 47666 133 158 aa, chain - ## HITS:1 COG:no KEGG:EpC_08520 NR:ns ## KEGG: EpC_08520 # Name: not_defined # Def: conserved uncharacterized 
protein # Organism: E.pyrifoliae # Pathway: not_defined # 18 157 4 143 143 154 50.0 1e-36 MGQRTSTSLKHRNLCYQKKIKTFCKEKGWWNDDYTVEYADALRTLNIDLTTDFATFFLHV EDSPTFYGRHQELYQICWFAINTNYELAITFAHNALELPNEYIPLDSFEGEGGFFYNRST GEVLEIELGQKLIDFQKGELQPQWHDFNSFAEWFFEIP >gi|283510612|gb|ACQH01000007.1| GENE 37 47717 - 48478 159 253 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_3235 NR:ns ## KEGG: Fjoh_3235 # Name: not_defined # Def: abortive infection protein # Organism: F.johnsoniae # Pathway: not_defined # 50 253 38 232 233 84 31.0 5e-15 MTDKITTNPVSWKRLLIFFIIATAVSNLFRFDVFNIHPAIQKLPTWLSIFVFALLHPSGV LLGTFIVMPFLRKERRVEMSLFGTSKQKSLLMCAIPIVLLTAIGVNNDYKLENHIYAFVA IIGTIVYCIEEEYGWRGYLQEELKGLKPWAKYLIIGCLWYVWHLSFVHDRSIVHNLTFLS LLIFGSWGIGKIAETTKSVLASACFHLIVQIMMLNSLFRHSPDKTTKFVIFVVCVGAWFM IFFKWNKTRSTSN >gi|283510612|gb|ACQH01000007.1| GENE 38 48417 - 48635 63 72 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIKNISNLFQLTGLVVILSVIFTKCLGSNYVSIYAKRFILLYVGIINAVLIGVNKNNSYI CSIIIVATGRCF >gi|283510612|gb|ACQH01000007.1| GENE 39 48667 - 49398 600 243 aa, chain + ## HITS:1 COG:no KEGG:PRU_0815 NR:ns ## KEGG: PRU_0815 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 95 243 1 149 156 75 30.0 1e-12 MERLLYIIILVVAYVWLRRLRHKGEHGVERGVSATEPPAVWLSYTRAEPCFKMAGRLTEW PETGKETEGRPYENDAEAHNEHAVTIRINRKNITMENKYVGMSTLEMIEEICKALGCPFE RDEENFADFTFQGEDFRIQTIHGTRYFNIMRVSWAEASLDDIDQVACLQRTINEVNMYAA CVVVYTIDTDANVMHLHFRRHAVITPEMPAVEDYFKSLLMPFFSVEHDFVNEFQNMKKEM GLE >gi|283510612|gb|ACQH01000007.1| GENE 40 49730 - 52009 2133 759 aa, chain - ## HITS:1 COG:no KEGG:PRU_1227 NR:ns ## KEGG: PRU_1227 # Name: not_defined # Def: putative thiol protease/hemagglutinin PrtT # Organism: P.ruminicola # Pathway: not_defined # 27 374 10 339 777 157 33.0 1e-36 MNQYYATNRRKLTPALQCGLNHYRQAFALFALAFATLLPIQAKQVTARQALDIARKYVMP NRQSIASAQTRTGKQQPIEPFYVFNDKQGKGFVVVSGDDAMGEILAYGDNGTLDTLNANP GVKFLLQAYREHFAQLQQAPATAQPATRAMPTYKAVAPLLTCKWNQLEPYNKKTGYPYTG CVATAIAQVMYYHKWPIQGKGENTYTVRHYNHVKHADFSKSYYDWANMKDEYTYYDPGTQ 
REKDAVAKLMSDVGIATNMQYTPYLSGAQNESADKALRENFDYTTAFVSRSDEGLPAFTD IVRQELINGFPVYLSGSQKGGGGGHAWVTDGINEQGLFHMNFGWGGQGDAYFSLSTLSVA QSGNEFGGKPMTFSYGLIAILAHPNKPNTRPIDHALLASTPKLKFNIGGSLRLPQGSGKT FAVGNMPAVEMNEFNNYGKPFKGDIGVGIFDMNGKRITVCPSDDHATGGYTKRVYGQYSN GEMGRDYTNPQTVKIKVDLTKLANGYYQLLPMCAPLEKNGRWGEWIHMRQAPRMVIEIGN GKVRIVEEDSINAGFQLTEQPSKLWLKPGNEETIHVPIRNKGGLGFGYYAKLQLFDAKGN VAFETQRTKPMELDGFRTTWMPFSINVPSTLKEGKYRMKLLLIKETSAGIDNPDAERFDV EKLYDKEETIFTVASKVTHIAGTTTANYTLTNNDGHSLTVRGTGLQRVRLFDINGRLLGS AVSTDGTEVSISLANLVPGVYLVQVMADGQVHTHRLLKR >gi|283510612|gb|ACQH01000007.1| GENE 41 52316 - 53578 583 420 aa, chain - ## HITS:1 COG:BH0931 KEGG:ns NR:ns ## COG: BH0931 COG1194 # Protein_GI_number: 15613494 # Func_class: L Replication, recombination and repair # Function: A/G-specific DNA glycosylase # Organism: Bacillus halodurans # 20 401 13 346 372 224 35.0 2e-58 MPCSLSFLLKAHIVTTTHPFSTALLRWFQRHGRSLPWRETKDPYAIWLSEVILQQTRVSQ GMAYWQRFMRNYPTVNALAAATEDEVLRLWQGLGYYSRARNLHQAAKQIAELGHFPNTHE EISKLKGVGPYTSAAIASIAFNLPVAVVDGNVYRVLARFFGIDTPINSTEGKKQFATLAQ SLLPHHAPARYNEAIMDFGALQCLPVKGETGKVNGHTVAPNDTPSFCNSCPLSGQCVAFA QGLVRSLPVKTKSQAPKQRRMGYIYIRCGGEIAIRKRPAGDIWQGLWEPLLYEDVVLSGV QTPKTYKREGTNNQENFPSPQLLACIDRLFKGEQGTESTSIPPNSHLHSTLSPIIKGPYT FRHILTHRIIMAQFAVVETSVRPDLPSDYIWVSEQELNKYAISRLFELFLERLNNETKDK >gi|283510612|gb|ACQH01000007.1| GENE 42 53592 - 54767 1194 391 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927383|ref|ZP_06421230.1| ## NR: gi|288927383|ref|ZP_06421230.1| tetratricopeptide repeat protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 380 80 459 470 647 100.0 0 MKKTLLAALLLLAANAAQAQKTEQEKQLHEMDSTARMLIADGKFDKAIAAFTDYTAKVKH LRGEADTIYIDGLVFLGKSYFSAKRLSKAVETAQKVVDLYGKHFNTKDKRYAWYLDNLSL YLSSNGQYKEALANSKKALKIYEGLYTNDRDMALILIHAAENSFYAGNKADAINFQLRAL AIYKDLFGQHAQEYTDEAEYLVTYYEGNNQGDKAQSLSEELEKLKEEAKKGYGDLPELPK LETAEDCRKHTKDVERCCQYYLSHRFTARDMEDAARFIMAWAVPSDQVTIPMGKNESQLL AKKESNPYLFSYYAGYILYALENKETKETEDGYEAAMVATLNHYINNKDLTGPVPALEKY VKLYKKDKDKMFALIRKNFPKTEKEDKEKEK >gi|283510612|gb|ACQH01000007.1| GENE 43 55101 - 56267 1074 388 aa, chain - ## HITS:1 COG:VC0265 KEGG:ns NR:ns ## COG: VC0265 COG0668 # Protein_GI_number: 15640294 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Vibrio cholerae # 8 383 16 400 412 204 33.0 2e-52 METIKDYVEQIIRLTGVSGNAVPIVRHILLVLVTILLAWAAGALCRKILIPVVHRITSKT KGNWDDILFNDNVLITASRIVPAIVVSWLLPLVFFQYALVHTALEKLTAIYIVVMSVRLF IAFINSLTQLKLSSSPSVRQYIRTFCGVLKVIVIFIAVIVVVAILFNKNPLSLIAGLGAT SAILMLVFKDTIEGLVAGIRLTSNDMVRKGDWITVPSTPADGVVEDISLTTVKVRNFDNT TVTVPPLALVSGSFQNWRSMQKGAGRRVQRLLYVDFRSVKLVDDALKSSLLSNKLLTEEQ MQGQQVNISLFRLFIEQYLTKRAEVNEEQTLMVKQVEATQCGLPIEFYFFIKNAPDIHYE HTLADIMEHIYAYTHAFGLTIYQQYPEQ >gi|283510612|gb|ACQH01000007.1| GENE 44 56312 - 57595 1337 427 aa, chain - ## HITS:1 COG:HI1330 KEGG:ns NR:ns ## COG: HI1330 COG2027 # Protein_GI_number: 16273241 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase (penicillin-binding protein 4) # Organism: Haemophilus influenzae # 224 410 266 454 479 105 34.0 2e-22 MKLKSKYALVVLLAWLMALPMMGQVEKDVTDDDDETVAAADTLATDSLLADTLATDSLLP WPQSVRARIDRLVQADMLRTSQLGLMVYDLDADSAIYCFNERQTMRPASTMKVITAITAL DKLGGSHQFKTELCYTGTIENRTLTGNVYCVGGMDPRFNTDDMHAFVESLQKMGVDTIRG TLYADKSMKTTDLLGKGWCWDDDNPVLSALVYARKDVFMDRFMQELRKAGIVLDAFTATG QRPQDATCICTRFHTMDQVLMRMMKESDNLYAEAMFYQIAAATGNKPATAKNAAQVVKRL IAKLGLNAAAYKVADGSGLSLYNYVSAELEVAMLRYAFRNANIYLHLSPSLPIAGIDGTL KSRMKGVFTRGNVRAKTGSVTGVFSLAGYCTAANGHRLCFAIINQGVMHGRNARAFQDKV CTALCQP 
>gi|283510612|gb|ACQH01000007.1| GENE 45 57595 - 57996 344 133 aa, chain - ## HITS:1 COG:no KEGG:BVU_1111 NR:ns ## KEGG: BVU_1111 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 128 2 129 137 72 31.0 6e-12 MDRTFHQRLSPASVCGILAFALLALYMFWTKSALVGCLLAAVVVLMIERVVHTTYVFRRN DGEEEMLYIDNGRFSKTRQIRVNDIVSCRKMSTGFGLSQYVLLQCGHKRLVSVQPDNADA FIDEIRKRQATPE >gi|283510612|gb|ACQH01000007.1| GENE 46 59409 - 63383 1946 1324 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927387|ref|ZP_06421234.1| ## NR: gi|288927387|ref|ZP_06421234.1| hypothetical protein HMPREF0670_00128 [Prevotella sp. oral taxon 317 str. F0108] # 1 1324 1 1324 1324 2509 100.0 0 MNYIVPKITKYSNIDELLKNESKTPSLFDIENFLSIDDLLSENFVCIAGEPGVGKSRLIS EIKARLTEKPLSFCNASEFKPQVIPSDVEYCIVDALDEVDGSEFNYTLQLIKQYKEEHQD VKVMFSCRKHYVVSYASFFSSCTKLVFVEVNRLDDQEVTNIIDTCADVTKENINRSPKLR KLLSIPRYLMFLLESENQREGISNIGELFEFIIDKSVDEALTTYDKPVRKDNFKALIKRV IEKIAFIMEISRKDKISKDDLYTIIDELKGNMAHMLVANFDLLFFESRILKETNGILQFG NSEIQEYLAAKELGRQENIESVLYDVAVQKELKHIYPNWYDVIPHLSYSEGRSDSFVNVF KLITSYESHLENETFESLLRYVNPSTLGVQQRKELFSNLFEHYQRVPAYIKWRGPIQNLI QECYTSGCNTILKISSDQLNKIQLTNIYAILEGLVENNKLDESVIKHWEDAARALMGTKD SEMQQIALNFYYALKDDEELRELASTFNSFTEDVKQKYIEVTGYRKITDKLVIDCWLTYC DKSNPEAINVVLYIDDPESIIYAYNKILSKGIRDFFNPKGSLLVLYDFYLKKQFDVVWKF NQDRRNLMTKVIAYFVKNRNYSSLKEINTVVKQILLEEETGMIFFNCFEKDIWELEDLFR TFDADLIDADLLSKLEKLLNDIDAEKWNKDKILIYLINKIRKDESKKCTISDYIKRYENT FEQWDKNSNEMKQERDYAPSLSKAYQRLSKQETSEYDKYELAFELSKSIDFLSKQDHKPI VDVITNFFNKIDLDKLALKSNGENSYSLSKALIKIPYYVRVLHHLCYRELLEKYKDTLIK TLPIVCCTMNFDSHEIKDIYKSIIGNINQDEKSAIVRWWEDRKDDFMNVSSDDIFACITD YGFDTLSYKLEEYIEKYIEDKSLGNKIAASRALDLISEDYCKWDINKYKKLFGCLEDESI ESVKMQCNAIMIEKFQDPEAINWRIEYLKANVTKSFHNETGHARCISREESEMISPNPYM FRCFMNIKNNDDLIKQMLDLFEFGLPLCLKPETQEYSSYLLRQIYLFFVNINNITYISEL RKKVEAFNAKNMSYLAYLIMNNAENIFLKNETITISKAIKLYNKCIDESYLQIRNNSDLR QYFSKIQYEVQKEIQDQGVYALVRQQSLSEDYIQRELKNTIINKCCLMGLETIQVDREVA LQDNKRTDFLIRYGLCDPIMIELKLLNNTEIKNKKKRQEYKNKFVQYTNATNACLSVFWV 
FDVHKGGKIKDFKDLKEEYKNLDNTLVLLTDCRCSSGIETGIPQVKSNIGKKKACGNTTR PKKK >gi|283510612|gb|ACQH01000007.1| GENE 47 63847 - 64572 579 241 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927388|ref|ZP_06421235.1| ## NR: gi|288927388|ref|ZP_06421235.1| hypothetical protein HMPREF0670_00129 [Prevotella sp. oral taxon 317 str. F0108] # 1 241 1 241 241 507 100.0 1e-142 MKAIIFVAILLCCACQPLSAQDANFHAGLRLGGGISSNSNVDRILVSEDYYSNYSLRKRV LFVPCAELFFLYKPQGNLWGVEAGIVYYNRTARVRYDDRDELNYTLSARYHHLGLAAYFN LYPFKERNTWHVSLGGRLGANLSPENLSYKGNQEDAKFKKWKYPSVEETERVMRSKLKGR PDAALGGGVGYDFHSGICIDLRFHYSLGSTIKTETNTFNWVECANHNWQAELTVAYVIDI K >gi|283510612|gb|ACQH01000007.1| GENE 48 64725 - 65864 667 379 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927389|ref|ZP_06421236.1| ## NR: gi|288927389|ref|ZP_06421236.1| hypothetical protein HMPREF0670_00130 [Prevotella sp. oral taxon 317 str. F0108] # 1 379 12 390 390 651 100.0 0 MRRAVALLAIVLALLLCFRLCKRKTTEDPIPPPTHQETTPTKVAPQHTSGRTHPKGSGLR TRRDAGRKGIKRCANTEPTTPKKEEKYATAEDEGVTDAPQTRVGATEQPVEQQTQQQPDQ PTEPKTEQQTKPAAEENTPAEKTRQPKSAREPRKRITMRYKGHNQSRIGIRAGAGYAVIN NLGAMVEEGNIRPRYTLEACGATVPTIGVFALLRRNRLGMELATDYTWLASTLKEHKLAG NVNEETNFRYHLLMPQLAARLYLLGNLYMGVGVGLAIPLNPGGIDFTSDRPAMFASADEL TQAHLRETLRARVQAMPLIKVGYSSFKNGIEAYLQYAYGLTDLIQTKDNPYGYATARNNS HLFLLTVGYTIPLNKNKQP >gi|283510612|gb|ACQH01000007.1| GENE 49 65861 - 67666 1541 601 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927390|ref|ZP_06421237.1| ## NR: gi|288927390|ref|ZP_06421237.1| hypothetical protein HMPREF0670_00131 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 601 1 601 601 1190 100.0 0 MIQFRYGHVATFVPLLCMLLLTASRVQAQIQKHTPQHAPGGVMGYVKWGFGNDNAPMQLA ANSGMTFVGVGKARQGGEQLLWHVGAQGGQTQLVQTTARTADLARGTFMNYVGRDTLPPL RLYTYVSSAANGIRGTLAVGGTTKERLPIRAMKEGMAEYAVYPRALSGTERMRVESAMAL RHGITLNHSYFNSKGLVVRNYYRLKAYNHRVAGIIGDSTSCLYLSTGQSADGEAAVRVAA KAIGEGASYLWGDNAKRLAFAADRGNGKWMQRRWAATTTGTPVAGMTLTLDTRSIRQLEP LGEGESYYLAIDNSGTGKFPVGQLRYYKAHMGQADSVTFMGISVGVGESVFTFRAAKDFF STIDVEQPRCSTGAKGALRVLLTGGTPPYRLTASVDGKAVCVHATADSIVTVPSLTQGKY VIVAYDSSGKSLCSEVTLHNADLADVPDLEDVSFAQGAERDYHLETKGNYACRWKTPAGK YLDGRRVTLEEDGEYLVEVTNDDGCSTLRTLNVTTATSDGFARYEVSPNPTRDGNVDVRV ELTEAAPLALFLHAPDGALLQTEVREPESYHATRLYLPIVGVYILEMRSGEARRSVKIIR R >gi|283510612|gb|ACQH01000007.1| GENE 50 67700 - 71960 3029 1420 aa, chain + ## HITS:1 COG:no KEGG:Hoch_2925 NR:ns ## KEGG: Hoch_2925 # Name: not_defined # Def: YD repeat protein # Organism: H.ochraceum # Pathway: not_defined # 110 997 559 1465 3456 424 33.0 1e-116 MKRLIHITLATAIALLSLASCNDGRRTSRGMDAQGEVKRQVIHHAEQKADKGSVVPNAKK GIGGTLVTRKVQGRETNKDLSRLRSFRRRTVSKLFARDAADSLHLGRAQLHVPSGSMERA KLLSVTPLGKGELPHLPAGMVNVTGDRDVPVSASLKGGVAGYRLLPHGEHFVHAPATITV PYDSALIPKGYTADDIHTYYYDELQGKWTMLRHKALDREREVVMAETSHFTDVINGIIKV PESPETQNYVPTGIAELKAADPAAGITTVNAPTPNQSGTASLGYTFELPKGRGSMQPSVG LQYGSDGGSSYVGYGWSLPVQSVDIETRWGVPRFDIEHESESYLLMGTKLGDRTYRTAEL PGRAKDKRFKPLVEGGFARIIRRGDLPTNYTWEVTSKDGTTSYFGGIDGKIADNAVIKDA EGNIVRWALYRTVDTHGNFVSYTYEHHDSTLYPKCYRYTGHDDEAGAYAVNFEYAPTARK DAMSSGRLGVLQRDDRLLQRVNVTFNNESVRSYALHHREGPFAKTLLDSIVQLDAKGARV AAQGFEYYDDVKGGMFGKAESWTSEADSKHEHLPPLQKAINGFSNELSMLGGGYSKGRTL GGGLLVGFGVSVATVNVGASYIHSKNENFGKVALIDIDGDGLPDKLFQARDGLRYRKNLS GETGNPIFAKSRMIKGIGEFSRGTSSSKTLNADAAVELPFVSPGVSYSRTWDKTETKIYL SDFNNDGLVDIANNGMVWFNRIGTDGLPTFIPSTKETPNPIVGRSAEIDSTFIPDYKAIR DSLEKENPLHDVVRVWRAPFSGTIRIASVVNKSTTYGDGITYSIQKEENVLKRDSILSTG TRTDSLTTSVKAGERVFFRLQSRYSGAADSVAWSQRVEYSRIVDGNATYLGQDLSHYDAA SDFLEGMTTGMFLSKDGRVEVKAPYRKGRTESHVTLVVRRKDLHGEHILMEKKLPAGTPV EDTFEDAFDVLAKDSVQLTFEMLTDAPLEWRNMSWKPTLRYVSTQDTLRVIPFRKMYNKP 
VLVRASRNVRKDLAKNQKYDPGVILVHKFKVKRQDWDKDKDTATVHVNFNREDGTLLLKR KYTLTKDDSIKVDSLAIVDAALLEELCKGKTLATFSVENELRSVVKATVEVFRDSLVFKK DWTKKKMIVDHKERILVDTFACSVFSGYNSLNFGHLYRGWGQFGYNGNREYANRPIETAA LTIDTDGYKGIADNYDKSRKEEDLAGLTEPSKQRFFVMGYNAMRKVYIGATDSAYVGVSF QCSSRMGESEIKVDSVQYASGEGLSAPILQSKSSGHGVSATGNGPLRFGVSGSKSSQTSH TEVAAMDVNGDGYPDWIDEGDSHVRTQYTSPTGTLSQLSVKTDIPMPEFSSGAYSLGIGN DGAIAVSIGNRNRSESGRATGNSSPGDVGSMNNANENPNK Prediction of potential genes in microbial genomes Time: Sat May 28 00:07:03 2011 Seq name: gi|283510611|gb|ACQH01000008.1| Prevotella sp. oral taxon 317 str. F0108 cont2.8, whole genome shotgun sequence Length of sequence - 1147 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 1146 739 ## Hoch_2925 YD repeat protein Predicted protein(s) >gi|283510611|gb|ACQH01000008.1| GENE 1 3 - 1146 739 381 aa, chain + ## HITS:1 COG:no KEGG:Hoch_2925 NR:ns ## KEGG: Hoch_2925 # Name: not_defined # Def: YD repeat protein # Organism: H.ochraceum # Pathway: not_defined # 25 368 1870 2235 3456 149 33.0 2e-34 TTPTNNSTGQCEGQNASDGNAVSSLAVTASGNFSHGNSKTTRDWLDWNGDGLPDMVEGDA VRYNLGYGFTGHMPRGCQDLDATSNTNWGAGLGTKITVLGNAEITGGFNGTKTTTLTNCS YADLNGDGMPDKIIRDGDDVKVSINTGTGFVDEAQDVQGSAGKNLATSVSLYGSAAFSFK IHLLFLKLTITPHVTAAWSTGVSRTESALLDIDGDGLPDFVESAGPDALVVRRNLTGRTN LLRSVTLPFGGHVRVEYRQTLPSFAMPGRRWVMSAVETSGGYAENGATRMRNEFAYEGGY RDRRERDFFGFAKVVTRQLDTQKGNAAYRTQVSEYGHNRNLYMHDLVTAETLYDAQGNRL QGTLNTYEAVRQRDDSTSVFP Prediction of potential genes in microbial genomes Time: Sat May 28 00:07:08 2011 Seq name: gi|283510610|gb|ACQH01000009.1| Prevotella sp. oral taxon 317 str. F0108 cont2.9, whole genome shotgun sequence Length of sequence - 1715 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
+ CDS 2 - 1108 643 ## Hoch_2925 YD repeat protein 2 1 Op 2 . + CDS 1111 - 1659 324 ## gi|288927394|ref|ZP_06421241.1| hypothetical protein HMPREF0670_00135 Predicted protein(s) >gi|283510610|gb|ACQH01000009.1| GENE 1 2 - 1108 643 368 aa, chain + ## HITS:1 COG:no KEGG:Hoch_2925 NR:ns ## KEGG: Hoch_2925 # Name: not_defined # Def: YD repeat protein # Organism: H.ochraceum # Pathway: not_defined # 37 182 3148 3294 3456 111 36.0 4e-23 ELGDHSVPDGWIQRPKRNAVPGTPPGPPVQWDKAEDPDDVQPGYGYVPADTAHHEDIFYY HTDHLGSTSYITDAKANVAQFDAYLPYGELLVDEHSSSEEMPYKFNGKEFDEETGLYYYG ARYMNPRTSLWYGVDPHIEEMAFSSSYTYCDDHPVNFVDIKGKKKFNWIVLNMNDASKNS IDVRYTSAYYRSNYDDNVVNIWAHGLRAPRPNKYSQQAWGLGVSKTNSGELWNKVAFYNP RDVLLLSERIDLSDKGWRERNNGHPIIVLHCCASSTFAQKISASKEFKDVIIIAPDATLH SNKEKGEWINSIVYGRHYEHKREGAWHVYMNGTPLKNEKGKTIFYDSKAQPGTKGFNYRF DTKKQKKK >gi|283510610|gb|ACQH01000009.1| GENE 2 1111 - 1659 324 182 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927394|ref|ZP_06421241.1| ## NR: gi|288927394|ref|ZP_06421241.1| hypothetical protein HMPREF0670_00135 [Prevotella sp. oral taxon 317 str. F0108] # 1 182 1 182 182 355 100.0 9e-97 MCKFQYFLLVATMTTLCCCGGKANKIAQRNVPKTAKKVYAAENSEEELDIPMYPCAEYSN GTLFYGRYHKYHQLSESEINLIKIVLSDKEKWLKTGSGIPILDINKYYRQYLAYNNGDIY VMINLYKYYYIVFDNNNIVGACTPAWGIQIISLVNDKSRKRYDNVHVLLNLSKRQIIKVH NI Prediction of potential genes in microbial genomes Time: Sat May 28 00:07:23 2011 Seq name: gi|283510609|gb|ACQH01000010.1| Prevotella sp. oral taxon 317 str. F0108 cont2.10, whole genome shotgun sequence Length of sequence - 3264 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 1100 584 ## Hoch_2925 YD repeat protein 2 1 Op 2 . + CDS 1097 - 1660 170 ## gi|288927396|ref|ZP_06421243.1| hypothetical protein HMPREF0670_00137 3 2 Tu 1 . 
+ CDS 1958 - 3263 546 ## COG3209 Rhs family protein Predicted protein(s) >gi|283510609|gb|ACQH01000010.1| GENE 1 3 - 1100 584 365 aa, chain + ## HITS:1 COG:no KEGG:Hoch_2925 NR:ns ## KEGG: Hoch_2925 # Name: not_defined # Def: YD repeat protein # Organism: H.ochraceum # Pathway: not_defined # 57 169 3158 3282 3456 115 43.0 3e-24 ELGDHSVPDGWIQRPKRNAVPGTPPGPPVQWDKAEDPDNVQPGYGYVPADTAHHEEIFFY HTDHLGSTSYITDAKANVAQFDAYLPYGELLVDEHSSTEEMPYKFNGKELDQETGLYYYG ARYMNPRTSLWYGVDPHIEEMAFSSSYSYCDDHPVNFVDIKGKKKLNWIVLTLKDSPESD PRNISAYYRSNYDDNVVNIWAHGLRDSEPGNIQHAWGLGVSKRNSGELWMNVAFSIPKDI LLFSERIDGEDRGWRERNNGHPILVLHCCASSTFAQQISTAKEFKDVIIIAPDATLQSGK KKGERINSIASGKHYEYKRNGLWHVYMNGKPLKDKNGKAILYNYNAQPGTKGFDYKLNLK KQRKK >gi|283510609|gb|ACQH01000010.1| GENE 2 1097 - 1660 170 187 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927396|ref|ZP_06421243.1| ## NR: gi|288927396|ref|ZP_06421243.1| hypothetical protein HMPREF0670_00137 [Prevotella sp. oral taxon 317 str. F0108] # 48 187 1 140 140 274 99.0 2e-72 MKNFIKGVLLFIVMVIMSGCGSRAGESSDNLHKREPRTKKRIYAREVLVENLDYPMLPCA YYSNDKPFYGKDKEYHKLSDREIRMVKAIIGDKQKWQKPGSKCPFLDIKEYFRQYLAYRK DGKTYVLVNLYKYFYIIYENNGILGAIAPSPTSHIISLANDKSRNKYDNVTILLDLSRKR ILEVHDI >gi|283510609|gb|ACQH01000010.1| GENE 3 1958 - 3263 546 435 aa, chain + ## HITS:1 COG:MA2043 KEGG:ns NR:ns ## COG: MA2043 COG3209 # Protein_GI_number: 20090890 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Methanosarcina acetivorans str.C2A # 98 206 124 230 440 65 34.0 2e-10 MMQIEKQREDYYRKLGTPPGVPTMKGATAEPENTHEGYNSIIKELGDHSVPDGWIQRPKR NAVPGTPPGPPVQWDKAEDPDDVQPGYGYVPADTAHHEEIFFYHTDHLGSTSYITDAKAN VAQFDAYLPYGELLVDEHSSTEEMPYKFNGKEFDEETGLYYYGARYMNPKTSLWYGVDAL IENYPTIGSYIYCSSNPVVFIDPDGNKKVVVTGGADAHNKYEMNFVNASKIQLDKYLKRA GSLEEISWLIFGVGYSQEQKQEIAKWANNRGVSFKFLDTAEQLVNELNSSSQSNDKLTEV SMFSHGTASNVSFGFGQHGDRNSTNYSNKDNLTEATLNKTPLQSSAFAKGGRIDLYSCNS GTPLKYEETEFKTEKELRYTTRHAPSLVTLMGRQSKTIVRGFVGRSDYTPVAQGLLPKPG GTGGSYSPHVRGKNV Prediction of potential genes in microbial 
genomes Time: Sat May 28 00:07:38 2011 Seq name: gi|283510608|gb|ACQH01000011.1| Prevotella sp. oral taxon 317 str. F0108 cont2.11, whole genome shotgun sequence Length of sequence - 5083 bp Number of predicted genes - 9, with homology - 7 Number of transcription units - 4, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 120 - 179 3.1 1 1 Op 1 . + CDS 199 - 693 215 ## gi|288927398|ref|ZP_06421245.1| hypothetical protein HMPREF0670_00139 2 1 Op 2 . + CDS 707 - 1171 324 ## gi|288927399|ref|ZP_06421246.1| hypothetical protein HMPREF0670_00140 3 2 Op 1 . + CDS 1320 - 1850 286 ## gi|288927400|ref|ZP_06421247.1| hypothetical protein HMPREF0670_00141 4 2 Op 2 . + CDS 1887 - 2369 171 ## gi|288927401|ref|ZP_06421248.1| hypothetical protein HMPREF0670_00142 - Term 2074 - 2105 0.2 5 3 Tu 1 . - CDS 2216 - 2545 63 ## - Prom 2587 - 2646 6.9 + Prom 2471 - 2530 1.9 6 4 Op 1 . + CDS 2564 - 3202 211 ## gi|288927402|ref|ZP_06421249.1| hypothetical protein HMPREF0670_00143 7 4 Op 2 . + CDS 3238 - 3669 181 ## gi|260911272|ref|ZP_05917872.1| hypothetical protein HMPREF6745_1827 8 4 Op 3 . + CDS 3704 - 4771 734 ## COG3209 Rhs family protein 9 4 Op 4 . + CDS 4768 - 5083 185 ## Predicted protein(s) >gi|283510608|gb|ACQH01000011.1| GENE 1 199 - 693 215 164 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927398|ref|ZP_06421245.1| ## NR: gi|288927398|ref|ZP_06421245.1| hypothetical protein HMPREF0670_00139 [Prevotella sp. oral taxon 317 str. F0108] # 1 164 1 164 164 309 100.0 3e-83 MMATEAPSSLGFLGRLQGSLNTYEAVRQRDDSTSVEYQIRMGNYSGPKELRQRYSLVLLK QGLKDEIEVVGNLGATLRPTDEQKAFPDGEISPSKNFKVFINPLATPEEQAKAVGHEFDG HLFMYLIGKDPRHGGSTRTQYGNMELEKQIKEREHESIRNFEEK >gi|283510608|gb|ACQH01000011.1| GENE 2 707 - 1171 324 154 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927399|ref|ZP_06421246.1| ## NR: gi|288927399|ref|ZP_06421246.1| hypothetical protein HMPREF0670_00140 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 154 1 154 154 291 100.0 1e-77 MYMLLFFSNVSCGQNRTVISDKEQESIVLYDVQGLKYEQLWRFNNTIVRDCYFTRKGEFI DNATWNVEIQTLLKLKDKILRKFRVDDETFTSGTAIVLLLAVPKSNILELRLARGLTSSF DKEMLRALKNVEQEVLVFSDNPIAVFIPIRITID >gi|283510608|gb|ACQH01000011.1| GENE 3 1320 - 1850 286 176 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927400|ref|ZP_06421247.1| ## NR: gi|288927400|ref|ZP_06421247.1| hypothetical protein HMPREF0670_00141 [Prevotella sp. oral taxon 317 str. F0108] # 1 176 1 176 176 350 100.0 2e-95 MLVYKGHYTEKELSYYKGMAEKNGISFELASEEKDIIYYINCGSADVSKPDRNSDPITDF EYVGHGHPRGFYIEPRGNDNYKSFNSEKFEARAFDVNANIYLYGCGQGLTGSALHKLYPQ STEPNLIDNMQRLTKETIVGYSVTLEWGKSLGSFVPYSLGLSVCRGQRKLNKALYS >gi|283510608|gb|ACQH01000011.1| GENE 4 1887 - 2369 171 160 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927401|ref|ZP_06421248.1| ## NR: gi|288927401|ref|ZP_06421248.1| hypothetical protein HMPREF0670_00142 [Prevotella sp. oral taxon 317 str. F0108] # 1 160 1 160 160 308 100.0 5e-83 MKIYILLIVLLSQLHFMGVSAQKVTYIHFIHGPLMGNKGNNYVGIQMDSKSKDIIFYEAG KKHKRKPQNQYSRYFYTLEKDTIISYSRSSYNVSVKDEKGKVKYGRLPLDYMLYVEFADG SIERFLICPLYPIDNKNLHGRDLYWNNLRMILKYCFPRTK >gi|283510608|gb|ACQH01000011.1| GENE 5 2216 - 2545 63 109 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDFLFALLCSQNDRFLRVVAFYLPLFDRAMRRFSSSHAMRRSVFRLSSLMKLGLSYDIGY FVLGKQYLSIIRKLFQYKSRPCKFLLSMGYKGHIKKRSIEPSANSTYNM >gi|283510608|gb|ACQH01000011.1| GENE 6 2564 - 3202 211 212 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927402|ref|ZP_06421249.1| ## NR: gi|288927402|ref|ZP_06421249.1| hypothetical protein HMPREF0670_00143 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 212 1 212 212 385 100.0 1e-106 MNMTKKIYLIIVLLISITYCNAQCKELYETESMETIVSSRKKADRLLELFRENAPKVLYS LEDCYYYLIIKASPHSKEYYIALDSLGEIRVLYLLSERTKTAKQKKYEKLLLEAGSIFEP TYYKEVSRTVETRIALGRPSYFVMKDASGKRFGEFCLSTLTVPLPINQALLTYLINRLSE EIHKDTKSVNRVKNKDNPNYGNEGKANSRLAQ >gi|283510608|gb|ACQH01000011.1| GENE 7 3238 - 3669 181 143 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260911272|ref|ZP_05917872.1| ## NR: gi|260911272|ref|ZP_05917872.1| hypothetical protein HMPREF6745_1827 [Prevotella sp. oral taxon 472 str. F0295] # 1 141 1 141 152 209 75.0 5e-53 MRIIKCLILLVLQSATCHLNAQSIKECLKIDSVQTIYDIGSSVSFHFSNNCEEDIYISIS LEKRVDGKWFIFAQDIFHVPNTYKVQNVIVLKGCEVNRKEEWEIKGVKYKKKCENVYRFK YNIGRKLNKQSYIEYSNIFHIKN >gi|283510608|gb|ACQH01000011.1| GENE 8 3704 - 4771 734 355 aa, chain + ## HITS:1 COG:MA2043 KEGG:ns NR:ns ## COG: MA2043 COG3209 # Protein_GI_number: 20090890 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Methanosarcina acetivorans str.C2A # 59 239 124 291 440 63 29.0 6e-10 MKILNSTFRTPLSRHERSPKRYVTSGISSGPPMQWDKAEAPDDVQPGYGCVPADTAHHED IFYYHTDHLGSTSYITDAKANVAQFDAYLPYDELLVDEHSSSEDMPYKFNGKELDQETGL YYYGVRYMNPKTSLWYGVDRLKENYPELGSYTYCADNPVKFVVPDGNTALIDNAIDVLIG ASLEITTQMVANLVVGNGITDINWGKVAVATIDGAITSGASNTVRFVAKISVAANSLMDN YDKGIIEIVKGTAVNLAAGAVGSKASKLAKGFGNKTTEEISNKMISSKTSLTKTVKKITK VNNKRAKTIATNIQQAQKAVAKEVKKSPQTAADKIVSNVIESQGETNKNNKTDKR >gi|283510608|gb|ACQH01000011.1| GENE 9 4768 - 5083 185 105 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRDFIGRLLVYACVGWILYHVVFYFVGNFLLRINGVKTYAVITAEPSSYARRYTKVYYMY SFSYNGNEYKGNSLINSSTGKPGDSIEVVFLEAYPSINRPTSFFK Prediction of potential genes in microbial genomes Time: Sat May 28 00:08:34 2011 Seq name: gi|283510607|gb|ACQH01000012.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.12, whole genome shotgun sequence Length of sequence - 1771 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 2, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 57 - 242 79 ## 2 1 Op 2 . + CDS 243 - 476 78 ## gi|288928397|ref|ZP_06422244.1| hypothetical protein HMPREF0670_01138 + Term 627 - 654 0.3 + Prom 760 - 819 2.0 3 2 Op 1 . + CDS 865 - 1419 173 ## gi|288927404|ref|ZP_06421251.1| hypothetical protein HMPREF0670_00145 4 2 Op 2 . + CDS 1430 - 1769 253 ## gi|260911294|ref|ZP_05917893.1| conserved hypothetical protein Predicted protein(s) >gi|283510607|gb|ACQH01000012.1| GENE 1 57 - 242 79 61 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAKSIFHNILREVMKSHSRLITLVFIALVAFSCAEKRDLTDLICGDDSKLWYVNFDARDK L >gi|283510607|gb|ACQH01000012.1| GENE 2 243 - 476 78 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928397|ref|ZP_06422244.1| ## NR: gi|288928397|ref|ZP_06422244.1| hypothetical protein HMPREF0670_01138 [Prevotella sp. oral taxon 317 str. F0108] # 1 73 86 158 273 66 42.0 4e-10 MCFYFNRDRTWKILERDLQGNLSKYEQHDHLWPEYWILKDDSVISLGGTEYKIHEVNPSL LILHVDTTKITLRLVNQ >gi|283510607|gb|ACQH01000012.1| GENE 3 865 - 1419 173 184 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927404|ref|ZP_06421251.1| ## NR: gi|288927404|ref|ZP_06421251.1| hypothetical protein HMPREF0670_00145 [Prevotella sp. oral taxon 317 str. F0108] # 1 184 10 193 193 274 100.0 1e-72 MYIATLYIVFVVSIVLLSICGMLTLFKMLQSYVIAGSIASMMDVITNYGVLILVFQFTSH FVLVFIMDHHLLLHKTTTLIIRYRAITYFLVYSSLGMFSTHLLLFYLGNEAYRDTAEKHY GNALSYWARFPLLNVLVYGKILILLHRKGYRIKVAKLVLLYFTGRLAYFLLCEKENKMMP KLGY >gi|283510607|gb|ACQH01000012.1| GENE 4 1430 - 1769 253 113 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260911294|ref|ZP_05917893.1| ## NR: gi|260911294|ref|ZP_05917893.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. 
F0295] # 21 113 2641 2733 3065 166 81.0 3e-40 MAVNKIHGFCFIELRIVLRVRRRAPKIGASIFQNTYGHGANVVTAGQKDYQMRMMQIEKQ HEDYYRKLGTPLGAPKMKGATVDPENTHESYNTIIKGLGGHCVPAGWIQRPKR Prediction of potential genes in microbial genomes Time: Sat May 28 00:08:57 2011 Seq name: gi|283510606|gb|ACQH01000013.1| Prevotella sp. oral taxon 317 str. F0108 cont2.13, whole genome shotgun sequence Length of sequence - 3543 bp Number of predicted genes - 5, with homology - 3 Number of transcription units - 5, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 338 - 397 4.8 1 1 Tu 1 . + CDS 450 - 710 129 ## + Prom 927 - 986 3.9 2 2 Tu 1 . + CDS 1007 - 1447 323 ## gi|288927406|ref|ZP_06421253.1| hypothetical protein HMPREF0670_00147 + Prom 1617 - 1676 2.0 3 3 Tu 1 . + CDS 1704 - 2006 141 ## gi|288927407|ref|ZP_06421254.1| hypothetical protein HMPREF0670_00148 + Term 2062 - 2099 -0.9 + Prom 2101 - 2160 3.2 4 4 Tu 1 . + CDS 2193 - 2690 270 ## gi|288927408|ref|ZP_06421255.1| hypothetical protein HMPREF0670_00149 + Prom 2820 - 2879 4.5 5 5 Tu 1 . + CDS 2981 - 3202 161 ## + Term 3397 - 3443 0.2 Predicted protein(s) >gi|283510606|gb|ACQH01000013.1| GENE 1 450 - 710 129 86 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKLILIYIAVFGIWFVIYGISICDQYVALKYEIETIDNVAVINRVYEIVNMSTIINLVW FVLSAILFVIFVVQYKQRNKNKRTNE >gi|283510606|gb|ACQH01000013.1| GENE 2 1007 - 1447 323 146 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927406|ref|ZP_06421253.1| ## NR: gi|288927406|ref|ZP_06421253.1| hypothetical protein HMPREF0670_00147 [Prevotella sp. oral taxon 317 str. F0108] # 1 146 1 146 146 291 100.0 8e-78 MNRIKFIAVFGIWLLGFNTLEACNMKFKIIYANACLSTIISITPDFFDKGMIEGSLDSVM VVSHAECVEFMDVISSLKETKVKEGERLPEIDVRAKVIVFLNDQFWDSYYWGMFYLYHNG KIYEVNKEFRDRVNLMLKKGGKPKMF >gi|283510606|gb|ACQH01000013.1| GENE 3 1704 - 2006 141 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927407|ref|ZP_06421254.1| ## NR: gi|288927407|ref|ZP_06421254.1| hypothetical protein HMPREF0670_00148 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 100 81 180 180 205 100.0 8e-52 MMLFLLNCCFPNFAPRNKESECVDVDSGKFALITQIGVIDQEYPYSVYYIMNNDSVLVCK GYRIKMIRMEGDTLMIDMDGEILYHRYKIKKYRVKLLSLK >gi|283510606|gb|ACQH01000013.1| GENE 4 2193 - 2690 270 165 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927408|ref|ZP_06421255.1| ## NR: gi|288927408|ref|ZP_06421255.1| hypothetical protein HMPREF0670_00149 [Prevotella sp. oral taxon 317 str. F0108] # 19 165 1 147 147 280 100.0 3e-74 MNNSRLQFLLRKNAGKRYMAEYLTELSRLITNNKFRILNLEESDITSKSIMDNHEKLLFK MDYWECRNILFTQKEILKSVITKIQLVYSAPVYMSIGYSDMCGLVMIERISLFNSDFEFN DEHSGLIVLYDKSASNKLVIDFYEEACTCFYDLQLFGEQWVKCRE >gi|283510606|gb|ACQH01000013.1| GENE 5 2981 - 3202 161 73 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTPINNRHDTIITYDIDNLSSEGAETHVLYKGGKIKESTVYVYGAGGKMEIKYIFNRNFI DVRTNLSLPRHIP Prediction of potential genes in microbial genomes Time: Sat May 28 00:09:29 2011 Seq name: gi|283510605|gb|ACQH01000014.1| Prevotella sp. oral taxon 317 str. F0108 cont2.14, whole genome shotgun sequence Length of sequence - 13417 bp Number of predicted genes - 17, with homology - 13 Number of transcription units - 12, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 39 - 98 3.7 1 1 Op 1 . + CDS 261 - 1010 503 ## wcw_1574 transposase 2 1 Op 2 . + CDS 1101 - 1523 183 ## 3 1 Op 3 . + CDS 1496 - 1735 83 ## 4 1 Op 4 . + CDS 1738 - 2130 202 ## gi|302484289|gb|EFL47273.1| putative lipoprotein 5 2 Op 1 . + CDS 2459 - 3040 192 ## gi|288927410|ref|ZP_06421257.1| hypothetical protein HMPREF0670_00151 6 2 Op 2 . + CDS 3042 - 3218 94 ## 7 2 Op 3 . + CDS 3211 - 3714 200 ## + Prom 3800 - 3859 2.4 8 3 Tu 1 . + CDS 3970 - 4395 176 ## gi|260590898|ref|ZP_05856356.1| conserved hypothetical protein + Prom 4522 - 4581 3.4 9 4 Tu 1 . + CDS 4622 - 5200 166 ## gi|288927412|ref|ZP_06421259.1| hypothetical protein HMPREF0670_00153 10 5 Tu 1 . + CDS 5735 - 6181 252 ## gi|260590904|ref|ZP_05856362.1| conserved hypothetical protein + Prom 6210 - 6269 2.7 11 6 Tu 1 . 
+ CDS 6508 - 7107 284 ## gi|288927413|ref|ZP_06421260.1| hypothetical protein HMPREF0670_00154 - Term 7408 - 7442 0.8 12 7 Tu 1 . - CDS 7605 - 7859 106 ## gi|282881300|ref|ZP_06289983.1| IS66 family element, Orf2 protein - Term 8115 - 8166 -0.2 13 8 Tu 1 . - CDS 8191 - 10464 1374 ## BDI_2894 hypothetical protein - Prom 10609 - 10668 3.0 - Term 10648 - 10683 -0.6 14 9 Tu 1 . - CDS 10757 - 10963 134 ## gi|288928591|ref|ZP_06422437.1| conserved hypothetical protein - Prom 10996 - 11055 3.5 15 10 Tu 1 . + CDS 11178 - 11729 319 ## gi|288927415|ref|ZP_06421262.1| hypothetical protein HMPREF0670_00156 + Term 11941 - 11978 -0.7 - Term 11654 - 11690 -0.2 16 11 Tu 1 . - CDS 11788 - 12006 85 ## gi|288928591|ref|ZP_06422437.1| conserved hypothetical protein + Prom 11852 - 11911 5.2 17 12 Tu 1 . + CDS 12114 - 12647 428 ## gi|288927416|ref|ZP_06421263.1| hypothetical protein HMPREF0670_00157 Predicted protein(s) >gi|283510605|gb|ACQH01000014.1| GENE 1 261 - 1010 503 249 aa, chain + ## HITS:1 COG:no KEGG:wcw_1574 NR:ns ## KEGG: wcw_1574 # Name: not_defined # Def: transposase # Organism: W.chondrophila # Pathway: not_defined # 102 225 211 333 362 72 29.0 2e-11 MILNLSMEELNNLRWFQHSKDFEPYCTQIACILMLAHGHDAKTISYNLEISLYSVYNYAE LYMLGDISKLTDNRYKSYCMHNSHSTYACEEKDGHMGQFTVSGRYRINLNGFLNARDAAD VIALDCHSVNAQFRCQLYEAAPAKHPTTCGIYVISNDARFYRNKHLTQRVKGTRAKQFFL RLYSPNLTLIEHQWKMLRRGVVNTTFYRAKEKFRQVVMNFFDSIADYKQELEPLLLPNLR MVNSKTILV >gi|283510605|gb|ACQH01000014.1| GENE 2 1101 - 1523 183 140 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKLIFVYISIFCILFACYGKDIQSRHKNNNDSNKDCSSVSVRVDCLEVGKVMGLVGKLQ EVRELERDYKVKGRKAIVLLQSPNMDTPYFWIQVGFSTAYRFEPIYNFYVSPKKGSVFFY DTMTDSIIRIEDWRKIAQKK >gi|283510605|gb|ACQH01000014.1| GENE 3 1496 - 1735 83 79 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEEDCSEKVTTYECHIYDANGTLTLYLRNAIIEDGVNDDRVCSLIKETNLKVKKYGYSLS FKRYSSETSGFDIGVKKNK >gi|283510605|gb|ACQH01000014.1| GENE 4 1738 - 2130 202 130 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|302484289|gb|EFL47273.1| ## NR: 
gi|302484289|gb|EFL47273.1| putative lipoprotein [Prevotella disiens FB035-09AN] # 4 109 14 118 146 67 37.0 3e-10 MKYIIVFCVMLSMLSCKKNSIEMTIAGRNHRYWLKKSHAEKSQVYYYFDRQGRWYVYERA YKANVLTKYDGGDIMLIEKWSLINDTTINIGGMEYCIRECNDTLLVIESSMFTDTLMSLD LYPSVIPERP >gi|283510605|gb|ACQH01000014.1| GENE 5 2459 - 3040 192 193 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927410|ref|ZP_06421257.1| ## NR: gi|288927410|ref|ZP_06421257.1| hypothetical protein HMPREF0670_00151 [Prevotella sp. oral taxon 317 str. F0108] # 1 193 17 209 209 368 100.0 1e-100 MHDKTKKEMVMRFLYIIITLFVVVGCKTKLRNDCVAKDSIVVTYYRGHYESFDAVSFEDM KQMSDRKMPNETISLDSTLFRKFKGFIKKVSRDRLSVDDARFYLKWHKDEITMGIPVHLH ALQEQADKSLRYTIYQILWKTKYFNTVKEGSIKYVLLIKEFGIPVDYRYDETSSCSGKMP SKLKKIILIESNK >gi|283510605|gb|ACQH01000014.1| GENE 6 3042 - 3218 94 58 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLLDKSFVKVLLQINLMRGLPRESVPSCYTRMFHFITKKQYDATVKDVNKNNNNGLNE >gi|283510605|gb|ACQH01000014.1| GENE 7 3211 - 3714 200 167 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNKLITICVFTIGITFVCSCTHHFPKKVEWYISKHNILNDQGKGELDLREVLDIDYDTMY VFHSLIPLNGVQNIIGIKTYGDAEDPYTALIGSDSEMCRVILIKNNKVVYEDEYYYSHYK LQVLFETFYEVNGCGIFDGNLAPVHGYMCTKHHFTVSRVSDDKYMMK >gi|283510605|gb|ACQH01000014.1| GENE 8 3970 - 4395 176 141 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260590898|ref|ZP_05856356.1| ## NR: gi|260590898|ref|ZP_05856356.1| conserved hypothetical protein [Prevotella veroralis F0319] # 5 139 3 138 138 177 66.0 1e-43 MKFNWVILVFLIMTISSCCAQKTHSKREKIITNELSFINNKRVNFSIKKVACDTCFPIFD VGYRVSVKLTPKQKILVVRMKRGEWLSLLNDETTDYAANILLYYIYNRDAIVLLYNRSIK DWRDGMKEDDILYWEKRLKYH >gi|283510605|gb|ACQH01000014.1| GENE 9 4622 - 5200 166 192 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927412|ref|ZP_06421259.1| ## NR: gi|288927412|ref|ZP_06421259.1| hypothetical protein HMPREF0670_00153 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 192 73 264 264 366 100.0 1e-100 MRIAFVLFCFILFGHSMLFCQNIGSIQARWCIRGEKEGYDETRSCLYSIKNNTPSKLLIF FIEENNDSLTPVQLLRRKLLRRYGDFSLSMLEWESNMVIEDSCAIIPELFVKCLLPKETF EIIMLFNSSDREKTNVDITKHLLICEERVFSDSLIGMPNFVKNLSLYRVGYLYPYVVMNS SIFQAFINKQIK >gi|283510605|gb|ACQH01000014.1| GENE 10 5735 - 6181 252 148 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260590904|ref|ZP_05856362.1| ## NR: gi|260590904|ref|ZP_05856362.1| conserved hypothetical protein [Prevotella veroralis F0319] # 1 144 14 157 191 182 60.0 6e-45 MLVGLCACKSQDTEETKFLSHFIDLAEFPCGGILSTMPLPTKDTISYDILANRFLLPVNT IVLSNAHSQSTFCYVGKYKIADGYYVLACKEFYNYHDSRIIIYLYNDKQDVVTSSLLVGC HDEFLDVDSEYKNGTITIRTTYKKDNMG >gi|283510605|gb|ACQH01000014.1| GENE 11 6508 - 7107 284 199 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927413|ref|ZP_06421260.1| ## NR: gi|288927413|ref|ZP_06421260.1| hypothetical protein HMPREF0670_00154 [Prevotella sp. oral taxon 317 str. F0108] # 1 199 48 246 246 343 100.0 3e-93 MKRNALLLFLWALALVTNVRCASKSREKTEFQDSIVVFFRPIYSSKVVSAKELIMMSNKV PVYDTLFVDRGDYNSIKNFIENKEVIRTDVKGIPEVYLKTKRGTVFLTQFNPLAVDIDGS PLAISMGMAYRINTCSKMYNYYTKEEISWDTLFREYSLPEDYHYHYDKYYQEFKNNGHKL DSIFIVKKRGVKLWLIARE >gi|283510605|gb|ACQH01000014.1| GENE 12 7605 - 7859 106 84 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|282881300|ref|ZP_06289983.1| ## NR: gi|282881300|ref|ZP_06289983.1| IS66 family element, Orf2 protein [Prevotella timonensis CRIS 5C-B1] # 1 78 1 78 114 124 79.0 3e-27 MFALNETNADRVCCTPVDMHQVILRLCQFVRGNDFTPSDVCVYVFYNRPRNRIKLLHGER YGFVVYHKQMAQGCLSSKKSLCWT >gi|283510605|gb|ACQH01000014.1| GENE 13 8191 - 10464 1374 757 aa, chain - ## HITS:1 COG:no KEGG:BDI_2894 NR:ns ## KEGG: BDI_2894 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 2 750 1 770 775 133 22.0 3e-29 MMARILTTLWFAFVASLISSNAQNRIYGTVTDPSGDPLVGVDCVLASLSDTTAIVHTQTD SKGTFSLNVGEDGLFLLKFSCLGYQPQQLTCKSGDVGMVKLTESIQNLSEVTVSGQGLRS FGNADIILLDRRARQVGNNALEAISNLPQFRLSTNSGELLTADGKFVLVLIDGIRRSVRE 
LMLLKTDEIKTLHYYSNPPARFAHENIGAVLNVATKRTNKRHYSLYLDTKNSITTGYGTN LFSTSYSDSLNKVSAAYFLDYRHLNRNTMENRYTYPDVTNDYKGLSGKYTGAYHIGQLFY QRFQNNNLFNITLEYRKFPATQQYSQQLLRTGRLGDANHRRLQSDFSSLSANIYFGHTFR KGNSLSVNVVNTYFKSNSNNNLTNTSGTGTFTNIVDNKSYSLIAEALYTDKLWQGNIQIG AYYQYKRLQQTDGASSLSTVNTQKEYLYADYTNQWGALSYNLGVGVENNRYRTVTKAIAN YILMRPSVSLNYQLYKHGSLRLTSSIASAVPSVGQLTDSRTVVDERFYLQGNSSLKPYHF YRTELSFQYASANNIWFVKPSVSHAHYPSRNMTVVSADGTDVVNRIVDIHHVNEYGASLV LSCRPFSWLTLQPYYNYLFSKYDTPNQCINHNLNNVGISCQVVTGKWQFITNHNLPMTTV DGDVFEKHGYNANASLTYRLGAVTMGAMFIYNPVPSMIYADTPSFHFVEKTRWGNFKNLV ALTFVYSLTKGNSRRHLSQNINNSDNDSGLTKYNTAK >gi|283510605|gb|ACQH01000014.1| GENE 14 10757 - 10963 134 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928591|ref|ZP_06422437.1| ## NR: gi|288928591|ref|ZP_06422437.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. F0108] # 1 44 1 43 259 67 75.0 3e-10 MCEAMDKDTIKSEILPHLSVAKRGYALKSNLLEIIILCILYKLKRKTTNAIYVTDHKEIP PAILMCSN >gi|283510605|gb|ACQH01000014.1| GENE 15 11178 - 11729 319 183 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927415|ref|ZP_06421262.1| ## NR: gi|288927415|ref|ZP_06421262.1| hypothetical protein HMPREF0670_00156 [Prevotella sp. oral taxon 317 str. F0108] # 1 183 8 190 190 335 100.0 9e-91 MLILLSICSVLSNYGQKEVTYFINEKDTLIYKDMEYVFNKLQYNNISTGFTSLIRRLSLN RSAEMYIMNISKHNGKLLILIQNWEYRGIASLRKRNVYGMYRSKQMKDFLVCYDNSCPIA TLKRIFHRTDERISINTLMKILPNDEHIVIEDIATQYCGSLIKNRLRTDKFILNNKTVYY GKK >gi|283510605|gb|ACQH01000014.1| GENE 16 11788 - 12006 85 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928591|ref|ZP_06422437.1| ## NR: gi|288928591|ref|ZP_06422437.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. F0108] # 1 44 1 43 259 65 72.0 1e-09 MCKAMDKDTIKSEILPHLSVAKRGYALKSNLLEIIILCILYKLKRKTTNAIYVTEHKEIP LAILMCSNWSSR >gi|283510605|gb|ACQH01000014.1| GENE 17 12114 - 12647 428 177 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927416|ref|ZP_06421263.1| ## NR: gi|288927416|ref|ZP_06421263.1| hypothetical protein HMPREF0670_00157 [Prevotella sp. 
oral taxon 317 str. F0108] # 1 177 1 177 177 319 100.0 4e-86 MEKFEQLFVDYYNGVTAILEGERPAGFLSKMPFMYEKLLEEAIMEYDKITDTERWLKKEI ISLTDSNITIFRNKSIVFNRYYIHTLWRFDLICDYLNRKNIDDLNVGEQLNATLEFYAAN NQLDRIMRIIAELLSFIRKNETSELIYKKIMDSYYKLHVEDKTILLELEVYKKYCEP Prediction of potential genes in microbial genomes Time: Sat May 28 00:11:42 2011 Seq name: gi|283510604|gb|ACQH01000015.1| Prevotella sp. oral taxon 317 str. F0108 cont2.15, whole genome shotgun sequence Length of sequence - 117493 bp Number of predicted genes - 81, with homology - 74 Number of transcription units - 48, operones - 18 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 495 - 701 76 ## + Prom 493 - 552 2.7 2 2 Op 1 . + CDS 673 - 1191 365 ## gi|288927417|ref|ZP_06421264.1| hypothetical protein HMPREF0670_00158 3 2 Op 2 . + CDS 1216 - 1653 263 ## gi|260911898|ref|ZP_05918464.1| conserved hypothetical protein + Prom 1655 - 1714 9.1 4 3 Op 1 . + CDS 1924 - 2355 299 ## gi|270339875|ref|ZP_06203463.1| hypothetical protein HMPREF0645_1335 5 3 Op 2 . + CDS 2408 - 2617 100 ## gi|260910415|ref|ZP_05917087.1| conserved hypothetical protein - Term 2745 - 2789 1.1 6 4 Tu 1 . - CDS 2795 - 3757 1310 ## COG1482 Phosphomannose isomerase - Prom 3819 - 3878 4.0 7 5 Op 1 17/0.000 - CDS 4170 - 5627 1467 ## COG0168 Trk-type K+ transport systems, membrane components - Prom 5702 - 5761 3.0 8 5 Op 2 . - CDS 5833 - 7176 1703 ## COG0569 K+ transport systems, NAD-binding component 9 5 Op 3 . - CDS 7173 - 9059 1904 ## COG1154 Deoxyxylulose-5-phosphate synthase 10 5 Op 4 . - CDS 9073 - 9339 141 ## - Prom 9365 - 9424 2.5 + Prom 9255 - 9314 4.2 11 6 Tu 1 . + CDS 9363 - 11825 2356 ## BF2098 hypothetical protein + Term 11929 - 11964 -0.8 12 7 Tu 1 . - CDS 11925 - 12536 284 ## PRU_1109 hypothetical protein + Prom 13956 - 14015 5.4 13 8 Op 1 . + CDS 14072 - 14275 153 ## gi|288927424|ref|ZP_06421271.1| hypothetical protein HMPREF0670_00165 14 8 Op 2 . 
+ CDS 14262 - 15314 1214 ## PRU_1345 endonuclease/exonuclease/phosphatase family protein 15 8 Op 3 . + CDS 15361 - 15936 614 ## COG3663 G:T/U mismatch-specific DNA glycosylase 16 8 Op 4 . + CDS 16007 - 16630 848 ## COG0572 Uridine kinase + Prom 16753 - 16812 6.9 17 9 Tu 1 . + CDS 16845 - 17270 481 ## gi|288927428|ref|ZP_06421275.1| hypothetical protein HMPREF0670_00169 + Term 17304 - 17343 8.2 + Prom 17453 - 17512 5.8 18 10 Tu 1 . + CDS 17564 - 19546 2089 ## COG0358 DNA primase (bacterial type) + Term 19608 - 19652 9.1 19 11 Tu 1 . - CDS 19901 - 21067 1120 ## BT_0445 endoglucanase E precursor (EGE) - Term 21156 - 21206 10.5 20 12 Op 1 . - CDS 21302 - 22510 1214 ## gi|288927431|ref|ZP_06421278.1| hypothetical protein HMPREF0670_00172 21 12 Op 2 . - CDS 22606 - 22788 173 ## gi|288927432|ref|ZP_06421279.1| hypothetical protein HMPREF0670_00173 - Prom 22813 - 22872 3.8 + Prom 22848 - 22907 5.5 22 13 Op 1 . + CDS 22928 - 24022 1180 ## COG1408 Predicted phosphohydrolases 23 13 Op 2 . + CDS 24095 - 24760 523 ## gi|288927434|ref|ZP_06421281.1| hypothetical protein HMPREF0670_00175 24 14 Tu 1 . - CDS 24820 - 26250 1358 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes - Prom 26482 - 26541 6.5 25 15 Tu 1 . - CDS 27414 - 27638 110 ## gi|288927437|ref|ZP_06421284.1| hypothetical protein HMPREF0670_00178 - Prom 27833 - 27892 4.7 26 16 Tu 1 . - CDS 28770 - 28988 114 ## - Prom 29071 - 29130 3.7 + Prom 29056 - 29115 3.5 27 17 Tu 1 . + CDS 29191 - 29385 68 ## gi|288927438|ref|ZP_06421285.1| hypothetical protein HMPREF0670_00179 + Prom 31464 - 31523 4.0 28 18 Tu 1 . + CDS 31639 - 33123 2097 ## COG2268 Uncharacterized protein conserved in bacteria - Term 33561 - 33605 -1.0 29 19 Tu 1 . - CDS 33779 - 34579 516 ## CA2559_06345 hypothetical protein 30 20 Op 1 . - CDS 34634 - 35368 669 ## CA2559_06345 hypothetical protein 31 20 Op 2 . 
- CDS 35402 - 36157 623 ## CA2559_06345 hypothetical protein - Prom 36288 - 36347 3.5 - Term 37531 - 37573 4.8 32 21 Tu 1 . - CDS 37622 - 37840 128 ## + Prom 37715 - 37774 6.5 33 22 Op 1 . + CDS 37809 - 38915 910 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 34 22 Op 2 . + CDS 38912 - 39484 529 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis + Prom 39649 - 39708 3.2 35 23 Tu 1 . + CDS 39731 - 41893 1675 ## COG1629 Outer membrane receptor proteins, mostly Fe transport + Prom 41969 - 42028 6.0 36 24 Tu 1 . + CDS 42055 - 43041 780 ## PRU_1563 group 2 family glycosyltransferase + Prom 43198 - 43257 3.1 37 25 Tu 1 . + CDS 43327 - 44655 1306 ## COG2385 Sporulation protein and related proteins + Term 44718 - 44751 0.8 + Prom 44728 - 44787 1.9 38 26 Tu 1 . + CDS 44832 - 46166 1027 ## PRU_1564 putative transporter + Term 46201 - 46253 4.2 - Term 46333 - 46368 1.1 39 27 Tu 1 . - CDS 46471 - 47622 1091 ## COG0763 Lipid A disaccharide synthetase - Prom 47869 - 47928 7.3 40 28 Tu 1 . - CDS 47963 - 48907 830 ## COG0496 Predicted acid phosphatase - Prom 49002 - 49061 7.7 + Prom 48924 - 48983 4.4 41 29 Op 1 25/0.000 + CDS 49042 - 49806 935 ## COG1192 ATPases involved in chromosome partitioning 42 29 Op 2 . + CDS 50059 - 50952 1026 ## COG1475 Predicted transcriptional regulators 43 29 Op 3 . + CDS 50969 - 51703 717 ## PRU_0015 hypothetical protein 44 29 Op 4 . + CDS 51753 - 53126 1560 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) + Prom 53350 - 53409 2.7 45 30 Tu 1 . + CDS 53429 - 55702 2679 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases + Term 55788 - 55835 4.5 46 31 Op 1 . + CDS 57351 - 58916 1605 ## PRU_1339 hypothetical protein 47 31 Op 2 . + CDS 58934 - 59404 434 ## PRU_1338 putative lipoprotein + Term 59447 - 59498 1.2 + Prom 59628 - 59687 5.1 48 32 Op 1 . 
+ CDS 59793 - 62330 3113 ## COG1596 Periplasmic protein involved in polysaccharide export 49 32 Op 2 . + CDS 62350 - 63387 534 ## PRU_0084 chain length determinant family protein 50 32 Op 3 . + CDS 63393 - 64301 441 ## COG2746 Aminoglycoside N3'-acetyltransferase 51 32 Op 4 . + CDS 64339 - 65628 366 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 52 32 Op 5 . + CDS 65649 - 66956 376 ## gi|288927465|ref|ZP_06421312.1| hypothetical protein HMPREF0670_00206 53 33 Op 1 . + CDS 67210 - 68325 651 ## BF1021 putative glycosyltransferase 54 33 Op 2 6/0.333 + CDS 68315 - 69109 534 ## COG0726 Predicted xylanase/chitin deacetylase 55 33 Op 3 . + CDS 69106 - 70317 534 ## COG0438 Glycosyltransferase - Term 70308 - 70355 2.4 56 34 Tu 1 . - CDS 70366 - 71121 553 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 71161 - 71220 5.6 - Term 73389 - 73428 -0.4 57 35 Op 1 . - CDS 73510 - 74532 1271 ## BT_3507 hypothetical protein 58 35 Op 2 . - CDS 74556 - 76628 2527 ## BT_3506 hypothetical protein 59 35 Op 3 . - CDS 76644 - 79733 3303 ## BT_3505 hypothetical protein 60 35 Op 4 . - CDS 79758 - 81242 1429 ## BT_3504 hypothetical protein - Prom 81357 - 81416 6.0 - Term 81702 - 81752 14.1 61 36 Tu 1 . - CDS 81777 - 84536 2867 ## gi|288927474|ref|ZP_06421321.1| conserved hypothetical protein 62 37 Tu 1 . + CDS 84879 - 87590 2316 ## PRU_2177 putative glutaminase + Prom 87593 - 87652 2.6 63 38 Tu 1 . + CDS 87762 - 89048 1058 ## COG3507 Beta-xylosidase 64 39 Tu 1 . - CDS 89020 - 89262 126 ## - Prom 89310 - 89369 2.0 + Prom 89110 - 89169 5.2 65 40 Op 1 . + CDS 89189 - 90325 1084 ## BT_0169 hypothetical protein 66 40 Op 2 . + CDS 90398 - 91867 1581 ## BT_3504 hypothetical protein + Term 91932 - 91973 4.1 67 41 Op 1 . + CDS 92050 - 94044 1528 ## PRU_2075 hypothetical protein 68 41 Op 2 . + CDS 94123 - 98058 3380 ## COG0642 Signal transduction histidine kinase 69 41 Op 3 . 
+ CDS 98055 - 100151 398 ## PROTEIN SUPPORTED gi|15900011|ref|NP_344615.1| aldose 1-epimerase + Term 100348 - 100419 13.3 + Prom 100587 - 100646 5.0 70 42 Tu 1 . + CDS 100795 - 101910 1050 ## PRU_1340 hypothetical protein + Term 102122 - 102173 10.1 71 43 Tu 1 . - CDS 102118 - 102342 95 ## 72 44 Tu 1 . + CDS 103125 - 105896 2746 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family 73 45 Op 1 5/0.333 - CDS 106003 - 106884 1111 ## COG0388 Predicted amidohydrolase 74 45 Op 2 . - CDS 106903 - 107988 1211 ## COG2957 Peptidylarginine deiminase and related enzymes 75 45 Op 3 . - CDS 107991 - 108194 127 ## gi|288927486|ref|ZP_06421333.1| hypothetical protein HMPREF0670_00227 - Prom 108432 - 108491 1.9 + Prom 108048 - 108107 2.2 76 46 Op 1 . + CDS 108211 - 108849 192 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 77 46 Op 2 . + CDS 108871 - 109650 725 ## COG0390 ABC-type uncharacterized transport system, permease component + Term 109665 - 109699 4.0 + Prom 109807 - 109866 3.8 78 47 Tu 1 . + CDS 109886 - 109990 87 ## + Term 110020 - 110075 12.2 - Term 111018 - 111053 0.1 79 48 Op 1 . - CDS 111268 - 112122 603 ## Csac_0632 hypothetical protein 80 48 Op 2 . - CDS 112211 - 114604 1769 ## BVU_0314 hypothetical protein 81 48 Op 3 . - CDS 114619 - 117024 1975 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 117199 - 117258 3.1 Predicted protein(s) >gi|283510604|gb|ACQH01000015.1| GENE 1 495 - 701 76 68 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVKQNFLLIILLYFLLSLSNLLRHWQQSPFRSYCKPRLPLLKRTKYLTKVQEKLKLSYIY QLQNAIIF >gi|283510604|gb|ACQH01000015.1| GENE 2 673 - 1191 365 172 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927417|ref|ZP_06421264.1| ## NR: gi|288927417|ref|ZP_06421264.1| hypothetical protein HMPREF0670_00158 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 172 26 197 197 340 100.0 2e-92 MINKKFCFTMLAMGVFLTSCYQQYVRKDWVIDANANIYVYRITYGELCYNSITAISLEHD RVYYYANDSLLLSVDDLTHKKRRKVNDQAIVNLLHYLTVINLSRLEKPDWSRCGCVTKNE YVIRILKQGKNEYHFFPELLKCDQRLGLKFVEDLRMIFERMLEYDGNSDGFR >gi|283510604|gb|ACQH01000015.1| GENE 3 1216 - 1653 263 145 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260911898|ref|ZP_05918464.1| ## NR: gi|260911898|ref|ZP_05918464.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. F0295] # 1 144 1 125 131 100 43.0 2e-20 MNLYYLVLLMSLICAKGSAMKRLENFRVISYRLVSSDYHYSDERMKWRGMYYMPFLYEEY RKGTLQNVGIYKSEGLLLFTSSEVEKDSLMMDYVYIIKRREIYVGKKVVIDKEGGDKRHV YNTVIVKHRYYCYTGKYLRVYIPDT >gi|283510604|gb|ACQH01000015.1| GENE 4 1924 - 2355 299 143 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|270339875|ref|ZP_06203463.1| ## NR: gi|270339875|ref|ZP_06203463.1| hypothetical protein HMPREF0645_1335 [Prevotella bergensis DSM 17361] # 20 126 31 137 146 67 30.0 3e-10 MIVLLLSGMFMGYAKDVKSLQIVFTDIYMTTIFRITPSYFDKGALRGICDTVSIVGTKKI KELMQAVNELQEFTSYKTVPDTRGKIIFLYDDNQTEDLYFSAFFALYRGKLYELSNQFKR LINRIIISRKPDNQVFLITHGKK >gi|283510604|gb|ACQH01000015.1| GENE 5 2408 - 2617 100 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260910415|ref|ZP_05917087.1| ## NR: gi|260910415|ref|ZP_05917087.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. 
F0295] # 14 64 25 75 75 73 80.0 5e-12 MEKMVSIGVFMKVVSIVSCPSFVGLSASCLSYSLGVARYAFRIRPGNMLNNQTKSVQALT TDLGLRHVL >gi|283510604|gb|ACQH01000015.1| GENE 6 2795 - 3757 1310 320 aa, chain - ## HITS:1 COG:CAC2918 KEGG:ns NR:ns ## COG: CAC2918 COG1482 # Protein_GI_number: 15896171 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Clostridium acetobutylicum # 1 320 1 310 326 189 35.0 8e-48 MQPLKFKPLLKQTIWGGNKIVPFKHLDSNLQNVGESWEISGVPGDETIVANGEYEGKSLN EVLKELQGKLVGEKNYARFGNEFPLLIKFIDANDDLSIQVHPTDELAQKHGKSRGKTEMW YVMESDPKAYLNCGMKTAITPQQYKEMVANNTICDAICNYNVKRGDSFFIPAGRIHSIGK GCFILEIQQTSDVTYRIYDFNRKDKDGNTRELHTEEAAECINYTVEENYRTNYVPAKNVG VNLATCPFFTTNLYDLDQPMTIDLSQLDSFVILICTGGKGTVTDNEGNTVEMQTGDSILV PATTQNVKVEGVIELVQTYV >gi|283510604|gb|ACQH01000015.1| GENE 7 4170 - 5627 1467 485 aa, chain - ## HITS:1 COG:VC0042 KEGG:ns NR:ns ## COG: VC0042 COG0168 # Protein_GI_number: 15640074 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Vibrio cholerae # 1 482 1 479 481 286 36.0 7e-77 MINYRTIYKIIGSLLFIEAALMLSCLAMAIYYAEDDVMAFLVSIIATLFFGFVFRFMGRN SNNTMGRRDACLVVSLSWAIFSAIGTMPFMIGGYLHSFTDAYFETMSGFTTTGATIIDDV EALPHGIIFWRSLTQWIGGLGIVFFTIALLPSLVGGSVKVFSAEATGPIKSKLHPKLSTS AKAIWAVYLTISIACTLCYKFLGMGWFDSVNYSMTSLATGGFSTHNSSVEFFHSPAIEYA VTFFCFFSGVNFTLLYFTATKLKIRQLFRNSEFKLYLWLVLGFTAFIMVELIIRNHYELE KAFRAAIFQVVSFITTTGLFSDDAARWPHVTWVVLAMCMFIGACSGSTSGGFKCIRGVML LKVIRNEFKQLLHPNAVLPLKIDGVNVSTQKRVTLLAFLTVTLVMCLFCAFSMIAAGIDN TNAITITLSALSNVGPTLGIEIGPTMSWAQLPDFAKWLCSVLMLMGRLEIFTVLVIFTPA FWNDR >gi|283510604|gb|ACQH01000015.1| GENE 8 5833 - 7176 1703 447 aa, chain - ## HITS:1 COG:PA0016 KEGG:ns NR:ns ## COG: PA0016 COG0569 # Protein_GI_number: 15595214 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Pseudomonas aeruginosa # 1 433 1 436 457 188 31.0 2e-47 MKIVIAGACDIGIYLATLLSHSHENITLIDEDEERLGRMDAEADLMMLQASPSSVKTLKE 
ANAGDADLFIAVTADQHLNLNICMIAKALGAKKTVAKIDDVELTEPGVAELFERLGVSSL IVPETLAATDIINGLKMSWVRQRWDVHNGALVMLGIKLREGCEILNQPLKQLCGPDDPYH VVAIKRGYETIIPGGNDELKLYDLAYFMTTRQYIPYIRKIVGKEHYVDVKNVFFMGGGNT CVMAVKNIPSYMEAKIIEKDEQRCEELNDLLEEDKALVIHGDGRDIQLLNEEGIMNTQAF VALTSNTETNILACLTAKRLGVRKTVAMVENMDYVSMADSLDIGTIVNKKDIAASHIYQM MLDADVSNVRFLTMANADVAEFTAQQGSKVTKKRVFELGLPRGVTIGGLVRKGVGILVSG GTQIEAGDSVMVFCHNTNMQQLGKFFN >gi|283510604|gb|ACQH01000015.1| GENE 9 7173 - 9059 1904 628 aa, chain - ## HITS:1 COG:aq_881 KEGG:ns NR:ns ## COG: aq_881 COG1154 # Protein_GI_number: 15606220 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Aquifex aeolicus # 6 626 5 618 628 516 44.0 1e-146 MGENSFELLDKIRYPEDLRKLDVEQLPQLCQELRQEIIEEVSVNPGHFASSLGVVELTVA LHYVYDTPEDRIVWDVGHQAYGHKLLTGRREQFFTNRKLGGIRPFPTPMESPYDTFTCGH ASNSISAALGMAVAAKLGGDQQRHVVAVIGDGAMSGGLAFEGINNVSSTDNDLLIVLNDN DMSIDRAVGGMEKYLLNLDTNETYNRLRFKAAQWLHAKGWLDDDRKKGILRLNNALKSAL SHQQNIFEGMNIRYFGPFDGHDVKEVVRKLRQLKLMRGPKLLHLHTTKGKGYKPAEESAT IWHAPGKFDPATGERLVCNNTGEPPRYQDVFGETLVELAQVNPKIVGVTPAMPTGCSLNI MMKAMPNRAFDVGIAEGHAVTFSAGMAKDGLMPFCNIYSSFSQRAYDNIIHDAALLNLPV VLCLDRAGLVGEDGPTHHGVFDIAALRAVPNLTIASPMDEHELRNLMYTAQLPNKGTFVI RYPRGNGVCPDWRSPFEEITVGTGRCLRAGTDVAVITLGPIGNDVAKLLDEMQGTKQVAH YDLRFVKPLDENLLQDIGRKFNKIITIEDGVRNGGMGSAVLEWMSDHSFRPQIVRMGLPD AFVEHGSVAQLRKLVGLDADSIRKEIEA >gi|283510604|gb|ACQH01000015.1| GENE 10 9073 - 9339 141 88 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRPASKPNAKVVYTFGLCNKKAYYFTSAPLLRLILMTSASLQHLHTNRCKPRNRQDKHES SIKTLPWQANNERGVIEIRLKAVPLCRN >gi|283510604|gb|ACQH01000015.1| GENE 11 9363 - 11825 2356 820 aa, chain + ## HITS:1 COG:no KEGG:BF2098 NR:ns ## KEGG: BF2098 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 40 812 112 919 922 357 29.0 8e-97 MRIRLLLIYIAVGLLAVSSLPTRATTSSEPRHLVSDPDTLRQGKDLGEVVVVARRKLIQM RNDTTLINVGGLHTKRGGSLEYLLRKVPGMRYDRVTGRLTYNGKPLVKINLNGSTFMGNN 
IARALAALPAEAVSQLKIYDLLSTLEKATGITDGLQELTLDIQTKDEFNGALTTNVKAEH GTQGRRNDSALLNWLRSNGESLSFGVASSNLENTTAGNGNYTNSVSIDGNKHFGKQLKLN ASASLYDFHQEMFGSSYSEEYLTTGTRADASTSSTKGRNTSGSLWFFSTYEMSKRSQLNV NFSASLYGNKSQSAQETQLFDKRARIDTITLGNEESFSRGHSNSFSASADFTHAFAEGGT ALSIMGRMDANNAQSTNNSHSRTHYRQLKNKLGTDSLQLRELQQLSPTQSRQTSFTALLT QPLGKKVRLQAGWGIVAAKNYRKQDTYDQARNSGQWVDSLSNESKLTSLGQEFRLTFNYD SEHLSANGGLTLLPQRKRFHQRYHGEERDTTLRHADFKPMAHVRWRGKGMSVRLEYSGYT SQPSAEMLLAFRNTTDPLAIREGNPNLKPAFNSMFDLEWEHERWGLSLSANFQNTLNDFT EEMRLDRQSGARHYRQVNINGNNSFNANLGWNKQLGLWNLGLSMALGLRREVSLNGDESE AEVTKSVTKMREGEVLLLCGFVPKWGDITFSAHWQPRFSHNIQTQTRTKNHDFSLSVNAM ADLPFDLELSTDASCYLHRGTMTSSDADQWLWNMSLAWSFLRSKQATLTLSWNDILNRRQ SLERTASATGFNESYRPVIKSYVLLSFSYQLNFKKKEGNS >gi|283510604|gb|ACQH01000015.1| GENE 12 11925 - 12536 284 203 aa, chain - ## HITS:1 COG:no KEGG:PRU_1109 NR:ns ## KEGG: PRU_1109 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 24 200 18 194 201 173 45.0 3e-42 MNKDSYIEESKTLKSRIKALPVDSVLFRSDYPEYHSEFVGSTLAELTESGVLFKMAQGIY VKPRKSRFGLVFPSIEKIVQAIAMRDNAEVLPSGTTALNALGLSTQVPMNYSYLTSGSER TIKLANRQVVLKRGVPKNFCYKTRLIALLTQALRALKQENIGDSEIQIIRELIAKETDKE SLAKDVDAMPGWMKRIIKPMLNN >gi|283510604|gb|ACQH01000015.1| GENE 13 14072 - 14275 153 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927424|ref|ZP_06421271.1| ## NR: gi|288927424|ref|ZP_06421271.1| hypothetical protein HMPREF0670_00165 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 67 1 67 67 108 100.0 1e-22 MAGESRKDSNSRTFLADLMGNAKLAWSIAPGENLEIAYTECLAKRKGKNALLKNKKTNNN TQKHELF >gi|283510604|gb|ACQH01000015.1| GENE 14 14262 - 15314 1214 350 aa, chain + ## HITS:1 COG:no KEGG:PRU_1345 NR:ns ## KEGG: PRU_1345 # Name: not_defined # Def: endonuclease/exonuclease/phosphatase family protein # Organism: P.ruminicola # Pathway: not_defined # 28 347 1 319 320 432 62.0 1e-120 MSYFRRMLALSALLVVVLSAAAQKKFEVYAAGFYNQENLFDTCHDEGKRDYEFLPSGSYK WNGMKYTHKLHNMARALADMGTDVLPGVGCAVIGLAEVENAKVLTDLTAQPELAARGYKF CHIEGPDRRGIDCALLYNPALFAVRNVKLVPYVQSLEKDSAFYTRGFLTVSGTLAGEHVT VVVCHLPSRFSDSFYREQGARQILAIRDSVQREDKDCKVLVMGDMNDDPMDKSISQGLHG KANMSEVGQGDMYNPWYNVLTKEGVGTLQFQGSWNLFDQILLSKNLLNANGSKDFTALKY WKNQIFKRDYLFQTEGKYKGTPKRTTAGGVWLDGFSDHLPVVVYLVKEKE >gi|283510604|gb|ACQH01000015.1| GENE 15 15361 - 15936 614 191 aa, chain + ## HITS:1 COG:NMB0698 KEGG:ns NR:ns ## COG: NMB0698 COG3663 # Protein_GI_number: 15676596 # Func_class: L Replication, recombination and repair # Function: G:T/U mismatch-specific DNA glycosylase # Organism: Neisseria meningitidis MC58 # 2 188 35 220 229 164 42.0 7e-41 MVERHPFKPFLPKGCKLLMLGSFPPSQKRWCMNFFYPNFTNDMWRIFGLAFFEDKQHFVD EANKTFKLDELKAFLTAKGVGIYDTATAVNRTTGTAADKDLEVIEPTNLDELLLQVPACK NVVVTGQLAADVLRAHFGIAEQPKVGTYVPFVFQPDGREMRLYRMPSSSRAYPLKVEKKT EWYKAMFEETL >gi|283510604|gb|ACQH01000015.1| GENE 16 16007 - 16630 848 207 aa, chain + ## HITS:1 COG:BH1275 KEGG:ns NR:ns ## COG: BH1275 COG0572 # Protein_GI_number: 15613838 # Func_class: F Nucleotide transport and metabolism # Function: Uridine kinase # Organism: Bacillus halodurans # 2 204 3 205 211 203 50.0 1e-52 MQKTTIIGIAGGTGSGKTTVVKKIAEALPPHYVAVVPLDSYYNDTSDMTEEERHAINFDH PDAFDWKLLTKHVNDLRSGVAIEQPTYSYLLCNRLKETVHVEPKPVIIIEGIMTLLNKRL RDIMDLKVFVDADPDERLIRNIQRDTIDRGRTVSMVVERYLEVLKPMHEQFIEPTKRYAD LIIPQGGENEKGINILCSYIEGLVPVE >gi|283510604|gb|ACQH01000015.1| GENE 17 16845 - 17270 481 141 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927428|ref|ZP_06421275.1| ## NR: gi|288927428|ref|ZP_06421275.1| 
hypothetical protein HMPREF0670_00169 [Prevotella sp. oral taxon 317 str. F0108] # 1 141 1 141 141 153 100.0 4e-36 MKKIVLASIMMLSIGITGCASSQEFSELKTQQKALKLQTQLTNTQLQYEKAIASLAELRK KAADVNAEANASVVTGLSIRDAEATAKAAKARAKTLKEVTKMNKKLAKEQKKVEKLEKKM EKIQDKISKLNQKVEFVGGWR >gi|283510604|gb|ACQH01000015.1| GENE 18 17564 - 19546 2089 660 aa, chain + ## HITS:1 COG:CAC1299 KEGG:ns NR:ns ## COG: CAC1299 COG0358 # Protein_GI_number: 15894581 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Clostridium acetobutylicum # 1 436 5 429 596 276 36.0 1e-73 MIDRATIEKIQEAANIVDVVSEFVTLRKAGANYKGLCPFHNERTPSFMVSPARGICHCFG CGKGGTPVSFIMEHEQMTYPEALRWLANKYHIEIKERELSDDERREQNERESMFIVNEWA AKHFEDTLHNNVDGLAIGMQYFRSRGFRDDIVRKFRLGFDLNDRELLARTAIDKGYNVEF LVKTGLCYRTDDGRYVDRFAGRVIFPWFGVSGKVVGFGGRLLDSRTKGVQQKYVNSPDSD IYHKDHELYGIFQGKKAIAKEDCVYMVEGYTDVVSMHQCGVENVVANSGTALSIHQIHTL HRFTSNIVLLYDGDAAGIHAALRGTDMLLAEGMNVKVLLFPDGDDPDSFARKHNAEEFRK YIADNQTDFIQFKTRVLLDGVTDPARRSEAISSIVQSVSVIPNQILRDTYLHDCAQRLGV AETTLINSMNRYIRESRERRSTETSRAVQESMETTPRPTIQTVTPLQQAAKVERMLVEQV VKYGERIVLRNVEDEDGNLLNLTVAQFIQYDFMQDELAFQHPLFNRILQEAAERSVYPDF KAEPYFVHHEDIELSKLATELCMDPYQYLKLKQPEPTHPLDEEEKRLKEKEKEDELRQRT VHLLLDFRLDYVEHRLKTLQTEIAQAASDPEKLMQLMAEFKDMQTIRNELARRLGSEILV >gi|283510604|gb|ACQH01000015.1| GENE 19 19901 - 21067 1120 388 aa, chain - ## HITS:1 COG:no KEGG:BT_0445 NR:ns ## KEGG: BT_0445 # Name: not_defined # Def: endoglucanase E precursor (EGE) # Organism: B.thetaiotaomicron # Pathway: not_defined # 45 387 20 360 366 299 43.0 1e-79 MNPICNIHTPYYIGATNTATSTRHMLKSIFAALTILFAALVPTTAQAANDTQIKATSPEV QYIGRVEHNATAGTVRYDWVGTYLRTGFTGTEIAVLISDEGESYHNVFVDGKWIKKIRVK GNTPQRMVLANGLKSGKHQLVLQKCTEGEYGCTTVHALLLSKGGSLHAVPVPKRLIEVIG DSYTCGYGTESNKATDPFKLETENCDKAYACLLARYFGADYVLAAHSGRGMVRNWGDTVQ ISKGNMSQRYLQLFDHYATTPYDFKAYRPQLVLINLGTNDYSTVITPSVEQYVGAYVKLI ELVRKHYGPVPVICIRPHSAGAYLSASFKVLQQRLSAHKDVHFAEFMPGIITVDKDLGAS YHPNYSGQQKLCMTLVPLVSAVMGWEIN >gi|283510604|gb|ACQH01000015.1| GENE 20 21302 - 22510 1214 402 
aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927431|ref|ZP_06421278.1| ## NR: gi|288927431|ref|ZP_06421278.1| hypothetical protein HMPREF0670_00172 [Prevotella sp. oral taxon 317 str. F0108] # 1 402 1 402 402 754 100.0 0 MKIEEGSTTGPWIGPVLGEVPINLLAQEKGDKLNANIGIDFQGMTIKVVFGEGYQVPNSD FENFGASKEPNRWHSFQSVITEGWFTSLAKGQQTKESTDVRPGSLGKKSLCVYSRSIIGV TANGTVTTGRLKAGSTTATDTRNNSFLDLANKDKDGNGDPFYTELSGRPDSLTLWVKFKQ GKPSADHPYATAKAVITDGTYYQLPEEKGKTYKKMAEAINNEIADTKGEWKRLSIPFSYV NNSIDPKAILVTLSTNADAGKGSGSDELYVDDLELVYNFGVEGISIKGQALANFAENTTE YTHIVGNATADDITVKTKGQGMLVAKTVENGKATVLVASNDLSKYRLYTINVTTGIDNLP SVEGNKQVEIYTLDGVRVNNTNRKGVYIIKDAQGKTRKVVKQ >gi|283510604|gb|ACQH01000015.1| GENE 21 22606 - 22788 173 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927432|ref|ZP_06421279.1| ## NR: gi|288927432|ref|ZP_06421279.1| hypothetical protein HMPREF0670_00173 [Prevotella sp. oral taxon 317 str. F0108] # 1 60 1 60 60 97 100.0 2e-19 MKRFFTALFVVLGTVTTFAKTYTGKLVVYINGNIADNSTATINVDQQGDGTYKLSLANFP >gi|283510604|gb|ACQH01000015.1| GENE 22 22928 - 24022 1180 364 aa, chain + ## HITS:1 COG:CAC3027 KEGG:ns NR:ns ## COG: CAC3027 COG1408 # Protein_GI_number: 15896279 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 70 356 86 384 392 157 33.0 2e-38 MKLGMLLFLLIPIVGHVYVFWHVWNIVPLNNWLKIILLVLLSLGFISLLVGFGAGLDKMP MTLATAIYELGTSSVFVLLYAVLLFLLLDLGRLAHLIPKQFLYNSLSGTLTVLGILLAVF VYGNIHYNNKERQELHLNTSKHLDKPLKVVMISDLHLGYHNRRAEFARWVDLINAERPDL VLIAGDIIDISVRPLLEEHVAEEFRRIKVPIYACLGNHEYYSGDANAEKFYRDANINLLR DSVVQVMDLNLVGRDDRTNGRRASLKTLMGKVDPSKYTILLDHQPYHLEEAQRAGVDFQL SGHTHYGQVWPISWIEDAIYEDAYGPLTKGNTHYYVTSGIGIWGGKFRIGTRSEYVVTTI SQSF >gi|283510604|gb|ACQH01000015.1| GENE 23 24095 - 24760 523 221 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927434|ref|ZP_06421281.1| ## NR: gi|288927434|ref|ZP_06421281.1| hypothetical protein HMPREF0670_00175 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 221 1 221 221 449 100.0 1e-125 MKSNFKLTQAHLKAWFGEFNRDYFGGKLPEPRLMLSRSRTQLGSMHCKRRAGLGRVETFD YAIHVSILFEQDEKGFKNVLLHEMIHYYIAYNNIQDTAPHGDVFKAMMNRLNSEYGWNMK VSERGKALQVAQEYVPQREYLVLALSLSSGKKMFSVVSPSNFRKLDGQVKRVREVENYGW YVSKDVYFNGYPKVRTLRGVPVAPDFFDEMLKKMQPLVLPE >gi|283510604|gb|ACQH01000015.1| GENE 24 24820 - 26250 1358 476 aa, chain - ## HITS:1 COG:lin2646 KEGG:ns NR:ns ## COG: lin2646 COG1502 # Protein_GI_number: 16801708 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Listeria innocua # 23 476 21 482 482 334 37.0 2e-91 MPMQYLHTAIIALYIIVTIGVMVRVLMDHRQPAKTMAWLLVLTFVPVVGIVFYFFFGQNT RKERLISQRSMDRLTKRSMLEYVEQGELQLPDAHKELINLFTNLSLALPYKDNEVEIFTT GHDFFLDLLAEIGQARHHIHLGTYIIEDDALGRLVADALIDKAKQGVEVRLIYDDVGCWS VGNSFFDHLARGGVQVQSFMPVRFPMFTSKMNYRNHRKLCIIDGRTGYIGGMNVALRYVK GTRKQEWRDTHLRLRGGAVYGIQRAFLVDWYFMARTLITDKVYYPMLSPQISNNCIAQVV TSSPVSPWPDIMQGYMRILLEAKKYVYMESPYFLPTEPILFAMRTARQAGVDIRLMLPAR TDSKIIEWAGRSYVQAALEAGVMICFYCAGFNHSKLLVCDDSLCTCGSTNVDFRSFENNF EANVFFYDKEMALKMKEVFLTDEAECIVLDSPDIFAHRPFLARLWESLIRLLSPLL >gi|283510604|gb|ACQH01000015.1| GENE 25 27414 - 27638 110 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927437|ref|ZP_06421284.1| ## NR: gi|288927437|ref|ZP_06421284.1| hypothetical protein HMPREF0670_00178 [Prevotella sp. oral taxon 317 str. F0108] # 1 74 1 74 74 144 100.0 1e-33 MYEVPDKDTTKSESRFRLLVAKCGNASKGDQAAVIQSSLHKVKNRLSMADVFCVCMMPFP LVVSSLPYNRSRWR >gi|283510604|gb|ACQH01000015.1| GENE 26 28770 - 28988 114 72 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFSCPTGQIQRHCIAYEHSHNHVLMYRKFMLYIFDFHSPTPRSNRIAFSKRTIVDKHLST LNSKLTYRKVRH >gi|283510604|gb|ACQH01000015.1| GENE 27 29191 - 29385 68 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927438|ref|ZP_06421285.1| ## NR: gi|288927438|ref|ZP_06421285.1| hypothetical protein HMPREF0670_00179 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 64 1 64 64 122 100.0 6e-27 MTNVIERPFVSEGVLNTPPLVQGLNMADANLLLCRKYKVNGGKGHVSGVAPLVCRLNCLQ HAAK >gi|283510604|gb|ACQH01000015.1| GENE 28 31639 - 33123 2097 494 aa, chain + ## HITS:1 COG:BS_yuaG KEGG:ns NR:ns ## COG: BS_yuaG COG2268 # Protein_GI_number: 16080153 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 16 480 17 493 509 183 29.0 6e-46 MPEQFSLYVIIGIAAVILVLFFFASYVKAPPSYAYIISGLSREPRVLIGSGGFRIPFFER LDRVYLGQITVDIKTEESVPTTDFINVDVDAVAKIRVTPNAEGTRLAAKNFLNMTPMMIA EQLQDSLQGNMREIIGTLDLRSLNTDRDGFSDQVMQKAQHDMAKLGIEIISCNIQNVTDK EGLIHDLGADNTAKIKKDASINRAIAERDVKIQVAHADKDANDARVDADTAIAMKNNDLA LKRAELKRQADTAQADADAAYSIQQQEQQKTINIKTVEAEIEKTKRQQILSQEQIVIRQN ELSAEVEKRADAEKYQVQKNAEADLEQRKRIAEAQRYEAEQQAMAQNAASDATRYKLEQE AQGIKAKGEAEAYAILKKGEAEAQAMDKKAEAYKKYNNAAVAQMMIEVLPQIVENVAKPI SAIKDVNIYSGDGNGISAMSGNVPVAIKQAFDVLKSATGVDMADIMKAGSIQAKTTRNIN LNGEAQEVVDGLKD >gi|283510604|gb|ACQH01000015.1| GENE 29 33779 - 34579 516 266 aa, chain - ## HITS:1 COG:no KEGG:CA2559_06345 NR:ns ## KEGG: CA2559_06345 # Name: not_defined # Def: hypothetical protein # Organism: C.atlanticus # Pathway: not_defined # 67 259 24 216 220 167 48.0 3e-40 MGCKQAKSESFRKSNEQCRVCSNIGMARKHTFRGTRKESLKVTTSKQGLVQNSVRLATAP KMELKTDPKAFVPKGYSLFQQYKGDLNKDGKPDVVLMIKGTEKSKWVDDECRGRLDRNRR GLIILFKRKNGYEQILRNDTCFSSENEDGGIADAPELELYIIKNTLHIYFAHGRYGCWNY IFRYQNNDFELIGYNHNRCIRYVTYYNLDINFSTRTRVYEENLNVDDNEKEEHYKVTKSK IKRRKLIKLSEIADIDKLNWGDFSNE >gi|283510604|gb|ACQH01000015.1| GENE 30 34634 - 35368 669 244 aa, chain - ## HITS:1 COG:no KEGG:CA2559_06345 NR:ns ## KEGG: CA2559_06345 # Name: not_defined # Def: hypothetical protein # Organism: C.atlanticus # Pathway: not_defined # 45 238 24 217 220 200 55.0 4e-50 MFAVALITGCKQAKNESGKDTVPTKDTVQASVKPVAVAQMELKADPEEFIPDGYVLFSGC EGDLNKDGKPDVVLMIKGTEESKWVDHEYCGRLDRNRRGLIILFKRDGGYELIAENDECF SSENEDGGVYYAPELDFYIKNNTLILHYAHGRYGYWKYIFRYQNNDFELIGYFGSYDRGP VVLYITDVNFSTRTCVYKENINADDDEAEEKFKVKTIKFKRKNLIKLSEITDFDELDLDL PKDD 
>gi|283510604|gb|ACQH01000015.1| GENE 31 35402 - 36157 623 251 aa, chain - ## HITS:1 COG:no KEGG:CA2559_06345 NR:ns ## KEGG: CA2559_06345 # Name: not_defined # Def: hypothetical protein # Organism: C.atlanticus # Pathway: not_defined # 53 245 25 217 220 205 56.0 1e-51 MQRTFYLLFVVALMMGCKQAKNERGKDTVPTKDNVQASVKPAAVAQTDLKTNPLAFIPEG YDLFSRYEGDLNKDGKPDVVLMIKGTDKSKWVDDEYRGRLDRNRRGLIILFKRDGGYELI AENDECFSSENEDGGVYYAPELRLEINKNRLIISYLHGRYGYWSYIFRYQNNDFELIGYD GYSSRGPVTLRILEVNYSTRTCVYKENINADDDEAEEEFKVKTIKFKRKNLIKLSEITDF DELDLGLPDDE >gi|283510604|gb|ACQH01000015.1| GENE 32 37622 - 37840 128 72 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIIGNASFIIIVVSLLYSNKLCLCDYIACESKFKALILYYFTGLGRVPMRWYIVDYDVVD LLGRWPHDVGLA >gi|283510604|gb|ACQH01000015.1| GENE 33 37809 - 38915 910 368 aa, chain + ## HITS:1 COG:BS_tagO KEGG:ns NR:ns ## COG: BS_tagO COG0472 # Protein_GI_number: 16080606 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Bacillus subtilis # 1 317 1 297 358 124 32.0 2e-28 MMMNEALPIIIAFLLSVCFGFVFIPGILNFCKEKNIYDIPNQRKVHKTLVPRLGGLAFVP AITLSIIVAVGVMSVGFTEPKIQVSLWSMAFLASLLLVYATGVVDDLIGLNAHVKFTVQV ITACALPLCGLYVNNLYGFMGIHEVPYWLGFPLTVFIIVFIDNAMNLIDGIDGLAAGLSI LALLGFLFVFGNAQLIPYAVIVASAIGILIPYLRFNIWGKAEQNRKIFMGDSGSLTLGFI LGFLFVKSAMHNPPLMAASPERFVLAYSLLVVPVFDVVRVVLHRLRTRQPLFSADKNHIH HKLMRLGMTQHQALIFILGLAIAYVAMNMLLFPLVSITGIVLIDVATFTLFQLLLTRKVE RKELKIAA >gi|283510604|gb|ACQH01000015.1| GENE 34 38912 - 39484 529 190 aa, chain + ## HITS:1 COG:BH3650 KEGG:ns NR:ns ## COG: BH3650 COG2148 # Protein_GI_number: 15616212 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Bacillus halodurans # 22 181 17 174 205 154 48.0 9e-38 MKRLFDLIVSALCLVIFSPLILLCAIAIRLDDGLPVIYTQERIGRKGMPFFIYKFRTMRM DAEENGPMVLEVDGDDRLTRVGKFLREHHLDELPQLWNVFKGDMSFIGPRPERSFYIEQI 
MAKDARYEQLYALRPGVTSYATLYNGYADSIEKMLKRLELDLYYLQNRSWWMDTRILACT FFKIAGGKKF >gi|283510604|gb|ACQH01000015.1| GENE 35 39731 - 41893 1675 720 aa, chain + ## HITS:1 COG:CC0815 KEGG:ns NR:ns ## COG: CC0815 COG1629 # Protein_GI_number: 16125068 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Caulobacter vibrioides # 395 705 414 725 737 76 24.0 2e-13 MQKFISLTLITLSCLSSYAQVEQQSSLPSKNDTTAIDSSDYWRRLDLNEVIVIAEKTVID HKPDRIVYYTKNDPYAKGLNGVEVMRRLPRVSVVNEAVNVAGKANVRYIIDGRLLETSES ETLMKLKSLRADNIERIELFTLAPAKYPAADNVCYIAIKTKRDETLGVSGNVDANLIVKE SLNSYFGGGIRQATKRVDYSIDLNSNRNKGINDIIREYTFADHIKLSQRRNDFTNKLFNL NALLKYKPTATIETGLMLNLGTERLKNELQDLTIDRGLNFRSHAHSPSNPINATTLTGYA DWNLDNKGKLLSLTYNYFNKATKSTSHISTTEGTSSSMMNNSGNNNYKIHAVKLDATLPF NAVNMEIGAAYTSISNTSAISTETLTGGVWTKLLGQNNNFRYTEKTNAAYVSLSKDLSPK WYVQAGLRFEHTHLTGRNVAEHNKQSYNRLFPTLNLSYRAETGYSLSAAYSMGINRPRFS DLNPFRYYTTTTDYVSGNAYLSASITHNAELSFSHKGLYAVAYAYQLKDGVGYVTRFTPD NSQYTTPQNYIDYRKYGVYASYQKNFTAWWNVKVGGELFYAKSQSSIDDSHPSNSTSWSG KLEGTSDVAFNSQHTLLFTIQYLHMFPHDEDLVRYKALSLLNASLRWQLINGRLQLNLSA SDPFLQNITRATKHYSAYEEYMETNAHVRNVSLKITYLFGGKSVRDVYKDNKETESNRSY >gi|283510604|gb|ACQH01000015.1| GENE 36 42055 - 43041 780 328 aa, chain + ## HITS:1 COG:no KEGG:PRU_1563 NR:ns ## KEGG: PRU_1563 # Name: not_defined # Def: group 2 family glycosyltransferase # Organism: P.ruminicola # Pathway: not_defined # 15 325 488 804 1224 329 48.0 7e-89 MEIKDGQGTTKATEVQGQDILNRFFEEQLQVWPDAQQRFNDLKGVLLKAVTCAGMPFYVQ CNPARLVSTGAKIDKAALAQRPCFLCEKNRPAVQTQIPIDNDFELLVNPFPILPLHFTIP AKVHQPQLICEHYGKMRRILECFEHLIVFYNGPKCGASAPDHMHLQAGTGTHLPLRDNWE QLYNNRQTLLSTPDGSELAAINNYACPAFTIVGRDAKADVTLFETLYKALPQREDETEPM FNILKWREGETYITVVIPRDKHRPACYSAEGDAQMLISPGALDMAGLVITPRKEDFERLN TPLLESLYKEVGMPTTTFEDVKRKIMRS >gi|283510604|gb|ACQH01000015.1| GENE 37 43327 - 44655 1306 442 aa, chain + ## HITS:1 COG:slr0191 KEGG:ns NR:ns ## COG: slr0191 COG2385 # Protein_GI_number: 16331612 # Func_class: D Cell cycle control, cell division, chromosome 
partitioning # Function: Sporulation protein and related proteins # Organism: Synechocystis # 85 440 199 524 535 112 28.0 2e-24 MNPEYNATPQVSVGIMNGTEIVFVLNTPYQVNNSEAQGQQTVCLNKGRIVWNGQEYDQLT FEPISAEANFSLHNVTIGVNFHWEQQQTQTFKGALRFVIDGNKIVAINQLSVEDYLESVI SSEMSATSSPELLKAHAVISRSWLLAQIQNRDKKTATTGPAAFVMNENEIVKWYDREDHT LFDVCADDHCQRYQGITKATSKHVAEAVQATRGQVLLSEEGICDARFSKCCGGMVEEYRY CWEDEDKPYLKVVRDLPNPAVDLDLTNETNSEAWIRSTPDAFCNTHDTRILSQVLNDFDQ TTKDFYRWRLEYTQAQVGELLKKKLGIDFGLITAFEPIERGKSGRLSRLKIVGTKRTLTL GKELEIRRALSESHLYSSAFVVDVAERDAEGVPLRFVFTGAGWGHGVGLCQIGAAVMGEK GYAYHEILAHYYPGAEHKRLYK >gi|283510604|gb|ACQH01000015.1| GENE 38 44832 - 46166 1027 444 aa, chain + ## HITS:1 COG:no KEGG:PRU_1564 NR:ns ## KEGG: PRU_1564 # Name: not_defined # Def: putative transporter # Organism: P.ruminicola # Pathway: not_defined # 4 402 2 390 422 294 44.0 4e-78 MTTSKETPWRWVPSIYFGTGTAQAAIATLALLLYKQLGLSNAEITLYTGLLFLPWLLRPL WSPFLGLIRTPRWWIVAMQLMLGVALGGVAFTIPTAHWLQGTFAFLGLLAFALAVHETEA DVFYHDTVMGSTRSGLSSMPNTFRLLAFIFVQGFLVMVAGNLQLLYRNSISLSWSLIFYS VAGLFILSWLWHRTALPSPYNECLHLLRRQQRREQFAERWREVKKDIFSFFAQHPRQQLA AVFCFTFCFLLPETLTSKVAMLFLVDSNHNGGLGLSPQEFGLTYGTVGVVGLIVGSLLGK AIIATKGLRTIVLPLSAVILLPNALYVLLSETQPTSLGVINLCIFIGQTGLGLALVTYWT TLKSISKALGNHTVYIFLTSVMALAQMVPSMFSGALQEWLGYNNLFLIALACGVISLAGC VWIRPMLLNKGNDDVENNKNIDFS >gi|283510604|gb|ACQH01000015.1| GENE 39 46471 - 47622 1091 383 aa, chain - ## HITS:1 COG:FN0597 KEGG:ns NR:ns ## COG: FN0597 COG0763 # Protein_GI_number: 19703932 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipid A disaccharide synthetase # Organism: Fusobacterium nucleatum # 1 369 1 348 356 158 31.0 1e-38 MRYYLIVGEASGDLHASHLMRSLQAVDSAAEFRFFGGDLMTAVGGTRVKHFKELAYMGFI PVLLHLRTIFRNMAFCKKDIVEWAPDVVILVDYPGFNLNIATFVKSKTRIPVYYYISPKI WAWKEYRIKNIKRDVDELFSILPFEVDFFEKKHHYPIHYVGNPTADEVRSFLSTYNEGFE QFCKANALQADKPILALLAGSRRQEIKDNLPAMMQVAARFPQYQAVLAGAPSIADEYYEG FIRGSQVRLVKNQTYPLLAHSTAALVTSGTATLETALFNVPQVVCYKTPVPRLIRFAFNH IIKVEYISLVNLIMNKEVVSELFADRFTIDNIAHCLQTLLPGGEARQEMLNNYVLLQKVL 
GDDVAPDNAAKLMYGMLKGVGAK >gi|283510604|gb|ACQH01000015.1| GENE 40 47963 - 48907 830 314 aa, chain - ## HITS:1 COG:aq_832 KEGG:ns NR:ns ## COG: aq_832 COG0496 # Protein_GI_number: 15606188 # Func_class: R General function prediction only # Function: Predicted acid phosphatase # Organism: Aquifex aeolicus # 62 312 2 249 251 170 37.0 3e-42 MHLHVMVKGWLFAFLKRQTAARKTDVELDKWACHCAFETGFGRCLDNRQVTKIRRKMEVK KPLILVSNDDGYHAKGLRSLVAMLTDFADVVVCAPDAGRSGFAGAFSVAKPLLLKRRKDV AGAPVWSSNGTPVDCVKLAFSELFAERQPDLILSGINHGDNAAVNVHYSGTMGVVIEGCL KGFPSVGFSLADPDEDANFEPLRPYVRDIVSRVLAEGLPKEVCLNVNFPRAQTFKGVKVC RMNRGTWVNECEKRTHPHGYDYFWMAGHFQSDEPEAQDTDHWALKNGYVAIVPTRIDVTC YDSLQRMKAWEEVL >gi|283510604|gb|ACQH01000015.1| GENE 41 49042 - 49806 935 254 aa, chain + ## HITS:1 COG:lin2923 KEGG:ns NR:ns ## COG: lin2923 COG1192 # Protein_GI_number: 16801982 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Listeria innocua # 1 253 1 253 253 291 57.0 1e-78 MGKIIALANQKGGVGKTTTTINLAASLATLEKSVLVVDADPQANSSSGLGVDLNDVECSL YECIIDHADIRDAIYTTDIDGLDIIPSHINLVGAEIELLNIENRERVFKTLLDGIKGDYD YILIDCSPSLGLITVNALTAADSVIIPVQCEYFALEGISKLLNTIKIIKSKLNPKLEIEG FLLTMYDSRLRLANQIYDEVKRHFQELVFKTVIQRNVKLSESPSHGLPVILYDAESTGSK NHLALAKEIMEKGN >gi|283510604|gb|ACQH01000015.1| GENE 42 50059 - 50952 1026 297 aa, chain + ## HITS:1 COG:Cgl3034 KEGG:ns NR:ns ## COG: Cgl3034 COG1475 # Protein_GI_number: 19554284 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Corynebacterium glutamicum # 33 289 121 374 379 190 45.0 3e-48 MAVHKKFNALGRGLDALISTESVRPQGSSTINEIPLEQIEPNPNQPRREFDEDALQELAN SINEIGIIQPITLRQVEDNKFQIIAGERRWRASQLAGLQAIPAYIRTIKDESIMELALVE NIQREDLNAIEIALAYEHLLSAEGMTQERVSERVGKSRTAITNYLRLLKLPAQVQMALQK KEIDMGHARALLAIDSPSLQIKLFREIQKHGYSVRKVEELAQKLKNGEDIQSGKKTIATK AAMPEEVTRIRQRLSDFLDTKVQMTCSPKGKGKISIPFANEEELARIMAAFDKLKEQ >gi|283510604|gb|ACQH01000015.1| GENE 43 50969 - 51703 717 244 aa, chain + ## HITS:1 COG:no 
KEGG:PRU_0015 NR:ns ## KEGG: PRU_0015 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 40 244 21 216 216 258 58.0 1e-67 MSAKISKLLMTMALLSPFGSVASIAQGHFDNSKKTQSLGDTLATTITDEVLTQEKDSLRT ALTPTLQKKKEKRDWDTWRPNPKRAMWLAIVLPGAGQIYNRKYWKLPLVYGGFVGCIYAM QWNNMMFRDYSKAYQDIMDTDPTTQSYNQFLHLGTRITDQNKEQYQSIFKSRKDRYRRWR DLSFFCLLGVYALSIVDAYVDASLSEFDISDDLTLRVEPAVMNTPLAGTSFKPSAIGLHC SLTF >gi|283510604|gb|ACQH01000015.1| GENE 44 51753 - 53126 1560 457 aa, chain + ## HITS:1 COG:PA1812 KEGG:ns NR:ns ## COG: PA1812 COG0741 # Protein_GI_number: 15597009 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Pseudomonas aeruginosa # 131 457 118 463 534 146 31.0 9e-35 MKYSMKRFNFLLATMMLSCAMPALAQHNNDDKEITVTTANGKNEVIDLPEGMTYELDSLL KQYNAEKYLRPATDCNMPNVNPSFPKEVFIERLSRIPSIIEMPYNDVVQKFIDRYSGKLR RSVSYVVGASNFYMPIFEEALEAYGLPLELKYLPVIESALNPKAVSRVGATGLWQFMLNT GKHYGLTVNSLIDERSDIAKSSYAAAHYLSDLYRIFGDWNLVIAAYNAGPENINKAIHRA KGEKDYWKIYPYLPRETRGYVPAFIAANYIMNYYCEHNICPMTATLPVKTDTVVVHRDIH FNQLAGVLGIDLAELKTLNPQYRKDIVCGASAPATLRLPASMVNKFIDKQEDIFAYNSDD LLTRRREVEVAEAPASYSPPRSSKSSSRNHESKKERRKREKREKRERGDGGGKSIMIKDG DTLSEIAKRNNTTVKQLKKLNKISGTNIRAGKKLRVK >gi|283510604|gb|ACQH01000015.1| GENE 45 53429 - 55702 2679 757 aa, chain + ## HITS:1 COG:BH1242 KEGG:ns NR:ns ## COG: BH1242 COG0317 # Protein_GI_number: 15613805 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Bacillus halodurans # 16 757 8 727 728 426 35.0 1e-119 MTDQITRDKEREQAEEQLVNNAFKHLLETYMASPHRKKEEIITKAFNFAREAHKGVRRLS GEPYIMHPIAVAQIACEEMGLGSTSICAALLHDVVEDTDYTVEDIENIFGSKIAQIVDGL TKISGGIFGDRASAQAENFKKLLLTMSDDIRVILIKICDRLHNMRTLESQAPNKQYKIAG ETLYIYAPLANRLGLNKIKTELENLSFKFEHPEEYASIEAKLASTQEQREILFQQFTDPI RQALDQMGIQYEIKARIKSPYSIWNKMQSKHVAFEEIYDILAVRIIFTPKNKDEEINESF 
NIYVKISQLYKSHPDRLRDWLNHPKANGYQALHVTLMSKQGRWIEVQIRSNRMDEIAEQG FAAHWKYKEGEEITEDEGELNDWLRTIKEILDDPQPDAMDFLDAIKLNLFASEIFVFTPK GEIKMLPAGCTALDFAFQLHTFLGSHCIGAKVNHRLVPLSHRLQSGDQVEILTSKSQHVD PSWINFVSTAKARGKIQAILRRANRELQKQGENILKEWLAKHNLEQTTPILDKLCELHEL QKPEELLQHLGDRTVILGEKDLDEILGKKKSNVSFSWLKRMPFMGREKRRNKTEKDEQDL LVVGKDFNKKLPCIITETSVERYIFPSCCHPIPGDDILGYIDNKGRIEIHKRACPVANRL KSSYGNRILDAKWDMHGKMFFEATIEIRGIDRHGLLRDVAEVISSQLNIDMRKLVISGDE GVFDGTIELRVHDRNETQQIIDKLKDIDGIHEVQRIL >gi|283510604|gb|ACQH01000015.1| GENE 46 57351 - 58916 1605 521 aa, chain + ## HITS:1 COG:no KEGG:PRU_1339 NR:ns ## KEGG: PRU_1339 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 521 1 494 494 535 53.0 1e-150 MKHYFTALLWACMVPAFTAQAQTWREANEHGKIKLGKGLGYSVEVQATQSSGQTPLWLNA NKHGLSSLEKTNGYLLAGIERPLTTDSVRRWGVGYGLNVALTQGFTSRIVVDQAFVQARW LHATLTIGSKRHNMELKNNALSSGSQLLGINARPVPQVRLALPDYWTLPFGNGWLHLKGH AAYGKTTDDNWQKRFTGQQHLYTENVLYHSKAGYIMIGNPERFTPLSLELGLEMASTFGG TSYEPLDDGTMKVIRGRTNLKSFVKAFLPGGADVGETTYQNAEGNQLGAWLARVNYDADT WRFSVYADKYFEDHSSMLQLDYDGYGTGDEWLQKKKRRFFVYDLKDWMLGAELNFKYGRW ITDVVLEYLYTKYQSGPIYHDHTNTIADHLGGQDNYYNHYIYTGWQHWGQVMGNPLYRSP IYNTNGNINVANNRFVAWHLGIGGKPTDELSYRFLATYQNGLGTYSDPFEKKEHNVSLML EGKYELRNDWNVAAAVGADMGRILGNKWGVQLTVAKKGLFK >gi|283510604|gb|ACQH01000015.1| GENE 47 58934 - 59404 434 156 aa, chain + ## HITS:1 COG:no KEGG:PRU_1338 NR:ns ## KEGG: PRU_1338 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 23 156 11 131 131 94 38.0 8e-19 MTQNAKIRRPMGLLQFVGMALMVLALLGSCELETSRNKKLDGYWRLQQIDSIGTGGVNTM QGKRLFWAFQHKLLELRDVDGKISKCLLRFERQGDSLMLSQPYLYDRENGDKPLENTEIL LHYGVNSLNEHFKIDDLTSSRIVLSSRKCRLHFVKF >gi|283510604|gb|ACQH01000015.1| GENE 48 59793 - 62330 3113 845 aa, chain + ## HITS:1 COG:aq_505 KEGG:ns NR:ns ## COG: aq_505 COG1596 # Protein_GI_number: 15605977 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Aquifex aeolicus # 144 612 
34 511 725 135 27.0 4e-31 MKRSLLLVLLMTLCSLGLRAQSSMTDSQVMQFVLKEHEAGTSQTQIVTKLMQRGVDIQQI RRVKKAYERVAKDKGLGQVSNSTDKESKLEDGRMRKNNGENKKDEKNVKNNMMVRGAQQE EGDLPTKQRNTADHRLMQQELDDFLPDSATQREREKLKIYVGKMPKKVFGRDIFNRKDLS FEPNMNIATPQNYRLGPGDAVNIDIYGASQKSEQTTISPDGDVIIEGFGPVQVSGLTVAE ANARLRSTLGSRYSSSRVRLTVGQTRTIMVNVMGEVKQPGTYTLSAFASVFHALYMAGGI NDLGTLRNIKVYRQNKLVTVVDIYDYILNGKLTGNVRLADNDVVVVGAYDCLVNITGKVK RPMYYEMKRNESVGTLLKYAGGFTGDAYTKTVRLVRKTGREYSVFNIDEFDLNTFHLADE DSVSVDSILPRFSNMAEVKGAVFRPGMYQVGGSINSVRTLIEHAEGITEEAFTARAVMHR MRPDRTLEVIPVDIEGIMSGKVADIPLQKNDVLFVPTKGEMMQQQTVTIHGEVMYPGIYK YAANETLEDLVLQAGGLRESASTTKVDVARRIVNPKALSTDSVISRTYTFALKDGFVVDG ETGFTLQPFDEVYVRKSPGYNVQKNIAVQGQVMFAGTYTLTSKNERLSDAIKRAGGVTDL AYVRGARLERRITPDERLRMETVLRLAEMQSGKKDSVEKKRLDLGDTYYVGIELEKALAE PGGDADLVLREGDKLIVPEYNGTVKISGNVMYPNTVAYEKGRRPAWYINQAGGFGNRSKK SNTYIIYMNGTVARVGHNAKILPGCEIVVPTKPENNGKALTQWLSVGTTMAGLATLIATI ANLMK >gi|283510604|gb|ACQH01000015.1| GENE 49 62350 - 63387 534 345 aa, chain + ## HITS:1 COG:no KEGG:PRU_0084 NR:ns ## KEGG: PRU_0084 # Name: not_defined # Def: chain length determinant family protein # Organism: P.ruminicola # Pathway: not_defined # 11 339 19 348 358 252 42.0 1e-65 MENVENDKNVVDLRSLFRKIWIKRISYTKVLSITFILSCLYIIGYPRYYETDTKLAPEIE TPGKGGALGSIASSLGIDLSDIQSKDAITPLLYPSLLEDNKFVSNLFTIPVQTSDKSIYT SYYDYLANRQKQTWWEDALFFLKKKEEVLPKTAVNPYNLTKEQSVVMEQIRHAVKLSVDS KNGVITITTQAQDPLICKTLADSVRNKLQAFITEYRTTKARNDMEYYKKLVQEAKQAYEK SRKLYTSYSDANTDLILESFRAKREDLENDMQLKFNTYTTLATQLQSAKAKVQEKTPAFT ILKGATVPLKPAGPKRMFFVLTIMLLVFVVQSLWLLRKELGKLFN >gi|283510604|gb|ACQH01000015.1| GENE 50 63393 - 64301 441 302 aa, chain + ## HITS:1 COG:DR0599 KEGG:ns NR:ns ## COG: DR0599 COG2746 # Protein_GI_number: 15805626 # Func_class: V Defense mechanisms # Function: Aminoglycoside N3'-acetyltransferase # Organism: Deinococcus radiodurans # 43 186 21 181 279 75 31.0 1e-13 MGVSIITRLGLFVKGITGIKDFSLLKKKTHKEIGKLVYHKTYTANDIVREMQRLGMKKGS VVCIHASMMEFYNYKGTAEELITAVMKVLTQEGTLLMPAFPDPLLQKDANYIFDKATAPT 
KAGYLAETFRTFPGVVRSINVQHSVCAWGQYAEWLTKDHHKSTNCWDELSPWYRMTKLNA LVFSIGLPKFYIGTFVHCVEAILYKEHPYWAQFFTEEKTYRYLTSSGEVKEYTCLEGNLE RRSREGRIIKRFSADCYQHTRLSNLSIKVFHSAPCLQKMLELGRRGVTIYYVPSPSKYNF EQ >gi|283510604|gb|ACQH01000015.1| GENE 51 64339 - 65628 366 429 aa, chain + ## HITS:1 COG:MA4450 KEGG:ns NR:ns ## COG: MA4450 COG2244 # Protein_GI_number: 20093236 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Methanosarcina acetivorans str.C2A # 1 356 16 375 494 110 23.0 4e-24 MLKLSFGNIIMYILPFLVTPILSRLYEKEDFGEWGVFSSFISIVTVLIFLGYENAIVKID RREEKNIVALCLLLGCSTTLLIGATFVLGKSLDISFFNSFPSFTLLMVYLFFFILYTVCY NVINRHEQYTILSVNHIIQGGSQGLFRIVFGAVGLTVANGLLLGTVLAQAITALFLVAFV LMKIKMSGARPISFSSMSRCALENKSFPLYDAPASALSFAAFNLPLLILSAFFSQSVIGC YSIILQLLLLPMSFVGSAVGKVYYQEICDTEDEDKIKQVTKKVVNATLMLSVIPTLFICL GGDKLIVWFLGDKWHEAGSMSICLSLWAMSTILTQPLIPLFRRINKQHILLRYELLYFIV GIGCILFACNYRYGFATILISYSVGCATVKTLLFRKVLGLVHLSLSDFIKSLPLCILALL FFIFRTWEL >gi|283510604|gb|ACQH01000015.1| GENE 52 65649 - 66956 376 435 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927465|ref|ZP_06421312.1| ## NR: gi|288927465|ref|ZP_06421312.1| hypothetical protein HMPREF0670_00206 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 435 24 458 458 737 100.0 0 MNANSLLGRMLCITAYILTYDFMFEHFVFKLFYYMGLDYIEMEPLPKTLWITFSILPFTL YKGIKSMSSYFCIFLYLLVYIPFIHALFVTNGIDAYSLYSYACVMCLFFIVYFGMDSWRN LFKPLELRPALSFRWIEIITLIITAIFVLSRMKSMHFVNIFTQSDVLYDLRSQNSEAING GGGFIAYLQGWLSGAFYPFLLVCYLREKKWLKALAILFGYILLFMVDMQKITFVMPFVLV ALYFVVQLKHETISQRLHSLIIVTTVIISFALYFAQDNEILFVVGAIVLLRTVCVAGWLS QFYLHFFSEHPYTHYSHINIVNAITNAYPYDVPLGVVVAHGTQNANANFFLTDGIAAWGL SGVVIIGVFFFVLLQFINAIAFRYELKDLFVVFLPTLSFLLNTSIFTTLLSSGMFILLII LLMVESPLVKNKTPT >gi|283510604|gb|ACQH01000015.1| GENE 53 67210 - 68325 651 371 aa, chain + ## HITS:1 COG:no KEGG:BF1021 NR:ns ## KEGG: BF1021 # Name: wcfG # Def: putative glycosyltransferase # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 2 368 3 370 377 422 54.0 1e-116 MDTSKSLCLITNIGSHYRFPIFSEMAKNFACHFYLGNKVATPIKKFDYTKLEGYRQTLKN VYFGPFYWQRGSVRLLFKPYRYYIIDGEPFCLSSWAILLLAKLTRKKTIAWTHGWYGREG KIKRQVKKLFFSLHSALMVYSEYAIALMQKEGIPRRKMYCIANSMDSDKERAIREQLTCS DIYRQHFLNDNPTIIYCGRIQKLKRLELLVDCAAAFKEEGRPVNIVFVGKDVEQVNIDQY AKDKNVADSVWMYGPCYDDEVLGQLFYNAHVCVSPGNVGLTAIHALSFGCPVVTHNNFAY QMPEFEAIRQGETGTFFEQGNAKSLKQEIECWICVDADKREQTRRKAFDEIDRKWNIHYQ TEVIRKVLNET >gi|283510604|gb|ACQH01000015.1| GENE 54 68315 - 69109 534 264 aa, chain + ## HITS:1 COG:BMEI1603 KEGG:ns NR:ns ## COG: BMEI1603 COG0726 # Protein_GI_number: 17987886 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Brucella melitensis # 47 202 32 181 237 77 33.0 2e-14 MKHNPTLYQRLRGVVREEILNVLGAWATPAPGIHILNGHRIATEKEPDTFRNLLENLSRN VTFIRIEDAVERIMRREHPDKPLVAFTFDDGFMECYDIFAPLLEEFGVNALFFVNPNYVE GDDAYIEHFNNHTVLTEGKQPMRWSHLRELANRGHIIGAHTLDHYMINTDDEDTLRHQIV ACKTVIEQQLLHPCNYFAFPYGRLTHANQRSIDIACNAYKYVFSQSDYKHYFSFNGKVIN RRHFEPFWPIKHLNYFVSCNKTYE >gi|283510604|gb|ACQH01000015.1| GENE 55 69106 - 70317 534 403 aa, chain + ## HITS:1 COG:RP336 KEGG:ns NR:ns ## COG: RP336 COG0438 # Protein_GI_number: 15604204 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Rickettsia 
prowazekii # 92 402 91 403 407 95 24.0 1e-19 MKKLLQINITANWGSHGKIAEGIGQAAIKQGWQSYIAYGRWANPSASNLFHIGNRWDEMR HGIASRLFDNHGLMSQKATKSLLQFVRNVNPDIVHLHNIHGYYLNYPLLFQYLRLHDVPV VWTLHDCWSFTGHCAHYEFIGCEKWKTHCAVCPQKGAYPKSLLLDRSYRNFEQKKEAFLS LNRLTLVPVSQWLQRQLQLSFFKHTPTRLIYNGIDTNVFSKQTEVNWIKKKYGIPEHCAI VLGIASNWYRKGLPDFLQLASLLPPSIRIVLVGLNKQEQKLAARAGIVGISRTDNLHELC SLYSVANVYFNPTWEDTFPTTNLEAMACGTPVVTYKTGGSPETITTGTGLAVEKGDIQTA AIEIGRLCQQPATTFEDVCRQRIVRHFNKEERFSEYLELYSKL >gi|283510604|gb|ACQH01000015.1| GENE 56 70366 - 71121 553 251 aa, chain - ## HITS:1 COG:jhp0094 KEGG:ns NR:ns ## COG: jhp0094 COG0463 # Protein_GI_number: 15611164 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Helicobacter pylori J99 # 1 247 2 245 260 224 46.0 1e-58 MKVTVITVAFNSATTIADCMQSVLDQTHPDIEYIVVDGQSNDNTIDIIREFEPKFTNRLR WVSEKDNGIYDAMNKGLRMATGEVVGVLNSDDFFTTTDVVEQVAQAFSDSMLDAVYGDVH FVREPDLTHSVRHYSSAGFRPWWLRFGLMPAHPSFYARKEVFQKAGLYKTDYKIGGDFEM MVRLFKRFNIKAKYLNIDFVTMRTGGASTKNMGSRLTLLREDTRACRENGIYTHPLLISL KYFYKVLEYRF >gi|283510604|gb|ACQH01000015.1| GENE 57 73510 - 74532 1271 340 aa, chain - ## HITS:1 COG:no KEGG:BT_3507 NR:ns ## KEGG: BT_3507 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 340 8 353 353 348 55.0 1e-94 MKVIHLPIYAAMLASALATLTACNDEWKEEQYTQYVSFRAPLGKLGVTNVYVPYSRRNAD KTFVEGSGLSSYKLPVIVSGSLQNENNITVHVAHDGDTLNVLNQARFLNRTELWYTDMKD YATVPETTTIPSGENVGLLNIKFNFNGIDMSNKWVLPLTIVNGENYGYTAHPRKNYAKAL LRIFPFNDYSGDYSGSTMKIFVTGDEANATAKSIIRGYVVDDKTIFFYAGDIDESRTDRA KYKVFARFNGEKEGTVDFYTTNNDLKLKVNKQASFHIIEQKDDVKPYLVHRYVIINNISY DYVDYTSAPGTEFNWSVSGTLTLERQINTQIPDEDQAIEW >gi|283510604|gb|ACQH01000015.1| GENE 58 74556 - 76628 2527 690 aa, chain - ## HITS:1 COG:no KEGG:BT_3506 NR:ns ## KEGG: BT_3506 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 690 1 705 705 758 57.0 0 MKRKNIALITLALGIGTLFSACNDYLDSDKYFEDRTTIEKVFTSRIRSQQWLAYAYSFLK 
DENADVVGKEKETNSFTFADDMYYGDRDVNYDSKEADELSYNTFMQGEYDENDFNQGWTM CYKGIYQASVFIANIYRNTEMTEKERLDFKGQARFVRAYFYWLLLRRYGPVPIMPDEGVD YTLSYEQIATPRSSYEEVANYISSEMVQAAKEIQYTRRDGENIARPTKGACLATRALALA FAASPLANGNTDAYAKQLVDDKGRTLLNTDYQEEKWARAAAACKDVMQLGVYDLYHASFS TTDAAGDPATIVPADSTCQFAQNDWPTGWKNIDPFKSYRVLFNGDVAPEDNPELIFSRID QNATRINHSMMSFARHSMPRDFGGWNTHGLTQKMVDAYYMNDGTDCPGKDSELNGGNGAE RLKGFTTARDYRRGKYKPLAANVSLQYANREPRFYASVGYNGSVWEYLGDPENTHHNRQT FYYRGSGNGYNNSQFYLRTGISVKKYVNPTDVPDREDYKNIKNRAEPAIRYADILLLYAE ALNELDGTYTIPSWDESTTYNVRRDIAEIQKGIHPIRIRGGVPDYSTDVYANKDLLRAKI KRERMIEFMGEGKRYFDLRRWKDAPRELNQRIYGCNVMMDANHAAEFQQVIPINNLRSTF SDKMYFWPIRHSELKHNSRLTQNPGWTYND >gi|283510604|gb|ACQH01000015.1| GENE 59 76644 - 79733 3303 1029 aa, chain - ## HITS:1 COG:no KEGG:BT_3505 NR:ns ## KEGG: BT_3505 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1029 1 1040 1040 1302 62.0 0 MKKIFILFWMLCGITLNALAQEQLEISGTVTDATGEALIGVSVTVKDAKGLGTITNIDGK YTIKMQQYQTLVFSYLGYKPVSVLVKGDRRVIDVQMAEEKLNTIDEVVITGLGAQKKLTV TGAITNVDLGQMKQFPSSNFTNALAGNVPGIIAMQSSGQPGKSTSRFWVRGISTFGASSD AMILVDGFERSNINDLNIEDIESFSVLKDASATAIYGSKGANGVILITTKHGKEGKININ AKGEVAYNTRTITPKFVDAPTYATLLNEARITRNLPPQYQPEELALIRSGLDPDFYPNVD WSKLLLKDGAMSYRADLSLSGGGNTARYFASLSYVEDQGMYNTDETLRNRYNTNANFKRW NYRMNVDIDVTPTTVIKLGVSGNLNKRNSPGLGDQYVWGELFGFNALSSPVLYSNGYVPA YGRQKYQMNPWVSATRTGYNQEWDNNIQTNVTVDQKLDFITRGLSFTGRFGYDTYNSNHI YHRLWPAMYRANSRDSQGNIIWDKLFEESPMAQTSGGEGSRREFLEALLRWNRTFNKQHN FGATARFTQDERIQTQNIGTDIKNSVSQKNQGVAGQITYNYALRYFADFNFGYNGSENFA DHHRFGFFPAFSLAWNVAEEPLVKKALPWLNMFKLRYSWGMVGNDNAGRSHRFPYLYTID FTKEGYNWGSNLTTATTTGMHYTQVASPNVTWEIARKTDVGFDFVAFDNKFSLTMDYFYE KRTGIFMQRNFLPDITGLESRPWANVGAVKSSGFDGNFQYKDHIGEINWTVRGNITYSKN TILERDDENNVYAYQYEKGYRIGQQRGLIAQGLFRDYDDIRNSPKQSWGTVQPGDIKYKD VNGDGVVNDGDRVAIGATSTPSLIYGLGASISWRGFDFNFHFQGAGKYTFLINGGTVNAF SNGRWGNILKGITDNRWISADISGTKETENPNAPYPRLSYGSNSNNEQASTFWLRNGRFL RLKNLDIGYTMPKPWVNAIHLESARIYISGQNLITWSAFDLWDPELDSSRRGEEYPITRS FTAGIQISL >gi|283510604|gb|ACQH01000015.1| GENE 60 79758 
- 81242 1429 494 aa, chain - ## HITS:1 COG:no KEGG:BT_3504 NR:ns ## KEGG: BT_3504 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 494 6 497 518 508 51.0 1e-142 MKKNNLNIYGQRFAWGALLFFVACMSACRSDNGDVSSAEYDPSRPVTVTDFIPKEGGVDQ KLVVYGSNFGNDTGNVKVTIGGQRAVLISVKGDCLYCLVPAKAYSGEVVVSVGGKTDNEQ SATASSKFNYERKMVVGTLSGVRNQFDDQGWHDGPFNTATGYAGEGCLTFDPLNHNILYA VYDENAHGIQALDMEKKEVKTILSMSKFDNHRLRSIDFSNDGQYMLISTDRDDRQLQSPS VWIVKRNADGTFTDASKSQILAAYKQCNGAAVHPKNGELYFNSYERGQVFRLDMNNYFNT VASGGTWYPNWTDGNFKELFTIQDVSWEFKIFIHPTGKYCYIVVINKHYILRSDYNETTK SFAPPYVVAGQARNDGWVDGVGTGARVNRPYQGCFVKNKKYVAENRDDVYDFYFCDNRNH CIRYLTPDGIVRTYAGRGTSSQAGDGNTWGTEDGDLREVARFNSPTGIAYDENSNTFYIL DTQGRKIRTISMEK >gi|283510604|gb|ACQH01000015.1| GENE 61 81777 - 84536 2867 919 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927474|ref|ZP_06421321.1| ## NR: gi|288927474|ref|ZP_06421321.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. F0108] # 1 919 1 919 919 1772 100.0 0 MKKQVLSLGLGLLSATLLNAQTTPWPGHAVGNGGEYYLYNVATGLWLQNNNTVKDGWATA VNVGTRGLPITFEKAGAKTFRLSSIFYKGNCVSKKIGDAGLLYWDMPADNIGDWELSPAD NMQSIHGYWLECDALVLGADNNLLTVDKEKNSVWQLVTREERIADAKAKASADHPVDVTW LIGASDLVTKNNLFKMDCTAAPNTEHSTYRGGWDIVRANTIQEFWNTQTFDFYQTISGLP NGTYKFSVRGYYRDGSSETRNYAMYGYGADKFINGTEQLRATYYANGTSAPIMSLYAGAK KAPEEGFNFQAERENKQNSGLYVPNTTHEANCALWKGNYQNPEITVTVTDGTLKLGVKKE AGVVDDWCVISNFSLKYLGSKVLQTAEEALKDLKAILATTKAFKGAVAPALSKQYTDAIA AANKTLTSTDPVAIIAVTSNLQKAYDAVAACSENYSALVKTTEICKNINKNNDAQLNAAT VKAEKVAKTATTNADMKAALVDLRVARKIVAADKMPDIYKGAKAGAGEFYFYNVASQKFL MGGSDWNTHAAVDVPGLLFTVAAEGNGFTINRFGGKAGNYLGYNGYTDIPDKAVWAFVPV AGKANVYNIVKGDNHAQGLAFAPQSNTDADEAMDKEFWNTVSVEAAVAKNANAEWKLVTK AERDALLATATEKRPVDATYLLTNPGFNRPDLFEKWNSDKKGDFKDANLGVIDRGRRTNP VCEAYYLNSFEVNQTVSNLPEGYYQVNMTGYYRDGSREALQQKVAKGTAPARHAMLYIEY KGKGDEVALPSIAAGMNQCPGIGWTGTAGEQPDDVMDAAEYFECGLYKVYTHIIKVGPEG ELTFGVTKDKQVDGDWAVFDNFRLTYFGKKVSQDIINGIENVKNNVVEDGKIYNLQGMEV KRPLKRGIYISNGKKFIVK >gi|283510604|gb|ACQH01000015.1| GENE 62 84879 - 87590 2316 
903 aa, chain + ## HITS:1 COG:no KEGG:PRU_2177 NR:ns ## KEGG: PRU_2177 # Name: not_defined # Def: putative glutaminase # Organism: P.ruminicola # Pathway: not_defined # 1 843 1 831 832 884 50.0 0 MKKLLLCLVGVANMLAAEAQGGSFFEPYRRTSLRLPSVPLIVNDPYFSIWSPYDNLYDGT TRHWTGQQKAIDGLLRVDGTTYRFMGKEKGRLLKPVAPMADMGAWKAKVSYAKPAANWMQ RDFNDSKWQTQQAAFGTPKEYPNIRTVWTDTNSDIYIRRHVTLTKEDLARDLWLIYSHDD KCEVYVNGVLAIETGETWVQNEELMLPANVKQSLRVGDNVIAYHVHNTTGGANADIGLFA NVKEKHNNIRNAVQTNCDVMATNTYYTFRCGSVDLRLVFTAPMFLDDLNLLSTPINYISY QVRANDKRKHDVQIFFGTTPELAVMKNTQPTISRIQNVNSIDYVRTGSVEQNVLGKTGDI ICIDWGYLMIPNVNGNVTMADQNVVESQFVSKGTLAQRDALRIKSNNESEMPMLGYLHNF GLTERDSSYMMIGYDEVQDVQFLGTRYKAYWAKDGQQLTDAFENFRDNYKSYMQRARRWD KIIYDDALAAGNKKYAETLAASYRQTLAAHKLFMDKDGDLMYFSKENSSGGFINTVDVTY PAAPLLFTYNTQLEKGALEGVFKYCADSSRWGFHFPAHDLGFYPIADKQTYASCFPGANG DFGKNMPVEEAGNMVILTAMTCLREGKADFAKKYWDLLTMWTNYLVANGQDPANQLCTDD FAGHLAHNVNLSVKAIMGIAGYALMAQMQGENATYRTYMDKARQMAATLERTANDGDHYR LAYDRSGTWSMKYNMVWDKMWGLNLFSKQMKQKEYSYYLTKAEPYGIPLDNRERYTKSDW EMWTACLAENQEGFARIADLVWEFANKTKNRVPFSDWYWVNSGDYQIFQGRSVLGGHWMK VLMDNFLKNKKFSTNIENEPQTPDASKKEVGRYDVDGRRLSTPKRGINVVRYKDKTVKKV VVR >gi|283510604|gb|ACQH01000015.1| GENE 63 87762 - 89048 1058 428 aa, chain + ## HITS:1 COG:BS_abnA KEGG:ns NR:ns ## COG: BS_abnA COG3507 # Protein_GI_number: 16079933 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Bacillus subtilis # 32 302 39 303 313 121 34.0 2e-27 MKKDAIKRIALVVNLLMLAAVVRGQYTNPVLNKDMPDPSVVRAPDGQFYVYATEGGGLCV PIYKSRDLATRWRFAGSAFNSSTRPTFIKGGNIWAPDINYINGQFVLYYSMAEWGKEWNC GIGVATSSSPRGGFVDKGKLFTSTEIGVQNSIDPCFFQDDDGKKYLFWGSFRGVWGIELS ADGLALKPGAEKFQIGPNHNWMHHGTEATMIVKRKGYYYFLGSLGNCCEGANSTYRVVVS RSKDLRGPYVNKRGKKIMDVGGSYESVLSGNELVAGPGHCSQLIIDDNGDYWMLYHGFDK TDIDAGRKLFVDKVLWDADGWPYIAGKHPSKGGAVPFFKDGELSGIADVDATQDTYTIAR GGDNYYQISSPDNSAFTWELYAISGEKVKAGRAINAQDLWLNDVPVGIYVVKVRGNSGSV NQKIVKVQ >gi|283510604|gb|ACQH01000015.1| GENE 64 89020 - 89262 126 80 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MEQQPNSIINAAFRQLRLKRCSLFIVLGFNKYYSRFLVTLSPHLKYDVWGVALGWFCCTS GVRRVPQYRDISTEPLRFSD >gi|283510604|gb|ACQH01000015.1| GENE 65 89189 - 90325 1084 378 aa, chain + ## HITS:1 COG:no KEGG:BT_0169 NR:ns ## KEGG: BT_0169 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 50 378 28 351 351 334 48.0 3e-90 MKREHLFKRSCLNAALMMLFGCCSIACSKEDNNPTPEPPKPVEVYQDLKIFQLNTWFGAT QVNNAYNGLVDVINQVDPDIVLLCELRNDLLLKKIPDGDGEYTHRLCQSLKTKHQKNYYG KNHGTFVGVLSKYEIKSFKVLPTPTDNYMIKVTVEVRGQEMTFYSAHLDYKHYACYLPRG YNSGPDWSKLPNPITDSERIMKDNRLSTRDEAMEMFLDDAQGEMDRGRIVVLGGDFNEPS DLDWQANTKDLYSHNGVVANWDCSVMLRKAGFVDTYREKFPNPVTHPGFTFPADNKDASI GQLSFCPEYDERDRIDFVYYNKSQPVELLKAELVGPSGSIYFGKRGPNDSKDTFIEPTGT WPTDHKGNLTTLKVRVKK >gi|283510604|gb|ACQH01000015.1| GENE 66 90398 - 91867 1581 489 aa, chain + ## HITS:1 COG:no KEGG:BT_3504 NR:ns ## KEGG: BT_3504 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 488 5 494 518 388 42.0 1e-106 MRRWKYALFIAALCGNIAACGDKDEDNTPNYNPNVPVTISRFTPAEGSEGQEMVIYGTNF GLDSKQVNVTIGGKTAKLTYVKADSIGCIVPYWTESGRFEVEVKVAQQSAKATTKFAYAS PLVVRTLIGYRISDGDPGWKDGKFGTGKDDGSATFGKQAAFMKFDPQNHDHLYVAYEDFD YSGYGVQLFDLKAKTVTTVMHGSKCFEGKRLRAIDFTAKGDMVVATDRDDSGDRSTSVWI VKRNADGSFTDESKCEVLAAYKQCNTVAVHPVNGELYFNSYGNNAIYRLDVAKYYANASE TTPWDPYLTGNNYEQMLTLGASRWNFNITMHPTGKYAYVIALNQHCIYRMDYDEATKRFK QPYLVSGQQGVKGWADGKGDGTLMNFPYQGIFVKNADYAAQGKEDVYDFYFCDYDNYCVR YLTPEGRVRTFAGRGATSAMGDGNTWGPDDGDLRETARFGSPTGIAYDERTETFYVFDTY YKSVRTISR >gi|283510604|gb|ACQH01000015.1| GENE 67 92050 - 94044 1528 664 aa, chain + ## HITS:1 COG:no KEGG:PRU_2075 NR:ns ## KEGG: PRU_2075 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 22 664 20 668 669 847 61.0 0 MKRIKVILLGWAFVANISPLQAQGLSMSLSLKTPGNPSETRQLQQQGQCWTTDGPLSVVS STTKDGDDELLTVTLSAREKVYFHLELSLNTGLSSSETEFYMPGFWYHRNMRSPQEAPSF KTSKHWCFREDRMSTPLTSAFNPHDGQGISVLRVLDTPCDVQVQLTTGEIILPGRTSVGY AGFDGEANTVALKFGYPYKETPKRYIRKLTLIDPVTAFSKLEKGEQQTLRWRIHRYKAND 
FGSFVTQVWQYCYDRMQPQPLKSAYTGAEVKAALSDYFRHSFVDRYPLKYNSGLSLRVND CLPETEMQIGFCGRVLLNAFNEIEYGVETKDEELVNKGQAIFDSFLSNGFGEGGYLHDFV NFRDGFPKESIHSIRQQSEAVYAVLHYLKYEKRRGRRHVEWERHIRTILDNFVALIKGDG HFARKFKSDGTDVDASGGSTPSATSALVMGWKYFGNKNYLQAARRTVSYVEKNIISQSDY FSSTLDANCEDKEAAIAAVTATYYLAMVTKGEERKHYIELCKQAAYFAISWYYTWDVPFA PGQMTGELGLKTRGWGNVSVENNHIDVFVFELPHILSWLAQQTGEQRFKGMYDVITSSLS QLLPVANNLCGVGKQGFYPEVVQHTTWDYGHNGKGFYNDIFAPGWTIASLWELYSPNRTT DFLK >gi|283510604|gb|ACQH01000015.1| GENE 68 94123 - 98058 3380 1311 aa, chain + ## HITS:1 COG:CAC0903_3 KEGG:ns NR:ns ## COG: CAC0903_3 COG0642 # Protein_GI_number: 15894190 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 797 1029 50 289 318 110 29.0 3e-23 MIVALLLLVGMTGRSNILFRHIGVESGLSQSTVLAILQDRTGLMWIGTKSGLNRYDGILF KWYYSYPDGHSLGSSYINALFEDSNGRIWVGTDCGVWIYSPLTDSFSRFDKRSADGVSIT NMVNVIKGHGDKIYITANEQGVFCYDLRRNHLSHFRLKGYPNVAGMAIGDNGAVWLGLFG GGLYTTDHAFRTLTPFRTQDGQTPFAGNIVSAILPLGNGRYAIGTDRQGLSLIDAARHSF EPLVTSVEGKDLFVRNLILSHREIWAATEQGMLVYNLDTHAIQHFTYNPTDPFSISDNPL YCLYKDREGGLWAGSYFGGLNYLPAIHPVFERFVPQGSGNGLHGRRVREMVMDKQGKIWI GTEDGGLNCMEPDSETFSHIAASNTFPNVHGLCVDGNELWVGTFASGLKVIDTRTHQVVK SFKADGRLGSLHDNTIFSIARSPQGVMYLGTIRGLCTYDANTQSFVYDKAVPPVLINDVS FDSHGNLWLATQTNGVFLRHNGRWTNFKATQSGLTSNKALSIFEDSEGTIWVTTQGGGVC RFNFSTRRFKPLQQGVLNSTSTYFRMEEDGDGVLWLSSYAGFVRYDPHSGDVRTYNNSTM LLDNQFNYNSSLIDKRGRIYFGSLSGIVRFSPSALKKEQRMPQLVATDLYIGNEHVDNFT KGTPLEQNIVFTRKLSLAYNQNSFRLHVVPLSYSRQNWGALEYKLEGFDKSWQPMGADFF MTYANLPAGSYQLIVRMKDQNGKAYPGEYKLDINVRPFFLFSIWAKLFYVVLLAVLVWLL MRYWNRRAETRRRHAMEAFETRKEQELYQSKIHFFTNVAHEIRTPLTLILGPLENILSTQ KVKDNDVRDDLNIMYENTQRLTDLINQLLDFRKTEKDGLRLNFEYCNLTKLVTDIYNRFR SAMREKNINATLSLQGNNLHGYVDHEGFTKIVSNLINNAVKYCLSYISVTLKTDDAQLFL TVANDGNIIPLDLREKIFEPFFHIDTAEHSTSGTGIGLALARSLAELHNGHLRMGDEADM NVFVLTLPLVQEVPVRLGGNGIGDITPAKTELATTQEAQTKPYTLLLVEDNVQMLEYEKR CLEKEYNIVTATDGEEALLQMERNNVNLVVTDVMMEPMDGMELCRRIKYNVDLSHIPVII LTAVTSERGKMEGMESGADAYIVKPFSMNFLSQTVQNLLRQREEIKKMYATSPFVSASSA 
SISPADAEFLERLKAAVMRNIGNSDFNVDLLAAEMNMSRTSLNRKVRGTLDQSPNNYIRI ERLKAAAEMLKTGDKKVNEVCYNVGFSSPSYFTKCFYEQFGILPKEFNKEQ >gi|283510604|gb|ACQH01000015.1| GENE 69 98055 - 100151 398 698 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900011|ref|NP_344615.1| aldose 1-epimerase [Streptococcus pneumoniae TIGR4] # 353 698 12 345 345 157 30 2e-37 MRHSVKPILTATLLAWLSLSAQFSLAQTSTYGNPVIDKSAPDPTVIRAEDGTYYLYATED TRNVPIYRSTNLVDWKLVGTAFTDNSRPKWLPKGGIWAPDIQRVGGKYHLYYSKSVWGGE WDAGIGVAVSNSPAGPFVDRGCMFTSKQIGIQNCIDPFYIEDGGKKYLFFGSFHGIYGVE LSANGLHVKQGAKPRKVAGTFMEATYIRRRGGYYYLFGSAGTCCEGARSTYRVTIGRSKS LFGPYLDKAGQRLLDNHYDVLLSKNDNVVGPGHNAGLITDDAGNDYMFYHGFKASNPDDG RVVWLDRIEWADGWPLVAGNGSSKTSTAPTVKQGSRGVATRSGLYPNDFEAYVNGKRTRL YTLVNHKGMEVCLTNFGARIVSIMVPDRRGTLRDVVLGYDNIAQYADYQHFGSDFGAAIG RYANRINQGRIVVDGKTLQLPRNNYGHCLHGGFTGWQYQVYDGKQLNDSTVEMSLVSPDG DNGFPGTVRATVRYTLTADNAIDISYEATTDKKTVINMTNHSYFNLNGNPSQHGENQVLY INADRYTPADTTYMPTGQMLKVAGTPMDFRKPTPLSKDINNQRFAMTRNARGFDHNWCLN TWHNGQPDEHAVAASLYSPQTGIMLQVFTNEPGIQVYTGNFLDASFAGKHGYRYPKHSAV CLETQHYPDSPNRPEWPSAWLEPGKKYSSHCVYKFYVR >gi|283510604|gb|ACQH01000015.1| GENE 70 100795 - 101910 1050 371 aa, chain + ## HITS:1 COG:no KEGG:PRU_1340 NR:ns ## KEGG: PRU_1340 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 299 1 278 279 171 42.0 3e-41 MKKLLITLLTLLAFAYQAKAMSFEQARQQALFLTDKMAYELNLTDDQYEAAYEINLDYLL SIDHYDDLYGIYWRRRNQDLGYILYDWQYRAYCQANYFYRPLYYDTGYWRFRIYARYPYR DYYYFGRPVFYNTYRGNSGWYNGSSRYSGRQFGGGRSYGMRDGFQRGDYGRGFRFGGFTD NSYDNYGYGYDAPQRSYSGRDYGTYNYNYPSQSRRSGGSRYGSGRDFGYGESWNDGYERQ SSTRSTARINRDYDDNFGSGNFGGARYFSGSDASGSPEVPNSKFSPSRSQGSQPSTSGRS FGGNTGGSSFSNTQQSVTPFNSTRSQGGSTPSTSGRSFGGASNGGTTYTPQHNNTNSSNT NSGGVHFGGRR >gi|283510604|gb|ACQH01000015.1| GENE 71 102118 - 102342 95 74 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDLGNRICRTKEEFRRAVMSFLENIAEYKEAVDLLLTINSRLVDSNHFFLTISLPLTPGK CKNGMLYHKASRCH >gi|283510604|gb|ACQH01000015.1| GENE 72 103125 - 105896 2746 923 aa, chain + ## HITS:1 COG:CPn0849 KEGG:ns NR:ns ## COG: CPn0849 
COG0553 # Protein_GI_number: 15618758 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Chlamydophila pneumoniae CWL029 # 215 669 692 1152 1166 228 33.0 3e-59 MSENINKGKKKKKGKQKVRKRISHQVMPKGMNLEDWQVALRRQVAREERLLVATVDDKLQ PGEYMVTNPKTEQQYKVVYRGANSQWNYCSCFDFRTSQLGTCKHIEALKLTFGGRRKVHR ELPPYSSVYIDYRQGRQVRIRIGCDNSEAFKMLANDYFDEHLALKPAAYAHFHEFLAKAR ELDAGFRCYPDALQYVVERRENIERAAWVDALGQEDFDSLLHTRLYPYQVEGIKFALKAG KSIIADEMGLGKTIQAIATAQMLRRKALVGNVLIVCPTSLKYQWKREIERFTGETVHVIE GDLLARCKQYEGQEPYRIISYNALSNDLKLLKSIEVDMLVIDEVQRLKNWNTHIAKAARK VNAQYAVVLSGTPLENRLEELYSVAELVDQFALSPYYKFKDRYIMLDERGATAGYRNLNE LGERIKRFLIRRRKRDVNLQMPERQDKLLFVPMTKQQMVQHDEARWHVSVLLKKWQNMHF LSETDRNKLMKYLSQMRMLCDSTYILDQKTRFDTKVTEVINIVRNVIESGDEKLVVFSQW ERMTRLVAKELEKEGIGFEYLHGGIPSIRRKDLVNNFMDEPHCRVFLSTDAGSTGLNLQA ASVVVNVDLPWNPAVLEQRIARVYRLGQKRNIQVINLVSAGTFEEDMLDKLKFKSSLFEG VLDGGEDTIFAQQGKFDEMMAELNKTMGSSADTHTGESPIDNDEQEQVVGNETMEDNASA TNEMEVFDDDDLAENEEETRNNEESEARAHTIANNEGGNTNAERARESKTPQGEYDNTTA VANFDFGTPTDDETEDDAAEYEDNAVEHEGEMAENKDFGHGDTPKNAAKGNANVTPSHAH GSRSANPSTPKELVHAGRTFFQGLADTLKSPEATRQLVEELVEEDPETGATHIKIAIPDK KSVEVLLGFFGSLLAANNSQEAD >gi|283510604|gb|ACQH01000015.1| GENE 73 106003 - 106884 1111 293 aa, chain - ## HITS:1 COG:XF2443 KEGG:ns NR:ns ## COG: XF2443 COG0388 # Protein_GI_number: 15839034 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Xylella fastidiosa 9a5c # 1 291 5 294 295 380 60.0 1e-105 MLRTGIIQQHNTADIQDNMNRLANGIARLAKEGAQLIVLQELHNSLYFCQEEQVDVFDLA EPIPGPSTQFFGQLAKEHGVVIVTSLFEKRAPGLYHNTAVVMEKDGSVAGIYRKMHIPDD PAYYEKFYFTPGDLGFEPINTSVGRLGVLVCWDQWYPEAARLMAMRGADLLIYPTAIGYA ASDDEAEQQRQREAWTTIQRAHAVANGLPVVAVNRVGFEPDPSQQTPGINFWGSSFVAGP QGELLFRANDTEEQRAIVDVDLAHSEQVRRWWPFFRDRRIDEYGGLTQRFLQP >gi|283510604|gb|ACQH01000015.1| GENE 74 106903 - 107988 1211 361 aa, chain - ## HITS:1 COG:XF2442 KEGG:ns NR:ns ## COG: XF2442 COG2957 # Protein_GI_number: 15839033 # Func_class: E Amino 
acid transport and metabolism # Function: Peptidylarginine deiminase and related enzymes # Organism: Xylella fastidiosa 9a5c # 13 358 25 362 363 276 42.0 3e-74 MATSNNIDIKLYLPAEWHEQSGVQLTWPHAKTDWAPILPDITKVFVELTRAIAKHEKVLI VAPDTDDVKATLMRNLGLERLRNVLFHQCETNDTWARDHAAITLTGNNTNNGFKVQNTLL DFKFNGWGEKFAADKDNAITQSLYHEGMLNGTLESHNDFVLEGGAIESDGKGTVFTTSQC LLAPHRNQPMTKDDIEQRLKDALRAERVVWLDHGNLVGDDTDGHIDTIVRTAPENTLLYV GCDDSHDEQYEDFLALEHQLMGLRTSTGMPYRLLRLPMPDAIYDEGERLPATYANFLIIN GAVICPTYAQPEKDKQALQTIAQAYPDRETIGIDACTVIKQHGSLHCLTMQFPQGVIANN P >gi|283510604|gb|ACQH01000015.1| GENE 75 107991 - 108194 127 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927486|ref|ZP_06421333.1| ## NR: gi|288927486|ref|ZP_06421333.1| hypothetical protein HMPREF0670_00227 [Prevotella sp. oral taxon 317 str. F0108] # 21 67 1 47 47 84 100.0 1e-15 MCYGAKARQSAHTHTRLCAGMARHGSRQVFFKFNKVMNCLQKYKEKFTNVSFVGFLRYLC NENEGER >gi|283510604|gb|ACQH01000015.1| GENE 76 108211 - 108849 192 212 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 20 194 40 221 329 78 31 1e-13 MTVEMKDASLALESKSLFHNLSFVVSGGEMLCVMGESGCGKTTLLRCILGFQPLDEGSVS IGGTELTPLSAEYMRSMMAYVPQEVNLPCNTVAELVALPYQLRVNRDKRFSKEALMDQWV RLRLDETLYDKRMAEVSGGERQRIMLSMAGLLDKKVLLVDEPTSALDVDSALLVAGYLQS LAKRGAAIIAVSHDRAFAGQCDKTLMLKLNNQ >gi|283510604|gb|ACQH01000015.1| GENE 77 108871 - 109650 725 259 aa, chain + ## HITS:1 COG:TM0193 KEGG:ns NR:ns ## COG: TM0193 COG0390 # Protein_GI_number: 15642966 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Thermotoga maritima # 1 253 5 262 263 102 26.0 7e-22 MNISIIGMLLTLLLVLVPLYFFHYFKVPLIRSTIISTVRMVVQMSLIGLYLEFLFSYNSW LINVLWFVLMVLVASVTAVNRTRMRMRLMLMPIAVGLGVGAFVVCMYALFVVLRLYNPFD ARYFIPVVGILLGNMLGVNVLSLSTYYHGLQRERQLYFYLLGNGATRLEALTPFIRQAIE KSFMPCIANMAVMGLVSFPGTMIGQILGGSAPDTAVRYQILISCITFCAPMLSLMVTLYM AHRFSFDKHGRLLDVLKAD >gi|283510604|gb|ACQH01000015.1| GENE 78 109886 - 109990 
87 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRKMVKKIVMSDKFNKYCVNVLKMYNYGRVNVPV >gi|283510604|gb|ACQH01000015.1| GENE 79 111268 - 112122 603 284 aa, chain - ## HITS:1 COG:no KEGG:Csac_0632 NR:ns ## KEGG: Csac_0632 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticus # Pathway: not_defined # 45 268 71 302 312 245 55.0 1e-63 MQKTRTITLLILTAITALTACNAQTNNKPVQAKTGQEKRLVANNLAARIPAPQGYKRVEV SEGSFAHFLRNLPLKPAGSDLHYYNGQVKERKYAGVVVDMDFGKNANEQCADAIIFLRAS YLWKTRQYAKINFNFTNGFKAEYAKWAQGYRIRNNKAWVKTQKADYGYQSFRKYLNLVFQ YAGTASLSQELKPIGRCWAADIQAGDVIIKGGFPGHAEIVVDVAENEKGERVVLLAQSFM PAQEIEIFPQWFSPSSNGTCLVTPAWTFYSSTDKTLLLRRFKSE >gi|283510604|gb|ACQH01000015.1| GENE 80 112211 - 114604 1769 797 aa, chain - ## HITS:1 COG:no KEGG:BVU_0314 NR:ns ## KEGG: BVU_0314 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 30 796 7 770 774 239 27.0 2e-61 MTYTNKTDNCMLKNAMRSQAQAFLSLRKKLCLVLLCAIQMVCAQSFSISGKISDNTTRKP IKAAQITLLAADGVSAKAAIASNEDGKFALNGLSPSSYLLRITSVGYKPMLLTIVELNKD TDLGELLLKPDTVQLGEVMVSANLQKNADRWVFYPSDAIKRQSTDAYDVLQRMALPDLQF DLMNRTLSSQKGGTLQIRINGVPSKQSDLAALQPQDIARVEYIDNPGIVYGDGVAAVLLV HTKRGYEGMQGGVQIAHALTTKQGNAYAYLKIVGQKHQLAITIDGLYKSVGGVFNNSDKT LRYPANELTLNTKGDDKAYRYRNGSVKLEYNKLLDTRNSFFNIVANYATSRQPENVSQSH AQSNGAPFFSEALLTRDRTHNASLDFYLDKQFASKANLLANLTATYIGSDYRRAYSKAYH AAAQPAFSNAYDVDGKHKSVIGEVIFKQPLSKQLNLTIGTHNRLSKTNNTYVATNTTTPT SLLNFNNYDYVELSGMLGKLAYSVGGGYSFYRMKNDSLVAQYHFFRPSLTLSYPLSKAFR LQYYLGINPVEPQLAMLSSFVQTQSEYELRKGNPHLKPYQAYINQLSLSFQKQETMLGFV TYLHYSNHPFTNNPPVYDAATNMFVYTSANQHSFTHLQMRLYASQSLFSQSLRLSGWLTL NRYINNGLSFFTTYTDVIGSLSASYDRPKWGLQASFRSAITTMFNQTKTRTAPNLQLSAY YNLHRLRCTLSINNPFMSTATTVSTMNSELISATTYKFAKYKDNLVQLSLRYSFRKGKVR NLQKQMDNADNDAGVVK >gi|283510604|gb|ACQH01000015.1| GENE 81 114619 - 117024 1975 801 aa, chain - ## HITS:1 COG:CC1142 KEGG:ns NR:ns ## COG: CC1142 COG1629 # Protein_GI_number: 16125394 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe 
transport # Organism: Caulobacter vibrioides # 130 662 75 608 751 107 23.0 1e-22 MKTLNTYLLLIFLCFSVYVEKAEAQQKLTLIVSDSLSGEPLPFVNIKVNGAKDSLICYSD SIGKCVIPLQAGFYELWLSNVGYTNHKESVTLDADKTLHISLSANANVLGEVSVVGRKRL IKITHNGLEYDLSKDLAAQSTNLLAALNRVPLVNVDANDNITVKGSTSFSIFLNGHPYRI AQSSPKAVLQSIPVTTVKKIEVIDRINARYGGDVGDAIINIVTNQQLFDNYALTLNGGAN TQPRANAGLNLMGTYKNIDYSLGYSYALDGQREQPIASKSRYAQGSQWQEYNGKGVGDGD WKKHIVRTMLQWRPDTLNTLYLDGHALIEQTNLNTLWQQSFNEQGQAPLQSTFDTENKNT TGTAEANVIYRNLFRNTGDERWVVGYRYTYNPDVRHYYQTYSNLLGNERRTRRKTNGGLH EHSLSVDFNLLNNDNTTLLVGGKQTLRNGNIQSSNRLLKDGEWVDDAAENLQQQQLKYTQ NVSAAYASISLGVGNFSLDASLRWEYGDLKMQYPQQTVYNFTNRKHYVLPYASIYYQGKS SSLSLNYNTGVTRPSILMLNPFKAIVSQYLAAEGNPQLKNTYTHTLETSYSLYSNTLFLS YSLHYKWINHPIMAFPRYDTHQKQVITQYQNIAFARDFGSNLYFNYRPIPLISLTFSGNI DWYKNQENEQLWDKSNIAHNLTLMCDVFLKKNWTIGLQYGNYKNPAEIWAKAHAFSISSL AISKSFFKGSLNTRVVMNSPFQKYNELLVEQNLTHFSRSQTNYLTARAFGIDITYTFKSG NPKELKRDSRLKSSDQKTGVE Prediction of potential genes in microbial genomes Time: Sat May 28 00:16:39 2011 Seq name: gi|283510603|gb|ACQH01000016.1| Prevotella sp. oral taxon 317 str. F0108 cont2.16, whole genome shotgun sequence Length of sequence - 31289 bp Number of predicted genes - 22, with homology - 20 Number of transcription units - 16, operones - 5 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 4 - 63 1.8 1 1 Tu 1 . + CDS 133 - 1011 750 ## COG1864 DNA/RNA endonuclease G, NUC1 + Prom 1064 - 1123 3.1 2 2 Op 1 . + CDS 1205 - 1954 350 ## gi|288927494|ref|ZP_06421341.1| hypothetical protein HMPREF0670_00235 3 2 Op 2 . + CDS 1974 - 2855 581 ## Hore_03640 hypothetical protein + Prom 2867 - 2926 3.4 4 3 Tu 1 . + CDS 3051 - 3986 290 ## PRU_2812 hypothetical protein + Term 4090 - 4149 0.6 + Prom 5146 - 5205 3.8 5 4 Tu 1 . + CDS 5448 - 6338 683 ## COG1864 DNA/RNA endonuclease G, NUC1 + Prom 6369 - 6428 2.2 6 5 Op 1 7/0.000 + CDS 6452 - 7267 554 ## COG0327 Uncharacterized conserved protein 7 5 Op 2 . 
+ CDS 7271 - 8095 1191 ## COG1579 Zn-ribbon protein, possibly nucleic acid-binding + Term 8187 - 8233 -0.3 - Term 8337 - 8381 9.6 8 6 Tu 1 . - CDS 8463 - 8738 311 ## gi|288927500|ref|ZP_06421347.1| hypothetical protein HMPREF0670_00241 - Prom 8766 - 8825 5.3 9 7 Op 1 . - CDS 9013 - 9834 983 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain 10 7 Op 2 . - CDS 9895 - 10854 936 ## BVU_4192 putative sodium-dependent transporter 11 7 Op 3 . - CDS 10885 - 11334 563 ## PRU_0539 hypothetical protein - Prom 11506 - 11565 1.8 + Prom 11311 - 11370 4.8 12 8 Tu 1 . + CDS 11508 - 12536 1094 ## COG1466 DNA polymerase III, delta subunit + Term 12574 - 12604 1.1 + Prom 13161 - 13220 4.8 13 9 Tu 1 . + CDS 13310 - 13789 385 ## PRU_0541 DNA-binding protein - Term 17121 - 17181 15.3 14 10 Op 1 . - CDS 17210 - 20782 4327 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit 15 10 Op 2 . - CDS 20813 - 22000 1049 ## COG1373 Predicted ATPase (AAA+ superfamily) - Prom 22030 - 22089 10.4 16 11 Tu 1 . - CDS 22185 - 22451 104 ## 17 12 Tu 1 . + CDS 22330 - 23793 1265 ## COG0471 Di- and tricarboxylate transporters - Term 25518 - 25556 -0.2 18 13 Tu 1 . - CDS 25690 - 26529 789 ## COG1947 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase - Prom 26559 - 26618 4.9 19 14 Tu 1 . - CDS 26678 - 26887 63 ## - Prom 27083 - 27142 2.8 + Prom 26620 - 26679 2.8 20 15 Op 1 . + CDS 26896 - 28374 1638 ## COG0305 Replicative DNA helicase 21 15 Op 2 . + CDS 28461 - 29141 632 ## PRU_0734 hypothetical protein + Term 29193 - 29224 2.0 + Prom 29289 - 29348 6.9 22 16 Tu 1 . 
+ CDS 29376 - 31016 1057 ## gi|288927512|ref|ZP_06421359.1| hypothetical protein HMPREF0670_00253 Predicted protein(s) >gi|283510603|gb|ACQH01000016.1| GENE 1 133 - 1011 750 292 aa, chain + ## HITS:1 COG:BB0411 KEGG:ns NR:ns ## COG: BB0411 COG1864 # Protein_GI_number: 15594756 # Func_class: F Nucleotide transport and metabolism # Function: DNA/RNA endonuclease G, NUC1 # Organism: Borrelia burgdorferi # 112 275 10 175 195 111 37.0 2e-24 MKSRKVYSLALLAGICLSACGLQPSKLGRQLGGKYAYQEQKDGQINSEGKNGYATDKQGK AAYKVADGLEVPEKLTDRPEQILKRVAYTASYNSDLRIPNWVAWRLTGAHTRGKNKRAGV KFHEDTDVPMPRAVDFDYVRSGYDRGHLCPSADNRWDATAQEQSFLLTNVCPQDHNLNVG DWHELEILCRKWAKTYGSIYIVAGPVLFKGKHKTIGKNKVTVPEAFFKVVLCMEGTPKAI GFIYRNESGNRPKSYYVNTIDDVERITGIDFFPALPDKVENEVEATSSLDDW >gi|283510603|gb|ACQH01000016.1| GENE 2 1205 - 1954 350 249 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927494|ref|ZP_06421341.1| ## NR: gi|288927494|ref|ZP_06421341.1| hypothetical protein HMPREF0670_00235 [Prevotella sp. oral taxon 317 str. F0108] # 1 249 9 257 257 525 100.0 1e-147 MKGTMNARCEQTKRVFNEETLSMLCKQNKAEIGYVGHKHRLCPRIRYVIGRNRFGYANFG DFFFCHDGGLYVWQQSEEFKEGHNPYIVEEYFGNSCEGRGYALRSIFAGIDTGYEDCNGS RMFTGDVVLVKEDSGYEFGTLCLASACDLDGKGFYGFPLDNHSLTLDTCRQNNYHLERIG TIFYQLNPCNEPIALWSQAVHYNNARCDLEQKKTRRTMARYTPNFDQEEWKYLGLEVLGV EEFNWNKII >gi|283510603|gb|ACQH01000016.1| GENE 3 1974 - 2855 581 293 aa, chain + ## HITS:1 COG:no KEGG:Hore_03640 NR:ns ## KEGG: Hore_03640 # Name: not_defined # Def: hypothetical protein # Organism: H.orenii # Pathway: not_defined # 26 293 16 280 280 138 34.0 2e-31 MKDYIVMVQDLLNGRNEEFDNADKKRIRLIRHADNRKEKIIDGKSYANSLYNLYLTERNV FLTYQSEQIAKDFKDVDYIVSFIGEEGTTSRFVGVFKNGGIVAQLGLYNGKELARFDFTE VDGFELLKERVIIDWNSPVSWRQNYQNLMPVIRIDRGLQENNIPVFTRFEDVMLDYSQLR RIIETNNMEWKSKLESCNAVYLILDKLTGKQYVGSTYNPKGIWGRWSAYACTGHGDNKDI KQLIDNDETYAKKYFQWCILETLPLNILEKHAIDRESLYKQKFGTRKFGYNNN >gi|283510603|gb|ACQH01000016.1| GENE 4 3051 - 3986 290 311 aa, chain + ## HITS:1 COG:no KEGG:PRU_2812 NR:ns ## KEGG: PRU_2812 # Name: not_defined # Def: 
hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 2 287 4 288 303 245 39.0 2e-63 MSLIQKYIWLVKTIHRAGRITLKELNEKWRDNTDLSRGESLPRQTFDRWKGGVLDLFGII IDCEQRGGYHYYIANPKVLESGELKAWLLDTYGTAVTLSGCLSIHDRILTENIPSSQEFL STILDAMKTSVVLQITFHSFTMKKTETFLVEPYCVKMSAQRWYMLARNVDYNCLRLYSLD RLEHVELTDTSFALPNDFCAKEYFSEYFGIVLDERVPLQRIVLRADKYHQHYMRTLPLHH SQREVFACDDYADFELTLRPTYDFYMKLMSLGNMIKVLEPQSLQDKLCRWLENTAKLYVK ASLDNEEVRSC >gi|283510603|gb|ACQH01000016.1| GENE 5 5448 - 6338 683 296 aa, chain + ## HITS:1 COG:BB0411 KEGG:ns NR:ns ## COG: BB0411 COG1864 # Protein_GI_number: 15594756 # Func_class: F Nucleotide transport and metabolism # Function: DNA/RNA endonuclease G, NUC1 # Organism: Borrelia burgdorferi # 121 279 15 175 195 108 37.0 1e-23 MKTRKVFYLALFAFTYLSAFGLQPVKSSKSIVGVQPLQGQQVAKQRLYEGEDSVRDTVVH KCTWVQRKGLGPLEVPGKLTDRPEQLLKRVAYMASYNSDLRIPNWVAWKLTAEHTEGANE RKGMRFQEDKEVPKPRAVDADYVRNGYDRGHLCPSADNCWDATAQKQSFLLTNICPQDHK LNVGDWKELESQCRKWAKKYGCIYIVAGPILMKGEHKTIGKNKVTVPEAFFKVVLCMEGQ PKAIGFIYRNEGDDRPKSYYVNTIDEVERITGIDFFPTLPNKVEDEVEATSCLDDW >gi|283510603|gb|ACQH01000016.1| GENE 6 6452 - 7267 554 271 aa, chain + ## HITS:1 COG:SP1609 KEGG:ns NR:ns ## COG: SP1609 COG0327 # Protein_GI_number: 15901449 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pneumoniae TIGR4 # 1 267 1 261 265 118 29.0 1e-26 MKIKEVLRALEQFAPLPLQEGFDNAGLQVGLTEAEVSGALLCLDVTEKIVDEAVAKGCNL IVAHHPLIFRKLAQVSDANAVQRTVMKAIKNDITIAALHTNIDNARGGVNFKIAEKMGLL NADFFGARKRVETFNDKGESISVEGASGVMGVFEQPLAADDFVILLKKTFDVECVMCNQL LRRPISKVAICGGSGAFLLADAIAAGADAFVTGEMHYHDYFDHEQEIQIAVIGHYQSEQF TNEIFKTIIEERCQGVKCILTQTNTNPIIYL >gi|283510603|gb|ACQH01000016.1| GENE 7 7271 - 8095 1191 274 aa, chain + ## HITS:1 COG:TP0494 KEGG:ns NR:ns ## COG: TP0494 COG1579 # Protein_GI_number: 15639485 # Func_class: R General function prediction only # Function: Zn-ribbon protein, possibly nucleic acid-binding # Organism: Treponema pallidum # 62 241 54 232 273 62 25.0 9e-10 
MAKKEITELSVEEKLKALYQLQTTLSAIDEKRALRGELPLEVQDLEDEIEGLNIREDKIR REIDDFSRAISQKRAEIAEAEASVERYNRQLDEVKNNREYDTLSKEIEFQTLEIELCNKK IKEALVKIDECERELSVTGQQIYDRERDLHQKRNELDEIMLETKEEEEKLRDRARDLETK IEPRLLSSFKRIRKNARNGLGIVYVQRDACGGCFNKIPPQRQLDVKMHKKVIVCEYCGRI IVDPELAGVKNDKSAAAEKPKRKRAIRKTKEETE >gi|283510603|gb|ACQH01000016.1| GENE 8 8463 - 8738 311 91 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927500|ref|ZP_06421347.1| ## NR: gi|288927500|ref|ZP_06421347.1| hypothetical protein HMPREF0670_00241 [Prevotella sp. oral taxon 317 str. F0108] # 1 91 1 91 91 63 100.0 4e-09 MKKIIFSAALFVAAIGFANAQSEGTAKKCDKAKTEQCDKAGKKGCCKKDAAAKECCKKDA ACKDKKECSKKKCDKAAKQDCKKNCDKKAAK >gi|283510603|gb|ACQH01000016.1| GENE 9 9013 - 9834 983 273 aa, chain - ## HITS:1 COG:MPN021 KEGG:ns NR:ns ## COG: MPN021 COG0484 # Protein_GI_number: 13507760 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Mycoplasma pneumoniae # 212 273 9 72 390 63 46.0 4e-10 MAIGKWIGGFLGFIAGGPLSGLAGLVLGALFDVGLDQVNEPDRNGQPWNTGGSKANFYQH NATANQQQRQQGERNSFLFSLLVLSAYIIRADGKVMHSEMETLRSFLRRNFGEVAVEQGD SIIRNIFNQQKQMGAMAFESIIRDCCFQIAAHMNSAQRLQLLSFLAELAKADGRIDPSEV NALRYLAQWLGFDARILDSMFNLDKHDTHSAYKVLGISPNATNDEVKTAYRKMALQHHPD RVATLGEDIRLAAEKKFKEINDAKERIYKERGM >gi|283510603|gb|ACQH01000016.1| GENE 10 9895 - 10854 936 319 aa, chain - ## HITS:1 COG:no KEGG:BVU_4192 NR:ns ## KEGG: BVU_4192 # Name: not_defined # Def: putative sodium-dependent transporter # Organism: B.vulgatus # Pathway: not_defined # 1 313 1 314 318 242 42.0 1e-62 MNIIAFLKKWTLPSGLVIGAAVYLLFSRIAPLQPIGNVVGPFLVKLLPVVIFVMLYITFC KIQLGDLRPRAWHFILQAIRIALSGILVFAILHTTDPMAKLVLEGTFVCVICPTAAAAPV ITERLGGSIASLTIYTILANVVTSVIIPLFFPMVEKSADITFLTAFVMILRRITFVLIVP LCLALLTRKFLPKVAARIKETKNLAFYLWGFNLSIIMGLTIRNILSTQVYGTILALLLLL PLVISLLLFSIGKAVGYHYGDSISAGQALGQKNTVVGIWLTIAFLNPIASIAPCAYVVWQ NLINAWQLWYKQKYGTLKW >gi|283510603|gb|ACQH01000016.1| GENE 11 10885 - 11334 563 149 aa, chain - ## HITS:1 
COG:no KEGG:PRU_0539 NR:ns ## KEGG: PRU_0539 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 149 1 149 149 193 59.0 2e-48 MYSLNLPSYAIKVGGTKARPTVFDPLRGKYVALTPEEWVRQHFVNYLIHHKGYPQLLMAN EVALSIGDKSLRADSVLYDRDLKPQMVLEYKAPTIALTQKVFDQITVYNMLLHVDYLVVS NGLQHICCKMNYTDNSYTFLPYLPNYTEI >gi|283510603|gb|ACQH01000016.1| GENE 12 11508 - 12536 1094 342 aa, chain + ## HITS:1 COG:BS_yqeN KEGG:ns NR:ns ## COG: BS_yqeN COG1466 # Protein_GI_number: 16079610 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, delta subunit # Organism: Bacillus subtilis # 15 320 7 319 347 81 27.0 2e-15 MAEKRNALGPEVIIKDIKARNFSSVYVLMGEESYYIDKIASLLEETVLQPQERDFNQDIL FGVDTNGVQVADLCKAYPMMAERRLVMVKEAQNLKLVDALAKYLENPVQTTILVLCHKNG TIDRRKKLLARAESMGVVFESKKKRDYELPAFIKGYLATKKATVEDKAAIMIAEHIGADL NRLTSELDKVLISLPENDLRITPDIVERQIGVSKEFNGFELRNAIVSRDVFKANQIIQYF DKNPKAGSIFAFLPLLFSFFQNLLIAYYAPNKNNETEVAKFLDLKSVWGVKDFMVGMRNY SAMKTMHIIHKIREIDAKSKGLDNPNTPVGDLMKELIFFILH >gi|283510603|gb|ACQH01000016.1| GENE 13 13310 - 13789 385 159 aa, chain + ## HITS:1 COG:no KEGG:PRU_0541 NR:ns ## KEGG: PRU_0541 # Name: not_defined # Def: DNA-binding protein # Organism: P.ruminicola # Pathway: not_defined # 1 155 1 162 162 160 57.0 2e-38 MKDRIKKVMESQHMSQQTFALSIGMSPASLSSIFNGRTKPTLNIVEAIKDKIPKISTDWL IFGKGEMYLDETGAANNAPTTHAIPDKEPQLNFDAPAAAPSLFNEQPAVAPKVVQAPREP ERVEVKVIEKPQRKVTEIRIFYDDRTWESFYPENRDGRS >gi|283510603|gb|ACQH01000016.1| GENE 14 17210 - 20782 4327 1190 aa, chain - ## HITS:1 COG:CAC2499_1 KEGG:ns NR:ns ## COG: CAC2499_1 COG0674 # Protein_GI_number: 15895764 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Clostridium acetobutylicum # 5 406 3 404 413 566 65.0 1e-160 MAKEKKFITCDGNEAAAHVSYMFSEVAAIYPITPSSPMAEHVDTWAAKGRKNLFGQTVTV QEMQSEGGAAGAMHGSLQAGALTTTFTASQGLLLMIPNMYKIAGEMLPCVFDVSARAIAS 
HALCIFGDHSDVMACRQTGFAMFCSGSVQEVMDLTAVPHLATLRTSIPFVNFFDGFRTSH EYQKIELIDQDEIAKLVNHDDIKRFRDRALSPERPVTRGTAENPETFFTHREACNQHYDN IPEVVEEYLGKISEITGREYHLFSYYGAEDAENIIILMGSATEAAREAIDYQLKQGKKVG MISVHLYRPFSVKHLLAAVPKTVKRIAVLDRTKEPGAEGEPLYLDVKSAFYDVENKPLIV GGRYGLGSSDTTPAKILSVFNNLELEVPKNHFTVGIVDDVTFTSLPAVPEIPMGSEGIFE AKFYGLGADGTVGANKNSVKIIGDNTNKHCQAYFSYDSKKSGGFTCSHLRFGNSPIHSTY QVNTPNFVACHVQAYLNMYDVIRGLQKNGTFLLNTVFEGDELVRFIPNRIKRYFAENNIT VYYMNASKIAQEIGLGNRTNTILQSAFFRITEVIPVDLAIDQMKKFIVKSYGNKGQDVVD KNYAAVDRGGEYKTLAVDAAWANLPDDAVAAEEDVPAYIKEIVRPINGQSGDLLKVSDFV RYDMVDGTMSNGAAAFEKRGVEAFNPEWTAENCIQCNKCAYVCPHACIRPFVLDEEEMKS FEDATLEMKVPKPMAGMNFRIQVSVLDCVGCGNCADVCPGNKNGKALAMVPFTHDEQQIS NWNYLAKNVKSKQHLIDIKSNVKNSQFAKPLFEFSGACAGCGETPYVKLISQLYGDREMI ANATGCSSIYSASIPSTPYTTNEAGQGPAFDNSLFEDFCEFGMGMALGNKKMRERITMLL SDLLDDDKVPAEFKEAANEWLNTKDDADASRESSAKLKPFIEAGAAKGCETCAELKTLDH YLVKRSQWIIGGDGASYDIGYGGLDHVIASGEDVNILVLDTEVYSNTGGQSSKATPLGAI AQFAAQGKRIRKKDLGLMATTYGYVYVAQIAMGADQAQTFKAIREAEAYPGPSLIIAYAP CINHGLKAGMGKSQAEEAKAVAAGYWHLWRYNPLLADEGKNPFTLDSKEPDWSKFHDFLL GEVRYLSVKKAYPNEAEELFKAAQEMAKLRYKSYVRKSQEDWSKDEETVA >gi|283510603|gb|ACQH01000016.1| GENE 15 20813 - 22000 1049 395 aa, chain - ## HITS:1 COG:TM1265 KEGG:ns NR:ns ## COG: TM1265 COG1373 # Protein_GI_number: 15644021 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Thermotoga maritima # 26 395 32 387 387 75 23.0 2e-13 MEAFFRTHAYLVEHTNATLRRTLMDEINWDDRMIGIKGTRGVGKTTFLLQYAKENFAKGD RQCLYINMNNFYFQSRGIADFAADFVNHGGKVLLIDQVFKQPDWSSELRKCYDRFPELKI VFTGSSVMRLKDENPELNGIVKSYNLRGFSFREYLNIITGKRFKAYSLEEITSQHESIVK EILPLASPSQHFRDYIHHGFYPFFLEQRNFSENLLKTMNMMTEVDILLIKQIDLKYLTKI KKLFYLLSLNSTKAPNISQLAQDIETSRATVMNYIKYLADARLINMMHLVGDDFPKKPSK IMMHNPNLMYAIYPIVPNEQVVMETFFVNALWKDHLVNQCNKEHHYIVDEKKKFRICDAR GTNKVRYNNDTIYARYNTEVGKGNKIPLWLFGFLY >gi|283510603|gb|ACQH01000016.1| GENE 16 22185 - 22451 104 88 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPKAVEGKFHKTAVTTTAKTNKENFLNLKTSLNVFLLCSDIFIYVFMVSNNGAKLSIFSI 
RQKKYQKSLHISRKIVFVRAKRRKKHAW >gi|283510603|gb|ACQH01000016.1| GENE 17 22330 - 23793 1265 487 aa, chain + ## HITS:1 COG:VC1314 KEGG:ns NR:ns ## COG: VC1314 COG0471 # Protein_GI_number: 15641326 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Vibrio cholerae # 21 485 24 486 487 384 46.0 1e-106 MSEHKRNTLSEVFRFKKFSLFVLAVVVTAVLWNLPSTAFGIEGLTVIQQRVIAIFAFATI MWVTEAISSWATSVTLIGALLFTTSDNAFLFFRAGIDKSELIHHSALMATFADPIIILFL GGFMLAIAATKSGLDVFLARALLRPFGRRSEMVLLGFILITGAFSMFVSNTATAAMMLTF LTPVFKALPPEGKGRVALTLAIPVAANIGGMGTPIGTPPNAIALKYLNDPNGLNLGLGFG QWMAFMLPLTIILLLIGWYLLKTFFPFKKREIDLHIEGDIHRGWRTPVIAVTFCVTVLLW MFDKFTGVGANTVAMLPISVFALTGVVKAKDLQEINWSVIWMVAGGFALGLALNESKLAE LAIESIPFGNWSPMVILLISGVICYILSNFISNTATAALLVPILAVVCKGMGDSLAGIGG TPTVLIGIAIAASTAMCLPISTPPNAIAHSTGLVHQNEMMKIGLAIGVIGLVLGYIVLFF VTKIGMI >gi|283510603|gb|ACQH01000016.1| GENE 18 25690 - 26529 789 279 aa, chain - ## HITS:1 COG:BS_yabH KEGG:ns NR:ns ## COG: BS_yabH COG1947 # Protein_GI_number: 16077114 # Func_class: I Lipid transport and metabolism # Function: 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase # Organism: Bacillus subtilis # 7 259 9 257 289 129 32.0 7e-30 MITFPCAKINLGLNIVAKRNDGYHDLETVFYPVPLTDALEVKLMDEQFPSEVACDLKVTG NTLVGDEQQNLVCKAYNLLARDFSMPRVHAHLHKRIPSQAGLGGGSSDAAFMIRLLDERF RLNIGNAEMERYAAQLGADCAFFVTAEPSFATGKGEILEPVDTPQHNLNGYFLAIVKPDV AVSTAEAFKHIVPKRPAKCCRDIVRQPIATWKDELTNDFEQSVFAIYPELANIKRELYAQ GALYAQMSGSGSAIFGLFEQCPKEVEHLFKDHFVYITQL >gi|283510603|gb|ACQH01000016.1| GENE 19 26678 - 26887 63 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFISCVVPYRSGRFSINNLAVRNMYISSSQHRHNAKLRATSKPKPMLYCGKSITECGTAC GQHLVQKRT >gi|283510603|gb|ACQH01000016.1| GENE 20 26896 - 28374 1638 492 aa, chain + ## HITS:1 COG:lin0047 KEGG:ns NR:ns ## COG: lin0047 COG0305 # Protein_GI_number: 16799126 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Listeria innocua # 18 463 3 440 450 365 43.0 1e-101 
MPETSNTNRRTKKSVPVDNTYGHLPPQALDLERVVLGALMIDKDAFAVVSEMLHPETFYE PRHQKIYTAIRSLNMDEKPVDIMTVTEQLKRDGTIDDVGGAPYVVELSSQVASSAHIEYH AAILAHKFLARQLISFASNIETKAFDESIDVENLMQEAEGALFELSQKNMRQDFTQIDPV IKQAIDILQKASANSGGMTGVPTGYTKLDEITSGWQPSDLIIIAGRPAMGKTSFALSLAK NIAVDAQVPIGFFSLEMNNVQLVNRLISNVCEIVGSKILNGQLTPDEWDRLDKRLRNLQG AQVYVDDTPGLSVFELRTKARRLVREKGVGVIMIDYLQLMNANGMKFGSRQEEVSTISRS LKGLAKELNIPVIALSQLSRNVENREGLEGKRPQLSDLRESGAIEQDADMVLFVHRPEYY HIFEDDKGNDLHGMAQIIIAKHRKGATGDVLLTFKGQYTRFENPDDTPSQPALPDDGGGL IGSRMNDEPMPF >gi|283510603|gb|ACQH01000016.1| GENE 21 28461 - 29141 632 226 aa, chain + ## HITS:1 COG:no KEGG:PRU_0734 NR:ns ## KEGG: PRU_0734 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 14 224 7 217 217 140 38.0 3e-32 MKRNVILAFLVATISFNAAAQTKKELRDSIAVLAKEADMHPDSIDLRLRKAALNLQLEQW QYAKDEYDNVLTRFPNNATALYFRAFAYERMRRYDLAKADYEKLLVHIPGNFETLVALAL LNQKMKRHTEALDLANRLVNQHADKALAYAVRAGIERERGMLQLALYDYEQAISKDNANT EYYIYKVETLIDMKQFEKATQELNMLTKKGVPHSDLAELYKRCKRR >gi|283510603|gb|ACQH01000016.1| GENE 22 29376 - 31016 1057 546 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927512|ref|ZP_06421359.1| ## NR: gi|288927512|ref|ZP_06421359.1| hypothetical protein HMPREF0670_00253 [Prevotella sp. oral taxon 317 str. F0108] # 11 546 1 536 536 1067 99.0 0 MKHLYIRTFILLFACVCGVVGMRAEVLDLTKLMPKVKKGQSYPLTTVEQGGVLLTFSRGE GRVVPANWGYARLYNNNTLTIAAQKNMRKVTLHFLKSGKEVLPEAGEMEVSSGKLTADNV WQGNAQQLVITVYPPRGKTFSLEKVEIAYEEGQTPPTTDPNEDKPKTEETVADIASFKQT KHGKAATLKINKALVLGSNDAELFVKDHTALIRVLVSKRGDWQRGDLVTGEVRGVYEEVD ALPTINAVERISLRKVSSQSASAPNVPFNKLSEYTYSWVHTSFTANEKYKLANPRTDVPC FAYNGAEIDCYGIVIPQGTTLILYPLTAQDVTINYYDDRLNTYGEAQNINVQIHQALKQG VFNTLTLPVSLSASEVKSVFGQATVIAKFAYVEGNNVVFQRLTDAGMQAGEPYLLSPQTD MKSIFVAQTQVHSPKTNPNSPFVGTLNATTPPTGSYYLNRQNKLQAFFGNQQIKAFKAYF TDIENGEGKTIVVRHVTNGIQHPTLENETNVHFPTIYNLNGQRMSGDKGALAPGIYIIDG KKTIVR Prediction of potential genes in microbial genomes Time: Sat May 28 00:17:50 2011 Seq name: gi|283510602|gb|ACQH01000017.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.17, whole genome shotgun sequence Length of sequence - 27410 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 13, operones - 2 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 4 - 723 792 ## COG0217 Uncharacterized conserved protein - Prom 826 - 885 4.5 2 2 Tu 1 . - CDS 892 - 3354 2836 ## COG0072 Phenylalanyl-tRNA synthetase beta subunit - Prom 3385 - 3444 3.0 - Term 4458 - 4503 7.2 3 3 Tu 1 . - CDS 4607 - 6148 1691 ## Cphy_1800 glycoside hydrolase family protein - Prom 6253 - 6312 3.9 - Term 6467 - 6501 1.9 4 4 Tu 1 . - CDS 6514 - 7191 710 ## COG1720 Uncharacterized conserved protein - Prom 7338 - 7397 5.9 5 5 Tu 1 . + CDS 7308 - 7862 589 ## COG3467 Predicted flavin-nucleotide-binding protein + Term 7974 - 8024 -0.8 - Term 8057 - 8103 8.8 6 6 Tu 1 . - CDS 8127 - 10331 2523 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases - Prom 10352 - 10411 2.5 + Prom 10318 - 10377 2.2 7 7 Tu 1 . + CDS 10440 - 11351 827 ## COG1242 Predicted Fe-S oxidoreductase - Term 11274 - 11317 2.3 8 8 Tu 1 . - CDS 11330 - 12058 810 ## PRU_0632 putative lipoprotein - Prom 12157 - 12216 6.7 + Prom 12578 - 12637 3.2 9 9 Op 1 . + CDS 12779 - 14620 199 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 10 9 Op 2 . + CDS 14651 - 16633 1359 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily 11 9 Op 3 . + CDS 16636 - 17628 519 ## PRU_2910 lipid A biosynthesis acyltransferase 12 9 Op 4 . + CDS 17625 - 18419 485 ## gi|282881231|ref|ZP_06289918.1| hypothetical protein HMPREF9019_1330 13 9 Op 5 11/0.000 + CDS 18434 - 19372 356 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 14 9 Op 6 . + CDS 19391 - 20179 486 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Term 20339 - 20370 -0.8 - Term 20122 - 20164 -0.4 15 10 Tu 1 . 
- CDS 20398 - 21168 560 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 21242 - 21301 2.3 - Term 21241 - 21296 10.2 16 11 Tu 1 . - CDS 21334 - 22158 1134 ## PRU_0831 hypothetical protein + Prom 22397 - 22456 7.7 17 12 Tu 1 . + CDS 22492 - 23445 749 ## PROTEIN SUPPORTED gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 + Prom 23492 - 23551 3.5 18 13 Op 1 . + CDS 23699 - 24994 1251 ## PROTEIN SUPPORTED gi|229870452|ref|ZP_04490046.1| SSU ribosomal protein S12P methylthiotransferase 19 13 Op 2 . + CDS 24991 - 25281 298 ## COG0776 Bacterial nucleoid DNA-binding protein 20 13 Op 3 . + CDS 25262 - 27178 1901 ## PRU_0754 HU family DNA-binding protein + Term 27184 - 27234 7.6 Predicted protein(s) >gi|283510602|gb|ACQH01000017.1| GENE 1 4 - 723 792 239 aa, chain - ## HITS:1 COG:Cj1172c KEGG:ns NR:ns ## COG: Cj1172c COG0217 # Protein_GI_number: 15792496 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Campylobacter jejuni # 1 238 1 234 235 180 45.0 2e-45 MGRAFEYRKAAKLKRWGHMARTFTKIGKQIAIAVKAGGPEPENNPTLRAIIANAKRENMP KDNIERAIKNAMGKDQSEYKVVTYEGYGPHGVAIFVDTLTDNTTRTVGDVRSIFNKFNGN LGTSGSLSFLFDHKSVFTFKKKDGLEMDELILDLIDYGVEDEYDEDEETGEITIYGAPTS FGEIQKHLEEEGFEVTGAEFTYIPNDLKEVTDEERETIDKMVERLEEFDDVQTVYTNMK >gi|283510602|gb|ACQH01000017.1| GENE 2 892 - 3354 2836 820 aa, chain - ## HITS:1 COG:FN2122_2 KEGG:ns NR:ns ## COG: FN2122_2 COG0072 # Protein_GI_number: 19705412 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase beta subunit # Organism: Fusobacterium nucleatum # 155 820 4 652 653 363 33.0 1e-100 MNISYKWLKEYVDFDLTPQKTAEVLTSCGLEVDSVEEVQIIKGGLKGLYVGEVLTCEAHP NSDHLHITTVDLGKGEPSQIVCGAPNVAAGQKVIVADVGAVLYDGDNEFVIKKSKLRGVE SFGMICAEDEIGVGTAHDGIIVLPADAKVGQPAAEYYQLESDWLIEIDITANHADALSHY GVARDLYAWLVQNGYETTLKRPSCDAFVVDNNDLPIDVTIENNDACKRYACVSITDCEVK ESPEWLQNKLLTIGLRPINNIVDITNYIMMAYGQPMHCFDADMVKGHHIIVKTQPEGTRF VTLDGEEHTLGPSDLSICNAEEPMCIAGVFGGKGSGTYETTRNVVLESAYFHPTWIRKSA 
RRHGLSTDSSYRFERGIDPSGTVYALKQAAILCKELAGGKISMEVKDVYPEPISDFDVDL KYAYVGQVIGKEIGNDVIKRIVTSLGMQVVEETNEGLQLKVPAYRVDVQRPCDVVEDILR IYGYNNVEIPTQLKSSLTVHGDEDREHKLETLIGEQLVGCGFNEILNNSLTKVAYYEGLN NYTEDTCVKLLNPLSADLGMMRQTLLFGGLESLSRNANRQNQNLRFFEFGNCYHYHADKA NEESPIKAYSEEKHLALWLTGKRVEGSWIHPDEQSSFYELNGYVENIFARLGIPAGLLVY EKSENNIFTYALKVMNRGGKVVAELGYVADGIRKNFGLTNEVFFADLNWTALIKLTSKKD IEFKEISKFPAVSRDLALLLDKEVLFKDVVAVATQTEKKLLKKVELFDVYEGKNLPQGKK SYAVNFILQDENATLKDKQIESIMQKLIANLKGKLGAELR >gi|283510602|gb|ACQH01000017.1| GENE 3 4607 - 6148 1691 513 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1800 NR:ns ## KEGG: Cphy_1800 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: C.phytofermentans # Pathway: Amino sugar and nucleotide sugar metabolism [PATH:cpy00520] # 210 344 515 644 652 68 39.0 4e-10 MMRISTLLLGATLSLSTFPAMMAQTYCQPTKSSGGKNWAHANSQSFYSLKVSSDGNDVYQ FTDENCNNQYNWIQDEQGFTTSTGKKLVLDVRSGIWTWDIQVGFDWDGDGDFEDIQRAFS TVGKKPTEKETSWNPKYSDTYANAKWREAEQNRLGHRGVLYHQFTFTVPSNAINGKTRMR ILCDGDGGSADFEMCSPVGYAGSMHDFGVTIVSKNTAAKPVFSVADGIYKMDQMVTITTN TPDAKIYYTLDGSTPTAANGMLYESPVKVSGLEGNTTEVTLKAIAVKDGMEASNVATAKY TIQKAWSIVKGTVHPTEDRYITSATTEDAKQNLNFTQSTKPSVVYINTGSAFTVESGSNF KLHVQCSPSMKWDHAIVFVDWNHNYSFDDAGEQLFKVGEEVKGNEEVCNFTRTIAVPADA VVGATRMRIQFTDAWHKKNEPGHTHSAEDVIDKGGAYDFVVNIEKPVVNGIDNVSTDKTA AEVYTLEGIKLNKQVNELPKGIYIVNGKKVVVK >gi|283510602|gb|ACQH01000017.1| GENE 4 6514 - 7191 710 225 aa, chain - ## HITS:1 COG:NMB2158 KEGG:ns NR:ns ## COG: NMB2158 COG1720 # Protein_GI_number: 15677971 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Neisseria meningitidis MC58 # 4 224 5 221 226 167 44.0 1e-41 MTEIHPIAHLRSPFSSKFGIPKQSGLVQDVRGQIVFEPAYRSKDALRGLDEFDYLWLIWH FSANKHAPNSLVVRPPLLGGNTKMGVFATRSPFRPNGLGLSSVRLERIEWETSQGPVIHV LGADLMDGTPIFDIKPYIAYADCHVGARSGFVDRTAIARLDVCIPERFRTIFSSHDLKTL CEVLALDPRPHYQDNPQKVYGMPFMDKDVRFTVSAEGQLLVVDVV >gi|283510602|gb|ACQH01000017.1| GENE 5 7308 - 7862 589 184 aa, chain + ## HITS:1 COG:CAC2475 KEGG:ns NR:ns ## COG: CAC2475 COG3467 # 
Protein_GI_number: 15895740 # Func_class: R General function prediction only # Function: Predicted flavin-nucleotide-binding protein # Organism: Clostridium acetobutylicum # 28 179 5 154 154 107 38.0 1e-23 MKCYGLLWQKILNFAIKEEVMKYNNDHVRRRDRLLTEQRAIELIEGAEYGVLSMTDEEGM PYAIPVNHVWDGESALYVHCAPEGKKLRAIAKNPHVCLCIVGDVNLLPANFTTEYESVVI YGLARTGLDEDERMKALHLLINKLSPEHKQLGEKYTQASFHRTEIIKIEVQEFSGKCKKV NKAG >gi|283510602|gb|ACQH01000017.1| GENE 6 8127 - 10331 2523 734 aa, chain - ## HITS:1 COG:CC2154 KEGG:ns NR:ns ## COG: CC2154 COG1506 # Protein_GI_number: 16126393 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Caulobacter vibrioides # 120 734 132 736 738 283 30.0 7e-76 MKKVIISCMAFLLSLSSGSAAGTLDLKKITDGTFSPEYISGIKPIAGTDLYAQISDDGQR IVRYSFKTGKQVDVLFDVNDTQGQKVKSFDSYILSPDGTRILICTNPKKIYRRTFKATYY IYTVASRRLDKLSDGGPQQNPTWSPDSRQVAFVRDNNIFLVKLLYDNAESQVTKDGKLNE VINGVPDWVNEEEFGISSSLVFNADGTMICWVRYDEKDVKTFSLQMYQGQYPSHKENQTY PSFFAYKYPKAGEDNAKVSVWSFDIKSRQTRKLDVPLDADGYIPRIKTTNNADMILVYTM NRHQDMLNVYTVNPRSTVAQLLVREKGDKYVKEEAIANISVYNNSFLLPSDRDGYMHLYL YDMGGKQIRKIGDGNYDITQVYGYDEQTGDVFYQAAVPTAHDRQVFVNHKNGKTECLTKQ EGWNNAIFSGDLKNFINTWSNATTPYTFTLCDARGKVISTLLDNKQLVAKYKQEGLSEPE FFSFTTGDGVKLDGWMVKPANFSPSKKYPVIMFQYSGPGSQQVVNSWGIGSMGQGALFDR YLAQEGFIVVCVDGRGTGGRGSAFEKSTYLQLGKLESQDQVATARYLASLPYVDANNIGI WGWSFGGFNTLMSMSSGDNVFKAGVSVAPPTSFRYYDTIYTERYMRTPKENGKGYDDNAM SRAHNLHGALLICHGLADDNVHPQNTFEYAESLVQADKDFKELYYTNRNHSIYGGNTRNH LLRQITNFFKEQLK >gi|283510602|gb|ACQH01000017.1| GENE 7 10440 - 11351 827 303 aa, chain + ## HITS:1 COG:L142355 KEGG:ns NR:ns ## COG: L142355 COG1242 # Protein_GI_number: 15674248 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductase # Organism: Lactococcus lactis # 1 299 1 301 313 208 34.0 1e-53 MQLPYNDFGNWMRRQFPFKVQKLSVNAGFTCPTRDGHTGFGGCTYCNNRSFNPAYCEPQQ GVGEQLEAGKKFFARKYPSMKYLAYFQAYTSTYDTIDRLKAMYEEALMVEDVVGLVIGTR PDCMPDALLRYLEELNKHTFLIVEYGVESANNETLERVKRGHTFECSRETIERTHECGIR 
TGAHVILGLPGEDAEESIRQATIMSALPIDILKIHQMQIIRGTLLAREYMKQPFHLYTVE EYLDVICEYVKRLRKDLVLERFVSQSPEGLLIAPKWGLKNYEFTNLLVRKMANEQVVQGS MAL >gi|283510602|gb|ACQH01000017.1| GENE 8 11330 - 12058 810 242 aa, chain - ## HITS:1 COG:no KEGG:PRU_0632 NR:ns ## KEGG: PRU_0632 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 9 237 30 260 267 229 47.0 8e-59 MNGDHGDCIKVQRYDRLESRYLSTGDYSALQQMNTDYPTETRTLIEKVLQLGTIEDSEIH HRFLHYYQDTTLQRLITDVEAEFNNINDVNEQFTHAFSKLSKWFPNLPQPKIYSQISSFD QSIIVGNGTIGVSLDKYMGEDYSLYKRFYTPEQTKTMYRGYIVPDALSFYLLAQYSLDRN DTRPQIDRDLHMAKIWWVVNKATSKQSYRNKFVDQVDKYMRAHPNVTYEELISTENYRAM LP >gi|283510602|gb|ACQH01000017.1| GENE 9 12779 - 14620 199 613 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 392 598 21 224 305 81 29 6e-15 MKEFLRVLRRFVPPYKKYLALSIVFNVLSAILNIFSFATLIPILNILFKTSDAAKPVPYV AWGSAESIGQLVDIIVNNLNYYIQRMIVEWGATTTLLVVGLLLAFMTMLKTLSYFLSSAA IIPIRTGVVRDIRNQMYRKITSLSLGFFSEERKGDLIARMSGDVQEVEGSIMSSLDMLFK NPVLIIAYFITLVVVSWQLTLFTLIFVPLFGWFMGYVGRRLKQNSIKAQALWSDTMSQVE ETLGGLRIIKAFCAEDKMNARFDKVNSAYRNDIMKVNIRQQLAHPMSEFLGTVMIVIVLW FGGMLVLNNQVMQGPTFIYYLVILYSIINPLKDFSRAGYNIPKGLASMERIDKILKAEID IKEMENPVHISSFEHQIEFRDVSFRYGEQWVLRHINLTIRKGQSVALVGQSGSGKSTLVD LIPRYYDVQEGEVLIDGINVKDLGVHDLRQLIGNVNQEAILFNDSFYNNITFGVDNASMH DVEEAARIANAYDFIMETEHGFDTNIGDRGGRLSGGQRQRVSIARAVLKNPPILILDEAT SALDTESERLVQDALERLMKTRTTVAIAHRLSTIKNADEICVIHEGHIVERGTHEELLAI DGYYRKLNDMQSL >gi|283510602|gb|ACQH01000017.1| GENE 10 14651 - 16633 1359 660 aa, chain + ## HITS:1 COG:PA1689 KEGG:ns NR:ns ## COG: PA1689 COG1368 # Protein_GI_number: 15596886 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Pseudomonas aeruginosa # 276 647 306 679 700 133 27.0 1e-30 MKRNRLDIMQSPLAPFVALLANLLLAFGLYIVARIEFLFENYNYFCAVLSPARLWRLFYG GYMFDRSAIFYTNSLYVVMMLFPLWMKETPLYHKVCKWLFVVVNSLTLVINLGDSVYFPY 
TLRRTTTSVFSEFKNENNLGDVFWGELLEHWYLVLLAALLIFLLCKLYVTPKVRAEQYVS VKQRVAYSGIQLLLLAAVTPFGVAACRGGLDSAIRPITISNANEYVERPTECALVLNTPF ALIRTIGKSVFTVPEYFKSQEELEKVFTPIHNRHRPSALSALDAKNVVILIVESFGREYI GALNKELEDGNYKGYTPNVDKLIEQSVTYKYSYCNGRKSIDGMPSVLCGIPMFVEPFVLS PQSMNTYTGLAGILSNEGYNTAFFHGANRGSMGFLAFAKKTGFKEYYGREDYAADPRFGG DADFDGHWGIWDEPFLQYYCAKMSEMKQPFMTTVFTVSSHTPYVIPEKYKDVYPEEGLIM HKCIRYTDMAIGKFFESARKQPWFKNTLFVLTSDHTNLSDHAQYQTDIGGFCSPIIIYDP SGEIEPGMRDGIAQQIDILPTVLSILEYSKPFLSFGCDLMTTPMEETYAVNYLNGIYQYV KYGYVLQFDGKQTKAVYALDDLLMKHNLKGKVKQQAQMEREVKAIIQQYMYRMVNDKLMP >gi|283510602|gb|ACQH01000017.1| GENE 11 16636 - 17628 519 330 aa, chain + ## HITS:1 COG:no KEGG:PRU_2910 NR:ns ## KEGG: PRU_2910 # Name: not_defined # Def: lipid A biosynthesis acyltransferase # Organism: P.ruminicola # Pathway: Lipopolysaccharide biosynthesis [PATH:pru00540]; Metabolic pathways [PATH:pru01100] # 8 318 3 313 324 359 54.0 6e-98 MKNVISFIAYYLLYAAWFVFSLLPFCVLYVISDFLYLVTYYVVKYRRRVIRRNIESSFPE KSDVELRSIEKGFYHWFCDYLVETIKLMTMSKRQLMEHLVFTGTDVIDHYVDQGRSCGVY LGHYGQWEWITSLPYWISDKGLCTQIYHPLENKRFDSLFKKVREKQGARCIAMAATLRRV VAYQQQKQPIVMGYISDQVPFWNNIHHWFDFLNHDTPVLTGTEKVMKATNQVVFYCDVSR TKRGYYRCDMQLITDQPKGRADFELTDTYFEKLQHTIRRDPALYLWSHNRWKRTREEFNL RYDEATGRVDLGDIEDIKRRKLIQQNQQGI >gi|283510602|gb|ACQH01000017.1| GENE 12 17625 - 18419 485 264 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|282881231|ref|ZP_06289918.1| ## NR: gi|282881231|ref|ZP_06289918.1| hypothetical protein HMPREF9019_1330 [Prevotella timonensis CRIS 5C-B1] # 1 264 1 264 264 357 62.0 4e-97 MIFVKGYGQMCNNILQYAHLYAFGKEHKVTVVSLRFSYKYRFFEICKKWYHNPLTYILGK LLITFRVITCIKDGSNIIQDQQLCKPWIIASAAWHLRYPDLFLKYRDDIARIFEIKPTIQ AKVNCWMNTHPQADINLGLHIRRGDYIRWQGGKYFFSDEVYHRIIKDFIALHPNETINIY ICTNDNALNIDGFTAVHPTTFLSEGSAIEDLQLLASCDYLIGVKSTFSLWASFYRRVPLY WIMDKDVPLTAQSFVYFDDVFTTV >gi|283510602|gb|ACQH01000017.1| GENE 13 18434 - 19372 356 312 aa, chain + ## HITS:1 COG:AGl534gl_1 KEGG:ns NR:ns ## COG: AGl534gl_1 COG0463 # Protein_GI_number: 15890377 # Func_class: M Cell wall/membrane/envelope 
biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 3 230 117 348 365 129 31.0 5e-30 MPKVSVIVPNYNYGRFLPQRLESILSQTFQNMEVILLDDCSTDNSQAVLSQYATHPLVSR IVFNEKNSGSPFAQWRKGIDLAQGEYVWIAESDDYCEPTLLATLVRALDQYPTAAIAFAG AVLVDENGKPTGREFDYWHKHEDGSVRFYPSKSYLRKLLMWRCSVYNAAGVLFRRDLYLQ IDKRYAGLRYCADWLFWIEMALRGDVVEVRKKLNRFRMHNDSVTARSEKTKGVFEEQLII AHRLWALPFIGSYRLSLARGAFYKKIKRMFPDVQNRKEPMQRAKALGVKRSDYVLERIIK SIHQVIPFLYHS >gi|283510602|gb|ACQH01000017.1| GENE 14 19391 - 20179 486 262 aa, chain + ## HITS:1 COG:Cj1135 KEGG:ns NR:ns ## COG: Cj1135 COG0463 # Protein_GI_number: 15792460 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Campylobacter jejuni # 4 253 259 514 515 165 35.0 6e-41 MKTSIIVSTYNWPQALQICLGSILKQTVLPDEIVIADDGSTGETRRLVEEMQAKSNVPIE HVWHEDKGFRRTTIMNKAIARASGDYILQVDGDVILSPHFVSDHLELAEKNYFVCGSRVK LTPQNTQEILATSQFKLKYSSLPLTFLLNSFRSRLFRHLLAERYARKIDHLRGCNMAFWK NDLIKVNGYNEDLAQWGHEDGELAYRLHFAGVKKKALKMGGNVYHLYHKEASKSNEQRHL DELEKVKRNHLAWCENGLNKHL >gi|283510602|gb|ACQH01000017.1| GENE 15 20398 - 21168 560 256 aa, chain - ## HITS:1 COG:aq_1742 KEGG:ns NR:ns ## COG: aq_1742 COG0463 # Protein_GI_number: 15606814 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Aquifex aeolicus # 6 252 3 244 251 117 34.0 2e-26 MNDEKKISVVINTYNAERHLMQVINAASGFDEIVVCDMESTDNTLDIAKGMGCRVVTFEK KDHNIVEPARQFAIEQASFPWVLVLDADEIVSEELRTYLYNHISQPACAEGVAIPRKNYF MGKFMHACYPDYILRFFKKEVTTWPAIIHASPIVEGRVIRLPRNRKDLAFEHLANDSIST LNNKNNVYSDNEISKRLHKHYGVGALLGRPFFRFFRSYILKGGFRDGVPGFIHAVWEGIY QFTIVAKMIEFRRQAK >gi|283510602|gb|ACQH01000017.1| GENE 16 21334 - 22158 1134 274 aa, chain - ## HITS:1 COG:no KEGG:PRU_0831 NR:ns ## KEGG: PRU_0831 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 29 270 1 239 239 198 52.0 1e-49 
MKQEKKALTLKTLSKSNVWSIQENDVFRMWEAAEKDADLKDNARHYMDVVKSAFEAEEIK IDRPEIIKKYEERGFKVGAIRLEEGQKTKLAIKKRPIMRVTDLTYENIRHISAAKLIEVL DRNFGGGWDSLSQSVKDIIESGFEVSTTVLPKDRLHKKGGMFDKKTEDGFEALEIEKGTW VEAIFVKEKPEMERVTSRHDDNFADEKDLSSRDEDNEDEDEVNDSESQYKDDEDEDDSFD EDILTEESYRTTFEEDNDELGMEAENIGDEDDNY >gi|283510602|gb|ACQH01000017.1| GENE 17 22492 - 23445 749 317 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 [Bacillus selenitireducens MLS10] # 18 315 25 320 336 293 48 1e-78 MGIFGFFNKSKKETLDKGLEKTKQSVFSKLTRAIAGKSKVDDEVLDDLEEILITSDVGVD TTLKIIERIETRVARDKYVSTSELNNILKEEITALLSENNTQDNDSWELPDTGNPYVILV VGVNGVGKTTTIGKLAHQFKQAGKKVYLGAADTFRAAAVEQICIWGERVGVPVVKQQMGS DPASVAFDTLSSAKANGADVVIIDTAGRLHNKVGLMNELKKIKEVMKKVLPEAPNEVLLV LDGSTGQNAFEQAKQFSAVTNITALAITKLDGTAKGGVVIGISDQLKVPVKYIGLGEGME DLQLFNRQHFVDSLFNE >gi|283510602|gb|ACQH01000017.1| GENE 18 23699 - 24994 1251 431 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229870452|ref|ZP_04490046.1| SSU ribosomal protein S12P methylthiotransferase [Spirosoma linguale DSM 74] # 1 430 6 434 437 486 55 1e-137 MKKNQVDIITMGCSKNLVDSELLMKQFEANGYHCVHDSKKPNGEIVVINTCGFIESAKEE SINTILEFAQAKEEGRLKQLYVMGCLSQRYQKELEQEIPQVDKFYGKFNYKNLLKDLGKG VIASCNGTRSITTPRHYAYLKISEGCDRSCAYCAIPLITGKHVSRPKEEILEEVRSLVSQ GVKEFQIIAQELTYYGVDLDGQRQITDLISEMADIKGVEWIRLHYAYPNQFPHSLLQVIK NKPNVCKYIDIALQHISDNMLTRMHRHVTKAQTMELIKRIREEIPGIHLRTTLMVGFPGE TEDDFNELLDFVKWARFERMGAFAYSEEEGTYSAKHYEDDIPDEVKQRRLDQLMALQQDI SAEVEAQKVGKTLRVIIDRKEGDYYIGRTEWSSPEVDPEVLIPATVKLRVGSFYNVRITD SEEFDLYGEIE >gi|283510602|gb|ACQH01000017.1| GENE 19 24991 - 25281 298 96 aa, chain + ## HITS:1 COG:BS_yonN KEGG:ns NR:ns ## COG: BS_yonN COG0776 # Protein_GI_number: 16079164 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Bacillus subtilis # 1 90 1 90 92 57 34.0 7e-09 MNNKEFITELAQQTGYTQNVTQTLVRNVIDEMAVRFDEGDVVSIPGFGTFEVKKRLERVV VNPATQQRMLVPPKLVLNFRPTASIKERLKNGGEDE >gi|283510602|gb|ACQH01000017.1| GENE 20 25262 - 27178 1901 638 
aa, chain + ## HITS:1 COG:no KEGG:PRU_0754 NR:ns ## KEGG: PRU_0754 # Name: not_defined # Def: HU family DNA-binding protein # Organism: P.ruminicola # Pathway: not_defined # 8 112 3 107 472 103 54.0 2e-20 MEVKMNNRVGIRELAMALVNKNKLSLKEAENFVATMFDTLNVGLNDDKQVKLKGLGTFKV MSVSARKSVDVNTGEPIIIDGRDKISFTPDSTMKDLVNKPFAQFDTVVINDGVDLKELER MDWTETTSDADRTPLNDNSDDAELTPTSTLPPSKIETVVNQAADALASAPVEVPNETEVP VKKAPEKSHILTMQQLSALNGDQRPIAVAENAVNATTQLTLTADQLRLLNGGERHNEQEQ RVVQRAELELEETSIETGTGAPDILSDDETEGTQAQTQSVEGLSEQADETPDATEMDVVD GVEASDVEETHNDVEGEPAKEEDMLDADNDIVDTDTDADNECEPVQAACAMPVVEGLTLV DENLETEDELTPEDTESNSADDSDVETELQSDEEDSDDDDETDENSCEYEEDEWEEEERR RKRRRLIVACVVGIIVVVGGVAWYLLNQMQLKDNRISHLETLLNTSKPSDSKASEQPLSP TDSAREDSINRLIDAQTKADLAASEAKIAQNAEKKAEKADNKKPAATASENKKPAEKTAD KPAPKQEKAKANTTENQNYKQYEKDARVRTGAYKIVGIAHVVTVTKGQTLYSLSKSMLGP GMECYIEAVNGNVKELKEGQKIKIPKLELKKPKTTSKN Prediction of potential genes in microbial genomes Time: Sat May 28 00:18:36 2011 Seq name: gi|283510601|gb|ACQH01000018.1| Prevotella sp. oral taxon 317 str. F0108 cont2.18, whole genome shotgun sequence Length of sequence - 31470 bp Number of predicted genes - 23, with homology - 23 Number of transcription units - 11, operones - 7 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 1223 - 1271 12.4 1 1 Op 1 32/0.000 - CDS 1304 - 1567 351 ## PROTEIN SUPPORTED gi|110636876|ref|YP_677083.1| 50S ribosomal protein L27 2 1 Op 2 . - CDS 1586 - 1903 407 ## PROTEIN SUPPORTED gi|150006123|ref|YP_001300867.1| 50S ribosomal protein L21 - Prom 2008 - 2067 5.3 + Prom 2290 - 2349 4.7 3 2 Op 1 . + CDS 2469 - 5969 3711 ## COG0642 Signal transduction histidine kinase 4 2 Op 2 . + CDS 5999 - 6724 883 ## PRU_0870 response regulator 5 2 Op 3 . + CDS 6717 - 7352 486 ## PRU_0869 hypothetical protein + Term 7537 - 7563 1.0 + Prom 7540 - 7599 4.0 6 3 Op 1 . + CDS 7738 - 10353 3103 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit 7 3 Op 2 . 
+ CDS 10537 - 11763 1545 ## PRU_0814 hypothetical protein + Term 11783 - 11831 10.7 8 4 Op 1 . + CDS 11924 - 13129 1153 ## PRU_1072 hypothetical protein 9 4 Op 2 . + CDS 13165 - 14490 1324 ## COG0513 Superfamily II DNA and RNA helicases + Term 14657 - 14702 -0.2 - Term 14645 - 14690 3.0 10 5 Tu 1 . - CDS 14730 - 15149 438 ## COG0607 Rhodanese-related sulfurtransferase 11 6 Tu 1 . - CDS 15364 - 15912 537 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 15964 - 16023 5.8 + Prom 17184 - 17243 3.8 12 7 Tu 1 . + CDS 17285 - 17962 196 ## PROTEIN SUPPORTED gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 + Prom 18215 - 18274 2.9 13 8 Op 1 1/0.500 + CDS 18345 - 19091 657 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) 14 8 Op 2 . + CDS 19124 - 21595 2113 ## COG0210 Superfamily I DNA and RNA helicases - Term 21882 - 21943 6.4 15 9 Op 1 . - CDS 21965 - 23824 1643 ## PRU_0120 putative peptidase 16 9 Op 2 . - CDS 23859 - 24641 469 ## COG1145 Ferredoxin - Prom 24670 - 24729 2.6 17 10 Tu 1 . - CDS 24756 - 25496 648 ## COG1335 Amidases related to nicotinamidase - Prom 25672 - 25731 5.4 + Prom 25643 - 25702 6.0 18 11 Op 1 . + CDS 25910 - 27202 1179 ## COG0477 Permeases of the major facilitator superfamily 19 11 Op 2 . + CDS 27199 - 27753 584 ## gi|288927551|ref|ZP_06421398.1| hypothetical protein HMPREF0670_00292 20 11 Op 3 . + CDS 27750 - 28472 536 ## COG1385 Uncharacterized protein conserved in bacteria 21 11 Op 4 . + CDS 28481 - 29833 1002 ## BT_4327 hypothetical protein 22 11 Op 5 . + CDS 29830 - 30480 708 ## COG1136 ABC-type antimicrobial peptide transport system, ATPase component 23 11 Op 6 . 
+ CDS 30477 - 31470 1006 ## BVU_3608 hypothetical protein Predicted protein(s) >gi|283510601|gb|ACQH01000018.1| GENE 1 1304 - 1567 351 87 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|110636876|ref|YP_677083.1| 50S ribosomal protein L27 [Cytophaga hutchinsonii ATCC 33406] # 1 87 1 87 89 139 77 2e-32 MAHKKGVGSSKNGRESASQRLGVKIYGGQKVTAGNIIVRQRGTKHNPGKNVGIGKDDTLF ALIDGVVNFRKSRKDKSYVSVIPEAEA >gi|283510601|gb|ACQH01000018.1| GENE 2 1586 - 1903 407 105 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|150006123|ref|YP_001300867.1| 50S ribosomal protein L21 [Bacteroides vulgatus ATCC 8482] # 1 105 1 105 105 161 75 5e-39 MYAIVEINGQQFKVEQGMKLYVHHIKDVEAGKTVEFDKVLLVDKEGAVTVGAPTVEGAKV VVEVVNPLVKGDKVIVFKMKRRKDYRKKNGHRAQFTQVEVKQVIA >gi|283510601|gb|ACQH01000018.1| GENE 3 2469 - 5969 3711 1166 aa, chain + ## HITS:1 COG:all0824_2 KEGG:ns NR:ns ## COG: all0824_2 COG0642 # Protein_GI_number: 17228319 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 506 671 79 237 273 75 33.0 8e-13 MAYAFHYRNLDSTAAYAQRALNVSGNYAAGRAEALNNMAFVRISKMEFETASKLLNDVKD ITDNQVELLVADVQLMRLCQRQSRNKDFYNYRESALRRLKRIKEEAYNLSPRLRKRMIYA ETELYIVASVYFYYIGLPDRAIAEQENIKPNGDIQQDTAQWLSYMYNVGAGGLITAPTKE AISQQEFDFLFRCYGLAIQSNYPYWEANSLQAISEHLIDKQQRRQLLRDNATAITILNTD DMPDSLLAGNLAQRALVMFSSYGNVYQTAGAYRTLASCYWALKDYKSALFCLQNALYRNP DINKAPDLVSSICEQLSLVYSAMNMKIQSDVNRNVYLDIQRQTRQDKQQEARAEQLENSS KQLNMMLVYVGVAIVLVVLLLYFFNSLRARQAAKYSPEKMLEPLRQWEKVNAQHVEEQND RYEELHEEQEIGRRHVVENKKKNIEQRAKVSLVNSVVPFIDRMRHEIHRLNAVQEPENIR QERLDYVKELTDKIKEYNAILTQWIQMRQGELSLHVESVALQPLFDIVGKSRMGFEHKGI KLVVESTTEVVKADRTLTLFMINTLIDNARKFTERNGKVTLSARVAEEEGYVDIVIADTG VGMTEEERAHLFERKVINSSSADVQTSHGFGLLNCKGIIDKYKKVSRIFSGCSISVESEK GKGSVFCFRLPRVVRMLAVVLALCGGMQVHAATARDSVHGNRSKILDRISAYTPEGKLLK RANAYADSAYFANVNGRYAKTLQYADSCIGKLNDLYRLKAPHGKDLMVAYSKAVDKPAEI QWLNDSLETNFSIILDIRNESAIAALALHEWDLYRYNNEVYTKLFKALSADNTLDAYVVG MQQSKANKTISIVVLVLLLLLIIPAYYVLFYRHYVAYRIYVERVEDINKVLMNTGEAQAK LDAIARIIKSIDRHGGETVQFKSLNEVVSQISEVLQKSVEMEKAREESIEMAEDELRRVE IENDKLHVNNSVLDNCLSALKHETMYYPSRIRQLIENGGEDLHDMNELVDYYKELYMLLS AQAMRLVDDLRYSYRPVPLKQLLDVLPHTSEDGVGEGRHILGDEDLLGYLFGILAGLFEG HSVALAVKDKGARYVEIDLELPSVMYTDEQCRELFTPTTPNLQFYLCRQIVRDTGEFTNA RGCGVFASNIENKTHIAITLPKTTQA >gi|283510601|gb|ACQH01000018.1| GENE 4 5999 - 6724 883 241 aa, chain + ## HITS:1 COG:no KEGG:PRU_0870 NR:ns ## KEGG: PRU_0870 # Name: not_defined # Def: response regulator # Organism: P.ruminicola # Pathway: not_defined # 1 239 1 237 239 342 78.0 6e-93 MDKFKVIIVEDVPLELKGTEGIFRNEIPEATIIGTADNEPAYWKLLKQELPDLVLLDLGL GGSTTIGVEICRHTKEQFPKVKVLIFTGEILNEKLWLDVLDAGCDGIILKSGELLTRGDV ASVMNGKRLVFNQPILEKIVEKFKRVVSNQIFRQEAFITYEIDEYDERFLRHLALGYTKE QITNLRGMPFGVKSLEKRQNELVNKLFPNGNGGMGVNATRLVVRAIELHILDIDNLVPDN G >gi|283510601|gb|ACQH01000018.1| GENE 5 6717 - 7352 486 211 aa, chain + ## HITS:1 COG:no KEGG:PRU_0869 NR:ns ## KEGG: PRU_0869 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 17 208 4 194 195 109 38.0 7e-23 
MGKHKMVCHRRTGKYIARLALALGVLQLVAVLGSWIFKAVNPELPIRSLLSAEGIRWMVG NFGDNLAGRGLVWLLLGSMAYGSVRFCGILDVPRKRKAMSFWDHFGLMVALAELFVIVVL MLLLTVLPHAVLLGVTGNLYPSSFSKSFFLMVCLSVCFISVSFGVVSSRLRSLEEVCDCL VAGIAYTLPLWLIYILAIELYASLRFVFVLS >gi|283510601|gb|ACQH01000018.1| GENE 6 7738 - 10353 3103 871 aa, chain + ## HITS:1 COG:BH0007 KEGG:ns NR:ns ## COG: BH0007 COG0188 # Protein_GI_number: 15612570 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Bacillus halodurans # 8 816 5 804 833 838 53.0 0 MDENQTIDQDRIIKINIEEEMKSSYIDYSMSVIVARALPDVRDGFKPVHRRILYGMLGLG NTSDKPYKKCARVVGDVLGKYHPHGDSSVYGALVRLAQDWNMRYTLVDGQGNFGSVDGDS AAAMRYTECRLSKLGERIMDDLEKDTVDMEENFDATLQEPQVMPTKIPNLLVNGGNGIAV GMATNMPTHNLAEVLDGCCAYIDNPEIDTEGLMLHIKAPDFPTGAYIYGLQGVKDAYETG RGRIVLRAKAEIESDESHDKIVISEIPYGVNKAQLIENIADLVKEGRLEGISNANDESGR QGMRIVVDVKRDANANVVLNKLFKMTQLQSSFSVNNIALVKGRPRLLTLKECVKYFVEHR HDVTIRRTKFDLKKAQERAHILEGLIIACDNIDEVVQIIRNSKTPSEAQRNLEKRFELDE LQSKAIVDMRLSQLTGLRIEQLHAEYQELEELIARLQLILDDPEECKRVMKAELEEVKEK FGDERRTEIIPDEHEFNAEDFYPNDPVVITVSHLGYIKRTPLSEFHEQARGGVGSKGANT RDKDFTEYIYPATMHQTMLFFTKKGRCHWLKCYEIPEGDKTSKGRAIQNLLNIDADDSVN AFLRVKGLNDVDFIKNHFIVFATKNGTVKKTSLEAYSRPRTNGVNAITIAEGDEVVDVRL TNGKNQLIMANRNGRAVYFDENTVRTMGRTATGVRGMRLDGGDDAVVGMIVVNNPETETV MVVSETGYGKRSIVSDYRQTNRGGKGVKTLNITDKTGKLVAIKNVTDDNDLMIINKSGIA IRLAVANFRVMGRATQGVRLINLSKKNDVIASVCKVKTSDKEDDLNENDEEHPNNEAMNT AEQGTANANNDAQIQGNDADGGAEVEADATE >gi|283510601|gb|ACQH01000018.1| GENE 7 10537 - 11763 1545 408 aa, chain + ## HITS:1 COG:no KEGG:PRU_0814 NR:ns ## KEGG: PRU_0814 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 403 1 400 404 234 43.0 4e-60 MRKVMMFALMLAASSVAFAQDAVKEVLKAKNYADAQALVKANLSGMSNADKAKAYNKLVD LSFEKINKEQSVITANQMAEQFKQGKVEAYDTLGFYNALADALTNAIECEKYDQMPNEKG KVKPKFHSSNQNRLYNLRVYLVNAGQDAAQKNNNAGVLKFWGLYTATADTPLFADMSNKQ PDQYVGQVAGFAARYAIQDKDYATADKYLDVALKYAGDNKDDYKDALGLKFYIAQQQLKT 
KEDSLGYVNKLKEYYAKDPSNDMIMGTLANMYTNLNMKDDMKKLIDSRLAADPNSVMAWT LKGQAEMNASQWDDAIASLKKAIALDDKNVIVLTYLGFCINSKAAAINNDVPAQKVLYKE SMGYLEKAKELDPNREKANWSYPLYQCYYVNYGAADQRTKDLESLLNK >gi|283510601|gb|ACQH01000018.1| GENE 8 11924 - 13129 1153 401 aa, chain + ## HITS:1 COG:no KEGG:PRU_1072 NR:ns ## KEGG: PRU_1072 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 19 398 2 376 377 262 35.0 2e-68 MNRFYFILLFSFITLASSASLKKDLAQVKASLKTGKELDKAERIMNDVLAKPDNRKEIKA WLLLFDVIKKQYEQGNERLYLQQSADTVTMYNLTKKMCLTLESIDSVELANANGDASKLK YRKKHADEFFTYKANLFYGGTYFMSKKMYDKAYSFFSLYIDCAKMPIFTGYDILNTDSQL PRAAFFSVYCGYKLHKPELALRYLPLAQKDKERYEYLLQYLADVYQMQNDTDKYMAVLKE GFQDYPTQTYFYTRLIEHYSKDNNWSSAKAYVDQAMQAMPEHLWLRVAKSTILLNLGDYN SCIALCDSIIAKHKDLPEVYLNAGLAYFNRAVEEEKTARTSRDLKKKVAASYRGALPYME QYRRMAPNDQSRWLLPLYTIYLNLNMGKEFEEMDNLMKAAK >gi|283510601|gb|ACQH01000018.1| GENE 9 13165 - 14490 1324 441 aa, chain + ## HITS:1 COG:VC2564 KEGG:ns NR:ns ## COG: VC2564 COG0513 # Protein_GI_number: 15642559 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Vibrio cholerae # 5 432 14 451 460 202 31.0 1e-51 MNHIDLAKAFRKLGVESLNDMQKATATAMQAGKGDVVVLSPTGTGKTLAYLLPLVEMINP ENDNVQAVVIVPGRELALQSQQVLAAVGAVRGMACYGGRTAMEEHRALKQQRPQVVFATP GRLNDHLAKGNISANDVHLLIIDEFDKCLDMGFENEMRTALASLPQVQRRVLLSATNSPR IPLFVENAKLQCVDFVGEAEKVKQRIAVYEVKSDSKDKLDTLHRLLLAFNGESTIVFLNH RESVERTNAFLVEQGFVTSLFHGGLEQDAREAALYRFANGSANVLVCTDLASRGLDIDDV RHIVHYHLPETADAYTHRIGRTARWDKLGNGYFILSEGERIPEYVDSEVQPFALPTVNGK AQMPRMATLYIGKGKKDKISKGDILGFLCKKGHLNNDDIGRIDVKERYAYVAVSRNKVKR LLADVAGEKIKGVKTIVEEVK >gi|283510601|gb|ACQH01000018.1| GENE 10 14730 - 15149 438 139 aa, chain - ## HITS:1 COG:MA0746 KEGG:ns NR:ns ## COG: MA0746 COG0607 # Protein_GI_number: 20089631 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Methanosarcina acetivorans str.C2A # 
36 132 39 144 151 83 44.0 1e-16 MNQQNQSVKTSRIKKIAFVFMSIWSTIFGACAQSQYTNVDVEGFEQAIKNDSAQVLDVRT HEEFAESHIKGAVLVDVFSPNFLADAESKLQKDRPVAVYCRSGRRSATAAKQLSAKGYKV INLEGGILAWIGKRKETVR >gi|283510601|gb|ACQH01000018.1| GENE 11 15364 - 15912 537 182 aa, chain - ## HITS:1 COG:BS_resA KEGG:ns NR:ns ## COG: BS_resA COG0526 # Protein_GI_number: 16079372 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Bacillus subtilis # 49 164 44 155 181 82 33.0 4e-16 MNIKHLRRATTALLVAFSLWSNTAKAQQSTEVDYDAQYASELVKPGTVAPNFELPSPDGM KVSLSQFKGKYVVLDFWASWCPDCRKDAPNIVDLYNRFKDKGVAFVGISFDVDAALWKAA IEKYGMNYAHASELKKMREANISKTYGVKWIPSMVLVDPEGKVVMGTVLWKKLERKLEQL FR >gi|283510601|gb|ACQH01000018.1| GENE 12 17285 - 17962 196 225 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 [Kordia algicida OT-1] # 53 207 205 344 347 80 30 2e-14 MKKMKVLTLLMCLVLVFGCKTKQGTGTLIGTGGGALLGAVIGKVAGNTAVGAVVGATVGA ATGSMIGKHMDKVAKRTAEQVKNAKVEEVTDANGLKAVKVTFDSGILFATNKATLNQSSK NELAKFSQVLKENRDCHVDIYGHTDNTGNDGINIPLSNSRAQSVVNYLANCGVPRSQFQK VEGKGSSDPVASNSTASGRQQNRRVEVYLYASKAMIDAANNGTLK >gi|283510601|gb|ACQH01000018.1| GENE 13 18345 - 19091 657 248 aa, chain + ## HITS:1 COG:PA0378 KEGG:ns NR:ns ## COG: PA0378 COG0744 # Protein_GI_number: 15595575 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Pseudomonas aeruginosa # 6 221 18 229 243 206 49.0 4e-53 MNLMKKKIFKVLRWTAALFFGSTILSVVVLRFVPVFFTPLMFIRCGEQLLDGKSLTMHHH WVSLEKISPHLPVAVMGSEDQRFLLHHGFDFAAIQKAAKQNMRGGKKMGASTISQQTAKN VFLWPGRSWIRKGFEMYFTALIELMWSKQRIMEVYLNSIEMGDGIYGADAVAEYNFGTTA GNLSRAQCALIAATLPNPRVFSSKYPSPYMRKRQKQIMKNMRFMPSFPKEGEDYNPNTAS GGVYRKKK >gi|283510601|gb|ACQH01000018.1| GENE 14 19124 - 21595 2113 823 aa, chain + ## HITS:1 COG:SP1087 KEGG:ns NR:ns ## COG: SP1087 COG0210 # Protein_GI_number: 15900955 # Func_class: L 
Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Streptococcus pneumoniae TIGR4 # 1 654 1 646 763 509 43.0 1e-144 MDNLLDKLNESQREAVVYTDGPQLVIAGAGSGKTRVLTFKIAYLLQQGLKPWNILALTFT NKAANEMKARIGNLVGHEGAKHLFMGTFHSIFSRILRVEAPRIGFSSNFTIYDETDSRSL IKTICKEMQLDDKVYKPATVHARISMAKNHFILPDAYAANAELMLRDTREHLSAVHKVYA TYMQRCRLANAMDFDDLLVLTYRLFAEHEDVRKEYAGRFQYVLVDEYQDTNHVQQLIVTL LTKEHRRICVVGDDAQSIYGFRGADIDNILDFQQQFPEARLFKLEQNYRSTQNIVKAANS LISHNQRQIKKDVFSENDAGEKLMLKPVYSDKEEALVVCNDIKRIMNDEQGAYSDFAILY RTNSQSRSFEEELRRQSIPYRIYGGLSFYQRKEIKDIIAYFRLVMNVDDEEAFKRIVNYP ARGIGNTRLAKVTMAAQQNGVSCWEILSNPAHYQLDVNKGTWTKLQKFRDMVNTFVAELA TLDAYELGMKIIKESGIQADLYATSDADSLTRQENLQEFVGGLREFVDIRREEGNEGEVF LNDYLQEVSLLSDLDSEGDDESRVVLMTVHSAKGLEFPTVFVVGLEENIFPSPRSTDSPR QLEEERRLLYVAITRAERHCILTCAKNRYRFGRMEFDTPSRFIRDIDPSLLNVQNPMADF SKPRLERSLLSDELPSARYNGGRSARESGERRFDNPRFLNSRPVASQFMADPIPNPARPR KPEPAVDPLSEGFKKRLAASGANLKKVSEAISHSSSGSAPAEVMHDAGGKEIIEGCVIEH ERFGIGTIERLEETGENARMTVNFKHAGSKKLLLKFAKFKVIG >gi|283510601|gb|ACQH01000018.1| GENE 15 21965 - 23824 1643 619 aa, chain - ## HITS:1 COG:no KEGG:PRU_0120 NR:ns ## KEGG: PRU_0120 # Name: not_defined # Def: putative peptidase # Organism: P.ruminicola # Pathway: not_defined # 137 615 42 483 485 163 30.0 2e-38 MKKTVLILITLLCHLASHASKGWPYPITVSQPDGTQLTVRINGDADFNWVSTLDGVVLKQ VGNGYYIANIDTNGMLSSSGTLAHDADKRSSAEQSLCKKQDVKAFLTVNTRPERLAATRG FGGKSSTSFFPHTGSPRAIVLLVQFANRPFKVQPRKAFNQYLNSMANKHQDFGNAEDRNT GSVKRYFSDMSGGKFKPQFDLYGPITMPKNVAYYGEGSSSMERYRELVSEACTMMDDSLD FSKYDADKDGNVDLVYIIYAGYGESVSSIDSTLWPKAFVCGTDIKKDGKYVRLAGISNEL NGRPDGNYNSKSGLLINGVGLFCHEFSHCMGLPDFYPTVSPQWTTANDKQDFDAYDNQGM EEWDVMDNGIYLYNGYSPTAYTAWEREKMGWLTIETLTKEGKVELKSIDEGGKAYRIKND KNTSGNEYYIVENIQAKGWNKKLPASGMMVSHVEYEPRAFSVFHGGDNSVNNIKKHPRMT IVPADGYLPSSYRKVPNSSNLTAPYMKKQQYDEQLAGDLYPGKSNVNRLTDAQNMVNYAP WTGGMLNKPIYNIALKNGIVTFDFLKDQTSTGIEQPESVTENNNEERIYTIDGRYVGTNL KALPKGVYIKGKKKVVVSR >gi|283510601|gb|ACQH01000018.1| GENE 16 23859 - 24641 469 260 aa, chain - ## HITS:1 COG:CAC2657 KEGG:ns 
NR:ns ## COG: CAC2657 COG1145 # Protein_GI_number: 15895915 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Clostridium acetobutylicum # 4 240 7 238 249 64 23.0 2e-10 MIFYFSATGNTRWAAQQLSEKTGERLIFIPDAMEGQCHYTLAAGERIGFCFPVHAWRPPV LVRRFIKKLSIANAEGHYCFALCTAGDTTGESMRIFRHALAQRSLHLDAVFSLLMPESYV GLPFMDVDTIEKEREKKNQAKIHLTSIVAQINEAEKVEKEPNVGRWPRINSRVIGEFFER KLVNDKAFRVETNRCVRCGVCAEVCPTADIEGGKGQEPTWKHNGSCLTCFACYHHCPQHA IEFGRQTKKKGQYWYGRNQL >gi|283510601|gb|ACQH01000018.1| GENE 17 24756 - 25496 648 246 aa, chain - ## HITS:1 COG:TP0696 KEGG:ns NR:ns ## COG: TP0696 COG1335 # Protein_GI_number: 15639683 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Treponema pallidum # 2 205 3 215 278 110 32.0 2e-24 MQMLLIIDPNNDFADSRGSLYVPHADKALKAIAHYIEENNPEGIAISQDTHRRYHVGHCA YWQGEGVKPFANVRSEDVERGVISPTFATKEQTLNYLRAMESKGRVHTLWPEHCLIGSWG WAFPDVLVQAVNAWDNLHHGQRPLHVYQKGEYADAEMFSIFSYVNERTPNEHGMQVLDQL AQYDEVVVCGFAKDYCVAESVKDMRNDPRFEGKLRFFDAGMAAINPQSDNLAIYEDCLNT FGAKRI >gi|283510601|gb|ACQH01000018.1| GENE 18 25910 - 27202 1179 430 aa, chain + ## HITS:1 COG:STM3113 KEGG:ns NR:ns ## COG: STM3113 COG0477 # Protein_GI_number: 16766414 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Salmonella typhimurium LT2 # 7 427 1 407 418 360 48.0 3e-99 MKILTIMNLKIRLSLMNFLEFAVWGAYLTSMGTYLVKIGMADQIGWFYAVQGIVSIFMPA IMGIVADRWVPAQRLLGLCHLLAALFMFATAYYGYVAMGEVAVGTSVAGTFAGSMIFPLY SLSVAFYMPTLALSNSVAYTGLNQAGMDTVKDFPPIRVFGTIGFICTMWFVDLMGFQPNQ NQFVVSGIFSIILFLYSFTLPTCPTSSGTERKSIADQLGLRAFSLFKQKKMAIFFFFSML LGVSLQITNGFANPFITSFESIPEYADTFGVNHANILISLSQISETCCILLIPFFMKRFG IKNVMLIAMFAWVLRFGLFGLGNPGNGVWLFVLSMIVYGVAFDFFNISGSLFVDKSTDAH MRSSAQGLFMLMTNGIGATVGTLSAQAIVNHFTVDGVTDWQTCWYVFAAYALVVGVAFAL VFRPKGIDKK 
>gi|283510601|gb|ACQH01000018.1| GENE 19 27199 - 27753 584 184 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927551|ref|ZP_06421398.1| ## NR: gi|288927551|ref|ZP_06421398.1| hypothetical protein HMPREF0670_00292 [Prevotella sp. oral taxon 317 str. F0108] # 1 184 1 184 184 305 100.0 1e-81 MKSKIKLIFNNVAEIVGNERLGLLALTDEAQTMELLIPCDREMKEQLGLRLNHLPITKML LPEVLWQLVRSHTDVDFEIIIDDIIDGQYRVLLYNTLTLEPIKLRASDAVLLSVIGNIPI YIETSLMARQGTAFNKNATGASLPINILDNKMIEDALAKAIKDENYELASRISEEIKRRK DKQK >gi|283510601|gb|ACQH01000018.1| GENE 20 27750 - 28472 536 240 aa, chain + ## HITS:1 COG:PM1868 KEGG:ns NR:ns ## COG: PM1868 COG1385 # Protein_GI_number: 15603733 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pasteurella multocida # 17 238 19 240 244 87 29.0 3e-17 MKETRYFYVPDAASTNELPAEEAAHASRVLRLESGDEVFLIDGTGCFFKAQLTLVTKGRC LYDIVERLPQEKTWNGRIAVAMAPTKVIDRVEWTLEKATEIGIDEFSLLNCAFSERRNVK LERLDKIVVAAVKQSRKAWKPLLNDLQPFENFVKQPRKGAKYIAHCYAEIDKKDLYGELL QLNGDEEVTILIGPEGDFSIDEVRLAMSKGYVPISLGQSRLRTETAALAATMIAQLAFRK >gi|283510601|gb|ACQH01000018.1| GENE 21 28481 - 29833 1002 450 aa, chain + ## HITS:1 COG:no KEGG:BT_4327 NR:ns ## KEGG: BT_4327 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 439 9 479 515 87 23.0 9e-16 MKDIIGVLFSLVVLLVSCSKANYVQSIPKSSMAVMAIDAPKLGKDLKAENILKTFIGADN NVADCGIDLSSKLYLFEAADMNFGLCAKVDDVSKLAATFDALVQKGTAVAGKERKGCQFY TVNNVWAVGFNDDALLVMGPQPLAVQPDLQLRMAKLLNIDPKKEDRQSPLLASVDAIDAP MALVARLDALPENVALPFTLGLPKRVNLSQVVLKAGLHFSGNALLMKGETFSYNQTIDES LKRVANAYRPIGETFLKDATSPAALTLFANMEGDKLLGIVRDNVVLSVMLTGLNTAIDMD NIIRSIDGNIIISTSTYASGQAYLSMKAQIKATNWLMDVGYWKRSCPPGSTIYDAGPRAY VFDNGSYKYYFGLKPDSSFYCLPHPPFQQPVSATVNGQVSPSVANEIRGKRLALVFNLDA LVGTRGIKTNSLATVKSYLGGVNTIVYILE >gi|283510601|gb|ACQH01000018.1| GENE 22 29830 - 30480 708 216 aa, chain + ## HITS:1 COG:Rv0073_1 KEGG:ns NR:ns ## COG: Rv0073_1 COG1136 # Protein_GI_number: 15607215 # Func_class: V Defense mechanisms # Function: ABC-type 
antimicrobial peptide transport system, ATPase component # Organism: Mycobacterium tuberculosis H37Rv # 30 200 27 200 220 97 32.0 2e-20 MMNSISFEQVTPHVFLQRENLQSDVWQQPNLRFEKGKFYLVEAQSGLGKSTFCSYILGYR QDYEGVVKFDETDISLLSTADWVNVRQRNVSQLFQELRLFPELTALENVMLKNQLTHFKT ETEIRAWFEQLGIADKINTRVGKMSFGQRQRVATLRALVQPFDFLLADEPISHLDDFNAD TLGELMAAEAKAQGAALIVTSIGKHPNLQYDKILKL >gi|283510601|gb|ACQH01000018.1| GENE 23 30477 - 31470 1006 331 aa, chain + ## HITS:1 COG:no KEGG:BVU_3608 NR:ns ## KEGG: BVU_3608 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 331 1 332 400 395 55.0 1e-109 MKLVWKLLRQHISWPQFVGFFFANLFGMTIVLLGYQLYCDILPIFTANDSFLKADYLVVS KKIGMANALGQQHSGFSKDEIADLQAQPFVKGVGQFTSTAYKAEATMGVSGMKILNSELF FESVPDPFVDVSLDNWHYTPGDSLVPVILPRSYIAMYNFGFAQNHSLPKINEGLVGMIDL HIQVQGKGGQGYFKGKVIGFSSKLNTILVPQSFMTWSNSHFSPDSEMPPSRLILDVTNPA DQRIGTYLEDHNYELEDNNLDAEKTTYFLKLMVTLVMGVGVVISALSFYVLLLSIYLLVQ KNTTKLQNLLLIGYSPSRVALPYQMLTLALN Prediction of potential genes in microbial genomes Time: Sat May 28 00:19:19 2011 Seq name: gi|283510600|gb|ACQH01000019.1| Prevotella sp. oral taxon 317 str. F0108 cont2.19, whole genome shotgun sequence Length of sequence - 1856 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 201 - 260 3.1 1 1 Tu 1 . 
+ CDS 281 - 1762 1521 ## COG0442 Prolyl-tRNA synthetase Predicted protein(s) >gi|283510600|gb|ACQH01000019.1| GENE 1 281 - 1762 1521 493 aa, chain + ## HITS:1 COG:BB0402 KEGG:ns NR:ns ## COG: BB0402 COG0442 # Protein_GI_number: 15594747 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Prolyl-tRNA synthetase # Organism: Borrelia burgdorferi # 8 493 5 488 488 478 50.0 1e-135 MAIELKDLTKRADNYSQWFNDLVVKADLAEQSAVRGCMVIKPYGYAIWEKMQQQLDRMFK ETGAQNAYFPLLIPKSFFSREAEHVKGFAKECAVVTHYRLRASEDGNSVEVDPSAKLDEE LIIRPTSETIIWNTYKNWIHSWRDLPLMCNQWCNVMRWEMRTRPFLRTSEFLWQEGHTAH ATREEAEEEAKKMLDVYAKFAQDWMAVPVVCGVKSETERFAGALDTYTIEAMMQDGKALQ SGTSHFLGQNFAKSFDVTFLNKENKPEYVWATSWGVSTRLIGALIMTHSDDNGLVLPPKL APIQVVVIPIGKAGQQMQAIADKLQPVIEQLRKAGVTVKYDDSDNRRPGFKYADYELKGI PVRIVMGGRDLENNTVEIMRRDTLEKETIGFEGLAEHIVSVLDDMQANIFKKALDYRNAH VYECNDYEEFKQRIKDGGFFLCHWDGTEETEARIKEDTQATIRCVPFEFEQTEGVDMVSG KPAKHRVIIARSY Prediction of potential genes in microbial genomes Time: Sat May 28 00:19:53 2011 Seq name: gi|283510599|gb|ACQH01000020.1| Prevotella sp. oral taxon 317 str. F0108 cont2.20, whole genome shotgun sequence Length of sequence - 128542 bp Number of predicted genes - 104, with homology - 101 Number of transcription units - 55, operones - 24 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 246 - 305 3.5 1 1 Tu 1 . + CDS 327 - 626 100 ## + Term 627 - 672 -0.3 - TRNA 1698 - 1774 75.8 # Thr TGT 0 0 2 2 Tu 1 . - CDS 2076 - 2276 59 ## - Prom 2387 - 2446 1.8 + Prom 2031 - 2090 5.5 3 3 Op 1 . + CDS 2291 - 3043 797 ## COG2877 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase 4 3 Op 2 . + CDS 3085 - 4071 930 ## COG0794 Predicted sugar phosphate isomerase involved in capsule formation 5 3 Op 3 . + CDS 4096 - 5367 1008 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes + Term 5373 - 5404 -0.5 + Prom 5794 - 5853 5.0 6 4 Op 1 . + CDS 5986 - 7020 716 ## PRU_1314 hypothetical protein 7 4 Op 2 . 
+ CDS 7063 - 7593 465 ## PRU_1315 FHA domain-containing protein + Term 7785 - 7822 -0.9 + Prom 7710 - 7769 3.2 8 5 Op 1 . + CDS 7843 - 8556 178 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 9 5 Op 2 1/0.000 + CDS 8591 - 9910 1383 ## COG0527 Aspartokinases + Prom 9914 - 9973 3.1 10 5 Op 3 . + CDS 10042 - 11205 834 ## COG0019 Diaminopimelate decarboxylase 11 5 Op 4 . + CDS 11216 - 12394 958 ## BF1603 putative transmembrane glycosyltransferase 12 5 Op 5 . + CDS 12400 - 13524 1049 ## PRU_0477 hypothetical protein 13 5 Op 6 . + CDS 13525 - 15525 2058 ## COG3855 Uncharacterized protein conserved in bacteria + Term 15531 - 15579 15.5 - Term 15519 - 15566 15.3 14 6 Tu 1 . - CDS 15636 - 16589 618 ## Coch_1252 hypothetical protein - Prom 16710 - 16769 8.6 15 7 Tu 1 . - CDS 18072 - 18332 272 ## Ccur_10250 hypothetical protein - Prom 18352 - 18411 3.9 16 8 Op 1 . - CDS 18555 - 20360 1835 ## COG0038 Chloride channel protein EriC 17 8 Op 2 . - CDS 20399 - 20968 627 ## COG0009 Putative translation factor (SUA5) - Prom 20988 - 21047 8.1 18 9 Tu 1 . - CDS 21070 - 22053 938 ## COG0223 Methionyl-tRNA formyltransferase 19 10 Tu 1 . + CDS 22022 - 22228 95 ## 20 11 Tu 1 . - CDS 22187 - 24397 2566 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases - Prom 24562 - 24621 8.7 + Prom 24521 - 24580 8.1 21 12 Tu 1 . + CDS 24703 - 26220 1510 ## PRU_1211 hypothetical protein + Term 26246 - 26293 16.1 - Term 26230 - 26284 17.1 22 13 Op 1 . - CDS 26342 - 26770 230 ## PRU_1353 hypothetical protein 23 13 Op 2 . - CDS 26797 - 27132 573 ## PRU_1354 hypothetical protein 24 13 Op 3 . - CDS 27167 - 27991 894 ## COG4105 DNA uptake lipoprotein - Prom 28035 - 28094 7.8 + Prom 28015 - 28074 4.1 25 14 Tu 1 . + CDS 28103 - 30136 2185 ## COG0556 Helicase subunit of the DNA excision repair complex + Term 30147 - 30188 3.3 26 15 Tu 1 . + CDS 30254 - 30451 65 ## gi|288927436|ref|ZP_06421283.1| hypothetical protein HMPREF0670_00177 27 16 Op 1 . 
- CDS 31559 - 32245 474 ## PROTEIN SUPPORTED gi|241889736|ref|ZP_04777034.1| putative 30S ribosomal protein S12 28 16 Op 2 . - CDS 32263 - 32973 536 ## COG4123 Predicted O-methyltransferase 29 16 Op 3 . - CDS 33039 - 33656 801 ## PRU_1645 LuxR family transcriptional regulator 30 16 Op 4 . - CDS 33671 - 34627 1202 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain - Prom 34672 - 34731 9.7 + Prom 35500 - 35559 2.5 31 17 Op 1 . + CDS 35776 - 36516 497 ## PRU_1802 nucleotidyl transferase family protein 32 17 Op 2 . + CDS 36491 - 37315 672 ## PRU_1801 hypothetical protein 33 17 Op 3 . + CDS 37308 - 38249 775 ## PRU_1800 hypothetical protein 34 17 Op 4 . + CDS 38255 - 39010 798 ## COG1011 Predicted hydrolase (HAD superfamily) + Prom 39094 - 39153 2.3 35 17 Op 5 . + CDS 39187 - 41766 2512 ## PRU_1798 hypothetical protein + Term 41787 - 41841 13.2 - TRNA 43181 - 43254 51.9 # Gln CTG 0 0 + Prom 43669 - 43728 5.3 36 18 Op 1 17/0.000 + CDS 43761 - 44501 580 ## COG0247 Fe-S oxidoreductase 37 18 Op 2 . + CDS 44498 - 45868 992 ## COG1139 Uncharacterized conserved protein containing a ferredoxin-like domain 38 18 Op 3 . + CDS 45865 - 46440 539 ## BF3592 hypothetical protein - Term 46564 - 46614 12.3 39 19 Op 1 . - CDS 46657 - 47595 784 ## COG0462 Phosphoribosylpyrophosphate synthetase 40 19 Op 2 . - CDS 47658 - 48275 603 ## PRU_0674 thiamine-phosphate pyrophosphorylase-like protein - Prom 48319 - 48378 3.9 + Prom 48062 - 48121 3.1 41 20 Tu 1 . + CDS 48365 - 49537 924 ## COG0019 Diaminopimelate decarboxylase + Term 49571 - 49636 14.3 - Term 49556 - 49624 23.5 42 21 Tu 1 . - CDS 49694 - 50179 498 ## COG1522 Transcriptional regulators - Prom 50378 - 50437 7.7 - Term 50272 - 50306 -0.8 43 22 Op 1 . - CDS 50496 - 50966 560 ## BVU_1945 hypothetical protein 44 22 Op 2 . - CDS 50969 - 51490 333 ## PRU_1820 ExbD/TolR family biopolymer transport protein 45 22 Op 3 . 
- CDS 51506 - 52009 498 ## gi|288927598|ref|ZP_06421445.1| conserved hypothetical protein 46 22 Op 4 . - CDS 52006 - 52869 999 ## COG0811 Biopolymer transport proteins - Prom 52929 - 52988 2.8 47 23 Tu 1 . - CDS 53093 - 53578 544 ## gi|288927600|ref|ZP_06421447.1| hypothetical protein HMPREF0670_00341 - Prom 53824 - 53883 3.4 48 24 Tu 1 . + CDS 53669 - 54472 785 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I 49 25 Tu 1 . - CDS 55729 - 56760 1104 ## BT_1767 hypothetical protein + Prom 57032 - 57091 2.0 50 26 Tu 1 . + CDS 57112 - 58284 1065 ## COG1690 Uncharacterized conserved protein + Prom 58481 - 58540 5.8 51 27 Tu 1 . + CDS 58724 - 60991 1570 ## Fjoh_4747 hypothetical protein + Term 61120 - 61157 -0.8 52 28 Tu 1 . - CDS 61203 - 62036 353 ## BVU_0885 glycosyl transferase family protein - Prom 62164 - 62223 1.7 53 29 Tu 1 . - CDS 62241 - 64271 2096 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins - Prom 64358 - 64417 3.9 54 30 Tu 1 . - CDS 64463 - 65617 860 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases - Prom 65643 - 65702 3.5 + Prom 65597 - 65656 5.4 55 31 Tu 1 . + CDS 65782 - 67944 2123 ## COG0480 Translation elongation factors (GTPases) + Prom 68070 - 68129 4.9 56 32 Op 1 40/0.000 + CDS 68164 - 69723 1407 ## COG0642 Signal transduction histidine kinase 57 32 Op 2 . + CDS 69729 - 70427 805 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Term 70458 - 70496 2.1 58 33 Tu 1 . - CDS 70701 - 70880 103 ## gi|260911681|ref|ZP_05918261.1| hypothetical protein HMPREF6745_2216 - Prom 70955 - 71014 5.0 + Prom 70913 - 70972 4.5 59 34 Op 1 11/0.000 + CDS 70992 - 71339 408 ## PROTEIN SUPPORTED gi|212691068|ref|ZP_03299196.1| hypothetical protein BACDOR_00558 60 34 Op 2 27/0.000 + CDS 71342 - 71611 422 ## PROTEIN SUPPORTED gi|150002620|ref|YP_001297364.1| 30S ribosomal protein S18 61 34 Op 3 . 
+ CDS 71628 - 72167 515 ## PROTEIN SUPPORTED gi|34540405|ref|NP_904884.1| 50S ribosomal protein L9 + Term 72194 - 72238 12.1 + Prom 73444 - 73503 3.7 62 35 Tu 1 . + CDS 73704 - 74327 410 ## gi|288927616|ref|ZP_06421463.1| conserved hypothetical protein + Term 74348 - 74393 0.2 + Prom 74588 - 74647 5.3 63 36 Op 1 . + CDS 74685 - 75572 873 ## COG0331 (acyl-carrier-protein) S-malonyltransferase + Term 75583 - 75619 1.5 64 36 Op 2 . + CDS 75629 - 77779 1892 ## COG0366 Glycosidases + Term 77793 - 77824 -1.0 + Prom 77956 - 78015 5.8 65 37 Op 1 11/0.000 + CDS 78250 - 80499 2170 ## COG1882 Pyruvate-formate lyase + Term 80530 - 80569 6.5 66 37 Op 2 . + CDS 80646 - 81443 740 ## COG1180 Pyruvate-formate lyase-activating enzyme + Term 81544 - 81602 17.9 - Term 81539 - 81583 12.0 67 38 Tu 1 . - CDS 81591 - 81983 380 ## PRU_0823 hypothetical protein - Prom 82019 - 82078 3.8 + Prom 81996 - 82055 4.6 68 39 Op 1 . + CDS 82101 - 83882 2004 ## COG0481 Membrane GTPase LepA 69 39 Op 2 . + CDS 83915 - 84520 558 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Term 85488 - 85547 12.0 70 40 Tu 1 . - CDS 85573 - 86484 1224 ## PRU_0301 hypothetical protein - Prom 86507 - 86566 3.1 - Term 86518 - 86563 1.0 71 41 Op 1 . - CDS 86675 - 87340 568 ## COG0671 Membrane-associated phospholipid phosphatase 72 41 Op 2 . - CDS 87349 - 89112 1561 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily - Prom 89338 - 89397 5.5 73 42 Op 1 6/0.000 - CDS 89511 - 90116 417 ## COG0512 Anthranilate/para-aminobenzoate synthases component II 74 42 Op 2 . - CDS 90117 - 90833 710 ## COG0135 Phosphoribosylanthranilate isomerase 75 42 Op 3 16/0.000 - CDS 90870 - 91370 469 ## COG0262 Dihydrofolate reductase 76 42 Op 4 . - CDS 91391 - 92185 795 ## COG0207 Thymidylate synthase 77 42 Op 5 . 
- CDS 92182 - 93585 819 ## COG0144 tRNA and rRNA cytosine-C5-methylases - Prom 93645 - 93704 4.6 - Term 95119 - 95180 7.1 78 43 Tu 1 . - CDS 95220 - 96434 1508 ## PRU_0556 C1 family peptidase (EC:3.4.-.-) - Prom 96518 - 96577 2.4 - Term 96614 - 96662 15.3 79 44 Tu 1 . - CDS 96704 - 98107 1646 ## COG0006 Xaa-Pro aminopeptidase - Prom 98153 - 98212 4.8 + Prom 98073 - 98132 3.8 80 45 Op 1 . + CDS 98195 - 99058 818 ## COG4667 Predicted esterase of the alpha-beta hydrolase superfamily + Prom 99071 - 99130 3.5 81 45 Op 2 1/0.000 + CDS 99158 - 99799 579 ## COG0637 Predicted phosphatase/phosphohexomutase + Prom 99935 - 99994 4.0 82 45 Op 3 . + CDS 100027 - 101694 1318 ## COG0438 Glycosyltransferase 83 45 Op 4 . + CDS 101734 - 104295 2729 ## COG0058 Glucan phosphorylase + Term 104344 - 104393 12.5 - TRNA 104833 - 104906 38.6 # Pseudo GCA 0 0 - Term 104788 - 104827 6.0 84 46 Tu 1 . - CDS 104963 - 105154 205 ## gi|288927637|ref|ZP_06421484.1| hypothetical protein HMPREF0670_00378 - Prom 105190 - 105249 3.5 + Prom 105340 - 105399 2.9 85 47 Op 1 . + CDS 105495 - 106100 690 ## COG0218 Predicted GTPase 86 47 Op 2 . + CDS 106106 - 107881 2030 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains 87 47 Op 3 . + CDS 107927 - 108433 466 ## gi|288927640|ref|ZP_06421487.1| conserved hypothetical protein + Prom 108448 - 108507 2.3 88 48 Op 1 . + CDS 108588 - 110252 1535 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) + Prom 110259 - 110318 1.7 89 48 Op 2 . + CDS 110353 - 112017 1719 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) + Term 112030 - 112101 28.9 + Prom 112052 - 112111 4.2 90 49 Tu 1 . + CDS 112186 - 112728 349 ## BT_1585 hypothetical protein + Term 112750 - 112802 16.0 - Term 112859 - 112910 13.1 91 50 Op 1 . - CDS 112967 - 114136 1074 ## BF2289 putative polyphosphate-selective porin O 92 50 Op 2 . - CDS 114159 - 115451 1178 ## Coch_0692 glucose-1-phosphatase (EC:3.1.3.10) - Prom 115504 - 115563 3.5 93 51 Tu 1 . 
- CDS 115812 - 115970 60 ## gi|288927646|ref|ZP_06421493.1| hypothetical protein HMPREF0670_00387 - Prom 116122 - 116181 5.2 94 52 Op 1 . - CDS 116787 - 117812 1021 ## COG1216 Predicted glycosyltransferases 95 52 Op 2 . - CDS 117836 - 119176 708 ## PROTEIN SUPPORTED gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 96 52 Op 3 . - CDS 119224 - 120087 917 ## COG0575 CDP-diglyceride synthetase - Prom 120107 - 120166 3.0 97 53 Op 1 . - CDS 120271 - 122274 1176 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 98 53 Op 2 . - CDS 122323 - 122682 368 ## COG0799 Uncharacterized homolog of plant Iojap protein - Prom 122707 - 122766 4.2 99 54 Op 1 . - CDS 122835 - 123515 524 ## PRU_0195 cyclic nucleotide-binding domain-containing protein 100 54 Op 2 . - CDS 123571 - 124599 1034 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis 101 54 Op 3 . - CDS 124695 - 125390 404 ## PRU_0193 hypothetical protein 102 54 Op 4 . - CDS 125341 - 125832 408 ## PRU_0192 putative lipoprotein 103 54 Op 5 . - CDS 125820 - 126782 1163 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 126864 - 126923 2.1 104 55 Tu 1 . 
- CDS 126949 - 127467 475 ## PRU_0190 hypothetical protein - Prom 127488 - 127547 3.8 Predicted protein(s) >gi|283510599|gb|ACQH01000020.1| GENE 1 327 - 626 100 99 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPTQDKYPSLIRSAGQFNALVLSTCESPITHENPTIPENITTSTYKLTNTSTHKLTTPSV YKLKNLSIHELINSSTRKLKYLYFILQYHSKFHSVIAKI >gi|283510599|gb|ACQH01000020.1| GENE 2 2076 - 2276 59 66 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAFARASAFLSRRMRHGLSCELCLLFTQRYNKKCIMQAFMFTFCVKDRTKRTNRTQTKLL SSLPSL >gi|283510599|gb|ACQH01000020.1| GENE 3 2291 - 3043 797 250 aa, chain + ## HITS:1 COG:aq_085 KEGG:ns NR:ns ## COG: aq_085 COG2877 # Protein_GI_number: 15605681 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase # Organism: Aquifex aeolicus # 2 250 3 251 267 283 58.0 2e-76 MKTTFIAGPCVIESQELLYTVAEKLVEINQKLEVDIIFKASFDKANRTSISSFRGPGLER GLEMLANVKSKYGLKLLTDIHESCQAEAVGQVVDVLQIPAFLCRQTDLLVAAAKTGKVVN IKKAQFLSGPDMKYPVEKAKEAGAAEVWLTERGNTFGYNNLVVDFRNIPDMKEIVPTVIM DCTHSVQRPGAMGGKTGGDRRFVPSMALAAKAFGATGYFFEVHPNPDKGLSDGPNMLELD KLENLIANLL >gi|283510599|gb|ACQH01000020.1| GENE 4 3085 - 4071 930 328 aa, chain + ## HITS:1 COG:Cj1443c_1 KEGG:ns NR:ns ## COG: Cj1443c_1 COG0794 # Protein_GI_number: 15792761 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted sugar phosphate isomerase involved in capsule formation # Organism: Campylobacter jejuni # 15 214 2 201 205 238 58.0 1e-62 MSTLATDSTDNRLNNVRAYATQCLKDEAQALLDLIPQLDHHFDKAVEMMFNCKGKVIVTG VGKSGNIGAKIAATLSSTGTPAFFINPLDVYHGDLGVMTADDVVLALSNSGQTDELLRFI PAILHRDVPLIGMSRNPHSLLAKYSVAHITVKVDKEACPLNLAPTSSTTAALAMGDALAV ALMQVRNFKPTDFARFHPGGELGKRLLTTAADVMRVDDLPVIPRQMHLGDAIIQVSKGKL GLGVSVEDGKIVGLITDGDIRRAMEKWQAEFFNKTVNDIMTTNPKIVLPTTKIADIQQIM QKYKIHTVLVADENERLVGIVDHYRCML >gi|283510599|gb|ACQH01000020.1| GENE 5 4096 - 5367 1008 423 aa, chain + ## HITS:1 COG:lin2646 KEGG:ns NR:ns ## COG: lin2646 COG1502 # Protein_GI_number: 16801708 # Func_class: I Lipid transport and metabolism # 
Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Listeria innocua # 28 423 103 482 482 223 32.0 4e-58 MIEPSRIRSLLLSLLISLAALSQGSDSLIVQQMRDEGVKFSHNNSVVLLTTGQEKFDDMF SAISRARHSVHLEYFNFRNDSIADRLFELLVEKANEGVRVRALFDGFGNDSNNRPLKKKH LRTLRERGIEIFEFDPVQFPWVNHVFHRDHRKIVVVDGNVAYTGGMNVADYYIKGTEVVG SWRDMHCRIEGQEVNTLQAIFLKMWNKVAKQNVHGAEYYRGNDDLGYFDNLKPDTCQSAG KKMVGILNREPRISNKIIRSFYTKAINDSQDSIKLINPYLTLNRALKKALRNAVKRGVKV EIMVSTHSDIPLTPDCVFYNVHKLMKRGVTVWMYEPGFHHTKVIMVDGKFCTVGSANLNA RSLRFDYEENAVIVDKETTQQLSDLFDKDKADCFKLTPESWNKFRTPWQKFVGWFAHLLA PWL >gi|283510599|gb|ACQH01000020.1| GENE 6 5986 - 7020 716 344 aa, chain + ## HITS:1 COG:no KEGG:PRU_1314 NR:ns ## KEGG: PRU_1314 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 340 37 372 377 207 37.0 4e-52 MAHHVEARYDDGDHTFLPPTRTLGFNRQLRLNDSLLAQSYVEAYDLSYPEALTRIASEVE EMRQHLHNEGYYELNDLGVLYYADNGTYTFEPCESGILTPSLYGLGSIDIVRLSVQDAFS EVPAVESVGNVQEIEAARVQQRESKPILLEMDEAAKSLSNSDAEEDNDDETNNNDLIIPV AWVRNAVAVCVAVVAFLLFPSTVTEGEVNTMSHSVVDTGLLTSVMPKEMIKGTESIAEIK APATSVAKQTAPKCEQAKPTEAKHANQTFYCLVLASKITKRNANRYAQMLKDKGYDKTEV LVDDKDVKVTYGRYETKNEAYLAFGKLHGKAEFKDCWVMRSDVR >gi|283510599|gb|ACQH01000020.1| GENE 7 7063 - 7593 465 176 aa, chain + ## HITS:1 COG:no KEGG:PRU_1315 NR:ns ## KEGG: PRU_1315 # Name: not_defined # Def: FHA domain-containing protein # Organism: P.ruminicola # Pathway: not_defined # 1 176 1 176 176 240 65.0 1e-62 MKRVRCPKCDNYITFDETRYQEGQSLFFQCPDCGKEFGIRIGVSKLHNRQKETNADELDC KDHCGTLTVIENVFHYKQVLPLRMGRNVIGRYMKNSGINCPIETNDPSIDMNHCAIEVSR DKRGRLKYVLSDGPSYTGTFVDNEILGDRERRLLENGSLFTIGATSIILRTLDEED >gi|283510599|gb|ACQH01000020.1| GENE 8 7843 - 8556 178 237 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 1 235 7 253 563 73 25 6e-12 MLIEYKNVNVYQDDREVLRGVDFHVDEGEFIYIIGKVGTGKSSFLKTIYCELDIDEEDAD TAEVLGRNLKTLRRREVPALRKELGVIFQDFQLLHDRSVWENLQFVLKYTGWKDKAERAE RIEKVLADVGMSDKKHNMPYELSGGEQQRIAIARAILNNPKVIIADEPTGNLDLDTATNI VALLRDITQSGTAVVMSTHNLSMLSKYPGIVYRCGDGKIKEVTDEYSKIDLTEEDDI >gi|283510599|gb|ACQH01000020.1| GENE 9 8591 - 9910 1383 439 aa, chain + ## HITS:1 COG:VC0391 KEGG:ns NR:ns ## COG: VC0391 COG0527 # Protein_GI_number: 15640418 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Vibrio cholerae # 3 437 34 476 479 246 34.0 8e-65 MKVMKFGGTSVGSPNRMKNVASLITASGEPTFVVLSAMSGTTNSLVEVADYLYKKNPEGA NEVINNLEKQYVKHVDELLADPQHREKLRAFLVDEFNYLRSFTKDLFTSFEEKTIVAQGE VISTNMMVSYLQEIGVKAVLLDALDFMRTDKNNEPDMAFIRENLARLMEENQGYQIYITQ GFICKNAYGETDNLLRGGSDYTASLVGAVLPADEIQIWTDIDGMHNNDPRVVDKTEAVRQ LNFEEAAELAYFGAKILHPTCVQPAKYAGIPVRLKNTLDPDAEGTIINNEVLHNKIKAIA AKDKITAIKIKSSRMLLATGFLRKVFEIFESYQTPIDMIVTSEVGVSMSIDNDSHLEEIV DELKKYGTVTVDTGMCIVCVVGDLDWNNVGFETLVTDAMKNIPVRMISYGGSNYNISFLI KEEDKQRALQSLSHKLFNS >gi|283510599|gb|ACQH01000020.1| GENE 10 10042 - 11205 834 387 aa, chain + ## HITS:1 COG:RSc2979 KEGG:ns NR:ns ## COG: RSc2979 COG0019 # Protein_GI_number: 17547698 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Ralstonia solanacearum # 12 374 58 417 451 278 43.0 1e-74 MISLPTHAFKHLSTPFYYYDEGMLRATLDALNEQLAQHPNFIVHYAVKANANKRILQTMK AAGLGVDCVSGGEIEAALRAGFKGDSIVFAGVGKNDWEIKLALENGIFCFNVESIEELEV IEQLCSELNRTADVCLRVNPNVAAHTHAKITTGLAENKFGIPIGHVRPTIERLQVLKRIR FIGLHFHIGSQIVDMNDFREVCTSVNSLLNLLKQHGIEPEHINVGGGLGIDYDNPDTHAI PDFKDYFDTYARHLQLNEKQKVHFELGRALVGQCGWLVSRVLYVKQGQNKRFLILDAGMN DLLRPALYHARHQILNLSKPSAPSEVYDVVGPVCESSDVFGSDVVLPASHRGHLVAICSA GAYGEAMASQYNCRQLPKSYFSSDLNI >gi|283510599|gb|ACQH01000020.1| GENE 11 11216 - 12394 958 392 aa, chain + ## HITS:1 COG:no KEGG:BF1603 NR:ns ## KEGG: BF1603 # Name: not_defined # Def: putative transmembrane glycosyltransferase # Organism: B.fragilis_NCTC9343 # Pathway: 
not_defined # 58 330 55 329 390 142 31.0 2e-32 MVLTLYIDAITIVCCVSLIMLFFISLFCNPFLRGRRLKKLALANAEIEVDVTDASATPVS IIIPILEDAPELATMVETVLQQEYAAPFRVILVADKGNALVERLSAEHKGDKHLYCTFVP NSSRYMSRKKLTLTIGVKAAETDWVVIMEPTCVPTSCKWLAAFTQSLTDDKRLVVGYTAL DTDAKTSWRMQQFRSSFYILTRTIKGKPYCHAGGVVAIRKDDFMKRDGFSGNLNLLRGEY DFLVNKFGDEGSAAVALHPDAHTQTKAPTLKEWRTLQLFFWETRKSLQRGGVIRFLFNLD NIMLYLNYIVAAICLAYGLIEERWFIVSLASANLLLTIVLRILFAHRAIKIFKACIPAWR VIGFELMVPWRKMRSYLKYMKADKYEFTSHKL >gi|283510599|gb|ACQH01000020.1| GENE 12 12400 - 13524 1049 374 aa, chain + ## HITS:1 COG:no KEGG:PRU_0477 NR:ns ## KEGG: PRU_0477 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 47 369 31 333 339 77 22.0 6e-13 MNVLVYISKGVSDVDRQYLTTLFALLDTNPDIEITYDAPLWRHDFSLFHIFGCWDAKALR LYNMALRTHTPVVLSPFGELNPWRIRNLLKVKRMLLNIPVQRMLSRVEVVHACGQFESDT LKSLNATANIVRIDNPRISGAIDTADMAAQFVRLYHKTLNTWPATMLTAADLELVGALVL AGIDAEMFAAMDLRLETIDALTTPQWRYVLMYAAQERVLDLVRKGLERLQTETPSTDLSK FETFCFDPIPEDDNLAEGEEQEAEDEQQEETDNPDDNSTVDGIVEIVRTLYDALPKDTAH LQLLATLYANLRLTDYDEQALVQALNKVKLLPFFRRVESVMHHLFKLPEGFMPILPLNDK QTKSLTNKITKVYY >gi|283510599|gb|ACQH01000020.1| GENE 13 13525 - 15525 2058 666 aa, chain + ## HITS:1 COG:CAC1572 KEGG:ns NR:ns ## COG: CAC1572 COG3855 # Protein_GI_number: 15894850 # Func_class: G Carbohydrate transport and metabolism # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 4 659 8 661 665 731 55.0 0 MIKKKYEIERDLRYLTLLAQSFPTVAEASTEIINLQAILNLPKGTEHFLADLHGEYEAFQ HVLKNASGNIKRKVNEIFGTTLREAEKRELCTLIYYPEQKLELVKESETDINDWYHITIH QLVSVCRDVSSKYTRSKVRKSLPTDFKYIIQELLHERVDDSNKAAYVNGIIETIISTGRA DDFVVAICNVIQRLAIDQLHILGDIYDRGPGAHVIMDTMERYHSWDIQWGNHDMLWMGAA AGNVACICNVIRLSLRYANLTTLEEGYGINLVPLATFAMETYGDTACKEFVPKSSGESMK LDEKTMRLTALMHKAIAVIQFKVEAQLFEKHPHWQMTNRAVLRHIDYTKGVITLDGKEYQ LTSNEFPTIDPNNPLELTPEEKMLAKRLRHSFKVSEKLQRHVRLLLQHGCMYAIYNNNLL FHASVPLNDDATLKEVEVFPGQTLSGRKLLHKIGMLVRTAYQKDAEPEEREYAIDYFLYL WCGPDSPLFDKSKMATFERYFIAEKETHKEKKGNYFTLRDNEAVVDSILDAFDVKGENRH 
IINGHVPVHVANGENPIKANGKLMVIDGGFSEAYHKETGIAGYTLVYHSRGFQLVQHEPF ASAMDAIRTGRDIKSTTQIIEMSSHRMLVADTDKGVELNKQVADLEELLYAYRHGIIKEA ERKKQE >gi|283510599|gb|ACQH01000020.1| GENE 14 15636 - 16589 618 317 aa, chain - ## HITS:1 COG:no KEGG:Coch_1252 NR:ns ## KEGG: Coch_1252 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 32 311 10 273 275 83 27.0 1e-14 MIPKEQRWFEPLMRWLTFKQRMVTMAKIATCAVVMGLAACGNNDDLNDERLGDIKVSVKE NLDGQPYSDLNSVKVGDFIRYDLEITGSGDAEKYDYKFNPYSVGNIPHQLLGKDYEAWLF TDANYKAYKDKNTAPAQRKKIMQESKLKDGEILIQPNTKYVLLIRPLVPGTFKLDSRCSK SKKGESYSVIGDVPISFNCTKLEAWWVEIEDKRGGLFQESRSHNAFYFQVNDRTAETDKF LAEKENKKITFEIIYDENTYTGDFVEGKPIWFHNSEQGRGAPPVDNKIISYVRLSIAETG QKPINIEYHNIYMNKRN >gi|283510599|gb|ACQH01000020.1| GENE 15 18072 - 18332 272 86 aa, chain - ## HITS:1 COG:no KEGG:Ccur_10250 NR:ns ## KEGG: Ccur_10250 # Name: not_defined # Def: hypothetical protein # Organism: C.curtum # Pathway: not_defined # 4 86 3 85 85 109 60.0 3e-23 MKKEKFFGYVGWVGMCTSILMYVFYFPQIQENLNGNKGSFIQPFMAGINCTLWVCYGLFK EKRDLPLVLANSPGVIFGFFAAFTAL >gi|283510599|gb|ACQH01000020.1| GENE 16 18555 - 20360 1835 601 aa, chain - ## HITS:1 COG:RSp0020 KEGG:ns NR:ns ## COG: RSp0020 COG0038 # Protein_GI_number: 17548241 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Ralstonia solanacearum # 19 453 27 447 461 158 29.0 4e-38 MDTTIEIEDKTPIGRFTKWRTQHLSTRQFTLILSFFVGLFAAIAAYSLHWIIKQIQMLLT EGFTISSINWLYLVFPVVGIYLTSLFIRHVVKDNISHGITRVLYAISTKQARLKGHNCWS SVIASAITIGFGGSVGAEAPIVLTGSAIGSNLGQLFKLDNRTMILLVGCGASAAIAGIYK APIAGLVFTLEVLMVDLTMASLLPILIACVTATCFTYIFVGTDSMFTFHLDGEWMIERVP ASVLFGIFCGMISLYFMRTMTACENVFGKMKNHPYQRLILGGIILSSLIFLFPSLYGEGL SSVNTLLNGNTETDWSQILNNSLFYGHDNLLVVYVGLVLFTKVFATSATNGSGGCGGTFA PSLFIGGFGGFFFARLWNMYNIGVYIPEKNFTLLGMAGVMTAVMHAPLTGIFLIAEITGG YQLLLPLMIVCISSYLTINIFEPHSIYSMRLAREGKLITHHTDKSILTLMSMESIIDKNY IAVNPEMPLGKLVYAISRSHTSFIPVLDHAGRIIGEIDVTKIRHIMFRTELYQKFSVAQI MIPTPAVLGRNDPMEEVMNKFDKTDATYLPVVDANNELQGFISRTRMYAMYRKMVADFST E 
>gi|283510599|gb|ACQH01000020.1| GENE 17 20399 - 20968 627 189 aa, chain - ## HITS:1 COG:MK0635 KEGG:ns NR:ns ## COG: MK0635 COG0009 # Protein_GI_number: 20094073 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Methanopyrus kandleri AV19 # 4 189 7 193 212 98 31.0 6e-21 MTREEDIKKAVEVLRKGGIILYPTDTIWGIGCDATNAEAVAKVYQIKQRDDSKAMICLVD SATRMQRYVRNVPNVAWDLVELATKPTTLILDDAVNLAPNLIAEDGSVALRVTDEPFSKE LCYRFQKALVSTSANISGQPAAQNYKDISEELLNAVDYVCWSRRQEHKPHTPSCIIKLNK DGEVTVIRK >gi|283510599|gb|ACQH01000020.1| GENE 18 21070 - 22053 938 327 aa, chain - ## HITS:1 COG:BH2508 KEGG:ns NR:ns ## COG: BH2508 COG0223 # Protein_GI_number: 15615071 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA formyltransferase # Organism: Bacillus halodurans # 6 315 1 300 317 233 44.0 4e-61 MNKQDLRIVFMGTPEFAVPSLKLLVEGGYNVVAVVTQPDKPVGRHGSTLQAPPVKQYALS KSIPVLQPERMKDEDFLCQLRAFDAHLQVVVAFRMLPKQVWNLPPFGTFNVHAALLPQYR GAAPINWAVINGETETGVTTFFLDEDIDTGRIISHKRLPIPDDANVEWVYNHLMNLGAEL CIETVDRILQSDGKVESTEQPQDQPLKHAPKIFKETCQIDWNQPAKRIYDFIRGLSPYPG AWTEIQKTGIEGENKPQVLKIFSATKTQIPANAQPGTFVVEGNRLFVNTADSLLELQEIQ LAGKKRISSRDFLNGTRDANLYRAITK >gi|283510599|gb|ACQH01000020.1| GENE 19 22022 - 22228 95 68 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKTILKSCLFILLLYTLVVVKEGEGAFTRSLLARMSLRYPVSCCSFPLVLLLGLALQTRH VLNAFNGS >gi|283510599|gb|ACQH01000020.1| GENE 20 22187 - 24397 2566 736 aa, chain - ## HITS:1 COG:PA0934 KEGG:ns NR:ns ## COG: PA0934 COG0317 # Protein_GI_number: 15596131 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Pseudomonas aeruginosa # 64 733 68 738 747 370 33.0 1e-102 MEEKLVFSTEEQDIANGLLSKLLANDANGFQPNDAEKLRQRIDQYVHSGTLQRNIFGLNP VVYALETAEIAFDEIGLKRDAIIAIVLYTGVVAGLSDTTQIEKDFGRGVAQIIHGLVKVQ ELYKRTPVVESENFRNLLISFAEDMRVILIMIADRVNLMRQVRDTDNKEARLEVAQEASY LYAPLAHKLGLYKLKSELEDLSLKYLEHDAYYMIKDKLNATKKSRDSYIERFITPIQQRL 
EEAGLHFHMKGRTKSIHSIWQKMKKQKCNFEGIYDLFAIRIIIDSPIDKEKMQCWQAYSI VTDMYMPNPKRLRDWLSVPKSNGYESLHITVLGPENKWVEVQIRTERMDEIAEKGLAAHW RYKGIKGESGIDEWLSNIRAALENNDDLQLMDQFKMGLYEDEVFVFTPKGELLKFPKGAN ILDFAYRIHSGLGNKCVGGKINGKNVSFRAELKSGDEVEVVTQSNQTPKQEWINIVKTPR AKAKIKLALKDTIAKDTVYAKELLERRLKNRKIEFDESTMMHLIKRMGFKVATDFYKQIA DEKLDVNEVIEKYVAVRDYDQNLNAPQTTRSATEYSYDNPDEEIARNNDDVLVIDRNLKG VDYSLAKCCQPIYGDPVFGFVTVSGGIKIHRANCPNAPELRKRFGYRIVKARWSGKSAGQ YSITIKVVGNDDLGIVNNITSIISKEEKIVMRSINIDSHDGLFDGTIVVQLEDVSKLEAL MKKLRTVKGVKHVSRL >gi|283510599|gb|ACQH01000020.1| GENE 21 24703 - 26220 1510 505 aa, chain + ## HITS:1 COG:no KEGG:PRU_1211 NR:ns ## KEGG: PRU_1211 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 7 505 1 493 493 676 65.0 0 MLTKQDLQQIESLGISEAMVEHQMEQHRKGFPYLKLQAAASIGNGIMVPTAKERDAFLAE WERYQAEGRKVVKFVPASGAASRMFKDLFAFLDAPYNEPTSAFEKEFFDNTKLFAFRKQL CKSCKADTQSSVCDLKAAGRYKEIVAHLLGEEGLNYGNLPKGLLLFHNYDDEPRTPLEEH LVEAALYAASNGEANVHFTVSHDHLQLFKDKVAEKAEKYEKAYNTRLNISFSEQKPSTDT LAVNPDNTPFRNEDGSLLFRPGGHGALIQNLNDVEGDVVFIKNIDNVVPDRLKADTVTYK KLLGGILVHLQKRAFAYLSMLDKGGCSRKQLDEMVHFLEEELCCKREGVSKLDDKALAEY LHKKLNRPMRVCGMVKNVGEPGGGPFLTYNEDGTVSLQILESSQIDSNNAEYVKAFKDGT HFNPVDLVCGIRDYSGKAFHLPDFVDHNTGFISSKSKNGKQLKALELPGLWNGAMSDWNT VFVEVPLSTFNPVKTVNDLLRPEHQ >gi|283510599|gb|ACQH01000020.1| GENE 22 26342 - 26770 230 142 aa, chain - ## HITS:1 COG:no KEGG:PRU_1353 NR:ns ## KEGG: PRU_1353 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 142 10 153 153 89 37.0 3e-17 MVLAIAFTIACLCLPLGYFEPKGMGLNADLFNLWIKQPEGGVSMAPSVLFAIQVLTASIG VWAILGFRNRPRQAKLCVVNILLLIVWYALAAFYALYVGFRDYTFHANIAICFPLVAIIL YWMARRGVLADEKLVRAADRIR >gi|283510599|gb|ACQH01000020.1| GENE 23 26797 - 27132 573 111 aa, chain - ## HITS:1 COG:no KEGG:PRU_1354 NR:ns ## KEGG: PRU_1354 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 108 1 108 112 158 82.0 6e-38 MDYKKSKAPLNTVTRDIMDLCNETNNIYESVAIIAKRANQISIEIKQDLNKKLQEFASYN 
DTLEEVFENREQIEISRYYEKLPKPTLLATQEFIDNSIYWRDPSKEGQQVD >gi|283510599|gb|ACQH01000020.1| GENE 24 27167 - 27991 894 274 aa, chain - ## HITS:1 COG:aq_1273 KEGG:ns NR:ns ## COG: aq_1273 COG4105 # Protein_GI_number: 15606494 # Func_class: R General function prediction only # Function: DNA uptake lipoprotein # Organism: Aquifex aeolicus # 43 178 52 182 306 60 30.0 3e-09 MKKPLLIISTAALLLSSCAHEFNQVYKSDNMQYKYEYAKECFAKGKYVRAITLLQELVTL QKGTENAQESLYMLAMAQYNSKDYETAAQYFKKYYQSYPKGRYAEMAQFYVGQSLFMSTP EPRLDQSRTIQAITDFQTFLDLYPDSKLKPQAQQRLFDLQDKLVEKELYTAKLYYDLGTY FGNCNFGGNNYEACIITSQNALKDYPYSRLREDFAVLLMKSKFELAQQSVETKKLERFQD AEDECYGFINEYPDSKERVLAEKYIAKCKEITKN >gi|283510599|gb|ACQH01000020.1| GENE 25 28103 - 30136 2185 677 aa, chain + ## HITS:1 COG:BS_uvrB KEGG:ns NR:ns ## COG: BS_uvrB COG0556 # Protein_GI_number: 16080570 # Func_class: L Replication, recombination and repair # Function: Helicase subunit of the DNA excision repair complex # Organism: Bacillus subtilis # 3 673 5 660 661 717 55.0 0 MEFKLTSKYKPTGDQPEAIKQLTEGLERGDKAQVLLGVTGSGKTFTMANVIAQHNVPTLV LSHNKTLAAQLYEEMKGFFPNNAVEYYVSYYDYYQPEAYLPTTDTYIEKDLAINDEIDKL RLRAVSALLSGRKDVVVVSSVSCIYGMGGPTAMESGIISLSKGQRIDRNEFLRKLVDSLY VRNDIDLQRGNFRVKGDTVDVAMAYSDNLLRITWWDDEIDSIEEVDSLTYHRIERFNDYK IYPANLFVTTKDQTEHAIRCIQDDLVKQIDFFNELGDGIKAQRIKERVEYDMEMMKELGH CSGIENYSRYFDGRQPGQRPYCLLDFFPKDYLMIIDESHVSVPQLGGMYGGDRARKQNLV EFGFRLPAAFDNRPLRFEEFHNLINQVIYVSATPADYELGEAEGVVVEQLIRPTGLLDPE IVVRPSENQIDDLLAEILERSDKNERVLVTTLTKRMAEELTEYLLDHGVKTNYIHSDVAT LDRVRIMNALRAGEYDVLVGVNLLREGLDLPEVSLVAILDADKEGFLRSHRSLTQTAGRA ARNVNGKVIMYADTITQSMQRTIDETARRRTIQMQYNAEHHITPQQIVKDIKGALTGNTT TVETAKGYRSQAGGYVEPDAVAFAADPIVERMTRQQLEKSIANTTALMKQAAKELDFIQA AQYRDEIARLQEQLELK >gi|283510599|gb|ACQH01000020.1| GENE 26 30254 - 30451 65 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927436|ref|ZP_06421283.1| ## NR: gi|288927436|ref|ZP_06421283.1| hypothetical protein HMPREF0670_00177 [Prevotella sp. oral taxon 317 str. 
F0108] # 5 65 14 74 74 66 65.0 6e-10 MSYTLQPLALAVVSVRIFRAVAYARVCSIIVFCLVSSKDFNIVSALSAHQLRTTPQKIDS RGGLF >gi|283510599|gb|ACQH01000020.1| GENE 27 31559 - 32245 474 228 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|241889736|ref|ZP_04777034.1| putative 30S ribosomal protein S12 [Gemella haemolysans ATCC 10379] # 16 220 25 229 230 187 45 3e-46 MKQQNYRILGDKDADYCLIQAVDEHDEQLLEKEFDAIRQRCNDKKVMLAAFLVDDWFHDL SPWKAPAVFGKNHFGDGATNTLNFVVNTFIPHLTNNVLAKPTETHFIIGGYSLAGLFALW AVTQTKHFKACAAASPSVWFPNWTTYASTCHFHADTIYLSLGNKETKAKNQLLAHVGEDM ESLMHILENKKTTFEWNEGNHFAETDVRTAKAFAWCINQLETNLPKTI >gi|283510599|gb|ACQH01000020.1| GENE 28 32263 - 32973 536 236 aa, chain - ## HITS:1 COG:YPO2709 KEGG:ns NR:ns ## COG: YPO2709 COG4123 # Protein_GI_number: 16122913 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Yersinia pestis # 5 235 20 250 252 162 41.0 3e-40 MPDFFQFKQFTIHQDRCGMKVGTDGVLLGAWAEGGKRILDIGTGTGLIALMLAQRYPDAE ITGVELDEQAALQAQENVAGSPFAQQVTINNTPIQHFSHQPDLHGRFTSIVSNPPFYHSL KSKSHERTLARHTESLTFTELFQCVSLLLAPDGCFSAVIPTEQMDNFLAEACIKGLFVSR LIKVKTVETKPAKRCLVAFSKQRAAHFAEEEEATIRDALGQYTTWYSRLTGDFYLK >gi|283510599|gb|ACQH01000020.1| GENE 29 33039 - 33656 801 205 aa, chain - ## HITS:1 COG:no KEGG:PRU_1645 NR:ns ## KEGG: PRU_1645 # Name: not_defined # Def: LuxR family transcriptional regulator # Organism: P.ruminicola # Pathway: not_defined # 9 205 1 197 197 266 70.0 4e-70 MHNKPEHMLKERPRIAIVDPNTLAILGLKQILQNVLPIMQVDTFNTFAELKSANPEVFFH YFVATSVVIENRPFFLEHKNKTIVLTASTEPNSQMAGFNSLCINVPEEQLVRSILQLEQY AHAHGRNLPPLPHALQMKILSDREIEVLSLIVQGLINKEIADRLNIGLTTVITHRKNIMD KLGMKSVSALTIYAVMHGYVDISKI >gi|283510599|gb|ACQH01000020.1| GENE 30 33671 - 34627 1202 318 aa, chain - ## HITS:1 COG:BMEI1582 KEGG:ns NR:ns ## COG: BMEI1582 COG2197 # Protein_GI_number: 17987865 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Brucella melitensis # 249 310 147 208 
213 60 53.0 4e-09 MKNHKMYEPDDKMILLIKDNYSILQTLSAFGISLGFGDKTVRDVCESQSVDTYTFLAVVN FAINGYRDYDDADKLSIPTLIKYLKASHEYFLDFQLPFMRKALVAALDEKDNLARLILKL YDEYAHSIRSHMKYEEKMVFPYVDSLLNNKAQNNYDIETFSKHHGQTDVKLRELKNIIIK YLPSDGLHNNQLTAVLYDIYNNEEWLLHHGQVEDEIFVPAIRRLEAKSKQNDVSVKISNM INQTNDNVDNLSEREKEVIISLVQGMTNKEIADHLCISINTVITHRRNIARKLQIHSPAG LTIYAIVNNLVDISAVNL >gi|283510599|gb|ACQH01000020.1| GENE 31 35776 - 36516 497 246 aa, chain + ## HITS:1 COG:no KEGG:PRU_1802 NR:ns ## KEGG: PRU_1802 # Name: not_defined # Def: nucleotidyl transferase family protein # Organism: P.ruminicola # Pathway: not_defined # 1 238 1 234 238 313 64.0 3e-84 MKYAIIAAGEGSRLMQEGVQLPKPLVRVGGEHLVDRLIRIFLANKADEIVVICNEQMNDV AAHLRVVQRNGLAGKSVPLRLIVKRTPSSMHSLYELSPYLHASPFVLTTVDTVFKEDEFA LFVNSMNTALAAGDNGMMAVTGFVDDEKPLYVQTNNPPYISGFYDELAPNCHYISAGIYG LTPQCLPVLKTCIERGESRMRNFQRALIANGMKLKAYVMGKVLDIDHVSDINKAEKFLYE QRNCHL >gi|283510599|gb|ACQH01000020.1| GENE 32 36491 - 37315 672 274 aa, chain + ## HITS:1 COG:no KEGG:PRU_1801 NR:ns ## KEGG: PRU_1801 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 57 260 1 199 204 197 46.0 5e-49 MSSVIAIYRDVRFSPNSVAADRSIMDATIDRLGQHQVLVIAEHELSEQHRADLFLSMGRL PRTIQLLKAREQQDRALVINNAAALERFSRRHILQMMQENGIPTPPETCEHGYWVKRADA AAQTKADVVFCSDKATAEKTKRNFALRGVNDVVVSAHVVGDLIKFYGVGDSFFWYFYPTD NGHSKFGNEKRNGIAQHFGFNLHELRAAATRLARLTGIDVYGGDCIVDREGRFFIIDFND WPSFSPCREQAADAISLFVNQLLAAQNNSQQKNG >gi|283510599|gb|ACQH01000020.1| GENE 33 37308 - 38249 775 313 aa, chain + ## HITS:1 COG:no KEGG:PRU_1800 NR:ns ## KEGG: PRU_1800 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 5 306 4 298 299 389 63.0 1e-107 MDNKFKTLFRESLKSSDTEETIDIYFTRPIGLLFALLWNKLGIHPNAITILSIFLGVGAG YMFYFPDLTHNIMGVCLLMLANFCDSTDGQMARLTGKKTLLGRVLDGFSGDLWFFSIYLA LCLRLQSSLIPYTNVKWGLWIWALAAVSGLLCHSPQSSLADYYRQIHLLFLKGKKGSELD TYAQQRAIFEALPKRGQWFKRLFHYNYANYCKSQEKRTPKFQALFAQLNATYGDAEKAPA QLRDDFLQGSRPLMKYTNMLSFNLRAIVLYAACLANCPWVYFLFELTVLSGIYIYMHKQH EELCERCSNTIKQ 
>gi|283510599|gb|ACQH01000020.1| GENE 34 38255 - 39010 798 251 aa, chain + ## HITS:1 COG:PAB1224 KEGG:ns NR:ns ## COG: PAB1224 COG1011 # Protein_GI_number: 14521909 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Pyrococcus abyssi # 132 250 93 207 216 61 34.0 2e-09 MHENMITKEYSFNKLNALGIEGIIFDYGATLDTCGNHWGQVIWHAYQFADVPVSESDYRA AYVHVERTLATNKTIMPDFTFDAVLSTKLQMQLAYLRDNKVLDVPQLELDIMHDVMLTYL ARNLKRTLDHSRFVLETLRKRYPLVLVSNFYGNIKTVLDDYFLLELFEDVIESAVVGVRK PDPKIFALGIETLKLPAEKVMVVGDSFDKDIVPAHALGCPTAWFKSEAWEDKAHDESIPQ MIIKDLEDLLG >gi|283510599|gb|ACQH01000020.1| GENE 35 39187 - 41766 2512 859 aa, chain + ## HITS:1 COG:no KEGG:PRU_1798 NR:ns ## KEGG: PRU_1798 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 13 859 28 882 882 1111 60.0 0 MVFALLFAGLTVSAQTESMLTGIVTDAATGDTIYYPSVSYKGPHIAVSGTAKGEYTIVRK EGLVLTFSAVGYAPVQVVVKASTPKTLNIKLKSDTRQLAEVVVRQKRGKYSRKDNPAVEL MRRVIAAKKRTRLENHDFFQYTKYQKITLAMNDITPTQIDSGFIGKRRWMLDQVEHCPYN NKLILPVSVDETVTQHIYRKQPKTERDIIKGQSSTGINQLIQTGDIMDIMLKDVFTDVNI YDNDVRLLRHHFTSPIGNSAISFYRYYIEDTLYVGKDLCYHLQFTPNNGQDIGFRGELYI VADSTLHVKRCNLTLPIQSSVNFVQNLQVRQEFAQLENGEWALSEDDMIAEIEVNDLLQK AIVIRTTRLNNYAFDELPAKLFKGSGREKREADAMMRDEAFWKKYRAVELSKSESSMDEF VHRVEQMKGFKYIIFGLKALIENFVETGGKDHPSKVDIGPVNTMFTRNFIDGFRTRISAQ TTANLSRHWFLAGYMARGWGSKKNYYSGEITYSFNRKEYLPREFPKRTVSFKSTYDIMSP SDKFLRTDKDNVFTALKWAKVDAMMFYNRQQLTLEREEEWGMKTTLSLRTEENEAAGSLF FEKLSNFFPPIIFPNTDVSSLLHNGKIRTTELLFELEIAPGRTFINTKQRRIAINLEAPV ITLSHAVGLNGVLGGQYRYNFSEVGLFKRFWLRSWGKFDVQLRAGAQWDKVPFPLLIMPA ANLSYIVQKGSFNLINNMEFLNDRYASVDLAWDMNGKIFNRIPLLKKLKWREYIGFKGLW GSLTDKNNPFLFQNMGDATLMYFPEGSHVMNPKRPYMELIVGVHNIFKLFHVQYVRRLNY NDLPTAQKQGVRLMMRMSF >gi|283510599|gb|ACQH01000020.1| GENE 36 43761 - 44501 580 246 aa, chain + ## HITS:1 COG:DR1907 KEGG:ns NR:ns ## COG: DR1907 COG0247 # Protein_GI_number: 15806907 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Deinococcus radiodurans # 
1 244 1 240 247 167 38.0 1e-41 MKVGLFVPCYVNALYPEAGVATYKLLKHFGVGVTYPLKQTCCGQPMANAGFEKKAVPLAE KYENMFKEFDYVVAPSASCAAFVRTHYPRLLSGEKHVCETSTKTMDIVEFLHDILQVKTL PGRFPFVVSVHNSCHGVRELGLSAPTELNIPTYNKMVDLLQLVKGITIKEPERKDECCGF GGMFAIEEPEVSVKMGHDKIQRHIDTGAQYVTGPDSSCLMHMEGLARREKSPIKFIHIAQ ILAAGL >gi|283510599|gb|ACQH01000020.1| GENE 37 44498 - 45868 992 456 aa, chain + ## HITS:1 COG:PM1854 KEGG:ns NR:ns ## COG: PM1854 COG1139 # Protein_GI_number: 15603719 # Func_class: C Energy production and conversion # Function: Uncharacterized conserved protein containing a ferredoxin-like domain # Organism: Pasteurella multocida # 33 451 38 465 467 316 40.0 6e-86 MSTLHSKKAASALKNVEKITRHDQTFWLMRGKRDLMEHKLPEWESLREHASEIKRHTATH LAQYLEQFSKNLENNGVIVHWAEDAREFNQTVLEILEKHGVKKLVKSKSMLTEECELNPY LEAQGIHVVETDLGERILQLLDEKPSHIVVPAIHVTREEVGELFEREGISKEIGNYDPTY LTQCARHSLRNDFLDADAGLTGCNFGVAETGDIVVCTNEGNADMSTSVPKLHIAVMGLEK VVPDYRSLAVFQRLLCRSATGQPSTTYTSHFRKARPGGEMHVVIVDNGRSDMIANDEHWQ TLKCIRCGACMNTCPVYRRSTGYSYSYFIPGPIGINLGMFKSPQQHSGNLSACSLCLSCD NVCPTKVAPGSQVYAWRQSLSGLGLANSTKKLISNGMGMLFYNPSLFSMSLKFAPIVNHL PRFLVYHGLNDWGKGREMPVFASESFEQMWKKGKVK >gi|283510599|gb|ACQH01000020.1| GENE 38 45865 - 46440 539 191 aa, chain + ## HITS:1 COG:no KEGG:BF3592 NR:ns ## KEGG: BF3592 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 3 190 4 191 192 225 56.0 8e-58 MKKENLFEKLKRNTHVRCDMPDMNMVGIVYDDPLAQFCEMVVAVGGKFAVLADGEDVNEA IKRLYPDAKVFASNLSEITIATLDPDTVAEAQDLNGTDVGILRGELGVAENACVWVPQTM KEKAVMFISEELVLLLDRNKLVNNMHEAYTKIAMNDYGYGCFISGPSKTADIEQALVMGA QAARGVTVVLT >gi|283510599|gb|ACQH01000020.1| GENE 39 46657 - 47595 784 312 aa, chain - ## HITS:1 COG:Cj0918c KEGG:ns NR:ns ## COG: Cj0918c COG0462 # Protein_GI_number: 15792247 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Campylobacter jejuni # 7 311 4 309 309 320 51.0 3e-87 MSEENSFLIFSGTKSMYLAEKICASLGCPLGKLVVTHFSDGEFAVSYEETVRGRDVFLVQ 
STFPNSDNLMELLLMIDAAKRASAKTINAVIPYFGWARQDRKDKPRVSIGAKLIADLLHV AGINRLITMDLHADQIQGFFDVPVDHLYASAVILPYLESLKLDNLVIASPDVGGSKRANT YAKYLGCPLVLCNKTRARANEVESMQIIGEVEGKNVVLIDDMVDTAGTIAKAADVMIAAG AKSVRACASHCVMSGPASERVQNSSLEEIVFTDSIPYHNNCPKVRQLSVADMFAETIRRV ESNQSISSQYLI >gi|283510599|gb|ACQH01000020.1| GENE 40 47658 - 48275 603 205 aa, chain - ## HITS:1 COG:no KEGG:PRU_0674 NR:ns ## KEGG: PRU_0674 # Name: not_defined # Def: thiamine-phosphate pyrophosphorylase-like protein # Organism: P.ruminicola # Pathway: not_defined # 4 204 1 201 202 233 55.0 3e-60 MAAMKLVIMTKSTYFVEEDKILTALFDEGMDKLHLYKPGSQLVFSERLLSLIPEGYHDKI VVHEHFRLKNEYDLAGIHLNKPTEIVPNGIKGKISRTCEDLDLLKDMKKNSNYVFLRRIF SCAGNADKPSSFSVKQLEDAADKGLIDKRVYALGGIDVDNVRMAKELGFGGVVVCSDLWK RFDIHNGTDFRDLLAHFRNFQKIVG >gi|283510599|gb|ACQH01000020.1| GENE 41 48365 - 49537 924 390 aa, chain + ## HITS:1 COG:sll0873 KEGG:ns NR:ns ## COG: sll0873 COG0019 # Protein_GI_number: 16330194 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Synechocystis # 14 389 20 386 387 386 47.0 1e-107 MRINLDTFEEVRRPMYIVEEAKLRRNLELIASVSARAEVEIILAFKAFALWKTFPIFKEY IRSTTASSLSEAKLAFEEFGSRAHTYSPAYTDDEIEEIASCSSHLTFNSLTQYDRYSTFA KERNASLSFGLRVNPEYSEVETELYNPCAPGTRFGLSAERLPSQLPTNIEGFHCHCHCES GADVFARTLAHIESKFSGWFPALKWINFGGGHLMTRKDYDVELLVDTLRSFKERHPHLHV ILEPGSAFAWQTGPLVAQVVDIVEDKGIQTAILNVSFTCHMPDCLEMPYQPSVRNAQSLD LEAAVNGGAKHIYRLGGNSCLSGDFMGSWQFDHALTVGENVIFEDMIHYTTVKTNMFNGI SHPSLAIVRTSGELEMLRTYGYEDYKARMD >gi|283510599|gb|ACQH01000020.1| GENE 42 49694 - 50179 498 161 aa, chain - ## HITS:1 COG:HI0563 KEGG:ns NR:ns ## COG: HI0563 COG1522 # Protein_GI_number: 16272506 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Haemophilus influenzae # 4 149 2 148 150 108 38.0 3e-24 MSHHNLDLLDKKILQLISADARTPFLEVARICKVSSAAIHQRIQKLLNLGVIKGSRFILA PEKVGYETCAYVGLHLKDPSTSDKVAEALEKIPEVVECHVPTGNYDLFIKLYAINNRHLM SIIHDKLQPLGLSGSETIISFNALIDRQVSVDQVAVEDEEE >gi|283510599|gb|ACQH01000020.1| GENE 43 50496 - 50966 560 156 
aa, chain - ## HITS:1 COG:no KEGG:BVU_1945 NR:ns ## KEGG: BVU_1945 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 11 153 12 154 154 136 51.0 2e-31 MIRRSKRNHEIPALNTSSLPDLIFTVLFFFMIVTHMRNVTVKVRYQVPQGKELNRLAKKS SVMHIYIGKPIEKRRTDNSETTIVQFNDKVIPLSELRKYVAAERAKHSPDEAKELTASIK ADKDTKMKTIAEVKQALREARVYNVNFTATQSKSKR >gi|283510599|gb|ACQH01000020.1| GENE 44 50969 - 51490 333 173 aa, chain - ## HITS:1 COG:no KEGG:PRU_1820 NR:ns ## KEGG: PRU_1820 # Name: not_defined # Def: ExbD/TolR family biopolymer transport protein # Organism: P.ruminicola # Pathway: not_defined # 1 163 1 164 174 133 39.0 3e-30 MFARRRRTVPQLNATSTADISFMLLIFFLVTTSMDLDKGLSRKLPPVEKNKQEESLVNKD NIIKVYITGNNKILVDDEPSTLDELKSKLKKFVVRKGRNHLIQLQASENANYDTYFHVQD AIVSIFNQLKNQLAVKMFRHPYALCSAEEKDEVNRLMPQRMVESFDNKDNNTP >gi|283510599|gb|ACQH01000020.1| GENE 45 51506 - 52009 498 167 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927598|ref|ZP_06421445.1| ## NR: gi|288927598|ref|ZP_06421445.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 167 1 167 167 281 100.0 1e-74 MKRLNLRNIPAERLSQRVFYVLVGLCVVVFGLFYMVGYDIPYVFNPDITAPMFTGVVLDT MYVLFLLALACAIWSAVKGFSVSGKSNGTENNVPYRAISYAVFVGTAVILLLALLFGSSK EMVVNGSLYANKFLLKLSDMFIYTIGILLFVAFAAVVFGFTRYYRKK >gi|283510599|gb|ACQH01000020.1| GENE 46 52006 - 52869 999 287 aa, chain - ## HITS:1 COG:PA2983 KEGG:ns NR:ns ## COG: PA2983 COG0811 # Protein_GI_number: 15598179 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Pseudomonas aeruginosa # 86 280 3 197 211 75 25.0 1e-13 MRHKLSHLLFIGLLLFAATAQLQAQGNSKTPTPRKAKPADSVSAQVQPKDTAKAMPDTLA VPLPNPGLDLADTEESMGLHQSLKVKFIEGNAGFMSLVALALVLGLAFCIERIIYLSLSE INAKRFMADLDAKISAGDIEGAKEQCRNTRGPVASICYQGLTRISETIDNIERSIVSYGT VQSANLEKGCSWITLFIVMAPSLGFLGTVIGMVMAFDQIEQAGDISAPVVAAGMKVALIT TIFGIIVALVLQVFYNYILSKIEHLVAQMEESSITLLDSIMKHKLSK >gi|283510599|gb|ACQH01000020.1| GENE 47 53093 - 53578 544 161 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927600|ref|ZP_06421447.1| ## NR: gi|288927600|ref|ZP_06421447.1| hypothetical protein HMPREF0670_00341 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 161 1 161 161 322 100.0 6e-87 MKKFLAVCALLLPTLLVGCDVRKVAFSREFDGSVDSFAVYFFNWRLDKAAQFCTPDSKKW LQFAASQVQPEDLEQMKSLPNNFAYAYEIIQGDTKGVAAEVALNVENFFVLDSIGKPGRL KESATFPLQLVKQNGTWKVRLDGLPRMDKKYQQKKPEEENG >gi|283510599|gb|ACQH01000020.1| GENE 48 53669 - 54472 785 267 aa, chain + ## HITS:1 COG:CAC3538 KEGG:ns NR:ns ## COG: CAC3538 COG1235 # Protein_GI_number: 15896774 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Clostridium acetobutylicum # 2 266 1 261 261 183 36.0 4e-46 MLNFISLGSGSSGNCYYLYTETEGLLIDAGVGIRTLKKHFREYGLSLAHIQNVLITHDHA DHVKSVGSLSRDYGLPIYATHRVHVGIEKNYCVRCKVSPERLKLVEKGVAFRLGSFTVTP FNVPHDSLDNVGYQVQFGDITFCLLTDVGEVTDEMKPFINAANYLVIEANHDVEMLSGGP YPQHLKERILGKTGHLSNVDCAEALAQNASEKLRHVWLCHLSEENNHPELAKKTVEQTLR SYGIVAGKDFELEVLKRKSPTGVYELT >gi|283510599|gb|ACQH01000020.1| GENE 49 55729 - 56760 1104 343 aa, chain - ## HITS:1 COG:no KEGG:BT_1767 NR:ns ## KEGG: BT_1767 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 340 1 340 340 332 50.0 1e-89 MPANKNALIRYKTIDNCLRNRYRRWTLDDLVDACSNALYDLEGIRKGVSVRTVQADLQMM RSDKLGYNAPIEVYEHKFYRYADATFSINNMPLSQNDYEVMQEAVDLLRQLEDFEQFTEM SDVVNRLQDHLAIARNHRKPIVHFDKVPNLKGLRLLNPLYNYIAHRQTLRISYQSFSAKE PITLVLCPHLLKEFRNRWFLFGSTADDMVLFNLPLDRIVKVETADEPYRDNPNFDAEHFF DDVIGVSKNVGDRPKTVMFWADDEQAGYISTKPLHPSQQVEDEHPEQGGCVFSIKVVLNF ELYSVLMSYGPGIKVLSPQKVVRRMHNMTSKAHWLYREEKVKE >gi|283510599|gb|ACQH01000020.1| GENE 50 57112 - 58284 1065 390 aa, chain + ## HITS:1 COG:all3526 KEGG:ns NR:ns ## COG: all3526 COG1690 # Protein_GI_number: 17231018 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. 
PCC 7120 # 10 390 13 393 393 254 39.0 3e-67 METKMINNRPVMIWASSVDAKAMQQIENLTTLPFLFHHLAIMPDVHAGMGMPIGGVLACK DAVIPNAVGVDIGCGMCAVKSNYKVADISPDVLRKKIMSGIRQRIPLGMVHHTEPQPECY LPEGHEIARMEVVKRRQAAIVYEVGTLGGGNHFIELQKDEEGNLWVMLHSGSRNLGKLVC EHYNKLAQQLNARWHTVVDEKLHLPFLPVGSAEFKNYWAEMQYCIDFALCNRSLMMQRVQ EVLADALPGIAFEPMINIAHNYAAWENHYGKNVIVHRKGATLAREGMVGIIPGSQGTASY IVEGLGNAASFNSCSHGAGRVLSRTAAIASLDMQAEVAQLEAKGIVHAIRSQDDMQEATG AYKDIEEVIANQTDLIKVKTKLLPIAVIKG >gi|283510599|gb|ACQH01000020.1| GENE 51 58724 - 60991 1570 755 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_4747 NR:ns ## KEGG: Fjoh_4747 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 8 754 8 745 746 717 48.0 0 MDNIIANKVALRYQALFIDIDSQQIKQPHEPTVVALTLVSRLSKFGYGVEQDLLQAFYWA SAKQIEAVYVVMADVLRLKQNWASLVKGWGTPTKESLIDHFITYVANLFKGKLQLEGTTL PCGHFIPYGTFPLERYNGCPYCGTPFVTSTFVYEGQGKKLKPLHLFTRVDLERELRTLLT SPTPLDATQAQSVAQLLMLFDLPSDVNIAMKETAMIAIKALVAAGKGEQATRLFETPADI LRFLWYEKNGCARIIDPRTLIANARGLYWHMAAQENKASEAGEAMRERLKLKYNRAYCRL VATWMNAIPLSSEKSAEAMHAKRGMWVRMIRALRLAEYARKKGFERLAALLDVFYCQTYT PWLGTLDKARRANDVQLTLSLLSQRPGLFARSLFSTMLNFDSQTTLAAFEGIVHQLPARL LLSLNNGAKAYFDPERMRLARPITGVMHNLDPHPLLVSFDDAQLKQMIADVNAMYKRAMK SRYAQQPHSVGGTVYIDPRLYQVPIGVGDRSNSVQDRACALQGTRFKVEGNAVRLFMQWG NGLHAQHLDMDLSAGIALENGKSIFCSYFNLSCPGAKHSGDIRNIPEQVGTAEYIELNLN ELEKAGARYVTFSCNAYSCGALSPNLMVGWMNSAYPMTVSDKDGVAYDPSCVQHIVRVDE SNLSKGLVFGVLNVLQREIVWLEMPYTSQTILGVDTTSVEALLKRLEEKTTVGELLEIRA EAQGMTLVGNESDADERYNYQWALNTAEVSKLLLG >gi|283510599|gb|ACQH01000020.1| GENE 52 61203 - 62036 353 277 aa, chain - ## HITS:1 COG:no KEGG:BVU_0885 NR:ns ## KEGG: BVU_0885 # Name: not_defined # Def: glycosyl transferase family protein # Organism: B.vulgatus # Pathway: not_defined # 4 275 2 274 276 242 45.0 1e-62 MNTGKLLFVPSGGLANRMRAMASAWQLAANTGVKVETIWFCDWALNAPFHSIFEPIDNVA MVAREAKAWELLTLDRPRKRNFRIPLLYQRLRFAQRIDEWQVTPLKNQRFNFNEWARGKN SYMSCYQDFGNVPNSVYKHLFHPVGPVLDEIQSYREHFSAHTIGMHIRRTDNKESIERSP LSLFVDAARREIDQHNDTCIFLATDDETTKTALKAEFGDRIITSPKPAARNSIAGIRGGV 
AELWMLASTTTIYGSAGSSYSVMAAKIGDNKLEVLSK >gi|283510599|gb|ACQH01000020.1| GENE 53 62241 - 64271 2096 676 aa, chain - ## HITS:1 COG:ECs0776 KEGG:ns NR:ns ## COG: ECs0776 COG2885 # Protein_GI_number: 15830030 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Escherichia coli O157:H7 # 511 596 71 156 173 81 43.0 6e-15 MVLTIIATSCGADAYIKKGEKNLALGEYFDAANNFKQAYTRTPAKDKAARGHISTKLAFC YDKMNASQKAVGAYQNVLRFGLADANTHLLLGRQLLKTGAYKAAETQFKIALDSLPGNKL ANEGLISAQQAAQAKSKGSHYIVKRMDAFNSHRADYSPMYFGDDHNKLYFSSTRNEAKGD ALSGITGAKPADIFVSERDDKGKWSKPEPVQGGLNTEYDEGACTFSPDQRTMYFTQCATD PNYPRYAQVMTSARSDAAWAKATKLDISKDTLSSFAHPAVSPDGQWLYFVSDMPGGQGGQ DLWRIRLTTNGLGEMENLGETINTAGNECFPTFRPNGDLYFSSDGHVGLGGLDVYVARQN TDQRWTIEHLGFPLNSNGDDFGLTFEGPYNRGFFSSNRGDARGWDHIYAFEYPEIVQSIK GWVYEQDGYELPAAQVFMVGNDGTNLKLNVLADGSFKQVVKPGVEYVMLATCKGYLNHKE ELAVKPSETSHEYVLQFPLASITAPVLIDNIFYDLDKYTLRPESKQALDELVKLLNENPN VTIELSAHCDYRGTPEYNKVLSQHRANAVVQYLIDAGIAPQRLTPVGYGKEKPKTIRKKL TERYKWLKEGDVLTEDFITKLDKDKQEICNQLNRRTEFVVQRTTYGLLDDKGNLKKQKKV PKQSEKDKEDVFDIVE >gi|283510599|gb|ACQH01000020.1| GENE 54 64463 - 65617 860 384 aa, chain - ## HITS:1 COG:FN0560 KEGG:ns NR:ns ## COG: FN0560 COG0635 # Protein_GI_number: 19703895 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Fusobacterium nucleatum # 5 382 8 365 365 222 34.0 8e-58 MAGIYIHVPFCASRCIYCAFYTTTTLSSQDRLVQALCSELAMRRDYLLDKAFPSSAITID TIYLGGGTPSQLSEQNLQRLFDTIYNKEQQLPAFHISPNVEVTIECNPDDITPQFAQTIS RLPVNRISMGVQTFSDERLTFLRRRHKATDIAPAIERLRQVGINNISIDLIFGFPKQTLD EWAIDLQNAIELGVEHISAYSLMYEEGTPLFRLLQQQRVSEIDDNLSLDMFNLLVDTLVA NNYEHYEISNFAREGFRSRHNASYWQAIPYLGLGPSAHSYDGNSRQWNVSNLRKYMDAIE NGILPMEREVLNEDTKYNDWITTALRTKEGLDLNRLSNAHRQYLLEAAEHHLKQGHLVLS GNNIALSRTGIFISDSVMSDLVKV >gi|283510599|gb|ACQH01000020.1| GENE 55 65782 - 67944 2123 720 aa, chain + ## HITS:1 COG:Cgl0488 KEGG:ns NR:ns ## COG: Cgl0488 COG0480 # 
Protein_GI_number: 19551738 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Corynebacterium glutamicum # 7 710 10 703 709 405 33.0 1e-112 MRVYQTNEIKNIALVGSAGSGKTTLAEAMLFGSGVIKRRGSVEAKNTVSDYFPVEQEYGY SVFPTIFHVEWNNKKLNIIDCPGADDFVSGAITALNVTDEAVILINGQYGPEVGTQNNFR YTEKLKKPVIFLINQLDSEKCDFDNIISTMKDIYGEKCVQIQYPTQTGPGFNAIIDVLTM KKYSWNAEGGSPKVEDIPAEEMEKATALNKILVEAAAENDEGLMEKFFESETLTEDELRE GIRKGLVTRSIFPVFCVCAGKDMGVRRLMEFLGNVVPFVSEMPKLHNTRGEEITPAAEGP TSIYFFKTGMEPHIGEVSYFKVMSGCVKVGDDMTNADRGSKERIGQLYACAGANRIPVDK LNAGDIGCTVKLKDVKTGNTLNAKDCDNKFDFIKYPNSKYSRSIKAVNTQDTEKLMAALL RMHQEDPTWVVEQSKELRQTIVHGQGEFHLRTLKWRLENNEKLQVKFGEPKIPYRETITK SASAEYRHKKQSGGAGQFGEVHLIIEPYAEGMPDPTTYKLNGQDVKMNIKGKEELPLEWG GKLVFINSVVGGAIDARFMPAILKGIMDCMEHGPLTGSYARDVRVIVYDGKMHPVDSNEL SFMLAARRAFSDAFKAAGPKILEPIYDLEVYVPDDFMGDVMSDLQGRRALIMGMDSEAGY QKLMAKIPLKELSNYSISLSSLTGGRASFTTKFASYELVPNEIQQELIKEHEAELANEEE >gi|283510599|gb|ACQH01000020.1| GENE 56 68164 - 69723 1407 519 aa, chain + ## HITS:1 COG:BS_phoR_3 KEGG:ns NR:ns ## COG: BS_phoR_3 COG0642 # Protein_GI_number: 16079962 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 278 514 34 271 279 132 31.0 2e-30 MKKRTIWTIAIIMGTSFLALLFVQLKYIQEMADMKKEQFDESVNRALYQASRNMELNETL RYLEKDINETEHKALQNDSASGRNGLPDGSYQRSHQLSLGDKGGMMYTAFELKTITTKPS QMPKAMILRSDKNSLSEASKAMQEVVKNRYVYQRALLDEVVYTILYSASERPLKERINFK SLDQDLKNELMSNGINLQYHFTVSTPDGREIYRCSDYSEEGEDYSYSQVLFRNDPASKMG LVKIHFPDMNSYIYSSVRFVIPSVIFTLVLLVTFIFTIVVIFRQKRYTEMRNDFINNMTH ELKTPISSISLAAQMLNDTSVTKSESMMKHLGGVINDESKRLRFLVEKVLQMSLFDKKKA IFNMKQLDLNEMVENIAHTFTLRVEHTGGKIYTDIEALDSTIYVDEMHFQNAIFNLMDNA VKYKKPDGPLDIYLRTWNDDDHLYLSVRDTGIGIKKDNLKKVFEKFFRVHTGNLHNVKGF GLGLAYVKKIIDVHKGEISVESEYGKGTTFTIKLPIIKD >gi|283510599|gb|ACQH01000020.1| GENE 57 69729 - 70427 805 232 aa, chain + ## HITS:1 COG:lin2728 KEGG:ns NR:ns ## COG: lin2728 COG0745 # Protein_GI_number: 16801789 # Func_class: T Signal 
transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Listeria innocua # 6 226 3 221 225 144 37.0 1e-34 MDEKLKILLCEDDENLGMLLREYLQAKGYMAELCPDGEAGYKAFLKTKFDICVLDVMMPK KDGFTLAQEIRQANSEIPIVFLTAKTLKEDILEGFKIGADDYITKPFSMEELVFRVEAIL RRVRGKKNKENTLYHIGKFLFDTQKQLLTIGGKQTKLTTKENELLALLCSHANEILQRDF ALKTIWIDDNYFNARSMDVYITKLRKHLKDDEDIEIINIHGKGYKLITPDVE >gi|283510599|gb|ACQH01000020.1| GENE 58 70701 - 70880 103 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260911681|ref|ZP_05918261.1| ## NR: gi|260911681|ref|ZP_05918261.1| hypothetical protein HMPREF6745_2216 [Prevotella sp. oral taxon 472 str. F0295] # 1 59 1 59 59 88 79.0 1e-16 MRKFYPYIGLALVCMGVIIFIVSYFAKWTHSNLPLFVGLVSVVCGAVLHVVWQKKSSEY >gi|283510599|gb|ACQH01000020.1| GENE 59 70992 - 71339 408 115 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|212691068|ref|ZP_03299196.1| hypothetical protein BACDOR_00558 [Bacteroides dorei DSM 17855] # 1 111 1 112 114 161 72 1e-38 MNQYETVFILTPVLSDEQMKETVAKFKKLLTDKGAEIVNEEAWGLKKMAYAIQKKSTGFY CLLEFKAEPEVIKSLETGYRRDEKVIRHMVVKLDKYAVQYAEKRKHKWTKKLEEA >gi|283510599|gb|ACQH01000020.1| GENE 60 71342 - 71611 422 89 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|150002620|ref|YP_001297364.1| 30S ribosomal protein S18 [Bacteroides vulgatus ATCC 8482] # 1 89 1 89 89 167 88 3e-40 MADKQSEIRYLTPPSVDTKKKKYCRFKKSGIKYIDYKDPEFLKKFLNEQGKILPRRITGT SLKYQRRVAQAIKRARQIALLPYVTDLMK >gi|283510599|gb|ACQH01000020.1| GENE 61 71628 - 72167 515 179 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|34540405|ref|NP_904884.1| 50S ribosomal protein L9 [Porphyromonas gingivalis W83] # 1 178 1 177 179 202 58 5e-51 MELILKEDIIGLGYKNDIVNVKSGYGRNYLIPTGKAVIASESAKKVLAENLKQQAHKLAA IKAEAEKKAKALEGVALVIEAKVSATGATYGSVNAATVAEELKKQGIEVDRKIITMRDIK RVGDFEAVVHFHKEVEIVVPVKVVAENAAVETPAVEEAPVAEVVEETPNAEVEKTPAAE >gi|283510599|gb|ACQH01000020.1| GENE 62 73704 - 74327 410 207 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927616|ref|ZP_06421463.1| ## NR: 
gi|288927616|ref|ZP_06421463.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. F0108] # 1 207 2 208 208 434 100.0 1e-120 MEKVRDCQLFMTTTDEKMFCKALRVFNPNIYFLDVKPSFEACIDERLVTDVTRLDSDIFS IVNLDMISKGELSKCYKMRSGYYHFFQLGRAQMQFLRSAPDRNVEGCLQHGRIADSYKIV DKEERAWKNKVYAILKKIGQKVYWYYTLPDGTREISPKPENNLVAFPDAAVNYNGKSGNF MIHNRAKFVPQSIAVSELNDNPDLAGV >gi|283510599|gb|ACQH01000020.1| GENE 63 74685 - 75572 873 295 aa, chain + ## HITS:1 COG:CAC3575 KEGG:ns NR:ns ## COG: CAC3575 COG0331 # Protein_GI_number: 15896809 # Func_class: I Lipid transport and metabolism # Function: (acyl-carrier-protein) S-malonyltransferase # Organism: Clostridium acetobutylicum # 3 295 5 298 308 258 45.0 9e-69 MKAYVFPGQGAQFAGMGKDLYDSKPLAKELFDKANEILGYSITDIMFNGTDEQLKETKIT QPAVFLHSVISALCLGDEFTPSMVAGHSLGEFSALVAAGALSFEDGLRLVYARAMAMQKA CEVAPGTMAAIVGLDDETVEKVCQQVSTTGNVVVAANYNCPGQLVISGNIDAVNQACELL KEAGAKRALPLKVGGAFHSPLMQPAKDELQTAIENTTFAEPKCPVYQNVDGQAHTAPEEI KKNLIAQLTSSVRWTSSVQNMIKDGANDFTECGPGKALQGMIGRIDKAVAAHGIE >gi|283510599|gb|ACQH01000020.1| GENE 64 75629 - 77779 1892 716 aa, chain + ## HITS:1 COG:TM1650 KEGG:ns NR:ns ## COG: TM1650 COG0366 # Protein_GI_number: 15644398 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Thermotoga maritima # 4 509 5 379 422 96 23.0 1e-19 MNNKIIIYQVFPRLFGNRNLTNKKWGTIAENGCGKMNDFDEQALQRIRKLGISHIWFTGI IRHASQTDYSEYGIPAQHNAIVKGKAGSPYAITDYYDVSPDLAVDVEHRFEEFKALVQRV HKCGMKVIVDFVPNHVAREYKSVCRPNGVADLGENDDTSLHFSTNNNFYYCWGKPFEPSI SLTDSAGNTYHEQPAKATGNDRFDNHPGINDWYETIKLNYGVDYCDAAGRSEHFVPVPNT WKKMVDILLFWADAGVDGFRCDMVEMVPTAFWQYATGILKENYPQTIVIGEVYDPYQYRA YVQSGFDYLYDKVGMYDCMKSIVRQESHAAAITYQWQSLDDIGKHMLYFLENHDEQRIAS PFFCGDAQRAIPALIVSALMQRNPMLIYAGQEFGEPGMDNEGFSGIDGRTTIFDYWTVGS LYRGYFDREKLSAQENEIYAQYQHILRIARGESAIGKGLFFDLMYVNPQSDNFNPARHFA FLRKWQDELLLIVVNFSGESNSVQVNIPGHAFDYLELPEGHFGAIDLLTGETTVENLGRD GNISMTLAPYNGRIFKFNTKMKDNELLLCTHNKDEFPPAHTAEHLLNQVMIRMFDCGRST NAHVERKKSKISYTLDRKPSRQDEKEIERRMNELIEEDLPVTYELVDRYDLPEGIDLSRV 
PDDYSSMVRIVRIGDFDACPCIGKHVRSTGQIGKFVLLGTNWDETTRTFRVRFKLV >gi|283510599|gb|ACQH01000020.1| GENE 65 78250 - 80499 2170 749 aa, chain + ## HITS:1 COG:lin1443 KEGG:ns NR:ns ## COG: lin1443 COG1882 # Protein_GI_number: 16800511 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Listeria innocua # 1 748 1 743 743 971 60.0 0 MRQEWRGFTGKKWLDEVNVREFIQNNYTAYDGDESFLAEPTDATNKLWGMLQVLQKEERA KGGVLDMETEVVSGMTAYGPGYIGENTKSLEKVVGIQTDKPLKRAFMPYGGIHMAEQACT TYGYKPSEKLHEIFTKYCKTHNDGVFDAYTDEMKLVRHNHILTGLPDTYGRGRIVGDYRR VALYGVDFLIDEKEKDKRNCGCGVMTEDIIRLREEISMQIKALKELKEMAEIYGYDISKP ANNAREAVQWLYFGYLGAIKTQNGAAMSVGRISTFLDIYIQRDFNEGTLTEAEAQELIDH LVMKFRMVKFARIPAYNQLFSGDPVWATLEVAGMGQDGRSMVTKSDFRFLHTLENMGPSP EPNLTVLYCSRLPEGFKRYASRISVKTSSIQYENDDVMRPIWGDDYSICCCVSATQTGKE MQFFGARANLAKCLTYAISGGVDSKTREQCGPALRPILGDVVTYEEFMPRFMDMMEWLVG VYVNTLNLIHYMHDKYFYEAAELALIDTDVRRTFATGIAGFSHVVDSISAIKYAKVNIIR DDTGFPVKFETVGDFPRYGNDDDRADDIAVWLLKTFMNMIRKHHTYRKSEPTTSILTITS NVVYGKFTSNMPDGRPAGAPLSPGANPSYGAEKNGLLASLNSVAKLPYEYALDGISNTQT IAPSTLGHNEEERVNTLVGVMDGYFDQGAHHLNVNVFGVEKLIDCMEHPEKEEYANFTIR VSGYAVKFIDLTKEQQLDVIARQAHSKLA >gi|283510599|gb|ACQH01000020.1| GENE 66 80646 - 81443 740 265 aa, chain + ## HITS:1 COG:SP1976 KEGG:ns NR:ns ## COG: SP1976 COG1180 # Protein_GI_number: 15901799 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Streptococcus pneumoniae TIGR4 # 25 258 15 250 264 311 58.0 9e-85 MENNNTFVPTNANATNEAMPRKAYVHSIESFGSVDGPGIRFIIFLSGCKMRCRYCHNPDT WAMKSADMRSADELLQQALRFRPYWGKEGGITVSGGEALLQIDFMLELFEKAKALGINTC LDTAAQPFTRQEPFFSKFTKLMSFTDLVLFDLKHIDNQEHKKLTGWDNTNVLDCATYLSQ IQTPVWIRHVLVPGITDNDAYLYALRDFIKTLHNVKRIEVLPYHSMGAYKWQKLGLNYTL DHIDSPTSERVENAEKILLSALNND >gi|283510599|gb|ACQH01000020.1| GENE 67 81591 - 81983 380 130 aa, chain - ## HITS:1 COG:no KEGG:PRU_0823 NR:ns ## KEGG: PRU_0823 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 21 
130 1 108 108 71 38.0 9e-12 MRKIFFLLLGAAMLTACHESIEKRAEREAKEYTAKYCPTPVTNYTRTDSVVFYPETKTYH YYCSFVDKMDDAAIINKNRQLIDDMLLKSIIESTSLKPYKEAGFSFAYTCHSDKEPQKVL FETKYTKKRY >gi|283510599|gb|ACQH01000020.1| GENE 68 82101 - 83882 2004 593 aa, chain + ## HITS:1 COG:CAC1278 KEGG:ns NR:ns ## COG: CAC1278 COG0481 # Protein_GI_number: 15894560 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane GTPase LepA # Organism: Clostridium acetobutylicum # 3 593 8 602 602 713 56.0 0 MNHIRNFCIIAHIDHGKSTLADRMLEFTKTIQVTGGQMLDDMDLEKERGITIKSHAIQME YQYEGQNYVLNLIDTPGHVDFSYEVSRSIAACEGALLVVDATQGVQAQTISNLYMAIEHN LEIIPIINKVDMPSAMPEEVEDEIVELLGCDKADIIRASGKTGEGVEDILKAVVQRVPHP TGDDNAPLQALIFDSVFNSFRGIIAYFKVMNGVIRKGDNVKFFNTGMEYAADEVGVLKMD MVPKQELRTGEVGYIISGIKDAKEVKVGDTITHRDKPCQSAIEGFQEVKPMVFAGVYPIE PSEYENLRASLEKLQLNDASLTFSPESSVALGFGFRCGFLGLLHMEIIQERLDREFNMDV ITTVPNVSYMVYDKMGNEREVHNPSGLPDPTMIDHIEEPYIRASIITNVNYIGPIMKLCM DKRGELINQEYVSGNRVELHFMLPLGEIVIDFYDKLKSISKGYASFDYHVDCFRPSRLVK LDILLNGEAVDALSTLTHFDNATTFGRRMCEKLKELIPRQQFDIAIQAAIGAKIVARETV KCLRKDVTAKCYGGDVSRKRKLLEKQKKGKKRMKQIGNVEVPQKAFLAVLKLD >gi|283510599|gb|ACQH01000020.1| GENE 69 83915 - 84520 558 201 aa, chain + ## HITS:1 COG:all4541 KEGG:ns NR:ns ## COG: all4541 COG0664 # Protein_GI_number: 17232033 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Nostoc sp. 
PCC 7120 # 26 192 19 184 193 85 34.0 9e-17 MKELNNTTRDIARELARKYSTMTHDELDILESVLVPMKFAKGEIILKEGDVCEHIYYVER GLTRQFYFKNGKELTEHIGVEHTIVMCIESLFKEKPTYLQLEALEPTLIYAMPKHRLEEV ALHNVNIQILYRKILEESLIISQVHADMLRFETAQDRYLKLCKQSPQVVLRAPLVYVASY LQMTPETLSRVRAASLYADKD >gi|283510599|gb|ACQH01000020.1| GENE 70 85573 - 86484 1224 303 aa, chain - ## HITS:1 COG:no KEGG:PRU_0301 NR:ns ## KEGG: PRU_0301 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 303 1 313 313 513 78.0 1e-144 MKPTLLLLAAGMGSRYGGLKQLDGLGPNGETIMDYSIYDAIKAGFGKIVFVIRKDFEKDF KEKILSKYEGHIPAELVFQSLDALPEGFAVPEGREKPWGTNHAVMMAKDVIKEPFCVINC DDFYNRDSFMVIGKFLSELPDNAKNAYAMVGFRVGNTLSENGTVARGVCSTDENELLTTV VERTEIMRVDGKVCYKDEQGQWVAIADNTPVSMNMWGFTPDYFDYSEAYFKEFLSDEKNQ TNLKAEFFIPLMVNKLVNDKTATVKVLDTTSKWFGVTYSADREGTVERIQSLINEGVYPA KLF >gi|283510599|gb|ACQH01000020.1| GENE 71 86675 - 87340 568 221 aa, chain - ## HITS:1 COG:MJ0374_2 KEGG:ns NR:ns ## COG: MJ0374_2 COG0671 # Protein_GI_number: 15668550 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Methanococcus jannaschii # 5 170 1 150 168 71 34.0 1e-12 MIKTLNDLDAALLLWINSFHSPFFDQFMNLVSGKWVWVPMYVGIFIVLAMRLGFKPKLVA ILATIGVALFFSDFVCAEVIRPIFNRPRPTQLGSGISHLVHIVDNYRGGDYGFPSCHASN SFMLATTAALLFRNKILSAFFFLWALAMCYSRAYLGVHYPGDLLAGAVWGTIVAFALYAL VNRYYALTEVKTAKHTQLISIVGTSTFVLLAIVSAVQTCVQ >gi|283510599|gb|ACQH01000020.1| GENE 72 87349 - 89112 1561 587 aa, chain - ## HITS:1 COG:PM1683 KEGG:ns NR:ns ## COG: PM1683 COG1368 # Protein_GI_number: 15603548 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Pasteurella multocida # 22 549 54 594 649 167 27.0 7e-41 MFYNHSAFKDCNFTDYLAVIWHGLPMDVSVAGYFTIIPALLALISIWLHPALIRILHRTY LALASFLISIVYCTDAVLYGFWQFRIDSTPFFYFLSSPKDALASVPIWWVLLGVVVVLLL TAMLFFMLCKTIGTFPESKGIAKDRLIASGVMVVVLALLFVPIRGSLGTSTMNLGRVYYS 
ENQKLNHAAINPCFSLLSSLMNDEDTKDQYRFMAADEANKLFSQITDKTKQGGTDAVFKL LNTNRPNIVMVVLESFSAHIMKSMGGTANVAVNMDKWANEGVLFTNFYANSFRTDRGLAA ILAGYPAQPTMSIMKYPNKTGNMPMFPQKLKKAGYQLKYYYGGDADFTNMRSFVTTAGFE DLISDADFPIKLRLSKWGVPDQYVFDRALADIKSQAPNATHLSVIQTSSSHEPYDVPFKK LNNKILNAFAYTDNCLGKFVAALKKLPSWKNTLVVLVPDHQGCYPEDMDNYSPQRYHIPL LLLGGALKAKGPIATLGSQADIAATLLSQLGFSYSEFTFSKDMLNPNVPHFAFSTVPNAF MMKTTDNTVFYNCETNKTILDNGKTPGKNLPYGKAYLQKLYDDISNR >gi|283510599|gb|ACQH01000020.1| GENE 73 89511 - 90116 417 201 aa, chain - ## HITS:1 COG:ECs4211 KEGG:ns NR:ns ## COG: ECs4211 COG0512 # Protein_GI_number: 15833465 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component II # Organism: Escherichia coli O157:H7 # 16 200 2 187 187 180 49.0 2e-45 MCLQQTPHPISTTMNVAIIDNYDSFTFNLVHLIASLGANVTVLKNDDFELAALEHYDKIV LSPGPGLPSEAGKTMDVIRTYAETKPILGVCLGHQAIAQAYGCKLTHLSEVCHGQSTRGL NLGNDPIFEGLPTEVLMGRYHSWVVERESVSSPLEITAISCTGEVMGIRHKQLRLHGIQF HPESILTPNGKTIVNNWLHKG >gi|283510599|gb|ACQH01000020.1| GENE 74 90117 - 90833 710 238 aa, chain - ## HITS:1 COG:all5288 KEGG:ns NR:ns ## COG: all5288 COG0135 # Protein_GI_number: 17232780 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylanthranilate isomerase # Organism: Nostoc sp. 
PCC 7120 # 1 238 1 210 217 72 26.0 7e-13 MIIKLCGMQHADDIKAAEQLGVDLLGFDFIPKSPRYVRMISSRAGIIPDFSEERLRALKQ PNSPQQQPVKVGRVGVFADDMPQNIVTRVYNYELDYVQLNGEEPAVTLENLRRSIDPDIR KGIKIIKRIVVDTQADLAKATEYEGKADLLLFHITAKETELQTTGDDNSPLNQLLSTYHG STPFLISKPNAPFDTLLIKSITHPQFAGINLDTQFELEPGKKDMEQLQKYVEELKAIE >gi|283510599|gb|ACQH01000020.1| GENE 75 90870 - 91370 469 166 aa, chain - ## HITS:1 COG:BS_dfrA KEGG:ns NR:ns ## COG: BS_dfrA COG0262 # Protein_GI_number: 16079240 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Bacillus subtilis # 3 166 2 166 168 127 42.0 7e-30 MQINIIAAVARNRAIGHQNKLIYWLPNDLKRFKALTTGHTIIMGRNTFLSLPKGALPNRR NVVLSSTQSVFAGCEHFSSLSDALQSCAPDEDVYIIGGAMLYNEAIAFADKLLLTEIDDT PDVADAFFPNYNGWKETARECHEKDEKHAFNYCFADYERPTALSRG >gi|283510599|gb|ACQH01000020.1| GENE 76 91391 - 92185 795 264 aa, chain - ## HITS:1 COG:DR2630 KEGG:ns NR:ns ## COG: DR2630 COG0207 # Protein_GI_number: 15807610 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate synthase # Organism: Deinococcus radiodurans # 1 264 124 387 387 426 71.0 1e-119 MKQYLDLLQRIVNEGTRKEDRTGTGTLSVFGHQMRFNLEEGFPLLTTKKLHLKSIIHELL WFLKGDTNVKYLQENGVRIWNEWADENGELGPVYGHQWRSWPNYNGGHVDQIQDIVNALK NNPDSRRMIVSAWNVAEVDQMALPPCHCLFQFYVANGKLSLQLYQRSADTFLGVPFNIAS YALLTMMMAQVSGLKPGDFIHTTGDTHLYLNHLEQAKEQLKRTPRTLPRMVINPNVTSIF DFKYDDFTLEGYDPLPHIKAEVSV >gi|283510599|gb|ACQH01000020.1| GENE 77 92182 - 93585 819 467 aa, chain - ## HITS:1 COG:SP1402_1 KEGG:ns NR:ns ## COG: SP1402_1 COG0144 # Protein_GI_number: 15901256 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Streptococcus pneumoniae TIGR4 # 1 293 1 280 280 166 38.0 1e-40 MVLPQEFTAYTRALMGNELYGKLLVALEQEPPVSLRLNPFKTKNGEVEVCNADESVAWTD YGYYLSGRPNFTFDPLFHAGAYYVQEASSMFVAWIVKQLIHSPITMLDLCAAPGGKSTAL RSVLPQGSLLFCNEPITQRASILAENVQKFGHPDMVVSSNFAADYRKSGLQFDAILADVP CSGEGMFRKDAGAIAEWSETNVERCWRLQREIIADIWPCLKPGGLLIYSTCTFNDKENER 
NVEWIAQEFDADFDLPPIPSAWNITGSIGSDIPACRFIPGISKGEGLFLCVLRKRYEHGM VHDERKKRSSKSMTTKVSLPEMPQWLNADLAFDTRQINGNIHAIPCLWTSLFDKASATLR IVHAGVALATPKGKDLVPHPALAHSIALNSDAFCKCELDYASAIAFLRREAITLHADTPR GYVLVTHQGMPLGFVKNIGNRANNLYPQEWRIKSTHIPQPQQIIKLK >gi|283510599|gb|ACQH01000020.1| GENE 78 95220 - 96434 1508 404 aa, chain - ## HITS:1 COG:no KEGG:PRU_0556 NR:ns ## KEGG: PRU_0556 # Name: not_defined # Def: C1 family peptidase (EC:3.4.-.-) # Organism: P.ruminicola # Pathway: not_defined # 33 402 32 401 401 539 64.0 1e-152 MKKVFFAVLVALTATGLNAKTPKTKTTPTRSKPVFTVVKENKITSIKDQNRSGTCWDYAT MSFIESEILRKSGKTYNLSEMFIASKNYMDRAVKAVRMHGDVSFAQGGSFDDPIHVIRQH GIVPEEAMALPGTMTGDSLANFGEFFSVMSPYVSAVATSKAKKLSPAWKKGLQGILDAYL GKCPENFTYEGKSYTPQSFAESLGLDWDDYITFTSYTHHPWYSKFAVEVQDNWRWAQSYN VPIEDLTRIIDNAIMNGYTIGWGGDVTEDGFTRKGLGIAIDAKKVRSMAGTDADRWFKLS QDEKKHRYDSLGVNVPELVPTQALRQEAYDNWETTDDHGMHIFGIAKDQNGKEYYMVKNS WGKYGDYKGVWYMTKAYVALKTMDFMVNKNAVPADLLQKIGLSK >gi|283510599|gb|ACQH01000020.1| GENE 79 96704 - 98107 1646 467 aa, chain - ## HITS:1 COG:FN1949 KEGG:ns NR:ns ## COG: FN1949 COG0006 # Protein_GI_number: 19705251 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Fusobacterium nucleatum # 1 466 1 461 462 403 43.0 1e-112 MFSKETYIRRRTELKKLVQSGVIVIFGNNESPCNFPNNGYYPFRQDSSFLYYFGVQRDGL VGVIDIDNNADTLIGDDIDIDDIVWYGSVDSVKDLAAQVGVEHSAPMKHLKTVCHDAIKQ KRKIHFLPPYRHDIKLQIFDLLGIHPNQQKDAASMQLINAVVKMRSVKEDQEIEELERAS EIGYLMHTTAMKLIKPGVTEKYVAGQVSGIAGSYGAMVSFPTIFSQHGEIMHGNPSMAVL EEGRLALCDCGAETINNYCSDNTRTMPVSGKFTQRQLEIYTIVEACHDHALELSKPGVKY YDVHMGVCRMMTERLKELGLMKGDTDEALAAGAHAMFLPHGLGHMMGMDVHDMEALDQKY VGYDEEIQPSTQFGTSALRMARRLEKGFVVTDEPGIYFIPDLIDDWRAKGHCKDFLNFDL IETYKDFGGIRLEDDVLITDNGCRMLGKQIIPYHPKDVEEYMAKHRV >gi|283510599|gb|ACQH01000020.1| GENE 80 98195 - 99058 818 287 aa, chain + ## HITS:1 COG:CAC2424 KEGG:ns NR:ns ## COG: CAC2424 COG4667 # Protein_GI_number: 15895690 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # 
Organism: Clostridium acetobutylicum # 6 274 3 271 283 207 39.0 2e-53 MQIGANTGLVLEGGGMRGVFTSGVLDAFMKHGLRFNYVVAVSAGACNGMSYMSWQPRRAR LSNIDFLARYDYLGLRHLVTQGCIFDQDLLYDKFPNELLPFDYDAYFSNRTTFEMVTTNC LTGGAMYLTEKNDKQRALDVVRASSSLPFVCKIVEVDGIPMLDGGIVDSIPVQRAIDTGH DFNVVVMTRNYGFRATGKDHKIPNFIYKKYPRLRVALSRRLEVYNAQLQLAEDLEREGRI VCIRPQRPMDVGRIEKDTNKLEKLYEEGFEEGEKFIANLNAGMYVRK >gi|283510599|gb|ACQH01000020.1| GENE 81 99158 - 99799 579 213 aa, chain + ## HITS:1 COG:DR2613 KEGG:ns NR:ns ## COG: DR2613 COG0637 # Protein_GI_number: 15807594 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Deinococcus radiodurans # 5 179 28 205 238 82 30.0 5e-16 MSYKKAALFDLDGVVFDTEPQYTAFWSTVFARHYPNEPQLATSIKGQTLTWIYERYFADK PDLQNEITAELNVFERDMKFEYVLGFEDFIAQLHQLNVNTAVVTSSNKEKMQQVYDQHPN FKALFDHVFTAEDFAKSKPDPYCYLLGASYFGVKPTECVAFEDSVNGFKSVKSAGMPLVG LATSNSVEVINQYTKVIIPNYLQASFADLCNQL >gi|283510599|gb|ACQH01000020.1| GENE 82 100027 - 101694 1318 555 aa, chain + ## HITS:1 COG:YLR258w KEGG:ns NR:ns ## COG: YLR258w COG0438 # Protein_GI_number: 6323287 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Saccharomyces cerevisiae # 11 550 10 617 705 223 29.0 6e-58 MNRKKLFPDYILESSWEVCNKVGGIYTVLSTRARTLQAVMPDRIIFVGPLLNGENTGFQE VNSLYADWVKQARTDGLNVKVGRWDVPGSPVAVLVDFQPFFSEKDKIYTELWEDFQVDSL HGYGDYDEASMFSYAAAKVVESFFRYQVGKNEKVVYHGNEWMAGLGLLYVRKHLPAVATI FTTHATSIGRSIAGNNKPLYEYLFAYNGDLMADELNMQSKHSIEKQTAWNVDCFTTVSDI TANECRELLDKEVDVVLPNGFEDDFVPKSNAFATKRKRARAELLRVANCLTGDTFDDDTL IVSTSGRYEFRNKGIDVFIDAMNRLRFDERLNKKVVAFIDVPAWVGNAREDLAERLKAKA NTTFDTPLQCPMLTHWLHNMDQDKALSMMRSLDFANNKQDKVKIIFVPCYLNGNDGIFNM PYYDLILGNDLCVYPSYYEPWGYTPLEAVAFSIPCITTDLAGFGIWAQHVLAEEKAENSL ENGVKVVHRTDYNYHEVADEIKDAVANYTLLTKTQVAKAREKARNLSKKALWKKFIAYYE DAYDIALRKAEERTK >gi|283510599|gb|ACQH01000020.1| GENE 83 101734 - 104295 2729 853 aa, chain + ## HITS:1 COG:PH1512 KEGG:ns NR:ns ## COG: PH1512 COG0058 # Protein_GI_number: 14591294 # Func_class: G Carbohydrate 
transport and metabolism # Function: Glucan phosphorylase # Organism: Pyrococcus horikoshii # 18 746 14 734 837 596 42.0 1e-170 MKIKADYSNDPVWKELSIKSRLPEELKCLDELAHNMWWAWNYEARNMWKSLDETLYEKVG HNPVMLLERLSYDRKEEIVKDKALMKKVKDVYAMFRKYMDVKPDSKRPSVAYFSMEYGIN QVVKIYSGGLGMLAGDYLKEASDSNVDMCAVGFLYRYGYFKQSLSMDGQQIANYDAQNFN SLPLVRQLDENGNPVVVDVPYMNYMVHAYVWRMNVGRISLYLLDTDNELNSEYDRPITHA LYGGDNENRLKQEILLGMGGILTLKKLGIKKQIYHCNEGHAALCNLQRLCDYVDGGMPFN EAMELVRASSLYTVHTPVPAGHDYFDENLFGKYMGGYPSRLGISWDEFIGMGRENPNNHD ERFCMSVFACNTCQEVNGVSRLHGWVSQKMFAPIWKGYYPEENHVGYVTNGVHLPTWTAT EWRKLYDKYFDPSFMSDQSNEKIWHGIYNVDDEEIWNTRMALKNKLVRFIREMFTETWLK NQGDPSRVVSLLERINPNALMIGFCRRFATYKRAHLLFTDIERLSKIVNDPEHPVLFFFS GKAHPADGAGQGLIKSIFEISQRPEFLGKIIFLEDYDMQLARRLVSGVDIWMNTPTRPLE ASGTSGEKAELNGVVNLSVLDGWWVEGYREGAGWALDEKRTYQNQEYQDKLDAATIYGLL ENEIIPMYYKKNKKGYSEKWIHVIKNSIATIAPHYTMKRQLDDYFDKFYNRQAKRSAELL ANNNELAKKISLWKEAVAERWDGIHVVSKETSFLENGGETGMKYKIRYVIDEQGLDDAVG LELVSLNNEQSDSEREVYSTHQFKMVKREGNLFTFEAEIEPSNAGTYRSCVRMYPKNDLL PHRQDFAYVKWLD >gi|283510599|gb|ACQH01000020.1| GENE 84 104963 - 105154 205 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927637|ref|ZP_06421484.1| ## NR: gi|288927637|ref|ZP_06421484.1| hypothetical protein HMPREF0670_00378 [Prevotella sp. oral taxon 317 str. 
F0108] # 22 63 22 63 63 82 100.0 6e-15 MTAQYIVIAAIIAIAIGYAAYKIYETVSNANNPCGGCKGCSMGQQQPENGLPDEKKPTCD HKM >gi|283510599|gb|ACQH01000020.1| GENE 85 105495 - 106100 690 201 aa, chain + ## HITS:1 COG:PA5492 KEGG:ns NR:ns ## COG: PA5492 COG0218 # Protein_GI_number: 15600685 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Pseudomonas aeruginosa # 4 186 12 193 215 148 40.0 5e-36 MDIKKSAFTLSAPSIRLCPTDNKIEYAFIGRSNVGKSSLINMLCNHKGLAKTSATPGKTL LINHFIINDSWYLVDLPGYGFAKRSKTVQQKLQQMISSYILQREQLVNLFVLIDIRHEQQ KIDREFIDWLGESGVPFTIVFTKADKLGLVKAKQNAEKWMKQLEDSWEELPPYFITSSEK RIGREELLNYIDEINKKIEAQ >gi|283510599|gb|ACQH01000020.1| GENE 86 106106 - 107881 2030 591 aa, chain + ## HITS:1 COG:sll0912 KEGG:ns NR:ns ## COG: sll0912 COG0488 # Protein_GI_number: 16331003 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Synechocystis # 7 590 3 634 636 447 41.0 1e-125 MANPKPVLDVQSLTKRFGAKVLFENISFAIAEGQRVGLIAQNGTGKSTLLSILMGEEGKD GGEVIYRNGLRVACLVQSPTFDPKETVLDACFNHQGDDDKVLKAKQILTQLKIADLNQPM GELSGGQQKRVALANTLITEPDFLILDEPTNHLDLEMIEWLEGFLSRGNKTLLMVTHDRY FLDRVCNLIVELDNNTIYTYQGNYAYYLEKRQARLDNARAEVQHANNLYRRELDWMRRQP QARGHKAKYREDAFYELEAKAKQRIEERQVRLKSSNVYIGSKIFECQYVSKAWSPEKVIL HDFYYNFARFEKMGIVGNNGTGKSTFIKMLLGEVQPDGGRFDIGETVRFGYFSQEGLKFD EQKKVIDVVTDIAEYIDLGSGRHLSASQFLQHFMFSPEEQYNYVYKLSGGEKRKLYLCTV LMRNPNFLVLDEPTNDLDIKTLQVLEEYLQDFPGCVIIVSHDRYFMDKVVDHLLVFKGGG EVQDFPGNYTQYRQWSQLASQAESKPAPVEKKEKPAYRNETKRKLTYKEKTEFEQLGKDI AALESEQAEIEAQLSSGTLSVVEITEKSKRLPVLKDELDEKSMRWLELSEI >gi|283510599|gb|ACQH01000020.1| GENE 87 107927 - 108433 466 168 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927640|ref|ZP_06421487.1| ## NR: gi|288927640|ref|ZP_06421487.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 168 1 168 168 332 100.0 7e-90 MNTLTFRSPYSKFMLVTTIIGMVLMHGAFMACMGGMVRSDLGSGRFFTYLLIFAACVFVV LYAFCLQIRKVTVDDSALVIHRQIGKVRIPLNSITKVETKDEIGLDLRLWGISFFFGHYG LFYNRSLGRYRAYVKNGDQMVVVHTPNRVHVFSCERRDELIAMLNERK >gi|283510599|gb|ACQH01000020.1| GENE 88 108588 - 110252 1535 554 aa, chain + ## HITS:1 COG:aq_999_1 KEGG:ns NR:ns ## COG: aq_999_1 COG1022 # Protein_GI_number: 15606303 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Aquifex aeolicus # 27 554 14 503 600 224 29.0 3e-58 MKQIPSFNAYIEKSIKDHWNLNALTDYKGKTLQYHDVARKIEKLHILFESGGIVKGDKIA LCGRNSAHWAVAYLAIITYGAVVVPVQSEFTPEQIYNIVNHSESKLLFVGDVVAPYLTME PMPNVEAIVYLPDFSLRQCKSEKLSFAREHLNELFGKKYPKFFRKEHVAYHCDHAEELAI INYTSGTTGFSKGVMLPYRALWGNLDFMMHELAPHVPAGSNILSILPMAHMYGQMFEFLL ALCVGAHNHFLTRQPSPSLIVEAMAEVKPAIVSAVPLMVDKIIRKKIFPQVQNNRIKLLT SMPVIGRKVKSSLCQMVRDLFGGNVYEVIVGGASLNKEIEDFLTDIGFPITVVYGTTETA PVLTFTDQSQFAAGSCGMPVRHVEVKIASNDPLNVPGEIVARGINVMQGYYKNEEATRQV LDKDGWYHTGDLATMAPDGHIYIRGRIKNMLLASNGQNVFPEEIEDKLNSMTLVNESLVV QKGDKLVGLVYPEMEEVASLDLSPEELEAIMEQNRHELNTILPTYCKLSAIKLHDTEFEK TPKRSIKRYLYQNV >gi|283510599|gb|ACQH01000020.1| GENE 89 110353 - 112017 1719 554 aa, chain + ## HITS:1 COG:FN0867_1 KEGG:ns NR:ns ## COG: FN0867_1 COG1022 # Protein_GI_number: 19704202 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Fusobacterium nucleatum # 53 550 39 506 606 199 30.0 2e-50 MEKIPSFNACVQKSIIDHWDLDALTDFKGQTLQYHDVARKIEKLHILFENSGVVKGDKIA LAGRNSANWAVAFLATLTYGAVAVPVLHEFTADQMHNIVNHSEAKLLFVGDVVATTIDAT KMPALEGIIYIPDYSLVLSRTDKLTYAREHLNEMFGKKYPKYFRKEHVNYYIEENPDELA LINYTSGTTGFSKGVMIPYRALWSNLDFAMGVLGPHVSPGAHIISILPMAHMYGMAFEFI FEFCCGCHIYYLNRMPSPAIIAQAFAEIKPKVIIAVPLVIEKIIRKRVFPKIQNNKMRLL LNMPVINKKINQKIKEQVAAAFGGEFYEIIIGGAAFNREVETFLTRIDLPFTVGYGATEC APIITYADYKDFVPTSCGKAVVHMEVKIDSHDPQNVPGEILARGLNVMLGYYKNEEATRK TLDKDGWYHTGDLGLMDAEGNVFIKGRSKNMLLGSNGQNIYPEEIEDKLNSMTMVTESVV 
VQDGDKLVGLVFPDFDEAKNLGLNNDDLVNLMEQNRQQLNAILPAYSKLSSIEIHAEEFE KTPKKSIKRFKYQR >gi|283510599|gb|ACQH01000020.1| GENE 90 112186 - 112728 349 180 aa, chain + ## HITS:1 COG:no KEGG:BT_1585 NR:ns ## KEGG: BT_1585 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 179 71 243 244 110 32.0 2e-23 MKTPHSVDSEFFKAVAGKAPWFGKDKWRRKYSEGPIVYRGVVAAQSELYKPGHKVGEAFH AITIVAVDKAHQCNEEWMQRVIKQLQDMQAGKVAVPSDCAELVDVMNEVDNDGDWRSGML GMSIAEGAEAHYRKDVLYRKDLPNGFLPTNGIIPQVCTNAPVKKSHSPLTANIPVQFYMD >gi|283510599|gb|ACQH01000020.1| GENE 91 112967 - 114136 1074 389 aa, chain - ## HITS:1 COG:no KEGG:BF2289 NR:ns ## KEGG: BF2289 # Name: not_defined # Def: putative polyphosphate-selective porin O # Organism: B.fragilis # Pathway: not_defined # 6 389 12 380 380 167 29.0 6e-40 MNKKLLLAFAACLYMGAASAQEKLEIKVTGRALFDAATYTQNAQAKSEDGKMNEGVGIRD MRVGLKATYGKWYFRGDVSYSNNKVSLKDVYLQYSFNENNFLRAGHYTAPFGLSSAYSSA KKEYLDEPEGNIYQPGRRIGVMHTIANHNLWLQYGAFADNSALTTSTDKSGPQGYTISGR FVYRPIMTDAGGFHVGFSGMHVKAEAVKEGEHAHIKYDKKYLTAVDKRTATAIDITDARW ENKFTAEFQGIWHNYQLSSQYYWSHISRDENKSYNTDGFYVSARGIIINPADYKYNYACS GVDNPADKNLELMLGFGYLNLKDAKALADAPIAGMAKAGRMSDLSAGLSFFWNKYVTLRL NYHYIHSHTWDQPTAKVVNVVQMRVQYMF >gi|283510599|gb|ACQH01000020.1| GENE 92 114159 - 115451 1178 430 aa, chain - ## HITS:1 COG:no KEGG:Coch_0692 NR:ns ## KEGG: Coch_0692 # Name: not_defined # Def: glucose-1-phosphatase (EC:3.1.3.10) # Organism: C.ochracea # Pathway: Glycolysis / Gluconeogenesis [PATH:coc00010] # 22 427 23 428 428 560 66.0 1e-158 MNRNILLVGLLFAFVPTSIFAQKQRSQAFRDKYTLKEAVILSRHNIRAPLSTKGSLLEQV TTHPWFKWTAGASELTTRGGALENQFGLYFRKWSVDAGLFKENATPSHNEVIVYANSMQR CIATANYFTTALFPVADVPVNHRFVPSKMDPIFFPQLTKSSKSFRAQAMKEIAAMGGKKG IVGINEGLKESYEITAKVLDLKDSPACKEKNLCAFDNYNTQLILDKGDEPRMKGSLKDAT TCSDALILQFYEEPDAKKAAFGHDITLDDWTKISKIKDVYGDVLFTAPVVAVNVAHPLLV YMRDELMDKDRKLAFLCGHDSNIASVTAALEVEPYDLPYSIEKKTPIGCKLVIEKFEGKD GKMYCDINLTYQSTEQLRHIAMLSLDNPPQIFSLSLKGLQKNADGLYLLKDVEGRFMKAI RAYDKIEDEL >gi|283510599|gb|ACQH01000020.1| GENE 93 115812 - 
115970 60 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927646|ref|ZP_06421493.1| ## NR: gi|288927646|ref|ZP_06421493.1| hypothetical protein HMPREF0670_00387 [Prevotella sp. oral taxon 317 str. F0108] # 1 52 1 52 52 105 100.0 7e-22 MLLTTRTGKWHGKHIRVLPPTRAVAGMPLLANGGSIEEKQKLFAIDYQFSNL >gi|283510599|gb|ACQH01000020.1| GENE 94 116787 - 117812 1021 341 aa, chain - ## HITS:1 COG:alr4493 KEGG:ns NR:ns ## COG: alr4493 COG1216 # Protein_GI_number: 17231985 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Nostoc sp. PCC 7120 # 4 270 9 276 295 129 29.0 7e-30 MHKVAIVILNWNGQSMLAQYLPSVMRHSRNDAAVYVADNASTDNSLAYLTQHFPHCQTIA LEKNWGFAEGYNKALAQIDAEYYVLLNSDVEVTHQWLTPLIEEMDAHPDIAACQPKLLAM HDRESFEYAGASGGFIDSLGYPYCRGRIFEHVEKDEGQYNYRQEIHWATGACLMVRAKDY WAVGGLDARFFAHNEEIDLCWRLRLRGKKIFCIPESAVYHVGGGTLPKVNPMKTYLNFRN NLTMLYKNLAPSQMRKVMVWRLFLDYVAAFQTLVFNRNWGDFKAIIKARRDFNKWKHSFD TAREEIQNTKTIHTENPNSPFSILWKYYVMGKHTFHALPQK >gi|283510599|gb|ACQH01000020.1| GENE 95 117836 - 119176 708 446 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 [Bacillus subtilis subsp. subtilis str. 
168] # 11 411 3 406 451 277 37 2e-73 MIDSSAFQGKTAAYYTLGCKLNFSETSTFGQMLQDMGVRTANEGEPADICLINTCSVTEV ADHKCRQIIHRMVRENPGAFVVVTGCYAQLESEKVSAIEGVDLVLGSNEKANLIQYLNDA WADGQRGKALHRHFSMNTKDIKTFAPSCSRGNRTRYFLKVQDGCDYFCTYCTIPFARGFS RNPSIASLVQQAHDAANDGGKEIVLTGVNIGEFKGGGNERFIDLVKALDQVEGIQRFRIS SIEPNLLTDELIDYCASSRAFMPHFHIPLQSGSDEVLKLMQRRYDTALFAHKVQLIKQRI PNAFIGVDVMVGSRGEEPAYFEECYDFLKSLDISQLHVFPYSERPGTAALRIPYVVNDAE KRRRSKLLLELSDEKLETFYASQIGSQSLVLFEKAAKGKAMHGFTPNYVRVELPASLAKD EFDNQLLPVRLTQFNHNKSAIKVELI >gi|283510599|gb|ACQH01000020.1| GENE 96 119224 - 120087 917 287 aa, chain - ## HITS:1 COG:PM1990 KEGG:ns NR:ns ## COG: PM1990 COG0575 # Protein_GI_number: 15603855 # Func_class: I Lipid transport and metabolism # Function: CDP-diglyceride synthetase # Organism: Pasteurella multocida # 8 274 2 278 289 118 32.0 1e-26 MTPKIKNLITRTITGVFFVAIMVTCFLRPLGMVFVFALITSLSLWEYAGLVNENKGSSVN RFISTVAGTYLFLAVAGVNSGFIGTNAVFVPYLLTIVYLFVSELYTKANNPINNWAYTML GQMYIALPLSMINVLAFRQADNQIYFYHLLPLSVFIFLWTNDTGAYCSGSLFGKHKLFPR ISPAKSWEGSIGGGILVLLVAGLIGYLETQSTAPTNLTIPQWLGLGLVVVVFGTWGDLVE SLFKRTLGIKDSGNILPGHGGMLDRFDSSLMAIPASVVYLYSLTLFQ >gi|283510599|gb|ACQH01000020.1| GENE 97 120271 - 122274 1176 667 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. 
McKiel] # 138 658 104 630 636 457 48 1e-127 MSNKQPYNNKQPKMPRFNMNWIYGLIIIALGLFFFTGKGDMMAGSVKQEATYTKFKEYVN KGYVNNVVVNSEQRTLKMYVKPANLRDVFGKDPKQLGEKPYVSVQYGSPEELEKYLTEIQ QAGKIKDFAFDNDKGGHFTDILIQFVPFIALIAIWIFIMRRMGSGGAGGGSGVFSVGKSK AKMYEKGNEIGITFKDVAGQEGAKQEVQEIVEFLKSPQKYTDLGGKIPKGALLVGPPGTG KTLLAKAVAGEAGVPFFSMSGSDFVEMFVGVGASRVRDLFHQAKEKSPCIIFIDEIDAVG RARSKNPSMGGNDERENTLNALLTEMDGFGTNSGVIILAATNRADMLDSALLRAGRFDRQ INVDLPDLPERKQIFQVHLRPVKVDSTVDIDFLSRQTPGFSGADIANVCNEAALIAARHN SKSVGKQDFLDAVDRIIGGLEKKTKIMTAAEKRTIALHEAGHATVSWFCQHADPLVKVSI VPRGRALGAAWYLPEERQITTKEQMLDEMCALMGGRAAEELFTGHISTGAMNDLERATKS AYGMVAYAGMSDKLPNISFYNNQEYQFQKPYSETTAKVIDDEVMKMVNEQYDRAKKILQE NSYGHNKLADLLISREVIFAEDVEEIFGKRPWVSRSEEIIEDNTPKLEDMPDEVKAAEEE HRRLRGL >gi|283510599|gb|ACQH01000020.1| GENE 98 122323 - 122682 368 119 aa, chain - ## HITS:1 COG:slr1886 KEGG:ns NR:ns ## COG: slr1886 COG0799 # Protein_GI_number: 16330295 # Func_class: S Function unknown # Function: Uncharacterized homolog of plant Iojap protein # Organism: Synechocystis # 4 118 27 140 154 84 40.0 4e-17 MTRTNNLVNTIIKGIQEKKGSNIVVADLKEIEGAITNYFVICQGNSPTQVEAIAESIGET VRKELKDKPTSAVGLGLNQWVAMDFVDVIVHVFIPEMRSFYDLEHLWADAKLTHLTDID >gi|283510599|gb|ACQH01000020.1| GENE 99 122835 - 123515 524 226 aa, chain - ## HITS:1 COG:no KEGG:PRU_0195 NR:ns ## KEGG: PRU_0195 # Name: not_defined # Def: cyclic nucleotide-binding domain-containing protein # Organism: P.ruminicola # Pathway: not_defined # 1 222 1 222 225 226 48.0 4e-58 MHRIPLYGKLMELPMFQGMSKEDLAAVLGQTRFDFAQKKAGATIVRAEDECTHLYFLLNG RMEVSAQAHDNGYIFTEEIGAPYILQPENLFGISPRYTRTFVAKATCSLLLLSKSEVLRL TDDFIIFKFNLLNTISTQAQKAERLPWRHCALNLEHRITRFITTHCTHPAGSKLVHIKMQ KLANELNDNRLNVSRALNGMQKQGLIQLKRGCIVIPMLEKLVAEWG >gi|283510599|gb|ACQH01000020.1| GENE 100 123571 - 124599 1034 342 aa, chain - ## HITS:1 COG:VC2289 KEGG:ns NR:ns ## COG: VC2289 COG1477 # Protein_GI_number: 15642287 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Vibrio 
cholerae # 31 316 51 345 367 201 38.0 1e-51 MTTPNRKRRLLWQIPFLLFLIVGTIYIIKQQQDAPYQHDKGAIFGTTYSIIYQSNENLNK EIMAALNEVDQSMSTFNKGSVISKINRNEQVQPDKMFVDVFQLANKISIETNGAFDVTVA PLVNAWGFGFKNGVHPTPQTIDSLKQFIGYQKVAYVNKRIKKQDPRLMLDFSAIAKGYGS DVVANLLKRKGIENFMVEIGGEIVTRGISEKRLPWKVGVTKPTDDSLNVNQELQTILNVT DKSMATSGNYRKFYYKNGKKYAHTIDPNTGYPVQHNILSSTVLANNCAEADAYATAFMVL GLDKAKKVLEQHPELMAYFIYSDKNGKNAVWFSPSLKDKIQK >gi|283510599|gb|ACQH01000020.1| GENE 101 124695 - 125390 404 231 aa, chain - ## HITS:1 COG:no KEGG:PRU_0193 NR:ns ## KEGG: PRU_0193 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 60 231 1 172 172 209 56.0 7e-53 MTHHRHISASISRVAISALLAFGALSASAQGLKPQTEHPDTTRFFRGLQVMADAVGPIQL AVSDYGQYEAALRINLKDKYFPVFELGYGTANHEDDPVTHVVYKTSAPYGKVGMDFNIMK NKHDIYRVYVGARYAFTSFKYDVASPVLTDPVWKDPAEIELNNIPASCHWAELVFAVDAK IWGPLHLGWSVRYRRRLAHNDGESGNVWYVPGFGKTGNSRLGGTFNIIINL >gi|283510599|gb|ACQH01000020.1| GENE 102 125341 - 125832 408 163 aa, chain - ## HITS:1 COG:no KEGG:PRU_0192 NR:ns ## KEGG: PRU_0192 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 1 161 1 167 169 120 42.0 1e-26 MRRLIPLLVLGSAVLAACSSIDCPYNNTVYSQYVLMKGGGLKPDTLTDTLTILTTRRDGQ DTIIYNKGVKTKSFSLPMSYTRDQDALFLQLKDTLGNVVHDTIKVNKTNTPHFESVDCPM SYFHTITAVSTSHHGIDSIILVNPNVTYDTSSPHFRIYFKSGH >gi|283510599|gb|ACQH01000020.1| GENE 103 125820 - 126782 1163 320 aa, chain - ## HITS:1 COG:mlr7556 KEGG:ns NR:ns ## COG: mlr7556 COG0463 # Protein_GI_number: 13476277 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Mesorhizobium loti # 3 320 4 312 326 247 40.0 2e-65 MDISVVIPLFNEDESLPELYSWIERVMNDNQFSFEVIFVNDGSTDRSWEVIEELSKQSEH VRGIKFRRNYGKSPALFCGFKEANGNVVITMDADLQDSPDEIPELYRMITEEKYDLVSGY KQKRYDPLSKTIPTKLFNATARKISGIKNLHDFNCGLKAYRKAVVKNIEVYGEMHRYIPY LAKNAGFGRIGEKVVQHQARKYGSSKFGLNRFFNGYLDLITLWFLSNFGKKPMHVFGLLG SLMFLLGFIATCLLGADKLYCLAQGIPQRLITDSPYFYLSLTTMMIGTQLFLAGFLGDLI SRNSQGRNDYQIEEEIRCGD 
>gi|283510599|gb|ACQH01000020.1| GENE 104 126949 - 127467 475 172 aa, chain - ## HITS:1 COG:no KEGG:PRU_0190 NR:ns ## KEGG: PRU_0190 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 9 171 8 175 180 76 33.0 4e-13 MNFLESLKQLKAFARQDGAILALVWIVAFLFTMRLPQSMTGNLLTLSTPFVVAWRLRAFR NNALDGEISYRRALAYSWHTFVYASLIFALAQYLYIRFYDPESLITMMRDSIHSFGAAYQ QMGMNETQMQESVKLLGTLQPIELVFLFFTQNIFIGLFLSLIIAAFGMKHRR Prediction of potential genes in microbial genomes Time: Sat May 28 00:23:00 2011 Seq name: gi|283510598|gb|ACQH01000021.1| Prevotella sp. oral taxon 317 str. F0108 cont2.21, whole genome shotgun sequence Length of sequence - 5830 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 515 - 565 14.3 1 1 Op 1 . - CDS 610 - 2241 1232 ## gi|288927658|ref|ZP_06421505.1| lipoprotein 2 1 Op 2 . - CDS 2265 - 3989 651 ## gi|288927659|ref|ZP_06421506.1| lipoprotein 3 1 Op 3 . - CDS 4041 - 4292 357 ## gi|288927660|ref|ZP_06421507.1| hypothetical protein HMPREF0670_00401 - Prom 4456 - 4515 2.6 4 2 Tu 1 . - CDS 5048 - 5302 120 ## Predicted protein(s) >gi|283510598|gb|ACQH01000021.1| GENE 1 610 - 2241 1232 543 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927658|ref|ZP_06421505.1| ## NR: gi|288927658|ref|ZP_06421505.1| lipoprotein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 543 15 557 557 1107 100.0 0 MLKLRYKLLSSLSACALAFMPMACTEEQNLAQPDNNQEELEPAVSFTGSMNSPSSRTSGE YGKIDATNLGITFYWTTNDHEHMYINRGTDTSPIWERATKVAFTDKDATTHSANATFRFK GKYENKQYTMLYVGGDQKFSTKDKVYVNFARIQKQTSLDDGELANSGDCGVAKTTCVYYK KEGKGWGYHGDRHTFKLEHKAAYVTFMPYNPRGHMDNTYFVSAQIEAKESLYGKFPFTEK GVDVSKRPLHQNKVTLNLATKGMGNGLPIASNKEDAQKHAGIMVLPPGKYTDVLVKFIIK DHSTGKRFDVKKPYPVLNLETGMNQPIFCRLEVVDFTPSFNYYSMWGAEDPYYTLDAKAP HNWNKNATGIPAPEDERWPHDGDSRWYSEDEDAPAGVLTDDAFTINDASYVLANDCYWDE DYLWAFDGHLQKGALWVPVISKGHATAFDGKDWTSIPTSKSKTAITDEEHPTVGYYPLPL IGKYVDGALVEVGAAGYYWINRADPNSSDSKAYYLKVTKAGVSINHDGSKEWGCVPFTGD IRP >gi|283510598|gb|ACQH01000021.1| GENE 2 2265 - 3989 651 574 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927659|ref|ZP_06421506.1| ## NR: gi|288927659|ref|ZP_06421506.1| lipoprotein [Prevotella sp. oral taxon 317 str. F0108] # 1 574 1 574 574 1104 100.0 0 MKFITQHMKQLAMAAYAFALAFTLGGCVNDLTNNEHRAENKETKETFNTKSFAGASSQME LPRTKTSGKYTDISKKQDGSKMGIQFYWDHDDYQRLWVNLGGTTGWQRFYKQDTLKLLND GKIAQSRFYLYSSYKLTEEKYPLWYSYYGFSNGQRYTYINDYTTQIRANDFTKLSEQGDC GIATAVDNGIWYGFTLQHTTSYINFMPYAGEGKSKEALLKCKLTHITLTTDEPNYGYYYF DENGNLDESKKPTGPRRARTLYCVDNLNYTGFSVPESRESAKKNAAIMVMAPGTYHNVEI TYTLYDPVVQTTGVFKKFISELTLKPGKTTSMFSELVADDVSHLFGRYHMWGASQFYWQG HESNAHHKWRQGGVDGIAGYPTPGDPRYKSDYYAPGAGNVAPAGSIASTVPSANFLSWYV KLGDPRWDDSPFTYDGHLFTGRMWFRKRKYFAPAAGLTDASMNAKGASGYDSRVWRNGGA SNTPTKWEDVPESNRSQYFCVMALGYYDNDGRLYMGNEYEGRYWTSTCVPDFESYGPRWA YVLMFNRNNVSVAGNTHYWPSRKEYGYQLWPGED >gi|283510598|gb|ACQH01000021.1| GENE 3 4041 - 4292 357 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927660|ref|ZP_06421507.1| ## NR: gi|288927660|ref|ZP_06421507.1| hypothetical protein HMPREF0670_00401 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 83 1 83 83 152 100.0 5e-36 MKEEQLKERQCPCKMAYVCPDIAMTEMAMQGQLLQASAKGGHKTLPNGGTVYNNSGSSGG DAGTAEEDNVISNSKGYTLYEWD >gi|283510598|gb|ACQH01000021.1| GENE 4 5048 - 5302 120 84 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIKRCTKLYNNVRTSHFYPPHIHIQQQYIYFLLSKQNKSQITDVKHTLWPKINHSYYITE RHHLENNIFLNFFDLFIWQFKYYV Prediction of potential genes in microbial genomes Time: Sat May 28 00:24:25 2011 Seq name: gi|283510597|gb|ACQH01000022.1| Prevotella sp. oral taxon 317 str. F0108 cont2.22, whole genome shotgun sequence Length of sequence - 151121 bp Number of predicted genes - 115, with homology - 107 Number of transcription units - 68, operones - 29 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 1052 - 1111 3.2 1 1 Op 1 . + CDS 1343 - 2800 1165 ## PRU_1451 LysM domain-containing protein 2 1 Op 2 . + CDS 2855 - 3550 740 ## PRU_1450 hypothetical protein + Prom 3576 - 3635 2.8 3 2 Tu 1 . + CDS 3655 - 5328 1520 ## COG3104 Dipeptide/tripeptide permease + Term 5360 - 5399 8.2 - Term 5442 - 5486 5.1 4 3 Tu 1 . - CDS 5510 - 6697 1232 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes - Prom 6783 - 6842 6.1 + Prom 6794 - 6853 5.7 5 4 Tu 1 . + CDS 6881 - 7837 1164 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Term 7995 - 8033 4.2 6 5 Tu 1 . - CDS 8238 - 11069 2496 ## COG0178 Excinuclease ATPase subunit - Prom 11208 - 11267 2.7 + Prom 11103 - 11162 4.6 7 6 Op 1 . + CDS 11233 - 11892 445 ## BVU_1953 hypothetical protein 8 6 Op 2 . + CDS 11893 - 12444 450 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 9 6 Op 3 . + CDS 12428 - 12772 301 ## gi|288927669|ref|ZP_06421516.1| membrane protein + Term 12810 - 12871 5.2 10 7 Tu 1 . - CDS 13856 - 15280 1283 ## COG2966 Uncharacterized conserved protein - Prom 15300 - 15359 6.1 11 8 Tu 1 . - CDS 15406 - 16134 192 ## PROTEIN SUPPORTED gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 - Prom 16297 - 16356 3.6 12 9 Op 1 . 
- CDS 16381 - 17742 953 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control 13 9 Op 2 . - CDS 17778 - 18965 561 ## PROTEIN SUPPORTED gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative 14 9 Op 3 . - CDS 19022 - 19645 609 ## PRU_0905 3'-5' exonuclease - Prom 19717 - 19776 5.3 + Prom 19584 - 19643 5.2 15 10 Op 1 . + CDS 19758 - 22241 2351 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins + Prom 22249 - 22308 3.3 16 10 Op 2 . + CDS 22328 - 22939 615 ## PRU_0907 hypothetical protein 17 10 Op 3 . + CDS 22982 - 24493 609 ## PRU_0908 prophage PRU01 putative membrane protein + Prom 24549 - 24608 4.1 18 11 Op 1 . + CDS 24643 - 26391 1997 ## COG0608 Single-stranded DNA-specific exonuclease 19 11 Op 2 . + CDS 26398 - 28329 1722 ## COG0514 Superfamily II DNA helicase + Term 28332 - 28383 -0.5 20 12 Tu 1 . - CDS 30220 - 30687 292 ## gi|288927680|ref|ZP_06421527.1| hypothetical protein HMPREF0670_00421 - Term 30986 - 31023 3.0 21 13 Tu 1 . - CDS 31198 - 33738 2676 ## COG1193 Mismatch repair ATPase (MutS family) - Prom 33855 - 33914 3.9 - Term 33867 - 33904 3.1 22 14 Tu 1 . - CDS 33933 - 34322 528 ## COG1970 Large-conductance mechanosensitive channel - Prom 34361 - 34420 5.2 + Prom 34307 - 34366 7.8 23 15 Op 1 . + CDS 34519 - 35538 1259 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase + Term 35568 - 35606 6.9 24 15 Op 2 . + CDS 35621 - 36571 630 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase 25 15 Op 3 . + CDS 36576 - 37508 556 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase + Term 37591 - 37640 -0.7 26 16 Tu 1 . 
- CDS 37505 - 38209 492 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Prom 38446 - 38505 6.0 + Prom 38160 - 38219 4.8 27 17 Op 1 4/0.000 + CDS 38464 - 40410 1434 ## COG3408 Glycogen debranching enzyme 28 17 Op 2 2/0.111 + CDS 40429 - 41697 1310 ## COG0438 Glycosyltransferase 29 17 Op 3 . + CDS 41722 - 43137 1490 ## COG1449 Alpha-amylase/alpha-mannosidase + Term 43227 - 43268 11.8 + Prom 43147 - 43206 5.9 30 18 Tu 1 . + CDS 43403 - 44329 888 ## COG0583 Transcriptional regulator + Term 44378 - 44432 19.3 - Term 44366 - 44420 19.3 31 19 Op 1 . - CDS 44459 - 45493 961 ## BF3435 hypothetical protein 32 19 Op 2 . - CDS 45539 - 47701 1887 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 47829 - 47888 5.2 33 20 Tu 1 . - CDS 47908 - 48228 154 ## gi|260911440|ref|ZP_05918030.1| hypothetical protein HMPREF6745_1985 - Prom 48353 - 48412 2.1 - Term 49582 - 49626 8.0 34 21 Tu 1 . - CDS 49675 - 50571 844 ## COG2761 Predicted dithiol-disulfide isomerase involved in polyketide biosynthesis - Prom 50704 - 50763 7.9 + Prom 50651 - 50710 6.9 35 22 Tu 1 . + CDS 50733 - 51194 607 ## BF1333 hypothetical protein + Term 51224 - 51279 10.3 - Term 51794 - 51830 -1.0 36 23 Op 1 14/0.000 - CDS 51888 - 53060 1084 ## COG0451 Nucleoside-diphosphate-sugar epimerases 37 23 Op 2 3/0.000 - CDS 53088 - 54173 1196 ## COG1089 GDP-D-mannose dehydratase 38 23 Op 3 . - CDS 54188 - 55252 916 ## COG0836 Mannose-1-phosphate guanylyltransferase - Prom 55365 - 55424 1.9 + Prom 55471 - 55530 6.8 39 24 Tu 1 . + CDS 55717 - 56589 910 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) + Term 56614 - 56657 8.4 - Term 56605 - 56642 1.6 40 25 Tu 1 . - CDS 56872 - 57090 69 ## - Prom 57209 - 57268 4.0 - Term 57194 - 57231 -0.1 41 26 Op 1 . - CDS 57293 - 58837 1731 ## COG0519 GMP synthase, PP-ATPase domain/subunit 42 26 Op 2 . 
- CDS 58911 - 59687 738 ## COG1208 Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) - Prom 59712 - 59771 4.6 43 27 Op 1 . - CDS 59792 - 61333 1449 ## COG1660 Predicted P-loop-containing kinase 44 27 Op 2 . - CDS 61362 - 61547 83 ## - Prom 61599 - 61658 8.1 - Term 61796 - 61833 4.6 45 28 Op 1 41/0.000 - CDS 61881 - 63509 1644 ## PROTEIN SUPPORTED gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 - Prom 63530 - 63589 4.7 - Term 63575 - 63603 -1.0 46 28 Op 2 . - CDS 63643 - 63912 488 ## COG0234 Co-chaperonin GroES (HSP10) - Term 63978 - 64020 2.0 47 29 Op 1 . - CDS 64025 - 64270 80 ## - Prom 64294 - 64353 2.1 48 29 Op 2 . - CDS 64356 - 65051 512 ## COG0666 FOG: Ankyrin repeat 49 29 Op 3 . - CDS 65091 - 67889 3217 ## COG1410 Methionine synthase I, cobalamin-binding domain 50 29 Op 4 . - CDS 67957 - 68436 589 ## COG0691 tmRNA-binding protein 51 29 Op 5 . - CDS 68497 - 69006 733 ## PRU_0797 hypothetical protein - Term 69762 - 69803 1.5 52 30 Tu 1 . - CDS 69867 - 70361 29 ## - Prom 70498 - 70557 3.1 - TRNA 70228 - 70318 38.3 # Ser CGA 0 0 53 31 Op 1 . - CDS 70712 - 72181 1716 ## COG3538 Uncharacterized conserved protein 54 31 Op 2 . - CDS 72251 - 73561 1120 ## COG0498 Threonine synthase - Prom 73643 - 73702 2.6 55 32 Op 1 . - CDS 73732 - 74976 987 ## COG3635 Predicted phosphoglycerate mutase, AP superfamily 56 32 Op 2 . - CDS 74988 - 77423 2413 ## COG0527 Aspartokinases - Prom 77481 - 77540 4.4 + Prom 77594 - 77653 7.5 57 33 Tu 1 . + CDS 77894 - 78964 740 ## COG1672 Predicted ATPase (AAA+ superfamily) + Prom 79262 - 79321 4.9 58 34 Op 1 . + CDS 79384 - 80235 596 ## COG2207 AraC-type DNA-binding domain-containing proteins 59 34 Op 2 . + CDS 80243 - 80725 175 ## Coch_0541 hypothetical protein 60 34 Op 3 . + CDS 80741 - 81226 216 ## Coch_0542 hypothetical protein + Term 81373 - 81415 9.2 - Term 81361 - 81402 8.2 61 35 Op 1 . 
- CDS 81449 - 83896 1732 ## PGN_0561 trypsin like proteinase PrtT 62 35 Op 2 . - CDS 83944 - 84132 197 ## gi|288927718|ref|ZP_06421565.1| hypothetical protein HMPREF0670_00459 - Prom 84288 - 84347 6.4 63 36 Tu 1 . - CDS 85235 - 85543 134 ## + Prom 85461 - 85520 5.0 64 37 Tu 1 . + CDS 85607 - 85933 236 ## + Prom 85988 - 86047 6.6 65 38 Op 1 . + CDS 86068 - 86931 863 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) 66 38 Op 2 . + CDS 86935 - 87432 416 ## PRU_1022 hypothetical protein 67 39 Op 1 . + CDS 87570 - 88451 857 ## PRU_2123 hypothetical protein 68 39 Op 2 . + CDS 88427 - 88912 118 ## BT_0074 hypothetical protein 69 39 Op 3 . + CDS 88955 - 89224 423 ## PRU_2121 hypothetical protein + Term 89342 - 89398 15.1 + Prom 89369 - 89428 2.9 70 40 Tu 1 . + CDS 89495 - 90628 902 ## PRU_1233 hypothetical protein + Term 90728 - 90781 1.7 + Prom 91094 - 91153 5.6 71 41 Tu 1 . + CDS 91287 - 93320 2195 ## COG1158 Transcription termination factor + Prom 93562 - 93621 5.6 72 42 Op 1 . + CDS 93728 - 94141 345 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 94148 - 94192 10.0 73 42 Op 2 . + CDS 94210 - 96168 1998 ## PRU_0989 M49 family peptidase (EC:3.4.14.4) 74 43 Tu 1 . + CDS 96309 - 96923 567 ## BVU_2016 hypothetical protein + Term 96981 - 97018 2.1 + Prom 96927 - 96986 2.1 75 44 Op 1 . + CDS 97040 - 97492 341 ## COG0735 Fe2+/Zn2+ uptake regulation proteins 76 44 Op 2 . + CDS 97530 - 98807 1137 ## COG0104 Adenylosuccinate synthase 77 44 Op 3 . + CDS 98807 - 100177 1459 ## COG0124 Histidyl-tRNA synthetase 78 44 Op 4 . + CDS 100190 - 100624 340 ## COG0735 Fe2+/Zn2+ uptake regulation proteins + Prom 100796 - 100855 6.0 79 45 Tu 1 . + CDS 100887 - 101018 57 ## - Term 101530 - 101569 -0.9 80 46 Tu 1 . - CDS 101705 - 102994 505 ## gi|288927733|ref|ZP_06421580.1| hypothetical protein HMPREF0670_00474 81 47 Op 1 . - CDS 103055 - 103177 63 ## 82 47 Op 2 . 
- CDS 103222 - 103671 264 ## gi|288927735|ref|ZP_06421582.1| hypothetical protein HMPREF0670_00476 83 47 Op 3 . - CDS 103689 - 105314 751 ## gi|288927736|ref|ZP_06421583.1| hypothetical protein HMPREF0670_00477 - Prom 105348 - 105407 4.0 84 48 Tu 1 . - CDS 105411 - 106943 85 ## gi|288927737|ref|ZP_06421584.1| hypothetical protein HMPREF0670_00478 85 49 Tu 1 . - CDS 107117 - 107359 236 ## gi|288927738|ref|ZP_06421585.1| hypothetical protein HMPREF0670_00479 - Prom 107399 - 107458 7.3 + Prom 107419 - 107478 7.5 86 50 Tu 1 . + CDS 107583 - 109358 2488 ## PROTEIN SUPPORTED gi|160887146|ref|ZP_02068149.1| hypothetical protein BACOVA_05162 + Term 109398 - 109447 15.7 - Term 110535 - 110580 -0.9 87 51 Op 1 . - CDS 110817 - 111599 629 ## COG1179 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 88 51 Op 2 . - CDS 111655 - 111987 423 ## PAU_00474 hypothetical protein - Prom 112046 - 112105 2.5 - Term 112016 - 112063 -0.6 89 52 Tu 1 . - CDS 112118 - 114466 1517 ## BVU_1144 hypothetical protein - Prom 114591 - 114650 2.9 - Term 114528 - 114585 9.3 90 53 Tu 1 . - CDS 114681 - 116426 2156 ## COG1109 Phosphomannomutase - Prom 116493 - 116552 7.5 + Prom 116394 - 116453 5.3 91 54 Op 1 . + CDS 116659 - 116985 308 ## PRU_2631 hypothetical protein 92 54 Op 2 . + CDS 116988 - 117458 443 ## COG1576 Uncharacterized conserved protein + Prom 117739 - 117798 7.4 93 55 Tu 1 . + CDS 117886 - 121224 3250 ## PRU_1810 hypothetical protein 94 56 Op 1 . + CDS 121353 - 121964 466 ## COG0726 Predicted xylanase/chitin deacetylase 95 56 Op 2 . + CDS 121942 - 123000 651 ## COG1600 Uncharacterized Fe-S protein 96 56 Op 3 . + CDS 122997 - 123713 541 ## gi|288927749|ref|ZP_06421596.1| hypothetical protein HMPREF0670_00490 + Term 123736 - 123791 1.6 97 57 Tu 1 . - CDS 123765 - 124409 528 ## PRU_1212 TetR family transcriptional regulator - Prom 124436 - 124495 5.3 + Prom 124492 - 124551 4.4 98 58 Tu 1 . 
+ CDS 124620 - 126920 2522 ## COG0281 Malic enzyme + Term 126959 - 127000 1.2 - Term 128267 - 128320 10.3 99 59 Tu 1 . - CDS 128413 - 129168 755 ## PRU_1803 putative lipoprotein - Prom 129322 - 129381 5.7 100 60 Tu 1 . + CDS 129447 - 130238 633 ## COG0566 rRNA methylases + Prom 130456 - 130515 2.7 101 61 Tu 1 . + CDS 130537 - 131433 750 ## gi|288927755|ref|ZP_06421602.1| hypothetical protein HMPREF0670_00496 + Term 131461 - 131502 5.6 - Term 131449 - 131490 9.4 102 62 Tu 1 . - CDS 131524 - 132021 512 ## BVU_0830 hypothetical protein - Prom 132155 - 132214 12.7 103 63 Tu 1 . + CDS 132466 - 134943 2400 ## COG0787 Alanine racemase + Prom 135213 - 135272 6.0 104 64 Op 1 . + CDS 135297 - 137312 1871 ## COG4206 Outer membrane cobalamin receptor protein 105 64 Op 2 . + CDS 137327 - 138487 1087 ## PRU_2507 hypothetical protein + Prom 138567 - 138626 6.5 106 65 Tu 1 . + CDS 138694 - 140457 1668 ## PRU_2508 hypothetical protein + TRNA 140897 - 140984 55.5 # Ser GGA 0 0 + Prom 141273 - 141332 8.5 107 66 Op 1 . + CDS 141386 - 142618 694 ## COG4974 Site-specific recombinase XerD 108 66 Op 2 . + CDS 142644 - 143111 483 ## PG0816 hypothetical protein 109 66 Op 3 . + CDS 143280 - 143627 238 ## PG0814 hypothetical protein + Term 143668 - 143712 9.0 + Prom 143895 - 143954 3.7 110 67 Op 1 . + CDS 144013 - 144459 424 ## BF2919 hypothetical protein 111 67 Op 2 . + CDS 144482 - 144676 210 ## BF0106 hypothetical protein 112 67 Op 3 . + CDS 144695 - 144928 198 ## BF0106 hypothetical protein 113 67 Op 4 . + CDS 144933 - 145199 57 ## PGN_0050 hypothetical protein + Term 145272 - 145326 7.1 - Term 145523 - 145560 1.1 114 68 Op 1 . - CDS 145574 - 147421 1176 ## PRU_2073 hypothetical protein 115 68 Op 2 . 
- CDS 147445 - 150825 1568 ## PRU_2074 hypothetical protein - Prom 150938 - 150997 5.0 Predicted protein(s) >gi|283510597|gb|ACQH01000022.1| GENE 1 1343 - 2800 1165 485 aa, chain + ## HITS:1 COG:no KEGG:PRU_1451 NR:ns ## KEGG: PRU_1451 # Name: not_defined # Def: LysM domain-containing protein # Organism: P.ruminicola # Pathway: not_defined # 26 485 1 450 450 457 48.0 1e-127 MQLILKFIAHSFKIVIFAVTNFGRTMTKAIIRYLFVLFVAFLCIPVDAQNLKWRDMYQVK KKDTLFGIAKNFGLTLEELINANPEMKVAGYELKKGDYIFIPYPSNATEKPTNTVPAVKL QSQNTARSTSQEQHAKTIKVGFVLPLHDVDGDGKRMIEYYRGFLMGCDSLKQQGYSIDIH AWNVPIDADIRATLQNASIRNCDIIFGPLYTKQVKPLSDFCKGAGIKLVIPFSIWGDEVS RNPEIFQVYQSTDELNNAAIDAFISRFGNQHAVFIDCNDSTSKKGIFTFGLRNRLNARGV KYSITNLKSSEELFAKAFSRTQPNVVVLNTGRSPELTVAMAKLKGLVASNPSVVVSLYGY TEWLMYTRNNLDNFFKFNAHIPTTFYYNPLSTRTQQLEAGYRRWFGVEPLYALPKFALMG YDHAQFFLRGLHKYGKAFNGSKAQSAYATVQTPLNFKRVGSGGMQNQAFMLVRYTNDKRI ELISY >gi|283510597|gb|ACQH01000022.1| GENE 2 2855 - 3550 740 231 aa, chain + ## HITS:1 COG:no KEGG:PRU_1450 NR:ns ## KEGG: PRU_1450 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 5 229 22 263 269 240 48.0 3e-62 MCKAQIGAYRNNFSIGFNGGYVLSNVGFNPRVDQGYHGGITGGLSFRYVSEKYFNTICSI YAEVNYASLGWKQDIRDLERNPVVNATTGLAEEYSRTINYIQVPIFAHLAWGRETRGAQF FFQVGPQMGFYLGESTKTNYDVSTRNLTDRANKVTEQEAMPVERTLDYGIAAGLGMEYSI PRVGHFLLEGRYYYGLGNIYGDSKRDYFGRSNFGNIVVKMAYLFDLTHTSK >gi|283510597|gb|ACQH01000022.1| GENE 3 3655 - 5328 1520 557 aa, chain + ## HITS:1 COG:CAC0751 KEGG:ns NR:ns ## COG: CAC0751 COG3104 # Protein_GI_number: 15894038 # Func_class: E Amino acid transport and metabolism # Function: Dipeptide/tripeptide permease # Organism: Clostridium acetobutylicum # 5 556 24 519 521 130 23.0 5e-30 MFEGQPKGLYALALANTGERFGYYTMLAIFTLFLQAKFGYTATETSTIFASFLGFVYFMP LIGGILADRFGYGKMVTSGIVIMFLGYVLLAIPTAADFSGKAMMFGALALISCGTGLFKG NLQVLVGNLYDNPQLSSKRDTAFSLFYMAINIGALFAPTAATKITNYVLAGSNFTYEPQI PSLAHQFLDGTIKADGSVLLEQLKAAQGYAGDMTSFCTTYISELSRAYNYGFAVACVSLV VSMIIYVVFRSTFKHADVNNKQKAAKAGNDAVAEPELSPAETKQRIVALLLVFAVVIFFW 
MAFHQNGLTMTFFARDYTAKSVVGLNRLGFDILNLVLIVVSVYSAFSLFQSKTTKGKTIA ALVLAGAIIGIVANYSSLEAEIKILPQIFQQFNPFFVVALTPVSLAVFGYLGRRGKEPTA PRKIGYGMLIAACGFLVLAIASFGLPTPTEVKNSGISDSLLVSPDWLISTYLVLTFAELL LSPMGISFVSKVAPPKYKGMMMGGWFVATAIGNYLVAIIGYLWGDMQLWMVWSVLIVCCL LSALFIFSIIKRLEKVC >gi|283510597|gb|ACQH01000022.1| GENE 4 5510 - 6697 1232 395 aa, chain - ## HITS:1 COG:YPO0059 KEGG:ns NR:ns ## COG: YPO0059 COG0156 # Protein_GI_number: 16120412 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Yersinia pestis # 7 394 12 402 403 484 60.0 1e-136 MYGKMKEHLSKSIAEIKEAGLYKEERLIESAQQAAITVKGKEVLNFCANNYLGLSNHPRL IEGAKRMMDRRGFGMSSVRFICGTQDAHKELEQAISNYFQTEDTILYAACFDANGGVFEP LFTDEDAIISDALNHASIIDGVRLCKAKRYRYQNADMADLERCLQEAQQQRFRIIVTDGV FSMDGNVAPMDQICDLAEKYDALVMVDESHSAGVVGATGHGVSELCKTYGRVDIYTGTLG KAFGGALGGFTTGRKEIIDMLRQRSRPYLFSNSLAPCIIGASIEVFKMLAESNELHDKLV DNVNYFRDKMMAAGFDIKPTQSAICAVMLYDAKLSQVYASRMLDEGIYVTGFYYPVVPKE QARIRVQISAGHNREQLDKCIGAFIKVGKELGVLK >gi|283510597|gb|ACQH01000022.1| GENE 5 6881 - 7837 1164 318 aa, chain + ## HITS:1 COG:SA0511 KEGG:ns NR:ns ## COG: SA0511 COG0451 # Protein_GI_number: 15926231 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Staphylococcus aureus N315 # 1 315 1 314 321 337 53.0 1e-92 MKNVLVIGSTGQIGSELTGELRKRYGNSNVVAGYIQGAEPKGELKDAGPCAIADVTNPEM IAEVVRKYNIDTIYNLAALLSVVAESKPRLAWKIGIDGLWNILEVARENGCAVFTPSSIG SFGPDTPHVMTPQDTIQRPATIYGVSKVTTELLSDYYQKKYGVDTRSVRFPGIISYVTPP GGGTTDYAVDIYYSAVKGEKFVCPIGKGTRMDMMYMPDALHAAIQLMEADKDKLVHHNGF NIASMSFDPETIYQAIRKYKPEFEMVYEVDPLKQSIADSWPDQMDDSAARKEWGWMPQYD LDSMTKDMLEKLTEKLLK >gi|283510597|gb|ACQH01000022.1| GENE 6 8238 - 11069 2496 943 aa, chain - ## HITS:1 COG:FN1103 KEGG:ns NR:ns ## COG: FN1103 COG0178 # Protein_GI_number: 19704438 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Fusobacterium 
nucleatum # 5 943 17 957 960 795 45.0 0 MINEIDKHIELKGVRVNNLKNISLNIPRNKFIAIAGVSGSGKSSLAFDTLYAEGQRRYVE SLSSYARQFLGNMSKPECDFIKGLPPAIAIEQKVNTRNPRSTVGTSTEIYEYMRLLFARI GKTISPVSGQEVKKHSVEDVIQCTKQYSNGTKFVVLAPIHIPEERTFEKQLNMFVQEGYT RLWHKGDFVRIDELVEDEERLKAANADEYLLLIDRLAVDDSKEHLSRLTDSVETAMYEGH GECRLVFLPSNITYEFSSRFEADGMTFEEPSDNMFAFNSPLGACPTCEGFGKIIGIEESL VIPNSTLSVYDGCVQCWHGDKMGIWKEEFCRRAAQDNFPIFKPYFELSRKEKDMLWHGLP SEKGRDISEQVCIDAFFQMVRENQYKIQYRVMLNRYRGKTTCPECHGTRLKKEATWVKID GMSIADLVEMPIGKLKTWFDNLQLNDYDAKIAARLLTEINNRLQFLLDVGLHYLTLNRQS NTLSGGESQRIQLTTSLGSSLVGSLYILDEPSIGLHSRDTDRLIKVLNDLVQIGNTVIVV EHDEDILRKADHLIDVGPDAGRLGGEIVFNAPASYITKDALEKYPNSHTVAYLTGAEKID TPTSRRKWNQSILLKGARMNNLKGIDVTFPLNVLTVVTGVSGSGKSSLIKGILHPALKRS MGDVADAPGEFIALEGDLKRVKHIEFVDQNPIGRSTRSNPATYVKAYEPIRQLFANQPLA KQLGFSAQYFSFNAEGGRCEECKGAGVITVEMQFMADLILECETCHGQRFKQDVLDVRYE GKNINDILDLTVSEAIEFFAAHKQTVIVNRLQPLEDVGLGYIKLGQNSSTLSGGENQRVK LAYFIGQEKAEPTLFIFDEPTTGLHFHDIKRLLKAMGALIERGHSVIVIEHNMDVIKCAD HVIDLGPDGGDRGGKLVFAGTPEELANCKESITGKYLKAQLES >gi|283510597|gb|ACQH01000022.1| GENE 7 11233 - 11892 445 219 aa, chain + ## HITS:1 COG:no KEGG:BVU_1953 NR:ns ## KEGG: BVU_1953 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 60 217 36 192 235 80 28.0 4e-14 MKKYLLAFSMLFLVGVTAVDATPKHRHRHKTTTVSVKTDSTSAAIEVYSDTASANGSGSD SAYVSSSWDDDDDVHEDFGHFINRVAGHGPGATIAVIIIVFLGFILACAPFLIFGLIIYY MVKRHNDRVKLSMKAMEMGQLPPAKADAAFAESDEMLWKKGVKNSALGLGLALMFLMFDA EGLAGVGLLVLCYGLGQVYMARASRKKRENKDENNHPEF >gi|283510597|gb|ACQH01000022.1| GENE 8 11893 - 12444 450 183 aa, chain + ## HITS:1 COG:PM1789 KEGG:ns NR:ns ## COG: PM1789 COG1595 # Protein_GI_number: 15603654 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pasteurella multocida # 5 168 4 179 191 92 35.0 3e-19 MNVTQLNDNTLVALAAEQHNRKAFNELVVRYQSPIRRFFLNQTLGNEPLSDDLAQDTFVK AYLNITKFRGDSAFSTWLYRIAYNVFYDYTRSNKHTEDLETTEVARRNAEVSDTTVSLDI 
YDALNKLSDYERTCVTLQLMEGQPIDKIAEITGMAAGTVKSHLFRGKEKMTRYLKENGYD RKR >gi|283510597|gb|ACQH01000022.1| GENE 9 12428 - 12772 301 114 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927669|ref|ZP_06421516.1| ## NR: gi|288927669|ref|ZP_06421516.1| membrane protein [Prevotella sp. oral taxon 317 str. F0108] # 1 114 3 116 116 224 100.0 2e-57 MTERDDELLMQFFSEHKQEIFDNGFSERVMQKLPRSAIRTYNRVWTLFCCMVGLAFILLT RGWEQVARIGHILSSQFYDALYGLNLMSFTPIVLFVAMLTFIGVTVYNLNLSKD >gi|283510597|gb|ACQH01000022.1| GENE 10 13856 - 15280 1283 474 aa, chain - ## HITS:1 COG:VC0438 KEGG:ns NR:ns ## COG: VC0438 COG2966 # Protein_GI_number: 15640465 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Vibrio cholerae # 3 280 37 286 289 87 25.0 8e-17 MNIEENQQVESRKLLRRKLDLLLRTGKILVESSADTSRIMRNMKRTAAYLGLNEEHLHIH ITYNMLMVNLSDGTHSFSKFQRCDKHGIEMRAISDISKLSWSAIREDYSLDRYEEELNKI GSRKRSYTPWQVAIAAGFACGGFCIQFGCDWTAFFYASFAAALGFRLRTKLNEIGSNGYA NIAVAAFVSTIIAWLLGTFANSQLVADMPQWLACALQTVTPWHPLMACALFIVPGVPIIN FVNDVLDNNIEVGIIRGINTILIVSAMAFGIVFAIRVCGIDNFVKDLSMIPHHSYWEYAI AAAISAMGFATIYNFPPKQLWVLALGGIVAVCTRNYINFGDSSGNIGLDLGPVIGSLVGA SLISIIYIKVVHWVHLPHQCLSIPTVIPMVPGVLMYRCLFALLDMHGVVGELTTAMTNGI RASLIVLCIAIGVAIPNIFARKWIAPNRQRKLKRLIEERKARGKFVDLAEIKSN >gi|283510597|gb|ACQH01000022.1| GENE 11 15406 - 16134 192 242 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 [alpha proteobacterium BAL199] # 5 230 8 240 259 78 29 2e-13 MKKAIIMGASSGIGYAVAKRLIEEGWMVGLSARRLEPLETLTQLAPERVFTQRIDITDTE ATTSLSQLIVRLGGIDLYFHASGIGKQNTLLDESIEQATLATNVMGFARMVDFVFNYMAD NGGGHIAVISSVAGTKGIGSAASYSASKTFQNAYIQALDQLAVKRKLPIIFTDIRPGFVD TPILTGDNYPMLMTVEHVTERIMWALRRKKRVRIIDWRYRLMVRIWRMIPNCCWRKINIG KA >gi|283510597|gb|ACQH01000022.1| GENE 12 16381 - 17742 953 453 aa, chain - ## HITS:1 COG:CAC3204 KEGG:ns NR:ns ## COG: CAC3204 COG0037 # Protein_GI_number: 15896451 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily 
implicated in cell cycle control # Organism: Clostridium acetobutylicum # 7 448 4 455 461 156 25.0 8e-38 MKKIEREVSQYIDRHNLLSASGRYLVALSGGADSVALFRMLLRLDYKVDAVHCNFKLRGE ESFRDEAFCENLCREHGVAFHTVHFDTAEYAALHKLSIEMAARELRYNHFEKLRNDLDMD GICVAHHQDDNVETMLINLIRGTGLSGLTGMKPRNGHVLRPLLAVSHAHLLDYLQALGQP YVTDSTNLTTDFTRNKIRLTLVPLLTEINRVSTNNIAKTIERLAEAEKIFNPAIERHAND ATVVREDEPRWRLGISIDRLRQTPSPEYVLFQLIHPLGFAPAQIEQIAQNLDQQTGKQWQ SPTHTLVIDRQMLLVEQTLPASKLAKTMRIPEEGTYVFSPNVRIKVSKGVKNERFSPSKA PHCAHLDMANIAFPLMVRHIKPADRFVPFGMRTSKLVSNFLTDRKRSYFDRQRQLAVFDA NDALLWLVGERTDNRFRITERTTRFLQLDYMEI >gi|283510597|gb|ACQH01000022.1| GENE 13 17778 - 18965 561 395 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative [Thermococcus barophilus MP] # 33 395 30 393 396 220 37 3e-56 MYKSIYLKKGKEESLKRFHPWVFSGAISHFAEKGDIDEGELVRVLTNGGDFIAVGHYQIG SIAVRVLSFRDVEINADFWKTRLTSALNARLAIGIANKADNDTYRLVHGEGDNLPGLIVD CYGKTAVMQAHSVGMHLARHEICKALVEVMGSRIENVYYKSDTTLPFKAELGQENEFIYG DSENNTGMENGLLFHVDWLKGQKTGFFIDQRDNRSLLEKYAMGKSVLNMFCYTGGFSVYA MRGNAKLVHSVDSSAKAIELTNQNVELNFPNDNRHEAFCEDAFKFLNANENKYDLIVLDP PAFAKHRAALHNALKGYTRLNAKGFECIKPGGLLFTFSCSQVVTKDQFRNAVFTAATQAG RKVRIMHQLHQPADHPINIFHPEGEYLKGLVLYVE >gi|283510597|gb|ACQH01000022.1| GENE 14 19022 - 19645 609 207 aa, chain - ## HITS:1 COG:no KEGG:PRU_0905 NR:ns ## KEGG: PRU_0905 # Name: not_defined # Def: 3'-5' exonuclease # Organism: P.ruminicola # Pathway: not_defined # 1 203 1 203 217 251 59.0 1e-65 MRKIIYNKFDKKSIAQLPTVTFPGKTVVVMSESEAEKAVDFLLSRDILGVDTETRPSFKK GETHMVSLLQVSTSDVCFLFRLNHIGITPAILRLLENKAVPMVGLSLHDDMLSLHKRVAF TPGFFIDLQDLVGELGIEDLSLQKLYANLFHQKISKRQRLTNWDSDVLNDKQKAYAALDA WACINLYKEILRLKQSGDYELVINEQD >gi|283510597|gb|ACQH01000022.1| GENE 15 19758 - 22241 2351 827 aa, chain + ## HITS:1 COG:BS_spoIIIE KEGG:ns NR:ns ## COG: BS_spoIIIE COG1674 # Protein_GI_number: 16078743 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins 
# Organism: Bacillus subtilis # 332 813 314 780 787 389 44.0 1e-107 MVKRKTESKPKRKSSTFSLSGIFKISNERTDFVLGIVLILLAVYVGLAMFSFFTTGKADQ SILEDLRPGEWLNTGKEFSNYCGSLGAILAYTLITLNFGFAAFAIPVFILLAGAKLTRAY KVNLWKTFFYLAIIMVWCSVAFAKFLTPLTGSLTFNPGGNHGLFCVQNIENMIGPPGLTA LLFIVALAFLTYLSAETVSFVRKAINPVKYITSKVKFTIANNGSRTVPADKDEEVEVETR QENGQANDDEGLKDEENGLPTVVDLTDGTMGLGGDKPFGGAHSHKEQQANGDPSAQLKVE IPQGEETASGRELTAAEVLSTPINPLEPFLSYKYPTLNLLKAYDNDSKPYVDMTELKANN DRIIKVLRDFGVEIREIKATVGPTITLYEITPAEGVRINKIRNLEDDIALSLAALGIRII APIPGKGTIGIEVPNNKPNIVSMESILNSKKFQETKMDLPLALGKTITNEVFMVDLAKIP HLLVAGATGQGKSVGLNAVITSLLYKKHPNELKLVLIDPKKVEFSIYSPIVNHFLAKVPE EDDEPIITDVTKVVRTLNSLCKLMDTRYDLLKAAGARNIKEYNEKFVNHKLNLTKGHEYM PYIVVIIDEFGDLIMTAGKEIELPIARIAQLARAVGIHMVIATQRPTTSIITGNIKANFP GRMAFKVSAMIDSRTILDRPGANQLIGRGDMLFLNGNEPVRVQCAFVDTPEVERINRFIA DQPGPVEPMELPEPNTEEGGIGGGTADMNSLDPYFEEAARAIVISQQGSTSMVQRRFSIG YNRAGRLMDQLEVAGVVGIAQGSKPREVLITDENTLNALLAKLRGAG >gi|283510597|gb|ACQH01000022.1| GENE 16 22328 - 22939 615 203 aa, chain + ## HITS:1 COG:no KEGG:PRU_0907 NR:ns ## KEGG: PRU_0907 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 27 203 22 198 198 193 56.0 3e-48 MLKKVLLLICLSMFVFNATKAQKDNDARKVLDKTAAIVGNKGGALANFKLSNGKLGAISG TISIKGNKFYARTSKATVWFNGKTQWSYLPSTNEVNVSNPNQSQQAAMNPYTFINMYKSG YNLSMNKVGRDYQVHLVAEKGNKGIPELYVLVDATYKPKQVKMKRGGEWTTITISNFLHK NLSDHLFSFNSKDFPSAEVVDLR >gi|283510597|gb|ACQH01000022.1| GENE 17 22982 - 24493 609 503 aa, chain + ## HITS:1 COG:no KEGG:PRU_0908 NR:ns ## KEGG: PRU_0908 # Name: not_defined # Def: prophage PRU01 putative membrane protein # Organism: P.ruminicola # Pathway: not_defined # 1 503 1 492 494 281 35.0 5e-74 MNRLQLYRTLYRHRSLSMRRALDYEHNRSAKYLFYIGYIMGFAYMLFIAVLLSLKVNNMR DTNAAGLIVGILPIILLVDLFIRFTYQQTPAQIIKCYALLPLPRKVCIESFIGASLFNLG NAWWYVLFVPYCVMSVVFSYGLFATILLLAFIGLAILLNSQLYVLFRTLVSQSLLWILGL LAVYSLLAIPMIFGGNGSFASSLFIYSHLGIALSQGEIWPLLIVLSLLIAVVWLNREVQF HAIWRELTSEKAATKVNGKDRLAFLRQHGEIGEYAHLEWKLISRNKFPRETFIYSFFLIA 
LLSVLITFTQAYNGDGMVRFWCSYNFVILGLTLLSRLLSMEGNYIDMLMVHPNSIYSLLR AKYYVYCGLLIVPFMLMLPMVFVGKASLLMLLSFALFAAGFQFLVFFQSAVYNKRTQKLN VSLMSSRNVERNYLYIFVCLLALLTPVIFVGTLSMFCSPTVCYLCLSVVGLIGIATHRLW IRNVYVRMMARRHENLEGFRTSK >gi|283510597|gb|ACQH01000022.1| GENE 18 24643 - 26391 1997 582 aa, chain + ## HITS:1 COG:BS_yrvE_1 KEGG:ns NR:ns ## COG: BS_yrvE_1 COG0608 # Protein_GI_number: 16079816 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-specific exonuclease # Organism: Bacillus subtilis # 15 521 15 514 560 355 36.0 1e-97 MHFKWIYEPLSPQMQSAAKELGEKLNMSTILAQLLIRRGITTESAAKRFFRPQLNDIINP FLMKDMDVAVDRLNDAMGRKERIMVYGDYDVDGCTSVALVYKFLQQFYSNIEYYIPDRYE EGYGVSRKGIDYAHQSGVKLVIILDCGIKAIEEIAYAKSLGIDFIICDHHVPDEEMPPAV AILNPKRPDDHYPFKHLSGCGVGFKFMQAFAKNNGIPFSRLIPMLDLCAVSIAADLVQVE EENRILAFHGLKQLNQNPSLGLKAIIDICGLSGKEISMSDIVFKIGPRINASGRMESGKE SVSLLVEKDYNVALQSAKHINEYNEQRKDIDKQMTEEANQIVARLESQKKQSSIVLYDEN WKKGVIGIVASRLTEIYFRPTVVITRDGDFATGSARSVMGFDIYSAIKSCRDLLINFGGH TYAAGLTMRWDDVPRFVERFHQYVDNHILPEQTEAILHIDEIIDFKDITKKMHHDLKKFS PFGPGNLKPRFCTNNVYDYGTSKVVGREQEHIKLELIDSKSSNVVNGIAFGQSASARYIK SKRSFDIAYTIEDNVFRHNNVQLQIEDIRLTESDENDEHNAL >gi|283510597|gb|ACQH01000022.1| GENE 19 26398 - 28329 1722 643 aa, chain + ## HITS:1 COG:recQ KEGG:ns NR:ns ## COG: recQ COG0514 # Protein_GI_number: 16131672 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Escherichia coli K12 # 15 360 16 361 610 297 42.0 5e-80 MEETNGTQTAHDAFKRTLREYWGYPDFRGIQRDIIESIAQGKDTLGLMPTGGGKSITFQV PALVMSGVCVVITPLIALMKDQVDHLRQKGIQAAAIYSGMSRREIITTLENCVFGGVKLL YVSPERLFSDIFKTKFKHMEVSFVTVDEAHCISQWGYDFRPSYLAIAEVRKLKPDVAVLA LTATATPKVVDDIQERLNFRQKNVFRMSFERSNLAYIVRDTMDKHTELIHILNAVAGSAI VYVRSRKHASDIASHISSANISATFYHAGLEPAVKNQRQNSWQQNEVRVMVATNAFGMGI DKPDVRLVVHMDCPDSIEAYFQEAGRAGRDGKKAYAVLLWNNSDRRKLNKRVAETFPEKE YIKEVYEDLAYYYQIGVASGAGYSFVFEIDRFCRTFKHFPVQTHSALQILERAGYIHYEM DPEARARVMFKLGRNDLYRLDEGSAFEEAVITALLRNYGGLFSNYVYIDESLLAQEAELT 
THQVYVILKNLAQRNIINFIPQRKTPFITYLRDRVDGERVVLSKEIYEERKAQYVARINA MQAYATNGDVCRSRQLLIYFGEERHKDCQQCDVCLEHASPEPSNEQTTTARNAILNLLKD GERHHITELHKLNLPQKGLDVALEYLIHEEEVRLDGSFLYILE >gi|283510597|gb|ACQH01000022.1| GENE 20 30220 - 30687 292 155 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927680|ref|ZP_06421527.1| ## NR: gi|288927680|ref|ZP_06421527.1| hypothetical protein HMPREF0670_00421 [Prevotella sp. oral taxon 317 str. F0108] # 1 155 1 155 155 250 100.0 2e-65 MKTTFSKTSKLIALTMMLLASFTVFTACGSDDDDESNTVKIDGQETKVVSMTAQETDEDA IIVLGIESIEKNERPLETLNLSLSTKDYGKTVHCESSGITSSLFDNTAKLDRGSQFTVSK NEDGVYKITFQLAQTLGKKTRTVAGKYEGKPNTAK >gi|283510597|gb|ACQH01000022.1| GENE 21 31198 - 33738 2676 846 aa, chain - ## HITS:1 COG:BH3106 KEGG:ns NR:ns ## COG: BH3106 COG1193 # Protein_GI_number: 15615668 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Bacillus halodurans # 26 846 23 784 785 382 32.0 1e-105 MIYPNNFEQKIGFNEVRTLLKGKCLSTLGTTKVDDMTMSHNADEIQTWLTQTREFRLLQT QAEDFPLQYFFDLRPTLSRLRIEGTHLDEQELFDLLRSLDTIHQVVRFLNRTTEGQVAIE GQNARFPYPSLQLLTVDVATFPDIVRRITQIVDKFGKIKDSASPELARIRRELAQTEGSV SRILNSILRNAQQEGLVEKDVTPAIREGRLVIPVLPGMKRKIGGIVHDESASGRTVYIEP TEVVEANNKIRELEADEKKEIVRILKEVSNDIRPHIAPMTASYQLLAHIDFIRAKAELAR TFKAIEPQISNKPLIDWGQAVHPLLRLSLEKQGKRVVPLDVTLTPQKHILIISGPNAGGK SVCLKTVGLLQYMVQCGLSVPLGDTSRVGVFQNVMIDIGDEQSIENDLSTYSSHLLNMKM MMRHANDKTLILIDEFGTGTEPLIGGAIAESVLTQFCNKGAFGVITTHYQNLKHFADTHD GVANGAMLYDRKEMQALFMLSIGQPGSSFAIEIARKIGLPEEVIQDASRIVGEDYIQSDK YLQDIVRDKRYWENKRATIHKQEKELQAAIERYEKDIEEIGKTRKDVLKRAKEQAEELLR ESNKRIETTIREIKEAQAEKERTKRIREELSEFRTSVEQVDAAANDEFIAKKIEQIKRRK ERHEKHIKEKAEKQKKAAEQIREAVSKKQSPTGEIKPGDAVKMKGLSTVGRVESLDNNGM ALVVFGGMRSKMDVKRLELATPAEKEQAQQYTMSKGTRQTIDERKSHFHQDLDIRGERGD DALNAVMHFIDDAILVGMPRVRILHGKGNGILRQLIRQYLATVPNVTSFKDEHVQFGGAG ITVVDL >gi|283510597|gb|ACQH01000022.1| GENE 22 33933 - 34322 528 129 aa, chain - ## HITS:1 COG:ECs4156 KEGG:ns NR:ns ## COG: ECs4156 COG1970 # Protein_GI_number: 15833410 # Func_class: M Cell 
wall/membrane/envelope biogenesis # Function: Large-conductance mechanosensitive channel # Organism: Escherichia coli O157:H7 # 5 129 4 132 136 142 62.0 1e-34 MSKFLNEFKDFAMRGNVLDLAVGVIIGGAFGKIVSSVVDDLLMPVIGMLIGGLDFKGLSI TIGEAKIMYGNFIQNVIDFTIIAFCIFLLIKGINSLSRKKEEPAAPEAPAEPSNEEKLLS EIRDLLKNK >gi|283510597|gb|ACQH01000022.1| GENE 23 34519 - 35538 1259 339 aa, chain + ## HITS:1 COG:VC2000 KEGG:ns NR:ns ## COG: VC2000 COG0057 # Protein_GI_number: 15642002 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Vibrio cholerae # 2 336 3 331 331 471 73.0 1e-133 MIKIGINGFGRIGRFVFRASLEEANAKEVQVVAINDLCPVDYLAYMLKYDTMHGIFNGTI EADVEKSQLIVNGNVIRVTAERNPADLKWGEVGAEYVVESTGLFLSKEKSQGHIDAGAKY VVMSAPSKDDTPMFVTGVNDKTYVKGTQFVSNASCTTNCLAPIAKVLNDKWGITDGLMTT VHSTTATQKTVDGPSMKDWRGGRAASGNIIPSSTGAAKAVGKVIPELNGKLTGMSMRVPT LDVSVVDLTVNLAKPAKYEEICAAMKEASEGELKGVLGYTEDAVVSSDFLGDPRTSIFDA KAGIALTDTFVKVVSWYDNEIGYSHKVVELIKHMAKVNG >gi|283510597|gb|ACQH01000022.1| GENE 24 35621 - 36571 630 316 aa, chain + ## HITS:1 COG:TP0637 KEGG:ns NR:ns ## COG: TP0637 COG0324 # Protein_GI_number: 15639624 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Treponema pallidum # 2 287 23 306 316 221 43.0 1e-57 MITILGPTACGKTALAVNLAAKTGGEIISADSRQVYRGMDIGTGKDLSEYNVDGKQIAYH LINIEDAGQKYNLFRFQEDFNAAYEDLTHRGVLPILCGGTGLYMEAVLKGYALSPVPQDD NLRKKLSARTLTELKELLVWLKARNGSVMHNETDVDTVSRAIRAIEIEFHNLRRPVDTRR LPAVSSLIVGLDVGRDVRRERITARLKARLEEGMVEEVRGLLEKNGIAKEDLMYYGLEYK YVTAFVVGEMSYDEMFKQLEIAIHQFAKRQMTWFRGMERRGFNIHWLNASMPMADKLAQI ERWMEQEDETIYGKRR >gi|283510597|gb|ACQH01000022.1| GENE 25 36576 - 37508 556 310 aa, chain + ## HITS:1 COG:TM0358 KEGG:ns NR:ns ## COG: TM0358 COG1597 # Protein_GI_number: 15643126 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic 
diacylglycerol kinase # Organism: Thermotoga maritima # 6 273 2 271 304 91 28.0 2e-18 MAANKKWAILYCPKKGMYSRGVRWSRVERSLMDNGIDFDYVQSENANSVERLITMFIENG YQTIVVVGGDSALNDAVNCLMRADKETRDGIALGVIPNGLMNDFAHFWGMKDGDVDSIVK WLKKRRVRKIDLGLIRYINKEGKNCRRYFLNCVNVGLIATIMNLRMQTRRLFGSRTLSFL FSFVLLAFQRLEYKMRLKINEEVLNRKIMTICVGNALGYGQTPNAVPYNGLLDVSVVYHP KMMKLFEGIYLFLRGKFLNHRSVHPYRTRMVEFMDANHALVGIDGRPMNTPVGEFKILVE QEVINFLIPD >gi|283510597|gb|ACQH01000022.1| GENE 26 37505 - 38209 492 234 aa, chain - ## HITS:1 COG:slr0449 KEGG:ns NR:ns ## COG: slr0449 COG0664 # Protein_GI_number: 16332256 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Synechocystis # 29 234 26 233 238 72 24.0 9e-13 MTAKILVSREEVIGIISEHWAPLSKAQQELLADNVRIVTFTKNDIIYRGGESPHEVMCLV SGKVKVYKDGVNGRSQIIRAIKAVEFFGYRAFFAGEEYKTSAMAIDTCVVAFIPIKLVVK FIHENTAVSSFFIRHLARLLGSSDERTINLTQKHIRGRLAETLLFLKDSYGVEEDGYTLN IYLSREDIASMSNMTTSNAIRTLSSFAAENMVAIDGRKIRIMQDEELRKVSKLG >gi|283510597|gb|ACQH01000022.1| GENE 27 38464 - 40410 1434 648 aa, chain + ## HITS:1 COG:MA0905 KEGG:ns NR:ns ## COG: MA0905 COG3408 # Protein_GI_number: 20089784 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen debranching enzyme # Organism: Methanosarcina acetivorans str.C2A # 1 637 22 669 680 296 33.0 1e-79 MGYIRFEKSLMTNLEETLPKELLRTNRSGAYACSSILDCNTRKYHGLLVLPIPEIDDDNH VLLSALDVSVVQHGAVFNLGVHKYRGNTYSPNGHKYIREFNCDKVPTTLYRVGGVLLRKE MVFQHFEDRILIRYTLEDAHSATTLRFKPFLAFRSVREYTHENTVASREYHEVENGIKTC MYAGYPDLFMQFSKKNTFVFEPNWYRGLEYPKEQERGYDANEDLYVPGYFEMNIKKGESI VFSASIKEFTTKGLCQLFEDEVNERSPRDNFLHCLINAAHQFHISEGDGDEYILAGYPWF KCRARDTFVALPGLTLAIEEESYFEAVMRTAERGLREFMDDKPLTVKIAEIEQPDVLLWA IWAIQQYGRETGKERCIEKYGQLVKDIIIYIRSNKHPNLTLDDNGLIRTNGKDKAVTWMN STINGKPAVPRSGYIVEFNALWYNALKFAATIATDMGEPHETENLEEMAERCKKAFVDTF LNEYGYLYDFVDGNMVDWSVRPNMIFAVALDYSPLELNQKKSVLDVCTRELLTPKGLRTL SPKSGGYNPMYVGSQLQRDYAYHQGTAWPWLGGFYMEACLKLYKRTRLSFIERQMVGYED 
EMFYHCLGTIPELFDGNPPFHGRGAISFAMNVGEILRTLELLEKYSYQ >gi|283510597|gb|ACQH01000022.1| GENE 28 40429 - 41697 1310 422 aa, chain + ## HITS:1 COG:Ta0340 KEGG:ns NR:ns ## COG: Ta0340 COG0438 # Protein_GI_number: 16081471 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Thermoplasma acidophilum # 1 416 20 386 388 216 31.0 6e-56 MKVLMFGWEYPPHILGGLGTASFGITEGLHAQGDMEIAFCLPKPWGDEDKNAANIVAMNC VPVAWRDVNYDYLKERLGNVMSPEYYYELRDHIYADFNYMNLNDLGCVEFSGRYPDNLHE EINNYSIVAGVVARQQSFDIIHAHDWLTYPAGIHAKQVSGKPLCIHVHATDFDRSRGKVN PTVYSIEKDGMDNADCIMCVSELTRQTVINQYHQDPRKCFTVHNAVYPLAQELQDIPRPN HDNKEKVVTFLGRITMQKGPEYFVEAATMVLHRTRNVRFCMAGSGDMMDQMIYLAANRGI ADRFHFPGFMRGKQVYECLKASDVYVMPSVSEPFGISPLEAMQCGTPSIISKQSGCAEIL TNCIKVDYWDIHAMADAIYSICHNDSLFHYLQEEGKNEVDQITWEKVGLWIRTLYERTIN KN >gi|283510597|gb|ACQH01000022.1| GENE 29 41722 - 43137 1490 471 aa, chain + ## HITS:1 COG:MA4052 KEGG:ns NR:ns ## COG: MA4052 COG1449 # Protein_GI_number: 20092845 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-amylase/alpha-mannosidase # Organism: Methanosarcina acetivorans str.C2A # 1 387 1 390 396 256 36.0 5e-68 MKTICLYFEIHQIIHLKRYRFFDIGTDHYYYDDFENERTITDIAERSYMPALNALLDMIK ANDQYFKVAFSLSGVGIEQLEMHAPQVLDKLQELNDTGCVEFLAEPYSHGLASLINEDSF EAEVKKQCQKIKEYFGQVPKVLRNSSLIYNDDIGLKASHMGFKGMLTEGAKHVLGWKSPH YLYHCNMAPNLKLLLRDIRLSDDVSLRFNNSEWDEYPLFADRYVGKIAALPEEEQVINIF MELSALGIAQPLSSNILEFLKAIPVCAKQAGINFSTPSEICAKMKSVAALDVPYALSWID EERDVSSWLGNPMQREAFNKLYSVADRVRIANDSRINQDWDYLQASNNFRFMTTKNTGVA IDRGIYSSPFDAFTNYMNILGDFINRVNSLYPEDIGNEELNGLLTTIKNQGDEIEMKEKE IVRLQTKIAKIEAEDDKLRAMLDKGEVKPKAKRATAKKVAQKQKASKRAIG >gi|283510597|gb|ACQH01000022.1| GENE 30 43403 - 44329 888 308 aa, chain + ## HITS:1 COG:PM1346 KEGG:ns NR:ns ## COG: PM1346 COG0583 # Protein_GI_number: 15603211 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Pasteurella multocida # 1 272 1 271 298 184 34.0 2e-46 MTLQQLEYVMAVFRFRHFAKAAEYCGVTQPTLSSMVQKLEDELGLKIFDRKSQPIAPTST 
GRLVVEQAWKVLLRARKLKETVEEEKHSLLGTFTIGILPTIAPYLIPRFFPQLMNKYPDM DVRIVEMKTEDLKRALVRGDVDAGILAQLEGLDEFDSLPLFYEQFFAYVADGDPLFSKES IKTADLTGEYLWLLDEGHCFRDQLVKFCHLKAASAAKKAYTLGSIETFMRIVESGKGITF IPELAVHQLDETHKRLVRPFAIPVPTREIVMITGKNFIRKTLRSLLVEEIQASVPKDMLT LRQTQKKI >gi|283510597|gb|ACQH01000022.1| GENE 31 44459 - 45493 961 344 aa, chain - ## HITS:1 COG:no KEGG:BF3435 NR:ns ## KEGG: BF3435 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 344 1 338 338 334 49.0 3e-90 MKKNFFKSMLALSCAALFALAFTACDNKVDTPVDETKNKLHEDPVKVTVQLVQGHMHANW LYVDTEGGFHQDSESKAKYLKRIQEITYEVKPGKGWTLAEGSADKFYVVKAHDYSTEKGF ENPAPVYLLFIKYYNAKGEYINGQFVKNGQDAIHQHFFTVENVKELMSGRDVSSKPNTPD YIAYKYTDTTPWDKTVKYDGAKLTGTTNPIGLKGVMKFLKDDVTMDLRIRLYHGFQGKKD PRTGKFSPYYRFSPMQLQQGTWDINFTVPVVVYADREDYIGEDTFKEDMDIDNFPENGFK EDVSNKLIHRIMKALNLSWTEALRDYHKSIFTTAPHDLGKGIWL >gi|283510597|gb|ACQH01000022.1| GENE 32 45539 - 47701 1887 720 aa, chain - ## HITS:1 COG:PA0781 KEGG:ns NR:ns ## COG: PA0781 COG1629 # Protein_GI_number: 15595978 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Pseudomonas aeruginosa # 79 623 63 600 687 84 23.0 6e-16 MGKRTFLCVALVANIALWATAKVQSTAQPNLLSQQTDTTTNINKKSNKSVFGKEHNIKEV VVTGVRQQASINSVSDRIGENLIDRSMGKSLASILEHVSGVSSIQTGTTVAKPVINGMYG NRILIVNNGARQTGQQWGADHAPEIDQNSSGSIEVIKGAESVRYGSEALGGIIVMNQKAL PYGLSGVSGHLRTLYGSNGKRYSVVAQAEGTMPFNNSLAWRLQGTYANSGDQSAAKYLLN NTGYREHDFSASLGYKYNNLKLEGYYSMYNLKLGVMLSAQMGSEDLLKQRIELGQPVEVS PYTRHIDYPFQHVVHHTAIGKAFFDAGKYGNFQWQTAFQYDDRKENRIRRMNLSDIPAVS LFLTSFQNQLKWNLAYNRWNTEAGASYLHIRNRNQAGTGVVPLIPNYTEYDLGIYAIQKY RHENWTAEAGIRFDNQETHASGYDYTGKLYGGHHIFSNFSYNLGTSYRLNEQLKLTSNLG LAWRAPHVHELYSNGNELGSGMFVMGDSTMHSEQSTKWVTSLSYRSAFAEVRVDAYLQWI NGYIYDEPERGKYVTVISGSYPLFQYKQTDAFLRGVDFDVRLKPMQHLEYHLLSGLIWAN EKRTGNYLPYIPSARFDHDLTWEDIRVGKGNAWLQLKHRLVLKQTRFNPASDLIDFTPPT YNLFGFEAGIEWPLSDRNKLRVLLAADNLFNKEYKEYTNRSRYYAHDMGRDIRLSLGWFF >gi|283510597|gb|ACQH01000022.1| GENE 33 47908 - 48228 
154 106 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260911440|ref|ZP_05918030.1| ## NR: gi|260911440|ref|ZP_05918030.1| hypothetical protein HMPREF6745_1985 [Prevotella sp. oral taxon 472 str. F0295] # 14 106 1 93 93 150 81.0 3e-35 MNKNKHIRRFLAWMFIVTFMSVFVIKDFHAHEDHSSKVHLACDGTTKALKSSCFICDFVM HNAGAPILQVYQPAVFATILTPYIFTQQVVYRHVEAVNSHSPPTRA >gi|283510597|gb|ACQH01000022.1| GENE 34 49675 - 50571 844 298 aa, chain - ## HITS:1 COG:BH2855 KEGG:ns NR:ns ## COG: BH2855 COG2761 # Protein_GI_number: 15615418 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Predicted dithiol-disulfide isomerase involved in polyketide biosynthesis # Organism: Bacillus halodurans # 6 251 36 273 306 74 25.0 2e-13 MSKVIITNVTDPVCPWCWGEEPFFRKLETHFPNLINWRNVMGGLVEDMNKNKPADMDVRS YYNKENKDFIAHCLETADKHHMPIKAEGFNLFSETENSSFPLCIAFKAAQMADTEKADLF LYNLRAAVMAEARQAISEAELIAIADESGIDIAAFLDPLNDGSAEKAFWQDVEEAINLKV EVFPTFVFEYEGKKMALKSFRDYNTMAAMIKAVSGGKLLSQDVSFSPETLFELMETHPRL AAEEVKEAFDFADLQTMEDAIAPLLATGDLIKQPADNSYFVRKPAKGMACDLTTGICK >gi|283510597|gb|ACQH01000022.1| GENE 35 50733 - 51194 607 153 aa, chain + ## HITS:1 COG:no KEGG:BF1333 NR:ns ## KEGG: BF1333 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 153 1 157 157 73 36.0 2e-12 MKKLFMMVAACAVIASCNQTGKNAKAENPKDSVMAVDTTNAATGMTYSGTIPAADGPGIK YTVTLSGDSAKTFTMEEVYLQAKDGKDDVKTYTGNVEMIKKDVKGKAVTAYKLPMDKDNA LYLLVKDDATLSVVNDQLEEAASGNDYTLKLVK >gi|283510597|gb|ACQH01000022.1| GENE 36 51888 - 53060 1084 390 aa, chain - ## HITS:1 COG:ECs2857 KEGG:ns NR:ns ## COG: ECs2857 COG0451 # Protein_GI_number: 15832111 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Escherichia coli O157:H7 # 7 384 5 312 321 311 44.0 1e-84 MLHKDTKIYVAGHRGLVGSAIWKNLQTRGYHNLVGRTHSELDLTNQAQVRAFFDEEQPEA VVLAAAHVGGIMANSLYRADFIMLNMQMQCNVISESFRHNVKKLLFLGSTCIYPKNATQP 
IKEDELLTSPLEYTNEEYALAKISGLKMCESYNLQYGTNYIAVMPTNLYGPNDNFHLENS HVMPAMMRKIYLAKLINEDNWQAIRADLNKRPVEGVDGTAQEQRILEVLSKYGIADNAVQ LWGTGKPLREFLWSEDMADASVHVLLNVDFSDIIGIEKYSSVFYGAETNGQNDRNSNAGR GGAIPALGEIRNCHINVGTGKEITIKQLAQLIAQAVDFKGDIQFDSTKPDGTPRKLTDVT KLNNLGWKHKVEIADGVAKLFAWYQNDLKA >gi|283510597|gb|ACQH01000022.1| GENE 37 53088 - 54173 1196 361 aa, chain - ## HITS:1 COG:MA1173 KEGG:ns NR:ns ## COG: MA1173 COG1089 # Protein_GI_number: 20090039 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: GDP-D-mannose dehydratase # Organism: Methanosarcina acetivorans str.C2A # 4 347 6 344 345 484 65.0 1e-136 MERKVALITGITGQDGSYLAELLTEKGYDVHGLLRRSSSFNTGRIEHLYLDEWVRDMKKQ RPVNLHWADMTDSSSLIRIIGEVRPTEIYNLAAQSHVKVSFEVPEYTADVDATGVLRLLE AVRICGLEKQCRIYQASTSELYGKVQEVPQKETTPFYPRSPYSVAKLYGYWIVKNYRESY GMYCCNGILFNHESERRGETFVTRKITLAACRIKQGFQDKLYLGNLDAKRDWGYAKDYVE CMWMMLQQEQPEDFVIATGEMHTVREFCELAFKEVGIDIRWEGQGVNEKGIDVQTNRVLI EVDPKYFRPCEVDQLLGDPTRAKTKLGWSPTKTSFKQLIEKMVEHDMLFVKKLYIKAQMP E >gi|283510597|gb|ACQH01000022.1| GENE 38 54188 - 55252 916 354 aa, chain - ## HITS:1 COG:DRA0032 KEGG:ns NR:ns ## COG: DRA0032 COG0836 # Protein_GI_number: 15807702 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannose-1-phosphate guanylyltransferase # Organism: Deinococcus radiodurans # 10 351 17 357 372 261 39.0 1e-69 MSTNNFNHLVIMAGGVGSRFWPMSTEEKPKQFIDVLGTGRSLLQLTLDRFATVCPPSNVW IVTNKKYQSVVAEQLPEIDPSHILLEPCRRNTAPCIAYVSWRIKKECPGANVIVTPSDHV VTDVQEFRRVVNSCLKFTSETDAILTLGMKPTRPETGYGYIQADLSVNSARNTEVFRVDS FREKPDLPTATEYISKNNYFWNAGIFIWSVHTIVNAFRVYQPAMSKLFEGLMPYYGTPEE QAKIDETYPQCENISVDYAIMEKAEEIFVCPAEFGWSDLGTWGSLWEQSKRDLYGNACIG NNIKLFDSHNCIVHTTQEQRVVIQGLDDCIVAEQDGILLICKRSEEQRIKQFSE >gi|283510597|gb|ACQH01000022.1| GENE 39 55717 - 56589 910 290 aa, chain + ## HITS:1 COG:TM1451 KEGG:ns NR:ns ## COG: TM1451 COG0568 # Protein_GI_number: 15644200 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Thermotoga maritima # 16 289 121 391 
399 191 41.0 2e-48 MRQLKIQKSITNRSSEALDKYLVEIGRAPLISIDEEIELAQKIRKGGPEGEKAKDKLVTA NLRFVVSVAKQYQHQGLTLTDLIDEGNIGLIKAAQKFDETRGFKFISYAVWWIRQSILQA IAEQSRIVRLPLNQVGSLNKINHEINKFEQENQRHPSVSELSDATNIDEEKIGQSLMADG HHVSIDAPFQDGEDNCMLDVMASGDDSRTDKQADHESMALELNSVLAKVLKDREITIIRE CFGIGCHEKGLEEIGDQLGLTRERVRQIREKSIAKLRDSGNAKILMKYLG >gi|283510597|gb|ACQH01000022.1| GENE 40 56872 - 57090 69 72 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYQKSTQAEVIQSLFHKPKQSANGGLLLRFYNSVAVGRGLHALQPLAQMAVSARAFRTVY PRACLRYSCSFV >gi|283510597|gb|ACQH01000022.1| GENE 41 57293 - 58837 1731 514 aa, chain - ## HITS:1 COG:FN1444_2 KEGG:ns NR:ns ## COG: FN1444_2 COG0519 # Protein_GI_number: 19704776 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Fusobacterium nucleatum # 194 514 2 318 318 406 59.0 1e-113 MQQKIIILDFGSQTTQLIGRRVRELDTFCEILPYNKFPKDDPSVIGVILSGSPYSVHDKE AFKVDLNDFIGKYPVLGICYGAQFISHANGGKVEQTGTREYGRAHLQEICLDNPLFKGFE PNSQVWMSHGDTITAIPEGYRVIASTNDVKFAAYASEQNPVWAVQFHPEVFHTTQGTTLL KNFVVNICGSKQQWSAASFVESTVNELKAQLGDDRVILGLSGGVDSSVCAALLNRAIGDK LTCIFVDHGMLRKNEFQKVMETYKGLGLNVVGVDASDKFFADLKGVSDPEEKRKIIGRDF VEVFNAEAKKITDAKWLAQGTIYPDRIESLSITGMVIKSHHNVGGLPKEMHLQLCEPLQW LFKDEVRRVGLEMGMPQRLIKRHPFPGPGLAVRILGDITREKVRILQEADDIYIENMHNY ICEDGEELYDKIWQAGTVLLSSVRSVGVMGDERTYEHPVALRAVTSTDAMTADWAHLPYD FMAKVSNEIINKVKGVNRVCYDISSKPPSTIEWE >gi|283510597|gb|ACQH01000022.1| GENE 42 58911 - 59687 738 258 aa, chain - ## HITS:1 COG:CC3536 KEGG:ns NR:ns ## COG: CC3536 COG1208 # Protein_GI_number: 16127766 # Func_class: M Cell wall/membrane/envelope biogenesis; J Translation, ribosomal structure and biogenesis # Function: Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) # Organism: Caulobacter vibrioides # 4 112 8 117 242 94 44.0 3e-19 MMQAMIFAAGLGTRLKPLTDTIPKALVEVGGETLLQRTIERLKACDVKSMVINVHHFAQA 
IIDYLQANNNFGVEIAVSDESQHLLDTGGGLKKAAHLFSPTANILIHNVDIISNVNLQAF YAHQAQSDALLLVSQRETKRYLLFNDDMRLVGWTNVETGKVRSPYPTLNPQSCKKLAFAG IHSFSPKLFEKMEKYPARFGIIDFYLDQCAQNDIRGYEQQGLQLLDVGKLNTLERLPRPK AGGLAVSDADILQHLYAQ >gi|283510597|gb|ACQH01000022.1| GENE 43 59792 - 61333 1449 513 aa, chain - ## HITS:1 COG:SA0720 KEGG:ns NR:ns ## COG: SA0720 COG1660 # Protein_GI_number: 15926442 # Func_class: R General function prediction only # Function: Predicted P-loop-containing kinase # Organism: Staphylococcus aureus N315 # 378 507 175 299 303 74 34.0 4e-13 MEKLAELYKRWKGTAPAMMAQLPEAGSNRKYYRLTGEDGESVIGVVGNSRDENHTFIYLT RHFTERRLPVPRILAVDEDELRYLQTDLGSTSLFDVLRGGRESGGRYNMHEKQLLTAVIR ELPNIQIRGARGLDWQVCYPQPEFDVDSVLFDLNYFKYCFLKATELEFHELKLEANFRLF AKDLTSEKSESFLYRDFQARNVMLDSNGKPYFIDYQGGRKGPFYYDLASFLWQAAAKYPF KLRRELVYEYYNSLKNYTEVPSVRHFVERLSLFVLFRTLQVLGAYGFRGYFEHKKHFINS IPPAIDNLRELLKLKKFFPYPYMMDMLRRLTELPQFAHIEEPALSRADGFKTTPHTVYKA HPQDGPATFSKYDGKGPLVVNVYSFSYRKGIPEDTSGNGGGYVFDCRSTHNPGRYEPYKQ LTGLDEPVIRFLEDDGEILNFLDGVYALADHHVRRYIQRGFTSLMFCFGCTGGQHRSVYS AQHLAEHLHKKFGIEVHITHREQNISSVLQAKR >gi|283510597|gb|ACQH01000022.1| GENE 44 61362 - 61547 83 61 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKIRINWKRILVLLLVSAGLVYITKSFLMSAGIFLLLFAADSLLARYDEDRKIRKQREEH E >gi|283510597|gb|ACQH01000022.1| GENE 45 61881 - 63509 1644 542 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 [Haemophilus parasuis 29755] # 2 541 3 543 547 637 59 0.0 MAKEIKFDIEARDLLKNGVDKLANAVKVTLGPKGRNVVIEKKFGAPQITKDGVTVAKEVE LEDHFENTGAQLVKSVASKTGDDAGDGTTTATILTQAIVNEGLKNVTAGANPMDLKRGID KAVKVVVDYIRESAEQVDDNYDKIEQVAAVSANNDPEIGKLLADAMRKVSKDGVITIEES KSRETSIGIVEGMQFDRGYLSGYFVTDTDKMEAVMENPYILIYDKKISNLKDFLPILQPA AESGRPLLVIAEDVDSEALTTLVVNRLRGGLKICAVKAPGFGDRRKAMLEDIAVLTGGLV ISEEKGLKLEQATLEMMGTCDKVVVSKDNTTIVNGAGEKQNIADRVAQIKNEIANTTSSY DKEKLQERLAKLSGGVAVLYVGANSEVEMKEKKDRVDDALCATRAALEEGVVAGGSTTYI RALDALKGLKGDNADEQTGINIVERAIEEPLRQIVINAGGEGAVVVQKVREGKGDYGYNA RTDAFEDMRKAGIIDPAKVARVALENAASIAGMFLTTECLIVEKPSDAPAMPMGNPGMGG MM 
>gi|283510597|gb|ACQH01000022.1| GENE 46 63643 - 63912 488 89 aa, chain - ## HITS:1 COG:RC0969 KEGG:ns NR:ns ## COG: RC0969 COG0234 # Protein_GI_number: 15892892 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Co-chaperonin GroES (HSP10) # Organism: Rickettsia conorii # 1 88 5 98 99 85 53.0 2e-17 MNIQPLADRVLVLPAPAEEKVGGIIIPDTAKEKPLNGTIVAVGEGTKDEQMILKEGDNVL YGKYAGTELEYEGKKYLMMRQSDVLAIVK >gi|283510597|gb|ACQH01000022.1| GENE 47 64025 - 64270 80 81 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYVNDCMAIWFRSRGRQKADKRHHLFGIRMASSGSLHRFANIDDEIVPQGQSGGSLLALC SQTRNTNETDTKQSAMRKNVS >gi|283510597|gb|ACQH01000022.1| GENE 48 64356 - 65051 512 231 aa, chain - ## HITS:1 COG:XF1640 KEGG:ns NR:ns ## COG: XF1640 COG0666 # Protein_GI_number: 15838241 # Func_class: R General function prediction only # Function: FOG: Ankyrin repeat # Organism: Xylella fastidiosa 9a5c # 76 231 275 422 1058 63 33.0 2e-10 MKKMTLFFALLLLCMAGLGHAQNAVHKDYYRSAIKAIKNNNLAELAANMKYINNVDSFIP LDSYHSYSLLGYACLHKNKSAIQKLIAMKANIDEAFADEIFIYDALYMAINNEDEDLVKQ LLTMGADPNRPYNENGLCPLAASCNVNNVAIASLLLKHGAKANGLGNLGGDYIECPLIKA VIKGNKDMVLLLLKHGAKKGIKDEEGRTALYYAKQLKHTPIIQILEKAPGN >gi|283510597|gb|ACQH01000022.1| GENE 49 65091 - 67889 3217 932 aa, chain - ## HITS:1 COG:metH_2 KEGG:ns NR:ns ## COG: metH_2 COG1410 # Protein_GI_number: 16131845 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I, cobalamin-binding domain # Organism: Escherichia coli K12 # 331 925 1 586 901 666 56.0 0 MGVNGKGKVDKMKIQDIIKQRIMILDGALGTMIQDYNLEEKDFRNADLAEHKGQLKGNND VLNITRPDLILDIHRRYLAAGADLIETNTFSSQVVSQADYQLEHLSRPMALAGARLARKA ADEFSTPQWPRFVCGSVGPTNKTCSMSPDVSDAAARDITYDQLFDAYREQVGALIEGGVD AILIETIFDTLNAKAAIDATMTEMQAQGLELPIMLSMTVSDLAGRTLSGQTIEGFLASIS SYPIFSVGLNCSFGADQMKPFLKELASKAPYYISAYPNAGLPNTMGQYDETAESMSPQIA QFINEGLVNIIGGCCGTDDKFIASYAALAKGKTPHAVVNKPTTLWLSGLEMLNVTPEVNF VNVGERCNVAGSKRFLRLIKEGQYDEAISIARKQVTDGALVLDINMDDGLLDAEKEMTTF LNMIAAEPDIARVPIMIDSSKWDVIMAGLKCVQGKCIVNSISLKEGEEQFIEHARALKRF 
GAACVVMCFDEQGQATTFERRIEIAQRAYRILTQEVGMNPLDIIFDPNILAIATGIEEHD NYAYEFIRSIAWIRKNLPGTHISGGVSNLSFSFRGNNYIREAMHAVFLYHAINEGMDFGI VNPSSKVTYSDIPIDELEIIEDVILNRKPNAAEALIELANKKKEEEEQRKAGLANGDSSV LKQEEEQWRSMELDERLKYALRKGIGDHLDEDLHLALEHYPHAVDIIEGPLMAGMNEVGE LFGAGKMFLPQVVKTARTMKQAVAILQPYIEKEKKVGATKAGKVILATVKGDVHDIGKNI VAVVMACNNYEVIDLGVMVPAEQIIRKAIEEKADIIGLSGLITPSLEEMINVAQEMEKAG LDIPIMIGGATTSQLHVALKIAPVYGGPVVWMKDASQNSLAAARLLNKSEETAYVNELND KYESLRAGYQDKQQKLLPIEEARKNRLRLFDD >gi|283510597|gb|ACQH01000022.1| GENE 50 67957 - 68436 589 159 aa, chain - ## HITS:1 COG:TM0254 KEGG:ns NR:ns ## COG: TM0254 COG0691 # Protein_GI_number: 15644629 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: tmRNA-binding protein # Organism: Thermotoga maritima # 18 159 16 153 158 127 47.0 9e-30 MGKETDSIRKKSAVQIRNKKASFEYFFVETFMAGIVLTGTEIKSIRMGKASLVDSFCYIN NGEMWVKGMNISPYFYGSYANHEAKRDRKLLLTKREIRKLQEATKQVGFTIVPTLVFIDP HGRAKVDIALVRGKKEFDKRQTLKEKEDRRQMDRAIKRY >gi|283510597|gb|ACQH01000022.1| GENE 51 68497 - 69006 733 169 aa, chain - ## HITS:1 COG:no KEGG:PRU_0797 NR:ns ## KEGG: PRU_0797 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 4 168 3 167 167 184 55.0 1e-45 MPNKVTVSDAALRQAAEEGMDAFVDIFVDAINASVGGKLTAETMSQLNASQITLLAYRIL RDEVMDGGFIQLIHNGYGGFIFLNPFAMMVKQWGITELGRLISKAHSNYKKYHEEIEKDC TDEEFMSLFERLPIFDDFDDTFVEHEEEWTAAIAQYIDGHIEEFAEIVN >gi|283510597|gb|ACQH01000022.1| GENE 52 69867 - 70361 29 164 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRVFVILLLPLHRIWKIGRVIECAGLEIRYTPFGYRGFESLIFRKHFNKPKGMFHLAKQI RIAVRLTFYSDFCFLFASTPLLINTRKPNNSRIQQLNILQTQKAINSHTHQPTYFICSSV HQHTNSYQLTNSPNHQLTNSYQLMNSQTHQIINPKIHQIINSKT >gi|283510597|gb|ACQH01000022.1| GENE 53 70712 - 72181 1716 489 aa, chain - ## HITS:1 COG:XF0843 KEGG:ns NR:ns ## COG: XF0843 COG3538 # Protein_GI_number: 15837445 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Xylella fastidiosa 9a5c # 41 473 63 497 516 419 44.0 1e-117 
MKSFYLALALLALPTQQMLAQKNKVYGQEYIEAIADNTRLYVSKRPDVKDRLFRSEVIDK KIEEVKKLLKGNRYLAWMFENCFPNTLETTVHYRKLNGEDDTFVYTGDIPAMWLRDSGAQ VWPYVQFANKDPKLKAMIRGVILRQLRQINIDPYANAFNDGPTGEGHITDKTEMQPEVFE RKYEIDSLCYPIRLAYHYWQVTGDMSVFGTLWQEAIKKILATFKEQQRKDGQGKYRFERE STRQTETLGGDGYGSPVKPVGLIASCFRPSDDATTMLFLVPSNFMAVSAMNKAAEILAKV NKNQELSQSCKALSNEVAAALKKYAITGHPKYGKIYAFEVDGYGGKVLMDDANVPSLLAL GYMGDVPLNDPIYQNTRRFVWSEDNPWFFKGAAGEGIGGPHIGANYPWPMSIMLKAFTSQ NDEEIKQCIKMLMETDGNTGVMHESFNKDDAHKYTRAWFAWPNTLFGELILKLINDGKVD LLNSIQVKK >gi|283510597|gb|ACQH01000022.1| GENE 54 72251 - 73561 1120 436 aa, chain - ## HITS:1 COG:PM0115 KEGG:ns NR:ns ## COG: PM0115 COG0498 # Protein_GI_number: 15601980 # Func_class: E Amino acid transport and metabolism # Function: Threonine synthase # Organism: Pasteurella multocida # 1 431 1 423 424 354 44.0 2e-97 MRYYSTNGKTELVDLREAVVKGLAADNGLFMPERIAALPTTFYEQIDKLSFQDIAFTVAQ AFFGEDVEAAALRTIVNETLAFDCPVVPVQANRYALELFHGPTLAFKDVGARFMARLLSH FVGQDRKQAINVLVATSGDTGSAVANGFLGVEGVNVYVLYPKGKVSAIQESQFTTLGQNI TAIEVDGVFDDCQALVKQAFMDKELNEQLLLTSANSINVARFLPQAFYYFYAYAQMKRQG LHSELVFCVPSGNFGNITAALFAHRMGLPIKRFIAANNANDVFYQYLQTGKYRPKPSVQT IANAMDVGNPSNFARIYDLYGGNHHTISQLIAGFRFNDDEIRATIRDCFQQTGYVLDPHG ACGYKALQLSLSENEHGVFCETAHPAKFKETVDAVLPTPIDIPERLARFMQGRKQTYPMN NSFDKFKAFLLQNAPR >gi|283510597|gb|ACQH01000022.1| GENE 55 73732 - 74976 987 414 aa, chain - ## HITS:1 COG:MA0132 KEGG:ns NR:ns ## COG: MA0132 COG3635 # Protein_GI_number: 20089031 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted phosphoglycerate mutase, AP superfamily # Organism: Methanosarcina acetivorans str.C2A # 1 413 1 397 397 382 46.0 1e-106 MKHLIILGDGMADHPVERLGGKTLLQYANTPYMDLLARQGRTGMLQTIPEGFAPGSEVAN TAILGYDLSQVYEGRGPLEAASIGYDMQPTDLALRCNILTLTGGLIRNHHGGHLKTEESE PLIEALNKALSSDIVKFAMGIQYRHLLIIKGGNKHIDCTPPHDHPDEPWQPLLVKPQEGW EDIREMGRMTARETATLLNELIIKSQQVLSNHPLNKQRAAKGKLQANSIWPWGGGYRPAM KTLTQLYPNVQSGSVISAVDLIRGIGHYAGLRNIVVEGATGLYNTNYEGKAAAAIEALRH DDFLYLHIEASDEAGHDGDLELKLKTIEALDRRIVGPIYNKVKGWNEPICIAILPDHPTP 
VEVRTHVDEPVPFLIWHEGIAPDAVQTFDEQSCKSGSYGLLRPNQFMQTFMHIQ >gi|283510597|gb|ACQH01000022.1| GENE 56 74988 - 77423 2413 811 aa, chain - ## HITS:1 COG:MJ0571 KEGG:ns NR:ns ## COG: MJ0571 COG0527 # Protein_GI_number: 15668751 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Methanococcus jannaschii # 3 455 4 467 473 273 37.0 1e-72 MKVLKFGGTSVGSVKSILCLKRIVETEAKRQPVIVVVSALGGITDKLIATAQMAVGGNEK WRGEFDEMMTRHHQLIDTIISDTTAREQLFNTVDNLFEQLRSIYTGVFLIRDLSEKTQDA IVSYGERLSSNIVAALIRGSKWFDSREFIKTRRKNGKHVLDSELTNKLVVDTFSPLPRVS LVPGFISLDQDTGETTNLGRGGSDYTAAILAAALDAEVLEIWTDVDGFMTADPRIIKTAY TIKELSYTEAMELCNFGAKVIYPPTIYPVCVKNIPIRIKNTFNPKAEGTTIKQKVERTDK PIRGISSINDTALITVAGLSMVGVIGVNRRIFTVLANNGISVFLVSQASSENSTSIGVRE EDLAAAVEVLNKEFAKEIEMGAMFPMKAESGLATVAIVGENMRNTPGVAGKLFGTLGRSG ISVIACAQGAAETNISFVVSAAFLRKSLNVIHDSFFLSEYTELNLFICGIGTVGGKLIEQ IKSQAEELRMRSRLKLRVVGIASSKLMMFDRNGLDLDNYAELLKDAEICTPESLRNGILE MNIFNSVFVDCTASKDVADLYQTLLEHNVSIVTANKIAASSKYEKYERLKQTALTRGIKF KYETNVGAGLPIIGTINDLRNSGDHILKIEAVLSGTLNFIFNELSAAVPFSETVRRAKAQ GYSEPDPRIDLSGTDVVRKLVILAREAGYKAEQADVEKQLFVPDSFFEGSLEEFWENLPT LDSDFERQSQQLAKEEKRWRFVATMDHGKTCVGLQAVDKSHPFYNLEGSNNIVLLTTERY KQYPMQIQGYGAGADVTAAGVFANIMSIANI >gi|283510597|gb|ACQH01000022.1| GENE 57 77894 - 78964 740 356 aa, chain + ## HITS:1 COG:MA1854 KEGG:ns NR:ns ## COG: MA1854 COG1672 # Protein_GI_number: 20090704 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Methanosarcina acetivorans str.C2A # 5 353 7 385 390 97 25.0 5e-20 MKNPFKFGTIVEGEYFTDRDDELQRIKQLLNSDNHLILISPRRFGKSSLVKKAVTQVSRP CISLNLQMVVSVEGLAAMILKEVFKLHPWEKLKHLLSSFRVVPTISTTPTGETIDITFQA TTDATVLIEDALQLVEKVNEKGESVVVVFDEFQELMGLDKGIDKRLRAIIQTQQHVNYIF LGSQESMMTEIFERKRSPFYRFGVLMHLDRIPHDNFSQYILERLPADCATKVDVVEQILS TTRCHPYYTQQLAALVWDFLTYKKFGAAEVVEQAIATLVRTHDLDFERIWLNFNKTDRSV LMGLVSNVQLASNRRLPTSTVYSSVKRLMQSGYVIKLDTFEIEDPFFARWIKQRQG >gi|283510597|gb|ACQH01000022.1| GENE 58 79384 - 80235 596 283 aa, chain + ## HITS:1 COG:AGl448 KEGG:ns 
NR:ns ## COG: AGl448 COG2207 # Protein_GI_number: 15890331 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 11 111 184 285 295 83 39.0 6e-16 MKTITHNSYIQSINKASHYIATHLDETIDVKTLARVANLSDFHFCRIFKAIKGEPPIAFI ARLRIETAAQLLRYSSLAVEEIAYNIGYESPASLSKAFKSRYGITPTQYRTDKNRYIMKR ETINENLALKAPKLVTLEDKSLIYVSLMGEYGSLNYDDAYLRLWHVVKEQKLFTKGIETL CISHDDPKVTEANQQRSDICLAIHKPAKPQDGVSCKTLAGGKYAMFAYQGPYDNLAAVYD SAMRWLVNSEYELRDEPMFEKYLNNCQRTSAEKLKTEVYIPIR >gi|283510597|gb|ACQH01000022.1| GENE 59 80243 - 80725 175 160 aa, chain + ## HITS:1 COG:no KEGG:Coch_0541 NR:ns ## KEGG: Coch_0541 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 1 158 1 158 159 192 71.0 3e-48 MDRTKVRNAVGVALAAILLLLTLLGSGYFIFTLKVGFVQWLAFNACSPTSLVYLVCVTIF WLKGNTALLPFALLPMYYFGTMGLFTFTWSGANVFAQLSHITMTLNIAWAAYVLCSIGDY KATAKGLSWGIVVFVPYISFVMYYCRTHAAEIGRLLQMVE >gi|283510597|gb|ACQH01000022.1| GENE 60 80741 - 81226 216 161 aa, chain + ## HITS:1 COG:no KEGG:Coch_0542 NR:ns ## KEGG: Coch_0542 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 10 159 1 150 152 184 58.0 6e-46 MVNGKEMGTVQSRCKECYADNNRATPLLNPEDCLRHHRQYVCNSCGRCICADVDAKGRFR AKFPFKSLDIAILYLRAAEVVWQKSCQIYEIMNDKGRKEYKIFPSELELNDYLHNNKHKR CTSAQPLFVSPRYTACTSQQLRMLTKGEIEKYLKERYSQYK >gi|283510597|gb|ACQH01000022.1| GENE 61 81449 - 83896 1732 815 aa, chain - ## HITS:1 COG:no KEGG:PGN_0561 NR:ns ## KEGG: PGN_0561 # Name: prtT # Def: trypsin like proteinase PrtT # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 5 555 1 566 840 159 25.0 5e-37 MNKQMKKTLFIALTMLSTAVLSNANPITKGQALNIASKYINNPTLSKNTPVTRSAQANEQ PAYYIFTSSSDKKFVIISGESKLNELVGYGDNMSKNSDEQPPYFKKFLKDYENVVKAVRN SSKQAPTTQMPIKRKVEPLLTCKWSQYFPYNKYTLKVDGKTTPTGCVATATAQVMYHHKW PKNRPTGYVKKEGDEAWKSPTYWWGDMKDTSNKMFSERSRQAVGVLMRDIGKAVNMRYYH KGSDSNLQYACNALRDKFDYTVRFLDKDFLPANEFLKEVMQEISNGYPVLVCGGPHAFVY 
DGYDERGFLHTNWGWEGLNDGYFDINTVYLNVTGFALSGTFWDDMSVVFAHPNDGKAVPF KEIERGLDARTTTAFTISKTEATRTEKLSASIIKLGSYSPVKGKLGVFTGKVALALYNHK NERVKIFNSASDNLVWASIFTTMSFDLTNISFEGLPNGHYRLVPVFSEMLDEQTKKNGEW KPINHANEMEVELTTNAVRINTNNPQNVVTIEKRPSVLAPFYEDSGKMGAFSFTMYNPGR EEVRGDLVMTLKSLETNKVYNAYLLTPNVVAQRLGHTSFTINMQALYNNPGDFGKLKTGK YSVTLSIKVKKKNSEYIVPITMKEPFEIEVLPDPHQGTIEFNFVDFLVDGANANYSTFQL NKIKEIGLQVHARVAGYQIREGYRGRIYYRLLDLTENKWIDLGYANNVYLPCEKPNDASK TRVTFNASMLKPNHAYEIHLEIERGGKREDVWNPNAKRNVFYTVDDLKTTAIDSFGSDNK APKHIFNIEGVRQTQAWETLPAGIYIVDGKKVMKK >gi|283510597|gb|ACQH01000022.1| GENE 62 83944 - 84132 197 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927718|ref|ZP_06421565.1| ## NR: gi|288927718|ref|ZP_06421565.1| hypothetical protein HMPREF0670_00459 [Prevotella sp. oral taxon 317 str. F0108] # 1 62 1 62 62 108 100.0 1e-22 MLLLHHFLKGTTVNSAYETTRKEEVDKSMLALGVTGGFILLIRLPKQNKQHLAGFKHKPK HA >gi|283510597|gb|ACQH01000022.1| GENE 63 85235 - 85543 134 102 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVFSSLVCFISLNWLTLTFNYIGYSSYQPLFEEIAPSFLQQLLRLLNILGYNAVQLLYIV YEAIFMLRLPDVVPFGHGQHALLPRTMAVVSVHALYGLPIHV >gi|283510597|gb|ACQH01000022.1| GENE 64 85607 - 85933 236 108 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTIIKYIILSVVILAMLLFFFITSKREKNLTQRGEDIVEKIELFRKENHRLPKDLNEIGI LEEENSNALYYDIRNDTSYTVSFMMSIDYNRTYYSDTKQWEDGYREIK >gi|283510597|gb|ACQH01000022.1| GENE 65 86068 - 86931 863 287 aa, chain + ## HITS:1 COG:lin1491 KEGG:ns NR:ns ## COG: lin1491 COG0568 # Protein_GI_number: 16800559 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Listeria innocua # 15 285 102 373 374 213 42.0 3e-55 MRQLKITKSITNRESDSLDKYLQEIGHEELISVEEEVELAQRIKKGDRKALEKLTKANLR FVVSVAKQYQNQGLSLQDLINEGNVGLIKAAEKFDETRGFKFISYAVWWIRQSILQAIAE QSRIVRLPLNQVGSVNKINKILSKFEQENERRPSINEIAEKTDLPEDKIEDAIKVTGRHI SVDAPFVDGEDNSLLDLLANTDTPTVDNELVKESLRAEIADALQYLNERERNVIEAFFGI NQMEMTLEEIGDKYGLTRERVRQIKEKAIRRLRNNTKNKMLKTYLGQ >gi|283510597|gb|ACQH01000022.1| GENE 66 86935 - 87432 416 165 aa, chain + 
## HITS:1 COG:no KEGG:PRU_1022 NR:ns ## KEGG: PRU_1022 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 49 161 3 122 127 84 41.0 2e-15 MGVQAIQHLERRPLLLSKMKSKGGFSPYNKEMIQFKRYVLVAMILLAVGATASAKLVQTK VYMFGMAASFNDSTAYFTDVQEVDAWINDKGKFLYSRENYSYQLRDYLQSQGFANATCIT CFAFSRKKAEKKYATLLKKSAARGDANVRYLKESEFKYDAIVPEK >gi|283510597|gb|ACQH01000022.1| GENE 67 87570 - 88451 857 293 aa, chain + ## HITS:1 COG:no KEGG:PRU_2123 NR:ns ## KEGG: PRU_2123 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 291 3 282 284 172 34.0 2e-41 MTHYFNIAEHDISVVFEDATPIDGSLIQSLEPFRVESTPKDLLLQLIVTNDLEPFSPEDT ELIRDIDTGNGMTLVDILPSGGYQFHIRNLQGKLCCLLHTNKDFSLCRCKLEGNYSMRYF GLNNALMLTYAFAGSFRQTLLIHASLVRHENRGYAFTAKSGTGKSTHVSLWLRYIPNCDL MNDDNPIVRVIDGKPYIYGSPWSGKTPCYRNVKAELGAISRIDRAKVNSVERLRPIEAFA SLLPSCSTMKWDKEVFNNTCDTITKIIETTGIYTIHCLPNKDAALLCQQTIAK >gi|283510597|gb|ACQH01000022.1| GENE 68 88427 - 88912 118 161 aa, chain + ## HITS:1 COG:no KEGG:BT_0074 NR:ns ## KEGG: BT_0074 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 155 10 163 163 97 35.0 1e-19 MPTNHSEIRELQFENALFLPQVAKFLEEGHTVTIGLKGYSMRPFLEHNRDKALLSKPTTI QKGDAVLAEISPGVFVLHRIIKIDGDDITLRGDGNLAIEQCKRADVRGFVLGFYRKGRQT LDKTNSVKWRVYSALWTGLFPLRRYLLAFYTKIWMRLFGPI >gi|283510597|gb|ACQH01000022.1| GENE 69 88955 - 89224 423 89 aa, chain + ## HITS:1 COG:no KEGG:PRU_2121 NR:ns ## KEGG: PRU_2121 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 88 1 89 89 73 44.0 2e-12 MKKKVGFKLRSICGEQVIVAEGKENIDFSKIISMNETSAYLWEAVEGKEFTADTLAKLLT EQYDVQYNVAFNDCLELIVKWEEAGIIEQ >gi|283510597|gb|ACQH01000022.1| GENE 70 89495 - 90628 902 377 aa, chain + ## HITS:1 COG:no KEGG:PRU_1233 NR:ns ## KEGG: PRU_1233 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 3 354 25 360 370 269 41.0 1e-70 
MTLHIFNPDHDLVLAAGNDHFTPPKAARDLYADLGFLPALWAKAGDLVLVADAEAAYERL RHIKGRTKGVQLVDRNTLQRVLASGTITSNQLDIAPWGWDKNVKDLFAGIGVSEDVLPAD RWLEDVRTISSRQWMSQNILPELVSRLNNLYPSKFIGTSFVVQSMAQLHTLLQENALVVV KAPWSSSGRGIRYIENKPDPSTEGWCANTIEKQGCITFEPYYNKIRDFGMEFNANADGSI DYLGLSIFKTSKRTYTGSLLATEDTKLEYLSQYIDVSILKAVSETITTMLSALLLNKYVG PLGIDMMLVKQEGTNNLAIHPCVEINLRRTMGHVALSLSPSPLEPQRLMSIDHSRGAYHL RLHTLSDGLLNTSIVRL >gi|283510597|gb|ACQH01000022.1| GENE 71 91287 - 93320 2195 677 aa, chain + ## HITS:1 COG:BMEI0003 KEGG:ns NR:ns ## COG: BMEI0003 COG1158 # Protein_GI_number: 17986287 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Brucella melitensis # 309 676 52 420 421 456 61.0 1e-128 MYSKEELLSKKMSELEDIAKTLGAEYDGDNLEEMVYSILDKQAIDEGTKNPLGQKKKRTR IVKKDTDRVYSVKGKDGENFDVKNNKVTSSEQPSLFKEAEETATETQAQESEEPAKPKKR GRKSKKEKEAEAAAEAERLAQEAEERQKEEDGEESAPEQVAAEPDENTQAEPDDHIKAES QEIIDEQAETEPTNEVPVPEASFLPEATDMGENQDAPDSGLLAQLQAKINARNEGPNDRP DPIREGCWEGDPGDGTDFITVVDLPIEDQGALPNYDMFDNPTNGMPNAGYQQPEPMQPAP PAYDFSDLITADGVLEIMPDGYGFLRSSDYNYLSSPDDVYVSTQQVKKYGLKTGDVVQSR VRPPREGEKYFPLTSIDMINGRVPDEIRDRVPFEHLTPLFPDEKFNLCGNPATTNLSTRI VDLFSPIGKGQRALIVAQPKTGKTILMKNIANAIAANHPEAYLMMLLIDERPEEVTDMAR TVNAEVIASTFDEPAERHVKIAGIVLEKAKRLVESGHDVVIFLDSITRLARAYNTVSPAS GKVLTGGVDANALQKPKRFFGAARNVENGGSLTIIATALIDTGSKMDEVIFEEFKGTGNM ELQLDRSLSNKRIFPAVNLVASSTRRDDLLQDRTTLDRMWVLRKYISDMNPIEAMNSIHD RLNVTKDNDEFLLSMNA >gi|283510597|gb|ACQH01000022.1| GENE 72 93728 - 94141 345 137 aa, chain + ## HITS:1 COG:AGc425 KEGG:ns NR:ns ## COG: AGc425 COG2207 # Protein_GI_number: 15887598 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 34 130 245 338 365 71 34.0 5e-13 MVQLVETKGKKRDSQIFINPQLMGTLAKNICQEISVNQKYKDKTYSAKQLAKDLGTNTRY ISTVINRQFNTNYTSFVNKFRIQEAMKLLKDKRKAHLTIEDVSDAVGFSTRQSFYTSFKR FTGMTPREYRISPENDR >gi|283510597|gb|ACQH01000022.1| GENE 73 94210 - 96168 1998 652 aa, chain + ## HITS:1 COG:no KEGG:PRU_0989 NR:ns ## KEGG: 
PRU_0989 # Name: not_defined # Def: M49 family peptidase (EC:3.4.14.4) # Organism: P.ruminicola # Pathway: not_defined # 7 652 3 649 649 892 66.0 0 MKETGTFNYADERFADLQMLRYRLPQFEKLSLQQKKYIYYLSQATLCGRDITTDQFGKYN LRIRKTLEALYTGLDANADKDNYDKLALYLKRVWFSNGIYHHYANDKIQPDFSEAFFRAA VRQMPIDKLPMTGYNSVEELCDELCPVMFNPKILPKRVNKADGVDLVKTSACNYYEGVSQ QEVEDFYNEKKAAAGDNSPSWGLNTKLVKDDSGLEEKVWKENGEYGEAIRQIIYWLDKAK SVAENQQQQRVIDLLIKYYRTGDLHLFDEYSIEWLKEQAGNVDFINGFIEVYGDPLGIKA SWEGIVEYKDLEATRRTRLISDNAQWFEDHSPVDSRFKKAEVRGVTANVVCAAMLGGDEY PSTAIGINLPNADWIRAQHGSKSVTISNITDAYNKAAKGNGFKEEFVIDKETLDIVSRYG DICDELHTDLHECLGHGSGKLLPGVSPDALKAYGNTIEEARADLFGLYYMADDKLQELGL LPDKNAFRSQYYTYMMNGLMTQLTRIERGKDIEEAHMRNRALIAHWTLEHGKGAVELVKR NGKTYVQINNYEQLHRLFGELLAEVQRIKSEGDFNAARNLVENYAVKVDGELHAEVLERF AKLDIAPYKGFINPVYKPILNSEGEIMDVEVDYSEAYDAQMLRYSSEFGFLV >gi|283510597|gb|ACQH01000022.1| GENE 74 96309 - 96923 567 204 aa, chain + ## HITS:1 COG:no KEGG:BVU_2016 NR:ns ## KEGG: BVU_2016 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 4 201 2 199 222 166 43.0 4e-40 MNEDTHSKLMAIKQSFRLFMDGTTSRSMAQKGIGYKINWGVPFHELRKMAAPYAPNYELA IELWKENIRECKIMATLIMPPERMSPELADVWTEQPLQQEMAEMLAFNLLQHVDFAPALA YQWMAGDRMDRQICAYQLLARLFMKGCEPNERGLDEYLDQVSVALQSDNLGVRHAASASL QKLAMLGDSYEKRVDELLARLQAP >gi|283510597|gb|ACQH01000022.1| GENE 75 97040 - 97492 341 150 aa, chain + ## HITS:1 COG:CAC1682 KEGG:ns NR:ns ## COG: CAC1682 COG0735 # Protein_GI_number: 15894959 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Clostridium acetobutylicum # 15 149 12 151 151 72 33.0 4e-13 MKMKQDSKKQARKILDQYLETNNYRRTPERYAILDAVFSIKGHFSLDQLSEYIKNENFKV SRATLYNTLRLFIKLRLVVRHRLTDGTKYEARTKNDNHCHQVCTMCGAVTEVNLPEITTT LEQVRLKGFCSDGFALYLYGVCSACQKKIK >gi|283510597|gb|ACQH01000022.1| GENE 76 97530 - 98807 1137 425 aa, chain + ## HITS:1 COG:PM0938 KEGG:ns NR:ns ## COG: PM0938 COG0104 # Protein_GI_number: 15602803 # Func_class: F Nucleotide transport and metabolism # Function: 
Adenylosuccinate synthase # Organism: Pasteurella multocida # 4 422 2 425 432 379 45.0 1e-105 MKTGKVDVLLGLQWGDEGKGKVVDVLTPQYDVVARFQGGPNAGHTLEFNGQKFVLRSIPS GIFQGGKTNIIGNGVVLAPDLFMAEAKDLEKSGHDLKKCLHISKRAHLIMPTHRVLDAAY ETLKGDKKVGTTGKGIGPTYTDKVSRSGLRVGDIFEGFEEKYAQHKERHLNILRSMNYTD FDISEVEKTWMEGIEYMRQFNIIDSEFELNSLLNDGKAILCEGAQGTMLDVDFGSYPYVT SSNTITAGACAGLGIGPNKIGDVYGIMKAYCTRVGAGPFPTELFDETGKKLRDLGHEYGA VTGRERRCGWIDLVALRYSIMINGVTQLIMMKSDVLDTFETIKACVAYEKDGQRIDHFPF DIEHGITPVYEELKGWQTDMTQFTSEEQFPQAFKNYIDFLEKQLQTPIKIISIGPDRSQT IVRKK >gi|283510597|gb|ACQH01000022.1| GENE 77 98807 - 100177 1459 456 aa, chain + ## HITS:1 COG:APE0662 KEGG:ns NR:ns ## COG: APE0662 COG0124 # Protein_GI_number: 14600873 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Histidyl-tRNA synthetase # Organism: Aeropyrum pernix # 3 451 8 432 438 243 33.0 6e-64 MNKPSIPKGTRDFSPAEMAKRNYIFDTIKDVYALYGYQQIETPSMETLQTLMGKYGEEGD KLLFKILNSGDFIGKVPAEEFVSDNVLKLAAKICEKGLRYDLTVPFARYVVMHRDELQMP FKRYQIQPVWRADRPQKGRYREFYQCDADVVGSDSLLNEVELVQIIDTVFTKFGINVQIK LNNRKILAGIAEYIGQPDKIVDITVAIDKLDKIGVEAVNAEMLANGISQDAVDKLQPILT MSGTNVEKLETIAQTIATSEIGVKGVEETRFILEKIAAVGLKNELQLDLTLARGLNYYTG AIFEVKAKDVAIGSITGGGRYDNLTGIFGMPGLSGVGISFGADRIYDVLNTLDLYPQNAT QGTEVLFINFGETESDYCLPIASQVRAAGISVELYPDCAKMKKQMAYANAKGIPFVVLAG ESEISQGKVTLKNMLTGDQQLVSAEELIAKITSKNP >gi|283510597|gb|ACQH01000022.1| GENE 78 100190 - 100624 340 144 aa, chain + ## HITS:1 COG:FN2045 KEGG:ns NR:ns ## COG: FN2045 COG0735 # Protein_GI_number: 19705335 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Fusobacterium nucleatum # 4 134 9 137 142 109 43.0 1e-24 MEMDVYHKLLENGIRPSAQRLAIMNYLLTHFTHPTVDEVYQGLCNEIKTLSRTTVYNTLR MFAEKNLAQMITINEHHVCYDGCISPHKHFYCNHCGHVFDIFDDTSAELPQAMNIDGHQV LETQVYYRGICKKCLNNTAGQDSN >gi|283510597|gb|ACQH01000022.1| GENE 79 100887 - 101018 57 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNKIQRYNRLYILYSTLSGLFARKRAKSINYFTRTRVKVTRAR >gi|283510597|gb|ACQH01000022.1| GENE 80 
101705 - 102994 505 429 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927733|ref|ZP_06421580.1| ## NR: gi|288927733|ref|ZP_06421580.1| hypothetical protein HMPREF0670_00474 [Prevotella sp. oral taxon 317 str. F0108] # 1 429 1 429 429 743 100.0 0 MFLFVYDLTGWVSAYSIQLAIIPALIGAGAAIVGGLLSRSGAKKANEQNYDFQREMWNKQ VAQQDKVNAQQMAYQDKVNAENREWSTESNVRQRIEDAGYNPYLYNGQASANSTGMASST NLGNSITPGSVTAVNEEEGLSNSLNSVGSIMAQGVKTAQDAYTLSRGKAIDKQNDKVSGI KGGTESAQAQATLQASQNEARIKASTAILQDMEVAILQTQAMDKNGQPMVDESTGRPVTL AMQRERGQQQEVMYRVEKLKSDILNGEVDWENTSVDTLLKRYQLEKTNPEQLAILRQTFS NLKDTNLQIKAETKVLSTQVGLNNSQTRLNIQNVATQKRYAKLLGQQTLTEEEKTLFQKL QNHYGSAEKVAEIVQKARPHNFIEFANYLIDTWHDDLTGRRTSYGKIAIDVDSIANNWKN RRKGMAFKK >gi|283510597|gb|ACQH01000022.1| GENE 81 103055 - 103177 63 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKVFIKILKVLEIALPFLKTLVESLSKKEKKDNEDPTASV >gi|283510597|gb|ACQH01000022.1| GENE 82 103222 - 103671 264 149 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927735|ref|ZP_06421582.1| ## NR: gi|288927735|ref|ZP_06421582.1| hypothetical protein HMPREF0670_00476 [Prevotella sp. oral taxon 317 str. F0108] # 1 149 1 149 149 235 100.0 7e-61 MDIIDFYYPHLDSAQKKILLSETSPNTELPVLNAEISELQEVVFPTDPTTGLPINPVTKL LSGSITQLERDRILSFMQPMPSSKRNNLSDDDLIRMLPSRYNSTLVDMDAVRDWYEENIF QPLHEQELQQQQQQQQQQQQQQTGGQPVE >gi|283510597|gb|ACQH01000022.1| GENE 83 103689 - 105314 751 541 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927736|ref|ZP_06421583.1| ## NR: gi|288927736|ref|ZP_06421583.1| hypothetical protein HMPREF0670_00477 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 541 1 541 541 1097 100.0 0 MSLKKVPQIKPSRANRPRSAFDLSQKHLYTAPAGALLPVLSVDLMFHDHIRIQAQDFMRT MPMNSAAFISMRGVYEFFFVPYSQLWHPYDQFITSMNDYRSSVVSSAAGDKALDSVPNVK LADMYKFVRERTDKDIFGYPHSNNSCRLMDLLGYGKPITSSKTPVPLLYTGNVNLFRLLA YNKIYSDYYRNTTYEGVDVYSFNIDHKKGTFVPTADEFKKYLNLHYRNAPLDFYTNLRPT PLFTIGSDSFSSVLQLSDPTGSAGFSADGNSAKLNMASPDVLNVSAIRSAFALDKLLSIS MRAGKTYAEQIEAHFGVTVSEGRDGQVYYLGGFDSNVQVGDVTQTSGTTNPNVSEVGNAK LAGYLGKITGKGTGSGYGEIQFDAKEPGVLMCIYSVVPAMQYDCMRLDPFVAKQTRGDYF IPEFENLGMQPIVPAFVSLNRAKDNSYGWQPRYSEYKTAFDINHGQFANGEPLSYWSIAR ARGSDTLNTFNVAALKINPHWLDSVFAVNYNGTEVTDCMFGYAHFNIEKVSDMTEDGMPR V >gi|283510597|gb|ACQH01000022.1| GENE 84 105411 - 106943 85 510 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927737|ref|ZP_06421584.1| ## NR: gi|288927737|ref|ZP_06421584.1| hypothetical protein HMPREF0670_00478 [Prevotella sp. oral taxon 317 str. F0108] # 1 510 60 569 569 1025 100.0 0 MFTLTYNNEFIPRWERFLDNNDCPQLRPIGRCAELFPSCPLNYFDKVTGKWSIDLDTFLP KIENDEHTEVFASCCKKDIQNFLKRLRFNISKLYGKAESRKIRYYVASEYGPTTLRPHYH GIIFFDDASLLSEISSLIVRSWGFQRRVGGKRNSFIFQPFADISLTQQYVKLCDQNTAYY VAEYVSGNLGLPQVLAYKSTLPFHLCSKSPVIGCFKADYCEVLGRVHRGAYRVGREVFDE KSGQFMHYDIPLDRDLCSSLFRKCLGFSSLSFNEKLLRYSFYGQHFAEWYENANLAFIEW KYSTRLFSADFADFLRSERGWKYRSWLEVNYKYDYYYLEMDCDQTWYSSRNAWRVVRDFD MNCYLPWNDLYYTYVRLFDKFEVMRNSDQLIEFYKLFNDIVHECGFQQAMLAAYPLINDA VPVRTREKLLSECGMIYKEDAYLRSFFDEVAWFGDFYKFGILDYDKFSAYSFYRTQYFKN YLAQQQLRLLKRNKSKKLKNTFVFGSRQIS >gi|283510597|gb|ACQH01000022.1| GENE 85 107117 - 107359 236 80 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927738|ref|ZP_06421585.1| ## NR: gi|288927738|ref|ZP_06421585.1| hypothetical protein HMPREF0670_00479 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 80 1 80 80 128 100.0 1e-28 MRRFQYYSYRLFLRYLLESDFELISMLPTSNGFVISYDDRYVSELDKLFGSLKTLVDFEV IEFVDGSTYDHKTRIEYQVL >gi|283510597|gb|ACQH01000022.1| GENE 86 107583 - 109358 2488 591 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160887146|ref|ZP_02068149.1| hypothetical protein BACOVA_05162 [Bacteroides ovatus ATCC 8483] # 1 591 1 600 600 962 80 0.0 MSNLKNVQPLSDFNWQEFENGSPVEVSKDALEKAYDETLNKVSEHQVVDGTVISIDKKEV IVNIGYKSDGVIPASEFRYNPDLKVGDTVEVYIENQEDKKGQLILSHKKARLSKSWERVN AALENEEVIQGYIKCRTKGGMIVDVFGIEAFLPGSQIDVHPIRDYDVFVGKTMEFKVVKI NQEFRNVVVSHKALIEEELEAQKKEIISKLEKGQILEGTVKNITTYGVFVDLGGVDGLIH ITDLSWGRVSDPHEVVSLDQKINVVILDFDDEKRRIALGLKQLTPHPWDALDQSLNVGDH VKGKVVVIADYGAFVEIAPGVEGLIHVSEMSWSQHLRSAQDFMHVGDEVEAVILTLDRDE RKMSLGIKQLKDDPWETIEVKYPVGSKHTSKVRNFTNFGIFVELEEGVDGLIHISDLSWT KKVKHPSEFTQVGAEIEVVVLEIDKENRRLSLGHKQLEENPWDTYETLYTPGSVHRGKIS EMMDKGAVITLNEGGEGFATPKHLVKEDGTQAQQGEELDFKVIEFVKDTKRIILSHSRTF EEGKDDVKPARKQHAAKKSESSVQINNVAAGTTLGDIDVLADLKAKLEKGK >gi|283510597|gb|ACQH01000022.1| GENE 87 110817 - 111599 629 260 aa, chain - ## HITS:1 COG:CAC0908 KEGG:ns NR:ns ## COG: CAC0908 COG1179 # Protein_GI_number: 15894195 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 # Organism: Clostridium acetobutylicum # 5 254 6 249 251 226 46.0 2e-59 MENQFSRTEMLVGKDAVERLKGSKVAVFGVGGVGGYAVEVLARSGVGCIDVFDADTVNIT NLNRQVIALHSTLEQAKVDAIEKRIYDINPACVVGKYKMFYLPENADEVDLKQYDYVVDC IDTITAKIELIRRCKLFGRPIISSMGAANKMDATAFRVADISKTTMDPLAKVMRKKLRAL GIAGVKVVFSEEQPLKPLGQNNSECGDNVTTNENLADDEGAHKPQGKKHVPASNAFVPAA AGIVVGAEVVKDIIGWGNPV >gi|283510597|gb|ACQH01000022.1| GENE 88 111655 - 111987 423 110 aa, chain - ## HITS:1 COG:no KEGG:PAU_00474 NR:ns ## KEGG: PAU_00474 # Name: not_defined # Def: hypothetical protein # Organism: P.asymbiotica # Pathway: not_defined # 10 109 2 105 110 65 42.0 5e-10 MRTVHLERENENQSLIAINKDDEVEDSFCYKLFELGEFDVAKIQELVSYVSLNVLNNNEK NVLKWIIQGVNSCFAYHKDANDYYVIKNYSKKIEDKWGMEWEVVLNAALV 
>gi|283510597|gb|ACQH01000022.1| GENE 89 112118 - 114466 1517 782 aa, chain - ## HITS:1 COG:no KEGG:BVU_1144 NR:ns ## KEGG: BVU_1144 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 99 777 96 766 772 171 24.0 1e-40 MKQFSLITIIALLLLTQAASALAYVVKGTVVDATGKPLYEALVLGRNKVGTKVIEIETNK LGQFVSAEINDSTLSIEITKDTFVPLQINISGTSSQFVDLGTITLFERQVTLGDVVVTAE SVMQKAGRYIIIPTKKELGQSANGLSLLNALQYKMPGLAVNEILQTVQVDNATPVFKVNG KPCELSKILSLNPKNVLRIEYSDTPDLRYDGRSVINIILNPSQNGGSVMANVLSGVTTGF LNGNVGIDYHHGKSEWELNYAANWRDYNKREISSKGDFIGRDEPVSRERNGMPSDFNYLS NELSLAYTYLHNPNTMMSTKVGIGFEDQKISDNSWNTQSYKGKVSRYTNHTRWKLGFVSP NFDLFFRKQLNKTQHIEANVYGRRSSGTYDRDYMNVYDTPLNNDTLMSSTENKSWRVGAD IMYSKTFKALRTSVGIQNYYNATDNMQLENGVLKQATIDQNRITLYAQIVGNAKQLSYGL NLSGIYNYADNNAYKTNAVRMKVNVNANYAFSPNWSLNYLFLLNPSLPGISQQSDLIQVI DDISIRQGNLSLKPSTYLRNRVYLRFSKAKFTSTLWVSHNKTFNPIYYSYSYISEVSNPY YDKFLSKAINGRSTNQLNLELELSAKELFGFATLWGNVGWDSYQVPLPTQTHSCKRFYAS LSAAMYFGKWVLSAKHEIEPKFELQGNTYYSNERWNMIKVQYQCKKWHFTIMGANLFTRR GSKYERITVSDVHPEYHIQSIRDNANMLVLGATYRLNFGKGKDKANRTLHNDGLEKGVDV FY >gi|283510597|gb|ACQH01000022.1| GENE 90 114681 - 116426 2156 581 aa, chain - ## HITS:1 COG:CAC2337 KEGG:ns NR:ns ## COG: CAC2337 COG1109 # Protein_GI_number: 15895604 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Clostridium acetobutylicum # 14 556 6 552 575 462 45.0 1e-130 MENNAQLIAQCEAKAKQWLTSEFDEQTRNEVKGLMENADKTGLIDAFYKDLEFGTGGLRG IMGAGSNRMNIYTVGMATQGLANYLKKNFANLPQIKVTVCHDCRNNSRLFAETVANIFSA NGIKVYLFDDMRPTPECSFAIRHFGCQSGVNITASHNPKEYNGYKAYWEDGAQVLAPHDK GIIDEVNKVKVADVKFKGNPELIEIIGEEVDKIYLDMVKTISIDPAVIERQKDLKIVYTP LHGTGMMLIPRSLKLWGFENVHCVKEQMVRSGDFPTVVSPNPENGEALTLALRDAKEIDA DIVMASDPDADRVGMACKNDKGEWVLINGNQTCLLFLYYIITNRIKTGKMKPNDFIVKTI VTTEVIKKIADKNHIEMRDCYTGFKWIANEIRKSEGKQQYIGGGEESYGFMAQDFVRDKD AVSACSLLAEICAYAKDQGKTLYELLMDIYLEYGFSKEFTINVVKPGKSGADEIKAMMEK FRNNPPKEIAGSKVVIAKDYKTLEQTDANGVVTKLHMPETSNVLQWFCEDGTKVSVRPSG TEPKIKFYLEVKGEMKCAGCYERCNKEADEKIEEIKKSLGL 
>gi|283510597|gb|ACQH01000022.1| GENE 91 116659 - 116985 308 108 aa, chain + ## HITS:1 COG:no KEGG:PRU_2631 NR:ns ## KEGG: PRU_2631 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 3 97 1 95 95 125 69.0 6e-28 MELYFTGIIIAVSTFLIIGIFHPIVIKVEYYWGTRQWWVFLILGIVAVLVALMVANVIVS SILGVIGASFLWAIGELFEQKKRVERGWFPMNPKRKHAYNLEQQKKAA >gi|283510597|gb|ACQH01000022.1| GENE 92 116988 - 117458 443 156 aa, chain + ## HITS:1 COG:BS_yydA KEGG:ns NR:ns ## COG: BS_yydA COG1576 # Protein_GI_number: 16081075 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 154 1 158 159 111 38.0 4e-25 MKVELLLVGKTVNSVFVAGIADYAKRIGHYIPFNINVIPELKNTKNISANQQKEAEGELI LKKLQPDDHLVLLDEHGKELRSIEFAQWLERKQHVARRLVFAVGGPYGFSEAVYARANEK ISLSKMTFSHQMVRLIFTEQLYRACTILKGEPYHHE >gi|283510597|gb|ACQH01000022.1| GENE 93 117886 - 121224 3250 1112 aa, chain + ## HITS:1 COG:no KEGG:PRU_1810 NR:ns ## KEGG: PRU_1810 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 1105 1 1117 1121 1725 70.0 0 MKQYKLVDNVLGWLAFFIAAFVYCSTIEPTASFWDCPEFITTGYKLEIGHPPGAPFFMLT ANLFSQFASDVTQVAKMVNIMSALLSATTILFLFWTITHLTRKLIVKDWGSMTLSKLVAI EGSGLVGALIYTFSDTFWFSAVEGEVYAYSSAFTAIVFWLILKWEDHADEPHSDRWLVLI MYMTGLSIGVHLLNLLCIPAIVLVFVYRKFPNIEVKGSLLALLVAFVLVATVLYGVIPGI ITVGGWFELFFVNTLGMPFNTGEVVYIVLLVGAIIWAIYESYSQKSEKRENISFIVAFGL LGVPFYGYGWSAFIIGAIVLGILYYVLSYKRKNEAKKLVPLITSRIKNTVLLCLLMLMIG YSSYAVIVIRSLANPPMDQNSPEDIFTLGSYLSRDQYGDNPLLYGQAFTSQVQLDREGDM CKPRMKEGAPIYQRKEKASKDEKDSYFVVSRKNKYVYAQNMLFPRMWSPLHAQSYNDWLG GIDGSEVPYDRCGETMMIKMPSQMDNIRFFLSYQCNFMYWRYFMWNFAGRQNDMQGNGEL EHGNWISGISFLDNARLGDQSKLPDDLKDNKGHNVYYCLPLLLGLIGLFWQAFRGQRGIR QFWVVFFLFFMTGLAIVIYLNQTPSQPRERDYAYAASFYAFAIWCGLGVAAIIDLLKKRI KVEGTIVSAVVAALCLFIPIQMASQNWDDHDRSHRDTCRDFGQNYLMTLQDEGNPIIFTN GDNDTFPLWYNQEVEGVRTDARVCNLSYLQTDWYIDQMKRPAYNSPSVPITWPRLEFCSG TNEYVEVVPQAKQQLLDFYKNYPNEAKAKYGDEPFELKNILKNWVRSKDKDAHFIPTDTV 
YVTIDKAAVRKSGMMMASDSIPSRMVISLKGKNALYKGDLMMLEMIAQSNWVRPIYVAST VGQENYMNLGDNFVQEGLANRITPFTTNAPGAKNFDTQKTYNNVMNRYKYGGLSRKGIYI DETVMRMCYTHRRIIAQLALHLISEGDKQKANKVLQKADKELPAYNIPYNYMNGGLDIAR AYALLGQTAKAKEVANAVWTNAQQYLNWYLTLDGSRFANSMNDCMYQIYVLRQAASVMGM ADKQTAAQQEKKLNMLFKQYQEKGGAMPMEDE >gi|283510597|gb|ACQH01000022.1| GENE 94 121353 - 121964 466 203 aa, chain + ## HITS:1 COG:alr1793 KEGG:ns NR:ns ## COG: alr1793 COG0726 # Protein_GI_number: 17229285 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Nostoc sp. PCC 7120 # 20 201 94 281 290 114 35.0 8e-26 MIIEQPAIWLRWLYPHALWRMDKNDRSVYLTFDDGPIPESTPFILETLAKFGARATFFMV GENVERHPELYQRIVDAGHRVGNHTYNHMGGAKHTIKTYTNNAKRADELIHSNLFRPPHG WMRPSQYAWLSRTYKVVMWDLVTRDYSKWMTAEDVLNNVKRYARNGSIITFHDSLKSIDK LHFALPQALEWLKEQGYEFKTFE >gi|283510597|gb|ACQH01000022.1| GENE 95 121942 - 123000 651 352 aa, chain + ## HITS:1 COG:PA4950 KEGG:ns NR:ns ## COG: PA4950 COG1600 # Protein_GI_number: 15600143 # Func_class: C Energy production and conversion # Function: Uncharacterized Fe-S protein # Organism: Pseudomonas aeruginosa # 4 323 17 319 361 196 35.0 4e-50 MSLKLSNDIKAEAKRLGFFACGIAKAAPVEPDTAADVVRWLDNACFAEMAYMNNYTDKRL NPQLLMPGLKSIVCVAMSYAPAQRMPEGQYQLASYAYGQDYHEVVKGKLRQLAAHFGFEP YIEEGVSHSTSVADSLPRYRIFCDSAPILERYWAVKAGLGWTGRNHQLIIPKAGSMFVLG ELFLDIELNYDEEMPNRCGNCHKCIDVCPTKALGTWQDMANKKTFLFDARRCLSYQTIEY RGDLNPEVAQAMGDTIYGCDICQTICPWNAHPIATNVEELAPKAELLNMTREQWHNLTED DYRRLFKGSAVKRAKYAGLMRNIKAVEQCQNQQKQEENDVHEPTINDKEKKQ >gi|283510597|gb|ACQH01000022.1| GENE 96 122997 - 123713 541 238 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927749|ref|ZP_06421596.1| ## NR: gi|288927749|ref|ZP_06421596.1| hypothetical protein HMPREF0670_00490 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 238 1 238 238 449 100.0 1e-125 MRDKQNLINTQKTLIHAFFDGFMNGVTDANDLKQDESQTLIKTLLNVYEKISEHYAVVML PILGVVHHESLEQLQCKLDALRQQGDTDVTAFFKVVCGEEGTYGAMVDDYKANFEVLLKG STLHPQHLTLADESPTSTYTKTNEEQNIRLLVRTILRAYRCGLQTKSSLQQFKQPTVIRM VLDNVDLLVNGKYTIAQHMDKATDIHGLFLAVLHTQERYNVVQDELQIEMERLIKGDN >gi|283510597|gb|ACQH01000022.1| GENE 97 123765 - 124409 528 214 aa, chain - ## HITS:1 COG:no KEGG:PRU_1212 NR:ns ## KEGG: PRU_1212 # Name: not_defined # Def: TetR family transcriptional regulator # Organism: P.ruminicola # Pathway: not_defined # 1 208 1 207 207 140 36.0 2e-32 MEQQDLLSSYRKDLKERILVTSMRAFKERGIRSVRMDDIATTLGISKRTIYEIYSNKEEL LLDGIMREDKARKLDSQLYAQDKSHTVMDLIFHIYNREMADMASTNPLFFSELSKYSRVV RYLTNKTEKNHKEALDFVKRGIEEGYFVRFFDYNVVLNFISFSSKLMMGERMFQKYDVNH LFTNVAVLFLRGFCTPKGVEFIDSKLMKETPYNK >gi|283510597|gb|ACQH01000022.1| GENE 98 124620 - 126920 2522 766 aa, chain + ## HITS:1 COG:maeB_1 KEGG:ns NR:ns ## COG: maeB_1 COG0281 # Protein_GI_number: 16130388 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Escherichia coli K12 # 6 433 6 434 434 511 60.0 1e-144 MVKVTKEAALSYHQSGRPGKIEVRPTKPYHTQTDLSLAYSPGVAYPCLEIQGNPDDVYKY TDKGNLVAVISNGTAVLGLGDIGAMSGKPVMEGKGLLFKIYGGIDVFDIELAEKDPEKFC ETVERIAPTFGGINLEDIKAPECFYIEERLKRTLDIPVMHDDQHGTAIISAAGLKNALEV AGKDIKKVRIVVNGAGAAAISCTKLYLALGAQRENVVMLDSKGVITTDRENLNDQKALFA TERRDIRTLEEAIKGADVFVGLSKGNILTQDMVGSMADRPIVFALANPVPEISYEDAMAS RPDVLISTGRSDYPNQINNVIGFPYIFRGALDVHATAINEEMKMAAVHAIADLAKQPVPD VVNDIYRVNDLTFGPLYFIPKPVDPRLITEVSAAVAKAAMESGVARKSITNWDAYKNSLM QLLGQETVFTRKLHDTARLQPQRVVYAEGGHPSMMKAAVQAKAEGICFPILLGNPDRINR QAQRLKLDLSGIKIIDMRADKEQGRRATYAKHLAEKLARKGYTFQEAYDKMYERNYFGMM MVERGDADAFITGLYTKYSNTIKVATEVVGVREGFHTFGTMHVVNSQKGTYFIADTLINR HPDNNFLVDIARLSANTVEFFNEKPVMAMISYSNFGTDEIGSPLKVHEAVEELHANYPNL VVDGEMQVNFALNREMRDEKYPFSKLKDKDVNTLVFPNLSSANASYKFLQALNPKTEIIG PIQMGLNRPIHFTDFEASVRDIVNVTAVAVVDACVDKMKRAGKTKR >gi|283510597|gb|ACQH01000022.1| GENE 99 128413 - 129168 755 251 aa, chain - ## HITS:1 COG:no KEGG:PRU_1803 NR:ns ## KEGG: PRU_1803 # 
Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 27 250 30 256 258 129 34.0 1e-28 MKQRILLIASAVLVFGALFTSSRAYAGSHKKKVTTQQNLKYFKRIVMSTPYDVHFVQGES NTVKLVGPQAEVSSVVLRVSGETLYIERAARRSRMFFTSSDDVDIYITSPDLTQLEIKGS GDFKAARRVDTDQLTVSINGSGDIDFKDIICDKLVASINGSGDIEFGFVECVNAEASLRG SGDIDFKRLKADKMQFSVKGSGDIGANLNDAGNVNCEVFGSGTIKLAGVAKNLNKNIRGS GNVETHRLQLR >gi|283510597|gb|ACQH01000022.1| GENE 100 129447 - 130238 633 263 aa, chain + ## HITS:1 COG:VC0803 KEGG:ns NR:ns ## COG: VC0803 COG0566 # Protein_GI_number: 15640821 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Vibrio cholerae # 3 253 15 255 257 164 40.0 1e-40 MTISKAKIKYIRALETKKHRDAEGVFVAEGPKVVGELLAKTPAKLLVATPQWRASNAVQE GTELITVSNDELHKLSFMQAPQQVLAVFPKLYEGESTTAGNVETNELTLMLDGIQDPGNL GTIIRLADWFGIRHVVCSTDTVDVYNPKVVQATMGSIARVKVSYTPLETLLDALPPSFPV YGTLLDGDNIYTQQLCAYGIIVMGNEGKGLSQGVRKRVTHRLLIPSFAIGEERAESLNVA NATAIVCAEFRRQGAIGKGEKVV >gi|283510597|gb|ACQH01000022.1| GENE 101 130537 - 131433 750 298 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927755|ref|ZP_06421602.1| ## NR: gi|288927755|ref|ZP_06421602.1| hypothetical protein HMPREF0670_00496 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 298 1 298 298 505 100.0 1e-141 METELIKDRSLSSCIKTANNLLGINFAKTIKGTWLPALLFAIMTALFGFFALQEISATSL GSPSPTILSYVRSIGTWVVSLGLWAYFFASVITLVSESGLKQNLRRSLAIVGVELIVYLI LMAIGIVVVRMVVMSHLGKPFTANFITLVCGVTVAWLLLMALIMVPFKYAEMRYLLSPNG SLRRNLIAYYVSGVRGGALLIGTTFFTVLASVCLGTIIFLPTFILLGARTSSLLGELTMN DPSGLPTYFNALLIGSIVLTTFIFMMVVAWSIFVFYYVYGSIEHRRQMKKQKQAATAE >gi|283510597|gb|ACQH01000022.1| GENE 102 131524 - 132021 512 165 aa, chain - ## HITS:1 COG:no KEGG:BVU_0830 NR:ns ## KEGG: BVU_0830 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 165 1 163 163 108 37.0 1e-22 MKKLFLVAAFAMVSAFASAQFAVGVHTLYGTEVSNLGIGVRARYDINEQFRLDGNFNYYF KKNGLEFWDINANLHYLFNITDRFSAYPLGGLGLVTASSTIEVRDPFTGKVLSSTSESST KLGFNFGGGVDFALTDDLYLNGEVKYQIISGYNQAVMSAGIVYKF >gi|283510597|gb|ACQH01000022.1| GENE 103 132466 - 134943 2400 825 aa, chain + ## HITS:1 COG:CAC0492 KEGG:ns NR:ns ## COG: CAC0492 COG0787 # Protein_GI_number: 15893783 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Clostridium acetobutylicum # 457 825 1 376 386 198 32.0 4e-50 MTYPIERVATLIGAQRLGNAEAHISFVLTDSRSLCFPEETLFFALRSNRNDGHNYISELY RRGVRCFVVEEVPARAAELFGDANFLCVRSTRQALQRLAQRHREAFNIPIVGITGSNGKT MVKEWLYQLLSPQMVVTRSPRSYNSQIGVPLSVWLLNSHSQLGIFEAGISQKGEMQALHD IIKPNIAVLTNIGSAHEENFATPEEKCREKLRLFEGAEVMVYNADDDLVTRVLNETNNKG ERLAWSTKSPNAAMFVTKIDKQPTHTTIHYIYKGGQPASYSLPFIDDASVCNSVVCATIC LHVGLDAHTIDKRMRRLEPVAMRLEVKEGLNGCTLINDSYNSDINSLDIALDFMNRRPDH NGQQRTLVLSDILQSGLPDAELYKEVSSLAEQRGVQKFIGIGPKICANKESIRLAERHFF ASVEDFIRSDVFKGLRNEVVLLKGARSFGFDQLTELMVKKVHETTLEVNLNAMIDNLNYY RAMMHPNTKLVCMIKADGYGAGAVEIAKTLQDHRADYLAVAVADEGATLRRNGISSNIMI MNPEMTAFKTMFDYDLEPEVYSFRLLEALIKAAEKEGITGFPVHIKLDTGMHRLGFDPEK DVPRLIERLHNQNALIPKSVFSHFVGADADQFDNFSQRQFECFDKGSKMLQAAFEHHILR HIDNSAGIEHFPQRQLDMCRLGLGLYGINPRNNAIINNVSTLRTTILQLHKVKAGETVGY SRRGTIEHDSVIAAIPIGYADGLNRRLGNRNGHCLVNNQRAEYVGNICMDVAMIDVTGID CNEGDSVEIFGDNLPVTVLSDALQTIPYEVLTGISNRVNRVYFQD >gi|283510597|gb|ACQH01000022.1| GENE 104 135297 - 137312 
1871 671 aa, chain + ## HITS:1 COG:BMEI0657 KEGG:ns NR:ns ## COG: BMEI0657 COG4206 # Protein_GI_number: 17986940 # Func_class: H Coenzyme transport and metabolism # Function: Outer membrane cobalamin receptor protein # Organism: Brucella melitensis # 439 667 391 595 599 61 25.0 4e-09 MQSVHFFSACAALFFAASAHGQDTKRQLALDSVQHVREVVVVSKNTFREVIPSQKLDGAV LERLNAHSVADALRYFSGLQIKDYGGVGGLKTVNIRSMGTNHLGIFYDGIELGNAQNGQI DLGQLSLDNVEEITLYNVQKSAIFQPASDFGNAGSVYIRTRTPRFDVNKPFNLRFKAGCA SSDTYRLSGLWENRLSQAVNTSLSVGYLTSSGKYRFRYLRHNQDHSVAYDTTAIRQNGDI WALRAEANLHGVIDHGMWNWKAYTYNSQRGIPGAIVNNVWRRGERQSDHNHFSQLRFQKS FSDAFSTQWLAKYAYYHTHYINNDPTQLPVDNTYKQQEIYLSTANVLELMPNWSVSLSYD FKWNKLDSDARLFVFPHRFSNFISLATAIDYRHLKLQASVLGTFVKDHTRTMVDKPSHSV LSPALFVNLYPFSSKAFSLRAFAKKSFRMPTFNELFYTEVGNALLKPETAVQYNVGAVWE KTYKASLIRGFRLQVDGYFNIVSDKIVAYPKGQQFRWTMLNLGKVHIKGIDLQAETTMQP SPELSFTGRLQYTYQQARDVTNPSDSYYKHQIPYIPWHSGSAIVGANYKQWQLNYSFIYA GKRYNQQENIVYNYMPAWYTSDLSLVYAFNWQKLRCKLTVEVNNLLAQDYDVILNYPMPK RNYAISIDITL >gi|283510597|gb|ACQH01000022.1| GENE 105 137327 - 138487 1087 386 aa, chain + ## HITS:1 COG:no KEGG:PRU_2507 NR:ns ## KEGG: PRU_2507 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 46 386 1 342 346 418 60.0 1e-115 MKRFSIFYVACLILAMVSSCREDFYIIPSQNQDTGVAPTRGDIVGMYVLNEGNMGSNKAS IDFLDLDENKPTVHYHRNIYSERNPNVVKELGDVGNDIKIYGSKLWIVVNVSNKVEVATA DSCKRITQINIPNCRYLAFKDGFAYVSSYVGPVKLDKDAPLGMVYKVDTVDFKKKDSVVV GYQPEELCIVENKLYVANSGGYRMPNYDNTLSEIDLTTFKEIRKIKVGLNLHHCQVDHYG QIWVTSRGNYNDVPSRIYWLYKGHNQLYEVIDSIDTPVSGLSIVGDSLYYYGSAWNSATA TNTISYGLINVRTHQTVETNLFSALQIKDITMPYGIIVNPTERDFYLMDAKNYVSSGSLL HFKPDGTHDFTVQTGDLPGHATFVYK >gi|283510597|gb|ACQH01000022.1| GENE 106 138694 - 140457 1668 587 aa, chain + ## HITS:1 COG:no KEGG:PRU_2508 NR:ns ## KEGG: PRU_2508 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 303 1 332 372 298 48.0 4e-79 MKKALLFVVALTLIGAKATAQQYFKQGTPSQYIYKVVDYSPAPGQFVNTMPAYKKGDDAA 
KMAQKCTDMLANNKRDLITLGAFGGSVTFHFDHSVANIAGKKDLCIEGNAFSGNSEPGIV MVSKDVNRNGIADDPWYELRGSADDEKPNRVVYGYEVTYTSAPMQDIPWTDNKGGSGKVE RNKYHAQEYYPLWMPTKITYKGSLLPKNATQKPGTIFWTLREFAYGYVDNKPNTDKDANS FDIDWAVDANRKKVKLDFIDFVKVYCAEQQMAGWLGETSTEVAGAEDLHLEESVAAINKA LEGKVATFDDVDVSLNADGYYLGTGNKDNGYDSQYLSGNYRFTVTNMPEYKAWNNFCISN RTATNFKDLFPDQFNSCVGHGYDNSANYCVAFFFGKSAPIEVLSKPEGDVVRGLYVTNNA YTLSCILHGDNMSKGATGKAEFEKGDWLLLTIWGTKADGSETKVEVYLADYRSSNSAEHY YLGNWQWVDLSSLGAVKELRFSITGSRHNKYGLTTPSYVCLDNINGTNDGKSGKVYLTTG IDNLPTVDSDKREVARYTIDGKRINTPVKGINIVKYADGTTRKIVVK >gi|283510597|gb|ACQH01000022.1| GENE 107 141386 - 142618 694 410 aa, chain + ## HITS:1 COG:BS_ripX KEGG:ns NR:ns ## COG: BS_ripX COG4974 # Protein_GI_number: 16079408 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Bacillus subtilis # 328 394 226 292 296 61 46.0 4e-09 MKTIFRAVFYLRSNYVNKEGKTPVMLRIYLNNERLSIGSTGIAVQQSQWDSEKERLRGRT TEVLSTNLELDNIQSGLQTIFKKLEMTDAISLERIKSEYLGKKEEVETMMTLFDKHNKDI AKQVGISVSAATFQKYNVCKRHFMTFLQDKYKRSDIRLSELTYIIIHDFDIYLRTVVGQN PNTATKTMKTFKTITILGRKMGVIHHDPFLNHRFHLEPVNRGFLTDEEILKIANKNLGIQ RLELVRDLFVFSCFTGLAYIDVANLTPENIVTLDDKQWIMTKRQKTSVATNVLLLDIPKN IIEKYSGKTYRDGKLFPMLTNQRTNSYLKEIADICGIKKDLTFHMARHTFATMSLSKGVS MESVSKMLGHTNIKTTQIYARITNKKIEHDMEQLAGKLDKFKVAMGINSK >gi|283510597|gb|ACQH01000022.1| GENE 108 142644 - 143111 483 155 aa, chain + ## HITS:1 COG:no KEGG:PG0816 NR:ns ## KEGG: PG0816 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis # Pathway: not_defined # 2 155 3 156 157 237 81.0 1e-61 MTTTKKELSYFRLKLENYLSEHFPEMLSDKPFITARADEALISYCDAVAQGFSHPEAESM ASEVLHRGLHFSKYDTLVSVLENEFEKELPSPLPERLSPILLKNKAVQSVFDKYELSDDF GASPEYEKLYTELTGTIVLLIEVNGLPTIGGENIT >gi|283510597|gb|ACQH01000022.1| GENE 109 143280 - 143627 238 115 aa, chain + ## HITS:1 COG:no KEGG:PG0814 NR:ns ## KEGG: PG0814 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis # Pathway: not_defined # 1 115 1 115 115 116 62.0 2e-25 
MNTIIMNQDVHTPAFVKADASNKTDNIKQQPTSPKQVRHFGWTRFIELTILLSIVIGAIW LTSKIVTPQVVTVVSVIAGFLILRFIVRVILKVTITLLSILFWMAILCAILLCVL >gi|283510597|gb|ACQH01000022.1| GENE 110 144013 - 144459 424 148 aa, chain + ## HITS:1 COG:no KEGG:BF2919 NR:ns ## KEGG: BF2919 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 32 145 4 108 109 111 51.0 7e-24 MNRIENRRKPKGYSSSSHYRINPTVEKSEQVLAIKFVQWDVPSLECLCNSKVYLLRMKLY RGKCMSREDELLRAWLEKKNWLCEAVNSNTYFRTAVPLQGYRFDFFDVLKKYLVNQYGQW TEYYAPDRTSLRAYLYGRINQIVEIPKY >gi|283510597|gb|ACQH01000022.1| GENE 111 144482 - 144676 210 64 aa, chain + ## HITS:1 COG:no KEGG:BF0106 NR:ns ## KEGG: BF0106 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 59 1 59 138 77 62.0 2e-13 MKGTEHFTRTIAEYLNQRAMTDPLFAPNLMKPNKNIEECITYILNEVQKSGCNGFDDDDE LLRA >gi|283510597|gb|ACQH01000022.1| GENE 112 144695 - 144928 198 77 aa, chain + ## HITS:1 COG:no KEGG:BF0106 NR:ns ## KEGG: BF0106 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 77 64 136 138 80 60.0 2e-14 MAVHYYDEDDIEVGKAVSCQVAVNHVVELTEEEKAEARQEAIKQYQREELAKIQSRNARV KKTENAATQVQPSLFDF >gi|283510597|gb|ACQH01000022.1| GENE 113 144933 - 145199 57 88 aa, chain + ## HITS:1 COG:no KEGG:PGN_0050 NR:ns ## KEGG: PGN_0050 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 1 51 1 51 424 89 82.0 3e-17 MKPRNKFEKAVLEQSKHLRSITKTQIRWTFHECIDHFAYHLLKGRTTCMDCRQCTLSNKS AISSYNGSQLKEDIFLHAAKEIKTSQTT >gi|283510597|gb|ACQH01000022.1| GENE 114 145574 - 147421 1176 615 aa, chain - ## HITS:1 COG:no KEGG:PRU_2073 NR:ns ## KEGG: PRU_2073 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 614 1 615 615 761 63.0 0 MKKNIINILSTGVVLVSSILFSSCGPDFLDTTPSNRVSTNEAFTSANKANAIVNGAYERF YYEFIGSEARAWDQFTEIMDLDVNWVSGNAPLLLGSATASSHQFAYWWKAFYEEIYRTSN VIQNIDQVPDMSDNIKARNKAECHFLRAWAYYRANCLYRGVPIYTEPVEYGKATKARSTE 
EEVWKQVIDDCTAAIDEPNLPNKYAISSSDYGRVTKAAAYYLRGQVYLWQKEWAKASADF KIITTMGYELFGDYKAMFKAANEKNNEYIFQYQYTDDDKLGNYFSQCYGNRVTVDNAWNN YLPNPMLIDSYEWANGKPFNIDEMIPGYSSMEPKARSVFYLRDSLNVNATGTKLDKAYKA GYNQMVTYGSDMTKYLEYGNEARIKAIYEGRDPRLLMNIITPYSEYKGGCTPDCFTYTLR WPYFGADAEYPYDLRTDTNDRYYYLWRKFVIEGRESRTIWDSDIDIPIFRYAGALLSLAE ALNEEGKTTEAIKYVNQVRARIPGLALLNSNSWTTVTGQDNMRERIRKEMLWELAGENQM YWKELRWGTWADKKFGTNRLTESDHTKYNGANGMTEIWGTKRYTLNYNGDYELKWAIPLS EIERNPNLKQNDGWK >gi|283510597|gb|ACQH01000022.1| GENE 115 147445 - 150825 1568 1126 aa, chain - ## HITS:1 COG:no KEGG:PRU_2074 NR:ns ## KEGG: PRU_2074 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 62 1126 27 1091 1091 1461 66.0 0 MEKYLEGNGIITNYLPKKSLVTISFTSLLLLGTISSMAADTNKATLLVKKTTATAYVSEP QQQKKTVGGIVVDSNGDPLVGVSVQEKGASNGTMTDINGHFSISLTKSEAQLIISYIGFS TQTVAASDNMNIVLKEDNHQLDEVVVVGYGTQKKVNLTGSVASVNFQKEAASRPITTAAQ ALTGMAAGVQILQSSGRPNSEGFGIQIRGTGTFNNSNPLVLVDGMEMNLSDVNPNDIENV SILKDAASCAIYGNRGANGVILITTKMGQAGTFRVTYSGKWSINTPSKIVRFVSNYADYM DFVNEADENVGAGHTYSDATIKKWRDAEANPNGISESGYPNYVAYPNTDWFDEIYRTKVM QEHSLSVLGNEKRTKYNIGLTYLDNPGTIVRSGVKKYFMNLNITSDVTEWLQIGAHAWGY HDDQDRNDVANLTAWSFLKTVPGIYPYYDGHYGGVENSEEDGAAGNPLLNLNGNGDSYYK HNRIYATTHAQIKFLKDFTLKTVFGYDYFHQRHKYAGTQNELYSFSRKQVVSPATSLDQI YVYMYTNQNYNWKWTNTLNWGHTFNKVHDVNVLLGFEEGRYYNGYLDTSKYGILDPSITD MSTITNMNTITGSDNQNKYRSWFGRMNYAYISKYLFEVNFREDGSSKFAPGKRWGFFPSV SAGWRISEESFAKNSFLREFDNLKFRVSYGKLGNSSVDDYAFQSWYETGYTVMGGKKAPS FYLRQLPNIDITWEETKTLDLGLDFALLNNRLSGAIDYYSKYTSGILYSPSIGLIYGDKR SPLQNLAEVSNRGVELTLKWEDHIGDLTYGVAVNGTWNKNRVEKYKGSWIHGWGANPNNP NSNVYYSNLGEVSSGDDERILEGHMMNEFYLYQTYSGSGSYFNADGSVNINGGPKDGMIR TEKDMKWLQAMIAAGHEFLPRRTAAKNGIWYGDYIFADLDGDGVYGGSNDRSFCNVSREP KFNYGIQAHAEWKGIDLSMNWGGATGFKTYWREIGQNASDVVYGLALPKDVAYDHYFYDP NNPNDPRTNISSTNPRLVMTNPGQSDGLYSTLRLYNCDFLKLRNLTIGYTFPKSICKVIY AQNLRVYFSGENLLTITSFPGLDPEMRAGQGYTTMRQFSFGLNVTF Prediction of potential genes in microbial genomes Time: Sat May 28 00:29:26 2011 Seq name: gi|283510596|gb|ACQH01000023.1| Prevotella sp. 
oral taxon 317 str. F0108 cont2.23, whole genome shotgun sequence
Length of sequence - 17827 bp
Number of predicted genes - 17, with homology - 17
Number of transcription units - 8, operones - 5 average op.length - 2.8

   N Tu/Op      Conserved   S       Start      End   Score
                pairs(N/Pv)
                            - Term    896 -    939    -0.5
   1   1 Op  1  .           - CDS    1073 -   1969     619  ## PG1475 conjugative transposon protein TraN
   2   1 Op  2  .           - CDS    2019 -   3431     881  ## PG1476 conjugative transposon protein TraM
   3   2 Op  1  .           - CDS    3583 -   4206     523  ## PG1478 conjugative transposon protein TraK
   4   2 Op  2  .           - CDS    4253 -   5290     951  ## PG1479 conjugative transposon protein TraJ
   5   2 Op  3  .           - CDS    5331 -   5963     652  ## PG1480 conjugative transposon protein TraI
                            - Term   6038 -   6089     3.7
   6   3 Op  1  .           - CDS    6164 -   8677    1667  ## COG3451 Type IV secretory pathway, VirB4 components
   7   3 Op  2  .           - CDS    8674 -   9009     216  ## PG1482 conjugative transposon protein TraF
   8   3 Op  3  .           - CDS    9014 -   9313     359  ## Fjoh_3006 hypothetical protein
                            - Prom   9336 -   9395     5.2
   9   4 Tu  1  .           - CDS    9468 -  10283     549  ## gi|258649172|ref|ZP_05736641.1| hypothetical protein GCWU000325_02754
                            - Prom  10403 -  10462     2.3
  10   5 Op  1  .           - CDS   10484 -  10822     209  ## gi|288800966|ref|ZP_06406423.1| conjugative transposon protein TraC
  11   5 Op  2  .           - CDS   10832 -  11611     407  ## PG1486 conjugative transposon protein TraA
                            - Prom  11817 -  11876     4.2
                            + Prom  11940 -  11999     3.4
  12   6 Op  1  .           + CDS   12153 -  12566     367  ## PG1488 hypothetical protein
  13   6 Op  2  .           + CDS   12563 -  13843     615  ## PG1489 hypothetical protein
  14   6 Op  3  .           + CDS   13938 -  14882     718  ## BF1250 putative transmembrane mobilisation protein
  15   6 Op  4  .           + CDS   14831 -  15940     726  ## PG1490 TraG family protein
                            - Term  15840 -  15880     4.1
  16   7 Tu  1  .           - CDS   16122 -  16328     316  ## gi|288927784|ref|ZP_06421631.1| hypothetical protein HMPREF0670_00525
  17   8 Tu  1  .
- CDS 16458 - 17816 1364 ## gi|288927785|ref|ZP_06421632.1| hypothetical protein HMPREF0670_00526 Predicted protein(s) >gi|283510596|gb|ACQH01000023.1| GENE 1 1073 - 1969 619 298 aa, chain - ## HITS:1 COG:no KEGG:PG1475 NR:ns ## KEGG: PG1475 # Name: not_defined # Def: conjugative transposon protein TraN # Organism: P.gingivalis # Pathway: not_defined # 1 297 1 322 341 446 68.0 1e-124 MKKIILSMAMLAMMGATATAQENNDGLVSSRPLTSGELFQGMSRAIPTGRVVLPYGLDVT FDKTVHLIFPSAIRYVDLGSQNIIAGKAEDAENVLRVKASVKDFETETNMSVICEDGSFY AFNVKYADEPEKLSIEKKDFLSPTDGRLPSNRADIYFKELGNESPVLVKLMMQTIYQNDR RCIKHIGTQQFGMKFLLRGLYAHNGLLYFHTRMENGTNMPYSVDFITFKVVDKKMAKRTA IQEQVLQPLRAYHQVMQVRGMGSEHAVFALEQFSLAEDKQLEVTLYERNGGCMMAACG >gi|283510596|gb|ACQH01000023.1| GENE 2 2019 - 3431 881 470 aa, chain - ## HITS:1 COG:no KEGG:PG1476 NR:ns ## KEGG: PG1476 # Name: not_defined # Def: conjugative transposon protein TraM # Organism: P.gingivalis # Pathway: not_defined # 44 470 27 453 453 392 54.0 1e-107 MIAQFFLPPKKEKLPIPKGKLMDSPIRTAPYKSYTAIFTYNIIKQSRKMDNKQKEQMKKG LVFGGLGLLFALSMWFIFAPSGKDKTAAEQGLNDSIPQATTEKLTENKLKAYELGDKAHE EEQTREEMGRLSDYFAQNTAPSEEQRAETAASTAKIENSMHRYEENNRLLNSFYAPDPHE QEREALRSEIDNLKKELSAKDSREDNEEKRQLALMEKSYQMAAKYLPKASTPPTFNNGLT AEKEKTEGAAGGSEAAKGKTMQNEKPAKEVLPERRQIVSSLDQPMSDVYFMEKYGKKARN MGFHSLASTAVPTMRNTLKVVVDRTTVLKEGDNVVLRLLETAKVQGLHIPRQSRLIAVAK IEGNRMHLLIKSIEVDGHIIAVKLSAYDTDGQEGVYIPGSEDVNALKEVGANIGGSMGTS FTFASSAKDQIISEAARGVMQGASQLLQKKLCTIKVTLKGGYRLFLVQSK >gi|283510596|gb|ACQH01000023.1| GENE 3 3583 - 4206 523 207 aa, chain - ## HITS:1 COG:no KEGG:PG1478 NR:ns ## KEGG: PG1478 # Name: not_defined # Def: conjugative transposon protein TraK # Organism: P.gingivalis # Pathway: not_defined # 1 207 1 207 207 306 74.0 2e-82 MEFKSLGNIETSFRQIRLYAFVFAIVCVAVSGYAVYASYSFAKEQREKIYVLDQGKSLML ALSQDASRNRPVEAREHVRRFHELFFTIAPDKDAIEKNMERAFVLCDKSAFSYYKDLAEK GYYNRAISGNVNQRIEVDSIHCNFNAYPYEVTTYARLFIVRQSNVTERSLVTTCTLQNSV RSDNNPQGFLMENFLVKENRDIQTYKR >gi|283510596|gb|ACQH01000023.1| GENE 4 4253 - 5290 951 345 aa, chain - ## HITS:1 
COG:no KEGG:PG1479 NR:ns ## KEGG: PG1479 # Name: not_defined # Def: conjugative transposon protein TraJ # Organism: P.gingivalis # Pathway: not_defined # 1 338 1 338 368 517 73.0 1e-145 MDFASLHELLHSTYDEMMPLCAQMTGIAKGIAGLGALFYIALRVWASIARAEAIDVFPLL RPFVLGFCIMFFPTVVLGTMNAVLSPVVQGTEMMVHQQEGNLAELTAKRDKLQEEAYLRN PETAYLVSNEKFDQKIEEMGIIGPEDALTIAGMYAERAAYQTRQWIMKMVHDLLELLFHA AGLIIDTLRTFILIVLSILGPIVFGIAVWDGLAGSLTAWFSRYISVYLWLPVSSILTALL TKIQVLMIEKDIEALSDPNYLPDSGTWYYIVFFLIGIIGYFCVPTVAGWIIEAGGGIGSY GRNVNQAAQRGAQSAYTGGRYAAAGAGSVIGNVGGRIKGALLKGK >gi|283510596|gb|ACQH01000023.1| GENE 5 5331 - 5963 652 210 aa, chain - ## HITS:1 COG:no KEGG:PG1480 NR:ns ## KEGG: PG1480 # Name: not_defined # Def: conjugative transposon protein TraI # Organism: P.gingivalis # Pathway: not_defined # 1 210 1 209 209 266 68.0 3e-70 MRTRILMLLMGIVLLFSGKAHAQWVVTDPGNFAGNIVNSVKEIATASKTVKNTLDGFKEV EKLYNDTKKYYDALKKVNNLIGDAYKVKECILMVGDISEIYVTSYKKMLSDKNFRPSELA AMASGYAKLLEQSGESLKELKSIVKSGVLSMNDHERMQAIDRIYTTLRENRSLVSYYTRK NISVSYVRAREKNNLASVKALYGNTASRYW >gi|283510596|gb|ACQH01000023.1| GENE 6 6164 - 8677 1667 837 aa, chain - ## HITS:1 COG:RP103 KEGG:ns NR:ns ## COG: RP103 COG3451 # Protein_GI_number: 15603980 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Rickettsia prowazekii # 260 772 281 744 805 72 22.0 5e-12 MRNKSKITTLESKFPLLSIEQGCMVSKDADITVAFRVELPELFTVTSTEYEAMHSAWHKA IKVLPNYSIVHKQDWFIKEDYQGKLSDGGLSFLARSSERHFNERPYLHHSVYLFLTKTNK QRMAQQSNFSSLCRGHLIPKEITNKDEVMKFMEAVDQFERIINDTEQIRIVRMTEEELIG TKEKGGLLDRYFSLSEEGHASLEDIRLGADLVRVGDNMLCLHTLSDTDDLPTTVSTDSRY ERLSTDRSDCRLSFVSPVGLMLPCNHIYNQYLFIEDSDANLERFEKQARNMHSLARYSRS NQINEEWIQEYLNIAHSQGLTSIRAHFNVLAWSSDKEELRQIKNDVGSALALMECHPRHN TIDAATLYWAGIPGNAADFPAEESFYTFIEPALCFFTAETNYKDSLSPFGIKMADRLSGK PVHLDISDLPMKKGVITNRNKFILGPSGSGKSFFTNHMVRQYYEQGAHVLLVDTGNSYQG LCELIHRKTKGEDGVYFTYTNESPISFNPFYTDDYFFDVEKRESICTLLLTLWKSADEHI TKTEAGELGSAVNSYIELICADHSVTPCFNTFYEYLRDVYRKDMEKRDIKVTLSDFNINN 
LLTTLKQYYRGGRYDFLLNSDKNIDLLSKRFIVFEIDQVKDNKDLFPVVTIIIMEAFINK MRRLKGIRKMILIEEAWKAIASANMADYIKYLYKTVRKYFGEAIVVTQEVDDIIQSPIVK ESIINNSDCKILLDQRKYMTKFDGIQAMLGLSEKEKSQILSINQNNDPNRLYKEVWIGLG GMQSAVYATEVSMEEYLTYTTEETEKVEVMNRAAQLGGDIETAIRQLAQEKRVARNR >gi|283510596|gb|ACQH01000023.1| GENE 7 8674 - 9009 216 111 aa, chain - ## HITS:1 COG:no KEGG:PG1482 NR:ns ## KEGG: PG1482 # Name: not_defined # Def: conjugative transposon protein TraF # Organism: P.gingivalis # Pathway: not_defined # 4 97 2 95 111 143 73.0 2e-33 MAAFEINKGVGRTVEFKGLKAQYLFLFAGGLLAVFILVVILYLCGVSQVSCLVIGVVGAS LVVWQTFTMNRKYGQYGLMKKGAIRMHPRYLVNRRTVCHLIRNLQPKKAKQ >gi|283510596|gb|ACQH01000023.1| GENE 8 9014 - 9313 359 99 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_3006 NR:ns ## KEGG: Fjoh_3006 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 1 99 26 127 127 144 77.0 9e-34 MNNKKKITMLLLTATAIGAYAQGNGIAGINEATKMVTSYFDPGTKLIYAVGAVVGLIGGI KVYNKFSSGDPDTSKTAASWFGACIFLIVAATILRSFFL >gi|283510596|gb|ACQH01000023.1| GENE 9 9468 - 10283 549 271 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|258649172|ref|ZP_05736641.1| ## NR: gi|258649172|ref|ZP_05736641.1| hypothetical protein GCWU000325_02754 [Prevotella tannerae ATCC 51259] # 1 271 8 278 278 342 88.0 2e-92 MQTILFYTLLLLTVIWLLVGILFLWQQLHTVRQKVEEESKKKNKQKQKQSTSSVTQDQVE QARQILVGKSKSYRERYDEISKEITTNSQKIPDVPDTSSKEKSADNPNTFAGKNSSVSEE IKEAESTEEDNEMQVDYTMDEPDEDTIIREELQIADDSLPEVSPSAILTRDLSRINGWHK NDDALDEESETDVHETLQAIRGTQLMDYIKEATLKQEKDHQKLLAAIRKAEEAELEENNI NSSSDFETNSNIESSDSDISEEADRPLSYYL >gi|283510596|gb|ACQH01000023.1| GENE 10 10484 - 10822 209 112 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288800966|ref|ZP_06406423.1| ## NR: gi|288800966|ref|ZP_06406423.1| conjugative transposon protein TraC [Prevotella sp. oral taxon 299 str. 
F0039] # 1 106 1 106 168 144 89.0 1e-33 MAKIQEQRQKLEAKLKEMAEDGVMKTPPKETVAFDPPDEEEDMKTEVSVNQKEKQHLVKE LMPTSEDKQEKTLCKSHERGDDRSAETPRTSLEDYRTRYLQSIRLRNEPILL >gi|283510596|gb|ACQH01000023.1| GENE 11 10832 - 11611 407 259 aa, chain - ## HITS:1 COG:no KEGG:PG1486 NR:ns ## KEGG: PG1486 # Name: not_defined # Def: conjugative transposon protein TraA # Organism: P.gingivalis # Pathway: not_defined # 8 259 11 262 269 242 47.0 1e-62 MEQIQTSPIFLGFSSQKGGVGKSTLAEIVSSILYYERNIHLFVVDCDLSQDSFYKLRERE KACVESDPQLSKQMKQHFSSLKRTAYRILKADPKEAIEKTNEYIRKHPSEQFDLVIFDFP GHAGTSDLLQLSLEMDYIVSPLEPDVQSMVACLTYIKAVNDLGVSMSSVRIKEVILLWNK VDRRVKNTLIEYYSRYIKNEDYTLLNQHAYATHRFSHELEQYGFRGVFRSTYLPPNKALR IGTGIDELTEELLTHIQLK >gi|283510596|gb|ACQH01000023.1| GENE 12 12153 - 12566 367 137 aa, chain + ## HITS:1 COG:no KEGG:PG1488 NR:ns ## KEGG: PG1488 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis # Pathway: not_defined # 22 137 15 129 133 101 47.0 9e-21 MNKDAEHRSERKRIEKPRWDNWHVRLPDPKAQQRAIELFHKSGAETKSDFVRARILGESF KVITVDKSAVEYYRKLSELTAQIHKIGVLYNQAVRAINSYHSVKTAHILLEKLEKLSIQI IELQRKAVALTIDYRSK >gi|283510596|gb|ACQH01000023.1| GENE 13 12563 - 13843 615 426 aa, chain + ## HITS:1 COG:no KEGG:PG1489 NR:ns ## KEGG: PG1489 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis # Pathway: not_defined # 1 416 1 414 426 383 49.0 1e-105 MIAKISSTANLGRALGYNFKKVRQEETTVLLAGGLYQNQDGRYAMEQVLSDMQQLIPNKC RTKNTVFHCSLNPHPDEKLSDETLMQIAREYMEALGYGNQPYIVFKHNDIAREHIHIVSL RVDSEGRKLNDRFEKRRSKQITDALERKYNLIPSSKVSGKVETETPKVDIGKGNIREQVA SVFRMVLKHYRFCSLGEFNAILNKYNLTVEEVKTEFRGRKYDGLVYVPTDGKGNKAGTPI HASDIGRGVGYTAVQNRMQKSKLAVKPLVPAIRDKVLQTMRTSPRTEEELRQWLEEQGLR VVIRKNESGRIYGITFIDDEVGIALNGSRLGKGYAANIFNAYFSNSTYNPFLDETLYGSP SVRLEQSATVQTLQQNTEESDNLVDELIEDMVGESFRTTGNDDWKEAAWQRKLRRQSKVK LRRRKH >gi|283510596|gb|ACQH01000023.1| GENE 14 13938 - 14882 718 314 aa, chain + ## HITS:1 COG:no KEGG:BF1250 NR:ns ## KEGG: BF1250 # Name: not_defined # Def: putative transmembrane mobilisation protein # Organism: 
B.fragilis_NCTC9343 # Pathway: not_defined # 1 296 1 294 666 499 78.0 1e-140 MAQEDDLRALGKVMDFMRGISVIFLLINCYWFCYEAFYEWHFTLGIINKILMNFQRTTNL FSSILWTKLFCVVFLALSCLGTKGVKEEKITWPKIWTVLFFGFVFFFLNWWLLALPIGKV GVATLYIFTLSVGYICLLMGGVWMSRLLKNNLMDDVFNTENESFMQETRLMENEYSVNLP TRFYYKKKWNKGWINVVNPFRASMVLGTPGSGKSYAIVNNYIKQQIEKGFAMYIYDYKFP DLSEIAYNHLLHHLDAYKVKPQFYVINFDDPRKSHRCNPINPAFMTDISDAYESAYPSCS ILTARGFRSRVISL >gi|283510596|gb|ACQH01000023.1| GENE 15 14831 - 15940 726 369 aa, chain + ## HITS:1 COG:no KEGG:PG1490 NR:ns ## KEGG: PG1490 # Name: not_defined # Def: TraG family protein # Organism: P.gingivalis # Pathway: not_defined # 1 361 299 659 669 689 93.0 0 MLNLNRSWIQKQGDFFVESPIILLAAIIWFLKIYENGKYCTFPHAIEFLNRPYAQIFPIL TSYDELANYLSPFMDAWEGGAQDQLQGQIASAKIPLSRMISPALYWVMTGDDFSLDINNP NEPKVLVVGNNPDRQNIYSAALGLYNSRIVKLINKKKQLKSSVIIDELPTIYFRGLDNLI ATARSNKVAVCLGFQDFSQLTRDYGDKESKVIQNTVGNVFSGQVVGETAKTLSERFGKVL QQRQSMTINRNDKSTSISTQMDSLIPASKISNLTQGMFVGAVSDNFDERIDQKIFHAEIV VDSAKISAEMKAYQPIPIIADFTNEDGSNNLKETIEANYKRIKQEILSLVELEKERIKAD PTLAHLNKE >gi|283510596|gb|ACQH01000023.1| GENE 16 16122 - 16328 316 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927784|ref|ZP_06421631.1| ## NR: gi|288927784|ref|ZP_06421631.1| hypothetical protein HMPREF0670_00525 [Prevotella sp. oral taxon 317 str. F0108] # 1 68 1 68 68 102 100.0 8e-21 MNNYIKPEIKVVPITLENSILTGSTSVGIGEGPATGPALSKGNNFFDGEEDDDDYDAPSR RPSSLWDD >gi|283510596|gb|ACQH01000023.1| GENE 17 16458 - 17816 1364 452 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927785|ref|ZP_06421632.1| ## NR: gi|288927785|ref|ZP_06421632.1| hypothetical protein HMPREF0670_00526 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 452 2 453 453 844 100.0 0 MVGVIRNYSGTFDGQGHALTVNWNHTSFVDIAPFKNVAGATIKNLHVKGQNEATGNSFLS GLIQNAYGTVTVSGCVSDVDIKGSSNLAGMIQMVNLNTEVIITDCVVKGALNSATKSIGG FVDYQSGSCTLTNCLYAGTNNATTDNNTFADNATLTNCYYLNACGKPQGTQVTEEQLKSG EVTKKLQGNRTDKCYWAQLLGEMPGLYCAADKSKANYVYYDAAKKGWACEDFRLTDGTPL PIGLDFTAANVTYERKFNGTQNATLCLPYDLYAQGFKAYTLSGGNKNEVHFKEVDDKLTA YTPYYITANGMPQLGGRNIEVKAYKDDKMTTPAAGYKFTGTVAGVSNATAAAANAYILQD DGKFHKVTTAYSATIPAYRAYIICPPQASGAKELSVVLDGETTGIDGVTNGRADGPVYDL QGRRVADRLDAAARHRLPAGVYIVGGRKVVVK Prediction of potential genes in microbial genomes Time: Sat May 28 00:30:52 2011 Seq name: gi|283510595|gb|ACQH01000024.1| Prevotella sp. oral taxon 317 str. F0108 cont2.24, whole genome shotgun sequence Length of sequence - 10620 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 2626 2045 ## gi|288927786|ref|ZP_06421633.1| GLUG motif subfamily protein 2 1 Op 2 . - CDS 2671 - 3585 639 ## PGN_0879 transposase in ISPg3 - Prom 3698 - 3757 5.2 + Prom 3497 - 3556 7.2 3 2 Tu 1 . + CDS 3734 - 7033 2293 ## Slin_0358 coagulation factor 5/8 type domain protein + Term 7096 - 7140 12.1 - Term 7289 - 7333 6.1 4 3 Op 1 . - CDS 7375 - 8214 696 ## COG0657 Esterase/lipase 5 3 Op 2 . - CDS 8247 - 10445 2075 ## COG1472 Beta-glucosidase-related glycosidases - Prom 10531 - 10590 3.1 Predicted protein(s) >gi|283510595|gb|ACQH01000024.1| GENE 1 1 - 2626 2045 875 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927786|ref|ZP_06421633.1| ## NR: gi|288927786|ref|ZP_06421633.1| GLUG motif subfamily protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 875 1 875 875 1644 100.0 0 MAWGGGVFTLHLYCIRHSVIPNWGIYYTSIPPSFFGPVINYDTHNVKAEFKVDNNDIATV TAYGFLKFKCPGTVVLTATCSASENCAKAQCSTTVTTKRDGVTFTSEGLPDVIFNNTSYC IRNYLYNSKTKSGGNFHANGSFSVTSSNNAILRYDDLWLKFSGTAGEATTFSHTFIVARR DQYGTILIKDANEWKVFCKLVNEKGMTNLNARLEADIDLGGDIAMVGNNYAGTFDGQNHT LKFNWNGGSDSEIAPFKKVSNAIIKNLRTEGTIKSSSYYLSGLVGDAYGTVNISNCASNV NITSSHTGGRCDAAGMISYIGGGSANITITDCLVEGNITAMTDEGKKRMGGFIYNKNGSC TLANCLYVGTNNATSGNTFAYNATIKNCYYLNACGTAQGEQVTKDQLKNGYVACQLQAGR NDQVWGQAFGSNAPQLTADGAKRVRKVEFTYNNRVKATRYATHGNAIYGGMPTFTAKDLL GSDYNEHHYYSGITFEGGFNGSTIVNTDRTVAVTFNKKDYYEIASKEDWKAFCDLVKSGQ NVVDAKMTADVNLGGDIAVIGSRKNYSGIFDGQGHTLTVNWNVGNKSYIAPFYIVEKATI KNLRTEGQITSNEKFLSGLVGNVYGTTTISGCVSAVNITSSYDNGGCNAAGLICIVKLGA NVTIDDCVVKGNITATTDKGKEKMTGFVSGQEGTCTLNNCLYLGSSNGDTFSRTFVDDAY HGVTTTLNNCYNLNTCGTAQGKKVTKEQLRNGEVAYLLQNKRAGNFWGQTLGTENDPQPT DKAEKHVCKVDFTYNDQVKATHYATRNNAIYGDMPTFTAKDLMGSDYNEHHYYAIAFTGG FNENTTVSSDQTVAVTFNKKDYYEIASKEDWKTFC >gi|283510595|gb|ACQH01000024.1| GENE 2 2671 - 3585 639 304 aa, chain - ## HITS:1 COG:no KEGG:PGN_0879 NR:ns ## KEGG: PGN_0879 # Name: not_defined # Def: transposase in ISPg3 # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 7 288 5 285 300 291 51.0 1e-77 MTTDYKVTELFCIIDEFCKHFDAKNAGNLLEDNSGVKRRRRQASLSDSEIMTILLYFHFG TFRNFKHYYLFFIKGTMKSYFPKAVSYNRFVELESRVFFQLMFFLNLGAFGRCTGITFVD STMIPVCHNLRRYDNKVFKGIATDGKGTMGWCHGFKLHLACNDRGEIIAFVLTGANVSDK DPNVFKVLAKRLYGKLFADKGYISQKLFNFLFEDGIQLVTGLRVNMKNKLMPFYDRMMLR KRYIIETINDMLKNTAQIVHSRHRSVTNFIMNLISALGAYCFFDNKPKALQGYCIEDTKQ LSLF >gi|283510595|gb|ACQH01000024.1| GENE 3 3734 - 7033 2293 1099 aa, chain + ## HITS:1 COG:no KEGG:Slin_0358 NR:ns ## KEGG: Slin_0358 # Name: not_defined # Def: coagulation factor 5/8 type domain protein # Organism: S.linguale # Pathway: not_defined # 32 1095 35 1109 1114 971 47.0 0 MSKKSLRNYLLKMFLTVCLFFATQNAQAQETSLYRSFVNPPAVARPHLWWHWMNGNITKD GIYKDLMWMHRIGIGGFHHFDAGLETPQIVEKRLEYMAPEWKDAFSYAVHLADSLGMEMA VASSPGWSSTGGPWVTPEQAMKKLVWRELDVKGGITYRGMLPPPYTATGYFQNFNSSVHA 
PFVSGNEVTYYKDIAVVAVKCLPTDRSMVSLGATATTLGGEVSLRTLTDGDLTTAVQLPC DEKNGYTWIQYAFSKPQTIRAISVVDGRFRNEWASVPADVNKHLEAGDDGIVFRKVCDIP SGGASQQTITIPPTRAKYFRIRFDNVPQKKVTEVAELQLYTIDRVNHAEEKAGFATPPDL ELYPTPADASATALNDVVVLTDKVDSTGRLVWKVPKGNWRIYRFGYSLTGKKNHPASPEA TGLEVDKLDAQAVHDYLNYYLSTYDDASKQMLGKSGLRSMLIDSYESGWETWTPKMEQEF ESRRGYSLVKWLPVLTGQLIGSADRSERFLWDWRKTIGELITENLYAQVDSILAARGMST YYETHENGRLYLADGMEAKSKGDIPMAAMWCQPTDAATTSMSESDIRESASVAHLYGKPI VAAESMTANGLYSGAYVFYPGNLKPVADLEIANGVNRFFIQECAHQPVDDKRPGLGLMMY GPWFNRHDTWAEWAKPWTDYLARSSYMMQQGRSVADILYYYGEDNNITGLFAHVHPDIPA GYNFDYINADALTRIITFRNGRLTAPSGMQYRLLVLDKNARKMSLPVLRTLARLAREGAL ICGQVPELMPSLTGDSLEFARLVREVFYSGRPNVYNYASAATVLRANGILPDVVVSEGNG AEGAAAHWRYVHRTTDNAEIYWLNNRTDRSRSLTVTCRTAGMKPLLWHPETGQTEELSYR VGRDATTIYLDMVPNDAVFVVFQGKAQPAEQQLPARRITPVCQIETNWQVTFRDNMPTEK TLVMPRLQSYTEQSDPDIKYYSGRAVYRNRFLLPSFPLKAARYVLNLGKVGCMAQVIVNG RRQRILWKAPYTVDITDALQKGDNRIEVEVINLWANRLIGDCQPNNPKCYTYTGFPFYKA DSPLLPSGLLGPVELSIEH >gi|283510595|gb|ACQH01000024.1| GENE 4 7375 - 8214 696 279 aa, chain - ## HITS:1 COG:CC2313 KEGG:ns NR:ns ## COG: CC2313 COG0657 # Protein_GI_number: 16126552 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Caulobacter vibrioides # 46 252 83 303 328 137 39.0 3e-32 MNQTTYIRFIKKQLFTLLFCLLAATTAQAQNTVRKIQVPLGNEGAQLFGYLPEKPTGRAV ICCPGGAYESLAIDHEGHQWSAFFNPRGIAFFVLKYRMPRTNHIIPLTDAEAAMRLVRDS AAAWHINPADVGIMGFSAGGHVASTLSTHASAATRPNFAILFYPVITMKQPGTHLGSRHH LLGDKPDRKLVERYCNELQVRSGETPTTILFLTADDDVVPPLENGVAYYCAMQRAGNACT LHIYPTGGHGFGYLSSWRYHNNALYELNDWLTRLPAPTR >gi|283510595|gb|ACQH01000024.1| GENE 5 8247 - 10445 2075 732 aa, chain - ## HITS:1 COG:YPO2803 KEGG:ns NR:ns ## COG: YPO2803 COG1472 # Protein_GI_number: 16123001 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Yersinia pestis # 24 723 18 705 793 365 33.0 1e-100 MIKTIALTAMLAGYIGLQAQNQKPVYLNPKAPIEERVKDALSRMTLREKIAVIHAQGKFS SPGVPRLGIRQLNMDDGPHGVRAENDWNRWGEAGWTNDSIVAFPSLTCLTATWNTDLSAL 
YGNAVSEEFAFRDKHIMLGPGTTIARTPLNGRSFEYMGEDPYLAGEMVVPYIRSTQQNGV ACCLKHFFLNNQETDRFKVNVNVSERAVNEIYLPAFKKAVQRGGVWTMMASYNKWLGVHC CQHDSLLNGILKRQWGFDGVVISDWGGVNDTWQAATGGLDIEMGSFTDGKLKESEFTYND YYLARPFEQLLKAGKIPMSVLDDKVSRVLRTIFRTAMNPNRIIGNQCSEAHYDACRQIAE EGIVLLKNTRNLLPLDTKKYGKILVVGENATRSLTKGGGSSELKTLYDISPLEGLKALYG DKIDYAQGYESGGAHYDKIDTIPVAVQTQLRKEALEKARKADIILYIGGLNKNHLQDCEN GDREDYNLSFGQNELIAGLAALKKPVVVITFGGNPYATPWIDNVGALVHCWYLGSESGTA LANVLSGKVNPSGKLPITFAVKQNDYPCFAYGAEGFPGVNYEEYYREGIFVGYRHFDTRG IKARFPFGYGQSYTTFKYGRPTLSSRTIAPNGHLTLTVAVTNTGKRAGKEIVQLYIGDDK ASVERPRKELKGFRKIALMPGETRTVTFDITTKDLQFFSEKEHRWVAEPGTFKAYVCASS EDVRGTAAFELR Prediction of potential genes in microbial genomes Time: Sat May 28 00:31:33 2011 Seq name: gi|283510594|gb|ACQH01000025.1| Prevotella sp. oral taxon 317 str. F0108 cont2.25, whole genome shotgun sequence Length of sequence - 4530 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 29 - 88 1.7 1 1 Tu 1 . + CDS 118 - 1251 689 ## BVU_1598 transposase 2 2 Tu 1 . - CDS 967 - 1614 113 ## gi|281424166|ref|ZP_06255079.1| transposase, IS4 family protein - Prom 1684 - 1743 5.8 + Prom 1670 - 1729 5.4 3 3 Tu 1 . + CDS 1765 - 3969 870 ## BT_3590 alpha-N-acetylglucosaminidase precursor 4 4 Tu 1 . 
- CDS 4131 - 4457 105 ## PGN_0955 transposase in ISPg3 Predicted protein(s) >gi|283510594|gb|ACQH01000025.1| GENE 1 118 - 1251 689 377 aa, chain + ## HITS:1 COG:no KEGG:BVU_1598 NR:ns ## KEGG: BVU_1598 # Name: not_defined # Def: transposase # Organism: B.vulgatus # Pathway: not_defined # 1 364 1 357 429 288 45.0 3e-76 MTKVAIKIENITSFGGIYHIMDVFSKLGFEKLTESVLGKRGSSGKAFSHGSIFGSLFFSY LCGGECLEDINALIGQFKQRPDTLLPGADTVGRGLKELTEKNIVYKSETSDKSYSFNTAE KLNTLLLRMIRRMGLIKAGSHVDLDFDHQFIPSHKFDAKYSYKQDFGYFPGWASIGGIIV GGENRDGNTNVRFHQEDTLRRIMDRVTSELGVVIERFRADCGSFSKEIIQTVEQRCNTFY IRAANCDSRCEDFRQLKEWKSVEVGYERCDVTSISMDNLIERKSYRLVVQRSPLKDKEGK QQTDVFGVIYTYRCILTNNWTSTEKDIITFYNERGASEKNFDIQNNDFGWSHLPFSFMAE NMVFKHYVIYNETCSGI >gi|283510594|gb|ACQH01000025.1| GENE 2 967 - 1614 113 215 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|281424166|ref|ZP_06255079.1| ## NR: gi|281424166|ref|ZP_06255079.1| transposase, IS4 family protein [Prevotella oris F0302] # 1 144 1 144 463 272 91.0 1e-71 METKIEKISELSKLLSVKTRMSDDLFHLFGKFGIGHLLSRLSLEKQDGVSASELILSLCL FGIVGESIHSICKHKIYELSNHGKNCFYRMMIRPQMDWRRLMNHFALRYMCLLRKYGEAP LSDTTTCFIIDDIVLENHVLSHKGKGQMRPSEVIVLYVEVLFACSTLIIKCNDVFLGRCP VVGKYTAVCVYHSEYISLLLALLVFQRTTLYDKPV >gi|283510594|gb|ACQH01000025.1| GENE 3 1765 - 3969 870 734 aa, chain + ## HITS:1 COG:no KEGG:BT_3590 NR:ns ## KEGG: BT_3590 # Name: not_defined # Def: alpha-N-acetylglucosaminidase precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 726 15 724 732 734 48.0 0 MKKILFLFFILFNISMFTVAAETNVSLVPVRELVKRILPEHYLKIVVEYMPDVTNDERFE LYSQADKIIIRGTTKSAIGVGLNYYLKYYCKTYVSWYSFDKIETPKVLPVVPEKVVRSAR VPERFFLNYCTYGYTLTWWGWHEWERLIDWMALNGINMPLAIAGQESVWLNVWKKYGLTE KQILEYFTGPSYLPWHRMSNIDHWMGPLPMSWIKNQEKLQKKILRRTRDLGMKPVLPAFA GHVPEILKEKYPKAKITPLSIWGDFEDQYRCHFLDPFDSLFTDIQKTYIDEQTKLYGTDH IYGVDPFNELAPPSWEPEYLANASAKIYDVLKNADSKAVWLQMTWMFSYQRKDWTDERIK SYITAVPDKKQILLDYYAERTEVWKFSESYYKQPFIWCYLGNFGGNTMIAGNIAEVDRRL NEAFANAESMVGVGSTLEGFDVNPIMYDFVFEKVWHKDGISLHDWTVQWAQRRVGTTDEN 
AEKAWKLLIDKIYVQYSLCTEGTLTNARPSLTGHGNWTTKNWTKYNNRDLLEAWGLLLRS KAITKIAYKYDIVNIGRQVLGNYFTVLRDEFTQAYERKDISALTIKGNEMLSLLNDLEAL LYTSPSFLLGPWLTNAQNMGRNMEESRYYEKNARNIITNWSTQGVALNDYGNRTWAGLLQ GYYTPRWKMFIEEVISAVKQNKEFNNETFFKKVTDEEWQWISKTENYPIQATGDSYLLAN KFYHKYHHLINPDE >gi|283510594|gb|ACQH01000025.1| GENE 4 4131 - 4457 105 108 aa, chain - ## HITS:1 COG:no KEGG:PGN_0955 NR:ns ## KEGG: PGN_0955 # Name: not_defined # Def: transposase in ISPg3 # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 25 92 3 70 300 78 54.0 6e-14 MRYFLYLYSSKPKCYTNILIPMNSTNLIELFCILDEFCKYFTPQLKKRMIDTVGKRRRNR ACVMNDSEIMTILILYHQSHYIDLKTFYLHEIFLLHLLPTILCRRSQR Prediction of potential genes in microbial genomes Time: Sat May 28 00:31:56 2011 Seq name: gi|283510593|gb|ACQH01000026.1| Prevotella sp. oral taxon 317 str. F0108 cont2.26, whole genome shotgun sequence Length of sequence - 2363 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 411 260 ## BVU_1596 transposase 2 2 Tu 1 . - CDS 1852 - 2142 94 ## - Prom 2187 - 2246 3.0 Predicted protein(s) >gi|283510593|gb|ACQH01000026.1| GENE 1 3 - 411 260 136 aa, chain - ## HITS:1 COG:no KEGG:BVU_1596 NR:ns ## KEGG: BVU_1596 # Name: not_defined # Def: transposase # Organism: B.vulgatus # Pathway: not_defined # 1 134 1 133 401 95 41.0 5e-19 MAKVAIKNENITSFGGIYHIMDVFSKLGFEKLTESVLGKRGSSGKAFSHGSILGSLFFSY LCGGECLEDINALIGQFKQRPDTLLPGADTVGRGLKELAEKNIVYKSETSDKSYSFNTAQ KLNTLLLRMIRRMGLI >gi|283510593|gb|ACQH01000026.1| GENE 2 1852 - 2142 94 96 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYKVLDEDTIKPEILLQIVCGKTWLCFRRRLGGSCSMRSLQVESLAVNGRLLPLFHDVVL VGREQHPPQPLALAVASLHVFCVIAHSRVFTAYFFV Prediction of potential genes in microbial genomes Time: Sat May 28 00:32:09 2011 Seq name: gi|283510592|gb|ACQH01000027.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.27, whole genome shotgun sequence Length of sequence - 23045 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 7, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 371 - 430 1.8 1 1 Tu 1 . + CDS 458 - 1498 839 ## COG3335 Transposase and inactivated derivatives + Prom 1555 - 1614 3.0 2 2 Op 1 . + CDS 1697 - 4867 2402 ## BT_2894 hypothetical protein 3 2 Op 2 . + CDS 4879 - 6657 1224 ## Phep_3405 RagB/SusD domain protein + Term 6676 - 6720 2.9 4 3 Tu 1 . + CDS 6730 - 7932 665 ## COG2730 Endoglucanase + Term 8012 - 8070 0.4 + Prom 7950 - 8009 5.4 5 4 Tu 1 . + CDS 8158 - 10521 1387 ## COG1472 Beta-glucosidase-related glycosidases + Prom 10608 - 10667 3.0 6 5 Op 1 . + CDS 10689 - 11633 692 ## COG1409 Predicted phosphohydrolases 7 5 Op 2 . + CDS 11636 - 13552 948 ## COG3533 Uncharacterized protein conserved in bacteria 8 6 Op 1 . + CDS 13658 - 15829 934 ## Amuc_0060 alpha-N-acetylglucosaminidase (EC:3.2.1.50) 9 6 Op 2 . + CDS 15845 - 18076 1626 ## COG1472 Beta-glucosidase-related glycosidases 10 6 Op 3 . + CDS 18089 - 19822 1442 ## COG0366 Glycosidases 11 6 Op 4 . + CDS 19890 - 22250 1565 ## Cpin_6026 alpha-L-rhamnosidase + Term 22272 - 22322 9.0 - Term 22462 - 22498 1.1 12 7 Tu 1 . 
- CDS 22526 - 22915 136 ## BVU_1598 transposase Predicted protein(s) >gi|283510592|gb|ACQH01000027.1| GENE 1 458 - 1498 839 346 aa, chain + ## HITS:1 COG:BH2520 KEGG:ns NR:ns ## COG: BH2520 COG3335 # Protein_GI_number: 15615083 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Bacillus halodurans # 184 323 36 175 188 88 35.0 2e-17 MMLDLSIEELKTLRRLQHSKEFEPYWAQITCILMLSHGHDAKTISYDLGISLSSVYNYAE SYKSGGISKLTDNHYKGYWGLLDSSQIAALCAELRGKVYTEAKSVAQWIKCTFGVSYTPQ GTVDLLNRIGFTYKKTTEVPCEADAQKQEEFVEELSKTLRDMDSSAVVYYADGVHPTHNS RSTYAWVEKGERMEQPTVSGRDRINLNGLLNAHDVTDVITLDCPRVNAQSTRELYQAAQA RHPEATEIYIISDNAKYYHNKELAQWVKGSRIRQVFLPPYSPNLNLIERLWKMLRKKVIN TGFYRTKEEFRRAVTNFFEHIADYKEELESLLTLNFRLVNSKTISL >gi|283510592|gb|ACQH01000027.1| GENE 2 1697 - 4867 2402 1056 aa, chain + ## HITS:1 COG:no KEGG:BT_2894 NR:ns ## KEGG: BT_2894 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 54 1056 25 1017 1018 575 36.0 1e-162 MKTKIHYFTGGLSDSYWSRPSFRLSGLILLLGSSLLVPITAGATKSSMPMAVEQQTKKIT GTVTDNSGEPIIGATVAVKGTKNATVTDVDGHFTLSGVTNNAVLRISYIGYSDQEIGIGK QSTISIVLSQDTKALDEVVVVAYGAQKKTTLTGAVAAIGTRELKQSPAANLAVTLTGRLP GLIAIQRSGEPGRDVTNLYMRGRGTINGQNPLILVDGVERDITSLDPNEVDNVSILKDAS STALFGVRGANGVILVTTKRGTSEIPDISLTAETGWQTFTRWPSQIDSYDWALLKNQAWH NDHPNPGVTDAPPYSAYALERYRLGDQPNIYSNHHWVDELTNKWVPQTRYNLTLNGKGAN VNYFVNVGYLHQGGQFKLDKSEKPSYDAKKFMDRYNFRANLDVALNARKTLKAFLNAAGT FETVNGPNEETLTLLTNLLGKWPNIQAGPTTPDGEVLLGGSSYQYSPWAQINRSGYRKET RSSVNATFGMSYDLGFLLKGLSTKLTASYDTQSINYLVGKKGYQYWESVVDPNRKNPDGS DYIEYHRIRTDYDNTPLSTSKSATFASFYDLQWQINFNRTFNEKHTVTALLLAQKQSQIK ASDVLPFNVQGLATRLTYAYDDKYIAEFNAGYNGSEQFAPKNRYGFFPSASAAWNISREK FFEKWTNVVDKMKLRVSYGLVGNDKIGNTRFLYLDNVARSYGGYSPSLSNNNTIQELFFG NPNLKWETAKKLNVGFELGLWKYFNLSFDIFSERRDNILITKNSTPSIIGVARSTIAPFN LGRVKNRGYELEMSFNKTITKDLLIMAKANLNYNDNEVVYMDELKFDETYAYPYHQTGYS IGQQWGMIAEGFFKDQDEINAYAKYEGQQPRPGDLKYKDVNGDNIINQKDLSPIGYSDVP KYTAGLALSITYKNFDISALFQGAFNVSGAVGAPGPYEWYDFREFHKKAWTAERAAAGEE 
ILFPALALAQSPSEIYNSTFFNMDRSYIRLKNLEIGYTLPKNWSRAINAKVVRFYVNGYN LATWDKMKFKDWDPEVMDNSTYPLLKVWNIGLNVTF >gi|283510592|gb|ACQH01000027.1| GENE 3 4879 - 6657 1224 592 aa, chain + ## HITS:1 COG:no KEGG:Phep_3405 NR:ns ## KEGG: Phep_3405 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 1 592 1 600 600 290 33.0 8e-77 MKYKSFKISLALASGLLLLSGCGDALDITPDGRMTQTDVFNNIDYTESYVNSMYEGIRKY GVNYHYFTFLAAYEDDGTDSQVPTDSWQQLQYWNQGSCTSSTNRPFSQGKTGLRYTDMDF WETAYGGIRKTNIFLANANEKNVPEVAKLGRYRAEAKVLRAHYYFDLMKNYGGVPLFDTD ITLKQDFTDVKRATYQQCTDFVVKDCDDAIAEPNLPFRTTEEGERGRMTKAIAHFIKASV LLFNASPLWNPEGDQAKWARAAKACKEAVDALEANGYELFPDYETYFITRPDKAQMPADK ETILEAIDSWGYGNQTYRAFGVIMFLMNEIPTFPSEKCGLCPSQELVDTYEMANGEVPIE GYKDERHTQPIINPAAGYDDQNPYVNRDPRFYASIWYNGAHFGMVKDKDVYIESYVGGAH GMAGIKQRSPTGYYLRKYVDKTMRVTGASKTLYRIYRFGELYLSLAEAENEANGPTAIAY AAVNKVRHRAGMPDLPTGLSKEEFRERIRRERRVELALEENRFYDIRRWKILPEVSKFKT GMRWVKDDAGNLKQTRVVAVDCQPTADEKYYLLPIPLDEILRMPNVEQNPKW >gi|283510592|gb|ACQH01000027.1| GENE 4 6730 - 7932 665 400 aa, chain + ## HITS:1 COG:TM1751 KEGG:ns NR:ns ## COG: TM1751 COG2730 # Protein_GI_number: 15644497 # Func_class: G Carbohydrate transport and metabolism # Function: Endoglucanase # Organism: Thermotoga maritima # 152 353 111 288 317 60 26.0 8e-09 MRKLTFILGLVLSVLTACGSSDPEAPKPKPDTPKEEPKGEEELPPTVRGFMVGSLQYTDK NTIEAAKSWGANVIRLQLNPVNYAAKQGSTYDLQLSSYISMLKQRLSEAKAVGMKVILDL HEAPCYIKGKVPTGDDAAKIDFWKDAATRTAFVSFWDRVARELKDSYYDDIVWGYDLFNE PSVGWTRIPPEWKSIATECVNAIRKYDKDVWIVYEPVIEINEFPKVTPLDDKRVVYSIHF YRPGGFTHQGVLDGIHGSKDMTRKEALAKLNIKYPGYAPDIYAVPIKYNYCDLDSVKRAM KPFDDFIEKYKVPALIGEFSVICWAPVESAVAWLQDVINIFEQKHYSWCYHAFREWQGWS LEQPEGLEAFWFSRDPIPPASPTETKRARVIKNALLKNKQ >gi|283510592|gb|ACQH01000027.1| GENE 5 8158 - 10521 1387 787 aa, chain + ## HITS:1 COG:TM0025 KEGG:ns NR:ns ## COG: TM0025 COG1472 # Protein_GI_number: 15642800 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 31 758 4 706 721 622 
44.0 1e-178 MKTVYTTIISLLFLPASILAQQAPKLNTNNIDEVLKAMTLREKAMLLVGNANNFESSSAV VGGSATLVPGAAGNTTAIPRLGIPETILTDGPAGVRIDVNRQGSQKKYYATGFPVGSCLA STWNTALVKQVGQAIGEETRDYRCDVILGPGMNLHRNPLCGRNFEYFSEDPLLTGKIAAA YVQGVQGVGTGVSIKHFAANSQETNRSRVDERVSQRALRELYLRGFEIAIRESNPWTVMS AYNRINGTFAQGNYDLLTKILRQDWGFNGIVMTDWIGRREGLSVASQVHAGNDLFEPGEV EQVNDIEAAVKAGKLDIKDVDRNVRRMLEYVVKTSSFRRYAATNEPDLVAHAAITRQTAS EGMVLLKNTASTLPFHNVKTVALYGVGSYHFLSGGVGSGCVHTPYIIDLVTGLKNAGISS TANLTRMYQKYIDFAKIKREADRDPACWFEKPEMGDQKLPELAISQRQIEAEADSSDIAI ITLTRQAGEGIDRSIEKEFNISPIEKDMIGQVCAAYHRIGKRVIVVINSGSVIETASWNA LPDAILVAWQPGEEGGNSIADVLTGKVCPSGKLTMTWPIFATDHPSTANFPQDDNMTVYH YVTFKDWTSKGGGMSTRDYTNHAEDIWVGYRYFDSFKRQVAYPFGYGLSYTSFEYGQPRV RRQGDKIEVTLSVKNTGKNSGKEIVQLYVAAPKGTMAKPEKELKAFAKTRLLQAGESENI TMSMQVRDLASFDEAGSQWLVDAGTYRLMVGSNINDIRGTATLSLPRYTETVSQALAPQH PIHKIVK >gi|283510592|gb|ACQH01000027.1| GENE 6 10689 - 11633 692 314 aa, chain + ## HITS:1 COG:AGl909_1 KEGG:ns NR:ns ## COG: AGl909_1 COG1409 # Protein_GI_number: 15890570 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 22 298 359 672 1299 110 28.0 5e-24 MRKNLLTACLLMLSMTTFAQKFTIPVFPDSQSEIDSNMGMFNSQLDWIIQNQKKENIPMV LHVGDVVNFDNLTHWDKASQGFARLDSAHIDYAITLGNHDNEAVQEYNGSAAPGDTHANV RKTSKFNSYFPTYRFRAQRGTYEPGKSDNAYYTFRTGNTYWLVVTLEFCPRPEMVEWANK IVKKYHEHNVIILTHYYLNGKGEIVKNNAGYGDSSPQYIFDNLVKLHPNIKFVISGHVGF SSFKVDEGVNHNKIYQMLQNYQNVEYGGGYLRLMRFDLDKGTVDCELYSPYFKKSKPENT HYKIEGFTTIKPKL >gi|283510592|gb|ACQH01000027.1| GENE 7 11636 - 13552 948 638 aa, chain + ## HITS:1 COG:TM0280 KEGG:ns NR:ns ## COG: TM0280 COG3533 # Protein_GI_number: 15643049 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Thermotoga maritima # 93 548 81 535 620 112 23.0 3e-24 MNGMRTINKSMKAGLLFCFFFPYAATAGNYEKIKLAIQDKQQPIEHVTFMGYIGNRFSQS YENRVLAQSVDKLVEPFCHHNETHLWQSEFWGKWMNSAVLAYQYRPSNAMISRIQEAVDK LIKTQDSRGYIGNYTDETHLQEWDIWGRKYCILGLLDAYGVTHDKKALNAACREADYLIN 
ELHHSKSTIVELGNQHGMAASSVLKPICYLYRYTGNKRYFDFAKEIISLWESATGPKLIS KAGIDVASRFPKPTAAKWYSWEQGAKAYEMMSCYEGLLEMYRLTGNTEYLSAVEQVWQNI NDTEINITGSGASMESWFGGKHLQYMPIRHFQETCVTATWIKLSRQLLLLTGNTKYADAV EISFYNALLGAMRTDASDWAKYTPLSGQRLPGSEQCGMGLNCCNASGPRGLFVIPQTAVL TSAKGVDVNLYIAGDYKLTTPRHQQMVLKLEGEYPKNNKMSFLLSLKKAENITIRLRIPE WSTATKVIVNDVAVEHVQAGKYMELSRTWHHGDRISIEFDMPGIVHRLGQHPEYVAITRG PIVLARDQRLAGPGLEAFLTPVVDDKQQILLEATNTQNTDIWMSFMAKFQPEAYTEDGAP AILVGLCDYASAGNSSQKDDYPFFKVWMPQLFNPAISQ >gi|283510592|gb|ACQH01000027.1| GENE 8 13658 - 15829 934 723 aa, chain + ## HITS:1 COG:no KEGG:Amuc_0060 NR:ns ## KEGG: Amuc_0060 # Name: not_defined # Def: alpha-N-acetylglucosaminidase (EC:3.2.1.50) # Organism: A.muciniphila # Pathway: not_defined # 9 718 8 722 848 690 46.0 0 MKYKKKIYILLCLWFFSVPTIAQVTAQQKLEVVRNIIRRFSHRDDINLRLVPRKQGQLET FNQQVSNGKLTISANSPIALCHGYYDWIRQNEYGIMSWTGNRCNIPTKIDGSKTRSVTSP FQYHYYFNAVTFGYTMPYWDWNRWEQEIDWMAFHGIDMPLALTANEAILARVFKKIGLSD EVIGRFFTGPAHLPWLRMGNIYGIDGPLSNQWHQDQIALQHKILDRMRKLDMHPICPGFA GFVPEALKELYPTADIQYTTWEKAFHNYILSPADPLFHKIGVMFIQEWEKEFGRCDFYLI DSFNEMDIPFPPKDDPKRYEFMADFGKKVYQCIKEANPSATWVMQGWMFGYQPEIWDYKT LNALVSQVPDNKMIMLDLAADYNKFLWKTPFNWDFYKGFCGKQWIYSVIPNMGGKSALTG ALDFYAKGHLEALNSQNRGKLIGFGFAPEGIENNEVVYELLCDAGWAKQGVELRPWLRNY TYSRYGCYPIGMEQYWNEMIQSVYGSFKSHPRFNWQFRPGKEKYGSVDLDNHFYHAVEIM AGMLSQMKGNKLFEADFKEMAANYLGGKVEILVRQIDKAYESQDTINANQLETRFYRLMT GMDLVLQGHPTKDMQKWIDYARARGVSYNKADCYESNARRIVTVWGPPIDDYSARIWAGL IRDYYLPRWKHYFNQKRSGKPFDFSTWELDFVENQKGLSQPALTKDKISLAVQLIQDAKN IVE >gi|283510592|gb|ACQH01000027.1| GENE 9 15845 - 18076 1626 743 aa, chain + ## HITS:1 COG:YPO2803 KEGG:ns NR:ns ## COG: YPO2803 COG1472 # Protein_GI_number: 16123001 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Yersinia pestis # 34 733 29 705 793 372 33.0 1e-102 MRNKALLLTIALLLGNYAIAQVHTYLDRTKNIEERVEDALSRMTLTEKLKVIHAQSKFSS AGVPRLGFPDFWTSDGPHGIRTNTLWDEWTDANQTNDSCVAFPALTCLAATWNPEVAYLF GKSLGEEARYRGKDMVLGPGVNIYRTPLNGRNFEYMGEDPYLAGQMAVPYIQGIQSNGVA 
ACVKHYPLNQDENNRIEENIIVDDRALHEIYLAPFKEAVIKGHVWGIMGAYPSYKDQSCT YNAYLTNKVLKGDWGYDGVVLSDWGATHETEGAVRHGLDIEFGTWTDGKKYGDSKHYNRY YLADAYRKGLEEGRYTMESLDNKVRRVLRLFYRTTMTYREPGSLTSDEHFAVARRIGEEG VVLLKNERQILPLQLKPGKRILVIGENAIKMMTAGGGSSSIRTKHEFVPLDALRRYTDKV GVQLDFARGYVGDTVTMFNGASVGQDIRDSRTPQQLMDEAVTKAHGADYILIFGGLNKSE YQDCEGYDRKNYSMPYQQDQLVEALAKVNKNIVFVNISGNTVAMPWVERVSGIIQAWYQG SEAGNVIASILVGETNPSGKLPYTWTVRLNDVPAHALGTYPGTWRADHKVIDVEYKEGIF VGYRWADKHRVRPLFAFGHGLSYTAFKLSEAKADKSAMTPSDSITFTITVQNIGQRAGSE VVQLYIKDKEASLPRPVKELKGFRKVFLQPGESRNVSITIGKDALSFYDDRQQQWVAEPG HFEALIGTASDKIASHVKFTLLK >gi|283510592|gb|ACQH01000027.1| GENE 10 18089 - 19822 1442 577 aa, chain + ## HITS:1 COG:MT0134 KEGG:ns NR:ns ## COG: MT0134 COG0366 # Protein_GI_number: 15839507 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Mycobacterium tuberculosis CDC1551 # 28 493 41 501 601 239 33.0 2e-62 MKYFLKRYSILSSILLFSTLSMVAMQSPKWLERAVFYQIYPSSYMDSDGNGIGDLQGIIS KLDYIQSVGFNAIWLNPVFESGWFDGGYDIIDFYKIDPRFGTNTDMVMLIKEAHRRGIKI CLDLVAGHTSNKCKWFQESAAGDANSRYSDYYIWTDSVSDAEKRDIELRKKEGENLFGAR GNFVEANAPRGKYYQKNFFECQPALNYGFAHPDASHPWEQTPDAPGPQAVKRELLNIMAF WLGKGVDGFRVDMAASLIKNDPDGSAVRKLWKEINHWKDSNYPNSVLISEWSDPMQAIPA GFNIDFMIHFGVKAYIPMFFAKGTPWGDWDTHECCYFDRQGKGTLFPFIKNYTEAYEATK HQGYIALPSANHDFQRPNIGTRNTPDQLKVLMTFLLTMPGVPFIYYGDEIAMKYQLYLPS KEGSGIRSGSRTPMQWTKGKNAGFSDCDSRQLYLPVDTEEGKLTVEAQEADTTSLLHYVR RLIRLRQSADAIGNTGDWLLVSDAKQPYPMVYKRSKGGSVYYVAINPSGRSVTAAIPLSK SGRVTQVLQTGHIRCHTSAATLQLKMKGISAVVLSEQ >gi|283510592|gb|ACQH01000027.1| GENE 11 19890 - 22250 1565 786 aa, chain + ## HITS:1 COG:no KEGG:Cpin_6026 NR:ns ## KEGG: Cpin_6026 # Name: not_defined # Def: alpha-L-rhamnosidase # Organism: C.pinensis # Pathway: not_defined # 8 781 12 783 787 715 46.0 0 MAKKYRFILLLLILTLVQARATASATAKPAYWIAVPQTSAHDYGVYYFRKNLNLTQLPAA MRVEVSGDNRYELYVNGQLASAGPAKGDLHHWHYETVDLKPYLKPGNNVVAAMIINEGEK RALSLYTHRTAFYLHALDAIGAELNTDPSWLCIQDHGYQPLNTKVAAFMAVGPCDILNMH QHIAHWCDTTCDLSKWRPAQVLALPAYADRSYIYGYPSVWQLMPSTIPQMERRMERMAKV 
RQSTLKLPKTFLNAPTSITILAHTKATILIDNKQETNAYVHLAFGYGRDAEMSLTYSECL WDDTKGTKSNRNVVNGKLMRGVKDSIISNGKERQIYRTLSWRTYRYVQLTINTHDAPLTL YDLYGIFTGYPLQLASTFECQDKELERILQIGWHTARLCAWETYMDCPYYEQLQYLGDSR IQALITLFNSRDDRLVKSFLDMADWSRRPEGFTMSQYPSTMEQNIPTYSLIYILSLHDYM RYGSDLDFVRGKLSGVRQILDYFKTWQLPDGRLKETPGWNFIDWVAAWGDPGEGPKGSEG ATATADLFLLLAYQAAADLESQLGMPAMAKLYEAAAQRLAISIRSSYWNTTRGLFADDSN HRHFSQHTNALAILAKTTKTDEITAIAHQLLEDKSLAPCSVYFSFYLNSALHAANLGDNY LQWLDIYRENIKQGLTTWAETSDLAHTRSDCHAWGAAPNIEVYRIMLGIESTSPGFASVR IAPHPGNCKQLSGSIPHPQGNIRVNYRLSGQTLKAVIELPEHVSGEFVWRGKIYPLHGGR NENNYE >gi|283510592|gb|ACQH01000027.1| GENE 12 22526 - 22915 136 129 aa, chain - ## HITS:1 COG:no KEGG:BVU_1598 NR:ns ## KEGG: BVU_1598 # Name: not_defined # Def: transposase # Organism: B.vulgatus # Pathway: not_defined # 6 127 303 424 429 130 48.0 1e-29 MFGVIYTYRCILTNNWTSTEKDIITFYNERGASEKNFDIQNNDFGWSHLPFSFMAENMVF MMVTAMLKSFYLYLVRHISDKVKPLKKTSRLKAFILHFVSVPAKWVRTGRQNVLNLYTNK TYYAEVFIE Prediction of potential genes in microbial genomes Time: Sat May 28 00:32:50 2011 Seq name: gi|283510591|gb|ACQH01000028.1| Prevotella sp. oral taxon 317 str. F0108 cont2.28, whole genome shotgun sequence Length of sequence - 4932 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 188 - 409 79 ## gi|288927806|ref|ZP_06421653.1| hypothetical protein HMPREF0670_00547 - Prom 536 - 595 5.6 + Prom 52 - 111 2.4 2 2 Op 1 . + CDS 353 - 559 69 ## gi|260885395|ref|ZP_05896915.1| conserved hypothetical protein 3 2 Op 2 . + CDS 599 - 1474 697 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 1778 - 1818 4.1 4 3 Tu 1 . - CDS 1844 - 2071 117 ## BT_1192 putative xylanase - Prom 2203 - 2262 5.8 - Term 2170 - 2203 -0.3 5 4 Tu 1 . - CDS 2275 - 2868 458 ## COG0657 Esterase/lipase - Prom 2962 - 3021 5.1 + Prom 3528 - 3587 6.4 6 5 Tu 1 . 
+ CDS 3785 - 4843 621 ## BT_2990 hypothetical protein Predicted protein(s) >gi|283510591|gb|ACQH01000028.1| GENE 1 188 - 409 79 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927806|ref|ZP_06421653.1| ## NR: gi|288927806|ref|ZP_06421653.1| hypothetical protein HMPREF0670_00547 [Prevotella sp. oral taxon 317 str. F0108] # 1 73 1 73 73 118 100.0 1e-25 MTKVAIKNENITSYGGIYHILDVFSKLGFEKLTESVLGKRGSSGKHSAMEVFSALSSSVI FAVGVALRISMRL >gi|283510591|gb|ACQH01000028.1| GENE 2 353 - 559 69 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260885395|ref|ZP_05896915.1| ## NR: gi|260885395|ref|ZP_05896915.1| conserved hypothetical protein [Prevotella tannerae ATCC 51259] # 1 63 1 63 132 107 85.0 2e-22 MINSTVRSDILVFNCYLCHVKFLLTCYVSSTKIGEISDISKCFENFVSQALGVLTKIYAA KLRYYKNI >gi|283510591|gb|ACQH01000028.1| GENE 3 599 - 1474 697 291 aa, chain + ## HITS:1 COG:SMb21419 KEGG:ns NR:ns ## COG: SMb21419 COG2207 # Protein_GI_number: 16264994 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Sinorhizobium meliloti # 6 290 10 290 295 155 31.0 1e-37 MQESKYLIANFRDTQWGLTVSTVGYEHIIPGEDYPTHGHADGYYFKVERGRILNEYQLLY ITAGEGVFKSSSVEERRIKAGDFFLLFPGEWHTYHPSKNIGWQSYWIGFKGENMDARVKN GFLSPTKPVYHVGYMDRLEDLYHYALDTAQEEAVHTQRTLAGVVNLLIGMMYSLERNIEL GKNQEHVNMVNRARKRIREALEENLTIQQIATDMGVSYSNFRKLFKEFTGVSPALYQQEL RLQRAKELLSTTNLSVKQIAYKLCFDSPDYFSAKFKAKINLRPSEFREQTK >gi|283510591|gb|ACQH01000028.1| GENE 4 1844 - 2071 117 75 aa, chain - ## HITS:1 COG:no KEGG:BT_1192 NR:ns ## KEGG: BT_1192 # Name: not_defined # Def: putative xylanase # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 75 192 264 463 109 67.0 4e-23 MNVSHTTPRAFIVLSDDDDVVPSANGVNYYLALNKNGVRSSLHVFPSGGHGWGCKVGFRY HAEMMMALKAWLRSF >gi|283510591|gb|ACQH01000028.1| GENE 5 2275 - 2868 458 197 aa, chain - ## HITS:1 COG:CC2313 KEGG:ns NR:ns ## COG: CC2313 COG0657 # Protein_GI_number: 16126552 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Caulobacter vibrioides # 43 185 73 236 
328 91 37.0 1e-18 MKLHKLFMMFGLALAIIPTSAQRTFDLKLYDGRPTYVSSDANDTAKVRVYLPEEKMATGR AVVICPGGGYVGLAIGHEGYDWGEFFQSQGIAAVVLKYRFPRGNPNVPISDAENALKLVR RHAEAWKINSNDVGIMGSSAGGHLAATIATTAPDGAKPNFQILFYPVISMMEGYGHAGSL HHLLGSSVKCVDEEFCF >gi|283510591|gb|ACQH01000028.1| GENE 6 3785 - 4843 621 352 aa, chain + ## HITS:1 COG:no KEGG:BT_2990 NR:ns ## KEGG: BT_2990 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 325 20 360 488 129 31.0 1e-28 MLDPRRANGDEEQEFPVAIRIQYNGKKVYLRIGKTYKLEEWRVLCELERQGRNKNATERK ELRTRISKVEDMVNALVDNGIFTLKRLQERFSGITPEERTIYSVWDKYIADRKIDKLGTA RTNIDVKRRFVKQMGTNVSFGDIDRAFILKWTKAMKKYGLNATTIGISLRTFRAIIKVCI DEGLIKGDTKEMFKDTGYNKSNSRKHEFLDVATMRQLYDFWEKKEARDEKGKELFYSHEK ETVFRDLGLFLFMYLGDGQNLADTLRLEYDEWYFSTHGKQLRFYRQKTHDRNESASEVIF PITQELKKIIEEYGNEPELGKRVFPILPTLIMPDKELWVIQRYNKYPLAELI Prediction of potential genes in microbial genomes Time: Sat May 28 00:33:06 2011 Seq name: gi|283510590|gb|ACQH01000029.1| Prevotella sp. oral taxon 317 str. F0108 cont2.29, whole genome shotgun sequence Length of sequence - 5395 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 2 - 35 0.7 1 1 Tu 1 . - CDS 94 - 1338 733 ## BF0152 tyrosine type site-specific recombinase - Prom 1435 - 1494 1.9 - Term 1861 - 1917 5.1 2 2 Op 1 . - CDS 2077 - 2886 608 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 3 2 Op 2 . - CDS 2916 - 3947 843 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 4 2 Op 3 . 
- CDS 3951 - 5204 1274 ## COG4806 L-rhamnose isomerase Predicted protein(s) >gi|283510590|gb|ACQH01000029.1| GENE 1 94 - 1338 733 414 aa, chain - ## HITS:1 COG:no KEGG:BF0152 NR:ns ## KEGG: BF0152 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: B.fragilis # Pathway: not_defined # 1 407 1 406 411 507 58.0 1e-142 MRSTFSVIFYLKKDKVKKDGTAPIMGRITVDGTQAQFSCKLSIDSNLWGVKGGRAVGKSV TTRETNRMLDKLRVGITKHYQEIMERDSFVTAEKVRNAFLGLEYRCQTLIKIYDHFMEDY AKKVDCGMKAKSTLTKYHAVYSHLKDFLQSRYHVSDIALKEIQPTFITDFETYLLAEKHL HCNSVWVYSCPVRMLLHRAVENGWLIRYPFPDCSVPKEETEKGFLTKEELGQLINAPKMN FKRTFIRDLFLFCAFTGLAYIDLKNLREENIVSNPLDGSVWIHTYRQKTGVETNVRLLDI PLQIMKKYKGLCDDGRVFPVPSYNSCKMILRTVMKKCCIHKHITWHMARHTMATVVCLSN GMPIESVSSVLGHKCISSTQIYAKITNEKLNKEMNVLSEKLAGIGEFVTSEDFA >gi|283510590|gb|ACQH01000029.1| GENE 2 2077 - 2886 608 269 aa, chain - ## HITS:1 COG:lin2979 KEGG:ns NR:ns ## COG: lin2979 COG0235 # Protein_GI_number: 16802037 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Listeria innocua # 2 265 3 262 273 140 30.0 2e-33 MKSILDGRPRLENEVGKIAEVAGYLWQKGWAERNGGNITVNITAFIDDEIKALPAISDIK EIGATLPALKGQYFFCKGTGRRMRDLARWPMENGSVIRVLDDCSSYVIIADNPVLPTSEL PSHLMVHARQLERKTEIKATLHTHPIELVAMSHGKRFLEKDVLTRLLWSMIPETKAFCPL GLGVVPYAIPGSNALAEGTLKELEDYDVVLWEKHGVFAKSTDIMDAFDQIDVLSKSAAIY LDCKAMGFEPDGMSDAQMKELSQVFHLPK >gi|283510590|gb|ACQH01000029.1| GENE 3 2916 - 3947 843 343 aa, chain - ## HITS:1 COG:STM4050 KEGG:ns NR:ns ## COG: STM4050 COG0697 # Protein_GI_number: 16767316 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Salmonella typhimurium LT2 # 3 337 5 335 344 145 31.0 1e-34 MNIIIGLLIIAIGAFCQSSCYVPINRIKVWSWESYWLVQGVFAWLLLPLLGALLAVPSGH SLTDLFTADHMFDINMTILFGILWGVGGLTFGLSMRYLGVALGQSISLGTCAGLGTVLGP 
IMLNVIYPEGHHLEKLTVSVLIGVGVTLVGIAVIGVAGSMKAAQLSDEAKRAAVKEFNFP KGIAIALLAGCMSACFNVGLEFGSGLHFAATSGLFKTLPATLLVTVGGFATNAVYCLWQN QKNHTWSDYGKRSVWANNILFCLLAGALWYSQFFGLALGKGFLTDSPVLITFSFCILMAL DVVFSNVWGILLKEWKGCGRRTIAVLVVGIVILVVSSFLPQLL >gi|283510590|gb|ACQH01000029.1| GENE 4 3951 - 5204 1274 417 aa, chain - ## HITS:1 COG:STM4046 KEGG:ns NR:ns ## COG: STM4046 COG4806 # Protein_GI_number: 16767312 # Func_class: G Carbohydrate transport and metabolism # Function: L-rhamnose isomerase # Organism: Salmonella typhimurium LT2 # 3 416 2 418 419 432 48.0 1e-121 MKTESIEKAYAYAKERYAAVGVDTDKAVELLKKTPISLHCWQADDVVGFERNEALSGGIQ TTGNYPGRARNIEEVRRDILFVKSLLGGKHRLNLHEIYGDFGGKFVDRNQVEVDQFQSWI DWARENDMKLDFNSTSFSHPKSGSLSLANPDKGIREFWIEHTKRCRRIADAMGRAQGDPC IMNIWVHDGCKDLTVERMRYRRLMAESLDEILAEKLDGVKNCLEAKLFGIGLESYTVGSH DFVAGYCATRNLMYTLDTGHYEQTENVSDMVASLLLFVPELMLHVSRPVRWDSDHVTIMN DQTLDLFKELVRADALDRAHIGLDYFDASINRIGAYIVGTRATQKCLLQAFLEPIDRLRA YEDTDKGFERLALLEEAKSLPFGAIYDYFNLTSGVPVGEDFIPVVEQYEKDVLSKRG Prediction of potential genes in microbial genomes Time: Sat May 28 00:33:24 2011 Seq name: gi|283510589|gb|ACQH01000030.1| Prevotella sp. oral taxon 317 str. F0108 cont2.30, whole genome shotgun sequence Length of sequence - 55255 bp Number of predicted genes - 42, with homology - 41 Number of transcription units - 21, operones - 12 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 1537 936 ## COG1070 Sugar (pentulose and hexulose) kinases 2 1 Op 2 . - CDS 1557 - 2441 466 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 2544 - 2603 4.7 + Prom 3002 - 3061 5.3 3 2 Op 1 . + CDS 3086 - 3205 99 ## 4 2 Op 2 . + CDS 3283 - 3510 208 ## gi|288927817|ref|ZP_06421664.1| conserved hypothetical protein + Term 3580 - 3635 -0.7 5 3 Op 1 . - CDS 3447 - 4109 589 ## gi|288927818|ref|ZP_06421665.1| hypothetical protein HMPREF0670_00559 6 3 Op 2 . 
- CDS 4171 - 4434 117 ## gi|288927819|ref|ZP_06421666.1| hypothetical protein HMPREF0670_00560 - Prom 4459 - 4518 5.3 7 4 Tu 1 . - CDS 4562 - 6301 1495 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 6396 - 6455 4.8 + Prom 6780 - 6839 7.5 8 5 Tu 1 . + CDS 6865 - 7893 694 ## gi|288927821|ref|ZP_06421668.1| putative transcriptional regulator, LuxR family + Term 7933 - 7979 0.3 + Prom 7983 - 8042 4.3 9 6 Op 1 . + CDS 8080 - 9480 1397 ## PG1494 hypothetical protein 10 6 Op 2 . + CDS 9502 - 11586 1049 ## COG0550 Topoisomerase IA + Prom 11602 - 11661 2.3 11 7 Op 1 . + CDS 11697 - 15005 2288 ## COG0827 Adenine-specific DNA methylase 12 7 Op 2 . + CDS 15101 - 17893 2072 ## COG4646 DNA methylase + Term 17917 - 17960 8.5 - Term 17903 - 17948 8.9 13 8 Op 1 . - CDS 18004 - 19008 311 ## PG0848 hypothetical protein 14 8 Op 2 . - CDS 18993 - 19700 313 ## PG0849 hypothetical protein - Prom 19730 - 19789 3.5 - Term 19716 - 19767 13.2 15 9 Op 1 . - CDS 19791 - 20099 256 ## PG0850 excisionase family DNA-binding protein 16 9 Op 2 . - CDS 20087 - 21109 628 ## PG0851 hypothetical protein 17 10 Op 1 . - CDS 21254 - 22150 545 ## BT_0108 hypothetical protein 18 10 Op 2 . - CDS 22147 - 22545 164 ## PG1534 hypothetical protein 19 10 Op 3 . - CDS 22597 - 23013 297 ## gi|288927830|ref|ZP_06421677.1| hypothetical protein HMPREF0670_00571 - Prom 23093 - 23152 5.0 + Prom 24647 - 24706 8.4 20 11 Tu 1 . + CDS 24740 - 25600 744 ## PRU_1783 putative lipoprotein - Term 25619 - 25686 8.4 21 12 Tu 1 . - CDS 25719 - 26486 612 ## COG3884 Acyl-ACP thioesterase - Term 26820 - 26859 8.2 22 13 Tu 1 . 
- CDS 26885 - 28036 1073 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins - Prom 28132 - 28191 6.0 + Prom 27995 - 28054 5.6 23 14 Op 1 2/0.000 + CDS 28254 - 28781 593 ## COG2087 Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase 24 14 Op 2 11/0.000 + CDS 28875 - 29894 804 ## COG2038 NaMN:DMB phosphoribosyltransferase 25 14 Op 3 6/0.000 + CDS 29869 - 30687 602 ## COG0368 Cobalamin-5-phosphate synthase 26 14 Op 4 . + CDS 30704 - 31234 427 ## COG0406 Fructose-2,6-bisphosphatase + Prom 31332 - 31391 5.7 27 15 Tu 1 . + CDS 31424 - 32980 1610 ## COG2461 Uncharacterized conserved protein 28 16 Op 1 . - CDS 33326 - 34252 574 ## COG1575 1,4-dihydroxy-2-naphthoate octaprenyltransferase 29 16 Op 2 . - CDS 34252 - 35124 844 ## PRU_0313 putative pantothenate kinase 30 16 Op 3 . - CDS 35115 - 35810 661 ## COG3382 Uncharacterized conserved protein 31 16 Op 4 . - CDS 35810 - 37807 2057 ## COG0272 NAD-dependent DNA ligase (contains BRCT domain type II) 32 16 Op 5 . - CDS 37860 - 38669 844 ## PRU_1312 putative lipoprotein 33 16 Op 6 . - CDS 38692 - 40008 1342 ## COG1004 Predicted UDP-glucose 6-dehydrogenase 34 16 Op 7 . - CDS 40050 - 40598 384 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes - Prom 40653 - 40712 5.5 - Term 42060 - 42099 1.0 35 17 Tu 1 . - CDS 42174 - 43247 1096 ## COG0611 Thiamine monophosphate kinase - Prom 43267 - 43326 3.9 36 18 Op 1 . - CDS 43458 - 44267 1078 ## COG0005 Purine nucleoside phosphorylase 37 18 Op 2 . - CDS 44152 - 45330 654 ## COG1663 Tetraacyldisaccharide-1-P 4'-kinase 38 18 Op 3 . - CDS 45339 - 47114 1674 ## COG0616 Periplasmic serine proteases (ClpP class) - Prom 47243 - 47302 5.9 - Term 47639 - 47673 -0.9 39 19 Tu 1 . - CDS 47774 - 49333 1138 ## PRU_1227 putative thiol protease/hemagglutinin PrtT 40 20 Tu 1 . + CDS 49986 - 50912 1089 ## COG0598 Mg2+ and Co2+ transporters + Prom 50914 - 50973 5.6 41 21 Op 1 . 
+ CDS 51026 - 52789 1830 ## PRU_1553 hypothetical protein 42 21 Op 2 . + CDS 52844 - 54541 1423 ## COG2985 Predicted permease + Term 54603 - 54644 10.0 Predicted protein(s) >gi|283510589|gb|ACQH01000030.1| GENE 1 1 - 1537 936 512 aa, chain - ## HITS:1 COG:STM4047 KEGG:ns NR:ns ## COG: STM4047 COG1070 # Protein_GI_number: 16767313 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Salmonella typhimurium LT2 # 15 490 8 465 489 353 40.0 3e-97 MKQLKSQQADSVFFAVDLGATSGRTIIGTISGKGADGCAADKPTITLEELTRFDNALIQI RGHVCWDLAALYNKIIVGLRLAAQRNYPIESIGIDTWGCDFVCIGADGLPLGNPFAYRDP HTVGKMDEYFAEAMPKAQVYEKTGIQLMNFNSLFQLYAMRKAGSPALQNADRILFMPDAL SYLLTGEMVCEYTISSTSQMLNVHTGEIDEALLRSVGLRRTHFGRTVGPGTVVGTLTEEV RKLTGLGPLPVVAVAGHDTASAVAAVPAAGPHFAYLSSGTWSLMGIETPQAIVNEQSFSR NFTNEGGVAGTVRFLKNICGMWLYECCRREWKKESGDVSHATLIEEAMKAEGSRSLINPD DPMFANPDSMTAAIQTYCRKHDEPVPETRGQFCRCIYDSLALRYRQVFGWLKEFAPFPLD VLHVIGGGSQNSFLNQFTADSCNITVLAGPQECTAIGNVMLQAKAAGYFKDVWDMRQVIA NSIAPARFIPRDTAPWKAAYEKYLYITGGSEE >gi|283510589|gb|ACQH01000030.1| GENE 2 1557 - 2441 466 294 aa, chain - ## HITS:1 COG:SMb21419 KEGG:ns NR:ns ## COG: SMb21419 COG2207 # Protein_GI_number: 16264994 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Sinorhizobium meliloti # 9 290 10 288 295 149 31.0 8e-36 MGHNQTLQYLIDNEQDARWGLTVNGVGFQHIDSDEPYPPKNHPQEYSFLADQGRRLNEYQ FIYITAGRGFFQSEHCAQTEIKEGDMFLLFPREWHSYHPDPGTGWNEYWIGFRGQNIDSR IANGFFSPREPVFHVGLHDELVQVFRNAIAIAKEQYVGYQQMLAGAANMVMGHAYAFNRH NEFENREVVEQMRKAKVIMTERLAEGITPMEVAEEINMGYSRFRQVFRKFTGYAPLQYIQ EVRLTKCKQLLANTNLSLTEIAYQVGYESVDYFATTFRKKNGVTPTAYRKSVRR >gi|283510589|gb|ACQH01000030.1| GENE 3 3086 - 3205 99 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLGTQEIFIIALIILLLFGGKKIPELMKGLGKGVKSFKD >gi|283510589|gb|ACQH01000030.1| GENE 4 3283 - 3510 208 75 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927817|ref|ZP_06421664.1| ## NR: gi|288927817|ref|ZP_06421664.1| conserved hypothetical protein 
[Prevotella sp. oral taxon 317 str. F0108] # 1 75 2 76 76 143 100.0 4e-33 MAENEELSFWDHLDILRSIILKVIIVGGACTLLAFYFKDPIFKIVLAPKDHGFITYKMLR TESFGISLINTELTE >gi|283510589|gb|ACQH01000030.1| GENE 5 3447 - 4109 589 220 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927818|ref|ZP_06421665.1| ## NR: gi|288927818|ref|ZP_06421665.1| hypothetical protein HMPREF0670_00559 [Prevotella sp. oral taxon 317 str. F0108] # 1 220 1 220 220 377 100.0 1e-103 MKQQNIIQRLCLFLFVLVLALPALATNYGREGYETFRSRNLGTHQTVTTLKQGKVEITFS SCTTSGSSGNGAIYQPAKGSRITVKADDGYSIRWIILRDTEGGESYRHPQGKYRISSVTS GYDYYFEKEAVSNSHISGGNQNQLNDDDNNIIVYQYDASAQSVEIRPHNNRNWEQFKVRD IIVGYVRAPKVRFEKDRYDIYSVSSVLMRLMPKDSVRSIL >gi|283510589|gb|ACQH01000030.1| GENE 6 4171 - 4434 117 87 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927819|ref|ZP_06421666.1| ## NR: gi|288927819|ref|ZP_06421666.1| hypothetical protein HMPREF0670_00560 [Prevotella sp. oral taxon 317 str. F0108] # 1 87 1 87 87 160 100.0 2e-38 MKKTCWVYILRNESGEITIGFSIDMDEKFTEISMRKEKLYYLRPFEEPFDGLAHKHLLDS LSKDTINHLVRRNREQTETYKEVFRKT >gi|283510589|gb|ACQH01000030.1| GENE 7 4562 - 6301 1495 579 aa, chain - ## HITS:1 COG:AGc425 KEGG:ns NR:ns ## COG: AGc425 COG2207 # Protein_GI_number: 15887598 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 448 561 227 338 365 67 28.0 1e-10 MKHFITLLFVITLYAGEMYAATPADSIRKEMKHLKGEKLLQAYHNLCRLAAAEDNMGYEL RCIREYLAESLRQKDKEAEAQARVTQLYCYYNYEMTDSISYYLPEILSAMKKNGTWDYYY NAWNVLIESYLYEDKVQTALLEAQKMYADARKRKSNYGLGTSTYGMACIYQTMGRFREAE KTIEESIAALSKADEISQLLSAYNVLGETLDGLRKYEKLRAKCTEWKAVIDKYKNEALRK GYTPSLNGRYLYCTLATAVAELETGHYDRAKGLLQLADKYAEGRKAVARFKLLQVKARYY AAIKQYDRAIACNNENMGIMTAAGDSVSLLTVQMQQADLYTQAGRYKEAAELYSLVIPHK DKLRNTELAKQLDELRTIFEIDKLTLRNEVITTRLYLSLIIVALLLATVVLYITYTRRLR RKNRALYDSILLYRKAESDMETAARLVPEEELDREGKIYRRLCELMQKEKIYKDTELNRD ILSKRIGTNAVYITNAVRKYADGATINEFINGYRLRHAASLLTNNPDLNINEVECRSGFN SRATFNRCFRAFFGMSPSEYKAVSKEKKKTQKGSFEERA 
>gi|283510589|gb|ACQH01000030.1| GENE 8 6865 - 7893 694 342 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927821|ref|ZP_06421668.1| ## NR: gi|288927821|ref|ZP_06421668.1| putative transcriptional regulator, LuxR family [Prevotella sp. oral taxon 317 str. F0108] # 1 342 1 342 342 622 100.0 1e-177 MIFHSLTEKVSAFLRSRAENTIERHRVIVYLLHSLLVVTIISMQFLGLGGSNDPLPLSMS GIHLAVCLLSLLLYLTQWLTLSKAFSLTVLVAQCTIVMRFFYFATVRPDHFLQLILVNQI ASLLAVFFLVISFVRFTPFIVSTISVIGYGCVAAYLEEPSLWRLFGFFLFVQFFLCTLGE LLRYNVMSVTKENTDLHHRETALMHAVRLNRQEIEVYLRMSGNDHPPPEDTDRLFSMLKP KSQRNLINAVRLHLKKHLMDDCDLGHHFPCLTKSETDVCRLILAGKKRSEIGLLLDKTEN NVDVTRNHIRKKLNVPTDQDLQKFLINLLIEKEYSKKEEINK >gi|283510589|gb|ACQH01000030.1| GENE 9 8080 - 9480 1397 466 aa, chain + ## HITS:1 COG:no KEGG:PG1494 NR:ns ## KEGG: PG1494 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis # Pathway: not_defined # 34 466 1 437 437 460 56.0 1e-128 MESNSNDNYVLVLEDRTEVKNENDAGKLSVVSGIDDKGKLQTTEAKDVHQAAFLKFNNKD GLLKNFMTNFLKQFNEPSRFGLYKVVANNVEQGVASLHTMLQNREKPENKQQLAESQVHF EDFLPKQKNATAIDESKIDWKQLDTLGLTRERLEQSGELAKMLNWQKSNLVTIAIPIDDT TIYTDARLAFRTAGEGNIGLAVHPLRKEPQLDFPYMGHKFSNEEKELLLATGNLGKTIEI TPKNGDPFAAYVSIDPQTNELIALRADRVNIPKEIKGVTLSDAQYKDLVEGKAVKVEGMT AKSGKSFNATLQVNAEKKGIEFIFENKQGLKERQQHTQQQGAPRKLCGLELSDKQREALD SGRTLYLKNMVDKEGQPFNAYVKMDKEQNRPRFYKWNPDKKQETGKEKVVAVAEEHKTQV AVNNQGKTNEATKNVNEPLKSGQTQPTAAQKQKQDEKKQQRRGRKM >gi|283510589|gb|ACQH01000030.1| GENE 10 9502 - 11586 1049 694 aa, chain + ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 5 634 6 655 709 436 40.0 1e-122 MITCIIAEKPSVARDIARIVGANGKQDGYLEGNGYLVTWAMGHLITLAMPEAYGFAAYKA EDLPIRPNPFRLIVRQVRKDTEYTSAPAALKQLKAIRGCFDSADRIIVATDAGREGELIF RYIYQHLNCHKPFERLWISSLTDKAIREGLAHLKAGTAYDNLYHSAKARSEADWLVGINA SRALSIARKGGYSLGRVQTPTLAMVCRRYIENRDFSSVPYWKLSVAAEKEGISLKAVSDS AFENEADAQSALAMLREQSRLTVTSVARKVGHTSPPLLYDLTSLQKEANRKHGFSADKTL 
STLQSLYEKKITTYPRTGSRYISEDVFEEVPALLGKIGQSQSCPLNRHSVDNDKVTDHHA IIPTGETPSELSADETTIFQMVIHRFIEAFSPDSEEERMQVELTDGTNTFIWKACRSISL GWKAVQQGTGTNDEKGKEDEEQTLSVLPTLTENEVLPLLFSEITEHKTKPKPLYTEATLL SAMENAGKEVADAESKKAMAECGIGTPATCANIIETLILRDYIRREKKTVVPTEKGLAVY EIVKDKRIANAEMTGSWELTLAAIEAGQMPPEKFAQGINSYVETICEELLALAPPMQKSY PTYRCPKCGNESVGIYAKIAKCRHEGCDFHIFREICGTFLSEDNIRDLITTGRTPILKGL TSKAGKKFNARLVLGEDHTTSFEFEGKKGKVRGR >gi|283510589|gb|ACQH01000030.1| GENE 11 11697 - 15005 2288 1102 aa, chain + ## HITS:1 COG:pli0004 KEGG:ns NR:ns ## COG: pli0004 COG0827 # Protein_GI_number: 18450290 # Func_class: L Replication, recombination and repair # Function: Adenine-specific DNA methylase # Organism: Listeria innocua # 6 341 417 720 756 131 31.0 6e-30 MAYNKKAVLEGNTEAIRVVLRLEKERREATEAEKVLLRGYQGFGGLKCVLNRCDNPDDLR YWSASEQNLFAPTQRLKQMIYRDAVDANTAKRYWESIKASVLTSFYTDTRIVSAIADALS ATDVQVRRCLDPSAGMGAFTETFAKQAGMVDAMEKDLLTARITQALHPYGKDNIFVRQEP FEAIGELEDKDKYDLITSNIPFGDFMVYDRSYSKGENILKRESTRTIHNYFFVKGLDTIK EGGLLTFITSQGVLDSPKNEAIRRYLMQNSRLISAIRLPSGMFSENAGTDVGSDLIVLQK QSGKEIGEGIEQQFVQTASVPKGDGFSIAFNHNSLFEGEWKDISHRTIVTDRQMGTDPYG KPAWEYTFDGSIEDMADSLRTQLSLEVEQRFDRKLYESGIPMTEEEWQVHVDKMVQKVQE NIKTEGIPQGQEIKDKEEKKEDKEDEKEEENAYNLMPDSTKKQLPKLYATEKQLIGDRTA YARYFFPMGAYTAYMLEYDPKERIGFGAVTMGYGWELGYMSLKEMEEVKVKGLGIERDLY FKPTKLHEIAELEEIVRGQYTKEPIIEEGKDESRQEVLRPVQEGTQPQEKVEDGNKIVGD EARDEAETKKEAQAAQVLPTIEPENEPAPEGVPVINLQRQYEQASREIRTDVETPREMNG QTVFFDEDHHPIMDSTIETEATEQLLFAPEEYSLWTQDVARVNNEIKEAAQQKKVRDNQP LSASRQPKSTRSTPTSPRRNKKTASAPVREPSLFDFMEEAEPRKPQPIAEVKKEFDASPR PFLSSPDSHLRDGSIVVQKGQVGFLSDLKRHPTFNPMDLPFAQFSRLKAYIEIRESYHRL YDYEANNQAEDKEEREKLNRLYDSYVGRWGCFNQKANTDIIKMDATGVEMLFLERSENGK YIKADIFDHPTAFSTSELSIAADPMEALGASLNKYGTVELDYMSSLLPDMEESDMLSALE GRIFYNPEENSYEVADKFISGNVIEKAERIESWILDHPEHEEAKQSLTALRAATPTPIPF ADLDFNLGERWIPAKVYGKFASEFFETDINVSYHSNMDEYSLVCDRKNANIWHKYAVQGE FRRYDGINLLKHALHNTIPGRL >gi|283510589|gb|ACQH01000030.1| GENE 12 15101 - 17893 2072 930 aa, chain + ## HITS:1 COG:AGpT188_2 KEGG:ns NR:ns ## COG: AGpT188_2 COG4646 # 
Protein_GI_number: 16119916 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA methylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 3 716 447 1150 1315 288 29.0 2e-77 MDWLGRTPDTFKEQLSERYNRLFNCFVRPNFDGTHQSFPDLDLKRLGIPDLYKSQKDAVW MLKTNGGGICDHEVGAGKTLIMCTAAYEMKRLGLANKPMIIGLKANVFDIADTFRKAYPN AKILYPGKNDFNKQNRQRIFNDIKNNDWDCIILTHEQFGMIPQALEIQEAILQKEKDSVE ENLEVLRMQGADISRAMLKGLEKRKQTLEAKLQGIQDSIAERKDDAVDFKMMGIDHLFVD ESHQFKNLMFNTRHDRVSGLGNPDGSQRALNMLFAIRTIQERSGKDLGATFLSGTTISNS LTELYLLFKYLRPQALEKQGINSFDAWAAVFAKKSTDYEFSITNEIIQKERFRTFIKVPE LASFYAEICDFRTAKDIGIDRPEKNEILHNIPPTPEQEEFIGKLMEFAKKGDATILGRAP LSESEERAKMLIATDYARKMSLDLRMIDENGYSDHIDNKASHCAKLLNDYYQKYDAQKGT QFVFSDLGTYKPGGDFNIYSEVKRKLVEDYHIPSYEICFIQECKNEKAKKAMVDAMNRGD IRIIFGSTSMLGTGVNAQQRAVAVHQLDTPWRPSDLEQRNGRAIRKGNMVAKEFADNKVD VIIYAVERSLDSYKFNLLHNKQLFINQLKTNTLGSRTIDEGSMDEDSGMNFSEYVAVLSG NTDLLEKAKLDKKIATLESERKNFLRERDAATGKLAEIDSSVSFHSDKIKEAKADLACFE KRVERDKDGNPINKLFIKGVEGSTDTKVIAARLQEINDKARTKGEYNKIGEIYGFSIMVK TESTSKDLFDCSVNRFFVKGQESINYTYNNGKLATDPKLACENFLGALGRIPKVIESHEK EKEKVAANKEIYTAIANGAWKKEDELHSLKGQSAELDRKIALTLSADKEDKAEQENSLSG NEISTSIENANEPTKEADNRSQSFRPKWRH >gi|283510589|gb|ACQH01000030.1| GENE 13 18004 - 19008 311 334 aa, chain - ## HITS:1 COG:no KEGG:PG0848 NR:ns ## KEGG: PG0848 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis # Pathway: not_defined # 1 332 1 332 332 455 71.0 1e-126 MEHTIAKQLHLEKENFRDLLHAASEELTIPIQLVEKDYYISSILRALSESSYTEQIVFKG GTSLSKAYQLINRFSEDVDFAVISEHMSGNQVKMLLSHLMKEVTANLKGDLGFSDISKGS KYRKQAFLYDTQVGLDELSNPVPARIIVEISAFANPFPHEIRIIEPFVTTFLRKKGMSSF IEQYNLTPFELNVLSLRQTLCEKVVSLIRFSMSDTPLASLTSKVRHFYDLDALLSIEQLQ NYISSDAFVSDLITLIKHDQQAFDEPRGWKLLDNLNQSPLIKDFENIWSSLTPKYIENLS KIAYRKIPSPDNIKSSFTKILHSIENIDLGRQEN >gi|283510589|gb|ACQH01000030.1| GENE 14 18993 - 19700 313 235 aa, chain - ## HITS:1 COG:no KEGG:PG0849 NR:ns ## KEGG: PG0849 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis # Pathway: not_defined # 1 235 22 256 
256 341 72.0 2e-92 MTSKGIKKKISEFEMGKVFKLEDLGLLHTEHQAAVMTLRRLVEKGEIERLSPGLYYKPKI TPFGEVGPTMEERFRDLLYKDNRPIGYLTGFYAFNLLGLTTQQSTTLEIGTNFPRRNRKR GMYAVRFVIQKNEINKEVIDMLRLLDCLKWIKKIPDTTVDQSYSRLKNLINAYSKKEQER IVELSMKYSPLTRALLGSMLSDKSLTDKLYQSLSPLTQFRIGLSPSLAKKQWNIQ >gi|283510589|gb|ACQH01000030.1| GENE 15 19791 - 20099 256 102 aa, chain - ## HITS:1 COG:no KEGG:PG0850 NR:ns ## KEGG: PG0850 # Name: not_defined # Def: excisionase family DNA-binding protein # Organism: P.gingivalis # Pathway: not_defined # 1 102 1 102 102 144 75.0 9e-34 MPYIEHTSEKDWCKRLFDRLKTVENKLDQLLILKEQSVDTTTHPPLKPEYLDIIDVSKIL KVEQKTIYNWVWAHKIPYLKANGRLLFLREEIDEMLRKRDEW >gi|283510589|gb|ACQH01000030.1| GENE 16 20087 - 21109 628 340 aa, chain - ## HITS:1 COG:no KEGG:PG0851 NR:ns ## KEGG: PG0851 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis # Pathway: not_defined # 1 340 49 388 388 625 88.0 1e-178 MYAQTMMILQSCEIDYDNPPDASKSVVAVNGVPLGTQDNLFCITGGEGTGKSNYIAAILA GTLGSERLQAEQTLGLEVTANPKGLAVLHYDTEQSEAQLYKNLEKTLRRAGIKSVPEFYH SLYLASLSRKDRLKIIRESMDLFHHKHSGIHLVVIDGIADLIRSANDETESIAIVDELYR LAGIYNTCIICVLHFVPNGIKLRGHIGSELQRKAAGILSIEKDDNPEYSVVKALKVRDGS PLDVPMMLFGWDKKADMHIYRGEKSKEDKERRKTDELLAVVKSAFRAKLKLSYQELCDVL MREMEIKDRTAKKYIAYMKEQRILSQDTSGNYQKGELCHT >gi|283510589|gb|ACQH01000030.1| GENE 17 21254 - 22150 545 298 aa, chain - ## HITS:1 COG:no KEGG:BT_0108 NR:ns ## KEGG: BT_0108 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 298 4 293 666 299 50.0 7e-80 MNIKEEILSRTNKGLDVFCFYMPIDFVPKRNFRNPLYDDRRASCNIYLDNKSGCYRMKDF GNDAYSGDCFWFAATMLGLDVRKDFVKVLETINRDLQLNICIERKEHSNPHTMMMKPCKP TLVQPPNQLKKMEGKKWYKLIEQSFNVKELDYWEQYGIDTKTLQRFHVKSLARYESVSNQ GKPFTLGSTHEDPMFAYSMGKFVKVYRPKSKLRFLYGGEKVNDYVFGFQQLPSKGDVVFI TGGEKDVLSLSAHGFNAICFNSETAQIPENIIEGLQLRFRHIIILYDSDETGIREAKR >gi|283510589|gb|ACQH01000030.1| GENE 18 22147 - 22545 164 132 aa, chain - ## HITS:1 COG:no KEGG:PG1534 NR:ns ## KEGG: PG1534 # Name: not_defined # Def: hypothetical protein # Organism: 
P.gingivalis # Pathway: not_defined # 1 120 1 119 129 89 36.0 3e-17 MTEEDNLQKTVIAELRSLRNDMERIAGFIVEMRRDYSVLEDKMELSSSDVIRLLGISRAS LARWRDTNAIPFRYISCNHVAYPFKGLYVAVKSGRASFKGFRRVEALQRLNAYKDGVLKG YMGDGQTLFEEL >gi|283510589|gb|ACQH01000030.1| GENE 19 22597 - 23013 297 138 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927830|ref|ZP_06421677.1| ## NR: gi|288927830|ref|ZP_06421677.1| hypothetical protein HMPREF0670_00571 [Prevotella sp. oral taxon 317 str. F0108] # 1 138 1 138 138 247 100.0 2e-64 MKTKKQNSGRNAKYYVVLPTLEIMLSACKNCKLRADYADMEYSNFMKHCKMQTDLRINTY ARCAAAFDMDVLLIHLPKGMIDSMIATTPHKSLRFSTMEQEDLIVILNRLCKLDSRRFKQ HLMLLLHQLGKDSEFPDG >gi|283510589|gb|ACQH01000030.1| GENE 20 24740 - 25600 744 286 aa, chain + ## HITS:1 COG:no KEGG:PRU_1783 NR:ns ## KEGG: PRU_1783 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 13 263 9 257 291 93 27.0 8e-18 MTNNKRLAYVLTYVLALFLVACEKPVVDSEPEKEAETDGITLIFEAPTAEPYTEIGEARP AYATKNEALTQLSVSVFKGGKRVKSVHLSAENDLFMKPSLRLDDGTYSVVAVAHAGMGHA TITNAQRITFYKNKITDTYVYCGEITVSGKSTFTLSMKRVTAMFQLSVTSTLPQNVAQVE FRYTGGSSTLDAKGLEGCVNSRQTETREVTEMMRTTPPTFAIYTFPRTDSEGLKIKVSFN DKNGNTLYEENYEEVQVKLGMKTCYKCNLPKDFVDALSKEGCIVEE >gi|283510589|gb|ACQH01000030.1| GENE 21 25719 - 26486 612 255 aa, chain - ## HITS:1 COG:CAC3591 KEGG:ns NR:ns ## COG: CAC3591 COG3884 # Protein_GI_number: 15896825 # Func_class: I Lipid transport and metabolism # Function: Acyl-ACP thioesterase # Organism: Clostridium acetobutylicum # 16 222 15 216 248 69 22.0 7e-12 MMLDKIGKYEFLIEPFHCDFTHHLFVGHLGNHVLNAADFHSNDRGFGMTYLNTIHKTWVL SRLALELEELPKEYSRMSVETWVDGVMRFFTNRNFQMTDAETGTVYGYGRSVWAMIDTQT RQPADILAINEGSIVNYVDKEKACPIAVSSRVKLSGQAILEGSVETHYSDVDINGHINSV KYLEHLFNLKPETWHRTYRVGRIDVAYVAESHFGDVLHYYKEEITEVEELYQIVKCTRTG DTAEVCRVAVKWVER >gi|283510589|gb|ACQH01000030.1| GENE 22 26885 - 28036 1073 383 aa, chain - ## HITS:1 COG:VC1622 KEGG:ns NR:ns ## COG: VC1622 COG2885 # Protein_GI_number: 15641629 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer 
membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Vibrio cholerae # 237 366 35 174 212 66 33.0 9e-11 MKKLFTLFAAVSLAASVSAQTVTESKTFDNFYIGVNGGVSTVTTGHSWLKNLNPNAGLRI GRYFTPVFGVAVEGNAYFSNKPGTSYGTFVRFINSSALATINFSNWVGGYKGEPRTFEVI GLYGLGWLHTFNNQDWKNVNALTSKAAIDFALNFGSSKQFQFYVEPAIIYRLNGEGKEGL EYNINKSFVQLNGGLVYRFGNSNGSHNFTVAQVRDQAEIDGLNAQINSLRGDLSNKDSRL AEKDRQISDLQKALDDCNNQPKTIVKNVKSETVTNLQPTVLFRQGKAVIDPAQYAPLELI ASYMRNHPEAKVEIRGYASPEGSAELNQKLSNARAQAVKDALVKRYKIAASRLTTKGMGV TDTLFEEVSFNRVATFNDSSKGE >gi|283510589|gb|ACQH01000030.1| GENE 23 28254 - 28781 593 175 aa, chain + ## HITS:1 COG:BMEI0693 KEGG:ns NR:ns ## COG: BMEI0693 COG2087 # Protein_GI_number: 17986976 # Func_class: H Coenzyme transport and metabolism # Function: Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase # Organism: Brucella melitensis # 3 175 6 172 173 127 39.0 9e-30 MKKVILITGGQRSGKSLTAERMALNLSPTPVYMATAHAWDDEFKQRIARHQERRGPCWTN IEEEKTLSKHLLYNKVVVVDCITLWCTNFFYGASNNTSTLPDVDRTLEELKAEFDKLTAQ EATFIFVTNEIGSGGVSNNALQRRFTDLQGWMNQYVAARADEVYLMVSGIAVKIK >gi|283510589|gb|ACQH01000030.1| GENE 24 28875 - 29894 804 339 aa, chain + ## HITS:1 COG:RSc2397 KEGG:ns NR:ns ## COG: RSc2397 COG2038 # Protein_GI_number: 17547116 # Func_class: H Coenzyme transport and metabolism # Function: NaMN:DMB phosphoribosyltransferase # Organism: Ralstonia solanacearum # 11 337 20 344 354 275 45.0 8e-74 MLDDSLVWQKIDNLNKPKGSLGMLETLAYRICRIQHTLSPTLSHPCHLLFAADHGIEQEG VSVSPRAVTWQQMINFTHGGGGVNLFCKQHGFELTLVDMGVDHDLTAHPTILNRKIANGT RNFLHEPAMSQQQMQQALNTGSELAESCKAKGCNVLCLGEMGIGNTSPSSVWMSLWGNLP LSSCVGSGSGLNTEGEKHKLAVLEQAVNHFKRQPLPHNDPATIIQYFGGFEMVAAIGAML RAAEQRMIVLVDGFIMSASMLAASKLNPAALDYAVFGHCGDESGHRQLLSLMQAQPILQL GMRLGEGTGALCAYPIIESSVRMINEMNNFKDANIDKYF >gi|283510589|gb|ACQH01000030.1| GENE 25 29869 - 30687 602 272 aa, chain + ## HITS:1 COG:RSc2396 KEGG:ns NR:ns ## COG: RSc2396 COG0368 # Protein_GI_number: 17547115 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin-5-phosphate 
synthase # Organism: Ralstonia solanacearum # 15 136 13 134 258 87 41.0 2e-17 MRISTNTSRWYDRPWAAFIFFTRLPFWRIHQPPKDAYESVVEFWPLAGWLTGALMGAIIW FGSKVLPFSIAILAAIAARLLVTGALHEDGLADFFDGFGGGGKDRSRILAIMKDSHIGTF GVLSLIVYFALFFECMLSLGPLYAALVAFTADPFCKMVAGQITQILPYARTEEQAKAHNI YRRFNLTAGVSLMVQGLTPFIGLLYFDAHWGSRLLDWNFLVAAPCLTMFFLYMLMLRRIK GYTGDCCGAVFLLCELAFHLTAVVLLTANFTL >gi|283510589|gb|ACQH01000030.1| GENE 26 30704 - 31234 427 176 aa, chain + ## HITS:1 COG:RSc2395 KEGG:ns NR:ns ## COG: RSc2395 COG0406 # Protein_GI_number: 17547114 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Ralstonia solanacearum # 1 146 1 149 192 83 34.0 2e-16 MEIIFVRHTSVAVAKGTCYGCTDVALSETFEQEATLTRNALSAHAPFDAAYSSPLSRATR LAAFCGYESPIIDPRLCEMNMGDWEMRRFDEIVDDNLQRWYADYMNVRTTNGEGFPDVYS RVSDFLNQLKTQPHRRVVVFAHGGVLICAGIYAGVFKRENAFEHLTPFGGLLRITI >gi|283510589|gb|ACQH01000030.1| GENE 27 31424 - 32980 1610 518 aa, chain + ## HITS:1 COG:FN1655 KEGG:ns NR:ns ## COG: FN1655 COG2461 # Protein_GI_number: 19704976 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 5 517 1 509 512 499 51.0 1e-141 MEHKMRGYLPPMDPEKLKALFEIENAFKSGQLSADEARQQIKERVGKVSAYHVAYIEQTM TEETSDECIREDVHAIINMLGDQIDNTMPNLPADHPIMHYLKENEEMKRLLLAVEDLVQY PMIKNQWLELYDKISQYPIHYKRKQNQLYPMLERKGFTRPTTTMWTFDDMVRDEIREAER LLREDQEDAFIKQQERVLLYARDLMEKEEFILYPTSMALISEEEFEDMKSGDQEIGFAFF EVEHKTTENKVGTQPNEQNNFANDLQALLTKYGYSAAGGGDKLDVTTGKLTLEQVNLIYK HLPIDISFVDENELVCFYSDTDHRIFPRSKNVIGREVMNCHPRKSAHIVREVIDKLRSGE QDKAEFWINKPGLFIYIIYVAVRDKDGKFRGVLEMMQDCTHIRALEGSQTLLTWSNGDTS ETAEQSHNTQETRVEDTETEDEETTEPVTEITPQTRLKDLLKQYPTLKSRLPELNPKFKM LNTPLGKIMMGKANVQMMSERSGIPLDKLIEGIKKLIS >gi|283510589|gb|ACQH01000030.1| GENE 28 33326 - 34252 574 308 aa, chain - ## HITS:1 COG:VNG1075G KEGG:ns NR:ns ## COG: VNG1075G COG1575 # Protein_GI_number: 15790173 # Func_class: H Coenzyme transport and metabolism # Function: 1,4-dihydroxy-2-naphthoate octaprenyltransferase # Organism: 
Halobacterium sp. NRC-1 # 5 301 3 309 311 144 36.0 2e-34 MNTNKTIKTNSAKAWLLAARPKTLTGAAVPVLLGAMSAYLKTGNEIRLLPIVFCFLFAFV MQIDANFVNDYFDFRKGNDNEKRLGPKRACAQGWITPAKMKCALYLTTLLACLIGLPLIF FGGWITILVGLVCVVFCFLYTTHLSYKGMGDVLVLVFFGLVPVYGTYVLALPHSALSFSA EPFALALACGFVIDTLLIVNNFRDIENDIEAGKRTLAVRLGIERTLWLYLFVGLFAILLS GFAFILAGHYFAFAFSLCYIPLHIKTFNTMKAIRKGKALNKVLGKTAANITAYGVATAIG MLFDTMLR >gi|283510589|gb|ACQH01000030.1| GENE 29 34252 - 35124 844 290 aa, chain - ## HITS:1 COG:no KEGG:PRU_0313 NR:ns ## KEGG: PRU_0313 # Name: not_defined # Def: putative pantothenate kinase # Organism: P.ruminicola # Pathway: Pantothenate and CoA biosynthesis [PATH:pru00770]; Metabolic pathways [PATH:pru01100] # 1 271 1 270 275 352 69.0 8e-96 MGIVIGIDVGISTTKIVGINKDGIVISPLRIKATDPVTSLYGAFGKYLHDNKISLEEVEH VMLTGVGAAYINEPIYGLPTNKADEFLADGLGARYETEIDRMIVVSMGTGTSLVQCDGDR IEHIGGIGIGGGTLQGLSRILLKTDDIKQVSTLALQGDITNINLLIGDISAQPLSGLPMD ATASLFGNAKSNASREDIALGLICMVLQSIGSGTILSSLNTGIKEYVLIGNLTLLPQCKD VFPAMERLYHVHFRIPKHSEFCTAIGAALYYYQGILEKKQKGQHHYGKCK >gi|283510589|gb|ACQH01000030.1| GENE 30 35115 - 35810 661 231 aa, chain - ## HITS:1 COG:BH1019 KEGG:ns NR:ns ## COG: BH1019 COG3382 # Protein_GI_number: 15613582 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 1 185 1 186 224 88 32.0 9e-18 MITVSVLPEIAAVCPNFVGACLVAKVANTTFDANLWEKIEATQAQLRQTLTTESLKQQPS IAATRQVYKALGKDPSRYRPASESLIRRLLQGKNLYQANTLVDLINLASIVHGYSIGGFD ASKIAGSTLKLGVGKAGEPYEGIGRGTLNIEGLPVYRDELGGIGTPTSDNERTKLTLDTC ELLMLINGYDGNEARVRENAIFISDLLRTHCQSDGGQFFIFKPDTSNKTWE >gi|283510589|gb|ACQH01000030.1| GENE 31 35810 - 37807 2057 665 aa, chain - ## HITS:1 COG:all1717 KEGG:ns NR:ns ## COG: all1717 COG0272 # Protein_GI_number: 17229209 # Func_class: L Replication, recombination and repair # Function: NAD-dependent DNA ligase (contains BRCT domain type II) # Organism: Nostoc sp. 
PCC 7120 # 3 665 7 676 677 547 43.0 1e-155 MEEIKRIEELREQLHHHNYLYYVQNSPTLSDQEFDRLMRELQDLEAKHPEVYDPNSPTQR VGSDLSTGFTQVKHKYAMLSLANTYNEQEVASWYATVSKDLGGQPFEVCCELKYDGLSIS LTYEQGRLVRAVTRGDGEQGDDVTANVRTIRAIPLVLPGTGYPNEFEIRGEILMPWKVFE QLNVKLEKAGETLLANPRNAAVGALKSHDPRLVAKRKLDAYLYYLLGDDLPSDGHYENLK AAETWGFKVSQGMHKATSLEQIYEFINHWDTARHDLPVATDGIVLKVNSLRQQQQLGFTA KSPRWAIAYKFKAERVCTRLNEVTFQVGRTGAVTPVANMEPVLLAGTTVKRATLNNEDFI RSLDLHIGDNVFVEKGGEIIPKIVGVDVSHRSPNLQPVHFISNCPECGSPLVRYAGEAAY YCPNDTGCPPQIKGRIEHFIARKAMNIDSIGPETVDDYFRRGIVRNVADLYEIRTEQING DGTRQKSAQKIVKGIQDSVSTPFERVLFALGIRFVGETTAKLLAKHFKSIDALMVATPEQ LVEVEGVGTVIAESVVRFFHDEVNLNIIERLRQYGLQMSQSVDQQQPASEKLAGKNIVIS GVFEQHSRDEYKEMIERNGGKNVSSISSKTSFILAGANMGPSKMQKAQQLGIDMIDESAF LKMLE >gi|283510589|gb|ACQH01000030.1| GENE 32 37860 - 38669 844 269 aa, chain - ## HITS:1 COG:no KEGG:PRU_1312 NR:ns ## KEGG: PRU_1312 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 8 267 4 256 259 218 42.0 2e-55 MILRTNNILALLCAATAIVACAPKTPKNEGKVEEDKVAKQMLQGIWINADDESVAFKVKG DTIYYPDSTSSPAYFQIFRDTLVIHGADDTVKYPIVKQAPHLFMFNNSNGETVKLTLSED ALDKFQFDNNKAQVLNQNQLIKKDTLVSYEDMRYRCYIQVNPTTYKVIKATYNDEGVEVD NVYHDNIVNLTVYSGANRLFSHDFRKNDFNKFVPQDFLMQAILSDLTLYGVDKSGVHFNA LLAMPDSPSYYVVEVSVSPTGKLKMQLGK >gi|283510589|gb|ACQH01000030.1| GENE 33 38692 - 40008 1342 438 aa, chain - ## HITS:1 COG:RSc0913 KEGG:ns NR:ns ## COG: RSc0913 COG1004 # Protein_GI_number: 17545632 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Ralstonia solanacearum # 1 438 1 450 457 463 50.0 1e-130 MNIAIVGTGYVGLVSGACFADTGASVTCVDVDTEKIERIKRGEIPIYEPGLDELVLKNVK AGRLRFTTSLESVLNEQQIVFSAVGTPPDEDGSADLKYVLQVARTIGQHLNKYMVVVTKS TVPVGTAKLVRETIQAELDKRHADVTFDVASNPEFLKEGNAIKDFMSPDRVVVGVESEKA KELLTRLYKPFLINNFRVIFMDIPSAEMTKYAANSMLATRISFMNDIANLCERVGADVNM VRAGIGSDTRIGRKFLYAGCGYGGSCFPKDVKALIKTADDMGYSMEVLKAVERVNEAQKH VVFNKLAAAFAKEGLNGKTIALWGLSFKPETDDMRESTALVTIELLRQAGCRIKVYDPVA 
MPECQRRIGTTVNYASDLYDAVNNADALLLLTEWNEFRLPNWEIVGKVMNRKLLIDGRNI FEKKELESYGFDYHSIGR >gi|283510589|gb|ACQH01000030.1| GENE 34 40050 - 40598 384 182 aa, chain - ## HITS:1 COG:PH0416 KEGG:ns NR:ns ## COG: PH0416 COG1898 # Protein_GI_number: 14590334 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Pyrococcus horikoshii # 2 177 7 180 188 197 59.0 8e-51 MEYIATELQGVFVIQPKVFNDERGYFFESWKKEEFEKHVGKVDFVQDNESKSSFGVLRGL HYQKGEASQAKLVRVIKGRVLDVAVDLRKSSPTFGKYVAVELSDENKKQLFIPRGFAHGF LVLSPEAVFTYKVDNVYCPQSETSIRWNDETLGIEWPIEYDKIVTSAKDLKGKSLGEAEV FE >gi|283510589|gb|ACQH01000030.1| GENE 35 42174 - 43247 1096 357 aa, chain - ## HITS:1 COG:aq_2119 KEGG:ns NR:ns ## COG: aq_2119 COG0611 # Protein_GI_number: 15607070 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate kinase # Organism: Aquifex aeolicus # 4 331 3 301 306 156 34.0 7e-38 MTEISALGEFGLIKRLTKDLKTNNDATLYGVGDDCAVLHYPDSEVLVSTDMLMEGVHFDL TYIDMYHLGFKSAQVNISDIFAMNGTPRQLIVSLALSKRFKVEDLEEFYAGLRAACEQWK VDIVGGDTTSSYTGLAISITCIGESPRNEIVYRNGAKDTDLICVSGDLGAAYMGLQLLER EKTVYYQQVDEARKKNDQLALQQLKAFQPDFAGKEYLLQRQLRPEARGDIIKRLREANIK PTAMMDISDGLSSELLHICDQSNCGCRVFEKNIPIDYQTAVMAEEMNMNVTTCAMNGGED YELLFTVPIGDHSKIEQMEGVKLIGHITKPEFGKQLVARDGTEFEITAQGWNPLQGQ >gi|283510589|gb|ACQH01000030.1| GENE 36 43458 - 44267 1078 269 aa, chain - ## HITS:1 COG:BH1532 KEGG:ns NR:ns ## COG: BH1532 COG0005 # Protein_GI_number: 15614095 # Func_class: F Nucleotide transport and metabolism # Function: Purine nucleoside phosphorylase # Organism: Bacillus halodurans # 3 266 6 270 275 286 52.0 3e-77 MYEKIQETASWLKQRMTTSPKTAIILGTGLGQLASEITDKYEFPYNEIPNFPVSTVEGHA GKLIFGKLGGKDIMAMEGRFHFYEGYDMKEVTFPERVMYELGIETLFVSNASGGMNPNFV IGDLMIIDDHINFFPEHPLRGKNFPTGPRFPDMHEAYDKQLRNLADQIAKEKGIRVVHGV YVGVSGPTFETPAEYKMYHRLGGDAVGMSTVPEVIVARHCGIKVFGMSIITDLGLEDQPV EVSHEEVQVAANKAQPLMTEIMREIIKRS >gi|283510589|gb|ACQH01000030.1| GENE 37 44152 - 45330 654 392 aa, chain - ## HITS:1 COG:aq_1656 
KEGG:ns NR:ns ## COG: aq_1656 COG1663 # Protein_GI_number: 15606758 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Tetraacyldisaccharide-1-P 4'-kinase # Organism: Aquifex aeolicus # 14 352 6 310 315 125 31.0 1e-28 MRTEGDFIKIHEWLTPLSWLYGLGVGFRNLLFRLGVLKSRAFDIPVISVGNITVGGSGKT PHVEYLVSLLLDKMKVAVLSRGYKRKSKGYVLASNESTMSQIGDEPYQMKQKFPTLYVAV DKKRTRGIDRLTSDEQTKDVDVILLDDAYQHRYVKPGVNILLVDYHRLIIYDKLLPAGCL REPQEGKSRADIVIITKCPKDLRPMEYRVLMKALDLYPYQSLYFTTLVYDDLKPVYGKGS IALNSLPKACNVLLLTGIASPKQMQTDLAVYKFNLHQLAFPDHHNFSKKDVRTINNKFAE LPSPKIIITTEKDASRIKFIEGFEQEVKDNMYALPVRIQFMLEQEESFNNKIINYVRKNS RNSILVKTKDDNKSQNGNHSRNGAGPISFRNN >gi|283510589|gb|ACQH01000030.1| GENE 38 45339 - 47114 1674 591 aa, chain - ## HITS:1 COG:all4590 KEGG:ns NR:ns ## COG: all4590 COG0616 # Protein_GI_number: 17232082 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Nostoc sp. 
PCC 7120 # 38 590 42 608 609 324 37.0 3e-88 MKDFLKSALASALGVLIFCLCCFAFMIMSIIGMIASADTETKLKDNSVLTINLSGTISEM AAPDVLGFLSGNTIENNGLNDMLLAIRKAKNNDDIKGIYLEAGPLVAGFSTLQELRDALA DFKKSGKWIVAYGDTYTQGCYYVASVANHIFLNPQGQVDWHGLSSQPYYIKDLAAKFGVK YQVAKVGTFKSATEMFTETKMSDANRLQVSMYLNGLWANVCKAVSESRKISIPALNTYAD EYQLFADAQSLVKKRFVDKLLYADQVKGEVKKLLGIDSDKSINQVGVTAMCNVSQDADTD DGTIAIYYAEGEIVQIAPGGMFNNSTNIVSKDICKDLEDLKNDDDIKAVVLRVNSPGGDA YASEQIWHQVTELRKKKPVVVSMGDYAASGGYYMSCGANWIVAEPNTLTGSIGIFGVFPD LSGLVTEKLGVKFDEVKTNANSAFGNVAARPFNTAEMAMLQGYINRGYATFLNRVSQGRK MPVTNLDKIAQGRVWLGADALKIKLVDQLGGIKDAVEKAAQLAKIKDYGVSEYPAPASWQ DQLFNSVVPRNTLDEQLRLTLGAAYEPFMLIRKINQREAIQARLPLELNIR >gi|283510589|gb|ACQH01000030.1| GENE 39 47774 - 49333 1138 519 aa, chain - ## HITS:1 COG:no KEGG:PRU_1227 NR:ns ## KEGG: PRU_1227 # Name: not_defined # Def: putative thiol protease/hemagglutinin PrtT # Organism: P.ruminicola # Pathway: not_defined # 8 323 2 320 777 146 32.0 2e-33 MKNNNIKKAYCLIAFLVHFTCANAVQVSQGAAFDIASKYFKKPELVTRSTLKPEGDPAFY IFTNSGHKGFVIVSGESDLPPILGYGNRFTPNEARLPDYFYSLLRHYELLVMAYRCNRVG FSKSVLNVKKEIKPLLSCTWNQEMPFKLHTPKDNNVNMPTGCVATAVSQLMYYNKWPTKR PPKFVDQSGTNAQKSSVYLWNEIKDNSTQMGEVGKDAVGVLLSDVGKAVNMKYEAKGSIS NMQWALDALRKNFDYSVKHISKEYMPKGMFYELVINELANGYPVLIGESSHSFLLDGIDK QGYIHVNWGWAGENDGWFDFATLYTPLDDEVFGTDIFALEMEAVLAHPKTGRHVQFKNIR GLEAQAVDAFKFLQHEVPRNFPIQACLKNIGTYNESNGDLGLFTGKVGLAVYDMKGKLIK VVEFKDPALQWSSMFITKDLRFNNINLAELPNGTYTVKPVSNELVAKPDKFSGWQPIAYS NTQTLVVSKNKVSVETYEWKDKRMDYTFNANTRLSAKEK >gi|283510589|gb|ACQH01000030.1| GENE 40 49986 - 50912 1089 308 aa, chain + ## HITS:1 COG:CAC0294 KEGG:ns NR:ns ## COG: CAC0294 COG0598 # Protein_GI_number: 15893586 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Clostridium acetobutylicum # 5 305 10 312 315 179 37.0 7e-45 MRTYWNFNSALKVIDEWQPNCWIQVTCPTEEDQALLEERFNIPDYFLSDISDTDERARYE YDDGWMLIILRIPYVKEIRSRTPYTTVPLGVIHKRDVTITVCYYETNMMIDFVSYQQKRG AGFTDHVDMIFRLFLSSAVWYLKRLKQINILIEKAKHNLDQDVNNESLIGLSRLQDSLTY 
FNTSIRGNENLLSKLKFKLQVDELDADLIEDVSIEMTQARETTSIYSDILESTMDTYSSI INNNMNTVMRTLTSVSIIMMFPTLIASLFGMNLINGMEQSKWGFAVAIILSFLVSGLSWW ILRAKRLL >gi|283510589|gb|ACQH01000030.1| GENE 41 51026 - 52789 1830 587 aa, chain + ## HITS:1 COG:no KEGG:PRU_1553 NR:ns ## KEGG: PRU_1553 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 14 586 38 610 611 642 68.0 0 MMDSQEKALLENAADDTSKKTYKSKEEILRRATEIVAENETPDKEEIDNLKTTFYKLHIA ERDAEQKAYLENGGDPEAYQVSPDPIEEAFKAQMGVIKERRAKLFLEQEQEKQENLQKKT EIIEKIKAMATSPDEANKSYQAFKDLQQQWKEIKAVPADKANELWRNYQLYVEQFYDLLK LNNEAREYDFKKNLELKEKLCENAERLADEKDIISAFHQLQELHQEYREIGPVSKELREQ IWARFKAASTVINKRHQQHFEEMRANEEENLAKKTALCEKVETIVAAENKGAADWEKRTK EIIDIQAEWKTIGFAPQKMNVKIFERFRTACDVFFSRKADFFKAMKEKYAQNAEKKKELV EKARQLSESTDWKSTADKLIALQKEWKTIGMVPRKLGDKLWNDFITACNHFFEARNAAHA GTRGEERENLSKKKAVIAKLKEMMGTTPDNAAETIKQLIEEYNAIGHVPFNEKDKVYAHF HDTVDKLYKELNISVASQKLDDFKSNLKNLAKKGADVVDNERGKLMRRYEALKSEIQTYE NNLGFLTAKSKKGNSLVDEMNRKVTRLKDDLELVRQKIKAIDQQEQN >gi|283510589|gb|ACQH01000030.1| GENE 42 52844 - 54541 1423 565 aa, chain + ## HITS:1 COG:STM3807 KEGG:ns NR:ns ## COG: STM3807 COG2985 # Protein_GI_number: 16767092 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Salmonella typhimurium LT2 # 28 565 18 551 553 372 39.0 1e-102 MDWLIALFNTNDSVAHIVLLYSVVIAGGVLLGKVKIGGISLGVTFVLFVGILAGHIKFTA PINILTFMQDFGLILFVFCIGLQVGPGFFESFKKGGIKLNLLSVSLILLNVAVMFACYFI FFDTTDVNALPMMVGTLSGAVTNTPGLGAANEALSSMEKSFPNGLPQIASGYACAYPLGV LGIIGATIAVRYICRVDLQEEEEKLAEQENEDPHAKPHIMHVRVNNSYLDGRTLGQISEF LNRDMVCTRLYHDGKVVIPQQDTVFSVGDEALIVCAESDAEAIQVFIGEKLPEEWAYEDK EQPLVSKRIVVTRPAINGKLLREMHFTSVYGVNITRITRQGMNLFASPSHRFQIGDRVVA VGPEDHVDRVAEVLGNSARRLDAPNVATIFIGIIVGILFGSIPLHIQGIPAALKLGIAGG PLVIAILIGRFGYKFKLITYTTTSANLMLREIGLALFLASVGIKAGAHFWDTVIQGDGIK YVYTGFIITVIPILIVGTITRLVYKFNYFTIMGMLAGTCTDPPALAYANQICSREAPGIG YSTVYPLSMFLRIFTAQLIVLFFCG Prediction of potential genes in microbial genomes Time: Sat May 28 00:35:21 2011 Seq name: 
gi|283510588|gb|ACQH01000031.1| Prevotella sp. oral taxon 317 str. F0108 cont2.31, whole genome shotgun sequence Length of sequence - 99131 bp Number of predicted genes - 88, with homology - 85 Number of transcription units - 38, operones - 18 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 1024 - 1073 10.1 1 1 Op 1 . - CDS 1089 - 1952 661 ## PRU_2534 hypothetical protein 2 1 Op 2 . - CDS 1963 - 2448 337 ## PRU_2533 hypothetical protein 3 1 Op 3 . - CDS 2426 - 2875 398 ## gi|288927857|ref|ZP_06421704.1| hypothetical protein HMPREF0670_00598 4 1 Op 4 . - CDS 2868 - 3338 457 ## PRU_2532 ECF family RNA polymerase sigma-70 factor 5 2 Op 1 . - CDS 3833 - 4174 271 ## Dfer_5567 hypothetical protein 6 2 Op 2 . - CDS 4209 - 4964 637 ## COG0637 Predicted phosphatase/phosphohexomutase - Term 4980 - 5033 14.5 7 2 Op 3 . - CDS 5046 - 6395 1586 ## COG0166 Glucose-6-phosphate isomerase - Prom 6466 - 6525 7.8 8 3 Op 1 . - CDS 6613 - 6993 181 ## GAU_1363 hypothetical protein - Prom 7014 - 7073 3.4 9 3 Op 2 . - CDS 7078 - 8073 1002 ## COG0240 Glycerol-3-phosphate dehydrogenase - Prom 8134 - 8193 2.9 - Term 8148 - 8204 0.4 10 4 Tu 1 . - CDS 8218 - 9948 1943 ## COG1190 Lysyl-tRNA synthetase (class II) - Prom 9971 - 10030 3.9 11 5 Tu 1 . - CDS 10099 - 11442 1278 ## COG1252 NADH dehydrogenase, FAD-containing subunit - Prom 11473 - 11532 4.0 + Prom 12228 - 12287 8.0 12 6 Tu 1 . + CDS 12353 - 12619 63 ## + Term 12658 - 12705 -1.0 13 7 Tu 1 . - CDS 12734 - 14464 1071 ## COG1649 Uncharacterized protein conserved in bacteria - Prom 14702 - 14761 6.7 + Prom 14498 - 14557 5.3 14 8 Tu 1 . + CDS 14726 - 15214 552 ## PRU_0739 hypothetical protein 15 9 Tu 1 . - CDS 15173 - 15433 197 ## gi|288927869|ref|ZP_06421716.1| hypothetical protein HMPREF0670_00610 + Prom 15432 - 15491 5.9 16 10 Op 1 . + CDS 15674 - 16327 801 ## PRU_1357 hypothetical protein 17 10 Op 2 . + CDS 16394 - 16609 183 ## gi|260911807|ref|ZP_05918376.1| conserved hypothetical protein 18 10 Op 3 . 
+ CDS 16610 - 18907 2157 ## PRU_1359 hypothetical protein + Term 18976 - 19034 6.1 + Prom 19061 - 19120 4.9 19 11 Op 1 . + CDS 19228 - 21246 1971 ## COG0021 Transketolase 20 11 Op 2 . + CDS 21248 - 21691 556 ## COG0698 Ribose 5-phosphate isomerase RpiB 21 11 Op 3 . + CDS 21757 - 22083 75 ## + Term 22211 - 22256 1.2 - Term 22201 - 22240 6.0 22 12 Tu 1 . - CDS 22314 - 23978 1508 ## PRU_1444 putative transporter - Prom 24029 - 24088 4.4 23 13 Tu 1 . - CDS 24118 - 26277 2300 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases - Prom 26355 - 26414 3.6 - Term 26299 - 26357 13.5 24 14 Op 1 . - CDS 26417 - 28300 2139 ## COG0706 Preprotein translocase subunit YidC 25 14 Op 2 . - CDS 28395 - 29999 1549 ## COG0504 CTP synthase (UTP-ammonia lyase) - Prom 30119 - 30178 6.5 26 15 Tu 1 . + CDS 29938 - 30162 83 ## 27 16 Tu 1 . - CDS 30324 - 31211 1029 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase - Prom 31312 - 31371 6.8 - Term 31718 - 31747 -0.4 28 17 Op 1 . - CDS 31835 - 34708 2618 ## PRU_1650 hypothetical protein 29 17 Op 2 . - CDS 34705 - 37932 3299 ## COG1074 ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) 30 17 Op 3 . - CDS 37978 - 40002 1382 ## PRU_1648 hypothetical protein + Prom 40921 - 40980 7.4 31 18 Tu 1 . + CDS 41196 - 43190 1406 ## PRU_2899 hypothetical protein + Prom 43196 - 43255 3.6 32 19 Tu 1 . + CDS 43471 - 44202 395 ## gi|288927888|ref|ZP_06421735.1| hypothetical protein HMPREF0670_00629 + Prom 44338 - 44397 5.1 33 20 Tu 1 . + CDS 44437 - 46164 1887 ## COG0737 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases + Term 46217 - 46274 -0.6 34 21 Tu 1 . + CDS 46285 - 47064 606 ## COG2816 NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding + Prom 47122 - 47181 2.6 35 22 Op 1 . + CDS 47224 - 48279 929 ## COG0820 Predicted Fe-S-cluster redox enzyme 36 22 Op 2 . 
+ CDS 48302 - 49417 537 ## PROTEIN SUPPORTED gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 37 22 Op 3 . + CDS 49424 - 50455 709 ## PRU_1560 hypothetical protein + Prom 50597 - 50656 3.6 38 23 Op 1 . + CDS 50693 - 51955 1244 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains 39 23 Op 2 . + CDS 51933 - 52454 547 ## PRU_1558 putative lipoprotein 40 23 Op 3 . + CDS 52481 - 53236 695 ## PRU_1557 hypothetical protein + Prom 53270 - 53329 5.1 41 24 Tu 1 . + CDS 53349 - 53717 460 ## PRU_1556 preprotein translocase subunit SecG + Term 53768 - 53820 14.2 - Term 53852 - 53890 2.2 42 25 Tu 1 . - CDS 54043 - 54702 693 ## gi|288927898|ref|ZP_06421745.1| hypothetical protein HMPREF0670_00639 - Prom 54745 - 54804 4.6 + Prom 56450 - 56509 2.8 43 26 Tu 1 . + CDS 56594 - 57589 1042 ## COG1609 Transcriptional regulators 44 27 Op 1 . + CDS 57789 - 60878 2516 ## ZPR_4655 TonB-dependent receptor Plug domain protein 45 27 Op 2 . + CDS 60889 - 61674 732 ## ZPR_4656 RagB/SusD family protein + Term 61770 - 61819 13.5 - Term 61758 - 61807 5.1 46 28 Op 1 . - CDS 61835 - 62473 601 ## COG0207 Thymidylate synthase 47 28 Op 2 . - CDS 62445 - 63305 916 ## BVU_0955 hypothetical protein - Prom 63409 - 63468 1.6 - Term 63434 - 63470 2.2 48 29 Op 1 . - CDS 63480 - 63776 318 ## gi|288927904|ref|ZP_06421751.1| hypothetical protein HMPREF0670_00645 49 29 Op 2 . - CDS 63788 - 69559 6373 ## gi|288927905|ref|ZP_06421752.1| conserved hypothetical protein 50 30 Op 1 . - CDS 69733 - 70452 772 ## gi|288927906|ref|ZP_06421753.1| LigA protein 51 30 Op 2 . - CDS 70455 - 71279 239 ## gi|288927907|ref|ZP_06421754.1| hypothetical protein HMPREF0670_00648 52 30 Op 3 . - CDS 71272 - 71781 428 ## gi|288927908|ref|ZP_06421755.1| hypothetical protein HMPREF0670_00649 53 30 Op 4 . - CDS 71771 - 72619 1002 ## Cpin_0294 hypothetical protein 54 30 Op 5 . - CDS 72630 - 72905 292 ## gi|288927910|ref|ZP_06421757.1| hypothetical protein HMPREF0670_00651 55 30 Op 6 . 
- CDS 72890 - 73306 355 ## gi|288927911|ref|ZP_06421758.1| hypothetical protein HMPREF0670_00652 56 30 Op 7 . - CDS 73351 - 73794 594 ## COG3023 Negative regulator of beta-lactamase expression 57 30 Op 8 . - CDS 73791 - 74282 424 ## gi|288927913|ref|ZP_06421760.1| hypothetical protein HMPREF0670_00654 58 30 Op 9 . - CDS 74295 - 74600 404 ## gi|288927914|ref|ZP_06421761.1| hypothetical protein HMPREF0670_00655 59 30 Op 10 . - CDS 74604 - 75068 506 ## gi|288927915|ref|ZP_06421762.1| hypothetical protein HMPREF0670_00656 60 30 Op 11 . - CDS 75131 - 76075 1061 ## Coch_0642 hypothetical protein 61 30 Op 12 . - CDS 76110 - 76736 654 ## gi|288927917|ref|ZP_06421764.1| hypothetical protein HMPREF0670_00658 62 30 Op 13 . - CDS 76830 - 78599 1737 ## Cpin_0287 hypothetical protein - Prom 78622 - 78681 2.3 63 31 Op 1 . - CDS 78765 - 79049 467 ## Coch_0646 hypothetical protein 64 31 Op 2 . - CDS 79071 - 79478 548 ## Coch_0647 hypothetical protein 65 31 Op 3 . - CDS 79533 - 80717 1459 ## Coch_0648 hypothetical protein 66 31 Op 4 . - CDS 80732 - 81643 774 ## gi|288927923|ref|ZP_06421770.1| hypothetical protein HMPREF0670_00664 67 31 Op 5 . - CDS 81678 - 82787 940 ## COG0740 Protease subunit of ATP-dependent Clp proteases - Prom 82897 - 82956 7.1 + Prom 82759 - 82818 4.5 68 32 Op 1 . + CDS 82956 - 83405 455 ## gi|288927926|ref|ZP_06421773.1| hypothetical protein HMPREF0670_00667 69 32 Op 2 . + CDS 83402 - 84943 1367 ## DvMF_0728 phage uncharacterized protein 70 32 Op 3 . + CDS 84945 - 85385 545 ## gi|288927928|ref|ZP_06421775.1| hypothetical protein HMPREF0670_00669 71 32 Op 4 . + CDS 85410 - 86747 1442 ## COG4383 Mu-like prophage protein gp29 72 33 Op 1 . + CDS 86900 - 88171 702 ## gi|288927930|ref|ZP_06421777.1| conserved hypothetical protein 73 33 Op 2 . + CDS 88248 - 88823 513 ## gi|288927931|ref|ZP_06421778.1| putative phage virion morphogenesis protein 74 33 Op 3 . 
+ CDS 88804 - 89295 594 ## gi|288927932|ref|ZP_06421779.1| hypothetical protein HMPREF0670_00673 - Term 89092 - 89127 3.3 75 34 Tu 1 . - CDS 89237 - 89527 206 ## gi|288927933|ref|ZP_06421780.1| hypothetical protein HMPREF0670_00674 - Prom 89551 - 89610 2.5 - Term 89559 - 89599 4.2 76 35 Op 1 . - CDS 89715 - 90362 904 ## gi|288927934|ref|ZP_06421781.1| hypothetical protein HMPREF0670_00675 77 35 Op 2 . - CDS 90343 - 90825 525 ## gi|288927935|ref|ZP_06421782.1| hypothetical protein HMPREF0670_00676 78 35 Op 3 . - CDS 90857 - 91129 260 ## gi|288927936|ref|ZP_06421783.1| hypothetical protein HMPREF0670_00677 79 35 Op 4 . - CDS 91143 - 91763 534 ## gi|288927937|ref|ZP_06421784.1| hypothetical protein HMPREF0670_00678 80 35 Op 5 . - CDS 91760 - 92296 420 ## Aave_1600 hypothetical protein 81 35 Op 6 . - CDS 92342 - 93004 494 ## COG1066 Predicted ATP-dependent serine protease 82 35 Op 7 . - CDS 93004 - 93879 1003 ## gi|288927940|ref|ZP_06421787.1| hypothetical protein HMPREF0670_00681 83 36 Op 1 . - CDS 94027 - 96039 2245 ## gi|288927941|ref|ZP_06421788.1| hypothetical protein HMPREF0670_00682 84 36 Op 2 . - CDS 96049 - 96450 456 ## gi|288927942|ref|ZP_06421789.1| conserved hypothetical protein 85 36 Op 3 . - CDS 96494 - 96952 492 ## gi|288927943|ref|ZP_06421790.1| hypothetical protein HMPREF0670_00684 86 36 Op 4 . - CDS 96966 - 97181 323 ## gi|288927944|ref|ZP_06421791.1| hypothetical protein HMPREF0670_00685 - Prom 97283 - 97342 4.6 + Prom 97148 - 97207 6.0 87 37 Tu 1 . + CDS 97283 - 97957 391 ## BDI_0844 hypothetical protein - Term 98335 - 98389 -0.5 88 38 Tu 1 . 
- CDS 98393 - 98617 207 ## gi|288927946|ref|ZP_06421793.1| hypothetical protein HMPREF0670_00687 - Prom 98685 - 98744 5.4 Predicted protein(s) >gi|283510588|gb|ACQH01000031.1| GENE 1 1089 - 1952 661 287 aa, chain - ## HITS:1 COG:no KEGG:PRU_2534 NR:ns ## KEGG: PRU_2534 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 287 1 294 294 205 41.0 2e-51 MEKILVCMALTFNLAAKATPSDNDTVVVERPQKVRIITGDSVQSVEVWGKEGKQTFHYVS NIQLVDSNYVSTSTINDDTWSFSIAGFSKPRRKKSRTECTTHFVVGFNNAVGMPTGADIQ PFKSWELWWIVTDWTYRPWRNNHFLSMGLGLDWRNYRMTDDLRFVKDNRSVALDTYPTGS SPQFSRIKVFSINLPIRYGYEGKWFGFSLGPVINFNTYGSLKTRYEKDGHELKDVDKDVR ITPVTVDFMGTFSIRGVPDFYFKYSPCNLLRDGYGPKFRTLSFGLMF >gi|283510588|gb|ACQH01000031.1| GENE 2 1963 - 2448 337 161 aa, chain - ## HITS:1 COG:no KEGG:PRU_2533 NR:ns ## KEGG: PRU_2533 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 30 160 1 134 135 95 39.0 5e-19 MTFSESNKWNKIKYIVMKLRFFLTAIVAFLAFEAHAQQGLNVNALFQGKVVPHEEMVDVR VKGRAISKYKLDFYRSIRFNATEQQRNVVDDLVDSDCKTAIGTEQTTRNGTTTLIMTLPK QGNMNRYLCYITCRKGRTTVITVVYMEGKVESIAELRKLIH >gi|283510588|gb|ACQH01000031.1| GENE 3 2426 - 2875 398 149 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927857|ref|ZP_06421704.1| ## NR: gi|288927857|ref|ZP_06421704.1| hypothetical protein HMPREF0670_00598 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 149 1 149 149 278 100.0 8e-74 MNKDEEIRILVARFMEGDTTLDEEGKLYCYFSGDEVSDDLLPMREYFRSIATMRTPNHHR HKRLPVLHGLLSRPLFIGIAASVALLVVVGVGGMLWQPEPQDYCEAYIYNRHVTAPATVM EEVDETLEAISQDGELNVDGQLHDIFGKQ >gi|283510588|gb|ACQH01000031.1| GENE 4 2868 - 3338 457 156 aa, chain - ## HITS:1 COG:no KEGG:PRU_2532 NR:ns ## KEGG: PRU_2532 # Name: not_defined # Def: ECF family RNA polymerase sigma-70 factor # Organism: P.ruminicola # Pathway: not_defined # 1 150 1 151 225 130 45.0 1e-29 MTVDEYKEEVERNRPRMLSVARSYLKAGEEAEDVVQDVLLKLWQLLDKLRIPMGPLALVL VRNRCIDCLRRLQTTIDIPEYVAESEPTCDERYEKVMRLVDKLPTMQQTIMRLRHMEGME MCEIAALTGSNETAVRKALSRARQAIRQQLKQQSNE >gi|283510588|gb|ACQH01000031.1| GENE 5 3833 - 4174 271 113 aa, chain - ## HITS:1 COG:no KEGG:Dfer_5567 NR:ns ## KEGG: Dfer_5567 # Name: not_defined # Def: hypothetical protein # Organism: D.fermentans # Pathway: not_defined # 2 104 13 115 119 112 45.0 4e-24 MVAEVSANDMLLTAPEDANDLLGNAYYQGFDGMIISADKISPRFFDLKTRLAGEILQKFS TFRMRLAIVGNFSTFTSESLKSFVYESNRGSLIHFSPTTADAVAWFENQFCHR >gi|283510588|gb|ACQH01000031.1| GENE 6 4209 - 4964 637 251 aa, chain - ## HITS:1 COG:MA0451 KEGG:ns NR:ns ## COG: MA0451 COG0637 # Protein_GI_number: 20089342 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Methanosarcina acetivorans str.C2A # 32 244 3 206 218 85 29.0 1e-16 MYAPTSPNLPFGHEQEVKSYLENRGFDSFQPQAVLFDMDGVLYDSMPNHARCWQEAMAKF GLRMTAADVYATEGMRGVETIRLMVKAQQGRDISEDEAQIMYDEKARLFGLLPKAPIMEG VLELMEKIKAAGMCIVVVTGSGQLPLIERLQHDFKGFVTADKIVSAYDVTRGKPAPDPYL MGLQKAGDLLPWQGIVVENAPMGVRAGVAAQIFTIAVNSGPLPNATLAGEGANIVFDRMT QLRDTWMSDHD >gi|283510588|gb|ACQH01000031.1| GENE 7 5046 - 6395 1586 449 aa, chain - ## HITS:1 COG:BH3343 KEGG:ns NR:ns ## COG: BH3343 COG0166 # Protein_GI_number: 15615905 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate isomerase # Organism: Bacillus halodurans # 4 449 5 450 450 491 54.0 1e-138 MKNISLDITKAAQFLSEGAVKAYEPKVKAAQEALENGTCPGNDFLGWLHLPNSITPQFLD 
EVQAVANTLREKCEVIVVAGIGGSYLGARAIIEALGNSFAWLVGDKTNPTIVFAGNNIGE DYLFELSEYLKDKRFGVINISKSGTTTETALAFRLLKKQCEEQRGKAEAKDVIVAITDAK RGAARAAADKEGYKTFVIPDNVGGRFSVLTPVGLLPIACAGFDIKALVNGAADMEKATAP SVPFAENIAAQYAAVRNALYTEKGKKIEIMVNYQPKLHFIAEWWKQLYGESEGKEHKGIF PASCDFTTDLHSMGQWIQESERSIFETVISVEQPAKKLLFPSDDENLDGLNFLAGKRVDE VNKMAELGTLLAHVDGGVPNIRISVPELNEYYIGQLIYFFEIGCGISGNVLGVNPFNQPG VEAYKKNMFALLDKPGYEAESKAIKERLK >gi|283510588|gb|ACQH01000031.1| GENE 8 6613 - 6993 181 126 aa, chain - ## HITS:1 COG:no KEGG:GAU_1363 NR:ns ## KEGG: GAU_1363 # Name: not_defined # Def: hypothetical protein # Organism: G.aurantiaca # Pathway: not_defined # 10 109 1 100 182 110 52.0 2e-23 MEVENAITKIINCAYTVHKQLSIGFAENVYKNAMVIEMREQGLSVKTEMPFEVMYKGRVV GSYRADIVVEDKVILELKAVHSLAVAHEIQLVNYLTALHIDYGLLINFGSELITIKRKFR TYRKAH >gi|283510588|gb|ACQH01000031.1| GENE 9 7078 - 8073 1002 331 aa, chain - ## HITS:1 COG:SA1306 KEGG:ns NR:ns ## COG: SA1306 COG0240 # Protein_GI_number: 15927055 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Staphylococcus aureus N315 # 6 328 3 326 332 162 30.0 1e-39 MFNCGKIAIVGGGSWATAIAKIVVRHTHHIGWYMRRDDRIDDFKRLGHNPAYLMSVHFNV DEIYFSSDINKIVENYDTLVFVTPSPYLKSHLKKLKTRLHDKFVVTAIKGLVSDDNLLCS EFFHQVYDVPYANIACIGGPSHAEEVALERLSYLTVGCADRDKAQAFTEILSSDFIKTKT SQDVVGIEYSSVLKNVYAIAAGICSGLKYGDNFQAVLMSNAVQEMSRFLSAVHPLERSVY DSVYLGDLLVTGYSNFSRNRTFGTMIGKGYSVKSAQIEMEMIAEGYFGTKCMKETNRHLH VNMPILDAVYNILYEKISPQVEIKLLTDSFR >gi|283510588|gb|ACQH01000031.1| GENE 10 8218 - 9948 1943 576 aa, chain - ## HITS:1 COG:CAC3197 KEGG:ns NR:ns ## COG: CAC3197 COG1190 # Protein_GI_number: 15896444 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Lysyl-tRNA synthetase (class II) # Organism: Clostridium acetobutylicum # 13 499 33 510 515 457 48.0 1e-128 MNVLELSEQEIVRRQSLQELRDLGIDPYPAAEYPTDAFSTDIRDNFKGDEQREVCIAGRM MTRRVMGKASFMELQDSKGRIQVYVTRDDICPNENKELYNTVFKRLLDIGDFVGVRGFVF RTQTGEISVHAKELTLLSKSIKPLPIVKYKDGVAYDKFDDPELRFRQRYVDLVVNEGVKE 
TFLKRAKVISTMRQFFDEAGYTEVETPTLQSIAGGATARPFVTHFNALNQEMFLRIATEL YLKRLIVGGFEGVYEIGKNFRNEGMDRTHNPEFTLMELYVQYKDYNWMMSFTERLLETIC VAVNGKPESVVDGQVISFKAPYRRLPILDAIKEKTGYDLDGKTEEEIRQVCKELKLDIDD TMGKGKLIDEIFGEFCEGQFLQPTFITDYPVEMSPLTKMHRSKPGLTERFELMVNGKELA NAYSELNDPIDQEERFVEQMKLADKGDDEAMIIDKDFLRSLQYGMPPTSGIGIGIDRLVM LMTGQTAIQEVLLFPQMRPEKSIPKSTTAEWQALGVPAEWVPVFNKAGYNLISDIKEVKA QKLQMDVCNVNKKYKLGYENPKVDAFQAWIDKANEG >gi|283510588|gb|ACQH01000031.1| GENE 11 10099 - 11442 1278 447 aa, chain - ## HITS:1 COG:all2964 KEGG:ns NR:ns ## COG: all2964 COG1252 # Protein_GI_number: 17230456 # Func_class: C Energy production and conversion # Function: NADH dehydrogenase, FAD-containing subunit # Organism: Nostoc sp. PCC 7120 # 11 423 5 424 442 262 36.0 1e-69 MSLNIAKDGRKRIVIVGGGFGGLQVANKLKGTDYQVILIDKNNYHQFPPLIYQVASAGME ASSISFPFRRNFQKHKNFYYRMAELRAIFPEKKLIQTSIGKVEYDYLVLAAGTTTNFFGN KNVEEQAMPMKTVDEAMGLQNAILSNIERAITCATKQEQQELLNVVVVGGGATGVEIAGV LSEMKRTILPHDYHDLDPSLMNIYLIEAGNRLLSAMSPESSSAVEKYLREMGVNILLNKM VTDYKDHKVMLADGSSISTRTFIWVSGVAGQRVGNLDAGHLGRGRRIKVDTFNRVEGLED VFCIGDQCIVEGDKDYPNGHPQLAQVAIQQGKNLAKNLKRMAKAKPLSPFRYKNLGAMAT VGRNKAVAEFAKIKMKGFGAWFMWLVVHLRSILGVRNKMVVLLNWMWNYFNYNQSLRMIF YPKKAKEIREREEREAKTHWGEDLMEQ >gi|283510588|gb|ACQH01000031.1| GENE 12 12353 - 12619 63 88 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLNFLLKTRQKAIVSHGWATHTNCCLLANKIKASNYSAIMNVDYQELINVILWKKSRNDI TLPEFVRVCLCNLLVINNSLYALLLNSY >gi|283510588|gb|ACQH01000031.1| GENE 13 12734 - 14464 1071 576 aa, chain - ## HITS:1 COG:all1776_2 KEGG:ns NR:ns ## COG: all1776_2 COG1649 # Protein_GI_number: 17229268 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. 
PCC 7120 # 45 480 2 448 455 186 30.0 1e-46 MIKQSSFLRRFLLLFLTFFVATTLLAQKFSFSKPNALNGWKLPKREVRAVWLTTIGGLDW PRSYAQNDLAASKQKQELRGILDKLQRAGINTVLFQTRIRGTVVYPSLLEPWDGCLSGVP GRSPGYDPLAFAIDECHKRGMELHAWVVTIPVGKWNALGCKTLRNKYPHLIKRIGEEGYM DPENPTTATYLANICKEISDRYDIDGLHLDYIRYPETWKINIAHDAARRNITAIVRAIGE KVKANKPWVKYSCSPIGKFNDLSRFASNGWNAYTKVCQDAQGWLHDGLMDALFPMMYFQG NNFFPFAIDWAEQSYGRMVVPGLGIYFMSPSEKNWSLDIITREMQVSRQYGMGHAYFRSK FFTDNLKGIYTYAQQVFTPTLALPPAMTWENSKLPSPPTDLNTSEENGNVFVSWRGGRST NNSPYIIYNVYASSSYPVDVTDGRNLIAMRYAKNSIEVSPRNAQMYFAITTMDRYGNESQ PLQTRKAAGGSRSELKMLPYSDGKVLLPKSADALWGRVWVVETLQGQHVATLSSVNDKLD VRGIANGVYVLRALNNKGVGHRLGMFTKRWIRSPFD >gi|283510588|gb|ACQH01000031.1| GENE 14 14726 - 15214 552 162 aa, chain + ## HITS:1 COG:no KEGG:PRU_0739 NR:ns ## KEGG: PRU_0739 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 3 161 4 166 167 117 41.0 2e-25 MKKKLGLLFVSLMLTLTASAQFEEEKVYVAGSLSGLDINYNGNNKFSFGVQAKAGYFFED DWMVLGQFAYNHYGNETVPDYISVGAGLRYYIEQNGLFVGANASLVHTYHNYNDIMPGFE LGYAFFVSRTVTIEPSLYYNQSFKSHSKYSTVGLRIGVGIYI >gi|283510588|gb|ACQH01000031.1| GENE 15 15173 - 15433 197 86 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927869|ref|ZP_06421716.1| ## NR: gi|288927869|ref|ZP_06421716.1| hypothetical protein HMPREF0670_00610 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 86 1 86 86 133 100.0 3e-30 MAQRRVIDVARMLWTTNIRMRATLMNETSLDIAEDNTLKATRNVYNTVEYLSLSLPTLVY VCKGSYSFLTNRRLNVNTYTNSEANC >gi|283510588|gb|ACQH01000031.1| GENE 16 15674 - 16327 801 217 aa, chain + ## HITS:1 COG:no KEGG:PRU_1357 NR:ns ## KEGG: PRU_1357 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 208 1 195 205 158 44.0 1e-37 MKRKLFLALLFVPVFTFAQNDWERPQSKEQNKSTQTKSKKESATTTEQAKYLEGAVPTVD GKVVFDYELDLPGKSAQEIYDATYAAIEDLTKGENQFPESSIALVNKKEHIIAARLKEWL VFQSTFLSLDRTVFNYTLIAKCSDGKLNLTLSRISYAYEMNRGEGSGVEATAEKWITDQY GLNKAKTKLSKMSGKFRRKTIDRKDEVFETIKQRLRQ >gi|283510588|gb|ACQH01000031.1| GENE 17 16394 - 16609 183 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260911807|ref|ZP_05918376.1| ## NR: gi|260911807|ref|ZP_05918376.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. F0295] # 1 71 1 71 71 115 87.0 1e-24 METQYPQRRHRRPDKEQDQFLPIRNILNIIFIIGAIIGVSVYFLSDTTVGTFIVLGSMVF KIAETILRFIR >gi|283510588|gb|ACQH01000031.1| GENE 18 16610 - 18907 2157 765 aa, chain + ## HITS:1 COG:no KEGG:PRU_1359 NR:ns ## KEGG: PRU_1359 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 6 765 8 776 776 746 50.0 0 MNRLFMLAAMLSLALSSALAQDDFTRFDENGNFSTGKKENRKDSLGSNKEIPVGLKVWTV DPRFGDRIAAQPDTLSHMFMNKAFTYGMRGEYNMLGNLSSPRLNRVFVDRSADGQFIFTE PYSYFIVPADKFLFTNTLSPITNLSYFSCGDKTDGEDHFTALFGVNAGKRVGVGFKFDYN YGRGFYQNQSTAMFNYTMYGSYIGDRYQAHLLFSTNHQKASENGGITNDNYITHPETFSE NFTTNEIPTVLTKNWNRNDNLSIFFSHRYNIGFNRKVPMTKEEIAAKKFAMEAKKDKKQR EKKQNGRTSKAGDNATDDKAYGGRPDNAKIAGVEPEDSARSTRRITVSSKAQADSLMAQS DQPKKDTSWLKNEYVPVTSFIHTLKFENYKRIYEAHQTPTNYYANTYFNAGKLTGDSIFD ETRHHRLQNTLALALLEGFNKWAKAGIKAFISSDIRHYELPNTQGRLESFNEQTMSFGGQ LAKTQGKTLHYGVTAETWFLGKDAGQLKVDASADLNFPLLGDTVSLAANAFFHRINPGFY FRHFQSRHFWWENNSLSMTIHSRLQATLSSQKTRTKLRFAVDELKNHTYFAQNYTINSDF TRTGNTVNVEQSSAAINLLTAELTQDFTFGPLNWESVVTWQRTSHPDVLPLPSLNIYTNL YLRFKIAHVLKCDFGADMRYFTAYNAPDYSPALGQFTVQAQTDKVKIGNYPLVNVYANFH LKQARFFILMSHINAGSGNKKYFFAPHYPLNDRMLYFGLSWNFFN 
>gi|283510588|gb|ACQH01000031.1| GENE 19 19228 - 21246 1971 672 aa, chain + ## HITS:1 COG:BH2352 KEGG:ns NR:ns ## COG: BH2352 COG0021 # Protein_GI_number: 15614915 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase # Organism: Bacillus halodurans # 4 670 3 663 666 456 39.0 1e-128 MNDKKLMNLAADNIRILAASMVEKAKSGHPGGAMGGADFINVLFSEFLVYDPENPSWEGR DRFFLDPGHMSPMLYSALALQGKFTIDELQQFRQWGSPTPGHPERDVQRGIENTSGPLGQ GHAYGAGAAVAQKFLETKLSHTMMQHTIYIYISDGGIQEEISQGVGRLAGALGLDNIVMF YDSNDIQLSTECGVVTCENTAAKYESWGWKVITIDGNDADEIRRALNEAKAEKEKPTLII GNTVMGKGALRADGSSFEHHIGTHGAPLGAEAYQKTIVNLGGNPNEPFVVYPEVKELYAK RNEELREIVAKRHQEENEWAKNNPQLAAQMKDWFSGNAPQVDWSTLTQKPDAATRVASAA CLGLLAEQVPNMVCSSADLSNSDRTDGFLNKTREMKRGDLSGAFLQVGVSELTMACMCIG MYLHGGVIPACGTFFVFSDYMKPAIRMAALMRVPIKFIWTHDAFRVGEDGPTHEPVEQEA QIRLMEKLKNHCGEDSVRVFRPADVNETTVCWQMAMENMNTPTALILSRQNVKNIHPDTN YELARRGAYVVAGSDSQFDVILLASGSEVATLEAGAELLRKDGVKVRIVSVPSEGLFRTQ SAEYQEEVLPTGSKIFGLTAGLPVTLQGLVGCNGKVYGLNSFGYSAPFKVLDEKLGFTAE NVYHQVKSLIRE >gi|283510588|gb|ACQH01000031.1| GENE 20 21248 - 21691 556 147 aa, chain + ## HITS:1 COG:TM1080 KEGG:ns NR:ns ## COG: TM1080 COG0698 # Protein_GI_number: 15643838 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Thermotoga maritima # 6 143 3 140 143 139 47.0 1e-33 MEIKTIGIACDHAGFPLKQYVLQYLEEHGYPYKDYGTYSDQSSDYPDFAHALAEGIESGE VYPGIGICGSGEGMAMTLNKHQGVRAGLAWMPEIAQLIRQHNDANVLVMPGRFVDNKTAK KILDGFFSATFEGGRHLNRVKKIAIKE >gi|283510588|gb|ACQH01000031.1| GENE 21 21757 - 22083 75 108 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEVINTSSSFFLKGSRFIREPFNFSSSDKLSLCLIIIYNNNHTCRFITSLSLIAMKLGVV NLFQLCPGKHFIENAPTVMLFQIFKRRSEQIQGQLFILRCILCIPLTL >gi|283510588|gb|ACQH01000031.1| GENE 22 22314 - 23978 1508 554 aa, chain - ## HITS:1 COG:no KEGG:PRU_1444 NR:ns ## KEGG: PRU_1444 # Name: not_defined # Def: putative transporter # Organism: P.ruminicola # Pathway: not_defined # 10 554 8 540 540 759 75.0 0 
MTTINNATSTLRDSAAARWTALLLLAMAMFFAYIFMDILSPIKDLMESTRGWDSTAFGTY AGSETFLNVFIFFLIIAGIILDKMGVRFTAILSGLVMLTGACINWYAVTESFMGSSLEHW FSDNLNYIPLFDELGVSPFYAGMPASAKLASIGFMIFGCGAEMAGITVSRGIVKWFKGKE VALAMGSEMALARLGVATCMIFSPVFARLFGRVDVSRSAAFGLILLMIALIMFVVYFFMD KKLDAQTGEAEEKDDPFRISDIGQILRSQGFWIVALLCVLYYSAIFPFQKYAVNMLQCNL TFTHLAEGDFWASNTVTIIQYFVMITIAATAFTSNFSKKASLKYGLLFISLLFLVGYCFI AYKRQSAEAIFAVFPLLAVGITPILGKYVDHKGKAASMLVLGSVLLIVCHLTFAFVLPMF KGNEIGGVTLAFVTILILGASFSLVPASLWPSVPKLVDSKIIGSAYALIFWIQNIGLWLF PLLIGKILKASNPDIVQSLEAGTLSPAEAATSYNYTNPLLMLAMLGLVALLLGLYLRVVD RKKGYGLEEPNIKQ >gi|283510588|gb|ACQH01000031.1| GENE 23 24118 - 26277 2300 719 aa, chain - ## HITS:1 COG:CC1986 KEGG:ns NR:ns ## COG: CC1986 COG1506 # Protein_GI_number: 16126229 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Caulobacter vibrioides # 33 716 16 675 683 313 29.0 5e-85 MKKSLALAAMALLAVAPQSIGQTMIGKNEITLKSDLMTPEALWAMGRIGGAQASPDGKLV VYRVAYYSVQENKGHHMLFVSNADGTNKKRLTTTADNETDAAWIENGRRIAFLTKGQLWS MNPDGSDRKQLTHSATDIEGFKFSPDGKKVVLVKSIKYTGVIQANPTDLPKATGRKVTDL MYRHWDHYVESIQHPFVADVQADGITEGKDMLEGEPYECPMEPFGGVEQLAWSPDSKTIA YTCRKKTGREYSISTDSDIYLYELATGKTTNLCKPADYKAPAVDATKSLRNQAVNRQAKD YNVGYDVNPQFSPDGKYVAWQSMKRDGCESDRNRLCIYTLATGEKRYVTESFDSNVDDYC WAANSQTLYFIGVWHGCVNMYQTNLVGKVMQLTEGWHDYASVQLLGNTGKLLAMRHSYSH PDELFVVTPSKKEKKADVKQITDENKHIFDQLEMGKVQQRWVNTVDGKKELVWIILPPHF DPNKKYPALLFCEGGPQSPVSQFWSYRWNFQIMAANGYVIIAPNRRGLPGFGSEWNDEIS GDWTGLCMKDYLAAVDDAVANLPFVDKDRLGCVGASFGGFSVYYLAGHHDKRFKAFISHD GAFNLEAMYTETEENWFSNWEYDDAYWNKDQTANAKRTYENSPHRFIDKWDTPILCIHGE KDYRINATQGMSAFNAARMKGIPAELLIFPDENHWVLKPQNGVLWQRTFFEWLDRWLKK >gi|283510588|gb|ACQH01000031.1| GENE 24 26417 - 28300 2139 627 aa, chain - ## HITS:1 COG:BMEII0275 KEGG:ns NR:ns ## COG: BMEII0275 COG0706 # Protein_GI_number: 17988619 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YidC # Organism: Brucella melitensis # 70 559 61 547 588 144 
26.0 6e-34 MNKNNLIGFILIGLVLIGFSLWNRPSEEELAKQQQKELIEKKNKEEQTKAKAKADAARAQ ALEQATTDTAALFHKALIGEAKPIVLKNSKLTLQLNTKGGVVDKVIISNFTDRNGHKDLT LFQGKDQSLNYTFSTKDGYVSTADLFFTPSEVTDTTVTFTAEAAAGKTLTIKYVLDNEYM LHTSISATGMGGILNPSVNTVDINWQERCAQQEKGFTFENRYASLTYHKTEGGTDYLSES SEKIDEVIEDKIDWVAFKNQFFSAVMIAKTDFGTNATLTSIPQEKGSGYLKQYQAKLKTA FDPTGSQPTQLDFYYGPNDFRLLQKMESKSHFGKDLELERLVYLGWPLFRIINRWFTLYV FDWLTGLNINMGVVLILITLLLKFITFPLVKKSYLSSAKMRVLKPKLEEATKEFNKPEDQ MKKQQAMMSEYAKYGVSPLSGCLPMLIQMPIWIAMFNFVPNAIQLRGESFLWIKDLSTYD PVLEWGHNIWLIGDHLSLTCILFCLSNVLYSMMTMRQQRDQMVGQQAEQMKMMQWMMYLM PVMFFFMFNDYSSGLNFYYFVSLFFSAAIMWLLRKTTNDEKLLSILEANYKEAQNNPQKL KGLSARLQAMQQQQQELQRKREQLGKK >gi|283510588|gb|ACQH01000031.1| GENE 25 28395 - 29999 1549 534 aa, chain - ## HITS:1 COG:BS_ctrA KEGG:ns NR:ns ## COG: BS_ctrA COG0504 # Protein_GI_number: 16080768 # Func_class: F Nucleotide transport and metabolism # Function: CTP synthase (UTP-ammonia lyase) # Organism: Bacillus subtilis # 4 534 2 535 535 589 52.0 1e-168 MTETKYIFVTGGVVSSLGKGIISSSIGKLLQARGYNITIQKFDPYINIDPGTLNPYEHGE CYVTVDGMETDLDLGHYERFTDIKTTKANSLTTGRIYKAVIDKERRGDYLGKTIQVVPHI TDEIKRNVKLLGEKYHYDFVITEIGGTVGDIESTPFLEAIRQLKWELGRKAVCVHLTYIP YLKAAQELKTKPTQHSVKELQSVGIQPDVLVLRTEKTLNAGILKKVAAFCNVDLDCVVQS EDLPSIYEAPVRMQEQGLDVAILKRMGEPIGEKPALGPWRKFLERRRQATSEVNIGLVGK YDLQDAYKSIRESLYQAGTYNDHKTVITFINSEKITKENVAEKLAGMDGIVICPGFGQRG TEGKIIAAHYARTNDMPTFGICLGMQMMVIEFARNVLGYEDANSREMDEKTEHNVIDIME EQKNITNMGGTMRLGAYECVLKQTSRVLDIYKHEHIQERHRHRYEFNNEYQKEYERAGMA CVGKNPESDLVEIVEISGLKWYIGTQFHPEYQSTVLNPHPLFVDFVKTAIANKK >gi|283510588|gb|ACQH01000031.1| GENE 26 29938 - 30162 83 74 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPLPSEETTPPVTKIYLVSVTMIIIYLFINAICMLWKPRCHWQYAGLFFLNCKCTIKKRY AQELKKRYKAYATK >gi|283510588|gb|ACQH01000031.1| GENE 27 30324 - 31211 1029 295 aa, chain - ## HITS:1 COG:TM1521 KEGG:ns NR:ns ## COG: TM1521 COG0329 # Protein_GI_number: 15644269 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate 
synthase/N-acetylneuraminate lyase # Organism: Thermotoga maritima # 5 292 1 290 294 258 43.0 7e-69 MATNIFKGLGIALVTPFKTDGSIDYAALKRLIEYQTDNGADFLCILGTTSESPCLDQEER AEIKRFVVEVNQGRLPVLMGCGGNNTKAVVKELQSFDLRGVDGILSVCPYYNKPSQEGLY RHFKMIADNCPLPVVLYNVPGRTGINLKSETTVRLANDCRNIVAVKEAGGSLEQVDEIIK NKPAHFDVLSGDDALTFPMIASGAAGVISVIGNALPREFSRMIRLEFNGEYEPARKIHHR FTELYSLLFVDGNPAGVKALLHEMGFIENELRLPLVPTRVATVQKMAAILQEMRT >gi|283510588|gb|ACQH01000031.1| GENE 28 31835 - 34708 2618 957 aa, chain - ## HITS:1 COG:no KEGG:PRU_1650 NR:ns ## KEGG: PRU_1650 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 4 955 3 877 879 890 50.0 0 MRKTFLEYVAEDLLKKFGTDLSRVTLVFPNKRASLFLNEHLARMADGPLWSPVYTTISQL FRDRSARIVADDIKLVCDLYRIYVQCTGSTETLDHFYGWGMLMLSDFDDIDKNLADAADV FRNLSNIHELDDVSYLTGEQREALKRFFSNFSDDYNSELKKRFLQLWQNFGDIYNRYNAH LQAEGLAYEGALYREVACNENVSFDADVYVFVGFNMLQKVEQKLFARLEKAGKAKFYWDF DRYYMEGVNNEAGHYIRQYLNAFPNELFNRDEEIYNNFCQEKKMQFVSASTETIQARFVG HWLQNDTFVQAGRKTAVVLCDENLLQTVVHSLPPQVESVNITTGFPLAQSPVFSFVNALI ALQTAGYTRSSGRYRLQYVRPVLRHPYGLFLSENCSKLLATLEEHHTYYPLRQDLATDEG LTLLFADLEEGVADVQAYHARLVEWMLQLLKCVGTSTQETNDHLMKEAIYRMYTLFNRLH ELIISGDLSVDLITLLRLITQLVQATCIPFHGEPVVGLQVMGVLETRNLDFDNILLLSCN EGNMPKGVNDASFIPYSIRKAHGLTTIDHKVAIYSYYFHRLLQRAKNVTILYNNSTEDGH TGEMSRFMLQMLVESGHQIERLSLQAGQMPNVLQPHAVKKTQSIMEEMLKLEKLSPTAIN RYLRCQLLFFYNIVAGLKESDDETDDIDNRTFGNIFHKSSQLIYEQLMDANSAVSDRAIK DFLADKSALQRVVDRAFNEELFKVSNTNQRPEYNGLQLINRGVIISYLKKLLQMDLALTP FRILAMEKQVYNDVLFHVDGKQHPLKIGGYVDRLDEVDEGMGKVIRVVDYKTGRKPQMGV ATFEDIFSGEKVTKNHADYFFQTFLYAAIVRDSAFWNAQKLPVSPALLFIQQASTEENDP VLRLGKERVDDVATYRDEFWAGLQSLLEEIFDKDRPFEPTEDRERCTRCPYRQVCYQ >gi|283510588|gb|ACQH01000031.1| GENE 29 34705 - 37932 3299 1075 aa, chain - ## HITS:1 COG:FN1149 KEGG:ns NR:ns ## COG: FN1149 COG1074 # Protein_GI_number: 19704484 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) # Organism: Fusobacterium nucleatum # 8 828 
8 831 1056 110 23.0 1e-23 MHHTPLTVYKASAGSGKTFTLAVNYIKILLNNPHSYRNILAVTFTNKATEEMKLRILSQL YGIWKLLPSSKGYLDKITNELGISPEYASQQAGTALSNLLHNYNYFRVETIDTFFQAVLR NLARELDLTANLHVGLNDSQVEQQAVDKLIEDLSPNSKVLKWIMEYIQQNIADDKSWNVI GQIKRFGENIFKDVYKDNRKQLSALMANEEAFDSYVKTMRALSKAAEHKLVGIGEGFEQL LEENQLSVADFAYKDKGVCGYFMKLKNGQVEDDKLLTKRVLDALEDPNAWVGKKEQNGDG HALTVVRGVLFDYLNNSEKLRKEQAKLLKSAKLTLRHLDQLRLLDKIETQVRELNEQSNR FLLSDTQTLLHALIHDSDSPFIFEKIGTQLEHIMIDEFQDTSTVQWANFKVLLQECMSHN QSQSLIVGDVKQSIYRWRSGDWRLLNEIEKQFAGSPHAFKIERMTTNYRSERNIVDFNNA FFEAACAIEQRELEGKSPTGAQQMQVAYQDVKQLVPTSKEPKGRVEVCLLDKENCEQRML AKVCHTIKTLLEQGARAKDVAILVRNNNSIALIAGYMMVHLPEVRLVSDEGFKLQASIAV QIIMGALRVLANPADRLLQANLATTYQTHILGNAIGDREMLKRDADVASFLPNKFWNNHQ QLMAMPIFNIVEEVYQAFQLERLSEQSAYICTFYDRLNKFLVDNSADLNAVIDEWENDMK DRAIQSDELNGVRLLTIHKSKGLEFAHTIVPFCDWQLERTGGTLWCQPDEPPFNALPLVP VDLYPKQLMGSVYEKDYLHEHLQTMVDNMNLLYVAFTRAGRNLFVMGEKDKAGSRSAVIG QCIEQLMGMLPGSTLETTAEEDLEFKYGTLSIDTAEEKAKSTSDNVFNTIPSPFTFDVKV YESKALFLQSNKSSDFIGGEDDNAQQNNYIKLGRVLHAVFAQIRTLADVPNVLRQLEQDG VIYDSNLTKEKLIEMLNKRFADKRVKEWFSDGWQLFNECSILSVDPVTNAVVTHRPDRVI TNGEEMRVIDFKFGKPKEEHHAQVASYMALLETMGYKRVSGYLWYVYSNKIEEVK >gi|283510588|gb|ACQH01000031.1| GENE 30 37978 - 40002 1382 674 aa, chain - ## HITS:1 COG:no KEGG:PRU_1648 NR:ns ## KEGG: PRU_1648 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 8 674 15 677 677 417 34.0 1e-115 MKMRHVIIVCGMCIGVSAQSKTWTATDTLLLNRIYRYADSCYARPIDTTYYSYTRYTIDI RRKNIVLMAVPNLFDLARSGKRHFMGESIRRHTYNKNAGHNTQTLVSYTTLPHSENILPP AIKYLSPKLYDVTLVDKFLLSPFNRHNKMFYQLRFLSQTDSLVLLSFKPKVDNTQLVKGN AIIDKHTGRVITCHAEGEYDMLRFELDTDTGDKEHDKLAVKSCILRTTFKFMGNQMAATY RVNYDLSPIDSAKVYPNKPAFTINLLRKDSLTRIEREIFKPVFLADIRQDTVRTQPDRPR GMFWRKLGDYLVDRIGSDFGKSNQGYFRLNAPFDPFYMEYSPSRGFTYMNDFRLGYTFSP RTSLFVQFKAGYFFREKRVTFYAPLVFYYNRERNFYSRLELSTGNRIYSSTIAERVSESL LNKLFDMGKDLYQFTDRYMKFDTNIDFSSKWGLRVGFVFHKRKPLDPSAFELLSLPVVFR SFAPSFELSIRPWGWAGPIFTIDYERSLPHAFKSNISYERWEIDGQYLLRLSRLRALSMR LGTGFYTQKGRSSYFLDYSNFRENTIPGGWNDGWTGEFELLNASWYNASNYYVRGNFTYE 
TPLLAVSWLPLVGHYIEMERIYVSALSVKSLHPYLECGYGFTTRLFSIGAFVSNRNWKFD GFGLKFGFELFRHW >gi|283510588|gb|ACQH01000031.1| GENE 31 41196 - 43190 1406 664 aa, chain + ## HITS:1 COG:no KEGG:PRU_2899 NR:ns ## KEGG: PRU_2899 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 48 636 15 568 584 407 39.0 1e-111 MKIRYVAFLLVMTCFLPSAAQDETRWEEYYQRWLENNEQMEETNADAYETLSDLQTHPLN LNTATREDLERLPFLSAQQVEGICEYLYHYAPVRSLSELALIETIDYDTRLLLQCFVYVD EDKRQTFPTLRNIIKYGKHEVAGEVKFPLTGESSASKKHLGYDVRHWLRYDFHLGEYVKA GLLASQDAGEPFFSNRNKGGYDFYSFYLLVRKLGRIKAATVGRYRLRFGMGLVINNDYGF GKQASLATLGRTTGHVRGHSSRSEGNYLQGAAATVNLAKGLDLSLFGSYRHIDTTPTKDG NAIKTILKTGYHRTKTELAHKHNALQGLVGSNLNFLRNGFHFGLTALYTEFNHELHPDTA QRFRKYAARGRQFYNVGIDYGYLSRRLTIAGETAIDKNQAFATINMATVCLSSRLDLRLV QRFYSYRYQALLAQSFSEGGRVNNESGLYVGTNWQPAQGWLVSAYSDYAYFPQPRYRVST ASTAWDNMLTATFARGNATWTARYRFKVKQENAAGSHQVIDKKENRCRLAYCLTADNWHA KIQADGVMIQGNGLKSGWMTSLCGGWQSVKTKLDAVVGYFNTNSFDTRIYTYERGLRYNF LFPSFFGRGYRAALLGRVAINSHLLLLAKLGYTHRFYIPKARPNDHSTTQKDQADADLQL IWKL >gi|283510588|gb|ACQH01000031.1| GENE 32 43471 - 44202 395 243 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927888|ref|ZP_06421735.1| ## NR: gi|288927888|ref|ZP_06421735.1| hypothetical protein HMPREF0670_00629 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 243 1 243 243 493 100.0 1e-138 MKEKLTEGCIMLVFIAFVVLGSTMAVNMFQQGAWVKGIICVLGALFFGFPLLGPFLPSAA PDTKPLPFVPQVNLPTDKDSLRELAKMLAGEEADVMQTVEELLESPEAFYSAQITRDGWY NDAYVDIWDMYHDQPDVLCSEGLLFVLAEAEVIAMFDWKEGLEEFVGQMTDLRRAQANNL PVPQEHFDEQADIPHWCNALNELWQPLGYNATFIDTDGDEYIVAVVQYTPSPPIDDISCT TQS >gi|283510588|gb|ACQH01000031.1| GENE 33 44437 - 46164 1887 575 aa, chain + ## HITS:1 COG:CAC0353 KEGG:ns NR:ns ## COG: CAC0353 COG0737 # Protein_GI_number: 15893644 # Func_class: F Nucleotide transport and metabolism # Function: 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases # Organism: Clostridium acetobutylicum # 25 555 542 1079 1193 216 30.0 1e-55 MRKVLFMLLTGLSLSQYTQAQNVEIKVIQTSDVHGSFFPYDFINRRDKKGSLARVSSYVN EMRKKYGKNVILLDNGDILQGQPTCYYCNFVKPTMPNLAASVVNYMQYDAQTVGNHDIET GHPVYDKWIKEVKCPILGANIIDTRTNKPYVHPYTIIKRDGVKVAVLGLLTPAIPNWLKE SLWTGLRFDNMVKSAKYWVKHIQTKEKPDVIIGLFHSGREGGIHTPEYDEDASLTVAREV EGFDIILFGHDHTRYAGWVKSNAGKDVLCLDPSCDAYMVSDATINIRKAKGKVVSKKITG DVIDITGQPIDEKYVQHFKADIDSVNAFVNRKIGRFESTIYTRDCYFGSAAFTDFIHDLQ LKITGADISFNAPLSFDARINKGDVYVSDMFNLYKYENQIYVLKMTGKEIRNHLEMSYDL WVNTMKSPDDHIMQISEWAKQDRQRFGFKNLAFNFDSAAGIIYEVDVTKPDGQKVRILSM ADGTPFDENRWYKVAMNSYRGNGGGELLTKGAGIPRAELKSRIVFESEKDQRFYLMNEIE KAGVMNPQAHNNWKFVPEEWTKPAIERDRKLIFGN >gi|283510588|gb|ACQH01000031.1| GENE 34 46285 - 47064 606 259 aa, chain + ## HITS:1 COG:MA1439 KEGG:ns NR:ns ## COG: MA1439 COG2816 # Protein_GI_number: 20090298 # Func_class: L Replication, recombination and repair # Function: NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding # Organism: Methanosarcina acetivorans str.C2A # 2 254 27 278 285 175 37.0 8e-44 MKSLWFVFKQSDLLLEQLDDHTFGIPLSDNPPTTFNAQQNVHEMECTPEGIAVKTYAIDD ATTIPATYTFCDLRQSYYKLPNNLYLIAGKCREINYWDAQTKFCGLCAGAMKLHTNISKR CTSCGNEVWPQLATAIIVLIYKGDEVLLVHAKNFKGNFYGLIAGFVETGESLEEAVVREV REETGLEIDSLRYFGSQPWPYPIGLMVGFTARYKGGNLRLQEEELSAGGWFHRNKLPQIP EKLSLARKLIDHWLGQFDQ >gi|283510588|gb|ACQH01000031.1| GENE 35 47224 - 48279 929 351 aa, chain + ## HITS:1 
COG:NMB1308 KEGG:ns NR:ns ## COG: NMB1308 COG0820 # Protein_GI_number: 15677174 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster redox enzyme # Organism: Neisseria meningitidis MC58 # 5 336 2 342 364 253 38.0 6e-67 MDNHKIPLLGHTLDELKAIAIDNGLPAFAGKQMAVWLYDKHVDTIEEMTNISKSNREKLA QRYEIGAAKFIDAQYSKDGTIKYLFPTESGKFVETVYIPDRDRATLCVSCQVGCKMNCLF CQTGKQGFEGNLTAKDILNQIYALPERQKLTNIVFMGQGEPMDNLDNVLKVTQILTADYG YAWSPKRITVSSVGVKGKLKRFLDESDCHVAISMHTPIPEQRASIMPAEKGLSIEEIVEL LKQYDFTHQRRLSFEYIMFGGLNDTPLHARQLVKLVEGLDCRVNLIRFHQIPNVNLNNSD EKRMETFRDYLTNHGVFTTIRASRGQDIFAACGLLSTAKKIAGRENNGEHQ >gi|283510588|gb|ACQH01000031.1| GENE 36 48302 - 49417 537 371 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 [Flavobacteriales bacterium ALC-1] # 3 342 4 343 346 211 34 1e-53 MDDNKLRIAITQGDANGIGLELIFKTFASPEMFELCTPIIYGSPKVAAYHKKALNLDVNF SIIQHANEAKPERLNLLAISDEEVKIEFGTPNEAAAALARKAIDRAVADCKAGIVNAIVN APISEKHFFAAQEHPLSLSQYIAKQLGETGETLEMWLNECLRITTLSGELAIKDVANEIT KERIESTVRLLHNSLCRDFMISIPRVAVLALNPTLEDQPCKEEQEVISPAIKQLIAEGYT AFGPYQADSFFGNREFDTFDAVLAMYHDQAMIPFKALSCEGGVRLITGLSAVVTSTDDEA LDGKAGQGVSNEIAFRHAIYLALDVYRNRKAYDQPYKNPLKKLYHEKRDESDKVRFNVTK KKESNEEQQSK >gi|283510588|gb|ACQH01000031.1| GENE 37 49424 - 50455 709 343 aa, chain + ## HITS:1 COG:no KEGG:PRU_1560 NR:ns ## KEGG: PRU_1560 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 334 1 331 334 397 62.0 1e-109 MKKKYQNGFFIFGLVLLGIMVTQLDFAEVWRGLKHTGYWFFAVLALWMALYVLNTSAWYI IIKAGQGQETAHPTTNDKRINFWWLYKVTVSGFALNYATPGGLMGGEPYRIMSLAPKIGT ERASSSVILYAMTHIFSHFWFWLVSIVLFFITQPLTFGHLTIVLASLVFCFLGLWFFMIG YRKGLAFRAMRLLSHIPFVKRWALSFIERNKQQLDTVDQQISALHKQNRTTFVTAVLLEL SCRIISALEIYFILLVIMPDVNYIQCILILAFTSLFANLLFFIPLQLGGREGGFLMSASG LGIASSSGIFVALIVRVRELIFTGLGLLLIKFDRSTPNADLQK >gi|283510588|gb|ACQH01000031.1| GENE 38 50693 - 51955 1244 420 aa, chain + ## HITS:1 COG:aq_218 KEGG:ns NR:ns ## COG: aq_218 COG3604 # Protein_GI_number: 15605774 # Func_class: 
K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Aquifex aeolicus # 4 293 181 472 506 233 40.0 7e-61 MNTTELQKIKQRYNIVGNCDALNHALDVAMQVAPTDLSVLIIGESGVGKEIIPRVIHDNS PRKREKYFAINCGSIPEGTIDSELFGHEKGSFTGAIGESEGYFGVANNGTIFLDEVGELP MATQARLLRVLETGEYIRVGGQRIMKTNVRIVAATNVNMRKAISEGRFREDLFYRLNTIP IQMPPLRDRGNDVILLFRLFAMQMAEKYKLPKISLTEDAKALMLKYKWPGNVRQLKNITE QISILSREREIDATHLQKFIPQDPESTQLAPMSSTGSHSYESEREVLYKILYELRGNVSD LRREVSGLRKRLDDSVALGAELPSYSSSIAALNTKNYDTSPTLAAIQSPQQASLTNGFAE AEEINEADNESLNLNDLGRQMVEKALERNGGNRKKAAIELGISDRTLYRRIKQYGLESSK >gi|283510588|gb|ACQH01000031.1| GENE 39 51933 - 52454 547 173 aa, chain + ## HITS:1 COG:no KEGG:PRU_1558 NR:ns ## KEGG: PRU_1558 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 1 173 1 172 172 237 72.0 2e-61 MDWKAVNRLALTLVLGTCVLWACSVSYKFNGASIDYSKTKTIQIAEFPNRSSYVWGPMAP MFNNQLKDIFANHTRLTQVKRNGDLRIDGEILQYSQRNKSVSSEGYSAQTELSMTVNVRF TNTKNQKENFERQFTASATYETTLPLSAVQERLVREMVKDLTDQIFNATVANW >gi|283510588|gb|ACQH01000031.1| GENE 40 52481 - 53236 695 251 aa, chain + ## HITS:1 COG:no KEGG:PRU_1557 NR:ns ## KEGG: PRU_1557 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 13 251 1 244 246 235 57.0 1e-60 MNLTQLINHPEVLDKETLYDLRSIVALHPYYQPARLMMLQNLYILHDPTFDEELRRASVY LTNRNVVFEMIEASHYKLKTEDRRTTRQEQTDGNESRTISLIDNFLDSLPEDKEQETQKS RRKPTPADAAIDYVSYLLEIESEEQEKPQEAPQLKGQMLIDNFIFNEGGKIQLKDEPEFT PVVEDNTTTTTTAEGDGYFTETLAKIYIRQGKFEKALEIIRRLNLNYPKKNVYFADQIRF LEKLIINNKNK >gi|283510588|gb|ACQH01000031.1| GENE 41 53349 - 53717 460 122 aa, chain + ## HITS:1 COG:no KEGG:PRU_1556 NR:ns ## KEGG: PRU_1556 # Name: secG # Def: preprotein translocase subunit SecG # Organism: P.ruminicola # Pathway: Protein export [PATH:pru03060]; Bacterial secretion system [PATH:pru03070] # 2 106 1 103 126 120 63.0 2e-26 MIYSLFVVLIVIAAVLMIAIVLIQESKGGGLSSQFSSSNSIMGVRKTTDVVEKTTWGLAI 
AMVVFSVVCAYVAPKSLAETSVLEKSATETQTTNPNTTPGFGAGSQAPTAPSTPAPAAPV KK >gi|283510588|gb|ACQH01000031.1| GENE 42 54043 - 54702 693 219 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927898|ref|ZP_06421745.1| ## NR: gi|288927898|ref|ZP_06421745.1| hypothetical protein HMPREF0670_00639 [Prevotella sp. oral taxon 317 str. F0108] # 1 219 21 239 239 431 100.0 1e-119 MERITLENFEATYVDPIEEERIDKFVCDEMGRQIHRYIKGMSGSKDIMNKFEAQLSTLSI PEKEVAIARYIDLNRKVTSGLDFKIVLTRAMANYCDTFDYLLTLVNNRRKMVYYLNRIKS KYLRYHEVVEVNGKFGINDSDGNVLVSPKYDFLRRCYTYVDDLCLMPIIAQKDGKMGLIL PDGKDTVVADFVYDDICLRDEYPYFEAHQGKKKILLETK >gi|283510588|gb|ACQH01000031.1| GENE 43 56594 - 57589 1042 331 aa, chain + ## HITS:1 COG:SMb20674 KEGG:ns NR:ns ## COG: SMb20674 COG1609 # Protein_GI_number: 16265129 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Sinorhizobium meliloti # 11 328 3 322 339 140 31.0 3e-33 MAVHKRSSLKDIAEALGLSKTTVSFVLNGKANEYHIGEETAKRVWEMAERLQYNPNFTAI SLRDGCSKILGIVVSDISNPFFASLARLYEDEAAKLGYTVFFGSSDENADKMQQAIRNLI ARGVDGLIVVPCQDSETFISSLVPRGIPLVLLDRYFPHQEINYVALDNFEATRQATAYIL GKGSKQPTIVAYDIDLIHMHERIRGYEQAMADKGKSAQSNVIRIEPDISTADMAYLLKES MDKGTDGFIFTTNLITLSGMYALLELGCPTNGLRLVGFDGTPAFDFFNCPITYIQQPLEQ LVGASLTAVKAIIDKEQASSALLRGKLIEKY >gi|283510588|gb|ACQH01000031.1| GENE 44 57789 - 60878 2516 1029 aa, chain + ## HITS:1 COG:no KEGG:ZPR_4655 NR:ns ## KEGG: ZPR_4655 # Name: not_defined # Def: TonB-dependent receptor Plug domain protein # Organism: Z.profunda # Pathway: not_defined # 29 1027 4 997 997 1048 51.0 0 MTINVFRQAPKRQLLLLLCIFFMQAQLFAQGGAKVTGRVFDAKGDALIGVTVSQADNPQV ATVTDINGTYTLQLPTVGVTIRFAYMGFKEKTVRVTKSAVIDVALEENVSALNEVVVVGY GTQKKASVVGAINNLEPSKLNLVSSRSMSNGFAGMVPGIIAVQRSGDPWNNNSDFWIRGI SSFAGNTQPLVLIDGIERSINDIDPDEVASFSVLKDAAASAVYGVRGANGVIMVETKRGS IGKPQVSVRFEHALSQPVRIPQYVGSVKYLELVNEMYSQDGMAPYVSEATLRNYRDKTDP ELYPDVNWWNVISKNHADNTRATLSVNGGSNVLRYALVAGYYNENGILARDKTKAWDSSL KVDRYTVRSNVDINITSSTILRINIGGYLQSRNAPPGDISDSRAFYYAMRIPPYIHPVVY ADGKIPRIINKENPWAMLTQRGYERLNHSNVEALTSIEQDLKFVTPGLRLKLTYAFDKFS 
ANSVTRAKNPDYYHPATGRDYEGNLITSIQANGEDFLGYAKDAKWGNQSVYVEATVNYNR TFGNKHAVNAMLLYNHKNFDDGSFLPFRTQGFAGRTSYTYDDRYVAEFNFGYNGSENFAP GKRFGFFPAVAVGWIVTQEKFMQRLTKVLSLLKIRASWGLAGNSNINGRRFAYLSTIANN GEYYFGSDRLLHRLGRAEGDVGVPDLTWEKVTKTNLGFDIGLLANSITLSVDLFKERRSD IFMKRTNVPAEAGFINAAWSNFGKVDNSGVDMSLNFRRRFGKDWEVSALANFTYAHNKIV EIDEADAVKGTYRSKTGRPVSQLFGLVAERLFTKDDFEADGKLKQGIATQRYSAESSLRP GDIKYSDLNNDGEINDLDQTAIGGTIDPQLVYGFGATVRYKMFDFGLFFSGVGKTHRILG GETWLPASSIGAGNIWSNIDSRWTEANPRQDVFWPRMSTTTYKNNEQPSTWWLKDMSFLR LKNIEVGVTLPEQWTHAAKIRECRIFLRGNNILTFSKFKMWDPEIGSNDGLKYPVMKSVS LGVSFNFNN >gi|283510588|gb|ACQH01000031.1| GENE 45 60889 - 61674 732 261 aa, chain + ## HITS:1 COG:no KEGG:ZPR_4656 NR:ns ## KEGG: ZPR_4656 # Name: not_defined # Def: RagB/SusD family protein # Organism: Z.profunda # Pathway: not_defined # 5 260 2 252 615 251 49.0 2e-65 MKTFKQIIATAVVGCSLISLTGCDFLDKSPDDQLNMEMVFSDKIRTEDWLASVYAGIPSP MWGYMTNQGYNIMSDDMVIPIEWSPYGWNEAYAYTTGNWNPSSTWNANYWVELPKRIRTG LIFLNNVRVIPSEGLTQDYVEQMKNEVRFLNAYYYSLMVEAYGAIPFAPGRLVSPTDPAN EMMLEQTPVDNVVDWIDKELLDVSKHLPAVYENNQDWGRATSIMALAVRAKTLLLAASPL FNGNTDYAQWKNVSGQNLVNL >gi|283510588|gb|ACQH01000031.1| GENE 46 61835 - 62473 601 212 aa, chain - ## HITS:1 COG:ECs3684 KEGG:ns NR:ns ## COG: ECs3684 COG0207 # Protein_GI_number: 15832938 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate synthase # Organism: Escherichia coli O157:H7 # 1 196 1 216 264 69 24.0 4e-12 MNKYYSLLDNILSDGHNQTNKKGNITYLLNQQLTLTPADLLDIFESRSIARRKLRNELDL FMQGERDVAKYRKAGITWWDYCGSILVNSYPTYFEKLPALIARINREKRTSKNYVLFLGS TDAESNQAPCLSLVQFQIDDGQLVISAYQRSSDANLGLPSDIYHLYLMSRQIELPLKSIT LFLGNVHVYENNIGRTRALLDGDEGVKFDLNV >gi|283510588|gb|ACQH01000031.1| GENE 47 62445 - 63305 916 286 aa, chain - ## HITS:1 COG:no KEGG:BVU_0955 NR:ns ## KEGG: BVU_0955 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 6 274 6 274 279 315 54.0 9e-85 MTHTIYQQAPLPFMGQKRKFVKAFRQILKSYPDNVTIVDLFGGSGLLSHVAKREKPNATV VYNDYDNYHRRIAAIPRTNALLARIREVTESLPRGKVIRQPHRDRILEIIAEEEQRGFVD 
YITLSPSLLFSMKYANKMDELVKQTFYNTVRRNDYCADGYLDGLTIVHKDYKALFNEYRD KPNVLFLVDPPYLSTEVGTYTMTWKLADYLDVLTILQGHDYVYFTSNKSQIIELCEWIGQ SRIDRNPFECAHRVEVNTTMNYNSSYTDIMLYRKNENDEQILQPAG >gi|283510588|gb|ACQH01000031.1| GENE 48 63480 - 63776 318 98 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927904|ref|ZP_06421751.1| ## NR: gi|288927904|ref|ZP_06421751.1| hypothetical protein HMPREF0670_00645 [Prevotella sp. oral taxon 317 str. F0108] # 1 98 1 98 98 165 100.0 1e-39 MEKVIRIIERGVVKVLLFPFGREEPIELTACGLCKNKIIAALVRLKYSQEEVEALLCEYV ACPTDKAAKQAFDGLIAYRRECEAEADKLMEEYEKLQT >gi|283510588|gb|ACQH01000031.1| GENE 49 63788 - 69559 6373 1923 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927905|ref|ZP_06421752.1| ## NR: gi|288927905|ref|ZP_06421752.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. F0108] # 1 1923 1 1923 1923 3984 100.0 0 MANGIDKKINDLATAWQGYKGTRIEEFLKEYLSKLDGAKFGFVNIESGENSLQTIRFFRD EHAYADWFADRTANADRVLGEFSLYSNKPVESYTMRAIITRYPAANMARGAQNAVSLAYN CYWGDNPADRDTQDGTATVEVNGVAAPALTRQLKASGTATANVYTFELGDLLTAETNEVK LRVTNAHGAEKVFTFNINTYSLTLEFDPAYDESQVQTSRWSLRVLCQGVPATVYCRIQDG ARTDTLTKSIHNSSGEFVVDEQDRYGSGAHAITLWAENKELGLRTPNITTTYIKASSGPG GVAALCFGKGIPATARQFSVARLPYYFYLPDEDAGTAVSVKAELLYGGGQHVRQLSVQKV TLNPDHGSGLQTLNVAFDEAEYLPEVTVRLSVGGVSAECKIRVQGLGVDLAPADECKVYL PMRGRANGDESAQNIVATYRGRQTARLVRSDNFRLDDNNGFIDGQGMTILAGKSVTLKDF LPFGSDFGANGSKQGRTIEMEFESGICSDENAVIVDCMDGGTGFRVYANRVELGCATGNV ITYYPEQSRVRLGVVIDGTTTHTRNNLGGGSVAEKDVNLAYLYMNGVIVRMFDYATALWK QGAPKELVIGSPQAEVKLYSIRMYDKALNFTQMVGNYAYDTPDIEDVTDRDGRFVRFGKV SIARRNDILNSVGDIHNPDEIVSYNKVRKALPETPIAVWDIENLPYNKNNPNVPITGTEF INPQWDKARDGWAGVPFKVGPHAFNADGTSSNGYPLPYKNWAEIFETFSGDPVTLTLDPG HSDEQSTSYSITRGVAEGEKEMVHKVNFASSEGIFNVLAMNLFQEILLGCARNDMDLYTS FQRAQAMQGTEVTYRKSLSGLSEIGFRKTAATSAKEPTFLSIYNMINNKYSASFLGFPKK DHTKAQVWEIDENVNFFNREMTLHELLADGTVRQSNGTDSAGPMYYARVPKKSPTNKKNK LGQVKSATDDIEAANRELAVIRRFHNWVVSCNPHLPERYKAEHGEYKLLDQPVMYNGVTY DRDTPAYRRARFVNTYRDYLVKDDVLFYIVFCVFFLGMDSLDKNMSIAFDDIELNPDGSV KVAHARLFLRDTDTQSLFNNSGALFYKYWAEWNDAYNPTTGKTQPIAGETYDRDNHAWLP 
KMDEGYSPVFNGRLSGLIDLVWQCWGDDLAAMYKSMRDNGLESSNIFRRYTDFWKQWCEN LYNADAMGYANTGHFTKAYGDKLMLMLYFLQKRSRYMDSKFCCGASVVNNLRMRLYEQGK GLAIKHYSPLYASVQWGANNFATVRNIDGGYALIPFGFTNPQNATFDIDDADMITDIKTF TRRVGGQVTYSGLEGLGDFEFDANMQLLRRLEELVMDYAPQRPNTRERGTAFDLSKCVML RRVIVRNVKNLAKVIQLGSGVLQEVDFTGTPIKGVAMPENGTLTRLVLPDTIEELTLRGL DALEPGGLNLGGLANVKKFRYSACRKLNGFDLLQRIYAAGAKPTDIEMDGLNETLASLDT LDLLAAAGAKLSGRITLRGVTPDFRAKLRYVQAWGDVDNSRNPLHLVYERIPVNSVTISG DIYVQEAGVAWLNISPDNVRANSVRTIEWSMAANPYATLDARTGRMAVTRVGADETANAQ VTVTVTVDDGRQLTATETVYFYKRAPQAGDIVYADGSWSDKHNKNKTPIGVCFYVSADGK DRRMMGLTRLNSFNTSWGTSGNYIDRLAKLASEPARDISVVRGMKKNWPTEIADDWMTLE EPLLDYPRGAKIPYGMYNTLCIIRQRNDILQDENYKLDVPHASAGVSEFEDVKRCALKYF NENQIYVYYVPASLSYAYSPAVLKGETLSNQFAPHKWRLPSAAEAKAILVSIANDYTSDR NFLKAAVSYGLIEQLVIDGSVYVKSLETSQETSAQERLVGVEWTVRGWVKWGVCTYVFPI CNF >gi|283510588|gb|ACQH01000031.1| GENE 50 69733 - 70452 772 239 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927906|ref|ZP_06421753.1| ## NR: gi|288927906|ref|ZP_06421753.1| LigA protein [Prevotella sp. oral taxon 317 str. F0108] # 1 239 1 239 239 410 100.0 1e-113 MEATIYELQARAQALREKTQEGSITPEEVGGLIADTLALLADVEQTAGSLRVSKVYASKA EMEADTAPEDAHHQPLKAGQLVAIHAEGDSPDNGNIYVYLAPGWKLIGNLNRVAIGEDLG QAYPGTKGKRLAEDLNNERIERTDNDAALQRAIANETDSRRQALTEQAEVLRRETNKAVA DEATARDRELTSIRQSIRDMQGNIGSVEGFKHAFLTEEEYNRKQQAGELDPDRCYFIEE >gi|283510588|gb|ACQH01000031.1| GENE 51 70455 - 71279 239 274 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927907|ref|ZP_06421754.1| ## NR: gi|288927907|ref|ZP_06421754.1| hypothetical protein HMPREF0670_00648 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 274 1 274 274 543 100.0 1e-153 MNRLVFSEGGQPLFLDDIKLLQDNDAGFNRQLLNAISGRSSVFLLQALDMKILSVDQEKF TTTAKVYAGSIVIAGDIIDFPETTVTVKTWHDPLYVCIKETEHEEREFEDGQTRPCRQST QAYVSTSKDGVKAAYNAFELPTLTGLLRRSVGIEGAFVWKDIPVTFFNGYGGQVQYMEQN GSIRVKIKVSSRNGDWKNSSDGLLFEVDPVIGSFLNRKWSTTFGAGGDNGSFLCALEFYD GKCFLHDIRGASGNDGATWSPIVCPVSVIFEVVE >gi|283510588|gb|ACQH01000031.1| GENE 52 71272 - 71781 428 169 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927908|ref|ZP_06421755.1| ## NR: gi|288927908|ref|ZP_06421755.1| hypothetical protein HMPREF0670_00649 [Prevotella sp. oral taxon 317 str. F0108] # 1 169 1 169 169 325 100.0 8e-88 MWYDVDFTRWAVQLLPPILRSRVLVALLRIIIIPLAYLHRLFTDYRKKLAERLDITASVQ DIERALNRRFFLRNRQIYIESESDDRHPCLYFHAEGKPPTFLNPRMTLWMAGEVPSKPNF TIFVPSFLVSSLNSEEDRHHGRHLAEIIRIVERYKPAGRRYAITIYEYE >gi|283510588|gb|ACQH01000031.1| GENE 53 71771 - 72619 1002 282 aa, chain - ## HITS:1 COG:no KEGG:Cpin_0294 NR:ns ## KEGG: Cpin_0294 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 1 259 1 259 280 113 32.0 1e-23 MARTIAEIKRTMTDAFMANATLREAYGLAEGDTFEGSFSAVSLESILFFIVAACCHVMEA LFDRHRLDVDEKISRAVVASVPWYYKVARQFQYGDALVFDDTTSQWRYPTTDGKKRLVRY VAVRDRGTSIQILASADKDGLPEPLSADVLTAFKHYMNRVKIAGVVLNVRSLPADSIQVR ATVQVDPLILSANGTRNGEESRPVEDSINAYLRGITYGGTFNKTRLVDAIQGVEGVVDVT LAECLYKAAADADYRPVAGNNYTAVGGSFVAIGLQNSIRYVV >gi|283510588|gb|ACQH01000031.1| GENE 54 72630 - 72905 292 91 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927910|ref|ZP_06421757.1| ## NR: gi|288927910|ref|ZP_06421757.1| hypothetical protein HMPREF0670_00651 [Prevotella sp. oral taxon 317 str. F0108] # 1 91 16 106 106 175 100.0 9e-43 MEATVRDGQSLADIAVQEYGALEAVVRLAMDNGMAVSQAPPAGMRLRLHDGEYNRPMRRY CQAHGIAPATLRGDGGMRARIFNETFNDTFN >gi|283510588|gb|ACQH01000031.1| GENE 55 72890 - 73306 355 138 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927911|ref|ZP_06421758.1| ## NR: gi|288927911|ref|ZP_06421758.1| hypothetical protein HMPREF0670_00652 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 138 12 149 149 211 100.0 8e-54 MLLASCRTTRTITRNSEVDVRQRDSLVLRDSVVLRYVTATRDSVTIRDSVVLVKDSSGRV IATERYRTSERTRDTHADNSVTATRDRTHDKGVSAHVKERSTDSKSGWPTLGTIMDIVGW IAFVLFLILFARKLWRRR >gi|283510588|gb|ACQH01000031.1| GENE 56 73351 - 73794 594 147 aa, chain - ## HITS:1 COG:HI1494 KEGG:ns NR:ns ## COG: HI1494 COG3023 # Protein_GI_number: 16273395 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Haemophilus influenzae # 60 139 18 97 116 80 47.0 9e-16 MRTIKYIVVHATGGSQRTTIKELMMEFARLDWKAPGYHYVVHVDGRITQLLGEEKVSNGV KGYNHMLINVAYIGGLDAKGKYADTRTAEQKEALRKLLGMLHKKYPAAEIRGHRDFSPDL NHNGIIEPWEFIKACPCFDAKKEYKDI >gi|283510588|gb|ACQH01000031.1| GENE 57 73791 - 74282 424 163 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927913|ref|ZP_06421760.1| ## NR: gi|288927913|ref|ZP_06421760.1| hypothetical protein HMPREF0670_00654 [Prevotella sp. oral taxon 317 str. F0108] # 1 163 1 163 163 286 100.0 3e-76 MIEHFLQKLSEALSTVWGWLLFLGLVVMNFIVGYEKMVGFTVMAIVLDAVWGIAASLVQK RFALSELARDTFAKLAVYGTAVFAFILVDKLAGISGGLTTSVICIGIILVELWSMSASML ICFPHMPFLKILKRALAGEIASKLNVKPEDVAEALENLHTKQE >gi|283510588|gb|ACQH01000031.1| GENE 58 74295 - 74600 404 101 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927914|ref|ZP_06421761.1| ## NR: gi|288927914|ref|ZP_06421761.1| hypothetical protein HMPREF0670_00655 [Prevotella sp. oral taxon 317 str. F0108] # 1 101 1 101 101 190 100.0 2e-47 MNGIQLTDFAPAIRVRRDEQGKITSGLQVGNTLRQNQALILALNKGELKERPSVGCGIAD MLMDNDPLYWRTLIREQLEMDRQRVNNIRITPKGIEIDATY >gi|283510588|gb|ACQH01000031.1| GENE 59 74604 - 75068 506 154 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927915|ref|ZP_06421762.1| ## NR: gi|288927915|ref|ZP_06421762.1| hypothetical protein HMPREF0670_00656 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 154 1 154 154 283 100.0 3e-75 MDNYKELAQLVRSAAGKAQLTLMQGIVRKVSGLTCEVEIGGIAVPDVRLRASEAADGGQM LVTPKTGSAVIVGSLSGDLTQLVVLAVDQAESITINGGKLGGLINIEPLTQKINELVQAF NAHTHQGFHGPTGPPLKPAQQLNKSDYEDTTIKH >gi|283510588|gb|ACQH01000031.1| GENE 60 75131 - 76075 1061 314 aa, chain - ## HITS:1 COG:no KEGG:Coch_0642 NR:ns ## KEGG: Coch_0642 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 5 313 8 321 321 129 29.0 2e-28 MAYDIIIGQYKLGMLAAVSVHKSVELLADTCEITLPAAQLNQALDVESRIRRGDVVLVKF GYKETGLTEEFRGWLQRIATDGGDIKLFCEDDLFTFRRDLPNEVLKQVPLSELLAHIIKG VGKDYKVNCTYTWTYAKFVIHDATGYDVLKKVQEECGADIYLQDGALHVHPPGEVTGTER RYDFALNVEDADLTYRRAEDKKVRVVVKALMPDGKVKEVEVGSTGGEKVEVKCHASDTAS MQARGEAEVRRRSFDGYDGSITTWLVPQCVPGDTATLHDADYPHKDGTYYVRAVTTEFSE NGGVRKIELGFRLS >gi|283510588|gb|ACQH01000031.1| GENE 61 76110 - 76736 654 208 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927917|ref|ZP_06421764.1| ## NR: gi|288927917|ref|ZP_06421764.1| hypothetical protein HMPREF0670_00658 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 208 1 208 208 418 100.0 1e-116 MNNVTRFALENMALRITGGKVPPYWLFRDAGIRQVDEGDYSAIRAMSDAELADMVRTNAL GLPMAMPLSLKLEEPGAQEWLLPFEPMISLTGKNIIKRRQVNKGVIRGSIKERWAQDDYD ITIEGVLIGTDGRYPSADVARLKNFCEAASVTALSPLLEVFGISRLVIESWEMPFTAGEA NQNYSIRAYSDDVYKLLLGTNEYELMNS >gi|283510588|gb|ACQH01000031.1| GENE 62 76830 - 78599 1737 589 aa, chain - ## HITS:1 COG:no KEGG:Cpin_0287 NR:ns ## KEGG: Cpin_0287 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 75 584 92 560 565 214 29.0 9e-54 MDNVLKFLIKLKADKGNVVSVARETERQLDSINRKASVVGRGLRKAFSLEGFKGSLMAIP GMQFLMNPYTMIGAGIGAMVRLGAQAESVNVAFTTLVGSENKAAQMLGQINDFAAHSPFG KMDLTQSAQTMLNFGVETGKVLPLLRQLGDISGGDKDKMSALSLVMGQVSSTGYLMGQDL LQFINAGFNPIQELSQMTGISVDKLKDKMAKGQITYQNVEQAIAHATGAGGKFNGMMDKQ SQTLAGKWSTLMDTVQQGAIDLSQSVNTPIAEVVDKITAAIPKVFAVFQAVFSAISAGIG FVVRFRTAFMILGGAVLAVWAVFRTYTMALAAYQAITTLVTAATKIWTAAQWLLNVAMTA NPIGLIVAGVAALIAVIVFCWTKFAGFRAFLITMWDVWRKFGDLIKTYVVDRIKELIRGV GLLSKAFSKLFSGDFKGAAADFAAGVKNVSGVNSAVQLVKSTAATVRGIGGTFQKNLAAE RAKDKQKEKKKGERSAISTPGLKGSAAVGDVVFGAGKGTDKAGKGKEGRRSAEEIATGGR RSTSITMNISKFFDTLHVHMTDKADTAELERIVVQSMNRALAIATSTDR >gi|283510588|gb|ACQH01000031.1| GENE 63 78765 - 79049 467 94 aa, chain - ## HITS:1 COG:no KEGG:Coch_0646 NR:ns ## KEGG: Coch_0646 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 1 94 1 96 96 84 48.0 9e-16 MKYTKEQIEEWKRKHGDLFEITVEGKGCILHRPTRQDLSYVSVLKDPIKMSETMLNQLWV VGDEEIKTDDSLFLAAIQKMQEVLEVKEAEIKKL >gi|283510588|gb|ACQH01000031.1| GENE 64 79071 - 79478 548 135 aa, chain - ## HITS:1 COG:no KEGG:Coch_0647 NR:ns ## KEGG: Coch_0647 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 2 134 4 136 138 110 43.0 2e-23 MFNSREYEWADISVVMGGRPVTGIRGIKYNIKKEKELLYAKGNRPHAVQSGNYDYSGEIT LLQSEYLALREAAKGDILAAQLDVVVAYGNPTRGDTISTDILVGVEFTEDNTEWKQGDKF QEKTIPFVFIDKKQA >gi|283510588|gb|ACQH01000031.1| GENE 65 79533 - 80717 1459 394 aa, chain - ## HITS:1 COG:no KEGG:Coch_0648 NR:ns ## KEGG: Coch_0648 # 
Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 2 390 3 386 390 229 34.0 1e-58 MLPRIKIQFLNGQLGTVGESPDGLFALVCGAAAVAKTLELDKAYTLHSFDELAKLGVTPE NNPRLHKHVKEFYAEAEEGTKLIIFPVDKTKTFTELLDKDTGLVKELVTAQNGALRGIFV AGDGREATLTTNGLDDDLLTALPKAQQLAEWATTQLYAPLFIVIEGRGYKGGAVKDLHGE AYNRVGILIGDTVKASEGAAVGVMAGRLASIPVQRNIGRVKDGALKPIAMYIGDKPVEEN ASAVSDLYDAGYITPRKYVGKAGYFFTDDRLACVPTDDYAHITARRTIDKAYRIAYAALL DLMLDELPVNEDGTLQHGIIVAWQQMMENAVNRAMTAQGELSADADGAGCKAYIDPKQNV LATSKVELTLKVRPFGYARYVDVKLGFQVETAGK >gi|283510588|gb|ACQH01000031.1| GENE 66 80732 - 81643 774 303 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927923|ref|ZP_06421770.1| ## NR: gi|288927923|ref|ZP_06421770.1| hypothetical protein HMPREF0670_00664 [Prevotella sp. oral taxon 317 str. F0108] # 1 303 1 303 303 596 100.0 1e-169 MALNISIWQTTLVENFYPDNSFASKSVDDSAFVHAHKVIIPNAGAPSKVQKNRTVKPASV NQRTDHDLEYEIDELTTDPIYIPNIDTVELSYDKRNSIISNDREQLRNAAEENILERWGL GVPSKNVLFTTGTTEREAHTSETATGKRKCITKADLLKIMTRMDADNVPKEGRHILLDAY MYADLLENLSESDKWMFQNSADVQRGIVGNLWGLNVMTRSQVLRVKTDKSLLSWDQEAVA GEMAAALAWHDKSVSRAMGEVKMFDSTNNPLYYGDIYSFLLRTGGSVRRYDKKGIYLLAE AAK >gi|283510588|gb|ACQH01000031.1| GENE 67 81678 - 82787 940 369 aa, chain - ## HITS:1 COG:ECs0829_1 KEGG:ns NR:ns ## COG: ECs0829_1 COG0740 # Protein_GI_number: 15830083 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Escherichia coli O157:H7 # 17 177 66 222 226 109 39.0 8e-24 MQKKFFNIIPGDGEVAILLYGDVGDGQRVDSGRVVAELMALQSQYSKIDVRINSRGGDVF SGIAIYNALRTSKADITVYIDGVAASIAGIIALCGKPLYMSPYAKLMLHAVSGGTWGNAS ELRQMAEVMENLQGDLASMIAGRCGMKKDEVLAKYFDEKDHWISAQEALSMKLIDGIYDM DGEAVNAGSTDEIYTYFNNRLRNQPQNKDRGMALLESLKGIPSFANLADENAVLAHVREL ENKAAKADALAQAVEGYKKKLQDVEDKEVVAIIDKAIAERRITAEQKESFMALMKTDREN TEKLLASMKARPFRRIVDELRDETGSPANLAGKSWDELDKAGKLSELRNADFETFKAKYK EKFGLDYKE >gi|283510588|gb|ACQH01000031.1| GENE 68 82956 - 83405 
455 149 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927926|ref|ZP_06421773.1| ## NR: gi|288927926|ref|ZP_06421773.1| hypothetical protein HMPREF0670_00667 [Prevotella sp. oral taxon 317 str. F0108] # 1 149 1 149 149 272 100.0 5e-72 MAKTNIDKKSIARSLFLDGNYTQEEIADKVGSTRQTVSRWIREGNWEEVKASVAITPAQI ISQWNRQIIEVNNAIAARDEGQRYATPAEADALAKLANAINKLQNDVGVSDCVSVGMRFL TWLRPLDVEAAKQFNNLFDAFIKDQTTRG >gi|283510588|gb|ACQH01000031.1| GENE 69 83402 - 84943 1367 513 aa, chain + ## HITS:1 COG:no KEGG:DvMF_0728 NR:ns ## KEGG: DvMF_0728 # Name: not_defined # Def: phage uncharacterized protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 38 242 48 259 558 84 29.0 7e-15 MKAKHTDKQALELWRRFHEGLAKDVPVDEGLSRYEVERRRKELERDPVEWIRYFFPAYAK YEFAPFHIKAIRRIIANDEWYEVLSWSRELAKSTVVMFVLMYLTLTKRKRFVALAAATID AAERLLAPYKANFEKNPRLMQFYGKQETIGAWTNTEFACACGAKFIALGAGSAPRGMRNE AIRPDVLYFDDYDTDEDCRNPATLDKKWQWAERALYPTRSISEPTLVLWCGNIIAKDCCI TRAGALANSWDIVNIRDKHGHSTWPQKNTEEQIDRSLSKISVRAQQGEYFNNPVAEGKIF KNLPWGKVPPLKKFRFLIGYGDPAYSDSRKKASSTKALWLVGKYKGVYYVIKGFLARETN ANFIGWYFELDKYVAGRANVYWYIENNKLQDPFYQQVFKPLLRDECAKRKVQLFIREDTR KKTDKATRIEANLEPLDRLGTWVFNEEEKDNPHMQELMNQFKLFELTLPYPADGPDAVEG GVTTVDQKTGELEPTYTIALNDEDMNKDNPFMM >gi|283510588|gb|ACQH01000031.1| GENE 70 84945 - 85385 545 146 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927928|ref|ZP_06421775.1| ## NR: gi|288927928|ref|ZP_06421775.1| hypothetical protein HMPREF0670_00669 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 146 1 146 146 290 100.0 2e-77 MSNFIDITDYDASIHREILDSLLRQGTADYDPQIVEICEDRAVLEMRSYLNKKYDCDKIF SARGTDRHALVLMFALDIAIYHIFCQHNPYKISKSREDRYNRAVEWLKGVMRGDVTIDGA PLLPAEEIEDKSRWQIKADEVRPTLL >gi|283510588|gb|ACQH01000031.1| GENE 71 85410 - 86747 1442 445 aa, chain + ## HITS:1 COG:NMB1095 KEGG:ns NR:ns ## COG: NMB1095 COG4383 # Protein_GI_number: 15676976 # Func_class: S Function unknown # Function: Mu-like prophage protein gp29 # Organism: Neisseria meningitidis MC58 # 152 408 162 434 522 77 26.0 8e-14 MKTLKQRRAQGRRITQGGMLAAPGERQPDVVLQMPELFHFNLQHYMNAVTSARGIDYSNR VRLYDMYESANFDLHLTGVMAKRLRGVTQIPIEFQRDGKPDEEINRQLRSPWFKELRKEL ILSEFWGFTLVQFRMEEDDGNIRFDSISRKNYDPIRGLVLRHQGDISGVPVEEYGHTLFV GSERGLGIFAEILPAVLYKKGNMGDWARFCNIFGMPIREYTYDAGDEEARRTLIREARQQ GTNAVYIHPKDSDLNLIEAGNKTGSSELYRTFAEYWDSKISIRVLGNTLTTDAKSTGTQA LGTIHKEEEDEMNADDRDFILDILNYQMRDIFAQLGFNTDGGEFVYAKKEKVDTAQQIDI VQKLSNMGLPIDDDYLYETFGVAKPENYNELKAKKEEERTALRQQLAHQGEEPTTPEPPT RKAPTNALRRFFGLAPTPIGADNDF >gi|283510588|gb|ACQH01000031.1| GENE 72 86900 - 88171 702 423 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927930|ref|ZP_06421777.1| ## NR: gi|288927930|ref|ZP_06421777.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. F0108] # 1 423 29 451 451 885 100.0 0 MWRELQRTMNEAAAEGLTRGEYQPRHNDRFLDAMRHGNEVFAAFKVHAMGKAMADKLRDS NGNIKPFEQWSNDVRTIASHHTGAWLRTEYNTAVLRAHAAADWQEFVENKDIFPNLRWMP TTSPDAEASHRSYWEKKLTLPVEHPFWEKHHPQDRWNCKCMLEATDDPATPADVVEDMPA PQPQRGLDNNPGKDGHLINDTHPYFPEKCAQCPFYKPRGVKNRIRAVFVAHKKDCFNCPY IDGKLPGSNGFILKNKYKNGGTLSIHELVDKQKTDYHDIIEVAQSFAKDGHKVEITPSVH FKSEEYKQIYGSLIGTKYERKCPDFRVDGVFYEYESFVRPWSKKKVGRMLSHGMEQSPYI VINNTKGCSDRFIRKNIIARRNVKADIREVWVYEKGKMRLMFKDGIFVNNNGETEVPPRG NVP >gi|283510588|gb|ACQH01000031.1| GENE 73 88248 - 88823 513 191 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927931|ref|ZP_06421778.1| ## NR: gi|288927931|ref|ZP_06421778.1| putative phage virion morphogenesis protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 191 1 191 191 362 100.0 7e-99 MNAKQIADIIARAPQQVEQAMRTDIPRKAAVLAKNHFRQNFRNGGFTNGGLHPWKKTRRQ DAGSPYKPLTSATDNLMRSIDAVAMPGAVMVTNPRPYALIHNEGGNIGITPKMRRYAWHM VYSLAKVKKGEKMPKELPPMAQAWRALALTRKTAIHIPRRQFIGPSHELNVKIQKMILNA LMEIGNGIDTR >gi|283510588|gb|ACQH01000031.1| GENE 74 88804 - 89295 594 163 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927932|ref|ZP_06421779.1| ## NR: gi|288927932|ref|ZP_06421779.1| hypothetical protein HMPREF0670_00673 [Prevotella sp. oral taxon 317 str. F0108] # 1 163 1 163 163 314 100.0 1e-84 MESILANTIAHIAHELPWTRTVDEDYGQLEALDNDNLDMYPLTYPAVLIDLPGTDWTDTG DLTQRGTCEVRVRLILDCYDDTHAGSHTTDRIMQREEKRKALHALLQGYRPSGEGALIRT RSRFFTFNHGIKVYEETYTCALSEATRETRTIARTALSVRLKT >gi|283510588|gb|ACQH01000031.1| GENE 75 89237 - 89527 206 96 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927933|ref|ZP_06421780.1| ## NR: gi|288927933|ref|ZP_06421780.1| hypothetical protein HMPREF0670_00674 [Prevotella sp. oral taxon 317 str. F0108] # 1 96 1 96 96 159 100.0 7e-38 MARGRNKELILERDRKLFERFYYWSEVRRLRFDDTIAKLSHEEFFLAEATTLRIVRRMLM DGATVDGKAVEKSRRQGFRSSTARREPCGQLSLFPE >gi|283510588|gb|ACQH01000031.1| GENE 76 89715 - 90362 904 215 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927934|ref|ZP_06421781.1| ## NR: gi|288927934|ref|ZP_06421781.1| hypothetical protein HMPREF0670_00675 [Prevotella sp. oral taxon 317 str. F0108] # 1 215 1 215 215 368 100.0 1e-100 MKKEMLEGLSPEEKKELLATLQNEANEEKNNRRQAYEELREKFAQDVQARLNDVVTAVTA FREWLENESRAFRDVMAEYGQLRSESQGGFTMTVADFRLTVAANKVKGFDERADMAAERL VDYLKRYVQRTEKGTDDPMYQLAMTLLERNKSGDLDYKSISKLYDLETRFDAEYAEIMQL FKESNVIQRNAQNFYFHRRDDVGVWRKVEPSFCRM >gi|283510588|gb|ACQH01000031.1| GENE 77 90343 - 90825 525 160 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927935|ref|ZP_06421782.1| ## NR: gi|288927935|ref|ZP_06421782.1| hypothetical protein HMPREF0670_00676 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 147 1 147 160 290 100.0 3e-77 MPPEFNYRQFYALLARMPYADKETLVYQYTKGRTDHLGQMHLDEYRVMLRDMKRVADDED TTRELKKRRSAVLKLMQQLGVDTTQWPCVDAFCLHPRIMGKRFCRISVDELEALAVKLRA IKRKGGLKDESPNAAQPTLKVKYKFTINNKNNKNNEKGNA >gi|283510588|gb|ACQH01000031.1| GENE 78 90857 - 91129 260 90 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927936|ref|ZP_06421783.1| ## NR: gi|288927936|ref|ZP_06421783.1| hypothetical protein HMPREF0670_00677 [Prevotella sp. oral taxon 317 str. F0108] # 1 90 12 101 101 147 100.0 2e-34 MKQTFETMMARLQAWHERRARRMEARLVKRLDNESRRRLQLIEHNGIIYLSVDGIPLLGA ADLACGLTESLAQARANYADYREEGIWAKR >gi|283510588|gb|ACQH01000031.1| GENE 79 91143 - 91763 534 206 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927937|ref|ZP_06421784.1| ## NR: gi|288927937|ref|ZP_06421784.1| hypothetical protein HMPREF0670_00678 [Prevotella sp. oral taxon 317 str. F0108] # 1 206 1 206 206 333 100.0 3e-90 MIRGKWGKWTLTDEERAWMEEHFAHTKNEEVARHLGVSRRTAVRLARGMGLEKSAEFVRT MQANAVHHAARANRGQGNAGKANLLKYGKAYQFKPGQGGNKRFMSPEAFKEMHRRKGIIR RETVRAERRRVAFGLEQRTALRVVKAPKAKIYLRHELRKRGYVVAHASSDATITTNTRRS AILEQRAEKMGIRFYPTEKQNDKKVF >gi|283510588|gb|ACQH01000031.1| GENE 80 91760 - 92296 420 178 aa, chain - ## HITS:1 COG:no KEGG:Aave_1600 NR:ns ## KEGG: Aave_1600 # Name: not_defined # Def: hypothetical protein # Organism: A.avenae # Pathway: not_defined # 82 171 4 92 113 65 35.0 7e-10 MGNILYNIIHREQDKTMTSAELKEYLEKVIADLPEQEKDISVDITLCSDSGMIRIIRGFP AFNDTSERIQQYLNKKKMNNKIYISGAIAHHDIDERKAAFAAAAHKLRKEGFTPVNPFEN GLPDSEDWRRHMRVDIGMLLQCGRIYMLRGWELSKGAKLELDVASSCGIEVMFETHKP >gi|283510588|gb|ACQH01000031.1| GENE 81 92342 - 93004 494 220 aa, chain - ## HITS:1 COG:L0303 KEGG:ns NR:ns ## COG: L0303 COG1066 # Protein_GI_number: 15674046 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent serine protease # Organism: Lactococcus lactis # 25 125 76 173 453 60 34.0 2e-09 MARTRAYTPREVGEKRYKTLPWDGEWQRVFGRPALNELWFISGASAQGKSSFVMQLAKKL CEYGRVLYVSGEEGIRQSFQRRLQLFHMEDVNRRFFIIEDTKIEALTERLAKHKSPRFVV 
IDSFQVAEWTYEEAMALKARFPQKTFIYVSQEHKSAPMGKPAVRLRYIAGVKVRVSGFVA LCMGRENEHHGQGFVVWEEGAVRYGNGSLTPQPLSKGRGE >gi|283510588|gb|ACQH01000031.1| GENE 82 93004 - 93879 1003 291 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927940|ref|ZP_06421787.1| ## NR: gi|288927940|ref|ZP_06421787.1| hypothetical protein HMPREF0670_00681 [Prevotella sp. oral taxon 317 str. F0108] # 1 291 1 291 291 565 100.0 1e-160 MMTKDTKQRILAAVAANRTNYPSDAKHAASLGISTSVYSALKNGQTDRTLSDANWISIAR RLGVELRASIEWKAARTPVYQFVMAQLEFYQQSGTSGILCDMPNIGKTFTARLYVQNHAN SVYIDCSQVKTKLKLVRKIAAEFGVNARGRYADVYDDLVYYLRSIEQPLIILDEAGDLQY EAFLELKALWNATERACAWYMMGADGLKEKINRSIECKKVGYTEMLSRYGDRYSKVTPDD GRERDAFLAEQARIVAKVNAPTGTDIAAIVRRTGGGLRRVYTEIEKLKRAN >gi|283510588|gb|ACQH01000031.1| GENE 83 94027 - 96039 2245 670 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927941|ref|ZP_06421788.1| ## NR: gi|288927941|ref|ZP_06421788.1| hypothetical protein HMPREF0670_00682 [Prevotella sp. oral taxon 317 str. F0108] # 1 670 1 670 670 1380 100.0 0 MAMVEYYEGRLCIPAKELVERGLVSEANYKKMAIRKKFDVARTARGLGNYALVAVDTLPA AMKEAVKRAYPNLRIVRLVNWVRENYDYDQHAYAFFSDPAQCGVELPRRHVREYTVNAGV ISAAVALYNSAKAQHTVMGEAYDWDMMAEAIDVLKQEYGHTLPTSTLRFRKKVAEFKKKG YACLISGKFGNQSARKVDHKTERLILGLAILPNKPFNSNVYEMYLSFVCGELDVYDPDTG ELFCPDDFTLKNGEPKTLSEGTINNVLNAPKNKLIVEHALSTYTTFMHEQMPHMHRHNGQ FSLSQITMDDVDLTRKLKDTKQRVHAYYAYDVVSQCVLGASYGRKKDENLVVDCFRDMFR TIARHGWGIPAGIEVENHLMSQYRDGFLRAGEVFPFVHFCAPQNSQEKYAEPLNGAKKRS IIHKNHTGIGRFYGKGKWRQEYKKVSDEWNDTYEDREYFTWEELVADDRADSAEWNNTLH PDQKRYPGMTRWQVLVANVNPTLLPYDARTLARHIGAAVETSIRRNSTVRVAHEDWWLSS TTALERLAPNNYKVTAHYLPDEDGRAMDVYLYQGDRYIDKVERVETFNRVMAEQTDEDVV KFIEQQKKVAGFRKYVTDNAIRRVGVMKTKVELTVEDEEDLEVATPQAEEELPLPPIMAT DWSRAGVDAT >gi|283510588|gb|ACQH01000031.1| GENE 84 96049 - 96450 456 133 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927942|ref|ZP_06421789.1| ## NR: gi|288927942|ref|ZP_06421789.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 133 1 133 133 247 100.0 2e-64 MKRKIVVTAEVKQKLMKQFGAGERSLFNALTYDERRGNSPTAKRIRESAMKNGGVAMADD CLDMETIHLADGTMRQFFPRGTVMTVFRNGVVTIEKNGRLVKKEQCPGLIDDYEELQRLA AKVDGAERVTVLR >gi|283510588|gb|ACQH01000031.1| GENE 85 96494 - 96952 492 152 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927943|ref|ZP_06421790.1| ## NR: gi|288927943|ref|ZP_06421790.1| hypothetical protein HMPREF0670_00684 [Prevotella sp. oral taxon 317 str. F0108] # 1 152 1 152 152 282 100.0 4e-75 MNTERQNSKLAMLAKDVEGKLATITATMQRVKGVMEVDYERFFRWHSEEAYRMNMCRFEY GRLHACLLTGDLDKVRQWLRQNADCIKELLLAEGARGYSVSASGLANVNALEAKRELRKQ YLSMLDFIGNGAENERDGLNRESWLDAALKEI >gi|283510588|gb|ACQH01000031.1| GENE 86 96966 - 97181 323 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927944|ref|ZP_06421791.1| ## NR: gi|288927944|ref|ZP_06421791.1| hypothetical protein HMPREF0670_00685 [Prevotella sp. oral taxon 317 str. F0108] # 1 71 1 71 71 141 100.0 1e-32 MNGNIEIKEWSTNDFKGKVAQRLMTDRVRFCYDPEQGIVFTAPEEYVKELIYKLMVCDGV KRRPNIYEYNK >gi|283510588|gb|ACQH01000031.1| GENE 87 97283 - 97957 391 224 aa, chain + ## HITS:1 COG:no KEGG:BDI_0844 NR:ns ## KEGG: BDI_0844 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 47 223 47 222 222 113 36.0 6e-24 MQENKQEKSLVKRNISLYLSSKGITPYEFYKESGTTRGILGQNNGISEDNISRFLAYAPD VNVGWLLTGEGNMLKSESDTHELVSSTEQPISSNEGAPYYDVEFQGGFTDSFNDQTIYPD RHIYIPGFERVQVWCNISGHSMEPRIGHQDIIGLRQCLVQDIQFGKIYAVVLKTKRTVKI LRKSNNPQMLRYVPINDKGFDEQEFPITDIINIFEVLGGVAKFF >gi|283510588|gb|ACQH01000031.1| GENE 88 98393 - 98617 207 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927946|ref|ZP_06421793.1| ## NR: gi|288927946|ref|ZP_06421793.1| hypothetical protein HMPREF0670_00687 [Prevotella sp. oral taxon 317 str. F0108] # 1 74 1 74 74 142 100.0 5e-33 MAKVIHVHLLQQIDGVKRRDWYFSSLSAVFTVFTPEQVGVTKNYLLHAGLSGGGVIINKR AVIRQSTLIGCSRG Prediction of potential genes in microbial genomes Time: Sat May 28 00:43:03 2011 Seq name: gi|283510587|gb|ACQH01000032.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.32, whole genome shotgun sequence
Length of sequence - 60938 bp
Number of predicted genes - 56, with homology - 52
Number of transcription units - 31, operones - 17 average op.length - 2.5
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Op 1 . + CDS 2 - 1144 994 ## ZPR_4656 RagB/SusD family protein
2 1 Op 2 . + CDS 1157 - 2869 1306 ## Cphy_0623 hypothetical protein
    + Term 2891 - 2943 7.5
3 2 Tu 1 . + CDS 3007 - 4950 1398 ## Snas_3258 hypothetical protein
    + Term 5101 - 5142 5.4
    + Prom 5122 - 5181 5.9
4 3 Op 1 1/0.000 + CDS 5245 - 6945 1869 ## COG3643 Glutamate formiminotransferase
    + Term 6952 - 6990 6.7
    + Prom 6984 - 7043 3.1
5 3 Op 2 6/0.000 + CDS 7065 - 9074 2176 ## COG2987 Urocanate hydratase
6 3 Op 3 6/0.000 + CDS 9097 - 10572 1661 ## COG2986 Histidine ammonia-lyase
7 3 Op 4 . + CDS 10579 - 11823 1357 ## COG1228 Imidazolonepropionase and related amidohydrolases
    + Prom 11909 - 11968 4.0
8 4 Tu 1 . + CDS 11992 - 12312 394 ## COG3070 Regulator of competence-specific genes
    + Term 12470 - 12506 -0.1
    - Term 12357 - 12385 2.3
9 5 Tu 1 . - CDS 12397 - 13005 720 ## COG0307 Riboflavin synthase alpha chain
    - Prom 13043 - 13102 7.9
    - Term 13694 - 13733 -0.3
10 6 Tu 1 . - CDS 13743 - 14378 594 ## PRU_1361 HD domain-containing protein
    - Prom 14399 - 14458 3.4
    - Term 14393 - 14446 7.2
11 7 Op 1 . - CDS 14509 - 14625 58 ##
12 7 Op 2 . - CDS 14643 - 16007 1503 ## PRU_0993 putative lipoprotein
13 7 Op 3 . - CDS 16075 - 16884 857 ## COG0297 Glycogen synthase
    + Prom 16835 - 16894 5.4
14 8 Op 1 12/0.000 + CDS 17084 - 17935 697 ## COG0414 Panthothenate synthetase
15 8 Op 2 . + CDS 17942 - 18289 579 ## COG0853 Aspartate 1-decarboxylase
    + Term 18361 - 18419 12.0
    - Term 18349 - 18405 9.1
16 9 Op 1 . - CDS 18458 - 18886 616 ## COG0511 Biotin carboxyl carrier protein
17 9 Op 2 . - CDS 18894 - 19052 152 ## gi|260911542|ref|ZP_05918128.1| hypothetical protein HMPREF6745_2083
18 9 Op 3 . - CDS 19095 - 20660 1717 ## COG4799 Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta)
    - Prom 20722 - 20781 4.3
19 10 Tu 1 . - CDS 20819 - 23200 1795 ## PRU_1228 hypothetical protein
    - Prom 23352 - 23411 2.9
    - Term 24764 - 24801 -0.9
20 11 Tu 1 . - CDS 24903 - 25790 788 ## COG3757 Lyzozyme M1 (1,4-beta-N-acetylmuramidase)
    - Term 25910 - 25956 8.4
21 12 Tu 1 . - CDS 25973 - 27613 1749 ## COG0205 6-phosphofructokinase
    - Prom 27638 - 27697 6.1
    + Prom 27779 - 27838 9.8
22 13 Tu 1 . + CDS 27934 - 28203 332 ## PRU_0791 hypothetical protein
    + Prom 28294 - 28353 2.9
23 14 Op 1 . + CDS 28388 - 28942 491 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase
24 14 Op 2 . + CDS 28952 - 31585 1680 ## PRU_0789 putative N-acetylmuramoyl-L-alanine amidase
25 14 Op 3 . + CDS 31604 - 31921 513 ## PRU_0788 hypothetical protein
    + Prom 31928 - 31987 1.9
26 15 Op 1 . + CDS 32019 - 34031 1792 ## COG1032 Fe-S oxidoreductase
27 15 Op 2 . + CDS 34105 - 34875 557 ## COG3022 Uncharacterized protein conserved in bacteria
    + Prom 34895 - 34954 5.5
28 16 Op 1 . + CDS 35064 - 36068 648 ## PRU_1550 hypothetical protein
29 16 Op 2 . + CDS 36112 - 37569 1254 ## COG2195 Di- and tripeptidases
30 16 Op 3 . + CDS 37578 - 38219 533 ## COG1180 Pyruvate-formate lyase-activating enzyme
    + Prom 38952 - 39011 4.1
31 17 Op 1 . + CDS 39253 - 40482 779 ## BT_4022 integrase
32 17 Op 2 . + CDS 40536 - 41831 640 ## COG4974 Site-specific recombinase XerD
33 17 Op 3 . + CDS 41828 - 42079 147 ## gi|288927979|ref|ZP_06421826.1| hypothetical protein HMPREF0670_00720
    + Term 42225 - 42264 -1.0
34 18 Tu 1 . - CDS 42661 - 42876 189 ## BF1159 hypothetical protein
    - Prom 42945 - 43004 5.2
    - Term 43069 - 43109 -0.7
35 19 Tu 1 . - CDS 43110 - 43361 156 ## gi|288927981|ref|ZP_06421828.1| conserved hypothetical protein
    - Prom 43410 - 43469 3.6
    + Prom 43201 - 43260 5.2
36 20 Tu 1 . + CDS 43317 - 43658 163 ##
37 21 Op 1 . - CDS 43556 - 45277 736 ## Ccur_13900 molybdopterin/thiamine biosynthesis dinucleotide-utilizing protein
38 21 Op 2 . - CDS 45270 - 46253 302 ## Ccur_13910 hypothetical protein
39 21 Op 3 . - CDS 46255 - 47259 398 ## COG3621 Patatin
    - Prom 47438 - 47497 5.1
    + Prom 47254 - 47313 8.0
40 22 Op 1 . + CDS 47354 - 47590 88 ##
41 22 Op 2 . + CDS 47521 - 47718 69 ##
    + Term 47725 - 47771 6.2
    + Prom 47821 - 47880 11.4
42 23 Op 1 . + CDS 48124 - 48486 404 ## gi|288927985|ref|ZP_06421832.1| cell surface protein
43 23 Op 2 . + CDS 48486 - 48689 340 ## gi|288927986|ref|ZP_06421833.1| hypothetical protein HMPREF0670_00727
    + Term 48704 - 48747 1.1
    + Prom 48794 - 48853 3.7
44 24 Tu 1 . + CDS 48897 - 49148 285 ## PGN_1415 DNA-binding protein histone-like family
45 25 Op 1 . - CDS 49440 - 49943 150 ## BF3623 hypothetical protein
46 25 Op 2 . - CDS 49940 - 51172 418 ## MAE_49360 hypothetical protein
    - Prom 51223 - 51282 6.3
47 26 Op 1 . - CDS 51424 - 51672 270 ## BF3280 hypothetical protein
48 26 Op 2 . - CDS 51682 - 51984 209 ## PGN_0091 hypothetical protein
    - Prom 52048 - 52107 2.5
    + Prom 52379 - 52438 3.3
49 27 Tu 1 . + CDS 52544 - 53809 545 ## COG5545 Predicted P-loop ATPase and inactivated derivatives
    + Prom 53913 - 53972 7.1
50 28 Op 1 . + CDS 53992 - 54996 474 ## PGN_0923 putative DNA primase
51 28 Op 2 . + CDS 54999 - 55469 339 ## BT_4018 hypothetical protein
    + Term 55549 - 55591 -0.9
    + Prom 55611 - 55670 4.3
52 29 Op 1 . + CDS 55852 - 56283 242 ## gi|288927995|ref|ZP_06421842.1| hypothetical protein HMPREF0670_00736
53 29 Op 2 . + CDS 56267 - 57328 394 ## FP0489 mobilization protein BmgA
    + Term 57477 - 57519 1.3
54 30 Tu 1 . + CDS 58260 - 58490 188 ## gi|288927998|ref|ZP_06421845.1| hypothetical protein HMPREF0670_00739
    + Prom 59313 - 59372 3.9
55 31 Op 1 . + CDS 59523 - 59678 142 ## gi|288928000|ref|ZP_06421847.1| hypothetical protein HMPREF0670_00741
56 31 Op 2 .
+ CDS 59684 - 60916 1415 ## COG0560 Phosphoserine phosphatase Predicted protein(s) >gi|283510587|gb|ACQH01000032.1| GENE 1 2 - 1144 994 380 aa, chain + ## HITS:1 COG:no KEGG:ZPR_4656 NR:ns ## KEGG: ZPR_4656 # Name: not_defined # Def: RagB/SusD family protein # Organism: Z.profunda # Pathway: not_defined # 3 376 255 611 615 333 51.0 8e-90 QNFSKEKWERAAAAHAELIKAAEAAGHKLYYEYNDDGTIDPFMSYYNMSLKRFSDGNKEI IFGRAHNKDLENWQRHHLPKGIGGNGALGITQELVDAFFMANGEMPISGYGQDGTPIINE NSGYTEKGFSSENEMRHTKWPGGGPSSTYDNTNGNRPVTLPGTYKMYCNREPRFYVSVIF NNEWLNVANRRVNNLQGVGEDASKSFDAPWTGYNVRKQVSLDVFPRDNKYSYQPGILYRL AEAYLGYAEALNESQDGPNENVYHYVNLIRERAGLPNLPQGLSKEQMRSAIQQERRVEFN CEGIRLNDLRRWKLAEKYLNHSYWGMNRNGTMASDDANNPNAYYKRTLWKPRTFTKKMYL MPIPQKQMDINVKLRQPTGY >gi|283510587|gb|ACQH01000032.1| GENE 2 1157 - 2869 1306 570 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0623 NR:ns ## KEGG: Cphy_0623 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 123 344 1 219 236 103 29.0 2e-20 MKRHKTILLQALCLAVTLFAGCADKEEVRPDVSHYVRISQPSFNLNVGETTIIRALVDAN DGQNYNLTWEVENTQVAQVEQRAGHEGLIKALAPGKTVVKVETSDHRLKYYADVNVAEGE APMRLLTIGCGRAIDANSQLLSQMATTTGKKLVICNLSIPQASFKTHLNNIKQETNSYAY QRVSVDGNSNTQKDQDLRKVIEEENWDFIALEEATEMAGMANGYQTFLPQLAQAIKSLAT NPNVKVLLHQTWAYAKTARNPTFASYGNDQKKMFDSITNAVMQGKPHVDMVIPTGTAIQN GRTSYLGDKVLKDDIDLDPITGRYIAALSWYETLFSTDVSALSYLNPSLSQYDNKLAKTA AHAAATNLQEVTELSEFKEKGPNEFVLKCPIYIDFGPIETPAPFNNYKHNYDAPLANLLD SAGNSTYFNFAVTTHFSKPEENMVRRELTNNLGLPKTACIDMFWCDGKKLPKGAFKLSYL NKGMKYTFYFYGSINDRKTCTKYKVIGKNEGEAELVNDFNTDKMAVVSGIEPKDDGTIDI ELSMGSTNTHWAGFFGINAMIITPEGYRLR >gi|283510587|gb|ACQH01000032.1| GENE 3 3007 - 4950 1398 647 aa, chain + ## HITS:1 COG:no KEGG:Snas_3258 NR:ns ## KEGG: Snas_3258 # Name: not_defined # Def: hypothetical protein # Organism: S.nassauensis # Pathway: not_defined # 34 496 58 530 930 148 28.0 9e-34 MIKIRTAILALVAIIPTLEAWSQRPYQWTDELHSLKRIDLLPEYRTGTYIESFSSYDRKH GNDDGFDGTYSYLRKENGKLVIAEMNGPGVIERIWTPTPNDNMLYFYFDGQKKPSLSIKF 
SDLFSGKVFPFIKPICGNEIGGYYCYIPITYAKSCKILYDGQKLEFIQVEYRKLDGMKVE TFNPQANSKAELEQIRSVENLWTASSYSPNLFAMGASADFKTEERTVTIQPGETKPFFES NQGGRIVGFEIDAGTAFEGENKDVILSAKWDDEVLEAIYAPLADFFGYAFGKPAMRSLLA GRQQTANYCYLPMPYDNSAKLSLVYKKRNSQQSPITARIKVHFTSNKRNPQKEGKLYCTW RNEITPLGKYHTLLETEGKGHYVGTILSAQGLRPGMTLFFEGDDSTYVDGKMRIHGTGSE DYFNGGWYAQLDRWDRGVSLPLHGSLDYSLPMARTGGYRWFLNDKMPFERHIYHGMEHGP QNNDFPVNYTSLAFYYSNTPTSKRMVPTEEARTVYSPTKHIYFPQQFNITLGQGVVCVFG KGLAIDTPSDGIVKLSLDDVPEGKYRLLLNYKSNSHGADVQIWQRQNLIADWFSTKGDKN EEKREVHVGDIDLTRQTNSVTFRIRKNGNANKLELMLVTLERINEPK >gi|283510587|gb|ACQH01000032.1| GENE 4 5245 - 6945 1869 566 aa, chain + ## HITS:1 COG:SPy2083 KEGG:ns NR:ns ## COG: SPy2083 COG3643 # Protein_GI_number: 15675841 # Func_class: E Amino acid transport and metabolism # Function: Glutamate formiminotransferase # Organism: Streptococcus pyogenes M1 GAS # 6 316 3 285 299 235 41.0 2e-61 MNNVKQLVECVPNFSEGRNMEIINQITSVIKEVKGVKLLDVDPGEATNRTVVTFVGCPDV VVEAAFLAVKKAGELIDMRQHHGAHPRMGATDVLPLIPVAGITLEECAELARKLAKRIAD ELQIPCYCYEAAAFTPERQNLAVCRQGEYEALAEKLTTEGKQPDFGARPVDERVQRTGIT AVGARNFLIATNFNLNTTSTRRANAIAFDVREKGRPVREGNPITGKIKKDENGKTINQPG TLKATKAIGWFIDEYGIAQVSMNITNIDVTPLHVAFDEVCRCAQNRGVRVTGTEIVGLIP KRTLIEAGRYFLEKQNRSTGIPEEDVIKIAVKSMGLDDLKPFNPHEKVIEYLCEEEGNKK LVDLTVEGFAKETSRESPAPGGGTISAYMGVLGAALGAMVANLSSHKPGWDDRWEEFSHW ADKGQEMMRNLLHLVDEDTEAFNRIMAAFGLPKKTDADKQARTDAIQAATLYAAQVPLQT MKESFHVFELCKAMAETGNPNSVSDAGVGALAARAAVLGAGMNVKINASSLIDKEVSNKL IAEANNLIAKANAAETEIVAIVESKL >gi|283510587|gb|ACQH01000032.1| GENE 5 7065 - 9074 2176 669 aa, chain + ## HITS:1 COG:SPy2082 KEGG:ns NR:ns ## COG: SPy2082 COG2987 # Protein_GI_number: 15675840 # Func_class: E Amino acid transport and metabolism # Function: Urocanate hydratase # Organism: Streptococcus pyogenes M1 GAS # 17 664 20 667 676 661 50.0 0 MTLQEFQADLCGGIPHVLPEPQAYDNSVSHAPKRKDILSAEEKKLALRNALRYFPKELHA QLLPEFTQELKDYGRIYMYRLRPRYAMHARPISDYPHHSTQAAAIMMMIQNNLDPAVAQH PHELITYGGNGAVFQNWAQYRLTMQYLSEMTDEQTLVMYSGHPLGLFPSHPNAPRVVVTN 
GMVIPNYSKPDDWERFNALGVSQYGQMTAGSYMYIGPQGIVHGTTITVLNAARKVFGKEK DHKMPLFVSSGLGGMSGAQPKAGNIAGVVSVVAEINPHAAEKRHAQGWVDELHDNLDELI SAVKKAVEERRVVSMAYVGNVVDLWERLAKDNVQVDLGSDQTSLHNPYAGGYYPVGMSLE ESNRMMAEDPDTFKERVFDSLRRQVAAINKLTAQGMYFFDYGNAFLLMSSRAGADIMKDA QHFRYPSYVQDIMGPMFFDYGFGPFRWACTSCLPADLELTDQLATEVLTELMDKATDDIH GQLADNLHWIREAGRNKLVVGSQARILYADSEGRIRIALAFNKAIREGRISAPIVLGRDH HDVSGTDSPFRETSNIYDGSQFCADMAVQNVIGDSFRGATWVSLHNGGGVGWGEVINGGF GMVLDGSEACDQRIRQMLFWDVNNGISRRSWARNKGSMDAISREMERTPHFKVTLPNLVD DELIDNIHF >gi|283510587|gb|ACQH01000032.1| GENE 6 9097 - 10572 1661 491 aa, chain + ## HITS:1 COG:SPy2089 KEGG:ns NR:ns ## COG: SPy2089 COG2986 # Protein_GI_number: 15675846 # Func_class: E Amino acid transport and metabolism # Function: Histidine ammonia-lyase # Organism: Streptococcus pyogenes M1 GAS # 1 483 1 482 513 432 47.0 1e-121 MTHQISADHLTIEKIGEIIYNNYKIELSDDARKRIIKSREYLDARIAKEENPIYGVTTGF GSLCNVSISKDQLSQLQVNLIKSHACGVGERLPNDIIKMMLLLKVQSLSYGFSGCKLDTV ERLVDMFNNNIYPVILEQGSLGASGDLVPLAHMSLPLVGLGEVEMNGEVVSGEEMNRRMN WKPIELASKEGLALLNGTQNMSAQAVWALLNAQRLSEWADVIAAMSLDAFDGRIEPFTHA VHAVRPHQGQIETAARFRQLLEGSQLIARPKQHVQDPYSFRCVPQVHGTVKDTLRYVYSV IDTEINSATDNPTVCPDEDLIISAGNFHGEPIALPMDYLSIAMSELANISERRIYRLVSG LRDLPSFLVAKPGLSSGFMIAQYTAASVVSLNKSYAVPSSIDNIPSCQDQEDLVSMGANA AIKLRKVVANTERVLAIELFNAAQALDFRRPLVSSPELMKVHEAYRKVVPFIDVDVVMYP HIEESIQFLRK >gi|283510587|gb|ACQH01000032.1| GENE 7 10579 - 11823 1357 414 aa, chain + ## HITS:1 COG:SPy2081 KEGG:ns NR:ns ## COG: SPy2081 COG1228 # Protein_GI_number: 15675839 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Imidazolonepropionase and related amidohydrolases # Organism: Streptococcus pyogenes M1 GAS # 23 411 30 419 428 285 38.0 2e-76 MATTLIYNIATLGGIDRKGRLRLQGSEMAQFEALDNAYLLIKDGRIADFGSMESLPQLAA DVKRVDAEGGTVMPSFCDSHTHIVNAGSRENEFVDKINGLSYAEIAKRGGGILNSADKLH EMSEDELFEQSMKRVREVMLKGTGCVEIKSGYGLNTEDELKMLRVIRRIKQTSPLKVVSN FLGAHAVGRAYAGRQSAYVDLVVNEMIPAVAAEGLADFVDVFCDEGFFTPEETGRILQAG 
AEHGMRGKIHGQELAPSGGVEVALKHNALSVDHLESMTDEDIAMMIGTETSPTALPGTSF FLNIPFAPVRKMISAGLGPAIASDYNPGSTPSGDMKFVVSLACIKMRLLPEEALNAATIN GAYAMGLSADYGSIAIGKVANFYITHPIPAIAFIPYAYTTPVIREVYLNGERQN >gi|283510587|gb|ACQH01000032.1| GENE 8 11992 - 12312 394 106 aa, chain + ## HITS:1 COG:CAC2476 KEGG:ns NR:ns ## COG: CAC2476 COG3070 # Protein_GI_number: 15895741 # Func_class: K Transcription # Function: Regulator of competence-specific genes # Organism: Clostridium acetobutylicum # 1 98 1 98 114 107 52.0 6e-24 MACTTDFVQYVVDQCATAGEITVKKMFGEYGIYCNGKIFGLLCDDCLYIKPTEAGKKKLK SIDLRPPYPGAKDYFYIADLEDHEVLAEIVRETYKELPEPKPRNKK >gi|283510587|gb|ACQH01000032.1| GENE 9 12397 - 13005 720 202 aa, chain - ## HITS:1 COG:L0164 KEGG:ns NR:ns ## COG: L0164 COG0307 # Protein_GI_number: 15672976 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase alpha chain # Organism: Lactococcus lactis # 1 196 1 192 216 154 40.0 1e-37 MFSGIVEEMATLVAMKRYEENIDFTFKSTFTNELKIDQSVAHNGVCLTVVELDGDNYTVT AMKETLIKSNLGLLQVGDKVNIERSMKMNGRLDGHIVQGHVDTTATCIAMEDAEGSTYFT FEYAFNPEQVGRGYFTVDKGSVTVNGVSLTVVSPTECTFKVAIIPYTRENTNFCDIKVGA VVNLEFDILGKYIARMQSLSNK >gi|283510587|gb|ACQH01000032.1| GENE 10 13743 - 14378 594 211 aa, chain - ## HITS:1 COG:no KEGG:PRU_1361 NR:ns ## KEGG: PRU_1361 # Name: not_defined # Def: HD domain-containing protein # Organism: P.ruminicola # Pathway: not_defined # 1 209 1 209 218 259 60.0 5e-68 MAVSSVQLDLMEFVERNILPRYNQFDRAHNMAHVIRVIKRALVLAEKIGADVNMAYVAAA YHDLGLEGPRAIHHITSGKILMADQRLRKWFSKDQIVTLKEAVEDHRASSSRSPRSVYGK ITAEADRDLEPEMVFRRTVQYGLTNYPELSREEQWERFQRHLTDKYSTSGYIKLWIPGSI NEQYLNDIRNILPSKVLLRQWFDRLFEEETH >gi|283510587|gb|ACQH01000032.1| GENE 11 14509 - 14625 58 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MADNYIENKNQEYEERKKKWLKSQQRYPVNLPGNRKKT >gi|283510587|gb|ACQH01000032.1| GENE 12 14643 - 16007 1503 454 aa, chain - ## HITS:1 COG:no KEGG:PRU_0993 NR:ns ## KEGG: PRU_0993 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 8 454 19 485 486 283 
37.0 9e-75 MALAIVGCDDNTDTLGGSLIKEGSFVVATDSFPVASNTVAVDSLLSKNTIAYLGKVRDPE TGSYITGDAMIQLHSLANLGLPVKDSIVSKQGNEIYADSCELFLPFDKSYGDSTVVTTLN MREMEKPMLEKRAFYTNFDVEAGGYLRTSNAINKDVSYTLTSAAGKTKGILIRMNEQYVA KDGTPYKNFGTYLLQTYDKHPEYFKNTYTFLNKVLPGFYFKYKSGLGAMAHVEGCLLNVF YRSKIKGKDSTTWVSFAGTEEVLQYTKITNTADNAALLSDSKYTYLKTPAGFFTQLTIPV DKLTQGHDGENINQVKLMMPRANNSTQSKNALDAATNVLLLPVDSVDSFFKQDILMDNKT SFLASLVANTYTFDNIANVINVMRKADRTKPNWDKLVIVPVTVTTVTRQNQSGGKETVIT KIVHNMSLTSTKLLKGTGVQGSPIKLSVIYTKVQ >gi|283510587|gb|ACQH01000032.1| GENE 13 16075 - 16884 857 269 aa, chain - ## HITS:1 COG:TM0895 KEGG:ns NR:ns ## COG: TM0895 COG0297 # Protein_GI_number: 15643657 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen synthase # Organism: Thermotoga maritima # 4 181 2 175 486 87 27.0 4e-17 MSKKILFINQEITPYVPETEMSLMGYNVPLNIQETGYEIRTFMPKWGNINERRGQLHEVI RLSGVNLIIDETDHPLIIKVASIPSTRIQVYFIDNDDYFMKRQMALDENGALYSDNGQRA IFFARGVLETVKKLRWNPDIIHCQGWMAAYIPVLIKKAFKDEPTFANTKVVTSLFNHGLE GQLGHNFKKTLEYKSIKAATLSEFNAEFDYNELEKLAIAFSDGVIEGGEGIKQQLLDYAD KKGVPVLKYPGQDFTESYKKFYDTVYPEK >gi|283510587|gb|ACQH01000032.1| GENE 14 17084 - 17935 697 283 aa, chain + ## HITS:1 COG:CAC2915 KEGG:ns NR:ns ## COG: CAC2915 COG0414 # Protein_GI_number: 15896168 # Func_class: H Coenzyme transport and metabolism # Function: Panthothenate synthetase # Organism: Clostridium acetobutylicum # 1 281 1 279 281 221 42.0 1e-57 MKVFTKIVDLQDALSCHRKQHSAIGFVPTMGALHAGHASLVERSVKDNDVTVVSIFLNPT QFNDKSDLEHYPQTLEADCKLLEKVGADYVFAPSVNEIYPTPDTRVFDFPPVTTVMEGAK RPGHFNGVCQVVSRLFDIVQPDKAYFGEKDWQQIAVVKRLVEFMGSNIEIVECPIVREES GLAMSSRNERLTPNERKLAANIYKTLKESLDKAKTLTLEETRKWVINEINAHEGLEVEYF SIVDGLTLTDLDDWNASPYVVGCITVYCGETPIRLIDHIKYKG >gi|283510587|gb|ACQH01000032.1| GENE 15 17942 - 18289 579 115 aa, chain + ## HITS:1 COG:BS_panD KEGG:ns NR:ns ## COG: BS_panD COG0853 # Protein_GI_number: 16079298 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate 1-decarboxylase # Organism: Bacillus subtilis # 1 115 1 116 127 128 60.0 3e-30 
MLIEVLKSKLHCVTVTEANINYMGSITIDEDLLDAANMIAGEKVQIVDNNNGERFETYII KGQRGSGEICLNGAAARKVLVGDTVIIMSYALMDFEEAKQFKPTVVFPENNKVPK >gi|283510587|gb|ACQH01000032.1| GENE 16 18458 - 18886 616 142 aa, chain - ## HITS:1 COG:PH0834_2 KEGG:ns NR:ns ## COG: PH0834_2 COG0511 # Protein_GI_number: 14590700 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxyl carrier protein # Organism: Pyrococcus horikoshii # 26 142 2 114 114 63 38.0 1e-10 MKEYKYTIDGKEYKVSIGDIEENVAEVTVNGESFKVEMETEAEPEKKKVVLGAPAASAAA EESAAPATNVNTANAIKAPLPGVITEIKVAVGDTVAAGDTLVVLEAMKMANNIEAEKAGK VTAVCVKQGESVLEDTPLVVVE >gi|283510587|gb|ACQH01000032.1| GENE 17 18894 - 19052 152 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260911542|ref|ZP_05918128.1| ## NR: gi|260911542|ref|ZP_05918128.1| hypothetical protein HMPREF6745_2083 [Prevotella sp. oral taxon 472 str. F0295] # 1 52 1 52 52 97 100.0 2e-19 MATKDKNIYAAIALALHEFKGNNVHDKEPGFITIKKRDTLWNAKFLSMTQKP >gi|283510587|gb|ACQH01000032.1| GENE 18 19095 - 20660 1717 521 aa, chain - ## HITS:1 COG:RC0960 KEGG:ns NR:ns ## COG: RC0960 COG4799 # Protein_GI_number: 15892883 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) # Organism: Rickettsia conorii # 14 521 15 514 514 620 58.0 1e-177 MSKQIEKIKELIAKREQARLGGGEKAIEKQHARGKYTARERIDMLVDEGSFEEYDMFKLH RCHNFGMEKKQYLGDGVVAGSATVDGRLVYVYAQDFTVNGGSLSETMSQKICKIMDMAMT MGAPVICMNDSGGARIQEGINALAGYGEIFERNILASGVIPQISGIFGPCAGGAVYSPAL TDFTLMMENTSYMFLTGPKVVKSVTGEDVDSENLGGASVHATKSGVTHFTAKTEEEAMDM IKKLLSYIPSNNTEEAPRVACDDPINRMDDSLNDILPDDPNKAYDMYKVITAITDNGEFF EVQPKFAKNIITGFARFNGQSVGIVANQPSAYAGVLDVNASRKGARFVRYCDAFNIPIVS LVDVPGFLPGTGQEYNAVILHGAQLLYAYGEATVPKITINLRKSYGGSHIVMGCKQLRAD LNFAWPTAEIAVMGASGAVAVLCAKEAKEKKDAGEDVKAFLAEKEEEYTEMFANPYQAAQ YGYIDDVIEPRNTRFRICRGLAQLATKRQSLPAKKHGCMPM >gi|283510587|gb|ACQH01000032.1| GENE 19 20819 - 23200 1795 793 aa, chain - ## HITS:1 COG:no KEGG:PRU_1228 NR:ns ## KEGG: PRU_1228 # Name: not_defined # Def: hypothetical 
protein # Organism: P.ruminicola # Pathway: not_defined # 31 786 1 748 755 1001 63.0 0 MNRYVILICATVLSLLSTSLNAQTFTLQGRITDEKMSPIELATVSVLSQGKVALTSLKGE FSLTLHSADSVVVKFSMIGYKTKTRVLRNPRGKQTLQVVLHEGDNTLGEVSITEMRRQTG QTQELKKDVNKLAPTASGNAVEELIQAQAGVSTHNELSSQYNVRGGAFDENSVYINNVEV YRPFLVRSGQQEGLSVINPDLVEKIGFSTGGYAAKYGDKMSSALDITYKRPTRFEGSLSL SLLGGSAYVALGSKTFSWSNAVRYKTTKYLLGSLETKGEYSPNFLDYQTYLSYIPNKRWQ IDFIGNISNNQFNFTPADRETKFGTMQNVRSFKVYFDGQEKDLFRTYFGSLSVKRNFGDA TSLALVASAFQTKEKETYDIQGQYWLTQTETSENLGVGTYFQHARNFLQARVTSLKLVFA HKSKQHEVESGVTLKREHIEERSHEYEMRDSSGYSIPHTGTDLHLIYSLRARNQLDANRV EAYAQDTYRFGGKNENTRYNLNYGIRMSHWDFNKETIVSPRVSLGIVPAFNSNLTLRFAA GLYYQAPFFKELRDTSTVNHETRVTLNKKIKSQRSLHLIAGMDYRFNVSGRPFKLTGELY YKALANLVPYSVNNVKVVYYGDNLSNGHAAGIDLKLYGEFVPGTDSWISLSLMDTRMKLN GKSLPLPTDQRYSINMFFTDYFPGTDRWKMSLKLAFADGLPFSAPHRQMESHSFRAPAYK RADIGMSYRLLNNEQREKKSPFKNIWLGVDCLNLFGINNVNSYFWITDVTNQQYAVPNYL TGRLVNVRALLEF >gi|283510587|gb|ACQH01000032.1| GENE 20 24903 - 25790 788 295 aa, chain - ## HITS:1 COG:yegX KEGG:ns NR:ns ## COG: yegX COG3757 # Protein_GI_number: 16130040 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lyzozyme M1 (1,4-beta-N-acetylmuramidase) # Organism: Escherichia coli K12 # 82 288 65 270 275 169 42.0 7e-42 MARKSSTKVRNKRNTRRSGGRGRRRARFFYWLIPGFLRSSHRWAWWLGGVGVVCLYVWVF YYFFVGPTGFRWRALYGDANYPDGFAIHGIDISHYQGEINWDKLSDATIDGFPLKFVIVK STEGSSGIDENFNDNFYQAREYGFIRGAYHFWSNKSSARAQANFFLKQVHLEEGDLPPVL DVEHKPKNRSVEDFQRDVLTWLHIVEDKYHVKPIIYTYYKFKEQYLSAPVFDDYPYWIAH YYVEKVEYKGKWKFWQHTDAGRLDGIRGYVDLNIFNGSFYELKRLTIGSERIFGE >gi|283510587|gb|ACQH01000032.1| GENE 21 25973 - 27613 1749 546 aa, chain - ## HITS:1 COG:TP0542 KEGG:ns NR:ns ## COG: TP0542 COG0205 # Protein_GI_number: 15639531 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Treponema pallidum # 1 545 1 559 573 640 54.0 0 MEKSALQKARASYQPKLPKALQGAVKVSEGAPTKSVDNQDEIKKLFPNTYGMPLVEFVQG DKMPSKEINVGIILSGGQAPGGHNVISGLFDEVKRLNPNNRLYGFLMGPGGLVDHNYIEI 
TADFIDEYRNTGGFDMIGSGRTKLETTEQFEKGLEIIRKLDIKAIVIIGGDDSNTNACVL AEYYAAKNYGVQVIGCPKTIDGDLKNEQIETSFGFDTATKTYSELIGNIERDCNSARKYW HFIKLMGRSASHIALECALQTQPNICIISEEVEAKDLTLNDIIEEIASAVAHRAANGQNY GVVLIPEGLIEFVPAIGRLIHELNDLLAAHGADYKDLDKDAQRAYIMSHLSAENKATFET LPYSVARQLSLDRDPHGNVQVSLIETEKLISEMVEAKLDEWAKEGKYHGHFAALHHFMGY EGRCAAPSNFDADYCYALGTSAALLIACGKTGYMAVVKNTTAQADEWKAGGVPITMMMNM ERRSGEMKPVIRKALVELDGKPFKAFAAKRDEWAKNTCYIYPGPIQYWGPSEVCDVPTRT LLLEQA >gi|283510587|gb|ACQH01000032.1| GENE 22 27934 - 28203 332 89 aa, chain + ## HITS:1 COG:no KEGG:PRU_0791 NR:ns ## KEGG: PRU_0791 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 85 2 86 88 82 47.0 3e-15 MNKRELKKSVNYICSELFAEGVAASLYSQNADKDNINALLASIINIQNDFVRRISHPEPG MPAKVFYKQLIKDFNNTVSEVIDQIAHLQ >gi|283510587|gb|ACQH01000032.1| GENE 23 28388 - 28942 491 184 aa, chain + ## HITS:1 COG:CC1900 KEGG:ns NR:ns ## COG: CC1900 COG0204 # Protein_GI_number: 16126143 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Caulobacter vibrioides # 2 180 4 185 196 119 36.0 2e-27 MLKNICAWLLYRRMGWKTNITVAHPKKYIICLAPHTSNMDFLIGLLFSRAENMSSGFLMK REWFFWPLGPIWRKLGGIPVWRDKKSSMTDNLAELARKADTFSLCVTPEGTRSLNPDWKK GFYFIAQKAGIPILLYGVDYERKLIECTRMIEPTGDVEADMKEIKLYFKNFKGKNPELFT IGDV >gi|283510587|gb|ACQH01000032.1| GENE 24 28952 - 31585 1680 877 aa, chain + ## HITS:1 COG:no KEGG:PRU_0789 NR:ns ## KEGG: PRU_0789 # Name: not_defined # Def: putative N-acetylmuramoyl-L-alanine amidase # Organism: P.ruminicola # Pathway: not_defined # 29 868 141 969 982 820 48.0 0 MNKFLLWCNLLVCACFTNALAQPNALLVDLDYQGRPLTTNLSKPFNITAGLGGRHLSIWA SHGLYYETKRNSWQWQRPPLFGTNEDLFTPTIVLPYLIPMLEKAGAIVLSPRERDLQTEE VIVDNDANNNHIYNHIYKEKNGLHHWRNTTASGFANPRTTYADGENPFLMGTARMARTVQ REDKRSEIAFIPKIKNDGEKAVYVCYQTLPESTDNAQYTVYHKGQATTFSVNQRMGGGTW VYLGTFDFGKGCSESNCVVLNNLCTKDGMVTADAVRFGGGMGNIHRGGNISGVPRALEGA RYAAQWYGAPQEVYSSKEGRDDYGDDVNARGLISNWWAGGSVFLPTKPGLGIPIELCLGI 
HSDAGADRQSRNLVGSLAICTSYFNDGRLSTGMTRNLSLNFAESMLETIHNDLVQKFGRW EIRELIDKNYSETRMPDIPSAILETLSHQNFPDMRLGHDPNFKFTLARAVYKSILRFMCN QHATTAIVSPLAPSNFHINYLYNGRIKLGWSETEDELEPTAKPMGYILYTAIDSAGFDNG RMIKQNEIELALRPHSTYSFKVAAVNQGGESFTSETLSAYYQPEATSTILVVDGFDRLSS PAVISTPLQQGFDLNEDFGVSLGTQAAWVGYQQNFNKRAAGVDGWQGLGYSDTSLQGQLV AGNDQNTARTHVEDIAAIKKYNVVSASMKAVEKGLCNLSRYQCIDFALGLQRNDGHSLVK YKTFTPAIQKIIREYSSNNGRLIVSGAYVGTDMSDSLEQVFLADVFKASFEGTVRTDTLL AAIRPLASDTDSIPRDSVVALLYKSPNAVHYASQMVDVLTPLDDATTGYRYAGWWPASVA YVAGSRRSHLFGFPLECIKDREDRRRIMAETIDTIMK >gi|283510587|gb|ACQH01000032.1| GENE 25 31604 - 31921 513 105 aa, chain + ## HITS:1 COG:no KEGG:PRU_0788 NR:ns ## KEGG: PRU_0788 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 3 99 2 98 114 133 70.0 2e-30 MYKIQANNSGTRTIDVSDLHLETIDKYALLKNLVDSNGIIDEEVLDKLKYNVRSLLESET GKDKSLLDLCLDVIYNNNMKAFGLHQLVLLYIEWKTNANKSSVTE >gi|283510587|gb|ACQH01000032.1| GENE 26 32019 - 34031 1792 670 aa, chain + ## HITS:1 COG:STM3168 KEGG:ns NR:ns ## COG: STM3168 COG1032 # Protein_GI_number: 16766468 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Salmonella typhimurium LT2 # 9 661 29 719 723 516 38.0 1e-146 MNNYRLTDFLPTTKKECELRGWEQLDVILFSGDAYVDHPSFGVAVIGRCLEAAGYKVAIV PQPDWHGDYRDFKKLGVPRLFFGISPGCMDSMVNKYTANKRLRSEDAYSPDGRHDCRPDY PTVVYTQILKELYPDVPVVLGGIEASMRRLTHYDYWQDKVRKCILCDSKADIILYGMGEK SILELCASLENGQHISTIHHIPQTVFMVDKNSVPGGITDKDIILNSHEACLKDKRKQAEN FKHIEEEANKVHAQRLLQEVDGMYAVVNPPYPTLTTQELDATFDLPYTRLPHPKYKNKRI PAYEMIKFSVNMHQGCFGGCSFCTISAHQGKFVVCRSKESILNEVKKVAQMPDFKGYLSD LGGPSANMYGMAGRNKKACEVCKRPSCINPEICPNLNTDHSKLIEIYRAVDALPEIKKSF IGSGVRYDLLLHKSKDKKVNDAAMEYTKELIRNHVSGRLKVAPEHTSDEVLRSMRKPSFS QFEEFKRIFDKINKEENLRQQIIPYFMSSHPACKEADMADLAVRTKKLDFHLEQIQDFTP TPMTVSTEAWYTGYDPYTLEPVFSAKTPKDKNNQRLFFFWYKKEERQNIEHELRKIGRPE LINELYSSNDWGRQQSSGKGKKDDKNHREHKNSRTEKAYEPKPSPKTAERKSFNPNSHSG KPNKSKGKRR >gi|283510587|gb|ACQH01000032.1| GENE 27 34105 - 34875 557 256 aa, chain + ## HITS:1 COG:STM0005 KEGG:ns 
NR:ns ## COG: STM0005 COG3022 # Protein_GI_number: 16763395 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Salmonella typhimurium LT2 # 1 241 1 242 257 120 32.0 3e-27 MQILLSCAKDMATKPITFPNEHANEPHFIAAAQEIAAEMMNFSSEELEEMLKINSKLANQ NWLRYQHFLQAGDTSHAALAFTGMAYKHLKASSFNQEDISFANQHLWITSFLYGLLRPLD AIKPHRLEGYVRLQNHDDIRLFDYWKPRLTQFFIQSISNDNGMLAYLASEEMKSLFDWNE VKKKVKIFEPQFVVENARGRKTIVVYTKMCRGAMANYIITNKITNTDDLLAFEYEGFKFA EWIDNGKTPLFVLGAP >gi|283510587|gb|ACQH01000032.1| GENE 28 35064 - 36068 648 334 aa, chain + ## HITS:1 COG:no KEGG:PRU_1550 NR:ns ## KEGG: PRU_1550 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 325 1 325 333 406 64.0 1e-112 MKIKRIINDTFKVALSLFLGGAILYWMYRGFDFKQVERVMLHEMNWTWMLLSFPFGIMAQ VFRGWRWRQTLEPMGEKPRASTAIHAIFVSYAVSLVIPRIGEFARCAVLRRYDKVDFAKA LGTVVTERAIDSLLVLAIAAITLLAQVSIFDSFFSKTGTSLHAIVGQFSTTGYIVTGICA IAVGVLLHFLLKRMAIYNKVKETVKGIWQGINSIRKVRNKWLFAFFTLAIWGSYFLHYYL TFFCFEATAGLGMACAMVTFVVGSMAVVVPTPNGAGPWHFAVKTMLILYAVESTAALNFV LIVHSVQTLLVILLGVYGWMALAFSHSKRATTTE >gi|283510587|gb|ACQH01000032.1| GENE 29 36112 - 37569 1254 485 aa, chain + ## HITS:1 COG:VC2279 KEGG:ns NR:ns ## COG: VC2279 COG2195 # Protein_GI_number: 15642277 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Vibrio cholerae # 2 485 50 533 534 391 42.0 1e-108 MSEIKNLKPESIWRNFHNLTQVPRPSGHLEKVKAFLLNFAKEVGVEAFVDAANNVVMRKP ASPGMENRKTITLQAHMDMVPQKSPDSKHNFETDPIETWIDNEWVRAKNTTLGADNGIGV AAIMAVMEDKSLKHGSIEGLITADEETGMYGANDLPANELQGDILLNLDSETWGCFVVGS AGGVDVTATRNYKEVPVNKDRVALKVTIKGLKGGHSGLEIHEGRANANKLMARVVAKLVA ECGAELATWHGGNMRNAIPYKAETLVTLPANNEDKACALLNEMKTMFETEYRFVENQVEL FAEKAPLPEKQMDATDRDIVIRAIFGCHNGVLRMIPAIPTVVETSSNLAIIDIEDGKARF LMLARSSSETMKAYICDTLKGTLTLGGFDVKLSADYPGWDPNTDSEILNLMQRLHKQLFN EDAIVKVDHAGLECSIILNKYPHMDVVSFGPTIRSPHSTSERCLHETVVPFWNLLKKILE EVPQK >gi|283510587|gb|ACQH01000032.1| GENE 30 37578 - 38219 533 213 aa, chain + ## HITS:1 COG:ECs4881 KEGG:ns 
NR:ns ## COG: ECs4881 COG1180 # Protein_GI_number: 15834135 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Escherichia coli O157:H7 # 4 206 17 246 292 70 25.0 3e-12 MANRTNDAKVPLIGIDRHRIATDGHGVTTLVGFFGCPLHCKFCLNDQCHDSRRRWQRMSP QALYDELKQDELYFLATGGGVTFGGGEPCLQSRFIRAFRNICGATWNITVETSLYVPQSH LRRLLNVVNTYIVDIKDLNPDIYQEYTRKDIGLLKENLQWLAAHVAKENIFVRVPSIPNH NTPANIEYSIEELKKLGLVNIECFDYINPKQVK >gi|283510587|gb|ACQH01000032.1| GENE 31 39253 - 40482 779 409 aa, chain + ## HITS:1 COG:no KEGG:BT_4022 NR:ns ## KEGG: BT_4022 # Name: not_defined # Def: integrase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 409 1 409 409 421 49.0 1e-116 MKTEKMKVLLYLKKSGLDKSGKAPIMGRITIGRSIAQFSCKLSCNPDLWNPRESRMDGKS REAVEVNGRLENLLLSIQAAYQSLLSKGCPFDATNVKRLFQGSVQARCMLTERLDMLIKE KESHIGIDIKEGAIHGYHSTRIHLQRFIQRKYKVSDLAFSQLTEQFIYDFRQYFVGECGF QECTFYNAATFLKTVCRLAYREGLADTLLFDKAKISKGDNKLPKALDREALDKLKVLRFE DLEEEMETARDIFLFACYVGAAYCDLMELSKSHLVRDDEGSLWLKFNRQKTGVLCRVKLL PEAIRLIERFHSDERETLLPFIKYKNYQSCLKALRLRAGISFPFTTHTARHTFATLITLE QGVPIETVSKMLGHSNISMTERYAKVTPQKLFEEFDCFLSFTEDMQLAI >gi|283510587|gb|ACQH01000032.1| GENE 32 40536 - 41831 640 431 aa, chain + ## HITS:1 COG:BS_ripX KEGG:ns NR:ns ## COG: BS_ripX COG4974 # Protein_GI_number: 16079408 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Bacillus subtilis # 162 395 56 287 296 66 25.0 1e-10 MRSTFKILFYINRQKTKADGRTVILCRITIDGKSTAITTGEECNPSEWNSKQGLTTDRKT SQRLHEFRELVEKTYRDILVSDGVVSVELIKNHLQGIATHPTTLLAMSRAELQMVRESVG KSRTEGTYLNLSHADRNLHEFVKDKGVQDILIGAITEDLFEEYRFYLKKRGLKGTTINNY LCWLSRLMYRAVSQRIIRCNPFDNAKYEKEDRKIRFLQKSEVAKLVEMKMNDRESEQARL MFVFACFTGMAIADMERLQYKHIQTSADGQRYIRKERQKTKVEFVVPLHPIAEAIINHCR KEQENNEEWQTVKEKGDSLVFHRGCSRSVMGKNLCIVGKACGISQRLSYHMARHTFGTMC LSAGIPIESIAKMMGHASISSTQIYAQVTDRKISEDMDRLIAKQVAKEKGTVEREACELS DIVTYKLEETA >gi|283510587|gb|ACQH01000032.1| GENE 33 41828 - 42079 147 83 aa, chain + ## HITS:1 
COG:no KEGG:no NR:gi|288927979|ref|ZP_06421826.1| ## NR: gi|288927979|ref|ZP_06421826.1| hypothetical protein HMPREF0670_00720 [Prevotella sp. oral taxon 317 str. F0108] # 1 83 1 83 83 159 100.0 6e-38 MNTNYKPRIKASAANHRSYFDWGSNMQVIRKGNGEIAMTESELVRFFGVTWRKLNYRLQT IMKSSNLHPDERGAGEEEVLANG >gi|283510587|gb|ACQH01000032.1| GENE 34 42661 - 42876 189 71 aa, chain - ## HITS:1 COG:no KEGG:BF1159 NR:ns ## KEGG: BF1159 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 3 69 4 70 80 82 62.0 3e-15 MNTKDYNRIKVVLVEKKHTSKWLEEQLGKDQSTVSKWCANTAQPGFEMFFQIAKCFGVEV KDLIREDINNN >gi|283510587|gb|ACQH01000032.1| GENE 35 43110 - 43361 156 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927981|ref|ZP_06421828.1| ## NR: gi|288927981|ref|ZP_06421828.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. F0108] # 1 83 1 83 83 156 100.0 4e-37 MANQFIKEDYEQSEHTRFYIGEWHTHPEDNPTPSAVDYSSIEDNYQTASLVVPFMIMIVV GTKAFHISVFNGKKFVVAELEIV >gi|283510587|gb|ACQH01000032.1| GENE 36 43317 - 43658 163 113 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFRLFVVFFNKLICHCCISFTSTTGMAFNTWWRNLTDSVTIIINLCPHQYSAKLVALTLS IAFQFINYFLFYLEVIIINIYTISSCKVFFDNPDDDTRLIPFSAAIFVFPTHW >gi|283510587|gb|ACQH01000032.1| GENE 37 43556 - 45277 736 573 aa, chain - ## HITS:1 COG:no KEGG:Ccur_13900 NR:ns ## KEGG: Ccur_13900 # Name: not_defined # Def: molybdopterin/thiamine biosynthesis dinucleotide-utilizing protein # Organism: C.curtum # Pathway: not_defined # 54 522 37 498 550 84 22.0 9e-15 MPDVEAALKTCFLVISKEYAIEEVNSIPQLEHLPQTHIWKIQIPALVSGKAEDIETYILF PKTFPYSMPYVIIPDDRFRYLPHISVKTRKLCLYEDDEVYDTENIEGLIRDNIDRTRRWI ENYYGRDNSDEYSKEIRSYWNEQYDGENNVDDHWILLGDIPTETCKMAGVAYTNKYLHNS SPYVMSVIASDENDKTLASIKRQHKTRDIPVLYIASLKTPNVPPFCMTGQEVIDRISDNE DRKIFKKNLNNFKNINVLFPIGLKYAVGGICTHGLRVNRNGYREGALTSFKVLTSFENKN KKLERLHLSVYSNQRIAERTAGAMMMERRFLVAGLGSVGSNLCYYLNGYNNVVFDLIDRD SLTIDNIGRHLLGFEYINQQKSYAVAHYLQQYRPDREVYASQKQLQQQKKEEIDKASAIF VCTGDVMSEKWLLSEMIEGSVRKPAFILWLEPYGISGIMIYVNPKDKESIKRLREKANGC 
FMDFCLIKREEYKDGEKLTKHDAGCNGSYSLYSANDVTLFLSSMFPMIDQLLEKPSESRC YQWVGNTNIAAEKGISLVSSSGLSKNTLQELIV >gi|283510587|gb|ACQH01000032.1| GENE 38 45270 - 46253 302 327 aa, chain - ## HITS:1 COG:no KEGG:Ccur_13910 NR:ns ## KEGG: Ccur_13910 # Name: not_defined # Def: hypothetical protein # Organism: C.curtum # Pathway: not_defined # 1 299 1 287 315 121 30.0 4e-26 MANNHEQFMAFHDAIKATSTRRKTLKTNRDALRSKIKKYFKDTHSDYIQPTFYWQGSYAM CTLLNPIKDESGLGAYDLDDGIYFESDDINDRKSIDWYHQEVYKAVKDHTENGADDLDPC VRVQYADGHHIDLAIYFLHTTEDKTMLAHRAEPWHESNPDEMIDWFKDECNNKTKMRELV RFMKAWCEYKRNEGIKMPSGCIMSMLTSKYYQWNDENRYDIAMRDILKNMYDDLSIEDYF HCWRPVSPYEDLFDNYTKGRKVSFLKELKSFKEDAERAIASKNQHDGCVRWQKHFGDRFS CSTAKDIDEDASQQRFSGTINKNSQYA >gi|283510587|gb|ACQH01000032.1| GENE 39 46255 - 47259 398 334 aa, chain - ## HITS:1 COG:VC0178 KEGG:ns NR:ns ## COG: VC0178 COG3621 # Protein_GI_number: 15640208 # Func_class: R General function prediction only # Function: Patatin # Organism: Vibrio cholerae # 5 208 11 224 355 101 33.0 2e-21 MYMTKKPFKILCIDGGGIKGIYSAELLAKFEEVFDCIVSECFDMLCGTSTGGIIALAASL KIPMSDVVKFYQKNGPSIFNESVKHRLGGCAYLRSKQIAFGGKYSAKPLRLALECVFKDK KIVESNNFLCIPSYNTLTANPRVFKKDFDKFTEDDRKSYVDVALATSAAPTYLPVMEIED DQFVDGGLWANNPILVALTEYLYKFAQDKRFDGLEILSISSCQKTKGEKHHKLDRAFIDW SDTLFDAYSIGQSKSTMVLLSKLKGYLSFPFDFQRIEHIPLSPEQDKIIDMDNASEASMK LLVKIADNTAMNEKMRPEIAHYFQTKKTLNPKDY >gi|283510587|gb|ACQH01000032.1| GENE 40 47354 - 47590 88 78 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTNVHSFARRFHIYFCFSLQVSEALTIFIPNYNTQSVNGLERQQEMRHLSLYNKGVCTAQ GNGGFLRIPVWYPLWITA >gi|283510587|gb|ACQH01000032.1| GENE 41 47521 - 47718 69 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHGAGKRRFSADSRMVSVVDYGVKNAVNLPLDYLVVPNNLLLSLIGAPPLILHKNIFIVR ARHRA >gi|283510587|gb|ACQH01000032.1| GENE 42 48124 - 48486 404 120 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927985|ref|ZP_06421832.1| ## NR: gi|288927985|ref|ZP_06421832.1| cell surface protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 120 64 183 183 224 100.0 2e-57 MYSAADYKMVGVVEDTPVKSDASNNLYAFSKSKGKFLKIKETGNTMPHFSAYMKLNSANQ AKEFHFVFAEDNFALTTTGIESVENTESDDNAPYYNLSGVRVSKPTKGVYIHNGKKVIIK >gi|283510587|gb|ACQH01000032.1| GENE 43 48486 - 48689 340 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927986|ref|ZP_06421833.1| ## NR: gi|288927986|ref|ZP_06421833.1| hypothetical protein HMPREF0670_00727 [Prevotella sp. oral taxon 317 str. F0108] # 1 67 1 67 67 109 100.0 6e-23 MEALTMKKEYIQPQTEVVLLNGDTLMEGEWWSVHVDPNEEIEDDDIGAKQGAFNDNEDWP SYNPWED >gi|283510587|gb|ACQH01000032.1| GENE 44 48897 - 49148 285 83 aa, chain + ## HITS:1 COG:no KEGG:PGN_1415 NR:ns ## KEGG: PGN_1415 # Name: not_defined # Def: DNA-binding protein histone-like family # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 1 79 1 84 172 63 44.0 3e-09 MAFYKKMQMKVNGKWYPKSVLVGSAITTEQVAKRVAAESTVSPADVRAVLTALGGVMGDY MAQGRSVKLDGIGSFYFTAATNK >gi|283510587|gb|ACQH01000032.1| GENE 45 49440 - 49943 150 167 aa, chain - ## HITS:1 COG:no KEGG:BF3623 NR:ns ## KEGG: BF3623 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 164 3 165 168 104 35.0 1e-21 MIDFFPKEHIHISKKKVFGLCDSSTSSEKPAFLSENNGKEWIAVVHNYRQESISFVPIDH CVELLKEDGSMDNRCDCCLFHHTTIIFVELKQRGMKGSEWIKQGEKQLRSTISHFEREEQ AQLFDLKKAYIANSSKPLFRSGQAVRMQRFFNETGYSLRIENHIKDL >gi|283510587|gb|ACQH01000032.1| GENE 46 49940 - 51172 418 410 aa, chain - ## HITS:1 COG:no KEGG:MAE_49360 NR:ns ## KEGG: MAE_49360 # Name: not_defined # Def: hypothetical protein # Organism: M.aeruginosa # Pathway: not_defined # 1 410 1 412 412 330 46.0 4e-89 MSRIKIKNFGPLNTSTETDDGWIQIRKITVFVGNQGSGKSTIAKLISTCVWIEKVLTRGD FKEKEFTASKFRNKYCGYHRISNYFKKGLTEIFYEGDSYNFTYTKEGDFIIKKNTDALSI YPLPQIMYVPAERNFISIVNNPSLIKELPDSLLTFLSEYDKAKSIIKGDFILPINEASLE YSKSNDTISVKGNDYKVKLQEASSGFQSIVPLYLVSRYLSDSVSEQAKHSHKMSNDEAKR FEEEVSRIWSDNNFTDTQRRIALSALSAKFNKSAFINIVEEPEQNLFPKSQSLLMQSLLL FNNKLDANKLIITTHSPYLINDLTVAIKVGQLKEDERINDKMKDNINSIYPLASSFSTED IFIYEHNENKACFSLLDTYNGLPSDENFLNSQLEDSNNRFADLLEIEQQL 
>gi|283510587|gb|ACQH01000032.1| GENE 47 51424 - 51672 270 82 aa, chain - ## HITS:1 COG:no KEGG:BF3280 NR:ns ## KEGG: BF3280 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 8 79 4 75 95 64 40.0 9e-10 MMESVKDLNMDADDMQVVLSALSSVNKRIKEVAQTHKPLFGGEHFLTGKEVCEQLYISPR TLLDYRNRRIIPYTQFAGKDTL >gi|283510587|gb|ACQH01000032.1| GENE 48 51682 - 51984 209 100 aa, chain - ## HITS:1 COG:no KEGG:PGN_0091 NR:ns ## KEGG: PGN_0091 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 31 88 33 90 92 67 55.0 2e-10 MGYFIITDSNWTRLRDEILSLAETCHKAFGEQSRHTDWQHNGDVCRLLNISKRTLQHYRD TGVLPFSQIGYKCYYKREDVERLLETKSEKSKTDRAKNNK >gi|283510587|gb|ACQH01000032.1| GENE 49 52544 - 53809 545 421 aa, chain + ## HITS:1 COG:all8519 KEGG:ns NR:ns ## COG: all8519 COG5545 # Protein_GI_number: 17232892 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Nostoc sp. 
PCC 7120 # 63 336 321 585 836 75 24.0 1e-13 MKSNADKGKPERGNSTPRPRKTSSKNSEIETYLSKHYEFRYNTVLGRTEYRRKSDSDFAK VGRYEINTLRREIDNDIGIITSSDNLYSIIESSFSPRINPIQEYFKRLPSVDISSSSPFA LKAIPDLASCVVVHNSDKWLQYLTKWLVAVVANAMDDRECRNHTCLVLTGEQGKFKTTFL DLLCPSALHGYSYTGKIYPQEKDTLTYIGQNLIVNIDDQLKALNKRDENELKNLITCPMV KYRMPYDKYVEEHPHLASFVASVNGNDFLTDPTGSRRFLPFEVLSIDIERAKAISMDNVY AEAKALLKSGFRYWFDDDEIAELYRESEDFQVQTAEMELLLRCFEKPTEDESYSLMTTTE ILTYLGIYTHQPLVAKRMGEALKKAEYIKVSKRRNGGSPIYVYKIRKILPCPLLQTCSSQ M >gi|283510587|gb|ACQH01000032.1| GENE 50 53992 - 54996 474 334 aa, chain + ## HITS:1 COG:no KEGG:PGN_0923 NR:ns ## KEGG: PGN_0923 # Name: not_defined # Def: putative DNA primase # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 5 296 2 274 295 169 36.0 2e-40 MKEEDLSRIKRYPIVEYLERKGIKPVRRTPSYALYRSPLREETHPSFKVDTEKNLWIDYA EGRGGSIIDLFMRLENCTLSEAIRRLGQTAPDDTAYSLHNDFTPNNSQPAMAASEARKLI SISDTLPPHLQEYLTKVRCIDLERAMPFLKCISYEVRDRRYQAIGFANPSGGYELRDNGS FKGTVAPKDITPIFTDKQAEHAADKTPPVCVFEGFMDFLSFLSMKEEITNHCLVMNSVSN VARTVRYLNDRHLTHIRAFLDNDDAGRRTVQEFVRAGFKVEDMSQYYKDFKDLNEYHVSR VRERQKHKGQIQAHMSITDQNQNKESKQVKLKMK >gi|283510587|gb|ACQH01000032.1| GENE 51 54999 - 55469 339 156 aa, chain + ## HITS:1 COG:no KEGG:BT_4018 NR:ns ## KEGG: BT_4018 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 61 156 60 155 155 73 44.0 1e-12 MEKKTMMSDKDLSWFLECQDDEMDNEVISQTEVQPVPVEETTTDVEMKKDRITQGEADIA SEQSEHTRNTENVTVQKRISAKMRKKTLEAYKQAYLVPTKLNNRKAVYLSRETQERADFI VRRLGDKGSNLSSFVENIVRQHLEEYGEDIEEWRRL >gi|283510587|gb|ACQH01000032.1| GENE 52 55852 - 56283 242 143 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927995|ref|ZP_06421842.1| ## NR: gi|288927995|ref|ZP_06421842.1| hypothetical protein HMPREF0670_00736 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 143 1 143 143 259 100.0 4e-68 MSRYNGKAARWQQQDKEEQRMNKTEFIKVRCTLEEKQRIKSKAESAGRKFSDYCREILLN GEVTAVPKMTDNEREAIAILHHTGKFYGQVSNLIKVKDERWVLITKNLSLCAKEAFKRFY DPHFRVEDEVYKVLNLTRNDRKM >gi|283510587|gb|ACQH01000032.1| GENE 53 56267 - 57328 394 353 aa, chain + ## HITS:1 COG:no KEGG:FP0489 NR:ns ## KEGG: FP0489 # Name: bmgA # Def: mobilization protein BmgA # Organism: F.psychrophilum # Pathway: not_defined # 1 246 1 245 352 108 31.0 4e-22 MIGKCKAIAHGSTALGYIFREGKLGYRLAFHNLCSREPKAIYEEMKVVSDYNSRCRNKFL RIEIGIAPQDEKKLPVSELMRIAHLFAKRMGLDNHQWVAVTHKDTDNRHIHIIANRISLY GEVYDTTFVSNRAARVAEKISRSKGLTIAKEVKAERMYQKAKSNLTREQTKKELQQICYA LLDKNKGGGISGHSMFLYELNKNNITIERMKNKQGKVYGLKFSYCGQTFKASEIGREVGY HSLQKNFEVTNKEESRKPHLTVQEPTERKEQPDTGYQLVPPSRSSISRDNDTPRAQNSIG AVADTIVSAADELMEGLGDLITPTVQGDDYAETAWQRKLRNQANRKKKRGRGL >gi|283510587|gb|ACQH01000032.1| GENE 54 58260 - 58490 188 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927998|ref|ZP_06421845.1| ## NR: gi|288927998|ref|ZP_06421845.1| hypothetical protein HMPREF0670_00739 [Prevotella sp. oral taxon 317 str. F0108] # 1 76 1 76 76 137 100.0 3e-31 MEWRKRSTDNAIATRSLRMVLTKNTTKYQGKAVLDPQAQKICNLGLKSFNAIAATSYEHY SCVFNLFVSRATNTFA >gi|283510587|gb|ACQH01000032.1| GENE 55 59523 - 59678 142 51 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928000|ref|ZP_06421847.1| ## NR: gi|288928000|ref|ZP_06421847.1| hypothetical protein HMPREF0670_00741 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 51 1 51 51 84 100.0 2e-15 MAKTFTTFGHSSDEFRRDELVRKIEHALQELTLGELEALYYDMISKDYLRD >gi|283510587|gb|ACQH01000032.1| GENE 56 59684 - 60916 1415 410 aa, chain + ## HITS:1 COG:PA4960_2 KEGG:ns NR:ns ## COG: PA4960_2 COG0560 # Protein_GI_number: 15600153 # Func_class: E Amino acid transport and metabolism # Function: Phosphoserine phosphatase # Organism: Pseudomonas aeruginosa # 193 405 1 212 217 246 61.0 5e-65 MNTKKNKEEQILIRITGQDRPGLTASVMKILARYDAQILDIGQADIHSSLSLGVLIRIDD KHSGQVMKELLFKATELNVHIGFEPIGDQQYEDWVHGQGKNRYILTLIGRSLSAKQIEAA TKVISSQGFNIDSILRLTGRISIMNPDKNVRACIEFSLRGTPQDRSAMQKQLMALSAEMG IDFSFQKDDMYRRMRRLICFDMDSTLIQTECIDELAMRAGVGDKVKAITESAMRGEIDFK ESFRKRVALLKGLDVGVMKDIAEHMPITEGVDRLMAVLKRYGYKIAILSGGFTYFGEFLQ HKYGIDYVYANELEVDDNGKLTGNYVGEIVDGHRKAELLKLIAQVEKVNLAQTIAVGDGA NDLPMISEAGLGIAFHAKPRVVANAQQSINTIGLDGVLYFLGFKDSYINA Prediction of potential genes in microbial genomes Time: Sat May 28 00:45:40 2011 Seq name: gi|283510586|gb|ACQH01000033.1| Prevotella sp. oral taxon 317 str. F0108 cont2.33, whole genome shotgun sequence Length of sequence - 13399 bp Number of predicted genes - 9, with homology - 8 Number of transcription units - 7, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 9 - 85 78.2 # Pro TGG 0 0 1 1 Tu 1 . - CDS 156 - 1346 1067 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes - Prom 1578 - 1637 4.5 + Prom 1438 - 1497 5.3 2 2 Tu 1 . + CDS 1564 - 2592 1053 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase - Term 3052 - 3101 11.2 3 3 Op 1 . - CDS 3169 - 3339 79 ## gi|288928004|ref|ZP_06421851.1| hypothetical protein HMPREF0670_00745 4 3 Op 2 . - CDS 3355 - 6861 3295 ## gi|288928005|ref|ZP_06421852.1| hypothetical protein HMPREF0670_00746 5 3 Op 3 . - CDS 6797 - 6994 68 ## + Prom 7714 - 7773 1.7 6 4 Tu 1 . 
+ CDS 7825 - 8847 565 ## gi|288928006|ref|ZP_06421853.1| putative transcriptional regulator, LuxR family + Term 8931 - 8992 7.5 - Term 8925 - 8975 14.1 7 5 Tu 1 . - CDS 9141 - 10598 1582 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain - Prom 10724 - 10783 5.2 + Prom 10745 - 10804 4.9 8 6 Tu 1 . + CDS 10984 - 11193 69 ## gi|288929043|ref|ZP_06422889.1| hypothetical protein HMPREF0670_01783 + Term 11394 - 11434 -0.2 - Term 12271 - 12328 0.7 9 7 Tu 1 . - CDS 12390 - 13370 468 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase Predicted protein(s) >gi|283510586|gb|ACQH01000033.1| GENE 1 156 - 1346 1067 396 aa, chain - ## HITS:1 COG:BS_kbl KEGG:ns NR:ns ## COG: BS_kbl COG0156 # Protein_GI_number: 16078763 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Bacillus subtilis # 27 394 25 392 392 288 38.0 1e-77 MGQLQERYKAYREPQKYIKAGVYPYFREITSKQGTEVEMDGHHVLMFGSNAYTGLTGDER VVKAAKDALDKYGSGCAGSRFLNGTLDLHVQLEKELATFVQKDDTLCFSTGFSVNQGVLA MVVGRGDYIICDDRDHASIVDGRRLSFAKQLHYKHNDMEDLENILKSLPHEAVKLIVVDG VFSMEGDLAKLPEIVALKHKYNCSVMVDEAHSLGVFGKHGRGVCEHFGLTDEVDLIMGTF SKSLASIGGFIASDADTINYLRHTCRSYIFSASNTPAATAAALEALHIIQQEPERIERLW EVTRYALRRFREEGFEIGDTESPIIPLYVHDVEKTFLVTKLAFEAGVFINPVIPPACAPQ DTLVRMALMATHTEEQVERGVQILKKIFVDLDIIKS >gi|283510586|gb|ACQH01000033.1| GENE 2 1564 - 2592 1053 342 aa, chain + ## HITS:1 COG:lin0768 KEGG:ns NR:ns ## COG: lin0768 COG1597 # Protein_GI_number: 16799842 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Listeria innocua # 4 270 3 274 309 115 29.0 9e-26 MKKKIVFIMNPISGTGSKKGIPEAIDRYIDKELFDYEIRTTEYAGHACHIATEAKEQGVD VAVAVGGDGTVNEVGRAIVESDTALGIIPCGSGNGLARHLMLPMNVKKCLQLINTCEIHR LDYGKINEHYFFCTCGMGFDAFVSQKFAQAGKRGPITYAENILREGLKYQPETYEIEDET GVHRYKAFLISCANASQYGNNAYIAPRASMSDGLLDVIIMEPFDLLDAPQISLDMFNKTL 
DKNSKIKTFRCKELKIHRKNEGVIHFDGDPVEAGKDIVVSLKEKGINVIVNPDADKSLRE PNAFQSIAASLFNELNDMRRVFNKRSQKLPLLGKVIQRKLTR >gi|283510586|gb|ACQH01000033.1| GENE 3 3169 - 3339 79 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928004|ref|ZP_06421851.1| ## NR: gi|288928004|ref|ZP_06421851.1| hypothetical protein HMPREF0670_00745 [Prevotella sp. oral taxon 317 str. F0108] # 1 56 1 56 56 90 100.0 3e-17 MKTNYIKPISEIVTNKYEPILMLQNSGNQSDDADAKSFVFDEEEEDEEILPWGVVY >gi|283510586|gb|ACQH01000033.1| GENE 4 3355 - 6861 3295 1168 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928005|ref|ZP_06421852.1| ## NR: gi|288928005|ref|ZP_06421852.1| hypothetical protein HMPREF0670_00746 [Prevotella sp. oral taxon 317 str. F0108] # 1 1168 1 1168 1168 2262 100.0 0 MRRMFLHAKACGRLFAMLLLSSMFCGISSRAQNVIVSENTGSMICSQTTYSSGTTETGFA SGGFATWKHHQLPLTMTASDLKTLSPNGQLAVHGNNLYNTGADTGIQVFGGQREDGFITF ALPHGYRFTSYKIIVQNNVDVFGNGKAKLRVTHDRAFYFGETNSRFDFMSAYYKNLKKGM SNEEFTIERTSMVESDMGNILYFKIANGVNSHVSGRYVGVTLKYVELTFTPEAPFKVNIA PQQASAEGVSVVQCPFPTGKVDLGQISQNTYTGVKRQSYVYRNVKDLMAQSLFYEGASVD ETLDLNSRTAGSAVGNKTIKAVTIDQMGYFEIQPGQTYFAETPVCTKDQKGNDVPLHYRI TSAKVNYTISAEQKFYIKYIEGGDVWYLQRDATFGTTQQKWEIDAQGRINVVGTSNYLVV DPNNTLTIGNGTGSRFSLVGEGIVCNNLYMFGTKPGSPVYFAEYDNVGTAQWERSNASGA YTLKVYDKTGKAAKEINVTQPGNFVMDDLNNDAVKLEVVGGNGLVNLELTVEALDPYINH MELMCTHGDMKISREFVSNDFSVGGGVFYFYIPRDWLNTACHFTFENLKSKCADNTYYDG SSNGNARFGFVKSEYFNLFGESNNNIYRHPDFAANYDYTKKVFVATAGTKAFKFNNADEV SKTGTATSLIEYPFTLEKYAAAGGLFNNVVMTPTEENKDYMTNAYVFTTDETRYNIAPTT ATQHRYYAYYDMEIHLVARTYTPSVAFEKIYDKSFYGEAESGEFYGAVVTSKDNEGNLGY SSVEAVKEQLETAIAAGGSNVPAAMDKLLYVDMGSQMQGAYSSNGSSWTTLKNALAKNAL VFLPKNTTHAADNFAYAEEGGTYKAARNIILTDKQPFYSPYKIRVDAANYASYTREVTGT NSQAKKATLMLPFTLALTDGKHVNKGDGCSFEVYVMQATNCLNVSPEQKPGYDYMKFDGD AHFVKVEGMATEANKPYMILVNDAYTPKDGLSFVAMQHGADIMPTIAHKNKTYLIGEKAT GSGNGTNYAFENRGTYCGDKVEKVFYFANNKYYSSLNLPSDPKLVKVRPFRSYYSFSSTS GAKMASFDVVFGENGETTGINNVKANADLAVTAANGMITFFAKKAQRVEVFGVNGISVAK LNLKANETQSVPVAAGVYVINGVKVSVQ >gi|283510586|gb|ACQH01000033.1| GENE 5 6797 - 6994 68 65 aa, chain - ## HITS:0 
COG:no KEGG:no NR:no MNVFVKPNEQRRACSDIVMARKRTFRGTRKEVYLSTIINLKNECYEKNVFTRKGMWQTVR HALAQ >gi|283510586|gb|ACQH01000033.1| GENE 6 7825 - 8847 565 340 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928006|ref|ZP_06421853.1| ## NR: gi|288928006|ref|ZP_06421853.1| putative transcriptional regulator, LuxR family [Prevotella sp. oral taxon 317 str. F0108] # 1 340 1 340 340 613 100.0 1e-174 MKTTTLTAQFENFVALYTKDKEERVRFFVFFAGAIQLIVFTLLNIVGTIGIYHPFLQTVS FALLALCVAMVTLYLRRTLSLVSAFATFAITAQLLEMTRIAFLLFMKPSGYEAMVIYYQV GSYTILLYLALGFIPQIPILVTALNIATLLLVTFYDGHAIDQQIALLFALLCIFTCALAV ISRRGLHKIQQENKDYQDTHNSILTAFNMSQSELIAYLQICRAKEPNSKHVDMLLSQLNE QSKHNLVHAAMVLKKKHDAQQLELSKCFPSLTHTELEVSRLVVEGKTLGEIALIMGKTTT NISTVRGNVRKKLGLQPSEDLVEKLKELAAPANKALRKTF >gi|283510586|gb|ACQH01000033.1| GENE 7 9141 - 10598 1582 485 aa, chain - ## HITS:1 COG:RSc1058 KEGG:ns NR:ns ## COG: RSc1058 COG0265 # Protein_GI_number: 17545777 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Ralstonia solanacearum # 42 468 46 483 505 263 37.0 4e-70 MKKYSQYLIGGLCVLSIAFSAGSFMKVNAAAASPAMPAQPVDLTFAAEKALPAVVHIRYV QNSKVQTVEVQSDPFSDFFDDFFGSPGRGNGGTQKRQVQTPKREATGSGVIISADGYIVT NNHVVEGADQLTVTLNDNREFSARIIGTDKSTDLALIKIDGKNLPTLPIGDSDKLKVGEW VLAVGNPFNLNSTVTAGIVSAKARTLGGNPIESFIQTDAAINQGNSGGALVNTQGELVGI NAMLYSQTGSYSGYGFAIPTTIMNKVVADIKQYGSVQRAVMGIKGSDVRIYLDIEKEKGK EHDLGTNDGIYVDSVEDGGAGSAIGLKSGDVIVAADGRKLTKMAELQELLSGKKPGDKIT ITYLRNKKKTTATATLKNAQGNTKVMKSADLDILGGNFKEINDEQKRQLNISTGLEVIRV NNGALKDAGVAKGFIIQKVNEQIIKSTDDLQKAVKEASTSKDPVLYIQGIYPTGKKAYFA VVVGN >gi|283510586|gb|ACQH01000033.1| GENE 8 10984 - 11193 69 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929043|ref|ZP_06422889.1| ## NR: gi|288929043|ref|ZP_06422889.1| hypothetical protein HMPREF0670_01783 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 62 1 62 69 62 58.0 6e-09 MVYSQIVIGRKLFNAVSTNWKPLVDAQLLMRLHDAVDIVHEQHGLQPPTLALASLHVFHA TAIRTYTMQ >gi|283510586|gb|ACQH01000033.1| GENE 9 12390 - 13370 468 326 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 11 322 5 316 319 184 35 2e-46 MSGQEELKPYVIGLDLGGTNSVFGIVDSRGEIKATTAIKTQAYADVEDYVKASLDALHVI IEQVGGIQTIKAMGIGAPNGNYYKGTIEFAPNLAWGHNGVVPLADMFSKGLGGIPVALTN DANAAAIGEMIYGVARGLKNFIVITLGTGVGSGIVINGQVVYGCDGFAGELGHVIAQREG GRSCGCGRFGCLETYCSATGVARSAREFLEKSTTPSVLRDLKPEEITSLDVSLAAAKGDK LAIDVYEFTGKILGQACADFAAFSSPEAFIFFGGLTKAGDLLMKPLKKAYDENVLKIFKD KAKFLISGLEGSSAAVLGASAVGWEL Prediction of potential genes in microbial genomes Time: Sat May 28 00:46:44 2011 Seq name: gi|283510585|gb|ACQH01000034.1| Prevotella sp. oral taxon 317 str. F0108 cont2.34, whole genome shotgun sequence Length of sequence - 10822 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 5, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 109 - 168 3.7 1 1 Op 1 . + CDS 201 - 869 592 ## COG1280 Putative threonine efflux protein + Term 898 - 936 -0.6 2 1 Op 2 . + CDS 952 - 1500 675 ## PRU_0239 hypothetical protein 3 1 Op 3 . + CDS 1518 - 2366 854 ## COG1091 dTDP-4-dehydrorhamnose reductase 4 2 Tu 1 . + CDS 2476 - 4041 1612 ## COG4108 Peptide chain release factor RF-3 + Term 4068 - 4100 3.1 + Prom 4068 - 4127 4.0 5 3 Op 1 16/0.000 + CDS 4167 - 6131 2177 ## COG0441 Threonyl-tRNA synthetase + Prom 6288 - 6347 5.2 6 3 Op 2 . + CDS 6411 - 7082 582 ## COG0290 Translation initiation factor 3 (IF-3) + Term 7124 - 7168 4.2 + Prom 7093 - 7152 4.2 7 4 Op 1 . + CDS 7186 - 7383 268 ## PROTEIN SUPPORTED gi|212691284|ref|ZP_03299412.1| hypothetical protein BACDOR_00775 8 4 Op 2 . + CDS 7419 - 7763 520 ## PROTEIN SUPPORTED gi|53712979|ref|YP_098971.1| 50S ribosomal protein L20 + Term 7783 - 7826 9.4 + Prom 9294 - 9353 8.8 9 5 Tu 1 . 
+ CDS 9509 - 10537 1128 ## COG1879 ABC-type sugar transport system, periplasmic component Predicted protein(s) >gi|283510585|gb|ACQH01000034.1| GENE 1 201 - 869 592 222 aa, chain + ## HITS:1 COG:BS_ycgF KEGG:ns NR:ns ## COG: BS_ycgF COG1280 # Protein_GI_number: 16077378 # Func_class: E Amino acid transport and metabolism # Function: Putative threonine efflux protein # Organism: Bacillus subtilis # 10 209 1 199 209 61 27.0 1e-09 MPLPFHIDIVDIVFKGILIGLIASAPMGPVGVLCVQRTLNKGRWYGFITGIGAAVSDMIY ALVTGYGMSFIMDLITAPRTLFALKISGSVLLMLFGIYCFKSNPTKKVHYSGKGKGTLIH NGITAFLVTFSNPLIIFLFMATFAQFAFVVPNRSWLMIGGYTGIVLGALIWWFGLTWLID KVRGKFDKGGIVLINKIIGSVVIIFSVIFLIGTIFNLYTFEY >gi|283510585|gb|ACQH01000034.1| GENE 2 952 - 1500 675 182 aa, chain + ## HITS:1 COG:no KEGG:PRU_0239 NR:ns ## KEGG: PRU_0239 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 180 1 181 181 253 69.0 2e-66 MYIAKQLREKSIAEYILYMWQIEDLIRAYGCSLQRIRHEYIDKFDYTPEQKEEMLDWYGN LVRMMNQEGKRERGHLQINAIIVKDLMDLHNQLMQSTKFPFYNTAYYKVLPFIVELRNKG DKQVNEIETCLDALYGVMLLRLKQKEITPDTMTAIKEITTFVGMLADYYQKDKREGLVFE DE >gi|283510585|gb|ACQH01000034.1| GENE 3 1518 - 2366 854 282 aa, chain + ## HITS:1 COG:CAC2315 KEGG:ns NR:ns ## COG: CAC2315 COG1091 # Protein_GI_number: 15895582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Clostridium acetobutylicum # 1 279 1 275 280 238 48.0 1e-62 MNILITGCNGQLGNEIQLLQAQYAQHTWFNTDVNELDITDKAAIERFVEANEIDGIVNCA AYTAVDKAESDPQLARKLNADAPAFLAEAVAKRGGWMVQVSTDYVFNGTKHTPYVETDEP CPNSIYGQTKLEGEQAVSKLCPNAMIIRTAWLYSEFGNNFVKTMIRLGREREQLGVIFDQ VGTPTYAHDLATAIMTAIDKGIKPGVYHFSNEGVTSWYDFTKSIHRLAGINTCQVSPLHT AEYPTPACRPAYSVLDKTKIKDAYGIEIPHWEESLAKCIAKL >gi|283510585|gb|ACQH01000034.1| GENE 4 2476 - 4041 1612 521 aa, chain + ## HITS:1 COG:NMA0836 KEGG:ns NR:ns ## COG: NMA0836 COG4108 # Protein_GI_number: 15793806 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptide chain release factor RF-3 # 
Organism: Neisseria meningitidis Z2491 # 1 520 6 526 531 544 50.0 1e-154 MTEIERRRTFAIISHPDAGKTTLTEKFLLFGGQIQVAGAVKSNKIRKTATSDWMDIEKQR GISVSTSVMEFDYEGYKVNILDTPGHQDFAEDTYRTLTAVDSAIIVVDSAKGVETQTRKL MEVCRMRNTPVIIFINKMDREGRDPFDLLDELEEELQIKVRPLSWPINQGERFKGVFNIF ENQLQLFTPDKQRVTEKVEVDVNSDELNEHIGEQDAQKLRDDLELVDGVYPEFDVETYRA GHVAPVFFGSALNNFGVQELLNCFVQIAPSPKPTMAEEREVQPTEPKFTGFIFKITANID PNHRSCIAFCKICSGKFVRNQPYRHVRLDKTLRFSSPTQFMAQRKSTIDEAYPGDIIGLP DNGIFKIGDTLTEGELLHFRGLPSFSPEMFKYIENDDPMKAKQLNKGIEQLMDEGVAQLF VNQFNGRKIIGTVGQLQFEVIQYRLQNEYNAKCRWEPVHLHKACWIESDDPKELENFKKR KYQYMAKDREGRDVFLADSGYVLSMAQQDFEHIKFHFTSEF >gi|283510585|gb|ACQH01000034.1| GENE 5 4167 - 6131 2177 654 aa, chain + ## HITS:1 COG:DR2081 KEGG:ns NR:ns ## COG: DR2081 COG0441 # Protein_GI_number: 15807075 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Threonyl-tRNA synthetase # Organism: Deinococcus radiodurans # 2 622 1 623 649 629 50.0 1e-180 MIKITFPDGSIREYEKGITGLEIAKSISPALARDVLSIGVNGKTTELNRPINEDATIALY KWDDEEGKHTFWHSSAHLLAEALQALYPGIQFGFGPAVENGFFYDVMTKDGTPISENDFP KIEEKMREFAKRDEAIVRREVSKSDALKEFKEDGQEYKCEHIDLDLEDGTISTYSQGAFT DLCRGPHLMSTGPIKAIKLTSVAGAFWRGDAQKDQLTRIYGITFPKKKMLDEYLVMLEEA KKRDHRKIGKEMELFMFSERVGKGLPIWLPKGTELRLRLQNMLRKIQKRFGYQEVITPHI GSKNLYVTSGHYAHYGKDSFQPIHTPEEDEEYMLKPMNCPHHCEVFAWKPRSYKDLPLRI AEFGTVYRYEKSGELHGLTRVRSFTQDDAHIFCRPEQVKNEFLRVMDIIQAVFKIFHFDD FEAQISLRDPKNTTKYIGSDEVWEESEQAIIDACKEKGLDAKIAYGEAAFYGPKLDFMIK DAIGRRWQLGTIQVDYNLPERFKLEYTAEDNTKKTPVMVHRAPFGSMERFTAVLIEHTAG HFPLWLIPDQVAILPISEKYNDYAQRVAKYFDSVGVRATLDVRNEKIGRKIRDNEIKRVP YMVVVGEKEAAEGLVSMRKQGGGEQATMSMEEFAKRINDEVATQLAATDIHPND >gi|283510585|gb|ACQH01000034.1| GENE 6 6411 - 7082 582 223 aa, chain + ## HITS:1 COG:BH3140 KEGG:ns NR:ns ## COG: BH3140 COG0290 # Protein_GI_number: 15615702 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 3 (IF-3) # Organism: Bacillus halodurans # 6 171 20 187 190 149 47.0 6e-36 
MKNERIKNQYRVNEQIRAKEVRIVNEGGSTVMPTRQALDMARSQGVDLVEISPNAQPPVC RIIDYSKFLYQQKKHAKEMKAKQVKVEVKEIRFGPQTDEHDYNFKLKHAKEFLDAGNKVR AYVFFRGRSILFKEQGEVLLLRFANDLEEYGKVEQMPALEGKKMFLYIAPKKTGVVKKSQ QKMDREKRENESKNAETETNSTDNGLFANAKNGEDVLKKLQAE >gi|283510585|gb|ACQH01000034.1| GENE 7 7186 - 7383 268 65 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|212691284|ref|ZP_03299412.1| hypothetical protein BACDOR_00775 [Bacteroides dorei DSM 17855] # 1 65 1 65 65 107 75 3e-23 MPKVKTNSGAKKRFRFTGTGKVKRKHAYHSHILTKKTKKQKRNLVHSTLVDRADMKQVRD LLCLR >gi|283510585|gb|ACQH01000034.1| GENE 8 7419 - 7763 520 114 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|53712979|ref|YP_098971.1| 50S ribosomal protein L20 [Bacteroides fragilis YCH46] # 1 114 1 114 116 204 86 2e-52 MPRSVNHVASKARRTRILKLTKGYYGARKNVWTVAKNTWEKGLTYAYRDRRNKKRTFRAL WIQRINAAARLENMTYSTLMGALHKAGIEINRKVLADLAVNNPQAFKAIVDKVK >gi|283510585|gb|ACQH01000034.1| GENE 9 9509 - 10537 1128 342 aa, chain + ## HITS:1 COG:mll7623 KEGG:ns NR:ns ## COG: mll7623 COG1879 # Protein_GI_number: 13476333 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mesorhizobium loti # 2 334 16 345 345 84 24.0 3e-16 MAGVSVGTVDRVLHGRPNISKIAREKVEKVLAQINYQPNMYASALAYNKSYTFYLLVPQH ESEAYWEEIEEGAMKAAEVRRDFHINVEVMYYDRFNAETFVSQVNHCLSLNPDGVIVVPA QLEITRPFTDKLHERNIPFVMLDSYMPDLQPLSFYGQDSLASGYFAAKMLMMVAYNESEI AIVKQTKDGKVVSKQQENRETGFRTYMNDHFPQIKITDVNLPLETGKHEYADILEDFFAQ NPQVHHCITFNSKAHIVGKFLLQTNRRNVQIMGYDMVGKNAECLRQGSISFLIAQHAYMQ GYACVETLFQAIVLKMEVDPINYMPIELLTKENVDFYRRTQL Prediction of potential genes in microbial genomes Time: Sat May 28 00:46:53 2011 Seq name: gi|283510584|gb|ACQH01000035.1| Prevotella sp. oral taxon 317 str. F0108 cont2.35, whole genome shotgun sequence Length of sequence - 17653 bp Number of predicted genes - 17, with homology - 16 Number of transcription units - 5, operones - 4 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 12 - 71 2.1 1 1 Op 1 . 
+ CDS 103 - 1392 1411 ## COG1757 Na+/H+ antiporter 2 1 Op 2 . + CDS 1405 - 2211 676 ## PRU_0877 hypothetical protein 3 1 Op 3 . + CDS 2260 - 2955 730 ## COG3757 Lyzozyme M1 (1,4-beta-N-acetylmuramidase) 4 1 Op 4 . + CDS 2993 - 4453 1361 ## gi|288928021|ref|ZP_06421868.1| hypothetical protein HMPREF0670_00762 + Term 4546 - 4595 -0.9 - Term 5087 - 5137 7.5 5 2 Tu 1 . - CDS 5156 - 5335 106 ## - Prom 5425 - 5484 6.5 6 3 Op 1 . - CDS 5571 - 6053 491 ## COG0295 Cytidine deaminase 7 3 Op 2 . - CDS 6078 - 7469 1084 ## PRU_0981 putative lipoprotein - Prom 7492 - 7551 3.8 8 4 Op 1 . - CDS 7605 - 8780 878 ## COG0477 Permeases of the major facilitator superfamily 9 4 Op 2 . - CDS 8872 - 9693 781 ## COG0413 Ketopantoate hydroxymethyltransferase - Prom 9736 - 9795 3.5 - Term 9705 - 9743 3.9 10 5 Op 1 . - CDS 9872 - 10975 819 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis - Prom 10996 - 11055 3.1 - Term 10978 - 11021 -0.8 11 5 Op 2 . - CDS 11059 - 12000 712 ## PRU_0177 hypothetical protein - Term 12029 - 12055 -0.6 12 5 Op 3 . - CDS 12056 - 12469 315 ## PRU_0176 polysaccharide glycan isomerase 13 5 Op 4 . - CDS 12466 - 12768 172 ## PRU_0174 hypothetical protein 14 5 Op 5 . - CDS 12768 - 13916 1387 ## COG1088 dTDP-D-glucose 4,6-dehydratase 15 5 Op 6 . - CDS 13930 - 14436 440 ## COG0622 Predicted phosphoesterase 16 5 Op 7 . - CDS 14424 - 15062 560 ## BDI_1616 hypothetical protein 17 5 Op 8 . 
- CDS 15065 - 16420 1164 ## COG0534 Na+-driven multidrug efflux pump - Prom 16447 - 16506 3.3 Predicted protein(s) >gi|283510584|gb|ACQH01000035.1| GENE 1 103 - 1392 1411 429 aa, chain + ## HITS:1 COG:SA2117 KEGG:ns NR:ns ## COG: SA2117 COG1757 # Protein_GI_number: 15927906 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Staphylococcus aureus N315 # 4 426 25 448 459 319 43.0 8e-87 MHHISNKKGLLALSPLFVFIVMYLVTSIVAGDFYKIPITVAFMISSMYAIATSRGKPLAE RINVFSKGASTNNMMLMLWIFVLAGAFANSAKEMGSIDAMVNFTLNALPSNLLLPGIFLA ACLISLSIGTSVGTIVALIPIAAGIAQSTGVNLPLMAGIVVGGSFFGDNLSFISDTTVAA TNTQGCRMKDKFLVNIYIVLPAALAIVAIYTFLGFGVASTHAPSSIQYLNILPYVLVIVT AIFGMNVMAVLTLGIGMTGIIGIWNGAYDLFGWFHSMGEGIVGMGELIIITMLAGGLLEI IRMNGGVDYVINKLTARINGKRGAEGTIALLVALVDVCTANNTVSILTVGGIAKDVSQRY GVDSRKSASILDTMSCCIQGIIPYGAQLLMAAGLAKINPAHIVPYLYYPFAIGLATILAI VFRYPKKYS >gi|283510584|gb|ACQH01000035.1| GENE 2 1405 - 2211 676 268 aa, chain + ## HITS:1 COG:no KEGG:PRU_0877 NR:ns ## KEGG: PRU_0877 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 268 1 269 269 133 31.0 8e-30 MDVIKRNFFALLQTGAFGKMQHVEPMSAYKWRKLFAMMGVHDVLQFALAGYTKQQYQNVR HIPQPILHDLKSHPQLSLDEIYASSRLSNKFLNNRLKAIREGEPHAMDTSIETLHVLNII VHNVEDTLNKGVVLRGIVDLGIYLRTMGHKVDFVKLESWLNRLHIESLAQFQGSILILLL GFEEDEVPFTKRIMGDAEKATLKSVERTAKDTSEKWDFKQDKSLWLHANSGAIKQSMRRS FHYLLYAPIETISNLTYKFAKNLSELEE >gi|283510584|gb|ACQH01000035.1| GENE 3 2260 - 2955 730 231 aa, chain + ## HITS:1 COG:yegX KEGG:ns NR:ns ## COG: yegX COG3757 # Protein_GI_number: 16130040 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lyzozyme M1 (1,4-beta-N-acetylmuramidase) # Organism: Escherichia coli K12 # 34 218 70 257 275 129 32.0 6e-30 MKKLLIVLACALCPFGANAQEGYYIQCEDTCNHIHGIDISHYQGQVFWEVIGENSKMAYV YIKASEGGDRIDPRYERNIQLAHQYGLKVGSYHFFRPKTNLTKQLENFMTQCRPGDQDLI PMIDVETKSGLSTPEFRDSLTKFITLVEEAYKQKPLIYTFTNFYNAHMQGAIDGYPLMIA QYNAVEPELKDGRDITMWQYTGKGRINGINGFVDKSRFLKNHGLREIRFHH >gi|283510584|gb|ACQH01000035.1| GENE 
4 2993 - 4453 1361 486 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928021|ref|ZP_06421868.1| ## NR: gi|288928021|ref|ZP_06421868.1| hypothetical protein HMPREF0670_00762 [Prevotella sp. oral taxon 317 str. F0108] # 1 486 1 486 486 844 100.0 0 MAMIKCPECGHVISDRAPSCPSCGAKIENEVVKCPVCGEAYFKEQAECPHCHHRTANAST MENEHPNNVVPPAPQSPSAGVADTNVPQQLSNNAYVQPTVTPVAAQQQQNVYGSSQGMQS QASQPVNNAYQGQPVQPTQQQGGYQAQPAQPTQQQSGYQAQPAQPTNGPQSQFGNPQQPQ YGQVPPQTPPPGTQQPQNKNILTIALISVVALLFIGLGAYFFFGSSGNKEQQAYEYALNS NDPAVLESYLNNYRDADPMHRDKISQQLAQLKQSELDWTNTMVSGSKTALADFLSKYPNS THKAEAERKIDSLDWLVAKKANTTEAYNLYVTEHPNGVYIDEANNSLNVAKTKNITPKEK EMLALVFRSVFRSINDRDDEGLTSNFSDFLSSFLGKSNASKSDVIAFLNKIYKDDITAME WRANNDMKITKREIGIDEYEYSVDFSAKQKIEREDEGQETEANYRIKAKVNPDGKITELN MVKVIK >gi|283510584|gb|ACQH01000035.1| GENE 5 5156 - 5335 106 59 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKLVLAFVAIAAVSFASCNDKKNAPAANADSTVNDTANVQDSTAANDSTAANDSTAHM >gi|283510584|gb|ACQH01000035.1| GENE 6 5571 - 6053 491 160 aa, chain - ## HITS:1 COG:SP0844 KEGG:ns NR:ns ## COG: SP0844 COG0295 # Protein_GI_number: 15900731 # Func_class: F Nucleotide transport and metabolism # Function: Cytidine deaminase # Organism: Streptococcus pneumoniae TIGR4 # 25 155 6 127 129 88 38.0 4e-18 MRTIDLNIRIREFAFDEMNEADRYLLEQAKLATNNAYANYSKFYVGAALLLADGSVVIGA NQENAAFPSGLCAERSAIFAAQSQRPEQAVVAMAVAARNEKGFLSQPITPCGACRQVVLE MEDRYKQPVRILLYGENRIFELSSIKDLLPLSFVDAHMHG >gi|283510584|gb|ACQH01000035.1| GENE 7 6078 - 7469 1084 463 aa, chain - ## HITS:1 COG:no KEGG:PRU_0981 NR:ns ## KEGG: PRU_0981 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 2 431 14 435 442 339 42.0 1e-91 MLFSCSDDDFTTSSSVSPLHFSTDTVDMDTVFANVPTSTRTFWVYNRLGTNVRLSSVRLA GGNQQGYRVNVDGTYLSSESGFQTNNLEVRKNDSIRVFVELTSPKATSEGLELIEDKLVF SQLGGMEQRVLLRARSWKARSVKNLRIRRDTTLTGETPIIVYGKIQVDSAATLTLAPGTM LFFHDKAGIDVYGRLLVKGTAEANVVLRGDRLDKMFSYLPYDRVSGQWAGVHIYEHSYGN EFENMDLHSSFDGLRVDSSDVSKTKLVMKAVTIHNCQGFGVLADNSQIEMQNCQISNTQN 
DCVRINGGKVSLTHCTLAQFYPFGANRGAALWVTFNKFPLTNLVVQNSLLTGYANDILML GKEEKDKGLLMTFRDSRIRMPKLEVPYTNAKFENVTYENATDTTQSPRNTFAVFDEKNLY YDFRLKKGSGAIDAANALWTLPTDRKGVARDAKPDLGAYEYKE >gi|283510584|gb|ACQH01000035.1| GENE 8 7605 - 8780 878 391 aa, chain - ## HITS:1 COG:BS_ywoG KEGG:ns NR:ns ## COG: BS_ywoG COG0477 # Protein_GI_number: 16080698 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus subtilis # 11 379 6 372 396 62 25.0 2e-09 MDTQNTPVHISLWHADFWILAITNLLITMSVYMLIPILPIWMLGTAGLSQQAVGTVMGVY GIGLFLLGGFCNWLVQRYRRNRVCLWAIALVFLSLLGCWLVERELQQSWLQYRLLLFLRI VLGAAFGLVQMILSSTLIIDKSESFQRTEANHASAWFGRFALSLGPMLGIVFARTSFSPL FASAALALVAYLLLATVKFPFRTPDDGVKKISGDRFFLPDAKWLFLNLFLICLSVGILMS TQYTAMFYAMVMVGFAIALLAERFAFVDADLKSQVVAGSILIAAALMMILTRKQMVVNYI SPVFVGLGIGLIGSRFLLFFIKLSKHCQRGTAVSTYMLGWESGLAAGLFVGYFFFAEDIH LALSASLACLILAFGLYVVFTHQWYIRKKNR >gi|283510584|gb|ACQH01000035.1| GENE 9 8872 - 9693 781 273 aa, chain - ## HITS:1 COG:Cgl0115 KEGG:ns NR:ns ## COG: Cgl0115 COG0413 # Protein_GI_number: 19551365 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate hydroxymethyltransferase # Organism: Corynebacterium glutamicum # 8 273 5 269 269 273 56.0 2e-73 MAGYLNTDIKKVTANKFVAMKRQGEKISMLTAYDFTTATIIDGAGIDGILVGDSAANVMA GNADTLPITLDEMIYHARSVARAVHRALVVCDMPFGTYQVSKEAAVANAVRIMKETGVDA LKMEGGKEIADTIKYLVDIGIPVIGHLGLTPQSVHKFGGYGVRAKETAEAEKLIVDAIAL DEAGCFAITLEKVPADLAAEVTKRVSCATIGIGAGNATDGQILVYADMLGLTQGFKPKFL RQYANLGQQMNDAVQAYIADVKSTTFPSVDESY >gi|283510584|gb|ACQH01000035.1| GENE 10 9872 - 10975 819 367 aa, chain - ## HITS:1 COG:all0498 KEGG:ns NR:ns ## COG: all0498 COG0399 # Protein_GI_number: 17227994 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Nostoc sp. 
PCC 7120 # 2 363 8 375 395 236 36.0 6e-62 MLKYFDLQRYTALHREQLQQAVNRVVDSGWYLLGNEVKAFEKAYSQYIGTTHCVACGNGL DALSLIFKAYIEKGLLRPGDEILVPANTYIASIISITSNGLLPVLVEPHADTFQIDAQQM PKHIGKRTKALLVVHLYGKCAFTDELLSICEEHQLLMIEDNAQAHGCQWKGRRTGSVGHA AAHSFYPGKNLGALGDAGGVTTDDEELANIVRALGNYGSSKKYVFDYVGQNSRMDEMQAA VLCEKLKWLDQDNELRVAIATKYVDGIQNTAITLPTKDYLANNVFHIFPVLCAKRELLQR ELHQRSIETIIHYPIPPHKQKCYSEWNTLALPITERIHREELSLPLNPTMTNEEVEMVIS QVSDIRL >gi|283510584|gb|ACQH01000035.1| GENE 11 11059 - 12000 712 313 aa, chain - ## HITS:1 COG:no KEGG:PRU_0177 NR:ns ## KEGG: PRU_0177 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 312 1 310 311 330 51.0 5e-89 MYEIVRYTPELKAQWDSFVRKSRNGTFLFLRDYMDYHRQRFTDFSLLIFHKQSLVALLPL NKERDGSVTSHSGLTYGGLITDKRGRAVKVMQVFECLNTWLRNAGITRVVYRPVPWIYHE LPAEEDLYALFNVCNARLTAREISSTISQENKLTFSQSRKDCLRKAKRAGVEVMPCNDFA AFWDILSANLQQRYAVNPVHSLTEITMLANLFSRHIRLYAAYLDNEMVAGTVLYIMPRVV HVQYISASAKGKAVGALDLLFRYLIDEVFVDKPYFDFGKSTEDRGRRLNEQLIFQKEGFG GRGVCYDTYEWTP >gi|283510584|gb|ACQH01000035.1| GENE 12 12056 - 12469 315 137 aa, chain - ## HITS:1 COG:no KEGG:PRU_0176 NR:ns ## KEGG: PRU_0176 # Name: not_defined # Def: polysaccharide glycan isomerase # Organism: P.ruminicola # Pathway: not_defined # 1 134 1 134 137 189 70.0 2e-47 MKHLNTALIDFPKITDPRGNLTVAQALTDVPFAIKRAYWVYDVPAGECRGGHAHKLCKEV LIALSGSFHVTVDNGEEQKTVLLNHPYQGLLIETDVWRTLDDFSSGAVCLVLASEPFDED DYIREYDDFLRYLADIK >gi|283510584|gb|ACQH01000035.1| GENE 13 12466 - 12768 172 100 aa, chain - ## HITS:1 COG:no KEGG:PRU_0174 NR:ns ## KEGG: PRU_0174 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 2 99 4 101 101 64 44.0 1e-09 MDSISTELHSFLISFGQNPKLVSHQVGHYVEHLLHLLPTLNEQRLISFYGLFGKTRLTLR QLAQAKNETDAQTAENIATDLRRLAVTPEWQMLKGLINKK >gi|283510584|gb|ACQH01000035.1| GENE 14 12768 - 13916 1387 382 aa, chain - ## HITS:1 COG:FN1667 KEGG:ns NR:ns ## COG: FN1667 COG1088 # Protein_GI_number: 19704988 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
dTDP-D-glucose 4,6-dehydratase # Organism: Fusobacterium nucleatum # 1 382 1 399 399 478 59.0 1e-134 MKTYLVTGAAGFIGANYIKYLLEKKYQDEDIKVIVLDLLTYAGNLGTIKNNIDDKRCVFV RGDIRDRDLVNGLFADNEIDYVVNFAAESHVDRSIEDPQLFLSVNILGTQNLLDAARKAW VTGKDETGYPTWKAGKRYHQVSTDEVYGSLGDTGYFMETTPLCPHSPYSASKASADMFVM AYHDTYHMPVSITRCSNNYGPYHFPEKLIPLIINNILEGKKLPVYGKGENVRDWLYVEDH CKAIDLVVREGRVGEVYNVGGHNEMKNIDIVKLIIKTIRQLMDSNKDLRHVLKKQELDAN GEIRTDWINEDLITYVADRLGHDLRYGIDPTKIKNELGWYPETMFADGIVKTIEWNLANQ EWIAEVTSGDYQKYYEEMYGNR >gi|283510584|gb|ACQH01000035.1| GENE 15 13930 - 14436 440 168 aa, chain - ## HITS:1 COG:PA0351 KEGG:ns NR:ns ## COG: PA0351 COG0622 # Protein_GI_number: 15595548 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Pseudomonas aeruginosa # 3 161 9 157 157 73 35.0 2e-13 MRRIGILSDTHGYWDDRYLTYFAECDEIWHAGDIGSLEVSMRLAAFKPLRAVHGNIDGGD LRRLYPERLRFKVEDVDVLMTHIGGYPGHYDPRIQATLYASPPKLFVAGHSHILKIQYDK TLNMLHINPGAAGISGFHKERTLVRLTIEGEKFTDCEVITLADNNKNR >gi|283510584|gb|ACQH01000035.1| GENE 16 14424 - 15062 560 212 aa, chain - ## HITS:1 COG:no KEGG:BDI_1616 NR:ns ## KEGG: BDI_1616 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 4 209 3 210 215 194 46.0 2e-48 METSNIIITVGRQLGSGGKEIASLLAQRLGYKMYDSELLNLAAKESGLCERLFERNDEQK AFSKSFFRYNLPLIGGEHGMFQDELSQDTLFKFQSEAIIKAAEADNCVFVGRCADYILRD APRMISFFVTANLEERIQRVMQRQQCDEKKAAAIIHEGDERRAAYYDFYTGKTWGHGFSY DLCINSSILGVRGTVDFLGRFVETKFSKQCGE >gi|283510584|gb|ACQH01000035.1| GENE 17 15065 - 16420 1164 451 aa, chain - ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 9 434 6 432 448 342 42.0 1e-93 MDNQKATLELGTKPVGQLLVRYAVPAIIAMTASSLYNMVDSIFIGQGVGALAISGLAITF PLMNLSTAFGAGVGVGASSLLSVKLGQKDYEAAQNILGNTVMLNIITGVCFSIISLLFLE PILMFFGASAQTLPYAKDYMEIILLGNVVTHLYFGLNALQRAAGKPQLSMYMTIFTVIIN 
AILDPIFIWPLGLGIRGAAYATVLSQLLALIWQLVMFSKKSEFIHFKRGIYRLRSRLVKN ILAIGMSPFSMNVCACFVVIIINNSLVSHGGDMAVGAYGIINRIAFIFVMVTIGVNQGMQ PIAGYNYGAMKFDRMMRVLKYAVICGTCVTTTGFVVGEFFPEQCVRLFTSDQTLITLSVH AMRITMMSFPIIGYQMVIANFFQSIGKAKISVFLSLSRQLVFLIPLLLVLPTLYGVDGVW WSMPVADTTSALVTAVIMVLFMRKINKQKTL Prediction of potential genes in microbial genomes Time: Sat May 28 00:47:40 2011 Seq name: gi|283510583|gb|ACQH01000036.1| Prevotella sp. oral taxon 317 str. F0108 cont2.36, whole genome shotgun sequence Length of sequence - 14188 bp Number of predicted genes - 15, with homology - 14 Number of transcription units - 9, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 4 - 63 4.2 1 1 Tu 1 . + CDS 213 - 431 172 ## gi|288928034|ref|ZP_06421881.1| hypothetical protein HMPREF0670_00775 + Term 451 - 492 3.4 2 2 Tu 1 . - CDS 673 - 2244 1326 ## COG2509 Uncharacterized FAD-dependent dehydrogenases - Prom 2331 - 2390 4.6 3 3 Op 1 42/0.000 - CDS 2435 - 3403 863 ## COG0224 F0F1-type ATP synthase, gamma subunit 4 3 Op 2 . - CDS 3478 - 4101 560 ## COG0056 F0F1-type ATP synthase, alpha subunit 5 3 Op 3 41/0.000 - CDS 4098 - 5066 872 ## COG0056 F0F1-type ATP synthase, alpha subunit 6 3 Op 4 38/0.000 - CDS 5068 - 5607 425 ## COG0712 F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) 7 3 Op 5 . - CDS 5615 - 6130 467 ## COG0711 F0F1-type ATP synthase, subunit b 8 3 Op 6 . - CDS 6139 - 6381 352 ## PRU_1195 ATP synthase F0 subunit C (EC:3.6.3.14) - Prom 6408 - 6467 3.4 - Term 6495 - 6540 0.4 9 4 Tu 1 . - CDS 6573 - 7628 758 ## COG0356 F0F1-type ATP synthase, subunit a + Prom 7595 - 7654 4.1 10 5 Tu 1 . + CDS 7809 - 7952 120 ## - Term 7969 - 8013 -1.0 11 6 Op 1 42/0.000 - CDS 8084 - 8326 238 ## COG0355 F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) 12 6 Op 2 . - CDS 8348 - 9877 1900 ## COG0055 F0F1-type ATP synthase, beta subunit - Prom 9915 - 9974 4.9 13 7 Tu 1 . 
- CDS 10008 - 10985 1032 ## COG0205 6-phosphofructokinase - Prom 11017 - 11076 4.6 - Term 11070 - 11108 0.4 14 8 Tu 1 . - CDS 11151 - 12539 842 ## COG1055 Na+/H+ antiporter NhaD and related arsenite permeases - Prom 12755 - 12814 2.5 15 9 Tu 1 . - CDS 12870 - 13310 222 ## TDE0530 hypothetical protein - Prom 13368 - 13427 4.9 Predicted protein(s) >gi|283510583|gb|ACQH01000036.1| GENE 1 213 - 431 172 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928034|ref|ZP_06421881.1| ## NR: gi|288928034|ref|ZP_06421881.1| hypothetical protein HMPREF0670_00775 [Prevotella sp. oral taxon 317 str. F0108] # 1 72 7 78 78 91 100.0 2e-17 MFREDGYLNDFHRAWDEKCIKDDKERWALMQSKGMRYTAEFARAQSQRMDELVRQEEERL KEERKKHSKAGR >gi|283510583|gb|ACQH01000036.1| GENE 2 673 - 2244 1326 523 aa, chain - ## HITS:1 COG:L195271 KEGG:ns NR:ns ## COG: L195271 COG2509 # Protein_GI_number: 15673161 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Lactococcus lactis # 17 522 30 532 535 360 40.0 5e-99 MTYEYQLRLLPQQAYTEDSIRDYVAHDKALDVRTITHVQVLKRSIDARQRTIFVNLSVRV FVNEMPEKLEYQVVNYPDVSHCPQAIVVGAGPGGLFAALRLIEQGIRPIVVERGKSVNER HRDLSLITKEQKINPESNYCFGEGGAGAYSDGKLYTRSKKRGNVEKILSVFCQHGASTSI LSDAHPHIGTDKLPKVIEAIRNTIIICGGEVHFQTKMTALLMESNKVIGVQTLNGLTSEE RNFKGAVILATGHSARDVYQYLATAGIPIEPKGLAIGVRLEHPSALIDQIQYHNKNGRGK YLPAAEYSFVTQSAERGVYSFCMCPGGFVIPAATEHEQIVVNGMSPANRGSKWSNSGMVV EVRPGDVEGDDVLKMMHYQQSIEHDCWVNGNRKQTAPAQRMVDFVNRKLSYDLPQSSYSP GLISSPLHFWMPKHIVSRLQEGFQKFGRNSHGFLTNEAVLIATESRTSSPVRIVRDRENF QHVSVGGLFPCGEGAGYAGGIVSAAMDGERCAEMCAAYVKAIG >gi|283510583|gb|ACQH01000036.1| GENE 3 2435 - 3403 863 322 aa, chain - ## HITS:1 COG:FN0359 KEGG:ns NR:ns ## COG: FN0359 COG0224 # Protein_GI_number: 19703701 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, gamma subunit # Organism: Fusobacterium nucleatum # 1 319 1 279 282 171 33.0 2e-42 MPSLKEIKTRIASVNSTRKITSAMKMVASSKLHHAQQLIENMLPYESMLEHILKAFLASM 
PEARTPLNVERKQLKRVALVPFSSNSSLCGGFNANVTKLLQQAVAHYREQGVTDIVVYPL GRKITEQATKMGLQIGGNFNLLAEKPNAHQCADIATELSEQYAAGELDRVELIYHHFKSA GSQILTRRTFLPIDLSTELGRFSDRDLSSDVVTAKAQEYLKKKAAKEEKTKDEAKPLNDN FLVEPDMETILTTLIPKELNLMMYTALLDSNASEHAARMVAMQTATDNADELLRGLNLQY NKSRQQAITNELLDIMGGSVNN >gi|283510583|gb|ACQH01000036.1| GENE 4 3478 - 4101 560 207 aa, chain - ## HITS:1 COG:ML1143 KEGG:ns NR:ns ## COG: ML1143 COG0056 # Protein_GI_number: 15827574 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, alpha subunit # Organism: Mycobacterium leprae # 14 204 320 511 558 220 58.0 2e-57 MNDLPECLKGHVKGGGSLTALPIIETQAGDVSAYIPTNVISITDGQIYLETDLFNQGFRP AINVGISVSRVGGSAQIKSMKKVAGTLKIDMAQYRELEAFSKFSSDMDAVTAMTLDRGRK NTQLLIQPQYSPMPVGEQIAILYCGTRGLMHDVPVDQVRECQTLFLDKLRSTHQSTIDFL ASGGLNDEACGVLESVMADIAGQFQKV >gi|283510583|gb|ACQH01000036.1| GENE 5 4098 - 5066 872 322 aa, chain - ## HITS:1 COG:FN0360 KEGG:ns NR:ns ## COG: FN0360 COG0056 # Protein_GI_number: 19703702 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, alpha subunit # Organism: Fusobacterium nucleatum # 5 289 3 280 500 335 58.0 9e-92 MSDKIRPNEVSEILFQQLQQISDGQDFDEIGTVLTVSDGVARVYGLRNAEANELLEFENG TMAIVMNLEEDNVGCVLLGPTSEIKEGQVVKRTHRIASIRVNDNMLGRVINPLGEAIDGL GEIDLSNAFEMPLDRKAPGVIFRQPVKEPLQTGLKGVDSMIPIGRGQRELIIGDRQTGKT AIALDTIINQKSFYDAGKPVYCIYVAIGQKASTVATLVQTLKERGALDYTIIVSATAADP AAMQYYAPFAGAAIGEYFRDRGEHALVVYDDLSKQAVAYREVSLILRRPRVARLIQATFS ICIVVCWNVPHASTTSKKWPNR >gi|283510583|gb|ACQH01000036.1| GENE 6 5068 - 5607 425 179 aa, chain - ## HITS:1 COG:CC3450 KEGG:ns NR:ns ## COG: CC3450 COG0712 # Protein_GI_number: 16127680 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) # Organism: Caulobacter vibrioides # 9 177 13 182 184 66 25.0 2e-11 MDIGLIAVRYARALLKAAIDDRCYEDVYNNMTNLLECYLHVPELRPAVCNPMYSRSKKSS VVLAACGTPAVELTRRFIALVLREGREEVLQFIASSFINLYRQHFHITSGKITTAVQLTA 
EVEQRMRHLVEERTQGTVEFTTEVDPAILGGFVLDYDTYRMDASVKSKLNRILIELKNK >gi|283510583|gb|ACQH01000036.1| GENE 7 5615 - 6130 467 171 aa, chain - ## HITS:1 COG:ML1142_1 KEGG:ns NR:ns ## COG: ML1142_1 COG0711 # Protein_GI_number: 15827573 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit b # Organism: Mycobacterium leprae # 19 170 8 159 250 69 28.0 3e-12 MENLPSILTPDLGLLFWMLLAFLVVFVVLVKYGFPVIIRMVEDRKTYIDESLRKAHEASE RLENIKQESEQILQDAREKQSLILKEAAQTRDAIVEKARQTAHEEGVRLLEETKRQIEVE KQNAIRDIRTQVAELSVQIAEKVVRENLASNAQQMSLVNRFLDDAFSVNPN >gi|283510583|gb|ACQH01000036.1| GENE 8 6139 - 6381 352 80 aa, chain - ## HITS:1 COG:no KEGG:PRU_1195 NR:ns ## KEGG: PRU_1195 # Name: atpE # Def: ATP synthase F0 subunit C (EC:3.6.3.14) # Organism: P.ruminicola # Pathway: Oxidative phosphorylation [PATH:pru00190]; Metabolic pathways [PATH:pru01100] # 2 80 1 79 79 94 88.0 9e-19 MMITTLLAAEGLAQLGCGIGAGIATIGAGLGIGRIGGSAMDAIARQPEATNKIQTNMIVT ASFVEGCALFAIAACAFLIK >gi|283510583|gb|ACQH01000036.1| GENE 9 6573 - 7628 758 351 aa, chain - ## HITS:1 COG:BMEI1546 KEGG:ns NR:ns ## COG: BMEI1546 COG0356 # Protein_GI_number: 17987829 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit a # Organism: Brucella melitensis # 110 348 50 277 277 88 30.0 2e-17 MKKVRYISFLLLLLFTSLSLHAASNGGEDKKELDIPEIVLEHLSDAYEWHIATYEGKHLS IPLPIIVRSSDTGEWYVGTAHHLPNNFFFDAQHHGKIYERRADGSTTRPLDLSITKAVAQ IWIVVAILLTVFLSCARWYKKVDEKSAAPKGLVGLMEMFVMYIHDDLIRPAVGERQYKRY APFLLTVFFFIFTCNMIGLVPVFPGGANVTGNINITMFLAVTAMLVINIFGNKEYWKDVF WPEVPVALKFPVPLMPVIEVFSIFTKPFALMIRLFANMMAGHAIILSFTCVIFLGWNMGV GYGLGLNTFGIIMLLFMNCLELLVAFVQAYVFTMLSAVFIGMAHREHHVAE >gi|283510583|gb|ACQH01000036.1| GENE 10 7809 - 7952 120 47 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRNPDTPKKKVGRLSGDFIATRDQIRMSAVSMIKKTTELTSNGNATA >gi|283510583|gb|ACQH01000036.1| GENE 11 8084 - 8326 238 80 aa, chain - ## HITS:1 COG:XF1142 KEGG:ns NR:ns ## COG: XF1142 COG0355 # Protein_GI_number: 15837744 # Func_class: C Energy production 
and conversion # Function: F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) # Organism: Xylella fastidiosa 9a5c # 6 79 11 84 143 58 41.0 3e-09 MLKLIIVSPEKQLYQGDAQSVKVPGSLGEFEILDSHAPIISSLDAGQVVYVTGEGETHEC AIAGGFVEVQKNVVSLCVEV >gi|283510583|gb|ACQH01000036.1| GENE 12 8348 - 9877 1900 509 aa, chain - ## HITS:1 COG:CAC2865 KEGG:ns NR:ns ## COG: CAC2865 COG0055 # Protein_GI_number: 15896119 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, beta subunit # Organism: Clostridium acetobutylicum # 6 505 6 466 466 608 64.0 1e-173 MSNIEGHISQIIGPVIDVYFDTKGQEEEKVLPKIHDALRIVRPDGRELIVEVQQHIGEDT VRCVAMDNTDGLQRHLKATPLGSPITMPAGEQIKGRMLNVIGKPIDGMSELNMEGAYPIH REAPKFDELSTHKEMLATGIKVIDLLEPYMKGGKIGLFGGAGVGKTVLIMELINNIAKGH NGYSVFAGVGERTREGNDLIRDMIESGVIRYGDKFRKAMEEGKWDLSLVDPEELRQSQAT LVYGQMNEPPGARASVALSGLTVAEEFRDHGGKDGEPADIMFFIDNIFRFTQAGSEVSAL LGRMPSAVGYQPTLASEMGAMQERITSTKRGSITSVQAVYVPADDLTDPAPATTFTHLDA TTELSRKITELGIYPAVDPLGSTSRILDPLIVGKEHYECAQRVKQILQRYKELQDIIAIL GMDELSDEDKQTVNRARRVQRFLSQPFTVAEQFTGVKGAMVSIEDTIKGFNMILDGEVDD LPEQAFLNVGTIEDAIEKGKKLLEAAGSN >gi|283510583|gb|ACQH01000036.1| GENE 13 10008 - 10985 1032 325 aa, chain - ## HITS:1 COG:BH3164 KEGG:ns NR:ns ## COG: BH3164 COG0205 # Protein_GI_number: 15615726 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Bacillus halodurans # 4 325 1 319 319 287 48.0 2e-77 MTKVKTIGILTSGGDAPGMNAAIRAVTRAGICNGFNIKGIYRGFDGLINGDIKPFTTENV SGIIMQGGTILKTARSSEFTTEEGRKKAYENIVKEGIDALVVIGGNGSLTGAMVFAREYD LCCIGLPGTIDNDLYGTDSTIGYDTTMNTIVECVDRIRDTAQSHERIFFVEVMGRDAGFL AQNSAIAAGAEAAIIPEDSTDVDQLARFMERGIRKSKKSCIVIVSESPKCGALYYADRVR KEFPDYDVRVSILGHLQRGGRPTAHDRILASRTGVGAIDAIMQGQRNVMIGVHNNEIVYV PLSEAIRSDKPFDKKLITVLDELSI >gi|283510583|gb|ACQH01000036.1| GENE 14 11151 - 12539 842 462 aa, chain - ## HITS:1 COG:CPn1015 KEGG:ns NR:ns ## COG: CPn1015 COG1055 # Protein_GI_number: 15618923 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/H+ antiporter 
NhaD and related arsenite permeases # Organism: Chlamydophila pneumoniae CWL029 # 1 448 2 400 420 130 23.0 4e-30 MSLIILLIIVVGFILISTENVTNVNRAAVAIFVGTVGWVLYVSYGTDFVQLMHKQAYTDY LGGMAHTSQAVKQFIAQTIFLRYVGKACEIVLFLLATMTIVEILNNNGCFDFLGIWARTR DSKKLLWMMMLIVFVVSANLDNLTTTVMALTLIHQVIPNRRYRMIYGSVAVIAANMGGAL TVIGSPEGLVLWNMGAITATNYSLSMAVPCIVTCVVPTWLLSRALPDRIDIEFLRLPYRG DDTNLNVWQRLVMFVVGIGGLWFIPTFHNITKLSPFLGALCVLSVLWVVNEIFNRKLLNT GETIHRRIPRVLQYGSMQMILFVLGIMFAVGVVNETGFFSEVTAYIDKQTPNIWFFGVAS AAISSVLDSFVTAMTALSLHPVLSGENAELTSYSVNFAQNGAYWKMIAFATAVGGNILPI GSISGLALIRTERMRVGWYFANVGWKVLLGGVLGMALLWLIT >gi|283510583|gb|ACQH01000036.1| GENE 15 12870 - 13310 222 146 aa, chain - ## HITS:1 COG:no KEGG:TDE0530 NR:ns ## KEGG: TDE0530 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 1 146 1 146 146 137 48.0 1e-31 MEIVISSVIGGQIAAVEGLEKVVHALIIEMRKSFKKQFENKSFAGLDKVKINMYISGDVS EYCTTSEITKCLYFERKKQFISEFCIARSYWTSCSLSDVKGKFLLLIENLFVSLSMLIKH RLKCKGYDFDSEVFKEIVLKSLIELS Prediction of potential genes in microbial genomes Time: Sat May 28 00:48:05 2011 Seq name: gi|283510582|gb|ACQH01000037.1| Prevotella sp. oral taxon 317 str. F0108 cont2.37, whole genome shotgun sequence Length of sequence - 45266 bp Number of predicted genes - 34, with homology - 30 Number of transcription units - 22, operones - 9 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 216 - 275 7.3 1 1 Tu 1 . + CDS 479 - 709 96 ## + Term 738 - 788 1.0 - Term 633 - 699 11.3 2 2 Op 1 . - CDS 730 - 2763 2267 ## COG3590 Predicted metalloendopeptidase 3 2 Op 2 . - CDS 2852 - 4813 1970 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Prom 4850 - 4909 3.4 4 3 Tu 1 . + CDS 4686 - 4919 75 ## 5 4 Tu 1 . - CDS 4953 - 5336 254 ## PRU_1039 putative 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase - Prom 5356 - 5415 4.9 + Prom 5513 - 5572 5.2 6 5 Tu 1 . 
+ CDS 5602 - 6516 947 ## COG1705 Muramidase (flagellum-specific) + Term 6553 - 6594 8.5 + Prom 6640 - 6699 2.2 7 6 Tu 1 . + CDS 6730 - 8004 978 ## PRU_1040 hypothetical protein + Term 8062 - 8100 6.1 8 7 Tu 1 . - CDS 8386 - 8505 78 ## - Prom 8527 - 8586 5.0 - Term 9497 - 9556 17.1 9 8 Op 1 . - CDS 9654 - 11306 1604 ## COG4690 Dipeptidase 10 8 Op 2 . - CDS 11326 - 12978 1415 ## COG0739 Membrane proteins related to metalloendopeptidases 11 9 Op 1 . - CDS 13086 - 13301 118 ## gi|288928056|ref|ZP_06421903.1| hypothetical protein HMPREF0670_00797 12 9 Op 2 . - CDS 13265 - 14149 767 ## PRU_1066 hypothetical protein 13 9 Op 3 . - CDS 14151 - 14807 613 ## PRU_1067 putative lipoprotein signal peptidase - Term 14824 - 14871 8.9 14 10 Op 1 . - CDS 14896 - 15276 489 ## PRU_1068 hypothetical protein 15 10 Op 2 . - CDS 15293 - 18733 3935 ## COG0060 Isoleucyl-tRNA synthetase - Prom 18788 - 18847 7.2 + Prom 18918 - 18977 4.3 16 11 Tu 1 . + CDS 18998 - 19492 573 ## COG0394 Protein-tyrosine-phosphatase + Prom 19561 - 19620 3.3 17 12 Tu 1 . + CDS 19673 - 21841 2031 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member + Term 21937 - 21979 2.5 - Term 21682 - 21718 1.6 18 13 Tu 1 . - CDS 21917 - 22108 178 ## 19 14 Op 1 33/0.000 - CDS 22234 - 23247 710 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 20 14 Op 2 . - CDS 23255 - 24382 740 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component - Prom 24616 - 24675 7.5 21 15 Op 1 . - CDS 25387 - 26532 792 ## PRU_1304 hypothetical protein 22 15 Op 2 . - CDS 26533 - 28617 1888 ## COG0296 1,4-alpha-glucan branching enzyme - Term 28762 - 28818 3.3 23 16 Op 1 . - CDS 28862 - 29311 310 ## COG2731 Beta-galactosidase, beta subunit 24 16 Op 2 . - CDS 29366 - 30391 864 ## COG4642 Uncharacterized protein conserved in bacteria - Prom 30512 - 30571 6.5 + Prom 30643 - 30702 6.2 25 17 Op 1 . 
+ CDS 30925 - 31803 1004 ## COG1209 dTDP-glucose pyrophosphorylase 26 17 Op 2 . + CDS 31883 - 34387 2362 ## COG0489 ATPases involved in chromosome partitioning 27 17 Op 3 . + CDS 34453 - 34884 398 ## PRU_1780 hypothetical protein + Term 34953 - 35013 12.3 - Term 34939 - 35001 9.3 28 18 Tu 1 . - CDS 35087 - 35731 584 ## COG0546 Predicted phosphatases - Prom 35883 - 35942 3.4 + Prom 35703 - 35762 9.0 29 19 Tu 1 . + CDS 35921 - 36970 995 ## COG1559 Predicted periplasmic solute-binding protein + Prom 38842 - 38901 5.9 30 20 Tu 1 . + CDS 38966 - 40681 1956 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases + Term 40783 - 40819 4.2 + Prom 40730 - 40789 1.6 31 21 Op 1 . + CDS 40825 - 42237 1310 ## COG0457 FOG: TPR repeat 32 21 Op 2 . + CDS 42248 - 42928 314 ## PROTEIN SUPPORTED gi|154175107|ref|YP_001408238.1| ribosomal protein L22 33 21 Op 3 . + CDS 42993 - 43646 613 ## COG0336 tRNA-(guanine-N1)-methyltransferase + Prom 43702 - 43761 3.1 34 22 Tu 1 . + CDS 43786 - 45123 1143 ## PRU_1441 hypothetical protein + Term 45180 - 45208 -0.9 Predicted protein(s) >gi|283510582|gb|ACQH01000037.1| GENE 1 479 - 709 96 76 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKYLSCIGITKTTNTYQFIERLKVRTVLILKFALETLRKVKIGGQTMPDGSKKKPLPDR CKANLALTAPHRATVN >gi|283510582|gb|ACQH01000037.1| GENE 2 730 - 2763 2267 677 aa, chain - ## HITS:1 COG:MA2001 KEGG:ns NR:ns ## COG: MA2001 COG3590 # Protein_GI_number: 20090849 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted metalloendopeptidase # Organism: Methanosarcina acetivorans str.C2A # 31 677 19 665 665 469 37.0 1e-132 MNMKQFFPALLMMASPTLGMAQGQSGLVMSDLDKSVRAADDFYQFATGGWQKNNPLPAAY SRFGSFDRLQEDNNKRINNILGELLKKKFKPGSTEQKLSDYYKLAMDSTRREKEGLAPVA ALIAEVEGAKTKANLEQFQLKYAVYGYGVGFGSYFAADEKNVKMNILTIGQGGLTLGQKD YYVNNDPATVAIREAFRKHIARMFRLYGFNEAQADARRDAIFRMETALALVSKSVTELRD PQANYNKMTLKEFKENYPNINLEAMANAEGVKSAYIQEMIVGQPGFMEGYDKLYAAATAD DLRALIEWDIIQSSASYLTNEIRLANFDFFGKTMSGRKEEYPLWKRATNQVEEQMGEALG 
KMYVARYFPETSKKMMEQLVRNLQISLGQRIDAQTWMSDTTKAAAHNKLNKFYVKIGYPN KWIDYSKLTIDPTKSYYENVLACRKFSHDRHILTRAGKPVDRDEWLMTPQTVNAYYNPTT NEICFPAGILQRPFFDPKADEAYNYGAIGVVIGHEMTHGFDDQGRQYDANGYMNDWWTAS DAKGFEERAKLYADFFSNIKVLPDLNANGQFTLGENLADHGGLQVSFFAFKNVEAKKHLK TIDGFTPDQRFFLAYAAVWGQNITEQEIRNRTKADPHALGKWRVNGALPHIDAWYEAFGV KEGDKLFIPKNQRLSLW >gi|283510582|gb|ACQH01000037.1| GENE 3 2852 - 4813 1970 653 aa, chain - ## HITS:1 COG:all4183 KEGG:ns NR:ns ## COG: all4183 COG0488 # Protein_GI_number: 17231675 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Nostoc sp. PCC 7120 # 1 537 1 536 564 394 40.0 1e-109 MISVEGLTVEFSVKPLFKDVSFVVNNRDRIALVGKNGAGKSTMLKILCGKQKPTAGNVSV PSGTTVGYLPQVMVLEDDTTVKEEARKAFADNTELKARVEQLNNELSERTDYESEDYMAL VERFTHEHERYLMLGGNNYEAELERTLIGLGFNRNDLDRPTSEFSGGWRMRIELAKILLR KPDVLLLDEPTNHLDIDSIQWLEQFLAQSSGSVVLVSHDRAFINNVTNRTLEISCGKVID YRVKYDEYVTLRAERREQQLRAYENQQKEIADMKDFIEKFRYKPTKAVQVQSRIKQLAKI VPIEVDEVDTSALRLKFPPCLRSGDYPVICDDVRKDYGEHTVFDHVTFTIKRGEKVAFVG RNGEGKSTLVKCILNEIPYTGELKIGHNVQIGYFAQNQAQLLDESLTVFETIDNVAKGDI RLKIRDILGAFMFGGEASDKKVKVLSGGERSRLAMIKLLLEPVNLLILDEPTNHLDMQSK DVLKEAIKAFDGTAIVVSHDREFLDGLVSKVYEFGEGRVKEHLGGIYDFIRSAAANQLDN AMPANIDSLLHTSATAPIKTDESPKTDNESYAQRKERMKKVNKAEKAVKECEGRIASMEK RIAELNELLSDVNNASDMTLVSEYTSVQRALDSENEKWMELSETLEELLRQVS >gi|283510582|gb|ACQH01000037.1| GENE 4 4686 - 4919 75 77 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVDLPAPFLPTKAMRSRLFTTKLTSLNNGFTLNSTVKPSTEIIYLLFTRTKLIILFESMD VCFRISLATTTFLLFAK >gi|283510582|gb|ACQH01000037.1| GENE 5 4953 - 5336 254 127 aa, chain - ## HITS:1 COG:no KEGG:PRU_1039 NR:ns ## KEGG: PRU_1039 # Name: not_defined # Def: putative 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase # Organism: P.ruminicola # Pathway: not_defined # 1 124 1 130 130 79 31.0 4e-14 MNQQTHYILLALGSNVAAELHIEQAKARLSAVFPQLRFSRSLITPAIGIVSPPFINCLAE GYCSVPLEEVIVALKDIEAQMGSVSEERKKGIVKIDIDLLQFDNTKRKADDWSRDYIQLL LNELSCK >gi|283510582|gb|ACQH01000037.1| 
GENE 6 5602 - 6516 947 304 aa, chain + ## HITS:1 COG:lin1178 KEGG:ns NR:ns ## COG: lin1178 COG1705 # Protein_GI_number: 16800247 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Muramidase (flagellum-specific) # Organism: Listeria innocua # 23 180 51 219 289 90 39.0 5e-18 MKKIVVLLTTIALVLPVSAQLRWNSVYQTYISQYKDLAIEEMLKHRIPASITLAQGLLES GAGQGALVKKGNNHFGIKCHDWTGRKTYHDDDAKNECFRSYKSAKESYEDHSQFLKRDRY KSLFALAPNDYRGWAKGLKACGYATDPSYATKLINVIELYKLYLYDSAKDYDKFFAKHAG NYLPGSSNMRLHPIYKYNDNYYLRARRGDTFKSLAEELELSARSLAKANERSKNDVLAEG DIIYLKKKRKHAEKQFKDRPHVVKPGESMYDIAQKYGIRVKSLYKMNDLPEDYQIKVGAV LRVY >gi|283510582|gb|ACQH01000037.1| GENE 7 6730 - 8004 978 424 aa, chain + ## HITS:1 COG:no KEGG:PRU_1040 NR:ns ## KEGG: PRU_1040 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 424 1 410 410 531 60.0 1e-149 MKTLLLSILALASTLTAWSQRFDDEFDNRTLRIDYTFAGNVNKQEISVAKLNVMPGWYGK RQRLAEIPVEGNGQITVRRKKDKQVIYRNSFSTLFQEWLTYDEAKSSSKSFENVFLVPMP KDTVEITIDLRNNRRQVVASLTHTVAPKDILIRKIGETSVTPYETIQKAADPNHCIHIAY IAEGYTDAEMDVFLKDAKAATEALFAHEPFKSTRNRFNIVAVKSPSKESGTSEPSKGIWK NTALHSHFDTFYSDRYLTTLNLFELHDWLAGTPYEHIIVLVNTEKYGGGGILNSYNLSMT HHEKFKPVVVHEFGHSFAGLGDEYAYEAESISMYPRDIEPWEENLTTLVNFKGKWENLVK KNTPIPTPATEKNKKTVGAYEGAGYSLKGVYRGYQDCRMRTNEFPEFCPVCTQALTRFID FYTK >gi|283510582|gb|ACQH01000037.1| GENE 8 8386 - 8505 78 39 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHAQVGSDTERTFVYNNQRERLWGMLLTTNQDYIMEKQK >gi|283510582|gb|ACQH01000037.1| GENE 9 9654 - 11306 1604 550 aa, chain - ## HITS:1 COG:MA3377 KEGG:ns NR:ns ## COG: MA3377 COG4690 # Protein_GI_number: 20092191 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidase # Organism: Methanosarcina acetivorans str.C2A # 20 499 2 537 574 162 27.0 1e-39 MKKLLLIAAFSLVGLGSYACTNFIVGKKASKDGSVFCTYNADDYGMFIGLCHFPAGKHAK GEMRKVFDWDSHKYLGEIPEAPITYNVIGNINEHQVCIGETTYGGREEMTDSTGLIDYGS LIYIALQRSKTAREAINVMTTLVNTYGYYSGGETFTICDPNEAWIMEMMGKGAGSKGAVW 
VAVRIPDNAICAHANQSRITKFMQYPKGDVMYSKDVISFARSKGWYTGKDKDFSWRDVYG EPTFGGRRFCDARVYSFFNRFADNFDRYLPWVLGKDPNAEDMPLWIIPNKKVSFEDVAAA MRDHYEGTPLALDSSNVGGGIWQMPYRPTPLYYTVDGKKYFNERPVSTQQTAFSFIAQLR SWLPRQIGGILWFGNDDGNMVPYTPIYCGNTVQPECYNTPGADALNFSDKNAYWVCNWVS NMVYPRYSLLFPPLKAMRDSLENSYLAQQPQVEKRAAELYNTDVAKAVKYLNDYSNEQAQ SMLKNWKELATFLIVKYNDMAVKPTEGRSFKRNKEGLGAKVQRPGYPEAFARKLVKETGD WYLVPEEKKK >gi|283510582|gb|ACQH01000037.1| GENE 10 11326 - 12978 1415 550 aa, chain - ## HITS:1 COG:TM1660 KEGG:ns NR:ns ## COG: TM1660 COG0739 # Protein_GI_number: 15644408 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Thermotoga maritima # 4 297 6 289 323 89 25.0 2e-17 MTNILLSIMLAANVLFGSPVNYPISLAGNFGEPRPNHFHGGVDVKTDGVEGKAIFSIGDG YVSHVSVGYDGFGNAVYVHHPEGYTSVYCHLKTFTPAIKAMVRKWQYVNKQSTGDIWFKP TDLPVAKGQLIAISGNSGASEAPHLHLELHETRSGDMLDPLDFIGQHVKDALAPMAHGFM AYPVSGEGVFNNNPGKSSHSFSSHNLPNKFTAWGKVGFGIWANDYMEITYNRYGVKKTEL LVDGKSVFKSNVNRIPYEKTVQVNAWGDHEHFLRYNVWYMHSFLYPGMGLQMLSADKNKG IVNFNQERDYHLEYVLTDFKGNTSRYTFTVTGQKRAFAPQRQASTLRSVYWNRPYALQTD GVQLLIRPNSVANAITLTPDISRKDDALSNACTFTNNAERLFRPARISLKLTKRVKDTSK LYISGRGFGTRYLGGTYEKGWVTANMRDLHLQYSIDYDDEPPIVSPVGQGNWNATKTIKL GLTDAKSGVDTYQGYIDGRFVLFEDVPLSPWVACKLAETPIKRTGKQRRLTFFATDNQGN KRRFETDILY >gi|283510582|gb|ACQH01000037.1| GENE 11 13086 - 13301 118 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928056|ref|ZP_06421903.1| ## NR: gi|288928056|ref|ZP_06421903.1| hypothetical protein HMPREF0670_00797 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 71 1 71 71 136 100.0 5e-31 MDERKLYAANTIVHGDVILRNGVVNIAKGAVQSYSPLDGEQANTIWLGGEIVLKEHDNGE LRAFHKGVLLE >gi|283510582|gb|ACQH01000037.1| GENE 12 13265 - 14149 767 294 aa, chain - ## HITS:1 COG:no KEGG:PRU_1066 NR:ns ## KEGG: PRU_1066 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 24 272 4 253 293 167 37.0 4e-40 MVNVVKRIIGVFMFALLLLTAQGCKPGVPSQYIQPGEMEDILYDYHLAGAMARNSMGDGK DEYAYRLAALKKHGVDQAKFDSSMVYYMQNTNMLHDIYQRLSKRLESEEMALGGAAANYD RLSSKGDTANVWNGDRSVVLTSQIPYNMYSFAIKADTSFHKGDRYLLEFDVQFIFQEGVR DAVAVMAMTLANDSVVAQTIHISSPSHFTLQLNDDERQGIKRINGYFLLNNEKAFGGMQT TLRLLPIYNIRLIKMRANAPEQPSNQPEPNKIDSAVTIDDPKQWMNESSTLPTR >gi|283510582|gb|ACQH01000037.1| GENE 13 14151 - 14807 613 218 aa, chain - ## HITS:1 COG:no KEGG:PRU_1067 NR:ns ## KEGG: PRU_1067 # Name: not_defined # Def: putative lipoprotein signal peptidase # Organism: P.ruminicola # Pathway: Protein export [PATH:pru03060] # 13 217 1 204 204 230 60.0 3e-59 MAKGKELLTDGRLAILLVIVILIIDQVIKIEVKTSMSLGEAIHVTDWFYIDFVENNGMAY GMTFINKLVLSILRLVAIVVIARYIWKVVKKGMRTRYVVFLSMILAGAMGNMVDSMFYGL IFNASTPFTVASFVPFGSGYADFLTGKVVDMFYFPLIVTTYPDWFPFKGGEQFVFFSPVF NFADASISVGVVCLFLFCRKEWETISFSFSRKKNVNEE >gi|283510582|gb|ACQH01000037.1| GENE 14 14896 - 15276 489 126 aa, chain - ## HITS:1 COG:no KEGG:PRU_1068 NR:ns ## KEGG: PRU_1068 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 126 1 126 126 167 80.0 1e-40 MANKTRYNDEELEEFRVIINEKLRLARRDYEAMMRTLMNADGNDVDDTSPTYKVLEEGSA TQSKEELIQLANRQQKFITGLEAALVRIANKTYGIDRITGELIPKQRLMAVPHATLSVES KNARKR >gi|283510582|gb|ACQH01000037.1| GENE 15 15293 - 18733 3935 1146 aa, chain - ## HITS:1 COG:CAC3038 KEGG:ns NR:ns ## COG: CAC3038 COG0060 # Protein_GI_number: 15896289 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Isoleucyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 18 1143 16 1033 1035 837 40.0 0 MANKFAEYKGLNLTQTNKDVLAEWEKNDIFHKTIDEKEGCPQFVFYEGPPSANGHPGIHH 
VLARSIKDTFNRYKTMKGFQVKRKAGWDTHGLPVELGVEKELGITKADIDNKESAKYISV ADYNKKCRENVMMYTAEWRELTEKMGYFVDLDNPYITYDNKYIETLWWLLKQLYTKDMLY KGYTIQPYSPAAGTGLSSHELNLPGCYRDVKDTTVTAQFEIKNPKPEWEQWGKAYFMAWT TTPWTLAANSALCVGPKIDYAAVQSFNPYTGEPITVVLAEARLNAYFNEAGKDVELSTYK KGDKVIPWKVIANYKGTELVDIAYHQLLPFISPMEDGAFRVILGDYVTTEDGTGIVHIAP NFGADDALVARKAGVPPIVLVDKKGAERPVVDLEGKYFKVEDLDADFVNKYVNLPEWNKY AGRYVKNAYDDTLTDADETLDVSICMDLKAENKAFKIEKHVHSYPHCWRTDKPVLYYPLD SWFIRSTAKKERMFELNKTINWKPESTGTGRFGNWLENLNDWNLSRSRFWGTPLPIWRDE DGREICIGSLEELYSEIEKSVAAGHMSSNPLKEQGFVPGNYAQDNYERIDLHRPYVDDIV LVSEAGKPMKREMDLIDVWFDSGAMPYAQVHYPFENAELIDKRIAFPADFINEGVDQTRG WFFTLHAIATMVFDSVAFKNVISTGLVLDAKGNKMSKHVGNVVNPFEMMEKYGADPVRMY MLTNSEPWDNLKFDAESVDEIRRKLFGTLYNTYSFFALYANVDGFDYSQADVPLNERPEI DRWILSELHSLVEGVEREMDDYDPTRAGRLIDTFVNDDLSNWYVRLNRKRFWGKEMSQDK LSAYQTLYTCLETVAKLLAPFAPFYADQLYLDLTRATNRGHLQSVHLADFPVADDNCIDR ELEERMEMAQKITSMVLALRRKVNIKVRQPLLQIMIPAVDQRQRTHIEAVKDLIMSEVNV KELNFVEDGKGMLVKKVKCNFRTMGKKFGKLMKGIAAQMETLEQEAIADLERNGNINLMV EGNEVTVEAADVEIISEDIPGWLVSNEGNLTVALEVELTDELVKEGMAREIINRVQNLRK ESGLEITDRIKVTLAPNAQTDAAVEAFGDYIKSQVLADDLLVQPNDGKEIAFEDFNLNIQ IDKIIH >gi|283510582|gb|ACQH01000037.1| GENE 16 18998 - 19492 573 164 aa, chain + ## HITS:1 COG:slr0328 KEGG:ns NR:ns ## COG: slr0328 COG0394 # Protein_GI_number: 16331232 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Synechocystis # 8 160 2 153 157 131 46.0 7e-31 MEDRKRTKLLFVCLGNICRSPAAEGVMKQVLLNKGMSDLFEVDSAGIGGWHVGELPDSRM RKCGAARGYNFNSRARQFDTDDFRKFDYIFVMDNDNKNMLSQKTNNERELAKVKMLVDYA ASHPKAKLIPDPYYGDEKDFDYALDLIEDATNTLADRLAKGGEL >gi|283510582|gb|ACQH01000037.1| GENE 17 19673 - 21841 2031 722 aa, chain + ## HITS:1 COG:SPBC887.14c KEGG:ns NR:ns ## COG: SPBC887.14c COG0507 # Protein_GI_number: 19113280 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Schizosaccharomyces pombe # 3 424 311 777 805 167 31.0 7e-41 
MKNDEMNLAWQFVKNTGASIFLTGKAGTGKTTFLRTLKDKLPKRMVVVAPTGIAAINANG VTIHSFFQLPFAPYVPNSTLKAEGIFRVSKEKQRILRTLDSLVIDEISMVRADLLDSVDM VLRRYRDRSKPFGGVQLLLIGDLQQLPPVVRDEEWNMLSRYYDSPYFFSSLALKQMGYIT IELKHVYRQNDLDFLSLLNKIRDNKADQDTFNKLNSRYIPNFVPPAGTDYIRLVTHNYQA QSINESKLAELPGTSRHYKASITGEFPEQSFPTEFVLELKEDAQVMFVKNDSTGKGRYYN GMLGKVVSTTARGVTVKGNETGELIDLLPEEWTNAKYVLNKTTHEIEEEIEGTFKQFPLR LAWAVTIHKSQGLTFEHAIIDAQHSFAHGQTYVALSRCKTLEGLVLSSPLSTSTVICDEK VHQFNIDASARQPNSEMLSQMQRTYEVNLIDELFDFRPIGIAFERITRFIDEHFARKYPR LLQEYRSARPILDEVASVAARFRLQYTQMLAAQQEQELSDEVLQRIRSGAEYFHKRLALL VEMHRKNKIETDNKALKKTLKERYTDVHETLDFKWNLLKYESREDVSFSVGDYLSVKAKL LLGAMDGDKPNTKKIKEVKPKVDTKLATFNLYRLGMSIDEIAKRRELTTRTIHLHLVHYV REGRLSASEFVDDKKRQEITRFLIKHHEEKKSLTEIKEEVSPNITYEDIQFVMADIVGRT ND >gi|283510582|gb|ACQH01000037.1| GENE 18 21917 - 22108 178 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNLEILVNLGKLETLEELEILEELETLEELETLEELETLEELEILEELEILGELEILEKL EIL >gi|283510582|gb|ACQH01000037.1| GENE 19 22234 - 23247 710 337 aa, chain - ## HITS:1 COG:alr4032 KEGG:ns NR:ns ## COG: alr4032 COG0609 # Protein_GI_number: 17231524 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Nostoc sp. PCC 7120 # 3 335 20 354 362 211 40.0 2e-54 MRLTSLKIFVLVLFILILFGLNLLIGSVDVPFKYVMGIIVGNKTENELWNYIVIENRLPQ ALTALCCGGALSASGLMLQTAFRNPLAGPGIFGINSGAALAVAVVMLFLGGSVSIETVNL SGFTAILFASFAGAMAVTFIILAFSTRVRNNVSLLIIGIMIGYLASSVISLLNFFATDQG VKSYLIWGLGSFGGVSLGNIPFFCTMVLGGLVCSLVLVKPLNALTLGEQYAQSLGVNVAR TRFLLLAITGLLTAVTTAFCGPIAFIGLAVPHLARLLTGTEDHRTLLPATILLGCGVALL CSLLCYLPPNGSIIPINAITPLIGAPVIVYIMMKRRG >gi|283510582|gb|ACQH01000037.1| GENE 20 23255 - 24382 740 375 aa, chain - ## HITS:1 COG:alr4031 KEGG:ns NR:ns ## COG: alr4031 COG0614 # Protein_GI_number: 17231523 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Nostoc sp. 
PCC 7120 # 30 371 77 420 426 184 32.0 3e-46 MKKVLLLLFAALLIVACGGKQAKEQRSEGDTLHFKYAQNIVIIKQEHAVVVEIKNPWKKD ALLHRYVLPNDTNFSPSKVADGVPTTVIPKVKHRAIVCPSTHVALLEMLGAEQQLAGVCD LKYMISPRVQLLAKTKKIADCGESMNPDIEKMMETRADLVLVSPFENSGGYGKLDKTGIA IIECADYMETSPLGRAEWMRFYGLLFGCEQKADSLFALVEKNYMELSRTAKKERERPTIL TEKLTGSVWYVPGGHSTMAKMIADAGAIYPFADNELSGSVALPLEKVLEKARTADIWLIT YNAPQPLTYAQLQAENKGYAMIKAFTARNIYGCNAAIKPLFDEAPFRPDGLLRELISIAH PSAFPSEWRYYTPLN >gi|283510582|gb|ACQH01000037.1| GENE 21 25387 - 26532 792 381 aa, chain - ## HITS:1 COG:no KEGG:PRU_1304 NR:ns ## KEGG: PRU_1304 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 6 379 8 386 387 162 31.0 2e-38 MNISFQKNGAVLSQALCSILFCLYSFVFTCFYQANELALMIRRLTDGIQLYLNRWVVGLL LTGLLLLLQRGLLRLMGGSGHRLYVLSFVPSFLVLMFLSSFTGIAPFAAFLPKLGTSLVL VVLAVGLFVWRQRRNVDTETEYNPMFFLPQSVWFMALLMFAAAYGGAVPAKLHQQLVMEM LIKSGNSEKALEVGKRDTDADSALTSLRVLALANEGLLGEQLFDFPLRGKSAAMMPNGKS VKWFTLNAKTLDTLFVVPTSHDEYITLNGLEKWAEEHKLTKPAYDYLLCGYLLDRRIDVF AANVFRYYPNNNELPKHYREALTLYMHQRTNPVVVYSNNLMETDYQDFQDLENKYKNRTE RINALRSVYGNTYWYYYNYGE >gi|283510582|gb|ACQH01000037.1| GENE 22 26533 - 28617 1888 694 aa, chain - ## HITS:1 COG:YEL011w KEGG:ns NR:ns ## COG: YEL011w COG0296 # Protein_GI_number: 6320826 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Saccharomyces cerevisiae # 30 691 10 700 704 541 44.0 1e-153 MSAKDANPTKKVKKPSKTTTAPTPNASRMGLVKNDAYLEPYEDAIRGRHEHYLWKMNQLS GNGRKTLNDFANGHQYYGLHKLSKGWVFREWAPNATEIYLVGDFNDWKETERYRAKRIEG TGNWELRLSEKAVSHGDLYKMHVYWNGGNGERIPAWARRVVQDEQTHIFSAQVWQPEHAY EWSKKKFKPTTSPLLIYECHIGMGQDAEKVGSYTEFKELVLPRIIEDGYNCIQIMAIQEH PYYGSFGYHVSSFFAASSRFGTPEELKSLIDEAHKHGVAVIMDIVHSHAVKNEVEGLGNL AGDPNQFFYPGERHEHPAWDSLCFDYGKDEVIHFLLSNCKYWLEEFHFDGFRFDGVTSML YFSHGLGEAFCNYGDYFNGHQDDNAICYLTLANKLIHEVNPKAITIAEEVSGMPGLAARI DDGGYGFDYRMAMNIPDFWIKTIKELSDENWKPSSIFWEVKNRRSDELTISYCESHDQAL VGDKTIIFRLIDADMYWHFKKGDENDVAHRGIALHKMIRLVTAAAINGGYLNFMGNEFGH 
PEWIDFPREGNGWSYKYARRQWNLVDNKELDYHYLGDFDREMLKVIKSEKDFIKTPVQEI WHNDGDQILAFMRGDLIFVFNFNPTTSFTDYGFLVPTGAYNVVLNTDASAFGGNNLADDT VTHITNYDPLYVAERKEWLKLYIPARSAVVLRKN >gi|283510582|gb|ACQH01000037.1| GENE 23 28862 - 29311 310 149 aa, chain - ## HITS:1 COG:CAC0836 KEGG:ns NR:ns ## COG: CAC0836 COG2731 # Protein_GI_number: 15894123 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Clostridium acetobutylicum # 1 147 1 150 152 70 27.0 1e-12 MVLDVLDNFGKYAALNPYFSLVDAFMRTHQLEELALGTYNIKGEDVVLKIEQAMGKSRTE AILESHCQMIDIQIPLQQEETFGLAATVSLPAADYSAEKDISFYPNSPVQNYVTCPRGGF VIFFPQDAHAPCISEDASFTKAIFKVRCV >gi|283510582|gb|ACQH01000037.1| GENE 24 29366 - 30391 864 341 aa, chain - ## HITS:1 COG:slr1485 KEGG:ns NR:ns ## COG: slr1485 COG4642 # Protein_GI_number: 16329198 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Synechocystis # 2 309 33 340 349 142 31.0 7e-34 MDGGLYKGEMAGGKPHGKGNTLYSNGETYEGEYFKGRRQGYGVRTFADGKRYEGQWFQDQ QHGNGTCYFANNNKYVGLWFRGHQQGHGVMYYYNGDKYEGNWAHDKRQGKGRYTFSTGAF YDGNWNDDKKSGLGFFDWGDGSNYNGMWVDNQRSGKGTNRYADGDVYVGNWTNDIQNGKG IYKFQNGDVYEGDYVQGDRTGQGIFKYANGDRYTGRFLEGDKDGKGTFVWANGDSYVGDW KKDRQNGHGKLTKKNGDVFEGNFINGRVEGEVIIHYADGSRFKGTYKDGKRNGKAIEEDK SGNRFEGTYVRGIRDGRFVEKDRNGQIKAKGHYENGKRFND >gi|283510582|gb|ACQH01000037.1| GENE 25 30925 - 31803 1004 292 aa, chain + ## HITS:1 COG:YPO3861 KEGG:ns NR:ns ## COG: YPO3861 COG1209 # Protein_GI_number: 16123996 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Yersinia pestis # 1 285 1 283 293 402 67.0 1e-112 MKGIVLAGGSGTRLYPITKGVSKQLIPIFDKPMIYYPISALMLAGIREILIISTPHDLPG FERLLGDGKDLGVRFEYAEQPSPDGLAQAFIIGEKFIGNDTVCLVLGDNIFYGAGFTGML RQSVEAANNGKATVFGYYVNDPERYGVAEFDDKGNCLSIEEKPKHPKSNYAVVGLYFYPN AVVEVAKNIKPSARGELEITTVNQEFLSRGALKVQTLQRGFAWLDTGTHDSLSEASTFIE VIEKRQGQKVACLEEIAMDNKWIDKEQVKALAQPMIKNEYGQYLMRIAQMAK >gi|283510582|gb|ACQH01000037.1| GENE 26 31883 - 34387 2362 834 aa, 
chain + ## HITS:1 COG:alr2856_2 KEGG:ns NR:ns ## COG: alr2856_2 COG0489 # Protein_GI_number: 17230348 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Nostoc sp. PCC 7120 # 543 793 9 252 275 113 29.0 2e-24 MEENIKNNTLDNEQAPAGKSLFNFQTVYMTLILNWKWFVLSLIICLGLASIYLRYTIPIY QTYAKLLIKEENNSRGRNSLQYTTNLGMVSNSTGIDNEMEILKSSSIAIQAVKDLKLYTT YMSAGKVTNRLMYKTQPISVDIDPIHLDILTHPISLTITREGKKYHVEGTYYSTNPDAPG KAYAIDKSFTALPAAIGTQAGILTFIPNSTTPMKDGEKVLVRISPPKAVAIGYAAGLSIA QSSKSTSIAVLTINDQSPERAIDYLKQLAICYNRQANEDKNEVAVRTEEFINGRLEKINT ELGSTESQLESYKKANKMVELQMSANQALSNSDQYDQKLAEANTQIALLNSINDYMNQPS NKYETLPANIGISDQSAVSLINKYNEIVLERNRLLRSASENSPTVTPLTSQLDDLSSSIR RAMAQAKRSMDIQRNAVASQYGKYNSMIQQTPEQEKTMKQIGRQLEVKAGLYLMLLQKRE ENSISLAATADKGKLIDDPTVVGKVAPRSSTIYAGALAAGLAIPSVVFFLLSFFRYKIEG HDDVARLTRLPIIADVAVASETAKTKADIVVHENQNNQMEEIFRSMRTNVQFMLKENEKV ISFTSSISGEGKTFIAANLAVSFALLGKKVILVGLDIRKPRLAELFEIDDHRHGITNLLT KTVVTAADIQAQTLPSGVNDNLELLMAGPIPPNPAELLTRTSLDDIMDLLKRTYDYVIID TAPVGLVTDTLQVGRVSNLTVYVCRADYTPKENFELINTLHVENKLPNICVVVNGIDLSK KKYGYYYGYGKYGRYAKYGYYGKHGKYGRYNNYGNYGNYSNSHYSNENDTSVKQ >gi|283510582|gb|ACQH01000037.1| GENE 27 34453 - 34884 398 143 aa, chain + ## HITS:1 COG:no KEGG:PRU_1780 NR:ns ## KEGG: PRU_1780 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 143 1 144 144 202 69.0 3e-51 MTIAVDFDGTIVEHRYPEIGEEIPFATETLKMLIKDRHRLILWSVREGELLDEAVEWCRQ RGVEFYAVNRDYPEEEKEKNNHFSRKLKADLFIDDRNVGGLPDWGEIYRMVTENVTYKQI VRNAIRAHNNPESYKKKKHWWQF >gi|283510582|gb|ACQH01000037.1| GENE 28 35087 - 35731 584 214 aa, chain - ## HITS:1 COG:TP0554 KEGG:ns NR:ns ## COG: TP0554 COG0546 # Protein_GI_number: 15639543 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Treponema pallidum # 9 213 9 217 222 131 37.0 8e-31 MIKDYDTYIFDLDGTLLNTLTDLAASCNYALQQHGMRQRTIDEVRQFVGNGVKKLMERAI PDGLANAQFEEVYQCFRQHYLEHGLDHTSPYPGVLQMLAELKRRKKNVAVVSNKFYKATQ 
ELCKHFFDQYVEVAIGEREGISKKPAPDTVLEALVQLGVTADNAVYIGDSDVDIATAKNC NLPCISVLWGFRDKDFLLRSGATTLITAPEELLV >gi|283510582|gb|ACQH01000037.1| GENE 29 35921 - 36970 995 349 aa, chain + ## HITS:1 COG:aq_775 KEGG:ns NR:ns ## COG: aq_775 COG1559 # Protein_GI_number: 15606156 # Func_class: R General function prediction only # Function: Predicted periplasmic solute-binding protein # Organism: Aquifex aeolicus # 72 335 63 319 326 114 30.0 2e-25 MKHSCVRNYLIPAGICLLVILGILYYYFFSAMLPSGSATKFVYIDNDDNIDSVYAKLTPN CSPHALSGFRTLTRHSSYGEHIKPGRYAIKAGQGAFVVFRHLKNGMQEPVSLTVPSVRTL DKLSAEVCKRLMMDSTQLLNALRDPKICAHYGYDTATIQCMFIPNTYDVYWNISTEKLLD RMQKESKNFWNVDRTVKANELKLTPVQVITLASIVDEETANNAEKPMVAGMYYNRLMLRN AEYPNGMPLQADPTIKYAWKQFGLKRIYNKLLNIDSPYNTYRNAGLPPGPIRIPSIEGIE AVLNLKKHGYLYMCAKEDFSGTHNFAKTYAEHLANAAKYTKALNQRGIK >gi|283510582|gb|ACQH01000037.1| GENE 30 38966 - 40681 1956 571 aa, chain + ## HITS:1 COG:RSc0791 KEGG:ns NR:ns ## COG: RSc0791 COG0008 # Protein_GI_number: 17545510 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Ralstonia solanacearum # 16 568 16 580 580 588 50.0 1e-168 MTERIENSNEEKKNLSFVEQIIEQDLAEGKNGGRLQTRFPPEPNGYLHIGHAKAICMDFG AAEKFNGICNLRFDDTNPSKEKDEYVENILSDIKWLGFKWENIYYASEYFQKLWDFAVWM IKKGHAYIDEQTSEQIAEQKGTPTKAGTASPYRDRPVEESLALFEKMNLPDTPEGSMVLR AKLDMANPNMHFRDPIMYRIIHTPHHRTGTLWNAYPMYDFAHGQSDYFEGVTHSICTLEF VPHRPLYDKFIDFLKEMNGEMDNINDFRPRQIEFNRLNLTYTVMSKRKLHTLVDEGLVSG WDDPRMPTLCGMRRRGYSPESVRMFINSIGYTKFDALNDMALLEASVREDLNKRACRVTA VVNPVKLVITNYPEGQTEEMETVNNPENAADGTHTITFSRNLWMEREDFMEDAPKKFFRM TPGQEVRLKSAYIVKCTGCTKDADGNIVEVQAEYDPQSKSGMEGANRKVKGTLHWLSADH CTQAEVRVYDRLFDVENPSTDERDFRELLNKDSKKVFTNCYVENYAANKKPGEYLQFQRI GYFMADTDSSAEHPVFNKTVGLKDTWAKQNK >gi|283510582|gb|ACQH01000037.1| GENE 31 40825 - 42237 1310 470 aa, chain + ## HITS:1 COG:MJ1345 KEGG:ns NR:ns ## COG: MJ1345 COG0457 # Protein_GI_number: 15669536 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Methanococcus jannaschii # 
81 334 26 276 314 65 24.0 2e-10 MTNDSYFNSEEFQQTLHSYEEAEQEGRTAYFDSDTLADIAEYYYTDGRTNDAVKAALKGV DMFPGAMPPLVFMGRYELLMQNPQKAKWYLEQVDDQSELECVYLRVEIMLVENKNDEADE FLRQQLEWQDDEERDDFILDVADIYLDYDLIDHAQAWLALAENTQTADYQEVQGRIALGK GEYRQGEEIFKKLVEENPFASPYWNQLASSQFLSNRIKDSIESSEYSIAINPNDEEALLN KANGMFSLGEYEEALTYYTRFNNLHKDDETGEIFIGITLINLDRPAEALQHLQKAEEMAV PKSVNLVEIYQETAFALSRLGRVDDALAYVDKAIATGNGDEDELTVFKGHIQLENGNPLA AQNFFLEAVRHSKSAPNVYFRIAISVYDNGYMQLAYRMFHTLFQSVPADWKDGYSHMSLC CKCLNKESEYLHFLRKACELNPMEAKTILGEYFPKDLDPESYYNYITKQK >gi|283510582|gb|ACQH01000037.1| GENE 32 42248 - 42928 314 226 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|154175107|ref|YP_001408238.1| ribosomal protein L22 [Campylobacter curvus 525.92] # 1 207 5 198 199 125 31 4e-28 MSWLLPLLNNLNYWTVLLLMFIEGSVIPAPSELIVPPAAYRAAAGELNIVLVVVVATIGA VLGSSANYAAAYYLGRPVIYRFANSKLGHLCLLNQEKIEKGEKYFYDHGVIATLTGRLLP GIRQIISIPAGLSKMKFWKFILYTTIGAGLWNTVLATLGWYLHSFVPKEELYNKIEEYNS QIQLVVIIAVGVAAIVGAIWWQMRKRKKRKANASTVTEGTPPNEGN >gi|283510582|gb|ACQH01000037.1| GENE 33 42993 - 43646 613 217 aa, chain + ## HITS:1 COG:SA1083 KEGG:ns NR:ns ## COG: SA1083 COG0336 # Protein_GI_number: 15926823 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-(guanine-N1)-methyltransferase # Organism: Staphylococcus aureus N315 # 1 214 12 224 245 217 50.0 1e-56 MIEGFVHESILARAQKKELAEIHLHNLRDYSTDKWRRVDDYPYGGFAGMVMQCEPIDRCI SALKAQREYDEVIFTSPDGQQFDQHVANELSLKGNIIILCGHYKGVDQRVRDHLITREIS IGDYVLTGGELAAAVMADAIVRLIPGVISDEQSAFSDCFQDDLLSAPIYTRPANYKGWEV PPILLSGNEAKIKTWEMEQALERTRQLRPDLLERKDE >gi|283510582|gb|ACQH01000037.1| GENE 34 43786 - 45123 1143 445 aa, chain + ## HITS:1 COG:no KEGG:PRU_1441 NR:ns ## KEGG: PRU_1441 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 46 445 4 392 392 378 48.0 1e-103 MKHVLSTLILSTLAVVTLQAQQTRPTRRTDRADYATALLKSYEDSLALLYNKVYRQNDKS ALPNDFVENVVPYQSYKLFSPLTFYEGIAKDFFDLSGEGTTTKARNQRADYLGPRFTDAA LLGIYLKRPDLVLNRETVLHNVAPEGKTTPVVIKSDAGLAEKQPQVPTDPESGPINVVVT 
KPNFWTFKGDYYLQFLQNYVSGNWHKGGESNYSMVSAVTLQCNYNNKQKVKFDNKLELKL GFITSPSDTLHSLKTSEDLIRFTSKLGLQATKRWYYTLQLVAYTQFLRGYKNNSGEPASD FMSPFNTNISLGMDYNVDWFKGRLKGSIHLAPLAYNFRYVDRLALGPRYSLKPDSHTLHD FGSEFTVDLMWKIAEPISWQTRLYGYTTYSRAEVEWENTFTFQFNKFISSKLFVYPRFDD GGTRDEKHGYLQLREFVSIGFSYSF Prediction of potential genes in microbial genomes Time: Sat May 28 00:49:03 2011 Seq name: gi|283510581|gb|ACQH01000038.1| Prevotella sp. oral taxon 317 str. F0108 cont2.38, whole genome shotgun sequence Length of sequence - 19659 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 5, operones - 3 average op.length - 4.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 54 - 1469 1042 ## COG0658 Predicted membrane metal-binding protein 2 1 Op 2 . + CDS 1496 - 2230 488 ## COG0101 Pseudouridylate synthase + Term 2334 - 2381 2.0 + Prom 2254 - 2313 3.3 3 2 Tu 1 . + CDS 2451 - 3374 622 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + TRNA 3539 - 3615 68.5 # His GTG 0 0 - Term 5134 - 5179 3.1 4 3 Op 1 . - CDS 5217 - 7835 2028 ## PRU_2417 hypothetical protein 5 3 Op 2 . - CDS 7872 - 8678 498 ## ZPR_4163 hypothetical protein 6 3 Op 3 . - CDS 8687 - 9502 516 ## Dfer_1653 hypothetical protein - Prom 9538 - 9597 3.3 - Term 10015 - 10054 9.1 7 4 Op 1 . - CDS 10083 - 10853 807 ## COG1043 Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase 8 4 Op 2 . - CDS 10872 - 12257 1484 ## COG0774 UDP-3-O-acyl-N-acetylglucosamine deacetylase 9 4 Op 3 . - CDS 12270 - 13301 939 ## COG1044 UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase 10 4 Op 4 . - CDS 13348 - 14568 1027 ## COG1078 HD superfamily phosphohydrolases 11 4 Op 5 . - CDS 14596 - 15429 952 ## COG0284 Orotidine-5'-phosphate decarboxylase 12 4 Op 6 . - CDS 15453 - 16565 1349 ## COG0216 Protein chain release factor A 13 4 Op 7 . - CDS 16573 - 17532 472 ## COG2958 Uncharacterized protein conserved in bacteria 14 4 Op 8 . 
- CDS 17537 - 18700 1429 ## COG0150 Phosphoribosylaminoimidazole (AIR) synthetase 15 5 Tu 1 . - CDS 18806 - 19555 833 ## COG0169 Shikimate 5-dehydrogenase - Prom 19594 - 19653 3.5 Predicted protein(s) >gi|283510581|gb|ACQH01000038.1| GENE 1 54 - 1469 1042 471 aa, chain + ## HITS:1 COG:BS_comEC_1 KEGG:ns NR:ns ## COG: BS_comEC_1 COG0658 # Protein_GI_number: 16079611 # Func_class: R General function prediction only # Function: Predicted membrane metal-binding protein # Organism: Bacillus subtilis # 141 386 169 396 469 86 29.0 1e-16 MAILVSTVFLGAWLMSLQINDRQSIPLNQAITYKAIVTDTPTAHGKVLKCNLLVINLAEE TDGHTPFLVRASILKDTLTNRWQRLTTGSTIAVQSVFNNESALFSSSKFNFARWMQAHQI YATTFVYFADWQNTQLTNDEIHKISNVDKIRLHLLRVRQSLLDSAWQQRLSTDNQALVAS IALGDKSNLTPTQRESYNKAGVSHILALSGLHLGIIYSVLSALFTTILYRVVRRDWAEFI AQTIITITLWGYVFLVGLPPGAVRSALMLTLYAFVSLLHRDRFSANSLAFACIVMLIANP STLWDVSFQLSFIAVLSIIVLYPPLFSLYQPTTRWANLLRPVWAIVCVSVAAQVGTMPII AYYFGRFSCYFLLSNLLILPLITLLLYATLASFIVQGAPLLHNLVLDAMQWLATAVQTLV NGIAALPHSTLDGLQLNLIQTYLAYIFIAFLYAIGLFALKWFRIRRLIQQL >gi|283510581|gb|ACQH01000038.1| GENE 2 1496 - 2230 488 244 aa, chain + ## HITS:1 COG:BS_truA KEGG:ns NR:ns ## COG: BS_truA COG0101 # Protein_GI_number: 16077216 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Bacillus subtilis # 10 244 8 245 247 169 38.0 3e-42 MKQRYFIYFSYDGTSYHGWQIQPNANSVQAEMQRAMSLLLRHPVELVGAGRTDTGVHART MVAHFDVDTPIDAVQLTYKLNRVLPFDITINAIKAVDADMHARFSATKRTYHYYIHLDRD PFCRAYSYALHQAPDFQLMNEAAEILLSHKDFGAFCKSHSDAKTTLCEVYEARWIKLSDT KWHLRISANRFLRNMVRAIVGTLLDVGWHKITLEQFVDILHSKSRQQAGESVPGHALFLE DVQY >gi|283510581|gb|ACQH01000038.1| GENE 3 2451 - 3374 622 307 aa, chain + ## HITS:1 COG:CAC1984 KEGG:ns NR:ns ## COG: CAC1984 COG0697 # Protein_GI_number: 15895255 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Clostridium acetobutylicum # 3 
303 4 283 285 112 27.0 8e-25 MLWLLLAFLSASLLGFYDVFKKLSLKGNAIIPVLFLNTFFSSLIFIPFILGSRFGWLSPV NMFYVSATNLSTHGYIFFKSMLVLSSWIFGYVGMKHLPLTIVGPINATRPILVLVGAMLV YGEQLNLLQWLGVALAICSFFLLSRSGKREGIDFKHNKWIYAIALAAILGACSGLYDKYL MTPTAQVGLGLDKMVVQSWYNLYQCGMMGIVLLVLWYPKRKTTQPFQWRWAIMLISLFLC AADFAYYYALSQPQAMISIVSMVRRGSVLVSFMFGALVFQEKNLKRKAFDLFLVLLGMLC LYFGSKT >gi|283510581|gb|ACQH01000038.1| GENE 4 5217 - 7835 2028 872 aa, chain - ## HITS:1 COG:no KEGG:PRU_2417 NR:ns ## KEGG: PRU_2417 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 24 870 7 848 850 233 24.0 2e-59 MNCRIFPFFRWRKFTFNAKPWRYLLLLAAFSVLPALGQEGVVRGLVVETGSETPVRGAVV TIRNAKNIIIYNTQTSAKGEFSFMVRREKVAGAMLVVSSMTHRTAQCALTNQAFYRMEME PKAFEIKDVYVKPNKITHRNDTTSYLVSAFATQKDRTIGDVLRNLPGINVSKEGTVSYNG KVIDQFLIEGVDLFDGQYNIATRNISHDVISRVDVVENFQSKKTMQKSKAEGLTALNLAL KDKAKGRWSGNVRVASGVPKLWEAEGFGARLSAVNQTSLTAKSNNTGKDIMSENKVLTID ELLKQTSTSSPNSFMNTKLGRPSALDESRTRFGCTHIANVGNVQKVSESAIVRTKLYYSD ERNTSKRQHGLSYFLADTTLTRATTECGILNNRELSAAVMFKSDRERSFFSNEIKYSSAW QRNRTITQGDDNNQSETRSDLHAIENQLQWIRAVGKHHIKFTSVNLYRYMPEQLIVVADS TRQQDVTRCQFLSSTKLNFTFNTRCWALEMDAEGRIQAMNMESNYRTAPYDTAFYENANI HFMAFIATPTITYKHRGLRGTLKMPLNACRYFGTAVANKLFFSPILSLSWELNSRCKLGA SVSMNSFEPSVNDSHAMPMLVDYRTFQKSPINFYGLHNWQEMAFVSYANYMHMLFVRASV AFNTHHNNISRTKQVDRRGMVYYSTVENDNHSHGIMAIATVSKRLEWLKGTFNTQCVLSQ NTNSMYQNGINTEFKSGSLQTSVSLNSNVWAWLDVAYRAQYNVNSLEISAMKTRTKLFTQ ELELTLMPTDALNFSFSTEHYANFFNDGTRKQTVFADFNCLYKYRKTDFTLHINNMLNQR YYNNTTYNNLSSTFNQYVLRGRTLLVGVKVYF >gi|283510581|gb|ACQH01000038.1| GENE 5 7872 - 8678 498 268 aa, chain - ## HITS:1 COG:no KEGG:ZPR_4163 NR:ns ## KEGG: ZPR_4163 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 23 267 25 259 263 112 30.0 2e-23 MKTIVSIMLFLICALHGNAADNKATLECIYRLIYQVDTVTKRSSNAMMVLRRNETKSLFY SQSNFEKDSMLLEASGPEERKAINDSIKARYGKVSVYYYVLEDFDKKELEFVESLPTNSR YTESLPNFSWEITEEQKTIDNYACQKAVCTFVGRTYEAWFAPDIPISDGPWKFYGLPGLI LEVYDTKHHYEFQFLGMRPCSGKIEIPNRDVTKTTKKEYLRLKQLAIDDPSAYLDILNAM 
VGIKCVSSTKPPKLRSYATMERVESLPK >gi|283510581|gb|ACQH01000038.1| GENE 6 8687 - 9502 516 271 aa, chain - ## HITS:1 COG:no KEGG:Dfer_1653 NR:ns ## KEGG: Dfer_1653 # Name: not_defined # Def: hypothetical protein # Organism: D.fermentans # Pathway: not_defined # 17 270 2 252 254 99 26.0 1e-19 MKIRHYLMALATFALPTAAQEKSPATTEVSYLVSCINDTVEGKVLTDTFRLRFNDSKALF YRVDDFISDSTKLNDADAWARDVNKTLTRKKRNDKTGMCYYILTDLQGKTYVYKDRIGTN AFRYTDSLPNFNWKILPEQKTIAGHKCAKAVGKYMGRTYEAWYATDIPARLGPWKFNGLP GLVMAVYDTRRQYTFTFIDMQACNGDVALFPSRAFKTTKEKFLKEYAAYQSDPVTYFSTR SLIRISFGAPDGQLVQEINNDNRHRPMEILE >gi|283510581|gb|ACQH01000038.1| GENE 7 10083 - 10853 807 256 aa, chain - ## HITS:1 COG:ECs0183 KEGG:ns NR:ns ## COG: ECs0183 COG1043 # Protein_GI_number: 15829437 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase # Organism: Escherichia coli O157:H7 # 4 256 8 262 262 213 46.0 4e-55 MNQISPLAYVHPEAKLGKDNIIGPFCYIDRNTVIGDGNNLQNSVTINYGARIGNGNEILA GASISTKPQDLKFVGEDTLCEIGDNNSIRENVTISRGTASKGVTIVGSNNLLMENMHIAH DCVIGSNVIIGNSTKFAGEVIVDDFAIVSAAVLCHQFCRIGGYVMVQGGSRFSQDIPPYV IVGKDPVRFAGINLVGLRRRGFSNELIDLIHNAYRLLYSKGLMAEGIQEIRNNLQVTKEI QYIIDFVESSKRGIVR >gi|283510581|gb|ACQH01000038.1| GENE 8 10872 - 12257 1484 461 aa, chain - ## HITS:1 COG:PA4406 KEGG:ns NR:ns ## COG: PA4406 COG0774 # Protein_GI_number: 15599602 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-3-O-acyl-N-acetylglucosamine deacetylase # Organism: Pseudomonas aeruginosa # 1 306 1 283 303 172 37.0 1e-42 MLKQTTLKGSFSLYGKGLHTGLSLTVTFNPAPENTGYKIQRIDLEGQPIIDALAERVVDT QRGTVLANGDVRVSTVEHGLSALYAMGIDNCLIQVNGPEFPILDGSAIMYVEQIKRVGIE EQSVPKDYYIIRHKIEVKDEATGSCITILPDDQFSITAMCSFDSKFINSQFATLDSLDNF ETEIAPARTFVFVRDIEPLLKANLIKGGDMDNAIVIYEKIIDQEQLDQLADLLKVPRLDA SKQGYIQHKPLVWENECTRHKLLDIIGDMALLGKPIKGRIIATRPGHTINNMFARHLRKD IRKHEVQAPIYDPNEEPVMDNKRIRELLPHRYPMQLIDKIISLGPTSIVGVKNVTSNEPF FVGHFPQEPVMPGVLQVEAMAQCGGLLVLNQLEEPERWSTYFMKIDEVKFRQKVVPGDTL 
LFRVELLHPVRHGVSSMKGYMFVGDQVVSEATFTAQIVKNK >gi|283510581|gb|ACQH01000038.1| GENE 9 12270 - 13301 939 343 aa, chain - ## HITS:1 COG:PA3646 KEGG:ns NR:ns ## COG: PA3646 COG1044 # Protein_GI_number: 15598842 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase # Organism: Pseudomonas aeruginosa # 1 342 5 345 353 231 39.0 2e-60 MEFTAKQIAQVVDGEIVGDENAAVHSFAKIEEGRPGAISFLSNPKYTHYIYETEASVVLV NKDTKLEKPVKTTLIKVDNAYECVAKLLQMYEAAKPKKTGIDPLAFVSPDATVGENCYVG PFAYVGSGVVVGNGTQVYPHATLCDNVRVGNDCIIYPQVCLYHDVVVGNRVILHSGCVIG ADGFGFAPSANGYDKIPQIGTVTIEDDVEIGANTCVDRSTMGSTYIRKGVKLDNLVQIAH NTDIGENTVMSAQVGVAGSTKVGQWCMFGGQVGVSGHINIGNKVFLGAQSGVPGNLKDGQ QLIGTPPMELKPYFKSHAIFRRLPDMYKQLNELQKEIDELKSK >gi|283510581|gb|ACQH01000038.1| GENE 10 13348 - 14568 1027 406 aa, chain - ## HITS:1 COG:BS_ywfO KEGG:ns NR:ns ## COG: BS_ywfO COG1078 # Protein_GI_number: 16080812 # Func_class: R General function prediction only # Function: HD superfamily phosphohydrolases # Organism: Bacillus subtilis # 1 388 8 394 433 192 30.0 9e-49 MSEIKIINDPVFGFIKIPRGLILDVICHPLMQRLTRIRQLGMASVVYPGAQHTRFLHSIG AFHLVSEAVLSLQQKGVFIFDSEVEAVQIAILLHDIGHGPFSHVLENSLISGISHEDISL QMMEKMNREMGGALTLAIKIFKNEYSKRFLHQLISSQLDMDRLDYLRRDSFFTGVTEGNI GSARIIKMLDVENDALVVDQKGIYSIENYLISRRLMYWQVYLHKTAVGCEAILVNLLKRA KYLVRNGHSLFATPALAYFLANDVDRDFFTTHAEALQNYCSLDDNDIWSSVKVWMNDADP VLSILANDLVNRKLFKVEVFDEPIPEGVIEERLAAICTHTGLSTDDAAYLMSANTVNKNM YNMEDDHITIKCKDGTCKDISQASELFDVTRLSMKNSKYYLCYQRF >gi|283510581|gb|ACQH01000038.1| GENE 11 14596 - 15429 952 277 aa, chain - ## HITS:1 COG:RSc2773 KEGG:ns NR:ns ## COG: RSc2773 COG0284 # Protein_GI_number: 17547492 # Func_class: F Nucleotide transport and metabolism # Function: Orotidine-5'-phosphate decarboxylase # Organism: Ralstonia solanacearum # 4 266 22 283 288 191 40.0 1e-48 MTREQLIEQIHVKQSFLCVGLDTDIKKIPQHLLQADDPIFEFNKAIIDATAPYCVAYKPN LAFYESLGVKGLLAFERTVDYLNENYPEQFVIADAKRGDIGNTSQMYARTFFETYNLDAL 
TVAPYMGEDSVTPFLAYEGKWVILLALTSNKGSHDFQLTEDAQGVRLFEHVLTKSQAWGN ADNMMYVVGATQGEAFKDIRRHAPNHFLLVPGIGAQGGSLHDVCQYGMNKDCGLLVNSSR GIIYASNGADFAQVAAEKAREVQEEMARELKAILPTF >gi|283510581|gb|ACQH01000038.1| GENE 12 15453 - 16565 1349 370 aa, chain - ## HITS:1 COG:VC2179 KEGG:ns NR:ns ## COG: VC2179 COG0216 # Protein_GI_number: 15642178 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor A # Organism: Vibrio cholerae # 6 354 4 344 362 316 49.0 6e-86 MIEQNSILTKLDGLESRYEEVSTLITDPAVIADQERYVKLTREYKDLGDIMNARKRYVAC LNSIAEAKDILANESDAEMKEMAREELSANETERPKLEEEIKLLLVPKDPEDAKNVQMEI RAGTGGDEAALFAGDLFGMYKRFCDSKGWNLSVTASSEGAVGGFKEIDFAVSGDNVYGVL KYESGVHRVQRVPATETQGRMHTSAATVAVLPEADKFEVNINEGDIKWDTFRSSGAGGQN VNKVESGVRLRYPWKNPNTGETEEILIECTETRDQPKNKERALSRLRTFIYDREHQKYVD DIANRRKSLVSTGDRSAKIRTYNYPQGRVTDHRIGYTTHDLQGFIAGDIQAMIDALSVAE NTERMKEAEL >gi|283510581|gb|ACQH01000038.1| GENE 13 16573 - 17532 472 319 aa, chain - ## HITS:1 COG:Cj1602 KEGG:ns NR:ns ## COG: Cj1602 COG2958 # Protein_GI_number: 15792907 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Campylobacter jejuni # 29 268 36 278 318 133 40.0 5e-31 MMTIKDAVLLSLEDFPNGATSREVYDNIINKKLFEFSKVAKTPDATVGAQMGVMIKHGDV RIKRIKNDKNIFCYYLSKYSKNIESDDKIPAHTGVSRSSFNERDLHPLLCSFLNYNGIIA KTIFHEKSSKAEEHQKWVHPDIIGAKFVEQKNITSNALFKAISKKDSLLLYSYELKKKID NDYELKKCFFQAVSNSSWSNKGYLVAFEINEDLKDEMARLNNSFGIGFIHLKANPYESKI WFDAKERHLDFVTIDKLCEINRDFKEFIKLVEATLTADDKHFHPSKTALISICDRYPAND GDILKYCNEHHIPWGDEEV >gi|283510581|gb|ACQH01000038.1| GENE 14 17537 - 18700 1429 387 aa, chain - ## HITS:1 COG:MJ0203 KEGG:ns NR:ns ## COG: MJ0203 COG0150 # Protein_GI_number: 15668375 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazole (AIR) synthetase # Organism: Methanococcus jannaschii # 47 369 50 323 350 107 29.0 3e-23 MNNRYMMRGVSAAKEDVHNAIKNIDKGVFPQAFCKIIPDILGGDDAYCNIMHADGAGTKS SLAYAYWKETGDLNVWRGIAQDAVVMNTDDLLCVGAVDNILVSSTIGRNKMLVPGEVISA 
IINGTDELLAELREMGVGIYATGGETADVGDLVRTIIVDSTVTCRMKRSDVIDNANIRPG DVIVGLSSCGQATYEKTYNGGMGSNGLTSARHDVFAKYLAEKYPETFDHAVPNELVYSGT KRLTDAIEGLGVDAGQLVLSPTRTYAPVIRKVLDEMRNHVHGMVHCTGGAQTKVLHFVSD DCRVIKDNMFDVPPLFKLIQSESGTDWKEMYKVFNMGHRMEIYVRPEHAERIIAISKSFN IEAQVVGRVEEGKKSLTIRSEYGTFEY >gi|283510581|gb|ACQH01000038.1| GENE 15 18806 - 19555 833 249 aa, chain - ## HITS:1 COG:CAC0897_2 KEGG:ns NR:ns ## COG: CAC0897_2 COG0169 # Protein_GI_number: 15894184 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Clostridium acetobutylicum # 4 246 7 251 273 149 40.0 5e-36 MDKYGLIGFPLGHSFSISYFNEKFHNEGIDAEYENYEIASIDSLAEILDTTPNLRGLNVT IPYKEEVIKYLDDLSPEARSIGAVNVIRVSHKGSKPYLKGFNSDVIGFTRSIESMLEPGH QKALILGTGGAAKAIDYGLKSLGLETRFVSRTARKGNTITYDDVTPQVIAEYNVIVNCTP LGMYPQTDECPKLPYEAMDQHTLLYDLIYNPDETLFMKKGSERGAITKNGLEMLLLQAFA SWEFWHGRE Prediction of potential genes in microbial genomes Time: Sat May 28 00:49:28 2011 Seq name: gi|283510580|gb|ACQH01000039.1| Prevotella sp. oral taxon 317 str. F0108 cont2.39, whole genome shotgun sequence Length of sequence - 32295 bp Number of predicted genes - 26, with homology - 23 Number of transcription units - 14, operones - 7 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 302 - 1036 465 ## PROTEIN SUPPORTED gi|163754278|ref|ZP_02161401.1| 30S ribosomal protein S15 2 2 Op 1 . - CDS 1254 - 2204 1145 ## COG0152 Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase 3 2 Op 2 . - CDS 2231 - 3115 1087 ## COG1702 Phosphate starvation-inducible protein PhoH, predicted ATPase - Prom 3198 - 3257 5.9 + Prom 3161 - 3220 2.7 4 3 Op 1 . + CDS 3290 - 4093 628 ## PRU_0725 hypothetical protein 5 3 Op 2 . + CDS 4146 - 5054 897 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase 6 4 Tu 1 . + CDS 5220 - 7703 2551 ## COG5009 Membrane carboxypeptidase/penicillin-binding protein + Prom 7742 - 7801 3.5 7 5 Tu 1 . 
+ CDS 7963 - 9663 1704 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member + Prom 9666 - 9725 4.1 8 6 Op 1 . + CDS 9746 - 10297 256 ## gi|288928106|ref|ZP_06421953.1| hypothetical protein HMPREF0670_00847 9 6 Op 2 . + CDS 10332 - 10772 301 ## gi|288928107|ref|ZP_06421954.1| hypothetical protein HMPREF0670_00848 10 6 Op 3 . + CDS 10819 - 11616 192 ## gi|288928108|ref|ZP_06421955.1| hypothetical protein HMPREF0670_00849 + Prom 11726 - 11785 5.5 11 7 Tu 1 . + CDS 11914 - 12129 60 ## - Term 13160 - 13186 0.3 12 8 Tu 1 . - CDS 13218 - 13415 67 ## + Prom 13278 - 13337 6.4 13 9 Op 1 . + CDS 13375 - 17997 4605 ## PRU_0299 hypothetical protein 14 9 Op 2 . + CDS 17987 - 20293 2014 ## PRU_0300 OMP85 family outer membrane protein + Prom 20566 - 20625 2.3 15 10 Tu 1 . + CDS 20646 - 20900 322 ## Coch_0566 transglycosylase-associated protein + Term 20956 - 21003 10.1 - Term 20949 - 20987 4.8 16 11 Op 1 . - CDS 21010 - 21396 324 ## Coch_0565 hypothetical protein 17 11 Op 2 . - CDS 21432 - 21758 296 ## Coch_0152 hypothetical protein - Prom 21999 - 22058 8.0 18 12 Tu 1 . - CDS 23543 - 24325 773 ## COG0561 Predicted hydrolases of the HAD superfamily - Prom 24353 - 24412 2.2 - Term 24363 - 24415 17.9 19 13 Op 1 7/0.000 - CDS 24466 - 25734 1472 ## COG2871 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrF 20 13 Op 2 9/0.000 - CDS 25746 - 26372 678 ## COG2209 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrE 21 13 Op 3 9/0.000 - CDS 26382 - 27011 530 ## COG1347 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrD 22 13 Op 4 9/0.000 - CDS 27027 - 27749 902 ## COG2869 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrC 23 13 Op 5 7/0.000 - CDS 27754 - 28920 1209 ## COG1805 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrB 24 13 Op 6 . 
- CDS 28927 - 30276 1240 ## COG1726 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrA - Prom 30465 - 30524 7.8 + Prom 30224 - 30283 2.6 25 14 Op 1 . + CDS 30303 - 30509 97 ## + Prom 30511 - 30570 2.2 26 14 Op 2 . + CDS 30590 - 32294 1472 ## PRU_1251 hypothetical protein Predicted protein(s) >gi|283510580|gb|ACQH01000039.1| GENE 1 302 - 1036 465 244 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163754278|ref|ZP_02161401.1| 30S ribosomal protein S15 [Kordia algicida OT-1] # 23 244 1 221 221 183 42 1e-45 MYKQEEIKPYGEQGDKGELVEKMFDNIAPTYDVLNHRLSWDIDKRWRRKAIGQLVPFKPK QMLDIATGTGDFAILAAKMLKPSSLVGADISEGMMEIGRKKVEKAELGGVVSFMKEDCMN LSFPAERFDAVTAAFGIRNFKDLDKCLGELHRVLRKGGHLSIVELSAPKRFPMSWLFKIY SHTVLPLYARLVSKDKGAYRYLISTVEAFPQGETMVGILKKAGFEEVKFTRLTFGICTMY LATK >gi|283510580|gb|ACQH01000039.1| GENE 2 1254 - 2204 1145 316 aa, chain - ## HITS:1 COG:mll2603 KEGG:ns NR:ns ## COG: mll2603 COG0152 # Protein_GI_number: 13472340 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase # Organism: Mesorhizobium loti # 18 312 40 337 337 249 44.0 6e-66 MKALTQTNFHFAGQKSVYHGKVRDVYDIDGDKIVMVATDRISAFDVILPKGIPFKGQVLN QIAAKFLDLTADICPNWKLATPDPMVTVGLKCEGFRVEMIIRSILTGSAWREYKNGCREL CGVKLPGGMKENERFPEPIVTPTTKADEGHDLNISKEEIIAQGLVSAEDYAVMEDYTKKL FRRGQEIAAKHGLILVDTKYEFGKRDGKVYLIDEIHTPDSSRYFYAEGYEEKLAKGEPQR QLSKEFLRQWLIEHNFMNEPGQTMPEITDEYAEEVSNRYIELYEHITGETFNKAADEADL AARIEKNVTEYLAQGK >gi|283510580|gb|ACQH01000039.1| GENE 3 2231 - 3115 1087 294 aa, chain - ## HITS:1 COG:BS_phoH KEGG:ns NR:ns ## COG: BS_phoH COG1702 # Protein_GI_number: 16079588 # Func_class: T Signal transduction mechanisms # Function: Phosphate starvation-inducible protein PhoH, predicted ATPase # Organism: Bacillus subtilis # 9 282 38 316 319 263 51.0 4e-70 MIKSLFPKLRIVARDNVIRVLGDEEEMAVFEEKTEGIRKHILKYNAINEEDILDIIKGKR TKADAIKDVIVYNITGKPIKSRSENQQRLVDAYERHDLVFAVGPAGTGKTYLSIALAVKA LKEKTAKKIILSRPAVEAGEKLGFLPGDMKDKIDPYLQPLYDALEDMIPAVKLQDMMEKH 
VIQIAPLAFMRGRTLTDAIVILDEAQNTTMAQIKMFLTRMGTNTKMIVTGDLTQIDLPRE QKSGLRVALDILRNVKGIATVQFDQKDIVRHKLVTRIVNAYEAYYQTERNKEKE >gi|283510580|gb|ACQH01000039.1| GENE 4 3290 - 4093 628 267 aa, chain + ## HITS:1 COG:no KEGG:PRU_0725 NR:ns ## KEGG: PRU_0725 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 16 267 4 239 239 291 58.0 2e-77 MRLTKNQFLKAFAAAVVLLAVVRAVFPSIAKPADAKKKTETKGQDISKDAERNAHAAAQP VDGKLTKLKFFKPDGTPMKNRIMSVPNFSKAFPDLNDVQLEMAQKYGVKPVADRKEAENR KAELVYMASSPYYHVDKLNNSVPYLVPRAAVLLNDIGRNFFDSLQIKGIPLHKFIVTSVL RSEADVKKLRGHNGNASENSCHRFGTTFDIAYNRYKTVESPDGPARRKVRNDTLKWVMSE VLNDLRRNERCYIKYEVKQGCFHVTVR >gi|283510580|gb|ACQH01000039.1| GENE 5 4146 - 5054 897 302 aa, chain + ## HITS:1 COG:BH2366 KEGG:ns NR:ns ## COG: BH2366 COG0324 # Protein_GI_number: 15614929 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Bacillus halodurans # 2 294 3 305 314 194 34.0 2e-49 MKTLIVVLGPTGVGKTALCLHLAKRYAAHIISADSRQMFAQLPIGTAAATPEEQQQAIHH FVGHLKLDEYYSASCFETDTLALLDQLFANGDVALMTGGSMMYIDAVCYGIDDIPTVDDN TRKHVKQLLEQQGLPALVELLKQLDPEHYDFVDKQNPRRVCHALEICLMTGNTYTSYRKK ERKQRPFNIIKIGLNRPREEMYNRINQRVLQMMANGLEEEARRVYPLRGLNSLNTVGYKE LFAHFDGTIPLEEAVRQIQSNTRRYMRKQLTWFKKDPEIKWFYPDNTEEITNYIDSFVGI TD >gi|283510580|gb|ACQH01000039.1| GENE 6 5220 - 7703 2551 827 aa, chain + ## HITS:1 COG:NMA0655 KEGG:ns NR:ns ## COG: NMA0655 COG5009 # Protein_GI_number: 15793640 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase/penicillin-binding protein # Organism: Neisseria meningitidis Z2491 # 51 727 21 697 798 274 29.0 6e-73 MRKKIVQGAKWCATKLKAFWKWYKGLYKGRKWYFKTGVVVASAFVAFFVYLGMVDINFLW LFGKSPGWSTINNPITAQASEIYSADGKLIGKYFNENRTPVSYGDVNPVFWNTLIDTEDE RFYSHWGVDPIGMFAAAKDAVTKGDGRGASTITQQLAKNLFRVRTEYSTGLLGKIPGIRM LIMKTKEWIIATKLELSYDKNTILTMYANTVDFGSSAYGIKTACKTYFNITPKDLTAAQS ATLVGMLKATTTYNPITHPENSLKRRNTVLANMVRRGHLSQAVYDSLSQTPLSLKYNVET 
NYDGQANYFREAVADYLKDWCEANGYDLYTSGLKIYTTLDTRMQKYAEDAATKQMRQVQQ NFKNHWGKNDPWIDERGRVIPDFIEGIARKQPVYNWLLEKFPSQPDSITYYLNKPHKVKL FDYAKGSIEMNISTMDSIRYMVRFMHCAFVAMEPQTGAVKAWVGDINFDAWKYDKVKAMR QPGSTFKLFVYTEAMNQGLTPCDKRRDEYITMQVYDKKKHEVVTWTPSNANGSFSGDSMP LKSAFAKSINSVAVRLGQEMGIKRIIETAEKMGIKSPLEDAPSLALGSSDVNLLELTNAY CTIANDGMHHEPVLVTRIEDKDGKEVFLGPATSEQAVPYKTAFLVQQLLMGGMREPGGTS QSLWGYVGDYRDTDFGGKTGTSNNHSDAWFMGVSPKLVVGAWVGGEYRSIHFRTGALGQG SRTALPICGYFLQSVFRDPAFKQYHGKFGKPRDADITRDMYLCDSYYPRAKKDTTETDST GGEGIIILDGNGNPIREISPGDLLPSEQGQEKKRSRPTEQPLNLDEL >gi|283510580|gb|ACQH01000039.1| GENE 7 7963 - 9663 1704 566 aa, chain + ## HITS:1 COG:SPBC887.14c KEGG:ns NR:ns ## COG: SPBC887.14c COG0507 # Protein_GI_number: 19113280 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Schizosaccharomyces pombe # 2 407 315 760 805 172 30.0 2e-42 MQKALQIIQFTRRSLFLTGKAGTGKSTFMRHIAATIKKKHIILAPTGIAAINAGGSTLHS FFKLPFHPLLPTDKRYSTRNIRNTLKYNGEKTKLLREVELIIIDEISMVRADIIDFIDKV LRIYNRNMREPFGGKQLLLVGDIYQLEPVIKEDERQLLRPFYPSCFFFDAHVFREMQLIA VELRKVYRQRDAQFISLLDHIRTSHVSDSDLRLLNAQVNAEIGTEEGRLSITLSGRRDTV DYINEKQLNTLPDEPTIFYGHIEGEFPESSLPTPIELEVKTGAQVLFIKNDKERRWVNGT LGTIIGFGDEQDGIIYVRTEDGRDFDVEREIWSNVRYTFNEKEQKIEEEEIGSYRQFPIR LAWAITVHKSQGLTFNNVKIDFTGGVFAGGQTYVALSRCTSLQGISLQEPIRRENIFVRT EVTNFARNYNNLNIINSALQESKADKQYNDAMTAFDKGDMQAAIDNFFLAIHSRYDIEKP AAKRLIRKKLNIINLLKQQLEEARAMEKQRAEQLKRLAAEYTLLGKESEKEGMPQAAIRN YEKALELWPDLPEAKRKLKKLKKSGD >gi|283510580|gb|ACQH01000039.1| GENE 8 9746 - 10297 256 183 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928106|ref|ZP_06421953.1| ## NR: gi|288928106|ref|ZP_06421953.1| hypothetical protein HMPREF0670_00847 [Prevotella sp. oral taxon 317 str. 
F0108] # 20 183 1 164 164 319 100.0 4e-86 MTSKARSKSLFILVICLTLMAACSSHNGRSLECFEVTNRQLKAIVNTMTQQFFTENTEDK VAVLDIVQEDSITKFVFSFQNKHRLRDKYIAYKGRRIVGYISQNHRDLIVLTNIHIITEL QETLEPLMKPSDKIKNFDFLSTDNQLYYDAETKGWKSFETIYEPILVEYQLKEGHFSQPT MKR >gi|283510580|gb|ACQH01000039.1| GENE 9 10332 - 10772 301 146 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928107|ref|ZP_06421954.1| ## NR: gi|288928107|ref|ZP_06421954.1| hypothetical protein HMPREF0670_00848 [Prevotella sp. oral taxon 317 str. F0108] # 1 146 1 146 146 286 100.0 2e-76 MKHLTAIIAVIIGSILVGCNQKIKQSRLFVPKQIQWDTLKGGFEDDTDLYHKAFFVLYVS NDTLYKIKTENELINDSILTVTSASCAEAFEIKEITPEMLIVERENKKDTIFLAHYTDHK VHKQSINRVKQWIYGKYPAPEIVIPN >gi|283510580|gb|ACQH01000039.1| GENE 10 10819 - 11616 192 265 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928108|ref|ZP_06421955.1| ## NR: gi|288928108|ref|ZP_06421955.1| hypothetical protein HMPREF0670_00849 [Prevotella sp. oral taxon 317 str. F0108] # 1 265 1 265 265 525 100.0 1e-148 MKLIFNLLLAVCLSACALSPNTPQKDIIKDVQKRKSTIVIDGELKLQYHTKGEYYVMDFI TIGENYITFNKKASELITIHIRYAGEKGAIIGRDYHFDDDTHKLSHFYSLRNNNHHGFAV FYNDARKVERYEYFNEGESGHPQLTNTNSISKVHQEVFSDSLFNTLFPRRGYGDLYKNNS IEYHKIRDKCILLIGPLCFVVKKEGTLEAIFTHCYKCNNKLFRGGEGYFFDPHSHYLIAL CRYKHNTLNGKAIIFDNNGMVIQHK >gi|283510580|gb|ACQH01000039.1| GENE 11 11914 - 12129 60 71 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFQKATCWKVFQRCSSSCKLPIEGKLIPRFHDVVSIGREQHALQPRALAVVSVPGFRAVA HARAYSTIAFV >gi|283510580|gb|ACQH01000039.1| GENE 12 13218 - 13415 67 65 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MATAMTTHRIYFFIQKCSLVYLFCLKAPNFNLQSKHFFLKISHLSAKEMVTNNNLFKIKT QRLQM >gi|283510580|gb|ACQH01000039.1| GENE 13 13375 - 17997 4605 1540 aa, chain + ## HITS:1 COG:no KEGG:PRU_0299 NR:ns ## KEGG: PRU_0299 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 1538 1 1545 1557 1400 45.0 0 MKKYIRWVVIAVAIPIVLFLILVVLLYCPPVQNWAVKRVASYASQKMGMEITVDRVRLVF PLDLGVEGVRVLQANDSIAGQTDTIANVGRIVANVQLIPLLSNKVEVDELSLHDVQMNTA 
NLVHEARIKGKVGLLLVQSRGVDLNSKTIRVNQAQLKDANISVELSDTVPPDTTKTPIYW KIKVDKLQIEHSKALLRTPRDSMTVLVDLPKLDAKNGYFDLYKNLYRLETLNWQNGSLNY DQTYKPRTDGLDPNHIHLTELHLGIDSFYFCQPELKMNLRVCAFKEKSGVRVNHISGHVA LDSTRIALPNLELRTPESSLTAKFNMDMNAFDEVNPGKLYVALHASLGKADLLRAMGGMP EGFRKAFPNYPLRIDGVAKGNKERIAFSGLNVVLPTAFKLQTTGYVEKPMDTRRCKGEIF LKANTYNLNFAKVLLDKKLQRQVNIPSGISLNARAKMQGEAYQADATMHEGGGSVQLKGS FNAHKMAYNANIKLKSLALQHFLPNQGLHPFTGEVETRGVGTDFMSTKTQLFANVRIDRF HYGGYDLDKIAGKAQMSNGLITASLQSNNPLLKGSINVNGRNSDKLIKAHINTDLAKLDL YNLHLLDQKVVASFRSDLNLETNGKDYYKLEGNVADIMVNDTAKDYRLGAMSVNLFTNRD TTHANLVCGDLALQMNSKGGYAYILNRGMGFWKELQHQLATKHLNEARLRERLPLANINL QSGRQNIFMRVLRKYGYSVGEMSADLHSSPVAGLNGYLNVNQLVADSILIDTVRLALKSD ADQIAYSVQVRNNKKNPQYVFNALLNGALEERGTYIKAQIYDENNKLGIGMGVRATMEDK GIRLSLMDREAVLGYKQFTVNDSNYVFMGDDRRVCADLRLRAADGTGIQLATNNENEEAL QDITLSLNRLNLEKLFSVLPYMPEVSGTLNGDFHVIQTKDALSISSAVSVTDLAYQHSPM GNLSSEFVYMPKSDGGHYIDGTLSQNDIQVGQLSGTYNSARGGQLDATLKLERLPLTIVN GFIPQRMFGLNGYAEGELAIKGALSKPMINGEVFLDSAHLFSDPYGVTLRFANDPVRIVN SHLLFENFEVFAHNDSPLNISGEFDFSNPDRMRMDMRMRANKYEIINSKENYRSEVFGKV FVNFSGRMSGEVSNLNLRGKLDVLDATDMTYVLRDSPLSTDNQMDDLVSFTNFAENKAEI INRPALTGFNMDLSMNIDPNAHILCALNADKSNYIDLMGGGNLRMQYTPVNGLRLTGRYT LNNGEMKYSLPVIPLKTFTIKNGSYIEFTGDAMNPRLNITATEITKASVTTNGRDGRIVE FECGVVVTKTLKNMGLEFTIDAPQDMTVSNQLKTMGSDERGKLAVTMLTTGMYLADGNTS SFSMNNALSAFLQSQINNITGNALRTLDLSIGLDNVTDASGNMHTDYSFKFAKRLWSNRL RIIVGGKVSSGSDVGRNDNTFFNNVSLEYRLNQGSTRYMQMFYNRDSYDWLEGDIGKYGV GFIWRRKLRHFRDIFRLKVPEEVVLPATPDSLNKEKKDGE >gi|283510580|gb|ACQH01000039.1| GENE 14 17987 - 20293 2014 768 aa, chain + ## HITS:1 COG:no KEGG:PRU_0300 NR:ns ## KEGG: PRU_0300 # Name: not_defined # Def: OMP85 family outer membrane protein # Organism: P.ruminicola # Pathway: not_defined # 1 768 10 787 787 767 50.0 0 MANNPKHVFSALLACVATLVATSCSTTKGVPEGDQLYVGLKKIKYTNYEDNSHFSSTQEE MEAALDCAPNGAFMGSSYYRTPFPISLWVWNAFSESTSGAGKWISKTFGKPPVLMSWVNP ALRASVAQNTLKNNGYFQGRVKYEVLTQKNPKKAKIGYSVSVGPLYTLDSVAYLNFPPEI TQLIASRASISKLQKGSPFSVSSLDTERNRISNLLRDNGYYYYQPGYSSFLADTIQVPQK 
VQLKMLPTATMPETAKRRWYVGKIDFYLRKQMGETLTDSVKRKRLTVYYNGKHSPIRSRI LMRNIMLRPEHMYRYRDENLTSTKLMDMGLFSMTDFKFTPRDTTSLSDTLDLSITGLLDK PYDFYVETNYTGKTNGRMGPAVIMGFTRRNAFRGAEKLDINVFGSYEWQTGHSANGAAAQ MQSYEYGASASLEFPRLFAPFWQSRRFYNTPSSLIKVASSVISRGNYFKRHIVSGELTYN VQTSERSMHQFSPISLQYNYMTSHTAVFDSILNANPYLKVTMKDVFIPKMRYSYIFNSPS RYRHPIWWQTTFSEASNLLSLAYMAAGKKWDEKDKTLFKNPFAHFFKIETDWRKTWKLNA HSQLVAHASAGAIWSYGNATEAPYSEQFYVGGANSIRAFTARSIGPGAYYPTASTNSYLD QTGDIKLLFNLEYRPRLIGSLYGAVFIDAGNVWAMHSSSMRPNATFNVKNLAKEMALGTG IGLRYDLDFFVIRIDWGIGIHVPYKSGFYNMGRFKNSQSLHLAIGYPF >gi|283510580|gb|ACQH01000039.1| GENE 15 20646 - 20900 322 84 aa, chain + ## HITS:1 COG:no KEGG:Coch_0566 NR:ns ## KEGG: Coch_0566 # Name: not_defined # Def: transglycosylase-associated protein # Organism: C.ochracea # Pathway: not_defined # 1 83 1 83 83 63 86.0 2e-09 MGIIATLIIGAIAGWLGGVIYKGSGLGLIGNIIVGVLGSGVGSWLLGNVLNISLGEGWIG SILTGAIGAIVILFLINLIFKKKK >gi|283510580|gb|ACQH01000039.1| GENE 16 21010 - 21396 324 128 aa, chain - ## HITS:1 COG:no KEGG:Coch_0565 NR:ns ## KEGG: Coch_0565 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 1 128 1 128 128 160 67.0 1e-38 MKKAFTLVTLICLVVVAVLSISCSKSDDPADNDFFIGKYKGKISYTVDGDTKSVDNGSVT VIKTGNNYYFSFSDGIPDLKGVKFKKQGDHTLVNIDFKEGMKVINISASSLYIYYKQDVK KWTASATR >gi|283510580|gb|ACQH01000039.1| GENE 17 21432 - 21758 296 108 aa, chain - ## HITS:1 COG:no KEGG:Coch_0152 NR:ns ## KEGG: Coch_0152 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 1 91 1 91 108 84 60.0 9e-16 MSKITKTLLGIAAGAAVGVGLGILFAHDKGENTRKKIKGSVSDKVDDLKEQIERFTKMFN EKSSKLKGTLEENVDYLLSESNCKSDELINLLEKKLASLKKEAKKRMS >gi|283510580|gb|ACQH01000039.1| GENE 18 23543 - 24325 773 260 aa, chain - ## HITS:1 COG:lin1028 KEGG:ns NR:ns ## COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 1 260 1 256 256 141 36.0 1e-33 
MIKALFFDIDGTLVSMKTHTIPTSAVKSIAQAKHQGVKVFIATGRPYAIINNIDSILPYI DGYLTTNGAYCTVGNEVVYHHPIPKRDVELILNDATEQDYSCVVVGEKDITTYNYKDNID RIFRRTLDIHGIDYHLPIEKVMQYDILQLTPFATVEQEQQLMPRIPDCVSGRWHPEFMDI TSRMADKGRGLAAIAQYLGIPIDACAAFGDGGNDISMIRAAGVGVAMGNAGNDVKQAANF VTTHIDEDGIMYAMLHLGII >gi|283510580|gb|ACQH01000039.1| GENE 19 24466 - 25734 1472 422 aa, chain - ## HITS:1 COG:PA2994 KEGG:ns NR:ns ## COG: PA2994 COG2871 # Protein_GI_number: 15598190 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrF # Organism: Pseudomonas aeruginosa # 5 422 6 406 407 410 49.0 1e-114 MLQFILISIAAFLLIILLLVIMLLVAKKYLSPSGKVVITVNGDKELTVEQGNNVMATLNE NGIFLPSACGGKASCGQCKLQVLEGGGEILDSERSHFSRKEVKDHWRLGCQCKVKGDLKV KVPESVLGVKEWECTVISNKNVSSFIKEFIVELPPGEHMDFVPGSYAQIKIPAYETIDYN KDFDKKDIGEEYLPVWEKFGIFDLKAHNPEETIRAYSMANYPAEGDRITLTVRIATTPFK PRPEVGFQDVPTGIASSYIFSRKPGDKVVMSGPFGDFHPIFDSKKEMIWVGGGAGMAPLR SQIMHMLKTLHTRDREMHYFYGARSLSEAFFLEDFHELEKEYPNFHFHLALDRPDPKADE AGVPYVAGFVHEVMYNTYLKDHDAPEDIEFYMCGPGPMSKAVQVMLDSIGVDRENIMFDD FG >gi|283510580|gb|ACQH01000039.1| GENE 20 25746 - 26372 678 208 aa, chain - ## HITS:1 COG:VC2291 KEGG:ns NR:ns ## COG: VC2291 COG2209 # Protein_GI_number: 15642289 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrE # Organism: Vibrio cholerae # 1 208 1 198 198 203 58.0 1e-52 MEHLISLFFRSIFVDNMIFAFFLGMCSYLAVSKNVKTSLGLGLAVTFVLLVTVPVDYLLQ TKVLGPDCLIAGVDLSYLSFILFIAVIAGIVQLVEMAVEKYSPSLYSALGIFLPLIAVNC AIMGASLFMQQRINMDPSSTQYIGSVWDSISYAVGSGFGWTLAIVSMGAIREKMQYSDVP KPLQGLGITFITVGLMAMAMMCFSGLNL >gi|283510580|gb|ACQH01000039.1| GENE 21 26382 - 27011 530 209 aa, chain - ## HITS:1 COG:PA2996 KEGG:ns NR:ns ## COG: PA2996 COG1347 # Protein_GI_number: 15598192 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrD # Organism: Pseudomonas aeruginosa # 10 203 9 202 224 203 55.0 2e-52 
MSKLFSKENKEVFVNPLNLDNPIMVQVLGICSALAVTSQLKPAIVMGLAVTIITAFSNVI ISLIRNTIPSRIRIIVQLVVVAALVTIVSQVLKAFVYDVSVELSVYVGLIITNCILMGRL EAFAMMNKPWPSFLDGIGNGLGYAFILVLVGCIRELFGRGSLLGFQLIPESAYHAGYLNN GMMTMPAMALILLGCVIWLHRAYFYKEKK >gi|283510580|gb|ACQH01000039.1| GENE 22 27027 - 27749 902 240 aa, chain - ## HITS:1 COG:CT279 KEGG:ns NR:ns ## COG: CT279 COG2869 # Protein_GI_number: 15605000 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrC # Organism: Chlamydia trachomatis # 12 238 10 302 316 93 28.0 3e-19 MAEVKKKGLNTNSNGYTLIYSTILVVVVAFLLAFVFKALKPQQDINVALDKKKQLLYALN IRDISDEEAAQKYKEVVLADEIIDTKGNVTTKGEQGGEKAGFLLNSADYKDGRLALFVCK VDGQTKYIVPVYGMGLWGPINGFIALDADKNTVYGAYFNHEGETAGLGAEIKDNVKWQNL FKGKKVFAEGTQKVALSVVKKVDDPSTQVDAVTGATLTSNGVSDMLIEGLSKYLVFLTSK >gi|283510580|gb|ACQH01000039.1| GENE 23 27754 - 28920 1209 388 aa, chain - ## HITS:1 COG:PM1329 KEGG:ns NR:ns ## COG: PM1329 COG1805 # Protein_GI_number: 15603194 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrB # Organism: Pasteurella multocida # 3 382 2 408 410 310 44.0 5e-84 MKALRQYLDKIKPKFEPGGKLHAFQSIFDGLETFLYVPNTTSKVGTHIHDSIDSKRIMSM VVIAALPALLFGMYNVGYQNYAAAGHLADASFFDLFFFGFLAVLPKVVVSYAVGLGIEFA WAQWKHEEIQEGFLVTGILIPLIIPVSCPLWILVLAVAFSVVICKEIFGGTGMNIFNVAL CARAFLFFSYPSKMTGDTVWVASNSILGFGYDLPDGFTAATPLGEIGVGANVPYSLSDMI TGFIPGSVGETSVIAIGIGAVILLWTGIASWRTMGSVFGGGILMALLFKHLGMTSIDWYE HIVLGGFCFGAVFMATDPVTSARTHTGQWIYGFLIGAMALIIRVMNPGYPEGMMLAILFA NMFAPLIDYYVVQANISKRMKRLTKNNK >gi|283510580|gb|ACQH01000039.1| GENE 24 28927 - 30276 1240 449 aa, chain - ## HITS:1 COG:NMB0569 KEGG:ns NR:ns ## COG: NMB0569 COG1726 # Protein_GI_number: 15676474 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrA # Organism: Neisseria meningitidis MC58 # 4 447 1 446 447 245 33.0 1e-64 MANVIKLRKGLNIHLKGTAAQTKRVIGACAEYVVRPDTFEGVVPKLVVHEGDRVEVGDAL 
FVDKNFPEVRFASPVSGTVKAVERGERRKILGIRLQADEVQKFHDFGRRDVSTLSADEVK SALLEAGLFGYIYQLPYAVSTNPQTTPKAIFVSALRDMPLANNFEYELQGNERDFQTGLT ALSKMATTYLGIGKNQKADALVNAKDVEVTVFDGPCPAGNVGVQVNHISPVNKGEVVWTV DPTFVIFFGRLFNTGKVNLLRTIALGGSAMASPMYVDALIGTPFSVILKDALTRQEDLRL INGNPLTGTKSCLEDSVGVKTSEITAIPEGADANEMFGWILPRTNQFSTSRSYFSWLMGK KEYDLDARVKGGERHIIMSGEYDSVLPMDIYGEFLIKAIIVGDIDKMEQLGIYEVSPEDF ALAEFVDSSKLELQRIVRQGLNMLRKENA >gi|283510580|gb|ACQH01000039.1| GENE 25 30303 - 30509 97 68 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MINLLSDLFALAASLCAHAESNERLLKWCSSSKQTPKAMDRNTTHRFIVANIIKIKHNKK KNGGYISH >gi|283510580|gb|ACQH01000039.1| GENE 26 30590 - 32294 1472 568 aa, chain + ## HITS:1 COG:no KEGG:PRU_1251 NR:ns ## KEGG: PRU_1251 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 4 567 18 587 1480 347 35.0 7e-94 MVGLYFVFVVLLHVPFIQAFVGNTIGDALSEKLGTHVSVGKVDLGFLNRIIIDDLVINDQ QNQRMLEATRLSAKLDYAALAAGKIHISSVQLFGLKANLYKKSAQSQPNFQFALDSLASK DTTSNTPLDLRINTLIIRRGSVSYNQLDKPKTNTFSPNHISATDISAHIALNALTNDSVN LNIKDLSFKEQSGLHVKALTGELTANKQRASLLHLTCVLPSTSISFGPITATYAFNGTEF QSPSFLFSGSLQRSKLTLADLACFLPELKTSTKPFFLQADVSGTSTSLRVKTLEISSRAN NFSLQADGSISNWTSKLRWATNIRRLRVSAEGVQFLSDNFANHFKVPAVIGRLGNIDYIG EAGGYGQDVATKGIIRTDAGTATVAFGRHNDNFSGRVETNALDLGKILADKRFGNISTQI DVDGKLPTRGSVYVKAKGDVKEFFYNAYSYKNIAVDGEWNNGTFDGKLNVKDPNVNFNLQ GQFNLASKQPFAKLKAEVTNFNPAALSLTKQWPNTQFSFAVMADIKGNSLNTANGRVQLD RFAMLSDAKSYKLNAISVDVSNNNQQHS Prediction of potential genes in microbial genomes Time: Sat May 28 00:50:52 2011 Seq name: gi|283510579|gb|ACQH01000040.1| Prevotella sp. oral taxon 317 str. F0108 cont2.40, whole genome shotgun sequence Length of sequence - 23880 bp Number of predicted genes - 19, with homology - 17 Number of transcription units - 11, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 129 - 2576 2813 ## PRU_1251 hypothetical protein + Term 2680 - 2735 6.6 2 2 Tu 1 . - CDS 2898 - 3368 -336 ## - Prom 3482 - 3541 4.1 + Prom 3151 - 3210 4.2 3 3 Tu 1 . 
+ CDS 3460 - 3891 307 ## + TRNA 4136 - 4211 83.5 # Gly GCC 0 0 + TRNA 4235 - 4316 53.5 # Leu CAG 0 0 + TRNA 4347 - 4430 45.8 # Leu GAG 0 0 + TRNA 4455 - 4527 82.9 # Gly GCC 0 0 4 4 Op 1 . + CDS 4551 - 5102 547 ## PRU_0882 hypothetical protein 5 4 Op 2 . + CDS 5121 - 5537 389 ## COG0629 Single-stranded DNA-binding protein 6 4 Op 3 . + CDS 5610 - 6872 1291 ## COG1253 Hemolysins and related proteins containing CBS domains 7 5 Tu 1 . + CDS 7000 - 7599 386 ## BF1558 siderophore (surfactin) biosynthesis regulatory protein + Term 7661 - 7714 4.6 + Prom 7717 - 7776 4.8 8 6 Tu 1 . + CDS 7800 - 8939 1057 ## PRU_0697 hypothetical protein + Term 8949 - 8993 0.3 9 7 Op 1 . + CDS 9183 - 10055 855 ## COG2820 Uridine phosphorylase 10 7 Op 2 . + CDS 10060 - 11442 1242 ## COG0486 Predicted GTPase + Prom 11459 - 11518 2.4 11 7 Op 3 . + CDS 11539 - 13596 1613 ## BDI_1105 putative outer membrane receptor protein involved in Fe transport + Term 13829 - 13871 8.1 12 8 Tu 1 . - CDS 14898 - 15440 700 ## COG0288 Carbonic anhydrase - Prom 15519 - 15578 2.5 + Prom 15708 - 15767 2.8 13 9 Op 1 . + CDS 15850 - 16143 326 ## PRU_1181 hypothetical protein 14 9 Op 2 . + CDS 16156 - 16419 309 ## gi|288928137|ref|ZP_06421984.1| transglycosylase associated protein 15 9 Op 3 . + CDS 16424 - 16714 221 ## COG3326 Predicted membrane protein + Prom 16792 - 16851 2.6 16 10 Op 1 . + CDS 16935 - 18218 974 ## Spea_1267 hypothetical protein 17 10 Op 2 . + CDS 18232 - 20040 934 ## BTH_I0101 hypothetical protein 18 10 Op 3 . + CDS 20037 - 23285 1562 ## Spea_1269 helicase domain-containing protein - Term 23261 - 23300 1.5 19 11 Tu 1 . 
- CDS 23386 - 23880 391 ## PROTEIN SUPPORTED gi|116492196|ref|YP_803931.1| acetyltransferase Predicted protein(s) >gi|283510579|gb|ACQH01000040.1| GENE 1 129 - 2576 2813 815 aa, chain + ## HITS:1 COG:no KEGG:PRU_1251 NR:ns ## KEGG: PRU_1251 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 799 670 1469 1480 761 50.0 0 MNDAQKTLNITLKAPSFSYDGHRIEGGEMTVSTLNDTLKLAAQATNVAENGKKTWCKVQA TAADNKISSTISFDNLRKPLLKGELKWQTTFFKNDEGEATAQATFQQSDITVGNAIWQVK PSQVTYSKNRVLFNKFAIARGKQLIAIDGAATNNAHDSLMVELKDVDVSYMLNLVNFHAV EFGGRATGTAIVKTLFNNPNAYANLRVNDFTFEEGDMGTLVAGVKYNNEKGNVEIDAIAS DGLNARTFINGYVSPKHNNIDLAIKANNTNIAFMETFCKSFMHGVKAYANGEVRLSGPLS YINLTGQLVANGSFTMTPLNTVYTLTNDTVHFVPNRIDLNACRFTDRNGNVGTLSGEIRH DHLTDLSYNLSVKANNLLAYDTKTFGENTFYGTVYATGTCNIRGRSGEVNIDIDATPNRN SVLVYNAASPQTINNNSFIHWGADALLTAEEQKAQEPETEDQAHNDNQTADIPTDLHLNF LVNCTPDATIRLIMDKQSGDYISLNGYGVLRATYYNKGAFDMFGNFAIDHGVYKLTIQNV IKKDFTFEQGSTIAFNGDPYAARLNMKAIYTANSVSLSDLNLGKSFSNNTIRVNCLMNIT GTPAAPKVSFDLDLPTVNSNAKQMIYSLINGEEEMNQQVLYLLAVGRFYTQGRNNADLGN SSTQSTTSLAMQSLLSSTLSQQLNNVLGTVVNNTNWNFGANIATGDDGWGNAEYEGLLSG RLLNNRLLINGQFGYRDNVNTTKSFIGDFDIRYLLFPSGNMAIKMYNQTNDRYFTRNSLN TQGIGLILKKDFNGWKDLFGITSRKKKKTKAAKKK >gi|283510579|gb|ACQH01000040.1| GENE 2 2898 - 3368 -336 156 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHFVSLCFSLCSVARQFYHLQNAHFASQNGHFASNLPCFGAASSHLQPVYLCCINIISPL WELLFFTFGLQFTAFSPAFCSVLPCVLLHFTLRFAVFCIAFCCILHCFLVQNAIVFASYV YLCSIKMACTLLCAVPIFTLITPCNTPFFAAEWQAG >gi|283510579|gb|ACQH01000040.1| GENE 3 3460 - 3891 307 143 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MERRKRAFCKIYLPLLGYSKKLYNFPLTALLLTSYYLLLTILPFTPHCLTAYYIQLYCLL PYYLVPYNLLPYAILLYSILLYSILLSTLVTSYFNSLLPCYILLYYLLPNYILLYYLLPY YLLLYYLLLYYLLLYYQPLYHFH >gi|283510579|gb|ACQH01000040.1| GENE 4 4551 - 5102 547 183 aa, chain + ## HITS:1 COG:no KEGG:PRU_0882 NR:ns ## KEGG: PRU_0882 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 183 1 180 180 222 63.0 5e-57 
MNLYLRYFDKETLVDNVEKALDFLQSIPEIGMNIDLENDIRDYVASDVYYPKRYKVRPRV YFIIIKTVAATMLDFKQKKAVQQLPVSNGYDRRDLAATAMTRLTEERPGWYEGELDFKRV VLVPSTGKYEYRDTHFVVHCKAMSGQDCYDRMIDNLSERVDRRSQFPSPKGKNFRFRYLG RWK >gi|283510579|gb|ACQH01000040.1| GENE 5 5121 - 5537 389 138 aa, chain + ## HITS:1 COG:PM1950 KEGG:ns NR:ns ## COG: PM1950 COG0629 # Protein_GI_number: 15603815 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Pasteurella multocida # 1 127 4 128 166 90 37.0 9e-19 MNKVMLIGNVGKDPEVHYYEADTAVAQVSLATTERGYTLQNGTKVPDRTDWHTVVFWKSL AKTVEKYVHKGDKLYVEGKVRYRSYDDKQGKRQYVTEIWADSLEMLTPKNLSRQEPQAHE PSTRTEQSEEQKDNQLPF >gi|283510579|gb|ACQH01000040.1| GENE 6 5610 - 6872 1291 420 aa, chain + ## HITS:1 COG:FN1486 KEGG:ns NR:ns ## COG: FN1486 COG1253 # Protein_GI_number: 19704818 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Fusobacterium nucleatum # 17 414 17 414 426 201 32.0 2e-51 MGVVFAFVLAIVLLGVSGFASGSEIAFFSLSPSDISELDLEKSLKDKNINMLREDSERTL ATILITNNFVNVTIIMLLNYVFANVVHFGPRAYWLQFLILTILLTFLLLLFGEIMPKVLS RQAPLLFCRRSVAGVLFLRKLFWPLETVLLKTGMVAEKMMGKESVTLSVDDLEQALELTN KEDLKDEEKLLQGIIRFGDETAKEIMTSRKDIVDIDIKCNFSEVLESIKENNYSRIPIYQ DNTDNIKGVLYVKDLLPHLTKPHTFRWQSLIRPPYFVPETKKIDDLLRDFQENKIHIAIV VDEFGGTSGLVTLEDILEEIVGEINDEYDDETEKTYTKLNYNTFVFDGKTLLSDLCRILE VDDEEFSEVEGGADTLAGLLLELKGDFPSIGERLNYKNYLFEILAIEERRISRVKVVVHP >gi|283510579|gb|ACQH01000040.1| GENE 7 7000 - 7599 386 199 aa, chain + ## HITS:1 COG:no KEGG:BF1558 NR:ns ## KEGG: BF1558 # Name: not_defined # Def: siderophore (surfactin) biosynthesis regulatory protein # Organism: B.fragilis # Pathway: not_defined # 16 171 15 177 224 85 30.0 8e-16 MPLLRIERLGEDATLGLWTLTEEERFFFDNYPCLHSFQKEINTIGSKQRRMEKLATQALL FEMTKNTELTIEHDNNGKPLVKGYHVSISHTKGLVALMLSTTREVAVDVEYDSDRVSRIA HKFVNTNETMPTNAHLLLAWCAKETIYKYFSSDNLLSSDIYLQPFKVEDNGIIVGRNLKR NLLVSARYSRFNGFTMVYI >gi|283510579|gb|ACQH01000040.1| GENE 8 7800 - 8939 1057 379 aa, chain + ## 
HITS:1 COG:no KEGG:PRU_0697 NR:ns ## KEGG: PRU_0697 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 22 379 5 354 358 243 40.0 6e-63 MKKRYLIFTLIMACFCTAGYAQRKKVKAAVKPKPQPVQLSAADLLYRNMLGATAKVMFVD SVVVEKSLFLSSIPLNPEAGTLTSYPAFFKEHKYLDGSVYENEFKNTIYYSEGDSTRRSA IYTSDRIGTEWSDPQRLSEIGDEFEQQNYPFLMADGITLFFAAKGANSLGGYDLFMTRKD GETGKFFQPENYGMPFNSTANDYLLAIDEMDNLGWLVSDRFQPEGKVCIYTFVPTAQRLS FKEGEVSQQQLESLAKLSSIAQTWSFGNRDQALKSIESLKARSAKTTKTTAFRFPINDAK VYTSLSQFKTKQGKELYMELTELYKKQAKAVQTLEEKRAIYHKGRYNEAEAIKALEKNVE QLDTKIHATEKAIRNYENK >gi|283510579|gb|ACQH01000040.1| GENE 9 9183 - 10055 855 290 aa, chain + ## HITS:1 COG:VNG0893G KEGG:ns NR:ns ## COG: VNG0893G COG2820 # Protein_GI_number: 15790029 # Func_class: F Nucleotide transport and metabolism # Function: Uridine phosphorylase # Organism: Halobacterium sp. NRC-1 # 19 266 14 227 273 99 31.0 5e-21 MPKFFAESELIINGDGSIFHLHVKPEQLADKVILVGDPGRVSLVASHFDSKECDIESREF HTITGTYHGKRISVVSTGIGCDNIDIVLNELDALANIDFKTRTEKEQLRQLTLVRIGTCG GLQEYTPVGTFIASEKSIGFDGLLNFYSGRNDVCDLPFEEAFKQHMQWNPQLCAPYVIDA DKDTLERIAGDEMVRGVTIACGGFFGPQGRELRIPLADPKQNEKVESFEYNGYRITNFEM ESSALAGLARLMGHKAVTCCMVIANRRAKNVNANYKNSIDELIKLVLERI >gi|283510579|gb|ACQH01000040.1| GENE 10 10060 - 11442 1242 460 aa, chain + ## HITS:1 COG:aq_871 KEGG:ns NR:ns ## COG: aq_871 COG0486 # Protein_GI_number: 15606214 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Aquifex aeolicus # 8 460 2 448 448 304 39.0 2e-82 MQALLNDRQTICALATSVGGALGVIRVSGNDAIDIVDHAFKAPKGKKLHALPPQTVQYGH IVDEGEQTIDEVLVTCFRAPHSYTGENSVEISCHGSAYILNEVLKLLVRLGCRQAQPGEF TQRAFLNGKMDLSQAEAVADLIAATNRASAQLALGQLRGHFSGELAALRDKLLHITSLIE LELDFSDQDVTFADRQELQALAEEIRTKIATLATSFETGRAIKAGISVAIVGKTNVGKST LLNRLLKEERAIVSDIHGTTRDVIEDTIQINGINFRFIDTAGIRKTSDEIESLGIERTYQ KLTEAAIVLWVIDKAPTLSEIEEMNAHTRGKRLIVVSNKTDAQSFAFPTFSWTEQPTFVS VSAKFNTNIETLETCIYNAANIPEIHENDVVVTNVRHYEALSHALASIQRVLEGIALDLS GDLLAEDLRQCLHFLAEITGGSITSNEVLGNIFRHFCIGK >gi|283510579|gb|ACQH01000040.1| GENE 11 
11539 - 13596 1613 685 aa, chain + ## HITS:1 COG:no KEGG:BDI_1105 NR:ns ## KEGG: BDI_1105 # Name: not_defined # Def: putative outer membrane receptor protein involved in Fe transport # Organism: P.distasonis # Pathway: not_defined # 31 683 112 769 769 122 21.0 6e-26 MKKCISILLFFTLCIIANAQTDSTATRTLGEVVVNAKGQIETAEKAVLTPTTLEKRHATN GFELLNVMQTPELDVSPRTNSISTKGGGQVVLCINGMEVLPEDVSTLRASNIISIEYIRT PSGKYAGKAALLNFIIKQMSYGGNVYLSASEGFAYKNGDYLAFTDFTKKQLTLSVAVSAD WKHDHSHIEGHDLFRFADNSTLANSYSTDHSLRKQNSQSARFRLSHTGNKHQFVSYLNLT RQAEPSVKTITNNQYTGKYNLATTRTTTTNGKSLAPTLYANYVVWLPHKQTIDVSGSFAW GNNSYNSTNNETAQANIISKAVERNIAVNANARYSKSLNGSTTLSASLWHNHNYYKDTYT GTTEGLQRLTTNQTGALAQLSGSGQKHSYYISAGLSNTAVSLNNQHYNYCVPMAFYGGNY AFNDRQALSLNGYLTHTMFDPSNKNDMTVPTSFFQSVKGNPDLALIKVFGNTLTFNAQWG KSKASIFYNSNIYFKNIAHLFTADANTIYDMRVNDGTFYGNMLGVSYALSAFADKLRLDV TALEEYNVMRGNTYNMQRNVFRLRTSLAYLLGDWMFSLLYQTPRTDLDIREPFLIRMQPI YEMAINWNHKAWAVEFALCNVFSRYAKQRITMDYGHYNRDSWLYNEPNGRVINLKITYSI GYGKTKQRGEMELNKNINSAIIKGF >gi|283510579|gb|ACQH01000040.1| GENE 12 14898 - 15440 700 180 aa, chain - ## HITS:1 COG:NMA0415 KEGG:ns NR:ns ## COG: NMA0415 COG0288 # Protein_GI_number: 15793420 # Func_class: P Inorganic ion transport and metabolism # Function: Carbonic anhydrase # Organism: Neisseria meningitidis Z2491 # 2 176 4 178 199 191 52.0 6e-49 MIDQIINHNKTFVAQKGYEKYITDKYPDKKLAVLSCMDTRLTELLPAALGLKNGDAKIIK NAGGLVISAFDSAMRSLIVAIYELGVKEIMVVAHSHCGACHMNYDHFHHEMIARGITDET LNTIRKCGVDLDSWLEGFKDTHTSVRKTVDTIKTHPLVPKDVVVRGFIIDSETGELEEIY >gi|283510579|gb|ACQH01000040.1| GENE 13 15850 - 16143 326 97 aa, chain + ## HITS:1 COG:no KEGG:PRU_1181 NR:ns ## KEGG: PRU_1181 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 97 1 99 99 122 56.0 5e-27 MTRVFVMSTCPDCVQVKAQLADNPAYELIDIGEHVRNLKQFLALRDSNPAFAEVKKHGSV GIPCFLLEDGNVQFALDNITIEDAPEGAACSLDGKGC >gi|283510579|gb|ACQH01000040.1| GENE 14 16156 - 16419 309 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928137|ref|ZP_06421984.1| ## NR: gi|288928137|ref|ZP_06421984.1| 
transglycosylase associated protein [Prevotella sp. oral taxon 317 str. F0108] # 1 87 1 87 87 157 100.0 2e-37 MLEHIAEFTAWPILTGFLIGYIANRIMSGEGKGCCMNFIVGVVGSYAGTLISHLLNIELF GRGYITNFVFCVLGAVTVLWVWKKIFD >gi|283510579|gb|ACQH01000040.1| GENE 15 16424 - 16714 221 96 aa, chain + ## HITS:1 COG:BS_ysdA KEGG:ns NR:ns ## COG: BS_ysdA COG3326 # Protein_GI_number: 16079936 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 8 86 2 80 89 71 44.0 3e-13 MKGITSPVVLVYLIAVNALGLVLFGIDKWKAKHTKWRISEPTLLAIAAIGGSIGSWIGMK MWHHKTLHKKFKYGIPLIMMVQFALLLFALYKANSV >gi|283510579|gb|ACQH01000040.1| GENE 16 16935 - 18218 974 427 aa, chain + ## HITS:1 COG:no KEGG:Spea_1267 NR:ns ## KEGG: Spea_1267 # Name: not_defined # Def: hypothetical protein # Organism: S.pealeana # Pathway: not_defined # 2 425 5 412 416 144 27.0 8e-33 MTSKIGWIDFSPLHRERVKRFIELMEEDGVQDELGVGTIRDAMSNKLFPGFSTLHTRAKY FFITPYILLDRERKQRKSESGKDYFNRVEIETNNTIIKFYESCKERKEESYFGKFKKDGV LKRQPSEIYWNGITTLRLVNYDGTLDQLLRDKHSTIEELLSCNQGDDTVKEQGENNKPRV VDAGSADGWIENITEKGLTLTSVEAQILRDRLIKHTPNSLPTELLTNDEVWEVYKAAAYK DKGNEQITNAFINFVEKAYKLVENEELRTNLITAHDLSLFLYGPHIAYNLRLAEQVKAAE SVIQELRDMGIVWLETLEQRMIDYKRFDISNCMQDVNVKPTTRLFLKEVQQLVTQQEKWQ DIEPELCDRVEKQEERNKKTKSRFFKLKNNRVVDENEKGAWVGLGLINYRYTAALAVMKD VQEVLGK >gi|283510579|gb|ACQH01000040.1| GENE 17 18232 - 20040 934 602 aa, chain + ## HITS:1 COG:no KEGG:BTH_I0101 NR:ns ## KEGG: BTH_I0101 # Name: not_defined # Def: hypothetical protein # Organism: B.thailandensis # Pathway: not_defined # 7 602 6 622 625 194 26.0 9e-48 MSVLEKSNRLDYGLSLAPDEGWTTSWAIGTTYSLDLNVLLNIPLALFHGKYLSENSDVNN LRSDMLDALNKVKDRMFIFVHENNIHAKCKFSTLIGFIDQSIWPFKVETAYHNFHPKLWL VRYEKDNDKQTYKYRLIVMSRNITDATDFDIAVTMDSDAPSEEHHDNDSLQQMVVMLMKR TGQTKIINQIKKELKGIRFMAPYPFDKRAPKFFHHSNKSFTSPLISDDKYKELLVISPFI DNYSLERLSRKTETKPILISRELELDKCKPEVLEKWDCYQWNNMLEEASDYEENERAEDE SKPYGINLHAKIFIAKEISGRRNNWFVGSANCTRAGLEKNHEAMLQLSSNNEDTSPQNAL ETLLPRLITKYTPKEKVLQQEDGKEQMREIVFDLSQLTFKSQIIKGTNGRYAMEVAVESS 
EWKAFIGKYQGIKITMQLFASDLDRWIITKECCHKFKPLLCQQLSPFVSITIKNGEEEKK FLLKLPLEMPDERLGHIMAEILDNEEKIMRYFMFCLDPLTDKEQQKIGSGITRYNSHISK EYQDYSLPIYEKLLLSASRNRAALAEIGKSIKQLKNTYGKDGKPLLSESFKDMWKLFAAY IK >gi|283510579|gb|ACQH01000040.1| GENE 18 20037 - 23285 1562 1082 aa, chain + ## HITS:1 COG:no KEGG:Spea_1269 NR:ns ## KEGG: Spea_1269 # Name: not_defined # Def: helicase domain-containing protein # Organism: S.pealeana # Pathway: not_defined # 14 1069 16 1077 1081 495 33.0 1e-138 MKYEDKVKETYSDLRDFQKVTVDYLVQRMFDEGQGRMLVADEVGLGKTWIAKGVIAQAYV RWCKMANKKSKHFNVYYICSNQQLFTQNLGKVNFTGEKCCIANDINRISMLALKPLHENE PVQIYALTPDTSFSNKSKQGIKEERYILYAILQQTNEFNKVHLSNLLRGNGVQKQNWNPE SEKARYLQRVKNNVTEAYIKALKTNHIQRENLPTAFETYNLPEKITTWEMLKSVVEVFDR RKAYATIYNEIIGCMRKEMADVCLKHMDADLFVMDEFQRYSQLIDNNCQSEQDTIAQKIF GQSKVKVLMLSATPFKAYTNQYDEQLGEQHYKEFKRVLEFLYEGRPINWQHLEENRARLY SQMLLLNNADDKEMLLNEIQSLKKEVEDVYSRVIVRTEKLIASNDPNAMIVEPNKQPLNI TKQEVAYFVHLDKVFKTIYEKQGEHAPSPVNYAKSAPYAMSFLREYKVGEKAQHNNVTMR KDAFVDLNEVNKYNFPRDGHWPHEKLRLLMKNMKQESKLLWCPPSLPYYPCEGAFKGQEQ FTKTLVFSAWKLVPKMIATLVSYEAERQTLGKLNHIKGVKYFADKRVASLGDKARKPTRR LVPSKEGPMTTLLLAYPCITLAKKLNPAQFIQSGKTLKQIVEQETQEWETQIKAFCNKEG KVYNHEDSHISWTYPMICDSNLEGRWEQETAWEIDDQDKNSALVTHNIPRLKNILTKEEK INIPKNPTKKELHQLSKLMATMSIGSPGVCAYRSLCQYYPDNNETFSCAFKIGRAFIDLF NKPESTAVVELLYTKQKSMEYWQKVIQYCAGGNLQAVLDEYVFMLSSNCDGIQKLTEKLC SALTIRTSTFKVDDAQSFGKKENKENNIRYSMRSHFAAPFGVNAESSQGSEKRATDIREA FNSPFKPFVLATTSIGQEGLDFHWYCRKVMHWNLPNNPIDFEQREGRVNRYRGMVVRQRV AEKYHANVANTNKLWDTLFELAEKDKADAKFPCDLVPNWHFESKGVSIERVVPLYKFSQD IQRYEKMLHVLGLYRLTFGQPRQEELVEALNCEISAEELDKLLIDLCPMRRAEGKTKNKP KA >gi|283510579|gb|ACQH01000040.1| GENE 19 23386 - 23880 391 164 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116492196|ref|YP_803931.1| acetyltransferase [Pediococcus pentosaceus ATCC 25745] # 1 159 23 182 185 155 44 3e-37 KYASNPNVGPIAGWPPHTGVENSREIIKNVLSAPETYAVVLKETGEVVGSIGIMTSKSEI HSARIADNECEIGYWIGEPYWGQGLIPEGVNELLRYAFENLQHTTVWCGYYDGNEKSKRA QEKCGFVYSHTEENKPVPLMNDFRTEHFTKITLEDWKKRIGGLK Prediction of potential genes in microbial 
genomes Time: Sat May 28 00:52:07 2011 Seq name: gi|283510578|gb|ACQH01000041.1| Prevotella sp. oral taxon 317 str. F0108 cont2.41, whole genome shotgun sequence Length of sequence - 28086 bp Number of predicted genes - 22, with homology - 20 Number of transcription units - 13, operones - 5 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 214 84 ## + Term 286 - 339 3.5 2 2 Op 1 . - CDS 320 - 877 293 ## COG0645 Predicted kinase 3 2 Op 2 . - CDS 901 - 1950 768 ## COG0337 3-dehydroquinate synthetase - Prom 2008 - 2067 3.7 + Prom 1921 - 1980 7.7 4 3 Tu 1 . + CDS 2194 - 5166 3149 ## PRU_1680 hypothetical protein + Term 5346 - 5387 10.0 - Term 6387 - 6434 2.6 5 4 Tu 1 . - CDS 6608 - 7147 362 ## COG4283 Uncharacterized conserved protein - Prom 7167 - 7226 4.3 6 5 Tu 1 . - CDS 7286 - 8500 1033 ## COG1760 L-serine deaminase - Prom 8660 - 8719 6.4 + Prom 8850 - 8909 4.0 7 6 Op 1 . + CDS 8929 - 9831 230 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit 8 6 Op 2 . + CDS 9836 - 9991 81 ## - Term 9980 - 10015 2.2 9 7 Tu 1 . - CDS 10027 - 11076 1174 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D - Prom 11103 - 11162 4.9 + Prom 11043 - 11102 5.0 10 8 Tu 1 . + CDS 11300 - 12697 1300 ## COG1904 Glucuronate isomerase + Prom 12700 - 12759 3.0 11 9 Tu 1 . + CDS 12902 - 13639 875 ## PRU_2725 thioredoxin domain-containing protein + Term 13703 - 13747 -0.2 + Prom 13649 - 13708 2.5 12 10 Tu 1 . + CDS 13784 - 14602 616 ## COG4099 Predicted peptidase - Term 14729 - 14785 8.9 13 11 Op 1 . - CDS 14913 - 15698 755 ## PRU_1171 prephenate dehydrogenase (EC:1.3.1.12) 14 11 Op 2 . - CDS 15703 - 16767 1179 ## COG2876 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase - Prom 16991 - 17050 2.8 15 12 Op 1 . - CDS 17053 - 18243 1219 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 16 12 Op 2 . 
- CDS 18224 - 19060 713 ## COG0077 Prephenate dehydratase - Prom 19165 - 19224 5.9 + TRNA 19606 - 19680 75.9 # Arg TCT 0 0 17 13 Op 1 . + CDS 20001 - 21200 1375 ## BDI_0255 hypothetical protein 18 13 Op 2 . + CDS 21190 - 21747 676 ## BDI_0254 hypothetical protein 19 13 Op 3 . + CDS 21731 - 25381 3351 ## BDI_0253 putative DNA repair ATPase 20 13 Op 4 . + CDS 25386 - 26225 653 ## BDI_0252 hypothetical protein 21 13 Op 5 4/0.000 + CDS 26245 - 26769 616 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases) 22 13 Op 6 . + CDS 26773 - 28068 864 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair Predicted protein(s) >gi|283510578|gb|ACQH01000041.1| GENE 1 2 - 214 84 70 aa, chain + ## HITS:0 COG:no KEGG:no NR:no TYRVILRHSPAMPLKSDVLFVIPYLLYYIYCNNSVFKRIDSDTDMCRIVGNIIILPSLLN MCIVRTLDNF >gi|283510578|gb|ACQH01000041.1| GENE 2 320 - 877 293 185 aa, chain - ## HITS:1 COG:DR0609 KEGG:ns NR:ns ## COG: DR0609 COG0645 # Protein_GI_number: 15805636 # Func_class: R General function prediction only # Function: Predicted kinase # Organism: Deinococcus radiodurans # 5 91 75 161 162 98 51.0 9e-21 MNNAILFILSGLPAVGKSTLAKFIVKEFGAVYLRIDTIEQGLKDLCHIHVEGEGYRLAYR MASDNLHLNNNVVADCCNPIELTRKEWEDVANKNGCRFVNIEIVCSDKVEHRKRAEQRRS DVANMKLPTWDDIEKRTYQEWQTNRIVIDTAGKNIRQCGTELKEKVSEILSPITGNKDVE SIECC >gi|283510578|gb|ACQH01000041.1| GENE 3 901 - 1950 768 349 aa, chain - ## HITS:1 COG:BS_aroB KEGG:ns NR:ns ## COG: BS_aroB COG0337 # Protein_GI_number: 16079327 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate synthetase # Organism: Bacillus subtilis # 52 346 66 357 362 176 38.0 5e-44 MQEVLISDNLERDLAHAIDVCAPDKLFVLTDITTREYCWPKIEGFDCVKGAKLITIPSTD EHKNIETLSMVWEMLGEQEGTRCSCLINLGGGMVTDLGGFAAATFKRGINFINIPTTLLA MVDAAVGGKTGINFNGLKNEIGAFCNARFVLLNTLFLDTLDDKNTRSGYAEMLKHGLISN ERMWAELLNFDLGEPNWGRLQQMVAQSVRVKEEVVAKDPHEQGLRKALNLGHTVGHAFES MALHRGQPVLHGYAVAFGLVCELYLSTFKTNFPTSKMRQTVQFIRSYYGTFPLSCNDYDD 
LIELMQHDKKNTSAGINFTLLEDIGKPIINQTATRDEIKEALDFFRETM >gi|283510578|gb|ACQH01000041.1| GENE 4 2194 - 5166 3149 990 aa, chain + ## HITS:1 COG:no KEGG:PRU_1680 NR:ns ## KEGG: PRU_1680 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 950 1 943 986 915 53.0 0 MNKYCLFILLCFVAAVTKAQERKISGTLTDRDTHEPVTQTTVQLLKSDSTFVVGAISNEK GEFALSAPADGNYLLKISSVGYITYFKKLHIDGKDLALGQIVMNGDAIMLKGATVTGQAV KVTVREDTFVYNSAAFRTPEGSTIEELVKRLPGAQISDDGKITINGKEVKKILVDGKEFM TGDTKTALKNLPTSIVDKIKSYDERSDLARVSGIDDGDEQTVLDFGIKKGMNKGVFSNVD LALGTESRYANRLMGAYFKDDYRAMLFANANNTNDMGFPGGGGRGNFGRNRDGLNASKMV GTNFNYEKKNKLKIDWSVRWNHSDGDVQSKVATENFVSTAASFSNSLNQSYTRGNQWNAR GRLEWQPDTLTNIMFRPSFSYSTSDGTSSSTSAAYNDNPYNYVSDPLSQSTITQLAAQNL MVNTRLQNSISYTQNKKAGGMLQLNRRLSASGRNVTLRADVDYADTDNKSLALTDVHLFQ LKNKAGQDSVYQTNRYNLTPTTAWSYSLQATYSEPLWQGTYLQFRYKYGYTRTTSNRSTY DFSNLGEGFFSSLSPLYRGWNSYLSLLTNPYERYLDQRLSRSSAYTNYTHELEMMIRFVG KRYKFNAGFMVQPQSSHFTQTYLGQNADTMRHVFNVSPTLELRYKFSDVSQLRINYRGTT SQPAMSDLLDIVDDSDPLNIHRGNPGLKPSFTNSLRFFYNTYKERRQQAFMSFLDYSNTR NAVSNRVTYNETTGGRTTQPDNINGNWNVNAGLMFNTAIDSAGVFNVNTFSNLGYNNYVG YVSLNPTSGSQRNVTRSLSVSERLATSYRNSWLELELDGSFTYAHSKNQLQLQNNLNTWQ FAYGATLNFTLPWGMQLSTDLHQNSRRGYADRALNTNELVWNAQLSQSFLKGSPLSVSLQ FYDLLRQQSNFSRSINAYQRTDTEYNSINSYAMLHVVYRLNLFGGKEARQQMRNHGPNND GDGPVGPPPGGTPPGGRRPFGSGRPMGGGF >gi|283510578|gb|ACQH01000041.1| GENE 5 6608 - 7147 362 179 aa, chain - ## HITS:1 COG:FN1248 KEGG:ns NR:ns ## COG: FN1248 COG4283 # Protein_GI_number: 19704583 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 178 25 202 204 186 54.0 2e-47 MTRATTKTDLTTSANGQFEKMWKIIDSMSEELQTATFSEEMAVAGKEAHWGRDKNLRDVL VHLYEWHHLLLDWVQANSNGKPKPFLPEPYNWRTYPAMNVEFWKKHQSTPLTEAKAKLKE SHKDVMVLIENYSNDELFAKGALDWTGTSPLGSYCVSVTASHYDWAMKKIKMHIKTSKK >gi|283510578|gb|ACQH01000041.1| GENE 6 7286 - 8500 1033 404 aa, chain - ## HITS:1 COG:FN1106 KEGG:ns NR:ns ## COG: FN1106 COG1760 # Protein_GI_number: 19704441 # Func_class: E Amino acid transport 
and metabolism # Function: L-serine deaminase # Organism: Fusobacterium nucleatum # 1 398 1 397 408 431 52.0 1e-120 MRSLKELFRIGKGPSSSHTMGPQKASQIFKERNPQAASFEVTLYGSLAATGKGHMTDAII VETLEPTAPVELVWQPSVFLPFHPNGMKFVAKDAKGDPVDEWTVYSIGGGALSEGKKEGD LFDTKDIYDMNTLTDIMRWCEKSGRSYWEYVEQCETADIWDYLHEVWTVMQAAVERGLNR EGVLPGPLGLPRKAATYYVKASGYKQTLQTRGLVYAYALAVSEENASGGTIVTAPTCGAS GVLPAVLYHLSRGHNFSDTRILHALATAGLFGNVVKKNASISGADVGCQGEVGVACAMAS AASCQLFGGSPAQIEYAAEMGLEHHLGMTCDPVCGLVQIPCIERNAFAATRALDSNLYSA FSDGRHRVSFDRVVEVMKQTGHDIPSLYKETSEGGLAKDFMMGS >gi|283510578|gb|ACQH01000041.1| GENE 7 8929 - 9831 230 300 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 51 290 41 279 285 93 30 2e-18 MWRKNMNNKNKGQDYDRYDVTEESTLLDWLLRNVKGKSRNKLKDILRGHGVCVNGKCVTQ FDYQLVPGMKVTVSRTKKAVPFKSRYIKIVYEDRYLIVVDKSVGVLSMAAGHSSLNVKAI LDTYFENSGQKCRAHVVHRLDRETSGLLVYAKDMETEQILEHNWHDMVYDRRYVAVVSGE IVDEGGTISSWLKDNSSYITYSSPVDNGGKYAVTHFHVLKRTTTHSLVEYKLETGRKNQI RVHSADIGHPVCGDLKYGNGDNPIGRLCLHAYMLCFYHPATRERMEFNTPIPQTFKQLFK >gi|283510578|gb|ACQH01000041.1| GENE 8 9836 - 9991 81 51 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSSLSYFIFVLVAIIIGVVVIKKVTGCLIRFVVVSVLLAVLAYLYFTYFSA >gi|283510578|gb|ACQH01000041.1| GENE 9 10027 - 11076 1174 349 aa, chain - ## HITS:1 COG:YPO2161 KEGG:ns NR:ns ## COG: YPO2161 COG0252 # Protein_GI_number: 16122393 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Yersinia pestis # 12 348 6 335 338 287 44.0 2e-77 MASFTYQRTNKVLLIYTGGTIGMGKNPATGALEPLDFNCLVDNLPEFKYLKTGIEVYQFA SPIDSSDMSPRLWAHLVGIIADNYDRYDGFVILHGTDTMAYTASALSFMLENLTKPVVLT GSQLPIGQLRTDGKENLVTSIDLAASHHEDGTPMVPEVCIYFSGKLLRGNRSTKQNAEGF NAFESFNYPSLCDAGINFNFHHHEILKPNYEQPMVPHTALDSNLTILSIFPGIQENIVRH MIEAPELRSIVMRSYGSGNAPQHPWLMKLLKEAANRGVVVVNISQCIAGSVEMERYDTGF QLKNAGVVTGYDSTVEAALAKLMFLQGMYRDNGMVRKLMHKSIAGEISV 
>gi|283510578|gb|ACQH01000041.1| GENE 10 11300 - 12697 1300 465 aa, chain + ## HITS:1 COG:ECs3974 KEGG:ns NR:ns ## COG: ECs3974 COG1904 # Protein_GI_number: 15833228 # Func_class: G Carbohydrate transport and metabolism # Function: Glucuronate isomerase # Organism: Escherichia coli O157:H7 # 3 464 4 466 470 397 41.0 1e-110 MLYIDENILLETDFAQELYRKHAQRVAVIDYHSHISPHDVAVNRTFDTITSIWMENNPYV WRAMRANGVEEKYCTGTEVDDWQRFEKWAQTMPYLMRSPLYLWSHLALKSVFGIDEPLNA QSARRIFDACNEQLQGGGLSARSLLRRFRIACVCTIDDPIDTLDHHRKLKSEGCDIQMLP TWRPDKAMNIEKKEFRGYVKQLGEVANIDISTFESMMDALQRRHDFFAQNGCCMADHQLE EFYDEEYTDSQIEIIFEKAMRGQDLSRGEVRQYKHCFLKRTAEMNQESNWTQQYHYGILL DNNSLLYDTLGQGVGADSIGEPYTAHAMSHFLDDLHTRGKLTRTILTCMNPADNDMLCSM TGNFQEAGIPGKLQFGMGWWSNAQPGDINEQINALSRFGLLGRSIGVSTGSRSLLSLVRH EYFRRLLCNLLGNDVAKGLLPADIDSLGILVEDVCYNNAHRFFGF >gi|283510578|gb|ACQH01000041.1| GENE 11 12902 - 13639 875 245 aa, chain + ## HITS:1 COG:no KEGG:PRU_2725 NR:ns ## KEGG: PRU_2725 # Name: not_defined # Def: thioredoxin domain-containing protein # Organism: P.ruminicola # Pathway: not_defined # 15 227 5 201 362 69 26.0 1e-10 MTNNMNRTFKTLVALLLTALCFVQCSQTDNKSLRIKGKLEGVKDSLLFSIMAIENGDPVA HDTIISTKGGEFDFTMHLDSAAVMTVSDKANDKYDTLAGFYCVAVPGEEMVLKGDINKGW SFGGSKFYRELNEVEQAVQPINLEIRDFTAEWNKLMAAGKINDDVEKKFSDREEALYKKL ETACLAFAKAHPDMEAAVSLLNYLNVESSEKLVKLLNPKVLNGRMKAYVNPRINQPDIEP LSVPE >gi|283510578|gb|ACQH01000041.1| GENE 12 13784 - 14602 616 272 aa, chain + ## HITS:1 COG:SMb20552 KEGG:ns NR:ns ## COG: SMb20552 COG4099 # Protein_GI_number: 16264279 # Func_class: R General function prediction only # Function: Predicted peptidase # Organism: Sinorhizobium meliloti # 3 242 2 243 243 120 30.0 3e-27 MKMRKVLMLLAFFALCLAARAQKTDKGNFVTKVNAIPKGYNFWVYTPTEYEVDQHPLPLV IFLHGASLCGNNLQRVRRYGVLDAIDRGKIIPTLVVAPQNPGGAWSPQKINDLLEWTKAH YKVDTTRVYVLGMSLGGYGTMDFVDAYPEKIAAAMALCGGCSRNDVSGIGLVPMWIMHGT ADRAVPMKQSEVVVNKLKAQGNDKLLRYDWIQGGSHGLLARLFYLQKTYDWLFSHSLRDS PRTIDMHFEITRADINQTYQELKRLPSWYDND >gi|283510578|gb|ACQH01000041.1| GENE 13 14913 - 15698 755 261 aa, chain - ## 
HITS:1 COG:no KEGG:PRU_1171 NR:ns ## KEGG: PRU_1171 # Name: tyrA # Def: prephenate dehydrogenase (EC:1.3.1.12) # Organism: P.ruminicola # Pathway: Phenylalanine, tyrosine and tryptophan biosynthesis [PATH:pru00400]; Novobiocin biosynthesis [PATH:pru00401]; Metabolic pathways [PATH:pru01100]; Biosynthesis of secondary metabolites [PATH:pru01110] # 1 256 1 256 264 388 72.0 1e-107 MKILILGAGKMGSFFTDLLSFEHEVAVYDQEPRKMRFTYNCARFTSLEEVKQFEPQLLIN VVTMKYTIAAFESVMPFLPQQCIVSDIASVKTGLYEYYEGCNHPFVSSHPMFGPTFANLN QLSDENAIIISEGDYMGRIFFRDLYSRLGLNIHEYTFEQHDRTVAYSLSIPFVSTFVFAA VMKPQDAPGTTFKRHLNIARGVLGEDDFLLREILFNPYTAEQVGHIRDELNNLLDIIDRK DADGMKEYLVKIRKNVVKGKE >gi|283510578|gb|ACQH01000041.1| GENE 14 15703 - 16767 1179 354 aa, chain - ## HITS:1 COG:PAB0297 KEGG:ns NR:ns ## COG: PAB0297 COG2876 # Protein_GI_number: 14520663 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Pyrococcus abyssi # 19 244 29 247 265 158 38.0 1e-38 MELDLLPLNLPSDNERPIVMAGPCSAETEEQVMQAANALAQKGCHIFRAGAWKPRTKPGG FEGNGEAALPWLKRVKEETGMLIATEVATPEHVELCLKHGVDILWIGARTCANPFAMQAL ADSLRGMDMPVLVKNPVNPDLELWIGGMERINQAGIKRLAAIHRGFSSYDNKMYRNSPMW QIPLELRRRIPALPIICDPSHIGGRRELVAPLCQQAMDLGFDGLIVESHCNPDKAWSDAK QQVTPDVLDYILSLLIVRDKSQMQEGIIQLRQQIDDLDNQLMELLAKRMRVCRQIGQYKK EHNMTVFQASRYNEILEKRGAQGALCGMSPEFVATVFESVHEESVRQQIEIINK >gi|283510578|gb|ACQH01000041.1| GENE 15 17053 - 18243 1219 396 aa, chain - ## HITS:1 COG:aq_273 KEGG:ns NR:ns ## COG: aq_273 COG0436 # Protein_GI_number: 15605813 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Aquifex aeolicus # 14 392 5 382 387 262 37.0 9e-70 MKMGNNKTLDIKPAQRLDTVQEYYFSRKLKEVAQLVAKGNDIISLAIGSPDLPPSANTIA RLCEVAQREDAHGYQPTKGTPELLKAMAAYYGRWYGVDVNPQSEVLPLIGSKEGILHVTL AFANVGDQVLVPNPGYPTYTSLSKLLGVEVLNYNLREENQWQPDFDELESMDLSRVKLLW TNYPHMPTGGKPMMKTYERLVDFARKHNIVVVNDNPYSLILNSEPMSIMQVDGAKDCCIE 
LNSLSKSHNMPGWRVGMCVSNATWVQWILKVKSNIDSGTFRGIQLAAAEALTGNDTAWHT IYNKEVYLRRRTIAEKIMKALRCTFDPQQTGLFLWGRIPDNMADAEQLTEQLLHHHGVFV TPGFVFGSNGKRYIRISLCAKEERMEEALQRIVGNK >gi|283510578|gb|ACQH01000041.1| GENE 16 18224 - 19060 713 278 aa, chain - ## HITS:1 COG:ECs3462_2 KEGG:ns NR:ns ## COG: ECs3462_2 COG0077 # Protein_GI_number: 15832716 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydratase # Organism: Escherichia coli O157:H7 # 3 273 1 271 282 147 32.0 2e-35 MKRIAIQGIAGSFHDIAAHQYFATEQVQGIYCNTFEEVFNQIANDPTVIGMVAIENTIAG SLLHNYELLRASGTTIVGEHKLHIEHSICCLPEDDWHSLSEVHSHPVALMQCREFLARHP KLKAVEAEDTAGSAEFIARTKQRGWAAICNASAAKLYGLKVLEDHIEDNKHNFTRFLVVC HPQRAGSLRPWEHANKASLVFSLPHEEGSLSKVLTILSFYNINLTKIQSLPVIGHEWEYL FYVDVTFDNVTRYHQSIDAIVPLTKRLKILGEYEDGKQ >gi|283510578|gb|ACQH01000041.1| GENE 17 20001 - 21200 1375 399 aa, chain + ## HITS:1 COG:no KEGG:BDI_0255 NR:ns ## KEGG: BDI_0255 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 7 399 4 398 398 284 41.0 3e-75 MDILTHFRTIEELTKTLAREQTLLSEMFEKRKLLKFPRGLAIDLVGGNETRLKRLVDYGV LVETGNTVEIESDYLNFFEEVLNVNEEISVLSVQECINTLKEHIGYFLQENNVNRKAGYQ DSVRQLLKKTGFRTLKNVVDLKRNMDNAYKQEPTYNIKKQRLKNLDEKSHSIRAMIKECE KLMDTEQAFFIMANDPHMAKTCSDVRHDFVEAYHALMEIDRQIIVYINQIDLQNQLYKKI RKLKYLQDQLLIKTDTNIVQVLEDTNPLWMEGRQYSKIRLSLEMLRENEGVAKVLRRIAE HNGVQKTARTEAEPLTDEDLQEHVQQLKEVDPTEVWNAFAASGYNLFEFVLAYNYKVERN IEDHATLFCQLVILHADECKLTGEYATYQDIEYPIVYAK >gi|283510578|gb|ACQH01000041.1| GENE 18 21190 - 21747 676 185 aa, chain + ## HITS:1 COG:no KEGG:BDI_0254 NR:ns ## KEGG: BDI_0254 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 5 185 4 183 183 142 47.0 5e-33 MRNNTQRIYERLSRGEFLSVDSTDSTVRHLYEDIEENMNDYADYFKEIGLRLETGNGYFY LSRTVENKQAIENKLESFAKWLDYLDFLKCYNQSFTAGYQFRKSNLIEQISLDIELKEKA GHLFKKYGVGSNLEIVNKLLQEMGNMGFAECVSEQDETYKVTSAFRYAEELVNMIQIANE DEVPE >gi|283510578|gb|ACQH01000041.1| GENE 19 21731 - 25381 3351 1216 aa, chain + ## HITS:1 COG:no KEGG:BDI_0253 NR:ns ## 
KEGG: BDI_0253 # Name: not_defined # Def: putative DNA repair ATPase # Organism: P.distasonis # Pathway: not_defined # 1 1213 1 1218 1221 738 39.0 0 MKFLNKIVFVNSANIPYAEISVDGNVHFTGTQGVGKSTVLRALLFFYNADKHRLGIQQGQ KSFDEFYFRQSNSHILYEVMRDNGAYTILVSRYQGRASWRFIDAPYQREWLVDEDKQVLS DWVKIRERIDKNIAVTARIDSGVMFRDIIFGNTRDNKYTRYALLQSTHYQNIPRSIQNVF LNTKLDADFVKNTIIQSMTEEELPIDLQTYRRLVTDFEREYDEIDCWFRQARDGSYPVRQ QASKIAEQGRKVVALDQQLRDVWHMLNHAVADNEQQIPLLEADAAETKTAIEKERNREKE LTAEYNKEKDSLNQKLGGKKQKLDEVLQARKEFEAMGIEDKLKLASREDTIKKEADNKQT LLNDLLKTHASIEEKYNIAKAKLENAQQAFDNDQKERFYQEQDKIQAERKRLEDERTKNR NQRTDAFAAWRHESDERLQALQAEQHRADAALKELKSWHPLANEMQTIAEQIRTLDVKEK ENAAQQAVVKSQMEQLTAEFEMKEAELKKIAQREQEEFEASRSSCREQIAKIEGLLSHLD GSLYQWLNENVEGWENTIGKVVDEERVLYAQGLEPQVDTLAEGLFGVRLNLSDVDATHRT PDEYRKEKKRLEEQLQQINRKLNDLPLTLEENTAKLGKKYATMLNPLRQKATHLRVEEEQ IPTKRQDLQNQQHKQKMDEQEMIEREKEVRERAFNAALLCVQAEKDERERHETQHKKELK ELDLHFNRQSKEFDGKLSAFKTALETETKARQKDFYNQKKQLEEQQKAELAGKGVDVNLL GQYRQAIEQLEALLQQIDNERPMVIRYRDAQQNLFAKEPEIRKDIKDIEQQLARMSKRYE DKKTRVEKKRQEQEERQKTIGKDLARRREGLNLYHQMVENEHLVPETLLADNKPVATKQD CQQLLSLLRGSVNQKRETIERLKDFVVNFNRNFKPQNAFHFNTMPITDNEYLQIAADLQD FMDNNKIEEFRRRTSEHYKDILGRISTEIGSLLKRRSDVDGVILDINRDFVEKNFAGVIK SIELRADESSDRLMQLLMSIHNYTAENAFSIGELNLFSDDNRDEVNRKVVDYLKSLSHQL QNEPSRTTVSLGDAFRLQFRVKENDNDTNWVERINNVGSDGTDILVKAMVNIMLINVFKK KAARKSGEFIVHCMMDEIGRLHPNNIKGILQFATSRNIYLINSSPTSYNPYDYRYTYLLS KQGVKTRVEKLLKRNP >gi|283510578|gb|ACQH01000041.1| GENE 20 25386 - 26225 653 279 aa, chain + ## HITS:1 COG:no KEGG:BDI_0252 NR:ns ## KEGG: BDI_0252 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 4 279 17 290 293 220 43.0 5e-56 MAIGKAIQNLVAGEQVAGSKIGGKLLEELLSEGLLVVVARGSRKSYRARNIEMLKRFLTD RDESYRVLEVENAETRAAMANKTGNSKLMTIRSCPGFPVNSYEPIACLLNDEPFVVNPHE GSFLFVTQWEGFRIPADVIVIGIENMENFRLIRKQKKFFENYLRAHALPTKTLFVSRYPQ SNDLRKWLTTIPNRYLHFGDFDLAGIHIFLTEFERHIGTERASFLIPSDIADRLKSGSSR RYDEQHAMFKEMQTDVRELGQLITLIHQERKGYDQEGYI >gi|283510578|gb|ACQH01000041.1| GENE 21 26245 - 26769 616 174 aa, 
chain + ## HITS:1 COG:ECs1678 KEGG:ns NR:ns ## COG: ECs1678 COG1974 # Protein_GI_number: 15830932 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Escherichia coli O157:H7 # 37 172 7 139 139 112 37.0 3e-25 MPRRNYPKNQACRARGSSSELDKRESIYMTSIQIIRGDFQKELKLPIAEGVKAGFPSPAD DYIHETLDFNHDLIRNPEATFYGKVEGDSMIEAGICNGDIAVIDRSIEPRDGDVVVGYIN EEFTIKYLDLSHRQEGYIELRPANSNFKPIRIDEDDAFEVWGVIVWTIKNWRKY >gi|283510578|gb|ACQH01000041.1| GENE 22 26773 - 28068 864 431 aa, chain + ## HITS:1 COG:STM1997 KEGG:ns NR:ns ## COG: STM1997 COG0389 # Protein_GI_number: 16765333 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Salmonella typhimurium LT2 # 1 431 1 422 422 292 37.0 9e-79 MYGIVDCDNCYVSCERVFRPDLNGKPVVVLSNNDGCVVARSNEAKRMGIKAGMPYYQLAE LFPNQDIAVFSSNYELYGELTGRVVEIIRKESPAYFRYSIDECFVYFNGIGQLDLKDWGE KLHKRIKRSVGIPTSIGIATNKTLAKMAGHFAKKHLGYHHCCLIDNDEKRVKALKLFPIS EVWGIGRRYATRLQAMGVQTAFDFAERNLSWVRATFKNIVVERTWRELNGEDCIPNEELA KKKSICMSRSFNGMITDLDGLRTHVSNFAARCAEKLRRQGTVASIVGVFLHTNAFREDLP QYWNFQEMRLITPTNNSIAIVKAANNVLQKLYRQDYHYKKAGVIVMGIAPNAPIQQDFFD ITAEQIEKLRRLDAIVDHLNKINGTEFIVLGSQQYTSKGCEGKAYSFANAIKHEFRSKNP TTRWKDIITLK Prediction of potential genes in microbial genomes Time: Sat May 28 00:53:01 2011 Seq name: gi|283510577|gb|ACQH01000042.1| Prevotella sp. oral taxon 317 str. F0108 cont2.42, whole genome shotgun sequence Length of sequence - 1798 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 146 - 205 2.9 1 1 Tu 1 . + CDS 244 - 444 172 ## - Term 302 - 355 3.1 2 2 Op 1 . - CDS 369 - 938 226 ## gi|288928165|ref|ZP_06422012.1| hypothetical protein HMPREF0670_00906 3 2 Op 2 . 
- CDS 951 - 1535 174 ## PAU_02437 trypanothione synthetase domain protein precursor - Prom 1596 - 1655 5.2 Predicted protein(s) >gi|283510577|gb|ACQH01000042.1| GENE 1 244 - 444 172 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGNGTVNTYAYDRQSKRLLECIKSQDDFIELKMKSSLIHSRFYFDFLASLASDTLCNISN FALTYK >gi|283510577|gb|ACQH01000042.1| GENE 2 369 - 938 226 189 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928165|ref|ZP_06422012.1| ## NR: gi|288928165|ref|ZP_06422012.1| hypothetical protein HMPREF0670_00906 [Prevotella sp. oral taxon 317 str. F0108] # 1 189 1 189 189 379 100.0 1e-104 MNKTMIFITCVLFYLFCGRSNAHSMDSVQQCQDSLFIFTEENDTSCVSLLMFRHGCFEEE LIRVYCLDGSSIDFLTNLTGQDGRGKIGVYAVYNAATDESKFLFFDYLAKQAYITPAYFS ESLPVYTSLNLKRGYVILRTISPPSNRCQGALMEETQAHTTNGKNYLYVKAKLEMLHKVS LANDAKKSK >gi|283510577|gb|ACQH01000042.1| GENE 3 951 - 1535 174 194 aa, chain - ## HITS:1 COG:no KEGG:PAU_02437 NR:ns ## KEGG: PAU_02437 # Name: not_defined # Def: trypanothione synthetase domain protein precursor # Organism: P.asymbiotica # Pathway: not_defined # 1 193 1 194 202 208 52.0 1e-52 MVKKVIYYTLSGIILLMISYWCVTHVNLNPTLSRGVPIDSLDGVYVYYNGGTGQSSGRNI IDGYNVGVRYQCVEFVKRYYYLHYHHRMPDTYGHAKAFFDKKLSNGSLNTARALFQYTNG DHVKPQKGDLLVFDSYILNPYGHVAIVSRVTDSNIEIIQQNPGPWGKSRACIELQRNGKS WQIKNNRILGWLRK Prediction of potential genes in microbial genomes Time: Sat May 28 00:53:18 2011 Seq name: gi|283510576|gb|ACQH01000043.1| Prevotella sp. oral taxon 317 str. F0108 cont2.43, whole genome shotgun sequence Length of sequence - 6286 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 229 - 2271 1058 ## PRU_2411 hypothetical protein 2 1 Op 2 . - CDS 2322 - 2486 65 ## - Term 2758 - 2803 8.5 3 2 Tu 1 . 
- CDS 2853 - 6080 1292 ## gi|288928169|ref|ZP_06422016.1| hypothetical protein HMPREF0670_00910 - Prom 6210 - 6269 2.0 Predicted protein(s) >gi|283510576|gb|ACQH01000043.1| GENE 1 229 - 2271 1058 680 aa, chain - ## HITS:1 COG:no KEGG:PRU_2411 NR:ns ## KEGG: PRU_2411 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 21 680 9 662 664 474 41.0 1e-132 MKRFYTFLFLVFCSLSLCAQYEEKSVVLDEVTVKAAMVVNRADGQTIYPTEALRNASNNG YSIMQKLALPNIRVDNVAHSISAIDNRGDVQIRINGIIIGKEEMLALDPKLIRKIEFIDN PGVRYGEDVAYVIDIKTRMISGYTLGTNLTTALTSFNVDGMVYGKWNTGKSLFALSYNAS AGRFDKQKVNEIANYTLSDGYLYTIERNDLESLKKEVSHNVKLTYSLADTTAYLFQATLY KNFRNIPKDYHLRRITDGLQQYEATLSNSSRGSDLGIDLYFFRQLTPCQSITANTVGTYI STTNKNYYDEGAPYQYKVNGKTASALSEVVYENRLKPFTLSAGLNHRYKYTKNDYIGDAS ALAEMNQHNLYIFSEIKGTLKDFRYVLGLGSSYIHYFQNKHNYDFWTFRPIALLSYNFPH MMKLRYSFEIEDRASRIAMINDATIQTNSLEMTVGNPSLKPGRDIDQSLRLSYNNQRWNT YVEGFYRHCIKPNMVHYERTSDGKFIYTQINQKEMDLLRLAGYVSYWILPEKLQASVYGG MQRCFNYGNDYSHFYTSWFYVGSVTAYLGNFTLHGYIDNGNRFLEGEKRGYNGAYSTIQA SYQYKDWQFSLLWTNPFIKNYKSIEDEILNRDLHKLTTVYDRASGNQVSISVSWRMSRGK KYVSVDKTINLKDTDDGIMK >gi|283510576|gb|ACQH01000043.1| GENE 2 2322 - 2486 65 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPQNKNATIRYQALDRCFNEFCYKFNRRYFGDRLFDRLVGSITYTPGFKHRVYN >gi|283510576|gb|ACQH01000043.1| GENE 3 2853 - 6080 1292 1075 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928169|ref|ZP_06422016.1| ## NR: gi|288928169|ref|ZP_06422016.1| hypothetical protein HMPREF0670_00910 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 1075 47 1121 1121 2094 100.0 0 MIGTRGVVLDMLFNPLGFKRGYDWDSNVGFNLFAGYEYGQQKLANEWHYGSYSGNYVGYR TGAQVWLKLCNDLRLNVEPSYSFMELYNAVNERKQYDEIGLKLGLAVLFRDKAHREKLLI PVDSVPASMRMSPNRGFFIGMGFGWNTMVRAWRYTGGDDLALKNAEFFAGYNFDPFHGFR LSAEYLGDRIHTLNELGNGLNNENLQNMMLSLDYQFNIMNAFAGYNPYRRWNVYLYGGPT WARGDARQLAFNFGGMLTYSLTSSLALFYNHTVYRMPKGRYENNQIYGNDGTYVNSLNVG VLYTINQPIRDVLLGISQLTKSDYSRQPLTFEYSVGPSWYNNLPISTGSSLGYTANTYLG LWLNSAIGMRGGVHISNADWFHDLNEPRKNQLGLLAGTVDLMFNPLGIRGKYDWNMPFGF NLFMGTGLGKIRFVTSRMTAYETKFHEFRMGAQLWMKMTDDLRFNVEPTFSRMRGFEHGR VAKYVDELALKLGVSMILRDKADKDKTIELDSATQARVRSPYGIFFGGGFGWNTTVHTWR HTGRGFDLLKNGLLFAGYNFNEYHGVRLGGEYLYDYVWNDYNGPGIYEKQEFKNTMLSLD YQFNILNAIAGVRPGRRWEASLYLGPSLALGEKGTDLGWNFGGILSYQLTKELSVFYSHT VYRMEKNRYKTAQVYRTPGTYVNSLNVGIIYNMSGSLGDGVGSFTCDYEHKPVIFEYNVG PTWFNGLEVSRASSMGFTANANLGWWLNSALGVRGGIHVSNADWTNDASGEQRYLLRFYA GTLDLMFNPFGFQKKYDWNSTAGLNLFAGAGTGNIRFVTGQSSTHESKFNEWRLGTQIWL RLADNLRFNIEPTYSLLRGFEKVQSVAKTDELSLKFGLSLLLGKNLANNEDVSSSDASQI DYNPANGFYVAAGGGWNTAIHTWRTRGQETPFFKNALLFFGYRFGGYHGVRVSAEFMQDK VWEKSGAILEKKEFENKLYSLDYNFHLLNAIAGINPARRYDVSLYLGPSYVQSKAGNGLA WNLGGIISYNVSSKMALFYSHTIYRMDKKHYQSSQIYTKPGAIVNSLNVGLQYKF Prediction of potential genes in microbial genomes Time: Sat May 28 00:54:03 2011 Seq name: gi|283510575|gb|ACQH01000044.1| Prevotella sp. oral taxon 317 str. F0108 cont2.44, whole genome shotgun sequence Length of sequence - 9176 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 943 308 ## gi|288928170|ref|ZP_06422017.1| hypothetical protein HMPREF0670_00911 2 1 Op 2 . - CDS 977 - 3379 1171 ## gi|288928171|ref|ZP_06422018.1| lipoprotein 3 1 Op 3 . - CDS 3398 - 3967 195 ## BT_1066 hypothetical protein 4 1 Op 4 . - CDS 3984 - 5705 1158 ## gi|288928173|ref|ZP_06422020.1| hypothetical protein HMPREF0670_00914 5 1 Op 5 . 
- CDS 5746 - 7173 783 ## BF4129 TPR repeat-containing protein - Prom 7352 - 7411 6.0 - Term 7430 - 7483 14.0 6 2 Tu 1 . - CDS 7543 - 8124 163 ## PROTEIN SUPPORTED gi|40063301|gb|AAR38119.1| ribosomal protein S16, putative - Prom 8190 - 8249 8.4 Predicted protein(s) >gi|283510575|gb|ACQH01000044.1| GENE 1 1 - 943 308 314 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928170|ref|ZP_06422017.1| ## NR: gi|288928170|ref|ZP_06422017.1| hypothetical protein HMPREF0670_00911 [Prevotella sp. oral taxon 317 str. F0108] # 1 314 10 323 323 641 100.0 0 MKKRVLVAMLAGLLPLCASAWVSSQPFNKRFYISVPDTIAKQDTAKMDLEDLENMLPDSV LETKQDTVMGIPRPKGYNGLRFVLDKRHRYAGDQFVDSGFLSHTYVTLGGGVSAFLPNDK FDYTPIANLHVGIAKELSPMSTIRLFAEKSWGFTKAALGFSSVTTYSTWGGGIDYLFNFS NYLMGNRPDRPLNVLGTIGVGVQTARLNATENTFVPYYAENSALSYNARFGMQFKIMTSP HASLAFEPYLRLATRTHDLVKGTDFESLDLAYGINMSYIWYLWPNLSSERDKGSFLRRFN DNERFFLEEYMKKP >gi|283510575|gb|ACQH01000044.1| GENE 2 977 - 3379 1171 800 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928171|ref|ZP_06422018.1| ## NR: gi|288928171|ref|ZP_06422018.1| lipoprotein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 800 2 801 801 1551 100.0 0 MKLEKIFILLLFLGLFTLKMSAQAIHITGSVSKSMKETNGQISGKMPLSVPIYIFDNKAE ARKQAQLYNSQNGMSGSIVKIKSNDVVIPDYEGHFEADISVGGAMLLINEGALKMVDVIG GKLNYDIVFAPSRSDGILIKNVNVFAKKQGVDFKEVPPVDDGENMRWTVGISLPAWYTTK HSRLIFQPVAVNTNTGDTIQYLEPLVFEGDKYHQNQIKRKSFGYERNDSLHAYYVPDNVM SNAPFSFSWDITYPKPDIDKNYKWKAKLSLEDYTHVYFNDNKEGTNNVRRPWKMLDLGAS AKRIELDPRFFETVRARLLEVPRDLQVTFVVGKDELTTDEENQKTLDFLIEELRSYGTSL MNISVQGTASPEGGLSLNIELAKKRARKILSMVGSHIRSASLTVKEPLVYTWDDVADSLV QRGQASEADELRKYAKAKDYVGIKRMTTSNPIIEEIMQNQRLMRSTYTLRRNKIMEPKEA VWAYYNDNRYRPGGEEYFSNGDYYNLFTQITDSVELRKLTHRAYKENMSRKTAKFSAFAA YLANRIAIELLEQDSVNLSVLEPFVDMSSGVEVQRPVAFESSYMYTVNFKEVVANQALMY LKSRKLSHSAHLANKLPDTQKFHDLKMFIDLESLFFKQNKTPEEEQRASNALNFVMNQNP LNRAILSVELAPELGITYESLNPLVDSLPDNLVKKWYLKGVIAANNPDIEQENDFMNLIK EFGSEAAVKMQTNDTPRFLAYFQHCFDLDPSFYERFFKTDGNITDEIRKRYPYVESKKDL YREKFNMMMVKPVSQETVSE >gi|283510575|gb|ACQH01000044.1| GENE 3 3398 - 3967 195 189 aa, chain - ## HITS:1 COG:no KEGG:BT_1066 NR:ns ## KEGG: BT_1066 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 188 2 188 189 78 30.0 1e-13 MKMVTKKIGLLLFLLVGLFTQMQAQTFAVNSNLLLLSTQTYNLGAEITVGNHTTLGLSVF GNNKPYFHKGMRAIGVQPEFKYYFGGRPMYHHYVGIGLLAADYSITWGNRKYDGTAWGGG MTFGYVISLSKRWNLDFHGGVGAIFSHHKEFMVNGDDLLLNDNSQAKDGFWEYRILPTKL GITLSYTIW >gi|283510575|gb|ACQH01000044.1| GENE 4 3984 - 5705 1158 573 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928173|ref|ZP_06422020.1| ## NR: gi|288928173|ref|ZP_06422020.1| hypothetical protein HMPREF0670_00914 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 573 4 576 576 1025 100.0 0 MKGYKLILALTVQLFVGVTGALGQGNQLYYNKVDTMELQNRFNLKTDAVGWLTLTPNLGL EFSLGNKNWNQWTVGVYGRANWDVNTRTKSYYVYDIFGARAEVRKYWHAKMPKRAYYIGL YGGANSFDIKFSDIGKKGNSFVGGLMVGMVTQVYGYTNGASLDLELGFSPGVVFADMHDY TRLYKGGKYYYSRTTRDTGYKLTFNPWVYAASVDVVHVSFVYHFGTKLANRYKNRNLIDN DYRLAVEKEKIRRDSVHTVQAKEKRIKMDSLAKADYERRFEEQRLEIERAYTQDSIRQEN KALKEKQAVEREEAKHRADSLKEAARIKAMEDKANAKITADSLKFAEKARQLEEKENAKR TADSLKYAAQERARELKEQKAMERENAKRTADSIKVAQKMEKQQSEADKKKSDSSAPQED VDASQTPQVEDTALPKEEKDGQSKENPESKSKDDGQSTEKPVTEEKRAEEKKSETKEDHE SPEAKPSGENGKSEEGSDNSDNKEKAESGIVKHVSETKVLVCEKDSRRSLCPFFDANMLK NLNSIGYSSYSGVYLGQIDRNALCAFGDSFSQN >gi|283510575|gb|ACQH01000044.1| GENE 5 5746 - 7173 783 475 aa, chain - ## HITS:1 COG:no KEGG:BF4129 NR:ns ## KEGG: BF4129 # Name: not_defined # Def: TPR repeat-containing protein # Organism: B.fragilis # Pathway: not_defined # 5 470 2 473 477 228 31.0 3e-58 MSMTKKWLAFVFCAMISLPVFSQDYFSYNDDLQVIATLLTSSPDAAKSSAEKFASKHKED GALLAGLGTLYLRAGQLDEATHYFYLAQRCRRITTKALNLGGDIAKARNRPDSAQFYYRR AMYFDRKDPDAYYKLAELLKAKEINKSIETLNALKRNRPDLNVQRMMAEVYYTANDLTNA IAAYKSIGIDSLNDKELMQYSLSLYIKKDYSQSLDLATLGHKRASKDPVFNRLLLYNNTE LQHFDTAISNANDLFNNSKDTKPQYQDYIYYGYALSGLGRTQEAIQQFNVALQKNGNIPG IHKQISDAYSRIHDYDNAIKYYEQYVSNLKDTSDNRAYEIFQLGRLYWRKGTYVKENEKR TEEQTKAIEHAADLFGQVATLRPNSYLGFYWQGNANAMLDPEYTKGLAKPYFSKAAELLE KSGSNKDHLIECYKQLSYYYYIKKEMNSSVEYAEKVLALDPEDDFAKQVVATKGK >gi|283510575|gb|ACQH01000044.1| GENE 6 7543 - 8124 163 193 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|40063301|gb|AAR38119.1| ribosomal protein S16, putative [uncultured marine bacterium 578] # 28 171 91 239 255 67 32 4e-11 MKMRRCILSIVAVAFVSVMSASAQSENQGSAVADAQRAVADAQKAKSDAARAAAEAAKQK AEAAKAKVIAAKAKVEAEKARMDSEKAKAESEQARADAEAAEAEAQKAVEAANQAKAETE KAIANANACKATALKARTEAAKAKEMTAKVAAEAAKAKAEVDKAKAELAKAGADAAAAIE TANAAIEKAKSTN Prediction of potential genes in microbial genomes Time: Sat May 28 00:55:35 2011 Seq name: gi|283510574|gb|ACQH01000045.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.45, whole genome shotgun sequence Length of sequence - 60680 bp Number of predicted genes - 60, with homology - 56 Number of transcription units - 35, operones - 15 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 908 - 957 13.1 1 1 Op 1 . - CDS 1023 - 3386 2432 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases - Prom 3412 - 3471 3.0 - Term 3443 - 3500 13.2 2 1 Op 2 . - CDS 3605 - 4894 1483 ## COG0172 Seryl-tRNA synthetase - Prom 4927 - 4986 6.2 3 2 Tu 1 . + CDS 5252 - 5467 122 ## + Term 5531 - 5566 0.8 + Prom 5777 - 5836 3.4 4 3 Op 1 . + CDS 5955 - 7277 1271 ## COG1295 Predicted membrane protein 5 3 Op 2 . + CDS 7316 - 7720 549 ## gi|288928181|ref|ZP_06422028.1| hypothetical protein HMPREF0670_00922 6 3 Op 3 . + CDS 7800 - 8198 564 ## PRU_0460 hypothetical protein + Prom 8365 - 8424 5.2 7 4 Tu 1 . + CDS 8633 - 9175 152 ## gi|288928183|ref|ZP_06422030.1| hypothetical protein HMPREF0670_00924 8 5 Tu 1 . - CDS 9277 - 9690 280 ## Vpar_0186 hypothetical protein - Prom 9731 - 9790 4.6 9 6 Tu 1 . - CDS 9812 - 10267 499 ## TDE0524 hypothetical protein - Prom 10453 - 10512 4.8 10 7 Tu 1 . - CDS 10583 - 11749 558 ## TDE0809 hypothetical protein - Term 12065 - 12125 4.4 11 8 Tu 1 . - CDS 12274 - 12849 344 ## TDE0559 hypothetical protein - Prom 13004 - 13063 4.8 12 9 Tu 1 . - CDS 13269 - 13547 68 ## - Prom 13675 - 13734 5.7 - Term 14863 - 14906 1.0 13 10 Op 1 . - CDS 14942 - 15556 686 ## PRU_0764 putative lipoprotein 14 10 Op 2 . - CDS 15626 - 15856 252 ## PRU_0765 hypothetical protein - Prom 15907 - 15966 5.2 + Prom 15864 - 15923 3.5 15 11 Op 1 . + CDS 15947 - 17404 1682 ## COG2195 Di- and tripeptidases 16 11 Op 2 . + CDS 17407 - 17802 377 ## gi|288928190|ref|ZP_06422037.1| hypothetical protein HMPREF0670_00931 + Term 17823 - 17872 14.0 - Term 17971 - 18008 6.2 17 12 Op 1 . - CDS 18050 - 18208 143 ## PRU_0750 hypothetical protein 18 12 Op 2 . 
- CDS 18228 - 18416 274 ## PROTEIN SUPPORTED gi|150005787|ref|YP_001300531.1| 50S ribosomal protein L33 19 12 Op 3 . - CDS 18422 - 18685 378 ## PROTEIN SUPPORTED gi|150005786|ref|YP_001300530.1| 50S ribosomal protein L28 - Prom 18739 - 18798 2.9 20 13 Op 1 . - CDS 19073 - 19582 589 ## COG1546 Uncharacterized protein (competence- and mitomycin-induced) 21 13 Op 2 . - CDS 19601 - 20635 652 ## PROTEIN SUPPORTED gi|227425790|ref|ZP_03908856.1| SSU ribosomal protein S18P alanine acetyltransferase 22 13 Op 3 . - CDS 20645 - 21508 956 ## COG0024 Methionine aminopeptidase - Prom 21562 - 21621 3.5 + Prom 21453 - 21512 2.6 23 14 Tu 1 . + CDS 21545 - 21772 69 ## + Term 21800 - 21837 6.6 24 15 Tu 1 . + CDS 22396 - 22611 72 ## - Term 22537 - 22588 11.4 25 16 Op 1 . - CDS 22686 - 23297 512 ## BDI_1934 hypothetical protein - Prom 23325 - 23384 5.3 26 16 Op 2 . - CDS 23422 - 25734 2435 ## COG5009 Membrane carboxypeptidase/penicillin-binding protein - Prom 25758 - 25817 5.0 + Prom 25702 - 25761 6.0 27 17 Op 1 19/0.000 + CDS 25849 - 26802 1119 ## COG0540 Aspartate carbamoyltransferase, catalytic chain 28 17 Op 2 . + CDS 26799 - 27260 628 ## COG1781 Aspartate carbamoyltransferase, regulatory subunit + Prom 27269 - 27328 1.9 29 18 Tu 1 . + CDS 27379 - 28659 1615 ## COG0112 Glycine/serine hydroxymethyltransferase + Term 28688 - 28726 7.2 - Term 28605 - 28656 8.1 30 19 Tu 1 . - CDS 28675 - 28887 64 ## gi|288928202|ref|ZP_06422049.1| hypothetical protein HMPREF0670_00943 - Prom 29137 - 29196 2.0 - Term 29785 - 29840 4.6 31 20 Tu 1 . - CDS 29989 - 31218 1222 ## COG2081 Predicted flavoproteins - Prom 31288 - 31347 3.1 + Prom 31545 - 31604 3.3 32 21 Op 1 . 
+ CDS 31624 - 32043 513 ## COG2050 Uncharacterized protein, possibly involved in aromatic compounds catabolism 33 21 Op 2 10/0.000 + CDS 32036 - 33121 694 ## COG1169 Isochorismate synthase + Prom 33169 - 33228 2.8 34 21 Op 3 1/0.000 + CDS 33322 - 34989 1209 ## COG1165 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase 35 21 Op 4 . + CDS 35027 - 35851 871 ## COG0447 Dihydroxynaphthoic acid synthase 36 21 Op 5 4/0.000 + CDS 35859 - 36899 792 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 37 21 Op 6 . + CDS 36878 - 37891 946 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II - Term 37871 - 37913 7.0 38 22 Tu 1 . - CDS 38002 - 38763 497 ## PRU_1254 hypothetical protein + Prom 39056 - 39115 6.7 39 23 Op 1 . + CDS 39135 - 39725 785 ## gi|288928211|ref|ZP_06422058.1| hypothetical protein HMPREF0670_00952 40 23 Op 2 . + CDS 39824 - 40597 980 ## BF3707 hypothetical protein + Term 40642 - 40690 13.3 - Term 41471 - 41519 9.1 41 24 Tu 1 . - CDS 41556 - 43091 1829 ## COG1418 Predicted HD superfamily hydrolase 42 25 Op 1 . - CDS 43205 - 43522 370 ## PRU_0226 hypothetical protein 43 25 Op 2 . - CDS 43515 - 43823 337 ## PRU_0227 hypothetical protein - Prom 43869 - 43928 4.5 44 26 Tu 1 . - CDS 44018 - 44641 618 ## PRU_0228 hypothetical protein - Prom 44736 - 44795 4.7 - Term 44915 - 44976 1.1 45 27 Op 1 . - CDS 45153 - 46538 1611 ## COG1109 Phosphomannomutase 46 27 Op 2 . - CDS 46554 - 47201 550 ## PRU_0303 hypothetical protein - Prom 47294 - 47353 3.8 47 28 Tu 1 . - CDS 47451 - 48488 989 ## COG0618 Exopolyphosphatase-related proteins - Prom 48548 - 48607 4.1 48 29 Tu 1 . - CDS 48901 - 49290 422 ## PRU_1440 hypothetical protein - Prom 49435 - 49494 5.0 49 30 Tu 1 . + CDS 51369 - 51761 315 ## BF0742 hypothetical protein - Term 51801 - 51856 3.3 50 31 Tu 1 . - CDS 51941 - 53923 2200 ## COG0457 FOG: TPR repeat 51 32 Op 1 . - CDS 54078 - 54641 717 ## COG0242 N-formylmethionyl-tRNA deformylase 52 32 Op 2 . 
- CDS 54638 - 55054 186 ## COG0816 Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) 53 32 Op 3 . - CDS 55061 - 55540 370 ## PRU_0589 sporulation-like repeat protein - Prom 55780 - 55839 6.2 + Prom 55616 - 55675 5.8 54 33 Op 1 . + CDS 55815 - 56036 367 ## PRU_0588 hypothetical protein 55 33 Op 2 . + CDS 56038 - 56391 386 ## PRU_0587 hypothetical protein 56 33 Op 3 . + CDS 56396 - 56671 211 ## gi|260912127|ref|ZP_05918683.1| hypothetical protein HMPREF6745_2638 + Term 56832 - 56868 -1.0 57 34 Op 1 . - CDS 56687 - 57448 626 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I 58 34 Op 2 . - CDS 57445 - 58461 1024 ## COG0812 UDP-N-acetylmuramate dehydrogenase 59 34 Op 3 . - CDS 58477 - 59334 728 ## PRU_0583 putative lipoprotein - Prom 59479 - 59538 5.1 + Prom 59452 - 59511 4.8 60 35 Tu 1 . + CDS 59535 - 60572 1179 ## COG0016 Phenylalanyl-tRNA synthetase alpha subunit + Term 60627 - 60662 4.0 Predicted protein(s) >gi|283510574|gb|ACQH01000045.1| GENE 1 1023 - 3386 2432 787 aa, chain - ## HITS:1 COG:MA3787 KEGG:ns NR:ns ## COG: MA3787 COG0493 # Protein_GI_number: 20092583 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Methanosarcina acetivorans str.C2A # 327 782 16 469 469 436 53.0 1e-121 MNKIIRKEQFSEKVYRFDIEAPLIAKSRKAGNFVIVRVGDKGERMPLTIADADTTKGTIT LVVQKVGLSSIKLCNLNEGEYVTDVVGPLGNPTHIENFGTVVCAGGGVGVAPMLPIIRAL KAAGNRVLSVLAGRSKDLIILEDEVRQSSDEVIIMTDDGSYGEQGVVTVGIEKFINAEHI DRAFAIGPAIMMKFCCLLTQKYNIPTDVSLNTIMVDGTGMCGACRLTIGGKTKFVCIDGP EFDGALVDWDEMFKRMGTFKKAESEELQRYNDHIEQVEERVAQTVSDITMDVEPTTEGID VLTDRNAEWRKELRASMKPKERTGIHRVEMPELDPVYRATSRVEEVNKGLTKELALMEAK RCLDCAKPTCMEGCPVSINIPSFIKNIERGQFLAAAKVLKDTSALPAVCGRVCPQEKQCE SRCVHLKMNEPAVAIGYLERFAADYERESGNISVPELAPANGIKIAVVGSGPAGLSFAGD MAKFGYDVTVFEALHEVGGVLKYGIPEFRLPNTIVDVEIDNLKKMGVKFITDCIVGKTIS 
VDDLEEQGYKGIFVGSGAGLPNFMGIPGENAINIMSSNEYLTRVNLMDAANPNTDTPINL GKRVMVVGGGNTAMDSCRTAKRLGAEVTLVYRRSEAEMPARLEEVKHAKEEGIDFLTLHN PLEYLADEQGAVKAAVLQVMELGEPDASGRRSPQPIEGVTKTLDVDQVIVAVGVSPNPLV PNSIRGLELGRKNTIVVNEGMQSSRPEIYAGGDIVRGGATVILAMGDGRKAAASMHKQLT EELQLAI >gi|283510574|gb|ACQH01000045.1| GENE 2 3605 - 4894 1483 429 aa, chain - ## HITS:1 COG:SP0411 KEGG:ns NR:ns ## COG: SP0411 COG0172 # Protein_GI_number: 15900330 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Seryl-tRNA synthetase # Organism: Streptococcus pneumoniae TIGR4 # 1 420 1 417 424 324 43.0 2e-88 MLTLKLISEETERVIKGLEKKHFKGAKDAVEKVLETDKRRREAQQKLDKNKQEANSMSRQ IGMLMKDGKTQEVEEVKAKVAVYKENDKQLQAQMDEAEQELTTLLCNIPNIPADEVPEGK DANDNVVVKEGGIMPQLPEDALCHWDLCKKYNLIDFDLGVKITGAGFPIYIGKMARLQRA LEAFFLDEARKSGYLEVQPPLVVNQASGYGTGQLPDKEGQMYHAEQDDLYLIPTAEVPVT NIFRDVILDEKDLPVMRCAYSACFRREAGSYGKDVRGLNRLHQFDKVEIVRIDKPEHSHE SHKKMLEHVEGLLQKLELPYHILLLCGGDMSFTSSICYDFEVWSAAQKRWLEVSSVSNFE SYQANRLKCRYRRAEDKKIELCHTLNGSALALPRIVAAIIENNQTPEGIRVPRVLVPYCG FEMLDDKNF >gi|283510574|gb|ACQH01000045.1| GENE 3 5252 - 5467 122 71 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNLLVELTYDKNPQIQHLKSSTNKTNSPIHKHFFFICFSAYCFAKTLPPSPALSATKFHE KLKQNAFFIKL >gi|283510574|gb|ACQH01000045.1| GENE 4 5955 - 7277 1271 440 aa, chain + ## HITS:1 COG:FN1154 KEGG:ns NR:ns ## COG: FN1154 COG1295 # Protein_GI_number: 19704489 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 55 296 19 254 396 145 32.0 1e-34 MKKTLNTIITFFKVEMWQIRREDVSPLAYVCLMVLKRLVVTVKFFTTRSVTDLASALTYS TVLAIVPIVAVVFAIARGFGFSKYIEVWFRDALSSQPQAAETIIGFVNSYLVHTKGGVFL GIGLVFMLFTVLMLISNIEKAFNSIWQVQHPRSLFRTITDYLAMFFLVPIVIVVTSGVSI VMATFAQDINEYVVLGPMMRLVITVMPYVLMSAVFVCLFIFMPNTKVNFSAAFLPGILSG IAMQLLQVFYIHSQIFLSSYNAIYGSFAALPLFMLWVQISWSICLFGAELSYTSQNMESF DLLGQMDDLSYRYRMMLSALLLGKICKRFDEGKPAYTAVELKLETNIPVRIVQQLLFDMQ NANLVTYVASDEKDTQARYQPAVSLKHLTLGFMMGRLESAGKWNLDFDVRKHLKTKEWVE FYKLRRKYLSELDEISLSKL >gi|283510574|gb|ACQH01000045.1| GENE 5 7316 - 
7720 549 134 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928181|ref|ZP_06422028.1| ## NR: gi|288928181|ref|ZP_06422028.1| hypothetical protein HMPREF0670_00922 [Prevotella sp. oral taxon 317 str. F0108] # 1 134 1 134 134 271 100.0 7e-72 MHIELMDFEQRVSSKIRVESGFPGFPQPEQYNLTKTEIDDYLLDKQAILDSAGSQRTQYT IMGVMIVLPVVVFSAFPQKDMPGGNWAIFVALAIGLCLAGLVKLLTKLRISHRLKNMADE RIERYIEDVLNFKS >gi|283510574|gb|ACQH01000045.1| GENE 6 7800 - 8198 564 132 aa, chain + ## HITS:1 COG:no KEGG:PRU_0460 NR:ns ## KEGG: PRU_0460 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 7 120 2 115 130 157 72.0 1e-37 MEKQKTDLPDFMSYKDKFTQSGFIEKISKIAKRAGAKLVYAALILYYTLQSDTVSLKDKT MIVAALGYLISPLDVIPDAIPIAGLGDDLGVLLYVLRKVWTDIPEDVKTKAHDKLSKWFD EEEVKEADSINL >gi|283510574|gb|ACQH01000045.1| GENE 7 8633 - 9175 152 180 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928183|ref|ZP_06422030.1| ## NR: gi|288928183|ref|ZP_06422030.1| hypothetical protein HMPREF0670_00924 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 180 51 230 230 370 100.0 1e-101 MIKGNCPIPKGGSRYMQFEYVVKNETSDTLLLPISIVHGWNSYHSHITIRDKTRSFDVVL GQTLSRRERDVVLAKGGSTVITITLYRNALDSIGIKPHDLPSKVVQNFNFFFSYDRRDST HFVVPQIKFHNTNNKDKNNLYNGGWRTTLYWDVQYDSTYNIFYCFHKPGVRIPVAEVIQP >gi|283510574|gb|ACQH01000045.1| GENE 8 9277 - 9690 280 137 aa, chain - ## HITS:1 COG:no KEGG:Vpar_0186 NR:ns ## KEGG: Vpar_0186 # Name: not_defined # Def: hypothetical protein # Organism: V.parvula # Pathway: not_defined # 1 135 1 135 136 170 63.0 1e-41 MKVVRDFGEYFKYSLHDARINKIEHRRGNMVLHFNYIFSYDEGAEQTHKAQVVFEQTDID DVRILVFNSRWLDDFQGECIDLETYQSKYKNSEFEVIDESYNWGKAVFQGWLWTGGVPVC CIMDIYFKGKMVYVIGE >gi|283510574|gb|ACQH01000045.1| GENE 9 9812 - 10267 499 151 aa, chain - ## HITS:1 COG:no KEGG:TDE0524 NR:ns ## KEGG: TDE0524 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 1 151 1 151 151 145 49.0 6e-34 MNTLEHKYFGKLSFDNANEIGIVWKGEVDGIDVTMWYDEDVEVMKGELDRFALFLENIKD NIKKATEALIETLKEDREYMDFHLEECKDAKLPDDVVEFVSQMTATAIDLWIGDSEYRIT MDFMILPDESDQILCVTFNERGEINAVDWES >gi|283510574|gb|ACQH01000045.1| GENE 10 10583 - 11749 558 388 aa, chain - ## HITS:1 COG:no KEGG:TDE0809 NR:ns ## KEGG: TDE0809 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 13 388 4 380 381 412 58.0 1e-113 MMKEENINRNIIKIILRNEERKYLGLTPIESHWERVDIKHVTLFFDGNVIRKKITHAEFE SQQFSYLEEDVYVETTENRTIVLPKTLRGKPKKLNYTATTMFTRIGMYFMYNDGYVIIAN ATTQKTYYWTRIEGEENKKLFDEWFDTWKKETTEEDLRELEVFKNETRKKQKYKEGDVFA FKIGRRNYGFGKIIIDVVSRRKAEEFRKNKNYGLAHLMGTALIVKVYHKMSDTKDIDIAE LERCCSLPGQAVMDNHIYYNQYPIIGHLPVSNVDMNDAYLSVSRSINYNDSDIAYLQYGL IYKEIPLSEYKKYEEEHWPAYRNEGIGFCLDIDGLEDCIKEKSNDKYFEINTNDLRAPEH AKDREIIFKLFGLDASLDYQGNLELCKQ >gi|283510574|gb|ACQH01000045.1| GENE 11 12274 - 12849 344 191 aa, chain - ## HITS:1 COG:no KEGG:TDE0559 NR:ns ## KEGG: TDE0559 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 5 190 1 186 195 266 83.0 3e-70 MYTIMDNSNFNIQEWVNLNSYTDSKGWKGYCKRNKPFTIFRANKMVDYFMRFFSKYTNDI 
AIVSNAFKVEEYNLNNSNISNYVRYIEKYLIDFGYMEIVSGSIYDNPNCLSLSFKKILDA DYDIISMFSLLMMMDEGVDGHCFFVFDKLGLIAYPHDDTGFGFIRIKNTKVHYEDVFLKN VSQFSDFISVW >gi|283510574|gb|ACQH01000045.1| GENE 12 13269 - 13547 68 92 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MMPHTRTWATARKTRTLTPARASGCRACCLRLTETTLWKRRSNLPSTAGFQLVECALNDF PPNNLLKHIQVLPPTNVAEFWNSSYLSPVLRT >gi|283510574|gb|ACQH01000045.1| GENE 13 14942 - 15556 686 204 aa, chain - ## HITS:1 COG:no KEGG:PRU_0764 NR:ns ## KEGG: PRU_0764 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 3 195 2 188 211 147 55.0 2e-34 MNKKNIFGTLALAAMTTLAFTACNKQNSQAENKSESNTKAAPTSMKIAYVEVDSIMSQYK FCKDYSLILQKKGQNIQSTLASKQQALQAAAANFQQKVQQNAYTREQAEAIQAGLQKQSA DLQGLNQRLSNEFQVETEKYNAALRDSLRHFLNEYNKDKKYSLILSKAGDNLLYADKAFD ITNDVVAGLNKAYKPIKAAETTKK >gi|283510574|gb|ACQH01000045.1| GENE 14 15626 - 15856 252 76 aa, chain - ## HITS:1 COG:no KEGG:PRU_0765 NR:ns ## KEGG: PRU_0765 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 67 1 67 68 75 53.0 7e-13 MAETLALSLLIIAIAVALLCVKMIFRKNGTFSSQHIHDNPGLRKQGIHCVIDQDKEARRT VSTAVNEHGSNSTSNV >gi|283510574|gb|ACQH01000045.1| GENE 15 15947 - 17404 1682 485 aa, chain + ## HITS:1 COG:VC2279 KEGG:ns NR:ns ## COG: VC2279 COG2195 # Protein_GI_number: 15642277 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Vibrio cholerae # 4 485 53 533 534 402 42.0 1e-111 MNKSNLSPSCVFDQFKQINQIPRPSKHEERMVEYLKEFAKKHGLDVKVDAVNNVIIRKPA TPGYENRKVVILQSHTDMVCDKLVDVEFDFNTDAIQTYVDGEWMKAKGTTLGADDGIGVA MQLAVLESNDIQHGPLECVFTSDEETQMTGASGMEPGFMTGDWLINLDSEDEGEIFVSCA GGRSTHAVFHFEKEEAPKDYFFLQISVKGLTGGHSGDDINKKRANAIKLLARFVYLEQEK YDVRLASFNSGKLHNAIPRDGRVVMAVPNDVKENVRADWNVFAHEVADEFYVTDTAMEFA MESANAEKVMPADVSRRIIMALQGITNGVFAICQDPALGGMVETSSNIAVVKTTDNTVDL LASQRSNVMSNLTNQCNAVGAVLKLAGAEVEQTDGYPAWQMNPNSELTKVAIESYKRLFN KEPIVRGIHAGLECGQFATSYPGMDMVSFGPTLRDVHTPDERLLIPTVQMVWDHLLDILK NVPAK >gi|283510574|gb|ACQH01000045.1| GENE 16 17407 - 17802 377 
131 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928190|ref|ZP_06422037.1| ## NR: gi|288928190|ref|ZP_06422037.1| hypothetical protein HMPREF0670_00931 [Prevotella sp. oral taxon 317 str. F0108] # 1 131 1 131 131 246 100.0 3e-64 MKNKHVFLFLAAGFLALASVGCGKRHSAKKLVENFIDEHAQLSSVSITDVGKLDSTDRVD NSTINALQADVKNGGLYKPDTKFGQRPANTKTLLMIRVTLETKDEKGEKKPYKQTFYLDP ELTSVVAVKTN >gi|283510574|gb|ACQH01000045.1| GENE 17 18050 - 18208 143 52 aa, chain - ## HITS:1 COG:no KEGG:PRU_0750 NR:ns ## KEGG: PRU_0750 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 52 1 52 52 81 92.0 8e-15 MAKKVVATLRDGSKDGRAYTKVIKMVKSPKTGAYIFDEKMVPNEAVKDFFKN >gi|283510574|gb|ACQH01000045.1| GENE 18 18228 - 18416 274 62 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|150005787|ref|YP_001300531.1| 50S ribosomal protein L33 [Bacteroides vulgatus ATCC 8482] # 1 62 1 62 62 110 80 2e-23 MAKKAKGNRVQVVLECTEMKNSGLSGTSRYVTTKNRKNTPERMELMKYNPIMKKMTLHKE IK >gi|283510574|gb|ACQH01000045.1| GENE 19 18422 - 18685 378 87 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|150005786|ref|YP_001300530.1| 50S ribosomal protein L28 [Bacteroides vulgatus ATCC 8482] # 1 86 1 86 86 150 80 2e-35 MSKICQITGKKAMVGNNVSHSKHRTKRSFDVNLFSKKFYYVEEGCWISLKISAAGLRLIN KVGLDAALKQAVSKGFCDWKDIKVIGE >gi|283510574|gb|ACQH01000045.1| GENE 20 19073 - 19582 589 169 aa, chain - ## HITS:1 COG:ECs3557 KEGG:ns NR:ns ## COG: ECs3557 COG1546 # Protein_GI_number: 15832811 # Func_class: R General function prediction only # Function: Uncharacterized protein (competence- and mitomycin-induced) # Organism: Escherichia coli O157:H7 # 4 156 5 157 165 97 39.0 8e-21 MELETKVLSRELQQHMYENGLTIGTAESCTGGRVAEAIIATPGASKYFKGGVICYSNEVK ENLLGVSPQVLAEKSAVSEEVAVQMVKGACETLKTDYCMAVTGYAGPVAGDNDGVPVGTI WVACGTKEKTLTLKLNEDFGRDINLAIATNRALRLMLDFLEQEKGTTEE >gi|283510574|gb|ACQH01000045.1| GENE 21 19601 - 20635 652 344 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227425790|ref|ZP_03908856.1| SSU ribosomal protein S18P alanine acetyltransferase [Atopobium 
parvulum DSM 20469] # 9 330 480 804 832 255 43 4e-67 MNNNKDIYILGIESSCDDTSAAVLRNDVLLSNVTSSQAVHQSYGGVVPELASRAHQQNIV PVVDQALKKAGIEKEQLSAVAFTRGPGLMGSLLVGVSFAKGFARALGIPLVDVNHLQGHV LAHFIQEPDMQNAIPPFPFLCLLVSGGNSQIVKVNAYNDIEVLGQTIDDAAGEAIDKCSK VMGLGYPGGPIIDKLARQGNPAAFKFAEPHIPGLDYSFSGLKTSFLYSLQKWVKDDPNFI ENHKEDLAASLEFTIVDILMKKLRLAVKQTGITHVAVAGGVSANSGLRNAFHEAAEKEGW TIYIPKFSYTTDNAAMIGITGYFKYLDKVFCSIDKPAFSKVTFE >gi|283510574|gb|ACQH01000045.1| GENE 22 20645 - 21508 956 287 aa, chain - ## HITS:1 COG:PA3657 KEGG:ns NR:ns ## COG: PA3657 COG0024 # Protein_GI_number: 15598853 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Pseudomonas aeruginosa # 40 286 5 248 261 259 52.0 6e-69 MVLNKKRRWHCLSGQEPTDLDKQVMYWENKGKLVPTRELIKTPEQIEGIRRSGVVNTGVL DEVAKHIHAGMNTLEIDKICYDYCTRHGAIPACLNYEGFPKSVCTSINEVVCHGIPKKED VLKEGDIINVDFTTILDGYYADASRMFIIGKTTPEKEQLVRVAKECLEIGAEAAKPYGFV GDIGHAISKHADKYHYGVVRDLCGHGVGLKFHEEPEVAHFGRRGQGMLLVPGMVFTIEPM INMGTWKVFIDADDEYGWEVITGDELPSAQWEHTFLMTEHGVEILTR >gi|283510574|gb|ACQH01000045.1| GENE 23 21545 - 21772 69 75 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQARQSVFILTTYTLRLAKLLQIIGTHKENNHFLLQHAKGLTNNAMRLHKIQANAQTCKR YAKKERTSKNPLPYH >gi|283510574|gb|ACQH01000045.1| GENE 24 22396 - 22611 72 71 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQVYAILFAAFTQTNKLLNIKCASVLTPHINKEESRGVAFMYDSALRNELIQKKLIVSRQ PIFLNLNNLIP >gi|283510574|gb|ACQH01000045.1| GENE 25 22686 - 23297 512 203 aa, chain - ## HITS:1 COG:no KEGG:BDI_1934 NR:ns ## KEGG: BDI_1934 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 195 1 200 209 80 23.0 3e-14 MRKLLFGLCLAAMPITMMAQQVCKMWVNMPVSVAGALEKSGRQELLDLKQMKKAATIAGP LGERCSIDTLTDDFLSARMNEVYTLQMKMLPTSSGDSLLCLVQTYAGPQPESSISFYSPD WKALSMPQMHLDVDLQRPDTMSEDDFNKLQALFDPKLVSFSLSPSNAELVVALSPAAISE EGKANIKRLKLQNKLNWNGKTFN >gi|283510574|gb|ACQH01000045.1| GENE 26 23422 - 25734 2435 770 aa, chain - ## HITS:1 COG:aq_624 KEGG:ns NR:ns ## COG: aq_624 COG5009 # 
Protein_GI_number: 15606057 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase/penicillin-binding protein # Organism: Aquifex aeolicus # 13 730 4 667 726 289 30.0 2e-77 MRKRFIHILWGMLVLGILSAILAFVAIWFGWIGYMPPISELQNPINRFATQVYSADGKVI GTWNENRENRVCIPYNTLSPHLVHALVATEDVRFYDHSGIDFIALGRAFVKRGLMGHASA GGGSTITQQLAKQLYSEKAHSTIERVLQKPIEWVIAVKLERHYTKEEIIALYLNYFDFLH NAVGIKTAANTYFNKEPKDLTITEAATLIGLCKNPSYFNPVRYPERSTERRNVVLAQMVK AGYLSDTEYRQYAEEPLALNFHRTDHKDGTATYFREFLRLYLMAKRPDKSNYPEWNKSQY VYDSIAWANDPLYGWCNKNLKKNGEPYNIYTDGLKVFTSIDSRMQQYAEDAVYKHLARFL QPAFSKENRTKPNAPFTNKLTAEQVRSILNRSITQSERYRNMKSAGYSEEEIRKAFRTPT DMAVFTYHGDIDTVMTPLDSIRYYKSFLRAGFVSMDPKNGMVKAYVGGLDFTHFQYDMAT EGRRQVGSTIKPFLYSLAMENGFSPCDQAPNVQRTYIVAGQSWTPRNGSRSRYGQMVTLK WGLAQSNNWISAYLMSKLNPQQFVGLLRSYGLNNPDIHPSMALALGPCESSVCEMVSAYT AFANRGIRCAPLFVTKIEDNEGNVIATFQPRMNEVISETSANKMLVELQAVVNEGTARRL RFKYNLKGEIGGKTGTTNRNADAWFMGFTPELVSGCWVGGEDRDIHFDSMSMGQGATMAL PIWAYFMQKVFADPRLGYTQKATFDLPEGFDPCAKDDGGMNEYGIDEVYE >gi|283510574|gb|ACQH01000045.1| GENE 27 25849 - 26802 1119 317 aa, chain + ## HITS:1 COG:VC2510 KEGG:ns NR:ns ## COG: VC2510 COG0540 # Protein_GI_number: 15642506 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, catalytic chain # Organism: Vibrio cholerae # 3 306 26 330 330 310 51.0 2e-84 MQMKQRDFVTIADLSKEKIQYLIDLSQEFERHPNRELLKGKVIASLFFEPSTRTRLSFET AANRLGARIIGFSDAKATSVSKGETLKDTILMVANYADVIVMRHYIEGAAQYASEVSPVP IINAGDGAHMHPSQCMLDLYSIYKTQGTLEDLNIYLVGDLKYGRTVHSLIMAMRHFNPTF HFIAPKELAMPNEYKLYCKNHGIKFQEHTAFNDKVIADADIIYMTRVQKERFSDLMEYER VKDVYILKNDMLANVKDNMRILHPLPRVNEIAYDVDDNPHAYYIQQAQNGLYARQALFCD VLGITLEDVRNDKTIIK >gi|283510574|gb|ACQH01000045.1| GENE 28 26799 - 27260 628 153 aa, chain + ## HITS:1 COG:PAB1499 KEGG:ns NR:ns ## COG: PAB1499 COG1781 # Protein_GI_number: 14521525 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, regulatory subunit # Organism: Pyrococcus abyssi # 7 142 3 140 152 129 46.0 3e-30 
MSANKKERLVAAIKNGTVIDHIPADKTFQVVNLLELQTLNEPITIGNNYASKMMGKKGII KVTDRYFTDEEISRLSVVAPNVVLNIIKDYEVIEKKKVETPAELKGIVRCNNPKCITNNE PMQTVFHVVDKVSRTLKCHYCDMEQAIDKAQLV >gi|283510574|gb|ACQH01000045.1| GENE 29 27379 - 28659 1615 426 aa, chain + ## HITS:1 COG:aq_479 KEGG:ns NR:ns ## COG: aq_479 COG0112 # Protein_GI_number: 15605959 # Func_class: E Amino acid transport and metabolism # Function: Glycine/serine hydroxymethyltransferase # Organism: Aquifex aeolicus # 1 424 5 410 428 500 61.0 1e-141 MKRDLEIFKLIEDEHQRQLKGIELIASENFVSEQVMQAMGSYLTNKYAEGLPGKRYYGGC EVVDKVENLAIERIKKLFGAEFANVQPHSGAQANEAVLLTCLNPGDTFMGLNLAHGGHLS HGSLVNTSGILYNPVGYNLNKETGRVDYDEMERLALEHKPKLIIGGGSAYSREWDYKRMR DIADKVGAIFMVDMAHPAGLIAAGLLENPLKYAHIVTSTTHKTLRGPRGGIILMGKDFEN PWGKKTPKGEVKMMSQLLNSAVFPGIQGGPLEHVIAAKAVAFEEALQPEFKEWALQVKKN AKVLAEELIKRGFTIVSGGTDNHSMLVDLRDKYPELTGKVAENALVAADITVNKNMVPFD TRSAFQTSGIRLGTAAITTRGAKEDLMGFIAELIEEVLNNPTDETTIANVRKRVNEKMKD YPLFAY >gi|283510574|gb|ACQH01000045.1| GENE 30 28675 - 28887 64 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928202|ref|ZP_06422049.1| ## NR: gi|288928202|ref|ZP_06422049.1| hypothetical protein HMPREF0670_00943 [Prevotella sp. oral taxon 317 str. F0108] # 1 70 1 70 70 124 100.0 2e-27 MLNFLLKTRQKSGCAPYARVGNGTARMYAFDCQRERLRLMLLTANHPVRSLTRPFTKNGM QPKGASRLQK >gi|283510574|gb|ACQH01000045.1| GENE 31 29989 - 31218 1222 409 aa, chain - ## HITS:1 COG:all4556 KEGG:ns NR:ns ## COG: all4556 COG2081 # Protein_GI_number: 17232048 # Func_class: R General function prediction only # Function: Predicted flavoproteins # Organism: Nostoc sp. 
PCC 7120 # 47 408 3 365 370 320 44.0 3e-87 MTEQLNIAVVGGGAAGFFAAIAAKQANPKARVTIFERGQKVLAKVLVTGGGRCNLTNSFA RISDLKQAYPRGDKMMKRLFNVFDHDDTWRWFEERGVKLLTQADECVFPVSQSAQSVVDA LTNEAHKLGVEVRTAHALEGLKPLPNGDLQLEFKAQKPLTFNRVAITTGGSPRAEGLQYL ARLGHDIMQPVPSLFTFNIADVAFKALMGTVVEGVTVSVVGTKYKASGPLLVTHWGASGP AILKLSSYGARFVHDCGYRFSIAVNWIGLTNGALVAEHLQGIVESNWRKQLSSVHPFGLP SRMWLYILDKTGLGAEKRWEELGKKGLNKLVETLTNDLYAVTGKGAFREEFVTCGGVSLT NINLNTMESKVCKNLFFAGEVLDVDAITGGFNLQAAWTTGYVAGKSMGV >gi|283510574|gb|ACQH01000045.1| GENE 32 31624 - 32043 513 139 aa, chain + ## HITS:1 COG:HI1161 KEGG:ns NR:ns ## COG: HI1161 COG2050 # Protein_GI_number: 16273085 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized protein, possibly involved in aromatic compounds catabolism # Organism: Haemophilus influenzae # 20 134 24 137 138 75 42.0 3e-14 MNIEHIKDRINKLPNLSTALGMEFISTPEPDTCMARMKVDERNRQPFGFLSGGASLALAE NLAGVGSLALCPGKISVGINVSGSHLIAVEEGDTVTATAHIMHKGRTLHQWLVEIRNTAG ELVCSVQVTNYVLNARKDD >gi|283510574|gb|ACQH01000045.1| GENE 33 32036 - 33121 694 361 aa, chain + ## HITS:1 COG:STM0595 KEGG:ns NR:ns ## COG: STM0595 COG1169 # Protein_GI_number: 16763972 # Func_class: H Coenzyme transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Isochorismate synthase # Organism: Salmonella typhimurium LT2 # 114 358 147 387 391 99 29.0 8e-21 MTDFALFREPHATHATRYVQLHGEAQSLPNYAALNHVQGFVFAPFAIEDDCPLLVIAPDK VEQLDVETHVSTLARNATAVETTTASNSQQQVSQSYQESFHRFHQQLVQGQFAKIVLSRT EEVTLKHAPNAEELFWRACQRYPRLFIALVNTRSAGTWLMATPEVLVRNDMHRWHTMALA GTMRLTSEQQKWPDQTWNSPTPPLEWSAKNKEEQRLVARYIAQCLAHLADNIEETPPYTQ RAGGLVHLRSDFAFTLKPAYGLGDLVQMLHPTPAICGLPKAEAWRFISDNEPHRRLYYSG FCGPVTANGGTKLYVSLRCMQLWPNKACLYAGGGLLKDSTEEQEWEETRAKMETMLGLMQ P >gi|283510574|gb|ACQH01000045.1| GENE 34 33322 - 34989 1209 555 aa, chain + ## HITS:1 COG:sll0603 KEGG:ns NR:ns ## COG: sll0603 COG1165 # Protein_GI_number: 16331568 # Func_class: H Coenzyme transport and metabolism # Function: 
2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase # Organism: Synechocystis # 9 520 13 554 595 173 27.0 7e-43 MYSNKENINILTSLLVAHGVRRAVVCPGSRNAPIVHNLVKSGAISCYPVTDERSAGFYAL GMALLDGDAVAVCVTSGSALLNLLPAVAEAYYQHIPLIVISADRPLAWIDQQDGQTIRQQ NALADFVKKAVTLNEPTDETLRWHCNRLANEALLATNHHGVGPVHINVPLSEPLFGYDVP TLPNERTISRCTPSKLAATALNNLMADFRKAEKPMIVLGQMPPGSVNASLVKVLEGQAVV VSEVLSHLGGLQNADELLATCAETAPYTPDFVLYMGGHLIGKNLKSLLWQMNNAPQWLVN EGGEVVDTFMNLTAIVEAKPNEVLSLMVQMMQQTDAKQHEERTKAFRQLWQTGHNAYREQ LNQAAPTFSEGYAVQCFEQQLQQTNQPFEVHYANSLSVRLANKHAQHHVWCNRGVNGIEG SLSTSAGMSLTTQANVYCVIGDLSFFYDQNALWNTRLGGNLRIMLLNNGGGKIFQHVKGM PADDIQATFVSATHNTSAQGICQQNGARYLLVNDIETLQKGIAELIGYGGNRPFVLEVNL TSWGVTTTKCSATPY >gi|283510574|gb|ACQH01000045.1| GENE 35 35027 - 35851 871 274 aa, chain + ## HITS:1 COG:BS_menB KEGG:ns NR:ns ## COG: BS_menB COG0447 # Protein_GI_number: 16080132 # Func_class: H Coenzyme transport and metabolism # Function: Dihydroxynaphthoic acid synthase # Organism: Bacillus subtilis # 4 274 3 271 271 388 66.0 1e-108 MKREWKPIEVFDFKEILFEEYQGIAKITINRPRYRNAFTPRTTWEMSQAFAYCREAQDIG VVLLTGAGDMAFCAGGDMNVKGRGGYVGEDGVPRLNVLDVQMQIRRLPKPVVAMVNGYAI GGGHVLHLVCDLSIASENAIFGQTGPKVGSFDAGFGASYLARIVGQKKAREIWFLCRQYN AEEAERMGLVNKVVPQDKLEDECVAWAETMMQHSPLALRMIKAGLNAELDGQAGIQELAG DATMLYYFLDEAQEGGKAFLEKRKPDFKKYPKLP >gi|283510574|gb|ACQH01000045.1| GENE 36 35859 - 36899 792 346 aa, chain + ## HITS:1 COG:BS_ykfB KEGG:ns NR:ns ## COG: BS_ykfB COG4948 # Protein_GI_number: 16078363 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Bacillus subtilis # 42 306 41 312 366 83 27.0 7e-16 MMFHIDIAPHTLHFKQPAGTSRGVYRTRKTWFVTFSSPQKPGVVGVGECAPLPNLSCDDL PNYEQCLRQACQRFCETGVINHDELSPYPSIRFGLETALWQWQRDGHANLSQTPFARGEM GIPINGLVWMGTFEEMAMRIEEKLNQGFRCVKLKIGAIDFDRELQLLKDIRATFSPRQVE LRVDANGAFAPAEALSKLEKLAPLDLHSIEQPIAKGLWNEMADLCKRSPLPIALDEELIG 
TNTLADKEELLDTIAPQYLVLKPSLHGGIRGTNEWISLAEDRKIGSWITSALESNVGLNA IAHLAAEAYGPQIAMPQGLGTGQLFTDNIDMPLQIKGDCLWFLPNS >gi|283510574|gb|ACQH01000045.1| GENE 37 36878 - 37891 946 337 aa, chain + ## HITS:1 COG:ECs3148 KEGG:ns NR:ns ## COG: ECs3148 COG0318 # Protein_GI_number: 15832402 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Escherichia coli O157:H7 # 17 336 137 440 451 91 26.0 2e-18 MVLAEFLDQWNDDKAPCMLVHTSGSTGKPKPLWVEKRRMEASARATCQYLGLKAGNTALL CMPLDYIAGKMMVVRSLVANLQLVVVEPDGHPLKDLETDIDFAAMVPLQVYNSLQVPAER QRLMRIKHLIIGGSAIDEALAKELRAFPHAVWSTYGMTETLSHIALRRLNGQAASQWYTP LPGVKVSLNDKGCLVIDAPHLCEGLLETNDIATIKHIKMAKQEDDLQHNDEPKQAFRILG RLDNVLVSGGIKIQIEEVERALHQHLSVPYRIVKCHDQKFGEVVGLQFVPLSAENAESET AMVKQICQKVLPKYWQPKAYFPVKEIERTATGKIKRG >gi|283510574|gb|ACQH01000045.1| GENE 38 38002 - 38763 497 253 aa, chain - ## HITS:1 COG:no KEGG:PRU_1254 NR:ns ## KEGG: PRU_1254 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 75 232 30 186 197 120 41.0 4e-26 MFIVYLCCMMKKHLLVTIICCAVAGAAFAQKERKQNVRRTAENAVVEQPKLVDSIVVDTL HKLVTPVVAVEPTIDAAMALPSLTLNGQIEPLGYRPWFWAGWQRWDLHPGLNLSLGASVF AQFGKHARHGAGFMQNISAMYAQPLGKKFTVAAGGYLNNVFWGRDNYRNAGLQAIVGYRL NEHWEAYLYGQKSLVSNRLMPYPLYDMQALGDRVGAAVKYNFNPSFSVQVSVEQVWGPNM PRGYYDTFNDIGR >gi|283510574|gb|ACQH01000045.1| GENE 39 39135 - 39725 785 196 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928211|ref|ZP_06422058.1| ## NR: gi|288928211|ref|ZP_06422058.1| hypothetical protein HMPREF0670_00952 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 196 1 196 196 383 100.0 1e-105 MKKTLICALVAMFFSASPTWAQSNNGHQDESLADKVSTIVKKAKSSMQRAGKRVEKAIGI NDKNRDGDEVKIDGTYYMPIYSLNIYNGKDADKFKNTCQKAFTKKYPRTNVLSVTIPQEG WVSKAVKDGGKVIGYLQYMYCYVLAQDGDDGYINARFSFQRYKDVGKEYGSVNGRWPQWD RTDVLTPSVYDELKNY >gi|283510574|gb|ACQH01000045.1| GENE 40 39824 - 40597 980 257 aa, chain + ## HITS:1 COG:no KEGG:BF3707 NR:ns ## KEGG: BF3707 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 2 233 3 234 320 164 38.0 4e-39 MNILKALFGGKTEDPEERKKDEETKNFDILKFDGVKALHLGQADHAINCFTKALSYKDDL ETRDYLSQAFLMTNEPLKAYEQLQKLAEAQPDNPKIYVRMADVAYRMDDYGAMADACEKA LLIDKASPTVHYLYARACIGQGDNVNAVAMLTKAILTTPDYLDAYLLRGETLLAMENLEE AEEDANFLVEHVAGNEDALLLKARVEKAKGYNAAAQEYYDKVIEANPFHAEAFKERAEVR EALGDLQGAEEDRNVNF >gi|283510574|gb|ACQH01000045.1| GENE 41 41556 - 43091 1829 511 aa, chain - ## HITS:1 COG:CAC1816 KEGG:ns NR:ns ## COG: CAC1816 COG1418 # Protein_GI_number: 15895092 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Clostridium acetobutylicum # 36 511 46 514 514 406 50.0 1e-113 MIETIIAAAVALVLGCFVGYLLFRYVIKGQYNEMIENANKEAEVLKEKKLLEVKEKFLNK KSELEKEVQIRNQRIQQNENKLKQREISLNQRQDELARRKQEVEQNQQRVDNEKKLLNIK QEELEKMQSQERLKLEELSGLSADEAKERLIESLKDEARTDAASYINEIIDEAKLNANQQ AKKIVIQTIQRVATETAIENSVSVFHIDNDEVKGRIIGREGRNIRALEAATGVEIVVDDT PEAIVISAFDPVRREVCRLALHQLVADGRIHPARIEEVVAKVKKQLENEIIETGKRTAID LGIHGLHPELIRIVGKMKYRSSYGQNLLQHARETANLCAVMASELGLNPKKAKRAGLLHD IGKVPDEESELPHALYGAKIAEKYKEKPDICNAIGAHHDEIEMNTLLAPIVQVCDAISGA RPGARREIVEAYIKRLNDLEAIAMSYPGVTKTYAIQAGRELRVIVGADKMDDAESEKLSG EIATKIQNEMTYPGQVKITVIRETRSVAFAK >gi|283510574|gb|ACQH01000045.1| GENE 42 43205 - 43522 370 105 aa, chain - ## HITS:1 COG:no KEGG:PRU_0226 NR:ns ## KEGG: PRU_0226 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 98 1 98 99 72 45.0 5e-12 MADNKKDFFKITLNVYDRSIPVNCSREDEEKYRKSAKLITDLVNFYSARYAGKKSDIDIL YMVLIDIAMRYHNADEHNDVEPLMDSLKGLAAEIEETLGIEPTSV 
>gi|283510574|gb|ACQH01000045.1| GENE 43 43515 - 43823 337 102 aa, chain - ## HITS:1 COG:no KEGG:PRU_0227 NR:ns ## KEGG: PRU_0227 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 96 1 96 98 74 45.0 9e-13 MDANEFTLSLFTTRLRQLLLQYKELEKKNAELQTMIDERDARIKELDDSLQSVYKSYDSL KMAKMVEITDGDMESAQKRLSKLIRDVNRCITLLSENSIDNG >gi|283510574|gb|ACQH01000045.1| GENE 44 44018 - 44641 618 207 aa, chain - ## HITS:1 COG:no KEGG:PRU_0228 NR:ns ## KEGG: PRU_0228 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 207 1 206 206 282 67.0 6e-75 MGLFSSIFKRQADKSKSGGMEDYMTLVRVYFQATIAANLGITNLAMLPDLRTYKTTLRIP TQNNKLGLGEKAHCRKMLKELYKTDEGFFKEIDASIRKNCHKLQDVQVYLIQFQNFTQDL MMLMGNLMKFKLRLPSFFKKAIYTMTQRTINDIFNKNDFSDAGVLKTVASIRQYNKRLGF SQKWVTDFVFQVVMLAKKEPRPDEKAK >gi|283510574|gb|ACQH01000045.1| GENE 45 45153 - 46538 1611 461 aa, chain - ## HITS:1 COG:PH0923 KEGG:ns NR:ns ## COG: PH0923 COG1109 # Protein_GI_number: 14590777 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Pyrococcus horikoshii # 12 445 6 439 455 261 38.0 3e-69 MTLIKSISGIRGTIGGGVGDTLNPLDIVKFTSAYATFIRQNMPKGNNIIVVGRDARISGE MVKNIVCGTLMGMGFDVLNIGLATTPTTELAVTMAGADGGIIITASHNPRQWNALKLLNN KGEFLTAEDGAKLLKIADKGDFTYADVDHLGHYTEDDSFDQRHIDSVLALKLVDADAIRK AGFKVVVDTVNSVGGVILPKLLDQLGVKYTMLNGQPTGDFAHNPEPLEKNLTEIMTEVKN GNYDMGIVVDPDVDRLVFICEDGKMFGEEYTLVSVADYVLSQTPGNTVSNLSSTRALRDV TLKHGGQYTAAAVGEVNVTTQMKAVNAVIGGEGNGGVIYPESHYGRDALVGIALFLSNLA HKGCKVSELRATYPNYFIAKNRIDLTPSTDVDAILERVKELYKDEQVNDIDGVKIDFADK WVHLRKSNTEPIIRVYSEAATMEDADAIGKKLMQVVYDMSK >gi|283510574|gb|ACQH01000045.1| GENE 46 46554 - 47201 550 215 aa, chain - ## HITS:1 COG:no KEGG:PRU_0303 NR:ns ## KEGG: PRU_0303 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 11 213 1 203 205 165 43.0 8e-40 MKKLLFSLFLLAGALSWQSCSRYETYADQVESERNAISNYITKKNIKVISENEFVANGET TDTAKNEYVLFESSGVYMQIVHKGVGRKIEKGERMNVTCRFSERNLLTDSLELSSFHPRF 
SAWLDRMDVTNTSGTFTASLTGKGGLMYVARQRTNGTAVPSGWLTPFSYILLDRLSSSEE AGARVRLIVPHDRGHGAAVNGVIPYFYELELSKAR >gi|283510574|gb|ACQH01000045.1| GENE 47 47451 - 48488 989 345 aa, chain - ## HITS:1 COG:aq_1630 KEGG:ns NR:ns ## COG: aq_1630 COG0618 # Protein_GI_number: 15606737 # Func_class: R General function prediction only # Function: Exopolyphosphatase-related proteins # Organism: Aquifex aeolicus # 23 336 18 314 325 110 28.0 4e-24 MQLEILSQSDSNTLNGLIQAANNIVICCHKSPDGDAIGSSLAWAEYLRQVGKEPLIIVPD AFPDFLRWLPGVEKVMRYDKHTAVCKERIAEADLVFCLDFNATNRVDDMQQALETTTAQR VVFDHHLNPQMDARLMVSLPQMSSTCEIVFRIVWQLGGFEMFSRQAAIAVYCGMMTDTGG FTYNSTRPEIYVIIGHLLTKGFDKDKVYRNVYNNYSSWAIRFRGYLMSQKLNVLNELNAS YFCITRQDMKNFHFVKGDAEGLVNEPLRIKGQKLVISLREDTDKDNLILVSLRSVDDFPC NEMAERFFNGGGHLNASGGKLFCSMSEAERVVRDAIMAYSGRLRA >gi|283510574|gb|ACQH01000045.1| GENE 48 48901 - 49290 422 129 aa, chain - ## HITS:1 COG:no KEGG:PRU_1440 NR:ns ## KEGG: PRU_1440 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 127 1 127 128 173 67.0 2e-42 MKKLVLLSMMLFAFVGFAVAQQQAEIKFDKITHNFGTFSEKNPVQECTFTFTNVGKTPLV INEATASCGCTVPSYPKTPIKPGEKGEVKVVYNGKNMFPGHFKKTITIRTNGVTEMTRIY VEGVMEEAK >gi|283510574|gb|ACQH01000045.1| GENE 49 51369 - 51761 315 130 aa, chain + ## HITS:1 COG:no KEGG:BF0742 NR:ns ## KEGG: BF0742 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 128 3 129 131 179 67.0 2e-44 MKRFTEEQITTLAPNEIFVFGSNIQGYHGGGAARMALNKFGAVWGKGDGLQGQSYAIPTM EGGVETIRPFVDKFISFASKHPEYTFLVTRIGCGIAGFTDGEIAPLFQNAISLTNIILPK TFADVLEQTK >gi|283510574|gb|ACQH01000045.1| GENE 50 51941 - 53923 2200 660 aa, chain - ## HITS:1 COG:all0889 KEGG:ns NR:ns ## COG: all0889 COG0457 # Protein_GI_number: 17228384 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. 
PCC 7120 # 37 640 51 584 605 141 24.0 4e-33 MRRKIALVFFALVAFIARAQFNVDRLMTSGQIALHYEDYVLSMQYFNQIISLRPYLYQPW QYRAIAKFYLDDYVGAEEDAGEAISLNPYIEDIYDLRAISRIRQSKYQEAITDYNRAIKL NPSNRNYWYNRAACRMNGKDYAGAHHDIDTIVSRWADYANAYTLNAEVYLHEKDTLKAAE WLDKGLKLDPYDANVWTMRAYISLSRKQWREADEFLGKAIHLKSKNVDNYLNRAIARVNL NNLRGAMADYDLALELDPNNFLGHYNRGLLRLQLGDDNRAIADFDFIIKMEPNNFMAVFN RGILHEKTGNIRAAIDDYSTVIKQFPNFWIGLSYRARCYRRLGMVAKAELDEFKIFKAQM DKRLGVQARWTRAKLKEVRKRSEINMDKYNQMVVSDENEVAHEYESRYRGRVQDNATNTS LMPMFHLSLVPYANGIRSYQAFDRDVEQFNRVAKLGRPLYVACGNEQLDERHSKDFFVLV ESLTQRIGANANNAELRALLLQRAVTYGTLQNFDAAIADLTTLLSLDSTSVLAYWHRAVC QMAMSGFKASNGADAKLLEAGALSDLNRAITLQPQSAYLYYCRANLYYNQGNNDKAIADY TAAIKLDSRFAEAYFNRGLAHVARGERPQGVKDLSKAGELGLYSAYNIIKKNSATTAQKK >gi|283510574|gb|ACQH01000045.1| GENE 51 54078 - 54641 717 187 aa, chain - ## HITS:1 COG:FN1157 KEGG:ns NR:ns ## COG: FN1157 COG0242 # Protein_GI_number: 19704492 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Fusobacterium nucleatum # 1 168 1 159 174 125 41.0 3e-29 MILPVYIYGQQVLRKVAQDIPLDYPNLKELIQNMFETMEASDGIGLAAPQIGLDIRLLVV DLDVLAETYPEYKGYRKAFINPHIVEIDEQSPTESLEEGCLSLPGIHEKVTRHTRIRMQY VDEDLKPHDEWFEGYLTRVLQHEVDHLDGILFTDHLSPFRKQLIKNKLKALLQGKFRCGY RTKQATR >gi|283510574|gb|ACQH01000045.1| GENE 52 54638 - 55054 186 138 aa, chain - ## HITS:1 COG:CAC1680 KEGG:ns NR:ns ## COG: CAC1680 COG0816 # Protein_GI_number: 15894957 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. 
subtilis) # Organism: Clostridium acetobutylicum # 3 135 2 134 135 83 38.0 9e-17 MARILSVDYGQKRTGIAVTDPLQIICNGLVTVSSSTLVDFLLDYTKREEVELIVVGEPKQ TNGQPSENLERVRQFVARWKKACPTIPVVYYDERFTSVLAHRAMIDGGLKKKARQDKALV DEISATIILQSYLESKRR >gi|283510574|gb|ACQH01000045.1| GENE 53 55061 - 55540 370 159 aa, chain - ## HITS:1 COG:no KEGG:PRU_0589 NR:ns ## KEGG: PRU_0589 # Name: not_defined # Def: sporulation-like repeat protein # Organism: P.ruminicola # Pathway: not_defined # 14 157 2 149 152 137 53.0 1e-31 MKKSMILCASLCAAFMLSSCGSSKESAYKRAYEKAKSQEQTSGESAAPVVTPVEESPATQ ITVEESVENASVRQENVTVVSGSGLNDYSVVVGSFSLKTNAEGLYNKLKSEGHDARLAYN ADRNMYRVVAASFTSKADAVTSRNEFRATYPDAWLLYRK >gi|283510574|gb|ACQH01000045.1| GENE 54 55815 - 56036 367 73 aa, chain + ## HITS:1 COG:no KEGG:PRU_0588 NR:ns ## KEGG: PRU_0588 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 69 1 69 71 70 78.0 2e-11 MKALGYIGAFLGGAIAGAALGLLVAPEKGEDTRCKISDAVDDFCKRHNLKLSRKQVDDLV DDIKDAVPESIVE >gi|283510574|gb|ACQH01000045.1| GENE 55 56038 - 56391 386 117 aa, chain + ## HITS:1 COG:no KEGG:PRU_0587 NR:ns ## KEGG: PRU_0587 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 3 116 1 114 116 79 51.0 5e-14 MAMFSTDNNVQTIGQLVEVLKDYIGLQKEFVKLDVIEKVVRLITVATLLGILTASILMIA IFGSFAAAYALAPKLGMAASFCIVAAFYLVLFLLIFAFRKQWIEKPVVRILASILMD >gi|283510574|gb|ACQH01000045.1| GENE 56 56396 - 56671 211 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260912127|ref|ZP_05918683.1| ## NR: gi|260912127|ref|ZP_05918683.1| hypothetical protein HMPREF6745_2638 [Prevotella sp. oral taxon 472 str. 
F0295] # 1 90 1 90 93 138 78.0 1e-31 MENETIQTPSYRSLTQIRVRKEALKKQIEEEDAKIKTLWDKLFRPNEQADTSPSKRFTDV LNTGAGVLDALILGWKLYRKFSGKPLFRLKW >gi|283510574|gb|ACQH01000045.1| GENE 57 56687 - 57448 626 253 aa, chain - ## HITS:1 COG:BB0533 KEGG:ns NR:ns ## COG: BB0533 COG1235 # Protein_GI_number: 15594878 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Borrelia burgdorferi # 1 251 1 251 253 167 35.0 1e-41 MKFTLLGTGTSNGVPVLGCGCEVCRSDNSKDKRMRCALLVESKEHRVLVDCGPDIRMQLM PQVFRPIDAVLLTHIHYDHVGGVDDLRPYCSFGDIHLYGNVDTVKGLKHNMPYCFSDDLY PGVPLLKLHAIEPHQPLMFGDINVMPFVVLHGQMPILAYRFGQLAYITDMKTIGNDELQY LKDVEVLIINALRFEKEHHSHQLVSEAIDFSRQLGARRTILTHLTHQIGLHEKASKRLPI GVEFGYDGMQVEL >gi|283510574|gb|ACQH01000045.1| GENE 58 57445 - 58461 1024 338 aa, chain - ## HITS:1 COG:NMB0811 KEGG:ns NR:ns ## COG: NMB0811 COG0812 # Protein_GI_number: 15676709 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate dehydrogenase # Organism: Neisseria meningitidis MC58 # 5 336 6 338 346 283 44.0 3e-76 MTEHYDYDLTPHNTFGIKARCRRYVAFDTVAEAQKALPKLVSANERLLILGGGSNLLLTG DFDGTVIHSHIMGMEQHDDGEEHVLLRCGSGLVWDEVVAQCVADGLYGAENLSLIPGEVG ASAVQNIGAYGAEVKDLITTVEAVEVETGRIVHFTNAQCEYAYRQSRFKRDWKNRFFITH VTYRLCRTFVPHLDYGNIKAALAEKGVVVPTAIELRQAIIDIRKAKLPDPKMEGNAGSFF MNPIVQRSKFEALLPLYPDMPHYIIDADRVKIPAGWMIEQCGWKGKALGRAGVHHKQALV LVNKGGAEGSDVLALCRRIQADVKAKFGIDIYPEVNIQ >gi|283510574|gb|ACQH01000045.1| GENE 59 58477 - 59334 728 285 aa, chain - ## HITS:1 COG:no KEGG:PRU_0583 NR:ns ## KEGG: PRU_0583 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 3 282 2 276 279 262 47.0 8e-69 MKKLIFFAIMVVLLIAMGSTTTGCTDKKPTTADSTRTDTIVGDTATRDTLESIIEEQPMP KAADELFDDFVFNFAANRKLQFARVKFPLDVYQNEKVVKRIEKKDWRMEHFFMKQGYYTL IFDNAKQMELVKDTTISHVVIEKIAFNNKSVKQFLFNRVNGQWMLTSMNLKAMYETTNAS FLQFYQRFASDSAFQVQSLNALVEFTAPDPDDDFGSITGSISPEQWPSFKPGLIPKGEIY NIIYGQKYTESNRKLFVIRGVANGMETEMYFKKLKGKWKLVKFNS 
>gi|283510574|gb|ACQH01000045.1| GENE 60 59535 - 60572 1179 345 aa, chain + ## HITS:1 COG:SA0985 KEGG:ns NR:ns ## COG: SA0985 COG0016 # Protein_GI_number: 15926722 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase alpha subunit # Organism: Staphylococcus aureus N315 # 18 343 21 345 352 314 46.0 1e-85 MILDKIDQLLNEVSNLSAQNADEIEQLRLKYLSKKGEITALMADFRNVAADQKKAVGMKI NELKQLAQDKLNELKEAVGTQKKAEDAPDLTRTPYPIKLGTRHPLTIVRNQIIDIFQRMG FTLAEGPEIDDDLHVFTKLNFAPDHPARDMQDTFFIETNPSDVTKNVILRSHTSNDQSRV MERQQPPIRVICPGRVFRNEAISARAHCFFHQLEGLYIDKNVSFTDLKQVLLTFARELFG PDTDIRLRPSYFPFTEPSAEMDISCHICGGEGCGFCKHTGWVEILGCGMVDPNVLEACGI DSTVYSGYAFGLGIERITNLKYHVADLRMFSENDVRFLQEFQAAN Prediction of potential genes in microbial genomes Time: Sat May 28 00:57:32 2011 Seq name: gi|283510573|gb|ACQH01000046.1| Prevotella sp. oral taxon 317 str. F0108 cont2.46, whole genome shotgun sequence Length of sequence - 1342 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Sat May 28 00:57:38 2011 Seq name: gi|283510572|gb|ACQH01000047.1| Prevotella sp. oral taxon 317 str. F0108 cont2.47, whole genome shotgun sequence Length of sequence - 21653 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 8, operones - 5 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 625 - 678 11.6 1 1 Op 1 . - CDS 687 - 1601 653 ## BT_3652 hypothetical protein - Prom 1627 - 1686 2.7 2 1 Op 2 . - CDS 1708 - 2436 464 ## BT_3652 hypothetical protein 3 1 Op 3 . - CDS 2459 - 3307 969 ## COG0501 Zn-dependent protease with chaperone function - Prom 3505 - 3564 5.4 4 2 Tu 1 . - CDS 3646 - 4452 542 ## PRU_0966 hypothetical protein - Prom 4508 - 4567 7.2 5 3 Op 1 . + CDS 4883 - 6499 1997 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Term 6506 - 6537 -0.1 6 3 Op 2 5/0.000 + CDS 6576 - 7178 777 ## COG0576 Molecular chaperone GrpE (heat shock protein) 7 3 Op 3 . 
+ CDS 7204 - 8376 1332 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain 8 3 Op 4 . + CDS 8376 - 9683 1069 ## COG0285 Folylpolyglutamate synthase + Prom 9813 - 9872 5.8 9 4 Tu 1 . + CDS 9896 - 10987 350 ## PROTEIN SUPPORTED gi|15900011|ref|NP_344615.1| aldose 1-epimerase + Term 11026 - 11063 1.0 10 5 Op 1 . + CDS 11195 - 12508 1304 ## COG0738 Fucose permease 11 5 Op 2 . + CDS 12582 - 13742 1468 ## COG0153 Galactokinase + Term 13798 - 13849 8.1 + Prom 13788 - 13847 2.4 12 6 Tu 1 . + CDS 13977 - 14729 718 ## BDI_2957 hypothetical protein - Term 14875 - 14944 9.1 13 7 Op 1 . - CDS 14987 - 15580 596 ## COG0424 Nucleotide-binding protein implicated in inhibition of septum formation 14 7 Op 2 . - CDS 15603 - 16121 545 ## COG1778 Low specificity phosphatase (HAD superfamily) 15 7 Op 3 . - CDS 16126 - 16914 522 ## PRU_2462 hypothetical protein 16 7 Op 4 . - CDS 16920 - 17447 535 ## COG0778 Nitroreductase 17 7 Op 5 . - CDS 17460 - 18131 718 ## COG1285 Uncharacterized membrane protein 18 7 Op 6 . - CDS 18142 - 18756 455 ## BT_0462 putative transcriptional regulator - Prom 18965 - 19024 5.1 - Term 20019 - 20062 5.1 19 8 Op 1 . - CDS 20106 - 20606 350 ## gi|288928250|ref|ZP_06422097.1| hypothetical protein HMPREF0670_00991 20 8 Op 2 . 
- CDS 20606 - 21427 202 ## gi|288928251|ref|ZP_06422098.1| hypothetical protein HMPREF0670_00992 - Prom 21572 - 21631 5.7 Predicted protein(s) >gi|283510572|gb|ACQH01000047.1| GENE 1 687 - 1601 653 304 aa, chain - ## HITS:1 COG:no KEGG:BT_3652 NR:ns ## KEGG: BT_3652 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 43 297 4 257 262 124 28.0 4e-27 MKKKTIVTLMFMLCLVAQQVAAAARFRIRVRGGGSDDVESGPLTWIIYGVVAIAILGGLY IYMTELKSLVLKIFGGLRMDDKTTLTKDQQRKMLLSSVSAAYEKVILNSLKTGMARTERD DYLKEHWDVDGHDSAINVLNDWKAACTKNFIPHIGEAFKLKEQQPIDKYLNDTFVLDKDA RACARQIENAFKMMPKLVKLGIVKDETDFVRIGADGYYISVLVYLARLCAESKYISEEEM WQYVDAADEFAHQSLTSWEDYGKSYLIGYALWLGSGIQLKMQADVVKKLIENPKSPWNSF PFAK >gi|283510572|gb|ACQH01000047.1| GENE 2 1708 - 2436 464 242 aa, chain - ## HITS:1 COG:no KEGG:BT_3652 NR:ns ## KEGG: BT_3652 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 236 35 257 262 148 35.0 2e-34 MFSFFSSFYYAIFGGFRIDETTTLSKEQQRKLLLSGVFSAQKGTFMNVVKTGMSYGGRQK MFGQGWGITDKESAIDTLDYLQHAGTRRFFPLVVEALKLQNKRAIQQYITENLEDEDDMR DCWEQVQFAFGSMDSLINAQLIKDQADFVRIGPDAWDAGRLVFMARLCREHEYITDEQMW QYIDAADEIAHQTLTSWEDFGKSYIIGRCLWCGTANYFEVMADHAKDMYSSPKSPWKGVP FA >gi|283510572|gb|ACQH01000047.1| GENE 3 2459 - 3307 969 282 aa, chain - ## HITS:1 COG:ECs3811 KEGG:ns NR:ns ## COG: ECs3811 COG0501 # Protein_GI_number: 15833065 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Escherichia coli O157:H7 # 31 259 70 292 294 172 40.0 8e-43 MKKVLSHMFVAVLCILGTMPAMAQFNLKKAVGGAAKAAKAVTLTDADVVNYVKEYIDWMD KHNQVCPDDNPYAIRLKKLTAGLTQVEGIPLNFKVYYVVDVNAFACADGSVRVFSSLMDI MTDEELLGVIGHEIGHVAHKDSKNAFRTALLTSALKDGVSSTNGKAAALTDSQLGDLGEA LLNATYSQKQESNADAYGYEFLKKNNKNPWAMALSFEKLKKLEEDAGYQKDSKWKRMFSS HPDLDKRIKTMSERAQKDGFTRPENKMPEKTAAQKVVKGKKK >gi|283510572|gb|ACQH01000047.1| GENE 4 3646 - 4452 542 268 aa, chain - ## HITS:1 COG:no KEGG:PRU_0966 NR:ns ## KEGG: 
PRU_0966 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 268 1 268 270 174 34.0 2e-42 MKRFLFLFVVLCVAFVLPSKGQTKVTGRVVDTNGHGIEYATVCVDSLFCISDADGHFNLQ LPHSVTADMSVSHISFKTKLVPRSVFQQGYVNVMLEERAYNLGSVDIVAKQTKLETIVGR GMKLPGDAAFGKEQQGKVEMGPLFSVKRDYVVASFDMKVEECTYQQCVVRLIVYEVHGAK FEPVQSKPLYLALTPANQNTHIKIKTQGDITLKKGSTYYVGVSVVSGSNAGTLHFPAHLR SSYVRNLVKGTFKKLPATLGMVVKGYRI >gi|283510572|gb|ACQH01000047.1| GENE 5 4883 - 6499 1997 538 aa, chain + ## HITS:1 COG:BS_ykpA KEGG:ns NR:ns ## COG: BS_ykpA COG0488 # Protein_GI_number: 16078507 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Bacillus subtilis # 1 536 1 538 540 632 57.0 0 MITVTNLAIQFGKRVLYKDVNMKFTSGNIYGIIGANGAGKSTLLKAISGELEPNKGTVEL GPGERLSVLDQDHFKFDEYRVIDTVLMGHEKLWKNMKEREALYAKEEMTEEDGNRAAVLE EKFAEMNGWEAESDAAQMLSNLGVTDALHYKQMSDLSNNEKVRVMLAKALFGNPDNLLLD EPTNDLDLDTVNWLEEYLGNVEQTVLVVSHDRHFLDAVSTQTVDIDFGKVTIFSGNYSFW YESSQLALRQAQNQKQKAEEKRKELEEFIRRFSANVAKSRQTTSRKKMLEKLNVEEIKPS SRKYPGIIFQMEREPGNQILEVEGLKATDNEGNVLFDNVSFNVEKGQKVVFLSHNPKAMT ALFEIINGNREAEAGTYKWGVTITTAYLPLDNTAFFDSKLNLVDWLSQYGPGNEVVMKSY LGRMLFSGEDVLKEVNVLSGGEKMRCMIARMQMKNANCLILDTPTNHLDLESIQAFNNNL VGFKGNILFASHDHQFIQSIADRIIELTPNGTIDKLMSYDDYIYDPAIKEQRAKMYNA >gi|283510572|gb|ACQH01000047.1| GENE 6 6576 - 7178 777 200 aa, chain + ## HITS:1 COG:alr2445 KEGG:ns NR:ns ## COG: alr2445 COG0576 # Protein_GI_number: 17229937 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone GrpE (heat shock protein) # Organism: Nostoc sp. 
PCC 7120 # 40 199 70 229 248 86 36.0 3e-17 MNDNMTDENMEKKANGNLDNNEEAVDLNQDAPQEETNEPQEQPQQTEEEPQTEEEKLAKQ LEELKDKYLRTVAEFENFKRRTLKEKAELILNGGGKTITAILPIIDDMERAIENAHKQEC VDAVEEGWELIYKKLLSTLEGMGVKKMEVDGKDFDVDFHEAVAMVPGMGDEKKGKIIDCL QTGYTLNDKVIRHAKVAVGQ >gi|283510572|gb|ACQH01000047.1| GENE 7 7204 - 8376 1332 390 aa, chain + ## HITS:1 COG:RSc2634 KEGG:ns NR:ns ## COG: RSc2634 COG0484 # Protein_GI_number: 17547353 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Ralstonia solanacearum # 1 390 1 379 380 287 44.0 3e-77 MAKRDYYEVLGVDKNASEDEIKKAYRKLAIKYHPDKNPGDKDAEAKFKEAAEAYDVLHDP EKRKQYDQFGFDGPAGMGGSGGFGGGASMDMDDIFSMFGDIFGGHGGFSGFSGFGGGTRS RQHVQFRGSDLRLKVRLTLQEIATGITKKFKVKKDVTCTHCHGTGEEAGSGTETCPTCHG QGIVTKTVRTMLGMMQTQTECPTCHGEGTVIKNPCQTCHGTGVTKGEEVVEIKIPAGVAE GMVVNVPGKGNAGKRNGITGDIQVCIEEEPNDTFIRDGNDLIYNELLDFPTAALGGSLEI PLVDGGKVRMKVAPGTQPGTALRLRGKGLPSVQGYGPGVGDLIVNLSVYVPKTLSGEEKA CLDKLAQSDNFKGDRVTKQTIFQKFKNYFN >gi|283510572|gb|ACQH01000047.1| GENE 8 8376 - 9683 1069 435 aa, chain + ## HITS:1 COG:CAC2398 KEGG:ns NR:ns ## COG: CAC2398 COG0285 # Protein_GI_number: 15895664 # Func_class: H Coenzyme transport and metabolism # Function: Folylpolyglutamate synthase # Organism: Clostridium acetobutylicum # 1 428 1 430 431 230 34.0 5e-60 MNYQETIQYLYNSTPVFEHVGASAYKEGLDNSLALDSHLGAPHRAFKSIHVAGTNGKGST AHTLAAILQACGYRVGLYTSPHLVDFRERIRINGEPLSESYVVDFVAQHRAFFEPLHPSF FELTTAMAFSYFTESHVDVAVIEVGLGGRLDCTNIISPILSVITNISLDHTGFLGNTLVQ IAVEKAGIIKPNTPVVVGETTVETHQVFENTAKERHAPVVWAEQNPEVLQWQHRNEGGIS LQTRSFGNIVFELGGNYQHHNANTILTAARQLQTQGFDVTAKEVGQGMATVCQSTGLQGR WQILGHAPLVVCDTGHNVAGWQYLSPQIRAQSYRTLRMVFGMVDDKDLDTVLTLLPKDAT FYFTQASTHRAIPSEVVAQKAEQNALHGSIYNNVYAAYKAALSDAAKDDFVFVGGSSYVV ADLLSNLKMCSENTE >gi|283510572|gb|ACQH01000047.1| GENE 9 9896 - 10987 350 363 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900011|ref|NP_344615.1| aldose 1-epimerase [Streptococcus pneumoniae TIGR4] # 21 358 11 341 345 
139 29 2e-32 MTNTDKVQNVSGLKREDFQTHINGKETDLYILRNKDGNEVAITNYGGAIVAIIVPDKHGN LANVIQGHDNIQDVVNSPEPYLSTLVGRYGNRIAKGKFQLHGKEYSLPINNGPNSLHGGK KGFNAKVWDALQMNDTTLVLNYISAYGEEGFSGELKTTVIYTFTDDNELVIEYMAKTNKK TVINLTSHGFFSLAGIANPTPSIENLECEINADFYIPIDEVSIPTGEIRFVKGTPFDFRT PKTIGQDINADHEQIKNGAGYDHCFVLNKREEGELSFAARIMEPVTGRTMEVYTTEPGVQ LYTDNWADGYKGQHGATFPRRSGICFEAQHFPDSPNHPYFPSVVLNPGEQYKQKTIYKFG TIA >gi|283510572|gb|ACQH01000047.1| GENE 10 11195 - 12508 1304 437 aa, chain + ## HITS:1 COG:BMEII1053 KEGG:ns NR:ns ## COG: BMEII1053 COG0738 # Protein_GI_number: 17989398 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Brucella melitensis # 13 429 24 412 412 125 29.0 1e-28 MENKQAKSGSIVAIVTMMFLYAMISFVTNLAAPIGIIWKNAFQGDSANTVGMLGNAMNFL AYLFMGIPAGKLLSKIGYKRTALVGIALGFIGVLVQFLSGGVDDKVAGFSVYLLGAFISG FCVCTLNTVVNPMLNLLGGGGNRGNQLNLVGGTLNSLAGTLTPMLVGALIGTVTAKTAMV DINLVLYIAMGVFAAAFVALLFIPISDPEMGKTTAETVYERSPWAFRHFVLGTIAIFIYV GVEVGIPGTLNFYLSDTSANGGGLDPATAVKIGGFVAGTYWFLMLVGRFSAGFIADKVSS KAMMIVTASLGIFLILLAMVLGKSTTMAMPVFTGSSFQLVTVPIAAPLLVLCGLCTSVMW SSIFNLATEGLGKYSAAASGIFMMMVVGGGVLPLMQNFIADHAGYLISYSIPLIGLAFIL FYALIGSKNVNKDIRVD >gi|283510572|gb|ACQH01000047.1| GENE 11 12582 - 13742 1468 386 aa, chain + ## HITS:1 COG:CAC2959 KEGG:ns NR:ns ## COG: CAC2959 COG0153 # Protein_GI_number: 15896212 # Func_class: G Carbohydrate transport and metabolism # Function: Galactokinase # Organism: Clostridium acetobutylicum # 1 385 1 388 389 253 38.0 6e-67 MDIEHVRSRFIKHFDGKTGNIYASPGRINLIGEHTDYNGGFVFPGAVDKGIMAEVRPNGT DTVMCYSIDLKDRVEFKVDDPNGPRASWARYIYGIIQEMKKLGVDVKGFNTAFSGDVPLG AGMSSSAALESCFAFALNDLFGDNKVSKWDMVLAGQATEHNYCGVMCGIMDQFASVFGQE GKLMRLDCRSREFEYFPFNPQGYKLVLLDSKVKHELASSAYNHRRKSCENVVAALNAKFP EKKFDTLRDADWQELDAVKADVSEEDFKRAHFVLGEKDRVLAVCDALNAGDYETVGKKMY ETHEGLSKEYEVSCEELDYLNELAKENGVTGSRIMGGGFGGCTINLVKDELYDKFIADAK QKYAAKYKHEPAVYDVVISDGSRKVC >gi|283510572|gb|ACQH01000047.1| GENE 12 13977 - 14729 718 250 aa, chain + ## HITS:1 COG:no KEGG:BDI_2957 NR:ns ## KEGG: BDI_2957 # Name: 
not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 245 1 246 248 164 37.0 2e-39 MNLKTKAQLKEYAARYETTAFLKEDPARFMHLVQGANNQETMAFIAANLSYGNRKQFFPK INFVLEKSQGEPYNWVKSGLFEQDIPPTDDCYYRLYTKRMMNSFLKTLQKLLSEHNTLHQ YIQTTATNDAETAIVALCKYFAEHGQEGIVPINARSACKRICMFLRWMVRDNSPVDLGIW KDTIDKRTLIMPLDTHVLTQAQRLKLLKSKSTSMSIAKKLTAEMAKVFPNDPLKADFALF GYGVNDGQEI >gi|283510572|gb|ACQH01000047.1| GENE 13 14987 - 15580 596 197 aa, chain - ## HITS:1 COG:BH3033 KEGG:ns NR:ns ## COG: BH3033 COG0424 # Protein_GI_number: 15615595 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Nucleotide-binding protein implicated in inhibition of septum formation # Organism: Bacillus halodurans # 5 186 4 182 190 140 45.0 2e-33 MKNRLVLASNSPRRKELLRGLGIDFEVRLIRDIDESFPATLPVSDVAVHISKAKATAYLD TMAENEVVLTADTVVVCDGQILGKPQDAEDARRMLGLLSGKSHEVITGVTLSTRQWQRSF AVTTVVWFKELTADEISFYVDSYRPFDKAGAYGIQEWIGYVGVQRIEGSYFNVMGLPVQR IYEELKDCFDFLHPFEV >gi|283510572|gb|ACQH01000047.1| GENE 14 15603 - 16121 545 172 aa, chain - ## HITS:1 COG:aq_2171 KEGG:ns NR:ns ## COG: aq_2171 COG1778 # Protein_GI_number: 15607107 # Func_class: R General function prediction only # Function: Low specificity phosphatase (HAD superfamily) # Organism: Aquifex aeolicus # 6 154 7 155 163 108 36.0 4e-24 MINYDLQRIRAIIFDVDGVLSASTITLSADGEPLRTVNIKDGYAIQFAQKVGLRICIITG GDTKAVRKRFEGLGVEDIYMKAGVKLLAYEDFKARYGYADDELMFVGDDVPDYEVMSRCG CACCPADACADIKSVSRYVTTCAGGMGCGREVIEQTLRAKGLWMKDAKAFGW >gi|283510572|gb|ACQH01000047.1| GENE 15 16126 - 16914 522 262 aa, chain - ## HITS:1 COG:no KEGG:PRU_2462 NR:ns ## KEGG: PRU_2462 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 7 260 1 254 256 250 48.0 4e-65 MSKIKAIKIVLFGAGNLATSLGLALQRAGFSVAQVYSRTLTAAETLAVRLGVTAINSLES LSADADVYLLALKDDALASVLPQVCQLNPHAVYVHTAGSVPMNVFEGLAKNYGVFYPLQT FSKQKPLDFSSIPCLIEGANEFSEDVLLKMARALSHDVRLTTSADRRIVHLAAVFGCNFV NLCYSLANETLKRRGLDFGLLLPLIDETAAKVHRLSPKAAQTGPAVRMDKAVMERQMSLI DDAEERRLYELMSEMIHKQANT 
>gi|283510572|gb|ACQH01000047.1| GENE 16 16920 - 17447 535 175 aa, chain - ## HITS:1 COG:CAC3555 KEGG:ns NR:ns ## COG: CAC3555 COG0778 # Protein_GI_number: 15896791 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 6 171 3 168 174 127 36.0 8e-30 MNSFADLMHRRRSCRKFTNEKIDADELSMVLNAALISPSAKNRRSWHFVVVEDKETLEKL SECKESGAALLKDAAVAVVVLGNVAENDCWIEDCSIAALAMQLQAEDLGLGSCWVQVRNR GLNSGERGTDIVRGILNIPETLDTLCILALGHKASDLPMKNDEDARWEQVHVGGF >gi|283510572|gb|ACQH01000047.1| GENE 17 17460 - 18131 718 223 aa, chain - ## HITS:1 COG:lin2751 KEGG:ns NR:ns ## COG: lin2751 COG1285 # Protein_GI_number: 16801812 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Listeria innocua # 2 223 1 220 220 179 47.0 4e-45 MMDIDFALRIFIAGMLGGAIGLEREYRAKEAGFRTHFLVSLGSALFMILSHSGFDFILEH GKNVSLDPSRIASQVVTGIGFIGAGIIIFQKHVVHGLTTAAGLWVTSAIGLTCGSGMYLL ATSATVMVLICLEALHLILRKIGTRHIVLTLSTPSHDNIKAVLKDMRAKNMTIVSYEMQE KALGDGQHLLVATIELKLKKQIYENCLLQFMEEYEGVDINKIE >gi|283510572|gb|ACQH01000047.1| GENE 18 18142 - 18756 455 204 aa, chain - ## HITS:1 COG:no KEGG:BT_0462 NR:ns ## KEGG: BT_0462 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 24 201 7 183 184 92 28.0 9e-18 MHSIPKKGNESNGLLDTDELSSPQQLTEDGRPWHAVRLFSLRLNEAMAYFTSAGLTCFVP MEWSGKRTGDDDGKPIFKPVVFNLVFVKKDKEECDLRRIFANARFKMSVIRKDKVAGTYY EIPSKQMLDFMLMCSPEIEMRKFLSATEAKLKSGTPVIVKHGPLKGLTGRLVRSSKKYFL LKEVPELGVMIKVSRWCCAPIEQQ >gi|283510572|gb|ACQH01000047.1| GENE 19 20106 - 20606 350 166 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928250|ref|ZP_06422097.1| ## NR: gi|288928250|ref|ZP_06422097.1| hypothetical protein HMPREF0670_00991 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 166 15 180 180 319 100.0 3e-86 MKSLLLSFLLAIPTLAFSQKEDVRAEIHFQGNAFYHEFRPYEFAGSVGYNITDRFFVNVR GESTVALFKINGLKDYYTNVMYGANVGYTFLKNDLGNFDVRIGLGDNLRCKQDWKYNYYD AGVYMHLGKAKIKPDFGFGIRRYHSKGSMFKSYTRVFVAIGLSFGT >gi|283510572|gb|ACQH01000047.1| GENE 20 20606 - 21427 202 273 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928251|ref|ZP_06422098.1| ## NR: gi|288928251|ref|ZP_06422098.1| hypothetical protein HMPREF0670_00992 [Prevotella sp. oral taxon 317 str. F0108] # 1 273 1 273 273 510 100.0 1e-143 MKKITIYGLCLTLCLSVIGCSSSDVVETNKLSTKEVQALTELKSDLYAVNAEYAKSNTRG LKKWLRWLIFGAADAAGFVTGGGAVAISASTLAWTVTKAEREVSTNSDFKDCAEVALDKG SIGYAHNELSQKIVREHQDSLLGMPIDQLAEIVEEESKAYPAIENKSVDREILKQIISTF NADASIQDNINAFKQFTNDPQKQEALDICGIVLEGLQNVSDENTTYIDQVNRLVDASPVG FQTKKMIKGGISVADASAKLWNSSELEELPKAK Prediction of potential genes in microbial genomes Time: Sat May 28 00:58:27 2011 Seq name: gi|283510571|gb|ACQH01000048.1| Prevotella sp. oral taxon 317 str. F0108 cont2.48, whole genome shotgun sequence Length of sequence - 36567 bp Number of predicted genes - 31, with homology - 28 Number of transcription units - 17, operones - 7 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 157 - 216 4.7 1 1 Tu 1 . + CDS 282 - 974 707 ## COG1051 ADP-ribose pyrophosphatase - Term 1268 - 1315 5.0 2 2 Op 1 . - CDS 1339 - 1623 302 ## COG1846 Transcriptional regulators 3 2 Op 2 . - CDS 1623 - 2225 488 ## BDI_1981 hypothetical protein - Prom 2308 - 2367 7.3 4 3 Tu 1 . + CDS 2696 - 4015 1161 ## COG0534 Na+-driven multidrug efflux pump - Term 4038 - 4079 5.2 5 4 Op 1 . - CDS 4112 - 5227 809 ## PRU_0611 putative lipoprotein 6 4 Op 2 . - CDS 5112 - 6974 1783 ## COG0826 Collagenase and related proteases 7 4 Op 3 . - CDS 6986 - 7903 795 ## COG1897 Homoserine trans-succinylase - Prom 8079 - 8138 2.6 8 5 Tu 1 . + CDS 7806 - 8123 88 ## - Term 9315 - 9346 0.2 9 6 Op 1 . - CDS 9370 - 9735 312 ## PRU_1437 hypothetical protein 10 6 Op 2 . 
- CDS 9759 - 10061 348 ## PRU_1438 hypothetical protein - Prom 10091 - 10150 4.2 11 7 Tu 1 . - CDS 10200 - 11960 1859 ## COG0173 Aspartyl-tRNA synthetase - Prom 12130 - 12189 5.3 - TRNA 13214 - 13289 83.5 # Lys CTT 0 0 - TRNA 13307 - 13382 83.5 # Lys CTT 0 0 - Term 13781 - 13839 1.2 12 8 Op 1 . - CDS 13842 - 14687 623 ## PRU_0459 hypothetical protein 13 8 Op 2 . - CDS 14684 - 15778 871 ## PRU_0458 hypothetical protein 14 8 Op 3 . - CDS 15780 - 16763 998 ## COG0673 Predicted dehydrogenases and related proteins - Prom 16850 - 16909 8.3 - Term 16943 - 16989 11.3 15 9 Tu 1 . - CDS 17024 - 17587 486 ## PRU_1672 hypothetical protein - Prom 17755 - 17814 7.9 - Term 17802 - 17848 7.1 16 10 Op 1 . - CDS 17919 - 19421 1629 ## COG0469 Pyruvate kinase 17 10 Op 2 . - CDS 19418 - 19660 63 ## 18 10 Op 3 . - CDS 19727 - 20476 1061 ## COG1212 CMP-2-keto-3-deoxyoctulosonic acid synthetase - Prom 20496 - 20555 1.5 19 10 Op 4 . - CDS 20558 - 20761 64 ## - Prom 20930 - 20989 5.5 20 11 Tu 1 . - CDS 21040 - 22257 1160 ## PRU_0475 hypothetical protein - Prom 22328 - 22387 3.4 - Term 22317 - 22363 5.1 21 12 Tu 1 . - CDS 22461 - 23714 1665 ## COG0126 3-phosphoglycerate kinase - Prom 23811 - 23870 4.2 - Term 23898 - 23934 1.1 22 13 Op 1 . - CDS 24013 - 24654 640 ## COG0778 Nitroreductase 23 13 Op 2 . - CDS 24660 - 25754 1033 ## PG1623 membrane bound regulatory protein, putative - Term 25774 - 25820 -0.6 24 13 Op 3 . - CDS 25831 - 26484 713 ## COG0603 Predicted PP-loop superfamily ATPase + Prom 26711 - 26770 3.8 25 14 Tu 1 . + CDS 26824 - 27132 247 ## gi|288928274|ref|ZP_06422121.1| hypothetical protein HMPREF0670_01015 + Prom 27647 - 27706 5.9 26 15 Tu 1 . + CDS 27876 - 28100 68 ## gi|288927699|ref|ZP_06421546.1| hypothetical protein HMPREF0670_00440 - Term 28174 - 28218 0.0 27 16 Tu 1 . - CDS 28263 - 29651 1594 ## COG1066 Predicted ATP-dependent serine protease - Prom 29701 - 29760 2.7 + Prom 29651 - 29710 4.3 28 17 Op 1 . 
+ CDS 29779 - 30687 956 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components 29 17 Op 2 . + CDS 30684 - 32939 2317 ## PRU_1631 hypothetical protein 30 17 Op 3 . + CDS 32941 - 33639 888 ## PRU_1452 hypothetical protein 31 17 Op 4 . + CDS 33721 - 36546 3299 ## COG0178 Excinuclease ATPase subunit Predicted protein(s) >gi|283510571|gb|ACQH01000048.1| GENE 1 282 - 974 707 230 aa, chain + ## HITS:1 COG:DR0192 KEGG:ns NR:ns ## COG: DR0192 COG1051 # Protein_GI_number: 15805228 # Func_class: F Nucleotide transport and metabolism # Function: ADP-ribose pyrophosphatase # Organism: Deinococcus radiodurans # 15 215 15 218 225 149 40.0 5e-36 MKEKKYAYKYPHPSVTADCIIFGFDGGKLKVLLIERGQDPYKGKWAFPGGFVQMDESCED GALRELEEETALKGMSVQQFHTYSDPNRDPRERVITVAFLALVRLQEVKAGDDARKAQWF AIDEVPQLAFDHDVILRDALKHLRERIHFQPIGFELLPEKFTMRQLQNLYESILDVHFDR GNFSKKMLHFNILTPLDETVKPTPKREARLYRFNKESYDELKQKGFRLEF >gi|283510571|gb|ACQH01000048.1| GENE 2 1339 - 1623 302 94 aa, chain - ## HITS:1 COG:CC2206 KEGG:ns NR:ns ## COG: CC2206 COG1846 # Protein_GI_number: 16126445 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Caulobacter vibrioides # 5 94 10 99 103 77 40.0 5e-15 MFRELDPLLHSQLRLAIISLLVGLDEADFVYLKERTEATAGNLSVQIDKLSQAGYIEVEK GFEGKRPRTVCRITAQGRTAFAEYVDALKDYIRK >gi|283510571|gb|ACQH01000048.1| GENE 3 1623 - 2225 488 200 aa, chain - ## HITS:1 COG:no KEGG:BDI_1981 NR:ns ## KEGG: BDI_1981 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 195 1 199 209 142 42.0 9e-33 MEEKKINEQESLELIARMIESTKENLEVGRGNRFLYFGYFAFVLSIVVFACVKLTHNNQF SGLWWLMFMCWGFVSWKSKKPKVVTYIDRAINSVWIVVGVMFCLSTVYLLVSAFITHSTD MSLMMPFSLLFVSVGTSMTGVIVRNHAITYLPLVSAIVSFYMLNSLLFGHPSVWWHLLFG FASVFNMILPGHLLNVIEGK >gi|283510571|gb|ACQH01000048.1| GENE 4 2696 - 4015 1161 439 aa, chain + ## HITS:1 COG:VC0090 KEGG:ns NR:ns ## COG: VC0090 COG0534 # Protein_GI_number: 15640122 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump 
# Organism: Vibrio cholerae # 6 421 18 432 454 253 38.0 7e-67 MKPRDKEILNIAIPSIISNITVPLLGMVDVAIMGHLGNAAYIGAIAVSSMIFNVIYWLFG FLRMGTSGMTSQACGARNLEEVTKLLLRSQLIGLLIALAIILLQYPIAQGALFLMRPTPE ISQMAQRYFNVCIWGAPAMLGLYGFMGWFIGMQNTRIPMFISIMQNVVNICVSLALVLFA GMNIEGVAIGTLVAQWSGMLAAVAFFLALYRPKLRIHFNMRGVITHSAMTSFFKVNRDIF LRTLFLVAVNFFFTSAGSAQGTMILAVNTLLLQLHTIFSYFMDGFAYAGEAVGGKYYGAK NKLAFNDTIRRLFVWGGAMVLAFTLLYTSGGRSFLELLTNDQQVVGSALPYFGWTIALPL AGMAAFVWDGVFIGITATRGMLTSSALAALLFFATFYLLRPFLGNHALWLAFILYLASRG LIQTYLFKRIEGVKALQQT >gi|283510571|gb|ACQH01000048.1| GENE 5 4112 - 5227 809 371 aa, chain - ## HITS:1 COG:no KEGG:PRU_0611 NR:ns ## KEGG: PRU_0611 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 48 339 4 307 310 333 55.0 6e-90 MVVQSPHGKSHFTYNWATIGGLGSILIVLNVKWTYMQNDMKKGLLLNILSLLALLLLLSS CYHQPPTHNEGTADLTEEQLDSLSFSASHHYSNNYNFVVKADSLPLVRQQPEELVTDVVL DTLYIKRHDHVAVADIRIVPASGEDSVWVQLARDQSTFGWINETQLLPNVVPDDPISQFI STFSDVHLLIFLVVICVIGMLYLALKIVRAKAFLVHVRDIDSFYPTLLALLVAASATFYA SIQLFAADTWRHFYYHPTLNPFSVPLLLSVFLASAWAILIIGLATIDDVRHKLPFSDAML YLCGLAGVCAVNYIVFSISTLYYVGYPLLLAYGYWAIRRYLAYNRSRYVCGNCGAALHSK GHCPRCGSFNQ >gi|283510571|gb|ACQH01000048.1| GENE 6 5112 - 6974 1783 620 aa, chain - ## HITS:1 COG:STM1604 KEGG:ns NR:ns ## COG: STM1604 COG0826 # Protein_GI_number: 16764948 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Salmonella typhimurium LT2 # 3 615 7 616 654 440 41.0 1e-123 MRHLELLAPAKNLECGKAAVDHGADAVYIGASRFGARAAAGNSVADIAELCTYAHRFGVK VYVTVNTIVYEDELDDTQRLIWDLYRAGADAILVQDMGLLKMQLPPIALHASTQTDNRSA EKVNWLLGHGFERVVLARELSIDEIREIHAQLPQAELEVFVHGALCVSYSGVCYVSQHCF SRSANRGACAQFCRMKFDLIDSAGQVIERQRYLLSMKDLSQIDHLEELAEAGATSFKIEG RLKDMAYVKNVVSAYSQKLNQLVRKYPQRYCRASFGKVEYAFEPSLKKTFNRGYTDYFAH GRQQGIASFDTPKAMGEHVGRVKELRGNSFNVASTASFANGDGLCFINADRELVGFRVNR AEGNRLFPLKMPEGLKPGMSLYRNNDEAFGKVLARQTAVRKVPLHMKLRLTSDGVELETA IVHAGECVQRRVDRLPCEHQQAQKPQRENIVRQLTKLGSTPYTCSEVELEPAVEELFIPS 
SLLADLRRRATDTLMNPPLHLTDQPRTLPTPPERKQTLAWTPMYKRHSYLYNASNHLSKA FYEAEGMPEIGDAFEVKKPQPKDALLMQCRHCIRFSLGHCVKHGGTKPTWKEPLYLQLGD NRRFRLDFNCAECQMDLYAE >gi|283510571|gb|ACQH01000048.1| GENE 7 6986 - 7903 795 305 aa, chain - ## HITS:1 COG:BH2280 KEGG:ns NR:ns ## COG: BH2280 COG1897 # Protein_GI_number: 15614843 # Func_class: E Amino acid transport and metabolism # Function: Homoserine trans-succinylase # Organism: Bacillus halodurans # 1 302 1 302 303 364 54.0 1e-100 MPLRIPDKLPAIDILMRENIFVMDSSRATTQDIRPLRIVVLNLMPLKITTETDLIRLLSN TPLQLEVSFMKLKSHTPKNTPIEHMMSFYEDFDAMRNQKFDGMIVTGAPVEHLPFEEVSY WTEVTEIFEWAKTHVTSALYICWAAQAGLYYHYGIAKHPLNQKMFGVFEQTPLLPLLPIF RGFDNRFNMPHSRHTEVRREDIVQNGALTLVVESECSGVGMVMARGGREFFIMGHLEYAP NTLDTEFKRDWGKRADVSVPLHYYPNDDPTLPPMVTWRAHANLLFSNWINYYVYQETPYN INDIR >gi|283510571|gb|ACQH01000048.1| GENE 8 7806 - 8123 88 105 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSCVVALELSITKIFSRISMSIAGNLSGMRSGIAAYFNKAKLEQWHANKRQDACFLHLAF ALALCKGSNFWRNEKAETRVRRGEWLTKQVLANSFIGRLINLSTR >gi|283510571|gb|ACQH01000048.1| GENE 9 9370 - 9735 312 121 aa, chain - ## HITS:1 COG:no KEGG:PRU_1437 NR:ns ## KEGG: PRU_1437 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 13 100 1 89 93 71 38.0 1e-11 MVALILCYCMSAIKYEQEVFRVLTEAGEEGLAVQKIALHVYNACNSLFNPVSYEEVYAFV SRFLIAKSRMPHSLIERTAHRGVYHLNFNLKETQQLVLQFKHETDVEEPKAPPVDMSLSL F >gi|283510571|gb|ACQH01000048.1| GENE 10 9759 - 10061 348 100 aa, chain - ## HITS:1 COG:no KEGG:PRU_1438 NR:ns ## KEGG: PRU_1438 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 38 100 40 102 102 67 69.0 1e-10 MAEKKATKSVEEAKTSTAKKPAAKKTATKKVTQPAWVIDAESVGFRAGDVYQALSAAGKP LSVAEIAKASDKSEEEVLLGIGWLFKEGKIKGEDKLVTLA >gi|283510571|gb|ACQH01000048.1| GENE 11 10200 - 11960 1859 586 aa, chain - ## HITS:1 COG:CAC2269 KEGG:ns NR:ns ## COG: CAC2269 COG0173 # Protein_GI_number: 15895537 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl-tRNA synthetase # Organism: 
Clostridium acetobutylicum # 1 583 8 590 595 606 51.0 1e-173 MYRTSTCGELRLADAGKTVTLAGWVQRVRKMGGMSFVDLRDRYGITQLVFNENEDKELCD CANRLGREFCIQVTGTVNERQSKNANLATGDIEIIASQLNVLSESETPPFTIEENTDGGD DIRMKYRYLDLRRPNVRKNLELRHKMTILIRNFLDSKDFIEVETPILIGSTPEGARDFVV PSRMNPGQFYALPQSPQTLKQLLMVSGFDRYFQIAKCFRDEDLRADRQPEFTQIDCEMSF VDQDDVINLFEDMARHLFKELRGVELPAKLRQMPWHEAMKRYGSDKPDLRFGMEFVELAD VLKGTGEFSVFNEAEYIGGICVPGCADYSRKQLNELTDFVKRPQVGAKGLVFVKYETDGS VKSSIDKFYSPEVFAQLKQVMGAKDGDLVLIMSGDKANKTRVQLCNLRLEMGDRLGLRDK DKFECLWIVDFPLFEWSDEEQRLMATHHPFTMPNPEDVPLLDEHPERVRAQAYDFVCNGI EVGGGSLRIHDGALQEKMFGILGFTPERAMAQFGFLINAFKYGAPPHAGLAFGLDRFVSI FAGLNSIRDCIAFPKNNSGRDVMLDAPSVIDDKQLEELNLRVDLGQ >gi|283510571|gb|ACQH01000048.1| GENE 12 13842 - 14687 623 281 aa, chain - ## HITS:1 COG:no KEGG:PRU_0459 NR:ns ## KEGG: PRU_0459 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 27 280 31 285 287 269 49.0 9e-71 MKKAILVCLLFLAALPLLAQAQYTSADSAKVVKLLLEGKRHKDKQNLVLFFARKFLGVPY VASTLENNADERLVINLRQLDCTTFVENVLALTLCTQNGKTTFADFCDQLRKIRYRNGKV GYPTRLHYFSEWISDNARMGYVEETQAPNPPFSAVQTLQINFMSTHVDKYPMLVRTPAFV KPIAQMESELCGKTCRYIPKAGILNNAACKQAVKDGNILALITSRAGLDTSHVGFAVWKK DGLHLFHASSLQKKVVEDKSLLRNYMKKQKSQLGIRVVKVK >gi|283510571|gb|ACQH01000048.1| GENE 13 14684 - 15778 871 364 aa, chain - ## HITS:1 COG:no KEGG:PRU_0458 NR:ns ## KEGG: PRU_0458 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 44 362 29 345 353 270 44.0 7e-71 MLKTLLTSFLLCLLSVVVRAEVPDTTAAPQPAKRSLVKRMADGVKQFVNSFSDVDTTYIE PQRYIYTVMLQNTFTYELYKLSTVEGNSVTFAPEPTIKFGPYVGWQWLFLGYTVDLRRLD GNNKQEFDLSFYTSQIGIDLFWRNTGNNYKIKSLTLAKDIKPDAVLGSAFSGFEARIRGF NLYYILNHRRFSYRAAFSQSTVQRRSAGSMLFGVGCTWHKVNVDVAQLDQLITDRLGANA TMTAPGDTANFGKVQYTDLSVSAGYGYNWVFARNWLLAGSLSLALGYKRTTGNVSARRTS LGDFLSYNMALDGVGRFGIVWNNTRWYAGASTILHTYNYRKNRFSTNNMFGNLNIYVGFN FIRR >gi|283510571|gb|ACQH01000048.1| GENE 14 15780 - 16763 998 327 aa, chain - ## HITS:1 COG:BH1248 KEGG:ns NR:ns ## COG: BH1248 COG0673 # Protein_GI_number: 15613811 
# Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Bacillus halodurans # 1 259 1 257 340 111 27.0 2e-24 MKEISWGFIGCGEVTETKSGPAFNEVEGSHVHAVMSRTEDKARSYAERHRIRKWYTDPNE LIGDPDVNAVYIATPPSSHATFAIMAMKAGKPVYVEKPLAASYEDCVRINRVSEQTGVPC FVAYYRRYLPYFQKVKAIVDSGAIGKITNVQIRFAVPPRDLDYSTNSNLPWRLQPDVSGG GYFYDLAPHQLDLIQELFGVITRAHGYLSNRGRLYGVEDTVSAAFEFESGVVGSGSWCFV AHQSAKEDCIEVIGERGLLSFSVYTNNPIQLVTSDGATSFNVDNPAYAQLPIIKAVIEDL QGIGHCSCTSVSATPVNWVMDRILNKF >gi|283510571|gb|ACQH01000048.1| GENE 15 17024 - 17587 486 187 aa, chain - ## HITS:1 COG:no KEGG:PRU_1672 NR:ns ## KEGG: PRU_1672 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 187 1 185 185 136 41.0 4e-31 MKKILMTLVAAVIAVSASAQQVYVGGTLGIGVSKVKGGDEVTTYKVLPEVGYSFNDDWAI GTVLGWGKGNPVSVADAKTTQARFFTVSPYVRYTFLHSKYVDVFVDTGLDYLRYQGGDNE LGVGVKPGVAINLNRHFTFLAKVGFAGWKNERGTYTDPRVGEVKINNQAWGASFDGNNLS FGVYYNF >gi|283510571|gb|ACQH01000048.1| GENE 16 17919 - 19421 1629 500 aa, chain - ## HITS:1 COG:BB0348 KEGG:ns NR:ns ## COG: BB0348 COG0469 # Protein_GI_number: 15594693 # Func_class: G Carbohydrate transport and metabolism # Function: Pyruvate kinase # Organism: Borrelia burgdorferi # 2 468 4 472 477 393 47.0 1e-109 MKQTKIVATISDRRCDQEFIRKLFFAGMNVVRMNTAHAPEEGMKEIIKNVRAVSSHIALM IDTKGPEIRTTSVAEPIQYKSGDIVKIFGRPEMETTHDILNVSYPDIANDVKVGDSILFD DGELDMKVLDVQGPMLVTQVQNDGKLGARKSVNVPGEHIDLPALTEKDRNNILFAIEQDI DFIAHSFVRSAEDVHAVQKILDEHNSDIKIISKIENQEGIDNIDEIIDASYGIMIARGDL GIEVPLECIPGLQREIIHKCILKKKPVIVATQMLHTMINNPRPTRAEVTDIANAVYYRTD ALMLSGETASGKYPVEAVKTMAAIAEQAEKDRLRLKGLQISLAPNCSQKEFLAYSAIEST RRLGVAGIITDSETGETARILASFRGSTPILAMCYKEKLQRWLNLSYGVIPVHQKEHLAP QFIFTAAIRMLRQKGYINLEDKIAYLSGSFGEGGGTTFVEINKVAKALNADSNYRFHLPE HPSKKQTGEPEAEETWAGEN >gi|283510571|gb|ACQH01000048.1| GENE 17 19418 - 19660 63 80 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSSSLGHCHGEKTHFSRNEKRTLFVRLSTIIITKGQGVRDFFQYLCKRQANVNGSVATRR DNARRACPLFTLYKGHFNKI >gi|283510571|gb|ACQH01000048.1| 
GENE 18 19727 - 20476 1061 249 aa, chain - ## HITS:1 COG:FN0807 KEGG:ns NR:ns ## COG: FN0807 COG1212 # Protein_GI_number: 19704142 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: CMP-2-keto-3-deoxyoctulosonic acid synthetase # Organism: Fusobacterium nucleatum # 1 247 1 242 245 207 42.0 1e-53 MKFIGIIPARYGSSRFPGKPLAQLGGKPVIQRVYEQVAQALDETYVATDDERIYNKVLAF GGKAVMTSTEHQSGTDRVNEAVQKIGGDYDVVVNIQGDEPFVQRSQIDVICQCFDTEGVQ IATLGIPFKTIDEVRNPNSPKIVVSNGGFAMYFSRSVIPFVRGTEPEQWLEAYPFLKHLG IYAYRPEALRAITALPQSSLEKAESLEQLRWLQNGYQIKVGVTQVETVGIDTPEDLQRAE EFLKAHDAG >gi|283510571|gb|ACQH01000048.1| GENE 19 20558 - 20761 64 67 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSELTHIRRKHQYKATCTGTTSGLYWYYKVLVLALQGACSPTTNSIVLSKCCCADTLMAK WGMRTSF >gi|283510571|gb|ACQH01000048.1| GENE 20 21040 - 22257 1160 405 aa, chain - ## HITS:1 COG:no KEGG:PRU_0475 NR:ns ## KEGG: PRU_0475 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 15 405 21 404 405 356 49.0 8e-97 MLCLWFSLAGWAQRNRILNPDIASLQVVAGNNWLSMPVIALGEGVPVNIAFDDLTHEYRR YAYKVEHCNADWSTSGDLFVSDYIDGFNTDNVIEDVEQSINTNVLYTHYRLQIPNGRCKL TMSGNYRVTIYDANDDDKAVAECCFMVVEPRMGIKLSVDANTDKGINSRWQQVAMEVKYG GGLSVTDVQRQIYTVVMQNGRWDNAVVNAKPQFVMGDGLRWSHNAQLVFEAGNEYRKFEM LDVRHANMGVEKIDWDGKEYHAYLWPDEPRGSYVFDEDANGAFYIRNGDNRENSSTSEYV HVHFTLRSPRLEGGVFVNGVWTNDQLAAPYQMQYDEGAQCYRLSLLLKQGYYSYQYVWQQ SNGQIATVPSEGSFYQTENRYQALVYYRKLGERADRLVGFAEIRR >gi|283510571|gb|ACQH01000048.1| GENE 21 22461 - 23714 1665 417 aa, chain - ## HITS:1 COG:BH3559 KEGG:ns NR:ns ## COG: BH3559 COG0126 # Protein_GI_number: 15616121 # Func_class: G Carbohydrate transport and metabolism # Function: 3-phosphoglycerate kinase # Organism: Bacillus halodurans # 3 414 6 391 394 355 50.0 1e-97 MKIEQFNFAGKKAIVRVDFNVPLDENGNITDDTRIRGALPTLKKILADGGAVIMMSHMGK PKGKVNPKLSLSQIVNKVSERLGVPVKFADDCANATDAAANLKMGEALLLENLRFYPEEE GKPVGIDKADPAYDEAKKEMKAKQKDFAKKLASYADVYVNDAFGTAHRRHASTAVIADYF DADSKMLGYLMEKEVTAIDNVLKNAQHPFTAIIGGSKVSSKLGVIKNLLDKVDNLIIGGG 
MGYTFVKAQGGKIGDSLHEDDLMPEALNVIAAAKEKGVNLSLSVATVAADKFDNNANRKV VPIDQIPDGWEGMDASEESLAIWKKIILESKTILWNGPVGVFEFENFAHGTGEIAKYVAQ ATQENGAYSLVGGGDSVAAVNKFGLADKVSYVSTGGGAMLEAIEGKVLPGVAALQGE >gi|283510571|gb|ACQH01000048.1| GENE 22 24013 - 24654 640 213 aa, chain - ## HITS:1 COG:MTH120 KEGG:ns NR:ns ## COG: MTH120 COG0778 # Protein_GI_number: 15678148 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Methanothermobacter thermautotrophicus # 39 203 25 186 191 165 43.0 5e-41 MKISTILNFALALALVFLCAKLVMSGALGSQAAAGDDVVYNNILSRTSVRAYQDKAVEKD KVEKLLRAGMAAPTAVNKQPWHFVVVADKKLLAALAETNAQADFAKDAPLAIVVCGDMTK AIEGAGRDFWVQDCAAATENILLAAHAMGLGAVWTGAYPIEERSSEMGKVLKLPETMVPL SVVVIGYPKGETKAKDKWDPEKVSYNVYGGQEM >gi|283510571|gb|ACQH01000048.1| GENE 23 24660 - 25754 1033 364 aa, chain - ## HITS:1 COG:no KEGG:PG1623 NR:ns ## KEGG: PG1623 # Name: not_defined # Def: membrane bound regulatory protein, putative # Organism: P.gingivalis # Pathway: not_defined # 62 360 86 386 396 163 34.0 1e-38 MSKPAKWQQLLSLITCFLLVLGVSIRRDGKIQGHDLLQKGKAIAAGEDTIRTLDDGTVVV NTTALGNDVVGYSGTVPLEISLKDDHVVGVKALDNAETPEFFEEASQLLTKWNGKSIDEA QRMQVDAVSGATFTSKAIVENMRRGLLYAAKAKADANIWSKLDLSAKSLVALAVALMAAI LPLFIKNRLYRLVQHVLNVAILGFWCGTFLNYTSMVGYMSNGMNVLALLVPVVMLVTAFV YPLFGKKSYYCTHVCPYGSLQELAGKCVRYKIKMKPKTMKRLDLFRQILWAVLMFCLWTG VWFDWIDYEPFSAFIFGSASWVVIGIALVFVLLSTVVVRPYCRFVCPTGSLFKYSQYSTP SGKK >gi|283510571|gb|ACQH01000048.1| GENE 24 25831 - 26484 713 217 aa, chain - ## HITS:1 COG:SMb20940 KEGG:ns NR:ns ## COG: SMb20940 COG0603 # Protein_GI_number: 16264811 # Func_class: R General function prediction only # Function: Predicted PP-loop superfamily ATPase # Organism: Sinorhizobium meliloti # 4 215 3 217 236 186 47.0 3e-47 MKDSVIVVSGGMDSVTLLYEKKDEIALGISFDYGSNHNHNELPLAALHCQRLGIAHVVIP LGFMHQYFKSSLLESGESIPDGSYDEENMKSTVVPFRNGVMLAVAAGIAESNGLTKVLIA NHGGDHTIYPDCRPEFIAAMDAAVEAGTFARVRVVAPYTNISKADIARRGRALGIDYAET WSCYKGGDVHCGTCGTCVERKEALREAGIEDNTTYQS >gi|283510571|gb|ACQH01000048.1| GENE 25 26824 - 27132 247 102 aa, 
chain + ## HITS:1 COG:no KEGG:no NR:gi|288928274|ref|ZP_06422121.1| ## NR: gi|288928274|ref|ZP_06422121.1| hypothetical protein HMPREF0670_01015 [Prevotella sp. oral taxon 317 str. F0108] # 1 102 1 102 102 189 100.0 5e-47 MFIETNPCTEFWFLLHFLPNVVCRQYESYEQLLPELQKHMPGYVKTKRYFIKTNLYKYLT EHGDLERAVQNSEKLCQLCQESPEDLKAYSEIHKVIELLKTI >gi|283510571|gb|ACQH01000048.1| GENE 26 27876 - 28100 68 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927699|ref|ZP_06421546.1| ## NR: gi|288927699|ref|ZP_06421546.1| hypothetical protein HMPREF0670_00440 [Prevotella sp. oral taxon 317 str. F0108] # 1 54 1 54 65 66 66.0 7e-10 MAVFLNKNSFCGIPMLTLFSIKTNLRENRFFAAGGRLVNRKGTHNVKKRAENFTILDPTS CIIYIVCSTKGNNA >gi|283510571|gb|ACQH01000048.1| GENE 27 28263 - 29651 1594 462 aa, chain - ## HITS:1 COG:BS_sms KEGG:ns NR:ns ## COG: BS_sms COG1066 # Protein_GI_number: 16077155 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent serine protease # Organism: Bacillus subtilis # 1 462 1 457 458 440 48.0 1e-123 MAKDKTAYVCDNCGHESVKWIGKCPSCGQWNTFKEIRVAPSVSGGGKRLAAGGVGNTSST RNTPMPISKVQAKDEPRIDLHDDELNRVLGGGLVRGSIVLLGGEPGIGKSTLVLQTILRM PEKRILYVSGEESTHQLKMRADRIEENVADNVIVLSENALEKILMQVDDVKPDLLIVDSI QTIESETIDSSPGSIVQVRECASALLRYAKTSSVPVLLIGHINKEGVLAGPKILEHIVDT VLQFEGDQHYMYRILRSIKNRFGSTSELGIYEMQQNGLRPVSNPSELLLTRDHEGLSGIA ISCAIEGVRPFLVETQALVSTAAYGTPQRSTTGFDQRRLNMLLAVLEKRVGFKLIQKDVF VNIAGGLRVTDLAMDLSIIAAVLSSNVDTPIDKEWCMAGEVGLSGEVRPVNRIEQRIAEA EKLGFTDMVIPKNNLSGIDPSRYKIKLHPVRKVEEALRILFG >gi|283510571|gb|ACQH01000048.1| GENE 28 29779 - 30687 956 302 aa, chain + ## HITS:1 COG:AF0088 KEGG:ns NR:ns ## COG: AF0088 COG0715 # Protein_GI_number: 11497708 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Archaeoglobus fulgidus # 23 252 16 236 300 66 23.0 8e-11 MTLFHIRTLALATLTTLLFACNGKGNAPQNQSERQTTNDSTLRIAVMPTLDCLPLYVADA 
TGMFGEEHINVRLMKYTAQMDCDTAILRGRVQGVFTDLVRAQRLRQKGVNLHEVGATNLH WQLISNRNARLKDVRQLGDKMVGMTRFSATDFLTADLIDRSKPKNEVFRIQVNNVFIRLK MLVGYEIDAAFMPEPQATVARMGKNNVLADTKAKDIRLGVAVFSMAAKDQKKVCQALPAF VKAYDRACDSLNIRGIQHYADLLKAHFQLNDAAVKALPKTAFAKMSPVREKDRNAIKDFP TQ >gi|283510571|gb|ACQH01000048.1| GENE 29 30684 - 32939 2317 751 aa, chain + ## HITS:1 COG:no KEGG:PRU_1631 NR:ns ## KEGG: PRU_1631 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 25 700 14 689 729 346 32.0 2e-93 MNKDDINNIRWSIKPERESIPGLILLSKDLGEALNRMRNDINSFGLRKYYERLESIEEDY RLMKDFIKRGFDDPKREEVYNSLLKRLYHLHCDVMITRRKQNTSGFNYYSPFNVATMPLD DIRQRLESFVQDTALLSLDPSATNSEKKKAIYAEHHELTTALFYHLMTDTHWNEETQKFY QELLLSPTIDSMDAQLLVGALLLALLLAFDARKFATLLYLYQYATDPQVKQRALVGWALT LPKIEITLYPNLAEQLHKTVSSQAVQRELLELQYQIVYCINAEKAMQEIHRDILPTMMRG QNIVLGTETEEQKLQDILNPDAEERAMEDLEKSYMRMMDMQQKGVDVYFGGFSQMKRFPF FNRASNWFVPYYPEHPELKPLEGLDETRLEQNIFAHSSFCNSDKYSLELAMAMIANQIPA ELKNAINANGIEIQGDMDVDKSNPAYIRRMYLQDLYRFFFLNNMKEEFVNPFRGGGIVMR GLFFGNQLLAQVGLSSQFVTFGKFLYKRGLWQLLQLFTSIYELDTTLTSNEWLYFRAVAA MQTGNYGEAKKHFAELMNRLPEDEKVAHGYVETCIQSNDFAHAEKVLERLQGERNLSTEQ ELFLAKGYAHAGKAEKATELLFKLNYEHPDNMEVKRQLVLTQLQDGRAETAIDLLNGIVY NNNEVDAKDMLYLAAAHWFANNTQKAIELFVNFLRKTGEKSSETRSGIDEMLHLLDAMAE KYNRSAIDIRILTDIVADEMEANNSPNINND >gi|283510571|gb|ACQH01000048.1| GENE 30 32941 - 33639 888 232 aa, chain + ## HITS:1 COG:no KEGG:PRU_1452 NR:ns ## KEGG: PRU_1452 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 136 228 37 128 134 140 70.0 3e-32 MKTIIRTTFTLLFAAMATFAFADDTPTISPMATFFTPNGEEQSSDYQGSAPLRARFQANV ANADGWNAHYEWRFYTDLKRQTPYLSRFEKDTEVTFTVAGTHYVELYATFVQGTDTVEYT QEYWKTADPIRVSIAESKLEMPNAFTPNGDGVNDVYRAKPGYQSLVEFKAYIFNRWGQKI YEWKDPAGGWDGKFNGQDVKEGVYFVQVNAKGADGKVFNIKKDVNLLRSYIE >gi|283510571|gb|ACQH01000048.1| GENE 31 33721 - 36546 3299 941 aa, chain + ## HITS:1 COG:BH3594 KEGG:ns NR:ns ## COG: BH3594 COG0178 # Protein_GI_number: 15616156 # Func_class: L Replication, recombination and repair # 
Function: Excinuclease ATPase subunit # Organism: Bacillus halodurans # 1 939 1 939 957 1005 54.0 0 MQDEKINVWGARVHNLKNIDVEIPRNSLTVITGLSGSGKSSLAFDTIFAEGQRRYMETFS AYARNFLGNMERPDVDKITGLSPVISIEQKTTNKNPRSTVGTTTEIYDYLRLLYARAGTA YSYATGEKMVKYTEEKVIQMILERYADKRIYILAPLVHQRKGHYRELFESMRRKGYLYMR VDGEITEITRGMKVDRYKNHNIEVVIDKLKVQGKDDQRLKKTIEVAMKQGDGVIMILDKD NNELKSYSKRLMDPVTGISYPEPAPNNFSFNSPEGACPKCKGLGYVNEIDLKKVIPDTNK SIYEGAIAPLGKYKNQMIFWQIDAILKKYDCELKTAIKDVPADALDEILYGSMERVKIAK ELVHTSSDFFSTYDGLIKYLKSVMDNDDSASSQKWADQFLATCQCPDCKGQKLKRESLAY RVWDKNIAELASMDISDLKTWLDNVEEHLDTKGRKIAAEIVKEIRTRVSFLLEVGLDYLT LNRQSVSLSGGESQRIRLATQIGSQLVNVLYILDEPSIGLHQRDNERLIASLKELRDLGN TVIVVEHDKDMMMAADYIIDIGPKAGRKGGEVVYQGTPQNMLKGNTITANYLNGQMKIEV PAHRRAGNGKSIWIRGAKGNNLKDVDVEIPLGKLIVVTGVSGSGKSTLINETLQPILSQH FYRSLKRPMPYDDVEGIDNVDKVVDVDQSPLGRTPRSNPATYTGVFNDIRNLFVNLPEAM IRGYKPGRFSFNVKGGRCEACSGNGYKTIEMNVLPDVMVPCEVCHGKRYNRETLEVRYKG KSIADVLDMTINQAVEFFENVPNILQKIKTIQDVGLGYIKLGQPSTTLSGGESQRVKLAT ELSKRDTGKTIYILDEPTTGLHFEDIRILMDVLQKLVDRGNTVLIIEHNLDVIKLADHII DMGPEGGRNGGLLLSAGTPEEVAKSKAGFTPKFLKQELEVG Prediction of potential genes in microbial genomes Time: Sat May 28 01:00:06 2011 Seq name: gi|283510570|gb|ACQH01000049.1| Prevotella sp. oral taxon 317 str. F0108 cont2.49, whole genome shotgun sequence Length of sequence - 112026 bp Number of predicted genes - 99, with homology - 93 Number of transcription units - 55, operones - 23 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 89 - 148 2.4 1 1 Op 1 . + CDS 382 - 1122 587 ## gi|288928280|ref|ZP_06422127.1| conserved hypothetical protein 2 1 Op 2 . + CDS 1106 - 3256 1215 ## GAU_2336 hypothetical protein + Term 3427 - 3467 7.5 - Term 3483 - 3541 7.3 3 2 Tu 1 . - CDS 3609 - 4046 417 ## PRU_2605 hypothetical protein - Prom 4076 - 4135 5.2 4 3 Tu 1 . - CDS 4377 - 4967 354 ## gi|288928283|ref|ZP_06422130.1| hypothetical protein HMPREF0670_01024 - Prom 5048 - 5107 4.2 - Term 5085 - 5139 0.2 5 4 Tu 1 . 
- CDS 5388 - 7709 732 ## BVU_1454 hypothetical protein - Prom 7768 - 7827 2.7 6 5 Tu 1 . - CDS 7901 - 8767 750 ## BF2104 hypothetical protein - Prom 8870 - 8929 1.9 - Term 8774 - 8810 6.5 7 6 Tu 1 . - CDS 8942 - 11083 2748 ## PRU_0859 peptidyl-prolyl cis-trans isomerase-like protein - Prom 11200 - 11259 3.4 8 7 Op 1 . - CDS 11261 - 12535 1308 ## COG4536 Putative Mg2+ and Co2+ transporter CorB 9 7 Op 2 . - CDS 12606 - 13256 683 ## PRU_0861 hypothetical protein 10 7 Op 3 . - CDS 13284 - 14579 1340 ## PRU_0862 hypothetical protein - Prom 14675 - 14734 4.6 + Prom 14631 - 14690 3.1 11 8 Op 1 . + CDS 14755 - 15945 957 ## PRU_0863 hypothetical protein 12 8 Op 2 . + CDS 15942 - 17357 1494 ## COG1524 Uncharacterized proteins of the AP superfamily + Prom 17368 - 17427 2.7 13 9 Tu 1 . + CDS 17527 - 20850 4060 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) + Prom 21068 - 21127 2.0 14 10 Tu 1 . + CDS 21187 - 21891 176 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 + Prom 21914 - 21973 1.6 15 11 Tu 1 . + CDS 21996 - 22499 523 ## CCC13826_0565 hypothetical protein 16 12 Tu 1 . + CDS 23758 - 26346 2862 ## COG0370 Fe2+ transport system protein B + Prom 26474 - 26533 6.3 17 13 Op 1 . + CDS 26596 - 26799 167 ## 18 13 Op 2 . + CDS 26810 - 29101 1362 ## Dfer_2699 TonB-dependent receptor + Prom 29138 - 29197 1.5 19 14 Op 1 . + CDS 29235 - 29507 151 ## 20 14 Op 2 . 
+ CDS 29494 - 30543 382 ## gi|288928297|ref|ZP_06422144.1| hypothetical protein HMPREF0670_01038 + Term 30732 - 30784 -0.7 + Prom 30731 - 30790 6.0 21 15 Op 1 30/0.000 + CDS 30834 - 31184 207 ## PROTEIN SUPPORTED gi|154175415|ref|YP_001407462.1| NADH dehydrogenase subunit A 22 15 Op 2 9/0.000 + CDS 31175 - 31798 418 ## PROTEIN SUPPORTED gi|154175216|ref|YP_001407461.1| NADH dehydrogenase subunit B 23 15 Op 3 8/0.000 + CDS 31808 - 33376 1828 ## COG0649 NADH:ubiquinone oxidoreductase 49 kD subunit 7 24 15 Op 4 31/0.000 + CDS 33379 - 34467 1090 ## COG1005 NADH:ubiquinone oxidoreductase subunit 1 (chain H) 25 15 Op 5 28/0.000 + CDS 34478 - 35008 583 ## COG1143 Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) 26 15 Op 6 30/0.000 + CDS 35026 - 35547 590 ## COG0839 NADH:ubiquinone oxidoreductase subunit 6 (chain J) 27 15 Op 7 26/0.000 + CDS 35544 - 35852 358 ## COG0713 NADH:ubiquinone oxidoreductase subunit 11 or 4L (chain K) 28 15 Op 8 30/0.000 + CDS 35859 - 37817 1953 ## COG1009 NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit 29 15 Op 9 22/0.000 + CDS 37851 - 39356 1623 ## COG1008 NADH:ubiquinone oxidoreductase subunit 4 (chain M) 30 15 Op 10 . + CDS 39369 - 40838 1397 ## COG1007 NADH:ubiquinone oxidoreductase subunit 2 (chain N) + Term 40904 - 40943 10.2 31 16 Op 1 . - CDS 41207 - 41509 90 ## gi|260912219|ref|ZP_05918771.1| conserved hypothetical protein 32 16 Op 2 . - CDS 41503 - 41955 585 ## PRU_0504 cupin domain-containing protein - Prom 42120 - 42179 4.8 + Prom 41920 - 41979 7.3 33 17 Tu 1 . + CDS 42199 - 42567 338 ## COG1733 Predicted transcriptional regulators - Term 44101 - 44145 9.1 34 18 Tu 1 . - CDS 44181 - 44912 352 ## gi|288928310|ref|ZP_06422157.1| hypothetical protein HMPREF0670_01051 - Prom 45015 - 45074 5.0 35 19 Tu 1 . + CDS 45299 - 47182 2113 ## COG1874 Beta-galactosidase + Term 47252 - 47294 6.4 + Prom 47316 - 47375 4.6 36 20 Op 1 . 
+ CDS 47467 - 48366 1025 ## COG1045 Serine acetyltransferase 37 20 Op 2 . + CDS 48382 - 49899 1882 ## COG0116 Predicted N6-adenine-specific DNA methylase 38 20 Op 3 . + CDS 49889 - 52075 2556 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases 39 20 Op 4 . + CDS 52081 - 53349 1461 ## COG0151 Phosphoribosylamine-glycine ligase 40 21 Op 1 . + CDS 53471 - 54403 829 ## PRU_1808 hypothetical protein 41 21 Op 2 . + CDS 54352 - 54840 424 ## COG1238 Predicted membrane protein 42 21 Op 3 25/0.000 + CDS 54847 - 55680 887 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin 43 21 Op 4 . + CDS 55677 - 56447 220 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 + Prom 56509 - 56568 3.8 44 22 Op 1 . + CDS 56718 - 56858 95 ## gi|260912231|ref|ZP_05918783.1| hypothetical protein HMPREF6745_2738 45 22 Op 2 . + CDS 56863 - 57519 873 ## COG0036 Pentose-5-phosphate-3-epimerase + Term 57636 - 57674 6.6 + Prom 57534 - 57593 1.7 46 23 Tu 1 . + CDS 57781 - 59241 1303 ## COG0591 Na+/proline symporter + Term 59373 - 59409 2.2 47 24 Tu 1 . - CDS 59262 - 59474 101 ## - Prom 59512 - 59571 1.9 + Prom 59450 - 59509 1.8 48 25 Op 1 . + CDS 59713 - 60075 271 ## PRU_1408 hypothetical protein 49 25 Op 2 . + CDS 60082 - 60495 508 ## PRU_1407 hypothetical protein + Term 60538 - 60596 11.6 + Prom 61838 - 61897 3.8 50 26 Op 1 12/0.000 + CDS 61933 - 64146 2430 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase 51 26 Op 2 . + CDS 64270 - 64779 414 ## COG0602 Organic radical activating enzymes + Term 64816 - 64867 18.2 - Term 64877 - 64909 0.3 52 27 Tu 1 . - CDS 65036 - 65629 585 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Term 66015 - 66052 -0.7 53 28 Tu 1 . - CDS 66120 - 66962 594 ## COG0682 Prolipoprotein diacylglyceryltransferase - Term 67291 - 67335 3.6 54 29 Tu 1 . 
- CDS 67364 - 67786 483 ## BVU_2895 hypothetical protein - Prom 68011 - 68070 4.4 + Prom 67888 - 67947 5.7 55 30 Op 1 . + CDS 68037 - 69164 1351 ## COG0592 DNA polymerase sliding clamp subunit (PCNA homolog) 56 30 Op 2 . + CDS 69264 - 70118 1045 ## COG0847 DNA polymerase III, epsilon subunit and related 3'-5' exonucleases 57 30 Op 3 . + CDS 70152 - 71375 1023 ## COG0452 Phosphopantothenoylcysteine synthetase/decarboxylase + Term 71453 - 71510 9.7 + Prom 71506 - 71565 4.6 58 31 Op 1 . + CDS 71589 - 72494 979 ## PRU_0837 hypothetical protein 59 31 Op 2 . + CDS 72496 - 74157 1878 ## COG0497 ATPase involved in DNA repair 60 31 Op 3 . + CDS 74178 - 75830 1601 ## PRU_0835 hypothetical protein 61 32 Tu 1 . + CDS 75988 - 76602 557 ## COG0353 Recombinational DNA repair protein (RecF pathway) + Term 76633 - 76686 5.1 62 33 Op 1 . + CDS 76934 - 77368 335 ## PRU_0896 hypothetical protein 63 33 Op 2 . + CDS 77361 - 78551 851 ## COG1216 Predicted glycosyltransferases 64 33 Op 3 . + CDS 78548 - 79060 369 ## PROTEIN SUPPORTED gi|229254479|ref|ZP_04378409.1| acetyltransferase, ribosomal protein N-acetylase 65 34 Tu 1 . - CDS 80487 - 81014 333 ## gi|288928340|ref|ZP_06422187.1| hypothetical protein HMPREF0670_01081 - Prom 81106 - 81165 4.1 + TRNA 81404 - 81480 81.4 # Asp GTC 0 0 + TRNA 81503 - 81576 80.8 # Asp GTC 0 0 + Prom 81406 - 81465 79.0 66 35 Op 1 . + CDS 81700 - 82155 137 ## ECL_01542 putative cytoplasmic protein 67 35 Op 2 . + CDS 82163 - 82675 149 ## BDI_3895 hypothetical protein + Term 82826 - 82865 -0.1 - Term 82814 - 82853 -0.1 68 36 Tu 1 . - CDS 82901 - 83356 386 ## PROTEIN SUPPORTED gi|229210357|ref|ZP_04336754.1| acetyltransferase, ribosomal protein N-acetylase - Prom 83498 - 83557 3.6 69 37 Op 1 . - CDS 83629 - 84900 1332 ## COG2873 O-acetylhomoserine sulfhydrylase 70 37 Op 2 . - CDS 84857 - 85204 111 ## - Prom 85302 - 85361 3.9 71 38 Tu 1 . - CDS 85727 - 85936 74 ## - Prom 85956 - 86015 5.0 72 39 Tu 1 . 
- CDS 86282 - 86821 343 ## gi|288928345|ref|ZP_06422192.1| hypothetical protein HMPREF0670_01086 - Prom 86859 - 86918 2.2 73 40 Op 1 . - CDS 87017 - 87562 657 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 74 40 Op 2 . - CDS 87642 - 89294 1517 ## COG2989 Uncharacterized protein conserved in bacteria 75 41 Tu 1 . - CDS 89421 - 89729 234 ## gi|288928348|ref|ZP_06422195.1| hypothetical protein HMPREF0670_01089 - Prom 89816 - 89875 4.4 76 42 Tu 1 . - CDS 90070 - 90882 741 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III 77 43 Tu 1 . - CDS 90992 - 91552 721 ## PRU_1412 putative lipoprotein - Prom 91572 - 91631 3.7 78 44 Tu 1 . - CDS 91738 - 92535 961 ## COG1694 Predicted pyrophosphatase - Prom 92618 - 92677 6.3 + Prom 92601 - 92660 5.5 79 45 Op 1 9/0.000 + CDS 92710 - 93702 1104 ## COG0147 Anthranilate/para-aminobenzoate synthases component I 80 45 Op 2 . + CDS 93677 - 94270 591 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase + Prom 94282 - 94341 1.9 81 45 Op 3 . + CDS 94368 - 97049 2737 ## COG0525 Valyl-tRNA synthetase 82 46 Tu 1 . + CDS 97242 - 98447 1015 ## BT_0727 hypothetical protein + Prom 100168 - 100227 1.9 83 47 Tu 1 . + CDS 100257 - 100454 232 ## gi|288928356|ref|ZP_06422203.1| hypothetical protein HMPREF0670_01097 + Prom 100526 - 100585 3.4 84 48 Tu 1 . + CDS 100645 - 101295 766 ## BF2047 hypothetical protein + Prom 101332 - 101391 3.6 85 49 Op 1 22/0.000 + CDS 101483 - 102085 596 ## PROTEIN SUPPORTED gi|212691832|ref|ZP_03299960.1| hypothetical protein BACDOR_01327 86 49 Op 2 . + CDS 102117 - 102686 451 ## COG0193 Peptidyl-tRNA hydrolase 87 49 Op 3 . + CDS 102696 - 103109 382 ## COG1188 Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) + Prom 103116 - 103175 1.8 88 50 Op 1 . + CDS 103250 - 103825 658 ## COG0299 Folate-dependent phosphoribosylglycinamide formyltransferase PurN 89 50 Op 2 . 
+ CDS 103849 - 104580 545 ## COG0671 Membrane-associated phospholipid phosphatase + Term 104613 - 104674 7.2 - Term 104598 - 104662 10.0 90 51 Op 1 . - CDS 104725 - 105108 528 ## gi|288928363|ref|ZP_06422210.1| hypothetical protein HMPREF0670_01104 91 51 Op 2 . - CDS 105156 - 105653 562 ## COG0847 DNA polymerase III, epsilon subunit and related 3'-5' exonucleases 92 51 Op 3 . - CDS 105753 - 106229 491 ## COG0054 Riboflavin synthase beta-chain 93 51 Op 4 . - CDS 106229 - 106933 850 ## PRU_0202 hypothetical protein - Prom 107099 - 107158 8.1 + Prom 106681 - 106740 1.5 94 52 Op 1 . + CDS 106815 - 107063 87 ## 95 52 Op 2 . + CDS 107079 - 108194 941 ## COG1195 Recombinational DNA repair ATPase (RecF pathway) 96 52 Op 3 . + CDS 108187 - 108477 362 ## PRU_0200 hypothetical protein + Prom 108595 - 108654 4.1 97 53 Tu 1 . + CDS 108677 - 110050 1000 ## RPC_1791 patatin + Prom 110128 - 110187 2.6 98 54 Tu 1 . + CDS 110279 - 110869 295 ## gi|288928370|ref|ZP_06422217.1| hypothetical protein HMPREF0670_01111 + Term 111051 - 111088 2.1 99 55 Tu 1 . - CDS 111223 - 111438 67 ## gi|288927699|ref|ZP_06421546.1| hypothetical protein HMPREF0670_00440 Predicted protein(s) >gi|283510570|gb|ACQH01000049.1| GENE 1 382 - 1122 587 246 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928280|ref|ZP_06422127.1| ## NR: gi|288928280|ref|ZP_06422127.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 246 1 246 246 454 100.0 1e-126 MRRNLLLLFMVTFTITMRAQTKGNQPQNKNYQQAIVDSMSNAMYSNNIAPLDNVEKELKG KKPNRVYNYWLSFCLLHKALYLTTTKQDQQAQACLNEGEHIFETQAGKNAEEYALLAYIQ GVSIKFVKGMDAAVIARKSVNNAQQATQIDPKNTRAWYVLGMLDYYTPKQFGGQQKCEAL LEKALAQPISKTNNPYEPTWGRKEAYNLLLEYLVTTGNKEKAKTIYQKAKTEFPNAPDLK KYENQL >gi|283510570|gb|ACQH01000049.1| GENE 2 1106 - 3256 1215 716 aa, chain + ## HITS:1 COG:no KEGG:GAU_2336 NR:ns ## KEGG: GAU_2336 # Name: not_defined # Def: hypothetical protein # Organism: G.aurantiaca # Pathway: not_defined # 23 696 12 685 706 70 20.0 2e-10 MKTSFKLVLFVALWLVPSHQCRSQTIRGKVYADTEAAFAANVFLKHHTDIHAITDEDGQF ELPLQDGILPDTLMVTYVGYKDFVLPLTVQNLPDSLTIKLLSNAISLSEVTIMADATSSK EFAVTKLSRLEVLMSPSASADPLKAMGMMAYSTPTTESANPELRGSASGYSRVVVNGVPI YNPVRNQQLDGMGNFSLFNADIVGEQYVYPSNPPLEYGNATGGIVNIRTTQALPEGRRTR LSASLASIGAFHGFKMGTKSFVQTYANLQSSELYRPVNAKGLSHLKSFRSADIGLNLHTQ LSKGMHLNLFSYYINERYTANRGSFNYRGEQNATNKRNFNILNLAYSAGWKNSFTLNMAT DVASSHYLFGSINDHTQQLSTFVSLAWKHCLTENLSVAVGLDNEYLRYKYDGEYPSTYFL VHPSNEHSHRKSAIWMNKSEGYVYGKWQMGKFTIGSGVRQMVSKQFSSTLSYQASIRFNP NKQNTIIASFGEYNAISRPTYYSRTFNKALSRQTSLDYSFTKGNVTLSAALYHKRERLPF LSPYQQEMIPIAQRITGIELSGKYSWKQFTASACYTYLMSSTFIDNKWQHAENDFGHTAK VEISYFNMKLFNISLNCMFHPGKYFTPIIGKHDVGTQPYPAFGTYNSEQLPHYLSVNLAL NRYMQVGKTAVVAYLTLSNILNRTNANYAYYDEQYTNQLIEPLQKRLLYFGMVVNF >gi|283510570|gb|ACQH01000049.1| GENE 3 3609 - 4046 417 145 aa, chain - ## HITS:1 COG:no KEGG:PRU_2605 NR:ns ## KEGG: PRU_2605 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 144 1 145 147 94 35.0 9e-19 MSYTTFDFRSVPCDWTLCFIQNCPLKATCMRYFVGEQVPMNATSGSAVYPTALQGNVCKH YFEKRVIRVAWGFSTLFSEVKRKDDLPLRNQIKEFLGGHGTYYCYMHGKKMLNSEQQAWI LDLFRDYGYTTNLVFDHYADVYDFG >gi|283510570|gb|ACQH01000049.1| GENE 4 4377 - 4967 354 196 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928283|ref|ZP_06422130.1| ## NR: gi|288928283|ref|ZP_06422130.1| hypothetical protein HMPREF0670_01024 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 196 1 196 196 391 100.0 1e-107 MNTIKVSVLLIVILFVVAGCNYTSKPRTFDREQATLYPRVPYEAIKDNKDSLDFILSLKG FSEELSNDKKPYQLSVAEMKMAKKILEKYVASGGDHLVVGSKDGTKSGKEVCDEMDDTAP LPLTHYYKQYLGYVKNGHHIVEINLLACVYVNYGESVSAYLQRVYSLPHDGGNHFGRVLI DLTEKKVIRFSLNTVA >gi|283510570|gb|ACQH01000049.1| GENE 5 5388 - 7709 732 773 aa, chain - ## HITS:1 COG:no KEGG:BVU_1454 NR:ns ## KEGG: BVU_1454 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 4 773 5 773 773 359 29.0 3e-97 MLKRIFIIGLFLNLLPLSLVSQKLLLSGKVEDNTGFPVALANVAIQTCSDSCVIARTVTD DNGSFSFDLDALPVPVTLHVSAIGFEDAFVDTQNADAGVIVLKPAAYMLNEVFVKASKHI VNLKNNGIRVSVSGTYLSHTGTTLELLGKIPFVFKSGTNIEVIGKGVPIIYINNRQVRDL HELEQLSSSSVKNIEVITSPGARYSSTANAVIRITTLAPMGEGFSFSGRTTLGLKHYAYL FEQLNFNYRAKGLDLFGALNYENYRECPRFENVVTQYLRAGLVEQHSMGQEMTKYPVYAG KLGLNYNDRVHSFGLLYDFKFNPSKTNGQSETSRYGAYIPEEVLGNFFVANRHNRQHLLS AYYSATLEKWKLTANIDALWQINDRFTEENERSSVNPLRNFNTMNKVGNRLLAGNLMATC AVWFGDLRFGMETNGIHRTDRYAGNADYIASNDNRIDETTTALFVESDQKFGVVSAGVGL RWEYTDSKFYLFGRYSPEQSRTYHNFAPSLFVAFPMGTVKANFAYTRKTSRPVFSQLSSA VKYIDRYTYETGNPNLRPIFRDNFSFSSSWKDLMVQLEYTSTKNYFMWQTFPYPSNTEAT LQTIQNMPRFHAYSAFINYAPSFFGCWHPVLMAAVVVQDFKLVHHGTELKLNRPLGVFRF NNAVQLPCEVWLNVDFLLHTDGDAENNRNHNYWNCDIGLYKSFLNDTWNVKLQLGDVFGT WRQKFVMYDALTRSSVVKRYHTRDLNLTIRYNFNATRSRYKGNGAGNDEKGRL >gi|283510570|gb|ACQH01000049.1| GENE 6 7901 - 8767 750 288 aa, chain - ## HITS:1 COG:no KEGG:BF2104 NR:ns ## KEGG: BF2104 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 7 287 23 305 306 370 66.0 1e-101 MKTKEIYLWRTLASFLVVLFSMPLGHALMIIMEHTMKPSAMHVAGFLLGLVGLIMVIVGV FVKGDTRQTIWGLLGGLLFWTGWVEFLFLYYARRYGVQPEIENGEVVTKPEYLIMPASFG LWMMVMVMYIFSTRNGCDFITWIQRVCFRDQRKVIVAQPMTRHTSIVTFMEVNMMLWGCY LLLMFCYDKNFLGDHHPVTFLVGLGCLVGSLFMFKRQLYLASWGANIRMAIATVIVFWTP VEILGRINFFKEFWTEPQHYVFELSSILIVFIALLLYLWMKGAKKKKG >gi|283510570|gb|ACQH01000049.1| GENE 7 8942 - 11083 2748 713 aa, chain - ## HITS:1 COG:no KEGG:PRU_0859 NR:ns ## KEGG: PRU_0859 # Name: not_defined # Def: peptidyl-prolyl 
cis-trans isomerase-like protein # Organism: P.ruminicola # Pathway: not_defined # 1 713 1 711 711 765 59.0 0 MAVLGKIRSRGMILIGIIGLGLLAFIAEEAFRSCEATRNNQRQQVGEVNGEKISVQEFQK MVDEYAEAIKMQQGQDNLNDEQLNQVKDMVWTSYVQNKLVEKEAKELGLTVTDQEMQNVL NQGTNPMLMQTPFVNQQTGRFDASSLKKFLAEYKNQQTTNPQMAQQYESIYKFWTFIEKQ LRTQLLAQKYQSLLAHCLLSNPVEAKMAFKEENEESQIQLAAFPYSSIEDSKVQVSESDL KAKYEELKPRFKQYVETRDIKYIDIQVEASPEDKAGLKKQFADYTKELTEAADPTNVVRK SSSLVNYLGLPVGKDAYPSDIAARLDSMSVGSVYGPVENKQDNTLNLVKLVAKVQLPDSV EYRQIQVGGATPEAAHKTADSIYTALTNGADFEALAKKYGQTGEKTWMTTRQYQMAPSMD KDTKDYINSLNTMGVNDIKNITLAQGNIILQVLNRKGMVTKYQAAVVKRTIDFSKETYRK AYNQFSSFVSANATADAVVKNAVKNGYRLQEAKDVTTSQHNLAGIRSTREALKWLFDAKE GSVSPLYECGDNNHLLLVILDHINPMGYRSLTDSQVKEMVKAEVIKDKKAEQILAKLNGV TSIAAAKAKGAVVTPVNQVTFASPVFLTATGASEPALSGAVAATAKGAFDKTPVKGNAGV YMFQVTGRTMRPVKFNAKETEMRLRQRAMQYAGNFMNQLYINGKVVDNRYLFF >gi|283510570|gb|ACQH01000049.1| GENE 8 11261 - 12535 1308 424 aa, chain - ## HITS:1 COG:BMEI0044 KEGG:ns NR:ns ## COG: BMEI0044 COG4536 # Protein_GI_number: 17986328 # Func_class: P Inorganic ion transport and metabolism # Function: Putative Mg2+ and Co2+ transporter CorB # Organism: Brucella melitensis # 13 420 12 417 435 161 28.0 2e-39 MDDPISISLVLGIIVTMVFSAFFSGMEIAFVTSNRMLVEMDKERNGLSQKVQAIFYKNPN NFVSTMLVGNNIALVVYGILIAKFFDNTIFRGMDASFTVPADTILSTLIVLFTGEFLPKT FFKSNPNRLFSFFAVPAYLCYLLLWPFSRFSTFLARVLMRIFGVRIEKENENILFSKTDL DYLVQSSLDSAKNEDDINDEVKIFQNALEFRDTKVRDCMVPRTEINSVDLNDCTIDELTQ KFIESGNSKLIVYQEDIDHIVGYIHSSELFRNPDQWKQHIRKMPFVPETMAAQKLMQTFL MQKRSLGVVVDEFGGTSGIVSLEDIVEEIFGDIEDEHDNQKYVAKQTAEGDYVLSARLEI DKVNNMFDLALPESDDYMTIGGLILHHYQSFPKINEEVNIGPYQFRILKNTMNKIELVKL KVHQ >gi|283510570|gb|ACQH01000049.1| GENE 9 12606 - 13256 683 216 aa, chain - ## HITS:1 COG:no KEGG:PRU_0861 NR:ns ## KEGG: PRU_0861 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 20 211 3 197 205 234 58.0 1e-60 MYTLLRTSGLAFMLCCIVALTACEQAVEHTAPAVNPKDSVPTMVTYGVNTLISDSGVVKY RIVTERYEVNQVKNPPRWTFDKGVFLEEFDENFHVQLYIQCDTAYYYDQQRLWELRGRVR 
VKTKDGVRFYSEELFRDENRHELYSNKFSRLITPDRQLQGNYFRSDERMTKYYVSNSKGS FARTDIAGKNDSLSSAPDTLKETVRPSAMPQRKLGG >gi|283510570|gb|ACQH01000049.1| GENE 10 13284 - 14579 1340 431 aa, chain - ## HITS:1 COG:no KEGG:PRU_0862 NR:ns ## KEGG: PRU_0862 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 431 1 428 428 399 46.0 1e-109 MKKVIGIVLASTVSMAAFAQSGTNSPYSQFGLGVLSEQSSGFNRGMNGLGLGFHEGNQVN YLNPASYARLDTLTFLFDAGISLQLTNFSEKGVKRNAKNANIEYVVAGFKAARRLGLSFG LIPFTNVGYQYLAAQALTDQPDATILTNTYTGTGGVRQAYLGIGWEPIKGLALGANMSYM WGDYDRRLINSYSDGSVNNLVHQYTMDIRSYKVDFGAQYTAKVSSKDNITLGLTFSPSHK IGGKPQLQEILTNSQTNVSDTTSYGGNFDLNLPNFFGVGLMWDHNGKWKVGADYTLQQWS TVKFPSYETNGGVSSFALRDNMLTDRHKLTLGGDFCPAPFSRNFFNRIHYRAGVSYATPY IKVYGNDGPTEMSASIGFGIPIINPYNNRSFLNVSAQWVRSTAKNYIQENVFRINVGLTF NELWFKKWKLE >gi|283510570|gb|ACQH01000049.1| GENE 11 14755 - 15945 957 396 aa, chain + ## HITS:1 COG:no KEGG:PRU_0863 NR:ns ## KEGG: PRU_0863 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 31 385 44 390 393 293 45.0 6e-78 MLLLRHLLAAVLLCAFQLSAQAQGTLELKDMDAVRISLLTCQPHEEVYSIYGHTAIRYQD VARRTDIAVNYGMFSFHKPYFVLRFVFGLTDYEMGIESFDTFCAQYASYGSGVYEQVLNL TPEEKLTIAKAIDTNYEPQNRVYRYNYFYDNCTTRARDMITNHLQASVEYGNEKAEDATS YRQIVHQCASQTPWIRFGNDMLLGIKADLPIDRAQRQFLPANLMNDFQSATLNNSSPTKR KLVASSGWVVQPGVQTSSSSFPLSPTALMLILAAIILGTTAIESKLNTRFRWFDGFWFLI CGLIGILLFVMIFSQHPTVSLNLQILFFCPYTLLYIYRAVKKGKETQFLRGIKIWCILIS LFLIGGFFQHYAEGVRFLALSLLIRYIYLIYRHKRA >gi|283510570|gb|ACQH01000049.1| GENE 12 15942 - 17357 1494 471 aa, chain + ## HITS:1 COG:CC2461 KEGG:ns NR:ns ## COG: CC2461 COG1524 # Protein_GI_number: 16126700 # Func_class: R General function prediction only # Function: Uncharacterized proteins of the AP superfamily # Organism: Caulobacter vibrioides # 1 166 13 188 577 77 31.0 4e-14 MNKYITLFALATLANIECKAQPNVAVPRLVVGITVDQLNSDDMEQFAALYGNGGFKKLLR EGVVYENAQYDFAPINLASATTAIATGASPVDNGITAERWVSRETLRPVGCTFDKKFFVS PNNVKSSTIGDEMKMASNGNALVFSVAANQDAAVLQGGHAADGVFWIDEATKNWKTSAFY 
PRSAQTWLDTYLRMPGLDVAKKNPNQATCQFALDCISDHVMGRDAQADLLFVTLSASNAT AESNPLLARENTYRELDKNLSDLIQNIEQKVGKENVLFVLTATGRSESNIQNYAKYNVPT GTFYINRTANLLNMYLNAIYGINRLVDGVLNNEIYINKTILEQKRINISEVLRHAREFLV QISGVKEVYTADQLLLDNGVSSKVRNAFNHAVSGDIIVKVAPGWKIYNEETKEEQLPATA SLPFPIIIYGAGVKPNTISTPVTTNRIAPTIARFIRIRAPNACVVPPLPLK >gi|283510570|gb|ACQH01000049.1| GENE 13 17527 - 20850 4060 1107 aa, chain + ## HITS:1 COG:CT701 KEGG:ns NR:ns ## COG: CT701 COG0653 # Protein_GI_number: 15605434 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Chlamydia trachomatis # 7 1031 5 934 969 634 38.0 0 MNLNKILQSLFGNKSTRDMKLIQPLVEKVKATYDEIKALSNDQLRAKTKEIQAYVQNAAK EQKEKIAELKAQIEETPIDERESLFNQIDKLEKEALEVYEVTLNEVMPVAFGIMHDTARR FTENEEIVVTATDFDRDLAATKDFVRIEGDNAIYSNHWIAGGNDTKWDMIHYDVQIFGGI ALHQGKIAEMATGEGKTLVATLPVFLNALTGNGVHVVTVNDYLAKRDSEWMGPLYEFNGL SVDCIDKHRPNSPERRKAYMADITFGTNNEFGFDYLRDNMSNSPEDLVQRAHNYAIVDEV DSVLIDDARTPLIISGPIPKGDDQMFEEYQPLVERLVDVQRKLATQYLAEAKQLITEGQE QNDQKKLDEGFLSLYRSFKSLPKNKPLIKYLSEEGIKAGMLKTEEVYMENNNRKMPEAIA PLYFVVDEKQNSCDLTDKGTEWLAKQVNDAELFVLPDIATQLSALEADKTMSDEQKVDRK DELLNHYAIQSERVHTLQQLLKAYTMFNKDDEYVVIDGEVKIVDEQTGRIMEGRRWSDGL HQAVEAKEHVKVEAATQTFATITLQNYFRMYHKLAGMTGTASTEAGEFWDIYKLDVVEIP TNRPIARNDMDDRVYKTAREKYAAVIDEIEDLRKQGRPVLVGTTSVEISELLSKMLNMRK IEHEVLNAKQHQKEASIVAKAGQSTNGLGAVTIATNMAGRGTDIKLSPEVKAAGGLAIIG TERHESRRVDRQLRGRAGRQGDPGSSVFYVSLEDKLMRLFASERIARIMDRLGFKEGERI ESPMISKSIERAQKKVEENNFGIRKHLLEYDDVMNKQRTVIYEKRRHAVMGERIGMDITN IIWDRVINIIQTNDYEGCKEAFIKIFAMECPFTEEEFINTPHDVLEERTFQMAMGTFKRK TDRLQEMTYPTIKEVYETQGDRYERIVVPITDGKRIVNIVCDLKEAYETEAKSVIKQFEK NVMLHIIDDCWKENLRQLDELRHSVQNASYEQKDPLLVFKLESVKLFDSMVNEMNDRITS LLMRAQLHVEQQVQEAAPEVRQQQQYTESKENLDETAQRAARQQDTRESAAPQNRTPVMK EHMPRRNDPCPCGSGKKFKDCHGRGIV >gi|283510570|gb|ACQH01000049.1| GENE 14 21187 - 21891 176 234 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 3 220 12 
229 312 72 26 9e-12 MQVNINNLTKKYGDKTAVSIPVFSMSDNEIVGLVGNNGSGKTTFFRLLTDLVKANTGEVS LNGVDPTQSESWKQHTGVYLDEGFLIDYLTPEEYFDFVASISDIDNESLTQTLQQFENFM AGEILGQKKLIRDLSAGNKQKVGIIAALLAHPQFLILDEPFNFLDPTSQNALKAILSKFG QREGTTVIVSSHNLQHTIDISTRIVLLEKGLIINDFKTVDTAAIRELENYFAAQ >gi|283510570|gb|ACQH01000049.1| GENE 15 21996 - 22499 523 167 aa, chain + ## HITS:1 COG:no KEGG:CCC13826_0565 NR:ns ## KEGG: CCC13826_0565 # Name: not_defined # Def: hypothetical protein # Organism: C.concisus # Pathway: not_defined # 3 166 1 167 171 120 44.0 3e-26 MQLRKEIEPDFETVEKRYPIALKAIMSYTAYCDENGDEDLVEYNKLADYLHQLTGKDMPQ FNLWEWWEEEGAEVLAFKIVLPEPQCVHNITMDELYEVVKRLKTDIYTPSEDGSLKELFK YHLDEYYKLFLERNFNTYDPKLFERNINDKGEYFEYTEAEIVQMLWR >gi|283510570|gb|ACQH01000049.1| GENE 16 23758 - 26346 2862 862 aa, chain + ## HITS:1 COG:MA3477 KEGG:ns NR:ns ## COG: MA3477 COG0370 # Protein_GI_number: 20092288 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Methanosarcina acetivorans str.C2A # 129 861 11 670 670 530 39.0 1e-150 MRLSELRTGEKAVVVKVLGHGGFRKRIVEMGFIKGKTVEVLLNAPLQDPVKYKILGYEIS LRRDEAQMIEVVSEEEAKLTDTPNDETKDTENTLGTNAEKDDQPQPTATDMPVSEEKMQQ AARHKRRVINVALVGNPNCGKTSLFNFASGAHEKVGNYSGVTVDAKEGIARFEGYEFHLV DLPGTYSLSAYSPEELYVRKQIVEKTPDVVINVLDASNLERNLYLTTQLVDMNLRMVCAL NMYDEFEKRGDTIDLEMLGKQLGVPMIPTVFKTGRGVQLLFHIIINMYEGMDFIDKDGNI NPEVAAQLQKWHEEYKNDVPTEHQEDFATTDARPHNKVSRHIHINHGPRLELCISRLKLQ IEQNPEVMHKFHTRYLAIKLLENDAETEREVASLSNAKAILATRDKEAQSIKTQLNEDCE TAIMDAKYAFIHGVLAKAGYKEGKATDTYKLTHKLDAIITHKFWGFPIFFLLLYVMFQVT FSVGQYPMDWIDQGVNWLGEWVKENMKDGPVKDLLAEGIIAGVGAVIVFLPQILILYFFI SFMEDSGYMARAAFIMDKLMHKMGLHGKSFIPLIMGFGCNVPAVMATRTIESRRSRLITM LILPMMSCSARLPIYIMITGTFFALKYRSAIMISMYVVGILMAVIMSKIFSRFLVKGEDT PFVMELPPYRFPTKKAIARHTWEKGKQYLQKMGGIILVASIIVWALGYFPHNDELDKQQQ QEQSIIGRMGKAIEPAFRLQGFDWKLDVSLLAGVGAKEIVASTMGVLYADDDSFSDDNGS GDDTEKYTRLRRQMLNDGISPLSAYAYLIFVLLYFPCIATVAAIRNESGSWRWALFAAFY TTGLAWVVSAAVYQVGSLIMGN >gi|283510570|gb|ACQH01000049.1| GENE 17 26596 - 26799 167 67 aa, chain + ## 
HITS:0 COG:no KEGG:no NR:no MKTIFKLTIICTGIVLASLFSTTTAKVDDPSNPPVDYGPGFSGCLGEKGTCGITKNGTKL VGKWVEI >gi|283510570|gb|ACQH01000049.1| GENE 18 26810 - 29101 1362 763 aa, chain + ## HITS:1 COG:no KEGG:Dfer_2699 NR:ns ## KEGG: Dfer_2699 # Name: not_defined # Def: TonB-dependent receptor # Organism: D.fermentans # Pathway: not_defined # 100 752 98 808 817 78 22.0 1e-12 MITVSLSFFTAGTSVCRKLLSLISVLLLVSSSFVHAQQPQNLKIHITIDNGLSLNDINVS IDYANNKTIGKTNEDGIFVAQIVPVPDTDTCKVSIHAFMYQSKDTTLTLSQTKIHEIRLN HLELQEVKVTGYKRIASTSANKTTFDIDRRGFPVNAKADMALSRLPGVMKSGDSFSLPGN NKPTSLRIDDREVDIKELMTLDLKDIERVEVLRHGTDEASDGGVINVIKKKHLPSLVKGQ MSLSAANFSEERYGTFPQITLRTQSIEFTAFLSGHLSNQHSWQTIAWNGEQNYHSDKSAK VGQCTANAKLSIFATPKFKASIAYTFFGFNSKINAQTESKAGFSGKQKITEGYGSHNINI VSCYEISKNKLLNLKARFRKYKSTNENDQPVSLIYETGMNELTGELSYQANNLHFIGKGN DLQMGYKNIYRQHLLSAYTRKFYTNIHNLFLTESFSLGGPFDAYVAVRNEWTANRMETKH SNYHTLLPTLALNANSKIGSISASYSRKINRPSVDYLNPETFYVSEHEQFQGNINLAPQY TNNFSLQIRKQLRNSYLNAFATYAHTTNLIELVFSDNLDCSTYENAARANIYRMGAGIYL PMLSHKLNINFNMSVNHEQYKLYNRFAEKTLMRNIQQSGWTVNSALNLSFSAPKGWYFNL NGNYINKSFDLNSSTTHRPSLSILVKRSFLKDLLELSLNYQDVFGWNQRTTTIYHFKTGF RTITSKLSTSSISLSANITFGKSFRTRRVGTNINNDDTKTKEQ >gi|283510570|gb|ACQH01000049.1| GENE 19 29235 - 29507 151 90 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHIKYVAYGALAFECVSIVGLLVKERVGLWLSLSMMATFSIYIVILHLLNRYEICGCGGI LNGLSFTTHLAINIGIIIILTYLLKKNETI >gi|283510570|gb|ACQH01000049.1| GENE 20 29494 - 30543 382 349 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928297|ref|ZP_06422144.1| ## NR: gi|288928297|ref|ZP_06422144.1| hypothetical protein HMPREF0670_01038 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 349 1 349 349 622 100.0 1e-176 MKQFKNTKEKHIYSIATILLLLCCILSFLLLSIRGKSAEYNYSFQRFYIKNPFKQCTNPC KGIDFSKPNGVCCDPENILVQYKTDKNTIIKQLKRNGIIARIYKSAKFDTQIFGQLGNQI YYISRSSLFLSDTTEQLRFNKVKLSQLLINATPINDSTLLVIKNDSITNNNGTSFGILNV KTKQFTQLGKGGYLPENNKYKTYQEQVLAYDGHFLNTNEAITYTFTHIPYIYVFDKTGKL TRIVKTLDNVPAPSIIRYGDFYIFERGKTFNSNVGAFAKGNSLFVLSYRISTHAHYIIDK YDLCKGKYQGSFSVLNKNNMNNKAIDKLLFLNHHVLIVSKDSATVLKVS >gi|283510570|gb|ACQH01000049.1| GENE 21 30834 - 31184 207 116 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|154175415|ref|YP_001407462.1| NADH dehydrogenase subunit A [Campylobacter curvus 525.92] # 2 116 13 126 129 84 34 2e-15 MSFILFITVLITAVLLVVAAFAIAKAIGPRSYNAVKGEPFESGIPTRGSSWIPVHIGYYL FAILFLMFDVETVFLYPWAVVVKQFGALALASIGFFLVVLVFGLAYAWRKGALEWK >gi|283510570|gb|ACQH01000049.1| GENE 22 31175 - 31798 418 207 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|154175216|ref|YP_001407461.1| NADH dehydrogenase subunit B [Campylobacter curvus 525.92] # 26 182 6 160 170 165 45 8e-40 MEVKQPHIKSIPYAEFKDNDTLEQIMNELHEGGVNVVVGSLDQAINWGRSNSLWSLTFAT SCCGIEFMSVGCARYDFARFGFEVTRNSPRQADLIMCAGTITNKMAPVMKRLYDEMADPK YVIAVGGCAISGGPFKSSYHVVRGIEEIVPVDVFIPGCPPRPEAILYGMMQLQRKVKVEK FFGGANRKEKKPLVIMDVPKDFDKDKE >gi|283510570|gb|ACQH01000049.1| GENE 23 31808 - 33376 1828 522 aa, chain + ## HITS:1 COG:SMa1529 KEGG:ns NR:ns ## COG: SMa1529 COG0649 # Protein_GI_number: 16263284 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 49 kD subunit 7 # Organism: Sinorhizobium meliloti # 163 522 19 404 404 278 35.0 2e-74 MKLNNIELNANDFANEMAKLRNEKHFDFLVTIVGEDFGEEGFGCIYILENTQTYERISVK VVAADRENPYLPSVTSLWKAADLLEREVYDFFGIKFIGHPDMRRLYLRSDFTGYPLRKDY DMDPEKNKAHMYDDPEPDYTLQYSLDKDGNLVETRHKLFGDDDYVVNIGPQHPATHGVLR LQTILDGEVIRKIYPHLGYIHRGVEKISESYVYPQTLAFADRLDYLSATMNRHALVGVIE EAMGVELTDRIKYIRTIMDELQRLDSHLLFYSCCAQDVGALTAFIYGMRDREQVINVMEE TTGGRLIMNYYRIGGCQADIDPNFVQNVKKLCAYMKPMFKEYQDVFTGNVIMENRFKGVG MLSREDAISYGCTGGTGRAAGWQNDVRKHHSYGVYDKVDFKEITYNNGDSYDRYMVRMKE 
MEQSVHILETLIDNIPEGDFYIKQKPIIKVPEGQWYFSCEGSRGEIGVFLDSKGDKSPYR LKFRPMGLPLVSALDTMLRGEKIADLIAVGASLDYVIPDIDR >gi|283510570|gb|ACQH01000049.1| GENE 24 33379 - 34467 1090 362 aa, chain + ## HITS:1 COG:RP796 KEGG:ns NR:ns ## COG: RP796 COG1005 # Protein_GI_number: 15604628 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 1 (chain H) # Organism: Rickettsia prowazekii # 49 357 33 332 339 231 43.0 2e-60 MFDFSIVTTWFDSLLRDTCGLSSFWAILIECVLVGVFILAAYAILAILLIFMERKVCAAF QCRLGPMRVGPWGIFQVIADVLKMLIKEIFAVDKADKLLYSIAPILVLIGSVGAFSFMPW NKGATILDFNVGIFLMTAISSIGVIGIFIAGWSSNNKYSVISAMRGAVQMISYEMSLGLC LIAAVVLTGTMQVSGIVEAQSGPFGWLIFQGHIPAIIAFVVFLITGNAEANRGPFDLAEA ESELTAGYHTEYSGMGFGFFYLAEYLNLFIVAGVASMVFLGGWMPIHIGVDGFDKVMDYI PGIVWFLGKAFALVWVLMWIRWTFPRLRIDQILKLEWKYLMPLSLLNLVFMTIVVAFGWY IK >gi|283510570|gb|ACQH01000049.1| GENE 25 34478 - 35008 583 176 aa, chain + ## HITS:1 COG:SMa1519 KEGG:ns NR:ns ## COG: SMa1519 COG1143 # Protein_GI_number: 16263279 # Func_class: C Energy production and conversion # Function: Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) # Organism: Sinorhizobium meliloti # 19 148 20 143 188 89 39.0 2e-18 MEDNKKSYFGGIRDGIKSLATGLRVTLREYFTPKVTEQYPENRKTTLHVAKRSRGRLVMK RDENDVVKCVACLMCEKACPNGTIRIVSEMVTNEEGKKKRQLVKYDYDLGDCMFCELCTN ACNFDAIEFTNDFENSVFNRDKLVIELDKEHYQSSLPNIIDGGAESEIGTFNTKTK >gi|283510570|gb|ACQH01000049.1| GENE 26 35026 - 35547 590 173 aa, chain + ## HITS:1 COG:sll0521 KEGG:ns NR:ns ## COG: sll0521 COG0839 # Protein_GI_number: 16332084 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 6 (chain J) # Organism: Synechocystis # 5 170 10 169 198 67 32.0 1e-11 MANIIMFCILAVVIIGSAIVCVFTKRIMRAATFLLFVLFGVAGVYFLLDYTFLGTAQIAI YAGGLTMLYIFAIQLVSKRTLQGLQERQRGKTVVAGALVALIGLVTVALVLLKNQFIDMS ASMVDAEVPMSVIGEKIVGAGKYQYVLPFEFISVFLLACIIGGIMIARKEDKK >gi|283510570|gb|ACQH01000049.1| GENE 27 35544 - 35852 358 102 aa, chain + ## HITS:1 COG:VNG0643G KEGG:ns 
NR:ns ## COG: VNG0643G COG0713 # Protein_GI_number: 15789840 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 11 or 4L (chain K) # Organism: Halobacterium sp. NRC-1 # 1 101 1 99 100 65 41.0 2e-11 MIPVELLLGLSTLLFFIGFFGFVTRRNLIAMLISIELVLNSVDINFAVFNRILYPGELEG FFFSLFSIGINAAETAVAIAIILNVYRRFHSDQVNSIENMKL >gi|283510570|gb|ACQH01000049.1| GENE 28 35859 - 37817 1953 652 aa, chain + ## HITS:1 COG:slr0844 KEGG:ns NR:ns ## COG: slr0844 COG1009 # Protein_GI_number: 16331732 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit # Organism: Synechocystis # 85 648 83 678 681 367 39.0 1e-101 MEYNFVFLILLLPALSFLVLGLLGMKMPHKVAGSIGTVVLGTIFCLSIYTAYEYFIGQGR GADGLYPHVTAFNFSWLKFSELLTFNLGFRLTPISVMMLVVITTVSLMVHIYSFGYMHGE KGFQRYYAFLSLFTMSMLGLVVATNIFQMYLFWELVGVSSYLLIGFYYPQGAAVAASKKA FIVTRFADLFFLIGILIYSFYVGTFNYDLNAMPSLHDQVAGSAVMMPVALFFMFIGGAGK SAMFPLHIWLPDAMEGPTPVSALIHAATMVVAGVFLVASLFPIFVEFAPEQLHWIAYIAA FTAFYAAAVACVQSDIKRVLAFSTISQIGFMLVALGVCMPDPATGSVAMTEEYADVQGLG YMAGMFHLFTHAMFKACLFLGAGCIIHAVHSNEKEYMGGLRKYMPITHWTFLISCLAIAG IPPFSGFFSKDEILVACYNFSPVMGAVMSGIAMMTAFYMFRLYYVIFWGKSYYETHINEG AHKPHEAPLTMTFPLIFLSVITVTCGFLPFGELVSANGIGFHTHINWGVAGTSVALALIG IAAAMLMYKKAETPFADKLAKTFPALHRAAYKRFYMDEIWQYVTHRIIFRLISKPIAWFD RHVIDGTFNFLAWSSNEAGESIRPWQSGDVRHYAIWFLSGAVALTLVLLYML >gi|283510570|gb|ACQH01000049.1| GENE 29 37851 - 39356 1623 501 aa, chain + ## HITS:1 COG:slr1291 KEGG:ns NR:ns ## COG: slr1291 COG1008 # Protein_GI_number: 16329430 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 4 (chain M) # Organism: Synechocystis # 43 444 46 448 559 259 36.0 1e-68 MSILTLFVVIPVLMLLGLWLSRNLNQVRGVMVVGSSCLLVLSVWLTIAFLQMRDAGNTDA MLFQYTVDWFKPLNIAYAVGVDGISVVMLLLSSIIVFTGTFASWQLKPLTKEYFLWFTLL SIGVYGFFISVDLFTMFMFYEVALIPMYLLIGVWGSGKKEYSAMKLTLMLMGGSALLIIG 
IIGIYFFSGATTMNILELQALHNIPESVQKVLFPFVFIGFGVLGALFPFHTWSPDGHASA PTAVSMLHAGVLMKLGGYGCFRIAIVLLPEAAHQLSWIFIILTTISVVYGAFSACVQTDL KYINAYSSVSHCGLVLFALLMMTQTSCTGAILQMLSHGLMTALFFALIGMIYGRTHTRDI RQLGGLMKIMPFLSVGYVIAGLANLGLPGFSGFVAEMTIFVGSFQNNDVFHRTATIIACT AIVITAVYILRVVGKILFGKVADPHYEQLTDATWDERFAVGCLIFCVAGLGLAPLWASNI ISDTLGPIIDHINNVTAIASL >gi|283510570|gb|ACQH01000049.1| GENE 30 39369 - 40838 1397 489 aa, chain + ## HITS:1 COG:BMEI1145 KEGG:ns NR:ns ## COG: BMEI1145 COG1007 # Protein_GI_number: 17987428 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 2 (chain N) # Organism: Brucella melitensis # 74 447 62 433 478 216 34.0 6e-56 MNLDYSQFFHMMPEVTLMAMLVIVFIIDFATAHKAPIVITNDKTAPSPRPWFNPLVCAFM LIHILLNCCPTEPATAFGGMYATTPMIGVLKTILAFGTLIVFIQARTWLSRPDSTFKEGE FYMLIISTLLGMNMMVSADHFLLFFLGLEMASVPMACLVAFDKYRHNSAEAGAKFILTAT FSSGVMLYGLTFIYADAGSLYFNEVSQHLSATPMTVMGMVFFFSGLGFKISLVPFHFWTA DTYQGAPTTITGYLSVVSKGAAAFALCSILMHVFGVLLEQWQYLLGIVIVLSITIANLFA IRQTELKRFMAFSSISQAGYIMLAVMGDSGMGVTALTYYILVYVVANMSVFTIISAIEER NNGVTDMDAYNGLYSTNPKLAFLMTLALFSLGGIPPFAGMFSKFFVFMAALEQGSTWAYA IVFIALINTVISLYYYLLIVKAMYINKSENPLPAFKSDCNTRLALAICTLGVALFGVCSY VYDWIFAAL >gi|283510570|gb|ACQH01000049.1| GENE 31 41207 - 41509 90 100 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260912219|ref|ZP_05918771.1| ## NR: gi|260912219|ref|ZP_05918771.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. 
F0295] # 1 97 1 97 99 145 73.0 7e-34 MVIDPQFPTCPIRNVLSRLCGLDALWVILVLAERVSGSMEQLCWGIVGMKHQQVASAVRI LMEDHLVEKSSGVYRLSPLGQSVLPYVKELVGWCEMQSKC >gi|283510570|gb|ACQH01000049.1| GENE 32 41503 - 41955 585 150 aa, chain - ## HITS:1 COG:no KEGG:PRU_0504 NR:ns ## KEGG: PRU_0504 # Name: not_defined # Def: cupin domain-containing protein # Organism: P.ruminicola # Pathway: not_defined # 21 150 1 131 131 192 72.0 4e-48 MKTVKTTTEGKNFTAVNIGKLSEVKDYVLPLGDIEIPGKVFVGQALQATGSEVSFQTLVP GQDSGFLHTHKTHEELYFILKGEGEYQVDGEVFPVSEGSIVRVAPAGKRALKNTGEGEML MLCIQYKADSFGENDSPAGDGVILGDKLTW >gi|283510570|gb|ACQH01000049.1| GENE 33 42199 - 42567 338 122 aa, chain + ## HITS:1 COG:XF0240 KEGG:ns NR:ns ## COG: XF0240 COG1733 # Protein_GI_number: 15836845 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Xylella fastidiosa 9a5c # 14 119 24 128 142 108 48.0 2e-24 MNRTEVRDALFPNCPVRNILSRVGDKWSMLVLFTLETNGNLRFKELQRNIPDISQKMLTA TLKMLEGDALITRVAFPEVPPRVEYSLSEKGKSLLPHISNLLAWATEHMGEIIESRQHYL LK >gi|283510570|gb|ACQH01000049.1| GENE 34 44181 - 44912 352 243 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928310|ref|ZP_06422157.1| ## NR: gi|288928310|ref|ZP_06422157.1| hypothetical protein HMPREF0670_01051 [Prevotella sp. oral taxon 317 str. 
F0108] # 8 243 1 236 236 480 100.0 1e-134 MDKLLKLMPLMLLLFLAGCSSKNDDEESEVFSTTLNMMNEDNGKTLFESTGYYIDNAGNF TSEYGYWCIVDAGNTSLSNTKWQADLENRVRTAAVQYNHSYHFFRNADLIEFPSGTFAVP LETDFYKLRVEEPIKENGVTKGAVVKFVKSKVSSRTLPAKGSVIGSMVNTAGSKIEIPVP KGAEYCEGEKIKSIGNIFDIQINDGKLIIKPNYWSDGSSRREEFDLYIRQGSICTKVLFY VVS >gi|283510570|gb|ACQH01000049.1| GENE 35 45299 - 47182 2113 627 aa, chain + ## HITS:1 COG:XF0840 KEGG:ns NR:ns ## COG: XF0840 COG1874 # Protein_GI_number: 15837442 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Xylella fastidiosa 9a5c # 35 616 28 599 612 390 37.0 1e-108 MKHLKFLAVAALMLLAATNAEAKQNKQTKTTRNTFAIADGQFVYNGKPMQLHSGEMHYAR VPAPYWRHRMKMMKAMGLNAVATYVFWNYHETEPGKWDWKTGNRNLRQFVKTAAEEGMLV ILRPGPYCCAEWEFGGYPWWLSKAKGLVIRADNQPFLDSCRVYINQLASQMRDLQITKGG PIIMVQAENEFGSYVAQRKDIPLETHRAYSAKIKQQLLDAGFDVPLFTSDGSWLFKGGTI EGALPTANGESDIEKLKKVVNEYNGGKGPYMVAEFYPGWLSHWAEPFPQVSTESIVKQTA KYLENGISFNYYMVHGGTNFGFTSGANYTTATNLQPDLTSYDYDAPISEAGWNTPKYDAL RALMIKNVKYNVPAVPQRIPVIAIPNIKLNKSADVLNLLTKGKAVESDKPLTFEDLNQGH GYVLYRRHFNQPIGGMLKVAGLADYALVYVNGQKVGELDRVSDVDSIEINVPFNGVLDIL VENMGRINYGARITQSIKGINGPVVIDGNEITGNWQMYKLPMNEVPDVNALPTANNKGLP TLYSGTFNLDTTGDTFLNMETWGKGIVFVNGINLGRYWKRGPQQTLYLPGCFLKKGENKI VVFEQQNDTPQTSVAGQTTPILQKLVK >gi|283510570|gb|ACQH01000049.1| GENE 36 47467 - 48366 1025 299 aa, chain + ## HITS:1 COG:BS_cysE KEGG:ns NR:ns ## COG: BS_cysE COG1045 # Protein_GI_number: 16077161 # Func_class: E Amino acid transport and metabolism # Function: Serine acetyltransferase # Organism: Bacillus subtilis # 128 288 4 155 217 135 42.0 9e-32 MTTADLTRTLTHIAEELSDAKALKGIFHQHKDGNPIPSGEVLKKIVDLLRAIIFPGYYGK STVNSRTIKFHVGVNVEKLNQLLTDQIQAGLCFASCDGCDASDTATDMRCTAEDMAMNLI KRLPAVRQMLAADVAAIYNGDPAAGSFGEIISCYPSIKALVNYRIAHELLLMGVPLIPRM LTEMAHSETGIDIHPAAQIGQHFAIDHGTGVVIGATCIIGNNVKLYQGVTLGARSFPKDE DGNPIKGIARHPILEDDVVVYSNATILGRVTIGKGSVVGANVWVTEDVEPNTHIFKKTK >gi|283510570|gb|ACQH01000049.1| GENE 37 48382 - 49899 1882 505 aa, chain + ## HITS:1 COG:slr0064 KEGG:ns NR:ns ## COG: 
slr0064 COG0116 # Protein_GI_number: 16331495 # Func_class: L Replication, recombination and repair # Function: Predicted N6-adenine-specific DNA methylase # Organism: Synechocystis # 12 372 21 384 384 284 39.0 2e-76 MEFELIAKTFMGLEPVLAKELTQMGANNVQIGRRMVSFTGDKEMMYRANFQLHTAIRILK PIAKFKARSADDVYEEIQKIDWSKHIEEGKTFSVDSVVYSEEFRNSRFVTYKVKDAIVDQ FRQRTGKRPNISVSNPDIRLNIHIAEYDCTLSLDSSGESLHRRGYRQESVEAPLNEVLAA GMILMTGWQGETDFIDPMCGSGTLVVEAALIARNISPGVFRKSYAFEKWPDFDQELFDNI YNDDSNEREFNHHIYGYDVDMKAVNTARLNVRASGLTKDITIEQADFKDFKKPENKSILV TNPPYGERISTPNLLGTYKMIGERLKHQFMGNDAWVLSYREECFDQIGLKPSIKIPVYNG SLECEFRKYAIFDGSLKDFRQEGGVVKTDEEKRQMAEKNRFKKNREFKKRLDDDGSQDTD DIRSYTFKNHFAHTEERNEERKRRRTTSYADRDDKRGGRSFERGGKPFAGKGKDFNRKGK SFERTDKRGGKNFGRNSANFIDHDD >gi|283510570|gb|ACQH01000049.1| GENE 38 49889 - 52075 2556 728 aa, chain + ## HITS:1 COG:CC2154 KEGG:ns NR:ns ## COG: CC2154 COG1506 # Protein_GI_number: 16126393 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Caulobacter vibrioides # 138 728 150 736 738 322 34.0 2e-87 MTIKQLIVGAMLSTMIFAPQSMSAQKLFTLEDLNYGGNNFHNMVPQNRYTAWWGDQLVRT DAEFCALIDKNTGKETRLFSLDDVNQWVESAGNIKVRSFYHATFPYPNQPLVLLNASKTR MLVNWKTKQLVWKQDTKDESSADWNAQSRAVAFVKGDNLYVCNAQGTHKQLTKDGSRDIV YGQSVHRDEFGIYKGTFWSNDGQKLAFYRMDQSMVADYPLVDIDTRIATETPIRYPMAGE KSHLVTVGIYDLKTDKTVYLNTGDPTDRYFTNIAWSPDGKLIYLIELNRAQNHYSLDAYD ATTGNKTATLYTESSDKYVHPMHAITFLPWDNSRFILQSEKDGYNHLYLFDTSGKQIKQL TTGKWIVVDLMGFNAKTKEAIILSTEASPIQNNLYAVNLQTGARRLLDNGKGCHANTTGE GGSHKIALSPSGQWILDSYTEPTVPRNIDIVNVASAKATRYFTADNPWKGYTVPEYSCGT IKAADGTTDLYYRMVKPTNFDPNKKYPTIIYVYGGPGVRNVEARWHFWSRGWETYMAQKG YLLFILDNRGSSARGLAFEQATFHHLGVEEVKDQMKGVEYLTSLPYVDKDRMGVHGWSFG GFMTTNLITTHPEVFKVGVAGGPVIDWKWYEVMYGERYMGTPQNNPKGYAESSLLPKAKN LKGKLQIITGMNDPVVVPQHCLNFLQECIKVGTQPDFFVYPGEPHNMRGHQSTHLHERIS QYFDDYLK >gi|283510570|gb|ACQH01000049.1| GENE 39 52081 - 53349 1461 422 aa, chain + ## HITS:1 COG:HI0888 KEGG:ns NR:ns ## COG: HI0888 COG0151 # Protein_GI_number: 16272828 # 
Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylamine-glycine ligase # Organism: Haemophilus influenzae # 1 419 1 419 429 385 48.0 1e-107 MNILLLGSGGREHALAWKIAQSSKVSKLFIAPGNAGTATVGQNVDINANDFDALKRFALD NDVQMVVVGPEDPLVNGIYDDFKNDARTTHIPVIGPSRQGATLEGSKDFAKAFMARHAIP TARYKTITAQNIDEGLAFLEELKAPYVLKADGLCAGKGVLILPTLQEAKSELRHMLDGMF GNASASVVVEEFLSGIECSVFVLTDGKDYQILPEAKDYKRIGEHDTGLNTGGMGSVSPVP FATPEWMKKVEERIIRPTINGLAEEGIDYKGFIFFGLINVEGNPMVIEYNCRMGDPETES VMLRLKTDIVDLFEGVAQGNIASRNVAFDERAAVCVMLVSGGYPQEYAKGYPIEGIENVA ESIVFHSGTAIKDGLLVTNGGRVLAVSSYGKDKAEALQKSFESANAIRFTDKYFRKDIGQ DL >gi|283510570|gb|ACQH01000049.1| GENE 40 53471 - 54403 829 310 aa, chain + ## HITS:1 COG:no KEGG:PRU_1808 NR:ns ## KEGG: PRU_1808 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 3 308 1 308 313 181 34.0 3e-44 MVVKRLQNKIAESRMTLSFTAVYAVLVFLASGLVGKNLWFEFACLAVSTYLMVEFNNVNA LLRVYSRMISCAFLVLACAANYLFASPSAALVQACFAAFYLIVFNAYQERTASGVTFYAY LFIGVASTVFVQILFFVPLLWILHATNVLSMSWRNFWASVLGLIAPYWFVGAYLVYTGQP EVFVQHFTDLALFAPLFQYQEIDHNRIATFLFVALVSLIGIVHYLRSSRNDKIRNRMIYE MVIVVDLFTIACIVLQPQHADNLLPILIINTAALIGHFITLTRTAATNIVFCVLWLASIL LTVCNLWIPL >gi|283510570|gb|ACQH01000049.1| GENE 41 54352 - 54840 424 162 aa, chain + ## HITS:1 COG:MA3555 KEGG:ns NR:ns ## COG: MA3555 COG1238 # Protein_GI_number: 20092362 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Methanosarcina acetivorans str.C2A # 12 156 1 144 153 82 31.0 5e-16 MACLNLTHRLQLMDSLIALLINYGYWGMFLSAVLAGSVIPFSSEAVMVGLQAAGLDAWPL ILYGTAGNVIGGMINYGVGRMGRTDWIEKYLHVKKEKLQRAENFMAGRGAWMGFFAFLPI LGSAITVVLGLMRANVAITVTSMTIGKFIRFALLVFGASFLL >gi|283510570|gb|ACQH01000049.1| GENE 42 54847 - 55680 887 277 aa, chain + ## HITS:1 COG:MTH604 KEGG:ns NR:ns ## COG: MTH604 COG0803 # Protein_GI_number: 15678632 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Methanothermobacter 
thermautotrophicus # 14 277 22 293 295 176 35.0 3e-44 MYKKISLFFAFAVLLLASCTMVKGGSKRTVMVTIEPLRYFVEAIAGNRFDVKTMVPIGGN PETYEPTAQQMVELSHSDLYVKVGSIGFEQTWMKRLKANAPHTIIIDSSEGIEPIESTDG VPDPHTWMSCKNAAIIAQNIYKALLQIDKEDSLYYKANLETLLAKIEETSNQIRENLTRE KSTTFLIYHPILTYYASEFDLHQIYIEDEGREPSAAQIKDIINSAKASQVRVLFMQKEFA NRNSETIANAVGAEVIEFNPLAYNWEKEMIKVAKSLK >gi|283510570|gb|ACQH01000049.1| GENE 43 55677 - 56447 220 256 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 1 214 1 214 309 89 27 7e-17 MTAPKQPIIKIEHVFAGYEEKVALQDVSLEVFDHDFLGVIGPNGGGKTTLMRVILGMMKP FSGKVSYFRNGVQVPELTIGYLPQYNDIDRQFPIAVEEVVLSGLSRQKKLFAPFDANHKQ QVAQTLEKLGMEDFAKRPIGSLSGGQLQRVLLARAIVSRPEVVVLDEPNTYIDKRFQEQM YQMLDVINHDCAIIIVSHDVGTILQNVKAVACVNRTLHYHDTPEISSDELSEHFGCPIEL LGHGRMPHRVLKAHED >gi|283510570|gb|ACQH01000049.1| GENE 44 56718 - 56858 95 46 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260912231|ref|ZP_05918783.1| ## NR: gi|260912231|ref|ZP_05918783.1| hypothetical protein HMPREF6745_2738 [Prevotella sp. oral taxon 472 str. 
F0295] # 6 46 1 41 41 77 100.0 2e-13 MVISPMKYFTPEELKPSEKTLNMIRQIAHCYKSIQVNGQRYTICLN >gi|283510570|gb|ACQH01000049.1| GENE 45 56863 - 57519 873 218 aa, chain + ## HITS:1 COG:BS_yloR KEGG:ns NR:ns ## COG: BS_yloR COG0036 # Protein_GI_number: 16078642 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Bacillus subtilis # 7 216 4 214 217 222 53.0 3e-58 MMKKPLIAPSLLSADFLNLNAEVQMINESEADWLHMDVMDGTFVPNISFGFPVLDAVGKA CKKPMDVHFMIVHPENYIEQTAKAGAMLMCVHQEACVHLHRTVAKIHETGMKAGVALNPS TPVNVLEDIINDVDLVLLMSVNPGFGGQKFIENTLNKVKRLRQLIDSTGSKALIEVDGGV QAQTAPLLIEAGADVLVAGNYVFKSANPKETIHQLKSL >gi|283510570|gb|ACQH01000049.1| GENE 46 57781 - 59241 1303 486 aa, chain + ## HITS:1 COG:sll1087 KEGG:ns NR:ns ## COG: sll1087 COG0591 # Protein_GI_number: 16330938 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Synechocystis # 7 466 9 455 512 125 28.0 2e-28 MLIILTILAYFGLLLIFSRLTSRRADNATFFRANKRSPWYMVAFGMVGASISGVTFVSVP GMVNVIGMTYMQTCLGFIVGYFAVAFLLLPVYYKLNLTTIYTYLDKRLGLSAYRTGAGFF ILSKISGAAVRFYVVCMLLQRFVLDALGIPFALTVLTMVALIWLYTRKGGINTLVWTDTF QTFCMFTALILIIWQVIGALGMSPTEAIQAVAADERSRMFVLDDWVSKQNFWKQFISGAF IVVVMTGLDQDMMQKNLTCKTLREAQKDMCTYGFAFVPANLLFLALGVLLSMLAVQRGVP LPDMGDELLPLFAASGSLGTLVVVLFTIGIVAASFSSADSALTALTTSFCIDICGKPNDE RLRKRAHLGMAAVFALFILFFRAVNSTSMLDAIYILCSYTYGPLLGLFAYALFTRRNVGG KWVLPVCLAAPLLCFVVETLTQQFSTYRFGYELLMLNGALTFIGLWLTGNNEVPNRQKNS IFSTPL >gi|283510570|gb|ACQH01000049.1| GENE 47 59262 - 59474 101 70 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQTPAGNGLVRVGYAETIRSDLPKQMVRRDAPRASSQAGLSVGQCVGLVPTRLWKAGARL PIFVLCRLML >gi|283510570|gb|ACQH01000049.1| GENE 48 59713 - 60075 271 120 aa, chain + ## HITS:1 COG:no KEGG:PRU_1408 NR:ns ## KEGG: PRU_1408 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 12 119 14 119 120 63 38.0 3e-09 MEQTEHSADANQQPFSVPEGYFDSLPPAVMQKIDGQNEQAAPKRRTPRLLRPIYIAAASL 
CAAIFGVAVYLGSAPQSPSGTTPVPPHPTIATIINSADKDAVADYTMMDNEDIYVHYSEI >gi|283510570|gb|ACQH01000049.1| GENE 49 60082 - 60495 508 137 aa, chain + ## HITS:1 COG:no KEGG:PRU_1407 NR:ns ## KEGG: PRU_1407 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 134 1 134 145 97 42.0 2e-19 MKKSLLTLLFAFAAISMVAHPPHKFDPEQFQAELEKFITAEVSLSQAESAAFFPVYRELR KKQRNIFKQIKHYKHVNPTDNKAAAEAIKQQDKLELEMKELLKDYHNKFMTLLPATTVFQ ILKAEDKFHRQLFKGKK >gi|283510570|gb|ACQH01000049.1| GENE 50 61933 - 64146 2430 737 aa, chain + ## HITS:1 COG:CAC1209 KEGG:ns NR:ns ## COG: CAC1209 COG1328 # Protein_GI_number: 15894492 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Clostridium acetobutylicum # 1 737 1 692 699 459 38.0 1e-128 MIQTVVKRDGRIVGFNEQKIMAAVRKAMLHTDKGEDERLVQQIADRIGFVGKPQMTVEDI QNQVEMELMKSSRKDVARAYISYRNQRSVARKAKTRDVFLEIINVKNNDITRENANMNAD TPAGMMMKFSSETTKPFVDDYLLSEESREAVSQNRLHIHDKDYYPTKSLTCCQHPLDHIL ERGFSAGHGSSRAAKRIETASVLACISLETAQNEMHGGQAIPAFDFYLAPYVRASFVEEV KALEELTGENYAHIYNKEINDYLKKPLDGLEGEARIVQHAVNKTVARVHQAMEAFIHNMN TIHSRGGNQVVFSSINYGTDTSAEGRCVIRELLNSTYDGVGNGETAIFPIQIWKKKRGVN YLPSDPNYDLYCLACKVTARRFFPNFLNLDATFNQSESWKADDPKRYMHEVATMGCRTRV FENKYGMKTSVGRGNLSFSTINIVRLAIECMDIADKDARINSFFAKLDNMLDVAARQLNE RYDFQKTAMAKQFPLLMRSLWTGAENLSPNDTIEKVINQGTLSIGFIGLAECLKALLGVH HGESDEAQQLGLRIVDYMRCRCNEFSEKYHHNFSLLATPAEGLSGKFTKVDRKKFGVLEG ITDRDYYTNSNHVPVYYKCSARHKAEVEAPYHEMTRAGHIFYVEMDGDATHNPQAIMNVV DMMDHYNMGYGSVNHNRNRCLDCGFENAEANLEVCPQCGSKHLDRLQRITGYLVGTTDRW NSGKLAELKDRVLHETE >gi|283510570|gb|ACQH01000049.1| GENE 51 64270 - 64779 414 169 aa, chain + ## HITS:1 COG:SP0205 KEGG:ns NR:ns ## COG: SP0205 COG0602 # Protein_GI_number: 15900141 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Streptococcus pneumoniae TIGR4 # 20 167 28 180 196 117 39.0 8e-27 MMANNKYTLRVLSIIEDTTVDGPGFRTCIYCAGCTHACEGCHNPQSWNRDGGEEMNIGRI 
MQVIKADPFANVTFSGGDPMLQAEAFTALAKEIKRQTKKDIWCYTGFTFEALLKDKGRRA LLEQLDVLVDGPFVRSLRDVGLRFRGSSNQRLIDVQTSLARGKVALWQG >gi|283510570|gb|ACQH01000049.1| GENE 52 65036 - 65629 585 197 aa, chain - ## HITS:1 COG:PA2896 KEGG:ns NR:ns ## COG: PA2896 COG1595 # Protein_GI_number: 15598092 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 8 188 13 189 194 85 27.0 7e-17 MIKISEMTDEELAMSYVNGNNKAFDLLLERNQSRLFSYILFIVHDRSMAEDLFQETFVKI ISKLHEGKYSSSGKFVSWMLRIAHNAVMDWYRRQKNEKTMEVYNENDLYGDSTSVLDANI ENHFVKEQVLRDVQRMMNLLPITQREVVFMRFYQNLSFKEIAELTNVSINTALGRMRYAV MNMRRMAAANGVALQLD >gi|283510570|gb|ACQH01000049.1| GENE 53 66120 - 66962 594 280 aa, chain - ## HITS:1 COG:RC0072 KEGG:ns NR:ns ## COG: RC0072 COG0682 # Protein_GI_number: 15891995 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Prolipoprotein diacylglyceryltransferase # Organism: Rickettsia conorii # 1 274 1 259 259 127 31.0 2e-29 MNYIHWNPPTDIINLGFFAIRWYGMLWGIGLIGATYIVARLFKEYKIADCKFASLFVYVF FGVLLGARLGHCLFYEPAYFFGSFKHFVEMLLPIRFLSEGGWRYTGYAGLASHGGVIGLL IALYLYCRQMKMEYLFVLDCIGFAAPFTAMAIRLGNLMNSEIIGKTTDVPWAFVFTRVDP LPRHPAQLYEAIFYAIVFCLGVLLYKKRKEKTTIGSGFFFGYCLTLVFTFRFFVEFIKEV QVDFEHHLVLDMGQILSLPLVAIGLYFMWKSKNKLANKHK >gi|283510570|gb|ACQH01000049.1| GENE 54 67364 - 67786 483 140 aa, chain - ## HITS:1 COG:no KEGG:BVU_2895 NR:ns ## KEGG: BVU_2895 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 140 1 125 125 117 50.0 2e-25 MDIQGKVIAVLPEKTGVSAKGEWKVQEYVIETHEAYPHKMVFSVFGADRIARFNVQVGQE VMVSFDIDAHEYQGRWFNSIRAYDVRLVDPAQFGVPAAAPQATQAAPAGFPPQQAASAAP AGFPPAQDAPAEDSSDDLPF >gi|283510570|gb|ACQH01000049.1| GENE 55 68037 - 69164 1351 375 aa, chain + ## HITS:1 COG:TM0262 KEGG:ns NR:ns ## COG: TM0262 COG0592 # Protein_GI_number: 15643032 # Func_class: L Replication, recombination and repair # Function: DNA polymerase sliding clamp subunit (PCNA homolog) # Organism: Thermotoga maritima # 1 372 1 
365 366 139 27.0 1e-32 MKFSISSSALSSKLQNLAKTLSAKPSIPILNNFLFEVVDGVLTLTTSDSETTMKAALKLD ESDSNGRFTVPSRTILDATRELPDQPLHFDINLETMAVQITYQNGVYNFTAQTAEEYPRN KELAADAASIQMESAVLLNSIVRSIFATGQDELRPIMNGIYFDLKEDGLAIVATDGHKLV KNKNNAIRSDVKRSFVLAKKPASLLRNILTKDATMVEIKFDDRNAEVHFPNGMLSCRLVE GNYPNYESVIPKDNPNQITLDRKGLVSVLRRVLPFASESSQLIRLHIEQGLIELSSEDID FATSAKESMVCDYVGIPMNIGFKGSVLGDVLNNLDSDDVVVQLADPSKAALVLPAVQPEN EEVLMLIMPMMLNDN >gi|283510570|gb|ACQH01000049.1| GENE 56 69264 - 70118 1045 284 aa, chain + ## HITS:1 COG:NMB1451_1 KEGG:ns NR:ns ## COG: NMB1451_1 COG0847 # Protein_GI_number: 15677307 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases # Organism: Neisseria meningitidis MC58 # 3 125 24 142 177 90 43.0 4e-18 MKLNLTRPLVVFDLETTGLDLVNDRIIQISYIKIYPDGKEERENLLVYPQRPIPDEVSEL TGITTDLVEDAPTFEELAPRLNEVFKGCDFAGFNSNHFDIPMLAEEFLRAEIDFDFSSAR LIDAQTIFHKMEKRNLAAAYKFYCGRKMEDDFEAHRADQDAEATYRVLLGELERYSEANQ EEPERVLPNDMDYLAEFSKNNDNVDFAGRIVWRPVLDGKGNEVLDSKGQVIKKEYFNFGK YKGSAVDEVLQRDPGYYNWMLQADFTNNTKQVLTRIRLKGFGGR >gi|283510570|gb|ACQH01000049.1| GENE 57 70152 - 71375 1023 407 aa, chain + ## HITS:1 COG:BH2510 KEGG:ns NR:ns ## COG: BH2510 COG0452 # Protein_GI_number: 15615073 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantothenoylcysteine synthetase/decarboxylase # Organism: Bacillus halodurans # 1 406 1 398 404 332 44.0 6e-91 MLKGKKIVLGITGSIAAYKACLILRLLVRQGAEVQVVITEAGKQFITPVTLSALSGKPVV SEFFAATDGTWHSHVDLGIWADAMLIAPCTACSLGKMAHGIADNMLITTYLSMKAPVFVA PAMDLDMYAHPSTQANIDRLRQVGNIIIEPQAGFLASGLEGKGRMEEPESIVATLEKYFE GEDNGRTCVVQRQLAGKKVLITAGPTYEKIDPVRFIGNYSSGKMGFALAEECSRRGADVV LVAGPVSLQSSATIRRIDVESADEMHAACVAEFAHADAAILCAAVADFKPAAVAAQKIKR EGDGLTLKLAATHDIAAALGRVKKGNQVLVGFALETNNEETNAQKKLESKNLDFIVLNSL RNEGTCFGTDNNMVSIISATEKRQFGFKSKSEVAKDIVEYLCERMKC >gi|283510570|gb|ACQH01000049.1| GENE 58 71589 - 72494 979 301 aa, chain + ## HITS:1 COG:no KEGG:PRU_0837 NR:ns ## KEGG: PRU_0837 # Name: not_defined # Def: hypothetical 
protein # Organism: P.ruminicola # Pathway: not_defined # 12 301 12 301 301 396 68.0 1e-109 MKIKIAFLLGLLMVCRGAFAQELRAKVTVNASQVQTTETAVFDNLQQAIEQFVNNRQWTH LQFRDNERIVCNFNITIQQYDREANRFSCKALIQANRPVFNASYSSVLYNNTDNDFNFDY SQFDQLELNEDNLSNQLTALLAYYAYLIIGLDLDSFAPMGGEDVLQRCMTLVNNAQTLPF AGWKSFDNDRNRFAIINDYLDGAMQPFRQLQYDYYRKGLDEMVNNAERGRVNITTALEEG LKRSKENKPLSMLPQIWTDYKRDELVQVYQGKGTQKEKQAVYDLLFKINASQNNAWEKIK E >gi|283510570|gb|ACQH01000049.1| GENE 59 72496 - 74157 1878 553 aa, chain + ## HITS:1 COG:SP1202 KEGG:ns NR:ns ## COG: SP1202 COG0497 # Protein_GI_number: 15901065 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Streptococcus pneumoniae TIGR4 # 1 550 1 550 555 297 33.0 4e-80 MLTQLYIKNFALIDELDMDFRSGFSVITGETGAGKSIILGALGLVMGQRADVKSIKHGAE KCTVEAHFNIAHYGLESFFEQNDLDYDANDCIIRREINVSGKSRAFINDAPAPLALVKEL GERLIDIHSQHQNLLLNKEDFQLNVIDLIAQNSAQLAEYAEAYDKYKAAEKELRQLEELI SGNKEREDYLRFQHAELEEAKLEDGQQESLEQESEIMSHAEDIKAALYQAVEGINGESDS ILGNLHTYVGKLHDIEDIYPNVRELAERLDNCYIELKDIAQDLGSQVEDVDFDPQRLDEI IQRLNIIYSLEKKYRLDSVAELIAMHNDIKQQLQHIDNGDADLEEMRKKKEQMLAICTEK AAKLTAIRQEKAQQIEQQLQSMLMELGIPKVQFKIALSAKDLSPNGCDKVSFLFSANTNT ALQPVSQVASGGEVARVMLSLKAMISNAVKLPTIIFDEIDTGVSGRVAEKMANIMADMGK TDRQVIAITHLPQIAALGTTHYMVSKEETDQGTVSHMSLLNPSERVGEIAKMLSGSDVSS AAIDNAKTLLGIQ >gi|283510570|gb|ACQH01000049.1| GENE 60 74178 - 75830 1601 550 aa, chain + ## HITS:1 COG:no KEGG:PRU_0835 NR:ns ## KEGG: PRU_0835 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 550 1 553 553 385 40.0 1e-105 MKSVFLSLLFTCFALGAQAQSANVQKVLKSVFSLTTFKKDGSLLASGRGVFVGTNGEAVS TWAPFVGADSAIVIDANGNKHTVDVIYGASEVYDLCKFKVNAITTPAPAAKTQMAVGARA WLVQYSTGRAQTSALKVKKAEPFMTDCYFYQYDGQPENIHVGCPVTNEKGEVMGLLQHSK LSGDYNSVDVRFAETFKTTGLSVNEPILKQTTIRMALPDKQDQALLMLMLEAGQTSPSRY LRYVEDFIHKFPKSVEGYVSKAEFYLAQNDFGKANSTMQTALSMADKKDEAHYNLAKLIY QKEIKQANMPYPQWNLDLALAETNKAYALNPLPLYKHLEAQIVYFKGEYAKAYEQFMALT KTNLRNGELFYEAAQCKQQLKAPFAEILTLLDSAVTAQDANLSAPYVLARGTALHGAQQY 
KKAMADYNRYDSLMLGRASHDFYYTRALCEVQLRQFQQALNDFAHAIVLNRAEPTYYAEL AQLQLRVNQKEDAVVTADLGLRIAPNYPDLYLIKGLALVNDNKKNEGLAALNKAKELGDT RADDLIKKYK >gi|283510570|gb|ACQH01000049.1| GENE 61 75988 - 76602 557 204 aa, chain + ## HITS:1 COG:DR0198 KEGG:ns NR:ns ## COG: DR0198 COG0353 # Protein_GI_number: 15805234 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Deinococcus radiodurans # 4 203 2 198 220 199 48.0 2e-51 MEQQYPSALLEKAVAEFSRLPGIGRKTALRLVLHLLKTRVEDVENFSSSILKVRKDIKYC QLCHNISDTETCAICANPRRDAATICVVENIQDVMAIENTQQYNGLYHVLGGVISPMDGV GPNDIEINSLVERVAKGGVNEVILALSSTMEGDTTNYYISRKIAPYKVKTSVIARGISVG DEIEYTDEVTLGRSIINRTSIGET >gi|283510570|gb|ACQH01000049.1| GENE 62 76934 - 77368 335 144 aa, chain + ## HITS:1 COG:no KEGG:PRU_0896 NR:ns ## KEGG: PRU_0896 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 17 130 18 136 141 72 36.0 6e-12 MKPQQTLNIIFYTSTGLSLLAALLFETDMLIGGWWADNRSADFLCTAILELLSLCAIPVA FRLVRPGHSNTGRMSYDCRAILRLVLLGLPLLLNAFAYYAFMGVPFGYMAIILFLCLLFV APTRNRYEREKASLQTTDNSSENA >gi|283510570|gb|ACQH01000049.1| GENE 63 77361 - 78551 851 396 aa, chain + ## HITS:1 COG:CAC2321 KEGG:ns NR:ns ## COG: CAC2321 COG1216 # Protein_GI_number: 15895588 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Clostridium acetobutylicum # 4 281 6 286 298 196 37.0 5e-50 MPKLTVVIVNYNVKYYVEQCLHSLRKALAGVDAEVYVVDNHSHDGSVEYLQERFPRLNII ACMHNYGFAYANNVAIKQSQSEHVLLLNPDTFVAENAIQTMLNFMDEHPQAGGIGVQMLG ADGQKAMESRRGLPSPMVSFYKMSGLCARFPQSKRFGKYYLSYLPWDAPARIEVISGACM MVRREAFDKVGLLDEDYFMYGEDIDLSYRILKAGYENWYLPCQILHYKGESTHKSSFRYV HVFYQAMLIFFRKHYRHTRFWVSLPIKAAIYFKALTALLSIQTRVLRRALALRPRRKQQP AYRFFGSQTSCLAVEQMAKRKGLSVECTVANEQTLPLGHATAAETLAQPTVMVYDSSVYR YQTILEIMSKHPQKNVTLGLFNPETQLLITTNDIIK >gi|283510570|gb|ACQH01000049.1| GENE 64 78548 - 79060 369 170 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229254479|ref|ZP_04378409.1| acetyltransferase, ribosomal protein 
N-acetylase [Capnocytophaga ochracea DSM 7271] # 5 167 1 162 166 146 44 4e-34 MNNPIVNLRAMEPEDLDTLYKIENNQALWAVSATNVPYSRFALHEYVETNTNDIYADKQV RMMIDNEAGETVGIIDLMNFSPQHSRAEVGIVVMKPHRHKGYATAALTKLVAYASQTLHL HQLFLVADCENESCLRLFEKLGFVQTALLKQWLQVKEGVYKDACLMQLFL >gi|283510570|gb|ACQH01000049.1| GENE 65 80487 - 81014 333 175 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928340|ref|ZP_06422187.1| ## NR: gi|288928340|ref|ZP_06422187.1| hypothetical protein HMPREF0670_01081 [Prevotella sp. oral taxon 317 str. F0108] # 1 175 1 175 175 352 100.0 5e-96 MESKHRTSPFALTDGEETMLPYFTGETPTCLYTVYFLGRTAKERSFWDVYNKVGFWKDKD NERPHLYAKDIERHITTAIKQYNIFAIAIDKGMIKDIQKGDFTPNDDAQYTTFLLKDGKW TVVDSFNVMQSPTNTAEYLMKVLDKTSRNNAQLLGDSLVSSPNFAHNLWHIMRKK >gi|283510570|gb|ACQH01000049.1| GENE 66 81700 - 82155 137 151 aa, chain + ## HITS:1 COG:no KEGG:ECL_01542 NR:ns ## KEGG: ECL_01542 # Name: not_defined # Def: putative cytoplasmic protein # Organism: E.cloacae # Pathway: not_defined # 23 139 24 143 163 99 44.0 3e-20 MTAISYSTLSSNFARVKSLAPSKVGKLIGGKVEANIKANIFKNACAIRMSYAFNYSGMPI SRYDGSVSSGKDKKWYLYRVAGMKKFVKKHIGGSPIKGHAAKDFRGKKGVIIFSDCEWDD AAGHIDLFDGKEVEGNGYFDECGMALLYELK >gi|283510570|gb|ACQH01000049.1| GENE 67 82163 - 82675 149 170 aa, chain + ## HITS:1 COG:no KEGG:BDI_3895 NR:ns ## KEGG: BDI_3895 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 39 168 627 760 766 119 47.0 3e-26 MRTSGWCIVISFLAVVLFANCKDADNRLFVAERNGLYGYINAQGDTIVDCTFPFAYTDTI SRIGFVADSVGAIKCFNNKGEFLFKVFNYDNGPDYPADGLFRIVDDNALVGFADTLGNVV ISPRFKFAYPFKEGHAKVTDTGVLVPCGEGVDRHTTWLSDKWYFISKKKR >gi|283510570|gb|ACQH01000049.1| GENE 68 82901 - 83356 386 151 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229210357|ref|ZP_04336754.1| acetyltransferase, ribosomal protein N-acetylase [Leptotrichia buccalis DSM 1135] # 3 147 4 148 155 153 48 4e-36 MNIKLEKGINKDNASLLCKWSNEQGKAFQEQWMGPKISYPLSYGKIKDLENVFSIFNNGE FVGMIQEIQIGKDNVHIGRFVINPQKTGLGLGTEALKRFVDFIFKDDNIESISLAVFDFN QNARKLYEKLGFEINEIIEVPKLKYIMRRHR 
>gi|283510570|gb|ACQH01000049.1| GENE 69 83629 - 84900 1332 423 aa, chain - ## HITS:1 COG:PM0738 KEGG:ns NR:ns ## COG: PM0738 COG2873 # Protein_GI_number: 15602603 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Pasteurella multocida # 1 420 1 418 422 527 58.0 1e-149 MKKQTLCVQAGWQPKNGEPRVLPIIQSTTFKYDNTEEMAMLFDLKKEGYFYTRLQNPTND AVAKKIAALEGGVAAVLTSSGQAANFYAVFNICEAGDHFVTSSEIYGGTFNLFAVTLKKL GIDCTFVHPDASDDEIQAAFRPNTKLLFGETISNPSCAVLDIEKFARIAHRNGVPLIVDN TFATPINCNPFEWGCDIVTHSTTKYMDGHATSVGGVVVDSGNFDWETNAEKFAGLCTPDS SYHGLTYTKAFGKMAYITKMVVQLMRDLGSIPSPQNAFLLNIGLETLHLRMRQHCQNAQK VAEYLHNDPHVAWVHYCGLEDDANHEKAKKYLPNGSCGVLSFGLKGDRATAVKWMDALKM IAIVTHVADARTCVLHPASHTHRQLSDEQLREAGVTPDLIRLSVGIEDVEDILADIKQAL EKL >gi|283510570|gb|ACQH01000049.1| GENE 70 84857 - 85204 111 115 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLPTSECDFDNLKIELSVRKFRPYNRINQTFRIFFLYFNVINRKDKRLQKNFRVPSVHAT TPSTLPIFTHIMSLLPCLSHSFLVTLHTTSSTSNNFKIQQNDEETNFMRAGRMAT >gi|283510570|gb|ACQH01000049.1| GENE 71 85727 - 85936 74 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFLMFLCQNERIYVKTREKNVFVRPNEQCRAYSKIVRARKRTFRETRKERLTRSLIHPFT CQQVNSPTS >gi|283510570|gb|ACQH01000049.1| GENE 72 86282 - 86821 343 179 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928345|ref|ZP_06422192.1| ## NR: gi|288928345|ref|ZP_06422192.1| hypothetical protein HMPREF0670_01086 [Prevotella sp. oral taxon 317 str. 
F0108] # 30 179 1 150 150 305 100.0 7e-82 MRSLLILLMAIILVSCECHYETFRIDNVVMQPIVFTDSLVNGKKYFVINFITSWTGTKPV LFGGGIEPGLKGIDEKVKSIEVRTRSGKLISPCFKGWETDMYGLISNQEESHGYYSSLNI ASLVRSINKGERQSVGMRICIPRLFYLSSNDEPYTITIKFRERQITSKVIQKEMVYCAE >gi|283510570|gb|ACQH01000049.1| GENE 73 87017 - 87562 657 181 aa, chain - ## HITS:1 COG:BS_sigW KEGG:ns NR:ns ## COG: BS_sigW COG1595 # Protein_GI_number: 16077241 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus subtilis # 22 176 19 182 187 89 35.0 2e-18 MIELDEQRLLQRLADPATRHAAFQDLVKAYAQPLYWKVRRMVLVHEDADDLVQNVFLKAW NGLDNFKQQARLSTWLYRIAINEALDFLRKQKNATTLQSSADLTLAQQLMADDYFDGDAT QARLQEAIAELPEVQRTVFNLRYFDEMKYEEISVILSTSVGALKASYHLAVKKLRKIMDA E >gi|283510570|gb|ACQH01000049.1| GENE 74 87642 - 89294 1517 550 aa, chain - ## HITS:1 COG:BMEI1014 KEGG:ns NR:ns ## COG: BMEI1014 COG2989 # Protein_GI_number: 17987297 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Brucella melitensis # 279 543 278 518 531 122 31.0 2e-27 MRNKHKDKEYQQAAQIVGGLSMKYILELGKSSSNFLIGKFSSTVLFVCLVLVACHEKGHV ERPLPSQETLGQWAEQAYKLNISDIKRQLALLVRADRDTLVADSRTRRYYRDGGNLLWID RFGLDKRADSLLAYLNTVEEMGFSKRVFVVEQIERDLQRIRTLDVDSGIRDVNRVMARLE YNLTKAFLRYTAGQHFGFTNPTRIFNNLDIREQDSVRTTYRGLFDIHVKRPTKDFYTTAL RKVYNDSVDIFLRQAQPQNALYALLIKHFKGKDLSETDRIRLLCNVERSRWRQTDYPQNH RKYVLVNVPSYHLDAIEGDSVLTMRMACGTLDTKTPLLNSHIMRMDVNPQWIIPRSIIKK EVVGHAGNPDYFERHRYFILNRPSGRRVDPANVSAGMLLNPDYFVIQEGGEGNSLGRVVF RFNNNFSIFIHDTSSKWAFDRGSRSVSHGCIRVQKPFELAVFLLGKKDKTLIDKINYSMQ ADLSKPKEGEEKRTRVDYRRLVNSVKVNPNVPLFITYYTLYPDVKGKMHTYADVYGYDKV LYRYLRSFSI >gi|283510570|gb|ACQH01000049.1| GENE 75 89421 - 89729 234 102 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928348|ref|ZP_06422195.1| ## NR: gi|288928348|ref|ZP_06422195.1| hypothetical protein HMPREF0670_01089 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 102 1 102 102 182 100.0 6e-45 MTKTLLTILFATALMMGNPLTSRAAAALETIDNDYQNVVISVSESTIHVSGAAGQMLAVY DLAGMRVAYLKIDSAEKHFDLGLPKGCYIVKVGKMVRKIAVR >gi|283510570|gb|ACQH01000049.1| GENE 76 90070 - 90882 741 270 aa, chain - ## HITS:1 COG:slr0050 KEGG:ns NR:ns ## COG: slr0050 COG1234 # Protein_GI_number: 16331469 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Synechocystis # 1 270 34 307 326 162 35.0 8e-40 MIDCGEGTQVQLRRSRIRFTKIGAVFISHLHGDHCFGLIGMISTFGLLGRTAALHVYAPA ELGPMLRMQLDFFCGGIEFEVVFHAVDTKVQQVIYEDKSLTVESIPLEHRIDCCGFLFRE KELLPHIRRDMIDFYRIPISQINNIKLGADWVDDEGRTVPNTRLVTPADPARAYAYCSDT RYMPQLHKAVQGVNTLYHEATYDSSMASRARLYYHSTAEEAAKVARDAGVGKLVLGHYSA RYEAEEVLLAEAKAIFPNTILGQEGLEIDV >gi|283510570|gb|ACQH01000049.1| GENE 77 90992 - 91552 721 186 aa, chain - ## HITS:1 COG:no KEGG:PRU_1412 NR:ns ## KEGG: PRU_1412 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 9 154 7 156 227 66 27.0 5e-10 MDKRVSTILFAVMAVLFIVACGDGKKKGDGQQSLATTEESKGDTTVYGVCGVNTAMHNLE LVTVQGDTLNYLFDVDDESAVKGGLLAGDRLAVVGYKNADGEYEANQVLNLTTLMGKWRS IDKQFELMEDGTVLSAVKEEAHPWTSWKVFNGQLLLNKDTFTVNALGGDSLYLENAHGIF TFTRQK >gi|283510570|gb|ACQH01000049.1| GENE 78 91738 - 92535 961 265 aa, chain - ## HITS:1 COG:CC1747 KEGG:ns NR:ns ## COG: CC1747 COG1694 # Protein_GI_number: 16125991 # Func_class: R General function prediction only # Function: Predicted pyrophosphatase # Organism: Caulobacter vibrioides # 14 263 21 275 275 206 41.0 3e-53 MHTKEEKMEAFGRLLDVLDDLREKCPWDKKQTFESLRPNTIEETYELCDALAKNDMNNVC KELGDVLMHTAFYALLASEKGEFDIADVCNREADKLIFRHPHIYGDVQADNEEQVLKNWE QLKLQEKDGNETVLSGVPDALPSVIKAYRIQDKARNVGFDWKEKNDVWAKVREELDELET ELKREDKEKSTQELGDFLFSVINAARLYHLNPDNALEKTNQKFIRRFNYVEQHSIKEGKP LTEMTIEEMDELWNKAKNEEREMSS >gi|283510570|gb|ACQH01000049.1| GENE 79 92710 - 93702 1104 330 aa, chain + ## HITS:1 COG:HI1170 KEGG:ns NR:ns ## COG: HI1170 COG0147 # Protein_GI_number: 16273094 # Func_class: E Amino acid 
transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component I # Organism: Haemophilus influenzae # 11 326 11 324 328 284 44.0 2e-76 MNFYNREEARQRMNDLAEQAVPFLFIIDYDSKRAVVEPENEVDANELKYCFNGKGNATQA AYPANKDIEWGIEPPTEAEYQHSFNIVRNAMLAGNSYLANLTCRIGLRTNLSLLDLYHAA EARYRLWMKDKLVCFSPETFVKIAQGEISSYPMKGTAEDVSPSSAEQLLANEKEAAEHAT IVDLIRNDLSMVAYDVHVERYRYVERLNTHRGPLLQTSSEIRGRLMPHLMQRPGDVIFSQ LPAGSITGAPKKKTVEVIAEAENYHRDFYTGVMGRWDNGELDSAVMIRFIDQCHGKLFFK AGGGITAKSNWKDEYHEVIEKVYAPICRNH >gi|283510570|gb|ACQH01000049.1| GENE 80 93677 - 94270 591 197 aa, chain + ## HITS:1 COG:HI1169 KEGG:ns NR:ns ## COG: HI1169 COG0115 # Protein_GI_number: 16273093 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Haemophilus influenzae # 1 185 1 187 188 117 35.0 2e-26 MHQFVETIRIEGGKAMNLPLHEARMNATRLHFAPHAAPISLQKWLDDAPISEERVKARVV YDADGVCETTFQTYKRREIQWLRMVEDNDISYTFKSTDRHELDHLLALRDGCDEVLIVKN GLITDTSFTNVAFFDGHKWLTPTQPLLNGTMRQWLLQRGELMEAQITPASLASFQRIMLF NAMIGAHELELPTTHII >gi|283510570|gb|ACQH01000049.1| GENE 81 94368 - 97049 2737 893 aa, chain + ## HITS:1 COG:FN2011 KEGG:ns NR:ns ## COG: FN2011 COG0525 # Protein_GI_number: 19705307 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Valyl-tRNA synthetase # Organism: Fusobacterium nucleatum # 2 876 3 886 887 737 44.0 0 MEIASKYDPKEVEATWYKYWQDNKLFASKPDGRAPYTVVIPPPNVTGVLHMGHMLNNTIQ DILVRRARMEGKNACWVPGTDHASIATEAKVVAKLAEKGVKKTDLSRDEFLKHAWDWTHE HGGIILKQLRRLGASCDWDRTAFTMDDERSESVLKVFCDLYEKGLIYRGVRMVNWDPQAQ TALSDEEVVYKEEHSKLFYLKYMVVEEPGKYAVVATTRPETIMGDTAMCINPNDPKNTWL KGKHVIVPLVGREIPVIEDDYVDIEFGTGCLKVTPAHDINDHNLGLKHGLDTIDIFNDNG TLSEAAGLYVGMDRMDVRKKIAGDLEAAGLMEKTEDYNNKVGYSERTHVPIEPKLSTQWF LKMQHFADLALQPVMDDDIAFYPQKYKNTYRHWLENIKDWCISRQLWWGHRIPAYYFTAN GKRECVVALTAEEALAKAKAVCGTLTAADLEQDEDCLDTWFSSWLWPISVFNGINQPDNE 
EINYYYPTSDLVTGPDIIFFWVARMIMAGYEYRGKMPFKNVYFTGIVRDKLGRKMSKSLG NSPDPIELIEQYGADGVRMGMMLSAPAGNDILFDETLCEQGRNFNNKIWNAFRLVKGWEV SDQPQPEASRIACQWFEAKLRQTNEELNDLFSKYRISEALMAVYRLFWDEFSSWYLEMIK PAYGSPIDSHTYNATLRYFETLLNMLHPFMPFITEELWQHIAERKEGESIMCQCLKIDAP TEADNKLNAKMELVKQIISGVRTMRSQERLSPKYKFDLRYMGKENIWQDYVSLIEKMANV GFTSEPPKTNGTYTFMVETHEFVIHVGGLIDFETKISKLEAQLAHLEGFLAGIKKKLSNE RFVANAPEAVVALERKKQSDSEEKIAALKASIKELKTDKEKFGNFPFPDHVKA >gi|283510570|gb|ACQH01000049.1| GENE 82 97242 - 98447 1015 401 aa, chain + ## HITS:1 COG:no KEGG:BT_0727 NR:ns ## KEGG: BT_0727 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 401 15 393 393 286 39.0 1e-75 MQKKLTSLLILLATLFIVSCHEDVPAPYPIPQGLPDKTIFVYMPWSAARNSSTGSLYDNF LQNIKDIEAAIEAEKGLGRNKLMVFIATSANSAVLMEVKCAPYGTCRRDTLQRYDQHNMP AYTTTNGLASIFNEVKAQAKAKQYALIVGCHGTGWLFSEGKSRARTRYFGGTDRYFQTNI PTLAAAIEQAKMPMQFVMFDDCYMSNIEVAYEMRHATNYLIGCCSEIMAYGMPYKNIWKY LIQQKPNYQAVVNEFHQFYSNYQWPYGNIGVTDCSKVEEVVAHMKTINAATANNANLIDW EDVQRLDGYEKTIFFDMGDYVNKLCTTPETQALARNLHTALAQLVPYKSTTPYIYTALEE LSYNRIRVNAYSGITISDPTQSDFEKALTTKKATGWWKATH >gi|283510570|gb|ACQH01000049.1| GENE 83 100257 - 100454 232 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928356|ref|ZP_06422203.1| ## NR: gi|288928356|ref|ZP_06422203.1| hypothetical protein HMPREF0670_01097 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 65 39 103 103 97 100.0 2e-19 MCSYLKVLEKRFVEQGGIKERMHAARTGYRQAQDELLERLKTENEQLKAEVQRLKAELER RKLEE >gi|283510570|gb|ACQH01000049.1| GENE 84 100645 - 101295 766 216 aa, chain + ## HITS:1 COG:no KEGG:BF2047 NR:ns ## KEGG: BF2047 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 65 213 50 199 199 89 36.0 1e-16 MNKKMIVGMAIAAMSLMGNKAYAQFDFGKILQNGADILSNGGTTDDLLSGITSIFSDKKV ATIDDLVGEWAYTEPAVVFMSENLLKKAGGKLASSAIEKTIETQLNKVGITKGAMKMTFT RNGRFTQTIAGRRLRGTFTIKGKEVVLKYAGEIKQLVGTTQVDGNDLLIVMDASKMLTYL KAIGSISGNASLKTATSLLGSMDGMLCGLRLNRASK >gi|283510570|gb|ACQH01000049.1| GENE 85 101483 - 102085 596 200 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|212691832|ref|ZP_03299960.1| hypothetical protein BACDOR_01327 [Bacteroides dorei DSM 17855] # 1 196 1 196 198 234 57 2e-60 MREINVAGKSRSNLGKKASKEMRKEGLIPCNLYGEKKNADGLPEALSFAVPMADLRKVVY TPHIYVVNLDIDGQKHTAIMKELQFHPVTDALLHVDFYEINEEKSIAIGIPVNLVGLAQG VRDGGRLSLSIRKLMVKAPYKQIPEKLDIDVTALTIGKSIKAGELSFEGLEVVTPKDVIV CTVKMTRAAAAAAAAAAAGK >gi|283510570|gb|ACQH01000049.1| GENE 86 102117 - 102686 451 189 aa, chain + ## HITS:1 COG:BS_spoVC KEGG:ns NR:ns ## COG: BS_spoVC COG0193 # Protein_GI_number: 16077121 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptidyl-tRNA hydrolase # Organism: Bacillus subtilis # 5 187 3 187 188 147 41.0 1e-35 MEKYLIVGLGNPGREYEATRHNTGFMILDAFAKASNIVFEDKRYGFVSSAMVKGRKLVLL KPSTFMNLSGNAVRYWLDKENIPQEHLLVVVDDLALPLGTLRLKGNGSNGGHNGLGHIQQ LIGPQYARLRVGIGNEFPRGGQVDWVLGEYDDEELKALEPAIERSIEIIKSFALAGITVT MNQFNKKNA >gi|283510570|gb|ACQH01000049.1| GENE 87 102696 - 103109 382 137 aa, chain + ## HITS:1 COG:Cgl2072 KEGG:ns NR:ns ## COG: Cgl2072 COG1188 # Protein_GI_number: 19553322 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) # Organism: Corynebacterium glutamicum # 5 121 10 122 126 85 41.0 2e-17 
MNEARIDKWLWAARIFKTRSIAADACKNGRVTLNGVNLKPSRTIKEGDVVSVKKPPVTYS FKVLKTIEQRVGAKLLPEIYENVTDPKQYELLQMSRISGFIDRAHGTGRPTKKERRALDA FVDPVMFGFDDEDEGDE >gi|283510570|gb|ACQH01000049.1| GENE 88 103250 - 103825 658 191 aa, chain + ## HITS:1 COG:MA0316 KEGG:ns NR:ns ## COG: MA0316 COG0299 # Protein_GI_number: 20089214 # Func_class: F Nucleotide transport and metabolism # Function: Folate-dependent phosphoribosylglycinamide formyltransferase PurN # Organism: Methanosarcina acetivorans str.C2A # 3 187 10 199 204 126 37.0 3e-29 MTNIAIFVSGSGTNCENIIRHFADDANVHIALVLSNKPDAYALVRAKNHHVPTAVLTKAE FNDETKVMDLLNAHEVNFIVLAGFLLMIPPFLVSAFHQRMLNIHPALLPKFGGKGMYGHH VHEAVKAAGEKETGITIHWVSDDCDAGEIVAQYSTPLTDSDTPDDIAEKVHLLEQAHFPE VIAQVLERANL >gi|283510570|gb|ACQH01000049.1| GENE 89 103849 - 104580 545 243 aa, chain + ## HITS:1 COG:STM4319 KEGG:ns NR:ns ## COG: STM4319 COG0671 # Protein_GI_number: 16767569 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Salmonella typhimurium LT2 # 1 237 1 233 250 153 38.0 3e-37 MKRTAFIFVALLGTLCLNAQETSKPNRWPSFLNKTDFADATKFIPAPPDTSSIAYLNDFN RYQWGKSMRNTARGKQAIYDASVDIDSVLSGFSEAFGMPISKQTTPELHYLMERVENDGS LSVRSAKKKYMRKRPYVQFGEGTAVPHEEEELRHTGSFPSGHTARGWAMALVLAEINPER QDEILVRGYEYGESRVIAGFHYQSDVDAARLAASAAVARLHADEAFRKQLDKAKKEFARL KKK >gi|283510570|gb|ACQH01000049.1| GENE 90 104725 - 105108 528 127 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928363|ref|ZP_06422210.1| ## NR: gi|288928363|ref|ZP_06422210.1| hypothetical protein HMPREF0670_01104 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 127 1 127 127 263 100.0 3e-69 MKLQQLLQRYQFDDLMPVVADMFPGTGKFRASLEQGYNILMSLQPVASKKNIRYRIMPDP NSDASFMGAEDAAFDTTWEVCLGKEVVKEKGVDLTDIELAANCLVNVCLIGRHPKNFDEA YRALSKA >gi|283510570|gb|ACQH01000049.1| GENE 91 105156 - 105653 562 165 aa, chain - ## HITS:1 COG:CAC0738 KEGG:ns NR:ns ## COG: CAC0738 COG0847 # Protein_GI_number: 15894025 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases # Organism: Clostridium acetobutylicum # 4 164 3 160 306 125 40.0 4e-29 MKDFAAIDFETANFKRESVCSVGLVLVEEGKIVDSYYHLIQPTPNYFEERCVTVHGLTTA DTNGQPTFPDVWAEIAPLIRNKPLVAHNKAFDESCLKAAFAAYDMEYPDYPFYCTLLTAR RKLKESGLADFRLPTVASHCGYELDMHHHALADAEACAAIALKIL >gi|283510570|gb|ACQH01000049.1| GENE 92 105753 - 106229 491 158 aa, chain - ## HITS:1 COG:SA1586 KEGG:ns NR:ns ## COG: SA1586 COG0054 # Protein_GI_number: 15927342 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase beta-chain # Organism: Staphylococcus aureus N315 # 21 158 11 148 154 136 47.0 1e-32 MATALHNLSEYDLSKIPDASNMCIGIVVSEWNPEVTGALLNGAVQTLERHGALPENIHVK TVPGSFELVYGAHQMVLNGGYDAVIILGCVIRGETPHFDYICQGVTYGIAKLNATSEIPV VYGLLTTNDQQQALDRCGGKYGNKGDECAIVAIKMAKF >gi|283510570|gb|ACQH01000049.1| GENE 93 106229 - 106933 850 234 aa, chain - ## HITS:1 COG:no KEGG:PRU_0202 NR:ns ## KEGG: PRU_0202 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 19 234 40 246 246 202 60.0 9e-51 MANKNKQGVAETFDASQSSDAFFSKNKKVILGSVAALVLIVAGFFLYKSLVAGPREEKAS TALAKGQEYFNQEQFDKALNGDGAGYVGFARIADDYSSTDAGNLANLYAGLCNANLDKWE AAKKFLDAYSPASDAMVSPAAVAALGNAYAHLNDLDKAVDNLKKAAKLADGKAADGANST LSPLFLIQAGEILESQGKKEEALAIYQDIKKKYVNSILVQSSEIDKYVERASTK >gi|283510570|gb|ACQH01000049.1| GENE 94 106815 - 107063 87 82 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRTKAATEPSITFLFFEKNASELCDASKVSATPCLFLFAIIYLNVVIILRSGAKLLNFSH IVKHISYFFRFSYPNGVTLQES >gi|283510570|gb|ACQH01000049.1| GENE 95 107079 - 108194 941 371 aa, chain + ## HITS:1 COG:SA0004 
KEGG:ns NR:ns ## COG: SA0004 COG1195 # Protein_GI_number: 15925709 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair ATPase (RecF pathway) # Organism: Staphylococcus aureus N315 # 1 365 1 370 370 175 29.0 1e-43 MILNSISIINYKNLRAVNLQLSPKTNCFIGHNGSGKTNFLDALYYLSFCRSAYNPIDSQL ITHEQDFFVLEGDYISEGGDAENVYCGMKRGSKKQFKRNKKAYKRLSQHIGLIPLVLVSP ADAALIDGGSEERRRLMDMVIAQYDTTYIEALTRCNKALQQRNALLRMEAEPDLALLELW EEEMAAQGKVVYAKRAAFVEEFIPVFQSIHERISGGSERVSLRYISHGQRGDLLDVIRKD RHKDRAVGYSLHGVHRDDLEMLIDGYQLKREGSQGQSKTYALAMKLAQFDFLKRTASKTT PLLLLDDIFDKLDSQRVERIVELVSGDSYGQIFITDTNREHLDRILQSGTTDYKLFYVEN GEITEKGGDNV >gi|283510570|gb|ACQH01000049.1| GENE 96 108187 - 108477 362 96 aa, chain + ## HITS:1 COG:no KEGG:PRU_0200 NR:ns ## KEGG: PRU_0200 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 94 1 94 96 100 51.0 1e-20 MFRRDVKPLGEILMKLLRDEGLETPLQQKRLIDAWETVAGPMVARYTTEKFIKNQTLFVK ITNPALRQDLAMMRTQLMRRLNEQVGALVITEIKVF >gi|283510570|gb|ACQH01000049.1| GENE 97 108677 - 110050 1000 457 aa, chain + ## HITS:1 COG:no KEGG:RPC_1791 NR:ns ## KEGG: RPC_1791 # Name: not_defined # Def: patatin # Organism: R.palustris_BisB18 # Pathway: not_defined # 9 347 23 322 386 102 29.0 2e-20 MNKDKKKIGLALSGGGYRAAAYHVGTLRALHKLGILNKVDVISSVSGGSITTAYYALHKD DYQTFEQGFIARLQRGVLWSSFLYVGVLGVLVLAFAVVLGYITSLVMGTLWPHHPIWAGS VAALVGVCALALALLGILKNSFTALPISKLISKLYDKVFFERKTLSDLPQTPLICINSTN LATHLPFTFSRGMMGEYAYRINGQSMFDANGFPLSRAVMASSCVPYGFTPITIGAAFVRG KYEDCEQKPEPPKLIDGGVYDNQGAHKLSQDKSRFRCEYIVVSDAGNGQVSAAGTTHFFN LAMNTISMMMNRIKKMQRSDNLYEGFANKEHFAYVPLEWDCSERSLHGFVNNLRNGNVHP DVWQAHGISEAEVASLKAKGAQRTEAEKAILQHIKASVGWRKFEESVPSSDKIDLARRVG TSLVALSAEQIGALIAHSAWLAELQTRLYLPMLVEHV >gi|283510570|gb|ACQH01000049.1| GENE 98 110279 - 110869 295 196 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928370|ref|ZP_06422217.1| ## NR: gi|288928370|ref|ZP_06422217.1| hypothetical protein HMPREF0670_01111 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 196 1 196 196 356 100.0 5e-97 MAKKKRSNIRCFEKLDTADANKRFLEYRKKEEQEFPEEQFMYPPCECKYTDSAGNIITVT KTEVGYVKTKQKKGSLFGEYRCYFMNGNIQESGEYYYQSFNCGIWREYDVKGYLIKETDM DKPYKGYSWQDVLLFVKKHNINLYDERTDIDRYVDESNIPRWNISWFSMRKKVSRYTIID ARNGRIITDGIAHGMK >gi|283510570|gb|ACQH01000049.1| GENE 99 111223 - 111438 67 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927699|ref|ZP_06421546.1| ## NR: gi|288927699|ref|ZP_06421546.1| hypothetical protein HMPREF0670_00440 [Prevotella sp. oral taxon 317 str. F0108] # 1 63 1 64 65 67 64.0 2e-10 MAVLLNKYSFCRMHMLTLFCIKTNFRENRFFAASGRLVDEKGTYNVKFLTKNQTKPPCGK HARVGNGTENM Prediction of potential genes in microbial genomes Time: Sat May 28 01:04:11 2011 Seq name: gi|283510569|gb|ACQH01000050.1| Prevotella sp. oral taxon 317 str. F0108 cont2.50, whole genome shotgun sequence Length of sequence - 35803 bp Number of predicted genes - 28, with homology - 27 Number of transcription units - 12, operones - 9 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 143 - 178 4.0 1 1 Op 1 . - CDS 399 - 929 685 ## COG0212 5-formyltetrahydrofolate cyclo-ligase 2 1 Op 2 . - CDS 929 - 2593 1888 ## COG0793 Periplasmic protease 3 1 Op 3 . - CDS 2583 - 3023 594 ## COG2131 Deoxycytidylate deaminase 4 1 Op 4 . - CDS 3098 - 5185 2527 ## COG0339 Zn-dependent oligopeptidases - Prom 5333 - 5392 4.0 + Prom 5991 - 6050 5.2 5 2 Op 1 . + CDS 6109 - 7461 1438 ## COG0541 Signal recognition particle GTPase 6 2 Op 2 . + CDS 7476 - 8357 957 ## COG0190 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase 7 2 Op 3 . + CDS 8415 - 8594 161 ## PRU_1151 hypothetical protein 8 3 Op 1 . 
+ CDS 8723 - 8974 254 ## PGN_1752 putative ferredoxin 4Fe-4S 9 3 Op 2 23/0.000 + CDS 8961 - 10043 1270 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit 10 3 Op 3 22/0.000 + CDS 10040 - 10807 691 ## COG1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit 11 3 Op 4 . + CDS 10804 - 11346 678 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit + Term 11458 - 11501 4.3 + Prom 12595 - 12654 1.7 12 4 Op 1 . + CDS 12735 - 14906 2165 ## PRU_0795 OMP85 family outer membrane protein + Prom 14911 - 14970 3.3 13 4 Op 2 . + CDS 14990 - 15640 595 ## COG0177 Predicted EndoIII-related endonuclease + Term 15661 - 15701 3.2 + Prom 15786 - 15845 2.6 14 5 Tu 1 . + CDS 15892 - 16593 599 ## COG5587 Uncharacterized conserved protein + Term 16769 - 16805 1.5 - Term 16734 - 16769 0.6 15 6 Op 1 . - CDS 16898 - 18235 1296 ## COG3669 Alpha-L-fucosidase 16 6 Op 2 . - CDS 18303 - 19217 844 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 17 6 Op 3 . - CDS 19308 - 20651 1388 ## COG0513 Superfamily II DNA and RNA helicases 18 7 Op 1 1/0.000 - CDS 20757 - 21956 1080 ## COG0795 Predicted permeases 19 7 Op 2 . - CDS 21979 - 23106 1215 ## COG0343 Queuine/archaeosine tRNA-ribosyltransferase 20 7 Op 3 . - CDS 23114 - 25579 2969 ## COG0466 ATP-dependent Lon protease, bacterial type - Prom 25748 - 25807 4.9 + Prom 25534 - 25593 4.9 21 8 Op 1 . + CDS 25794 - 26717 889 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 22 8 Op 2 . + CDS 26714 - 27679 1017 ## PRU_1447 endonuclease/exonuclease/phosphatase family protein - Term 29013 - 29066 -0.5 23 9 Op 1 . - CDS 29095 - 29733 641 ## PRU_1790 hypothetical protein 24 9 Op 2 . - CDS 29726 - 30403 694 ## PRU_1791 hypothetical protein 25 10 Tu 1 . 
+ CDS 30512 - 31090 767 ## PRU_1792 hypothetical protein + Term 31259 - 31311 14.2 - Term 31247 - 31297 12.2 26 11 Tu 1 . - CDS 31495 - 33753 1584 ## COG3256 Nitric oxide reductase large subunit - Prom 33862 - 33921 3.9 27 12 Op 1 . - CDS 34341 - 34541 80 ## 28 12 Op 2 . - CDS 34492 - 35313 313 ## gi|288928397|ref|ZP_06422244.1| hypothetical protein HMPREF0670_01138 Predicted protein(s) >gi|283510569|gb|ACQH01000050.1| GENE 1 399 - 929 685 176 aa, chain - ## HITS:1 COG:CAC1090 KEGG:ns NR:ns ## COG: CAC1090 COG0212 # Protein_GI_number: 15894375 # Func_class: H Coenzyme transport and metabolism # Function: 5-formyltetrahydrofolate cyclo-ligase # Organism: Clostridium acetobutylicum # 3 176 5 180 182 87 31.0 9e-18 MDKKALRQEIRLRKRQFNGEQLRQLSLAVMQRLLAHPRVVQANTVMFYHSLPDEVYTHEA VDQLVKMGKRVLLPVVIDEQHLEIRQYQGPQDLKLGAMNILEPAGKPFTAYQEIETILVP GMSFDPHGNRLGRGKGYYDRFLAQVPQAYKIGVCFDFQKVDEVPIDDNDIKMNEVV >gi|283510569|gb|ACQH01000050.1| GENE 2 929 - 2593 1888 554 aa, chain - ## HITS:1 COG:aq_797 KEGG:ns NR:ns ## COG: aq_797 COG0793 # Protein_GI_number: 15606169 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Aquifex aeolicus # 5 358 2 346 408 226 36.0 1e-58 MNPNRRNRYMPLLMAVCVAFGIVIGTFYANHFSGNRLNIINSGSNRLNNLLHIIDDQYVD NVNIDSLVDKAIPQILSDLDPHSVYITAKDMQAANDELKGSFSGVGIEFNIREDTLHVQN VIKNAPAERAGLLAGDKVVSIDGKPFVGKTVSNDEAMRRLKGPKDTKVRIGVLRYGHKKP LEFTVTRGDIPQKSITATYMLDDDTGYIRIKNFGERTYPDMLIALAKLSQEGCKNLVIDL RDNTGGFLQSAVQMANEFLPKNKLIVYTQGRKSQRQDYVSDGKGSYQRMPLVVLINEGSA SSSEIFAGAMQDNDRATVIGRRSFGKGLVQQQMGFTDGSLIRLTIARYYTPSGRCIQKPY ATGDAADYEQDIYSRYQHGEFFSQDSIKHTGPAYHTGIGRLVYGGGGITPDIFVGEDTIG VTSYFKQAAMSGLILQFAYNYTDNNRQKLAKYKTLKELTDYLGKQNVVEQFASYADKHGL QRRNLMIRKSHKMLSRFIISRIVYNMLDDEAWTQYLNEDDRVITAALKVFKNNAAFPKLT QSDKGKTAPQPKRK >gi|283510569|gb|ACQH01000050.1| GENE 3 2583 - 3023 594 146 aa, chain - ## HITS:1 COG:AF1764 KEGG:ns NR:ns ## COG: AF1764 COG2131 # Protein_GI_number: 11499353 # Func_class: F Nucleotide transport and metabolism # Function: 
Deoxycytidylate deaminase # Organism: Archaeoglobus fulgidus # 6 139 2 146 157 111 41.0 4e-25 MCTDNKQTILDLRYLRMARIWAENSYCVRRKVGALVVKDKMIISDGYNGTPSGFENVCED DNNVTKPYVLHAEANAITKLARSSNNSDGSTLYVTAAPCIECSKLIIQSGIKRVVYGEKY RLEEGIELLRKANIEVIYLNPEQNEP >gi|283510569|gb|ACQH01000050.1| GENE 4 3098 - 5185 2527 695 aa, chain - ## HITS:1 COG:XF1944 KEGG:ns NR:ns ## COG: XF1944 COG0339 # Protein_GI_number: 15838538 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent oligopeptidases # Organism: Xylella fastidiosa 9a5c # 13 691 35 715 716 459 38.0 1e-129 MNENLKEESKNHNNPFFGEYKTPHETVPFNLIRTEHYEEAFLEGIRRDDEEIEKLINDPE TPTFENTIIRVDNENGDHYYDLLNRVSNVFSCMLSAETNDELDALAQKMSPILTQHANDV RLNKRLFERIKYVYENHRELDAEEQMLLENSYDGFVRSGALLDDEGKEQLRKLTEEAGVL SLQFSQNLLKENKAFTLHITNEEELDGLPDMARDAAALAAKELGKEGWVFTLDFPSYSPF MTYSSQRNLRKQLYMARNTECTHQNDTNNLEICKRLVNLRRELAQLLGYDTYADYVLKHR MATNIDNVYRLLNDLIKAYKPTAEAELKEMESMAKEMEGDDFELEPWDTAYYSHKLQMQK YNLDSEMLRPYFELSKVIDGVFGLANRLYGITFKPNNDIPVYHPDVKAYEVFDRDGSYLA VFYADFFPRKGKQGGAWMTEFQGQWVDRKGNNVRPHVSVVMNFTKPTAEKPALLTLGEVE TFLHEFGHSLHGMFANSRFESLSGTNVWWDFVELPSQFMENFAIEKEFLRTFAFHYETGE PIPDELINRIVKSRNFMAASGCLRQVRFGLLDMAYYTQREAFTADIIPFEKEAWKEAIIM KQLPDTCMTTQFSHIMAGGYAAGYYSYKWAEVLDADAFSMFKKNGIFDQATAQSFRDNIL SRGGTEHPMVLYKRFRGGEPTIDALLERDGIVANG >gi|283510569|gb|ACQH01000050.1| GENE 5 6109 - 7461 1438 450 aa, chain + ## HITS:1 COG:FN1393 KEGG:ns NR:ns ## COG: FN1393 COG0541 # Protein_GI_number: 19704725 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal recognition particle GTPase # Organism: Fusobacterium nucleatum # 1 438 1 437 444 441 55.0 1e-123 MFENLSDRLERSFKIIKGEGKITEINVAETLKDVRRALLDADVNYKVAKTFTDTVKQKAL GMNVLTAVKPGQLMVKIVHDELAALMGGETVGLQLTNRPAIILMSGLQGSGKTTFSGKLA NLLKGKQHKNPLLVACDVYRPAAIEQLRVVAQQVGVPVYSEPDNKNVVEIANNAIREAKA KGNDVVIVDTAGRLAVDEEMMDEIYRLKQALNPDETLFVVDSMTGQDAVNTAREFNDRLD IDGVVLTKLDGDTRGGAALSIRTVVTKPIKFVGTGEKMEAIDVFHPERMADRILGMGDIV SLVERAQEQFDEEEAKRLQKKIQKNKFDFNDFLKQIEQIKKMGNLKDLASMIPGVGKAIK 
DVDIDDNAFKGIEAIIKSMTPKERTNPEILNNSRRQRIAKGSGTNIQEVNRLIKQFDQTR KMMKMMTGTNMAKMAGMASKMKGMPGMPKL >gi|283510569|gb|ACQH01000050.1| GENE 6 7476 - 8357 957 293 aa, chain + ## HITS:1 COG:SP0825 KEGG:ns NR:ns ## COG: SP0825 COG0190 # Protein_GI_number: 15900713 # Func_class: H Coenzyme transport and metabolism # Function: 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase # Organism: Streptococcus pneumoniae TIGR4 # 1 287 1 278 285 271 49.0 1e-72 MMQLIDGKATATAIKAEIAEEVRQIVANGGKQPHLAAVLVGHDGGSETYVKNKVIACEQC GFKSTLIRFEENVTEDELLACVDKLNKDEDVDGFIVQLPLPKHIDEQKIVEAVDYRKDVD GFHPINVGRMAIGLPCFISATPLGILTLLQRYGIATSGKKCVVLGRSNIVGKPMAQLMMQ KQYGDATVTVCHSHSPSLKDECREADIIVAAIGRPDFVTADMVKEGAVVIDVGTTRVEDP SRKGGFRLSGDVKFDEVAPKCSFITPVPGGVGPMTICSLMKNTLSAGKKEFYK >gi|283510569|gb|ACQH01000050.1| GENE 7 8415 - 8594 161 59 aa, chain + ## HITS:1 COG:no KEGG:PRU_1151 NR:ns ## KEGG: PRU_1151 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 59 1 59 59 68 57.0 8e-11 MNSEAYYEQGNAYRREGNWQKALECYAEAIELDPESPAVHAREMLNSILAFYSKDALNP >gi|283510569|gb|ACQH01000050.1| GENE 8 8723 - 8974 254 83 aa, chain + ## HITS:1 COG:no KEGG:PGN_1752 NR:ns ## KEGG: PGN_1752 # Name: not_defined # Def: putative ferredoxin 4Fe-4S # Organism: P.gingivalis_ATCC33277 # Pathway: Citrate cycle (TCA cycle) [PATH:pgn00020]; Metabolic pathways [PATH:pgn01100] # 1 75 1 75 75 83 61.0 3e-15 MNKIKGAVVINTDRCKGCDLCVVACPYDVLELARKKVNASGYAYAQAVKAEACVGCAACG TVCPDGCITVYRMKVKENSDGRA >gi|283510569|gb|ACQH01000050.1| GENE 9 8961 - 10043 1270 360 aa, chain + ## HITS:1 COG:TM1759 KEGG:ns NR:ns ## COG: TM1759 COG0674 # Protein_GI_number: 15644505 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Thermotoga maritima # 8 357 8 351 356 370 55.0 1e-102 MAEREVRLMKGNEAIAHAAVRAGIDGYFGYPITPQSEIIETLAQLRPWETTGMVVLQAES 
EVASINMLYGGGGSGKRVMTSSSSPGVALMQEGISYMAGAEIPGLIVNVQRGGPGLGTIQ PSQGDYFQSTRGGGNGDYNVIVLAPNSVQEMADFVDLAMNLAFKYRNPAMILSDGVIGQM MEKVVLPPMKPRRTEEEIIKECPWASTGRTRGREPNIITSLELRPELMEVKNRALQAKYE EIRNNEVRFETLLTDDAEYVFVAFGSAARLAEKAVELAREEGIRVGLFRPITLWPFPERQ LDSLAQGKKGLLVVEMNAGQMVQDVRLAVEGTVPVAHFGRQGGIVPEPEEIVKALKEKLQ >gi|283510569|gb|ACQH01000050.1| GENE 10 10040 - 10807 691 255 aa, chain + ## HITS:1 COG:MA2909_1 KEGG:ns NR:ns ## COG: MA2909_1 COG1013 # Protein_GI_number: 20091730 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit # Organism: Methanosarcina acetivorans str.C2A # 1 251 3 262 296 234 42.0 8e-62 MTTDIISPENLVYAKPALLNDTTMHYCPGCSHGVVHKLIAEVIAEMGMEEKSVGISPVGC AVFAYRYIDIDWQEAPHGRAPALAAGIKRLWPDRLVFTYQGDGDLACIGTAETLHALNRG DNITIIYVNNAIYGMTGGQMSATTLLEQPTATCPTGRTPELHGYPIDLTNLAVQLEGTCF VSRQSVDTVASINKAKRAIRKAFESSMAGKGSSLVEIVSTCSSGWKMSPADANKWMQDNM FDHYKKGDLKDRTKE >gi|283510569|gb|ACQH01000050.1| GENE 11 10804 - 11346 678 180 aa, chain + ## HITS:1 COG:MA2909_2 KEGG:ns NR:ns ## COG: MA2909_2 COG1014 # Protein_GI_number: 20091730 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Methanosarcina acetivorans str.C2A # 7 173 12 177 186 117 39.0 2e-26 MKHEIIIAGFGGQGVLSMGKILAYSGLMEDKEVTWMPAYGPEQRGGTANVTVIVSDERIS SPVLSKYDVAIVLNQPSLDKFEPRVKPGGILIYDGNGIASPPSRTDIKVYRMDAMDKAAE MKNAKVFNMIVLGGLLKVCPVVSTEGLNKALFKTLPERYHNLIPLNMEAVEQGKQIIAEQ >gi|283510569|gb|ACQH01000050.1| GENE 12 12735 - 14906 2165 723 aa, chain + ## HITS:1 COG:no KEGG:PRU_0795 NR:ns ## KEGG: PRU_0795 # Name: not_defined # Def: OMP85 family outer membrane protein # Organism: P.ruminicola # Pathway: not_defined # 11 723 2 703 703 837 56.0 0 MLALLAACSATKFVPDESYLLEKVELKADTKKFNVASLEPYIRQKANSKWFSVFKIPLGA YALSGRDTTRWINRTLRKIGEEPVVYDTLQAQLSCNDLKLALQNMGYMNGEVDLTTRAKG 
KRLTAIYTLHPGTPFFINSVRYDIQDSVIARMLDLDNPARQELKAGDPFTVTRLEEERKR ITRVLLDSGYYRFHKDFILYGADSTASKNQINLMLHLLPYRPNSNAPETLHPRYSIRNVT FSSNDSAAIHLRPGVLRRSTLIKQGDLFSATQLQRTYGNFARMQAVRYTNIRFTEVPDTT LLDCDINISTNKPSSISFQPEGTNTAGDLGAAVSVTYENRNLFRGSELLSIQLRAAYEAI TGLEGYQNKDYQEYGVETKLTFPQFVAPLLSSTFRRRSVATSELSLNWNLQNRPEFHRRV FSMAWRYHWNEPRHHISYRYDFFDINYVYMPWISSRFKADYLDNATSRNAILRYNYEDLF ILKMGFGLAFNDGENALKVNVESAGNMLQAVSKLFRLPQNANGKYTLVNIAYAQYVKFDV DFTRLFQFDERNSLAFHVGLGVAYPYGNSTILPFEKRYFSGGANSVRGWRVRELGPGTFK GTDGRIDFINQTGDMKLDMNLEYRTFLGWKLHGAVFVDAGNIWTLRSYAEQPGGQFRLDR FYRQIALAYGLGFRLNFDYFILRFDLGMKAVNPAYETSREHYPLLYPNFKRDFAFHFAVG LPF >gi|283510569|gb|ACQH01000050.1| GENE 13 14990 - 15640 595 216 aa, chain + ## HITS:1 COG:jhp0532 KEGG:ns NR:ns ## COG: jhp0532 COG0177 # Protein_GI_number: 15611599 # Func_class: L Replication, recombination and repair # Function: Predicted EndoIII-related endonuclease # Organism: Helicobacter pylori J99 # 19 206 22 206 214 194 51.0 1e-49 MTRNERYKYILDYFRAQAPVVTTELEFGSAFQLLVATLLSAQCTDKRINQVTPALFARFP TAEEMAKAEVEEVFEYIKSVSYPNAKANHLVAMARKLVDDFKGEMPSTTAELTTLPGVGR KTANVLQAVWFDKPNMAVDTHVFRVSHRMGLVSKKANTPLKVEQELLRHIPSVDVNKAHH WLLLHGRYVCVSRKPKCEECVFNDICPKLLEGSKLE >gi|283510569|gb|ACQH01000050.1| GENE 14 15892 - 16593 599 233 aa, chain + ## HITS:1 COG:mll1538 KEGG:ns NR:ns ## COG: mll1538 COG5587 # Protein_GI_number: 13471537 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 5 224 10 218 229 64 25.0 2e-10 MNVKRILNYLAELSANNNRDWYQAHKDEYTACRADFEEGVRQALSAISLFDSEISHLQVK DCVYRFNRDTRFSADKSPYKNHFGAYMCAKGKKALRGGYYMHLEQGHCLLAVGGYWLPTN ILTSCRNEIMGNIDTWRGIVENKAFVDLFGKPNETKWGEGTRGFGLDSLKSAPSGFPRDY EFIQYLRMKDYCGWVKVPDNFFEGDAWIDEMTRIFKVGKPMMDFMNNVIDDYE >gi|283510569|gb|ACQH01000050.1| GENE 15 16898 - 18235 1296 445 aa, chain - ## HITS:1 COG:TM0306 KEGG:ns NR:ns ## COG: TM0306 COG3669 # Protein_GI_number: 15643075 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Thermotoga 
maritima # 34 364 7 358 449 132 29.0 1e-30 MSKLTYAMKKTILLAFALLPFVGAVAQNNNAPSKYTPTAQNLAAREAFQDQKFGVFLHWG LYSMIGESEWVMTNRNINYKEYPKLAQTFYPSNFNADEWVAAIKAAGAKYVTITTRHHDG FSLFKTATSTYNSVDGSPFKRDIIKEMAEACARQGIKLHLYYSHLDWYRTDYPVGRTGKG TGRPKDAANWKSYYNFMNTQLTELLTNYGPVGAIWFDGWWDHDSDPTPFDWELPQQYAMI HKLQPQCLIANNHHQVPFAGEDIQIFERDLPGENKAGLSGQSISHLPLESCQTINEHWGY SLVDSNYKSGKELIQMLVRAAGKNANLLLNVGPEPGGELPSVAVARLKEIGQWLSKYGNT IYGTRGGLVAPHHWGVSTQRGNKLYIHILDLQESGLYLPLGKRMPKSATEFATGKRVNFT KHADGITLHLDKVPTDIDYVVELTM >gi|283510569|gb|ACQH01000050.1| GENE 16 18303 - 19217 844 304 aa, chain - ## HITS:1 COG:CC2427 KEGG:ns NR:ns ## COG: CC2427 COG0463 # Protein_GI_number: 16126666 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Caulobacter vibrioides # 2 206 19 219 309 85 34.0 2e-16 MKTQLSILIPTYNYVCLPLVRELHRQASEMGGLEFEIVVAEDGSDQPDNIARNAEITALS HCKHIVREENVGRAAIRNHLADMANMPWLLFIDSDMRVVSPHFVERYLQPPDEWGVVYGG NTTKGGAWTDERLLRVRYEQAAERQFTPQQRAKHPYNHLTTSNILVSKRVMQAVPFDSRF LTYGYEDVFWGMSLAQKGIEVAHINNPIGFNYYDSNVVFVGKTIEGLHTLYTFRTELADF SPIIRLERRLRRWRLDGAVRSVLNKLMPLLHRRIIGFKPSLVAFKLFKLCTYLHIAHTDA ASKA >gi|283510569|gb|ACQH01000050.1| GENE 17 19308 - 20651 1388 447 aa, chain - ## HITS:1 COG:STM3280 KEGG:ns NR:ns ## COG: STM3280 COG0513 # Protein_GI_number: 16766578 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Salmonella typhimurium LT2 # 2 392 24 411 646 292 40.0 1e-78 MTFEELDLNDNVLDALYDMRFETCTPIQEHCIPQILEGKDILGVAQTGTGKTAAYLLPIM SMLDDGGFPHDAINCIIMSPTRELAQQIDQAMQGFGYYLDSVSSLPIYGGNDGNRYDQEI KSLRLGADIIIATPGRFISHIQLGNVDLSKVSFFVLDEADRMLDMGFSEDIMTIAKQLPP TCQTIMFSATMPPKIEQLAKTLLKNPVEIKLAVSKPAEKIAQKAYLCYEPQKLKVLEDIF KAGHLNRVIIFSGKKQKVKEINRALVRMKINSDEMHSDLSQEERDQVMFKFKSGATDVLV ATDILSRGIDIDDITMVINYDVPHDVEDYVHRIGRTARAERDGVAITLISDQDVYYFQQI ERFLEKEIEKVPLPEGIGDGPEYKTASRAPQKGGARGSRGGRQKYGQPRGGRNRYAKPTN 
QHAQGGKSGRSRGSHNRRKPNKPAGNA >gi|283510569|gb|ACQH01000050.1| GENE 18 20757 - 21956 1080 399 aa, chain - ## HITS:1 COG:FN1030 KEGG:ns NR:ns ## COG: FN1030 COG0795 # Protein_GI_number: 19704365 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Fusobacterium nucleatum # 39 396 2 361 363 90 25.0 6e-18 MQIRRSIKLLRLRCMRQKWVAWLVARLAFLCVLLPTRYLKILDWYIIRKFIGTYVFSILL IISIAIVFDVNENLAKFTQFHAPLKAIVFDYYANFVPYFANLFSPLFVFIAVIFFTSKLA GTSEIISMLAAGVSFNRLMRPYMVSCLLISCLSFYLSGYVIPHGTVIRQNFESMYKNSKK NTSAENVQLQVDRGVIAYIQHYDDKTKRGYGFSLDKFENKKLVSHMTASEIQYDTISDSK YHWQVSNWRIREMKALREKITYGDKRDTLIMMEPADLVFSKGQQETFTNPELSSYISKQI GRGSSNVVQYEVEYHKRIAASFASFILTTIGLSLSSRKRKGGMGLYLGIGLALSFGYILL QTVSATFAVNANTPPMLAAWIPNLLFAVVAYFCYRSAPN >gi|283510569|gb|ACQH01000050.1| GENE 19 21979 - 23106 1215 375 aa, chain - ## HITS:1 COG:aq_1308 KEGG:ns NR:ns ## COG: aq_1308 COG0343 # Protein_GI_number: 15606515 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Queuine/archaeosine tRNA-ribosyltransferase # Organism: Aquifex aeolicus # 2 375 3 378 378 379 50.0 1e-105 MKFELQRTDTASAARTGIITTDHGQIKTPVFMPVGTCGSVKGVHFSELRQQVKAQIILGN TYHLYLRPGLDVLKKAGGLHKFNTWDRPILTDSGGFQVFSLTGIRRLTENGCEFRSHIDG SKHVFTPESVMDTERIIGADIMMAFDECPPGNSDYAYAKKSLSLTQRWLDRCFKRFNETE PLYGYQQSLFPIVQGCTYKDLRQEAAKHVADKGADGNAIGGLAVGEPTEVMYEMIEVVNE ILPKDKPRYLMGVGTPQNILEAIERGVDMFDCVMPTRNGRNAMLFTYQGTMNMRNKKWED DFSPIDPDGCEIDRIHSKAYLHHLFKAQELLAMQIASIHNLAFYLRLTADARMHIEQGDF VQWKSSVIDNLNRRV >gi|283510569|gb|ACQH01000050.1| GENE 20 23114 - 25579 2969 821 aa, chain - ## HITS:1 COG:lon KEGG:ns NR:ns ## COG: lon COG0466 # Protein_GI_number: 16128424 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Escherichia coli K12 # 31 804 11 774 784 617 43.0 1e-176 MDRRNNTSIQMIADYEGDISTLFDTPTPDVVPVLATRNLVLFPGVVTPILVGRAASVSLV NKLKKDPEQVFAVFCQKNADIEEPGKKDLFPIGVYAKLVRVLEMSGPGNNITAIVQGLGR 
CQLEDVVKRKPYLVAQTTKKPEIFIDEDTSEYHTAMEDLRSQTVEFIKMNEEMPDEAQFA IANIHHDVIATNFICSNMPFDLNDKMHMLEADNSLERVYIALKTLNKEMQLLQIKQTIRS KTREDIDEQQREYFLQQQIKNIKEELGNGEGSPEHRELEEKAKGKKWSEEVSKIFYKELD KLDTLNPQSPDYSIQLTYLQTMVGLPWNEYTKDDLSIKRAQKVLDHDHYGMEKVKDRILE HLAVLQLSGDLKSPIICLYGPPGVGKTSLGKSIATAMNRKYVRMSLGGLHDESEIRGHRR TYVGAMPGRIIKSIQKAGSSNPVFILDEIDKVTQNTLHGDPSSALLEVLDPEQNNAFHDN YLDVDFDLSRVLFIATANDLNTIPRPLLDRMELIEVSGYITEEKIEIAKRHLVPRELQNT GLAKNATEKPNFNKAALEKIIEQYTRESGVRQLEKQIDKALRKMAYLRALNGELPFAKIT PAEIEGLLGKPPYYRDIYQGNDYAGVVTGLAWTSVGGEILFIETSLSKGRGAKLTLTGNL GDVMKESAVIALEYVKAHVDKLGVDYRLFDQWNIHIHVPEGATPKDGPSAGITIATSIAS ALTQRKVRKNTAMTGEITLRGKVLPVGGIKEKILAAKRAGITDIVMCRANKKDIEEIPEK YLTGVKFHYVENVQDVWDFALTNEVVENAIKFDIEEEKKKD >gi|283510569|gb|ACQH01000050.1| GENE 21 25794 - 26717 889 307 aa, chain + ## HITS:1 COG:BS_ybfH KEGG:ns NR:ns ## COG: BS_ybfH COG0697 # Protein_GI_number: 16077290 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus subtilis # 14 284 10 272 306 151 37.0 2e-36 MTRPTSTHANLLYHVVAFVTVAIWGSTFISTKLLIYAGLSPAHIFTLRFLIAYVLMLGFT LVGPRRAGGRKLLCNNLSDEGCMMLLGITGGSAFFLMQNEALRFSTATNVSLIVCSCPLF TMVLMRLLVHGTRLGTLQVAGTLFSFVGMAAVVLNGKFVLHLSPLGDFLALCACLAWAFY SIIMKRMNERYDSLFITRKVFFYGLLTMLPYYILVPQMPPLHALTRPDVIGHLLFLGVLG SMICFLTWTWVMGKLGAMRATNYIYVNPITTIVFAWLVLHETITPYFIAGTVFILVGLFL SSKSKTT >gi|283510569|gb|ACQH01000050.1| GENE 22 26714 - 27679 1017 321 aa, chain + ## HITS:1 COG:no KEGG:PRU_1447 NR:ns ## KEGG: PRU_1447 # Name: not_defined # Def: endonuclease/exonuclease/phosphatase family protein # Organism: P.ruminicola # Pathway: not_defined # 4 310 3 307 311 377 58.0 1e-103 MTTLGTLLISLFTLVQLNCENLFDCRHDSLKNDQEFLPTAYRHWTPSRYWKKLNRIGQEI VACGGEGKQWRLPDLVALCEVENDSVLHDLTRRSLLRTARYEYVMTHSPDLRGIDVALLY SPFTFALIKSYALRIKPPTGMRPTRDILYAEGVTFGGDTLHVFVLHAPSRAGGEANTRPY RMAVANRLCTAIDSIRQGNPTANIVVTGDFNDYGDAPALRLLAANGLTDVSANAIGINGA 
KGTYRYQGEWGSLDHVFVSQKIHEKGVSCHIFDAPFLMEDEEKYGGKRPWRTYQGPKYLG GFSDHLPVVVTFGMPNAASRL >gi|283510569|gb|ACQH01000050.1| GENE 23 29095 - 29733 641 212 aa, chain - ## HITS:1 COG:no KEGG:PRU_1790 NR:ns ## KEGG: PRU_1790 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 40 202 35 197 197 176 53.0 6e-43 MSKFRSLLSLVPFRFWLYLGLVVVVFVVICVAVCRCETDNSISVVSNDKIDLTPNQIQQI EAIGEWEFLSIDDEELVDTVKYGFFGDDKLVRIYYGTLRLGVNLREARPRWLSMQGDTLC ATLPPVKLLDNNFIDEARTKSFFSTGTWSDADRDRLYYKANAMMRKRCLTPTNMALAEEN ARTQFTSLLRSLGFDKTKVVFDDPQTATKKER >gi|283510569|gb|ACQH01000050.1| GENE 24 29726 - 30403 694 225 aa, chain - ## HITS:1 COG:no KEGG:PRU_1791 NR:ns ## KEGG: PRU_1791 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 25 207 50 231 243 217 52.0 2e-55 MYRTILMALALVLMVLGCSGNKSEPGQEVVDTVPMMVTQLKKCSRLYTTEIRVHKIITHN DEKTIKGSFLGNKFNINLPLSKRQVAIPMEATLKAYVDFGKFSEDNVRRRGDRIEITLPD PQVELTGTRIDNKGIKRYVDFARSNFTDAELTAYERQGRESIVKSIPQMKIMEQARQNAA NVIIPMVAQMGFKEENITVNFRKDFNYGDITRLLLGNTERRIGNE >gi|283510569|gb|ACQH01000050.1| GENE 25 30512 - 31090 767 192 aa, chain + ## HITS:1 COG:no KEGG:PRU_1792 NR:ns ## KEGG: PRU_1792 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 6 174 23 191 191 129 42.0 5e-29 MKQHLLALVTMSVAICLPSCGGKKEKKDIIARKPIVVRATETRAMGDHSESRQVQWIGAS YTIDTQFKADKSLPLINDGEQKYYDNRATVKILRKDGTVFFNRTFSKADFINQIDASYKG GALLGVVYDKCDNDYLYFAASVGSPDKSSDEYVPLVVKVSRFGEVSIKKDATLDTQSEMG NSNTAEEEEEGV >gi|283510569|gb|ACQH01000050.1| GENE 26 31495 - 33753 1584 752 aa, chain - ## HITS:1 COG:RSp1505 KEGG:ns NR:ns ## COG: RSp1505 COG3256 # Protein_GI_number: 17549724 # Func_class: P Inorganic ion transport and metabolism # Function: Nitric oxide reductase large subunit # Organism: Ralstonia solanacearum # 2 731 3 727 756 721 50.0 0 MTPKKLWLTLAAVIIGSFAVLCFYGVEIFREMPPFPNKVVTESGETLFEGQDIKDGQNVW QSIGGQTVGSVWGHGAYVAPDWTADYLHRESELMLAELAQKDGKVYAQLAEADKAKYKVL 
LQEELRKNTYDAQTGVITFSDMRARVAKQLHNYYAKLFLNDPSMAKLRNAYAMREKSVEA VGGLSAEQRFDKMDAFFAWSSWVCVTNRPNSDVSYTNNWPHDPVIGNIAPTSLHLWSGFS VLLLLFCVGILVYYYAQHKEEHVSKAPETDPMRELKPTASMRAVLKYIWVVGALMLVQMI SGVITAHYGVEGDAFFGLPIQDIFPYAVSRSWHVQLAILWIATSWLATGLYIAPAVSGVE PKYQALGVNVLFGALVFVVAGSLAGQWFGVMQKLGLVENFWFGHQGYEYVELGRLWQILL LTGLVLWLFLMIRALIPALKRKDESRHLLILFVIASLAIAFFYAAGLMYGRQTHMAIAEY WRWWVVHLWVEGFFEVFATVVASFLFCRLGLLKIKSATISVLFSTIVFLAGGILGTFHHL YFSATPTAVLALGATFSAMELVPLVLIGMEAYHNYQLSRSTTWIKSYKWPIYCFIAMCFW NFLGAGIFGFSINPPIALYYLQGLNTTAVHGHAALFGVYGILGIGLMLFVLRGLYPDYEW NNKLIGSAFWCINIGLLLMTVVSILPIGILQAHESITNAYWSARSAEFMQQDIMQTLRWL RIPGDSFVALGEVLLVLFIIGLQFGWSRKGKR >gi|283510569|gb|ACQH01000050.1| GENE 27 34341 - 34541 80 66 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRFLGVKEVNTGRLNIEIKYGMFQHPPAWVQLSFICIALHCLSAVIWYVPVILLYYRSLF ILLLTV >gi|283510569|gb|ACQH01000050.1| GENE 28 34492 - 35313 313 273 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928397|ref|ZP_06422244.1| ## NR: gi|288928397|ref|ZP_06422244.1| hypothetical protein HMPREF0670_01138 [Prevotella sp. oral taxon 317 str. F0108] # 1 273 1 273 273 555 100.0 1e-157 MKEMLAYIFLTSEKIDLVYHLTFLKYMDGKVLMRAMRVTTMLMAFTLVGVFMVSCADRRD KLPDILYGDEYRVWYVDFDNDDRWHLCFYFNKDGTWKILDQDPQDSLHKHVERCQFWTER WRLENDTTLVLGETAWKVRVVSPTELKLCLDTITMILQPIVYNPVHKQGKYGEFIGKTVG YFLSYHRFYSNLMVVAGKPGKPRAIRVDYPSDSLTIELAPAEFKHIKLFDEFQEWDTTLA KKENIAQIRVYKNDSLIDEIPWSERSKYREIKY Prediction of potential genes in microbial genomes Time: Sat May 28 01:04:55 2011 Seq name: gi|283510568|gb|ACQH01000051.1| Prevotella sp. oral taxon 317 str. F0108 cont2.51, whole genome shotgun sequence Length of sequence - 9091 bp Number of predicted genes - 8, with homology - 6 Number of transcription units - 8, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 298 - 615 152 ## gi|258649227|ref|ZP_05736696.1| polysaccharide biosynthesis protein - Prom 721 - 780 5.6 - Term 703 - 746 1.2 2 2 Tu 1 . 
- CDS 824 - 1240 168 ## gi|260590896|ref|ZP_05856354.1| conserved hypothetical protein - Prom 1295 - 1354 3.5 3 3 Tu 1 . - CDS 1366 - 1692 166 ## - Prom 1765 - 1824 3.7 + Prom 1564 - 1623 3.5 4 4 Tu 1 . + CDS 1855 - 2055 60 ## + Term 2186 - 2213 0.1 5 5 Tu 1 . - CDS 2349 - 4550 2309 ## COG1472 Beta-glucosidase-related glycosidases - Prom 4596 - 4655 4.4 - Term 5344 - 5407 5.1 6 6 Tu 1 . - CDS 5529 - 5789 73 ## gi|260910377|ref|ZP_05917049.1| conserved hypothetical protein + Prom 6015 - 6074 5.4 7 7 Tu 1 . + CDS 6190 - 7464 841 ## gi|288928400|ref|ZP_06422247.1| hypothetical protein HMPREF0670_01141 + Term 7684 - 7721 2.0 + Prom 7630 - 7689 5.5 8 8 Tu 1 . + CDS 7760 - 8686 748 ## COG1715 Restriction endonuclease Predicted protein(s) >gi|283510568|gb|ACQH01000051.1| GENE 1 298 - 615 152 105 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|258649227|ref|ZP_05736696.1| ## NR: gi|258649227|ref|ZP_05736696.1| polysaccharide biosynthesis protein [Prevotella tannerae ATCC 51259] # 1 89 1 87 87 65 46.0 1e-09 MKRFLAFNIVIGGAILISHLVNYLIVVPCVIGDECGYDVGKCDRGKVFELFFEQSSDTGY HPEPTLFNFFFVTFIGAMVGICFFKIIVRYLLPKTKVSQNCARKT >gi|283510568|gb|ACQH01000051.1| GENE 2 824 - 1240 168 138 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260590896|ref|ZP_05856354.1| ## NR: gi|260590896|ref|ZP_05856354.1| conserved hypothetical protein [Prevotella veroralis F0319] # 20 132 38 150 155 77 30.0 4e-13 MVAILAIVCLMACENTWMCSCNITEKDIVYEPAEKKFYPSAQLKAKVVKAFNGSFSWRRA GYVYRDAFQGIGVIVFEYNPDVGEIDHVQLVRPTNVTSVDQELVNCIKAMRFTDTLKKHR VKPVKFMLQINYFDMTAH >gi|283510568|gb|ACQH01000051.1| GENE 3 1366 - 1692 166 108 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKLPKPFHRQKENYEEGVIYFYHFVDSTYIIVFQGSMMEFSIDKYQNKMVERKGERKTSV GVENNRYWRKDVYSDGVRVYYDHVPKRNKDVYDKVLDEITFRQLQDDE >gi|283510568|gb|ACQH01000051.1| GENE 4 1855 - 2055 60 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIRLCGHYMSYFREERSKFFVDSLFLMNFRSLSMSFNLSNTNSFESGIFLHNSLWVRRES YSQGLQ >gi|283510568|gb|ACQH01000051.1| GENE 5 2349 - 4550 2309 733 aa, chain - ## HITS:1 COG:YPO2803 
KEGG:ns NR:ns ## COG: YPO2803 COG1472 # Protein_GI_number: 16123001 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Yersinia pestis # 13 733 11 714 793 375 33.0 1e-103 MKSKLTMAALSMAFTLMASAQKTVPVYLNPDAPIEARVQDALKRLTVHEKVKLIHAQSMF SSAGVPRLGIRQLNMSDGPHGARAELNWSTWGYAQWTNDSICAFPSLTCLAATWNRDLAG EYGHAISEEFAFRGKDVILGPGTNICRVPLNGRSFEYMGEDPYLAGEMAVPYIKRAQANG VACCLKHFALNNQEADRFTVNVNVGERALREIYLPAFKKCVQQAGVWTIMGAYPLWNGQH LCHNDSLLNKILKKEWSFDGAVISDWEGTHDTWQAAMNGLDIEMGTSTDRKTEDGVQGYD ANYMASPLEKLVLQGRIPMSVLNDKVERVLRTIFRTSMNSHKTIGNQCSPEHYAVCKSVG AEGAVLLKNQDDILPIAPNRYKRILVVGDNAVRNMSEGGGSSELKTQFDISPLDGLRSVY GDKITFVRGYYAGKPLYDAVEPLSADSLKRLKEEALQAARKADLVIFVGGLNKNRRQDCE NGDRENYDLSYGQNELIAELAKVQKNIVVLTFGGNAFATPWINSVKALMHCWYLGSMSGE TLAELVSGKLNPSGKLPITFAKRQADYPCFQFGKRGYPGVDKQVYYDEGIYVGYRWFDTK GVAPQFPFGFGLSYTTFKYGKPTLSAQTMDKNGKITLSVAVTNTGRRAGKETIELFIGDD KCSEERPKKELKNFAKVSLAPGETKTVSFDITSGDLEYWSERTHGFVAEPGTFTAYVCAS ETDVRATAKFELK >gi|283510568|gb|ACQH01000051.1| GENE 6 5529 - 5789 73 86 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260910377|ref|ZP_05917049.1| ## NR: gi|260910377|ref|ZP_05917049.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. F0295] # 1 52 34 85 91 85 80.0 8e-16 MADYFCFSMMSSWLVVSNILHNRSRWPLQAYMFSVSLPTCACMQYNYFCQVFSRKFNIMS AFLSTNRSLGRKKSILAKVCFDAKRG >gi|283510568|gb|ACQH01000051.1| GENE 7 6190 - 7464 841 424 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928400|ref|ZP_06422247.1| ## NR: gi|288928400|ref|ZP_06422247.1| hypothetical protein HMPREF0670_01141 [Prevotella sp. oral taxon 317 str. 
F0108] # 37 424 1 388 388 785 100.0 0 MKIKTVLRKSPLYAVANFGMLLLSSCFPSFNSAEEVMEYLKKEFPDHDIVLSSEYKTSRG LWEDWRIWSFTLSGYPKDTFQVASHIGSYPFPMMKTNKSIISNFYKVVTLRREREFEQGP LKAFDAPTRRIWHRFPHTDFSLRAVQWEVETLDDIWRAKRLIDAFEQFLSEERVDSHAHY YLRMYMQGPCYALGGGNYIDFMDNLETAEPGEKSPCYIEYHIYGDINRQEVCQMFYNSVM RFHQLMADQGNGVTKENFQAWAEQQLRLKARLPELSTEEERDSLRKVLVVDDDDVRRVFI DMGQKPYMMVTLANSDMRPNSRGIFFTYPQLRVFCLRSGLRVQGTGDHFTVKGVDGSRYE FSIRFYEEKKDVVGFEEDTCYYLRDGRKVVVQGFWSPEKCVNDALVRRITGRDVRQMVVH EIKQ >gi|283510568|gb|ACQH01000051.1| GENE 8 7760 - 8686 748 308 aa, chain + ## HITS:1 COG:alr7132 KEGG:ns NR:ns ## COG: alr7132 COG1715 # Protein_GI_number: 17233148 # Func_class: V Defense mechanisms # Function: Restriction endonuclease # Organism: Nostoc sp. PCC 7120 # 2 302 3 302 305 172 37.0 5e-43 MIPTFQQIRAQVLRALNVGGVMRAKDLRTPLAKHFNLTNDELNTKYDSGNGEIFLDRISW ALSYLFMAKLADKPRRGDYTISQKGKELVVSCTDEEINKYINDTVSKPRKTSGETDKVIG ATIMTSNNELTPQEGLYESYENIKKSIQADILATILSKKPQAFERIVVELLQIMGYGGEI KDAGFVTKLSNDGGIDGIIKEDVLGFNHISIQAKRYAANNHVGRNEVQAFVGAVAGTPSK KGVFITTSDFTKGAIDYVESLNGTPTVILINGEQLTKYLYECGLGLQEERVFKVMKLDRD FWDAMDDE Prediction of potential genes in microbial genomes Time: Sat May 28 01:05:44 2011 Seq name: gi|283510567|gb|ACQH01000052.1| Prevotella sp. oral taxon 317 str. F0108 cont2.52, whole genome shotgun sequence Length of sequence - 29146 bp Number of predicted genes - 24, with homology - 23 Number of transcription units - 15, operones - 5 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 117 - 1493 876 ## COG0534 Na+-driven multidrug efflux pump 2 2 Tu 1 . - CDS 1949 - 3196 1193 ## COG4591 ABC-type transport system, involved in lipoprotein release, permease component - Prom 3223 - 3282 2.9 3 3 Tu 1 . - CDS 3376 - 4593 1428 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 4696 - 4755 2.8 + Prom 4544 - 4603 6.2 4 4 Op 1 . + CDS 4748 - 5266 753 ## PRU_1298 hypothetical protein 5 4 Op 2 . 
+ CDS 5266 - 5940 661 ## COG0325 Predicted enzyme with a TIM-barrel fold + Prom 5989 - 6048 3.0 6 5 Tu 1 . + CDS 6155 - 7192 1196 ## COG1087 UDP-glucose 4-epimerase + Prom 7619 - 7678 4.2 7 6 Tu 1 . + CDS 7810 - 8910 519 ## Fisuc_1062 hypothetical protein 8 7 Tu 1 . - CDS 9137 - 9337 59 ## - Prom 9443 - 9502 8.0 9 8 Op 1 . + CDS 10214 - 11008 519 ## SCAB_47011 hypothetical protein + Term 11043 - 11083 -0.1 10 8 Op 2 . + CDS 11100 - 11750 468 ## gi|288928410|ref|ZP_06422257.1| hypothetical protein HMPREF0670_01151 + Prom 11795 - 11854 2.3 11 9 Tu 1 . + CDS 12011 - 13468 1394 ## BT_1839 hypothetical protein + Term 13497 - 13554 6.9 + Prom 13501 - 13560 4.0 12 10 Op 1 . + CDS 13759 - 14877 1258 ## COG0781 Transcription termination factor 13 10 Op 2 . + CDS 14929 - 15258 473 ## COG1862 Preprotein translocase subunit YajC 14 10 Op 3 . + CDS 15262 - 16248 902 ## PRU_0245 hypothetical protein 15 10 Op 4 . + CDS 16257 - 16850 520 ## COG0237 Dephospho-CoA kinase 16 10 Op 5 . + CDS 16864 - 17313 586 ## BVU_3817 hypothetical protein + Term 17332 - 17401 20.3 17 11 Tu 1 . + CDS 17468 - 18448 967 ## COG0530 Ca2+/Na+ antiporter 18 12 Op 1 . + CDS 18550 - 18987 372 ## gi|288928418|ref|ZP_06422265.1| hypothetical protein HMPREF0670_01159 19 12 Op 2 . + CDS 19010 - 19234 159 ## gi|260912447|ref|ZP_05918984.1| hypothetical protein HMPREF6745_2939 + Term 19245 - 19300 11.4 - Term 19324 - 19367 -0.9 20 13 Tu 1 . - CDS 19520 - 20881 1524 ## COG1350 Predicted alternative tryptophan synthase beta-subunit (paralog of TrpB) - Prom 20977 - 21036 4.8 21 14 Op 1 . + CDS 21590 - 23416 1283 ## COG1944 Uncharacterized conserved protein 22 14 Op 2 . + CDS 23418 - 25658 1162 ## Dfer_5304 TonB-dependent receptor 23 14 Op 3 . + CDS 25662 - 27845 208 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Prom 27989 - 28048 3.7 24 15 Tu 1 . 
+ CDS 28158 - 29084 491 ## COG0515 Serine/threonine protein kinase Predicted protein(s) >gi|283510567|gb|ACQH01000052.1| GENE 1 117 - 1493 876 458 aa, chain - ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 13 443 3 433 447 239 35.0 7e-63 MLFHTSHTPSHTESMTSGAPSRGILKFAIPLILGYILQQMYLVIDAAIVGRWIGVGALAA VGASSSIMFLVMGFCNGACAGFAIPVAQAFGAGDHHKMRCYVSNAMRIAVMLAVVITLLS CLLCERILKLVNTPADVFHDAYVFLFLQFCTIPFTMAYNLYSGQIRALGNSKQPFYFLIV SSLLNILLDVVLILLLGLGVEGAGIATFLSQVLASWLCWRFIRRHMAILVPKADERSFDN KKISILLNNGIPMGLQFSITSVGIIMLQSANNALGTIYVASFTAALRIKYLFTCVLENIG VAMATYCGQNIGAGKLDRVKTGVRDAMWIMMGYFVLTVAVIYPFADEMMMLFVSGHEQEV IANAAQLMRIANWFYPTLGVLVILRYSIQGLGYSNLSLMSGVMEMIARCGVSLWLVPALA WLGVCYGDPVAWIMADLFLVPAYLWLLRRLKNKERVKR >gi|283510567|gb|ACQH01000052.1| GENE 2 1949 - 3196 1193 415 aa, chain - ## HITS:1 COG:BMEI1139 KEGG:ns NR:ns ## COG: BMEI1139 COG4591 # Protein_GI_number: 17987422 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ABC-type transport system, involved in lipoprotein release, permease component # Organism: Brucella melitensis # 29 415 37 422 422 100 24.0 4e-21 MNLPFFLAKRIYTNNTDKTRVDRPAIRIAIAGVAVGLAVMLVSVSVVFGFKHTIRNKVVG FGSHIQVANFMTLQASEQYPIQMGDSMLKVLRAIPGVRHVQRFAIKQGILKTNNDFLGVA FKGIAADYDTTFIHQNLVAGAIPHFSDSVGKQQVVISQAIADQLNLKLGDKVFAYFIDNT GVKARRFSVAAIYQTNLSQYDKVTCFIDLYTAVKLNAWECDQASGAELTVDDFDRLDDTA ARVVNKVNRTIDRYGETYSSQTIQEMNPQIFSWLDLLDLNVWIILGLMLSVAGVTMISGL LIIILERTAMIGILKAVGARNVTIRRTFLWFAVFTIGKGMLIGNLIGMGLIALQHYTGLV KLNPATYYVSTVPVEFNLLVWLLLNVATLLISVFVLIAPSYLVSKINPATSMRYE >gi|283510567|gb|ACQH01000052.1| GENE 3 3376 - 4593 1428 405 aa, chain - ## HITS:1 COG:CAC1001 KEGG:ns NR:ns ## COG: CAC1001 COG0436 # Protein_GI_number: 15894288 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Clostridium acetobutylicum # 5 394 4 393 395 355 45.0 
1e-97 MPEISVRGLEMPESPIRKLAPLAVAAKARGTKVYHLNIGQPDLPTPQVGLDALKNIDRSV LEYSPSQGYQSYREKLTTYYKKYNIDVAPDDIIVTTGGSEAVLFAFMSCLNPGDEIIIPE PAYANYMAFAISVGATIRTVATSIEDGFALPNVEKFEELINERTKGILICNPNNPTGYLY TRSEMNQIRDLVKKYDLYLFSDEVYREYIYTGSPYTSACHLEGIENNVVLIDSVSKRYSE CGVRVGALITKNAEVRATVMKFCQARLSPPLLGQIVAEASLDAPDDYFRGVYEEYVERRK CLIDGLNRIPGVFSPIPMGAFYTVAKLPVDDCEKFCRWCLEEFSYEGETVMMAPAAGFYT TPGAGYNEVRIAYVLKKEDLKRALMILRKALEVYPGRVDDERKAL >gi|283510567|gb|ACQH01000052.1| GENE 4 4748 - 5266 753 172 aa, chain + ## HITS:1 COG:no KEGG:PRU_1298 NR:ns ## KEGG: PRU_1298 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 163 1 157 158 180 61.0 2e-44 MRQRTADWFETKIRYDKTMDDGTLKPITEQYVVDALSFTEAENAIIEEMSAFIAGDFKIT GIKPAPYHEVFFSENDNDDKWYKAKLQFITLDEKTEKEKRSNVNYLVQAASLQGAVKHID SVMGNTMITYDIASVTETLIMDVYEHVAITQNAAANDKPEYEEGNQNTKEAE >gi|283510567|gb|ACQH01000052.1| GENE 5 5266 - 5940 661 224 aa, chain + ## HITS:1 COG:FN0561 KEGG:ns NR:ns ## COG: FN0561 COG0325 # Protein_GI_number: 19703896 # Func_class: R General function prediction only # Function: Predicted enzyme with a TIM-barrel fold # Organism: Fusobacterium nucleatum # 17 219 21 222 223 152 42.0 6e-37 MATDVAGNLKRVVQSLPPHVRLVAVSKFHPNEELMAAYEQGQRIFGESQEQELSRKAAEL PKDIAWHFIGHLQTNKVKYIAPFIDMIEAVDSLRLLREINKQAEKCGRVIDVLLELHVAQ EATKYGFTPDACREMLASGEWRKLSHIRICGLMTMASNVDNEEQIRAEMTTAWHFFDEIK SQYFAHDDAFKERSWGMSHDYPIALQCGSTMVRVGTNIFGERQY >gi|283510567|gb|ACQH01000052.1| GENE 6 6155 - 7192 1196 345 aa, chain + ## HITS:1 COG:BS_galE KEGG:ns NR:ns ## COG: BS_galE COG1087 # Protein_GI_number: 16080937 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Bacillus subtilis # 5 336 3 328 339 334 51.0 2e-91 MKQTILVTGGTGFIGSHTTVELQQAGYDVVIVDNLSNSNAEVVDGIEQITGIRPAFEKVD CCDKQALEAVFAKYKDIKGIIHFAASKAVGESVEKPLLYYRNNIVSLLNLLELMPVYGVK GFIFSSSCTVYGQPTKAHLPVTEDAPIQEACSPYGNTKQINEEIIRDDIHSGAPIKSVIL RYFNPIGAHPSALIGELPNGVPNNLIPFVTQTAMGIRKELKIFGNDYDTPDGTCIRDYIY 
VVDLAKAHVKAMQRVLDMDTEPIEYFNVGTGRGVSTYEVVDKFEKATGVKVNWSYAPRRE GDIEKVWANPDKANTVLGWKAETSLEDTLRSAWNWQLKLRERGVM >gi|283510567|gb|ACQH01000052.1| GENE 7 7810 - 8910 519 366 aa, chain + ## HITS:1 COG:no KEGG:Fisuc_1062 NR:ns ## KEGG: Fisuc_1062 # Name: not_defined # Def: hypothetical protein # Organism: F.succinogenes # Pathway: not_defined # 7 364 9 394 405 98 25.0 4e-19 MNTEEEKTYIDLLCKFEQLPKQDKRPTFMEICKYPYTRFEEVCSRILQFYLNPFAEHGLR SLWLSALWRVSTQKEELPFYNKVECVTEEYAESKKIDIVIKSEEFVVAIENKTTAKLYNQ LDIYAKHIHNAYSDKKHKILLVLSVFPLSDLKSKRLMEENGFQPVLYKDLFAEVNKELAN YMMSSDQRYLTYMLDFMKTIENMSSVNNKRVFDFFIQNKERVEQLVNGYNQFQSNILSCH KEHIDFLKESVNERTGATWWAWEGWDLGISFNDKTNRIGIESNYLANVNGPCGEFHIYIT TWQKKHWAPYKEKILETFPEYIFLDENDDDRVYLHLPTIDGNNTEEIIKVLTETYNKLKE IAEQIH >gi|283510567|gb|ACQH01000052.1| GENE 8 9137 - 9337 59 66 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNFFAECLSSWGGLFTSLKVHRHVRCIRTRVGGDSFHVEENKEVMRAFVPAAALHWVCCK RTTSLL >gi|283510567|gb|ACQH01000052.1| GENE 9 10214 - 11008 519 264 aa, chain + ## HITS:1 COG:no KEGG:SCAB_47011 NR:ns ## KEGG: SCAB_47011 # Name: not_defined # Def: hypothetical protein # Organism: S.scabiei # Pathway: not_defined # 34 251 5 233 269 70 26.0 7e-11 MRTDHRKIVEQIERKFGARLSYLPNNHEGKFNLPLFEEKDTGILFVYIPGGNYCMGLSEK ELEHILKLTECPNITPEEMQPVQRISLNPFLISATPILNKHILKYNELFNKDIEVEGTEH APFFCDYPLAQAIAKDLLAEIPHEKEWEYFCRGGSQSVFCFGNTLPNEIELGKWLSWDFS SLNKLACNGFNLYGLFMGEWCSNDFTTNLGHHADVVQGSKTIRGGGAYFWPWQDEEWVYC ISAFRMPSKDLMESKAAFRLKMKI >gi|283510567|gb|ACQH01000052.1| GENE 10 11100 - 11750 468 216 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928410|ref|ZP_06422257.1| ## NR: gi|288928410|ref|ZP_06422257.1| hypothetical protein HMPREF0670_01151 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 216 1 216 216 405 100.0 1e-112 MKYFILFILFLFPSLSICGQTINKQAEKGKTYVRLVDFTRHHFTKKWKNTANNAVYQRTA TANNKYFERLASNPQRKLPLVPRKDLTGDDYLVDSAYYCKTIFSTKKYTLNIYKSGSKFK DKMGDEMFAPVDYLVFVTMDSQQRIVDYLVCYYNVHRLYESAERYFYMNNGHHITLINFY VDEIDTSFKGIQKYLINNQGKFVLLNKTNGHKRVTK >gi|283510567|gb|ACQH01000052.1| GENE 11 12011 - 13468 1394 485 aa, chain + ## HITS:1 COG:no KEGG:BT_1839 NR:ns ## KEGG: BT_1839 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 80 483 70 472 475 502 58.0 1e-140 MIHVNKLAASMAAAFTMSATMSCARDQTQNHQEMVKNQQEMVKNQQEMVKHYFAQTLKTK LNAEQNSKADFARNANYSTDIQQQLKDKDVASTKKMVWAAWCEANRELQEEKLIKPQSLQ NGGQGSWNLPQALEPDAVMPYYYGTKGQANGQLPLFLYLHGSGAKKQEWQTGLMLGNSFE DGPSLYFIPQIPNEGMYYRWWQVAKQFAWEKLIRQAFVEGAVDANRLYVFGISEGGYGSQ RLASFYADYWAAAGPMAGGEPLKNAPVENCANIGFSFLTGAEDEGFYRNILTRYTQEAFD SAQQARPLSVDNRPLFQHRINLLERMQHSINYSLTTPWLKDFIRNPYPKTVLWEDFEMDG RRRSGFYNLQVLVQPSEQRTYYEMTIKDNVITMNINDVEYTTIEKDQRWGIEMKFNRSYK QSKGGKLRIYLNEQLVDMSKAVTLIVNGKQLYRGNVKPTLQDMINSCMEYFDPYRIFPTS IEVSY >gi|283510567|gb|ACQH01000052.1| GENE 12 13759 - 14877 1258 372 aa, chain + ## HITS:1 COG:TM1765 KEGG:ns NR:ns ## COG: TM1765 COG0781 # Protein_GI_number: 15644510 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Thermotoga maritima # 196 296 32 132 142 63 35.0 5e-10 MINRELIRIKIVQLTYAYYQNGNRNMDNAEKELLFSLSKAYDLYNYLLALIVAITREERH RVDIATQQAQREGTEVPSAKFAYNKFATQLEENKQLNDFMETLKQRWEDDIESVRKLCNQ IEQSETYREYMESGNDDYEEDRELWRKLYKQLVQNNVDLDALLEEKSLYWNDDKEIVDTF VLKTIKRFDPANKAKQELLPEYKDEEDKDFARKLFRSTILNGEQYQRYMSENSRNWDFSR LAYMDVVIMQIAIAEMLNFPNIPVSVTINEYVELAKLYSTHRSGGYINGMLDTIARGLVA HGQMMKAMPEPRPRKQFDNGRKNGEHKHAPQPAMQQEGPVVADKDAPMEPSSDADNGALE QPIEPERNVAEA >gi|283510567|gb|ACQH01000052.1| GENE 13 14929 - 15258 473 109 aa, chain + ## HITS:1 COG:BH1229 KEGG:ns NR:ns ## COG: BH1229 COG1862 # Protein_GI_number: 15613792 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YajC # 
Organism: Bacillus halodurans # 20 95 8 81 88 60 36.0 9e-10 MISTTVLAAQAAAGNSAAPILMMVAIFAIMYFFMIRPQQKKQKAIRNFQNALQEGTRVVT SGGVYGDVKRINIESNTVELEIARGVVVTVDRNYVFADPASLQSVNTKN >gi|283510567|gb|ACQH01000052.1| GENE 14 15262 - 16248 902 328 aa, chain + ## HITS:1 COG:no KEGG:PRU_0245 NR:ns ## KEGG: PRU_0245 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 9 326 8 325 329 303 47.0 6e-81 MKQLRLLKIFSIVRSFVFSAVNKEFLIFLFFLALSGGFWLLTALNETYEKDFIVPVHLAN LPKDVVITSNDEDVVRVTLRDKGFAIMSYLYTERLRPIVLNFANYANKSRGKGTIPAADI QKQLYTQLSGSTKIVAVKPERLEFFFNYGQSKRVPIRIAGTIKPGDQRYLADYKFFPQFV TVYANNKALDSIKEVRTEELNITNFSDTIVETVELHRIEGVKMVPSVVRLALYPDVLTEE SIEVPISAESLPPDKVLRTFPSKVKVHFIVGVSRVRSITADAFKVAANYAELVEHPADKC TLYLRKAPPGVRKVRLEINQVDYLIEQK >gi|283510567|gb|ACQH01000052.1| GENE 15 16257 - 16850 520 197 aa, chain + ## HITS:1 COG:jhp0770 KEGG:ns NR:ns ## COG: jhp0770 COG0237 # Protein_GI_number: 15611837 # Func_class: H Coenzyme transport and metabolism # Function: Dephospho-CoA kinase # Organism: Helicobacter pylori J99 # 12 195 7 196 196 82 30.0 6e-16 MDFRHPSPHLRIALTGGIGSGKSFVAQRLRAHGIEVFDCDASAKRLLRTSEPLMESLRQL VGNHLYADGRLQKQVLAAYLLASDENKQRINALVHPAVARDFEQSGNQWLESAIFFESGF DARVKVDKVVCVTAPLEVRIQRVMQRDALDRAKALEWIECQWPQQRVRAMSHFEIVNDGV KDVDQQLNELFIQLIKT >gi|283510567|gb|ACQH01000052.1| GENE 16 16864 - 17313 586 149 aa, chain + ## HITS:1 COG:no KEGG:BVU_3817 NR:ns ## KEGG: BVU_3817 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 149 1 146 146 169 59.0 4e-41 MLQTILSISGKPGLYKLVSSGKASLIVEAIDETHKRMPAFATDRVTSLSDIAMYTDSEDV PLWQVLKNLGEKEQLKPCSLNYKKATSQELRDYFGEVLPAFDRDRVHDSDIRKLIQWYDI LVKNGITNFEEALKPKNGEEQTAEQAAAE >gi|283510567|gb|ACQH01000052.1| GENE 17 17468 - 18448 967 326 aa, chain + ## HITS:1 COG:aq_066 KEGG:ns NR:ns ## COG: aq_066 COG0530 # Protein_GI_number: 15605665 # Func_class: P Inorganic ion transport and metabolism # Function: Ca2+/Na+ antiporter # Organism: Aquifex aeolicus # 33 317 18 313 322 189 43.0 
6e-48 MPQLFCAALGGFTDNIWLNVLFVLAGIVLVLWGADRLTEGAVGLAERMKVSQMVIGLTVV AMGTSMPEFCVSLVSALKGTPDLAVGNIVGSNVFNSLLIVGVAAMLAPMTILKTTVRKDI PFAVLASGALFLLCLDGYIGRVDAGFLLILFVFFMGVTLTTAKPDGGAQTEAKKPMKPLV ATAWLLVGLGCLVLGSNLFVDGATTVAKAIGISDAVIGLTIVAGGTSLPELATSAVAARK GNSGIAIGNVLGSNVFNILLILGVTGVISPMNIQGIGLVDLSVLFGSMMLLWLFSFTKYT ITRLEGALLSSIYIIYVGYLIVQVVK >gi|283510567|gb|ACQH01000052.1| GENE 18 18550 - 18987 372 145 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928418|ref|ZP_06422265.1| ## NR: gi|288928418|ref|ZP_06422265.1| hypothetical protein HMPREF0670_01159 [Prevotella sp. oral taxon 317 str. F0108] # 1 145 1 145 145 253 100.0 2e-66 MKKSLLLSLLLAFLCAFSVSAKKSSNIKCELILNDGKKVEGWLVKEKNASYGPNMVKNAT DVVVASSAESKDGTNYNADDVKEMKLTDEASGESLVYKSLHAVKSFTMPKSMSPSPKRYF WLVMYEGKKVTGLHIYGYYSRYNWC >gi|283510567|gb|ACQH01000052.1| GENE 19 19010 - 19234 159 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260912447|ref|ZP_05918984.1| ## NR: gi|260912447|ref|ZP_05918984.1| hypothetical protein HMPREF6745_2939 [Prevotella sp. oral taxon 472 str. 
F0295] # 1 74 154 227 227 143 94.0 3e-33 MPFSYSVDGDNIAVTYHVPMGGTVVGKQADLKRMFERFPQMVEYIDSKEFDLKAFKKNPF ILLEKLDQILTSGK >gi|283510567|gb|ACQH01000052.1| GENE 20 19520 - 20881 1524 453 aa, chain - ## HITS:1 COG:TM0539 KEGG:ns NR:ns ## COG: TM0539 COG1350 # Protein_GI_number: 15643305 # Func_class: R General function prediction only # Function: Predicted alternative tryptophan synthase beta-subunit (paralog of TrpB) # Organism: Thermotoga maritima # 9 425 7 421 422 476 56.0 1e-134 MSRQKRYLLQESDLPKQWYNIQADMPNKPLPPLNPATKRPVSVDDLARIFCKACAEQELD TKNAWIDIPEEVQDKYKYYRSTPLVRAYALEKALGTPAHIYFKNESVNPLGSHKVNSALP QCYYCKEEGVTNVTTETGAGQWGAALSYAASVYGLAAAVYQVKISMQQKPYRSSVMRTFG AAVTGSPSMSTRAGKDIITRDPLHQGSLGTAISEAVELATTTPNCKYTLGSVLNHVTLHQ TIIGLEAEKQMAMAEEYPDVVIGCFGGGSNFGGLAFPFMRHNILDGKRTEFVAAEPDSCP KLTRGKFKYDFGDEAGYTPLLPMFTLGHDFKPANIHAGGLRYHGAGVIVSQLLKDGLMRG EDIPQLETFEAGTLFARTEGIIPAPESNHAIAAVIREAKRCKQTGEEKVILFALSGHGLM DMTAYDQYLNGDLLNYSLSEESIAQSLKAVPEV >gi|283510567|gb|ACQH01000052.1| GENE 21 21590 - 23416 1283 608 aa, chain + ## HITS:1 COG:YPO1385 KEGG:ns NR:ns ## COG: YPO1385 COG1944 # Protein_GI_number: 16121665 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Yersinia pestis # 3 408 7 391 588 135 29.0 2e-31 MKPYKACSPATTINRIRKSLDEAGIFLQEASYSNSDCLFTSRVHMSGKFSPLNIGTNGKG TTYEYALASGYAEFMERLQNNMLFRGVKNAEKHNLSKMDDCFYKKQLIEQGLALDFLYDP KESDILMETEVENQFVFLKSLFPFINTKSDAISFFKDYLMFKNCKCVPFYSEEDKGEVML PIELILISCGSNGMASGNTKEEALIQGFCEIFERYAGSQIYNQNLTPPTIPISAFKDYPV YSIIEKLQEENRYKLIIKDCSLGKGIPVLGVLVIDEEMGKYNFNLGSALNPGVALERCLT ELYQSATGLSWYDITFEQYANNPKYDESFIFSNGNKLFIDGSGNWGVCLFKEKPSYEFKG LNDSLDVTDEGDISFIKSLIHGLGFRIYIRDVSFMGLNSYYIVVPGMSQFATKREHYEVL NDTYYSLNSLRNIEHISKEELLDMCKKVNSDYQLLKMFSFNFNEMLVFHTNRDLHELKLE QLLFMLNYKAGMIKEAYWYLQEFLKNKDFGAYKYYYGIKDYVRLKLENNSDDEIENKLSI LYSDELAEIIDDVKDPDKILRYYEWPQNFNCENCALIDECRQLDLLRVVKNIQGKHQEAN INHNTFIF >gi|283510567|gb|ACQH01000052.1| GENE 22 23418 - 25658 1162 746 aa, chain + ## HITS:1 COG:no KEGG:Dfer_5304 NR:ns ## KEGG: Dfer_5304 
# Name: not_defined # Def: TonB-dependent receptor # Organism: D.fermentans # Pathway: not_defined # 26 715 31 775 812 109 20.0 4e-22 MKKSILILIGLLINLSFCIAQNQDKVSGHIIDSLTKKPIEFVNVVLLNADSAFVCGAVTD SLGYYELSKNLVEGKTYTIQVTHVCYDRKRVSFCHGEQTDVSIELTANATSLDEVEVNGI KTKVRNRLNFTYSFTDQMKENVRLTSRLLENIPTVFVDCNSTVHIKGSSNILILKNGIEL TDNSLVDQIQPASVKRVEIMYNIPSQYANLNYTAIMNIITEREQGYSLMVDNKTAVDASM NDTKVNIGHVIEKNSFYLFYKQYYRNLEMKTEDRIFDNGGMLSSEDLYTTSPRKECDNEF FYGYTYQPSSRFQIGVDGYLSLYRERFQNTYENQNTSFSVLKEAFNTQHYKGYANYKDEK NHLKFEVSFNKKAIDDNDTYFTGNSLIRQNENQEIYGSKLDYNRKFDETTTLYSGVKYSH NKTKGLFNNSYTDIAERYHCNNMFAYAELMKSLGENWTVDAGVSFQNYHRSFADGTRVKD TDFFPKFNVSYAWDNNNLALGYSSYLNDPSVWQMLPFIKKESQNISTKGNPYLKPQKNGT LSLEYSYSKGNFYLASSAYYKQVNNQVASNLLTDGTNATLEFININKVQNFGLDFTLSCN LTKWWSVNFYADALSRRIAANSFYDTSMFSYMVQMQSNWHLSSRLTAIVQYTYNSKELQY NGYSKSRDSSIGMVNYTLNDYLDLYLVFIQPFGNLKSHSRIHQATQYVDMKDKVYSQKVM LCLTFNLSKGKVQRERKIYENESKKH >gi|283510567|gb|ACQH01000052.1| GENE 23 25662 - 27845 208 727 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 488 706 1 222 245 84 26 6e-16 MFHRFRIVRQYDSVDCALACLKMICQHYKMYHSLDASEYSYYISKDGISLASVVEIARKN HFEVTCGKVFLEQLTQEVTLPCILYWNRNHFVVLYKISQLHGKFFFHIADPAKGRVVFSL SEFEQHWSTDTVDSRAQGIVILLEPNGNCQPATVKKRKTGKKTFLPILLSYKKMLLYLLL GVLIGAGIQMAFPILTQWIVDKGIEGGDISFILMVLIGQMALITGNMFNDFFRKVLVLKL GSRFSISLLTDLVTKMLRLPVRFFDSKPMGDFIQRLQDHDKVERFVTVYFVNCVFSLITL LVLGIVLMTYNGYIFLVFVLGSILYIIWTYLFVNKRKTLNYELFAVKAKNQGKYYEILKG ITEIKLQNLEQRNRDDIENIQKDLYHTNVKALKIDQYLDVGNVFINELKNILITFLSAYF VIRGDITFGMMMSIQYIIGELNVPIAQFLTFIAGYQDAKLSLDRMNSIFSIDNEEDGDKE AFKPQHGVISVEHLHFKYVMSGNDILHDMNLCLPLGKKIAIVGSSGSGKTTLVKLLLKLY EPTQGKIVVSNTDIKNVKNRLWRNSCGAVLQDGFIFSGTLRENIILEKSFNEERFHQAVS ISNVETFAEPLPYKYETRIGDNGMRLSQGQIQRILIARAIYKNPDIFFFDEATNSLDANN EKDILEKLKPILKGKTVFVVAHRLSTVKDADIILVMDHGTIIEQGTHTSLVEKQGYYYQS SSGTNWN >gi|283510567|gb|ACQH01000052.1| GENE 24 28158 - 29084 491 308 aa, chain + ## HITS:1 COG:YAL017w_2 KEGG:ns NR:ns ## COG: YAL017w_2 COG0515 # 
Protein_GI_number: 6319302 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Saccharomyces cerevisiae # 95 293 114 308 321 71 28.0 2e-12 MDRQEISEHIQTYHDRVFLRESIAQGFLLPQRESCELPPSVISYNFPQSLYTVVEVLAQS TYSISYHVRSRTGQDLFLKQLRLIDPDQRKHFQKEISIIKRLQDSQNILKLICFDIESMY YVTEYIQGRNLEVCYHAMNFSEKIDIIKRIIAVIANLHSRNIVHGDLHLAQFIISDTGIL KLIDYEMLGDITSKHVFPYMGATFEYIEPESLSTNPFVLIQKEDINFKAEVYRLGVLIYT IIYETPPFYEITWKMLCESILHESPTFVKTDNKGHRIPDWIIELIKKCLHKDPNQRYASV CEIENFGV Prediction of potential genes in microbial genomes Time: Sat May 28 01:06:46 2011 Seq name: gi|283510566|gb|ACQH01000053.1| Prevotella sp. oral taxon 317 str. F0108 cont2.53, whole genome shotgun sequence Length of sequence - 26738 bp Number of predicted genes - 21, with homology - 20 Number of transcription units - 13, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 208 88 ## ZPR_3005 transposase - Prom 257 - 316 5.4 + TRNA 564 - 639 85.6 # Thr CGT 0 0 + TRNA 642 - 717 79.6 # Gly CCC 0 0 + Prom 886 - 945 3.8 2 2 Tu 1 . + CDS 969 - 4199 3133 ## Cpin_5935 TonB-dependent receptor plug + Term 4272 - 4312 3.1 + Prom 4213 - 4272 5.8 3 3 Tu 1 . + CDS 4342 - 4674 388 ## COG0662 Mannose-6-phosphate isomerase 4 4 Tu 1 . + CDS 4781 - 5941 985 ## COG4623 Predicted soluble lytic transglycosylase fused to an ABC-type amino acid-binding protein + TRNA 6244 - 6331 53.9 # Ser TGA 0 0 - Term 6324 - 6369 7.3 5 5 Op 1 . - CDS 6427 - 7476 643 ## BDI_1313 hypothetical protein 6 5 Op 2 . - CDS 7469 - 8320 201 ## gi|288928429|ref|ZP_06422276.1| hypothetical protein HMPREF0670_01170 7 5 Op 3 . - CDS 8353 - 9747 479 ## gi|288928430|ref|ZP_06422277.1| hypothetical protein HMPREF0670_01171 8 5 Op 4 . - CDS 9750 - 10778 401 ## HCH_05712 superfamily I DNA/RNA helicase 9 5 Op 5 . 
- CDS 10771 - 11700 260 ## gi|288928432|ref|ZP_06422279.1| hypothetical protein HMPREF0670_01173 10 5 Op 6 . - CDS 11711 - 11914 181 ## PGN_0927 hypothetical protein - Prom 11958 - 12017 3.1 + Prom 12104 - 12163 5.4 11 6 Tu 1 . + CDS 12275 - 12544 169 ## gi|288801429|ref|ZP_06406882.1| conserved hypothetical protein + Prom 12621 - 12680 3.3 12 7 Tu 1 . + CDS 12710 - 12928 84 ## gi|288927437|ref|ZP_06421284.1| hypothetical protein HMPREF0670_00178 + Term 12963 - 12995 1.2 - Term 12645 - 12710 3.0 13 8 Tu 1 . - CDS 12859 - 13008 94 ## gi|260911294|ref|ZP_05917893.1| conserved hypothetical protein - Prom 13086 - 13145 4.0 + Prom 13863 - 13922 5.1 14 9 Op 1 . + CDS 14090 - 14266 74 ## 15 9 Op 2 . + CDS 14297 - 19615 3242 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits + Prom 19988 - 20047 5.2 16 10 Tu 1 . + CDS 20105 - 20929 975 ## COG3315 O-Methyltransferase involved in polyketide biosynthesis + Term 21133 - 21186 4.1 + Prom 21101 - 21160 5.5 17 11 Op 1 . + CDS 21239 - 22351 930 ## COG0351 Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase 18 11 Op 2 16/0.000 + CDS 22348 - 23223 1203 ## COG0214 Pyridoxine biosynthesis enzyme 19 11 Op 3 . + CDS 23228 - 23791 592 ## COG0311 Predicted glutamine amidotransferase involved in pyridoxine biosynthesis + Term 23875 - 23937 12.4 - Term 23848 - 23877 -0.4 20 12 Tu 1 . - CDS 24044 - 24817 566 ## gi|288928440|ref|ZP_06422287.1| hypothetical protein HMPREF0670_01181 - Prom 24925 - 24984 2.6 - Term 24870 - 24897 -0.8 21 13 Tu 1 . 
- CDS 25119 - 26378 989 ## gi|288928441|ref|ZP_06422288.1| hypothetical protein HMPREF0670_01182 - Prom 26517 - 26576 6.4 Predicted protein(s) >gi|283510566|gb|ACQH01000053.1| GENE 1 1 - 208 88 69 aa, chain - ## HITS:1 COG:no KEGG:ZPR_3005 NR:ns ## KEGG: ZPR_3005 # Name: not_defined # Def: transposase # Organism: Z.profunda # Pathway: not_defined # 1 68 1 68 266 77 51.0 2e-13 MYEVLDKDTIKYESLPHLSVAKRGYVTKCDLAEVIQCILYKLKTGCQWHMLPVSAIFTGR VLSYKSVYV >gi|283510566|gb|ACQH01000053.1| GENE 2 969 - 4199 3133 1076 aa, chain + ## HITS:1 COG:no KEGG:Cpin_5935 NR:ns ## KEGG: Cpin_5935 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: C.pinensis # Pathway: not_defined # 3 1076 4 1096 1097 581 35.0 1e-164 MQKRLFFLFAIVMFLTSPLMAQVTTSGINGKVVAGGEEVIGATVVAVHNPSGTRYNGITN EKGRYSIQGMRVGGPYTITISYVGYKDEVVENVNLALGEASVFNADLKEDAKLLGEVTVT GKAGVGTTGASTQFSKTQIDNTPTVNRNIYDVATLSPLVNANKKGGITIAGANNRYNSFQ IDGMVSNDVFGLSSGGTIGNQTNANPIALDAVEQIQVVASPFDIRQSGFTGGAINAITKS GTNVFKASAYTYYTDENLYGRWNQNTGLKEKYQNESTKTFGATFGGPIVKDKLFFFASAE YKKNTYPATYYAGAPGYFMTLEMAKAIADRYESITGIREDYSRPDITSGALSLMSRIDWN INSNNKFSLRYQFNDSYKDEISSSYNTYNFVNSGFKRINNMHSFVAELNSHLSRSLYNEL RVGLTMVRDRRDIPYRAPSALITKAGAYNPLTGDEEAGDKTINIGTEYSSGLNALEQNIW VFEDNLSWYLGNHNITFGTHNELYDMKNSFMQAFTGRYSYGSKDIGGITAFMTDKASEFT WNHADTNITGTREWKTPFKSGQVGFYVQDKWDLNTLLQFTYGVRLDIPYYVNNPSTNKDF NASDFSTKHDAVVGRKPKSVLMFSPRFGFRWYTDETHKTMLRGGLGIFNGRAPFVWVENA WANTGIEQKGVSIRANSKTGTLAPSFATYKNDAEEAAKEAKASNATKPNIATVARNFKFP QVFRANLAWEQQLPWDMKFTLEGLYSRNLNNVWFENLALVNEGKRVYAVEGAENSSTIFY ASNPGSYSSIVNMTNTNKGYSYSVSAQVEKSFKFGLDLMANYTFGRSYSVNDGTSSIALS NWGFYYSVDPNEPVLATSMFDIPHRLVVTANYNSKRYGNGRWQTHVGLTYNGSSGQRYSL TMSDSQNQSFNGDYRKGNTLLYIPTKNELANMKFVNKKVGSKIITADEQKAQFEQWIEGD SYASKHRGQYAERNSHLTPWENRFDLHISQDFYYLKERGSKIELVFDILNVANLLNKNWG TTYGSVYNINKLQVQGVEDGAAPDTKVASFQYFDNTPFVSDVASRWHAQIGLRVTF >gi|283510566|gb|ACQH01000053.1| GENE 3 4342 - 4674 388 110 aa, chain + ## HITS:1 COG:TM1287 KEGG:ns NR:ns ## COG: TM1287 COG0662 # Protein_GI_number: 15644042 # Func_class: G 
Carbohydrate transport and metabolism # Function: Mannose-6-phosphate isomerase # Organism: Thermotoga maritima # 12 106 18 118 121 59 34.0 1e-09 MHIDLNNLEETKIVGFKGGHGEMFMRAFVDDKCRIMRNVLKPGAASGLHKHEENCEVVFA LSGEGTFYCDGEKETLLPGQVHYCPMGHAHYFENNGTEDFVFFAIVPEHH >gi|283510566|gb|ACQH01000053.1| GENE 4 4781 - 5941 985 386 aa, chain + ## HITS:1 COG:VC0866 KEGG:ns NR:ns ## COG: VC0866 COG4623 # Protein_GI_number: 15640882 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted soluble lytic transglycosylase fused to an ABC-type amino acid-binding protein # Organism: Vibrio cholerae # 189 360 298 463 530 105 36.0 1e-22 MLMVVGLAACNHKKQNQPVETPWGTTLGADSATSAESGFSLNDIIHNGELIMLTLTGPEN YYDYRGRGMGLQYLLCERFAKALGVSLRVEVCKDTADLVARLKNGDGDLAAFFLPKKVGG VAFCGASSASGDGQWAVQEGNEALADTLNRWFKPALLAKIKAEERFALSSRSITRHVYSP MLDRKAGVISQYDHLFQKYAPTARWDWRLLAAQCYQESTFDPQAHSWAGARGLMQIMPGT AAHLGLPASQVHEPEPNVAAATRLIRELDGKFNDIGDRMERIRFVLASYNGGAGHVRDAM ALARKYGQNPKRWEEVAPFVLRLSTPQFYNDPIVKNGYMRGSETVDYVERIGNRWQQYMG MAPGGISGSYGSMVPQRATKKYRFHL >gi|283510566|gb|ACQH01000053.1| GENE 5 6427 - 7476 643 349 aa, chain - ## HITS:1 COG:no KEGG:BDI_1313 NR:ns ## KEGG: BDI_1313 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 7 348 2 343 345 540 72.0 1e-152 MTDKIRILTDNGKQVEASAPIIISASRSTDIPAFYAKWFFNRLAKGYCVWYNPFNQQPIY ISFKNCRVIVFWTKNPKPILPYLHILDEMGIHYYFQVTLNDYVNEGFEPNIASVDERVDT FKQLSQKIGKERIIWRFDPLIITPTIGPRELLKRIWNVGNKLKGFTDKLVFSFIDVKAYR KVQNNLVKETIFFTKEDVETAEANYAQRIDIVEGLQKIQEAWKGLGWNVTMATCAEDIDI ESYGIEHNRCIDRELMKRIFSEDEELVYYLYTLKWPKRDMFGQLPAIPQKEKKVKDPGQR KICGCMVSKDIGMYNTCRHFCVYCYANTNKECVLRNKEKHNESSESIIE >gi|283510566|gb|ACQH01000053.1| GENE 6 7469 - 8320 201 283 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928429|ref|ZP_06422276.1| ## NR: gi|288928429|ref|ZP_06422276.1| hypothetical protein HMPREF0670_01170 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 283 38 320 320 543 100.0 1e-153 MSKPIISGLTKRTAFANIYKAIITLLYNVNLEGELPNIEEPVQIAQKEESQPIYVSAKEE LDRLRIWQESIRSFFEQLKKVPKKKDLYAPLESAFAENGNQMDYVKFFALLNGYKEWRTS KGEPIKAWKSMVEHFCPEEYSKRFGYSPEFIHQNKRQPKKMFEKSFDFFGFKQGEKFINE EYPETSSVSIITDENAANYLKVCPKLVVFRTEIVDIFIEFQKKYQSGYYFKNQHKYNRNN NDVVDHFCKWCLSQKNPNAIPYTQHNKGMIDEIKKYLLIKYHD >gi|283510566|gb|ACQH01000053.1| GENE 7 8353 - 9747 479 464 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928430|ref|ZP_06422277.1| ## NR: gi|288928430|ref|ZP_06422277.1| hypothetical protein HMPREF0670_01171 [Prevotella sp. oral taxon 317 str. F0108] # 1 464 1 464 464 908 100.0 0 MKLYLATSSLNVDNILSTESVAPYSFYQVRNYGYDSFVCLDLIPFKNVLILFSRIPYFEI YDKEHDSRPLVLEIEVNESINPLAHIADYEGVNVYSTDTIIRLSPFNTRLLFFKPQDLKH SRLSCSDSLTNKLGDRFYFDMCRAEFDLAHLSNTNLHVDDKCNNFEQKVFQDNRLNTIKG FVFGYYLGVSKSVSSNSAKLLKIQKRVYDIAATVKNNGGYSNNSFFNELEQLDKEYRRND PSTLKCKDLWDKTLIELGIPSEALNQLFALYDVNGVVKTNFMKKQGVMPTVSLHQYGFNN IEMYRDNLKHHTDNIIREEQKIQLSSFDVINTFDLDPSYETCMLAGKDSDSMIFNKFIDA ILWHGIAPTPDTLRTDRFNIATEITKSAKSIWESTNQEWQNSSAQIFMNDLRQNIKSFTP LDINKQENEILKSIAAQLSCLREKILMQLFSFAKIIHIPTIDTL >gi|283510566|gb|ACQH01000053.1| GENE 8 9750 - 10778 401 342 aa, chain - ## HITS:1 COG:no KEGG:HCH_05712 NR:ns ## KEGG: HCH_05712 # Name: not_defined # Def: superfamily I DNA/RNA helicase # Organism: H.chejuensis # Pathway: not_defined # 99 332 192 433 454 64 27.0 6e-09 MDDFDSEFFIDYSNLDDFQRQLIDRKNNKSMVVSGSAGSGKSLIALHKAKQIAALGEMYV VIVYTKSLRKYFEDGLKKLGLKNVYHYHQWKHNQRHVKYLIVDECQDFTKEEIEEMKQYG EFCLFFGDSAQSIMGFGNRGEVQSIERTASDMNVAPDPLYFNYRLTQEIAALGEKVGNVE DLVLKCKRKGEKPNLISANSFDGQLDKIADIIKNRALTNVGILLPFNTNDKGEWSVEYVK DYLMRKGVTCEFKYNANQDTEMDLDFHSSNPKIMTWWCAKGLQFKDVFIPGCDKKTDNRS ALYVAMTRCSERLYLGYTSSLSSLFPEKSDTVYNNNEELEII >gi|283510566|gb|ACQH01000053.1| GENE 9 10771 - 11700 260 309 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928432|ref|ZP_06422279.1| ## NR: gi|288928432|ref|ZP_06422279.1| hypothetical protein HMPREF0670_01173 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 309 1 309 309 597 100.0 1e-169 MCEFPEWLDNYIFNILKAKYSPDHVRFEYNLNLNKDEVLIYLGTYFPRSYIETNSLFTEF SNAVNYTKAIEHKSELKILDLGCGTGGEILGILSFVDKFVLNVNAIKLLAIDGNQESLRI FEKVISYYKAHTRLDINYSIGPAFIENKGDLDTIAEIVSEQYDIILSCKAICELLAKKRF EQKPYKSIVAMLASKLTSDGVLFIEDVTVKSPATNTFIPIILNSELNEFVRENKDFTTLF PRSCANNENKCIDGCFFKREFIFSHSHKSNDVSKVVFRFIGKKGILNRLKITDESKKYNC KISKKIKYG >gi|283510566|gb|ACQH01000053.1| GENE 10 11711 - 11914 181 67 aa, chain - ## HITS:1 COG:no KEGG:PGN_0927 NR:ns ## KEGG: PGN_0927 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 9 66 2 59 67 70 58.0 2e-11 MGQKKEHSNLIKEHLKKRGITQTWLAKELGMSFSITNAYVCNRKQPNFAIIFKVADLLGV SPKELVE >gi|283510566|gb|ACQH01000053.1| GENE 11 12275 - 12544 169 89 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288801429|ref|ZP_06406882.1| ## NR: gi|288801429|ref|ZP_06406882.1| conserved hypothetical protein [Prevotella sp. oral taxon 299 str. F0039] # 1 86 1 85 102 75 50.0 8e-13 MTINELPDKPVWQMTGEELLFLAQHGNMGTSGETAKASTAKEKSDMYLAWLGLHVSLGAA CPRLIASSRAVRSIVPLHKLGAKLLLKLN >gi|283510566|gb|ACQH01000053.1| GENE 12 12710 - 12928 84 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927437|ref|ZP_06421284.1| ## NR: gi|288927437|ref|ZP_06421284.1| hypothetical protein HMPREF0670_00178 [Prevotella sp. oral taxon 317 str. F0108] # 1 71 1 71 74 76 61.0 6e-13 MYKVLDKDTIKSEILPYLSVAKRGYVSKSDLTEVIQSNLYKPGNRLSIPDYFCVCMILFP LHVSSMPCNRSC >gi|283510566|gb|ACQH01000053.1| GENE 13 12859 - 13008 94 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260911294|ref|ZP_05917893.1| ## NR: gi|260911294|ref|ZP_05917893.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. 
F0295] # 12 48 2374 2410 3065 63 78.0 5e-09 MPKTRQKKSVAYTRTGNGTENTFAYDRQHERLQGMLLTCSGNNIMQTQK >gi|283510566|gb|ACQH01000053.1| GENE 14 14090 - 14266 74 58 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHFGFTRQILLDYVRVDGDILTKKFMENLLKGPCNGHTCNQLMSLRKFVNHIHNVIST >gi|283510566|gb|ACQH01000053.1| GENE 15 14297 - 19615 3242 1772 aa, chain + ## HITS:1 COG:TM0618 KEGG:ns NR:ns ## COG: TM0618 COG1112 # Protein_GI_number: 15643384 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Thermotoga maritima # 1007 1574 717 1289 1289 345 36.0 6e-94 MVTVNIEYLPSINYSLINNRIAICQSVEISNNETSDLRDIVIECSGEFFQDFRSSVIDSL KAGKSMRLQGMNLSPIAAQVAAVTEKTASAFTVNVFSDALSETKKILFTHSYDIDIMPYD QWLGTSILPQCLASFVTPNHPAINNVVAKAAGKLKEMSGSSAFTEYQTGNTNEVRKQVAA VYGALHAEGIVYRSVPASYEVVGQRITLPDQVLSSKLGNCIELTLLFASVLESLGINSGI VLQEGHAYLAVWLVDDCCQYSVYDDASYIEKKCAEGIDEMLVLECTQVTAESTTFEDAQS IATKHLANTGIFEMFIDIKRCRLEQVRPLPQRTMNNGVWEVPVEGVSHDECVLNVKEHSR FDLTMLMDSKREVTKMDIWERKLLDFSLRNSMLNLYLRQKAIQFISFDVDLVEDYLQDGE EYLISCKPNVGFNIIGDERLVRSKLLPELHELITNDIKHRTLHSYQTEAETRYTLKNIYR ASRNAVEETGANALYLAIGTLRWFETDISEKPRYAPILMLPVEIVYKKGDYYIRTRDEEI ALNITLTEFLRQNFDITIPGLNPLPKDDHGVDVKKIFAIIREVLKNKKRWDVEEECILGV FSFSKFLMWNDIHNHRQELLDNNVVRSLVEQKLTFIPTQVMSDLKGKDKEVKPSDLALPV PIDSSQMAAVMEARQGNSYILYGPPGTGKSQTITNLIANALFQGKRVLFVAEKMAALSVV QKRLEKINLGPFCLEMHSNKITKRHVLEQLKKALDAAHIIRPEEYARIADELYEQRCKLI EYMEALHDIKGAEGMSLFDCIIRYESINTPELDVDANDEELKRKFRIEKMDSYSHLLSQK YQAILSITGAPSKHPLLGLNIVEDDLVDAGRLPSRIKTASEILRKAEENREMLAKATGIK TDILRDCNDGILFQDGTALYNEWRGIKAKWFLPRFFAKRGFVAKLKQFNSLIIEQEVDAL LSNLLNYQQLHAEIVTIQDVVRTLFAVNFDADRLPSNDELGSYGSCLDNWLSHTDKARDW YQWCAYKKELENEGLGVVARHIELETIPAETLKDAFFKRIFRSLACEKIASSQVLRTFEG AIFDETISRYQQLTEEFQILSQKELYAKLAANVPHVTDSIDNSSPIGFLNRNIANGGRGI SLRDLFDNISTLLPRLCPCMLMSPMSVAQFLDLSQNKFDLVVFDEASQMPTSEAVGAIAR GKSVIVVGDPKQMPPTSFFSSTSVGEEEADIDDLDSILDDCHSLGIPSLQLNWHYRSKHE SLIAFSNNEYYDGELITFPSVDDQTTKVKYCHVDGVYDKGGRRSNKKEAETIVADIVKRL 
QSTDHAKYSIGVIAFSQVQQNLIEDLLTEKLDKDQKLREAADELYEPIFIKNLENVQGDE RDIILFSIGYGPDKDGKVSMNFGPLNNNGGEKRLNVAVSRARREMIVYSSLKASHIDLKR TKARGVEGLKHFLEYAEQQILIQAANAHKESSDRIISEQIANALRTRGHNVNTNVGRSNF KVDVAIADSADSGNYSMGILLDGEVYHNTQTTRDREIVQPTVLNMLGWKIMRVWSVDWVN NPERVIARIENALQQKSKPIETPVGNATFDVAKEKVEEIESNEKEYRVYNGLGNTGSMSD EELATKILSCEQPMTLMYLCRCMCMHRDSTRVTPTLLTSVEEIANRQLFVQKLGNATILW TDKEHADAFNGYRQAHGRDITEIPLIEIMNAIALTVQEQLSIKTDALTLLVAKKLGFARR GTKVDQVLKEGLEMLLNTKCIVESGGIVSLQE >gi|283510566|gb|ACQH01000053.1| GENE 16 20105 - 20929 975 274 aa, chain + ## HITS:1 COG:PM1322 KEGG:ns NR:ns ## COG: PM1322 COG3315 # Protein_GI_number: 15603187 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: O-Methyltransferase involved in polyketide biosynthesis # Organism: Pasteurella multocida # 1 270 1 272 276 252 46.0 4e-67 MNDKIESKLANVPETMLITLWAKATETERADALIKDEKAVEMMGKIDYDFSKLKKASFSQ SGCCVRAGLIDMEAKAFLKDNPDAVVIQLGAGIDARYERLGKPKVGHWYDLDLPEAIELR RKLLQESDENTFLAQSLFDYSWCDKVKATGKPALVIIEGVMMYFEPKEVQSFFSTIAQRL DNAVVLFDMLAFSLVGKSKVHDSLKKMDKEVEFKWSVLNTKEMETWSNRLHLDKEYYMSD YDHGRFPFIFRMLYHIPYFYRRFNQRVVKLKIGD >gi|283510566|gb|ACQH01000053.1| GENE 17 21239 - 22351 930 370 aa, chain + ## HITS:1 COG:CAC3095 KEGG:ns NR:ns ## COG: CAC3095 COG0351 # Protein_GI_number: 15896346 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase # Organism: Clostridium acetobutylicum # 5 245 4 239 265 137 34.0 5e-32 MEQQVTPILTINGSDGTGGAGIQADIKTISALGGRALSAITSITAQNTLGIQEFYDLPAE TIKGQIEAIVNDMQPAVVKVGMIRRADTVAQIAQLLRQHKPRHIIYDPVIVSSQNEMLMA QEVVHEVRRSLLPLCSLVLMKRADAERLTQTAINTAADLNQAVKSLLAEGCQSVLLQGSH MATQSLTDVFATAKDSEPTFLPSLFGEGEVGTRHGLSGSLSAAIATFVNGGNAIFEAVVN ARNYIAQLQPQHTGIIGRSGELFNEFTHEITQHHRTNSDVKFYADKLNVSARYLAQVTRR ITGKAPKAIIDEYLTHEIEQQLAFTPKTVQEIAYAYGFRSQAHLAKFFKNINGLAPSEFR KEILLNKQQK >gi|283510566|gb|ACQH01000053.1| GENE 18 22348 - 23223 1203 291 aa, chain + ## HITS:1 COG:SP1468 KEGG:ns NR:ns ## COG: SP1468 COG0214 # Protein_GI_number: 
15901318 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxine biosynthesis enzyme # Organism: Streptococcus pneumoniae TIGR4 # 1 291 1 291 291 429 83.0 1e-120 MKENRQELNRNLAQMLKGGVIMDVTTPEQARIAEAAGACAVMALERIPADIRAAGGVSRM SDPKMIKGIQEAVSIPVMAKCRIGHFAEAQILQAIEIDYIDESEVLSPADDVYHIDKNKF DVPFVCGAKNLNEALRRIAEGATMIRTKGEPGTGDVIQAVRHLRMMQSEIRRLTSMSEDE LYEAAKAMQAPYELVRYVHENGKLPVVNFAAGGVATPADAALMMQLGAEGVFVGSGIFKS GDPAKRAAAIVKAVTNFNDAKMLAELSEDLGEAMVGINEQEIALLMAERGQ >gi|283510566|gb|ACQH01000053.1| GENE 19 23228 - 23791 592 187 aa, chain + ## HITS:1 COG:SP1467 KEGG:ns NR:ns ## COG: SP1467 COG0311 # Protein_GI_number: 15901317 # Func_class: H Coenzyme transport and metabolism # Function: Predicted glutamine amidotransferase involved in pyridoxine biosynthesis # Organism: Streptococcus pneumoniae TIGR4 # 1 186 1 188 193 211 56.0 5e-55 MRIAILALQGAFIEHSKMLAQLGVESFEVRQAADWEQPKDALIIPGGESTTMLKLLNELN LLAPIKQAIEGGLPVFGTCAGLILLAKHVVGDVLQRISTMDTTVCRNAYGRQLGSFFVQS NVKGISHAVPMTFIRAPYITNVGKDVEVLAEVDGHIVAARQGKQLVTAFHPELNDDLSIH RLFVEMV >gi|283510566|gb|ACQH01000053.1| GENE 20 24044 - 24817 566 257 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928440|ref|ZP_06422287.1| ## NR: gi|288928440|ref|ZP_06422287.1| hypothetical protein HMPREF0670_01181 [Prevotella sp. oral taxon 317 str. F0108] # 1 239 31 269 287 452 100.0 1e-125 MKILVILALTLLACTRGSAQADSTYHQETLDLVQYLDVRDFEVSQNCLGDYEFQAGDRSI RQYEMYDSDKKFVGYVEQSKELHTPYSYYYNYDTKGRLRYLSVSFDGVSIGNSYRYDSLG RITEIKDYSRPYKFKLNNLIEKMSSEYGCDLLNKRRLVSVHRSEGERDLKKPWYSVFYLE NERSHYGDKYLIDGTTGETLYVLKHEAWNECGSGDLSDFYTVAKGLPTSIEGKYLYELSK KKQAQKKKGVKKKVGRR >gi|283510566|gb|ACQH01000053.1| GENE 21 25119 - 26378 989 419 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928441|ref|ZP_06422288.1| ## NR: gi|288928441|ref|ZP_06422288.1| hypothetical protein HMPREF0670_01182 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 419 20 438 438 822 100.0 0 MKKTILVAILGVLLLTSSQKHPAKNTSANKQKMALALNAQYHLGISPDLVLPFDSAGMDS LLSGYKAFFDVTPSDSAGEEEEEEEEDEICVDEKMVDIYIPYAFNDLVKHGFKPISNEQF EQRLAELGIDSMQRKNLPYVFDHQHYFTIPSCVRGWSDRTESERKSLNDENYVKFSCMND FFIKGYNFISSPPMDIESVRQIHGKMCFRLSNDIIHINRFLFNNDKESFLWLCKYAPLWL KDLFVTYGYDKNELINRQMLKRMMQELKSDMWDSDLRETRLGFMYNTFARKIYTQKPYVA IREGLFKTLLQLPTNDKNSLWMDVLDFYGETLLKKDLSDCAYWFSRFTKNERYMIAAYTA YYLIKLREKYNRNGAPFYDDALRNDNRFRQYLEKQHYFNLPGFDKMSENILNYNKAVGQ Prediction of potential genes in microbial genomes Time: Sat May 28 01:08:33 2011 Seq name: gi|283510565|gb|ACQH01000054.1| Prevotella sp. oral taxon 317 str. F0108 cont2.54, whole genome shotgun sequence Length of sequence - 8310 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 6, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 1098 - 1157 3.0 1 1 Tu 1 . + CDS 1341 - 1691 283 ## gi|260912462|ref|ZP_05918999.1| conserved hypothetical protein + Prom 1853 - 1912 3.5 2 2 Tu 1 . + CDS 1937 - 2296 313 ## gi|288928443|ref|ZP_06422290.1| hypothetical protein HMPREF0670_01184 + Term 2512 - 2547 1.1 - Term 2372 - 2426 11.2 3 3 Tu 1 . - CDS 2520 - 4094 1464 ## COG1530 Ribonucleases G and E 4 4 Tu 1 . - CDS 4816 - 5100 397 ## COG0776 Bacterial nucleoid DNA-binding protein - Prom 5124 - 5183 4.2 - TRNA 5328 - 5404 58.5 # Arg ACG 0 0 - Term 5280 - 5322 11.7 5 5 Tu 1 . - CDS 5568 - 7622 2698 ## COG0326 Molecular chaperone, HSP90 family - Prom 7643 - 7702 4.3 + Prom 7585 - 7644 6.3 6 6 Tu 1 . + CDS 7873 - 8308 498 ## PRU_2812 hypothetical protein Predicted protein(s) >gi|283510565|gb|ACQH01000054.1| GENE 1 1341 - 1691 283 116 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260912462|ref|ZP_05918999.1| ## NR: gi|260912462|ref|ZP_05918999.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. 
F0295] # 1 116 1 118 119 211 86.0 1e-53 MKHKILLWLMLFFFPLALGTACNTEVDLKSIRYEGEVLVLEGKSRPYNIIRVTKSSSSKG LPVGVTLGFIGTGYDKQMSEGDIVHFHVIDFKEWQWPKTEDLRWPMFIGIVEFYDN >gi|283510565|gb|ACQH01000054.1| GENE 2 1937 - 2296 313 119 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928443|ref|ZP_06422290.1| ## NR: gi|288928443|ref|ZP_06422290.1| hypothetical protein HMPREF0670_01184 [Prevotella sp. oral taxon 317 str. F0108] # 1 119 61 179 179 245 100.0 7e-64 MKHKVLLWLMLFFFPLALGTSCNTEVNLENIKYEGKILSLIKSNNNELYNIILITSSTSR KGVPVGSSIGFYDRDFGEKMHEGDIVHFRVPMFKKWVGPETADHLWPEYVGVIDFNYNE >gi|283510565|gb|ACQH01000054.1| GENE 3 2520 - 4094 1464 524 aa, chain - ## HITS:1 COG:CPn0959 KEGG:ns NR:ns ## COG: CPn0959 COG1530 # Protein_GI_number: 15618866 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonucleases G and E # Organism: Chlamydophila pneumoniae CWL029 # 1 508 1 505 515 269 31.0 9e-72 MTSEVVIDVQQKEISIALLEDKKLVEYQSEPREVSFSVGNVYMAKVKKLMPGLNACFVDV GFERDAFLHYLDLGSQFDSYAKYLKQVQSDRKKLYPFAKASRLPDLKKDGSVQTTLTSGQ EVLVQIVKEPISTKGPRLTCELSFAGRYLILIPFNDKVSVSSKIKSGEERARLKQLIHSI KPKNCGVIVRTVAEGKRVAELDAELKTLTKYWEEAIEKVQKTQKRPQLVFEETSRAVALL RDLFNPTYENIYVNDEDVFKEVKHYVSMIAPDKVDIVKLYSGKVPIFDNFNITKQIKSSF GKAVNYKHGAYLIIEHTEALHVVDVNSGNRTRSEKGQEANALEVNLGAADELARQLRLRD MGGIIVVDFIDMNLAEDRQMLYERMCKNMQKDRARHNILPLSKFGLMQITRQRVRPAMDV SVDETCPTCFGKGKIKSSILFTDQLEGKIDRLVNKIGVKKFYLHVHPYVAAFINQGFISL KRQWQLKYGFGLHIVSSQKLAFLQYEFYDAKKQFIDMKEEIETK >gi|283510565|gb|ACQH01000054.1| GENE 4 4816 - 5100 397 94 aa, chain - ## HITS:1 COG:lin2048 KEGG:ns NR:ns ## COG: lin2048 COG0776 # Protein_GI_number: 16801114 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Listeria innocua # 3 91 4 91 91 59 39.0 1e-09 MTKADIINEIAIQTGIQKKDVSVVVESFMETIKGSLLDKKDNVYLRGFGSFIIKHRAAKT ARNIAKNTTITIAAHDLPCFKPAKSFVESMRGEQ >gi|283510565|gb|ACQH01000054.1| GENE 5 5568 - 7622 2698 684 aa, chain - ## HITS:1 COG:sll0430 KEGG:ns NR:ns ## COG: sll0430 COG0326 # 
Protein_GI_number: 16332281 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone, HSP90 family # Organism: Synechocystis # 1 577 4 598 658 433 38.0 1e-121 MQKGNIGVTTENIFPVIKKFLYSDHEIFLRELVSNAVDATQKLKTLAERGDFKGEQGDLT VRVSLDADKGTLTISDRGVGMTADEIDRYINQIAFSGVTDFLDKYKDNANAIIGHFGLGF YSAFMVSKKVEIVTRSYQEGAQAVKWSCDGSPAYEIEEVQKDDRGTDIVLYIDDDCKEFL EKARIQQLLNKYCKFLSIAIAFGKKTEWKDGKQVDTEEDNIVNDVEPLWTKTPSTLKDED YKAFYRTLYPMQDEPLFWIHLNVDYPFNLTGVLYFPRIHSNIELQRNKVMLYCNQVFVTD QVEGIVPDFLTLLHGVIDSPDIPLNVSRSYLQSDGDVKKISTYITKKVSDRLQSLFKEDR KGFEEKWDSLKLFINYGMLSQEDFFERAKDFALLKDVDGKYFTYDEYRTLIKDNQTDKDG QLVCLYTNNNEEQYSYIEAARAKGYSVLLLEGELDVPVASMLEQKLEKCHFVRVDSDIVE RLIQKDDAPKSNIDEADRENLSQVFRSQMPHIDKAEFNAEVEAMGETAQPILITQSEYMR RMKDMSRLQAGMAFYAQMPDAYSVVLNSDHALIKRVLEACKQATADTLQPVEAEIKGLEA RLAALRQQQSAKKPEEITEEERAEVSKTEKEIADQRAQKQAALADFGKGNDIVHQLIDLA LLQNGLLKGAALDAFLKRSVALIK >gi|283510565|gb|ACQH01000054.1| GENE 6 7873 - 8308 498 145 aa, chain + ## HITS:1 COG:no KEGG:PRU_2812 NR:ns ## KEGG: PRU_2812 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 4 143 5 143 303 141 47.0 6e-33 MSKILNKYIWLAETIYKARKITFEEINEEWRKEIDFSGGEDLPLRTFHKWRIGVEDLLGL VIECEHKGRYAYYIANEGEIANGTLKRWLFDTITEGNLLMENQRMKDRILLQHTPTNNQQ LKTILQAMRENHTLRVTYHSYYRPS Prediction of potential genes in microbial genomes Time: Sat May 28 01:08:55 2011 Seq name: gi|283510564|gb|ACQH01000055.1| Prevotella sp. oral taxon 317 str. F0108 cont2.55, whole genome shotgun sequence Length of sequence - 27238 bp Number of predicted genes - 19, with homology - 17 Number of transcription units - 9, operones - 3 average op.length - 4.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 479 449 ## PRU_2812 hypothetical protein + Prom 517 - 576 5.4 2 2 Op 1 . + CDS 596 - 2455 1417 ## azo2045 hypothetical protein 3 2 Op 2 . + CDS 2442 - 4976 1243 ## Fjoh_2693 hypothetical protein + Term 5000 - 5038 5.9 4 3 Tu 1 . 
+ CDS 5067 - 6716 1449 ## COG0464 ATPases of the AAA+ class + Prom 6968 - 7027 7.5 5 4 Tu 1 . + CDS 7201 - 7773 627 ## ZPR_0569 metallophosphoesterase domain-containing protein 6 5 Tu 1 . - CDS 7926 - 8684 846 ## COG0388 Predicted amidohydrolase - Prom 8930 - 8989 5.5 + Prom 9472 - 9531 4.0 7 6 Op 1 23/0.000 + CDS 9781 - 10776 1407 ## COG0714 MoxR-like ATPases 8 6 Op 2 . + CDS 10776 - 11654 1011 ## COG1721 Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) 9 6 Op 3 . + CDS 11651 - 12730 1358 ## PRU_0758 hypothetical protein 10 6 Op 4 5/0.000 + CDS 12756 - 13754 1126 ## COG2304 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain + Prom 13885 - 13944 3.3 11 6 Op 5 . + CDS 13973 - 15637 1783 ## COG2304 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain 12 6 Op 6 . + CDS 15680 - 18253 2756 ## PRU_0761 putative BatD/BatE protein + Term 18399 - 18457 4.9 + Prom 18269 - 18328 3.7 13 7 Op 1 . + CDS 18503 - 18700 74 ## 14 7 Op 2 . + CDS 18739 - 19002 63 ## 15 7 Op 3 . + CDS 18999 - 19694 575 ## PRU_0762 PAP2 domain-containing protein 16 7 Op 4 . + CDS 19697 - 20920 1262 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) 17 7 Op 5 . + CDS 20938 - 22134 1155 ## BVU_0331 putative DNA mismatch repair protein + Prom 22349 - 22408 4.0 18 8 Tu 1 . + CDS 22456 - 24399 1968 ## COG0171 NAD synthase + Prom 24458 - 24517 4.7 19 9 Tu 1 . 
+ CDS 24538 - 27238 1565 ## PRU_2765 hypothetical protein Predicted protein(s) >gi|283510564|gb|ACQH01000055.1| GENE 1 3 - 479 449 158 aa, chain + ## HITS:1 COG:no KEGG:PRU_2812 NR:ns ## KEGG: PRU_2812 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 157 148 303 303 165 50.0 5e-40 TFEAEPYCVKMFRRRWYMAARSPHYNKVMVYALDRIVNLELLTGNKFALPLDFDAKAFFN DCFGVIIGDNTATEEVTLRVSEWQAHYLRDLPLHASQRELAPEDNRCTFAFRLRPTFDFQ QELLSLGADVEVIEPHWLRETMKEKIREMAERYGEERD >gi|283510564|gb|ACQH01000055.1| GENE 2 596 - 2455 1417 619 aa, chain + ## HITS:1 COG:no KEGG:azo2045 NR:ns ## KEGG: azo2045 # Name: not_defined # Def: hypothetical protein # Organism: Azoarcus_BH72 # Pathway: not_defined # 4 608 2 584 591 197 29.0 7e-49 MSNNSSDINLKAFEELSGLTFIIPSLQRGYKWTPKNVKELLKDLWEFSKQENKNMYCLQP IAVVKHEERKYEVLDGQQRLTTLFLLYKYLTQKNAYTFEFERDNNESDNNRWDFLTAIDK KEKLDDSQIDRFYITKAYETIKDAFEKEPQEIFPGLEKENLKELFEELLKAKREKKSVQV IWYETPKEKAYETFRNLNSGKISLSNSDLIKGLLLNRVNGLPSEHHNMVARQLEEMEQAL NQNKFWFMLSREEPKHPYTRMDLLYNVVANVREEEVRIDYRTSFRWFAENDNGNLLEKWK QVRHTFLRMYDLYTDTYTYHYIGFLTYHHKGDTIKRIYKLIEENEKYSKQEFISQLRTSI QQIVNPDRERKIKDYSYINNSANELRDLLLLHNIETLLQRYQTLKDSKQYQLQYEFEQFP FEVLYKQNWDIEHIASQTDNSLRNENEWTRWLQSIKADYPSYFKYDGDINEDCLRAKIQR YKIAFEKEKSNNNFNRLYAVIIRYNEEETLGNEAIKKDEKDNIGNLVLLDKHTNRSYKNA LFPQKRKAIITADGLGANVETRQFIPLCTMQCFTKAYNKENNVKLNAWTKADAEAYLEDI KEKLSKYFTNKKEQDNELH >gi|283510564|gb|ACQH01000055.1| GENE 3 2442 - 4976 1243 844 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_2693 NR:ns ## KEGG: Fjoh_2693 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 2 586 6 623 848 148 27.0 8e-34 MNFIKAFENKIILIPLLQRDYVQGNVESIITPFIEQLVDKDRHSDLNYIYGYNEDGKFVP IDGQQRLITLWLLHLYCAAKSHNALNIKLQFMSREFANDFCEHLDSNLSILVGKLKEQDE EDLGHAIINESWFISSWKYNETVRGMLNALKHIHRECKRKDVSELWKQLNSDNCHIAFSF LDMGGEQGLDDDIYIKMNGRGRPLSVFENLKSWMDEQVTTSKQSTSNREEEEEWRNLWPT YMDNKWTDFFWENRNVGQEHPEEIDDEQLYCFCNLLILYWMRHKDKLKDKLEAIKKDETL 
FEDFLRLLKKEGKKEDVEHIVSYFFERLQRAALPPLVWIERLELMPTEFLRFAFDALNTL HDLSDKINKSETHIGSIKEGLTKTYYLSMCVGTFRRTLPLLYAILLYAILLCNDTKKLHN WFRVCRNLIENTTIEQEELPTILECLERFYTYVKDKDLSKVLAEDENIDTLLNKFRTSQV KEERQKAELPQVFRTVIEKLENHPFFFGRIGILFKVLGNRNDIVADGLDKFEKCTATLNT IFSKKSTTTVSEDFDKEDEYLLRRALMAVSKPNYYGYYKACDWCFCNNREEWKGFLDTQT DGLNTFKQLINNCIERLGSNNHDVEKTCKDYMKEIIAETESDYERKLNEQDRENKFSLHF VHHPGVWKYMQNKKCRWGDNDFNIMLKRRDRVNKMNLRTYALYLDYCDTSIRKDLVEDRK DWKCFCYEREDSCLYFEKHFSSIRRTIAIDVLHNGSKEDDYSLKLFVRPTDEESAKGPSV LQATNEQYLSSIISKIQDIQVRISKESGRYTTVERYSRDGIIRAIRTFITTINATIEGKA NETK >gi|283510564|gb|ACQH01000055.1| GENE 4 5067 - 6716 1449 549 aa, chain + ## HITS:1 COG:Cj0377 KEGG:ns NR:ns ## COG: Cj0377 COG0464 # Protein_GI_number: 15791744 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATPases of the AAA+ class # Organism: Campylobacter jejuni # 300 513 320 535 570 139 35.0 2e-32 MEAKKEMKDMNLLEAMEHIVALAKGSELSEEFYNKADEYICFIADKLQLSKRQSVMLSLF VDKSDNRSILASDIAEFLQCTTIQIIKCMNDVDELVHREYIIKRKDDRQLSFRVPIDIVE AFRHNETFIAKPKTNLSTSELMMELEEIFKTRDDKELTYEQTAGKINNLLEANAGLLFVQ ELKKLNLIDDEKMLLILFCHLFINNEDDEIRFYDIKFLYNSKRWNRIKHSLNINVHPLQR KGLIEFVNDNGMADCEAFRLTRKAKRELLSELNFSSMSQVCKGMIKAKDIVAKQLFYEND TQQQIAELEGLLDENRYQQIHSRMKEAGFRCGFTCLFYGAPGVGKTETVLQLARKTGRNI IQVNVEQIKSMWVGESEKNIKALFDDYRNQVESQSLAPILLFNEADAVIGMRHKGAERAT DKMENALQNIILQEMERIDGILIATTNLVQNFDKAFERRFLYKVKFNAPSIQTRRHIWQS IMPEISEESASWLASHYNLSGGQIENVARRYAINTILYGPPTEELPTLCKCSENESKETY GISQIGFSK >gi|283510564|gb|ACQH01000055.1| GENE 5 7201 - 7773 627 190 aa, chain + ## HITS:1 COG:no KEGG:ZPR_0569 NR:ns ## KEGG: ZPR_0569 # Name: not_defined # Def: metallophosphoesterase domain-containing protein # Organism: Z.profunda # Pathway: not_defined # 1 188 1 207 207 142 40.0 5e-33 MTILQISDTHNRHGLLQDIPMADVLIHCGDFTDRGTETEALDFLNWFIALPHPHKLFVTG NHDLCLWDAEGIDDLPPNVLFLQDKACEIGGVKFFGLGYNHHERLIPNEVDVLITHEPPM MIRDKSNGTHWGNLPLHNRVMEIRPKFHLFGHAHESYGTDVFGNTVFSNGAVLDDQYHLV NAPAMLTLET >gi|283510564|gb|ACQH01000055.1| GENE 6 7926 - 8684 846 252 aa, 
chain - ## HITS:1 COG:STM0308 KEGG:ns NR:ns ## COG: STM0308 COG0388 # Protein_GI_number: 16763691 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Salmonella typhimurium LT2 # 1 252 4 255 255 208 43.0 7e-54 MEIALLQTDIVWGDKEENLMRVERLMDEHPKAQLFVLPEMFATGFMVGGDVVTEPQEGQV HRWMVDVARRRGCAVAGSVAVEVDGRRHNRFYFVTPDGATWHYDKRHLFTYAGEDRLYTK GEDRVVVRYGGLRFLLQVCYDLRFPVFSRNRGDYDVAIYVANWPTARLFAWQTLLRARAI ENQCFVLGVNRVGNDPANAYSGGTVVLDPQGQTLAECRANEVDVAQATLSLESLEGLRRT FPVMRDGDVFIL >gi|283510564|gb|ACQH01000055.1| GENE 7 9781 - 10776 1407 331 aa, chain + ## HITS:1 COG:ML1810 KEGG:ns NR:ns ## COG: ML1810 COG0714 # Protein_GI_number: 15827968 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Mycobacterium leprae # 28 331 50 352 377 305 49.0 1e-82 MAESLDIRELNQLIEQQSAFITNLTTGMNRVIVGQKHLIDSLLISLLSDGHILLEGVPGL AKTLAIKTLAQLVDADYSRIQFTPDLLPADVIGTLVYSQKEENFQVKKGPVFANFVLADE INRAPAKVQSALLEAMQEHQVTIGEKTFPLPKPFLVMATQNPIEQEGTYQLPEAQVDRFM LKVVIDYPTLEEEKLIIRENIHGGLPQVLPVTTANDIMKARGVVNQVYIDEKIEQYIADI VFASRYPERYQLAELKPLINYGGSPRASINLAKAARAYAFIKHRGYVVPEDVRAVVHDVM IHRIGLSYEAEASNVTSEEIVSKIINKVEVP >gi|283510564|gb|ACQH01000055.1| GENE 8 10776 - 11654 1011 292 aa, chain + ## HITS:1 COG:BB0175 KEGG:ns NR:ns ## COG: BB0175 COG1721 # Protein_GI_number: 15594520 # Func_class: R General function prediction only # Function: Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) # Organism: Borrelia burgdorferi # 9 276 14 278 291 148 32.0 1e-35 MDALELFKQVRKIEIKTRGLSSNIFAGQYHSAFKGRGMAFSEVREYQFGDDVRDIDWNVT ARFRRPFVKVFEEERELTVMLLIDVSGSLDFGTTQRTKREMATEMAAILAFSAIQNNDKI GVIFFSDRIEKYIPPKKGRKHILYIIHEMLDFKPESKRTNVAAAIEYLTRVMKRRCIAFV VSDFYAENSFQKELQIANSKHDVVAIQVYDQRAKTLPNVGLMKVKDAETGHEMFIDTASA KLRRAHTEYWLERMNTLKTTFAQSNVDWVSVATNEDYVKAMMLLFMQRGQNR >gi|283510564|gb|ACQH01000055.1| GENE 9 11651 - 12730 1358 359 aa, chain + ## HITS:1 COG:no KEGG:PRU_0758 NR:ns ## KEGG: PRU_0758 # Name: not_defined # 
Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 8 358 5 354 355 359 57.0 8e-98 MIKRHITLLIVALLAVAAKAQVSVEQKIDSFAIYIGQQTALRLSVTAPKGQTVVFPNFTP SQEMVPGIEVLQTSEEETTELDNGMQKTTKSYTLTSFDEKLYSLPGLKVKVNNKEYEANI LALKVVTMDVDTLHPNQFFPPKDVQNNPFSWTEEPWASMFYLSVLMLLLAALGVYLLVRL KQNKPVITRIRIVKKILPHQKAMTAIERIKAEKMTASENQKEYYTQLTDALRQYINERFG FNAMEMTSAEIIEHLNANGNQEMMNELKEIFQTADLVKFAKYATLINENDLNLVNAIQFI DKTKIENQPDTERIVPELTEADKRTKQTRVALKTVIWTIAICVAAIMVYIVYSLYDVLG >gi|283510564|gb|ACQH01000055.1| GENE 10 12756 - 13754 1126 332 aa, chain + ## HITS:1 COG:VCA0172 KEGG:ns NR:ns ## COG: VCA0172 COG2304 # Protein_GI_number: 15600942 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain # Organism: Vibrio cholerae # 8 330 9 318 318 153 34.0 5e-37 MEFANKEYFLLLALLVPYIFWYFIYKKRGEPTLRMSDTRAYRYAPKSWRMRLMNLPVLLR CATFALAVIILARPQTHTSWGNKQVEGIDIMLAMDVSTSMLAEDLTPNRMEAAKDVAAEF IADRPNDNIGLTIFAGEAFTQCPMTTDHTSLLNMLQTVRTDIAAKGLIQDGTAIGMGLAN AVSRLKDSKAKSKVVILLTDGSNNMGDLSPMTSANIAKSLGIRVYTIGVGTNKVARYPMP VAGGVQYVNMPVEIDTKVLKDIAATTDGNFYRATNNQELKQIYKDIDKLEKSKMSVKKFS KLYEAYQPFAILALITLLAEILLRTTVLRRIP >gi|283510564|gb|ACQH01000055.1| GENE 11 13973 - 15637 1783 554 aa, chain + ## HITS:1 COG:VCA0171_1 KEGG:ns NR:ns ## COG: VCA0171_1 COG2304 # Protein_GI_number: 15600941 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain # Organism: Vibrio cholerae # 26 250 23 244 336 92 27.0 3e-18 MFRFEDPIYLWLLLLVPVLALVALLAHRKKRRQLKAFGDPELLNDLMPDVSTYRPWVKLG LATTAFALLVVMLARPQMGTKITHDKRNGIETIIAVDISNSMMAQDVVPSRLEKSKLLIE NLVDNFTHDRIGLVVFAGDAFVQLPITTDYVSAKMFLQNIDPALIATQGTDIAKAINLSM RSFSQQKDIGKAIIVITDGEDHEGGALEAAKAANERGIHVFILGIGSTKGSPIPTSEGGY LTDRSGQTVLTALNESMCKQIAQAGNGTYIHVDNTNDAQEKLNNELAKLQRADTQAVIYS EYGEQFQAVCIIVIILLIAEILILDIKNPKWRNIHLFGSKKSAAMLLLLIAPTLAFAQND RHFIRTGNKLYRNQNYAKAEVEYRKAMSQNGSNAHAVYNLGNALMMQQKDSAAIAQLENA 
GKMETNKTRKAMAYHNIGTICQRHQLYGDAIKAYQEALRNNPNDNETRYNLALCKRLNKN NKKQQKQQKQQKQEQEQQKQQKEKEKQQPKPKEQMSKENAEQLLNATIQDEKATQQRLKK AMQQPSRRTVEKNW >gi|283510564|gb|ACQH01000055.1| GENE 12 15680 - 18253 2756 857 aa, chain + ## HITS:1 COG:no KEGG:PRU_0761 NR:ns ## KEGG: PRU_0761 # Name: not_defined # Def: putative BatD/BatE protein # Organism: P.ruminicola # Pathway: not_defined # 4 857 1 853 853 890 54.0 0 MKRMKRHILLMVALANVMFVAAQRLTVSAPSKVAAGENFRIAYTINTQDVDDFKAGNIPS AIEVIAGPYTSSQSSYQMTNGHTSSSSSVTYTYTLYATKNGTYTISPAKAMVHGKAIMSP AVKINVVGTAKPSASGAPKMHDYDNDEDAMRAAGSKISGSDLFIKVSASKKRVREQEPVL LSYKVYSLVELTQLNGKMPDLNGFHTQEVKLPTQKSFHIERLNGKNYKCVTWSQYVMYPQ MSGELKIPSIKFDGIVVQRNSNVDPFEAIFNGGAGYVEVKKEIEAPGLTLQVDPLPARPA NFSGGVGNFTISAQLNKKEVKTGEPLNLRIVVSGAGNLKLLKAPTVNFPKSFDKYDVKVT DKTQLTTNGIEGNMIYDFLAVPQQIGKYEIPPTEFVFYDTKTQQYRTIRTQRFTLNIEKG TGTTSEMSKFEEEQNKDIRPIMQGPAVMLKSNRMFFASPLWFVLLLVIVGATVVIFIVLR QRQEIYSDSRRLRGSRANKVAARRLKLAGKLMEESRQNEFYDEVLHALWGYVSDKLGISV EQLTRENIAETLNRRAVNAETIDSFIAAIDECEFERYAPGDSQGNMSKTYDMAVKAIMDI ESTMKSNKNAPTLVLLTLVMSMLSLTANAVTKASADQEYKRGNYPQAIADYKALLKKAPS AEVYYNLGNAYFRSDSIPQAILAYERAALINPGNSQIRFNLQFARGKTIDKVAEPDEMFF VTWYRSAANLATVDGWATTALLSAVLLGCCILLYFFNSRIPVRKVGFGCSIAFAVLFVLS IIFAMYQKSALTSKEGAIIMAPAANLKKTPIRSGADEAVLHEGTRVDIADRSIKGWLGVK LADGREGWIEQNTVEEI >gi|283510564|gb|ACQH01000055.1| GENE 13 18503 - 18700 74 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MADKGGWQTYLQHLFFSALNILRSSQAMYKSCNLHLQQARPADIKLCPTYIKDMPTIYKG VHLHI >gi|283510564|gb|ACQH01000055.1| GENE 14 18739 - 19002 63 87 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLRTTWTSRNLRLNVSLFLAFIDFNAKQKSLTPYIIRRRRRQNNPRERGPTTQRMHKHAH KTSVATDAKDREIRTHQQQQNNSKKEE >gi|283510564|gb|ACQH01000055.1| GENE 15 18999 - 19694 575 231 aa, chain + ## HITS:1 COG:no KEGG:PRU_0762 NR:ns ## KEGG: PRU_0762 # Name: not_defined # Def: PAP2 domain-containing protein # Organism: P.ruminicola # Pathway: not_defined # 6 225 5 229 229 184 43.0 2e-45 MNLTNILELDRMVLAWFNGSNSLFVDSLAVNLTSGFTWIPLYVVLIYVIIKNNDTMPQIF 
LTIGCAVLAVVVVSVSVELIIKPLVGRWRPSNDPYIKHTIKIVNGIRSGQYGFFSAHAAN TFSLAVFLSLLIKSRPLAVMLCLWSAVNCWTRLYLGLHYPLDILFGLLWGALIGWSAYAL YRRWGKPLGLPRTQVTPQTTASAFLKADVGKVLLTMALWICAIIIYSLFNA >gi|283510564|gb|ACQH01000055.1| GENE 16 19697 - 20920 1262 407 aa, chain + ## HITS:1 COG:HI0245 KEGG:ns NR:ns ## COG: HI0245 COG0809 # Protein_GI_number: 16272205 # Func_class: J Translation, ribosomal structure and biogenesis # Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Haemophilus influenzae # 6 402 1 350 363 229 34.0 9e-60 MNTKHIRIKDYNYDLPDERIAKFPLPQRDKSKLLVYNHGEVSQDEFSNITRYLPQGALMV FNNTKVIQARLHFRKETGSLIEIFLLEPALPADYEQMFQTTDACSWYCMVGNLKKWKGGP LTTTVEIAGQPAAVQLTAERDMSCETALKIDFKWTGNKLSFAELLDKAGELPIPPYLNRN TQESDKTTYQTVYSKIKGSVAAPTAGLHFTQRVLADIDQHGIDREELTLHVGAGTFKPVK SEEIEGHSMHTEYVCIRRTTLEKLLQHNCQAIAVGTTSVRTLESLYYMGCKVKANPNVSL EQLHVNQWEPYEPTPTNITPTEAIGELIAYMDRNNLPSLHSSTQIIIAPGYQYKIVKRLV TNFHQPQSTLLLLVSAFVGGDWKSIYRYALDNDFRFLSYGDSSILIP >gi|283510564|gb|ACQH01000055.1| GENE 17 20938 - 22134 1155 398 aa, chain + ## HITS:1 COG:no KEGG:BVU_0331 NR:ns ## KEGG: BVU_0331 # Name: not_defined # Def: putative DNA mismatch repair protein # Organism: B.vulgatus # Pathway: not_defined # 1 394 1 359 362 293 43.0 6e-78 MKKGDKVRFLSETGGGIVAGFQGKNIVLVQDSDGFEIPMSINEVVVVDEELDRKQGRTSK VANEQKTNTTQQDTPGKQSIKALLSGGYENEETDNETEHETDPADSEITFRKPVEERVGG NSLNVYLAFVPENARQLSHTKMAVELVNDSNYYLRYAFATAEGESWILRNEGEIEPNTKV TLETIAAEDYNSFEQSAFQAIAFKRQKSYVRKPAIDVKLHIEPLKLFKQHLFAETPFYNE PAMLYTIVEGDEPAEAPVSEIDADKLKTEMLSKAAIQKMKQDARSAKGAVSATGKHVHGG RQANEPLVVDLHAHEILETTQGMDSLDILQYQLEVFRRTLKEHANERGLKIVFIHGKGEG VLRKAIINELNRHFKGYTYQDASFQEYGYGATLVKVKG >gi|283510564|gb|ACQH01000055.1| GENE 18 22456 - 24399 1968 647 aa, chain + ## HITS:1 COG:CAC1050_2 KEGG:ns NR:ns ## COG: CAC1050_2 COG0171 # Protein_GI_number: 15894337 # Func_class: H Coenzyme transport and metabolism # Function: NAD synthase # Organism: Clostridium acetobutylicum # 326 640 1 309 310 417 60.0 1e-116 
MQHGYIKVAAAIPAVKVADCHFNAEQTEQQIKQANEQGAEIVVFPELGITGYTCQDLFTQ QLLIEQAELAVGSLLENTKTLDIIAVVGVPVAVDSILLNCAVVFQRGHILGIVPKTYLPN YSEFYEKRWFASTHHLNETSIHYAGQQALLTAQSQIFVTADGVKFGVEICEDVWAPNPPG TYLALAGADIVCNLSASDELIGKHTYLKSLLAQQSARTMAGYVYSGCGFGESTQDVVYGG NALIYENGKLLTQNKRFDFEPQIVVSEIDIFKLRAERRTNSTYVNAQHGHTALLHTAQAP LTNKPFALQRTIDPLPFVPQDEQMYDSCEEIFNIQVSGLAQRLKHIHASKVVLGISGGLD STLALLVCVRTFDKLGLSREGIIGVTMPGFGTTNRTYRNAIDLMRELGITTHEINIAKSV TQHFEDIGHDMAVHDVVYENGQARERTQILMDLGNKLGGIVIGTGDLSELALGWATYNGD HMSMYAVNVSIPKTLIKHLVRHVAQTMHNEATCQTLYDIIDTPISPELIPADEAGNIKQK TEDLVGPYELHDFFIYHFLRHGFTPQRLFIMARHAFASPQQRAKHYSDDEIKHWLRVFLR RFFAQQFKRSCLPDGPKVGSVSLSPRGDWRMPSDATAKMWLDECDKL >gi|283510564|gb|ACQH01000055.1| GENE 19 24538 - 27238 1565 900 aa, chain + ## HITS:1 COG:no KEGG:PRU_2765 NR:ns ## KEGG: PRU_2765 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 32 900 29 872 967 232 26.0 5e-59 MKQLLGILLCVMLPLPQQSFAQQHSTTSKKTREMKMYGRVVDSFTSLAIVDSKITLMAPD STVIDSCSTQTWNRNAIHPEAYFFMTPKVSEGKYIIKVENPKYETTYYNHEISFKTRATV IDLKDLTLKRKRMEDVEHNLGEVVVKSTKIKMINKGDTIVFNAEAFNLPEGSMLDALIRQ LPGATMNSNGEIFINGRKLDYLTLNGKDFFKGNNKQLIENLPNYTVKDLKVFEKSTEKSQ AMGIDVEKKDYVMDIQLKKQYEKNFIGNADIAGGTNSRYALRLFGLYFTPRIQVSTFANI NNVNEDRKPGEKGDWNPTKLPKGQVTRKTIGLNFANSNEKNTVNNSLSTSVSWLSTHNIT HTAAESFLGTANNNFTRRINNSTTKNTIYELNDQFHVNKSYQLMAMLWLNYNKFDNFSTD KSAVFQTDPKSLGSTEAILDSAFTLPLPQLSTYKGINRQLSQTLGNGHTLSSALDIYLGK PLKSGDRIGVQVKSTYENIKNYNFSKYRLDYLQGAGKKDYRNRYDESPSYTYSYTFNPQY IFVLPSGFNIITGYEYRQAGNYRNNNKYRLDRLQEWQNEDHSIGSLPSTRDTLLRSLDLN NSYTGYYMSKQHMGVLGAHYMKRKGKNSFTFGVFTPLSYKKEQLDYRSEPITKPISQSNW IKQINFSGQLIHNEYEFVAWGNSTTTTPDMYSLIDRRDNSNPLAVSLGNPNLKNGRQSSL GIRFSDKHNRRYNLPSIYFGIQETHNAVANGFIYDPQTGVYTYKPENVNGNWSAHSEIYY NVPLDSMKYFNLFTETEYNYDHNVDLTGVAGQTSSVLSKVNNHTTRQYLEVKYQKETLSL SLMGQLSWRNVNSTRKGFTPFNTYDYEYGFIGSYSMKTGLEMGMDLKMFSRRGYADEQIN Prediction of potential genes in microbial genomes Time: Sat May 28 01:09:59 2011 Seq name: gi|283510563|gb|ACQH01000056.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.56, whole genome shotgun sequence
Length of sequence - 8949 bp
Number of predicted genes - 11, with homology - 10
Number of transcription units - 7, operones - 2 average op.length - 3.0
N Tu/Op Conserved pairs(N/Pv) S Start End Score
1 1 Tu 1 . + CDS 16 - 216 245 ## gi|260912485|ref|ZP_05919021.1| conserved hypothetical protein
+ Prom 596 - 655 5.5
2 2 Op 1 . + CDS 829 - 1026 160 ##
3 2 Op 2 . + CDS 1077 - 1979 313 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein
4 2 Op 3 . + CDS 1976 - 2752 331 ## BF4122 hypothetical protein
- Term 3683 - 3722 -0.6
5 3 Tu 1 . - CDS 3788 - 4069 62 ## gi|288928470|ref|ZP_06422317.1| conserved hypothetical protein
6 4 Tu 1 . - CDS 4521 - 5087 342 ## gi|288928471|ref|ZP_06422318.1| hypothetical protein HMPREF0670_01212
- Prom 5124 - 5183 2.8
7 5 Tu 1 . - CDS 5208 - 6095 954 ## COG0084 Mg-dependent DNase
- Term 6161 - 6202 0.7
8 6 Op 1 . - CDS 6234 - 6932 661 ## BF3532 hypothetical protein
9 6 Op 2 . - CDS 6957 - 7931 1205 ## COG0142 Geranylgeranyl pyrophosphate synthase
10 6 Op 3 . - CDS 7944 - 8654 675 ## BT_2059 TonB
- Prom 8725 - 8784 3.2
+ Prom 8681 - 8740 4.7
11 7 Tu 1 . + CDS 8776 - 8947 67 ## PRU_1827 hypothetical protein
Predicted protein(s)
>gi|283510563|gb|ACQH01000056.1| GENE 1 16 - 216 245 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260912485|ref|ZP_05919021.1| ## NR: gi|260912485|ref|ZP_05919021.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str.
F0295] # 1 66 905 970 970 134 98.0 2e-30 MCNAYVTQSFMKGRFAVKLEAFDLFQQLKSVDYQVNGQGKSEIRYNTIPHYIMLHAIYKM NVGGKK >gi|283510563|gb|ACQH01000056.1| GENE 2 829 - 1026 160 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDGVMNIMTAIIDSQTLEIMYIGLMLLYSIYVIREKRIVLSAQKHFTIYFPMNYLEFQIG LNKSY >gi|283510563|gb|ACQH01000056.1| GENE 3 1077 - 1979 313 300 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 9 300 10 306 311 125 29 1e-28 MNILETNHLTFNYRNHSVLEDINLKIPEGSIYGYLGKNGEGKSTTIKILLGLEQTPKNTV FFNKKEFNSNRSYILQNIGCLVEQPFFYADLSAYENLNYLDMLYHCGQKRIKEVLELVNL TKDKDKKVRKYSTGMKQRLGIAMAMFHNPKLLILDEPLNGLDPQGVYEMRELMLKLQSEG KTIFISGHILAELEKICTHIGVLNNKRLLFQGEINSLLNQKKLSYLIYTDQPERGMEICK EHLIPVNRISDRTFIIEIEQDGSFDSFRNLMQQNNIQIIFADKCQEKLESVFLHLIKTDL >gi|283510563|gb|ACQH01000056.1| GENE 4 1976 - 2752 331 258 aa, chain + ## HITS:1 COG:no KEGG:BF4122 NR:ns ## KEGG: BF4122 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 9 258 10 255 257 119 35.0 1e-25 MKPVSFFTILKSEHYKLRFNIAIWLFLLFPFFITLYIDVYILFKHADAVNNPAMTFDYNP WVWLLGRYIFEFYSLLYPILAAVLSYSLCDVEYKNYGFRLLFTRPVSKVTVYSSKIVFLL EIIFISSLIGYLTFLLSGFALDKLLPGYKFSSYNVNTLMASYFSYLFIALSAVSFIQYNL SLIFKSFVLPIGFASLMTIFGIIAQNKDYAYLIPYSTVWRLNHYVYSGIINFSKSEYASI AYVPFFIVISFFVFIRKK >gi|283510563|gb|ACQH01000056.1| GENE 5 3788 - 4069 62 93 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928470|ref|ZP_06422317.1| ## NR: gi|288928470|ref|ZP_06422317.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. F0108] # 1 93 7 99 99 164 98.0 1e-39 MFNELNQIVQQPVAQIQDTDNLSNTIAQQPVAQLGEVPQTLSAIYHSDVCWIEEIFTQSA IAATNWASHVILLNSKFRLGECYSTHQLINSKS >gi|283510563|gb|ACQH01000056.1| GENE 6 4521 - 5087 342 188 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928471|ref|ZP_06422318.1| ## NR: gi|288928471|ref|ZP_06422318.1| hypothetical protein HMPREF0670_01212 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 188 1 188 188 353 100.0 2e-96 MKRIILLLLVLFATRVSAQTWNTGEFQQMGEKTILKMRRIFKIPTQVKLYATAYIYVCDL TTKKKSVPTDSIASLTVSDWLRKVYDCRHKVYVYDDQLRYFLVDNLANRDEGGDERKSYL GTKDDRAIAETLQANRFDCVIEADRCQNKDAKQLVLICVKRDGMYRAVVKGEKTTLKLVT KDDLLIEP >gi|283510563|gb|ACQH01000056.1| GENE 7 5208 - 6095 954 295 aa, chain - ## HITS:1 COG:VC0103 KEGG:ns NR:ns ## COG: VC0103 COG0084 # Protein_GI_number: 15640135 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Vibrio cholerae # 37 290 1 252 255 207 42.0 2e-53 MYVGGFSPHRYGEQGFLGFCNWAQSQQIVNENMTTEIIDTHIHLDVDDYRDDLDEVVARA KAAGVSKVFIPAIDEQSGDAALLLAQRYPNYAFAMMGLHPEEVKADYAEVLKRMKRRFEE PHPFIAIGEVGLDFYWSREFEKQQLEAFEEQVRWSAEYKLPLMIHCRKGQNEMVHIIKKY ESQLPGGVFHCFTGNEREADAFLQFDKFVLGIGGVLTFKKSNLPSVLPHIPLSRIVLETD GPYMAPVPMRGKRNESSYLTHVVEVMANAYATTTDEIARQTNENVKRVFGSRVPF >gi|283510563|gb|ACQH01000056.1| GENE 8 6234 - 6932 661 232 aa, chain - ## HITS:1 COG:no KEGG:BF3532 NR:ns ## KEGG: BF3532 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 230 1 231 236 191 43.0 2e-47 MPYRRLPKTDAARLKALKMLLENESVYAARNRFIDWSVINRARPAYDRLLTAAQQYRTSL TAQARQTGKIDRVQRNAVMYVSHFLQVLFMSVERGEIKKSALKLYGLSEHATSLPNLKTQ EGLAEYGKRVIEGEKERIKQGGRPIYNPTIGMVSTHYDIFMETYKQQKSLQARTNKAVDT LHTLRPETDNIILDLWNQIEKHFEHEPPEVRFAECRKLGVIYYYRRNEEHLY >gi|283510563|gb|ACQH01000056.1| GENE 9 6957 - 7931 1205 324 aa, chain - ## HITS:1 COG:MA0606 KEGG:ns NR:ns ## COG: MA0606 COG0142 # Protein_GI_number: 20089495 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Methanosarcina acetivorans str.C2A # 9 324 15 324 324 196 37.0 7e-50 MLTDDLILSKINAYLEALPFNRKPQSLYEPIRYVLSIGGKRIRPALALLAYNLFKDDPES ILAPACGLETYHNYTLLHDDLMDNAALRRGKPTVHKRWDANTAILSGDSMLVLAYQLVAQ CDAAKLKPVMELFTETALEIGEGQQYDMDFEHRNDVTEDEYIEMIRLKTSVLLACAMKMG ALLADAPQRDADLLYKFGEQIGLAFQLQDDFLDVYGDSKVFGKAIGGDIVSNKKTYMLIN AFNKADAAQRAELQHWVSQEQCNAQEKIAAITQLYNAMGIDKMAERRIEHYFECGKRYLD 
EVNVPAERKENLINYASKMMNRKY
>gi|283510563|gb|ACQH01000056.1| GENE 10 7944 - 8654 675 236 aa, chain - ## HITS:1 COG:no KEGG:BT_2059 NR:ns ## KEGG: BT_2059 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 236 1 226 227 152 41.0 1e-35
MEVKKTHSANLENKRVTNYLLGLVVVLSFLFVALEYNSNSRAFDFDDEVPDDFFEEMEIL
PDVEKNEMVAAAIPTAAPSITTKIKAVDTPVALPDKLNERNDEEKEGEEATVQQNETPTT
AIEQPVPEPIANDDKPRVVQQMPEYPGGIVEFMKWLQRTLRYPPTAQEQGIQGSVMVSFI
VNADGTITDQKVVRGVNEDLDAEALRVISHMPKWKPGLDKGKPCRTLFAIPIVFKL
>gi|283510563|gb|ACQH01000056.1| GENE 11 8776 - 8947 67 57 aa, chain + ## HITS:1 COG:no KEGG:PRU_1827 NR:ns ## KEGG: PRU_1827 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 57 3 59 316 69 56.0 4e-11
MKKVIFALILFFMADKMAAQESQTAYNFLRLPVSAHAAALGGDNVTLINDDAWLMFN
Prediction of potential genes in microbial genomes
Time: Sat May 28 01:10:39 2011
Seq name: gi|283510562|gb|ACQH01000057.1| Prevotella sp. oral taxon 317 str. F0108 cont2.57, whole genome shotgun sequence
Length of sequence - 22299 bp
Number of predicted genes - 17, with homology - 15
Number of transcription units - 14, operones - 3 average op.length - 2.0
N Tu/Op Conserved pairs(N/Pv) S Start End Score
1 1 Op 1 . + CDS 14 - 796 934 ## PRU_1827 hypothetical protein
2 1 Op 2 . + CDS 846 - 1571 223 ## PROTEIN SUPPORTED gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1
+ Prom 1824 - 1883 3.4
3 2 Tu 1 . + CDS 1919 - 2152 94 ##
+ Term 2190 - 2232 3.1
- Term 2741 - 2792 15.6
4 3 Tu 1 . - CDS 2829 - 4601 2051 ## COG5016 Pyruvate/oxaloacetate carboxyltransferase
- Prom 4625 - 4684 7.8
+ Prom 4585 - 4644 6.1
5 4 Op 1 . + CDS 4867 - 7698 3396 ## PRU_0256 hypothetical protein
6 4 Op 2 . + CDS 7732 - 8904 1343 ## COG2873 O-acetylhomoserine sulfhydrylase
7 5 Tu 1 . + CDS 9019 - 10197 1324 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities
+ Term 10229 - 10268 0.6
- Term 10206 - 10268 13.3
8 6 Tu 1 . - CDS 10283 - 10987 767 ## COG4912 Predicted DNA alkylation repair enzyme
- Prom 11157 - 11216 4.5
+ Prom 12592 - 12651 6.0
9 7 Tu 1 . + CDS 12674 - 12895 366 ## PRU_1260 hypothetical protein
10 8 Op 1 31/0.000 + CDS 13005 - 14567 1503 ## COG1271 Cytochrome bd-type quinol oxidase, subunit 1
+ Prom 14653 - 14712 3.0
11 8 Op 2 . + CDS 14824 - 15966 1079 ## COG1294 Cytochrome bd-type quinol oxidase, subunit 2
+ Prom 16141 - 16200 2.9
12 9 Tu 1 . + CDS 16227 - 16916 684 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family
+ Term 16969 - 17000 -0.1
- Term 16991 - 17056 13.6
13 10 Tu 1 . - CDS 17102 - 18205 1154 ## COG0489 ATPases involved in chromosome partitioning
- Prom 18229 - 18288 7.2
14 11 Tu 1 . - CDS 18373 - 19518 906 ## gi|288928489|ref|ZP_06422336.1| hypothetical protein HMPREF0670_01230
15 12 Tu 1 . - CDS 19685 - 20377 589 ## COG2323 Predicted membrane protein
- Prom 20406 - 20465 2.6
- Term 20415 - 20474 -0.7
16 13 Tu 1 . - CDS 20655 - 21470 438 ## gi|288928491|ref|ZP_06422338.1| hypothetical protein HMPREF0670_01232
- Prom 21535 - 21594 2.7
+ Prom 21591 - 21650 8.2
17 14 Tu 1 .
+ CDS 21867 - 22112 70 ## + Term 22192 - 22233 -0.7 Predicted protein(s) >gi|283510562|gb|ACQH01000057.1| GENE 1 14 - 796 934 260 aa, chain + ## HITS:1 COG:no KEGG:PRU_1827 NR:ns ## KEGG: PRU_1827 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 260 57 316 316 280 50.0 5e-74 MFNNPALLSSVSDKTLALNYMNFMRGVNLAGVAYNKVIKERASAALSVNYVDYGKMRYTT ADNQDLGEFAARDIAATAYFSYMLTDKLAGGIAAKVLSSSIGHYTSIGMGVDLGLNYYDS DRELSLSVVAKNLGGQLKAYEDDYEAMPIDVQMGISKRLQHTPFRLSATLVGLNHWHYAL MKHFVFGVDVDLSPNIWVGAGYSLRRANEMKVGDAENASSHGAGLSLGAGLQLERFKLNL AYGKYHVSSSSVVLNVSFAM >gi|283510562|gb|ACQH01000057.1| GENE 2 846 - 1571 223 241 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 [Treponema pallidum subsp. pallidum str. Nichols] # 1 225 22 274 863 90 29 9e-18 KKETYKQQYTMKKITIAIDGLSSCGKSTMAKMLAKEVGYIYVDTGAMYRAVTLFAMQNSM IAPDGHVDREALKAKMDTLRVEFKLNPETGKAETYLNGENVEREIRGMEVSAHVSSIAAI DFVRTALVAQQQRMGQDKGIVMDGRDIGTVVFPDAELKVFVTASAEVRAQRRFDELVGKG MEANYDEILRNVQERDYKDSHREVSPLRKADDAIELDNGQLTIAQQLQWLVDRFEEKAGK E >gi|283510562|gb|ACQH01000057.1| GENE 3 1919 - 2152 94 77 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDKTFLTSFLSFSLSPSPFRQKGLKKELINLRNLNFCHNKVCIFKYLAYLCSDKFRSRGL LIASQFSQTNDEYPNCQ >gi|283510562|gb|ACQH01000057.1| GENE 4 2829 - 4601 2051 590 aa, chain - ## HITS:1 COG:AF1252m KEGG:ns NR:ns ## COG: AF1252m COG5016 # Protein_GI_number: 18677784 # Func_class: C Energy production and conversion # Function: Pyruvate/oxaloacetate carboxyltransferase # Organism: Archaeoglobus fulgidus # 1 472 1 446 480 197 32.0 5e-50 MKKKIQFSLVYRDMWQSSGKFQPRKDQLERIAPVIIEMGCFSRVETNGGAFEQVNLLAGE NPNEGVRAFCKPFNEVGIKTHMLDRGLNALRMYPVPDDVRALMYKVKHAQGVDIPRIFDG LNDVRNIIPSIKWAAEAGMTPQGTLCITTSPVHTLEYYSNIADQLIEAGAKEICLKDMAG IGQPALLGQLTKAIKDKHPDIIIEYHGHTGPGLSMASVLEVCNNGADIIDTAIEPLSWGK VHPDVISVQSMLKNEGFDVPEINMSAYMKARALTQEFIDEWLGYFINPANKLMSSLLLGC GLPGGMMGSMMADLAGIHSTINSIRAKKGEPELSTDDMLINLFNEVEYVWPRVGYPPLVT 
PFSQYVKNIALMNLLTMEQGKGRFVMMDDAMWGMILGKSGKIPGTIAPELVALAKEKGFE FTDADPHTLLPNSLDEFKKEMDENGWEYGQDDEELFELAMHPEQYRNYKSGQAKKNFLAD LQKAKDAKMGSKLSVEELAAFKHAKADALVAPVKGQLFWEFNGDGECAPTVEPYIGKEYK EGEVFCYIVATWGEIVTVPAALGGKLVEINAKQGSKVNKGDVVAYIERAQ >gi|283510562|gb|ACQH01000057.1| GENE 5 4867 - 7698 3396 943 aa, chain + ## HITS:1 COG:no KEGG:PRU_0256 NR:ns ## KEGG: PRU_0256 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 943 1 954 954 1229 66.0 0 MRRKSLIALLLFAVIFVTAGVYTPYAQRKKTPKTAAERQADSIRRDSLERVANDTTRMDS LQRAIYRHNKQVDDSIRLDSINRQKKNGIDAPVVYSGNDSLVYNADSKMAYLYGSAKVKY ENMDLASERIRISTDSSVVHATGAPDSTEKSGVKGKPVFNMGSDKYSSDTMAFNFKSKKG IIKEVHTQQQEGFLFGEKAKRDANGDVYLQHGRYTTCDKDCPDFYIALSRAKVRPGKDVV FGPAYLVVADVPLPLAIPYGFFPFTKKYSSGFIMPTYGDESERGFYLRDIGYYFALGDKW DLKLLGEIYTKGSWGVSAATNYRKRYRYNGSFYFSYQNTKTGDKGLPDYSSQTSFKLQWS HRQDPKANPFSNLSGSVNFASTSYERNNMNSLYNPQALTQSTRTSSVSWSTGFSSIGLTL SATGNLSQNMRDSSIALTLPDLNISIARFNPFKRKHAVGKERWYEKISMSYTGQLSNSIN TKEDRLLRSNLVKDWRNGMMHSVPISGSFTLFNYLSLNPSVSFTDRMYTNKINRSWDIAQ QKERRDTTYGFHNVYDWRMSVSANTKLYGFWTPSRKLFGDKIVAIRHVITPQVSFSYAPD FGSARYGYYATYQKTDASGNVSLVEYSPFEGSLFGVPGRGKTGSVSFDVGNNLEMKIRSD KDSTGFKKISLIDELGASMSYNMAAESRPWSDLSMRLRLKWWKNYTFSMNAVFATYAYEL DKDGNPYVGTHTEWGRGRFGRFQGMSQNVSFTLTPEKLAKLFGKKTKEDDNKKPNKNDEG IDTNIESNIDDDMVEGQRGASNPNDGGKTETDDDGYMKFSMPWSLTFGYGVTMSENNTDR SKFNYKTMRYPYKFTQTLNMSGNIRLSEGWNISFSSGYDFENKKISMTTASLQRDLHCFS MSCSVVLAPYTSYNFTFRCNASTLTDALKYDKRSGYGNAVQWY >gi|283510562|gb|ACQH01000057.1| GENE 6 7732 - 8904 1343 390 aa, chain + ## HITS:1 COG:MA2715 KEGG:ns NR:ns ## COG: MA2715 COG2873 # Protein_GI_number: 20091539 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Methanosarcina acetivorans str.C2A # 4 389 25 438 441 226 34.0 7e-59 MRKTTRAIHQAYKRRDAYDALSMPVYNAVAYEFDNAQTMSDVFCGRIAAPDYSRVGNPTV ENFEHRVKGITGATDVVAFNSGMAAISAVFIALAEQGKNIVSSRHMFGNTYSLLTSTLKR FGVEARLCDLTKPDEVEPLLDDNTCCVYLEIMSNPQLEVVDLPALAALAHARGIPLVADT 
TLIPFTEFSAKALGVDAEVVSSTKYISGGATSLGGLVIDYGTCERLGHALRGELLFNFGA YMTPQAAYLQTVGLEVLDARYHIQAANALQLAIRMQALPQIKRVNYVGLPDNPYHELAKK QFGNTAGAMITIDLESKAACFNMINKLKLIHRATNLFDNRTLAIHPASTIYGSFSDKERQ EMDVLDTTIRLSIGLEDVDDLLEDIKQALG >gi|283510562|gb|ACQH01000057.1| GENE 7 9019 - 10197 1324 392 aa, chain + ## HITS:1 COG:YPO3006 KEGG:ns NR:ns ## COG: YPO3006 COG1168 # Protein_GI_number: 16123185 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Yersinia pestis # 3 392 2 392 393 345 41.0 8e-95 MHNYNFDQIIDRKGSGDVKHDALLPRWGRDDLLPLWVADMDFATPPFIVDALRKRLEHPI FGYTTTPDELWQSIINWQQRQHQWTVQRQWLTYIPGIVKGIGFVINVFTRPGEKVIIQPP VYHPFRLTPLANGREVVFNPLKRNAEGYYEMDFDNLEAVCDEQCKVFVLCNPHNPAGQTW SKETLQRLADFCYERNILVISDEIHADMALFGYRHTPFATVSQRAADISITFGAPSKTFN MAGIVSSFAIVPNAELRTKFFRWLAANELNEPTIFAPIATIAAFTQGDEWRKQMLKYIEG NVLFVEEFCKKHIPGVKPLRPQASFLVWLDFTELNLNHAQLLDLCVDKAHLAMNDGEMFG PGGESHMRLNVGTPRAILQQALEQLAHAVKSM >gi|283510562|gb|ACQH01000057.1| GENE 8 10283 - 10987 767 234 aa, chain - ## HITS:1 COG:RC0866 KEGG:ns NR:ns ## COG: RC0866 COG4912 # Protein_GI_number: 15892789 # Func_class: L Replication, recombination and repair # Function: Predicted DNA alkylation repair enzyme # Organism: Rickettsia conorii # 58 198 1 139 139 127 43.0 2e-29 MSTTDTITNELKAQGSAEKAAHLSYFFKTGKGQYGENDRFLGVTVPETRRVAKAHAGVEM ADLQQLIQSEWHEVRLCALFILTIQFNKGDEPTRTELVDFYLSNTQYINNWDLVDLAAWE TIGTYLLDKPRDLLYRLAESPLLWDNRIAMVSTYAFIRQGQLDDTYALAEKMMGHKHDLM HKAIGWMLRESGKRDVERLRQFVENHRLEMPRTMLRYAIEKFSPEERKHLMRKG >gi|283510562|gb|ACQH01000057.1| GENE 9 12674 - 12895 366 73 aa, chain + ## HITS:1 COG:no KEGG:PRU_1260 NR:ns ## KEGG: PRU_1260 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 68 10 76 81 65 64.0 5e-10 MNKKSFWYRAFDLYYDGFRSMTLGKKLWLVIFIKLFIMFAILKVFFFPNFLKANAPKGGE ADYVSRTLDQRQK >gi|283510562|gb|ACQH01000057.1| GENE 10 13005 - 14567 1503 520 aa, chain + ## HITS:1 COG:Cj0081 
KEGG:ns NR:ns ## COG: Cj0081 COG1271 # Protein_GI_number: 15791471 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 1 # Organism: Campylobacter jejuni # 13 515 6 504 520 506 50.0 1e-143 MFNQLFLDISTGTIDWSRAQFALTAMYHWLFVPLTLGLAVIMGIAETCYYRTKKQFWKDV AHFWQKLFGINFAMGVATGIILEFQFGTNWSNYSWFVGDIFGAPLAIEGIVAFFMESTFV AVMFFGWKKVSPGFHLASTWLTGIGATISAWWILVANAWMQYPVGCEFNPDTMRNEMVSF ADVALSPFAIDKFCHTVTSSWIVGAVFVVAVSAYYLLKKREMELAKQSMKIAACVGFVAS ILAAMTGHHSATLVGKVQPMKLAAMEALCKGGESQSLTAIALVNPFKQHDYRSGEELACH VSIPYALSVMATHTAHGFVPGVNDLLDGYVKADGTREPSVAEKMVMGKAAVNALMVYRKA KKAGDETTAQKVLPIIKANMKYFGYGYVEKEEQVVPYIPLAFWSFRLMVGLGSFFVLFFA VLTFFSYRKDLSRYRWLLILGMCTLPMGYIASEAGWVLAELGRQPWTIQDMLPTWVAVSD VSPASIATTFFLFLALFTTLLVVEINILVKQIKKGPEYGK >gi|283510562|gb|ACQH01000057.1| GENE 11 14824 - 15966 1079 380 aa, chain + ## HITS:1 COG:Cj0082 KEGG:ns NR:ns ## COG: Cj0082 COG1294 # Protein_GI_number: 15791472 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 2 # Organism: Campylobacter jejuni # 1 378 5 371 374 284 47.0 3e-76 MSYEFLQHYWWFLVSLLGALLVFLMFVQGANTLIFTLGKTENERRLVVNSTGRKWEFTFT TLVVFGGGFFASFPLFYSTSFGGAYWLWMIILFSFVIQAVSYEFQNKLGNLLGVRTFQTF LIINGVVGPILLGGAVATFFNGSNFVVDKGNITAMAQPVISRWANGSHGLDALLDLWNVV LGVAVFFLARVLGLLYIINNVAHEAIVTRARKQLRLCTVVFLAFFLSFLVHVLLKDGYAY DPQTDLIFMQPFKYLHNLTTMWYLTLVLLVGVVSLLYGIVRTLREPKFKRGIWFGGVGVV LVVLVLLLCAGWNNTAYYPSNADLQSSLTIANSSGSEFTLTAMSVVSLLIPFVLAYIFAA WRAIDRKSITEDEIMNGESY >gi|283510562|gb|ACQH01000057.1| GENE 12 16227 - 16916 684 229 aa, chain + ## HITS:1 COG:lin2475 KEGG:ns NR:ns ## COG: lin2475 COG0652 # Protein_GI_number: 16801537 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Listeria innocua # 34 224 21 190 194 108 39.0 6e-24 MKKALFTFVCAMFCMVALAQTQADTVRHQVLLETNMGNIRVVLYNETPKHRDNFIRLVKS 
GYYNGTLFHRVIENFMIQGGDSTSRHAAFGEPAGGYSPDYTVPAEIVFPTYYHKRGALAA AREGDDVNPKWESSSAQFYIVYGKRMTDYQLDAVQAKLDLRTNGSVKLTPEQRETYFKKG GTPHLDGTYTVFGEVVEGMDVVEKIQKTPCDERSRPFEDMKIIKAVVVK >gi|283510562|gb|ACQH01000057.1| GENE 13 17102 - 18205 1154 367 aa, chain - ## HITS:1 COG:alr0652 KEGG:ns NR:ns ## COG: alr0652 COG0489 # Protein_GI_number: 17228148 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Nostoc sp. PCC 7120 # 21 345 23 346 356 234 41.0 3e-61 MTLYPKLIKDTLATVMYAGTKKNIIESDMLADDIHIDGMKVSFTLRFPKETDPFLKSTIK AAEAAIHYHVSPDVEVEIKTEFAAKPRPEVGKLLPQVKNIIAVSSGKGGVGKSTVSANLA IALARLGYKVGLLDADIFGPSMPKMFNVEQARPYASKVDGRDLIEPIEQYGVKLLSIGFF VNAETATLWRGSMASNALKQLIADADWGELDYFILDTPPGTSDIHLTLLQTLAITGAVIV STPQSVALADARKGIDMYRNEKVNVPILGLVENMAWFTPAELPQNKYYIFGKEGVKQLAD EMETPLLAQIPLVQSICENGDKGTPAALDADTITGQAFINLAQAVVTVTNRRNKEQAPTK IVEVNKG >gi|283510562|gb|ACQH01000057.1| GENE 14 18373 - 19518 906 381 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928489|ref|ZP_06422336.1| ## NR: gi|288928489|ref|ZP_06422336.1| hypothetical protein HMPREF0670_01230 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 381 62 442 442 771 100.0 0 MKGNVVAADQYVKKWLKRGDNLSVALELLRAPLPKDSDFFAVDYGNPTFAFGLRYTFNNA VRLHREADADWGLLVPVPYESPLGNTLSLYTTFYRPLHRSRTWETAYSLSGGVGFSSSIY NKENAIDNEFIGSHASIYFAAGLHQTFHFAPKWGLRASLEFVHHSNGALYRPNKGSNTLG ASLALLYTPYYEQTLRTAQTRSKKPFERGMYLNFAAGVGAKTLLEDWLLTQFGTPPGHPD YRTERFRIYAAYSLQADLMWRYARRWSSGVGVDAFYGTYARRVEELDQQAGHNLRHSPFS FGLAAKHEAHYGRLSLAMSLGVYLYRRMGANAKINETPYYERIGLHYALPWFNGLTVGVN VKAHKTKADFTEVVVGVPIRL >gi|283510562|gb|ACQH01000057.1| GENE 15 19685 - 20377 589 230 aa, chain - ## HITS:1 COG:BS_yetF KEGG:ns NR:ns ## COG: BS_yetF COG2323 # Protein_GI_number: 16077781 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 3 206 4 223 231 119 30.0 4e-27 MQYFEIAVKLITGMIGILVFLRIAGKAQMAQLTPLDTVSAFVIGALVGGVLYNPDMSVWH ILFALAVWTGFNLFIRFCMRSKVLRRLIKGDSVYLVKGGAINFKAFKRNSLEMEQFRLLL RQKGVFSMFDVDDVRFETNGAITVSEMGKVSDSYLLVNNGAIVDSSLYHCGKTRQWVLRH IKHYGYDSPTNLFCMEWTPNKGFYLVAKNGDVTRGNEEIPVEDIAHEVLN >gi|283510562|gb|ACQH01000057.1| GENE 16 20655 - 21470 438 271 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928491|ref|ZP_06422338.1| ## NR: gi|288928491|ref|ZP_06422338.1| hypothetical protein HMPREF0670_01232 [Prevotella sp. oral taxon 317 str. F0108] # 1 271 50 320 320 558 100.0 1e-157 MSGFSVADNDVFYFAGGKPLSVVCFKGTKKMYERVIDELPTKLSMFKVEKDSVYLVNDQN LTLYRMHKSGQEPVNKVKLNVGKLKPCEGAMREFGFVLIQRIGQPLNANYKAIYFNDSGR LIKKEDIENKDNALCNSSECKKILDSYLYPSAKLVFPGFDGNYKGTWNGYNIFWGLFGNS EKASWTLAFADKDGKVAKHYDINYHLNDMYDVIPLPVLTNDIDPETFTTAPTYCMLHGEN LYILGYSGAKKKIILCRINLLNAFNSEQSAK >gi|283510562|gb|ACQH01000057.1| GENE 17 21867 - 22112 70 81 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRCNIFSPDYPLKDGHNSCYLHLLDGSVSLLYPRCCVGWVDAPHHVRSVWQRYNNSLNYN SFEVTFCGSVGRCVCNKAFLF Prediction of potential genes in microbial genomes Time: Sat May 28 01:11:35 2011 Seq name: gi|283510561|gb|ACQH01000058.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.58, whole genome shotgun sequence
Length of sequence - 22159 bp
Number of predicted genes - 16, with homology - 16
Number of transcription units - 10, operones - 6 average op.length - 2.0
N Tu/Op Conserved pairs(N/Pv) S Start End Score
+ Prom 44 - 103 6.8
1 1 Tu 1 . + CDS 134 - 1687 1525 ## SSUBM407_0558 suilysin (hemolysin)
+ Term 1829 - 1871 10.3
- Term 1821 - 1855 2.2
2 2 Tu 1 . - CDS 2013 - 2582 512 ## COG1418 Predicted HD superfamily hydrolase
- Prom 2621 - 2680 4.5
+ Prom 2551 - 2610 4.8
3 3 Op 1 . + CDS 2765 - 3325 783 ## COG0233 Ribosome recycling factor
4 3 Op 2 . + CDS 3348 - 4280 935 ## COG1162 Predicted GTPases
- Term 4572 - 4607 0.5
5 4 Tu 1 . - CDS 4609 - 5685 1323 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain
- Prom 5727 - 5786 1.6
- Term 6940 - 7002 23.0
6 5 Op 1 . - CDS 7059 - 7340 473 ## Sterm_1740 hypothetical protein
- Prom 7371 - 7430 4.2
7 5 Op 2 . - CDS 7452 - 10862 3867 ## COG1197 Transcription-repair coupling factor (superfamily II helicase)
- Prom 11025 - 11084 4.6
- Term 11036 - 11077 7.0
8 6 Op 1 1/0.000 - CDS 11101 - 11850 835 ## COG0463 Glycosyltransferases involved in cell wall biogenesis
9 6 Op 2 . - CDS 11946 - 12704 700 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase
- Prom 12808 - 12867 4.7
10 7 Op 1 . - CDS 13061 - 14242 737 ## COG1408 Predicted phosphohydrolases
11 7 Op 2 . - CDS 14278 - 15072 833 ## COG1410 Methionine synthase I, cobalamin-binding domain
- Prom 15095 - 15154 5.3
- Term 15268 - 15306 6.2
12 8 Op 1 . - CDS 15372 - 16700 1182 ## COG0044 Dihydroorotase and related cyclic amidohydrolases
13 8 Op 2 . - CDS 16697 - 17479 902 ## PRU_1615 putative lipoprotein
- Prom 17532 - 17591 4.3
14 9 Op 1 . - CDS 17661 - 18074 492 ## PRU_1616 hypothetical protein
15 9 Op 2 . - CDS 18058 - 18396 375 ## PRU_1617 hypothetical protein
- Prom 18589 - 18648 5.0
16 10 Tu 1 .
+ CDS 18740 - 20566 2071 ## COG2812 DNA polymerase III, gamma/tau subunits Predicted protein(s) >gi|283510561|gb|ACQH01000058.1| GENE 1 134 - 1687 1525 517 aa, chain + ## HITS:1 COG:no KEGG:SSUBM407_0558 NR:ns ## KEGG: SSUBM407_0558 # Name: sly # Def: suilysin (hemolysin) # Organism: S.suis_BM407 # Pathway: not_defined # 107 375 98 380 497 137 33.0 8e-31 MKKLTLRKELTIALSVLVCACGNDKDGDLGTNTGNGTFPTEINSYVNTMPSVKQMEPFSE RKVGDAKPASLGYAFADTRGTSAKEYYEQAKEFEEQLLFSDDKNVFYPGALLRAKSVVDG EYDPIEADRAPIMLSTDLPGSSDPTIKIENPSLSGVRKGINELLSRKFNAPAANLTYSIE EVYDKAHLKMAMGGNYKGAANTVEASAGFSFEKEKNRFLVKVQQVFYELSIDLPKNPSDF FAKEFDYKKEFGKEKPLFVSSIKYGRILLLGIETNMTKKEAEAKLQASVLGGKIGANAEA AYNDLLKESTIKGRVLGGDAKLGSLAAISLEDVKKFISEGARLSTENPGAPIAYKLKELG TNRTFKTVIYSKYTKNDPYAGEFSSVGFELVVQDDLKSLAGKKLNAGRGYVQFGTDPKQQ AEFHFDYWSRYAHYNIPSYKSGEKIFVVFKRERENDGFKKEYTFEVPLFETLVREGKRLG HGEIKKLYDTEKNPLKLRDTQGEVTMTFGIEKLKFHK >gi|283510561|gb|ACQH01000058.1| GENE 2 2013 - 2582 512 189 aa, chain - ## HITS:1 COG:TVN1071_2 KEGG:ns NR:ns ## COG: TVN1071_2 COG1418 # Protein_GI_number: 13541902 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Thermoplasma volcanium # 23 171 14 145 164 79 37.0 3e-15 MQTGYQAVIDKYYPHDNELRHILITHSRSVADLALALAKRLPEQCLDLQFVEEAAMLHDI GIVRCDAPSIHCCGTEPYIAHGRQGAEMLRAEGMPRHARVCERHTGAGITREAIETQHLP LPLQDFLPETLEEQLICYADKFFSKTKLDRQKTVEQAEKSLAKFGEDGLERFRAWVKRFG EPPIVLPTT >gi|283510561|gb|ACQH01000058.1| GENE 3 2765 - 3325 783 186 aa, chain + ## HITS:1 COG:BH2424 KEGG:ns NR:ns ## COG: BH2424 COG0233 # Protein_GI_number: 15614987 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome recycling factor # Organism: Bacillus halodurans # 5 186 3 185 185 151 53.0 6e-37 MIDVKEALQDAQDRMEMAAMYLEEQLSRVRAGRANVAILDGVRVESYGMTVPLNQVANVS APDARTIAIKPFDKKAIRDIEKAIMDSDVGITPENNGEVIRLGIPQPTEERRRELTKQCN KIGEKSKIEVRNVRAEIKDKLKKAIKDGLSEDNEKDAEEKLQKLHDKFIKKIDELLEAKN KEIMTV >gi|283510561|gb|ACQH01000058.1| GENE 4 3348 - 4280 935 310 aa, chain + ## 
HITS:1 COG:TM1717 KEGG:ns NR:ns ## COG: TM1717 COG1162 # Protein_GI_number: 15644464 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Thermotoga maritima # 2 298 6 286 295 207 38.0 2e-53 MQGLVIKNTGSWYTVKTDDGRIVDSKIKGSFRLKGIRSTSPVAVGDRVQLVTNQEGTAFI SAIHDRKNYIIRKSSNLSKQSHILAANIDQALLIVTVNRPQTSTTFIDRFLASAEAYRIP VVLVFNKTDLLDDDEMRYQQMVMNLYETVGYHCLAISAETGAGVDEVRALLKGRISLLSG NSGVGKSTLINRLLPTANLRTAEISNAHNAGMHTTTFSEMLELPEGGWVIDTPGIKGFGT FDIERNELSSYFKEIFEFSKGCRFNNCTHTHEPGCAVLQALNDHYIAQSRYQSYLSMLDD KDESKYREAY >gi|283510561|gb|ACQH01000058.1| GENE 5 4609 - 5685 1323 358 aa, chain - ## HITS:1 COG:BB0682 KEGG:ns NR:ns ## COG: BB0682 COG0482 # Protein_GI_number: 15595027 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Borrelia burgdorferi # 10 356 2 350 355 290 41.0 4e-78 MNIDELQGKKIAVLLSGGVDSSVVVYEFNRLGLHPDCFYIKIGPEETEEWDCASEEDLEM ATAVAKRFGCKLEVVDCHKEYWEQVTRYTMDKVKAGFTPNPDVMCNRLIKFGAFHEKRGH EYDLIATGHYAQTEWIDGKKWLVTSPDPVKDQTDFLAQIYDWQLRKALFPIGHYQKGEVR EIAEREHLINAKRKDSQGICFLGKINYNEYIRRFLGEKPGEVLELETEKKIGMHRGLWFH TIGQRHGLGFGGGPWYAVKKDVERNVLFVSKGYEPATAYKKDFKVHHLHQLTAPLDGLDV TFKIRHTPEYHRAQLEPLPDGGYIVHAEEPMQGVAPGQFCVIYDKEHHRCFGSAEIAW >gi|283510561|gb|ACQH01000058.1| GENE 6 7059 - 7340 473 93 aa, chain - ## HITS:1 COG:no KEGG:Sterm_1740 NR:ns ## KEGG: Sterm_1740 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 1 91 1 91 94 70 38.0 2e-11 MTINHTNNEQNGEFVVFEGEEKLGYISYEWVDADHFAIVHTVVSPEHKGKGIGKILLDAA ADYARKNNKKIKDVCSFVTVQFARSDKYNDVKG >gi|283510561|gb|ACQH01000058.1| GENE 7 7452 - 10862 3867 1136 aa, chain - ## HITS:1 COG:BS_mfd KEGG:ns NR:ns ## COG: BS_mfd COG1197 # Protein_GI_number: 16077123 # Func_class: L Replication, recombination and repair; K Transcription # Function: Transcription-repair coupling factor (superfamily II helicase) # Organism: Bacillus 
subtilis # 31 1048 29 1086 1177 616 33.0 1e-176 MSMNLQQIYSGQPQAAALDKFLADAAISRIFLQGMVASTAPVFFSGMAQRRPKTTLFVLQ DADEAGYFYQDLVQLMGQNDVLFFPSSYRRAVKYGQRDAANEILRTEVLARLAAERATYV VCAPDALSELVVSKQRLDERTLRLSVGQTIDVVQMVKTLRAFDFREVDYVYEPGQFAVRG SIVDIYSYSHELPFRIDFFGDEIDTIRTFDVENQLSKDKRQAVEIVPELARQSNEKVSFL TLLPPDALLVMKDFGYVRDTIARIYDEGFAHQALEAQLDGATEVEQQQIRAAMQKELQLL TASQFAHDAAAFRHIYIGSKPSDEVQATIAFHFSAQPLFHKNFDLLRQTLEDYQLQGYRL VLLADSKKQQERLRDILTAEQKSAAPLKMESAGDFTLHAGYVDNDLRICFFTDHQIFDRF HKYNLKSERARAGKLALTMKELQEMEPGDFIVHVDFGIGKFGGLVRVPTGNSYQEMIRIV YQNNDKVDVSIHSLYKISRYRKGTATEQPRLSTLGTGAWERLKEKTKKRIKDIARDLIKL YAKRRTERGFAFSHDSFMQHELEASFLYEDTPDQLKATTEVKADMERARPMDRLVCGDVG FGKTEVAIRAAFKAACDSKQVAVMVPTTVLAFQHFKTFSKRLENLPVRVDYLSRARSAKQ TRQVLDDLAAGKIDIIIGTHKLIGKTVKWHDLGLLIIDEEQKFGVSTKEKLRTLKTNVDT LTMSATPIPRTLQFSLMGARDMSVMRTPPPNRHPIHTEIATFDGAFVAEAINFEMSRNGQ VYFVNDRIANLPELANIIRKHVPDCRVVIAHGQMKPEDLENALIDFMNHEYDVLLSTSII ENGIDISNANTIIVNHAHKVGLSDLHQMRGRVGRSNRKAFCYLLAPPLSALTPDARRRLE ALETFSDLGSGFNIAMQDLDIRGAGNLLGAEQSGFMEDLGYETYQKILSQAVTELRNDEF ADLYAESIEQGEEVGGDQFVEDCALESDLEMYFPDNYVPGSSERMLLYRELDNIESDDEL DAYRSRLVDRFGPVPPQGEELMQVVALRRIGKRLGCEKIMLKQGRMLMQFVSNPNSAYYR SKAFDNVLTYIANNPRRCDLKEIKGRRMMHVSAVPTVGDAVKVLRAIENKPAPKLG >gi|283510561|gb|ACQH01000058.1| GENE 8 11101 - 11850 835 249 aa, chain - ## HITS:1 COG:MT2111_2 KEGG:ns NR:ns ## COG: MT2111_2 COG0463 # Protein_GI_number: 15841539 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Mycobacterium tuberculosis CDC1551 # 7 239 16 246 277 204 45.0 2e-52 MNTSDSIVIIPTYNEKENIEKIIRAVFALEKCFHILVIDDGSPDGTAAIVRRLIDEEFAD RLFMLERKGKLGLGTAYITGFKWALQHDYEYVFEMDADFSHDPADLPRLYAACHDEGYDV AIGSRYVSGVNVVNWPIGRVLMSYFASKYVRFVTGFTVHDTTAGFKCYRRRVLQTIPLDE VRFKGYGFQIEMKYTAYKIGFKIKEVPVIFVNRREGTSKMSGGIFGEAFFGVMRLRLDGW TRKYPKITQ >gi|283510561|gb|ACQH01000058.1| GENE 9 11946 - 12704 700 252 aa, chain - ## HITS:1 COG:TM1693 KEGG:ns NR:ns ## COG: TM1693 COG0204 # Protein_GI_number: 15644441 # 
Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Thermotoga maritima # 56 229 60 232 247 99 30.0 5e-21 MTILYRIYQLFVAIPLIVLATVLTALVTIVGCLIGNGHFWGYYPGKCWSWFIIRILFLPI KVEGREHLQRGQSYVFAANHQGAFDIFMIYGFLGRNFKWMMKQSLRKIPLVGKACEAAHH IFVDKRSAAKIKKSIDQARQTLHNGMSLVIFPEGARTFTGHMGVFRRGAFMLADELQLPI VPVTINGSFDVMPRTRDLKWARWHRLRLTIHAPIMPIGQGTDNMKFLEEQSYRVIMGSLV PEYQGFEANPDQ >gi|283510561|gb|ACQH01000058.1| GENE 10 13061 - 14242 737 393 aa, chain - ## HITS:1 COG:BH1618 KEGG:ns NR:ns ## COG: BH1618 COG1408 # Protein_GI_number: 15614181 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Bacillus halodurans # 137 388 33 254 256 95 31.0 2e-19 MISRILIPLLLAILLPDLYIERRNWKSKRKAKWPLRVLRWVLSLGMVVFTIVMAGSKNFV PDSVEMLNVYLICLGLLVVPKFVYVSCAVVGRVFARLRHSTTNWGKAVGYLLALFCIYVL AYGWLFGANKLNVNHVEIYSPDLPETFDGYRIVQFTDAHVGSFVGSRAHFLERAVDTIMA QHADAIVFTGDLQNVQPAELYPFREQLSRLNARDGMFSVLGNHDYSMYFNGPEAIKVANE REMVARQRQFGWDLLLNEHRVVRRRADSIVIAGTENDGRPPFPSRADLSKALKGVGKGTF VVMLQHDPSAWERNILPHSHVQLTLSGHTHGGQISLFGLRPTSISFKQDKGLYTQGQRSL FVSSGLGGVVPFRFGVPPEVVVITLRRGNVAEK >gi|283510561|gb|ACQH01000058.1| GENE 11 14278 - 15072 833 264 aa, chain - ## HITS:1 COG:YPO3722_2 KEGG:ns NR:ns ## COG: YPO3722_2 COG1410 # Protein_GI_number: 16123860 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I, cobalamin-binding domain # Organism: Yersinia pestis # 2 263 606 891 900 205 41.0 8e-53 MILKYKVHDIRPYISWLYFFHTWGMSGKPTDEKEKLRKDAEEMLDAMETRYSTHAVFELF DANSDGDDLLLGDKRLPLLRQQKPKSTGAPNLCLADFVRPLSSGIKDKVGAFATTVDAGI EYDYKNDDYKQMMAKVLGERLAEATAERMHEEVRKHYWGYAPDEHFEPQELLKEPYQGIR PAIGYPSLPDTSVNFLLNELINMGSIGIHLTPTGMMQPHASVSGLMFAHPQSKYFDLGKI GEDQLADYAKRRGVPLYMIKKFVQ >gi|283510561|gb|ACQH01000058.1| GENE 12 15372 - 16700 1182 442 aa, chain - ## HITS:1 COG:XF0988 KEGG:ns NR:ns ## COG: XF0988 COG0044 # Protein_GI_number: 15837590 # Func_class: F Nucleotide transport and metabolism # Function: 
Dihydroorotase and related cyclic amidohydrolases # Organism: Xylella fastidiosa 9a5c # 3 424 4 432 449 398 49.0 1e-110 MKTLIYGATIVNEGRSFKGCLTLENDLIASVTEGEAVPEGHFDLRLNAEGCVVMPGVIDD HVHFREPGLTAKADLESESRAAAYGGVTTFFEMPNTLPQTTTLDALNQKLKLAQGKSHVN YSFFFGATNQNAHLFERLDTKRVPGIKLFMGASTGNMLVDRLETLENIFENAPLPIMTHC EDTQLINQNMAEAKRLYGDDPGVEHHPAIRSAEACYHSSALAVELATKYGAQLHVAHLTT ARELSLFGHNPRITAEAVIAHLWFCDADYRTRGTQIKCNPAVKTIADRDALRAALTDGRI SVVATDHAPHLVSDKIGGCAKAASGMPMVQFSLPAMLELCEKGVLTIERMVELMCHAPAR LFAVSKRGFIKPGYKADLTIVRRGEPWTVTKDCVQSKCQWSPMEGEKFRWRVVQTICNGH LLLDEDGTFDAQYRGEEVAFCH >gi|283510561|gb|ACQH01000058.1| GENE 13 16697 - 17479 902 260 aa, chain - ## HITS:1 COG:no KEGG:PRU_1615 NR:ns ## KEGG: PRU_1615 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 1 252 1 255 257 258 50.0 2e-67 MNKVLYTLFALALLTSCGSSYNIEGTSNISTLDGRKLYLKVLKDNEFKKLDSCDVVHGKF GFSGSLDSVRIANIFMDEESVLPLVLESGDIIVKLDDAQQNVSGTPLNDKLFKFFNKYNQ LKNQEQELVHKHDQAIMNGSNMDVVNARLNSEAERLSGLEDKLITSFVCENFDNVLGPGV FFMVTIGYDYPELTPWIEDIMSKATDKFKNDAYVKDYYAKAQENQQIMNGMKDVPMPAPA VTPQAAPAPTPNDLAQPTKE >gi|283510561|gb|ACQH01000058.1| GENE 14 17661 - 18074 492 137 aa, chain - ## HITS:1 COG:no KEGG:PRU_1616 NR:ns ## KEGG: PRU_1616 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 137 1 132 132 105 42.0 5e-22 MKQLNKLQSILYAVGGALMVIGAGCFAFMWQQQAVCWLYLVGATLFCLMQSMQTYEGTDF VVRRLKRIQAVANIFFMLAGILMIDTAYMFFRPLFDSSIAYVDYLYNKWVVLLLIAALLE IYTMHRIDHEMKKDKRQ >gi|283510561|gb|ACQH01000058.1| GENE 15 18058 - 18396 375 112 aa, chain - ## HITS:1 COG:no KEGG:PRU_1617 NR:ns ## KEGG: PRU_1617 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 14 102 19 107 119 87 49.0 1e-16 MNSKWNEVGRMLNHYRYLIVVVVGVLIVGFLDDNSFMHRLEYEIQVSDLKKQIKEYNERH STDTERLRQLKRDPKAIEKIARERYFMKADDEDIYVLSDDEKPTQQQDETTE >gi|283510561|gb|ACQH01000058.1| GENE 16 18740 - 20566 2071 608 aa, chain + ## HITS:1 COG:BH0034 KEGG:ns NR:ns ## COG: BH0034 
COG2812 # Protein_GI_number: 15612597 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Bacillus halodurans # 9 441 8 427 564 282 38.0 1e-75 MGEYIVSARKYRPVSFDSVVGQSALTTTLKNAVKSGKLAHAYLFCGPRGVGKTTCARIFA KAINCLTPTEMGEACNHCESCQAFNDQRSYNIFELDAASNNSVENIKTLMDQTRIPPQVG KYKVFIIDEVHMLSTAAFNAFLKTLEEPPAHVIFILATTEKHKILPTILSRCQIYDFERM TVPGIIDHLKRVAENEGIAYEEEALAVIAEKADGGMRDALSIFDQAASFAQGNLTYDSVI ADLNVLDSDNYFSIIDLAVQNKVSDIMVLVNDIIAKGFDAGNLVNGLATHVRNVLMAKDE STLPLLEVSARQRERFKEQAQKCPTKFLYKALQVMNQCDIHYRASSNKRLLAELTLIQVA QITQPEDEPAAGRSPKRLKSLFRKLATPQPKPALQVVGGDKPLVRTTVATRTATALANTN EAKTTPYPTQPAVETPKANEPETTTQRTAMPIGKLKLGKVGTSFADLKRGPEKVGQVKED EITNKEENHVFSQEELELQWMALCNRMPQAMIAIATRMKNMSPRITQFPNIEVLADNEIV LEEVNNIKKRIEGGLAKYLHNGQLQLHIRLAKVEEMKRILSKKEIYENMLKANPAIDNLR DALGLELA Prediction of potential genes in microbial genomes Time: Sat May 28 01:11:54 2011 Seq name: gi|283510560|gb|ACQH01000059.1| Prevotella sp. oral taxon 317 str. F0108 cont2.59, whole genome shotgun sequence Length of sequence - 4970 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 59 - 1096 885 ## COG2502 Asparagine synthetase A - Prom 1177 - 1236 4.8 2 2 Op 1 . - CDS 1295 - 2146 289 ## BVU_2474 hypothetical protein 3 2 Op 2 . - CDS 2161 - 3480 513 ## COG3950 Predicted ATP-binding protein involved in virulence + TRNA 3744 - 3820 56.2 # Arg CCG 0 0 + Prom 3746 - 3805 79.3 4 3 Op 1 . + CDS 4042 - 4263 288 ## BVU_3458 hypothetical protein 5 3 Op 2 . + CDS 4260 - 4433 95 ## gi|288928514|ref|ZP_06422361.1| hypothetical protein HMPREF0670_01255 6 3 Op 3 . 
+ CDS 4430 - 4651 276 ## gi|288928515|ref|ZP_06422362.1| hypothetical protein HMPREF0670_01256 + Term 4724 - 4748 -1.0 Predicted protein(s) >gi|283510560|gb|ACQH01000059.1| GENE 1 59 - 1096 885 345 aa, chain - ## HITS:1 COG:FN0776 KEGG:ns NR:ns ## COG: FN0776 COG2502 # Protein_GI_number: 19704111 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthetase A # Organism: Fusobacterium nucleatum # 10 345 3 327 327 356 53.0 4e-98 MSTLIRPEHYHALLDKVRTEQAIKVIKDFFQQNLSTELRLRRVTAPLFVMQGLGINDDLN GIERPVTFPIKDLGDARAEVVHSLAKWKRLTLAEYKIKPGYGIYTDMNAIRADEELDNLH SLYVDQWDWEMVITPEQRTLSFLKQTVRRIYAAILRTEYLTCETYPAIQPFLPPEIHFVH SEELLQMYPHLTPKEREDAICEKYGAVFIIGIGGKLSDGKRHDGRAPDYDDWSTLAEDGH VGLNGDILIWYPVLGRSVELSSMGIRVNADALLRQLEMMGQQTRAELYFHRQLLSGKLPL CIGGGIGQSRLCMVLLQKAHVGEIQASIWSDEMRAECQAAGMPLI >gi|283510560|gb|ACQH01000059.1| GENE 2 1295 - 2146 289 283 aa, chain - ## HITS:1 COG:no KEGG:BVU_2474 NR:ns ## KEGG: BVU_2474 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 277 1 270 273 155 37.0 2e-36 MQWINKRNRKYRKKAHHLLNKFLNEGWNSSVGKYVNCDFNSLKSFNANHGIRALLYSEQN GYCCYCMRKLNLGDRRMCTIEHVMPHKVNDSDLAFYFANVPHLRKNVRALVIDNKTQRLR HARPYPHFCAYENLVLSCSGGIYRTDDPDNECLYNIHACCNNVRGKERIFPIFFYKEENM IYERDGLITCSQKYEHTIDVLQLETENLCLFRKAWAYLLSSHSMEEIKDARAESKRSLRE EILMDTPLKLNEVKRLCHRLYWETLYEYRWFGFYFQRHMRQKK >gi|283510560|gb|ACQH01000059.1| GENE 3 2161 - 3480 513 439 aa, chain - ## HITS:1 COG:STM3753 KEGG:ns NR:ns ## COG: STM3753 COG3950 # Protein_GI_number: 16767037 # Func_class: R General function prediction only # Function: Predicted ATP-binding protein involved in virulence # Organism: Salmonella typhimurium LT2 # 1 349 1 363 396 103 25.0 6e-22 MFLRRVTIDNYSCFKQFDVQLAEGINVIIGRNGAGKTSLIKSLVYLMNFMFTNDRSMGDH FLSAGNPDLKMTSIKQDEFYRFNANSEAVSYANFHGEMTFEGEDISWDMYKKSIFGASLY PSKYVDAYRKLMEMSTRTGRLPLLAYFSDSFPHKQTNISSFARNEISKTTDILRNFGYYQ WDNETACTVIWQMRLLNVMAKSMSLKETHSAAVKENSYVLNALKSFSRTINIDGDNSFEI 
DTLMLDYNGAEKLELWVRLKSGQEIPFEALPAGYHRLYSIVLDLAYRSYLLNRNNPDETT GLVLIDEVDLHLHPSLSIEVLERFRAVFPAIQFVVTTHSPLIITSLKSDEHKNQVLRLVS GENKPHVLPDLYGIDYNAGLVDVMGINPTNEDVNYLKRAIIRAARRGDSANRKLKETELK GLLSESTFEKIIADINKEI >gi|283510560|gb|ACQH01000059.1| GENE 4 4042 - 4263 288 73 aa, chain + ## HITS:1 COG:no KEGG:BVU_3458 NR:ns ## KEGG: BVU_3458 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 9 73 77 141 142 64 40.0 1e-09 MEYSTEEKLEWIVYFIYEFGKKYDLTMKQAFGYLQRFKAIDFIDKHYGYAHTQSFRTMVD ELSQYCRRMGGSL >gi|283510560|gb|ACQH01000059.1| GENE 5 4260 - 4433 95 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928514|ref|ZP_06422361.1| ## NR: gi|288928514|ref|ZP_06422361.1| hypothetical protein HMPREF0670_01255 [Prevotella sp. oral taxon 317 str. F0108] # 1 57 1 57 57 105 100.0 1e-21 MMRLYHGTTSDFGEIDLTKSKPSKDFGRGFYLSAEVEQAKDFAQTRALLLVEHLKRL >gi|283510560|gb|ACQH01000059.1| GENE 6 4430 - 4651 276 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928515|ref|ZP_06422362.1| ## NR: gi|288928515|ref|ZP_06422362.1| hypothetical protein HMPREF0670_01256 [Prevotella sp. oral taxon 317 str. F0108] # 1 73 1 73 73 116 100.0 5e-25 MTEKEEMKGFLVKELAKQLIANDNTLSIEQALTLVLNSETYEKLMDDASKLYYQSPGYVF SFLQTELKTGKMG Prediction of potential genes in microbial genomes Time: Sat May 28 01:12:19 2011 Seq name: gi|283510559|gb|ACQH01000060.1| Prevotella sp. oral taxon 317 str. F0108 cont2.60, whole genome shotgun sequence Length of sequence - 44101 bp Number of predicted genes - 27, with homology - 26 Number of transcription units - 16, operones - 8 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 29 - 88 1.8 1 1 Op 1 . + CDS 217 - 1440 1447 ## COG2195 Di- and tripeptidases + Term 1551 - 1602 -0.3 + Prom 1584 - 1643 1.7 2 1 Op 2 . 
+ CDS 1725 - 3272 1618 ## COG1620 L-lactate permease + Term 3290 - 3333 8.4 - Term 3275 - 3324 8.5 3 2 Op 1 3/0.000 - CDS 3410 - 3874 350 ## PROTEIN SUPPORTED gi|148994988|ref|ZP_01823966.1| ribosomal protein L11 methyltransferase 4 2 Op 2 . - CDS 3921 - 4610 648 ## COG1738 Uncharacterized conserved protein - Prom 4709 - 4768 5.0 + Prom 4702 - 4761 4.4 5 3 Tu 1 . + CDS 4860 - 5486 596 ## COG3340 Peptidase E + Term 5638 - 5684 10.5 - Term 5624 - 5672 7.1 6 4 Op 1 . - CDS 5681 - 6598 717 ## Cphy_0623 hypothetical protein 7 4 Op 2 . - CDS 6638 - 8560 2021 ## COG2217 Cation transport ATPase - Prom 8731 - 8790 2.4 - Term 8584 - 8645 -0.1 8 5 Tu 1 . - CDS 8816 - 9919 1042 ## COG0381 UDP-N-acetylglucosamine 2-epimerase - Prom 9979 - 10038 1.9 + Prom 11390 - 11449 3.8 9 6 Tu 1 . + CDS 11509 - 12249 783 ## Slin_5315 hypothetical protein + Term 12267 - 12305 -0.8 10 7 Tu 1 . + CDS 12363 - 13112 689 ## Slin_5315 hypothetical protein - Term 13173 - 13227 2.1 11 8 Op 1 . - CDS 13289 - 14155 350 ## PROTEIN SUPPORTED gi|15895122|ref|NP_348471.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase 12 8 Op 2 . - CDS 14152 - 15216 1042 ## PRU_0159 hypothetical protein 13 8 Op 3 . - CDS 15229 - 20832 5018 ## COG2373 Large extracellular alpha-helical protein - Prom 20974 - 21033 2.8 14 9 Op 1 . - CDS 21152 - 21289 123 ## 15 9 Op 2 1/0.000 - CDS 21302 - 26881 4968 ## COG2373 Large extracellular alpha-helical protein - Prom 26901 - 26960 4.0 16 9 Op 3 . - CDS 27025 - 32634 4998 ## COG2373 Large extracellular alpha-helical protein - Prom 32773 - 32832 7.5 + Prom 32592 - 32651 3.2 17 10 Op 1 . + CDS 32890 - 33828 1006 ## COG0501 Zn-dependent protease with chaperone function 18 10 Op 2 13/0.000 + CDS 33844 - 34629 891 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases 19 10 Op 3 . + CDS 34629 - 35537 1085 ## COG0167 Dihydroorotate dehydrogenase + Term 35678 - 35729 2.0 + Prom 35727 - 35786 3.5 20 11 Tu 1 . 
+ CDS 35853 - 36590 583 ## COG4335 DNA alkylation repair enzyme + Term 36765 - 36800 1.0 - Term 36623 - 36670 1.6 21 12 Tu 1 . - CDS 36809 - 37237 202 ## gi|288928537|ref|ZP_06422384.1| hypothetical protein HMPREF0670_01278 - Prom 37337 - 37396 6.1 - Term 38532 - 38595 -0.2 22 13 Op 1 26/0.000 - CDS 38671 - 39627 1147 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs - Prom 39676 - 39735 3.6 - Term 39740 - 39779 -1.0 23 13 Op 2 . - CDS 39810 - 40259 466 ## COG1585 Membrane protein implicated in regulation of membrane protease activity - Prom 40280 - 40339 4.7 + Prom 40614 - 40673 4.9 24 14 Op 1 . + CDS 40751 - 41233 570 ## gi|288928540|ref|ZP_06422387.1| hypothetical protein HMPREF0670_01281 25 14 Op 2 . + CDS 41244 - 41915 370 ## gi|288928541|ref|ZP_06422388.1| hypothetical protein HMPREF0670_01282 + Term 42056 - 42104 7.1 26 15 Tu 1 . - CDS 42095 - 43183 1147 ## COG0082 Chorismate synthase - Prom 43239 - 43298 3.8 - Term 43203 - 43260 12.0 27 16 Tu 1 . - CDS 43301 - 43975 696 ## COG1047 FKBP-type peptidyl-prolyl cis-trans isomerases 2 - Prom 44003 - 44062 2.6 Predicted protein(s) >gi|283510559|gb|ACQH01000060.1| GENE 1 217 - 1440 1447 407 aa, chain + ## HITS:1 COG:CAC0476 KEGG:ns NR:ns ## COG: CAC0476 COG2195 # Protein_GI_number: 15893767 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Clostridium acetobutylicum # 2 404 3 407 408 458 56.0 1e-128 MEITERFINYTKFDTQSDDNSESVPSTPKQLVFAEYLKKEMEREGLSDVEMDDMGYLYGT LKANCKKKIPTIGFISHYDTSPDSSGKDVKARIVKNYDGGDIELSPGIVSSPTKFPELKS HVGEDLIVTDGTTLLGADDKAGIAEIMQAMCWLRDHNEVKHGDIRVAFNPDEEIGKGAHH FDVEKFGCEWAYTIDGGDLGELEYENFNAAGAKVIIKGVSVHPGYAKGKMINANRLACEF CNMLPDNQTPETTEGYEGFYHLIGMQTSTEAAKLNFIIRDHDRDKFEDRKAFFEKCARAM NEKYGEGTVKVLLNDQYYNMKEKIDPNMHVIDIVLQAMMQSGVSPKVKPIRGGTDGAQLS YKGLPCPNIFAGGVNFHGPYEFVSVQVMEKAMNVIIKICEIVADYND >gi|283510559|gb|ACQH01000060.1| GENE 2 1725 - 3272 1618 515 aa, chain + ## HITS:1 COG:BB0604 KEGG:ns NR:ns ## COG: BB0604 COG1620 # 
Protein_GI_number: 15594949 # Func_class: C Energy production and conversion # Function: L-lactate permease # Organism: Borrelia burgdorferi # 4 511 7 497 500 268 36.0 3e-71 MAALVSILPIVLLFVLMLGFKMAGHRSAFISLLTTAAIAVFLAPTMNFAPDGFTQSGVAW AFVEGTLKAVFPILIIILMALFSYNVLVESKQIDVIKAQFTSFSDDDGVTVLMLVWGFGG LLEGMAGFGTAVAIPAAILISLGYKPLFSALVSLIANTVPTGFGAVGVPVITLANEIAPG GAASQELISQLSVYAVMQLSVLYFVLPFIILTLTNHSKGAILKNVRLAVWVGAVSLGTQY LVARYLGAETPAIIGSLAAIVAIVIYAKLFAPKKKEENEKAYTASQTLRAWAVYAFILLF ILLSGPLFPPINAFLKSTLVSRIALPIHAPGTAFSFAWIGNAGLMLFLGTVLGGLVQGVS PRRLLVSLARTVVNLRKTTVTIISLVALAAIMNYAGMILVIADTLANITGKAYPLFAPLI GAIGTFVTGSDTSSNILFAKLQANVATQLHHADPNWIVASNTTGATGGKMISPQSIAIAT AACDMQGKDGEILRSALPYAALYIIIGGLMVMFAG >gi|283510559|gb|ACQH01000060.1| GENE 3 3410 - 3874 350 154 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148994988|ref|ZP_01823966.1| ribosomal protein L11 methyltransferase [Streptococcus pneumoniae SP9-BS68] # 55 152 7 104 114 139 64 3e-32 MDKNRETEGLKSLGNNTKYSMDYAPEVLETFENKHPESDYWVRFNCPEFTSLCPITGQPD FAEIRISYVPNVRMVESKSLKLYLFSFRNHGDFHEDCVNIIMKDLIKLMNPKYIEVVGLF TPRGGISIHPFANYGMPNTKYAEMAERRFAEYGG >gi|283510559|gb|ACQH01000060.1| GENE 4 3921 - 4610 648 229 aa, chain - ## HITS:1 COG:Cgl0234 KEGG:ns NR:ns ## COG: Cgl0234 COG1738 # Protein_GI_number: 19551484 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Corynebacterium glutamicum # 11 216 42 249 250 125 32.0 9e-29 MNNKDNKVSVLFMLFSILFCVCLIAANVLETKQIAFGSISLTGGLIVFPVSYIINDCVCE VWGYKKTRMLIWTGFAMNFFFVMLGAICDMIPGAPYWTNDEGFHAVFGLAPRIAFASFLA FICGSFVNAYVMSKMKLSSGGKNFSLRAVVSTIFGESVDSIIFFPLALWGVVPTEELPWL MLWQVFLKTAYEVVVLPLTIRIVRYVKRHEQVDTYDNDVNYSIWRVFSI >gi|283510559|gb|ACQH01000060.1| GENE 5 4860 - 5486 596 208 aa, chain + ## HITS:1 COG:lin0382 KEGG:ns NR:ns ## COG: lin0382 COG3340 # Protein_GI_number: 16799459 # Func_class: E Amino acid transport and metabolism # Function: Peptidase E # Organism: Listeria innocua # 2 196 1 194 209 172 47.0 3e-43 
MMVKLFLCSYFAAVSSFTPQFVGGDLKGKKLAFIPTASLFEEYTDYVDEAKEAFENLGLN IEVLDVSSAPKDLIERTLQRCDLIYVSGGNTFYLLQELEKSGAKTIILEQVKGGKPYIGE SAGSIILAPNTSYAKDMDEAKKAAPQLKSFEGLGLVDFYPLPHYKSFPFEEVTEKMLVKY AKLGLKPITNQQALFVNGNQTTLIEVKQ >gi|283510559|gb|ACQH01000060.1| GENE 6 5681 - 6598 717 305 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0623 NR:ns ## KEGG: Cphy_0623 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 34 264 1 226 236 165 40.0 2e-39 MQRKRFAIFSRGLALVLPLMLVLCVQVAFADRPLKLLAIGNSFSEDAIEQNLFELAGAAG HQMVIGNMYIGGCSLERHWGNAQSNKPDYNYRKIEVDGKMTRTANYTLDKALRDEQWDYV SLQQVSQLSGMYSSFQPHLDSLIAYVRARVPATTKLIWHVTWAYAQNSTHGGFANYDRDQ GKMYRAIVEGAQRLKKENAQFSLFVPVGTAIQNARTSFVGDHLNRDGFHLDLVLGRYIAA CTWFECLFGTNVVGNRYAPKGLDKAQREVAQWAAHLAVERPFECSPIDIVDIDTKVSASE QGEKK >gi|283510559|gb|ACQH01000060.1| GENE 7 6638 - 8560 2021 640 aa, chain - ## HITS:1 COG:BH0557 KEGG:ns NR:ns ## COG: BH0557 COG2217 # Protein_GI_number: 15613120 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Bacillus halodurans # 8 437 79 507 806 323 42.0 5e-88 MEKRTIPVVGMACSACSANVEKKLNSLPGIKSAAVNLPGRTALVEYNPEEVSLEKMKSEI NAIGYDLVIETNRSAEAIERRELQRLKRRTLLSWGFALLCMCVSMGWLNLGGKSMNNQAA LLLALANFVVCGGEFFSRAWRQLRHGSANMDSLVALSTSIAFLFSAFNTFWGDAVWGARG IVWHTYFDASVMIITFVLTGRLLEERAKNETAGSIRQLMGLAPKTARLIDGDEMRDVPIA TIAVGDVLEVRAGEKVPVDGVVTTAESFMTADGAYIDESMITGEPTPALKQKGAKVLAGT VVSQGKFRFKAQQIGEHTALAQIIKMVQEAQGSKAPVQRTVDRVARVFVPTVALIALVTF LLWWLLGGNTALPQAILSAVAVLVIACPCAMGLATPTALMVAIGKAAQMNVLIKDAAALE SLKGVNAMVIDKTGTLTIPNQNIDFTKADDLPLEMRETLKPHAEEAMAELQNMGIEVYMM SGDKDEAARYWAQKAGIRHYRSRVLPQDKENMVRQLQAEGKRVAMVGDGINDTQALALAD VGIAMGRGTDVAMDAAQATLMGDDLRRLPQAIRLSRKTTAMIGQNLFWAFIYNVVCIPLA AGVLHLFGINFQITPMWASALMAFSSVSVVLNSLRLKLAK >gi|283510559|gb|ACQH01000060.1| GENE 8 8816 - 9919 1042 367 aa, chain - ## HITS:1 COG:MJ1504 KEGG:ns NR:ns ## COG: MJ1504 COG0381 # Protein_GI_number: 15669698 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
UDP-N-acetylglucosamine 2-epimerase # Organism: Methanococcus jannaschii # 1 361 1 361 366 179 32.0 7e-45 MNICIVAGARPNFMKVAPIVHSIEKAKEQGRDLNYQLVYAGREDDPTLEGSLFNDLQLAR PQVFLGVDCENLNELTGQVVSAFDKYLRNHPTDVVLVVDDLASTLAVAIVTKKKGILLAH LVAGTRSFDIDMPKEINRLVTDMLSDLLFTAGMQSNSIATREGAELSKIYMVGNILMDTM RFNHNRLVRPALLTDLQLNDGGYLVFTLNRKALIANTPNLLAMVRQIAAASVHVPVVAPL RGAAAKAVNECLEQLGCPANVHVVEPLSYLEFGYLTAHAKGIITDSGNVAEEATFNGVPC ITLNSYTEHAETVKQGTNVLVGEDPEALAAALNTLQAGQWKHASIPDRWDGRSADRIVSI LLEQRKG >gi|283510559|gb|ACQH01000060.1| GENE 9 11509 - 12249 783 246 aa, chain + ## HITS:1 COG:no KEGG:Slin_5315 NR:ns ## KEGG: Slin_5315 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 20 209 17 201 231 86 34.0 7e-16 MKKVLLILALMLGASTATFAQSTKDVLYLKNGSVIYGQLIEMVPEKQVKIKTADGSVFVY NTSEVDRIAKAEKEVKEQRSERKSSFRTRKIATGFKGFIDDSYSASMNGSDFNRGGLSVS LGSQILPQLFVGAGLGLEYYNEPEVLTVPVFADVRVNFINGPISPFLGVKVGYTALGDFE GFYLNPMIGCRFGLTNKLALHTAIGYALQNADYTHEERYYSPYGTTTFRRSISTISAIKI QFGFEF >gi|283510559|gb|ACQH01000060.1| GENE 10 12363 - 13112 689 249 aa, chain + ## HITS:1 COG:no KEGG:Slin_5315 NR:ns ## KEGG: Slin_5315 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 24 207 22 198 231 72 32.0 1e-11 MKKALLTLTLMLVASVTTFAQTTKDVVYLKDGNVIRGQLVEIIPNKQVKIKTADGSLFVY NTNDVARIENAEIAKKEKKSEDEPTFNGLKKIARGFKGFVEGSFSASLDGSSYHREGLSV SLGSQILPQLFVGGGIGFEYFGKPNTVTVPIYLDIRTNFINGPISPFVGTKFGYAFGRDI QGFYFNPMLGCRFGLTNKLALHAAIGYALLYDVYLSETYNYGQSNEYYYNVDFDGTPTSA IKIQFGFEF >gi|283510559|gb|ACQH01000060.1| GENE 11 13289 - 14155 350 288 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15895122|ref|NP_348471.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase [Clostridium acetobutylicum ATCC 824] # 1 276 1 274 642 139 29 3e-32 MIQVEIDNGSGFCFGVTTAIKKAEEELAAGKKMYCLGDIVHNGMEVERLTAMGMTTINHE QLRELHDVKVLLRAHGEPPETYELARRNNIEIIDATCPVVLQLQKRIKKQFDANPDAQIV IFGKNGHAEVLGLVGQTRSEAIVIEHFDEVSKLDFNRDIYLYSQTTKSLDEFHRIIEYIQ SHISSTATFRSFDTICRQVANRMPNIARFATQHDVIIFVSGRKSSNGKVLFNECKAVNPR 
SYHVENASEINLDWFADAQTVGICGATSTPKWLMEECRDHILDAHKQP >gi|283510559|gb|ACQH01000060.1| GENE 12 14152 - 15216 1042 354 aa, chain - ## HITS:1 COG:no KEGG:PRU_0159 NR:ns ## KEGG: PRU_0159 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 4 354 10 356 356 302 45.0 1e-80 MNNSILAAMFLSMASLSASAQRMEAVRAAIDCGQVLYRTPVTAVFEVKNKGGKPLRIIDV RKSCGCTEVSYPKTEIPAGQVFKVQATYDAAQMGHFNKQVGLYGPSGKDPLVLTLKGVVV EEVVDFSGTYPFTLGDVMVEKNNIEFDDVNRGDRPVQKIHIMNNSGKPVQPVLMHLPPYL QADVSPSTIMSGRGGVATITLDSRSLRNFGLTQTNIYLGMFPGDKVSADKEISVSAVLLP GFNELSTKALHNAPQIKLSATKVDLGAISGKKVKKAEIDIANVGHSTLDISSLQMFTAGL KLSLNKTHIVPGDMAKLKITADEKQLKNVKAAPRVLMITNDPKQPKVVITINVK >gi|283510559|gb|ACQH01000060.1| GENE 13 15229 - 20832 5018 1867 aa, chain - ## HITS:1 COG:TM0984 KEGG:ns NR:ns ## COG: TM0984 COG2373 # Protein_GI_number: 15643744 # Func_class: R General function prediction only # Function: Large extracellular alpha-helical protein # Organism: Thermotoga maritima # 499 717 182 399 1536 78 28.0 1e-13 MRKIAFWALALFMMACQTALAGNYDSLWKQYEVAIKKDLPKTAIGILTQIEKRAKADKQY GSMLKAVICRGALQSAIAPDSLLGEVKGLEEAELAVRNSQPVLAAVYQSALAQMYWDNPN MGDSKAQKSKAYLEMSLQHPELLAKTSAKGYEPLLEPGIDSRIFNNDLLHVLAMVCENYC LMYDYYSKAGNRRAACIVASMMFRKGYNNFSGRTIERGGEISPWSFYKSHTVHQLDSLIE IYSDLPEVAELALVRYKMMESAYDVPEKDVIEYINTALAKWGGWPRMNVLRNAYKKRINP TFDVKLRELTLPNTPQQVHLTGVRNIGKVAMKVWRLDVDGLFQKDIEKPEVLDSLKHKMS AVPSANRECKFVRKLPYKYSTDSVLLQGLPVGTYMLEFISDNKDVPVRRVLLAVTNVTAV VQALPANRVRIVVLNATTGYPMPNAKVRLTLKDRRKYEFRVEHQDCDANGELEYKVRADE YLSAIHPYTPTDVYAHENNYESYYGYYDNKRRYATLHLYTDRSIYRPGQTVHLAGICVER GTDLERKPIAGEKLTLVLYDANYREVAQNDVVTDAFGKCSTLFTLPSGGLTGAFEVRSRG KVYSSIQIRVEEYKRPTFEVGFSEVKEKYAPGDTVVAKGHAKSFAGVPVQGAKVSYKVRR NPAFWWHWRNRNDERGTLISSGTLVTDGDGTFSVGIPMILPDSKQSYGFYNFEVIATVVD QGGETRVGYMSLPLGSRSTVLNCDLPYKIEKDSLKTLLFNYRNASGVGIPGTVRYYIDKE KDVQTAKANTPVPFNAAAFSSGKHRLVAICGNDTIQQEFILFSMKDRRPVVESKMWTYQT AKSFNLANKPVYVQVGTSLKDTYVLYAIYSGDRVLEKGATVVSDSLITIPYTYKPEYGDG ITISYVWAKGDHCYDTYISIARAMPDKRLVPEWKTFRNRLTPGQKEEWTLNLKYPNGKPA 
NAQLMATLYDKSLDQILNHRWDFGLAYSFGTPSVRWQWVWNLRTRYLGASYPLKLLPVNQ FNYYHFASGCFQWNHYFTRANAMAALDVKGNDMASPEEVITSRVQVESYSASAKITRSGA AAYAKEVSTERSSDEVLKVRETGKEDQGQTAGKGDKPTIRENLNETAFFYPALTSDANGD VAIKFTLPESVTTWRFMGLATDKDMNNVLVENEAVASKKMMVQPNMPRFMRLGDKGWITA RVINTSEKALTGTALLEIVDPETGKTFHSETKTCTVNAGQTTNVSFPLNLQEGTKFYTAG VRVWICRISVTGKGFSDGEQHYLPLLGDTELVTNSKAFTQHRPGTLNIDLRKLFAVNRAD NRLTVEYTNNPAWMMVQTLPYMSSVDESDAISLATSFYVNSLGSYLLKQAPVIKNTVELW KQENNGEGSMVSELEKNEELKTLVLNETPWVWNGKNETQQRQELTRFFDENGITYRLNQA TNKLRALQRGDGSWSWFPGMAGSSYVTLTVAEMVTRLQTLTGGRTDMNRNLKRAFDFMGK RVTEEVAAMKKREQRGEKNVRPSEWAVRYLYVSSLVDKEYYEDVKADRAYLIRHLAKQPA AFTIYGKAVAAVILAKNGYAQMGKEYLQSIREYSVFTEEMGRYYDTRKAYYSWFDYKIPT QVAAIEAIQALEPHDTTTLDEMKRWLLQQKRTQVWDTPINSVNAVYAFLNGNMQVLQAAA DSPAKLSVDGKRLETPNASAGLGYVKTTMTGANLNTFTVDKTAPGTSWGSVYAQFTQRTT DIADASMGLSIKRELLYDGELKVGSKVKVRITITADRDYDFVQVVDKRAACMEPVNQLTG YRGGFYCSPRDNATYYYADMMPKGTHVVESTFYIDRPGTFTTGTCTVQCAYSPAYMARTK ALTITIK >gi|283510559|gb|ACQH01000060.1| GENE 14 21152 - 21289 123 45 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNSGILASLSASVRQMKVARAVIDMAKRMPNKDKPSQKVKQYRFL >gi|283510559|gb|ACQH01000060.1| GENE 15 21302 - 26881 4968 1859 aa, chain - ## HITS:1 COG:TM0984 KEGG:ns NR:ns ## COG: TM0984 COG2373 # Protein_GI_number: 15643744 # Func_class: R General function prediction only # Function: Large extracellular alpha-helical protein # Organism: Thermotoga maritima # 323 716 41 399 1536 70 23.0 3e-11 MKKNVFWALALFIMVCQSAFADGYNELWKQFDVAMVKDLPKTALGVLQQIERQAKADKQY GSLLKASVCKGALQSAIAPDSLLGEVKRIEAEELAVRNTEPVLAAVYQSALAQIYHNNYS LRDKDGKTAQTFFNLSLQHPELLAKTQAKGYEPMLLPGVDSRIFNNDLLHVLAMAAGNYR LLYDFYAKAGNRRAACIVASMMFERGFRSNSDDEGEDEDDGKGFSFYKSRIVHKLDSFIN MYADLPEAGELAIVRANMMQSASDVGVEKELEYINYALSKWGTWQNMNTLRNTYNERINP TFNMRVRHMLLPQKPQQVYLSNVRNIKTITMKVWRLNVDGDFEGDVEEPKVLAKLRRNMK AVPTATRERKFVGVPNYVFTNDSLQLEGLPVGMYMLEIIPDNKNVPTRRVLLNVTNLVPV LQELPDDRVRIAVLNATTGHPVPNAKVRLRFYDDEEEKEKTKLFTCGVDGELEHAFGDNE IRGIYVYTKDDTYFNEAAYDTQYRYYGSDKTSPRLHLYTDRSIYRPGQTVHVAGIWVLRG 
PDLERKGAAGQSLKLKLYDANWKVVAENEVVTDAFGKLSTQFTLPTSGLTGTFRVSASSK ISGNVTFRVEEYKRPTFEVSFPKVNEKYAPGDTVVVMGHAKSFAGVPVQGAKVTYTIKRN PAFWWWWSRDDDESPNEIARESTVTDNDGAFRVNIPLTLPKGKKRPGFYNFEVTATVVDQ GGETRLGSMSLPLGSKLTVLSCDLPERSEKDSLKNITFSFRNASGAEIPGNVRYYIDNEL TTQTVKANTPIAFDVAALASGKHRLVAVCENDTVDQEFVLFSMNDRRPVVDSKIWAYQTS SKFNLDGSPVYVQVGTSLKDAYVLYAIYSGDKVLEKGTTVLSDSLITIPYTYKPEYGEGI TISYVWVKDDHYYDHTLSIRRATPDKRLLAEWKTFRNKLTPGQKEEWTLTLKHPDGKPAN AQLMATLYDKSLDQIVNHRWGFDLGYSFPVPYAVWQWAWSLRSYEVYSRNPLRSLPVNSL GFYRFDTSCFDWAYYFTSMFGLKGDRLEEVVVSGYSKKSSLTGSVRVRGLSSVSAKETRA SAVAYEEVKTKDVEETPEAKPESSSTDKPTIRENLNETAFFYPTLTSDANGDVAIKFTLP ESVTTWRFMGLATDQEMNNVMVENEAVATKKMMVQPNMPRFMRLGDKGWITARVINTSEK ALTGTALIEIVDPETGKTFHTETKTCTVNAGQTTNVSFPLNLQEGTKFYAAGVRVWICRI SVTGKGFSDGEQHYLPLLGDTELVTNSKAFTLHRPGTLNIDLRKLFAVNRADNRLTVEYT NNPAWMMVQTLPYMSSVDESDAISLATSFYVNSLGSYLLKQAPVIKSTVELWKQENKGEG SMVSELEKNEELKTLVLNETPWVWNGKNETRQRQELVRFFDENGITYRLNQATNKLRALQ RGDGSWSWFPGMPGSSYVTLTVAEMVTRLQTLTGGRTDMNRNLERAFAFMGKRVADEVAA MKKREQRGEKNVRPSEWAVRYLYVSSLVDKSYYEDVKADRAYLISHLAKQPAAFTIYGKA VAAVILAKNGYAQMGKEYLQSIREYSVFTEEMGRYYDTRKAYYSWFDYKIPTQVAAIEAI QALEPHDTTTLDEMKRWLLQQKRTQVWDTPINSVNAVYAFLNGNMQVLQAAADSPAKLSV DGKRLETPNASAGLGYVKTTMTGINLNTFTAEKTAPGTSWGSVYAQFTQRTTDIADASMG LSVKRELLYDGELKVGSKVKVRITITADRDYDFVQVVDKRAACMEPANQLTGYRGGFYCS PRDNATYYYADMMPKGTHVVESTFYVDRPGTFTTGTCTVQCAYSPAYMARTKALTITIK >gi|283510559|gb|ACQH01000060.1| GENE 16 27025 - 32634 4998 1869 aa, chain - ## HITS:1 COG:RSc3030 KEGG:ns NR:ns ## COG: RSc3030 COG2373 # Protein_GI_number: 17547749 # Func_class: R General function prediction only # Function: Large extracellular alpha-helical protein # Organism: Ralstonia solanacearum # 381 663 214 482 1582 69 26.0 6e-11 MRKAVLLALSLLIMVCQTVFADNYDALWKQYSVARKKDLPRTAMDVLQQIEKKAKAERQY GSMLKAVFCKGQVMVSISPDSLEGEVKRIEAAELAVRDKMPVLAAVYQSALGQLYNVYDE LKDGDGSKSRTYFSLSVKHPELLAKTSAKGYEPLLVPGMDSHVFHNDLLHVLGLAAGEYD LLYRYYEKVGNRRAACLLARLMLEETETNFADSVNEENGEEDTCKFDELPIVRKLDSLMN VYADLPEVAELAIYRYSLMYGSEKVKKSEVLKYIDYALSKWGSWKRMNVLRNAQKTVSNP 
TFDVASRRLILPKKPLKVHFSNVRNLSSITMKVWRLNINGRFEGDINKPQVLQRLRKTMQ AVPSATRECKFVGKSVYQFSNDSVQLDGLPVGAYMLEFIPSNKEVAVQRSLLFVTNVTAV MQEQPGYNVRIAVLNATTGHPMPNAKVEVTIKDRTGYKTRVDTLACKANGEVVCSLEYNE DLEDVFPYTPTDVYACREGWSGFYFQGNGHSGLPTLFIYTDRSIYRPGQVVQVAGISACG GPDLERKSVAGKKVTLRLCDANGKEIGEREAVTDAFGKIATQFTLPKDGLTGNFTIKLKG DYYTSTSFRVEEYKRPTFDVNFAEVKEKYALGDTVVVKGHAQSFAGAPVQGAKVYYTIKR SPVFWCYQSCDVDEDAGELDNDTTVTDSEGMFSVRVPMELPYFKWECAFFSFRVTATVVD QGGETHSCAISLPLGYKSVSVTCDLPDKIEKDSIKTLLFNVRNNTGAEIPGNVRYYIDKE SNTQTVVANTPIPFDAAALSSGKHRLVAICENDTLDREFVLFGLNDRRPVVESKIWSYQS ASSFNLDNKPVYVQVGTSLKDTYVLYAIYSGNKVLEEGTTIISDSLITIPYTYKPEYGDG ITISYVWVKDDECYDCNISIVRATPDKRLMAEWKTFRDRLTPGQTEEWTLSLKHPNGEPA NAQLMATLYDKSLERIKEHHWGLDLGYNFNAPSVNWQWIWNPASVELSGSRPVRLLPTPS FRFYRFSKACFRYDDPFDYLDGLSYAEELYEDESVGNGNVKFTAPVLKRDDEIKRIAVLS SMADDCVEDERLEVMEAGKDKGYADNTDKEVVAGNGEKPTIRENLNETAFFYPALTSDAN GDVAIKFTLPESVTTWRFMCLATDKDMNNVLVENEAVATKKMMVQPNMPRFMRLGDKGWI TARVINTSEKALTGTALIEIVDPETDKTFYTDTKTCTVSAGQTTNVSFLLDLQEGSKFYT AGVKVWVCRISVTGKGFSDGEQHYLPLLGDTELVTNSKAFTLHRPGKLNIDLRKLFAVNR ADNRLTIEYANNPAWMMVQTLPYMGSANESDAISLATSFYVNSLANHMLKQVPAIKRTIE LWKQENNGEGSMVSELEKNEELKTLVLNETPWVWEGNNETRQRQELVHFFDENGIAQRLN TATNKLRQLQRYDGGWGWFEDMKSSPSITLAVAEMITRLQTLTGGRTDMNRSLERAFKFM GTAVKEEVTEIKEREKKGEKDVLPSEWAVRYLYVSSLVDKEFYEDVQAERKFLISRMAKQ PVAFTIYGKAVAAVILAKNGYAQMGKEYLQSIREYSVFTEEMGRYYDTRKAYYSWFDYKI PTQVAAIEAIQALEPHDTTTLDEMKRWLLQQKRTQVWDTPINSVNAVYAFLNGNMQVLQA VADSPAKLSVDGKRLETPNASAGMGYVKTTMTGADLNTFTVDKTAPGTSWGSVYAQFTQR TTDITDASMGLSVKRELLYDGELKVGSKVKVRITITADRDYDFVQVVDKRAACMEPTNQL TGYRDGYYRSPRDNATYYYTDMMRKGTHVIENTYYIDRPGTFTTGTCTVQCAYSPAYMAR TKALTITVK >gi|283510559|gb|ACQH01000060.1| GENE 17 32890 - 33828 1006 312 aa, chain + ## HITS:1 COG:PA4632 KEGG:ns NR:ns ## COG: PA4632 COG0501 # Protein_GI_number: 15599828 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Pseudomonas aeruginosa # 35 270 41 269 273 133 36.0 4e-31 
MKRLNHLLIGALAAMMLACGTTSQVPITGRKHSLLVSDAQILSLSKQEYAKFLSSARLSS NAANTAMVKRVGQRLARAVETYLMNNGYQDEIKNFEWEFNLVADNHVNAFCMPGGKIVVY EGLLPVTQNEASLAIVLGHEIAHAVAKHSAEQMSKKIRQAYGTQIGGSILGAIGGETVGG LAQVAAGQYFSFRNLKYSRDNESEADHMGLIFAAMAGYDPSVAVAFWQRMAAKSGGGNTS DMFSDHPSDAKRIAAIQKLLPEAMSYYKASGNAHANAPATTSTKPANTQPTNKNKGSKGV SASGLYRNSKRK >gi|283510559|gb|ACQH01000060.1| GENE 18 33844 - 34629 891 261 aa, chain + ## HITS:1 COG:BH2535 KEGG:ns NR:ns ## COG: BH2535 COG0543 # Protein_GI_number: 15615098 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Bacillus halodurans # 38 260 32 258 259 154 39.0 2e-37 MPNEEMKKFCLDLDVVAVERLNARYVLIRLSRAQPLPLMKPGQFVEVRVDHSPQTFLRRP ISINYVDIQRNEMGLLVATVGHGTRQMALLKTGDTLNCVFPLGNPFTLPTSPDERFLLVG GGVGVAPMLFLGQKIKEMGAQPTFLLGARTAADLLEMDLFERTGRVLVTTEDGSAGEKGF VTNHSVLQNETFDMISTCGPKPMMMAVARYAREKGTACEVSLENLMACGIGACLCCVEKT VDGNLCACKEGPVFNIQKLLW >gi|283510559|gb|ACQH01000060.1| GENE 19 34629 - 35537 1085 302 aa, chain + ## HITS:1 COG:aq_046 KEGG:ns NR:ns ## COG: aq_046 COG0167 # Protein_GI_number: 15605646 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Aquifex aeolicus # 4 299 3 300 306 293 49.0 4e-79 MAILKTNIGNLSMKNPVMTASGTFGYGLEFEDFVPLDGIGGIIVKGTTLHPRQGNDYPRM AETPQGMLNCVGLQNKGVDYFAEHIYPKIKDIDTRMIVNVSGNTPEDYAECATRIDALEN IPAIELNISCPNVKEGGMAFGTTCAGAASVVRAVRARYGKTLIVKLSPNVTDVTEIARAV EAEGADSVSLINTLMGMAIDIERRQSVLSINTGGLSGPAVKPVALRMVWQVAKAVKIPVI GLGGICNATDAIEFIMAGASAIQIGTANFLDPTVTIKVRDGINQWLDNHGVKDIKEIVGA VG >gi|283510559|gb|ACQH01000060.1| GENE 20 35853 - 36590 583 245 aa, chain + ## HITS:1 COG:PM1750 KEGG:ns NR:ns ## COG: PM1750 COG4335 # Protein_GI_number: 15603615 # Func_class: L Replication, recombination and repair # Function: DNA alkylation repair enzyme # Organism: Pasteurella multocida # 19 244 17 242 245 134 35.0 1e-31 MGTNYSITEQFGANLAELLAEKICKVYHEFDTVHFIQDTANKTIGQSYTQRVATLAELLK 
IYLPADYKKALPILMAILGEENPNETGMFTHYYWILPIGKFVQEYGIEHFDRSMKAIEEI TKRNTGEYAVRPYIRKYPETSLKIIEQWAKSPNFHLRRLASEGLRPKLPWASKLETFIEN PAPVFKILEFLKEDEILFVKRSVANHLTDWLKVNREGVLPLIERWKTSDNPHTQWIIKRA TRKIQ >gi|283510559|gb|ACQH01000060.1| GENE 21 36809 - 37237 202 142 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928537|ref|ZP_06422384.1| ## NR: gi|288928537|ref|ZP_06422384.1| hypothetical protein HMPREF0670_01278 [Prevotella sp. oral taxon 317 str. F0108] # 1 142 1 142 142 279 100.0 3e-74 MDTKGLILNTAYNDFILDSDISNYLYKRHTKAVYNESTFTNSSYYFYDDEIDIWCDDDGK INVIRCASSCFYQGVELIGLPFQELLSAIRILPNNHERIYLLVNGKGQNQHVYDFETIGL QIWVWRQKIKTVLIYKVVEEEL >gi|283510559|gb|ACQH01000060.1| GENE 22 38671 - 39627 1147 318 aa, chain - ## HITS:1 COG:FN1549 KEGG:ns NR:ns ## COG: FN1549 COG0330 # Protein_GI_number: 19704881 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Fusobacterium nucleatum # 7 314 6 293 294 232 47.0 6e-61 MEYLGTYLIIAAILLAFVFVKKSLVIIPQSETKIIERLGKFRAILKPGVNIIIPFVDKAK NIVRMTNRRYSYSNTIDLREQVYDFDKQNVITKDNIQMQINALLYFQIVDPFKAVYEIDN LPNAIEKLTQTTLRNIIGEMELDQTLTSRDTINTKLRSVLDDATNKWGIKVNRVELQDII PPSSVLQAMEKQMQAERNKRATILTSEGEKQAVILQSEGEKTSTINRAEAVKQQAILYAE GEAQARIRKAEAEAIAIQKITDAVGQSTNPANYLLAQKYIAMMQELAQGDQTKMVYLPYE ATNLLGSLGGIKELFKIH >gi|283510559|gb|ACQH01000060.1| GENE 23 39810 - 40259 466 149 aa, chain - ## HITS:1 COG:MTH693 KEGG:ns NR:ns ## COG: MTH693 COG1585 # Protein_GI_number: 15678720 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Membrane protein implicated in regulation of membrane protease activity # Organism: Methanothermobacter thermautotrophicus # 13 149 10 142 146 84 38.0 8e-17 MFEYLNSNQWLFWLLISLLCLIIEMASGTFYILCFAIGALSSMVGCWLGFPFWGQVVCFA FFSILSIFAVRPVVMKYLHRDEEERKSNADALEGRVGVVIEDVQPQHSGYVKVDGDEWRA VSADGSLIKIGERVRIVKMESIIATVERV >gi|283510559|gb|ACQH01000060.1| GENE 24 40751 
- 41233 570 160 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928540|ref|ZP_06422387.1| ## NR: gi|288928540|ref|ZP_06422387.1| hypothetical protein HMPREF0670_01281 [Prevotella sp. oral taxon 317 str. F0108] # 1 160 1 160 160 303 100.0 2e-81 MNRLNSIGKALIAGLVLLLTMPQIAHARARIPVGTRDVIDIVYNIPASDSIIIDGKQVNL ARMHKEFNIAYILPLWVTEEPKLVLYDAPSETYYEMTTEKSQAFLKEYMKEKKLDEASML KLGFYTRYGGKAVLLAIVALMIWGQIPSRKKEEKIEPTEL >gi|283510559|gb|ACQH01000060.1| GENE 25 41244 - 41915 370 223 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928541|ref|ZP_06422388.1| ## NR: gi|288928541|ref|ZP_06422388.1| hypothetical protein HMPREF0670_01282 [Prevotella sp. oral taxon 317 str. F0108] # 1 223 1 223 223 443 100.0 1e-123 MGITPFIIFAIIMTIGIVVALVFFISANNKPLNFAEVESESQRLGENELNTFINDKFGMC IEHLDDKRVLGVNYVYYTPSMKEYFKSAGKDLLRSAMTLGTVKYTTVETPTPLLLTADGL HIFDLSVEGKVKRHLMFNNSRLENATLTPLPDEQAPKLLVDCAKFYNLQVPCEDQVREIK ICTVVYTPTEQFFNYNKLQRVLAYAAGKEFFKALEEKYPNLKT >gi|283510559|gb|ACQH01000060.1| GENE 26 42095 - 43183 1147 362 aa, chain - ## HITS:1 COG:all0797 KEGG:ns NR:ns ## COG: all0797 COG0082 # Protein_GI_number: 17228292 # Func_class: E Amino acid transport and metabolism # Function: Chorismate synthase # Organism: Nostoc sp. 
PCC 7120 # 1 350 1 352 362 365 54.0 1e-100 MRNTFGNLFALTTFGESHGPAIGGVVDGMPAGIDIDLDFIQSELNRRRPGQSKLTTSRNE ADQVELLSGVFEGKSTGTPIGFVVRNANQHSKDYDNMRGLFRPSHADYTYYNKYGLRDHR GGGRSSARITLARVVGGALAKLALRQMGITISAYVSQVGDIVLNKDYRLYNLANTENNDV RCPDEEVAKRMAQLILDTKNAGDTIGGVVTCVMKGCPVGVGEPEFDKLHARLGYAMLSIN AAKGFEIGDGFDLVRCKGSEVNDVFVAEEGQVSTLTNHSGGIQGGISNGQDIFFRVAFKP VATVLMEQDTIELDGTPTTFAARGRHDSCVVPRAVPIVEAMAAMVLMDNLAMAGVKFGVD AE >gi|283510559|gb|ACQH01000060.1| GENE 27 43301 - 43975 696 224 aa, chain - ## HITS:1 COG:FN1875 KEGG:ns NR:ns ## COG: FN1875 COG1047 # Protein_GI_number: 19705180 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 2 # Organism: Fusobacterium nucleatum # 1 165 1 158 164 78 33.0 1e-14 MEKENNKFIAVAYKLYSLADGKETLIEEAPADRPFVFITGFGITLDAFENGVEDLPKGEQ FTLNIECEEAYGERLDERVLDLNKEIFTINGHFDHDNIYKDAVVPLQNEDGNRFMGRVLD VTEDTVKMDLNHPLAGLDLLFKGEIVENREATNEEVQQMLNHISGEGGCGGCGCGHNHGD EEGGGHEHGGCCGHEHHGDGHEHGHHGGCCSHGNGHHGGGCGHH Prediction of potential genes in microbial genomes Time: Sat May 28 01:13:13 2011 Seq name: gi|283510558|gb|ACQH01000061.1| Prevotella sp. oral taxon 317 str. F0108 cont2.61, whole genome shotgun sequence Length of sequence - 39650 bp Number of predicted genes - 35, with homology - 32 Number of transcription units - 21, operones - 8 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 66 - 332 89 ## + Prom 12 - 71 1.9 2 2 Tu 1 . + CDS 190 - 2514 1888 ## COG1404 Subtilisin-like serine proteases + Prom 2563 - 2622 3.0 3 3 Op 1 . + CDS 2644 - 3138 742 ## COG0716 Flavodoxins 4 3 Op 2 . + CDS 3229 - 3594 468 ## BVU_3218 hypothetical protein + Term 3689 - 3736 6.2 5 4 Tu 1 . - CDS 3902 - 4111 57 ## gi|288928547|ref|ZP_06422394.1| hypothetical protein HMPREF0670_01288 - Prom 4303 - 4362 8.2 - Term 5962 - 5998 5.0 6 5 Tu 1 . - CDS 6013 - 6180 87 ## - Prom 6238 - 6297 3.4 - Term 6301 - 6345 -0.8 7 6 Op 1 . 
- CDS 6466 - 7656 1434 ## PRU_1514 hypothetical protein 8 6 Op 2 . - CDS 7705 - 9126 1529 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase - Prom 9320 - 9379 8.9 + Prom 9329 - 9388 5.4 9 7 Tu 1 . + CDS 9422 - 12142 2873 ## COG0574 Phosphoenolpyruvate synthase/pyruvate phosphate dikinase + Term 12328 - 12376 10.5 10 8 Tu 1 . - CDS 12685 - 13206 433 ## SGO_1809 hypothetical protein - Prom 13278 - 13337 8.8 + Prom 13444 - 13503 9.3 11 9 Op 1 . + CDS 13747 - 15096 655 ## Dfer_0864 peptidase C10 streptopain 12 9 Op 2 . + CDS 15107 - 15952 631 ## gi|288928556|ref|ZP_06422403.1| hypothetical protein HMPREF0670_01297 13 9 Op 3 . + CDS 16003 - 16839 665 ## gi|288928557|ref|ZP_06422404.1| hypothetical protein HMPREF0670_01298 14 10 Tu 1 . + CDS 16959 - 19166 1053 ## BT_2172 hypothetical protein - Term 19089 - 19138 8.8 15 11 Tu 1 . - CDS 19192 - 19647 548 ## COG1225 Peroxiredoxin + Prom 19934 - 19993 1.9 16 12 Op 1 24/0.000 + CDS 20137 - 22656 2643 ## COG0209 Ribonucleotide reductase, alpha subunit 17 12 Op 2 . + CDS 22699 - 23745 1251 ## COG0208 Ribonucleotide reductase, beta subunit + Prom 23755 - 23814 1.8 18 12 Op 3 . + CDS 23835 - 25259 587 ## PROTEIN SUPPORTED gi|90021240|ref|YP_527067.1| ribosomal protein S32 + Prom 25313 - 25372 2.8 19 13 Op 1 . + CDS 25392 - 25766 359 ## GALLO_1389 FMN-binding protein 20 13 Op 2 . + CDS 25783 - 25977 82 ## 21 14 Op 1 . + CDS 27548 - 27697 56 ## gi|260912591|ref|ZP_05919121.1| RelE family toxin-antitoxin system 22 14 Op 2 . + CDS 27703 - 27900 68 ## gi|288928564|ref|ZP_06422411.1| LOW QUALITY PROTEIN: toxin-antitoxin system, toxin component, RelE family 23 14 Op 3 . + CDS 27863 - 28174 346 ## BT_4733 hypothetical protein + Term 28253 - 28295 2.0 - Term 28412 - 28457 9.2 24 15 Tu 1 . - CDS 28484 - 28747 100 ## gi|260912594|ref|ZP_05919123.1| conserved hypothetical protein - Prom 28898 - 28957 4.7 - Term 29038 - 29074 -0.8 25 16 Tu 1 . 
- CDS 29156 - 29701 696 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family 26 17 Tu 1 . - CDS 30074 - 30925 439 ## gi|288928568|ref|ZP_06422415.1| hypothetical protein HMPREF0670_01309 - Prom 30947 - 31006 1.9 - Term 30958 - 31012 15.2 27 18 Tu 1 . - CDS 31051 - 32556 1711 ## COG0174 Glutamine synthetase - Prom 32781 - 32840 5.9 + Prom 32644 - 32703 4.7 28 19 Tu 1 . + CDS 32839 - 35427 1805 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 + Prom 35577 - 35636 5.8 29 20 Op 1 . + CDS 35785 - 36879 512 ## gi|288928571|ref|ZP_06422418.1| hypothetical protein HMPREF0670_01312 30 20 Op 2 . + CDS 36882 - 37532 296 ## gi|288928572|ref|ZP_06422419.1| hypothetical protein HMPREF0670_01313 31 20 Op 3 . + CDS 37538 - 37867 327 ## gi|288928573|ref|ZP_06422420.1| hypothetical protein HMPREF0670_01314 32 20 Op 4 . + CDS 37897 - 38151 148 ## gi|288929525|ref|ZP_06423369.1| hypothetical protein HMPREF0670_02263 + Term 38213 - 38265 -1.0 33 21 Op 1 . - CDS 38131 - 38682 434 ## gi|288928574|ref|ZP_06422421.1| hypothetical protein HMPREF0670_01315 34 21 Op 2 . - CDS 38740 - 38940 142 ## gi|288928575|ref|ZP_06422422.1| hypothetical protein HMPREF0670_01316 35 21 Op 3 . 
- CDS 38927 - 39058 122 ## gi|288928576|ref|ZP_06422423.1| hypothetical protein HMPREF0670_01317 - Prom 39080 - 39139 2.7 Predicted protein(s) >gi|283510558|gb|ACQH01000061.1| GENE 1 66 - 332 89 88 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MANDNVFFIVRCSVCTVEVVRIGAETAISPGACWPSESKVTENNGDSIWRIIKIDFGGVL DLILGNLAYSLEGTLWIMWLRMQAEDAQ >gi|283510558|gb|ACQH01000061.1| GENE 2 190 - 2514 1888 774 aa, chain + ## HITS:1 COG:CAC3245 KEGG:ns NR:ns ## COG: CAC3245 COG1404 # Protein_GI_number: 15896490 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Clostridium acetobutylicum # 126 674 27 518 1118 94 24.0 6e-19 MLSPLFSVTLLSLGQHAPGEMAVSAPILTTSTVQTLQRTMKKTLSLAIIALLCILHAHAQ RPVYAKMSGELRQLYIESQAERQATTRSNPYDSRSVCAFVRIGSHAEAVFKAHACTSLAQ FGNIHIVNIPLNQLAKLSQSKWVSRIEMGQSNSILMDTTAKVLNAWPAYQSQSLPQAYTG KGVVVGIQDIGFDLTHPNFCDSTGQTLRIKALWDQLSTDTIGTTLPVGRDYTTPNALLSL RHTRDGKQETHGTHTLGIAAGSGFKSPYRGMAWESDICLVANLTSENKNLVDKKDLYKYT TATDALGFKYIFDYAKRHNQPCVASFSEGSSFRFDSDTQLYHAVLDSLQGPGRIIVASAG NEGGRASYLRKPLETDSAGTFIKDSNGKIRVSAKSKEHFLLRLRFHRVGTLTTYDIDTRA LVAAKDSQLIDTLTFGKLQYEIQAAAYQPDNEGALAYEMYLRTLGSSTYVPPTALVLIGA NADVELFRGAGRWETNKLEPALLGGDDTHTILSPASAPAVIAVGATSYRTHITNYKGEKK YSLQGTGGERAPYSSKGPTADGRTKPDVMAPGSNIISSYNSFYIANHPNNSDVQWDVAHF QHNGRTYPWNCNTGTSMSAPAVAGAIALWLQANPTLTASDVRNIVAQTGTQHSTSLPHPN NLYGYGQVDVYRGLLHILGITKVQGIYQHQPQGVHISPDGNNKLRITFADGQTHACSLRL YSLSGSLVLSRPRTLLAAESTIQLGALAAGVYVVQIHADSPKHSGSTLVRLIAK >gi|283510558|gb|ACQH01000061.1| GENE 3 2644 - 3138 742 164 aa, chain + ## HITS:1 COG:Cj1382c KEGG:ns NR:ns ## COG: Cj1382c COG0716 # Protein_GI_number: 15792705 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Campylobacter jejuni # 6 162 5 160 163 155 57.0 3e-38 MKKTVVVYGSSTGTCQSIAETIASKLGVEALDVANFNADVVAENENLLIGTSTWGAGELQ DDWYDGVKVLKEADLAGKTVAVFGCGDAESYSDTFCGAMAELYNAAKESGAKMVGEVSTD GYTFDDSEAVADGKFVGLALDDVNEDDKTEGRIDAWLEEIKPAL >gi|283510558|gb|ACQH01000061.1| GENE 4 3229 - 3594 
468 121 aa, chain + ## HITS:1 COG:no KEGG:BVU_3218 NR:ns ## KEGG: BVU_3218 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 113 1 113 114 157 63.0 1e-37 MPQTELISADMKVLMNHIYEYQKGVRQMVLYTFNKKYEQFAIARLERQHIDYLIKPVGKE NLNLFFGKKECLNAIRMMINRPLNELSPEEDFILGAMLGYDICRQCERYCERKERTFKVA Q >gi|283510558|gb|ACQH01000061.1| GENE 5 3902 - 4111 57 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928547|ref|ZP_06422394.1| ## NR: gi|288928547|ref|ZP_06422394.1| hypothetical protein HMPREF0670_01288 [Prevotella sp. oral taxon 317 str. F0108] # 15 69 1 55 55 96 98.0 6e-19 MRSRLKKSLLICVGVEQTEKHLYYFDTVEMQADGVENFAAIIVQKSNPKLNEIVEAFNRV VNILKEKPE >gi|283510558|gb|ACQH01000061.1| GENE 6 6013 - 6180 87 55 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDKGKRNRVSEEERIKQRNLNWPARRRVIGRVLFVALTILAIIILAFALWVHLVE >gi|283510558|gb|ACQH01000061.1| GENE 7 6466 - 7656 1434 396 aa, chain - ## HITS:1 COG:no KEGG:PRU_1514 NR:ns ## KEGG: PRU_1514 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 35 396 59 392 392 320 45.0 5e-86 MKQKTMLALLLTATTLQASAQSGDTPTEEKKLTLVSNLKLSGYMISQYQWADQEGAESNG FNIRMARLSLEGRALTDFYWKAQVQFNGNTATLGTSPRVVDVFAEWQKYKPFRVKLGQFK RPFTFENPMHPIDQGFMSFGQSVLKLSGFSDRTGEAASNGRDIGLQIQGDLFPNSNERPL VHYQVGVFNGQGINMRDADQRKDVIGGLWVMPIKGMRLGAFGWTGSYARVGNYTLVDPDT HEPILDGAGQKQVVKGKQTVEKRRFAVSGEYVTKGWTFRSEYIYSKGYGFKTVHNTKADL QDANINYEAGDKAYGFYGLVIAPVADFNGKKLHVKARFDQYCPNGKRSTAKTFYELGIDC ELSRMLKLSAEYALVNDKALKKTNYNLIDFQLGLRF >gi|283510558|gb|ACQH01000061.1| GENE 8 7705 - 9126 1529 473 aa, chain - ## HITS:1 COG:BH0687 KEGG:ns NR:ns ## COG: BH0687 COG2265 # Protein_GI_number: 15613250 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Bacillus halodurans # 5 470 3 455 458 286 37.0 9e-77 MARKRKEFPILENVTITDVAAEGKALARVNDMVVFVPFAVPGDVVDLKIRKKKHSYCEAE VVRFVKYSDVRVEPACQHFGVCGGCKWQNLPYEEQLRAKQQQVYDQLSRISHVQLPTFHP 
IMGSQLTLHYRNKLEFGCSNKRWLTREQVASGETFDTMNAIGFHITGAFDKILPIEKCWL MDDLQNQIRNEIRDYALAHNITFFDLRQQTGLLRDVMIRNSDTGEWMVLVQFHFETADDQ QTADALLAHVAERFPQITSLLWVDNQKCNDTFGDLPVYVFKGNDHIFEVMEDLRFKVGPK SFYQTNTQQAYHLYEVARRFANLSGTEIVYDLYTGTGTIANFVARQAKKVVGVEYVPEAI ADAKVNSELNGIGNTTFYAGDMKDILNDDFIANNGQPDVIITDPPRAGMHADVIETIIRA NPQRIVYVSCNPATQARDLIALDERYAVTAVQPVDMFPHTPHVENVVLLERRA >gi|283510558|gb|ACQH01000061.1| GENE 9 9422 - 12142 2873 906 aa, chain + ## HITS:1 COG:BMEI1436 KEGG:ns NR:ns ## COG: BMEI1436 COG0574 # Protein_GI_number: 17987719 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate synthase/pyruvate phosphate dikinase # Organism: Brucella melitensis # 4 894 46 917 930 1001 56.0 0 MSEKRVYTFGNGVAEGKADMRNLLGGKGANLAEMNLIGVPVPPGFTITTDVCNEYFEKGK EKVVSLLKGEVEKAVNHIENLMNCKFGDVKNPLLVSVRSGARASMPGMMDTILNLGLNDE VAKGLAEKSGNERFAYDSYRRFVQMYGDVVLGMKPTNKEDIDPFEEIIQQVKAERGIKLD NEMTVDELKKLVVLFKQAIKRQTGKDFPTDPMEQLWGAICAVFDSWMNERAILYRKMEGI PAEWGTAVNVQAMVFGNMGNTSATGVCFSRDAATGENLFNGEYLVNAQGEDVVAGIRTPQ QITKEGSLRWAAQQLIDEDVRASQYPSMEENMPEIYAQLNAIQEKLERHYHDMQDMEFTV QDGKLWFLQTRNGKRTGTAMVKIAMDLLREGEIDEKTALKRCEPNKLDELLHPVFDKAAQ KQAKVLTRGLPASPGAACGQVVFFADDAARWHEEGHQVIMVRIETSPEDLAGMSAAEGIL TARGGMTSHAAVVARGMGKCCVSGAGAININYKERTLEIDGVLLHEGDYISLNGSTGEVY LGQVTTQPAKVTGDLADLMELCNKYTKLVVRTNADTPHDAQVARNFGAVGIGLCRTEHMF FENEKIKAMREMILADSTEGREKALTKLLPFQRQDFYGILKSMNGYPVNVRLLDPPLHEF VPHDLEGQQVMAADMGVSVQEIQRRVNSLSEHNPMLGHRGCRLGNTYPEITAMQTRAILG AAIQLKKEGFDPHPEIMVPLIGVVHEFDQQEKVIRDTAKKLFEEEGMEVDFHVGTMIEVP RAALVAENIAKRAEYFSFGTNDLTQMTFGYSRDDIASFLPVYLEKKILKVDPFQVLDQKG VGQLIDMGVKKGRSTRPELICGICGEHGGEPESVKFCHRVGLNYVSCSPFRVPIARLAAA QAAVED >gi|283510558|gb|ACQH01000061.1| GENE 10 12685 - 13206 433 173 aa, chain - ## HITS:1 COG:no KEGG:SGO_1809 NR:ns ## KEGG: SGO_1809 # Name: not_defined # Def: hypothetical protein # Organism: S.gordonii # Pathway: not_defined # 3 110 6 112 199 67 33.0 2e-10 MEIIRKEDFKTSVWSGGLTREVCILPSVGSSLQDRNFDLRVSSAVIHVTQSTFSDFTGFT 
RLILCLDGDITLCVDGQVVTLSDECLFEFDGAADVTSVNSPGAVDLNVIYKQGTRVCARV ERGEHVFAGVEHMLVFAIGSQTTLNGTPLRQHDAAWVSGDVRVSGHVLCVEVG >gi|283510558|gb|ACQH01000061.1| GENE 11 13747 - 15096 655 449 aa, chain + ## HITS:1 COG:no KEGG:Dfer_0864 NR:ns ## KEGG: Dfer_0864 # Name: not_defined # Def: peptidase C10 streptopain # Organism: D.fermentans # Pathway: not_defined # 5 446 81 431 431 82 23.0 2e-14 MSHSKVKSITPIVPQPHDTIMYIVNCEKGWMLLSADKRVTPILASSPSGEFNQNSTNPGV ATLLNSFADKLTNMKQQGEKATFAELKKNENFMFWLRMTWGATNKDCIKTASSLPSTRCH GCEDDGNIDTFICKKLVDNVIIEEKENHIGQRTKTRWGQRSPWNSNLPQVFDGNNYIYPP TGCTAVAMAQLLYFTHFKFNVPNGLNHGVSFTGRIVDNHYSIKNYIPGHYVPNSPRWNEM PLRKWENGQQEYYVADLMAELGFYLNMDYSPTGSGANVSTEVMRSYGIAWDEGDYDANTV MQSIEKGCPVLITAFEGKYRKGWLFWRRTHYRDGHAWLIDGIVNRHRSIRSEYQWEQVIV DRSQGDVEPSLGKYEEVLYINAARASGIREYDRTYKYYDNDSRYLQMNWGWDGEGNSLEF SPNAEIWSAGGHNFRYNKRIYFNVRPLNK >gi|283510558|gb|ACQH01000061.1| GENE 12 15107 - 15952 631 281 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928556|ref|ZP_06422403.1| ## NR: gi|288928556|ref|ZP_06422403.1| hypothetical protein HMPREF0670_01297 [Prevotella sp. oral taxon 317 str. F0108] # 1 281 1 281 281 573 100.0 1e-162 MIKTRYVIFTLFALCLLACTPKDEKEVIGGHTHYRLIARECNYAWAFNAYIKQIPHQGFV QTKISSLKKDTTLKASDFILGFTTCNVDIKDIRKNPDWESYQEAYKRNGESLLVDFPPFN VLSTMPDNMADEKDPIIESRLKQQLERRGTRAIHDMPTTDLVHVDYRLTWLKDIKITSSA EIAGRAPGQSLSDLFVIERYFRNHSFIVTSNKNVITDHKKIVGISLSQYLSYKPMAPAEI YLRFKDNVSVPAPVTAQFTIELITGEDKLIKATAADVKLVP >gi|283510558|gb|ACQH01000061.1| GENE 13 16003 - 16839 665 278 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928557|ref|ZP_06422404.1| ## NR: gi|288928557|ref|ZP_06422404.1| hypothetical protein HMPREF0670_01298 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 278 14 291 291 556 100.0 1e-157 MLVSLLSACVEEIDTPNLHEPKMVVNCLLTQDSVQKLWLTYTSKLGNTAYEEVPQAIATL FENDREVGVFNKVAYGEWRMAFTPVRGRTYTLRIQTPNHPDITANTTFPERLPIQRLKPE DKEGRRAFRISKNADPFWACAFAKDQDTIMRTVVIRPYYTMKEDIATDYPKTDQFNMQPA EAERVNTYHYYIRMLPNAVPTTFTLYPLYSSVVSFMAVSEEYDHYLKSSLLKMQAYESFS DPTQWLEENAVYSNIENGVGIFGAYEETLFNCNNILPE >gi|283510558|gb|ACQH01000061.1| GENE 14 16959 - 19166 1053 735 aa, chain + ## HITS:1 COG:no KEGG:BT_2172 NR:ns ## KEGG: BT_2172 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 732 145 894 897 197 24.0 1e-48 MPRKVRHGPPEQSTVTNARGCFSLPNGVASVVVSHVAYEPKQVELSELAGNTIYITPKNI ALKETTITSTTPNNRGSEFSYDARQAASSISIIGEPDVLRHTLSFPGTSFGMEGTLGIFV RGGDTSGNGLYFGGVPLYVTSHLMGLFSVIPPEMADKVDFYKGGLPATNSNYSSALISVS PKACYGTPTTGKAYISPYLCGLHVTLPLIMNKMSVRLSARTTILPYMLNWMNRTDDKLRV DLYDICVKLDYKPNARNAFSAFYFHTNDLYEYTQYQELRNKQAWQASAAKLEWDAKLNEQ TNFNFLAYHTYAFSIQESDHFAPNTSDTRSHVAIASKLNEWALRAMFAYTWKQGMAVRAG LSMQSQKSTNGNQRSLTAQKGAENVQTYPNTLGAVFAEVDLKFTKQADLKLGIRNTTRFH DERQPINPDLHALTHWYFKPWLGLEIAYDKQNQFHHILEGLPTGWPLNIRVASNERFPKE ETNQFYAGAFVKKRWADNNKFDATLGAYCRRMNNLVAYTSPINAFGLNTFTWEEEASRGK GRSLGIEVSSAISLPIFTASIAYTLSKTTRTYPFINGGRPFPHKFDRRHVLNCQLQYAFV RRPTRKGTAEHNVACNVVLASGNRLTLPIGTYQGVAPPHWDNLKTGFIHPDEYYRHIYDR QQMSDVNAISIKNYFRTDLAYNYISRGRRCTRELSVSVFNVFNRRNPYAIFHEEDRWKQL SIVPIMPMVRWSISW >gi|283510558|gb|ACQH01000061.1| GENE 15 19192 - 19647 548 151 aa, chain - ## HITS:1 COG:HI0254 KEGG:ns NR:ns ## COG: HI0254 COG1225 # Protein_GI_number: 16272212 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Haemophilus influenzae # 2 149 4 149 155 142 45.0 2e-34 MMEIGTKAPEVLGHDENGNEVRLSDFKGKKLVLYFYPKDLTSGCTTQACNLRDNYAELQK QGYAVVGVSINDEKSHKKFIEKNDLPFPLIADTEHKLVEEFGVWGEKSMYGRKYFGTFRT TFVINEEGVITRIMQPKEIKVKEHAQQILGE >gi|283510558|gb|ACQH01000061.1| GENE 16 20137 - 22656 2643 839 aa, chain + ## HITS:1 COG:TP1008 KEGG:ns NR:ns ## COG: TP1008 COG0209 # Protein_GI_number: 15639992 # 
Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, alpha subunit # Organism: Treponema pallidum # 1 839 1 845 845 1077 60.0 0 MDITKRNGSMAPFDREKIATAIKKSFISTGRPIVEDEIHDVAIAVEQVIITHPAKLNVEC IQDEVERSLMQRGFYAEAKSYILFRWQRTEQRKYLNRIANTLQQAEIAPVLKGIAADFEP ELYPIAALAEKFMGFVKADMNGRDKLSALIKAAVELTSAEAPNWEFIAARLLSYRLQRDL ETFEQAEDIQSFAQKVRHLASQNLYGAYIVDAYTENELQQAASFIDEERNKLLNYSGLDL LIKRYVIRSFAHKPVESVQEMFLGIALHLAMNEGENRLHWVKKFYDMLSRLEVTMATPTL ANARKPFHQLSSCFIDTVPDSLEGIYRSIDNFAQVSKFGGGMGLYFGKVRAAGGRIRGFK GAAGGVVRWMKLVNDTAVAVDQLGVRQGAVAVYLDVWHKDLPEFLQLRTNNGDDRMKAHD IFPAVCYPDLFWKMAQQDLNQDWHMFCPNEVLTTRGYSLEDFYGEAWEEKYHECVADKRL SRRTMTIKDVIRLVLRSAVETGTPFTFNRDIVNRANPNGHNGIIYCSNLCTEIAQNMKAI EEVSNEVLTKDGDTVVVRTVKPGDFVVCNLASLSLGNLPLENETYLNDIVATVVRALDNV IDLNFYPVPYAQITNHRYRSIGLGVSGYHHALALRGIKWESEEHLQFVDQVFERINRAAI EASSSLAKEKGAYAYFEGSDWQTGAYFDKRNYNSPQWQQTAQRVKAQGMRNAYLLAIAPT SSTSIVAATTAGLDPVMQRFFLEEKKGAMLPRVAPALSDRTFWLYKGAYQTNQAWSMRAS GIRQRHIDQAQSVNLYITNNYTMRQLLDLYLLAWQSGVKTIYYVRSKSLEVEECESCAS >gi|283510558|gb|ACQH01000061.1| GENE 17 22699 - 23745 1251 348 aa, chain + ## HITS:1 COG:TP0053 KEGG:ns NR:ns ## COG: TP0053 COG0208 # Protein_GI_number: 15639047 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, beta subunit # Organism: Treponema pallidum # 5 348 8 351 351 495 69.0 1e-140 MDNKLKKNALFNPQGDTDLRLRRMIGGNTTNLNDFNNMRYTWASDWYRQAMNNFWIPEEI NLTQDTKDYPLLPPAERKAYDKILSFLVFLDSLQSANLPSLTEFITANEVNLCLHIQAFQ ECVHSQSYSYMLDTICSPEERNDILYQWKTDEHLLRRNTFIGDCYNEFYANNDKASLMKT LMANYILEGIYFYSGFMFFYNLARNGKMSGSAQEIRYINRDENTHLWLFRNIIHEMKKEE PEYFTPEKVKVYEEMMREGVRQEIAWGQYVIGDEVQGLNAQMVNDYIRYLGNMRWRSLGF GNLLEDNIEEPENMRWVAQYANANMVKTDFFEAKSTAYAKSTAIEDDL >gi|283510558|gb|ACQH01000061.1| GENE 18 23835 - 25259 587 474 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90021240|ref|YP_527067.1| ribosomal protein S32 [Saccharophagus degradans 2-40] # 96 473 34 404 408 230 34 9e-60 MPLVAAADNGLRCAKSVFTQPDVRFMQKIKSIHADAVIGVKGLPSGLQWNARRNLVEGKV 
KAPGTYTYTVTSSINGQVSEEQIKLIVSDKLAMPLPFMGWLSWNSVEGDVSEAIVKRVAD MFRANGLYQAGWNTVMMDDLWQARKRADDGKPLPDPKRFPNGLRNLADYVHGKGMKFGLY TDAADKTCAGAFGSYGYERIDAEQYAQWNVDIVKCDYCNAPPEQDTAMVRYRRLGDAFKA VARPITLYICEWGDRKPWLWGAESGGSCWRVSADVRDRWTCEPGGTGVVESIKAMKNIAQ YAGVNRFNDADMLCTGLHGKGKSSNDLCFGTPGMTQDEYATQFALWCMWSSPMALSFDPR ANTVTKEDLKILTNRHLIALNQDRMGQQADLISDADSLVMFAKDLENGDVALSVTNMSGK SLEATFHFTQIPALDANKRYQCHDLWTGERLSAVKRSFTTKVRPHATRVFRLSE >gi|283510558|gb|ACQH01000061.1| GENE 19 25392 - 25766 359 124 aa, chain + ## HITS:1 COG:no KEGG:GALLO_1389 NR:ns ## KEGG: GALLO_1389 # Name: not_defined # Def: FMN-binding protein # Organism: S.gallolyticus # Pathway: not_defined # 1 124 1 124 124 160 60.0 2e-38 MLTEKFFDVIGNEGVVSIVSWGNAEPHLTCTWNSYLVVTHDERILIPAAGMKKTEANVEV NNRILLALGTRNVEGFNGYQGTGFRIEGTARFLTQGADYDMMYKKYPFIRSVLEVTVVSA KQLL >gi|283510558|gb|ACQH01000061.1| GENE 20 25783 - 25977 82 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPDFPVWVVLIIIRQGKTYKPLLLSANFKNGTIISPIRNAHQQALRISFEKQAPFQTIPL SAVR >gi|283510558|gb|ACQH01000061.1| GENE 21 27548 - 27697 56 49 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260912591|ref|ZP_05919121.1| ## NR: gi|260912591|ref|ZP_05919121.1| RelE family toxin-antitoxin system [Prevotella sp. oral taxon 472 str. F0295] # 1 37 3 39 118 63 100.0 4e-09 MRREIVAFGSYYKDFMATLHDKERRKVLYVLSLLEREKTDFRQNSSNTY >gi|283510558|gb|ACQH01000061.1| GENE 22 27703 - 27900 68 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928564|ref|ZP_06422411.1| ## NR: gi|288928564|ref|ZP_06422411.1| LOW QUALITY PROTEIN: toxin-antitoxin system, toxin component, RelE family [Prevotella sp. oral taxon 317 str. 
F0108] # 1 65 32 96 96 112 98.0 9e-24 MYELRIKYESNIYRIFFIFDNNRIVVLFNGFQKKTQKTPRTEIDRAIKIMEDYYEYKRKN DNEHQ >gi|283510558|gb|ACQH01000061.1| GENE 23 27863 - 28174 346 103 aa, chain + ## HITS:1 COG:no KEGG:BT_4733 NR:ns ## KEGG: BT_4733 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 95 11 96 101 89 56.0 3e-17 MSTKEKMITSISKEIEREFGKPGTTEREKFDEEAYAFYTGQLLLDARKEAKVTQAELAKR IHASKSYISRVESGDIIPSAAKFYNMINALGMRIEIVKPLVRV >gi|283510558|gb|ACQH01000061.1| GENE 24 28484 - 28747 100 87 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260912594|ref|ZP_05919123.1| ## NR: gi|260912594|ref|ZP_05919123.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. F0295] # 9 87 1 79 79 132 86.0 1e-29 MSEYIARNLPIIRYDNSSFRFVVSATIGKEGKIKRLRLVNKKSFKGNMSAWEDVKKLLNA VQFKPVMYGGKAISYSFVFQIKVDFTT >gi|283510558|gb|ACQH01000061.1| GENE 25 29156 - 29701 696 181 aa, chain - ## HITS:1 COG:TM0564 KEGG:ns NR:ns ## COG: TM0564 COG1853 # Protein_GI_number: 15643330 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Thermotoga maritima # 26 168 22 159 159 77 36.0 1e-14 MKEINYKEMKFNPFNLLGEEWMLVSAGNEQDGCNTMTISWGHLGCLWGHNDPTAVIYLRP SRYTKTYVDKEKYFTLCVMDNDFKKQMAYLGSVSGRDEDKIAKAGLTKVFADETVYFKEA KLVLICKKLYASELQESGFVYQETLDKNYPKRDLHTMYVGKIEKVLVRDEEYIGIDVPLK K >gi|283510558|gb|ACQH01000061.1| GENE 26 30074 - 30925 439 283 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928568|ref|ZP_06422415.1| ## NR: gi|288928568|ref|ZP_06422415.1| hypothetical protein HMPREF0670_01309 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 283 1 283 283 560 100.0 1e-158 MAKRKMRALRGSREARQRSYRWGKVRITTYVTMAVLQLFVVGYESFWGLPQVGFAGLLWQ LIAVSTPAMLFVYVVLIIVVGVKMNRISTYEERYKSLNFEEEERKWRQNAAILGLMLVVL MAIAVFIAGSMGVFNYVKDLKCGPKVTKMTLNEVSYKHHVVRSKRAYAWLFPQIGGGEFT LRFASCNEKDKWGLTKYYYFDFSQYETDVVDRVLRGVLVKGKAFSAMQFMKETDAKGLRG CGFKAGVINGGVPALYLVRYYPHSHIFINATPVRGKRQGLGDK >gi|283510558|gb|ACQH01000061.1| GENE 27 31051 - 32556 1711 501 aa, chain - ## HITS:1 COG:MA3382 KEGG:ns NR:ns ## COG: MA3382 COG0174 # Protein_GI_number: 20092196 # Func_class: E Amino acid transport and metabolism # Function: Glutamine synthetase # Organism: Methanosarcina acetivorans str.C2A # 1 499 1 504 506 573 54.0 1e-163 MANNELLMNANEVVAFLQKPSSEFTKEDIVRFIQENEIKMVNFMYPGGDGKLKTLNFIIN NLAYLDTILTCGERVDGSSLFSFIQAGSSDLYVLPRFSSAFVDPFAEIPTLCMLASYFDK DGNPLESSPEYTLAKAAKAFREVTGMEFHAMGELEYYVITPDEEVFPAVDQRGYHESMPY AKTNDFRTLCMSYIAKAGGQIKYGHSEVGNFSQDGLIYEQNEIEFLPVPVEKAADELMIA KWIIRNLAYQMGMNVTFAPKITTGKAGSGMHVHMRLMKDGKSVMVENGKLSNTARTAIAG LMKLADSITAFGNKNPMAYFRLVPHQEAPTNVCWGDRNRSVLVRVPLGWAAGVDMMHKAN PQQPAETPDTSMKQTFEMRSPDGSADVYQLLAGLCVGCRYGFELPNALEIAEQTYVDVNI HDAANADKLSRLGQLPDSCVASADCLERQRKVYEAHGVFSPAMIDGIIKQLRDFNDSNLR KEVENRPAELKELVERYFHCG >gi|283510558|gb|ACQH01000061.1| GENE 28 32839 - 35427 1805 862 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 861 1 810 815 699 45 0.0 MNFDKYTIKAQETVQEAVNIAQRAGQQSIEPVHLLKALLEKAADVTNYIFQKLGVNAMQV STLANSEVEHLPRVADGNPYLSNEANNVLLKAEDLSKSLGDEFVSVEPLFLALLAVNSSA ARILKDAGCTEKDARTAIEALRQGQQVKSQSGDENYQSLEKYAKNLVEDARNGKLDPVIG RDDEIRRVLQILSRRTKNNPILIGEPGTGKTAIVEGLAERIVRGDVPENLKDKQLYSLDM GALVAGAKYKGEFEERLKSVIKEVTNADGQIILFIDEIHTLVGAGGGEGAMDAANILKPA LARGELRAIGATTLNEYQKYFEKDKALERRFQTVMVNEPDELSAISILRGIKERYENHHK VRIQDDACIAAVQLSERYISDRFLPDKAIDLMDEAAAKLRMERDSVPEELDEITRRLKQL EIEREAIKRENDEPKMVQLDKDIAELREQEKAFRAKWESERALVNKIQQDKLEIERLNHE ADRAEREGNYERVAEIRYGKLKELDSDIENIKQQLQATQGGEGMVREEVTADDIAEVVSR WTGIPVTRMLQSEREKLLHLEEELHRRVIAQDEAIAAVSDAVRRSRAGLQDPKRPIASFI 
FLGTTGVGKTELAKALAEYLFNDESMMTRIDMSEYQEKFSVSRLIGAPPGYVGYDEGGQL TEAVRRKPYSVVLFDEIEKAHPDVFNILLQVLDDGRLTDNKGRTVNFKNTIIIMTSNLGS QYIQQQCANLSEANRDEVLDETRQKVMDMLKQTIRPEFLNRIDETIMFLPLTKQEIAEVV RLQMRSVGRMLAEQGFKIDVSDDAISLLADLGYDPEFGARPVKRAIQRYVLNDLSKKILA EEVSRDKPILIDALDGKLVFRN >gi|283510558|gb|ACQH01000061.1| GENE 29 35785 - 36879 512 364 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928571|ref|ZP_06422418.1| ## NR: gi|288928571|ref|ZP_06422418.1| hypothetical protein HMPREF0670_01312 [Prevotella sp. oral taxon 317 str. F0108] # 20 364 1 345 345 694 100.0 0 MKCNNIPKPNTVTQLLVVLMFASVFWADSAFGQTVHNDNASHLPKAENKVADIESARHLL HQAKNLAPLASTTQATVIIGPKVGYEGAKYSLQSAQEGDVTWSSNNPDVASINALGILTV KGRGVVVLTANYKKQTYNQTILVGIPRFILLSSPQKGGYKVVAKCIDSEYKNYQPLLNEV LRFKWGIKYPDENIRWSTSDKSSLTIQAQALDKDITVFLQVADALGNKSALQHISVHPQD VYASDYTTFYIDSEGNLYDFKKRYDLYESSRVFIDYKPNLPEKYKGREWMPITAIVLSPQ KGNKEISMYGGGPLVCDILSEEEYEFIRNKSNDNQRYNYMLMLLNDEQKVIQYIPISFIF KTTL >gi|283510558|gb|ACQH01000061.1| GENE 30 36882 - 37532 296 216 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928572|ref|ZP_06422419.1| ## NR: gi|288928572|ref|ZP_06422419.1| hypothetical protein HMPREF0670_01313 [Prevotella sp. oral taxon 317 str. F0108] # 1 216 1 216 216 424 100.0 1e-117 MIRRIISTKMCAAIVLSLISIGCSRFTSEGETLIRGTATINGKEYVDITKRYWNTDFTIN NLSFYPEYSMFLICMQLQPKGQTKLNYEYWISFCLSTEKTGGLQINYPYRVEYKKSLETP TKRIPNDVIKNFATNREKVINTKTNDGIAIVANLKREATRSVEGQFIIRSLDLQSGDCRG EYTITTPAVDGKEDLRINGKFETKFSEYPYYSPYLK >gi|283510558|gb|ACQH01000061.1| GENE 31 37538 - 37867 327 109 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928573|ref|ZP_06422420.1| ## NR: gi|288928573|ref|ZP_06422420.1| hypothetical protein HMPREF0670_01314 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 109 1 109 109 194 100.0 1e-48 MKTRTIIISLTALLCAVSCKEETELQVPNPPPPHTNAKSAELDDEAVYRFQTDKNQMLNS MIRYDYTKKKYVLDISEQDARSIGITTQMYKDAIKRVEEMNKVNFERLQ >gi|283510558|gb|ACQH01000061.1| GENE 32 37897 - 38151 148 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929525|ref|ZP_06423369.1| ## NR: gi|288929525|ref|ZP_06423369.1| hypothetical protein HMPREF0670_02263 [Prevotella sp. oral taxon 317 str. F0108] # 1 37 1 37 41 63 89.0 5e-09 MNDVELTRTLSWRELEIADTELWAKWKGKTHFSRNEKNVPSHNNLRLYPIGRSSNVQYLG FCTAATNRWWCDSNKCPISFLYTF >gi|283510558|gb|ACQH01000061.1| GENE 33 38131 - 38682 434 183 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928574|ref|ZP_06422421.1| ## NR: gi|288928574|ref|ZP_06422421.1| hypothetical protein HMPREF0670_01315 [Prevotella sp. oral taxon 317 str. F0108] # 3 183 1 181 181 354 100.0 2e-96 MRMSMDKTSLSKNNKYKKAVLWWLLCLIGLFILFVCIYFLAWGMMMQPKPSYQPEHSIRE KQFFSDLEEKEGWTGADRYIYNVDKEGKSLFQNQIRLDKNYAYMFTIEIEDSSTFYSLPA NVEDTIALHLYNNVLEKSPKLQKIVVVFSYEEILNERASMGHGLTAEYEIRGKKLVKLKR IKE >gi|283510558|gb|ACQH01000061.1| GENE 34 38740 - 38940 142 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928575|ref|ZP_06422422.1| ## NR: gi|288928575|ref|ZP_06422422.1| hypothetical protein HMPREF0670_01316 [Prevotella sp. oral taxon 317 str. F0108] # 1 66 1 66 66 133 100.0 3e-30 MATCNVSMEAPNDYFILELMLKDEDKPRARGCIEMYDGLDDLPVRRIQFNEGCIWRHKLS EDTILC >gi|283510558|gb|ACQH01000061.1| GENE 35 38927 - 39058 122 43 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928576|ref|ZP_06422423.1| ## NR: gi|288928576|ref|ZP_06422423.1| hypothetical protein HMPREF0670_01317 [Prevotella sp. oral taxon 317 str. F0108] # 1 43 1 43 43 83 100.0 5e-15 MDIRTTLTLNGRSYDVDLFDQCFYRLVGWRGNLLSHYPADGDL Prediction of potential genes in microbial genomes Time: Sat May 28 01:15:40 2011 Seq name: gi|283510557|gb|ACQH01000062.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.62, whole genome shotgun sequence Length of sequence - 7578 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 6, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 164 - 382 138 ## - Prom 411 - 470 4.9 2 2 Op 1 . + CDS 900 - 1496 732 ## PRU_0296 TetR family transcriptional regulator 3 2 Op 2 . + CDS 1493 - 1981 254 ## Fjoh_4302 hypothetical protein + Term 2011 - 2057 3.9 + Prom 1997 - 2056 2.2 4 3 Tu 1 . + CDS 2094 - 2840 281 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 + Term 2854 - 2911 6.0 + Prom 2879 - 2938 1.9 5 4 Tu 1 . + CDS 3065 - 3730 181 ## PROTEIN SUPPORTED gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family + Term 3843 - 3895 2.4 - Term 4224 - 4260 0.2 6 5 Tu 1 . - CDS 4466 - 4876 167 ## gi|288928582|ref|ZP_06422429.1| hypothetical protein HMPREF0670_01323 - Prom 5020 - 5079 6.6 - Term 5689 - 5728 2.1 7 6 Tu 1 . - CDS 5900 - 7108 1170 ## COG0738 Fucose permease - Prom 7256 - 7315 3.7 Predicted protein(s) >gi|283510557|gb|ACQH01000062.1| GENE 1 164 - 382 138 72 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSALFVHQSFTLPQKIDSRGGLFWCQKGAVVYADKMNFYLKRPPLTPVLGRFAAKCSVFW CKTQGALVQNAR >gi|283510557|gb|ACQH01000062.1| GENE 2 900 - 1496 732 198 aa, chain + ## HITS:1 COG:no KEGG:PRU_0296 NR:ns ## KEGG: PRU_0296 # Name: not_defined # Def: TetR family transcriptional regulator # Organism: P.ruminicola # Pathway: not_defined # 1 193 1 193 197 320 84.0 1e-86 MSISKTRQKLVDVARQLFARNGVANTTMNDIAVASGKGRRTLYTYFNRKEDVYYAVIEAE LERLSDKLDEVAAKKISPQDKIIELIYTHLSMIKETVVRNGNLRAEFFRNIWTVEKMRKN FDEDEIELFRKVYAEGKADGEFDIDNVELVADITHYCIKGLEVPFIYGRLGHGLTEEASK PLVAKVVYGALGKRINKQ >gi|283510557|gb|ACQH01000062.1| GENE 3 1493 - 1981 254 162 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_4302 NR:ns ## KEGG: Fjoh_4302 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 1 162 1 153 153 65 28.0 5e-10 
MKAAVLFSFVAIIMAACTASSPKEQSHATQPATAHAPKTATIVKDITPPQRSVTPDTTTK YTEEDGGMTQIESKLFTETSSMQVLYQETIRRGDIDNAEMLLPKLPSKSKEVDVDKNGLI GISYKIGHRQATVEMYYNGGVTTLILREQSRGVNREIIHSAD >gi|283510557|gb|ACQH01000062.1| GENE 4 2094 - 2840 281 248 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 7 247 4 241 242 112 34 7e-25 MGLLTGKTALVTGAARGIGKAIALKFAQEGANIAFTDLVIDENGKATEAEIAALGVKAKG YASNAADFAQSEEVVKLVKEDFGSVDILVNNAGITKDGLMLRMTEQQWDAVIGVNLKSAF NFIHAVIPVMMRQRGGSIINMASVVGVHGNAGQANYAASKAGMIALAKSVAQEMGPKGIR ANAIAPGFIDTAMTQELSDEVRKEWMNQIPLRRGGTVEDVANCALYLASDLSSYVSGQVI QVDGGMNT >gi|283510557|gb|ACQH01000062.1| GENE 5 3065 - 3730 181 221 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family [Lactobacillus jensenii 269-3] # 4 214 84 283 287 74 27 3e-13 MIPVYEDNHIIIVSKHSGEIVQSDKTGDVPLVDIVKSYLKEKYAKPGNVFLGVVHRLDRP VWGLVAFAKTSKALSRMNALFRTGDVEKTYWAIVQSPPENETGELVHWLTRNEKQNKSYA HNHEVAGAKKAILSYKVIGRSENYTLLEVQLMTGRHHQIRCQLAAMGCPIKGDLKYGARR SNADGSISLLSRKISFVHPVSKETICVEAPLPDDNLWHAFG >gi|283510557|gb|ACQH01000062.1| GENE 6 4466 - 4876 167 136 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928582|ref|ZP_06422429.1| ## NR: gi|288928582|ref|ZP_06422429.1| hypothetical protein HMPREF0670_01323 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 136 49 184 184 254 100.0 1e-66 MSCSFCAIVYMCIYLLHKICNKTVLVCYTKDYVVFQIGKKEKRYRKTDLLGFYSLDYINT SQYKIAIQFVFKDGCRLNLFEDSPFEKSEEVRLQKNESLKEFLLTSTHYFGFKLHKRVKW RSRLSSLNSADTWYTK >gi|283510557|gb|ACQH01000062.1| GENE 7 5900 - 7108 1170 402 aa, chain - ## HITS:1 COG:XF1462 KEGG:ns NR:ns ## COG: XF1462 COG0738 # Protein_GI_number: 15838063 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Xylella fastidiosa 9a5c # 54 396 4 360 377 173 31.0 4e-43 MSENKNYLVPIAFLGLMFFSCGFALGINSLLVPVLQESLNVPSAGAYLLIGATFIPFLIF GYPAGQLVKLIGYKRTMAVAFLFFALSFGVFIVSASQESFVLFLVASFCCGTANAFLQTA INPYITILGPVESAAKRISVMGICNKLAWPISPWFITLVAGEHLTVSQLDKPFYIIIAVF LALGVLSLMAPLPEVKAAGEDESEASNNAEDAQVSQYANSKKSIFQFPHLVLGAFAIFFY VGAETIALATLIDYAKGLGLENAENYSWITPLCISVGYVSGVVLIPKYMSQTRALQVCTI IAIVGTLLVVLLPGALSVFSIGVIALGCSLMWPAFWPLSMQDLGKFTKSGASVLTMGLIG GAVVTVLFGLLKDNYGPQNAYWICLPCFLYIAFFAFKGHKLR Prediction of potential genes in microbial genomes Time: Sat May 28 01:15:58 2011 Seq name: gi|283510556|gb|ACQH01000063.1| Prevotella sp. oral taxon 317 str. F0108 cont2.63, whole genome shotgun sequence Length of sequence - 8006 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 4, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 471 - 878 323 ## PG1113 integrase 2 1 Op 2 . + CDS 860 - 1696 586 ## PG1113 integrase 3 1 Op 3 . + CDS 1699 - 2088 135 ## BDI_3501 hypothetical protein + Term 2101 - 2148 8.5 4 2 Tu 1 . + CDS 2546 - 3793 1057 ## BDI_3502 hypothetical protein + Prom 4310 - 4369 3.9 5 3 Tu 1 . + CDS 4409 - 5770 893 ## PG1109 mobilization protein - Term 5707 - 5747 -0.8 6 4 Op 1 . - CDS 5804 - 6583 332 ## gi|288928588|ref|ZP_06422435.1| conserved hypothetical protein 7 4 Op 2 . - CDS 6625 - 7338 293 ## BT_2451 putative pyrogenic exotoxin B 8 4 Op 3 . 
- CDS 7343 - 7567 102 ## gi|299141457|ref|ZP_07034594.1| hypothetical protein HMPREF0665_01036 - Prom 7677 - 7736 5.2 Predicted protein(s) >gi|283510556|gb|ACQH01000063.1| GENE 1 471 - 878 323 135 aa, chain + ## HITS:1 COG:no KEGG:PG1113 NR:ns ## KEGG: PG1113 # Name: not_defined # Def: integrase # Organism: P.gingivalis # Pathway: not_defined # 1 98 1 98 407 196 94.0 2e-49 MARSTFKVLFYVNGSKEKNGIVPIMGRVTINGTVAQFSCKRTIPRELWDVRGNRAKGKSK EAIATNLSLDNIKAQIIKHYQRLSDREAFITAEMVRNAPIRGWAASMTPCSNPSTGTALP CSSVWARTGAWGHTR >gi|283510556|gb|ACQH01000063.1| GENE 2 860 - 1696 586 278 aa, chain + ## HITS:1 COG:no KEGG:PG1113 NR:ns ## KEGG: PG1113 # Name: not_defined # Def: integrase # Organism: P.gingivalis # Pathway: not_defined # 1 278 130 407 407 536 92.0 1e-151 MGTYKVMLRARNNTEKFIRYKYNRSDMSMQKLTPDFIRDFAVYLSTVKGNRNATIWLNCM WLKGVAMRAHFNGKIPRNPFAQFHVSPNTKEREFLTEDELKKLMAHEFADSHSAFVRDLF VFASFTALSFVDLKELTTDEIVEVNGEKWILARRHKTHVPFQVKLLDVPLQIIERYRPFQ KDNSIFGDINYWTVCKRLKKVICECGITKDISFHCARHGFATLALSKGMPIESVSRVLGH TNIVTTQLYAKITTEKLDTDLSMLGSKLNVSFGNIKIG >gi|283510556|gb|ACQH01000063.1| GENE 3 1699 - 2088 135 129 aa, chain + ## HITS:1 COG:no KEGG:BDI_3501 NR:ns ## KEGG: BDI_3501 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 11 114 3 108 127 83 38.0 2e-15 MKGKDTNNNARQIITMDEHGNVTIPNGEIWMGEYEIANLLGVFGHTVRTQVREIYKAGLL HPHTAERNIRIAEGRWLDAYSLKMVIVLAFRIRSQRAKRFREHVIAMLTERHERFVMLLP EQAVPAETP >gi|283510556|gb|ACQH01000063.1| GENE 4 2546 - 3793 1057 415 aa, chain + ## HITS:1 COG:no KEGG:BDI_3502 NR:ns ## KEGG: BDI_3502 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 13 413 1 401 401 615 72.0 1e-174 MDKKNNDGLLEDMPEKRKKDGAFVRVGTTLYKLADMPLVGGGFVRKRIVWNNETLRQDYG RDYLATVPKYDGFCTVPDHVNYQPVVGSFLNLYEPTGHHPKQGAFPHIEALARHIFGEQY ELGMDYLQLLYLYPIEKLPILLLVSEERNTGKSTFLNFLKALFGNNVTFNTNEDFRSQFN SDWAGKLLILVDEALLDRREDSERLKNLSTTLSYKVEAKGKDRDEISFFAKFVLCSNNER LPVIIDAGETRYWVRKVGRIEKDDTDFLKRIKEEIPAFLFFLQHRTLSTKKESRMWFSPE 
LIHTQALSRIIRSNRNRTEVEMAETCLEVMACMKASAFSFCINDMLLLLNCAGCRTDRTQ VRRVVQDIWKLTPAENTLTYTTCQPSYDNMCPYTEVRRTGRFYTIGRKQLEEMQG >gi|283510556|gb|ACQH01000063.1| GENE 5 4409 - 5770 893 453 aa, chain + ## HITS:1 COG:no KEGG:PG1109 NR:ns ## KEGG: PG1109 # Name: not_defined # Def: mobilization protein # Organism: P.gingivalis # Pathway: not_defined # 1 453 1 455 455 699 83.0 0 MGYCVLHLEKAKGADSGMSAHIERTIVPKNIDPTRTHLNRELVEFPDGVRNRTAAIRHRL DTAGLKRKIGKNQVQAIRIVLSGTHEDMARVEGEGKLDEWCADNVKWLRETYGADNLVSA VLHMDEETPHIHATIVPIVQGECRKQKKEENVKRKYKVKAPAPRLCADEVMSRANLIRYQ DSYGDHMAKYGQKRGIRGSVAKHLSTHEYYRNLIIQGEDIQANIANLLAREVEARRIIAE AEKAKQELARIKAETKTVELKNSAARTATAALNGIGSLLGSNKMSRLESENRQLHGEVAE LKENIGQMRTDMQNMKDSHTAERLRASEQHQREIGNLRRIVDKAKEWFPMLAEFLKIERL CRSVGLSEKYTDELLQGKVLVVTGKLRSEEYRRSFAAEKVRLQVGRTEKEGKTILDLLVD RVPIAAWFKGQWEKIRQTSILSSKENRSKGVRL >gi|283510556|gb|ACQH01000063.1| GENE 6 5804 - 6583 332 259 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928588|ref|ZP_06422435.1| ## NR: gi|288928588|ref|ZP_06422435.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 259 1 259 259 505 100.0 1e-141 MGRYFRFSLACLLFAFIISCTSSDEDKKLTFQLDKPYREIMQGSTLRVFITSGSGDLQIS NSNAFKDYARIAYARTPEQAGHVGVLSIQALQVGEFQVSLTDNLTKERQQLTVKVLPPYL LLEIHEGSDPVWSIGSGVSGLFLVQNAAHSFYLCRYTPTVYRHVGPAVLSGVYEIVMKDG IPTNLRLKSKKQGFVRDFTLTVKNKKEALERLGKLGKLGLLPDEEKFIYPVFLDEHDMEE HIMAEISYKEATLPEKVME >gi|283510556|gb|ACQH01000063.1| GENE 7 6625 - 7338 293 237 aa, chain - ## HITS:1 COG:no KEGG:BT_2451 NR:ns ## KEGG: BT_2451 # Name: not_defined # Def: putative pyrogenic exotoxin B # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 205 194 389 426 94 30.0 4e-18 MLRTEWGQNRPYNNELGYINNDIPKVVGCVGTAIAQIVAHYEAMSSVYGHTLDWNLIKER AGIDALTDDDIQHQVALLCKHVAYGIKTEWNMDGTGGASMTNSHKYLETMGVTFNLGKRN KGYDMDAAIIIASLDRGCPVLITGDEEPSETRSSGNKKGGHCWILDGYQVRTRSTPTKLK AMIKSHDVYVHANFGWKGYASGYYMVDRNETSLSFDTRPVEGHYNQRLRLFPMVQRK >gi|283510556|gb|ACQH01000063.1| GENE 8 7343 - 7567 102 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|299141457|ref|ZP_07034594.1| ## NR: gi|299141457|ref|ZP_07034594.1| hypothetical protein HMPREF0665_01036 [Prevotella oris C735] # 1 72 150 221 462 146 95.0 5e-34 MVREAENAIKRKLRYYNAIRDSLHDKTLCKLKTAFHTNDITYHDVKEKIKVVQTPLTRGQ WIPEPTMGIFGKRS Prediction of potential genes in microbial genomes Time: Sat May 28 01:16:37 2011 Seq name: gi|283510555|gb|ACQH01000064.1| Prevotella sp. oral taxon 317 str. F0108 cont2.64, whole genome shotgun sequence Length of sequence - 5353 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 31 - 90 3.1 1 1 Tu 1 . + CDS 185 - 964 331 ## COG3293 Transposase and inactivated derivatives + Term 1159 - 1206 6.4 - Term 1145 - 1196 11.9 2 2 Op 1 . - CDS 1227 - 1673 301 ## BT_4035 hypothetical protein 3 2 Op 2 . - CDS 1678 - 4080 368 ## BVU_1503 hypothetical protein 4 2 Op 3 . 
- CDS 4089 - 5048 305 ## COG0464 ATPases of the AAA+ class - Prom 5072 - 5131 5.8 Predicted protein(s) >gi|283510555|gb|ACQH01000064.1| GENE 1 185 - 964 331 259 aa, chain + ## HITS:1 COG:sll1710 KEGG:ns NR:ns ## COG: sll1710 COG3293 # Protein_GI_number: 16330321 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Synechocystis # 16 247 21 238 261 71 28.0 2e-12 MYEVLDKDTIKSEILPHLSVAKRGHVSKSDLVEVIQCILYKLKTGCQWHMLPVSSFFTGR VLHYKTVYGHFRKWSRNGEWEKVWGIILHRYRFFLDMSSVELDGSHTTALRGGECCGYQG RKKRKTTNAIYVTDSQGIPLAMSTPVSGSHNDVYKISEVLSELFSGLNSSALSVSGLFLN ADAGFDTEQFRWGCHEHEVFPNVAFNKRRGKQGEEELLDELLYKQSYCIERTNAWMDSYR NVLNRFDTPLASWKAWNSI >gi|283510555|gb|ACQH01000064.1| GENE 2 1227 - 1673 301 148 aa, chain - ## HITS:1 COG:no KEGG:BT_4035 NR:ns ## KEGG: BT_4035 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 97 1 94 106 103 56.0 2e-21 MRLLDGHMDQIINKLEAHESKTLSRWREAAEYRQTNKSWLRRSQRIAMLMLEKMDELHLT QKELAQRMSCSQQYVSKILRGKENLSIETLCKIEDALSPFTPQRDKSDMMKLYSHDEMLD RVLGSKGTPARNDYERKINRFLIKNIGK >gi|283510555|gb|ACQH01000064.1| GENE 3 1678 - 4080 368 800 aa, chain - ## HITS:1 COG:no KEGG:BVU_1503 NR:ns ## KEGG: BVU_1503 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 796 1 796 800 839 56.0 0 MHKKHFFLGNQIAENKGFTPQLQRGGSPAEFPQYDNRRAHADALKLQYANAFANAINEQE PLLQYGQHTADGAYLDFTVPQNMIPSSLDTQRGARIMKIHKKDSEDDNVDVTVFVSNKHK DWFDKKADKFATAETEKKKPCNESLIAPIQGISSAKIESLFESKDEYEALGPAEEMGFEL WICKTDDYNINDIQIVLDDLSIKCQIEQRLQFADVDVFLITTNKTVLAQLPYHMANISEI RRYKQPSILTESSEVSREWTDILANELQFEESDVHIGILDSGVNNAHPLIAQALPDSRMS TAINVQDNLDHIDHGTGMAGLVLMGDLTKLAYDRGNLPVVQHNLASVKIVDANYSTAPSF YGAVIEDAISQSQDMGADIDCMAVTDSISDDGKPTSSSAALDESIYHSGECDRLVLVSAG NIYQDDVDATNYIKSCKANAINSPAQAWNALTVGAYTELSIPADKTAKGIASPKGVSPYS CTSFSWHEERVKPEIVMEGGNLAYSPSMGLYPDEDLCLITTSNDLRSPLETFNATSAATA LAARLAAKIKVENPNLSMLSIRALMVHSARWTDEMKSIDDGRPQDIMPLCGYGIPDEDIA 
AYSSEKYATYVFENELKPYKSGGTNTYNEMHFYALPWPVDVLETMHDEVVTLRITLSYYV EPAPGKSGKNNKYRYPSATLRFDVKTPIENDSEFLARHNKLEGEKTSDNNSTRWNIGQQR RERGTVQSDWFSCTAQELASCGQIVVFPGPGWWKERKLENVDNKIKYSLVVSIQTEKTEI YQAVETAIANRIAIPISNHY >gi|283510555|gb|ACQH01000064.1| GENE 4 4089 - 5048 305 319 aa, chain - ## HITS:1 COG:AGpT158 KEGG:ns NR:ns ## COG: AGpT158 COG0464 # Protein_GI_number: 16119896 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATPases of the AAA+ class # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 296 18 323 345 209 41.0 4e-54 MINSDQVLSLIRNHLDNDDAQFRKVALQISAVEAKNGHVILARTIQDLLGKSKTSFSPLH ISPKHKDVEDLLLQIESDEDLKSLITSDENIEKINRIIKEYLKRDNLHKYGLDNRRKLLL YGESGTGKTMTANVLSRELNLPFFVVRTEKVVTKFMGETGLKLSNIFDFISEVPAVYLFD EFDAIGAQRGMDNEVGEQRRILNTFLQLLERDSSESFIIAATNSIESIDKAMFRRFDDVI EYTLPDSKQRLKLLKEYLYTAKDLDFSMVEPLFDGMSHAEIKMVCADIFKESLLNDTPIN INLVKTVLDMRSILVRNIG Prediction of potential genes in microbial genomes Time: Sat May 28 01:16:48 2011 Seq name: gi|283510554|gb|ACQH01000065.1| Prevotella sp. oral taxon 317 str. F0108 cont2.65, whole genome shotgun sequence Length of sequence - 2948 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 618 392 ## COG3385 FOG: Transposase and inactivated derivatives + Prom 792 - 851 3.6 2 2 Op 1 . + CDS 1032 - 1868 553 ## MAE_08660 type II restriction enzyme EcoRI-like protein 3 2 Op 2 . 
+ CDS 1888 - 2913 614 ## PGN_1426 adenine-specific methyltransferase Predicted protein(s) >gi|283510554|gb|ACQH01000065.1| GENE 1 3 - 618 392 205 aa, chain - ## HITS:1 COG:SMb20766 KEGG:ns NR:ns ## COG: SMb20766 COG3385 # Protein_GI_number: 16265206 # Func_class: L Replication, recombination and repair # Function: FOG: Transposase and inactivated derivatives # Organism: Sinorhizobium meliloti # 1 205 45 242 387 75 26.0 6e-14 MVFCQFSGCDSVRDISNGLKSATGNLNHLGISRAPSKSTVSYQNTNRDSDVFRDIFYAVY KYFGQQGWGSRKGFRFKMPIKLLDSTLVSLTMSVYDWAHYTSKKGAVKMHTLLDYDCLLP DFVNITDGKGSDNKAAFDIPLQPHSIVVADRGYCDYALLNHWDSTNVFFVVRHKGNMRYK RVRELPLPDHAAQNVLIDEEIEPDL >gi|283510554|gb|ACQH01000065.1| GENE 2 1032 - 1868 553 278 aa, chain + ## HITS:1 COG:no KEGG:MAE_08660 NR:ns ## KEGG: MAE_08660 # Name: not_defined # Def: type II restriction enzyme EcoRI-like protein # Organism: M.aeruginosa # Pathway: not_defined # 1 278 1 278 278 433 79.0 1e-120 MSKKNQSNRLTTQQKESKGVVGIFGEETKIHDMTVGEVSHLALNKLQEEFPQLEFRSRTS IKKEEINEALKKIDPKLGQTLFVPNSSIIPDGGIVEVKDDNGEWRVVLVSEAKHQGKDIE NIKAGNLVGKKDNQDIMAAGNAIERSHKNIAEIANLMLAEAHFPYVLFLEGSNFLTETIP VKRPDGRVVTLEYNSGKLNRLDRLTAANYGMPINTNLCKNKFIKHNDKTIMLQATSIYTQ GNGEGWNAEKMLEIMLDISRTSLEMLGSDLFKQLTQKK >gi|283510554|gb|ACQH01000065.1| GENE 3 1888 - 2913 614 341 aa, chain + ## HITS:1 COG:no KEGG:PGN_1426 NR:ns ## KEGG: PGN_1426 # Name: not_defined # Def: adenine-specific methyltransferase # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 1 322 1 322 322 560 85.0 1e-158 MARKATNELLQKAKKQKNDEFYTQLADIESELQHYESHFKGKVVYCNCDDPHISNFFRYF VSRFKKLELKKVIAACYKEQNVDLFNTGKNNHGFFFEYTGTELTGTDPKTANIVYFKGDG DFRSHESIALLKQSDIIVTNPPFSLFREYVAQLMRYDKKFLIIGNINAITYKEIFKLIQD NKAWLGVNLGRGISGFIVPAHYELYGTETKINELGERIISPNNCLWLTNLDNFIRHEDIL LTKNYIGYEQEYPKYDNYDGINVNRTKDIPQDYCGYMGVPITFLHKFNPHQFEIIKFRKG NDKKDLSINGKCPYFRIIIKNKHANPKIKHLTDKYQSVSLK Prediction of potential genes in microbial genomes Time: Sat May 28 01:17:13 2011 Seq name: gi|283510553|gb|ACQH01000066.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.66, whole genome shotgun sequence Length of sequence - 69040 bp Number of predicted genes - 61, with homology - 51 Number of transcription units - 34, operones - 12 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 245 - 643 318 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 + Term 719 - 771 4.2 + Prom 895 - 954 3.4 2 2 Op 1 . + CDS 1160 - 1558 346 ## gi|288928599|ref|ZP_06422445.1| hypothetical protein HMPREF0670_01339 + Prom 1560 - 1619 2.5 3 2 Op 2 . + CDS 1678 - 2124 335 ## BF2347 hypothetical protein + Term 2159 - 2216 10.7 4 3 Tu 1 . + CDS 2487 - 2957 651 ## PRU_2778 putative DNA-binding protein + Prom 3115 - 3174 6.8 5 4 Tu 1 . + CDS 3202 - 3312 128 ## + Term 3440 - 3482 6.2 - Term 4163 - 4206 4.6 6 5 Op 1 36/0.000 - CDS 4239 - 4994 769 ## COG0479 Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit 7 5 Op 2 . - CDS 5056 - 7038 2401 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 8 5 Op 3 . - CDS 7068 - 7754 714 ## PRU_2433 succinate dehydrogenase/fumarate reductase cytochrome b subunit - Prom 7774 - 7833 7.2 9 6 Tu 1 . + CDS 7783 - 8004 96 ## + Term 8090 - 8124 -0.8 + Prom 8544 - 8603 4.2 10 7 Tu 1 . + CDS 8721 - 11954 4114 ## COG0793 Periplasmic protease + Prom 12126 - 12185 3.6 11 8 Tu 1 . + CDS 12213 - 13115 917 ## Slin_6012 helix-turn-helix type 11 domain protein + Term 13361 - 13397 0.2 12 9 Tu 1 . - CDS 13833 - 14441 685 ## COG0164 Ribonuclease HII - Prom 14679 - 14738 4.8 13 10 Tu 1 . - CDS 15055 - 15255 96 ## + Prom 15036 - 15095 4.7 14 11 Tu 1 . + CDS 15161 - 15463 325 ## gi|288928611|ref|ZP_06422457.1| hypothetical protein HMPREF0670_01351 + Term 15535 - 15589 6.8 - Term 15521 - 15573 11.4 15 12 Tu 1 . 
- CDS 15690 - 16904 1215 ## COG0520 Selenocysteine lyase - Prom 17102 - 17161 3.9 - Term 17205 - 17253 9.3 16 13 Op 1 41/0.000 - CDS 17262 - 18605 1334 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component 17 13 Op 2 . - CDS 18618 - 19373 197 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 18 13 Op 3 . - CDS 19391 - 20164 518 ## BF0266 hypothetical protein 19 13 Op 4 . - CDS 20192 - 21643 1713 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component - Prom 21697 - 21756 2.8 - Term 21705 - 21742 -0.9 20 14 Op 1 20/0.000 - CDS 21913 - 24894 3708 ## COG0532 Translation initiation factor 2 (IF-2; GTPase) 21 14 Op 2 32/0.000 - CDS 24956 - 26224 633 ## PROTEIN SUPPORTED gi|17988250|ref|NP_540884.1| transcription elongation factor NusA - Prom 26419 - 26478 4.8 22 14 Op 3 . - CDS 26620 - 27084 598 ## COG0779 Uncharacterized protein conserved in bacteria - Prom 27269 - 27328 4.7 23 15 Op 1 . - CDS 28520 - 29008 435 ## COG1430 Uncharacterized conserved protein 24 15 Op 2 . - CDS 29086 - 29289 200 ## - Prom 29328 - 29387 4.2 - Term 29633 - 29665 5.0 25 16 Op 1 . - CDS 29893 - 30537 369 ## gi|288928623|ref|ZP_06422469.1| hypothetical protein HMPREF0670_01363 - Prom 30597 - 30656 2.9 26 16 Op 2 . - CDS 30689 - 31240 464 ## gi|288928624|ref|ZP_06422470.1| hypothetical protein HMPREF0670_01364 - Prom 31362 - 31421 4.9 27 17 Op 1 . - CDS 31615 - 32835 1368 ## COG1488 Nicotinic acid phosphoribosyltransferase 28 17 Op 2 . - CDS 32903 - 33556 801 ## COG0546 Predicted phosphatases - Prom 33598 - 33657 3.4 - Term 33776 - 33823 10.3 29 18 Op 1 . - CDS 33832 - 36030 1159 ## PROTEIN SUPPORTED gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 30 18 Op 2 . - CDS 36079 - 38379 1493 ## BVU_0314 hypothetical protein - Prom 38605 - 38664 7.5 31 19 Tu 1 . - CDS 38667 - 38858 74 ## 32 20 Op 1 . 
+ CDS 38803 - 39948 1321 ## PRU_2474 hypothetical protein 33 20 Op 2 2/0.000 + CDS 39998 - 42040 2400 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit + Term 42189 - 42232 0.3 + Prom 42342 - 42401 5.2 34 20 Op 3 . + CDS 42499 - 44079 1696 ## COG0793 Periplasmic protease + Term 44157 - 44213 11.1 - Term 44342 - 44377 -0.9 35 21 Tu 1 . - CDS 44424 - 44627 75 ## - Prom 44713 - 44772 3.4 + Prom 44995 - 45054 4.5 36 22 Tu 1 . + CDS 45134 - 45382 164 ## + Prom 45465 - 45524 3.0 37 23 Tu 1 . + CDS 45545 - 46033 430 ## COG1430 Uncharacterized conserved protein - Term 46052 - 46083 0.8 38 24 Tu 1 . - CDS 46196 - 46435 97 ## gi|5453496|gb|AAD43601.1| unknown - Prom 46476 - 46535 7.6 39 25 Tu 1 . - CDS 47292 - 47585 207 ## TDE0561 hypothetical protein + Prom 47479 - 47538 4.6 40 26 Tu 1 . + CDS 47607 - 48116 109 ## + Term 48203 - 48242 2.3 41 27 Tu 1 . - CDS 48098 - 48760 573 ## COG2173 D-alanyl-D-alanine dipeptidase - Prom 48867 - 48926 5.7 42 28 Tu 1 . - CDS 49257 - 50330 925 ## BT_0058 hypothetical protein - Prom 50499 - 50558 2.5 + Prom 50273 - 50332 2.6 43 29 Tu 1 . + CDS 50359 - 50571 106 ## + Term 50724 - 50763 1.9 44 30 Op 1 7/0.000 - CDS 50582 - 52507 2526 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases - Prom 52539 - 52598 3.6 - Term 52559 - 52604 5.1 45 30 Op 2 9/0.000 - CDS 52793 - 53917 1225 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis - Prom 53940 - 53999 4.0 46 30 Op 3 3/0.000 - CDS 54108 - 54770 492 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 47 30 Op 4 12/0.000 - CDS 54774 - 55382 468 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis - Prom 55553 - 55612 2.6 48 30 Op 5 25/0.000 - CDS 55620 - 56819 832 ## COG0438 Glycosyltransferase 49 30 Op 6 26/0.000 - CDS 56916 - 57986 719 ## COG0438 Glycosyltransferase 50 30 Op 7 . 
- CDS 57999 - 58868 620 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 51 30 Op 8 . - CDS 58882 - 59205 97 ## gi|288928643|ref|ZP_06422489.1| membrane protein - Prom 59272 - 59331 7.5 52 31 Tu 1 . - CDS 59927 - 60949 314 ## gi|288928644|ref|ZP_06422490.1| putative capsular polysaccharide biosynthesis protein + Prom 60746 - 60805 3.0 53 32 Tu 1 . + CDS 60948 - 61211 96 ## 54 33 Op 1 . - CDS 61089 - 62285 597 ## gi|288928645|ref|ZP_06422491.1| hypothetical protein HMPREF0670_01385 55 33 Op 2 . - CDS 62295 - 63137 191 ## COG1216 Predicted glycosyltransferases 56 33 Op 3 . - CDS 63147 - 63683 146 ## gi|288928647|ref|ZP_06422493.1| acyltransferase - Prom 63900 - 63959 3.9 57 34 Op 1 . - CDS 64161 - 65417 133 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 58 34 Op 2 . - CDS 65414 - 66310 454 ## Slin_4260 hypothetical protein 59 34 Op 3 2/0.000 - CDS 66346 - 67494 306 ## COG0381 UDP-N-acetylglucosamine 2-epimerase 60 34 Op 4 2/0.000 - CDS 67534 - 68250 384 ## COG1083 CMP-N-acetylneuraminic acid synthetase 61 34 Op 5 . - CDS 68252 - 68962 256 ## COG2089 Sialic acid synthase Predicted protein(s) >gi|283510553|gb|ACQH01000066.1| GENE 1 245 - 643 318 132 aa, chain + ## HITS:1 COG:CC3636 KEGG:ns NR:ns ## COG: CC3636 COG0545 # Protein_GI_number: 16127866 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Caulobacter vibrioides # 4 131 39 167 177 107 42.0 4e-24 MAKREYIEANKRWLEEKAKEAGVKCLTKGVMYKVIKQGGTGGANPSPRSIVTVHYTGRTI NGRKFDTSVGGVPLACRLCDLIEGWIIALQQMHVGDKWELYLPAEVGYGKLSQAGIPGGS TLVFEVELLGVG >gi|283510553|gb|ACQH01000066.1| GENE 2 1160 - 1558 346 132 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928599|ref|ZP_06422445.1| ## NR: gi|288928599|ref|ZP_06422445.1| hypothetical protein HMPREF0670_01339 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 132 1 132 132 261 100.0 1e-68 MNNLLNNPEHTESKQRLTQARQHLIKAFAPLLSTQHNGKQRWIGTKTDLLEMVHLAYTFS YVRDDQGRPATFLWMVQRACDNFLLSMPRNPSAFVGKAMQRKNTKQAPLLERYCHLLYEQ GVNEPLQTWMAV >gi|283510553|gb|ACQH01000066.1| GENE 3 1678 - 2124 335 148 aa, chain + ## HITS:1 COG:no KEGG:BF2347 NR:ns ## KEGG: BF2347 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 10 141 7 135 140 110 45.0 1e-23 MTTQHITLNQTLTPHFTLAEFVRSGVAIRHNIANTPTPNAINNLRLLCENVLEPLRRRFG VIRITSGYRCEELNNMVGGKPNSQHLLGQAADIHLSNKEVAMKMFNFVAERLDYDQLLFE HRKSDGANWLHVSYNAAGNRKIKSELDV >gi|283510553|gb|ACQH01000066.1| GENE 4 2487 - 2957 651 156 aa, chain + ## HITS:1 COG:no KEGG:PRU_2778 NR:ns ## KEGG: PRU_2778 # Name: not_defined # Def: putative DNA-binding protein # Organism: P.ruminicola # Pathway: not_defined # 1 142 1 142 174 149 52.0 4e-35 MAVMYKLRQEKNPKSKFKGQWYARAIAIDTVDTATIANIMQQNCTLKKADILAVIAELID VMKEQLQNSKVVRLDGLGSFRIGIKTKPALTAEDFNASKNVVGTHVLFRPQTKVNKDKSR KKALLDGCRVQELPLNMIEKKKKQAGGSTPAPTPGH >gi|283510553|gb|ACQH01000066.1| GENE 5 3202 - 3312 128 36 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNTKKLSTFKLIIKVIAAIATAVLGVIAGKGLGDED >gi|283510553|gb|ACQH01000066.1| GENE 6 4239 - 4994 769 251 aa, chain - ## HITS:1 COG:Cgl0368 KEGG:ns NR:ns ## COG: Cgl0368 COG0479 # Protein_GI_number: 19551618 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit # Organism: Corynebacterium glutamicum # 5 243 1 238 249 224 45.0 9e-59 MDKNISFTLKVWRQNGPKAKGHFDTFEMKDIPGDTSFLEMLDILNEQLIEARQEPFVFDH DCREGICGMCSLYINGHPHGPATGATTCQLYMRRFKDGDVITVEPWRSAGFPVIKDCMVD RGAFDKIIQAGGYTTIRTGQPQDANAILIPKEDADEAMDCATCIGCGACVAACKNGSAML FVSSKVSQLALLPQGKPEAAKRAKAMVMTMEELGFGNCTNTRACEAECPKNESIANIARL NREFISAKLAD >gi|283510553|gb|ACQH01000066.1| GENE 7 5056 - 7038 2401 660 aa, chain - ## HITS:1 COG:Cgl0367 KEGG:ns NR:ns ## COG: Cgl0367 COG1053 # Protein_GI_number: 19551617 # Func_class: C Energy production and conversion # Function: Succinate 
dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Corynebacterium glutamicum # 5 659 28 673 673 606 49.0 1e-173 MAIKIDSRIPEGPVAEKWTNYKAHQRLVNPKNKLKLDVIVVGTGLAGASAAASLGEMGFN VLNFCIQDSPRRAHSIAAQGGINAAKNYQNDGDSVYRLFYDTVKGGDYRAREANVYRLAE VSNSIIDQCVAQGVPFAREYGGMLANRSFGGAQVSRTFYAKGQTGQQLLLGAYSALSCQV HAGKVKLYTRYEMEDVVIVDDRARGIIAKNLVTGKLERFSANAVVIATGGYGNAYFLSTN AMGCNCTAAIQCYRKGAYFANPAYVQIHPTCIPVHGDKQSKLTLMSESLRNDGRIWVPKK IEDAKALQAGQKRGSDIPEEDRDYYLERRYPAFGNLVPRDVASRAAKERCDKGFGVNNTG LAVFLDFSESIERLGIDAIRQRYGNLFDMYEEITDVNPGELANEINGVKYYNPMMIYPAI HYTMGGIWVDYELMTSIKGLFAIGECNFSDHGANRLGASALMQGLADGYFVLPYTIQNYL ADQAIWPRLSTDLPEFAEAEKGVQAEIDRLMGIQGQRSVDSIHKELGHILWEHVGMGRTA EGLREGIKKLKELRKEFDSNLFIPGSKEGLNVELDKAIHLRDFIIMGELEAYDALSRNES CGGHFREEYQTEEGEAKRDDENFFYVGCWKYQGDDTTEPELIKEPLEYEAIKVQTRNYKQ >gi|283510553|gb|ACQH01000066.1| GENE 8 7068 - 7754 714 228 aa, chain - ## HITS:1 COG:no KEGG:PRU_2433 NR:ns ## KEGG: PRU_2433 # Name: not_defined # Def: succinate dehydrogenase/fumarate reductase cytochrome b subunit # Organism: P.ruminicola # Pathway: Citrate cycle (TCA cycle) [PATH:pru00020]; Oxidative phosphorylation [PATH:pru00190]; Butanoate metabolism [PATH:pru00650]; Metabolic pathways [PATH:pru01100]; Biosynthesis of secondary metabolites [PATH:pru01110] # 1 227 1 229 229 310 75.0 3e-83 MWLINSSIGRKVVMSVTGVALVLFLTFHMSMNIVALFSGEAYNMVCEFLGANWYAVVATL GLGALTVAHIVYAFILTAQNRRARGNERYAVTSRHEKVEWASQNMLVLGIIIALGLLLHL YNFWYNMMFAELIGVEMQFHPADGFAYIKETFSNPVYVVLYIVWLVAIWFHLSHGFWSAM QSLGINGKVWFNRWKTIGLVYTTLLMLGFLIVVLAFAFGCAPSLCCCA >gi|283510553|gb|ACQH01000066.1| GENE 9 7783 - 8004 96 73 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFVLFPVIYRVCFRRAWAIPMCQRAVFLRCASWFWGLVWLVCALPLDTRNAPPPQEQRTP CTIKPPSLADAFG >gi|283510553|gb|ACQH01000066.1| GENE 10 8721 - 11954 4114 1077 aa, chain + ## HITS:1 COG:VCA0045 KEGG:ns NR:ns ## COG: VCA0045 COG0793 # Protein_GI_number: 15600816 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Vibrio cholerae # 707 
1063 22 379 394 286 41.0 2e-76 MMKTTLTIVALALATSVSAQESPLWMRYCAISPDGSAVAFTYKGDIYTVPTTGGTARQLT TNAAYDTHPVWSPDSKQLAFCSNREGSMDVYVMQREGGTPRRLTTNSNSETPIAFKDNNT VLYTSSIIPSAKSIIFANRMLPQVYEVGTNGSRPHLFSALPMMDISIRPNGDLLYHDQKG YEDPWRKHHRSPITRDIWLRRGGKYTKLTNFNGEDRTPVWAAQGDTFYYLSEEDGTFNVY VRNIDGTGKQQLTRHTTNPVRFLTAANNGTLCYGYDGEIYTLKQGEQPQKLKVNIVTDKI DTDLDRQTLFTGATEIALSPDGKELAFVLHGDVYVTSNNYRTTKQITDTPEQEREIQFAP DGRSIVYASERGGLWQIYRTKLKLDKEKSFAYATQLEEEQLVKSDRTSQQPRFSPDGKSI AFFEDRSTLRVLDIKSKAVRTVMDGKYVYSYSDGDISFAWSPDSRWLLSSYIGVGGWNSP DVALVKADGKGEIHNLTQSGYSDVGPRWVLGGKAMLFSSDRAGYRSHGSWGAERDVYIMF FDLDAYEKFRRNKEERELAKNEVQEKDENKKEKEEDKLKLGSAKKDKKDDVKALEFDLEN CRDRVIRLTDHSSHLGDYVLSNGGDTLYYQSAFESGYDLWKREISENKTALVMKNVGGGE FIPDKDFRNLYLCTYNAIKKIDLGSNTSKNIEFEARFNFKPYAERQYMFNHVWQQVKDKF YVPTLHGVDWANYRKVYEKFLPHINNNFDFSEMLSEMLGELNASHTGARFWGTAPKLQTA ELGLFFDTEYQGDGLRVQEVIKRGPFAVRNTKVTPGCIIEKIDGDSILQGKDYYHLLDGK VGKPVRISVFNPKGKKHFDVVIRPISAGEQQELLYKRWVDRNRHLVDSLSGGRLAYVHVK AMNSSSFRQVYLELLSDTNRCRDAVIVDERHNGGGWLHDDLCTLLSGKEYQQFVPNGRYV GSDPFNKWTKPSCVMVCEDDYSNGHGFPWVYKELGIGKLIGTPVAGTMTAVWWESLLDPS IVFGIPQVGCRDMRGNYGENTQLDPDIEVYNTPEDYLNGYDRQLVRAVEEMMKVVKK >gi|283510553|gb|ACQH01000066.1| GENE 11 12213 - 13115 917 300 aa, chain + ## HITS:1 COG:no KEGG:Slin_6012 NR:ns ## KEGG: Slin_6012 # Name: not_defined # Def: helix-turn-helix type 11 domain protein # Organism: S.linguale # Pathway: not_defined # 3 292 6 292 297 117 32.0 4e-25 MNNLRRELELLLLLTENRDYTVQLLADRLDITRRQLYYDLENLRACGFKLIKTNTRYRLD RHSSFFKRLHENIALTEEEAMFVYRLLDRLPHTDYLSKNIQSKLERFFNLELLTDVGQRR QTERNAAVLREAIALRRVVILKNYSSPHSRTVSDRMVEPFLFLNAEMDIRCFEINSHTNK TFKTSRAESVEMTDVEWLNEKKHKSIFTDIFMFSGEERLPVELKLGRLSHSILLEEYPAA AAYITPSDDSEHWIFSSDVASYLGIGRFVLGLYHDIEVLGSPDFADYIYNKVKAMVEKVR >gi|283510553|gb|ACQH01000066.1| GENE 12 13833 - 14441 685 202 aa, chain - ## HITS:1 COG:NMA0075 KEGG:ns NR:ns ## COG: NMA0075 COG0164 # Protein_GI_number: 15793104 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HII # Organism: Neisseria meningitidis Z2491 # 
10 198 3 191 194 185 50.0 5e-47 MLKSHYFTDLIEAGCDEAGRGCLAGSVFAAAVILPVDYENALLNDSKKLTEKQRYALRTQ IETDALAWAVGEVTADEIDKINILNASILAMHRALDGLKLRPQAVIVDGNRFKPYGTLPH ETIVKGDAKFLSIAAASVLAKTYRDDYMNSLALQHPAYQWDVNKGYPTKAHRAAIEEHGI SPFHRKTFKGVKEMLENEEQKV >gi|283510553|gb|ACQH01000066.1| GENE 13 15055 - 15255 96 66 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGNIMPFFTLAKRLFIRISLAELTDNVISFFIVLYVECYYLLIFVGAKLQQNMEVIFHKQ MNLVNN >gi|283510553|gb|ACQH01000066.1| GENE 14 15161 - 15463 325 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928611|ref|ZP_06422457.1| ## NR: gi|288928611|ref|ZP_06422457.1| hypothetical protein HMPREF0670_01351 [Prevotella sp. oral taxon 317 str. F0108] # 15 100 1 86 86 150 100.0 3e-35 MKKEMTLSVNSASEMRINNLFASVKKGIMFPIVWLSQYYSAVLNKEISVKTTLQLLNAQV AFALAVFPANAPFILRVVCTAWFATAISGCKRSLKSSKRR >gi|283510553|gb|ACQH01000066.1| GENE 15 15690 - 16904 1215 404 aa, chain - ## HITS:1 COG:mlr0021 KEGG:ns NR:ns ## COG: mlr0021 COG0520 # Protein_GI_number: 13470346 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine lyase # Organism: Mesorhizobium loti # 2 404 10 412 413 454 53.0 1e-127 MYNITSIRSDFPILSREVYGKPLVYLDNGATTQKPLCVLDAMRSEYLNANANVHRGVHYL SQQATDLHEAARTTVRKFINARSDNEIIFTRGTTEGINLIASSFCEGFMQPGDEVIITAM EHHSNIVPWQLQAARSGIAIRVIPIDDRGELILDEMDKLLTSRTRLVSVMQVSNVLGTVN PIKEIVRRAHLHGVPVLVDGAQSVPHFKVDVQDLDCDFFTFSGHKAYGPTGVGVVYGKEE WLEKLPPYQGGGEMIATVSFEKTTFAELPFKFEAGTPDYVATHGLATALDYLSALGMDNI QQHEQDLTRYATQQLETIDGMRIFGHSTDKDAVISFLVGDIHHLDLGTLLDRLGIAVRTG HHCAQPLMARLGISGTVRASFGLYNTREEVDALVAGVRRVAAMF >gi|283510553|gb|ACQH01000066.1| GENE 16 17262 - 18605 1334 447 aa, chain - ## HITS:1 COG:alr2494 KEGG:ns NR:ns ## COG: alr2494 COG0719 # Protein_GI_number: 17229986 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Nostoc sp. 
PCC 7120 # 43 429 62 444 453 162 28.0 1e-39 MNSEKQYIDLFEANRNVISAHSTAVMNAPREAAFEDFKRLGFPTRKTEEYKYTDISKLFE PDFGLNLQRLPMPINPYEVFSCDVPHLSTANHFVVNDAFLQPTAPHKQLPDGVVITSLCQ YAEKNPDFIARYYGKLAATSGDAVTALNTMLAQDGLLVYVPRGVKLGQTIQVINILHAQV DLMVNRRVLIIVEPQAEAKFLFCDHTFDDQRFLATQVIEAFVGDNARLDLYCMEETHEKN VRVSSVYIEQQSSSRVNHNVITLHNGVTRNNVSLTFRGEGAECNLSGCVIADKKQHVDNN TVIDHAVPNCQSHELYKYVLDDEAIGAFAGLVLVRQDAQHTDSEMRNQNICATKKAHMYT QPMLEIYADDVKCSHGSTVGQLNDAALFYMRQRGISEKEAKLLLEFAFINEVIDQMDLEP LRERLHHLVEKRFRGEFDRCEGCRKVQ >gi|283510553|gb|ACQH01000066.1| GENE 17 18618 - 19373 197 251 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 2 236 12 234 318 80 27 2e-14 MLEVRNLHAKIGDKEILKGINLTIKDGETHAIMGPNGSGKSTLSAVLVGNPMFEVTDGTA VFNGKDLLEMKPEDRANEGIFLSFQYPVEIPGVSMTNFMRTAINEKRKYQGLEPMSAADF IKLMKEKRQIVELDAKLSQRSVNEGFSGGEKKRNEIFQMAMLEPKLSILDETDSGLDVDA LRIVAEGVNKLKSPSTSTIVITHYDRLLDIIRPDVIHVLYKGKIVKTAGPELAKEIEARG YDWIKEEVGEK >gi|283510553|gb|ACQH01000066.1| GENE 18 19391 - 20164 518 257 aa, chain - ## HITS:1 COG:no KEGG:BF0266 NR:ns ## KEGG: BF0266 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 23 246 25 245 248 103 33.0 7e-21 MSDVKALLAIALLLVPLAVKAQDEPVRTFAIDGQIGGASSPHVKTLSKGQYLIPEKVGSG YGIISKLRLEYYLPYTPLSLKAGYEHEEWNFLDTNLGDDMEIFSLGGRCYPGRKSWFVQP YIGLDALYNFNGGSRVFSRELDSNRYGRVHYDGMAHIPRFSLAPIAGVDLYLFSCIAIQL EYSCRMGFSKASVIQAKYNHLAEPFEIRLRPIRHGFNIGLKVTFPFHFTEKDGQKVVDGL LDDLLYNILFGPNTNRK >gi|283510553|gb|ACQH01000066.1| GENE 19 20192 - 21643 1713 483 aa, chain - ## HITS:1 COG:SMc00530 KEGG:ns NR:ns ## COG: SMc00530 COG0719 # Protein_GI_number: 15965488 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Sinorhizobium meliloti # 12 483 11 489 489 692 67.0 0 MEEVKNEKNAYVRSVAEQKYEFGFTTDVETDIIEKGLNEDVVRLISSKKNEPEWLLAFRL 
KAFNHWKTLKQPNWAHVNLPEIDYQDISYYADPMAKKPANKELDPELEKTFDKLGIPLEE RLVLGGVAVDAIMDSVSVKTTFKEKLAEKGVIFCSIGEAVQEHPDLVQQYLGSVVPYSDN FFAALNSAVFSDGSFVYIPKGVRCPMELSSYFRINARNTGQFERTLIVAEDDTYVSYLEG CTAPMRDENQLHAAIVEIIVKDNAEVKYSTVQNWYPGDEHGKGGVFNLVTKRGDLRGVNS KLSWTQVETGSAITWKYPSCILRGDNSQAEFYSVAVTNNYQQADTGTKMIHMGKNTKSTI ISKGISAGRSQNSYRGLVRATANADNARNYSSCDSLLLGDKCGAHTFPYMDIHNDTAIIE HEATTSKISEAQLFYCNQRGIPTEQAVGLIVNGYAKEVLNKLPMEFAVEAQKLLSVSLEG TVG >gi|283510553|gb|ACQH01000066.1| GENE 20 21913 - 24894 3708 993 aa, chain - ## HITS:1 COG:BMEI1965 KEGG:ns NR:ns ## COG: BMEI1965 COG0532 # Protein_GI_number: 17988248 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 2 (IF-2; GTPase) # Organism: Brucella melitensis # 355 993 95 733 733 543 47.0 1e-154 MSIRLGKAISELNIGLQTAVEFLEKKSELGEVKAEPTFKLNDAQYQALVKAFHKDKEVRN QADKLFPKKTKEKKRAAEDKDHRAETVLNAGKQQYKPLGKIDLSDLNGKGSAKKSDVKAV QDDKEETKPAAVGEVSNKAESVSAEKQEKVGKVETPKQKVETVEQKPEMKAETMQTTRVE KAETKTEQKEQPAKDAQANVAQPAASAKETAERKQEHADKAPKAENEKVETPAEKADTAE TESGKPAIFTLKSEQNNAPGIKVMGKIDLSTINQNTRPKKKSKEEKRKERENKGQQGGER RKRNRINTERVDINAAANQPNANKNKGGNNGGGNNNNAAGGRNANKKNKKPNRNQKPLEV NEEDVARQVKETLARLTSRGQNKKGAKYRKEKREAAQERLNEERAEERKESKTLKLTEFV TVSELASMMNVSVNQVIGTLMSVGIMVSINQRLDAETINLVAEEFGFKTEYVSAEVSEAV TEEADDEADLVSRSPIVTVMGHVDHGKTSLLDYIRNTNVIAGEAGGITQHIGAYNVKLDD GRNITFLDTPGHEAFTAMRARGAQVTDIAIIIIAADDSVMPTTKEAIAHAQAANVPMVFA INKIDKPGANPDKIREDLSQMNLLVEEWGGKYQCQEISAKKGIGVKELLEKVLLEAEMLD LKANPNRKGTGSIIESSLDKGRGYVSTLLVSNGTLHVGDNVIAGTSWGRIKAMFNVRNQR IESAGPAEPAIILGLNGAPTAGDAFHVLETEQEARDIANKRLQLQREQGLRTQKRLTLSD ISHRIALGSFKELNIIVKGDTDGSIEALSDSFIKLSTEKIKVNVIHKAVGQISENDVTLA DASDALIVGFQVRPSSAARKLADQEGVEINTYSVIYDAIDDVHSAMEGMLDKVKKEVVTG QVEVREVYKISKVGTVAGAIVTEGKVHRVDKARLVRDGIVVHTGDINALKRYKDDVKEVP TGFECGISIVNFNDIQVGDIIETFTVIEVEQKL >gi|283510553|gb|ACQH01000066.1| GENE 21 24956 - 26224 633 422 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|17988250|ref|NP_540884.1| transcription elongation factor NusA [Brucella melitensis 16M] # 
24 356 21 350 537 248 41 6e-65 MVARKKDENILSMVDTFKEFKDTKNIDRMTLVSVLEESFRNVLARIFGSDENFDVIVNPD KGDFEIYRNRMVVADGEVEDENKQIALAEAQKIEPDYEVGEEVTERVDFAKFGRRAILTL RQTLASKILELEHDSLYNKYKDRVGQVIAGEVYQVWKREVLIVDDENNELILPKSEQIPA DQYRKGETIRAVILRVDNENNNPKIILSRTSPMFLERLLEAEVPEINDGLISIKRIARMP GERAKIAVESYDERIDPVGACVGVRGSRVHGIVRELCNENIDVINYTANTKLFIQRALSP AQVSSINIDEETRKAEVFLRPEEVSLAIGRGGLNIKLASMLTEYTIDVFREVAEGEADED IYLDEFSDEVDQWVIDAIKSIGLDTAKAVLNAPREMLVEKADLEEETVDNLLRVLRAEFE QG >gi|283510553|gb|ACQH01000066.1| GENE 22 26620 - 27084 598 154 aa, chain - ## HITS:1 COG:VC0641 KEGG:ns NR:ns ## COG: VC0641 COG0779 # Protein_GI_number: 15640661 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Vibrio cholerae # 7 120 8 131 151 62 34.0 4e-10 MIDKNVVKELVEQWLADKEYFLVDVETSPDSRIVVEIDHADGVWIEDCVELSRFIEEHLN RDEEDYELEVGSAGLGQPFKVPQQYVNFIGKEVEVLDKDGKKYKGILKSVDGNDFVVAVN EKVKVEGKKRPVLQDVDHNFKMDGVKYTKYLIQF >gi|283510553|gb|ACQH01000066.1| GENE 23 28520 - 29008 435 162 aa, chain - ## HITS:1 COG:TP0087 KEGG:ns NR:ns ## COG: TP0087 COG1430 # Protein_GI_number: 15639081 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Treponema pallidum # 33 153 32 149 179 137 50.0 7e-33 MTYKGLSMKKFLQLHAKLVLKGVVVLCCAIFSACSTSVIEFEKKDLAITNLDGKTIPITV ELARSRREMTKGFMGRTNIPDGTGMLFVFKKDQKLAFWMKDTPTPLSIAFISASGEICET RDMTPFSKASVESSQSVRYALEVPQGWFKKNNIGPGCSIKLP >gi|283510553|gb|ACQH01000066.1| GENE 24 29086 - 29289 200 67 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFLRVFLDMPNRRTGQSIERDGKVASYILSYTKLAFASICNINKTFAFSVAKATKKNNKN FFVFCKK >gi|283510553|gb|ACQH01000066.1| GENE 25 29893 - 30537 369 214 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928623|ref|ZP_06422469.1| ## NR: gi|288928623|ref|ZP_06422469.1| hypothetical protein HMPREF0670_01363 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 214 1 214 214 404 100.0 1e-111 MKHLRIRLLAALAALFVVACGSNNDADDSAFVVENKVYVQVQDENGQPIDPDELLKDNKF SVVGQNGGEALDVRTEVKEGDKLLVFSPEVPLKEIPPSYAAKGATRSDGRRNGITDEYIA KMEKRFTTGIGYSRVEMSIDGVQIPLIFYYSFYIDSSKFLSSGSLSDTGVELLSVYSGNN GTIPHMSFSYFRLVLRKKGQGYTFVDPYNTSYDF >gi|283510553|gb|ACQH01000066.1| GENE 26 30689 - 31240 464 183 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928624|ref|ZP_06422470.1| ## NR: gi|288928624|ref|ZP_06422470.1| hypothetical protein HMPREF0670_01364 [Prevotella sp. oral taxon 317 str. F0108] # 1 183 20 202 202 363 100.0 2e-99 MKQLKSSLWLMAVLVLLAACSADEEKEMYRPTTASSLNSTVYVEVQDGEKTVSPSAMLYE CKFSVFGKASKRTLNVDAYKLDDVDVFRFFADLPDLSVMKFWDTAEGKRGEGYADVVITI DRVKLPMRFHYAYAAVPNAEKMYGGTSIVIKSVEYQGKTIDPYLHNKHFKIVLRKQEQGY SVE >gi|283510553|gb|ACQH01000066.1| GENE 27 31615 - 32835 1368 406 aa, chain - ## HITS:1 COG:PA4919 KEGG:ns NR:ns ## COG: PA4919 COG1488 # Protein_GI_number: 15600112 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid phosphoribosyltransferase # Organism: Pseudomonas aeruginosa # 4 405 10 394 399 206 32.0 6e-53 MQPIINHFTDNDAYTFSCQYYILQTYPRAEVEYTFFDRNHTVYPEGFAEKVKEQISLMQN VRITEEEIAFMQHRMYYLPQWYFTFLRGYKFNPAEVHIEQAPDGQLAISIRGKWYSTIMW EMPVLSIVSELMHQHRGDLERYDAAVEYERAVDKARQILRGGLILGDMGTRRRLSFVHHD NVIRAMKATADAGGEQVDGVFVPWKGRIVGTSNVYLAMKHGLVPMGTMSHQIIEFEENVS GIFECNFNVMRKFSDVYDGDNGIYLYDCFGDKVFFNNLSKRMALMFVGLRVDSGVEEEQT EKIIEKYKQLGINPATKQVIYSNGLNIERALEIHRFVDGRVQDSYGMGTFLTCDVLGCKP MNIVVKLTKCRITEKREWHDCVKLSCDKGKTLGNPEKCAYLLKQIG >gi|283510553|gb|ACQH01000066.1| GENE 28 32903 - 33556 801 217 aa, chain - ## HITS:1 COG:BS_yvoE KEGG:ns NR:ns ## COG: BS_yvoE COG0546 # Protein_GI_number: 16080550 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Bacillus subtilis # 1 217 1 213 216 112 30.0 4e-25 MKKMRITTVILDFDGTLGDSRRIIVDTLRETLRLHGLPQNSEDECAATIGLPLKEAFVRL ANVDDATATACVTTYRSIFDINNVTGAVKPFANVVQTIKRMHSAGLTLTIASSRGRDSLP NLVRSIGIGPYISYIVSADDVTNAKPHPEPVLLTLEHFGVKPEETLVVGDMAFDIEMGRR 
ANTLTCGVTYGNGTPAELEEAGANWVIDDFAELLNII >gi|283510553|gb|ACQH01000066.1| GENE 29 33832 - 36030 1159 732 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 [Clostridium acetobutylicum ATCC 824] # 16 731 3 703 730 451 35 1e-126 MGKGKKGGKRLTKKQVAERLQQFFEQHPGKTFSFKEIFKNLKLDTHPLKMLAIDVMEEMA WDDFLTKVTDSSYRLNQAGQVLEGRFVRKTNGKNSFVPDDGSLPIFVSERNSKSALNGDR VRVQLMARRKRHIKEALVIEILKRAKDQVVGKLRVERDMAFLVTPENLYVHDIFVPKRKL KGGKTGDKAVVKVTQWPDAEHKNIMGEVIDVLGPTGENDVEMHAILAQYGLPYTYPKAAE EAAERISAEITQKDYDEREDFRNVFTCTIDPRDAKDFDDALSIRMLDKGLWQVGVHIADV SHYVTEGSIIDKEAMKRATSVYLVDRTIPMLPERLCNFICSLRPDEEKLAYSVIFNMDEN ADIKAYRIVHTVIKSDRRYAYEEVQQLLEDNGVVDGTGEPAPPAPKGGYKGENADSLIML DKLAKKLREKRFKNGAVKFDREELHFDVDEQGKPIRAYFKRSKDANKLIEEFMLLANKTV AESVGKPKGGRTPKTLPYRIHDNPDPQKLENLREFIMKFGYKLKTTSTKGETSRSLNKLM DEVAGKREQNLIETIALRAMMKAKYSIHNIGHFGLAFDYYTHFTSPIRRYPDTMVHRLLT RYAAGGRSANKEHYEELCEHSSEMEQVAQNAERDSIKYKMVEFMADKLGEEFDAHISGIA SYGIYCEIDENHCEGMVPMRDLDDDYYDFDERNYCLVGRRRHRKYQLGDPIRIQVARVDQ EKRQLDFTVVDK >gi|283510553|gb|ACQH01000066.1| GENE 30 36079 - 38379 1493 766 aa, chain - ## HITS:1 COG:no KEGG:BVU_0314 NR:ns ## KEGG: BVU_0314 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 55 763 56 770 774 383 33.0 1e-104 MQKTLLLLFFLVLWTSLNAQINITGLVQDEAGHPLEGAVVSLVSDGTKEKQNAISDANGR FGLAVNKDKHFDNLLVTYLGYADYHVRLENVVHDLYLGEITMLPVERTMQEVEVVGQREI QKIDRKIIIPSKFQLKAATSGFGLLRNMQLSGIRLNTIDNTIRTSTGEPVQLRINGVKTT VAEIKSIRPHDVLRVEYYDMPGARYAGAAAVIDIIVRYKERGGNVSGDLTNGISMLGFGE YQLSANYHEGKSELKAVAWWNRRDLKWIRENTETYHYNDRTLTNREIGDPTKAKFNNFDF SLWYSYAVNKSVFSVIFRDRYNHVPNSTTDRISTLYTENKTYRINDFMRTHDNSPSLDLY FHTMLPHQQTIYFDLVGTYIGSGSNRKFTLFETGHNDQDFVSETTGRKYSLIAEGIYEKM FNRSKLSFGVKHTQMYTKNSYSGTIENLVKLNTSETYLYTEWLSKLGKLTYTLGLGVMRI YNSQANNSLSSLIFRPRLSLAYNVTNSLVIKYTGYVSGYAPSLSNMSDVSQDIDIYQLRR GNPNLKAVTFVSNSLSVDWATKWLNVGLYGRYSYDHKPIMEETLLENQRFVRTMNNQRGF HRINTRIDFKLHPFGDWLSLQVTPFFNRFISNGNHYTHTHANWGMRGQLLAMYKQWLLAA 
EMETSQHSLWGETLDKEEASHNIALGYNKDKWAVQLVVANPFTTLYTTEVENLSRIAPAN KLAYSRNLRRILMINVSFNLDFGKAKTAGNKRITNTDENTGVLHGR >gi|283510553|gb|ACQH01000066.1| GENE 31 38667 - 38858 74 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNAFKSLLVQTFFIITADMCYVYYLLFLFFLPFLKSLSGMAWGLTCAWARLEKGWFKPNV SVL >gi|283510553|gb|ACQH01000066.1| GENE 32 38803 - 39948 1321 381 aa, chain + ## HITS:1 COG:no KEGG:PRU_2474 NR:ns ## KEGG: PRU_2474 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 381 1 384 384 545 65.0 1e-153 MSAVIIKKVCTKSDLKAFIDLHYDLYEGNEYDAPNLFSDEMNTLSKDKNAAFDFCEAEYF LAYKEGKLVGRVAAIINHSANKKWDRKSVRFGWIDFINDREVLEALLHAVEEYGRSKGMR SIVGPLGFTDMDPEGMLTTGFDQLGTMPTIYNYEYYPQLMESIEGFEVDNHYVEYKLMVP DEVPEKFAKIADMIQKRYSLKIKKLTRDDVFKGGYGKRIFDLVNKTYGHLYGYSELTERQ MDQYVKLYFPMADLDLITLVEDEAADNKLVGLGISIPSLTRALQKCRRGRLTPFGWWHVL RAIKYHKTKVVDLLLIGVLPEYRMKGVNSLLFADLIPRYQKYGFEWGETQVEMETNSKVQ SQWEWLDASIHKRRKCYIKTF >gi|283510553|gb|ACQH01000066.1| GENE 33 39998 - 42040 2400 680 aa, chain + ## HITS:1 COG:CT661 KEGG:ns NR:ns ## COG: CT661 COG0187 # Protein_GI_number: 15605394 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Chlamydia trachomatis # 38 666 4 602 605 530 47.0 1e-150 MNKKKTPNNQQEMTFEPANAAGSNANNTPTSTQQGNNQTAYTDDNIRHLSDMEHVRTRPG MYIGRLGDGTQAEDGIYVLLKEVIDNSIDEFKMQAGKRIEVDLEDDLRVSVRDYGRGIPQ GKLVEAVSMLNTGGKYDSKAFQKSVGLNGVGLKAVNALSARFEARSYREGKVRTVVFERG ELISDNTEDTNEENGTFIFFEPDNTLFLNYRFKGEFVETMLRNYTYLNAGLTIMYNGRRI ISRNGLADLLNDRMTNDGLYPIIHIQGEDIEIAFTHANQYGEEYYSFVNGQHTTQGGTHQ SAFKEHIAKTIKDFFSKNYEYVDIRNGLVAAIAINVQEPMFESQTKIKLGSLTMSPGGES VNKYVGDFIKREVDNYLHIHADVAEEIKRKIDESERERKAMAGVTKIARERAKKANLHNR KLRDCRIHFSDAKNDRKEESSIFITEGDSASGSITKSRDVNTQAVFSLRGKPLNSYGLTK KVVYENEEFNLLQAALDIEEGLDTLRYNKVIVATDADVDGMHIRLLIITFFLQFFPELVK KGHVYVLQTPLFRVRNRRTKIKSKKVIAAADARLMKGERKNDFITRYCYNEDERQKAIAE LGPDPEITRFKGLGEISPEEFTHFIGPEMRLEQVTLHKNDQVHKLLEYYMGKNTMERQNF IIDNLVVETDGPTEEETTEV 
>gi|283510553|gb|ACQH01000066.1| GENE 34 42499 - 44079 1696 526 aa, chain + ## HITS:1 COG:AGc5034 KEGG:ns NR:ns ## COG: AGc5034 COG0793 # Protein_GI_number: 15890023 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 40 343 70 374 462 206 39.0 9e-53 MKKILILAACFATLAAQAQLRTTIGKDNPQQKMQIAEMAIKNLYVDTVNENKLVEDAIRG MLEKLDPHSSYSTAKETKAMNESLQGSFEGIGVQFNVAKDTLLVIQPVAKGPSEKVGVMA GDRIVSVNDTAIAGVKMSQEDIVRRLRGPKGSIVKLGIVRPGIKGLLDFHVKRDKIPIET LDAYYLIRPGVGYIRIGSFGATTYEEFMRALMKLQLQGATDLILDLQDNGGGYLQAAVRV ANEFLRRNDLIVYTQGRNADRQEFRAQGDGGMLTGRVFVLVNEFSASAAEIVTGALQDQD RGIVVGRRSFGKGLVQRPIEFADGSMMRLTVAHYYTPSGRCIQKPYTKGDNANYEKDLDN RFKNGELYNADSIHLNDSLKYLTLRRKRAVYGGGGIMPDYFVPLDTTQFSRYNRELAAKR IIVNANIQYMDKHRKELKKKYPSFAEFNKKYQVPQELIDEIVAEGEKQKVKAKDQKELKT SLASIKIQLKALVARDLWDMSEYFHIMNQNNHVVQKALVLLGKGKM >gi|283510553|gb|ACQH01000066.1| GENE 35 44424 - 44627 75 67 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAPNPAQPTAQPLCGCWVIESYRQATPFEVVGASQTTDEPPLLGLLGHCKLHISPILACY MHDIASY >gi|283510553|gb|ACQH01000066.1| GENE 36 45134 - 45382 164 82 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLVTRLQAKRNPVTSTLQQGYEQSVTILQACCNTTPCAKAGWKKPFRVFLDVPNGRTGQS KERGGKVASYVLCYTKLAFASI >gi|283510553|gb|ACQH01000066.1| GENE 37 45545 - 46033 430 162 aa, chain + ## HITS:1 COG:TP0087 KEGG:ns NR:ns ## COG: TP0087 COG1430 # Protein_GI_number: 15639081 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Treponema pallidum # 33 153 32 149 179 136 50.0 2e-32 MTYKELSMKKFLQLHAKLALKGAVVLCCAIFSACSTSVIEFEKKDLAITNLDGKTIPITV ELARSRREMTKGFMGRTNIPDGTGMLFVFKKDQKLAFWMKDTPTPLSIAFISASGEICET RDMVPFSEASVESSQPVRYALEVPQGWFKKNNIGPGCSIKLP >gi|283510553|gb|ACQH01000066.1| GENE 38 46196 - 46435 97 79 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|5453496|gb|AAD43601.1| ## NR: gi|5453496|gb|AAD43601.1| unknown [Bacteroides fragilis] # 1 71 1 71 481 125 87.0 7e-28 MQTEIEKISELSKLLSVKSRLSDDLFHLFGKFGIGHLLSRLSLEKRDGVSASELILSLCL 
FRIVGESIHRVTCEVCDYR >gi|283510553|gb|ACQH01000066.1| GENE 39 47292 - 47585 207 97 aa, chain - ## HITS:1 COG:no KEGG:TDE0561 NR:ns ## KEGG: TDE0561 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 1 93 30 122 132 132 83.0 3e-30 MLDNPGWILSADISNYGDILKETKPLGRDNDVDWIDFEIRVIAKTYVYIEIFGDISKLNK ILYSFKAIIGELEEIERYGKGILSAQRIKEIIDDAPS >gi|283510553|gb|ACQH01000066.1| GENE 40 47607 - 48116 109 169 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFPIAIYLTLCPILQPLYYATHTLSLTTIYWQTELNLENLVELLKKTQDLNHNERLFVFR RLQHKDCYMKASIANLTYLYPDNAFKIQCCSTIIGIRKKGERQARLQNLFQNEKTSKVAN PSSTCTETAGHITPLSPWRGAGGEAFLGEGLGAGGGEAGWGQAFYFITV >gi|283510553|gb|ACQH01000066.1| GENE 41 48098 - 48760 573 220 aa, chain - ## HITS:1 COG:ddpX KEGG:ns NR:ns ## COG: ddpX COG2173 # Protein_GI_number: 16129447 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine dipeptidase # Organism: Escherichia coli K12 # 32 210 8 172 193 92 35.0 7e-19 MFKFLLLFSLFVLPAKSQTSKTEEALSRAGYVNVRELDPTLQVSLMYARADNFVGVVLYD DLRQAYLHPDAAAALVKAQKRLKQLRPELSLKIYDAARPMHIQQKMWNKVKNTPMKPYVS NPANGGGLHNYGLAVDITLCDAKGDSITMGTKIDHLGPEAHIDTESQLVAKGIISKRAQQ NRQLLRQVMRYAGFKPLRTEWWHFNFKSRAEAKRHYTVIK >gi|283510553|gb|ACQH01000066.1| GENE 42 49257 - 50330 925 357 aa, chain - ## HITS:1 COG:no KEGG:BT_0058 NR:ns ## KEGG: BT_0058 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 331 10 350 376 98 28.0 3e-19 MLFSLLHNEFWQTPLPFTSLNHQQFVVLQQAAAKQTVEGLVMGALIRNNVKLERADALTA YSRTMMIEQLNNKVSGVAKDLAGILGGNGIDYRIFKGQTLAQLYPYPLARTPGDIDFYCV PADFTRAEHLLQQAWGIALKGEESEQHREFNYREVPLEMHFRMIKFNSKGIQAYWERLLA DAPIEQTAIQGVPVTTLPPTLNVLYTFLHLYHHLVELGVGLRQFCDLMMLLHRHREQIDR HALAQHLRTMGFYRAFCAIGSVLIDKLGLPAHEFPFAISPGDARYQPAILDIVFTGGNFG KHMSTTAVRSGMRYNVEATIRKLRHYRLFWRLSPREIRATILKEMPRKMLRLGFGKR >gi|283510553|gb|ACQH01000066.1| GENE 43 50359 - 50571 106 70 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVMVEIEMGQAGRIGRIRRMGVMRRWVQLRGVRAKRALAFVLHLSSVPFILSLPSAYLMA 
SSKKLPPVAF >gi|283510553|gb|ACQH01000066.1| GENE 44 50582 - 52507 2526 641 aa, chain - ## HITS:1 COG:BH3718 KEGG:ns NR:ns ## COG: BH3718 COG1086 # Protein_GI_number: 15616280 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Bacillus halodurans # 29 586 16 555 608 337 36.0 4e-92 MDPFSKFINWYFRKDSLPYWCVFLIDCLICLLGGIFVCALFFRISRTATNIVPIVNTLLF YQLFNIIGFRVFHTYAGVIRYSSFVDLKRVAYAMGLGFIIVVVLHYPIIFLPAVHQHIVP LHLRHIVSIYLVTTILLWTFRIMIKGIYDTVFDTGQAKRALIYGVREGGVGLAKNITAQK PQQFRLFGFISHEEGFANHYLMGKKVYEVNDELLRVIKENRINAVLVSPLRNEVFRDNTR LQDMLIEAGVNIYMTEGVQEWDADASGKVVNTLKEISIEDLLPRDQIQIDMQSVGNLLKG KKILITGSAGSIGSEMVRQVCLFHPAELMLVDQAETPQHDIRLMMANDYPAIKSYTVIAS IANRDRMEGIFSTFRPDYVFHAAAYKHVPMMENNPSESIQNNVWGTKVVADLSLKYGVKK FVMVSTDKAVNPTNVMGCSKRICEIYVQALDKAEKEGKMHGNTQFVTTRFGNVLGSNGSV IPLFERQIKAGGPVTVTDPNIIRFFMLIPEACKLVLEAGTKGNGGEIFVFDMGKPVRIAD LAKRMIRLSGAKNIEIQYTGLRAGEKLYEEVLNDEEGTLPSFNPKIRIAKVREYDYEQVN TDVMELVSISHTYDEMNIVRKMKQIVPEFKSKNSVYAELDK >gi|283510553|gb|ACQH01000066.1| GENE 45 52793 - 53917 1225 374 aa, chain - ## HITS:1 COG:Cj1121c KEGG:ns NR:ns ## COG: Cj1121c COG0399 # Protein_GI_number: 15792446 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Campylobacter jejuni # 4 371 2 378 386 300 42.0 4e-81 MDNQIYLCLAHMSGNEMQYIQEAFDTNWVVPLGPNVNAFEKDLEAFVGQGKHVVALSAGT AAIHLALVALGVKAGDEVICQSFTFSASANPITYQGATPVFVDSEDRTWNMDPNLLEEAI KDRIAKTGKKPKAIMLVYLYGMPAYIDEIMEIANRYGIPVVEDSAEAFGSEYKGCMTGTF GQFGIMSFNGNKMITTSGGGALICPDEETKKRIMFFATQAREPKPYYLHKEIGYNYRMSN ICAGIGRGQMTIAREHIAHHRHTTELYAQLFADVEGIDLHVAPEDYMDPNYWLNTVLIDP KKTGRDYEQMRQHLESVGIESRPLWKPMHTQPVFKDAPAYVNGVSERLFAQGLCLPSGPY VSDQDIEKIVREMK >gi|283510553|gb|ACQH01000066.1| GENE 46 54108 - 54770 492 220 aa, chain - ## HITS:1 COG:CC1011 KEGG:ns NR:ns ## COG: CC1011 COG0110 # Protein_GI_number: 16125263 
# Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Caulobacter vibrioides # 71 215 71 213 215 76 31.0 4e-14 MKRIAIFGSGGMGREIACTLQRINHAANKEVWQLVGFYDDGLPQGSSNEMGKVLGGAAQL NAVTEPLAVVLAFGSPQVMQRVRQAITNPNVYFPNIIDPTVELWHAPSFTMGEGNLFMPH CVVSCNVRIGNFNLFNGDVSIRHDVQIGSFNAFMPGVRLSGGVKVGDGNFFGLNSAVVQY KTIGNHTQIGAGAVVMDDTADHSLYVGVPARVKRDLKVKE >gi|283510553|gb|ACQH01000066.1| GENE 47 54774 - 55382 468 202 aa, chain - ## HITS:1 COG:Cj1124c KEGG:ns NR:ns ## COG: Cj1124c COG2148 # Protein_GI_number: 15792449 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Campylobacter jejuni # 1 202 1 200 200 243 56.0 2e-64 MYKHFFKRLIDFSVALIVLICISPLLLVVTLWLHFANKGGGAFFLQERPGKNGKIFKVIK YKTMTDERDEQGQLLPDDKRLTPVGSFVRSTSIDELPQLFNVLKGDMSLIGPRPLLPQYL ALYSPSQARRHEVRPGISGWAQCHGRNAISWTQKFEYDVWYVDHVSFLTDMKVIYYTIKA VLKRDGISAEGQATIEPFNGSN >gi|283510553|gb|ACQH01000066.1| GENE 48 55620 - 56819 832 399 aa, chain - ## HITS:1 COG:SMb21231 KEGG:ns NR:ns ## COG: SMb21231 COG0438 # Protein_GI_number: 16264483 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Sinorhizobium meliloti # 41 343 49 334 384 108 27.0 2e-23 MKKKLIRVTTSDISLDTLIKGQLHFMNQHYEVVGLSNNTGRLKEVGKREGVRVVEVPMHR EISLWADLKSLWDLCKIFKRERPFIVHANTPKGSLLAMMAGKLTRVPHRIYTVTGLRYQG AQGMLRIILMTMERITCFCATKVIPEGNGVKMALEHDHITKKPLKVVLNGNINGIDTEYF SAESVARVDKEQPTQAEVANKRAAIRAPLGLSADDFAFVFVGRIVGDKGMNELAACMRKL QTDQPQCKLILVGTFETELDPLNDGNEEFFKTAPSVKYVGYQTDVRPYLLAADTLVFPSY REGFPNVVMQAGAMGLPSIVTDINGCNEIVVDGTNGKIIPPKDETALLAAIEVFLKERGA VAKMAVMARPMIQQRYEQEKVWQALLAEYQALEQTKKNR >gi|283510553|gb|ACQH01000066.1| GENE 49 56916 - 57986 719 356 aa, chain - ## HITS:1 COG:TM0622 KEGG:ns NR:ns ## COG: TM0622 COG0438 # Protein_GI_number: 15643387 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Thermotoga maritima # 1 355 2 372 388 
159 30.0 9e-39 MRILQVITSLHTGGAEKLVVDITRILRTRGHIVDVVVFDGTDTPFMQRLRETGCKVYCLS NGGSVYSLSYVFKLRKIIKNYDVVHSHNTSPQFFVALANSCCKRIVVTTEHNTTNRRRNN PLWKPVDKWMYNRYRKIICISDQAEENLKKYLGKDKASITTIYNGVDVAAFYNAKPLDDF KKEKFVVVMVAAFRKQKDQKTLIDAMNLLPGDEYELWLVGQGECLDDVRNYVETKGLQQR VKFWGNRTDVANVLHTADMVCMSSHYEGLSLSNIEGMSSGRPFVASDVDGLREVTKGYGV LFPHGDAEALAAIIRQLHDDKAYYDQIADKCYQRAMMFDINKMVEGYESVYKDLVP >gi|283510553|gb|ACQH01000066.1| GENE 50 57999 - 58868 620 289 aa, chain - ## HITS:1 COG:L17695 KEGG:ns NR:ns ## COG: L17695 COG0463 # Protein_GI_number: 15672198 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Lactococcus lactis # 10 260 12 261 285 166 34.0 5e-41 MLITVFTPTYNREQLLSRLFTSLQEQSFKDFEWLIVDDGSTDSTHDVVVKFVEEGVVPIK YVFKRNGGKHRAINEGVKHAKGELFFIVDSDDILPPDALKRVAEVYEQIKDDRDFGGVAG VDAYPDGRIVGSGLPAPVIDCNSIDIRSKYHVAGDLSEVFRTDVMREFPFPEIEGEKFCP EVLVWNRIARKYKLRYFNEAIYVAEYQPEGLTARIVEIRMKSPIASTTCYAEMVSLKLPF KEKVKAATNYWRFRFCCKGKKNVAKIGWIWKSVCPLGWLMHMKDIRKNK >gi|283510553|gb|ACQH01000066.1| GENE 51 58882 - 59205 97 107 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928643|ref|ZP_06422489.1| ## NR: gi|288928643|ref|ZP_06422489.1| membrane protein [Prevotella sp. oral taxon 317 str. F0108] # 1 107 234 340 340 187 100.0 2e-46 MVFGRIFTYTNPLVICQSLLLVFAFSQFKLKSKFINWLAISCFAVYLTHANEFILRSYYG RWIKFLFDEQNSMVFIFAVACTILAIFFLSVLIDKVRIYAWTKIAGK >gi|283510553|gb|ACQH01000066.1| GENE 52 59927 - 60949 314 340 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928644|ref|ZP_06422490.1| ## NR: gi|288928644|ref|ZP_06422490.1| putative capsular polysaccharide biosynthesis protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 340 45 384 384 588 100.0 1e-166 MLGGYDRYIYAELFDEVADVQRTGGDVKTAYIFELYPSEFGFSWLNVAISYITANRYIFI LLLTIVIYALLFISFKRYVDNYPFALVLFMGLIFFFTFTYLRQLIGVGVGWLSIEYVYKK KFWKFLVVVLLATLIHNSAIILFPVYFFPVKQYPKRTVILLMVFCFIIGITGIPSSIFEI YGSVSEMEGRGQSYAENEVGFKVEYILEAFFFLYFIFRNYEKIPKTPARIVLLNMALSFC AVLLIFSRSLNGGRLGWYYLIGLISTLSFLVPNVKRITGQYLVLAVFSCFLYLRILLFAW GPLGTLYPYKSFFTNGVREGDWVHEKWEYDNGYDTNKFYR >gi|283510553|gb|ACQH01000066.1| GENE 53 60948 - 61211 96 87 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSEKPTNKASAANQQARYAVDFVLPCFSKKAQAKAKTKIKIYTYTLLLFKQIHQLRRYAL SAMLNKNCLTSFFCHHFILLNMMGNIR >gi|283510553|gb|ACQH01000066.1| GENE 54 61089 - 62285 597 398 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928645|ref|ZP_06422491.1| ## NR: gi|288928645|ref|ZP_06422491.1| hypothetical protein HMPREF0670_01385 [Prevotella sp. oral taxon 317 str. F0108] # 1 398 1 398 398 812 100.0 0 MLIVLITQVTPASENVRGTSALPYHLMVHRPAGVNVIIYSFNGNQLSDTKIKEVEEELNV TIKQLPRPTWQSWVLRLHLLFIRVLLKYPLFNYMRLSSKVVEEIKGLQPDGIWAYGQDIS RVTKQFAGFKRVHTLPDCESLYYYRMLGRRFVFKNRMMFMRNMVMYPKYLRMEGAYENSA MVKYHLVGEADAESLRNVNPGIQAHFLRHPHYNVAEPQKQIAFSKPKIRLLVAGQNNLYM QEAAEEAFDMMFHKADKLKNHYEITFLGRGWESTVVRLKDAGYDVAQITFAPDYIEEVRK HDVQFTPIAIGTGTKGKVLDALSNGLLVVGTSFALENIAVENGVSCVEYHSNEELLAALT DIPHHVEKYEVMAEKGREAIFVEHGRKRISAQLMDLFK >gi|283510553|gb|ACQH01000066.1| GENE 55 62295 - 63137 191 280 aa, chain - ## HITS:1 COG:alr4493 KEGG:ns NR:ns ## COG: alr4493 COG1216 # Protein_GI_number: 17231985 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Nostoc sp. 
PCC 7120 # 3 277 9 292 295 135 32.0 1e-31 MDVSVIIVNYNTKKLTADCIQSIIDKTSGVDYEIILVDNASSDGSKEEFQDDRRIRYVYS DKNGGFGYGNNRGMEIARGKYLFLLNSDTILLNNAIKEFFEYSETHNQRCIYGCYLQGDD GSYRGSFHFFPKLTMCGLINGLFCKPSYEPDFTDTEVECVCGADMFFPREAINEAGSFDE NIFLYGEESELQLRMRSFGYKRMLINSPQIIHLEGQSMNESFSKLAIKMKSHFYVLHKHC TWYNYLFARAFFAALYTIHYLPSLRNKEARDFVKLLYTFK >gi|283510553|gb|ACQH01000066.1| GENE 56 63147 - 63683 146 178 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928647|ref|ZP_06422493.1| ## NR: gi|288928647|ref|ZP_06422493.1| acyltransferase [Prevotella sp. oral taxon 317 str. F0108] # 1 178 155 332 332 285 100.0 8e-76 MIVGVMLHSMGVPNVWYIENTLSLVLFFWVGQMICKHQGLLLRVRSLVLIGAVYPVIILL YYIQGKEEPFITLHVSVEVLEIPVILTLAISGTLLCFLVSQWISKNRVLEYIGRETLTIY LFHMYFLLKLLPHLSGVIHKGTMGYATGVALVISTLVFCTIVNVVLNTKYFKWVLGKF >gi|283510553|gb|ACQH01000066.1| GENE 57 64161 - 65417 133 418 aa, chain - ## HITS:1 COG:PM0507 KEGG:ns NR:ns ## COG: PM0507 COG2244 # Protein_GI_number: 15602372 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Pasteurella multocida # 12 416 2 399 399 72 20.0 1e-12 MIKLLDKLLHNKTVRDGGIFAFFSFLNQGLNFFLLIVLSWYILPASYGDLNLFYTSVSVV SFVICLCTSGIVSIKYFKVTREVLSKYINFVLITTLVVSAVLLAAVSFFHPFVRSVAGIT LPMQMWCIYLCSTAVIYNLLLDIYRLEEKPIKYGAFTVVSTLLNICASLFFVIVLRQDWY GRVEANIIASTSFLLIGVYLLIRKKYLTTAKPTKEICKETLSYGVPLIPHSCNGFLRQGM DRYIINSHFTSSSVGLFSFAINFAFIIYSVGSAFNKSNSVYIFKLLSLGQVDVKAKLRKQ TVYMLGFYLGFTLLLYALCLFLIPFVFPKYQGSTIYLFPLCMSCFFQCVYLQFCNFLFYF KKTKELMYMTVGVSLLHLTLSLFLTRYSVLYTAYISMLSSFVEALLVYAYSRKLYKII >gi|283510553|gb|ACQH01000066.1| GENE 58 65414 - 66310 454 298 aa, chain - ## HITS:1 COG:no KEGG:Slin_4260 NR:ns ## KEGG: Slin_4260 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 51 288 46 292 301 102 29.0 1e-20 MIQKFRHIVHLVLAFFSDYLSLLGRPLFYQLQFQHFKCEFDNKVQPKRLCLLGNGPTFNQ IEEHLEALNDFDFCAVNLSVNTDIFFRVKPKMYVIVDMMFWQQPEKEGIKECWDNIQKID 
WDITICIPFNFPTSMKRTFERNPFVKVRRYSNNVWRPELRMANKLKLWLYKKGWVSPNAT NVSVGAIFASILNGYKEINLLGLEHSWMKDIRVNDQNEVVLMNRHYYGDVEHVWRDYEGK PIKLVDFLASQLETFTSHLDLRRFVDYLGDVRIINRTKGSYIDAYERETFEKMLESSR >gi|283510553|gb|ACQH01000066.1| GENE 59 66346 - 67494 306 382 aa, chain - ## HITS:1 COG:Cj1328 KEGG:ns NR:ns ## COG: Cj1328 COG0381 # Protein_GI_number: 15792651 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine 2-epimerase # Organism: Campylobacter jejuni # 1 330 1 332 384 214 36.0 2e-55 MHKYKVCFVTGSRADYGIVRCYLNLLNQDESIDLSILVTGALLDEKYGKAKTLIEDDGFK IGFECPVSLKVDRLCDTTSVMAVVLDEFGKHFEENKYDLVIILGDRYEVYSIAIAAAMQH LPILHLHGGEITLSNYDEFIRHSITKMSQYHFTSTDEYRKRVIQMGEDPSTVFNLGALGA ENCLSIDESNVNVEIKQLPDKRCFTILFHPETLNSQSPLVQVNQLLSAIEGFLPDFEFCF IGSNADTKSDVIRENVQAFCKEHEHTHYYENLHPDAYHYLVKKSIALIGNSSSGIIEVPS LSTMTVNIGDRQTGRVKGNSIIDVRCNKDDIIAGVDKAMNNAQEKIINPYLKENSALLYY KTTITILGKLTSTKYKRFYDLR >gi|283510553|gb|ACQH01000066.1| GENE 60 67534 - 68250 384 238 aa, chain - ## HITS:1 COG:HI1279 KEGG:ns NR:ns ## COG: HI1279 COG1083 # Protein_GI_number: 16273194 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: CMP-N-acetylneuraminic acid synthetase # Organism: Haemophilus influenzae # 1 223 5 226 228 121 34.0 9e-28 MKRLAIIPARAGSKGLKDKNIIDLCGKPMIAYSIAAAKNSGLFDRVIVSTDSVHYADIAT SYGAEVILRGEQLANDQSTSFMVIEDVLNKVETCYDYFVLLQPTSPLRTSTHIQEAIAKF EASYDSFDFLVSMKKSEFTKELVNMIDGDESLKNFDRDFSNYCRQTYSYFSPNGAIFIGK IKPYLKQKHFFGARALSYVMNDIDSVDVDHVIDYELAQILMKKRMKGGEPYCRLGLVK >gi|283510553|gb|ACQH01000066.1| GENE 61 68252 - 68962 256 236 aa, chain - ## HITS:1 COG:Cj1327 KEGG:ns NR:ns ## COG: Cj1327 COG2089 # Protein_GI_number: 15792650 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sialic acid synthase # Organism: Campylobacter jejuni # 1 234 102 332 334 230 50.0 2e-60 MESIDLLEQSGQSIWKIPSGEITNLPYLERIAKIRIADKQIILSTGMSTVDEIHQAVDVL LKDEIKERITILHCNTEYPTPDVDVNLSAIDTLHQEFPGFRIGFSDHSVGSVAAIGACMK NIVLIEKHFTLDKSLEGPDHKASATPEELALLCESVRRIEVIAGGEGKRVTESERRNINI 
ARKSIVARIGIKKGEVLTATNITCKRPGNGISPMKWYDVLGTKAIKDFTEDQLIEI Prediction of potential genes in microbial genomes Time: Sat May 28 01:20:14 2011 Seq name: gi|283510552|gb|ACQH01000067.1| Prevotella sp. oral taxon 317 str. F0108 cont2.67, whole genome shotgun sequence Length of sequence - 84609 bp Number of predicted genes - 57, with homology - 51 Number of transcription units - 38, operones - 13 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 3 - 198 144 ## COG2089 Sialic acid synthase 2 1 Op 2 . - CDS 209 - 853 215 ## COG0110 Acetyltransferase (isoleucine patch superfamily) - Prom 873 - 932 3.4 3 2 Tu 1 . - CDS 2967 - 3983 773 ## PG1003 hypothetical protein - Prom 4041 - 4100 3.6 4 3 Tu 1 . - CDS 4113 - 4715 501 ## COG1309 Transcriptional regulator - Prom 4810 - 4869 5.8 - Term 4941 - 4975 4.5 5 4 Op 1 . - CDS 5126 - 6445 1151 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II 6 4 Op 2 . - CDS 6438 - 6665 322 ## gi|288928658|ref|ZP_06422504.1| hypothetical protein HMPREF0670_01398 + Prom 6976 - 7035 5.2 7 5 Op 1 . + CDS 7134 - 9488 2904 ## COG0550 Topoisomerase IA 8 5 Op 2 . + CDS 9535 - 10113 577 ## COG0703 Shikimate kinase + Prom 10119 - 10178 1.7 9 6 Tu 1 . + CDS 10372 - 12264 2273 ## COG1166 Arginine decarboxylase (spermidine biosynthesis) + Term 12269 - 12309 5.5 10 7 Tu 1 . - CDS 12860 - 13324 -345 ## - Prom 13348 - 13407 5.4 + Prom 14077 - 14136 4.9 11 8 Op 1 . + CDS 14165 - 16975 3302 ## GAU_2470 putative outer membrane protein + Prom 17084 - 17143 5.7 12 8 Op 2 . + CDS 17236 - 18498 1129 ## gi|288928665|ref|ZP_06422511.1| hypothetical protein HMPREF0670_01405 + Term 18524 - 18594 15.1 13 9 Op 1 . + CDS 19373 - 22468 3892 ## BT_1280 hypothetical protein 14 9 Op 2 . + CDS 22513 - 24060 1773 ## BVU_0618 hypothetical protein + Prom 24168 - 24227 4.1 15 10 Op 1 . + CDS 24255 - 25358 1334 ## BVU_0617 glycoside hydrolase family protein 16 10 Op 2 . 
+ CDS 25384 - 26547 1529 ## BF1313 hypothetical protein 17 11 Tu 1 . + CDS 26845 - 27831 978 ## gi|288928670|ref|ZP_06422516.1| F5/8 type C domain-containing protein + Term 27940 - 27985 3.2 + Prom 27979 - 28038 1.7 18 12 Tu 1 . + CDS 28229 - 29599 1058 ## BLD_0324 SAM-dependent methyltransferase 19 13 Tu 1 . - CDS 29841 - 30149 311 ## BVU_3759 hypothetical protein - Prom 30275 - 30334 4.3 + Prom 30464 - 30523 3.4 20 14 Op 1 . + CDS 30695 - 30892 75 ## 21 14 Op 2 . + CDS 30911 - 32116 1104 ## PRU_1100 hypothetical protein 22 14 Op 3 . + CDS 32161 - 33360 960 ## PRU_1100 hypothetical protein + Term 33540 - 33584 2.1 23 15 Tu 1 . - CDS 33487 - 33702 73 ## + Prom 33567 - 33626 3.7 24 16 Tu 1 . + CDS 33709 - 35802 1998 ## COG1629 Outer membrane receptor proteins, mostly Fe transport + Term 35865 - 35920 15.1 - Term 35854 - 35906 9.8 25 17 Tu 1 . - CDS 35946 - 36716 653 ## COG2227 2-polyprenyl-3-methyl-5-hydroxy-6-metoxy-1,4-benzoquinol methylase - Prom 36844 - 36903 3.5 26 18 Tu 1 . - CDS 38351 - 39385 1105 ## COG2255 Holliday junction resolvasome, helicase subunit - Prom 39522 - 39581 4.0 + Prom 39507 - 39566 2.4 27 19 Tu 1 . + CDS 39670 - 41574 1864 ## COG2217 Cation transport ATPase + Term 41589 - 41627 7.1 - Term 41691 - 41751 7.2 28 20 Tu 1 . - CDS 41763 - 43877 2521 ## PRU_2649 hypothetical protein - Prom 43903 - 43962 1.9 29 21 Op 1 . - CDS 44042 - 45508 1313 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 30 21 Op 2 . - CDS 45545 - 46774 972 ## COG2715 Uncharacterized membrane protein, required for spore maturation in B.subtilis. - Prom 46797 - 46856 2.4 - Term 47078 - 47129 7.0 31 22 Tu 1 . - CDS 47137 - 49653 3013 ## Coch_0557 hypothetical protein - Prom 49849 - 49908 3.9 + Prom 52348 - 52407 6.3 32 23 Op 1 . + CDS 52599 - 53060 277 ## PROTEIN SUPPORTED gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 33 23 Op 2 . 
+ CDS 53088 - 53510 390 ## COG0319 Predicted metal-dependent hydrolase + Term 53677 - 53715 1.4 + Prom 53662 - 53721 3.8 34 24 Tu 1 . + CDS 53759 - 55825 1589 ## Slin_1714 peptidase M48 Ste24p + Term 55944 - 55979 5.1 35 25 Tu 1 . - CDS 55886 - 56143 62 ## - Prom 56210 - 56269 3.0 + Prom 56250 - 56309 3.8 36 26 Tu 1 . + CDS 56335 - 58206 1646 ## COG0445 NAD/FAD-utilizing enzyme apparently involved in cell division + Prom 58599 - 58658 3.2 37 27 Tu 1 . + CDS 58678 - 59208 453 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins 38 28 Tu 1 . + CDS 60160 - 62010 1834 ## COG0322 Nuclease subunit of the excinuclease complex + Term 62012 - 62049 1.1 - Term 62000 - 62036 2.5 39 29 Tu 1 . - CDS 62087 - 62269 57 ## - Prom 62364 - 62423 7.3 + Prom 64719 - 64778 3.9 40 30 Tu 1 . + CDS 64930 - 66675 1530 ## BF0008 hypothetical protein + Prom 67086 - 67145 4.0 41 31 Op 1 . + CDS 67166 - 67615 504 ## COG1490 D-Tyr-tRNAtyr deacylase 42 31 Op 2 . + CDS 67628 - 68557 500 ## COG0270 Site-specific DNA methylase 43 31 Op 3 . + CDS 68544 - 69275 588 ## lpp1079 hypothetical protein 44 31 Op 4 . + CDS 69277 - 69648 539 ## COG1694 Predicted pyrophosphatase 45 31 Op 5 . + CDS 69655 - 70584 1332 ## COG0274 Deoxyribose-phosphate aldolase 46 31 Op 6 . + CDS 70617 - 72002 1625 ## BVU_0235 putative periplasmic protein 47 31 Op 7 . + CDS 71999 - 73558 1489 ## COG1696 Predicted membrane protein involved in D-alanine export - Term 73657 - 73724 4.4 48 32 Tu 1 . - CDS 73815 - 74288 302 ## gi|288928699|ref|ZP_06422545.1| hypothetical protein HMPREF0670_01439 - Prom 74525 - 74584 8.6 + Prom 74905 - 74964 5.5 49 33 Tu 1 . + CDS 75107 - 75331 316 ## BDI_2160 hypothetical protein + Prom 76407 - 76466 6.9 50 34 Tu 1 . + CDS 76709 - 76966 142 ## gi|288928701|ref|ZP_06422547.1| hypothetical protein HMPREF0670_01441 + Term 77006 - 77044 -0.4 + Prom 77264 - 77323 2.1 51 35 Op 1 . + CDS 77374 - 78000 607 ## BDI_2160 hypothetical protein 52 35 Op 2 . 
+ CDS 77997 - 78821 547 ## BDI_2159 hypothetical protein - Term 79528 - 79582 16.1 53 36 Op 1 . - CDS 79590 - 80153 406 ## gi|288928705|ref|ZP_06422551.1| hypothetical protein HMPREF0670_01445 54 36 Op 2 . - CDS 80140 - 81432 260 ## gi|288928706|ref|ZP_06422552.1| putative pyrogenic exotoxin B - Prom 81549 - 81608 7.2 55 37 Op 1 . - CDS 81622 - 82461 618 ## Fjoh_1697 hypothetical protein 56 37 Op 2 . - CDS 82465 - 84306 1606 ## Fjoh_1696 TonB-dependent receptor - Prom 84504 - 84563 2.8 57 38 Tu 1 . + CDS 84305 - 84529 56 ## Predicted protein(s) >gi|283510552|gb|ACQH01000067.1| GENE 1 3 - 198 144 65 aa, chain - ## HITS:1 COG:Cj1327 KEGG:ns NR:ns ## COG: Cj1327 COG2089 # Protein_GI_number: 15792650 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sialic acid synthase # Organism: Campylobacter jejuni # 1 65 1 65 334 59 50.0 2e-09 MEHVYIIAEIGCNHCGDATIAKEMVKKAKACGVDAVKFQTFKASALISKYAPKADYQKKT TGENE >gi|283510552|gb|ACQH01000067.1| GENE 2 209 - 853 215 214 aa, chain - ## HITS:1 COG:BS_yvfD KEGG:ns NR:ns ## COG: BS_yvfD COG0110 # Protein_GI_number: 16080477 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Bacillus subtilis # 1 206 1 209 216 114 35.0 1e-25 MKSLIIVGAGGYAKSVLDSLDYMNFKMEGFLDDIKPIGSEHLGYPIIGNKVEDIGHYKNC VFFVAIGNNAKRKLWFDKLKQYGLSLINVIDSSAIISRHATMGEGCFVGKLAILNHGCCV GNNCVVNTRALVEHGCNIQDHVNLSTNSTLNGDVIVEEGGFVGSGAIINGQLRIGTWALV GSGTVVIRDVRPNTTVVGVPAKEIPSNSVKYNSL >gi|283510552|gb|ACQH01000067.1| GENE 3 2967 - 3983 773 338 aa, chain - ## HITS:1 COG:no KEGG:PG1003 NR:ns ## KEGG: PG1003 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis # Pathway: not_defined # 15 334 1 321 330 425 61.0 1e-117 MKKIFVTLILFIGMMSNVFAQEPDYLFMIAQAVKAPSGHNTQPWLFKIADGEIDILPDFT KSLPVVDPNHRELFVSLGCAAENLCIAAAHKGYETRVTVTHEGVIKVFLTKQNHHMESPL FAQIALRQTNRSRYDGRIVPEDSIVSLKNIGAEPSIGVHYYKNGSAEYTTITEMVYAGNR LQMHNKAFKTELQQWMRYNKKHQNATKDGLSYAVFGAPNVPKFIAKPIMANAINEKKQNK 
SDAKNMASASHLVLLTSRDNTIEQWVNLGRTLERILLRSTQMGIAHAYLNQPNEEPTLAV ELAQKLGLTNERPTILIRIGYGKKMPYSQRREAKVERM >gi|283510552|gb|ACQH01000067.1| GENE 4 4113 - 4715 501 200 aa, chain - ## HITS:1 COG:CC2662 KEGG:ns NR:ns ## COG: CC2662 COG1309 # Protein_GI_number: 16126897 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Caulobacter vibrioides # 2 187 16 207 213 85 27.0 7e-17 MKDRELTQQKILQAVDDIIANDGFERLGVNVIAQKAGVSKMLIYRYFGGLDDLIAQYLLS KDYWANTDIHAIDGADVAGSLKRMFREQIIRLRNDVTLRRLCRWELTTDNENIQLLRDRR ERNGCELIKAVSGFTHSHHTEVAALATILSASISYLVLIEEQSPTYNGINLRSNEGWEQV VKGLDLMIDLWIKQVNQLTG >gi|283510552|gb|ACQH01000067.1| GENE 5 5126 - 6445 1151 439 aa, chain - ## HITS:1 COG:SMa0150 KEGG:ns NR:ns ## COG: SMa0150 COG0318 # Protein_GI_number: 16262533 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Sinorhizobium meliloti # 101 432 153 490 508 118 27.0 3e-26 MLSIEQHIDRHAQNAPERVAMVLNGEELTYAELRQRTMLRTQHYQGMQRKGVVLRASCDA DFLISYFAMHVAGAVAVLLPAAATPEQTVRMEKALQGFYFPEGSADVLFTTGTTGAPKGV VIGTRAIVASAENIVHGQAYRHDIRFVVPGQLNHLGCLSKVWATMLAGATCCLMPKFDFN EFFSLLASCKEMSVGAFLVPSCIRMLTKFGGKQMEDAAIKLAFIETGGDALDTPTMAELH QQLPHTRLFNTYASTETGVVATFEFTHHQLAGCVGYALPNARFDITPEGHIRCAGDTLMN GYLRMGDQTTYEPLTEKSFTTCDCGEIDEQGRLQFKCRDSDIINTGGFKVSPQEVEAAAN SLAEVKACICIATPHPILGQALKLLVVADGDVEINAKDLARRLKEQLEPHQVPLFYEQVE SLALTPNGKIDRKRYLLRQ >gi|283510552|gb|ACQH01000067.1| GENE 6 6438 - 6665 322 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928658|ref|ZP_06422504.1| ## NR: gi|288928658|ref|ZP_06422504.1| hypothetical protein HMPREF0670_01398 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 75 1 75 75 118 100.0 1e-25 MKEKIRNILGGALPLVDLDSQFLMAELDSLDVTTIMMLLADNFGIDVDATDATPANFKSL DTLAEMVRRKQNADA >gi|283510552|gb|ACQH01000067.1| GENE 7 7134 - 9488 2904 784 aa, chain + ## HITS:1 COG:SP1263_1 KEGG:ns NR:ns ## COG: SP1263_1 COG0550 # Protein_GI_number: 15901123 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Streptococcus pneumoniae TIGR4 # 4 570 15 553 553 442 43.0 1e-123 MQQNLVIVESPAKAKTIEKFLGEDFKVMSSYGHIRDLKKTGLSVDLDTFEPFYEISDDKK KVVTDLQKAAKSAKRVWLASDEDREGEAISWHLCEVLGLDEDKTNRIVFHEITKPAILRA IEEPRRVDMNLVDAQQARRVLDRMVGFMLSPILWRKVKPSLSAGRVQSVAVRLVVDRERE INAFKPETFYRIVAIFAITNDEGGMTEVDAELDTRFKTHEEALAFLEKCKNAEFTVESVS KKPLKRSPAPPFTTSTLQQEAARKLGFSVSQTMMVAQRLYENGRITYMRTDSVNLSALCI NATKEEVTRLWGEQYSLPRNFHTHAKGAQEAHEAIRPTYMNVTKIDGTQQEKKLYDLIWK RTAASQMAEAQLEKTRVNINISGTSEHFVATGEVLTFDGFLKVYRESTDDEDENQSDSSR MLPPINEGDKLERREITSIERFTQGPARYTEASLVHKMEELGIGRPSTYAPTISTIQHRE YVTRGDSKGKERQYLVDTLKGIVITTKNKKENVGGDKGKLLPTDVGIVVNDFLMDSFPTI MDYNFTAHVEEDFDRIAEGQEEWKAMMKAFYADFGPTVDTVMNARSEHKAGERVLGEDPT TGKQVLVKIGRFGPVAQIGVASDDEKPRFANLTAGLSIETITLEEALELFKLPRTVGEFE GKPVVIGSGRFGPYILHDKKYTSLPKTEDPHSVTLETSVALIEQKRMQEKQKHLKSFAED PKLEVINGKFGPYLAYDGKNYKLPKELHDKASELTYEQCMELVKAAPTKAKRTTKASAAS KAKK >gi|283510552|gb|ACQH01000067.1| GENE 8 9535 - 10113 577 192 aa, chain + ## HITS:1 COG:alr1244 KEGG:ns NR:ns ## COG: alr1244 COG0703 # Protein_GI_number: 17228739 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Nostoc sp. 
PCC 7120 # 21 186 10 170 181 99 37.0 3e-21 MQNGLPSTASPNKGKGVCHRIILIGYMGAGKTTIGKALAAELGLRFYDLDWYIESRMRKT VAQLFAEMGEEGFRRIERNMLHEVAEFEDVLISCGGGTPCFFDNMQYINQQGHTLYLKAS PDVLYKHLKMGKSVRPLLLNKTPYEVKAFISEQLAAREPFYLLAHHTLDVSLMDNFEKIK LSVTQAKEVLGV >gi|283510552|gb|ACQH01000067.1| GENE 9 10372 - 12264 2273 630 aa, chain + ## HITS:1 COG:all3401 KEGG:ns NR:ns ## COG: all3401 COG1166 # Protein_GI_number: 17230893 # Func_class: E Amino acid transport and metabolism # Function: Arginine decarboxylase (spermidine biosynthesis) # Organism: Nostoc sp. PCC 7120 # 4 629 53 677 679 577 43.0 1e-164 MKKWTIEDSKELYNINGWGTSFFGVNDKGDIYVSPCKDNTEIDLRDVMDELALRDVTSPV LLRFPDILDSRIEKTASCFRQAAEEYGYKGENFVIYPIKVNQMQPVVEEIISHGRKFNLG LEAGSKPELHAVIAVQCQSDSLIICNGYKDQSYIELALLAQKMGKRIFLVIEKLNELELI ARAAKKLNVRPNLGIRIKLASSGSGKWEESGGDASKFGLTSSELLQALELLDQKGLRDCL KLIHFHIGSQITKIRRIQTALREASQFYVQLHKMGYDVDFVDCGGGLGVDYDGTRSPSSE SSVNYSIQEYVNDCIYTFVDAANKNGIEHPNIITESGRSLTAHHSVLVIDVLETATLPEM KEEFEPSENEHQLVKDLYDIWDNINSRSMLEDWHDAEQIREEALDLFAHGLVDLRTRAEI ESMYWSVCREINTIAKGLKHVPEELRNMDKLLADKYFCNFSLFQSLPDSWAIDQLFPIVP IQRLDERPTRNATLQDITCDSDGKIANFVTNRHISHVLPVHTLRKNEPYYLGVFLVGAYQ EILGDMHNLFGDTNAVHVSVKDGKYHIDQIFDGETVEEVLDYVQYNPKKLVRQLEIWVTK SVKQGKISLEEGKEFLSNYRSGLYGYTYLE >gi|283510552|gb|ACQH01000067.1| GENE 10 12860 - 13324 -345 154 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVANPSFLGSKYALFTPKIPHFEGYFAHSSRVSCGSKRLCLYHCSVFLCLSFCILQHFAL RLASKHTAFSIKTHCVLRHFTWYLAPKCTAFCRILHCILLQIAQKLVQIAVLLNKNTFWW HLKLSPFGIKTNLRENRLFAARRAVGAQKGQSVC >gi|283510552|gb|ACQH01000067.1| GENE 11 14165 - 16975 3302 936 aa, chain + ## HITS:1 COG:no KEGG:GAU_2470 NR:ns ## KEGG: GAU_2470 # Name: not_defined # Def: putative outer membrane protein # Organism: G.aurantiaca # Pathway: not_defined # 20 936 46 1032 1032 223 26.0 3e-56 MITFKRVLLLCTLSAACGGAFAQKVASGVVLDAETKEPLIGAVVCTSNGQKVVSDLNGEF NISVNPGERLSVSYVGYENKQLPIADTQQRTVVKLNRGLTLQEVNVVASIASARNKKAVG ADVAHLDAAKLLAKGQAANLSDLLDGRVSGMQMFQSNGKVGMPIRFNMRSGATLSMDRDP 
IIYIDGVRYNNSHTSDINTSQDALSALNDLPMDDIASIDVIKGPAAAASYGAEAANGVIV ITTKRQSGSTLERGKLAATAKISAGWSNKAREYTQFVNNTDINNFFVTGHNVSAYASFTK NFTPGNQLFFSFNESHNGGIVPGNKDVRHSLRAAYDMKQGPFTLNFTVGYVNGDISIPQT AQGRNDAIWNLMRAQKPWEYVSERTWRAMKWKYDNDRLTAALRMAYVLPYDIKLETQLGL DLNHIDGLYTLPYGYLLGTNDEGEKSISNRRNQNLTWDWKASRRFDLGHKLHLTGTLLSQ IVQRRETMDKTSASIFPADVDNIAAAAQRSVLETSFEQRTWGLYGEAFLNYDNRLFVNAG LRRDASNLIGRNVASIYYPSLSVAYNMEKAKLRAAYGESGRLPYPTDAFTYYEVKGMSAY GPLLVPGKKGNDNIRPERMREIEAGLDWTPARHQIGLTAYAQFTTDAIIYTPLLSSNGWV GDEPHNVGRVNGWGVELSWNWKAWQNPARTADLNLFVTANYQGNRVIDTGGADIENLPNV LKEGQPAYAFYYKEVTGANYDATGKYLGAKETDEYHYLGKPFPDFNGAFGFDLRLLSNLT LSTKFNWAVGASVYNQSFYNVAGLGDNLKKREELRAQLAAETVGTDAYRNVAEKLARTER NRANYIEKADFLRLSSISLGYDASSFAKRLTSDVLKGARLMLTAQNLFLVTNYSGIEPQV EANGGTRQTRGMGSLSRDITNAPSARTFTATLAVEF >gi|283510552|gb|ACQH01000067.1| GENE 12 17236 - 18498 1129 420 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928665|ref|ZP_06422511.1| ## NR: gi|288928665|ref|ZP_06422511.1| hypothetical protein HMPREF0670_01405 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 420 1 420 420 792 100.0 0 MKMKSKHISIAVLAVATALSSCDGWIDNAKTPNNMLTTEQITTPSMLATIRTENIKDGAM IANVKTLAGVAASAAFLTSGAMTDEVEPTAKPNFLIYKQLKSDDVKPNSDVADTWNKLQN FRARAEEVLQVEAKLSNDGTENFDAVRAYARYTGHLYAGLAYQLLGKMFSKSTESAEGVR VNNQMVSNEELLDKAQQHYSQALDEAKGKLLANKKGVFVATTAVKQVRMVMVKLAMQRQR YAEAAALWNDAYANNMQVDVVYNINGNANPLYSAVGLPAINVQVDASLVASLKGTAEQKA INTAINKDGHRYLTWLEKYAPLVIIDEKEMKLIQAELIVRGLMQGDAAQCVNEVLALYST GQTITAAPTLKELAHLRHVFLFLHGCRIDDLRRAMVDGEPQAAWNARKVKYIPMPESEYK >gi|283510552|gb|ACQH01000067.1| GENE 13 19373 - 22468 3892 1031 aa, chain + ## HITS:1 COG:no KEGG:BT_1280 NR:ns ## KEGG: BT_1280 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 31 1031 95 1110 1110 866 47.0 0 MNKERILSTAVFLVGIGTGVMASNAYPNNNGTALAKQLTERTGASTPNNLQQKHNEVKVT GTVFDSMNEPVIGATVRVKGEQTATVTDIDGNFTLNAHAGATLVISYLGHQTQEVAVPAN GNVTVTLKSESRQIDEVVVTALGIRRSEKALSYNVQKVGTDELTTVKNTNFMNSLNGKVA GININASSAGMGGATRVVMRGPKSISQSNQALYVIDGVPITNTSHGETGSMYSSQPGSEG IADINPEDIESISVLSGPAAAALYGSAAAQGVVMITTKKGKEGKVSVTVSHSTQFANPFI MPEFQNEYVNKPGSVASWGEKKPSAYGNYEPKNFFNTGTNIQNNVALTAGNDKNQTYLSL GTTNAAGILPNNKYNRYNLTFRNTTNMLGDKLTLDFGLNYILENDRNLTAQGQWFNPLTA VYLFPRGESFDAVRTYEIFDPNRNVYVQNWNYGDALKMQNPYWVTNRMVRENHRSRYMLS GSLKYKITDWLDVTGRLRWDDAAVKQEDKRYASTIDLFAHSKYGYYGHAKINDQSLYGDL MANVNKSFENFSIGANVGGSFSRTKYDNEGHQGGLKAPSNIFTPNAIDYSLVTNDNRPIY DLHKHAVNSLFANVELGWKSMVYLTLTGRNDWDSALDGTSNESFFYPSVGMSGVISQMVK LPKFINYMKVRASWASVGSAISPNITSRWRYEYVPASGTYRTVTYRFPKTFYPERTDSWE AGLTARFLNNALTLDLTLYQSNTRKQTFLRDVTAGGYDKEYIQLGNVRNRGIELSLGYNH RFGDLDWNSSFTFSSNQNKIVKLFDDDREVSKKGGLSGAEIILKKGGTMGDIYMSTDFLR DNEGNIATDNTGSVLQRNLDNPEYRGSVLPKANIGFSNEFAWKGFNFGFLITARFGGIVL SQTQAILDAWGVSKASADARNNGGINLNGGKISAENYYRIVGGDNPIWSEYIYSATNARV QEAHLGYTLPKKWLKGMELSLGLTANNLFFIYNKAPFDPEAVASTGTYYQGFDYFMQPSL RSLGFNVKLKF >gi|283510552|gb|ACQH01000067.1| GENE 14 22513 - 24060 1773 515 aa, chain + ## HITS:1 COG:no KEGG:BVU_0618 NR:ns ## KEGG: BVU_0618 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: 
not_defined # 14 515 9 512 513 283 35.0 1e-74 MNNKHILKSMALGALLVGGLAMQTACTAGFEEANRPGHKVSADELDRDNYSTSSFLTQLV NEAFPEQENTYQMTEDLIGNYLGRYMTYANNGFSDKNFARFNAPNGWVSWPFNNSLPKAT SAFKAIAAKTGREGVLYAQALVLRAQVYQRYVDMYGALPVGLEQDDDAAYSSQEDVYKLL VANLDTAALLLKPFNSSTVNDENDKVYEGKVSKWYKLANSLKLRMAIRMRYADPAFAKKT GEAAVAAGVITSNDDNCAVTYVPNGQYKTSIEWGDSRACADLECFLTGYADPRLRTYFKP AQKTSGRAIVGCLAGAKIENKKKANEIYSAANVGQNTRGVWLTAAEMAFCRAEGVLAGWS GMGGTAKDLYEEGVKLSFEQWGAGSAASYLENSTAKQADYTDAEGGFGQSQAAVSTITIK WNDGATAEEKLERLMTQKWIALFPDGQEAWSELRRTGYPKVFPVAQSTSGYSLKVPNRIP FAVDELTKNPTNYRKAVQLIGGTDDYATKMWWQKK >gi|283510552|gb|ACQH01000067.1| GENE 15 24255 - 25358 1334 367 aa, chain + ## HITS:1 COG:no KEGG:BVU_0617 NR:ns ## KEGG: BVU_0617 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: B.vulgatus # Pathway: not_defined # 20 367 20 351 352 342 50.0 2e-92 MNIKNKTLLWLAAATMSLTACSDWTETEIKNPTDLTTTNKSEAYYAKLREYKNSDHPKAF GWFGNWTGDAAMLNNSIKGLPDSVDFVSLWGNWRNLNDKQKRDLQFVQQVKGTKVLMCFI VMDIGDQMTPPLPADKAKAGTTWKEWRKEFWGWGDSNESRIAAAQKYANAICDSIDKYGY DGFDIDAEPNFAQPFATDKEMWNEQGVMTAFVQTLSKRIGPKSGTNKMLVVDGEPDALPE SLGDHFDYFILQAYNTSSDEQLNSRLRTQIEHFKNTLTPEQVAKKIIVCENFENYAAMGG VDFRTKQGNVIPSLLGMAYWQPTYNGQTFQKGGVGSYHMEYEYGQSSAQTTYPWLRKAIQ IMNPSLK >gi|283510552|gb|ACQH01000067.1| GENE 16 25384 - 26547 1529 387 aa, chain + ## HITS:1 COG:no KEGG:BF1313 NR:ns ## KEGG: BF1313 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 385 3 378 380 209 34.0 1e-52 MKRYSITIKAAAMAIAALSLAACNNAEYSELANQAYFEQTRTNANTSMKITVGEDNVQQN FYVRLSSPAEQASTFEVSVDEAALNEYNARNATHYVALPTSEYELTNTEATVNVGGVSST PVQLTVKPLSNEVKNSGKKYAVALKLKSKNGVNDVLEPGSKLVCIIDQVVRQDVPVINSA ARITFNMRQEYALTAWTVEMNVNIDRLGKAIGELNNQMLFGAWAPSGKDGEIYTRFGDAP IEGNRLQIKTQGSQLNSNQLFEAGKWYHLAFVCEGTKLSLYVNGKLDSQMDLPGKVVNLG NTMLFGATDYLKANVQVSELRFWTKARTQNEVANDMYVCDPASDGLEAYWKMNEGSGDTF KDATGHGNNGKAATLPAWIPNVRIDGK >gi|283510552|gb|ACQH01000067.1| GENE 17 26845 - 27831 978 328 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|288928670|ref|ZP_06422516.1| ## NR: gi|288928670|ref|ZP_06422516.1| F5/8 type C domain-containing protein [Prevotella sp. oral taxon 317 str. F0108] # 1 328 1 328 328 608 100.0 1e-172 MKSILKITLPALVGLATLFTACQSEPEVGTKLYDKGETSNLPKLYIHDLGNQGNKGALEV VNANGELIAKKDTVKFYVRLSSPLDHDLEVSVAENPSATKQAGATALEKGAVNILTPTVT IAKGATVSAEPIKVVAQLGNALDTLMSTRTNGAVTLTLNPAQGVDVASNYGNMNITVNYK ESNIVDNGNTDGLTAMDLGDYLVYDDQNNKLEKLYDGNLSTVWFKPSNEGPFTFVITVSE NTKVTALSLVPSARYVNWVAKSMTVETSEDWNTWTNQGEITRTDNSISNGEPFVVRFTKP VKCNYIRLSNVRSASSRYLVIGEMKFYK >gi|283510552|gb|ACQH01000067.1| GENE 18 28229 - 29599 1058 456 aa, chain + ## HITS:1 COG:no KEGG:BLD_0324 NR:ns ## KEGG: BLD_0324 # Name: not_defined # Def: SAM-dependent methyltransferase # Organism: B.longum_DJO10A # Pathway: not_defined # 1 455 1 424 431 369 45.0 1e-100 MYINNDTLRFIAEHRNDNVPQLALAAHNSPNVDLPFALDQIAGWQTACRKLPSWAKNPNI VYPPHLSMEQCSSQTTAEYKANVAWRLVNDIQQATPNNSEGGKGNCDKKQVGECEHTSLA DLTGGFGVDFVFMAPHFAQATYVERQSKLCQLVQNNLNALDIGHAEVVCAEAESHLKTME RVSCIYLDPARRDKNGGKTVLIEHCSPDILQLLPMLLAKCGLLMVKLSPMLHWQLAIAQL QEHGAWVSQLHIVSAKNECKELLFLITDRQNHAQCLSQAAAQSGQLHPTPLENPTPQQAT GTANSTHITCYNDGNSFQYSLAEADATPQTILQTPPQAGMYLFEPNASIMKAGCFGMLCS RLGVQAIAPNSHLFVANNDLPDFPGRRFKLVATTSFNKRHLKAALQGISKANIATRNFPL KPEALRKKLRLKDGGEHFLFATTDAQGKHWLLVAKG >gi|283510552|gb|ACQH01000067.1| GENE 19 29841 - 30149 311 102 aa, chain - ## HITS:1 COG:no KEGG:BVU_3759 NR:ns ## KEGG: BVU_3759 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 102 1 102 102 175 79.0 5e-43 MVDLYEYSTPELVRLLGTRFKEYRLRCNLTQKEVAERSGVGLTTIHKFENGSAGNLSLST FVLLLKVVGQINSLDNLLPELPPSPYLVRRDEKKAQRIRHTK >gi|283510552|gb|ACQH01000067.1| GENE 20 30695 - 30892 75 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRINLTSPILGKVKPNKHRHQGMCQCRTYAALTMCLRTNNPAYCLRTHPRPRQTTITTTY NLTTF >gi|283510552|gb|ACQH01000067.1| GENE 21 30911 - 32116 1104 401 aa, chain + ## HITS:1 COG:no KEGG:PRU_1100 NR:ns ## KEGG: PRU_1100 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola 
# Pathway: not_defined # 35 391 1 356 366 176 29.0 2e-42 MKRKTTLTMLFATFILMASAQKGPLDSKWESNDSIPFTWDEGIYLPATINNKYPANILFT TNANRQLVVDTTYLKEKCWQPPKIEKGLILRETDTLRLKASNTKHEVKFGNIAANFPYML ITDIRNVLGKHADGIMWDTFFEYSPFEVNFQQKFLRTLTAIPDSVKRNYRCLPLTVRDSK FMIEAYVWFNNKRIGGLYQLWLGGNDDILFTSYIVKRHNLMAYKGKTRQLLAQYTNIGDT TTTTTTFALADSVRLGLQNIGPVVVSIPMPEASSHSRMYNAGYIGAGILSSYNLVFDPAH NKLYYRPYKEHTPEKRTWGFSWINRTDIGKGWIVRSIYKGGAAEKAGIRLGDTILKVNGK KVENYSWEEERALSNDSTITLLLKTEQGMRKVTLKTEPLYE >gi|283510552|gb|ACQH01000067.1| GENE 22 32161 - 33360 960 399 aa, chain + ## HITS:1 COG:no KEGG:PRU_1100 NR:ns ## KEGG: PRU_1100 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 35 399 1 365 366 226 34.0 2e-57 MKRKTTLTMLFAFFILMASAQKKASQQVKVCEEPISFIKDGHIYLPSIIDSTHHANMIFD TGASGQLLVDTVYLKEQGWTVNTSMWGRLRGTNGYSRVKVSQKRHKVEHGKASETFPYMV LGNLRSVLGKHADGIFGSNDFERLPLEINYQQRFIRKLNTIPDSVKRNYQCLPLTFKDKD CLIKATIWFNGQAIEGLYIIDTGSGNDVFFTEATTKQYNLESYRGEKRVGHGLDMGLGDG GVSTWIDAYADSMRMGSLHIPQPEISMLPEGKGAFKENIYVGNIGAGVLGCYNLVIDIPN GKLYCRQFKEYRSGGTTWGFGWINRTDIGKGWIVRSIYDGSAADKAGMKLGDIVVEVNGR KVEHFSWEEERALSNDSTVTLLLKTEQGMRKVTLEEEEY >gi|283510552|gb|ACQH01000067.1| GENE 23 33487 - 33702 73 71 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFLYFGKAVPPLTSQPVGARFTRFKSAKVCKCMKNRNPYLCGNRWERPYSQPLLNRRGER MALCLDDLTTR >gi|283510552|gb|ACQH01000067.1| GENE 24 33709 - 35802 1998 697 aa, chain + ## HITS:1 COG:all4026 KEGG:ns NR:ns ## COG: all4026 COG1629 # Protein_GI_number: 17231518 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Nostoc sp. 
PCC 7120 # 47 690 204 847 854 211 27.0 4e-54 MKIIATLTAVAFAATANATVEDSATIGGMINEVSVVGFKQDRAAISPVSQQGVGERYIEN NQLVGLRDLSGMLANFYMPDYGSRQYSPIYIRGIGSKVNSPSVGVYVDGIPYFDRTVLDM DLFGVSKVEVLRGPQGTLFGRNASAGLINVFTRSPLDYQGTMAKVSYGSYNDWLVAASAY HKLNQRFGLSLAANYHHNDGFQTNTFLNNKADRIGNGTIRLGATWKPADQWTARFTAALD LTQQNGYPYAPYNPQTGVLQPIAYNRESTYRRLVGTLGMSWRYDAEGWSMNSQTAFQHND GSVEVDQDYSPRDIFAVKMPHWQNQVSQEFTFRSTNESRYQWIVGAFGFHQHEHYTVNTS RIAAKLYEVSQNKLPTTGFALYHVSTFNIVKGLWASAGVRLDWERNKINNNKAKVKWMGQ DAPPMFNPVDPKGGWQATDFDASRIDKQITPKFTLKYQFTPFRMIYATVAKGYKAGGFNA VRQTDEDYTYKPEHTWNYEVGAKWSFLKGLLALEASAFYIDWRHQQLSITVPALGNVVRN VGHSNSKGVELTLNATPLPALSLQASYGYTYAKMLEGRMGMGRDYSGNMLPLVPRHTYSF NANYVVNNLGRIANKLMFNANLTGVGPLYWREDNAVKQPFYTLLNLKTAITRGIFTLELW SRNTLATNYLAYYFVASTPMAQKGKPFTVGTTLMVKW >gi|283510552|gb|ACQH01000067.1| GENE 25 35946 - 36716 653 256 aa, chain - ## HITS:1 COG:mlr3442 KEGG:ns NR:ns ## COG: mlr3442 COG2227 # Protein_GI_number: 13472976 # Func_class: H Coenzyme transport and metabolism # Function: 2-polyprenyl-3-methyl-5-hydroxy-6-metoxy-1,4-benzoquinol methylase # Organism: Mesorhizobium loti # 35 136 64 165 249 62 32.0 9e-10 MQHRQKDRKLYFNELSATSWKYFLPYIERYLPIEQGMNVLEIGCGDGGNLLPFSERGCEV VGVDLAECRINDAKRFFEEAGAKGRFIASDVFEMKGIEHHFDLIICHDVIEHIMDKASFL PKLRKFLRLGGVVFMSFPAWQMPFGGHQQICRSKFLSHLPWFHLLPEDVYAAILKAFHEK PGITDELLDIKQTRCTIELFEGLAAQHFAIRNRTLFFINPHYEIKFDLRPRLLWPRLARM RHVRNFFTTSCFYILE >gi|283510552|gb|ACQH01000067.1| GENE 26 38351 - 39385 1105 344 aa, chain - ## HITS:1 COG:XF1902 KEGG:ns NR:ns ## COG: XF1902 COG2255 # Protein_GI_number: 15838500 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, helicase subunit # Organism: Xylella fastidiosa 9a5c # 7 329 4 326 343 382 59.0 1e-106 MTEDFDIRDERTTPSEREFENALRPPKFQDFSGQSKVVENLEVFVEAAKYRGEPLDHTLL HGPPGLGKTTLSNIIANELGVGFKITSGPVLDKPGDLAGILTSLEINDVLFIDEIHRLSP VVEEYLYSAMEDYRIDIMIDKGPSARSIQIDLNPFTLIGATTRSGLLTAPLRARFGINLH LEYYDPQTLARIIKRSARILNVPIDDEAAMEISRRSRGTPRIANALLRRVRDFAQVKGNG 
SIDTVIARLSLTALNIDQYGLDEIDNRILLTIIDKFKGGPVGVSTIATAIGEDAGTLEEV YEPYLIMEGFIKRTQRGRMVTELAYQHLGRNIYTSNAKEPGLFD >gi|283510552|gb|ACQH01000067.1| GENE 27 39670 - 41574 1864 634 aa, chain + ## HITS:1 COG:PAB0626 KEGG:ns NR:ns ## COG: PAB0626 COG2217 # Protein_GI_number: 14521140 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Pyrococcus abyssi # 46 632 113 689 689 496 47.0 1e-140 MCEHHHEEETKLWQPILSGVLLAAGIAFTWMGCGWFSQPWVQLVWYIAAYLPVGFGVMHE AVEEAMKGDVFSEFMLMSVACVGAFAIGEYPEAVAVMWLYCMGEALQHRAVHKARHSISS LTHYRPELARLMRWNGEKNAMEQLLVKPEMVSVGDVIEVLPGERVPLDGVLLLPDGEHDA LVNFDTAALTGESMPRQVGVEGEVLAGMIALERAASLRVTRTQSESALTRILKMVEEATE RKAPTELFIRRFARIYTPIVILLATLTVVLPWAYTLVSPSFHYHFSAWLHRALVFLVISC PCALVISVPLSYFAGIGRASRMGVLFKGGNSLDALTGVDAVVFDKTGTLTTGAFGVQQTL HLSPEELQTVVAMERSSTHPIAKAIVKVYGDGEKINAENMPGLGLRAEIGGETWLAGTLR LLEKQGVSYPEELQTIPDTIVACAKNGRFIGCILLSDTLKPDAADAISMLRKVGITHVEM LSGDKQALVDKVAEELKLEQALGDLLPQHKAERIETLQKQGRKVAFVGDGINDAPVLALS NVGIAMGGAGADMAIETADVVLQTDQPSRLTQALRLARKTRTIVWQNIAFAIGVKVVVMA LGLMGIATLWEAVFADSGVALLAVVNAMRIMRGK >gi|283510552|gb|ACQH01000067.1| GENE 28 41763 - 43877 2521 704 aa, chain - ## HITS:1 COG:no KEGG:PRU_2649 NR:ns ## KEGG: PRU_2649 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 703 1 705 707 941 63.0 0 MKKITFILAALMAFGTAKADEGMWTLYNLPQAVYQQMQAEGFKLPYADIYQSKDAVKNAV VNFSGYCSGVVVSPNGLVFTNHHCGFEAIRRHSTVEHDYMLNGFYAKTYEEELPNSDMFV SFMKSQQDVTARLMGRGYATASKRQKDFLIDSLQTVLTDSVRKVDKTLRVSIDPFYESNS FYATVYQDFTDLRLVFTVPKSMGKFGGETDNWMWPRQTCDFSVFRIYADPTTNGPAAYSK NNVPYRPQHWAPISLDGYKEGDFSMTMGYPGSTSRYISSYGIIERRDCENTPRAQVRGVK QEVMRRHMRANEAVRIKYDSKFAQSANYWKNSLGMNKCIDSIGLVQQKRDYEARLRQWQD STGFLKGKLDFDKLGAYYEKRFDRVRTRYYWMETFRRTNELATRAMRLAMGGMEVKGPKK PASKQYHEFADNSNEWDAALDKEVLAVLLKNYREHVGNRYLPKFYTTIDQRFGGDYQRFV DDLYARSIIMKKGAKLFYNTKAYKEDPGVRFGQDVMDIYYDLTEGMGAINDSIEAQERYL CAAKLRMEEDMPHYSDANFTMRLSYGQVKGFSLGGKELGYYTTPESIVEKMDRAKSNVDY 
EAEPIMRQLLSSNDFGRFTDKSTNNLHLCFLTNNDITGGNSGSPMFDGKGRLIGLAFDGN WDSLSSDILFDSLLARCIGVDIRYMLFMMDKWGHADRLLKEMGL >gi|283510552|gb|ACQH01000067.1| GENE 29 44042 - 45508 1313 488 aa, chain - ## HITS:1 COG:TM0620 KEGG:ns NR:ns ## COG: TM0620 COG2244 # Protein_GI_number: 15643386 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Thermotoga maritima # 3 474 4 455 479 80 19.0 8e-15 MSNLKSLVKDTAIYGLSSIVGRFLNYLLVPLYTIKISAASGGYGVITNLYAYTALFLVLL TYGMETTFFRFANKSDENPQRVYSTILLSVGFTSALFIAMVVLFLHPIAQAMGYADHPSY IWVMAATVAIDAFQCIPFAYLRYQKRPIKFATLKLLFVGSNISLNLIYYLLLPALYEGGH QWVSLIYSPNVGAGYAFYINLVCTACITFCFYKELRGFAYVFDKALLKRMLSYSWPILVL GIAGILNQTADKILFPYIYKGNDAHTQLGIYGAASKIAMIMAMITQAFRYAYEPFVFGKS KDKDNRDTYAKAMKYFIIFTLLAFLVVMGYIDIFCRLIGRDYWEGLRVVPIVMAAEIMMG IYFNLSFWYKLIDKTIWGAYFSGIGCAVLIAVNVLFVPRYGYMACAWAGFAGYATAMILS YIVGQRKYPIAYPLTSIGVYVGITVLFYLCINYANTHFSVPVALGVNTVFILLFIGHIWK WDLKRGKG >gi|283510552|gb|ACQH01000067.1| GENE 30 45545 - 46774 972 409 aa, chain - ## HITS:1 COG:RSc0452_1 KEGG:ns NR:ns ## COG: RSc0452_1 COG2715 # Protein_GI_number: 17545171 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, required for spore maturation in B.subtilis. 
# Organism: Ralstonia solanacearum # 1 212 1 219 251 200 48.0 5e-51 MVLNYIWISFFLVAFVVAAAKLIFMGDATVFPAMMDSTFDMSKKAFEISLGLTGILALWL GVMKIGERGGVVNALARALSPVFTRLFPDIPKGHPVTGSIFMNIAANMLGLDNAATPLGL KAMEQLQTLNPKKDTASNPMIMFLVLNTSGLTLIPVSIMAYRAQLGASQPTDVFIPILLA TFFSTLAGIIITSIYQKINLLSPKTFLGLFVACAIVGSIIWGFGRMDKETMNTFSTTTAN ILLMGIIVAFIVAGVCKRVNVYDTFIEGAKDGFQTAVRIIPYLVAILVGVAVFRASGAMD LLVDGIRWLVGLLGINTDFIEALPTALMKPLSGAGARGIMIDTMSTYGADSFVGRLSCIF QGSTDTTFYVLAVYFGSVGIRYTRHAVACGLLADLAGVVAAILICYLFF >gi|283510552|gb|ACQH01000067.1| GENE 31 47137 - 49653 3013 838 aa, chain - ## HITS:1 COG:no KEGG:Coch_0557 NR:ns ## KEGG: Coch_0557 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 58 828 47 823 827 831 51.0 0 MGVNNRIGCLLAAVALGYSAQAKVAFAVKVQKDTTQTAKTDTAKVQTPNGKDVKAEPKKK ETDYDRLMKDSGTVMKGLFTVRHIKDKWYFEMHDSLVARYFMAVTRFAGVPQGFGKFSGE MVNEDAIYFEKRDAKTMLLRTFVRTQEANEKDRIYLSLKQSTADPIVAAFPIVAGNTQKG MNLFDVTGFFARDNNIVGINRTFFEKTKIGGQQADRTFIDTIKTYPINIEVLTTRTYAAS PSTIRASGTGAVTLALNTSIVELPKIPMRKRIWDNRVGFFAYPYTIFSDEQHKSQREQFI SRFRLVPKDVKRYQRGELVEPIKPIVFYIDPATPKKWVPYLIKGINDWNVAFEAAGFKNA IQGKEWPTNDPTMSLDDARFNVLRYLPSEAENAYGPHVKDPRSGEIIESHICWFHNVMNL LTKWYMTQCGPLDKRARTMNFDDRLMGELIRFVSSHEVGHTLGLRHNMGASYATPVEKLR DKAWVEKHGHTASIMDYARFNYVAQPEDNVGERGLFPRVNDYDKWAIKWGYQYRPEFKDE LAEKDKLMDETTAVLAKNPRLWFGGEGTNEDPRAQREDLGDDNVKASDYGVKNLQRLIKN LPQWTRQANDQYVDLREMYNSARQQFVRYFGHVAKNIGGRYINNMPGKTPYELVPAERQK AALGYFARQVFDAPLWLYPVEITSKTGVNATAEIAKMQASALNATFSLNLLNAIYNNSQE SSTAYRLENYLDDLFAAVWTPLDAAQDLKNSSRRTLQRNYVERLNTMLNPDDKDKVGATA PAFNTDALLFVAQHLDKLDNYLKAEMQTATGINALHYADLARQVNQIKTVRERGRNVR >gi|283510552|gb|ACQH01000067.1| GENE 32 52599 - 53060 277 153 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 [Bacillus selenitireducens MLS10] # 1 147 1 148 164 111 38 1e-23 MRMKTGLFTGSFDPFTIGHQSIVARVLPLFDKIVIGVGVNERKKYMYSAEVRVKEIAELY ADNPKVEVRAFNDLAVDFAHREGAWFFVKGVRSVKDFEYEREQADINRMMGGIETLLVFA EPQHASVSSSLVRELIHFGKNAEMFLPKRKNKK 
>gi|283510552|gb|ACQH01000067.1| GENE 33 53088 - 53510 390 140 aa, chain + ## HITS:1 COG:TP0650 KEGG:ns NR:ns ## COG: TP0650 COG0319 # Protein_GI_number: 15639637 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Treponema pallidum # 44 121 48 135 160 67 44.0 6e-12 MITYNADGVKMPKIKRKDTTAWIKAVAQSYGKKVGEVGYMFVNDEKILEINNEYLGHDYY TDVITFDYDEDDVVNGDIVISLDTVRTNAELFDKAFEDELYRVIIHGILHLCGLNDKGPG EREIMEKAENKALEMRNGAF >gi|283510552|gb|ACQH01000067.1| GENE 34 53759 - 55825 1589 688 aa, chain + ## HITS:1 COG:no KEGG:Slin_1714 NR:ns ## KEGG: Slin_1714 # Name: not_defined # Def: peptidase M48 Ste24p # Organism: S.linguale # Pathway: not_defined # 21 615 47 610 708 184 27.0 1e-44 MRENNAIRALGKTVLYRIALFVLVYLSLVAFGLTLLYIGYKWATFWGIDMLQRAFDGTNS GFVMLILLGAYLGVIALCLMFGLFLVKFIFTRLNAEEEGRIQVKESDCPQLFELIRHVAR ATKCPMPHKVFLFTEINACVYFNTTFWSMFFPVRKNLVLGVGLFATTSTEEIKGILAHEF GHFSQDSMKISSVVYTANIVMGNLVYGEDAWDRWVDRWSHFGWSPFSIFGLLTRFFTVRV RLLLQCLYRYVNIAYRELSRQMEYDADSIACKVVGKAVMASSFYKTTVLMQCSSNTHVAL VRLAETSKKADPFEVLELLAQIKSREEGKTIEATALLQAEVKTNDEPTARFSCDDLWNSH PSDKDRIQHLPNIPTQCNEPLVSAWSILPADLRQSVGELLRFDKSAKSGMPQMLTGEELR QWLEKELVTELLDMRFRRFFVECGCIDLFNPMEEMPAETVTYPFTKRNRNIVTAYIRAYE DAEQMIEIINGNREVSAAYYKGKSYAPNDLPTIEHSHYMERMLPKVKNVYREIYTYLRNG SHGTEVQGLYERLFALNKRIEAYQQAIIKPAEKMVEKWNSSHKEDKDEAQLLQEYGRLLT ELHDMMPSTLRMLSEELHADDTCADLYRLSVQAAAPSKGLDSDSQMKEINTCLSLLSPFL ELLQQQYAELKLAIGRIAQAEEELELAQ >gi|283510552|gb|ACQH01000067.1| GENE 35 55886 - 56143 62 85 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MASFYRYIYGARNILLRCLMFLKSSTCGVMACNLIGRESHKGLTKLGYRNHICKLTATKK DGRSHPSTSVVHFMVWRTAKLRQRL >gi|283510552|gb|ACQH01000067.1| GENE 36 56335 - 58206 1646 623 aa, chain + ## HITS:1 COG:CAC3733 KEGG:ns NR:ns ## COG: CAC3733 COG0445 # Protein_GI_number: 15896964 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: NAD/FAD-utilizing enzyme apparently involved in cell division # Organism: Clostridium acetobutylicum # 5 622 9 620 626 559 
45.0 1e-159 MIFRYDVLVIGGGHAGCEAACAAAKLGAKTCLVTMDMNKIAQMSCNPAVGGIAKGQIVRE VDALGGEMGIITDSTSIQFRMLNKGKGPAVWSPRAQCDRAKFITKWRETLDSTEGLDIWQ DQADELLVEDGMAVGVRTLWGAEFRAKSIVVTAGTFLNGLMHVGKVQIPGGRCAEPAVYR FSESIARHGITVDRMKTGTPVRIDARTVHFEEMERQDGEIDFHQFSYMPTPRTLTQLPCW TFYTTQEAHQALQAGIADSPLFNGQIQSTGPRYCPSIETKLVTFPDKEQHPLFLEPEGEN TAEMYLNGFSSSMPLDVQLNALRKIPALRDAKAYRPGYAIEYDYFDPTLLHASLESKVVK GLFLAGQVNGTTGYEEAAGQGLVAGINAAIACSRGEPFVMKRDESYIGVLIDDLTTKGVD EPYRMFTSRAEYRILLRQDDADARLTERAYQIGLATRHRYDHWMEKKESIERIISFCETT SVKANDINAALERWGTTPLNGSAKLADLIARPQLNLLVLADAVPELKAAIAQIPNRQEEI VEAAEIKMKYKGYIEREKIVADKMRRLENIRIKGHFNYEELHEISTEGRQKLARINPETL AQASRIPGVSPSDINVLLVLMNR >gi|283510552|gb|ACQH01000067.1| GENE 37 58678 - 59208 453 176 aa, chain + ## HITS:1 COG:YPO3123 KEGG:ns NR:ns ## COG: YPO3123 COG0503 # Protein_GI_number: 16123288 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Yersinia pestis # 4 162 13 169 187 164 49.0 1e-40 MNNKLLMDNLRCIPDWPKKGVNFRDVTTLFKNPECIKVITDEMCEIYKDKGITKIVGIES RGFVMSSAVAIRLGAGIVLCRKPGKLPCETVQESYNKEYGTDTIEIHKDSISEDDVVLLH DDLLATGGTMKAACDLVKKFNPKKIYANFIIELVNEDLNGRAAFDKDIEISTLIQV >gi|283510552|gb|ACQH01000067.1| GENE 38 60160 - 62010 1834 616 aa, chain + ## HITS:1 COG:BS_uvrC KEGG:ns NR:ns ## COG: BS_uvrC COG0322 # Protein_GI_number: 16079901 # Func_class: L Replication, recombination and repair # Function: Nuclease subunit of the excinuclease complex # Organism: Bacillus subtilis # 12 593 4 573 598 395 42.0 1e-110 MTKEENERRLAHLKSIVLSLPDKPGSYQFYDEQHIIIYVGKAKNLKSRVSSYFHKEVDRF KTKVLVSKIFDISYTVVNTEEDALLLENSLIKKYNPRYNVLLKDGKTYPSICITNELYPR IFKTRTINKRLGTYYGPYSHVPTMVALLELIGKLYKPRTCHFPITHEGVEMGKYKACLEY HIHNCDAPCVGKQSWEDYQENIRQAREILKGNTREVRKMLYEEMLKKAEELKFEEAEALK KRYILLDNYCAKSEVVSFSIADVDVFTITDDEFNRTAFINYLHVKNGTVNQSFTFEYKRK LDESNEDILASAIPEIRERFGSKAKEIIVPFEMEWAMREATFIVPQRGDKRRLLELSEMN GKQYKFDRLKQSEKLNPEQKAVRLMKELQTKLQMPRLPYQIECFDNSNISGTDAVAGCVV 
FKGMKPSKKDYRKYNIKSVEGPDDYASMKEVVRRRYTRMIEEETPLPDLIITDGGKGQMG VVREVVVEELHLDIPIAGLAKDDRHRTNELLYGWPPQVVGLDVKSELFHVLTRIQDEVHR YAISFHRDKRSKHALHSELDDIKGIGPKTKATLLKELKSVKQIKEAPLDVLAKHIGNTKA QIVFNYWHNETPNETT >gi|283510552|gb|ACQH01000067.1| GENE 39 62087 - 62269 57 60 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLSPHLLWVLLMSVIRFRLLSWLRTFDANLLGGLSLCIMKEHLCFSFSFNWTWESKWTWL >gi|283510552|gb|ACQH01000067.1| GENE 40 64930 - 66675 1530 581 aa, chain + ## HITS:1 COG:no KEGG:BF0008 NR:ns ## KEGG: BF0008 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 31 580 33 585 588 315 32.0 3e-84 MKTKLYAILIATTCLLIGIANASAREKTFGYRWNMHPTDNFVNVVIDSTENDLSVVFAVK DEEKGKIGTLIPVYQLNKQGAEVTVSIRYKTLNCERAYITLNTIGAGEQFLRSDTIELPL TEEWKEVSKKVKIKDAYLLNVFIEAIGSELAFNNIRVSAFKLEANGTQLKNETAPKSDNT SLKKNDVITLNGVDFSKIPSMQSRIMAIGETIHGSKTLGQMAFDIIKQRITKQGCNLVLH EYPLEYSFYVNRYIKNDPHFKLQDIERYMEGNLSSEHTIDFIKWLRTYNSTHNNRVSFWG IDIESLDIASSVDLSEFVKALCSNRETPETDSIVKRLLNWETGAEDSLQLTKADGVVGKW LTADEREILKLCLQLNHESSETYERLTNRDKTMAKTAFALIDILANKGALATIYGHFMHL NYLVAGDMKDLNNYTAGYHIKAKYKEDYQAIALCTYEGKTLNCLTDKVIGVAQLAKAPEG SVENTLQSMGNNITYLPTERLNYTDVLTMRVLGNTNDNYQFFYFVPKARVDGILFVSRSQ PVEKSQEVLNRYLNYVDATVRRYLENAKEKIRKLRESKLNE >gi|283510552|gb|ACQH01000067.1| GENE 41 67166 - 67615 504 149 aa, chain + ## HITS:1 COG:SP1644 KEGG:ns NR:ns ## COG: SP1644 COG1490 # Protein_GI_number: 15901480 # Func_class: J Translation, ribosomal structure and biogenesis # Function: D-Tyr-tRNAtyr deacylase # Organism: Streptococcus pneumoniae TIGR4 # 1 149 1 147 147 176 55.0 2e-44 MRIVIQRVTRASVSIEGRVESAIEKGMMVLLGVGYADTDNDIDWLVKKTINLRIYDDSEG VMNLSVKDVEGDILVVSQFTLMASCKKGNRPSYIHAASPDVSVPLYERFCQRLSESLGKP VATGVFGADMQVELVNDGPVTIVIDSKDR >gi|283510552|gb|ACQH01000067.1| GENE 42 67628 - 68557 500 309 aa, chain + ## HITS:1 COG:NMA1500 KEGG:ns NR:ns ## COG: NMA1500 COG0270 # Protein_GI_number: 15794400 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Neisseria meningitidis 
Z2491 # 4 298 22 330 337 195 39.0 1e-49 MVRYVDLFAGIGGIRIPFDELGAQCVFSSEWDKAACKTYAANFGDIPSGDITKIAAEDIP PHQLLLAGFPCQAFSIMGQMKGFDDTRGTMFFEVARILDYHKPKAVLLENVKQLTTHDRG KTFETIKATLRSLGYHINWRVLNAMDFGLPQKRERVIIVGFRDKMAYENFDFTFKKKPFN LATILEEEKDIDPKLYASKTIREKRKASTAGKEVFYPSIWHENKAGNISVLNYSCALRTG ASYNYLLVNGVRRPSSRELLRLQGFPETFKIVVSHADIRRQTGNSVAIPMIRAVAEKMYK LIDKADETS >gi|283510552|gb|ACQH01000067.1| GENE 43 68544 - 69275 588 243 aa, chain + ## HITS:1 COG:no KEGG:lpp1079 NR:ns ## KEGG: lpp1079 # Name: not_defined # Def: hypothetical protein # Organism: L.pneumophila_Paris # Pathway: not_defined # 4 236 5 237 247 163 38.0 6e-39 MKHPKLRDSHRLNARELYPTNEIPDDVVTQIGAHIIYLLALGRKDITGTDWGDTLALALQ ATHLDSPIGIADVVREKMAWSAKTIKTTSPLNTETVRLISGRCSPDYSYGIANPHDDIQQ TGRAVLGIWNERINIAQDNYNPIRTSVLIRSNDLSTFSLFEEEVQRYRTSDFVWEVNANG NLIGKEIETSQTKFTWQPHGSQFTIHTKVPTNAIKFRIRKPPLIEKKDILNSINFDPSWI TRL >gi|283510552|gb|ACQH01000067.1| GENE 44 69277 - 69648 539 123 aa, chain + ## HITS:1 COG:SA1292 KEGG:ns NR:ns ## COG: SA1292 COG1694 # Protein_GI_number: 15927040 # Func_class: R General function prediction only # Function: Predicted pyrophosphatase # Organism: Staphylococcus aureus N315 # 22 114 8 101 105 76 46.0 1e-14 MQQQNDNDKTPTNEPLTLGGAQQLVDQWVKQYGVRYFSELTNMAVLTEEVGELARQMARI YGDQSFKQGEEPNLGDEMADVLWVLLCLANQTGVNLEEELKLNIEKKTRRDKTRHINNEK LKK >gi|283510552|gb|ACQH01000067.1| GENE 45 69655 - 70584 1332 309 aa, chain + ## HITS:1 COG:mll4784 KEGG:ns NR:ns ## COG: mll4784 COG0274 # Protein_GI_number: 13474008 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Mesorhizobium loti # 64 293 86 323 348 182 40.0 7e-46 MSEHNHQHEHHHHEEEVSKYEKTLQQYTIELDDAQVQQAVKTIIAEKTLANDTPEVKKFL FGSIELTTLSTTDSDTSVMAFVDRVNQFENAYPQLPHVAAVCVYPCFAEVASETLEVEGV EITCVSGSFPSSQAVIEVKVAETALAVRDGATEIDIVMPVGKFLSGNYEELCDDIAELKQ ACGDKPMKVILETGDLKTASNIKKASLLSMYAGADYIKTSTGKEKVSATPEAAYVMCQAI KEYYEKTGIQIGFKPAGGINTVMDAIIYYTIVKEVLGEKWLTNKWFRLGTSRLANLLLSE INGEETKFF >gi|283510552|gb|ACQH01000067.1| GENE 46 
70617 - 72002 1625 461 aa, chain + ## HITS:1 COG:no KEGG:BVU_0235 NR:ns ## KEGG: BVU_0235 # Name: not_defined # Def: putative periplasmic protein # Organism: B.vulgatus # Pathway: not_defined # 4 461 6 455 455 346 42.0 1e-93 MNPPARTFILVAITVAALLLMHQLPTLSIGGTQLREVNILSQVVPENDGKQVDVLPTTPP HPIMVQTKKGAALHFKEVWTKGVEPIFDYSSGAAGGMDHFYSQLTKVKQLDRPVRIAYFG DSYIEGDILTADLRELFQRTWGGCGVGWVDTGSRMQQNRISIRQQYSGITEYAVAKKPFD ISKQGINERYFAPREGSWIKVSGTKNYAHTQNWTISTLYFHTPTSVTIKSNDGKGHISEH KFTGSTAVQKLEVKDSLNSIGFAFKGVGSGTFLYGLTLDGDKGVTLDNFSMRGSPGLTLA KIPTHRLQDFSRLRPYDLIVLHFGLNVAVPGNPLSVMKGYTTKMKKVIELMRQTFPQASI LVVSVPDRDQRSPNGIQTMKEVKQLVALQQQMAADMKVAFLNFFEAMGGESSVKTLVERS YANKDYTHLNYKGGKELARKMFPSFKEGYKNYVRRKAIEKK >gi|283510552|gb|ACQH01000067.1| GENE 47 71999 - 73558 1489 519 aa, chain + ## HITS:1 COG:PA3548 KEGG:ns NR:ns ## COG: PA3548 COG1696 # Protein_GI_number: 15598744 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane protein involved in D-alanine export # Organism: Pseudomonas aeruginosa # 62 442 19 390 520 260 40.0 5e-69 MIKQLSEFVEKVSYYADQIAEFTKQHAAEIHRVTDLLTYNSDEPLIFSTGLFLMLFLGFA FVYMCLQRRLTARLLFVTAFSYYFYYKSSGLYFFLLALVTISDFFLANLIHRYRERRGLG KLFVTLSLVIDLGLLGYFKYTNFFVGMVARMLEHNFQPWDIFLPVGISFFTFQSLSYTID VYRGHLKPLPSLLDYAFYVSFFPQLVAGPIVRASDFAPQIRQPLTITRDMFARGLFFILI GLFKKAVISDYISLNFVDRIFDNPGLYSGLENLLGIYGYAMQIYCDFSGYSDMAIGIALL LGFHFPLNFNAPYAATNISDFWRRWHISLSTWIRDYIYISLGGNRKGKIRQYFNLILTML LGGLWHGASLNFIIWGGMHGVALAVHKLFSQGLLKHPRGYVSRGIRRPLAIFVTFHFVCF TWLFFRNLSFETSKMMLNRIFTDFRADLFVPILHGYKYVLALMLFGFVTHYIPNRWQEGI IAGLRRCNVVVCALFLVIVIYIVIQVKSSAIQPFIYFQF >gi|283510552|gb|ACQH01000067.1| GENE 48 73815 - 74288 302 157 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928699|ref|ZP_06422545.1| ## NR: gi|288928699|ref|ZP_06422545.1| hypothetical protein HMPREF0670_01439 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 157 1 157 157 320 100.0 2e-86 MALTIPFLLLAFVGCSKDDAEKSDLKVRFKVLNAKGEETSVFKHGEDILFELTVMNEGEK PVEAWKIIDLRDVFHVYTSDGKDMGKPWDEIFDPFYVMRYIPAHGDFNWLCSWLEKKNFF VQKRPRTALPVGDYVARFNVTLENKHMVECKVDFKVE >gi|283510552|gb|ACQH01000067.1| GENE 49 75107 - 75331 316 74 aa, chain + ## HITS:1 COG:no KEGG:BDI_2160 NR:ns ## KEGG: BDI_2160 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 70 1 70 208 77 54.0 2e-13 MINGLEHIGNIPISTSALSSLYPEMEAGNQKVRNLELGGKLIRLKKGLYIVNPTVSRVAL STELIANHIYVMQS >gi|283510552|gb|ACQH01000067.1| GENE 50 76709 - 76966 142 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928701|ref|ZP_06422547.1| ## NR: gi|288928701|ref|ZP_06422547.1| hypothetical protein HMPREF0670_01441 [Prevotella sp. oral taxon 317 str. F0108] # 1 85 1 85 85 159 100.0 4e-38 MQTSRFAKEFRPYELTSAIGWNITSRFFANLQGEAAVALFKVDGKKDYSTNYTLGLNTGY TFVKNDLCSIDARVGAGTKLGKSDA >gi|283510552|gb|ACQH01000067.1| GENE 51 77374 - 78000 607 208 aa, chain + ## HITS:1 COG:no KEGG:BDI_2160 NR:ns ## KEGG: BDI_2160 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 208 1 208 208 268 66.0 8e-71 MINGLEHIGNIPISTSALSSLYPEMKAGNQKVRNLELAGKLIRLKKGLYVVNPTVSRVAL STELIANHIYAPSYISMSSALRYYGLIPETVYSVQSMTIKHSRSFNTPIGLFDYTSINRD AFHIGVTSVNKQSYSFLIATPEKALCDLIANSPLVNLRYLKDVETYLEQDIRMNIDELRK MNLTIFEQYAKVGKKGKSIQTLINYLNR >gi|283510552|gb|ACQH01000067.1| GENE 52 77997 - 78821 547 274 aa, chain + ## HITS:1 COG:no KEGG:BDI_2159 NR:ns ## KEGG: BDI_2159 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 2 272 3 273 275 420 76.0 1e-116 MNEIYNLMLSAYEQTTEQQKRNATFEVNQQIILAGLYNGGFFDEAAFYGGTCLRIFHGLQ RFSEDMDFSLLAPNKAFDFSRYFQPIIDEFALVGRKVEITKKDKKAFCKVESAFLKDNTD VYDVTFQTEKSVKIKIEVDTQPPLKFNTEQKLLLLPKSFMVRCYTLPNLFAGKIHALLYR TWKNRVKGRDWYDFEWYIRHNTPLDFNHLHERTLQFNQEDISKELFLEKLKQRLSTADIN QVKADVLPFVRNPMELNIWSNDYFLQLASNIKFM >gi|283510552|gb|ACQH01000067.1| GENE 53 79590 - 80153 406 187 aa, chain - ## HITS:1 
COG:no KEGG:no NR:gi|288928705|ref|ZP_06422551.1| ## NR: gi|288928705|ref|ZP_06422551.1| hypothetical protein HMPREF0670_01445 [Prevotella sp. oral taxon 317 str. F0108] # 1 187 1 187 187 370 100.0 1e-101 MNIAKLRISMAFLFGCLIFSSCTTCNEYDYQPTIAALRQNTILLSLLDAKGQAYDYELLM QKKNFSVYGLQSKQGFSVRINADNSLGEKRLEFIAELPDEKDMTYNANRSHGEGYATTQL NIEGKTIDLRIDFVYGAAEDKDMLADSSIAIERIFYNGVEIKPRKTSYPQFLLTLQKDDK GDFHIVE >gi|283510552|gb|ACQH01000067.1| GENE 54 80140 - 81432 260 430 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928706|ref|ZP_06422552.1| ## NR: gi|288928706|ref|ZP_06422552.1| putative pyrogenic exotoxin B [Prevotella sp. oral taxon 317 str. F0108] # 1 430 1 430 430 842 100.0 0 MKKQLFFAALAACFYSCSDSPMAYEPIKNDHSRSELQLSLKQATAYANLFYEGLSRNDSS YVPTRAVSDSKIGKIDYLVEGNDTLLYAVNYSGGNGYVILSGSNTSFPIVAHSGTGTLSF NDVNPDNPLYSLINSYKLRVKGELHDQSLTKQKYFDEWKDLGKDGYSYEVVLTNTEPKET RGRRRYSSGKKSIYPYTGKDLDVWHQGGGYNYYADNKYYIGCPAIAIGMLMYDTSQRPNG SMTTTKPQFLYSDKYDITNIDGADLARRLRQIADSIPGYSFGPKASGALPDNITTGLHKL GYNQAQIVPYDFELLYTNLNFTRKGYFGDDLNCNRGLLIFGYSPYNTGHIWFCDGYYEQS YTVRKKFLGMTIKTWTEYDDRVYMNWGFGKNAGNGWYCATEDFWGSLDGAHMPYYLNVYM FINLSHYEYR >gi|283510552|gb|ACQH01000067.1| GENE 55 81622 - 82461 618 279 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_1697 NR:ns ## KEGG: Fjoh_1697 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 8 278 7 279 284 133 30.0 7e-30 MYRLLFPLLLAFFCLSCEETEVKQAPAQLVVEGWIDSGGFPIVKLTTTVPISKRLQSTDS LDRFLLRWAKVTVSDGTREVVLTGMPHRDYFPPYIYTTSDMRGEVGKTYTLRVDFQNFHA HAVTTIPKPVALTSIKAEELSRFPGWYKLRVGFTDNPATTNYYKVFVRQYSYSQDFHSTY LGTYNDRLLPSKASILVNNSRQNVTSPNATFFKRYEFVTIKFCHIDSVSYEFWKSMDDYQ GEGRNPLFNSMQNIRTNIHGGLGYWCGYGASYHTIVVEG >gi|283510552|gb|ACQH01000067.1| GENE 56 82465 - 84306 1606 613 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_1696 NR:ns ## KEGG: Fjoh_1696 # Name: not_defined # Def: TonB-dependent receptor # Organism: F.johnsoniae # Pathway: not_defined # 1 613 135 756 756 252 27.0 3e-65 MPKLFGNADPLRYAQSLPNVQTNGELDAGLHIEGCDNAHNEMSLNGVPVHNAAHLLGIFS 
VFNASHYAQMRFSPTAVSAGHANRLGGFVDLCSPDTIAHKLGGEGSIGLISSQATLRVPI NGQMELTLSGRSAYINLLYHRLLNMNEMALKYGFYDANVTWTYRPDDNNMLRIDYYSGQD DGELNEKISVYKGKMKWANHLGALHWTRRMGRTTLQQTAYLSAYGNDFYTGFENTHLRLP SSITDWGYKAHLSFAPLSLHAAVVYHRLQLQSPEVSGDRQYINPAPQRQRTLETSLGADM ALPLAEQLKLTGGLRATIYRSPDATFYGLDPLLHLVWQSPVGAFLVSGAIKRQYLFRTGF SSLNLPSEFWLSSSAVHRPQRAINLSANYRTALFKGAFQLTAGVYHKWLYHQQEYVSTPF EVIFNPSDNVSKHLVEGQGRNYGAHVMLEKRSGYLTGWVSYAWGRAQRHYPGTALLGEYP ASHERVHELNAVAAYQLSPRWRLAATLVWASGTPFTAPENFYILDGNLVSQFGPHNGARL STYVRLDLSANYLLYSRNGREHGLNLSVYNANMRRNDLFWRIRIDGDGYSYRPLSFMPQQ MPFLPSISYFFKF >gi|283510552|gb|ACQH01000067.1| GENE 57 84305 - 84529 56 74 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSFRKSRFSVICPKALARNLPRKRGATMSTLRKFAAFSVSAVCPLAGKRAVAKSKANHVR GFMWWAKWRIIMVV Prediction of potential genes in microbial genomes Time: Sat May 28 01:24:26 2011 Seq name: gi|283510551|gb|ACQH01000068.1| Prevotella sp. oral taxon 317 str. F0108 cont2.68, whole genome shotgun sequence Length of sequence - 174432 bp Number of predicted genes - 117, with homology - 108 Number of transcription units - 79, operones - 27 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 109 - 158 2.2 1 1 Tu 1 . - CDS 168 - 1499 1599 ## gi|288928709|ref|ZP_06422555.1| hypothetical protein HMPREF0670_01449 - Prom 1522 - 1581 4.8 2 2 Tu 1 . - CDS 1639 - 2616 1086 ## COG0142 Geranylgeranyl pyrophosphate synthase - Prom 2637 - 2696 6.2 + Prom 2457 - 2516 3.5 3 3 Tu 1 . + CDS 2706 - 5489 3066 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains + Term 5539 - 5581 2.4 4 4 Tu 1 . - CDS 5903 - 7177 750 ## Amet_3117 beta-lactamase domain-containing protein - Prom 7264 - 7323 2.2 - Term 7280 - 7340 6.3 5 5 Op 1 . - CDS 7341 - 7838 470 ## Cpin_4709 DNA polymerase III, alpha subunit 6 5 Op 2 . - CDS 7916 - 8503 494 ## gi|288928715|ref|ZP_06422561.1| hypothetical protein HMPREF0670_01455 - Term 9571 - 9634 16.9 7 6 Op 1 . 
- CDS 9651 - 10229 623 ## COG3059 Predicted membrane protein - Prom 10334 - 10393 4.6 - Term 10285 - 10333 4.2 8 6 Op 2 . - CDS 10446 - 12461 2005 ## PRU_2529 hypothetical protein - Prom 12483 - 12542 5.2 - Term 12581 - 12637 3.1 9 7 Tu 1 . - CDS 12669 - 13271 335 ## PROTEIN SUPPORTED gi|71274727|ref|ZP_00651015.1| Ham1-like protein - Prom 13340 - 13399 2.9 - Term 13341 - 13398 2.1 10 8 Op 1 . - CDS 13446 - 14729 1048 ## COG0612 Predicted Zn-dependent peptidases - Prom 14762 - 14821 2.6 - Term 14736 - 14781 12.1 11 8 Op 2 . - CDS 14823 - 16001 1379 ## PRU_2632 putative lipoprotein - Prom 16042 - 16101 6.4 12 9 Op 1 . - CDS 16294 - 17676 937 ## COG0477 Permeases of the major facilitator superfamily 13 9 Op 2 . - CDS 17758 - 17967 168 ## gi|260910747|ref|ZP_05917403.1| conserved hypothetical protein - Prom 18042 - 18101 6.4 - Term 18053 - 18119 -0.2 14 10 Tu 1 . - CDS 18334 - 18603 156 ## - Prom 18748 - 18807 5.9 + Prom 18668 - 18727 7.0 15 11 Op 1 . + CDS 18749 - 19000 222 ## gi|288928723|ref|ZP_06422569.1| conserved hypothetical protein 16 11 Op 2 . + CDS 18997 - 19272 216 ## Ppha_1443 addiction module toxin, Txe/YoeB family 17 12 Tu 1 . - CDS 19518 - 20696 1419 ## COG1301 Na+/H+-dicarboxylate symporters - Prom 20782 - 20841 2.7 18 13 Tu 1 . - CDS 21078 - 22265 1479 ## COG0426 Uncharacterized flavoproteins - Prom 22375 - 22434 4.3 19 14 Tu 1 . - CDS 22529 - 23449 1173 ## COG0524 Sugar kinases, ribokinase family - Prom 23485 - 23544 4.0 + Prom 23985 - 24044 2.1 20 15 Tu 1 . + CDS 24114 - 28328 3915 ## Cpin_1738 PKD domain containing protein + Prom 28456 - 28515 2.2 21 16 Tu 1 . + CDS 28562 - 29455 746 ## PG1685 hypothetical protein + Term 29483 - 29536 7.0 + Prom 30786 - 30845 5.9 22 17 Op 1 . + CDS 31033 - 34185 3326 ## BDI_1624 hypothetical protein 23 17 Op 2 . 
+ CDS 34210 - 35784 1292 ## BDI_1622 hypothetical protein + Term 35961 - 36009 1.2 + Prom 36581 - 36640 10.4 24 18 Op 1 23/0.000 + CDS 36678 - 38525 1974 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit + Term 38716 - 38762 4.4 + Prom 38554 - 38613 2.1 25 18 Op 2 . + CDS 38783 - 39787 984 ## COG1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit + Term 39821 - 39868 -0.7 - Term 39800 - 39847 2.9 26 19 Tu 1 . - CDS 40087 - 41790 1216 ## COG2194 Predicted membrane-associated, metal-dependent hydrolase 27 20 Tu 1 . - CDS 41963 - 42196 103 ## 28 21 Tu 1 . - CDS 42313 - 43266 1012 ## BVU_4066 hypothetical protein - Prom 43361 - 43420 3.9 29 22 Op 1 . - CDS 43599 - 46223 2918 ## COG0249 Mismatch repair ATPase (MutS family) 30 22 Op 2 . - CDS 46254 - 47069 743 ## PRU_2554 hypothetical protein - Prom 47140 - 47199 2.2 31 23 Tu 1 . - CDS 47260 - 49296 2260 ## PRU_2555 putative lipoprotein - Prom 49351 - 49410 5.1 + Prom 49883 - 49942 2.5 32 24 Tu 1 . + CDS 50086 - 50523 262 ## gi|288928741|ref|ZP_06422587.1| hypothetical protein HMPREF0670_01481 33 25 Tu 1 . - CDS 51811 - 53316 1730 ## PRU_2556 hypothetical protein - Prom 53426 - 53485 6.3 + Prom 53434 - 53493 4.1 34 26 Tu 1 . + CDS 53628 - 53942 355 ## BF2945 hypothetical protein + Term 54094 - 54139 8.0 - Term 54464 - 54504 11.1 35 27 Tu 1 . - CDS 54584 - 55072 746 ## COG0783 DNA-binding ferritin-like protein (oxidative damage protectant) - Prom 55188 - 55247 6.3 + Prom 55350 - 55409 3.4 36 28 Tu 1 . + CDS 55566 - 56930 1334 ## FP2425 lipoprotein precursor + Term 56976 - 57021 11.7 + Prom 57015 - 57074 3.0 37 29 Op 1 . + CDS 57231 - 58364 1093 ## TERTU_2962 putative acyltransferase 38 29 Op 2 . + CDS 58366 - 60777 2038 ## COG4258 Predicted exporter + Term 60980 - 61027 -0.1 + Prom 62012 - 62071 2.4 39 30 Tu 1 . + CDS 62095 - 62604 570 ## COG0566 rRNA methylases + Term 62608 - 62645 1.4 40 31 Tu 1 . 
- CDS 62740 - 62910 59 ## - Prom 62975 - 63034 4.9 + Prom 62660 - 62719 4.8 41 32 Tu 1 . + CDS 62855 - 63313 449 ## gi|288928750|ref|ZP_06422596.1| hypothetical protein HMPREF0670_01490 + Prom 64002 - 64061 5.3 42 33 Tu 1 . + CDS 64089 - 65189 1370 ## COG0012 Predicted GTPase, probable translation factor + Prom 65332 - 65391 4.2 43 34 Op 1 . + CDS 65466 - 66311 858 ## COG0682 Prolipoprotein diacylglyceryltransferase 44 34 Op 2 . + CDS 66401 - 68782 2559 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 68937 - 68986 9.7 - Term 68924 - 68974 3.1 45 35 Tu 1 . - CDS 69157 - 69534 389 ## COG0239 Integral membrane protein possibly involved in chromosome condensation - Prom 69568 - 69627 2.8 46 36 Tu 1 . + CDS 69550 - 69789 118 ## - Term 70343 - 70377 2.7 47 37 Op 1 . - CDS 70436 - 70837 462 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases 48 37 Op 2 . - CDS 70893 - 71357 601 ## COG0782 Transcription elongation factor - Prom 71524 - 71583 3.9 + Prom 73348 - 73407 7.0 49 38 Op 1 . + CDS 73445 - 73930 563 ## COG0526 Thiol-disulfide isomerase and thioredoxins 50 38 Op 2 . + CDS 73942 - 74307 62 ## 51 38 Op 3 . + CDS 74352 - 76136 1767 ## PGN_0121 hypothetical protein + Prom 76147 - 76206 2.3 52 39 Tu 1 . + CDS 76308 - 77096 639 ## PG2173 outer membrane lipoprotein Omp28 + Term 77116 - 77157 1.3 + Prom 77391 - 77450 4.6 53 40 Tu 1 . + CDS 77480 - 78151 518 ## gi|288928762|ref|ZP_06422608.1| hypothetical protein HMPREF0670_01502 54 41 Op 1 . + CDS 78372 - 79685 1037 ## gi|288928763|ref|ZP_06422609.1| hypothetical protein HMPREF0670_01503 + Prom 79810 - 79869 2.0 55 41 Op 2 . + CDS 79928 - 81241 1074 ## gi|288928764|ref|ZP_06422610.1| hypothetical protein HMPREF0670_01504 + Term 81351 - 81416 12.4 + Prom 82417 - 82476 5.3 56 42 Tu 1 . 
+ CDS 82551 - 85367 2196 ## COG5373 Predicted membrane protein + Prom 85392 - 85451 4.3 57 43 Op 1 1/0.000 + CDS 85484 - 86068 412 ## PROTEIN SUPPORTED gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent + Prom 86071 - 86130 1.9 58 43 Op 2 11/0.000 + CDS 86220 - 86786 697 ## COG0450 Peroxiredoxin + Term 86846 - 86895 9.9 + Prom 86866 - 86925 4.6 59 43 Op 3 . + CDS 87112 - 88671 395 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 + Term 88766 - 88807 10.0 - Term 88915 - 88954 5.1 60 44 Tu 1 . - CDS 88979 - 89143 232 ## COG1592 Rubrerythrin - Prom 89174 - 89233 4.5 61 45 Op 1 . - CDS 89260 - 90357 874 ## COG0859 ADP-heptose:LPS heptosyltransferase 62 45 Op 2 . - CDS 90382 - 90984 840 ## PRU_2563 hypothetical protein - Prom 91107 - 91166 6.2 63 46 Tu 1 . + CDS 92620 - 92904 68 ## - Term 92728 - 92762 -0.3 64 47 Op 1 . - CDS 92867 - 94315 1270 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 65 47 Op 2 . - CDS 94308 - 94670 342 ## Coch_1765 dehydratase 66 47 Op 3 . - CDS 94654 - 95052 346 ## gi|288928777|ref|ZP_06422623.1| hypothetical protein HMPREF0670_01517 67 47 Op 4 . - CDS 95074 - 95700 688 ## Cpin_1869 outer membrane lipoprotein carrier protein LolA - Prom 95722 - 95781 4.0 - Term 95955 - 95995 -0.9 68 48 Op 1 . - CDS 96125 - 96484 327 ## gi|288928779|ref|ZP_06422625.1| membrane protein 69 48 Op 2 27/0.000 - CDS 96463 - 98211 1339 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase 70 48 Op 3 27/0.000 - CDS 98208 - 98465 458 ## COG0236 Acyl carrier protein 71 48 Op 4 . - CDS 98561 - 100321 1669 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase - Prom 100454 - 100513 2.3 72 49 Tu 1 . - CDS 100525 - 100941 534 ## COG0824 Predicted thioesterase - Prom 100967 - 101026 8.3 73 50 Op 1 . - CDS 101230 - 101673 507 ## ZPR_2773 FabZ-like protein 74 50 Op 2 . - CDS 101631 - 102533 869 ## COG4261 Predicted acyltransferase 75 50 Op 3 . 
- CDS 102558 - 102806 472 ## Cpin_1856 phosphopantetheine-binding protein - Term 102967 - 103003 3.3 76 51 Tu 1 . - CDS 103119 - 104213 1392 ## COG0500 SAM-dependent methyltransferases 77 52 Op 1 . - CDS 104414 - 104986 636 ## gi|260910716|ref|ZP_05917375.1| conserved hypothetical protein 78 52 Op 2 11/0.000 - CDS 105079 - 106296 1491 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase - Prom 106328 - 106387 4.3 79 52 Op 3 . - CDS 106609 - 107331 252 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 - Prom 107371 - 107430 4.6 - Term 108926 - 108971 13.2 80 53 Tu 1 . - CDS 109006 - 112773 3583 ## COG4724 Endo-beta-N-acetylglucosaminidase D - Prom 112798 - 112857 1.9 + Prom 112486 - 112545 5.0 81 54 Tu 1 . + CDS 112643 - 112912 63 ## + Term 113034 - 113079 1.1 82 55 Tu 1 . - CDS 112959 - 113489 569 ## BF0685 putative RNA polymerase ECF-type sigma factor - Prom 113625 - 113684 3.5 - Term 113771 - 113817 7.5 83 56 Tu 1 . - CDS 113824 - 116334 3147 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Term 117791 - 117848 -0.9 84 57 Tu 1 . - CDS 117856 - 119172 1205 ## COG3177 Uncharacterized conserved protein + Prom 119642 - 119701 3.5 85 58 Op 1 . + CDS 119730 - 121019 1310 ## COG3681 Uncharacterized conserved protein + Prom 121021 - 121080 4.1 86 58 Op 2 . + CDS 121110 - 123170 2013 ## COG4232 Thiol:disulfide interchange protein + Term 123211 - 123262 11.2 - Term 123461 - 123507 8.1 87 59 Tu 1 . - CDS 123611 - 125179 1121 ## gi|288928799|ref|ZP_06422645.1| lipoprotein - Prom 125402 - 125461 4.2 88 60 Tu 1 . - CDS 125482 - 125658 185 ## gi|288928800|ref|ZP_06422646.1| hypothetical protein HMPREF0670_01540 + Prom 125924 - 125983 5.7 89 61 Tu 1 . + CDS 126017 - 126202 84 ## - Term 126457 - 126505 10.1 90 62 Tu 1 . - CDS 126546 - 129563 2930 ## COG4724 Endo-beta-N-acetylglucosaminidase D - Prom 129585 - 129644 2.8 - Term 129625 - 129663 9.2 91 63 Op 1 . 
- CDS 129700 - 132726 2956 ## COG4724 Endo-beta-N-acetylglucosaminidase D - Prom 132747 - 132806 3.5 92 63 Op 2 . - CDS 132904 - 134724 1645 ## Fjoh_4558 hypothetical protein - Prom 134756 - 134815 4.2 93 64 Tu 1 . - CDS 134829 - 138254 3082 ## BVU_2204 hypothetical protein - Prom 138341 - 138400 6.6 94 65 Tu 1 . - CDS 140147 - 141136 794 ## COG1120 ABC-type cobalamin/Fe3+-siderophores transport systems, ATPase components - Prom 141236 - 141295 2.2 95 66 Op 1 . - CDS 141313 - 141945 706 ## COG4122 Predicted O-methyltransferase 96 66 Op 2 . - CDS 141949 - 142365 476 ## COG0757 3-dehydroquinate dehydratase II - Prom 142414 - 142473 4.7 + Prom 142327 - 142386 6.6 97 67 Tu 1 . + CDS 142411 - 143325 938 ## COG4974 Site-specific recombinase XerD - Term 143469 - 143513 2.7 98 68 Tu 1 . - CDS 143542 - 144570 938 ## PRU_2865 M20/M25/M40 family peptidase (EC:3.4.-.-) - Prom 144643 - 144702 3.6 99 69 Op 1 . - CDS 144907 - 146745 1777 ## COG2355 Zn-dependent dipeptidase, microsomal dipeptidase homolog 100 69 Op 2 . - CDS 146906 - 147373 615 ## COG1522 Transcriptional regulators - Prom 147426 - 147485 8.2 - Term 147524 - 147575 15.7 101 70 Op 1 . - CDS 147591 - 148550 1109 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 102 70 Op 2 . - CDS 148568 - 149443 1136 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 103 70 Op 3 . - CDS 149589 - 150194 747 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 - Prom 150231 - 150290 9.7 + Prom 150179 - 150238 6.5 104 71 Tu 1 . + CDS 150457 - 151161 919 ## COG0846 NAD-dependent protein deacetylases, SIR2 family 105 72 Tu 1 . - CDS 152302 - 152565 76 ## - Prom 152781 - 152840 7.0 + Prom 152745 - 152804 6.0 106 73 Tu 1 . + CDS 152902 - 153321 345 ## gi|288928816|ref|ZP_06422662.1| hypothetical protein HMPREF0670_01556 + Term 153345 - 153397 7.1 107 74 Tu 1 . + CDS 153570 - 156809 3496 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 156932 - 156969 6.0 - Term 156920 - 156958 8.3 108 75 Op 1 . 
- CDS 157036 - 157218 98 ## gi|288928818|ref|ZP_06422664.1| hypothetical protein HMPREF0670_01558 109 75 Op 2 . - CDS 157232 - 158479 897 ## PRU_0337 hypothetical protein - Prom 158506 - 158565 6.4 + Prom 159179 - 159238 4.2 110 76 Tu 1 . + CDS 159309 - 162134 3531 ## COG0612 Predicted Zn-dependent peptidases + Term 162197 - 162261 13.2 111 77 Tu 1 . - CDS 162670 - 163137 -77 ## gi|288929076|ref|ZP_06422922.1| hypothetical protein HMPREF0670_01816 - Prom 163340 - 163399 5.2 + Prom 163707 - 163766 3.5 112 78 Op 1 . + CDS 163815 - 166601 3316 ## SRU_0096 TonB-dependent receptor domain-containing protein 113 78 Op 2 . + CDS 166659 - 168017 1578 ## Fjoh_0764 hypothetical protein 114 78 Op 3 . + CDS 168105 - 169625 1199 ## gi|288928824|ref|ZP_06422670.1| hypothetical protein HMPREF0670_01564 115 78 Op 4 . + CDS 169672 - 172596 2439 ## COG0612 Predicted Zn-dependent peptidases 116 79 Op 1 . - CDS 172859 - 173668 497 ## gi|288928826|ref|ZP_06422672.1| hypothetical protein HMPREF0670_01566 117 79 Op 2 . - CDS 173695 - 174057 171 ## gi|288928827|ref|ZP_06422673.1| hypothetical protein HMPREF0670_01567 - Prom 174256 - 174315 7.3 Predicted protein(s) >gi|283510551|gb|ACQH01000068.1| GENE 1 168 - 1499 1599 443 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928709|ref|ZP_06422555.1| ## NR: gi|288928709|ref|ZP_06422555.1| hypothetical protein HMPREF0670_01449 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 443 1 443 443 825 100.0 0 MREFRRFGYLMLGASLVFGVAACGSSNDDKPTIPGGTDPGNKTEINTELTPEQKKLELDK TAKRALAMVKSTDFNNIENIANYVHDNLSSEHATAKINKWWKDKLKSFWTDLGKREKDMK VMQRMIDLSQINGHFKVVNGQWVREDANDLQFVFNDNNGKECVLKLVLSGKQTNMYVPFL DDEEREYKGNGYYDHTKIETKLNIPEHATLTLTQGGNTLMTGELNTKVSTNGTLDVAKDN IEANCKLTINNYVVEVKRALFDAAKGAKADAAVTIGNQKLFDMVISVNGKATNDRIQSVG EVNMAFDILGEVQFKGKIDDGSTFNKWYRKLEEGGMFKNNSFVYYTESESKGFVEQANNH LNAGIFLKGSDKRSANIELAVFAEQVHNNLTLEYKVMWKPYALLKFDDKTSYGFNDYFTK KNFPDVYKQASDLIKSFERMFNK >gi|283510551|gb|ACQH01000068.1| GENE 2 1639 - 2616 1086 325 aa, chain - ## HITS:1 COG:RSc2823 KEGG:ns NR:ns ## COG: RSc2823 COG0142 # Protein_GI_number: 17547542 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Ralstonia solanacearum # 5 258 4 257 322 177 41.0 3e-44 MDYLSTIRTPIHTELKDFIALFDESLTHKDGLLGQALEHVRAKGGKRMRPILILLMTKSF GKVTSVAQYAAVGLELLHTASLVHDDVVDESGERRGQASVNASYNNKVAVLVGDYILSTA LLNIARTNDCDIVCDLAELGRTLSNGEILQLTNIANTNFSEEVYFEVIRQKTAALFEACA VIGAKAGGADRQAVEAARAFGRNVGVIFQIRDDIFDYYQSADLGKPTGNDMAEGKLTLPV LYALNTTGDTQMAALARKVKARNVSAEEIAQLVEFTKNNGGIAYAEQKMAQCKVLTDGFI NQYIAQPQLQDALKAYVSFIIDRNN >gi|283510551|gb|ACQH01000068.1| GENE 3 2706 - 5489 3066 927 aa, chain + ## HITS:1 COG:HI0856_2 KEGG:ns NR:ns ## COG: HI0856_2 COG0749 # Protein_GI_number: 16272796 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Haemophilus influenzae # 276 927 1 648 648 509 43.0 1e-143 MDKLFLIDAYALIYRSYYAFINNPRINSKGLNTSAVMGFCNTLNEVLNKEQPSHIGVAFD HGLTFRNEAFPQYKAQREATPEDIKRSVPIIKDILTAYHIPVLQVDGFEADDVIGTLALK AGEQGIDTYMLTPDKDYAQLVRPNVYMYRPRHGGGYETMGPDEVNQKYNISSPLQVIDLL ALMGDSADNFPGCPGVGEKTASKLINEFGTVEKLLENTAQLKGKMREKVEGAVEDIKMSK FLATIRTDVPLELDLEQLRLKEPDTAKLQEIFTELEFKSFANRLLNKAEKPKKADNRQLS LFDEPAMAEADNSRFEPLSAPVANRQTIETTPHEYHLVQTEADIQALVNLLSAADVISLD TETTSTNAIDAQLVGLSFAVEEKKAYYVPVPEQANEAQNIVDKFKAIYENPNTLKVGQNI KYDLEVLRNYGIMLQGPLFDTMIAHYLLQPELRHNMDFMAEVYLNYETVHIDALIGAKGK 
TQKNMRELAPSEVYAYACEDADITLQLKNVLQPKLVEAGVERLFNEVEMPLIPVLAEMEC NGVRIDTAALKETSQVFTERMLQLEQEIYQAAGKTFNVASPKQVGDILFGEMKIIDKPKK TKTGQYVTSEEVLQTLRSKHPIVAHILDYRALKKLLGTYVDALPKLINPRTGHIHTSFNQ AVTATGRLSSSDPNLQNIPVRGEDGKEIRKCFIPEEGCEFFSADYSQIELRVMAHLSQDA NMLDAFREGYDIHSATAAKIYDKPVSEVTRDERTKAKRANFGIIYGITVFGLADRLNIER AEAKQLIDGYFKMFPQVRDYMEQAKETAKANGYVETFFHRRRYLPDINSSNATVRGIAER NAINAPIQGSAADIIKVAMVRIFQRFQRENIRSKMILQVHDELNFSVLPTEKELVERIVM EEMQAAYPLDVPLVADGGWGNNWLEAH >gi|283510551|gb|ACQH01000068.1| GENE 4 5903 - 7177 750 424 aa, chain - ## HITS:1 COG:no KEGG:Amet_3117 NR:ns ## KEGG: Amet_3117 # Name: not_defined # Def: beta-lactamase domain-containing protein # Organism: A.metalliredigens # Pathway: not_defined # 5 374 7 389 430 181 31.0 4e-44 MEEIVSKSQIEINIHRGANQIGGCITEIATAQSRIIIDLGSNLPGSKVKELMAEQIAEVV SGVDAIFYTHYHGDHVGLFQFVPDSIPQYIGEGALKVMKCKYETLSRHENIEQELSAINR MKTYQVNCTIQIGDIRITPYYCSHSAFDSYMFKIETQGRVILHTGDFRRHGYLGKSLDGV LKKYIRQVDILITEGTMLSRPHEKVLPEYAIQANTIKVLKRHKYVFALCSSTDMERLASF HAGCKQTGRVFYCDRYQKKILDIFTNYTQAELFQFTHIFELTSYKANRVKRKLQHEGFLM PVRASQINLIKAIMQVYGDEPACLIYSMWQGYHNGKEENRIPGIIEIRNLFGTRIFDGTA HGFHTSGHADVATLGHLCQLVRPRMGVVFIHKEAQTTGKALHLPPDIHVIEKDERIMEVA IKQN >gi|283510551|gb|ACQH01000068.1| GENE 5 7341 - 7838 470 165 aa, chain - ## HITS:1 COG:no KEGG:Cpin_4709 NR:ns ## KEGG: Cpin_4709 # Name: not_defined # Def: DNA polymerase III, alpha subunit # Organism: C.pinensis # Pathway: Purine metabolism [PATH:cpi00230]; Pyrimidine metabolism [PATH:cpi00240]; Metabolic pathways [PATH:cpi01100]; DNA replication [PATH:cpi03030]; Mismatch repair [PATH:cpi03430]; Homologous recombination [PATH:cpi03440] # 15 134 320 441 1214 62 32.0 8e-09 MERTKRRKAELAKRTKDESLQFQVEWGIHNLGLELTDEMRTRIDDELRLLLNLGLADNLL TLKAIVDGARQDLEAVPEPFKGNLAGSIVAYCLGITTGNPLEKNLLKPVDEYALPLQLTL YYDNAVRNRVVDWIKAHGYQEVKTRLSQPILKMDKMVVEFKRVVK >gi|283510551|gb|ACQH01000068.1| GENE 6 7916 - 8503 494 195 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928715|ref|ZP_06422561.1| ## NR: gi|288928715|ref|ZP_06422561.1| hypothetical 
protein HMPREF0670_01455 [Prevotella sp. oral taxon 317 str. F0108] # 1 195 1 195 195 342 100.0 7e-93 MKQENHPVRYSTEISDSAERALQRAIILSSVSNLNGKEVEWLDIEIPVDYSGKPRGKSID LIGKDADGKYVLCEVKFRKKSSDNDTPEEAAKQLKRYHELIKENYEKIHGHKENGKAVDW EEVASDRTRLVVAANNSYWENWDEKSINGWKFDTSNVELYSIAVDEFEFEKQKGIEKKYT PNMPSEAKTWSLIEK >gi|283510551|gb|ACQH01000068.1| GENE 7 9651 - 10229 623 192 aa, chain - ## HITS:1 COG:ykgB KEGG:ns NR:ns ## COG: ykgB COG3059 # Protein_GI_number: 16128286 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 8 189 7 187 200 187 55.0 1e-47 MKSKVIAFLELASNLRQTGLNLIRVAILIVFVWIGGLKFTHYEADGIVPFVANSPFMSFF YAKEAPEYKQYKNKEGELVLKNRQWHEENRTYVFSYGLGCLIMSIGILVFLGMFSPKIGI FGELLCIIMTLGTLSFLITTPECWVPDLGSGEHGFPLLSGAGRLVIKDTVILAGAITLLS DTAGKLLKQLKK >gi|283510551|gb|ACQH01000068.1| GENE 8 10446 - 12461 2005 671 aa, chain - ## HITS:1 COG:no KEGG:PRU_2529 NR:ns ## KEGG: PRU_2529 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 9 671 5 634 634 540 43.0 1e-152 MRKPTIIRLFLALLLCFCLPVRASLWRSYTAYSDITQVQKAGNTLYVLASKRLFAYNTAD ESIQTFDRINGLSDTPIAFIAWNNVAKRLVVVYDNQNIDFVMPNGEVQNLSDLYHKNMMV DKTVFGINTYGANTYLATGFGIVKLDARKMQINDTYNLGFKVNWTRIEGNRIWAYSDVVG CYAAPLTANLNDKNSWTRVGEYQPNPQAQPDAKLLEMAKKANVGGPRYNNFYFMKFTAGK LFTVGGQYASGGTQKGYPGLVQVLESNGSWTFFQDDVAATTGIAFNDINCLAVDPLDPNH VFVGGRPGLYEFQDGRLKAYFNRDNSPLLPAVDGVVELDNNYVIVNGVDFDAQGNLWLVN SQTKRQSLLTIPRGKQMESRHQERLMADGLSLRNMTKMLVDSRGWVWFCNDHWTVPSLTC YKPATGECTVYDTFTNQDGVAIHVNAVTCVAEDREGNLWVGTNAAPLYLQIRNNQAEKHF VQFKIPRNDGTNSADYLLNEVHITHIAIDGANRKWMGTLNDGAYLISADNLQQVQHFTSA NSALSSNVLNCIAINPTTGEVFFATDNGLCSYQGDATEPQTGDDRQAYAYPNPVEPGYKG LIRIVNLSPNAYVKIVSPNGTLVKEGRSNGGMFTWDATDLRGQRVVSGVYVVLEATEEGK SGVACKVAVVN >gi|283510551|gb|ACQH01000068.1| GENE 9 12669 - 13271 335 200 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|71274727|ref|ZP_00651015.1| Ham1-like protein [Xylella fastidiosa Dixon] # 1 194 1 191 200 133 44 5e-30 
MNTTIVFATNNAHKLEEIRQIMPSNLQMLSLKDIGCDVDIPETGSTLQENALIKAQYVLE HYGMACFADDTGLEVYALNNEPGVYSARYAGGDGHDSEANMHKLLCRLADNNHRDARFRT VIALVAPPNNRLGIDQPLFFEGIVEGHIATERHGTAGFGYDPLFVPNGYDKTFAELGTDI KNQISHRARAVGKLVQFLKG >gi|283510551|gb|ACQH01000068.1| GENE 10 13446 - 14729 1048 427 aa, chain - ## HITS:1 COG:BH2405 KEGG:ns NR:ns ## COG: BH2405 COG0612 # Protein_GI_number: 15614968 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Bacillus halodurans # 16 423 3 405 413 223 32.0 5e-58 MTTSSNTAIRMDNKYNTATLANGLRIIHRSSSSPVVYCGFQINAGTRNEKDDEMGMAHFC EHASFKGTSKRTPLSILNCLESVGGDINAFTNKEHTVYYAAIPKGHAPRAVKLLTDMVFD SQYPAAELRKEIEVICDEIESYNDSPAELIYDDFENAVFSGHPLGHNILGKASLLRTYTS EHAKDFTRRMYRPNNMVFFAYGELDFHWLVRSLKHATQHFPNALPHIDTHEGEPLPPYQP SEIIRQMDTHQAHVMLGNRAFSTYDKRRLPLYLANNLLGGPGMNARLNIALRERNGLVYN VESNLVSYADTGVWCVYFGCDPKDLRRCLRLVKKELNRLIEKPLSARQLAAAKRQIKGQI CVACDNRESFALDFGKSFLHFNKEKHIDNLLQQIDAITAEELQSVAREVFAEDKLTTLIY QGGSSKE >gi|283510551|gb|ACQH01000068.1| GENE 11 14823 - 16001 1379 392 aa, chain - ## HITS:1 COG:no KEGG:PRU_2632 NR:ns ## KEGG: PRU_2632 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 21 392 16 384 384 438 58.0 1e-121 MNNKKLSGLFYTAVLGVAVLSLGSCSEKKFNINGTITQAKDSVLYLENMSLNGPKAVDSV KLDENGNFEFKQKAPDAPEFYRLRIANQMVNLAVDSTESITVKAAYPTMSVNYEVSGSEE CAKIRDLAYMQLALQRQVTAIANSPTLGVQAVEDSVTKVLEVYKNKVKLNYIFKEPMKAY SYYALFQTIVLGNANILIFNPRSSKDDVKVFAAVATSWDTYFPKAERGLNLHNIAIEGMK NVRIAENNARQTISADKVKVAGVIDIALTDNHGRVRKLTDLKGKVVLLDFQAFAAEGSLK RIMMMREIYNKYHDRGFEIYQVSFDPEEHFWKTKTAALPWVSVWDENGTRSTVLSQYNVQ TLPTFFLIDRNNTLQKRDAQIKDLDAEIQALL >gi|283510551|gb|ACQH01000068.1| GENE 12 16294 - 17676 937 460 aa, chain - ## HITS:1 COG:mll1728 KEGG:ns NR:ns ## COG: mll1728 COG0477 # Protein_GI_number: 13471679 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major 
facilitator superfamily # Organism: Mesorhizobium loti # 6 455 17 467 477 266 37.0 6e-71 MKKYISRIVIALVAVAFFMEYIDSTALVTALPLMALDFGVESGRMSIGITSYMISLAVFI PVSGWIADRYGTRSVFAYAVLGFIGTSVLCSLCNSLTQFAFVRFIQGMAGALMVPVGRLA VLKNCDKRDLVNAIAWITTPGLIAPVVGPPIGGFLATYFSWHWIFFLNVPVGLLVFVLAI RYIPKQPVDRQRGKLDIPDFLLSALSLSGMMYSLELLGNNVSSFVRPLILLFLCILLFAF SIRRSLHASLPLIDYSVMRVPTYSVTIIYGTLSMMVIGAAPFLLPLLFQDGLHFNAFHSG LLLLSLMLGNLTTKPFTVWMMHHWRIKYILFLNALLLSLSTAACALFKASSPVFAIIVCL FLMGCFRSVQLSTLHTVAYMDVPQQMMSSANTLYSTTLQLASGLGIGLGALALRFVASTS ANPHSLAHYQLSFVLIAIVGLIALIGYGKMEPEAGKIITN >gi|283510551|gb|ACQH01000068.1| GENE 13 17758 - 17967 168 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260910747|ref|ZP_05917403.1| ## NR: gi|260910747|ref|ZP_05917403.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. F0295] # 1 69 1 69 98 101 82.0 2e-20 MAYIVYVCNELKFRLMATTELQSKKITIDLSEDTFRWLSIMAANRGTNLKSLIEVLLEKA AEAYDENKQ >gi|283510551|gb|ACQH01000068.1| GENE 14 18334 - 18603 156 89 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKTYPCLPIMFVDSILVIPHRLWKTLLITRVKAMYNPSLIHNLQRTPKDWWITKKLLWRN ASVINTFRATKVMNICTMPYPTLLCGKHI >gi|283510551|gb|ACQH01000068.1| GENE 15 18749 - 19000 222 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928723|ref|ZP_06422569.1| ## NR: gi|288928723|ref|ZP_06422569.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 83 1 83 83 141 100.0 1e-32 MHIITAREFRANQKKYLDLAATEKVVILRNNTPSIQLVPMLDDETTISAALLAKIEKSRK DIKAGKGKTCRSVDELNAYLEKL >gi|283510551|gb|ACQH01000068.1| GENE 16 18997 - 19272 216 91 aa, chain + ## HITS:1 COG:no KEGG:Ppha_1443 NR:ns ## KEGG: Ppha_1443 # Name: not_defined # Def: addiction module toxin, Txe/YoeB family # Organism: P.phaeoclathratiforme # Pathway: not_defined # 1 91 1 92 92 75 45.0 8e-13 MSYVLRFSPDAEKELKKWKKSAPKLFKKLSEVLSELITHPKQGIGHPEALKGGDGITYSR RISGKHRVVYDVYEQIVEVHVLSIGGHYDDK >gi|283510551|gb|ACQH01000068.1| GENE 17 19518 - 20696 1419 392 aa, chain - ## HITS:1 COG:Cgl2969 KEGG:ns NR:ns ## COG: Cgl2969 COG1301 # Protein_GI_number: 19554219 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Corynebacterium glutamicum # 1 384 1 384 412 311 46.0 2e-84 MKIGLLGRILIAIALGIALGHIFTLPWVRIFATFNSIFGQFLGFIIPLIILGLVTPAIAD IGKGAGKLLLITVGIAYADTVVAAILSYATGSTFFPSLIGNGSQTLETVEKSAEILPYFT INIPAPIDVMSALVLSFIVGLGIAYSGHTVLRGATNEMKAIIVGVIERVIIPLLPVYIFG IFLNMTFSGDVMRMMTVFAKIIVVIFALHIFVLLYQYLIAGAIVRKNPFRLLANMFPAYL TALGTSSSAATIPVALKQTRKNGVSEGVAGFVIPLCATVHLSGSAMKITACAFTIALLEG LPTDFPLFLHFIMVLAIFMVAAPGVPGGAIMASLAPLGSILGFGENEQALMIALYIAMDS FGTACNVTGDGAIALVVDKLNRNKANVEEKAA >gi|283510551|gb|ACQH01000068.1| GENE 18 21078 - 22265 1479 395 aa, chain - ## HITS:1 COG:FN0512 KEGG:ns NR:ns ## COG: FN0512 COG0426 # Protein_GI_number: 19703847 # Func_class: C Energy production and conversion # Function: Uncharacterized flavoproteins # Organism: Fusobacterium nucleatum # 3 392 6 398 403 364 44.0 1e-100 MIEIVKDIYYVGVNDRKKEKFEGLWPLPNGVSYNSYLIVDEKVCLIDTVEADFFAQFIEN IREVIGDRPIDYIVTNHMEPDHSGSFALMRKYYPNVQVVGNKKSFDLLKGFYGMEGCEYE VKNGDQLSLGKHTLKFFLTPMVHWPETMMTLDTSNNMLFSGDAFGCFGALNGGLLDSEIN CEHFWLEMVRYYSNIVGKYGTPVQNALKKLANEKIDYICSTHGPVWHEQLQKVVEMYDHM SKYETEDGLVICYGTMYGNTERMAEVIARAASQAGVKNIAVHNVSKTHHSYIIRDVFRYR GLIVGAPTYNTGLYHEMDVLLSELANRDIKNHLIGWFGSYGWASKAVQKIGEWNENGLHF EHVGTPVEMKQALSPEVAEQCRALGKAMAERLAQG >gi|283510551|gb|ACQH01000068.1| GENE 19 
22529 - 23449 1173 306 aa, chain - ## HITS:1 COG:SA1845 KEGG:ns NR:ns ## COG: SA1845 COG0524 # Protein_GI_number: 15927615 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Staphylococcus aureus N315 # 1 263 1 274 319 77 23.0 3e-14 MRKVIGVGETVLDIVFKHEQPIGALPGGSVFNTIVSLGRVGADATFISAVGNDRVGTNVI SFLKDNGVQADYVTKCDEGQTPISLAFLNERNDAEYLFYRDNKKNNPEFAYPDVQADDIV VFGSYYAVSPATRPQVVSFLEYAHNRGAIIYYDVNFRSGHRTDVMRITPNLLENYEYADV VRGSREDFEVLYRLSDPAKVYAAEVSFYCKRFVYTDGANPVQLRADGGFCESYAVSDTQP VSTIGAGDNFNAGFIFGMLKLGVTRQRLIDGLTKEQWNKIMSYALAFSAESCKGTDNYVP KGWKPE >gi|283510551|gb|ACQH01000068.1| GENE 20 24114 - 28328 3915 1404 aa, chain + ## HITS:1 COG:no KEGG:Cpin_1738 NR:ns ## KEGG: Cpin_1738 # Name: not_defined # Def: PKD domain containing protein # Organism: C.pinensis # Pathway: not_defined # 829 1315 283 774 2972 94 25.0 4e-17 MKQNQLFMVLCTLFLYAFSPLVATAQNREVRMELTPKSIDVDETPIDFYDDGGPNGQTSP EFKDGKTSSVTFIPKVTGKKVQVDFKLVDIFEGSIHKQFIRVYDGTEVRANRLLATITKG TTPIVKATNPQGALTVEFGSTTAFRSNGFKAVVSLFTPQPMRFRDVTISPMKDASLTAGD KAQPLLGFNVRTENTEPALTLNGLTLKASGPIAAIQLYQSTSATDFSTAHRLAEGTVQNG EVVLNIAAKPALRSGDNFFWLGCDVADEAENHNTLALSLAQLALSDGQHALPAPLPTGER VVENVVYALEGTHQRRVNGELTFRSKTNTYNDNYEGGSRDRITTFTPLHARHVVQIDFSK FDIVYSPRTDVGVRAKFVVYEGKGTTGRKLWELTSSADKQRGPGQTLRSQSPDGALTIVF NPNDSHYAAKGFVAKVSEYIPKAMSLASVDVRQASAAAVSPGATRQPLLLINLHTEDNLQ PITLNSLQLNLKQSLPAMAGLQLFVQPKNTPDATPSGNPLATIAQAALSQQQTMSLSTPL ALTEGDNWLLLCADVTNDAQAGKLLDAALMKVETSLGETAVQQGDPDGARRVEYERALVK GDNGRVEVADNTTLTLYDDGGKDENESKGFEGLITFVPKTAGRLIAIKPTTWKLAAADKL EVYFGTEKKDKPDFSFSRNDHFDKLISTSPNGAITLRYTTGKYVAAEGFALEVSAITPQK LAVSEVQTHAIAPEKAFKGQTDVPLLHVVVSVKGDNGQLSITQMQAQMQGDAPIHIYKVY ATGQDNAFATTQTFATNTRNNGIFDGTYTIEKEGEYHFWLAADVESEAQVGQTLTASLTN ITANNAQVTPQTAETANTIVAQGISGTLNVGPNATYKTIQAAVDALRDGVDGPVTISIES GKYNERVLIPHIPGLSPANTLTIKPTSGKRGDVHIFHDKFEKGGYDPDQMAKEYGVVTFD GADYTTLQALEISTEDLTYPGVIHLRNESRHVTIDSCYVHAPQSTNIQQKVTLLNMYAKS QANANNDHLTLRRSLLEGGYMGVRLGGTGTVALPKEVGGRILNNVFRNQESKAIYVAREA 
DAHIEGNIVENETSDHNDFNAVDIDASGDVRLARNRINLSTKAYCTAIYVRNMSGLPTSQ AQIVNNVVMVSHGTTPARRNASKALTLKGDLANLLVAHNSLCVEGNEKDITVSILGKMAD NVRLTNNLFANTGKGAVYQCTRKENLASLRLDHNNLYTNGPSLADVRTPVATLSAWMALS GERYAYNNKVEFASDVLLQPLSEPELRHATPLAVVPTDILGQQRNTLTPLIGAYEHAWNG PSTGIVAPSASANGIQLSAHKGVLSLSLLPANAVVEVFSTSGLLLSRHALPANADSLQVT DLPQGVLVVRITWNNGRWSKAIKL >gi|283510551|gb|ACQH01000068.1| GENE 21 28562 - 29455 746 297 aa, chain + ## HITS:1 COG:no KEGG:PG1685 NR:ns ## KEGG: PG1685 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis # Pathway: not_defined # 53 297 16 260 260 378 73.0 1e-103 MIRETWKGHRPYNERIGKMLSSLFASLFHPNPEKGACVRLFAYVLLGFIVSIPVQAQMLF TENLTMNIDSTKTIQGSLQPVLDFKTEKENVLTFKNTANLNLLIGKNRVVNLINKLEFSA YGKKVTVSGGHVHAEFRYLLVPAVEVYPYAESQWAGSRGMNFKVSAGLQSRYRLVNTDHF LMFATAGMFYEFEKWEYPDPPVGTPLHAYSRSIKSHVSLSFKHTIGEHWEFIATGIHQAR PDSYHKSARFGGAVDLAYHVTPSVGFRGTYRIIYDTAPIVPIRKDYNVVEAGIDISF >gi|283510551|gb|ACQH01000068.1| GENE 22 31033 - 34185 3326 1050 aa, chain + ## HITS:1 COG:no KEGG:BDI_1624 NR:ns ## KEGG: BDI_1624 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 45 1050 47 1058 1058 1225 62.0 0 MTQKARKLSKTTRVPLYGVLACALAYGMPTYAEGNVAAKPYASQAGQRAGTVLVTVVDQN NEPVVGASVVVMSTKAGAATDVNGRCSVDAKPGDKLQVSYIGFDTQTLTVGNNNNVLVTL GENKAMLDEIVVVGYGTMRKSDVTGSIAVAKGADMIKAQSFGALDNLRGKVSGVNIFSNS EQGTTSPRVIIRGMATINASTDPLYVVDGVVMNDFAMMNPNDIESIEVLKDASATAIYGA RGANGVIMVTTKRGLSGTEGTRMSYQGSVSIRHMARKMEVMNAREWCDAFMKGLENENKY QGNSWKLDRSYWFSDRRYFDANGNPLYDTDWQKEATRTAIGHNHQLNIQQGTKTSSMGAF LNYSDYQGIMKNTWTKRISGKLAYDANPTKWLTTGINLMVNHTWDRHTDEGGGGQEARRT MIEMLPWLPVRTPDGSYTTSNSPSEMVGKFGFEGMSNPVMILDMQKRLYYNTQVFGNAAF TFHILPGLDLKTQLGIDWHAEEGRRYASRELNNISQTEQGVATNGRHNNLYWQEETYLTY NRTLKGLHRVNAMLGMSWQAYASRYNYSEAQSFESDLFEDLNMGTGKNPRPPGSNYDEWK MNSYFLRLAYTYNDRYSATFTARSDGSSKFGKNNKYAFFPSAGLAWIVSEESWLKGNPWL SNLKLHTSYGLTGNSEIATYRSLARVGSGTVLLDGKRAGAAYISSLANPDLKWEKTAQFD VGFDLGLFNNAVNLDFSYYYKHTTDLLLNTPIPHSSGFGSVYKNIGEVKNSGIDFMLTAY 
PVRTKDFTWTSILNLNYNKNKILKLGKNNEDIEILEWVGGPEGILRVGESMGSFYGYKRL GVWTEEDKAAGRCTAKMVGRAKREEKKSILGKGIPDWTGSWTNTLNYKNWDLTVEMQFVT GVQTMQQFYHSVYDRFGITNGLKSILYDAYNGTNPRTMQQMLYLTNSPYDSQSHAGQDTT VDSSWVVDGSYLRCNMVQLGYTFAPATLKALGLNALRLYLSGNNLFLLCSKDFKGYDPEM TSQGSKFGQNMAFFSLPRSRVFTLGVNVTF >gi|283510551|gb|ACQH01000068.1| GENE 23 34210 - 35784 1292 524 aa, chain + ## HITS:1 COG:no KEGG:BDI_1622 NR:ns ## KEGG: BDI_1622 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 524 1 526 526 555 55.0 1e-156 MNKIMIFAACACLALASCNDFLEERPKSLLSSRDFNKAEAQIRSNINGLYRRGAVTAHSS VASAYFGSAMTIPGMLTGYFSNSYEGQERVCAYARTLTRQEQTNNVSGFVDGIWGSCYGA INVANVALKAMPEVPMSDEARKVLLGEAKFFRAFNYFMLVKWFGDVPMPTQPSENVTDLE KPRTPVADIYTLIEQDLTEAVAALPDQTWQNNGHRISKYVALQALTAVYMQQGKYQQASE TAKQIISSGKYSLTENEDKALKSAYNKLRTIDDLPEVLYAVEYDADISTTSWRPTYAFSS SAINVFKSYNIFERVYGPNARYLNIYTADDLRGQEQQFFATKYKNPNNDSVWTAPSADAR GCWYYFDEEAVLKTGRGTKDWNIYRYAETLLDAAESLVQSGGAVTAEAAGYLAMVQARAL GKTQAELTTELSALSKEAFIEACWTERLREFPLEYKIWDDCLRTKKFPDVSATELGKVTY VDLVGATNGSGAVIKSSDLLWPLSLNEMQRNPQLRPQNEGYADK >gi|283510551|gb|ACQH01000068.1| GENE 24 36678 - 38525 1974 615 aa, chain + ## HITS:1 COG:MT2530_2 KEGG:ns NR:ns ## COG: MT2530_2 COG0674 # Protein_GI_number: 15841979 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Mycobacterium tuberculosis CDC1551 # 217 583 3 368 425 399 57.0 1e-111 MSEELEVKELDQVVVRFSGDSGDGMQLAGNIFSTISATVGNDISTFPDYPADIRAPQGSL NGVSGYQVHIGEDKVLTPGDFCDVLVAMNAAALKTQYKYARPQATIIIDTDSFGPKDLKK AEFKTEDYLAELGIDADCVVACPITTMVKECLKDTGMDNKSMLKCRNMFALGIVCWLFNR DMELAFDFLRKKFHKKPHVAESNIKVVMAGYDYGHNTHASVPATYRIESKVKTKGRYMDI TGNKATAYGLIAAAEKAGLQLFLGSYPITPATDILHELAKHKSLGVKTVQCEDEIAGCAS AIGASFAGALAATSTSGPGICLKSEAINLAVIDEIPLVVIDVQRGGPSTGLPTKSEQTDL LQALYGRNGESPMPVIAATSPTDCFDSAYMAAKIALEYLTPVILLTDGFLGNGSAAWQLP DINKLPDIHPHFATEEMRGNYTPYRRDEETKARYWAIPGTPGYEHILGGLEKDGQTGNIS 
TEPENHNLMVHLRAQKVAGIKVPDVQVQGDDDADLLIVGFGSTYGHLFSAMKALRKQGKK VALAQFKYINPLPANTEEVLRKYKKVVVAEQNMGQFAMYLRGKIDGFVPYQYNEVKGQPL VVTELVEAFKKIIDE >gi|283510551|gb|ACQH01000068.1| GENE 25 38783 - 39787 984 334 aa, chain + ## HITS:1 COG:Rv2454c KEGG:ns NR:ns ## COG: Rv2454c COG1013 # Protein_GI_number: 15609591 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit # Organism: Mycobacterium tuberculosis H37Rv # 14 333 54 372 373 309 45.0 6e-84 MTYTAQDYKQGTPRWCPGCGDHFFLASLQKAMAEMGIAPHETAVISGIGCSSRLPYYMKT YAMQTIHGRAAAIATGAKVANPYLTIWQVSGDGDGLAIGGNHFIHCLRRNVDINIILLNN RIYGLTKGQYSPTSPRGFVSKSSPYGTVEDPFRPAELAFGARGNFFGRAVATDNQGVMEL MKAGQRHKGTSVCEILQHCVIFNDGIYDPVYSSPGRKSNAIYLKHGEKMLFGENNEYGLA QEGFGLKVVKVGENGATLDDVLVHDAHAEDHTLQMKLALMNNEQGFPVALGIIRDVESPT YEACVEQQIEDVKGRAAYHNFTELLETNDIWEVK >gi|283510551|gb|ACQH01000068.1| GENE 26 40087 - 41790 1216 567 aa, chain - ## HITS:1 COG:RP329 KEGG:ns NR:ns ## COG: RP329 COG2194 # Protein_GI_number: 15604197 # Func_class: R General function prediction only # Function: Predicted membrane-associated, metal-dependent hydrolase # Organism: Rickettsia prowazekii # 165 517 160 511 520 141 31.0 3e-33 MKPFKQICSGALTPQTAFWLGLSVLALPNVCLAYTEPLPLVSRLVGITLPLSVYAALLTL WRKPGRTLWWLFPLVFLSAFQMVLLYLYGHSVIAVDMFLNLVTTNPNEAGELLNNLLPGM VFVIVLYVPTLLLSAISGFCGGCLSANFTHKVRRSAGILGIVGLCLAGFALANGFRPSLH LYPLNVFENIRLAIVRTQRTAHYQQTAAQFKFHATPTHPNSEREIYVLVIGETARAPQFR LYGYPRNTTPLLSALPSLYAFDHALSQSNTTHKSVPMLLSPVSAINYDSIYTRKSLLTAF REAGFHTVYLSNQLPNHSFIDFFGQQANQWRFIKEGKPDTYDGYDQRLLPLVDSVLACKR QKQLIVLHTYGSHFEYRKRYPHNEGAFFPDENSDAKPQNRVELVNAYDNTILQTDRLLAA LISRISSTGASAALLYTSDHGENIFDDARRLFLHASPVPSYYELHVPLLVWLSPHYAQLQ PAVVTALRTNLHQPVSTSVSVFHTFAHLAGLRSPWVNVQSSVASPTYQPQPWLYLDDHNR AVPLLQMLRSPLDTNILRQKGFLYNEH >gi|283510551|gb|ACQH01000068.1| GENE 27 41963 - 42196 103 77 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTGYKARNRRSRKDLYKKIVACLSQTNKEKSCCRGCCRYTNKLETQGRKLDYLILNGCCR YINKLETQVATVPPQMV 
>gi|283510551|gb|ACQH01000068.1| GENE 28 42313 - 43266 1012 317 aa, chain - ## HITS:1 COG:no KEGG:BVU_4066 NR:ns ## KEGG: BVU_4066 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 18 317 20 319 319 413 64.0 1e-114 MDKATLVIVMASLFAKNGMAQEREEPILYGNMDHWVTRRITESTIIGGNEKTLYEIGPDR TLTGNTPYINLGHSPWATSNVMAKVMGITKTNQSVYRDAHGKGYCAKLTTHIEHVKVLGM INISVLAAGSVFLGDVREPIGGTKDGEKALNWGVPYTQRPKAIRFDYRTALSAETNRIKQ TGFSKAQTIAGRDCATALLILQKRTEDANGNVSASRVGTMAVEYNKSTNGWVNNATYPVH YGDIRNTSGYNARLMGLRSTDYVRNSKGKSVPLKETGWAAANETPTHLIVQFSSSNGGAF IGSPGNTIWIDNVRLVF >gi|283510551|gb|ACQH01000068.1| GENE 29 43599 - 46223 2918 874 aa, chain - ## HITS:1 COG:MA0523 KEGG:ns NR:ns ## COG: MA0523 COG0249 # Protein_GI_number: 20089412 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Methanosarcina acetivorans str.C2A # 8 873 5 900 900 641 41.0 0 MAQNDQGLTPMMKQFYSFKAKHPDALLLFRCGDFYETYSEDAVEASRVLGITLTKRNNKG NTDATEMAGFPHHALDTYLPKLIRAGKRVAICDQLEDPKLTKTLVKRGITELVTPGVAMG DTVLNYKENNFLAAVHFGKTACGVAFLDISTGEFITGQGSFDYVEKLLGSFAPKEVLYER NHKKDFERYFGNKHIVFELDDWVFTEQTARQKLLKHFGTKNLKGFGVEHVQSGVVAAGAI MQYLEITQHTNIGHVTSLALIEEDRYVRLDKFTIRSLELIQPMQEDGASLLDVVDRTVTA MGGRLLRRWLVFPLKEVKPIEQRLDVVDYIFRYPEFRTLLDEQLHRIGDLERIISKVAVG RVSPREMVQLKNALLAIQPLKIACLEADNETIRQIGEQMNLCESLRDRIENEIQPDPPLV VNRGNVIAEGFSPELDELRNISRGGRDFLIDIQQREAEATGISSLKIGYNNVFGYYLEVR NTYKDKVPEEWVRKQTLAQAERYITQELKEYEERILGADEKIQSLEERLFNELVTATQEF IPQIQINANVVARLDCLLSFAKTAEENRYVRPVIEDSDALDIRQGRHPVIETQLPLGEHY VPNDIQLDTERQQIIIITGPNMAGKSALLRQTALIVLLAQIGSFVPAESARVGLVDKIFT RVGASDNIAQGESTFMVEMTEASNILNNVSPRSLVLFDELGRGTSTYDGISIAWAIVEYL HEQPKARARTLFATHYHELNEMEKNFKRIKNFNVSVKELNGKVIFMRRLERGGSEHSFGI HVADIAGMPKSIVKRANTVLKQLENDNAQVGGTVSKPSVKHLDESREGMQLSFFQLDDPV LSQIRDEILGLDINNLTPMEALNKLNDIKKIVGG >gi|283510551|gb|ACQH01000068.1| GENE 30 46254 - 47069 743 271 aa, chain - ## HITS:1 COG:no KEGG:PRU_2554 NR:ns ## KEGG: PRU_2554 # Name: not_defined # Def: hypothetical protein # 
Organism: P.ruminicola # Pathway: not_defined # 15 271 6 259 259 255 48.0 1e-66 MKKIKLALVLVALLMAAHMQAQCTFRNTAFKSGEFLTYNLYYNWKFIWVKAGTASMSVVQ SVYGGKPAYRGSLITRGNSKVDNFFVLRDTLMCYSSLDMAPLYYRKGAREGSRYTVDEVF YSYPDGSKCHVKQHCQTHDGNHLWDNRTLDECVFDMMNIFLRARSFNPEGWKSGNEIPFT VAEGDGFIPATLKYNGKTVVKADNGVKYRCLDLSYIERKPNKKPKEIARFFVTDDSNHVP VRIDMNLKFGSAKAFVVGMKGLRSTVSSTEK >gi|283510551|gb|ACQH01000068.1| GENE 31 47260 - 49296 2260 678 aa, chain - ## HITS:1 COG:no KEGG:PRU_2555 NR:ns ## KEGG: PRU_2555 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 23 671 7 636 650 723 53.0 0 MKERYNMDKAHTTQRTSRLNWQLCRFALLLFTLALVAACARMGNPDGGWYDETPPRVVGA SPTEKTTGVKTRKLHIRFNEFIKIENATENVVVSPPQLETPDIKAGGKSIDIELKDSLKA NTTYTVDFSDAITDNNEGNPLGNYTYSFSTGEHIDTMEVSGWVLAAENLEPVKGILVGLY ANLADSAFRTQPMLRVAKTDGRGHFVIRGIAPGKYRVYALQDVDGDYHLTQKGEEMAFNR EIIVPSSKPDVRQDTLWRDSLRIDSISRVSYTHFLPDNITLRAFTHVQTDRFFTKAERTL PECFSLVFTAGSNELPQLRGLNFNNAERAFIVMPTAKKDTITYWIKDSALINQDTLRMQM QYWSTDTTGQLRMKQDTIEVLAKTPYAKRLKEKQKKAEEWKKAQDKAQKKGEPFETIMRP EALKVEVKVNNSIAPDENVRIELPTPLLSLDSTKVHLYSKRDTLWYEARYRIRVREGGDS LAPVGTNLLHKRWLELQAEWKPGVEYSFELDSLALTDIYGTTSGKIKQGFKVREDKEFAT LAVSLTALTDSNVVVQLLNEQDAVVKQTRALAGTANFYYLQPATYYLRLFVDRNGNGRWD TGDFYRGEEPETVYYFPEEIECKANWDATRTWNPTAKPLNEQKPGKITKQKPDKGKSIKR RNRERARELGIPLPPELK >gi|283510551|gb|ACQH01000068.1| GENE 32 50086 - 50523 262 145 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928741|ref|ZP_06422587.1| ## NR: gi|288928741|ref|ZP_06422587.1| hypothetical protein HMPREF0670_01481 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 145 1 145 145 261 100.0 1e-68 MKEKVTIKLCNNFYEAWATSQWLAEHHIMVLPNAANNEQPGQGIAITVYETDVEQALKLI AQLENKPVEERLPHGVGDDNEQAPKRCPVRLEWRIKPLSLFLAIVLTTIFAIVFSLLCNE EFLPLWGKLMLGGGVGLLVGSVRRN >gi|283510551|gb|ACQH01000068.1| GENE 33 51811 - 53316 1730 501 aa, chain - ## HITS:1 COG:no KEGG:PRU_2556 NR:ns ## KEGG: PRU_2556 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 501 1 501 502 758 69.0 0 MSITSLVSGAFKRRQADIETYAHHSEEIQRRVLTHLVQQGQRTQYGNTWGMNNIQTYEHF AKQLPVTSYEELKEPLDRMRHGEADVLWPGVVKWYAKSSGTTNDKSKFIPVSAEGLKNIH YKGGKDAVALYLRNNPKSRLFDGRSLILGGSHAANYNVAHSLVGDLSAILIENINPLANI VRVPKKSIALMPDFEAKREAIAQQTLHANVTNISGVPSWMLSVLVRVLELSGKDTLAEVW PNLEVFFHGGIAFGPYREQYRKLVGSSQMRYMETYNASEGFFGLQDTPDDDAMLLMIDYG VFYEFIPMDEFGTDNASVVPLWGVEPGRNYAMVISTTCGLWRYVIGDTVCFTSTQPYKFK ITGRTKYFINAFGEELIMDNAEQGLAYACKQTGAEVLEYTAAPVFMDAEAKCRHQWLVEF AHKPTSLDAFAQALDLRLQQLNSDYEAKRHKNITLQQLEVVEARQGLFNDWLKSKGKLGG QHKVPRLGNSRKNIEELLKMN >gi|283510551|gb|ACQH01000068.1| GENE 34 53628 - 53942 355 104 aa, chain + ## HITS:1 COG:no KEGG:BF2945 NR:ns ## KEGG: BF2945 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 104 1 104 104 113 53.0 3e-24 MKPEKIYTHEEMLDKVLGKPGDLAREQYESDINAFLIGETIKKARLQRKLTQAQLGELMG VKRAQVSRIEGGRNLTFATLSRAFKAMGVAAIIEAKGIGKVSLW >gi|283510551|gb|ACQH01000068.1| GENE 35 54584 - 55072 746 162 aa, chain - ## HITS:1 COG:PM0817 KEGG:ns NR:ns ## COG: PM0817 COG0783 # Protein_GI_number: 15602682 # Func_class: P Inorganic ion transport and metabolism # Function: DNA-binding ferritin-like protein (oxidative damage protectant) # Organism: Pasteurella multocida # 7 154 5 152 159 149 50.0 3e-36 MNTLMYLGLDENKVQPVVKGLAQLLADFQVFYTNLRGLHWNVKGNRFFGLHAKYEELYND AAEKVDEIAERLLQLGATPENRFSEYLKVATIKESGNEPEGKEGVETVLDNLGNLIKQER AIAKLAADADDDATVGLMDGFLAGQEKNVWMFVAFLDKSYKK >gi|283510551|gb|ACQH01000068.1| GENE 36 55566 - 56930 1334 454 aa, chain + ## HITS:1 COG:no KEGG:FP2425 NR:ns ## KEGG: FP2425 # Name: not_defined # Def: 
lipoprotein precursor # Organism: F.psychrophilum # Pathway: not_defined # 111 440 46 358 371 72 22.0 5e-11 MKKIILAAAICVFFAACGGDDGLTPTPQKPPTQGETTEVKAADIVKYFNLDKQLNVYQAI EKAKADLGKKTIDGKEINVTAVNVVKRDDQKGTFTLKVTGVAGKKPFGLDVDFTGFAQKP ADQDMAMRAVAKWKDGVDYLSEFDFDTLFRLNKTDKFTAEYLSKLVDLTSSAADGSNHYT FTPDDWAKTTISDVRYVPSNSHSGHVMFTVTYNGIKGKTGNNGSPSLAIDKNAYYAKQFT VDTQAASKFYMRGVYRQLAAFYGSLIKYDETKFAPLLASKQKNDGDNSIALTIKLTPRDG SETELAQFTLTLKGFKPLEDLNNEWNIAGKTEVNEFFGKRFRNSADGDKTAQVKAIDTRV WLKLVQMTVKRGGNFVSLYPNEVASENGNYKVTAWFPSSSDAEYKDIYLEEPRIEVTAAE KKGNFLYIKYRLTSVNETAVAGVEKRTEVHLIMP >gi|283510551|gb|ACQH01000068.1| GENE 37 57231 - 58364 1093 377 aa, chain + ## HITS:1 COG:no KEGG:TERTU_2962 NR:ns ## KEGG: TERTU_2962 # Name: not_defined # Def: putative acyltransferase # Organism: T.turnerae # Pathway: not_defined # 10 375 3 383 383 218 36.0 4e-55 MENVCRARGEFDDIRPYNDDEVALAMRRIAQSQWLPQLAHFVFPEQNAQDVARMLQDIST AREFQLRVMYHFNREVMRRSTTQFVCEGLEQLQPATPYLFVSNHRDIVLDASLLQNALVD AGHDTCEITFGANLMTHPTVVDIGKSNKMFRVERGGNKRQFYESELHLSRYIRHAIVDRG SSVWIAQRNGRTKDGLDRTEPALIKMFALSGTGNKLTSLAQLNIVPVAVSYEYESCDLLK AREMAQATHKPYVKQPGEDINSIITGISQPKGRVHFRVTKPLGAAQLEPLAHLPLNDAVR QVALMIDHAIIDAYQLMPTNMAAYAMLHPEGPQNAQWDEAKAWLSQRLQQLTTDEERRHL LHIYANPVAAKQQIKQS >gi|283510551|gb|ACQH01000068.1| GENE 38 58366 - 60777 2038 803 aa, chain + ## HITS:1 COG:RSp0362 KEGG:ns NR:ns ## COG: RSp0362 COG4258 # Protein_GI_number: 17548583 # Func_class: R General function prediction only # Function: Predicted exporter # Organism: Ralstonia solanacearum # 172 789 169 792 805 85 23.0 4e-16 MIAQWLIRLYDWFAQHRLLRIALPIAMVLGALVGASQLRFKEDITDFLPNDHNYRRSMEI YRQTNAADRIFIVASPTDTTQLNTPLLVRAMQLFQRESKARGWQLTAQADFESVMRIPDF AYQNAPLLLGDADFARIRRITCTDSMAAALQRNREMLLFPTGGMVTQNIERDPLGLFTPL LARLQASGNALHYELHDGYIFTPDDRHAVAMLTSPYGASETNGNAQLVKQIDSVARSVET RVPGVKFAATGAPVIAVDNASQIKKDSIWAISIAATLIVALLVWALRRVRHLFLIVVSLG FGWLMAMGGMALVSSEVSIIVLGIASVIIGIAVNYPLHFVTHLAHCSSPRQALEEIASPL VVGNVTTVGAFAALIPLDATALSHLGIFAALMLVSTILFVVVFLPHFVGKKNVSVKPLTA 
NEQDEMPDAAPISATSGATATAHSPHHRRRTVVVGTALVVVTLVLGYFSLGTQFDTDLRN INYLTPQQQTLLMQLAQMRGEEHGTTQVYFVAEGKHVEQALQAAEQPQYPLLVASKAEQE RRLAQWHCILKERAPQLDSLGLLAAKEGFAADAFAPFQKMLAARFTPQPPNHFAPFLQAS LGNRIVGNAVVNIVNVPSAQADSVAAAFTSTPNAERWAFTLPSLNATIANSLSQHFNYIG WVCGTIVFAFLWLSFRKFKLALIAFMPMIVSWLWILGAMHLLDMRFNLVNIILATFIFGQ GDDYSIFVTEGLIYERRHGRPMLAAYRKGILLSAAIMFIGMGSLVVARHPALHSLGAITI VGMGAVVALTYVVPPLLFKWWVK >gi|283510551|gb|ACQH01000068.1| GENE 39 62095 - 62604 570 169 aa, chain + ## HITS:1 COG:SA0490 KEGG:ns NR:ns ## COG: SA0490 COG0566 # Protein_GI_number: 15926209 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Staphylococcus aureus N315 # 10 165 89 241 248 78 33.0 5e-15 MHRLTVDEFKAASKLPLVVVLDNVRSLHNVGSVLRSCDAFRVEAVYLCGITATPPSAEIH KTALGAEDSVTWKYFAETSDAVDELHANGTKVFAVEQVANATMLQSLETKGGERYAVVLG NEVKGVQQTVVDRCDGCLEIPQFGTKHSLNVSVAAGIVVWEFAKKIGNK >gi|283510551|gb|ACQH01000068.1| GENE 40 62740 - 62910 59 56 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MATRPRKPSRALSNVVFFMIDMFNKYFYRLQSYDILCNQKRKAKAFSVTPCVKNLD >gi|283510551|gb|ACQH01000068.1| GENE 41 62855 - 63313 449 152 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928750|ref|ZP_06422596.1| ## NR: gi|288928750|ref|ZP_06422596.1| hypothetical protein HMPREF0670_01490 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 152 1 152 152 237 100.0 2e-61 MKKTTFDKALLGFLGLVAILFMGVTLSSCSNDDDGPVDKKKDKVSAIVGTWQQDMKEKVG EGTGEGESEKPVPVFEVYQKDGTYLKTSDPKNKEAKVEEGTYTVKEDSLFVEVKGTDAVR KDAFKFVIKENKLSKQQKGEEIATYTRLKENP >gi|283510551|gb|ACQH01000068.1| GENE 42 64089 - 65189 1370 366 aa, chain + ## HITS:1 COG:BS_yyaF KEGG:ns NR:ns ## COG: BS_yyaF COG0012 # Protein_GI_number: 16081144 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted GTPase, probable translation factor # Organism: Bacillus subtilis # 1 366 1 366 366 411 56.0 1e-115 MSLKCGIVGLPNVGKSTLFNCLSSAKAQAANFPFCTIEPNVGVITVPDERLTRLAEIVHP GRIVPATCEIVDIAGLVKGASKGEGLGNKFLGNIRETDAVIHVLRCFDDENITHVDGTID PIRDKEIIDTELQLKDLETIEARLAKDQKAAAAGNKDAKLNVTVLTAYKEVLEKGMNARN VTFESKEEQQAAHDLFLLTAKPVLYVCNVDEVAAKTGNDYSRKVEEVAKAEGAEVLVIAG KTEEDIASLESYEDKQMFLDELGLEESGVNRLIKKAYALLNLQTFITAGEMEVKAWTYHR GWKAPQCAGVIHTDFEKGFIRAEVIKYEDYIEYGSESAVREAGKLGIEGKDYVVQDGDIM HFRFNV >gi|283510551|gb|ACQH01000068.1| GENE 43 65466 - 66311 858 281 aa, chain + ## HITS:1 COG:VC0674 KEGG:ns NR:ns ## COG: VC0674 COG0682 # Protein_GI_number: 15640693 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Prolipoprotein diacylglyceryltransferase # Organism: Vibrio cholerae # 11 271 10 255 271 129 35.0 6e-30 MLSHLNYILWNPDVEAFHLFGFSVRWYSLCWLIGLVLAYFIVQRLYKQQKIKDEYFDPLF LYCFFGILLGARLGHCLFYQPEYYLTSWQHFIEMIVPIHFLPGGGGWKFVGYEGLASHGG TIGLIVALYLYYRRTRLNLWQVLDNIAIATPITACFIRLGNLMNSEIIGKVTDVPWAFIF ERVDKMPRHPGQLYEAIAYFVFFFVGLWLYRKHPQRVGTGFFFGLCLTLIFTFRFFIEYT KDIQVDFESGMPLNMGQILSLPFIAIGIYCMRRKGKGQVSK >gi|283510551|gb|ACQH01000068.1| GENE 44 66401 - 68782 2559 793 aa, chain + ## HITS:1 COG:SP0648_2 KEGG:ns NR:ns ## COG: SP0648_2 COG3250 # Protein_GI_number: 15900551 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pneumoniae TIGR4 # 34 778 80 869 871 385 33.0 1e-106 MKQTLILAFLALASVAQAQIRQERNLDKWQFSRDGKDWQQVDVPHDWAITGPFDRKWDIQ 
HLAIEQDGQTEAIDHTGRSGALPWIGKGEYRTTFRLPAGTKHTTLYFDGAMSEPVVYVNG REAGRWMYGYNAFRLDISPYVKSGVNEVRVLLNNQEESSRWYPGAGLYRPVTLITAGDAR IDPWATIFKTVNLAGSKAQVYVETQATTPKGKAGARLAVRLLDANGRAVAQKNLTANAQG TAAVQLNIANANAWSPEQPYLYTLEVSLYQGNKLADRLRQRVGIRTVAVSKEHGFQLNGQ SRKFKGVCLHHDLGPLGAAVNKAAIIRQIRLMKDMGVDAIRTSHNMPSQMQMQVCDSMGM MVMAESFDAWKEPKVKNGYNLFYDQWWRKDLENLIRGHRNHPSIVMWSIGNEIPEQWNKE GAQRAKDMTDYCHQLDNTRPVTCGGDNPKANTECGFYAALDVPGFNYRVNLYDELISKQR QGFILGSETTSTVSSRGVYKFPVDERKMATYKDGQCSSYDVECCGWSNLPDDDLIAQDDR SFTIGQFIWTGIDYLGEPTPYYSYWPSRSSYFGAVDLAGLPKDRFYLYRSVWNEKQPTLH LLPHWTWPGREGQQTPVYCYTSYPEAELFVNGQSQGRIKKDKASRLDRYRLRWNNVVYQP GELKVIAYDANGKPAAEQVVRTAGEPHHLVVQADKSSLAADGQDLLYLTVSMVDKDGNLC PDADDELTFDVSGAATFKAVCNGDATSLEMFHLPHMRLFHGQLVVTVQSNKAKGKVKMDV KAPARNISTIWQN >gi|283510551|gb|ACQH01000068.1| GENE 45 69157 - 69534 389 125 aa, chain - ## HITS:1 COG:XF1454 KEGG:ns NR:ns ## COG: XF1454 COG0239 # Protein_GI_number: 15838055 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Integral membrane protein possibly involved in chromosome condensation # Organism: Xylella fastidiosa 9a5c # 3 104 46 147 181 59 39.0 1e-09 MIRNIILVALGGAVGSALRYALSWLWPATQNGGLPLGTLAANVLGCLVIGFVGTLLQRWF GGGEGFRLLVCVGFCGGLTTLSSLVNETSGMAHTNQILMAAGYLGASVALGFVALHCGTW LASRV >gi|283510551|gb|ACQH01000068.1| GENE 46 69550 - 69789 118 79 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFSEFFSYLFCFFTILQTFPSDTLVPNKPDALCDKPINSVASQPNNSVFVFYIQRLLTYL LIIYSASVRKASSCTPLHH >gi|283510551|gb|ACQH01000068.1| GENE 47 70436 - 70837 462 133 aa, chain - ## HITS:1 COG:Cgl2549 KEGG:ns NR:ns ## COG: Cgl2549 COG0537 # Protein_GI_number: 19553799 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Corynebacterium glutamicum # 3 129 4 132 136 80 34.0 9e-16 MDIFSKIAAGEIPSYKCAENEQFYAFLDINPLMEGHTLVIPRREVDYIFDMNDDELADFQ 
LFAKRVARAVKTACPCIKVAQVVLGLEVPHAHIHLLPLKSEADADFKREKLKLTAEQMQS IADKIYAAFKSQQ >gi|283510551|gb|ACQH01000068.1| GENE 48 70893 - 71357 601 154 aa, chain - ## HITS:1 COG:BMEI0508 KEGG:ns NR:ns ## COG: BMEI0508 COG0782 # Protein_GI_number: 17986791 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Brucella melitensis # 4 144 6 146 157 110 45.0 8e-25 MAYMSQEGYEKLVAELRRLESIERPKASAAIAEARDKGDLSENSEYDAAKEAQAHLEDKI NKLKVAINEAKIVDTSRLSSDAVQILSKVQMTNLTTKAKMTYTIVSESEANLKEGKIAIT TPIAQGLLNKKEGDEVEITIPRGTIKLRIDKITI >gi|283510551|gb|ACQH01000068.1| GENE 49 73445 - 73930 563 161 aa, chain + ## HITS:1 COG:BH1577 KEGG:ns NR:ns ## COG: BH1577 COG0526 # Protein_GI_number: 15614140 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Bacillus halodurans # 15 144 34 158 176 72 30.0 2e-13 MKKLYLLIAAMMVATFACAQTRLPNITLKDVSGNDVRLDTLSNNGKPIVVAFFATWCKPC NRELNAIDEVYNDWKQETGVKIVAVSIDQAQNINKVKPLVDENGWTYQVLLDPNGELKRA MGVQMIPYTLLLDGKGYIVYKHNGYADGAETELYQKVKDCL >gi|283510551|gb|ACQH01000068.1| GENE 50 73942 - 74307 62 121 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLSLGHPTVLLGAVEGEIRYPPAVRGAWELCAYTQPLATIADKRAVNEILELCAYAQPLI TIVDKRAVSETLELCAYAQPLTAIADKRAVRECLGVAFPHRKSPIHWPTMNEQIHQQEHC E >gi|283510551|gb|ACQH01000068.1| GENE 51 74352 - 76136 1767 594 aa, chain + ## HITS:1 COG:no KEGG:PGN_0121 NR:ns ## KEGG: PGN_0121 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 83 594 59 581 581 458 46.0 1e-127 MFRNPLPNKQLLLQPRQPKWMGLPLQQMLGIKKKFSEPSAVAKSHRANWKPWGTALAIAI YCAMALPVGAQDQPKKGLTLRGSIQTDMLVPENDKQAGITKDNGDFLNNTYVELDASLRN FDAGVRFEYMQYPLPGFEPDFKGWGVPYYYLKWQTQHVELTAGTFYEQFGSGFVLRNYEE RSLGVDNSLLGGRLKANPLPGVYVKALAGKQRRYWEHNPGWVGGADLELHIEQWLLGMQA HDHHLMLGASVVNKNEPDEVIMADATHRLRLPRNVMAYDIRAQWQHGAYNVLAEYAQKGQ DPTADNGYIYRHGYVAMLSGSYSKRGLSMLLQAKRSVNMGFRSRRSMVGTSSYVNHLPAF 
TLEHTYALPALRPYATQPEGEWAYQASLGYKLKKGTPLGGRYGTALKLNFSHVHGIAKNH HGGKGTDGYGSAFWAWGDATYYQDLNVQVEKRWNPALKTTLMYMNQRYNKTVVEGEGGMI NANIFVADVKWKMSPRNTLRTEAQYMQSKDGNGDWLFGLAELSLAPSWMFTVSDQYNAGS THIHYYQALVTFVHGAHRLQVGYGRTNDGYNCSGGVCRYMPPTKGATLSYSYNF >gi|283510551|gb|ACQH01000068.1| GENE 52 76308 - 77096 639 262 aa, chain + ## HITS:1 COG:no KEGG:PG2173 NR:ns ## KEGG: PG2173 # Name: omp28 # Def: outer membrane lipoprotein Omp28 # Organism: P.gingivalis # Pathway: not_defined # 20 248 42 269 290 105 32.0 1e-21 MHIKPLHILLAALLLGACNDIAPSDRLIEVPATTAKRKVLVEEFTGQRCLNCPAAAEELS RLQAQYGADTLVVVAIHGGRLAIMPKEGLVGLATPLGNSYAEHWGYANSVPKVLVNRQRA GSVKEGWAAKIIAAFETKSPLNLKLSTTYNAANRELQVSTSAQTLAEGVEGKLQLWLVED GIVALQQFPDDVLKRDYVHNHVLRAALNGDWGEPISLPGNAEKTLTHKLTLPEGIKADNA WVVAFVYNDAGVVQVERRKVGI >gi|283510551|gb|ACQH01000068.1| GENE 53 77480 - 78151 518 223 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928762|ref|ZP_06422608.1| ## NR: gi|288928762|ref|ZP_06422608.1| hypothetical protein HMPREF0670_01502 [Prevotella sp. oral taxon 317 str. F0108] # 19 223 1 205 205 372 100.0 1e-101 MKRLLLLSVLFCLASVMTMQAANREMVFVDAKGKVLADKATLSLTKVENAGFGPQIKSGL SIRNVSGKEFNVMLVYEIVEISGQGADLEVCFPSNCNTWDKKGKYYTSSRKVSSKAQNIS IATDFNLTGKANCTAILKLYTTKSTANVEKAVPEQLLATVTLKFNGNATGIESVQRTAQN ADNEVWTVNGVLVGRNISDVNSLPKGLYLVRQNGKTTKVVVGK >gi|283510551|gb|ACQH01000068.1| GENE 54 78372 - 79685 1037 437 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928763|ref|ZP_06422609.1| ## NR: gi|288928763|ref|ZP_06422609.1| hypothetical protein HMPREF0670_01503 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 437 1 437 437 894 100.0 0 MKKPLLSRLCVLAFMFCACATAFAGFGENNLAPIAAPRLFVAKGATDAKGTFSCVNYGTN PVASYRYTATVNGKVVEEKEVKLAEPINVNDFGKLLVKVPTFNQLGDHQLVCNITHVNGK PNTTTFSSSTLDITVVTKVPTCRVVFEDYTAMWCRYCPRATAIMEYLTKKHPDDFIGIAV HQGDKMGIGGYKTPAVGKFGLPYVWASRRAKVSGYTGEDLYREAKARGAVMDIDVKAEWN ASGNAINVQSTTTFRTNMPNANYALAYVVVEDGMHNASWSQANSYDGLTDLLEENEEMAV FVKGGRVIYGLKYNHVARAFLGIDSGMEGSLPKNVVADRGIKHQAMFQGVGRNWTPQNKK NLRVVVMVLDNSTNSIVNGAHCPIKPYNTTGVATTKNTDQRVEVARFNLKGQRLSAPEKG INIVKYSDGTVEKVNER >gi|283510551|gb|ACQH01000068.1| GENE 55 79928 - 81241 1074 437 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928764|ref|ZP_06422610.1| ## NR: gi|288928764|ref|ZP_06422610.1| hypothetical protein HMPREF0670_01504 [Prevotella sp. oral taxon 317 str. F0108] # 1 437 1 437 437 895 100.0 0 MKKQHFLRLYALMALLCMQTYTLLAADVYSLSVDKPARMYVKTGASGHKNTFIMHNYGNI VQKNFHYKATINGKVVDERTVTFATPLVANGLTGLTINIPPQSKAGDYELVLDVTQINGK PNGSPGKIAKLPITAINKLPTCRVVFEDYTANWCPWCPSGIAVMRQMAKKHPNDFFGIVA HDGDVMGIKGYNVPATPHTGLPTVWASRANKVEGFTGELYYQMTKQRGAMMDVDVKAQWN RQGKSIDVQSTTTFRATTNNARYALAYVLIENRIQRNDWAQRNGYSGSTSYLGLNPELDY FIKSSDPIKNFVFEDVARSHIGIANGLDGSLPQSVVADRPIVHKSALNGIGDWVTVFNKN NLTVVVLVIDNTTHSIVNAARCVVKPYTATDIETPKATAERVEVARFNLTGRRLSAPEPG VNIIKYSDGTVQKVFVR >gi|283510551|gb|ACQH01000068.1| GENE 56 82551 - 85367 2196 938 aa, chain + ## HITS:1 COG:RSc0786 KEGG:ns NR:ns ## COG: RSc0786 COG5373 # Protein_GI_number: 17545505 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Ralstonia solanacearum # 291 608 165 483 938 103 25.0 2e-21 MYDEIGLFIFPLIFGFVAVVIAVRLSVNSKFEELFFQFKVMQKRLESTTESLNNLQEEIE KRLPQGVATNAGNHAEEQKDAIETPPQAAGNGPWATHTDLPTGEDTHQESVEEPVAEVTT LSEQQRVGAPVATVEVPVQGVNVLANAPEGEPQPTEGELQGTDSNLHQAETPTTCEVETS QAGAELQDNDAEQQDGDNETQGVDNEQQADNTEEQRENAAQEQPQLHVPVAAFVQDETEQ ERVEQAQEEPVLTGRAYLEAKYKHAATKGKAVNGSAMGADNEDGGFNYEKFIGENLFGKI GILVFVLGIAFFVKYAIDQNWIGHTMRTALGFGVGTLLLGLAWRLGKHYRAFSSLLAGGG CAVYYVTTAIAFHYYQLFSQSVAFGIMVVVTLFMAWTARHYERRELAMTAIVGGFLAPFL 
TATGTPNFLFLYVYMVILGLGSLYLSWGTRWNELPLISTAATYLALLLIGKDLVELSWGV HILFYGLFWAISTASCFTLLRRGMVPLFTGFHIATMILSSMFTIFLVAVQGDDEYYENGF TALTMGAAYIGLHLWLRYKERTKDATQAILLALGLVYVSVSLPLFFSGAVLTVCFSAEMV LLLWLYCRLDMGVYGIASGAVMVVSILGELSQLLLANAPEVSTPVFNSNFMSTIFRGLCF LAFARIMDRNKEKLEAYYVPLNHIIYGAGIITIYACLSMEMDLFIPSPTYNAVVTLLRIG TLFAIAVGFGRRYPAGKYGWLYAVYMFVAMFLIAGDVFYYGEYTTPMAAIGVQWLGIACG IGLFVHASRNYYKQAERSTKLFTVFLNVSATLLWVSMVRALLMQMGVEQFSAPLSLSLAA VGTLQMALGMRLPNKTMRLLSIATFGFIIAKLALYDVWRMPAVGRIVVFIILGALLLTLS FMYQKLKDTLNLGGENEDPNGLDTQLHEASDDAEDSHE >gi|283510551|gb|ACQH01000068.1| GENE 57 85484 - 86068 412 194 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent [Streptococcus pneumoniae TIGR4] # 1 192 1 196 201 163 41 5e-39 MNITMPTLPYAANALEPVISEQTINFHYGKHLQNYVNTLNTLIQGTEFEGKSVEEIVKTA PDGPIFNNAGQTLNHTYYFLQFKSPVKGNEPTGKIAEALVRDFGSVENFKKEFTQAAATL FGSGWAWLSQDKDGKLVITKEANAGNPLRHGNNPLLGIDVWEHAYYLDYQNRRADHLAAL WDIIDWKVVEGLLK >gi|283510551|gb|ACQH01000068.1| GENE 58 86220 - 86786 697 188 aa, chain + ## HITS:1 COG:STM0608 KEGG:ns NR:ns ## COG: STM0608 COG0450 # Protein_GI_number: 16763985 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Salmonella typhimurium LT2 # 3 188 2 187 187 249 60.0 2e-66 MQSIINSLLPEFKVQAYQNGEFKTVSSEDVKGKWAVFFFYPADFTFVCPTELVDLAEKYE ELKKMGVEVYSVSCDTHFVHKAWHDASDSIKKINYTMLADPLGVLAKGFGVFIEEEGMAY RGTFVVDPEGKIKIAEIQDNSVGRNAEELVRKVAAAQFVASHPGEVCPAKWKQGAETLKP SIDLVGKI >gi|283510551|gb|ACQH01000068.1| GENE 59 87112 - 88671 395 519 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 215 508 2 297 306 156 32 5e-37 MLDTSILKQVSDVFAGLENDYVLDVRTDASRAESAELKAFVNDFVSTSAHLSANFTDTND AQLSFSLLKGGQPVGITFRGIPNGHEFTSLLLAVLNADGKGKNLPDEAISRRIANLKGEI KLQTYVSLTCTNCPDVVQALNIVALANSRVSSEMIDGALFQDEVARLNIQAVPAVYANGE LLHIGRGDLGELLGKLEDMFGSEEDSNAEPVERQFDLVVIGGGPAGSAAAIYSARKGLKV 
AVVAGRIGGQVKDTVGIENLVSVPQTTGAQLADDLRSHIGHYPIEIFENRKLERVDFSQS PKRVYAKGGEVFVAPAVVVATGASWRRLNVDDEAKYMGKGVHFCPHCDGPFYKGKNVAVV GGGNSGIEAAIDLAGICNHVTVLEFADTLRADTVLQDKAASMSNIEIFKSSQTTQVLGDG NHVTAIRVKDRTNEEEREIALDGVFVQIGLAANTEHFRDALPLNERREIVVDATCRTNIP GVYAAGDCTNVPYKQIVIAMGEGAKAALSAFDDRIRGII >gi|283510551|gb|ACQH01000068.1| GENE 60 88979 - 89143 232 54 aa, chain - ## HITS:1 COG:alr1174 KEGG:ns NR:ns ## COG: alr1174 COG1592 # Protein_GI_number: 17228669 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Nostoc sp. PCC 7120 # 2 54 181 233 237 84 62.0 3e-17 MKKYICETCGWIYDPAVGDPDGGIEPGTAFEDIPEDWVCPICGVGKDDFSVVEE >gi|283510551|gb|ACQH01000068.1| GENE 61 89260 - 90357 874 365 aa, chain - ## HITS:1 COG:FN0546 KEGG:ns NR:ns ## COG: FN0546 COG0859 # Protein_GI_number: 19703881 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Fusobacterium nucleatum # 19 365 5 334 335 66 24.0 6e-11 MGVSLPSKKTNTHISVKTEHILIIRFSAMGDVAMTVPVVQSLAKQYPELRITFLSRPFAR PLFEGLAPNVGFMGADIKKEYHGVGGLNKLYKRIVAKNITAVADLHNVLRSGYLRLRFNL GRYRVEHIDKHRNERKKLVAQTNKRLVPLPTAFQNYADVFAKLGYPIKMDFTSIFPPQGG NLRLLPAEIGVKKAFQEWVGVAPFAAHKQKVYPPEMLEKVIKMIVRDHPSCRVFLFGKGP DEDPILNGIVERNPQCLNASAVLGGIAQELVLMSHLDVMISMDSANMHFASLVNTPVVSI WGATHPYAGFMGWGQKPQNAVQIDLACRPCSIFGNKPCLRGDLACLRTISPEMVYERVKR VLNKR >gi|283510551|gb|ACQH01000068.1| GENE 62 90382 - 90984 840 200 aa, chain - ## HITS:1 COG:no KEGG:PRU_2563 NR:ns ## KEGG: PRU_2563 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 200 1 200 201 291 75.0 9e-78 MSFTKLCNEIFNQAIRDYHVTDNIDTPINNPYKRETIENRLYLKCWIDTVQWHFEDIIRD PHINPEEALSLKRRIDRSNQDRTDLVEQIDSYFRQKYADVKPQADARLNTESPAWAIDRL SILALKIYHMREQVERTDVDAEHHERCKAKLNILLEQQADLSTAIDQLLADIEQGRIYMK VYMQMKMYNDPATNPVLYKK >gi|283510551|gb|ACQH01000068.1| GENE 63 92620 - 92904 68 94 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGQKIRFYRILVWYFVQERRVFHGFAAQILYSPFKLKVNTTDKSDQSAPALIIPPFQFKG 
QYNFNISEEHFSYYTPSPPPVCIKNSPILQHTFS >gi|283510551|gb|ACQH01000068.1| GENE 64 92867 - 94315 1270 482 aa, chain - ## HITS:1 COG:XF1638 KEGG:ns NR:ns ## COG: XF1638 COG0463 # Protein_GI_number: 15838239 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Xylella fastidiosa 9a5c # 1 226 6 227 274 107 31.0 4e-23 MNNNRVCVVIPTYNNATTLASVVRGVAQHTPNIIVVNDGSTDDTAQLLATHFPELCVVTH TCNRGKGAALLSAFKQAQNMGYTHAITIDADGQHAPADLPLFFKAIHQAPNAIVVGNRFD ADKFSETNNRNMNGQSKFANKFSNFWFALQTGRCLPDTQTGFRAYPLRLLRWLPLVTNRY ESELALLVFAAWHNVRTISIPINVHYPPREERVSHFRPGVDFARISVLNTFLTFFAIVYA LPRKLLRVVCTAGVLLLMFVLMFFLQIGMLIYFLSHRVSETERLRFHALIQRIARTLLRL LPGIKTRFDNPAGEDFAQPSIIISNHQSHLDLLCILALTPRMVILTKRWVWYNPLYAVAI RFAEYLPVTRNIEDNEQRIAHLVKRGYSVMLFPEGTRSASLKVQRFHQGAFYLAKRFQLD IVPVLLYGTGHVLNKRARTLSSGNIVVSIMRRTALSAFDKDITPRELARHFRKMYAEELE SF >gi|283510551|gb|ACQH01000068.1| GENE 65 94308 - 94670 342 120 aa, chain - ## HITS:1 COG:no KEGG:Coch_1765 NR:ns ## KEGG: Coch_1765 # Name: not_defined # Def: dehydratase # Organism: C.ochracea # Pathway: Fatty acid biosynthesis [PATH:coc00061]; Metabolic pathways [PATH:coc01100] # 7 120 5 119 122 81 38.0 8e-15 MKLKDNFYTIENATHDADKHVFGVRLNPNHAIYAAHFPGNPITPGACTLQMVGELAEEVV GKPLQLSCAKNVKYLAPLTPTDTPCPTFALNIRCEGNTWLVKAEVKNNEHVFAKLSLVYE >gi|283510551|gb|ACQH01000068.1| GENE 66 94654 - 95052 346 132 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928777|ref|ZP_06422623.1| ## NR: gi|288928777|ref|ZP_06422623.1| hypothetical protein HMPREF0670_01517 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 132 1 132 132 266 100.0 3e-70 MRTITLYILLLWFSVASLSAQGRRWNVELSFKKAAITGLCIIDTVEQQPRGAILNEFGIK MVDFVCMPNGKVKLYNLVPMLNKWYIRRVLRHDLQQIVPQLVTRDSLTYENPKRHITYRF KPLKPTEDETER >gi|283510551|gb|ACQH01000068.1| GENE 67 95074 - 95700 688 208 aa, chain - ## HITS:1 COG:no KEGG:Cpin_1869 NR:ns ## KEGG: Cpin_1869 # Name: not_defined # Def: outer membrane lipoprotein carrier protein LolA # Organism: C.pinensis # Pathway: not_defined # 4 208 7 209 210 109 29.0 8e-23 MNYILSIILLFVPLIGHAQTAVSGAKADEMMQKVGEVAQKTKSLQCSFTQTKTLKMLSQK MVSKGRMCYSQPSKLRWQYTSPYQYTFILNGTKVLMKSAQRKDVIDAAKNKVFREITGIM LSSVTGECLTDKQRFKTQMFVDGDKWIAQLTPLKKEMKQMFSLLVITFDSKQLIATTVEM REKGGDNTRIELHQVKKNAAVSDEEFKI >gi|283510551|gb|ACQH01000068.1| GENE 68 96125 - 96484 327 119 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928779|ref|ZP_06422625.1| ## NR: gi|288928779|ref|ZP_06422625.1| membrane protein [Prevotella sp. oral taxon 317 str. F0108] # 1 119 1 119 119 167 100.0 2e-40 MFKLLALALVQSALLAAIQVTLKLSLQQIGAFSWSWAFFARALTCLWFALCGICYALSTV LWLYILKHYPLSQAYPLISLSYVMGMLAAVFVFHEAVPLVRWLGVGLIMAGVVMVSYQG >gi|283510551|gb|ACQH01000068.1| GENE 69 96463 - 98211 1339 582 aa, chain - ## HITS:1 COG:BH1840 KEGG:ns NR:ns ## COG: BH1840 COG0304 # Protein_GI_number: 15614403 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Bacillus halodurans # 2 569 6 603 764 189 26.0 1e-47 MIVVTGTGIVSAIGTNTPEVAHALLADTSGIAPVRFLETQHSQMPVGEVKLSNEQLAQRL GVPCTANTSRTMLLGTLALREALAMAGINSEQLADVALVSGTTVADMDCVERHFNAAQSN NALGDCGRSTQEMVQNVGPFGYVTTCSTACSSAANAFILGANLLRSGRFKQVVVGGTECL SRFHFNGFRSLMILDGEPCRPFDDTRAGLNLGEGAAFVVLETAEHAAQRGVEVLAELSGW GNACDAFHQTASSDDGIGAVLSMRQALKRAALQPADIDYVNAHGTGTPNNDASESAALHS VFGEKVPPVSSTKSRTGHTTSASGSIEAVICLLAMRQSFIPSNLNWQHAMADGIVPVAQT LRQQQLRHVLCNSFGFGGNDTSLVFSAPSTSNTPLQTPPLRPVYVKTVVKFADIPETELP RIPPLVARRLSGVLRRALLTSLVTLKRSGIEQPQAIVTGTALGCVEETEKFLRELANDGE 
GSLKPTNFIHSTHNTISSLIAIHTHCHGYNSTYAHGQRSLESALTDAWLQIALGDLHTAL VGWHDEEGQTALSMVLTNEADGAEYALNSLEDVRQLCSNYSH >gi|283510551|gb|ACQH01000068.1| GENE 70 98208 - 98465 458 85 aa, chain - ## HITS:1 COG:ECs4328 KEGG:ns NR:ns ## COG: ECs4328 COG0236 # Protein_GI_number: 15833582 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Escherichia coli O157:H7 # 1 83 1 84 85 66 47.0 1e-11 MEELKETLKKQIIEALSLEDMRPEDIDDAAPLFGDGLGLDSIDALELIVLMEKQYGIRLN NPAEGKEILASVNTMAEYIEKNRTK >gi|283510551|gb|ACQH01000068.1| GENE 71 98561 - 100321 1669 586 aa, chain - ## HITS:1 COG:BS_yjaY KEGG:ns NR:ns ## COG: BS_yjaY COG0304 # Protein_GI_number: 16078199 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Bacillus subtilis # 11 335 15 366 413 123 28.0 9e-28 MTHCIAHNISSPLGWTSEENFAAVLQGRTKLTTHKDKWALPQSFVASLFDDNAVEERFNS HVGDVGLALSRFEMLVVLSVKQAAEQAQINLSSSDVGIVLATTKGNVELLSPTNQPHIPR QRTLIAPSAAAIAQALGSPNTPHVVCNACISGAHALLFAHRLLEMGQYRQVVVCGCDVQS AFIVSGFQSFKAFSQAPCKPFDLYRDGLNLGETAATMILSATEPATQRAWHIVGGAVRND AYHISSPSKKGEGCYRALVATGVNDAENIAFVSAHGTATPYNDEMEAKALHRAGLSAVPV NALKGIYGHTMGAAGVLETVLSMMSIEDGTVLPTIGFSELGVSKPLLISAQAQPTSKRSF VKMLSGFGGCNVAMRWQWGENTAYGNETQGQSKALPALTPLRRVQITPQEVQIDGHTQAI DNPNGALLTAIYKQWANDYPKFYKMDGLSRLAFAAGELLTKGTEGLPSMALDDSAAIVFV GNSGSLASDIKYQGTIQQADNFFPSPAHFVYTLPNIATGELAIRHNVHGETAFFIASQLH EAQMGTLLQSAAMDGEAKQVVGGWLEYEDDTHFEAQITLYHVERGA >gi|283510551|gb|ACQH01000068.1| GENE 72 100525 - 100941 534 138 aa, chain - ## HITS:1 COG:CAC0271 KEGG:ns NR:ns ## COG: CAC0271 COG0824 # Protein_GI_number: 15893563 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Clostridium acetobutylicum # 4 117 3 116 138 70 32.0 8e-13 MSLQASKEIEIRFSEVDSMNVVWHGSYALYFEDAREAFGAKYGLEYLTIADNGYYAPLVE 
LTFKYRQPIAYGMKCRIDIFYVPTEAAKIVFDYEIRRSEDNALMATGHSVQAFMNKQYQL EWYRPPFYQEWQERWKVV >gi|283510551|gb|ACQH01000068.1| GENE 73 101230 - 101673 507 147 aa, chain - ## HITS:1 COG:no KEGG:ZPR_2773 NR:ns ## KEGG: ZPR_2773 # Name: not_defined # Def: FabZ-like protein # Organism: Z.profunda # Pathway: not_defined # 13 146 15 149 149 100 39.0 2e-20 MDNNHKLIGRTVDVSELLPQQPPFVMVDSLLSFDNDTTVCGYCVPEDGVFTDNGQLCAAG LVENVAQTCAARLGFYNKYVLNRPVQLGFIGAIRNFCVARLPRVGQQLVTTIHVEENVMG MTLASAKVECEGETVATTEIKIALIEE >gi|283510551|gb|ACQH01000068.1| GENE 74 101631 - 102533 869 300 aa, chain - ## HITS:1 COG:Z4858_2 KEGG:ns NR:ns ## COG: Z4858_2 COG4261 # Protein_GI_number: 15803996 # Func_class: R General function prediction only # Function: Predicted acyltransferase # Organism: Escherichia coli O157:H7 EDL933 # 69 288 72 303 312 73 27.0 6e-13 MEWQGRTDGNRWMQQQLINCFRFLSLHIYYMGVAVVVPFYMLFGKGFKPSYGFYRKRFGW NALKSFWWSYKNHLRFGQVMIDRFARYAGKTFALTTDNIELYNELEAGDEAFVMLSAHVG NYEMGGYMLTAKRKSMHMLVFAQETETVMENRKKQFGRTNIRMIPVDNGMNHLFAINEVL ATGQILSLTADRRFGSERSVVCRFFGGDALFPLGPFATIVQRDVPTLVVLVVKTGIKSYR AILTRLPLPPTNLPRKERIQQLAQAYANELERVVRLYPEQWFNFFNIWTTTTSSSAEPST >gi|283510551|gb|ACQH01000068.1| GENE 75 102558 - 102806 472 82 aa, chain - ## HITS:1 COG:no KEGG:Cpin_1856 NR:ns ## KEGG: Cpin_1856 # Name: not_defined # Def: phosphopantetheine-binding protein # Organism: C.pinensis # Pathway: not_defined # 1 79 1 79 85 68 43.0 1e-10 MSRQEIEEKVREFLIEDLEIDEEKIVPEGKLKDDLGIDSLDFVDIVVIVEKKFGFKIKPE EMAGITTLGQFCDYIESKVNNA >gi|283510551|gb|ACQH01000068.1| GENE 76 103119 - 104213 1392 364 aa, chain - ## HITS:1 COG:Z4850 KEGG:ns NR:ns ## COG: Z4850 COG0500 # Protein_GI_number: 15803988 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli O157:H7 EDL933 # 11 359 2 347 352 320 45.0 2e-87 MHLFPALSKKYCHERMSAGEAQRMAQFIAWAPMVFQVSRLMVKFGILDLLRDSHDGLTQG 
ELVQRTKLSTYAVKILTEASLSAGIVLVDEETGRFSLSKTGWFLLTDDSTRVNMDFNHDV NYRGMFFLEEALTNEKPEGLKTLGNWPTIYEGLSKLPEEVQSSWFGFDHFYSDNSFSQAL NIVFAHNPRHLMDVGGNTGRFALRCVAHSADVNVTIVDLPGQIGMMKKNIEGKEGAERIA DYPTDLLDKTNELPKGEWDVIWMSQFLDCFSEPQIVAILQKAAAVMNPNTRLFIMETFWD RQQYETTALCLTMTSVYFTAMANGNSKMYHSEDMARLIDQAGLKVVATHDGIGRGHSILE VSLP >gi|283510551|gb|ACQH01000068.1| GENE 77 104414 - 104986 636 190 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260910716|ref|ZP_05917375.1| ## NR: gi|260910716|ref|ZP_05917375.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. F0295] # 1 190 25 214 214 340 90.0 2e-92 MRHLILALAMLALAPKAMADSDMNFIRGNACIVNDQRKCVVDVDYSDLNIEGKPCMQYLQ DRGEENVRDWKSDMELCISQFVEKWNKDNKGALKAIAEKDEGTPLRLVVKMKRLHLGITA LAVVIGFGSGDAHLACEVDIYDGNKEITKLRIDDLSSGSQYTESQRLRDIFKKLAARSVK CLKKACKKQK >gi|283510551|gb|ACQH01000068.1| GENE 78 105079 - 106296 1491 405 aa, chain - ## HITS:1 COG:fabB KEGG:ns NR:ns ## COG: fabB COG0304 # Protein_GI_number: 16130258 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Escherichia coli K12 # 3 405 2 403 406 275 39.0 8e-74 MQKRVVITGIGVWGCLGKNTEETTAALRAGQSGIGIDEARLTYGYQSALTGIVERPVLKG LLDRRMRTGLPEEGEYAFMASREAFAMAGIDDDYLLANEVGCIFGNDSSSTPVIEAAAIM REKHDSQLLGSGYIFQSMNSTVTMNLSTIFHLRGINFTVSAACASGSHAIGLGYMMIKQG LQNMVLCGGAQEVNLYSMATFDALGAFSKRMDDPQSASRPFDRDRDGLIPSGGAAALVLE EYDHAVARGATILCEVVGYGFSSNGGGISQPSDEGSVTAMTRALNDANLSPADIDYVNAH ATSTLQGDMFEAIALDRLFGQHMPWISSTKGMTGHECWMAGASEVVYSILMMQGGFVAPN INFENPDEHSAKLRLATHRIDTQVNTVLSNSFGFGGTNSALIIKK >gi|283510551|gb|ACQH01000068.1| GENE 79 106609 - 107331 252 240 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 4 238 6 238 242 101 32 2e-20 MRYVLVTGASRGIGRAIAIKLAKQGYAVVINYLHNDEAAQQTRDQIVAEGGQAELSRFDV AQPQAIEQAVEQWQQRHPDEYIDVLVNNAGIRRDNLMVFMQNEEWESVLQTSLNAFFYLT 
RRLLKDMLTHRHGRIVNISSVSGLKGMAGQVNYSAAKAALIGATKALAQEVAARKVTVNA VAPGFIESDMTGDLNQDELKKIVPMKRFGKAEEVAALVAFLASDEAAYITGETISITGGL >gi|283510551|gb|ACQH01000068.1| GENE 80 109006 - 112773 3583 1255 aa, chain - ## HITS:1 COG:BH0785_1 KEGG:ns NR:ns ## COG: BH0785_1 COG4724 # Protein_GI_number: 15613348 # Func_class: G Carbohydrate transport and metabolism # Function: Endo-beta-N-acetylglucosaminidase D # Organism: Bacillus halodurans # 80 601 66 554 556 159 26.0 5e-38 MKRKSTLMALALSAIAVTAQAQRTPSHPLDIATADETQWINSFMSWEPGKTIGNASRIDD EFFISRVKPRKRIGAEQGDYTVDATVDSKRKMCLWVPLDDPTSTWKAFPRYCFEGDNFSL WSYLDIHGNWTAPWVRCSAGLSDVAAKNGVKVGTLMSIPWDVHININWGTSGHAATLKML TESSYGKYKYAEKLVQYMKYYGINGIGVNSEFKSDRSSMTRLINFFAACHEEGKKQDWEF QVYWYDGTSSSGRIHFDGGLDSHNEEMFGAGDRPVVDMLFFNYNWGKDQLNNSARKAKSM NRSSYDIYAGCDIQKNGMRLSGGDRSWRALSETKASVGFWGAHSQSLLHQSATDDGTSDI AIQKAYLLKQELCFSGGHRNPAYRPEFTDNCNLSNASLQRFHGLSSLLTAKSTITSVPFV SRFNLGNGLKFYNQGEVTFDHKWHNINTQDIMPTWRWWITNAEDKVDAASYPSLIKADLI FDEAYFGGSCLSLHGQTAASHVKLFKTMLDVTADNTLSITYKLTEGNKTHAKLFIALKGE TNKYKEVAIPDAPKAGQWNTFTAKMSDFGITANATVAMIGIKVEGTTHNYNMLVGEMAVR NPQQAFNTKKPTVGELQVLRGRYNAFDFKMRYYSDDGTGAKKTYNDEVDTWYYEIYIQPK GGKEQLLTATTSWAAYVVDAPVPSDKANRQVRMGVRAISPDGLNHSDIAWSAWTDIPYDQ PLHTAKLSKAVIKPGEEFTLSLEDLIEEPAQKWEVIRPSDKRVLFSAENATSCNLKLDEV GVYDVRLVDNTGQEHLNRGFIQVSPESTGAVPHIENITPSKQQTKAGEEVKYSFTGKKGD GKVSRGLIISDPSMFRMPADLQQGTTYSYALWFKAEPFRHDKQGTNLINKNTIKDNWPHN NWGDLWVTIRPQWKNHAANEVSFNTMGWKDHDDPKADMMSTGYQVVPGVWTHIVVTQDGG RQCIYFNGKQVANAFFGASTRREDSGDGRIRRNELADIFIGGGNVYKAGFNGVIDEVQVW DKFLSAEEVKQCMNGYSKGNVPANLRGYYTFEEKTADGKYANWGSAGEQFKGDVVAIAQG GGEKTDNAFYDPREDNNNVLGYPGISGTLEVKTQATWTLDGANVIGRGDVATVTYAAAGK YAATLKLANLWGEDTKTVTDIVEVTGTTNGVNAVEAADFALYPNPFVESVNLRFAEGGEY TIQVTGVNGAVMQNTKATANAGDVVNVAITGAKGLYIVRVMKNGKLYKAVKVVKE >gi|283510551|gb|ACQH01000068.1| GENE 81 112643 - 112912 63 89 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKLLIHWVSSAVAMSNGCDGVRCAWAVTAIADSAKAINVDFLFIVYSFINTLLGSFPAIV LSFQMFRGFSFARTVVFVFCFLSINVVRG >gi|283510551|gb|ACQH01000068.1| GENE 82 112959 - 113489 569 176 
aa, chain - ## HITS:1 COG:no KEGG:BF0685 NR:ns ## KEGG: BF0685 # Name: not_defined # Def: putative RNA polymerase ECF-type sigma factor # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 6 168 9 171 184 162 50.0 4e-39 MTEKEFRTYFRDLYPQLAQYATRLLGALDVEDVLQGTFLELWNRRDDISTAAHVRAFTYR TVYTQSLNVLKHRAVVNRYSAAVVDMENKTAALLSPDACDVAFNMACKELSNKLTNAIEQ LPPKSKEAFTLSYLHDMKNKDIAQEMGISVRTVDAHIYNALRTLRTKLKLTDLNTD >gi|283510551|gb|ACQH01000068.1| GENE 83 113824 - 116334 3147 836 aa, chain - ## HITS:1 COG:alr4028 KEGG:ns NR:ns ## COG: alr4028 COG1629 # Protein_GI_number: 17231520 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Nostoc sp. PCC 7120 # 104 256 57 211 287 80 32.0 1e-14 MKKKRTVRKQFFKLMFALCATFALSIPIGMRAENEVKEKTDANIKGSVVDSKTQQPLSHA SVLVVGTVITTTTDAEGHYALNNLPLGKVKIEVRSAGYRALRQEVTTERNYTTEKNFELT PDEIALDEVVVSANRSLTLRRESPVIVNVLDTKLFESTHSTTLAQGLNFQPGVRTEDNCT NCGFSQVRINGLDGHYSQILVDSRPVFTALQGVYGLEQIPANMIERVEIVRGGGSALFGA SAIGGTINIITKEPSENSAELAHTITGATTGSTAFDNNTTGNVSVVTTNGRAGFYLYAQS RHRSAYDRDGDGYTDLPTLDNKTLGLSTFLRLTDYSKVTLKYHGLKEFRRGGNNLFLPPH EANIAEQIEHTINGGSLAYDLFNPNGKGHFSAYASFQNVDRKSYYGGLGELASAAEGQKA LNEAYKLGLSLDMSEEDAAKLPDDQQEILANAQAYDKAQRAYNVTHNINYIAGAQYVHNF DRLLFMPADLTVGAEYSFDRIKDRSLGYNSLLEQRVRISSAFVQNEWKNAHWRLLLGGRL DKHNLISHLIFSPRVNLRYNPTPNANFRLTYAGGFRAPQAFDEDLHTKISDGDRVKIALA KDLKEERSHSFSASADLYKSFGTVQTNLLIEGFYTRLNNMFATRKLADDVVIDGARVEER YNSNGATVFGLNLEGKASLTSWAQLQAGFTWQSSRYRTAEEWDDDAADEFKTTKRLLRTP DTYGYFTLNVRPMIDFNVSLSGVYTGRMYVGHPKGGSERTKDFSIIEHTPSFLTLNLKLA YDFYLYNQVKLQASAGVQNLLDAYQKDLDKGPSRASDYVYGPTQPRSLFLGVKISY >gi|283510551|gb|ACQH01000068.1| GENE 84 117856 - 119172 1205 438 aa, chain - ## HITS:1 COG:MA1868 KEGG:ns NR:ns ## COG: MA1868 COG3177 # Protein_GI_number: 20090718 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 2 413 42 458 473 293 37.0 3e-79 MGKIELPPQVSIEDRLRALFKMQDKDITALVEQINEEYEYWDTVKYKPLPKGCTPQKLWL 
CTKASRLRNKLCVWPSYAITLSLTSHMQRMCHEFDMNFGGTWMADDKLPPETNERYLVSS LMEEAISSSQMEGANTTRRVAKEMLRKQTKPKDKAQQMIANNYQTIQFIVQSKDKPLTPE LLQHVHRLMTEKTLSNDEYAGRFRTNDEVVVADGITHEVVHRPPSHKDIEPFVADLCSFF NDEEQSVFIHPLIRGIVIHFMLAYVHPFADGNGRTARALFYWYLLRRGYWLTQYLSISRV IARSKTAYEKAFLYTEADDNDMGYFVAYNMRVLQQAFRELQTYIKRKQKEHLAANTYLRL GDINERQAQIIQLLVDNPKAVLTIKELQNRFLVTPTTAKTDIMGLVSRGLVSEFAFNKVK KGYVRSEGFEEVIKQLNA >gi|283510551|gb|ACQH01000068.1| GENE 85 119730 - 121019 1310 429 aa, chain + ## HITS:1 COG:STM3238 KEGG:ns NR:ns ## COG: STM3238 COG3681 # Protein_GI_number: 16766537 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Salmonella typhimurium LT2 # 14 428 17 435 436 288 41.0 2e-77 MLSENDRKAILELIHRQVVPAIGCTEPIAVALCAAKAKELLAQMPERIDVRLSANVLKNA MGVGIPGTGMIGLPIAIALGVLIGDSEKQLEVLKGCTPESVKQGKKLIAANCIDIKLEEE DEDKLFINITCTCGDEVAEARIKGSHTNFVYLRKNDKVLLDKAACSAETVAKTGVELSMR KVYEFATETALDDLRFILEAKKLNENASRCGLEDNYGHQLGKTMCSPLGRGVLGDSVFAH ILSATGGACDARMAGAMVPVMSNSGSGNQGICTTVPVTTFARENHNTEEELIRALIISNL TAIYIKQHLGTLSALCGCVVASTGSSCGITYLMGGTFEQICYSVKNMIANLTGMICDGAK PSCALKLSSGVSTAVLSAMLAMQHKYVTSVEGIIDDDVDRSIRNLAAIGSRGMDETDKYV LDIMTHKSC >gi|283510551|gb|ACQH01000068.1| GENE 86 121110 - 123170 2013 686 aa, chain + ## HITS:1 COG:NMA1719 KEGG:ns NR:ns ## COG: NMA1719 COG4232 # Protein_GI_number: 15794612 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol:disulfide interchange protein # Organism: Neisseria meningitidis Z2491 # 16 597 31 554 613 108 23.0 3e-23 MKKIRLALLTALLAVFALGASAQMQNPVHVSVQQKQVSPTEVDVIFTAKIDNGWHVYSTG LPSGGPISATITTEKAEGAQAAGKLQAKGKEISNYDKLFEMKLRYFENAVTFVQKYKITG KTYRIKGYLEYGACNDQNCLPPTQVEFDFKGNGPANAPDAKETKTPAQLAKEKADSLAAL AAANANATTDTAKTSAAAVAEGGNNVEQGGLWAPVVAELQAYNGDQGKQDNSMLYIFAMG FIGGLLALFTPCVWPIIPMTVSFFLKRSQNKAKGIRDALTYGVSIIVIYVALGLAVTAAF GPSALNSLSTNAVFNIFFCLLLVIFALSFFGWFEIRLPDKWGNAVDSKASSTSGVLSIFL MAFTLSLVSFSCTGPIIGFLLVQVSTSGSIMGPGIGMLGFAVALALPFTLFALFPTWLKK 
APKSGSWMNIIKVVLGFIELAFALKFFSVADLAYGWHLLDREVFLSIWIALFFMLGLYLI GRLKFASDQGSSDAMPVPCVMLGLVSFAFAVYMIPGLWGAPCKAVSAFAPPMNTQDFNLY HNEVRAQYDSYEEGMAAAAAQGKPVLIDFTGFGCVNCRKMEAAVWTDPQVADKLTKEYVL ISLYVDDKKPLPEPVEVTENGQKRTLRTIGDKWSYLQRSKFGANAQPFYVAIDNAGKPLA RSYSYDEDIAHYMDFLNQGLDNFKKK >gi|283510551|gb|ACQH01000068.1| GENE 87 123611 - 125179 1121 522 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928799|ref|ZP_06422645.1| ## NR: gi|288928799|ref|ZP_06422645.1| lipoprotein [Prevotella sp. oral taxon 317 str. F0108] # 1 522 1 522 522 972 100.0 0 MKYRHLLHAALLLTVTAFASCANEDVAQDENKQAQNATNAPVVTFTGERDIAQVSPTSRT TIKHALGQGATPYWSTGDKIWVKDTNGTFKQSNAGTFNSAMTDGVFAVSGTFANGCTVHY TGSNGTAGDKVTIAAKQNQTEANDFSHAGASGDCGVATATGNGNSFKFKLNHKASYLCFL PRTTNAFVKRSKLTSIEVVAESDIAGTFDFSNGSLSAQPVANGSKMITLVTGNGFDLTNE NTSLATNGAYVVIPPGTHHLTVRYWLENMVDNPNGRIVGTVTKYIDINCEAGKIYDITAN LNPADVSGEYFMWDAKKHYWYNRESSQPKVSGASTTNFPKSQADSPDSWFNSPAGGANWA SNSATQCASVNQVIWYVQKGAPHWDAAELWTLFGHLYKGGMWLKKQDVIARENHTTTQNM YASYNGIDYRQSTATFADYTFSNNDIVKERPTKSEISDYFYLPAKGFYVEGKMQYISYLG YYWSATCLKADAQRAFSLTFNPSSISLGSNFRFNGFAEDLKW >gi|283510551|gb|ACQH01000068.1| GENE 88 125482 - 125658 185 58 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928800|ref|ZP_06422646.1| ## NR: gi|288928800|ref|ZP_06422646.1| hypothetical protein HMPREF0670_01540 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 58 1 58 58 103 100.0 2e-21 MMKQVVFRKQAYLCPKVEIVGVEMENLMQQVSGNHSPIGQGGTYGDAKQGNFVDEDEE >gi|283510551|gb|ACQH01000068.1| GENE 89 126017 - 126202 84 61 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MALFSYSRACLIDIPQLIDNSPLTKHAILNWSKRCCRQPCQYCKLLLATYNADVVYDLGY A >gi|283510551|gb|ACQH01000068.1| GENE 90 126546 - 129563 2930 1005 aa, chain - ## HITS:1 COG:BH0785_1 KEGG:ns NR:ns ## COG: BH0785_1 COG4724 # Protein_GI_number: 15613348 # Func_class: G Carbohydrate transport and metabolism # Function: Endo-beta-N-acetylglucosaminidase D # Organism: Bacillus halodurans # 112 596 83 554 556 111 25.0 6e-24 MKRNHVLTSLAFATLLTSTPVRGFAQQQSETASTDIYDFSKFDEKLLLDLFVTAQENGRK YPTVAELKAAGIYEELQFVRSHVAKRKLIDNADRLIPGTYAERELFLNLPMAVGKGIGGY PSKSFASDNFSFWNYTNLWGSWNHSLFQAPGAWADAAHKNGTDILSGMKFFDSKSGNPRA WIALLSQKKSDQTYKYAQPLINLLRYLGMDGINYNWEAMGYSNGDVVNFHRELYAIAQKE GFDNFHIMIYTVNSTLGGWDAPALFGSNGKRTAELMLNYSGSGFTYNMRQSVMMAESVMK TSKGLYAGGWLVGMDKNWSHLSSDEVTKRCGIALWGEHDQSRFWSYNSGKGPHDLMSNYQ MLLERAFSGGYRNPAKRPAVADQGNNLRWSGSTPPLSTFAGLATWIPERSAIGGNLPFAT YFNLGNGERYCYKGKKTGGAWYNMANQDIVPTYRWLVYEGGSTNVSTAVQPEFTNTDAYM GGSCLMLTGKQAAAQTDVVLYKTDIQGTKGNIYANVAIKNGKGAEQVSNLYLLVRVKGNP AWKEFAVGDVATKQWEEKRIPLTGINSGDVIDRIGLRVKNSDANYKMYVGKLLLADDFKA EPKPVEDFTVQVKEETKKSLSVKAAWTIAGEDGQRVLTNADANIDHFELLYKNGEKGKVS EIARTTQWATFVGDIDLPNAGDAPFVGVRAVSTDLKTYSPVVWQPITRAKQELLPDLPEE DYGTVELDMAADGVATAQKVRFLTVVKTEGGLQDIEYNASKPVGGKNYVDATHLVWRVKQ GQTVKIKFKGYEATEYADNTTDDLRYCLGKGWLDLNGDHVFNGNLLEDNPDAGECLFNVG KYKSSTIYNVQSLQEKTFTIPNDARVGKSRLRIVFSDAWFEGALQPVGKFNKGFAIDLGV MIEGTNTSRSDKSGNWDEGQADQPEGLDITNSITNAAAEASTLQVDGRSMTFAHVEQAWI FAVDGTLVTTLHKPSSFDASVLPKGVYLVKMKNKNVVRVGKITMK >gi|283510551|gb|ACQH01000068.1| GENE 91 129700 - 132726 2956 1008 aa, chain - ## HITS:1 COG:BH0785_1 KEGG:ns NR:ns ## COG: BH0785_1 COG4724 # Protein_GI_number: 15613348 # Func_class: G Carbohydrate transport and metabolism # Function: Endo-beta-N-acetylglucosaminidase D # Organism: Bacillus halodurans # 119 597 88 552 556 113 26.0 2e-24 
MKRKVTLSALALAGMLSAAPFTVKAQGGNTAPTASKVVYDFQHFSDIKMLQLFEKAVREG KRYPSYEELKAAGLADEIEFVRSHVRKRSILPRTDRLIKNTFETRELFMNIPAGAGRTTG GYPSSEFASDNFSMWNYTNLFGAWNHGLFQAPGSWADAAHKNGTDMMSGIKFFDTTGNPG GVDAGGWLPILKEKNADGSFKYAKPMIYMLQYLGLDGINYNWEAPNYDDPQVIAFHQKLY EIAKEEKFDNFHIAIYTLNNSLTTRNVEALYGKDKKKTADVMLNYVSDDFTYGMRTSVAQ AKAAMGTADGVYAGVWIVTMNRSWTRLNSAPECGVCLWGEHAQSRFWSYNAGGDADEMMV NYQKLLERGFSGGNRNPLSRPAVNDRGHNWEESNGVQPLSTFPGMATWIPERTTITGNLP FSTHFNMGAGSQYHYKGKKTAGSWYNMSNQDIVPTYRWLVVEGNTNNHSDKVDPEITYKD AYMGGSCLLLNGKGQASSTDIVLYKTDIAGSNGAITANVAIKSGKEVPAESNLYLIVHLK NSNEWKEYAVGGTTGKTWEEKKIALNDITSGTAIDKIGLRVKNSTNSYKMMIGKLELVDG FTAAPSGVKNVAIQVKEETKSSLSVKATWAVDKAETTTGMIYNDDANIDHFEILLKDGTN GRVSEVARTSQWAAYVGNIDLKSFTNNPFIGVRAVSKDLKSYSPVVWTEVVRANANELPE LVLNPYTSPELDTDADGYKTAQAIRYVEIFKTEGGSTNIDYKATGPTGGTNYVDVTDQVL TIGQGQTVTLKFKGFEATDEKNGSHDDLRYCFGRGWIDLNGDHRFDPRELTVNPKEGEQL FTIGELRKGVPDQVQKIVTKTFTIPADARIGDTRMRIVFSDAWFKGALKPTGKFNKGFAI DFRVKITGNNPQRPVPADTRDQGLADEPAGLSTTGVSNVAVAPSQLQQEGNSLNFANVEK AWIFAADGSLVAALNNPANYNLSSLASGVYLVKMQNKNIIRTQKITVK >gi|283510551|gb|ACQH01000068.1| GENE 92 132904 - 134724 1645 606 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_4558 NR:ns ## KEGG: Fjoh_4558 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 97 606 67 526 526 214 34.0 6e-54 MNRYSKWFVGVMALLSLTLQSCLDFDTTGDEFSATQKNTNKVSRRGKVDSLNYRAKFTAE DVKRAANKMSDVIAAGLSAQYAMRGGKEGKKPVTHAYQYQYGLGADLYAQYGVIPHTNFP YSKVTITSAYNISDRFYGGAMGSFNEVSTQMVPLLNHESIDTIPELKAIYLLLYNYSAVE VANVYGPFPYQDIKTNAQTAPFEYQSLETIYNKAVDNIDSIVACLNYYETKPEEYKEALR DILKQNMHVTGDYKTGFSNLNTFVRFANSLKLRMAMHIVKVNPTLAKRWAEEAVKAGVIE EYSQEVCINTAAEGIAHPLLEASQSWHDTRITASWVSILKSLNHPYLKYLFLKNDGDIVN SKTKEVTPANTDVIGIRSGVTPGEGQDYAGNQFIAFSAINPEAAHSMPLYLMKLSEVCFL RAEGALRGWSMGGTAKEFYEKGILAGNCEDREQFYTDANNQNPYTAGMAEYMQQESATPY TYKDPTGESDPIESLTKIGVKWNEDDSKEVKLEKIITQKYIAGYPESFEAWVDLRRTGYP RLFDVLQADEADGSLNDGDIIRRLPFPGRTDPATQADIKATGLGALGGPDFVGTRLWWDV KGTPNF >gi|283510551|gb|ACQH01000068.1| GENE 93 134829 - 138254 3082 1141 aa, chain - ## HITS:1 
COG:no KEGG:BVU_2204 NR:ns ## KEGG: BVU_2204 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 58 1141 110 1115 1115 531 34.0 1e-149 MKRTNAQTEGLLLKALLALSMTAVPSVGVLAGVKANTTQNPLLKATATTQNSALQRVQAN TRTITGVVKDAQGEPLIGVSVQVKGTGKGAITNIDGQYTLTTNESNPTLVFSYVGYTAQE LAAGSRSQVDVTLKEDSKVLGEVVVTAMGIMRKQESLTYATQKVKAEDLMKVQDPNVANT LEGKVAGVTITPSAGGAGGASKIILRGNKSITGNNSPLIVVDGIPLTNETRGKITNGDDL GYAGSSEGSDPLSMINPDDIESMNVLKGANAAALYGSRAANGVIMITTKSGREGKVDINF TSNVTFDTPLLTPKLQNTYGAALTDGSVVNGTQTYMLSPSSWGGKLSDQTNLLPSVQLGD QYPQGALNQVHLRKNAVNDIDNFFRTGVTTNNSVSVSGGTEKSRTYISLGNSHSNGMIKN NSYNRNSITLRQSFKLFDRLKLEGSVNYVETITRNRPGGGTIGNPIYHAYMMPRNIDMDY YAQNYSKRGTWLSNPLSYYEPKEFTVLDQNGNQKKETHYVYTQGKRVPLSSEMQDWAYLS NGNNNPYWLVNKNRNRQDENRLFTTLSATIDIYKGLSFQARFNYTKTHFTKDARQYATTF LPSSMNSFGSYWSEDFKTTEMYTDYLLSYNTNVKDFSVSATAGWVGHTIKNTYKRTAVGN ATFVDPLRRILPSAVNVFETSAGDVGATTTTKSSNWDKAMLFTAQVGWQDKVFVDGSYRL DAYRPYRQFKLRGIIDSEWQGYFGVGANAIVSSLVKLPVWIDYLKYRVSYSDVGNSIPNN SYFDVSRNYLTGTATVNKYSDFVPKVETLGSFETGVEMLFFKNRLSFDLTLYDSRLRNMY MESANISAKTNAGNTGKVRNRGFETTIGYDFKFGKDLRWRTSYNLSFNDNKILETAYDNQ GKQKEIKNEIAGVYVVYKPGGSLGDMYVDDFKRDANGHIKLTRNGAPRFDKSGNSRKYVG NMNSKWQMGWSNTFNYKDFTLSMLINGRLGGKVISLTEAYLDLYGLSERSGKARQDAEAN GIVAANYGNVPGIPLPDGSGRIVPIDTYYKTLGGSSNPNSYIYSATNFRLRELSLGYTFR NLMGQNKNLSVSFIARNLFFLYKKSPADPDVSLSTGNGLGGFELFNTPSSRSYGLSMKLN L >gi|283510551|gb|ACQH01000068.1| GENE 94 140147 - 141136 794 329 aa, chain - ## HITS:1 COG:alr4033 KEGG:ns NR:ns ## COG: alr4033 COG1120 # Protein_GI_number: 17231525 # Func_class: P Inorganic ion transport and metabolism; H Coenzyme transport and metabolism # Function: ABC-type cobalamin/Fe3+-siderophores transport systems, ATPase components # Organism: Nostoc sp. 
PCC 7120 # 4 225 2 217 333 203 49.0 3e-52 MPSITLRDLSVGYQHPRRTPTTVLAQLNATLHAGQLTCLVGANGIGKSTLLRTLAAFQPP IAGQMHYYSGEQVAPINLVSLSQARLARLVSVVLTAKPSVENLSVEQIVGLGRSPYTNIW GTLRAEDHRMVEWAMDVVGITSLRHRQVQTLSDGERQKMMIAKALAQDTPVILLDEPTAF LDYKSKVEVLGLLARLAHETNKMVLLSTHDLEQAVHAADALWVVAKQQLGDGTQPCSRSD SSDRSDQSDQSDRPNGLDAPNSFSQYHHLPLLYVLEKHIHTTEQGHQQLVFEPAPTAPAP PHLTLTATPNFETRTQELLRLLGEEGIKR >gi|283510551|gb|ACQH01000068.1| GENE 95 141313 - 141945 706 210 aa, chain - ## HITS:1 COG:aq_1507 KEGG:ns NR:ns ## COG: aq_1507 COG4122 # Protein_GI_number: 15606661 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Aquifex aeolicus # 2 208 8 212 212 151 39.0 7e-37 MDLTTYIANHIDPEGDYLYRLYRATNLHTLHGRMASGHLQGRLLKMLVQMTQAKRVLEVG TFSGYSAICLAEGLPDDGLLYTFEINDEQEDFTRPWIEGSAVANKIRFIIGDAITEAPRL GITFDLVFIDGDKRTYVETYEMALSVLRQGGFIVADNTLWDGHVFDSAYDKDQQTLGIRR FNDLVANDSRVEKVILPLRDGLTLIRKVKK >gi|283510551|gb|ACQH01000068.1| GENE 96 141949 - 142365 476 138 aa, chain - ## HITS:1 COG:SMc01343 KEGG:ns NR:ns ## COG: SMc01343 COG0757 # Protein_GI_number: 15965074 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate dehydratase II # Organism: Sinorhizobium meliloti # 2 137 4 141 148 145 52.0 2e-35 MTIQIINGPNLNLLGVREPDVYGSTSFDDFLPHLRACFPDVQIDYFQSNIEGELIDKLQA VGFQCDGIVLNAAAYTHTSIALADCIRAISAPVVEVHISNVHQREAFRHQSMIAAACRGV ICGFGLDGYRLAIESFKR >gi|283510551|gb|ACQH01000068.1| GENE 97 142411 - 143325 938 304 aa, chain + ## HITS:1 COG:SA1328 KEGG:ns NR:ns ## COG: SA1328 COG4974 # Protein_GI_number: 15927078 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Staphylococcus aureus N315 # 8 300 1 294 295 224 41.0 2e-58 MERENKTVDNMVKAYMRYLKLERNLSPNTIEAYRNDLRWLLAYVNFHGLKVEELKLEHLD NFSASLHDQRIMPRSQARILSGVRSFFKFLLLDGFIDADPTELLVSPHVRNALPDVLSTA EVDRLEASIDLSKWEGQRNRAIVEVLFSCGLRVSELVNLKLSNLYVEEKFVRVTGKGDKE RLVPISSRALDELNAWFADRNAMRIKPGEEDYVFLNRRGAHLTRTMILIMIKRQAVAAGI 
TKTISPHTLRHSFATALLEGGADLIAIQAMMGHEDIATTEIYTHIDTSSLREEITKHHPR NKKG >gi|283510551|gb|ACQH01000068.1| GENE 98 143542 - 144570 938 342 aa, chain - ## HITS:1 COG:no KEGG:PRU_2865 NR:ns ## KEGG: PRU_2865 # Name: not_defined # Def: M20/M25/M40 family peptidase (EC:3.4.-.-) # Organism: P.ruminicola # Pathway: not_defined # 1 342 1 329 329 425 58.0 1e-117 MKKIIIAVIMIAVIALIAYSMGLFSTSHSAAEEAEEAEMAAAEKLNPVGPAFNADSAYAF VAAQCAFGPRAMNTKSHDNCGEWIAKKFESFGCKVKNQRTELRGYDGTMLRCRNIMASYN PESTTRILLCAHWDTRPWADNDPDSANWRKPIDGANDGASGVAVMLEIARLLNQNKNLNI GVDFVCFDAEDWGTPKWSGQEDSEDTWALGAQHFATNLPAGYEARYGILLDLVGGIGAKF YREGMSKAFAPDIVKKVWRAARAAGYGSFFPKSDGGMITDDHVPLNEKAKIPTIDIIAFY PDCVQSSFGPTWHTLNDNLQNIDRNTLKAVGQTVIQTLYSEK >gi|283510551|gb|ACQH01000068.1| GENE 99 144907 - 146745 1777 612 aa, chain - ## HITS:1 COG:RSp0056 KEGG:ns NR:ns ## COG: RSp0056 COG2355 # Protein_GI_number: 17548277 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent dipeptidase, microsomal dipeptidase homolog # Organism: Ralstonia solanacearum # 364 602 59 320 323 155 35.0 2e-37 MQNFDLQARLQNIYASFPQAKRQPVIGLTANFRDGSVMLMDRYYRQVVDAGGTPVLIPPV DSVDVIVNTLDNIDGLILTGGADLNPLWAGEEPSPRLHAVNAERDLPELLITQLAYNRQI PILGICRGMQTMAVALGGKVAQDIEEHFTNDVRRGLAKGEFKHFLPRFSDKLIQHSQDAQ RNLATQTVYFEPNSLPAQIFEATSLHVNSFHHQAVSHAGSKFRFVAATVDGIPEAMESTE HKPLLGVQWHPEWMEEQGLPLFQWLVRQAQTFADAKRLHAKVLTLDTHCDTPMFFPLNVR FDQRDPRILVDLHKMSEGRQDATTMVAYLPQPKSGETFREKVAFDVTGPRNYADLIFDKI ESIVAENSQYVALARNPQQLYANKAAGKKSIVLGIENGLAIENDLANVAHFAQRGVTYIT LCHNGDNDICDSARGTATHGGVSDFGRRVIEEMNRTGIMVDLSHGAESSFYDALEISSVP IVCSHSNCKALCDVPRNLTDNQMRALAQKGGVAHITLYHGFLRAQGEASILDAMAHLEHA IKVMGVDHVGLGTDFDGDGGVCGIANSSEIINFTIQLLRRRFSEQDIEKIWGGNWLRVMK EVQSKKHLKVKG >gi|283510551|gb|ACQH01000068.1| GENE 100 146906 - 147373 615 155 aa, chain - ## HITS:1 COG:YPO0002 KEGG:ns NR:ns ## COG: YPO0002 COG1522 # Protein_GI_number: 16120355 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Yersinia pestis # 3 149 6 152 153 122 42.0 2e-28 
MEKIDSLDKKILNILSQNARIPFKDVAAECGVSRAAIHQRVQHLIENGVITGSGFDVNPK SLGYSTCTYIGLNLERGSMYKNVVARLNSINEIVECHFTTGSYTMLIKLYAKDNEQLMDL LNNKLQTIPGVVSTETLISLEQSIKREILVDPDEK >gi|283510551|gb|ACQH01000068.1| GENE 101 147591 - 148550 1109 319 aa, chain - ## HITS:1 COG:PA4572 KEGG:ns NR:ns ## COG: PA4572 COG0545 # Protein_GI_number: 15599768 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Pseudomonas aeruginosa # 83 287 9 204 205 141 43.0 2e-33 MALALLASAAFNTASAGKKKDVVQVATAAPVVLNTSSDSVSYAAGMAATKGLTAFIQQQY KVDTTYMADFVRGFRKALESDGDGAFTAYAAGMQIAQMAQQRILPSVQQELIGTKDSVAA KNFYNGFIAALEKDFTKYTEEEAQQLFEQRVEAAKKAKVEAAISTGKQWLAENAKKPGVV TLPSGLQYKVIANGTGDKPTATDEVVVKYEGKLIDGTVFDSSYKRTPDTSSFRADQVIAG WTEALQLMSPGSKWELYIPQNLAYGAREMGSIPAYSTLIFTVELLKVNKPKPEVSNAKAE TPAAKPAKKAGRPAAKKRK >gi|283510551|gb|ACQH01000068.1| GENE 102 148568 - 149443 1136 291 aa, chain - ## HITS:1 COG:ECs5185 KEGG:ns NR:ns ## COG: ECs5185 COG0545 # Protein_GI_number: 15834439 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Escherichia coli O157:H7 # 80 288 63 258 259 140 38.0 3e-33 MKKLTILAATAIAAVTFSACGNSTPKADLKNDIDTLSYALGYSQTQGLRDYLSRGLNVDT AYIDEFVKGLNDGANAGDDKKKAAYYAGVQIGQQISNQMVKGINHELFGEDSTKTISMKN FLAGFITGAKNKKGIMTQKEAEDYVRTQAEKIKDRTAEKTYGANKEAGKKFLAENAKKPG VKTLPSGVQYRVIKEGDGPIAKDTSVVKVNYEGKTIDGKVFDSSYTQGQPVTMRANQVIK GWTEVLTRMPAGSVWEVYIPENLAYGSREQPNIKPYSALIFKIELISVGEK >gi|283510551|gb|ACQH01000068.1| GENE 103 149589 - 150194 747 201 aa, chain - ## HITS:1 COG:ECs5185 KEGG:ns NR:ns ## COG: ECs5185 COG0545 # Protein_GI_number: 15834439 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Escherichia coli O157:H7 # 5 200 67 258 259 194 49.0 1e-49 MDKLSYALGLGIGRQLSQMGADDLNVADFAQAVKDMIDGKEPQVPTAEAQQIVEDFFRRQ 
EEKQRAEAAEKYKGAKSEGEKYLSENAKKEGVVTLPSGLQYKVLKEGNGKSPKATDKVVC HYEGMLVDGTMFDSSIQRGEPATFPLNGVIAGWTEGLQLMKEGAKYRFFIPYQLGYGERG AGASIPPFAALVFDVELIEVK >gi|283510551|gb|ACQH01000068.1| GENE 104 150457 - 151161 919 234 aa, chain + ## HITS:1 COG:jhp1180 KEGG:ns NR:ns ## COG: jhp1180 COG0846 # Protein_GI_number: 15612245 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Helicobacter pylori J99 # 6 234 2 224 234 214 50.0 9e-56 MNTKNKHVVFLTGAGMSAESGIKTFRGNDGLWENYPVMQVASHEGWLADPNLVNQFYNER RQQLFAAQPNKGHQLIAELEKRCQVTVITQNVDDLHERAGSSHVIHLHGELLKVCSSADP NNPRYIRTLTPDNAIVRPDEKAADGSRLRPFIVFFGEPVPLIDLAARTVRQADVLIVIGT SLNVYPAAGLLAYAPSTTPIYLIDPEPVETTYNPQVKQLRMGASQGMEELIGKI >gi|283510551|gb|ACQH01000068.1| GENE 105 152302 - 152565 76 87 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSLYSKTKINTSLMHKVLDGDVMKSEMLPHLSVTKRSHVSAGDPGESYSIRSLQVERLVA NGRLLLRLHGVVSVGRGSMPCNRLHWP >gi|283510551|gb|ACQH01000068.1| GENE 106 152902 - 153321 345 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928816|ref|ZP_06422662.1| ## NR: gi|288928816|ref|ZP_06422662.1| hypothetical protein HMPREF0670_01556 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 139 1 139 139 267 100.0 2e-70 MKSTGLKNRAWQLAAMLVVLIAGLSFASCASDDGEDKEASPIIGTWMAQASSTGTYYMQF TEYKQYFFTNDINKGSTTEEGGYTFSNNTLTLNPKGKAPRTYNCTFKGDHLFLTLASGKE GELQFIRYTPTKSVKHVKR >gi|283510551|gb|ACQH01000068.1| GENE 107 153570 - 156809 3496 1079 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 57 1075 7 983 1087 627 36.0 1e-179 MKKIFFLAALAALSLQAQAQTATQSKLKRSATKGNVKATAKQTVAASRIAAQPTFTEWHD LQVNTLNRMPMHTNFFPFKNYEEAKMGDPKRSENYLSLHGDWKFNWVENADQRPTRFYET DFNDTAWGTMPVPGNWELHGYGDPVYLNVGFAWRGHFKDNPPQVPVKDNHVGSYRRTITL PAAWKGKQVIAHFGSVTSNIYLWVNGHFVGYSEDSKVAPEFDLTPYLKEGDNLIAFQCFR WCDGSYCEDQDFWRLSGLARDTYLYARDKKTYLEDLRITPDLTDNYTNGTLTIDAKTQNG TALIYQLLDAEGNLVTNTHATDGKTQLKVPNVHKWTAETPYLYTLRTIVVDSRVKKGQQP KKADEDAYVAVTNQKVGFRKVEIRNAQLLVNGQPVLIKGADRHEIDPDGGYVVSRERMVQ DIRIMKQLNINAVRTSHYPNDPMWYDLCDEYGIYVVAEANQESHGFQYGDDAPSKKPMFA LQILQRNQNNVQTYFNHPSIITWSLGNETADGPNFAAAYKWIKGYDPSRPVQWERGGVDG PSTDIACPMYRTHQWCEDYARDDSKTRPLIQCEYNHTMGNSSGGFKEYWDLVRKYPKFQG GFIWDFVDQGLRAKDKNGVEIYKYGGDYNDYDPSDNNFNCNGIISPDRVPNPHAYEIAYW HQNIWAEPVDLKAGKISVYNENFFRNLDNYKLVWTVLKNGKAVQTGEVEQLDVQPQQRCE IALPIKTDSLCPHAELMLNVDFVLKTAEPLLDAGTRVAYNQFEMQQGACFRKMPKAPAVD KDTRLNLRNKSGESQVVVSNNHFTVAFNRVSGLLATYNVDGKPILGQGGTLKPNFWRAVT DNDMGAGVQKHNRVWREPKLQLVAINAALDKKDNKADVHVEYDMPEVGAQLTLTYSVMGD GSMHVTQQMTPKTADERPFLLRYGMVMQLPYDMQVSEFYGRGPIENYADRKLSQNVGIYK QTADEQFYPYIRPQETGTKSDIRWWKQTNDNGFGFRIVSPELFSASALHYSIADLDEGLE KAQRHSPQVPKSKYTELCIDLGQTGVGGVNSWSKEAIALPPYRLPYKPYTFTFSIIPQK >gi|283510551|gb|ACQH01000068.1| GENE 108 157036 - 157218 98 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928818|ref|ZP_06422664.1| ## NR: gi|288928818|ref|ZP_06422664.1| hypothetical protein HMPREF0670_01558 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 60 1 60 60 103 100.0 4e-21 MDLMLSSLCPHTKLEGTLFTREQVDEISTRIAKRVSENVARQLGVNKGSSSDAGNKQEGK >gi|283510551|gb|ACQH01000068.1| GENE 109 157232 - 158479 897 415 aa, chain - ## HITS:1 COG:no KEGG:PRU_0337 NR:ns ## KEGG: PRU_0337 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 401 1 427 441 218 31.0 4e-55 MTALVGVLNKHAVAIAADSAVTMGNTHKVVNSANKIFTLSKFHPVAVMVFNNAAFMGVPW DIIIKEYRKELGSKALPSVKCYVDDFVRFLHSRDFFCDSTTQSKYLGLLLEAFWDLCISE THGGSDAIAASGYSEDKVKKLLNQALTYFEKQEKCPELKDYSYDVFIKTNKEVVGDCASN KGLSEKELLSKAFHAYLCTQISTQLSTGLVFVGYGENEIYPSLYPLDVAIGVDKRLRYYC EEKKVAVISENGSHAIITPFAQTDVTQTIIQGINPSFQDIIYSVTEKSIKEFINVITSYI DADPTAAHVSKSIRNLETDDIIRDIMRQMGRQMFESYTKLLLNTVISLDKEDMANMAESF IALTSLVRRMQPGKETVGGPVDVAVISKGDGFVWIKRKHYFKPELNAAFFNNYFK >gi|283510551|gb|ACQH01000068.1| GENE 110 159309 - 162134 3531 941 aa, chain + ## HITS:1 COG:pqqL KEGG:ns NR:ns ## COG: pqqL COG0612 # Protein_GI_number: 16129453 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Escherichia coli K12 # 27 924 29 917 931 256 25.0 1e-67 MRFRNLLALALLALCGGVQAQMQMPPIPRDPAVRIGKLDNGLTYYIRYNNWPEKRANFYI AQKVGSLQEEESQRGLAHFLEHMCFNGTKHFPGDALLRYCESLGVKFGADLNAYTAIDET VYNIDNVPTTRQSALDSCLLILRDWAGSLTLDPKEIDQERGVIHEEWRLRTSASSRMFER NLPKLYPGSKYGLRYPIGLMSVVDNFKYKELRDYYEKWYHPTNQGIIVVGDVDVDHVEAE IKKLFGPMKNPANASPVVTENVPDNNTPIVIIDKDKEQTSTIVQMMMKRDATPDSVKGDV NYLVYEYVKGVGIGLLNDRLAEAALKSDCPFVGASASVESYIFAKTKEAFSIAVSPKTTE LTADALRAAYTEALRAAQFGFTKTEYDRSKSSTLSSLDRMYSNRDKRFTSQFANSYKENF LDNEPIPPIEYYYETMKQVVPNIPLEFVNQVFADLVSKTDTNLVIVNFNPEKEGLTYPTE AGLIAAVNQARTAKLTPYVDNVKNEPLITKLPKPGKVVSQKRGPKFGYTELKLSNGVTVL LKKTDYKKDEVRLSGSGGAGSSSYGAADFVNLNVFNSALEVSGLANFSNTELSKALAGKN ASASLSMSEKRMRVGANATPKDIETMFQLVYLHFTKINKDQEAFNNLMESLKVSLQNRAI SPDQAFSDSLSATIYGHNPRVKPLELADLPKVNYDRILHMAAERTANANGWRFIIIGNYD EATIRPLIERYLGSLPSKGANPNSKKVTFFKKGVINNDFTRKMETPKADANMVWFSEDIP YTTENAIKASIAGQILSMVYLKKIREDASAAYTCGAAGSASIDDKDHNVTLFAYCPMKPE 
KADLALQIMRDEVTNLSKQCDPSMLAKVKEYMVKEADNEAKTNGYWAGVISTWYRYGIDL HTDYKALVAKQTPESISNFVKEILKAGNRIQVTMMPAEEKK >gi|283510551|gb|ACQH01000068.1| GENE 111 162670 - 163137 -77 155 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929076|ref|ZP_06422922.1| ## NR: gi|288929076|ref|ZP_06422922.1| hypothetical protein HMPREF0670_01816 [Prevotella sp. oral taxon 317 str. F0108] # 1 61 1 61 61 80 73.0 4e-14 MAILPFLAICFMVRKGFVYTIALDIYAFHLAFSGKKYCILHHFSLRLAPKRTAFSTKLHC ILLQMAQNMVQMAFLLNKYSSCRIHSLPPFCTKTNPCENRFFAARWAIGGKIAAIMLIFL LKIRQNVCGMYASIGNGTKGMYAYHSLGEPLQGLH >gi|283510551|gb|ACQH01000068.1| GENE 112 163815 - 166601 3316 928 aa, chain + ## HITS:1 COG:no KEGG:SRU_0096 NR:ns ## KEGG: SRU_0096 # Name: not_defined # Def: TonB-dependent receptor domain-containing protein # Organism: S.ruber # Pathway: not_defined # 31 924 71 958 962 313 28.0 2e-83 MKFPTRILWGGIRVMVLFMAFACAFNQLAHAQTTDENISLTITDRNGQALPYATVIVESA GLTYTADAQGRLRLKPTLFTTKGTRLTVSYLGKATRQLTITHDAVAKDKKINIALDDNNL YLKDVQVNATRAPRNSNSSMLIQRNTIDNIQAYSLADIVQALPGKAILNTDMHNASFLTL RSALQGDLQNPLDAYSRNKLNDYVRNAAFGIAYVVDGTPLSNNTNMQLDSYGKWGGIKMF DRRFNTDNNENVGSGNDLRLIPASSIESVEVISGVAPAKYGDLSSGAVIINRRAGLTPFY GSVKVQYDIFNASLGKGFALGERWGMLNLNVDYLHSTRDRRDKLKTNANIALNATWTHTI SRALAWENTLAVDFSKNVDGLKSDPDATLNRTRTDRQNLRLALRGALSPQQNALIDGIDY NFSLSLSHQYDMHEEFIANNRLNLITDAYTNGVHETDVAPPYYTALLEIDGKPLALSANV EARRTLRWSRATHTLSLGVFARHEANHGRGKIFNAETPFYDGGQGGRGDRSYQYRNRIKL NQYGLYLQDKLVLPTAHGTLNATAGLRLEYQNQRFAASPRINMMWALDNGLSFNAAYGIS YKIPATAFLYPENVYFDRLVYSNYSNNANERLFMYKTKVVNPTNPNLLSPYTHSFEVGTA YANSNFNTSLTGYLKIDLRGISSEAVLDTMWVQRYETVSQPPNQRPIYAPKGNPELIIDY YYTPRNLLYSRDAGVEWIGNLRNIKPLGLGVNFSVVYNYSLYHHQGERMGTSIDLDKEAV SGIYAPTKNSSHELISTLSLTKQIPTLGLIFNLRLQNFWFRHFNRHGFDIYPIGYVDRNF RRVYLDEQQRKDIRWAYLRLKDNEPVNISQPIIVPNCHLRVSKEIGKQLRLSCYINNVFN YRPWIERQGERIYFNQPPTVSMDVSYKF >gi|283510551|gb|ACQH01000068.1| GENE 113 166659 - 168017 1578 452 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_0764 NR:ns ## KEGG: Fjoh_0764 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # 
Pathway: not_defined # 25 448 18 435 436 184 31.0 9e-45 MKGKMFWNYRIFLLMAVGVTLFFTACSEDDAPEVHGASIKLNVKLAKEFEKAKNQKVAVT LRNTSTREETVVETTLNKDTVLANLPVDMYDVIAAYTLSTEEIKALTGREADGDIAFSAA ATGIQLSPGKETGIDLELTAGGTDDFVFKTIYYAGSDNKKAAGEDDCFIEIYNNTNRTLY ADSLCIALTTMNRYGLRNNGKWHEYAQPQKFYFTPKGTYDWSKAEGMADPEGANDKYVYG NIVLMVPGNGKTYPIKPGQSFIIAPFALNFKEPYTTVNGKEVKPEWPDSTLNLSNAEFDV VYPGNEELDNKNADNMVILHKGNNRYMRLSRNGKEGYVLFRHPNPSQLPLYLRPYRDMKY ADPSQFMQIPVAGIIDAVEVINPNADGYVSPKAFPKQLDASYAYSKPDYSFRCISRKVSR VNEGRRILQDLNNSALDFVQMIPNPKAFAPSK >gi|283510551|gb|ACQH01000068.1| GENE 114 168105 - 169625 1199 506 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928824|ref|ZP_06422670.1| ## NR: gi|288928824|ref|ZP_06422670.1| hypothetical protein HMPREF0670_01564 [Prevotella sp. oral taxon 317 str. F0108] # 1 506 1 506 506 958 100.0 0 MHARLFVSSLLFGLFYIASAQDSLTRFAVLTPQEWRQHHQIMASPLGIYLLPAQSYGYTS ATFTHERGPLMRLQTPEKDTNLRLSSMGVHTTNRWRLYGEFAYDRQFADSVGWLLSETPR LGMPYYFASPRKGSWLNETYQLKGAANYRLSRLFDLGAALQIRYHKGARSNDPRPSTESF YSRYHLNVGLNLAPVHIAMAAGMVYGTSDNNIIYVNENNDRIDRLDFMAYELMGFGMHRK TGKLQNREMQANTYGHELSLQASFTRNETMLWARASYLAQSDSIRRSRTKNVSANLLSTY NVRQTRILAGLIHPLSPALRLQTTVHAHFTKGWDRLDNILGGQKNYVYNHRDVGLNALLY HRFTSQRLDLFALDATAEHEKRQDGATQHTLVRDAFHLAAAYRSQRGLPRNFAFFYGASQ AFTLPKASLSYPSTQETVFSTQLALPLQRFYATSTASTRIALGTAKRFAHYQLTLSLAYQ LGYVLKAQPTIHGTRHNFEAALGLAF >gi|283510551|gb|ACQH01000068.1| GENE 115 169672 - 172596 2439 974 aa, chain + ## HITS:1 COG:pqqL KEGG:ns NR:ns ## COG: pqqL COG0612 # Protein_GI_number: 16129453 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Escherichia coli K12 # 24 745 27 739 931 251 27.0 5e-66 MKKYLLILVLALFALSANAKRTQATLSTDPNLHVGKLPNGLTYYILRNNTPPNRANFYLA QCVGSLQESDNQRGLAHFLEHLCFNGTRHFPSNTFVAYLETLGLKFGQNINAYTGMERTV YHLNNVPTARVSALDSCLLALRDWACDISFSPEEINKERGVINEEWRQRNSATARMLQRN LPRLYPNSLYAHRMPIGLMEIIDTVGPSTLRQYYHRWYHPQNQAVIVVGDVDVARTAKRI EALFAPIRPTKAARRPAIVPVADNAKPIVVVDSDAEQRTTLVQVFCKMPPITPSEKSTRS 
YYALLARRSLMMSMLRMRLAELVVKPQCPFTQAVVGYGVYLYASSKYAFQVTIMAKDGQA QAATQTVMTELWRAAKHGFSAAELARAKAEERNIIERQYAARNEVGNNYLGNQLVEHALS GEPMPSPDALRKLRTSIVDAITPADVQQWLRKMLPTSGRNLVVLSLNPQREGANTPTKEG LLQAVLQPATANIAPYEDNTPNQPLLPILPKAGHIVSQRQESALGIERIELSNGVTVLLK PTSTGKDELLMTVFAPGGSSRLGQADFANARFFNRIVGSSGLGSLSSQQLTKVLTGQTAN ANLSLDTYWLSLNGSASTRDAECLLQQTYLYFTALRPDQQAFDNIMTNSRARLSQVSGLP EMALSDSLTATLHAHNPRFANNTLADLDHVDPNRILTLARQGMANAANFTFVFVGNFQRD SLLALVCRYIASLPASKPLLVSGQPASINHPSPRHQRAQSSTAPTTTAAPNSETSLVGAL RSVQTYARGITRNHFRHAMTTPKANAYIVWWAKGTPYTLGNIVMAEAVGQLLGMRYTQRI REEMGAAYSTDASCTLAPDVNTTYLKLYGICPMKPELVDSTLNAMRAEAENMTRNIDPTL LENVKRHLAKRFEERTKQNSTWLRAVQTWAQQGIDIHTNYLIELQNITPQRLQQFIAQQL MAQENRVEVVMMPE >gi|283510551|gb|ACQH01000068.1| GENE 116 172859 - 173668 497 269 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928826|ref|ZP_06422672.1| ## NR: gi|288928826|ref|ZP_06422672.1| hypothetical protein HMPREF0670_01566 [Prevotella sp. oral taxon 317 str. F0108] # 1 269 11 279 279 553 100.0 1e-156 MLFFCLCTYGQAELKEKDSLLYTSLEVYSHKVQSVAALVDTGCSLCVVDSTFAKDSLGIA MPDKLKLMVNSRDNRMPTCVFDSVRFCGKTYRQVLCIIINLKGKFQRFAPNFILGADILK DRPLRFDYETMTITPNLGDNNGGIVLKWKDSHKYTDIPMNFIVFETRIQGQHVRLVFDTG SRNNKLPNDIRIAPSGTIQKETANVSQQLTIKQVRQYKGVTFGLGKQTIVLDFIEGEDNY GLLSLNFLKGHSFILNYKEKKLEILAPSL >gi|283510551|gb|ACQH01000068.1| GENE 117 173695 - 174057 171 120 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928827|ref|ZP_06422673.1| ## NR: gi|288928827|ref|ZP_06422673.1| hypothetical protein HMPREF0670_01567 [Prevotella sp. oral taxon 317 str. F0108] # 13 120 1 108 108 204 99.0 2e-51 MNLPIYTWPIQSLVYKLKRTRVLFLGAAITISATSFAQKHVIENVAKKDSVCRKEQKALS DTTALYLYLRFQGFSPLILKRKEQEKRFNPKALSWPKFDPDKEKFNNLQLINILGDHLLR Prediction of potential genes in microbial genomes Time: Sat May 28 01:31:08 2011 Seq name: gi|283510550|gb|ACQH01000069.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.69, whole genome shotgun sequence Length of sequence - 59983 bp Number of predicted genes - 43, with homology - 40 Number of transcription units - 29, operones - 11 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 104 - 151 13.4 1 1 Op 1 . - CDS 299 - 3118 3087 ## COG0612 Predicted Zn-dependent peptidases 2 1 Op 2 . - CDS 3184 - 4623 1189 ## BT_3631 hypothetical protein - Prom 4872 - 4931 7.4 3 2 Tu 1 . - CDS 5138 - 6475 1079 ## BT_3632 hypothetical protein - Prom 6632 - 6691 4.2 4 3 Tu 1 . - CDS 6744 - 9716 3437 ## BT_3633 hypothetical protein - Prom 9765 - 9824 3.2 5 4 Tu 1 . - CDS 9919 - 10563 736 ## PRU_2884 hypothetical protein - Prom 10636 - 10695 4.6 + Prom 10552 - 10611 5.4 6 5 Op 1 . + CDS 10686 - 11831 1041 ## COG0438 Glycosyltransferase 7 5 Op 2 2/0.000 + CDS 11920 - 13416 1258 ## COG1233 Phytoene dehydrogenase and related proteins 8 5 Op 3 . + CDS 13413 - 14915 1435 ## COG1233 Phytoene dehydrogenase and related proteins + Prom 15022 - 15081 3.8 9 6 Op 1 . + CDS 15142 - 16083 976 ## COG0657 Esterase/lipase 10 6 Op 2 . + CDS 16146 - 16379 57 ## + Term 16404 - 16437 -0.6 11 7 Op 1 . - CDS 16512 - 16649 85 ## gi|282879671|ref|ZP_06288401.1| conserved hypothetical protein 12 7 Op 2 . - CDS 16654 - 16989 255 ## gi|288928838|ref|ZP_06422684.1| mobilizable transposon, excision protein + Prom 17323 - 17382 6.3 13 8 Op 1 . + CDS 17556 - 18170 222 ## COG1309 Transcriptional regulator 14 8 Op 2 . + CDS 18175 - 20589 979 ## COG4206 Outer membrane cobalamin receptor protein 15 8 Op 3 . + CDS 20617 - 21804 835 ## Ctha_1000 conserved hypothetical lipoprotein 16 9 Tu 1 . - CDS 24048 - 25709 675 ## PROTEIN SUPPORTED gi|39938628|ref|NP_950394.1| ribosomal protein L13 - Prom 25729 - 25788 4.1 + Prom 25626 - 25685 6.8 17 10 Tu 1 . + CDS 25798 - 27513 2137 ## COG1283 Na+/phosphate symporter - Term 27788 - 27823 -0.7 18 11 Tu 1 . 
- CDS 27945 - 28373 539 ## COG2166 SufE protein probably involved in Fe-S center assembly - Term 28473 - 28517 -0.6 19 12 Tu 1 . - CDS 28623 - 29882 1343 ## COG0826 Collagenase and related proteases + Prom 29826 - 29885 5.4 20 13 Tu 1 . + CDS 29935 - 30975 700 ## BDI_2912 aminopeptidase C + Term 30985 - 31041 7.1 - Term 30970 - 31026 16.2 21 14 Op 1 . - CDS 31071 - 32039 1174 ## COG1284 Uncharacterized conserved protein - Prom 32130 - 32189 3.2 22 14 Op 2 . - CDS 32195 - 35074 3359 ## COG0495 Leucyl-tRNA synthetase - Term 35469 - 35521 3.1 23 15 Op 1 . - CDS 35574 - 35963 242 ## gi|288928850|ref|ZP_06422696.1| hypothetical protein HMPREF0670_01590 24 15 Op 2 . - CDS 35960 - 36271 92 ## gi|288928851|ref|ZP_06422697.1| hypothetical protein HMPREF0670_01591 - Prom 36308 - 36367 4.8 + Prom 36326 - 36385 7.0 25 16 Tu 1 . + CDS 36502 - 36882 324 ## gi|288928852|ref|ZP_06422698.1| conserved hypothetical protein 26 17 Tu 1 . - CDS 36793 - 37002 88 ## - Prom 37121 - 37180 5.2 + Prom 38024 - 38083 4.2 27 18 Tu 1 . + CDS 38330 - 39832 2048 ## COG1508 DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog + Prom 40251 - 40310 1.6 28 19 Tu 1 . + CDS 40342 - 40917 458 ## PRU_2856 hypothetical protein + Prom 41072 - 41131 7.0 29 20 Tu 1 . + CDS 41325 - 41831 741 ## COG0041 Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase + Term 41848 - 41909 4.8 + Prom 41855 - 41914 3.6 30 21 Tu 1 . + CDS 42096 - 44051 2337 ## COG0821 Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis + Prom 44088 - 44147 3.3 31 22 Tu 1 . + CDS 44287 - 45918 1817 ## COG0215 Cysteinyl-tRNA synthetase + Term 45970 - 46001 1.5 + Prom 46068 - 46127 2.2 32 23 Op 1 27/0.000 + CDS 46160 - 48208 2385 ## COG0286 Type I restriction-modification system methyltransferase subunit 33 23 Op 2 11/0.000 + CDS 48229 - 49497 497 ## COG0732 Restriction endonuclease S subunits 34 23 Op 3 . 
+ CDS 49503 - 53015 3710 ## COG0610 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases 35 24 Tu 1 . + CDS 54253 - 54690 425 ## gi|288928861|ref|ZP_06422707.1| hypothetical protein HMPREF0670_01601 - Term 54476 - 54514 -0.5 36 25 Tu 1 . - CDS 54697 - 54942 99 ## + Prom 54694 - 54753 4.5 37 26 Op 1 . + CDS 54866 - 55324 253 ## gi|288928862|ref|ZP_06422708.1| hypothetical protein HMPREF0670_01602 38 26 Op 2 . + CDS 55321 - 55767 367 ## gi|288928863|ref|ZP_06422709.1| hypothetical protein HMPREF0670_01603 + Term 55803 - 55841 5.1 + Prom 55780 - 55839 1.5 39 27 Tu 1 . + CDS 55885 - 57204 1751 ## COG0192 S-adenosylmethionine synthetase + Term 57215 - 57287 7.1 40 28 Op 1 . + CDS 57397 - 58437 815 ## PRU_2295 hypothetical protein 41 28 Op 2 . + CDS 58446 - 59192 785 ## PRU_2296 uroporphyrinogen-III synthase (EC:4.2.1.75) 42 29 Op 1 . + CDS 59325 - 59732 305 ## BVU_1188 ribonuclease P (EC:3.1.26.5) 43 29 Op 2 . + CDS 59729 - 59980 74 ## COG0759 Uncharacterized conserved protein Predicted protein(s) >gi|283510550|gb|ACQH01000069.1| GENE 1 299 - 3118 3087 939 aa, chain - ## HITS:1 COG:BB0536 KEGG:ns NR:ns ## COG: BB0536 COG0612 # Protein_GI_number: 15594881 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Borrelia burgdorferi # 5 900 9 894 933 286 25.0 1e-76 MQTRFLRQVSLLLLFLPTLLLAQPLKNDAAYRTGKLKNGLTYFVRHNAKEPGVADFYIAQ RVGSILEEPRQRGLAHFLEHMAFNGTKHFRNDGTSPGIVPWCETIGVKFGTNLNAYTSID ETVYNISQVPLKRSSVVDSVLLILHDWSHYLLLQDKEIDKERGVIHEEWRTRRAKMASQR MYEKLQPTIFKGSKYEDCMPIGSMDIVDNFPYQDLKDYYNKWYRPDLQAIVVVGDIDVNA IEAKIKQLFSTIPMPKNPAKRTYYPVPDNKRMIVAVEKDSEQPIVLAGLHMKHPATPFAQ KGQTAYVRDGYIENLITAMLSERLTALKQINPSPVLSASARTGSFLVAQTKEAFSLSFGC KEDNIRGSFRAVIGETERARRFGFTATELQRAKADALQRAQTRFDERNERSNRTLAMQAV RHFLSSEPLLTPEERLALTKRFDAEVSLKEVNEAARKLISNENQVLTVLAPQKAGFTLPS NQELEQYVLQAQADNSYQPYKEEPLPTTLIEHAPKAGTIVSEQPYGHFGVTKLVLSNGIE VYVKPTNFAADQITMRLWGEGGLSLCPETDAPNFSFLPNAIVDGGVGAFSADRLDKMLAG 
KHVRVSPYVGQETQGISGQSNRKDLATMFQLAYLYFTAPRTDTTAFATSIDRRRAMLRNR NANPQVEYNDSLSLIAYGHNERTAPLTLERLNKVNYQRIMQLYRERFADATGFKMLLVGN VNIDSLRPLLCQYVASLPAAGRKESFANNFPQVRNVNETHVFTKKMNTPSALVTIIYTFN LPYTPKSNLALDALRRVLTIAFTDSIREEKGGAYGVGVQGELDCNSRPNSLLKIAFRTGP EKYAALMPIVYRQLAHVAQGRINPESIRKIKAYLKKAHQQETLVNDYWNSIVYQHLRYGI DLHTDYEALVDQLSAADIQQVAQAIIKSNRRIEVTMKSE >gi|283510550|gb|ACQH01000069.1| GENE 2 3184 - 4623 1189 479 aa, chain - ## HITS:1 COG:no KEGG:BT_3631 NR:ns ## KEGG: BT_3631 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 458 53 497 502 110 23.0 1e-22 MAEVRGSVAKGDLVDYSQSANVVQAGAMVESFFRINPRTVVYGRMDYDNFAGKNMLGSVF INPERKPFDLSEDSLTNPGNKHRDTYNLVGAVGIDLWRGLAVGAKVDYTAANYAKYKDLR HRNSLMQLNLSLGIAAPIGKKWMLGANYIYGRSTESLRFVTYGKTEKVYKTLVDYGAFTG EVEQFGNKGYTDDSREQPLLDEQNGLALQAESNLGRGFSLFLGAQMVHRSGFYGRNSPYS IVHFKHQGDNFGLQLVLNKYRTHSQQQLSLSYTNEQLTNHKTTYRENQTATGSSFYEYFD PIKTSNKRWSSLKLAYTAWLRLFGGIAHWTISAALQADSRQVSAYLYPYVQRQQLHTNLL SLLVTRNISLSRRAMLSLTGGLSHQWGSGEPGQTATLATPSDKQTPPTLMQAYLYRQHHL LTAPQWGLHLAAKYAFGLRHLPFPLYVRAQTTWHQTAKPSPWIDNQHRTEATLAIGCTF >gi|283510550|gb|ACQH01000069.1| GENE 3 5138 - 6475 1079 445 aa, chain - ## HITS:1 COG:no KEGG:BT_3632 NR:ns ## KEGG: BT_3632 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 15 442 12 409 411 183 31.0 1e-44 MYRKTILINSFLAAFFALLVLSCSSDDKVDTVQTCKVNVTLQYPENSIQPYEGAKVELVD ARASVFVATTNANGVAEFNVPAGIYDVRSNATRGDGIWKLNLNGNKSKVVVSPPATEIKL EMKLSKAAQIIIKELYNGGCQPDNGGRFFQMDKGFVLYNNGGEVAVINNLAVGIVDPYNS QAPSKWLKNGKLVYDGQGYIPGIHGIWYFQGPLVMQPYSQIVVNVNGAIDNTKTYSKSVN YANKDYYAMYDPESGYDNELYYPSPSELIPTSHYLKAVEYGQGNGWTLSVTSPAMFIFQT QGVTPRNYATNVSNIIYAPGAAHNPVTANLKIPNEWVIDGIEVFSSAYKTANAKRLPAEI DGGSVLLTYQLGHTLYRNVDKEETEKLPENKGKLVYGYTMGVSTGDPSGIDAEASIKNGA HIIYMDTNNSTNDFHERKAFSIKGK >gi|283510550|gb|ACQH01000069.1| GENE 4 6744 - 9716 3437 990 aa, chain - ## HITS:1 COG:no KEGG:BT_3633 NR:ns ## KEGG: BT_3633 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 26 990 30 945 945 526 34.0 1e-147 MFKICFAILLFVVGAVSTVKAQGVLLNGTIKSKGDGRPIEYATVTLKECELWAITDNKGR FAIKNVPSGNVTLTVRCLGFATFTTKVNVRAGMPHVDIGLNDDNLKLNEVEVVAKRKGDA TTTSYTIDRLALDNRQALNLGDIMSLLPGGKTWNSSLLRDTRIALRSSSGEMGNASFGTA IEVDGVRLDNNATPGETLGAGTRTLSTANIESVEVVTGVPSVEYGDLSNGIVKVNTRSGK SPFIVEAKLNQNTRQLALSKGVDLGGKMGLFNFSMEHARTFGDITSPFTAYQRNVLSAQY MNVLMRQSMPLTLKLGLTGNVGGYNSKADPDGELDSYKKERDNRFTANLSAEWLLNRKWL TSVVLKSSLSYADQLYENYSHTSSATTLPYIHTMKEGYFVAYDYDKHPSADIILGPTGYW YMRSFNDSRPLSYAFKLKADWVKRFARIYNKVSVGADYTGSRNLGRGTYYEKARLTPTWR PYRYHELPTLNNLAVYAEERLNIPTGKLSNMELTVGVRNDISHIAGSAYGTASTLSPRAN LRYVFWHGRDTWMRSLSAHVGWGKSVKLPSFQVLYPRTSYIDRLAFTPGSTADNKAYYAY HTFPSQAVYNPQLQWQYTNQLDLGLQANIGHTRVNVSAFYHKTFNPYTAVVVYKPYTYKY TSQAALEHCPIPSANREYHIDQQTGIVTVRDITGANTPILLPYKEREGYLSNRKYVNGSP TERFGIEWIVDFAPIKALRTSLRLDGNYYYYKGLDDLLFAALPSEAGTLNAQGKPYTYIG HYRGSGTTSTDNTASPSVSNGSLTHESNLNITLTTHVPRIRIIMSLRVECSLYQYNRPLS ELPNGTRGYAVEKGDDFFGRPYTRDVRDKHMVVYPEYYSTWEEPDKLIPFAEKFAWAKDN DPALFNDLAKLVKRTNYAFVMNPERLSAYYSANFSVTKEIGDRVQLMFYANNFWNTMSRV RSTRTGQETSLYGSSYVPAFYYGLGLKVKI >gi|283510550|gb|ACQH01000069.1| GENE 5 9919 - 10563 736 214 aa, chain - ## HITS:1 COG:no KEGG:PRU_2884 NR:ns ## KEGG: PRU_2884 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 37 214 1 178 178 242 69.0 6e-63 MKRFCMAILFVSFFVAYSWAQKDGQLRVNPEEKEVDMDNPTFVPMVKVGRVLDKGDSIQY VELNKIYVYPQLTFENERQRMEYNRLVYNVKKVLPIAKEVNKIIIETYEYLQTLPDKKSR DEHMKRVEADIKREYTPRMKKLTYAQGKLLIKLVYRETSSSSYQLIQAFLGPIRAGFYQA FAWFFGASLKKEYQPNGVDRLTERVVLQVEAGQL >gi|283510550|gb|ACQH01000069.1| GENE 6 10686 - 11831 1041 381 aa, chain + ## HITS:1 COG:sll1231 KEGG:ns NR:ns ## COG: sll1231 COG0438 # Protein_GI_number: 16330676 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Synechocystis # 118 375 123 392 399 122 31.0 9e-28 MNEDKNIIVGFDAKRIVRNGTGLGAYGRNLVNDLAALYPNAQLLLYAPDAGREDLARQVK SAPNLRMVYPKNAKNALKKAAWRSGGMVKDLLRDGVQVFHGLSGELPKGVKHAGIDAVVT 
IHDLIFMRHPEFYHWWDALIYRWKFHKTLKEANRIIAISECTKRDIMKYGHYPEERISVI YQSCDTRFRERATPEKLNEVRQRYALPSRFVLNVGTIEARKNVLLAVKAMKNVDREVHLV VLGRPTPYINKVKSWAAHNGLSARVHFLHNVPNHDLPAIYQQAEVFAYPSRYEGFGIPII EAIQSALPVVACTGSCLEEAGGPHSLYVAPNDHKGMAQAINARLKGQAGRDESISRSMEY VQRFENKNVAQQVAQCLFENK >gi|283510550|gb|ACQH01000069.1| GENE 7 11920 - 13416 1258 498 aa, chain + ## HITS:1 COG:alr4631 KEGG:ns NR:ns ## COG: alr4631 COG1233 # Protein_GI_number: 17232123 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Phytoene dehydrogenase and related proteins # Organism: Nostoc sp. PCC 7120 # 5 481 20 504 533 120 24.0 7e-27 MKKCIIIGSGLGGLSCGHILARNGYEVTVLEQETQAGGCLQSFVRKGAKFETGMHFIGSA EEGQTLYPLLRYLELENLPLSRLDTTGYDVISMPQGRFAFANGHEAFVETLAAHFPKERD GLHRYLNLVERLAQASPINGHRPTPEQMHLKTEYQMRAVDEVINGFTNDALLRNVLVGNL PLYAGERGQTPFSTHAFITNFYNQSAFRLAGGSDIIAKRLIEGIARRGGRVLTRKRVEHI TCNNSKATGVTTADGEHFEADLVIAAIHPTLVMHMLNDTNLIRPAFRQRMLNLPNTKGVF TVYIKFKPQAMPYRNHNLYGYAADTPWQCEQYTADNWPCGFLYVHLSDALAADEQPPNTH QQWATCGEIMSYMRYEEVERWANTRIMHRGETYEDFKRDRAERLIDAVERHAPGLRQAID CYFTSTPLTYEQYTSTLQGSMYGVAKDVRLGSACHVPSRTRIPNLFFAGQNVNSHGMLGT LVGTLLTCGEVLGEKLKL >gi|283510550|gb|ACQH01000069.1| GENE 8 13413 - 14915 1435 500 aa, chain + ## HITS:1 COG:BH1848 KEGG:ns NR:ns ## COG: BH1848 COG1233 # Protein_GI_number: 15614411 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Phytoene dehydrogenase and related proteins # Organism: Bacillus halodurans # 6 496 12 494 498 118 24.0 3e-26 MNNNHIIIIGGGLGGLTTGALLAKEGWQVTVLEKNHIVGGGLQNFSRHGIHFDTGMHILG GFRPGQTLHRICKHLGIVQKLRLLPVDKLCMDEIYYAATNTTYRVAEGKEGFVESFAKHF PHQRTQLQRYVAALYQLANEIDLFNLRPTGDSIPQHSERFLWSADKFIGHYIDDERLCDV LAYMNPMYGGVSGKTPAYVHALISVLYIEGTDRFVGGSQQLATALTEVIEDAGGQVVADA EAVRIAVDNRTATAVQTKDGRTFTAPYYIAATHMAETLRIVDKGAFTPAYTKRLNAIDNT YSIFSLFVELKPQTFPYINHTCYYQHDYHHVWHLGHYNREQWPYGMMYMTPPSAEQTEWA THLIINCVMPFSAVEQWQDTLTGRRGKQYEAWKTWHQQRILQRMEEVRPGFGQCVKSLWS ASPLTVRDYYHAPVGAIYGFAKDCHNPLATQVPVATKVSNLLLTGQCVGLHGICGVPLSA 
ITTAEALVGRNKIIEKLQNI >gi|283510550|gb|ACQH01000069.1| GENE 9 15142 - 16083 976 313 aa, chain + ## HITS:1 COG:AGc2981 KEGG:ns NR:ns ## COG: AGc2981 COG0657 # Protein_GI_number: 15888930 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 39 295 68 304 310 138 35.0 1e-32 MRTVLIMLLAFMAYLGHAQQHKPLKIWQGTPEHAASVTLTPYIPQGAGRDCPAVIVCPGG SYHWLDTRTEGDGVARWLNSQGIAAFVLRYRVAGVLAFVTHYRLLIRGRQQPDMLRDVQR SVQLVREGASTWGICPTEVGVMGFSAGGHLAVASAAFAATNYLAPLGIKPKVSVRPDFVA ALYPVVTLADERYVHRRSRKGILGEWKKTDNRLKDSLSVERKIPFDCPPVFLVHCDDDPI VHPGNSLLLDSALTSKGIPHKFVRYATGGHGFGASDVKGSEESRQWREAFVQWLKTTALP LARNNTKCKKRHA >gi|283510550|gb|ACQH01000069.1| GENE 10 16146 - 16379 57 77 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTSYFNAIVGLEREINLLKFIPRPLPAVYNLLHFSMSSPYSSRWRHTMLSTKLERLLPLS PNLRTMKSARSCVSRFR >gi|283510550|gb|ACQH01000069.1| GENE 11 16512 - 16649 85 45 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|282879671|ref|ZP_06288401.1| ## NR: gi|282879671|ref|ZP_06288401.1| conserved hypothetical protein [Prevotella timonensis CRIS 5C-B1] # 1 44 1 44 144 65 90.0 1e-09 MENKTKTTDKDFSWLLGVENETEYESTPQEGTEVAEIEGPTNKQK >gi|283510550|gb|ACQH01000069.1| GENE 12 16654 - 16989 255 111 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928838|ref|ZP_06422684.1| ## NR: gi|288928838|ref|ZP_06422684.1| mobilizable transposon, excision protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 111 14 124 124 181 100.0 1e-44 MKEEITNHCLVMNSVSNVARAVRYLTDQHLTHIRAFLDNGDAGRRTVPEFVRAGFKVEDM SRYYKDCKDLNVFHVSRMREQKKQKAQEQKRMVVIEQNESKKSKQVKHNIR >gi|283510550|gb|ACQH01000069.1| GENE 13 17556 - 18170 222 204 aa, chain + ## HITS:1 COG:CAC0821 KEGG:ns NR:ns ## COG: CAC0821 COG1309 # Protein_GI_number: 15894108 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 144 1 143 200 60 31.0 2e-09 MKTLKEDIRNRIVTAARNEFIKNGFRKTSMRTISAKSGVVLGNIYNYFKTKDDIFYTVLR PLLIVLDERMSSYRIEHNKIDKLHFSIQNQKDLLRETLKIIFLYKEELKLLLFESKGTSV EFFRENFIEEQVAISKEYVDQMQKVYPQIQTRISPLFFRISSSTWVTIIGEIVSSTEICK EEVKQVLSEYIRYNTAGWRELINP >gi|283510550|gb|ACQH01000069.1| GENE 14 18175 - 20589 979 804 aa, chain + ## HITS:1 COG:BMEI0657 KEGG:ns NR:ns ## COG: BMEI0657 COG4206 # Protein_GI_number: 17986940 # Func_class: H Coenzyme transport and metabolism # Function: Outer membrane cobalamin receptor protein # Organism: Brucella melitensis # 120 231 28 141 599 73 34.0 2e-12 MNRLFIVTILCLCTTVLSAQNILQGKIIDKETGLPISGAYISIKNTKQKTVSDKTGNYQI DIPSNGTCIVEVSFVEYKRVRQVIALNGNITKDFFLESSANVLNEVVVRGSSQQAEINRI RQSPMAVTVVDGAKLRGRSSGIEEILTRTSGIKVRKTGGLGSASRISVHGLEGKRVAVYI NGFPLNSPDGSFDINDIPIDVIKYIEVYKGIVPAEYGGDGLGGAINIVTREDECDLVGFT QELASFGTSKTLASAQKLFAKSGILFNVAFFKNKSKNNYTMSWPVFETNLPASEYRKVRR NNDYYDADFYHVGIGFRKLYFDKLDLECAFYRNKKGIQSLSFDSQHAYMKSFNIMPTLHI EKNDFIFKGLEMKSSLVIPIIQTSLIDTTTTRRQWDGTITQAMGETEDNLLNESHNRQFE LRNKFNLKYTFGQHTFNINDQLALSDYRPKDERMNDYLGFDPSSFPSKMISNNIGLSHLY ASSNHRFQNSLTISIYYLKSKIFRTSDALSKAQMKDASAPKQTSVNKTYYGFSEGVSYEF WKGVRGKFSFSHNVRLPDTGELFGNGMSIKPSVNLLPEIGDNLNVGTIIDTRNVMGLTHV QWETNFYYMYIQNMIRLFPADIRSIYTNLGKTSTMGFDTELKVDVTPNIYTYFNLTLQDI RDRQKWLNDAKGTDNPTYNKHVPNIPSFYFNYGLEYHAENLLGRNELSRIYMDVSHVGEF DWGWQMSTLSDQRKKWIIPSNDVFTIGLQQSFWHNNISLSFEVENIFDKENYMEFKMPLQ GRTFKTKLRFNLFRDKTSGGAMSL >gi|283510550|gb|ACQH01000069.1| GENE 15 20617 - 21804 835 395 aa, chain + ## HITS:1 COG:no KEGG:Ctha_1000 NR:ns ## KEGG: Ctha_1000 # Name: not_defined # Def: conserved 
hypothetical lipoprotein # Organism: C.thalassium # Pathway: not_defined # 1 370 1 372 399 215 37.0 2e-54 MKHYFLIAALVLLSLVGLTSCGKDSPDFPPSTKSEVGFVHSVNIGENTYISLFKDLSVSS LNTDNALVLPKGAFTFVYKNKIYVTDTEHLYKYIPKDGILVQEGNTMLFPSGAKATYITF LSEKKAYVSCLGLGKVWIIDPSSMTKTGEIDLSGYSLGKLAGDNNPEPCASIIRDGILYV TLCQLKSAYSCEKGAHIALIDTKTDKPIKMISDPRATMASSMTPAGDPFVDEKGDIYFYC VAMFGYQPGVKEGFLRIKKGEQDFDNSYCFTLADVNLEGVKGNRTSYVYNKVYGGNGKVY GYLNIPGAASNPPDYVHDKSFQAFEINLYNKTCKKMNFSGTVGWATSICKSGNNIIFGMS TDQGMGYSVYHMDDGSYEKLKVKVSGAPYMLHELK >gi|283510550|gb|ACQH01000069.1| GENE 16 24048 - 25709 675 553 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|39938628|ref|NP_950394.1| ribosomal protein L13 [Onion yellows phytoplasma OY-M] # 12 553 7 546 546 264 28 8e-70 MKQSIQIRCKNNKKTAKVEIGSTLYDVFHELKLNMPYGPVNAKVNNKIEGMHYRLYHNKD VEFLDLHSPSALRAYTRTLFFVLCKAVHDLYPHSNVVIDIPVSNGYYVDLDIQRPITPQD ADAIRRRMQEIIAARLPISRHETTTDAAIAMFKKVGDESKVRLLETVGTLYTTYYELDRY VDYYYGSLLTNTRELHLFGLEPYYDGLLLRVPSTHNPAQLGELVRQDKMFEIFKEHHRWQ QILGIRTVGDFNKAVANGFSADLVHVSEALQEKKIARIADEIAARPEVRMVLLAGPSSSG KTTTCKRLSVQLLANGIRPVQISLDDYFVNREHTPKDADGEYDYESLYALNIPLLNEHLA ALFRGEEVQMPKYNFVEGRSEPTGKRLHLPKDSILVVEGIHALNPELTALIPAEQKYKVY ASALTTILLDNHNYIPTTDNRLLRRIVRDHKYRGVSAAETIKRWPSVRAGETKWIFPYQE EADAMFNTAMLFELAIIKTQAEPLLEQVPESSDEYSEAYRLRKFLRYFAPMPFHQLPPTS LLREFLGGSSFSY >gi|283510550|gb|ACQH01000069.1| GENE 17 25798 - 27513 2137 571 aa, chain + ## HITS:1 COG:FN0276 KEGG:ns NR:ns ## COG: FN0276 COG1283 # Protein_GI_number: 19703621 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Fusobacterium nucleatum # 19 564 1 523 525 256 32.0 8e-68 MDTSNYLVFGFKIAGSLALLIYGMKIMSEALQKMAGPQLRHVLAAMTTNRFTGMLTGAFI TCAVQSSSATTVMTVSFVNAGLLTLAQAISVIMGANIGTTLTAWIMSLGYSIDLTDFVFP AFLVGIALIYSRKHRYKGDFLFGISFLFFSLVLLSDTGKQLDLQHNQGVIDFFSSFDVNS YFTILIFLLIGTVITTIVQSSAAVMAITILLCSTGVLPIHLGIALVMGENIGTTATANLA ALGANTQARRTAMAHLVFNVIGVMWVLIVFYPFVNMICNMVGYNPEGPKLSQAELAIKLP VVLAAFHTCFNLFNTGILIWFIPQIEKLVCIIIKPGKKDEEEEFRLRFIQAGIMKTPELS 
VLEAHKEISSFAERMQRMFGMVRELLGTRDDSNFAKLYSRIEKYENISDNMEQEIAKYLS AVSDAHLSDDTKGKIRDMMREITELESIGDSCYNLARTISRLVNSKEDFTEKQYERIHQM FELTDDSLSQMNIIIRGRKEYLDTTRSFNIETEINNYRKQLRNQNINDINNHEYTYNVGT MYMDIINECEKLADYVVNVVEARMGTRRVEA >gi|283510550|gb|ACQH01000069.1| GENE 18 27945 - 28373 539 142 aa, chain - ## HITS:1 COG:mll7646 KEGG:ns NR:ns ## COG: mll7646 COG2166 # Protein_GI_number: 13476352 # Func_class: R General function prediction only # Function: SufE protein probably involved in Fe-S center assembly # Organism: Mesorhizobium loti # 3 132 2 131 142 105 41.0 3e-23 MSTSTIDALQEEVIEEFSAFDDWMDKYQMLIDLGNTLPPLDAKYKVESNLIEGCQSRVWL QCDLVDGRLVFTADSDALITKGIIALLIRVISNHTPDEIAHADLHFIDAIGLKDHLSPTR SNGLLSMVKQIKAYAVGYAAKQ >gi|283510550|gb|ACQH01000069.1| GENE 19 28623 - 29882 1343 419 aa, chain - ## HITS:1 COG:aq_1015 KEGG:ns NR:ns ## COG: aq_1015 COG0826 # Protein_GI_number: 15606313 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Aquifex aeolicus # 6 416 1 400 409 277 36.0 2e-74 MDNNKIREFEIMAPVGSRESLVAAIQAGADSVYFGIGQLNMRSHSANQFTIDDLRDIAAT CREHGMKSYLTVNTVIYGDDLGTMHTIVDAAKEAQVSAVIASDVAVMMYCRQVGVEVHLS TQLNISNIDALKFYAQFADVVVLARELNMEQVAEIYKQINEQRVLGPNGEPVRIEMFCHG ALCMAVSGKCYLSLHDANRSANRGQCVQLCRRSYTVTDKETGTQLDIDNEYIMSPKDLKT VRFIDQMMNAGVRVFKIEGRARGPEYVYTVVRCYKEAIASVLDGSFTEEKKDEWDERLST VFNRGFWDGYYQGQTMGEWTKHYGNKATEKKVLVGKVVKHFSKLGVAEIAVEASEIQLND HLLITGPTTGVMFFDAKEIRYELEPVQTAQKGTRVSIPVPDKVRPNDKLFKLVKNEVNK >gi|283510550|gb|ACQH01000069.1| GENE 20 29935 - 30975 700 346 aa, chain + ## HITS:1 COG:no KEGG:BDI_2912 NR:ns ## KEGG: BDI_2912 # Name: not_defined # Def: aminopeptidase C # Organism: P.distasonis # Pathway: not_defined # 39 344 33 379 389 189 31.0 1e-46 MNRIILILVALATLCSCTSKSEPRATTRLPKIEILLKTTPVKDQGQSQLCWIFAMLATIE TEHLMRGDSVNLSTAYVARMALRQRIQDAYLAKGKRPIHLRGMASNALSAIADQGLLAYD TYHVEDYSVFSAFPKKAANLCKLAVAQQEGLVRLVQRYDNLADQSIGALPRAQFMLGAEY TLGEFGRSVCRHDEYVGLTSFTHHPFNAAFALEVPDNVNRDCLLNLPIDSLVSLTERSLR 
AGHPLCWEGDTSEPGFNFTQAIARIPEHSTAPTQQMRQREFETFRTTDDHCMAIVGLARD AQGKRYFIMKNSWGTDNAFKGFMFMSEDYFRMKTIALWAQRECLGA >gi|283510550|gb|ACQH01000069.1| GENE 21 31071 - 32039 1174 322 aa, chain - ## HITS:1 COG:TM0177 KEGG:ns NR:ns ## COG: TM0177 COG1284 # Protein_GI_number: 15642951 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 8 303 1 281 283 149 33.0 9e-36 MTIDHKMLLSEIKDYVFITLGLCLYTFGLVVFIIPYEIVTGGVMGASLVVNYATGFKIEY TYAIINISLLVLALKVLGWRFLLKTIYATAALYFLLEYAQELMPKIEGTDQFVQILGPGN NFMSLLIGCTLTGSSLAIVFLNNGSTGGTDIIAAVVNKFFNLSLGQVLFLVDIFIIGSCF FFPQFGTYYERARMTVFGLCTMIVENFMLDYVMNARRESVQFLIFSRKYQEIANAIALHT DHGVTILDGHGWYTGKEMKVLCVLAKKNESQMIFRLIKLIDPNAFVSQSAVIGVYGEGFD TIKVKAKEMPEPPKRVSEGEKK >gi|283510550|gb|ACQH01000069.1| GENE 22 32195 - 35074 3359 959 aa, chain - ## HITS:1 COG:L0352 KEGG:ns NR:ns ## COG: L0352 COG0495 # Protein_GI_number: 15672798 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Leucyl-tRNA synthetase # Organism: Lactococcus lactis # 1 958 1 829 829 735 43.0 0 MEYNFREIEKKWQKRWEEQQTYRVVEDNSKKKFYVLNMFPYPSGAGLHVGHPLGYIASDI YARYKRLEGYNVLNPMGYDAYGLPAEQYAIQTGQHPAVTTETNIARYREQLDRIGFSFDW SREVRTCDPTYYKWTQWAFQKMFDSYYDNDQQRALPIAQLEERLAKQGTAGLNAACSEDL NITAEAWKAMDEHARQATLMNYRIAYLGETMVNWCAELGTVLANDEVVDGVSVRGGYPVV QKKMRQWCLRVSAYAQRLLEGLDKVDWTDSLKETQRNWIGRSEGAEVEFAVKDSNERFTI FTTRADTIFGVTFMVLAPESELVDRLTTPEQRAEVQAYVEATKKRTERERIADRKVSGVF TGAYAINPLTGEALPIWVSDYVLAGYGTGAIMAVPAHDSRDYAFARHFSLPIVPLIEGAD VSEQSFDAKEGIVINSPATNNAANDGQAQGLVLNGLTVTEAIAKTKEYVSRHNLGRIKVN YRLRDATFSRQRYWGEPFPVYYKGDMPYMIPQECLPLELPEVDNYQPTATGEPPLGNAKA WAWDEKARKVVDKSLVDGVQVFPLELNTMPGFAGSSAYFLRYMDPHNNDALVSNAAVSYW QNVDLYVGGTEHATGHLIYSRFWNKFLFDLGVAVNDEPFQKLVNQGMIQGRSNFVYRVQR TEGEGDKAPLFVSLGLKDQYEVTPIHVDVNIVHGDVLDIEAFRAWRPEYKDAEFVLENDK YVCGWAVEKMSKSMYNVVNPDLIVENYGADTLRLYEMFLGPVEQSKPWDTNGIDGCHRFL KKFWGLFYNMRTGEFMPNDAEATPEQLKSVHKLLKKVSADIPAFSYNTAVAAFMICVNEL SQAHCTNKALLSKLVVALAPFAPHIAEELWAALGNGQSSVCDAQWPAYEEKYLEESEVQL 
AIAFNGKARFQKAFAANATNAEIEQAAMADERSAKFLEGKQVVKVIVVPKKIVNIVVKG >gi|283510550|gb|ACQH01000069.1| GENE 23 35574 - 35963 242 129 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928850|ref|ZP_06422696.1| ## NR: gi|288928850|ref|ZP_06422696.1| hypothetical protein HMPREF0670_01590 [Prevotella sp. oral taxon 317 str. F0108] # 1 129 1 129 129 239 100.0 4e-62 MKDIEKKKLLRQSGSESQSDMVLNAEKDAAFNVSLGKLTVGTLLYSGGVWQFSYSEEFKQ QSRIVPLSNFPAKDKTYRDSELWPFFASRIPSSSQLQLDKDAPREDLVTLLRKFGRRTIA TPFEVVPVT >gi|283510550|gb|ACQH01000069.1| GENE 24 35960 - 36271 92 103 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928851|ref|ZP_06422697.1| ## NR: gi|288928851|ref|ZP_06422697.1| hypothetical protein HMPREF0670_01591 [Prevotella sp. oral taxon 317 str. F0108] # 1 103 1 103 103 215 100.0 5e-55 MDKIPFHKIENFTGASCYAVPKALVIKHNDYFIDDTISVDGDAPKKFIGVYDYRLLNGKH KANRKNWIRYIAKTGHKWYPIESVTELLLNRLFAGLTNCKMCL >gi|283510550|gb|ACQH01000069.1| GENE 25 36502 - 36882 324 126 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928852|ref|ZP_06422698.1| ## NR: gi|288928852|ref|ZP_06422698.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 126 24 149 149 240 100.0 3e-62 MKPTITRTTPRAITHNVLCIALCACIVSAFISACAFVPTASNFPRDKVQLGMNTREVRGI MGKPFSEDTFVEGEKRVDVLHYKEAVRVKTQGFILTTSLRFENDSLTAITQNEKMVDEVT RESKVQ >gi|283510550|gb|ACQH01000069.1| GENE 26 36793 - 37002 88 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLNTKRSDEHLRYANNRVAHGALRPLFPLEVGAGVRLYSLLLNLTFPRNLIHHLLVLSDC GKRIVLKSQ >gi|283510550|gb|ACQH01000069.1| GENE 27 38330 - 39832 2048 500 aa, chain + ## HITS:1 COG:RSc0408 KEGG:ns NR:ns ## COG: RSc0408 COG1508 # Protein_GI_number: 17545127 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog # Organism: Ralstonia solanacearum # 19 499 15 497 499 207 32.0 3e-53 MAQKLIQTQTQKQVQVQRLSQQQMLQVRLLEMPLTELEQSVVAELDDNPALETDTNEPNE TENYDTAPANDEQGDEDFDQQTEREEREQALDAALENIGRDDEMPETFSRENHNNAEYEE IVYGDTTSFYDKLKEQVGEIDLNEEEEQVLLYLIGSLDNDGLLRKDLDTIADELAIYQNL DVETSEIERMLGVLQTFDPAGIGARSLKECLLIQIDRKPESWAKEMMRRVIDECFEAFTK KHWDTIQTQLQLTDEQVQEVQAEIRKLNPKPGASMGETQGRNLQQITPDFIVDTDDDGRV SFYLSRGNIPRLMVSPTFADMVDHYRNNKANMTKGEKEALLYAKEKVEKAQGYIEAIKQR QHTLSVTMQAIIDIQRKFFVEGDEADLRPMILKDIADRTGLDISTISRVSNIKYAQTKWG TFPLRFFFTDAYTTGDGEELSTRKIKIALKELIDNEDKKKPLSDDLLKEELAKKGLPIAR RTVAKYREQLGLPVARLRKG >gi|283510550|gb|ACQH01000069.1| GENE 28 40342 - 40917 458 191 aa, chain + ## HITS:1 COG:no KEGG:PRU_2856 NR:ns ## KEGG: PRU_2856 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 187 1 187 194 223 73.0 2e-57 MMSMLFTPFYLPIVGLIALFTFSYMSQLPWDYKLEVLLLVYIATIFIPTILIHLYRRYQG WTLIELGHKERRMVPYVISILCYFGCYYLMSVLQIPRFMARILVAALVVQVVCALINVWW KISTHTAAIGGVAGGLMAFAFLFMFNPVWWLCLVFIVGGLVGTSRMILRQHSLWQVVAGF LVGMVCAFVVI >gi|283510550|gb|ACQH01000069.1| GENE 29 41325 - 41831 741 168 aa, chain + ## HITS:1 COG:PAB1077 KEGG:ns NR:ns ## COG: PAB1077 COG0041 # Protein_GI_number: 14521838 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase # Organism: Pyrococcus abyssi # 3 163 7 168 174 158 52.0 
5e-39 MTPSVSIIMGSTSDLPVMEKACKLLNDLKVPFEVNALSAHRTPDAVEAFAKGAKARGVKV IIAAAGMAAALPGVVAASTTLPVIGVPIKGMLDGLDALLSIVQMPPGIPVATVGVNAAMN AAILAVQMLALTDENVAQQLAEYKATLGQKITKANEELAEVKYEFKTN >gi|283510550|gb|ACQH01000069.1| GENE 30 42096 - 44051 2337 651 aa, chain + ## HITS:1 COG:CPn0373 KEGG:ns NR:ns ## COG: CPn0373 COG0821 # Protein_GI_number: 15618288 # Func_class: I Lipid transport and metabolism # Function: Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis # Organism: Chlamydophila pneumoniae CWL029 # 31 645 9 599 613 385 37.0 1e-106 MMNEQDNNHTSRPLSLQEDGEWSDIGIDLFNLQRRNASVTTVGSVTMGGNNPVRVQSMTT TDTNDTEASAAQAERIIQAGGELVRLTTQGKREAENLRNINAALRQKGYDTPLVADVHFN ANVADVAAQYAEKVRINPGNYVDPGRTFKQLEYTDEEYAEEIKKIDARFVPFLNICRENN TAVRIGVNHGSLSDRIMSRYGDTPAGIVESCMEFLRICKREQFSNVVISIKASNTVVMVQ SVRLLVQQMRLEDMNYPLHLGVTEAGEGEDGRIKSAVGIGTLLSDGIGDTIRVSLSEDPE AEIPVARALVDYIEARSDHLPIPAQAAKGFNYTAPERRPTTPVANIGGENLPVVISNRTD KDKVHVVANADQTPDYIYIGRHLPANLSAKRRYIMDYDAYMQVASTNPQAYEQVYPIFPH NAVPFISLVEADVKFLVLQYGANSDEYLACLKHHPEVVVVCMSTHKNKVGDQRALVHELW SAGINNPVVFAQMYRHAAEEKETFQLKAAADMGPLMIDGLTDGIWLMNDGDLPDEDVDAT AFGILQAARLRTTKTEYISCPGCGRTLYDLRTTIARIREATKHMKGLKIGIMGCIVNGPG EMADADFGYVGAGPGKVSLYRKKECVEKNIDEKDAVEHLLRLIEKETKQNL >gi|283510550|gb|ACQH01000069.1| GENE 31 44287 - 45918 1817 543 aa, chain + ## HITS:1 COG:DR1670 KEGG:ns NR:ns ## COG: DR1670 COG0215 # Protein_GI_number: 15806673 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Cysteinyl-tRNA synthetase # Organism: Deinococcus radiodurans # 17 540 5 531 532 422 43.0 1e-117 MNEQNKPADNSAQNQNNSVPSTKAAQHTTPLSPWRGDGDEASCGVDLSTSPLLIYNTLTR KKERFEPLHAPNVGMYVCGPTVYGDPHLGHARPAITFDVVFRYLQHQGYKVRYVRNITDV GHLEHDADDGDDKIAKKARLEQLEPMEIAQHYTNTYHAAMQLLNVLPPSIEPQASGHIIE QEELVKKILANGFAYESNGSIYFDIAKYNETHRYGILSGRNLEDVLDASRKLDGVGEKRN QADFALWKHAQPEHIMRWPSPWGDGFPGWHCECTAMGRKYLGEHFDIHGGGMDLIFPHHE CEIAQAVAAEGEQMVRYWMHNNMITINGQKMGKSLGNFITLEQFFNGAHPALEKAYSPMT IRFFILSAHYRGTVDFSNEALQASEKGLERLLNGIADLQRIQPAKTSDEATATIVNELPR 
KCYEAMNDDFQTPMVLSHLFEACRLINTLIDHKAQISAEHLEQLATTMHLFAFQLLGLNA DNGNNNDAREEAFGHVVDMVLDLRAKAKADKDWATSDRIRDELASAGFEVKDTKDGATWK LNK >gi|283510550|gb|ACQH01000069.1| GENE 32 46160 - 48208 2385 682 aa, chain + ## HITS:1 COG:PA2735 KEGG:ns NR:ns ## COG: PA2735 COG0286 # Protein_GI_number: 15597931 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Pseudomonas aeruginosa # 4 543 9 539 792 481 47.0 1e-135 MNPSQYNTLFTFIWNIANDVLVQAFNKGDYKKIILPMMVLRRLDILLEPTHQQVLQLKQQ LTEQGVDETQQESMLIRRTGLAYCNTSRFTMKTLRGETNPVRLKQNFLEYLDGFSKDVQD IIEKFKLKQQVDNLSDTGRLGRIIEKFTDAEINLGKDPVLDAEGNERLPGVDNHTMGTLF EQLLRKFNEANSVTEAGEHFTPRDYVALLADIAVLPVANKLRNGTYTIYDGACGTGGILS IAEQRIADIAKEQRKRIKISLYGQEMQPETYATCKADLMLSSITNSFAYLNAGVRRERFF CGSTISNDGHPGMKFDFCISNPPFGTPWKTDLQAWGLKDNEKQHITDPRFVLPQGYDPHN GLRFVPDVGDSQMLFLANNISRMKNDTELGTRIVEVHNGSSLFTGNAGGGESNLRRHIIE NDWLEAIIAMPEKDFYNTGIGTFIWVVTNRKEPRRAGKVQLIDATDIKTPLRKNLGEKNC ETNETDRQQIMQLLNRFEETPQSKIFANEEFGYWEIKVDRPLRLRVLPQSDITAGKLTSK EQEACRAAMQAVPNDTPLNNWDAYAAALGKLTKTVKNKLRALITVPDPSCEPVAGEADRA LRDTEQVPLTYPGGIEAFMQREVLPYAPDAYVAEDETKVVYELSFTKYFYKPVELRPIAD IKADIKAIETETDGLLADILNV >gi|283510550|gb|ACQH01000069.1| GENE 33 48229 - 49497 497 422 aa, chain + ## HITS:1 COG:VC1768 KEGG:ns NR:ns ## COG: VC1768 COG0732 # Protein_GI_number: 15641771 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Vibrio cholerae # 1 416 25 460 462 145 28.0 2e-34 MEKFYVYKDSGVKWLGNIPQHWEVRKIKYVFTERSQKGFPKEPILCSTQKYGVIPQHMYE NRVVVVNKGLEGLKLVRKGDFVISLRSFQGGIEYAYYQGIISAAYTILNLNDNCYSNYIK YLMKSFDFIQLLQTCVTGIREGQNINYTLLRKSSLPLPPLAEQRAIVSYLDGKVGQIDTY VAKQTQQIELLKELKQAVIANAVTKGIDNKAKLKQTGISWIGHVPQHWERCRCKDVLTEI KLLVGNGEYALLSLTTNGVIVRDLSEGKGKFPKDFNTYKVVKPNDLVFCLFDVDETPRTV GLVHNHGMLTGAYNVFETKNVDTSFLYHYFIALDNRKALKPLYKGLRKVIPLPAFMSMPL YIPPLSEQRAIVSYIEAKTASINKLIDAYEQQVERVKEYKQRLISDAVTGKMNVTDEHTQ TN >gi|283510550|gb|ACQH01000069.1| GENE 34 49503 - 53015 3710 1170 aa, chain + ## HITS:1 
COG:PA2732 KEGG:ns NR:ns ## COG: PA2732 COG0610 # Protein_GI_number: 15597928 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Pseudomonas aeruginosa # 2 1001 3 1032 1146 858 44.0 0 MPTTMNEKALEDLIVNWLTNHNQYELGDTSQYDTHYALDTGRLETFLKLTQPAEVEKSGI FHSPVNRRKFLERLRDEITKRGVVDVLRKGLKHQSNTFVLYNPLPSALSAEGEERYALNK FSVVRQLRYDAQNPALALDVAIFINGLPVITMELKNQITGQNTAHAIAQYRTDRSPNNLL FMPKRCAVHFAADDDTVQMCTKLCGANSWFLPFNKGFNDGAGNPVNPNGLKTAYLWEEIL TKRNLSNILEHYAQVVTERNPDTRRNIEKTIWPRWHQLQLVRALLHETAKGNIGQRFLVQ HSAGSGKSNSITWLAYQLVELLEGDQPLFDSVIIVTDRVNLDKQLRNNMRAFSRNENIVD AANSSETLRQHLVNGKKIILSTVHKFSFILDAIGNELASHRFAVIIDEAHSSQSGKMAAS TNQVLAGSAVADDTELSVEDAVNEAIAQYMCGRRMAPNANFYAFTATPKNKTLETFGTPF IKPDGETGHRPFHVYSMKQAIEEGFILDVLSNYTTYESFYHIRRRVEDDPEFDRKQALRK IRHFVERQPETIEKKAEVMVEHFHTSVAHKIGGQARCMVCTTGIQRAIDYFFAVNRLLSQ RDSQYRAIVAFSGDFEYNGKTVNEGSLNGFPSAAIEKTFRTGNYRFLIVADKFQTGYDEP LLHTMYVDKPLHGLQTVQTLSRLNRTCPKKEDTFVLDFVNTCDDVEQSFQTFYKTTILSR ETDPNRLNDLLAEIEKHQIYTQYELETLNQRYWNNAPREALDPLLDAMCQRFCTLSEEEQ VECKSSIKAFIRTYEFLSTIMDSCSLEWEKAETLLKLLVRKLPSLGTDDLTRGLIEAIDF DRYRLEKKEERKIQLQNADTEIEPVPTATAAGVAEPDMQRLSVIETQFNELFGSTDWDDI EVVKRQVEDVTQRVTNDNNVRDAMLNNDEQTAAQECDEATNVKMADMAETYTQLISKYMS EPDFQERFNQLIRERVKAAINPEYDEQDLAFKLRQAFKEDFADFCDGVHNVEFDEVLRLF FCLVNAETIPALQGLRRILRNTLNCLYRAQHREEDFRTWYADLVSRFEAFLKKIYWLQHG VPMPLVNDGRDPALLDAVRHFPRISSLYHTRNPKHERFKQYYNVVFTWRNKENHSAEDLP ANLLPAALHAAVAMYLYATMVSAPDLKGKL >gi|283510550|gb|ACQH01000069.1| GENE 35 54253 - 54690 425 145 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928861|ref|ZP_06422707.1| ## NR: gi|288928861|ref|ZP_06422707.1| hypothetical protein HMPREF0670_01601 [Prevotella sp. oral taxon 317 str. 
F0108] # 16 145 16 145 145 220 100.0 3e-56 MRSNKLSNLLSLLFLVATLGVAVYAAFVNDTIVVHWGPLGGSELQGMSYLILFLPLLSAA LYLLLRKGKENPFALDPKHDMPQTQGNAQQLRRYVDVLSLGVTGLLLYITLCAAGFLPMI PLVVMLVVAFFIIYYARSRRKLRGA >gi|283510550|gb|ACQH01000069.1| GENE 36 54697 - 54942 99 81 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQYNTMLTVPTKKSSDNRPFKLEFFIFWYFIIGYILSSIWRAALHNKPFTYKAQVQLWSA MILFCKVTTKVCHRQNMKKEN >gi|283510550|gb|ACQH01000069.1| GENE 37 54866 - 55324 253 152 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928862|ref|ZP_06422708.1| ## NR: gi|288928862|ref|ZP_06422708.1| hypothetical protein HMPREF0670_01602 [Prevotella sp. oral taxon 317 str. F0108] # 109 152 1 44 44 82 100.0 9e-15 MKNSSLKGLLSLLFLVGTVSIVLYCILTKDTINLRIDMPGGVSKPISPFFLLPPTLFSVG AYFVTRFLADKPMRMKSLLRRYVIPMNEHNAPLIRNFHESVSLGVTSIMFYCSLFISGII AFNPFITWAVMMLPFLMIGWSSFKLMRDKEMQ >gi|283510550|gb|ACQH01000069.1| GENE 38 55321 - 55767 367 148 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928863|ref|ZP_06422709.1| ## NR: gi|288928863|ref|ZP_06422709.1| hypothetical protein HMPREF0670_01603 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 148 1 148 148 270 100.0 1e-71 MIEHNTSNNRIAVVFLAATFAVALYSAVIEVLSALQISFFAADDRVGLPIFTFLLPFISG ALYLRLRYVSLHPLSNPRPNGPEDTEENKWLMSRYADKMLPNITGVMLLLAVEDVVFMPN LITVALLWGFYMVVVTMQVFSKITYKKR >gi|283510550|gb|ACQH01000069.1| GENE 39 55885 - 57204 1751 439 aa, chain + ## HITS:1 COG:TM1658 KEGG:ns NR:ns ## COG: TM1658 COG0192 # Protein_GI_number: 15644406 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylmethionine synthetase # Organism: Thermotoga maritima # 21 439 1 395 395 423 53.0 1e-118 MGGAWRMGIQLFIDTQIRSIMAYLFSSESVSEGHPDKVSDQISDALVDQFLAYDDKAHCA IETFVTTGQVVIMGEVRSSSYIDLQTIARNTIKSIGYTKSEYQFDGDSCGVLTAIHEQSD DINRGVSREEADEQGAGDQGMMFGYATNETENYMPLSLDLAHLLMITLAEIRKEGKEMTY LRPDSKSQVTVEYADDGQPVRIDTIVVSTQHDDFGPNDEQMLAQIKADVLNVLMPRVKAK ITSAKVLALFNNDIKYFVNPTGKFVIGGPHGDTGLTGRKIIVDTYGGKGAHGGGAFSGKD PSKVDRSAAYAARHIAKNMVAAGVADEMLVQVSYAIGVAEPVNIYVNTYGRSKVQLTDGE IAQKVAQLFDLRPNAIERNLKLRQPIYAETAAYGHMGRKNEVVKKTFTSRYHETKVMEVE LFTWEKLDRVDDIKKAFGL >gi|283510550|gb|ACQH01000069.1| GENE 40 57397 - 58437 815 346 aa, chain + ## HITS:1 COG:no KEGG:PRU_2295 NR:ns ## KEGG: PRU_2295 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 12 345 14 337 338 253 44.0 1e-65 MLKPIKEQYYIAPADTTAAETAFADSAKTAKPKVLTPKEVLSWLPYDATPEQQDSAIQAH FKPDSIHWSTRPDTLHLPGHSVGKDVRIVDLPLYYRESFFSQSPYYHPEMPAGRPGVAGD PVPYTIASDDLLTSLLLGCFILGCVAFAQSRDFIFRQVKHFFRVRNEGTSEIAETTGEIR FQIFLALQTSLLYAILFFLYLRHLRVETFIAEQFVVIGILTGLVAGYFLLKAFAQWFVGW VFFGKRKNNQWLKSFLFLVSSQGVLLLPLVMLQAYFQLTVKSSIIYVGIVLGTTKILSLY KTHLIFFRQKGQFLQFFLYFCALEIVPLFALWGVLTIASNYLKINF >gi|283510550|gb|ACQH01000069.1| GENE 41 58446 - 59192 785 248 aa, chain + ## HITS:1 COG:no KEGG:PRU_2296 NR:ns ## KEGG: PRU_2296 # Name: hemD # Def: uroporphyrinogen-III synthase (EC:4.2.1.75) # Organism: P.ruminicola # Pathway: Porphyrin and chlorophyll metabolism [PATH:pru00860]; Metabolic pathways [PATH:pru01100]; Biosynthesis of secondary metabolites [PATH:pru01110] # 1 246 1 246 249 362 71.0 5e-99 
MVKKILISQPKPTSDKSPYFDIEREYGVDLVFRPFVQVEGITSKEFRQQKVSIPNYTAIV FTSRHAIDHFFKLAKELRVPIPETLKYFCVTETIALYIQKYVQYRKRKVFFGTTGKMEDL LPTMAKHKQEKYLVPLSDNHNDDIAKLLDSKKLNHTECVMYRTITNDFTPEEIKNFDYDM LIFVSPTGVKSFVKDFPNFEQGDVRIGTFGPATSKAITDEGLRLDFQAPSKEYPSMSGAL KAYLDEHK >gi|283510550|gb|ACQH01000069.1| GENE 42 59325 - 59732 305 135 aa, chain + ## HITS:1 COG:no KEGG:BVU_1188 NR:ns ## KEGG: BVU_1188 # Name: rnpA # Def: ribonuclease P (EC:3.1.26.5) # Organism: B.vulgatus # Pathway: not_defined # 5 135 2 129 130 112 57.0 4e-24 MAATVSQKLRKEERACGKKLTEALFCGADSRSLAAFPLRVVYLVQPFSAEEQQALASVQM MISVPKRRFKHAVDRNRVKRQVREAYRKNKHLLYDKLPTGRRITLGFLWLDAKHHSSAEV EDKVQNLLRRIGESL >gi|283510550|gb|ACQH01000069.1| GENE 43 59729 - 59980 74 83 aa, chain + ## HITS:1 COG:SA1613 KEGG:ns NR:ns ## COG: SA1613 COG0759 # Protein_GI_number: 15927369 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Staphylococcus aureus N315 # 18 83 4 69 85 86 57.0 1e-17 MNALRALLGLFRRVAVWLLLLPIRFYQLAISPYTPPSCRFTPSCSEYARQAIIKHGPFKG LALAVWRVLRCNPWGGSGYDPVP Prediction of potential genes in microbial genomes Time: Sat May 28 01:33:17 2011 Seq name: gi|283510549|gb|ACQH01000070.1| Prevotella sp. oral taxon 317 str. F0108 cont2.70, whole genome shotgun sequence Length of sequence - 69069 bp Number of predicted genes - 46, with homology - 44 Number of transcription units - 27, operones - 15 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 5.5 1 1 Tu 1 . + CDS 224 - 1525 670 ## PROTEIN SUPPORTED gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 + Term 1717 - 1771 9.5 2 2 Tu 1 . + CDS 2279 - 2671 460 ## gi|288928870|ref|ZP_06422716.1| hypothetical protein HMPREF0670_01610 + Term 2718 - 2752 -0.5 + Prom 3363 - 3422 5.3 3 3 Op 1 . + CDS 3560 - 4585 905 ## Coch_0966 hypothetical protein 4 3 Op 2 . + CDS 4582 - 5394 638 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Term 5443 - 5468 -0.1 5 4 Tu 1 . 
+ CDS 5605 - 6156 337 ## BT_0225 hypothetical protein + Term 6329 - 6366 2.6 - Term 6411 - 6458 14.2 6 5 Tu 1 . - CDS 6513 - 7586 1178 ## COG0229 Conserved domain frequently associated with peptide methionine sulfoxide reductase - Prom 7606 - 7665 3.8 + Prom 7570 - 7629 3.1 7 6 Op 1 . + CDS 7768 - 8265 426 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 8 6 Op 2 . + CDS 8314 - 9006 860 ## PRU_0954 hypothetical protein + Term 9090 - 9141 11.9 + Prom 9098 - 9157 4.8 9 7 Tu 1 . + CDS 9353 - 9793 318 ## COG2259 Predicted membrane protein + Prom 11371 - 11430 3.1 10 8 Op 1 . + CDS 11477 - 13069 1753 ## COG0025 NhaP-type Na+/H+ and K+/H+ antiporters + Term 13283 - 13331 0.4 + Prom 13241 - 13300 7.2 11 8 Op 2 . + CDS 13346 - 14035 978 ## COG0588 Phosphoglycerate mutase 1 + Term 14101 - 14151 15.3 - Term 14088 - 14138 13.0 12 9 Op 1 1/0.000 - CDS 14169 - 15548 1299 ## COG0571 dsRNA-specific ribonuclease 13 9 Op 2 9/0.000 - CDS 15538 - 15933 612 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase 14 9 Op 3 27/0.000 - CDS 15960 - 16799 929 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase - Prom 16821 - 16880 2.0 - Term 16951 - 17004 18.3 15 9 Op 4 . - CDS 17015 - 17251 495 ## COG0236 Acyl carrier protein - Prom 17395 - 17454 4.3 - Term 18418 - 18446 -0.6 16 10 Tu 1 . - CDS 18489 - 18893 446 ## gi|288928884|ref|ZP_06422730.1| hypothetical protein HMPREF0670_01624 17 11 Op 1 2/0.000 - CDS 19510 - 20391 1105 ## COG0524 Sugar kinases, ribokinase family 18 11 Op 2 . - CDS 20422 - 21582 1362 ## COG0738 Fucose permease - Term 21722 - 21771 2.5 19 12 Tu 1 . - CDS 21881 - 23680 1898 ## COG1621 Beta-fructosidases (levanase/invertase) - Prom 23744 - 23803 2.7 - Term 23834 - 23880 7.0 20 13 Op 1 . - CDS 23897 - 25501 1778 ## BT_1760 glycosylhydrolase 21 13 Op 2 . - CDS 25513 - 26895 1578 ## BT_1761 hypothetical protein 22 13 Op 3 . - CDS 26898 - 28628 2006 ## PRU_2273 putative lipoprotein 23 13 Op 4 . 
- CDS 28642 - 31746 3792 ## BT_1763 hypothetical protein + Prom 33971 - 34030 6.9 24 14 Tu 1 . + CDS 34129 - 35460 1115 ## COG1373 Predicted ATPase (AAA+ superfamily) 25 15 Op 1 . - CDS 35636 - 36883 1159 ## COG2382 Enterochelin esterase and related enzymes 26 15 Op 2 . - CDS 36958 - 38394 1303 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid - Prom 38456 - 38515 8.5 + Prom 39134 - 39193 1.6 27 16 Op 1 . + CDS 39223 - 42318 3736 ## BT_4707 hypothetical protein 28 16 Op 2 . + CDS 42344 - 43903 1762 ## BT_4708 hypothetical protein + Prom 43920 - 43979 1.8 29 17 Op 1 . + CDS 44035 - 45078 1359 ## BF1328 putative secreted endoglycosidase 30 17 Op 2 . + CDS 45099 - 46295 1518 ## BT_4710 hypothetical protein + Term 46385 - 46429 12.0 + Prom 46509 - 46568 6.2 31 18 Op 1 . + CDS 46634 - 47050 571 ## ZPR_1410 hypothetical protein 32 18 Op 2 . + CDS 47130 - 52055 5202 ## BT_1809 hypothetical protein + Term 52234 - 52280 1.3 - Term 53145 - 53181 1.1 33 19 Tu 1 . - CDS 53304 - 53543 204 ## 34 20 Tu 1 . - CDS 53705 - 54430 574 ## COG0778 Nitroreductase - Prom 54451 - 54510 6.7 35 21 Op 1 . - CDS 54785 - 56401 852 ## Metev_0337 restriction endonuclease AlwI 36 21 Op 2 . - CDS 56445 - 57416 432 ## COG0338 Site-specific DNA methylase - Prom 57488 - 57547 5.1 - Term 57716 - 57757 8.6 37 22 Op 1 38/0.000 - CDS 57828 - 58820 1362 ## COG0264 Translation elongation factor Ts - Prom 58840 - 58899 1.8 38 22 Op 2 . - CDS 58910 - 59725 1084 ## PROTEIN SUPPORTED gi|212690772|ref|ZP_03298900.1| hypothetical protein BACDOR_00259 - Prom 59747 - 59806 1.6 - Term 59881 - 59937 1.7 39 23 Op 1 59/0.000 - CDS 59959 - 60345 539 ## PROTEIN SUPPORTED gi|150004192|ref|YP_001298936.1| 30S ribosomal protein S9 40 23 Op 2 . - CDS 60355 - 60816 585 ## PROTEIN SUPPORTED gi|150004191|ref|YP_001298935.1| 50S ribosomal protein L13 - Prom 61026 - 61085 5.6 - Term 61062 - 61099 -0.3 41 24 Tu 1 . 
- CDS 61146 - 63152 1591 ## BF2254 hypothetical protein - Prom 63287 - 63346 2.6 + Prom 63123 - 63182 3.0 42 25 Op 1 . + CDS 63268 - 63471 90 ## 43 25 Op 2 . + CDS 63517 - 64359 797 ## gi|288928910|ref|ZP_06422756.1| hypothetical protein HMPREF0670_01650 44 26 Tu 1 . + CDS 64779 - 66488 956 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Prom 66509 - 66568 6.4 45 27 Op 1 1/0.000 + CDS 66589 - 67662 462 ## COG0535 Predicted Fe-S oxidoreductases 46 27 Op 2 . + CDS 67676 - 68734 497 ## COG0673 Predicted dehydrogenases and related proteins Predicted protein(s) >gi|283510549|gb|ACQH01000070.1| GENE 1 224 - 1525 670 433 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 [Phaeobacter gallaeciensis BS107] # 2 357 6 357 418 262 40 3e-69 MKKNFVEELRWRGMLAQIMPGTEELLQKETVTAYLGTDPTADSLHIGHLAGIMMLRHLQM CGHKPIILVGGATGMIGDPSGKSAERNLLDSKTLYHNQECIKAQVAKFLDFDAKGDNAAE MVNNYDWMKDFSFLDFARTVGKHITVNYMMAKDSVKKRLNGDARDGLSFTEFTYQLLQGY DFLHLYETKGCKLQLGGNDQWGNMTTGAELIRRTLGIENEAYALTCPLITKADGTKFGKT EGGNIWLDAKRTSPYMFYQFWLNVSDADAERYIKIFTSLDKETIDALVAEQNEDPGRRPL QRRLAEEVTVMVHSQAALDQAMEASNILFGKATKENLLKLDEQTLLDVFEGVPHYELDKA LLGQPAIDLFTREDVPVFASKGEMRKLVQGGGVSLNKEKLAAADRPVTADDLIDGKYLLV QRGKKNYYLLIVK >gi|283510549|gb|ACQH01000070.1| GENE 2 2279 - 2671 460 130 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928870|ref|ZP_06422716.1| ## NR: gi|288928870|ref|ZP_06422716.1| hypothetical protein HMPREF0670_01610 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 130 1 130 130 190 100.0 2e-47 MLQRDYIKRLIREFAEALRRMLDQKEVVKRREAIRLLYEQYLGPYSLYHFATIDELMSAI QSFPEDERLERLAMLAELYYAEADTEASVHDREVLLQKAFNLFEYLERESGVYSMERRGK MAELMKQLAK >gi|283510549|gb|ACQH01000070.1| GENE 3 3560 - 4585 905 341 aa, chain + ## HITS:1 COG:no KEGG:Coch_0966 NR:ns ## KEGG: Coch_0966 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 6 329 9 326 327 187 36.0 4e-46 MMNSEKKIIVILPPEFELYKAIDKNLKHLGYTKAVVLAPKFRHTFKTRMVNFVLKHLLGR DEYKRRIAAAYYSKRVGQVVHRLATKSFDYAIVMRPDLLEMETLNEVLRVANKTTAYQWD GLERFPQVFEVIPLFDRFFVFDPNDAPKYKAKYPNLLACTNFYFDFPMPEVQVNPNEVMY AGAYQEGRVGSLLKMVDELQKYNLKLNVHLVLGWRSVAFEHPIITFSKKGVGFLSYLETA RRAGMLLDIKACEHDGLSFRVFEALRYNKKLITDNKSVRLYDFYRPDNIFVVEDGCFDGL SEFLAAPYTPLPEEIIQKYSFTNWLRYALDMPPYQVTDSKT >gi|283510549|gb|ACQH01000070.1| GENE 4 4582 - 5394 638 270 aa, chain + ## HITS:1 COG:SP1771_1 KEGG:ns NR:ns ## COG: SP1771_1 COG0463 # Protein_GI_number: 15901601 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Streptococcus pneumoniae TIGR4 # 2 211 6 224 259 107 31.0 3e-23 MKVSVIVTCYNQEAFLNDCLASVAKQTYPGWECIIVDDGSTDTSAQIAKGWQAKDARFKY VYQANKGCSGARNTGLSEAGGDLIQFLDGDDILQPTKLEESVKAFSQEKCDVVVTNFSEE HNGEPRPPFCDLTKYDFTYKNLLLQWDIDFNVPPHCFCFTKKILEGLTFKEFLKAKEDWI MWLEVFRRNPKVCFINQPLVMYRIHGGNITRNTSLMEQYTEAASLYILKNEDAYFEEFYH KMVKPYKDKLNRLLSRPWYKKVFYAITGKQ >gi|283510549|gb|ACQH01000070.1| GENE 5 5605 - 6156 337 183 aa, chain + ## HITS:1 COG:no KEGG:BT_0225 NR:ns ## KEGG: BT_0225 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 183 20 197 197 90 35.0 3e-17 MMKRLLFIPILAWSAAMLMAQEKPRRAVYNELAGAAGIIGANFDSRFSANSPLGYRLGLA FNAGRTSDKSEGGACVYQGPAAVFEAYGLIGGTNHSLELGVGMMHGIFKRDYERDATPPL GMSSDPPEIWKYGYYGFANVGYRYQPAVGFFLRAGVNPTFVPETHYGISRFLVYPYLGLG FAF >gi|283510549|gb|ACQH01000070.1| GENE 6 6513 - 7586 1178 357 aa, chain - ## HITS:1 COG:MG448 KEGG:ns NR:ns ## COG: MG448 COG0229 # 
Protein_GI_number: 12045307 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Conserved domain frequently associated with peptide methionine sulfoxide reductase # Organism: Mycoplasma genitalium # 211 357 4 150 150 206 63.0 6e-53 MNRLTFKLTMTALCTVLACCTGNSKNQEKKGQTMNTKAEIYLAGGCFWGTEHYLKQIDGV TDTQTGYANGHGKNPSYQEVCTDKTGFAEAVRVAYNPERLPLSLLLQLYFKSINPTSLNR QGNDVGTQYRTGIYYTNPDDLPIIQAEMAKVAKRYDRPLQVEVLPLKNFYDAEDYHQDYL DKNPTGYCHLPQALFDMARKANKKTDEKKTYAKPTDEELKSKLTPLQYEVTQNAATERPF ANEYDHEFRPGIYVDVTTGEPLFLSTDKFDSGCGWPAFSKPIDNRLLNNLEDRSHGMLRT EVRSAKGNAHLGHVFNDGPKETGGLRYCINSASLRFVPKEDMEKQGYGKYLPMLEEK >gi|283510549|gb|ACQH01000070.1| GENE 7 7768 - 8265 426 165 aa, chain + ## HITS:1 COG:CAC2751 KEGG:ns NR:ns ## COG: CAC2751 COG0454 # Protein_GI_number: 15896008 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 1 161 1 166 167 104 30.0 6e-23 MQIRLATKADLDAVMRLLDIGRQRMAATGNTQQWVQGYPQRATVEEDLRRGQCYVGEDAG RVVATFVLAEGPDRTYAHIWQGQWLHNERPYCVIHRMAADETQRGVFAEVMAFCKARAAN LRIDTHKDNAPMLHNIRKHGFAYCGVIHTDNGSERLAFQWLRTET >gi|283510549|gb|ACQH01000070.1| GENE 8 8314 - 9006 860 230 aa, chain + ## HITS:1 COG:no KEGG:PRU_0954 NR:ns ## KEGG: PRU_0954 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 5 228 26 222 223 123 37.0 5e-27 MLLMLMALAMAAQPAWAQHQFGERRKSYGYGVDRNAVYFEGRVIPGADARTFEYLAHGYA RDRRAVYYRGEVLQGANPRTFRVVGDAEEQPPVVIPDGRGGGIDPRDGGDWGRFPEELLP GNSLGFGYSKTNFDVYYLGKKIDASASSFQVLSFGYAKDAFNIYFEGAEVKDASSGTFRV LIDGYAKDAFNAYYRGREIPDCNVRTFECMGKGVARDDENKYFLGQKVVW >gi|283510549|gb|ACQH01000070.1| GENE 9 9353 - 9793 318 146 aa, chain + ## HITS:1 COG:RSc0240 KEGG:ns NR:ns ## COG: RSc0240 COG2259 # Protein_GI_number: 17544959 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Ralstonia solanacearum # 20 132 27 141 145 66 42.0 2e-11 
MRKVLRFLLPDRQQRCSTAAFLLVARVVFALLLASHGLQKLEGFEQMSGGFPDPLGLGSG LSLLLAIFGELVCSLALVVGVLSRLVLLPMIFTMCVAFFMAHGGSMAEGELAFVYLVVFV LLFFAGPGRFSVDGWLAKKLGGQAAN >gi|283510549|gb|ACQH01000070.1| GENE 10 11477 - 13069 1753 530 aa, chain + ## HITS:1 COG:AGl3497 KEGG:ns NR:ns ## COG: AGl3497 COG0025 # Protein_GI_number: 15891867 # Func_class: P Inorganic ion transport and metabolism # Function: NhaP-type Na+/H+ and K+/H+ antiporters # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 524 28 542 550 218 30.0 2e-56 MIEHQLIVSILLVVAICLLVMVSQRIKVAYPVMLVLGGIALSFIPGMPRLHINPDVIFLV FLPPILYEAAVANSWKELWRWRRIIGSFAFIVVFITALVVGYIANTFIPGFSVALGFLIG GIVSPPDAVSAAAIMKFVKVPRRVSAVLEGESLFNDASSLIIVKFALIAIGTGQFVWHQA AASFVWMVIGGAAVGVLLSYVLIKLHKWLPMDENIDTMFTIISPYLMYICAETVEASGVL AVVSGGLYFSYRRILIIGSSSRLRAENVWNFLIFLLNGLAFLFIGLDLPEIMAGLKADGV SLWTATAYGLLITGALVVIRMCAAFSAVFITRFMSKFITVADSRSQGMAGPLVLGWTGMR GVVSLAAALAIPLYVPGTQTPFPERSMMLYITFVVITLTLVFQGLTLPILLKWVKFHNYD DHLPHEETERMIRIGMAQASLDYLRHNNKCVDVTRSALLNSLVAHWNEQLETEGSATLYD DELRKNYHNILNEQRLWLNKLNNENERVDEEIVRHFIHRIDLEEERLLKE >gi|283510549|gb|ACQH01000070.1| GENE 11 13346 - 14035 978 229 aa, chain + ## HITS:1 COG:TP0168 KEGG:ns NR:ns ## COG: TP0168 COG0588 # Protein_GI_number: 15639161 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglycerate mutase 1 # Organism: Treponema pallidum # 3 228 2 231 251 286 56.0 2e-77 MKKLVIIRHGESEWNQKNLFTGWVDVELSEKGKAEAKRAGELMKEAGLDFDVCYTSYLKR AINTQQIALKVMEREWLPVIKSWRLNERHYGALSGLNKKETAEKYGDEQVHIWRRSFDVR PPQMEEDNVYSARKNPAYRDVPVEDVPMCESLKDTIARTVPYFENEIKPLVMEGKRVFIA AHGNSLRSLIKYFENISDEDIINVEIPTGTPLVYEFDDDFKVTNKYYLK >gi|283510549|gb|ACQH01000070.1| GENE 12 14169 - 15548 1299 459 aa, chain - ## HITS:1 COG:BB0705 KEGG:ns NR:ns ## COG: BB0705 COG0571 # Protein_GI_number: 15595050 # Func_class: K Transcription # Function: dsRNA-specific ribonuclease # Organism: Borrelia burgdorferi # 15 241 14 242 246 92 33.0 2e-18 MLSNLIDRVRLPFRKEKELYSSLFEILGFYPRNISYYKLALMHKSVMRRGANGKPVNNER 
LEFLGDAILDAIVGDIVYRHFPGKREGFLTNTRSKLVQRETLNRLAQEMGISRLILSSGH TSSHNSYMGGNAFEALVGAIYLDRGYSSCMRFMEKRILAKMINIDKVAYKEVNFKSKLIE WSQKNRVKLDFIVKEQGKDKNGSPFFLFCVHLEGVEGCNGRGYSKKESQQLASKLTLERL KKEPQFIDRVFAAKAERTKNEDSAEGNDKAATTQLLPVKPANGPQTAPKESAKASVKETA NEANKPKANLPTAEQAERSATQRTTERPEERAAEKKAERANERKTRNDAQRDKVTDKPEV PSSAEEKTKKQERRTNEKAEEKPMEVFRDDKPKDEPKGKRQRLYAERESEELSAETLDFD DDPEFDLSHITARQQSREEIIAAAEAAAFSEEHTDNQNQ >gi|283510549|gb|ACQH01000070.1| GENE 13 15538 - 15933 612 131 aa, chain - ## HITS:1 COG:PA2965 KEGG:ns NR:ns ## COG: PA2965 COG0304 # Protein_GI_number: 15598161 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Pseudomonas aeruginosa # 1 131 288 414 414 145 54.0 2e-35 MEDAGLKPEDIDYINVHGTSTPVGDISEAKAIKELFGDAAYKLNISSTKSMTGHLLGAAG AVEAMATVLAVQNDIIPPTINHADDDVDENIDYNLNFTFNKAQKREVRAALSNTFGFGGH NACVIFKKYAE >gi|283510549|gb|ACQH01000070.1| GENE 14 15960 - 16799 929 279 aa, chain - ## HITS:1 COG:NMA0044 KEGG:ns NR:ns ## COG: NMA0044 COG0304 # Protein_GI_number: 15793076 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Neisseria meningitidis Z2491 # 1 274 1 273 415 278 54.0 9e-75 MELKRVVVTGLGAVTPVGNTPEETWANLLAGKSGAAPITHFDATNFKTKFACEVKNLNVT DYIDRKEARKMDRYTQLAIIAAMQGIKDSALDLEKEDRNRIGVIYGVGIGGIKTFEEEVT YYGQHLGEEPKFNPFFIPKMIADIAAGHISIMYGLHGPNYATTSACASSTNALADAFNLI RLGKANVIVSGGAEAAICGCGVGGFNAMHALSTRNDDPEHASRPFSKSRDGFVMGEGAGC LVLEELEHAKARGAKIYAEMVGEGASADAHHITASHPKA >gi|283510549|gb|ACQH01000070.1| GENE 15 17015 - 17251 495 78 aa, chain - ## HITS:1 COG:DR1942 KEGG:ns NR:ns ## COG: DR1942 COG0236 # Protein_GI_number: 15806940 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Deinococcus radiodurans # 8 76 41 109 110 82 63.0 2e-16 
MSEIESKVKAIIVDKLGVDEAEVKPEASFTNDLGADSLDTVELIMEFEKEFGISIPDDKA EKIGTVGDAIAYIEENAK >gi|283510549|gb|ACQH01000070.1| GENE 16 18489 - 18893 446 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928884|ref|ZP_06422730.1| ## NR: gi|288928884|ref|ZP_06422730.1| hypothetical protein HMPREF0670_01624 [Prevotella sp. oral taxon 317 str. F0108] # 1 134 1 134 134 268 100.0 1e-70 MLHTIKNLSLCAMFAIVVMAFAACSSDDDRDNPVVQQLIGSWQQINNCNSCEQSNTYETW YKDLTWDKHAIVDGNAELVRKGDFGCKDNTLHLRSTCPKGRDLVEMDFKVEINGDVLTLT NLKTKATIKYKRRK >gi|283510549|gb|ACQH01000070.1| GENE 17 19510 - 20391 1105 293 aa, chain - ## HITS:1 COG:MA1840 KEGG:ns NR:ns ## COG: MA1840 COG0524 # Protein_GI_number: 20090690 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Methanosarcina acetivorans str.C2A # 8 282 34 312 326 178 37.0 1e-44 MKRLIVGLGEALWDVLPEGKKIGGAPANFAYHAAQYGLDTLAISALGNDALGDEIAQKFA DKKLNVLMPRVPFPTGTVQVSLDAQGVPTYDIKEGVAWDNIPFTPELEEVARNCRAVCWG SLAQRNEVSRNTIRRFLDATPKDCLKIFDINLRQTFYNEELLLDSFKRCDILKINDEELV TIGRLFGYPGLDMTNKCWLILGKYNLDALVLTCGVNGSYVFTPGAMSFLETPKVEVADTV GAGDSFTGSFCAAYLSGVPVSEAHRLAVETSAFVCTQNGAMPTIPKHFIDRIK >gi|283510549|gb|ACQH01000070.1| GENE 18 20422 - 21582 1362 386 aa, chain - ## HITS:1 COG:NMB0535 KEGG:ns NR:ns ## COG: NMB0535 COG0738 # Protein_GI_number: 15676441 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Neisseria meningitidis MC58 # 9 380 32 418 426 74 24.0 3e-13 MRKSTQLTLIPVMFCFFAMGFVDLVGIASNYVKADLNLTDSAANVFPSLVFFWFLIFAVP TGMLMNKIGRKKTVLVSLVITVASLLLPLFGNSYTLMLISFSLLGIGNAFMQTSLNPLIS NIISGDRFAATLTFGQFVKAIASFMAPIIAAWGAAASIPHFGLGWRVLFPIYMVIGILAT LMLSATPIEEEPMKDKPSSFMETIKLLGTPIVLLSFIGIMCHVGIDVGTNTHAPKILQER LHMGLDEAGFATSLYFIFRTLGCLSGSFILAKFNNRAFFYVSVVMMALSMVGLAVGESKA VLYTAIALVGFGNSNIFSLVFSQAVLSLPDKKNEVSGLMIMGLFGGTIFPLLMGPASDAV GQWGAAVVMAVGVIYLFTYGRKIGAK >gi|283510549|gb|ACQH01000070.1| GENE 19 21881 - 23680 1898 599 aa, chain - ## HITS:1 COG:BS_sacC KEGG:ns NR:ns ## COG: BS_sacC COG1621 # 
Protein_GI_number: 16079757 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Bacillus subtilis # 129 599 16 510 677 383 43.0 1e-106 MAQKLTKPLLRLATTLLLLFVAHAIAHADNPKKMNMLSKNHCMLAFDTPTTRYLLLPVEE KAELCNLRLIVDNNAIETFNVRLAADHVDYFVPLALDAVKGKPFVLDVHVQGDVRKNGNL NDYVCWKNMTMADNFDTTNREKWRPLYHHTPPYGWMNDPNGLFYKDGLYHLYYQWNPYGS QWENMTWGHSVSKDLVHWEDRGTAIAPDALGTIFSGCAVVDHNNTAGFGAGAVVAFYTSA GENQTQSMAYSTDNGQTFTKYDRNPVITAPIPDFRDPHVFWHQPTERWIMVLAAGQEMQF YSSKNLKEWTFESNFGEGYGNHDGVWECPDLMQLPVRGTDKQKWMLICNINPGGPFGGNA TQYFIGEFDGHQFTCEDKPETTKWMDYGKDHYAAITFDNAPQGRRILMTWMSNWQYGNQV PTLQFRSANTIACDLDLFEHNGQTFVGRTPSKELLAARAKPLFKNSTSARKTFNPAQGAY ELVMVVKPQAKGKTRLTLQNAAGEKVVVSYDVAAQTLSVDRTQSGRTDFSDGFKAVTIAP TRGLITMLRIFVDKSSVEVIDAQGKVSLTNLVFPSTPYNTLLLENSGGNKSSVSVFEMK >gi|283510549|gb|ACQH01000070.1| GENE 20 23897 - 25501 1778 534 aa, chain - ## HITS:1 COG:no KEGG:BT_1760 NR:ns ## KEGG: BT_1760 # Name: not_defined # Def: glycosylhydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 532 9 522 523 634 61.0 1e-180 MNKTAYIIASFALLATPLASCESDAPIVEERDWQGTTTFFQSVDDKKQDTYYKPFAGFVG DPMPFFDPVEKDFKVLYLQDYRPNPAATYHPIWAVSTKDAAHYTSLGELIPCGGAAEQDA ALGTGSTIYDETQRTYYTFYTGNKHQPRQGDNAQVVMVATSADFKTWTKNQTFAIRGNDY GYSANDFRDPYVYKAEDGSGYRMLVATYKDGKGVLAEFSSADLKAWKHVGVFMTMMWDRF YECPDLFKMGDWWYLVYSEKHAAVRRVQYFKGRTLDELKAATANDAGKWPDDHEGFLDSR GFYAGKTASDGTNRYIWGWCPTRPGNNNTDVGAYPHEPEWAGNLVVHRLIQHADGTLTLG EVPAISTQFGQNAAPKLMAQSAQGVTQNDDNMKLQGEAWALFHRLGQVNRISFTVKTAAN TDKFGISFVRGTDSEKWYTLVVNPEGADKRKVNFEEEGPKGKGFVEGIDGYVFARPADNT YNVTIYTDNSVLVMYVNDVCCYTNRIYNMAKNCWSINSYGGEVNVSNLSVGTTK >gi|283510549|gb|ACQH01000070.1| GENE 21 25513 - 26895 1578 460 aa, chain - ## HITS:1 COG:no KEGG:BT_1761 NR:ns ## KEGG: BT_1761 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 458 1 459 461 374 44.0 1e-102 MKTLITQIVMAFATLVAITACSESNVSDLQLGGDCNVQALALDKYEGTTHLPTRSITVRV 
PQSYDVSNMELTTLQLSAGATANIKQGDHLDFTTPRTMRVRNADVYEDWTISVRRDEAKI TSFKINDLYTGIIDEGARTISVYVPEALNLSSLVPTFTLSDNATASIASGVATSFAKPVQ ITVTNNTATTTYTVTVTKIGKPEVVFVGLAQSMDQLNIEELTACRWMLANVPNSLYASFA DIKNGNIDLSACKVIWWHYHKDGGVDGKEAFEKAAPAAVEASVALRDYYNAGGSFLFTRY ATNMPAFIGAMKNDAVPNNCWGQKETEAETVGSPWSFSMRGHTDNALYRNLVMKSDAPND VFTCDAGYRITNSTAQWHIGADWGGYADYDTWRNKTGARDLGYGGDGAIVAWEIPANGTK GGIVCIGSGCYDWYSIDPVTPHYHQNVATLTRNAINLLKR >gi|283510549|gb|ACQH01000070.1| GENE 22 26898 - 28628 2006 576 aa, chain - ## HITS:1 COG:no KEGG:PRU_2273 NR:ns ## KEGG: PRU_2273 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 1 576 1 583 583 689 59.0 0 MKKATIILALCAMLASSCNDFLGEQKPQGTLNNDQVKDARYVDKLVISAYAIWISAEDIN SSFSMWNFDVRSDDAYKGGNGTSDGDVFHQLEVSQGILTTNWNISDMWQRLYNGISRVNT AIELLQQTDAANYPLKEQRLAEMKFLRAYGHFLLKRLYKFIPFVVDPSLTSEQYNNLSNR QYSNDEGWALIAKDLEEAYNVLPARQTEVGRPSKAAAAAFLAKVYLYKAYRQDNENSNEV TSISAEDLQKVLQYTAPDIYAAGGFALESDFHNNFRPEPQYENGPESIWAMQYSMNDGTT YGNLNWSYGLIVPNIPGVTDGGCDFYKPSQNLVNAYRTDNAGLPRIDDFNKADYDKSRDY ADPRLFLTVGMPGLPYEFNRDFMMDRSATWSRSNGLYGYYVTLKQNVDPKSGYLIKGSWW GTPMNRIVFRYADVLLERAEALAQLGGNHLQEALALVNQLRRRAKQSTAAIANYESQYGV RLNVGQYTGSYNQAQTLKIVKMERRLEMGMESERFFDLVRWGEAASVINKYYAEEANDCA IYNGAHFTANKNEYLPIPHAQVAASKGNYKQNIGDW >gi|283510549|gb|ACQH01000070.1| GENE 23 28642 - 31746 3792 1034 aa, chain - ## HITS:1 COG:no KEGG:BT_1763 NR:ns ## KEGG: BT_1763 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 1034 5 1041 1041 1452 69.0 0 MKQLRTLLIIAMAIFVCALAHAQGVNVSGKVTSATDGEPITGASVVEKGTTRGTVTDIDG NFKLEVNSGATLVISYLGFKTIEVKATTSMNIALEENAKSLNEVVVTGYTTQRKADLTGA VSVVNVDDLAKQNENNPIKALQGRVPGMNITADGAPSGQATVRIRGVGTLNDNDPLYIID GVPTKAGMHELNGNDIESIQVLKDAASASIYGSRAANGVIIITTKRGKSGRIRVDLDASI SAATYVHKLDVLNAKEYGQVMWQAYVNDGMDPNTNGLGYHFNWGYDAQGLPTLQGMSMKK YLDAAGKTPAADTDWFNETTRTGFVQQYNVAVSNGSEKGSSYFSLGYYNNQGIIKQSDFN RFSARMNTEYKLIGKLLTVGEHFTLNRTSEVQAPGGFLSNVLQFNPSLPVRTTDGNWGGP 
VGGYPDRYNPVAVLERNKDNRYTYWRMFGDAYINLNPFKNFNLRSTFGLDYSQKMQRFFT YPVTEGNVAKNTNAVEPKQEHWTKWMWNAVATYNVESGKHRADAMAGMELNRQDDVYFSA YKEGFLILSPDYMWPSAGVGTAQSRGGGSGYSLVSFFGKLNYNYDDRYMASLTLRHDGSS RFGKNNRYATFPALSLGWRINRERFLNRASWIDDLKLRASWGQTGNQEISNIARYTVYVS NYGVNENGGQSYDTSYDIAGTNGGRTLDSGFKRDQIGNDNIKWETTTQTNLGADFSFFNA TLYGSVDWYYKKTRDILVQMAGIAAMGEGSTQWINAGEMENRGVELNLGYRNTTAFGLKY DLAANLSTYRNKITKLPETVAANGTFGGNGVKSVIGHPMGAQVGYVADGIFKSQAEVDNH AQQQGAAPGRIRYRDLDGDNKITEADQTWIYSPVPKFSYGLNVYLEYKGLDLTMFWQGVQ GVDIISDLKKETDLWAGLNIGFLNKGRRLLGAWSASNPSSDIPALSLSDNANEKRVSSYW VENGSYLKLRTIQLGYNLPQAVVKKLMMERLRLYVSAQNLLTIKSSSFTGVDPENPNFGY PIPLNLTFGLNVTF >gi|283510549|gb|ACQH01000070.1| GENE 24 34129 - 35460 1115 443 aa, chain + ## HITS:1 COG:FN1101 KEGG:ns NR:ns ## COG: FN1101 COG1373 # Protein_GI_number: 19704436 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 6 439 24 449 470 132 26.0 2e-30 MKRIFKRKLYERLLEWKRVQNGKSAILIEGARRVGKSTLVEQFAKNEYESYILIDFNEAS DEVKSLFSNIMDKDFLFLQLQAIYNVVLKERRSVIVFDEVQNCPLARQAIKYLVKDGRYD YIETGSLISVKKSTKDITLPSEEERVTLYPMDYEEFRWALGDEVTVSLLRTFYEKRLPLD KAHRDKIRDFRLYMLIGGMPQAVETYLETNNFALVDRVKRGIINIYQADFQKLDPTGRLE TLFMEIPAQLSKANGRYKPYSVLGKVDGDKLTDLMRNLEDSKTVLISYHSNEPNVGMSLT KNLSRFKVFCADTGLFVTLAFWDKSNTENVIYQKLLNDKLSTNLGFVYENVVAQMLATSG NKLFYYTWAKDATHNYEVDFLLSRGTKLHPIEVKSSGYNSHVSLDVFCEKFSHVVDKRYL IYTKDLKRDEKTLLLPVYMTIFL >gi|283510549|gb|ACQH01000070.1| GENE 25 35636 - 36883 1159 415 aa, chain - ## HITS:1 COG:yieL KEGG:ns NR:ns ## COG: yieL COG2382 # Protein_GI_number: 16131587 # Func_class: P Inorganic ion transport and metabolism # Function: Enterochelin esterase and related enzymes # Organism: Escherichia coli K12 # 28 415 44 400 400 145 30.0 2e-34 MRTNKKTMLLALLTLMAWVNGMAQEQLSTPAQVLSPVINADGSVTFNLFAPQAKAVSVSG DFLPKQDTQGTKTENQITKHAPLQKNEQGVWTYTTPPLAPELYAYAFDVDGLRITDPANV YTSRDIATLSSIFIISNKQSDKGHMYSVNEVPHGNLFKVWYRSDALGMNRRMTIYTPAGY EQGKAYPVLYLLHGAGGDENAWSELGRAAQIMDNLIATGKAKPMIVVMPNGNPNCQAAPG 
EWSFGMYRPSFMGETGPKTASAMSMEESFMDIVKYVDSHYKTIKNRNGRAVCGLSMGGGH TFGISRLYPTTFHYYGLFSAAIYTGRHINPLGAPLYAKMCTDKAFQAQMQRLFASKPKLF WIAIGNTDFLYQMNVDLRHFLDDNHYPYQYVETDGGHIWRNWRIYLTMFAQKVFK >gi|283510549|gb|ACQH01000070.1| GENE 26 36958 - 38394 1303 478 aa, chain - ## HITS:1 COG:BS_tuaB KEGG:ns NR:ns ## COG: BS_tuaB COG2244 # Protein_GI_number: 16080613 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Bacillus subtilis # 1 398 1 403 483 132 25.0 1e-30 MESLKEKTAKGLFWGALNNVMLQLIGVTMGIVMGRLLNATDYGMIAMISIFALVANDLQN SGFKAALNNIKQPQHRDYNSVFWFNIAMGSGLYVLLFFLAPLIASYYHTPKLIPLCRYAF LSLIFSSFGTAQSAYLAKNIMAKQVAKANLTGTFVSSSVAVWMAWSGYGYWAFATQTNLY VLINTLLYWHYSHWRPTFSIDWQPVRHMFKFSSKLLFSSILTDINNNIMNLLLGHYYSAS STGAYNQAYQWHSKAFFLIQGMLTQVAQPVLVDVGDERERQLRILRKMMRLAALLSFPLL FGLGMVAHEFITVAITEKWARSAEYMQLLCIGGAFVPLSFLLSNLVISKGRSSVYLYVNL GLGVTQVAAMLSLYPFGIKWMICALVTLNIICFGVWGLFARRLADYKLRLLALDTMPFAL AALGVMALTWFATQTITLLPLLLGVRIVVAGLLYLGIMKLANVDVFNEMVAFVRRKKM >gi|283510549|gb|ACQH01000070.1| GENE 27 39223 - 42318 3736 1031 aa, chain + ## HITS:1 COG:no KEGG:BT_4707 NR:ns ## KEGG: BT_4707 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 43 1031 105 1099 1099 816 46.0 0 MLKHYYALPPLANRAVLVAWLALSPVAWAMAGTSNTQPNATSLQQSTRKTVTGTVTDANG EPLVGVSVRSKGGKGGTITDIDGHFSMQVGDGETIEISYVGYATQTLTANTNMPMTIVLE EDKKLLNEVVVTALGIKREAKALTYNVQDLKGSEVTRVRDANFVNALAGKIAGVTINQSS SGVGGSSRVVMRGTKSLFGENNALYVLDGIPMQGLRTKQSDNFYESVEVADGDGISNINP DDIESMSVLTGASAAALYGNRGANGVILITTKKGAVGRPKVSFSNSTTFTRPFVTPQFQN TYGRKDGEFASWGEKMERPSSYNPLDFFNTGFNVSNSVALSTGSETSQTHVSVGAVNAGG IVPNNRYNRYNFTFRNSWDVIKDVLELDFSLLYVKQNTRNGIGQGMYYNPLVPTYLFPPS DDINRYAVYEMYNPVRNFKTQNWPYGNLGLGMQNPFWIVNRNKFETNRERFIASFSARWN ITSWLNILGRARMDNAYTDFERKLYASTDGLFSKPAGNWMTQDDKNTSTYLDFLINIDKQ FANDEIRLLANIGGSYFDERYTSKTFEGNLARVPNFFHPSNMSPTESNTAHANLHTQTQS FYAKVEVGYRNFLFADATGRIDFFSTLLGTSSNNVFYPSVGASVLLTEVLPLPKNIFSLW 
KIRGSYAQVGNPPSPYLTREAVALSNGTPKSGAFTPASHLKPEMTKAFELGMDLRLFDNK LNIAATYYNSNTYNQLFRYKLPPSSGYEYAFENAGKVNNWGVELSATYNQKLGPVDWSAN LVYSLNRNEIKELLPEYVTDRTTNTTVKAPTEFEVSSAESYKMILRKGGTMSDIYASHLK QDYQGNILASGGVQKDENDFVKVGSAAPRYNISLRNSFAWKGIELGFMFDARVGGVVVSA TQALMDQFGVSKQTADARDAGGVRVNNGLLNAEQYYGVAAGGRTGLLAHYTYSATNVRLR EMSISYDLPTAWFANKLKVNLSLTGHNLLMIYNRAPFDPELTANTGTYYQGFDYFMPPSQ RSMGFGVKVNF >gi|283510549|gb|ACQH01000070.1| GENE 28 42344 - 43903 1762 519 aa, chain + ## HITS:1 COG:no KEGG:BT_4708 NR:ns ## KEGG: BT_4708 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 519 25 529 531 363 40.0 1e-98 MKNKIKTLFVLPALCLPLVGCLDSSLNDDPDRANPAWLGYDNLHGTYLTSLQRNVVPEDQ NDFQLAEDLVGNMFAGYYAGTQSWEGGFNGTTYAFPDGWKDRPFSVAFTKLMSNWQQLRL KADSASVLFAVGEIVKVEAMHKTTDIYGPIPYTRFGLETPVSYDSQEAVYMRFFAELNHA IGVLTNFDRLNPAAKPLDKFDLIYGSDLKKWIRFANSLKLRLAMRCRAVYDGAQKLAEEA VNNPYGVLEDNADNAVMQTVGALSFTYNNPFYNIYSPEGYNEDRMGATMDAYLNGFADPR LPKFFAKAKGDVYRGLRNGLRNGKDFQGDERLSMPTITRSTPYVWMTAAEVWLLRAEGAL FGWNMGGTPEELYAQGVTTSFEQHGLADAAEAYLASTAKPSAYPGLGTSTAAEAPSDITP AWDESATKEKKLERIITQKWIAIYPLGQEAWSEYRRTGYPKLFPVVDNLSNGKVNTNVQV RRVPFPASEYSGNKQEVEKAVQLLGGEDTGGTRLWWDKQ >gi|283510549|gb|ACQH01000070.1| GENE 29 44035 - 45078 1359 347 aa, chain + ## HITS:1 COG:no KEGG:BF1328 NR:ns ## KEGG: BF1328 # Name: not_defined # Def: putative secreted endoglycosidase # Organism: B.fragilis # Pathway: not_defined # 10 335 8 345 350 202 40.0 2e-50 MNVTNIFHGIASLALAAAALTSCNTSIEALEIIKPEEKSEQYYADLRAYKQRTDHEVFFG WFGGWNTKSPNMIGSLKSVPDSVDIISIWSGTYDREDIEYVQRVKGTRVTFTIFAHKIPA QFLEGENHDQVTREAIERYAVSLVDTMKAYGYQGIDLDYEPGYQDPAGGPFTGPLVGPLN VYPNYRDNMEIFVKKLGEFIGPKSGTSNLLIIDGVPFDVKPELAEYFNYGVVQAYNSPSY ADLQRRFNDAAARGWKPEQYIFAETFEGGRAATGGVTHMLQDGKQVPSLLGMALFLPEYQ GKKATRKGGCGTYHMENDYDNWPNYKFTRRAIEIMQTHRDTVATTKP >gi|283510549|gb|ACQH01000070.1| GENE 30 45099 - 46295 1518 398 aa, chain + ## HITS:1 COG:no KEGG:BT_4710 NR:ns ## KEGG: BT_4710 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 15 376 17 364 402 127 26.0 9e-28 MKKILRKLPLALAMLLVAACNDGIKSYDGLYIVGTQGKEVTTTLTVDDVPSAIAVNVAAS ELAKENINVELKAAPELVESFNKEHHKNYVLLPKDAYKLENTTQTIMGGKHVSDKGTQLT IVNLEAMRPGTTYLLPLSIANVQGSDMPVIEASRTIYVVVNQVIVTKAADLSRSWRFYYA DFSDNKGRFDTHAMKSVTFEARVRFKKMDANSRKWCYSVMGLEENLCLRTAGGPADGWKL QLGDPNHIDSRDVLPNDKWVHLACVYNGETGKKYIYINGELQAETTDSRKTISLANAYGQ NDLFYIGQSASDDRCMEGWVSEARVWATARTAAELKNNVCWVDPTSKDLVAYWRFNEAQK KDDKWIVTDLTGNGFNAYYFSWPSGQEPSFVDGVRCPE >gi|283510549|gb|ACQH01000070.1| GENE 31 46634 - 47050 571 138 aa, chain + ## HITS:1 COG:no KEGG:ZPR_1410 NR:ns ## KEGG: ZPR_1410 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 23 134 22 137 137 64 37.0 1e-09 MRITVIAKNAGRRKELGKRQIELAKHPQTLQQLLEELTLIGLADAQAERTNQPLSQSDID AQAEEGRVRFAEGYATNNDTPEKSLERMRQAFVDGLFRVFADGEELTQWDGPLALHEDSE LVFIRFTMLTGLCWRMIF >gi|283510549|gb|ACQH01000070.1| GENE 32 47130 - 52055 5202 1641 aa, chain + ## HITS:1 COG:no KEGG:BT_1809 NR:ns ## KEGG: BT_1809 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 72 1639 86 1676 1676 680 30.0 0 MQTYKELVAKSHDYLKSLDQDKLNQYEKKFVKELLSNKDRLREYSFDREVIAQMPLLNAT ALINGGGALPNLLPTDLSQLLASYLHELGEYSPTRGIYRISLRRRDFNLRNRQEDIKEAY FAFYVFALLGGDISQLVRGQYQVKDWDYGNPSFEHVITAYLLANHQTAIQAAREVLTSEN NVGILTHALIRSIERTNIEELHTLLLQTLKAAQLQEGLRQSIVETGDECNLAFFERLLQI IDQENMLRFASVRRSVQVWAGLGYEEIEDKDIKTIFNAIRTFFRNPEMREEAYRGDNPLL VYVALYTAGLDDYEQSERAALNLIDGQYPQQVRVAAIYYLYRSDNFEGEKHLELLCRHLD DKYVLAFTVNMLNSSERFAKVDRGDFARLSDSEHALYLRLFDTVYEWHKGLKANETLRYG GFEWFALNTGTSTTTDVLWYLAARLGEQSCIDRFLTMKQPAYYVHCMNDYSWGSNNKRVP LTAFIEDFFPKGSAECRAQFVVNNLVAMESDKFKRFAEIGPKQTYNEAQIKTLENKLKSK VSTTRQRVIAMLLWLPDEQLIESYTRIQKLKAPDIEGSLSELREGSPALAKAFGPPTTQT TGQAMKAYPGEEGGYGLYTPVEIAEPEVANPFNSAPQGGLKGMLSKITGKDNAPDVSGIF AYTYEQMHDLYLKMEAIIDANADRQYKDRWNDDRVLGDRYLAYTSKEYGYDSLPFPELWE QFFEESGIKDEELYGLYLMVEWIGRDDNFLKLDLPNYPLTSKQRGGEMKYRSHFREVIEG 
RFYKVREQRPELFFEPSYLMSSLFYFFCNEKKIVRKTKYSTYIYPIPSCTPLNGAMHTMT QTWRTDEEFARCSNLLLAISRKFYLEDDEKDRSNYRLPPLMAARMNLEGRLTDDQLMQML MAEKGGMLESATFAVYYDSDYRRKPQWELTPQKSRYDKAVYEHLRNVVNRIANHLFDIEL TRRNAPTPATNLLTGSYRSKVVLWGTANLQKAMAALGKEHLVRDYSGKEKRAVLTSCIVH CYPLDTDTPDMLKGIDAARLVELAFFAPQWMELVRQHLNWKGFDEAYYYFVAHTKESDSE EKRATIALYTDLAPEDLADGAFDARLFNEAFKKVGKKNFALFYDAAKYMGSSNYHGRARR FADVTQGLIKEKQLMEQIDKTRNKDALCALGLVPLPKKNIDTALLKRYKRIQAFLKESKQ FGAQRQASEKRTVEIALINLARSAGYDDVIQLVWRMEGHLVADKKALLDGMVAEGYLIRV EIAPDGTNKVVIEKDDKPLKSVPTKLKKNATYLEVNQTHKEWTLQYRRAREIFEDMMRQQ TTIMPADAAVIEANPIVAPLFAKLLVQQHDHMGLYAEGKLHTLDGEQTIDPDTPLTIVHP YHLHASGKWAAWQSRLFEMRLVQPFKQVFRELYVPIEEEADKNESRRYSGYQIQVRQTLA TLRSRGWVADYEEGLRKVFHKQGVVVSLYARADWFSPSDIEAPAIEYVRFDRTRGIEPLH IADVDPVLYSEVMRDVDLAVSIAFVGGVDPETGQSTRELRAAIVRCTAEMLKLNNVRVDG NFAFIEGTLADYTVHLGSANVRQQGGKEIPIVPVHSSQRGKLYLPFVDEDPKTVEIVSKV VLLAEDNKLKDPTILQWIERR >gi|283510549|gb|ACQH01000070.1| GENE 33 53304 - 53543 204 79 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPRITRNTRIARKPRITRKPRITRKPRITRKPRITRKPRKPRITRNTRTTKNTRTTKNTR TTKNNCLLFTNYTFPTTPF >gi|283510549|gb|ACQH01000070.1| GENE 34 53705 - 54430 574 241 aa, chain - ## HITS:1 COG:PM0161 KEGG:ns NR:ns ## COG: PM0161 COG0778 # Protein_GI_number: 15602026 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Pasteurella multocida # 1 240 1 241 242 317 60.0 1e-86 MRLEEILNHRRSVRHFDSAKPIDPEQVKACLQLATLAPTSSNMQLWECYHITDRAMLEKM VPACLDQLAASSAQQLVVFVTRPDLTKHHAQAVLAFEQDNVRRNSPAEKQEKRIRRWQLY YGKFIPLWYGRCFGLAGLLRKALVQAIGLFRPIVRQASEADMRVVMHKSCGLVAQTFMLA MSEVGYDTCPLEGFDSWRVKRLLHLPYSADINMVIACGIRLPDGVWGDRFRLPLSEMYHR V >gi|283510549|gb|ACQH01000070.1| GENE 35 54785 - 56401 852 538 aa, chain - ## HITS:1 COG:no KEGG:Metev_0337 NR:ns ## KEGG: Metev_0337 # Name: not_defined # Def: restriction endonuclease AlwI # Organism: M.evestigatum # Pathway: not_defined # 1 536 22 563 569 380 38.0 1e-104 MIPEIMLLVEHFAGKKWDKESQCAFMEVLKNENFFNGKGENDPAFSARDRINRAPKSLGF 
VKLSPTIKLTPAGEQLISSKRKDEVFLRQMLKFQVPSPYHKPSSKAAKFCIKPYLEMLRL VRTMGTLKFDELQMFGMQLTDWHDFERIVQKIETFRIEKAKNKGSYRVFKAEYLNTELKQ IYKERIGQGETQTRESADTSLDNFLRTQSSNMRDYADSCFRYLRATGLINVSQVGKSLSI VPERIEDVDFLLEKIDRNPLTFKSEAAYCTYLGDTGNPKLFTDNKANLLEKLHTEFSGQT ISEESSIEELKDLLGNLRDERRDNNIKNEVKSIKEYKLYDDIQSVFGDIAAKKFYDNPIM LEWNTWRAMTMLDGGNIKANLNFDDFGKPLSTAAGNMPDIVCDYADFLVTVEVTMASGQK QFEMEGEPVSRHLGKLKVASGKPCYCLFIAPTINEASISFFYMLHKTNLAMYGGKSTVVP LPLTLFQKMVEDSYKAGYIPNPEKIKAFFETSNTLAQKTEDERTWYKGMLEKAMNWLG >gi|283510549|gb|ACQH01000070.1| GENE 36 56445 - 57416 432 323 aa, chain - ## HITS:1 COG:CAC3358 KEGG:ns NR:ns ## COG: CAC3358 COG0338 # Protein_GI_number: 15896601 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Clostridium acetobutylicum # 4 283 1 280 281 287 54.0 2e-77 MKPMIKYRGGKSKEIPHIMWHIPRFTGRYVEPFIGGGALYFYLEPREAIINDINKQLINF YKGVRNNFPNLRKELDEIEALYEINRRDYETLKANTPNERVEDKNEVLYYLLRDMFNNKV SKKYSDALLYYFINKTAYSGMIRYNADGDFNVPYGRYKHLGTQQVSLSHCELLKRTEIHN VDYAKIFNMCADNDFVFLDPPYDCAFSDYGNAEYKDGFNEDSHRRLAEDFKNFSCKALMV IGKTPLTEELYRQFIVDEYEKSYAVNIRNRFKSEAKHLVIANYKKDWDNVQFFAPETNKE HAILETAQVMLFEPKKEYRGKNK >gi|283510549|gb|ACQH01000070.1| GENE 37 57828 - 58820 1362 330 aa, chain - ## HITS:1 COG:aq_715 KEGG:ns NR:ns ## COG: aq_715 COG0264 # Protein_GI_number: 15606113 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor Ts # Organism: Aquifex aeolicus # 1 327 3 287 290 108 30.0 1e-23 MAVSIADVQKLRKLTGAGLADCKKALDETGGDIEKAIEIIRAKGQAIAAKRSDRETANGC VLVKAENGFGAIIALKCETDFVANGKDYIQLTQDILDAAVAAKAKSLDEVKALTLADGTK VEDAVIARSGITGEKMELDGYNFIEGENIYTYNHMNKNLLCTMVQTNKPAAEEGHAVTMQ VAAMKPIALDEASVPQSVKDEELKVAIEKTKEEQVEKAVQAALKKAGFNLYIAENEEHIA EGIAKGEITEAQADEIRRIKKEVAEEKAAALPVEMVNNIAKGRMAKFFKESCLLEQEFIQ DSKLSVAQYLKAADKDLTVVAFTRFTLRAE >gi|283510549|gb|ACQH01000070.1| GENE 38 58910 - 59725 1084 271 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|212690772|ref|ZP_03298900.1| hypothetical protein BACDOR_00259 [Bacteroides 
dorei DSM 17855] # 1 271 1 271 279 422 77 1e-117 MSRTNFDQLLQAGCHFGHLRRKWNPAMAPYIFMERNGIHIIDLHKTVAKVDEAAEALKQI AKSGKRILFVATKKQAKDVVAEKAASVNMPYVIERWPGGMLTNFPTIRKAVKKMTNIDKL MSDGTFANLSKRELLQVTRQRAKLEKNLGSIADLTRLPSALFVVDVMKEAIAVKEAKRLG IPVFAMVDTNSNPRDIDFVIPANDDAKDSIEVVLTACCAAIAEGLEERKVEKADEKAAAE QAEEAPRAPKKARKEEAKAETEVAEKTKEEA >gi|283510549|gb|ACQH01000070.1| GENE 39 59959 - 60345 539 128 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|150004192|ref|YP_001298936.1| 30S ribosomal protein S9 [Bacteroides vulgatus ATCC 8482] # 1 128 1 128 128 212 82 5e-54 MEVVNAIGRRKSSVARVYLSEGTGKITINKKDINQYFPSAILQYVVKQPLQLLEVAEKYD IKANLDGGGYTGQSQALRLAIARALVKVNEEDKKALKDQGFLTRDSRAVERKKPGQPKAR RRFQFSKR >gi|283510549|gb|ACQH01000070.1| GENE 40 60355 - 60816 585 153 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|150004191|ref|YP_001298935.1| 50S ribosomal protein L13 [Bacteroides vulgatus ATCC 8482] # 1 153 1 153 153 229 73 2e-59 MDTLSYKTISANKETAKKEWVVIDATDQVVGRLCSKVAKLIRGKYKPNFTPHVDCGDNVI IINAAKVVFTGKKETDKVYTRYTGYPGGQRFNTPAELRKRKGGVDRIIRHAVKGMLPKGI LGRHLLGNLYVFEGTEHDKAAQKPKAIDINQYK >gi|283510549|gb|ACQH01000070.1| GENE 41 61146 - 63152 1591 668 aa, chain - ## HITS:1 COG:no KEGG:BF2254 NR:ns ## KEGG: BF2254 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 39 668 69 688 705 323 31.0 1e-86 MQMKRNTLTALLALALSAWAAPNMQAQTEAQQLNHAVSYDLPRLVAQKGVENGYDAIGAL PGITVQNGVYTLGGRAIAVSINGETQNVAYDQLQELLRAMPASGIERVEIVYNPAARMQA NAPLIALSFKRGDGESRPFAAEVGGDIALRHRTLFGERASGSFRKGQFSLDLSYLHSHGR VLDALRTDIPQVLSPSYEDITNTQERLTATCKHSYRLAATYTFGARHTLSATYQGFNTNN DLSVSNTDSHLGYADGKAKAWLHNARIDYVTPLGIRLGIEAAFFRAPERHVLMGYVNPFT LDPDFSPWLARLLKENLLVTDSLRTDLWRAYIAGEHELNNGWTINYGAWFKKVKHNSVVH RRHQARGEYSVSWHYPEEHFTTAYVGVSKRFGEKLSAEVSLSADYYHHVMTTQWRALPTF SISYAPCEGNVLSLSVSGSRDYPHYSDLNGLSQPDNGGRLYTVNDSWPMPSLHYQAQLSY VTKGGFQFRTWYNRASGRVMQQPYANGFLDYVLLKNCNIDRHQQVGLQAVMPHQFGSWLY TCLTLGGAWTRDKHEARMWEEAVNSRIIHGEAQFTGVATLCTKPNIALSVDAFGQTKTQQ GLWELPARMQVDMSLRWTFLKDMATLRLFCNDVLADGTPKYRYMYFTYEFDMRNVPRRQV 
GMGLTIRL >gi|283510549|gb|ACQH01000070.1| GENE 42 63268 - 63471 90 67 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFYAEKFSANYKFSAVVNPYQRLFPCLAMAYAHAHAEELTCSFISPVASVFQLNLLSKTR WMTCISP >gi|283510549|gb|ACQH01000070.1| GENE 43 63517 - 64359 797 280 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928910|ref|ZP_06422756.1| ## NR: gi|288928910|ref|ZP_06422756.1| hypothetical protein HMPREF0670_01650 [Prevotella sp. oral taxon 317 str. F0108] # 6 280 1 275 275 554 99.0 1e-156 MKQRALHLLLLLAFAIGVAAQSKLYNQYGFIYADAMLSTPNKAMTVQALIDTGCSLCMID STYAVDSCKIQLDATEKLILSADNERVRTKPVTLDSISFCGKTYRNVVCLVFNIAEKLQQ YTPQFLIGANILQQDLWLFNLKDMLVQRNPTDKRAPDYTLKWKNHRHYADVGRDLITLHA EINGIRGRFVFDTGSRNNYVPNNIKLEPTREVEREVASASRKLSKQMVKECENVAVTLGK QHFVLDFRLFDRSVGTLNLSFLKDRSFLLNYNKRTITVLR >gi|283510549|gb|ACQH01000070.1| GENE 44 64779 - 66488 956 569 aa, chain + ## HITS:1 COG:SMb20592 KEGG:ns NR:ns ## COG: SMb20592 COG1595 # Protein_GI_number: 16265252 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Sinorhizobium meliloti # 21 179 28 199 227 84 29.0 4e-16 MIDIDACKRGNKETLGELYAAYAPKLKSVCRSYVKDESVAEDILHDAFIIIFTSIGELKE NQKLEGWMITIVKNLSLKYLRHMQKHSTSLLDLNREVMDDTLENGQSVELDVLLQAIDTL PRGSREVFKLSVLDGLSHKEIGRMLGINPHSSSSQLFRAKKMLQAMLSNYWGLLLLPLFI PVYIYMVTRQKVGRLTTLKPRTTKQRQATPEKTKDESMKPNANQPAYSTPTPTTPAVGDS KLAQTVAYNGTFYFPKSADSTKLTKTILPFDADTIQRKLVVTHTPKDTLPSSLPAIATNK DIALNEQMAQTVKPKKKYPWTFNLGYSSNAGANSALSNQNYLTMIDYANGGATAKLYTWE DCQNYLTRNNALMDSVERARLAWIVNNNIDSPTDAGVGLGEKANHHRPITLGFSLNKQLS LRWTFGTGLMYTSLNSEFESEYHRATLRKTQRVDYLGIPLRLTYRFWTKGRFNAYTTGGV TFEIPVYSSLAKKYTVTADSTYTLRERIKVHSQWSVNLGVGVQYRLFKPFSIYLEPNMFY YFGNASGIETYRTEHPFIITVPFGLRLTW >gi|283510549|gb|ACQH01000070.1| GENE 45 66589 - 67662 462 357 aa, chain + ## HITS:1 COG:Ta1356 KEGG:ns NR:ns ## COG: Ta1356 COG0535 # Protein_GI_number: 16082340 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductases # Organism: Thermoplasma acidophilum # 36 208 40 207 341 68 
27.0 2e-11 MYRIYDYIYNGWVLVNNMMRKQHKELTSLMIYSTTSCQSRCKHCAIWKKPIEHLNLSDII MIMKSKCITKRTMVGLEGGEFVLHPEADQILGWFDRNHPNYTLLSNGLAVDKVVAAVKQH HPKHLYVSLDGTRETYLHMRGRDGYDNVIAVVKACKDLVPLSLMFCLSPWNTFKDMEHVI GIAKVHGVDVRIGIYSTMSFFDTTADMMDAMGNGFVSQIPPSIHSTSENYDFVALYDEWK NKRLKLRCHSIFSQAVIHSNGDVPLCQNLNVLLGNIHSSSLDEIINSQTTCKLQCEHSRK CNKCWINYHRKYDIILLKNLERVFPKWLIELFYGEYQWTRDRDITYKQHFGRKTPQI >gi|283510549|gb|ACQH01000070.1| GENE 46 67676 - 68734 497 352 aa, chain + ## HITS:1 COG:YPO2042_2 KEGG:ns NR:ns ## COG: YPO2042_2 COG0673 # Protein_GI_number: 16122281 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Yersinia pestis # 90 261 66 238 303 67 28.0 6e-11 MANLSNIIKRYKSYRSINGLRATYTCQYALIGIGNHCVGNLLPVIQHLQLPLKYICCTSE KKATLVSQKYNGVKGTDNLNTILDDETILGVILATTPHEHFQIARKVLKAGKSLFIEKPP CEDEEQLRTLIDTVRLFGSPHVVVGLQRRFAPITQFLHQRLKNTDKHHYLYRYQTGLYPE GDVLLELFIHPLDYVVHLFGEAKVITANRITFKNGGQTLMLVLEHREVTGMLELSTDFSW QNAQEELSISTSKGCYTLSNMETLEFSPKRSSFLGVPLEKVVPGNAITTRLCARNAFLPI LDNNQVVTQGFFNEIKAFADMVENKREDNYAFGFESMKHTYSLMAKIRETTH Prediction of potential genes in microbial genomes Time: Sat May 28 01:35:36 2011 Seq name: gi|283510548|gb|ACQH01000071.1| Prevotella sp. oral taxon 317 str. F0108 cont2.71, whole genome shotgun sequence Length of sequence - 27793 bp Number of predicted genes - 15, with homology - 13 Number of transcription units - 9, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 369 193 ## gi|260910535|ref|ZP_05917203.1| hypothetical protein HMPREF6745_1158 + Term 543 - 597 3.8 + Prom 868 - 927 4.1 2 2 Tu 1 . + CDS 1088 - 1834 822 ## COG0500 SAM-dependent methyltransferases 3 3 Tu 1 . - CDS 1898 - 2158 84 ## - Prom 2212 - 2271 4.8 4 4 Tu 1 . - CDS 2381 - 2596 57 ## - Prom 2732 - 2791 8.5 - Term 3373 - 3424 13.1 5 5 Op 1 . - CDS 3476 - 6310 3558 ## COG0612 Predicted Zn-dependent peptidases 6 5 Op 2 . 
- CDS 6317 - 7621 1760 ## COG0526 Thiol-disulfide isomerase and thioredoxins 7 5 Op 3 . - CDS 7626 - 8114 584 ## gi|288928919|ref|ZP_06422765.1| hypothetical protein HMPREF0670_01659 8 5 Op 4 . - CDS 8118 - 8747 823 ## gi|288928920|ref|ZP_06422766.1| hypothetical protein HMPREF0670_01660 9 5 Op 5 . - CDS 8830 - 10380 1707 ## Phep_2855 RagB/SusD domain protein 10 5 Op 6 . - CDS 10385 - 13849 4162 ## BT_4660 hypothetical protein - Prom 13973 - 14032 5.3 - Term 15172 - 15216 3.1 11 6 Tu 1 . - CDS 15379 - 15603 286 ## gi|288928923|ref|ZP_06422769.1| hypothetical protein HMPREF0670_01663 - Prom 15701 - 15760 4.8 12 7 Tu 1 . - CDS 15994 - 20700 4468 ## COG1472 Beta-glucosidase-related glycosidases - Term 20904 - 20936 -0.4 13 8 Tu 1 . - CDS 21052 - 21948 1123 ## COG3568 Metal-dependent hydrolase - Prom 22176 - 22235 1.8 14 9 Op 1 . - CDS 22329 - 24107 1882 ## Dfer_2403 RagB/SusD domain protein 15 9 Op 2 . - CDS 24120 - 27674 3909 ## Slin_4979 TonB-dependent receptor plug - Prom 27710 - 27769 2.9 Predicted protein(s) >gi|283510548|gb|ACQH01000071.1| GENE 1 1 - 369 193 122 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260910535|ref|ZP_05917203.1| ## NR: gi|260910535|ref|ZP_05917203.1| hypothetical protein HMPREF6745_1158 [Prevotella sp. oral taxon 472 str. 
F0295] # 1 122 56 177 177 190 80.0 3e-47 NKQTKLSIELGKDGVAQCKVEDVKDNCAIRERVINVTCEGNQITLVVHHKERFEVLADCI CMYDVDFKMSKLSQGDYHLKVYYAGSVVKYDKESLVYDGEIRLVSGKVTQVTLRSGVPLP VD >gi|283510548|gb|ACQH01000071.1| GENE 2 1088 - 1834 822 248 aa, chain + ## HITS:1 COG:MA1319 KEGG:ns NR:ns ## COG: MA1319 COG0500 # Protein_GI_number: 20090181 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Methanosarcina acetivorans str.C2A # 44 184 7 134 199 67 31.0 3e-11 MAADISTWLKPEVKAEGLKQEGGIFVYMDNDFVSGDNHKYMKMYDWMSYVYDFGEKWLGR LKYGGAIDQLRSQLMSRLEWRNNISVLYVSIGTGADLHHLPQGVDLKTMELVGADISLGM LKRCKKKWQKKTNLTLVQCPAEQLPFADNAFDLVFHNGAINFFNDKARAMNEMLRVAKPG SKILVADETADFVESQYKKSVFSKSYFDGKTVDLHEMEQCVPPQAEDKQTELFWNNRFYG ITFRKSRG >gi|283510548|gb|ACQH01000071.1| GENE 3 1898 - 2158 84 86 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVYLGYLAVLFFWLSSFSSFLRFSSYPSYSSYPRFSSSSSSPGYSSYPSYSSYPSYLTTL GLFVHRWCAAIFSPPSPFHLFTFSPL >gi|283510548|gb|ACQH01000071.1| GENE 4 2381 - 2596 57 71 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MILVVIKGFVYSIAVNIYAFHLAFSSILHCVLHHFTLRFVPFYLAFSTKTHCVLHQNALY LAAYCTTFSSK >gi|283510548|gb|ACQH01000071.1| GENE 5 3476 - 6310 3558 944 aa, chain - ## HITS:1 COG:PM0804 KEGG:ns NR:ns ## COG: PM0804 COG0612 # Protein_GI_number: 15602669 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Pasteurella multocida # 14 853 13 847 923 194 23.0 6e-49 MNTTLRNLTRRMRTAMLCLATIGLAQQAYAQSEVLRTGKLPNGLTYYIYNDGSTPGEAQF YLYQNVGAVNEADNQTGLAHALEHLAFNATDNFPGGVMAFLKANGLTDFEAFTGVDDTRY AVHNVPTANAQLMGKMYLLLKDWCHGIKIQPADVEKERGIIMEEWRRREGIDRRITDSTA RVMYNYSKYAYRNVIGNEARLRSFTPKDVRTFYDTWYRPQLQFVAIIGDVNLDQAERTLK ATLSSLPKKATPYDAELRKIGDNAQPLYMQFVDKENKSPSFGLYQRVRLPNNPNSEEGTR NFLFTRIFNTLAPRRFARLKNADAETFIAASVSLSPLVRGFAQVAWDVVPYANNAHQAMQ QLLDVRGAIASEGFSSKEFEAEKSEMYQGMKDALEAKGLGTPDNVFNLMKQNFLYGTPIS DFREQIQRNIEVLVEQEVEDFNAWVAKLLDHNNLAFITYERKPNELDLSEGTFLAALKSS 
ETPSIGVAPDNTLTKLDLSHIVAGKIVSEKAISKLAAKEWKLSNGARVIYKYLPQAKGML FFSATAPGGRAAVTPQQLPSYMGMRNQLMQTGVGGYNRNQLASWLQGKDIDLTMSPGDYS DDLSGSAPVAQADNFFGYLNLILSRHDFSQSVFSKYVQRSKYLYANRSLEGMDAAQDSIR RLLFPPSEANPEQDEAFFDRMQFNELPSQFFAHFGNAARYTYFIVGDMPEVQAKNLVTRY LASLKGDPKQTLPAPTAMNFASKEPLIKRTFNVEMQGDLAEIELSFANNLRLTDKERAAF QVMRGILETKYFDELREKQHLTYTVGVKADYVSEPETAETLSIHLSTARASVSTALAQVK ALLDDVRLGKFSADEFKAAVVPLAVDEQTPASPEMATNPMLWMGTLGIYAQTGETITPEE SAAVDPIFSTLTPADVSAVAAKILNNAVKREIVVQAIVHEGKLS >gi|283510548|gb|ACQH01000071.1| GENE 6 6317 - 7621 1760 434 aa, chain - ## HITS:1 COG:YCR083w KEGG:ns NR:ns ## COG: YCR083w COG0526 # Protein_GI_number: 6319925 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Saccharomyces cerevisiae # 40 121 41 115 127 60 42.0 7e-09 MKRLILLPLMMLALLCKAQGEQGITFFKGSFNEALAEAKKQNKPLFVDFYATWCVPCKRM AKEVFTLPEVGEYFNPRFISLQVDAEKPENKEIAKQYKVEAYPTVAFIAPDGRTIGVNVG ALGKEDLLEAAKVVAGEGVSFEQLYDEYKKNPNDLEVQQQMLMKAPNFLAAQEGMDADKW VVRIQKLYKGYIAAKKGPKLINKDDYAIIYNLEGNDDKAHKQEMVDFINQNLDAWKAAIG TPAAYYVVMANDAWAEELVKEGSDKYKQYVEKIKNEYAKAYEVAALPTLPAYERAKLYYD ALYNLYKNKDVKGYIRDMKTLLQKMGDKVAPADYGKAAQDLYNAAGKKLTADDHKQAIEW VEQALKAEEAVMERVNYLVMIGDSYRELKNYAKARECYNQGYAESLRMQNMEMPQAMVQN AIKHKLATLELLEK >gi|283510548|gb|ACQH01000071.1| GENE 7 7626 - 8114 584 162 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928919|ref|ZP_06422765.1| ## NR: gi|288928919|ref|ZP_06422765.1| hypothetical protein HMPREF0670_01659 [Prevotella sp. oral taxon 317 str. F0108] # 1 162 1 162 162 332 100.0 7e-90 MKKIYNLARALFAVAFIVMAVAACNTMPVGFLRTEGASFSPDTLNVYHNPHASTPRYNDH RPWVSYRIQGVAGTNPINYELADVKATEGGDAEKFKALAQKGLLKVDGGMIVLMQEGVAE LPTSGRYTLSLRVYNDGHSKTIDDVYTIIVGVDEPEPEQQNP >gi|283510548|gb|ACQH01000071.1| GENE 8 8118 - 8747 823 209 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928920|ref|ZP_06422766.1| ## NR: gi|288928920|ref|ZP_06422766.1| hypothetical protein HMPREF0670_01660 [Prevotella sp. 
oral taxon 317 str. F0108] # 1 209 3 211 211 422 100.0 1e-117 MKRIKFFVAACLLTLVSVSCTQYNFEDTGEAYGVHDCTMWDYFSKDPYNWKLLQEMITRA GLEDVFKGTSSYGKDITYFGATSNSIRAYLLANGMKTVDEIPVDDCKAFVLNGLLAKHKM LDDFKPGRKSSDPNVLIGTGGETFDMASGKKFWIYTFNDTYNGVPGAGPKRIYVTSLDAS KESTVASSNIQTLTGVVHSMDYDFRLSDF >gi|283510548|gb|ACQH01000071.1| GENE 9 8830 - 10380 1707 516 aa, chain - ## HITS:1 COG:no KEGG:Phep_2855 NR:ns ## KEGG: Phep_2855 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 87 464 109 436 479 84 24.0 1e-14 MKKISIATLLLACTLTSCSLDILPENAQTYDTAFNNEAGVNSATASVEYFINIYAPNNNT FTTVGVFADTLQDDMEVRRWNPRRIIGQNESWKELYDIVFASNVIVDNIHRAKGLTDDRR NHYLGQAEFGLGFCYFLLAQRFDDAVITENSYVIKEYATSSRIDVINAAIDHATKAFNML PTYDKLRKMDGTPITNRQTASKGTCAALLAHLYAWKGSIIELMKLEGDAQAAYTKSMEYS TMLINKQVGDYSLCSTPEELCQCFSAPEKTNPEAIFSFLYDKARSGQTFSPNKVAEGFVS WPVDETQTLADIASRPRFKLYKETISKLYPDAGDLRRTAFFYKFDTDHTVGGHNYAIPYK FRTAVIDPDQTAPSGKLYRSINADYVYWRLADFYLLRAECAAKLGNEGQAIADLNVIRNR AGATAYPSANDTKGLRKAIFREREREFIAENDARYADIIRNNYIKDELTGKFTTLTLNDM RGGALGLPLPTDAKTDKNGNPINKLIRQKTYWQRYE >gi|283510548|gb|ACQH01000071.1| GENE 10 10385 - 13849 4162 1154 aa, chain - ## HITS:1 COG:no KEGG:BT_4660 NR:ns ## KEGG: BT_4660 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 112 1154 28 1047 1047 372 29.0 1e-101 MAIHLKRYLLAVFLGALALATLLPTEAIAQTQGKRISVSFKDKPLASILDFVGRQGDYEV TYTDDVRNDTLTVTISFENVDALQAVQSLLARTAFAYNVDGKKINVYRMEQHKETAFTLK GLVKDKDGQPVSFATVQIKGTTQGTTTDLDGHFTMKVDRQEGNVTISSVGFDTKTVKYDA RRELSVLLPSAEKMLGEVAVIAYGKRNTREQVGAIGSVKAEDLQKVPSPSLENLLQGRIA GVDVTNLSGSPGGGGSKVTIRGFSSLNQQGANDGTPLYVVDGVPVANDPQNMGGINALAG LDPSSIESVQVLKDAASASLYGSRAGNGVILITTKKGKAGRNEIQVNVSQSMSWLPATPK QTIGKAERDIALMLAKKYRSAHYDPTTDRIVMPNDYADTWGWGDDLDGGMDYLWRNGNVM SGDRKINPIMQDSLNTFYNNRTNWWKYGFRLGRVTKADVQLAGGTENMRYMANAGVYDET GIMINSNFRRYSLLTNLDFKLSTKLDVFIRLNMAYTNQSAGSGGRVQGLTFDPKQTPSVL PGKGSTAEREAVKQLRDVDQTNSNYNLRLNGGLNYSPIKGLRLTSTASIDHYFTRVYIFR 
PSYLMYDNLSEARGSNAAMTNLQTENIATYNAPLPQDHKLELMAGITYNYDLMQTIGGWA KGGPTNQVKYVGEGWPTLLKDISGQAKPARNFDSNKEDQAMLSFLGRAAYGYKKKYLGEF SVRRDGSSVFGSNVRWGTFPSVGLGWAFSEERFMKKLWWLSFAKLRASWGRSGQKFQEPY LALGTLSETDVFNGTAGLVPAALANPNLTWEKSDQYDVGLDLQVLDYRLKFKLDYYYKYS SNLLMEKYLPGNFFYTGKVWDNVSAISNEGIEFDVQADLIRSKDLNLSVGLNVSHNRNLF RKTDNGEDLNDKVLGRPVYGIYTYQDEGIVKDESQIPYYYSQTGSRLPLYFGNPNYPLRV GGRKIKDQNKDGKIDNSDLYYAGSTLPTAYGGVNGHLDWKGFNVDVLFSYVLGRKVMNMV QNSAFTFNKSVNPLMAHFRANEYWSKDNPNATKPSLEFADPGYLGHFDGNVDSNIENISF VRLKQLVLGYTLPATWFKSKFIKSINVYLSGENLFMLTNYSGLDPEVIDPYTGKDDGSQY PLNRKLTFGFNMKF >gi|283510548|gb|ACQH01000071.1| GENE 11 15379 - 15603 286 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928923|ref|ZP_06422769.1| ## NR: gi|288928923|ref|ZP_06422769.1| hypothetical protein HMPREF0670_01663 [Prevotella sp. oral taxon 317 str. F0108] # 1 74 20 93 93 147 100.0 2e-34 MTIKNLNIEEYLLGHCDAQQMHELASWASGSGKRLRLLFALRCAFLDLKYQHHFDDANLI ATAEQRLFDTISKG >gi|283510548|gb|ACQH01000071.1| GENE 12 15994 - 20700 4468 1568 aa, chain - ## HITS:1 COG:TM0076 KEGG:ns NR:ns ## COG: TM0076 COG1472 # Protein_GI_number: 15642851 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 726 1462 4 759 778 455 35.0 1e-127 MNSKAQQLINSLLIILAACTFSTRSEAATYNGVHQLVARRFPFMVGKVVFQPLAGTHEEA FELQTPGKRLVVKATSPSAAAVGLNHYLNNYCHISISRCGNNLPTHFRLVPIQGTVRRTT PFKYRYALNYCTYNYSYAFYRWADFEWELDWMALNGVNLMLAPLGMEAVWTETLKTLGFG QKDIQRFIPGPAYTAWWLMGNLEGWGGPMSESLIALRLQQQRQMLQRMRQLGIQPVVQGF PGIVPTFLKARFPQAHIIEQGKWGSFQRPAVLLPNGDGVFEKVAEAYYQSLTKLLGADFE FLGGDLFHEGGITTGVDVGSVAAQVQRQMLRFFPRAKWVLQGWNKNPSPQLLRALNKQHT LLVNLSGEIAASWESSDEFGGTPWLWGSVNHFGGKTDMGGQLPVVVAEPHRALALTVDSV MQGIGILPEGIGTNPVVYDLALKTAWHTATPDVDSMLVQYLGYRYGEVHPDLLAAWRIML KSVYGEFAIKGEGTFESVFCARPSLRVTSVSTWGPKQMQYQPADLYRALGLFLKAAAAPT LRNSETYQYDLVDLARQSLANYARTAYANVVQAYEQKNAEQLQLATRRFEHLIMLQDSLL QTNRHFLLGNWLQQATQYAPNAADRQLCVLNAQTLISYWGPNDPATKVHDYANKEWAGML STYYLPRWQAFFRGLQATIGTGNPPAIDFFEMEKRWANTPQPINTKPQGNAVQMAQRVLR 
TIVPPYLNASLDVETRLNDLLPRMTLDEKLAQTRHIHAKHYNDNGTVNLQKLRTNHTGGL SFGCFEAFPYSSAQYLAAVSTIQRHAMDSTRLGIPLIPVLEGIHGAVQDGCTIFPQAIAQ GATFNPPLIQRMAQHVGQEMRAIGARQVLAPDLDLARELRWGRVEETFGEDRLLVAQMGL AYATGIQQAGCIPTLKHFVAHGTPQGGLNLASVKGGRRELLDVFVRPFAHVISRTLPASV MNCYSAYDNEAITSSPFFMRQLLRDSLHFRGYVYSDWGSVPMLRYFHHTAETEREAAKQA IEAGIDLEAGSDYYRTAKQLVDQGELDAALIDSAATNVLRTKLAAGLFEDARPDTLNWRK AIHTAAAVATARQLADESLVLLENPNALLPLNAQRLKRIAVVGPNANQVQFGDYSWTADN RYGITPLAGIRQYLQGKGVQVDYERGCDYYSPKTDSIPHALRLAAQADLTIAVVGTQSML LARASQPSTSGEGYDLSELTLPGAQQQLVDSLVALGKPLVVVLVTGRPLVMESFRHRVGA LVVQWYGGEQAGLSLARMLFGEFSPSGRLPVSFPKSVGQLPVYYNALPTDKGYYNKKGTP EKPGRDYVFSDPYPAYAFGHGLSYTAFAYRALQLSSRTAVETDTVEVRFVVRNTGKRAAM EVAQLYVRDLKSSVATPVKQLYAFHKAAIGPEKEQHFALRLPIAELYLHDRNMRRVVEPG QFELLVGASSADIRLRDTLTVVSVAQQKSVNNAPTTFKTNTQRTPQATPTTSKTIRIAGI VRNVQAFPLVGVKVTADGKTAQTNARGEYSLSARIGSTLRFALAGYRTETLVVKENGLFD VELTPITP >gi|283510548|gb|ACQH01000071.1| GENE 13 21052 - 21948 1123 298 aa, chain - ## HITS:1 COG:BMEII0240 KEGG:ns NR:ns ## COG: BMEII0240 COG3568 # Protein_GI_number: 17988584 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Brucella melitensis # 34 214 22 193 285 58 30.0 1e-08 MKPLFSTLVVAASLFVASLPIMGCGGKDAPGHIPDYQPKPKNEDYKYAKAEGNVRIMTYN CFYCKSNTSNKTFSDEHTADFAKVIKALNPDVVVIQELDSGTTERAKRYLLDDIRRATGL DYDLFFGSAAPYSEGKIGPGVLYKRAMQPTSIKKVALPGKETRALMVLTFPRFTLLGTHL DLDATARKTSAEIVNHELTTLATQPVFFAGDLNDSPSWKPELTAFPIITKAFDIITPQTG TSVDQPNETIDYILVDKAHKGAVKVVQTAVVKQLEINGKVTETGTISDHFPVFVDVRF >gi|283510548|gb|ACQH01000071.1| GENE 14 22329 - 24107 1882 592 aa, chain - ## HITS:1 COG:no KEGG:Dfer_2403 NR:ns ## KEGG: Dfer_2403 # Name: not_defined # Def: RagB/SusD domain protein # Organism: D.fermentans # Pathway: not_defined # 1 592 1 575 576 541 49.0 1e-152 MKKILLAQCLATAALLTSCSLDLEPLDKLADQNYYNNRKELEVFTNGFYADFPTAENLYS ETADIIIPTTLLPEVLGTRSVPQSGGGWGWKQLANINSFLKDCSRCGDEAARKEYEGVAR FFRAYFYYNKVKRFGEVPWFSDVLSSTDDRLYSPRAPRNELMSNIIADLDYAIDNLPTAK NLYRVTKWAALALKSRVCLYEGTWRKYHGVSGYESYLAAAASAAETLMDSRQYTIYAAGE 
QPYRDLFAADKAKTEEVILARNYFSGLSIVHSANLYFLGGGSAPGLNKKVVDSYLLKDGS RFTDQADYDKMQFYEEMQNRDPRLAQTIVTPGYKRIGQTTVTPVSFAAATTGYQIIKWVS TTASDGYNKSQNDIPLFRLSEVMLNFAEAKAELGTLTQADLDKSVKLIRDRVGMPNIDLA TANANPDPYLEAAETGYPNVTGANKGVILEIRRERTIELLAEGLRYDDLMRWKAGKTFEQ QFKGMVIPKLDDTKHFVVCDLNGNGVNDAEDVCIYEGDIDNIKANAEVAGIKQFLKLGVN LHVANGANGGNIIVHDIANNQRKWNEDRDYLYPIPQDQITIYGGVIKQNPGW >gi|283510548|gb|ACQH01000071.1| GENE 15 24120 - 27674 3909 1184 aa, chain - ## HITS:1 COG:no KEGG:Slin_4979 NR:ns ## KEGG: Slin_4979 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: S.linguale # Pathway: not_defined # 57 1184 41 1188 1188 882 43.0 0 MRENVYIKTWRWRIAQRYVSALMLLAPVACMASTSATWAAIGHGQLTALTAGNSMYTVKS VLSYIEKNSNYVFVYNAEVQKMMPNRVKVSLDGRPALQILDEMCAATGLAYKAQGNQISL AQAKNAQSGQPDKGQKRRIKGNVSDAKGEPIVGATISEKGGTGGTITDINGDFTLELTPG NLVTVSYVGFKPQTVKPTEGMHVTLDEQTRGLDEVVVIGYGTKKKANLIGAVSTVGAEEL KDRPVTNLGQMLQGQVPNLNISFGTGTPGEAATLNIRGKTSITNSGAPLVLIDGVEGSID RINPNDIESVSVLKDAASAAIYGARAGFGVVLITTKSNKDGQAHINYNGRFSFSAPTTKT DFMTVGYDVARLVDEFNTATTGSSYSGLNADDYKILEKRRYDVTENPARPWTVVDPQDGR YHYYGNFDWYNYIFNFSQPTWNHNLNVSGGTDKMNYMISGGMNDHDGLYALSTDKYSTRT LMAKFNAEVTPWLRVFSTASLFKSKYKQAGYDYEGGGNVGNLVFHAMPWIVPVNPDGTNV YLLPNSKDKPADGFAAMLRTGNGFTQVGKTEQTYAIGAVLKVMEGLEITGKLTYRNYAKE KQARAANIQYAIRPNEVLTASGWPFNNKLKDGRDTYENYVYDLYANFHRTFADVHNVSAV VGSNYERGHYKFVEPSGEDLTSDVLNDLALSTGAKGVKSSQNEFALMGYFARLSYDYNGR YLVEANMRYDGTSRFPRNHRWGFFPSVALGWRISEEAFFAPVRPVVSNLKLRASLGSLGN QITDNSAKFNKNTFYPYMRLISLANTTTLNYIFDNTQATYASLGDPTSGSLTWETIVTQN VGLDVGLFNNRLSLALDVYRRTTKDMLAAARALPAVYGYNAPYENNGELRTNGFELVVGW NDKFNLAGKPFYYGVNLTLADSKTKITKFKGNESKVLGQDYEGMEWGEIWGFRIKGIYQS DQEAIDRGVDQSFLGSRFTDKAGDLIFDDVDDSKKIDIGKGTLDNHGDLVRIGNSTPRYH YGITVNMAWMGFDFSMFWQGIGKQNIYPGGNNMMFWGPYARAYSSFIPKDFPAKVWSTNN RGAYFPRPVADLARELAMSKPNDRYLQDLAYCRLKNLTFGYTLPKSLTKKAYLEKVRVYF SGENIFITSKLKSDYLDPEQMTYDSNGRVYPFSKTFSFGLDVSF Prediction of potential genes in microbial genomes Time: Sat May 28 01:37:09 2011 Seq name: gi|283510547|gb|ACQH01000072.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.72, whole genome shotgun sequence Length of sequence - 77272 bp Number of predicted genes - 57, with homology - 53 Number of transcription units - 28, operones - 13 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 105 - 164 8.7 1 1 Op 1 . + CDS 280 - 438 110 ## gi|260910500|ref|ZP_05917168.1| transcriptional regulator 2 1 Op 2 . + CDS 506 - 709 89 ## + Term 787 - 829 1.6 3 2 Op 1 6/0.000 - CDS 947 - 1876 880 ## COG3712 Fe2+-dicitrate sensor, membrane component 4 2 Op 2 . - CDS 1889 - 2470 616 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 2534 - 2593 3.2 - Term 2649 - 2708 18.9 5 3 Op 1 . - CDS 2743 - 3609 959 ## COG0524 Sugar kinases, ribokinase family 6 3 Op 2 . - CDS 3662 - 4891 1448 ## COG4591 ABC-type transport system, involved in lipoprotein release, permease component 7 3 Op 3 . - CDS 4898 - 5233 489 ## COG0858 Ribosome-binding factor A 8 3 Op 4 . - CDS 5267 - 6106 897 ## COG0796 Glutamate racemase 9 3 Op 5 . - CDS 6186 - 6707 727 ## PRU_2286 hypothetical protein - Prom 6819 - 6878 2.5 - Term 6845 - 6908 8.2 10 4 Tu 1 . - CDS 7058 - 7558 746 ## PRU_2285 OmpH/HlpA family protein - Prom 7785 - 7844 3.7 11 5 Tu 1 . - CDS 7940 - 8452 743 ## PRU_2284 OmpH/HlpA family protein 12 6 Op 1 1/0.667 - CDS 8566 - 11181 3269 ## COG4775 Outer membrane protein/protective antigen OMA87 13 6 Op 2 . - CDS 11232 - 11969 893 ## COG0020 Undecaprenyl pyrophosphate synthase 14 6 Op 3 . - CDS 11969 - 12631 724 ## PRU_2281 hypothetical protein 15 6 Op 4 . - CDS 12718 - 14136 1385 ## BVU_0865 hypothetical protein - Prom 14208 - 14267 3.7 16 7 Op 1 . - CDS 15293 - 15778 406 ## BCG9842_B2215 hypothetical protein 17 7 Op 2 . - CDS 15785 - 16300 284 ## SO_A0080 hypothetical protein 18 7 Op 3 . - CDS 16332 - 16694 299 ## Pecwa_0162 hypothetical protein - Prom 16846 - 16905 4.1 - Term 17033 - 17069 -1.0 19 8 Op 1 . 
- CDS 17198 - 17932 521 ## gi|288928945|ref|ZP_06422791.1| hypothetical protein HMPREF0670_01685 20 8 Op 2 . - CDS 17935 - 23124 2082 ## COG0514 Superfamily II DNA helicase 21 8 Op 3 . - CDS 23127 - 24197 377 ## BDI_1313 hypothetical protein - Prom 24401 - 24460 5.7 22 9 Op 1 . - CDS 24891 - 25289 329 ## gi|288928948|ref|ZP_06422794.1| hypothetical protein HMPREF0670_01688 23 9 Op 2 . - CDS 25283 - 25684 252 ## gi|288928949|ref|ZP_06422795.1| hypothetical protein HMPREF0670_01689 - Prom 25763 - 25822 1.6 24 10 Tu 1 . - CDS 25874 - 28057 1979 ## COG3345 Alpha-galactosidase - Prom 28111 - 28170 5.0 25 11 Tu 1 . - CDS 28271 - 29095 636 ## gi|288928951|ref|ZP_06422797.1| hypothetical protein HMPREF0670_01691 - Prom 29294 - 29353 7.3 26 12 Op 1 . - CDS 29404 - 29625 129 ## gi|288928952|ref|ZP_06422798.1| fructan beta-(2,6)-fructosidase 27 12 Op 2 . - CDS 29718 - 30536 629 ## gi|288928953|ref|ZP_06422799.1| hypothetical protein HMPREF0670_01693 - Prom 30644 - 30703 4.2 - TRNA 31552 - 31625 51.5 # Gln TTG 0 0 28 13 Tu 1 . - CDS 33197 - 33679 210 ## gi|288928955|ref|ZP_06422801.1| hypothetical protein HMPREF0670_01695 - Term 33783 - 33839 2.4 29 14 Tu 1 . - CDS 33850 - 35517 2114 ## COG2759 Formyltetrahydrofolate synthetase - Prom 35564 - 35623 5.8 + Prom 35565 - 35624 6.0 30 15 Op 1 . + CDS 35829 - 37151 500 ## PROTEIN SUPPORTED gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 31 15 Op 2 . + CDS 37244 - 38530 701 ## Glov_0169 FRG domain protein + Term 38763 - 38809 9.1 32 16 Tu 1 . - CDS 39048 - 40205 1016 ## BT_2457 putative purple acid phosphatase 33 17 Tu 1 1/0.667 - CDS 40306 - 42579 1606 ## COG1472 Beta-glucosidase-related glycosidases - Prom 42608 - 42667 2.6 34 18 Tu 1 . - CDS 42691 - 45207 1658 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 45401 - 45460 3.0 - Term 45508 - 45546 0.7 35 19 Tu 1 . - CDS 45656 - 47656 1752 ## gi|288928962|ref|ZP_06422808.1| glycoside hydrolase, family 16 - Prom 47676 - 47735 2.7 36 20 Op 1 . 
- CDS 47813 - 49804 1801 ## gi|288928963|ref|ZP_06422809.1| hypothetical protein HMPREF0670_01703 37 20 Op 2 . - CDS 49847 - 51532 1205 ## Phep_3874 RagB/SusD domain protein 38 20 Op 3 . - CDS 51559 - 54372 2255 ## Phep_3875 TonB-dependent receptor plug 39 20 Op 4 . - CDS 54430 - 56190 1286 ## Phep_3876 RagB/SusD domain protein 40 20 Op 5 . - CDS 56203 - 59622 2623 ## Phep_3877 TonB-dependent receptor plug 41 20 Op 6 . - CDS 59647 - 60612 693 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 60792 - 60851 6.6 42 21 Tu 1 . + CDS 61409 - 61660 72 ## 43 22 Op 1 9/0.000 - CDS 62014 - 63363 1728 ## COG1538 Outer membrane protein - Prom 63529 - 63588 5.8 44 22 Op 2 27/0.000 - CDS 63631 - 66744 3467 ## COG0841 Cation/multidrug efflux pump - Prom 66770 - 66829 4.6 - Term 66840 - 66880 3.2 45 22 Op 3 . - CDS 66970 - 67986 1000 ## COG0845 Membrane-fusion protein - Prom 68168 - 68227 9.9 + Prom 68151 - 68210 9.3 46 23 Tu 1 . + CDS 68233 - 69111 925 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 69092 - 69136 -0.9 47 24 Op 1 3/0.000 - CDS 69308 - 70156 565 ## COG1266 Predicted metal-dependent membrane protease 48 24 Op 2 . - CDS 70189 - 71043 384 ## COG1266 Predicted metal-dependent membrane protease 49 24 Op 3 . - CDS 71119 - 71349 149 ## gi|288928978|ref|ZP_06422824.1| DNA-binding protein 50 24 Op 4 . - CDS 71411 - 72121 338 ## gi|288928976|ref|ZP_06422822.1| hypothetical protein HMPREF0670_01716 - Prom 72159 - 72218 4.5 - Term 72156 - 72217 8.7 51 25 Tu 1 . - CDS 72271 - 72477 292 ## gi|288928977|ref|ZP_06422823.1| hypothetical protein HMPREF0670_01717 - Prom 72672 - 72731 3.8 + Prom 72442 - 72501 6.2 52 26 Tu 1 . + CDS 72588 - 72818 57 ## + Term 72941 - 72992 -0.6 - Term 72633 - 72669 1.0 53 27 Op 1 . - CDS 72802 - 73137 238 ## ZPR_4403 putative DNA-binding protein 54 27 Op 2 . 
- CDS 73143 - 73373 220 ## ZPR_4403 putative DNA-binding protein 55 27 Op 3 13/0.000 - CDS 73379 - 75547 174 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 56 27 Op 4 . - CDS 75561 - 76880 990 ## COG0845 Membrane-fusion protein - Prom 76955 - 77014 5.0 57 28 Tu 1 . - CDS 77064 - 77270 62 ## Predicted protein(s) >gi|283510547|gb|ACQH01000072.1| GENE 1 280 - 438 110 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260910500|ref|ZP_05917168.1| ## NR: gi|260910500|ref|ZP_05917168.1| transcriptional regulator [Prevotella sp. oral taxon 472 str. F0295] # 1 52 1 52 102 78 86.0 1e-13 MNKKKKDDVQLTPIEELIKEDFGSEGTVERTAFETGVDAFILGECLKEERKK >gi|283510547|gb|ACQH01000072.1| GENE 2 506 - 709 89 67 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDMLMCSCLRYSESLQYWVNGCPCRFLRGVLCPFLSSNKSKSDPHSGRRSRHCLPSAGLW SMMLHFL >gi|283510547|gb|ACQH01000072.1| GENE 3 947 - 1876 880 309 aa, chain - ## HITS:1 COG:RSc2919 KEGG:ns NR:ns ## COG: RSc2919 COG3712 # Protein_GI_number: 17547638 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Ralstonia solanacearum # 100 255 64 216 274 62 31.0 1e-09 MNRDIHDMRFDGSEEDLLLQDFISRAEPANVVGERIMARVYEKIQTDRARRSKRWWIVAA AVMVPLLACGVWLGLNLQGNSNTIAQQAAHTPNASASVMQELAVPTGHTLTLTLPDGTRM TANSRTHIRYPKPFSAHTRDIYIIGEAYFEVAHEESRPFVVHAQGFDVRVLGTHFCVNSY AADKASVVLAEGRVEVNTTAKDKVNMLPNERLNIAQGQFASKTKVNAAAYVDRLKGVLPM QGMRLSEVTEYLENYYGMRFKVQDGLRNERLYGKLVTAAPLADVLASLCNISGAEAMVKN GVVTIARRL >gi|283510547|gb|ACQH01000072.1| GENE 4 1889 - 2470 616 193 aa, chain - ## HITS:1 COG:mlr0407 KEGG:ns NR:ns ## COG: mlr0407 COG1595 # Protein_GI_number: 13470641 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 44 188 37 172 183 65 32.0 4e-11 MEELYQQLVLISQGSQVAFNGFISRFSNTLYCHAYGILGCREMAEEVVSDAFLEAWRMRK KLTQLQNVRAWLTRIVYNKSVDYLRRERNYTKQISVEHFGMGDFAFPDMQTPADTLISAE 
ELAAINAAIDTLPPKCKYVFFLAKVEKMPYQDIANMLGIALATVNYHVGLAVQTLRQLLR RRTKPAAGALKNE >gi|283510547|gb|ACQH01000072.1| GENE 5 2743 - 3609 959 288 aa, chain - ## HITS:1 COG:TM0415 KEGG:ns NR:ns ## COG: TM0415 COG0524 # Protein_GI_number: 15643181 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Thermotoga maritima # 4 276 2 271 286 117 30.0 3e-26 MNDICCIGHITRDKIITPHHEAFISGGTAFYFSYAIARLPRKVSYQLVTKLGRYDVEEVK RMRAERIDVLAYTCDNSVYFENRYEGDQNKRTQRVLSKCEPFKISEMNGVEAKVFHLGSL LNDDFSADFVRELSTRGRISIDAQGFLREVRGQKVVAVDWEEKHKILKHTHTLKLNEHEM EVITGLTNPRSVAQQIASWGVKDVVITLGSGGSLIYADQTFYEIPAYPPRKLVDATGCGD TYSAGYLYCRAQGLSYAESGKFAAAMCTRKLEATGPYMPDKEWFEEGT >gi|283510547|gb|ACQH01000072.1| GENE 6 3662 - 4891 1448 409 aa, chain - ## HITS:1 COG:SMc01935 KEGG:ns NR:ns ## COG: SMc01935 COG4591 # Protein_GI_number: 15965040 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ABC-type transport system, involved in lipoprotein release, permease component # Organism: Sinorhizobium meliloti # 7 405 2 398 408 139 27.0 9e-33 MAVSLFIARRYLFSKKSTNAINVISAISVVGVAVATMALVIVLSVFNGFHDLVASLFTAF DPQIEVVPAKGKDAPTDDPVLTQIKQLPEVDVATECVEDNALVFYGDKQAMVTIKGVDDN FTQLTRINDLLYGDGEFALRAANLQYGVLGIRLAQQLGTGARWNGFMRVFAPNREGQFDI SNPTSGFVVDSLLSPGVVFSVRQAKYDQRYILTSVSFARNLFGMQGMLTALELKLKPNAN TDEVKKKIQTIAGSKYKVLDRYEQQEDTFRIMNVEKAIAYIFLTFILVVACFNIVGSLSM LIIDKKADVATLRNLGATDKQIVRIFLFEGRMISAVGALVGVGLGLLLCWLQQQYGLVSL GKSEGSFIVNAYPVSVHYTDVAWIFVTVIAVGWLSVWYPVRYLSRKLLA >gi|283510547|gb|ACQH01000072.1| GENE 7 4898 - 5233 489 111 aa, chain - ## HITS:1 COG:BS_rbfA KEGG:ns NR:ns ## COG: BS_rbfA COG0858 # Protein_GI_number: 16078728 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-binding factor A # Organism: Bacillus subtilis # 5 111 4 109 117 59 37.0 1e-09 MQETRQNRIARLLQKELSLIFQSQTRMMHGVMVSVTRTRVSPDLSICTAYLSVFPSEKGE ELLKNIESNTKTIRYELGTRVHNQLRIIPELRFFIDDSLDYIERIDELLKK >gi|283510547|gb|ACQH01000072.1| GENE 8 5267 - 6106 897 279 aa, chain - 
## HITS:1 COG:SPy0361 KEGG:ns NR:ns ## COG: SPy0361 COG0796 # Protein_GI_number: 15674511 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glutamate racemase # Organism: Streptococcus pyogenes M1 GAS # 10 278 5 264 264 165 37.0 9e-41 MNNFPQQAGPIGVFDSGYGGLTILHGIRQLLPQYDYLYLGDNARAPYGPRSFDVVYQFTR QAVVKLFEQGCHLIIIGCNTASAKALRSIQQRDLPSLDAKRRVLGIIRPTAESIGSLTST RHVGILATEGTIKSDSYRLEIQKLFPDIVVQGMACPLWAAIVEANEADSPGADYFVKKRI DSLLRTDPQIDAIVLGCTHYPLLMNSIVKHVPPGVRIVPQGEYVANSLQQYLQRHPEIDA LCTKHGTARYLTTENKDKFKENAQIFLHEQIDCQHVSLE >gi|283510547|gb|ACQH01000072.1| GENE 9 6186 - 6707 727 173 aa, chain - ## HITS:1 COG:no KEGG:PRU_2286 NR:ns ## KEGG: PRU_2286 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 20 173 18 171 171 142 49.0 4e-33 MLLAALTTGAQDATAQTQNAPSERAATLRFGYFSYQDVLQSAPDYATIQQNIETLRGKYQ AEMKRATDEFNLKYEAFLDGLKDFAPAIRMKRQAELQELMEKNMAFKQEAARLLQKAETE AMAPLKARLNQAVAKVGEQKGLAFILNTDNNAAPYLNPALGENVSAAISAELK >gi|283510547|gb|ACQH01000072.1| GENE 10 7058 - 7558 746 166 aa, chain - ## HITS:1 COG:no KEGG:PRU_2285 NR:ns ## KEGG: PRU_2285 # Name: not_defined # Def: OmpH/HlpA family protein # Organism: P.ruminicola # Pathway: not_defined # 20 162 23 164 167 122 55.0 7e-27 MKKLVLMLMLFAPMAMFAQKFGHVNAQEVMASMPEFLKAKGDIEASAKKFENELTAMQEE LKRKSDEYEKTKSTMNATKQQETEASLQQMYQKIQQTYQDNQQSLQKAQQEKMGPITGKL VDAIKAVGKAGGYVYIMDVSAGIPYISDTLSKDVTAEVKAQLNKMK >gi|283510547|gb|ACQH01000072.1| GENE 11 7940 - 8452 743 170 aa, chain - ## HITS:1 COG:no KEGG:PRU_2284 NR:ns ## KEGG: PRU_2284 # Name: not_defined # Def: OmpH/HlpA family protein # Organism: P.ruminicola # Pathway: not_defined # 1 170 1 171 171 222 71.0 4e-57 MKKLIITCLVAIVSLAAHAQKFALIDMEYIMKNIPAYERANEQLNQVSKKWQAEVEALSN EAATMYKNYQNEVVFLSQEQRKAKQEAIMKKEKDAGELKKKYFGPEGELYKKRESLIAPI QEAVYNAVKELSEQRGYSLVLDRASDTGIIFGSPKIDISNEVLGRLGYSN >gi|283510547|gb|ACQH01000072.1| GENE 12 8566 - 11181 3269 871 aa, chain - ## HITS:1 COG:NMA0085 KEGG:ns NR:ns ## COG: NMA0085 COG4775 # Protein_GI_number: 
15793114 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein/protective antigen OMA87 # Organism: Neisseria meningitidis Z2491 # 81 814 62 718 797 143 25.0 1e-33 MKILRKVFPVVALCLCGLKAAAQGKIVYPDISYAGTPKTLVLGGINVSGIEGYEDYMLAG ISGLSVGQEITVPGNEITDAVKRYWRHGLFSDVKISADSIVGDKIYLHVSLALRPRVSVI NYVGLKKSEREDMENKLGLLKGAQITPNMIDRAKILAKRYFDDKGFKNADIEIRQRDDVS NKNQVILDVIIDKKEKMKVRDIIIEGNAQLSSRKIKGGFLRKGAFAKTHEAGKLSSLFKA KKFTPERWAADKKKLIEKYNELGFRDATILEDTVWNADEKHVNIHIKVDEGKKYYLRNVT WVGNTVYSTDYLNALLGMKKGDVYDQKLLNKRLTEDEDAVGNQYWNHGYLFYNLQPTEMN IVGDSIDLEMRIVEGQQAHLNKVRINGNDRLYENVVRRELRTKPGDLFSKEALQRSAREL ATMGHFDAEKVSPDVRPNYEDGTVDINWNLEQKSNDQIEFSLGWGQTGLIGKIGLKLNNF SMRNLFGKNKERRGLLPIGDGETLSLGVQTNGKYYQSYNASYSTAWLGGKRPLQFNVSVF YSKQTGVNSNYYNNNSRYAYYNSLNGYGSSYYNNYENYYDPNQYIQMLGASVGWGKRLRW PDDFFVLSMDLSYTRYMLKNWRYFIMSNGNANNINLGLTLSRNSTDNPMFPRKGSEFMAS VSVTPPWSKWDGKDYRNLANDPKSPTFAKEQQEKYRWVEYHKWKFKSKTYTALSGGQKCF VLMSRVEFGLLGSYNSYKKSPFETYYMGGDGMSGYSTGYAEETIGLRGYENGSLTPYGAE GYAYTRMALELRYPFMLGNTTIYGLGFAEAGNAWTDTRKFNPLQLKRSAGLGVRIFLPMV GLMGIDWAYGFDKVYGTKGGSQFHFILGQEF >gi|283510547|gb|ACQH01000072.1| GENE 13 11232 - 11969 893 245 aa, chain - ## HITS:1 COG:VC2256 KEGG:ns NR:ns ## COG: VC2256 COG0020 # Protein_GI_number: 15642254 # Func_class: I Lipid transport and metabolism # Function: Undecaprenyl pyrophosphate synthase # Organism: Vibrio cholerae # 10 243 17 252 256 243 48.0 3e-64 MEQELDIKRIPRHIAIIMDGNGRWAKERGKERSYGHQAGVDTVRRITSECVRLGVKFLTL YTFSTENWSRPETEIAALMGLVLSSLEDEIFMKNNVRFQVVGDIQRLPQSVQDKLQETME HTAQNTAMTMVVALSYSSRWELTRAAQSIARDVKAGTLQPEDITEELMNERLETAFMPDP ELLIRTGGELRISNYMLWQIAYSELYFCDTYWPDFDEADLHQAIANYQARQRRFGKTGMQ VESES >gi|283510547|gb|ACQH01000072.1| GENE 14 11969 - 12631 724 220 aa, chain - ## HITS:1 COG:no KEGG:PRU_2281 NR:ns ## KEGG: PRU_2281 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 2 219 7 216 218 190 45.0 4e-47 MLLFAALPSGAMAQGDPLYLMEAGGTLGLVTYLGDFNAGIFRGAQPMGGLVLRRVINPYM 
DLRLAAFTGKIKGSSQNNGTYYPAYAASPYHFNHRVVDVGVAYEYNFWPYGTGRDYRGAQ PLTPYIMGGMGLTHVSGGEKSVATANLSLGVGVKYKLAERLNLGLEWAMHFATSDKLDGV ADPYTVRSVGMFKNKDGYSALSLSLTYSFMPKCRTCNKDD >gi|283510547|gb|ACQH01000072.1| GENE 15 12718 - 14136 1385 472 aa, chain - ## HITS:1 COG:no KEGG:BVU_0865 NR:ns ## KEGG: BVU_0865 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 11 472 12 465 465 158 25.0 5e-37 MKKMRWSLALLFSAGFVFASCLKDDARDITYYDDTAITAFTLGKLSLRVDSTTKDGKKDS VYYRKLDAKKYVFYIDQINKRVYNLDSLPYGVNQKKIVGTFSTRNAGVVTLKSLTSDSTA YYKNGQDSVDFTSERTFVVYSNSQKYSQKYTVDVRVHKQKPNDFKWNQLATVPAFAAMQG MRVANAGSMVLVFGSTGSGTVVYGSPIADGKTWAKLSPSFSPDANAWQNAVSFNGVAYVL SGNRVWQSTDGTTWTDFGVNAAGISHLLGASNAGLHLLKGTSTLYLAKVGTSEPVLETLA ASSGYLPQQDFNMVVWNHSASDKTEQVVLLGNRKEADYAGDAAPVVWGKVFEYGQTSSTQ RWAYYNSLEAEPRLPRMSNLQVVVHGPVLLAVGGAGMGASSSVQALKNVYVSADGGLSWR TNRIYTMPTDIAKNTTQFAMGADKDNHIWLVCAGTGKVWRVRKNSVGWATNK >gi|283510547|gb|ACQH01000072.1| GENE 16 15293 - 15778 406 161 aa, chain - ## HITS:1 COG:no KEGG:BCG9842_B2215 NR:ns ## KEGG: BCG9842_B2215 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_G9842 # Pathway: not_defined # 1 161 1 163 165 142 42.0 5e-33 MEDLINILTEIEYNESCAVSKLSHPAAKLPLHLPQDLRYFLENYSSISLFNDAHYPIKIV GAAEFKRANPIIVGEDVADDVSHNWFIIAHDNHSQYITIDLSKNKQGYCYDSFWDRHGVA GEQAVVAKSFTELLQSLYHAKGQSLYWTEQGFQSLGDAYDD >gi|283510547|gb|ACQH01000072.1| GENE 17 15785 - 16300 284 171 aa, chain - ## HITS:1 COG:no KEGG:SO_A0080 NR:ns ## KEGG: SO_A0080 # Name: not_defined # Def: hypothetical protein # Organism: S.oneidensis # Pathway: not_defined # 9 162 4 157 161 73 35.0 3e-12 MNKIVVHAGDIFCIPLFMPKDDWKLKVKLSDKDLDKDFAFGRVIETSSSVLVEIFKKIGS AQTCINEIENSGVLFAPLQIFWDGIIKKRWRIIGKTDCYDKYKHSNYDSLRMIFGVDGDF RLRNFATQEETPITRKEFKHHEFSIVWFPIDLENRILKAINQPAHTSNKKE >gi|283510547|gb|ACQH01000072.1| GENE 18 16332 - 16694 299 120 aa, chain - ## HITS:1 COG:no KEGG:Pecwa_0162 NR:ns ## KEGG: Pecwa_0162 # Name: not_defined # Def: hypothetical protein # Organism: P.wasabiae # Pathway: not_defined # 
6 115 6 118 120 82 42.0 5e-15 MDCVSVKKNVIELEERLKITFPKPYFDFLCHIKADDVFEVEGSGIGLFSYSDIEERNETY EVRAYEPNYLMIGQDGDMGYFISTNTPNDHSIYANDLGALGSLQMNKAASDIFELINKFQ >gi|283510547|gb|ACQH01000072.1| GENE 19 17198 - 17932 521 244 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928945|ref|ZP_06422791.1| ## NR: gi|288928945|ref|ZP_06422791.1| hypothetical protein HMPREF0670_01685 [Prevotella sp. oral taxon 317 str. F0108] # 1 244 1 244 244 357 100.0 2e-97 MAYDINLALERLEKNLQDINSAKEQIEDTINASSQLQNVVNGYVASVNAVLQETILLKEE IAKMRVQKVTEIKEAVTSIEASCNTIIAEFKKNSSSILTDFVSQNDRLAKSNEELSAFQT KLEQSIEITSGLKTRQDEISADVHSIHNKLQESTENISKQYGQLSQKVDADMTMLTNSMS KEFSRLSERTEELNRAITKNQRDIEELCKQISDQHADNLKNININRWMLIAGILILAALQ FIIK >gi|283510547|gb|ACQH01000072.1| GENE 20 17935 - 23124 2082 1729 aa, chain - ## HITS:1 COG:SA0676 KEGG:ns NR:ns ## COG: SA0676 COG0514 # Protein_GI_number: 15926398 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Staphylococcus aureus N315 # 704 1116 3 320 593 154 30.0 2e-36 MSDFIRLFKEKGNLSFIGMRDTAESLLPANRAERDKLHHDLDRGKGVLDDENHLNMYLHS FGKMHKAKLEAASDAIPNFAQLLTEEFEIYDWGCGQGTATICLLDFLSSHKITPKLLNIN LIEPSKAAVKRASEVVACINPNYTVRTITKDFDSLCVDDFEKSNYRKIHLFSNILDVEGF DLAKFIYLFQKSFNAANTFICIGPYYSNSRRIGEFIAATDPDEMFAIMDKDKGEWLNQWT ISLRIFDKAFERVEAIRDIRKRIEESHKQDQFFAGYVLDSVAEEYLDFEYSNEAESLYRA LSVFDVKSNIQLGVYDKFDAQFAVLANIISRGLPTKAPIFLEDTFSDLFGVSEKPVEGAN LNYDSKHSISAQQIYEALHIVDPRFNIECYNGDMLESSFEKSFVEQTLKDCKNEYLIQVL EPLRPLSTIIEIPDRNFHKEQRVDFAMNIPYGESRTGFIIEKDAKVYHSNIFQRHRDEQR DSSSTEQGWYTYHIKVLKDISFLHNWREKESANQYLQILEKNSRKELSGAWKDTLQIVLS PLAIARLERVLIETLMAGCLKKNATEWNVAVVERDVPCAAIAVEVLQEKIKNIYSLSGKS NPLPKINLSVVSTAEFADSPLHLEQKVNLEIPHRHFDVCVDISMLLRDNIDALPLNIKAD AVYIVRSSHYQKRHRTICSAESIQYPPLVVKDTTGKYVNIPEREKVLSYFLHDIFRKPSF RPGQLPIISHSLGDRTTIGLLPTGGGKSLTYQISCILQPGVSIVVDPLVSLMLDQVRGLR SIRIDVCECVNSSMSSMEKADKLNMLQRGAVLFMLLSPERFMMPNFRESLVSMTKKNHVY FSYGIIDEVHCVSEWGHDFRTSYLHLGRNMINFMHTKSKRPLSIIGLTATASFDVLADVE RELTLGGNLTIDSEAVVRPENDTRPELAYRVIEVKADFDSLKESEQPYLLKAGVDEWDMK 
DVIAKAKIEEMSLLLDSIPSQLEEKNIHNSECALENFEKECFYTSDEENKYPYAGIIFCP HAKGTFGVNNNDWGTREGISAVLRSKKDYLQIGTFVGGDQPSGHMSSFNNNDINVMIATK AFGMGIDKPNIRFTININHPSSIESYVQEAGRGGRDKKNAISYILYEPTEYIHLTIDKIN DIRFYMGKDDPRWLEKYNDRYILSNDFTDFCKAEGATDEQAQTITEIIHRNGYLENVDKG IDLWFHNNSFRGLLKEKAILNEMTDRLLNIKPTYVVEIQGKLREIMGNDDICLEVNQLKN AITIFSKEENAKQYGFIFLDTLRPSYNYIGFEDEICSKITNALINILKTYSDHSAKDLLT PLDGGNNLTEGIYRAMSQADKDGYVYVTVSWENQIKQSPDDFKAMLKAEITNIAQKQNWN NIDENRYGPLQLNKIGNFNDLLTQIANCSNDSRWLRYHADEATYQKLKILFCRKRDKDDT DKAIYRLCCIGLVEDVTIDYLSETYELKIRNRSDQEFQQCMFDFFKKYYSSEQAEKKVAE IDNQKGRNYLDKCLGYLTAFVYTNLEKKRYRAIDDMRIACEDSITERYKNGNDDWLKEFI HLYFNSKYARKGYKFGDRHYSLMEDTDEQGLDGFEVVEKYIRAMTEDGSGSEVDNVKHLY GATLLCLRAHPENAALQLLLTYCITFLGVGNNETLKATAYNSYIEGFMSLYGTMGYGMWN VLEVFNEKLCAKVHDAYIQEEILNKGKDAIMLFIHEENFNKITKKYLNK >gi|283510547|gb|ACQH01000072.1| GENE 21 23127 - 24197 377 356 aa, chain - ## HITS:1 COG:no KEGG:BDI_1313 NR:ns ## KEGG: BDI_1313 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 13 340 7 331 345 297 46.0 4e-79 MATWEYTDIIIDGKTVKAQAPIIISASRSTDIPAFYADWFFDRLEKGFSAWTNPFNGVKS YVSYINTRFIVFWSKNPRPLLPYLSVLKCKNIKCYVQYTLNDYEGNQLEKVPPLSKRIET FKLLVKELGLGSVIWRFDPIILTDDISIGNLLDKVKNIGDALKGYTEKLVFSYADIATYK KVKYNLEKSGVPYHEWTEEQMLEFAAKLSQMNIERGWNYQLATCSEKIDIERYGILHNRC IDADLITRLAWDDKKLMNFMKVKVKPMPSPSLFDDAGVLLPEGATCLPNNQYFISAHKKD PGQRALCGCMAAKDIGEYNTCPHLCEYCYANTTQKLAIENWKRHQQNRNADTITGK >gi|283510547|gb|ACQH01000072.1| GENE 22 24891 - 25289 329 132 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928948|ref|ZP_06422794.1| ## NR: gi|288928948|ref|ZP_06422794.1| hypothetical protein HMPREF0670_01688 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 132 1 132 132 253 100.0 4e-66 MLAPKYKQDAITFQMRTLHINKENVFCDFEKLSKTWETSSNIAIRLDVEQADAEPIVKEL LGKLPNDLAYCIMSEIAEFKHLDDELMRLIYNTGDTGCKVAICLRDDLPQDLKKRCEQSN DINVQQHRDNKR >gi|283510547|gb|ACQH01000072.1| GENE 23 25283 - 25684 252 133 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928949|ref|ZP_06422795.1| ## NR: gi|288928949|ref|ZP_06422795.1| hypothetical protein HMPREF0670_01689 [Prevotella sp. oral taxon 317 str. F0108] # 1 133 55 187 187 236 100.0 3e-61 MLFNCTQELFNQIETRVSSDDIKLVEQIIPDTDEDGRNEAVLAQNAAIAQAYYLDFIKEN DAKYIDYCAIKLVETADIIALSNKRMSNSDAFISQELAVQKQLLRVIHHMDSGFDLIELN KFKQKIMQLKVEC >gi|283510547|gb|ACQH01000072.1| GENE 24 25874 - 28057 1979 727 aa, chain - ## HITS:1 COG:BH2223 KEGG:ns NR:ns ## COG: BH2223 COG3345 # Protein_GI_number: 15614786 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidase # Organism: Bacillus halodurans # 27 679 14 695 748 317 30.0 4e-86 MKLLKLLVLAILMLVGVAAYAKNIISIHTNNVQLVLEVKPNGRLYQAYLGEKLSDATPLD QLFMPRANQTSPMWAGSEAYPVMGTEDYFEPALQLTHNNGNPTSVLKYVSHTQTARNGAT ETVVKLRDEKYPLNVNLHYLAYAKEDVICQWAEISHGENKPISLGRYASAMLYFEEPTYY LTEFSGDWAREMQMSQTQLNFGRKTIDTRLGTRAAMLASPFFEVGLGSPVSENHGKVLMG TIGWTGNFRFTFEVDHNNQLRILAGINPDASTYTLDRGKTFRTPEFIFTLSTQGTGQGSR NFQRWALNHRLYMAQEDRMTLLNNWENTYFDFDEQKLDNLFGQTKSLGVDLFLLDDGWFG NRYPRNDDKAGLGDWQANRAKLPGGIPALVKGANEQGVDFGIWIEPEMVNPESELARNKP HWILRLPDRETYLFRHQLVLDLTNPEVQDFVFGVIDGLMKENPNIKFMKWDCNSPITNIF SPNLKARQGNLYVDYVRGLYAVLNRVQAKYPKLYMMLCSGGGARCDFEALKYFTEFWCSD NTDPVERIYIQWAFSHFMPAKAMCSHVTDWNKTASVKFRLDVDFSCKLGFDLDLKHLSPN DLAYCQAAIKEYNRLKPIIFAPNLYRLVSPYETSHCALMRVNDEKTHALLFAYDLHPRYN ETIISTRLQGLDPAATYRVREICLMPGQESHLAFHNKCFTGDYLHKVGLKVLGGNQLDSR IIELVKE >gi|283510547|gb|ACQH01000072.1| GENE 25 28271 - 29095 636 274 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928951|ref|ZP_06422797.1| ## NR: gi|288928951|ref|ZP_06422797.1| hypothetical protein HMPREF0670_01691 [Prevotella sp. oral taxon 317 str. 
F0108] # 3 274 1 272 272 515 100.0 1e-145 MIMKRILVIFTALIAVHYLTSCSSNPSNRAANEQEETFNASVMLKPTTVKIDGSLSAVLE VVEGEYRLNYTQKLLRYATIAVKIRSNGKGNPNDETFKDYTNGPLSLDVCDKQGQSIAKF SSIGNSYKDDAKLKEMMTKSGEYWVSFDMIVEDNLPKDAATFKIATVNASDLKEAYADVY VLCNTVSAANVAKWDKLLDDFEDSYIQLEALNKKLARKQDAETKLAFSKLDKKVDDLCDS INRACDEKAFAPMQAVRIGRLYSEITKREGTPHQ >gi|283510547|gb|ACQH01000072.1| GENE 26 29404 - 29625 129 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928952|ref|ZP_06422798.1| ## NR: gi|288928952|ref|ZP_06422798.1| fructan beta-(2,6)-fructosidase [Prevotella sp. oral taxon 317 str. F0108] # 1 73 16 88 88 154 100.0 1e-36 MGDYVGRFANCYKRPTAPDGTKITPTPLDYNWLNNDLALALVVQQGCQLYIQEHQHNCSN LIRTLRWGCSTAL >gi|283510547|gb|ACQH01000072.1| GENE 27 29718 - 30536 629 272 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928953|ref|ZP_06422799.1| ## NR: gi|288928953|ref|ZP_06422799.1| hypothetical protein HMPREF0670_01693 [Prevotella sp. oral taxon 317 str. F0108] # 1 272 1 272 272 537 100.0 1e-151 MKRYFVVLMALVAVLLVSSCAKNEGKCTCGRGEKGSDTVVVLSPKRLDVKSKDDVSLRVV DADYCLHFTKKLLNYIVMDVKFKCKGNDDVSSSALTDFVNGPLYLYFCNQYGKILADVVP LASSYKDDAKLKEALKKGGEVCISFVGMIPSDLAKNIDGFTIGYDVTYANQDLKDKFRLN CNTSSMEQVEKWDKLIDELETTYLKELAMNKQTDELNPTQLDSERSKLNGKKERMWAQID SAVKAHALAPMQMVRLGGIHMKAKSIEEGGFH >gi|283510547|gb|ACQH01000072.1| GENE 28 33197 - 33679 210 160 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928955|ref|ZP_06422801.1| ## NR: gi|288928955|ref|ZP_06422801.1| hypothetical protein HMPREF0670_01695 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 160 1 160 160 319 100.0 3e-86 MNLFTNNRLLYEGRDSGLSEEEIRQSLGVGGNKINSFIQFYLMFDGVLFPKQAMMFRHAF YSIEKGDWDKIEIGFFLKLDDIVKVRNLQLQDDAQLDYFVQTHIPFADDGCGNDIWIEVS TGVIKAFYHEYSLEEGLIMVAPSFDVFCSSLENWVFNNNT >gi|283510547|gb|ACQH01000072.1| GENE 29 33850 - 35517 2114 555 aa, chain - ## HITS:1 COG:SPy2085 KEGG:ns NR:ns ## COG: SPy2085 COG2759 # Protein_GI_number: 15675843 # Func_class: F Nucleotide transport and metabolism # Function: Formyltetrahydrofolate synthetase # Organism: Streptococcus pyogenes M1 GAS # 3 554 4 556 557 563 53.0 1e-160 MKTDFEIAREATLLPINDIAAKAGISAVKLEPYGKHIAKVPYTLIDEERVKKCNLILVTS ITPTKSGNGKTTVSVGLALGMNRIGKNAVVALREPSLGPCFGMKGGAAGGGYAQVLPMEK INLHFTGDFHAITSANNMIAALLDNYIYQHQDDGFGMKEVLWRRVLDVNDRNLRTIVTGL GGKTDGIVTEGGFDITPASEIMAIFCLATSEEDLRRRIDNVLLGVTLQGKPFTVKDLGVG GAIVAILHDAIHPNMVQTIENTPAFIHGGPFANIAHGCNSVLATKMAMTFGDYAITEAGF GADLGAEKFFDIKCRKAGISPKLTVLVATAQALKMHGGVNEKEIKEPNVEGVRRGLANMD KHIANMQAFGQTVVVCLNRFATDTDEELDVVRRHCEELGVGFALNTAFGEGGKGAEDIAR LVVKTIEKNPSEPLRMLYDDADDIETKVRKIAQNIYGANKVVLKPAAKKKLARIKELGLE HFPVCVAKTQYSFSEDAKAYGLPTNFDITIRDFVINTGSEMIVAVAGDIMRMPGLPKSPQ AERIDVVNGVIEGLS >gi|283510547|gb|ACQH01000072.1| GENE 30 35829 - 37151 500 440 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 [Bacillus subtilis subsp. subtilis str. 
168] # 1 426 1 423 451 197 29 2e-49 MKKLYIETYGCQMNVADSEVVAAVMQMAGYEMCQDEADADAIFMNTCSVRENAENKIYNR LDTLHAEQKKGRKVILGVLGCMAERVKDDLIENHHAQLVAGPDSYLNLPDMIAQAEAGNK AIDIALSKTETYRDVVPKRVALAKISGFVSIMRGCDNFCHYCIVPYTRGRERSRDVDSIL NEVRNLQQQGYKEVTLLGQNVNSYQFVNEEGQTIDFPQLLRLVAEAVPTMRIRFSTPHPK DMSDATLRVIAEVPNVCRHIHLPIQSGSDKVLKLMNRKYTVEWYLSRVKAIRELVPDCGL STDIFVGYHGETEADHEESLRIMREVGYDSAFMFKYSERPGTYASKHLPDDVPEDVKIRR LNELIMVQNENSARANHAEVGNVREVLVEGPSKRSREQLCGRTEQNKMVVFDKGNHHIGE YVKVRITGSTSATLLGEAVG >gi|283510547|gb|ACQH01000072.1| GENE 31 37244 - 38530 701 428 aa, chain + ## HITS:1 COG:no KEGG:Glov_0169 NR:ns ## KEGG: Glov_0169 # Name: not_defined # Def: FRG domain protein # Organism: G.lovleyi # Pathway: not_defined # 69 297 26 249 338 99 28.0 3e-19 MVIGNEYVFENILQASAYILEHEKEALANDKTRQKARISYPFPDFTKPFEEPNIRIIPPD SLEYPYAAYFNDADDSKFIMNKLMSGRYSLKPNLRNRKFLFRGETEFHNPCKPSLFRNTK KKYFLDYMIHGDEMFYLILSHPLVQLLDLGVVLNGELVRFEMNLYGLIQHYYNKSGLLDL TSDMNVALFFATQKYDWETDSYSPLTDEIHEPGVLYYYDIDFYRDFKTLHNQEFLSTVGL QVFPRSGRQRGFVYQCPMDINFNDLSQVKAFRFKHNAEIAQSIYESMNGGEKLFPHDVLQ AHWKDVARDENKVSLKAIHFNLTRNEKENLDSIRAKLENDYHINVEDYEPVLTKEELHEY YKAVKDQNLWEDFCNQIYIPGDTNGKMMADLLDVPNKPEYEWAFKEGIEHEIDYDKGFLL KAYKHILQ >gi|283510547|gb|ACQH01000072.1| GENE 32 39048 - 40205 1016 385 aa, chain - ## HITS:1 COG:no KEGG:BT_2457 NR:ns ## KEGG: BT_2457 # Name: not_defined # Def: putative purple acid phosphatase # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 375 19 380 389 266 37.0 1e-69 MRYLLFSLMLLFAVGAKAIKVTHGPWICDMDSTGTTIVWVTDVPGMSWVEIAPDSTDHFY GRARQRYYDVLAGRKVLTDSVHRVRIEGLKPDTKYRYRVFTQEVAEWRYDDWVTLGKTAC TDVWRGKPHEFKTFPAKPREVTFLVLNDIHERAQFMKELCKNVDFKKLDFVLLNGDMSNR LRNQQHMMEAYLDTCVRMFATHTPLFFNRGNHELRGQFADYLYRYFPTNNGKYYRVQHVA GIDFLFIDTGEDKPDEDIEYSGIVNYDQYREEEARWLHGLRESKQVGKHPLIVFSHIPPT LQKWHGPYHLQKTLMPELNKMNVSVMLSAHLHAFGYQEPNEVINFPNLVNSNNTYLLCRI ANGKMEVDYVGLKGKDKKHFTFPLK >gi|283510547|gb|ACQH01000072.1| GENE 33 40306 - 42579 1606 757 aa, chain - ## HITS:1 COG:PA1726 KEGG:ns NR:ns ## COG: PA1726 COG1472 # 
Protein_GI_number: 15596923 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Pseudomonas aeruginosa # 38 749 31 762 764 585 43.0 1e-166 MKRSVLAVTCIVGSLLVGNAQHAHAKQPRKLVKKELSAADKETQFVTSLMQKMTLAEKIG QISQYVGGSLLTGPQSGALSDSLFARGMVGSILNVGGVDKLRPLQEKNMQLSRLKIPILF AFDVVHGYKTIFPTPLAESCSWDTNLMFETAKAAAVEAAASGIHWTFAPMVDIARDPRWG RIVEGAGEDTYLASQIAAARVRGFQWNLGKTNAVYACAKHFVAYGAPQAGRDYAPVDLSL STLAEVYLPPFKACVDAGVRTFMSAFNSVNGIPATGNRWLMTELLRNRWNFQGFVVSDWN AVQELKAHGVAETDKDAALMAFRAGVDMDMTDGLYNRCLEEAVREGQLDVHAIDAAVERI LRAKYVLGLFDDPYRFLDLKRERREVRSESVTALARKAATASMVLLKNANATLPLSKQTK RIALVGPLANNRSEVMGSWKARGEEKDVVTVMDGIKNKLGKDVVLNYVQGCDFLDLSTHE FSAAFEAAKHSDVVIAVVGEKALMSGESRSRAVLRLPGKQQALLDTLRKAGKPLVVVLMN GRPLCLEKVDKQSDALLEAWFPGTQCGNAVADILFGDAVPSAKLTTSFPLTEGQIPNYYN YKRSGRPGDMPHSSTVRHIDVPNKNLYPFGYGLSYTTFSYGEMQCPQQFAADGSLQVSVE VTNTGHFDGEEIVQLYVADKVASMVRPVKELKGFQKVFIPKGQTKRVDFVLHAHDLGFWD NTMQYVVEPGTFEIMVGKNSEDLQKKDATWKGDSRTP >gi|283510547|gb|ACQH01000072.1| GENE 34 42691 - 45207 1658 838 aa, chain - ## HITS:1 COG:SP0648_2 KEGG:ns NR:ns ## COG: SP0648_2 COG3250 # Protein_GI_number: 15900551 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pneumoniae TIGR4 # 35 820 59 871 871 478 34.0 1e-134 MRTFTNPFLKPPYIIAVSLLWFCTLVSHAAERHLWDEGWRFALHPDDAPLQPDFKDDAWR LLDLPHDWAIEGDFYAQNPSGANGGALPGGIGWYRKHLALHDNDAASRYVLHFDGAYMNT SVYVNGTLLGTRPYGFIGFSYDITPLLNKQGDNVVAVKVDNSQQPNSRWYTGCGIYRHVY LSKSADVRVAEWGVQAVSEVKKGLGSVTLNTQIENFSGRSRRVIVRQKLWNKARQLVTQT SKVCDIAVSGTTVKQQMRVPKPQLWSPASPNLYTITTEIIENGRVLDGDTIRMGMRTVAF DAKQGFFLNGKNMKINGVCLHGDLGCLGAAINEDALYRQVSLMKAMGANAVRCSHNPPAP ELLNLCDSLGMLVVDEAFDSWLHGKTPFDYSLHFKTWFERDLRDMVLRDRNHPSIILWSI GNEVLEQWNKVDNAGMALEDVNILLNNSRDKAALAQGDTLNVKVKLTQALAAIVRRYDST RLITAGCNEVSPDNNLFKSGALDVIGFNYHQKKVKDVPQNFPGKPFLMTETVSALQTRGY YKMPSDSVYRLPGRKRPFTDPTFLCSAYDNCCAHWGSTHEATLDVVKHTPYCAGQFIWTG FDYIGEPTPFNFPARSSYFGLVDLAGFPKDAYYLYQSEWTKGPVLHLFPHWNWIEGQAID 
LWCYYNQADEVELFVNGKSQGVRQKRNEHEYHVAWRVTFEPGEVRVVARKNEKVVAEKSL KTAGAPHHIRFTPNRKEIKANGRSLSFVMVEVVDKDGNLCPWAENNILFSLAGDAKIAGV DNGSPFSLERFQAHERRAFFGKCLVVVQAGKSPSAVRLTAKAVDLAPQTIEINTVDAR >gi|283510547|gb|ACQH01000072.1| GENE 35 45656 - 47656 1752 666 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928962|ref|ZP_06422808.1| ## NR: gi|288928962|ref|ZP_06422808.1| glycoside hydrolase, family 16 [Prevotella sp. oral taxon 317 str. F0108] # 1 666 1 666 666 1345 100.0 0 MKTIIATICALSMGYISISAQPNVPKTHALPGWTSLKGDEFNGSSVDKKLWGLYGDATKN YANDYYGNNTGQGMAQVYRDKMVTVKNGILTVRATRDAIQTGIRRPSPPDPDVDYGPIMR YPLKPNHNYTTIGWWSGALSSRDAEGGGRYYPLYSRIEVKAKIPYTIGVWMALWLRHRYG GAGTFEIDLEEFFVNDDDKDITHNWEGHTYHFKGKRTIHQSVHGLDRAKAYKDPNTGETK YKTTYNHNDFADRIREIDFNPAEDFHVYGAQMDPLPGDSSVHLAVSFLLDGRVRSVFTTK TDKVGGSSEYRFNELLKDKYIKGNIDHIWDVAITGQVGGKPDGQGGGVLYPEMDPKYGGN LNKVPRNYEMDIDWLRVFKRTNKALWLGSMPKGSDWKTKNVALEMSATQFKGLQVGDQLV MDIDTLSNSEYRKVEKLKLDIYDYNGNAITTLKPELTQRDAQVTFIVTDDLCKKLKAKGG VMKGENIRLFSVCRSQRDSAKWVGFKQMANGEVLIPATMFADIEKGQQIEVMVRDVAAGG TLCLQQFKKPDAGKNHRPDLSADKAAGNVLPLKAGHDENTYTFTLNAQAVKELKQHGLAV TGKGCYLRSVRIVGEKKNTAVNKVETQSQTNAAVYTIDGRKVAERWQEGTFPRGIYIVGG RKVIVK >gi|283510547|gb|ACQH01000072.1| GENE 36 47813 - 49804 1801 663 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928963|ref|ZP_06422809.1| ## NR: gi|288928963|ref|ZP_06422809.1| hypothetical protein HMPREF0670_01703 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 663 10 672 672 1303 100.0 0 MLSLLGAMACTENNIAYDEVVDVPEGVYLSGPSSEFSLPIEAGRLTASSRANMLSIRAWL KKDGHFKISYVGTDGQPVAYGKGSEMTLANNYVNTFSLSADAEGFSVPTEGLYQVVVNKQ RKELTLIPMQFTMQADGALTSDGNTAVALSNVAYDKLTHIVTWSNADSTQQLLPNKYVFN MNNGVPLYLRDSDTENDTLSTVFTGMALAERTNQLTEVYQELTNQSAVKLKFPLKGQYNV QVQYNVLTGKFAARMGGKGVEVTGLANELYMGGTNFGAWNTNDVVKMAPVGAVGNGEFWR LCHLQAGKTVEWATAKDGSQAFSSLTTMQGVTLDGNGKATVNTTGLYLVYVNLDRKLIAF ETPKVYAAGEALGNKELPLTSNGSRLQVETTETGNLNLFAFSDQNARDWSTMQFTIRNGK VVYPGVAVHEMPALPVAKGTTVTLNMADNTADFGGTTPESLIPEGVKQLYMTGDDFGKWD WNAPEIALFENGYAGAARWFYITYLRGGTALEFSTEKAFGKGNFAKLKTATDYNVVNDRT VVPSDGIYLIYVGLDSREIAIQPLELSGDCGGSAVQFTTNADGRTMSATLASGGRMRIYP NIPAFKKVKKFGSWKREVYVDPASKAFLYRKNGEGEPNKDYVWKAGTKITIDFVSKKATI QEP >gi|283510547|gb|ACQH01000072.1| GENE 37 49847 - 51532 1205 561 aa, chain - ## HITS:1 COG:no KEGG:Phep_3874 NR:ns ## KEGG: Phep_3874 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 10 561 6 568 568 361 39.0 4e-98 MNAKFFSYIYISAAISIAFTLSSCNGFLDREEDSFIDKTATFDSYNRTKQYLTYAYTLLP DGLNRFSREAMLASATDDAEFAIESAEIQQFNNGSWNALNNPDDVWNRYYSGISKCCTLL ENTNHVNLDISRLDPDKQVEYANSLKDIRMWRAEARFLRAYFHFELLKRYGPIPIVTSTL SINGNYENTLRPTMKEVVDFIAKECDIAADTLELTPWRNVNDAFGRATKGAALALKSRLW LYAASPLYVDFGDTNEANKPTDVALWKSAADAAKAVIDLNQYELASSYADLFKNDFQNKE YIFVRRYAANSDFEKSNFPVSFGGKGGTNPSQNLVDDYEMLDGTPFDWNDPAKAAQPFEN RDARLGATILMNMAPFKGKKVATYPEGADASPNPNATKTGYYLRKFLNEDVNIQTGGSSS GHVVPLFRLAEIYLNYAEALNECDPTNPDIALYLNKVRNRASLPNVSALSQEQMRAVIQH ERRVELAFEEHRSWDVRRWKIASSTLGAPLMGVQIERKPLGGYTYMPVKVEQRVFQPKMY WYPIPQSEVLKLKQWKQNNGW >gi|283510547|gb|ACQH01000072.1| GENE 38 51559 - 54372 2255 937 aa, chain - ## HITS:1 COG:no KEGG:Phep_3875 NR:ns ## KEGG: Phep_3875 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.heparinus # Pathway: not_defined # 33 937 43 937 937 525 36.0 1e-147 MKYRDYINIGCALMLFCSPVNAQNAADTVATQIAYGAQPSWAQTAAISTVGGERLMEITS PTVGNALKGLLPGLSMLQKSGEPGYDFYMENMFTRGVSSFNGKQQMLVFVDGFETPLDNI 
STEEIESVSLLKDAAALAIYGSRGANGVLLITTKKGFHSAPKIGFRMQTGIQMPTIMDSP MDAYHYASLYNQALANDGKAPLYTDEALSAYQSGSNSYLYPNVDWKKQVYKSTAPLTMGE LSFRGGGGGLNYYVMAGLLQNDGLYRGTDSKGRESANVDYSRYNFRANLDVDITRYFRAS LYSGASLAEKTTPGGGGADDYLKSIWTTPPNAFPVYNPNGSIGGTSLYTNPVANLLNRGL YKENSRSLQVIFGLQYDFSALVRGLSAKVTFGYHNFMAETSPKTRDYARYSLTQTGVDAQ NNPIYSYMQYGSDAALTATEGFRTSQTRVGLQAQIDYQRRFGKHGIHAMLLMLNDRYQLY NVRDDERYVNFAGRLTYDFDKRYVAELVASYMGTDNYARGQRFGVFPALSAAWIVSNEPF MKAASWVDFLKLRSSYGLTGNNQTSARFIYDETYGGNGSYLFGVGSSQSSGFTERTLANP SVTWEKKRIFNVGVDATLWRNLSINLDVFHETQSEILTYPYAQVLGIVGASYGGILPLMN VGKVTNHGFEFKARYQGKVSDKISYFAEAGTWYAMSRVDEQGEDVKAESYLYRKGNPVWK PIVLEADGFYTANDFDGDGKLKAALPQPQFGHVAPGDIKYKDYNGDQIIDENDAHPIGNA TVPAWNYMLSGGMKYKGFDISVLFQGVAQRDIYLRGASIYSFQNNGTASNLALDSWTPTH TDATYPRLSTVDFSNNYRTSTFWRRNGSFLRLRNVQVGYELAERVSRALRLSSVYFYINA TNLFTLDYLGKLGDAEQDNLLSYPLMRTVSVGLKLAF >gi|283510547|gb|ACQH01000072.1| GENE 39 54430 - 56190 1286 586 aa, chain - ## HITS:1 COG:no KEGG:Phep_3876 NR:ns ## KEGG: Phep_3876 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 1 586 1 587 587 291 35.0 5e-77 MKKKSLYIAWIFAPALLVACSDFLETPPSVDYGENEVFATRADAEKLLTTLYAEGMPYGF CMSSSNTDRRLLSSSTLASACDEGEDVATWAMGNAAWNAGNHTNSNFTWDEDCRFYLRNH TTRVANLILKRIHEVPFDEGDPDFNKRATGEAYFMRAMLLWEGVYRYGGMPIVREVIDPT DFSQRSRSSFADCIDSILVDCDRATRYLPDYYTDNTKLGRVTRIAALALKARVLLYAASP LFNTNTPYMAFPGHEELIGYGNYDKERWKKAADAAKATIDAALNSGHYRLYNTGNPDKDY EDVWTIPDNEEIILGNMKYRNFKTTSRPLVANLPTWAMNKAWGDAGLYVTFNFVRHYEKR ADGEPADWNMNGGDNLIDIYNSLDPRFKQTVAYHSSSWNDEIPFINFLEGGAEKSKMDRT CHLLHKWVPRILHVNGSNTANVQWPVFRLAELYLNYAEALNEYESAPPQAAYDAVNLIRS RAGMPDFPAGMTKEAFRTKLRNERSVELAFEDHRFWDIRRWRIAGEEGVMKGKFYGLKIH STDGNKTHIHYEPYVFEHRFWNDNCYLYAIDQKEVNKGYLIQNPGW >gi|283510547|gb|ACQH01000072.1| GENE 40 56203 - 59622 2623 1139 aa, chain - ## HITS:1 COG:no KEGG:Phep_3877 NR:ns ## KEGG: Phep_3877 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.heparinus # Pathway: not_defined # 141 1139 33 1036 1036 865 45.0 0 
MAGNLKSRRKGLLRFLLMMVLLLVPTMLSAQRGNVSLNLQNEEVSQFIRQVEKQTKYTFV YRNNVLQPKTKVTCVCKDWPLEKALAHVFSSLGIQYSFNNNTIVLVKGKVENAERKGTKS TEGKNADTTDKKKLWGIVRDESGEPIIGASVLVKGTKVGTVTNAEGEFSLDVPASGMLVI SYMGFATREVPIKNNSNLKITLNEDEAQNLNEVVVVGYGTQKKASVTGAIASVTTKDLVQ TPQANISNMLVGKMPGLIAMQRSGAPGEDNSTLLIRGVSTFSDNTAPLVMIDGVERPNYN GLDPNEIESVSILKDASATAIYGVRGANGVILITTRKGQKGKPHLSYSGNVAVQSPTALP HYLNSAQYCEMYNEALKNDAYTKGTSYVPRFSDEDIRLYRDGTDPIMHPNTDWVGTFLRK VSLRTQHNFNISGGTDRVKYFISAGFFNQGGMYKYTKIDRDHDVNASDTRYNFRSNLDFN ITQDFKAVVQLSTQINDIRTPGLGNSNLWKEISWATPLGTPGMVDGKLVRLENTIDDENP WQALLNNGYKNTFANTINTTLRFEYDLSRLLLKGLSVHASTSYDSYYNSRRLSVKTMQTF VPKRDPNDPTHIILVPQNEAGTWGGGFEYDKNRKVYFETGIHFDRTFGKHQATALLLYNQ SKYYAPNLAYHVPNAYQGIVGRVTYGYANRYLAEFNMGYNGTENFAAGRRFGFFPAVSAG WVVSEEPFFPKNNWVVYMKLRATYGEVGNDKIGGDRFLYLPSVYGETSGKLTGYNFGSSA NPVYSQMIEEKRLGNPLLTWEKARKLNLGVDMNFLKNHLTVSFDYFKERRNNILSNRNSA PMLIGASLPAYNMGEMENAGFELDVNYRNNIRDFNYWGRFNYSFARNKILFQDEVPEKYA YQMRTGRRVGQFYGLLFDGFYNSWEEINALDRPVSSWNGNRLQPGDMRYVDVNKDGKIDM YDMVPIGYSPTPEIIYGFSVGCSWRGIDFSALFQGASHVSIKYFGRALWPFINAHNSAKT LILERWTQERYERGEAINFPRLSMSPSRDTDNNYQDSNFWIRNADYLRLKNVEIGYTFRK SQLKALGLESLRLYVSGTNLFTWTDVIDLDPEAPSKGGATEINTYPLQKIYNVGINVQF >gi|283510547|gb|ACQH01000072.1| GENE 41 59647 - 60612 693 321 aa, chain - ## HITS:1 COG:PA1301 KEGG:ns NR:ns ## COG: PA1301 COG3712 # Protein_GI_number: 15596498 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 86 290 93 293 327 87 30.0 3e-17 MDKRIEDLYQQYLSGQLSAEDFEELQRKVTKATDEELWNLMCEDFSMSSELAEMSDESQQ RIWQRIHDEIQKGQRRKFVHTLLRYAAVVVVVLSLLGGGYWGLRSLNSATPVYTYVNVKA GSKSVITLPDGTHVSLNGNSQLRYNVVPKVHREVVLMKGEAYFDVAKDANCPFRVHVNDM QIEVLGTTFNVRCHSGEIETALFSGAVRLSAKGLRQAYQLLPGKKSVYQPASHNIEICDN DGASDGRWKDGYLAFSSKPLGEVLHEIENWYGVTIQLRNRHLANDLLTGSFYHETLESVL HSLSMQYGFKYRVNRGYVVIE >gi|283510547|gb|ACQH01000072.1| GENE 42 61409 - 61660 72 83 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MLNFLLKTRQKRLCCIHARGQRHRNTYVYNYQRERLQGMLLTANGDNIMETQEQSAIDSP FSACREYIEWGTACAERRACGEV >gi|283510547|gb|ACQH01000072.1| GENE 43 62014 - 63363 1728 449 aa, chain - ## HITS:1 COG:aq_699 KEGG:ns NR:ns ## COG: aq_699 COG1538 # Protein_GI_number: 15606101 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Aquifex aeolicus # 33 446 27 424 437 75 23.0 2e-13 MNINKLPLLLALCLPIVANAQTTYSLDQCKALAKENNVKLKRARLEITAAKEQQKEARAK YLPTVTANGTYFHATDYLVQEEFSLSPADQQKLAAIVKGAGLNPAILAALPTSYTLQAIK HGTLASLIAMQPVFAGGRIINANKLAAVQTQVKELMLEQNADEVAQTTEVYYNQLLALYE QEKTLDAADKQLENILKDANNAYEAGVSNKNDVLSVKLKQNEMAVNRLKLNNGISLCKMV LAQYIGKQGEEIDIDRSLTTELPDPRQLSVNHASALEQRTEMRLLDKKVEANRLLTRVKR GEMMPTLAIGVAGMYHDLTNKGRTNVIGLATLSVPISNWWGNRGLKRQKIAEQIAVEEKE DSRQLLLIQMQNAYDNLETAYKQIQLAKLSMEQAAENLRLNQDFYEAGTGTMSNLLDAQT QDQKARNQYSEAVVAYLNARTAYLKATGR >gi|283510547|gb|ACQH01000072.1| GENE 44 63631 - 66744 3467 1037 aa, chain - ## HITS:1 COG:SMb20345 KEGG:ns NR:ns ## COG: SMb20345 COG0841 # Protein_GI_number: 16264079 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Sinorhizobium meliloti # 3 1027 8 1023 1049 375 26.0 1e-103 MSKRKRNIIESVMHYNSIMLLLMVVLIGFGVFALINMPKNEFPSFTIRQGVVAAVYPGAT SAEVEEQVTKPLETFLWGFKEIKKNKTRSQTKDGITYIFVELNDDVANKDEFWSKFKIKL QQFKVSLPQGVLALIANDDFGDTSAMLITLESKTKSYRELHEYMNTLKDNLRTIPELANL RVSGEQGEEIGVYIDRDKLSDYGINSATLLANLSAQGMTMVSGKVDDGQTTRPIHLRSQM NTENDVANRIVYSDPRGNVIRLRDIATIKREYPHADKYVKNNGQKCIVLSVEMNEGSNIV QFGNKVKDILGQFEASLPQDVSIYTITDQSEVVNSSVVDFLKELLIAIASVVVVIMLLLP MRVASVAASTIPITIFITLGLFYALGLELNTVTLASLIVALGMIVDNSIVIIDSYIEKID EGMSRWHAATYSAKEFFSSIFSATLAISITFFPLLLTMKGMMKDFVKWFPLGISVVLGAS LLVAVLVVPWMQFTFIRKGLKKEGKGEGKQRRTFLDIMQSYYDRLIERCFAHPFITLGVG VLSVVVGAVLFASMPQKLMPRAERNQFAVEIYLPQGTSIERTAAVADSLRNMMQTDKRIR NISTFYGMSSPRFQTSYAPQLGGTNYAQFIVNTESDAATMALVDSFTTRYTDHFAEAQVR IKQLDYSDASSPVEIRFSGDDLNAIHQAVDSATRRMRAMPELLLVRNNFDGATAGLDVQI 
DDYEASRLGMSKSLLALNLATRFGDGVPLTTVWEADYPVKVMLKDSHAGHQTPEDLENAT VSGLLPGVNVPLRQVAKVSPDWTYTSIHHRNGVRSMSVMADAKRGLNLNSVTDKVIDEVK KIELPQGVTMSVGGQREKDLETEPQIYSGLAISIIIIFAILLFHFKDIRMSLVIMLSLLF SLLGAATGVLIMGQEVGITGILGIISLMGIIVRNGIIMIDYAEELRVDHRLSAKRAALQA AQRRMRPIFLTSAAASMGVIPMVIANTPLWGPMGVVVCFGTIVSLLFIVTMIPIAYWMVF RWEDRKRTLKMEAYSRS >gi|283510547|gb|ACQH01000072.1| GENE 45 66970 - 67986 1000 338 aa, chain - ## HITS:1 COG:VC1674 KEGG:ns NR:ns ## COG: VC1674 COG0845 # Protein_GI_number: 15641678 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 1 338 11 362 369 106 26.0 5e-23 MTKIKFMAATALIVVLSGCNSKKEEQAVAPIKVKTMAVQTGNVDGSRTYVGVIQESYGST LSFATLGTVSQVLADEGQAVRKGQLLAVIDKTSAQSAHDMAMSTLSQARDAFKRLEQLYK KGSLPEIRFVDVQTKLAEAEAAERIARKALQNCELRAPFDGFISQRMVDTGNNMAPGLGC FKLVKLDRVEMKIAVPENEISGISKGQTVPFTVSALGGKTFVGKVTEKGIQANLLSHTYD VIVSLPNAGHQLLPGMVCSAQLASKGGAQGIVVPQEAVLIDGSKPYVWVVEDGTAHRRDI SQGGVCQSGVVVNSGLSQGDKVIVNGQNKVSEGANVKI >gi|283510547|gb|ACQH01000072.1| GENE 46 68233 - 69111 925 292 aa, chain + ## HITS:1 COG:STM1108 KEGG:ns NR:ns ## COG: STM1108 COG2207 # Protein_GI_number: 16764466 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Salmonella typhimurium LT2 # 184 283 190 289 298 66 33.0 6e-11 MEQRLQLPFTLRSFTSADVETETRRSLSTNYRIFALFLCTQGSVSVNLDGKAYHVNKGDS LFIPPVIHSNIASFSNDLRGIVLYVDYDFIMGIVNKTMDITTGLYFSEHPFLSLSVSQYD RILTQMQTLMAREAHENESYANSPNPQVITELLSSMCRTLVYEIVNCYMLSHHLQVGTWN VTDKLAQNFIVAVYNNYRKHREVAHYADLLCVTPSYLSVVVKEKTGKTALQWINDVVMSD ARQLLLYSDQSIKEIVATLGFPNQSFFGKYFKQHSGKSPKRFRLESRGGKKE >gi|283510547|gb|ACQH01000072.1| GENE 47 69308 - 70156 565 282 aa, chain - ## HITS:1 COG:SP1240 KEGG:ns NR:ns ## COG: SP1240 COG1266 # Protein_GI_number: 15901101 # Func_class: R General function prediction only # Function: Predicted metal-dependent membrane protease # Organism: Streptococcus pneumoniae TIGR4 # 105 251 146 294 317 79 34.0 6e-15 
MIIISKDDNLAWGCYLVLGVLAFLIIPGVVGYVLLLNIPAYAAVPLSTIAAVATLYAFKK YLSWYVENGESLSFRGKMRMMGIGWAVSVVNFLAIIVCLFLCGCYRIVNVELDVASQLSW LSLFLLVGVVEEVIFRGILFRLIADKWNIAVGLTTSSLLFGLAHLGNPGATLWAALAIAL ASGWLFGMAYAYHQTIWVPVGMHWAWNYLEGGVFGCAVSGTPLDYRPLITPRISGTDLLS GGAFGPEASIICVAIGIGISIVYTMLYIKKRKRLGAKPEMLF >gi|283510547|gb|ACQH01000072.1| GENE 48 70189 - 71043 384 284 aa, chain - ## HITS:1 COG:CAC0420 KEGG:ns NR:ns ## COG: CAC0420 COG1266 # Protein_GI_number: 15893711 # Func_class: R General function prediction only # Function: Predicted metal-dependent membrane protease # Organism: Clostridium acetobutylicum # 76 214 81 220 276 64 28.0 2e-10 MTVISKDDKLEWVAYLVIGCLGFWFLLSICLILLENDFSAFIKFPLALLASGVTLYAFKK YLGWYVEDGARLSAFGQIRKVGVGWLVSALYLFTILICLFATRHYSVAHLNFNLGHQLSW LSIFLLAAVIEEIIARGIVFRLITDKWNVVAGLVVSSITFGFVHLFNPNTTALSCLRIAI TGGWLCGIAYAYHRTLWVPIGIHWAWNYMQSNIFEHSVPGSALNSPPILILASKGVKFPA GIEFDTEVSIISTAIGVAISAVYTILYFKKKARFKTGEDVAPVF >gi|283510547|gb|ACQH01000072.1| GENE 49 71119 - 71349 149 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928978|ref|ZP_06422824.1| ## NR: gi|288928978|ref|ZP_06422824.1| DNA-binding protein [Prevotella sp. oral taxon 317 str. F0108] # 1 65 1 65 76 84 70.0 2e-15 MELRNETHRLNGLHRLIKMCNTGVPDEPTQRLHASKRHLDNVQNYLRALSARMDYSRSGY TLVSTILNLSSNESQF >gi|283510547|gb|ACQH01000072.1| GENE 50 71411 - 72121 338 236 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928976|ref|ZP_06422822.1| ## NR: gi|288928976|ref|ZP_06422822.1| hypothetical protein HMPREF0670_01716 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 236 1 236 236 409 100.0 1e-113 MKHLIDIRRLFSTYDKSNRSKTYIITKTIVLFLGLSVCAELFIAVCYFLLPSSMTNLVRT TAESLTINGVKSFSLKAWVWVIIIGPMMEETTHRLWLSMKKTHVAVSVAFGLYYVLGFVM VNYVQNHVVGAISCGVVSVFAGMMFYSLVNEDALEKIKCNSLLYPILVLLGCMAFSLAHL TNYVINVHVLLFGVVSCFRQLAGGISLSYLRINLGFFYGVLFHCSTNLVACLLATL >gi|283510547|gb|ACQH01000072.1| GENE 51 72271 - 72477 292 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928977|ref|ZP_06422823.1| ## NR: gi|288928977|ref|ZP_06422823.1| hypothetical protein HMPREF0670_01717 [Prevotella sp. oral taxon 317 str. F0108] # 1 68 1 68 68 139 100.0 4e-32 MNAIIKELSHDETLMVNGGVNTYYGGTLPEVVCVGHKKTFWDKILEYIQSVPGFDWRAEY QSGGFGPY >gi|283510547|gb|ACQH01000072.1| GENE 52 72588 - 72818 57 76 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSPCKIFARKKPFFVERYETTRYALPNTTSYIINFSSNRPLIIYHSDKESAEYAKVANYD SLKTYVPNRDIIKIRC >gi|283510547|gb|ACQH01000072.1| GENE 53 72802 - 73137 238 111 aa, chain - ## HITS:1 COG:no KEGG:ZPR_4403 NR:ns ## KEGG: ZPR_4403 # Name: not_defined # Def: putative DNA-binding protein # Organism: Z.profunda # Pathway: not_defined # 1 91 1 91 113 72 39.0 4e-12 MYFIEELKRIQKIHQLISCQCTGSPDELAASICVSRRDLYYILHGLKKMGAKICYSRTKK TFYYTNRFNLNMQIKVSLLGEKETDILGSGSVKVLSESKVTCDFVALNIGF >gi|283510547|gb|ACQH01000072.1| GENE 54 73143 - 73373 220 76 aa, chain - ## HITS:1 COG:no KEGG:ZPR_4403 NR:ns ## KEGG: ZPR_4403 # Name: not_defined # Def: putative DNA-binding protein # Organism: Z.profunda # Pathway: not_defined # 1 72 1 72 113 66 43.0 3e-10 MELRNEIHRLNSLHRLIKMCNTGSPDELAHKLHISKRHLYNVLDDLRALGAKIDYSRSDY TFYYTNSFELHLVLII >gi|283510547|gb|ACQH01000072.1| GENE 55 73379 - 75547 174 722 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 477 696 1 216 311 71 25 1e-11 MQNKGIKIKQQDITDCGAASLASVCAHYGLAFPIARIRQYAFTDKKGTNVLGMIEAAGKL GFEAKGVRAEPEALKMISLPAIAHVIVNEVLYHFVVVYGVSDKHITYMDPADGEMHRVAW DAFLKLWTGVLILLEPKDDFVQGNMKTSIPQRFFQLLRPHRRVMAEVLFGAVVCSILGLS 
TSVYVGKIVDYVLVDENTNLLNLMGVAMLAIILIRVFISTTKSILALKTSQMIDATLILG YYKHILHLPQQFFDTMRVGEIISRVNDAVKIRTFINNVLVDLVVNVLILIFAVGVMFVYS WKLALVTLVSAPLFFVVFFLFNRLNKRYQRKIMETAADLESQLVESLNAITTIKRFGIEA HANLKTESRFVRLLRNTYVSAQGAIFANNGIDVVSSVVTVAVLWVGSNLVINQTLTPGSL MIFYSLIGYVLSPIGSLISSNQTIQDAQIAADRLFQILDLETENTDVSTKVKMEPDMVGD IVFDNVSFRYGSRKQVFSSLSLVIPKGKTTAVVGESGSGKTTLVSLLQHIYPIQAGRISI GQYDIAMISNESLRRFIGIVPQKVELFSGTLLSNICLGDLQPDMRRVLDIITQLGLKDFV DGLPKGLDTIVSEQGTTFSGGERQRIAIARALYKQPEVLIFDEATSSLDSISERFVKQTL HALAAQGKTIIVIAHRLSTIKDADRIVVIDKGEVAECGTHSELIAQNGVFQRLWHAQTNV ID >gi|283510547|gb|ACQH01000072.1| GENE 56 75561 - 76880 990 439 aa, chain - ## HITS:1 COG:RSp0291 KEGG:ns NR:ns ## COG: RSp0291 COG0845 # Protein_GI_number: 17548512 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Ralstonia solanacearum # 96 411 82 427 447 81 23.0 3e-15 MLTLGVSFFLYSFVACLPVRGMRIYRRSPRFFVGKRLHNPNIRTVMEQQPELLTTEWVSD SVGTYLYQKRPKRRIIYIIVLAVVSAVVISLPFVYVDITVQSNGFVRPNGEISVITAPMT ETVERVSAKEGDKLRKGDEILRFRTSSPDGKITYQQERSKEASAQIADLELLSKGQCPRA FASAARQQEYAKYLSEQYRLKTDLRQYETEWRRYKVLFDKGLISESEYNEHYYRYQDKLN ELHLQQTNQKSTWKTELTNLKMQLKETTSNLIETRSNRNVYVVRSPINGTLEQFSGIYPG SNLQVGATIAVVSPDTSLYIEAYVAPRDIAFIRENMRVKVQIESFNYNEWGTLQGYVQHI SSDYIRNNNGESYYKVKCKLTKNYLELRNSKRKGYVKKGMAGIVHFVVTRRSLFNLLYKN IDEWANPTQYQSTATDKAH >gi|283510547|gb|ACQH01000072.1| GENE 57 77064 - 77270 62 68 aa, chain - ## HITS:0 COG:no KEGG:no NR:no RNEKRTFLFVYGAPMPFLIQLLLGYHRATYNLHKNTPGHIILYYISFAYITNSLYLCKSF VGVGKCLQ Prediction of potential genes in microbial genomes Time: Sat May 28 01:40:55 2011 Seq name: gi|283510546|gb|ACQH01000073.1| Prevotella sp. oral taxon 317 str. F0108 cont2.73, whole genome shotgun sequence Length of sequence - 36440 bp Number of predicted genes - 27, with homology - 24 Number of transcription units - 20, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 1553 - 1612 6.8 1 1 Tu 1 . 
+ CDS 1727 - 2488 515 ## gi|288928981|ref|ZP_06422827.1| hypothetical protein HMPREF0670_01721
+ Term 2581 - 2619 4.8
2 2 Tu 1 . - CDS 2568 - 2903 169 ## ZPR_4403 putative DNA-binding protein
- Prom 3007 - 3066 3.8
- Term 2987 - 3038 16.1
3 3 Op 1 . - CDS 3068 - 6079 3460 ## COG0342 Preprotein translocase subunit SecD
- Prom 6123 - 6182 3.3
4 3 Op 2 . - CDS 6189 - 7277 1116 ## PRU_2304 endonuclease/exonuclease/phosphatase family protein
5 3 Op 3 . - CDS 7289 - 8209 1086 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid)
- Prom 8269 - 8328 4.4
+ Prom 8313 - 8372 6.2
6 4 Tu 1 . + CDS 8422 - 8694 386 ## COG0776 Bacterial nucleoid DNA-binding protein
+ Term 8726 - 8766 9.0
- Term 8712 - 8753 6.0
7 5 Tu 1 . - CDS 8790 - 9338 665 ## COG1611 Predicted Rossmann fold nucleotide-binding protein
- Prom 9492 - 9551 4.1
+ Prom 9626 - 9685 5.2
8 6 Tu 1 . + CDS 9888 - 10430 437 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
- Term 10701 - 10746 -0.3
9 7 Tu 1 . - CDS 10770 - 10970 68 ##
- Prom 11016 - 11075 4.2
10 8 Tu 1 . - CDS 11446 - 11637 138 ## gi|260910472|ref|ZP_05917141.1| conserved hypothetical protein
- Prom 11699 - 11758 4.6
+ Prom 11588 - 11647 5.2
11 9 Tu 1 . + CDS 11833 - 12039 95 ## gi|288929061|ref|ZP_06422907.1| conserved hypothetical protein
+ Term 12061 - 12098 0.0
+ Prom 12047 - 12106 2.4
12 10 Tu 1 . + CDS 12157 - 12525 136 ##
+ Prom 13591 - 13650 4.2
13 11 Op 1 . + CDS 13683 - 14711 1219 ## COG0451 Nucleoside-diphosphate-sugar epimerases
14 11 Op 2 . + CDS 14728 - 15690 1065 ## PRU_2755 hypothetical protein
+ Term 15700 - 15733 -0.7
+ Prom 15913 - 15972 4.2
15 12 Tu 1 . + CDS 16062 - 17126 710 ## COG3177 Uncharacterized conserved protein
+ TRNA 17670 - 17745 83.5 # Gly GCC 0 0
+ TRNA 17772 - 17857 69.6 # Leu TAA 0 0
+ Prom 17782 - 17841 80.4
16 13 Op 1 . + CDS 17882 - 19558 2148 ## PRU_2390 hypothetical protein
17 13 Op 2 . + CDS 19621 - 20706 968 ## PRU_2389 putative lipoprotein
18 13 Op 3 .
+ CDS 20720 - 22267 1603 ## COG0606 Predicted ATPase with chaperone activity
19 13 Op 4 . + CDS 22285 - 23076 648 ## BF2573 alpha-1,2-fucosyltransferase
+ Prom 23214 - 23273 2.6
20 14 Tu 1 . + CDS 23296 - 23820 255 ## gi|288928996|ref|ZP_06422842.1| hypothetical protein HMPREF0670_01736
+ Term 23977 - 24018 1.1
21 15 Tu 1 . + CDS 24341 - 25699 1667 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases
+ Term 25784 - 25822 -0.8
+ Prom 26015 - 26074 3.9
22 16 Op 1 . + CDS 26170 - 28245 1604 ## Coch_1034 sialate O-acetylesterase (EC:3.1.1.53)
+ Term 28379 - 28420 -0.7
+ Prom 28362 - 28421 4.5
23 16 Op 2 . + CDS 28441 - 29106 337 ## COG0350 Methylated DNA-protein cysteine methyltransferase
+ Term 29250 - 29298 10.2
- Term 29494 - 29539 2.5
24 17 Tu 1 . - CDS 29775 - 30932 914 ## COG3746 Phosphate-selective porin
- Prom 30982 - 31041 3.3
25 18 Tu 1 . - CDS 31087 - 31878 862 ## COG0671 Membrane-associated phospholipid phosphatase
- Prom 31899 - 31958 6.5
26 19 Tu 1 . - CDS 32053 - 35223 2196 ## gi|288929003|ref|ZP_06422849.1| hypothetical protein HMPREF0670_01743
- Prom 35299 - 35358 3.2
+ Prom 35181 - 35240 5.3
27 20 Tu 1 . + CDS 35267 - 35581 339 ##
Predicted protein(s)
>gi|283510546|gb|ACQH01000073.1| GENE 1 1727 - 2488 515 253 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928981|ref|ZP_06422827.1| ## NR: gi|288928981|ref|ZP_06422827.1| hypothetical protein HMPREF0670_01721 [Prevotella sp. oral taxon 317 str.
F0108] # 1 253 41 293 293 518 100.0 1e-145 MYVDASIFSSGLCSTIRETIIDTGSSVCIIDSTYAVDSCQIKGEQVNATMGNTMGKNINT SYVYLDSIAFGGVVYTKVRCYLVDLAGTLQQFAPKFVIGGEVLKKDLWCCDLKEYTLQRY MRVPKNVVATIKWKKYADAALNLIYFEGKIGGKYTRIFFDTGSVRNAVPSNFDITPTDSV KLPGANIAEKLTYKKVGWCKNVPVEISNLNFNLDFGKSNNSGLKEPCINAEFLQGKKWML DYKRRRLLILGSD >gi|283510546|gb|ACQH01000073.1| GENE 2 2568 - 2903 169 111 aa, chain - ## HITS:1 COG:no KEGG:ZPR_4403 NR:ns ## KEGG: ZPR_4403 # Name: not_defined # Def: putative DNA-binding protein # Organism: Z.profunda # Pathway: not_defined # 1 90 1 90 113 69 35.0 4e-11 MNFIEDLKRLRKIHELISGQNTGSPEKLAAAICVSRSELYYILRELKKMGAQICYDRRKR SFCYVNNFKLNMEIKVSYLGEQEMSILGCGLVGCRTCRSVLYNAFVQNVTF >gi|283510546|gb|ACQH01000073.1| GENE 3 3068 - 6079 3460 1003 aa, chain - ## HITS:1 COG:AGc2877_1 KEGG:ns NR:ns ## COG: AGc2877_1 COG0342 # Protein_GI_number: 15888881 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecD # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 385 659 275 548 562 247 46.0 1e-64 MQNKGIVIVTAVLLTLASIFYLSFSVATRYYDGKAAELKDPIAQQDYKDSVKYLGIYSYQ RCLETQIGLGLDLKGGMNVILEISVPDVIETLADHKTDAAFTKSLEQAKKEEETSQSDFI SLFIKYYKQNAPGHKLAELFATQQLQGKVSPQSSDAEVEKVLRTEVQAAIDNSFNVVRTR IDKFGVVQPNIQKLEGQEGRIMVEMPGVREPERVRKLLQGSANLEFWETFNASEIVPYLT QLNARLASEQNGTDTAAVDTANARAKAEQEVAKAKNEQPKFKLNTGDDKKTDVAQRANNA QTAAAIKANPLFSRLQPTGGNSLSIVGYANIRDTAAINKLIYGPVAKQVLPPDLKLLWSA KPADQAKMKNIYELHAIKVTSTNGRAPIEGDVVTDASDEFNHTTGAPQVSMRMNSDGARR WAALTKANVGKAIAIVLDNAVYSAPRVSNEIDGGSSSISGNFTIEETKDLANTLKSGRMP APARIVQEEVVGPTLGAQSIQQGIQSFAIAFVVLMVYMVLMYGFTPGMMANTALIVNLFF TLGILTSFQAALTLSGIAGMVLSLGTAVDANVLIYERTKEELAAGKGLKQAIKEGYGNAF SAIFDSNLTSIITGIILYSFGTGPIRGFATTLIIGIVCSFFTAVFLTRLVYEYQLNKDRW TKLTFTTGFSKNFMQNVHFPFMSVYKKTFVLVAVLLAVFVGSFVTRGLSRSIDFTGGRNY VVTFEKQVQPEQVRGILANAFPGCTTGALALGTDHKTIRISTNYKIESNSPTVDDEAETI LYNALKKAGLVSQKNVDAFKNPDIRQGGSIISSTKVGPSVAKDITYGAIISVLLAIFAIF VYILIRFRNVAFSVGSTVALAIDTTFVIGLYSLLWGFMPFSLEIDQTFIGAILTVIGYSI 
NDKVVVFDRIRENMKLHPKQDFRSLFNDSINQTLARTINTSFSTLIVLLCILFLGGESIQ SFAFAMSLGVIFGTLSSIFIAAPIAFLTLGKTQKGRALEAATV >gi|283510546|gb|ACQH01000073.1| GENE 4 6189 - 7277 1116 362 aa, chain - ## HITS:1 COG:no KEGG:PRU_2304 NR:ns ## KEGG: PRU_2304 # Name: not_defined # Def: endonuclease/exonuclease/phosphatase family protein # Organism: P.ruminicola # Pathway: not_defined # 1 358 1 359 364 291 40.0 2e-77 MQAKVKKWIVQLASGANVATILVMLAVGYSDRLNPASHPLMATLGLTFPVFLAINAAFLV FWAFFKLRRTLIPLVGFVACFGPVRTYIPLNVNSNPTDSCLKVLSYNTFLWGGGEATEPQ RWEMLNYVKEKDADILCLQEANLGGKFQATIDSLLNSIYPYHDIKEKTTIYDRLAIYSKY PIVRSERVMYPQSEDFSQACWVAMPGDTVLVVNNHFATTGLTEAQRQEFKSMLKGDLRTK QMPSLGKSLVHKLGEYAKIRAPQVQCVAQFVRKRKGQSIILCGDFNDSPISYAHRCMAKE LTDCYVASGNGPGISYHRNAFYVRIDNIMCSDDWQPLKCVVENKVKTSDHYPIFCLLQKL RK >gi|283510546|gb|ACQH01000073.1| GENE 5 7289 - 8209 1086 306 aa, chain - ## HITS:1 COG:XF0649 KEGG:ns NR:ns ## COG: XF0649 COG0705 # Protein_GI_number: 15837251 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Xylella fastidiosa 9a5c # 4 238 9 222 224 161 37.0 1e-39 MGNIPTITKNLLIINVLAFLATLVIQSSSGIDLNNILGLHFFRASEFRIYQLVTYMFMHG GFAHLFFNMFALWMFGVVVEGVWGPRKFLFYYIACGIGAGLMQELAQFVEIYLQLNAQAP LGVGDAFIVMGQLANQLNGLTTVGASGAIYAILLAFGMIFPENKIFIFPIPIPIKAKWFV MGYAALELYQAMAGTGGNVAHLAHLGGMIFGFFMIRYWQRHPSHDFSQRRGTNAFERMKD FYEKHRQPTPESPREETRAESDWDYNARRKAEQDEVDRILDKIRKSGYDSLTREEKQRLF DQSKRQ >gi|283510546|gb|ACQH01000073.1| GENE 6 8422 - 8694 386 90 aa, chain + ## HITS:1 COG:HI0430 KEGG:ns NR:ns ## COG: HI0430 COG0776 # Protein_GI_number: 16272378 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Haemophilus influenzae # 1 90 47 136 136 82 54.0 2e-16 MNKTDLIDKIAAGSGLSKADSKKALDATVAAIKETLVKGDKVQLVGFGTFSVNERPAHGG VNPATKEKITIAAKKVAKFKAGAELTEAIQ >gi|283510546|gb|ACQH01000073.1| GENE 7 8790 - 9338 665 182 aa, chain - ## HITS:1 COG:mll7812 KEGG:ns NR:ns ## COG: mll7812 COG1611 # 
Protein_GI_number: 13476480 # Func_class: R General function prediction only # Function: Predicted Rossmann fold nucleotide-binding protein # Organism: Mesorhizobium loti # 2 167 6 174 203 104 31.0 9e-23 MNIAVFCSANADIDPRFFKATTALGRWCAENGHVVLFGGTNLGLMECVAKAAHEAGGRVV GVVPKFVEEGGKVSTYVDEVYRCENLTDRKEMLNTLCDVAIALPGGVGTLDEVFTLAAAN TVGYRHQRVILYNMEGFWDTLVALLDNLERQGMIRGNCRSRISVVNNLDELEEDLKGERV KK >gi|283510546|gb|ACQH01000073.1| GENE 8 9888 - 10430 437 180 aa, chain + ## HITS:1 COG:all2193 KEGG:ns NR:ns ## COG: all2193 COG1595 # Protein_GI_number: 17229685 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Nostoc sp. PCC 7120 # 12 178 25 193 201 64 30.0 1e-10 MKDNYEEYIALLKEGSYEAFSDLYTHYASKLYSFALVQTRNSVLSEDIVQETFLKLWNTR GQLDCYGNVHALLFTIARNLIIDGFRKQIRQIDFEDYRQSCNKLSLLPTPEEQLDYEEFI NRLQQTKGLLSKRACQIYEMSREENMPIQEIAESLNLSSQTVKNYLTSTLKIFRKKLMRE >gi|283510546|gb|ACQH01000073.1| GENE 9 10770 - 10970 68 66 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLVRHSEAKMVANGYVVNPSIIANVGFCKRGALLRHGVLIVVVDVRRASSRVGTRPDGSL QGNVAR >gi|283510546|gb|ACQH01000073.1| GENE 10 11446 - 11637 138 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260910472|ref|ZP_05917141.1| ## NR: gi|260910472|ref|ZP_05917141.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. F0295] # 1 63 1 63 63 94 95.0 3e-18 MRQEKRTKSVLKQYAEQRKLMYSIVLACAIILMCCYLGYCFRRDCAIRDNAKESSATMER QQK >gi|283510546|gb|ACQH01000073.1| GENE 11 11833 - 12039 95 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929061|ref|ZP_06422907.1| ## NR: gi|288929061|ref|ZP_06422907.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 57 1 56 60 67 64.0 3e-10 MSLCRKTNINAPLMHEVLNKDVTKSEILPHSPATKHGYPQKKVTWREFFQYTLYKLKASS LRHIMRKK >gi|283510546|gb|ACQH01000073.1| GENE 12 12157 - 12525 136 122 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVMARKERLAPSTYDEEKVKRTFLCFICSYVKMDVFRSQYEKMNVFVKPDEQCRACSNMV MARKERLAPSTYDEEIVKRTFLCLLCSYVKKDLFRSTLKLLSEPPTYVLMSKRTYLGHNM KC >gi|283510546|gb|ACQH01000073.1| GENE 13 13683 - 14711 1219 342 aa, chain + ## HITS:1 COG:VC0262 KEGG:ns NR:ns ## COG: VC0262 COG0451 # Protein_GI_number: 15640291 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Vibrio cholerae # 3 336 10 317 323 92 29.0 1e-18 MDKILVTGASGFIGSFIVEEALNRGMEVWAAVRKSSSKEYLQDKRIRFVELDLGNAERLK SQLGGHHFDYVVHAAGATKCLHRNDFYRVNTEGTKNLANAVIDLKMPLKRFVFISSLSVF GPVREQQPYEEIQETDQPMPNTAYGHSKLLAEQYLDSLNPKNEDGEITDVVLPYIVLRPT GVYGPREKDYFLMAKSIKQHIDFAVGYKPQDITFVYVQDVVQAVFLALDHGMSGRKYFLS DGGVYRSYDFSDLIHRNLGKPWMLRIKAPVWFLRVVTFFGEYIGRATGRISALNNDKYHI LKQRNWRCNIEPTMDELGYHAQVNLEEGTALTVKWYKENGWL >gi|283510546|gb|ACQH01000073.1| GENE 14 14728 - 15690 1065 320 aa, chain + ## HITS:1 COG:no KEGG:PRU_2755 NR:ns ## KEGG: PRU_2755 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 4 318 3 317 318 373 60.0 1e-102 MNFIKSLFAIEKSPKKGLMALEWVVLAYMALTLLMVFFTYTKLANPESMIWGRVRIGIII VALWVVYRLVPCRFTKLTRVVAQLYLLSWWYPDTYELNRILPNLDHVFAQAEQSVFGFQP ALVFCDAMPSAWFSELMDMGYASYFPMILAVTLFYFFCRYAEFERASFVLIAAFFIYYVV FVALPVTGPQYYYPAVGMEKIAMGIFPNVGDYFNLHSERLTSPGYANGIFYQMVEHAHEA GERPTAAFPSSHVGISTIIMLLAWHTRNRRLVFCLLPFFVLMCFSTVYIMAHYAIDAFAG LISGALLYFVLMWVSRKWSA >gi|283510546|gb|ACQH01000073.1| GENE 15 16062 - 17126 710 354 aa, chain + ## HITS:1 COG:mlr2757 KEGG:ns NR:ns ## COG: mlr2757 COG3177 # Protein_GI_number: 13472455 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 26 251 32 229 263 102 34.0 1e-21 
MDKLQALYDKWKALQPLNERKQFLLSQRFTVEYNYNSNHIEGNTLTYGQTELLLLFGKVS GEGKLKDFVDMKASLASVKMLEDEMKARTALTQNFIRQLHHVLLREDYTTHRQLPDGSQT SFTIHAGRYKTRPNSVITRYGERFDYASPEETPALMADLVDWYNAMEAKGVMSPVELAAL FHYRYIRIHPFEDGNGRIARLLVNYILARHGWPMVVVRNRNKNEYLDALHDADVDVGSSP SAGAHASLAQIRRFLKYFKAMVAEELQYNIGFATEQSANVWWYDGQKITFRSASSSTMLT AMQQTPDITIEQLSKKAGIVTAAVKKQLAQMTRKGYIERREKDGDWQVFVSSSV >gi|283510546|gb|ACQH01000073.1| GENE 16 17882 - 19558 2148 558 aa, chain + ## HITS:1 COG:no KEGG:PRU_2390 NR:ns ## KEGG: PRU_2390 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 558 1 553 553 565 57.0 1e-159 MNKNLILTLSVLSSLAMTSCSKLGPLSADNFQVTPTPLEANGGLVSATIDANFPAKYMKK KAVVKIIPELRYAGGQVATGEGATFQGEKATANGQQVPYKLGGRYSMKTDFNYVPDMIKS DLYLTFDARMGKKKVDLPAVKVSYGVVATSQLYREAMANDGLCIAPDSFQRIKAQKQEAN IKFLINQANLRKTELKNNSVTEFVKMLKKINADREKLNLRNVEVNAYASPEGGFVINDKL AAKRQTTGEGYVKGQLKQNKMQTAVEARYTAQDWDGFQELVQASNLQDKDVILRVLSMYK DPEERERQIRNMSEGFRELATGILPELRRARLTINYEVVGRSDEQIKEQFAADPTQLSSE EILYSATLTDNPAEKNNIYKKATELYDQDYRAYNNLGVLAMQAGNLAEARTYFQKALATA PDAAEANANLGMVALYEGNISEAETLIAKAQQANDINKAVGALNIAKGNYSLAETTLSSY ESNMAALAKLMNKNYEGALATLKNVKHPDAVTDYLMAVAYARQGDNAAATTALQAATAKD ARLAQRAAVDLEFAKLNK >gi|283510546|gb|ACQH01000073.1| GENE 17 19621 - 20706 968 361 aa, chain + ## HITS:1 COG:no KEGG:PRU_2389 NR:ns ## KEGG: PRU_2389 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 46 361 18 332 333 248 39.0 4e-64 MTTTSFKCTTLIEFAKMVWQVVNPSPFKPAKAMLAAAVLALLLTGCGVEGNRFEVEGHFL KINRGEFYVYSTNGLIDGIDTIKLEAGRFVYEIPCEREGTLVMVFPNFSEQPIFAQPGKS VKMEGDASHLKELTVKGTKDNKLMNQFREAIANVSPPQAAKTAALFAADNPASPVAAYLV RRYFITTPTPNYKEAARLLKLLLAEQPKNGELNRMQSLISVLAKTSVGASLPPFQARSTK GEKVSEQLCNKAPVAVFSVWSTTNMQSMELQRMLNQKVRSSKGKLKVVGLCIDPIQRECK DILERDSISWPNICDGAMFEGNVAKKLGVYSVPFNILLKNGKIVAKDLDANQLKERLDKL L >gi|283510546|gb|ACQH01000073.1| GENE 18 20720 - 22267 1603 515 aa, chain + ## HITS:1 COG:slr0904 KEGG:ns NR:ns ## COG: slr0904 COG0606 # Protein_GI_number: 16331658 # Func_class: O 
Posttranslational modification, protein turnover, chaperones # Function: Predicted ATPase with chaperone activity # Organism: Synechocystis # 1 507 1 507 509 501 50.0 1e-141 MLVKTYCAAVNGLNVTTVTVEVNLVKGMLYHFTGLGDEAVREGRDRISSAIQYNNMRFPR ADITVNMAPADLRKEGSSFDLPLAIAILAADSQLPTDNLGDFMMVGELSLDGTLQPIKGA LPIAIRARAEKFKGLLVPKANVREAAVVNNLDVYGMENIVDVVNFLSGKAQFEPTVIDTR KEFYEQQYRFDLDFADVRGQENVKRAMEVAAAGSHNLIMVGPPGSGKSMMAKRLPSILPP LSLGESLETTQIHSIAGKLGKGMSLISQRPFRAPHHTISEVALVGGGATPQPGEISLAHN GVLFADELPEFNKTTLEVLRQPLEDRKITISRAKYTIEYPCSFMFVASMNPCPCGYYGDP THHCVCMPGQIQRYMNKISGPLLDRIDIQVEITPVPFKDISQAVQGESSSVIRERVIRAR HMQEQRFKEVKGIYSNAQMTERMIHQFAEPDAEGIELLRTAMERLSLSARAYNRILKVAR TIADLEQSETVKSQHLAEAISYRNLDRGDWAERGN >gi|283510546|gb|ACQH01000073.1| GENE 19 22285 - 23076 648 263 aa, chain + ## HITS:1 COG:no KEGG:BF2573 NR:ns ## KEGG: BF2573 # Name: not_defined # Def: alpha-1,2-fucosyltransferase # Organism: B.fragilis # Pathway: not_defined # 1 263 1 271 271 172 37.0 1e-41 MKIVCIKGGLGNQLFEYCRYRSLHRHDNRGVYLHYDRRRTKQHGGVWLDKAFHITLPNEP LRVKLLVMVLKTLRRLHLFKRLYREEDPRAVLIDDYSQHKQYITNAAEILNFRPFEQLDY AEEIQTTPFAVSVHVRRGDYLLLANKSNFGVCSVHYYLSAAVAVRERHPESRFFVFSDDM EWAKENLNLPNCVFVEHAQAQPDHADLYLMSLCKGHIIANSTFSFWGAYLSKGSSAIAIY PKQWFAEPTWNVPDIFPAHWMAL >gi|283510546|gb|ACQH01000073.1| GENE 20 23296 - 23820 255 174 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288928996|ref|ZP_06422842.1| ## NR: gi|288928996|ref|ZP_06422842.1| hypothetical protein HMPREF0670_01736 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 174 1 174 174 312 100.0 5e-84 MDTKKKKKSKLEIDDSYELQLVSKLHPALRWVFMLPIGLLAILLVQLAYGFVFNKMLSNF DPNGGVAIVINSLFMAMKYCVFVVAMVGVAPVVREKKFQASIACAIIAVVVCLGSTGFLL YNNGVENLTMLIATGGSAFLGIVWAVLNVRSTVSKPLPTEQEGELAELQTRKNA >gi|283510546|gb|ACQH01000073.1| GENE 21 24341 - 25699 1667 452 aa, chain + ## HITS:1 COG:DR2025 KEGG:ns NR:ns ## COG: DR2025 COG0624 # Protein_GI_number: 15807020 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Deinococcus radiodurans # 9 439 11 440 459 415 49.0 1e-116 MIKKYIQENEERMLDELFSLIRIPSVSAQQAHKPDMQRCAERWKELLLMAGVDRAEVMPS DGTPMVYGEKIVDPNAKTVLIYGHYDVMPAEPFELWKSEPFEAEVRDGRIWARGADDDKG QSFIQAKAFEYVVKNNLLKHNIKFILEGEEEVGSPSLGAFIEKHKELLKCDVILVSDTSM IAEDIPTLTTGLRGIAYWQIEVTGPNRDVHSGHYGGAIANPINVLCKLIADATNEDGRIL FPGFYDKVEEASDDERKMLASIPLDLEGYKKSIEVDELFGEKGYSTLERTGFRPSFDVCG IWGGYTGDGAKTVIASKAYAKLSCRLVPHQDYKEIAQLVIDYFKRVAPKSVKVDIKFLHG GQGYVCPIDMPAYKAAERGVEMVFGKRPLAARLGGSIPIISTFEQILGVKTILMGFGLES NAIHSPNENMPLSMFRKGIEAVVNFHLEYDKF >gi|283510546|gb|ACQH01000073.1| GENE 22 26170 - 28245 1604 691 aa, chain + ## HITS:1 COG:no KEGG:Coch_1034 NR:ns ## KEGG: Coch_1034 # Name: not_defined # Def: sialate O-acetylesterase (EC:3.1.1.53) # Organism: C.ochracea # Pathway: not_defined # 15 689 16 688 689 760 54.0 0 MRRPLVTLLLFALALAAFAAEPIRVACVGNSITYGYGLPNPATDSYPAQLQQKLGKAYDV RNFGHSGATLLSRGHRPYIEQETYKNALAFKPDIVVIHLGINDTDPRNWPFFRDDFTQDY IALIHSFKVANPKAAFYIAKMSPISHRHHRFLAGTRDWHKLIQTAIERVAKATGARLIDF YKPLLPFPQMFPDGLHPNKAGAGVLARVVYEALTGDYGGLQMPLVFSDNMVVRRNRDFDV YGTANAGQRVVATFNGEKRSTVVQNDGTWRIVFAAPRMGKPYTLTITSQQRTLRFKNIVA GEVWLCSGQSNMAFPLANDELGKEALTIVDDPNLRVLNFLPRWETDDTEWPQSALDSTNN LQYMRSKGWQQATSKNMGEVSAVAYYFARVLRECLGVPVGIIVNAVGGSPTEAWIDRQTL EDNYPELLNDRFKDNVMVMPWVVQRMRKNIAHTTDKLQMHPYQPTYLYDAAIRPLAKFPI NGVIWYQGESNAENMEIHERLFPLLLSSWRTNWDNLQLPVYMVQLSSINRPSWPWFRNSQ RLMANTLPYVYMAVSSDVGDSLDVHPRQKQPVGNRLARLALYHQYDFKRLTPCGPSVQSA 
TKQANGEVRVCFNWADGLKTADGKALRTFEVAGYDEVFYPAEATIKDREVRLTCPQVPHP AFVRYGWQPFTRANLVNGDALPASTFRVEVK >gi|283510546|gb|ACQH01000073.1| GENE 23 28441 - 29106 337 221 aa, chain + ## HITS:1 COG:SA2335 KEGG:ns NR:ns ## COG: SA2335 COG0350 # Protein_GI_number: 15928126 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase # Organism: Staphylococcus aureus N315 # 59 211 9 157 173 141 49.0 1e-33 MKKNNKSHKSQDISLPSPRGEGLGAATFSLEEGLSVSPVISCGKQAVPNTRSFFLKLPSP LGTLHLESDGEALTCVLFEGEKVKHPKGVTLQEAPQLPVFVAARAWLSRYFEGHVPGAPP PVRPQGTPFQQSVWQRLLTIPYGSTTTYGALAAYLAAHNASGRMSAQAVGQAVGRNPISI IIPCHRVIGADGTLTGYAGGLDKKIFLLGLEGSMKGERVKR >gi|283510546|gb|ACQH01000073.1| GENE 24 29775 - 30932 914 385 aa, chain - ## HITS:1 COG:XF0975 KEGG:ns NR:ns ## COG: XF0975 COG3746 # Protein_GI_number: 15837577 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate-selective porin # Organism: Xylella fastidiosa 9a5c # 103 385 112 389 389 95 26.0 2e-19 MKRIIFIAACGLITCAGVFAQEEEPKLVVKPTGRILMDGGVMHSTDKTLDAQLNDGVAVP DVRVGVSATYGKWKAKVDVGFARQSLSLKDICLDYNFNKENLIRMGYFVHQFGLQSGTSS SFKITMEEPLANQAFFNSRLIGAMFVHAGEKYHATASVFAENDAMKMTADKLGNEAWGAM TRLVWRPTTEHGKIFHVGVSGAYESPRYNKNAAISHKSYTLKAPFPTRIVNVAAQEATIT DATALWKVSPEMNLAFGNFGVEAQYFYVNIKRENAMPNYNAWGAYGNVRYLLNGQGYDYN KADAGIATPDPGSMELVAGYNYSTLTDKSADIYGGKVHDWSLTFNYYLNKYMIWRVRGSM TRATDNAAFNNNTFSILETRLQIKF >gi|283510546|gb|ACQH01000073.1| GENE 25 31087 - 31878 862 263 aa, chain - ## HITS:1 COG:STM4319 KEGG:ns NR:ns ## COG: STM4319 COG0671 # Protein_GI_number: 16767569 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Salmonella typhimurium LT2 # 51 245 39 233 250 177 46.0 3e-44 MTKKTLVVGLLAMFSVTTFAQTAAKKIKDVRTQPDLYYLQDGQQASSLGLLPPPPQAGSI MFLNDQAQYQWGKMQRNTPRGDQAVADARVGGDGVPNAFSEAFGIKISKETTPEIYKLVV NMREDAGDLATRSAKDHYMRVRPFAFYKEMTCNPEQQEELSTNGSYPSGHTAIGWATALV LSEINVDRQNEILERGYQMGQSRVICGYHWQSDVDAARVISAAVVARLHAEPAFQEQLAK 
AKKEFAKLKKEGKVAKSTFKAPN >gi|283510546|gb|ACQH01000073.1| GENE 26 32053 - 35223 2196 1056 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929003|ref|ZP_06422849.1| ## NR: gi|288929003|ref|ZP_06422849.1| hypothetical protein HMPREF0670_01743 [Prevotella sp. oral taxon 317 str. F0108] # 1 1056 1 1056 1056 2103 100.0 0 MTTRPKRKIKTSTCALWVCLTLLFTLASCKETTTSRSLSEKEFHYDDRLSTLSVDKDGYW IGGETGIIWNVKGQERKRFYTGLDRIYDIERNPRNANQIWVATRNAGLQLWNIAADTLIH QATFKIPSKGNRYSPYDIEISGDKLFVATSQGLFALSLNEQQQELTPLYPFNENHTHRYL EPFLVNNISSVGNKWLFAATQAGVISIDLALNKTTMRNKGQNVRNVAVYDGKLHVLADNR LSIENFNGTDRKTYALPQPVLSFYKAGSTYYFITSSSILLSDNLKRFVNIPLRSNIPDNA HQIAIADDGNSFSILLTKNAVWRIPHHLGFFNANPPVVAACAWGNDFFFVNNQHELFRQK ADEQTATKIYDFENEELPKEMYANGEDIYYYNANNQLFRLNLGSHYIINQLFKRPQLLAQ PTTRITAMALLPHQGKILIGVQDDLLSIDARDGKIDTIKAMNNRYVTAFHQVENSEDIYI ATLNHGVFVGQGNQVKSVVGTEDKVFIGSLLAYGKRSSHLLLLTNHHLQILGSDSIRAEG CGRMFCINDSIVYTIPETGIHKYIIKGGRLEDCGSFFADIHFNARAGFIHGNTLYIGSNL GVAQITPGKEETAQWITFDNKVPSLQLIAVIIFTIVCIVAIIFFSYRRHKVLTFRQLQMN KDDLHQRLTTLGTLKDRLTETERNTIDSITNEIDEMNISSQSVRASNEQFAKLSVRIAKL NRDMALQMVSYLNGQISRIQQFDVYERYSMIHDSEEARSTDNLEIIIEQCRRNEVWLNHI QELKERLNKFHRSTQDTLVIKGLNDGMKEQLHHILNESKQRPVAEVYTDFIAVKNQYENI FTPDGLRLISGYISSRIRQLQQQKGYEMMTKALADELLSIEKDIANRDRIVLLRVLQTID KRIAQVKHLKTLQQLMQEYAAVHDRVVQENEERRMKKFNAELFADIESATRNITTQIANT SNQFFKNFAYTDKDICKETLHFTTANNQQVRVLILLLAMPRVKRTLLPGMLGIYGNLNPV VSRLYHSKIGNNNAILAEYCSVNPTSVVNYILKLTE >gi|283510546|gb|ACQH01000073.1| GENE 27 35267 - 35581 339 104 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVANLKRMMQSMPSLARITRIAGKSRITRTTRKPRKPRNPRITRKPRTTRKSRTTRKPRI TRKPRTTRNPRIPRNPRIPRNPRITRNPRTTRKPRTTRKPKKSR Prediction of potential genes in microbial genomes Time: Sat May 28 01:42:38 2011 Seq name: gi|283510545|gb|ACQH01000074.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.74, whole genome shotgun sequence Length of sequence - 1568 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 196 - 267 57.0 # Arg CCT 0 0 - Term 152 - 191 8.4 1 1 Tu 1 . - CDS 355 - 708 285 ## gi|288929004|ref|ZP_06422850.1| hypothetical protein HMPREF0670_01744 - Prom 729 - 788 2.0 - Term 769 - 804 -0.8 2 2 Tu 1 . - CDS 832 - 1431 586 ## gi|288929005|ref|ZP_06422851.1| lipoprotein Predicted protein(s) >gi|283510545|gb|ACQH01000074.1| GENE 1 355 - 708 285 117 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929004|ref|ZP_06422850.1| ## NR: gi|288929004|ref|ZP_06422850.1| hypothetical protein HMPREF0670_01744 [Prevotella sp. oral taxon 317 str. F0108] # 1 117 1 117 117 147 100.0 2e-34 MNNFTLFFMYDMVWVSLLAIIYATYALLSKPKGDKAVRLTIGFAFAYYLVFFLLSFATSI ATAKGLFLLALLPLSALWFALAPYRKSRRIKLVALMLAALYCAFFGIVILLHGMSNM >gi|283510545|gb|ACQH01000074.1| GENE 2 832 - 1431 586 199 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929005|ref|ZP_06422851.1| ## NR: gi|288929005|ref|ZP_06422851.1| lipoprotein [Prevotella sp. oral taxon 317 str. F0108] # 1 199 1 199 199 385 100.0 1e-106 MLKTKIHPLLIAMLLPLACALLSSCGKVENEFSDRRAYFIFDNQVQNNAVLASAMSPHSN VFVTVSMQTRYSGQNSYVEFNFVAGGGGAQQASKATAVDQNRGVVLGINNGLILGYGLLS DPPVFYAYDLQCPNCYSSTAVPRRSYPLTVQANGFATCANCRRKYNLSTGGNVAEGAQGN KLIRYRASTTGPYGLLVVN Prediction of potential genes in microbial genomes Time: Sat May 28 01:42:59 2011 Seq name: gi|283510544|gb|ACQH01000075.1| Prevotella sp. oral taxon 317 str. F0108 cont2.75, whole genome shotgun sequence Length of sequence - 27932 bp Number of predicted genes - 24, with homology - 21 Number of transcription units - 15, operones - 7 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 39 - 452 573 ## PRU_2394 hypothetical protein - Prom 629 - 688 3.1 - Term 592 - 638 -0.8 2 2 Tu 1 . 
- CDS 760 - 1398 924 ## COG0461 Orotate phosphoribosyltransferase - Prom 1487 - 1546 6.8 + Prom 1447 - 1506 6.7 3 3 Op 1 . + CDS 1702 - 4365 2493 ## ZPR_0173 hypothetical protein 4 3 Op 2 . + CDS 4415 - 5134 868 ## gi|288929011|ref|ZP_06422857.1| conserved hypothetical protein + Term 5161 - 5197 2.4 + Prom 5231 - 5290 1.9 5 4 Op 1 . + CDS 5349 - 7166 2285 ## COG0018 Arginyl-tRNA synthetase 6 4 Op 2 . + CDS 7170 - 7640 362 ## COG3341 Predicted double-stranded RNA/RNA-DNA hybrid binding protein + Prom 7656 - 7715 8.5 7 5 Op 1 . + CDS 7749 - 8999 1115 ## BF0423 hypothetical protein + Prom 9052 - 9111 3.8 8 5 Op 2 . + CDS 9169 - 9432 103 ## 9 6 Tu 1 . - CDS 11364 - 11852 293 ## gi|288929016|ref|ZP_06422862.1| hypothetical protein HMPREF0670_01756 - Prom 11946 - 12005 2.4 10 7 Tu 1 . - CDS 12124 - 13170 1039 ## COG0117 Pyrimidine deaminase - Prom 13319 - 13378 2.9 + Prom 13113 - 13172 2.7 11 8 Tu 1 . + CDS 13192 - 14076 620 ## COG2890 Methylase of polypeptide chain release factors + Prom 14167 - 14226 3.7 12 9 Op 1 . + CDS 14256 - 14744 628 ## PRU_2397 regulatory protein RecX 13 9 Op 2 . + CDS 14719 - 15435 477 ## COG1040 Predicted amidophosphoribosyltransferases + Term 15608 - 15649 11.1 14 10 Tu 1 . - CDS 15462 - 15662 82 ## - Prom 15887 - 15946 2.8 + Prom 15535 - 15594 3.2 15 11 Op 1 1/0.000 + CDS 15700 - 16314 605 ## COG1309 Transcriptional regulator 16 11 Op 2 . + CDS 16359 - 18719 2305 ## COG1033 Predicted exporters of the RND superfamily 17 11 Op 3 . + CDS 18760 - 19536 809 ## Fisuc_1740 hypothetical protein 18 11 Op 4 . + CDS 19533 - 20816 1060 ## Fisuc_1741 hypothetical protein + Term 20851 - 20902 4.1 + Prom 20855 - 20914 3.4 19 12 Tu 1 . + CDS 21047 - 21253 59 ## + Prom 22224 - 22283 5.4 20 13 Op 1 . + CDS 22377 - 23543 1211 ## COG4260 Putative virion core protein (lumpy skin disease virus) 21 13 Op 2 . + CDS 23608 - 24687 1070 ## Glov_1382 hypothetical protein 22 14 Tu 1 . 
+ CDS 24925 - 25848 784 ## SYNPCC7002_G0157 hypothetical protein + Prom 25945 - 26004 4.8 23 15 Op 1 . + CDS 26051 - 27664 1561 ## Ccel_2393 S-layer domain protein 24 15 Op 2 . + CDS 27661 - 27931 78 ## gi|260910396|ref|ZP_05917068.1| conserved hypothetical protein Predicted protein(s) >gi|283510544|gb|ACQH01000075.1| GENE 1 39 - 452 573 137 aa, chain - ## HITS:1 COG:no KEGG:PRU_2394 NR:ns ## KEGG: PRU_2394 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 8 137 1 130 130 162 57.0 2e-39 MSTFESSIRQIPYKQEAVFNMLSDLSNIERLKDKLPEDKLEQMTFDSDSISMSVNPVGQI KLRIVEREAPKCIKFETAESPVAFNLWIQVVPNGDNASKMKLTIKAELNPFIKGMVKKPL MEGLEKIADLLQVIKYE >gi|283510544|gb|ACQH01000075.1| GENE 2 760 - 1398 924 212 aa, chain - ## HITS:1 COG:lin1945 KEGG:ns NR:ns ## COG: lin1945 COG0461 # Protein_GI_number: 16801011 # Func_class: F Nucleotide transport and metabolism # Function: Orotate phosphoribosyltransferase # Organism: Listeria innocua # 3 208 2 207 209 222 53.0 4e-58 MDNIKKDFAARLLQVNAIKLQPNDPFTWASGWKSPFYCDNRRTLSFPQLRSFIKLELAHA VLEHFPQAEAVAGVATGAIAQGALVADALQLPFVYVRSKPKDHGLENLIEGDLKAGQKVV VIEDLISTGGSSLKAVEAIRKAGCEVVGMVAAYTYGFAVAEEAFKAADVKLVTLTDYEHV VQKAVETGYIAKEDVVLLNEWRLNPAEWNPKK >gi|283510544|gb|ACQH01000075.1| GENE 3 1702 - 4365 2493 887 aa, chain + ## HITS:1 COG:no KEGG:ZPR_0173 NR:ns ## KEGG: ZPR_0173 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 20 338 2 318 892 139 29.0 5e-31 MVLMLRMPTTGRFRAKWLAIRLIFALLMLCGTAAVRAQNIVEGTVLTDSLSPNARCTVTL YDKDDKQWQALACDEKGKFRVENVPNGPFRVEAKAIGHVAQSKKVFLFGNTKMQVDFTLK SEAINLKEVEVKSSPMIVNGDTTTYIVSRFTTGREKVLKDVVNNLPGVRYDEKDNSLTVN GKRVSKVLVQGEDLYQGNVSTPMENLPASGVDNLKVIDNYSEYNVFSGFKSSNQTVVDLS MNKSMHGRLKGQAEAQGGLLNKAMAKGSAMLLGKRMMGNIIAGGNNTGEQLMKPADIVNI SGGYSELLSGDDPMASMRKTFADYAMMLDYRENAYRRNNGLVSLNTVSTPGKNMKILWNG MVGADRYRMKSTDHYAYHGGLAYTDSTLELQDKRHFLSNIKFKLANQKNMELTLTNRSYF GTTEGNTDRQLQSVGLNEQTKGRLATINNALVWLYRHGKNVFSYKADLNFLFSNNRYDFS 
SQPTPLFKSPALHSEMNIRRLEFTGSAQYVRRFADAYFLRIALRHGLETFSTTIKGDSAG MDNPALKMPYRFYDNGAELSLNKDQGKFRFGIGGIFHHFSLHSSRPLSFGVSPKMAVSPL FNISYRVSMMHYLELRLKSTYQLMGAGAMFDGPYIADFRSVSYAQIENALAHTTELTAMQ VYGNLFKGITLFNMVSLSYNHNALAYDYSMADGLVSHVVSRNASGGKGLSLVSMIEKKLV STPLNFRLFGGLSLARSPFFYQGVAQTSDSRSFNVNLNVKTYFKRGFNGELFAAAAWAND KSKLLRSHQNDYRLEGELGYIANKWQASVRLGQRWRKVTGSGTDYTALRFGMEYDLNKHI RLKASAQDMLNLRERLMQERVVDNYYVSTRNIHYMPGNVMIGAMVTF >gi|283510544|gb|ACQH01000075.1| GENE 4 4415 - 5134 868 239 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929011|ref|ZP_06422857.1| ## NR: gi|288929011|ref|ZP_06422857.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. F0108] # 1 239 1 239 239 455 100.0 1e-126 MRRILLSLAAMLFAAIPIAVNAQQLTVEYETRVNANSPDAMKESGLPDEMRKALISAVAD VKAYYIMCIDGGQVEGRAQAAKEKQMINFMGQTIDANEFVKNQLENIIYYNKDTKRQVSK VVVMGKHYLLVDPLKPDTFDVKPNEKKDILGYECMKAVSKDGKQTIWFTKHIPVDAGPLY TDVGGLILEAELKDQIFVAKKISMSVDHALKEPTGGEEMTEKAFKEKMQKYVEMMRRGQ >gi|283510544|gb|ACQH01000075.1| GENE 5 5349 - 7166 2285 605 aa, chain + ## HITS:1 COG:TP0831 KEGG:ns NR:ns ## COG: TP0831 COG0018 # Protein_GI_number: 15639817 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Arginyl-tRNA synthetase # Organism: Treponema pallidum # 39 605 43 589 589 448 41.0 1e-125 MNIESEIIATVVQAVKECFGQDVPTTMVQLQKTKAEFEGNLTLVVFPFLKLSRLKPEDTA QQLGDYLAKHCKVVQSFNVVKGFLNLTIAPAAWISLLNRIDSEPRFGEKAVNEQSPLVMV EYSSPNTNKPLHLGHVRNNLLGWSLAQIMEANGNRVVKTNIVNDRGIHICKSMLAWKKWG NGATPESTGKKGDHLIGDYYVAFDQHYRAELAELTAKFRAEGMAAEEAEKRAKEDSPLMK EAHDMLVRWEQGDEEVRALWKKMNDWVYQGFDETYKAMGVGFDKIYYESETYLEGKAKVE EGLAKELFFRKPDGSVWADLSDEGLDQKLLLRADGTSVYMTQDIGTAALRFKDYPIDKMI YVVGNEQNYHFQVLSILLDRLGFKWGKELVHFSYGMVELPNGKMKSREGTVVDADELMEE MVSAARRTSEELGKFADMTENERNEIARIVGMGALKYFILKVDARKNMLFNPEESIDFNG NTGPFIQYTYARIRSIMRKAEAEGIVLPSVLPDTLPLNEKEVQLIQKLNSFETVVEQAGK DYSPSGIANYCYELTKDFNQFYHDYSILNAESAEAKTLRLALAKNVAKTIKNGMQLLGIE VPERM >gi|283510544|gb|ACQH01000075.1| GENE 6 7170 - 7640 362 156 aa, chain + ## HITS:1 COG:BH0863 KEGG:ns NR:ns ## COG: 
BH0863 COG3341 # Protein_GI_number: 15613426 # Func_class: R General function prediction only # Function: Predicted double-stranded RNA/RNA-DNA hybrid binding protein # Organism: Bacillus halodurans # 22 156 62 196 196 126 47.0 2e-29 MASQLVNPPSHRTDTVLPLPAEVKANAWAVDAACSGNPGPMEYQGVDLQTGYPVFHFGPV HGTNNIGEFLGIVHALALMQQQGITDKVIYSDSYNAILWVRKKKCKTKLERNAKTEPLFQ VIERAERWLQSHEINTQVIKWETSKWGEIPADFGRK >gi|283510544|gb|ACQH01000075.1| GENE 7 7749 - 8999 1115 416 aa, chain + ## HITS:1 COG:no KEGG:BF0423 NR:ns ## KEGG: BF0423 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 20 416 23 421 421 391 53.0 1e-107 MKKTFIALSLLAMPTMMWAQGDNEINIDAQVRARAEYRNGYMQLRPKSAQPASFIEERAR LGVGFERGDGLSAKLSVQQIGVWGQYRQIETDGSIMMNEAWGQYKFGDGFFAKLGRQILS YDDERIFGALDWNVAGRSHDALKLGYENDMHKLHLILAYNQSGENLLGTKFAPGQPYKHM QTFWYHYGNEDNPFGISALLSNLGWEYGPGAEGDTKYMQTLGTYVTYRPEKLDLQASAYY QMGKRPTGDTKVSAWMASVNAGFMPNEQWKFNLGYDYLSGEDGKGDTYSAFDPLYGTHHK FYGAMDYFYVREHRSLGLQDIQLGVTFTPTEKLALKANGHYFMDAVKTDGKDQGLGGEVD LQLDYEITQDVKLTVGYSTMLPTKTMDGADAENYKRWQDWAFVSLNINPRILSFKW >gi|283510544|gb|ACQH01000075.1| GENE 8 9169 - 9432 103 87 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVVALCCFMGWCYSENGLVVKPSCNFPPSHAHNGIAFGKLVLEEYKTCALGKMQDVRNMA SINPYAEEKEHFLLVYDALMALPTQFC >gi|283510544|gb|ACQH01000075.1| GENE 9 11364 - 11852 293 162 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929016|ref|ZP_06422862.1| ## NR: gi|288929016|ref|ZP_06422862.1| hypothetical protein HMPREF0670_01756 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 162 41 202 202 343 100.0 3e-93 MLTDSILRLQTPRYQVISAKDNIKLHWTGKTFRPPKGCEGDRHLVTSSLEKVDNKLFTHP LSIIIKDIEVNNNGGEQEQMTILAVKSADKGWKISLSHTGWFMLKEYIGPAFGTTFLVDN CIYIIIFTPMWKTEWNGLSAEIYRKALYHDVVKPFIGKLLPQ >gi|283510544|gb|ACQH01000075.1| GENE 10 12124 - 13170 1039 348 aa, chain - ## HITS:1 COG:BH1554_1 KEGG:ns NR:ns ## COG: BH1554_1 COG0117 # Protein_GI_number: 15614117 # Func_class: H Coenzyme transport and metabolism # Function: Pyrimidine deaminase # Organism: Bacillus halodurans # 41 186 2 143 143 152 55.0 8e-37 MDKSNVNDTEKWLLQAENTPSTPDNMGPTTTQTHATATQTTDERYMRRCLQLARCGLLGA KPNPMVGAVIVYNDRIIGEGYHIHCGEGHAEVNAFAAIRPEDEPFLPHSTLYVSLEPCSH YGKTPPCADLIIRKGVPRVVVGCVDPFSKVQGRGIEKLRQAGIEVVVGVLEKACLALNRR FIVFQREQRPYVIMKWAQTANGFIDDHFHALAISTPFTQMLSHKLRAECDAILVGKTTDE REHPALNTRHWWGENPRRLVLHRHQPITQLLAQLYQDNVQTLIVEGGAIVHQAFIDANCC DEIHRETAPFTVFDGTRTVAIPPKFQLIKQEIYDGNVLEEFVSEYASL >gi|283510544|gb|ACQH01000075.1| GENE 11 13192 - 14076 620 294 aa, chain + ## HITS:1 COG:SP1021 KEGG:ns NR:ns ## COG: SP1021 COG2890 # Protein_GI_number: 15900892 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methylase of polypeptide chain release factors # Organism: Streptococcus pneumoniae TIGR4 # 13 277 14 272 279 151 36.0 1e-36 MTYPELWRRLLPLYDEGEAKAIVRMVLDVGFGLSLADVYAGKVTQLSQDDDHRLRKMMDK LVQGVPVQYVLGRADFAGRTFDVAQGVLIPRPETEELCVWIVQTCSNLPVCGTSPTLLDV GTGSGCIATTLALDLPTWRVSAIDISQTALDIAARNAQKLGAEVRFALQDALCMPPDADL WDVIVSNPPYIMQREAQQMLPNVLQNEPHSALFVPNENPLLFYESIARYALHALKRGGKL FFEINPLCHEAMQRMLACMGWQRIETRNDAFGKQRMMSAIRQGRSMEGVKELRS >gi|283510544|gb|ACQH01000075.1| GENE 12 14256 - 14744 628 162 aa, chain + ## HITS:1 COG:no KEGG:PRU_2397 NR:ns ## KEGG: PRU_2397 # Name: not_defined # Def: regulatory protein RecX # Organism: P.ruminicola # Pathway: not_defined # 6 158 3 154 163 143 56.0 3e-33 MEDKQKQITEEQARTRLAALCAKAEHCTGEMREKMWRWGIAEDAQQRIVDYLVTNKYVDD ERFCRMFVRDKITYNRWGRRKVEQALIAKRISSDVYKPLLDEVEPEDFTAVLRPLLDSKR RTLKAATDYELNMKLIKFALGRGFDMETIRKCLTDVDEFAED 
>gi|283510544|gb|ACQH01000075.1| GENE 13 14719 - 15435 477 238 aa, chain + ## HITS:1 COG:PA0489 KEGG:ns NR:ns ## COG: PA0489 COG1040 # Protein_GI_number: 15595686 # Func_class: R General function prediction only # Function: Predicted amidophosphoribosyltransferases # Organism: Pseudomonas aeruginosa # 1 229 1 231 241 80 30.0 2e-15 MSMSLPKISLWERIVSLLAPRSCAVCGTRLEPQEDIVCACCNLHLPRTNYAVAPYDNEMA QLFWGRVPVERCAALLHYQAAAPASALLYKLKYGNRPDLGVELGRLMGKELLPTGFFQGI DFIVPVPLARQRLRQRGYNQSEMLAQGIAQVTGLKVRRGLIRRVVNTETQTHKDRWARTD NVNQAFKLVGRMPETGCHVLFVDDVVTTGATLCACMECLKGVANVRMSVLSVGVAHTR >gi|283510544|gb|ACQH01000075.1| GENE 14 15462 - 15662 82 66 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFDYVHKKSALSDVPDSALCGKGNLLVALYVFSCQSYFYTFRRQSYAFSRNFTIPIIRYF DFGKGT >gi|283510544|gb|ACQH01000075.1| GENE 15 15700 - 16314 605 204 aa, chain + ## HITS:1 COG:CAC0821 KEGG:ns NR:ns ## COG: CAC0821 COG1309 # Protein_GI_number: 15894108 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 125 1 121 200 65 33.0 7e-11 MQVKKDDVQGRIVSVAHEEFVRNGVKRTSMRTIAKLSGVSLGNIYKYFDSKDKLLCAVLA PLLKALDEYLLFHNNSEHLVIDDLSVEQMQTQMMNRLLDLIGRFRTELKLLFCNAEGTSL EGYRERLTAQQAQLGMEYLQMLKQKYPRLHVEVSPFFMQIVCSMWSNVLFEIVRHEGLTE ADITKFFAEYFDFSTAGWRKLMKV >gi|283510544|gb|ACQH01000075.1| GENE 16 16359 - 18719 2305 786 aa, chain + ## HITS:1 COG:PH0287 KEGG:ns NR:ns ## COG: PH0287 COG1033 # Protein_GI_number: 14590211 # Func_class: R General function prediction only # Function: Predicted exporters of the RND superfamily # Organism: Pyrococcus horikoshii # 15 771 50 775 787 99 21.0 4e-20 MKIESVNKSFKRAAICTLRHRWLVLAAFAALMVFSVVGTKRIVMKTSFDDYFLEDDPVLV KTNEFKSIFGNDYYAAVLVKNKDVFSHRSLSLIRELSNELLDSLSYAEKITSLTDLEFTV GNDEGLAIEQIVPDEIPTNADSLAAIRQRAFSKPHLARKLVSKDGTMTWIMVKLRTFPPE QEWKKTNNTSPDILTGKELAHIISKPKYRELSPNASGMPYLTHEKVTFLQGEMGRLMLFA FIASILIMFFVTRSLRGVVAPLLTSVVGILLGYGVIGWAGIYVDQPTVMIAVILAFACSI AYNIHLYNFFKTRFVQTGRRRTSVVEAVGSTGWGVLLSGLTTIAAMATFLAMKIVPMQAI 
GLNTSLCLLGVLGTCLFITPVLLSFGRNRKPHPHMNHSIEGRAGTLFERLGNFVIRRHRA VVIVSVVFTAFCAVGLFSIEPAFDVERTMGRKVPYVARFLDLCQTELGSMYSYDVMIVLP KAGDAKLPSNLEKLDKLSRIAQTYKLTKRHNSILDIIKDMNCTLNANKDSAYAIPHNPNM VAQLLLLYENAGGTEAEYWMDYDYKRLRLQIELNSYNSNEAEKEMQLIEQEAQKLFPDAH VSAVGNIPQFTAMQQYVERGQMWSMLLSVLVIGVILVLVFGNWKVGLIGMIPNIAPAIIV GGMMGWLDYPLDMMTASLIPMVLGIAVDDTIHFINHGHLAYNRYGDYDLAIKRTFRTEGL AIVMSTLIISVAFAGFTVSNALQLRNWGILAIAGMVSALLADLFITPILFKYLHIFGKAQ NGKANP >gi|283510544|gb|ACQH01000075.1| GENE 17 18760 - 19536 809 258 aa, chain + ## HITS:1 COG:no KEGG:Fisuc_1740 NR:ns ## KEGG: Fisuc_1740 # Name: not_defined # Def: hypothetical protein # Organism: F.succinogenes # Pathway: not_defined # 7 258 8 257 257 306 59.0 7e-82 MKTIFLSAILLACSAISASAAGLTGRDVMLKVKNRPDGDTRYAALTMTLIQKNGNRRERK LVSWAMDVGKDSKRVMFFTYPGDVKGTGFLTWDYDNAKREDDRWLYLPAMKKTRRISGKS SKTDYFMGSDFTYDDMSSRNVDEETHTLLREEMLGGQKCWVVQSVPNDKHGIYSKRITWI RQDCAVVVKAEYYDKLDKLHRSLVVSNVSKVQGFWVMGRMEMSNVQTGHKTILQMSDQKF DLKIDAALFTVARLEKGI >gi|283510544|gb|ACQH01000075.1| GENE 18 19533 - 20816 1060 427 aa, chain + ## HITS:1 COG:no KEGG:Fisuc_1741 NR:ns ## KEGG: Fisuc_1741 # Name: not_defined # Def: hypothetical protein # Organism: F.succinogenes # Pathway: not_defined # 40 427 23 407 407 343 44.0 9e-93 MKTARLMLLLLLALAVSAQAGAQDEDTATDTLLVGETPADELQVQVKGFLDTYHAARTEG KGEWLASRTRARGELTLEKGGARLFVSMNAIYNALLKEQTGITLREAYLSYANAHVDLRA GRQIIIWGVADALRLTDQISPMDYTEFLAQDYDDIRTPVNAFRLRLVRQAFNVELVCIPV SDFFIQPTDERNPWATHIDAPMPYTFDLDSRKPKTRLRNVEFGGRLSVNLSGIDFSFCGL HTWNKMPAFSYAVDPAGAAMTVVGHYRRLTMFGTDVSFPIGRFVVRGELAANLNEVQNAE FGSEVQGRNVFNALLGVDWYVGNDWMLSAQYAHKHVGGSLAGLSVLRNTGVATVRVSKEL FRNTLALSSFAYIDVSHGGVFNRLSCKYDLNDQISLTAGYDYFHADAGMFAMYSRNSEVW VKLKYGF >gi|283510544|gb|ACQH01000075.1| GENE 19 21047 - 21253 59 68 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKASVDLMGTPVGDSATQIQFCLTITRKLDEPVSLMNSLYYKRTNLQLHTLITPSTSPPH TLVNPLTC >gi|283510544|gb|ACQH01000075.1| GENE 20 22377 - 23543 1211 388 aa, chain + ## HITS:1 COG:RSp1296 KEGG:ns NR:ns ## COG: RSp1296 COG4260 # Protein_GI_number: 
17549515 # Func_class: S Function unknown # Function: Putative virion core protein (lumpy skin disease virus) # Organism: Ralstonia solanacearum # 1 259 1 258 343 217 41.0 2e-56 MGLFNKLRNEFIDIIEWTDNTTDTMIWRFPRYQSEIKNGAQLNVRESQVAVLVNEGQFAD IYQPGRHVLNTNNMPILSTIMGWKYGFNSPFKVDVYFVNTKQFLNVKWGTANPIMLRDPE FGPIRMRAFGSYCFRVNADPRKFITNVAGTNGNFTTEGITQQLRNFVITKFTDHLGESKI AALDLAGNLNEFSASLTEALKPDFEEYGLELTNFLVENISLPEAVEKALDKRTSMGVIGN MGAYTQMQFADSLVEGAANGGGNVAGNAMGMGMGFAMANQMTNQMANQMAGQMAQPGAQQ APQQPAVGAGAPGMPPPPPPQPMFHLSVNGQQQGPFGMPQLQQMAQNGQLTRDTYVWANG MASWEFAKNVPALAALFGATPPPPPPGT >gi|283510544|gb|ACQH01000075.1| GENE 21 23608 - 24687 1070 359 aa, chain + ## HITS:1 COG:no KEGG:Glov_1382 NR:ns ## KEGG: Glov_1382 # Name: not_defined # Def: hypothetical protein # Organism: G.lovleyi # Pathway: not_defined # 10 353 15 363 370 275 38.0 2e-72 MDLSINDALQTKCPACGGMMAYSPKRKMLECVYCGSTKELDLTPAAVKENSYDKWAEQSD ENLNEQTISTVEVKCQQCGAFTTLPPEKSSATCAFCGTPLIMQDAQERRFWQPNYILPFE FAKDKCNDSFNGWLSRKWFAPSKLKKGGVQTDGFKGVYLPFWTYDADTVTRYTGERGIDR REESQDEKGNTTSRTVTDWKTTRGTVQVGFDDVIVPATKALPNNILNDLVLWDLNKLVPY NPEFLAGFITDLYTIDFREGLTAAKGKMENEIEERVKEDIGGDRQRITEMYTMYNNVMFK LILLPLWISVFHFNGKAYQFVVNGRTGKITGNYPLDKLKVTLTILACVVVALLFYFWLC >gi|283510544|gb|ACQH01000075.1| GENE 22 24925 - 25848 784 307 aa, chain + ## HITS:1 COG:no KEGG:SYNPCC7002_G0157 NR:ns ## KEGG: SYNPCC7002_G0157 # Name: not_defined # Def: hypothetical protein # Organism: Synechococcus_PCC7002 # Pathway: not_defined # 114 301 120 309 309 128 38.0 3e-28 MKRMLVVAALVVTFGTLFNACESRGGGTGGSDRGDSIGRTNGVQPTDSFPWDFPRNFTLK DVEVGQTVLSPAAFYGGALQRGEDLTATLLPIYCFTIKEVGQNTITVAGMSEEIKVPQAL VMPLPKGEKAQNGDVLLTWWQSGQGLQRAIVVDDTKQLTPKVCYLDLDYKADGRGFANSH ANEELRPNSFKVLKSGEWMSGASVAVNDNGKWRAAIIVNIKDDKLLLTDAGNRIWVAERE KCQLIPFNVDYAVGDSVFAKIGNTFVPDCKVVKVDKRIGRIRVEKNGRTALLSILEVTKE LKDLPKR >gi|283510544|gb|ACQH01000075.1| GENE 23 26051 - 27664 1561 537 aa, chain + ## HITS:1 COG:no KEGG:Ccel_2393 NR:ns ## KEGG: Ccel_2393 # Name: not_defined # Def: S-layer domain protein # Organism: 
C.cellulolyticum # Pathway: not_defined # 21 250 232 455 522 110 30.0 1e-22 MKKLSLLTLLLMLAGMLAAKEYKVSTAHDFIKALGSDRTITVSGVINLSDVMEFDDLAQK ANMSDLSAGSNPTKQVNREGMFDGHQLVLDQCKNLIIQGTKGAALIVRPRYAYVISFRSA HNITLRNLTLGHTDEGYCEGGVLEFDGCEGVQIDKCDLYGCGTEGITATDTREVKFTKSI IRDCSYGIMTLRNCFNVTFTDSDFYRCREFTLITLANATNTSFSRCRFAQNQGPLFYVEK TNLSLRDCEIHHSGDLGNIDVYQHGATSYSRDNNELSPRDVGPEGRPNLKASREQENVDS NADNEDGEDEADCECGEEEDNLSDEAQWGSELIEGWKDMGVELPASVKTDNVKDLTIAVC KAWQGLEGSPQKFVVRFSKKQGRYERFNFEHPQSPVNAKSSTYAVSDANYGIVVYNTIGG WFYVECENGQKLDAALWKRDNGHKLLIVTLTYESDTFSSQVCMAYDYDPTTRTLTPDLAV HKRLTQLEGTLCDLPRKGKNIGVRNIGSDTMRFYYLKWNGNGFSAPVKVATNTYQRP >gi|283510544|gb|ACQH01000075.1| GENE 24 27661 - 27931 78 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260910396|ref|ZP_05917068.1| ## NR: gi|260910396|ref|ZP_05917068.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. F0295] # 7 89 4 86 250 144 84.0 1e-33 MTKNLMQERIQAINSVLSAYFADKTNPRQVPALELMGLFIDKGIFKKDHRNGLPIRKILR KLNNEGRLHDIPYAHEELKQKNIYWTFVDV Prediction of potential genes in microbial genomes Time: Sat May 28 01:44:30 2011 Seq name: gi|283510543|gb|ACQH01000076.1| Prevotella sp. oral taxon 317 str. F0108 cont2.76, whole genome shotgun sequence Length of sequence - 49459 bp Number of predicted genes - 35, with homology - 33 Number of transcription units - 25, operones - 5 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 62 - 1627 1497 ## gi|288929029|ref|ZP_06422875.1| hypothetical protein HMPREF0670_01769 2 1 Op 2 . - CDS 1674 - 2630 970 ## Cpin_1183 hypothetical protein 3 1 Op 3 . - CDS 2642 - 5401 2744 ## COG1404 Subtilisin-like serine proteases 4 1 Op 4 . - CDS 5435 - 7039 1651 ## gi|288929032|ref|ZP_06422878.1| lipoprotein 5 1 Op 5 . - CDS 7079 - 8698 1495 ## BF3101 hypothetical protein - Prom 8741 - 8800 4.0 - Term 9123 - 9173 15.0 6 2 Tu 1 . 
- CDS 9239 - 10600 1919 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) - Prom 10620 - 10679 6.3 + Prom 10921 - 10980 8.6 7 3 Tu 1 . + CDS 11113 - 11790 729 ## gi|288929035|ref|ZP_06422881.1| hypothetical protein HMPREF0670_01775 + Term 11814 - 11858 -0.6 + Prom 13202 - 13261 4.4 8 4 Tu 1 . + CDS 13399 - 13584 59 ## + Prom 13676 - 13735 5.0 9 5 Tu 1 . + CDS 13758 - 16115 1464 ## BVU_1144 hypothetical protein 10 6 Tu 1 . - CDS 16386 - 17393 701 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 17452 - 17511 4.9 + Prom 17723 - 17782 1.6 11 7 Tu 1 . + CDS 17961 - 18305 118 ## COG3344 Retron-type reverse transcriptase - Term 18448 - 18501 11.5 12 8 Op 1 . - CDS 18528 - 20126 1257 ## gi|288929039|ref|ZP_06422885.1| lipoprotein - Prom 20178 - 20237 1.7 13 8 Op 2 . - CDS 20245 - 20454 257 ## gi|288929040|ref|ZP_06422886.1| hypothetical protein HMPREF0670_01780 - Prom 20480 - 20539 2.5 14 9 Tu 1 . - CDS 21350 - 23668 1326 ## BVU_0280 hypothetical protein - Prom 23691 - 23750 3.2 + Prom 23904 - 23963 1.8 15 10 Tu 1 . + CDS 24029 - 24574 499 ## Cpin_5158 transcriptional regulator, LuxR family 16 11 Tu 1 . - CDS 24413 - 24712 62 ## - Prom 24913 - 24972 6.5 - Term 25676 - 25715 4.2 17 12 Tu 1 . - CDS 25941 - 26150 72 ## gi|288929043|ref|ZP_06422889.1| hypothetical protein HMPREF0670_01783 - Prom 26182 - 26241 2.6 + Prom 26126 - 26185 4.8 18 13 Tu 1 . + CDS 26216 - 26458 60 ## gi|288929044|ref|ZP_06422890.1| hypothetical protein HMPREF0670_01784 + Prom 26530 - 26589 5.2 19 14 Tu 1 . + CDS 26650 - 28512 1859 ## COG3669 Alpha-L-fucosidase 20 15 Tu 1 . + CDS 28949 - 30271 1378 ## COG0477 Permeases of the major facilitator superfamily + Prom 30391 - 30450 3.6 21 16 Tu 1 . + CDS 30521 - 31495 1109 ## COG2152 Predicted glycosylase + Prom 31561 - 31620 4.5 22 17 Tu 1 . + CDS 31799 - 33427 1565 ## COG3525 N-acetyl-beta-hexosaminidase + Term 33438 - 33507 20.3 + Prom 33479 - 33538 3.8 23 18 Tu 1 . 
+ CDS 33567 - 35627 2305 ## COG3525 N-acetyl-beta-hexosaminidase 24 19 Tu 1 . + CDS 36169 - 37788 1550 ## COG4409 Neuraminidase (sialidase) + Term 37821 - 37864 8.2 - Term 37807 - 37852 10.2 25 20 Op 1 . - CDS 37882 - 38403 346 ## gi|288929051|ref|ZP_06422897.1| acetyltransferase, GNAT family - Term 38414 - 38458 9.1 26 20 Op 2 . - CDS 38478 - 38645 83 ## gi|260910379|ref|ZP_05917051.1| capsular polysaccharide biosynthesis protein - Prom 38669 - 38728 4.2 27 21 Tu 1 . + CDS 38831 - 39232 97 ## gi|260910377|ref|ZP_05917049.1| conserved hypothetical protein + Term 39335 - 39396 11.2 - Term 39917 - 39959 1.4 28 22 Op 1 22/0.000 - CDS 40065 - 41156 1054 ## COG0842 ABC-type multidrug transport system, permease component - Prom 41208 - 41267 3.9 29 22 Op 2 45/0.000 - CDS 41374 - 42474 945 ## COG0842 ABC-type multidrug transport system, permease component 30 22 Op 3 10/0.000 - CDS 42474 - 43934 305 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 31 22 Op 4 . - CDS 43931 - 44824 1017 ## COG0845 Membrane-fusion protein - Prom 45030 - 45089 3.1 32 23 Tu 1 . - CDS 45136 - 46422 1367 ## BF2652 hypothetical protein - Prom 46495 - 46554 3.5 + Prom 46431 - 46490 3.3 33 24 Tu 1 . + CDS 46537 - 47481 810 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 47508 - 47567 1.5 34 25 Op 1 . + CDS 47623 - 47940 206 ## PRU_0962 hypothetical protein 35 25 Op 2 . + CDS 47947 - 48258 344 ## PRU_0963 putative virulence-associated protein + Term 48405 - 48445 5.3 Predicted protein(s) >gi|283510543|gb|ACQH01000076.1| GENE 1 62 - 1627 1497 521 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929029|ref|ZP_06422875.1| ## NR: gi|288929029|ref|ZP_06422875.1| hypothetical protein HMPREF0670_01769 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 521 1 521 521 970 100.0 0 MKRTLYAICALAMSLTAFAQKLDTPKGKLIDNMYRTSDSWVKRNWTSTEPGRYEGLVSKI VVGDDNCLYVYNPLSGLDSKSWLKLDKIADGKYRANLPQVIYKDNNGDDDDDEGGSSERI FKLNRMSAIADNQYKVVVADKNFMDFSWDGKTLKMLGTGTKNDMLGAVFNDKTWDSQYGD WNVTIETFDEKPLTPPASAPKKQYMLTSKTETSPRIVEVATHNNDIYVKGIFANEKLANL WVKLTKEGNKAVLPTNQYLGTAVKTYFKRFSNDMAQYHAYAAAFNDENTVADKLEFNIDP ATGALTNDKILKVVLGKSSSTNMPKEDFGTLQNLVLTPYEQKAGKPEKPTLHYCSAVPSY DYSTTTITLAFYVRSADINGNYLDPNKMYYNVYINDNQEPFKFTRAKFHNIEKDMTDIPF NYQDKRNDDIKVADNQRILHFYDETIKKLSVVMVYEAEDGKKYSSEPMTTQVVSTGIDNA TIDNAPAQQYYSVDGCRRQQLEKGLNIVKYSDGTTKKVLVK >gi|283510543|gb|ACQH01000076.1| GENE 2 1674 - 2630 970 318 aa, chain - ## HITS:1 COG:no KEGG:Cpin_1183 NR:ns ## KEGG: Cpin_1183 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 6 305 19 365 394 65 24.0 4e-09 MNSKHVIIIIACAALLGVAQANAQGQDLSVLTANPDARTAAMGNAAVAADGMYLYNNPSA FLSANKRFSADASASIYEKTEGYEGTFGLYTAALGYKFARRHAAFAGFRYAGGLKLKGYD MLGNPTKDYEPYNWTIDLGYAYMLGNGFSAYAMGNILFSHLSKNAVGGAFTIGASYQKSS TTAGNKPTHLMLDAKVGAIGPQLDYGNGNKNTMPTYVAVGGALTVDVANKHQVAGAWSTR YFFRPSEYKVLMLGGGLEYTYNKDISLRAGYEYGDRNLSHFTMGAGFKYAGLRLNGAYML KTADAGSSYFSVGLGYDF >gi|283510543|gb|ACQH01000076.1| GENE 3 2642 - 5401 2744 919 aa, chain - ## HITS:1 COG:alr1615_2 KEGG:ns NR:ns ## COG: alr1615_2 COG1404 # Protein_GI_number: 17229107 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Nostoc sp. 
PCC 7120 # 197 510 37 313 416 142 37.0 4e-33 MKKTIIYMCACAISGLMLATSCQESMEMENPNSKTRAVAIDKDLFAVRGRINVKLEKDTN QALPTSAKGNVEMQSVPSAMASAMKYAGAYKMERVFKPAGIYEKRTIAEGLDRWYTIYFD ESKDVAEVLQQFNKAAGVEYAERVLPIARPKFTAKPYTGPAPQTRNQPTASAFNDPLLAK QWHYYNDGSVSPHAKKGADCNLKPVWEKYTTGKSNVIVAVVDGGIDITHEDLVDNLYINE KEKNGQAGVDDDGNGFVDDVYGYNFVEAKDVVGGTIQPDNDGHGTHVAGTVAARNNNGKG VAGVAGGNGTPDSGVRLMSCQIFRGKDEQGDAAAAIKYAADNGAVICQNSWGYSSTSGVT AMPKLLKEAVDYFIKMAGCDENGQQRANSPMKGGVVVFAAGNENKEFAAYPACYPPAVSV SAMAWNFAKASFSNYARWITIMAPGGDQDTFGTEGGILSTVPKSKVPSGYAYFQGTSMAC PHVSGIAALIASYFGRQGFTNDELKSRLITAYRPFNIDELNPAYKGKLGKGYIDAEAAFE SDTKIAPEKVGTLTLTPDFVDITAEWSIAKDEDKTAAFYRLYISPNDLTAENIKDMSFKE INGMGHSLGEKLRFTFNNLQDNKAYSIAVVAVDRWGNLSEPAIQKCTTKLNHAPEVTNFP EEVIELNNNERKTFSFNVADPDGHNWDIRAIGETKGVSYSINQATVTVSIVPVLQAGSYT CTFVLTDDLGAKSEKSFTFKVIPYMPPKLEQPFAHYIIGLDEGPLNISLKGHYTGSGNQL SYKANVANGGIASVQVSNDQLQIKPLARGVTRVSVLASDGHQTSSDGSFQIRVVERKSAP VYAVYPIPAKKDINALLNPEVTQAEFVISSTVGEQLMSATVTPDKNNVAKLDLSKLRPGT YKLTVHTSKGNHTQMFIKR >gi|283510543|gb|ACQH01000076.1| GENE 4 5435 - 7039 1651 534 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929032|ref|ZP_06422878.1| ## NR: gi|288929032|ref|ZP_06422878.1| lipoprotein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 534 11 544 544 1082 100.0 0 MRIINLYKLLQLSAVCLLVVMVAGCTKSDDFEQPQLSVSDKELKFANQVGETTITVTTNC KEWVATTPKQWVHLTQSGNEIVVKVDANTTGTERSSYVLVDGGLAVEKIMVKQSAADISL NIANGELVLPQAGGTTTVDVNIESGLYDLQQSEQPEWLQIVRKKHALKFISKPNYDTTER TIKLTISYAGKNNEVVVRQPGVSTFVLACNPGNPFSLHKMMDFEYRRGGMLKEYGAPDDA NGIYEESYFFKTTSPLFKDVVYVHDTKHFVPTRIYTRSLEQAGVDAVKSQAFQDFVKANG YVRDEKDPNHYINVKEQMTMDVDILESNHSVVLFFYQIHTQDRDYETFKTLDLGPLALLN KADKKVSDVEDYEASLHSEEITRKVSKNRDVEAIAYKTNDPLLVARTYFFFTRGGDNPAP QDMVGSVEQYSLSFSQPNLGVWQYGREWFVTREFDKLLTSNNFEFVGYNGKHHVYARRSD YLTLAISGGKFADVNGGQPVMQISVLYKKNVFAGSKQQRMQKIESMLKTHRPSK >gi|283510543|gb|ACQH01000076.1| GENE 5 7079 - 8698 1495 539 aa, chain - ## HITS:1 COG:no KEGG:BF3101 NR:ns ## KEGG: BF3101 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 3 204 1 209 939 66 24.0 3e-09 MDLKKTFYSLILGGFVLCLLSCAKEDEFELPTLVLSENSVSFERGVGERNISVTTNQSNW VASSPQEGEWLSLVQDGNVLKVKVAENKMGTERTSYVLVNANGATGKVEVRQSAADVTLD VVPTAIYLPQMGGEKIVDVTTNTSVYDVTPSEEVKWLKIIKSDEEIKLVAERNDTEQDRQ VKLYAKSGHVTREIVVSQSGIQRYILPINPGAPQNVHKIMEYELGRGSYLREYQSAMPAY GLEETYTFITPSPVFTLIQYCSPDGVTPSQIISIGDGSRAIAAVKDKAFEKFLTNNGYVR SNSESDREYVNEKDLLSLKVYVSEKENNEGVNLTFKPIVKQVGEYKTFDKVPYYPLELLQ KEAVKVAQVEQYEKNAGSKEEERTMNEHKQTEVSQLQYTLKGTPGPKDAYGRIHIFYTTD KDGKAPDKLGSVQIGALLFKDVSLGLWKYGNKWLVTNELQKKLGEEGFSFVRTAGTTHFF ARESDHLFIAITRVADNNTPVMALLYNYDASASGAGSKALKAQERMVHNMLKAKDALKF >gi|283510543|gb|ACQH01000076.1| GENE 6 9239 - 10600 1919 453 aa, chain - ## HITS:1 COG:RC1306 KEGG:ns NR:ns ## COG: RC1306 COG0544 # Protein_GI_number: 15893229 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Rickettsia conorii # 17 453 17 441 445 70 21.0 5e-12 MKISFENADKVNGLLTITVEEADYNEKVEKTLKDYRKKANIPGFRPGQAPLSLIKKQIGE HVKAEEIQKLVSDTIFNYLKENNIQFLGEPMPSAKQEEQDLSKPAPYTFVFDIAVAPEIK LELTDKDKLDYYEITVDDKLVDEQVDLFASRAGKYEQVEEYQAKDMLKGDMKELNADGSE 
KEGGIVKEGAVLMPEYIKVEDQKKLFDGVKRGEVVVFNPKKAYPDSEIEVSSMLGISKEQ AANVDSDFAFHVLEITRYAKAAVNQELFDAIYGEGAVKDEADFRSKIAEGIANQLVSNSD FKLLQDLRAYCEKKVGKLVFPEEMLKRIMQKNVKDEKEVDEKFEASIEELKWHLIKEHLV KANEIKVSDDDVLDAARLQARIQFAQYGMNNLPDETVDNYAREMLKNRETLDGFVDRAVE NKLVEAMKKVVKLKNKKVSLDDFNKMMEEDAKK >gi|283510543|gb|ACQH01000076.1| GENE 7 11113 - 11790 729 225 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929035|ref|ZP_06422881.1| ## NR: gi|288929035|ref|ZP_06422881.1| hypothetical protein HMPREF0670_01775 [Prevotella sp. oral taxon 317 str. F0108] # 1 225 1 225 225 441 100.0 1e-122 MKSKLIVTLLLSMVIVSSTWAKERPEVKCVLTLNDGKTVEGWFLKEGFSTYGPDRLKNVD EITITPTKDGKEGTTYKADDVKVLKCIYEKEGQQKEYRSLYALKPMTRPLNLKHSNHKFF WMVGYQGKKVIGFISDASLIFHASPTQRQTSKTAAYSYCVEGDEVVVTYYAPSGGIDIGS KSEQKMNFERFPEMVELIKSKDFSVKEMKDKPFAMLETLDQYLAK >gi|283510543|gb|ACQH01000076.1| GENE 8 13399 - 13584 59 61 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLKREQKLQIKVFRRLFPINISFYQKGENDESFWGCSYIKTLMAYRDLDNAARESKALFP V >gi|283510543|gb|ACQH01000076.1| GENE 9 13758 - 16115 1464 785 aa, chain + ## HITS:1 COG:no KEGG:BVU_1144 NR:ns ## KEGG: BVU_1144 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 12 780 13 766 772 187 22.0 2e-45 MKCTKLRIFIRCLCTILFSVVCLQCKAYGLRGRVVDGSGINPLVGVNVKVKTDSVTVLLQ TQTDVDGMFVFANLSQENITIEVLCMGYSPFSTSVRGNNTDLDLGVISLKSSSITLGEVT VTGNRTIELADKYLIFPTEQELKRTSELIELLNELKINMPGLKVNESQQTLYVDGGKPIL MLNGKEVGMEKIRNIDHRRIQRIEYSNVSGIRYLDRGATGVINFIMNGLQDGGLVSVNTN NALTTFRNRANINGTYHVGKSEWSINYNTLWRKSGHEYTDKDERFVGRTDDIIRSQTGLP SAITDFDNSLTMDYTYTHSPSTAFMAMFGLKYHDKDDKEAYRIVERQGANTETFERRYYN ERRLLTPSLDLFFKTAVANNQTLELNAVGTMSAGDYNRGLNDGGMYAQRNVTTNKSINVN GEALYTLTFKRATAKVGLSYTHSHAQNDYSENGGATLIDKLTKDNVYGYAIVTGTLRRIG YNLGFGLKHYRIGDLNKSKNFFNAKTTVSINYPLSKGWSVGYLFILDSSLPPLSSFSDVV QTIDKVSLQVGNMNVQPSVVTRNRLSMNLHLDKFTISLQGNYNYTCKPIVSAWRYVADPS HVYHNMFIKKTENGDYDSRLNVECAIGWQNVFNHFTFQTVMGWDRFRMQGEGYHSQISKP YASVSLSAYWNKLSVYANFDMLPQYSFWGTNVYRGVRYNYVGMRYNMKHWTFGCRIDNPL 
NKGGFVQVSENISNINPTHTEYRIVDLANMVELSMQYRIRYGKEYKKGNRTLMNRSFDSG VNNDY >gi|283510543|gb|ACQH01000076.1| GENE 10 16386 - 17393 701 335 aa, chain - ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 144 334 132 324 331 78 32.0 2e-14 MNEENDVDDLNQEQIIRNHLPKQAYATQNVQEEPDNVLPSKGKTKSRHTLIIGVGVSIVV LTSLLLSIVLGFGMGILNAHREPIVVYQANTKAAQKAVTWQQDNGKPVVVPDKYLQVSHI IKDVDAGIRNVLSTPQNATAEITLGDGTEVTLNADSKLSFPLRFEDETREVQLQGEAYFK VSRDVRHPFVVKTNGITTKVFDAEFNIRAYHKNNMHVTLLQGFVEVCAGSVTKCLRPGED AALVAEKFEVNRVETEGFVAWTLGEFYFDNATLADIGKDIGRWYNVSVVFQDPSKMHIRV GFTAFRDANIEEIVEMLNRLNKAKFTYQDGQITID >gi|283510543|gb|ACQH01000076.1| GENE 11 17961 - 18305 118 114 aa, chain + ## HITS:1 COG:CAC3514 KEGG:ns NR:ns ## COG: CAC3514 COG3344 # Protein_GI_number: 15896751 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Clostridium acetobutylicum # 26 97 352 423 470 63 43.0 9e-11 MNGKRGKQPKIIDKRYNDRWGLKLLTCRSNGWGHAERKQKLEGSIKGWVGYYHLGNMKRF LLEADEWLKCRIRMCIWKSWKKVKTRVENLINCGIKSIRHANGAIPVRLIGAHS >gi|283510543|gb|ACQH01000076.1| GENE 12 18528 - 20126 1257 532 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929039|ref|ZP_06422885.1| ## NR: gi|288929039|ref|ZP_06422885.1| lipoprotein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 532 1 532 532 1053 100.0 0 MKYIHIIALSAVAAVSLVGCNNENPALYTQENKSNDLRGVTVATYMESTSTRTTAEYETT KLAFYWTRGDKVWLHDIGETPEFVQNSGDNIDNQLNASGKDKVRNAGFYFPTPLNKASYP VRYTGYPTPAVDKVTIKAAQNQAKPNKGEHIKTDGDCGVGSFTRGADGNYSFYLKHCASY LTFTPFFSKGFAPSVRLTQIKVSADQAIAGTYDIEDSGLDLDSRPLAEPSNRNITLTLNG GGNDGFTIPAVSTPETNAGIMVVAPGEYTNFTIEYTLYDQETKVRGTVITKLGHVILKEG RNKRLSPDLDVRYYASDQYYMWDAQPGEHYWKGASKQPYLNGKYGDSYAKVGADPRWYNT VAHPGKASRLATNNLNANELMWLAYYGVPHWDSSMWTIMGHLYAGGMWLRNLDVIATTQK TTRERMKAVSPKGIDFTAYAQYDNFGAFVYTNNNIIPHRPRPTNGYFFLPTTGFYNNGRL YHISSRGYIWSSTPRAQSSSKVAQGYNFYISRYEVHTGYGDRWNGIYQHVTR >gi|283510543|gb|ACQH01000076.1| GENE 13 20245 - 20454 257 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929040|ref|ZP_06422886.1| ## NR: gi|288929040|ref|ZP_06422886.1| hypothetical protein HMPREF0670_01780 [Prevotella sp. oral taxon 317 str. F0108] # 1 69 1 69 69 94 100.0 2e-18 MKKIRVSNLAYSKPESYVVKVEAESLLAGSGKHGGTGGHEDAEDDNDEGQGTSGTGGHLP AQDDGQPLT >gi|283510543|gb|ACQH01000076.1| GENE 14 21350 - 23668 1326 772 aa, chain - ## HITS:1 COG:no KEGG:BVU_0280 NR:ns ## KEGG: BVU_0280 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 12 772 15 769 769 395 32.0 1e-108 MKSIITTLLALVTLQTMAQGLTGKLVDEKQHPIPFANVAVLSANDSTLIGGTTSNAEGMF SIAEIHQGCILRVSSVGYQSQFIVYNGTSPITITLNEDAQLLSEVVVKSKLPKTVFKGEG MITTVAGSVLEKTLSMESLLDLIPNLSAKDGSVVVLGRGSPEIYINGRKMRDRMELERLK PDEIKNVEVITNPGVRYNASVKSVIRITTKKPVGQGFSIDTKTNSKVNEQKRMSWTESLR LNYRKGKWDTSLHLYGAYTHKQDDKLIRQMTYLDDIWEQTTAISQEYTNVNPYVRLATSY ALDADNSIGASISYDRYAKNLAVGNAVGMAMRNNVQTEQSLSAFESPGNSKAVLSNVYYV GKIGKVNVDFNADYYWSGKKEQMHNMERFTEVGSPESIQNIHSDRATYNRLLASKLVLSV PLASGSLSFGGEFSTSRRKSRYSLLPRALVNDDDSRIKENMTSALVEYSNTFGKLHVQAG MRYEHIRFNYYYQERLIAKQSKSYGNFFPSLALSMPIGNTQMQLTFATDIYRPSYYDLRD GVQYNNRYTYDSGNPFLVPSISRNIGYAFSWQWMQFSVMYTHLSDEICTLVQTYKNEPQT TLARPENMPSYNTMQASVTVNPKFGFWKPMLEVTVFKQWFHMDTYKQTYLNHPVAFFRLN NTSDTKWITASVLVSAQTEGNMGNKFVRRGFFSADLSVYKSLMNNRLTLQLYASDLFGTA DACRIFYSGPRRSTYYKSYSSSSLSLTIRYQFNVTNSKYKGSSAGQSQRSRM 
>gi|283510543|gb|ACQH01000076.1| GENE 15 24029 - 24574 499 181 aa, chain + ## HITS:1 COG:no KEGG:Cpin_5158 NR:ns ## KEGG: Cpin_5158 # Name: not_defined # Def: transcriptional regulator, LuxR family # Organism: C.pinensis # Pathway: not_defined # 1 174 1 178 255 150 43.0 3e-35 MPDMKDFFISSNMACNAPDYNPDVLSTLTRTAEAFARVTYQGIYLIDYYRQEFFYVSDNP LFLCGHTAEEVRGLGYRFYLKHVPEKDQKMLVELNRSSFKLFGAFDAAAKCQCYISSHFH LSNGARRKLINHQLTPVLLTDEGKIWIGMGIVSLSSHRTAGHVEFHRRGSGTYWTYSFEG H >gi|283510543|gb|ACQH01000076.1| GENE 16 24413 - 24712 62 99 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTPIKLYRIHRAAHPARNLYHGKTCRRQPKHFKFLFFKQHATALIPLMSLERIGPIRAAT PPMKLNVARRSVRRQRHNAHSYPYLTFIGQQHGCKLMVD >gi|283510543|gb|ACQH01000076.1| GENE 17 25941 - 26150 72 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929043|ref|ZP_06422889.1| ## NR: gi|288929043|ref|ZP_06422889.1| hypothetical protein HMPREF0670_01783 [Prevotella sp. oral taxon 317 str. F0108] # 1 69 1 69 69 115 100.0 6e-25 MAMPQKVNGRKLFNAVSTSRKPLVDGLLLLRFHDAVAVVCEQHALQPLALAVVNVRPFGV VAHLRVHAA >gi|283510543|gb|ACQH01000076.1| GENE 18 26216 - 26458 60 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929044|ref|ZP_06422890.1| ## NR: gi|288929044|ref|ZP_06422890.1| hypothetical protein HMPREF0670_01784 [Prevotella sp. oral taxon 317 str. 
F0108] # 31 80 1 50 50 100 98.0 2e-20 MSGAFVLVLRHKDINASLFVERLKVKPTSALCSLPYRGLKGFKIGGINEEEGGKAWRSQS EMVNLCDGGEGDMLRRLVVW >gi|283510543|gb|ACQH01000076.1| GENE 19 26650 - 28512 1859 620 aa, chain + ## HITS:1 COG:SP2146 KEGG:ns NR:ns ## COG: SP2146 COG3669 # Protein_GI_number: 15901959 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Streptococcus pneumoniae TIGR4 # 30 564 6 535 559 293 34.0 7e-79 MRINKFTALAVVLAMTVGANAQRKVTAPAPCGPLPNANQLRWHQLETYAFLHYSLNTYTD QEWGFGDEDPMLFNPANLDARQWARTCKEAGMKGIILTAKHHCGFCLWPSKYTEFSVKNA PWKNGKGDVVRELADACKEYGLKFAVYLSPWDRNHAEYARPEYVTYFRNQLEELLTQYGE MFEVWFDGANGGTGYYGGANEERKIDGATYYDWPKTFEMIHKWQPKLAIWGDRAELRWIG TEKGYAGLTNWCTIDYAFNMPQDTLMHGQENGKHWSPGECDTSIRPGWFYHENEDSKVKS LRKLVDTYYESVGRNASLLLNFPITREGRIHPIDSAHAVAMARVIKAELANNVARQASVT ASQVRGGSAKYNAQKAVDNSTETYWATNDGVTRASLTLSFKRPTNINRFMAQEYIALGQR VKSFSLEALVNGKWVQLRDERAPEGTTGLTTIGYKRIVCFPTVKATRLRFTINSAKACPL ISNIGVYCAPALPEDAVQLPASVLAKANEKQEEPERLLISVGYDSPRELLIDVQEKRVCH SLTYVPRGDTKDGLVLNYEIYASDDLVTWKRWLGGEFSNIENNPVPQTVKLPAIPTRYLK LVATRLAQGENMARNDLIVR >gi|283510543|gb|ACQH01000076.1| GENE 20 28949 - 30271 1378 440 aa, chain + ## HITS:1 COG:YPO3162 KEGG:ns NR:ns ## COG: YPO3162 COG0477 # Protein_GI_number: 16123324 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 56 430 53 400 492 89 24.0 1e-17 MQQTNSKPNATRTNPIAWVPSVYFAMGLPFVVLNMVAVLMFKGLGVSDSKITFWTSLIMM PWTFKFLWSPFLELYRTKKFFVVTTQLVTGAGFALVGLALQLPHYFTVCIALMAVIAFSG ATHDIATDGVYMAELSKEDQARFIGWQGAFYNIAKIVASGGLVWLAGWLLKHYGGVKGAP ESVMHSAAVQAWMVVMIALGVLMFLLGLYHTRMLPQGGTRTEKRMTAAETFKELVRVLSD FFTKRHIVYYIFFIILYRFAEGFVMKVVPLFLKAGREVGGLGLSEEEIGLYYGTFGAAAF VLGSILAGYYISHFGLKRTLFSLCCVFNLPFVAYTLLSWYQPENGLLIGGAITLEYFGYG FGFVGLTLFMMQQVAPGKHQMAHYAFASGIMNLGVMLPGMASGFFSDWLGYKHFFIFTLV ATIPAFLITYFVPFTCEDKK 
>gi|283510543|gb|ACQH01000076.1| GENE 21 30521 - 31495 1109 324 aa, chain + ## HITS:1 COG:TM1225 KEGG:ns NR:ns ## COG: TM1225 COG2152 # Protein_GI_number: 15643981 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Thermotoga maritima # 8 323 11 326 326 373 58.0 1e-103 MKELKDIMPWEERPAGCADVMWRYSKNPVIGRYHIPTSNSIFNSAVVPFEDGFAGVFRCD NKAVQMNIFAGFSKDGINWDINHEPIVFEAGNTQMIESEYKYDPRVTWIEDRYWITWCNG YHGPTIGIAYTFDFKRFYQCENAFLPFNRNGVLFPQKINGKFAMLSRPSDNGHTPFGDIY LSYSPDMKYWGEHRCVMKVTPFPESAWQCTKIGAGSVPFLTDEGWLMFYHGVITTCNGFR YSMGAVLLDKDNPERVLYRTKPYLLAPAAPYELQGDVPNVVFPCAALSDGERVAVYYGAA DTVVGMAFGYVKEIIEFVKNNSIV >gi|283510543|gb|ACQH01000076.1| GENE 22 31799 - 33427 1565 542 aa, chain + ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 26 511 31 506 757 366 41.0 1e-101 MKRLLLAAALGFSMLSAHAADANYNVVPLPKSVVMAKGKPFNLTSATTIVYEGTNPEMKR NARFLSEYIQQVSGIRTSLLDKRDKNAAAIVLTIDPKVTGAEAYRLTVNNKQVTIAASTP AGVFYGIQTLRKSLPVQTNGADVTLPAVTVADEPRFGYRGMMLDCARHFFPLSFVKKFID ILALHNMNVFHWHLTEDQGWRLEIKSHPELTAKSSMRSGTVIGHNAMVDDSIPHGGFYTQ QEAREIVEYARQRHITVIPEIDMPGHMLAALAAYPELGCSGGPYEVGHRWGVYKDVLCLG KESTYKFVQDVIDEVVDIFPAKYFHIGGDETPTIMWEKCPRCIQKAKDENTDIKHLQQYF TNRIEKYLNSKGKSIIGWDEILEGKINQSATIMAWRGEKNGFDGAIKGHDVVMTPSSHVY FDHYQAEDHAHEPDAIGGFSPVEKVYSYEPIPESLPADAKKRIFGVQANLWTEYIPYTTQ AEYMIMPRMAALAEVQWTPAAKKNFDDFSKRAHRLSDLYDRYGYQYARHLWKDKAIPTSA VW >gi|283510543|gb|ACQH01000076.1| GENE 23 33567 - 35627 2305 686 aa, chain + ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 25 518 31 518 757 355 38.0 2e-97 MHRTLLVLLLFVVSLCPKRVQAQPVIPQPKEIIMGKGFFTIKPNTPVELDFEGEEARQMK AYIRQTLGLKVFKGKVISKVPSIRLNLLRIKEPSCYGRTFPEYYTLDVTPKRITIMSHTT 
TGLFYALQTLRQLMVDGNKVACAKVNDQPRFPYRGMMLDCSRHFFTPQFIKKQLDAMAYF KLNRFHWHLTDGGGWRLEVKKYPKLIEETAYRTQNDWTKWWREHDRRYCHANDSGAYGGY YTQDEVRDIVAYAAKRHITVIPEIEMPGHSNEVFAAYPNLTCEGKAYTSPDFCPGNDSVF TFLEGVLTEVMQLFPSEYIHIGGDEAWQEKWKTCPKCQQRMKDEGLKDTHELQAYFIMRI EKFLNAHGRKLLGWDEIMQGRLAPNAAVMSWTGEEAGLKAAKAGHHVVMTPGAYCYLDMY QDVPFTQPKAMGGYVPLEKAYSYEPIAHKESADTLEKYIDGVQGNLWTEEVPTPEHAEYM LYPRMLAIAEVGWTRNRPPYANFRHRTIQALDRMKAMGYNFFDLRNERGRREESLTPVTH LALGKPVTYLSRYTEKYRGTGDGTLTDGRRGDWVFKGDRWQGFIGGECMDAVIDLGAETE LHEVSADFLQSMGVDIAFPDSVELLVSADGQNYTSLQRRDVERHDKREYFIEPFTWKGTA KARYVRLRAKQGAEGGWVFCDEVKVW >gi|283510543|gb|ACQH01000076.1| GENE 24 36169 - 37788 1550 539 aa, chain + ## HITS:1 COG:Cgl1519 KEGG:ns NR:ns ## COG: Cgl1519 COG4409 # Protein_GI_number: 19552769 # Func_class: G Carbohydrate transport and metabolism # Function: Neuraminidase (sialidase) # Organism: Corynebacterium glutamicum # 190 489 72 372 399 121 32.0 4e-27 MRLFLIAVLSAITFAAQASDTLFVQQPQYPILIERHDNILMLMRIDAKETGTLNALTLDL CNTPRQQIKALKLYYGGTDARQEYGKRRMKPVEYITNFTPGRTMEAIPSYSVKQDEVQPT QSTVTLRSKQKLFPGYNFFWVSIEMQPTAALHTTFNIALREALGDNRPLTVENVGQPAPL RRMGIGVRHAGDDGVAAYRIPGLVTTNRGTLLGVYDVRHNNSVDLQEYVDVGLSRSTDGG RSWEPMRLPMSFGETGGLPKAQNGVGDPSILVDTKTNTVWVVAAWTHGMGNERAWWSSQP GMDYNHTAQLVLVRSDDDGQTWSEPINITSQVKQPEWYFLLQGPGRGITMTDGTLVFPIQ FIKANREPSAGIMYSKDRGKTWHINKPARDNTTEAQVAEVEKGVLMLNMRDNRGGSRAVC TTRDLGATWTEHPSSRSALIEPVCMGSLISVKAKDNALKRDILLFSNPNSTKHRNNITIK MSLDGGKTWLPEHQVLLDEGLGWGYSCLTMIDRETIGILYESSVAQMTFQRIPLKDLMK >gi|283510543|gb|ACQH01000076.1| GENE 25 37882 - 38403 346 173 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929051|ref|ZP_06422897.1| ## NR: gi|288929051|ref|ZP_06422897.1| acetyltransferase, GNAT family [Prevotella sp. oral taxon 317 str. 
F0108] # 1 173 30 202 202 327 100.0 1e-88 MDDFIRNKYKGLQKHIELGLSHLWLVYEKEKVIAFFALSKDALTLNSEDKHMIEMRQKTD FFPHFDEEHFWIQEKYAAIEIDYLVVAEEKQRKGIGSFIIASIEEYARNDKLSSTLFLTV EALNSKEQNSIEFYKKCGFSLSEVGKTRNENMERFGDLPLTVRMYKRILPAKE >gi|283510543|gb|ACQH01000076.1| GENE 26 38478 - 38645 83 55 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260910379|ref|ZP_05917051.1| ## NR: gi|260910379|ref|ZP_05917051.1| capsular polysaccharide biosynthesis protein [Prevotella sp. oral taxon 472 str. F0295] # 1 55 1 55 55 84 100.0 2e-15 MLGIKSHRPPIEGDIADEIRDAVVRVAKGQLNEEEKAVQERLKKALEKYKMRWEL >gi|283510543|gb|ACQH01000076.1| GENE 27 38831 - 39232 97 133 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260910377|ref|ZP_05917049.1| ## NR: gi|260910377|ref|ZP_05917049.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. F0295] # 46 99 32 85 91 94 83.0 2e-18 MHKVLDEDTIKPEILPQAVCGKTWLCFRRRLGGSYSMCSLQVENRLSTAYYVCFSMMSSW LVVSNIPHNRLRWPLQAYMFSVSLPTRACLRYNSFCMVFSNRFNIMSAFLSTNRSLGCKK SILAKVCFDAKRG >gi|283510543|gb|ACQH01000076.1| GENE 28 40065 - 41156 1054 363 aa, chain - ## HITS:1 COG:SMb21204 KEGG:ns NR:ns ## COG: SMb21204 COG0842 # Protein_GI_number: 16264618 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Sinorhizobium meliloti # 2 361 6 367 370 145 27.0 1e-34 MFKSLLQKEWIQIRRNAFVIRLVVVYPIFIMVIAPWITTMEVRNITVSIVDNDRSTLSRQ LVGKVEHSTYFHFNGMASSYKEALASVERGQTDVVLVIPQHFERELTLGKSPQVFVAANA TNGTKGGLGSVYLANIVTSTSLNSRPLPSPHQQTITASTLYNRHKSYKVNMIPALMAMVM IMVCGFLPALNIVGEKEAGTIEQINVTPVGKTQFILAKLIPYWVFGLVIYTLCLLLAWGI YGITPVGNVALLYLFVVLLAVIFSSIGLIVSNYSDTMQQAMLVMWFIMTCMMLLSGLFTP VRSMPNWAQTLTWIVPVRHYIDAARSVFIRGTALSGLTTQLTILAAMATIMSVWAVWSYK KNS >gi|283510543|gb|ACQH01000076.1| GENE 29 41374 - 42474 945 366 aa, chain - ## HITS:1 COG:CAC3268 KEGG:ns NR:ns ## COG: CAC3268 COG0842 # Protein_GI_number: 15896513 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Clostridium acetobutylicum # 2 
364 4 374 378 186 29.0 9e-47 MKTFFSFVNKEVRHILRDRRTMLILFGMPLVMMLLFGFAISTDVRNVRLVVVTTPWDNVA QQIVERLDASEYFVHTYNAPTVEKAKQLIRDQKADLAVVFSPRFADHRYDGGARIQLITD GTDPNMSSLQAAYASQIIVQALQQGSKVSIGSTPMVTTRLLYNPQMKSAYNFVPGIMGML LLLICAMMTSVSIAREKERGTMEVLLVSPARPLLMLIAKAVPYFLLSIAILCIILGISHF VLGVPLSGNLSAIFALCLLYIFLALCIGLLISVLASTQLQALLISGMVMLMPSILLSGMI YPIESMPQVLQWLTCIIPTRWFTAAIRKLMIMGVGLDMVAKEVMILAAMAALILGIALKK FNIRLE >gi|283510543|gb|ACQH01000076.1| GENE 30 42474 - 43934 305 486 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 251 470 10 231 311 122 30 5e-27 MNAIEVNNICKSYGEVQALKDFSFSVAKGSLHGLIGPDGTGKTSMFRILATLVLPDSGTA SVDGFDVVSGLKHIRQRVGYMPGRFSLYQDLTVEENLRFFATLFGTTIEAGYHNVKDIYG QIEPFKRRRAGALSGGMKQKLALSCALIHQPSVLLLDEPTTGVDPVSRTEFWDMLARLKQ QGITILVSTPYQEEIKRCDEITTPPLPQALHTPPTRPLYINKVCEATKNPTTDNATRNLG QAETIIHVSHLVKTFGTFNAVDDISFDVRRGEIFGFLGANGAGKTTAMRILSGLSQPSGG KATVAGLDVSTQHEAIKRRIGYMSQRFSLYEDLTVTENIRLFGGIYGMSSADIKRKTEQL LNRLTLQNEANRLVKTLPLGFKQKLAFSVSILHRPEVVFLDEPTGGVDGNTRRQFWELIY EAAQGGITVFVTTHYMDEAEYCDRLSIMVDGKIRALDTPEQLKRQYAKDNLGEVFTLLAR NAERSE >gi|283510543|gb|ACQH01000076.1| GENE 31 43931 - 44824 1017 297 aa, chain - ## HITS:1 COG:alr1501 KEGG:ns NR:ns ## COG: alr1501 COG0845 # Protein_GI_number: 17228994 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Nostoc sp. 
PCC 7120 # 29 282 75 415 434 84 26.0 4e-16 MKHPSITIAALALLIAACGGNEKEYDATGIFEATEVTVSAEGTGRLMAFDINEGSTLTAG QQVGLIDTVQLQLKARQVGATREVFANQRPDIQAQIAVTKQQISKALVEQRRTQALLKEG AATSKQLDDAENAVAVLQKQLQGQISLLQNSTRSLNSQMSGADIQRYEVMDQLEKCHIKS PITGTVLEKYAEQGEFTSIGRPLFKVADIRNMTLRAYITSIQLAKVRLGQRVKVFADYGD DTRHAYEGTITWISPKAEFTPKSILTDDERADQVYAVKVSVKNNGYIKIGMYGEVKL >gi|283510543|gb|ACQH01000076.1| GENE 32 45136 - 46422 1367 428 aa, chain - ## HITS:1 COG:no KEGG:BF2652 NR:ns ## KEGG: BF2652 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 2 423 1 414 418 268 37.0 3e-70 MMKHTLLIVIALLCAAACPAQTTLEACQQQAQQNYPLIKRRALYAQSTAYTIANIKKGWL PQVQVAAQGTVQNRVAQLPEQLSNMMAALGQSTRGLAKEQYRVGVDVNQMLWDGGRISAQ KEVAQLQQQVDEAQTDVSLYQVRQRVNDLYFGILLVDERLRLNRDLQTLLQSNENKLAAL QKQGVAMQSDVDQVKAERLTAMQMANELQHTRAALCRMLALFCNVERIDSIVKPMPATTE QTAEDVRPELKAIDLRLRLITSQQRALRTSLLPTLSVFGQAYYGYPGFDMFKDMNSRSPS FNALAGVRLAWNIGNLYTHRNNVQRLSVAQAEIENARELFFFNNRLETVQQQETIASKRQ TMAADNEIVALRQSVRRAAEAKLAHGIIDTDRLLQEITRENNAKINRSTHEVEMLQGIYN LKYVKGMN >gi|283510543|gb|ACQH01000076.1| GENE 33 46537 - 47481 810 314 aa, chain + ## HITS:1 COG:AGl645 KEGG:ns NR:ns ## COG: AGl645 COG2207 # Protein_GI_number: 15890441 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 202 312 193 301 327 76 36.0 8e-14 MGGFYYLCGNKKGRVMNNQTIETLTMEEIRSRLSVHVAEHDMPITEDGFTMLYNGRNTFG LAVVAHCPYRLPGIRIGLLKKGEIKATLNLLERSAQSGTLGYFCDGTIFQFDQIPLESEV EGVVIEPWLMDELFPQGVPTMFNGQAKDALMPANDRDIAKVESLYESLMLVLRDRPYNRQ AALAIITALCYVYNECYVRATEHPAESPSRQRVVFDRFIALVNEHAKSEHNLAFYADKLC LSQRYLSRVVWQTSGVYAKEWIDRAVITEAKVMLRHGQLTVAAISEALNFPNPSFFNKYF KRQTGQTPLAFRNG >gi|283510543|gb|ACQH01000076.1| GENE 34 47623 - 47940 206 105 aa, chain + ## HITS:1 COG:no KEGG:PRU_0962 NR:ns ## KEGG: PRU_0962 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 105 1 105 105 107 54.0 1e-22 
MEITFEQEYLRELFYLGKTRDKKHRFQPQVIRKYISVLNLMSSLNSTEDMYRFTSLHYEK LVGGKVGRESVRVNDQYRIEFRTEIKSGERIVTVCNILELSNHYK >gi|283510543|gb|ACQH01000076.1| GENE 35 47947 - 48258 344 103 aa, chain + ## HITS:1 COG:no KEGG:PRU_0963 NR:ns ## KEGG: PRU_0963 # Name: not_defined # Def: putative virulence-associated protein # Organism: P.ruminicola # Pathway: not_defined # 11 101 2 92 94 95 56.0 5e-19 MSSLGYGFYATHPGEVLKDELEARGISQRKFAESIGMAYSVLNELLNGRRPLSTTSALMF EAALDIPAEPLMELQMKYNMQVARKDTTLATRLKKIAKAVARP Prediction of potential genes in microbial genomes Time: Sat May 28 01:46:53 2011 Seq name: gi|283510542|gb|ACQH01000077.1| Prevotella sp. oral taxon 317 str. F0108 cont2.77, whole genome shotgun sequence Length of sequence - 5135 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 17 - 76 4.9 1 1 Tu 1 . + CDS 105 - 665 -2 ## + Prom 701 - 760 9.8 2 2 Tu 1 . + CDS 780 - 2093 1782 ## COG0148 Enolase + Term 2184 - 2220 5.3 3 3 Tu 1 . + CDS 2515 - 3144 608 ## COG0386 Glutathione peroxidase + Term 3185 - 3239 7.7 - Term 3200 - 3245 8.3 4 4 Tu 1 . 
- CDS 3277 - 4929 2009 ## COG1151 6Fe-6S prismane cluster-containing protein - Prom 5028 - 5087 4.0 Predicted protein(s) >gi|283510542|gb|ACQH01000077.1| GENE 1 105 - 665 -2 186 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEWRIWKLLFFREFISQWVDLYHPLYAPGGFVLCGQHKDNKLLSNYTTIKSQDRVQLSMR ADVFMNKTWQGIFRLPRMTFFPSFYIEKVGLWKRFGRQCVSGTLVRCCLYRQYKHLIVSV QAACNVTTSPFQFCILPYTVPVQAARNITTRWLVSSPCGAIVRCLKAVAGVCCTSSFSVS LFSARS >gi|283510542|gb|ACQH01000077.1| GENE 2 780 - 2093 1782 437 aa, chain + ## HITS:1 COG:SA0731 KEGG:ns NR:ns ## COG: SA0731 COG0148 # Protein_GI_number: 15926453 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Staphylococcus aureus N315 # 2 429 3 423 434 552 67.0 1e-157 MVIEKVHAREILDSRGNPTVEVEVTLCNGVVGRASVPSGASTGENEALELRDGDKKRYSG KGVLKAVENVNNVIAPALKGMPVCQQRKIDYKMLELDGTPTKSKLGANAILGVSLAVAHT AAKAFGMPLYRYIGGTNTYVLPVPMMNIVNGGAHSDAPIAFQEFMIRPVGASCEREAIRM GAEVFHALAKNLKARGLSTAVGDEGGFAPKFDGIEDALDTIMKSIKDAGYEPGKDVKIAM DCAASEFAVQENGEWYYDYRQLKNGMPKDPNGKKLTAAEQIKYLEELITKYPIDSIEDGL DENDWDNWVKLTEAIGDRCQLVGDDLFVTNVKFLEKGIKMGAANSILIKVNQIGSLTETL EAIEMAHRAGYTTVTSHRSGETEDTTIADIAVATNSGQIKTGSLSRTDRMAKYNQLIRIE EELCCSARYGYERLKKQ >gi|283510542|gb|ACQH01000077.1| GENE 3 2515 - 3144 608 209 aa, chain + ## HITS:1 COG:FN2007 KEGG:ns NR:ns ## COG: FN2007 COG0386 # Protein_GI_number: 19705303 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione peroxidase # Organism: Fusobacterium nucleatum # 29 207 19 198 199 184 52.0 7e-47 MYSKFYFLRTLTLALFALFAFSGMAQKNVYGFKVKDENGRMVSLSKYRGKVLLIVNTATQ CGLTPQYKPLQELYDKYRDKGLVVLGFPCNQFKGQAPGTNKEIRQFCEANYGVTFPQFAK IDVNGKQAIPLYRYLKRQQPFFGYLMSDSTQRAEFERLPKDVPMNVEYDIQWNFTKFIVD REGKVVKRVEPKDTMEYLEEEVAKVLEEK >gi|283510542|gb|ACQH01000077.1| GENE 4 3277 - 4929 2009 550 aa, chain - ## HITS:1 COG:FN0684 KEGG:ns NR:ns ## COG: FN0684 COG1151 # Protein_GI_number: 19704019 # Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: 
Fusobacterium nucleatum # 4 547 2 555 566 702 61.0 0 MMENKMFCFQCQETAKGTGCTVKGVCGKEASTSNYQDLLLGVVRGVATIDRAIRKAGVGS VAGVDTYFVDALFACITNANFDDASILNRVDKGIALKKQLLAIAKEHNVELPAYHEILWG GEKSDYEAQSRKESVLRNENEDLRSLKELTVLGLKGMAAYYEHASRLGYTDTTITDFMGD ALAIIANPDAPMDQLLNCVLDTGKWGVTAMALLDKANSTSYGHPEITKVNIGVGKNPGIL ISGHDLHDIEDLLKQTEGTGVDVYTHGEMLPAHYYPELKKYKHLVGNYGNAWWKQKEEFS TFNGPIVFTTNCIVPPSPKANYKDRVFTMNSTGFPGWKHIEEDANGHKDFSEVIAIAKTC EAPTEIETGEIVGGFAHNQVFALADKIVEAVKSGAIRKFVVMGGCDGRMRSRDYYTEFAQ QLPHDVVILTSGCAKFKYNKLNLGDINGIPRVLDAGQCNDSYTWAVVALKLKEIFGAADI NDLPLAFNIAWYEQKAVIVLLALLSLGVKNIHIGPTLPAFVSTGVLKVLVEQFGLGGITT VEEDLKRMIG Prediction of potential genes in microbial genomes Time: Sat May 28 01:47:08 2011 Seq name: gi|283510541|gb|ACQH01000078.1| Prevotella sp. oral taxon 317 str. F0108 cont2.78, whole genome shotgun sequence Length of sequence - 34909 bp Number of predicted genes - 22, with homology - 20 Number of transcription units - 19, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 322 - 3018 2700 ## COG1879 ABC-type sugar transport system, periplasmic component 2 2 Tu 1 . + CDS 3061 - 3300 90 ## + Term 3439 - 3471 -0.8 + Prom 3665 - 3724 3.6 3 3 Tu 1 . + CDS 3937 - 4191 85 ## gi|288929067|ref|ZP_06422913.1| hypothetical protein HMPREF0670_01807 + Prom 4582 - 4641 7.7 4 4 Tu 1 . + CDS 4845 - 5618 763 ## COG3142 Uncharacterized protein involved in copper resistance 5 5 Tu 1 . + CDS 5909 - 8620 2930 ## BF1045 transglutaminase-family protein + Prom 8787 - 8846 4.2 6 6 Tu 1 . + CDS 8933 - 11830 2709 ## COG0642 Signal transduction histidine kinase + Term 11932 - 11981 2.4 + Prom 11834 - 11893 3.5 7 7 Tu 1 . + CDS 12102 - 12806 751 ## BT_4472 hypothetical protein + Prom 12876 - 12935 5.2 8 8 Op 1 . + CDS 12981 - 14279 1561 ## COG2252 Permeases 9 8 Op 2 . + CDS 14266 - 15195 921 ## COG3568 Metal-dependent hydrolase + Term 15406 - 15436 1.1 + Prom 15259 - 15318 3.7 10 9 Tu 1 . 
+ CDS 15523 - 16410 638 ## gi|288929074|ref|ZP_06422920.1| hypothetical protein HMPREF0670_01814 + Prom 17387 - 17446 5.2 11 10 Op 1 . + CDS 17578 - 18789 1252 ## COG0477 Permeases of the major facilitator superfamily 12 10 Op 2 2/0.000 + CDS 18782 - 20149 1325 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases 13 10 Op 3 . + CDS 20192 - 21568 1289 ## COG1232 Protoporphyrinogen oxidase + Prom 21709 - 21768 4.7 14 11 Tu 1 . + CDS 21822 - 23159 1635 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase + Prom 23249 - 23308 4.2 15 12 Tu 1 . + CDS 23328 - 23969 440 ## COG0797 Lipoproteins + Prom 24186 - 24245 3.4 16 13 Tu 1 . + CDS 24272 - 24859 735 ## BC3029 hypothetical protein 17 14 Tu 1 . + CDS 25416 - 27161 1669 ## COG2194 Predicted membrane-associated, metal-dependent hydrolase + Term 27230 - 27272 -0.2 + Prom 27363 - 27422 4.1 18 15 Tu 1 . + CDS 27649 - 28482 1008 ## PRU_2086 putative lipoprotein + Prom 28824 - 28883 3.7 19 16 Tu 1 . + CDS 29100 - 30452 1213 ## COG0534 Na+-driven multidrug efflux pump 20 17 Tu 1 . - CDS 30766 - 32082 706 ## Apar_1241 hypothetical protein - Prom 32106 - 32165 1.9 21 18 Tu 1 . - CDS 32209 - 33522 888 ## Acfer_1144 hypothetical protein - Prom 33733 - 33792 4.5 + Prom 33949 - 34008 4.5 22 19 Tu 1 . 
+ CDS 34096 - 34296 64 ## Predicted protein(s) >gi|283510541|gb|ACQH01000078.1| GENE 1 322 - 3018 2700 898 aa, chain - ## HITS:1 COG:SMb20671 KEGG:ns NR:ns ## COG: SMb20671 COG1879 # Protein_GI_number: 16265126 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 22 307 28 318 322 165 33.0 5e-40 MNPKPTTFVLPLLALLALLLSACSTKKSYFIGVSQCSEDSWRTKMNSEIRRESYLYDNVR VEFASAKDDNRVQIAQIEHFIEEGADLIIVSPNEAKALTPVINKAFDRGVRVVLVDRKSA SDKYTAFIGADNVAIGRAVGRFVGEHLGGKGRVMELQGLRGSSPAMERDSGFREALARYP QIKVVANAYADWFEQKAESETERMFKDAGGVDLVFAQCDRMGIGAHQAIQKMGVKGVKIV GVDALPTPGDGIEAVKNGTFLATFVYPTHGDEVLKLAMNILEGRPFRRETILQTGVIDAN NAESALQQWAELTLLDNKMQRLNHRIDESMAQYSNQKTILVLGILLVFILMVFAFFVLKA YFAKAKLNEKLEEKNREIEAATQAKLMFFTNVSHELRTPLTLIETPVEQLLAENQLSKVQ RGLLEVAHRNVRTLLKLINQILDFRKVEGGKMTLQLSETDLAALIGNVVSEFVAAAEHKR IRLSCRLPEHVTAKIDAGKVERVVSNILSNAVKFTPVAGEINVELVVDTPQGTATLSVTN TGKGIAEGDLPHIFERFYQPQHSEGGTGIGLALAKAFVDMHGGHIGVSSKAEGPTVFTVH LPLNLQAETATSAAEQSPIAEPQAFIVPATQATTKADAPQLQTIFEENNDPNQPTILFTD DNDDVCQMARTLLETHYRVLTAPNGVVALQMAELNIPDLVVSDVMMPQMNGLELCSRLKQ STATSHIPVILLTARTLDEQRIEGYEHGADAYITKPFSAPLLLARIQNLLQSRKQLKQVF GGADELAKEEISTPDKEFVSKIRSEIHRNISNNDFGVEQLGAAVDLSRVQLYRKVKALTG LSPVELIRATRVNRARKLIEGGATSVSEVAYQVGFTSPSYFTKCFKDQFGISPMELLK >gi|283510541|gb|ACQH01000078.1| GENE 2 3061 - 3300 90 79 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MINLESPIILAYFFICLVKVSLLVGYLFFVGGDVGVAVLDIVAVLDILECLAVLDILFVL DILFVLVKLGCLGYSFCSS >gi|283510541|gb|ACQH01000078.1| GENE 3 3937 - 4191 85 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929067|ref|ZP_06422913.1| ## NR: gi|288929067|ref|ZP_06422913.1| hypothetical protein HMPREF0670_01807 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 84 1 84 84 160 100.0 3e-38 MPPLFASKQTFARIDFLRQGERLVDEKGTYYVKFLAKNQTKLTRSEHARVGNGKENTCAC GGQRERLQGMLLTDNGHSIMGGRK >gi|283510541|gb|ACQH01000078.1| GENE 4 4845 - 5618 763 257 aa, chain + ## HITS:1 COG:mlr1122 KEGG:ns NR:ns ## COG: mlr1122 COG3142 # Protein_GI_number: 13471212 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in copper resistance # Organism: Mesorhizobium loti # 11 239 17 244 258 177 45.0 3e-44 MNETSNNFEFEICANSVESCLAAQEGGADRVELCAGIPEGGTTPSYGDIVTARRLLNTTK LHVIIRPRGGDFTYSDLEMDIMAADIDACREAGVDGVVFGCLTPEGDIDIDKNAKLMAHV GQMAATFHRAFDRCRNPLDALGQLEQLGFNRVLTSGQQKTAEKGIPLLYKLHKRAGKNIA IMAGCGVNERNIFKIHVETDVTQFHFSARESVLSTCQYMGGNVRMGGKDTDEAYREVTTA RRVRHTIKELRGETENE >gi|283510541|gb|ACQH01000078.1| GENE 5 5909 - 8620 2930 903 aa, chain + ## HITS:1 COG:no KEGG:BF1045 NR:ns ## KEGG: BF1045 # Name: not_defined # Def: transglutaminase-family protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 39 895 13 888 892 922 52.0 0 MQTPQNYNKMTHAYHNVTTPYLTSTKPFTISMRKLFLSLFVALVACNVAQAGFYHFISDD KYRAEVHKDFTAKMQLVGKNFFTPTQTVPTPAEQEALEFLYAYMPVADVTDYTTAFYLDN VKATFKARDEMPWGRDVPELLFRHFVLPIRVNNENLDNTRMALYDELKKRVEGLSMTEAV LELNHWCHEHVTYQPSNARTLAPLACMKTAIGRCGEESTFTVAVLRTLGIPARQVYTPRW AHTDDNHAWVEAWADGKWHFLGACEPEPVLDLGWFNLPASRAMLMHTKAFGNYRGPEEVM LRTSNYTEINLIANYAPTARVDFAVVDAAGKAVDKAKVEFKIYNYAEYYTAVNKFTDNRG RTFLTAGVGDMLVWASKNGAYGFAKASFGKDKAVTIRLNRNARSDAKMTVGALDSVNIVP PVESAKLPEVTFDEAIANKTRLAKEDSIRHAYEATFYKPKKDGRIGDFLKRARGNWKTIH QFVSNHSDRLDRALALLETLPDKDLHDMPLEILEDHFGAESNQLSPRVEDEMIIAPFKQT FQQAFNTALADSMRANPTVLVEWTKHNIRLNPDTKALRIAQTPMGVWRSRLTDTRSRDIF FVSMARSLGIESRKDAVTGKVQYKRDSAWVDVNFDNAQSQATPTGKLKLTYSAAPLLPAD PKYYSHFTLSKIVNGQAQLLNFEEGDGNANEGTTWANTFERGYDLDAGTYLLVTGTRLAN GGVLATQRFFNVAAGKTTEVPLVMRRPSGQLNVLGNFDSESRFLKDGQEVSILSQTGRGY FTLALLGIGEEPTNHALRDLAKARATLDEWGRPFVLLFPNDADLQRFKDENFGTLPTNII LGVDADGKIKQQVANAMKLANPSQMPVFIVADTFNRVVFCSQGYTIGMGEQLEKVAKMLE TGK >gi|283510541|gb|ACQH01000078.1| GENE 6 8933 - 11830 2709 965 aa, chain + ## 
HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 452 695 6 246 294 137 36.0 1e-31 MDCNRYLPKQALQYVALHFVTILVVLTGLLNACADKQPIDLPDFPPSSQEPAKQNVEPKT DGRLLRLLATYEEQDGTEKTAIANRIFGLLYAEELTDSLVKVANDTPKDKVDAMVLYHAA EYFWDKQDYATALNYARKALPLTYKTNDLLLISDCEQLAAQLLFRQSDLVRAIKHAQKSL EMDRKTGDKSRISSSLNTLAAISLVAKQPEEGERYISEAIALSTEVNDSNRMAIQYGMAS EIYHAMKKEQLALEHAVKAYQIDEQRGNVAKTGIRLSQMATALMDMERYVEAERTVERAI PILEKAGNKQSLGICMNQKGELLNRRGEYASAQQCFEKAIEIFEQRNDLYNLSRAQMGAF NALKASNPTLAAQHLQRYVALKDSIYQHDMEQAVSQSNAKYKNGELTLQAQHEQKEKRIV AIAAIVVVTLLLLVVVALIYVGKIRQRNHALLKQVSQLRENFYTNITHELRTPLTLILGL SHELNQDESLTDEAKHKATTIERQGNSLLTLINQLLDIARVKSAVGNPDWENADITAHVN MVVESYRDFAVSQNVELNFSGNERIVTDFVPEYINKVMNNLLSNAFKFTPPYGKVSVAVA REGANVVIRVKDTGVGIAPEAVAHLFEPFYRASNGASKNGTGVGLALVRQIIDALDGQIV VGSAPGRGATFAITLPIRNVCKPTANAQKHTNTPMLPHVEKRLDDETNDNCAEQRMLIVE DNADLAAYMGSLFTPQYAVCYASNGEEALARATELVPDIIITDRMMPGMDGLEVCKQVRA SDVMNHIPIVVVTGKITEQERLEGIKAGADAYITKPFNSEELITRVENLLDRHRKLREKY GDNGNETKEEGDVRAEAERQFLTKAVDTAYLLLDKRALDVNALAERLGMSIRQFTRKIVA LTGNTPGAFILNLKMQKGKQLLDTRFELNIEEISERCGFEHASSFYHAFKKAFGVTPSEY RKNGG >gi|283510541|gb|ACQH01000078.1| GENE 7 12102 - 12806 751 234 aa, chain + ## HITS:1 COG:no KEGG:BT_4472 NR:ns ## KEGG: BT_4472 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 234 1 234 235 252 56.0 7e-66 MKKLLAVAIAFVASLSVQAQNVQLHYDFGHLNDNLSTRPKLTSTVEMFKPDKWGNTFFFV DMDYADNGVAAAYWEISRELRFWQAPLAIHVEYNGGLAKGVGSYNDAYLAGVTYSWNEKD FNSGFTITPMYKYLAKQSQKHSWQLTATWYLNFCNHLLTFDGFADFWGDRRFADGRNIAV FLSEPQFWVNLNRIKGVSEDFKLSLGTEWKVSNNFVNQNHRWYWLPTLAAKWTF >gi|283510541|gb|ACQH01000078.1| GENE 8 12981 - 14279 1561 432 aa, chain + ## HITS:1 COG:VC2278 KEGG:ns NR:ns ## COG: VC2278 COG2252 # Protein_GI_number: 15642276 # Func_class: R General function prediction only # Function: Permeases # Organism: Vibrio cholerae # 
17 431 16 428 430 344 50.0 2e-94 MSFYSLLGFNPKKQSRRTEMMAGVTTFLTMSYILAVNPDILSAAGMDKGAVFTATALASA LGTLFIAFLAKLPFAQAPGMGINAFFAFTLVRGMGYSWEAALAAVFVEGIIFILLTALNI REQIVKCIPKNLRFAISGGLGLFIAFIGLKNAGLVVANDATFVSLGAFTPTAALASLGII LSGALLVLKVRGALFYSILICTVVGIPMGITQIPDAFVPVSLPQSIAPTFLKLDFAALLN VDMMLTVFVLVFIDIFNTLGTLIGTAAKTDMMDENGNVKNIQKAMMADAIATSTGALLGT STVTTFVESAAGVAEGGRTGLTAFTTAMFFLVALFMAPLFLIIPSAATTGALVLVGVFML ESIKKIDLQDISEALPCFITVLMMVLTYSIAEGMALGLISYTLVKLLSGNYKDVNITLFI VSSLLVLRYVFQ >gi|283510541|gb|ACQH01000078.1| GENE 9 14266 - 15195 921 309 aa, chain + ## HITS:1 COG:lin0348 KEGG:ns NR:ns ## COG: lin0348 COG3568 # Protein_GI_number: 16799425 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Listeria innocua # 30 301 6 250 257 140 32.0 4e-33 MCSNKARSLLLGVVLCMAFAAFGQTLVLGSYNIRYRNDGDEAKGHLWAKRCQVIADQLNY HHPDAFGAQEVLKGQLDDLLRALPDYDYVGVGRDDGHEAGEYAPIFYRRDRLDKLSDGYF WLSPTYTTPSLGWDAACIRICTWAKFRVRESGEVFLFFNLHTDHVGVKARRESAKLVMKA IDGLGGKDMPVVLTGDFNVDQTDETYAIFTANARLNDAYVVAKHRFAENGTFNSFDPTLK TTSRIDHVFVSPRFAVARYGVLPNFYWTEEPAAKSQKGQDAPQQIDFKQHRIRTASDHYP VFVEVNLRP >gi|283510541|gb|ACQH01000078.1| GENE 10 15523 - 16410 638 295 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929074|ref|ZP_06422920.1| ## NR: gi|288929074|ref|ZP_06422920.1| hypothetical protein HMPREF0670_01814 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 295 73 367 367 587 100.0 1e-166 MTLLNQAIADTATAEWLWKRAELETKLGKDADALRDYDRLLLLHSIKDGAEFAAHVDIAA DSSVNVNEVFGRQLAAYFRLGQMRLAENKLEQMRKYLYDKGQVGKYYYWKGVIAKGNKDW YAAYSFFLDAALQHYPNARLQLKEMSKRSGMPLNWPAGLDSMNVELQFTDGTKWPIANKL QGEPRLVKVERKQVEITNIRKFVQKNYLHALALRTIPGMANKGKQVAPVCIATTSGNNLC FTCLPGSSGLAKLGLDASDGTAKPNAQRAKPQVIEVWVMKYEDGHGLRNCEVYIK >gi|283510541|gb|ACQH01000078.1| GENE 11 17578 - 18789 1252 403 aa, chain + ## HITS:1 COG:all4025 KEGG:ns NR:ns ## COG: all4025 COG0477 # Protein_GI_number: 17231517 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Nostoc sp. PCC 7120 # 10 393 4 389 396 169 30.0 8e-42 MKQDLNQRNKLSTFFCLYLAQAVPMSFFTTALQVTMREQRFSLSAIALLQLVKLPWILKM LWSPLVDRACTTMRGYKRLIVGSELVYALLVFLAGTLNLQLNVATVILLVLLSLVASATQ DIATDALAIRSTKGRERGLMNSMQSMGSFAGSLVGSGLLLVVLHRYGWNSVTQCLSLFVL VAALPLLFNKKISLQQDEDKRQRARLTDFVWFFGQRGVWRQILFLALYYMGIIGIMSNMK PYMVDLGYSMKDIGLINGVVGVTSAFAVGYPTGMLVRRYGYARVRPAVACAMLGATLLFT ILSVSVPTTVQIVLATVLLWASYGMATVVVYTSAMSHVRPGKEGTDFTVQTVLTHLTGIV AAMGSGAVADVAGYTAMFAVQTAIALASLAYVSLALKKETTHD >gi|283510541|gb|ACQH01000078.1| GENE 12 18782 - 20149 1325 455 aa, chain + ## HITS:1 COG:aq_2124 KEGG:ns NR:ns ## COG: aq_2124 COG0635 # Protein_GI_number: 15607073 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Aquifex aeolicus # 3 441 6 441 456 322 38.0 8e-88 MTNEELIAKYNKPVPRYTSYPPANSFVEMGEEEYLAAVDQSNNAAVDNISFYIHVPFCQH LCHYCGCNSVAMARNEVVEAYFDALHREINLLLPHLDKGRKISQIHYGGGSPTSIPLHHI KRLNEHLLGSFGTIENPEIAIECHPAYLNDEHWEQLIGCGFTRYSLGIQDFDENVLRTVN RRPSLLPVEHIVERLRQSGAKVNLDFIYGLPGQTVQSFARTIARAIDMRPNRLVTFSYAH VPYIYGRQRILEKVGLPPETEKMRMFEQAAEQLARAGYCHIGMDHFVEPNDDLQQALERK QLHRNFQGYCPRRITAQVYALGVSGISQLDAAYAQNTKDVARYIQYTNEGRLCIERGYAL QPWQRVAREVIESLMCNYYVAWTDMAQRLNISAEEVKEALNYDVPALLQMADDGLITLEP 
THIALTEAGSPFVRNVAAALDKMMINNNNRFSKPV >gi|283510541|gb|ACQH01000078.1| GENE 13 20192 - 21568 1289 458 aa, chain + ## HITS:1 COG:aq_2015 KEGG:ns NR:ns ## COG: aq_2015 COG1232 # Protein_GI_number: 15607001 # Func_class: H Coenzyme transport and metabolism # Function: Protoporphyrinogen oxidase # Organism: Aquifex aeolicus # 1 456 3 430 436 199 31.0 7e-51 MRQTTIIIGAGLTGLTAAHTLRKQGKEVLVVEKENRIGGQIRTFEEQGFVYESGPNTGMV AYPEVAELFTDLAAYGCKLLTAREEAKQRWIWKGERFHALPIGLKASVQTTLFSWHDKLR ILAEPFREVGTDADESVAQLVRRRLGDSFERYAVDPFISGVYAGDPERLVTRYALPKLYN LEHTYGSFVRGAIAKMRQPKTPRDLLATKKVFSAEGGIQRLVDALGRSVGETNIVLGASN LHLQPSDQGWKAEYTQADGTVCSILAQQVVTTIGAYALPQLLPFVPQTLMEPITCLRYAP IVQVSVGFADVQGQRNEAFGGLVPSCENRHILGVLFPASCFEGRAPKQGDLFSVFVGGIK HPEVLDMNDAELTQMVLAELRHMLRLPADVQPSLLKIFRHRHAIPQYEVSSGARFKAIDA LQSQYRGLIVAGNLRNGIGMADRIKQGATIIPQELADI >gi|283510541|gb|ACQH01000078.1| GENE 14 21822 - 23159 1635 445 aa, chain + ## HITS:1 COG:PA4588 KEGG:ns NR:ns ## COG: PA4588 COG0334 # Protein_GI_number: 15599784 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Pseudomonas aeruginosa # 3 444 5 444 445 532 59.0 1e-151 MEVEKIMQELERKHPGESEFLQAVKEVLVSIKDVYNQHPEFEKAKIIERMVEPERIITFR VPWTDDKGEVHVNIGYRVQFNGAIGPYKGGLRFHPSVNLSILKFLGFEQTFKNALTTLPM GGGKGGSDFSPRGRSDAEIMRFCQAFMTELYRHIGPNEDVPAGDIGVGGREIGYLFGMYR KLTHQFEGVLTGKGLEWGGSIFRPEATGYGALYFVNQMLETKSIDIKGKTVALSGFGNVA WGAAKKATELGAKVITISGPDGYILDPNGVSGEKIEYMLELRNSGNDVCAPYAEKFPGST FFAGKKPWEAKADIYLPCATQNELNGEDADKILAAKPVCVAEVSNMGCTAEAAEKFVATK TLFAPGKAVNAGGVATSGLEMTQNSLRLGWTGKEVDERLHHIMSSIHEQCVKYGTEANGY IDYVKGANIAGFMKVAHAMMAQGIA >gi|283510541|gb|ACQH01000078.1| GENE 15 23328 - 23969 440 213 aa, chain + ## HITS:1 COG:slr0423 KEGG:ns NR:ns ## COG: slr0423 COG0797 # Protein_GI_number: 16331599 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipoproteins # Organism: Synechocystis # 12 99 230 317 321 99 55.0 3e-21 MGFISLHAQQRGKASFYSRQATGARTSSGERLHHRDFTCAHRTHPFGTLLKVKNLSNGKE 
VVVRVNDRGPFGRGRIVDLSWGAAKALGMLSQGVVDVEITPVGKKKVSSKGKGSTSDASL KHKSGKRKGKAHAKKGKHKDKKGSSAIGKKHKKRDSKLKGNAKLRHSDKAKHVKPKKKSV KSKKKHVKTKPQKGNKKTKKGKNGKKKGKKGRR >gi|283510541|gb|ACQH01000078.1| GENE 16 24272 - 24859 735 195 aa, chain + ## HITS:1 COG:no KEGG:BC3029 NR:ns ## KEGG: BC3029 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus # Pathway: not_defined # 17 195 78 257 263 108 35.0 9e-23 MEKQRQQELAELKPTAEMMDEEQFWAIVQTAVDEAGDDEEAYLEVVKRELSKLSLKEMIG FRLRTDKLLYDSYTSEMWCAGYLMNGGCSDDGFEYFRLWVISRGRKVYEAAMANPDNLID YIDDDDEMDFFEFELFWYVALEAFEEAVDAELYDYVDDDNFKTCEGNYPNFEFNWEEDDP ESMQKLCPRLFEKFG >gi|283510541|gb|ACQH01000078.1| GENE 17 25416 - 27161 1669 581 aa, chain + ## HITS:1 COG:jhp1312 KEGG:ns NR:ns ## COG: jhp1312 COG2194 # Protein_GI_number: 15612377 # Func_class: R General function prediction only # Function: Predicted membrane-associated, metal-dependent hydrolase # Organism: Helicobacter pylori J99 # 202 579 178 549 553 127 26.0 7e-29 MIRTLMRPLRQWVFFVAMFAAITLMPFEAIKHMYAYPDEVAAALLHLVAALALAYLFTCV VHAVGRTWFKCLAYALPTWCVLLRSFLHFALHSELTPKTVMLVAETNAAEVSGFFSTFAV SAGAFKSYALTILYIAVVVFAERHRLTWGHWLQRKLPQKLLVVTILSSFVALGYTYSAIP ALLRCQTTADAEWWNIRYYAYSMDDLSIAAYSLYVPHLMKRELQEAVANNLAAAKRASSL ANDADSLTLVVVIGESYIKSHAQIYGYPLPTTPRMVAEQKRGNLLAFTDYISPYNFTTEA VKAILSTNDVSAGQNWSQGALWPVVFKRAGYTVEMWSNQLVTADNVFTNYSLNGFIYHPQ IRPVAYNSVRGSEGTLDGALVSRALAHAKANSRRHRLLLLHLNGQHFNAKDKYPDTPAFN VFSAADIKSSAPWLTPSKRQQVAEYDNATRYNDHVLGMVFDAMRNTQAVVVYLPDHGEEI YDWRDQYGRSTTPEYTSAYMHSMNDIPLIVWASPSFIARNPQAWHRLQGAVARPGMSDGL CHLLFGIANLRTPYYIARKDMASDMWKPARRIVYGKVEYKP >gi|283510541|gb|ACQH01000078.1| GENE 18 27649 - 28482 1008 277 aa, chain + ## HITS:1 COG:no KEGG:PRU_2086 NR:ns ## KEGG: PRU_2086 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 6 277 5 231 232 157 35.0 4e-37 MVARKLIMALIALAMLVGCGTYAGSGAYAGATLGSILGSAIGGITGGAGGSDLGQIVGTG VGAAIGAAIGEQADKKAEERSEKRRERIEQRRREMEREDYGYNDNEPAYGQYPPYDSGFD ENNGGDDRIFDFKGPDYTGNYSAQQPSAGESAQQEPTRATGTLPIEIRNARFVDDNQNRQ 
INRDELCKVIFELWNNSSQTLHDVQPIVTEETGNKHLAISPTIHVEQLEPGKAIRYTALV RADRRLKPGNAVIRVAAVRGNNIPISEPLFFNIPTEK >gi|283510541|gb|ACQH01000078.1| GENE 19 29100 - 30452 1213 450 aa, chain + ## HITS:1 COG:lin2192 KEGG:ns NR:ns ## COG: lin2192 COG0534 # Protein_GI_number: 16801257 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 13 449 12 441 443 165 26.0 2e-40 MDDRDNIDFGKMNIPRLFVKLFVPTLLGLVFNAMLTLADGIFVGRGIGSNALAAVNIAAP IFLIGLGVSLMFATGVSVVAAVHLSRGNNKAANINATQALTVPFVVMSGVSVLIAVFAKP ICYLFGGSQMLEPLVVEYLTYISPIPALTVLVHVGGFLIRLDGSPKYAMYTSVVPAVLNI FLDWLFVFPLNLGLMGAASATSISETVGALMAVAYLVYFPKTIRLYRPKFTRTSVRLTVR NLGYMMKLGVPTFIAETAISGLMIVGNFMFMTFLHEDGVAAFSVACYLFPLVYMFGNSIA QAQLPIVSYNHGQSNHDRVRHTLRLSLALTAVNGLVISLLGMLAFRPLISLFLAEGTHAF IVCTEGFPLFATSFVFFSLNVVIIGFYQSVERATEAIFFMCLRCLIFVIPIFIALPYLIG VVGLWLAVPLAEALTFVVLVVWSKQKRVFS >gi|283510541|gb|ACQH01000078.1| GENE 20 30766 - 32082 706 438 aa, chain - ## HITS:1 COG:no KEGG:Apar_1241 NR:ns ## KEGG: Apar_1241 # Name: not_defined # Def: hypothetical protein # Organism: A.parvulum # Pathway: not_defined # 1 118 1 118 119 75 31.0 4e-12 MTKEERAEKWFRNIPNAETIDLETRMEICSKVAKKMVLVFFAVFALELVLLSLIVGDDLF NGLADFVNSLIGGSHSRGSRNVAVIVAAIVCAPLFALPLIVTLVFKKKAIASAVNRTLGI TNKGETRGHGATDKGWERLNGWMETFYPDWDMVVDDTPIKGFTPKDVRQQLAALKPGSYE SIRMWAHIPIRANDEYQFNHLTISPKKRVGYFRIEVFTSDIAHDNHVINYEKAKLSEREV MEVLQNVLNNNLAAELQSWKVAMSYLIEKSTDAENYKAIAKLLPDGDTLFAKVERCIDAP KQYYTDNAERYCEREMDENKRNCDIIWAAIANEMLDNGNAVEVDWDVTKQEFCLKMKGLA DKYHLTLNEDWLDEEDEIETWCEILNKQWADWGLCVAKFTIVCGNFGYLYFIYERQDVEK LSDLCMEVNQCIDVAKEI >gi|283510541|gb|ACQH01000078.1| GENE 21 32209 - 33522 888 437 aa, chain - ## HITS:1 COG:no KEGG:Acfer_1144 NR:ns ## KEGG: Acfer_1144 # Name: not_defined # Def: hypothetical protein # Organism: A.fermentans # Pathway: not_defined # 308 437 143 270 270 67 29.0 1e-09 MTREERAEKWFRNVPHAETLNLEARKEICRRVAKKVMLVSFAVLALEIVLLFLFDKETIF NDAGNFVINGTLQEHYRSGVARTALGVLILYLPLMALPLLAAVLFRKKRMASAVNSFLAE 
RDRDETCEQWATDDESEQFANHSGKRYADWGMDSDDEESSDGFTMIDVRRQLAALKYGKR DSIYIWAPDMIKVDDKFRYNYVFIYQGDRFGCFDVDVFTIDIAQNNHVITYKKENLSEKK TMETLQNLLDNHAIPKLHSWKVYADYLTEKSTDAENYKAIVELLPGGETYLVQVERCIAA PKQYYMDHEERYEDRGMDETDRNGDIIWIAIADEMLDSGNAVELDWKETKEEFYLQMQDL AGKHNLELREEWLQEDGDIPTWCGVLDGKWEEQGFCIAEFDIDSDSYVLFICTCETLEKL RKLGDEVLRTIDLAQEI >gi|283510541|gb|ACQH01000078.1| GENE 22 34096 - 34296 64 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNFNIMSAIFTHEQSALSQKIDSRGGLFWCKMGLACACGKMNFYLKRSSSTPDFGPFAAK CTAFWC Prediction of potential genes in microbial genomes Time: Sat May 28 01:48:12 2011 Seq name: gi|283510540|gb|ACQH01000079.1| Prevotella sp. oral taxon 317 str. F0108 cont2.79, whole genome shotgun sequence Length of sequence - 32959 bp Number of predicted genes - 18, with homology - 17 Number of transcription units - 11, operones - 6 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 669 - 3977 3944 ## Cpin_2947 TonB-dependent receptor plug 2 1 Op 2 . + CDS 4036 - 5436 1392 ## Cpin_2946 hypothetical protein 3 1 Op 3 . + CDS 5441 - 6613 839 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Term 6651 - 6694 9.2 + Prom 6639 - 6698 2.7 4 2 Tu 1 . + CDS 6736 - 6966 90 ## gi|288927436|ref|ZP_06421283.1| hypothetical protein HMPREF0670_00177 + Term 7043 - 7101 2.5 - Term 8152 - 8217 13.1 5 3 Op 1 . - CDS 8222 - 10054 2245 ## Phep_4125 RagB/SusD domain protein 6 3 Op 2 . - CDS 10075 - 13359 3697 ## BF0381 hypothetical protein - Prom 13582 - 13641 1.8 - Term 13616 - 13642 0.1 7 4 Op 1 . - CDS 13645 - 15408 1933 ## BF0380 lipoprotein 8 4 Op 2 . - CDS 15430 - 18747 3750 ## BF0381 hypothetical protein + Prom 20690 - 20749 5.2 9 5 Tu 1 . + CDS 20840 - 21406 137 ## Gobs_3736 hypothetical protein + Term 21588 - 21637 -0.7 10 6 Tu 1 . - CDS 21647 - 21910 388 ## gi|288929097|ref|ZP_06422943.1| hypothetical protein HMPREF0670_01837 - Prom 21965 - 22024 1.7 11 7 Op 1 . 
- CDS 22098 - 23105 1356 ## COG4260 Putative virion core protein (lumpy skin disease virus) 12 7 Op 2 . - CDS 23160 - 24473 1068 ## gi|288929099|ref|ZP_06422945.1| hypothetical protein HMPREF0670_01839 - Term 24563 - 24629 4.2 13 8 Op 1 39/0.000 - CDS 24643 - 25521 1137 ## COG0074 Succinyl-CoA synthetase, alpha subunit 14 8 Op 2 . - CDS 25614 - 26750 1652 ## COG0045 Succinyl-CoA synthetase, beta subunit - Prom 26772 - 26831 4.4 + Prom 26668 - 26727 3.4 15 9 Tu 1 . + CDS 26775 - 26990 97 ## + Term 27159 - 27195 0.4 16 10 Tu 1 . - CDS 27034 - 28800 2168 ## COG4866 Uncharacterized conserved protein - Prom 28905 - 28964 6.8 + Prom 28904 - 28963 4.9 17 11 Op 1 21/0.000 + CDS 29032 - 30069 1280 ## COG0280 Phosphotransacetylase 18 11 Op 2 . + CDS 30089 - 31291 1451 ## COG0282 Acetate kinase + Term 31418 - 31470 16.6 Predicted protein(s) >gi|283510540|gb|ACQH01000079.1| GENE 1 669 - 3977 3944 1102 aa, chain + ## HITS:1 COG:no KEGG:Cpin_2947 NR:ns ## KEGG: Cpin_2947 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: C.pinensis # Pathway: not_defined # 60 1102 113 1161 1161 709 39.0 0 MNPILPRVQTAMHHALRHRLAVAIILPSLLTIAPPAAQATTANLQREPRTAMPDNGKRLL RGTVTDENGEPLMGVSVSADGSTPLTVTDADGKFSVPMPANATQLTFTYVGMATQTINIG QRSNFHIAMSAGEQALKDVVVTGYQTISRERTTGSFNVVTPEKLKGKLQTSILARLEGMV PGMMQQNGTLYIRGMSTLNGGANAYSPLFVVDGLPFEGDINSINPATIKSITVLKDAAAA SIYGARAANGVVVISTIDAKGENKTTIRYDASVKFTPKPDMDYLNLMDTRETIDLREYGF KFNSIPYAMIPPNYYLDPVSELLYKHRDGLIDEQTFQDGMNKYRNLNNRKQLEDFYTQTG IEHQHNLSLSGGNDINRYVFTLNYTGNRFNARYNQMQRYGATLRDNIKLTSWLNAEAALT INYNNTNSDRGMGTYTDMYRNQPSYTMLKDESGNPLNVPTYKSEWEMNRLIGLGLKDEHY SPITNRREERYNSNENYYRLMLGLNLKIMKGLNFDVRFQTENSADKTVEIFSERSYHVRN MINDAAQYNPATKKLTLNVPEGAHYSESRGDADSYTLRAQLNFNRDFGPHSITALAGGER RRTKTTNTSIYYMGFNESTLAYKPIDPTALTNVKGTESLAGNFSWSFTGNNWINEIENRY VSFYANAAYAFDSKYNLTASIRVDQSNLFGTDPRYQYRPLWSVGGAWHIAKERFLAGKLS WLNALTLRATYGIGGNVPRGASPYVTLKAAQYNPWSKDFMAEIKNPPNYTLRWEKTATTN LGLDFSVLNNRLSGSIEVYHKHSTDLLAQRDADPTLGFKQLTLNYGNMTNKGIEVSLNSV 
NLQTRNLRWTTGLNFGYNKNMLDDVEDSNPNVHSYTSGFAAVKGHPLGAIFSFQYAGLSP TDGTPRYYVEGGSKTEAQVSNLSDLVYSGTRHPTYSGSFSTTIAYKDFELSFMLMFYGGN VLRTEPAPFVSEPSHLNPNKEMRNIWRQPGDERNPDATPAFTGYALDLATVRHPWYAADR HVFKADYAKLRHLSLTYRVPQRLVSKCGLAQLAFTLQGQNLLSWSANKYGVDPEALGTTG YGWGMRTIPTPATWTFGLSATF >gi|283510540|gb|ACQH01000079.1| GENE 2 4036 - 5436 1392 466 aa, chain + ## HITS:1 COG:no KEGG:Cpin_2946 NR:ns ## KEGG: Cpin_2946 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 16 459 13 471 476 278 37.0 4e-73 MKTTYNTIVSMALTALLCACDSFLDIKPKGERIPETAADYKTMISHYDIQKMGETYPIYL TDDCYLPDMAFSTSSQGLNTIKPSLKRLYTFDRQVFGDGEDDTFWSSSYSRIFTNNVVIR EVMDATEGSTAEKSAIRGEALVNRALDYLYLVNAYAKHYNEATADADAGVPLLLNADISQ TNLTRTSVKGVYQQILADLNEAESSLPEEISGNAFHATKDAARGLRARVYLYMGNYAEAL KAANEVIARHDTLLDLTRYEVVKPAGQIGRTNVPDADANPESIFIKYAPYIFGLSSQVFP SDSLLKLYEDADMRKVLFMAETFRGKPLPRPIWVPYVRANTAISTPELFLIAAECEARVG YMPHALALVNKLRAHRIKDNGYVSLADKDSVLSFVLEERRRELAFNGFLRLIDLKRLNLD PRFATTVKHVGDTETWVLPPNDPRYVLPIPQRVMRFNANTMTQNER >gi|283510540|gb|ACQH01000079.1| GENE 3 5441 - 6613 839 390 aa, chain + ## HITS:1 COG:BS_resA KEGG:ns NR:ns ## COG: BS_resA COG0526 # Protein_GI_number: 16079372 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Bacillus subtilis # 257 371 44 155 181 72 33.0 1e-12 MKRLFSIALLAGLIAVQCLAVGISVQGYTADKRIKRLYLFMVQDERYGYVQPLDSFEVTD GKFAYQNDTLTTRLFFLSPVFNASDMESCFEQGTYLFLASGNNLLSVSRNARGAITADNP NVKLNAQHALFTHQKDSAGNKHTLDSLDQVFYAARNRNDQREMQRIKTSTAPIYNRAYEQ LRQWLDKEVARQQGSPFGLYLYYTYRLQHANITNRTDLDRARQVVEGANDEAKQTWYYAR ATQKLNALEQTVVGQVAPAIVGEDKAGKALSLSHFKGKFVLVDFWSSSCTWCRKETPNLL KTFNAFKDQNFTILGVSTDFRKADWLKAIKEDGAIWQQLLLKGDARKQVLAAYSIVSIPQ ILLVSPDGTIIAKDLRGNKIYEAVQKTLAK >gi|283510540|gb|ACQH01000079.1| GENE 4 6736 - 6966 90 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288927436|ref|ZP_06421283.1| ## NR: gi|288927436|ref|ZP_06421283.1| hypothetical protein HMPREF0670_00177 
[Prevotella sp. oral taxon 317 str. F0108] # 7 76 5 74 74 75 57.0 1e-12 MAGTTVVAVDYEQHALQTLALVVASVHVFCAIFHSCEFTAYLFCLVFGKNFNMISALFAH QSPTLSQKIDSREGLF >gi|283510540|gb|ACQH01000079.1| GENE 5 8222 - 10054 2245 610 aa, chain - ## HITS:1 COG:no KEGG:Phep_4125 NR:ns ## KEGG: Phep_4125 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 8 606 9 599 602 297 33.0 1e-78 MKHKILSLALLTLPLVACNDFLTREPLDRVNADIDERWNSEEVIRSAMYDLYPIYFKGYN SGWSRSDWFADTQIADWNDDNAQEEATFFIKNVPTTTSSSYPWNFEQVHTINTYLDKLAH SAMPDEPKRHWTGVNRFLRGLEYAKLVSNFGDVPYYQHPVAPTDKAALYRAFSTRQEVMD SVLSDMKYAMDNIRQQDDVKGVHIVNDLVYAYTSRIFLFEGTWQKYHNKNNELAAKYLQV AKDAADYIMKSKRYQLCSNYKDLTASIDLAGNPEIILYRAYVAGVVTHSLMSFNNTESEE SSPSKSLIESYLTKDGLPITQTGGNALYKGDKLLKDEIANRDPRLYATIDTAELRLPSVA SVYAASGYFANRFVNPKLIDQPGGKSYTNITDAPVMKLNEVMMNYIEAAAELATLGKYTL TQTDFDRTINALRQRASTNMPTLQLVGDALRVPAGVINDSQRDADVSPILWEIRRERRVE LVYEGLRFNDLRRWKKLNYADMVKNPKLNMGAYVNKREYMEWYNRVHPTTDPKKILTLEK MKEMKLVLLNAKGEYEVNDSVGYIIPIVKQGFMRTYSDKDYLYPIPIDQLTLYRNAGHAI KQTPGWEDKK >gi|283510540|gb|ACQH01000079.1| GENE 6 10075 - 13359 3697 1094 aa, chain - ## HITS:1 COG:no KEGG:BF0381 NR:ns ## KEGG: BF0381 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 24 1094 35 1111 1111 783 41.0 0 MRRIYYLGVAFAALSGTFAPHQAQAAAAQSPRTTNAVQQQTSLSGIVTDANGEPVTGASV AIEGKGVGAVTDINGRFTIDARPGDKLTITYVGYNPMQVTARADMRITLTENASQLGEVV VVGYGTQKKVNLTGAVANVNVKEAIASRPITDVAKALQGITPGLTITNRIGGVGTQSTVK LRGSVGSLSAEGGTSPLILVDNVEVPNLNLVNPDDIETISVLKDAASASIYGTRAAWGVI LITTKQGAANDKVNVTYSNNFAWSTPTKMPEQMKASDNARFIFEIMKSKGKTRETSIGYT IDEELIKKLEDWENKYGGMSQDELGEMQQGRDFEIVGGNTYFYRSFDPIKEFTRKWTPQQ NHNLSVTGGSKKTTYNISLSYLNQTGVMKYNSDQYDRYTLHSSITSSIRDWWKVRANVLF TRTKNDEPYRFTSGQYDAWFYLLRWPRWYPYATYQGKPFRSAVTDIKHGNRENTTTTYVR TNLGTEITPMKNLSVNFDYTFSFTNDARKLNGGTVYAYDMFASAPFSSYTNLYGLWHNMV EQSSQYTLQNIFKGYATYMFDLAEQHHFKVMAGFDAETREAFSHYSERQKLYSEKFPEIA MTYGDQFSYNSNYSYHNDFAAAGVFGRLNYDYLQRYLLEFNLRYDGSSRFPIGKKWAFFP 
SFSAGWRTSEEKFMDWAKPALSNLKVRGSWGTIGNQDVAAYAYMSIMKPQNSGWVVDGKE MGTFSSPTIRSEELTWERVSTLDLGLDASFFNNDLNVTFDWYRRITSGMHTRGEQLPATF GAEPPKKNYGEITGTGLELGINYQHQFSNGLGVSASASFSHVKEVITKFNSQTRNINGNY QGKRLGEIWGYETDRLFQADDFDSNGKLKPGIASQSKYENDDFKFGPGDVKYKDINGDGE ITYGDNTVENHGDQKVIGNFLPNYEYSFSLGATYKGFDFNMFFQGVGKRDFWVGGAIGIP AGGSSYKEAAYKHQLDYWTPTNTGAFYPRPTDMSWKQNARNYLTQTRFLANMAYLRCKNI TLGYTLPQNIVSKALLQSVRVYVSAENLFEFDSLHLPFDPETTNYKRGKGESDWSFGRSY PYTRTISFGLQAAF >gi|283510540|gb|ACQH01000079.1| GENE 7 13645 - 15408 1933 587 aa, chain - ## HITS:1 COG:no KEGG:BF0380 NR:ns ## KEGG: BF0380 # Name: not_defined # Def: lipoprotein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 587 1 565 565 498 48.0 1e-139 MKKIIIAGLLTLGALASCNDFLDLAPQDKFTDTPEYWDNTVNLEDQCNLFYEDYLGYGNA GGQGWFYYKTLGDDQVDFQKNDWAWKNAPATSSDWKDAYKSIRHANFILERLSTSLMDED DKANFQAIARLNRAWQYYQLVRMFGNVPLVISVPDHHSDNILYGPRVDRDVVMDSVAADL NFATKHIAGTNKQRFTKNMAYAMLADVALYEGTYCKYRKAEDNAGKGPDATRAKTYLNAC VEACKVIMGNAAFKLNDTYQGNYNSTDLSKNPEIIFYKPYSKNTLMHSLIDYTVNTGGTN GMTKDAFDNYLFLDGKPRASTALNKTDAPKMVIVNVQKKVDGVMKTVQDTSYHIGHLFTV RDKRLSATLDSVLAFKGKAYSRAGSNAFTSTTGYGVAKYDNPQLLTTTERNNINRQYTDA PIYWLSVIYLNYAEAMAELGSITQDDLDKSVNLLQRRAGLPGMTLTPDADPANNMGVSNL LWEIRRARRCELMFDNWYRYWDLIRWHQLELVDSKAHPNIYRGANMTTVDKPQVELDTLN NAAPWEIYMAGSNTRKQVRTYDKKYYFYPIPTGQKALNGKLEQNPGW >gi|283510540|gb|ACQH01000079.1| GENE 8 15430 - 18747 3750 1105 aa, chain - ## HITS:1 COG:no KEGG:BF0381 NR:ns ## KEGG: BF0381 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 2 1105 21 1111 1111 1121 53.0 0 MLAALTAPWTTLTAAAKETKGTATVTEMQEQNTITGVITDTQGEPITGASVAVVGKSVGA VSDINGRFTINVRPGAKLTISYVGYKTVTVAASESMKVELQEDAGQLSEVVVVGYGTQKK ANLTGAVSTVDLSKTMAGRPQQDVAKALQGAVPGLSVISNNGDINGKPTLRIRGVGTLSN DAKSNPLIVVDGVPMDDISFLNTQDIESVSVLKDAASTSIYGTRAAFGVILIQTKGAKGT DRTTINYSNNFAWDGATFLPKFPDVPSQLRAALEGKANANNYEVELFGMYFDKLLPLAEK WKQQNGGKQDYREMRQYVDDNNIGDYTIIDKTPYYYADWDINKIYYNNAAPAQSHALSVQ GSSGKSNYYLSLGYDYKEGTMKIRPDKLNKYNASLAVTTSPYDWLQMGARLNFTRRDFTT 
PDTYNNVYQYIWRWGSFFLPSGSINGHDRRIMAMLKQAADKNVTNDYLRINTFVKANITE GLTFNADFTYAIENMNSGSQDFSVYGLNWGSVTPSYIVNKGNTNVWRDNSKTNTWTFNAY FNYEKTFATNHNFKVMLGMNAEKEKYTYFWGNRKGIIDERFPELNLASPIGQDLDAKHTH RASAGYFGRINYDYKDIYLLELNGRYDGSSRFPRHSHWAFFPSVSLGYRFSEEPYFKQLR NVVSNGKLRASFGEIGNEAVGDYMFESLINQVERDDVHWVDSNQENANKLPMFNTPNLVE PILTWERIRTADAGLDLGFLNNELSVGFDWYQRENTDMLAPSQVLPQVLGTDAPMANAGT LRTRGWELTLDWHHRFGEFNVYANFNLADSKTVVTKWESKAKLLNQNYSGKTYGDIWGFE TDRYFEESDFTGQNADGTWNYANGIASQKGLEQGSFHYGPGDVKFKDLDGNGVIDGGEGT ADKHGDLKVIGNFLPRYEYSFHLGGAWRGFDLDLFFQGVGKRHVWTVSSMNFPLMREADL AIYDHQLSYNRVIYKNGLKNVERYEVNQANKYPRLFPGNDPQGNISVIDPGTNNYYPQSR YLTDMSYLRLKNVTLGYTLPKELTRKAYIQKARIYLSASNLFLLYKGNDLPVDPEINAGA GLRYGGWGRTAPITRTLSFGMQVTL >gi|283510540|gb|ACQH01000079.1| GENE 9 20840 - 21406 137 188 aa, chain + ## HITS:1 COG:no KEGG:Gobs_3736 NR:ns ## KEGG: Gobs_3736 # Name: not_defined # Def: hypothetical protein # Organism: G.obscurus # Pathway: not_defined # 4 185 6 179 195 79 32.0 4e-14 MNRIFCLETEWEQTVHDLKSKSAVLPLLDFLRNTIKIDYCFRQVATKFDFRYYIEHLQQP SYSAYDLIYLCFHGQKRSICFADKTDFDLLSFAENEEYNGIFENRNVHFGSCSTMKMTYE DIKTFKRLTKSRMVTGYTKDVDITSSFIFEAWLLNTIHTNEGYAAKRINALAEKEMPYFT KVFGFEAF >gi|283510540|gb|ACQH01000079.1| GENE 10 21647 - 21910 388 87 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929097|ref|ZP_06422943.1| ## NR: gi|288929097|ref|ZP_06422943.1| hypothetical protein HMPREF0670_01837 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 87 1 87 87 125 100.0 7e-28 MKYIKAITWSIVLLIVLVALFAVIKIWGITLPEGWNMSVDTLINIIITVGIADLLLIFLL IFIPFATHGSQKEYDENSGKVAQRKKE >gi|283510540|gb|ACQH01000079.1| GENE 11 22098 - 23105 1356 335 aa, chain - ## HITS:1 COG:RSp1296 KEGG:ns NR:ns ## COG: RSp1296 COG4260 # Protein_GI_number: 17549515 # Func_class: S Function unknown # Function: Putative virion core protein (lumpy skin disease virus) # Organism: Ralstonia solanacearum # 1 335 1 338 343 115 27.0 1e-25 MGITNIFRNQLSAVIQWQADKQQYLWYKYPSKTDEIKNASKLIVGPGQGAVLVYEGKIVD VLTEEGTFNLKTDNHPFFTTLVNMRQNFESEHKLNIYFFRKAQVTNQQWGTASPVKFIDA QYDIPVEMGANGTFSYQISDVEHFFINIVGTRTEVDNEEIRELMLGRLAQNIVTTIHKLG YSYNQIDGHLTEIGQELANLLNEETQKLGFTLTDFRVDGTLFDEQTQERIGRIADVTADA KAAQAGGLTYTELEKLRALRDAARNEGGLAGSGVQMGVGLEMGRQMGAASQLMNERNTPG DDLSRRLARLRLLRDENVITDADYEAKKRELLNEI >gi|283510540|gb|ACQH01000079.1| GENE 12 23160 - 24473 1068 437 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929099|ref|ZP_06422945.1| ## NR: gi|288929099|ref|ZP_06422945.1| hypothetical protein HMPREF0670_01839 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 437 1 437 437 882 100.0 0 MSKRIKVIKCPHCGSTKVKETRPDYYQCEACSTDFFLDNDDININHHYIAPNDDPFQINK TKFLLMVVGIACLIFLPTVFIKCLSSSTPSSSSGLFTSTPEEEERFNTEHIMPFVAKDGR AVVAVFGIIKKGDYRNEKTDYLMRVFDMKKDKKIKEQRLPVDKLNDVQSRTFSNGWINVV INKSTWYTIDPSSFELKEMTLYKSIPELQDGFASIELIDQYGDSEGFKVMTNLGKERYYL PLIAKVYTKEEHYDACEAKLPNPTIETAFRFSKPTTEYPEQQIQLVKYTHYVQEGYPKTD YWSFGWCRDFGGKSGIFFGNAGSVKAFISTYSRQVARLINYSDFTPNTIYFSPKVIWFDK SQLFIRYKPTAKEDAEYIYQLLDANTAQRKWSIKAPAELDDDNIHYCVRSSEGFLISNYR CVWFLGNNGKTITVKKF >gi|283510540|gb|ACQH01000079.1| GENE 13 24643 - 25521 1137 292 aa, chain - ## HITS:1 COG:PM0281 KEGG:ns NR:ns ## COG: PM0281 COG0074 # Protein_GI_number: 15602146 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, alpha subunit # Organism: Pasteurella multocida # 1 289 1 289 289 330 61.0 2e-90 MSILINKNTRLIVQGITGRDGGFHATKMKAYGTNVVGGTSPGKAGQEVSGIPVFNTVRDA VEATQANTSVIFVPAAFAKDAMMEAADAGIQLIICITEGVPTLDVVQAYNYIHQKGAKLI GPNCPGLISPEESMVGIMPTNIFKKGGTGVISRSGTLTYEVVYNLTQNGMGQSTAVGVGG DPIVGLYFQDLLEMFENDPETDSIAIIGEIGGDAEERAAEYIKAHVTKPVAAFISGREAP KGKQMGHAGAIISSGSGTAKEKVAAFEAAGIPVARETSEIPMLLKKQLEAKR >gi|283510540|gb|ACQH01000079.1| GENE 14 25614 - 26750 1652 378 aa, chain - ## HITS:1 COG:BH2470 KEGG:ns NR:ns ## COG: BH2470 COG0045 # Protein_GI_number: 15615033 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, beta subunit # Organism: Bacillus halodurans # 1 374 1 383 386 345 49.0 6e-95 MKVHEYQAKGIFASYGVPVDNSILCKTPQEAVEAFKKLGTERCVVKAQVHTGGRGKAGGV KLASNEAEVLQHATSILGMDIKGFIVDRILVGEAVNIAAEYYVSIVIDRKSKGAILMLSR EGGMDIEAVAKETPEKIFKIAIDPSVGLTDFKAREAAFKLFDDMAQVKQAVPLFQKLYKL FVEKDASLAEINPLVLTKDGQLKAIDAKMTFDNNALYRHPDIFAMFEPTADEKKEQEAKD KGFSYVNLGGEIGCMVNGAGLAMATMDMIKLYGGNPANFLDIGGSSNPQKIVEAMKLLLA DKHVRVVLINIFGGITRCDDVAQGLIEAFKELKTDMPIVIRLTGTNEKEGRELLKGTKFI VAETMAEAGKKAVAAAGY >gi|283510540|gb|ACQH01000079.1| GENE 15 26775 - 26990 97 71 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSSKCKFSFFVLTHGVVCRGNLRNNDFLTNALSVCVLTLFKRFSPCVCLLFVLYGRNAYK LKGNRRLERTS 
>gi|283510540|gb|ACQH01000079.1| GENE 16 27034 - 28800 2168 588 aa, chain - ## HITS:1 COG:FN0277 KEGG:ns NR:ns ## COG: FN0277 COG4866 # Protein_GI_number: 19703622 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 4 317 2 289 290 139 29.0 2e-32 MIKFKDLSTGDRQLIQSYTLWGERQNCDLSFANLISWKFLYNTQFAIVDDYLVFRFHFNR HLAYMMPVAKPQLQDDGTYKVKPCDECSINVIKAIRDDSIAMGHPFLLMGVCHYMRDLIE MRFPDTFEIKPNRDSADYIYTREKLINLSGKKLQSKRNHINKFKNLYPHYIYKELTPDLI PQCLALEKQWRSVSKDDNDEQDLDEELSVELRSMTRAFNRWDRLGLTGGTIWVDNKLVAF TFGCPINQNTFDVCVEKADVNYEGAFTIINQEFVKHLPEQYFYINREEDMGDEGLRRAKE SYKPDILLEKNSVMEKHPLALFEDVNRIKEETRELWKLVFGDSEKFMDLYFSRVFRPKYN IVCQIDNHVVAALQTLPYNLLYHGQEVKAAYISGVSTHPDYRQQGVADNLMRQAHFDLYY KDTVFAALIPAEEWLFEWYGKCGYAKQITCTPPPEGAEDMDFATFDRWQRAKTCILLHDA EAFDVAKEDIRLAGDQYQRPENDVPGMIRVVNAKRALQLYLQQHPETNTVIRVEDDHDIP MNNAYYIVANGKVKKTDEPDANATRLRIEQLADFIFGEERAEMTLMLN >gi|283510540|gb|ACQH01000079.1| GENE 17 29032 - 30069 1280 345 aa, chain + ## HITS:1 COG:CAC1742 KEGG:ns NR:ns ## COG: CAC1742 COG0280 # Protein_GI_number: 15895019 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Clostridium acetobutylicum # 1 332 1 330 333 302 50.0 6e-82 MDLLHQIIERAKSDKQRIVLPEASEERTLRAADRALADDIADIILIGNPSEIHKLAEARG LVNIDKAIIIDPQDNPDSEELAQLLAELRKKKGMTLAQARELVKNNLYLGCMLIKTNRAD GQVSGALSTTGETLRPALQIIKCEPGVTCISGAMLLVTDQKEYGEDGVVVMGDVAVTPMP DAGQLAQIAVCTAATARCVAGFKDPRVAMLSFSTKGSASHEVVDKVVEATRLAKAMAPNL RIDGELQADAALVPSVGQKKAPGSAIAGNANVLVVPNLEVGNISYKLVQRLGNAQAIGPI LQGIARPVNDLSRGCSVDDIYYMVAITACQAQDAKAKACAIPSDK >gi|283510540|gb|ACQH01000079.1| GENE 18 30089 - 31291 1451 400 aa, chain + ## HITS:1 COG:TM0274 KEGG:ns NR:ns ## COG: TM0274 COG0282 # Protein_GI_number: 15643044 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Thermotoga maritima # 1 399 1 401 403 479 57.0 1e-135 MKILVLNCGSSSIKYKLYNMDDKSVLAQGGVERIGLDEAFLKLTLPSGEKKVIMHDMPDH 
KEGVNFVFQTLLDREIGAINDLSEIDAVGHRIVQGGDKFNDSCIVTPEVEQGIEDLCDLA PVHNAGHLRGIRAVDALMPTTPQVVVFDNAFHSTMPEHAFLYAVPYKFYEDYHVRRYGFH GTSHRYVSQRVCDYLGRDIKTQRIITCHIGNGASMAAVKFGKCVDTSMGLTPLEGLMMGS RSGDIDPSAVTYLMEKTGMQPQEMAEFLNKQSGVLGITGISSDMREIESAAEEGNKRAKL ALDMYNYRIKKYIGAYAAAMGGVDIIVWTAGVGENQTGTRLDACSGLEFLGIKMDAEANK VRGKEAIISTPDSKVTVCVIPTDEELVIASDTEKLVSKLK Prediction of potential genes in microbial genomes Time: Sat May 28 01:49:38 2011 Seq name: gi|283510539|gb|ACQH01000080.1| Prevotella sp. oral taxon 317 str. F0108 cont2.80, whole genome shotgun sequence Length of sequence - 8608 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 271 - 330 6.6 1 1 Tu 1 . + CDS 413 - 1261 917 ## BF0442 hypothetical protein 2 2 Op 1 . + CDS 1709 - 4969 3649 ## PRU_0154 TonB dependent receptor 3 2 Op 2 . + CDS 4999 - 6549 1585 ## PRU_0155 putative lipoprotein + Term 6560 - 6607 15.5 Predicted protein(s) >gi|283510539|gb|ACQH01000080.1| GENE 1 413 - 1261 917 282 aa, chain + ## HITS:1 COG:no KEGG:BF0442 NR:ns ## KEGG: BF0442 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 277 1 276 282 299 51.0 8e-80 MMTLIVVSGSRHTDWVFVEDGKIVEKARTEGLNPYFQTRKEISRSVRLGLPEEYFHRKLQ KIYYYGAGCSNESKKKVVSTSIISQFKTPTFVESDLLAAARGMCLDKPGIACILDTGSNS CFYNGEDIVKNVHPGGFILGDEGSGSAMGKAFLSDVLKRLAPEELVVDFLYKNNTTPSEM MEKVYNEPLPHYYLATVSNFLAERPSHPYVRDLVSESFRSFIRRNLIQYDFQSQPVYFMG KVATTWEHLLREVSSEFNFVPARIETDIIPGLVSYHNRRKNQ >gi|283510539|gb|ACQH01000080.1| GENE 2 1709 - 4969 3649 1086 aa, chain + ## HITS:1 COG:no KEGG:PRU_0154 NR:ns ## KEGG: PRU_0154 # Name: not_defined # Def: TonB dependent receptor # Organism: P.ruminicola # Pathway: not_defined # 3 1086 2 1090 1090 1045 50.0 0 MKQRHVLLMALLVICTLSAFAQQTVTGYVFTADDREPILGATVMATGTKIGAATDINGKF TLTNVPNSARSVVVSYVGMKSQTVNIKPEMNIYLEPDARSLDEVVVQVAYGSAKKSTLTG AVSQVGTEHIELRPVSSVTSALEGTTSGVQINSTYGQPGSNPAVRIRGFGTVNGSAAPLY 
VLDGVPFSGDIFELNPADIESMTVLKDAASSALYGNRASNGVILITTKKGKSNKLSVNLR VSQGTYTRGIPEYKLLNPREFMEVSWLNLRNSRMTDKKASAAEAGEYASKNLIADALYLN IFNKANDALFDANGKLVADAQILPGYAGDLDWYDATIRRGARQEYSLSANAANEKSDYYF SVGYLKENGYVVGSDFKRLTGRANMNLRPKKWFTTGFSLNGSYQKSNLADNGSESFKNPF MFSRTIAPIYPVHLHNADGSYRTDALGNLQYDPGSYTDDNGQVVLTRNQFGDRHVAWENE LDKDRTVRNTLEATFYTDFKFLKHFTFTLKGQTSMRNNVNDTYDNPTIGNGKGNNGRATQ RLDRYREYNFQQQLNWNREFDKHTVSALLGHENYYWYHKYTHGRKANIKVPGQENFSNFS AITSLSGYEEDYATESYLGRVRYNYDDKYTAEVSFRRDGSSRFARAVRWGNFGSVGASWM VSRENFMKPVKWVNSLKLRADYGLVGNDAGSGLYSYLTFYGAAQNNSKGAFYLTQIGNDG LKWETGASFGLGVDARLFNRWNLSVEYFNRRNRDLLFDVYMPQSAGATDSDESLPSITRN LGTIENQGFEINTDVDIYKNKNWTVNFATNASFIKNKVISLPEQNKDGILSGAYKIVEGR SRYEFFTYTFVGVDQLTGNSLYKINTDNNYYTRADGTVVGNTSGTDITANVTEINGQYFV NGTTYAQREFHGSAIPKMYGSFTTTVKWKSLAFSALCTYALGGKVYDSTYSSLMSTSTGP SNYSADILNSWRETPAGMTATSADRIWYGGTPQINSSKNKDNNAVSSRWITNGNYFVVKN VNLSYQLPQKWVRAIDLDNVRLSFSAENLLSLSSRRGLNPQQAFSGMLDNYAVTPRVFIV GLEVKF >gi|283510539|gb|ACQH01000080.1| GENE 3 4999 - 6549 1585 516 aa, chain + ## HITS:1 COG:no KEGG:PRU_0155 NR:ns ## KEGG: PRU_0155 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 7 511 4 513 514 464 51.0 1e-129 MKATLLYVYALLAALTLGACSENYLDTESNNAEKRLILSNTSNIAMAVNGLSKMMSTQYL KTQGMNGEGTILTQYGNLHGNDFQKCDFTLWLQMTNLDFVENSTSIYDYYPWYYYYKILG NANAIISHVDAATGPDNEKRFLKAQALTFRAYCFFRLSQLYSKRWADSNNGASRGVVLRL DESTGNMAISTLAETYARIYQDLDDALALYAESKLNRAEDENYAVNADVAHAIYARAALT REDWATAASHAASARANYTLMNNAQYVDGGFNAPNKEWIWSVYSNAQETLHYFQFFAFQG SNSNAAAARNYPCAISKALYDSIPTTDVRRAMFLNPDTMTFNTSSGLAGKWMTAYAKRVY KKKLYSTSRIYAYMQFKMQAVSNPGVGEIPLFRAAEMYLTEAEANCHLGKEAEAQALLVA LNKTSGRDTAYTCTKTGDELLAEVRRYRRIELWGEGFDWFDYKRWNLPIVRRTFADGGSF HSAFAVTIKPNEKNNWTWVIPSRETDYNKLITSNKE Prediction of potential genes in microbial genomes Time: Sat May 28 01:49:59 2011 Seq name: gi|283510538|gb|ACQH01000081.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.81, whole genome shotgun sequence Length of sequence - 2042 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 51 - 2040 1642 ## COG3501 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|283510538|gb|ACQH01000081.1| GENE 1 51 - 2040 1642 663 aa, chain + ## HITS:1 COG:all3320_2 KEGG:ns NR:ns ## COG: all3320_2 COG3501 # Protein_GI_number: 17230812 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. PCC 7120 # 12 140 39 180 219 65 30.0 3e-10 MNWQTGNMHTDWIRVMTPDGGGCRDGVETNRGFVFIPEVGDHVLVGFRHNDPNRPYVMGS LFNGITGGGCGKGNNCKSICTRSGICIAFNDDSRTLTLSDPSGNMVLMDGHGGMRIMAPS DVNLSVGNNLNVTVGKNLNLTVGDNQTSVVGNTLSIKALEQMLVETPVMKQLISDFFHTQ AGRALLNSHGQIKIEADETNVVGERKLMMHSGDHALLNSHGKVEIRAEQGIDELNHAEDY EVEKENIAEQVCVQFRPNKDYSGEFGFDWFRIGDCERKVPVQTKDNSESKGNSQTKASSE TKGNGQTNADGKTEGGSGTKFDIRYDTIMGFYNKTGFVPDDHVWQRFMWQEFSPLYIIPW KKQNKMKDYIYAPPVMTLLKGHSADLVLNMEVNQKAKELRYEYDKKFFQLDKTTAKVLDK GLHINGDTIRITCIEEFGETKFIEVYAYDEEGNKTLAGKLLVLANDAKHRRKIDVVLVRV RCKINSDPTVLPTLPGENSIRKATLMKFFTQCYVEANIEECVLDLSDEKHIKEFMAYTTT KDKKMILDIAKLNNKSTYNYLTNQINNNPNTSKYDGYYKIFLIDELCFKEEKEVYGRARD RNTKEVIVFRKGLNDNTCVHELFHALGLRHPFDKLNKYIFQEFTTDNIMDYSDIGPLHIP VVA Prediction of potential genes in microbial genomes Time: Sat May 28 01:50:01 2011 Seq name: gi|283510537|gb|ACQH01000082.1| Prevotella sp. oral taxon 317 str. F0108 cont2.82, whole genome shotgun sequence Length of sequence - 6449 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 1640 - 1693 1.3 1 1 Tu 1 . - CDS 1890 - 2081 100 ## - Prom 2125 - 2184 2.8 2 2 Op 1 . - CDS 2227 - 4326 2068 ## COG1200 RecG-like helicase 3 2 Op 2 . 
- CDS 4386 - 5078 268 ## PROTEIN SUPPORTED gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 4 2 Op 3 . - CDS 5091 - 5660 683 ## COG0693 Putative intracellular protease/amidase - Prom 5691 - 5750 3.1 + Prom 5635 - 5694 5.5 5 3 Tu 1 . + CDS 5918 - 6449 478 ## COG3183 Predicted restriction endonuclease Predicted protein(s) >gi|283510537|gb|ACQH01000082.1| GENE 1 1890 - 2081 100 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCVEIILTFNPNRTLVFAQILCIGEHIDCLILEDIAGYAERIRQTRCRQANEKLTRNDAN NLK >gi|283510537|gb|ACQH01000082.1| GENE 2 2227 - 4326 2068 699 aa, chain - ## HITS:1 COG:slr0020 KEGG:ns NR:ns ## COG: slr0020 COG1200 # Protein_GI_number: 16331409 # Func_class: L Replication, recombination and repair; K Transcription # Function: RecG-like helicase # Organism: Synechocystis # 23 678 151 810 831 479 42.0 1e-135 MHSILDQDIMYLAGVGPRKKEILSKELNINTFGDLLEYYPYKYVDRTQIYRIAQLQGDMP YVQVKGHILSFEEFDMGPRKKRIVAHFTDGWNVADLVWFQGAKYVLSTYKVGVEYIVFGK PSVYGGRYQFAHPDIDEATELKLAEMGMQPHYGTTERMKKAGITSRAMERLTKGLLEQMQ APLTETLPPFITNALHLVSRDQAMRGVHYPHNTDELQRAQMRLKFEELFYVQLNILRYAN FHRRKYRGYNFSKIGQFFNSFYHNNLPFELTGAQKRVMHEIRKDMDSGRQMNRLLQGDVG SGKTLVALMSMLIAVDNGFQACIMAPTEILAEQHLATVQQMLQGTGVTAELLTGIVKGKR RADVLQRLASGQLHILVGTHALLEDTVQFAHLGLAVVDEQHRFGVKQRAKLWGKSDNPPH VLVMTATPIPRTLAMTIYGDLDVSVIDELPPGRKPIQTLHKYDNQTTSLYAGIRQQVGLG RQVYIVYPLIKESEKTDLKNLEEGYEALTDIFPEYKMSKIHGKMKSKEKEAEMARFASGE TQILVATTVIEVGVNVPNASVMVILDAQRFGLSQLHQLRGRVGRGADQSYCILVTNRALS AETRKRIDIMCETNDGFRIAEADLKLRGPGDLEGTQQSGMAFDLKIANIARDGALVQLAR DEAQKIVERDPDCNLPEHAMLWNRLKELRKTNIDWAAIS >gi|283510537|gb|ACQH01000082.1| GENE 3 4386 - 5078 268 230 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 [Bacillus selenitireducens MLS10] # 5 230 7 232 234 107 31 2e-23 MNYVIIVAGGKGLRMGADVPKQFLLLGGKPVLMRTMECFFNFDAQLQMVLVLPRDQQTYW RELCRTHNFNLPYLLADGGTTRFESVRNGLALIPNDAQGVVAVHDGVRPMVSAEVVERCF GVAQQAKAVIPVMPVVETLRQVMPDGTSQTVNRDAYRLVQTPQTFDLQLLKRAYQQPYQS DFTDDASVVEALGMQITMVEGNKENIKITTPFDLDVCERLLALSSQHREK 
>gi|283510537|gb|ACQH01000082.1| GENE 4 5091 - 5660 683 189 aa, chain - ## HITS:1 COG:SP0804 KEGG:ns NR:ns ## COG: SP0804 COG0693 # Protein_GI_number: 15900697 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Streptococcus pneumoniae TIGR4 # 1 187 1 182 184 131 37.0 9e-31 MAKVYQFMADGFEDIEALAPVDILRRGGLDVKTVSIMGRELVESTNGVCVKADMLFENAN FNDADLLLLPGGLPGSTHLKEHKGLAEVLRRQVETGRKIGAICAAPMVLGTLGLLQGKRA TCYPGVEHTLHGAEYTAELVTVDGNIITGEGPAAALPYAYTLLALLVDRDKADEIEVAMR YKHLMAERL >gi|283510537|gb|ACQH01000082.1| GENE 5 5918 - 6449 478 177 aa, chain + ## HITS:1 COG:MA3494 KEGG:ns NR:ns ## COG: MA3494 COG3183 # Protein_GI_number: 20092304 # Func_class: V Defense mechanisms # Function: Predicted restriction endonuclease # Organism: Methanosarcina acetivorans str.C2A # 13 130 27 150 279 62 36.0 5e-10 MELNTLRETVNTNGEYSCGDIGVSASEWLDLLKQPEARKYHEALFCFLREPNQKGSCTSV AQKYGKSLNYYNGNVKGFSQWVQKRLNRFRVIGIKGSETFWPIAMEKGWQTKKEFLWLLR TELTEALRDLLMEHLIEELRKRKPFYGYEESYKWQLLDNTEGKDALGIIRALKGKNI Prediction of potential genes in microbial genomes Time: Sat May 28 01:50:08 2011 Seq name: gi|283510536|gb|ACQH01000083.1| Prevotella sp. oral taxon 317 str. F0108 cont2.83, whole genome shotgun sequence Length of sequence - 9964 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 1524 1143 ## COG1401 GTPase subunit of restriction endonuclease 2 1 Op 2 . + CDS 1521 - 2834 727 ## Coch_1910 McrBC 5-methylcytosine restriction system component-like protein + Prom 2861 - 2920 3.1 3 2 Tu 1 . + CDS 2940 - 3512 552 ## gi|288929117|ref|ZP_06422962.1| hypothetical protein HMPREF0670_01856 + Prom 3689 - 3748 2.7 4 3 Tu 1 . + CDS 3807 - 6221 2728 ## MCP_0630 hypothetical protein + Prom 6227 - 6286 3.7 5 4 Tu 1 . 
+ CDS 6451 - 7494 739 ## COG2843 Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) + Term 7600 - 7628 -0.9 - Term 7844 - 7881 -0.8 6 5 Tu 1 . - CDS 8055 - 8690 231 ## gi|288929120|ref|ZP_06422965.1| hypothetical protein HMPREF0670_01859 - Prom 8731 - 8790 11.5 Predicted protein(s) >gi|283510536|gb|ACQH01000083.1| GENE 1 1 - 1524 1143 507 aa, chain + ## HITS:1 COG:MA2119 KEGG:ns NR:ns ## COG: MA2119 COG1401 # Protein_GI_number: 20090962 # Func_class: V Defense mechanisms # Function: GTPase subunit of restriction endonuclease # Organism: Methanosarcina acetivorans str.C2A # 190 484 413 664 700 172 36.0 2e-42 NIVANAYVDPVFNKLNKEKPSELIVCGEHLLDETIPIDKRIARFKQEMQDLGHEDWKNFA NDERTASAILTCAYPEKYTFYKWEVYKRLCEYFGYDINKRGTNFGNFMEIINNLVAKYGE TVQQIMLSEIDRYKNKPLNLAIQTLFWCLKDFMLKEMDAAKQHTSQTSANEGTDYSRFDK GVITWKRHKNIVLYGAPGTGKTHSIPEYVVRLCCPTFNANHAEHSQIVAIYNQLKEEGRI AFTTFHQSMDYEDWVEGLRPVVTDNNQVGYNIESGIFKRLCEKANDTNTQEEMIQNDFPK NSSTLPDGENSSKPYIMVIDEINRGNVSKIFGELLTLLEADKRKGNPNTEKVVLPYSKEA FEVPANVFVIATMNTADRSLGALDYAIRRRFAFIAERPYALDNEGFDRDFFKQVSQLFIN NFDEYEKSNFDSTFKLQAADTLSADYKPEEVWIGHSYFLMLDENGEDFTNDRILYEIIPL LEEYIRDGILTTDAQTTIDELYDRAIK >gi|283510536|gb|ACQH01000083.1| GENE 2 1521 - 2834 727 437 aa, chain + ## HITS:1 COG:no KEGG:Coch_1910 NR:ns ## KEGG: Coch_1910 # Name: not_defined # Def: McrBC 5-methylcytosine restriction system component-like protein # Organism: C.ochracea # Pathway: not_defined # 53 436 49 434 437 256 37.0 9e-67 MTTVLQLNEQLKENSLLNQSDAEALLPYIADGGEKEFKVSDDKQDADFYLKLRRKGNDIL ATGSYFVGVDWVKENELAIQVSPKMNSDFEIDYVRMLNEALCDKENLNHLQQLVTISFHK PAIPISQQQDLLSIFLITEYLSVLRRITTKGLRKSYYIVEENLNNKIKGRCLVARNVKQN LSKGRVTNNFCRYQVYGIDSCENRILKRALRFCAKQLEVYRHAFDTSTLDNIVRFVKPHF DNVGEEVTTKAIQTFKGNPIFKEHSTAVELAQLLIRRYSYDITLAGNHQITTPPFWIDMS KLFELYVFRHLRLVFTGKNEVCYHPKAHRQELDYLLKPCHWPEPYVVDAKYKPRYKGMTG IDNDDARQVAGYARLQKIYDMLKLDAENALPIKCLIVYPDQEQQECFAFTNTEEPTFEKV KDYIRFYKIGIRLPVIQ >gi|283510536|gb|ACQH01000083.1| GENE 3 2940 - 3512 552 190 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|288929117|ref|ZP_06422962.1| ## NR: gi|288929117|ref|ZP_06422962.1| hypothetical protein HMPREF0670_01856 [Prevotella sp. oral taxon 317 str. F0108] # 24 190 1 167 167 334 100.0 1e-90 MSIAYENNAKATAQPHFPHPNTTMTNAFRTYGCLAIAALWPTGNLAAQTFSIETVNDSLN YLTLKTDSTTDKWPLPYPVYRLETADVNGDGRTEALVGVIKSTRFYPQKGRRLFVFKNYK NRVRAMWMGSKLGGILQDFRFVDGVVRSLETTTSGRYVVAEYRWQGFGFEFVRFLVVKVG REEALKAFNQ >gi|283510536|gb|ACQH01000083.1| GENE 4 3807 - 6221 2728 804 aa, chain + ## HITS:1 COG:no KEGG:MCP_0630 NR:ns ## KEGG: MCP_0630 # Name: not_defined # Def: hypothetical protein # Organism: M.paludicola # Pathway: not_defined # 167 780 126 742 748 260 31.0 2e-67 MKRTLLLIAFAAIGALLYADNNPKPVILPGKGKLNVESFNKEVNLKTNLSKLSLAELRVL KNAFKAREGFIFKEADLRGIYAQTSWYDSLMWLRSDNENEPLDDSNPNDWGQRQPKLSKA EQDFLKKIEQQENKILNSKHKLPKGQIVNLDLLLNPYQLDDFDPKLQNAISRNGFAIVPD NLQQLFHVYEKNDYSNFPSFVTTDLYLQLFHFYFDNILRDAEEKKLDALVKVFSQGMFNR MTKLVTTPSTGKQTKAAAAFCQAYFAVALALTTGKTPAGVSAAYKSQVASEIKKVMNSEN ALSPFLGYTEVKYPYSLYRPRGHYSRSERIKRYFRTMMWLQSVPFGTDKPDQLKRALLIA HTIGSDPQMKRTYNSMFEPITFLFGEPDNITVMQVYDLMKGQTPEKIFTNERLMADMAKR IDEVGEKQTRIRPKFESTSRNKINLMPQRYMPDAEVLNEMIDATNNVTKRDVPSGLDVFA AIGCSAAERILLKELQEDKRWEGFVPTLQTMKLRMSEIAWTTTVANRWMDALAEMNKPVA GAPYFMLTPQWEKKNLNTALASWAELKHDAILYAKQPMGAECGSGGPPDPIVKGYVEPNI PFWKKAIELVSQIERVFKQYKLNTPKMDASTASVKETAEFLLQVSQKELSPNPILTDEEY NAIEIIGSTIENISLDLVRQDDQYLDGWDNVEGADKSVAVIADVYTANAPNNPNHSILYE GTGPAYTIYVAVPIGNELYLMRGAVLSYRELKQSTDQQRLTDEEWQEKLKAKPYLGVPKW MDEIMVPLDNLPKDNEEVFYSSGC >gi|283510536|gb|ACQH01000083.1| GENE 5 6451 - 7494 739 347 aa, chain + ## HITS:1 COG:BS_ywtB KEGG:ns NR:ns ## COG: BS_ywtB COG2843 # Protein_GI_number: 16080641 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) # Organism: Bacillus subtilis # 35 340 32 343 380 131 30.0 2e-30 MTTKPAQHPRNAPHTSFHAMGCAFFSHAAAYCRGVVLVLCLTFVCTESIWAQKADTLSVV FTGDVLLDRGVRKRIEYLGLSNLVGPKLQQIFNQSHFVVGNLECPATKINQPVFKRFVFR 
GEPEWLTMLAKHGFTHLNLANNHSIDQGRDALLDTRKNIEQAGIIHFGAGRNMTEAAQPL LLTTHPRPVYILASLRLPLENFAYLPAKPCVSQEPLDTLVARIKHLRHNEPNAYIVVSLH WGIEHQLTPTAQQRRQARLLINAGANILVCHHTHTLQSIEQYKGCPIYYSIGNFIFDQTK PINTRACVVKIVLTANEAHVETIPIEINDCVPEVYQLTSLPVNQLTN >gi|283510536|gb|ACQH01000083.1| GENE 6 8055 - 8690 231 211 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929120|ref|ZP_06422965.1| ## NR: gi|288929120|ref|ZP_06422965.1| hypothetical protein HMPREF0670_01859 [Prevotella sp. oral taxon 317 str. F0108] # 1 211 33 243 243 374 100.0 1e-102 MLKKGYELNITNQKGDSLLYFFLKNGNSMKNIITENQIYMNDTVLWRKYNSRYEGIDFTN YFCITYFIANSIQGIELYNKKTGIDFFHGKCLILCGENMQNRMAYHMKEELLLCLQINND QDDGTILLIDLKNAHIHKINIHQLVSNMQDTYFWHSVYIKNANSRYVIITYNNGFSKSTS LKVKRIKSLQNNLILDCKDYNGELKENRVNQ Prediction of potential genes in microbial genomes Time: Sat May 28 01:51:15 2011 Seq name: gi|283510535|gb|ACQH01000084.1| Prevotella sp. oral taxon 317 str. F0108 cont2.84, whole genome shotgun sequence Length of sequence - 164264 bp Number of predicted genes - 117, with homology - 111 Number of transcription units - 62, operones - 22 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 51 - 725 645 ## COG3501 Uncharacterized protein conserved in bacteria 2 1 Op 2 . + CDS 733 - 1989 458 ## gi|288929122|ref|ZP_06422967.1| hypothetical protein HMPREF0670_01861 3 1 Op 3 . + CDS 2001 - 2315 158 ## gi|299141737|ref|ZP_07034872.1| conserved hypothetical protein 4 1 Op 4 . + CDS 2351 - 3202 518 ## gi|288929123|ref|ZP_06422968.1| hypothetical protein HMPREF0670_01862 5 1 Op 5 . + CDS 3206 - 5029 723 ## Fjoh_4017 hypothetical protein 6 1 Op 6 . + CDS 5042 - 5554 260 ## gi|260909346|ref|ZP_05916060.1| hypothetical protein HMPREF6745_0013 + Prom 5572 - 5631 6.0 7 2 Tu 1 . + CDS 5692 - 6312 248 ## gi|288929125|ref|ZP_06422970.1| hypothetical protein HMPREF0670_01864 + Prom 6371 - 6430 4.7 8 3 Tu 1 . 
+ CDS 6453 - 6941 282 ## gi|260912401|ref|ZP_05918946.1| hypothetical protein HMPREF6745_2901 + Prom 7012 - 7071 4.5 9 4 Tu 1 . + CDS 7122 - 7658 172 ## gi|260912404|ref|ZP_05918948.1| hypothetical protein HMPREF6745_2903 + Term 7698 - 7755 -0.3 + Prom 8337 - 8396 4.9 10 5 Tu 1 . + CDS 8534 - 9427 727 ## COG0739 Membrane proteins related to metalloendopeptidases + Term 9470 - 9518 13.6 + Prom 9703 - 9762 3.9 11 6 Op 1 . + CDS 9992 - 11368 1570 ## COG1211 4-diphosphocytidyl-2-methyl-D-erithritol synthase 12 6 Op 2 . + CDS 11358 - 12191 913 ## COG3475 LPS biosynthesis protein + Prom 12249 - 12308 3.7 13 7 Op 1 . + CDS 12352 - 13215 550 ## Coch_0959 DNA-damage-inducible protein D 14 7 Op 2 . + CDS 13276 - 14619 1232 ## COG1401 GTPase subunit of restriction endonuclease 15 7 Op 3 . + CDS 14616 - 16835 2191 ## PRU_0359 hypothetical protein + Prom 16866 - 16925 2.4 16 8 Tu 1 . + CDS 16970 - 18073 1099 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair + Prom 18117 - 18176 1.9 17 9 Tu 1 . + CDS 18289 - 19278 1109 ## PRU_2050 hypothetical protein + Term 19358 - 19392 2.5 18 10 Op 1 . - CDS 19311 - 19622 76 ## 19 10 Op 2 . - CDS 19696 - 21849 2438 ## COG3537 Putative alpha-1,2-mannosidase 20 10 Op 3 . - CDS 21940 - 22956 1080 ## PRU_2540 hypothetical protein + Prom 23152 - 23211 5.2 21 11 Tu 1 . + CDS 23335 - 23985 617 ## PRU_1474 hypothetical protein 22 12 Tu 1 . + CDS 24128 - 24757 573 ## PRU_1474 hypothetical protein + Prom 25046 - 25105 4.2 23 13 Tu 1 . + CDS 25185 - 26612 1692 ## BT_3413 hypothetical protein + Term 26663 - 26714 -0.8 + Prom 26627 - 26686 3.2 24 14 Tu 1 . + CDS 26868 - 27467 272 ## gi|288929139|ref|ZP_06422984.1| hypothetical protein HMPREF0670_01878 + Term 27601 - 27641 1.5 - Term 29159 - 29203 11.4 25 15 Tu 1 . - CDS 29266 - 31806 1848 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 - Prom 31989 - 32048 7.4 26 16 Op 1 . 
+ CDS 32101 - 32475 425 ## COG0251 Putative translation initiation inhibitor, yjgF family 27 16 Op 2 . + CDS 32529 - 33464 1032 ## COG1284 Uncharacterized conserved protein + Term 33527 - 33595 5.8 28 17 Tu 1 . - CDS 33725 - 35380 1181 ## BDI_0813 hypothetical protein - Prom 35551 - 35610 7.0 + Prom 35551 - 35610 7.5 29 18 Tu 1 . + CDS 35632 - 36978 1664 ## COG0015 Adenylosuccinate lyase + Prom 37031 - 37090 2.3 30 19 Tu 1 . + CDS 37172 - 38713 1872 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases + Prom 38731 - 38790 2.2 31 20 Tu 1 . + CDS 38839 - 39219 219 ## PROTEIN SUPPORTED gi|196247180|ref|ZP_03145892.1| S23 ribosomal protein + Prom 39476 - 39535 5.1 32 21 Tu 1 . + CDS 39559 - 40962 1572 ## COG0017 Aspartyl/asparaginyl-tRNA synthetases 33 22 Op 1 . + CDS 41210 - 42658 1666 ## Dfer_1218 hypothetical protein 34 22 Op 2 . + CDS 42668 - 43195 603 ## gi|288929150|ref|ZP_06422995.1| hypothetical protein HMPREF0670_01889 + Term 43236 - 43290 12.2 - Term 43224 - 43276 12.6 35 23 Tu 1 . - CDS 43282 - 44625 1272 ## COG1808 Predicted membrane protein - Prom 44863 - 44922 8.2 + Prom 46209 - 46268 2.2 36 24 Tu 1 . + CDS 46305 - 47648 1278 ## COG3004 Na+/H+ antiporter + Term 47802 - 47853 3.1 37 25 Tu 1 . - CDS 47683 - 48525 260 ## PROTEIN SUPPORTED gi|212640476|ref|YP_002316996.1| Uncharacterized protein conserved in bacteria containing two ribosomal protein S1-like RNA-binding domains - Prom 48555 - 48614 2.8 + Prom 48525 - 48584 5.9 38 26 Tu 1 . + CDS 48776 - 50002 1063 ## COG2233 Xanthine/uracil permeases 39 27 Op 1 . - CDS 50606 - 51100 340 ## gi|288929156|ref|ZP_06423001.1| hypothetical protein HMPREF0670_01895 40 27 Op 2 . - CDS 51102 - 51767 244 ## gi|288929157|ref|ZP_06423002.1| hypothetical protein HMPREF0670_01896 41 28 Tu 1 . - CDS 52218 - 54530 2509 ## COG3537 Putative alpha-1,2-mannosidase - Prom 54718 - 54777 3.0 42 29 Tu 1 . 
- CDS 54935 - 55405 280 ## gi|288929159|ref|ZP_06423004.1| hypothetical protein HMPREF0670_01898 - Prom 55435 - 55494 3.5 - Term 56106 - 56147 -0.9 43 30 Tu 1 . - CDS 56187 - 57272 1183 ## COG0136 Aspartate-semialdehyde dehydrogenase - Prom 57376 - 57435 2.9 + Prom 57172 - 57231 6.3 44 31 Tu 1 . + CDS 57441 - 59555 2191 ## COG0475 Kef-type K+ transport systems, membrane components + Prom 59709 - 59768 2.1 45 32 Tu 1 . + CDS 59860 - 60570 381 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Term 61332 - 61374 9.0 46 33 Op 1 17/0.000 - CDS 61592 - 62968 1397 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1 - Prom 62988 - 63047 2.8 47 33 Op 2 . - CDS 63085 - 64245 1203 ## COG0743 1-deoxy-D-xylulose 5-phosphate reductoisomerase 48 33 Op 3 . - CDS 64250 - 64801 564 ## PRU_2926 16S rRNA processing protein RimM 49 33 Op 4 . - CDS 64880 - 66184 1410 ## COG0766 UDP-N-acetylglucosamine enolpyruvyl transferase 50 33 Op 5 . - CDS 66187 - 66810 702 ## PRU_2928 hypothetical protein 51 33 Op 6 . - CDS 66849 - 67541 713 ## COG1214 Inactive homolog of metal-dependent proteases, putative molecular chaperone - Prom 67624 - 67683 6.1 + Prom 67546 - 67605 5.5 52 34 Op 1 8/0.000 + CDS 67631 - 68506 1229 ## COG1561 Uncharacterized stress-induced protein + Prom 68552 - 68611 3.6 53 34 Op 2 . + CDS 68773 - 69396 653 ## COG0194 Guanylate kinase 54 34 Op 3 . + CDS 69393 - 69992 542 ## COG1057 Nicotinic acid mononucleotide adenylyltransferase 55 34 Op 4 . + CDS 70025 - 70840 540 ## Smon_0395 hypothetical protein 56 34 Op 5 . + CDS 70840 - 72804 1104 ## COG0863 DNA modification methylase - Term 72800 - 72842 3.2 57 35 Tu 1 . - CDS 72874 - 73272 264 ## PRU_1050 hypothetical protein - Prom 73298 - 73357 3.4 - TRNA 73432 - 73506 93.2 # Val TAC 0 0 + Prom 73632 - 73691 3.1 58 36 Tu 1 . + CDS 73716 - 74639 1027 ## gi|288929178|ref|ZP_06423023.1| hypothetical protein HMPREF0670_01917 + Term 74833 - 74866 1.2 + Prom 74815 - 74874 3.1 59 37 Tu 1 . 
+ CDS 74943 - 75905 1050 ## gi|288929179|ref|ZP_06423024.1| hypothetical protein HMPREF0670_01918 + Term 75956 - 75997 7.1 + Prom 76354 - 76413 2.9 60 38 Tu 1 . + CDS 76479 - 77441 956 ## gi|288929180|ref|ZP_06423025.1| hypothetical protein HMPREF0670_01919 + Prom 77476 - 77535 5.2 61 39 Tu 1 . + CDS 77781 - 79835 1393 ## gi|288929181|ref|ZP_06423026.1| conserved hypothetical protein - Term 79816 - 79844 0.7 62 40 Op 1 . - CDS 79980 - 80576 345 ## gi|288929182|ref|ZP_06423027.1| hypothetical protein HMPREF0670_01921 63 40 Op 2 . - CDS 80581 - 80904 89 ## - Prom 80928 - 80987 4.0 + Prom 80731 - 80790 6.1 64 41 Op 1 . + CDS 80933 - 82816 1377 ## COG2194 Predicted membrane-associated, metal-dependent hydrolase 65 41 Op 2 . + CDS 82832 - 83989 929 ## PRU_2544 acyltransferase family protein 66 41 Op 3 . + CDS 84002 - 85009 733 ## PRU_2543 acyltransferase family protein 67 41 Op 4 . + CDS 85006 - 85878 825 ## COG0382 4-hydroxybenzoate polyprenyltransferase and related prenyltransferases + Prom 85925 - 85984 3.3 68 42 Op 1 . + CDS 86183 - 86803 440 ## COG0560 Phosphoserine phosphatase 69 42 Op 2 . + CDS 86763 - 88622 1614 ## PRU_2541 hypothetical protein + Prom 89329 - 89388 6.0 70 43 Op 1 . + CDS 89507 - 90841 1283 ## PRU_2541 hypothetical protein 71 43 Op 2 . + CDS 90857 - 94273 3297 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits + Term 94298 - 94337 -0.7 72 44 Op 1 . - CDS 94548 - 95015 526 ## COG0590 Cytosine/adenosine deaminases 73 44 Op 2 1/0.000 - CDS 95018 - 95662 604 ## COG2360 Leu/Phe-tRNA-protein transferase - Prom 95750 - 95809 2.3 - Term 95694 - 95734 5.3 74 44 Op 3 19/0.000 - CDS 95875 - 98136 1164 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 75 44 Op 4 . - CDS 98142 - 98444 341 ## COG2127 Uncharacterized conserved protein 76 44 Op 5 . - CDS 98488 - 98691 75 ## - Prom 98886 - 98945 4.4 + Prom 98921 - 98980 2.1 77 45 Tu 1 . + CDS 99011 - 99217 60 ## + Prom 100385 - 100444 3.2 78 46 Op 1 . 
+ CDS 100623 - 101081 217 ## gi|288929196|ref|ZP_06423041.1| hypothetical protein HMPREF0670_01935 79 46 Op 2 . + CDS 101106 - 101438 350 ## gi|288929197|ref|ZP_06423042.1| hypothetical protein HMPREF0670_01936 + Prom 101803 - 101862 5.6 80 47 Tu 1 . + CDS 102062 - 102934 218 ## Slin_3307 hypothetical protein + Prom 104073 - 104132 3.4 81 48 Op 1 . + CDS 104170 - 106602 2797 ## COG1629 Outer membrane receptor proteins, mostly Fe transport 82 48 Op 2 . + CDS 106645 - 108099 1699 ## gi|288929202|ref|ZP_06423047.1| mucin-2 (MUC-2) (Intestinal mucin-2) 83 48 Op 3 . + CDS 108169 - 109638 1601 ## gi|288929203|ref|ZP_06423048.1| hypothetical protein HMPREF0670_01942 84 48 Op 4 . + CDS 109697 - 110758 1264 ## BF3067 putative lipoprotein + Term 110767 - 110821 16.1 - Term 111822 - 111866 1.4 85 49 Tu 1 . - CDS 112113 - 112373 92 ## - Prom 112528 - 112587 3.5 + Prom 112684 - 112743 2.8 86 50 Op 1 23/0.000 + CDS 112839 - 113267 315 ## COG1380 Putative effector of murein hydrolase LrgA 87 50 Op 2 . + CDS 113331 - 114029 792 ## COG1346 Putative effector of murein hydrolase - Term 114409 - 114442 2.1 88 51 Tu 1 . - CDS 114530 - 115981 1727 ## COG1649 Uncharacterized protein conserved in bacteria - Prom 116032 - 116091 2.5 89 52 Tu 1 . - CDS 116287 - 117705 1620 ## Phep_4124 hypothetical protein - Prom 117946 - 118005 3.5 90 53 Op 1 . - CDS 118096 - 119961 1997 ## Phep_4125 RagB/SusD domain protein 91 53 Op 2 . - CDS 119974 - 123510 4263 ## Phep_4126 TonB-dependent receptor plug - Prom 123701 - 123760 1.7 - Term 125168 - 125217 8.1 92 54 Tu 1 . - CDS 125463 - 127652 2563 ## COG3968 Uncharacterized protein related to glutamine synthetase - Prom 127857 - 127916 2.8 - Term 128073 - 128113 -0.9 93 55 Tu 1 . - CDS 128117 - 128332 66 ## 94 56 Op 1 . - CDS 128464 - 129561 668 ## BT_3235 hypothetical protein 95 56 Op 2 . - CDS 129586 - 130914 1202 ## BT_3243 hypothetical protein 96 56 Op 3 . - CDS 130936 - 131841 954 ## BT_3237 hypothetical protein 97 56 Op 4 . 
- CDS 131879 - 133468 1719 ## BT_3238 hypothetical protein 98 56 Op 5 . - CDS 133490 - 136807 3888 ## BT_3239 hypothetical protein - Prom 136840 - 136899 6.7 + Prom 137268 - 137327 3.7 99 57 Tu 1 . + CDS 137358 - 137789 104 ## gi|288929218|ref|ZP_06423063.1| hypothetical protein HMPREF0670_01957 - Term 138503 - 138549 8.3 100 58 Op 1 . - CDS 138594 - 139382 737 ## Coch_0245 hypothetical protein 101 58 Op 2 . - CDS 139389 - 140486 1086 ## Coch_0246 hypothetical protein - Prom 140608 - 140667 4.9 + Prom 140586 - 140645 4.5 102 59 Tu 1 . + CDS 140743 - 140964 242 ## gi|288929221|ref|ZP_06423066.1| hypothetical protein HMPREF0670_01960 + Prom 142625 - 142684 1.8 103 60 Op 1 26/0.000 + CDS 142765 - 144039 1438 ## COG0438 Glycosyltransferase 104 60 Op 2 . + CDS 144027 - 144821 735 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 105 60 Op 3 . + CDS 144876 - 146591 1055 ## BF0008 hypothetical protein 106 60 Op 4 . + CDS 146603 - 146803 105 ## BF0008 putative glycosyltransferase + Term 146813 - 146865 -0.2 107 60 Op 5 . + CDS 146891 - 147589 500 ## COG0438 Glycosyltransferase 108 60 Op 6 . + CDS 147505 - 147711 69 ## BF0009 putative glycosyltransferase 109 60 Op 7 . + CDS 147695 - 148372 656 ## BF0009 putative glycosyltransferase 110 60 Op 8 13/0.000 + CDS 148359 - 149651 1006 ## COG0845 Membrane-fusion protein 111 60 Op 9 . + CDS 149653 - 151845 230 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P + TRNA 152067 - 152144 93.9 # Val TAC 0 0 + Prom 152069 - 152128 79.3 112 61 Tu 1 . + CDS 152337 - 153320 1167 ## COG0180 Tryptophanyl-tRNA synthetase + Prom 155014 - 155073 2.0 113 62 Op 1 . + CDS 155094 - 158408 3763 ## BT_3239 hypothetical protein 114 62 Op 2 . + CDS 158420 - 159991 1785 ## BT_3238 hypothetical protein 115 62 Op 3 . + CDS 160005 - 160892 1081 ## BT_3237 hypothetical protein 116 62 Op 4 . + CDS 160906 - 162225 1524 ## BT_3243 hypothetical protein 117 62 Op 5 . 
+ CDS 162234 - 163298 1271 ## gi|288929234|ref|ZP_06423079.1| hypothetical protein HMPREF0670_01973 + Term 163341 - 163401 2.5 Predicted protein(s) >gi|283510535|gb|ACQH01000084.1| GENE 1 51 - 725 645 224 aa, chain + ## HITS:1 COG:mlr6559 KEGG:ns NR:ns ## COG: mlr6559 COG3501 # Protein_GI_number: 13475477 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Mesorhizobium loti # 3 130 46 185 213 68 29.0 1e-11 MNWQTGNMHTDWIRVMTPDGGGCRDGVETNRGFVFIPEVGDHVLVGFRHGDPNRPYVMGS LFNGRTGKGGFAENHLKSIRTRSGHAIELDDAPESLGITIKDNKGNSVHIDSAEDSIVVN AERDITFNAAETFTVNAKNLNLNVEENAIERVGKDKVSTIGNKVSLEATEKEEEISNDSS INIGGLSSQTAGEIVQSATSGDAAITAEGKALLQGKDDARICKG >gi|283510535|gb|ACQH01000084.1| GENE 2 733 - 1989 458 418 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929122|ref|ZP_06422967.1| ## NR: gi|288929122|ref|ZP_06422967.1| hypothetical protein HMPREF0670_01861 [Prevotella sp. oral taxon 317 str. F0108] # 1 418 1 418 418 854 100.0 0 MSGKAFNSFVYDLAARVPVQKTGAKVVVQFRTSGSYLGEYGFDWIRMGDSGRRGDTWYAD IMGDKFITVIVNDKKKRLVLVDKTKRVYNAYAFRYFRSRRFSIPWKRQGANPYIYIAPVM TLWKGASAKLSLKVEVKEPAVKIVYQCQTDGIFRLSKESVPTLGKGKHTLPDELVITCLK EFSRDQEIKVYAYDENNVKHLAGKLIVKANDKKHQRTINVVAVQIKFNKIEPVPNTSNSV TKLNKILRQAYVNLNIRNVKLDIEQALKNKETSNYFTPDGGIYGSSIINNDLYMFLNAEF SKKFPQYNSYFKLYYINRMCFDDASREEMKIYGKARHLPSLEVIIPFRGLKDNTAAHELL HCIGIPHCFSEENQKRFHVAFKENLTDNIMDYSDINSHIPTIATWEFQWNEIQNKLSH >gi|283510535|gb|ACQH01000084.1| GENE 3 2001 - 2315 158 104 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|299141737|ref|ZP_07034872.1| ## NR: gi|299141737|ref|ZP_07034872.1| conserved hypothetical protein [Prevotella oris C735] # 1 96 1 101 219 105 50.0 8e-22 MIKLFLIITSAFFTACVSRAPQKATTKKKNRMTCFERLDTAEANKRILECSEKEEMEFPK EQFLSPFRYCQHIDSMGCTISVTNNIRRQRKGNLFGRYGRCFYE >gi|283510535|gb|ACQH01000084.1| GENE 4 2351 - 3202 518 283 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929123|ref|ZP_06422968.1| ## NR: gi|288929123|ref|ZP_06422968.1| hypothetical protein HMPREF0670_01862 [Prevotella sp. 
oral taxon 317 str. F0108] # 1 283 1 283 283 568 100.0 1e-160 MFVQCLGVKDRLRFFNKAYHIVKEKLVRMEKNIQYLNEVRLRVYSEKEVVSENAFSIESS LLIQEDVPQSSLVSKSRVSLEMSTSNVVMDLSKSLVKSTDLLQRCAYRYDGLQIGVDDVG GVEIKNGQDLARQWAAIRSLLQSDYKGWVVDDYLMRISNAMVSPEVCGTPINNYFYYGLM FVGTPYDTSAGWSRTQQVMLSDFDDVIFEETLRHEKNLGNSRCFSITGKSLGEPNHCKVN KYYGRTQIPTGSLFPDWVQLEVDYTKEDRSVYWKYELTRQEEE >gi|283510535|gb|ACQH01000084.1| GENE 5 3206 - 5029 723 607 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_4017 NR:ns ## KEGG: Fjoh_4017 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 161 599 58 460 461 215 31.0 3e-54 MDFDVNKPANQENLAENENQRQEEKERKWRDPNEDVKYVCEGAKVQCRYCSSPLASLKVT AESVMLQDKPWATVGDNNGQANLGFTGICTHPKWGNHKPPCKAVISLGQWANCSGTIIGN HHALLVKSTIRCMVSGENLKIVHSGQTATLSKINPLDKRNAVMEIKVITPLDKGSHNDGT NKITNLGFVYGETYTLEATAFSNGIPDDKDIKWEAGFIFSNGNTGTVCYKGKPDRWHATG RKVEFSVTELSLLGGTIVFYAYIGSPKQEASLEIWVHYRFRYFDFATVSDEIEKRIADPW LVDQNSTSLCGMAALFYVFIKNNPTMYKSLVTDIHHKGVATYNGYVVEPYGESKEFMYNM NPFGNDYPHQIEQVKDKGSGKLKDTIKRMPMADWLSLAVLRSHESLRWEILSPNSSSRGC EYIEKVIPYSGEMEQSSMDSLAAVNWPSMMERLCKQFLGYPVVNSVGLSLFLLLQKKHPL RGRLNDCFFDSDLEHLQDMENAYKNGAEIIMMIDSQMFDDIVSYSYNDLFTTSHWIVYEG NLKFYDKQGRDTTSIKDAASLSFDFFTWGMEPNGIMATLHHPKISIDCFKSTFYGYVRLC RLFLAHN >gi|283510535|gb|ACQH01000084.1| GENE 6 5042 - 5554 260 170 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260909346|ref|ZP_05916060.1| ## NR: gi|260909346|ref|ZP_05916060.1| hypothetical protein HMPREF6745_0013 [Prevotella sp. oral taxon 472 str. F0295] # 1 170 1 169 169 172 48.0 7e-42 MRNILWCLLILTLILGCSKDKCSYDREISFIFINGWYAPQVPHHGIVYSYEKGSGFVNPI DSVSITSFEKSTRPSIFDKDTAGFSCLVCSLIETEQIDHNHELRIVVDDTLFYDISKQKL SMVKLSYWTMGGPMEWSVVSSLSVNGHDDENPTPEGSLSISNKYVRIQRK >gi|283510535|gb|ACQH01000084.1| GENE 7 5692 - 6312 248 206 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929125|ref|ZP_06422970.1| ## NR: gi|288929125|ref|ZP_06422970.1| hypothetical protein HMPREF0670_01864 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 206 1 206 206 370 100.0 1e-101 MTKNQRQINNLKYGIMIPFLFLIITNAVGRQLKGKNMKGIVPKGYVVTAVKVLPQNHYLL ITEQQQKVKDYFHFQIPIILIKDSVGFVINKGQYVTSGLVPCGNEGFREVRVKGNAFTIV DDLCGDGLHIYSYVTFRYNYQLKSYVLSDYRKTLTSGGNDDSSSSVDVLYIIKKGLRFGK VTADTLLSFKLSPQSEGLYKRFVGKK >gi|283510535|gb|ACQH01000084.1| GENE 8 6453 - 6941 282 162 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260912401|ref|ZP_05918946.1| ## NR: gi|260912401|ref|ZP_05918946.1| hypothetical protein HMPREF6745_2901 [Prevotella sp. oral taxon 472 str. F0295] # 27 162 1 136 136 258 92.0 9e-68 MKRCLFVLICLLLLSACIKKQCKYEYPIAEIYVLNWGDRPIPNKGVAYIYKKGTNFAELI DTVRFYALARGITDSTNLACILNSKKLNHLNDIRVVLDDTLVYDISNIEIEKFLDTRRWG MCGPIERDIIPSLLVNGYVTRDTTVPGELAFPQECGRIIRKR >gi|283510535|gb|ACQH01000084.1| GENE 9 7122 - 7658 172 178 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260912404|ref|ZP_05918948.1| ## NR: gi|260912404|ref|ZP_05918948.1| hypothetical protein HMPREF6745_2903 [Prevotella sp. oral taxon 472 str. F0295] # 1 159 1 161 162 193 57.0 4e-48 MKSIIFVLITIAALSSCKGECKYDEPIEGFFVTHWEEALPRQGKVYSFKAGTNFTDPIDS FNLPVIERIHGYPDWTSCTLRGKKPTHRNDIRLVLDDTLVYDISDITLSWLVDQRHWTMG GPREYCIVSSLKVNGRIVRGTIDVGYLRFPREYARVLKELLRLQEQYNVPIRHKMVKE >gi|283510535|gb|ACQH01000084.1| GENE 10 8534 - 9427 727 297 aa, chain + ## HITS:1 COG:slr0993 KEGG:ns NR:ns ## COG: slr0993 COG0739 # Protein_GI_number: 16331215 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Synechocystis # 75 182 599 707 715 100 48.0 3e-21 MAAQDLLARQAPVDRKMKSVDTLALQSILKREAAQSPAANLYETWDNTHTHHITALPDSF LINLRHFCMPTPSRVVTSNFGSRWGRQHKGLDIKVYIGDTIRAAFSGKVRIVKYEAAGYG KYIVIRHPNGLETIYGHLSKQLVTENEEVRAGEPIGLGGNTGRSTGSHLHFETRLCGVAL NPALFFDFRAQDVVADNYMFRRSTYEMASSEATRLRGVSGNGGYNREDVYGKVGRDRDDN PQTPQYNAAEKLYHKVASGETLASIAEKRGVSIDTICRLNHMRRNDKIRPGQILRYS >gi|283510535|gb|ACQH01000084.1| GENE 11 9992 - 11368 1570 458 aa, chain + ## HITS:1 COG:BH0107 KEGG:ns NR:ns ## COG: BH0107 COG1211 # Protein_GI_number: 15612670 # Func_class: I Lipid 
transport and metabolism # Function: 4-diphosphocytidyl-2-methyl-D-erithritol synthase # Organism: Bacillus halodurans # 8 220 6 214 228 113 34.0 6e-25 MAKRNIAVVLAGGVGKRLGMTTPKQFFKVAGKMVIEHTVDVFERNAHIHEIAIVSNAMLI ADIENIVLKNGWTKVKRILKGGNERYESSLSAIKAYENEPVNLIFHDAVRPLVSQRILND VVDALQSYSAIDVAMPSADTIIEVDGDFISHIPDRSRLRRGQTPQAFALEVIKEAYEKAL LDPNFKTTDDCGVVKKYLPQTPIYVVEGEESNMKLTYKEDTYMLDKLFQLRNNEAERIDL SQVSLAGKVAIVFGGSYGIGADTAKMLVERGAKVHCFSRSSNGIDVGNKEDVERVFAEVA KEEGRIDYVVNTAGLLQKEPLATMNYADIAYSVQTNYMGTVNVALAAYPYLKQSGGRLVF FTSSSYTRGRAFYSIYSSTKAAIVNFVQAVAQEWEADGIRINCINPERTKTPMRVKNFGI EPDDTLLTSEKVAEATLRTLLSDDTGLVIDVRRAHDDR >gi|283510535|gb|ACQH01000084.1| GENE 12 11358 - 12191 913 277 aa, chain + ## HITS:1 COG:SP1273 KEGG:ns NR:ns ## COG: SP1273 COG3475 # Protein_GI_number: 15901133 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: LPS biosynthesis protein # Organism: Streptococcus pneumoniae TIGR4 # 11 258 2 254 267 97 29.0 3e-20 MTANTPLERFKQRHLRACQLKQLTILEAIHDICQRHTIPYWLDGGTLLGAVRHGGFIPWD DDIDIAMHKADLLRFVEVAQKELPKGLIVQSPLIDKHLKEPITKVRDLNSFFVEPGDDFS HNYEKGLFVDIFPFVAYPTLSRKLVKKLTLGISRSYSILHKAHHYSLRSFTEFFWFGAKY AVCRSVWALLCALNRNRDTYISNILHNNGYGIMHRQDSIFPLTEITFEGKRFMAPCNPDA YLQDLYRNYMDIPPVEKRKIHAIFVLETLEEEGEEGI >gi|283510535|gb|ACQH01000084.1| GENE 13 12352 - 13215 550 287 aa, chain + ## HITS:1 COG:no KEGG:Coch_0959 NR:ns ## KEGG: Coch_0959 # Name: dinD # Def: DNA-damage-inducible protein D # Organism: C.ochracea # Pathway: not_defined # 1 284 1 286 287 387 71.0 1e-106 MDTPKIQQYKNAFDLIAKEIKDDANNTIEVWYARELQQVLGYTRWENFVVAISRAIESCK TLKVNVDDHFREVTKMIPTAKGAHRGVQDFMLTRYACYLIAQNGDPKKEEIAFAQSYFAL QTRKTELIEERLQQIARLDTRERLRSSEKQLSRNIYERGVDEKGFARIRSKGDHALFGGL TTEQMKQRLGIKSGALADYLPTLTIAAKNLATEMTNYNVEQKNLHGENRITAEHVQNNRS VRGMLGQRGIKPEELPPAEDIKKVERRVAKEEKSIIKKTPKLPKGNS >gi|283510535|gb|ACQH01000084.1| GENE 14 13276 - 14619 1232 447 aa, chain + ## HITS:1 COG:BS_ydiS KEGG:ns NR:ns ## COG: BS_ydiS COG1401 # Protein_GI_number: 16077677 # Func_class: V Defense mechanisms # Function: 
GTPase subunit of restriction endonuclease # Organism: Bacillus subtilis # 117 408 34 329 343 159 37.0 1e-38 MIIENILQVGDASDNGQVTIVDEDRLCYLVRSESRGAIGLRTISKALLNEYVAFFGENPD ATPTDARDLLCGKSDIDKYEYGYTSTLSTMAKMVIANSGKTPSTIDNRRKAPNVYQQIFY GAPGTGKSNTIKRNVDNEKKICFRTTFHPDSDYSTFVGAYKPSMKPTGAALASGEKEEII TYKFVPQAFTKAYAAAWNTTEDVFLVIEEINRGNCAQIFGDLFQLLDRKGGKSEYPIEAD TDLANYLKEALDKSPRTDIPQEVKSGERLMLPQNLYIWATMNTSDQSLFPIDSAFKRRWD WVYIPIKPHAEENYKIKIGNETYDWWGFLQKINSVIDDTTHSEDKNLGYFFVNTESKTVC AEKFVSKVLFYLWNDVFKNYGFDHSIFTDNLGVKFTFSDFFNEDGTPNTNRVNDFLKKLD NEIDKEHSFIEITSTETENQGLPNTNE >gi|283510535|gb|ACQH01000084.1| GENE 15 14616 - 16835 2191 739 aa, chain + ## HITS:1 COG:no KEGG:PRU_0359 NR:ns ## KEGG: PRU_0359 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 533 1 547 772 439 42.0 1e-121 MRILIEEHPYQATEDILKVVSELGPTIGVGGKVSIGYVGYYYNANIQDCVFILPKVLLDE QGRAFQKYNPEDIVHLEDRNNPLNEEERHFVYELSVWIYRAICVFKERNKDTQIVLHERM AQIGHAQRRMSNTFLDILLSLIQFNQDNQDFFFFTIKNLHSGHNKINWTKTITHSQAYLQ SNRPVYLHPMNKRRQVNFDEELMVIYFSILNYIHHEYGFPFETQCHYDTISNQQFASYLK GMGKTRLQQIRYKYFSDKALELWNLCFAFFDEARQIVVNTKQREYLLVKSFEIVFEDIID ELIGSKNNELPTELKDQADGKRVDHIYRYKALTETSADNIYYIGDSKYYKRNTRIGKEAL YKQFTYARNVVQWNLNLFLDDNRKAGQKVELQAGTGMLRDETTEGYNIVPNFFISAKQCD LEMHSDITWVDREDKDFTSRQFRNRLFDRDTLLVCHYDVNFLYVVALYGRKNEGMKTAWR EKVRAQFRKEIQCMLDNHFQFHIMTPRKFVDGEEVLRQNFKQVLGKVFKPYPNAPNQQCF YSLALERPETINDPKLRATTIEENQNIKQLLGEHFEILECKIGEDRRNELTSPTPKAQCG ISNKDLVLLITKEGVHFDKNIQLLETTKRIGIALKMDGAVLQLVEGFTKARILIIHNKSN KHHAYLLSDKGPRLVPASGAPDMVVTKQHEDLYLVYDVDLTEGQTPLDLGELNLKPITKG GDSYSPHLLPLSRLLAETP >gi|283510535|gb|ACQH01000084.1| GENE 16 16970 - 18073 1099 367 aa, chain + ## HITS:1 COG:SMa2355 KEGG:ns NR:ns ## COG: SMa2355 COG0389 # Protein_GI_number: 16263727 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Sinorhizobium meliloti # 8 353 27 374 379 347 54.0 2e-95 MPAQDFNRKIIHIDMDAFYASVEQRDDPSLRGKPVGIGSEDRRGVLCTASYEARKYGVHS 
AMSGVQAKRLCPELIFLPPDFKKYKQVSQQVHEIFHRYTDLVEPLSLDEAYLDVTQNKLG IELAVDIAQRIRQEIRNELHLTASAGVSFNKFLAKVASDFRKPDGITVIHPSRAMHFIDN LPITDFWGVGHKTAETMHAMGIFNGADLRQLSRLRLRQVFGKAGEMYYNFARAIDPRPVE PHWIRKSVGCERTFDEDSSNPTVLLTELYRLVLELVERLKKNDFKGRTLTLKVKFHDFTQ ITRSRTAPHVLRKKEDILPIAKALLSEVNYSGRPIRLLGVQVSNMNDEPHRPQWIQLELP FKKESSF >gi|283510535|gb|ACQH01000084.1| GENE 17 18289 - 19278 1109 329 aa, chain + ## HITS:1 COG:no KEGG:PRU_2050 NR:ns ## KEGG: PRU_2050 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 56 329 1 272 272 320 55.0 3e-86 MDPTKTYPALMKGKKIMYVHGFGSSGQSGTVTLLRTLMPAATVIAPDLPLHPAEALELLK QTCDAEKPDLIIGTSMGGMYAEMLRGTDRILINPAFEMGDTMVKHNMVGKQTFQSPRTDG IQDFIVTKALVNEYKEITTLLFNGIDEAEQQRVIGLFGDEDTSVDTFDLFAQHYPTAIHF HGGHRLTDKVAMHYLMPLIRQIDDKQTGRQRPIVFIHANTLADSYQKPMPSMHKAYEMLI ENYDVYILAPSPTNAPEQITAQMAWVEQYLNAPAFNRVVFCNNANLLYGDYLISRHEHPN FLGSSILFGGNDLKTWDDVIVFFDRLGGQ >gi|283510535|gb|ACQH01000084.1| GENE 18 19311 - 19622 76 103 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPLAARNAILYVYMPPRNARCSQLLRGYLPSLAWQGVCLLPLLLSETKCVKTHFGGVQDT LYWDYKGLVVRLQGVCSDTTSGLYRGYKQHCTLQIASTRGLKA >gi|283510535|gb|ACQH01000084.1| GENE 19 19696 - 21849 2438 717 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 23 717 45 777 790 526 41.0 1e-149 MTMKKLFALVAFLTVGIGAWAQNLTQYVDPYIGTGGHGHVFLGANVPYGLVQVGPTQYKR GWDWCSGYHYSDSVIIGFGQMHLSGTGIGDLGDIALLPTTDATQHDVMFSHKGEHARPGY YSVMLKSGVKVDLTATARVAFHRYTYPADAQTAFVRLDLAQGIGWDKMTKSAFKQNSPTE LSGYRYSTGWAKDQRVYFVAQFSRPVTMATSQGDSLAVLSTPNNDEPLLVKVGISPVSED NARANIAAELPAWDFRGTVAKANEAWNKELSKIVITTDDDAARRTFYTAMYHSMFAPSVF NDVNGDYRGADGKVYRGDFKNYTTFSLWDTYRAAHPLMTIIHPEKQRDIAQTMLHIYEQQ GKLPVWHLVGNETDCMVGNPGIPVLIDIALKGFDVDKNKVLEAAKASAMRDERGMSLLKK YGYLPCDLDPEYETVAKGLEYALADAGIAKLAKQLGRTEDYKYFLKRSQSYKDHYFDKNT 
GFMRGKTSGGKFREPFDPFNSVHRQDDYTEGNAWQYAWLVPHDVHGLVQCFGGDKPFVAK LDSLFILEGRMGADASPDISGLIGQYAHGNEPSHHIIYMYNYVGMPWKAAPKLRQVMNTL YNDGIDGLSGNEDVGQMSAWYILSSMGLYQVEPAGGKYIFGSPLFKKAVVNVGGGKTFTI EAKNNSAKNIYIQSAKLNGKTYTRSYIDYKDIMKGGTLEFVMGEKPSAFGTKAADRP >gi|283510535|gb|ACQH01000084.1| GENE 20 21940 - 22956 1080 338 aa, chain - ## HITS:1 COG:no KEGG:PRU_2540 NR:ns ## KEGG: PRU_2540 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 337 1 340 341 470 63.0 1e-131 MKHAAKIDVAVLILFFNRPEPFEKVFQAVKAARPSRLFLYQDGPRGEKDMPGIEACRRIA ADIDWECDVKHLYQERNYGCDPSEYISQKWAFSMADKCIVLEDDDVPTQSFFPFCKEMLD RYEHDERVWMVAGFNSDEQTKDVAGDYFFTSVFSIWGWASWRRVIDTWDADYRFMDDPQT LAQLRQLIAQRQLRPDFLKMCADHKASGKAYYETIFWASMLLNSGVAIMPTRNQINNIGV IEDSTHFSTLNTMPAPLRRIFTMQRIEQRFPLTHPPHIIEHVAFKERVYRRHAWGHPWIK TRYAFIELWLNLRRGNFKQIGKSVAKRVRMWLGMEKHR >gi|283510535|gb|ACQH01000084.1| GENE 21 23335 - 23985 617 216 aa, chain + ## HITS:1 COG:no KEGG:PRU_1474 NR:ns ## KEGG: PRU_1474 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 8 213 1 168 171 102 33.0 8e-21 MKKILLMVLLLAVTVNAFSQRRRTVRRSYARSSQWYIMPKGGFNIATVTNWGGDVRFGLA LGGELGCRVSDVITLSGSLIYSQQGTNTWFYSRDFRTDIHLDYLNLPFMVHFNVAPGLEL KVGLQPGILVNDAVTLRTDGYSYDEDFHRSMKRLHWDDGYIESVDLSMPFAISYQYRQLV FDARYNLGLNNVTTRSWWRNGHNSVFQFTVGYKFDL >gi|283510535|gb|ACQH01000084.1| GENE 22 24128 - 24757 573 209 aa, chain + ## HITS:1 COG:no KEGG:PRU_1474 NR:ns ## KEGG: PRU_1474 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 13 206 1 168 171 87 32.0 4e-16 MNMKTLTTLCLSLLFAATTFAQTEPGTFSLLPKVGLNLSSLTKSKFEVDTRTGFCGGLEL NYQLNKSFSLTLGALYSLQGCKQTDKDHKTTVKIEYLNVPVLFNFHITNGLVAKVGVQPG LVLSDRIHQERKGVSGDMCFRDYVRSLPEYKNTNMRDYDIAIPIGLSYEFSNIVLDARYC PGLVSLFDHWDLTNKNYTFQFTVGYKFEL >gi|283510535|gb|ACQH01000084.1| GENE 23 25185 - 26612 1692 475 aa, chain + ## HITS:1 COG:no KEGG:BT_3413 NR:ns ## KEGG: BT_3413 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 28 472 28 477 485 318 38.0 4e-85 MNIKTTPKRNLKVLATLLLGIAPMAMFAQFNRTTFFMEGVSFKQQLNPALAPSRGYVTVP FVGSINMGVSSNAISSEYFSDLLNSSKTADYFTTDKFINQLKDNNRLNMSLASDLAAAGW WAGDGFWNVNLSLKLDLDANVKRDVFEFMRASRGMSDQSWANANLHTNGMQAKINSYMEA GVGYTRPFGDRLVLGAKVKLLFGIADINMKVDEITMQTNLRDIAANQDWSTITEEQLQKV KGKATIKARAEVQGSMAGLTFVTDNNKYVDNMDYSGSGGLAGFGAAVDLGAAYKLGRDFT LSAALLDLGFISWGQGSTIKGVSDFNRVYDFDGVHAGEHRQEFRKTIESGEIFNSDLWNL KAEESKSRTTSLRTTMVLGAQYKLLEDKLTVGALSTTRFGEVNTQSELTLAGNYSFNRHI GIALSYSMIQSQGTGLGLGFKLGPVTLATDYMYFGQDTRTISGMIGVSIPMGNSH >gi|283510535|gb|ACQH01000084.1| GENE 24 26868 - 27467 272 199 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929139|ref|ZP_06422984.1| ## NR: gi|288929139|ref|ZP_06422984.1| hypothetical protein HMPREF0670_01878 [Prevotella sp. oral taxon 317 str. F0108] # 8 199 1 192 192 398 100.0 1e-109 MKRLTLNMVCLTCVFCNCIAQTASLCIKDHAQQNKNEIIQPDSMNGKKVPSYRVTQVEST HRLRWTGKTFIGKHKDKANTWLIKAVIKGMDAKWLRHSLLVTAKDVWVLKDGVQKDFITI VALKSAEKGWDISAFKREYFLIAESKAPLHAVPYFINGSVYVICLIPSSRIVGDEVQPII DRNALYEDMLKPLLNEPFL >gi|283510535|gb|ACQH01000084.1| GENE 25 29266 - 31806 1848 846 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 830 2 801 815 716 45 0.0 MKSQFSPRVSEVLAFSRQEATRLASNVVGPEHLLLGILRGSGGPVNELFERQNTDIEAMR LQLEELAKAHAQHVPMPNNEVALNEMANNILKLAVLEARIQNTQTVDVQHLLLAMLHDRV DNAAKQVLESNNLNYEDTLAYLQKRAMPSQDSLGLSDDEEDEPMPGNSGAQRNERAQATQ TAKGNGKTPILDSFSTDLTQAAMEDNLDPVVGREREILRVTEILSRRKKNNPIIIGEPGV GKSAIVEGLAQLIAKRKTSPVLFNKRIVSLDMAGMVAGTKYRGQFEERIRGLLKEIENNP DIIVFIDEIHTIIGAGSTPGSMDAANIMKPALARGTIQCVGATTLDEYRNSIEKDGALER RFQKVIVEPTTADETLQILENIKDRYERHHHVAYTEDALRACVKLTERYVTDRAFPDKAI DALDEVGAKVHLQHVEVPPAIIEKEKELDEAKLKKQNAVVNQNYELAANYRDMQLRLEKE LEALNYAWAFGDNDNRQPVTEKEVAEVVSIMTGVPVQRMAETEGVRLKGMANELKQAVVA QDAAIEKMTKAILRNRVGLKEPNHPIGVFMFLGPTGVGKTYLAKKLAQFMFGSDDALIRV DMSEYTESFNVSRLVGAPPGYVGYEEGGQLTERVRRKPYSIVLLDEIEKAHGNVFNLLLQ 
VLDEGRLTDGNGRLVDFRNTVIIMTSNAGTRQLKEFGRGVGFNTGGALGLNMNEKDKEYA RSIIQKNLSKQFAPEFLNRLDEIITFDQLDLNAINRIVDIELKGLYKRVEDLGYHLEITT EAKSYVASKGYDVQFGARPLKRAIQTYIEDLLAERILSDELALGGTIVIDKCPDKEELLV KEVVAS >gi|283510535|gb|ACQH01000084.1| GENE 26 32101 - 32475 425 124 aa, chain + ## HITS:1 COG:PH0854 KEGG:ns NR:ns ## COG: PH0854 COG0251 # Protein_GI_number: 14590714 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Pyrococcus horikoshii # 2 124 14 136 137 139 57.0 1e-33 MKAINTEKAPKAIGPYSQAIEANGFVFASGQLPVDVTTGEFVPGGVKEQTRQSLTNARNV LQAAGTDLNNVVKTTVFLSDMANFAEMNEVYAEFFSQPFPARSAVAVKALPKGALVEVEC IAAK >gi|283510535|gb|ACQH01000084.1| GENE 27 32529 - 33464 1032 311 aa, chain + ## HITS:1 COG:TM0177 KEGG:ns NR:ns ## COG: TM0177 COG1284 # Protein_GI_number: 15642951 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 12 291 4 281 283 151 33.0 2e-36 MTAVNQKAIYRELTDYVMIGIGMAFYSLGWVAFYLPNHITTGGVAGLSSIIFWGTHTPVQ FTYFGMNLILLAIALKVLGFKFCLKTIYGVTMMTMFVGLFRQLFPHPTILRDEPFMACII GSCFCGIGLGFGLSYHGSSGGSDIVAAIINKYRDISLGRVILLVDMIIVSLSYLVLKSWE QVIYGYVGLIVVSFVLDQVVNMGKRSVQFLIISERYEEICQRITSTPPHRGCTIIDATGY YSGNHLKLVVLVTKQREASQVYSMIDEIDPHAFVTQSSVSGVYGNGFDKFKVKRSKKRVD GAPKANATPAV >gi|283510535|gb|ACQH01000084.1| GENE 28 33725 - 35380 1181 551 aa, chain - ## HITS:1 COG:no KEGG:BDI_0813 NR:ns ## KEGG: BDI_0813 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 109 500 61 429 468 240 37.0 9e-62 MLTSKNSSTLHVNEHSRTQENKQAEPFFAWHWRIVQCVWVVLFLLFAIRVAAEDGEVERV AMSEDSVSVKNEVADSVLWAQPLPAMQPETNPTDTALHRQDYPRFDRSLTYVGLPLILAG VLGRGERHRVYYLHKEFVPEFKSSVDDYVQFAPFALATGLQVMGVRGKSSAVRYLASSAM SFAIMAAIVNTIKFTVNQMRPDNSTRNSFPSGHTATAFVAATILHKEYGLTVSPWYSFAG YGLATATGVMRMLNNRHWMSDVFCGAGIGIMSTELAYALSDLLFKKKGLVAPEKHNLVDL STHPSFFSIRMSGILGQQRLRTPDSFKPTLGDVPFLMQFGTGAMVGVEAAYFFNPYIGVG TSWQLVGHHVKHLDKPFRRPSATANLPEGVTFDFLRTGLNELCSSLGLYLAWPFAPRMSL 
GGQFTVGRNIFKGISVRAMKEGTVKSGTPESYTDTQQKYTTEWDFLTIQGNKALRLGTGM SFTVAYKQAYSWRAFLNLNYSFNTFTLTLNPSEWKQHEFPDAKNLPNEVITERIKKKLLQ LSVGTAFCISF >gi|283510535|gb|ACQH01000084.1| GENE 29 35632 - 36978 1664 448 aa, chain + ## HITS:1 COG:RSc2720 KEGG:ns NR:ns ## COG: RSc2720 COG0015 # Protein_GI_number: 17547439 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate lyase # Organism: Ralstonia solanacearum # 2 447 3 447 457 455 51.0 1e-128 MSLNMLTAISPIDGRYRGKTEPLANYFSEYALIRYRVRVEIEYFIALCELPLPQLAGVGS ERFDQLRNIYRTFDEAKAQRVKDIEKVTNHDVKAVEYFIKEEFDAIGGLEPYKEFIHFGL TSQDINNTSVPLSLKEALEECYYPLLEELIGQLNKYAEAWKDVPMLAKTHGQPASPTRLG KEIGVFAYRLEEQLAQLKACKMSAKFGGATGNYNAHHVAYPQIDWREFGNRFVSDKLGLV REQLTTQISNYDNMGAAFDAMRRINTIVLDLDRDFWMYISMDYFKQKIKAGEVGSSAMPH KVNPIDFENSEGNLGLANAVLQFLAQKLPVSRLQRDLTDSTVLRNVGVPMGHALIAFQST LKGLGKLILNEEKLAEDLENTWAVVAEAIQTILRREAYPNPYEALKALTRTNEKMTERTI HDFVNQLDVNEDVKKELLAINPCNYTGV >gi|283510535|gb|ACQH01000084.1| GENE 30 37172 - 38713 1872 513 aa, chain + ## HITS:1 COG:SPy0369 KEGG:ns NR:ns ## COG: SPy0369 COG1187 # Protein_GI_number: 15674518 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Streptococcus pyogenes M1 GAS # 278 507 1 232 240 167 39.0 4e-41 MESEMEKTPQDTSKGQQEESREGYNAGGSYSRTYGNAGARTQRPRIHTQRAYSTDRRSSG GEDEGAFRPEGFGANLQTGAAPQQRQYRPRYNNYNNEGGGYQQRSSYNSNNRSGYQQRQE GGYRSRYNNAQEGEGGYQQRANYQPRQEGGYRPRQEGSYQANQEGGYQQRSNYQPRQEGG YRPRYNNEEGGYQQRGGYQQRSYNNNNNRGGYQQRGGYQQRGGYQQGGYQQRGGYQQRGG YQQGNYRQRTPDYDPNAKYSLKKRIEYKEENIDPNEPLRLNKFLANAGVCSRREADQFIQ SGMVKVNGNVVTELGTKVLRSDEIIFHDQPVTLEKKVYVLINKPKDYVTTSDDPQQRKTV MDLVKGVCPERIYPVGRLDRNTTGVLLLTNDGDLASKLAHPKFLKKKVYHVFLDKEVTEH DLQQIATGVTLDDGEVHADAIEYANDKDKTQVGIEIHSGKNRIVRRIFESLGYRVVRLDR VQFAGLTKKNLKRGDWRFLTEKEVDMLRMGAFE >gi|283510535|gb|ACQH01000084.1| GENE 31 38839 - 39219 219 126 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|196247180|ref|ZP_03145892.1| S23 ribosomal protein 
[Cyanothece sp. PCC 8802] # 1 112 2 113 119 89 40 1e-16 MQTKRFQELVAWQKAHAFVLAVYQLSSHFPAHERFGLCSQFQRAAVSIPANIAEGYRKLG IADKLRFLNIAQGSLEECRYYVLLSKDLGYIDSNTYETMAIQIEEVSKVLNGYCRGILNN SHGNEL >gi|283510535|gb|ACQH01000084.1| GENE 32 39559 - 40962 1572 467 aa, chain + ## HITS:1 COG:sll0495 KEGG:ns NR:ns ## COG: sll0495 COG0017 # Protein_GI_number: 16332045 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl/asparaginyl-tRNA synthetases # Organism: Synechocystis # 6 467 54 513 513 535 54.0 1e-152 MEKTRRTKIVDVLQRTDFGSEVNVKGWVRTHRSSKVVDFIALNDGSTINNVQIVVDPTKF DTELLKQITTGACVSATGKLVESQGQGQNSEIQCTELQLLGGCGSDYPMQKKGQTFEYMR QHAHLRLRTNTFGAVMRIRHNMAIAIHQYFHEHGFFYFHTPIITASDAEGAGEMFQVTTK KLDDLKKDEDGKVNYNDDFFGKMTSLTVSGQLEGELGATALGQIYTFGPTFRAENSNTPR HLAEFWMIEPEMAFIDLDDLMDTEEDFIKYCVHWALDNCKDDLAFLNKMIDNTLIERLQG VVSEEFVRLDYTEGINILEEAVKAGKKFEFPVSWGMDLASEHERFLVEEHFKKPVIMTHY PKGIKAFYMKLDEDGRTVQGTDVLFPQIGEIIGGSVREENYDKLVAQIEERHIPMKDMWW YLDTRRFGSCPHGGFGLGFERLILFVTGMQNIRDVIPFPRTPKSAEF >gi|283510535|gb|ACQH01000084.1| GENE 33 41210 - 42658 1666 482 aa, chain + ## HITS:1 COG:no KEGG:Dfer_1218 NR:ns ## KEGG: Dfer_1218 # Name: not_defined # Def: hypothetical protein # Organism: D.fermentans # Pathway: not_defined # 19 478 26 523 526 105 23.0 5e-21 MKTITLMALAMCLNLLSAMAQVTARIEYPHRGDYEDQVVAPLGERGLIVYSFAKKADEGK RYFKTEYLSTDLKPLFTDSTLIDDDLYFYSCTFNNDVNYTVLRARSGAFSVYAFNTKDHK TKVTDSEYTRKASMRDLVIHDGKLVFSSTQKRLERIGIVDLNTGESRFADIHIPGVRDRK VFILQNTILDNIIYAVVKGERELYLARVDFNGTQLGITNLTTDIDQDLLSASLSKANNKY FMTGTYTKSKKGGAQGIYFAQLENWQIKKQRFVNFLDLKNFTDYMSDRRKARIERKKEKA EKAGKEYSLKYNIASHGIMTDGKDYFYLGEAFYPTYVTYWAGTVMVTQFNGYAYTHAVLA KFDADCNLLWDNCFEMRPRQKPMYVKRFISAGIKKNDVDLVYADDKMLVSKLFNNKTGEV AQERKSEMIETDNEDEDVKKMRYSGTLHWYGKNFIVYGRQIIKNHNTGERRKVFSITKYT IK >gi|283510535|gb|ACQH01000084.1| GENE 34 42668 - 43195 603 175 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929150|ref|ZP_06422995.1| ## NR: gi|288929150|ref|ZP_06422995.1| hypothetical protein HMPREF0670_01889 [Prevotella sp. 
oral taxon 317 str. F0108] # 1 175 1 175 175 299 100.0 4e-80 MKTTRNTQGLARLFAALLLAFCSATPLFAQAFDGSDDSKIYLGYTNVGGRHGVEIGYEEG INDFLSYGAKFTTLRYKDDEGEKESRFYDFSDLGIYLNYHFMEVLKLPDNFDIYVGPILS TKTMSLQTGVRYNFGEVFGVYGSVQYNFFETITIGSNLGVYPEKFAFSAGFTISI >gi|283510535|gb|ACQH01000084.1| GENE 35 43282 - 44625 1272 447 aa, chain - ## HITS:1 COG:SP1264 KEGG:ns NR:ns ## COG: SP1264 COG1808 # Protein_GI_number: 15901124 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pneumoniae TIGR4 # 41 337 25 319 347 227 40.0 3e-59 MTNSEHTLWQVVKGYFNALPDKENEKETIKQINSGVIFHGANLWVLIFAILIASLGLNVN STAVIIGAMLISPLMGPIIGMGLAVGITDLELLKRSATHYLVATVISVITATLYFIITPL TEAQSELLARTSPTLYDVLIALFGGAAGILALCTKGKGNVLPGVAIATALMPPLCTAGYG LAVGNLSYFFGAFYLYFINTVFIALATFLGVRMLRFKRKDSVDAARLAVVNRYMIGAVVL TMLPAAYMTVQIVRESVQENAILKFVKDEFETKGAQIISHKIDDKSSSLNIVTVGKRLTE AEITDARRRMEEYKLEQFKLNVIQGTQTDSLMLSQMLNVNGGASQAKNTKLLEQAYEIQA LKAKNRKYERLPKLASDMRTEIQAVCPQTTSISLGNVAEARVDTTATTQYVLAVVDCKGK LSPTDKKRLHQWMRARLKVDSLRLVVQ >gi|283510535|gb|ACQH01000084.1| GENE 36 46305 - 47648 1278 447 aa, chain + ## HITS:1 COG:jhp1447 KEGG:ns NR:ns ## COG: jhp1447 COG3004 # Protein_GI_number: 15612512 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/H+ antiporter # Organism: Helicobacter pylori J99 # 7 436 4 427 438 313 44.0 5e-85 MRQHYQKELENRLVIPTLQFLHRERSSGIVLGIFVLLALVLANSPLRDTYTHFLEHHFGF SVNGQNYLDFSLEHWINDGLMSMFFFVVGLELKREFIGGELRDLRKVTIPVVAALCGMLF PAAIYLLFNAGTPTAHGWGIPMATDIAFALAVLQLLGNRVPMSAKVFLTTLAIVDDLGSV VIIALFYTTEISLASLVIGFVVLAVMLAGNRLGIKSVWFYGVLGIGGVWVAFLTSGIHAT ISAVLAAMVIPADARIPEAAFIARIKKLTRQFENAAPNDVRTLEPEQVEILARVQADSSR AIPPLQKLEHGLHPFVSFVVMPIFALANAGVSFVDLDFGAVLSNGVAIGVLAGLLVGKPA GIVFAVWITERLGLGKRSRGMTWQRILGLGFLAGIGFTMSMFVTMLAFTSPEHAIQSKIG IFAASILGGIVGYIILRKPSHSSKKRA >gi|283510535|gb|ACQH01000084.1| GENE 37 47683 - 48525 260 280 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|212640476|ref|YP_002316996.1| Uncharacterized protein conserved in bacteria containing two ribosomal protein 
S1-like RNA-binding domains [Anoxybacillus flavithermus WK1] # 6 273 2 275 285 104 27 2e-21 MARIKLGDYNQLEVVKEVEFGMYLDAGDEGEVLLPKRYVPKGCKLGDKLDVFLYLDQDER LVATTETPLAKVGEFAYLQVSWVNLHGAFLNWGLMKDLFCPFREQKQHMEIGESYVVAVF IDEESYRIAASAKVEHFFATDFPPYCSGDEVDLLVWQTTELGFKVIVDNAYPGLVYRSQV FRPLHVGDRLRGYIMGVRPDGKIDVSLQPQGRKQTIDFAETLLQYLKDNGGFCDLGDKSP AEDIKRRFEVSKKVYKKAVGDLYKRRLITIEEGGVRLVQG >gi|283510535|gb|ACQH01000084.1| GENE 38 48776 - 50002 1063 408 aa, chain + ## HITS:1 COG:VC2171 KEGG:ns NR:ns ## COG: VC2171 COG2233 # Protein_GI_number: 15642170 # Func_class: F Nucleotide transport and metabolism # Function: Xanthine/uracil permeases # Organism: Vibrio cholerae # 10 397 2 399 417 339 49.0 6e-93 MEQIQLTASKRIVVGVQFLFVAFGATVLVPLLVGLDPSTALLTAGLGTFIFHFLTKGKVP IFLGSSFAFIAPIIAASKQWGMPGTLAGLVGVSLVYFVMSALVRWQGKRLLRKLFPPVVI GPVIILIGLSLSSAAVDMAKTNWILALASLVTAILVLTLGRGLLKLVPVVSGIAVGYVLA FFMDEVDFQPVIDAQWFALPASLAHFSLPQFSWEPLLFMIPVAIAPVIEHIGDVYVVGAV ADKDFTKDPGLHRTLLGDGAACLASALLAGPPVTTYSEVTGAMSITKVTSPAVIRISAAT AIVFSVIGKLSALLRSIPQAVLGGIMLLLFGTIASVGLLNLMRNKVDLNQTRNIIIIAVT LTMGIGGAVLSYGNFAISGIGLSAIVGITLNLCLPQEKDRKEEEEKNN >gi|283510535|gb|ACQH01000084.1| GENE 39 50606 - 51100 340 164 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929156|ref|ZP_06423001.1| ## NR: gi|288929156|ref|ZP_06423001.1| hypothetical protein HMPREF0670_01895 [Prevotella sp. oral taxon 317 str. F0108] # 8 164 1 157 157 294 100.0 2e-78 MKRLISIMWLLGFLSFNLFAQKEKSEQNAALEIAKYLVEERRVFYGKPAEFLYKELKRRK FPIKHMSTMSSGLWGAERARGVVFYNYSSTEIDTLNPNVVILEIRLDMNVDDHEFWQSIS NTDDWMEQILEKTKGYGVDEIEYRVKRIYPNEWKKVEFKKSSEE >gi|283510535|gb|ACQH01000084.1| GENE 40 51102 - 51767 244 221 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929157|ref|ZP_06423002.1| ## NR: gi|288929157|ref|ZP_06423002.1| hypothetical protein HMPREF0670_01896 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 221 1 221 221 433 100.0 1e-120 MVGDLFSAEEQKGTIYHYIPFSEFVDISSKNLNYEYSIAADDLDGEISINPVHTDKNPFN STTYSGRSTVYSVHSHVNMLPPSPRDLEQICNIAADFQGRPKYKATMVYIPQDSSFYSLV ITDRDKAAKLSERLKGEIDNNNSFVEKGIFQDLLTKNKCSYENLNKIDKELIKLALVIKL MDGGISIVRHSRKHGKATQIYDVSPLKTKRGVNIYKPIKCQ >gi|283510535|gb|ACQH01000084.1| GENE 41 52218 - 54530 2509 770 aa, chain - ## HITS:1 COG:CC0533 KEGG:ns NR:ns ## COG: CC0533 COG3537 # Protein_GI_number: 16124788 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Caulobacter vibrioides # 60 768 57 756 770 340 32.0 6e-93 MPINHSLKPSFALRTLGRYGLCAALLLFAFLLSGTAKAEKQPVDYVNAFIGTTNFGTTNP GAVCPNGLMSVVPFNVMGSANNKFDKDARWWSTPYEYHNIYFTGYSHVNLSGVGCPDLGS LLLMPTTGKLEVDYKAYGSPMMMQKASPGYYTNFLPRYDVRTEVTATPRTSMARFWFPKG KSNILLNLGEGLTNESGAMVRRVSETEYEGVKLLGTFCYNPQAVFPIYFVMRVDTKPTAA GYWKKQRPMTGVEAEWDKDNGKYKLYTNYTKEIAGDDVGVYLNFDSRGNEPVVVRMGVSF VSMANARENLDAEQKGKDFDTIKEEARNAWNKDLSRIEVEGGTEDQRTVFYTALYHALLH PNVLQDVNGQYPTLETGEVKTMREGNRYTVFSLWDTYRNVHQLLTLVYPNRQRDMVRTMI DMYREHGWLPKWELYGRETYTMEGDPAIPVIVDTWLKGLRGYDINTAYEAMRKGATTPGS QNLLRPDNDDYLSLGYVPLREQYDNSVSHALEYYIADNSLSRLANALGKKADARLFYNRS LGYRHYYSKEYGTFRPILPNGKFYSPFNPKQGENFEPSPGFHEGNAWNYTFYVPHDVMGL AKLMGGQKNFVEKLQMVFDKDLYDPANEPDIAYPHLFSYFKGDEWRTQKELHRLMAKYFK NAPDGIPGNDDTGTMSAWLVFNMMGFYPDCPGKPAYTLSSPVFNRVTIHLDKEQWGRNEL VIEAKRQLPTDIFINAITLGSKPFKGYRVSHEELLKGGNLTFELKSEPSR >gi|283510535|gb|ACQH01000084.1| GENE 42 54935 - 55405 280 156 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929159|ref|ZP_06423004.1| ## NR: gi|288929159|ref|ZP_06423004.1| hypothetical protein HMPREF0670_01898 [Prevotella sp. oral taxon 317 str. 
F0108] # 16 156 1 141 141 265 100.0 1e-69 MNKHIKNGIISMIAWMLFLVILFGSYLYLTNSPFSYFVDEETGGFISSAFFLGWALIWFG IGRHYSIDYEAKKQVFIESHEGIDRYIIDKAFRKAYFSSGAKVLAIVCFISVPCYVAANV KGEPTLKDCILIGMLMLASIILYAYYKRNRAAGVTL >gi|283510535|gb|ACQH01000084.1| GENE 43 56187 - 57272 1183 361 aa, chain - ## HITS:1 COG:aq_1866 KEGG:ns NR:ns ## COG: aq_1866 COG0136 # Protein_GI_number: 15606903 # Func_class: E Amino acid transport and metabolism # Function: Aspartate-semialdehyde dehydrogenase # Organism: Aquifex aeolicus # 24 353 4 336 340 372 54.0 1e-103 MRINVESCTFAMQYIIILKEKEMKVAIVGASGAVGQEFLRILSERAFPIDELLLFGSERS AGRTYLFNGKEHVVRLLQHNDDFRGVDIAFVSAGGGTSKEFAETITKHGTVMIDNSSAFR MDPEVPLVVPECNAEDALNHPRGIVANPNCTTIMMVVVLQPLEKLSHIKRIRVSSYQSAS GAGAAAMAELQTQYSQLVNGEEPTVSRFPHQLAYNVIPQVDVFTDNGYTKEEMKMYHETQ KIMHTDAKCSATCVRVSSLRSHSESVWIETERPLSVEEARKAIAAAPGCTLKDDPSQGVY PMPLETGGKDDIFVGRIRKDLADENGLTLWLSGDQIRKGAALNAVQIAEYLMKNDKTLTK G >gi|283510535|gb|ACQH01000084.1| GENE 44 57441 - 59555 2191 704 aa, chain + ## HITS:1 COG:BH2844_1 KEGG:ns NR:ns ## COG: BH2844_1 COG0475 # Protein_GI_number: 15615407 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, membrane components # Organism: Bacillus halodurans # 6 395 5 388 388 272 41.0 1e-72 MFSAFPITDPTLIFFIVMLIILFAPIVMGKLRIPHIIGMVLAGVAVGEYGFNILKRDDSF ELFGRVGLYYIMFLAALEMDSSSLKRNKYRFMVFGLATFIVPFVLIYLLGTTFLHYSMPA SLLLATLMASNTLIAYPIIGRYGLQRNSAVSLSVGSTMIALFLSLVVLAMIVAAHNEGGG GLFWLWFVVKVVGFCVAMAFFIPRLTRWFLRRYSDAVMQFIFVMAILFLSAAVSSAIGLE GIFGAFLAGLILNRYVPRLSPLMNRIEFIGNAIFIPYFLIGVGMLINLRLLFQGQGIVWV VLAITLVGTLGKAIAAYITSWRFKMPAPSGQMMFGLTSAHAAGAIAIVMVGMHLQVEPGR YLITDDILNAVVIMILITCIISSVMTERAAKTFILSEKNIPASEISEQDDESMLVPVKYP EYANRLMGLAIMMRNPKLNRGIIALNLVYDDENMRANQEKGRQLLDQVSKYAAATDVRTQ TQVRIAANIANGIKHAFNEFHASEIIIGLHTHKEVSPKFWGQFHQSLFNGLNRQIIMARL NQPLSTIRRIQVAVPSRAQYEPGFYRWLERLARVVLNLECRIVFHGRQDTLALINEFVRN RHASMRAEYALMNHWNELPSLAATIAPDHLFVVVTARKGTVSYKTALDRLPDEITRHFSG TNLMIVFPDQYGDGIEELTFAEPQHTEGTSAYEDVGKWVKKLKK >gi|283510535|gb|ACQH01000084.1| 
GENE 45 59860 - 60570 381 236 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 215 1 217 245 151 38 2e-35 MITIKSITKSFGSLQVLKGIDLQINKGEVVSIVGPSGAGKTTLLQIIGTLDRPDGGEVVV DGVDVGTLSSKKLSDFRNQHLGFIFQFHQLLPEFTALENVMIPAYIAGKKKSEAMQRANE LLSFLGLADRANHKPAELSGGEKQRVAVARALINGPAVVLADEPSGSLDSKNKAELHQLF FDLRDKFGQTFVIVTHDEELAKLTDRTIHLRDGMVERNLATEQPSETAEQNEEVAL >gi|283510535|gb|ACQH01000084.1| GENE 46 61592 - 62968 1397 458 aa, chain - ## HITS:1 COG:STM0223 KEGG:ns NR:ns ## COG: STM0223 COG0750 # Protein_GI_number: 16763613 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Salmonella typhimurium LT2 # 1 426 1 422 450 112 27.0 2e-24 MEIFLIRLLQFMLAISILVLLHEGGHFFFARLFGVRVEKFYLFFDPWFHLFEFKSKKSGT AYGMGWLPLGGYCKISGMVDESFDTEQMAKPAQPWEFRVKPAWQRLLIMLGGVIVNFLLA LFIYSMVLFHWGDSYVQVKDMTAGMKFNSEAKALGFKDGDVLLGTEKGEFKTFDADFFRD LATATRVDLVRQGKHMSLPMPGNLDLLNMLKSSPRFVMIMMPNTIDSVMPNSIAAKAGLK AGDKIVAFAGKPIDSQNDFNFEKERLGDILAAATTPADSAKALNTTISFVHQGDTTATTA AVKLNADLLFGMVFTNGLAKYKETHVEYGFFESFPAGAAYGVKVLKGYVGDMKYLFSADG AKSLGGFGAIGSLFPPMWDWHMFWLMTAFLSIILAFMNILPIPALDGGHVLFLLYEMITR RKPSEKFMVRAEYAGISILIILMVMANLNDVLRALGYM >gi|283510535|gb|ACQH01000084.1| GENE 47 63085 - 64245 1203 386 aa, chain - ## HITS:1 COG:CAC1795 KEGG:ns NR:ns ## COG: CAC1795 COG0743 # Protein_GI_number: 15895071 # Func_class: I Lipid transport and metabolism # Function: 1-deoxy-D-xylulose 5-phosphate reductoisomerase # Organism: Clostridium acetobutylicum # 3 353 2 354 385 340 51.0 2e-93 MKKQICILGSTGSIGTQALDVIAQHADKYEVYCLTANTRVELLAQQARKFRPAAVVVADE SRYQQLQDLLTDLPEVKVYAGKQALCDIVEAEPIDMVLTAMVGFAGLEPTIHAIKAHKKI CLANKETLVVAGELINELAIANRAPILPVDSEHSAIFQSIVGEGDNAIERILLTASGGPF RLLPEEQLAKVTKADALRHPTWDMGAKITIDSATMMNKGFEVIEAKWLFGVEAEKIQVLV HPQSIVHSAVQFEDGSVKAQLGMPDMRLPIQYAFSYPDRLPLNGPRLDFFAQPLEFFEPD MRKFRCLQLAFDAINRGGNMPCILNAANEVVNEAFRADRIGFLDMPRIIEETMQKVPFDA APSLDTYLQTDAESRRKASELVAQLR 
>gi|283510535|gb|ACQH01000084.1| GENE 48 64250 - 64801 564 183 aa, chain - ## HITS:1 COG:no KEGG:PRU_2926 NR:ns ## KEGG: PRU_2926 # Name: rimM # Def: 16S rRNA processing protein RimM # Organism: P.ruminicola # Pathway: not_defined # 8 182 11 181 181 199 57.0 5e-50 MSQQSSTRPQWQADNFYKIGKLGKPHGVKGEISFHFDDDIFDRIEADFLALEVEGLLVPF FFEEYRFKGSKTALVKFVNIDTQERAKELTGCNVFFPRQEADNPDAELSWAAIVGFTVKD AATNKRLGTLLAVDDSTVNTLFEVDVNGSAPLLLPASYDLIAGVDAKKREITMHIPDGIL DLD >gi|283510535|gb|ACQH01000084.1| GENE 49 64880 - 66184 1410 434 aa, chain - ## HITS:1 COG:BB0472 KEGG:ns NR:ns ## COG: BB0472 COG0766 # Protein_GI_number: 15594817 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine enolpyruvyl transferase # Organism: Borrelia burgdorferi # 1 434 16 439 442 385 47.0 1e-106 MQSFLIEGGRPLQGTITPQGAKNEALEVICAAILTAEEVRISNVPDILDVNNLILLLADI GVKVTRHARNDYSFKADELNLDFLSSDEFVRKCAALRGSVLLIGPLLGRFGQAMVAQPGG DKIGRRRLDTHFWGFRMLGASFEKLPNRNVYQLRANQLQGRYMLLDEASVTGTANIIMAA VLAKGTTTIYNAACEPYIQQLCRLLNRMGARIGGIGSNLLTIEGVETLHGATHQVLPDMI EVGSFIGMAAMVGQGIRIKNVSVENLGLIPDTFKRLGITINIEGDDLFIPHHEHYQVEAF IDGSIMTLADAPWPGLTPDLLSVLIVVATQARGSVLFHQKMFESRLFFVDKIIDMGAQII LCDPHRAVVIGHDHKLQLRAGRMTSPDIRAGIALLIAALSANGTSQIDNIAQIDRGYEDI EARLNALGASIKRV >gi|283510535|gb|ACQH01000084.1| GENE 50 66187 - 66810 702 207 aa, chain - ## HITS:1 COG:no KEGG:PRU_2928 NR:ns ## KEGG: PRU_2928 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 11 195 1 185 196 206 55.0 3e-52 MNNRTSTTKIMNIEGLDYNTQRERLILPEYGREIQKMVNHAMTLPTKQERQACAESIIAT MSILFPQGYDAVDLEHKLWDHLAIMSDFKLDIDYPFDVSEATRATTKPQHVGYTSSRIPV RHYGKLLFETFERLKDMPAGDERDALVRLTANQMKRSLAQWGHGAADNERVASDLAAFTN GKIQLDLDTFKFEKLPAKEPEKKRKKK >gi|283510535|gb|ACQH01000084.1| GENE 51 66849 - 67541 713 230 aa, chain - ## HITS:1 COG:SA1856 KEGG:ns NR:ns ## COG: SA1856 COG1214 # Protein_GI_number: 15927626 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Inactive homolog of metal-dependent 
proteases, putative molecular chaperone # Organism: Staphylococcus aureus N315 # 5 175 13 178 229 82 31.0 8e-16 MSCILNIETSTAVCSVAVSNDGECIFNAEDHDGPNHAIKLGTFVDEALSFADSHALPLDA VAVSCGPGSYTGLRIGVSMAKGVCYARNAKLLAVPTLQLLCVPLLLREQITPDDALLCPM LDARRMEVYAALYDRALKEVRSVGADIVEADTYKAFLDQHHVFFFGNGADKCKAEIVHPN AHFIDNVKPLAKNMTPLAEKAMAEERFEDVAYFVPFYLKDFVAKAAKPLL >gi|283510535|gb|ACQH01000084.1| GENE 52 67631 - 68506 1229 291 aa, chain + ## HITS:1 COG:CAC1716 KEGG:ns NR:ns ## COG: CAC1716 COG1561 # Protein_GI_number: 15894993 # Func_class: S Function unknown # Function: Uncharacterized stress-induced protein # Organism: Clostridium acetobutylicum # 1 290 1 291 292 125 32.0 1e-28 MIQSMTGYGKAVVTFNDKKINVEIRSLNSKTLDLSTRVAPLYREKEIEMRQMVAKALTRG KVDFSIWVEKDAVADATPINAALVENYYQQIKEISARTGIPEPQDWYATLLKMPDVTTRT EMETLDEDEWKAAAEAIKQALEHIVAFRKQEGMALQKKFEENLQNIGKLMADIEPYEKAR VEKIRTRIVDALKDIPEVDYDKNRLEQELIYYIEKLDISEEKQRLTNHLKYFAKTMNDES EQGKKLGFIAQEMGREINTTGSKSNQAEMQNIVVMMKDELEQIKEQVLNVL >gi|283510535|gb|ACQH01000084.1| GENE 53 68773 - 69396 653 207 aa, chain + ## HITS:1 COG:RP765 KEGG:ns NR:ns ## COG: RP765 COG0194 # Protein_GI_number: 15604599 # Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Rickettsia prowazekii # 16 202 4 186 197 145 44.0 4e-35 MDNNQHIDPNGQSSTQHKGALVIFSAPSGSGKSTIINWLMQSHPELRLAFSISCTSRAPR GTEQHGVEYFFLSPEEFKQRIAQDEFLEYEEVYKDRFYGTLKEQVQRQLDAGQNVVFDVD VKGGCNIKRFYGDKALSIFIQPPSIEALRQRLEGRATDAPEVINDRIARAEYELSFAPQF DKVIINDDLETAKSQTLEVINAFLSRP >gi|283510535|gb|ACQH01000084.1| GENE 54 69393 - 69992 542 199 aa, chain + ## HITS:1 COG:BS_yqeJ KEGG:ns NR:ns ## COG: BS_yqeJ COG1057 # Protein_GI_number: 16079618 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid mononucleotide adenylyltransferase # Organism: Bacillus subtilis # 1 185 1 181 189 106 32.0 2e-23 MIRTGFYGGSFNPIHNGHIALAQQFLDDMGLDEVWFVVSPQNPFKRNANDLMADKARLEI VRAATANEPRFCATDYELHLPTPSYTWRTLQALAHDEPQRSFVLLIGADNWVSFPKWNHH 
EDILASHDIAIFPRRGYDINANELPANVTLLNTPFYDISSTDIRHRIAEGLPIDHLVPSA VKDMVIKTYANGDMTQREQ >gi|283510535|gb|ACQH01000084.1| GENE 55 70025 - 70840 540 271 aa, chain + ## HITS:1 COG:no KEGG:Smon_0395 NR:ns ## KEGG: Smon_0395 # Name: not_defined # Def: hypothetical protein # Organism: S.moniliformis # Pathway: not_defined # 9 268 6 267 274 222 44.0 1e-56 MNTTKTDKEEVIRTVVSACVECYASGFSDRHLAERDDADGTINMKIHNVFIAALGPEIQY YSALARSLDSSLGNMLEKMAIKIASLNYEVSQHVEGILYKQQTDKIAELLERYKRHDLKP SVDDYTGLSKMTGEVIEKRHESDYYLIDKETGMRYLIELKIGGDLDNKKARSEKEALLEQ YCILTNSLGSEEQVKILFATAYNRYGEGHPWTQGRVRQFFAEEELCISRDFWNLVCKSEN GYDIVIDEYRKNAHFINEALERIKKTYLPSV >gi|283510535|gb|ACQH01000084.1| GENE 56 70840 - 72804 1104 654 aa, chain + ## HITS:1 COG:SMc00021 KEGG:ns NR:ns ## COG: SMc00021 COG0863 # Protein_GI_number: 15964679 # Func_class: L Replication, recombination and repair # Function: DNA modification methylase # Organism: Sinorhizobium meliloti # 5 236 18 268 376 137 33.0 1e-31 MEVPNWKKCLIQADSKQVLRKIPDQSVDFIFTDPPYNIGRHSTGNIPLPGRAPMNNDVAD WDWVDFYPEEWADELVRVLKPTGNLFIFTSYNQLGKWYECLDHRFDTSNFMVWHKTNPAP KIFKAGFLNSCEMIFTCWNKRHTWNFISQKEMHNFIESPICMRPERLSCPKHPAQKPVSI LKKIIEIATNANDIVLDPFMGVGSIGVAALALERRYIGIEINNEYFKAAKKRVESMLTPM YEETDWACENPKSNETNGYVNSDTLPFEYENKANDTQPDVVNRHLRPLLKWAGGKERELQ YILPNLPTFERYFEPFVGGGAVFTAVHAKRYFVNDLSQELTDIYQYIATQDELFFKYANA IVDTWDMAKDFFLQNSNLLSLYQNYQRGVLSRANLEQAITVFCKDKNKDIAAMLGKDLQC AQKVLQTEIKKNLQRKMLRMRKLEEERGELPVKDIEDNLETAIKSALYMALRHLYNDKSI TSGNRKLHGALFLFIRNYAYSGMFRYNARGDFNVPYGGIGYNSKSMRKKLDYYLSAPLHD LFTQTTIFNQDFEVFLKRNTPSKNDFIFLDPPYDSEFSTYAQNEFTQSDQQRLANYLLNE CQAKWMLVIKNTDFIFGLYNKDGIHIRSFEKSYAVSFMNRNDKRTTHLLITNYS >gi|283510535|gb|ACQH01000084.1| GENE 57 72874 - 73272 264 132 aa, chain - ## HITS:1 COG:no KEGG:PRU_1050 NR:ns ## KEGG: PRU_1050 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 3 132 2 135 135 87 38.0 2e-16 MQKVIIIVEQSSDGAFWCRTETDIAGVGLNACGKDVASAKADLLECYEEAKADMKAEGKE MPDVEFEYQYDLQSFFNYFNFLNVTEIAKRAGINPSLMRQYSSGIKTAGEKTYKRLSACV 
DNIKSELQAASF >gi|283510535|gb|ACQH01000084.1| GENE 58 73716 - 74639 1027 307 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929178|ref|ZP_06423023.1| ## NR: gi|288929178|ref|ZP_06423023.1| hypothetical protein HMPREF0670_01917 [Prevotella sp. oral taxon 317 str. F0108] # 1 307 1 307 307 622 100.0 1e-177 MTRPIYTLTLLFFVLTTICARNKKADANCIGNIFPATETDSTQLYIEEEDKTDYSKFAIQ PIRVDTIWGDWEIHAHQFYDGNKFTYDKETHADYAVRFNVFKGGKPVFKGYKVNSKSLMG PNYIKGFELGLFEQFEITPTSVYIGFGYCEPETDNCLEKVLALNADGTVRKYETSIASAE GDIDMYVTDIYWLYTLYVNELSLATPNAASIQKVLNKYCTTDFAKRLQRETLKKNLLLGS DQFDYQWLRSLEVHLMDEKTNTCIVNYKRADGKMVYKRIHLQLKPETEYEYLVSGVSDAT EKDLPTM >gi|283510535|gb|ACQH01000084.1| GENE 59 74943 - 75905 1050 320 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929179|ref|ZP_06423024.1| ## NR: gi|288929179|ref|ZP_06423024.1| hypothetical protein HMPREF0670_01918 [Prevotella sp. oral taxon 317 str. F0108] # 1 320 1 320 320 639 100.0 0 MKKLALTFSLIAITLLTACSGNKKADANGAGNDSIAAKTDSVPLYVEEEDKTDYSKFAVQ PTRIDTTVGDWEIHIREFYDGKKFKLGNQVFANYSLKVNIFKGGKPVFKDHKIDAKAVAG SNYIKDFTLGVSENVFVTETTVYLDLSYCEPETDNCQMYTLALCANGQVRKVQTAAESYE GDMDGYVFDVYDFYAMYVNELTQPQPNKAAIQKVLNKYCTKGFAQKMQSHTSKNNPLLGS GKFEFQWLRSFIVHSKEAGTNSCIIHYQIPGGKKVYKRVLLQKKPRTEYDFIISGVQDAT EDELPSMSDGEEEYAEDDEM >gi|283510535|gb|ACQH01000084.1| GENE 60 76479 - 77441 956 320 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929180|ref|ZP_06423025.1| ## NR: gi|288929180|ref|ZP_06423025.1| hypothetical protein HMPREF0670_01919 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 320 5 324 324 633 100.0 1e-180 MKKLTFTLSFIAITLLTGCSGDKKADANSTGNDSISAKTDSVQPYVEEKDTTDYSKFANP PTRIDTTVGDWEIHVREFYEGKKINYPERYMSYTSLKVNIYKAGKPVFKNYKLDIKSLVG KDINSSFTLFAPQEVYVTPTTVYLSLSYCDEILALDRIYTLAFCANGQTRKVEITNESYE GEYNRYIFDLYNFYAMYVNELAQPQPNPAAIRQVLNKYCTKGFAQKMQSHTIKNNPLLGP GMFQFQWLRSFAVHSKIAKTNACLIHYQTPGGKKVYKLLQLQKKPKSEDDFLIDGVSNAT ENDRPVMDYDDETFVTDNEI >gi|283510535|gb|ACQH01000084.1| GENE 61 77781 - 79835 1393 684 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929181|ref|ZP_06423026.1| ## NR: gi|288929181|ref|ZP_06423026.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. F0108] # 1 684 1 684 684 1374 100.0 0 MKLHKKQALLLALLLMPLHGLQTSAQPQPAPSTQPLPTAAPTLWGANAKTYAGHLPVHVD SGRIYMEIPTACIGRDVLISAQVNRGFDFNAHPIKSLGVVQIVAPNTETILLKPLKNVAE GEMIVVKNEADRGIAYPVLGRTKAGAAVIDITNELLTGKQWFSYQELYTIREMVPDLSSL LDVKVNNGITQFRIRRYHGQQAEEGNFSSSMIVLPEGSVPLELSCRVQLLPQTYAPIRLA TKGVKNLTVPTESTSPTAESNMPIQRWALNRPLTFYIDSLFPPQYIGAVKAGVAAWNKAL ADAGIPNALQVEQVTPQTDVTQPRMYVSFELGQHETTSKMLCHPLSGELLRGWINVGKAV LEGRKLDYFLYLPHFDMRFWDENSTDAIVQETIQGRVMAKVAKVLGISTETDYGPSALQL TDADRRAVAYAYATAQGNTPYEQRADLIKRLSIKPQAATGDTPAFDKRVRIMDNVLQLMP QIDQKANMAKFKKEQGFALWHLYRDAVDTYGDQLVALSETFYKEDPTPLQRKAMRLLSKY IMLADSGLESKRINENQLASKGHELSMSLDGVFGNLFSTDTFNMLLKQGTKRGGYGINEH IGILCDAFFNNIETSRMPSNYLANLQLRYIMALNKAAKDAHKGSKLQLWLQHKIATTRTM LATMAQKCNSNEGKAWYKLLQDRR >gi|283510535|gb|ACQH01000084.1| GENE 62 79980 - 80576 345 198 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929182|ref|ZP_06423027.1| ## NR: gi|288929182|ref|ZP_06423027.1| hypothetical protein HMPREF0670_01921 [Prevotella sp. oral taxon 317 str. 
F0108] # 14 198 1 185 185 380 100.0 1e-104 MQRLALHIICLMGMLCNCKAQTAPTSTDTTLQNKSETRQVDSVRESNTPSYRVLSVENTD TLRWTGKAFTGRHKNKGNTWLIKAVMKGMNPNWLRHALWVTVKDVWLLRDGVEKDVISVM AFKSAETRWSISAFKREKFMVEECKAPLYAVPYFIDGCAYIICIIPPSQIVGNEVRTLID RKTLYEDVLKPLMDKPFF >gi|283510535|gb|ACQH01000084.1| GENE 63 80581 - 80904 89 107 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYIATNRRTTACSDLLCAAWNTSAHCSKMRYRFVILCKFTLFIIDNRQCKRLICLVKENG HAPNLALINLSTRLLFNLPPPIQLRIIKARELSGEQKKRYLYSKIII >gi|283510535|gb|ACQH01000084.1| GENE 64 80933 - 82816 1377 627 aa, chain + ## HITS:1 COG:HP1417m KEGG:ns NR:ns ## COG: HP1417m COG2194 # Protein_GI_number: 15646209 # Func_class: R General function prediction only # Function: Predicted membrane-associated, metal-dependent hydrolase # Organism: Helicobacter pylori 26695 # 267 607 223 554 556 153 29.0 9e-37 MTLTVLRKKIFGYVVKPLQINAPFFVSMYLLGVICAWLTLPHSANEKIYRHLYTELLADL YTLCLLLSILPQRIRPWVRGVLYLILYLTTAVDVFCRVQLDSVITPTMLLLVGETNGREA TEFLSSYLSFSILFSKFGWIVLFALSHLAYTLRRFAPKRLKAACQKLSGKLPTISTPLAN KLKRYAPIGFTGILIWSFCVTAGNKVALWKLMHGKSVGEVEHMLTEQNHGECYHPVHHLI FSIYANSLASQQLNKLIAAAKTAKVDSCTYTSPHIVLIIGESFGKHHSQQYGYVQPTTPR QIRRERSKQLVKFNDVVSPWNLTSYVFKHVFSLYDVGDRGEWCDYPLFPQLFRKAGYHVT FITNQFLPQAKENVYDFSGGFFLNNPTLNKAMFDTRNDSLHIFDEDLVREYARLKSQEGR HNLTIFHLIGQHVGYKFRTPNDRRRWNGTDYKTLRPDLTPGQRSVLALYDDAVLYNDSVV DLIVKQFESEDAIVIYMPDHGEECYEENRGIVCRNHSAAIDYPLAKYEFEIPFWIWCSRT YIARHPDIYRQIQNARNRRFMTDALPHLLLYLAGIKTKSYREENNLISPKYNENRPRILK GTTDYDKLPRPKPKSAASNIGQRLTTH >gi|283510535|gb|ACQH01000084.1| GENE 65 82832 - 83989 929 385 aa, chain + ## HITS:1 COG:no KEGG:PRU_2544 NR:ns ## KEGG: PRU_2544 # Name: not_defined # Def: acyltransferase family protein # Organism: P.ruminicola # Pathway: not_defined # 55 384 2 340 341 266 43.0 9e-70 MPPNERHKEEKNAVQTCQQQATAHANSHANKSTSGQNATSINACMRGLAIIGIFLHNYCH WLGPIVKENEYTFNAENVTRMNHALVHPDAQLPMHLLSFFGHYGVPVFLFLSGYGLFKKY HGVQVPAGKFLFSHYLKLFRMMAVGFALFIVVDTLYPPSWHYDSLKVISQLLMFNNLLPR PDKMIWPGAFWFFGLMMQFYLLYRLVLHRRHWGITALLMALCLIVQLQLSPLGEPMNRYR 
YNFMGGMLPFGLGLLYAQFQKSATLWGDDNVKQSAALLVCIVLTYYLSQTFLGWTFVPAV ICLATLLTANLLARAPVMAWLYRALCWMGGISAALFVTHPVTRKLIIPISRQGQPYLGLL LYIVASVALAWATDKLIRRIPLPKR >gi|283510535|gb|ACQH01000084.1| GENE 66 84002 - 85009 733 335 aa, chain + ## HITS:1 COG:no KEGG:PRU_2543 NR:ns ## KEGG: PRU_2543 # Name: not_defined # Def: acyltransferase family protein # Organism: P.ruminicola # Pathway: not_defined # 25 331 1 304 305 305 52.0 2e-81 MIPTTNGIRFANISLHRSELMGLAMISIVLFHVSLPRLSPFYGLWRMGNIGVDVFLFLSG VGLWYALTGNKSLKRFFTRRYLRIYPTWFLVACLYYVPRFWQGQHGWKQVIDLAGDVLVN WDFWLHDELTFWYIPATMMLYFFAPAYVRLVSRHPAYRWLPVLMIVWCVAVQWVSPIHDA VGHIEIFWSRVPIFFIGINCGALVKQGLTLSPQSVWLLLAAAVLSLGVDIYLEQTRHGLF PLFIGRMLYIPATFSLVLLLGNWLTAVPQRIKRGLRLVGTVSLEMYLLHAHFVLVYIKRL QWGYWPTFLACFAITLPLAWLLNKGMEQLTKRFER >gi|283510535|gb|ACQH01000084.1| GENE 67 85006 - 85878 825 290 aa, chain + ## HITS:1 COG:alr3506 KEGG:ns NR:ns ## COG: alr3506 COG0382 # Protein_GI_number: 17230998 # Func_class: H Coenzyme transport and metabolism # Function: 4-hydroxybenzoate polyprenyltransferase and related prenyltransferases # Organism: Nostoc sp. 
PCC 7120 # 5 290 25 318 326 156 34.0 6e-38 MKKTLLLIRPQQWIKNGFVLIPMFFGGRLLNADDVIASVVTFFAFSFAASAIYCFNDIVD VDADRRHPVKCHRPIASGAVSVPTAYALMAVLALLSALLLFFLPQRAGETAGIVAFYFLL NMAYCLWLKRHAIIDVCTVAFGFVLRILAGGMACDVAVSNWLVLMTFLLALFLSFAKRRD DVLRMNETGEPPRRNTIRYNLTFVNQAITVTGTVTLVCYIMYTVSPEVVSRFHAPYLYLT SIFVLVGLLRYMQLTVVDEVSGDPTKILLRDRFTQAIVVAWIMAFLLIIY >gi|283510535|gb|ACQH01000084.1| GENE 68 86183 - 86803 440 206 aa, chain + ## HITS:1 COG:PA5547 KEGG:ns NR:ns ## COG: PA5547 COG0560 # Protein_GI_number: 15600740 # Func_class: E Amino acid transport and metabolism # Function: Phosphoserine phosphatase # Organism: Pseudomonas aeruginosa # 12 204 15 207 207 99 35.0 4e-21 MMTTNSTTSIRVSLFDFDGTLTTLDTLPAFIAHAVGRWRMLVGFGLYLPLIVLMKLHLYA NGRVKERLFAHFFKGMTLETFDAHCQSFAKNNVHILRPQGMKTVEEACQRGESVLIVSAS ITNWVLPFFRQLPRVQVIGTQIAVENGRLTGRFASPNCHGAEKVRRVENLFAPRHQYYIT AYGDSQGDKEMLAYADEPHYKPFRTK >gi|283510535|gb|ACQH01000084.1| GENE 69 86763 - 88622 1614 619 aa, chain + ## HITS:1 COG:no KEGG:PRU_2541 NR:ns ## KEGG: PRU_2541 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 158 618 1 439 445 369 42.0 1e-100 MQMNHITNPSEQNKRQSLGEMLRFGIVGLTATAIQYATYWVGLQFTNHNLAMTVAYLLSF AFNLWASLRYTFRVGATPGRWAGFAVAHVVNYLLQMATLNLFVDLGVSKTLAPLPMFAVC VPINFLLVRFFLKHNGIRMFAVFKVRREEWGVGAPLLILFVGLHALLIAHYYALFTPLQA HYWDLFIRNFHLSGFDPITYVVVSDWQVGYNVYRHPLLAFFMYVPYMINQGLMWLTGINC AIFVVAALQTFAAFYSLVFLHRLLRRIMGLTLFDTRLLSLFFFALAYIIVSAIAPDHFIL SLFALLLTLLVAANHQQQATTMPTWKALCLFVLTAGISLNNGLKVFVAALFANGRRFFRP RFLLFAVLLPSALIWVFARFEYKRFVWPGEVKRHQARDKRKAEAKKKAEEEARKRQLAWE EAVREAKKKNPHNPKLPPRPTNTTTKKKPNNGAAAKMGKPLMQGEFMRWTDISTSRLQSV TENLFGESIQLHRDHLLEDMYGRRPVIVHYRWAINYVAEGLLVLLFVVGLWVGRRNKLLL MALAFAAMDWVLHVGLGFGLNEVYIMTAHWAYVLPLGLAALFVVATGRKRTFLRIAIGSL TLWLYAWNLYSVVDYLLNS >gi|283510535|gb|ACQH01000084.1| GENE 70 89507 - 90841 1283 444 aa, chain + ## HITS:1 COG:no KEGG:PRU_2541 NR:ns ## KEGG: PRU_2541 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 17 441 
1 441 445 304 38.0 6e-81 MLKLRKEEKPLALVAFLVFALQNALTLYKYHDLFTRVGFRGYWVLFHEMFHVSGYDAYSC IYLSNGKIYYELSRHPLFGPLLAPFYWLNDAIIKQFDFNAASYIMAVLLTLSATLSAVLL FRILRSQIGLKLADALALTALLFSFASIMLAVMLPDHFCFSLLLLILTLYIANKGKAMAW WQTALLLLFTAGITLSNGAKTLLADLFGTGKAFFKPRRLLLAGVLPLVVLGAAFYAQYTL QLLPREQESARIEAQVQKKNPNAIKRSVAHEARKERIIGKPMGNDPVLKWTDKDTPRLTS VVENVFGESIQLHRDHLLEDLFISRPVVVHYEHWFFYAVEAFVVVLFVAGLLCGMRQRLL WLCLSWLGVDIFLHIILGFGLNEAYIMGTHWLFIIPIAIAFMFKRMSAKPALVLRWAIVA LALYLYVYNARLVFQFMSHPLTIG >gi|283510535|gb|ACQH01000084.1| GENE 71 90857 - 94273 3297 1138 aa, chain + ## HITS:1 COG:sll1582 KEGG:ns NR:ns ## COG: sll1582 COG1112 # Protein_GI_number: 16329815 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Synechocystis # 204 1138 219 1107 1118 298 28.0 4e-80 MPNPRTYITADELFRRIDHTLSTATTNPESINKQLHDTLSLACDGALQDSLQAFGNLFSK VDFLCKRHRIALRDVVAIQQMRRESNKAQPIPAADVPYHCRALAIFVSAVYNKHIPSMLV GRIAPTNKPPKELRHVDFRYLRCIVDQCDGNQIMVKIDHESYEEPMVLNLETEEQTYLKP LLRMGMQLNLLDCTKQGHALLPRLVVIEPDFLLDISAIARCFTDYGHHPLAYTVNRMGAN ANSQAILVGSFAGAALDDIINNNTDYDWRQTFTNTFKERTMDFCTCPDLNVREPFREVAV NQANNIEEIVDRLFDNEQEGFDRSKALLEPSFVCEELGLQGRVDLMTSDFRLLVEQKSGA NYNIQRNQPNEFGSFQKEDHYVQLLLYYGVLRHNFRLSNHQVDIRLLYSKYPLPGGLVVV AYLQRLFHEAIRLRNEIVAQEFGIAQQGFDSVIDKLSPDTLNQNQLCSTFYHRYLEPQIA AVTTPLHKLETLERAYVCRMLTFVYNEQLLAKVGAQQTQGHSGADLWNMPLAEKRETGNI FTALKLQKAEKSNSYNGVDTLTFDVPEQADDFLPNFRRGDMVYLYAYEPDAEPDVRQSIL FKGVLVDIAVGQIVVHLNDGLQNENYLQGDKHFAIEHATSDVGSTSAVKALHSFATANAQ RKALLLGQRAPKYDAAAELSRSYHPSYDEVLLRIKQARDYFLLVGPPGTGKTSMALRFIV EEQAGNILLMSYTNRAVDEICDMLLSAAIPFLRIGNEYSCDERFRPFLLDRLVDSDPKLS LIKQRIAACRVFVGTTSTMQSRSNIFALKHFDLAVIDEASQILEPNIVGLLSANLPHDTA MAHGLRPQSDSLPAIDKFVLIGDHKQLPAVVQQNAAQAAVNHPQLNAIGITDCRQSLFER LIHWERSQGRTRFIGTLNRQGRMHPHIARFPNEMFYAHERLDVVPCPHQEEQQLDYSQPS QDVLDDQLKAHRLLFFAAENNENESPSDKANPAEARIVARILGRIHRFYGKHFDANRTVG VIVPYRNQIAMIRRELQKLGVAELNGITIDTVERYQGSQRDVIVYSFTITHRYQLDFLTA NCFEEDGRIIDRKLNVALTRARKQMIITGHIPTLSHNRTFKALIQHVRENGLVVEEEG 
>gi|283510535|gb|ACQH01000084.1| GENE 72 94548 - 95015 526 155 aa, chain - ## HITS:1 COG:MA3407 KEGG:ns NR:ns ## COG: MA3407 COG0590 # Protein_GI_number: 20092219 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Methanosarcina acetivorans str.C2A # 6 155 13 162 162 194 63.0 5e-50 MNNEELMRRAIALSEESVKNGGGPFGAVIARKGEIIAEAANRVTLDHDPTAHAEVSAIRL ASSKLGTFNLSGCDIYTSCEPCPMCLGAIYWARLDNVYYANNREDAANIGFDDDFIYHEM ALKPSERSKNMELLLPNEAINAFKMWEAKVEKTEY >gi|283510535|gb|ACQH01000084.1| GENE 73 95018 - 95662 604 214 aa, chain - ## HITS:1 COG:PA2617 KEGG:ns NR:ns ## COG: PA2617 COG2360 # Protein_GI_number: 15597813 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Leu/Phe-tRNA-protein transferase # Organism: Pseudomonas aeruginosa # 1 213 1 214 226 192 45.0 4e-49 MVFALDKTLSFPDPHLGEPDGLLAVGGDLSVDRMLLAYSYGIFPWYSFRYNREILWYCPM QRFVIFPAEVHVSHSMRSLINRGTYHVTFNQEFDEVIRHCSRLRIKEDGAWLGVDMVKAY RKMHEQGFAASVEVWQSDRLVGGLYGVTLGRCFFGESMFSLAPSASKLALIHLAHYMSDH GGLIIDCQFETPHLRSMGGRFIAYDDYLKIIQKE >gi|283510535|gb|ACQH01000084.1| GENE 74 95875 - 98136 1164 753 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 7 736 7 806 815 452 34 1e-126 MPIYINSHETDQVLNSALNYCMLYRNEFVTPEHLLMAMTNLDNFSIALQASGGNTTRLMA QLDNVVGKMEKVPTEWFHVPENSVQTAEVLTNARKLAMMADVDMLQVPHLVTALLGLQNS DAAYLLSRQLGENASNFVAELVELYDEDDDMPVFGNDDEEDEEEAPWRSLVTCVNDTYKK HNPLIGREEELQRTIQVLCRKEKNNPLHVGEPGVGKTSLVYGLAAMIERGDVPQRLRGCR IYQMDMGTILAGTQFRGDLEKRIKQVMNGLLTEGNTILYIDEIHNLVGAGRSSEGSMDAS NMLKPYLEGGDVRFIGSTTYAEYNRYFAKSAGIVRRFQQIDVPEPSPEEAIKILNGLKRQ YEKFHNVHYSPDVIDFAVHASAKYVNDRFLPDKAIDLIDEAGAYRQMHPLPTKRQKVDKA LVADVLTKVCKVQAVALKDDGNEALFTLEKRMKELIYGQNQAVELVTQAIQTAKAGLTEE GKPLAAMLFVGPTGVGKTEVARVLATELGIELVRFDMSEYTEKHAVAKLIGSPAGYVGYE DGGLLTDAIRKTPNCVLLLDEIEKAHADIYNILLQVMDYARLTDNKGQKADFRNVVLIMT 
SNAGAQFASQASVGFAGNVSRGEAMMAQVKKTFKPEFINRLSATVVFRDMDEPMAQLILK KKVKALQQQLAARNVELTLDGAAESYLLRKGFTPQYGAREMDRVIAALLKPLLTREMLFG KLKKGGKAIVKTKGHEPPTPDDTNVQLVLDVPK >gi|283510535|gb|ACQH01000084.1| GENE 75 98142 - 98444 341 100 aa, chain - ## HITS:1 COG:DR0586 KEGG:ns NR:ns ## COG: DR0586 COG2127 # Protein_GI_number: 15805613 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Deinococcus radiodurans # 13 99 48 134 139 89 48.0 1e-18 MAKQQEAIRERIRTNLREPRKYKVLIHNDDFTTMDFVVMILKVVFFKSETEAEALMLAVH RSDKAVVGVYSYDVAMSKVSKATDMARHAGYPLRLTCQPG >gi|283510535|gb|ACQH01000084.1| GENE 76 98488 - 98691 75 67 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLNKQTKSARTITVDWSLNSTRLHPSKSNAMCFGMHFSPFKALASCSTWLKGCITLGKTT FWVNVLS >gi|283510535|gb|ACQH01000084.1| GENE 77 99011 - 99217 60 68 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MELVSLCHKHNTNTPHVQGIAKQNCFWKFVVRRLQDVRSMDKQAPTQAKMNTLVCLRHIL WFSKYNPV >gi|283510535|gb|ACQH01000084.1| GENE 78 100623 - 101081 217 152 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929196|ref|ZP_06423041.1| ## NR: gi|288929196|ref|ZP_06423041.1| hypothetical protein HMPREF0670_01935 [Prevotella sp. oral taxon 317 str. F0108] # 1 152 7 158 158 309 100.0 3e-83 MILLSTQTLMLTGCPMIDSESVNIKIINRSDKEISFQPYEFAYPFSKKDTLLQNHIPVCG IKAKGHYLYSATTNSDWRRELYGGRILQILVIEDGDQFLKYWPSHEDTLRKYVPVVQRYH LTIKDLERLHWTVSYPPTEDMKDVNMWPPYKK >gi|283510535|gb|ACQH01000084.1| GENE 79 101106 - 101438 350 110 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929197|ref|ZP_06423042.1| ## NR: gi|288929197|ref|ZP_06423042.1| hypothetical protein HMPREF0670_01936 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 110 1 110 110 196 100.0 3e-49 MKFCCFAFEMYYTLENQYCYNIRKVKLTSPRLTEYGMMKYYNIPSLRGTRHKRADICFVM TMGYDTFTFDAPTVFISFCPFCGVNLYDYYKSDEYVNEIEGETFKFFKDQ >gi|283510535|gb|ACQH01000084.1| GENE 80 102062 - 102934 218 290 aa, chain + ## HITS:1 COG:no KEGG:Slin_3307 NR:ns ## KEGG: Slin_3307 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 21 145 63 187 190 100 42.0 5e-20 MIKRIIAIITILILVNTNLIGQTVIEMEPYGGVYRIPCTVNGARMKFIFDTGASNVCISK AIAEYLLDNGYLTKEDIYGVGHSSVADGSIVDHIKIRLKDVEIQGTHLKDVDAIVIDGQN APLLMGQSAIQKLGMIEIHGKKLIINNGTPRKRSSHITNNPSDYKYIDTYLNKEHVLTNL RHNAQSYVKFKRWDESKASEFYSALEIFIKAIDAGRLSSDLQGNINDSAGLIDDGTANWR NQYGRVLTEEEFQKLSKRDKAKCTKNFYPNREVALYINTIVKSMFSRSRD >gi|283510535|gb|ACQH01000084.1| GENE 81 104170 - 106602 2797 810 aa, chain + ## HITS:1 COG:PA0781 KEGG:ns NR:ns ## COG: PA0781 COG1629 # Protein_GI_number: 15595978 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Pseudomonas aeruginosa # 156 808 58 687 687 75 21.0 3e-13 MKKNEKDERFTDTSASIAGHATLCLALLLCLGGRPQGAFGSPKPGAEVHVNAEETTQTVV VKSESTGEPIAGAVVRCDAAAAPAVTDINGLCRLQLKGKTEKVKIEVSSVGYKRLTQTIT PTPKQTLTLKMVDDAKMIGNVEVTAQKRHTTQLQQTATIGSDALEKGGATSLAKLLETVP GVSSISTGNTIAKPVIQGMHSSRILLMNNGVRLESQSWGADHAPELDYTGSSMVEVVKGA ECIRYGFGAMGGVVLLNDAPLPWGNKRVTTKGNVNMGYDTNAKGASGSATVEAGYGPWGA RVHGMYTKGGDYRTADYRLNNTGYNTISLSGMVGFEGKHITATLFSSIYYQRSGIYYASK VSDLDQLVKRFEYGRPDPQTIQPFSYEIKPPFQQSQHVTLKGEVKWRLNPSHQIDFTLSF QENLRQEFENRKKAQWSWIPVQDLILKTYKVDVLWHAKWKLWKMTTDAGLSNTYQTNYNY PGTKQPAFVPNFAALSMGGYFLHKAEWGRLQMALGMRYDFRVMSVNGYTSLSNYTYYDDF KLYSNFTASLAGHYQIDDNWDVRANIGWSWRPPDINELYAIGLQEGSYWVVGNRHLASER GYKSVVGARYRNTWLNVEPSFFYQRINSYIYDHIGTGKNRFHNHPSGKYPKFIYDQDDVR LMGGDIEATAKPIEGLSLVAKGEWMFARNITRDDWLPFMPSDRYGLSATYTHAIDKSGKW RGTVSLSGIYVTKQTRFDPDKDLVSDSPAPYFLLNGTAEVALRMPQHRELKLMLVGDNIL NNLYKEYTDRFRYYAHERGANFSIRTLFKF >gi|283510535|gb|ACQH01000084.1| GENE 82 106645 - 108099 1699 484 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|288929202|ref|ZP_06423047.1| ## NR: gi|288929202|ref|ZP_06423047.1| mucin-2 (MUC-2) (Intestinal mucin-2) [Prevotella sp. oral taxon 317 str. F0108] # 1 484 1 484 484 985 100.0 0 MQRIHLKTAAMAIALSALSASCTNEQVDDFFKKIVQAPPSSIERDVKGHDQIYSVHAILR MGYKGGLIGVGPNGDDSVRVYNTYHVAADTTVIPITQEIDIAKDDNGEMTVTTQRRQFDV VASPDIYYGLELKYYDQNGMLINHQFSTYPFTKDKEGVSVPDENSSTLMMHQHFFGIGNT TLGGETKETDKAETPDQQGLQLAYPRSLDEPATYIDRYTFRQNGGQPVPATKFSATNVFA PSGFEYGKNNVPYSPELAWKAIERSGKPEATQPYTASDGKTYNLFRVMDMMELNKITPEV FTYEYRDTDPVEEELGKLFVDSYNDDFIDPDTDAPRQRYGETVGLLRQARSLDAGAPLDR LGFKGVLQFKQAHICFSLQVKICNILNKGQQHVGEPEKPAKYVNTANSENGYLWDFNQIQ SGWDSFDIDYPLPVRVIADVKDGQAKCVQDAKRFYPSADAAQLWLLFTQPATFFYKHRKS VVAM >gi|283510535|gb|ACQH01000084.1| GENE 83 108169 - 109638 1601 489 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929203|ref|ZP_06423048.1| ## NR: gi|288929203|ref|ZP_06423048.1| hypothetical protein HMPREF0670_01942 [Prevotella sp. oral taxon 317 str. F0108] # 1 489 23 511 511 992 100.0 0 MDKTNQPSNANKANSQRNGAAWRARTIVTMLAVATTLVFTSCDDFIFGNVNINYNDNPLE AKPIPHSGADKYVDSLKKIRTTQPMANPDPNGAFANWATCLAMFKEGHSHGDGMMHGNFV YHNAPWRQEQFVIVHNGTDKWPSVQVQRQSTVTYLEQYAGKQGPDYIRIIGGKLKRWGLC LYFFDKEGKLMNDDILNHSDEYQIFFTVSDVDDKGTPYEVTDCRGTWHPNKNKFGEWKKG GTVDNTPVPSPFFADKTTWKQRADATPKIFEYTYRDTWIHDAMGDGARELFNQRLLPPLT RKDADWAVAPYDQDRVGLKGHFNFDLDADENDKLHSEWPFEITKRTDPATGEAGKYTRPS YLLPKFYLSVRVMKCPKGKKALIPRDEYLKRHSTYEFISKLICADWFNPDEPKEYGADSQ WQEVIRFNIPIKVYCSSFDTDPTAIDPNDPFYYYLGLEIGLSPADALEASQNLQTHGVGG GSGFGNWFL >gi|283510535|gb|ACQH01000084.1| GENE 84 109697 - 110758 1264 353 aa, chain + ## HITS:1 COG:no KEGG:BF3067 NR:ns ## KEGG: BF3067 # Name: not_defined # Def: putative lipoprotein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 35 295 132 355 404 70 26.0 7e-11 MKQLATLVLLALLAGCVRNELPTKQSNAKSKVDHLIIKEVFYIGHYWVRDVSKWGFKNTN NNYNDDQYIEIYNPTDQVQYLDNMALCAHAIDPTKVITFAPKDDFVNRYYGVNAISYFPG SGHDLPIQPKQSIIVAKYATDHKAMFEKELEGEDLSLYKGLDAFLDLRKANFEWTNANYD HSGKNNPDVPDLQAIMTRKDKSGNTIPSYEMQEISEHGGLALVRLPWTPEDFAKNYKDTK 
DKRGYLHYITVTSSAFADFEAIEIPFEYVVDCITICPRSQFQMRPSKLDKGYNAVTDVPF VSLKNSDYPTYSGLALTRRWDGRKFVDDDNTKSDFKVQVASLSRKDDKGNPIK >gi|283510535|gb|ACQH01000084.1| GENE 85 112113 - 112373 92 86 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MARITGKTGITGKTGITGKTGITRKTRITRKTGITRKTRITRKTGITGKARLTGKARITR KARITRKAWLARKASSPTHQLSNSKA >gi|283510535|gb|ACQH01000084.1| GENE 86 112839 - 113267 315 142 aa, chain + ## HITS:1 COG:NMA0437 KEGG:ns NR:ns ## COG: NMA0437 COG1380 # Protein_GI_number: 15793442 # Func_class: R General function prediction only # Function: Putative effector of murein hydrolase LrgA # Organism: Neisseria meningitidis Z2491 # 3 108 5 110 114 87 43.0 1e-17 MAKQFFVIFGCLALGEVVVWATGIKLPSSIIGMLLLTFFLKTKIVKLEWVEKLSQFLLAN LGFFFVPPGVAIMLYLDVIQKELLPIAMATLLSTILVLVVTGHMHQFVVKAERKLLAMHM RRHKRKQQDHTQAVASTNDDEV >gi|283510535|gb|ACQH01000084.1| GENE 87 113331 - 114029 792 232 aa, chain + ## HITS:1 COG:NMA0436 KEGG:ns NR:ns ## COG: NMA0436 COG1346 # Protein_GI_number: 15793441 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative effector of murein hydrolase # Organism: Neisseria meningitidis Z2491 # 1 230 1 229 230 175 43.0 7e-44 MQQLFSNQYIILALTFAVFYYFRRLQYRTGWVLLNPILIAIVLLIAYLKLTGVSFETFEQ SGQLIDFWLKPAVVALGVPLYLQFEAIKKLWFPIVLSQLVGCLVGIVSVVVVAQLFGASN VVIISMASKSVTTPIAMEVTQALGGIPSLTAAVVVITGILGAILGFKALALGHVSSPIAK GLSMGTASHAVGASTAMGISNKYGAFASLGITLNGIFTALLTPTILRIMGII >gi|283510535|gb|ACQH01000084.1| GENE 88 114530 - 115981 1727 483 aa, chain - ## HITS:1 COG:BS_yngK KEGG:ns NR:ns ## COG: BS_yngK COG1649 # Protein_GI_number: 16078889 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 39 477 34 505 510 267 35.0 4e-71 MKTWKKVALALCVLLVVACHRDDNNDQPDLSNQPILPKKELRGVWMATVWGLDWPQGEYD AEAQKASYIAYMDLFKQNNINAVFVQVRGMADAFYKSAYEPWCQYLTGQADKDPGYDVLR FMIDEAHKRGIAFHAWLNPYRVATKKADAAAFPALDSRIPQAMTVDYKTIRMYNPALPEV RQRIFDIVKELITKYDVDGVHIDDYFYPSLASGETIKDEKEYEQYVAKDKDGKPTITVED 
FRRNNVDLTVKGIHDVIQATRPEVAFTISPAGDPDYNFNTMYADVLKWSREGWSEAIIPQ LYFPMGSAESGFNHRLHWWSQFTYNNALFVGYGTYRFGSDEAAAYQSSAELANQFAFAQK FHKVKGSVLYSAKDLLNNPVGILSVIKDVYKHPAVLPYLGKQQATPPSAPTNVRANGKTL QWNGADGAYFAVYRSNGAGKEATTIAVTYQQEVTLNETGTYFVTAVSKKDNAESQPSTTV EVK >gi|283510535|gb|ACQH01000084.1| GENE 89 116287 - 117705 1620 472 aa, chain - ## HITS:1 COG:no KEGG:Phep_4124 NR:ns ## KEGG: Phep_4124 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 2 463 5 468 476 314 38.0 5e-84 MYKIKNIGLAAIAMMMATAFVSCGDNYPDSMDAPYDTDLLGIKIMNAGEKGDQVVEGRID EDKKEVNFPKLDTLTNFAALRLEAKLSDGAQLDKTEIDCKMTPDDEQKKLTIRVLNHNRY KDYFMFVRKRIPPIGADFKDAKVYSFAGDNRYEDFKTLYTRCADFDGQHVLVVTRTDNKP HLLKVDDLKAGNINRIPLDLTDVSGGTFPYNMGALANGHIYMATLAGGKPSPLKIYYWET PTSKPEVIFSSTVQDIPESGKRYGDNMSLNIDKNGDGHIFFGENTAQNILRLTVSNHKTV SEPTILSADPKMKVAMNIYRYENTGDYLYSGLAMPITLSGNAAEKKFSLSAEHQPAEAVA ARAFMFNNKRYLITCSAGFGSASKATPTLYVFDISKGNNLAEALERFDAASEHEPVYSFI LGGAGNGAPSPCTNFYIERDAQGKDAKLYLFASRGESGFVIVEVPMAQDKDD >gi|283510535|gb|ACQH01000084.1| GENE 90 118096 - 119961 1997 621 aa, chain - ## HITS:1 COG:no KEGG:Phep_4125 NR:ns ## KEGG: Phep_4125 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.heparinus # Pathway: not_defined # 5 619 8 601 602 632 51.0 1e-179 MKLKIFSFIAVCTLVCSACNDVLDRPSLTSAEDPEYWVSENNVRLYANAFYPNFFPGYGV GYGVTAYTPNNNYNFNDDVVLKGTQPQFSKAVPSSKGSTTLSLGWESQFSGPTWNFAWVR KANVMKDRIAQLMADKLTDEQYRHWLGIARFFHALEYARLVNVFGDVPYYNHEIHRDELN EIYKDRTPRNEVMDSVFSDFRYAMQNVRLNDGAQNVNRYVVAAFVSRWALFEGSWQKYHY QDNERAKACFKLATEAAQMIIDSGKYGIATDFRSLFGSTNLSDSKDCILYRKYDAEQGVT HCVASYCLPSNPTDLGPNLDLIKAFICNDGKDWNSTTMAAAKDFTLDSLIKTRDPRFEAT FYYKPTENAKSCGLYVTKFIPRSALSYLNIEGGTPAPEFQGDKNVTGYPVMRYAEVLLNW IEAKAELATLGEAAVTQADIDISVNAIRKRPLDAEAMQRHVKQTAAMKLDQLNNDPKRDA DVPQLLWEIRRERRMELAFEFSRIIDLRRWKKLNYMDTDANPDLLLGAWVNFPTELPGVL NEKTVGAYRVQNAAGKLVTYNGKNAKDMKGFFYPKETIGRLPFLNLPNVNPYLSPIGYND IRKYAERGYHLSQTKGWPEGY >gi|283510535|gb|ACQH01000084.1| GENE 91 119974 - 123510 4263 1178 aa, chain - ## HITS:1 COG:no 
KEGG:Phep_4126 NR:ns ## KEGG: Phep_4126 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.heparinus # Pathway: not_defined # 62 1178 19 1134 1134 1180 53.0 0 MNEKSTQSDNLHKGLQRKRLALSSAPMRTTLMLSLALSLGAMPSAALANSTPANGLLPNP IALQQQSKVAKGVVTDSNGEPLIGVSVTEKGTNNAYVTNVNGEFELRLSSANAQVVVSYV GFLPQTLRATTNMHIVLKEDSKALNEVVVVGYGAQKKANLTGAVTSVDVNKNLSSRPIAD IGRGLQGVVPGMNIRVPTGEVGSDPLIKIRGQIGSIEGSNAPLILLDNVEIPSIQVVNPN DVESISVLKDAASASIYGAKAAFGVVLITTKNGAKSNRLEVNYSNNFSFQSTARKLEMGG IDGLQYTLDAQINRGAPLPAGGMWRVSPESIKRMREWQEKYGGSVKWNDPVLYGRDWEVI GGEKYGYRLYDGVKAMVRDWAPTMTHNLSVNGKSGNTSYNIGLGYFTQDGMSRTAKKDDF RRYNATLSLQSEINQWVTVRMSSLYSDRNKRYPGVGTTSADPWLYLYRWSPQFPIGVTSN GSPLHEPSYELGAANTDNIQNKYFNINLGVTVNITKNWDFKLDYTYDHNAQETNESAIQY RAGQVWYNPVQWLENGYPVYVDENGNRTDTGGKPAYRFPVADYYSNVASSYVQQSSRATD NNTINAYTTYNLRLGAEKEHAFKFMAGMNQVTSKWTSHETLRNGLIDQTNPQFHLAGGTY DGDGNRNWEAQLGFFGRLNYAYADRYLLEANVRRDGSSKFPKHLRWQTFPSFSAGWVFTN EPLAQPIIKVLSFGKFRASWGSIGDQSVANTLYRAVMESGSSTWLDGNNKRVPLFGTPAL IDRNITWQRIETLDLGVDLRFLDNKIGVTFDWFKRDTKNMIVPGDDLPVTLGTSSPKGNF GHLQTKGWEVSVDFNHRFANGLGVNANFTLADAITTVIQGADYQKPWELRSVYDQYSTGR RYGDIYGLVTDRLFQKDDFEYDADGKNIVRTNVIYKGTLRSTNKQTARHPVYQVQYEDGD KLVFAPGDVKFKDLDGDGYISGGAGTNGDHGDLAVIGNSTPRYEYSFRIGVDYKGFDLSI FCQGIGKRKIWGEGQLAIPGFNAKEGAIPKTFATDYWREDRTDAFYPRAWNLGNNSTGFS MQRQSKYLLNMAYLRVKNINFGYSFQQNWISKVGLTRARLYLSLENFFTFDNLRGLPIDP EAISGYSMFTSKQNYNNGRTGTGTPVFKSLSFGAQLTF >gi|283510535|gb|ACQH01000084.1| GENE 92 125463 - 127652 2563 729 aa, chain - ## HITS:1 COG:slr0288 KEGG:ns NR:ns ## COG: slr0288 COG3968 # Protein_GI_number: 16331104 # Func_class: R General function prediction only # Function: Uncharacterized protein related to glutamine synthetase # Organism: Synechocystis # 16 729 17 724 724 627 45.0 1e-179 MANLRFEVVAEAYKKKPLEVITPAERPSEFFAKYVFNRAKMYKYLPADVYEKLTDVIDNG TRLDRSIADAVAKGMKQWANENGVTHYTHWFQPLTEGTAEKHDAFVEPDGKGGMIEEFSG KLLVQQEPDASSFPNGGIRNTFEARGYSAWDPTSPVFIIDDTLCIPTIFISYTGESLDYK APLLRSLQAVNTAAAAVAQYFDPNVKKVCCNLGWEQEYFLVDESLYFARPDLMLTGRTLM 
GHDSAKNQQMDDHYFGTIPERVQAFMKDLEIQALELGIPCKTRHNEVAPNQFELAPIFEE ANLAVDHNMLLMSLMKKVARKHSFRVLLHEKPFAGVNGSGKHNNWSLSTDTGVLLHAPGK GYDDNLRFVVFIVETLMGVYRHNGLLKASIMSATNAHRLGGHEAPPAIISSFLGTQLSTL LAQIERADDDECFAVAGKKGVELDIDQIPQLLIDNTDRNRTSPFAFTGNRFEFRALGSEA NCSSALIVLNSAVAEALNSFKQRVDALVQQGIDLKRAIVSVLRDDIRACRPIVFDGNGYS EEWVREAKRRGLDTETSCPLVFDRYLDEDSVQMFESLNVMHRNELQARNEVKWETYTKKI QIEARVLGDLCMNHIIPVATHYQSRLAKNVQSMFGIFASEKARILTSRNVKIIEEIAQRT QTIETKVEELVAARKVANNIQCEREKAIAYHDNIAPRMEAIRYEVDKLELLVADELWTLP KYRELLFIR >gi|283510535|gb|ACQH01000084.1| GENE 93 128117 - 128332 66 71 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSTSERNIPGSDGHNLKCLICGLMNRLGFIFIYAHAAFGANRKARMGILCTLGGVSHFST FSLLGVKELGS >gi|283510535|gb|ACQH01000084.1| GENE 94 128464 - 129561 668 365 aa, chain - ## HITS:1 COG:no KEGG:BT_3235 NR:ns ## KEGG: BT_3235 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 189 1 186 344 72 29.0 4e-11 MKNIFRAFLAAVALFATVACSHDDNVGRTFGDKAITIDSSSVTFTDAPSTGRVTVKAAGP ITKTAVSNDWCKSTFSGNVVTVEATQNTSFTSRVANLTIWVGTDSINVPVHQQGLPGFTI GERTQYLVSNKDTSIVISFRHIDGVSVTTHKQGEWLSPTVENGKVIIKVAANNGNKQRDG YVDISSGRRFPIRINIQQNWDVETQMNGSWTLTYFSKLDTTGLTRQQIDCTFDKDSIRIN SANHGKVSIPFRLKKSIPARFMLDGSVRCGGKLDNYTPLLYLVADYQRLWSYLTTSTTIA GRILPNAEGKLSIRPTGFLEGREERPITRFCFVLHDNVPPTGMPKNNNGFKYVLADFFFP VFVKK >gi|283510535|gb|ACQH01000084.1| GENE 95 129586 - 130914 1202 442 aa, chain - ## HITS:1 COG:no KEGG:BT_3243 NR:ns ## KEGG: BT_3243 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 382 1 383 443 117 27.0 1e-24 MKKIHHILSALLTLAVTLGLQSCLKDQADVFDNASSLRLQEYLDKTQKTLVDAPYGWAFD IYPERTQAYGGYAFTLKFDREKCKVRSVLDASREDESYYKMTSEVGPAITFDTYNPLMHY FSNPSSARFQALKGDFEFVVDSVSENLIKVHGYRTKNVMYLRKLDMPAAKYINKVDSFSA DFTPTGLWSFKGKVNGTDVQGELDASELKLTYNSGQKEETTAIAFTDRGVRTYRPLQIGG TTFGELTFNASDSTYRAKGANNEEIILKAEQPKWFPAYANAAGKYTLEAVYSLNNTDYSV ELDVDLKLTADYKGYEMEGALLGYNIRLTFDRKEEAMIVAPQQIRRFPDGNVLMLTTFST 
SAGRLTTSPSACFSLKWDNAANAYVFRPKGEWGYPSSGWFLGVYDSQWSFLGNAKALKVP NAYLFGRAYERLDQLLKLKRRQ >gi|283510535|gb|ACQH01000084.1| GENE 96 130936 - 131841 954 301 aa, chain - ## HITS:1 COG:no KEGG:BT_3237 NR:ns ## KEGG: BT_3237 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 40 299 40 292 292 192 43.0 1e-47 MKKNTLLYILASVICIGTMAGCADEKLSPNSVITADKRTPTPLDLWLDRYYVSTYNINFK YRYEDIESDMKYYTVPASYQMAIKMANLVKYSCLEAFDEVAGPDFTKANFPKLIYLTGNW EYKNNGTFILGTAEGGKKILLAGTNFIDKFIQSAEDINTYYLKTVYHEFTHILNQTKPYS ADFKLVTGSGYVADKWSESPYANTNYCYEHGFISAYAQHSDVEDFAEMFSIYVTSSPKKW NAWLDIANAKGQKDNILTKLDYVRNYMRESWNIDIDKLRDVYLRREDDIVKGKVDLTNIS L >gi|283510535|gb|ACQH01000084.1| GENE 97 131879 - 133468 1719 529 aa, chain - ## HITS:1 COG:no KEGG:BT_3238 NR:ns ## KEGG: BT_3238 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 529 2 519 519 375 42.0 1e-102 MKKQHIICLALATGTILLGSCNDFLNENPDNRTQIDTPEKVRQVLVSAYPENSYFVMAEY MSDNVDEYQNPRTDVSLDEYYRWKDATQTTNDSPDRLWRTYYASIANANEALKAIDKLGG PKTELLALCKGEALLCRAYAHFMLANIFCMNYGMPNSNTDMGIYYMYDTDTRIGQVNPRG TVAEVYQKIAKDLEEGLPLVGDAHLKVPKFHFNKKAAYAFATRFYLFHEEWEKAVKYANE CLGSSPQTLLRDWRSYQSMTRNIEAYTLKYVNTENKCNLLMQKAYTEAGLLFNNYSRGKK YAHGPYLDNNETMLAKNIWGEANYWDNGPFFYRVGATETYDITWRIPYLFQYTDPVAGTG YQSSIMPILTTDETLLNRAEAYIMLKKYDLAAADLTTWMQNIVNTNVVLTPTSIEAFYKP IDYSYSDDKKLQSTIKKHLHPKFAIDTEGSLQECMLQCVLGFKRIEQLQTGARWFDIKRY NITVMRRLINENGTPQELTDSLAPDDPRRAIQLPQDVRDAGVPMNPRNK >gi|283510535|gb|ACQH01000084.1| GENE 98 133490 - 136807 3888 1105 aa, chain - ## HITS:1 COG:no KEGG:BT_3239 NR:ns ## KEGG: BT_3239 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 53 1105 3 1059 1059 1302 62.0 0 MKKKRLSVFWAFLFLFIGTVVAQSKLTGTIISAENDEPIIGATVSVQGSNVRTVTNVDGA FTLNVNEGTTIVVSYVGMATQTLKAKNGMTVRMNPNQTLQEVVVTGMTTTDKRLFTGSTT KIDAADAKINGMADVSRSLEGRAAGVSVQNVSGTFGTAPKIRVRGATSIYGDSKPLWVVD GVIMEDVVDVSADQLSSGDAETLISSAIAGLNSDDIESFQILKDGSATSIYGARAMAGVI 
VITTKRGKAGQAHINYMGEYTMRLIPSYRQLNLMNSQEQMDVYQDMEQKGWLKYAETANA SNSGVYGKMYQMLSEYNPVTGQFALANTPDAKAAYLRNAEFRNTDWFNELFSNTVMHNHS VSISAGSDKAQYYASLSAMFDPGWTRRSNVQRYTANLNATFKLSKKLQLNLISTGNYRKQ EAPGTLAASTNVVTGEVSRDFDINPYSYAVNTSRTLDRNTFYTSNYAPFNILHELQNNFM DLNVNDFRAQAKLSYKPITKVDISVLGAVKYAGTSREHHVTEYSNQAEAYRAMQTTVIRQ ANSYLYHDPDNIFALPFSVMPYGGIITRADNRMFGWDFRASVSYNDVFNKDHILNTYAGM ETNSYTRHNTWFKGYGLQYDLGEEANWAYQNFKKMQEGGNDYYGLGNTNTRSAAFFGNVT YSWKGRYTINGTLRYEGTNGLGLSRQSRWLPTWNVSGAWNAHEESWFDQKLGKVLSHLTL KASYSLTADRPPVTNALAIIKANTPWRYRTGNTETGLERTSPANPDLTYEKKHELNLGLD AGFLNNRINMSVDWYKRNNYDLVGPIPTDGTDGFLVKYGNVAAMKSNGVELSLSTTNLKT KDFSWKTNFIYSYVHNEITDLKARSRMINFISGVGYARKGYPSRALFSVPFMGLTGDGLP TFRDQEGNISITNVNFQESDPEKLKFLKYEGPTDPTNTGSLGNVFTYKGFTLNVFVTYSF GNVVRLYPAFKSRYTDLSAMPKEFRNRWRTPGDEAYTDIPTIATLRQVYDNPNLYRAYNA YNYSDARIAKGDFIRMKELSLNYDFPQKLVGYLGLRTASVKVQGTNLFLIYADKKLNGQD PEFFNTGGVAVPMAKQFTLTLKVGL >gi|283510535|gb|ACQH01000084.1| GENE 99 137358 - 137789 104 143 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929218|ref|ZP_06423063.1| ## NR: gi|288929218|ref|ZP_06423063.1| hypothetical protein HMPREF0670_01957 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 143 1 143 143 152 100.0 5e-36 MDYKHTKTQQLTNAPTHQLKISSTHQLKKINPQTQNSSTHQLKKINPQTQKLINSKTKKL INSKTKKIINPKNQKLINPQTYQPINSKTQKIINSKTKKIINQKKSQAHQPTNSKTQKLK LLFFILQQRSIFRTILAKIWGKK >gi|283510535|gb|ACQH01000084.1| GENE 100 138594 - 139382 737 262 aa, chain - ## HITS:1 COG:no KEGG:Coch_0245 NR:ns ## KEGG: Coch_0245 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 1 262 5 260 260 348 64.0 2e-94 MMRRKHILLALALVAFAPATAVFAQNERNESLIQSEKNGWEYEVRAGVNIGGASPMPLPK EIRKINSYSPKFNGSIAGMMTKWLDCNRNLGITLGLRLEEKGMETAATVKNYGMEIIDGG QRISGYWTGDVNTVYKSSFVTLPVLGAYRLGDSWKLRAGLYVSYRMDGEFSGFVSEGYLR SGSPIGEKVGFTNGQTAAYNFSNDLRRWLWGWQMGGSWRAFKHFTVNADLTYGFNNIFNA DFHTITFAMHPIYLNVGFGYIF >gi|283510535|gb|ACQH01000084.1| GENE 101 139389 - 140486 1086 365 aa, chain - ## HITS:1 COG:no KEGG:Coch_0246 NR:ns ## KEGG: Coch_0246 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 14 365 8 357 357 392 55.0 1e-107 MRFPKPFKAIVAALALATIFTACIREEALNAEADITAFHLDGNLLIREPVITNDEVKLYI NGWEDRSKLAPRFELTPGATISPASGTERDFTKPQTYVVTSQDGRWKKTYTVRFITDLLT EYHFENVEYYTFEGVNKFEKLYEIDTDGSKTEWSSGNPGYMIASQAAKPKDFPTAQDDEG YIGKCAKLVTRSTGAFGKMLYAPIAAGNLFLGNFTVELSDMAKSTRFGLPYTSRPVAVVG YYKYKPGNVLIDKYSKEIPNQSDTFDIYAVMYESTKDVPYLDGTNIKTHPNIVMIAQVKE RKATDQWTRFVMPFESVEGRTLDPEKLRSGKYNLAIVMSSSQGGAFFRGAVGSTLWVDEM QLFHE >gi|283510535|gb|ACQH01000084.1| GENE 102 140743 - 140964 242 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929221|ref|ZP_06423066.1| ## NR: gi|288929221|ref|ZP_06423066.1| hypothetical protein HMPREF0670_01960 [Prevotella sp. oral taxon 317 str. F0108] # 1 73 6 78 78 126 100.0 4e-28 MREFNANQNKYFELAENEVVYVVWKDARLIAIRATSNEDDLSEAELQSIQKGLDDIKNGR TYRMHEGESLKLS >gi|283510535|gb|ACQH01000084.1| GENE 103 142765 - 144039 1438 424 aa, chain + ## HITS:1 COG:all4426 KEGG:ns NR:ns ## COG: all4426 COG0438 # Protein_GI_number: 17231918 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Nostoc sp. 
PCC 7120 # 1 417 1 413 417 217 32.0 3e-56 MRILIVNTSERTGGAAVAAGRLVSALNNSGVQARMLVRDKQTTNPVVVALPKGVRSRWRF LWERWTIFWHLHFNRHNLFAIDVANTGADITSLREFKQADVIHLSWINQGMLSLSTIRKI VNSGKPVVWTMHDIWPATAICHLTLGCNNFKQQCRRCKYLPGRGGDNDLSAKVWNKKKRI YEKSDIHFVACSRWLADQARQSQLLRGRHVFAIPNPIDTRVFAPKNKIESAQRMNLPTNK RLILFAAQRATNANKGLAYLIEACNLLAAQYPELKESIALVVLGGKANQAVAELPFPTIP IDYVSDTATLVSLYNAVHTFVLPSLSENLPNTIMEAMACGVPCVGFNVGGIPEEIDHQKN GYVARYRDAADLAQGIRWVLCEADYAELSAQAVRKVLANYSQQAVALQYIEVYNQALAFK KCML >gi|283510535|gb|ACQH01000084.1| GENE 104 144027 - 144821 735 264 aa, chain + ## HITS:1 COG:MT3031 KEGG:ns NR:ns ## COG: MT3031 COG0463 # Protein_GI_number: 15842506 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Mycobacterium tuberculosis CDC1551 # 7 222 6 206 256 104 35.0 2e-22 MYVMIRFSVITCTYNAAAELPRTLKSVAEQTYPHVEHLLVDGLSADNTRQLIDQYVADMA QTESLHTIKVKAETDAGLYDAMNKGIEMATGTYLVFINAGDVFPSPQTLETVANAVGEAE TLPAVLYGDTDIVDADGRFLRHRRPVPRQGLSWRSFSRGMLVCHQAFYALTSLAKQTPYN LAYRFSADVDWCIRVMKAGQAANLPLKNVQAVVANYLDGGMTVKNHRASLKERFAVMRAH YGLPLTLAMHAWFVLRAIGNKFKS >gi|283510535|gb|ACQH01000084.1| GENE 105 144876 - 146591 1055 571 aa, chain + ## HITS:1 COG:no KEGG:BF0008 NR:ns ## KEGG: BF0008 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 14 568 21 579 588 340 33.0 1e-91 MRFIFFIFFFFVSLFAKAGDSSKLFGYRWWPHYQHLFFRFDLGTTAVQKQLRLEIVAPLK GVWSTLLPTYHLSGEANDITLKVKYKSKDVQHFYVTLNSITNGARIARRDTITLPASPNW AEATSQLHLKHTDLLGVSFEAMGKKDTVGTIQLADFGLYADGKKLSNEVSDVEVAPLFQR SDLLNWTFGNYGNIPCMDSPILGIAETIHGTKTMNAVAMDMMKERILKYQCKLVMLELPM ESIFYINRYVKNDPRFKFSYISNYIDNPLLGEPTKAFIEWLKQYNATHDNSVSLLGFDAN LEWMQSEINLFNFFYTLNDGKNAPEMKKICKAVLQDMGPFSKRDIAALLDSSEVVKSCLQ KDELLLVRKSIRLMQWFAKHSVRFGQRDTLMAQFAQFAIDSVFPKETVVTMYAHTGHLNY SQATSLVHLNYFGAGHYLKEKYGENYRCIDIATYQGATIVSAKFGAYAGGKLAAVPQRSI EYQLGQLGVDSVYLPMSKLQSSEALQMRMAGNGSIKVQFAYCYPQTRVDGVLFVRNVEPV IKSERVIKGWQEGDIPMLNAYKEALDKMKAM >gi|283510535|gb|ACQH01000084.1| GENE 106 146603 
- 146803 105 66 aa, chain + ## HITS:1 COG:no KEGG:BF0008 NR:ns ## KEGG: BF0008 # Name: not_defined # Def: putative glycosyltransferase # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 66 1 66 372 79 51.0 6e-14 MKIAFLTTLDPEDTNTWSGTSYHILQALCKSHDVKIVGQNMLAQALFFAKGNISPKRDMR EYSPVF >gi|283510535|gb|ACQH01000084.1| GENE 107 146891 - 147589 500 232 aa, chain + ## HITS:1 COG:MT3116 KEGG:ns NR:ns ## COG: MT3116 COG0438 # Protein_GI_number: 15842595 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Mycobacterium tuberculosis CDC1551 # 26 210 155 343 414 60 27.0 2e-09 MVHLSDVIYHAFKDYMHNDSKPEEIRQTEDAERKLLNKYNTIIYSSEWAKQSAVSYYGID PSKIHVVEFGANIPHPNDYQIDIDTSVCNLVFVGRNWEKKGGDKALGAYRELKAMGMNCT LTIIGCQPPYAEDEGVVVYPFLDKSKPAHLSKLCSILRNAHFLILPTQFDAFGIVFCEAS AYGVPSVAANVQGTGQPICEGKTVSCSHPRQRQLIMPKRYTRYSATRTNTSH >gi|283510535|gb|ACQH01000084.1| GENE 108 147505 - 147711 69 68 aa, chain + ## HITS:1 COG:no KEGG:BF0009 NR:ns ## KEGG: BF0009 # Name: not_defined # Def: putative glycosyltransferase # Organism: B.fragilis # Pathway: not_defined # 1 68 304 371 372 83 55.0 2e-15 MLPPTATAIDYAQKIHSIFCDKDKYIALRKSSRKEYETRLNWDVWAKKVNIILEQTILNY KKTYGRQQ >gi|283510535|gb|ACQH01000084.1| GENE 109 147695 - 148372 656 225 aa, chain + ## HITS:1 COG:no KEGG:BF0009 NR:ns ## KEGG: BF0009 # Name: not_defined # Def: putative glycosyltransferase # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 217 1 216 222 308 64.0 8e-83 MEDSNNGFYLPVYVINLEERTERRQHIEEQFSDRTEFELTWIKAVKHPIGAVGLWKSMVQ AIRKAEENEDDIIVICEDDHTFTPAYNREYLFRNILEAEAQGIELLSGGIGGFGMTIPVA DNRYWVDWFWSTQFIVVFKPLFKRILDYDFKDTDTADGVLSALTLDKATMYPFISIQTDF GYSDVTRTNNDSVGLVDHLFLRSTLRLRIVQQVARKYGRTVHGRQ >gi|283510535|gb|ACQH01000084.1| GENE 110 148359 - 149651 1006 430 aa, chain + ## HITS:1 COG:NMB1738 KEGG:ns NR:ns ## COG: NMB1738 COG0845 # Protein_GI_number: 15677583 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Neisseria meningitidis MC58 # 104 
427 149 467 475 75 26.0 2e-13 MEDNKRNERSEEVQAIVDRMPTEWVKWVALCVGVLMGIVVLLGFLIQYPDTVDGPISVTA NTAPVRLVANGNGRIYLLKTNKTKVKKGDVISYLESGANYRHILKVEGLLSNLNKNMQAA IDLPDTLILGEVSSAYNAFMLAYMQYRRVLSSNIYGTMHRNLRQQITSDKSVIANINNEV ALKRQLLHVSANQLKKDSVLLAAKVISEQEFQQQHAAHLSLQEALLNLQSTRMLKQSEVN HNQMEIQRILLEESESKDKVYTDFVTRKNELSNVINIWKERYLQYAPVAGQLEYLGFWRD NRFVQNGQELFSIIPSKNNILGEVTIPSFGVGKVAIGQHVNVKISNFPYDEYGQLKGVVK SISRISHKMKVQDKEVDAYLVLISFPHGTLTNFGKRLPLDFESNGTAEIITKRKRLIERL FDNLKAKGEK >gi|283510535|gb|ACQH01000084.1| GENE 111 149653 - 151845 230 730 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 469 716 118 356 398 93 29 7e-18 MMKFCRFPVEYQMDSHDCGPASLKIIAKHFGKYYSLQYLRDRCGITKEGVSLHDISVGAE NIGLRTLAVKCDVNEIVNSVPFPAIIFWNNNHFVVVYHADKKRIYVSDPARGRVKYSHET FKQGWYQRGENQGVLLAVEPTVDFKETKAEKEQKRNSFLSILKYFTPYSHSFALIFVIMF IVTALQGVLPFISKAVIDVGIKTSDVNFINMILIGNISILLSVMVFNVLRDWILLHITAR VNIALISDYLIKLMNLPVTFFENKLMGDILQRAQDHERIRSFIMNNSLSLIFSMLTFVVF AIILLIYNVVIFLIFLAGSALYVGWVLLFLNVRKKLDWEYFELLSRNQSYWVETVSAIQD VKINNYEKQRRWKWEEIQTRLYHVNKRVLAITNMQNLGAQFIESIKNMGIVFFCAVAVIN GDITFGVMISTQFIIGMLNGPLSQFISFVMGAQYAKISFLRINEIRQLEDEEELLSVGNT CILPNEKSITLSNVHFQYSANSPFVLKGIYLQIPENKVTAIVGGSGCGKSTLLKLLVRLY KPSFGDIKMDTMNVDAINLRQWRNLCGVVMQDGKVFSDTIINNIVLDADRIDYDRLHEVC KIAQLEEEINAMPNGFDTTVGETGRGLSGGQKQRLLIARALYRNPKMLFLDEATNALDTI NERKIVDALNNAFEHRTVVVIAHRLSTIQNADQIVVLDKGYIVEVGTHKTLMENKKRYYE LVSSQTQTLP >gi|283510535|gb|ACQH01000084.1| GENE 112 152337 - 153320 1167 327 aa, chain + ## HITS:1 COG:PM1621 KEGG:ns NR:ns ## COG: PM1621 COG0180 # Protein_GI_number: 15603486 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Tryptophanyl-tRNA synthetase # Organism: Pasteurella multocida # 4 326 6 330 333 244 41.0 1e-64 MEKVVSGIRPTGNLHLGNYFGAVKSFVKMQDEYESLFFIADWHSLTTSPKPEDIQRSARV ILAEYLACGIDPQKAKIYVQSDVKETLELYLFFNMNMYLGELERVTTFKDKARQQPDNVN AGLLTYPTLMAADILQHRAHKVPVGKDQEQNMEMARKCARRFNHIYGVDFFPEPQNFYFN 
TQAVKVPGLDGSGKMGKSDGNCIYLYEEDAAIRKKVMKAVTDMGPQAPNSPKPEVIENLF TFLRICSEPETYNYFDEKWNDCSIRYGDLKKQIAEDIIKTVTPIRERIREYSSNTQLLDR IAQEGAEVARASACATLKEVRQIIGFR >gi|283510535|gb|ACQH01000084.1| GENE 113 155094 - 158408 3763 1104 aa, chain + ## HITS:1 COG:no KEGG:BT_3239 NR:ns ## KEGG: BT_3239 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 51 1104 2 1059 1059 1348 63.0 0 MNKKLLMCLVTFLLCVSAAVAQTKISGTVVSAGDNEPVIGATITVVGTKTAAVTDFDGKF SLTTDVPNPQVTISYIGMVAQTLKATANMHVELKANAQVLNEVVVTGLTRTDRRLFTGAT DKVDAEKARLSGVADISRSLEGQAAGVSVQNVSGTFGTAPKIRVRGATSIYGSSKPLWVV DGVIMEDAANVGADDLASGNPETLISSAIAGLNADDIESFQILKDGSATSIYGARAMAGV IVVTTKKGKQGQAHISYTGEFTTRLVPSYNNFDILDSKEQMGIYREMADKGWLNLSEILN GSEYGVYGKMYELINRYDARTGRFALENTTEAKNRYLRQAEFRNTDWFNELFTPSIMQNH SISLSGGTQNSNYYASFSAMLDPGWYKQSNVNRYTLSLNLTQHLSDKLSLNFIGSAAYRK QRAPGTLGQDVDVVGGEVKRDFDINPYSYASNTSRVLDPNEYYVANFAPFNIFNELNNNY IDLNVLDAKFQLELKYKPIKGLELSALGAFKYMASTQEHNVKDNSNQAMAYRAMSNGIIR EANKYLYNDPSNPYKMPFSVLPYGGLYHKGDNRMSDYDLRATANYNHTFADTHIMNLFAG TEVKSIERQRNSFEGSGLRYDAGQVPFYIYEYFKRAVEGGNTYYMITPTNSRSVAFYGNA TYSYKGRYVFNGTLRYEGSNQMGRDTNARWMPTWNVSGAWNVHEEGWFGKLSPLSRLTLR ASYSLTGTPPDASYSNSTAIIKASNPFRLFAEDQEPQLEIYELANANLTYEKKNELNLGF DASLWNNRLGITFDFYTRRNFDEIGPMVTAGTGGQIIRAANVAEMSSNGVELSLSSVNMK NKHFSWTTNFIYAYAHTEITKLFNQGNVMSLVSGSGFAKRGYPARSLFSIPFVGLNDEGM PLVLNEKGMITSDDINFQERTLTDFLRYEGPTDPLHTGSIGNMFTYRGFRLNVFVTYAFG NVVRLDPKFRARYNDLVSMTNAFKNRWTKPGEEDQTNVPGIISKAQYMANTNLRLGYNAY NYTSLRVAKGDFIRLKEVSLGYDLPAQWFQKAMVKNVSLKLQATNLFLLYADKRLNGQDP EFYNVGGVASPMPKQFTLTIKLGL >gi|283510535|gb|ACQH01000084.1| GENE 114 158420 - 159991 1785 523 aa, chain + ## HITS:1 COG:no KEGG:BT_3238 NR:ns ## KEGG: BT_3238 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 523 1 519 519 436 44.0 1e-120 MKVNKYITGLSMAAISFALASCNEFLDKYPDSRMDLKNPSEVSQLLVSAYPSAHPAYLTE MYSDNTDEQMHSTWSTFDRFQEQAYQWKDIQEVRNSETPYQLWEAHYTAVATANEAIAHI KSVSNPHDYDAQLGEALLCRAFAMFQLSTVFCQAYDKTTAQGNLGLPYPTEPEQVVGRLM 
ERGTLEQLYANIEKDLLEGISLVGTKYARPKFHFTKQAAYAFATRFYLYAQQYDKAIKYA DLVLGNQPGDILRNWAEWNQLGPSGNVQPNAFVQASNNANILLLPVPSEWGVISIPVLAG SKYAHGEMLSKNETLQAPGPWGNSGTELLYTVTYNNGVSKYALRKIPYVPRVIDVTAGIG IPYSEYAVFNTDETLLERAEAYALSGKYAEALKDINTELTAFSKKKVQLTLDKIKDFYNS QAYYTPTKPTPRKALNPLFAIEKTTQEPLLQCLLQLRRILTIHEGFRLQDVKRYGITMYR RQVDVQSKVNALTDSMKVGDPRLAIQLPQDVISAGITPNPRNK >gi|283510535|gb|ACQH01000084.1| GENE 115 160005 - 160892 1081 295 aa, chain + ## HITS:1 COG:no KEGG:BT_3237 NR:ns ## KEGG: BT_3237 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 294 1 292 292 232 43.0 1e-59 MKKYLYLVASACLCCGAILQSCGESKLSGDSIFSTEAQKRNAFDQWLYKNYTMPYNIDFQ YRLKTEETDKAYNFVPADSAKTAKLAIITKFMWFDPYAETIGLDFVKENVPRIIVAVGIP GYTRYRTEVVGSAEGGYKVTLSKVNALTDDLLKDYHSMTAFYFHTMHHEFMHILNQKKPY DESYDNISRADYVSGNWTSVPEKRAYALGFVSPYSMENPAEDIAELYSIYVTSTPDEWNN ILKAAGNKGASTINRKLKMVRDYMKNSWNADIDLLRNAVLRRGGKISTLDLNSLN >gi|283510535|gb|ACQH01000084.1| GENE 116 160906 - 162225 1524 439 aa, chain + ## HITS:1 COG:no KEGG:BT_3243 NR:ns ## KEGG: BT_3243 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 15 307 1 304 443 145 30.0 3e-33 MKIKNIKEEVNDKQMKKAYFLFMAVVLTLLMQACLHDNKTAFDLPAAQRIDQSVAEYTVL LESSEGGWMLQYYAGKNYSYGGYTLLLKFKDGHVTAMGDVLDPEAVATSDYEVVKDQGPM LSFNVYNKVIHPLAEAWLGNPEGIQGDYEFSILRATTDSIVLRGRKWKNEMVLTRLPKDA NWEEMMLGIITVKDGMSVSTYNFIQGNDTLAQGSIDPTTRRLSVTLGKTTWDMPYCTHAT GIVLRQPIVIGNKQYQNFTWNETDKVLTDNDLKLAQFVPKNHKTLDFWVGEWQLKTSLRK RITLTLELGTTANTLKGHLLYDKVSYELQLTYDPATGRIELPGQPVIDPTYKYPAGIVLI PASLKEKKIFGEGKGSMYFTWNGDMERADAEDSGQITGHTVDSFFGVAYGEDLSPILDPK GDYVYAFTLPNIEYMRKIK >gi|283510535|gb|ACQH01000084.1| GENE 117 162234 - 163298 1271 354 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929234|ref|ZP_06423079.1| ## NR: gi|288929234|ref|ZP_06423079.1| hypothetical protein HMPREF0670_01973 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 354 1 354 354 625 100.0 1e-177 MNKLISISMATALAAFVVACSSDKELPEYAAGLSVVKAQTAFGVLGGTQEVTMAAQPAQA YAQDAWLSVATKAETIALTAQTNTSAQTRNTLLVIKNQQGDSITLNVQQEGVAFGLPVGE DIYTGDEAFSKSFTTPANVPVTYQGTESWITVEDKGGAINVKLAANATGKPRVGWVTAKA VGLTDSLKVVQAALSDVEGEYTQTALMRLPNRELEEKTTDVRIVATGANTANFIVEGKYS WEVAFTPGKGFTMNNGKIVGKNEVKPGVYEYLITVIVADDFRKGHETAINGTKESVLLTF DDKGNLVFKEAQKIASEQTFSSYGWNRFSDSKPVMGAFRGIGEVFVKPKLTRKP Prediction of potential genes in microbial genomes Time: Sat May 28 01:59:09 2011 Seq name: gi|283510534|gb|ACQH01000085.1| Prevotella sp. oral taxon 317 str. F0108 cont2.85, whole genome shotgun sequence Length of sequence - 16922 bp Number of predicted genes - 16, with homology - 11 Number of transcription units - 12, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 169 - 228 7.4 1 1 Tu 1 . + CDS 312 - 533 56 ## 2 2 Tu 1 . - CDS 642 - 893 96 ## + Prom 583 - 642 1.5 3 3 Tu 1 . + CDS 763 - 1983 1175 ## PROTEIN SUPPORTED gi|90021240|ref|YP_527067.1| ribosomal protein S32 + Prom 2006 - 2065 2.1 4 4 Tu 1 . + CDS 2104 - 3423 1255 ## COG3458 Acetyl esterase (deacetylase) + Prom 3441 - 3500 2.2 5 5 Op 1 . + CDS 3614 - 4894 1448 ## COG3934 Endo-beta-mannanase 6 5 Op 2 . + CDS 4901 - 7213 2514 ## COG1472 Beta-glucosidase-related glycosidases 7 5 Op 3 . + CDS 7236 - 8345 946 ## COG4124 Beta-mannanase + Term 8400 - 8451 -0.6 - Term 8176 - 8203 0.1 8 6 Tu 1 . - CDS 8376 - 8579 75 ## - Prom 8618 - 8677 6.5 + Prom 8577 - 8636 2.9 9 7 Op 1 . + CDS 8682 - 9845 1464 ## COG2152 Predicted glycosylase 10 7 Op 2 . + CDS 9839 - 11242 622 ## PROTEIN SUPPORTED gi|90020673|ref|YP_526500.1| ribosomal protein L9 11 7 Op 3 . + CDS 11245 - 12462 1207 ## COG2942 N-acyl-D-glucosamine 2-epimerase + Term 12564 - 12611 2.1 + Prom 12476 - 12535 6.1 12 8 Tu 1 . + CDS 12646 - 13536 1089 ## COG2207 AraC-type DNA-binding domain-containing proteins 13 9 Tu 1 . - CDS 13830 - 14027 80 ## + Prom 14234 - 14293 5.0 14 10 Tu 1 . 
+ CDS 14539 - 14754 110 ## + Prom 15055 - 15114 5.8 15 11 Tu 1 . + CDS 15222 - 15983 863 ## TDE0221 hypothetical protein + Term 16029 - 16065 -0.4 16 12 Tu 1 . - CDS 16064 - 16921 780 ## COG1835 Predicted acyltransferases Predicted protein(s) >gi|283510534|gb|ACQH01000085.1| GENE 1 312 - 533 56 73 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLYSEELRSYRVKELSQMPYSEELRSYRVKELSQMLYSEELRSYRVKELSQMLYSEELRS YRVKELSQMPRWR >gi|283510534|gb|ACQH01000085.1| GENE 2 642 - 893 96 83 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLPWNLFHELHPICGVCAKFSHFWAKHIVGSITKATSQSFILFISAINKRLSVGSTDDAK IELNCLTSVYYIDKYQYFICESR >gi|283510534|gb|ACQH01000085.1| GENE 3 763 - 1983 1175 406 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90021240|ref|YP_527067.1| ribosomal protein S32 [Saccharophagus degradans 2-40] # 6 404 10 403 408 457 51 1e-128 MKRIKLWLVAFVMLPTMCFAQKWENLAQTPQMGWSSWNKFQGNINEDIIKGIADAMVSSG LRDAGYTYINIDDCWHGQRDADGFIQPDSKHFPSGMKALADYVHARGLKLGIYSDAGTET CAGRPGSLGHEYQDALQYARWEVDYLKYDWCNTTNVNPRGAYQLISDALCAAGRPIFLSM CEWGDNQPWRWARDIGHSWRIGPDIWCSFDSTRVFPTYVQYSVLDCINKNDSLRRYAGPG HWNDPDMLEVGNGLSVNQDRAHFAMWCMMASPLILGNDVRNMSAETKAILTNRDLIAINQ DRLGVQGLRFLSRDGLDYWFKPLANGDWAMTIFNPTRKPIACNLNWQDFNFTDDEVSKRS TEFDKYIYKVKNLWTGRMEGKTSTKQKVERKLVVPAQDVIVYRLVR >gi|283510534|gb|ACQH01000085.1| GENE 4 2104 - 3423 1255 439 aa, chain + ## HITS:1 COG:TM0077 KEGG:ns NR:ns ## COG: TM0077 COG3458 # Protein_GI_number: 15642852 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acetyl esterase (deacetylase) # Organism: Thermotoga maritima # 121 407 14 303 325 104 27.0 3e-22 MKRITLILGFFLLVSLAVVAQIRGNEIVVQVQPNHANWNYRLGEEAIFNVAVLKNGCPLP QAKVDIEAGPVMYADVKRSGVVLKDGTTTWKGTMKTAGFYRLKVWAHVGEKTYEGLCTVG YAPETLQPTDNCPADFDQFWADAYEKASYYPLDAHRRLLPERCTERVQVYEVSFNGLNPG NRIYGILCIPTAYEDKRPALLRVPGAGVRPYQGDVEEANIGVITLEIGVHGIPVTMPQQV YNDLLHGALNGYWETNLDNRDQMAYKRIFIGALRAVDYLTQLPEWNGRELGVTGASQGGM LSLVCGALHPAVTFVGVVHPAMCDLTASLHGKPCGWPHYFYGQKQPDAKKVATSGYYDGV 
NFARRLTIPCWFSFGYNDEVVPPNSAYAAYNVTQGLRTLKLYPATGHFWFQEQWEEWREW LEKQLKKPHPQPLSKGRGE >gi|283510534|gb|ACQH01000085.1| GENE 5 3614 - 4894 1448 426 aa, chain + ## HITS:1 COG:CC0801 KEGG:ns NR:ns ## COG: CC0801 COG3934 # Protein_GI_number: 16125054 # Func_class: G Carbohydrate transport and metabolism # Function: Endo-beta-mannanase # Organism: Caulobacter vibrioides # 22 423 28 438 442 322 42.0 1e-87 MNIIRYYTLALLLVCATAVKAQSFVTVKDGRLYRDGKPYTFIGANYWYGAILGSKGKGGD RKRLNRELDEMKRLGITNLRILVGSDGEEGIKWKVSPVLQPSPSVYNDAILDGLDYLMLQ LQRRGMVAVLYLNNSWEWSGGYGFYLEHAGAGKALQPNEVGYSAYIKYASQFSTNKQAQQ LFFNHLCFILKRTNRYTKKRYADDPAIMSWQIGNEPRAFDKAVLPQFEAWLAKAAAMMKS IDKRHLVSVGSEGAFGCEADYDSWQRICADPNVDYCNIHIWPYNWSWAKKDSLSQNLQRA KDNTKEYIDRHLAICAKINKPLVMEEFGYPRDGFAFSKQSPTTARDAYYGYVFSLLAADA AKGGYFAGCNFWGWGGQAQPKHEQWEPGDDYTGDPAQEAQGLNSVFSSDTSTINVIKAGI AKLPKH >gi|283510534|gb|ACQH01000085.1| GENE 6 4901 - 7213 2514 770 aa, chain + ## HITS:1 COG:TM0076 KEGG:ns NR:ns ## COG: TM0076 COG1472 # Protein_GI_number: 15642851 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 25 759 3 749 778 499 39.0 1e-141 MKRILSLALAALTLQTMSAKQEQPLYKNPKASVAQRVDDLLRRMTLEEKVGQMNQLVGIE HFKQYSTSMTAEELATNTANAFYPGVTVHDMETWTRRGLVSSFLHVLTLEEANYLQKLNM QSRLQIPLLIGIDAIHGNAKCKGNTVYPTNIGLASSFDVDMAYKIARQTAEEMRAMNMHW NFNPNVEVARDGRWGRCGETFGEDPYLVTLMGVATNKGYQRNLDNAQDVLGCVKHFVGGS YAINGTNGAPCDVSERTLREVFFPPFKAAIQQGGDWNVMMSHNELNGIPCHTNSWLMNDV LRKEWGFKGFVVSDWMDIEHCVDQHRTAANNKEAFYQSIMAGMDMHMHGPEWQTAVVELV REGRIPESRIDESVRRILTVKFRMGLFEHPYSDMKTRDRVINDPEHKRTALEAARNSIVL LKNANNLLPLDAQKYKKVLVTGINANDQNIMGDWSEPQPEEQVWTVLRGLRSVSPTTDFR FVDQGWNPRNMSQAQVGAAVEAAKECDLNIVCCGEYMMRFRWNERTSGEDTDRDNLDLVG LQEQLIRRLNETGKPTVVIIISGRPLSVRYAAEHVPAIVNAWEPGQYGGQAIAEILYGKV NPSAKLAMTMPRHVGQISTWYNHKRSAFFHPAVCADNTPLYPFGHGLSYTTFRYSNLQLN KANIPNDGKTSVTASVTIENTGKRDGVEICQLYINDVVASVARPVKELKDFRRVALKAGE KKTIEFTITPDKLALYDLNMKPIVEPGTFEVMVGGSSRDEDLQKATFNVQ >gi|283510534|gb|ACQH01000085.1| GENE 7 7236 - 8345 
946 369 aa, chain + ## HITS:1 COG:BS_ydhT KEGG:ns NR:ns ## COG: BS_ydhT COG4124 # Protein_GI_number: 16077655 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-mannanase # Organism: Bacillus subtilis # 59 318 79 331 362 100 29.0 5e-21 MAIAMLALGANAKQKVQKTPAQQLIERLKTLQKRGVMFGHQDALFYGTTWKWEFGRSDVN DVCGDYPAVLGCELGGLELGNDKNLDGVPFDKMRQQIIAHHQKGGIVTISWHPYNPVTGK DAWNTEGDAVTAVLPGGKEVAKMQLWHTRLAQFLGSLKDDKGHAVPVIFRPWHEMSGAWF WWGSKQCTPEQYKSLFRLTFDAMMQAGLRNLVWSYSPNAQANDTPEHYFLFYPGDSYVDV LGVDLYQYNASAVFIEQCQNEMRIMSEYAKSHNKLYALTEAGYRNTPDAQWYSSTLVPAI KGFAPSYVLLWRNAWDKAEENFGPAPEKTCANDFRKIHKEGAFLFRSQLQCAPTKGCWMV SKKGKSKKG >gi|283510534|gb|ACQH01000085.1| GENE 8 8376 - 8579 75 67 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLCTVATRCFVLSLQDACTVTTRCLYCHYKVLVLPLQGALYYRYKKLVLRLQGTCSITKH GIALIRC >gi|283510534|gb|ACQH01000085.1| GENE 9 8682 - 9845 1464 387 aa, chain + ## HITS:1 COG:MA2382 KEGG:ns NR:ns ## COG: MA2382 COG2152 # Protein_GI_number: 20091213 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Methanosarcina acetivorans str.C2A # 27 366 5 316 335 119 29.0 1e-26 MSTFNSKLNALRQRHEALLTTPNHMLEHGNGIYERYANPILTAEHTPLEWRYDMDERTNP HLMQRIMMNATLNAGAMKWGDKYLLVVRVEGADRKSFFAVAESPNGVDNFRFWDEPITMP ETDNPATNIYDMRLTQHEDGYIYGVFCAERHDDSKPADLSAATATAGIARTRDLKTWERL PDLKTRSQQRNVVLHPEFVDGKYAFYTRPQDGFIDTGSGGGIGWALVDDITHAEVHEETI INPRHYHTIMEMKNGEGPHPIKTPQGWLHLAHGVRGCASGLRYVLYMYMTALDDPTRMIA QPGGYLLVPQGGEYVGDVMNVTFSNGWIADEDGRVFIYYASSDTRMHVATSTIERLVDYC LHTPQDGLTTAASVENIKKLIAKNRAL >gi|283510534|gb|ACQH01000085.1| GENE 10 9839 - 11242 622 467 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020673|ref|YP_526500.1| ribosomal protein L9 [Saccharophagus degradans 2-40] # 1 466 6 521 522 244 29 4e-64 IMKIALSEKIGYGLGDMSSSMFWKLFGAYLMIFYTDVFGISAAIVGTMFAITRVWDSFFD PIVGAVADRTSSRWGRFRPYLLYLAVPFGLIGVLTFLTPPMHETGKIVYAFITYALMMMV YSAINVPYASLLGVMSPDPAHRNTLATYRMTFAYLGSFVALLLFMPLVNAFGGGDANGPM RGWLTAPQFGWLMAVVVIAIVCIVLFLGCFALTKERVKPVKQEKNSLKTDLRDLLYNRPW 
WILLGAGVASLVFNSIRDGATVYYFKYYVDETAVGNISFLGLPFVLSGLYLAVGQAANIV GVILAAPISNRIGKRRTFMAAMAVATVLSIGFFWLDKGQLALIFILQALISVCAGSIFPL LWSMYADCADYSELQTGNRATGLIFSSSSMSQKFGWAFGTAITGWMLAQFGFQANAVQSA ETIQGIKMFLSILPAAGAFLSLAFIYFYPLSETKMKEITAELQEKRK >gi|283510534|gb|ACQH01000085.1| GENE 11 11245 - 12462 1207 405 aa, chain + ## HITS:1 COG:slr1975 KEGG:ns NR:ns ## COG: slr1975 COG2942 # Protein_GI_number: 16330802 # Func_class: G Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Synechocystis # 2 390 7 387 391 97 25.0 3e-20 MEKLKNELRDELTQNILPFWLNKMVDHRRGGFLGRIDGHGNPVPDAEKGAVLNARILWAF AAAYRVLGKPEYLEAATRAKDYFVAHFIDPQEGGVFWSLDAQGRPLDTKKQTYAIGFAIY GLSEYVRATGDEVALQQAKDLCYCIERYAFDSLNNGYVEALTRNWQPIADMRLSDKDENG SRTMNTHLHILEPYTNLYRVWRTPQLAERIANLIDIFLTHLLNPQTNHLDLFFDDRWQGR RNIQSFGHDIEAVWLLHEAALELNDPNVLLRVEHAIKAIAKAADEGLQADGSMVYERWTD TGKVDTQRQWWVMCECVIGHIDLYQHFGDTAALDIARRCWQYTREHIVDHEQGEWFWGCD ENARPNLVDDKAGFWKCPYHNARMCLEVVERGLTPNPSPRGEGSE >gi|283510534|gb|ACQH01000085.1| GENE 12 12646 - 13536 1089 296 aa, chain + ## HITS:1 COG:AGl1135 KEGG:ns NR:ns ## COG: AGl1135 COG2207 # Protein_GI_number: 15890685 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 22 293 36 303 313 115 28.0 7e-26 MDTYVLHEITPLMETDALYIADRHKKEFSYPVHNHDVFELNFVENAAGVRRTVGDNSEVI GDFDLVLITSPTLEHVWEQHECKSEDIREITIQFNFGAGMTETDQFFGKTPFESIRRMMK EAQKGLAFPMTTIMKVYDKLDEMSRITDRFRALMQFLDILHTLSLSTGARTLATTSYAKV NIEDDSRRVLRVKKYISDNYMYELRLKTLADLANMSESAFCRFFKLHTGRRLSDYIIDIR MGHAARMLVDTTETIVEISFKCGYNNMSNFNRIFKRKKGCSPTEFRNNYHKIKVIV >gi|283510534|gb|ACQH01000085.1| GENE 13 13830 - 14027 80 65 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVVVPCFCMLVLSLMLVPVNVHLHGITLGMKDCLKGGLTFRKRNPQRLLVGIADRKAEPI WSFAD >gi|283510534|gb|ACQH01000085.1| GENE 14 14539 - 14754 110 71 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQKHSKCPKTEGISAQKTLKFHRLGAFYMALGGYFLLEWFGRRMRFVKYVYRFNKSLIDD FGGKDDGYAAA >gi|283510534|gb|ACQH01000085.1| 
GENE 15 15222 - 15983 863 253 aa, chain + ## HITS:1 COG:no KEGG:TDE0221 NR:ns ## KEGG: TDE0221 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 3 118 6 122 216 65 37.0 3e-09 METNMKEMLAQLATTTQKAEIESVLQQTTAQLLQHYKIRAREVELYPIELESYFYRAGVF EDPYVHTNELQANHFGQLYVHRRGKEATSPYKMDNRVCMGISLSVSANYYYSALIRSAVF ADGSMVFGPNNVLMHFMRSVNASAKLLDVHFFDTHHAGTFMLAPLFHLVEGQQVLFEVDA SADPRDKSYLMHGARIGLGDTAPYFQQLPLRSAIGKLSKPFEFKEKTQLMDNYLAEHKLT GDKAKEMAKEMRQ >gi|283510534|gb|ACQH01000085.1| GENE 16 16064 - 16921 780 285 aa, chain - ## HITS:1 COG:CC1328 KEGG:ns NR:ns ## COG: CC1328 COG1835 # Protein_GI_number: 16125577 # Func_class: I Lipid transport and metabolism # Function: Predicted acyltransferases # Organism: Caulobacter vibrioides # 1 281 79 333 337 65 29.0 9e-11 IRLHPMVVIGALLGGLLFYAQGTEYLKVSDVPVALLAFATLLNLFLVPAWGGVEVRGFGE IFPLNGPTWSLFFEYIANLLYVLAIRKLSTRWLAMLVFATGGALFGYAITNEYGNLGAGW TFANYGFWGGLLRVMFAFPAGLLLSRVFRKWRVRGAFWWCGLAIVAVTVPPRLGGESQMW VNGIYEALCVLLVFPLVVWIGASGETTDALSTRICNFLGDISYPIYIIHFPLIYTCFAYA KRHDIPFCQTIPQVLALVFGCIALAWLCLKFYDMPVRRWLQKRFV Prediction of potential genes in microbial genomes Time: Sat May 28 01:59:51 2011 Seq name: gi|283510533|gb|ACQH01000086.1| Prevotella sp. oral taxon 317 str. F0108 cont2.86, whole genome shotgun sequence Length of sequence - 50676 bp Number of predicted genes - 32, with homology - 29 Number of transcription units - 17, operones - 7 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 274 147 ## COG1835 Predicted acyltransferases 2 1 Op 2 . - CDS 326 - 1345 1125 ## SCO6610 secreted protein - Term 1358 - 1398 5.0 3 2 Op 1 . - CDS 1406 - 2317 996 ## gi|288929247|ref|ZP_06423092.1| hypothetical protein HMPREF0670_01986 4 2 Op 2 . - CDS 2352 - 4082 1762 ## PRU_1875 putative lipoprotein 5 2 Op 3 . - CDS 4106 - 7261 3262 ## BVU_2578 hypothetical protein - Prom 7499 - 7558 3.4 - Term 9832 - 9866 3.2 6 3 Op 1 . 
- CDS 9917 - 10438 716 ## BT_2261 hypothetical protein 7 3 Op 2 . - CDS 10445 - 11164 862 ## BT_2262 hypothetical protein 8 4 Op 1 . - CDS 11265 - 12761 1584 ## BT_2263 hypothetical protein 9 4 Op 2 . - CDS 12774 - 15962 3637 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 15991 - 16050 5.3 + Prom 16237 - 16296 2.5 10 5 Tu 1 . + CDS 16430 - 19393 3949 ## PRU_2068 hypothetical protein + Prom 19721 - 19780 8.6 11 6 Tu 1 . + CDS 19834 - 20346 -237 ## gi|288929076|ref|ZP_06422922.1| hypothetical protein HMPREF0670_01816 - Term 20874 - 20915 10.2 12 7 Op 1 . - CDS 20945 - 21889 1272 ## gi|288929257|ref|ZP_06423102.1| conserved hypothetical protein 13 7 Op 2 . - CDS 21900 - 24755 3270 ## COG1404 Subtilisin-like serine proteases 14 7 Op 3 . - CDS 24767 - 26488 2017 ## gi|288929259|ref|ZP_06423104.1| hypothetical protein HMPREF0670_01998 15 7 Op 4 . - CDS 26518 - 28140 1795 ## ZPR_0689 putative outer membrane protein, probably involved in nutrient binding 16 7 Op 5 . - CDS 28157 - 31156 3232 ## ZPR_0688 RagA protein 17 7 Op 6 . - CDS 31059 - 31247 62 ## - Prom 31351 - 31410 4.5 + Prom 31247 - 31306 4.2 18 8 Tu 1 . + CDS 31332 - 31541 86 ## + Prom 31558 - 31617 4.9 19 9 Tu 1 . + CDS 31691 - 32242 349 ## BT_0226 hypothetical protein + Term 32245 - 32290 3.2 + Prom 32491 - 32550 3.7 20 10 Tu 1 . + CDS 32657 - 33163 384 ## COG3727 DNA G:T-mismatch repair endonuclease + Term 33235 - 33278 -0.2 - Term 33338 - 33376 3.1 21 11 Tu 1 . - CDS 33430 - 34425 1038 ## COG3943 Virulence protein + Prom 34840 - 34899 7.5 22 12 Tu 1 . + CDS 34991 - 35239 109 ## + Term 35341 - 35395 -0.9 - Term 35962 - 36036 11.4 23 13 Tu 1 . - CDS 36043 - 37059 1128 ## Coch_0793 hypothetical protein - Prom 37083 - 37142 6.0 - Term 37138 - 37177 -0.1 24 14 Op 1 . 
- CDS 37409 - 40210 3021 ## COG3250 Beta-galactosidase/beta-glucuronidase 25 14 Op 2 1/0.000 - CDS 40235 - 41128 354 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase - Prom 41205 - 41264 4.3 - Term 41154 - 41202 5.3 26 14 Op 3 . - CDS 41333 - 42547 1032 ## COG0738 Fucose permease 27 14 Op 4 . - CDS 42565 - 43629 1096 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains - Term 43660 - 43694 0.5 28 14 Op 5 . - CDS 43707 - 44633 990 ## gi|288929272|ref|ZP_06423117.1| hypothetical protein HMPREF0670_02011 - Prom 44657 - 44716 1.5 29 15 Tu 1 . - CDS 44720 - 45511 903 ## COG1349 Transcriptional regulators of sugar metabolism 30 16 Tu 1 . - CDS 45770 - 46609 1005 ## Cpin_5551 hypothetical protein - Prom 46750 - 46809 2.8 31 17 Op 1 . - CDS 46846 - 48315 1743 ## SRM_03058 outer membrane protein, probably involved in nut rient binding 32 17 Op 2 . - CDS 48378 - 50612 2650 ## SRM_03057 TonB-dependent receptor Predicted protein(s) >gi|283510533|gb|ACQH01000086.1| GENE 1 1 - 274 147 91 aa, chain - ## HITS:1 COG:CC1328 KEGG:ns NR:ns ## COG: CC1328 COG1835 # Protein_GI_number: 16125577 # Func_class: I Lipid transport and metabolism # Function: Predicted acyltransferases # Organism: Caulobacter vibrioides # 18 91 12 80 337 70 50.0 7e-13 MTSNNTSAAAFADTKPHYHLLDGLRGVAALVVICYHIGEDFATNGLTRCVNHGYLAVDFF FMLSGFVIGYAYDDRLKTMGIMAFVKRRVIR >gi|283510533|gb|ACQH01000086.1| GENE 2 326 - 1345 1125 339 aa, chain - ## HITS:1 COG:no KEGG:SCO6610 NR:ns ## KEGG: SCO6610 # Name: SC1F2.07 # Def: secreted protein # Organism: S.coelicolor # Pathway: not_defined # 46 327 90 361 457 74 29.0 5e-12 MLYKFKHLTYIALALLALTFTACNESGDDVEETSLVKQNDEIVALDGKVTTLQTATKGNG YNIILMGDGFTVDLIKNGTYEEVMKKSAEHLFALEPMKSLRPYFNVYVVQKVSLSSDLSG STALASTIKNGKVCGFINDDNLDYKTMLYASTVPGYKEENSVISVVMNTTKPGGITFWGG WGSTLACAYTTLYGGVDGPYFRHTLVHETAGHAIGKLDDEYDLQNLDIDDAGRERFAYGH TLGWLMNVSTTNDATKAPWAEFLADTRYANQGLGLFLGGGARYATGVWRPSENSIMRTTD VDHIEFNAPSRRAIYNKVMQVTTGRTPTYEEFVAFDQKR 
>gi|283510533|gb|ACQH01000086.1| GENE 3 1406 - 2317 996 303 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929247|ref|ZP_06423092.1| ## NR: gi|288929247|ref|ZP_06423092.1| hypothetical protein HMPREF0670_01986 [Prevotella sp. oral taxon 317 str. F0108] # 1 303 1 303 303 596 100.0 1e-169 MKTFKHIGLIGVLTAVALFTACSDKDDYTPGPEVASGCQQVRFADTNNTICVLDSANTDD RTATITLKRNNTGAALTTPVKIVSQSAGLTIPTEVTFAAGDSLATLAIKAPEKVALKDVY SYELKLEGNDVDPYSKLDGGVVFSGQLNFPMSKKAEFYADPASKNYALFNAWTEMVMDLG NNNFYIKDFMHSGENLWLITSSTGVLGVKSDSWTKSCIVADTSAPGSYSYHFKKVVDKLL KPYPLYPLGNASGKPWISELTLYGGPDYSSYSSNGTWGKIYIEHAKYSTGEEESWVYLYF QLK >gi|283510533|gb|ACQH01000086.1| GENE 4 2352 - 4082 1762 576 aa, chain - ## HITS:1 COG:no KEGG:PRU_1875 NR:ns ## KEGG: PRU_1875 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 1 568 1 549 561 194 30.0 6e-48 MKRFIKNIILLSLTAPVLTACLEEAYPGRGFTQEQLFETDNTAAALSKAIPGAMLRMGKG NGHYGYAGLAMDRETMCAELPVYDVLYDFYQDEGFDRNYSPEYYVATDWWPFYYELIKKA NLTVGAVAINENTTAEELAHVGNALGYRAFAYYDMAFAYEYKKTGYGKLDEKAAIDKIYG LTVPIRTESTTQQEATTDRRAPFYVMYRFIMTDLDRAEKYLQGYKRSGANMMDQGVIFGI KARLWLQLGTRFEKNNDDLQMQLSHENDEALAKYDKLGITSANDCFKKAAQYARLAISEG HRPLTKAEWFNSTTGFNTANEAWLLAVQINSSDMTNGSDWNWKNFVGNMSSENTFGVNNL EYKAFRMIDAGLYKGIENGDWRKNTWIAPADAGQTAAYSKYTTLYKSDDWVKLPAYTGLK FHPRSGEMKNYKVGAAVDIPLMRAEEMFLIEAEAKAHYEGLAAGKQALTNYLNTYRYDDN SYQCTATTMDAFNDEILRQKRIEFWGEGIVFFDYKRLEKAVIRHYEGSNHPTDYQWNTKA GFVAPRMNICISHYERQYNANIVNNPDPTGVHEEKK >gi|283510533|gb|ACQH01000086.1| GENE 5 4106 - 7261 3262 1051 aa, chain - ## HITS:1 COG:no KEGG:BVU_2578 NR:ns ## KEGG: BVU_2578 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 19 1051 19 1053 1053 994 51.0 0 MMRRTATLMILCLVAFFAAAQTMKVSGTVVDESGEPIIGASVKVKDTTLGAATNLDGKFT LDNVPRTARELIVSYIGMNTVTATIKSEMRIVMKSDAKQIDDVIVVAFGKQKREAFTGSA GVLNADKIAERQVNNPLSALNGKVSGVQMIEGNGPSSSPTIQIRGISSLNAGNGPLIVVD GLPYSGYYSDINPADVESISVLKDAASNSLYGARGANGVILITTKTAKRGNAVISVDAKW 
GFNEDAKVDYDNVKSPGEYYEMFYKAIYNKFHYGDGMDAYSAYLKANETIYKPGSQGGLG SIVYSVPQGEYLIGQNGKLNPHATLGNVVRHNGKEYMLYPDDWKKEGLRKGFRQEYNIGI NGGTDIFKTYASIGYMKNNGICYGSDYTRYSARLKSEYQARKWLTFGGNISYTHSESNSS GNAFGAAHQIPPIYPLYLRDGKGNIMYDRNGKMYDYGNGAVNGVTREIFLNDAPLQNDLL NISTNSSNAYGMQGYADFSFLEDFKLTTNVSIYNTENRDNDAANPYYGFDKTRGGQVGVS HYRTFAFNTQQLLNYNKEFGKHTVSALLGHEYTRNSSTTLTGRKNNIFAYKMNTELSGAL TLVSANSYKDLYNIEGFFFRGQYDYDSKYFASASFRRDGSSRFHPKHRWGNFWSFGGAWI MTKEKWFKPKWVDMLKLKASFGQQGNDAIGDFYYSDFYDLLAVNGEGSLAFSNKGKRNIT WETNTNINAGFEFELFKRRLTGSVEFYQRKTTDMLLWFSVPPSLGYDGYYDNVGDMMNRG IEVNLDGDIIRTKDITWSMNFNITHNRNKISYLPKENKTLEKEGHAGYSDGSRYIGEGLP INTWYIKKFAGVNEEGLSTWYYTDQATGKLKPTTDYSSADYYLGGSPHPDVYGGFGTSVR AYGFDLNISFIYSLGGKGYDNGYEGMMRNPEPSIAPQAYHKDLLKAWTRENPNSNIPRFQ YGDLKTTAFSDRFLIDASTLTFKSISLGYTFPAQLIKKLQLSSLRVFLSCDNVAYWSKRK GFDPRTSLNGNVSDGTYSPMRTTSAGISVKF >gi|283510533|gb|ACQH01000086.1| GENE 6 9917 - 10438 716 173 aa, chain - ## HITS:1 COG:no KEGG:BT_2261 NR:ns ## KEGG: BT_2261 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 171 1 161 166 100 40.0 2e-20 MIMKKYIIQMAAAAFALMGLSACDVETNEKPGGTEVEQMAGFWDVRVHEVNADGTLTQNA FGGSTYNLQTYNTVENTADKMWVNVSVGKKWSLLLVTPINYGARTFACQNVKAVYNSDDA GATVSITDGKVLEGQGHNLHNLPTDSIVFNVQLSTYPGKTYRVTGTRRSGFTE >gi|283510533|gb|ACQH01000086.1| GENE 7 10445 - 11164 862 239 aa, chain - ## HITS:1 COG:no KEGG:BT_2262 NR:ns ## KEGG: BT_2262 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 239 1 230 230 83 26.0 5e-15 MNKYIINLLPALALVVGLSACGDESFTDSKVTPIVLMDMAGDATIEVALGATYQDPGCTA TYMGKDYTAKMEVDGLDAVNTNAPGLYPINYSCTTPDGYTYEAQRVVAVCNPKVKYKAAG TFFTAAGTKRVDANGVEQTYAKSVVNVTQLASGLVKVDDLLGGYYQAMKATGQPTADAVT IQALLLMAEDGTFQPYSVKANGAPVQLDAFSNATFNAKKGVFTWTVTYKGDTYNIVLNR >gi|283510533|gb|ACQH01000086.1| GENE 8 11265 - 12761 1584 498 aa, chain - ## HITS:1 COG:no KEGG:BT_2263 NR:ns ## KEGG: BT_2263 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 
498 1 498 498 503 51.0 1e-141 MKKITILAALATMVLSGCNLDINDDPNYPSEAQVTPEKLFPSAENAIADALGDQMFTYAG FFSQYFEQRPEQNQYNNLAELHLDEGSNLFDRCYQTLYAGALKDLQEVMAKSPNKANHYA CTVLRAWAYQLLVDNMSDAPYSEALKGKANPTPKWEDGKTVLMGVLDELDKAEANVGGES MTVDDPMMDKKLSQWKGFANALRLRIYMRFIDGGVDVANYTQKAKDLVAANAFFKGDIMF DVFSPVQNQYNPWYGSIFALNANNYVAAYPLVEYYSDTDDPRMGYAISVTAATGSYVGQL PGAKTTMKEWRNNQDWKNKNVSAIKFSVMKAQPIFAFTESELQFLLAEVQLRFNNDAAKA KACYEQGVKADFAWRNLKVGEATAFLANDKVKFDGKSVEEQLHLIYMQKWAAFFMRNHME AWSEIRRTNVPQTSKFTAKQVFDGEGYTPGDLIVPAVNHIQVGGLAKRVPYPNHARRLNP AQTPPEKKLSDPVFWDVK >gi|283510533|gb|ACQH01000086.1| GENE 9 12774 - 15962 3637 1062 aa, chain - ## HITS:1 COG:SMa2414 KEGG:ns NR:ns ## COG: SMa2414 COG1629 # Protein_GI_number: 16263718 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Sinorhizobium meliloti # 90 309 25 223 746 67 29.0 1e-10 MGKRFTMLLGSLFLCMGAALAQMKISGTVISSEDGQPVIGASVVVTGTGQGTVTNLDGQF TLTVPERAKLHISSAGMKAVDVTAKDNMRVVLEPLNTTLGEVVVTAMGIKRSAKALGFSA TAVKGDEIAAARTNDIMSSLAGKVAGVQISSSSSDPGSSKSVIVRGISSLGSTNQPLYVI DGVPLNNSTVSTDRDLNRGYDFGNGASAVNPDDVESMTILKGAAATALYGSRAANGVILI TTKTGKKAQKGLGIEYNGGMQWETLLRLPQMQNDFGMGWYGNKTDNENGSWGPRFDGSWL RYGAVYEGTQNFKSYVPIKNNVRDFFDTGYRYSNSVSLNGATDVSNFFLSLSQISQDGII PTNADSYNKYTFSARASHKVNNLTFSTSLNYAYQLNNFVSTGQKSSSMYNNVMQTPRDIS LVELKDLNNPFNTPGYYYTPYGVTNPYYVLNNFKNEFESERFYGKLQLDYDFLKYFKLTY RFGLDTSTEHHDMGEPNMSTLFKGTPNWEDALRSLTGRVSEQTTRRREINQDLLVTFDKD LTDSWHVNALVGFNGNERKYSYQSSEITNLTIPTWLNLANTADKPTIKSYQRTQRLMGAL GQAEVAFKNMVYLTLTARNDWSSTLPRGSRSYFYPGATASWIFSEVLPENVKKTVNYAKL RAAWGKTGNDALPYMVNTIYPQGSASSSGWATSKFPFTNGGWNAYTVGNTLGGSKLSPEM TTEFEVGLNMAFFGNRLSFDASFYNRTSDRQIFSLDMDAASGYLYQNMNLGEIRNRGVEL VVSGTPVKLKDFAWDVTWNFTKNWSKVISLPEELGGESSIYGLTGGTGLYAKVGEELGVF KAYVPLRDPATGKVVVDNQGLPVRDPKQQIVGSMNYKYTMGINNTFTYKGVSLSVDVDIR QGGKMFSRTKRITHFTGNAMQTAYNSRNPWIIPNTVVQNGTDAQGKPVYVDNNTPLNATS IYKFWDDGGLELGAGDLIDKSYIKLRAITLSWALPKKWLANTFLTDVRLSAFGNNLFLWT PAENTFVDPELTSFGNDLRGNFGEWSANPSSRKFGFNVSVKF 
>gi|283510533|gb|ACQH01000086.1| GENE 10 16430 - 19393 3949 987 aa, chain + ## HITS:1 COG:no KEGG:PRU_2068 NR:ns ## KEGG: PRU_2068 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 3 983 5 994 996 1585 75.0 0 MDNKIPQEWNDILLKDVNFIDLMKHRIYNVLIVANPYDAFMLEDDGRIDEKMYNEYVELG LRYPPTFKQVSTVEEAEEVMQGVSVDLIICMPGNADNDAFAVAHAVKERFPVVPCVVLTP FSHGITRRMQDEDLSIFDYVFCWLGNTNLILSIIKLIEDKMNVEHDTKEGGVQMILLVED SIRFYSSILPNLYSYILVHSKRISTEALNSQAAALRMRGRPKVVLARTYEEAMEYYDKYA DNILGVISDVRFPKDGVKDPEAGIKLLREIRRRDEFVPLILESSETNNREKAEKEGFRFV DKNSKKMNIDLRHLMEEHMGFGDFIFRDPKTRKEVARIASLKQLQDNIFNIPYDSMLFHI SRNHVSRWLSARAIFPVSAFLKGVTWHKLKDVDAHRQIIFEAIVRYRNMKNIGVVAVFDR MKFDRYAHFARIGDGSLGGKGRGLAFLDNIIKRHPEFNSFPGVKVHIPKTVVLCTDFFDQ FMEQNNLYQIALSDATDDEILRHFLAAKLPDSLKADFTTFFDATRCPIAVRSSSLLEDAH YQPFAGIYSTYMIPYTEDAEVMLRMLAAAIKGVYASVFYKDSKAYMSATSNVIDQEKMAV ILQAVVGKDYGGRFYPNISGVLRSINYYPVGDEQAEDGIANLALGLGKYIVDGGQTLRVS PYHPNQVLQTSEMKIALRDTQAQFYALDTTHLNTDFAVDDGFNILKLGVKEAEADQSLHF IASTYDPNDNIIREGLYDGGRKLITFGGVLRHDAFPLPQIMQMAMRYGEDAMKRPVEIEF ACNINADRTGEFNLLQIRPIVDAKQMLDEDLSAIPNEQCLLRSHNTLGHGVSDDVTDVVY VKYGDNFSAADNYRVAEDVERINSSFLGTDRNYILVGPGRWGSSDFWLGVPVKWPHISAA RVIVEVALKNYRIDPSQGTHFFQNLTSFGVGYFTIDTNIPDGLFRKDILDAMPAIEETQY VRHVRFNQPLRILMDGMKGEGVVAFSK >gi|283510533|gb|ACQH01000086.1| GENE 11 19834 - 20346 -237 170 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929076|ref|ZP_06422922.1| ## NR: gi|288929076|ref|ZP_06422922.1| hypothetical protein HMPREF0670_01816 [Prevotella sp. oral taxon 317 str. F0108] # 110 170 1 61 61 69 68.0 8e-11 MLDLLGLPAHLDLPAHLEFLTHLLLPGLPTPKSYHILYKSHLLSHPILAKNKAKKHAISH DFLLNALPFHPKFKGIICILHHFAFLDWVPARHFPSPNTHFLPPKNHFLMAILPLLAIFL TIREGCIYTIQAHIYAFRLAFSTILLCVLHHFTLRFAPKRKAFSTKTHCI >gi|283510533|gb|ACQH01000086.1| GENE 12 20945 - 21889 1272 314 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929257|ref|ZP_06423102.1| ## NR: gi|288929257|ref|ZP_06423102.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 314 1 314 314 565 100.0 1e-159 MNKISMAFLALGASLAMEATAQSRILPVLEATPDARTAALGGTLLGNANQMHIYTNPAAL TFGEKRFVADLSAEAQPKTDEGRLMQYNFATGYRFANRSALMAGVRYLGGLTVPSVGATG EPDKVSPYDMALDLGYSFAVTPEVAVYATATYARSHAATSANAYAFSVGAAYQKGFNICP TVPTTLTVGARLLDFGKSVKFNDTGIPQSLPTSVVVGGDWAVNFAPQHALTYALSCRYFT PKDASETLLAGGLEYTYNRMLSARVGYQYADKGSNALTFGAGGELSGFKLNVAYHHAFAD YGVDALMVGIGYAF >gi|283510533|gb|ACQH01000086.1| GENE 13 21900 - 24755 3270 951 aa, chain - ## HITS:1 COG:slr0535 KEGG:ns NR:ns ## COG: slr0535 COG1404 # Protein_GI_number: 16332024 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Synechocystis # 173 511 115 362 613 110 30.0 2e-23 MKKFRKIYTLAACITLTLGMFSACSDDLLSSQNNPNAAQTAADEAFPWQPGMAFIKLKPG VNPSTRAATQSVTRAKVFDNTDVKVEQVFDMNNEYAALKRARGLDRWFVVKFDSTKNVEE VINELRRDPAIEKAHGNVQIAPSKVKYTAASRAPIPSNKLRPANNGTGPLNFSDPYLQYQ WHYTTTVNEYGMFKPGADVNLFPAWQKETGSPHVVVAVMDSGVDFEHEDLAASAWQGVDK NTGQKIHGRNFYAAESGKGDPNVIVAGGHGTHVAGTIAARNNNGLGVCGVAGGNGSDDSG VRLMSCEIYGRDGTKETASTAYIVKAFEFAAENGASVCNCSWGYAFDRSKYLNNENFQSI FKAQFDLLKDGINYFTDYAGCDSKGNKKPESYMKGGLVIFASGNDSQYDIDMIPASYPRV VAVGATSSMGIPTDYMNKGPWVDILAPGGTTETGEVMRGVLSTVPKAFAQMSTGGTPNTD FTLPNDNNYAYAQGTSMAAPHVTGIAALIVSKFGKTNPNFTNEDLRRRLLGAVKATSPYA VKTDANLAGKMGVGFIDADFALADPETQKPDAPEITVTDYSADATRGYYDARITWKVTAD ADALNEQRTAFAYDIHLYKKADMSKPVQSLTRYSYDKPVGTELEQMFTELDTDVDYVVKM SARDRFNNTSADATASFKTRLNHAPVLGGAMTDTLRLLDTQPYFHKVLPVTDEDGHTWTY TTTALPSGVEMKRVGNALDLFIKVGSVGTYGFEVTLTDQLGGKTVQKFAYKVVSHTAPTP NNTLGDVSLFEQGEAAQINLANAFTAMSGGEMQFSATSSNDAVVKASVTGTQLTLTPGQK GTATITLTAIDGGKRTTTTILVRVTDKNAPDVHTIYPIPAHSYIKALMRSGVSKVQVIVT SLRGQKLIDETLTPDSRTHEVTLGIDRLAPGTYYLLLKTERITSKHTFIKK >gi|283510533|gb|ACQH01000086.1| GENE 14 24767 - 26488 2017 573 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929259|ref|ZP_06423104.1| ## NR: gi|288929259|ref|ZP_06423104.1| hypothetical protein HMPREF0670_01998 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 573 12 584 584 1149 100.0 0 MYKNIKIWALLIVASLASACQKDLTLGGTEDVVQGEAQTLTLKQEGETKTVSFATLGDEW QVEKGAYDAWLTAKKVGQALELTATANEGADERQAEVVLTTPGGKQTVTVTQFGTAPFIA VDGSNGTAIFNHEAHTAVELNVISNSDNWRVEQVDKENNKWLTYDVDQKQRKLRLTLTAI DRNSPWAQSSRTEKLFLSNGNKHFLLSVTQNGYVQFQLPVWDLDNFNLARVTELEGERNN SRDRAFEIDSLLPYKEDVDKIYYAFRSPGEQSPRIFYTPNYYSKLITTAWLKAPKGKEFQ KESYDPWLKQQNFKDGNKQRNATEAQYYCEEENQTRLVHVYNDPNNFKMFGGIYKSPCMK YVVSSNELKVGSNGKATCFPVFISERMHNKSFKLQEVIEFEKKRGMKPDFENQFNSEKIT TTTEDPSCKYSRLLFVPENASHDAGSLANVIYFFNWRGVTPEDIDAGLVADAELSGTVGS CQVFYMGGEVLYNREVEGTPGVYEWYTYTLPTTTRNAMEDKGYTYFRGDNSGFVTYFRGE ADLIDMRPQQTRTVITYYKSKHYVDLIKKTLNL >gi|283510533|gb|ACQH01000086.1| GENE 15 26518 - 28140 1795 540 aa, chain - ## HITS:1 COG:no KEGG:ZPR_0689 NR:ns ## KEGG: ZPR_0689 # Name: not_defined # Def: putative outer membrane protein, probably involved in nutrient binding # Organism: Z.profunda # Pathway: not_defined # 9 537 8 497 497 286 35.0 1e-75 MNTYMKTTKILLWAFTLAAMLQGCNLDREPADYIDYNASFRNMQDAAKWDNGIYSTLRGK FGGGYVLPQEAQADMLNAHAAFGNLYGEFHGWTILPESSVIQEIYHSYYSALVDANVVIT KLPQLPVSAAEKPTQRHYLGNAYFARAFYHFNLALRWGMIYDAATADKDLGVVLALEPGS LNKPARATNAQTYKQVLDDLNEAETLLADVPTSPGNTEINADAVRALRARVYLYMGDMDK AYQEAKQLVDKNTYPLIAPYTAKLNSEGKVTPTEDAFAQMWFYDKGTEQIWQPYVAKENE VPTVTSLYGADLSTTTHWDEAKQPSKVGDYNKPPYVPTREVINDLFANGNDHRAHIHFEF VNTTVNDVNVSTQLYVVSKFKGNPNYATLTSTHWGGYVPNGNQAPKPFRIAEQYLIVAEA AYKLGNTADAQSYLNTLRQSRGVPTTSATGDDLYQEIKNERARELAYEGFRLWDLRRWKL GVRKRTIQGAKGYYQVPASFYAGGYKVDIQTGNKMFVWPFPDNEVQINPNVKQNPGWDKQ >gi|283510533|gb|ACQH01000086.1| GENE 16 28157 - 31156 3232 999 aa, chain - ## HITS:1 COG:no KEGG:ZPR_0688 NR:ns ## KEGG: ZPR_0688 # Name: not_defined # Def: RagA protein # Organism: Z.profunda # Pathway: not_defined # 7 997 8 986 988 706 38.0 0 MKVRFWMFTLCLLTGLSSVLAQTITVKGNVVDENGEALPGVSIIPKSAPKTGTVSNIDGK FTISVRNGEKLTFSFMGYKSQELAAASQMDVRMEPSDLTLNDVVVVGYMERKVANTSASV VKVTAKDLAMKPVANPLDAVQGKVSGLQVFSSSGEPSSRLSMALHGQGSLGAGTGLLIIL DGMQVSMAVLQSMNANDIESVQFLKDAAATSIYGARAANGVMYVTTKRGRTETRPTINLR 
AQYAVSSLANTDYFDQLMTGPELLRYYQETGIYTADQIAALKSKYFKETDFQWYKYVYQP APMYTADVSVSGGTSAVRYYLSGGALSQDGLRMGSSYKKAFGRINLDAKLAEWVRAGINT QIAYDDTHVSPFGNNNAVGGGVAAMNAPFASPYDPRTGEELERVPLLDITTPKHVISTNP AKTNAFILSANGNLTLTPIKNLTLRSLVGLELNYGDSRSRTLPSYYRAYGNGSAERGYSM VSNFTITNTAAYNMGFGNHHVTALLGHEYTNYNDDGFSASGSGILDDRLNLLNNATQKKN VGEATNSYAFLSFFTQLSYDFSERYFVDLVLRNDASSRFGRNKRNGLFWSVGLLWKAKGE SFLKDVSWLNELDFKTSYGTQGNSSIPPYSTSSYVGKVGQKKGGMSLGFVSLGNPDLGWE HQSKFTVGVKARLLSRLDLNVEYYNRITSNMLFEVPLPYTSGMPAGNEGYATRYDNVGRY LNHGIDFQLGADILRGRDWGVSANVNFNYNRDKVLELFEGRQSWHEPGGQLAYMVGKPVT FVLPLYKGVNPQTGVPEWYKPGADKNQTRRDDNDVVSEYSSSLSQNTGVNAYTPFTGGWG ISAYWKGFYLNADFFVALGKHMISMDKQFYENDYHVRNREGNFNGSRRLFDYWKKPGDNT EYPSLAWVRQANHQSNYVDSKMLENASFMRLKNFTLGYDIPRKSLGRQNIVRTAKVFFTG RNLLTFTKFRGIDPEVNRNVSYGVNPNTKQLSVGVEIGL >gi|283510533|gb|ACQH01000086.1| GENE 17 31059 - 31247 62 62 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGLQRRCHAAKLTKHTVAWGIGKTIINLDNYESKILDVHPLLANWIEFGFGTNNNSEGKR RR >gi|283510533|gb|ACQH01000086.1| GENE 18 31332 - 31541 86 69 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAYYFLISMHIVEIMRFTYNNCKKCNVAITILLPGKRHVVFLPLKQGKTNFLRQLPSFHA ARKFCHIRK >gi|283510533|gb|ACQH01000086.1| GENE 19 31691 - 32242 349 183 aa, chain + ## HITS:1 COG:no KEGG:BT_0226 NR:ns ## KEGG: BT_0226 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 176 12 180 181 100 33.0 2e-20 MKSYIAGIIIMSLTLISCQISGLTSGFSHLSKRAQERVVHYKGAIDDIRDFSNIYAIGVE QTKDYLTKHEKVIIYDFTPFCKSGRCVSPLALSEICKRQNIDLLVISNIYDDVFSAVNKN FPILMIDTDALQTKWRGKYIEAFYYPLIGRSQKEINYASYLYFHNGTYVRSFKDYREIEK AGL >gi|283510533|gb|ACQH01000086.1| GENE 20 32657 - 33163 384 168 aa, chain + ## HITS:1 COG:NMA0429 KEGG:ns NR:ns ## COG: NMA0429 COG3727 # Protein_GI_number: 15793434 # Func_class: L Replication, recombination and repair # Function: DNA G:T-mismatch repair endonuclease # Organism: Neisseria meningitidis Z2491 # 2 120 3 120 140 134 49.0 6e-32 MDKFTPRQRHRIMASIKSKDTKPELLVRKYLWAQGFRYRLNYGRLPGHPDLVLRKYRTCI 
FVNGCFWHGHEGCKRWTLPKTNTPFWENKVKRNQQRDAETQRQLARMGWHCLVVWECQLR PSLRQNTLQALAHTLNHILLTDYQQRYALPHPDEDEPMMAAEERRGYL >gi|283510533|gb|ACQH01000086.1| GENE 21 33430 - 34425 1038 331 aa, chain - ## HITS:1 COG:STM3755 KEGG:ns NR:ns ## COG: STM3755 COG3943 # Protein_GI_number: 16767039 # Func_class: R General function prediction only # Function: Virulence protein # Organism: Salmonella typhimurium LT2 # 6 134 13 142 345 130 47.0 4e-30 MHSNNQIVIYQSEDGQTQVDVRLENETVWLTQAQMVELFQTTKQNVSLHVGNVFREGELE QESTVKEYLTVQNEGERKVSRKVKYYNLDVIISVGYRVKSKRGTAFRIWANKVLKQYLMK GYAVNERMHKEQIGELRQLVGMLGRTIQNQPLLSNDETDALFKVVTDYTYALDTLDNYDY ERLTINKTTKEEPFHATYDNAMEAIKGLREKFGGSVLFGHEKDNSFKSSIGQIYQTFGGE ELYPSVEEKAAMLLYLVTKNHSFSDGNKRIAATLFLWFLNGNHILYHPDGSKRIADSTLV ALTLMIAESRTEEKDVMVKVVVNLINKNNYE >gi|283510533|gb|ACQH01000086.1| GENE 22 34991 - 35239 109 82 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSITIFLSILHCHSVRLRLTEWGGLHTASRLIYFSQQRVSRLSGLILLLCKIFSVKFNTM SAVFAHHLPALPQKIDSREGLF >gi|283510533|gb|ACQH01000086.1| GENE 23 36043 - 37059 1128 338 aa, chain - ## HITS:1 COG:no KEGG:Coch_0793 NR:ns ## KEGG: Coch_0793 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 37 337 61 370 372 337 52.0 3e-91 MHKSILVLLLSFLFAACAKDSGMAPLDETQYPNPDAKKALTGSFITFYEKDNWAPYQWSS LLEGMKKIGMRTVIAQFSAHDQQVWFDTSETFVTQKSKYALGRLLAAAAEKGMEVYVGLY FDETYWKNQTSETWLDEHANRCNRMATEINALFGSQPAFKGWYIPHEPEPYAYNTEEKMN LFRTHFVDRISNHLHQLNNKPVAIAAFWNSTLSTPEQLQTFMSQLAKCNLQVIMLQDGVG VGHVSLDRLETYYKHAETGLFGDGSAFKGEFWTDLETFNKDNSPASISRVSQQLNIELAI PRITRAVSFDYYSNMCPTGPFGRKAQKLRNDYHKLMKF >gi|283510533|gb|ACQH01000086.1| GENE 24 37409 - 40210 3021 933 aa, chain - ## HITS:1 COG:SMc04255 KEGG:ns NR:ns ## COG: SMc04255 COG3250 # Protein_GI_number: 15965678 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sinorhizobium meliloti # 137 516 93 431 831 77 24.0 2e-13 MRIKFLLVSLCCFTMAHAANNSLKDGRQLLTQWQMQSTTRVPQAGNVLSLRTYKPQRWHN 
ATVPTSVLGALVKENVYPDPHFDLNDLRIPDASDDLNRRLGLSKYSHIKGVPNPFKHPYW FRTQFSIPASERGRRVWLNFDGINYRADLWVNGKQVANKTQMAGMFLRFKYDITALVSAD KPNVVAVKIYQVDNVGTPRPGYLFTPFGQGRGQGEEIFRDVTLKFMAGWDCSPVVRDRNM GLYQDVYLTYTDDVKIENPYIVTNLNLPDTTKASISVQAELRNASNKPQTGVLRGRINLL KEIDFHTYKKQMPGQMPTIAFEKRVTIPAGKTLTVNFSPEEFAQLNVQNPHLWWPNGYGL QHLYRLDLSFEVGGKVSDTQSTTFGIRQITSTLKERDGEHGRIFWVNGQRIFCRGGFIQP DMLLDMNPKRMYDEARLLANAGVNMIANEDMPSPPEAVMETYDKYGLLMWEVFFQCWTSY PGEPSFNNPTDTRLALRNMYDVVKRYRNHPSLALWCMQIETMVREELYVPLRAYIKQHDP TRPFIATSSHGWDVAKLTPYIKPDLPTGMTDENEPDYTWYPHPYYYNKVLEVKDQMFRNE MGVPAVSTLESMKKYVFDLGKGPRNAHYPLDKVWAYHDAWDSICCPPSSYAFKPYDNAIR QQFGQPSSAEDYIRWAQYINAGSYRAMFEAANHRMWDITTGVMLWKLNATWPQVLWQIYD WFLNPNAGYYYAKKALEPLHIQMNEHNRMVSVINTMHKPQTDITVNVKVYDNNLNVAWQQ TVPQHAFDADCFTELFAVPIPQGVSNVHFVRLQLVDKQGRIMSDNFYWQTADGKNDFREL TRLPKVPLKFDTQTTQRDGDMIMTIKVKNPSTQLAFMHRLYIAKGEGGAEVLPTIWSDNF ITLFPGEERELTATFAQQDLDGKQPVLVLDEMQ >gi|283510533|gb|ACQH01000086.1| GENE 25 40235 - 41128 354 297 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 1 293 1 317 319 140 28 1e-32 MNNIIIGVDLGGTKIMTGAIDAADGRVIGTPVKIATQSHLPKEQIVENIAQSIRQVMNAN GLQKQDVVGVGVGSTGPLDLDLGLVLDCPQLPTMQHFALREALQQAVALPVALNNDANCL VLGEAVFGAARNKRTVLGFTLGTGIGCALVVDGKIWNGATGTAGEIWCSPHGKGIIEDVI SGQGVANIYKQIAHTDASSLEVYLRAQQGERHALDAWETFGQHLAVPLAWCINLIDPDVV LLGGSIATAHPFFMPAMMHSLQAHICPLPAQRTPIVMASLADSAGFVGAACLMRALV >gi|283510533|gb|ACQH01000086.1| GENE 26 41333 - 42547 1032 404 aa, chain - ## HITS:1 COG:NMB0535 KEGG:ns NR:ns ## COG: NMB0535 COG0738 # Protein_GI_number: 15676441 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Neisseria meningitidis MC58 # 2 398 22 421 426 119 26.0 8e-27 MGNSGKRQTMLLGITALIYVVFGLLTSVIGVAIDRFQVQYGVSLGVAALLPLAFFLAYGL TSIPFGLLTDRFGAKSVLLLGTALMGTGSALCYLSAAAWAVIVSVFLIGTGVTAVQIAGN PFVRELDQPSRYTSNLTLIIGIGALGYALSPLIVPVLQANNLSWTTIYLLLAVVNAVIFV VVAFAHFPKTTTNADERLNLGRFFHLCRNPLVLVYALGIFLYVGGEVGVSSYIITFMNNV 
HHLSNDQSFWPESTFMHQLFPSKTALIVALFWLLQAVGRLVASALMRKVSERAVFIVHST LTVVALVVAMMGSEGVALVSFALVGYFTCVSFTALFSAVINSFNHDHGMLSGILGTAIVG GAFIGWLVGAVGNAHGMVWGMAVNVVAFAYVMAVAVWGKGKHLS >gi|283510533|gb|ACQH01000086.1| GENE 27 42565 - 43629 1096 354 aa, chain - ## HITS:1 COG:CAC0158 KEGG:ns NR:ns ## COG: CAC0158 COG0449 # Protein_GI_number: 15893453 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Clostridium acetobutylicum # 7 329 253 583 608 94 24.0 3e-19 MSTNKFLTEILEQPAAIGNIIHFYSAPEGMELLRKTREAIAQRNIHDVVFTGMGSSFFIS FAAASLFNQQGIHAHYINTSELLHYNLSLLNRPTLLVCASQSGESYEIKEVLERLPQSVY CVGIVNEEDSALARKADVALLCKGGREEMTSTKTYVLTSLAACILGLYLSDRWNAHTQRQ LAGLQAKFEDMLASHDQWLDKGLDFLGDLTTLQIIARGPSFSTASQSALMFKEATHIAAT GILGGEFRHGPMEMVAPGFKSVLFASEGRTLEQSLKMAEDIAGFGGRVWLITNSHDAVQH VTDNILPVYVDEQDEYLFSILSILPLQLFIDEYAKRHGFEAGSFSRGAKVTVTE >gi|283510533|gb|ACQH01000086.1| GENE 28 43707 - 44633 990 308 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929272|ref|ZP_06423117.1| ## NR: gi|288929272|ref|ZP_06423117.1| hypothetical protein HMPREF0670_02011 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 308 1 308 308 615 100.0 1e-174 MRTMKMFLAVASLAVATVGANAQSKVYPQVVDDQLVIQGDKCGWDAGTVHTFSVVEANKD GYKYWGYYGLDHYENDVHFRKAGLVRSNNLTDWVKYEANPIIAANCRWPTVVMSDGKFYM FYAEYKGPNKDSRIVMAESENGIDFDNKRVVVPYADGEQNQNPFIYFNKNDGFFYLFYYN GTERSKNNPRWNVLVKKSKRVADLPQQKSYEVVTSNKTLAAPSVAYHGGTYYLLVEEFSD DTHTKWVTNAFSSKEVDRGYQRVTNNPVLYKNDACAFQYVLGGQLYVTYSHCTDLSKGQW VMRMMKTK >gi|283510533|gb|ACQH01000086.1| GENE 29 44720 - 45511 903 263 aa, chain - ## HITS:1 COG:srlR KEGG:ns NR:ns ## COG: srlR COG1349 # Protein_GI_number: 16130614 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli K12 # 16 262 5 251 257 155 38.0 6e-38 MEQHNTQADNIVFQEERLRTIAERLEREGRIVTKDLVDELNTTPVTLRKDLLLLEKRGLL KRTHGGAVKVNRLYPGQALNEKEKINLEEKIRIVRKAATFVAEGDTIILDSGSTTSLLAK ELRNFKRLVVITNALNIASDLSDTSVEVIVVGGNLLKMAATMVGPLADDTLRKVSADKLF MGVDGVDMNIGLTTPDINEAKTSRVMMESAGEVFLLVDASKFGRRSLGVISPLNAIDKLI TTKHLSEEEFQQLSENEIDIFMV >gi|283510533|gb|ACQH01000086.1| GENE 30 45770 - 46609 1005 279 aa, chain - ## HITS:1 COG:no KEGG:Cpin_5551 NR:ns ## KEGG: Cpin_5551 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 79 278 78 250 254 77 34.0 8e-13 MNAKHSMFALAVAVICFAACEPQESKTGKMGPAPSNPHITINNADPYNPIFRAWADNGFI YSWDMGNGQVIAPGRDTATAYYPLPGNYNVSVTIYGEGAQEVSADTTYTVTSADPTMAEK PLWKELTGGGTGATWVYNTDTQTGKPDYCYQTTGDLVNYPNNWMPSSSWGQCERITPDIK GEMVFDLNGGVNYTYHHVAGDAGVKGTFVLDAEKRTLTVKSPYILDHAVGCTQPAATATG VYQIMKLTEDELVLWQMQKAPTSAPGSGEGWGWSFKRKK >gi|283510533|gb|ACQH01000086.1| GENE 31 46846 - 48315 1743 489 aa, chain - ## HITS:1 COG:no KEGG:SRM_03058 NR:ns ## KEGG: SRM_03058 # Name: not_defined # Def: outer membrane protein, probably involved in nut rient binding # Organism: S.ruber_M8 # Pathway: not_defined # 16 488 19 508 511 259 35.0 3e-67 MKRHLIYIIMPMAMGFAACDPEGFLDTPVPSIDQNTFFANENAAKSTLVGAYDPAGWMSY IQYLDWAIGDVASDDATKGGGGDSDQPEMYDIEHFRASPEQEQLSTAWQQQYEGINRANR 
VIAGVKGNANISEAAQRRFIAEARFMRGYYHFCLMKVFGAVPLVDHLLKPSEYKGGRASM EECWAFVEADFKAAMQELPHKKNQPAEDLGRATWGSAAAFLTKCYIWQEKWKEAEALAKE IVASGDYSLQPNYGDLFTIATDNGVESVFDVQLKDFNMTQWGDENEGSMIEIYQRSRDDR NGGWGFDQPTQNLYDEFEPGDPRREWTIIKHGAVLWQGTKDEETIYTNYDPVHNPDAPAV GYCRRKGTLPKSQRPSMEDAASLNVRVIRYADVLLWQAEAAAHNGSDWQTPLNAVRKRVG LGESPYKSDPLKAVYHERRVELAMESHRYWDLVRTGRGNLIEGYTDNKRYLPIPQSEIGL NPNLTQNPY >gi|283510533|gb|ACQH01000086.1| GENE 32 48378 - 50612 2650 744 aa, chain - ## HITS:1 COG:no KEGG:SRM_03057 NR:ns ## KEGG: SRM_03057 # Name: cirA # Def: TonB-dependent receptor # Organism: S.ruber_M8 # Pathway: not_defined # 6 744 286 1028 1028 457 37.0 1e-127 MQAALGQVPLFGDVRSLPTVDYFDAILHKNAPVKNADLSVQGGTDKANYYLSLNGFSQDG LMKKTAFDRVTLRANGEFKANDWLTIGENITLVRTRQNGTLENDEWTSPMMTTYTRDPIT PVRNEDGSFTKSTFSDIHNPVATIEYNNPENIVYRVLANVYAEVSPLKGLKWRSTYSTEY ANNEYANYTPVYYVFAAQQNSKENSITNSTYSTYMRQFTNTLTYDTHLGNHALQAMLGCE SYAVTYKSLQASVKNVPSNDPSVMFINNARNKDAAAAAKGITTQHKLLSGMARINYNYLE RYLLTLNMRLDGSSKFLRGHRWGVFPSVSAGWRVTEEPFMKDQKVFSNLKLRAGWGQIGN EGSVDSYSYATFASAGSNYVFGGKLMPGFSFNSTGNSELKWETSTTANLGLDFAVLGGKL SGTIEVFDKTTSDMLLRVPVPGQVGIAEPPFQNAGKMRNRGLELSLQYRNFDHPLGYSVG ANFTTISNRVLDLGADAYIEGARFMNSVYLTRTVVGRPMAQFYGLKTDGLFQNWDEVNAQ KAQKNVAPGDVRYKTVNKDDLMAPENYTYLGSPLPKFSYSLNGSLSYKGFDLSLLLQGVY GNKLYNGPSTYSLSTGNLSYNFSRDMVNRWHGEGTQTDARYPRMGGTDSNNSILQSDRYL EDGSYLRIKMLQVGYELPAAWVSKAKLTKARVYLNAQNLYTFTKYTGLDPEIGMRGGEDP LDIGVDRGYYPTPRLYSVGLELSF Prediction of potential genes in microbial genomes Time: Sat May 28 02:02:36 2011 Seq name: gi|283510532|gb|ACQH01000087.1| Prevotella sp. oral taxon 317 str. F0108 cont2.87, whole genome shotgun sequence Length of sequence - 14309 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 4, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 755 963 ## BF4370 hypothetical protein 2 2 Op 1 . - CDS 2481 - 4922 1565 ## BF2918 hypothetical protein 3 2 Op 2 . 
- CDS 4938 - 7169 1211 ## COG1401 GTPase subunit of restriction endonuclease 4 2 Op 3 . - CDS 7156 - 8268 398 ## COG0270 Site-specific DNA methylase 5 2 Op 4 . - CDS 8255 - 9901 463 ## Cagg_0465 hypothetical protein 6 2 Op 5 . - CDS 9912 - 10142 137 ## BVU_3740 hypothetical protein - Prom 10264 - 10323 2.8 + Prom 10594 - 10653 4.5 7 3 Op 1 . + CDS 10699 - 12111 676 ## COG3436 Transposase and inactivated derivatives 8 3 Op 2 . + CDS 12155 - 13042 766 ## gi|288929284|ref|ZP_06423129.1| hypothetical protein HMPREF0670_02023 + Prom 13210 - 13269 7.0 9 4 Tu 1 . + CDS 13297 - 13920 257 ## gi|288929285|ref|ZP_06423130.1| hypothetical protein HMPREF0670_02024 Predicted protein(s) >gi|283510532|gb|ACQH01000087.1| GENE 1 2 - 755 963 251 aa, chain - ## HITS:1 COG:no KEGG:BF4370 NR:ns ## KEGG: BF4370 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 30 251 22 242 998 216 49.0 5e-55 MTVMRGKGFYLAVLLALLLPPLCLMAQQTGERTIVGSVVDAATNEALQGVTVLAKGSGAG TTTDANGNFSLRLARTDNVLLFSYVGYAPQSVNVAGKQNITVTMSETANAVGEVVVVGYG TQKKKDVTGAISVVDMKEVNKLSASTLAHALQGQVAGVDVTSFSGAPGSGITVRVRGIGT LNDSDPLFVVDGMMVGDINFLNTNDIESVQVLKDASATAIYGARGANGVVIITTKRGSQG KATINLSAYYG >gi|283510532|gb|ACQH01000087.1| GENE 2 2481 - 4922 1565 813 aa, chain - ## HITS:1 COG:no KEGG:BF2918 NR:ns ## KEGG: BF2918 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 91 644 69 582 742 239 31.0 5e-61 MQKEVLRFVHPDYEVIVRTRDISYSWERFKGRIDYTRFTNPELVEPDKYCRYTSKDSCKL CLYNPIAECKTDDKVVVKFPESKEWDNLWPVIFETCKYQVRLLFHGLDKESMPEIRHVRK DVEDSFFYDEELNGRDEKSLTGELDFINEPGVFKLDFSYQKNGIRKEAYVTFDVVSPKLD TKNDYKSLLREVNKEYEDVIYRYLSITLQQFARGRLNSDATWMAAFQSVVDDYIKNVKRV IQNPHSQIAIYRTSRKAEQIKFWTPTMEERYGEVKKEGKLEECYFDYNEVRSTQNTMENR FVKHTLQTIGKRLSTVITTILNATQEELSERHRQMWKDYQTSLYKLAKHPFFKSIERFDG MAQESLVLQNRTGYQQIYKDWLKLKRGIDFYNGAANIGTLQIWEIYELWCFIKMKKLVAE VLGIDKGKPSHKQLITEPKGTLLNPFEKSSSEHVVEYHYPKAEETDTDERKAELTAHEGD VVTLHYQHTFSRSSGKDGYGMGINTATTEQRPDIVLNIRKASGEVVLTYLYDAKYRVIND 
KKLDADFEEQDMSENMAMPGGDYPPTDAVNQMHRYRDAIYYSKEHEPYRSKEIIGGYILF PGRGDDEYVKKRYYSASVESVNIGAFPLLPNSYSLLKKHLEDILMKYASSEMHVAKAKPQ RTLAYVTEEEKAGMSSEDLVMIAVAGSEEKRQWTFEKLWFNIPLDKIADSPWNLAKYLLL KVKGERTVGNLCRIVRTKHDVWTSEHLKQSGYPDTPSHPAYFMIRIRKPNDTERELKKQI FNVRDVPVIYRGNKKMSFILVKMKDLQCLADTV >gi|283510532|gb|ACQH01000087.1| GENE 3 4938 - 7169 1211 743 aa, chain - ## HITS:1 COG:aq_647 KEGG:ns NR:ns ## COG: aq_647 COG1401 # Protein_GI_number: 15606070 # Func_class: V Defense mechanisms # Function: GTPase subunit of restriction endonuclease # Organism: Aquifex aeolicus # 299 741 85 468 469 129 29.0 2e-29 MSIAEEVLSTLKDDDVWYFAKQDASFDNVYQAVRLFDNAPKGESPSSYYQSVASNNGLDS NVRILSVAQLMGLLTKSSPFSRAHYDVEKPTPAFRELERHPVGSREYNAIKTEQLLKIRM KAITDTRATDDYGIYPFLFIAEVLWRLSLKGIRKIPRKAFFQYVMTSKAHENLESCVKIL SQEEKDIAIDEDKVKKFMSDSRVETIIKGNMRLFNIDSKYVRIDDNFVNYYSYFFNGPYK GIVALLYSIVDDVAKYQDILTRNIGISLNFIDAPEIKPDGSPSLSILPLTKIKNSSSLSF LTAIRTKPFILLAGISGTGKSRIVREFAFKSCPKCLQDKDGTTPGNYCMIEVKPNWHDST ELLGYYSNLSKGYQFKKFVKFLVKAKMYPKVPFFVCLDEMNLAPVEQYFAEILSILETRK HPKDTETGKVDMSTVKTEVIVDARYFRELSEMSHCRNIETGETATTGLTDKAIYLKLFGI NTKDAIEKEEIDDEVGVRQDLTTEGLALPDNVVVIGTVNMDDTTHQFSRKVIDRAMTIEM NGGNLRNMFGGSKNLEYLPEKEQKAWQDAFVRRYVTADEVLDAHPDEAKKLVEILPARLE EINNALKGTPFEVSYRVLNELAVMVGVMLDDNKDDKELDDIINQAVNYILLMKVLPRIEG DTEMFALSKEYKAKMGVNYVDRLEWLKNIAPDLRKVTKDGVSGDEETVESGEKELKGQQN TSKDKIQEMIDRLNNQDFTRFWP >gi|283510532|gb|ACQH01000087.1| GENE 4 7156 - 8268 398 370 aa, chain - ## HITS:1 COG:HP0051 KEGG:ns NR:ns ## COG: HP0051 COG0270 # Protein_GI_number: 15644682 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Helicobacter pylori 26695 # 6 362 5 349 355 241 38.0 1e-63 MGKNSVIDLFAGCGGFSIGFEKAGFHVTKAVEIDKQIAHTYSMNHANVLMLNDDIGSVDN EYNFTRGEAEVIVGGPPCQGFSMAGARIRKNGFIDDPRNYLFKHYLNVVKIVRPKVFVFE NVKGILSMKKGEIFREIVSAFSNPANFDGNHYFINYKTVKAIEFGVPEKRERLFVIGTLN KSIELEKLEQLTRTQILAENPSFFDDVTVGDAIAGLPMPSEDGMIENPRPCNRYQEYLCS KNDTLTNHKATIHTETAVGRMRQIQKGQNWTSLEEEIRSVHSGSYGRMSWDEPSATITTR 
FDTPSGGRFTHPEENRTITPREAARIQSFPDDFVFYGSKSSICKQIGNAVPPKLSFFIAS LIKNIIYEHS >gi|283510532|gb|ACQH01000087.1| GENE 5 8255 - 9901 463 548 aa, chain - ## HITS:1 COG:no KEGG:Cagg_0465 NR:ns ## KEGG: Cagg_0465 # Name: not_defined # Def: hypothetical protein # Organism: C.aggregans # Pathway: not_defined # 220 548 541 880 882 137 29.0 1e-30 MLNSYLLSNLKRLIRGVLNSFSYNTEKEHAVLNALATIIDEPIDVLCTDSLDITEFQLQA ASLNEKTNKRKVKGVFYTDDDVVQYIIGNAFLNFIQNDMSNVFPLDDCVKLILAANGNST NALLKASVFDPTCGTGQFLLAALMIKVSILEEKGELNDALILKLLATIHGNDIDKLSTEI AMVRLFLFVVPLLSDLKSFNKAATTLVSNFTTVDYVRSAMSEKSYNIIIGNPPYIEYGKL DERPPILLSNTYANVVFNSLGQLTSNGVMAYIIPLSFVSTPRMKRLRKEVKDLSKKMMVI NFADRPDCLFVQVHQKLSILLAVKGRKKCKTYSSNYYYWYKSERINLFDECKVHPSKYEI DSYIPKIGNKIEESIFDKIIKSKEEAELLKLQVSKNNGKNVYLNMRGTFWMKAFDKNPGS NEYKAYCFSKEFQPYVICILNSSLFFFFWIVVSDCWHITGKELGNIKVRTDIDLTPFKEL ARKLLDKLEKTKKYIGTVQCDYEYKHKYCKEEINAIDDTLAEVYGLTQCELDYIKAYQLK YRMSDGEE >gi|283510532|gb|ACQH01000087.1| GENE 6 9912 - 10142 137 76 aa, chain - ## HITS:1 COG:no KEGG:BVU_3740 NR:ns ## KEGG: BVU_3740 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 2 62 3 63 67 82 70.0 3e-15 MEDINRLKVLLVEKKRTSKWLAEQLGVNPTTVSKWCTNSSQPDLSNLLRIADLLDVDIRE LFVREYKRHLLSQETK >gi|283510532|gb|ACQH01000087.1| GENE 7 10699 - 12111 676 470 aa, chain + ## HITS:1 COG:MA1425 KEGG:ns NR:ns ## COG: MA1425 COG3436 # Protein_GI_number: 20090285 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Methanosarcina acetivorans str.C2A # 31 468 19 469 477 172 31.0 1e-42 MKETVTDISAMLKSLTSQVQSMQETIDAQHTEIASLNRNINRLTNENRILRKRLSKYESP DKNSNNSSTPPSKESMKDEVVRRTKSLRTPSGLKPGGQPEHEARGKEMVEHPDAIEERMS EYCRNCGRDLSCVEGELDYVSQVVDLPAVAPITTEYRHYKKTCTCGCCNKGYAPRRRGSQ ITFGRNIKAIVTYLNVVQCVPFERLASLMDEVFSVHMSQGTIANFVQETLRKSRPAIQLL ENLLKQSPVVGFDESGCYNNKRLDWAWIAQTAYITLCFRATGRSSKVLEERFGDALQRMV AVTDRHSAYFVLDFLNHQVCLAHLLRELEYLTELDESQRWSKDVTNLLRAAIHQRNVQPE ETVEKESWTDRLDELLKANLLHLNDKFERLRKGLVKCRDYIFNFLENPMIPPDNNASERG 
TRKLKIKQKISGTFRADKGADAFFAIHSIADTAWKNKQSQLQAISTILSL >gi|283510532|gb|ACQH01000087.1| GENE 8 12155 - 13042 766 295 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929284|ref|ZP_06423129.1| ## NR: gi|288929284|ref|ZP_06423129.1| hypothetical protein HMPREF0670_02023 [Prevotella sp. oral taxon 317 str. F0108] # 1 295 1 295 295 608 100.0 1e-172 MEKSKSNEPSELKDVRGALDIVMDKKVREYHVNIETKYIAEDRRLIECSSDMNVTIDFLH ISEGRNIMDVHITRRENNYGLTDMPEEYLQLRDQVNDIQAYLTLAIDRYGRIKSVLKYDE LYTKWQDVKSRMNPNSNGMAKIMHDGDEVYGMGAAHFAKQLNKATLYKALGMGLFSFEKA TGNRECWGWRTTSILIPTMGIDLNIIRDEMVHDPYAPTTLQVFKNKTAESNAHKMEAKFL SLFPFLKKQMGKYRVQFMCTVTRDASHGWPVSIEWDMSEYVSPVSADVVCKIELV >gi|283510532|gb|ACQH01000087.1| GENE 9 13297 - 13920 257 207 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929285|ref|ZP_06423130.1| ## NR: gi|288929285|ref|ZP_06423130.1| hypothetical protein HMPREF0670_02024 [Prevotella sp. oral taxon 317 str. F0108] # 1 207 1 207 207 429 100.0 1e-119 MKLKHKFKVGSSFTCCGIPFRIIVLVMLIPSMLAYYHTGFTSVNRNLPTKSGVVNWVHTT NTGSGWKTTFVLEDDSLHAYSQYYETLSRIFLLFPPTFKVDRHHIHFGDTITFYLSQKQD QDEDEDNAFNDDRIVYTEGLIINGETIASPVVVSFTRPCLDTLGAAVGILFYLTIALFIA DYRRWHRQKKNGAKLTEFFSNRNCFIQ Prediction of potential genes in microbial genomes Time: Sat May 28 02:03:18 2011 Seq name: gi|283510531|gb|ACQH01000088.1| Prevotella sp. oral taxon 317 str. F0108 cont2.88, whole genome shotgun sequence Length of sequence - 6997 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 521 - 580 1.6 1 1 Tu 1 . + CDS 623 - 2668 2597 ## COG0143 Methionyl-tRNA synthetase + Term 2892 - 2924 -0.1 + Prom 2929 - 2988 5.3 2 2 Tu 1 . + CDS 3030 - 4115 1119 ## COG0836 Mannose-1-phosphate guanylyltransferase + TRNA 4343 - 4427 54.6 # Leu CAA 0 0 - Term 4424 - 4464 5.1 3 3 Op 1 . - CDS 4517 - 4807 244 ## gi|288929288|ref|ZP_06423133.1| hypothetical protein HMPREF0670_02027 4 3 Op 2 . 
- CDS 4788 - 6827 661 ## PG0456 phosphotransferase domain-containing protein Predicted protein(s) >gi|283510531|gb|ACQH01000088.1| GENE 1 623 - 2668 2597 681 aa, chain + ## HITS:1 COG:PAB2364_1 KEGG:ns NR:ns ## COG: PAB2364_1 COG0143 # Protein_GI_number: 14521189 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA synthetase # Organism: Pyrococcus abyssi # 8 547 3 553 562 535 48.0 1e-152 MEQKQYKRTTITAALPYANGGVHIGHLAGVYVPADIYARYMRLKNEDVVFIGGSDEHGVP ITIRARKEGVSPQDVVDRFHAMIKKSFEDFGISFDIYSRTTSPTHHQLASDFFKKLYEDG KLLEKETEQYYDEEAHQFLADRYIMGECPHCHNEGAYGDQCEKCGRDLSPTELLHPRSTI SGSQPVLKKTKNWYLPLDKYQDWLKEWILEGHKEWRPNVYGQCKSWLDLDLQPRAMTRDL DWGIPVPIEGAEGKVLYVWFDAPIGYISNTKELLPDTWEKWWKDPETRLVHFIGKDNIVF HCIVFPVMLKAHGGYILPENVPANEFLNLEDEKISTSKNWAVWLHEYLEDFPGKQDVLRY VLTANAPETKDNNFTWKDFQERNNNELVAVYGNFVNRALQLTKKYFGGRVPACGELNEVD KQALDEFKDVKEKVEGLLDTYKFRDAQKEAMMLARIGNRYITECEPWKVWKTDPKRVETI LYISLQLVANLAIAFEPFLPFSSAKLRSMLNMEGVDWNVLGNTEILKPGHQLAEPELLFE KIEDEAIEAQLKKLADTKKANEAETYKAKEIKPTVDFADFEKLDLRVGRILACEKVKKAN KLLKFTLDDGSGTPRTIVSGIAKFYEPEELVGRDVVFIANLAPRQLMGIESQGMILSAEN YDGGLSLLSLMKEVKAGSQVG >gi|283510531|gb|ACQH01000088.1| GENE 2 3030 - 4115 1119 361 aa, chain + ## HITS:1 COG:CAC3072 KEGG:ns NR:ns ## COG: CAC3072 COG0836 # Protein_GI_number: 15896323 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannose-1-phosphate guanylyltransferase # Organism: Clostridium acetobutylicum # 9 340 5 337 350 221 36.0 3e-57 MNTNNNYCLILAGGRGLRLWPYSRLSFPKQFVDFFGVGRSQLQQTFDRFAQLIDPSHILV STNETYADIVRNQLPELPPENILAEPIYRSTAPSVAWASHRILRQCPEACLVVTPSDQAI QGEEAFQRDIIRGLDFVSKKDGMLTLGVKPTRPEPGYGYIQMGKEVENDVFEVQSFTEKP DREFARMFMQSGEFYWNTGLFLLNVQFARNRFCELLPPVLRKFDADNPKFNPHEEALFIR ENFPLYPNLSIEHGILESSNDVFVMQCHFGWADLGTWHGMYETMQKGENDNVVIDSDVIL EDSSNNIIKLPKGKLGVINGLEGFIVVEKDNVLLICKKEDSSAKVRKYVNEVMLKKGEGF V >gi|283510531|gb|ACQH01000088.1| GENE 3 4517 - 4807 244 96 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929288|ref|ZP_06423133.1| ## NR: 
gi|288929288|ref|ZP_06423133.1| hypothetical protein HMPREF0670_02027 [Prevotella sp. oral taxon 317 str. F0108] # 1 96 1 96 96 153 100.0 4e-36 MKILKINSGKAYYSFDGITDYEIDRISKGDLLKLLDVFLSQDCDMDEYNEDGVANKAHQI IYKSLYGKFKDLKEKKQFFNEESDKVYKDAIEHYNS >gi|283510531|gb|ACQH01000088.1| GENE 4 4788 - 6827 661 679 aa, chain - ## HITS:1 COG:no KEGG:PG0456 NR:ns ## KEGG: PG0456 # Name: not_defined # Def: phosphotransferase domain-containing protein # Organism: P.gingivalis # Pathway: not_defined # 1 679 1 679 679 484 42.0 1e-135 MRKIDLHIHTLSTLNDADFEFDIDKLSEYTNTMRIDCIAVTNHNTFDKDQFLRIKDKLTD VVVMPGIEIDLCGGHILLITEYESIDSFVSQAKAVYSKIVSPNSSLDYDTFRDIFPDLSR FLVIPHYWKEPKIPFDIINKFGDDIFAGEVTSPKKFIYAIKDPNALTPVVFSDCRLTSST MVFSSQQTFVDIGDITLNSLKSVLHDKSKVSLNEDGGHDLISIQNGEIKLSTGLNIILGR RSSGKTHTLQMIKEEYKGGNIKYIKQFQLLERDDKKDKNTFDQLLADKKNSLFEDFISPF KTVIEDLLNTPSMEDDESVLDNYLGSLIQYAKDERVRDVYSKCALFSEELFSPVEMTILE SLIRNVSELLDNKHYKAVIDRHIERSKIIALYNDLIMMRRNISIEKKKKEWTNDVIREIK GKLQIKSSSTPINDDLDLYQIALRRCKRLKFINLCKLVKRERKIRELTLKKFKVIATTHP FVNATDLKQRLHVRVSLAQAYLKYDSPIDYLKELRNCETISPSEFYKLFVDIEYKILNED GFTISGGESAEFNLLQTIEDARNYDMLLIDEPESSFDNIFLKDDVNSLLKEFSKEMPVVI VTHNSTIGGSIQPDYILFTEKQISGKDKIFKIYYGYPNDKFLKNKDNEQKRNYVIQIDSL EAGSDAYQKRNKNYENLEN Prediction of potential genes in microbial genomes Time: Sat May 28 02:03:37 2011 Seq name: gi|283510530|gb|ACQH01000089.1| Prevotella sp. oral taxon 317 str. F0108 cont2.89, whole genome shotgun sequence Length of sequence - 28843 bp Number of predicted genes - 23, with homology - 22 Number of transcription units - 12, operones - 6 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 666 - 725 4.6 1 1 Tu 1 . + CDS 884 - 1093 208 ## gi|288929290|ref|ZP_06423135.1| hypothetical protein HMPREF0670_02029 + Prom 1152 - 1211 2.4 2 2 Op 1 . + CDS 1319 - 1564 152 ## gi|288929291|ref|ZP_06423136.1| hypothetical protein HMPREF0670_02030 3 2 Op 2 . + CDS 1610 - 2413 505 ## gi|288929292|ref|ZP_06423137.1| hypothetical protein HMPREF0670_02031 4 2 Op 3 . 
+ CDS 2505 - 3077 311 ## COG3647 Predicted membrane protein + Prom 3118 - 3177 2.9 5 3 Op 1 . + CDS 3339 - 4769 1377 ## Dfer_3251 hypothetical protein 6 3 Op 2 . + CDS 4818 - 6788 1785 ## COG3291 FOG: PKD repeat + Term 6970 - 7012 7.2 7 4 Tu 1 . - CDS 6876 - 7097 136 ## gi|288929296|ref|ZP_06423141.1| hypothetical protein HMPREF0670_02035 - Prom 7160 - 7219 3.9 - Term 7259 - 7299 -0.9 8 5 Tu 1 . - CDS 7315 - 9882 1783 ## PRU_2417 hypothetical protein - Prom 9908 - 9967 2.2 9 6 Op 1 . - CDS 9972 - 10775 594 ## ZPR_4167 hypothetical protein 10 6 Op 2 . - CDS 10783 - 11565 358 ## PRU_2416 hypothetical protein - Prom 11602 - 11661 2.0 + Prom 11831 - 11890 9.1 11 7 Tu 1 . + CDS 11950 - 12171 99 ## 12 8 Tu 1 . - CDS 12194 - 12409 137 ## TDE0555 hypothetical protein - Prom 12495 - 12554 4.8 - Term 12492 - 12542 -0.7 13 9 Tu 1 . - CDS 12617 - 13783 574 ## CPS_4960 putative esterase 14 10 Op 1 . - CDS 13857 - 14426 467 ## TDE0555 hypothetical protein 15 10 Op 2 . - CDS 14476 - 14934 470 ## gi|288929303|ref|ZP_06423148.1| hypothetical protein HMPREF0670_02042 16 10 Op 3 . - CDS 14952 - 15458 550 ## Vpar_1735 hypothetical protein + Prom 16903 - 16962 5.3 17 11 Op 1 . + CDS 17190 - 20516 3969 ## BT_3239 hypothetical protein 18 11 Op 2 . + CDS 20536 - 22104 1897 ## BT_3238 hypothetical protein 19 11 Op 3 . + CDS 22123 - 23007 986 ## BT_3242 hypothetical protein 20 11 Op 4 . + CDS 23025 - 24293 1554 ## BT_3243 hypothetical protein 21 11 Op 5 . + CDS 24300 - 25361 1093 ## BT_3235 hypothetical protein + Term 25385 - 25423 1.0 - Term 25454 - 25493 2.9 22 12 Op 1 . - CDS 25735 - 25965 227 ## gi|288929310|ref|ZP_06423155.1| hypothetical protein HMPREF0670_02049 23 12 Op 2 . - CDS 25971 - 28295 880 ## Bpro_5033 PHP-like - Prom 28535 - 28594 7.4 Predicted protein(s) >gi|283510530|gb|ACQH01000089.1| GENE 1 884 - 1093 208 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929290|ref|ZP_06423135.1| ## NR: gi|288929290|ref|ZP_06423135.1| hypothetical protein HMPREF0670_02029 [Prevotella sp. 
oral taxon 317 str. F0108] # 1 69 8 76 76 119 100.0 6e-26 MYACDLQRERLQGMLLTHSPVYLFRVSHEPERLQGMLLTHSPVYLFRVSHERERLQGMLL TYCIMQTQK >gi|283510530|gb|ACQH01000089.1| GENE 2 1319 - 1564 152 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929291|ref|ZP_06423136.1| ## NR: gi|288929291|ref|ZP_06423136.1| hypothetical protein HMPREF0670_02030 [Prevotella sp. oral taxon 317 str. F0108] # 1 81 21 101 101 157 98.0 2e-37 MPPLKEDEYVRTFGEWWYLACDVGAILPRAEWKQGNMKIVTGITSSGRTFISFLDEKALS HANLDEYFKEGQTTKAKVAWW >gi|283510530|gb|ACQH01000089.1| GENE 3 1610 - 2413 505 267 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929292|ref|ZP_06423137.1| ## NR: gi|288929292|ref|ZP_06423137.1| hypothetical protein HMPREF0670_02031 [Prevotella sp. oral taxon 317 str. F0108] # 1 267 1 267 267 529 100.0 1e-149 MTREIYLLLSICLIVSCKQPKDKAMYKGQELHRLVTMVSCCQYNDTLAAWKDTSSYFVNK AGEPIMVNRQPLFVERKGTDSLQTISHLQHYKGKTDTTYTYRVIMGNLVRENRIGILDRY LFTYDNKHRLVKAITYATDTYQYKFKWVGNLLTDIITEGGYYSNTFRTHIEYDKTQTIPQ SGLPALLDGAACFLFIQPSSNAYLLTGYMGKLPTLPVKKMTTKAEDGETIAAYDVKTKFD KDKRITSYSLVKPNGDIYVKRDFYYPN >gi|283510530|gb|ACQH01000089.1| GENE 4 2505 - 3077 311 190 aa, chain + ## HITS:1 COG:VCA0789 KEGG:ns NR:ns ## COG: VCA0789 COG3647 # Protein_GI_number: 15601544 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Vibrio cholerae # 35 176 53 185 195 73 31.0 3e-13 MITCVSPIYPHEQLLQHIGTALLFIPLITDVFKKAMPMSAFVGLACFTILHVIGARYIYS YVPYKELAVSLGLVDTGFFQDPRNHYDRLVHFSFGILLFPYLVYLCKKWFKQQQPIAVFM AWLMIQTGSMMYELFEWLLTIVMTEDAADCYNGQQGDMWDAQKDMALALLGSTVMLFFYM LRYAYMERKH >gi|283510530|gb|ACQH01000089.1| GENE 5 3339 - 4769 1377 476 aa, chain + ## HITS:1 COG:no KEGG:Dfer_3251 NR:ns ## KEGG: Dfer_3251 # Name: not_defined # Def: hypothetical protein # Organism: D.fermentans # Pathway: not_defined # 84 440 79 383 402 75 25.0 6e-12 MNSIYVNKHCRRLCLAMSLLISLSVAAQRDFSTINTGVNAADAKIIERGIKNVELPKDVA RTRGNGFDDDNKDHGESWFGYEYYRYTYPSINAKGEPVTLTALAAMPQNNNVTINNVILG CHITITDNASTPSEYIKTGSIKSDVAMLTMHAQEKTPSKLAYTCLVILPDYQGYGISRND 
AHPYLAQEITARQSVDALRYGIELYKTSKKNRAKIRDNWKTICVGYSQGGSVAMACQKFM ETNFLDTDLHLGGSVCGDGPYDPFATLRTYITEDKVYMPVALPLIIKGMLDYNPYMRRHA MSDYFNERFLNTGIVEWIEKKEIATDEIQKKLKSIYTFEKDNKGEYIKVSEIFTPDAFDF MSKKMKGEPTEKNPKFDDLYTAITVNDLTEGWKPKYPIYLFHSRVDEVVPIGNAESAYQK MRTDANPNIIKYTYLSKGSHVDNGSTFFFSFWGKSYEEKGVDAIAGGEETWNQFKP >gi|283510530|gb|ACQH01000089.1| GENE 6 4818 - 6788 1785 656 aa, chain + ## HITS:1 COG:MA4292 KEGG:ns NR:ns ## COG: MA4292 COG3291 # Protein_GI_number: 20093081 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 47 209 1329 1510 1995 79 29.0 2e-14 MRKYIILLLLAICTQLSADVLSKETLIKGTAKLNGKDVPMWFNVTAEDIAEVGNGKNAAI SQYSEGKLVVPASIVNPADKRIYKVARVANFAFSLCSKLTAVTLEEGIKEIGENAFSGCN ALQTIGYPASLTAIGRGAFAGCNQLRHAHLPENLASIGQEAFAGNQFADNKVTLPIKVAT IPVAAFDGCKLQMAILPPTLQSIEEDAFRGAGSCDFYMFDGNAVPLVHNKGVNVASHWYV SKPQNYVQKTYCNGLLPISAMVPQGEFVADNITYRIVQVGQYPGIFTASAYKRNDKKDWS SEFKDLKTAVHNPTFGGKWNPEYRINGIDANFFTGNDIVKLHHIALPASIEPTQSKNAFA QCKALVSLDLSAMNPLSKAESDALLSGLPEFTIVYAPKTQVQEHRAANTVLTNKDGKRHT SLYKMQLNETFDALAQQGNLAYRLPYSFEAKRATFFRSSFKSKKKETLVLPFAAKPRGKA FAFDGMDKQADANTTLKFAKKDELQANTPYIYASDGTEIWAENVVVNPQNSTNTQTKDGL FGVYKADLVKSLATAQLLNGAAYTLNAEGNKGNATFVQATDETKLTPFHAFLHLSDKLVG ANLKLLFEGETATGIEKVANDVDNEDNTWFNLQGIKFNSKPQKGIYIRNGKKVVVQ >gi|283510530|gb|ACQH01000089.1| GENE 7 6876 - 7097 136 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929296|ref|ZP_06423141.1| ## NR: gi|288929296|ref|ZP_06423141.1| hypothetical protein HMPREF0670_02035 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 73 1 73 73 139 100.0 6e-32 MYFYNISCFYFSHFSGQPMNDRLKGYMLFEKRNPQRLLMGIADREMVPTWSLADSYMLNN QTRFVQVLTTYLG >gi|283510530|gb|ACQH01000089.1| GENE 8 7315 - 9882 1783 855 aa, chain - ## HITS:1 COG:no KEGG:PRU_2417 NR:ns ## KEGG: PRU_2417 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 853 1 848 850 232 24.0 4e-59 MKTLYFLLVLAAISALPVYAQENVVRGLVVEMGSGTPVRGAVVTIRNAKDIIIYNTQTSA KGEFSLTMKHENRAETRLVVSSMAHKTAQCALTDQTFYRIELDPKAFEIKDVYVKPDKIT HRNDTTSYLVSAFATQKDRTIGDVLRKLPGIDVSKEGTVSYNGKAIDQFLIEGVDLFDGQ YNIATRNISHDVISRVDVVENFQSKKTLRKSKAEGGIALNLALKDKAKGRWSGNARAATG LPKLWEGEIFGARLSAVNQTSLTGKSNNTGKDIMSENKVLTLDELLRKVSTNAPNPFISS SLDRPGVLDENRTRFGRTHIANVGNVQKMSQTAIMRTKLYYSDDRNTSERIQGLSYFLAD TTLTRTTTEHGILNNRELSAAILFKNDKERSYFSNEIKYASTWERNRAITHGDYNNQWET RSDMYSIENNLQWIRAIGRHHVKLTSVNLYRFMPEQLAVMNDSTLQQDVKRRQFLSSTQL NLVFNTRQWALELNTEGRIQSMNMESNYRTTPLDTGFYENADINYVAFIVQPTIIYKHRG LRGTLQMPLNTYRYFGTSVANKLFFSPNLNINWEINSHWTIGAKLSLNRQEPAVNDSHTM PILVDYRTFEMSPINFYGSRSWNEMAFVSYTDYAHGLFVRASMGYDTRYNTLARTKSVDR RGMAYYSTVENDNHAHGITALVTVSQRLEWIKGTFNARCLLSQNASSMYQNGINTEFNTG SLQTSVSLNSNVWAWLDLAYRAQYNIHALEFSAMKSRTKLLTQELELTWMPTNALNFSLS AEHYANYFNDNTRKYTFFADFNCLYKYRKTDFTLHLNNILNQRFYNNATYNNLSSTYNQY GLRGRTLIVGVKMFF >gi|283510530|gb|ACQH01000089.1| GENE 9 9972 - 10775 594 267 aa, chain - ## HITS:1 COG:no KEGG:ZPR_4167 NR:ns ## KEGG: ZPR_4167 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 20 204 21 210 279 115 38.0 1e-24 MKSTLTIILCLACLLQSRAADNKANLECIYRLTYQVDTLTKRSVNAMMVLRRNENCSLFY SQANFEKDSLQRDANGIEEQKAINDTIKARYGKVTVSYYVLKDFEKKQLDFMGTLLNSNK YTEPLPVFNWQMSEDKKTIGEHQCQKATCTFGGRAYEAWFAPDIPINDGPWKFYGLPGLI LEVYDKQHHYEFQFLGMRPCAGKIEIPVVDYTKTTKKEYLRIKQMSIDDPESVLQLWEAK LGIKTTRPRAPKLRSYATMERVESMSK >gi|283510530|gb|ACQH01000089.1| GENE 10 10783 - 11565 358 260 aa, chain - ## HITS:1 COG:no KEGG:PRU_2416 NR:ns ## KEGG: PRU_2416 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 88 
225 104 242 280 94 33.0 3e-18 MSLAAIYAEETRTFESEVSYLATCIKDTVTNKVLTDTFKLRFDGNKALFYNVETFAEDSL KHNDLTAWQKAIGSALTRKQRADKANSSYYILADLQQKTYIFKDKIGTNSFSYTDSLPAF NWQLKSEYKDIAGHRCAKAIGKYMGRTYEAWYATDIPTPVGPWKFYGLPGLVLSVYDTQH QYNFTFIDMQPCHGEITLFPSKSFKTTKEKFLKEYAIYQNDPIAYYNDRAMIKINFGKAG NDFEQEIKNDNRHRIMEILK >gi|283510530|gb|ACQH01000089.1| GENE 11 11950 - 12171 99 73 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNVKRWGGHLNCKFKELIYFNQISALLFLSLFSKTLQKYVLVWFSKKLNLLIFIQQMPGY LLDYFFLLFCHWY >gi|283510530|gb|ACQH01000089.1| GENE 12 12194 - 12409 137 71 aa, chain - ## HITS:1 COG:no KEGG:TDE0555 NR:ns ## KEGG: TDE0555 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 1 59 114 172 179 63 44.0 3e-09 MNFLINPRFKVADERLNAIPKEERTALSRAYHKGVQRLDDLSEKLWGYGAEEDGWKNVLL NLQLSGLGKTF >gi|283510530|gb|ACQH01000089.1| GENE 13 12617 - 13783 574 388 aa, chain - ## HITS:1 COG:no KEGG:CPS_4960 NR:ns ## KEGG: CPS_4960 # Name: not_defined # Def: putative esterase # Organism: C.psychrerythraea # Pathway: not_defined # 33 274 30 259 420 81 28.0 5e-14 MKIKYLALLCFFLVCNLLHAQKDIYHVVGVQEINLKSKYFDFDRKIWVRLPSDYSFTDAQ DYDVTYIFDAQVTPFFELASAYPVFLNEGWFSKGTIVVGICSPQDSEYNRREDFLPDDGL TCNAYKIRKGYSDKLMCFVKDELMPYIRSHYRTTEKNLAIGHSLGASFLLQCLLNYDIFN DYFLFSPNLAFGKNMLANKFVKHSFDRTARHYLFFSDAAEEKIKGWEGWQTPRDEVYRYI DSKALPNNIVCRHKSYPESEHFASFPLALQDAYKDYFAYREAKDATAEGEVYAKHIEVIV DNPKYEVYICGNQASLGNWDAKKIKMTHVNDSVRAIDVKVQLPAQFKFTRGSWETEGFPA NALGGINLRVDNKSKKAYVYKVSDWADK >gi|283510530|gb|ACQH01000089.1| GENE 14 13857 - 14426 467 189 aa, chain - ## HITS:1 COG:no KEGG:TDE0555 NR:ns ## KEGG: TDE0555 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 1 176 1 175 179 143 40.0 4e-33 MDDLRTLCEKYRDLDVESKFASISDDIWKLGSIEKIKDEVSAETFTFHVAVNMIGNWKGD GWDFIFYEGRALLPYIPDTLSRLGLVEIKDAFEETLSVFPDFASDCDEEVYTDVANFLIN PRFKVADERLNAISKEERRALSAAYHKGVQRLDDLSEKLWGYGAEEDGWKNVLDYLNSFV LGGSSSQQV >gi|283510530|gb|ACQH01000089.1| GENE 15 14476 - 14934 470 152 aa, chain - ## HITS:1 COG:no 
KEGG:no NR:gi|288929303|ref|ZP_06423148.1| ## NR: gi|288929303|ref|ZP_06423148.1| hypothetical protein HMPREF0670_02042 [Prevotella sp. oral taxon 317 str. F0108] # 1 152 1 152 152 286 100.0 3e-76 MEKKMMIGATEFTFDKGCGEYVGKLQVWGRETDVFLDTEHAEGDDIDKIVTEKINWIERN KEKIMKAFMEENDHYVDVVNEMIACGDFKADGPISADDFVNALFVDNVTIWVKGVDTDFA LDLDAKPDYLLGHLAFMEIDNQYHVEFGGLNG >gi|283510530|gb|ACQH01000089.1| GENE 16 14952 - 15458 550 168 aa, chain - ## HITS:1 COG:no KEGG:Vpar_1735 NR:ns ## KEGG: Vpar_1735 # Name: not_defined # Def: hypothetical protein # Organism: V.parvula # Pathway: not_defined # 1 167 1 167 168 174 57.0 7e-43 MGMIANYQYINDEQLNSLKNFDRERNDVLDEVEEWNEESEMLLDIDKMWDVLHFVLTGVS SCDPIENNPLSEAVVGVRSLEGIEEFVAYTEKERVADILAALEAFDMEQAMATFSMDACK KAELYPDIWDYDDEEELVKEEISDYFQNMKDFYREVLEANGHVMVTIY >gi|283510530|gb|ACQH01000089.1| GENE 17 17190 - 20516 3969 1108 aa, chain + ## HITS:1 COG:no KEGG:BT_3239 NR:ns ## KEGG: BT_3239 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 51 1108 2 1059 1059 1305 61.0 0 MEKRLMMFLVSLFLSAGIAMAQTQVSGTVISSEDNEPIIGASVIVEGSKTGAVTDQNGNF TLSVPKGKKIVASYIGMRPQTLSPKPTMKITLSPNNETLTEVVVTGVSNMDRRMFTGAAD KIKANDALISGMADISRSLEGRSAGVSVQNVSGTFGTAPKIHVRGATSIYGNSKPLWVVD GVVQEDITDISADALSSGDAETLISSAIAGLNSDDIESFQILKDGSATSIYGARAMSGVI VVTTKKGRSGQAHVNYTGEFTSRLVPSYSQFNILNSQEQMGIYREMQYKGWLNLSSVLNG SDYGVYGKMYELINTYNPKTGQFFLENTQEARDAYLYSAEMRNTNWFKQLFSSALMQNHS VSLSGGTGKSNYYVSMSFMADPGWYKTSAVRRYTANVNATHHLLDNLSLTLLASVSYRKQ QAPGTLGQDVDVISGQVKRDFDINPYSYALNTSRALDPNTYYHANYAPFNILKELENNKI DMNVLNTKFQAELAYKPIDDLRLAIIGAVQYDTSTQEHQITEYANQALAYRAMDNALIRD ANNYLYTDPTNAYALPITVLPNGGFYQKTDLQKVSWDVRATANYNHTFDNAHAVNVLGGI DVSNVDRRRTYFNGVGMQYDGGEIPFYVYQFFKKNIDDNAVYYSLNNTHNRTAAFFATGT YSYKQRYNLTGTYRYEGTNRLGRSRSARWLPTWNVSGSWNASEEPWFKNTFNNALTHAMV RASYSLTADPGPATYTNSTQILKSYTPYRIVGTDKESGIMTSELENAQLTYEKKHEFNIG SELGFLNNRINVTFDLFWRNNFDLIGPIATQGVGGQVIRYANTATMKSQGQELSVQTTNI KNNNLTWVTNVIFSHVTTEVTSLMSQARTIDLVNGEASSGFTMEGYPHRAIFSMPFQGLT 
DKGIPTYLNEKGELTSTDLDFQNRNSNGYLVYEGPTEPTITGSLGNEIKWKNWRLNVFLT YSFGSYVRLDRIFSAQYNDLTSMTKEFKNRWMQAGDEKTTSVPTILPYRSFASDRNLRIA YNAYNYTSNRTAKGDFVRMKEISLGYDFPKAMLRSTPFTNLSLKLQATNLFLIYADKKLN GQDPEFVNAGGVAAPVPRQFTMTLKFGL >gi|283510530|gb|ACQH01000089.1| GENE 18 20536 - 22104 1897 522 aa, chain + ## HITS:1 COG:no KEGG:BT_3238 NR:ns ## KEGG: BT_3238 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 522 6 519 519 434 48.0 1e-120 MNKRNIYIYVLLAATTLGITSCDKFLDKLPDNRMELNNKEKIQKYLVSAYPDHNPAYLAE LYSDNADEFDVTGWGSDGRFYDQAYSWSDITETQESESPQELWNSLYMAMSTANTALQTI EQQGNNSDLMPSKGEALLCRAYAALTLANTFCMAYDPTTASKNMGLPYPTRPETTVGTKY ERGTLAAFYEQINRDIEEGLPLVTDNYARPKYHFTRNAAYAFAALFNLYYQKYAKAVAYA TRVLGADASSKLRDWKAFNALSVNGQIAPNAYVSVSSQANLLLQTVHSQAGALLGPYRMG NKYAHGQLISEKEDLQSQGPWGRSSNFGYTVWSNNSLSKYFINKVPYAFEYTDVQAGIGY AHSAYAVLTTDLTLLVRAEANAMLGNYDAAVADLNTEVKAYSSGALSVTLEGIKNFYNGI GYYTPTAPTPKKKFNTAFSIEPTTQEPLLHAILHLRRIMTLGEGWRLQDVKRYGIVIYRR TLNASSSVTDVKDSLTVNDLRRAIQLPQDVITAGLTANPRNK >gi|283510530|gb|ACQH01000089.1| GENE 19 22123 - 23007 986 294 aa, chain + ## HITS:1 COG:no KEGG:BT_3242 NR:ns ## KEGG: BT_3242 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 293 1 303 304 283 50.0 7e-75 MKKICLFALAAILSLGFNSCSEDNPSSDSIFGGKTVHRDNFDKWLLANYTYPYNIDVKYK MEDIYSDMKYHLVPADSAKSAKLAIIAKYLWFDAYAECVGPNFVKANVPRIIHLIGSPAY NSGQGTMILGTAEGGLVITLYMVNNLTDKMLTDYATVNEYYFHTMHHEFTHILNQKIPYD ENFQHISEGNYVSGDWYLIRKAAANKKGFVTPYAMSEPIEDFAEMLSVYVTTSPADWEKM MNTAGTTGAPYIQAKLDLIRAYMRDSWKLDIDQLRSIVLRRASEIGHLDLNQLN >gi|283510530|gb|ACQH01000089.1| GENE 20 23025 - 24293 1554 422 aa, chain + ## HITS:1 COG:no KEGG:BT_3243 NR:ns ## KEGG: BT_3243 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 327 3 346 443 160 29.0 9e-38 MKKILICMQALVATMLLAACLHDDNEVFDEPAAQRLDKAVENYKQVLESAPNGWKLNLWT EPRYSGGGYTYLMKFKDGKVTVASELTDADKVATSSYDIKKDMGPVLTVNTYNEIFHSLA 
NPSLRDDDGKGQDYEFIIQRVTNDSIFIEGKKFHNKMVMTRLNDNVNWKDHIGKMKDVAD NVKFNYKYIAGNDTTKLYLSSARRARFTFKDSVVYVPFCYTENGIELQAPVTVGGKQVQH FAYDIDKLTFTGSDTGTGDMVFTTDYMRYADYAGTYNFEYKNASIKVKMVPAGDEKTYWL EGLSPDFKLTFVYDKKTGTLTWSPEKIYTDPNKHEHWMCSWDAGDTGQIFRYAFLGFVVN KDFNKPGVFLTFTSTLAGYLDLDSMIIIEYNGSKKVGESTTVLVNGSAQILMIKGMTKIN ND >gi|283510530|gb|ACQH01000089.1| GENE 21 24300 - 25361 1093 353 aa, chain + ## HITS:1 COG:no KEGG:BT_3235 NR:ns ## KEGG: BT_3235 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 276 3 274 344 75 24.0 3e-12 MKMKKVKFTTIIACALLCLLAACGNDDITNEVSQHTLKVVSANTSFDALGGTDSIFVNVD ATKAYAQAPWAKAETKGKLVTVTADVNKEMQSRHTTVVIKSADGDSTIVAIDQQGPVTQI DLPEQLVLGDKADTLTYKVKSTFGINIKSLDSWLTTTYEKGEVTIITTENNEGHVRTGQF VVETTMGKDTISVMQIDLDRDVLGEYYLQLTETDENRKPVKRVYAVKLHKEDKKLMLTFT ANPFAFPVTFSGKEGTLSIAGGQFAGMFENDYVATIVGPMDGQLSVSPNVTISGQFGYSA ELPAKVSSLEGTFLHFGTEPAFVLGRFSSKVFENKTYKSYLLYGNDAFLFKRK >gi|283510530|gb|ACQH01000089.1| GENE 22 25735 - 25965 227 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929310|ref|ZP_06423155.1| ## NR: gi|288929310|ref|ZP_06423155.1| hypothetical protein HMPREF0670_02049 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 76 1 76 76 125 100.0 7e-28 MKLFFKKDEIGNITVQIQKGTTVIDYDYIEMLKQLIEKNEIECNWGNIEDFEQQKFRELL DKIKGAVDEGLNKPLE >gi|283510530|gb|ACQH01000089.1| GENE 23 25971 - 28295 880 774 aa, chain - ## HITS:1 COG:no KEGG:Bpro_5033 NR:ns ## KEGG: Bpro_5033 # Name: not_defined # Def: PHP-like # Organism: Polaromonas # Pathway: not_defined # 642 769 770 893 894 89 38.0 7e-16 MEPVYIDIHIHTSSNPDDLNVDYDVKTLFRKVRSKAQGQSVLLSFTDHNVINKKAYIAAL DECSTDIHLILGVELHIHYVPETEAYHCHMFFKNEVTEQSIDDINGILKRLYPQKTVEKK DTSIPTLDKVINEFDNYDFVLLPHGGQSHATFNKAIPSTKKFDTMMERSVYYNQFDGFTA RSESKRDETDKYFQRLGISDFVNLVTCSDNYDPDRYPDAKAKDAEPFIPTWMFSQPTFEG FRLSLSEKSRLIYSQTKPESWSEKIESVKYKNELLDIDVQFSSGLNVVIGGSSSGKTLLV DSIWRKLSKKSFENSDYKDFDVENVNVVNPSEMTPHYLGQNYIMKVIGNDSEQGIEDIEI IKSLFPDNREISAQVGNSLATLKKDLTELIRTVEDIESIEHKMRATPQVGRLLVLKPTRQ NIISILLPQSNERNSISYENSKKTSHISTLLEIKTLLRNNPFVANYNTTIDELIKVLQTV YKYSEVENGVFTTIDNANSTYASILRAESLEDQTKTQQFNELIKQISDYIYLNRKFKKYL HAIASYQVEFETREIKSSGHSLFIKNQYKINKNIVLDVFNNKLKTTCKIDTFNNISPEKL YQKNFSGQKPKVADYQDFINRVYKDFENMNKTVYSIITEEGKDFSRLSAGWKTSVLLDLI LGYDKDIAPIIIDQPEDNLATKYINDGLVNAIKKVKNTKQIILVSHNATIPMMGDAQQII YCENKNGVIVIRSSPLEGEIGEKSVLDLIASITDGGKPSIKKRVKKYNLKKFTK Prediction of potential genes in microbial genomes Time: Sat May 28 02:07:43 2011 Seq name: gi|283510529|gb|ACQH01000090.1| Prevotella sp. oral taxon 317 str. F0108 cont2.90, whole genome shotgun sequence Length of sequence - 29241 bp Number of predicted genes - 42, with homology - 41 Number of transcription units - 6, operones - 4 average op.length - 10.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 285 191 ## - Term 434 - 471 5.3 2 2 Op 1 50/0.000 - CDS 499 - 1032 571 ## PROTEIN SUPPORTED gi|212695278|ref|ZP_03303406.1| hypothetical protein BACDOR_04818 3 2 Op 2 26/0.000 - CDS 1111 - 2103 1194 ## COG0202 DNA-directed RNA polymerase, alpha subunit/40 kD subunit 4 2 Op 3 36/0.000 - CDS 2116 - 2721 899 ## PROTEIN SUPPORTED gi|150003363|ref|YP_001298107.1| 30S ribosomal protein S4 5 2 Op 4 48/0.000 - CDS 2797 - 3183 622 ## PROTEIN SUPPORTED gi|150003364|ref|YP_001298108.1| 30S ribosomal protein S11 6 2 Op 5 2/0.000 - CDS 3265 - 3645 605 ## PROTEIN SUPPORTED gi|29348113|ref|NP_811616.1| 30S ribosomal protein S13 7 2 Op 6 . - CDS 3682 - 3798 170 ## PROTEIN SUPPORTED gi|86141162|ref|ZP_01059708.1| ribosomal protein L36 8 2 Op 7 9/0.000 - CDS 3812 - 4030 230 ## PROTEIN SUPPORTED gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 9 2 Op 8 2/0.000 - CDS 4222 - 5010 579 ## COG0024 Methionine aminopeptidase 10 2 Op 9 53/0.000 - CDS 5032 - 6375 857 ## PROTEIN SUPPORTED gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 11 2 Op 10 48/0.000 - CDS 6380 - 6826 525 ## PROTEIN SUPPORTED gi|150008965|ref|YP_001303708.1| 50S ribosomal protein L15 12 2 Op 11 50/0.000 - CDS 6852 - 7028 225 ## PROTEIN SUPPORTED gi|167933113|ref|ZP_02520200.1| 50S ribosomal protein L30 13 2 Op 12 56/0.000 - CDS 7040 - 7552 742 ## PROTEIN SUPPORTED gi|167933112|ref|ZP_02520199.1| 30S ribosomal protein S5 14 2 Op 13 46/0.000 - CDS 7559 - 7903 444 ## PROTEIN SUPPORTED gi|167933111|ref|ZP_02520198.1| 50S ribosomal protein L18 15 2 Op 14 55/0.000 - CDS 7922 - 8488 787 ## PROTEIN SUPPORTED gi|167933110|ref|ZP_02520197.1| 50S ribosomal protein L6 16 2 Op 15 50/0.000 - CDS 8505 - 8900 597 ## PROTEIN SUPPORTED gi|53715452|ref|YP_101444.1| 30S ribosomal protein S8 - Prom 8931 - 8990 1.5 17 2 Op 16 50/0.000 - CDS 9049 - 9351 438 ## PROTEIN SUPPORTED gi|150003374|ref|YP_001298118.1| 30S ribosomal protein S14 18 2 Op 17 48/0.000 - CDS 9357 - 9911 860 ## PROTEIN SUPPORTED 
gi|167933171|ref|ZP_02520258.1| 50S ribosomal protein L5 19 2 Op 18 57/0.000 - CDS 9914 - 10231 383 ## PROTEIN SUPPORTED gi|228470722|ref|ZP_04055573.1| ribosomal protein L24 20 2 Op 19 50/0.000 - CDS 10252 - 10617 570 ## PROTEIN SUPPORTED gi|150003377|ref|YP_001298121.1| 50S ribosomal protein L14 21 2 Op 20 . - CDS 10620 - 10877 387 ## PROTEIN SUPPORTED gi|160883055|ref|ZP_02064058.1| hypothetical protein BACOVA_01018 22 2 Op 21 . - CDS 10891 - 11085 220 ## PROTEIN SUPPORTED gi|150008976|ref|YP_001303719.1| 50S ribosomal protein L29 23 2 Op 22 50/0.000 - CDS 11089 - 11517 583 ## PROTEIN SUPPORTED gi|53715459|ref|YP_101451.1| 50S ribosomal protein L16 24 2 Op 23 61/0.000 - CDS 11533 - 12267 1054 ## PROTEIN SUPPORTED gi|160883058|ref|ZP_02064061.1| hypothetical protein BACOVA_01021 25 2 Op 24 59/0.000 - CDS 12275 - 12679 534 ## PROTEIN SUPPORTED gi|53715461|ref|YP_101453.1| 50S ribosomal protein L22 26 2 Op 25 60/0.000 - CDS 12711 - 12977 453 ## PROTEIN SUPPORTED gi|167933129|ref|ZP_02520216.1| 30S ribosomal protein S19 27 2 Op 26 61/0.000 - CDS 12992 - 13816 1307 ## PROTEIN SUPPORTED gi|212695302|ref|ZP_03303430.1| hypothetical protein BACDOR_04842 28 2 Op 27 61/0.000 - CDS 13825 - 14169 359 ## PROTEIN SUPPORTED gi|212695303|ref|ZP_03303431.1| hypothetical protein BACDOR_04843 29 2 Op 28 58/0.000 - CDS 14186 - 14815 822 ## PROTEIN SUPPORTED gi|150003386|ref|YP_001298130.1| 50S ribosomal protein L4 30 2 Op 29 40/0.000 - CDS 14815 - 15429 871 ## PROTEIN SUPPORTED gi|53715466|ref|YP_101458.1| 50S ribosomal protein L3 - Prom 15449 - 15508 7.0 31 2 Op 30 4/0.000 - CDS 15558 - 15863 470 ## PROTEIN SUPPORTED gi|53715467|ref|YP_101459.1| 30S ribosomal protein S10 - Prom 15883 - 15942 3.2 32 2 Op 31 51/0.000 - CDS 15967 - 18081 2485 ## COG0480 Translation elongation factors (GTPases) 33 2 Op 32 56/0.000 - CDS 18107 - 18583 735 ## PROTEIN SUPPORTED gi|29348139|ref|NP_811642.1| 30S ribosomal protein S7 - Prom 18674 - 18733 4.5 34 2 Op 33 . 
- CDS 18906 - 19289 616 ## PROTEIN SUPPORTED gi|150003391|ref|YP_001298135.1| 30S ribosomal protein S12 - Prom 19394 - 19453 5.8 - Term 19948 - 20002 2.8 35 3 Tu 1 . - CDS 20122 - 20667 528 ## PRU_1474 hypothetical protein - Prom 20798 - 20857 3.9 - Term 21094 - 21142 5.2 36 4 Op 1 . - CDS 21372 - 22640 887 ## COG3550 Uncharacterized protein related to capsule biosynthesis enzymes 37 4 Op 2 . - CDS 22644 - 22961 199 ## BF1073 putative DNA-binding protein - Prom 23090 - 23149 6.0 - Term 23176 - 23227 3.6 38 5 Op 1 . - CDS 23414 - 23650 173 ## gi|288929348|ref|ZP_06423193.1| hypothetical protein HMPREF0670_02087 39 5 Op 2 . - CDS 23587 - 23778 97 ## gi|288929349|ref|ZP_06423194.1| hypothetical protein HMPREF0670_02088 + Prom 23938 - 23997 3.7 40 6 Op 1 . + CDS 24142 - 26985 2012 ## COG0610 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases 41 6 Op 2 . + CDS 26990 - 27541 324 ## gi|288929351|ref|ZP_06423196.1| hypothetical protein HMPREF0670_02090 42 6 Op 3 . 
+ CDS 27546 - 29099 1436 ## COG0286 Type I restriction-modification system methyltransferase subunit Predicted protein(s) >gi|283510529|gb|ACQH01000090.1| GENE 1 1 - 285 191 94 aa, chain + ## HITS:0 COG:no KEGG:no NR:no KLSELQYQVGHKLFATYDTWYGQLAFSAFYNFGINGYCVDLYKHFSRQDIKTDSIIRNNG PICQFLFVSFTSAVCSVEIKKRLFTNGNSLLASV >gi|283510529|gb|ACQH01000090.1| GENE 2 499 - 1032 571 177 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|212695278|ref|ZP_03303406.1| hypothetical protein BACDOR_04818 [Bacteroides dorei DSM 17855] # 1 169 1 162 162 224 69 5e-58 MRHNKKFNHLGRTAPHRNAMLANMAISLIMHKRITTTLAKAKALKKYVEPLITRSKEDTT NSRRVVFRYLQNKYAVTELFKVVAAKVGDRPGGYTRVIRLGFRKGDAAEIAFIELVDFDE NMAKTPKAEAKKTRRSRKSTKAEGETAAAEVAQPVAEEAPKAEEAPKAEATEEAKAE >gi|283510529|gb|ACQH01000090.1| GENE 3 1111 - 2103 1194 330 aa, chain - ## HITS:1 COG:MT3564 KEGG:ns NR:ns ## COG: MT3564 COG0202 # Protein_GI_number: 15843052 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, alpha subunit/40 kD subunit # Organism: Mycobacterium tuberculosis CDC1551 # 19 318 16 308 347 231 44.0 1e-60 MAILAFQKPDKVVMLEANDKFGRFEFRPLEPGFGVTIGNSLRRILLSSLEGFSINTVRIA GVEHEFSSVPGVKEDVTNIILNLKQVRFKQVVEEFENEKVSITVENTTEFKAGDIGKYLT GFEVLNPDLVICHLDAKASLQIDLTINKGRGYVPADENREFCTDVNVIPIDSIYTPIRNV RYSVEPYRVEQKTDYDKLVIDVTTDGSISPKDALKEAAKILIYHFMLFSDEKIAVENQDV DTNEEFDEEVLHMRQLLKTKLVDMNLSVRALNCLKAADVETLGDLVQYNKTDLLKFRNFG KKSLTELDDLLLGLNLSFGTDISKYKLDRD >gi|283510529|gb|ACQH01000090.1| GENE 4 2116 - 2721 899 201 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|150003363|ref|YP_001298107.1| 30S ribosomal protein S4 [Bacteroides vulgatus ATCC 8482] # 1 201 1 201 201 350 84 5e-96 MARYIGPKSKIARRFGEPIFGADKVLSKRNFPPGQHGNNRRRKTSEYGAMLAEKQKAKYT YGVLERQFRNMFEKAAKASGITGEVLLQNLESRLDNVVFRLGLAPTRAAARQLVGHKHIV VDGKVVNIPSYSVKAGQVVGVREKAKSLEVIEAALAGFNHSKYPWIEWDDASKSGKFLHK PERADIPENIKEQLIVELYSK >gi|283510529|gb|ACQH01000090.1| GENE 5 2797 - 3183 622 128 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|150003364|ref|YP_001298108.1| 30S ribosomal protein S11 [Bacteroides 
vulgatus ATCC 8482] # 1 128 1 129 129 244 93 6e-64 MAKKIATKKRNVRVDALGQLHVHSSFNNIIVSLANSEGQVISWSSAGKMGFRGSKKNTPY AAQMAAEDCAKVAFDLGLRKVKAYVKGPGNGRESAIRAVHGAGIEVTEIIDVTPLPHNGC RPPKRRRV >gi|283510529|gb|ACQH01000090.1| GENE 6 3265 - 3645 605 126 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|29348113|ref|NP_811616.1| 30S ribosomal protein S13 [Bacteroides thetaiotaomicron VPI-5482] # 1 126 1 126 126 237 93 6e-62 MAIRIVGVDLPQNKRGEIALTYIYGIGRSSSAKILDKAGVNRDLKVSEWTDDQAAKIREI IGAEFKVEGDLRSEIQLNIKRLMDIGCYRGVRHRNGLPVRGQSTKNNARTRKGKKKTVAN KKKATK >gi|283510529|gb|ACQH01000090.1| GENE 7 3682 - 3798 170 38 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|86141162|ref|ZP_01059708.1| ribosomal protein L36 [Flavobacterium sp. MED217] # 1 38 1 38 38 70 84 2e-11 MKTRASLKKRTADCKIVRRKGRLFVINKKNPKYKTRQG >gi|283510529|gb|ACQH01000090.1| GENE 8 3812 - 4030 230 72 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 [Streptococcus pneumoniae TIGR4] # 1 72 1 72 72 93 61 2e-18 MAKQSAIEQDGTIVEALSNAMFRVELENGCPITAHISGKMRMHYIKILPGDKVKVEMSPY DLTKGRIVFRYK >gi|283510529|gb|ACQH01000090.1| GENE 9 4222 - 5010 579 262 aa, chain - ## HITS:1 COG:BH0156 KEGG:ns NR:ns ## COG: BH0156 COG0024 # Protein_GI_number: 15612719 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Bacillus halodurans # 6 253 5 246 248 258 52.0 6e-69 MKVYLKTEDEIELMRQANQLVGKTLAELAKHIVPGVTTLQLDKIADEFIRDHGAIPTFKN FPNPFGGPFPASICTSVNEVVVHGVPSEKTVLKDGDIISVDCGTLLAGYNGDSCYTFCVG NVAQNVRDFLSVTRKSLYLAIEAAVAGNHLGDIGHAVQSFCESYGYGIVRELTGHGIGRE MHEEPKVPNYGSRGSGMMLKEGMCIAIEPMVTMGDRRIGLLPDKWSIRTIDGSWAAHYEH TIAIRKGKAEILSSFEEIERFR >gi|283510529|gb|ACQH01000090.1| GENE 10 5032 - 6375 857 447 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 [alpha proteobacterium BAL199] # 13 442 19 447 447 334 42 3e-91 MKKFIEALKNCWKVEDLRQRLLITILFVAIYRFGSFVVLPGINPSQLDTLQKQTSGGLMS 
LLDMFSGGAFSNASIFALGIMPYISASIVMQLLAVAVPYFQKMQREGESGRKKINWYTRA LTVVILLFQAPSYLINLKMQAEGALASGISWSVFMIPATIILAAGSMFVLWLGERITDKG VGNGISLIIMVGIIARLPQAFIQEAGSRLEAIAGGGLVMFIVEILVLYAIVCAAILLVQG TRKVPVQYAKRVVGNKQYGGARQYVPLKLFAANVMPIIFAQALMFIPLALVKYQSENASY VVQSLMDNRSLLYNIIYVTLIIAFTYFYTAITLNPTQMAEDMKRNNGFIPGIKPGKDTAD YIDTVMSRLTLPGSLFIAFIAVMPALAGLLNVQQGFSQFFGGTSLLILVGVVIDTLQQIE SYLMMRHYDGLLNSGHTRNAAGVPSAY >gi|283510529|gb|ACQH01000090.1| GENE 11 6380 - 6826 525 148 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|150008965|ref|YP_001303708.1| 50S ribosomal protein L15 [Parabacteroides distasonis ATCC 8503] # 1 143 1 143 148 206 72 1e-52 MKLHNLKPAAGSTYSRRRIGRGPGSGLGGTSTRGHKGAKSRSGYKRKIGFEGGQMPLQRR VPKSGFKNINHKEYFPVNLSTLQALAEKKGLTKIGVAELKEAGLFNGKLLVKVLGNGQIK AALTVEANAFSKSAEEAIKAAGGNTTLI >gi|283510529|gb|ACQH01000090.1| GENE 12 6852 - 7028 225 58 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167933113|ref|ZP_02520200.1| 50S ribosomal protein L30 [candidate division TM7 single-cell isolate TM7b] # 1 58 1 58 58 91 75 7e-18 MATIKIKQVKSKIGYPIDQKRTLQCLGLRKISQVVEVEDTPSIRGMIRKVHHLVTVVE >gi|283510529|gb|ACQH01000090.1| GENE 13 7040 - 7552 742 170 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167933112|ref|ZP_02520199.1| 30S ribosomal protein S5 [candidate division TM7 single-cell isolate TM7b] # 1 170 1 170 170 290 87 7e-78 MAMNKVRVNSDAELKDRLVAINRVTKVTKGGRTFTFAAIVVVGDGNGVIGYGLGKAGEVT TAIAKGVESAKKNLVKVPVLKGTVPHEVETSYGGAQVLLKPAAAGTGLKAGGAMRAVLES CGITDVIAKSKGSSNPHNLVKATIEALSLMRDAYTIAGVRGISMDKVFNG >gi|283510529|gb|ACQH01000090.1| GENE 14 7559 - 7903 444 114 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167933111|ref|ZP_02520198.1| 50S ribosomal protein L18 [candidate division TM7 single-cell isolate TM7b] # 1 114 1 114 114 175 75 3e-43 MTTKKVERRIKIKYRIRKSVNGTAERPRMSVFRSNKQIYVQIINDITGTTLASASSLGME AMPKKEQAQKVGELVAQKAQAAGITAVVFDRNGYLYHGRVKELADAARKGGLNF >gi|283510529|gb|ACQH01000090.1| GENE 15 7922 - 8488 787 188 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167933110|ref|ZP_02520197.1| 50S ribosomal protein L6 
[candidate division TM7 single-cell isolate TM7b] # 1 188 1 188 188 307 80 4e-83 MSRIGKLPISIPAGVTVTQANGVVTVKGPKGELSQHVDSSIKVNIEDGQIVFEVDENSPV NIKQKQAFHGLYRSLVHNMVVGVSEGYTKVLELVGVGYRVSNQGNLIEFSLGYTHPIFIQ LPAEVKVETKSERNQNPLLILESCDKQLLGLICAKIRSFRKPEPYKGKGILFKGEVIRRK SGKTASAK >gi|283510529|gb|ACQH01000090.1| GENE 16 8505 - 8900 597 131 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|53715452|ref|YP_101444.1| 30S ribosomal protein S8 [Bacteroides fragilis YCH46] # 1 131 1 131 131 234 88 5e-61 MTDPIADYLTRLRNAIMAHHRVVEVPASNLKKEITKILFEKGYILNYKFVEDGPQGSIKV ALKYDPVTKANAITSLKRVSTPGLRQYTGYKDMPRVINGLGIAILSTSQGVMTNKEAAAL KIGGEVLCYVY >gi|283510529|gb|ACQH01000090.1| GENE 17 9049 - 9351 438 100 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|150003374|ref|YP_001298118.1| 30S ribosomal protein S14 [Bacteroides vulgatus ATCC 8482] # 1 100 1 99 99 173 88 1e-42 MAKESMKAREVKRAKLVARYAEKRAALKKIIATSEDYAEAYEAARKLQSIPKNANPIRLH NRCKITGRPKGYIRQFGLSRIQFREMASAGLIPGVKKASW >gi|283510529|gb|ACQH01000090.1| GENE 18 9357 - 9911 860 184 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167933171|ref|ZP_02520258.1| 50S ribosomal protein L5 [candidate division TM7 single-cell isolate TM7b] # 1 184 1 184 184 335 91 2e-91 MNTAQLKKVYAETIAPALQKQFNYSSSMQVPVLKKIVINQGLGDATQDKKIIEVAINEIT AITGQKAVATYSKKDIANFKLRKKMPIGVMVTLRRERMYEFLEKLIRVSLPRIRDFKGIE SKFDGRGNYTLGISEQIIFPEINIDGIDRIQGMNITFVTTANTDEEGYALLKAFGLPFKN AKND >gi|283510529|gb|ACQH01000090.1| GENE 19 9914 - 10231 383 105 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|228470722|ref|ZP_04055573.1| ribosomal protein L24 [Porphyromonas uenonis 60-3] # 1 105 1 105 106 152 69 3e-36 MSKLHIRKGDEVIVLAGDDKGRKGKVLKVLVAKQRALVEGVNMVSKSMKPSAKNPQGGIV KQEAPIHVSNLSLIDPKSGKATRVGMKRTDDGKKVRVAKKSGEVI >gi|283510529|gb|ACQH01000090.1| GENE 20 10252 - 10617 570 121 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|150003377|ref|YP_001298121.1| 50S ribosomal protein L14 [Bacteroides vulgatus ATCC 8482] # 1 121 1 121 121 224 94 6e-58 MIQAESRLTVCDNSGAREALCIRVLGGTRRRYASVGDVIVVSVKNVIPSSDLKKGAVSKA 
LIVRTKKEIRRADGSYIRFDDNACVLLNNAGEIRGSRIFGPVARELRAVNMKVVSLAPEV L >gi|283510529|gb|ACQH01000090.1| GENE 21 10620 - 10877 387 85 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160883055|ref|ZP_02064058.1| hypothetical protein BACOVA_01018 [Bacteroides ovatus ATCC 8483] # 1 85 5 89 89 153 87 1e-36 METRNLRKVRQGVVLSNKMDKTIVIAAKFKEKHPIYGKFVQKTKKYHVHDEKNEANIGDT VLIMETRPLSKTKRWRLVQIIEKAK >gi|283510529|gb|ACQH01000090.1| GENE 22 10891 - 11085 220 64 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|150008976|ref|YP_001303719.1| 50S ribosomal protein L29 [Parabacteroides distasonis ATCC 8503] # 1 63 1 63 65 89 66 2e-17 MKMKELKELETKDLAEKLENAVAAYDQMKLNHNITPLENPSQIKSARRDIARMKTELRQR ELNK >gi|283510529|gb|ACQH01000090.1| GENE 23 11089 - 11517 583 142 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|53715459|ref|YP_101451.1| 50S ribosomal protein L16 [Bacteroides fragilis YCH46] # 1 139 1 138 144 229 79 2e-59 MLQPKRVRYRRPQDGRGNKGNASRGTQLAFGSFGIKTLEAKWIDSRQIEAARVAVNRYMQ REGQVWIRIFPDKPITRKPADVRMGKGKGDPAGWVAPVTPGRILFEVEGVSFDIAKEALR LAAQKLPVKTKFVVRRDYDKNA >gi|283510529|gb|ACQH01000090.1| GENE 24 11533 - 12267 1054 244 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160883058|ref|ZP_02064061.1| hypothetical protein BACOVA_01021 [Bacteroides ovatus ATCC 8483] # 1 244 1 243 243 410 81 1e-114 MGQKVNPISNRLGIIRGWDSNWFGGKNFGDNIVEDMKIRKYLNERLAKANVSRIVIERTL KLVTITICTARPGIVIGKGGQDVDKLKEELKKLYKKDIQINIFEVKRPDLDATIVANNIA RQVEGKIAYRRAIKMAVQNTMRAGAEGIKVQITGRLNGAEMARKEMYKEGRTPLHTFRAD IDYCQAEALTKVGLLGIKVWICRGEVYGDRDLTPNFTQDKQSGGRSNNAGGRSGRGNRKR NNNR >gi|283510529|gb|ACQH01000090.1| GENE 25 12275 - 12679 534 134 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|53715461|ref|YP_101453.1| 50S ribosomal protein L22 [Bacteroides fragilis YCH46] # 1 133 1 133 136 210 76 1e-53 MGARKHIAAEKMKEARKNLYFAKLKGVPSSPRKMRYVVDMIRGMEVNRALGVLRFSKKQA AADVEKLLRSAVANWETKNDRKADEGELYISKVFVDEGVTMKRMIPAPQGRGYRIRKRSN HVTLFVDAKTNDEK >gi|283510529|gb|ACQH01000090.1| GENE 26 12711 - 12977 453 88 aa, chain - ## PROTEIN SUPPORTED ## NR: 
gi|167933129|ref|ZP_02520216.1| 30S ribosomal protein S19 [candidate division TM7 single-cell isolate TM7b] # 1 88 1 88 88 179 97 2e-44 MSRSLKKGPYINVSLEKKILAMNESGKKSVVKTWARASMISPDFVGHTVAVHNGNKFIPV YITENMVGHKLGEFSPTRRFGGHSGNRK >gi|283510529|gb|ACQH01000090.1| GENE 27 12992 - 13816 1307 274 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|212695302|ref|ZP_03303430.1| hypothetical protein BACDOR_04842 [Bacteroides dorei DSM 17855] # 1 273 1 273 273 508 90 1e-143 MAVRKFKPVTPGQRHKIIGTFEEITASVPEKSLVYGKRSTGGRNNTGKMTVRYMGGGHKK KYRVIDFKREKDGVPAVVKTIEYDPNRSARIALLYYADGEKRYIIAPNGLQVGSTLLSGA EAAPEIGNAIPLANIPVGTVIHNIELRPGQGALLVRSAGNFAQLTSREGSYCVIKLPSGE TRKVLSACKATVGSVGNSDHALEQSGKAGRSRWLGRRPHNRGVVMNPVDHPMGGGEGRQS GGHPRSRKGLYAKGLKTRAPKKQSNKYIIERAKK >gi|283510529|gb|ACQH01000090.1| GENE 28 13825 - 14169 359 114 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|212695303|ref|ZP_03303431.1| hypothetical protein BACDOR_04843 [Bacteroides dorei DSM 17855] # 1 114 1 96 96 142 64 2e-33 MGFIIKPIVTEKMTAITEKSSRDKSYKVKGEVRTKAATPRYGFIVRPEANKLQIKSEIEN LYNVTVLDVNTMRYAGKRSSRYTRAGLVRGQKSAYKKAIVTLKAGDEIDFYSNI >gi|283510529|gb|ACQH01000090.1| GENE 29 14186 - 14815 822 209 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|150003386|ref|YP_001298130.1| 50S ribosomal protein L4 [Bacteroides vulgatus ATCC 8482] # 1 208 1 208 208 321 75 4e-87 MEVSVLNINGQETGRKVVLNDAIFGIEPNDHVLYLDVKQYLANQRQGTAKTKERSEMSGS TRKLGRQKGGGGARRGDINSPVLVGGARVFGPKPRDYSFKLNKKVKVLARKSALSYKAKE NAIIVVEDFEMAAPKTKEYVNIVKNLQLGERKSLLLLPNVNKNVYLSARNLQRSEVMTAS ALNAYKVLNADVLVITEKSLEAIDGILNK >gi|283510529|gb|ACQH01000090.1| GENE 30 14815 - 15429 871 204 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|53715466|ref|YP_101458.1| 50S ribosomal protein L3 [Bacteroides fragilis YCH46] # 1 204 1 205 205 340 80 8e-93 MPGLIGRKIGMTSVFSADGKNVPCTVIEAGPCVVTQVKTVETDGYKAVQLAYGEAKKNRT TKAMAGIFEKAGTTPKKHLAEFKFDTEYNLGDTITVDLFEGAEFVDVVGTSKGKGFQGVV KRHGFGGVGQTTHGQDDRARKPGSIGACSYPAKVFKGMRMGGQMGGDRVTTQNLKVLKVI PEHNLILVKGSVAGCNGSTLLIKK >gi|283510529|gb|ACQH01000090.1| GENE 31 15558 - 15863 
470 101 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|53715467|ref|YP_101459.1| 30S ribosomal protein S10 [Bacteroides fragilis YCH46] # 1 101 1 101 101 185 94 3e-46 MSQKIRIKLKSYDHKLVDKSAEKIVKAVKATGAIVSGPIPLPTHKRIFTVNRSTFVNKKS REQFQLQNFKRLIDIYSSTAKTVDALMKLELPSGVEVEIKV >gi|283510529|gb|ACQH01000090.1| GENE 32 15967 - 18081 2485 704 aa, chain - ## HITS:1 COG:HP1195 KEGG:ns NR:ns ## COG: HP1195 COG0480 # Protein_GI_number: 15645809 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Helicobacter pylori 26695 # 6 699 7 692 692 826 59.0 0 MANRDLHLTRNIGIMAHIDAGKTTTSERILFYTGKTHKIGEVHDGAATMDWMAQEQERGI TITSAATTCNWNYDKKSFKINLIDTPGHVDFTAEVERSLRVLDGAVATYSAADGVQPQSE TVWRQADKYNVPRIGYVNKMDRSGADFFETVQQMKDILGANPCPVQIPIGAEENFKGVID LIKMKAILWHDETMGAEYSIEDIPADLLDEAKEWHDKMVENAANFDDALMEKYLEGIEPS EEELIAAIRKATISMDLTPMVLGSSYKNKGVQPLLDYVCAFLPSPLDTVAIVGVNPNTDE EEERKPSEDAPTSALAFKIATDPFMGRLVFFRVYSGKVEAGSYVYNARSGKKERISRLFQ MNSNKEIPMESIDAGDIGAGVGFKDIRTGDTLCDENAPIVLESMTFPDTVISIAVEPKSQ ADIAKLDNGLAKLAEEDPTFTVRTDEQSGQTIISGMGELHLDIIIDRLKREFKVECNQGK PQVNYKEAITKDVTLREVYKKQSGGRGKFADIIVTVGPKDEDYKEGNFQFINEVKGGNVP KEFIPSVQKGFESAMKNGVLGGYPMENLKVTLTDGSFHPVDSDQLSFELAAINAYRNACP KAGPVLMEPIMKVEVVTPEENMGDVIGDLNKRRGQVEGMDEARSGARIVKAQVPLAEMFG YVTALRTITSGRATSSMEYDHHAPLSSSIAKAVLEEVKGRTDLV >gi|283510529|gb|ACQH01000090.1| GENE 33 18107 - 18583 735 158 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|29348139|ref|NP_811642.1| 30S ribosomal protein S7 [Bacteroides thetaiotaomicron VPI-5482] # 1 158 1 158 158 287 88 5e-77 MRKAKPKKRVILPDPVYNDQKVSKFVNHLMYDGKKNASYEIFYNALDTVKAKMEKEEKSP LEIWKTALDNITPQVEVKSRRIGGATFQVPTEIRAERKESISMKNMISYARKRGGKTMAD KLAAEIMDAFNNQGGAYKRKEDMHRMAEANRAFAHFRF >gi|283510529|gb|ACQH01000090.1| GENE 34 18906 - 19289 616 127 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|150003391|ref|YP_001298135.1| 30S ribosomal protein S12 [Bacteroides vulgatus ATCC 8482] # 1 126 1 126 137 241 94 3e-63 
MPTISQLVRKGRKVLVDKSKSPALDSCPQRRGVCVRVYTTTPKKPNSAMRKVARVRLTNQ KEVNSYIPGEGHNLQEHSIVLVRGGRVKDLPGVRYHIVRGTLDTAGVANRTQRRSKYGAK RPKAAKK >gi|283510529|gb|ACQH01000090.1| GENE 35 20122 - 20667 528 181 aa, chain - ## HITS:1 COG:no KEGG:PRU_1474 NR:ns ## KEGG: PRU_1474 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 15 174 18 168 171 109 47.0 4e-23 MGIASAQHEMKSLSLQPKIGLNLSKLTNHYDSDFKPGVVVGFEAEWQTKAKFGWSVGLLY SQQGTKYALGEFKYTYNYNYINVPVLANFYVARNLALKTGLQLGFNVSNGYTIEGPGSKE NRNDPDVKAVEASIPLGVSYEFNQFVVEARYNLGLTYAVPQTRFSAFQFTLGYRFDVVKG N >gi|283510529|gb|ACQH01000090.1| GENE 36 21372 - 22640 887 422 aa, chain - ## HITS:1 COG:CC2770 KEGG:ns NR:ns ## COG: CC2770 COG3550 # Protein_GI_number: 16127002 # Func_class: R General function prediction only # Function: Uncharacterized protein related to capsule biosynthesis enzymes # Organism: Caulobacter vibrioides # 71 401 68 417 435 88 26.0 2e-17 MEKLDVVACFDWLEKEEKVGTLGHETLRGSDVFSFEFDGDWLKRYADICFGRDLQPFTGI QYSPSNNRIFGCFSDALPDRWGRRLIDLRMAQEQRGKGMGKPTLSDWDYLKGVEDSLRMG GFRFKDLASGAYVNAASGYAVPPVLHIDELLQAAGEIEKSEYKHMVPEEKWVQRLFQPGS SMGGARPKACVESGGHLFLAKFPSVKDEINVSRWEHFAHLMARACGISVAQTEIVKAGNG QDILLSKRFDRNDDNQRVHMASSLTLLGFADGDGEKTGKGYLDIVDFIVSSGGRKVGVDL EELYRRVAFNICLGNTDDHFRNHAFMLAKGGWELSPAYDLNPSNSRYQALLIDNDTNDSD LNRLFEVHENYMLDDAAARRIIGDVTQNMKCWEAMAQDVGLSRGEMVHFAERFEKGMNFI LR >gi|283510529|gb|ACQH01000090.1| GENE 37 22644 - 22961 199 105 aa, chain - ## HITS:1 COG:no KEGG:BF1073 NR:ns ## KEGG: BF1073 # Name: not_defined # Def: putative DNA-binding protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 105 3 107 108 102 57.0 6e-21 MTKFTKSNWMTRGLQEKLNIVGEQFRLARLRRNLTMDQVAQRAQCSRLTLSRLEKGSSAV SLGVVARVLNALQLEDDILKLAQDDILGRMVQDMEIKTKKRASKK >gi|283510529|gb|ACQH01000090.1| GENE 38 23414 - 23650 173 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929348|ref|ZP_06423193.1| ## NR: gi|288929348|ref|ZP_06423193.1| hypothetical protein HMPREF0670_02087 [Prevotella sp. oral taxon 317 str. 
F0108] # 17 78 1 62 62 112 100.0 9e-24 MLCRYLLWQRFTKVEYMQKDVMAQRKTATKYLDRIVEAGLLRKMKFGCSDYYINMTLMDL FILHRVDEINDTVSIESV >gi|283510529|gb|ACQH01000090.1| GENE 39 23587 - 23778 97 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929349|ref|ZP_06423194.1| ## NR: gi|288929349|ref|ZP_06423194.1| hypothetical protein HMPREF0670_02088 [Prevotella sp. oral taxon 317 str. F0108] # 1 63 1 63 63 116 100.0 4e-25 MEKLDVVACFDWLEKEEKVGMLGHETLRSSDVFSFEFDGDLLKCCADICFGRDLPRLSIC RKT >gi|283510529|gb|ACQH01000090.1| GENE 40 24142 - 26985 2012 947 aa, chain + ## HITS:1 COG:SA0189 KEGG:ns NR:ns ## COG: SA0189 COG0610 # Protein_GI_number: 15925899 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Staphylococcus aureus N315 # 1 933 1 916 929 627 39.0 1e-179 MPVQSEAALENGLIATLQQMSYGYVHIEEENNLRANFKTQLEKHNRKRLEEFGRTEFTES EFEKILIYLEGGTRFEKAKKLRDLFPLELESGERLWVEFLNRTHWCQNEFQVSNQITVEG RKKCRYDVTILINGLPLVQIELKRRGVELKQAYNQIQRYHKTSFHGLFDYIQLFVISNGV NTRYFANNPNSGYKFTFNWTDAANIPFNDLEKFATTFFDKCTLGKIIGKYIVLHEGDKCL MVLRPYQFYAVEKILDRVVNSNDNGYIWHTTGAGKTLTSFKAAQLVAELSDIDKVMFVVD RHDLDTQTQAEYEAFEPGAVDSTDNTDELVKRLHSNSKIIITTIQKLNAAVTKQWYSSRI EEIRHSRIVMIFDECHRSHFGDCHKNIVKFFDNTQMFGFTGTPIFVENAVNGHTTKEIFG NNLHKYLIKDAIADENVLGFLVEYYHGNADVDNADQNRMTEIAKFILNNFNKSTFDGEFD ALFAVQSVPMLIRYYKLFKSLNPKIRIGAVFTYAANSSQDDALTGMNTGGYVSDSTGEAD ELQTIMDDYNDMFGTSFTTENFRAYYDDINLRMKKKKADMKSLDLCLVVGMFLTGFDSKK LNTLYVDKNMDYHGLLQAFSRTNRVLNEKKRFGKIVCFRDLKSKVDASIKLFSNSNNLED IIRPPFNEVKKNYQELTTSFLEQYPTPSSIDLLQSERDKTQFILAFRDVIKKHAEIQIYD EFEEDATDLGMSEQQFMDFRSKYLDIYDAVDGGRKPSGEDSTPDEDAEPTETSTDSVIGD IDFCLELLHSDIINVTYILDLIADLNPYSEDYQEKRTHIIDTMIKDAELRSKAKLIDDFI QQNVDDDRDNFMARKQRADGTSDLEEQLNNYITNKRNDAVDELAKDEDLDVKVLNHYLSE YDYLQKEQPEIIQQALKEKHLGLVKRRNTLKRIMDRLHYIIRTYCLE >gi|283510529|gb|ACQH01000090.1| GENE 41 26990 - 27541 324 183 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929351|ref|ZP_06423196.1| ## NR: gi|288929351|ref|ZP_06423196.1| 
hypothetical protein HMPREF0670_02090 [Prevotella sp. oral taxon 317 str. F0108] # 1 174 1 174 183 335 100.0 9e-91 MEKNIYIIRWYGPFPDVESIRRWELNNHIPCSLYLMSGMKKYAKSSIHYYIGKTERNLIT NRFKDKDHHINDFSRINEIWLGCFANKKASHNDVTLVENMLTSYFVGEVGSDKMLNKINF YLPTSNVYVLSEWISPYRERAWSKLSRSSPANIVADVIVNKANDSAFNYYLYVSKKLKKQ QKK >gi|283510529|gb|ACQH01000090.1| GENE 42 27546 - 29099 1436 517 aa, chain + ## HITS:1 COG:SA1626 KEGG:ns NR:ns ## COG: SA1626 COG0286 # Protein_GI_number: 15927382 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Staphylococcus aureus N315 # 6 517 11 515 518 592 57.0 1e-169 MSEELQQKLRDQLWEVANKLRGNMSASDFMYFTLGFIFYKYLSEKIEKHANEALVDDEVT FKELWAMENDDDIESLQQEVKTECLENIGYFIEPHFLFSSIIEKIKKKENILPILERSLK RIEDSTLGQDSEEDFGGLFSDIDLASPKLGKTADDKNTLVSNVLLALDDIDFGVEASQEI DILGDAYEYMISQFAAGAGKKAGEFYTPQEVSHILAEIVTLGHARLRNVYDPTCGSGSLL LRAANIGHANEIFGQEKNPTTYNLARMNMLLHGIKFSNFRIENGDTLEADAFGDTQFDAV VANPPFSAEWSAAEKFNNDDRFSKIGRLAPRKTADYAFILHMIYHLNEGGTMACVAPHGV LFRGNAEGVIRRFLIEKKNYVDAIIGLPANIFYGTSIPTCILVFKKCRKEDENILFIDAS KEFEKVKTQNKLRPQHIQKIVDTYRDRKEIEKYSHLATLEEIAENDYNLNIPRYVDTFEE EEPIDIKAVMAEIKELETKRAELDKEIEVYLKELGIV Prediction of potential genes in microbial genomes Time: Sat May 28 02:08:13 2011 Seq name: gi|283510528|gb|ACQH01000091.1| Prevotella sp. oral taxon 317 str. F0108 cont2.91, whole genome shotgun sequence Length of sequence - 2235 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 143 - 1168 197 ## XCC0463 putative restriction modification system specificity subunit + Term 1186 - 1229 -0.2 2 2 Tu 1 . 
- CDS 1161 - 1910 354 ## COG0732 Restriction endonuclease S subunits - Prom 1939 - 1998 4.0 Predicted protein(s) >gi|283510528|gb|ACQH01000091.1| GENE 1 143 - 1168 197 341 aa, chain + ## HITS:1 COG:no KEGG:XCC0463 NR:ns ## KEGG: XCC0463 # Name: not_defined # Def: putative restriction modification system specificity subunit # Organism: X.campestris # Pathway: not_defined # 1 146 68 213 430 150 51.0 9e-35 MGRSFAGASVLPYHIVRLGNIVYTKSPLKEYPYGIVKANTGKDGIVSTLYAVYSVKDNAN YKFIEYYFSLANRANRYFKPIVRIGAKHDMKIGNQEVLANQVIFPTVKEQEKIAGFLSLI DDRISNQNKIIEDLKKLKCAIIENVLNNCHDNKMRLGDVGIYIRGLTYSSNDVVEQKGTI VMRSNNIVSGGLLDYCNNVVRVNKQILQEQQLQNGDIVICMANGSSALVGKTSFYDGKCL SPITVGAFCGIYRSKMPITKWLFQTNRYHRYIWNSLQGGNGAIANLNGEDILRMSFPTPD KSTIGHCIKLLSSLDLLIENNVSLCSMFSQQKEYLLQQMFI >gi|283510528|gb|ACQH01000091.1| GENE 2 1161 - 1910 354 249 aa, chain - ## HITS:1 COG:SA0392 KEGG:ns NR:ns ## COG: SA0392 COG0732 # Protein_GI_number: 15926110 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Staphylococcus aureus N315 # 20 249 164 403 403 67 25.0 2e-11 MSAKNSVDSVRKEMITDMPLSLPCCQEQIKIGYMLSILDERIATQNKIIEDLKKLKCAII EKLYSEIQGKEYLYGQLFEVVNKRNKQMEYSNILSASQEKGMVNRDDLNLDIQFERSNIN TYKIVRAGDYVIHLRSFQGGFAFSDKLGVCSPAYTILRPNCLLEYGYLSYYFTSHRFIKS LIIVTYGIRDGRSINIEEWLNMKVIIPSKEYQLHTLKILRSIEGKIENEETYTICLSNQK QYLLNQMFI Prediction of potential genes in microbial genomes Time: Sat May 28 02:08:41 2011 Seq name: gi|283510527|gb|ACQH01000092.1| Prevotella sp. oral taxon 317 str. F0108 cont2.92, whole genome shotgun sequence Length of sequence - 92645 bp Number of predicted genes - 76, with homology - 65 Number of transcription units - 45, operones - 18 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 55 - 981 599 ## COG0582 Integrase + Term 1017 - 1061 -0.8 2 2 Tu 1 . - CDS 1728 - 1928 82 ## - Prom 2117 - 2176 5.0 + Prom 1988 - 2047 3.1 3 3 Tu 1 . + CDS 2075 - 2329 56 ## + Term 2415 - 2445 -0.4 - Term 2238 - 2279 11.2 4 4 Tu 1 . 
- CDS 2323 - 3855 1374 ## COG3119 Arylsulfatase A and related enzymes - Prom 3878 - 3937 2.3 5 5 Tu 1 . + CDS 3764 - 3970 98 ## 6 6 Tu 1 . + CDS 4096 - 4329 85 ## - Term 4210 - 4252 9.1 7 7 Op 1 . - CDS 4300 - 5916 1413 ## BVU_0960 hypothetical protein 8 7 Op 2 . - CDS 5940 - 9080 3577 ## BVU_0959 hypothetical protein - Prom 9103 - 9162 6.9 - Term 9235 - 9281 9.6 9 8 Op 1 . - CDS 9301 - 9942 289 ## Ccel_2974 hypothetical protein 10 8 Op 2 . - CDS 9954 - 11042 643 ## COG4748 Uncharacterized conserved protein - Prom 11160 - 11219 5.6 + Prom 12116 - 12175 5.3 11 9 Tu 1 . + CDS 12289 - 12414 59 ## + Prom 12631 - 12690 1.8 12 10 Tu 1 . + CDS 12761 - 13423 527 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 13 11 Op 1 . - CDS 13528 - 14445 385 ## COG0338 Site-specific DNA methylase 14 11 Op 2 . - CDS 14448 - 15254 618 ## Rcas_2237 restriction endonuclease EcoRV 15 12 Tu 1 . + CDS 15915 - 16946 651 ## BT_1876 putative anti-sigma factor + Prom 17026 - 17085 9.0 16 13 Op 1 . + CDS 17111 - 20713 1835 ## Slin_2121 YD repeat protein 17 13 Op 2 . + CDS 20710 - 23958 2146 ## BT_1993 putative cell wall-associated protein precursor 18 13 Op 3 . + CDS 23970 - 24527 295 ## gi|260909780|ref|ZP_05916474.1| conserved hypothetical protein + Prom 24704 - 24763 1.8 19 14 Tu 1 . + CDS 24790 - 25248 168 ## gi|260909782|ref|ZP_05916476.1| conserved hypothetical protein + Prom 25278 - 25337 3.1 20 15 Tu 1 . + CDS 25375 - 25866 385 ## gi|288929368|ref|ZP_06423213.1| hypothetical protein HMPREF0670_02107 21 16 Op 1 . + CDS 25995 - 26477 213 ## gi|288929369|ref|ZP_06423214.1| hypothetical protein HMPREF0670_02108 22 16 Op 2 . + CDS 26464 - 26721 236 ## gi|288929370|ref|ZP_06423215.1| hypothetical protein HMPREF0670_02109 + Prom 26779 - 26838 2.9 23 17 Tu 1 . + CDS 26881 - 27303 277 ## gi|288929371|ref|ZP_06423216.1| hypothetical protein HMPREF0670_02110 + Prom 27393 - 27452 7.4 24 18 Op 1 . + CDS 27487 - 27963 292 ## 25 18 Op 2 . 
+ CDS 28037 - 29197 823 ## gi|288929372|ref|ZP_06423217.1| hypothetical protein HMPREF0670_02111 + Term 29293 - 29330 3.0 + Prom 29281 - 29340 3.9 26 19 Tu 1 . + CDS 29507 - 31441 2101 ## COG3525 N-acetyl-beta-hexosaminidase - Term 31686 - 31726 2.4 27 20 Op 1 . - CDS 31786 - 31977 204 ## gi|288929374|ref|ZP_06423219.1| hypothetical protein HMPREF0670_02113 28 20 Op 2 . - CDS 31974 - 35519 3002 ## COG3291 FOG: PKD repeat - Prom 35737 - 35796 3.8 29 21 Tu 1 . - CDS 35882 - 36385 -37 ## - Prom 36508 - 36567 1.6 - Term 36528 - 36571 -0.3 30 22 Tu 1 . - CDS 36575 - 37831 1111 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase) - Prom 37945 - 38004 1.6 + Prom 37740 - 37799 4.5 31 23 Op 1 . + CDS 37969 - 38214 56 ## gi|288929377|ref|ZP_06423222.1| hypothetical protein HMPREF0670_02116 + Prom 38216 - 38275 2.9 32 23 Op 2 . + CDS 38295 - 39065 762 ## BVU_3832 hypothetical protein 33 23 Op 3 . + CDS 39139 - 39639 535 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 34 23 Op 4 . + CDS 39636 - 40961 1260 ## BVU_0051 hypothetical protein + Prom 42498 - 42557 3.4 35 24 Op 1 . + CDS 42584 - 42976 83 ## gi|288929381|ref|ZP_06423226.1| hypothetical protein HMPREF0670_02120 36 24 Op 2 . + CDS 42973 - 43407 265 ## gi|288929382|ref|ZP_06423227.1| hypothetical protein HMPREF0670_02121 37 24 Op 3 . + CDS 43410 - 43835 367 ## gi|288929383|ref|ZP_06423228.1| hypothetical protein HMPREF0670_02122 + Term 43952 - 44005 11.1 - Term 43948 - 43985 9.4 38 25 Op 1 58/0.000 - CDS 44040 - 48374 5304 ## COG0086 DNA-directed RNA polymerase, beta' subunit/160 kD subunit 39 25 Op 2 28/0.000 - CDS 48401 - 52213 2916 ## PROTEIN SUPPORTED gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 - Prom 52342 - 52401 2.7 - Term 52361 - 52421 13.0 40 25 Op 3 . - CDS 52437 - 52817 508 ## PROTEIN SUPPORTED gi|153805949|ref|ZP_01958617.1| hypothetical protein BACCAC_00194 - Prom 52837 - 52896 6.5 41 26 Op 1 . 
- CDS 52928 - 53446 629 ## PROTEIN SUPPORTED gi|160883075|ref|ZP_02064078.1| hypothetical protein BACOVA_01038
42 26 Op 2 55/0.000 - CDS 53465 - 54157 1015 ## PROTEIN SUPPORTED gi|150003398|ref|YP_001298142.1| 50S ribosomal protein L1
43 26 Op 3 45/0.000 - CDS 54178 - 54618 635 ## PROTEIN SUPPORTED gi|150003399|ref|YP_001298143.1| 50S ribosomal protein L11
44 26 Op 4 . - CDS 54676 - 55221 552 ## COG0250 Transcription antiterminator
45 26 Op 5 . - CDS 55236 - 55427 114 ## PRU_2141 preprotein translocase subunit SecE
- TRNA 55441 - 55516 78.8 # Trp CCA 0 0
- Term 55511 - 55551 9.6
46 27 Tu 1 . - CDS 55588 - 56778 1388 ## PROTEIN SUPPORTED gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19
- Prom 56802 - 56861 5.4
- TRNA 56843 - 56917 74.4 # Thr GGT 0 0
- TRNA 56924 - 56999 70.2 # Gly TCC 0 0
- TRNA 57081 - 57163 62.7 # Tyr GTA 0 0
- Term 57265 - 57325 2.9
47 28 Op 1 . - CDS 57372 - 57671 188 ## PROTEIN SUPPORTED gi|163755828|ref|ZP_02162946.1| 30S ribosomal protein S21
48 28 Op 2 . - CDS 57698 - 58579 827 ## COG4974 Site-specific recombinase XerD
- Term 58590 - 58643 -0.3
49 28 Op 3 . - CDS 58659 - 58850 268 ## PROTEIN SUPPORTED gi|150003404|ref|YP_001298148.1| 30S ribosomal protein S21
- Prom 59077 - 59136 5.5
+ Prom 58792 - 58851 4.3
50 29 Tu 1 . + CDS 59056 - 59376 56 ##
+ Term 59539 - 59575 0.6
- Term 59791 - 59838 -0.6
51 30 Tu 1 . - CDS 59886 - 60368 -406 ##
52 31 Op 1 . + CDS 61600 - 64830 3512 ## PRU_1874 putative receptor antigen RagA
53 31 Op 2 . + CDS 64860 - 66536 1562 ## PRU_1875 putative lipoprotein
54 32 Tu 1 . + CDS 66610 - 67437 796 ## PRU_1876 putative lipoprotein
+ Term 67451 - 67490 -0.7
+ Prom 67501 - 67560 5.5
55 33 Tu 1 . + CDS 67727 - 67918 112 ## gi|288929401|ref|ZP_06423246.1| hypothetical protein HMPREF0670_02140
+ Prom 67921 - 67980 7.6
56 34 Op 1 . + CDS 68004 - 68957 1085 ## COG1284 Uncharacterized conserved protein
57 34 Op 2 . + CDS 69044 - 69358 459 ## BVU_0810 hypothetical protein
+ Term 69412 - 69468 20.1
58 35 Tu 1 . + CDS 69817 - 70191 260 ## COG3152 Predicted membrane protein
- Term 70699 - 70754 2.2
59 36 Op 1 . - CDS 70759 - 72117 786 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1
60 36 Op 2 12/0.000 - CDS 72104 - 73585 1555 ## COG1003 Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain
61 36 Op 3 16/0.000 - CDS 73599 - 74921 1469 ## COG0403 Glycine cleavage system protein P (pyridoxal-binding), N-terminal domain
62 36 Op 4 18/0.000 - CDS 74942 - 75322 645 ## COG0509 Glycine cleavage system H protein (lipoate-binding)
- Prom 75411 - 75470 3.2
- Term 75515 - 75561 6.2
63 36 Op 5 . - CDS 75605 - 76696 1403 ## COG0404 Glycine cleavage system T protein (aminomethyltransferase)
64 36 Op 6 . - CDS 76712 - 77623 1044 ## COG0095 Lipoate-protein ligase A
- Prom 77689 - 77748 3.8
+ Prom 79781 - 79840 3.0
65 37 Tu 1 . + CDS 79866 - 80144 72 ##
- Term 80046 - 80081 6.7
66 38 Tu 1 . - CDS 80127 - 80717 668 ## gi|288929412|ref|ZP_06423257.1| conserved hypothetical protein
- Prom 80799 - 80858 4.5
- Term 80752 - 80789 1.8
67 39 Op 1 . - CDS 80869 - 81096 78 ##
68 39 Op 2 . - CDS 81107 - 81433 422 ## COG1917 Uncharacterized conserved protein, contains double-stranded beta-helix domain
- Prom 81453 - 81512 4.5
+ Prom 81721 - 81780 3.4
69 40 Tu 1 . + CDS 81800 - 82465 809 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases
+ Term 82617 - 82680 20.8
- Term 82742 - 82788 1.0
70 41 Op 1 1/0.000 - CDS 82794 - 84452 1704 ## COG0365 Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases
71 41 Op 2 . - CDS 84466 - 85020 664 ## COG1396 Predicted transcriptional regulators
- Prom 85106 - 85165 5.7
- Term 85178 - 85216 7.5
72 42 Op 1 . - CDS 85302 - 86285 684 ## gi|288929418|ref|ZP_06423263.1| hypothetical protein HMPREF0670_02157
73 42 Op 2 . - CDS 86305 - 86535 181 ## gi|288929419|ref|ZP_06423264.1| hypothetical protein HMPREF0670_02158
- Prom 86679 - 86738 6.2
- Term 87177 - 87219 -0.2
74 43 Tu 1 . - CDS 87376 - 88917 1666 ## BF1460 putative outer membrane protein precursor
- Prom 88982 - 89041 4.7
- Term 89125 - 89162 -0.8
75 44 Tu 1 . - CDS 89199 - 91253 612 ## PROTEIN SUPPORTED gi|163762592|ref|ZP_02169656.1| ribosomal protein S21
- Prom 91467 - 91526 4.6
+ Prom 91580 - 91639 5.4
76 45 Tu 1 . + CDS 91710 - 92426 892 ## COG0670 Integral membrane protein, interacts with FtsH
+ Term 92600 - 92636 1.3

Predicted protein(s)
>gi|283510527|gb|ACQH01000092.1| GENE 1 55 - 981 599 308 aa, chain + ## HITS:1 COG:SPy2122 KEGG:ns NR:ns ## COG: SPy2122 COG0582 # Protein_GI_number: 15675872 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pyogenes M1 GAS # 4 307 68 379 381 91 27.0 2e-18
MKAKTIREIALAWKSEKQRYVKQSTYAAYVLILENHLLSSFGDCEVLSEKLVQEFVLQKL
NAGLCIKTVKDMLIVLRMVMKFGVKNGWMNYCEWDIKYPTTEGNREIEVLTVAQHKKILD
FIRQNFTFRNLGIYISLTTGLRIGEICGLMWADINTDTGTITVNRTIERIYIIEGERKHT
ELVINTPKTKNSCREIPMNKELLAMVKPLKKVVNASFYVLTNEEKPTEPRTYRNYFHRLM
GHLDIPRLKYHGLRHSFATRCIESNCDYKTVSVLLGHANITTTLNLYVHPNMDQKKRCIT
KMLKYLNK
>gi|283510527|gb|ACQH01000092.1| GENE 2 1728 - 1928 82 66 aa, chain - ## HITS:0 COG:no KEGG:no NR:no
MAGFFITNRPASRKYSILAQVCFGAEWSRAYNGGQPIWCSMNMQLVPSLVRFAPKYPAFS
TETHCI
>gi|283510527|gb|ACQH01000092.1| GENE 3 2075 - 2329 56 84 aa, chain + ## HITS:0 COG:no KEGG:no NR:no
MDIFSGLDSQHIIRRIHFSTLGVKELSQMYGWRWPSFHPFTFSPFSDSFHFKLGKFANKG
DVPQQAHPLMLHSTAIRKALIHHH
>gi|283510527|gb|ACQH01000092.1| GENE 4 2323 - 3855 1374 510 aa, chain - ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 1 498 1 480 497 193 30.0 8e-49
MKTTSSILLSQAAWLALTPALAAKNTKPNIIYIMCDDMGYGDLGCYGQQHILTPNIDRMA
AEGMRFTQAYAGAPVSAPSRACFMTGQHSGHTEVRGNKEYWAESEPVFYGQNRDFSIVGQ HPYDANHVIIPEIMKANGYQTGMFGKWAGGYEGSTSTPDKRGIDEFFGYICQFQAHLYYP NFLNSYSRSRGDKGVQRVVLDDNIAHPMFGDDYFKRTAYSADLIHRRALQWLETQTADKP FFGVFTYTLPHAELAQPNDSLVAFYKKKFFVDKTWGGQEASRYNAVEHTHAQFAAMITRL DLYVGEILKMLKQKGLDDNTLVIFTSDNGPHEEGGADPSFFNRDGKLRGIKRQCYEGGIR IPFIARWKGCVEAGKVSDLPFAFYDLMPTFCELAGVKNYVQRYRNKCLTTDFFDGLSIAP TLLGNDKAQQRHPHLYWEFAETNQIAVRMDDWKLIVIKGTPHLYNLATDLHEDNDVAAQH PDVVKKMVNIVYKEHVDSPLFPITLPKMAQ >gi|283510527|gb|ACQH01000092.1| GENE 5 3764 - 3970 98 68 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MMLGLVFFAAKAGVRANHAACESRIDDVVFIVFNCLIQIVYCNQGFVLAAYSARFGWWEP CLTQIIHC >gi|283510527|gb|ACQH01000092.1| GENE 6 4096 - 4329 85 77 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSSLFGHCHGEKTHFSRNEKNLSSATEKLNDAKSVLNHNRRNAQAPLIMMPGQKEGKLCK AHLQGRYPLLGVAWVLF >gi|283510527|gb|ACQH01000092.1| GENE 7 4300 - 5916 1413 538 aa, chain - ## HITS:1 COG:no KEGG:BVU_0960 NR:ns ## KEGG: BVU_0960 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 536 1 545 545 658 59.0 0 MKAIKTYILAGAAALALSSCDDFLNTLPKDAMSPPTTWKTSDDAEKFLVGCYDGWEGGAA LLYWDAGSDFGYNNFPWEGFTNIGNGSVSPSSPGWSFYDYTIIGRCNTFLENVGKCAFSS EAVKKNLVAQVKAIRAYNYFRMGFLYGGVPIVRPFTSAKEAQVPRNTEQEVKGLVFKDLE EAIADISDKPASRGFIAKGAALAMKMRAALYWADYQKAKDAAQAIIDLGQYELDKDYTNL FKLDGRDSKEIILAVQYKTGTHSLGTIGQLYNNGDGGWSSVVPTQKCVDNYEMSNGMAID EAGSGYDATHPFHARDPRMAMTILFPGADWNKKVYNTLDEDWDGVKNPNYPSNAANSSKT ALTWRKYLDPMSQYADVWDTECCPIVFRYADVLLTWAEAENELNGPSADVYSKINMVRKR VGMPDVDQSKYNSKEKLRELIRRERGSEFAGEGLRRADILRWTTTNGKMVAETVLNGILT RITGTVDTTVVDPTRRAIVNGTSAVEERTFHTYNRYLPIPQKNISDNPKLEQNPGYAK >gi|283510527|gb|ACQH01000092.1| GENE 8 5940 - 9080 3577 1046 aa, chain - ## HITS:1 COG:no KEGG:BVU_0959 NR:ns ## KEGG: BVU_0959 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 21 1046 7 1031 1031 1179 59.0 0 MSNFMEEEQMRRKHPPFVAGSVAFSLVLGLMAFGSPSVSAKADANTFASIQQQKQSITGM VTDANGDPIIGASVQAGGTALAVTGADGRFSVSVAPGTELQISYVGFATKAVMVRNGVSN 
YDVTLSDDSRALNEIVVVGYGTQKKANLSGSVAQLDGKTLGNRPIANVSSGLQGLLPGVT VTGASGAPGLDGGLILVRGVGTLNSASPYILIDGVEAGTLNSLDPQDIASISVLKDASSA AIYGSKASNGVILVTTKRGQNGAPKVNYSGFLGIQNATALMERMNSADAAYYYNKALERS GKAPRFTEEAIQKFRDGSDPYNYPNTDWYKLAFKTAWQNRHNVNVTGGNEFVKYLASAGY LKQSSILPNAGREQFNARANLDMVLSKRITAHLNLAYIQNNYRDASSAYAGGSSDQIIRQ LNIIAPWIPYKKEDGSYGTVSDGNPMAWLESGMTVNRNNRNFTGMIGIDYQVFKDLKLTL QGAYVDASQRYSYFQKFIQYNANKATDPSKLEIGHYDWHRTTFDAFLNYDKSVAKHNFKA MLGWHTERYKYLPDWMYRKNFPNNNLTDMDAGDASTQKNSGYTRELTMVSYFGRLNYDYA GRYLFEANFRSDASSRFAKEHRWGFFPSFSAAWRISEEPFMERSKSWLDDLKLRASWGQL GNQDALSGKQDGTSDYYPWMNTYKLDGSYPFGGQLMQGYYQGDYHLSTISWERSTTWGVG VDFTLFGGLTGSIDYYNRKTTDIIMDVSAPAEFALGKYKDNIGALRNEGVEVSLAYAKQL NKDWAINAGVNFAYNKNKILNLGEGTEYIEKGDRRTALGQQYNSYYMYKATGKFFNSQEE ADAFTAKYGNPFGRKFMAGDIIYEDTNGDGKLDSNDRVYTKHTDIPAITYNINLGATWKG FDLSMTWQGVGSVSHIFNREVLGEFSGDASHPSTMWKDSWTESNHNAKMPRVFETGNSPS DMTRAMSTFWLWNTAYLRLKTLQLGYSLPENVLKAIGVERVRIYYAGENLLTFHSLPFNI DPEITSERGSSYPLLRSHAIGINITF >gi|283510527|gb|ACQH01000092.1| GENE 9 9301 - 9942 289 213 aa, chain - ## HITS:1 COG:no KEGG:Ccel_2974 NR:ns ## KEGG: Ccel_2974 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 73 213 145 287 289 87 35.0 2e-16 MKKLIKGILLFFGGLTVFSILMVGVVTCSYMKGSSDKDSVSTDSDESSEKGDMVINVDSV IAAHRSDFTFRKDEFENKIWVEPKNAPKYTNVNRIYCYFLLRDKTASDFRFRIQYAADDW LFIKTIMFNLGNDKILTFEPEEMQRDNETSIWEWCDEQIDDKNESLISALAYSKSVKVKF CGRQYFRTRDMTSKELEYIKKTYEFYQALGGKF >gi|283510527|gb|ACQH01000092.1| GENE 10 9954 - 11042 643 362 aa, chain - ## HITS:1 COG:MA3840 KEGG:ns NR:ns ## COG: MA3840 COG4748 # Protein_GI_number: 20092636 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 1 361 4 359 362 305 48.0 6e-83 MDLKDLIKQLAEKINQQKANILTEEATKNAFIMPFINALGYDVFNPLEVVPEMDCDLVKK KGEKIDYAIMKDGAPIILIECKHWHQDLLLHDTQLKKYFVASKAKFGVLTNGIRYLFYTD LENQNIMDDKPFLELDITDVKDYQLVELKKFHKSYFNIDDILSSASGLKYSTELKKIFAE ELVNPTPEFVKYFAKKVYDGVITSKVQDQFSELVKRAFNSYINELISKRLKTALNSEEQR 
EATETISAHAESEPVESAIGLKDGVVTTQEELDGFNIVRAIVRKSVDVSRVVYRDALTYF SILLDDNNRKPICRLYFNGKNKKYISTFKADKTETKHEIAGLNEIFSFEDELCEIIKIHD NK >gi|283510527|gb|ACQH01000092.1| GENE 11 12289 - 12414 59 41 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPFTRGKETETESTYVYNRQHKRLPRMQLTTNGGCIRQTQK >gi|283510527|gb|ACQH01000092.1| GENE 12 12761 - 13423 527 220 aa, chain + ## HITS:1 COG:sll0687 KEGG:ns NR:ns ## COG: sll0687 COG1595 # Protein_GI_number: 16331536 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Synechocystis # 22 205 4 180 185 60 29.0 3e-09 MAITKKQLTLLRKILQQNRLTIVRDMVESDIFSRLKDGDLAAFESLYKTYYAVLCKLAKT ITHSHELAEEIVDDLFFYLWDNHGQLQVESLQAYLFRSIRNNSEKACRSLAFRKGRVTDS IDNTLLCLHEYLSDPEHPLGWLLEEEMKNTAKKAVDELPTECRQVFELSRYEGKKYSEIA QELGISVNTVKYHIKNAIKTLSSRLPADVVGVLLFVVVWW >gi|283510527|gb|ACQH01000092.1| GENE 13 13528 - 14445 385 305 aa, chain - ## HITS:1 COG:STM2730 KEGG:ns NR:ns ## COG: STM2730 COG0338 # Protein_GI_number: 16766042 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Salmonella typhimurium LT2 # 35 226 27 207 285 85 30.0 1e-16 MPVIIPPIKSQGIKTKLVPWINDLILTSGVSLTARWIEPFFGTGVVGFNCPIMGTHIVGD TNPHIINFYQKVQNGEVTPYTMRAYLEREGALLEVADEDGYAHYRLVRDRFNKEHSPYDF IFLSRAGFNGMMRFNKQGCWNIPFCKKPERFAPAYITKICNQIENIARIIGGKKWVFNNV HFLETIAQAGENDLIYCDPPYYGRYVDYYNGWTEKDEEELFHALRQTRARFILSTWHHND FRQNEMVSRFWQQFNVVTKEHFYHNGGKTENRHAMVEAMVFNFDLQDKVEVAVPHRQFEL FAGMP >gi|283510527|gb|ACQH01000092.1| GENE 14 14448 - 15254 618 268 aa, chain - ## HITS:1 COG:no KEGG:Rcas_2237 NR:ns ## KEGG: Rcas_2237 # Name: not_defined # Def: restriction endonuclease EcoRV # Organism: R.castenholzii # Pathway: not_defined # 1 226 1 226 226 242 51.0 9e-63 MDKKVFQELLAEEVKSYKALLETPNNDWIVKGFIDVNKNIYGITNDTKVVSKVIEIILIP KLASFAERNGLVLEFPSAQNFYPDLTFKDKEGNLFAVDFKSSYYRDDKKVNGLTLGSYWG YFRDRQVKRNTDYPYDDYKCHLVLGVLYKQGVEDGKDMEVYSIDTLADIKSVIEHFIFFV QPKWKIASDQPGSGNTRNIGGVVDVDKLVNGEGTFAKYGEEVFNDYWMNYYNAADAQKAG 
MGKQHYHNLKTYIEYLEKMSKRLTELKK >gi|283510527|gb|ACQH01000092.1| GENE 15 15915 - 16946 651 343 aa, chain + ## HITS:1 COG:no KEGG:BT_1876 NR:ns ## KEGG: BT_1876 # Name: not_defined # Def: putative anti-sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 252 1 254 337 162 38.0 1e-38 MKTENEHIDNLIRHFVQGDITAEGLCELNNFVRQSEENRLYVRNLRELLFDRAAAKAHEE FDVEQALQRFHAHVHTGNTGEKQAQRLPIKLLVRVAAVFLIAVLPIVAYFLGMQKVNRQF AMIETNAPDGSQLSLVLPDGSKVKLNSGSTIRYSQGFGISDRKIYLHGEGFFCVKHNERM PLSIVTKDVVLHDLGTAFKVSNYQEDQHATISLYEGELSIDNLISQQKGLAVKPGERIVV NKTTGQLHKESVKETMDEGASMNTFLRRNHVLRRTNRAEVSNFPYLCSVFRTRVRCLSWF YNSKIINTSQFIGSPKRQGYVHVLQPTDRAELIYSHLCLISRV >gi|283510527|gb|ACQH01000092.1| GENE 16 17111 - 20713 1835 1200 aa, chain + ## HITS:1 COG:no KEGG:Slin_2121 NR:ns ## KEGG: Slin_2121 # Name: not_defined # Def: YD repeat protein # Organism: S.linguale # Pathway: not_defined # 13 835 26 903 1837 363 31.0 2e-98 MNLTLTAIKRLALSLLLLFVIGVSTLMSQTYYDRITNVQTPNVANLGVYEEIPITPFTGT PNISIPLYELNLEGLKFPITLSYHSGSVKPDQHPSWVGLGWTLNAGGVIYRTINEGPDEY SAPGIFFDYTHRGYYFQHSILDNDSWNNADYLKERARTTYTTYWDRAPDEFSFQFLEYSG KFWLSEKGEWKVQCSKPIKVEFNGNFVDIPKDLHWPPNEMRRNQTPTFAGFTLIDEKGNR FEFGYTSSAIEYSKKFFKQGLDIWFANAWHLTGIVLANGQNVRLNYERGAYIDQLYSSYM RVLDQEGKNPKWSSCNSSFASSQDMTISGELLSPVYLESINGENEDLEFNRTLSNDLRVQ SSKYHDYFTSMKLWPAFNLLMYDLNPEQKEDLDGCLSKLKWYKLSDITVERKNGYPNMHL VLDYDENPNQRLILKSIREEDMQHARTAKKYSFEYNRPDMLPPYLSNETDHWGYYNAREA SIMDLDGYYAKKEANPNCASLGLLTKITYPTGGYTRLVFEPHQYSKNLKLNRWEGCFTEN VDKYAGGVRIKEIYSSPTDRAEDETLVHRYYFSKSKDGRSSGVCGGQFKYFYRNNTIKPF GNKSSKRVSIFTSSSVLPACANSQGTSVGYTDVIETNADGSYQKYHYTNFDNGHLDERFE CNMQQGMENNEQYANKGQERGLLLRKQSFDNTGKLRQDEDISYVKNKIIENYVRAMNANC YNICGGEAGSYGEGVCYKIYTYSMLPATRREVFFENGDSLVKQFRYNYNARGILQETITT LPTNDKLRHFITYAEDMKATPTYDAMVNRHMVGVPIEQFTFRNSSLVDATLNTFKLNIKN SWPVLDKVYHGMFKLPYSKTEPVRFDGETVPQRYGEAEIQFNNYDASNNYREAIGKGGRR VVFFWQKGFKQPVAIFKNAQNNRDTIYEKRVNRVCKGLAVECKPYAKKEITFTTEQETEV EVEMFFPESDKKDFFMYVGVDQHTVNLLIDFRRDPQNRDRYYAKLGRISAGKHTLHVVVE 
YVGKEGKADESGYSVYAKVNCYYTVITSLPPRIVGYNNVFYEGFEHGRGKNTFPGGFHSY WGYRGLYRARLDIDPKKRYFIDYQVYHNGGWQYKCAPFTDGEYIINENNEPIDDIRIYPE DAEVMSFNYDGTGQITSRTDQRGYTYSYGYDAFGRLVSEHDTDWNTLRSYEYHYKNQEAK >gi|283510527|gb|ACQH01000092.1| GENE 17 20710 - 23958 2146 1082 aa, chain + ## HITS:1 COG:no KEGG:BT_1993 NR:ns ## KEGG: BT_1993 # Name: not_defined # Def: putative cell wall-associated protein precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 37 891 240 1126 1316 442 34.0 1e-122 MKMIRHYTFNWQATLLCLAVFLANPSTKCFAQDNTRSYIKTSVHVKRNIDGWLDNIQYVD GLGRPVQNVQIHVTPNKNNLADRIEYDVMGRKSKIWLPIPTMADYVETANMYNDYMGVYH EDTAPFELRTYESWQKGRKTANIGPGQAWAKHPSISSFHTNKSSGVLACKKYAVDEATGT LKEEGFYPSNQLLVQKLTDEDRNEKYVFADAEGRQLLVRRVDGNTYSDVYFVHDLRGNIR YVLQPMYQKQPSIDKFAFRYLYDETNRCIEKQLPGCEKTDYAYDAADRMVMSQTGNQRRT GHKTFYRYDALGRNTYVVERAVGNDSLVTIRKFYDDYSFLDQQGFNSPCFKVSAQRAQGL LTGREVAVYGCDSLLREVNLYDRKGRLTRQIRTNMRGGYDDTSYEYSFTDKPVSKLMVHS TANMPTLTQRYEYEYDHADRLAKVTYSLNNAEKKTLREIDYDELGRVNSETTGDIPELGQ VINYNIRSWPTSRYSPLLEEHICYEDATQVFMESVTPCYNGNICAYSSSVGKPQRGYLWG ICNITRSFSYCYDGLSRLKRVKYKDSDSPDHNYNTNYEYDLQGNLLKLQRNGLCDYDTYE PIDCISMEYDGNQLHKATDPCTDPAFPNALYFADGADQPEEYTYDANGNMTADLNKHIRK ISYNTQNLPQEVRFDDGSSIAYLYDADGNKLRTSYAIASYSESMPITQVMHGNVSSQTQQ PYMARNSIDYCGSAVYENGKLAYVPIDGGYITFADSTATSAPTFHYYVKDYLGNNRLVVR DDGFGEQMNHYYPFGALMSVSTQGAAQRYKYNGKELDRIHGLNLYDYGARQYDAALGRWT SIDPLAEKYYDISPYAYCANNPVVFVDPDGCKFDFSKMTKGEYQSYLGQINPLREASPLF NTMYSSLENSKETYFISFDRVPSVVKQAGDIDGHFVMDDNGGGSVVFDKEKLGNITNVVL AEELFHAYQHDNRKGYDKGEFNREFEAKIVAYFIGQDMVGSAAFDNTNGFVDRILDGNYG SAFQYLTPKAVMSKPFLDDYISSAEYFANYNIINKVGGVNYHLHTTVSPYSLQKIVLDTY NK >gi|283510527|gb|ACQH01000092.1| GENE 18 23970 - 24527 295 185 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260909780|ref|ZP_05916474.1| ## NR: gi|260909780|ref|ZP_05916474.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. 
F0295] # 1 177 81 257 262 318 91.0 7e-86 MKTKILAMFFLLPLFCSAQITMDDDCFDRSNRIFAKVILEVFDTSFVHKMVDNGQRFLLV LNVDTAGYVLGVRNGRGNFPETQVKEMTDKLREYFQTNMVQFPLCYVLQDIGLSSEDQLK LARKIFSEKKERLFGANFPGGLFFPYEADKRKGFKGSKFDYLLLRISQQKIPIKKKVSKG GKKDD >gi|283510527|gb|ACQH01000092.1| GENE 19 24790 - 25248 168 152 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260909782|ref|ZP_05916476.1| ## NR: gi|260909782|ref|ZP_05916476.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. F0295] # 1 152 1 152 152 291 92.0 8e-78 MLNSVYLMLVAMLLPGVSSKYQLMRDYTWTYQYTLIMNVDKRPVKKSEKRLISTQIMCGN GGCRTIAFKDSSIALSWKETGPIDTLKWSIKGDTLTLFTHFDHIEFMRKNTYEKFIVRVN KEQQKGDLVSFKCPNISYHFALEPTDVQTQGK >gi|283510527|gb|ACQH01000092.1| GENE 20 25375 - 25866 385 163 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929368|ref|ZP_06423213.1| ## NR: gi|288929368|ref|ZP_06423213.1| hypothetical protein HMPREF0670_02107 [Prevotella sp. oral taxon 317 str. F0108] # 1 163 1 163 163 325 100.0 5e-88 MKAKVIAFLLWACLPCLGQVVILLVVACKETDGSREYYPQLPVTSYALKTNGDTLNVVET DAKGGNGRVWKLYLKGDEYYMEWLGVEMLMMSTHQLTDSLYANSMKYQQYRVCIERMNDS LFACSFYQDIPTKWLLLRMYYDKDYNLTAIQREEAFVDYECER >gi|283510527|gb|ACQH01000092.1| GENE 21 25995 - 26477 213 160 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929369|ref|ZP_06423214.1| ## NR: gi|288929369|ref|ZP_06423214.1| hypothetical protein HMPREF0670_02108 [Prevotella sp. oral taxon 317 str. F0108] # 1 160 1 160 160 294 100.0 1e-78 MERKGLIKILMAITTIVVVLVSFMRYMEKGDELKFHFSSGIKSYTLKRQGDTLKLIENNG EQTRNRVFVMYRKGNDFYSALLGRERLVLSNRLTLDTIYKNSLVGAEVALAVKQEKDSLR SSFIFVSGECNFPRIKLFYDKEYNIKKIQSYELLLNYAPD >gi|283510527|gb|ACQH01000092.1| GENE 22 26464 - 26721 236 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929370|ref|ZP_06423215.1| ## NR: gi|288929370|ref|ZP_06423215.1| hypothetical protein HMPREF0670_02109 [Prevotella sp. oral taxon 317 str. 
F0108] # 36 85 1 50 50 94 100.0 2e-18 MRQIRAKYYGMSEFQDRIMKPCMVQRITPISREAVMSNAFLDDYMVSAEKYAKHNIEKEI GGINYRLHTKVMPYALQKIILDTYK >gi|283510527|gb|ACQH01000092.1| GENE 23 26881 - 27303 277 140 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929371|ref|ZP_06423216.1| ## NR: gi|288929371|ref|ZP_06423216.1| hypothetical protein HMPREF0670_02110 [Prevotella sp. oral taxon 317 str. F0108] # 1 140 1 140 140 265 100.0 7e-70 MLEHNQRLTLWLNIDSLGYVLSVKKAGGEMPQPQHEEAVSKLKSYFQQNMVQMPICYALE DIDRTYEEQLKLAIESFRESGLRTIIAFFPGELPLRYEMAKNSGYKGSMLDYLLLKLNEQ EIPIKKKVRVHEKKDIFVEP >gi|283510527|gb|ACQH01000092.1| GENE 24 27487 - 27963 292 158 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKILFYLLVIVCLTTSCVWQRTKRYKIPKTNKVLLIYRPWFTGAAYVTIRDSGATALKK SDVDVIRVPIYESTELNFVLEQSDPDKIYYVDPWDFATLHPRQKKYKKIRFEDKRFYQPR QPATRFDVRPGYIEVGVRAYANYVIYSEGKDYIELEPF >gi|283510527|gb|ACQH01000092.1| GENE 25 28037 - 29197 823 386 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929372|ref|ZP_06423217.1| ## NR: gi|288929372|ref|ZP_06423217.1| hypothetical protein HMPREF0670_02111 [Prevotella sp. oral taxon 317 str. 
F0108] # 29 386 1 358 358 752 100.0 0 MKFRTTLMNRQRFLHIVLAAFIQCSYGIMAHAQGFNAVEKIRNAYRVGDRMTGRQLLVGN VAGLLNDSICDLSQTQVANDRYTCRYEKGPNNDVISRLEHGTRYFFEHKGDSLLIMGFEN HLAKIDYDLPELWLKFPMHAGDSLSGLYHGCGLYGERWNVRKLGDYKTWAKLCRTMVLPE GDTLRNVLRLHTQRQVCTLNSPARMPGDSLPHFTTDSIRWHLANNKGVLVENVYRWYVAG WRYPILEAYVTGGTQSQAPLFTTAFYYSPREQECLSDDNPNTLIRKAVTQQEVNGNALAD NDKDLDYKVSVDSERKTIRLTYNLERETHVRLILSDNAGIVYKTSERKDTPGKTHVQEIG YGTLPRGRYVLYINANGKRYAEKFKH >gi|283510527|gb|ACQH01000092.1| GENE 26 29507 - 31441 2101 644 aa, chain + ## HITS:1 COG:VC0613 KEGG:ns NR:ns ## COG: VC0613 COG3525 # Protein_GI_number: 15640633 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Vibrio cholerae # 85 467 194 604 637 119 28.0 2e-26 MKKKLLLLLLLTANTALSTFAINKKPFTIPEVKEWKGTEGTLVLSGRIVCNGAEERRIAQ QLADDYRQMFGTALTLAQGKAQKGDIVLSLKAPKTLGEEGYIMTINAQATIAAANRKGLY WGTRTLLQLAEQSETRALPCGFISDAPDYPFRGFMLDCGRKFFSMPMLRQYVRMMAYYKM NVFHIHLNDNAFHWFYNEDWSRTPAAFRMESNWFPGLTARDGHYTKKDFIALQELADSVG VEILPEIDVPAHCLALTRFRPDLVSKDYPPDHLDLLNPNTITFCDSLFTEYLEGKNPVFR GKYVHVGTDEYSNKDQNVVEKFRAFTDNYIRLVEKFNRKAVVWGQQTHAKGTTPIKVKDV LMFSWSNDYSKPDEMMKLGYNLVSIPDGQVYIVPKAGYYFDYLNTKKLYDEWTPANINGM KFAERHKQIEGGAFAVWNDIVGNGISDKDVHYRVLPALQTMATKMWTGAKPTFAYDAFVG NARALSEAPGMNYAGYYPTGMVLQQPSVAPNAVQKIPQIGWNYRVSFDIEAQQEARGTVL FSFGDTHFYLSDPVAGKIGFSRDGYLNTFDYQLFPGEKVRMTVIGDKEKTSLYINNRLVS DLPVRKMNFGKRGDMYYISTLVFPLQKAGNFKSKITNLKAESME >gi|283510527|gb|ACQH01000092.1| GENE 27 31786 - 31977 204 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929374|ref|ZP_06423219.1| ## NR: gi|288929374|ref|ZP_06423219.1| hypothetical protein HMPREF0670_02113 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 63 1 63 63 87 100.0 2e-16 MKSKRTYEKPLAEIVAFNAYPIMYEASLPQGGEGEIGGAADSKADDFEEEPYWDVKDYNP WEN >gi|283510527|gb|ACQH01000092.1| GENE 28 31974 - 35519 3002 1181 aa, chain - ## HITS:1 COG:MA4292 KEGG:ns NR:ns ## COG: MA4292 COG3291 # Protein_GI_number: 20093081 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 291 906 762 1483 1995 142 27.0 3e-33 MKRFLDHIGTPLRHLLSGSSLMQCSVRGVCLLFLLIVGNLAAHAYVEGEYFIQGHITYRV IDASTSSPKLAVYNVRGVSGKVIIPATVFDGIDTHFTVTQIGGAGDNDPFRWDSGITEVV LPNTITTLACYCFSYSGITSLRLPASVTTIHPRANVLLDRCYKLKEILVDAGNPAFISDD GVLYTKGHEELICVPFNKDLSSKGNCFTINSNTKKVHVDAIMETPTLKKLVVPASVVDLY MGRWPTFAYSPRQLEEIVVDAGNQTYCSIDGVVYSKDKTKLIYYPAGKKNTTFKVPDEVK TIPYGFTISWNNYLTSIDLNNVDSVGEYASSVCYKLKEIRIPKTLTRIGEAAFGTFINLE KYVVDPDNPNYCSDADGVLFNKDKTKLLFYPTARQGEYTIPSTVTYIGKMAFIYSKITNI TIPAKVTTIGSEAFRNTKLTTVTFEEPSKLENLYVRAFLWCHELKTVTLPKSVKILGDAF AGCDKLETVNVPDGSELKAIWGGAFVGCDNLTNFNFLGSCKLEAIGSRVFADKQKLKEFN FPASVARIHDNAFGNTPAMEKVTFANNSAIVSFGKGAFANSGIKSIKIPTGVKSIDKDAF RRCNVLERVIVPAGCTNIHPEAFKFDSKLADIDVDPANPKYSSVQGILLSKDKSELLIFP PGKARTDFTLLPPSITKIGDFAFYEGNANFTNIVIPAKVNTIGKRAFGLNPALKTMTLLC DEMIPSSKIDQGTNTMAFDDGTVATDNAKQHITVYVRKNLLAQYKADPFWKQFTLKPSFT VKAEGTTAATDEYIPTSTTTVDFLSTTADVKTFVLPKTITYNDGATTTTYKVGLIGDYAF ENANANMKEVVVNADVDYIGAMAFVTKTKRVAKNTIQPVSTTISQVVFTGNTPATKLSRN YFSLGAPFSEFFRGAAGTGACTQKIYVKKSKLDGYKTAWGDYASALDYRIKGDGTSAFSI TNEFGTFSREFDVDFGDVDNGGNRMFWDVAKNCPKVIAFTSGEKVGKSVIHMKSINLGEG AATDGLYVPANTGVVLRAIGGSSPTDFYYRIGEDDKWSYSGTNILKPVTVNAKDITPNEG GNTNFYVSKGKAFRVTQAQQFKDEGKLTIGVHKAYININVPAGAKLTLLFDNDETTGIEE VGADDNPTSTSDDSYYDLSGRRISNPVKGVYIHKGKKVIVK >gi|283510527|gb|ACQH01000092.1| GENE 29 35882 - 36385 -37 167 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNRLETHTNPLRLFLFVGKDEGVMQRNEMLFCKGRMGDKTNHFLSFGRMQPTSCTKGTKD LAVPNNSSPCYCHIAAYLTNVFVKPNKQCRACSNIVMARKRTFREAKKNVFVKPNKQCRA CSNIVMARKRTLRGTKNNIFLFGRARDQPPPLVRPKQSLAFMPFLAW >gi|283510527|gb|ACQH01000092.1| GENE 30 36575 - 37831 1111 418 aa, chain - ## HITS:1 
COG:ECs4730 KEGG:ns NR:ns ## COG: ECs4730 COG0641 # Protein_GI_number: 15833984 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Escherichia coli O157:H7 # 12 400 11 403 411 382 46.0 1e-106 MATINPFGHPMYIMLKPAGSLCNLRCEYCYYLEKQQLYANCPTHIISDQMLEKFVREYME AQTTPEVLFTWHGGETLMRPISFYKRALQLQRAYARGRQVDNCIQTNATLLNDEWCNFFK ENNFLVGVSIDGPQEFHDEYRKTASGKPSFHKVMQGIRLLNKHGVEWNALAVVNDFNADY PLDFYHFFKEIKCRYIQFTPIVERIVTRNDGLRLAPGMYEGGKLAPFSVTPEQWGNFLCT IFDEWVRNDVGQYYVQLFDATLANWVGVAPGLCTMAAECGHAGVMEFNGDVYACDHFVYP EYKLGNLKDKTIYEMMTSPRQKEFAKLKNALLPRQCKECRFRFACHGECPKNRFVRDRYN QSGLNYLCKGYYQFFEHAAPYFEFMKRELQAQRPPANIMERLKRADRSDWSDRSDRSD >gi|283510527|gb|ACQH01000092.1| GENE 31 37969 - 38214 56 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929377|ref|ZP_06423222.1| ## NR: gi|288929377|ref|ZP_06423222.1| hypothetical protein HMPREF0670_02116 [Prevotella sp. oral taxon 317 str. F0108] # 1 81 1 81 81 136 100.0 4e-31 MRQYNMKHVPSKGKGRLERPTTKASRLCTASCFVCQQPHGEASGGGVYKAPQPTYAQGTG LCATHCTAQQLQSTPRRLTFF >gi|283510527|gb|ACQH01000092.1| GENE 32 38295 - 39065 762 256 aa, chain + ## HITS:1 COG:no KEGG:BVU_3832 NR:ns ## KEGG: BVU_3832 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 8 245 11 241 251 223 49.0 4e-57 MKQLHSWLAGVGAVVAVFTFVACQNDKEEDMQSYPTALVTVKPKALDNSFVLQLDDSTQL IPSKMKTSPYGSKEVRALVNYNVVAGPSAKKEDGRVDERTYVNINWLDSIRTKPVAKNMG DNNVKTYGNDPLEIVNDWVSIAEDGYLTLRFRTRWSKGAKHMLNLVATPTKDNPYKVTLY HNAYNDLFGRMGDGLIAFRLAGLPDTGKKYVWLTVEWESFSGKKMAKFKYATRKDNGTPR GKMAILTAPVQITSVN >gi|283510527|gb|ACQH01000092.1| GENE 33 39139 - 39639 535 166 aa, chain + ## HITS:1 COG:mll8140 KEGG:ns NR:ns ## COG: mll8140 COG1595 # Protein_GI_number: 13476734 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 22 157 36 176 208 74 32.0 7e-14 MQDFYTLYADYLTSVCARYVDNDDDLKDVLQDSMVKMLTSIGQFDYRGPGSLQAWATRIV 
VNQSLSFIKRKRTAAIVSLDIDLPDEPEKDDPPIDHIPPEVIHQMVRELPTGYRTVFNLY VFENKSHQEIAQLLGIKRDSSASQLHRAKNMLAKKIEQYNKTNLAQ >gi|283510527|gb|ACQH01000092.1| GENE 34 39636 - 40961 1260 441 aa, chain + ## HITS:1 COG:no KEGG:BVU_0051 NR:ns ## KEGG: BVU_0051 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 245 441 229 431 431 125 36.0 4e-27 MNEQWRTNMREKLDGHRQAAPLLSWDEVERRAAAAEKSAKVAAWGRRAMAAAAMVIFLCG VAATLLLLNNGQNAQQVANLERHYAPQHNTTKPADNGTNAVAHNINAEKGTRTANTAGSD NTQYYYASDNRNIVNQETNLYAAVPANNQQKASQPQANIETTTDTQQSITDYTSKNSTDK APLGQQTTRTQPNRTESLPAKLPRNTAKPTRAGGAWMAKAYVSGGAMGANNSQMQRPMLA TAHTYGAPFDEEIRNGEYVRLENTSPAVETKAHHRQPLRVGAAVRYALNNRWSIDAGLTY ARHRSDITRKMGNVVNETQQTLHYLGVPLNVNYRIVGNRRFNVYAAAGVTGEKLLKGKRK TVTQNEDMPDDVQTESVKENRAQFSVNAAIGAEYKIDDRLSVFAEPGISHYFDNGSELSS IYKERPTNFNLNVGLRFCVNK >gi|283510527|gb|ACQH01000092.1| GENE 35 42584 - 42976 83 130 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929381|ref|ZP_06423226.1| ## NR: gi|288929381|ref|ZP_06423226.1| hypothetical protein HMPREF0670_02120 [Prevotella sp. oral taxon 317 str. F0108] # 87 130 1 44 44 81 97.0 2e-14 MKTKATNNLNSAQLLINNKQYTTSVHCSYYAVFQSMKYVLANTSINPIPLATQDSHLGES SHEYVLLEIKNRLQSSPRNERRFAETVRFLKKERVDADYKSREFTDEESLDCLDKAKKIV ADLKTYFGEL >gi|283510527|gb|ACQH01000092.1| GENE 36 42973 - 43407 265 144 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929382|ref|ZP_06423227.1| ## NR: gi|288929382|ref|ZP_06423227.1| hypothetical protein HMPREF0670_02121 [Prevotella sp. oral taxon 317 str. F0108] # 1 144 1 144 144 263 100.0 2e-69 MKKENIEKKLRSWLSEMTKKYCWLRIKFEFSEVEGVYMVSFSPASKIERSDDFNKDAMQF ADDMDNEFGPDAPLFTEEEELFKLSETAETMVGHISTTTIAVGPVKVHNDNSWTSPIAPQ VYVEVKKNFRPSASSKEINYASAA >gi|283510527|gb|ACQH01000092.1| GENE 37 43410 - 43835 367 141 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929383|ref|ZP_06423228.1| ## NR: gi|288929383|ref|ZP_06423228.1| hypothetical protein HMPREF0670_02122 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 141 1 141 141 263 100.0 2e-69 MKQIPFRIVGLQVESFVLNDNQLREKKAVAVNTSYEFGVNATNHLVMVRIVYKYFQDEAE LLQLALVSTFDVKQEDFDEMLIDNKFTLDPFLSQYLSTINVGAARGEIHARCEVAKSKLA EFILPPINLVEALPNPIEIEV >gi|283510527|gb|ACQH01000092.1| GENE 38 44040 - 48374 5304 1444 aa, chain - ## HITS:1 COG:VC0329 KEGG:ns NR:ns ## COG: VC0329 COG0086 # Protein_GI_number: 15640356 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta' subunit/160 kD subunit # Organism: Vibrio cholerae # 1 1408 5 1362 1401 1314 48.0 0 MAFKKDTKVKNNFTKITIGLASPEEILENSYGEVTKPETINYRTYKPERDGLFCERIFGP TKDYECACGKYKRIRYKGIVCDRCGVEVTEKKVRRERSGHIELVVPVAHIWYFRSLPNKI GYLLGLPTKKLDTVVYYEKYIVIKPGVLEGRKDSEGEDLNGSHQMDLLSEDEYLNILETQ VDPNNEFLDDSDPNKFIAKMGAEAVYDLLAGLDLDALSYELRDRANNDSSQQRKTEALKR LQVVEAFRSSTEINRPEWMILKIIPVIPPELRPLVPLDGGRFATSDLNDLYRRVIIRNNR LKRLVEIKAPEVILRNEKRMLQEAVDSLFDNSRKSSAVKSESNRPLKSLSDSLKGKAGRF RQNLLGKRVDYSARSVIVVGPELKMGECGLPKLMAAELYKPFIIRKLIERGIVKTVKSAK KIVDRREPVIWDILENVMKGHPVLLNRAPTLHRLGIQAFQPKLIEGKAIQLHPLACTAFN ADFDGDQMAVHLPLSNEAILEAQVLMLQSHNILNPANGAPITVPSQDMVLGLYYITKIRP GAKGSGLTFYGPEEAIIAHNEGRCDLHAPVKVVVTDLVDGKLQPRMVETSVGRVIVNGII PDEVGYFNEIISKKTLRGIIADVIKSVGMARACEFLDGIKNLGYRMAYVAGLSFNLDDII IPKEKEEIVERGHEEVRQITENYNMGFITDTERYNQVIDTWTHVNNDLGNVLMKEMTEAD QGFNAVFMMLDSGARGSKDQIKQLSGMRGLMAKPQKAGAEGQQIIENPILSNFKEGMSVL EYFISSHGARKGLADTAMKTADAGYLTRRLVDVSHDVIINEEDCGTLRGLVCTALKNGDE IISTLYERILGRVSVHDIIHPTTGELIISAGEEITEPIARIINDSPIESVEIRSVLTCES KHGVCMKCYGRNLATSRMVQMGEAVGVIAAQAIGEPGTQLTLRTFHAGGVAGNAAANASI VAKNDSRIEFDELRTVPFVEDGEDGERNCEMVVSRLAEIRFVDPNTSIVLSTLNVPYGAS LYNKHGDVVKKGTLIARWDPFNAVIVSEYAGTLKFHDVVEGVTFRAETDETTGLTEKIIT DAKDRSKVPTCDIVAANGEVLGTYNFPVGGHVVVEDGAKIKTGTTLVKIPRAVGGAGDIT GGLPRVTELFEARNPSNPAVVSEIDGEVTMGKVKRGNREIVVTSKTGDQKKYLVSLSKQI LVQEHDAVRAGTPLSDGVITPNDILAIKGPTAVQEYIVNEVQDVYRLQGVKINDKHFEII VRQMMRKVQINDPGDTTFLEQELVDKLDFSEENDRIWGKKVVTDAGDSETMRAGQIVTAR RLRDENSSLKRRDLRPVQVRDAVPATSTQILQGITRAALQTKSFMSAASFQETTKVLNEA AIRGKVDRLEGMKENVICGHLVPAGTGLRQWERIIVGSKEEYDRMQANRKNVIDYSDKET ATQE 
>gi|283510527|gb|ACQH01000092.1| GENE 39 48401 - 52213 2916 1270 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 [alpha proteobacterium BAL199] # 9 1270 16 1388 1392 1127 45 0.0 MASKIVDNRVNFASVHNPMPYPDFLDVQLKSFKDFLQLDTPPEERKNDGLYKVFAENFPI TDTRNNFVLEFLDYYIDPPRYTIDECLERGLTYSVPLKAKMKLYCTDPDHEDFGTFIQDV FLGTIPYMTSNGTFVINGAERVVVSQLHRSPGVFFGQGVHANGTMLYSARIIPFKGSWIE FATDINNVMYAYIDRKKKLPVTTLLRAIGYENDRDILQIFDLAEEVKVNKKNMKEALGKR LAARVLKSWNEDFVDEDTGEVVSIERNEVIMERETEITEENVGEILDSGVSTVLLHKDAD AANKFSIIFNTLAKDPSNSEKEAVLYIYRQLRNADPADDASAREVFQNLFFSDKRYDLGE VGRYRINKKLGLDTDMDTRVLTKDDIIEIIKYLIQLVNSNATVDDIDHLSNRRVRTVGEQ LANQFSIGLARMSRTIRERMNVRDNEVFTPTDLINAKTISSVINSFFGTNPLSQFMDQTN PLAEVTHKRRLSALGPGGLSRERAGFEVRDVHYTHYGRLCPIESPEGPNIGLISSLCVYA NINELGFIETPYRKVSNGVVDLNNKHLVYLTAEEEEDHIIGQGNAPLNEDGTFIRDVVKC RQDADFPVVEPAEVDLMDVSPQQIASVSAGLIPFLEHDDGHRALMGCNMMRQAVPLLHND APIVGTGLEKQVCEDSRTMITAEGDGVIDYVDATTIRILYDRTEDEEYVSFEPALKEYRV PKFRRTNQNMTIDLRPICVKGQRVKAGDILTEGYATENGELALGRNLLVAYMPWKGYNYE DAIVLSERLVRDDVLTSVHVDEYSLDVRETKRGVEEFTSDIPNVSEEATKDLDDNGVVRI GARIEPGDILIGKISPKGESDPSPEEKLLRAIFGDKAGDVKDSSLKANPSLSGVVIDKKL FSRAIKTRDSKKQDKVILAKIDEEYEAKNEDLKDILVDKLLELTEDKVSQGVKDYTGAEI INKGSKFTAAALKNLEYDGIQSNDWTDDAHTNDLIQKLIMNYIRKYKLLDAELKRRKFAI TIGDELPSGILQMAKVYVAKKRKIGVGDKLAGRHGNKGIVSKVVRMEDMPFLEDGRPVDL VLNPLGVPSRMNLGQIFEAILGAAGKKLGVKFATPIFDGAQLEDLSEWTDKAELPRLCST YLYDGETGEKFDQPATVGVTYFLKLGHMVEDKMHARSIGPYSLITQQPLGGKAQFGGQRF GEMEVWAIEAFGASHILQEILTIKSDDVVGRSKAYEAIVKGEPMPTPGIPESLNVLLHEL RGLGLSIKMD >gi|283510527|gb|ACQH01000092.1| GENE 40 52437 - 52817 508 126 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|153805949|ref|ZP_01958617.1| hypothetical protein BACCAC_00194 [Bacteroides caccae ATCC 43185] # 1 126 1 124 124 200 84 3e-50 MADIKAIAEELVNLTVKEVNELANVLKEEYGIEPAAAAVAVAAAPGAGAAAGGAEEKSSF DVVLVEAGAAKLQVVKAVKEACGLGLKEAKDLVDGVPSTLKEGMSKDEAENLKKAIEEAG AKVELK >gi|283510527|gb|ACQH01000092.1| GENE 41 52928 - 53446 629 172 aa, chain - ## PROTEIN SUPPORTED ## NR: 
gi|160883075|ref|ZP_02064078.1| hypothetical protein BACOVA_01038 [Bacteroides ovatus ATCC 8483] # 1 172 1 172 173 246 72 2e-64 MKKEVKDTIIVELGKKLKEYPHFYLVDLAGLNAEATNRLRATCFKKEIKLSVVKNTLLHK AFEASDIDFEPLYGTLKGTTAIMFTTVANEPAKLLKDYKKEGIPALKAAYAEESFYVGAD KLDELVALKSKNEVIADIVALLQSPAKNVVSALQSSAGTIHGVLKTLGERAE >gi|283510527|gb|ACQH01000092.1| GENE 42 53465 - 54157 1015 230 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|150003398|ref|YP_001298142.1| 50S ribosomal protein L1 [Bacteroides vulgatus ATCC 8482] # 1 230 1 230 232 395 84 1e-109 MSKLTKKQKSVADKIEAGKAYTLKEAAELVKEITTTQFDASLDIDVRLGVDPRKANQMVR GVCTLPNGTGKTVRVLVLCTPDAEAAAKEAGADYVGLDEYIEKIKGGWTDIDVIITMPSV MGKVGSVGRILGPRGLMPNPKSGTVTMDVAKAVKDIKSGKIDFKVDKAGIIHTSIGKVSF TPEQIYQNAKEFVATVIKLKPAAAKGTYIKSIFISSTMSKGIKIDPKSVE >gi|283510527|gb|ACQH01000092.1| GENE 43 54178 - 54618 635 146 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|150003399|ref|YP_001298143.1| 50S ribosomal protein L11 [Bacteroides vulgatus ATCC 8482] # 1 145 1 145 147 249 80 5e-65 MAKEVAGLIKLQIKGGAANPSPPVGPALGSKGINIMGFCKEFNARTQDKAGKVIPVVITY YTDKSFSFELKTPPAAVQLKEVAKVKSGSAQPNRQKVASVTWEQIKVIAEDKMKDLNCFT VESAMKLIAGTARSMGITVNGEFPGK >gi|283510527|gb|ACQH01000092.1| GENE 44 54676 - 55221 552 181 aa, chain - ## HITS:1 COG:AGc3576 KEGG:ns NR:ns ## COG: AGc3576 COG0250 # Protein_GI_number: 15889255 # Func_class: K Transcription # Function: Transcription antiterminator # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 8 179 5 175 176 133 39.0 2e-31 MAETQKNWYVLRAVSGKEAKVKEYIEAELKHNTMLQSHVFQVLIPMEKQASLRNGKRVEK EKISLPGYVFVEADLKGDVAHTLRFMPNVLGFLGGLDNPSPVPQADINRMLGAAENTELE NDIDIPYSVDETVKVTDGPFSGFMGVIEEVNTEKHKLKVMVKIFGRKTPLELGFMQVEKE G >gi|283510527|gb|ACQH01000092.1| GENE 45 55236 - 55427 114 63 aa, chain - ## HITS:1 COG:no KEGG:PRU_2141 NR:ns ## KEGG: PRU_2141 # Name: secE # Def: preprotein translocase subunit SecE # Organism: P.ruminicola # Pathway: Protein export [PATH:pru03060]; Bacterial secretion system [PATH:pru03070] # 1 62 1 62 63 90 74.0 1e-17 
MFKKIVNYCKACYNELAHNTTWPTRAELTHSAMVVLSASLVIALVVFAMDSAFKFIMSGI YPH >gi|283510527|gb|ACQH01000092.1| GENE 46 55588 - 56778 1388 396 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 396 1 407 407 539 66 1e-152 MAKETFVRTKPHVNIGTIGHVDHGKTTLTAAISKVLHEKGFGSEDVKSFDQIDNAPEEKE RGITINTAHIEYETAKRHYAHVDCPGHADYVKNMVTGAAQMDGAIIVVAATDGPMPQTRE HVLLARQVNVPKLVVFLNKCDMVEDEEMLELVEMEMRELLDQYEYDGDNTPIIRGSALGA LNGVDKWVDSVMQLMDAVDTWIPLPPREVDKPFLMPVEDVFSITGRGTVATGRIETGKVK VGDEVELLGLGEDKKCVVTGVEMFRKLLEEGEAGDNVGLLLRGIDKNEIKRGMVLCHPGQ IKPHKKFKASVYVLKKEEGGRHTPFGNKYRPQFYLRTMDCTGEITLPEGVEMVMPGDNVE ITVDLIYAVALNVGLRFAIREGGRTVGAGQITEIIE >gi|283510527|gb|ACQH01000092.1| GENE 47 57372 - 57671 188 99 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163755828|ref|ZP_02162946.1| 30S ribosomal protein S21 [Kordia algicida OT-1] # 1 97 4 100 102 77 41 3e-13 MEIKIQAIHFDATEQLHAFIEKKVSKLEKTYEQVQKADVSLKVVKPATAMNKQASITVTF PGNTAFAEKTCDTFEESVDQCVDSLKVQLTKIKEKQRER >gi|283510527|gb|ACQH01000092.1| GENE 48 57698 - 58579 827 293 aa, chain - ## HITS:1 COG:SA1095 KEGG:ns NR:ns ## COG: SA1095 COG4974 # Protein_GI_number: 15926835 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Staphylococcus aureus N315 # 4 293 6 292 298 183 37.0 4e-46 MTTEQFLNYLQYELNRSELTIASYGDDLRAFEAFFKNKDTQLSWEAVDADLIRNWMESMM DKGCSATTIQRRLSALRSFYRFGLTRGYVEKNPARGVVGPKRSRPLPQFLREGEMDRLLD EVPVGESYRDLLAYTIVLTFYSTGMRLAELVGLNDDSIDFGARVIKVLGKGSKQRLIPFA DELEAALKAYIARRDAEVARCDKAFFVDQRGERVKRGAVQNSVRASLAKVTSMKKRSPHV LRHSFATAMLNNDAGLESVKKLLGHESLSTTEIYTHTTFEQLRKIYDKAHPRA >gi|283510527|gb|ACQH01000092.1| GENE 49 58659 - 58850 268 63 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|150003404|ref|YP_001298148.1| 30S ribosomal protein S21 [Bacteroides vulgatus ATCC 8482] # 1 63 1 63 63 107 82 2e-22 MIIVPVKDGENIERALKKFKRKFEKTGVVKELRARQQYDKPSVLKRLKMEHAVYVQKLRT MEE >gi|283510527|gb|ACQH01000092.1| GENE 50 59056 - 59376 56 106 aa, chain 
+ ## HITS:0 COG:no KEGG:no NR:no MPKGVKALWNYSQTELDLESHSCIVSKQKSSFSSALGFIDAMQRTSCILLTTIFQMQFCS AISHEFRNGHNFRPCHFNPKRSLTPVHILGFAEQMTVLFLKAQRTT >gi|283510527|gb|ACQH01000092.1| GENE 51 59886 - 60368 -406 160 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHHFTLHLAPFYPAFCTKTHCILHQNALCFAPKRTAFSGILHHILPKIAPNLVLMAVVCN KYSFCCIRGLTRFCPKINSRENRFFAARLAIGDENGAHNVKSLADKLTKASPPTSLQKVE TSPPNPLQRVKASPPTPLQGVKASPPTPLQGERGVICVVG >gi|283510527|gb|ACQH01000092.1| GENE 52 61600 - 64830 3512 1076 aa, chain + ## HITS:1 COG:no KEGG:PRU_1874 NR:ns ## KEGG: PRU_1874 # Name: not_defined # Def: putative receptor antigen RagA # Organism: P.ruminicola # Pathway: not_defined # 1 1076 1 1085 1085 1173 57.0 0 MIKRLILFLTCLFLTAGMALAQTRVGGVVVALPDNEPVVGASVKVVGTTNGTVTDIDGKF SLSVPANAQLEFSYIGMKSKVLTAKANMRVELESVDNTIDEVMVVAFGTQKKSAFTGSAA VVSSKDLAKRVSTNVSSTLAGTVAGFQMREASGEPGAGNGKMNIRGISSLEASTDPLIIV DGAPYPGSLSNIPQQDIESVTVLKDAASAALYGARGASGVILITTKRGNTSEAVVTVDMK WGSNSRAIQDYDVIKDPGQYYETYYGMLRNKYFYTDGMSADDAHARANKEMLAQLGYNVY TVPTGEQLIGTDGKLNPKATLGRSYKTADGETYYMTPDDWTKEAYRNSFRQDYNVSVSGG NTRSSFFASAGYLKDNGVILYSDYERYTARLKADYQAKKWLKLGGNVDYVHSSKNQNPNM DDQLNSSNLLYFTSMIAPIYPLYVRTLDANGNPVIRTDKNGNPHYDYGVPGNNFVGNAPR AFSAQGNPLGANRYNQKNFLVNQFNGTFNVDVIFTDWLKFNATSNLNFQHITRSDYDNSL YGSKVNVNGELTKFQYYYLRTNNVQTLDFHKAFGDHDVNVLAGHEYFNEQLRYLNATATG GFSPNILELDAFATKSNSRSYGERYNVEGWFSRVLYNYKEKYYGSLSYRRDASSRFKKEN RWGNFWSVGGAWNIEKESFFNVPWVNHLKLKVSLGQQGNDALGNNWYFADQYTLSKSSAT AMAPSFRLPGNPELTWETTTNLNVGLEFSLLKNFLQGSVDVYSKKISDMLFWLSLPELSG TRGYYGNIGDMRNSGVEVTLTVTPIRTRDITWSITGMLAHNATKILKLPRQKTQDYGGYV DNGTGHECWYEVGKPLYNMLLPEYAGVNEHGQALYYVDTDVAVKDRGSFPNKKRNGTTTN ISEASFYEQGSVMPKASGGFSTTVEAYGFDATVAFDYQLGGKVYDSRYAALMNPVESQAS GSNFHKDVLKSWTPNNTASNIPRFQYGENYSAKSSNRWLTNATYLNFQSFTVGYTLPKTF TNKYKISTLRVYVAGENLCFWSARKGFDPRYSFDGNKSVGSYSPTRNITGGLQVTF >gi|283510527|gb|ACQH01000092.1| GENE 53 64860 - 66536 1562 558 aa, chain + ## HITS:1 COG:no KEGG:PRU_1875 NR:ns ## KEGG: PRU_1875 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: 
not_defined # 7 553 4 556 561 598 56.0 1e-169 MKHIAVKLIASFWACALLSGCIEEVDPQGRTITGKQLGEAPNAFEKAVDATTSTLVGQFY YRPTATFVFDAGYPCFFLERDAMGQDIAVVGKNNQYDSWYSVSEVLGPGYAVCQLPWTYY SKWLKNCNIVIGMAGENPEPARRTGAGIARAMRAMFYMDIARMYAAEPYSLNKKAETTPL VTEKTTNEDLYNNPRKSNEEMWAFILSDLDKAEELLKGYVRPDIRKPDVSVVYGLKARAY LETQDWANAEKYAKLAQEKYTMMDEAAYTSKELGFNSPNKAWMFGVKYNEEDPNIRLNDA DSSWGSVMLNESLSGCGYASNYGGLMNIDRHLFETIPTTDWRRKVFVDFALDDLSEDAQR EALRTYAYKEDVKYCKQIMASAAAAQLNVGGIQLKFRNAGGVAGRENQHVGFAVWAPLMR VEEMKLIEIEAVGMQDEGKGIALLTEFAKTRDPNFVYGKHNEAYGNTATSAFQNEVWWQR RVELWGEGFATFDIKRLNKGIIRNYPRTNHTAGYRWNTTAYPKWMNLCIIETELRYNRAC TNNDIPIHPQKDSEEHQF >gi|283510527|gb|ACQH01000092.1| GENE 54 66610 - 67437 796 275 aa, chain + ## HITS:1 COG:no KEGG:PRU_1876 NR:ns ## KEGG: PRU_1876 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 1 273 4 275 276 146 34.0 8e-34 MKHSIINYAWALLGLAFMAACADEQGSEPGNDNTAVATIYQLDAKDPLNPDNDMLLRVAT NSKTDAVYYASIPKAEFETKMQQGGEEAIKDFVIASGTKVKDLSGGAVADVAVEGMTGEY KVAVVATGAGARTLKVFDFTGLSWSDMAKGTYTFALVSFGENKVKVIKDVVLQKCENREG LYRLNKVYNGKSNIKFTLLNAKQDDYQFVAVSTQATALNGVYIRDLATKENNQAFATDPE KGCKLYKDNKVTLVLNYFTAKNDYGAVVETFVPNP >gi|283510527|gb|ACQH01000092.1| GENE 55 67727 - 67918 112 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929401|ref|ZP_06423246.1| ## NR: gi|288929401|ref|ZP_06423246.1| hypothetical protein HMPREF0670_02140 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 63 1 63 63 85 100.0 1e-15 MKGRVMGLALKAVNDFTAELCLVTTLVMGLVLKAAGDFTAELCLATALVMGLVLKAVGGR GTH >gi|283510527|gb|ACQH01000092.1| GENE 56 68004 - 68957 1085 317 aa, chain + ## HITS:1 COG:TM0177 KEGG:ns NR:ns ## COG: TM0177 COG1284 # Protein_GI_number: 15642951 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 10 287 5 281 283 142 33.0 1e-33 MLTLKRYTLLKDLVLLTIAMFIGSLGWAIFLLPNHITTGGIPGLASILYWGWNIPVQVTF LGVNALLLLIALKVLGWQFCLKTLYAAVMFTLFSALGQHWMEGSTLLQGQPFMAIVLGAA SLGSTVGLGLASNASTGGTDVVAAMINKYHDISLGKLILLVDISIVTASYLALRNWEEVI YGYVFLFIFSICVDKVVNMMHQSVQFFIISNKYEQIGRAINVNAQRGCTTISGHGFYSGR EVKMLFVLARQTESGKIFSIINEIDPAAFVSQSSVIGVYGLGFDKFKAKRKKNKMMKEQA ERLGVDVSPDESLATKA >gi|283510527|gb|ACQH01000092.1| GENE 57 69044 - 69358 459 104 aa, chain + ## HITS:1 COG:no KEGG:BVU_0810 NR:ns ## KEGG: BVU_0810 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 2 102 3 96 113 99 49.0 2e-20 MDQKQPVPFEMDLKPEVASGVYSNFVLISHSPSDFILDFARILPGMPKPEVASRIIMAPE HAKRLLMALQENIFKYEQEFGKIQLPHEGSRTIAPFNVGGNGEA >gi|283510527|gb|ACQH01000092.1| GENE 58 69817 - 70191 260 124 aa, chain + ## HITS:1 COG:PA0563 KEGG:ns NR:ns ## COG: PA0563 COG3152 # Protein_GI_number: 15595760 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Pseudomonas aeruginosa # 11 121 8 111 117 81 35.0 3e-16 MDIIGNYKNVLTKKYSDFKGRAGRTEYWLFVLVNVAISIVYQILVSVSGDNATARLVISA IFGLFFLAILVPGLAISVRRMHDIGKGGEWFFINFIPVIGGIWFLVLCIKEGEPTANRFG EPQP >gi|283510527|gb|ACQH01000092.1| GENE 59 70759 - 72117 786 452 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 1 447 1 446 458 307 36 1e-82 MNKFDILIIGSGPGGYRTAEYAARKGLKVAVFEKDQPGGTCLNSGCIPTKTLCKHAEVAD TVREAAQYGVAIKDAAFDIDMQAVVARKEEITEKLRQGVEQLMSMPGVTFVRGEARFTAN KTLVANGEEYTADNIIIASGSSAKVLPVEGAQLKGVVTSTELLCLNHVPRSLCIIGAGVI GMEFASIYRSFGCEVMVVEFLKECLPALDSDIAKRLRKQLEQRGVQFALQSGVTKIEQTE 
DNALRVHYQKKGKEAFADAELVLMATGRAANVDALGLENTDICYTKAGITTDDNMQTNVP GVYAIGDVNGKQMLAHAATFQGFRAVNHIVGHADRILFNVVPAAIFTHPEVGSVGLSEDQ CREQGVAYKCRKGYYRSNGKANAGNATEGMIKLMTDEQDRILGCHLYGENAAFIAQEVAV LMNFGATLTQLGEIVHTHPTLSEILQDMAFMP >gi|283510527|gb|ACQH01000092.1| GENE 60 72104 - 73585 1555 493 aa, chain - ## HITS:1 COG:lin1387 KEGG:ns NR:ns ## COG: lin1387 COG1003 # Protein_GI_number: 16800455 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain # Organism: Listeria innocua # 9 489 9 488 488 513 55.0 1e-145 MNNKLYGNLIFELSRPGCRGFHLPNNPFGKHPIANELKRGSDAQLPECDEVTVVRHYTNH SGNNFGVDNGFYPLGSCTMKYNPVINEEVAAMPAFANLHPAQPADTCQGALAALYGLQKA LSEISGLHEFTLNPFAGAQGELTGLMIIRAYHQANNDTKRTKVIVPDSAHGTNPASAAVC GLDVVEVKSTPEGTVDTEHLKQLLGDDVAAMMMTNPNTLGLFEKDIPAITKIVHDCGGLM YYDGANLNPLLGECRPGDMGFDVMHINLHKTFSTPHGGGGPGAGPVGVRKGLEQFLPSPH VVRKGERYEIEELPFGFAPDAPTMNLGTFFGNFAVLVRAYTYILTLGKENLKNVGKLATL NANYIKEALKDVCDLPIEGLCKHEFVFDGLKNKDTGIRTMDVAKALLDYGYHAPTIYFPL LFHEAMMIEPTETESKATIDGFIEVMRSIANDAERNPQRLKDAPLNTPVGRVDDVLAAKH PITTYKQSIDEQV >gi|283510527|gb|ACQH01000092.1| GENE 61 73599 - 74921 1469 440 aa, chain - ## HITS:1 COG:lin1386 KEGG:ns NR:ns ## COG: lin1386 COG0403 # Protein_GI_number: 16800454 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system protein P (pyridoxal-binding), N-terminal domain # Organism: Listeria innocua # 2 436 3 442 448 400 47.0 1e-111 MQYKFFPHTEEDIKLMLDKIGVPSLDALFAEVPEEIRFRRDYDLPSAKSEIEIRNLFGNL GKQNKQLTVFAGGGVYDHYTPSIVPYLVSRSEFLTSYTPYQAEISQGTLHYIFEFQSMMA ELTGLPIANASMYDGSTATAEAAIMAVASGKKANKVLVSETVDDKILAVIRTYTHFQGVQ IEVVPAEDGSTSRTAMQEKLQQGGVAGMIVQQPNKYGIIENYDGFADDCHQQKALFVMNS VAADLALLKSPGELGADIAVGDGQSLGLPMQFGGPSVGYLCCTEKLMRKMPGRIVGQTKD NRGQRAFVLTLQAREQHIRREKATSNICSNQSLMALWVTIYLSVMGKEGLKEAAQMSVDG AHYLYEKLIHSGKFKPAFNQPFFNEFCVKYVGGNLDQLQQRLLDNGILGGIKMADDMLML AVTEKRTKEEVDLLVSLVTQ >gi|283510527|gb|ACQH01000092.1| GENE 62 74942 - 75322 645 126 aa, chain - ## HITS:1 COG:TM0212 
KEGG:ns NR:ns ## COG: TM0212 COG0509 # Protein_GI_number: 15642985 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system H protein (lipoate-binding) # Organism: Thermotoga maritima # 10 124 6 121 124 112 50.0 2e-25 MAKFIEGLYYSESHEYVRVEGEYAYIGITDYAQNALGNVVYVDLPEVDDEIEAGEDFGAV ESVKAASDLTSPVSGVVVETNEALEDEPELINKDAFANWIMKVKMADTTELDNLMDAKAY EAFCNK >gi|283510527|gb|ACQH01000092.1| GENE 63 75605 - 76696 1403 363 aa, chain - ## HITS:1 COG:BH2816 KEGG:ns NR:ns ## COG: BH2816 COG0404 # Protein_GI_number: 15615379 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system T protein (aminomethyltransferase) # Organism: Bacillus halodurans # 2 359 3 363 365 323 46.0 3e-88 MEDKKTCLHDRHVALGALMQPFAGYDMPIQYSSITDEHNAVRNHCGVFDVSHMGEVYITG NEAEKYVNHIFTNDIAGAPVGKVFYGMMLYPDGGTVDDLLVYKLGENEFFLVINAANIDK DVDWIRQNATGYDVAIDHCSDYYGQLAVQGPEAEQVMEEVLGLACKDLEFYTAKTIATHG ANVIVSRTGYTGEDGFEIYGPHEFIVEQWDKLMASKRCVPCGLGCRDTLRFEVGLPLYGD ELSNEISPVMAGFSMFCKLDKEEFIGKEAVAKQKADGVEKKVVGIELKDKAIPRHGYDVV KDGVKVGEVTTGYHCISVDKSVCMALVDSQYAKLGNELEIQIRKKTFPGTVVKKRFYDKH YKK >gi|283510527|gb|ACQH01000092.1| GENE 64 76712 - 77623 1044 303 aa, chain - ## HITS:1 COG:BH0683 KEGG:ns NR:ns ## COG: BH0683 COG0095 # Protein_GI_number: 15613246 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate-protein ligase A # Organism: Bacillus halodurans # 1 301 18 328 330 181 35.0 2e-45 MEEFVARNLDLDECFFMWQVEPTVLFGRNQLIENEVNLDYCREHNIKAFRRKSGGGCVYA DMSNVMFSYITKDENVSLTFNHYLNMVTRTLQKLGVEAEASGRNDILIHGKKVSGTAFYH VPGRSIVHGTMLYDTNMQHMVGSITPTDAKLVSKGVQSVRQHIALLKDHTNISLPHFKAF VRQELCNGKHVLTPDDVLAIEEIEKEYLTPEFIYGNNPRYTSIRKQRIENVGEFEVRIEM KKNIVKQVNIMGDFFLTGDIDNRLLATLKNVPYTEEAIRQALPQRVDDIIMNLSKEDFIK LII >gi|283510527|gb|ACQH01000092.1| GENE 65 79866 - 80144 72 92 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGQNFRRHHVFAQHFVRERGVYLLFSFLGKCVLQRTNITPSQFAERLKSKTTSIRKHDPP KTNQPPIFTNRGLILPNTEVRIDEMNCLLNSY >gi|283510527|gb|ACQH01000092.1| GENE 66 80127 - 80717 668 196 aa, chain - 
## HITS:1 COG:no KEGG:no NR:gi|288929412|ref|ZP_06423257.1| ## NR: gi|288929412|ref|ZP_06423257.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. F0108] # 1 196 1 196 196 294 100.0 2e-78 MYTKSLLTTLATSAFCATMLFFTACDKDNDSNKPMLNKLTFDKTKVEVTEGMETELKVKN GTAPFKAMLSGEKNKQIADVAVKDRVITIKGKMAGSATLAVTDKKGDKGVINVVVKKAAG TLKLDKTSLTMEVGKTEVVKAQNGNGKYTAIAKDPKTVEVMVNKYEISVKALKSGKTDVE VKDGDNNLGIISITVK >gi|283510527|gb|ACQH01000092.1| GENE 67 80869 - 81096 78 75 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRATIGALNQLHGALYPSCCDTNQPITSLIGFTYCFIAPLVFSLRACRKGCWAGKMLCLS FIRFAVSSRFARLLP >gi|283510527|gb|ACQH01000092.1| GENE 68 81107 - 81433 422 108 aa, chain - ## HITS:1 COG:MTH1452 KEGG:ns NR:ns ## COG: MTH1452 COG1917 # Protein_GI_number: 15679449 # Func_class: S Function unknown # Function: Uncharacterized conserved protein, contains double-stranded beta-helix domain # Organism: Methanothermobacter thermautotrophicus # 14 107 5 98 99 117 62.0 5e-27 MDKKIEKATVIVANEVINYAEGGIVSKEFVHANAGSLTLFAFDEGQGLSEHSAPFDATVQ VLDGEAEVKIDGVPHVVKAGEMIIMPANHPHALFAHKRFKMLLTMIRG >gi|283510527|gb|ACQH01000092.1| GENE 69 81800 - 82465 809 221 aa, chain + ## HITS:1 COG:CAC0884 KEGG:ns NR:ns ## COG: CAC0884 COG0664 # Protein_GI_number: 15894171 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 1 221 1 221 229 108 30.0 6e-24 MEKKLLNALTHCPLFKGISEQGLEQIVASTKHKIVTFDKKDIYILAGMPYRHVDIMVQGE MVARMVSLSGKSVEVSRLVAGNLIAPAFIFSKDNLMPVSVETETETTVLRMEKADFVELI DTDAVLRRNFIAILSNIDVFLTKKMRVLSLFTVREKIAYFLKEASGKQGSNVIQLDKSRQ EIAESFGIQKFSLLRTMSELAEEGAIRVEGKRITILDRNKL >gi|283510527|gb|ACQH01000092.1| GENE 70 82794 - 84452 1704 552 aa, chain - ## HITS:1 COG:MA2912 KEGG:ns NR:ns ## COG: MA2912 COG0365 # Protein_GI_number: 20091733 # Func_class: I Lipid transport and metabolism # Function: Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases # Organism: Methanosarcina 
acetivorans str.C2A # 1 552 7 560 560 662 56.0 0 MVERFLKQTTFSSTEDYNKNLQFIIPDNFNFAYDVMDAWAAEAPDKTALIWVSEEGGERF FTFSDLKRESDRAASYFQALGIGRGDMVMLILKRKYEWWIAMLALCKIGAVAIPATHMLT THDIVYRNNSASVKAIVCVGEDYVLTQVRGAMPESPSVKTLVSIGPDVPEGFHCWQTGVE KAAPFVRPSKVNDNDDIMLMYFTSGTSGEPKMVAHDFLYPLGHIPTGVFWHNLGPESIHL TVADTGWGKAVWGKLYGQWIAGAAVFVFDHEKFAADKILRMIEKYRITSFCAPPTVYRFL IHEDFNDYDLSSLTYCTTAGEALNAAVYRKFYERTGVKLMEGFGQTETCMTLGTMPWMEP KPGSMGKPNAQYDIDLVRPDGTSCEDGEKGQIVVRIGDKRPLGLFKEYYRDPELTHQANH DGVYYTGDIAWRDEDGYYWFVGRADDVIKSSGYRIGPFEVESALMTHPAVVECAITGVPD EIRGMVVKATVVLHPKWKGRAGDELKKELQEHVKHETAPYKYPRIVEFVDELPKTISGKI RRVEIRQRDNAK >gi|283510527|gb|ACQH01000092.1| GENE 71 84466 - 85020 664 184 aa, chain - ## HITS:1 COG:MA2914 KEGG:ns NR:ns ## COG: MA2914 COG1396 # Protein_GI_number: 20091735 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Methanosarcina acetivorans str.C2A # 1 184 1 184 184 150 42.0 2e-36 MEESIRTIGQRLKGLREVLNIPAEEIADLCGISLEHYLKMEDGTADPSVYRLAKISKRYG IDLDVLLFGEEPRMSAYYLTRKGNGLSVERGNDYKYQSLGSGFRGRKMEPFLAQVDPLPE GKSYRKNSHNGQEFDYIVEGTMELTLGEKVMVLKEGDSVYFDATKPHRMQALGGKPVKFL CVII >gi|283510527|gb|ACQH01000092.1| GENE 72 85302 - 86285 684 327 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929418|ref|ZP_06423263.1| ## NR: gi|288929418|ref|ZP_06423263.1| hypothetical protein HMPREF0670_02157 [Prevotella sp. oral taxon 317 str. F0108] # 1 327 1 327 327 630 100.0 1e-179 MKNTTLNGVVCAIAFAALTACANEPSADNGNNVIKPKQNTLRNLVFNAEFDDYSNHGDGP QKMFAGTSSVQSDTINLSNGMRAIATLTPDRFSETASKMTRTLPNDTYTMETYNGYTHEF QGKLTGKVENGKFISLDGKEDIILEHDYYHFLLYNSKLYRQDGVLMINRANADDAFVGYV SEIIKHEPRRQVVLFPMKRKGVRLKMKITGTEAFGPMTATLSNANREGLPASLKYELRNN NWSTATKSAFSQPITFGRSVQKAGNNTHFALANEFTYLLFTTNAADLKMKFNSGTIYGED ITGAELQFKNLVFDVNGSYVLAIELTK >gi|283510527|gb|ACQH01000092.1| GENE 73 86305 - 86535 181 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929419|ref|ZP_06423264.1| ## NR: gi|288929419|ref|ZP_06423264.1| hypothetical protein HMPREF0670_02158 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 76 1 76 76 148 100.0 9e-35 MKREQQNKLPYMSPQCDMVKIENIDFLCVSVRTDVRNNTSPSGYENRGEHNVGTIRVGRK NAPARGMMFFEDDVEY >gi|283510527|gb|ACQH01000092.1| GENE 74 87376 - 88917 1666 513 aa, chain - ## HITS:1 COG:no KEGG:BF1460 NR:ns ## KEGG: BF1460 # Name: not_defined # Def: putative outer membrane protein precursor # Organism: B.fragilis # Pathway: not_defined # 13 513 11 501 501 417 47.0 1e-115 MRSFKKASSAAVLAMCCQSLFAGGLLTNTNQNIAFNRNFARDGVIAIDGVYSNPAGVAFL EKGWHLSFNFQNAYQTRTIRSGMSVEAAKSTPFYHPLSLNAGDAEGTKKYVGKASVPILP SFQGAYVGDKWSFQVGIGLIGGGGKATFDEGLGSFERQVALIPAALAQAGLTSPTPAYSL ASNIKGQQYIFGVQMGATYKFNDQLSAYAGMRVNYVYNRYTGSITDITANIAGANENLYA YFGNKATQLNAQAAALKAQAEATTDATLKAKLLAGAAKAEGGARLMSEKQGQVKDKYLEC TQRGWGVTPIIGLDFKTGRWNFGTRLELNTHLNIENDTKVDDTGLFQHGVNTPSDLPGLW TIGTQYAIFPNLRAMASYHLYFDKSAKMANDKQKLLGGNTQEFLAGMEWDVTPDITVSAG GQRTKYDLGDGAYLSDMSFVTSSYSIGLGAQVRLAKNMKLNVAYFWTNYENFDKTYKQTV VTNANPLATVTLDNTDRFTRTNKVLGVGLDVEF >gi|283510527|gb|ACQH01000092.1| GENE 75 89199 - 91253 612 684 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762592|ref|ZP_02169656.1| ribosomal protein S21 [Bacillus selenitireducens MLS10] # 189 680 237 732 750 240 30 2e-62 MNKSIITKGRALNSVLALSGFVLVSLLVIVWFLPRNNTPQMRYDVGKPWMYGSLIAKFDF PIYKSDQVIQQERDSLLALFQPYYHYDRSMEKRQLAKLQADFPKGIPGMPVAQMRTLYDR LHRLYQAGIIGTPQYNSIAKDSGSMVRVIIGKRVQSMQVGCIYSTMKAYEQLLNDDVLSA YRSALQQADIVGYLHVNMTYDKERSTTELNDLLASVPLASGMVLSGQKIIDRGEIVNEQT YRVLASMEKEVARRNVSKGEIKNTIIGQFIYVLLLLGLFTSFLVMFRPQYLQKARSVVML YVMITLYPIMVSLFMEHSLLSVYIIPFAIGPMFIRVFMDSRTAFTTHVVTMLICAVAVKY QFEFLLLQLIGGLVAIYSLSDLSGRAQLFKCALFVTLANFLVFFTLQLMQTGDIANFETS MYSHFVANGVLLLLAYPLMYVIERTFGFTSSVTLFELSNTNKGLLRKMSEVAPGTFQHSI TVANLAAEIANRIGADSLLVRTGALYHDIGKMKAPAFFTENQVGVNPHKGLPYKESARII ISHVTEGVKMAEKANLPTFIREFILTHHGTGMAKYFYINYKNEHPDEEVDVRDFSYPGPD PFTREQAILMMADTVEAASRSLPEYTEESIKGLIDRLVDAQLEEGHFKECPITFRDIAVA KAVLLERLKSIYHTRISYPELKRG >gi|283510527|gb|ACQH01000092.1| GENE 76 91710 - 92426 892 238 aa, chain + ## HITS:1 COG:CAC2831 KEGG:ns NR:ns ## COG: CAC2831 COG0670 # Protein_GI_number: 
15896086 # Func_class: R General function prediction only # Function: Integral membrane protein, interacts with FtsH # Organism: Clostridium acetobutylicum # 23 238 15 231 231 178 50.0 1e-44 MEQQDFDRLIREKEGALSLAFPALMRKVYVWMTLALIITGVTAYGVAHSEVLISQMFESR AVFMVLIIAELALVFGISRFINRLSLTTATLLFILYSALNGATLSVIFLAYSASVITKTF FITAGTFGTMALFGYFTKADLSGLGKILLMALVGIIIASIANLLFFKSGMFDLIVSYIGV LVFVGLTAYDSQKIKEMLMRAEDADENAQKMALMGSLTLYLDFINLFLYLLRIFGRER Prediction of potential genes in microbial genomes Time: Sat May 28 02:13:28 2011 Seq name: gi|283510526|gb|ACQH01000093.1| Prevotella sp. oral taxon 317 str. F0108 cont2.93, whole genome shotgun sequence Length of sequence - 64560 bp Number of predicted genes - 38, with homology - 34 Number of transcription units - 26, operones - 8 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1161 - 2306 1119 ## COG1519 3-deoxy-D-manno-octulosonic-acid transferase - Prom 2364 - 2423 1.6 - Term 2495 - 2535 2.1 2 2 Tu 1 . - CDS 2536 - 4053 1676 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases - Prom 4207 - 4266 6.3 + Prom 4193 - 4252 4.0 3 3 Tu 1 . + CDS 4476 - 5489 1094 ## gi|288929425|ref|ZP_06423270.1| hypothetical protein HMPREF0670_02164 + Term 5528 - 5583 11.1 - Term 5517 - 5570 16.9 4 4 Op 1 . - CDS 5601 - 6788 1260 ## BF0761 putative lipoprotein 5 4 Op 2 . - CDS 6831 - 8591 2027 ## BF0834 hypothetical protein 6 4 Op 3 . - CDS 8607 - 11801 3068 ## BF0759 putative outer membrane protein 7 4 Op 4 . - CDS 11872 - 14136 1972 ## COG3537 Putative alpha-1,2-mannosidase 8 5 Op 1 . - CDS 14368 - 15135 1004 ## BLD_0197 endo-beta-N-acetylglucosaminidase D 9 5 Op 2 . - CDS 15166 - 17916 2527 ## COG3537 Putative alpha-1,2-mannosidase - Prom 18136 - 18195 4.5 - Term 19546 - 19594 13.2 10 6 Tu 1 . - CDS 19621 - 20622 1354 ## COG0039 Malate/lactate dehydrogenases - Prom 20721 - 20780 7.3 + Prom 20677 - 20736 8.8 11 7 Tu 1 . 
+ CDS 20932 - 22728 1871 ## COG0006 Xaa-Pro aminopeptidase + Term 22945 - 22987 2.3 - Term 22748 - 22808 4.5 12 8 Tu 1 . - CDS 22873 - 23364 554 ## COG0663 Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily - Prom 23392 - 23451 6.0 13 9 Op 1 . - CDS 23767 - 24462 468 ## PRU_2233 putative lipoprotein 14 9 Op 2 . - CDS 24459 - 25613 1079 ## BT_3714 hypothetical protein - Prom 25708 - 25767 4.6 - Term 25646 - 25693 -0.8 15 10 Tu 1 . - CDS 25780 - 26781 1325 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes - Prom 26869 - 26928 3.5 - Term 26866 - 26899 -0.9 16 11 Op 1 . - CDS 27061 - 28110 1207 ## COG0564 Pseudouridylate synthases, 23S RNA-specific 17 11 Op 2 . - CDS 28158 - 28844 856 ## PRU_2237 PASTA domain-containing protein 18 12 Tu 1 . - CDS 29730 - 30203 485 ## COG1438 Arginine repressor - Prom 30274 - 30333 3.0 - Term 31593 - 31649 -0.0 19 13 Tu 1 . - CDS 31667 - 32674 1103 ## COG1321 Mn-dependent transcriptional regulator - Prom 32694 - 32753 2.8 20 14 Tu 1 . - CDS 32964 - 35234 2237 ## COG1198 Primosomal protein N' (replication factor Y) - superfamily II helicase - Prom 35468 - 35527 3.2 + Prom 35628 - 35687 2.9 21 15 Tu 1 . + CDS 35789 - 38680 3256 ## COG0642 Signal transduction histidine kinase + Term 38718 - 38776 13.3 + Prom 38886 - 38945 2.9 22 16 Tu 1 . + CDS 39035 - 39235 152 ## gi|288929444|ref|ZP_06423289.1| hypothetical protein HMPREF0670_02183 23 17 Op 1 . - CDS 40021 - 40380 157 ## 24 17 Op 2 . - CDS 40468 - 40929 -33 ## - Prom 41057 - 41116 3.2 25 18 Tu 1 . - CDS 41692 - 42657 933 ## gi|288929446|ref|ZP_06423291.1| tetratricopeptide repeat protein 26 19 Op 1 . - CDS 42937 - 43911 1024 ## gi|288929449|ref|ZP_06423294.1| TPR repeat protein 27 19 Op 2 . - CDS 43915 - 49887 3122 ## COG3209 Rhs family protein - Prom 50074 - 50133 8.6 28 20 Tu 1 . - CDS 50402 - 50896 134 ## - Prom 51119 - 51178 6.5 + Prom 51066 - 51125 2.9 29 21 Op 1 . + CDS 51292 - 52230 805 ## PRU_2176 hypothetical protein 30 21 Op 2 . 
+ CDS 52284 - 53789 1504 ## COG1262 Uncharacterized conserved protein 31 21 Op 3 . + CDS 53786 - 54631 907 ## PRU_2174 hypothetical protein 32 21 Op 4 . + CDS 54643 - 56199 1697 ## PRU_2173 hypothetical protein 33 22 Op 1 . + CDS 56330 - 57325 1152 ## PRU_2172 hypothetical protein + Term 57339 - 57385 2.4 34 22 Op 2 . + CDS 57402 - 58028 715 ## PRU_2171 hypothetical protein + Term 58124 - 58175 8.5 - Term 58466 - 58509 -0.8 35 23 Tu 1 . - CDS 58594 - 58974 391 ## gi|288929457|ref|ZP_06423302.1| hypothetical protein HMPREF0670_02196 - Prom 59072 - 59131 4.6 + Prom 58712 - 58771 4.6 36 24 Tu 1 . + CDS 58991 - 59197 78 ## + Prom 59339 - 59398 4.0 37 25 Tu 1 . + CDS 59435 - 61588 2473 ## COG0370 Fe2+ transport system protein B + Prom 61797 - 61856 2.7 38 26 Tu 1 . + CDS 61917 - 63122 977 ## BDI_2811 hypothetical protein Predicted protein(s) >gi|283510526|gb|ACQH01000093.1| GENE 1 1161 - 2306 1119 381 aa, chain - ## HITS:1 COG:PA4988 KEGG:ns NR:ns ## COG: PA4988 COG1519 # Protein_GI_number: 15600181 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 3-deoxy-D-manno-octulosonic-acid transferase # Organism: Pseudomonas aeruginosa # 23 333 51 372 425 126 32.0 7e-29 MWRGERDAFDVLRNNVEEGAQYVWFHAASLGEFEQGRPLIEELRHTHPQYKVLLTFFSPS GYEVRKNYEGADIVCYLPLDTPLNARRFLRLVRPVMAFFIKYEFWYNYLHILKHRGVPTY SVSSIFRPDQIFFRWYGKSYGKVLACFSHFFVQNEQSRSLLATIGIRNVSVTGDTRFDRV LQIQKASKHLPLIENFVQGKHLFVAGSSWPSDEEIFVPFFNARSDWKMIIAPHVVSEEHL QQLEQQVEGQTIRYSKATPESVAEADCLLIDCYGLLSSVYHYADVTYVGGGFGVGIHNVL EAAVWGVPVIFGPNNQRFQEAQDLMVAGGGFEVDCKAHFDALMDEFIEAPWRVQVAGEKS TEYVKSQVGATDKVLEKVFFK >gi|283510526|gb|ACQH01000093.1| GENE 2 2536 - 4053 1676 505 aa, chain - ## HITS:1 COG:BB0372 KEGG:ns NR:ns ## COG: BB0372 COG0008 # Protein_GI_number: 15594717 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Borrelia burgdorferi # 7 505 4 487 490 339 38.0 6e-93 MSERKVRVRFAPSPTGALHIGGVRTALYNYLFAKQQGGDLVFRIEDTDSQRFVPGAEEYI 
IESFRWLGIKFDEGVSFGGKYGPYRQSERKEIYKQYVDQLLNEGKAYIAFDTPQELETKR AEIENFQYDAQTRNGMRNSLTMSAEEVDALIGDGTQYVVRFKVMPGEEVHVNDMIRGDVV IKSDILDDKVLYKSADGLPTYHLANIVDDHLMEITHVIRGEEWLPSAPLHVLLYKAFGWA DTMPRFAHLPLLLKPEGKGKLSKRDGDRLGFPVFPLEWHDPKTGEVSSGYRESGYFPEAV INFLALLGWNPGTEQELFTLPQLVEAFDISRCSKAGAKFDYQKGIWFNHEYILQKSNEEI ASLFAPMVANNGIDEPMERIVQVVAMMKDRVNFVKELWPLCSFFFIAPVQYDEKTVKKRW KENSAEVMTQLSNVLEGLNDFSIENQEREVFAWVESMGYKLGDVMNAFRLALVGEGKGPG MFDISAFLGKEETLARLRRAIEVLK >gi|283510526|gb|ACQH01000093.1| GENE 3 4476 - 5489 1094 337 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929425|ref|ZP_06423270.1| ## NR: gi|288929425|ref|ZP_06423270.1| hypothetical protein HMPREF0670_02164 [Prevotella sp. oral taxon 317 str. F0108] # 1 337 1 337 337 676 100.0 0 MRKPILLASFALFTTLSAFPMKEIAGEFQPWDANCQILGNNVAFMTEWNGVTFKFFDKDG CVFEGGTDFSEYDMLVLKLKDASCMFKIKTEYTDGTTQQDSWGAEQAITPGLLVAGIALN HEHKKNVKDFVLQSADYPGMITVDKVIACTKAEYEQMLKEEKAKRFELTLKKVNEGGGAD YDPATKTISIKDDWAHKGWYFNDQFRDFSLFNLFVIQFAQPTATDGEIGIEYDGEGAQTT TSRFETGVSQVNVPLSKDKNKLRQVYVKGPSGATFVLKGARFATNGEVTAIKRIDKEKTS ANAPTRYYSLDGRELKQPTKGLHIEKKNGSAQKVFGR >gi|283510526|gb|ACQH01000093.1| GENE 4 5601 - 6788 1260 395 aa, chain - ## HITS:1 COG:no KEGG:BF0761 NR:ns ## KEGG: BF0761 # Name: not_defined # Def: putative lipoprotein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 33 349 23 329 368 147 31.0 7e-34 MKAKIRKHLNLGLLAIATAMMACQNIVDNPDVEDFSAAGAPTITKITTVTDLNQAVTQSG MGQWLAIHGDNLAHPTAILFNDVEVNLKDVYAVRTRINVAIPAEAPNKLTNTLTVRTALG ETTTNFTVDFPKLKVVGLDNEFAKPGSNVTVMGEFFNLYGLTSNQATFTLGGKPLTVVEK NDKKIVLTIPEDAADGAEIVMSSPKLEQPIRLPYRDKGVQLFASYDKDYLFGKGYLWTSQ DYFTDGTNEGDPVPPVGKCFFRRKNLYSAWNWDTLIAGHFDLDDADVVNHLENYCIKFEV WTAKDKPIPTGDFIFWSQQSADNMKLRWNPADQGVSLNTNGEWRTITLDAATWFRDNDAQ PTLKKGSNDLTIVYQPHDGFDADFALANLRFAKKR >gi|283510526|gb|ACQH01000093.1| GENE 5 6831 - 8591 2027 586 aa, chain - ## HITS:1 COG:no KEGG:BF0834 NR:ns ## KEGG: BF0834 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 586 1 555 555 430 43.0 1e-119 
MKKIGVKYIAACLWGALVAASCISCADLLDEKGFSFVNDKNIENNDEGAQQWATGAFSAL TYEVFRWDAFPHVLDFDCDYMTGPDWAFGKFGSGNFQDDDHAQQMWEKMYNIIHRCNLGI ENVEAMTGTTPRGKDNAIGEMQVLRAYAYFLLVRAFGPIPVRDHTINSGLSDIHQPRQPI DSVYHYIIANLQQAEQRLYKNTDREFSLGHVSAGTAASLLAKVYLTMASGAVPTGEKVIV RGGKAFEKVAGEKVYTNPVAIEHKKMQLAGYDKIDYMKCFALARDKAKEVMDGKYGSYDL LPYDALWAIGNRNKVEHIWSLQTVSNDDVYGVSFGEGYTGVLGTAGETYGLHHGMRNHWY KLFEPQDYRITKGVMHRWQRYHAVDYHGGSYYPNTDEWKKKAQGYTNDEGKWVAPVAPFN DGLSYESKNDNNYLAYLTKYTYVSDRTKKRADIYFPFLRFADVLLIYAEAANEANNGPTA EALNALNRVRVRSNATPKSLTGVGNINDKVLFRSAVLEERAMEFAEEGDRRWDLIRWGIY LPVMNAIGGTDEVDIVKTRMEKHLLFPIPGSEVGVNKFITKNNPGW >gi|283510526|gb|ACQH01000093.1| GENE 6 8607 - 11801 3068 1064 aa, chain - ## HITS:1 COG:no KEGG:BF0759 NR:ns ## KEGG: BF0759 # Name: not_defined # Def: putative outer membrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 19 1064 18 1045 1045 971 49.0 0 MNKYALLLFSLCSVPLSALAQGKHITGRVSDTKAEPIVGAVIKIQNKAAVVTDAEGRYAI DAEPGDVLDITCIGMQSQRLKVGNRSTLNVVLQDDAVKLDDIVVVGYGTMKKRDLSGAVS KISGDDLLGTGASSFNQALQGKVSGVVVNKTDGAPGGAISINVRGANSFTTSTEPLYIVD GVPFETAGTPSSKATDGSQMRMNALASINPHDIESLEILKDASATAIYGSRGANGVVLIT TKKGKAGRGKIEFTSNLSMSNVLKKIDVLDPVTYARYINEQTANREIYNGETVNGLPYEG KWKYRELNGKIVENSGVYNPSPEDFLNPGLRTDQYGNKTMVRGTNWQDEIYHTGFSQEYN LNVSGSDNNGWWSFSGNYLNQTGVIRRSGFERYILHMNIGRRVKDWLEIGMNMNYTNSTT DFAKSSNEVGVVRSALVFPATYAPDVSVYESDKLNWLASNPVVYLNNSKDQQRMSNFFSS SYLEAKIRPYLKFRQNLGLGYSENSRGSYYDRHTSEGKYPDNGKGGQSDSWWKSLTSESL LTFDKSFGGKHSVNALLGFTYEIANFGSKGMSAKNFPSDVTQEHNMGLGLSYNAPESERG QQKLMSLLARVNYSYAGGKYVATASLRRDGSSKFSQANNKIGHFASGAVAWRLSEERFVK DLDIFDNLKLRFSYGQTGNQGIGAYQAQLYMIASNYPYNGSMASGFSEVLWRGALNKNLK WETTDQFNLGLDMAFWRGRVNLTADLYWKKTRDLLQSTAIPSSNGFVNMWSNEGHVVNKG IEVAAKFNAINHKGFNWNIEANIAFNKNEIGGLSSDRFAERVSAGMEKVFIQRNGLHIGT IYGFLEDGFYDNEAEARMNKSYAGMSSDEIKRFVIGEVKYKDLNGDGVITDADKTVIGKT LPDYTFGITNTFSWKGFNLSFFFQGSVGNSLFNANMRDVRLDDIGNIPRSAYEARWTPET AATARYPKNVATRNRDMKVSDRYVEDASYLRLKSLNLGYNIKPHWQGISNIYVYASATNL FTITKYSGADPDVNAFGWDASRRGVDYYCYPGSRTFALGFKLDY >gi|283510526|gb|ACQH01000093.1| GENE 7 11872 - 
14136 1972 754 aa, chain - ## HITS:1 COG:Rv0584 KEGG:ns NR:ns ## COG: Rv0584 COG3537 # Protein_GI_number: 15607724 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Mycobacterium tuberculosis H37Rv # 13 753 23 768 877 498 39.0 1e-140 MLKPVRFWVCLVLILGMPRVFFAQDKTPYQLVNPFIGTGNEGNCFPGAQAPFGMLSLSPN NTFDNYEDAASRPGYKYFRNEINGFGLTHYSGVGCHAMQDLQFMPVAGTLDKSPVNDKHA YVSRFSHEREKAMPGYYSVTLDDYHVDAKFAATVHAAIGEITYRGDQEAHLVFAPTNCAN GIGDGELHIESTAMTVTGWVSTGGFCWRDPKDRPYRVFFVAKFNTKFSSYGVWKGKAKAE GASNVAGDDVGAYVSFGKLNGKAVKMKVAISYVSIDNARKNLQEEISCWEFEDVCRATKS QWERMLSRLTVEGGSESQRQAFYTAVYHNLLHPNVYSDVNGDYMGFDDKVHRVANGRKQY ANFSLWDTYRTTATLQALIAPDEASDMAQSLLLDAEQGGAYPNWSMNNVEYGVMNGYSTF PFIANLYAFGARNFDLQATKEMMKRVSVKHIKCKGFHGWEHVEDYMTYGYVPVDRHGHGA SMTLEYAIDDFSIAQICKAAGDEAAYNYYMNRSQSFEKLFDEETRLIRPRNADGSFLTPF APNMEKGFNEGNAMQYFWSVPHNVDALTDFCGGKDAMEKRLDTFTSQVKCGWAPDVPYYW LGNEPCFGSVYVYNFLGKAWKSQRMVRNVLNRFDDTPNGLPGDDDAGAMSALYVFSAMGL YPYIPGIGGFVVTGPLFTKVQLQLKGGKTLTLIGENAGNDAPYIQRLQVDGRETTSTWLD WDKLKHGARLNFVMGKGPNKQWGATEKDVPPSYY >gi|283510526|gb|ACQH01000093.1| GENE 8 14368 - 15135 1004 255 aa, chain - ## HITS:1 COG:no KEGG:BLD_0197 NR:ns ## KEGG: BLD_0197 # Name: not_defined # Def: endo-beta-N-acetylglucosaminidase D # Organism: B.longum_DJO10A # Pathway: not_defined # 32 253 704 921 927 87 29.0 3e-16 MKHFLLTLGFMLTLGVLSAFGQQTGNKLWLKASKVLLPTGGKVSFEVEGTNPSATYKYKW TLPKQFKKLSERGNKVVVKIPDEGNYEVKVFVESNDESDDVELKTNVEVSNSKKIELLSV GKKVVSCSGNVGDERPDWLFDGTADPDNYSKKWCAEGKKEHEVVVDLGQTCQIYRLKFYD CRTKETDYDNIQNFKLFVSDNQTDWTLALSETGNNDNVKDITFAPVKGRYVKFVAYDPNK DFTIRVWELELYGVK >gi|283510526|gb|ACQH01000093.1| GENE 9 15166 - 17916 2527 916 aa, chain - ## HITS:1 COG:CC0574 KEGG:ns NR:ns ## COG: CC0574 COG3537 # Protein_GI_number: 16124828 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Caulobacter vibrioides # 216 915 37 728 867 429 34.0 1e-119 MRKHTLLPALLFVAVALHAVVRTNFPTDNTRAKRKKVTVTKQAQTANADTVMRNIALSKD 
IDVSASAFVNQNEAPEKACDGDVRTKWCDNSSPNKWLLYDLQKDYNVRKVCLIWEHGDPN NIYKVQTSVDGQHWTDRITETTNTQNQRVYDVDWKGVRLVRFLVPEEAKDDAVRLMEFQV WTRGAGQPSRIEPIGKWGAMVKPRYLHTEGKKKIYALMSFVDNRVGVIDANGSNCVIGPQ MPFGSINPSPQTPEGEHDGYAPNQPIRGFGQLHVSGTGWGKYGHFLLSPQVGLSIGETEH DSPASDEVAKPNYYRARLDRYGIVAELTPTEHAAIYRFTFPQTPEANIAMDVTHSLTRDI AKYIGGTVKANSVSIDSDEGDRFSGMIEYEGGFSGGFYKLYFTAQLDKKPRTFGVWKNGT LQQGAKKAQLTQGEDRIGSYFTYYTNNGEAVKLKIAISFNSVEQAKKYLAAEIPGWDFEA TKKRGEDKWNRILSDIVVDEAPAVRMKQFYSALYHCLLMPRNRTNEFPAFGNKELWDDHF AVWDTWRTLFPLLTIIEPDVVGRNVQAFVNRWKVNGKVKDAYIAGNDMAEEQGGNDVDNI VADAIIKDIKGFDKAEAYKYLKFSADHERRAAPLMVGEKGLGQNDTLFYRQNGWLPGGVM AQSTALEYSYNDYCVALAAKKLGHECDYKKYLERSKRWVNMWNADLESKGFKGFICPRDK DGKWIDIDATYFWGSWKRYFYEANAWTYSFFVPHDIDKLIELNGGKEQFAKKLDYGFRNS LLELANEPSFLTARLFNHAGRMDLTCYWVNHVLENLFGEYSMPGNDDSGAMSSWWLFSAM GFFPNAGQNIYYLNSPLFKRVTIQRPNGNIEISAPNRTDKNIYIKEVKVNGKTCTDGIIT YDDLKNGATIRYELTK >gi|283510526|gb|ACQH01000093.1| GENE 10 19621 - 20622 1354 333 aa, chain - ## HITS:1 COG:BMEI0137 KEGG:ns NR:ns ## COG: BMEI0137 COG0039 # Protein_GI_number: 17986421 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Brucella melitensis # 4 264 7 267 326 117 34.0 3e-26 MEFLTDEKLVIVGAAGMIGSNMAQTAAMLGLTPNICLYDVYEPGLAGVTEEMRHCGFEDV NFTYTTDVKEALKGAKYIISSGGAPRKEGMTREDLLKGNATVAEQLGKDIKAYCPDVKHV VIIFNPADITGLVTLIWSGLKPSQVSTLAALDSIRLQSELAKHFGVPQSEVTGCRTYGGH GEAMAVFASTAKVQGTPLLDLIGTPKLSAEKWAEIKQKTVQGGSNIIKLRGRSSFQSPAY VSVKMIEAAMGGEEFAWPAGRYVSFAGIDHIMMAMEVSITEKGSAYEEVVGTPEEMAELK KSYDHLVKMREEVIALGVLPPVEKWGEINPNLK >gi|283510526|gb|ACQH01000093.1| GENE 11 20932 - 22728 1871 598 aa, chain + ## HITS:1 COG:FN0453 KEGG:ns NR:ns ## COG: FN0453 COG0006 # Protein_GI_number: 19703788 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Fusobacterium nucleatum # 4 594 3 584 584 437 38.0 1e-122 MKTIEQRLDALRQLMRREHLAAFIFPSTDPHSGEYVPEHWKGREWISGFNGSAGTAVVTL DDAAVWTDSRYFIAAEEQLQGTGFKLMKDGLPQTPSVAEWLADKLRHTDNTEVALDGMVN 
TLSEVNALKVELRKLGGLTLRTNIDPLKTIWTDRPEIPTNSVELQPLELAGEETRHKIER IRKALRAVHADGTLVSTLDDVAWTLNLRGSDVQCNPVFVAYLLIEQNRSTLYINKEKLGE EVKAYLKSQQIEVAEYADVDKGLARYAEYNILLDPNTTNYTLAQKVTCQEIITLPSPVPA LKAVKNDAEIRGFRNAMLKDGIAMVKFLKWLKPAVEGGTETEISLDEKLTSFRAEQPLFR GKSFETIVGYEAHGAIVHYEATPETDIPVKPRGLVLIDSGAQYQDGTTDITRTIALGETT PEQRTAYTLVLKGFINFAMLKFPDGATGTQLDATARLPLWREGMNFLHGTGHGVGAYLNV HEGPHQVRMQWRPAPFHAGMTITDEPGLYIEGEYGIRIENTLLTIPYRSTAFGEFLQFTS LTLCPIDTAPIVLSMLSAEEVTWLNDYHRMVYTTLAPHLDREHLVWLKEATKPLERGR >gi|283510526|gb|ACQH01000093.1| GENE 12 22873 - 23364 554 163 aa, chain - ## HITS:1 COG:BH3289 KEGG:ns NR:ns ## COG: BH3289 COG0663 # Protein_GI_number: 15615851 # Func_class: R General function prediction only # Function: Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily # Organism: Bacillus halodurans # 9 163 8 165 174 153 46.0 2e-37 MLTKQVNGKGPQWGNNCYFSENATIVGDVTMGDDCSVWFCAVLRADVDGIRIGNRVNIQD GACVHQSHGTPVVIEDDVSVGHNATVHGCILRRGCLIGMGATVLDAAEVGEGAVVAAGAV VLQGTKIGANELWAGVPAKLIKRTAPGQAEEFAQHYMEIKKWY >gi|283510526|gb|ACQH01000093.1| GENE 13 23767 - 24462 468 231 aa, chain - ## HITS:1 COG:no KEGG:PRU_2233 NR:ns ## KEGG: PRU_2233 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 11 212 1 207 226 137 37.0 4e-31 MMNARKVISPLARLFLGGVSVVVLVALTTLSACETDSYDTGDGSNSYLKAHFAEAHTTAN AKLAFAVTDDGDTLRFPPNTSNKALAKADSVYRVLYYYDQRGANVNPRSIVPIPVVSLSD STTWPTDPVTFESAWVSKNRKYFNIGFALKVGRTDEPDRKQRIGVVRDSLVTTAAGEHIL YMHLAHGQNGVPQYYSQRVYISVPIKKLPANTRFVFRVKDYQRMITVERGG >gi|283510526|gb|ACQH01000093.1| GENE 14 24459 - 25613 1079 384 aa, chain - ## HITS:1 COG:no KEGG:BT_3714 NR:ns ## KEGG: BT_3714 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 380 5 376 382 443 56.0 1e-123 MQSNNYLEKFATIRPFDAEELPAAFDRLLADKQFESVITYIFPQLPFSLFAQQLRACKTS LEVQKTFAYPFLANLLAKASSGGSMDSTAIDSQRRYTFLSNHRDIVLDSAFLAKFLIDNG FSTTCEIAIGDNLLSLPWVRDLVRVNKSFIVERNLPIRQMLASSKLLSEYMHYAINEKNE 
NVWIAQREGRAKDSNDRTQEAVLKMMAIGGEGDIVDRLADLHIVPLAISYEYDPCDFLKA KEMQQKRDTPDFKKSAQDDVLSMQTGIVGYKGAIHYHCAPCIDGFLSTLDRDMPKGDLFA AVAQHIDQEIYRNYRIYPGNHVALYLLTGEKGADNAFTAEEQATFEKYLEGQMAKITLPQ KDETFLRQRMLEMYANPLINQRSV >gi|283510526|gb|ACQH01000093.1| GENE 15 25780 - 26781 1325 333 aa, chain - ## HITS:1 COG:HI1140 KEGG:ns NR:ns ## COG: HI1140 COG1181 # Protein_GI_number: 16273066 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Haemophilus influenzae # 5 330 5 303 306 182 33.0 6e-46 MKDLKRNIAIVCGGNSSEHEVSLRSARGLYSFFDKDKYNVYIVDIERMNWKVDLGNGEKA TIDKNNFSFEHEGTHVFFDYAYITIHGTPGENGIMQGYFELIDMPYSTSGVLVEALTFDK YVLNNYLRGFGVNVAKSILVRRGQEYGIDERAVEEQLGMPCFVKPAADGSSFGVSKVKNV DQLAPALRKALMEDDSAVIESFLDGTEISVGCYRVKGETKVLPATEVVSHNEFFDYDAKY NGQVEEITPARIADETARRVADETARIYELLGCNGIIRIDYILTKEPDGTDKINLIEVNT TPGMTPTSFIPQQVRAAGMEMKDVLSDIVENQF >gi|283510526|gb|ACQH01000093.1| GENE 16 27061 - 28110 1207 349 aa, chain - ## HITS:1 COG:BH2542 KEGG:ns NR:ns ## COG: BH2542 COG0564 # Protein_GI_number: 15615105 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthases, 23S RNA-specific # Organism: Bacillus halodurans # 18 342 1 303 305 246 43.0 4e-65 MTDYIEEIEEQDDDTGQLYEHLRIVVDRGQVPVRIDRFMTERLQHSSRNRIQKAADAGFV HVNERPVKSNYKVRPGDVITLMLDRPHHDTTIEAEDIPLDVVYEDDALMVINKPAGMVVH PGAGNFHGTLINAVAWHLRNMPSFDANDPEVGLVHRIDKDTSGLLVVAKTPEAKRKLGLQ FFNKTTHRSYNALVWANFAEDEGRIEGNIGRDPRDRLRMAVFPPDSETGKPAVTHYRVLE RFGYVTFVECILETGRTHQIRAHMKHIGHPMFGDERYGGTEILRGERSSTYKAYIQNCFK LCPRQALHARTLGFVHPTTGQQMDFTSPLPQDMEQLLDKWRNYIKGLAV >gi|283510526|gb|ACQH01000093.1| GENE 17 28158 - 28844 856 228 aa, chain - ## HITS:1 COG:no KEGG:PRU_2237 NR:ns ## KEGG: PRU_2237 # Name: not_defined # Def: PASTA domain-containing protein # Organism: P.ruminicola # Pathway: not_defined # 1 219 1 215 216 199 46.0 8e-50 MKSSDFFGKLTSKYIWLNLAAMAAVVVLLVVGAKFGIDIYTHHGEAIPIPDIKHKSFADA KQMLANVGLLIEVTDTGYVRTLPADCILEQSPQPGDCVKTGHVVYVIVNSGNTPTITMPD 
IVDNSSMREAMAKLRAMGFRVGEPQFIVGEREWVYGATVNGRHVVAGDKIPVDAVVVLQV GNGSRSETDTTIDYVEPNYKNQQEDQEGEGDVDEFDVVTPAETPQRGN >gi|283510526|gb|ACQH01000093.1| GENE 18 29730 - 30203 485 157 aa, chain - ## HITS:1 COG:BH2777 KEGG:ns NR:ns ## COG: BH2777 COG1438 # Protein_GI_number: 15615340 # Func_class: K Transcription # Function: Arginine repressor # Organism: Bacillus halodurans # 4 154 3 149 149 97 37.0 1e-20 MKEKNNRLETLKLLISSQEIGCQEDLLKALSQEGFNITQATLSRDLKQLKVAKAASINGR YTYVLPNETMYKRIPTPRMAREMMKVSGFQSIAFSGNMGVMKTRPGYASSIAYNIDNGGI AEILGTISGNDTIFIAFKEPFDPDEMTRRMQDVISRM >gi|283510526|gb|ACQH01000093.1| GENE 19 31667 - 32674 1103 335 aa, chain - ## HITS:1 COG:VNG0536G KEGG:ns NR:ns ## COG: VNG0536G COG1321 # Protein_GI_number: 15789756 # Func_class: K Transcription # Function: Mn-dependent transcriptional regulator # Organism: Halobacterium sp. NRC-1 # 29 244 12 232 233 70 27.0 5e-12 MATTINSKEHWWTATKGLLARNKHVDGKELQEDFLKYIYEHGKEMTRQLVDEAAKDLGTN REGMANTVGALVLAGELTANDTLVLTEKGCMHALRLIRAHRIYEQYLAEHSGYAPSEWHE RAHRMEHRITPDEQERIASLLGNPLFDPHGDPIPTPSLAVADKGTCGEPIEAQSWWRITH VEDDDKQLFTLIAEKGLTKDSLIFITQIDNVAMMFDYEGEHFSLPITALEAINMQPVAAA ELNDLPETRAQRLTHLQAGEEARIVGLSPACRGALRRRLLDLGFVKGSVVSIDMPSPMGN PIAYIVRGSAIALRHEQAKYVLVRKDLVESEEQQQ >gi|283510526|gb|ACQH01000093.1| GENE 20 32964 - 35234 2237 756 aa, chain - ## HITS:1 COG:lin1938 KEGG:ns NR:ns ## COG: lin1938 COG1198 # Protein_GI_number: 16801004 # Func_class: L Replication, recombination and repair # Function: Primosomal protein N' (replication factor Y) - superfamily II helicase # Organism: Listeria innocua # 16 755 20 793 797 459 35.0 1e-128 MRYVDVILPRPLEGYFTYAISDELSTGVKLGIRVLVPFGSSKTCTAMAVRVHDEKPDFDT KAVLQVVDSAPMLLPQQLDLWQWIAQYYMAPIGDVYTAALPAGLKDEAGWQPKTEVYVAL ADNFQTDKTLNIALDMLRSARQQLKAFTCFLHLSHWDTLENNLTQQPVAEITRVELMNES GASNATIKALVDRGLLRIYNKEVGRLNTGDTEPSEKPKPLSAAQQEAYNSIVFQFLKKRV VLLHGVTSSGKTEVYIHLISKALAEKKQVLYLLPEIALTVQMTSRLKRVFGKRLGIYHSR YSDAERVEIWKKQLSAEPYEVILGARSAVFLPFQRLGLVIIDEEHEGSFKQQDPAPRYHA 
RSAAIVLASYYGAQTLLGTATPSTETYYNAQTGKYGLVELTTRYKDIALPEIKVVDVKDL RRRKLMNGPFSPTLLSAVRGALERGEQVILFQNRRGFAPMVECHTCGWVPKCDNCDVSLT LHKNMNQLTCHYCGFTYAVPTTCPNCGETDLRGRGYGTEKIEDQFAQLFPEAKIARMDLD TTRTKNAYERIIHDFSQGKTNVLIGTQMVTKGLDFDKVSVVGILNADAMLNYPDFRAYEQ AFMMMAQVSGRAGRKGKRGLVLLQTTSPELPVIGQVVRNDYQAFYTDLLEERRLFRYPPF FRLVYVYLKHAKEQVAETAGIELGSRLRQLFGDRVLGPDKPAVARVKTLHIRKLVLKLEP SLSGEQVRQCLRYAHNEMAKDKKYATLHVFYDVDPL >gi|283510526|gb|ACQH01000093.1| GENE 21 35789 - 38680 3256 963 aa, chain + ## HITS:1 COG:all4963_3 KEGG:ns NR:ns ## COG: all4963_3 COG0642 # Protein_GI_number: 17232455 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 446 692 6 248 294 145 36.0 5e-34 MKCFVAEVCLLIPSFMEDTSTLLRPSARMALHARKAFFILSALLLLAACAGRHRCFTPQE RRKADSIVAQAQGIEALAKLQAQLEKQGNKLGSVVALREWGKALRDESRFEESLRVHSKG LQQAEAMGDTIEWVLALNYIGTNYRRLGVLDAAQTYHYQALRLTEESGDTCWLARKNYVK SLNGLGNIYMTLGNFRRADSVLRKALAGERQLGSDVGQAINCANLGSVAKHLGKKDSAWA YFHRSMAFNRHANNTLGIALCHTAFGEMYQEAKEYDKAYAEYQTAYEQLHDSKDEWHALQ PLLALVSIDYITGNTSTAFENLQRAKAIAEKIKSKEHLAEVYSHLYRLYERQGNYRAALA NHVLASAMQDSVIDMEKVNRIQSISLNLERQMQGEKVNKARQELASERTAKTNILYAGTL VIVVLGLLIGTLIYMGLMRKRNHLLLKKMSALRENFFTNITHEFRTPLTVILGLSRAISQ DKNVPADTREKTKTIERQGTSLLTLINQLLDISKMKSAVGEPAWRNGNIVAQIAMNIESY REYARMREMTFTADLCGEVLMDFVPDYVNKLMNNLLSNAFKYTPPQGTVSVKVRTDGQKL LMDVSDTGKGIPAESLPHIFEAFYQAENDGAKIGTGVGLALVKQIIDAIGGTITAESTLG QGCVFHVSLPIRQTKSAAEPAQPQRPDMPKPLLPTEGDDPTDTEMANDDALRVLVIDDNV DVAAFIGSQLSDKYAIIYAHNGRDGLNKAEQMVPDVIVTDLMMPGVDGLEVCRRIRANEV TSHIPIIVVTAKITEADRVKGLEAGADAYLAKPFNRDELRMRVEKLLEQRRLLREKYAKL EETDKEDEQPQSESDRKFINKVTDTVYMLLNAGGEIDVTAVAEKMGMTYSQFYRKLSALT GCTPMGYILRIKIRKARLLIDKNPSMPFRDVAERCGFSDYSNFVRAFKNLCGITPTQYVR QKE >gi|283510526|gb|ACQH01000093.1| GENE 22 39035 - 39235 152 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929444|ref|ZP_06423289.1| ## NR: gi|288929444|ref|ZP_06423289.1| hypothetical protein HMPREF0670_02183 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 66 1 66 66 93 100.0 5e-18 MKDARDTPLPRRQSNNRRQCDERVELLEVIKEMVEAEVDKRLRALLDGFKKKKETGDEPS GTAPNK >gi|283510526|gb|ACQH01000093.1| GENE 23 40021 - 40380 157 119 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCGLGSRLAPYKYEKKHFIMGHCHSLSLLFFAGAPAAPLLSSCLVVALWALLLGLMFVYI VSLLYCRGAAKGMPSVAMLWVSLFFSGVFFTGGFRYGPNKRFARIAFVSFSFVRPFDVF >gi|283510526|gb|ACQH01000093.1| GENE 24 40468 - 40929 -33 153 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQCVFISLAFITIPLLPLNKPPRESIFCGKVGDWWIKMALIVLNFLLKNLQRWQVDGWEG KREGKLKGVTSRRANELMSVRIAGLQAGKETYLPQNDKLFAAVLATKEHQKDVACSSFAG RGCPRTAHRTRKRSRTSITGNPAPTNNLTLYLD >gi|283510526|gb|ACQH01000093.1| GENE 25 41692 - 42657 933 321 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929446|ref|ZP_06423291.1| ## NR: gi|288929446|ref|ZP_06423291.1| tetratricopeptide repeat protein [Prevotella sp. oral taxon 317 str. F0108] # 1 321 3 323 323 605 100.0 1e-171 MDKLNQLIRQKKYRAASNLLRKMVEQAPEDGYLLTRMANDLWDDRKNEEALQYADRAKKV EPTDPYVIYTRARVLWALDKNEEAIAEWDNILNMSEHDLKESGYRPMWIKTFVNDARYYK ALSLRALFRDKEALTLMEEHLKNRGKGVRSNFTKKEATLFYKELKYSYLNSDVDYSDEGY ATRPQAHRIGRRMEALEDAKQWDKLVRYLKGVCKRYPKEYYFHVRLSEYSKKVGNKADCL KYATNAFAQEPNDPLVKYNYAVALMYCGRNEDALAQFEGLVALGLDYIAYSEHGEGMRWA KKIMRHTLRYIDDIKRSCVDF >gi|283510526|gb|ACQH01000093.1| GENE 26 42937 - 43911 1024 324 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929449|ref|ZP_06423294.1| ## NR: gi|288929449|ref|ZP_06423294.1| TPR repeat protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 324 1 324 324 566 100.0 1e-160 MSKLAQLIKQNRFKAVENLMRKKLEQEPENVYFLAQLANALWNLNKDEEALSYADKAKSI SPTNPLTLFTRARVLGSLHKFEEAAAEWEELISMGEAEVAEKGFGKGWAKSVINNARYYR ALSLGELSRDKEALALMEEHLKHRGKGVRSDFTKKEATLFYKELKYSYLNSGVDYSDEGY ATSAQAHRIEWRMETLEEARQWDKLVRYLKGVCKRYPKEYYFQVRLSEYSKKVGNKADCL KYAEEAFAQEPNDPLVKYNYAVALKYCGRNEDALAQFEGLVALGLDYIAYSEHGEGMRWA KKIVRHSQRYIEEIKQKEETAENH >gi|283510526|gb|ACQH01000093.1| GENE 27 43915 - 49887 3122 1990 aa, chain - ## HITS:1 COG:MA2045 KEGG:ns NR:ns ## COG: MA2045 COG3209 # Protein_GI_number: 20090892 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Methanosarcina acetivorans str.C2A # 1079 1689 1404 2013 2217 169 27.0 7e-41 MRTYLLTLLCLLVATKGFTQERLYTEKEHGGIENGVFGKINDEINVSPTGQLSYEIPIPT IPGTAGMKPNLSISYNSSTKNGLAGYGFDLTGLSIISRVPSDRFHDGTSTVIGFSRSDHF ALDGQRLFNCSYSTETTNEYRTENNSFARIIAKGKSINPDSFIVYTKSGLIYEYVPVAIA LGKTKTDFTLFWLVSKVTDTKGNYFTVTYRGEAETNDFYPHRVEYTGNAAAGLTPYASIN FSYRDNLYSPVTYITGVKVKRSQILSSVSLSMQGQTVRSFSLDYQVVNRKYQLSKITESG VGRENKNPTQLTWTNLNDFNVKNYDYSKTSFIHKATLTIGDFNGDGLADFIATPENNDAG WKGWKLFISHGTSFELAASGTWNWNDNDLEQVVCGDFDGDGYTDVVAKRCVSGTWHNCDL YTTSVDANKKVHLNFSKCFISLQSNYTIQTVELNGDGAADLFAWLSDSRECKLIYSVEGE YGIKPLDYTATRYCSANWDRVEFGDFNGDGLTDAMNLNEYGNSIMYSDGAGTMTKEDWTS WPDKRHYMELGDFNGDGKTDMLLTGWTADPNSGGWSDWCINYSKGDGTFTREYYEKPFDA RSKHLFIADLNGDGFDDMQAVDNTSSGYNTTRPQVYLNDGKGNFYQQIKGESVYATDKWH FYVGDFNGDGKADFVCTSDWNKSWWDGYQLYLMPSTKNCLLTGIKDGLGNTTNIEYKYLT DDTVFKRGETNRYPLVSIGSSWPVVASVSTPNGIGGTNVTSYRYEDALFHKNGRGLLGFA KCYVKDEATNTLTTTDYAVDKRAYVIAPSHSQTTINNKVIDESDYTYTFKADYSYSIYFK HIYTYMPEMVHQRSYEFNTGELIKDVKTNYEYDNFGNTTKTIIKDGDVETITTNTFTNNT DKWFLGRLMESIVSKSYEGNTITRKSAFEYDQSSGLLSAEVFAPDNANLGYRKTYVHDNF GNIIKSVVSPLAHNSIERVTQTIFDTKGRNIIRSINSLGFTETSTFNEATGLVTTSTDKN NITSNYTYDVFGNLVTASNPISKALKTIGWSSKMPDAPTNALYFEWSKVTGQPAVIEFYD CLGRLLRKVTESINGKKVCIDHTYNKRGLIEKSSEPYYIGDQPLWNWNEYDDAGRTTTQI ASDGSRYTFQYAGLKTAVSDPLGHASTKVSNLNGLLVESIDNAGTSITYKYNPDGKCVET KGPRTTISCAYDMVGNRVGLNDPDLGFSEDTYNAFGELVAHRDAHGETRYVYDSGGRIKE 
ELRPDVEVFTRYDKGWKGAIDVAFSEGPPKSHNLYSYDSYGRVVKKQTLIDDKEYDIVYT YNSANQVETIKYPSGLKIKYEYDACGIQTSVVNADNQKVYWKLLSLDARGQIEKEEYGNG LVTTTTHDPRKGTIASILTPGVQNWGYKFDPVGNLIARHDLKRNLEEVFTYDDMYRLTTV SKNGQVKQSMTYDNAGNITSKSDLGTYTYLDGSNKLHAIKNSKSPLTTWDGITYNSFDKI TVVEARGTFMYIGYGVNKSRVLTDIDDTRRYYVDNLFEQKIKNGKVSNTNYILVFDKAIA IVSQEDNGSSSVKYLHHDHLGSIQAYSDETGKLFQELSYDAWGVRRNPDTWVEFSVLTSS NAYNDHGFGGHEHIDIFEMINMDGRMYDPIVGRFISADPFIQSPDLSQSLNRYAYCINNP LSLIDPSGYSWFSENWKTITASIVGIAVTVVTAGSGSGAAVALIAGAAGGAAGALTGALL NGANIGQIAKSTFIGAFWGAVTGVVKNYVGDIQQFWLRIGAHTLSEGALEGLQGGNVIHG FMMGATSSLGSSFIESNLKPLGKVAEVAANSILGGTIDEIGGGKFANGAITGAFTILFND MMHTDGKKTKQQTNDDDNSNQLAGSLALAGTALLVDDATGIGVIDDPIAFALYATAGAII TWNSATAIYKGIKDSYFKPMAAEHTKNKCKQTNDKHTRRHSGRRYNQNQNAKKGEKNQTY EPKPNPNKRH >gi|283510526|gb|ACQH01000093.1| GENE 28 50402 - 50896 134 164 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCKISLDRTQQINKNTFPMRNLTILLMLLLLCICSKMQGQYIQRYNYTYTYNAQGSCTSR VHVSSHSNDGDIKYPKLSEVEVSPVPNFDTQITLWVRGMPKGQYASYLMTNLGGQVVFKG RTGNGSLTLATGTLPRGLYILNVNGARITKSYKLSKGMFNKPSN >gi|283510526|gb|ACQH01000093.1| GENE 29 51292 - 52230 805 312 aa, chain + ## HITS:1 COG:no KEGG:PRU_2176 NR:ns ## KEGG: PRU_2176 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 5 312 6 312 312 346 55.0 7e-94 MLKRLTIITLWLVLGFSEASAQYDATFSHYFDLEPTFNPAAVGKQPKLNITGAYAMSMAG FENNPRTMVLAADMPFVFAGMNHGAGVQLMNDQIGVFKHQRIALQYAYKLKLGSATLGLG LQGALLLENIDGTRLDPADTNDHALPTSEETGNAVDLGVGLYYMARTWYVGLSAQHLNSP TVELGERHELHVAPTYYLTSGCNISLRNPFVKIKPSALVRTDGTGYRADISTRVEYTNEK RLLYAGVSYAPTISVTGMVGGRFHGIVLGYSYEMYTSAIKPGNGSHELFVGYEQELNFVK KGKNKHKSVRLL >gi|283510526|gb|ACQH01000093.1| GENE 30 52284 - 53789 1504 501 aa, chain + ## HITS:1 COG:MT0739 KEGG:ns NR:ns ## COG: MT0739 COG1262 # Protein_GI_number: 15840119 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mycobacterium tuberculosis CDC1551 # 329 484 136 293 299 82 40.0 2e-15 MQESAARVDWIKQMMKRKRQIVLALCAFGLSLMVSGCFGGKGLSASGRGGEVVGVGSSKN 
FAEPTPYGMTRVGRGYLKMGIEKGDSLWGKNTPVKDISVDGFWMDETEVTNSKYKQFVNW VRDSILRTRLADPAYGGDETYMITEDKDGNAIRPQVNWKKPLPRKPNEDEQRAIESMYVT NPVTGEKLLDYRQLNYRYEIYDYTTAALRKNRILPQERNLNTDLTVDPNEEVMISKDTAY VDDNGNIVSETINRQLSGPWDFLNTYIVNVYPDTTCWVNDFTNSDNEVYLRNYFSNPAYN DYPVVGVTWEQANAFCAWRTDYLLKGLGAVARFVQRYRLPTEAEWEYAARGKSGAEFPWE NPNVKNGQGCFYANFKPDRGDYTQDGNLITSRVGAYPSNSNGLYDMAGNVAEWTSTIYTE AGVDAMNDLNPQLDYKAAKEDPYRLKKKSVRGGSWKDPESYIRSAWRSWEYQNQPRAYIG FRCVRSLATTSTGKPQKQKSK >gi|283510526|gb|ACQH01000093.1| GENE 31 53786 - 54631 907 281 aa, chain + ## HITS:1 COG:no KEGG:PRU_2174 NR:ns ## KEGG: PRU_2174 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 17 276 1 236 238 241 54.0 2e-62 MTSYSKYNLVYRLQKWMDSVPGQTFLNYAYSWGASIVILGTLFKLTHLPGANLMLFFGMG TEVVVFFLSAFDRPFDKTADGMDLPSHVTKEHLEGTDVSAEQPTKIRVAAVQQPPVVGVQ QQQSAAVRQSAVGVQQPTVALSANVSTPASQVPAAHTHANAQTTPAANAAQGSPLLADIE VVEATSRYIDELNKLTETLEKVSQQSARLTRDSEEMENLNRTLTGITKVYEMQLKGASQQ IGTIDQINDQSRRMAQQIEQLNAIYARMIEAMTVNMKMQNP >gi|283510526|gb|ACQH01000093.1| GENE 32 54643 - 56199 1697 518 aa, chain + ## HITS:1 COG:no KEGG:PRU_2173 NR:ns ## KEGG: PRU_2173 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 518 1 519 519 620 60.0 1e-176 MAIKKRPISPRQKMINLLYVVLMAMLALNVSTEVLEGFSVVEDSLKRTISGSTKQNEALY DVFASQLKSNPEKVKAWFDKAEGVKRASDSLFNFVEKLKVAIVREADGSDADVHDIVNKE NLEAASHVMLAPIEGKGRKLYEGINAYRNYILKMVNDPTQREIIAQNLSTEIPKGKRSMG KNWQEYMFESMPVSAAVTLLSKLQSDIRYAEGEVLHTLVANIDMKDIRVNKLDAFVIPEK TTLYPGETFSASIVMAAVDTTQQPDIYINGARVNLRGGKYSFAAGGVGQHSFGGYITMRN GSGEVLRRNFLQKYEVVAMPTAATVAADLMNVLYAGYTNPMSVSVSGIPQNAISMTMTGG KLTNKGNGHYVAVPSAVGHDVTFNITANDKGKVRSLGQAVFHVRKLPDPTAYIAVGTDRF KGGGLSKGALMGAEGINAAIDDGILDIRFRVLSFETVFFDNIGNAVPMSSAGSQFSERQR DAFRKLSHSKRFYISNVSAVGPDGITRKLPQAMEVIVR >gi|283510526|gb|ACQH01000093.1| GENE 33 56330 - 57325 1152 331 aa, chain + ## HITS:1 COG:no KEGG:PRU_2172 NR:ns ## KEGG: PRU_2172 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: 
not_defined # 30 331 43 351 351 316 56.0 6e-85 MSCALQVAFAQPAARRNNQQNTATGNADNVSLRARISFPTQSKMDEDVVWRRDIYRELNL TEDANAGLYYPVEPIDGRMNLFTYLFKLVMRGQVKAYEYRLDGNESFEDSARVKPLALLD NYHIFYERVDGRVRIDNSDIPSREVTSYYIKESAYYDQATATFHTKVLALCPIMKRQDDF GDATTYPLFWVKYDDLAPFLAKQTVMTSNLNNAATMSLADYFTMNKYRGKIYKTNNMLGL TLAQYCKSDSAMAQEQKRIEAEIAAFEKGMWGVKAPQQADSLAQTTKNSRGVKAPRANRR ATVAPAPKKEKAARSSQGSASPRMSVRRERH >gi|283510526|gb|ACQH01000093.1| GENE 34 57402 - 58028 715 208 aa, chain + ## HITS:1 COG:no KEGG:PRU_2171 NR:ns ## KEGG: PRU_2171 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 208 1 204 204 188 45.0 1e-46 MKKMMKMAVLAAALTLSANASAQTKLGLKGGINTSEVSVNDDVWKTDNRLGFFIGPTVKF TLPIVGMGMDISALYEKRESKMKSEDGKLGSVVSREQLAIPINARYSFGLGETANIFLFA GPQVAFNLGKKDKEIVPEVADWTLKSSNFSINLGIGCTLADHLQATLSYNIAVGKTGEVE ISKDAVGDAVKKYDGRSNAWQLGVAYFF >gi|283510526|gb|ACQH01000093.1| GENE 35 58594 - 58974 391 126 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929457|ref|ZP_06423302.1| ## NR: gi|288929457|ref|ZP_06423302.1| hypothetical protein HMPREF0670_02196 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 126 1 126 126 233 100.0 3e-60 MEKMKKSGKIIFVVLGILLVLGTGVFTYLNHIPKAEAFGYESLVLCVTTFLFYRPFTKRG ELEAILLNFVIILGVTAVVTVPEWTFNNIITAWGWSAIGFLVVCAISFIVRKTRLMDVEN RPAEHN >gi|283510526|gb|ACQH01000093.1| GENE 36 58991 - 59197 78 68 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFFHAPQSLPIAVHSAKLCKKKQIAKLYTNNIFHCLPKDQGIDQQGKIAHFYTRETSQII SMQATCRW >gi|283510526|gb|ACQH01000093.1| GENE 37 59435 - 61588 2473 717 aa, chain + ## HITS:1 COG:CAC1031 KEGG:ns NR:ns ## COG: CAC1031 COG0370 # Protein_GI_number: 15894318 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Clostridium acetobutylicum # 34 716 7 681 683 312 30.0 2e-84 MPNKQNTSPCDACPKAAMHSLQRRGAQAGGERYVVALAGNPNTGKSTVFNALTGLKQHTG NWPGKTVGKAEGYFAYQGEGYRIVDLPGTYSLSSTSEDEEIARDFILFGQPDVTVMVADA TRLERNMNLILQVLQITDKAVLCVNLLDEAKRNKIDINLNALSRRLGIPVVGASARSGQG IDQLLERMRQVAEGEYKCRPHRSTNLPPHAAQHVKELARKIAETHPDISNAEWIAFRLIE GDTSVKERFGNPELNALAEKLLLEIGSNFHDLWVEDIYTQAGQICSEVVTQGNQRGRLPL DVKLDRILTHRIWGFPIMLALLCGVFWLTIIGSNYPSDWLNQLLVGIIHPALHDTFVSMH APWWLTGLLVDGVYLATAWVVSVMLPPMAIFFPLFTLLEDFGYLPRVAFNLDELFRRSGA HGKQALTMSMGFGCNAAGVVSTRIIDSQRERLIAIITNNFSLCNGRWPTQILLATLFIGA AVPARYSSMVSLVAVMTVVLMGVGFMFGSSWLLSRTILRGEVSTFHLELPPYRPPQFWQT LYTSIIDRTLIVLWRALVFAAPAGALIWLCCNIHVADATIAQHLISLLDKPGWLMGLNGV ILLAYVLAIPANEIVIPTVMMLTMLVLGQSDMGAAGVLMEGTEQQTKQILLLGGWNMLTA VCLMVFCLLHHPCSTTIYTIYKETKSAKWTTVATLLPLALGIVMTTLIATVWRMVAE >gi|283510526|gb|ACQH01000093.1| GENE 38 61917 - 63122 977 401 aa, chain + ## HITS:1 COG:no KEGG:BDI_2811 NR:ns ## KEGG: BDI_2811 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 8 390 7 390 393 391 55.0 1e-107 MSSNITTKAKAFLAKPIFHDRRTILWLWIALSVIAAVLKYNRTDNNYRIFRGVFWHTLQC TSLYAEYPLEYYDVNHYGPFFSLVIAPFALMPIPLGLVFWCIALSLTLYFAITRSTFSSW QQMFVLWFCSETLLTSLFMQQFNITIAAIIIASYALIEKERDFWAACLIVLGTFVKLYGI VGLAFFFFSRHKGKFVLSLLFWGVVLFVAPMIISSPGYVVSQYHEWFVCLVEKNGENLAS EAQNISALGMVRRVLGNPQYSDLLILAPALVLFALPYLRFKQWRNEGFRMTLLASVLLFT 
VLFSTGSESSSYIIALSGVCVWYFAAPWQRGKADIWLLVFVFLLSSMGSSDLYPRAIKRE YIQAYSLKALPCLIVWLKLCWEMMAKNYQPHPQTLTKDRGE Prediction of potential genes in microbial genomes Time: Sat May 28 02:15:52 2011 Seq name: gi|283510525|gb|ACQH01000094.1| Prevotella sp. oral taxon 317 str. F0108 cont2.94, whole genome shotgun sequence Length of sequence - 8568 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 7, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 29 - 841 150 ## BT_0082 hypothetical protein + Prom 857 - 916 4.2 2 2 Tu 1 . + CDS 1124 - 1555 127 ## gi|288929462|ref|ZP_06423307.1| conserved hypothetical protein + Term 1590 - 1637 6.5 - Term 1517 - 1555 -0.9 3 3 Op 1 . - CDS 1610 - 2035 351 ## PRU_2901 hypothetical protein 4 3 Op 2 3/0.000 - CDS 2013 - 2975 883 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 5 3 Op 3 . - CDS 2972 - 3781 777 ## COG0726 Predicted xylanase/chitin deacetylase - Prom 3892 - 3951 2.5 + Prom 3840 - 3899 4.0 6 4 Tu 1 . + CDS 4144 - 4707 420 ## PG1466 hypothetical protein - Term 5172 - 5222 13.1 7 5 Op 1 . - CDS 5235 - 5456 345 ## PGN_1678 hypothetical protein 8 5 Op 2 . - CDS 5518 - 6072 516 ## COG2096 Uncharacterized conserved protein 9 5 Op 3 . - CDS 6096 - 6686 418 ## BF4083 hypothetical protein - Prom 6850 - 6909 5.0 + Prom 6787 - 6846 4.2 10 6 Tu 1 . + CDS 6935 - 7672 902 ## COG2859 Uncharacterized protein conserved in bacteria + Term 7689 - 7732 11.9 11 7 Tu 1 . 
+ CDS 7804 - 8370 259 ## gi|288929471|ref|ZP_06423316.1| hypothetical protein HMPREF0670_02210 Predicted protein(s) >gi|283510525|gb|ACQH01000094.1| GENE 1 29 - 841 150 270 aa, chain + ## HITS:1 COG:no KEGG:BT_0082 NR:ns ## KEGG: BT_0082 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 268 36 302 306 219 41.0 1e-55 MKSFTKQEVRKILTEEDVPQASLGGSRITLTCRQCNSTCGSEIDVHLYNAIKAREQRLFL PKTNRKITVEKENQRLNAELIVEDNKNIKLFINEKRNNPRVWEFFHNNILLPDEIIDIAD YPLKRDERRIGSALIKNAYLLLFAKTGYSFLTDSYYDDLRLQIANPEVFYLPERLWTAQN ISIDDGIYLTQDNRYRGFFVIYTLKLQQIYRVCVLIPTPLIPFLFAAKELGKIIAGAGLR ILKLPELNYLEDSNAINRLRDWCYEWKMNF >gi|283510525|gb|ACQH01000094.1| GENE 2 1124 - 1555 127 143 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929462|ref|ZP_06423307.1| ## NR: gi|288929462|ref|ZP_06423307.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. F0108] # 1 143 18 160 160 270 100.0 3e-71 MASLPELFSVKTHICVKKELNSPSVVNVLQLTSELNQFAQEPLAQIQATGNQGNVNTQES LAQLGEVSETLSTIYQHDINQIEEIFKQSAVVRTNWASLVILRYSLAVGDIYGIGRLPFL ASLYRYFKSQSDFIILKVGNFHP >gi|283510525|gb|ACQH01000094.1| GENE 3 1610 - 2035 351 141 aa, chain - ## HITS:1 COG:no KEGG:PRU_2901 NR:ns ## KEGG: PRU_2901 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 21 141 7 127 127 114 49.0 1e-24 MTTTQRFKGWWKGHPAFRAHFWRVFRFGIVGAICTALHYGVYCLCLLFANANVSYTVGYV VGLLCNYVLTTYFTFQSKATRGNVAGFVASHAVNYLMEIGLLNLFLWWGLSKWLAPIVVM AIAVPINFLLLNVVYKKRSKE >gi|283510525|gb|ACQH01000094.1| GENE 4 2013 - 2975 883 320 aa, chain - ## HITS:1 COG:L29089 KEGG:ns NR:ns ## COG: L29089 COG0463 # Protein_GI_number: 15672983 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Lactococcus lactis # 1 315 1 318 318 311 50.0 1e-84 MTRLAIVSPCYNEEAVLDTSVARLTQLLDTLVSLGDVSAESMIVFVNDGSTDATWQLICR HQRENHYVRGINLARNVGHQSAIMAGMLTARHWADVVITIDADLQDDLAAIPRMLQHYRE 
GSDIVYGVKVSRKADPVFKRLSAVAFYKLQSKMGVKSVFNHADFRLMSRRALDILAGYGE RNLYLRGLIPMIGLESSTVDDVISERTAGTSKYTLKKMLSLALDGITSFSVRPIYLILYI GLFFLFLSLVAGVYVVHALIVHTAYPGWASIMLSLWLIGGVIMLSIGVVGVYVGKIYTEV KNRPLFNIREIVGDDDHTKV >gi|283510525|gb|ACQH01000094.1| GENE 5 2972 - 3781 777 269 aa, chain - ## HITS:1 COG:MA0797 KEGG:ns NR:ns ## COG: MA0797 COG0726 # Protein_GI_number: 20089681 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Methanosarcina acetivorans str.C2A # 9 220 16 207 250 82 31.0 7e-16 MQTKEKELILLSFDTEEFDVPREHGVDISLEEGMRVSVEGTTRILDCLKSNGVRATFFCT GNFAEQAPRLIRRIMDEGHEVACHGVDHWKPVPEDVWRSKNILERVAGVPMHGYRQPRMF PVSDDALAEAGYLYNSSLNPAFVPGRYMHLTTPRRWFMKGEVMQIPASVTPLLRFPLFWL SLHNLPQWLYNAMVRRVLRHDGYFVTYFHPWEFYDLKEHPEYKMPYIIRNHSGVDMVNRL DRLVKMMKRRDAEFITYTQFVERKKGEKR >gi|283510525|gb|ACQH01000094.1| GENE 6 4144 - 4707 420 187 aa, chain + ## HITS:1 COG:no KEGG:PG1466 NR:ns ## KEGG: PG1466 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis # Pathway: not_defined # 1 187 3 189 190 256 73.0 4e-67 MLDLSFFTTLKLGWLNAWIPAFGMVVIQFVYMALFREVGKRAVDTSWYTPRDKFWAMVST FLQIVLLVLSIFVPFKLGTAWFLIGSTIFALSFAAFIWAFHSYGIDPAGKTIKSGIYRWS RNPMYFFFFAGMFGTCIASASLWLLIVIVPFVIATHFTILGEERYCAQTYGKEYLEYKAK TPRYLLF >gi|283510525|gb|ACQH01000094.1| GENE 7 5235 - 5456 345 73 aa, chain - ## HITS:1 COG:no KEGG:PGN_1678 NR:ns ## KEGG: PGN_1678 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 1 73 13 85 85 110 87.0 2e-23 MYWTLELASKLEDAPWPATKEELIDYAMRSGAPLEVLENLQEIEDEGDVYESIEDIWPDY PTKEDFLFNEDEY >gi|283510525|gb|ACQH01000094.1| GENE 8 5518 - 6072 516 184 aa, chain - ## HITS:1 COG:lin1172 KEGG:ns NR:ns ## COG: lin1172 COG2096 # Protein_GI_number: 16800241 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 1 182 1 179 188 145 42.0 5e-35 MKVYTKTGDKGTTSIVGGARIDKSHARIEAYGTIDELNSALGYLLALLPEGSHVSLLESV 
QHALFNIGCHLATDIEPGNETDAPCLDPSCAVQLETTIDRMQEKLPPQTHFILPGGTPAA AWAHVCRTICRRAERRVVALSHEAHVAFSALQYLNRLSDYLFVLARFINHNADLSEKTWQ NTCK >gi|283510525|gb|ACQH01000094.1| GENE 9 6096 - 6686 418 196 aa, chain - ## HITS:1 COG:no KEGG:BF4083 NR:ns ## KEGG: BF4083 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 2 196 16 220 220 110 33.0 2e-23 MYYLERLWIWITRLTCCRGFGIQSPSAYSFVRYVVNEHYPYYAYADLADSFPQLGKRERK LGEFYFRLANFAQAYQWLCCGSMPHWLAPYVQAGCKGTAVVGLKEAELHVGISPEHPLVV HLLDPLRAEETYERVLPMLNQKSILVVEGIGYSHHARQLWQKLRHDLRTGVTFDLHYCGL VFFDTVRPKQHYVINF >gi|283510525|gb|ACQH01000094.1| GENE 10 6935 - 7672 902 245 aa, chain + ## HITS:1 COG:slr1258 KEGG:ns NR:ns ## COG: slr1258 COG2859 # Protein_GI_number: 16330444 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Synechocystis # 39 245 49 251 251 113 33.0 3e-25 MDKTKIIITSIIGVVIVIGAWLIAAGFKNFREGEPNINVTGMAEKQITSDLIVWKISVNA EGSTRSSAFTEFEKAARVMRQYLQTHGIPEKEITMSSVDISKRTKDFWDEDSRKYVTLEN GFAVSQTFTVSSNDLKRVESVYQRISELYNRGMDFSSEKPLYYYTKLNDLKMEMLNQASA NAYERAQTIAKGSDSKVGAIVSSSMGVFQIVGLNSEEEYSWGGTFNTSSKEKVASITVRT TYKVK >gi|283510525|gb|ACQH01000094.1| GENE 11 7804 - 8370 259 188 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929471|ref|ZP_06423316.1| ## NR: gi|288929471|ref|ZP_06423316.1| hypothetical protein HMPREF0670_02210 [Prevotella sp. oral taxon 317 str. F0108] # 1 188 1 188 188 363 100.0 3e-99 MWFFLRKNINKKDTFVTLNGSYDKLSRYFFSDIEDEKYIVNVKFSGVAQKGFTHQLLAQA DYLENSRFGGIPLFSQRFMERTKQYLSGKIDFHPCQILLNDVTYVFYLGRIKCIKPIIDY EKSGYRILTDGSRILSEPKVIKASIDEEMLIVRDATYKSTFVVSELFKQMVEKEKLKVGF DNTSSTFW Prediction of potential genes in microbial genomes Time: Sat May 28 02:16:22 2011 Seq name: gi|283510524|gb|ACQH01000095.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.95, whole genome shotgun sequence Length of sequence - 2360 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 1, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 200 - 490 196 ## gi|288929556|ref|ZP_06423400.1| prophage PSPPH06, putative tail tape measure domain protein 2 1 Op 2 . + CDS 511 - 1086 476 ## gi|288929474|ref|ZP_06423318.1| conserved hypothetical protein 3 1 Op 3 . + CDS 1055 - 1309 229 ## Sez_1333 hypothetical protein 4 1 Op 4 . + CDS 1346 - 1672 340 ## gi|288929556|ref|ZP_06423400.1| prophage PSPPH06, putative tail tape measure domain protein 5 1 Op 5 . + CDS 1693 - 2232 419 ## gi|288929476|ref|ZP_06423320.1| conserved hypothetical protein Predicted protein(s) >gi|283510524|gb|ACQH01000095.1| GENE 1 200 - 490 196 96 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929556|ref|ZP_06423400.1| ## NR: gi|288929556|ref|ZP_06423400.1| prophage PSPPH06, putative tail tape measure domain protein [Prevotella sp. oral taxon 317 str. F0108] # 1 96 218 313 313 140 89.0 3e-32 MGKIAGRALIVVGIAWDAYCINEAYQEEGEFGDKTQQATGAAVGGLAGAWAGAEIGAIVG TAVCPGVGTFVGAVIGGLVGAFFCSWGGSALVDSIF >gi|283510524|gb|ACQH01000095.1| GENE 2 511 - 1086 476 191 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929474|ref|ZP_06423318.1| ## NR: gi|288929474|ref|ZP_06423318.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 191 1 191 191 355 100.0 6e-97 MGLFDAIFGSSKKEQEKPIRDKLKDRKHFEKWLKNTLQEIAEDEISILKAKESGNEVSTF LYYVLDDRNVDVIQIKYSMDASIIELREIYMDSLEYFRLSFEPEEPMYFEILNRVSLGIL LNIPDENLMQLVDYVQRMDEEAKPADWTPDLLLWFLLNSRLKDNEKRAHAQKLAFPRLYT RIVQGNTSHRQ >gi|283510524|gb|ACQH01000095.1| GENE 3 1055 - 1309 229 84 aa, chain + ## HITS:1 COG:no KEGG:Sez_1333 NR:ns ## KEGG: Sez_1333 # Name: not_defined # Def: hypothetical protein # Organism: S.equi # Pathway: not_defined # 3 79 150 229 231 82 53.0 5e-15 MYKVIQATDSKAALKALKDYIGKWYNLNKDAPWYDSHLKKNCYSGYWAWEVAAVAKIMHI DDTDLKDNPYYPYDMVHWEEDREE >gi|283510524|gb|ACQH01000095.1| GENE 4 1346 - 1672 340 108 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929556|ref|ZP_06423400.1| ## NR: gi|288929556|ref|ZP_06423400.1| prophage PSPPH06, putative tail tape measure domain protein [Prevotella sp. oral taxon 317 str. F0108] # 2 108 207 313 313 159 91.0 6e-38 MQYIDSSPVLRTVGKVAGRALIVVGIAWDAYCINEAYQEEEEFGDKTQQATGAAVGGLAG AWAGAEIGVIVGTAVCPGVGTFVGAVIVGLIGAFLCSRGGSALVDSIF >gi|283510524|gb|ACQH01000095.1| GENE 5 1693 - 2232 419 179 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929476|ref|ZP_06423320.1| ## NR: gi|288929476|ref|ZP_06423320.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. F0108] # 1 179 6 184 184 343 100.0 3e-93 MGLFDAIFGSSKKEQEQPIRDKLKKRKDFEKVLKIRLAEIDRYKEILTTGQGDELYSCYM IADDSMDVIQIRYSMDGGLKELQKIYMDSFDYFLRDNSCDGAIYDDVLYRISLGILLNIP DENFMQLVDYVQRLDEEAKPANWTPDLLLWFLLNSRLKDNEKRTHAQKLAFPRECKGLY Prediction of potential genes in microbial genomes Time: Sat May 28 02:16:55 2011 Seq name: gi|283510523|gb|ACQH01000096.1| Prevotella sp. oral taxon 317 str. F0108 cont2.96, whole genome shotgun sequence Length of sequence - 3162 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 17 - 559 482 ## gi|288929477|ref|ZP_06423321.1| hypothetical protein HMPREF0670_02215 2 1 Op 2 . 
+ CDS 574 - 1104 284 ## gi|288929478|ref|ZP_06423322.1| hypothetical protein HMPREF0670_02216 3 1 Op 3 . + CDS 1177 - 1365 110 ## gi|288929479|ref|ZP_06423323.1| hypothetical protein HMPREF0670_02217 4 2 Tu 1 . + CDS 1486 - 3160 1711 ## COG3209 Rhs family protein Predicted protein(s) >gi|283510523|gb|ACQH01000096.1| GENE 1 17 - 559 482 180 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929477|ref|ZP_06423321.1| ## NR: gi|288929477|ref|ZP_06423321.1| hypothetical protein HMPREF0670_02215 [Prevotella sp. oral taxon 317 str. F0108] # 1 180 1 180 180 350 100.0 2e-95 MKYNIFSFRESDDYDKKRYGFAESPDSVDPIDIYQGLPLEKAWSEPMFELTEGRFSDYLS NDCDWVLCSEKLKQCIETNAINATDITWCSVAVNDGDVVETYYALLMETPMKEILDIEKS RKLRNGKIYLPHFVYDKIKDVDIFTLEEEYPNYVYISDRLKQRIEAESLTGVGFEDWYAS >gi|283510523|gb|ACQH01000096.1| GENE 2 574 - 1104 284 176 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929478|ref|ZP_06423322.1| ## NR: gi|288929478|ref|ZP_06423322.1| hypothetical protein HMPREF0670_02216 [Prevotella sp. oral taxon 317 str. F0108] # 1 176 1 176 176 327 100.0 2e-88 METEKVYFVESSEDYDDQHYGFAFSDEGLSPADIQNWKQDVAWKELKMELRDGEFADYLV NDLDIPLCSRRLKDLIVRHADNSDDIIWYPIIIQSASSWNQERYYYLKTSILLDDVIDFE KSDIEKGIVYVPYFIKEKIRDIFRCAYDGSYLFVSQALKDVITNNNITGISFETWE >gi|283510523|gb|ACQH01000096.1| GENE 3 1177 - 1365 110 62 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929479|ref|ZP_06423323.1| ## NR: gi|288929479|ref|ZP_06423323.1| hypothetical protein HMPREF0670_02217 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 62 1 62 62 103 100.0 2e-21 MEWMVGNYLMIKDYPNAKKWVDKLGTVFKNQEILGDWDLKGKTNRINFLGNWMMMNKREL GS >gi|283510523|gb|ACQH01000096.1| GENE 4 1486 - 3160 1711 558 aa, chain + ## HITS:1 COG:RSp1137 KEGG:ns NR:ns ## COG: RSp1137 COG3209 # Protein_GI_number: 17549358 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Ralstonia solanacearum # 1 553 686 1232 1517 140 26.0 6e-33 MVLRTLPGGESHAYAYEGSDLTGIADGEGRTFTLAYNARHELIRLTFPNGLHRDWRYDGR GRMVWARDVLGNVTTYAYDDADNLIRIEETDGNVHNLVYDTSGNLVKASDLLHDVEFDYG PLGTLTGRRQFGRHVRFGYDSELQLREIVNAGGERYAFGLDGLGRVVEETGFDGLKRKYL RDGAGRVVRVVRPCGRQTDYELDGAGNVLEERHHDGRTSRFAYDGDGLLLRAENGDTKVA FKRDAAGRVIKETQGGHSIVRTFDPTGRHVRTESSLGASVDYRHDGQGRLQEMSCGEWSA KWLRDSVGLEAERHLTGGVHVITRRDRFGRETFKSVGARNVEQFRRRYTWDIGNRLLVAR DEISGRVARYDYDEFDNLIAAEYERGGEVERLYRVPDRMDNLFETRERDDRRYDAGGRLA EDREFFYHYDCEGNLVFKEFKEMVLRGGIVAPINKERLETELGIRFRAFGTGWRYDWQSD GMLARVVRPDGKEVSFAYDALGRRIRKSYAGTTTHFVWDGNVPLHEWTETAESEESEMVT WLFEQDTFVPAAKLIANG Prediction of potential genes in microbial genomes Time: Sat May 28 02:17:17 2011 Seq name: gi|283510522|gb|ACQH01000097.1| Prevotella sp. oral taxon 317 str. F0108 cont2.97, whole genome shotgun sequence Length of sequence - 2121 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 456 90 ## BF2852 putative RhsD protein 2 1 Op 2 . + CDS 459 - 1109 386 ## gi|288929483|ref|ZP_06423327.1| hypothetical protein HMPREF0670_02221 3 2 Tu 1 . 
+ CDS 1222 - 2119 995 ## COG3209 Rhs family protein Predicted protein(s) >gi|283510522|gb|ACQH01000097.1| GENE 1 1 - 456 90 151 aa, chain + ## HITS:1 COG:no KEGG:BF2852 NR:ns ## KEGG: BF2852 # Name: not_defined # Def: putative RhsD protein # Organism: B.fragilis # Pathway: not_defined # 1 40 1315 1354 1462 66 75.0 3e-10 YYDPNGGSYISQDPIGLAGGNPTLYAYVSDVNCWNDVLGLTAEVYKLVATKDGYYDVYEW GNDKPVGKTYLKKGDTWKIGETTNFRTRKDGTEIQNRYTQKWLRQNNLEYKRLQYSPNKS AKTSFQNFETSRIEKFEKQFGKKPAGNKCYH >gi|283510522|gb|ACQH01000097.1| GENE 2 459 - 1109 386 216 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929483|ref|ZP_06423327.1| ## NR: gi|288929483|ref|ZP_06423327.1| hypothetical protein HMPREF0670_02221 [Prevotella sp. oral taxon 317 str. F0108] # 1 216 6 221 221 429 100.0 1e-119 MMNRVSIQIDSPSYVKVSEIGKELVAFSLTIERKYPQIDFEFAFCFRALYTTRGIRSKVR FEKDCNYLGMDLIISEEEFNPYKNNVSMQRRIMGKHFFPFFAENIKKYRNKLPVLKPIEK DLVEDMRLFLIENLWLPDDSGSFKLAVIENVSYDRAMALFGKPRQKKFTDTDNGKIQDIL WEVDEQTQLSARYRLIDKVWTLESYNIAEGQIGIWK >gi|283510522|gb|ACQH01000097.1| GENE 3 1222 - 2119 995 299 aa, chain + ## HITS:1 COG:YPO3615 KEGG:ns NR:ns ## COG: YPO3615 COG3209 # Protein_GI_number: 16123757 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Yersinia pestis # 5 276 739 1005 1512 82 27.0 9e-16 MVWARDVLGNVTAYAYDDADNLVRIEEADGNVHNLVYDASGNLIKASDLLHDVEFDYGPL GTLTGRRQFGRHVRFGYDSELQLREIVNEGGERYAFGLDGLGQVVEETGFDGLKRKYLRD GAGRVTRVVRPGGRQTDYELDGAGNVLEERHHDGRTSRFAYDGDGLLLRAENEETKVAFK RDAAGRVVKETQGGHSIVRTFDPTGRHVRTESSLGASVDYRHDGQGRLQEMSCGEWTAKW LRDGVGLEAERHLTGGVHVTTRRDRFGRETYKSVGARNVEQLRRRYTWDMGNRLLVARD Prediction of potential genes in microbial genomes Time: Sat May 28 02:17:30 2011 Seq name: gi|283510521|gb|ACQH01000098.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.98, whole genome shotgun sequence Length of sequence - 3891 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 328 123 ## Ctu_01120 hypothetical protein 2 1 Op 2 . + CDS 325 - 555 193 ## gi|288929486|ref|ZP_06423330.1| hypothetical protein HMPREF0670_02224 3 1 Op 3 . + CDS 555 - 1397 469 ## gi|288929487|ref|ZP_06423331.1| hypothetical protein HMPREF0670_02225 4 2 Op 1 . + CDS 1546 - 2592 657 ## COG3209 Rhs family protein 5 2 Op 2 . + CDS 2607 - 3125 351 ## ABBFA_002270 hypothetical protein + Prom 3173 - 3232 3.5 6 3 Tu 1 . + CDS 3324 - 3891 633 ## COG3209 Rhs family protein Predicted protein(s) >gi|283510521|gb|ACQH01000098.1| GENE 1 2 - 328 123 108 aa, chain + ## HITS:1 COG:no KEGG:Ctu_01120 NR:ns ## KEGG: Ctu_01120 # Name: not_defined # Def: hypothetical protein # Organism: C.turicensis # Pathway: not_defined # 1 101 1353 1452 1523 75 45.0 6e-13 YYDPNAGSYISQDPIGLKGGNPTLYGYVGNTNNWYDVWGLRPFGHAVGDIGEKAVINDLK KNGYEIIDVKYGSNNGIDVLAKNPKTGKYDAFEVKSSTVATSTSPKPR >gi|283510521|gb|ACQH01000098.1| GENE 2 325 - 555 193 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929486|ref|ZP_06423330.1| ## NR: gi|288929486|ref|ZP_06423330.1| hypothetical protein HMPREF0670_02224 [Prevotella sp. oral taxon 317 str. F0108] # 1 76 1 76 76 125 100.0 1e-27 MNPSDFVKTRLNETELRGKINEDTRAEIMSKLGERKIAYVDIKRGKRGKLYADSISYENW DTETKRIEKLKKGGHH >gi|283510521|gb|ACQH01000098.1| GENE 3 555 - 1397 469 280 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929487|ref|ZP_06423331.1| ## NR: gi|288929487|ref|ZP_06423331.1| hypothetical protein HMPREF0670_02225 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 280 1 280 280 535 100.0 1e-150 MKKKVFDKLVERELKKFKNYVQSDFMTKFISDMSIDYNNYIKHDTAYGNYFGMHFYPFWH NSNQEDFTGIIDSYMLEYMVCTMYVHDPEKKFRYMLVGASSLFLAILIFGDDRERDTMFH SIVNLIKEYITMEYSINYQSTTLQEAFLLYDIQTNSKNHDQWTPYITVSLTPLYQDCMDS ILSDDEERVNTLLSDMLDYHVKQSNNDNFVRSEFTYPSQRVFPTEILALIHYRHTQGKSI DFIDDEVLSVFVPYIKAGNFRPSPAVEKARNDMYGLLGIR >gi|283510521|gb|ACQH01000098.1| GENE 4 1546 - 2592 657 348 aa, chain + ## HITS:1 COG:YPO3615 KEGG:ns NR:ns ## COG: YPO3615 COG3209 # Protein_GI_number: 16123757 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Yersinia pestis # 31 226 1161 1383 1512 131 34.0 2e-30 MALRGGIVAPINKERLETELGIRFRTFGTGWRYDWQSDGMLARVVRPDGKEVSFAYDALG RRIRKTYAGTTTHFVWDGNVPLHEWTETAESEKSEMVTWLFEQDTFVPDAKLVANGECFS IISDYLGTPLQAYDKQGNKVWEQELDIYGRQRKRPSAFIPFKYQGQYEDAETGLYYNRFR YYDPNAGSYISQDPIGLAGGNPTLYGYVGEPNSWIDVLGLSKIPAQTLQMKWETHVGSKH TQAHIHHGFPEKFSDRFKNIADIDVNNPKYFYNLPQNKHTKKPGVHTNSSRTGRNWNRTW SGILDRVEKMNLSKSEAKEFLEARLRELARKERISKYNSEAIKGAGKH >gi|283510521|gb|ACQH01000098.1| GENE 5 2607 - 3125 351 172 aa, chain + ## HITS:1 COG:no KEGG:ABBFA_002270 NR:ns ## KEGG: ABBFA_002270 # Name: not_defined # Def: hypothetical protein # Organism: A.baumannii_AB307-0294 # Pathway: not_defined # 6 172 6 172 172 158 51.0 8e-38 MEFRKLSECEVFERMKRRKMEIRHLEKLPPLDKLYLSFISKYEGVEITPDIEIYGYEKVL YENRYLVTNYSDISQQVWMIGTSGQGDGWFINKESNLVLFYDHDQGEYSNITQFTSLNIS FCCFLQMAFLYQDLEKMLRKQENVNKILLDAFVKEIDSIYPNLFKAYPYKYF >gi|283510521|gb|ACQH01000098.1| GENE 6 3324 - 3891 633 189 aa, chain + ## HITS:1 COG:YPO3615 KEGG:ns NR:ns ## COG: YPO3615 COG3209 # Protein_GI_number: 16123757 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Yersinia pestis # 3 181 791 981 1512 73 31.0 2e-13 MLRDVEFDYGPLGTLTGRRQFGRHVRFGYDGELQLREIVNEGGERYAFGLDGLGQVVEET GFDGLRRKYVRDGAGRVVRVVRPGGRQTDYELDGAGNVLEERHHDGRISRFAYDGDGLLL RAENGETKVAFKRDAAGRVIKETQGGHSIVRTFDPTGRHVHTESSLGASVDYGHDEQGRL QEMSCGEWL Prediction of potential genes in microbial genomes 
Time: Sat May 28 02:17:50 2011 Seq name: gi|283510520|gb|ACQH01000099.1| Prevotella sp. oral taxon 317 str. F0108 cont2.99, whole genome shotgun sequence Length of sequence - 2021 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 1221 1172 ## COG3209 Rhs family protein 2 1 Op 2 . + CDS 1227 - 1850 294 ## gi|288929492|ref|ZP_06423336.1| hypothetical protein HMPREF0670_02230 Predicted protein(s) >gi|283510520|gb|ACQH01000099.1| GENE 1 1 - 1221 1172 406 aa, chain + ## HITS:1 COG:YPO3615 KEGG:ns NR:ns ## COG: YPO3615 COG3209 # Protein_GI_number: 16123757 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Yersinia pestis # 111 295 1161 1377 1512 110 33.0 4e-24 ITGRVARYDYDEFDNLISAEYERGGEVERLYRVPDRMGNLFESREKDDRKYDAGGRLAED REFFYHYDCEGNLVFKEFKELSWGGNVIAPINKERLETELGIRFRAFGTGWRYDWQSDGM LARVVRPDGKEVSFAYDALGRRIRKTYAGTTTHFVWDGNVPLHEWTEENEMVTWLFEQDT FVPAAKLVANGDCFSIISDYLGTPLQAYDKQGNKVWEQELDIYGRQRKRPSAFIPFKYQG QYEDAETGLYYNRFRYYDPNGGSYISQDPIGLMGENPTLYAYVSDINSLTDLLGLTAEVY KLVATKDGYYDVYEWGNDKPVGKTYLKKGDTWKIGETTNFRTRKNGTEIQNRYTRKWLRQ NNLDYKPLQHSPNKSAKTSFQNFETSRVEKFEKQFGKKPAGNKCYH >gi|283510520|gb|ACQH01000099.1| GENE 2 1227 - 1850 294 207 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929492|ref|ZP_06423336.1| ## NR: gi|288929492|ref|ZP_06423336.1| hypothetical protein HMPREF0670_02230 [Prevotella sp. oral taxon 317 str. F0108] # 1 207 7 213 213 416 100.0 1e-115 MNRVSIQIDSPGYVKVSEIGKELVAFSLTIERKYPQIDFELAFCFRALYTTRGIRSKVRF EKDCNCLGMDLIMSLDEFNPYKNNVSMQRRIMGKHFFPFFAENIKKYRNKLPVLKPIEKD LVEDMRLFLIENLWLPDDSGSFKLSVIENVSYDRAMALFGKPKQKRFTDTDNGKMQDILW EVDEQTKLSAQYRLIDKVWTLECYNIE Prediction of potential genes in microbial genomes Time: Sat May 28 02:18:01 2011 Seq name: gi|283510519|gb|ACQH01000100.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.100, whole genome shotgun sequence Length of sequence - 11023 bp Number of predicted genes - 6, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 359 190 ## gi|288929493|ref|ZP_06423337.1| hypothetical protein HMPREF0670_02231 2 1 Op 2 . + CDS 385 - 4092 4100 ## COG0587 DNA polymerase III, alpha subunit + Term 4240 - 4276 -0.7 + Prom 4117 - 4176 1.6 3 1 Op 3 . + CDS 4338 - 4652 452 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Term 4675 - 4733 11.1 - Term 4902 - 4948 9.0 4 2 Op 1 . - CDS 4959 - 5132 179 ## gi|288929496|ref|ZP_06423340.1| hypothetical protein HMPREF0670_02234 5 2 Op 2 . - CDS 5146 - 8910 2922 ## gi|288929497|ref|ZP_06423341.1| hypothetical protein HMPREF0670_02235 - Prom 8933 - 8992 7.0 6 3 Tu 1 . - CDS 9170 - 9373 92 ## - Prom 9587 - 9646 5.6 Predicted protein(s) >gi|283510519|gb|ACQH01000100.1| GENE 1 3 - 359 190 118 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929493|ref|ZP_06423337.1| ## NR: gi|288929493|ref|ZP_06423337.1| hypothetical protein HMPREF0670_02231 [Prevotella sp. oral taxon 317 str. 
F0108] # 19 118 1 100 100 193 99.0 3e-48 KWLRDSVGLEAECHLAGGVHVTTRRDRFGRETYKSVGARNVEQLRRRYTWDMGNIAESMY AYRGQSARLRACNLRQTETAPCPCRSCLSLKGRSQFVDNVFCFCLLVSIKWRNFEGDF >gi|283510519|gb|ACQH01000100.1| GENE 2 385 - 4092 4100 1235 aa, chain + ## HITS:1 COG:BB0579 KEGG:ns NR:ns ## COG: BB0579 COG0587 # Protein_GI_number: 15594924 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Borrelia burgdorferi # 4 1208 21 1132 1161 763 38.0 0 MEDFVHLHVHTYYSILDGQSSIQRLVDKAIANGMGGMALTDHGNMFGTKEFFNYCKKVNG KRKEEGLPEFKPIIGCEMYVARHRKTDKVKENGDMSGYHLIVLAKNYTGYKNLTKLVSRA WVDGYYMRPRTDREDLETYHEGLIVCSACIAGEVPRKILRNDIEGAREAAQWYHRVFGDD YYLELQRHEVKDPHIRANREAFPLQQRANKVMLEMAEEMGIKVVCTNDAHFVDQENAEAH DHLLCLSTGKDLDDPTRMLYSKQEWFKTREEMNEVFGDLPQAMRNTVEILNKVETYSIDH SPIMPFFPIPEDFGTEEQWRQRISEKELYDEFTSDENGQNQLPPEEGEEKIKKLGGYEKL YRIKFEADYLAKLAYDGAKKLYGEPLTEEVRERVNFELHIMKTMGFPGYFLIVQDFINTA QNELDVMVGPGRGSAAGSVVAYCLGITKIDPIKYDLLFERFLNPDRISLPDIDTDFDDDG RGRVLEWVEDKYGHDKVAHIITYGTMATKNSIKDVGRVEKVPLKIVNDLCKLIPDKLPDG LKMSVANAIKAIPELREAEASADEGLRNTMMYAKMLEGTVRGTGIHACGTIICRDAISDW VPVSTAEDKADPGHKLLCTQYDGHVIEETGLIKMDFLGLKTLSILKEAVENVRLSTGKTI NLEEIPIDDPLTYKLYCEGRTIGTFQFESAGMQKYLRELQPTVFEDLIAMNALYRPGPMD YIPEFIARKRDPSRVKYDIPCMEKYLKDTYGITVYQEQVMLLSRQLANFTRGQSDTLRKA MGKKLIAQMDKLETLFYEGGMKNGHPKEVLNKIWEDWKKFASYAFNKSHATCYSWVAFQT AYFKAHYPAEYMAAVMSRNLANISEITKFMDECRVMGIKTLGPDVNESRLKFSVNKHGEI RFGLAAIKGMGDAAAQAIIQEREKNGPYTSVFDMVQRVNLSSVNKKAIESLALSGGFDSL GIARENYLGTDSKGATFIDNLTRYGQLFQMEQQQMQNSLFGASNAVEVSTPPVPQAERWS SIERLNKERDLVGIYLSAHPLDDYKIVLQSMCNTGCNELEDRMALVQKAEVTLAGMVTNT RSAVTKTGKPCGFVTIEDYSGSGELAFFGEEWGRWNGMFEVGRTLFVKCRFTQKYPTSAF VDLQVANIEYLQTTFDKRIERFTIQVDSAHVDQTVVNDIVTMVSDSPGQTQLYFEVYDSE SNNTLLLKASIPPIRVGRTLVNYLESDPHMNYKVN >gi|283510519|gb|ACQH01000100.1| GENE 3 4338 - 4652 452 104 aa, chain + ## HITS:1 COG:Cj0147c KEGG:ns NR:ns ## COG: Cj0147c COG0526 # Protein_GI_number: 15791535 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy 
production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Campylobacter jejuni # 3 104 5 104 104 114 49.0 3e-26 MEVKITSENFESYKNGELPLVVDIWATWCGPCKMVGPIISELANDYDGKIVVGKCDVEEN NEVAAEFGVRSVPTILFFKGGQLVDKFVGATNKETLDAKFQALL >gi|283510519|gb|ACQH01000100.1| GENE 4 4959 - 5132 179 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929496|ref|ZP_06423340.1| ## NR: gi|288929496|ref|ZP_06423340.1| hypothetical protein HMPREF0670_02234 [Prevotella sp. oral taxon 317 str. F0108] # 1 57 9 65 65 105 100.0 8e-22 MKKKKYITPCVLMELLTIETHLAAGSFQAEEPGAKPGEDFDEEFDVDENFDDRIFRL >gi|283510519|gb|ACQH01000100.1| GENE 5 5146 - 8910 2922 1254 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929497|ref|ZP_06423341.1| ## NR: gi|288929497|ref|ZP_06423341.1| hypothetical protein HMPREF0670_02235 [Prevotella sp. oral taxon 317 str. F0108] # 23 1254 1 1232 1232 2481 99.0 0 MRMYLLLRMRHSYPLSLIIALLLVWRSVPLSAQNATISPSLGNLITAFTHNPHEVGFEGG YGSLWQHKQLPITYSCSDEPTLSKDGVLGNHTCNFLYYKKEGKERMIHVTGVIPNYSSIA MPKGYKITGYRIVIKNNLSIPSEKAIFDDVMPSLPFERWKDWTFGEVDKANIRLDHAKTD PTFRSDAHVVLPKNSYGQTFTIERKAADMGHILYFCFAGNTGRSRLAAFTYESIELWFTA NNEFEYELTPKIEHSSSISLSENDMPLGKTDVGELKLREKRGKKFYSYDQYNSQETTASV YLYEQGAHDGKTWDDTRGNKFIKIVNANNKSWYALQKGTYFVEAPTTAKVSAASGESKVP VGYRIVGIKVDYRKDDYKTPGFVLKNAGHGHKQFFLTSEVKGGTERAIWHRTANDEIYTI INGKPKFLYHKREDSPSPVPVPAELTDENKHVRFEWSNGRPFFRVGSKEYFLRMNKNKQA FFTYGSAGDHAQMAAINEQAVSSTPFKLKIFGPTGEESDPLTRTIDIKETTPNGTVRIDG YNNDAVKFKIEGLPNTSSPTALVNITLIMETLNPYIKSVDVVCHGAYNTEQVRTFDATDF NLGGEKFVYRVPKGFSEQTKLKFTFRNLKSDFADNTYLNNPGTHNSRYSFVASDYYNKVN DDLYTNKGVVANYDYTKKIEVDLAGNIAFPFNNAGDLSNKQSTLTSAYFTERQFTMEEYT KMTGEVETNINGVITKKTEHGKFAQENTLLRDNETKTMYLYTTDETRYNIAPTKKEQHRA FAYYHTVIKLEVKNYEPKVKWIPIYNKAMYYKDAANKHYMPTPLVGAEIQTTESGFEGNK HQQGATSEFGYLTLNQIVDVMQQSIAKGTDPNVPHSLGDVLFVDNSKLFDIIRKEHTQQL VPILEDLRSKLAKNALIYLPYRAEQLASVSHTALQRTDGKGFEGHTNITLTDKLPFFAPY DIQLKPECHATYTREVTVQGMGRVNYATLVLPYGLSIDGNGMHSNAVGPQSHFYLAKMRD NSLICSSGPKHEGADYEVTAKLKTLTGVNKSEANMPYIVRIAPGYGDGNVSFVAAEKGAL 
IKKTPEALAGLVTSYAVYSTLDGHPVPFKSYYTYSGRVINKPGNEMVFYFSLDRYVAFKN LKKQQLFLFPFRSVFHFENLGPLLSKAFNSFNISFDGDEGSTPTAIEDVQEEIALKVTVS TGQITAVARKDVPLNVYTMSGQCVTRTMIKAGETRTFYLAPGVYMVNGKKMIVN >gi|283510519|gb|ACQH01000100.1| GENE 6 9170 - 9373 92 67 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQPAYQTKRTKNMNKCQTTGAKLTLAPCLMCKIYIKNIILDTRYCLLSFFDYLCSVFTEN MGLLAAN Prediction of potential genes in microbial genomes Time: Sat May 28 02:18:54 2011 Seq name: gi|283510518|gb|ACQH01000101.1| Prevotella sp. oral taxon 317 str. F0108 cont2.101, whole genome shotgun sequence Length of sequence - 21884 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 7, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 6 - 52 12.8 1 1 Tu 1 . - CDS 77 - 1726 452 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase 2 2 Op 1 . - CDS 2134 - 5040 3354 ## COG0612 Predicted Zn-dependent peptidases 3 2 Op 2 . - CDS 5065 - 6891 2109 ## PRU_0114 hypothetical protein 4 2 Op 3 . - CDS 6891 - 8135 1542 ## COG2262 GTPases - Prom 8167 - 8226 2.1 5 3 Tu 1 . - CDS 8345 - 8509 56 ## gi|288929502|ref|ZP_06423346.1| hypothetical protein HMPREF0670_02240 - Prom 8539 - 8598 3.8 6 4 Op 1 . - CDS 10028 - 10276 224 ## gi|288929504|ref|ZP_06423348.1| hypothetical protein HMPREF0670_02242 7 4 Op 2 . - CDS 10312 - 10806 605 ## BVU_0599 hypothetical protein + Prom 11734 - 11793 1.9 8 5 Op 1 35/0.000 + CDS 11863 - 13617 241 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 9 5 Op 2 . + CDS 13643 - 15367 165 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 + Term 15472 - 15536 8.5 - Term 15460 - 15521 20.3 10 6 Op 1 . - CDS 15609 - 16022 560 ## COG0802 Predicted ATPase or kinase 11 6 Op 2 . - CDS 16127 - 17686 1660 ## COG0784 FOG: CheY-like receiver 12 6 Op 3 . - CDS 17689 - 20922 3649 ## PRU_1900 hypothetical protein 13 7 Tu 1 . 
- CDS 21056 - 21883 864 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components Predicted protein(s) >gi|283510518|gb|ACQH01000101.1| GENE 1 77 - 1726 452 549 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 77 533 38 482 508 178 30 2e-44 MAQTPEFKYAPMFQLGKDETEYYLLTKDGVSVSEFEGNEILKVSPKALTMLTNTAFRDVN FLLRREHNEQVAKILTDPEASDNDKYVALTFLRNAEVAAKGKLPFCQDTGTAIIHGEKGQ HVWTGFCDEEALSKGVYKTYTEENLRYSQNAPLNMYDEVNTRCNLPAQIDIEATEGEEYR FLCVVKGGGSANKTYFYQQTKALLQNPGTLVPFLIDKMKSLGTAACPPYHIAFVIGGTSA EKNLLTVKLASTHFYDSLPTTGDETGRAFRDVELEKQLLEEAYKIGLGAQFGGKYFAHDV RVVRLPRHGASCPVGMGVSCSADRNIKAKINKDGIWIEKLDDNPGELIPEEMRHAGEGNA VKVDLDKPMAEILKELSKYPVSTRLSLNGTIIVARDIAHAKLKARLDETGDLPQYFKDYP VLYAGPAKTPEGLPCGSMGPTTANRMDPYVDEFQDHGGSMIMIAKGNRTQVVTDACKKHG GFYLGSIGGPAAVLSLNSIKSIECVEFPEIGMEAVWKIRVENFPAFILVDDKGNDFFTQL KPWNPTCNH >gi|283510518|gb|ACQH01000101.1| GENE 2 2134 - 5040 3354 968 aa, chain - ## HITS:1 COG:sll0915 KEGG:ns NR:ns ## COG: sll0915 COG0612 # Protein_GI_number: 16330991 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Synechocystis # 39 477 62 505 524 173 27.0 1e-42 MSIKSILVVLLLTGAATAMQAKDYKYQTVKGDPTGTRIYTLDNGLRVYLSVNKETPRIHT YIAVKTGSRNDPAETTGLAHYLEHLMFKGTKQFGTTDAEKEAPLLKDIEERYEKYRTLTD PEQRKKAYHGIDSVSQLAAKYFIPNEYDKLMSSIGAEKTNAYTSNDVTCYTEDIPANEVD NWAKIQADRFQNMVIRGFHTELEAVYEEYNIGLTRDGNKVWQAISKLLTPTHPYGTQTTI GTQEHLKNPSIVNIKNYFNRYYVPNNVAICMAGDMDPDKVIATIDKYFGSWKKSESLSFP QFPKQKPLTAPKDTTVVGPEADELMMAWRFDGGKSLQGDTLDVIANILSNEKAGLMDINL AQKMKYLGGGSISFQLAEYGLLGLWASPKEGQSLDDVKKLVLGEVENLKKGNFSDNLLPA VINNMKLEYYHALEKNKDRADQFVDAFINGKDWQTVVGRLDRISKMTKAQIVAFANKHLN NNYAVVYKRQGEDTTQKKIDKPQITPIPSNRDLQSDFVKEIIASKTTPIEPRFVDFNKDL VKTKTKKGLPVLYVPNKDNGLFTLAFHYDFGKEADKRLDIATEYLDYLGTNKLTPEQVKQ RFYQLACDYSISAGADNLNITITGLNENMPKALWLVEHLLANAKVDNAAYKQLVELVKKG RKDSRSNQVSNFMALAAYGMYGPYNTVRNVMTNAELDKTNPQSLLNLLKGLRNYKHEVLY CGQSTPEELVKAVDENHAIGKTLANVPQNKAYTKVQTKENAVWIAPYEAKNIYMMLYNNS 
GKGWNLEQRPVVYLFNEYFGTGMNSIVFQELRETRGLAYSASARYNTPSRVGETESLQAN IISQNDKMMDCVRAFNSIIDEMPQSDKAFELAKQASMKRIATERTTKFGIINAYLQARRL GLDFDIKERIYNALPKITLKEMVEFEKQTMAKKPLRYLILGDEKNLDMKGLEKIGKIKKV TTQEIFGY >gi|283510518|gb|ACQH01000101.1| GENE 3 5065 - 6891 2109 608 aa, chain - ## HITS:1 COG:no KEGG:PRU_0114 NR:ns ## KEGG: PRU_0114 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 2 608 4 614 617 804 61.0 0 MYRLLTDEEITLLEDHGCQAENWEAIMVAEDFKPTYVHDVTFYGDIKLGVFEKNIEVSRG FLKHSGLRNATLRNVCIGDNSLVENIGNYINNYTIGEECYISNVSTMETTEGATYGEGNL ISVLNEVGEGNVTLFDGLNSQLAAFMVKHNADRELRERLRQMIAEELERALPDRGTIGFG VKIVNTREVVNTIVQDNCEVNGAARLFDCSLLSAAGSGVYVGSGVICENSIVADGSSITN DCNLQDCYVGEACQLTNGFTASASVFFANSYMSNGEACAAFCGPFTASHHKSSLLIGAMF SFYNAGSATNFSNHAYKMGPMHYGILERGTKTASGAYILMPANIGAFSVCFGKLMHHPDT RNIPFSYLIAYGDIMYLVPGRNLNTVGLYRDVRKWPKRDIRPRSGQKSIVNFTWLSPFTV NEMLQGKHILEKLREAQGENVAEYNFRGYVITNNSLNIGLRNYDMAIKMFLARCVRKYGP TEPASTTGWKQWSDLSGLLLPESEELRLIEEIKNREITDIQRVTERFGEIGERYEQYEWA FAYRLINEYYGIAQPRHADWQDIIDQGREAKRQWIATIREDALKEYKLGDVEEDVYRGFV EQLNKETE >gi|283510518|gb|ACQH01000101.1| GENE 4 6891 - 8135 1542 414 aa, chain - ## HITS:1 COG:XF0088 KEGG:ns NR:ns ## COG: XF0088 COG2262 # Protein_GI_number: 15836693 # Func_class: R General function prediction only # Function: GTPases # Organism: Xylella fastidiosa 9a5c # 10 402 7 371 450 258 41.0 2e-68 MKEFVISEVKAETAVLVGLITQQQNEAKTKEYLDELEFLADTAGAVTVKRFTQRLNAPSM VTYVGKGKLEEIKQYILAEEEAEREVGMVIFDDELSAKQMRNIEKELNVKILDRTSLILD IFAMRAQTANAKTQVELAQYRYMLPRLQRLWTHLERQGGGSGSGGGKGSVGLRGPGETQL EMDRRIILSRMALLKERLAEIDKQKTTQRKNRGRMVRVALVGYTNVGKSTIMNLLAKSEV FAENKLFATLDTTVRKVVVENLPFLLADTVGFIRKLPTDLVDSFKSTLDEVREADLLVHV VDISHPDFEEQIAVVEKTLTELDCADKPSMIVFNKIDNYTFVKKDEDDLTPATKENMTLD DLKRTWMARLNDNCIFISAREKTNVDELRDRLYAKVRELHVQKYPYNDFLYEME >gi|283510518|gb|ACQH01000101.1| GENE 5 8345 - 8509 56 54 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929502|ref|ZP_06423346.1| ## NR: gi|288929502|ref|ZP_06423346.1| hypothetical protein HMPREF0670_02240 
[Prevotella sp. oral taxon 317 str. F0108] # 1 54 1 54 54 100 100.0 4e-20 MGGVFIVVLQHKGHKPLSIYKNIKSQDIFTVENNKQCHESRIVEETFYVSSRDK >gi|283510518|gb|ACQH01000101.1| GENE 6 10028 - 10276 224 82 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929504|ref|ZP_06423348.1| ## NR: gi|288929504|ref|ZP_06423348.1| hypothetical protein HMPREF0670_02242 [Prevotella sp. oral taxon 317 str. F0108] # 1 82 1 82 82 124 100.0 1e-27 MEADLSYWRFIEEWHPKYWSDDRVLLCDILFRHLEKEDVDEDDKKWIAKDFNSNEEIVHE LKRLEKDLYLESLDNYYERLLA >gi|283510518|gb|ACQH01000101.1| GENE 7 10312 - 10806 605 164 aa, chain - ## HITS:1 COG:no KEGG:BVU_0599 NR:ns ## KEGG: BVU_0599 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 27 164 26 163 165 116 41.0 2e-25 MKQVIGVAVAIALLAGCAGTRNLSPEKAKRQAEMAQKVNEGLEKQQYTIEVSHVYPMRMQ AKALSYGYYIKVSGDSIYSYLPYFGHAYRVPYGGGKALDFAEKMTDYTTTKGKDGRMNIE IKVNNREDRYAYHLEVYDNGHAYIDVSADERDRIGFAGMMNVGR >gi|283510518|gb|ACQH01000101.1| GENE 8 11863 - 13617 241 584 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 344 562 134 351 398 97 32 7e-20 MIKVLKRLGNYMGRRKALLPLSVALSAINGLLSLVPFVLVWLVVRTLLTTGGNLAATPVW NYALAAFAVSVFNVLLYFLALALSHLAAFRVETNMRRQAMHALMRAPLGFFDTHNTGGMR KVIDEDSSQTHTFVAHILPDVAGSIVSPLGVVALLLAVDWQLGLAALVPIVCAFGIMGFM MNPKNNQFQRLYLDAQERMGAEAVEYVRGIPVVKVFQQTVFSFKRFYNSIIAYRDLVIKY TLVWRTPMSFYIVAINAFAFVLVPVGIVMIGHGRNAAVVIANMFLYVLIAPIIGLNVMRM MHLMQSLFMANEAIERLERLTEATPLPISSEPRKITSYGICLDKVSFKYGDAEKEALHRV CLDMSQGSTTAIVGASGSGKTTLARLVPRFWDVCEGSVSIGGVDVRQVHKAELMQCVSFV FQNTRLFKTSLLDNLRYGNEGATIEQINRAVDLSQSREIVNRLPKGLDTVIGADGTYLSG GEQQRLVLARAILKDAPIVVLDEATAFADPENEHLIRRALANLTKGKTVLMIAHRLTTVQ DADRIVVLNNGEIAEQGTHEELIGRQGRYYRMWNEYQRAIAWRL >gi|283510518|gb|ACQH01000101.1| GENE 9 13643 - 15367 165 574 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 330 559 7 233 318 68 25 5e-11 
MIRYFQNRFALSEKGAKDLRRGIFFSTLLNVALMLPPSYLFFFLMEYVDVRRAAPPQSLW LYLLLALVALCIMFVVSRWQYNSTYVTVYAESAQRRITLAEKLRRLPLAFFGEHNLSDLT STMMEDCAQLEETFSHAVPQLFAAGTSVVVVGIGLFSYDWRLALAVYWPVPLALLVLLLS VRSIDKAFTKHYKVKRALTEQIQEGLECVQEIKAYNGEEAYDRQLDERLNVYEKALVSSE LVTSSLVNVSAVLLKLGMPSVLLLGAWLMQRGDVSLFVYLAFVLVSAMVYNPIQDVCTHM VVLSYLDVRIKRMRQIEAMPTLDGSKDVEVNGYDIAFRGVSFSYEMGKQVLHDVSFTARQ GQVTALVGPSGGGKSTAAKLAARFWDVDEGVVTLGGKNIAEMDAEALLKNYAVVFQDVLL FNASVADNIRIGKRNATDDEVRQAARLAQCDDFIRSMPQGYDTIIGENGETLSGGERQRI SIARALLKDAPVILLDEATASVDAENETKIQAGISELVRNKTVVIIAHRMRTVLNADHIV VLDGGTVREQGSPQELLKENGEFARIVRLQLSDS >gi|283510518|gb|ACQH01000101.1| GENE 10 15609 - 16022 560 137 aa, chain - ## HITS:1 COG:BS_ydiB KEGG:ns NR:ns ## COG: BS_ydiB COG0802 # Protein_GI_number: 16077658 # Func_class: R General function prediction only # Function: Predicted ATPase or kinase # Organism: Bacillus subtilis # 25 137 28 136 158 97 39.0 6e-21 MTLTITSLAQIHNVAKQFIDNIGTGKVFAFYGKMGSGKTTFIKAVCEELGVTDVITSPTF AIVNEYHSEQTPKPIFHFDFYRIKKLEEVYDMGYEDYFYSGSLCFLEWPELIEEILPADV VKVKIEEQADGSRTVTF >gi|283510518|gb|ACQH01000101.1| GENE 11 16127 - 17686 1660 519 aa, chain - ## HITS:1 COG:mlr6691 KEGG:ns NR:ns ## COG: mlr6691 COG0784 # Protein_GI_number: 13475585 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Mesorhizobium loti # 6 122 8 126 130 83 41.0 1e-15 MNNGLILWADDEIELLKAHIIFLEKKGYEVVTVSNGMDALDQCKQRNFDLIMLDEMMPGL SGLETLQRIKEVQPATPVVMCTKSEEENIMEQAIGSKIADYLIKPVNPSQILLTLKKNIH RKEIVSEVTQTGYQQNFQDISMQIMDCKTLDDWRKVYRRLVHWELELSNTGSSMADILVT QKEEANNAFAKYVKANYMDWVAEGASTANGRPVMSPDLFKHSIFPLLDKGEKVFLIVLDN FRYDQWKTLESEIGDLFDIDEDMYISILPTATQYARNAIFSGLMPNKIAKMFPELWVDED EEEGKNLNEDLLIKTQIDRYRRHDTFSYTKVNTSAEAEKLLEHFSQLSQNDLNVMVFNFI DMLSHARTESRMVRELANNESAYRSITLSWFRHSILANLFKLLSQSDFTVVLTTDHGSIR TSKPIKIVGDRNTNTNLRYKLGKNLSYNSKEVFTIKEPHKAQLPSPNISTSYVFATGNSF FAYPNNYNYYVSYYKDTFQHGGISMEEMLVPLVTMRRRR >gi|283510518|gb|ACQH01000101.1| GENE 12 17689 - 20922 3649 1077 aa, chain - ## HITS:1 COG:no KEGG:PRU_1900 NR:ns ## 
KEGG: PRU_1900 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 891 40 913 939 886 51.0 0 MAYVDGMLEKENGNRDNYTELLPLYPVANKGSKELGKGNFERAILKSEKAIQLHSIKRRP TWDKKRRKTQRDIEWLNRREYNPFMWRVWMLMGRSQFHKGDFDQAASTFAYMSRLYRTQP HIYAKARAWLAKTYIEQGWMYDAEDVIRNMSRDSMDWRAVKEWDYTHTLYHLHAGEWQKA IPYLQKVIKHEMRRKQRARQWFIMGQLQRFVGNRDAAYKAFRKAASMHVSYELDFNARIA QTEVLAQGQAKKMIAKLRRMATNDNNRDYLDQVYYAIGNIYLAQRDTANAIAAYENGGKR ATRNGIEKGVLMLKLGNVYWQKQRFADARRCYNEALGLLDKERPDYEELSQRSKVLDELV PHIEAVQLQDSLQLLATMPEPQRNAAIDRVIAALKKKEKEEQYAQDEETARRTIARNSSA SNMPTTQTPMQRTQQGGGWYFYNPIAVSQGKAAFERQWGKRDNVDNWQRLNKAAVALNTP LDEPSAEQRDSILAEEARRDSIQQAMQTAENDPHKREFYLAQIPFTAEQRQASNVLLTDG LFHSGVIFKDKLDNLALSEKALRRLTDNYPQFDKMPQAYYHLFLLYMRKGDKVTAQRYAD MLKQQYPKHELTELITDPYYFANAQRGEHIEDSLYAVTYDFFKAENYAAVKRNAKVSATK FKHGANRDKFLFVEALSMLHAGNIDACLNGLQTLVEQFPQSELSKMAAMIVNGIKAGRQV QAGGFDLNNVWQRRTAATNAADSLKGKQFSNDRNANFLFVMAYRPDFVDENKLLFALAKY NFTNYLVRNFEIRTETDNGLRLMIVSGFRNFDEALQYARELHRQTGIVGRLNNGRTLVIS EDNMQLLGKEFSFDDYDKYYAKHFAPLKVSTFRLLNEPAEIVQPEVPVRLPTIEEIDRAL DNDFVLPNEKATTDVDEDTYPITDEPTQKQKGKPETKQKADDATVIEAPLPPTRQPDDGT IVIEDDTPSTPKAKPNTNKGEQKPQQPTQQKGKTPQTQTQPKGKTPVKTEQKPQKQPTKT DKKEDTGIDFKDELGNKGKKTNPKKTPTAKQDSTKTPAKPKRKEIDPDDEYYDLEGF >gi|283510518|gb|ACQH01000101.1| GENE 13 21056 - 21883 864 275 aa, chain - ## HITS:1 COG:MA0025 KEGG:ns NR:ns ## COG: MA0025 COG1108 # Protein_GI_number: 20088924 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Methanosarcina acetivorans str.C2A # 3 261 4 262 274 204 45.0 1e-52 MNILEYSFFQNALLGSLLASVLCGFIGTYIVTRRLVFISGGITHASFGGIGLGVFLGINP ILSAMVFSVLSAFGVQWMSRRGDVREDSAIAVFWTFGMSLGIICCFLSPGFMPDLPSFLF GSILTISQADLWLLAALLVVVSIVFALLYRTILSVAFDVDFARSQRLPVAFIEYLMMALI ALTIVSTLRMVGIVLSISLLTIPQMSANVLTHNFKHMIAWSIALGWVDCLMGLGLSYALN VPSGASIIFVSILVYLVLKVGNVAFKKRQRQLETI Prediction of potential genes in microbial genomes Time: Sat May 28 02:19:29 2011 Seq name: 
gi|283510517|gb|ACQH01000102.1| Prevotella sp. oral taxon 317 str. F0108 cont2.102, whole genome shotgun sequence Length of sequence - 13391 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 7, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 272 - 670 554 ## BF0726 hypothetical protein 2 1 Op 2 . - CDS 674 - 1921 1191 ## COG0128 5-enolpyruvylshikimate-3-phosphate synthase - Prom 2132 - 2191 6.8 + Prom 2625 - 2684 6.4 3 2 Tu 1 . + CDS 2737 - 3000 64 ## gi|288929514|ref|ZP_06423358.1| hypothetical protein HMPREF0670_02252 + Prom 3198 - 3257 5.9 4 3 Tu 1 . + CDS 3333 - 3935 711 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) + Prom 4078 - 4137 4.1 5 4 Op 1 22/0.000 + CDS 4251 - 5273 1262 ## COG1077 Actin-like ATPase involved in cell morphogenesis 6 4 Op 2 . + CDS 5278 - 6141 883 ## COG1792 Cell shape-determining protein 7 5 Op 1 . + CDS 6280 - 6780 422 ## PRU_1706 hypothetical protein 8 5 Op 2 19/0.000 + CDS 6795 - 8645 1921 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 9 5 Op 3 . + CDS 8626 - 10098 1422 ## COG0772 Bacterial cell division membrane protein + Term 10195 - 10247 12.0 - Term 10204 - 10255 13.1 10 6 Tu 1 . - CDS 10428 - 12233 1685 ## COG0366 Glycosidases + Prom 12521 - 12580 7.2 11 7 Tu 1 . 
+ CDS 12745 - 13227 357 ## gi|288929522|ref|ZP_06423366.1| hypothetical protein HMPREF0670_02260 Predicted protein(s) >gi|283510517|gb|ACQH01000102.1| GENE 1 272 - 670 554 132 aa, chain - ## HITS:1 COG:no KEGG:BF0726 NR:ns ## KEGG: BF0726 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 132 1 137 140 110 45.0 2e-23 MHILLLALLGLAIFAAVLGLLSRNKNASEPGVVKADTASCATCSGEDERCEHDCMLEAAV RDVEYYDDEELDKYRGRDSADYTDSEAEEFAEVFYTMRPSDVQGWNRSLILRQINLPNQL KDEVIMVINDAR >gi|283510517|gb|ACQH01000102.1| GENE 2 674 - 1921 1191 415 aa, chain - ## HITS:1 COG:VC1732 KEGG:ns NR:ns ## COG: VC1732 COG0128 # Protein_GI_number: 15641736 # Func_class: E Amino acid transport and metabolism # Function: 5-enolpyruvylshikimate-3-phosphate synthase # Organism: Vibrio cholerae # 15 407 16 423 426 217 34.0 4e-56 MQYSIVAPEKLRVVVTLPASKSISNRALIMHALSGSSMLPQNLSNCDDTRVMVRALSDMP KNIDVRAAGTAMRFMAAFLSCQKGEHTLTGTQRMRQRPIKPLVDALRYVGADIEYEAKEG YPPIRIYGKPLEGGRVEMPGGISSQFISAILMIAPTLKKGLDLKMTGQIASRPYIDLTLC MMREYGAKADWTDINAISVKPGQYKARQYTIESDWSAASYWYEMVALSNDPDTIVELEGL SENSKQGDSVVRHIFSLLGVKTVFSKHPAGQLTKVTLKKLRTRLPKLEYDFVNSPDLAQT LVVTCAMMGVPFHFKGLSSLRIKETDRIEALKKEMAKLGIVLREMAEGELAWDGMRCRPS DEPIDTYEDHRMALAFAPVAFVNKEIKINQPGVVSKSYPQFWSDLRQSGFEINEI >gi|283510517|gb|ACQH01000102.1| GENE 3 2737 - 3000 64 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929514|ref|ZP_06423358.1| ## NR: gi|288929514|ref|ZP_06423358.1| hypothetical protein HMPREF0670_02252 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 41 1 41 87 73 100.0 3e-12 MGAFSVKLFAEKWTKCESASWQVGKPIGIWLGKLAGLAFLDLLVFLGLLVVLGLLVVLGL LVVLGLLAVLGLLVVLGLLVVLGLLAF >gi|283510517|gb|ACQH01000102.1| GENE 4 3333 - 3935 711 200 aa, chain + ## HITS:1 COG:slr0597 KEGG:ns NR:ns ## COG: slr0597 COG0138 # Protein_GI_number: 16332321 # Func_class: F Nucleotide transport and metabolism # Function: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Synechocystis # 3 195 40 233 553 161 45.0 7e-40 MTETKKIRTALVSVFHKDGLNELLAKLNDEGVKFLSTGGTRQFIEELGYECGAVEDLTAY PSILGGRVKTLHPKVFGGILARRDNHADKAQMEQYDINPIDLVIVDLYPFEETVASGANQ SDIIEKIDIGGISLIRAGAKNFNDVVIVPSKNEYGMLLDILKASGANTTLAQRKAFATRA FGVSSHYDTAIHAWFEANNA >gi|283510517|gb|ACQH01000102.1| GENE 5 4251 - 5273 1262 340 aa, chain + ## HITS:1 COG:CAC1242 KEGG:ns NR:ns ## COG: CAC1242 COG1077 # Protein_GI_number: 15894525 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Clostridium acetobutylicum # 1 335 1 332 335 282 47.0 8e-76 MGFFSFMQEIAMDLGTANTIIISDDKIVVDEPSVVALDRRTDKMIAVGERAKLMYEKTHE NIRTVRPLRDGVIADFTACEQMMRGLIKMVHTGSRLFSPSLRMVIGVPSGSTEVELRAVR DSAEHADGRDVYLIFEPMAAAIGIGIDVEAPEGNMIVDIGGGSTEIAVISLGGIVSNNSI RVAGDDLTAEIQEYMSRQHNVKVSERMAERIKIHVGSALTDLGEEAPEDYVVHGPNRITA LPMEVPVCYQEIAHCLDKTIAKIENAVLSALENTPPELYADIVKNGIYLTGGGALLRGLD RRLQEKISIPFHIAEDPLHSVAKGAGIALKNVERFSFLMR >gi|283510517|gb|ACQH01000102.1| GENE 6 5278 - 6141 883 287 aa, chain + ## HITS:1 COG:lin1582 KEGG:ns NR:ns ## COG: lin1582 COG1792 # Protein_GI_number: 16800650 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell shape-determining protein # Organism: Listeria innocua # 67 267 76 277 295 69 28.0 7e-12 MRNLFAFIARYNHWFVFLALEVTSMVLLFQYNSYQGSVWFSSANEVAGKVYEANSWVETF FSLTQVNKELTQRNLELEQQVSAMQEMLTRAKHDSLDVVQTQKMALSGFKLYPARVVSNS LDRNDNFITIDKGWAHGIKKDMGVACGTGVVGVVYLVSRNYSVVIPLLNSHSNVSCMIKD RGYFGYLHWNGGDPGIAYVDDVPRHARFKLGLDVVTSGYSSIFPPGVLVGKILHVFNSPN 
GLSYRLQVRLSTDFGKLRDVCVIDNAGMEERLEMMRAVQDSLKVKRE >gi|283510517|gb|ACQH01000102.1| GENE 7 6280 - 6780 422 166 aa, chain + ## HITS:1 COG:no KEGG:PRU_1706 NR:ns ## KEGG: PRU_1706 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 166 1 166 166 126 47.0 4e-28 MTVGVLKTLGTFLVFLLVQVLVLNQVHLFDSATPLLYIYMVLLFPRNYPKWGMLAWAFAM GLGVDMFSNTPGVAASSLTLVALLRPYLLELFLQRESAEDLVPSMKVLGFGRYLFFAFII VFTYCLVFFSLEAFSFFNWMQWLFCVGGSTLLTLVLVMVLDNLRSR >gi|283510517|gb|ACQH01000102.1| GENE 8 6795 - 8645 1921 616 aa, chain + ## HITS:1 COG:PA4003 KEGG:ns NR:ns ## COG: PA4003 COG0768 # Protein_GI_number: 15599198 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Pseudomonas aeruginosa # 2 593 12 614 646 255 30.0 2e-67 MKDFELDKRRLVIGGVAIFIVTVYLIRLFTLQLLSDDYKKNADSNAFLKKIEYPSRGAIT DRHGRLLVYNQPAYDIMVVMNEEKGRLDTMEFCNALGITKEYFIKRMNDIKDRNKNPGYS RFTEQVFMSQLSDKEFSVFQEKIFRFPGFYVQKRSIRQYQYPYAAHVLGDVAEVSKSDVE NSDYYQPGDYIGKLGIERSYEEQLRGEKGVQILLRDAHGRVQGNYQNGKFDRRPVAGKNL TLGIDIKLQELGERLLEGKIGSIVAIEPATGEVLCMVSSPTYDPRIMVGRQRGKNHLALS KDVWKPLLNRAIMGQYPPGSTFKTTQGLTYMTEGIIDEHTLYPCARGFNYRGLHVGCHPH AAPTNLVQALCTSCNSYFCWGLFHMIGNRRRYRNVQEAMNTWRDYMVSMGFGYKLGVDLP GEKRGLIPNAAFYDKAYKGSWNGLTVISISIGQGEVNLTPLQIANLCATIANRGYYYVPH VVRKVQGEQLDTLYRRKHYTKASRKAYEYVVQGMRASVVGGTCHAANRADYLVCGKTGTA QNRGQDHSVFMGFAPMDNPKIAIAVYVENGGFGATYGVPIGSLMMEQYINGKLSPFSEAQ ASVYQSRRIAYGTSNR >gi|283510517|gb|ACQH01000102.1| GENE 9 8626 - 10098 1422 490 aa, chain + ## HITS:1 COG:TP0501 KEGG:ns NR:ns ## COG: TP0501 COG0772 # Protein_GI_number: 15639492 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Treponema pallidum # 56 484 47 432 433 165 31.0 2e-40 MARATDRQPSMLGSLDWWTIGIYLALLTFGWVSVCGASYTYGDTDIFSLGTRSGMQIVWI GTSICLGFVLLTLDDHFYDTFAFVIYGAMLLLLFATIFNPHEIKGSRSWLVLGPLRLQPA EFAKFATALAVAKLMSTYGFSLNNWKDFASACLVVVLPMVFIVAQKETGSALVYVSFFLM 
FYREGMSGSFLFTGVAMVVYFVVGVRFEEVMLWQTPTSLGRFIVLLLVQLFTAGMVRSYT NDRGRTTRLVALGILFTGVCLLVSKYIVAFDVTIVQLVLTAALVCWLLYRWLSARIRNYF YVALFAVGSVMFFYSVDYVLNEVMEPHQRVRINVLLGLEEDLAGAGYNVHQSEIAIGSGG LKGKGFLNGTQTKLKFVPEQDTDFIFCTVGEEEGFVGSAAVLLLFLALILRLIKLAERQP FKFGRIYGYCVLSVFLFHLFINVGMVLGLTPVIGIPLPFFSYGGSSLWGFTLLLFIFLRI DASRNKSRRQ >gi|283510517|gb|ACQH01000102.1| GENE 10 10428 - 12233 1685 601 aa, chain - ## HITS:1 COG:alr2190 KEGG:ns NR:ns ## COG: alr2190 COG0366 # Protein_GI_number: 17229682 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Nostoc sp. PCC 7120 # 29 337 5 366 492 72 23.0 2e-12 MKKLFTLLTALLLTVCTKALAQGWPQNYDGVMLQGFYWDSYDDTHWTTLEAQATELSAAF NQIWVPQSGYCNTTHMQMGYLPIWWFNHLSAFGTEAELRQMIKTFKSKGTGIIEDVVINH RAGNTNWCDFPTEKWNGKTMTWTLADICANDDGGNTKANGYNVTGAADTGDDFGGGRDLD HTRQNVRDNIKAYLSFLLTDLGYDGFRYDMVKGYAAHYIGEYNTSANPKFSVGEYWDGNV QKVKEWIEGTRANGAIQSAAFDFPMKYAINDAFGQGSWNRLTDATLAADQAYSRYAVTFV DNHDTGRPKENGGAQLYANVLAANAYILTMPGTPCVFLRHWKMYKQSLKRLIATRKAVGL TNQSQIVKAESAANGFVLCTQGTKGKALLLLGDVNNAQTAGYQLAVEGPKYKLYVSDGVD LTDVKAIKGEDAGFNAPDFCKVKKGERCAFFEAPANWNNTITCWQWDKGYNYTGNQWPGA ACIKVGTTANGTHVWKWTWNNSKQTSSAGNEGIIFSNNGSPQTADLPFENGGYYTAEGLL GVVKPVTTGITSPLQNTPSAKSLPVYTIDGKALGHTADIDTSLKTLPKGMYIIGKKKYAV K >gi|283510517|gb|ACQH01000102.1| GENE 11 12745 - 13227 357 160 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929522|ref|ZP_06423366.1| ## NR: gi|288929522|ref|ZP_06423366.1| hypothetical protein HMPREF0670_02260 [Prevotella sp. oral taxon 317 str. F0108] # 1 160 37 196 196 326 100.0 2e-88 MLFVAGSFSASAQQSNGSDVAQFAAKMDKAKWKKSGLGFTYPSFFVAREVFDDDIPTLYN TYSWRKVMLGYCLLGAWAVIDDNFPNEGQQLINKVKMKKITYYPRGKGVFSGFTNDGRIF YAKCKPTEGGMVTHLEVLALIYPKSYQKSVDVLIKQIASW Prediction of potential genes in microbial genomes Time: Sat May 28 02:19:55 2011 Seq name: gi|283510516|gb|ACQH01000103.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.103, whole genome shotgun sequence Length of sequence - 25043 bp Number of predicted genes - 24, with homology - 22 Number of transcription units - 18, operones - 5 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 247 203 ## gi|260909559|ref|ZP_05916261.1| conserved hypothetical protein 2 1 Op 2 . + CDS 304 - 1200 599 ## BT_1056 hypothetical protein 3 2 Tu 1 . + CDS 1531 - 2019 377 ## BT_1056 hypothetical protein + Term 2060 - 2103 -0.9 + Prom 3309 - 3368 5.0 4 3 Tu 1 . + CDS 3500 - 4183 786 ## COG0775 Nucleoside phosphorylase + Prom 4199 - 4258 2.9 5 4 Op 1 . + CDS 4303 - 4782 712 ## COG1854 LuxS protein involved in autoinducer AI2 synthesis 6 4 Op 2 . + CDS 4790 - 5524 482 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) - Term 5819 - 5868 15.1 7 5 Tu 1 . - CDS 5909 - 6178 391 ## PROTEIN SUPPORTED gi|160883111|ref|ZP_02064114.1| hypothetical protein BACOVA_01079 + Prom 6412 - 6471 3.8 8 6 Tu 1 . + CDS 6516 - 8546 2106 ## COG4206 Outer membrane cobalamin receptor protein + Term 8599 - 8647 13.3 + Prom 8587 - 8646 2.0 9 7 Tu 1 . + CDS 8754 - 9746 950 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) - Term 9885 - 9937 3.1 10 8 Tu 1 . - CDS 9962 - 10429 513 ## FP0040 hypothetical protein - Prom 10530 - 10589 4.2 + Prom 10992 - 11051 8.8 11 9 Tu 1 . + CDS 11295 - 12035 732 ## COG0778 Nitroreductase + Term 12079 - 12110 4.8 12 10 Tu 1 . - CDS 11963 - 12229 110 ## - Prom 12255 - 12314 3.1 + Prom 12946 - 13005 3.5 13 11 Tu 1 . + CDS 13029 - 13445 248 ## gi|288929538|ref|ZP_06423382.1| hypothetical protein HMPREF0670_02276 14 12 Tu 1 . - CDS 13746 - 14585 697 ## PRU_0130 hypothetical protein - Prom 14617 - 14676 3.0 15 13 Tu 1 . + CDS 14488 - 14691 79 ## 16 14 Tu 1 . - CDS 14818 - 15960 947 ## COG2706 3-carboxymuconate cyclase 17 15 Op 1 . - CDS 16184 - 16597 221 ## PROTEIN SUPPORTED gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 18 15 Op 2 . 
- CDS 16603 - 17769 1361 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) - Prom 17793 - 17852 1.9 19 16 Op 1 . - CDS 18244 - 18954 771 ## COG0130 Pseudouridine synthase 20 16 Op 2 . - CDS 18954 - 19814 1033 ## COG1968 Uncharacterized bacitracin resistance protein 21 16 Op 3 . - CDS 19821 - 20078 359 ## PRU_1834 hypothetical protein - Prom 20239 - 20298 3.9 22 17 Tu 1 . - CDS 20349 - 21230 869 ## COG2177 Cell division protein - Prom 21257 - 21316 3.6 + Prom 22216 - 22275 5.5 23 18 Op 1 . + CDS 22299 - 22880 402 ## Ppha_1224 hypothetical protein 24 18 Op 2 . + CDS 22877 - 23080 104 ## gi|265753781|ref|ZP_06089136.1| conserved hypothetical protein Predicted protein(s) >gi|283510516|gb|ACQH01000103.1| GENE 1 2 - 247 203 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260909559|ref|ZP_05916261.1| ## NR: gi|260909559|ref|ZP_05916261.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. F0295] # 1 67 106 172 186 71 61.0 2e-11 NNTNPTPTASRTDTSLIKDTRHSGEDPIPMGEIIVDDSVESPKVDPYDPFPLGKIAPAKP KKKNNAQKTPQNNKKKGRKRR >gi|283510516|gb|ACQH01000103.1| GENE 2 304 - 1200 599 298 aa, chain + ## HITS:1 COG:no KEGG:BT_1056 NR:ns ## KEGG: BT_1056 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 104 4 103 269 98 42.0 2e-19 MKSNGKNICAALKQVRKQVADANGIHYTPAPCRFDGDCSGTCPACESEVQYIESQLGRLR LAGKAVKVAGLALGLTMVAGCNSSSGTPPSATDKSAAANETSPTSQPNPQPQANALVSAA DEGAKPHARHYHVRGNSGRVVKKGAAQADTTAIYMADSSGYALPEVSVTSTPSKKKVNYV GGISYYQRIARNRNHVYLNPEISSRYKKGDEAMRQFIADHIRITPRMSAACGRALVKVAC VVERNGRLSAMRVIQSADSLYDAEAIRILKKMPRWRPARMEGKRVRSLVVIDVPFVLK >gi|283510516|gb|ACQH01000103.1| GENE 3 1531 - 2019 377 162 aa, chain + ## HITS:1 COG:no KEGG:BT_1056 NR:ns ## KEGG: BT_1056 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 100 4 103 269 87 46.0 1e-16 MKPNGKNICAALKQVRKRVADTNGIVYAPKECHFEGSCNGTCPACEAEVRYLEHQLNLLR 
KAGKAVTVMGVALGMTFTTGCKQGNSPKALQETQDSVKASECQDTKEFLPEEIRHRSETD SVYEDVSGMISPDTTSDRQGQERTKRKRKVRMVTKFVPPSNK >gi|283510516|gb|ACQH01000103.1| GENE 4 3500 - 4183 786 227 aa, chain + ## HITS:1 COG:VC2379 KEGG:ns NR:ns ## COG: VC2379 COG0775 # Protein_GI_number: 15642376 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside phosphorylase # Organism: Vibrio cholerae # 1 227 1 230 231 141 38.0 9e-34 MKIALIVAMDKEFVQLRSLLENEKVETHRNLTFVTGSIGTNDLVLLQCGIGKVNSTIGAV ETITRYAPQLVISTGVAGGADVEMNPLEVVVGTHYAYHDLYCGTDVAYGQLPGLPPCFDA ATDMVSRALSIKGDTPIHGGLIVSGDWFVDSKEKMRSILEHFPKAKAVDMESCSIAHTCY RYQTPFVSFRIISDVPLKDEKAAQYFDFWERMANGSFAVVSQFVKSL >gi|283510516|gb|ACQH01000103.1| GENE 5 4303 - 4782 712 159 aa, chain + ## HITS:1 COG:BB0377 KEGG:ns NR:ns ## COG: BB0377 COG1854 # Protein_GI_number: 15594722 # Func_class: T Signal transduction mechanisms # Function: LuxS protein involved in autoinducer AI2 synthesis # Organism: Borrelia burgdorferi # 1 158 17 173 173 189 55.0 3e-48 MEKIPSFTIDHNKLKRGIYVSRKDNVGNDVVTTFDVRMKEPNREPILEQGAIHTMEHLAA TFLRNDPEWKDRVVYWGPMGCLTGNYLLMKGDLQSADIVDLMRRTFEFVANFEGEVPGAA AKDCGNYLLHDLPAARCEARKYVDEVLNNMKEENLIYPK >gi|283510516|gb|ACQH01000103.1| GENE 6 4790 - 5524 482 244 aa, chain + ## HITS:1 COG:BH1421_1 KEGG:ns NR:ns ## COG: BH1421_1 COG0705 # Protein_GI_number: 15613984 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Bacillus halodurans # 61 209 194 330 349 69 27.0 6e-12 MTDEEHPISGGQEGKQPVTLPVEPPKHMPYVKPMLYVKQTAGSIDFSFKNYKTQYPAAIP LVVCFVYFVVMVLGGVNPFFTTAEDCLEWGGGQRESIYVDGEFWRLITNVFAHADVIHLM SNAASYVYGAMFLLEIMSPGKVVAVFITCGVVGSLVCSVTNSYVFLGASGGVLGFYGAFI GYALLDTSVFQRHKTAFYVALGLVALSILSSFRVGISLTIHAVGLLTGFLIGCVLPLWKQ RNGD >gi|283510516|gb|ACQH01000103.1| GENE 7 5909 - 6178 391 89 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160883111|ref|ZP_02064114.1| hypothetical protein BACOVA_01079 [Bacteroides ovatus ATCC 8483] # 1 89 1 89 89 155 84 3e-37 
MYLDQAKKQEIFGQYGKSNSDTGSAESQIALFSYRISHLTEHLKKNRKDYNTERSLTMLV GKRRRLLDYLYDRDIERYRAIIKALGLRR >gi|283510516|gb|ACQH01000103.1| GENE 8 6516 - 8546 2106 676 aa, chain + ## HITS:1 COG:BMEI0657 KEGG:ns NR:ns ## COG: BMEI0657 COG4206 # Protein_GI_number: 17986940 # Func_class: H Coenzyme transport and metabolism # Function: Outer membrane cobalamin receptor protein # Organism: Brucella melitensis # 66 557 12 485 599 105 25.0 2e-22 MYKSLFNKRASLRFTHFSRKGYALFSCLGKEVIVGTLSVATLAHAKAEGVSTRTPLAADT LTHKELKLDEVVVTGSRAPLTALQAAKVVAVITSDDIQRAAATSVNDVLKLVSGVDVRQR GGFGVQTDISINGGTFDQITILLNGVNISNPQTGHNAADFPVSLSDIDRIEVLEGASARV FGASAFSGAINIVTKTEAQSGVRLSADGGSFGTFGGSAALSLVSKAVSQQLSGGYSQSDG GTINTYFRKRQGYYQGNFAANAVDLFWQAGLISKDYGAGTFYGIASDNQYEATRRFIASV GSNIRPVANNVLTLSPTIYWQRNWDHYQWKRGMTGAAKGENYHRTDVFGGSLNTVVNWLL GKTALGADLRKEVIYSTAYGETLPADQWKSIHGSDRMYERRGERTNTSLFLEHNVVLEHF TLSAGLLANHNTWLSGGLRFYPGVDISWRPNVQWKVYASWNKALRLPTYTDLYISNRAQV GDMNLNPERVNTYKLGARYRTTALETQLNAFYSNGRDMIDWVFETAASTRYHALNIGKLD NMGASLDLAFMPKELWAGSPFTAVKLGYAYIHQQHETTQPIHKSLYALEYLRHKFTVSVD HHIVSRLSAHWAMRWQQRMNGYHPYTKVDLKLQWTAPNYSLYVQGDNLTAHPYYDLGGVK QPGLWVMAGGSVKLNW >gi|283510516|gb|ACQH01000103.1| GENE 9 8754 - 9746 950 330 aa, chain + ## HITS:1 COG:SMb20500 KEGG:ns NR:ns ## COG: SMb20500 COG0667 # Protein_GI_number: 16264230 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Sinorhizobium meliloti # 4 326 8 324 331 236 42.0 3e-62 MEYKKLGATDLRLSVITYGSFAIGGTMWGGNEAADSIAAVRASIDNGITSIDTAPFYGFG LSEEMIGKAIKGYDRSRLQLLTKFGLVWDGSNAGRGERTSDALKDGKTVSVYKYASRENI LKEVEESLKRLDTDYIDLLQLHWPDSTTPISETMETLDLLVKQGKIRAAAVCNYSVEQLE EAAKSIQLASNQVPYSMLNRRIEAQLVPYALKNNLGIIAYSPMARGLLTGKYFEGSRLKA DDHRNEYFSRFNLERVEAMLNRLKPLAEEKQATLAQLVLRWTTLQKGISVVLAGARNATQ AISNAAAMSFSLSNEELLFINREVDACTAP >gi|283510516|gb|ACQH01000103.1| GENE 10 9962 - 10429 513 155 aa, chain - ## HITS:1 COG:no KEGG:FP0040 NR:ns ## KEGG: FP0040 # Name: not_defined # Def: hypothetical protein # Organism: 
F.psychrophilum # Pathway: not_defined # 26 146 20 142 267 85 39.0 6e-16 MRKQTKSILCAALAVLCMITLTTACTASKQANNDIEQISYHEARNFFLNQGQELAVGQKI TNDKDFDKFFGLAAFMGKGGEPTRVDFANEFVLPIVLPETDCETDITAVSLSGNASVINL NYKVKVGEKRDFYIRPIMLLVVDKAYRDAQVVVNQ >gi|283510516|gb|ACQH01000103.1| GENE 11 11295 - 12035 732 246 aa, chain + ## HITS:1 COG:BH1048 KEGG:ns NR:ns ## COG: BH1048 COG0778 # Protein_GI_number: 15613611 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Bacillus halodurans # 1 182 5 186 244 103 30.0 3e-22 MMKSLQNRTTIRRYKADDVSEQLLNGLLEQAARTQTMGNLQLYSVVVTRDASQKRRLAPA HFNQPMTTQAPVVLTICADFRRTTIWAEQRKAHPGYDNFLSFMNAATDALLFTQTFCCLA EEEGLGYCFLGTTIYNAQQIIDVLELPPLVVPVATLTLGWPDEQPALSDRLPLAAFIHDE VYRDYTPTKIDDFYAGKEDLAVNKAFVAENKKETLAQVYTDVRYTKAANEAMSEELLKVL RQQGFV >gi|283510516|gb|ACQH01000103.1| GENE 12 11963 - 12229 110 88 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSGQTQLVRKHPFAHKPTPRLILLTQDGTVRKHPFTHKPTPYIYTRVCARARSMQCSFLH QAKSYHTKPCCLSTFKSSSDMASLAALV >gi|283510516|gb|ACQH01000103.1| GENE 13 13029 - 13445 248 138 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929538|ref|ZP_06423382.1| ## NR: gi|288929538|ref|ZP_06423382.1| hypothetical protein HMPREF0670_02276 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 138 1 138 138 253 100.0 2e-66 MKKKLLYLVFVALTAMTAGLLSLTSCSKDDDRKDEIESPAELKAKFVGQWKLTFVYLPSG GKIDSPYSWVYDFTSDGTMKILSDGVVQEERKWYVSISGKERLLFVDPTTKMIITSISSE EFYGLESDGSTSVYVRVY >gi|283510516|gb|ACQH01000103.1| GENE 14 13746 - 14585 697 279 aa, chain - ## HITS:1 COG:no KEGG:PRU_0130 NR:ns ## KEGG: PRU_0130 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 262 1 292 297 327 56.0 3e-88 MQLQTPVTIPQPKFDIPPFECILLVGSCFAQNMGKRFVDAQFRATVNPYGVMYNPASILH TVERCGESPAVGIFTLGTNHVYCWKETGEIADNCRKLPQKLFEEKELSVDECVDFLNQAV AILKERNPQVQVVLTVSPIRYAKYGFVASGLSKATLLLATNRLLGLHPDCTSYFPAYELM NDELRDYRFYAPDMLHPSEQAVEYIWQRFTDAYLSFSAHDFLKEWQPLQAALNHKPFNPD GAEYKAFLAKTLQDIRALETKYGHKLDLSAYESIRICKG >gi|283510516|gb|ACQH01000103.1| GENE 15 14488 - 14691 79 67 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFWAKQEPTRSMHSKGGMSNLGWGMVTGVCSCIMLNGLLLLCCLPQYKGRVRINCTAVGG MFVHPDG >gi|283510516|gb|ACQH01000103.1| GENE 16 14818 - 15960 947 380 aa, chain - ## HITS:1 COG:PA4204 KEGG:ns NR:ns ## COG: PA4204 COG2706 # Protein_GI_number: 15599399 # Func_class: G Carbohydrate transport and metabolism # Function: 3-carboxymuconate cyclase # Organism: Pseudomonas aeruginosa # 5 372 2 379 388 151 31.0 2e-36 MLHPRPLLSLVLSALSLCGFAKGKQSAQLTLFVGTYTEDSPSEGIYVFHFNQETGHFSLR SSAKAGNPSFVVLSPNGNRLYAVSEFNDGRQGAYSFDYNPTTGRLSNPTFVSTKASDRLG DPENRPGSDPCNLYTDGHCLITANYSGGDISAFKLDKQGRLHGTAQQTRFIGRDSAAVSH IHCGLPTPEGKYFLVTDLGNDRIHRFNFVPNGTEILTNPQVVYEVKRGEGPRHITFDKGG KHLYLINELGGTCVVMRYNDGKLTEIQTLMADEGGGRGSADIHISPDGKYLYTSHRLKKD GIAIFSINPQTGLLTKVGYQLTGLHPRNFAITPNGKYLLVACRDNNTIQIFERNANTGLL ADTGKRINLGKPVCLVLRKY >gi|283510516|gb|ACQH01000103.1| GENE 17 16184 - 16597 221 137 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 [Streptococcus pneumoniae SP9-BS68] # 2 135 118 248 278 89 34 2e-17 MREHKVYMSLGANLGDRESNIKLAIQQISELIGPVVRQSALLETAPWGFSSANTFINAAV CSQTSLSPREVLNATQDIERTLGRTQKSTDGQYHDRPIDIDILLYDDLQVNEPDLVIPHP HMNERQFVLQPLSEIIE 
>gi|283510516|gb|ACQH01000103.1| GENE 18 16603 - 17769 1361 388 aa, chain - ## HITS:1 COG:SA1466 KEGG:ns NR:ns ## COG: SA1466 COG0809 # Protein_GI_number: 15927220 # Func_class: J Translation, ribosomal structure and biogenesis # Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Staphylococcus aureus N315 # 1 386 1 341 341 263 37.0 6e-70 MKLSQFNFKLPQELIALYPNSVYHEIKNEKGEKEMSRLTRRDECRLMVLHKKSGTIDLFR KDKKGKPIKGDFLQFRDVIDYFDEGDTFIFNDTKVFPARLYGMKEKTDAKIEVFLLRELN QEMRLWDVLVEPARKIRIGNKLFFDDSGTMVAEVIDNTTSRGRTLRFLYDCDHDEFKRSL FALGESPLPRYIIDKRENHHGTEEDMEDFQTIYAANEGAVTAPATGLHFSRELLKRMEIR GINSAFITLHCGLGNFHDIEVEDLTKHKVDSESMHIGAEACAIVNKTKAEGRRVCAIGTS VVKATETAVGTDGMLKEYTGWTNRFIFPPYDFGLADTMMANFYHPLSTLLMSTAAFGGYD LVMEAYRLAVENEYKFGCFGDALLILND >gi|283510516|gb|ACQH01000103.1| GENE 19 18244 - 18954 771 236 aa, chain - ## HITS:1 COG:ML1546 KEGG:ns NR:ns ## COG: ML1546 COG0130 # Protein_GI_number: 15827813 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridine synthase # Organism: Mycobacterium leprae # 8 224 14 231 320 172 45.0 4e-43 MNIQEGEIICIDKPYEITSFGALARVRYLLSRRLGVKRVKIGHAGTLDPLATGVLVLCTG KATKRIAELQAHTKEYVATLQLGATTPSYDMEHPVNETFETGHITREAIDNALQQFVGEI QQVPPTYSAVKVNGDRAYDLRRKGKDVELTPKTLRIDEIEVLNFDPSQMQLTIRVVCGKG TYIRALARDVGRALQSGAYLTALRRTRVGDVRVENCLQLDSFKEWLDQQDIEIPIS >gi|283510516|gb|ACQH01000103.1| GENE 20 18954 - 19814 1033 286 aa, chain - ## HITS:1 COG:aq_2195 KEGG:ns NR:ns ## COG: aq_2195 COG1968 # Protein_GI_number: 15607126 # Func_class: V Defense mechanisms # Function: Uncharacterized bacitracin resistance protein # Organism: Aquifex aeolicus # 5 277 5 255 256 184 42.0 1e-46 MDILQTIIVAIVEGLTEFLPVSSTGHMIITQALLGIESDEFVKAFDVIIQFGAILAVVVL YWKRFFRLDHSTPPEGLTPVQRTLRKWDFYYKLLIAFIPAMIIGGLCNKYIDALLGNVMV VAVMLVVGGVFMLFCDKLFGNGSPETKLTTRRAFAIGVFQCLAMVPGVSRSMATIVGGMT QKLTRKDAAEFSFFLAVPTMFAATCLDLLKLFLHMHKIGDYSMLASHVGTLLLGSVVAFV VALVAIKAFIAYLTKYGFRAFGVYRIIVGGAILAWLLAGNSLAMVD >gi|283510516|gb|ACQH01000103.1| 
GENE 21 19821 - 20078 359 85 aa, chain - ## HITS:1 COG:no KEGG:PRU_1834 NR:ns ## KEGG: PRU_1834 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 74 1 74 74 113 74.0 2e-24 MDKKNFAFDKMNFILLAVGMAVVLLGLILMSGPGSTETAFNPDIFSATRIKVAPVVCFLG FISIIYAVIRKPKDPKEQQEEQQEA >gi|283510516|gb|ACQH01000103.1| GENE 22 20349 - 21230 869 293 aa, chain - ## HITS:1 COG:CAC0498 KEGG:ns NR:ns ## COG: CAC0498 COG2177 # Protein_GI_number: 15893789 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Clostridium acetobutylicum # 27 283 35 299 301 92 28.0 1e-18 MRKKHNKARSRYGLQVITLCISTALVLILLGIIVLSVFSTRNLTNYVKENLVVTMMLDQD LTNPEAQRLCTRLQARPYIKSLSFISKEQALKEQTAAMGSDPSEFVGGDNPFLASIELTL KGDYANNDSLNWISKELKAYPKVADITYQHELVEDVNSTLSKISLVLLALAALLTFVSFS LINNTVRLGIYARRFSIHTMKLVGASWGFIRRPFIKQAVGVGIVAALLAIVVLAGCVYGL YRYQENMLSIITWQVLAITACAVLLFGIFITAICAYISVNRFLRMKAGELYKI >gi|283510516|gb|ACQH01000103.1| GENE 23 22299 - 22880 402 193 aa, chain + ## HITS:1 COG:no KEGG:Ppha_1224 NR:ns ## KEGG: Ppha_1224 # Name: not_defined # Def: hypothetical protein # Organism: P.phaeoclathratiforme # Pathway: not_defined # 24 185 15 174 183 101 34.0 1e-20 MLTLRTEKFLEMTHFDKLRLTEIVLYILNKTNGLDYYHVFKVIYFANVAHLAKYGSLMVS DDFCALPDGPVPSNLYNCVKGEQFCDKDLQSMLDESVSKGKDDAYYMLEAKREADEEYLS KADIEVLDESISENAYLPYGDLRAKSHGEEWKRAFEQHGRKVVDVIGMAKDGMASDGMIA YIKENLAVEDALS >gi|283510516|gb|ACQH01000103.1| GENE 24 22877 - 23080 104 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|265753781|ref|ZP_06089136.1| ## NR: gi|265753781|ref|ZP_06089136.1| conserved hypothetical protein [Bacteroides sp. 3_1_33FAA] # 1 67 1 67 153 91 71.0 2e-17 MTILGDILGDVGDKLAQDKIKVGNVYLINLDQRNGITPKNGDLTRDKFFIVLGFDNEGNV IGGLVIN Prediction of potential genes in microbial genomes Time: Sat May 28 02:20:42 2011 Seq name: gi|283510515|gb|ACQH01000104.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.104, whole genome shotgun sequence Length of sequence - 7122 bp Number of predicted genes - 8, with homology - 7 Number of transcription units - 6, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 51 - 2120 1744 ## COG3501 Uncharacterized protein conserved in bacteria 2 1 Op 2 . + CDS 2132 - 2920 373 ## gi|288929551|ref|ZP_06423395.1| hypothetical protein HMPREF0670_02289 + Prom 2952 - 3011 1.7 3 2 Op 1 . + CDS 3052 - 3837 619 ## gi|288929552|ref|ZP_06423396.1| hypothetical protein HMPREF0670_02290 4 2 Op 2 . + CDS 3755 - 3964 120 ## 5 3 Tu 1 . + CDS 4024 - 4368 227 ## gi|288802910|ref|ZP_06408347.1| hypothetical protein HMPREF0660_01352 + Term 4403 - 4443 -0.6 6 4 Tu 1 . + CDS 4736 - 5338 435 ## gi|288929553|ref|ZP_06423397.1| hypothetical protein HMPREF0670_02291 + Prom 5409 - 5468 3.8 7 5 Tu 1 . + CDS 5490 - 6347 194 ## gi|288929554|ref|ZP_06423398.1| hypothetical protein HMPREF0670_02292 + Prom 6445 - 6504 6.7 8 6 Tu 1 . + CDS 6605 - 7121 487 ## CHU_0660 hypothetical protein Predicted protein(s) >gi|283510515|gb|ACQH01000104.1| GENE 1 51 - 2120 1744 689 aa, chain + ## HITS:1 COG:all3320_2 KEGG:ns NR:ns ## COG: all3320_2 COG3501 # Protein_GI_number: 17230812 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. 
PCC 7120 # 12 184 39 208 219 77 33.0 8e-14 MNWQTGNMHTDWIRVMTPDGGGCRDGVETNRGFVFIPEVGDHVLVGFRHNDPNRPYVMGS LFNGRTGKGGGDGNNSKSICTRSGICIAFDDDSRTLTLSDPSGNMVLMDGQGHMELNAPN GIVMNASRIAMNTGGNITLNVGGELMASVAGNWLTSVGGAMNAVTNSYRTTVVNALEMCS ASALFSTEMGMQLQGETLNAIGTKKLLMHSDEQVLANSRGRMDMKSDGSLNMEQKADDVK KEEKEQMALATVEFRPARPEYNGQYGFDWLRVNDGKATAEMPYKDIIVSGYSDGLQNLTK DKAYDNLKNEYKQVPITFKKGEENTYFVPYLNLYSKECVEALKLDKDVSKPCYKALLRVL VDINEEVDRLEFDYDKETFEIDRPVLTEKEKTNGKIESGNTIEITCNKSFSDADKGVIRV FAYPKGCADKPMPDQVRLRSLAGKILVGVNDEKAQKRMKFVLVCVKTEINGKEKEGKFSS EHLGLLSNTLHQMYINPVFEQHLKDKDGKVLMDADGNPKSIVLDLVEDKRFMPGGIYVKN GKIQKFTKQRVGMTLQRYFNSNVARYLRIVFLRKYPQYNGFFTIFSFNERGFDKKVELYG FCESVYKSGTTKFSHYKKNLVLFANPEAMVLSHEVLHGMGLHHTHIDDEVIEESTRRYVY KENTTDNVMSYASTRKSTWYWQWKIVRKE >gi|283510515|gb|ACQH01000104.1| GENE 2 2132 - 2920 373 262 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929551|ref|ZP_06423395.1| ## NR: gi|288929551|ref|ZP_06423395.1| hypothetical protein HMPREF0670_02289 [Prevotella sp. oral taxon 317 str. F0108] # 1 262 1 262 262 447 100.0 1e-124 MIRNLIILTIVLFSSLRGDAQTDSVYNYKGKSMIRYFDIEEFKRKSTTSSGYDCVEGDMR VSYLKRYDSKEKVIGYTEWREGISTPYGYYYEYDLRGRLIHSITSFQAFDIGKECYYDTL GRIIKTIDHDIPYKFTLDAFIEKMKNEYGYDVEDRKKVNLLERDEGKEDLHRPWYEVLCT NEPNGLYGERFLIDGTTGETLYIERDVYIDREGDGKYIEDMLAGRPVTIQEKYLYEQKKK KEEQDKKGTKKKGKSFWRKLFD >gi|283510515|gb|ACQH01000104.1| GENE 3 3052 - 3837 619 261 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929552|ref|ZP_06423396.1| ## NR: gi|288929552|ref|ZP_06423396.1| hypothetical protein HMPREF0670_02290 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 261 1 261 261 489 100.0 1e-137 MRMLAIISLLMPFCTLCCAQTDSVYSHKGKSMIRYFDIDDFVAKAKGSDFYECMEGHTKV TCFKINDTNGVHVGYTENRESLDTPYSNYYECDLKGRLRYSLESFYNVSIRDKCHYDTLG RVVQMEDKDKPYKFTLESLIKKMKAEYGYDILDRKMTLKMYRSEGKNDLHRPWYEVWCNE DSSGSYSDHFLIDGTTGETLYMEKHVDNLMLMTPEEELKAMTNKSPVKLQDKYLYEQSKK NQQDKKGTKKKGKSFWRKLFD >gi|283510515|gb|ACQH01000104.1| GENE 4 3755 - 3964 120 69 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNRVRKINKIRKGQRRRARAFGASCLIKVRVVSEVYACGGNGAALAHLNGRTDGDRAKRY TYTYVAQTA >gi|283510515|gb|ACQH01000104.1| GENE 5 4024 - 4368 227 114 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288802910|ref|ZP_06408347.1| ## NR: gi|288802910|ref|ZP_06408347.1| hypothetical protein HMPREF0660_01352 [Prevotella melaninogenica D18] # 1 114 1 114 114 132 60.0 5e-30 MEKIIVDGAHMRCTFATGNAQIRVDSHSLVRIGGALVATEADKAGMKNIPTFGTCKCGWP NRPCVPSPIAWQHVSAQSSVNGAKKLTMQSFCPCAKGGRISFCDAADNSFVESE >gi|283510515|gb|ACQH01000104.1| GENE 6 4736 - 5338 435 200 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929553|ref|ZP_06423397.1| ## NR: gi|288929553|ref|ZP_06423397.1| hypothetical protein HMPREF0670_02291 [Prevotella sp. oral taxon 317 str. F0108] # 1 200 1 200 200 368 100.0 1e-101 MKINSTYVWSNLVVVPVLFYIITYLFYRGNTDAYDYVDTYLSRSKVIKLTIPQKYERIRE FYRFRFGKKGSRNYNQFIFYSDKYKKYLAIVSFSGIDTSGEIDNEDLSRSSLYIEQPTIF AYANTAQWNDAKYGTRALPMPIFFIKATHLYNNQLTIGPWDSSSSPCIIDYNSTRKFVEQ YLAYFLPEEEFDRLFKGKEK >gi|283510515|gb|ACQH01000104.1| GENE 7 5490 - 6347 194 285 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929554|ref|ZP_06423398.1| ## NR: gi|288929554|ref|ZP_06423398.1| hypothetical protein HMPREF0670_02292 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 285 1 285 285 561 100.0 1e-158 MKRFIFIVIITIGCISCTARRKPQEISFTSRKDTLFVDKERCAIFFKLSLGELNELRKTY KEKEDFYTTSDGIAEDKFQIKTFLEENNISIIYADTTIGVIRFKDCDVNITNTPIIKTIW TSIILYNHGKKYAVIQYPDFDFTSKYFDLHEGNKLGEKHIKQVRNSKWFGEYEYFLSDTT TAPSPFIEYTLSITADRCVFSGNGHMTAFEVLCSVKAETEDKLSFGFAKSLTEDEPMPNL DRGISPIVNLYYKEGRYYFESPFISNFEGKENVKIKCRKVNRHGF >gi|283510515|gb|ACQH01000104.1| GENE 8 6605 - 7121 487 172 aa, chain + ## HITS:1 COG:no KEGG:CHU_0660 NR:ns ## KEGG: CHU_0660 # Name: not_defined # Def: hypothetical protein # Organism: C.hutchinsonii # Pathway: not_defined # 11 150 385 541 605 118 42.0 8e-26 MAAFLTFCHGKQQLIIRMSWIRVMTPGGGGSRGGVETNRGFVFIPEVGDHVLVGFRHGDP NRPYVMGSLFNGITGGGGGKGNSCKSITTRSGSTLAMNDETGSVRLADPGKAWLLMDGAG NTTTSSKNNCTISAGNSHVVNAGANYATNVGKGASCLQMDADGNILLEGKSK Prediction of potential genes in microbial genomes Time: Sat May 28 02:21:36 2011 Seq name: gi|283510514|gb|ACQH01000105.1| Prevotella sp. oral taxon 317 str. F0108 cont2.105, whole genome shotgun sequence Length of sequence - 1027 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 7 - 66 2.1 1 1 Tu 1 . + CDS 163 - 972 615 ## gi|288929556|ref|ZP_06423400.1| prophage PSPPH06, putative tail tape measure domain protein Predicted protein(s) >gi|283510514|gb|ACQH01000105.1| GENE 1 163 - 972 615 269 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929556|ref|ZP_06423400.1| ## NR: gi|288929556|ref|ZP_06423400.1| prophage PSPPH06, putative tail tape measure domain protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 269 45 313 313 453 100.0 1e-126 MYIDKPTGKGFDGAYTKNGEYIVDDAKPWKKGPEKTKDSGKQLSQKWVGRHLEQGAVPEN HAEAMKAANDKNSLRRTVTHVDGDGNMRIINYATKGTGDVSSTRKSEKVTKPPTKAKGLI KSVRSKVANLRPVKVISESNFSKAAQSSKAAAKANDALWKGTQYIESSPVLRTVGKVAGR ALIVVGIAWDAYCINDAYQEEGEFGDKTQQATGAAVGGLAGAWAGAEIGAIIGTAVCPSV GTLVGAVVVGLIGAFLCSWGGSALVDSIF Prediction of potential genes in microbial genomes Time: Sat May 28 02:21:48 2011 Seq name: gi|283510513|gb|ACQH01000106.1| Prevotella sp. oral taxon 317 str. F0108 cont2.106, whole genome shotgun sequence Length of sequence - 1736 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 95 - 154 1.7 1 1 Op 1 . + CDS 175 - 978 597 ## Sez_1333 hypothetical protein 2 1 Op 2 . + CDS 1039 - 1341 301 ## gi|288929556|ref|ZP_06423400.1| prophage PSPPH06, putative tail tape measure domain protein 3 1 Op 3 . + CDS 1361 - 1736 186 ## gi|288929476|ref|ZP_06423320.1| conserved hypothetical protein Predicted protein(s) >gi|283510513|gb|ACQH01000106.1| GENE 1 175 - 978 597 267 aa, chain + ## HITS:1 COG:no KEGG:Sez_1333 NR:ns ## KEGG: Sez_1333 # Name: not_defined # Def: hypothetical protein # Organism: S.equi # Pathway: not_defined # 30 262 8 229 231 90 30.0 6e-17 MGLFDAIFGSSSNMQKWHMRDKVKDHKYFERQLERNLQSKVITEQALIESVENNNNQVDR TFLYWSLAQNAYYNIEIKYSMGGCIKELQTDYLNSLDDFRHGCDINDPIYFEILNRVALG ILLKIPNEKFQLLVEYVKRVDEETNPNDWTADMLLWFMLNSHLEDNERKSYAQALAFPCI YMGLYKVTQTTEKEKTKKALNEYLDKWYDLNKNAPWHNTHLRSNGYTGYWAWEVAAVAKI MHIDDTDLKDNPYYPYDMVHWEEDKEE >gi|283510513|gb|ACQH01000106.1| GENE 2 1039 - 1341 301 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929556|ref|ZP_06423400.1| ## NR: gi|288929556|ref|ZP_06423400.1| prophage PSPPH06, putative tail tape measure domain protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 100 214 313 313 139 91.0 6e-32 MLRTVGKVAGRSLIVVGIAWDAYRINEAYQEEGEFGDKTQQATGAAVGGLAGAWAGAEIG AIIGTAVCPGVGTLVGAVIVGLIGAYVCSKGGSALVDSIF >gi|283510513|gb|ACQH01000106.1| GENE 3 1361 - 1736 186 125 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929476|ref|ZP_06423320.1| ## NR: gi|288929476|ref|ZP_06423320.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. F0108] # 1 125 6 130 184 177 75.0 2e-43 MGLFDAIFGSSKKEQEKPIRDKLKKRKDFEKVLKIRLAEIDRYKAYLTTEQESELFFCRA VGHDSIDVIQIRYSMDGGLKELQRIYVESLNYFIVGFNDDNPIYFDILNRVSLGILLNIP DENFM Prediction of potential genes in microbial genomes Time: Sat May 28 02:22:27 2011 Seq name: gi|283510512|gb|ACQH01000107.1| Prevotella sp. oral taxon 317 str. F0108 cont2.107, whole genome shotgun sequence Length of sequence - 105122 bp Number of predicted genes - 64, with homology - 60 Number of transcription units - 36, operones - 15 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 54 - 296 109 ## + Prom 436 - 495 3.7 2 2 Tu 1 . + CDS 533 - 3097 2864 ## COG0209 Ribonucleotide reductase, alpha subunit + Term 3112 - 3161 14.1 + Prom 3204 - 3263 4.6 3 3 Tu 1 . + CDS 3454 - 3825 349 ## COG1539 Dihydroneopterin aldolase + Prom 3905 - 3964 4.0 4 4 Op 1 . + CDS 4142 - 5524 1385 ## COG0860 N-acetylmuramoyl-L-alanine amidase 5 4 Op 2 . + CDS 5531 - 6433 1139 ## PRU_1284 Mce-like protein + Term 6519 - 6572 10.2 + Prom 6764 - 6823 5.8 6 5 Tu 1 . + CDS 6880 - 8289 1477 ## COG0593 ATPase involved in DNA replication initiation + Term 8419 - 8462 -0.8 - Term 8273 - 8316 6.0 7 6 Tu 1 . - CDS 8452 - 9051 286 ## gi|288929564|ref|ZP_06423408.1| hypothetical protein HMPREF0670_02302 - Prom 9072 - 9131 2.9 - Term 10778 - 10825 7.8 8 7 Op 1 4/0.000 - CDS 10867 - 12201 1479 ## COG0477 Permeases of the major facilitator superfamily 9 7 Op 2 . - CDS 12303 - 13457 1354 ## COG1609 Transcriptional regulators - Prom 13551 - 13610 3.7 + Prom 14026 - 14085 6.1 10 8 Op 1 . 
+ CDS 14138 - 17155 3204 ## PRU_2681 outer membrane protein SusC 11 8 Op 2 . + CDS 17178 - 18767 1594 ## PRU_2682 outer membrane protein SusD 12 8 Op 3 . + CDS 18814 - 19938 892 ## BT_3699 outer membrane protein SusF + Term 19983 - 20035 17.0 + Prom 20642 - 20701 8.1 13 9 Op 1 3/0.000 + CDS 20878 - 21444 776 ## COG1704 Uncharacterized conserved protein 14 9 Op 2 . + CDS 21466 - 22479 1019 ## COG0501 Zn-dependent protease with chaperone function + Term 22504 - 22549 5.7 + Prom 24597 - 24656 4.9 15 10 Op 1 . + CDS 24679 - 25464 1091 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase 16 10 Op 2 . + CDS 25469 - 27466 2186 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase + Term 27624 - 27658 0.7 + Prom 27469 - 27528 2.2 17 11 Op 1 . + CDS 27663 - 28790 1376 ## BF0928 hypothetical protein 18 11 Op 2 . + CDS 28884 - 29540 460 ## BF0187 hypothetical protein 19 11 Op 3 . + CDS 29570 - 32272 2730 ## COG0210 Superfamily I DNA and RNA helicases 20 11 Op 4 . + CDS 32337 - 34469 1891 ## COG4412 Uncharacterized protein conserved in bacteria + Term 34546 - 34597 0.2 + Prom 34534 - 34593 1.6 21 12 Tu 1 . + CDS 34660 - 35808 685 ## gi|288929580|ref|ZP_06423424.1| hypothetical protein HMPREF0670_02318 + Term 35844 - 35886 0.9 + Prom 37216 - 37275 2.9 22 13 Tu 1 . + CDS 37329 - 37541 293 ## COG1983 Putative stress-responsive transcriptional regulator + Term 37621 - 37669 11.5 23 14 Tu 1 . - CDS 37710 - 40028 1545 ## BVU_0280 hypothetical protein - Prom 40183 - 40242 5.7 + Prom 40317 - 40376 7.6 24 15 Tu 1 . + CDS 40403 - 40912 263 ## BT_3896 TonB + Prom 40932 - 40991 3.4 25 16 Tu 1 . + CDS 41018 - 41533 344 ## BDI_3213 outer membrane protein TonB - Term 41402 - 41435 -0.7 26 17 Tu 1 . - CDS 41563 - 42951 1080 ## BF1115 hypothetical protein - Prom 43169 - 43228 5.0 27 18 Op 1 . - CDS 43562 - 45187 1118 ## gi|288929586|ref|ZP_06423430.1| hypothetical protein HMPREF0670_02324 28 18 Op 2 . - CDS 45242 - 45463 71 ## 29 18 Op 3 . 
- CDS 45510 - 47141 900 ## gi|288929587|ref|ZP_06423431.1| hypothetical protein HMPREF0670_02325 - Prom 47214 - 47273 3.2 30 19 Tu 1 . - CDS 47316 - 47699 504 ## COG0346 Lactoylglutathione lyase and related lyases - Prom 47748 - 47807 1.5 31 20 Op 1 . - CDS 47936 - 50080 2270 ## PRU_2265 hypothetical protein 32 20 Op 2 . - CDS 50058 - 51014 700 ## PRU_2264 hypothetical protein 33 20 Op 3 . - CDS 51038 - 52717 218 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 34 21 Tu 1 . - CDS 52848 - 53768 936 ## COG0061 Predicted sugar kinase - Prom 53866 - 53925 3.6 + Prom 53842 - 53901 5.1 35 22 Tu 1 . + CDS 53925 - 54143 136 ## + Term 54292 - 54328 3.0 + Prom 55391 - 55450 3.0 36 23 Tu 1 . + CDS 55487 - 56104 769 ## COG0179 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) + Prom 56249 - 56308 4.9 37 24 Tu 1 . + CDS 56474 - 59983 3570 ## PRU_2691 hypothetical protein + Prom 60109 - 60168 5.4 38 25 Tu 1 . + CDS 60195 - 61382 1374 ## PRU_2690 hypothetical protein + Term 61401 - 61460 2.4 + Prom 61521 - 61580 3.0 39 26 Tu 1 . + CDS 61620 - 62099 586 ## COG0245 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase + Prom 62618 - 62677 9.9 40 27 Op 1 . + CDS 62783 - 63763 878 ## COG4206 Outer membrane cobalamin receptor protein 41 27 Op 2 . + CDS 63789 - 65180 1464 ## BF0628 putative TonB-linked outer membrane receptor 42 27 Op 3 . + CDS 65242 - 66480 1139 ## BF0627 hypothetical protein + Term 66543 - 66602 8.0 - Term 66738 - 66795 9.7 43 28 Op 1 . - CDS 66813 - 68258 1327 ## PRU_2629 sialic acid-specific 9-O-acetylesterase-like protein 44 28 Op 2 . - CDS 68255 - 69934 1917 ## COG4124 Beta-mannanase 45 28 Op 3 . - CDS 69950 - 71236 1442 ## Dfer_2531 hypothetical protein 46 28 Op 4 . - CDS 71260 - 73017 1963 ## Cpin_6367 RagB/SusD domain protein 47 28 Op 5 . - CDS 73036 - 76320 3767 ## Fjoh_4951 TonB-dependent receptor, plug 48 28 Op 6 . - CDS 76304 - 76714 172 ## - Prom 76780 - 76839 3.5 49 29 Op 1 . 
+ CDS 78185 - 79588 1252 ## gi|288929604|ref|ZP_06423448.1| hypothetical protein HMPREF0670_02342 + Term 79597 - 79646 8.2 + Prom 79591 - 79650 4.4 50 29 Op 2 . + CDS 79676 - 82120 1972 ## GYMC10_1275 metallophosphoesterase + Term 82182 - 82226 5.3 - Term 82166 - 82218 13.4 51 30 Op 1 . - CDS 82318 - 82509 196 ## gi|288929606|ref|ZP_06423450.1| hypothetical protein HMPREF0670_02344 52 30 Op 2 1/0.000 - CDS 82658 - 83302 717 ## COG2860 Predicted membrane protein 53 30 Op 3 . - CDS 83486 - 84445 1222 ## COG1052 Lactate dehydrogenase and related dehydrogenases - Prom 84485 - 84544 3.2 54 31 Tu 1 . - CDS 84644 - 85921 1374 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase - Term 86793 - 86850 8.1 55 32 Tu 1 . - CDS 86973 - 87863 849 ## BT_2087 hypothetical protein 56 33 Op 1 24/0.000 + CDS 88259 - 88921 837 ## COG0740 Protease subunit of ATP-dependent Clp proteases 57 33 Op 2 . + CDS 89087 - 90319 268 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 58 33 Op 3 . + CDS 90540 - 92717 2567 ## COG0514 Superfamily II DNA helicase + Term 92840 - 92886 10.9 + Prom 93239 - 93298 4.0 59 34 Op 1 . + CDS 93501 - 96635 3305 ## Cpin_5147 TonB-dependent receptor plug 60 34 Op 2 . + CDS 96651 - 98168 1982 ## Cpin_1098 hypothetical protein 61 34 Op 3 . + CDS 98181 - 98855 741 ## gi|288929617|ref|ZP_06423461.1| hypothetical protein HMPREF0670_02355 + Prom 98886 - 98945 3.5 62 35 Op 1 . + CDS 98998 - 100629 1407 ## Cpin_3291 hypothetical protein + Term 100663 - 100689 -0.6 63 35 Op 2 . + CDS 100711 - 103023 1412 ## Cpin_6153 hypothetical protein + Term 103045 - 103085 -0.3 64 36 Tu 1 . 
- CDS 103441 - 105006 1299 ## COG2356 Endonuclease I - Prom 105026 - 105085 2.5 Predicted protein(s) >gi|283510512|gb|ACQH01000107.1| GENE 1 54 - 296 109 80 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAISFVGRCVLLTYNKLCKMGNIGSEKRVEVFRFSKLMRCGGRPLSCCLVHICFTTFVFL LYFEPYFVSKIKVRVIVLSV >gi|283510512|gb|ACQH01000107.1| GENE 2 533 - 3097 2864 854 aa, chain + ## HITS:1 COG:AF1664 KEGG:ns NR:ns ## COG: AF1664 COG0209 # Protein_GI_number: 11499254 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, alpha subunit # Organism: Archaeoglobus fulgidus # 26 680 7 596 752 258 30.0 3e-68 MENNKTYSYEEAFAAALDYFAGDELAARVWVNKYAMKDSFGNIYEKSPEDMHWRIANEIA RIEGKYPNPLSAQEVYELLDHFRYIVPAGSPMTGIGNGYQVASLSNCFVVGLEGDADSYG AILRIDEEQVQLMKRRGGVGHDLSHIRPKGSPVNNSALTSTGLVPFMERYSNSTREVAQD GRRGALMLSVSIKHPDSEAFIDAKMTEGKVTGANVSVKIDDSFMQAAVDDKPYVQQFPIE GDNPEVKKEISAKTLWEKIVHNAWQSAEPGVLFWDTILRESIPDCYADLGFRTVSTNPCG EIPLCPYDSCRLLCVNLFSYVVNPFTKEAYFDFDKFAKHVAVAQRIMDDIVDLELEKIDL IMEKIKDDPQNDEVKGAEYHLWEKIKRKSSMGRRTGVGITAEGDMIAALGLRYGTQEATD VSVSVHKRLALAAYRSSVVMAKERGAFEIFDAKREAANPFILRLKEADQSLYDDMVAYGR RNIACLTIAPTGTTSLMTQTTSGIEPVFMPVYKRRRKVNPNDTDVHVDFVDEVGDSFEEY IVYHKKFMDWMKANGFDTDKRYTQEEIDAIVEQSPYYKATANDVDWLMKVRMQGEIQKWV DHSISVTVNLPNDVDEALVNRLYVEAWRSGCKGCTIYRDGSRSGVMISVSKKDKKKEDKQ EEQPIVPCKQPEVTEVRPKELACDVVRFQNNKEKWVAFVGLLNGYPYEIFTGLQDDDEGI ALPKSVTKGKIIKNIGPDGRSRYDFQFENKRGYKTTVEGLSEKFNPEYWNYAKLISGVLR YRMPIDHVIKLVGSLQLKSESINTWKIGVERALKKYITDGTEATGMKCPSCGQESLVYQE GCLICKNCGASRCG >gi|283510512|gb|ACQH01000107.1| GENE 3 3454 - 3825 349 123 aa, chain + ## HITS:1 COG:lin0257 KEGG:ns NR:ns ## COG: lin0257 COG1539 # Protein_GI_number: 16799334 # Func_class: H Coenzyme transport and metabolism # Function: Dihydroneopterin aldolase # Organism: Listeria innocua # 5 116 4 116 124 73 35.0 9e-14 MSHYIHLRNLHFHAFHGVMPQERTAGNDYVVNLRAEYPIEQAMQSDDVSHTLNYAEVYAV VETEMQKPSCLLENVAYRIAQRLFRDFPLIRSIRLEVAKQNPPMGTCGGQAAVEIQLNND KSQ >gi|283510512|gb|ACQH01000107.1| GENE 4 4142 - 5524 1385 460 aa, chain + ## 
HITS:1 COG:aq_1681 KEGG:ns NR:ns ## COG: aq_1681 COG0860 # Protein_GI_number: 15606778 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Aquifex aeolicus # 24 251 130 355 359 118 35.0 3e-26 MNKKIAALFALLLLLPFAANAAGKKFVLVIDPGHGGKDAGALGAFSKEKDINLSVAMAFG RKVQHNCPDVKVIYTRTTDVFIGLKERADIANKNKADLFISVHTNALPGGKQAYGMETYT LGMHRAGDNLDVAKRENAVILIEKDYKQSYQGFNPNSAESYIMFEFMQDRNMSNSVDLAK MVQRETCSAANRPDKGVHQAGFLVLRETSMPSCLIELGFITTPDEERLLNDKARVENIAT GIYRAFVNYKNKYYSGVVVPYKAAPVPTPTVPDVVPDVYKSNDVPATQPKPEPQVTRRTE RTLSEPIKRDSEVGRSVPQRAQAANQDVTERPTVSAAETRSNEETRLPSITSGDAEEKII SFGETAIVPIFKVQIIASPTRIRRNSPQFKGLKDVEYFKENDMVKYTYGASADYNEINRL RKSIAKKFPDAFVIAFKDGERMDVNKAIREFSVNRQKGNK >gi|283510512|gb|ACQH01000107.1| GENE 5 5531 - 6433 1139 300 aa, chain + ## HITS:1 COG:no KEGG:PRU_1284 NR:ns ## KEGG: PRU_1284 # Name: not_defined # Def: Mce-like protein # Organism: P.ruminicola # Pathway: not_defined # 1 300 1 300 300 326 55.0 7e-88 MKIFTREVKIALVAIAGVVILFFGMNFLKGLTLFSNDTGYKVMFKDITGLSNSTPIYANG FAVGVVTGIQYNYDQKGDIVVTIDLDKSMRVPQGSTAEIESDFMGNVKLNLILAPNSGQF LSPGDVIQGNVSKGALAQVADMVPSIERLLPKLDSILVSVNTILANPAINQSLDNVQGLT ANLRTTSQQLNGLVAGLNRSVPAMVGKANKVLDNTNTLTTNLAALDLDGTMQQVNRTLAN VEEMTQKLNSNEGSLGLLMRDPTLYNNLNATLMSADSLLINVRQHPKRYVHFSLFGRKDK >gi|283510512|gb|ACQH01000107.1| GENE 6 6880 - 8289 1477 469 aa, chain + ## HITS:1 COG:BS_dnaA KEGG:ns NR:ns ## COG: BS_dnaA COG0593 # Protein_GI_number: 16077069 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication initiation # Organism: Bacillus subtilis # 9 466 8 446 446 267 32.0 3e-71 MDVTPDKHWEAALALIAQNVSPQQFDTWFKPIVFESFNNETKLLVLRVPSSFVYEYIEEH YIDLLSRVLFSAFGQRIRLTYRVVVDSEHKKTQDIEADPAEAHATQAAKTWGAKQKAVEE PAKEQELDPQLNPHQTFSNYIEGESNKLSRSVGLSVAEHPNTNQFNPMFIYGPSGCGKTH LVNAIGVQTKQLYPQKRVLYISAHLFQVQFVNAVLKNATNDFIKFYQTIDMLIVDDVQTW ASADKTQETFFHIFNHLFRNGKRIILASDRPPVELNEMSDRLITRFSCGIIAELEKPNVQ LCVDILNAKIRRDGLKIPAEVTQFIAETANGSVRDLEGVINSLMAYSVVYNSDIDMRLAE 
RVIKRAVKVDDTPLTVDDILEKVSNHFNVSVSAVSSKSRKRDLVVPRQVSMYLAQKYTKM PASRIGKLVGGRDHSTVLHSCAQVEARLKTDSLFAAEVESIATSFKLKG >gi|283510512|gb|ACQH01000107.1| GENE 7 8452 - 9051 286 199 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929564|ref|ZP_06423408.1| ## NR: gi|288929564|ref|ZP_06423408.1| hypothetical protein HMPREF0670_02302 [Prevotella sp. oral taxon 317 str. F0108] # 1 199 1 199 199 409 100.0 1e-113 MKTYIKTIWTKAMLCICTCTLLSCSDKDGPHVVDPFRVSIRFESPKGTNIADSLNIIDNT KPETWNAKKEDIQVRVYRGSDLKPLEINRTRWAYIKENTIPNFLGGTTLNVSATDLRIWK EDFWGRETYIFEFRSKKLFGNDEPHTIKWFVTCKGSMQYNVYKCEIDNEDMPLSKSHLLM IAGCADLYAEVTLRIKPTK >gi|283510512|gb|ACQH01000107.1| GENE 8 10867 - 12201 1479 444 aa, chain - ## HITS:1 COG:NMA2100 KEGG:ns NR:ns ## COG: NMA2100 COG0477 # Protein_GI_number: 15794975 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Neisseria meningitidis Z2491 # 9 441 14 446 451 405 51.0 1e-113 MNNQLKKKPDMGFWKLWNLSFGFFGVQIAYALQSANISRIFATLGADPHELSYFWILPPL MGILVQPIVGTLSDKTWTRFGRRIPYLFVGATVAVLVMCLLPNAGSLGMTAGMAMIFGLT ALMFLDTSINMAMQPFKMLVGDMVNEKQKALAYSMQSFLCNAGSVVGFIFPFFFTFIGIS NIAAAGVVPDSVIWSFYVGAAILILCVIYTTLKVKEWNPKEYAEYNQLEKPLDKEKTNVF ALLRNAPKTFWTVGLVQFFCWAAFMYMWTYTTGSIADTVWGVDMQLHSATSTPQYQEAGN WVGILFAVQAIGSVLWAVVLPNIRSRKMAYALSLLLGGIGFLLVPYIANKYLMFVPFILI GCAWAATLAMPFTFVTNALEGKGHMGAYLGLFNGTICVPQIIAAALGGTILHLLGSVQSH MMMAAGAMLIIGTFCVSIISEKEK >gi|283510512|gb|ACQH01000107.1| GENE 9 12303 - 13457 1354 384 aa, chain - ## HITS:1 COG:VC2677 KEGG:ns NR:ns ## COG: VC2677 COG1609 # Protein_GI_number: 15642672 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Vibrio cholerae # 8 342 3 331 335 169 32.0 7e-42 MGEIASITMKDIARELNVSVATVSRALKNSPRISRQQRERIQAYANEHNFSPNLIAESLR NSRVRPMKIIGVIVPELTHFYFSSVLAGIEGAAEKQDYRIMVAQSNEQHEREVSICKSFF ENKVCGIIVSQAKDTTQYAHFQHLIDRGVPLVFYDRVCTGVNASRVVVDDYAGAYAAVTH 
LVETGCRRVACFNASMKLEISKNRYNGYCDALLKHGIKVDPSLVFQCDNRSDAERIAPEV LDRHDRPDAFFAVNDDTAIGILYSAKRRGFRIPEDISVCGFTNGQRAVACDPALTTVDQH GFQVGEAAAEILMRHVEGHIEKGKTERRVVRTRLVVRDSTRPIATSITEQLARTLPKETN TGHDKGKPVEQQTQSQKAKAQQTA >gi|283510512|gb|ACQH01000107.1| GENE 10 14138 - 17155 3204 1005 aa, chain + ## HITS:1 COG:no KEGG:PRU_2681 NR:ns ## KEGG: PRU_2681 # Name: not_defined # Def: outer membrane protein SusC # Organism: P.ruminicola # Pathway: not_defined # 1 1005 2 1011 1011 1233 63.0 0 MKQLINAMPQRIAMLVCGLILSVCAFAQQIVVKGHVKDATGEPITGATVRVAGQEGGVVT DIDGNFSINANAGAEVTVSYIGYSDAKAPASANMVIVMKDDAAKSLNEVVVIGYGMVKKS DLTGSVTALKPDSKNKGLVVNPQDMLSGKVAGVNITSDGGAPGAGSSIRIRGGSSLNASN NPLIVIDGIAMDQNGVQGLSNPLSMVNPQDIESFNVLKDASATAIYGSRGSNGVIIITTK KGRKGMAPQVSYNGSVTIGVKSNSLKLMNGDEYRDFITKKYGAGSDAAKALGTANTDWQK EIYRTAVSHDHNVAVQGAIKNLPYRVSVGYTGQQGILKTSDFQRVTAALNLNPSLFDDHL TMNLNAKGMFAETNYPNYAAINDAFRFDPTQNIYDNASPQSNNFAGYFQWRSPGGFLDDK NYDYANNNLAPTNPVAALALHYERAYSRSFVGSADIDYKVHGFEDLRLHLTLGADVSEGT QNKRVPPTAPEAFYYGSYGSANKLKRNLSLSMYAQYFKDFNEKHHFDIMAGYEWQHFWRN ETSDYTGYYGSGHKSKAGQETPHVPYHSRTENYLVSFFGRANYTLLNRYFFTATVRDDGS SRFKKHWAWFPSFAFAWKAKEEGFLKDVNAVSDLKLRLGWGKTGQQEGIGDYNYFAIYNI NTGTQSFYPILEDGRLAMPQAYDPNLTWETTTTFNVGLDWGILNQRLSGSIDWYYRKTTD LLNFAPAAAGTNFDNRVNTNIGALRNTGVEATLSWKAIQTKDWYWTLDYNVTFNQNRITE LVGANSKPIETGASIHSGTGRKILAYAVDRAATAFYVYQQAYDANGKPIEGAVVDRNQDG IINDDDRYFYKQAAPPVTMGLASRLEYKNWDLGMSFRASIGNYVYNDVEASRSDMSQLWA PSGFLEQRPKSGLDLNWQSSKWTQSDYFVHNASFLRCDNITLGYSFNNLMKAGSWHGLSG RVYATASNVFTITKYKGLDPEVFSGLDYELYPRPTSFILGLSLNF >gi|283510512|gb|ACQH01000107.1| GENE 11 17178 - 18767 1594 529 aa, chain + ## HITS:1 COG:no KEGG:PRU_2682 NR:ns ## KEGG: PRU_2682 # Name: not_defined # Def: outer membrane protein SusD # Organism: P.ruminicola # Pathway: not_defined # 18 529 1 520 520 456 47.0 1e-127 MKRYIKNILPAAAMLLTVGLSSCIGDLDVTPIDPNKRTELDPVALFNQCYANLSIEGLEG PGKSIVTAEDPGTTGLVRQYFNTNELTTDEAICSWNDPGVATMNTNEQGSSNAFLAIYYN RLYAGISVCNHYLDVAANVDATRTAEVRFLRALQFYLAMDAFGNIALPLTVSAEKPVQRT 
RAEVYAWLEKELLEIEPLLSEPKAKKSSDAGYGRADKAAAWMLLTRLYLNAEVYSGTPQW QKAAEYAEKVIKSDYKLFTAGTTNYSAYQMLFMGDNGESGASVEAVLPLLQQGNKTAAYG NTTFLISSTYEKDMPGHGISGVWAGNRARRELVQKFFPGAVPSGNTATILNAAGDKRALF FAVDRTLEISNDDAIKDFKKGYSVVKFTGIKSDGSPTTDTGYADTDFFFFRVAEAYLAYA EATARLNGGNATTQGIAYLNELRKRSGAAELRGFSLREILNERSREFYFEGQRRTDLIRY GYFGGDNNYMWSWKGGAPNGRTFDVRRNIFPLPVSDLSVNPNLKQNPGY >gi|283510512|gb|ACQH01000107.1| GENE 12 18814 - 19938 892 374 aa, chain + ## HITS:1 COG:no KEGG:BT_3699 NR:ns ## KEGG: BT_3699 # Name: susF # Def: outer membrane protein SusF # Organism: B.thetaiotaomicron # Pathway: not_defined # 170 371 276 482 485 167 45.0 9e-40 MLLGGIFVMAACSDDRDDNPTIKVPTAFKMETPAFVSSQVNLNNAAQFNLAWEKPDFGYF AKVNYTLEVSNTGKFTASLADEEADLDGKVVADYAEVPTIFPLNKATITNDDLALAVVQV GHWEKPEQLPATQKVYVRVKATLPNVASLHSNVVEFTVIPKYVKLPPKITDCYLTGSEYG WGGTWKPLTPVKDTTGKQGKGVPTFWMMAYFKDGEEFKFAPKPAWENDFGAANATITDHA GASPSDAGGNIKIGKAGWYLVVVTNNGESRTVSFEKPEVYLQGPTIGNWDCKKENMFTAP TTATGEFVSPAFVATDEVRMCVKLKDFNWWKSEFIVNKAGSIVYRGEGNDPEKITVTAGQ RCYLNFSTGKGSYK >gi|283510512|gb|ACQH01000107.1| GENE 13 20878 - 21444 776 188 aa, chain + ## HITS:1 COG:MA2664 KEGG:ns NR:ns ## COG: MA2664 COG1704 # Protein_GI_number: 20091487 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 22 188 20 182 182 153 49.0 2e-37 MSATAIVIIVVAVLLLVWGVALYNNLVKLRNNRENAFANVDVQLKMRHDLVPQLVGTVKG YAAHEQSVFQRVTEARAAAMGATSINDKIAAENMLTSALSGLKVSMEAYPELKANANFMQ LQTELADIENKLAATRRFFNSATRELNNAVQTFPSVIFAKMFGFQKEPMFEIPDSQRAEV ERAPEVKF >gi|283510512|gb|ACQH01000107.1| GENE 14 21466 - 22479 1019 337 aa, chain + ## HITS:1 COG:BMEI0236 KEGG:ns NR:ns ## COG: BMEI0236 COG0501 # Protein_GI_number: 17986520 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Brucella melitensis # 82 336 54 289 337 155 35.0 1e-37 MKYVGMQTQISRNNTLSLLLLLLFPIIILGMVWVFLAVVNYFGNGYYDVYGEFHRQLDLH 
AVLTTFVEVLPWVVGIVGVWFLIAYFFNASMIQHVVQAKPLTRKDNARVYNIVENLCMAC NMDMPAIYVVDDNQLNAFASGINKRTYAVTVTTGLLDMLDDDELAGVLAHELTHIRNRDT RLLITSIVFVGIVSTVMMVAARLLYGFMISGGSHSRSSSNKDNRNGMAATFIVLVVVVIC AAIAYFFTLLTRFAISRKREYMADAGGAELCGNPLALASALRKISRNPGLDSVNREDVAQ LFIVHPDEMDLGLLGFVEGLFSTHPDPEKRIAILEQF >gi|283510512|gb|ACQH01000107.1| GENE 15 24679 - 25464 1091 261 aa, chain + ## HITS:1 COG:PM0875 KEGG:ns NR:ns ## COG: PM0875 COG0363 # Protein_GI_number: 15602740 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Pasteurella multocida # 1 259 1 259 267 354 62.0 1e-97 MRLIIKADYDGLSNWAAEHVIESINKAAPTKERPFVLGLPTGGTPVGMYQALVKACKEGR VSFKNVVTFNMDEYVGLPVEHPESYHSFMYRNLFDHIDCPKENVHILNGNAEDWEAECRQ YEEAIKAAGGIDLFIGGVGVDGHLAFNEPGSSLTSRTRRMPLTHDTRVVNSRFFDNDFEK VPRFSLTVGVGTVMDARQVMVLINGHGKAGALRDAVEGPVTQMRTVSALQMHEDAIIVCD EAATDELKVGTYKYFKDIERA >gi|283510512|gb|ACQH01000107.1| GENE 16 25469 - 27466 2186 665 aa, chain + ## HITS:1 COG:CAC0187 KEGG:ns NR:ns ## COG: CAC0187 COG0363 # Protein_GI_number: 15893480 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Clostridium acetobutylicum # 45 282 10 241 241 155 36.0 3e-37 MRLNLSSEIVLNKIPEEYYNPKNAVERSEITRCEKINTDIFPKIEEGAEHIARAIEAEIK FKNDQGKLCVLGLGTGNSLTPVFEELIKRHKEHGLSFARVAVFNAYEYFPLNEGNPHSSI SQLRERFLDHVDIAPENVFTLDGTVPQEMVNSSCKQYEQRIADMGGIDVMLLGVGRMGNI ATNEPGSVLTSGSRLILIDAVSREEMTLSFGSKEAVPPCSITMGISTILGARKIFLTAWG EDKADIIQKAVEEKITDALPASFLQTHKDANVVIDLPAASRLTRIVHPWLVTSCQWTDKL VRSALVWLCQLTGKPILKLTNKDYNENGLSELLALYGSAYNANIKIFNDLQHTITGWPGG KPNADDTYRPERATPYPKRVIIFSPHPDDDVISMGGTLRRLVEQGHEVHVAYQTSGNIAV GDEEVTRFMHFINGFNQLFGGDADTVIKDKYREIKAFLKEKKDSIGDTRDILTIKGLIRR GEARTASTYNDIPLERVHFLDLPFYESGKIEKLPMGEADVKVVREFIHGIQPHQIYVAGD LADPHGTHKKCTDAVFAAIDEELKAGEKWLDDCRVWMYRGAWAEWEIENIEMCVPMSPEE LRAKRNSILKHQSQMESAPFLGNDERLFWQRSEDRNRGTAALYDKLGLACYEAMEAFVEY KFKEM 
>gi|283510512|gb|ACQH01000107.1| GENE 17 27663 - 28790 1376 375 aa, chain + ## HITS:1 COG:no KEGG:BF0928 NR:ns ## KEGG: BF0928 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 17 375 2 359 361 511 67.0 1e-143 MGILPNLNDHIIIMEEKNLLQLVSHFDVNGTVKSVKPLGNGLINDTYKVTMNEPNAPCYV LQRINNAIFKDVELLQRNIETVTAHLRKKLEEQGATDIDRRVLTFIKAETGKTYWRETDD TYWRMMLFIPDAYTYETVNEEYSHAAGLAFGQFEAALVDLPAQLGETIPDFHNMELRARQ LKDAVQKDPVKRLSVVHDIVEELNNNMEEMCKAERLYREGKLPKRICHCDTKVNNMLFDA DGNILCVIDLDTVMPSFVFSDYGDFLRTGANYTAEDDPNLANVGFNMPIFKAFTKGYLTS AGAFLTPIERENLPFAAKLFPFMQCVRFLTDYINGDEYYKIKYPEHNLDRAKNQLALFNS VCSHEEEMQRFIAEC >gi|283510512|gb|ACQH01000107.1| GENE 18 28884 - 29540 460 218 aa, chain + ## HITS:1 COG:no KEGG:BF0187 NR:ns ## KEGG: BF0187 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 6 216 3 212 213 266 56.0 3e-70 MNNYLVKSISASDVKAENVPSLLDKAEIPFETIDNVNWAEAFPCKPNVAFRMAYAGDKIL LNFQVEEDSVRAVAPEDNGRVWEDSCCEFFVSPTDDGTYYNVECNCAGTLLVGFGPGREG RQHLPADVLQKVGRWSSLGRQPFDERVGKCHWQMALVIPTSLFMHHPNLQLQGNEMRANF YKCGDKMQKPHFLSWNPIALPKPDFHCPPFFGTISFAR >gi|283510512|gb|ACQH01000107.1| GENE 19 29570 - 32272 2730 900 aa, chain + ## HITS:1 COG:TM1238 KEGG:ns NR:ns ## COG: TM1238 COG0210 # Protein_GI_number: 15643994 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Thermotoga maritima # 11 468 20 427 648 102 23.0 3e-21 MSNTTQAWQPDESQQRVINLQQGIHLVLAPPGCGKTQILTERVRLARKNGVNYADMLCLT FTNRAARGMVERIKANIDDVAAAEVYVGNVHRFCSRFLYDNGLIAAETSVIDDDDALSIL ARYTEEDEMKVAESFKRRHDYFEIIFLSHLMHQIAQAHPKNLRLHADCLSAEDVHAMRKL CELNRIDFTPQAMTDIYLHASTYETMLQGAQIDYAALCLVVALLRKMKFAKQYEDYKREN KLVDFEDLLLLTYDALVADVTGVYRRYAWLQVDEVQDLNPLQLAIVDALTAKENPTVMFL GDEQQAIFSFMGAKLDTLNQLRRRCGAHVHHLSVNHRSPKYLLNVFNTYAKAMLNIDERL LPEADNSDMGTGNELQVVSSNVLDTEYKDVARLAGNLQTQSPNETTAIIVNANRDADDLS RALHAQGLSHFKVSGQDLFASPEVKLLFAHLNILANAHNFIAWARLLKGLRVFEGNAAAR NFVQALLRCAMLPTDLLSADKPSYIERFAHCFDTEEIVVFDTETTGLNVFEDDIVQIAAV 
KMRAGCVVEGAAFNVFIKTERPIPAMLGDIPNPIVAELQHNTCLPAAQALQSFMQYVGNS MLLAHNADFDYNILRFNLQRYCPSIDLHATHPTYFDSLKIIRLLQPGLKQYKLKALLEVL HLEGTNSHLADEDVQATVSLVGYCRQKAAEMIPAQQQFLARQRVQHCAAVFRQRYAELFN AHYARLYCVPIEDSKPALVQVINDFYTFLLEENHIQPVPNIAYVEAYLQGNMLAGDEAQA LVQQLQAHIMELNTLKEADLCGSGFIAERIYVTTIHKAKGLEFDNVLIFDAVDGRYPNFY TRTDERQTSEDARKFYVAMTRAKRRLFVSWALSREAYNGTSRPCELTPFMRPVLHLFAGG >gi|283510512|gb|ACQH01000107.1| GENE 20 32337 - 34469 1891 710 aa, chain + ## HITS:1 COG:CAC0746 KEGG:ns NR:ns ## COG: CAC0746 COG4412 # Protein_GI_number: 15894033 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 106 345 139 402 781 99 29.0 3e-20 MRAKLLILAFLALVIVGFAIPTMPGIWRTAKLADGREVQLEQVGDENLHYWRDVRGVCYN VDDEGVATPIANIGAVRAKAATRIAQNNARRAAAATRTVGISRRHVGKRKGLIVLVNFTD VRFQPAHTKSLLSRMANEEGFTYKDGHRGSLHDYFKDQSNGLFHLTFDVVGPIQLRHNMA YYGKDTLDWRGSRSDVRAGQMVAEACLAIKDSVNFRDYDWDGDNVVDQVLVIYAGYGQAS GGGSNTIWPHEWALQYSDYGRVLRIDGHPLINTYACSNELNPNSTSLGGFGTACHEFTHC LGLPDMYDTSGKESLGMGFYDLMSAGNYNGEGYVPAGYTSYEKMFVGWLTPQELTRDTTI TGMKPLSEHGEAFIIRNDANADEYYLLENRQRSGWDSHLPGEGLLVLHVDFDQMDWLTNR VNANPQHQRCTLVPADNRQRKILSAQDSIPYPQAGNDSLTATSLPAASVFTSGLDESYRL NKSVLGIKINGEGNASFAFRIDNLRPKGLKGDTLFYESFDSCDGKGGNDGRWGGYIGAGK FKPDHQDWRFSVPEATSAGSRCALIGSSTKYGIVESTPLFYVEDGSTISFKAAAWNNEEG SLTVLSLNPKVQLSETRLKLPNRSWQTFTLTVKGSGLLRLSMRNGQRRFFLDELLVKRPA TTSITNIPTARPAPRATSDRIYSLDGMYLGTNIDALSKGVYIVDGKKVVK >gi|283510512|gb|ACQH01000107.1| GENE 21 34660 - 35808 685 382 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929580|ref|ZP_06423424.1| ## NR: gi|288929580|ref|ZP_06423424.1| hypothetical protein HMPREF0670_02318 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 382 1 382 382 784 100.0 0 MMKKAILLAFWGIFIAWGEVLAQSKTVICDAVSKAPIPYVNVFKAEDGKFRGTTSDEKGV AIVNFPFRQLSVSHVSYASCTLTVLPDTVFLTPKERLLGEIVVGNAEPQWIRSFLEKFVL KKKSTYRTVRKNFNYSYQTRNTSDSSGYWFENSGVICVPAATEKRLFTIKPTIGYIHFKD KTAGCDFANMKRMVYHDFVDELDGKFIKHHLFQVNDAFSSPNKNIVQLYFKSVKYGSEDK GYITVDTAQCQVLSVVRNMGLSYNVDNNTNSFTRSAFGLLMGWKYAKWVVQQSSTYHVSD AGCNLSSCRYKAYILAESKKGKYAGTKFDSSEAEITFTPTGDGIAPPFLTLPEPWYMKLI ITRKERQAEEQLQGIEKEYVVY >gi|283510512|gb|ACQH01000107.1| GENE 22 37329 - 37541 293 70 aa, chain + ## HITS:1 COG:BH3592 KEGG:ns NR:ns ## COG: BH3592 COG1983 # Protein_GI_number: 15616154 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Putative stress-responsive transcriptional regulator # Organism: Bacillus halodurans # 7 63 4 62 65 57 49.0 9e-09 MNTNKGIYRSRDRKIAGVCGGIAKYFDWDVTLVRLVYALATFTTAFSGVLVYLVAWAVVP SEEFDPRQQE >gi|283510512|gb|ACQH01000107.1| GENE 23 37710 - 40028 1545 772 aa, chain - ## HITS:1 COG:no KEGG:BVU_0280 NR:ns ## KEGG: BVU_0280 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 13 772 15 769 769 393 34.0 1e-107 MRRLTLVMLLAMLFVRLCAQQVSGVVTDANNGGLGLANIVVLSADSTFLSGTTSDDNGNF TLNQPQQGNIIKISLIGYETKFLQYAGEERLTVQLTEKAFQLGEAVIHGQLPQTILKGEG MTTIVAGSVLEKTTSIDQLLDFIPHVSMQNGKVEVLGRGTPEIYINGKRMMNQMELERLN PENVKSVEVITNPGARYNKSVTSVLRITTKQIAGEGFGFDSKTVGSVNEQKRTSGFETLK LNWRKGGWDVNAHVYGEHTHTQDNTSVRLMTFLNDTWQQQFDIDQEHGTTNMYLSLASSY SIDANTSIGASVDYDRYASLFAAGNSHSENFRNGEKHDEMDGNYSLNGNKSEVSTNAYFV GKIGKLSIDFNSDYYWKKESRPMYNDERYKEVGQTETHRLVNSRQNNASRLFANKLVLSF PLAKGDFSVGGEYSHSRRKNLYAVLPTGVVDDENNRIKENIASIFMDYSKQFGKVGLQVG LRYERVGFNYYERDVLVHAQSKRYADWFPSLTLSFPVHAVQMQLSYSSDINRPTFQKLRS GVQFDNRFTYESGNPFLLPAISRNLSYALSWKWISLSAMFARVSNQACYIMQPYKGDPTT TLLRPENMPTYNKVQASLSLSPSWGVWHPSLETMLYKQWFKMQTHQGGSINHPLVSFNLT NTFDTKWFTVSFIMRTRTEGNTENMFLRKGSFGSDLSLYKSCLGGKLLLQLYASDIFGTE DQHFVLYSGKLRTSYLSKFSSSTIKLTLRYRFNTSKSKYQGTGAGKSQKNRM >gi|283510512|gb|ACQH01000107.1| GENE 24 40403 - 40912 263 169 aa, chain + ## HITS:1 COG:no 
KEGG:BT_3896 NR:ns ## KEGG: BT_3896 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 68 158 69 161 285 90 43.0 2e-17 MCATIFYSAFRCKLLLVVAMCLVCLNADAQGRKDTAWWPPYSLFVKPKFRTSPAHRLRVF RDEDYAKPKFPGGEEALEAFIDKEKVYPEEVQDVKGRVLLSFFVEEDGQITDIRLLRCVH PALDKAAVNVVKKMPAWIPGKRKGKVARLKVCMGIPFYVIKKGGAETER >gi|283510512|gb|ACQH01000107.1| GENE 25 41018 - 41533 344 171 aa, chain + ## HITS:1 COG:no KEGG:BDI_3213 NR:ns ## KEGG: BDI_3213 # Name: not_defined # Def: outer membrane protein TonB # Organism: P.distasonis # Pathway: not_defined # 16 157 13 161 163 104 37.0 1e-21 MRTMSIHYVCRCRLWLVMAMCLACLSANAQGGKGKNKKRNQPAQVKSATTKAEQPQRVMC EFFEILPQFPGGDKALMEFLKKEVRQPDEAKDVKGKVVISVIVETSGELTHFKVARSVHP ALDKEALRVVKMMPKWIPGSIYGKLVAVKYNLVVAFKGEEKGGIEAERRDK >gi|283510512|gb|ACQH01000107.1| GENE 26 41563 - 42951 1080 462 aa, chain - ## HITS:1 COG:no KEGG:BF1115 NR:ns ## KEGG: BF1115 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 462 22 499 504 382 42.0 1e-104 MFCLSGIVLNHPTLFSSVNVGRNLLPKEYSYDKWNNGLFRGTIKWRNKVLLYGNAGVWLT DSTAKAFMDFNKGMPLGVDNRNIRGMAIVPTGEVFAAGQYGLFRLDGKKTWQAVNLDLLH GDRISDIATRGDTLVVTTRSQVFIARAPYRSFQRIELASPDGGAPRFSLFRTLWWLHSGE LFGLVGKLVVDVVALVLIFLCLSGMVYWLLPHIGKKAHTAARQWLFSWHDRIGRLTIALT LLVCITGWFLRPPLLLLVANVKVPAILLTKTSSNNPWYDKLRSLRWDERMHDWLLYTSDG FYSLATLTEQPNKLDVQPPVSVMGLNVQKTVDGDYWLLGSFSGLYIWQRSTGMIIDYFTE QPAIAVQGPPVGANAVAGFSSDFGPNEYVVNYSAGTDFAPMPQWMSSLPMSLRNVALEIH TGRMYTFLGSMSVVYIFIVGMAILWCLWTGWKIRVRIKKKEE >gi|283510512|gb|ACQH01000107.1| GENE 27 43562 - 45187 1118 541 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929586|ref|ZP_06423430.1| ## NR: gi|288929586|ref|ZP_06423430.1| hypothetical protein HMPREF0670_02324 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 541 1 541 541 1100 100.0 0 MKKILALVILCCYHATSLHAQQTISLPDVDAVSFDDSSNDLKVYTKIGPNQYTRQSLQVE EKTRHDKSPNIMLIRWNINPHKDTLSREDIDKPQQRLVDAQELHAMPLDEYINKTFSIGD CEKQKTQTNSPTRVLKFLRQDFATIQDLVYGDGEHDDYKRTLSINLPDSAVIWRSTRSSY FVPLILEIKGGKKIFILYKEGIRYVILPSPKGGLMCSFYNPLYSAKLTMAGEKYDGAKPY EFQYFNRYSFYDVEKTETGKCRLINQLGENVLKEEYDWISGSYRFIIAKRGNHIDVFNLY LNKLNVGKAKVAREIETCMVGCIEILNEKGAFYYDEMGQKIKKPIRRWGRIKCGTVIDWT HNILKNKGKHLVQIYTRDPGDPDIDERYWLSDLLPSDAVTFLDGRKSFSYDENSDDPEDK AYPKMLRVCRKGKYGIMVYEYYREKSVLPEYNYESLGNRDKTIMIYPPAKVVGKTIFPIN NDFLIMRSDGYIYFYKDKKVGLYPRDKVPVYDKIMKVTDSFYHIVKKGIPGWLDIKTNRE Y >gi|283510512|gb|ACQH01000107.1| GENE 28 45242 - 45463 71 73 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPNALYTISQKCLVHVNHTPSSCNKFRQSPLSLHAQNEYTQENKASSFVVDVHIPPSTNT THTLISFILLLNN >gi|283510512|gb|ACQH01000107.1| GENE 29 45510 - 47141 900 543 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929587|ref|ZP_06423431.1| ## NR: gi|288929587|ref|ZP_06423431.1| hypothetical protein HMPREF0670_02325 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 543 1 543 543 1121 100.0 0 MKKLFALVILCSYHATSLYAQQTISLPNADAISFDCHGNNFDVYTKIGPNQYTRQSLQIG EQTRQDKSADITLIEPKINPHKDTVSREDIDKLPKKFIEAKDLHAIDWDSYLGMSFTMND SEARTSETDNSTYVLTFLRQDLAIILPLACRNGRGANNDCPLSSHIPHSVVTPRSTRSES FVPVILKLKEGKKVFLLYNRGVEYMMLPSSGGWQMFEFDKNLNPGEPTIAAKECNDVKSY NLRYFNKKTFYSVQKTETGKYRLFNSFGQDVLKEEYDSIINHYCFTIAKRGNDIDVFNLY NKKLNIGKAKAVKKVDACAAGCIEVLNEEGAYYYDEMGNRLKHPIMGSISICKMVSNWQH NILKKQGNYQMQIYTDEPGSPTEDECYWLSDLLPSDAVTFLDGEKSFFQSENSSLCGDID VQPEWIRVGRKGKFGIIAYEYKQKKGVKPKLVTKRIDDYQQKIKVYPPVKIVGKTVFPIN NDSIIMRSDGLIYFYKNKKVGLFPRDKAPIYDEIKKVTGSFYHIVKNGISGWLDIKTDEE YWE >gi|283510512|gb|ACQH01000107.1| GENE 30 47316 - 47699 504 127 aa, chain - ## HITS:1 COG:FN1050 KEGG:ns NR:ns ## COG: FN1050 COG0346 # Protein_GI_number: 19704385 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Fusobacterium nucleatum # 1 127 1 127 127 172 65.0 1e-43 MKIDHIAIYVCDLEGARNFFVTYFGAKSNEGYHNQRTGFRSFFLSFDDNTRLELMTKPEM ADREKEPNRTGFAHLSFSVGSKERVDALTAQLKDNGFEVLDGPRTTGDGYYESCIVGFED NLIEITI >gi|283510512|gb|ACQH01000107.1| GENE 31 47936 - 50080 2270 714 aa, chain - ## HITS:1 COG:no KEGG:PRU_2265 NR:ns ## KEGG: PRU_2265 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 77 710 44 661 667 214 26.0 1e-53 MKRFITLSLLACISLFTSKTWACADALEYHEGNFFDVYYHDYDFVSGEKLARFWAGYTGD RTPLSKDWSEGYGYAWYYYFENSCARLFAAAKKKHDEEMLLYLRQTLKYNKICEELKDSW TYPTAEQLAQRRKTLLAMQRTAKAYGGQRLKAQFSLLYMRTNLVLKDYRANQDFWVLTAS KFPESVFKDMMHSLYANAILNLGQWRKACDIYINQGDWESLEWAMRNYRNPAGIKRIYKE DPNSPTLAYLLQYYVNGGYGWDCNYENSFKADPIEPDTKAEAEKFINFVQTDVLHNESVR NKAMWKAAQAMVLFELTNYKAAQKYAKEAQQLSGTKRAKDCARCIALLASTRTEPLDEAY SAYVVGEIKWLLGRAAERKPSNDVSRYDYYHATALENVVYNHLIPRYQEANDNNQTLALL GMMDAVMGIDLQSEPEGHADFPSSREYKVGLDTLTSKEILAYYQYLNSSPADPLQRFALE KASKNADFFNSYIGKAYLREGNFAEAIPYLEKVSLDYVNTCYYAEYVAGTDYTLDRWFTR RSEPSESEKNKKLTVHPRLKFCRDMLQLQSQYRLARDKEVQAEIAYRLASHYFQASIHGD CWYLTRDSYSRYDEPENYEMNFPEQALTLLKQGANSRKVELKLPSLFAMAYVSYAIATLP 
RLLGPGSWFEDLENETDINTNYTKLAKYVKSSKTEAAYIIHCDILREFMETKMK >gi|283510512|gb|ACQH01000107.1| GENE 32 50058 - 51014 700 318 aa, chain - ## HITS:1 COG:no KEGG:PRU_2264 NR:ns ## KEGG: PRU_2264 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 42 317 1 259 260 213 42.0 9e-54 MTKITAPPQKHGPNIAFALLVALALFLGACSGTRQKPPVRSLYYWSTTFVNDSLKRQFYK AHNVQRLYIRYFDVVKKPNQEPLPNATITFNDSVPQDIEVIPTVFITNECMRQPPQFAEQ LWQRIRQMNETHGVKNVKEIQIDCDWSKQTQEVYFQFLRQLHQLLAKAGLKLSVTIRLHQ LGMQAPPVDKGTLMLYNTGDFRQLSNQKPILDPEVVRQYIGGLRSYSLPLNAAYPLFRVR ALFRSGKFVGIIHTKDEYPVLPTDSIAVRETTLDDLQSVQQLINKHRPDVHNEIILYDIN NRNLTKYSSNDYEKIYNP >gi|283510512|gb|ACQH01000107.1| GENE 33 51038 - 52717 218 559 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 341 554 126 338 398 88 29 1e-16 MQIKSLLRIPDNKYSTKTILRWLWLAWNGNRLQAMLNASIGVMQVVVSLAQVWAVKRAID VASGSVPGSIYWAVGVMGLFILADFSLSIASTWVRNLLGIKAQNRMQQRMLDRLLKSQWH GKEAFHSGDVLNRLEQDVNQVVTFLTETIPSSLSVFAMFVGAFCYLFAMDHILALVIVGI LPIFLLASRTYVGQMRQLTRKVRDSDSKVQSVLQETIQNRMLIKTLESDNMMVNKLGETQ SELRQHVVRRTLFSVFSSVVLNFGFALGYLVAFLWAAIRLADRTLTFGGMTAFLQLVNRI QNPARNLTKLAPAFVAVFTAAERLMELEENPLEQQGEPEYIDAPCGLKLDNVTFAYSDKE RNIVKDLSFDFKPGSCTAVLGETGAGKTTLIRMLLALVSPTKGSVNIYNSSTSRELTPLM RCNFVYVPQGNTLMSGTIRENLKLGNIAATDEEMIAALHKSCADFVMELPHGLDTVCSES GGGLSEGQAQRIGIARALLRDRSIMLFDEATSALDPETERQLLSNILASRDKTIIFITHR LAVVDYSDQTLRLERLESN >gi|283510512|gb|ACQH01000107.1| GENE 34 52848 - 53768 936 306 aa, chain - ## HITS:1 COG:MTH872 KEGG:ns NR:ns ## COG: MTH872 COG0061 # Protein_GI_number: 15678892 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar kinase # Organism: Methanothermobacter thermautotrophicus # 25 291 21 277 283 141 31.0 2e-33 MPSCPLKIAIFGNECQLHNERAIARVVQCLAQHKAEVYVDEPFLDYLRQSGNLPTYPLTP FTGNHFDAQLALSLGGDGTFLKAAGRIGQKQIPIVGINMGRLGFLADVPASKAEDALNDI FNGEYRIEEHVVMKVEAGNEPFGGNPFAVNDIAILKRDDASMITIGVRVDGERLITYQAD 
GLIVATQAGSTAYNLSNGGPIVAPNTNALCLTAVAPHSLNVRPIVLPDNVELQLSVESRS HNYLVAIDGRSTKLVQGVDIHISKAPYVVKQVRRNNQTYFTTLRNKMMWGADSRNADMPN SHSEEG >gi|283510512|gb|ACQH01000107.1| GENE 35 53925 - 54143 136 72 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRKKDNRIGALGETKVWLSARLASGLSLTSDVLQAFTQPYIIKYCWQLNYKSARFLYPQC PSIGVADCVFEK >gi|283510512|gb|ACQH01000107.1| GENE 36 55487 - 56104 769 205 aa, chain + ## HITS:1 COG:PAB1451 KEGG:ns NR:ns ## COG: PAB1451 COG0179 # Protein_GI_number: 14521596 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) # Organism: Pyrococcus abyssi # 2 201 17 214 225 153 43.0 2e-37 MKIFAIGMNYAQHNKELDGTLYKPETPVVFTKADSALLKDRKPFFVPDFMGRIDYETELV VRISRLGKGISQRFAHRYYDAVTVGIDFTARDLQQKLRAQGLPWELCKGFDGSAALGEWI PIERFRSVDALHFHLDQNGKTVQEGCTSDLLFKVDELIAYISSFFTLKTGDLIYTGTPAG VGPVAIGDHLEGYLEDQKVLEVRCK >gi|283510512|gb|ACQH01000107.1| GENE 37 56474 - 59983 3570 1169 aa, chain + ## HITS:1 COG:no KEGG:PRU_2691 NR:ns ## KEGG: PRU_2691 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 133 1159 19 1041 1046 1006 49.0 0 MVSFAMQAQRFFNLTYDQVRVDSILPHFGHAIPLTGNYQDSTYTASIVYPEFAEMTRADI ANYNTNWGKQLPEMPQLEQRIVLDRKRAILNVGFCPLVFRDNRYQMLVGFMLKVEAKPLK RTQRKVLSVTRATPKAGRYANNSVLATGRWAKIRVPASGVYQITESLIRQAGFNDMNKVK VYGYGGNLQNERLEGTELQAKDDLKEVATCLVGGKRLFYAKGSVSWESASAAIRTRNPYS DYGYYFLTQSDGEPLRVDESAFLNSFYPSADDYHSLHEVDNFSWYPGGRNLFENTPINRG TKKSYTIDAPIADTGGSITVTLSAGTATRANISINGKRLGEINVSIPSKYDKGNRADFFA NIEQFAAKNEVTIEVTQGGPARLDFISIATSKPRNAPDLHAATFPTPQYVHNITTQNLHA HGAADMVMVIPTSQKLRQEAERLKAFHEKHDGMRVRIVPADELYNEFASGTPDANAYRRY LKMLYDRAESEADQPKYLLLFGDCAWDNRMKTGDWRTTSVDDYLLCFESENSFNEITCYV DDGFFTLLDDGEGTDLERSDLQDVAVGRFPVVHPNDARTMVDKTIAYAQNANAGAWQNTI MFMGDDGNDNLHMSDANKMADSLATEHPQYLIKKVMWDMFKRESSATGHTYPEVTSIIKQ QQAAGALVMNYVGHGRTDQISHESVLRLNDFKAFSNTDLPLWITASCDIMPFDGAVPTIG EAAVLNKNGGAVAFFGTTRTVYAYYNQRINMAYLHFLFSENGGKPMTIGEAQRLAKNYMI 
TDGQDQTTNKLQYSLLGDPALAINRPTARVVVDSINGTPVQAAILPKLAAGKVNSIVGHI EGHTNFAGVVTATVRDRKQRLVGRLNNQSPTEGADKPFAYTDRVNTIFNGADSVRKGQFK LSFAVPRDVDNDNGTGLVTLYAVNNERTIRANGYSERFTIGSGTLSANDSIGPSIYCYLN STSFTNGGNVTPQPLFVAKISDKDGLNATGSGIGHDMQLVIDGSMHKTYNLNDHFTFDFG SYTAGTVMFNIPTLSPGKHKLQFRASDIQNNTSQVTLDFHVVGGLMADNIWIDATENPAR TTTTFIVNHDMGNTPVDIIIEVFDLTGKLLWTNEERGTQTVGNYTRTWDLVCTDGHVLST GVYLYRAKVAAQGTAKTSKARKLIVQGKR >gi|283510512|gb|ACQH01000107.1| GENE 38 60195 - 61382 1374 395 aa, chain + ## HITS:1 COG:no KEGG:PRU_2690 NR:ns ## KEGG: PRU_2690 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 11 393 4 378 379 538 68.0 1e-151 MTNSKRIALAFTLAIATTTAWADDKKEIFNPVRTSVTSQGIAPDARAAGLGDVGAATEAD VYSQFWNPAKYPFAISRAGVGLSYTPWLRQLVNDIGLAHMSGYYRIGDYAAISAGIRYFS LGEVTTNDNSGGSSSAPALTIRPYEMSADVAYSLMLSEKFSIAAAVRWIYSDLTYSYTDN TTPGSAFAADLALYYQNYLNIGQRECQLGIGMNVSNIGSKITMGEDANSEFIPTNLRLGA SLMIPVDEYNRFTLAADANKLLVPTYPHQNEGESTEDYQRRVQKDYYDVSSIGGIFKSFG DAPGGFKEELDEISLSFGAEYVYHDQFSVRAGYHHESANKGNRKYFTVGAGFKMSAFTLD AGYVIATAKSNPLDQTMRFSLTFDMDGLKDLFKRR >gi|283510512|gb|ACQH01000107.1| GENE 39 61620 - 62099 586 159 aa, chain + ## HITS:1 COG:alr3883 KEGG:ns NR:ns ## COG: alr3883 COG0245 # Protein_GI_number: 17231375 # Func_class: I Lipid transport and metabolism # Function: 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase # Organism: Nostoc sp. 
PCC 7120 # 1 156 3 158 165 174 53.0 9e-44 MNYRIGFGYDVHRLVEGRELWLGGIRIDHTVGLLGHSDADVLIHAICDALLGAANMRDIG FHFPDTANETLNMDSKVILEKTVELIATKGYRVGNVDATVCAEQPKLNPHIPAMQQCMAK LMGVDPDAVSIKATTSERMGFVGRQEGMAAYATALIEKV >gi|283510512|gb|ACQH01000107.1| GENE 40 62783 - 63763 878 326 aa, chain + ## HITS:1 COG:YPO3910 KEGG:ns NR:ns ## COG: YPO3910 COG4206 # Protein_GI_number: 16124042 # Func_class: H Coenzyme transport and metabolism # Function: Outer membrane cobalamin receptor protein # Organism: Yersinia pestis # 127 275 32 184 625 71 31.0 2e-12 MFLLYKKAKAARSVRMLCCTVRLLCVACLLALSATLTAQNVVVYGTITDGRSNEPLMGAY IEALGTKANAVSSASGSYQIVVENNANIRLRTTYVGYGELMKQVKTFGKDSIKLDLVLMP DEHTLHDVTVTARSEARKIREGAMPVSVISQRQLQGTATNINDVLARTVGVTVRNTGGMG SASRISVRGLEGKRMGLFVDETPLAQIGNFVALNDIPTSMIERIEVYKGIVPYKFGGSAL GGAVNVVTKEYPPVYLDVAYEISSFNTHQLSTVLKRTNARLGLQFGVGGVLTHSDNDYTM RLENLGGRVVKRDHDKFDKAMGGLFV >gi|283510512|gb|ACQH01000107.1| GENE 41 63789 - 65180 1464 463 aa, chain + ## HITS:1 COG:no KEGG:BF0628 NR:ns ## KEGG: BF0628 # Name: not_defined # Def: putative TonB-linked outer membrane receptor # Organism: B.fragilis # Pathway: not_defined # 4 463 323 779 779 404 41.0 1e-111 MKAEMIFTRTKQQIQGIDLPVKEAYNESKSLVGALKLKRENFFTDGLDFDLDAGYVWGWY GMHDVAKHRYDWDGNQLPPVSGFGGEQGNHPTDGKNRSQDLIAKLNMGYTLNEHHALNLN VYENYTRLNPKDTLMDRALNYHANFPSNMNNLTVGFSHDMTLFGGRFQNALTLKAFFYSS HSRAIEVFGVSDPEPVSVARSYAGFSDAMRYKFNPYLMLKVSFNSEVRIPTSEELIGNGY SILPSPALKPERTTGTNLGLLYRRQSDDGQVMECEVNGFYNVLHDMVRFTPDIIPTMARY RNFGNVRTMGVEIDCKADVLPWVYLYANGTWQDLRDTRRMMPGTQVENPTYNKRIPNVPY LMGNFGAELHKENLFGGKGQNTRLLFDASYVHQYFYDFEMSIYQDRKIPTSFTMDAGLEH SLGNGKWTLTLKLKNITDRWVVSELNRPLPGRTLAFKVRYVLK >gi|283510512|gb|ACQH01000107.1| GENE 42 65242 - 66480 1139 412 aa, chain + ## HITS:1 COG:no KEGG:BF0627 NR:ns ## KEGG: BF0627 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 409 1 405 407 253 36.0 9e-66 MKVKHLCMAVIAMAAMTACGDDNKEQPTPQPGEGGKFEGVLFATSVTNADGNSGAAYLQG IPKLTGNPVDNSKGLPTGFGITPIVAPSGNVYSLPDYMGNSRSAIVRYVPNANGELVEKG 
KLELTAGAQACNVVELNAEKAYVTSQATGKIIVFNPTTMKQMKEIDLNAYAQQGMNVAPS AMIIRDADLFVGLSQRNANWMPTKASVELVVVDTKTDAVKKHIVNETLGLSTATRPIDPK SIFMDEKKDIYVNCIGSFGTKAEFPGGIIRIKNGSTEIDPDYSFKLNEVKVAGLTGDYGT FLGTVCYMAKGKLYGYINAPQLDPGAKNMYLTIAYCPVEVDLYNKTITKIESVGYSNPHA CAVAKYGDKVVFGVTNKTQSGFFSYDPATKKAEGPILKVTGNPIFFYSYGAK >gi|283510512|gb|ACQH01000107.1| GENE 43 66813 - 68258 1327 481 aa, chain - ## HITS:1 COG:no KEGG:PRU_2629 NR:ns ## KEGG: PRU_2629 # Name: not_defined # Def: sialic acid-specific 9-O-acetylesterase-like protein # Organism: P.ruminicola # Pathway: not_defined # 26 481 18 483 483 525 52.0 1e-147 MKKIVFSLRLRLLGAMVACLATMPLQAKIRLPHILSDGMVVQQQADVCFWGWTTPNTNVT VVPTWQEKRVTVKSDSQGRWRVVLRTPKASFHPLSITFSDGQPTTINNMLAGEVWVCAGQ SNMEMPVRGFNECPVEGYQDAVWASANMRGVRYVSIPPRMSSVPLEDTPCQWVDMSPKTV SDCSAVGFFFAQRLAQALQIPVGLVLANKGGTMVESWLNAENLGKYTDEPTDSVQLARKY AYEWLRPLLWGNGTFHPIINYGVRGVLYYQGCANVDHNPNTYGERLKLLAQQWRKHFSDN MLMLIVEIAPHRYEGEMGTSAAFIREQQYKASLEIPHCAMISTNDLVYPYEVRQIHPTQK RPIGERLALTALARYYGYDKVMYKQPAFEKLEIKGDTCVVHLKDTYTGIIPQSTYEGFEV AGADRIFHPAHATYVRHNKFRLTSKAVPKPVAVRYCYRNFQLGNVKNQAALPLIPFRTDN W >gi|283510512|gb|ACQH01000107.1| GENE 44 68255 - 69934 1917 559 aa, chain - ## HITS:1 COG:BS_ydhT KEGG:ns NR:ns ## COG: BS_ydhT COG4124 # Protein_GI_number: 16077655 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-mannanase # Organism: Bacillus subtilis # 373 512 155 294 362 85 35.0 4e-16 MKRNNIYLAIFAALSLSAAACNDAVEHNIQAVDAPAFVGVSPQENIKAGLDSIVVTYDKN VFFASSQYQRITLNGQPVVSANVIGSSNKLLLMANIVRGESYELLIPEGVVSGPNNMPAP EVKATLKAQSQQITTALVNANATAETKALYQKLVANYGQKTFSGAMADVAWNTKISNQVH AQTGKYPAINGYDYIHLQSTTPGGWIDYANIAPVKTWHDAGGMVTIGWHWNVPTSNPYAS TVPIVLYGGPNKDMPSDWSGYIQLTTQATKDILAKASVGSKLVVKIANAKANAQGSIKNS KWAGFVDENGKNWDYFNISGDSYTMTLDQTTLTEMRANGLIIGGHDYTVTGVTVEAAGMV KYDFYAAKNEFTLNDAVTEGTWANRFMKSDLEKIVPYLQQLQRAGIPVLWRPLHEAAGKW FWWGNGTAENYRKLWHIMFDFFKQRGVNNLIWVWTSEKDDPDWYPGDEYVDIIGTDMYGQ NGTPVTAQTAAQRFNELAYRYPTKMIALTECGTVADISAQWSADAKWSFFMPWYPGGGVV YATDAWWKSAMSNPNVITR >gi|283510512|gb|ACQH01000107.1| 
GENE 45 69950 - 71236 1442 428 aa, chain - ## HITS:1 COG:no KEGG:Dfer_2531 NR:ns ## KEGG: Dfer_2531 # Name: not_defined # Def: hypothetical protein # Organism: D.fermentans # Pathway: not_defined # 20 425 18 381 384 160 31.0 9e-38 MKRSIITLRSLAAWLALATLALTACSDQPDEYKPTGGSPQVQYIRLPESADSIITQSYMQ RTICLVGSNLRSVRRMFFNDKEAVLNSSFITDNTLLVDIPGEIPSAVTDKIYMVNGDGDT TTHAFHVIVPPPSVKAMNCEYAPVGEEVTITGDYFIQDPYKPLQILFNDGALAVTDIKSI TKNSITFTMPTGAAAGRVIVQSVYGKGESDFVYKDTRNIIFDFDGSHGGMAKGHGWRAGN IRAGGIDGSYLYFGGVKMLGKVGATWAEDNFAINYWPEPNNNFPEISSIPSIAAMLDTCS VSDLVLKFECRVPANKAWSASALQVIFTGNADVTYTNANSQYYGKKDLPRGLWIPWQATG SYNTGDKWVTVSMPLSAFNKTGDGQTAGSALTKDRLTGFTFFLWNGGVEGTDCEPELYID NVRLVPAK >gi|283510512|gb|ACQH01000107.1| GENE 46 71260 - 73017 1963 585 aa, chain - ## HITS:1 COG:no KEGG:Cpin_6367 NR:ns ## KEGG: Cpin_6367 # Name: not_defined # Def: RagB/SusD domain protein # Organism: C.pinensis # Pathway: not_defined # 1 570 1 538 547 267 35.0 1e-69 MKRLKYLYMAFALTTGALTTSCNDFLDRPAEDSYNAGTFYKDDNQCLQGVNYLYNSPWYD FQRGFIKVGEVMSGNYYWGSSPYLTLTVNSTDGDLMNMSYSLWSVIGHANTVYNNLKNAT ASQAIKNQCMGECLAWKAMAYFFLVRSFGDVPIIHDNSAMLADGSYSTVSRVKKADVYEY IILTLEKAMQLLPRNGSPGRIDYYSAEGLLAKVYLAKSGVNANGNGQRNADDLKKAAAYA KDVIDNSGRTLMANYSDVFRLQNAMNREFLFAWMWTADTSIWTVQNTLQSDLAMVGFDEF GDCWGGYNGPSIDLQEAFGVSPTENPESRSEVDTRRKATMMMAGDFYDYFWTDKTDNKGR KGFNYLQFMYDADNYGSGGPGKLQSATGANNVKHLYGNAYDHKTFAIDGISASNMHSSLP TPVLRLSDVYLVYAEAVIGNGTSTTDASAIDAFYKVRSRAVKSASRPTSISWQDVWKERR LELAMEGDRWYDFVRRSYYDTAGCIAELKAQKRGAFFGLNTLYENYYKSGAWNVNTTDMR YNPEAQAPNVTAQSFTLPFPSQDVVFNGNLLNEAVHVDVRSTYSY >gi|283510512|gb|ACQH01000107.1| GENE 47 73036 - 76320 3767 1094 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_4951 NR:ns ## KEGG: Fjoh_4951 # Name: not_defined # Def: TonB-dependent receptor, plug # Organism: F.johnsoniae # Pathway: not_defined # 43 1094 36 1063 1063 638 37.0 0 MDFNYSKVTLLAACLTLTMLGHSPRVLADITNNVAPAAQQTRKITGRVTDGRESIIGATV RVKGTTNATVTDLDGNFSLDAPQGATLEISYIGYAPKEVKASGQGPLNIVLSEDNNTLNE VVVVGYGVMKKSDVSGASVTMDEKKLRGSIVTSLDQTLQGRAAGVTAISTSGAPGTSSSI 
RVRGQATINANAEPLYVIDGVIIQGSGASGSSLGLGDALGNGKVSTVSPLSTINPNDIVS MEILKDASATAIYGAQGANGVVLITTRRGKAGEAKFTYDGMAAWSRQGKRLQVMNLQEFA DYYNDFVTTGQINLADADKHFADPSLLGKGTNWQDAIFRTALQHSHQISAQGGSEKVQYY VSGNYMSQEGTLIGSKFDRLGMRANLDAQLKSWLKLGMSVNFSNTNDDLKLADSDAGIIN YSLTTPPDIQIYNINGGYSSVSKEGFTNPNPIAMAMLNQILLNRRKLNGNLFFEITPIKH LVWHAEVGYDLSSDRGRRFMPTVDLGTWKRTKNSASVQKSSGTFWQLKNYVTYANTIGKH SFSAMLGQEMWESSYDFNRTENTDLPANTVNNPALGAGTPAIGVGFGSSAMASFFSRLTY AFNDRYNATYTYRYDGSSNFGPNKRWAGFHSFAASWRFTNEPFFVKSAAAKYISNGKLRA GWGQTGNANIGSYKWGTLMKAMPTLLGKAYRPDNIPNLDVQWESQEQWNVGLDLGVLNDR ISFTLDWYHKESRDMLMPLQLPSYMGTSGNISSVLAAPYGNYGTIRNTGLELTVNAHPLT GALQWDSEFQISWNKNKLVALSGTKNAAIVGYGQWNDVISVSNVGESLYSFYGFVTDGVY KDLADLETSPKTSKYPADGKSFDRNTTPWIGDLKFKDISGPEGKPDGVIDDYDKTNIGSP MPKFTFGWTNTFRYKNFDLNVFVNGTYGNKVYNYMKMKLTHMNSLWSNQLTDVTKRARLE PIDPSKTYTPGSYWYTDAGNIRVANPETNIPRATITDPNDNDVISDRYIEDGSYIRLKNI ALGYTLPKKWISKWGIDNLRVYVNIQNLLTITGYDGYDPEIGASTADINGYTYGLDNGRY PSPTTYSFGLNLTF >gi|283510512|gb|ACQH01000107.1| GENE 48 76304 - 76714 172 136 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVLLACKIQADFTDNSRKQTNKDIAKTMFLFFFIAHIGALDPTFCKLSNIVLANNKIVLV CNNINNYLCNRIKMISKCRHKCQPNNKFEPQTTVHRYPKRSVTTITQPAGSCNSLPFRHL SHQYNQSNPNNNGFQL >gi|283510512|gb|ACQH01000107.1| GENE 49 78185 - 79588 1252 467 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929604|ref|ZP_06423448.1| ## NR: gi|288929604|ref|ZP_06423448.1| hypothetical protein HMPREF0670_02342 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 467 1 467 467 900 100.0 0 MSRILLHLAVCALCSVSAVAQNNKDAATYAPTATGEQLTNLYLNSALLNGGNAVLPGGVA GDARGMATVNGKMYVCCREGTVSKLIELDGKTGALLRTITLPEEMWKEGEAALGFICNDV QVDNAGHIFVSNMATDMRGDGAAHLFRVNYVDVTKTPVAYKTVLNDTLPGDLPQTMRIDT YDLYGDILNGDGIIMLPVSGNEAGAGNTVIKYTVTKGVADVANRQTIVLTKFNPEKAVAA GAAPRINIVDNELFYHDGFNTMPMLYDMNGAAVDGFQNNKPLTPVSTGQNGVTEFELNGA YYLIVASTNTDDKEAPQAFDIFKFKDEGRSFADMTFLYRFPQAGLGSVGNPVRTALPRVE IVDGVGGQKKARINIYAYKNGYGIYEFANNVSTNIAKVEGDKLDYTVRGNVITVNADVKC ITLYNAIGQKVAETANAQAVKAPARGVYLLHVLGKNGNAKTIKVVIN >gi|283510512|gb|ACQH01000107.1| GENE 50 79676 - 82120 1972 814 aa, chain + ## HITS:1 COG:no KEGG:GYMC10_1275 NR:ns ## KEGG: GYMC10_1275 # Name: not_defined # Def: metallophosphoesterase # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 41 480 69 503 2050 142 29.0 5e-32 MKKILLSLLVLLMLAAYAQARDVVIDGKSYRVDTVALFKAGPGTQYLALEFKGARRINAF FLKVDLTNPYVAYRAALGRDSIYGGEQPTQVAARKSKEKEVYIAGTNGDFYNVSDYVGMP IGCTVVNNELANNAAVPYHRQVFAFDKAKMPYIGTMAYSGNLQIGTERYAIDHVNHLRGS GQLVLYNQHNGQYTHTDANGTEVLLALSEGETWQLNKEIRAKVVSVVQNKGNMKIPVGGA VLSANGAIAGKLAALTAGTEVKIALNLSVGGVSAPFADVVGGDQRSPMLQDGSVNTTEVW NELHPRTGFGYTQDKKTAIHCVVDGRSTISSGATTKELAEIMQFVGAYTAMNLDGGGSSC LFLKDFGPMNKNSDGNERAVGNGIYVVSTAPTDNKITEIKSLQTKVRLPRYGIFKPTFYG YNQYGVLVSRDVAGVKQTVDAKVGEVLADGSFFASGNQSGVITASYNGATEKIEVELLPE APVAIRLDSVLLDNKVTYPIEVQTEVGGNVFNLYPGAMQWDVADPSVCTVNEGVVKGLRN GTTIVTGRLGNFTDQLKVHIEIADKPSITARAFTDATWKVTHSANLKSVVNTPTNDAVKT TFTYKAGRNPYVAYNNDFALYGLPDSIRLTFNAGETNISKLSFRFKENNGGMATINKEFQ GFKKNKDNTISLALADLMGHPNDRAAYPLHFNFMQFSINGSGMTSDKGYDITIKEFVLVY KGTTTAIVRPTNNASQRVFVGGTKDNKLTLHLDLDKEEQVQVQLISVSGKVLRKVSLGVV GKGSYVIPMQQPMSGVYLLKVNYGNKHETIKVVM >gi|283510512|gb|ACQH01000107.1| GENE 51 82318 - 82509 196 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929606|ref|ZP_06423450.1| ## NR: gi|288929606|ref|ZP_06423450.1| hypothetical protein HMPREF0670_02344 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 63 1 63 63 70 100.0 3e-11 MKPSKIKQFVIQLVVAYVIIVVVKMLACMVFGEQLAVHSILASIETLTFALFTSLYISIF AKK >gi|283510512|gb|ACQH01000107.1| GENE 52 82658 - 83302 717 214 aa, chain - ## HITS:1 COG:PA1037 KEGG:ns NR:ns ## COG: PA1037 COG2860 # Protein_GI_number: 15596234 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Pseudomonas aeruginosa # 14 205 4 196 206 119 37.0 3e-27 MTYRDPELVQTVQTIIEIVGTFAFAISGIRHASAKQVDWFGGLVCGIAVAIGGGTLRDMM LGVKPFWMTSSVYMICTVLALLSIIVFAKYVRRLDNAWFVFDTLGLGLFTIAGLQKALAL GHPFWVAIIMGCITGSAGGVVRDVLLNNVPLIFRKEIYAMASIAGGLLYWALYALRVEVS LIVLICFLFICLVRFLAMRYHISLPILREEKPKE >gi|283510512|gb|ACQH01000107.1| GENE 53 83486 - 84445 1222 319 aa, chain - ## HITS:1 COG:CAC2945 KEGG:ns NR:ns ## COG: CAC2945 COG1052 # Protein_GI_number: 15896198 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Clostridium acetobutylicum # 1 317 1 323 324 320 48.0 3e-87 MKIVILDGYTANPGDYSWKELNSFGEVEIYDRTSRDEVIARAKDADMVLTNKVVLKGETL AQLPQLKYIGILATGYNIIDVDETRARGIVVANVPAYSTDSVAQMTFAHVLNITNRIEHY ADQNRKGQWSEAADFCYWDTPLSELAGKTFGIVGLGHIGQKVARIALNFGMRVIAFTSKR AEELPEGVEKANLEELLMQSDVLSLHCPLTENTREMINRQSLTKMKRGAILVNTGRGPLV NEADVAEALAEGRLAGYGSDVMSSEPPKADNPLLKQPNAFITPHIAWATAEARGRLMATA IENAKAFIAGKPQNVVNGL >gi|283510512|gb|ACQH01000107.1| GENE 54 84644 - 85921 1374 425 aa, chain - ## HITS:1 COG:CAC0326 KEGG:ns NR:ns ## COG: CAC0326 COG2256 # Protein_GI_number: 15893618 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Clostridium acetobutylicum # 4 424 16 436 443 414 49.0 1e-115 MSEPLAERMRPRSLDDYVGQKHLVGPNAVLRNMIEGGRIPSFILWGPPGVGKTTLAQIVA KKLETPFYTLSAVTSGVKDVREVIEKAKGGRFFGSHSPILFIDEIHRFSKSQQDSLLGAV EKGIVTLIGATTENPSFEVIRPLLSRCQLYVLQSLSKEDLEELIQRALHTDVVLQQRNIE VKESTALIRYSGGDARKLLNILELVVEASPANASPVLIDDETVVKCLQQNPLAYDKDGEM 
HYDIVSAFIKSIRGSDPDAALYWMARMIEGGEDPQFIARRLVISAAEDIGLANPNALLIA NAAFDAVMKIGWPEGRIPLAEAAVYLATSAKSNSAYQGINAALETARSTGNLPVPLHLRN APTRLMKQLGYGGGYKYAHDYEGHFAQQQYLPDELQDARFWFAQHSPAEEKLYQRIVQLW GDKKK >gi|283510512|gb|ACQH01000107.1| GENE 55 86973 - 87863 849 296 aa, chain - ## HITS:1 COG:no KEGG:BT_2087 NR:ns ## KEGG: BT_2087 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 166 292 165 289 291 75 35.0 2e-12 MKHACLLVCVLLCCTNALAQEQKKHRKTVQTTERPLPKPRLAEPNPTLSTATAPIATPDV QEKPLSIADSIAAAHHFWNANDNLNLRLMQPMPTQATSLSTTPATNYNRGGQLGEWKGLN FYGQSYRETFPAMLVRQGALVGVSRTWDKLSFSAFASADRYRTFNTATQFGIGGSLTYRF SPYVSATAFGSYYSVQPYLSMAAFPYINSSAYGAYFTLDNGRWGIDLGVQRRYDPYRGEW IVSPIVTPKYHFSKKFTLEFPVGELVRDVLERAVFGRRSNSPTIMPPRNFRPFPRP >gi|283510512|gb|ACQH01000107.1| GENE 56 88259 - 88921 837 220 aa, chain + ## HITS:1 COG:TM0695 KEGG:ns NR:ns ## COG: TM0695 COG0740 # Protein_GI_number: 15643458 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Thermotoga maritima # 32 217 14 199 203 225 56.0 4e-59 MNDFRKYATGHLGMNGMVVDDVMKAQAAYLNPYILEERQLNVTQMDVFSRLMMDRIIFLG TEINDYTANTLQAQLLYLDSADAGKDISIYINSPGGSVTAGLGIYDTMQFISSDVATICT GMAASMAAVLLVSGQEGKRSALPHSRVMIHQPLGGVQGQASDIEIEAKEILKFKKELYTI ISNHSHTPFEKVWNDSDRNYWMNAEEAKEYGMIDSVLVKK >gi|283510512|gb|ACQH01000107.1| GENE 57 89087 - 90319 268 410 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 156 400 250 452 466 107 32 2e-22 MSKNVCSFCGRSEKDVNYLIAGLNGYICDSCAQQAYEITQSANASKAQSAKGKKVDALKN VPKPMDIKAYLDQYIIGQDEAKRYLSVSVYNHYKRLEQPIGDDGVEIEKSNIIMVGSTGT GKTLMARTIAKLLDVPFTIVDATVFTEAGYVGEDVESILSRLLQVADYNVAAAERGIVFV DEIDKIARKSDNPSITRDVSGEGVQQGLLKLLEGTTVNVPPKGGRKHPDQDYIHVDTRNI LFICGGAFDGIERKIAQRLNTHVVGYNSVQNVRKIDKGDLMKYILPQDLKSFGLIPEIIG 
RLPVLTYLNPLDREALRRILVEPKNSIIKQYEKLFKMDGIKLSFATETLDYIVDKAVEYK LGARGLRSIVEAVMMDAMFEVPSKKVKQFDVTLQYAREQLDKSGLQEQAS >gi|283510512|gb|ACQH01000107.1| GENE 58 90540 - 92717 2567 725 aa, chain + ## HITS:1 COG:alr0205 KEGG:ns NR:ns ## COG: alr0205 COG0514 # Protein_GI_number: 17227701 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Nostoc sp. PCC 7120 # 1 719 1 712 718 482 38.0 1e-135 MTKDVNLTEPLKQYFGFDSFKGDQEAIIRNLLAGNDTFVLMPTGGGKSLCYQLPSLLMEG TAIVISPLIALMKNQVDVMNGMSEDGCVAHYLNSSLNKAAIQQVMHDIKRGATKLLYVAP ESLGKDENVEFLQSIKVSFYAVDEAHCISEWGHDFRPEYRNIRPTILRIGHAPVIALTAT ATDKVRSDIKKNLGICDAKEFKSSFNRPNLYYEVRPKTQEVERNIIMFIRQHAGKSGIIY CLSRKKVEELSAILKANNIKAEPYHAGLDSATRSQTQDDFLMERIDVIVATIAFGMGIDK PDVRFVIHYDIPKSLEGYYQETGRAGRDGGEGLCITFYSNKDLQKLEKFMESKPVAEQDI GRQLLLETAAYAESSVCRRKMLLHYFGEDYTEDNCHNCDNCLHPKTKREAKEALLIVLKA VLAIKENFRSDYVVDFVKGRATDDLVSHKHNELDDFGAGEDEDDKVWNPVIHQALIAGYL KKDVENYGLLKVTAAGRRFIKHPQSFMIVEDKDFDDDFVEESRDGCGSTLDPDLYVMLKQ LRKDMAERLNVPPYVIFQDVSVEQMAVTYPITLEELQNIPGVGVGKAKRYGEAFCQLIKK HCEDNQIERPEELRVRTVAKKSMNKVKIIQSIDRQIALDDIAIALNLGFDDLLSEIETIV NSGTKLNIDYFLDEVMDEDRVDDIYDYFRTSETDDLDVAIEELGGDYTEEDIRLVRIKFL SEMAN >gi|283510512|gb|ACQH01000107.1| GENE 59 93501 - 96635 3305 1044 aa, chain + ## HITS:1 COG:no KEGG:Cpin_5147 NR:ns ## KEGG: Cpin_5147 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: C.pinensis # Pathway: not_defined # 50 1044 166 1164 1164 640 36.0 0 MKKNLVLRGFALCSLLLFPLFCAWAVVPIADKDAQRAPLTQKGNDGRVDVRGNVVADEDG LPVMGATVTATATGQKTVTDGDGNFFLPSIPQGSELRVTYVGMHPLSVKAKASLTIRLKS DAKMLDEFVVTGMFNRKKEGFTGSAVTIKGEDIKRYSTTNIAKALSAIAPGLRIAENIVN GSNPNGLPDMRMRGGANMDLGSSAGSKLLAVQGEYETYANQPLLIMDGFEISVQALADLD PDRVASIVMLKDAAATAIYGSRAANGVIVIEMKTPKPGRIWVTYGGELRIENPDLTGYNL MNAREKLEAERISGLYTMGGPTVDKWRLYQSKLRQVLSGVDTYWLDKPLQTSLQQRHTVT LEGGDEALRYRLYVGYNHSPGVMKGSKRDVLTGALDFQYRLPNVLLKNSIAIDNSTADES PWGSFDQYTQLNPYLRPYGENGELLKRLDDFSGLGGDTDYLNPMYNTTFNSKDRSKNFSV RELFKVEYTPLKDLRFEAAFSLSKSVGNRDKFRPAQHTDFDGQTDPTLRGDYRRSQSELV 
KWGIDLTGSWNKQLDDHYITANARMSVQEHSRETYGNYVTGFPNDNMDNLLFGKKYNEKV DGIETTSRSIGWVVAGGYSYKYKYSVDFNLRLDGSSEFGKDNRWAPFWSTGLRWDVKKES LLANLSWLSDFVLRATYGTTGSQGFNPYQAHGYYTYANLLLPYYSSDATGSEILAMHNEK LKWQTTRNLNFGLELGALDHRLTARVEYYRKVTSNMITMVSLAPSVGFAQYPENIGKLEN RGWELTLSAIPYRNVARQSYWTVTLNGSHNADKLLEISEAMRHINEVNAGNLKNAPLPRY EEGQSVNRIWVVRSLGIDPASGNEILLKRNGEMTSAVNWDAKDAVPVGNTEPTWQGHINT SFTYMGWGIDVSMLYRFGGQVYNQTLIDKVENANLKYNADRRVTQLRWLRPGDRAMFREI SPRGSETKATSRFVMDENVLQGSSLSLHYRMDRNNTPFIKRLGVNSARLAFNMEDLFYLS TVKRERGTSYPFSRQFIFSLNVGF >gi|283510512|gb|ACQH01000107.1| GENE 60 96651 - 98168 1982 505 aa, chain + ## HITS:1 COG:no KEGG:Cpin_1098 NR:ns ## KEGG: Cpin_1098 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 2 503 1 487 488 240 33.0 8e-62 MMKKIHILLLVAMLGLTACNNWLDVEPKTSIPADKQFSSESGFKDALTGVYIKLGSATLY GADLSYGYIDELGGLYTGYPGYETSAVFDQSFVYDYDNKFKSKKNAIYAELYNVIANVNN IIGYTETKRNVLVTPRYYETIRGEALGLRAFLHFDLLRLFGPIYKEEPNAKAIPYRTTYN RVPTPVLTAQAVVEAVLKDLHEAENLLKTSDPLDFFTDPDAEEYKLKNGFLVNREFRMNL FAVKAMLARVYCYKGDAESKALAVKYAREVIDANQHFKLIDAQSAANYNSIRYSEQIFGI SIHELDKLLTANHMDAEGKRSDMHYTTSEDNFNEFYETAGAGVTDWRKLPEMFELTGSSD VVAFCRKYNQKPLKEAYGYVGANAIPLIRLPEMYYIVAECEPNAQKSADALNAVRFARGI SFTDEIRTEHYDDPEPMPGVSRTQTKRINEIMKEYRKEFFAEGKLFFFLKAHNYETWLGC AVKSMTKAQYQIPLPDGETIHGNNN >gi|283510512|gb|ACQH01000107.1| GENE 61 98181 - 98855 741 224 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929617|ref|ZP_06423461.1| ## NR: gi|288929617|ref|ZP_06423461.1| hypothetical protein HMPREF0670_02355 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 224 1 224 224 460 100.0 1e-128 MKHILYILSVLFVSAAFVGCKQNEIELYDQSPRLNFEWSQRIIEFVDTDYIKKTAFKTDS FEVRIQGVHLNAPRDFCLKVQPAEGYEAAPELELADKYTYSALDTVCQTYRLRVKRPELK WGNKAFGCYVAFDLGNAAHQFDKGLVEKHQMLVNVRWRLRPYDWEEGDWGKYSNGKYIFM MDVLGKAYSNISYDDYETVTKAYEEYVKAGKPDILDDDGNVISF >gi|283510512|gb|ACQH01000107.1| GENE 62 98998 - 100629 1407 543 aa, chain + ## HITS:1 COG:no KEGG:Cpin_3291 NR:ns ## KEGG: Cpin_3291 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 5 191 1 181 497 65 31.0 4e-09 MKPKLTKTILASVLAFALAGCYEDKGNYDYHDVNGVNIELAEQQVRMPKVDDVEVTLTPT IVQTKETGEENLDFQWLITKEGAKAMSDKMTDYTPYAQGKACKVTVKSGQTDNIGLLLVV TDKLNGTQWFKTTQVKIVKPFNPCWFVLYEQGGKAQLGAVEGTPEGHFVFADVYQSERNE PFPLSGKPLAVAARKVYGNKEAASLFGFWGFTANPGMVIATTEGVAFVTPSTLAVKYSTD KILFEPTAKGTPIKIENYKMDTHGELFVNNSKAYFAHMDGACVPYSLKEGDKYPAISAYG AGNNALYFYDSGAHRFLKRGSLGLQDFYGTPKSSVSLRKGYGSWTDKGVELKPVGQRGTY VNVFDPDNVPAELTIVDIVAGGAFGNQLYAVGYNDAGKQLTVFKFSCRSTEPFCAAKYTV DMPGEVNLKTAKFTASYAYSADLLFMAAANKVYRIDLKRNKVVEIYAAPADAQVVCAKFK DRETTETLGTTLGVAYNQGGKGTVVELKLTAAGDLARTSNASFVYGNGSLPFGKIVDIAF NYE >gi|283510512|gb|ACQH01000107.1| GENE 63 100711 - 103023 1412 770 aa, chain + ## HITS:1 COG:no KEGG:Cpin_6153 NR:ns ## KEGG: Cpin_6153 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 33 644 46 660 812 278 29.0 7e-73 MYRRLCLVVFSALLSFVAKAQTAPHFFANNPQALVSHGLFTTYKVGERLYWEIPDTLLGR EFVVSLTVLTAPKPAKITKEGPKWGYAGDMIGPVFFCIEEHGDKLWITHPQYKRMLVDST SVFSRLALQSATTRLYDRLPIVQRSPNSLLVEVGEWFKDFPLFSLGIAYYDLRLGEKQKQ FDKIERIDVKPDRFLFQISRKYERKSLFSDTESDDTTSQADYWRTGVCIALLPKQPIEPV EAHPRAFFRINRTCYDANRAPRRRAFVKRWRLEVPPALRQRYLNGELVEPVRPIVFYVDR QMPPQYRAAVMQAVRNWQPAFEQAGFKNAIDAQLAPADSANSDFCPYDIGVPFISWKEST FRNAYGPTPCEPRAGEVVGCHIGVFSGVLDLLRQWYFVQCGANDPVGRQPNLPDSVMNVL LELVLTHEVGHTLGLEHNFLGSAHFSIDSLRNDAFLTRNGITTSIMDYVRCNYALRPTDK ASLRNRIARLGVYDRWAIEWAYRLFPGKDAAERAANREQWAKKCLADSLLAFRDGLDVMA QAEDLGNDHLAVNAQGMENLRLLCADSTIWVWRDSVERYSFRNRQEGLVEHYFDMLRQVM 
AHFIGLRLAAADSLIFVVEPPDVAHRTLNFVEKHLWHPPYWLFNAERGRMLGINMKDKRD KFCEEVFEIWKQALMEIDSWDCSQSPRLTSVEMLRSAYTALFVSQTGTTKGKELLQDFRE KHLKLLLNLNGEEATSISAPLLVEVNWQMSKVCVPTNKKPLRGRRQVRRK >gi|283510512|gb|ACQH01000107.1| GENE 64 103441 - 105006 1299 521 aa, chain - ## HITS:1 COG:BS_yurI KEGG:ns NR:ns ## COG: BS_yurI COG2356 # Protein_GI_number: 16080307 # Func_class: L Replication, recombination and repair # Function: Endonuclease I # Organism: Bacillus subtilis # 7 264 45 287 288 110 31.0 6e-24 MGIKRFTHILLALAMATCAWAQGPNGSGTYYQNANGQKGEALKTALSKIISTGAVDIGYN ALWKAYQTTDVRPDGTIWDIYSNITKYDPETGHQGSYKKEGDVYNREHTTPQSWFKKASP MKADLYHVFPSDGYVNNRRGSFPLGETNGEEYKSENGFSKLGKSTTPGYNGTVFEPADEF KGDLARAYFYMATRYEDKVGTWSGGVYGNSSYPGLAKWALDMFIRWSGKDPVSQKEVDRN KAVYKLQMNRNPYVDYPGLEQYVWGTQTSTAFDYANYVAPNGNNKPNNPNEPNKPDEPKQ PDNPGNETTPPAGAMVFHKVTSTSDLQAGKYYLIVNETAKAALSATSKNARAYARVEIVG NAITTEVNTNGKPHALTLGGTVGSYTWFDATEKKYLAYTGPKNSLNNANDATAASAQWQI SIDSEGNAVISNKQATVRMINYNSSQPRFACYMGSNKQEPIQLYVSTATTDIRNLPQVET NGVNVYSIDGRLLRANVQQEDALMGLPKGMYIVGKKKFVVR Prediction of potential genes in microbial genomes Time: Sat May 28 02:27:09 2011 Seq name: gi|283510511|gb|ACQH01000108.1| Prevotella sp. oral taxon 317 str. F0108 cont2.108, whole genome shotgun sequence Length of sequence - 14725 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 6, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 133 - 173 1.9 1 1 Tu 1 . - CDS 392 - 1474 852 ## Clim_1763 hypothetical protein - Prom 1497 - 1556 4.8 - Term 2932 - 2979 4.6 2 2 Op 1 . - CDS 3004 - 3336 375 ## gi|288929622|ref|ZP_06423466.1| hypothetical protein HMPREF0670_02360 3 2 Op 2 . - CDS 3352 - 3732 193 ## gi|288929623|ref|ZP_06423467.1| hypothetical protein HMPREF0670_02361 - Prom 3802 - 3861 6.0 + Prom 3789 - 3848 4.0 4 3 Tu 1 . 
+ CDS 4044 - 5528 1964 ## COG0516 IMP dehydrogenase/GMP reductase + Term 5561 - 5620 6.5 + Prom 5539 - 5598 1.7 5 4 Op 1 3/0.000 + CDS 5633 - 7057 1522 ## COG0760 Parvulin-like peptidyl-prolyl isomerase 6 4 Op 2 . + CDS 7087 - 8544 1866 ## COG0760 Parvulin-like peptidyl-prolyl isomerase 7 4 Op 3 . + CDS 8541 - 10418 1645 ## PRU_0035 hypothetical protein + Prom 10456 - 10515 1.8 8 5 Op 1 . + CDS 10557 - 10865 310 ## BVU_0423 hypothetical protein + Prom 10871 - 10930 4.4 9 5 Op 2 . + CDS 10954 - 12777 2128 ## COG0323 DNA mismatch repair enzyme (predicted ATPase) + Term 12812 - 12853 10.5 + Prom 12815 - 12874 4.0 10 6 Tu 1 . + CDS 13040 - 14191 983 ## BVU_0414 major outer membrane protein OmpA + Term 14238 - 14295 13.8 Predicted protein(s) >gi|283510511|gb|ACQH01000108.1| GENE 1 392 - 1474 852 360 aa, chain - ## HITS:1 COG:no KEGG:Clim_1763 NR:ns ## KEGG: Clim_1763 # Name: not_defined # Def: hypothetical protein # Organism: C.limicola # Pathway: not_defined # 7 359 18 361 515 214 39.0 5e-54 MKHSRSLFALLLLFILTGCKKDPQDIFNEQKSGVVLICNEYYYDITLFNGKHLYFSGLDD KGNFINLTGNLDDIKRHPGILNGTGAFIDNTGRILTNRHVVAPSVDKTTVKNNVNAIIED YARYIQSLQDSMNQRYSAIQEYAQENTYEDDEGESYTNLSNEEIIYLQQEIANLKEQYVE AENLKNSVWNNILNDNFSIQLHSRFGIAYDGSNVNSWDDFMKTPCSMLRVSEDAYSDLAL LQLDSKETPTGHYIFTIDNSMATLDNQLEINQAVYMIGYNYGMALAKTDNGICAQFTSGT VTQQPDGKRVTYSIPSMPGSSGSPIVDDRGRLVAVNFAKSNGSDNFNFGIPLIRIVTFLR >gi|283510511|gb|ACQH01000108.1| GENE 2 3004 - 3336 375 110 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929622|ref|ZP_06423466.1| ## NR: gi|288929622|ref|ZP_06423466.1| hypothetical protein HMPREF0670_02360 [Prevotella sp. oral taxon 317 str. F0108] # 1 110 1 110 110 207 100.0 2e-52 MSKIRVKDIIGAEVRSRIPIAALKEAIARDGCYDIDMAEVTFISRSFADELYNLQLDHPN VQFINAQGNVKKMMEVVWKGRKKKRVRPQADVKTVDMTSIEDFSNFLLSI >gi|283510511|gb|ACQH01000108.1| GENE 3 3352 - 3732 193 126 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929623|ref|ZP_06423467.1| ## NR: gi|288929623|ref|ZP_06423467.1| hypothetical protein HMPREF0670_02361 [Prevotella sp. 
oral taxon 317 str. F0108] # 1 126 4 129 129 248 99.0 9e-65 MKKDLYITIADCGKTIYGSYVDTNKYINEIGTNEAEALRIANEGFSTKDRPGAENRGYGI SKSREMIVNGLRGAFFMLSGTAFYRHDENGIVPANIPEEYRWNGTIILLKVPLTAPSGFS FYNYVE >gi|283510511|gb|ACQH01000108.1| GENE 4 4044 - 5528 1964 494 aa, chain + ## HITS:1 COG:SPy2206_3 KEGG:ns NR:ns ## COG: SPy2206_3 COG0516 # Protein_GI_number: 15675939 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Streptococcus pyogenes M1 GAS # 207 491 1 284 286 346 63.0 7e-95 MSSFVADKIVMDGLTYDDVLLIPAYSDVLPKTVTLKTKFSRNIELNIPFVTAAMDTVTEA AMAIAIAREGGIGVIHKNMSIEEQAHEVAVVKRAENGMIYDPVTIRRGRTVKDALDMMRD YHIGGIPVVDEDNCLVGIVTNRDLRFEHRLDKKIDEVMTSENLVVTHQQTDLAAAAQILQ ENKIEKLPVVDANNRIVGLITYKDITKAKDKPMACKDEKGRLRVAAGVGVTADTLDRMKA LVEAGADAIVIDTAHGHSKYVVEKLVEAKRAFPNVDIVVGNVATGEAAKLLVEHGADAVK VGIGPGSICTTRVVAGVGVPQLSAIYSVFDALKGTGVPLIADGGLRYSGDVVKALAAGGS SVMIGSLVAGTEESPGETIIFNGRKFKTYRGMGSMEAMEQKNGSKDRYFQGDTLEAKKLV PEGIAGRVPYKGTVQEVIYQLIGGLRSGMGYCGANDILALHNAKFTRITNAGVLESHPHD ITITSEAPNYSRPE >gi|283510511|gb|ACQH01000108.1| GENE 5 5633 - 7057 1522 474 aa, chain + ## HITS:1 COG:XF1191 KEGG:ns NR:ns ## COG: XF1191 COG0760 # Protein_GI_number: 15837793 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Parvulin-like peptidyl-prolyl isomerase # Organism: Xylella fastidiosa 9a5c # 116 273 276 436 655 84 35.0 4e-16 MIRRALSVALMFCAFVAHAQDDPTIMTINGRPVSRSEFEYSYNKNNTDNVVDKKTVAQYV ELFVNYKLKVEAALAARLDTTQAFRNEFADYRDQQVRPSFVNNDDVDKAARQLYDDTKQR IEGQGGLIRVAHIMLLLPQKSAKDLQRRAEQRIDSIYNALKRGADFAALARKLSDDKGSA QQGGELPWLQKGQTLKEFEDAAFALKPGEISKPLLSPAGYHIIKMIERRSFLPYDSVKAD IFAYIEQTNMREQIIDNKLNELAKASPTPRTTADVLEERAQAMQAKDPALRFLIQEYHDG LLLYEISNRTVWEKAAKDEEGLRYYFNKNKKRYAWDSPRFKGIACYAKSKADLKAAKKKV KNLPFNEWNEALRKAFNNDSTIRIKAERGVFKLGDNPVVDRDVFKQKVAIKDDANFPIVG VIGKKLKAPKEMDDVRALVVADFQETLEKEWVKDLRKQYKVEVNEAVLKTVNKH >gi|283510511|gb|ACQH01000108.1| GENE 6 7087 - 8544 1866 485 aa, chain + ## HITS:1 COG:RSc1715 KEGG:ns NR:ns ## COG: 
RSc1715 COG0760 # Protein_GI_number: 17546434 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Parvulin-like peptidyl-prolyl isomerase # Organism: Ralstonia solanacearum # 174 308 242 376 648 80 36.0 1e-14 MRTNTKIFCGLSAMILTAAISANAKQGLATKVAENDSTATATNKPVKTEPAVPESSVVDE VIWVVGDEPILKSDVEAARLQAENEGHKFKGNPDCSIPEQLALQKLFLHQAAIDSLEVSE SEISQGIEEQINYWIQLIGSKEKLEEYRKQTISEMRQAMHDDYKNNRLIAMMKEKLVSDV KVSPADVRKYFKDLPADSIPMVPTMVEVEIITQNPKVKTEEVNRIKDQLREYTDRVTKGE TTFATLARLYSEDPGSARQGGELGFTGRAAFDPAFAAVAFNLTDPNKISKIVETEFGYHI IQLIDKRGDKINVRHILLKPRIAQDDIDRSKARLDSIASDIKAGKFSFEEGATFISDDKD TKNNHGLMAYTDRESRSLSSKFRMGDLPTEVAREVEGMKVGDISKPFQMTNARGKTVVAI IKLKNRVDGHRATITEDFQVMRNVVLAKEREKKLHNWVVNKIKQTYVRMGDRYKNCDFEY QGWIK >gi|283510511|gb|ACQH01000108.1| GENE 7 8541 - 10418 1645 625 aa, chain + ## HITS:1 COG:no KEGG:PRU_0035 NR:ns ## KEGG: PRU_0035 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 29 534 4 514 517 510 50.0 1e-143 MRLKPISTSNGDRHSGVLWAVLCLFVLCLLPALAQKKVKTAAPQTDDRVYLVHSDELRYD QFGPVPDAQIVKGKVQFMHKGARMWCDSAYFYQQTNSFRAFGHVRMVQGDTLSLTCERAF YDGQAQMMEARQNVWLKHRGQTLNTDSLNYDRLYNNAYFFEGGTLTDKNQKLVADWGQYN TQTREAVFYYNVKMIEGDRVITTDTLHYNTQNSMAHVLGPSKITSKTGVIETRDGYFDTK TSKSKLFGRSTVNDKYKTITGDSLYYDDKTGQSEGYGDVIYVDKKNNNSLLCQRFKYNEK TGVGWATGKLLAKDYSQKDTLYVHADSVKLFTYNINTDSVYRLAHCFRHVRAYRTDVQAV CDSMVANSKDSCLTMYRDPIVWNANRQLLGEVIKIYMQDSTVREAHVLGQALSVEQMPDS VHFNQLSSRDMFAYFVEGNVRRNDAVSNVRSIYYSVDEKDSTLIGLNYLETDTMRMYISP QRKLEKIWTSRFEATLYPMTQIPPGKERLEAFGWFDYVRPLNKDDLYEWRPKAAGTELKK VQPRVLPKQRLDDDEKSGGNTNSDKENTTEQATKQPTVDDENKTDAAATKTKPAVKSSAK KGSSAATAKKSTAKSRTTTANPKSK >gi|283510511|gb|ACQH01000108.1| GENE 8 10557 - 10865 310 102 aa, chain + ## HITS:1 COG:no KEGG:BVU_0423 NR:ns ## KEGG: BVU_0423 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 102 1 94 98 66 42.0 3e-10 MGMFSMRKPRRFHHEYIYANDRKQRLEAIEQRAKEELGLAEKTEEQSVSPRDADRFRGAF TPSNSHLHRRGQSYGVAFGSRIFLLVAIFLFMGYVLFRLLLG 
>gi|283510511|gb|ACQH01000108.1| GENE 9 10954 - 12777 2128 607 aa, chain + ## HITS:1 COG:SA1138 KEGG:ns NR:ns ## COG: SA1138 COG0323 # Protein_GI_number: 15926879 # Func_class: L Replication, recombination and repair # Function: DNA mismatch repair enzyme (predicted ATPase) # Organism: Staphylococcus aureus N315 # 5 606 4 665 669 301 29.0 2e-81 MSDVIQLLPDSVANQIAAGEVIQRPASIIKELVENAVDAGATRIDVNVMDAGKTSVQVID NGRGMSETDARLAFERHATSKIRQASDLFALNTMGFRGEALASIAAVAQVELKTRLASEE IGTSISIAGSQFTGQEPCACPVGSNFVIENLFYNIPARRKFLKSNATELNNIITAFERIA LVYPDIAFTLRSNGTEVFNLPSVVLKQRIVDVFGKRISQDLLSFNVETSICKIHGFVGKP ESARKKGAHQYFFVNGRYMKHPYFNKAVMQPFERLMPQGEQVPYFIYFEVNPEDIDVNIH PTKTEIKFENEQAVWQILSAAVREAVGLFNDVPTIDFDTEGRPDIPVYNPNENAPAPKVN YNPHYNPFDDTPQAALQTKNNSRWEDLYAGLKADEEADLGLFPDVGSGGQSGETERMIED KSPAHYQYKGRYVMTAVKSGLMVIDQHRAHVRILFEQYLRQVSQRSGASQKVLFPEVVQF TATEDLMAQKIMPDLQGLGFELTDLGARNYAVNAIPAGLEGINPVAMIRDMVATAMEKGA GLHDEINEALALSLARNAALPYGQVLGNDEMESLVNALFTCQNVNYTPNGQKILAIMAQT DLEKLLG >gi|283510511|gb|ACQH01000108.1| GENE 10 13040 - 14191 983 383 aa, chain + ## HITS:1 COG:no KEGG:BVU_0414 NR:ns ## KEGG: BVU_0414 # Name: not_defined # Def: major outer membrane protein OmpA # Organism: B.vulgatus # Pathway: not_defined # 1 380 1 397 399 221 34.0 4e-56 MKKLLIVLALAGVSTASMAQDSDPVQKYSVATNSFWNNWFIQVGGEWNAWYSDQEHGKNL PVSPFKSFRGSPSAAIAVGKWFTPGIGLRTKLQGIWGRTVVDENKRENKYWNLNEHVLLN VSNMVCGYNPNRVWNLIPFFGGGLGRTMTHDLYAMNLSVGVLNTFRVCDRMAINLELGWN RFEEDMDGRWGNTIADPHVNRGWEDKDNLLYAELGLTFNLGKSTWNKTPDVDALKALSQA QMDALNAQLNDAASENARLKNMLANQQPAETKTIKEFVTTPVSVFFNINKTNIASQKDLV NVQAVAKYAKENNSNINVVGYADSATGSVNRNQYLSDKRAEMVANELVKMGISRDKISTV GKGGVDDLSPISFNRRATVQIAE Prediction of potential genes in microbial genomes Time: Sat May 28 02:27:43 2011 Seq name: gi|283510510|gb|ACQH01000109.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.109, whole genome shotgun sequence Length of sequence - 11675 bp Number of predicted genes - 14, with homology - 11 Number of transcription units - 9, operones - 4 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 504 - 559 -0.2 1 1 Tu 1 . - CDS 636 - 872 151 ## - Prom 926 - 985 2.8 + Prom 782 - 841 3.8 2 2 Tu 1 . + CDS 1071 - 1256 81 ## + Term 1291 - 1323 2.0 - Term 1125 - 1175 14.1 3 3 Tu 1 . - CDS 1258 - 2115 1090 ## COG0623 Enoyl-[acyl-carrier-protein] reductase (NADH) - Prom 2154 - 2213 3.3 + Prom 1786 - 1845 3.5 4 4 Tu 1 . + CDS 2006 - 2227 57 ## 5 5 Tu 1 . - CDS 2521 - 4080 1514 ## COG3119 Arylsulfatase A and related enzymes - Term 4490 - 4541 10.2 6 6 Op 1 . - CDS 4590 - 5699 666 ## BF1823 putative transmembrane protein 7 6 Op 2 . - CDS 5683 - 6036 97 ## gi|260909412|ref|ZP_05916120.1| hypothetical protein HMPREF6745_0074 8 6 Op 3 . - CDS 6023 - 6586 465 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 6618 - 6677 4.7 - TRNA 6833 - 6908 75.1 # Lys TTT 0 0 - Term 6786 - 6837 14.2 9 7 Op 1 . - CDS 7015 - 8130 1046 ## COG0628 Predicted permease - Prom 8173 - 8232 4.8 10 7 Op 2 . - CDS 8296 - 8886 497 ## COG1435 Thymidine kinase - Prom 8948 - 9007 3.3 + Prom 8825 - 8884 2.5 11 8 Op 1 . + CDS 8935 - 9693 625 ## COG0313 Predicted methyltransferases 12 8 Op 2 . + CDS 9690 - 10529 990 ## PRU_1870 putative lipoprotein + Term 10652 - 10696 7.4 13 9 Op 1 . + CDS 10701 - 11267 426 ## gi|288929639|ref|ZP_06423483.1| hypothetical protein HMPREF0670_02377 14 9 Op 2 . 
+ CDS 11289 - 11534 86 ## gi|288929640|ref|ZP_06423484.1| hypothetical protein HMPREF0670_02378 Predicted protein(s) >gi|283510510|gb|ACQH01000109.1| GENE 1 636 - 872 151 78 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYEVPDKDMMKIEIMPLSSVANMAQSQNANPARVFSIHSLQVERIFPGTHNEKKVKSTSL YFLCSYVNMDVFMSKHER >gi|283510510|gb|ACQH01000109.1| GENE 2 1071 - 1256 81 61 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAAANYLRPKKAEREFVNFKKTTNATLALGKSGVCRMYTECVAIANIWCNYLHTSQREQH L >gi|283510510|gb|ACQH01000109.1| GENE 3 1258 - 2115 1090 285 aa, chain - ## HITS:1 COG:AGc1374 KEGG:ns NR:ns ## COG: AGc1374 COG0623 # Protein_GI_number: 15888100 # Func_class: I Lipid transport and metabolism # Function: Enoyl-[acyl-carrier-protein] reductase (NADH) # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 5 260 7 259 272 120 31.0 2e-27 MSYNLLKGKRGIIFGALNDMSIAWKVAERCAEEGASIVLSNTEMALRMGELDALAKKINA PVVAADATSYEDLENVFVKAQELLGGKIDFVLHSIGMSPNVRKRRTYDDLDYNWLNKTLD ISAISFHKMLQAAKKVDAIAEYGSVVALTYIASHRTFFGYNDMADAKSLLESIARSFGYI YGREKNVRINTISQSPTPTTAGKGIKDIDNMMDFADKMSPLGNASALECADYCVTLFSDL TRKVTMQTLFHDGGFSNMGMSLRAMNQYSKDLGPYEDENGKIIYG >gi|283510510|gb|ACQH01000109.1| GENE 4 2006 - 2227 57 73 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLAPSSAQRSATFQAMDMSLSAPKIIPRLPFNKLYDIIVLIIIRCKITLFKLEYQEFEAK RANERGKSEKVKR >gi|283510510|gb|ACQH01000109.1| GENE 5 2521 - 4080 1514 519 aa, chain - ## HITS:1 COG:STM0035 KEGG:ns NR:ns ## COG: STM0035 COG3119 # Protein_GI_number: 16763425 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Salmonella typhimurium LT2 # 35 507 29 467 497 159 26.0 2e-38 MKKMKTYPALLAPALLTAVGAHAKQVKNNRTDDTRRPNVIIIYADDLGYGDLQCYGAKNV ETPNVNRLASQGIRFTNAHAVAATSTPSRYSLLTGEYAWRRADTDVAPGNAGMIIRPEQF TMADMFKSVGYATCAIGKWHLGLGDKGGEQDWNAQLPAALGDLGFDYSYIMAATSDRVPC VFIENGRVANYDPSAPIEVSYKRNFEGEPTGKAHPELLYNLKSSHGHDMSIVNGIGRIGF MKGGGKALWKDENIADSITAHAINFIRDNQNRPFFMYFATNDVHVPRFPHARFRGKNPMG YRGDAIVQFDWSVGQLMKALEQYGLADNTLIILSSDNGPVVDDGYADRAEELLNGHSPAG 
PFRGNKYSAFEGGTAIPAIVSWKKGVKGGRQSNVLMSQIDWLASLAQLVGARLPKGTACD SEPRLGNLLGTDTQNRPWIIEQSSSRVLSVRTPQWKFIEPSDGEKMISWGPKIETGNNPT PQLYDLSKGEYEKDNVANENPNVVYELQNVLRRERTKKR >gi|283510510|gb|ACQH01000109.1| GENE 6 4590 - 5699 666 369 aa, chain - ## HITS:1 COG:no KEGG:BF1823 NR:ns ## KEGG: BF1823 # Name: not_defined # Def: putative transmembrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 366 3 357 357 84 24.0 6e-15 MKTKDSLILLLLATFMVACNNKYQGSATTIVNEDGTCSRELSFQLDSAQLVSGVFNADES MLRWDKGWQLTWGIKGDSLRHAFPVELKAYQALSRRCQKDGRTAADTILVYAKCNFASVE EMAKATTFIVGSRRITPRITFERTRGFFRTEYRYEEVYPQQHINCPIPMSKFFTKDEIGF WFSGTPDLVGGLNGIEIDDITQHLKTKYTQWLVANDFELTYQTILKNYSRIAAKAIDKQR FVALHDTLFREYSRETDLTAVHMDRSKWFKKIFHSDVYSEILDNDTLMREVDVAQRDFLA LSMFNIDYKLTFLGADYPPTLSRLTGNRLMAGDVVISPCIIKKNLWAFALTGAVLVLSLV LLVLKRKKR >gi|283510510|gb|ACQH01000109.1| GENE 7 5683 - 6036 97 117 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260909412|ref|ZP_05916120.1| ## NR: gi|260909412|ref|ZP_05916120.1| hypothetical protein HMPREF6745_0074 [Prevotella sp. oral taxon 472 str. 
F0295] # 1 39 1 39 168 67 76.0 4e-10 MNKNDNLYADVIRGLQGKRPSLEHPDDMIDGVMARIEQRNRQRAYVIVLRACAIAASLAL VFGISECVRTVPKNNDMALLYLQETLPQKADGVYFQNKYLKRNQYELIKQSIYENKR >gi|283510510|gb|ACQH01000109.1| GENE 8 6023 - 6586 465 187 aa, chain - ## HITS:1 COG:BS_sigW KEGG:ns NR:ns ## COG: BS_sigW COG1595 # Protein_GI_number: 16077241 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus subtilis # 6 179 5 184 187 79 23.0 3e-15 MEQNEVNQIVVQCQEGSKESFRRLVREYQSMVFSLTLKMLCDENDAEDACQDTFVSVWLN LHKYNSEKGKFSTWIYSIASNICLDKLKRRKPQLPMLSDERAFKAYATNPSPERQLMNSE WVSIVRVLAAQLSPKQQLVFTLRVLENVPIEEIESITNMNATKIKSNLYVARQQIKQQLI RLGYEQE >gi|283510510|gb|ACQH01000109.1| GENE 9 7015 - 8130 1046 371 aa, chain - ## HITS:1 COG:SA1192 KEGG:ns NR:ns ## COG: SA1192 COG0628 # Protein_GI_number: 15926939 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Staphylococcus aureus N315 # 11 338 33 361 402 110 26.0 4e-24 MLEEKITLDKLIRWVVCGLIIFAVFFITNSMSEVLLPFFVACLLAYLLYPLVKFTEKYLH IRIRTLSILFVLLVLVGLITGVGLLVVPSLIDQFDKISELAARYLGSSKSSGDLSGYIQH WLREHQEDIDRLIRSKDFNDAVRTAMPKLFSLLGQTASVVVSIVASCITLLYMFFILLDY EFLTDNWVRIFPKKYRPFWNELAQDASRELNNYIRGQGLVSLIMGIMFCIGFTIIDFPMA IGMGILIGVMNLVPYLHTFALIPTAFLALLKAADTGQNFWIIFGTAILVFAVVQIISDMV VTPRVMGKAMGLNPAIILLSLSVWGSLLGFIGVIIALPLTTLIIAYWQRYVTKEKVDVEP ETTETPAKTDA >gi|283510510|gb|ACQH01000109.1| GENE 10 8296 - 8886 497 196 aa, chain - ## HITS:1 COG:BH3779 KEGG:ns NR:ns ## COG: BH3779 COG1435 # Protein_GI_number: 15616341 # Func_class: F Nucleotide transport and metabolism # Function: Thymidine kinase # Organism: Bacillus halodurans # 10 186 5 188 204 189 50.0 2e-48 MIDNVTGENHRQGRVEVICGSMFSGKTEELIRRMKRAVFAKQRVEIFKPAIDTRFSEDDV VSHDRNAIHCTPVDSSAAILLLSADIDVVGIDEAQFMDEGLVPVCNELANRGVRVIVAGL DMDFKGVPFGPMPALCAIADHVTKVHAICVKCGALAYVSHRIVEGEKRVMLGEMQEYEPL CRHCYQKAMQGGKGEC >gi|283510510|gb|ACQH01000109.1| GENE 11 8935 - 9693 625 252 aa, chain + ## HITS:1 COG:sll0818 KEGG:ns NR:ns ## 
COG: sll0818 COG0313 # Protein_GI_number: 16330705 # Func_class: R General function prediction only # Function: Predicted methyltransferases # Organism: Synechocystis # 17 251 1 234 279 220 49.0 2e-57 MLFLLLGIYIAIFAYNMGILYIVPTPVGNMEDITLRAIRTLKEADLVLAEDTRTTGILLK HLGIEVAHLQSHHKFNEHGTSAGIVERLKAGQNIALVSDAGTPGISDPGFFLAREAVAAG ITVNCLPGATACIPAVVSSGMPCDRFCFEGFIPQKKGRKTYLESLKDEQRTMVFYESPYR LVKTLTQLAEVVSTERQVCVCREISKLHEENVRGTLQEVAKHFEQTPPRGEIVIVLAGTS SKKDKKSKGEKL >gi|283510510|gb|ACQH01000109.1| GENE 12 9690 - 10529 990 279 aa, chain + ## HITS:1 COG:no KEGG:PRU_1870 NR:ns ## KEGG: PRU_1870 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 1 279 1 288 288 228 45.0 2e-58 MKKLVIACFAVLSLAACKQGKTNGEPQLNDSLQQIIQQRDNQLNDMMATMNEIQDGFRQI NEAENHVNIAKDGEGANKSEQIKENIKFITQRMEENRALLKKLRAQLEKSTFKGAELKKT VDRLMKQLDEKDQQLQQLRAELDAKDIHISELDEKVNNLNTHVTKLKTDNTAKAQTISDQ DKQIHTAWYVFGTKKELKEQGILVDGKVLQSAFNKAYFTRVDIRTLKEIKFYSKSVKLLT THPASSYSLEKDVNKEYVLTITNPDVFWSTSKYFVAVVR >gi|283510510|gb|ACQH01000109.1| GENE 13 10701 - 11267 426 188 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929639|ref|ZP_06423483.1| ## NR: gi|288929639|ref|ZP_06423483.1| hypothetical protein HMPREF0670_02377 [Prevotella sp. oral taxon 317 str. F0108] # 1 188 1 188 188 304 100.0 2e-81 MNHNNQLDFFRAILIVLVILIHIVNFGDHYPLLKNAILAFLMPSFLVVTGFLVNINKPLK TYALYLSKIVLAYVVMVSGYAALSLFLPVRDGLTQPTWQAFAHVLFIKSIGPYWFLHLMV VCGVLYYASFRVVPKISTAAKLSIFATFIILTAQFTPLLNIRFAAYYFVGVVIRHLVKDF SKVYESSL >gi|283510510|gb|ACQH01000109.1| GENE 14 11289 - 11534 86 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929640|ref|ZP_06423484.1| ## NR: gi|288929640|ref|ZP_06423484.1| hypothetical protein HMPREF0670_02378 [Prevotella sp. oral taxon 317 str. F0108] # 1 81 1 81 81 142 100.0 7e-33 MVGTPRFQDWGTISILVSVFCFFCFAAKLHTYLGNKIKAAAHYIGRNTFPIYIFHPIFTM LAKFILPAFAFEPTGLLHAAV Prediction of potential genes in microbial genomes Time: Sat May 28 02:28:34 2011 Seq name: gi|283510509|gb|ACQH01000110.1| Prevotella sp. 
oral taxon 317 str. F0108 cont2.110, whole genome shotgun sequence Length of sequence - 42976 bp Number of predicted genes - 38, with homology - 31 Number of transcription units - 22, operones - 11 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 471 504 ## COG0590 Cytosine/adenosine deaminases 2 1 Op 2 . - CDS 496 - 957 -126 ## - Prom 1016 - 1075 3.0 3 2 Op 1 . - CDS 2290 - 2718 109 ## gi|260591517|ref|ZP_05856975.1| hypothetical protein HMPREF0973_00950 4 2 Op 2 . - CDS 2735 - 2977 98 ## - Prom 3146 - 3205 6.8 - Term 3155 - 3220 17.0 5 3 Op 1 4/0.000 - CDS 3269 - 5314 2228 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 5335 - 5394 4.5 6 3 Op 2 . - CDS 5436 - 6548 1159 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 6576 - 6635 2.2 7 4 Op 1 . - CDS 6638 - 8407 1658 ## gi|288929644|ref|ZP_06423488.1| hypothetical protein HMPREF0670_02382 8 4 Op 2 . - CDS 8501 - 9781 1253 ## gi|288929645|ref|ZP_06423489.1| hypothetical protein HMPREF0670_02383 9 4 Op 3 . - CDS 9788 - 11263 1487 ## Cpin_1098 hypothetical protein 10 4 Op 4 . - CDS 11272 - 14640 3567 ## BT_3271 hypothetical protein - Term 14706 - 14754 2.0 11 5 Tu 1 . - CDS 14798 - 15778 1055 ## gi|288929649|ref|ZP_06423493.1| hypothetical protein HMPREF0670_02387 - Prom 15985 - 16044 6.4 + Prom 16170 - 16229 2.2 12 6 Tu 1 . + CDS 16281 - 16502 56 ## + Term 16508 - 16567 6.0 + Prom 17314 - 17373 4.5 13 7 Op 1 . + CDS 17393 - 17767 440 ## COG0792 Predicted endonuclease distantly related to archaeal Holliday junction resolvase 14 7 Op 2 . + CDS 17770 - 18525 703 ## COG0340 Biotin-(acetyl-CoA carboxylase) ligase 15 7 Op 3 . + CDS 18577 - 19281 922 ## COG0528 Uridylate kinase + Prom 19392 - 19451 5.1 16 8 Tu 1 . + CDS 19475 - 20779 1331 ## COG1322 Uncharacterized protein conserved in bacteria + Term 20968 - 21006 -0.6 - Term 20866 - 20929 -0.3 17 9 Op 1 . - CDS 21001 - 21933 608 ## COG0320 Lipoate synthase 18 9 Op 2 . 
- CDS 21945 - 22151 104 ## - Prom 22203 - 22262 4.1 + Prom 22112 - 22171 5.0 19 10 Tu 1 . + CDS 22195 - 22689 697 ## COG0716 Flavodoxins + Term 22739 - 22776 3.0 + Prom 22762 - 22821 1.7 20 11 Tu 1 . + CDS 22973 - 23536 599 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family + Prom 23708 - 23767 2.3 21 12 Tu 1 . + CDS 23874 - 24413 372 ## gi|288929657|ref|ZP_06423501.1| hypothetical protein HMPREF0670_02395 + Term 24424 - 24468 -0.2 22 13 Op 1 . + CDS 24549 - 24950 158 ## gi|288929658|ref|ZP_06423502.1| hypothetical protein HMPREF0670_02396 + Prom 24952 - 25011 1.7 23 13 Op 2 . + CDS 25041 - 25937 644 ## gi|288929659|ref|ZP_06423503.1| hypothetical protein HMPREF0670_02397 24 14 Op 1 . - CDS 27739 - 28488 890 ## COG0708 Exonuclease III 25 14 Op 2 . - CDS 28493 - 29137 640 ## gi|288929661|ref|ZP_06423505.1| hypothetical protein HMPREF0670_02399 26 14 Op 3 . - CDS 29213 - 29794 639 ## COG0302 GTP cyclohydrolase I 27 14 Op 4 . - CDS 29844 - 30536 959 ## PRU_1861 hypothetical protein - Term 30617 - 30680 6.2 28 15 Op 1 . - CDS 30702 - 32780 1824 ## COG0149 Triosephosphate isomerase 29 15 Op 2 . - CDS 32853 - 33410 759 ## PRU_1858 hypothetical protein - Prom 33505 - 33564 3.6 - Term 33533 - 33562 2.1 30 16 Tu 1 . - CDS 33574 - 33981 479 ## gi|288929666|ref|ZP_06423510.1| hypothetical protein HMPREF0670_02404 - Prom 34132 - 34191 6.1 + Prom 33839 - 33898 2.2 31 17 Tu 1 . + CDS 33995 - 34195 59 ## + Prom 34254 - 34313 2.1 32 18 Tu 1 . + CDS 34354 - 35787 1681 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase + Term 35910 - 35943 1.5 + Prom 37237 - 37296 2.5 33 19 Tu 1 . + CDS 37340 - 37552 58 ## + Term 37591 - 37620 -0.3 - Term 37916 - 37954 9.2 34 20 Op 1 . - CDS 38006 - 38794 759 ## COG0755 ABC-type transport system involved in cytochrome c biogenesis, permease component 35 20 Op 2 . 
- CDS 38791 - 40023 1214 ## BT_1416 hypothetical protein - Prom 40194 - 40253 4.3 - Term 40284 - 40340 12.3 36 21 Op 1 1/0.000 - CDS 40392 - 41876 1545 ## COG3303 Formate-dependent nitrite reductase, periplasmic cytochrome c552 subunit 37 21 Op 2 . - CDS 41961 - 42563 483 ## COG3005 Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c subunit - Prom 42678 - 42737 3.8 38 22 Tu 1 . - CDS 42757 - 42975 71 ## Predicted protein(s) >gi|283510509|gb|ACQH01000110.1| GENE 1 3 - 471 504 156 aa, chain - ## HITS:1 COG:SA0516 KEGG:ns NR:ns ## COG: SA0516 COG0590 # Protein_GI_number: 15926236 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Staphylococcus aureus N315 # 14 156 1 148 156 118 47.0 5e-27 MAQKDEQKQLDEQMQVDEQMMRKALDEARQALAQGEVPIGAVIACKGRVVARAHNLTETL CDVTAHAEMQAITAAANMLGGKYLPECTLYVTVEPCPMCAGACGWAQLGRIVYGTRDEKR GYERFAPNVLHAKATVTAGVLEEECRELMKSFFSGL >gi|283510509|gb|ACQH01000110.1| GENE 2 496 - 957 -126 153 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAKSITPLLLRSLTPYHSIPYYSIPYYSIPYYSTPYYLLLYSLLLTPLLLTPLLLTTYYL LLTTYYLLLTTYYLLLTTSYLLLPTYYFLPYYSFLFQGITTFCQRQMGHKSRLYRIFVQY FVHERGEFILVLQHKDNILRSIYRHIKSEAGFI >gi|283510509|gb|ACQH01000110.1| GENE 3 2290 - 2718 109 142 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260591517|ref|ZP_05856975.1| ## NR: gi|260591517|ref|ZP_05856975.1| hypothetical protein HMPREF0973_00950 [Prevotella veroralis F0319] # 7 140 14 146 148 100 35.0 3e-20 MKKYLYLFLLILCASCSHHNKVDDVVISYWNAHKAEKNVVIDFSKCFDFEWDTLCFYSVG CSLDEINQDLGFELAEYTDLADRMVFLNHKKWAYTAGWWYDPEEPKGIIISTKNNIFKVS YHKAKFKIKREGNIFILEPLDF >gi|283510509|gb|ACQH01000110.1| GENE 4 2735 - 2977 98 80 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGLDFRFYCKFVPYSVHERSAYLVFSRNTKIISTAQFIEILEIKTSMYKIFYGIYHRGGL LQRRLEFTIEKLSKNQKTYE >gi|283510509|gb|ACQH01000110.1| GENE 5 3269 - 5314 2228 681 aa, chain - ## HITS:1 COG:DR0189 KEGG:ns NR:ns ## COG: DR0189 COG0526 # Protein_GI_number: 15805225 # 
Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Deinococcus radiodurans # 241 369 47 167 185 77 34.0 8e-14 MKKELICLLLSCAGTAVAQQNSSYVIKGTMTVDSLRYTPERIKKVYLTHEVQGQDVVVDS ATVENGRFTFKGEAPGVLAPYRISGFDNGTVQLFLEPGEITVLPFDGRFPVGASVKGTPG NDVLFAYQQSGSGAVDIAKQRMDKVLASLPPEQRNDEKAFYPYQRATYYVNSLSHRTTAM RFVARHLDSPVTLYIIKYDLFRFFTPQVLEETYLNAVPAAVRKHPMYRELTNLVRAASLE VGKPAPDIDGLTPDGKPLALSDLKGKYVLIDFWASWCAPCRREFPVIKQALQETQGKVPF MVLSYSIDSKKKEWTDCIERNSLTHANWQHISALKGWGSPAAKLYNVEAVPRTVLISPEG NIVAFDLRGEQLIEAVRKMKTTSAKPAENQKAKADAAKAVASEVKEIPDKPLYEEYLALE RQRDQQIAQDLAQLRSTKGEAYLNTADGKIDAEGIRRAADIGWTAERFQFLLDHNTSPLM PLLVERDMLPIFNKEYGRQLSAAVAPQVLTHPYARSLENNVRSRNLMQGSDVPDIALPLA NGTVRQLSNSLGKFVLLTFWASGCPTCQRDLPLLKKLYDETRGSKDKFEMIGFSLDKNPK DWKKAMKTLGIDAADWLQACDFKGGASPSVRLFNVGETPTNVLIDPEGKAISLTLQGEEL VTRVKQILSGDLYYQKEDSKK >gi|283510509|gb|ACQH01000110.1| GENE 6 5436 - 6548 1159 370 aa, chain - ## HITS:1 COG:aq_152 KEGG:ns NR:ns ## COG: aq_152 COG0526 # Protein_GI_number: 15605725 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Aquifex aeolicus # 229 295 36 102 146 69 37.0 1e-11 MYKYFFTFLFSLFATTLLAQQSSSIEGEIEGISEGQLQLVVRSSESRWDLIHTVAFANGR FSMPNIELTEPLPARLLVAGYQGGFSFFIEPGTDYRALLRNDEGWFVRGKGLQDTDRAYQ QKCLSLMQSVAKLQQRADSLRKALRYGSASRVNDTIAQLQKALETERLNFISANDNILSA SLLLQEAESKDAPLEACQQLYAQLGSKAQQSRSGLILKQRIERLQQVSKGSKAPDFTLPT SDGRKFTLSKMPGKVKIVDFWASWCGQCRLNNPVLRQLYADFHAAGLEIVNVSLDEKRDR WLAAVKQDQLTWTQVSSLKGWKDEVAKSYSVTAIPAIFVLDANNNILATGLHGDDLRKFV TNLFSNKAAK >gi|283510509|gb|ACQH01000110.1| GENE 7 6638 - 8407 1658 589 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929644|ref|ZP_06423488.1| ## NR: gi|288929644|ref|ZP_06423488.1| hypothetical protein HMPREF0670_02382 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 589 1 589 589 1135 100.0 0 MTNNIIILAVSKHLRTALRSPWKSVGSKAACLSLLVAGMFSSCYEDKGNYTYSFDGMNAI DTLIFSPASVETLTGQTIEFTQPLTADETHKRIEVQVGQTLLKNTDHLDFQWIRSYQKGK ETVTDTFTTKGYLDVELPVGKATRFTVHLKVHDRSTGLSRYQNFAVATRPIYKNSLFFLH GKAGSVQLGNVETVGAVTNVRSDAYKLIFKEEKNPFTDAYKLMYQYGLVAEGRDFLKKQN FIVFLNKGEVKAYDPFGLTPRYTAYKDFVVPVSAQGAFLAEKIGMTGDPSNQSDFYYMLG KDGRFLTARTIPSFKTPATGGSVANYNVTAAAITASHFVLWDGRNNRFLYVSKDDGYGIW AEQQAYQAQLNNPVLDAHVDFSGLKEGLSPVGKKAIYGFIQYRENYDKATPYFIFRDKAA PNYYLYQLTSTATDGDKGSGGSGDEPAYTIKGQKLEGFTPSSVATILYNTWFSTNYLFYA DGADVVRYNVNNGDKAVLYSAPAGYTIACMKFRSEDTFIYSTDLGRYLSIGLNNGANGAI AEIKLNTASDVDETYTPLFYDKDSEGHKLGNILDVQFVREYSYALPSNK >gi|283510509|gb|ACQH01000110.1| GENE 8 8501 - 9781 1253 426 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929645|ref|ZP_06423489.1| ## NR: gi|288929645|ref|ZP_06423489.1| hypothetical protein HMPREF0670_02383 [Prevotella sp. oral taxon 317 str. F0108] # 1 426 1 426 426 864 100.0 0 MKMKNKFGYVVAVALGCMLSSACKEEMVPLYADDGDGVYFNYGNKDALNATVNFADSILT APKEIGVSLKLKLMGRNSDQPRKVVLKSRAVQGVAEATVVCPEVVFKPGENEKSVMVLAQ RPSLVDSTFKAEVYIDASDPASQIGEGIKDFQSFTLHVKEAYTKPSQWDGMGGSYFGTWS AAKQKLLVKVTKRNNFYAENDYSKFVQWNLAAVDSLRRQQQAAPQTAIEVDIPFTNDNTY AKPWYWTSLQDTYLGSYQSGSFVGLCNSMDITTANEYAQLGGNEAAMKAVNLRAVNLMMQ RYNTFYLDGWRPGNSYKGSFYIPMLPNVAYELVEPQPWKDQQGGKAMIEKYYGPYSAAKY RFMINVWLAHKGADFVLNQLFPVMNEWGDVHWDASLGGEAAIKACNKLFREKAAQGTYSF TFPTVP >gi|283510509|gb|ACQH01000110.1| GENE 9 9788 - 11263 1487 491 aa, chain - ## HITS:1 COG:no KEGG:Cpin_1098 NR:ns ## KEGG: Cpin_1098 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 19 473 9 486 488 199 32.0 2e-49 MKHLIFPSILYKVCRMGTLLLACSLVTACSGWFDVSPKTDLKADEMFSTESGFESSLTGI YLLMTDQGAYGGNMSFGLLDQLAQQYDYIPDGANDRAAIYNYATATSDGFRTKQKIAQSW LKLYNVIANCNNYIKWLDRNGESVIRNVKTRNTFRAEALAVRAYCHFDLLRAWGPWKYGK DAAAADVPCIPYRMVADNSKQPLLPAKEIVKLALKDLGEAKELLAHEKKMRLDNSERRFR FNYHAVNALMARIYCYAGDAANAVACAQDVVDNCGLTLVSSNQDDPILFSECIAALNIYH LTETFSSHWADGEKYTTQYFIRSERFNSYFEVSGDRREDMRSKSSAFYEHAAAKTALSRK 
YNTNPKEAVPLIRLPEMYYILCEMTPDLTQAARFLNAVRNKRGYSRSLNVNFTDEASRMK ALDREFRKEFYAEGQYWYFLKRHGITELTYDNSIVLSKERFVFPLPDAEKEYGWTASHDA QAGKQTPQTNP >gi|283510509|gb|ACQH01000110.1| GENE 10 11272 - 14640 3567 1122 aa, chain - ## HITS:1 COG:no KEGG:BT_3271 NR:ns ## KEGG: BT_3271 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 30 1121 13 1116 1116 593 35.0 1e-167 MRTTENNHQTALGWLCMLCAMLLCVTSAKAQSGALDDWKNTPVTLRVSNEPLGTVLEKVA KQANATIEFQEVALVGIARPTTLNVKDVPLDKVLGRLIGDQNVLVRYETGRHIIISSYNR DVEDKEMLRIEGVVTDAKTNETLVGATVMLTDGTKTGGVKGSITDVNGKFSLQVPRKSSI KVSYMGYESVSLQITKPNPNMKIALKADGHMVEEVVVTGLSKRSKSSFTGNYVSVKGEEL RKLNPNNILKSLQFYDPSFKVLESNTRGSDPNAQPEFLIRGDQNLGTTASLNSMDLLLDN VSSRPNTPLFVLDGFIVSMSRILQLDPERIENVTILKDAAATAIYGSRASNGVVVIETKV APDGVLSVSYNGGLTIQAPDLTDYKLMNAREKLQMEWDAGVYEPSNDKSMNRYNQLLRNV LAGVDTYWLSKPLRTAITHRHSISAAGGTEVFRYSLDLNAAFNPGVMKGSEQNNKSLNFS MTYRKENVSVGASLNLSETDGHNSPYGSFSEYTRVNPYYRPTNDFGEYLQQLDNHTGSGS VPIANPLYNANVGIKDLTSNTNITASLNLEYRLLKNLRLTENVSYMRGMARTENFLPATH TVFITELDKTLRGSYTKNTGEMTSWSSNLGLNWNMAFKKHLLSVFGNWTLSEDRSNYVNL SAVGYPDSHMDDFIFGNKMRTHPSGSESIARSMGLITQLSYSYDNRYSVDVNVSSEITSR YADHNLTPFWSVGLRWNAYREKWLAGRVSNLVFRATYGVTGAQNSSPADAIEYYTFSQTM RPYASFSTLGAVLSRLNNPNIKWTKTDNMSFGLDVGVWKNRANLSFNYYNNISRQLLTNY DLAPSTGFESQYINAGELQNRGFDATLNVIALQNIRKQFYWTVALGMNHNRNKIRKISDY LRKMNEEQLKSKSAPLPVYQEGKSTTTYYAVRSLGIDPMTGNEVFLTRNGEKTFKWNSAD KVPVGDTTPKMNGTISSSVNWRDLSCTLGFIYKWGGIVYNQTLVDKIENSNIAYNLDRRA AQDRWRKPGDVTKYKRIDLNGSQTPASDRFIMKDNELRLASVNVGYRFKQTDYKLLSRLN IDVLSLNFTTNDLLRLSTVKMERGLGYPFSRSYTLTCSIIFK >gi|283510509|gb|ACQH01000110.1| GENE 11 14798 - 15778 1055 326 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929649|ref|ZP_06423493.1| ## NR: gi|288929649|ref|ZP_06423493.1| hypothetical protein HMPREF0670_02387 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 326 1 326 326 640 100.0 0 MTAKTRKFLLGALCVLAFNGYAQKKTTLSAEVYGYKQEMVYFDCLQSPFFNAEFHTNPGD EHLYSFETKNPVCMLINGQVNVLLMPGDSIHAVVRYDGKRPSEVVFSGSEKAVSANRLIS NIDDMRRQMRYKSQLLACVVVDIKPQQRIEDSRTLLSKVQEMVAKQKSQLSDVAAEYILA NTEAAAYISFMEYPQMYADTRKTPIAEQGIGDYWKLMDGVKTRSGEGALRNPDYVSFLMR YCFYENEKKAVAKKATYTMPRQLETMYKELAAFYQGAQRDAVLYQLLVSFIRNGKELERA KPLFAEYKAKHNVDKGYLAILEKLME >gi|283510509|gb|ACQH01000110.1| GENE 12 16281 - 16502 56 73 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MICQLFSSTFNIMSALFANQLPTLSQKIDSRGGLFRCKKWSVVHAVKMNFYLNRPPLTPV FGRFAAKCSAFWC >gi|283510509|gb|ACQH01000110.1| GENE 13 17393 - 17767 440 124 aa, chain + ## HITS:1 COG:Rv2898c KEGG:ns NR:ns ## COG: Rv2898c COG0792 # Protein_GI_number: 15610035 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease distantly related to archaeal Holliday junction resolvase # Organism: Mycobacterium tuberculosis H37Rv # 7 116 12 125 128 67 40.0 5e-12 MAKHNDLGKWGEDLAAQYLQKQGFVIRERDWHCGKRDIDLVAITADLTTVVFVEVKTRTT DEVSEPADAVNRQKIRNLGIAANNYIKQFNVVEQVRFDVVTIVGNSNENAKLEHIEDAFN PLLA >gi|283510509|gb|ACQH01000110.1| GENE 14 17770 - 18525 703 251 aa, chain + ## HITS:1 COG:lin2018_2 KEGG:ns NR:ns ## COG: lin2018_2 COG0340 # Protein_GI_number: 16801084 # Func_class: H Coenzyme transport and metabolism # Function: Biotin-(acetyl-CoA carboxylase) ligase # Organism: Listeria innocua # 15 232 31 241 253 79 25.0 8e-15 MKIRQIHLEETTSTNTYIKGLALEADEMVVVSTLWQTAGRGQGVNRWESERGKNLLFSLM ARPSFVAVQQQFALSMAGALAIKDVLDRFTPDIRIKWPNDIYWRDRKISGTLIETSLAHG EISRFVYGIGLNVNQVRFVSDAPNPVSLAQITGKQVALMPLLDSIVEAFLAQWDLLKAGQ THLIAARYNASLYRATGFHRYNDKQGAFDAELVGVETDGLITLRDRQGQLRTYSQPYATN GEMADKLRFEI >gi|283510509|gb|ACQH01000110.1| GENE 15 18577 - 19281 922 234 aa, chain + ## HITS:1 COG:FN1622 KEGG:ns NR:ns ## COG: FN1622 COG0528 # Protein_GI_number: 19704943 # Func_class: F Nucleotide transport and metabolism # Function: Uridylate kinase # Organism: Fusobacterium nucleatum # 2 232 6 236 239 260 56.0 1e-69 
MYKRILLKLSGESLMGTQSHGIDPQRLQEYAQQIKAVHEMGVQIGIVIGGGNFFRGLSGA QKGFDRVKGDQMGMCATVMNSLALSSALEAIGVKTKVLTAIRMEPIGEFYSKWKAIESME SGYVCIFSAGTGSPYFTTDTGSSLRGIEIEADVMLKGTRVDGIYTADPEKDPTATKFAEI TYDEIYTRGLKVMDLTATTMCKENNLPIYVFNMDVMGNLERVMRGENIGTLVHN >gi|283510509|gb|ACQH01000110.1| GENE 16 19475 - 20779 1331 434 aa, chain + ## HITS:1 COG:STM3969 KEGG:ns NR:ns ## COG: STM3969 COG1322 # Protein_GI_number: 16767239 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Salmonella typhimurium LT2 # 55 426 83 436 476 197 34.0 5e-50 MEIVFALIVIVAVVFVFVLFKNTKAGYERQIDELKKQLETEREYAKRMLDQADDNLQRAN QNLDRERQHAEDLRRESDQRWEQKLESLKKDMQNTAAKELAAKQEELQRANRTQMDDLLR PIKEQFTDFKRAVDESKTQNEVNKTELQKAFENTMKLFQQQQQQTVDSLKQQTERIGENA ANLSRALKGESKTQGDWGEMVLETLLENSGLQRDEEFFVQETVKSEDGATFRPDVVVRFP EGRSVVIDSKVSLTAYADAVATDDEHERDRLLAEHAKSVRRHVDELSAKSYDKLVEDAIG FVLMFVPNENSYIAAMKQQPDLSRYAYQKRIIIISPSNLLMALQLAYNLWQYDRQSKNVE KIVKTAADLYDKVAGFTETFTDLEGQLQRLARNFEQARGQLFDGKGNVLRRIDGLRALGV TPKKRIKGLEEEGL >gi|283510509|gb|ACQH01000110.1| GENE 17 21001 - 21933 608 310 aa, chain - ## HITS:1 COG:VNG2216G KEGG:ns NR:ns ## COG: VNG2216G COG0320 # Protein_GI_number: 15791039 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate synthase # Organism: Halobacterium sp. 
NRC-1 # 16 298 5 296 311 266 45.0 3e-71 MCYTMEEQSIRKPYLRKPKWLYTKIEGGEKYGFVEKTLEENGLHTICSSGKCPNKGHCWK SGTATFMILGDRCTRACRFCATQTGNPLPPDPHEPLKIARSVRAMELRHCVITSVDRDDL PDKGAAHWAETIRRIRELCPEIIIEVLIPDYRGQELQTVLAAAPDIVGHNLETVERLTPS VRHRATYRNSMGTLREIAEQGFITKTGIMLGLGETQEEVNALLQEAYDVGCRMITIGQYL QPSEKQLDVVEYVHPDKFLSYKRTAVRIGYENAESSPFARSSYMAEQSFISMLVRKKKLE GKNKGEKVKG >gi|283510509|gb|ACQH01000110.1| GENE 18 21945 - 22151 104 68 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFIRNAYECYFCLLLGLLFRARTKNVWACPKDGAQKSALYLLLFRLKTLSLSCKLLRPRY AEYSVQHH >gi|283510509|gb|ACQH01000110.1| GENE 19 22195 - 22689 697 164 aa, chain + ## HITS:1 COG:Cj1382c KEGG:ns NR:ns ## COG: Cj1382c COG0716 # Protein_GI_number: 15792705 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Campylobacter jejuni # 6 162 5 160 163 152 55.0 2e-37 MKKTIVIFGSSMGTCQGIAEKIAEKLDVEAINVGDLTADVLNENENLLLGTSTWGSGEMQ DDWYEGVKLLDEVGLKGKTVAIFGCGDSNSNADTFCGGMAELCEAATNAGANVLEGVPTE GYTFEDSPAVKDGKFVGLALDNENEADKTDERIDAWLAQIKPAL >gi|283510509|gb|ACQH01000110.1| GENE 20 22973 - 23536 599 187 aa, chain + ## HITS:1 COG:lin1042 KEGG:ns NR:ns ## COG: lin1042 COG1853 # Protein_GI_number: 16800111 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Listeria innocua # 1 183 1 180 183 99 33.0 4e-21 MKVKLDTKKFYPGYPVFIVSYYGKDGQPKLSTLSSSFSIGNVFAMGMHANSYLSSCIKHK KGLCINFLTRNYLRAIERAGLVSNFDCETKTADTELTFSNNETTNSPYINEASVVLACET KDEITLRDEGLKVVLADIQERLCEEKLIVDGHFKYEELDLPLLEADSSGIFYRFMNKELQ KGGSFFK >gi|283510509|gb|ACQH01000110.1| GENE 21 23874 - 24413 372 179 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929657|ref|ZP_06423501.1| ## NR: gi|288929657|ref|ZP_06423501.1| hypothetical protein HMPREF0670_02395 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 179 1 179 179 328 100.0 6e-89 MGKNKLLTIAQSITLALAMASCTNKHPYQVRQDCYAAIETYVGSHKEYNSFLLLSTRKLF NEDGTHPGFLIGPLYKGLDKELKNFAPIEFLEVDGKKVYLFSEVSYLTNNDNNPINNYCK PDSTLLLSYDNHQIYCHYRLVNYLKRAKLLFFEQDKLCISNSPDTLYLPTIKIDSLRKF >gi|283510509|gb|ACQH01000110.1| GENE 22 24549 - 24950 158 133 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929658|ref|ZP_06423502.1| ## NR: gi|288929658|ref|ZP_06423502.1| hypothetical protein HMPREF0670_02396 [Prevotella sp. oral taxon 317 str. F0108] # 1 133 1 133 133 253 100.0 2e-66 MAHFNIIDRIYFAGERSQDRGDRKVSGPGGIMAGLLFPLLILLDKLNKLHLLPFGKQLSV LYVCGSFCALFFGIWRYYVKSGRHERVMNYYRGRATDTPAYNYAYIIGWIIVCVVVTMII AQCNISLPPRRVL >gi|283510509|gb|ACQH01000110.1| GENE 23 25041 - 25937 644 298 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929659|ref|ZP_06423503.1| ## NR: gi|288929659|ref|ZP_06423503.1| hypothetical protein HMPREF0670_02397 [Prevotella sp. oral taxon 317 str. F0108] # 1 298 1 298 298 604 100.0 1e-171 MMKEQIIELLKQGRKLEAVATAHRLGGGTLLQAKRYVDAVERETLNGNSPDDNDRVVRSW SVKYNGGKPVTITLRDQYGERQLVEGTLEWNAILAEITQTRNERDTTTEGKDKRWSFSWW SKRNQERFADTNTHSPSKWKFYLPFGLALVLFFGFITFLTDCTYGYVGLLRQAVMTFIMF LITYAGYLLAIEKKEKWYSRLWGVVICCASCVFGVAIGIALVKDLCTDSLKVYQGEVEVN RHSGRRGTHYTVEWAGDDSPWWADHSINRSIYQDMQGYKKARVEYWQNSGVVKSIQPL >gi|283510509|gb|ACQH01000110.1| GENE 24 27739 - 28488 890 249 aa, chain - ## HITS:1 COG:FN0047 KEGG:ns NR:ns ## COG: FN0047 COG0708 # Protein_GI_number: 19703399 # Func_class: L Replication, recombination and repair # Function: Exonuclease III # Organism: Fusobacterium nucleatum # 1 249 1 250 253 363 68.0 1e-100 MKFISWNVNGLRAVVGKDFENTFKALDADFFCLQETKMQAGQLDLQFDGYDSFWNYADKK GYSGTAIYTRHKPLNVSYGIGIDQHDHEGRVITLEMPTFFLVTVYTPNSQEELRRLDYRM QWEDDFQAYLHNLDKCKPVIVCGDMNVAHQEIDLKNPKTNRRNAGFTDEERQKMTQLLDA GFTDTFRWKYPEEVTYSWWSYRFKARERNTGWRIDYFLVSNRLQAEVLDAKIHTDILGSD HCPVELLLK >gi|283510509|gb|ACQH01000110.1| GENE 25 28493 - 29137 640 214 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929661|ref|ZP_06423505.1| ## NR: gi|288929661|ref|ZP_06423505.1| hypothetical 
protein HMPREF0670_02399 [Prevotella sp. oral taxon 317 str. F0108] # 1 214 1 214 214 404 100.0 1e-111 MKKTLNLVAVALFGVAMLSLSSCLSGGNGGEEPRFKKITKAEQTVIMSSIVGNYSGQLKF VKGPKPTDIDSVKIAWAVTNDTILVIDKFPVRVLTGTVSASETFVKQLGAADPVEVKMKL KLPNYTLEDYYNKGYYLASPIAAKDFDITIAGKPGKLAFTNVIQFGSMQHQQVQQLLEYY KEKQVIRLLVKQITYENTPYEIKNPVVLQGKKIN >gi|283510509|gb|ACQH01000110.1| GENE 26 29213 - 29794 639 193 aa, chain - ## HITS:1 COG:slr0426 KEGG:ns NR:ns ## COG: slr0426 COG0302 # Protein_GI_number: 16331608 # Func_class: H Coenzyme transport and metabolism # Function: GTP cyclohydrolase I # Organism: Synechocystis # 21 191 57 228 234 216 61.0 2e-56 MNLEPEYREGLDQLAVHYKEVLALLGEDPSREGLEKTPMRVAKAMQVLTRGYTQDARKVL TDALFEEKYNQMVIVKDIDFFSMCEHHMLPFFGKVHVAYIPNGYITGLSKIARVVDIFSH RLQVQERMTQQIKDCIQDTLKPLGVMVVVEAKHLCMQMRGVEKQNSITTTSDFSGAFNQA KTREEFMNLIRPK >gi|283510509|gb|ACQH01000110.1| GENE 27 29844 - 30536 959 230 aa, chain - ## HITS:1 COG:no KEGG:PRU_1861 NR:ns ## KEGG: PRU_1861 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 13 229 7 200 202 126 38.0 7e-28 MKRLVTVIFTWICTVIAMEAQTFTDHLRRRDPKTQGVVTISQSAAIEQLVNGKGGAGKWA NPNPNERTPQTANGSTANSGKSTSVTEVESPARQPNEKRETEQTRKNEPNKKETTQKNNA LGDDGDGLNIPVIDTRKKVMVGGYKVDGYRVQAYAGGNSRDDKNRAQQIGNRIKAAFPDQ PIYVHFYSPRWICRVGNYRSIEEATRMLNAIQAMGYKQAVIVKGKITVQN >gi|283510509|gb|ACQH01000110.1| GENE 28 30702 - 32780 1824 692 aa, chain - ## HITS:1 COG:FN1366 KEGG:ns NR:ns ## COG: FN1366 COG0149 # Protein_GI_number: 19704701 # Func_class: G Carbohydrate transport and metabolism # Function: Triosephosphate isomerase # Organism: Fusobacterium nucleatum # 440 689 1 249 251 233 46.0 1e-60 MNVKRLIYILTNVCRLLIATTFVLSGYVKAIDPLGTQYKINDYLAAAGLSGFVSNWFTLF TSVALSATEFALGVYLLFAIQRRLVSKLLLGFIGMMTLITVWIYFASPVKDCGCFGDALH LSNGQTLAKNIVLLLAATVVARWPLRMVRFISKTNQWIVTNYTLFFIIVSSAWSLYTLPP FDFRPYHIGADIRKGMEIPADAKQPKFETTFVLSKNGETREFTLDNYPDSTWTFVDSKTV QTEEGYVPPIHDFSITDEATGNDITEDVLNRKGYTFLLISPYLEQADDTNFGAIDRIYEY 
AKRHNVPMMCLTASGKAAISRWQDLTGAEYPFYITDGTTLKTMIRSNPGLILIKDGVVIN KWSHNALPKQETLNAPLNKLSIGKIDPTSVTTRITKIVLWFVFPLFLLTLADRLWAWTKW IKKQRKRNKLYTLLKKKRKMRKKIVAGNWKMNLNLQEGLALAKEVNDALAADKPNCDVII CTPFIHLASVAGVLNSQLVGLGAENCADKEKGAYTGEVSAEMVKSTGAQYVILGHSERRE YYNETPEILKEKVLLALKNGLKVIFCIGETLAEREANKQNDVVKAELEGSVFNLSAEEFA NVIVAYEPIWAIGTGKTATAEQAEEIHAFIRSAIAEKYGNEIAENTSILYGGSAKPSNAP ELFAKPNIDGGLIGGAALKCADFKGIIDAWKK >gi|283510509|gb|ACQH01000110.1| GENE 29 32853 - 33410 759 185 aa, chain - ## HITS:1 COG:no KEGG:PRU_1858 NR:ns ## KEGG: PRU_1858 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 183 1 183 188 260 73.0 2e-68 MAEQATTEKQFEETLKECADLFERKLHDYGASWRILRATSLTDQLFIKAKRIRSLEIKRT AMVNEGIRPEFIALVNYGIIGLIQLEKGFADEVDLTPEEALALYHQYANDALQLMLKKNH DYDEAWRTMRVSSYTDLILTKLQRIKEMEDLEGDTLVSEGIDANYMDIINYAVFGAIKLK EKEVR >gi|283510509|gb|ACQH01000110.1| GENE 30 33574 - 33981 479 135 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929666|ref|ZP_06423510.1| ## NR: gi|288929666|ref|ZP_06423510.1| hypothetical protein HMPREF0670_02404 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 135 1 135 135 180 100.0 3e-44 MNKILLLLALFALTVSCEKSEDEKAAPLLAKIDSLYKAERYQDVLDSISVLRDRFPRAIN ARKTALGLWQTASLKLAQADIARTDSALQAKEQALKQGKLTSQRKAELLVRRDSLKIRYE ALCQMVKAIQKKQAQ >gi|283510509|gb|ACQH01000110.1| GENE 31 33995 - 34195 59 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYCMGYHLQHPMWHRGDNRPVLCIIKSALPAMYLRLIWCLHAPSWTKVTIKNETAELSAC ILRLIR >gi|283510509|gb|ACQH01000110.1| GENE 32 34354 - 35787 1681 477 aa, chain + ## HITS:1 COG:ML2697 KEGG:ns NR:ns ## COG: ML2697 COG0617 # Protein_GI_number: 15828457 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Mycobacterium leprae # 17 464 26 474 486 233 34.0 6e-61 MKILSDAQLAELLDQDIFHKIGDAADALGVPCYVVGGYVRDLFLERASNDIDVVVVGSGI QVADALRKSLGKKAHISVFRNFGTAQVKFRGYEIEFVGARKESYSHDSRKPIVENGTLED DQNRRDFTINAMAVCLNAENFGQLIDPFDGLYDLEDGIIRTPLDPDITFSDDPLRMLRCV RFATQLRFFIEDETFDALTRNAERIKIVSGERIADELNKIMMTQQPSRGFVELHRCGLLQ LILPELTRLDNVETRNGRAHKNNFYHTLEVLDNICQHTDNLWLRWAALLHDVGKAKTKRW DPAAGWTFHNHNFVGAKMVPDLFRRMKLPLDSKMKYVQKLVDLHMRPIVIAEDCVTDSAV RRLMNDAGDDTEDLMTLCEADITSKNHIKKQRFLDNFRIVREKLADLKERDYKRLLQPVI DGNEIMELFHLGPSREVGTLKQTLKDAVLDNRVPNEREPLMQLLMQKAKQMGLDTAR >gi|283510509|gb|ACQH01000110.1| GENE 33 37340 - 37552 58 70 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLNFLLKTIQKKNVSNTRVWATERQAGTFADGQRMLLQCMQPTADGESSMEIQRTSAICS LSSACGRYME >gi|283510509|gb|ACQH01000110.1| GENE 34 38006 - 38794 759 262 aa, chain - ## HITS:1 COG:RSc2985 KEGG:ns NR:ns ## COG: RSc2985 COG0755 # Protein_GI_number: 17547704 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome c biogenesis, permease component # Organism: Ralstonia solanacearum # 113 254 244 391 395 83 33.0 4e-16 MTWNNFPYFALVAVLLWAVGAYAAWKDKKTLTLGATIAGLLVFFGYIVAMWVSLERPPLR TMGETRLWYSFFLPFAGMVVYLRWRYKWILSFSTILATVFIIINVMKPEIHNKTLMPALQ SPWFAPHVIVYMFAYALLGAATVMAIYLLFIKKKPATDDEMDITDNLVYVGLSFMTLGML 
MGALWAKEAWGHYWAWDPKETWAAVTWFSYLVYVHFRRSRPHHLRPALYMLIVSFLLLQM CWWGINYLPAAMGSSVHTYNLG >gi|283510509|gb|ACQH01000110.1| GENE 35 38791 - 40023 1214 410 aa, chain - ## HITS:1 COG:no KEGG:BT_1416 NR:ns ## KEGG: BT_1416 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 407 1 407 411 407 47.0 1e-112 MWTRPWKMVEGFAVGAGLVLVGLLLQFSTGPIAWNNLSFPVNFIILAGYILAMSLVYAFR NKVYSFKFLSSYASAIPALAYAVALTAIMGLTKQVPDGRPASDPLGLSRMLSFWPFVLIY IWVSAIVAQATIKKALHFKWHDVPFMLNHAGLLIVMLCATLGNADMQRLKMTIHKSSPEW RATNDQGKIVELPLAIQLKAFKIYEYPPKLMLVNGKTGDPIPKENPTTLIVDSTFKTADL QGWTIKLNKYLDEAAPLMTRDTTNYLPWYSSGAMTAINITATSPDGKVRRTGWVTCGSYH FPYQVLSLDGKISIAMPEREPQRYLSSVELMTQAGLHAFADIEVNRPLSVEGWKIYQLSY DTSMGRWSETSVLELVRDPWLPFVYVGLGMMMLGAVFMFLSLQRRKETKE >gi|283510509|gb|ACQH01000110.1| GENE 36 40392 - 41876 1545 494 aa, chain - ## HITS:1 COG:STM4277 KEGG:ns NR:ns ## COG: STM4277 COG3303 # Protein_GI_number: 16767527 # Func_class: P Inorganic ion transport and metabolism # Function: Formate-dependent nitrite reductase, periplasmic cytochrome c552 subunit # Organism: Salmonella typhimurium LT2 # 52 489 38 474 478 430 46.0 1e-120 MAKQLKRWQGWLLFGGSMVVVFILGLLVSSLLERRAEVVSIFNNRKTPMEGIVSENSKFA SDFPREYETWKQTADTTFKSEFNSSQAVDVLAQRPNMVILWAGYAFSWDYATPRGHQHAV EDVRRTLRTGAPGVTEGKEPQPGTCWTCKGPDVPRLMKEHGVANFYKAPWSKWGDQVMNN VGCSDCHDSKTMDLKPARPALYEAWQRVGKDVNKATHQEMRSLVCAQCHTEYYFKGDGKY LTFPQDSGMTVEAMEKYYDAMNYKDYTHALSRAPILKAQHPGYEIHQMGIHGQRGVSCAD CHMPYMSKGGVKYTDHHIMSPLANIERTCQTCHRQSAETLRQNVYERQRKCNEIRNRVEN ELATAHIEAKFAWDKGATEAEMKPVLDLLRKSQWRWDFAVASHGAAFHAPQEIMRILSHS LDFANQARLKVAKVLARHGYTADVPMPDISTKAKAQAYIGLDMKQKEADKEKFLNTVVPQ WIKKAKAAGRALDL >gi|283510509|gb|ACQH01000110.1| GENE 37 41961 - 42563 483 200 aa, chain - ## HITS:1 COG:Cj1358c KEGG:ns NR:ns ## COG: Cj1358c COG3005 # Protein_GI_number: 15792681 # Func_class: C Energy production and conversion # Function: Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c subunit # Organism: Campylobacter jejuni # 7 166 6 167 171 
81 31.0 1e-15 MIKDKLKSILSAFSYRQKVFLLVTGGLIVGLGGLFMYLLRMHTYLTDDPSACVNCHIMTP YYATWMHSSHGRDATCNDCHVPHNNVFAKYYFKAKDGMNHVRKFVTFQERQAITAEDASA GVIMDNCIRCHAQLNQEFVRTGRINYMMAKRGEGMACWDCHRDVPHGGMNSLSSTPNAEV PLPKSPVPDWLQGMLGRKGK >gi|283510509|gb|ACQH01000110.1| GENE 38 42757 - 42975 71 72 aa, chain - ## HITS:0 COG:no KEGG:no NR:no RRIESQAPFNLKSQEHSNNMLSNRKQRRQEMPLCRELLFVKRTTKIRKKNTATPHPFALF HNETGTCKCLNM Prediction of potential genes in microbial genomes Time: Sat May 28 02:31:14 2011 Seq name: gi|283510508|gb|ACQH01000111.1| Prevotella sp. oral taxon 317 str. F0108 cont2.111, whole genome shotgun sequence Length of sequence - 25869 bp Number of predicted genes - 25, with homology - 20 Number of transcription units - 18, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 96 - 152 15.2 1 1 Op 1 . - CDS 169 - 2625 2842 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases 2 1 Op 2 13/0.000 - CDS 2653 - 3768 1430 ## COG0845 Membrane-fusion protein 3 1 Op 3 10/0.000 - CDS 3895 - 5136 1335 ## COG0577 ABC-type antimicrobial peptide transport system, permease component - Prom 5157 - 5216 2.7 4 1 Op 4 36/0.000 - CDS 5438 - 6700 1416 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 5 1 Op 5 . - CDS 6726 - 7427 315 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Prom 7528 - 7587 7.1 - TRNA 7655 - 7729 57.2 # Glu CTC 0 0 - TRNA 7750 - 7836 57.4 # Ser GCT 0 0 - Term 7609 - 7648 8.3 6 2 Op 1 . - CDS 7798 - 8271 -105 ## 7 2 Op 2 . - CDS 8488 - 8667 79 ## gi|288929681|ref|ZP_06423525.1| hypothetical protein HMPREF0670_02419 - Prom 8803 - 8862 5.6 8 3 Tu 1 . + CDS 9055 - 9765 537 ## gi|288929682|ref|ZP_06423526.1| hypothetical protein HMPREF0670_02420 + Term 9856 - 9892 6.3 9 4 Tu 1 . - CDS 9823 - 9960 75 ## - Prom 10158 - 10217 3.3 + Prom 10022 - 10081 3.9 10 5 Tu 1 . + CDS 10286 - 10774 -235 ## + Term 10995 - 11029 3.0 11 6 Op 1 . 
- CDS 11712 - 13805 2043 ## BF1191 alpha-glucosidase 12 6 Op 2 . - CDS 13833 - 14129 99 ## + Prom 13978 - 14037 5.1 13 7 Op 1 . + CDS 14116 - 15102 766 ## gi|288929684|ref|ZP_06423528.1| hypothetical protein HMPREF0670_02422 14 7 Op 2 . + CDS 15087 - 16316 1278 ## COG3757 Lyzozyme M1 (1,4-beta-N-acetylmuramidase) 15 8 Tu 1 . - CDS 16542 - 17027 239 ## PRU_2128 TonB family protein - Prom 17047 - 17106 3.7 16 9 Tu 1 . - CDS 17157 - 17639 386 ## BT_3898 TonB - Prom 17692 - 17751 5.1 17 10 Tu 1 . - CDS 17875 - 18363 426 ## PRU_1918 TonB family protein - Prom 18415 - 18474 5.7 18 11 Tu 1 . - CDS 18476 - 18955 388 ## PRU_1918 TonB family protein - Prom 18975 - 19034 4.1 19 12 Tu 1 . - CDS 19144 - 19659 515 ## BT_0813 TonB - Prom 19775 - 19834 4.0 20 13 Tu 1 . - CDS 19927 - 20406 334 ## BT_0813 TonB - Prom 20612 - 20671 4.0 - Term 20549 - 20576 -0.8 21 14 Tu 1 . - CDS 20762 - 21994 938 ## BT_3898 TonB - Prom 22017 - 22076 3.7 22 15 Tu 1 . - CDS 22125 - 22502 418 ## BT_3899 transcriptional regulator - Prom 22636 - 22695 7.4 23 16 Tu 1 . - CDS 22761 - 22979 70 ## 24 17 Tu 1 . + CDS 23050 - 23583 644 ## gi|288929695|ref|ZP_06423539.1| hypothetical protein HMPREF0670_02433 + Prom 25024 - 25083 5.9 25 18 Tu 1 . 
+ CDS 25315 - 25851 339 ## PG1508 hypothetical protein Predicted protein(s) >gi|283510508|gb|ACQH01000111.1| GENE 1 169 - 2625 2842 818 aa, chain - ## HITS:1 COG:PAB1300 KEGG:ns NR:ns ## COG: PAB1300 COG1506 # Protein_GI_number: 14521796 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Pyrococcus abyssi # 240 803 75 629 631 161 26.0 6e-39 MRKLKTSLLFAACAMAFTTAWANDVDVKTFRYAGPFDLPEPVLIDSTDAQQKSFASESLL NTPLHLSLAQQGRIVNAGNMPTSSAPYALHLLQFNIQNAAYRKVGIELKGLKHYQVYVDN KLVSPSDVALQPQTHNVVIKYLSQKDSTYNLSVRLTNVGDSLTTNTDTRRTLTLSDIQNG LQYYRLGLSANGRYLITNYYDVMDGGYTRYSWRLTDLKTGTIMRETDESMSWIPGTNRFY TIRKGREGRDVIATDPLTGREQTLATGIGEGNITLAPNGEYLILSTSQEGPKEDADIYEI TQPEDRQPGWRTRWNLSLVDLKTGLTRPLTFGYHNVSLEDISADGRYILYMVSRSRLEKR PTTVSSLYKLDVKTMQTTCVAHDEGFLSEARFSPDASVLLVKATAEAFSDVGKNVPEGST PNMYDYQLFALNVSNKAVTPLTKTFNPSIENFSWNPRDGQIYFTALDKDQQKLFRLNPKT NKFTRLNMPEDYVTRVSFAHNAPLAVWYGESASNSDRLYTLNLDNGKSNLLEDLSKLKLK DIDLGTCTPWNFVSSRGDTISGRCYLPPHFDPNKKYPMIVNYYGGCSPVSVYFESRYPLH LYAAQGYVVYLVNPSGAAGFGQTHASRHVNTAGKGVAEDIIEGTKAFLKAHPYVDPKRVG CIGASYGGFMTQYLQTQTDIFAAAISHAGISDHTSYWGEGYWGYSYSEVSMANSYPWTRK DLYVDQSPLYNADKIHTPLLFLHGNADTNVPIGESIQMFTALKLLGRPTAFVVVNGENHT IMDYQKRIKWQNTIFAWFARYLQNNPSWWQALYPEKKL >gi|283510508|gb|ACQH01000111.1| GENE 2 2653 - 3768 1430 371 aa, chain - ## HITS:1 COG:VC1563 KEGG:ns NR:ns ## COG: VC1563 COG0845 # Protein_GI_number: 15641571 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 2 316 12 337 338 115 27.0 1e-25 MKKKIIVAVLIAIVFVGTFVFLWAKSQPQPEQYTELTASVKDLKKTTVITGRIEPRNEVN VKPQISGIITEIMKEAGQPIQAGEVIAKVKVIPDMGQLSGAQARVRLANINLEQAKAEYE REKTLFDKGLVAAQEYEKVKQVYQQAKEERSAADDNLQVVRDGVSKSNAKASSTLIRSTI TGIILDIPVKVGNSVILSNTFNDGTTIASVANMNDLIFKGNIDETEVGHVKPGMTMKITI GALQNLNFTASLEYISPKAVESNGANQFEIKAAVHVGKGTNIRSGYSANAEIVLQAANNA LTVPESAIEFSGKDTYVYIPKGKGKELTYERRKVKTGLSDGIDIQILSGLKAGDKVRGPK VVAPTDTEKDN >gi|283510508|gb|ACQH01000111.1| GENE 3 3895 - 5136 
1335 413 aa, chain - ## HITS:1 COG:slr0594 KEGG:ns NR:ns ## COG: slr0594 COG0577 # Protein_GI_number: 16332318 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Synechocystis # 14 412 15 406 407 140 28.0 4e-33 MIDLDRYKEILDTLTRNKTRSFLTGFGVFWGIFMLVALLGGGKGLRELLSSNFKGFANNS GVFIAQNTTKPYDGLKKGRSWSMVYKDLERVRQQVPEVDRVTPMISRWGFTAVHEDKKSS CVFKGLEKDYQHIEAPDLYYGRYLNDLDNLQKRKVCIIGKKVYKTLFPKGGNPCGKEIRV GPVFFKVVGVDYSPGELNINGSAEESVVVPIDVMRDAFAMGDTINLLCFTAHKGYKVSEI VPKIRTVLARAHHVDPTDEKALPSFNMEALFGMMDNLFNGVDFLAWLVGIGTLLAGAIGV SNIMMVTVKERTTEIGIRRAIGATPRNILTQIITESITLISVAGMSGIVFTVMILQLAEM GSTTDGIVSAHYQINFTTAVGAVLFLCVLGVLAGLAPALRAMNIKPVDAMRDE >gi|283510508|gb|ACQH01000111.1| GENE 4 5438 - 6700 1416 420 aa, chain - ## HITS:1 COG:VC1566 KEGG:ns NR:ns ## COG: VC1566 COG0577 # Protein_GI_number: 15641574 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Vibrio cholerae # 1 418 1 402 404 128 24.0 2e-29 MYTLIAEIWSTTRRNKLRTALTGFAVAWGIFMLICLLGAGNGLINAATQNMDRFLSTSMM VGGGHTSKEYNGLPKDRSITLNDQDIKTTQNQFKQNVESVGAEIELGNQTITLGDNYETA SLTGVYPNEIDINKRKMLCGRFINQTDMEQQRKVLVISKSRAEGLLPSKWQGVVGQHVTM GGLAFLVVGIYSDDESMFRNIMYAPFTTIRTIYNKGDKTENIIFSFKGLDTKAANEAFEQ SYRRKINVQHEAAPDDEESVWIWNRFTDSLQMSTGTAIIRTALWIVGIFTLLSGIVGVSN IMLITVKERTREFGIRKAIGATPWSILKLIITESVIITLFFGYIGMMLGIGANLWMDATI GNTTMDVGAFQQKMFVNPTVGFGVCMQATLLIVIAGTIAGLIPARKAAKTRPIEALNAMD >gi|283510508|gb|ACQH01000111.1| GENE 5 6726 - 7427 315 233 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 222 1 221 245 125 36 2e-28 MIELKDINKTYFGAQPLHVLKGINLTIGDGEFVSIMGASGSGKSTLLNILGILDNYDTGE YRLGGTLIKDLSERRAAYYRNRMIGFVFQSFNLIGFKTAVENVELPLFYQGVSRKKRHTM AMEYLEKLGLAQWASHYPNEMSGGQKQRVAIARALITKPQIILADEPTGALDSKTSVEVM QLITSLNRNEGITTVVVTHDPGVAEQTDRLIRIKDGLIVDEGQNKQPQTRQHT >gi|283510508|gb|ACQH01000111.1| GENE 6 7798 - 8271 -105 157 aa, chain - ## 
HITS:0 COG:no KEGG:no NR:no MHYHQSTLPKHTTTTHPPQKHTLYNARVRERKKTITFPHHGLHRTKAITLPHSPTRTNAI RASCSPLPCLAEQTNLRKQVGPIAMHQTNHANEPSCPNWPLKSSKGTFYKPPGTKKTRKN LELLKKSNTFAPHLTAKRFFNIDGKGSLGEWLKPPVC >gi|283510508|gb|ACQH01000111.1| GENE 7 8488 - 8667 79 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929681|ref|ZP_06423525.1| ## NR: gi|288929681|ref|ZP_06423525.1| hypothetical protein HMPREF0670_02419 [Prevotella sp. oral taxon 317 str. F0108] # 1 59 1 59 59 66 100.0 6e-10 MKKKIKDAARWEKHAREREKLGKNKQKTDLTPKLNYSTIRTKKLNIKHKIYTKKTFVKK >gi|283510508|gb|ACQH01000111.1| GENE 8 9055 - 9765 537 236 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929682|ref|ZP_06423526.1| ## NR: gi|288929682|ref|ZP_06423526.1| hypothetical protein HMPREF0670_02420 [Prevotella sp. oral taxon 317 str. F0108] # 1 236 19 254 254 464 100.0 1e-129 MESFTKRVGTLMLSWILAVVGVFGQTVDDICSQNVRQLGGLTAAAKVKSLQLQQVVVTQG GEMPVTTLLVPGVAFYQRVKSPLGTMMVCAYKDKGWTFNSAPTPQTEPMPAGMVESYIIG SKFLGPLFDYYVNKKTSDVVSLAFVGEKDLDGEACFELEVRYRSGYKMMVCMSRKDGLIR QTTSPAGVIRYEDYRNVGGVKVPFMMMTSTRQGLVLTRVTSAQVNVDLEGKLLAMP >gi|283510508|gb|ACQH01000111.1| GENE 9 9823 - 9960 75 45 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRGKNRMVLGVCKPTIPIGCTRYNTQKGPETRSSPIQITTNFFNN >gi|283510508|gb|ACQH01000111.1| GENE 10 10286 - 10774 -235 162 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSACQLVNSPSPHLTNSSTYQTCQLANPSTRPLPISQIHQLTSLLACHPINSTSPHLANS STYKLVNSQTHQFTISPPHKSVNLPTYQLKSLKNIIFILQYHSNFIAILAKILAVNWANW HCFQPSASLTRSIFKRKTCILHHFAFLDWLPPHYFLRPIIRF >gi|283510508|gb|ACQH01000111.1| GENE 11 11712 - 13805 2043 697 aa, chain - ## HITS:1 COG:no KEGG:BF1191 NR:ns ## KEGG: BF1191 # Name: not_defined # Def: alpha-glucosidase # Organism: B.fragilis # Pathway: not_defined # 10 694 16 705 719 967 66.0 0 MKRFVLSVCLVLLCAFSALAQEVKSPNGDVKVNFALNNSVPTYTVTFRGKAVIKPSRLGF ELVKGGDLLDGFTLVGEEKGSADETWTPVWGENRTIRNHYNELLVRLEQKSTERMLNIRF RVYDDGVGFRYEFPQEGKLNYFQVKDERTEFAMTGDHTAWWIPGDYDTQEYEYTQSRLSE VRQKMKGAITDNLSQTPFSPTGVQTSLMMKTAEGLYINLHEAALVDYPAMHLNLNDRDMV 
FTSWLTPDAKGVKAYLHTPFRSPWRTMIITDDARRVLASNLILNLNDPCKYTDTSWIRPV KYVGVWWEMISGKGDWAYTKELPSVILGETDYTKVKPSGKHSANNANVRRYIDFAAEHGF DQVLVEGWNIGWEDWAGNMKEKVFDFQTPYPDFDIKALNDYAHSKNVRLMMHHETSSSVM NYERYMDRAYKLMNLYGYNAVKSGYVGNIIPRGEYHYGQVMVNHYLEAVKRAADYKIMVN AHEAVRPTGLCRTYPNLIGNESARGTEFQAFGGSKPGHTAILPFTRLQGGPMDYTPGIFE MDCAGGSHVNSTICGQLALYVTMYSPLQMAADFPENYLKHPDAFQFIKDVAVDWDRSLYL EAEPMAYITAARKAKGTDNWFVGCVSGNAPHKSQLSLSFLDKGRKYEATIYTDAPDADYR TNPKAYRIERRTVTSATKLSLTAVAGGGYAISIVPKR >gi|283510508|gb|ACQH01000111.1| GENE 12 13833 - 14129 99 98 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPLLILLFFLSIVPVLGVFPAYVVDYRCKVNLKILILHTPTHEKMYTCKMYNACARNRLH YFLHISIGVTCRFRHFILSLQKEQVCETSARFLVVPPL >gi|283510508|gb|ACQH01000111.1| GENE 13 14116 - 15102 766 328 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929684|ref|ZP_06423528.1| ## NR: gi|288929684|ref|ZP_06423528.1| hypothetical protein HMPREF0670_02422 [Prevotella sp. oral taxon 317 str. F0108] # 1 328 8 335 335 644 100.0 0 MSKGMAWSLVAAVLLVGALVYAVMCCFSSQRQMRNSVVVVQSNLWYTVGTGADKVVFFAE VAPDSTFAKAATSPTDLLKTPHLTSGFWVNRFALLPSCSGRVVAAMPHGTADKGKLPTDV NTMVQKTLGKLLAQEKQLKREVKEINYYMKVHGMQDEGYTAVLAYAHKLKTRLAEVQKVS KRLAVLRKEKQLRVLRHTRYALLFDGKSIATKCLREDQQWGLCLLQTTNERKPWRANALS ALPWGMSATGEVRVASVPGLGLDPLATVKDSAAVWAASVQRDSCLRMHAALGQAGSPVFS AWGRFVGMYNGVGIVPRNIIAKMLWNEK >gi|283510508|gb|ACQH01000111.1| GENE 14 15087 - 16316 1278 409 aa, chain + ## HITS:1 COG:ECs2905 KEGG:ns NR:ns ## COG: ECs2905 COG3757 # Protein_GI_number: 15832159 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lyzozyme M1 (1,4-beta-N-acetylmuramidase) # Organism: Escherichia coli O157:H7 # 185 405 70 267 275 124 35.0 3e-28 MERKINHPKRFLWLTALLVVGLFGACHFEKNTLNGDGTAQVLQQEGDLCYQGTVLRGRSH GYGVLWLKDSVVYAGQWKDGKRQGMGTAHDSKGRKITGRWNADTLVSGTREDAAGVYVGS FNSKMMASGHGEYTGNDGSHYEGHWENDAQTGKGFALTAQPKVKVGEWKAGQYLGERLTY TTERIYGIDISKYQHVINKRVHPIKWNQLRIIHLGTASRKAIHGQVNYPISFIYIKSTEG KSLLNPFYKADYKQARAHGFSVGTYHFFSIYSPAEAQAAHFLRHSYISQGDFPPVLDVEP 
SPQQIAKMGGAEVLFKAVRTWLTIVEQRTGKRPILYISQQFVNRYLPLAPDIKRDYDIWI ARYGEYKPDVHLVYWQLCPDGRVQGIQGEVDINVFNGYKEVFDEFKAKL >gi|283510508|gb|ACQH01000111.1| GENE 15 16542 - 17027 239 161 aa, chain - ## HITS:1 COG:no KEGG:PRU_2128 NR:ns ## KEGG: PRU_2128 # Name: not_defined # Def: TonB family protein # Organism: P.ruminicola # Pathway: not_defined # 42 160 307 430 517 121 46.0 7e-27 MKMKNKQACKRLVGFDCRLFAVLAFAIMAVVFAPARANAQRKARNKYVRPKRIEVDNKVY NVCELMPCFKGGDAALLQYLGKTARYPQSAQKAKKEGCVIVSFIVEKDGSLSDFKVERAV DAALDAEALRVMKAMPNWLPGQQDGRFVRVRYNVPLRFKLN >gi|283510508|gb|ACQH01000111.1| GENE 16 17157 - 17639 386 160 aa, chain - ## HITS:1 COG:no KEGG:BT_3898 NR:ns ## KEGG: BT_3898 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 34 159 333 467 609 121 43.0 9e-27 MNNKQTPNLFSGFKCLLFAVLAITLAVLLVPARANAQDKKEKGKTTQTQKDTTTDDKVYE MCEQMPTYKGGEEAMMRFLSQVTRYPQRAQDFGIQGCVVVGFIVEKDGSLSDFKFIQRVD PELDDEALKTVKAMPKWNPGKHNGKNVRVRYSVPVGFKLI >gi|283510508|gb|ACQH01000111.1| GENE 17 17875 - 18363 426 162 aa, chain - ## HITS:1 COG:no KEGG:PRU_1918 NR:ns ## KEGG: PRU_1918 # Name: not_defined # Def: TonB family protein # Organism: P.ruminicola # Pathway: not_defined # 56 159 174 277 277 134 58.0 1e-30 MNNKQAPNLFAGFKCLLFAISALVLLVIVFAPARANAQDKRGKTTQTHKDTTTGDKVYEV CEQMPIFEGGDAALLKYLTDSVKYPELAKKHGVQGRVVIDFIVEKDGSLTDIKVLRPVDI ALDAEALRVVKGMPKWIPGCHDGQLVRVEYNVPVSFRLKPLE >gi|283510508|gb|ACQH01000111.1| GENE 18 18476 - 18955 388 159 aa, chain - ## HITS:1 COG:no KEGG:PRU_1918 NR:ns ## KEGG: PRU_1918 # Name: not_defined # Def: TonB family protein # Organism: P.ruminicola # Pathway: not_defined # 54 159 172 277 277 129 55.0 4e-29 MKNKQTPKLFAGFKCLLFVVSALILLVIVIAPASANAQDKRGKTTQTRKDTATDDKVYEV CEQMPIFEGGDAALLKYLRENLKYPDKTKDRGVQGRLVIGFIVEKDGSLTDVKVLRPVDI DLDAEALRVIKGMPKWIPGRQNGQRVRVRYLLPIHICLQ >gi|283510508|gb|ACQH01000111.1| GENE 19 19144 - 19659 515 171 aa, chain - ## HITS:1 COG:no KEGG:BT_0813 NR:ns ## KEGG: BT_0813 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 1 171 265 441 443 152 46.0 4e-36 MKNRQKTNSFAGYKYLLFVPLAFALVFMASCKRKTKVQEQVMEGTKVEVKAETGEDTAQI KNAEPSDKAYEVVEQMPTFKGGDAALMKYLSENIKYPEAAEKAGEQGRVVVNFIVEKDGA VSNVKVVRSVTPTLDAEAVRVIKAMPKWVPGKQDGKFVRVKYNVPVSFRLQ >gi|283510508|gb|ACQH01000111.1| GENE 20 19927 - 20406 334 159 aa, chain - ## HITS:1 COG:no KEGG:BT_0813 NR:ns ## KEGG: BT_0813 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 157 265 439 443 138 42.0 5e-32 MKNRQKTNSFAGYKYLLFVPLIISLLAMNSTAMRANVQKKVVKTNKTTKKSGANDKVYVV VEQMPSFPGGDSALLKYLFENIKYPMSALKAQKQGRVMVRFTVEKDGAITNVKVARSVTP SLDAEAVRAIKSMPKWSPGKQGGEFVRVKYNVPVTFRLK >gi|283510508|gb|ACQH01000111.1| GENE 21 20762 - 21994 938 410 aa, chain - ## HITS:1 COG:no KEGG:BT_3898 NR:ns ## KEGG: BT_3898 # Name: not_defined # Def: TonB # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 410 5 468 609 355 40.0 2e-96 MLTYLIKANVVLVVLFGFYQLISAGDTFFKWRRLSLLTVYVLSLLLPTIDLSVLVNETAP LSNILPRMAYNLPEVTVQPAHDTFDWQQLAVWLYTGVALALLLRVFWQVVVVCRLAQRSE RATLHGTAVCLLTGDYNPFSFFRWIFINPDDKTPSQVQQILTHEQTHVAQWHSADALLSQ LFVAAFWFNPAAWLMRLQVRNNLEYLADRSVINGGTDKKAYQYHLLAVAYRTNVATITNN FNVLPLKKRIKMMNKQTSNPLARFKYLLFVPLAVALLAMNSTTIRANVQKKVVKTTKVTK KTGTTDKVYEVCEQMPTFPGGDAALMKYLSENVKYPALAIKAQEQGRVVVSFTVEKDGAI SDVKVARSVTPSLDAEAVRVVKAMPKWTPGKQDGQLVRVRYNVPVSFKLN >gi|283510508|gb|ACQH01000111.1| GENE 22 22125 - 22502 418 125 aa, chain - ## HITS:1 COG:no KEGG:BT_3899 NR:ns ## KEGG: BT_3899 # Name: not_defined # Def: transcriptional regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 120 1 118 121 148 60.0 6e-35 MKIEKLSIQEEEVMRCVWQLGRCNIKAIVELLPQPAPPYTTVASVVGNLKRKGYVAAQRK GNGFEYVPAVKEKDYKRNFVSGFVRDYFKNSFRDMVSFFAQEEKISPDELKDIIDEIENG DVAKV >gi|283510508|gb|ACQH01000111.1| GENE 23 22761 - 22979 70 72 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPLTHKPCTLNWRNNQQPLPTLADVALLMQHWGLWPNSTWGNRLVGSRLVDAASGLRPNS ILPKRWTRAIGD >gi|283510508|gb|ACQH01000111.1| GENE 24 23050 - 23583 644 177 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|288929695|ref|ZP_06423539.1| ## NR: gi|288929695|ref|ZP_06423539.1| hypothetical protein HMPREF0670_02433 [Prevotella sp. oral taxon 317 str. F0108] # 1 177 39 215 215 299 100.0 5e-80 MKRVLHLLFISLFALTFTSCENKEKQADKLTEEFMRKNLNDPRSYQLVERGRVDSLFSEF RATNEAQYMLDKVKQITDSAARYSESVKTVPQAQKMLADGQRILEEYKKKQKKFKSQFLG YAFKVSFRAKNEHGALVRSYVVFRLNPEMDKLSIEDADMNIFSIFASDAINDNMNKY >gi|283510508|gb|ACQH01000111.1| GENE 25 25315 - 25851 339 178 aa, chain + ## HITS:1 COG:no KEGG:PG1508 NR:ns ## KEGG: PG1508 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis # Pathway: not_defined # 1 176 2 181 182 73 27.0 3e-12 MKKSICLLQVFVLSLLFIGCTNGDNQVLPPAEQGSKAELFAVGTTRITKSSHLTPCLFTL DNIKSFNVQTREIVFQDFEPTNQLFPIYRNIEVHSYDKVLLRIATFVSSFDSQIFTDLSL VSETESGKFYLSDCYPRHIENDSEYAKANEAIKEAKDKRAAEWSAFLSILKKHGKLIE Prediction of potential genes in microbial genomes Time: Sat May 28 02:32:51 2011 Seq name: gi|283510507|gb|ACQH01000112.1| Prevotella sp. oral taxon 317 str. F0108 cont2.112, whole genome shotgun sequence Length of sequence - 14546 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 6, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 362 - 421 4.5 1 1 Tu 1 . + CDS 479 - 1321 1065 ## COG1360 Flagellar motor protein + Term 1348 - 1395 9.2 + Prom 1578 - 1637 3.9 2 2 Tu 1 . + CDS 1735 - 2622 879 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Prom 2715 - 2774 1.8 3 3 Tu 1 . + CDS 2977 - 6744 4595 ## COG0046 Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain + Term 6844 - 6880 1.9 + Prom 6756 - 6815 1.6 4 4 Op 1 7/0.000 + CDS 7036 - 7596 471 ## COG2059 Chromate transport protein ChrA 5 4 Op 2 . + CDS 7683 - 8258 668 ## COG2059 Chromate transport protein ChrA - Term 8465 - 8524 -0.8 6 5 Tu 1 . - CDS 8557 - 9084 415 ## gi|288929703|ref|ZP_06423547.1| conserved hypothetical protein - Prom 9111 - 9170 2.2 - Term 9708 - 9753 11.1 7 6 Op 1 . 
- CDS 9845 - 10048 372 ## gi|288929704|ref|ZP_06423548.1| hypothetical protein HMPREF0670_02442 8 6 Op 2 . - CDS 10060 - 12330 1955 ## gi|288929705|ref|ZP_06423549.1| conserved hypothetical protein Predicted protein(s) >gi|283510507|gb|ACQH01000112.1| GENE 1 479 - 1321 1065 280 aa, chain + ## HITS:1 COG:PA1461 KEGG:ns NR:ns ## COG: PA1461 COG1360 # Protein_GI_number: 15596658 # Func_class: N Cell motility # Function: Flagellar motor protein # Organism: Pseudomonas aeruginosa # 146 266 127 245 296 87 40.0 2e-17 MKKKNIVQLVLLASVVVVSSCASKKDLDNCQRENRELNESYRSTREQLAASQARVTSLEE QLAQQKRDYAALQGSLDKSLNNSSANNVNISKLVDQINESNQYIRHLVEVKSKSDSLNMV LTNNLTRSLSREELKEVDVRVLKGVVYISLADNMLYKSGSYEINDRAAETLSKIAKIIMD YQDYDVLVEGNTDNVPITRENIRNNWDLSCLRASSVVQYLQTRYGVNPKRLTAGGRGEFN PVATNGTEVGKQRNRRTQIIITPKLDQFMDLIDKAPEEGK >gi|283510507|gb|ACQH01000112.1| GENE 2 1735 - 2622 879 295 aa, chain + ## HITS:1 COG:FN1744 KEGG:ns NR:ns ## COG: FN1744 COG0697 # Protein_GI_number: 19705065 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 95 292 96 290 293 60 21.0 3e-09 MSTTVKGFLNASISGITFGLIPLFAIPVLATGMHSTSVLIYRYAFGCLAMLGMLMFHRTR MRLAFGDFLRILLLSAMYAVSSIALIEGYNYMASGIATTLLFSYPVWTLLLSVLFLHERL SLTTAVAIGIAVAGVFFLSGILDGNGSMEGLTGLFLLLLSGFLYAVYMVVFPRMRIRQMP SLKLTFYIFFFAMLILTLYATFTRGRIDPIDTRSQLVNLFLLGLVPTAVSNVTLIVALKQ ISSTLAAVLGAFEPMTAMCVGILLFGEPLTLPIVIGFVLIITSVLILVLSKRKTG >gi|283510507|gb|ACQH01000112.1| GENE 3 2977 - 6744 4595 1255 aa, chain + ## HITS:1 COG:VC0869_1 KEGG:ns NR:ns ## COG: VC0869_1 COG0046 # Protein_GI_number: 15640885 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain # Organism: Vibrio cholerae # 16 914 41 950 972 497 36.0 1e-140 MILFFKTQKETVIATEVNHQLNDDEMAKLAWLYGDATLMDTQELEGFYVGPRREMVTPWS 
TNAVEITQNMNLNGISRIEEFFPVASRNADHDPMLQRMYEGIGQEVFCVDRQPEPVRHVD NLEEYNEKEGLALSKDEMAYLHDIEKQLGRPLTDSEIFGFAQINSEHCRHKIFGGTFVID GEEKESSLFAMIKRTTAENPGNILSAYKDNVAFAQGPVVEQFAPKDQSKADFFEVKDIES VISLKAETHNFPTTVEPFNGAATGTGGEIRDRMGGGVGSWPIAGTAVYMTAYPRFEGGRD WEKLLPVREWLYQSPEQILIKASNGASDFGNKFGQPLITGSVLTFEHQEKAETDNSANEP FYAYDKVIMLAGGVGYGTKRDCLKGEPQKGNKVVVVGGDNYRIGLGGGSVSSVDTGRYSN GIELNAIQRANPEMQKRAYNLVRALVESDDNPVVSIHDHGSAGHLNCLSELVEECGGEIN MDQLPIGDKTLSAKEIIANESQERMGLLIDEQHIERVRQIAQRERAPLYVVGETTGDAHF AFKQADGTKPFDLDVAQMFGHSPKTVMVDNTVERHYQNVEYEPIGSEADKNKLGEEALTA HIERVLQLEAVACKDWLTNKADRSVTGKVARQQTQGEIQLPLSDCGVVALDYRGRAGIAT AIGHAPQAALASAEKGSVLAVSESLTNLVWAPLAHGLGGVSLSANWMWPCRSQAGEDARL YKAVEALSDFCCTLGINVPTGKDSLSMTQQYPNSQKVIAPGTVIVSSGAEVSDVRKVVSP VLVNDKNSSLYYIDFSFDGQRLGGSAFAQSLGFVGDDVPTVANPEYFADCFNAIQELVRR GWILAGHDISAGGLITTLLEMAFANRKGGMHLNLHDLAGDDVVKNLFAENPGVVIQVSDD HRNDLRTYLEGEGIGYTKIGYSVPNSRTLVVKKGENEYVFDIDSLRETWYRTSYRLDTMQ SHNGMAKKRWLNYKKQPIELKFNDDFTGKLSGYGISADRREPSGVKAAIIREKGTNGERE MAYALYLAGFDVKDVAMTDLISGRETLEEVNMIVFCGGFSNSDVLGSAKGWAGAFLYNPK AKEALDKFYARPDTLSLGICNGCQLMAELNLINPEHDHRSHLTHNTSRKFESSFVGLTIP QNNSVMLHSLSGSKLGIWVAHGEGRFYMPESEDKYNIVAKYNYAEYPGNPNGSDYNVAGI CSSDGRHLAMMPHLERSIFPWQNAYYPRRRQSDEVTPWIEAFVNARKWVEKKLKA >gi|283510507|gb|ACQH01000112.1| GENE 4 7036 - 7596 471 186 aa, chain + ## HITS:1 COG:FN0712 KEGG:ns NR:ns ## COG: FN0712 COG2059 # Protein_GI_number: 19704047 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 8 172 10 174 186 144 44.0 1e-34 MSIYWNSFRTFFKVGMFTLGGGYAMIPIIEAEVVDKNKWIEKEEFLDLIAIAQSCPGVFA VNVSIFIGYKMRKLRGALCTCVGTALPSFLIILLIAMFFHQFQDNSVVAAAFRGIRPAVV ALIAAPTFTLAKSAHITLSTLWIPVVCALLIWAMGVNPIYIIIGAALCGFLYGQFIKPTE GSHNEN >gi|283510507|gb|ACQH01000112.1| GENE 5 7683 - 8258 668 191 aa, chain + ## HITS:1 COG:FN0713 KEGG:ns NR:ns ## COG: FN0713 COG2059 # Protein_GI_number: 19704048 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # 
Organism: Fusobacterium nucleatum # 1 191 1 174 176 107 39.0 1e-23 MIFIELFYTFFLIGLFGFGGGYGMLSLIQTETVIHHQWLSSAEFTNIVAISQMTPGPIGI NAATYCGYTATHNAGFNTWMAMLGSATATFALVLPSLILMILICKTLMKYMDTPVVQNMF AGLRPAVVGLLAAAALLLMTEENFSSPTINPWQFGISLFLFAATFAGTKFLKINPIRMIG YAAVAGMLLLY >gi|283510507|gb|ACQH01000112.1| GENE 6 8557 - 9084 415 175 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929703|ref|ZP_06423547.1| ## NR: gi|288929703|ref|ZP_06423547.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. F0108] # 1 136 1 136 175 244 100.0 2e-63 MKRQIELSVLWVVLVLGFLAHTLADVMPAFWGESIVAMPTSTHLRELIALMMGICYTLPV LAIFLVLWGKHKAWRVIHSILACLFALFCLLHMLEWLDQFNPVQLTIMPLMAIVGIILAV KSIKYVREKPCVESKADDNEEEEDENVDNEEETEEEENKEEVSEEVSEEKEKEAE >gi|283510507|gb|ACQH01000112.1| GENE 7 9845 - 10048 372 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929704|ref|ZP_06423548.1| ## NR: gi|288929704|ref|ZP_06423548.1| hypothetical protein HMPREF0670_02442 [Prevotella sp. oral taxon 317 str. F0108] # 1 67 1 67 67 122 100.0 6e-27 MKKQVYTKPNCIAVYFDLNELLRAPGASRFDNDGDGDSDPGPIVPGDPDEIGAKPGFGWG WDEEEYM >gi|283510507|gb|ACQH01000112.1| GENE 8 10060 - 12330 1955 756 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929705|ref|ZP_06423549.1| ## NR: gi|288929705|ref|ZP_06423549.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 756 1 756 756 1472 100.0 0 MMKFYSLVLSALCACVPTVLKAQGQNPHSHEWNGNSIVSVANSTNEDIKTAYLYNVGTGQ YLNAGSYWGTVVVGYGVGMPINITKSPTSGKYRMQGPQATTEGRNIAFGRRKDTPGYTDP INYNHVYVDRGVDYDLSKTPNPYTTEAHHINGILDWEFVETHSGSKTYKIRFYNDEHNQG YGGTRYLQMKSVGHSNTSPLAYPATESNSDKSCLWKIVTKADLKAAFKEQYATDEAPANA TFLIYDQNFIRGDKDVEKWHASNGLTWAYANAQAYLFNPANANYTYYVGNGAISSNYYMA NYAGYSTANVRNLGNNGKANGKVTQQVVTLKKGWYRVSCNGFYNAISGSQLKSKLFAKVQ GTTDAFSNVSTSLNVFAHEFCYTLADLMHAYDASDLAQKHISPYSRAAMEFEKGSYNNTV LVFVPTDGAKLDIGIEITGSTRQRDWTCWDNFRLEYCGTQDLVLDEAQTSILYLAKQVQP HKAATLILKRTMQKDEWNSIVLPISLTAAQVKATFGETARLSAYPHQSPTLSTRIDFTKV SLDNDNAIAIEANKLYLLRPTKEPTVPPTAAPYRKLVKDVGWVEVQAPYYIINNVTLDTN PQTLPNYSNGILRDPSTPSTTTDERLQFCASLYNQTAKVVPAHSYVLGKSAKSNNKWLWH FTQNPMPVKGFRGWIATGSNTQSKAFNFFVDGEEIGSTFDNTTDVTNTLVQPNSDLFAVP SNIYSVDGKLVRPNATSIEGLPKGIYIVNHKKIIVK Prediction of potential genes in microbial genomes Time: Sat May 28 02:33:31 2011 Seq name: gi|283510506|gb|ACQH01000113.1| Prevotella sp. oral taxon 317 str. F0108 cont2.113, whole genome shotgun sequence Length of sequence - 5279 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 36 - 70 5.0 1 1 Tu 1 . - CDS 119 - 1732 2111 ## COG1866 Phosphoenolpyruvate carboxykinase (ATP) + Prom 1918 - 1977 5.0 2 2 Tu 1 . + CDS 2005 - 2664 699 ## COG0035 Uracil phosphoribosyltransferase + Term 2724 - 2778 19.1 - Term 2831 - 2869 1.0 3 3 Op 1 7/0.000 - CDS 3059 - 3628 692 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins - Prom 3682 - 3741 2.9 4 3 Op 2 . 
- CDS 3813 - 5135 357 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 Predicted protein(s) >gi|283510506|gb|ACQH01000113.1| GENE 1 119 - 1732 2111 537 aa, chain - ## HITS:1 COG:VC2738 KEGG:ns NR:ns ## COG: VC2738 COG1866 # Protein_GI_number: 15642731 # Func_class: C Energy production and conversion # Function: Phosphoenolpyruvate carboxykinase (ATP) # Organism: Vibrio cholerae # 9 537 14 541 542 769 70.0 0 MAKFDKSVLEKYGIKGTTEVVYNPSYEVLFEEETKAGLEGFEKGQVTELGAVNVMTGVYT GRSPKDKFIVDDETSHNTVWWTTDEYKNDNHRASKETWAAVNKLAKEELSNKRLFVVDGF CGANKDTRMKIRFIVEVAWQAHFVTNMFIKPVNQEELDQEPDFIVYNASKAKVENYKELG LHSETAVVFNVTTREQVILNTWYGGEMKKGMFSMMNYYLPLKGIAAMHCSANTDLNGENT AIFFGLSGTGKTTLSTDPKRLLIGDDEHGWDDNGVFNFEGGCYAKVINLDKESEPDIYGA IRRNALLENVTVDANGKIDFTDGSTTENTRVSYPINHIKNIVRPISAGPAAKQVIFLSAD AFGVLPPVSILSPEQTKYYFLSGFTAKLAGTERGITEPTPTFSACFGQAFLELHPTKYAE ELVKKMTKNGAKAYLVNTGWNGTGKRISIKDTRGIIDAILCGAINNVPTKKLPIFDFEIP TVLEGVATNILDPRDTYADPAQWDEKAKDLAGRFIKNFQKYEGNEAGKALVAAGPKL >gi|283510506|gb|ACQH01000113.1| GENE 2 2005 - 2664 699 219 aa, chain + ## HITS:1 COG:SSO0231 KEGG:ns NR:ns ## COG: SSO0231 COG0035 # Protein_GI_number: 15897179 # Func_class: F Nucleotide transport and metabolism # Function: Uracil phosphoribosyltransferase # Organism: Sulfolobus solfataricus # 17 216 15 216 216 124 37.0 1e-28 MKIVNFSEQNSIVNQYLAEMRDVDYQKNRLLFRNNIQRIGELEAYEISKTLDYEARNITT PLGTSSVQVPTDKIVLATIFRAGLPFHNGFLNIFDHAGNAFVSAYREYVDEQHTKVGIHV EYLATPNIEGKNLIIADPMLATGGSMELGYKAFLTKGTPKRIHVACVIASPEGIEHIRKT FPEDKTTIWCAAIDPGLNEHKYIVPGFGDAGDLCYGEKV >gi|283510506|gb|ACQH01000113.1| GENE 3 3059 - 3628 692 189 aa, chain - ## HITS:1 COG:BS_xpt KEGG:ns NR:ns ## COG: BS_xpt COG0503 # Protein_GI_number: 16079265 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Bacillus subtilis # 1 187 1 188 194 181 47.0 5e-46 MDKLKTRILQDGKCYEGGILKVDSFINHQMDPNLMMEVAKEFSRHFADAHINKIVTIEAS 
GIAPAILVGYIMQLPVVFIKKKQPKTMDNMLTSVVHSFTKDRSYTVCISADYLTPQDHVL FIDDFLANGNASMGVIDLCKQAGARLEGMGFIIEKAFQKGGDALRQMGIRYKALATVESL DNCKIKLRD >gi|283510506|gb|ACQH01000113.1| GENE 4 3813 - 5135 357 440 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 13 398 7 386 447 142 27 7e-34 MEQKTTKAVIYGVNDRPPLQETLFAAVQHLLAIFVAIITPPLIIAKALRMDLATTGFLVS MALFVSGVATFIQCRKFGPVGCGLLCVQGTSFSFIGPIIAIGQAGGLPLIFGACIAAAPV EMAVSYSFKYLRRIITPLVSGIVVMLIGLSLVKAGVMACGGGDLAAADFGSPRNLIVAAT VLASVIALNCSNNKYVRMSSIILGVVIGYVLALCMGMVDFGSLNVGDVKSFYVPVPFKLG IDFNVSSIIAVALVYLVTAIEATGDVTANSMVSGEPVDGEVYVKRVSGGVFADGLNSLLA GIFSSFPNSIFAQNNGIIQLTGVASRYVGYFIAAMLVVLGLFPIVGIVFSLMPSPVLGGA TLLMFGTVAAAGVRIVCSEKLDRKAVLVIAASLALGMGVELMPDILKQLPEGVRTIFSSG ITTGGLTAILANTLFNRKRI Prediction of potential genes in microbial genomes Time: Sat May 28 02:33:34 2011 Seq name: gi|283510505|gb|ACQH01000114.1| Prevotella sp. oral taxon 317 str. F0108 cont2.114, whole genome shotgun sequence Length of sequence - 10783 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 6, operones - 4 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 16 - 65 8.1 1 1 Op 1 . - CDS 303 - 737 467 ## BVU_2483 putative transcriptional regulator 2 1 Op 2 . - CDS 749 - 2032 1398 ## PRU_2853 hypothetical protein - Prom 2055 - 2114 6.0 + Prom 2298 - 2357 2.6 3 2 Tu 1 . + CDS 2509 - 3243 491 ## gi|288929713|ref|ZP_06423556.1| hypothetical protein HMPREF0670_02450 + Prom 3258 - 3317 6.0 4 3 Op 1 . + CDS 3352 - 5328 684 ## PSHAb0021 putative orphan protein + Prom 5342 - 5401 3.1 5 3 Op 2 . + CDS 5425 - 5889 634 ## COG2030 Acyl dehydratase + Term 5915 - 5950 0.3 + Prom 5981 - 6040 2.5 6 4 Op 1 . + CDS 6188 - 6700 561 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 7 4 Op 2 . + CDS 6678 - 7172 658 ## BVU_1340 hypothetical protein + Term 7276 - 7317 7.5 8 5 Op 1 . 
- CDS 7436 - 8341 299 ## COG0863 DNA modification methylase 9 5 Op 2 . - CDS 8431 - 9147 717 ## Tmel_0307 type II site-specific deoxyribonuclease (EC:3.1.21.4) - Prom 9180 - 9239 2.1 10 6 Tu 1 . - CDS 9281 - 10219 749 ## COG0338 Site-specific DNA methylase Predicted protein(s) >gi|283510505|gb|ACQH01000114.1| GENE 1 303 - 737 467 144 aa, chain - ## HITS:1 COG:no KEGG:BVU_2483 NR:ns ## KEGG: BVU_2483 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.vulgatus # Pathway: not_defined # 3 142 2 141 141 177 60.0 9e-44 MQDEAYFVAKLELRHIKPTATRLLIVREMMRGDETISLPELERLLPTIDKSTISRTLSLF LLHRLIHAIDDGSGALKYAVCADDCDCTVQDEHTHFYCEHCHRTFCLKQLAVPVVPLPDG FRLNSVNYVLKGVCPECERKKIRC >gi|283510505|gb|ACQH01000114.1| GENE 2 749 - 2032 1398 427 aa, chain - ## HITS:1 COG:no KEGG:PRU_2853 NR:ns ## KEGG: PRU_2853 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 426 1 426 434 451 50.0 1e-125 MRKAAILLYALTISVGAFAQQAKEDFKKDRLMSASNYWAYPGPQKVQTPAPKGYKPFYIS HYGRHGSRYLIGTKDYDTTYFSLAKADSLGKLTPLGKATMAKLKLIREEAMGRDGELTQR GAEQHKGIARRMFKNFPEVFAGRTTIDAKSTTVIRCILSMENALQQLLLLNPQLIIKHDA SMHDMYYMNQTDDSLRAKKMRGRAKDEWTAFAKRHDMHERLMNSLFNDPEYWKQNLNGTE FNRMLIKVANNVQSTELRHKLQLLDLFTDEELYDNWLVGNARWYIAYGPSPLNGGTQPYS QRNLLRNIITQADSCIALPNPGATLRYGHDTMVMPLVCLLDLNQNGRQIVDLEKLAAENW VDYKIFPMASNLQFVFYRKNAADKDVLFKVMLNENEATLPLKAVSGSYYRWKDFRDFYLK KIDAYKE >gi|283510505|gb|ACQH01000114.1| GENE 3 2509 - 3243 491 244 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929713|ref|ZP_06423556.1| ## NR: gi|288929713|ref|ZP_06423556.1| hypothetical protein HMPREF0670_02450 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 244 1 244 244 419 100.0 1e-115 MNNGIILITILFLVVAINIIVVIRIRKRPDNIVGLGLENTEEEIAEDKAWRALLCRCISI GNVITLIGGGVSIYFDNLLLYTLFVSLPIPLGLLFAYGKKCKSGKGGSEQKSTVLVVTIA VVFLLELVAIVAVSFHGSGDIKLSFNQNELEIHGLYGTNIPYEDIKQLSLKSCLPAIKHR SNGFEARKTRLGNFVTNDDLRIKLFTHSDTCFIRIVTKGNEVFYLSSRQPSKTKTLLGEI QKRI >gi|283510505|gb|ACQH01000114.1| GENE 4 3352 - 5328 684 658 aa, chain + ## HITS:1 COG:no KEGG:PSHAb0021 NR:ns ## KEGG: PSHAb0021 # Name: not_defined # Def: putative orphan protein # Organism: P.haloplanktis # Pathway: not_defined # 2 646 7 651 656 255 28.0 4e-66 MNRVHYFSSNDLSIGHYLKLVEERINELSQSSMHMELVDIIELWSIRKLFEEGCRLTTWT DVKCQELNAATSSYNTIIAKYLNSLNPSCIKEGYSKLDRVYKQSFWKIVSQFKCFKIILP ETLSDIIKENTNDLRAVLSCKQLVDKFKNEIKKVLIESEETAHILLDKYVSRPLGHSGNE IYLPSNLSVADSERIIQIYLASSDPNLNYVRLICQAKNQDGKFVLSAKTRLVAKRLEKRL NMELLDNPQTAMATFGLRIAFINEEKAPLFTSSNENGLLAYTYNINGISKYKDEELLFYC YRNFDWLNEHFMLNLINKHCEVEGLEYLLMDKSSTAYPDYSSFRHKNMIALYQLYGFCQS LNRLDRPLENGLCKFYSETLKNKVGYPSIEIKLPNCNDDWLTKCRMIFPELDNVARQFNT FVEEDEIDPEYLELLKPMKMTDAKSLLVSKYCEANNNSKDIGNILNNLFDSGSLLDYVEP YKNESLGSFVNLMEHKKVNYNNYEEHQKYKIDFLIDLDIVQVEKDGTLVFVNHSIIEVLK HIWEFSACSYWHLDEEGRAAANEMIKKGWLITDDHLLSREERKYFSFYTDNVEFTNGYEY RNRYAHGKTPPPSDVNAHTEAYLVLLRLFTILLLKVYDDLHLAYKLLVKWLPTAPKIK >gi|283510505|gb|ACQH01000114.1| GENE 5 5425 - 5889 634 154 aa, chain + ## HITS:1 COG:CC0942 KEGG:ns NR:ns ## COG: CC0942 COG2030 # Protein_GI_number: 16125194 # Func_class: I Lipid transport and metabolism # Function: Acyl dehydratase # Organism: Caulobacter vibrioides # 8 142 5 137 148 135 51.0 4e-32 MSKLVVNSYEEFAAHLGEELGVSPWLEIDQERINLFADATLDHQWIHVDVERAKQESQFK STIAHGYLTLSLLPHLWEQIIEVNNIKMLVNYGMDKMRFGQAVVTGSRVRLVTKLHAISN IRGICKTEIEFKIEIEGQRKPALEGIATFLYYFE >gi|283510505|gb|ACQH01000114.1| GENE 6 6188 - 6700 561 170 aa, chain + ## HITS:1 COG:mlr0407 KEGG:ns NR:ns ## COG: mlr0407 COG1595 # Protein_GI_number: 13470641 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 7 160 
13 163 183 69 31.0 3e-12 MNKISFRNDILPLKDKLYRLALRITLNARDAEDIVQDTLIKVWNRRDRWDELDSIEAFSL TVCRNLALDSIKRKGHNNPSIEDAHADRPDLTANPYEEMLHNDRVKLVRDIINALPEKQK TSMQLREFEGKSYREIAQIMEVTEEQVKINIFRARQAIKQKFKNADDYGL >gi|283510505|gb|ACQH01000114.1| GENE 7 6678 - 7172 658 164 aa, chain + ## HITS:1 COG:no KEGG:BVU_1340 NR:ns ## KEGG: BVU_1340 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 5 118 1 110 159 132 59.0 5e-30 MQMIMDYKYINQLLERYWNCETSLEEEGILRAFFSQKDVPAELRQYQSLFAYQQMESKAK HLGADFDSKLLAIINEEEPVKVKARTITLTQRLMPLFKAAAVVAIFLTLGNAAQESFQTK PTSPAGMAGYNKVEKGASVALGDSAVIDTLKKSGEPAASLNIIK >gi|283510505|gb|ACQH01000114.1| GENE 8 7436 - 8341 299 301 aa, chain - ## HITS:1 COG:HP0092 KEGG:ns NR:ns ## COG: HP0092 COG0863 # Protein_GI_number: 15644722 # Func_class: L Replication, recombination and repair # Function: DNA modification methylase # Organism: Helicobacter pylori 26695 # 6 253 14 261 277 296 57.0 3e-80 MIETYYRSDDRHFNLLQGNCIELLGQFDFKFNTIFADPPYFLSNGGISCQSGEVVSVNKG DWDKSHGADEDNLFNRRWLEVCRDKLADNGTIWVSGTYHNIFSVANCLAELGYKILNVIT WAKTNPPPNISCRYFTHSSEFVIWARKSPKVPHYFNYQLMKEMNDNKQMTDVWHLPAIAP WEKTCTKHPTQKPLGLLTRIILASTRPNDWVLDPFAGSSTTGIAANLFGRRYFGIEQEHH FLEISKARHMEIEQPDVATIYIDKILKQLRKLNNGYKTEPLTETMLMCEDVPSYNISTLP F >gi|283510505|gb|ACQH01000114.1| GENE 9 8431 - 9147 717 238 aa, chain - ## HITS:1 COG:no KEGG:Tmel_0307 NR:ns ## KEGG: Tmel_0307 # Name: not_defined # Def: type II site-specific deoxyribonuclease (EC:3.1.21.4) # Organism: T.melanesiensis # Pathway: not_defined # 1 238 35 270 272 236 52.0 7e-61 MRLNALNYLIGKDDIAAAVHDLWKENPQVFTTLDILIGVRAKDKKLSFNRESEIQLIEEF FTSAEGVVEFIQDTGLEKVFRNKQITNLVDYVFGVEVGLDTNARKNRSGHIMEERVASIL TKANVEFRQEVYSREFPEVYQALGVDSKRFDFVVETPAKTYLMEVNFYSGGGSKLNEVAR AYAELAPKVNACEGYEFVWVTDGKGWESAKGKLEEAFYTIPSIYNLTTFKPFVDTLRQ >gi|283510505|gb|ACQH01000114.1| GENE 10 9281 - 10219 749 312 aa, chain - ## HITS:1 COG:slr1803 KEGG:ns NR:ns ## COG: slr1803 COG0338 # Protein_GI_number: 16330320 # Func_class: L Replication, recombination and 
repair # Function: Site-specific DNA methylase # Organism: Synechocystis # 7 295 8 306 309 257 43.0 3e-68 MTIHKGAKPFVKWAGGKTQLLSDIESLLPTDFHNRNITYVEPFVGGGSVLFWLLQHYPNI QHAVINDVNAKLINVYRVINAKPQKLISALRVLENEYLPMNHAERTAYFMEKRRRFNDDE LTNVEQAAIFIFLNRTCFNGLYRENSKGKFNVPHGKYVHPKICDEQTIMVDSDLLQRVDI LCGDFDATKRYANEDTLFYLDPPYKPLNSTSSFNTYVKEPFDDAEQVRLRNFCNEVSGRG SLFVLSNSDVKSYDPKDNFFDNLYAAYSIQRVFATRMINSNAEKRGKLTELMISNIGQGE NRTARETIAVSI Prediction of potential genes in microbial genomes Time: Sat May 28 02:34:05 2011 Seq name: gi|283510504|gb|ACQH01000115.1| Prevotella sp. oral taxon 317 str. F0108 cont2.115, whole genome shotgun sequence Length of sequence - 3309 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1013 - 1516 195 ## Gbem_3567 hypothetical protein 2 2 Op 1 . - CDS 1624 - 1896 177 ## gi|288929722|ref|ZP_06423565.1| hypothetical protein HMPREF0670_02459 3 2 Op 2 . - CDS 1938 - 2654 271 ## Gbem_3565 hypothetical protein - Prom 2679 - 2738 9.8 4 3 Tu 1 . + CDS 3025 - 3307 107 ## ZPR_3005 transposase Predicted protein(s) >gi|283510504|gb|ACQH01000115.1| GENE 1 1013 - 1516 195 167 aa, chain - ## HITS:1 COG:no KEGG:Gbem_3567 NR:ns ## KEGG: Gbem_3567 # Name: not_defined # Def: hypothetical protein # Organism: G.bemidjiensis # Pathway: not_defined # 6 163 44 199 206 96 38.0 4e-19 MYTTSVESNVLYQGDGYNPFPYVELMRLEKGNKNVKGIILSNTCDISLENHRLYPSSILY APIIEIEKYRDVLIEKGASKEQVEGHLQAVRKQEVSSVLYLPAISKLKESIVFFDRIMSI DNSFIKREELKHNRLFSLSDYGFYLLLFKLSVHFSRIQEKVNRGSCS >gi|283510504|gb|ACQH01000115.1| GENE 2 1624 - 1896 177 90 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929722|ref|ZP_06423565.1| ## NR: gi|288929722|ref|ZP_06423565.1| hypothetical protein HMPREF0670_02459 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 90 14 103 103 177 100.0 2e-43 MTCTTAALGGINPTDSYGVCPEVSVKIPSEITQPYLLNSTTKSYLEHLPANEDIEDFNKL VTFGKKFLQNEIGIDKNILKIINDNFWDML >gi|283510504|gb|ACQH01000115.1| GENE 3 1938 - 2654 271 238 aa, chain - ## HITS:1 COG:no KEGG:Gbem_3565 NR:ns ## KEGG: Gbem_3565 # Name: not_defined # Def: hypothetical protein # Organism: G.bemidjiensis # Pathway: not_defined # 1 238 1 241 241 149 35.0 1e-34 MNLPRNLKSCPIIDALVEIRFETTLNPNAVFGLIYGALINDYPGEVESLPILQVPEVVRI NDPALKFKPLYKIINKEVVIQIGNDMLSISSAIPYIGWETFQNHISKIINVVYEKKIISR VFRLGHRYVNFFDFDILDKVTLSFQMTGGYSAENVQITTQVQDGNFESTIQFSNTAIFNS RGTQKKGSIIDIDTFRNYEGNKFLESIVPEINSAHIAEKKLFFSLLKPKFIEDLSPVY >gi|283510504|gb|ACQH01000115.1| GENE 4 3025 - 3307 107 94 aa, chain + ## HITS:1 COG:no KEGG:ZPR_3005 NR:ns ## KEGG: ZPR_3005 # Name: not_defined # Def: transposase # Organism: Z.profunda # Pathway: not_defined # 2 94 51 143 266 126 61.0 3e-28 MLPVSAIFTGRVLSYKSVYAHFRKWSRNGEWKKVWGMILSRHRSFLDMSSVDLDGSHTTT IRGGECCGYQGRKKKTTTNAIYVTDRQGIPLAMS Prediction of potential genes in microbial genomes Time: Sat May 28 02:34:26 2011 Seq name: gi|283510503|gb|ACQH01000116.1| Prevotella sp. oral taxon 317 str. F0108 cont2.116, whole genome shotgun sequence Length of sequence - 40458 bp Number of predicted genes - 25, with homology - 22 Number of transcription units - 18, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 437 - 655 113 ## - Prom 795 - 854 3.5 - TRNA 645 - 721 74.3 # Met CAT 0 0 + Prom 563 - 622 2.1 2 2 Op 1 . + CDS 842 - 3520 2744 ## COG0474 Cation transport ATPase 3 2 Op 2 . + CDS 3525 - 4439 425 ## PROTEIN SUPPORTED gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 + Prom 4466 - 4525 2.9 4 3 Tu 1 . + CDS 4549 - 5346 882 ## PRU_2618 CAAX amino terminal protease family protein 5 4 Tu 1 . + CDS 5597 - 6856 617 ## COG4826 Serine protease inhibitor + Prom 7740 - 7799 2.2 6 5 Tu 1 . + CDS 7832 - 8125 236 ## + Term 8193 - 8239 3.2 7 6 Tu 1 . 
- CDS 9061 - 9492 682 ## COG0071 Molecular chaperone (small heat shock protein) - Prom 9561 - 9620 4.2 - Term 9937 - 9980 -0.1 8 7 Tu 1 . - CDS 9991 - 10200 65 ## gi|288929731|ref|ZP_06423574.1| hypothetical protein HMPREF0670_02467 - Prom 10315 - 10374 2.4 + Prom 9962 - 10021 2.6 9 8 Tu 1 . + CDS 10094 - 10411 440 ## COG0393 Uncharacterized conserved protein + Term 10413 - 10455 1.1 10 9 Tu 1 . + CDS 10899 - 12998 1679 ## BT_2458 putative pyridine nucleotide-disulphide oxidoreductase + Term 13153 - 13212 1.1 + Prom 13217 - 13276 5.1 11 10 Op 1 . + CDS 13298 - 14350 917 ## BF1208 putative endonuclease/exonuclease/phosphatase family protein 12 10 Op 2 . + CDS 14461 - 15018 382 ## Ppha_0327 hypothetical protein + Term 15037 - 15100 5.2 13 11 Tu 1 . - CDS 15146 - 16381 1297 ## Cpin_6429 alkyl hydroperoxide reductase/thiol specific antioxidant/Mal allergen 14 12 Tu 1 . - CDS 16525 - 16791 68 ## - Prom 16959 - 17018 7.3 15 13 Tu 1 . - CDS 18155 - 18574 204 ## Caul_4998 leucine-rich repeat-containing protein - Prom 18802 - 18861 2.5 16 14 Tu 1 . - CDS 18928 - 20508 1946 ## PRU_2054 putative lipoprotein + Prom 21154 - 21213 6.7 17 15 Op 1 . + CDS 21457 - 24549 3086 ## BT_3271 hypothetical protein 18 15 Op 2 . + CDS 24579 - 26201 1471 ## BT_3272 putative outer membrane protein 19 15 Op 3 . + CDS 26219 - 26977 618 ## gi|288929741|ref|ZP_06423584.1| hypothetical protein HMPREF0670_02478 20 15 Op 4 . + CDS 27002 - 28687 1442 ## gi|288929742|ref|ZP_06423585.1| hypothetical protein HMPREF0670_02479 21 15 Op 5 . + CDS 28764 - 29858 1288 ## gi|288929743|ref|ZP_06423586.1| fibronectin type III domain protein 22 16 Tu 1 . + CDS 30376 - 33216 3459 ## COG0612 Predicted Zn-dependent peptidases + Term 33462 - 33513 -0.3 + Prom 34636 - 34695 6.7 23 17 Tu 1 . + CDS 34751 - 36532 1406 ## COG3513 Uncharacterized protein conserved in bacteria + Term 36533 - 36575 11.2 24 18 Op 1 2/0.000 - CDS 36732 - 38408 695 ## COG0210 Superfamily I DNA and RNA helicases 25 18 Op 2 . 
- CDS 38393 - 40438 1009 ## COG3593 Predicted ATP-dependent endonuclease of the OLD family Predicted protein(s) >gi|283510503|gb|ACQH01000116.1| GENE 1 437 - 655 113 72 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAHEELINKVEDLHLNGRGTAVRQLLFFVEEQGETSMSRKVGYGYLAVSGGIRLDEEQLL ETITSALNLPIV >gi|283510503|gb|ACQH01000116.1| GENE 2 842 - 3520 2744 892 aa, chain + ## HITS:1 COG:MTH1001 KEGG:ns NR:ns ## COG: MTH1001 COG0474 # Protein_GI_number: 15679019 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Methanothermobacter thermautotrophicus # 15 891 6 844 844 457 34.0 1e-128 MYLCILGEMCPIHNYIDVSMSIETTSEKFDNNGLTDEQVAKSREQHGYNVLTPPKRASLW ALYIDKYRDPIIQILLVAVAVSLVFSFINGEFLETIGIFLAIILATTIGFYFERDAAKKF DVLTTLGEEQPVKVRRNGQTTTVARKDVVPGDLMVIEVGDEIPADGLLLQSSDLQIDESS LTGEPIINKHATPEVAEEEATYPTNVVLRSTMVMNGSGLARVTAIGDNTEIGKVACKATE LTAVKTPLNQQLDHLAKLISKVGSAVAVTAFAVFLAHDVFTNPLWQGQDYMRMAEIVLRY FMMAVTLLVMTVPEGLPMAVTLSLALNMRRMLKSNNLVRKLHACETMGAVTVICTDKTGT LTENKMQVAELRVFAPETPLVDAIALNTTAHLETKDNITSGIGNPTEIALLLWLTSNGHN YRELRAAGKVENQLPFSTERKYMATVASLNGQRFLFVKGAPETIMGLCSLTDEERHDTTH LLRSYQNKAMRTLVFAYKPLPGEVEVNFADANFMQGLTLQGVAAISDPIRQDVPAAVKEC LDAGIEVKIVTGDTAATAIEIARQIGIWHKDTPIEAQITGPDFAALSDESAYERVRVLRV MSRARPTDKQRLVNLLQQRGEVVAVTGDGTNDAPALNHAHVGLSLGSGTNVAKEASDMTL LDDSFRSIERAVMWGRSLYNNIRRFLFFQLVVNLTALLLVLGGAIIGTQMPLTVTQILWV NLIMDTFAAMALASLPPTREVMNERPRKQTAPIISPSMARGIVFCGLMFFAMLFALLVYF RNESGGQLDTHQLTIFFTAFVMLQFWNLFNAKTLNSHHSTFRRLYADRGLLLVLLLILAG QWLIVTFGGRMFRTVPLSFTEWAAIIGATSVVMWVGEVWRRLKPKAPSTNPA >gi|283510503|gb|ACQH01000116.1| GENE 3 3525 - 4439 425 304 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 [Bacillus selenitireducens MLS10] # 5 293 14 304 317 168 34 6e-41 MHTYNTPPCVATIGFFDGVHLGHRYLISHVRQLAQGQGMPSMVVTFGTHPRQVVQTSFVP KLLSTPEEKQKLLCETGIDHTMVLNFDQNMATLTAREFMQRVLKERLNVGQLVIGYDNKF GHNREEGFEDYCRYGEMLGIQVVEAPQFEPDGVHVSSSAIRTHLLEGRVKQANALLGYAY SLGGKVVPGFQEGRKMGFPTANLLVENPQKLLPQNGVYATWVEVEGHEHPLMGMTNIGNR 
PTFNGKGVTIETNILDFSADIYGQNMRIAFVDRIRAEVKFDTIDQLKAKMKEDEEVARRL VQNV >gi|283510503|gb|ACQH01000116.1| GENE 4 4549 - 5346 882 265 aa, chain + ## HITS:1 COG:no KEGG:PRU_2618 NR:ns ## KEGG: PRU_2618 # Name: not_defined # Def: CAAX amino terminal protease family protein # Organism: P.ruminicola # Pathway: not_defined # 10 261 3 253 256 158 41.0 2e-37 MEKNNRLILNVLGYLVAFIAVQFVVSLVALYLDDSIKHPPSASAKALTLSSVVSSVILIA LFVWCKWAPISRYYLQQKPWGALVWAILAGLGAILPLQWLYEQMNITMSEDVKHLFESIM GSSWGYLALGILAPVAEELVFRGAILRSLMAYFNYRLPWIPIVVSALLFGAVHGNVAQFA NAFVMGLLLGWMYCRTHSIVLGVALHWVNNTVAYTMYKLMPEMNDGQLIDLFHGDNKLMY MGLFCSLLVLLPSLFQLNKRMTAGG >gi|283510503|gb|ACQH01000116.1| GENE 5 5597 - 6856 617 419 aa, chain + ## HITS:1 COG:all0778 KEGG:ns NR:ns ## COG: all0778 COG4826 # Protein_GI_number: 17228273 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Serine protease inhibitor # Organism: Nostoc sp. PCC 7120 # 43 412 5 368 374 107 25.0 5e-23 MKTTLFTCLLALCVACAAPRRERLKDVAEKAEVSFFLDESQRKLVDGSNEFGIRLFQVMA QAHPDSSMVISPMGVVYSLNMLNNGTDGETLAEICKALGYGEGDLQRVNELCRTFLVAQR KKYRIKGQQTDDYMHTANLLATIGEDVVKGSFKEVLASNYFADAISATSVNQLQQQADQW ISEQTEGKVKQLPLKLSANAQACLANTIVFQGGWAQCFDEESTRKAPFYCEDGSIDSVWM MSRHLNDDTFKGRYDEHFIALRMPYRGQFNITMLLPVEGQPLINLVKQLDARMLQEINNS LVTFDEVHVRIPRMRLNTSVALKPLLAKVGIKQTFSQTAHLSKMSSGALWVNDIVQQTQF SLDEQGSTAISSTATELGALSDILQPRELHFTANRPFLYYISDGFGNVCFIGQYCGGKR >gi|283510503|gb|ACQH01000116.1| GENE 6 7832 - 8125 236 97 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSNSFDSTLKYFDLEPEEMFLCALCVNISSCQYGFYCMLFTKIGMLMYLLWFVVKRAIAP QRREVVKKWYKKTAALMMGVGGLIMLLRYINEIVVSI >gi|283510503|gb|ACQH01000116.1| GENE 7 9061 - 9492 682 143 aa, chain - ## HITS:1 COG:MA0133 KEGG:ns NR:ns ## COG: MA0133 COG0071 # Protein_GI_number: 20089032 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone (small heat shock protein) # Organism: Methanosarcina acetivorans str.C2A # 13 136 28 148 153 67 32.0 7e-12 
MIMLPVMKRNSWLPEVFDDFLNSNLMPRANATAPAINVLENDKQYTVELAAPGLKKDDFS VNVNEDGNLSIKMEQKSESTDQNEKTHYLRREFSYSKYEQTLLLPEDVNREAIAARVNDG VLTVDLPKVQKEEQKQFRSIQID >gi|283510503|gb|ACQH01000116.1| GENE 8 9991 - 10200 65 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929731|ref|ZP_06423574.1| ## NR: gi|288929731|ref|ZP_06423574.1| hypothetical protein HMPREF0670_02467 [Prevotella sp. oral taxon 317 str. F0108] # 23 69 1 47 47 73 100.0 5e-12 MKSLKRLAPIMVSPVTTPLYACMGRPSIVGVVLINIRLIYNNVFIATLTSMLGRDFMLSV CLLVTSRQR >gi|283510503|gb|ACQH01000116.1| GENE 9 10094 - 10411 440 105 aa, chain + ## HITS:1 COG:ECs0952 KEGG:ns NR:ns ## COG: ECs0952 COG0393 # Protein_GI_number: 15830206 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 103 1 103 107 127 65.0 3e-30 MLISTTPTIEGRPIQAYKGVVTGETIIGANLFKDFMAGIRDIIGGRSGSYERVLREAKDT SLAEMQRRAEEMGANAIVGVDIDYETIGQNGSMLMVAVSGTAVVI >gi|283510503|gb|ACQH01000116.1| GENE 10 10899 - 12998 1679 699 aa, chain + ## HITS:1 COG:no KEGG:BT_2458 NR:ns ## KEGG: BT_2458 # Name: not_defined # Def: putative pyridine nucleotide-disulphide oxidoreductase # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 689 12 626 626 654 48.0 0 MRTTPFLALALLIAAWLAKPLTAANIWIEAESFQEKGGWLVDQQMIEGMGSPYLIAHGLG TPVKDALTKVRIDTAATYNVYVRTYNWTAPWHSAAGPGGFKVAINGKRLPVIVGQTGNRW QWQKAGTMNLKQGTATLALCDLMGFDGRCDAICLNTSPVPPPEGYETLKRYRDKMLRRGQ HVRSAGVFDLVVVGGGVAGMSAAVAAARLGLKVALVQNRPVLGGNNSSEVRVYLGGRINT GKYPRLGDLQKEFGPTQEGNAMPAENYADDRKLKFVKGESNVSLFLNHHVCGVQMQGQHI PMTKLLPALQPSAIPHGAIPKASSASDSDLNVTASSAKNGVESQLLAANGTGKSKGNVLI NRDNAIAAVIAMNVLTGEELRFEAPLFADCTGDATLGVLAGAMYSIGREPQSAFGEELAP QQADDMTMGVSMQWYAKKKDKPTSFPLFEYGISFNEQTAEKRLRGEWTWETGLNSRIVDN LERVRDYGMLVVYANWSFIKNRSKDRRRYERQQLDWLAYVAGKRESRRLLGDYVLSEQDI VKNMPHEDATFTTTWSIDLHYPDTLNARNFATGPFKAISRQRVIYPYAVPYRCLYSRNIS NLFMAGRNISVTHVALGSVRVMRTTGMMGEVVGMAAAVCHANKAYPRQVYTHFLPQLQAL MERGVGDANLPNNQNYNEGWVLEQPPRLESRHVDQEQDE >gi|283510503|gb|ACQH01000116.1| GENE 11 13298 - 14350 917 350 aa, chain + 
## HITS:1 COG:no KEGG:BF1208 NR:ns ## KEGG: BF1208 # Name: not_defined # Def: putative endonuclease/exonuclease/phosphatase family protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 350 1 341 341 434 57.0 1e-120 MKKLVIIMLLAAFVCTATAQKNAQGVAKRAVQFTVLQWNIWQEGTLIPGGFEAIVGELQN LQPDFVTLSEVRNYNDVDFTHRLCQALKEKGLTYYSFRSKDSGLLSKHPIIDWRPIFPEN RDHGSIYKLVAKVQGVEVAVYAGHLDYLDCAYYNVRGYDGNTFKETTKPESIDEVLRLND LSWRDNAVQIFLNEAGHDIVAGRCVVFGGDFNEPSHLDWIHATANLYDHNGMAIPWTVST MLQRAGFKDSYRVLYPNPVTHPGFTYPCFNIAAEMSKLTWAPKADERERIDLIYYKGRGM EVDKAQIFGPNVCVCRSQPAQDKWDDPILLPKGGTWPTDHRGLLVKFIVK >gi|283510503|gb|ACQH01000116.1| GENE 12 14461 - 15018 382 185 aa, chain + ## HITS:1 COG:no KEGG:Ppha_0327 NR:ns ## KEGG: Ppha_0327 # Name: not_defined # Def: hypothetical protein # Organism: P.phaeoclathratiforme # Pathway: not_defined # 3 182 20 200 205 154 50.0 2e-36 MIQPFLQQKGGYRKLRVYQVATIIYDVTYYFVSHFLSSRDRTTDQMVQAARSGKQNIAEG SEAATTSAETEIKLTNVAKASLEELLVDYEDYLRVRGKVQWTPEHPRFERMRHYASSDRI GYEYTKLLPKLTDEELANLCITLINQATYMLRHLIDKQQKQFLQNGGVREQMTKARLKWR KENGG >gi|283510503|gb|ACQH01000116.1| GENE 13 15146 - 16381 1297 411 aa, chain - ## HITS:1 COG:no KEGG:Cpin_6429 NR:ns ## KEGG: Cpin_6429 # Name: not_defined # Def: alkyl hydroperoxide reductase/thiol specific antioxidant/Mal allergen # Organism: C.pinensis # Pathway: not_defined # 38 395 32 370 715 107 28.0 1e-21 MNIRYALSLLLIAIIGASTKVAAQLPTPTLIPEKETCIIRGQITSLVAAGAKAKVKNVSL CVDYGDGNLVQVASTVTNKGRFRFNRDITPTHPSMLYYITGLGKEGIPFWVEPGEVNIVV PTASQGANAKVSGPTTNTLYQAYFALANQRDRDYNDSVAALQRSKGNDFMATNAGRDARQ ALRAAVEKQWTANRKQFVLEHKDLPLAPLLVQRDLLPILNKESVDQLMQAFSPKLGKHPY TRSLSNNIKALNLGQGKEVPDIRLPLEDGHAIQLYDLRGKHVLLTFWASWAPGCLDEMQN IKRIYDETRNAADKFVMVNLSIDKDKETWKRSVKSLGINRDGWLQAYDSQNEVSPAAKLF GIRDIPKCILISPDGKAISFTLMGIELFARVKQILAGDLYYLRDENAEVGK >gi|283510503|gb|ACQH01000116.1| GENE 14 16525 - 16791 68 88 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPKAYKQQRTANKNHLHTYQPLYLNGTKLYIIHVQEKDKKNKKRTKQNHEKAQAHNNMRN IGRICRQRLHTCALPYPQQGLGCRQKGR >gi|283510503|gb|ACQH01000116.1| 
GENE 15 18155 - 18574 204 139 aa, chain - ## HITS:1 COG:no KEGG:Caul_4998 NR:ns ## KEGG: Caul_4998 # Name: not_defined # Def: leucine-rich repeat-containing protein # Organism: Caulobacter_K31 # Pathway: not_defined # 1 125 1 131 146 79 35.0 3e-14 MIDNIKEYIKLADSNNLEDNNRTKMEQLSSEVISEIIHNYPERKAWLVHNKHIPIDVLRL LCTDSNADVRFTVAMKNKNDRYIFETLMKDPDFSIRMAVVRNKKIPMDLLRRMADDKSDK IATEAMRILKLRNRNRLGF >gi|283510503|gb|ACQH01000116.1| GENE 16 18928 - 20508 1946 526 aa, chain - ## HITS:1 COG:no KEGG:PRU_2054 NR:ns ## KEGG: PRU_2054 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 40 525 34 479 479 124 24.0 1e-26 MCAQKQHSIFALLALTLLTCACHNRTLSRLEQIDSLLSQDNVEKAYTLVQAIPIHEMENS DEMAYFTLLKTEATYRNMISIDNDSIDISVFYYEQHGPKEQLARAYYYKACILFFNRNNV TAAIKLLKRAENTAKGHGGLALMHKIFSTLSYINLCTKNYNTALNYAEKARGMATRANND LWMAYSLSYIANAYSGLDMPDSNLNYLLKALKLYKHLDTESQTVLLANISDVYSQKGDTV KSAKFILRALKERPDSYTYTILTDLYIRQGQYQKAFDLLQKALASDDVYTREKALYNLFK LKQKMGDYVGAANTADSLLIFKEKQQKEWQANNVYEIQNRFDIEASEREIRSYRIYITCL ALVLVLSVITLVLYHKYKVARAKRSILEKRLLISEYSDRIDQFKVSQTEANKELNSLRQR VNNLKDKELKVLSNGKLLYESILQNHSTVHWSTNDFADFIEYFKFADPTFMLSLDNTYNN LSTRQYLFLIVTGKLGKSETDAGDILGISPGSVRSIKSRIKTKKKS >gi|283510503|gb|ACQH01000116.1| GENE 17 21457 - 24549 3086 1030 aa, chain + ## HITS:1 COG:no KEGG:BT_3271 NR:ns ## KEGG: BT_3271 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 37 1030 109 1116 1116 762 42.0 0 MRRNLSRPANVMFWLMALLLCWSQAHAQGPGSTGNITIEGIVIDNNTNEPVVGATVKLKD GTKGTVTSADGKFVMTGLSVGNVVHVSFIGFKPRQFTVGKKRFVTVYLEDDANQLDQVVV TGFQKLKKNSFTGTATVVTGDELRKVNVKDAVKALEAFDPSFRIIDRGGFGSDPNHINEI NIRGASNIQRQEFDANGQALERRTNLLGNPNMPIFMLDGFEVPVQKIYDMDINRIQSMTI LKDAAATALYGSRAANGVVVVTSVPPKVGEIRIDYNTTLELTFPDLSDYNLTNAAEKLQV EQDAGIYTASNKVDQVEKDIEFNNKLNEVRRGVNTDWLSRPLRNTANWRNSLNVSGGVNS IRYGIDLNYDKNGGVMKGSYRNRWGAGMTLDYRLHEWLQIQNQATINRTTYEDSPYGSFG TYSTYMPYEAIYGEDGELLKNLPMSKKANPLWQQQNLANYSGRGGITDFTDNFALNVYFT 
PSLYFRGSFSVTQRTTDNNAFVDPKDSRFKASSANRKGTLTISEQEDLSWSSKGEVHFNQ RLDKHFINLTGAVDLQEQIARTKGRRYEGFSLGQLDTPIYAAQQDGKAGDSKTTIRTVGW LGALNYTYNEIYLLDASFRFDGSSQFSSKKRFAPFASIGLGVNVHNYPFMKKFPWVNSLR FRGTYGSTGKVTFSRFDVISSYNVDTKSWYYTGPAVTLATMGNPDLTWELNKTLDLGFSF ELFKRRLFFEATYYHKATDRTIDQIKIRPSSGFSSYSGNAGGVLNEGVELKTNVTVYRDR DLSVVFNANLASNKNRITKLNSSIEDYNKRIRENNKTENGGRGGSLPPILYYVGASTTAI YAVPSLGIDPATGVELFRKKDGTITKEWNQDDMQVCGDLNPDVQGSFGVNVAYKGFYLNT SFKYAYGGQAYNHTLVSKVENAKIKDENVDRRILTERWRKVGDVSAFYGIRQNAVTNATS RFIQNDNYVYFNSLTLGYDFPKAITSKLKLNALSLTFNASDLARWSTVRVERGLSYPYAH TYSIGLRASY >gi|283510503|gb|ACQH01000116.1| GENE 18 24579 - 26201 1471 540 aa, chain + ## HITS:1 COG:no KEGG:BT_3272 NR:ns ## KEGG: BT_3272 # Name: not_defined # Def: putative outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 15 512 16 469 488 219 30.0 3e-55 MKRSIYFLITSAMLSLVFGSCNDWLTVPVDGKSTSTELYTGSDGYRSSLNGIYKGLAVDE LYGLQLQYGVIDFFSNQYRNDLREDEQSSTIFIAAGKREFKNVALQPVLNSMWMTAYNRI AAANDLIDNVSKASDSKFRQGEMERKVIYGEALAIRALLHFDMLRLFAPAPVNDDGQAYI PYITTFPEVLGAHVPVKVSLDMVIKDLEDARELVKLYDFSPLGMSANASGKARFYNNLEY GMEGYGKTEDLDEFFLGRGYRFSYWAITGLLARAYQYKAAYDASFYEKAKILAQEVLDAK CTDAKGGTSYSPFKNEDFNGFAYAQDPEQMRDVRMVGNLLLGIYRDNEAEAKLSGIEASF PREKKSPAQYSFCAVDVDGQDIFKTAGGVDESKDDIRCLRLLYKPINSYNILLSTKWYVK TRDTDERKRTLNIFPILRTSEMRFIIAEALARNKEYDKAYEILNEMRRNRGLRNGDLPTQ ASLDGFLKDLVREGQREWISEGQLFYLYKRLNFEVKRDDGTKAPFKKEECVVPLPIEELR >gi|283510503|gb|ACQH01000116.1| GENE 19 26219 - 26977 618 252 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929741|ref|ZP_06423584.1| ## NR: gi|288929741|ref|ZP_06423584.1| hypothetical protein HMPREF0670_02478 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 252 1 252 252 501 100.0 1e-140 MKRLFYFIPLTILALLVGCKEDALLTFGDDRYVYFEKFYRDAVAPGTEKADSTMASFFFY NDDVNSIDALLEVHIAGRNLTHDETFKLRVVPEETTAKPEEYEIKDIYTFRAREAAPNAT NRNDTIAIKMHRSTRLKDFPHGLRLMVELVPQGALQLGQSERIRALVVLTRDAIKPKWWK DEVETNLLGTYSSKKYKLFVTHIDTKLIFDGELVKRNPAKAILLVRNFKRWLAEHPKEAV EEDGVTPITVNV >gi|283510503|gb|ACQH01000116.1| GENE 20 27002 - 28687 1442 561 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929742|ref|ZP_06423585.1| ## NR: gi|288929742|ref|ZP_06423585.1| hypothetical protein HMPREF0670_02479 [Prevotella sp. oral taxon 317 str. F0108] # 1 561 1 561 561 1158 100.0 0 MKRIKHFMIWLSVPLLLGACFSDEGNYEYESLKPPTWTRDFMSSPIQISVREGRPIRIDG TKCFNWGKLDSLQRSGEVRYEWRYYGKVFCTDLKAEIPYEEFMKRCGLEEMPISDYERGD FAIIEKSTGVSFKVKTTFYFHAQIDEADFLIYSAKSPNEPHVGKVNVMRLDYSRDPKTGG YVSDFVLSDRHFSQDLPGTPKQLDLSAAKNVSLVGSATAITEEGDAYVLNSANLQKVWDM SSQFDEGTPADFKVSSRKDQDGGGTDSYAFSWVATKDGRVFTRQFGRNYLGGKFITEPYY LDSLGYKITQFGHTCWGITNIPCYDEKNKRVLIATSLQNGAYGSYRSYMTTLYKDGWKGV PVMKMPDDTKVYYLTLMNGDQYWDNNANSWFQIYYNTGGKSMVGTFAVDNRGRRLNEPNS YTRPYEVTGHLFNKETVFLVAAGTRLTNSSSKAYIDLFSEGKEVYGIKRSGLGWPGQFDY EIIKLPLKGITSKITSMTYNRDDLLSGSNYHDLVIGCENGDILVYEAEELLNPSFRKKLN VGGKVVAIKQLGLIRATVDMY >gi|283510503|gb|ACQH01000116.1| GENE 21 28764 - 29858 1288 364 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929743|ref|ZP_06423586.1| ## NR: gi|288929743|ref|ZP_06423586.1| fibronectin type III domain protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 364 5 368 368 734 100.0 0 MVLKRSAGALMMLLAGAFAVQAQSVDKVSLNVSVANKGGAPTVTLKKSAECQAYCLALFT SDIAGIIKDEVVDGYFTGQKKPRYTDNVKDKKLVDYGLVVVPEKEYTLIALGYGKDGKPG KVSRTQFTAPRRQGFASAQVLAQVMNIGPDSVTVKFTPNTDVAGYALCQFEGGTIEQTVK EHGPMMGFSNAYDMIKQFSGKDYTTEKQHTWREMVPDANYEICVLPWDKNGVFQEITKVP VTTKKLGGPGEATVDIKIGEFGGSKKGGYYQDVVYSPNDQASLHRDIIITEEAFNKPDMG NIGVIKMLQTDHPKDIYWNQYRVDRARWNATPLTTYYACSMAKNVDGKWGELKKVKFTTP AAKE >gi|283510503|gb|ACQH01000116.1| GENE 22 30376 - 33216 3459 946 aa, chain + ## HITS:1 COG:BB0536 KEGG:ns NR:ns ## COG: BB0536 COG0612 # Protein_GI_number: 15594881 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Borrelia burgdorferi # 33 894 28 881 933 280 27.0 1e-74 MSNLKTILRGWALLALLVPTNLRGQGVNQAPPIPVDTAVRIGKLPNGLTYYIRHNNWPEH RADFYIAQKVGSIQEEESQRGLAHFLEHMCFNGTKHFPGNELIRYLETLGVKFGGDLNAY TSIDQTVYNISNVPTTRQTALDSCLLILSDWANALTLDPTEIDKERGVIHEEWRERTGAT SRMLERNLPKLYSGTKYGARFPIGLMSVVDNFKPKELRDYYEKWYHPSNQGIIVVGDIDV AHTEAMIKKLFGPLKNPDNQAKVVDVPVPDNEEPIIVVDKDKEQANSSVEVSFKHEAWPD SLKRNVDYLLANYAKNMALGMLNDRYAEAAQNSTNCPYLGASAADGNFIFAKTKGAFTIS ATPRDMAGTAAALQAALVEALRAAKFGFTQSEYDRAKANLLSALEKAYNGRDKRGNASFA DDYKGHFLSQEPIPAFEDYYEIMKQLVPNIPLTDINAILPQLLPETDRNMVIINFNNEKE GNVYPTPESLLQAVHAARQTKVEPYVDTVKEVPLMTKLPRPGKIVKEKKNAELGYTELKL ANGVTVILKKTDFKKDQVNFAASAHGGKSLYGPKDYTNLAVFNDIVSISGLSGFRSMELP KILAGKIASAGLSIGDKYMGMSGGSSKRDVETMLQLAHLYLSGGITKDEQAFATLMDSWR TALKNRALNHDIAFNDSLVATVYGHNPRLRPVLETDLPDINYDRILQIARERTNNAAAWT FSFIGDFDEPQLRKLICRYLGSLPTKGKVVKGHLTSSFAKGKIENVFRRKMETPKAMACV MWHTTDVPYSVENAVRMNMIGQILSMVYIKTIREDASAAYSVSAEGGATIEGDYHDYSVL VTCPVKPEKRDTAMAIIYREAENMTRTCDAAMLDKVKEYMLKNVASAEKTNAYWSGVINM YRRHGINMHTRYRDMIKQQTPEKLCTLMKQILQSGNRITVEMLPQE >gi|283510503|gb|ACQH01000116.1| GENE 23 34751 - 36532 1406 593 aa, chain + ## HITS:1 COG:NMA0631 KEGG:ns NR:ns ## COG: NMA0631 COG3513 # Protein_GI_number: 15793618 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Neisseria meningitidis Z2491 # 
121 469 108 444 1082 68 25.0 3e-11 MNTRILGLDTGTNSLAWAVVDRDAKGNYTLVRRGDVIFTEGVNVDGQPVPSKAAVKSKYK SLRVQYARRRLRKIDTLKVLVSLDLCPYLSDQQLSQWKAKKTYPLTDDFMLWQRSGSAEE NNPYYCRYRCLTQKLDLTDQSQRFLLGRAFYHLAQRRGFLSNRLDTTSDSAESGKVKSGI AQLSVEMEKAGCTFLGEYFYQLYKQHGNTVRLRSRYTDREEHYIKEFLAICEKQELDAQW VEKLQRALYFQRPLKSQRKGVGKCTFEPKKPRCAESHPDYEAFRMLCILNNIKIKTPKDV SLRPLNAEEREKVWPLFFRKSKPNFDFEDIAKALAGKNNYAWYKDAGEKPFRFNYRMHQG VSGCPVTAQLIDIFGSDWKAGIAETYTAVGKKNGTKTIDEMADDVWNVLYSFSSKDKLKA FATDKLQLDEARATKFANAKLPHGFAALSLKAIRNILPFLRQGFIYAHAVLLAKIPDIVT AAVWDDADKREQILSGLLEVIENYNPDDRGIEGTIDFCIKSHLQDVVDLRPGAADALYHP SMIEAYPDAKLNKMGVFQLGSPRTNAIRNPMAMRSLHILRSVVNLLLRDGMLS >gi|283510503|gb|ACQH01000116.1| GENE 24 36732 - 38408 695 558 aa, chain - ## HITS:1 COG:MA2140 KEGG:ns NR:ns ## COG: MA2140 COG0210 # Protein_GI_number: 20090983 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Methanosarcina acetivorans str.C2A # 14 552 19 584 612 255 32.0 1e-67 MATIIKSEQTIPIDEPFKVTAGPGAGKTHWLINHIKKVVSNSHKLDVVRKVACITYTNVG IDTITSRLNMGNDVVEVYTIHSFLYANVIKPYIHLIAEDFGLKLDKLVVIDDSNFKSEGI AALVLKKVGKSWLDTKIYLKGLERAMWRYENHEYKHYKPDHPQVYKIKSGKRYVGNDWYM GFKRWLWSGGYMSFDDILYFSHILLSRYPNIYTLIKARYPYIFVDEFQDTIPFVIDLLAK LGNEGVIVGVVGDKAQSIYDFLGATVQQFDSFTVPEMQEYEIRGNRRSTKQIIDLLNIVR TDFSQDWLNGSEGMMPELLVGDMLNCYQQCIEKSGTDEIQSLAFQNILANSMRKKNGVRE VENILGMDFDSNAERQKVIKALIKAVEYTRMNDLRNAWHQLDIIDRDRTQTIVVLRYLLD GYKDYKDGSLMDFYNFLVNDLHVKMTKIKGTAIRDFYQNHTYADAALGVKYGDSNNKHKT IHKSKGEEYDNVFVVLKEEKDLEFLLSPNLNDNNSHRVYYVAASRAINRLFICVPTLSAE KRIQLEGMPINILLPNNS >gi|283510503|gb|ACQH01000116.1| GENE 25 38393 - 40438 1009 681 aa, chain - ## HITS:1 COG:MA2139 KEGG:ns NR:ns ## COG: MA2139 COG3593 # Protein_GI_number: 20090982 # Func_class: L Replication, recombination and repair # Function: Predicted ATP-dependent endonuclease of the OLD family # Organism: Methanosarcina acetivorans str.C2A # 1 679 12 680 689 382 34.0 1e-106 MYIKEINILNFRSFKEALIPFHEGVNVIIGHNNTGKSNLLRAMGLVLSYSNGHRLGTSDL 
FYETDVAELQRQSPRIQITLVLRRSADENLDSADMALFANMMTDPALSEEAELRYEFKLD DSQEKNYKADVANAITAKEIWKIIEQDYIRLYKSSRSGGNQAAGININETLGQIDFQFLD AIRDVSHDLYAGYNPLLRDVLNFFIDYSVKNDVTKTENEIKEQLKALRDDFVQQSRPLMQ TLQDRLQDGKNVFLKYALDTGATFNGAEPDFDGTVTENEMFSVLRMFIKYAVGIEVPATY NGLGYNNLIYMSLLLAKMQADGSIAYMKRNAKVLSFLAVEECEAHLHPAMQYKFLKFLQD NNLNGHVRQIFMTSHSTQIASAVKLDDLICLTSPALGQINVGYPRAIYKEESSNDMVSKQ YVQRFLDATKADMFFANRLIFVEGIAEELLLPVFARYLNKNLTDEHVLVVNMGGRYFNHF LKLFDTKNPYTINKKIVCLTDIDPCRKKKGDDKNYEVCYPYEYNIDTDNYDYKHHADTEV DQYVVHPNIRFYRQDVTYGKTLEYDLMRENPNCELLLTDSVSNRDEIKAMMAELDVNKMM GKMRNSEANTRIKSSIDQSGWEDEEKRKALLASRYLNSVSKGSNALELNVTLMDNLEKPE AERKEFHVPQYIIDALKWLLS Prediction of potential genes in microbial genomes Time: Sat May 28 02:36:44 2011 Seq name: gi|283510502|gb|ACQH01000117.1| Prevotella sp. oral taxon 317 str. F0108 cont2.117, whole genome shotgun sequence Length of sequence - 101097 bp Number of predicted genes - 76, with homology - 72 Number of transcription units - 48, operones - 13 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 1664 - 1717 14.1 1 1 Op 1 . - CDS 1748 - 5461 4509 ## COG4886 Leucine-rich repeat (LRR) protein 2 1 Op 2 . - CDS 5468 - 7855 2293 ## Ctha_0236 peptidase S8 and S53 subtilisin kexin sedolisin 3 1 Op 3 . - CDS 7855 - 9420 1704 ## gi|288929751|ref|ZP_06423594.1| hypothetical protein HMPREF0670_02488 - Prom 9440 - 9499 6.0 4 2 Tu 1 . - CDS 9589 - 9777 111 ## gi|288929752|ref|ZP_06423595.1| hypothetical protein HMPREF0670_02489 5 3 Tu 1 . - CDS 10589 - 11188 723 ## COG1051 ADP-ribose pyrophosphatase - Prom 11212 - 11271 5.4 - Term 12207 - 12262 8.6 6 4 Op 1 . - CDS 12477 - 12683 129 ## gi|288928567|ref|ZP_06422414.1| hypothetical protein HMPREF0670_01308 7 4 Op 2 . - CDS 12608 - 12871 58 ## - Prom 12923 - 12982 4.2 8 5 Tu 1 . - CDS 13003 - 13593 205 ## gi|260910901|ref|ZP_05917543.1| hypothetical protein HMPREF6745_1498 + Prom 13526 - 13585 5.8 9 6 Tu 1 . 
+ CDS 13629 - 13844 80 ## gi|260910902|ref|ZP_05917544.1| conserved hypothetical protein + Prom 13876 - 13935 3.3 10 7 Op 1 29/0.000 + CDS 14156 - 14617 494 ## COG2001 Uncharacterized protein conserved in bacteria 11 7 Op 2 . + CDS 14614 - 15573 999 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis 12 7 Op 3 . + CDS 15570 - 16184 731 ## PRU_2326 hypothetical protein 13 7 Op 4 26/0.000 + CDS 16181 - 18346 2704 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 14 7 Op 5 4/0.000 + CDS 18431 - 19876 1621 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase + Term 19925 - 19970 -0.8 + Prom 19984 - 20043 7.7 15 7 Op 6 28/0.000 + CDS 20082 - 21353 1516 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 16 7 Op 7 25/0.000 + CDS 21462 - 22793 1458 ## COG0771 UDP-N-acetylmuramoylalanine-D-glutamate ligase 17 7 Op 8 31/0.000 + CDS 22806 - 24071 1183 ## COG0772 Bacterial cell division membrane protein + Prom 24077 - 24136 2.2 18 7 Op 9 26/0.000 + CDS 24257 - 25384 1286 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase 19 7 Op 10 . + CDS 25434 - 26816 1439 ## COG0773 UDP-N-acetylmuramate-alanine ligase 20 7 Op 11 . + CDS 26848 - 27657 855 ## PRU_2334 putative cell division protein 21 7 Op 12 35/0.000 + CDS 27668 - 29086 1520 ## COG0849 Actin-like ATPase involved in cell division + Prom 29316 - 29375 3.1 22 7 Op 13 . + CDS 29440 - 30774 1485 ## COG0206 Cell division GTPase 23 7 Op 14 . + CDS 30854 - 31306 231 ## PROTEIN SUPPORTED gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21 + Prom 32323 - 32382 2.6 24 8 Op 1 . + CDS 32513 - 33139 486 ## COG0558 Phosphatidylglycerophosphate synthase 25 8 Op 2 . + CDS 33167 - 34207 881 ## Coch_1578 patatin 26 8 Op 3 . 
+ CDS 34204 - 34959 208 ## PROTEIN SUPPORTED gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 + Term 35183 - 35227 -1.0 + Prom 35338 - 35397 4.3 27 9 Op 1 5/0.000 + CDS 35484 - 36233 76 ## COG0671 Membrane-associated phospholipid phosphatase 28 9 Op 2 . + CDS 36214 - 37911 1112 ## COG0500 SAM-dependent methyltransferases + Term 38151 - 38189 -0.4 29 10 Tu 1 . - CDS 37892 - 38935 723 ## BT_2183 hypothetical protein - Prom 38985 - 39044 2.0 - Term 39114 - 39157 3.1 30 11 Tu 1 . - CDS 39183 - 41567 1735 ## BDI_3481 putative TonB dependent outer membrane protein - Prom 41642 - 41701 8.0 - Term 42059 - 42089 1.2 31 12 Tu 1 . - CDS 42225 - 42905 737 ## PRU_2447 hypothetical protein - Prom 42932 - 42991 2.6 - Term 43019 - 43092 10.5 32 13 Op 1 . - CDS 43149 - 44594 1693 ## COG0681 Signal peptidase I 33 13 Op 2 . - CDS 44694 - 45449 941 ## COG0289 Dihydrodipicolinate reductase - Prom 45537 - 45596 3.9 - Term 45562 - 45600 -0.4 34 14 Tu 1 . - CDS 45623 - 46750 1380 ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake - Prom 46837 - 46896 3.2 - Term 46842 - 46902 9.2 35 15 Tu 1 . - CDS 46945 - 47361 497 ## COG0824 Predicted thioesterase - Prom 47387 - 47446 2.8 - Term 47434 - 47481 2.1 36 16 Tu 1 . - CDS 47729 - 49027 862 ## PRU_2464 hypothetical protein - Prom 49047 - 49106 5.0 37 17 Tu 1 . + CDS 49187 - 50176 459 ## PROTEIN SUPPORTED gi|145632364|ref|ZP_01788099.1| ribosomal protein L11 methyltransferase + Term 50197 - 50244 1.2 38 18 Tu 1 . + CDS 51216 - 51881 471 ## COG0692 Uracil DNA glycosylase + Prom 51910 - 51969 2.4 39 19 Op 1 . + CDS 52050 - 52715 512 ## COG0692 Uracil DNA glycosylase 40 19 Op 2 . + CDS 52719 - 53324 591 ## PRU_2353 hypothetical protein 41 20 Tu 1 . + CDS 53623 - 54642 1088 ## COG2855 Predicted membrane protein + Term 54753 - 54794 -0.3 42 21 Tu 1 . - CDS 54776 - 55342 454 ## gll1842 hypothetical protein 43 22 Tu 1 . 
- CDS 55459 - 56529 904 ## COG4637 Predicted ATPase - Prom 56636 - 56695 3.9 - Term 56679 - 56737 5.1 44 23 Tu 1 . - CDS 56757 - 57623 714 ## COG4632 Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase - Prom 57780 - 57839 2.5 + Prom 57707 - 57766 3.5 45 24 Tu 1 . + CDS 57851 - 59479 204 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit + Prom 59624 - 59683 3.2 46 25 Tu 1 . + CDS 59705 - 60310 667 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes - Term 60393 - 60426 -0.9 47 26 Tu 1 . - CDS 60661 - 62325 1899 ## COG2985 Predicted permease - Prom 62348 - 62407 3.7 + Prom 62126 - 62185 1.9 48 27 Tu 1 . + CDS 62293 - 62502 77 ## + Prom 62991 - 63050 3.1 49 28 Tu 1 . + CDS 63086 - 63268 90 ## + Prom 63279 - 63338 2.8 50 29 Tu 1 . + CDS 63459 - 63653 87 ## 51 30 Op 1 . - CDS 63972 - 64850 682 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 64875 - 64934 5.0 52 30 Op 2 . - CDS 64946 - 65821 400 ## gi|288929796|ref|ZP_06423639.1| hypothetical protein HMPREF0670_02533 - Term 66249 - 66299 8.2 53 31 Tu 1 . - CDS 66373 - 68751 2053 ## PRU_2351 hypothetical protein - Prom 68990 - 69049 3.2 - Term 68952 - 68994 1.4 54 32 Tu 1 . - CDS 69157 - 71130 2666 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit 55 33 Tu 1 . - CDS 71448 - 72659 521 ## gi|288929799|ref|ZP_06423642.1| hypothetical protein HMPREF0670_02536 - Prom 72685 - 72744 3.5 - Term 73153 - 73193 7.2 56 34 Tu 1 . - CDS 73223 - 73477 330 ## PROTEIN SUPPORTED gi|29348839|ref|NP_812342.1| 30S ribosomal protein S20 - Prom 73712 - 73771 3.3 - TRNA 73500 - 73574 39.3 # Pseudo TTC 0 0 + Prom 73446 - 73505 7.7 57 35 Tu 1 . 
+ CDS 73736 - 74467 645 ## PRU_2337 putative DNA repair protein RecO + Term 74576 - 74602 0.3 - Term 75766 - 75808 10.3 58 36 Op 1 1/0.000 - CDS 75918 - 78146 179 ## PROTEIN SUPPORTED gi|148543994|ref|YP_001271364.1| 30S ribosomal protein S1 - Prom 78168 - 78227 4.8 59 36 Op 2 . - CDS 78339 - 80249 2204 ## COG0513 Superfamily II DNA and RNA helicases - Term 80800 - 80870 9.4 60 37 Tu 1 . - CDS 80910 - 81737 911 ## COG0657 Esterase/lipase - Term 83431 - 83458 -0.8 61 38 Op 1 . - CDS 83495 - 85804 1802 ## BVU_1454 hypothetical protein 62 38 Op 2 . - CDS 85833 - 86057 258 ## gi|288929807|ref|ZP_06423650.1| hypothetical protein HMPREF0670_02544 - Prom 86101 - 86160 5.2 + Prom 86081 - 86140 4.6 63 39 Tu 1 . + CDS 86300 - 87550 688 ## Fjoh_3271 SH3 type 3 domain-containing protein + Prom 87631 - 87690 4.8 64 40 Tu 1 . + CDS 87813 - 88250 243 ## ASA_1312 hypothetical protein + Term 88426 - 88467 1.3 - Term 88772 - 88822 18.5 65 41 Tu 1 . - CDS 88854 - 89009 223 ## PROTEIN SUPPORTED gi|150003118|ref|YP_001297862.1| 50S ribosomal protein L34 - Prom 89075 - 89134 6.0 66 42 Op 1 . - CDS 91112 - 91324 277 ## gi|288929812|ref|ZP_06423655.1| putative transposase 67 42 Op 2 1/0.000 - CDS 91363 - 92136 1023 ## COG1624 Uncharacterized conserved protein 68 42 Op 3 . - CDS 92201 - 93049 1088 ## COG0294 Dihydropteroate synthase and related enzymes - Prom 93218 - 93277 1.9 69 43 Tu 1 . - CDS 93299 - 94447 1447 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 94599 - 94658 7.5 + Prom 94491 - 94550 6.2 70 44 Op 1 2/0.000 + CDS 94683 - 95249 757 ## COG0231 Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) + Term 95262 - 95313 12.2 71 44 Op 2 . + CDS 95369 - 96406 1071 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Prom 96537 - 96596 3.2 72 45 Tu 1 . + CDS 96722 - 97402 817 ## COG2003 DNA repair proteins + Term 97572 - 97600 0.6 - Term 97249 - 97287 -1.0 73 46 Op 1 . 
- CDS 97525 - 98289 850 ## COG2908 Uncharacterized protein conserved in bacteria 74 46 Op 2 . - CDS 98316 - 98636 579 ## COG2151 Predicted metal-sulfur cluster biosynthetic enzyme - Prom 98669 - 98728 3.4 75 47 Tu 1 . - CDS 98763 - 99650 747 ## COG3177 Uncharacterized conserved protein - Prom 99687 - 99746 3.3 76 48 Tu 1 . - CDS 99950 - 100870 719 ## PG1032 ISPg3, transposase - Prom 100922 - 100981 4.7 Predicted protein(s) >gi|283510502|gb|ACQH01000117.1| GENE 1 1748 - 5461 4509 1237 aa, chain - ## HITS:1 COG:lin0354_1 KEGG:ns NR:ns ## COG: lin0354_1 COG4886 # Protein_GI_number: 16799431 # Func_class: S Function unknown # Function: Leucine-rich repeat (LRR) protein # Organism: Listeria innocua # 888 1045 122 279 292 71 30.0 8e-12 MNYKQTLLASLLLALMAVVGLPAFAQNKNQQPIITFHTTVYEENKGESPIVTFSLGSTQG NEYVEIDCGNGSEELAVGVAKLKQGAEVPTGTDYSGTVTRDGIVKIYGDPSKIDFFNATG CKIDQITFHKDLKLSYLRLAYNSIKQIDLSNLKSLQFLYLRDNPFSAQTPLRLGNMPELL ELEIAQCAYVQSKFELKNFPKLVSFDAYHCRTITEVDTKGCTTLRRLSLDMTDISTIDVS HNANLNVLNVSDTRVNKLDISKNSKLNELYIEHASGTINTDVKFDDIDLTHCPDLRHLYC AGNNLRSLNLSKVPKLVTLGANHNKLTSIDVSKCPDLYIFNIRKNLMTFASLPKPQNTWR EYYYDQRDLVLDDTYKVGTVLDFSKQVLREGTTTLGKLYKLDKDVLAQRTELDASYYYYD NGKVTLLRPVDGKVVLIFTNTIFNEYPLFTEPFTVKDDSEFGKDVRAIDFATTATAGQTI NLTVGMKGATPQKPVKVKVDFGNGQLKEVNITAEKPTAANITGQRAGNGNIVVTVPQDNY VTALEVKDLNITSIDVTELTDLRVLSLVNTGLTTIDLGRNNQLERLDLSHNQLKTISLKG KSKSQYKSLLTNVNLSHNQLETITVDDYFTITDFDLSHNNLANLDLRRADALVRGNLSYN KLTRMILDHSEQLKELNVSNNELTLLGIIPLAPVAKLDISNNHFTLANMPNRFGLDEAHF TYAPQREVVIATTSPGIDLTEQYVTINNKTTNFVWKTEGGQPYVLGTDYLDDKGSTRFKT LTKGKVYCEISHPAYPAFTGKNILKTTLVQPIPAPTNELASFTTTNNKDSVQLSLRGNIE GMSVYFDWRGDGAMSQYVLKADTYTRFNALTNKGVKVRVLVSKPEEKLTVFSVTNAKMSS VDISKLADTRSITLDNVGLSKFDFAPSTALKQLDLSRNNISDIDLSKYPKLEYLSMSDNK LTKLDLTKNKDLIVVSLSRNNLTEVNFRGLTALESLDLTANSMTTLDCKPLLNLGQLFVS NNKLTSINVKNNAYLRALNLVGNNLRFSSLPPTTGTYRTSYAYQRQNPVDVSANGNKVDL SSEAVIDGTPTVYRWFTGNVNVDKNGNLQGEEFDVDAEYTITNGVTTLKLTKRYEDLVCV MTNAKFPKLLLHTKHIRFIPDATGIEGVNADNTDTFIKVVEGGIVVETATGSTVNVYGIN GGLVQKAKMNGTSQSFSLPKGAYVVTVNNKVAKVMVK 
>gi|283510502|gb|ACQH01000117.1| GENE 2 5468 - 7855 2293 795 aa, chain - ## HITS:1 COG:no KEGG:Ctha_0236 NR:ns ## KEGG: Ctha_0236 # Name: not_defined # Def: peptidase S8 and S53 subtilisin kexin sedolisin # Organism: C.thalassium # Pathway: not_defined # 51 794 76 757 761 142 24.0 6e-32 MKRTFLTTICAGLLLGGYAQSKIDLQSQVTLKEELNFKIPRYNPQTRSIGRGNANPQRTI GMIEFDGKEALGTLAENNVSVLRVKGNIAIVSMPLNGVEHIAELKCVRRIQLSRPVAQKM DRVREAVGVNKIHQGVGLPQAYTGKGVVTGIVDGGIDPNNINFLKPDGTTRYGYLSRLYT SSAGKHGYIWESYFPKAQLPEIAKHKKSMDNVFAIEDFETDNSRQFHGTHTTGIMAGGYK GKATVAVTKDNKTSKNVEMPNPYYGIATESELVASCGDLRDVFIALGIDDVYQYARLSGP QAKPCVINLSLGSNLGSHDSTSVMNRYLAGCTDSAIICVAAGNEANYPVALNKTFTQTDN KLQTFIRPMAEGKQHIGKKDYYNFRNGQVYAYSSDATEFKMQVVVFNEKRGRIATRITVE PSPTGKLTMYSSGGEYAEEGAITSNSVFNQAFEGYVMVGGTKDPETGRYYAIARFLTSDN QTTNKTGKYKLGILVEGKDGKRVDLYGDAQFVYFDDYDGAYNDANGSGIWAKGSRNGTIN DMACAAGMITVGSYNVRDHWAAMDGWVYGYNRHEYDEGKVTRFSSFGTLLDGRNLPLVCA PGAVVISSVNTYTVEDPKNGYAPVNLQAKHTVNGKSYYWHQTLGTSMATPVVAGSIALWL EADPTLKAADVADIIRKTAVVDNDVKEGDPVQWGAGKFDAYAGLKEVLRRKAGSTGINTA EADNKTIINPAGPRRFNVFRAGEKQLEVRVFTMAGLLAHTQSTVGDETTIDASAWSKGVY LIQVNGKSAKKIIIY >gi|283510502|gb|ACQH01000117.1| GENE 3 7855 - 9420 1704 521 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929751|ref|ZP_06423594.1| ## NR: gi|288929751|ref|ZP_06423594.1| hypothetical protein HMPREF0670_02488 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 521 15 535 535 973 100.0 0 MVSSAAYADSNIKINVDDASRVSIKVNGVAINNLVNGENNIPITASGSFEVRATDGNCLV DVVKKTAKGDTKQNIYSMTSCYVYVSLPDDNGCTINITTNTFAKLRTASCTVKVDNAAKV KVQRSGTYTEVTLKDGNNTIHWIPDVEKEIVISTVSYTAAPLYSIVVDGLQLPKGNTSYT VKPKDGGLIDITANYPDKDCRVKFAFNTEEAKGVVSSVTVDGKPVTNYTDPNFSVKAGSR LTITFNTTTHAIDAVKVNGTPNTIYGSLNLTITDDMNIDIAAHKLGKVKAMFYVDHAENI ILFQGWDYNQNHIALQDGMNTIELPETNCVVKVRAKNGCKIKVLRLNGKQYTNNIDGVYE ITLKNGMNIGVETSAPKRDKTATVFVDSKQLADRFFYFNRQDRTEVQLNNGENTIKFGED DNPFTLAFDGTDLSKLKVRLNGKEVEATGGGGHTYELEFANGDRLEIWFNGISTNGIANP AAANKTNKRVVYRLDGTRVDEQKNLPKGLYIINGKKIFINK >gi|283510502|gb|ACQH01000117.1| GENE 4 9589 - 9777 111 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929752|ref|ZP_06423595.1| ## NR: gi|288929752|ref|ZP_06423595.1| hypothetical protein HMPREF0670_02489 [Prevotella sp. oral taxon 317 str. F0108] # 1 62 1 62 62 96 100.0 4e-19 MLSSPFKTNKKRWNVTLRTLFKQISYRKGNVSNKIAKKFSLKEVFYTLFTNFTAVLGTIM VD >gi|283510502|gb|ACQH01000117.1| GENE 5 10589 - 11188 723 199 aa, chain - ## HITS:1 COG:MK1028 KEGG:ns NR:ns ## COG: MK1028 COG1051 # Protein_GI_number: 20094464 # Func_class: F Nucleotide transport and metabolism # Function: ADP-ribose pyrophosphatase # Organism: Methanopyrus kandleri AV19 # 47 158 33 141 154 68 39.0 6e-12 MHPLEKFRYCPCCGSSRFEENTEKSKVCKSCGFEYFLNPSSANVAFIVNAKGELLVERRK EDPGKGTLDLPGGFSDTNETAEEGVRREVKEETGLTVTNCQYLFSQPNVYRYAGFDVHTL DLFFRCEVEDESQLQAMDDAAECFWLAPEDIHTEEFGLRSVRQGLYEFLKYVEAKKAESK GEPSKWKNCLPFRKEEKAE >gi|283510502|gb|ACQH01000117.1| GENE 6 12477 - 12683 129 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288928567|ref|ZP_06422414.1| ## NR: gi|288928567|ref|ZP_06422414.1| hypothetical protein HMPREF0670_01308 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 68 1 68 81 78 64.0 1e-13 MSGLPEHCHGEKTHFSRNERRTFSLSQTNNAELARTLSWRENALFEEREKNDYPVYLSTC LLVNLPIR >gi|283510502|gb|ACQH01000117.1| GENE 7 12608 - 12871 58 87 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MWQNMAMPQKGSLVGVLSIRSLQVESIFPVAYNEEKARRTSLCFLCSYVKTRKMNVFVKP NEQCRACPNIVMARKRTFRGTREERFR >gi|283510502|gb|ACQH01000117.1| GENE 8 13003 - 13593 205 196 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260910901|ref|ZP_05917543.1| ## NR: gi|260910901|ref|ZP_05917543.1| hypothetical protein HMPREF6745_1498 [Prevotella sp. oral taxon 472 str. F0295] # 1 196 13 208 210 353 91.0 4e-96 MRCVVILSFLALCACKATSKKASYPEQRLTYSKQEAKSSCAGERINRQFISLTNKSHLVK SKETLPYDKRIDIKTVKYNLIESKLLKGASAFICDGGNKLRYLPLPNKGDVSLILVPMDC GDFDYRFYLLTIKNNTIISDLYVEGIWYEPGGPELEEVTSFKIDKNFSVKVKTTSLGSPQ KVRNYIIRDDGKIIKK >gi|283510502|gb|ACQH01000117.1| GENE 9 13629 - 13844 80 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260910902|ref|ZP_05917544.1| ## NR: gi|260910902|ref|ZP_05917544.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. 
F0295] # 1 71 1 71 71 89 71.0 1e-16 MIKRPYMPLIFSGLVAFRLLSSNRDRHNLYTKFVKNIEICKLFSENLSTNMPTFLAVARL YPRLLGSFRQV >gi|283510502|gb|ACQH01000117.1| GENE 10 14156 - 14617 494 153 aa, chain + ## HITS:1 COG:YPO0546 KEGG:ns NR:ns ## COG: YPO0546 COG2001 # Protein_GI_number: 16120874 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Yersinia pestis # 3 129 2 127 152 73 32.0 2e-13 MRFLGNTEAKTDAKGRVFLPVAFRKVLQASGEESLVLCKDLHQPCLVLYPESVWNEQMDA LRNRLSRWNAAHQQLFRQFVSDVELVTLDGNGRFLIPKRYMAMAQISQSIRFLGMGDTIE IWSEANTQQPFVAAHDFGPAVEAIMALPEGENQ >gi|283510502|gb|ACQH01000117.1| GENE 11 14614 - 15573 999 319 aa, chain + ## HITS:1 COG:SP0334 KEGG:ns NR:ns ## COG: SP0334 COG0275 # Protein_GI_number: 15900265 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Streptococcus pneumoniae TIGR4 # 7 318 1 313 316 236 42.0 3e-62 MNNDNIITAQGYHVPVLLAESVEALNIVEGGVYVDVTFGGGGHTRHILSKMDGTARLFSF DQDADAEENTRRKSPDGQQPLANDERFTFVRSNFRYLDNWMRYYGVDHVDGILADLGVSS HHFDDQERGFSFRFDAPLDMRMNKNAALTAAQVLNTYDEERLADVFFLYGEIRQSRRIAS AIVKARAKAAVQTTQQFAAIVEPFFKREREKKEMAKLFQALRIEVNNEMNALKEMLQAAA QWLGKGGRLSVITYHSLEDRMVKNFMKTGNVEGKLSQDFFGRAQTPFTLVNNKVIVPTDD EQLRNPRSRSAKLRIAEKR >gi|283510502|gb|ACQH01000117.1| GENE 12 15570 - 16184 731 204 aa, chain + ## HITS:1 COG:no KEGG:PRU_2326 NR:ns ## KEGG: PRU_2326 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 66 203 12 145 145 133 51.0 5e-30 MKDKDMNAGEPDLRFTVEQPNDGPDPVETLKNIAKRESEAAKKAVAKGEKRPNEDKAGER GGTNEDEAETKEAATDELPTLKEAIMEQATESDAPLTSNLTFMKIIGGDILNTSTIRKQI WLLLLITAFIFVYIANRYSCQQYLIEIDQLTKELQDAKYKSLSSNSQITEKSRESHVLRL LRESNDTTLHMPDQPPYIINVPEE >gi|283510502|gb|ACQH01000117.1| GENE 13 16181 - 18346 2704 721 aa, chain + ## HITS:1 COG:CAC2130 KEGG:ns NR:ns ## COG: CAC2130 COG0768 # Protein_GI_number: 15895399 # Func_class: M Cell wall/membrane/envelope biogenesis # 
Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 9 720 12 724 729 168 24.0 3e-41 MSKFESKKIMPRYTFITILMTMVAIAVIGKTLYIMMAKRDYWMAVADRQKKDSVSVKPMR GNILSCDGQLMASSLPEYKLYMDFNALREAKNDSLWEVKLDSICQGLHTIFPEKSAASFR RHLEEGRNKQSRHWLLWDRRVDYNTYTEVRMLPVFNLSKYKSGFHTEEFNARQRPYGSLA GRTVGDMYGAKDTARFGLELSYDSILRGSNGIIHRRKVLNRFLDIMDTPPIDGADIVTTI DVGMQDLAERSLLEEMKEISARVGVVIVMEVATGDVKAIVNMERCQDGEYRELKNHAVSD LLEPGSVFKVASIMTAIDDGVVDTAYQVNTGGGTWPMYGREMRDHNWRRGGYGVLSVSRS LEVSSNIGVSRVIDQFYHNNPERFVRGIYRLGLGLDFQIPLVGSSPAKIRMPKKAPNGQW VNWSNTALPWMSIGYETQVPPISMLAFYNAIANGGKLMRPRFVKKVMKDGVTIMDFPPEP MKGYEQICKPRTVKLIQDVLRKVVSQGLAKKAGSNLFQVAGKTGTAQMSKGAAGYKSGSV DYLLSFAGYFPAEAPRYSCIVCIQKTGIPASGGLMCGPVFRNIAEGIMAKDIKLDARDAR DSSSILYPEVKSGNVLAADYVLSHLGVKTFVDWKAGDASTKPIWGQASIEGGKAVSLKRT RQFVRSQMPNVHGMGARDAVYLVESRGVKVRLVGRGKVVKQSIEPGTPLKMGMKCELVLE G >gi|283510502|gb|ACQH01000117.1| GENE 14 18431 - 19876 1621 481 aa, chain + ## HITS:1 COG:CAC2129 KEGG:ns NR:ns ## COG: CAC2129 COG0769 # Protein_GI_number: 15895398 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Clostridium acetobutylicum # 1 479 1 476 482 336 40.0 6e-92 MTLSSILKNVKPLNVVGNAEVQVTAVDIDSRKVGPGHLFVAIKGTQVDGHAFISKAVAQG AVAVLCETMPAELPQGVTFVQVESTEDAVGPVATLFYGEPSKKLKLVGVTGTNGKTTIAT LLYDMFRKFGHRCGLLSTVCNYIEDEAIPADHTTPDPIALNALLHRMVEAGCEYVFMECS SHAIAQKRIGGLTFAGGVFSNLTRDHLDYHKTFENYRNAKKAFFDMLPKQAFAITNADDK NGMVMVQNTRATVKTYSTRSMADFKAKILECHFEGMYLEINGAEVGVQFIGKFNVSNLLA VYGTAVMLGKRPEDILLVMSTLHSVNGRLEPLRSPEGYTVIVDYAHTPDALENVLNAIHE VLNGKGKVITVCGAGGNRDKGKRPLMAVEAVKQSDRVIITSDNPRFEDPQAIINDMLAGL NAQQMKKVVSIVDRREAIRTACMMAQKGDVVLVAGKGHEDYQEIEGVKHHFDDKEVVKEN F >gi|283510502|gb|ACQH01000117.1| GENE 15 20082 - 21353 1516 423 aa, chain + ## HITS:1 COG:YPO0552 KEGG:ns NR:ns ## COG: YPO0552 COG0472 # Protein_GI_number: 16120880 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide 
phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Yersinia pestis # 1 423 1 360 360 208 32.0 2e-53 MLYYLFRFLEQFGISGAHIWGYISFRALLAMILSLVISAWFGEKFIKYLKRKQITEVQRD ASIDPFGVKKIGVPSMGGIIIIVAILVPVLLLGRLRNIYLILMVITTVWLGFLGGMDDFI KIFRHNKEGLKGKYKIVGQVGIGLIVGLVLWSSPDVKMNENLALQRKDNTEVVVKHRSEA RKSLKTTIPFVKGHNLDYSRVMSFMGKYKTAAGWVLFVIMTIFVVTAVSNGANLNDGMDG MCAGNSAIIGVVLGILAYVSSHLEFAAYLNIMYIPGSQELVVFFCAFIGALIGFLWYNAY PAQVFMGDTGSLTIGGIIAVGAIIIHKELLLPILCGIFFVESLSVMLQVYYFKMGKRRGV KQRIFKRTPIHDNFRTQDEQLDPDCRYVLKKPKGAVHESKITIRFWIITIILAALTIITL KIR >gi|283510502|gb|ACQH01000117.1| GENE 16 21462 - 22793 1458 443 aa, chain + ## HITS:1 COG:BS_murD KEGG:ns NR:ns ## COG: BS_murD COG0771 # Protein_GI_number: 16078584 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramoylalanine-D-glutamate ligase # Organism: Bacillus subtilis # 5 443 13 450 451 252 37.0 1e-66 MDRIVILGAAESGVGAAVLAKQKGLDVFVSDMGTIKDKYKQMLDSHGIDWEEGQHTEAKI LNAKEIVKSPGIPSEVPMVQKAIDKGIGIISEIEFAARYTDAKMVCITGSNGKTTTTSLI YHIFKAAGYNVGLGGNIGRSLALQVAEGAHEWYVIELSSFQLENMYRFKAHIAVLLNITP DHLDRYNYCMQGYVDAKMRILQNLGPQDYFIYWADDPVVKPELQKYDVSARVCPFAENRE TGAAAYVADGQITLPAPLEFNMPQARLSLPGKHNLYNSLAAALAARAAGIPQNVVEQGLS DFPGVEHRMERVATMHGVNYINDSKATNVDACRYALEAMTAPTILIIGGKDKGNDYEPIK DLVKEKCAALVYLGADNTKLHQNFDHLGLPVADTHSMKECLTACAGMAKAGYTVLLSPCC ASFDLFKNMEDRGEQFKTLVRNL >gi|283510502|gb|ACQH01000117.1| GENE 17 22806 - 24071 1183 421 aa, chain + ## HITS:1 COG:RSc2845 KEGG:ns NR:ns ## COG: RSc2845 COG0772 # Protein_GI_number: 17547564 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Ralstonia solanacearum # 16 402 44 400 413 126 27.0 7e-29 MALNKSLGNFFKGDKVIWMIFFFLCLISAVEVFSASSSLTYKGGSYLAPIIKHLGILFLG VVFMVFTVNIPCRYFKVLTPFLFLFSVLTLLWVLWGGQSTNGAQRWVSLLGIQFQPSEVG KGTLVLIVAQILSIAQTEHGADRKAIGWIGMATLFIVGPILFENLSTALLICLVVYMMMI LGRVPVSQLGKILGVVVLFAAAALSFVLIFGHAKATDVPEQSLTENVEPKKEEKGFFGSM 
FHRADTWKSRIYKFMKNEEVPPEKFDLDKDAQVGHANIAIASSNVIGQGPGNSVQRDFLS QAFSDFIYAVIIEEMGIIGAVVVAMLYVFLLFRTGKIANRCENNFPAFLAMGLALLLVTQ ALFNMCVAVGLVPVTGQPLPLISKGGTSTIINCVFMGAIISVSRTAKKAEKQTKSGVAEK K >gi|283510502|gb|ACQH01000117.1| GENE 18 24257 - 25384 1286 375 aa, chain + ## HITS:1 COG:BH2565 KEGG:ns NR:ns ## COG: BH2565 COG0707 # Protein_GI_number: 15615128 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase # Organism: Bacillus halodurans # 5 321 1 315 363 236 36.0 4e-62 MENNLRIIISGGGTGGHIFPAVAIANALKAKRPDAQILFVGALGRMEMQRVPAAGYDIKG LPICGFNRKNLLKNFAVLFKIWKSQRMAKQIIKQFKPMAAVGVGGYASGPTLNQCAAMGI PCLIQEQNSYAGVTNKLLSKKARKICVAYEGMERFFPKDKIVLTGNPVRQQLLDSQLTKA EALRTFGLEPTKKTILIVGGSLGARTLNESVMAHLDELRDSGVQVIWQTGKNYFEGIKAE LADKSPLPTLKPTDFIADMGAAYRAADLVISRAGASSISEFCLIGKPVILVPSPNVAEDH QTKNAMALVNRQAARFVSDAEAVQKLIPLALQTVNDDQALSQLSHNIKQMALRNSAEIIA DEVIALAEEQAAETR >gi|283510502|gb|ACQH01000117.1| GENE 19 25434 - 26816 1439 460 aa, chain + ## HITS:1 COG:CAC3225 KEGG:ns NR:ns ## COG: CAC3225 COG0773 # Protein_GI_number: 15896472 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate-alanine ligase # Organism: Clostridium acetobutylicum # 7 444 11 448 458 228 32.0 3e-59 MDIKKIKAVYFVGAGGIGMSAIARYFIKKGYVVAGYDKTRTQLTNQLEKEGMLLHYEENV DEIPQACKKAQSCLVVYTPAIPDDHKELVYFRQNGFEIQKRAQVLGTLTRTLKGLCFAGT HGKTTTSCMCAHIMHQSHPDCNAFLGGISKNYGTNYILSPKSEYVVIEADEFDRSFHWLR PWMSVITSTDPDHLDIYGTKAAYMESFRHYTTLIREGGVLIIKKGVELKPDVKPGVKVYE YALDAGDFHAENVKIDNGNITFDFVSPVENVKNVTLGHPIPINIENAVAAMAMAQLNGCN AEELRYGIKTYHGVDRRFDFKIKTPHLVFLSDYAHHPNEILQSAQSLRQLYKDRKITAVF QPHLYTRTRDFYKEFAKALSLLDEVILCDIYPAREEPIEGVSSKLIYDNLDAGVEKSMIA KDDVLELVQQRNFDVLVFLGAGDLDSLAPQVTKILESKAK >gi|283510502|gb|ACQH01000117.1| GENE 20 26848 - 27657 855 269 aa, chain + ## HITS:1 COG:no KEGG:PRU_2334 NR:ns ## KEGG: PRU_2334 # Name: not_defined # Def: putative cell division protein # Organism: P.ruminicola # Pathway: not_defined # 1 260 1 247 251 254 46.0 
2e-66 MNVNWKKTVTMLVDALLAVYLVLAFTAFNKPNAKAALCQKVQIDIQDENTNGFITANDIR SRLDANQLYPLNKPMQAVQARKIEEMLKRSPFVKTVDCYKTQDGCVSISITQRMPIVRIK AANGEDYYLDDNNQVMPNSHYSADLIIATGHISKAFARNYIAPLAKLFMGNELWENQVEQ INILPDKGVELVPRVGQHVVFIGYLPQADTQKERNEKIADFVEKKLLRLEKFYKYGLSQV GWNKYSYIDLEFDNQIICKKKKENKNNPE >gi|283510502|gb|ACQH01000117.1| GENE 21 27668 - 29086 1520 472 aa, chain + ## HITS:1 COG:RSc2840 KEGG:ns NR:ns ## COG: RSc2840 COG0849 # Protein_GI_number: 17547559 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell division # Organism: Ralstonia solanacearum # 3 330 6 338 410 154 30.0 5e-37 MAKDFIVAIEIGSSKITGIAGRKNTDGSISILAVAKETPSSCVKKGVVYNLDRTVQCLNS IVSKLKTTLKAEIAYAYVGVGGQSTRSIENVITRELPNDTIVSQDMVDSLMDANRSMDYP DADIIEAITLEYKLGQQYQIDPVGVPCTHLEGHFLNILWRNSNYRNLKKCFSNAGIAIAD VFISPLALADSVLTDSEKRSGCVLVDLGAQTTTIAVYSKSLLRHLAVIPLGGANITKDIA SLQIEEEHAERLKLTYGAAYTNPEEDNTGLTYAIDSDRSIQSDTFMNVVEARIEEIIANV RYQIPEKYEGKLLGGLILTGGGALLKNIEEAFQRKTNFNKIRFAKTVKFDLHSKHPEITA QDGTMNTVLGLLAKGNLNCAGREISNNLFPDAETERPATLPSEGEHMPNATDNRAQQEQA NNKSKFETKEKKEDKAEKEQANNEPEEEKGPTKFKQWKEKLSGFIRQMAEEE >gi|283510502|gb|ACQH01000117.1| GENE 22 29440 - 30774 1485 444 aa, chain + ## HITS:1 COG:BB0299 KEGG:ns NR:ns ## COG: BB0299 COG0206 # Protein_GI_number: 15594644 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division GTPase # Organism: Borrelia burgdorferi # 1 345 6 351 404 214 43.0 3e-55 MSEKNINKGIVNFVDEEHGDSIIKVVGVGGGGGNAVNHMFKEGIHKVSFVLCNTDKQALD DSPVPVHLQLGKEGLGAGNRPLKAKAAAEESIDDIKEMFSDGTKMAFITAGMGGGTGTGA APVIARISKEMGILTVGIVTIPFRFEGLRKIDQALDGVEEMAKHVDALLVINNERLRQVY PELSLIEAFKRADDTLSVAAKSIAEIITYHGFMNLDFNDVKMVLEDGGVAIMSSGYGEGE NRVQQAIHDALNSPLLNDNDVFNSKKLLLNISFSEKNNQGSNLMMEEINDVDEFMAKFGP DFIFKWGVTFDESLGDKVKVTVLATGFGVENITTTPERTARKSIEDIEDAARKAQERLDR TKRIDTYYGTDMPGGRTKRHTNIYLFRPDDLDNEDIIFAVDESPTYARSQQKLEEIRGYS TTVQKKKQDEVEPPTTMQGTISFA >gi|283510502|gb|ACQH01000117.1| GENE 23 30854 - 31306 231 150 aa, chain + ## 
PROTEIN SUPPORTED ## NR: gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21 [Lactobacillus johnsonii NCC 533] # 4 149 3 146 147 93 36 3e-18 MAQLFEQISNDIKEAMKARDKVRLDTLRNIKKVFLEAKTAPGANDVLEDADALKILQKLA KQGKESAATFVQQARQDLADAELAQVKVIEEYLPAALSEAQIEAKVKEIIAQTGAATMKD MGRVMGMASKQMAGLADGGTISAIVKRLLA >gi|283510502|gb|ACQH01000117.1| GENE 24 32513 - 33139 486 208 aa, chain + ## HITS:1 COG:PM2000 KEGG:ns NR:ns ## COG: PM2000 COG0558 # Protein_GI_number: 15603865 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphate synthase # Organism: Pasteurella multocida # 2 198 1 196 218 141 40.0 8e-34 MISVYSIKPQFQRALTPILELLHRAKVTANQITLWAGILSLVIGILFWFAGDVGTWLYLC LPVGLMIRMALNALDGMMARRYNQITRKGELLNELGDVVSDTIIYFPLLKYHPESLYFIV AFIALSIINEYAGVMGKVLSAERRYDGPMGKSDRAFVLGLYGVVCLFGINLSGCSVYIFG AIDLLLVVSTWIRIKKTLKVTGSSQTPE >gi|283510502|gb|ACQH01000117.1| GENE 25 33167 - 34207 881 346 aa, chain + ## HITS:1 COG:no KEGG:Coch_1578 NR:ns ## KEGG: Coch_1578 # Name: not_defined # Def: patatin # Organism: C.ochracea # Pathway: not_defined # 11 342 12 345 346 300 45.0 5e-80 MKKKPYPTVAVFSGGGTRYALYLGMYAAMQHYGVKPDLLIASCGGSIAANIINSFATDEQ RKAFIQSEELYRFVQNTRMTRFKMLYKIGLYCQRKVRNTCYAPYVENIFDKFIVDMPTDI RPLLPSLAVQPTLPTIVVGARLLFDRADIGKERNGRKLYRKVLFTDAETARNIDLSAINC NFKSYEGSAIDSQVELFQDVPLPIATRISVSDMFYVQPVHYNGAYFAGGIVDLMPIELAE QLGETIFFEKKKGFGTIDEALVRAVFGYSGNRRLQEVMTHSVNYWVDTADMNQALSGHYI EKNINIFKLRVELKLPKTYAQFVEDMDFQWQYGYDRVAQHFKNKQV >gi|283510502|gb|ACQH01000117.1| GENE 26 34204 - 34959 208 251 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 [alpha proteobacterium BAL199] # 3 243 7 250 259 84 27 2e-15 MNIFITGGTSGIGLALARFYTAEGHRVGVCGRNTARIDRSDEVNKLLLAYQLDVCDKDAL AAAVEVFCADKGLDMMIVAAGYYRNGVTEEVDFEQTSQMLKVNITGALNAMEVARETMKA SGGHLVVIASVAGLLHYPCASVYAKCKRALIQIADAYRRSLADYQITVTTLVPGYIDTPR LREIYRNDLSKCPFCMPLNRAVETMTKAIAQRKEQVVFPPKMRLSIAFLSLLPTCLLSAF MHRKTLWSIPK >gi|283510502|gb|ACQH01000117.1| GENE 27 35484 - 36233 76 249 aa, chain + ## HITS:1 
COG:ynbD_1 KEGG:ns NR:ns ## COG: ynbD_1 COG0671 # Protein_GI_number: 16129372 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Escherichia coli K12 # 2 88 160 254 342 63 35.0 4e-10 MWLVLLMLSTLLVYQHHFIDITTGLMAGFLTLWMFPYRKKRNRQIAKVYFFVAAVALTAL LFAIEHSVLGSIFLLWGILVLLLLGRAYYRSNRHFLKDDHGCIGFLRYIFYFPYIAVYRI LWLFSRQAPVEMAPNIFVGGLLSCRSARKFALLGEVSVFDLSAELPENAYFRTSADYHCF PLLDIATIPKACEERIVTTVLEKMKASEVPRKLYIHCAMGRFRSCRIGEAVLLTISKNLT RDKNGNRNI >gi|283510502|gb|ACQH01000117.1| GENE 28 36214 - 37911 1112 565 aa, chain + ## HITS:1 COG:FN1306_2 KEGG:ns NR:ns ## COG: FN1306_2 COG0500 # Protein_GI_number: 19704641 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Fusobacterium nucleatum # 267 561 3 297 302 328 52.0 2e-89 METGTFKSFDNADIFYRAWNYNPSQKTIVILHRGHEHSGRLQAFAEDEQFVHFNIFGFDM RGLGHTSQPVSPHFMDYVRDLDAFVKYLHEQYGIVERDIFVVANSIAGVIVSAWCHDFAP RIAGMALLAPAFTIKLYVPFAKTGIALAARLFKHLTVPSYVKSKVLTHDVEQQKAYDTDP LITREIDAHYLLDLLKAGKRIVEDAAAITIPTLLLSAGKDYVVKDDMQKRFFVDISSTQK RFIKLKGFYHGLLFETQREQVYNPIAEFINRCFSLEQPPTTCMPDKFTVDEYNKMALNML PRIERWGFGIQKWMLHHLGFLSKGMRVGLKYGFDSGMSLDYVYRNHAQGCGPIGRWIDRG YLNAIGWRGVRIRRENLVSIVEERIKALEQRNEPVKILDIAGGTGNYLFDIKRKFPEIEV VINDFSEANIAFGEAQIQQQGLQNIRFTRLDCFNKDSYQQLNFEPNIVIISGIFELFGDN ELVCNAISGALSALQPGGFLVYTGQPWHPQLKMIAFVLNSHQERDWIMRRRSQRELDSLF AFYGLTKCGMKLDNFGIFTVSYGMK >gi|283510502|gb|ACQH01000117.1| GENE 29 37892 - 38935 723 347 aa, chain - ## HITS:1 COG:no KEGG:BT_2183 NR:ns ## KEGG: BT_2183 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 323 18 327 351 311 48.0 2e-83 MIRLRAQQLINPDFTDVKDVVAWMGAIQAQQPRMAKLALGIRTRGATMQHIKQALDRGEI LRTHVLRPTWHYVSPNDIRWMLKLSCNRLKSAYASLMKGHGLTITEQMYDTANQHIYDML SGGKSLTKQQITERIAEKGLPSDTIFMNRFLENAECEALICSGPEAGNTHTYMLLDERVA PMPLPTKDEALSILARNYFRSHAPATLDDFCWWSGLSIKEARLGVEIIEKELQKVAFNDK 
TYLLHDSSSIDIKEKESFIFLPAYDEYIIAYKYRGDVLQASHNSKAFTNNGIFFPLILQN GRATGNWKMTLTRRNIAVNTSYFDNDTPTDLSAAEEKARLQLISFHN >gi|283510502|gb|ACQH01000117.1| GENE 30 39183 - 41567 1735 794 aa, chain - ## HITS:1 COG:no KEGG:BDI_3481 NR:ns ## KEGG: BDI_3481 # Name: not_defined # Def: putative TonB dependent outer membrane protein # Organism: P.distasonis # Pathway: not_defined # 4 794 8 797 797 696 45.0 0 MRIYSLLCLAFLLAYNGFAQTGIIQGRVINDKNNEPLEFATVQVQGTSLGAKTSIEGTFK ITGVQPGFHKIVVSMVGFETKISEELQVLGNQTTFIDIMVSEASTTLKEVLVSPNLLLRP TESPVSVLPLGVHQIEKSAGANRDVSKLVQILPGVGATAPNRNDLIIRGGGPTENVFYLD GIEIPVINHFATQGASGGVVGIINPDFVREISFYTGAFPANRPNTLSSVMEIKQKDGSRD RLHTKVAVGASDAGITLDGPLGKDASFIVSFRQSYLQWLFKVIGLPFLPTYNDFQMKYKW RINPHHEITVIGIGAIDNMSLNRNLEQTGTEAQKYILGYLPEYSQWNYTFGLAYKHYGKR FVDSWILSRNMLRNAGVKYKDNDKSASKISDYQSDEAEIKLRYERNYTTLPIKLNFGAGI KYSNYANDVYRLQFASGNVVPFQYHSSIDLLSYQAFVQASDQYLNNRFKVSLGVNLVGNT FTASMSNPLKQLSPRLSLSYSFSEDFDLNANIGRYAMPPSYTTMGYRNTEGNYVNRSNAL KYITSNQAIVGMEYHPGNFLRFSLEGFYKQYSNYPISLTDGISLASKGVDYGQVGDEAVM STGKGRAYGVEVVAKLMGWRDIDLTATYTLFRSEFTDKNGIYRPSSWDTRHILNLTTSYK LPKGWYISARWRYIGGAPYSPIDVERSTNKAAWSITNQAYTDYEQFNTLRLADAHQLDVR LDKEFYFKHWMLNLYLDVQNAYISNTPSKPIYTNRDAEGHVMDDPTSPHTKQLLRTLVYY SGTILPTVGIMIKF >gi|283510502|gb|ACQH01000117.1| GENE 31 42225 - 42905 737 226 aa, chain - ## HITS:1 COG:no KEGG:PRU_2447 NR:ns ## KEGG: PRU_2447 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 4 226 4 200 200 233 52.0 5e-60 MFPLLSSAYLAPVQWYAKLFQSNGCWMEQHDHFIKQTYRNRCIIATANGPQALTIPIERP SGEQVSCMAMKDIRISDHGNWRHIHWQAIVSAYNESPFFEYYADDFRPFFERRFEFLFDF NLELTHTVCALLDFTPNIILTPAFVPTQAQLASGASLPAAFVDADTVTFPTFQDFRDTIV PKNPPHDPVFTPRPYYQVFAQRHGFLPNLSVIDLLFNMGNEAPLYL >gi|283510502|gb|ACQH01000117.1| GENE 32 43149 - 44594 1693 481 aa, chain - ## HITS:1 COG:BU259 KEGG:ns NR:ns ## COG: BU259 COG0681 # Protein_GI_number: 15616870 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Buchnera sp. 
APS # 410 473 246 310 314 66 52.0 1e-10 MINKKKKEFNPRKQWTKFIIVALLYLAFLFWVESWWGLLVLPFIYDVYISKKIKWQWWKD SEGPTRFIMGWVDAIVFALVAVYFINLFFFQNFVIPSSSLEKSLLTGDYLFVSKLSYGPR IPQTPLTMPLTQHTMPVINVKSYIEVPHWDYRRVKGLGHVKLNDIVVFNYPSGDSLVNEA RWAAADYYQMVYSYGKQLYDQANHPVNLDSMSLLQQRAYFQHIYALGRAYILQNPNEYGD LISRPTDRRENYVKRCVGLSGQTLQIKNRIVYLDGKPNKEPDNVQYTYYVKLKGMLPDDL LTSLGITQEDLQSLNTSGVMPLTAMAAKALAARKDIVESVKLNTDAPTGDLYPLNANTGW TRDNYGPIWIPKKGATIKLSLANLPMYERAIRVYEDNEVDVRNGQIYINNKPANSYTFKM DYYWMMGDNRHNSADSRYWGFVPEDHVVGKPIFIWWSHDVDHPGFKGIRWNRLFKLVDKI K >gi|283510502|gb|ACQH01000117.1| GENE 33 44694 - 45449 941 251 aa, chain - ## HITS:1 COG:RC0190 KEGG:ns NR:ns ## COG: RC0190 COG0289 # Protein_GI_number: 15892113 # Func_class: E Amino acid transport and metabolism # Function: Dihydrodipicolinate reductase # Organism: Rickettsia conorii # 1 250 48 284 285 117 31.0 3e-26 MNIAIIGYGKMGRMIEEIAKQRGHHIVCIIDKDNQADFDSPAFASTDVAIEFTTPQAAFD NYRKAFAHNVKVVSGSTGWLAQHGEEVRRMCQEEGQTLFWASNFSIGVAIFQALNRRLAE LMNRFPQYNVHIEETHHVHKQDAPSGTAITLAEEVVTQLDRKATWVKGEQHLADGSIVRS EQTTDDQLAIDSFRRGEVPGIHSIVYDSEADSISITHDAHSRRGFAEGAVLAAEYAATHN GLLTMNDLFKF >gi|283510502|gb|ACQH01000117.1| GENE 34 45623 - 46750 1380 375 aa, chain - ## HITS:1 COG:FN1068 KEGG:ns NR:ns ## COG: FN1068 COG0758 # Protein_GI_number: 19704403 # Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Fusobacterium nucleatum # 72 371 5 284 288 176 34.0 8e-44 MNDNETRNAIALTRVNYFNLAGLAQLYRLLGSATAVIAHRNDLREVLPDASPRLQEAMRN IEQHLKAADAEMEYNLRNGIRALALADDDYPQRLKNCDDAPLVLFYKGTANLNRARVINI VGTRHCTAYGQDLIRRFVSELKNLCPEVLIVSGLAYGVDINAHRQALANGLDTVGVLAHG LDYLYPTRHKQTANEMLTQGGLLTEFLTNTNADKLNFVRRNRIVAGISDACVLVESAAHG GGLITAAIARDYNRDVFAFPGPVGAPYSEGCNNLIRDHKAQLVSSAADFVHALGWETDLK RMQATRQGIERQLFPQLSGEEQAVVQALQKLNDQPINQLSLTANIPISQLTVLLFQLEMK GLLKLLAGGSYHLYV >gi|283510502|gb|ACQH01000117.1| GENE 35 46945 - 47361 497 138 aa, 
chain - ## HITS:1 COG:TP0156 KEGG:ns NR:ns ## COG: TP0156 COG0824 # Protein_GI_number: 15639149 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Treponema pallidum # 4 84 3 83 134 57 38.0 7e-09 MYTFTTEMEVRDYECDIQGIVNNANYLHYTEHTRHLFLKHTGLSFAGLHEKGVDAVVARM TLQYKTPLQCDDILVSKLRITQEGIRYVFHHDIFRKRDDKLSFRATAELVCLVDGRLSRS EEIDKAFAPFVDQQSADK >gi|283510502|gb|ACQH01000117.1| GENE 36 47729 - 49027 862 432 aa, chain - ## HITS:1 COG:no KEGG:PRU_2464 NR:ns ## KEGG: PRU_2464 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 3 432 1 430 431 551 57.0 1e-155 MPMEQLLHYVWKHKLFATAPLKTTEGVDVEVIDVGLHNDNAGPDFFNAKVKLGDTLWVGN VEIHDMASQWFEHGHQNDPRYNNVVLHVVGKADMECLTQDGKQLPQMELAVPSGLAAHYG ELLAADKFPPCHHIIPQLQRLTLHSWLAALQTERLESKTEAVIRRVERANGSWEGAFFIT LARNFGFGVNGDAFERWAANVPLRCVDHHRDNAFQIEALFMGQAGLLNDVAIPTRHRDAA TAESYFGLLKKEYAYLAHKFNLQPLDPSLWLFLRLRPQNFPHIRLSQLAVLHHSRRASLA NIVACENLTQLRKALVTQVSGYWQTHYVFGGESKASDKALSTASRDVLILNTVVPVLFAY GRHRHNDALCDRAYQLLDELKAEDNHIIRMWKACGLEVNNAGDTQALIQLKNEYCDRKNC LRCRIGYEYLRR >gi|283510502|gb|ACQH01000117.1| GENE 37 49187 - 50176 459 329 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145632364|ref|ZP_01788099.1| ribosomal protein L11 methyltransferase [Haemophilus influenzae 3655] # 1 317 28 340 353 181 37 1e-44 MKIGNISFDAHPLFLAPMEDVTDIGFRLLCKRFGAAMVYTEFVSAEALVRSVKSTVSKLT ISDEERPVGIQIYGRDVESMVEAARIVEQSRPDVIDINFGCPVKKVAGKGAGAGMLRNVP LMLEITREVVRAVRTPVTVKTRLGWDCEEIIIPTLAEQLQDCGIAALTIHGRTRSQMYTG EADWRYIGEVKANPRIHIPIIGNGDICTPQQAKEVFSRYGVDAVMVGRATFGRPWVFKEM KDHLEGLPQDPSLTIDRKIDLLEEQLRINIERIDEYRGILHTRRHLAASPIFKGIPDFKQ TRIAMLRADKQEELIAILESCRERLRGLE >gi|283510502|gb|ACQH01000117.1| GENE 38 51216 - 51881 471 221 aa, chain + ## HITS:1 COG:PA0750 KEGG:ns NR:ns ## COG: PA0750 COG0692 # Protein_GI_number: 15595947 # Func_class: L Replication, recombination and repair # Function: Uracil DNA glycosylase # Organism: Pseudomonas aeruginosa # 3 221 8 226 231 241 54.0 7e-64 
MNVEIESSWKEQLADEFEKPYFKELSEFVDGEYKQNKCFPLEKDLFNAFNLCPFHKVKVV IIGQDPYPGEGQAHGLSFSVNDGVPFPPSLNNIFKEISEDFKTPIPTSGNLTRWAEQGVL LLNTILTVSPGDSNSHKGKGWEQFTNAVIKRLNDECEGLVFMLWGVEARKKGRCIDSNKH LVLTSAHPSPRAVNFGCSFGKKHFSQANEYLRERGKSPINW >gi|283510502|gb|ACQH01000117.1| GENE 39 52050 - 52715 512 221 aa, chain + ## HITS:1 COG:PA0750 KEGG:ns NR:ns ## COG: PA0750 COG0692 # Protein_GI_number: 15595947 # Func_class: L Replication, recombination and repair # Function: Uracil DNA glycosylase # Organism: Pseudomonas aeruginosa # 3 221 8 226 231 258 58.0 6e-69 MDVKIEPSWKEQLADEFEKPYFKALSAFVHLEYKQHQCFPPAQLVFNAFNLCPFDKVKVV IIGQDPYHDDGQAHGLSFSVNDGVPFPPSLQNIFKEINNDLGTPIPTSGNLTRWAEQGVL LLNATLTVRAHQAASHQRQGWETFTDAAIARLNAGREGLVFILWGGYARSKAALIDARRH CILQSVHPSPLSANRGGWFGNHHFSQANAYLSSRGLTPISW >gi|283510502|gb|ACQH01000117.1| GENE 40 52719 - 53324 591 201 aa, chain + ## HITS:1 COG:no KEGG:PRU_2353 NR:ns ## KEGG: PRU_2353 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 3 198 2 208 210 260 60.0 2e-68 MNNNIPPIMQVGNVLISSDLITERFCCDLDACKGACCIEGDAGAPVTLDEIAAMEDALDI VWNDLSAQAQTVIDKQGVAYNDREGDLVTSIVNGKDCVFTCYDKGLCLCTFDRAHRAGLS SFRKPISCALYPIREKRFDADLVGLNYHRWDVCRPAVKKGRELNLPLYRFLEGPLTERFG REWYAELCEVAESLMAEGGEE >gi|283510502|gb|ACQH01000117.1| GENE 41 53623 - 54642 1088 339 aa, chain + ## HITS:1 COG:NMA0465 KEGG:ns NR:ns ## COG: NMA0465 COG2855 # Protein_GI_number: 15793467 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Neisseria meningitidis Z2491 # 25 335 22 330 338 221 42.0 1e-57 MLNEQRSSMLHGILLIALFSLAAFYLGDMAWAKKFSVSPMIIGIVLGMIYANSLRNNLPP TWVPGITFCAKRVLRVGIVLYGFRLTLQDVTAVGLSAIVVDAIVVCITIGGGLLIGRILK LDRSITLLTSVGAGICGAAAVLGTEGAINAKPYKTAVAVSTVVIFGTLSMFLYPVLYRNG VLDLSPQSMGIMTGSTVHEVAHVVGAGNAMGKGVSDIAIIVKMIRVMMLVPVLLVVSWFV AHGKQGDAETQTKRKGITIPWFAVLFLVVIGFNSLNLFSPAVNDAINTFDTFLLTMAMTA LGAETSIDKFKKAGAKPFLLAALLYCWLLGGGYLIAKAL >gi|283510502|gb|ACQH01000117.1| GENE 42 54776 - 55342 454 188 aa, chain - ## HITS:1 COG:no KEGG:gll1842 
NR:ns ## KEGG: gll1842 # Name: not_defined # Def: hypothetical protein # Organism: G.violaceus # Pathway: not_defined # 2 178 48 227 230 114 35.0 2e-24 MGGIANWNVLKREIETYLRREKDVLVTTLIDYYGIKDSHGFPLWAEKQAIADKSQRLDEL EAAMLADVDANLRPRFVPYMQLHEFEGLLFSDKQAFYTTFYNSNELVDKAYLEQTFADFD NPEMINDGVETSPSHRLERIISGYEKVVYGCCLAEAIGLDKMRQKSPRFDNWLKRLETNI AITKPNNP >gi|283510502|gb|ACQH01000117.1| GENE 43 55459 - 56529 904 356 aa, chain - ## HITS:1 COG:all8080 KEGG:ns NR:ns ## COG: all8080 COG4637 # Protein_GI_number: 17227454 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Nostoc sp. PCC 7120 # 1 325 9 334 345 248 42.0 1e-65 MDYIEIKGYQSIKSAKIDIAPINILIGANGSGKSNFVSFFELLHRLCGRGLSEYVALKGG EDKILFQGRKVTEKFSFHIEFDGGQNGYAAELTAGNDGFVFTDEQLYYKGEPKGITRHGK EANLSLTDNYRAAYILKYMAKFRKYHFHDTSANSPFTKYSNTSTDTAFLYANGGNLAAFL YKIRSEQPLVYNRIVACIKSVAPYFSDFYLEPNEENNLRLLWQDKFSDMTYGATDLSDGT LRFIALATLFLQPNLPTTIIIDEPELGLHPFAIAKLAGMIKSASKRNCQVIVATQSADLI AHFEPSDIITVDQVGGETHFTRQSEEMLKAWLEENYTIDELWKRNMIEGGQPNYEG >gi|283510502|gb|ACQH01000117.1| GENE 44 56757 - 57623 714 288 aa, chain - ## HITS:1 COG:CAC2633 KEGG:ns NR:ns ## COG: CAC2633 COG4632 # Protein_GI_number: 15895891 # Func_class: G Carbohydrate transport and metabolism # Function: Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase # Organism: Clostridium acetobutylicum # 82 283 151 352 354 64 30.0 2e-10 MRNKFFIGLLLGLFLPLHLVSQTREDVVAINTAMWETKKISENIEVRQMRWPLLYGCAQT VTVAEITPKRGLAFDIAVADGGATVGEMARRTKALAAINGSYFDIHKRSAITYLRQGRTV IDTTTTAELALRVTGAIRTHGRKLHIMPWSKAIEQRYRCRHGSTLASGHLLLYRGKDVSL RSSSMGFIVKKHPRTAIALTSRGTVLFVTVDGRHPGYAGGMNLIELRHFLQQLGCTDALN LDGGGSTTLWAKGYNANGIANYPCDNRKFDHDGERKVANAVVVMKRGK >gi|283510502|gb|ACQH01000117.1| GENE 45 57851 - 59479 204 542 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 336 506 83 247 285 83 32 5e-15 
MFHPLHTTLPRPEKMNNPFFYEPDALCLMAAEQVKGLVAQHETWREEVAQGKMFGVLVVE RPAPHGTELGYLAAFSGQLGGKATWEGFVPPVFDYLQPDGYFKIHEREISEINVQLSEME RNPLLNELKTRIANEQETAETELNAYRTMMAEAKKERENKRRRGATAEEQAAMVAESQFQ KAELRRLKQRIKAQIECLQAELNVQEKAMTDLKNKRKRLSDALQRWLFQQFEMLNARGER RNLLDIFAPTPQRTPPAGTGECCAPKLLQHAFVMGYRPLSMAEFWLGRSPKTEIRREGAF YPACPGKCGPTLGFMLQGLEVACVPQKPVDRAALRVVYEDEGLVVVSKPHGMLSVPGKDG GTSVAQMMRQRYPDTDSPLIVHRLDQDTSGLMIIAKTKEMHRTLQLAFENHEVKKRYVAV LDGLLPPEKQRGEISLPLNSNYLDRPRQMVDWLNGKSAVTRYEVIGHEEGRTIVALYPLT GRTHQLRVHCAHPDGLNLPILGDPLYGRKADRMYLHAEVIEVEKLGLRVEDDKWFHPQSL NK >gi|283510502|gb|ACQH01000117.1| GENE 46 59705 - 60310 667 201 aa, chain + ## HITS:1 COG:CC3650 KEGG:ns NR:ns ## COG: CC3650 COG0494 # Protein_GI_number: 16127880 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Caulobacter vibrioides # 23 194 6 177 187 124 41.0 9e-29 MKEEKRNLAPNEKTREEGNEMAWQVLGREYLIRKQWLTARCDKVCLPTGAVVDEYYVLEY PEWVNTVAITREGDFVLVRQYRHALGVTAMEICAGVMEQGEEPMQAAQRELLEETGYGNG TWRPLMTIAPNPGTMTNRCHCFLATDVENVSGQHLDDTEDIQVHVVSRDEMLRMLKNNEL YQAMMYAPLWKYFALEDELKG >gi|283510502|gb|ACQH01000117.1| GENE 47 60661 - 62325 1899 554 aa, chain - ## HITS:1 COG:HI0035 KEGG:ns NR:ns ## COG: HI0035 COG2985 # Protein_GI_number: 16272010 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Haemophilus influenzae # 16 552 7 548 551 349 37.0 1e-95 MDWINGLFTVHSAIQTVVVLSLICYVGLILGKIHVKGISLGVAFVFFIGIIAGHWGLDID PTVLQYAEAFGLVLFVYTLGLHVGPNFFGSLRKEGISLNLWGFAVIAVGTILALVLSEVT NVPMQTMVGILSGAVTNTPALGAAQQALENFGFSTRSAALGCAVAYPLGVLGVIFAMIFL RKFFVKPDDLVPRSPQDEDHTYVAQYEIVNQAIDGKTLAEISQMAHIRFIVSRIWRNEQV IVPVSTTKLRLGDNVLVVTTDDEVGGMELLLGKKVDKDWNKDKIDWNHIDKKVESKVVVI TRPMLNGKKLGSLQLRNTYGVNVSRVTRGDTKLLGTNDLRLQYGDRVTVVGEHDALENVE HYLGNAVQTLNEPNLGAIFLGMLLGLALGTIPIMLPGMSAPVRMGLAGGPIVMGILVGSI GPRFRLVSYTTRSASLMLRQLGLSLYLACLGLDAGQDFFKTVIRPEGLLWVGLGFVLTVV 
PVLLVGYIALRTKRLDFGSICGILCGSMANPMALTYASETLEGDTPAVTYATVYPLGMFL RVIIVQMMVMFFLG >gi|283510502|gb|ACQH01000117.1| GENE 48 62293 - 62502 77 69 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHGEQAVNPIHAYIYFDILCKGSVKVVNGKMLGEIKYWLASNSVSKGKNRMIQCILQSKR IAICSQNAV >gi|283510502|gb|ACQH01000117.1| GENE 49 63086 - 63268 90 60 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLNNTQDLYRCHMNRLGFKETDAFASNTRTDAKQHTYFYRIVKMVFRAIRCHAHCGSGAV >gi|283510502|gb|ACQH01000117.1| GENE 50 63459 - 63653 87 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLLHLPSLRAGCLPFGRTMAYDIYAKTAIGMSNTQAECSFIKVGMQHTCRRSGKGVQIHR AKIQ >gi|283510502|gb|ACQH01000117.1| GENE 51 63972 - 64850 682 292 aa, chain - ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 188 291 183 286 288 68 32.0 1e-11 MRKLQLNDLTIPIVKENCDVVSMDDDLVLVDDLNKLSDLMGPYKMQCLVLAICAQGSARY TVDTITYELQPNHIMVVPAGPILNSGELSTNCKGIAILLSKDFMNEIIMGIHELSSLILF SRTNPVFTLNAEEGGEMLLHLSLIEKKMQQAGHRYRKDTVRSMITTMIYNASDAIYRVQQ TTDLKKTRAEAIFAKFIKMVEQNFKKERRVSWYGEQLGITPKYLSETVRNVSKRTPNQWI DKYITLELRVQLKNTTKSVKEIAQELNFPNQSFLGKFFKEQVGVSPSQYRKR >gi|283510502|gb|ACQH01000117.1| GENE 52 64946 - 65821 400 291 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929796|ref|ZP_06423639.1| ## NR: gi|288929796|ref|ZP_06423639.1| hypothetical protein HMPREF0670_02533 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 291 186 476 476 614 100.0 1e-174 MRLTPYQAPWISAPQPDSTAMQWFLRTYKQNGKPRKAYVSVATTGKMQLLVNGMNVSRNL FEPHRAQDDTSVVNIVYDVTPFLSRKANTISLWYCPTMPHVGRRQVAVCYWGENADGTPF AFQSDENWACRRANRGLQPPESEWQDGSDKAVAWQINYADTISWQSSVRYAHPETLPPPT PSRLVCAPRVAKVYGPSYVDVRNDTLTCVFPHHFQGVVRITFRGTRTGEKVYINGFRYIC NGATDEQFFSKFIVKTTYKLIVHGDKYFRSEHVQNVEAMETSIGLYGKFLF >gi|283510502|gb|ACQH01000117.1| GENE 53 66373 - 68751 2053 792 aa, chain - ## HITS:1 COG:no KEGG:PRU_2351 NR:ns ## KEGG: PRU_2351 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 30 787 6 762 769 912 58.0 0 MRKTILATATLLFSALTSFAQGVPLRIWDNRPGSFFENSMPIGNGKLGAMVDGNPHCDYL KLNDITLWSGKPIDPNEDAGAHKWIPQIRKALFEENYALADSLQLRVQGHNSAWYQPLST LCICDVKAAANADAPLKNYRRELDLDSSLVKVSYESEGVSYRREYFASHPGRAIMVRLTA NKPHAISLQLSLTSLLNHQTRVEGNTIRLMGHAEGHPDSTVHFCNLLQAKATGGTITAQD STLLISNATQVVLYIVNETSYNGFDKHPVTQGAPYVQLAETDLKNLQNCTFEQLKQNHTD DYQALFGRLALHLDGTKLDMHRTTEQQLQDYTKRGETNPYLETLYFQFGRYLLISSSRTP GVPANLQGLWNPHVRAPWRSNYTVNINLEENYWPAQVANLAELTTPLVGMVKALSVNGRY AARNYYGINEGWCSSHNTDLWAMTNPVGEKRESPEWANWNLGGAWLLSNLWEQYDFTRDR HYLRHTLYPLMKGACDFMLQWLVENPKQPGELITAPSTSPENEYVTPDGYHGTTVYGGTA DLAILRELFANTATADEILNGRPTAYSKILRQTIGRLHPYTIGKEGDLNEWYYDWNDFDP QHRHQTHLIGLYPGHHIAPETTPELAEAARKTLVQKGDISTGWSTGWRINLWARLYNGEK AYQIYRKLLTYVAPDAIRKSDAGPGGGTYPNLFDAHPPFQIDGNFGGTAGVCEMLMQSAR GIRLLPALPAAWPSGSVKGLCARGGFVVDFSWRNGSVTQVRIKSNVGGQTTLYYNGKAHK VKLKAGKEAVLR >gi|283510502|gb|ACQH01000117.1| GENE 54 69157 - 71130 2666 657 aa, chain - ## HITS:1 COG:CAC0006 KEGG:ns NR:ns ## COG: CAC0006 COG0187 # Protein_GI_number: 15893304 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Clostridium acetobutylicum # 11 651 8 630 637 668 54.0 0 MVENQNNENNYSASNIQVLEGLEAVRKRPAMYIGDISEKGLHHLVNETVDNSIDEAMAGY CSNIEVTINEDNSITVQDDGRGIPVDEHEKLHKSALEVVMTVLHAGGKFDKGSYKVSGGL HGVGVSCVNALSSRMLSQVFRDGKIYQQEYETGKPLYPVKVVGETDKRGTRQQFWPDPSI 
FTTTVYRWDIIASRMRELAYLNAGVKITLTDLRPDENGKTRTETFHAKDGLKEFVRYVDR HRTHLFDDVIYLKTEKQNIPIEVAVMYNTDYSENIHSYVNNINTIEGGTHLTGFRMALTR TLKAYADNDPVISKQIEKSKIEIAGEDFREGLTAVISIKVAEPQFEGQTKTKLGNSEVSG AVQQAVGEALGYYLEEHPKEAKLICDKVILAATARIAARKARESVQRKNPMSGGGLPGKL ADCSNKDPKTSEIFLVEGDSAGGSAKQGRDRYTQAILPLKGKILNVEKVQWHKVFEAESV MNIIQSIGVRFGVEGEDDKEANIDKLRYDKIIIMTDADVDGSHIDTLIMTLFYRFMPRII QEGHLYIATPPLYLCTYKNKVREYCYNEQQRQAFIDKYGEGQEDNKSIHVQRYKGLGEMN PEQLWETTMDPSKRLLKQVTIQNAADADEIFSMLMGDDVEPRREFIEQNATYANIDA >gi|283510502|gb|ACQH01000117.1| GENE 55 71448 - 72659 521 403 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929799|ref|ZP_06423642.1| ## NR: gi|288929799|ref|ZP_06423642.1| hypothetical protein HMPREF0670_02536 [Prevotella sp. oral taxon 317 str. F0108] # 1 403 1 403 403 791 100.0 0 MKVLITSCVMVFVMAISVAKTPENISYETLKNTPPMQWTCQEIEKNQDLLASMQNDDYKI YASNVKEKFELYASPSSPDDKYLITQAVLKVENPMFDTGNLLKHISSWIETNKGWKVLGV DKKNNIVNSAASINIANNASFLTINKISIAPTLSIQLVDGKMLIMSFMVDSYQNVEFHGS EDKRPITYSAKISKVFPFVPKSSYKISYAKAYVNTYKYFWNFISQLRDNLNDSFQKDNQL LAQLHYEYSRDSLNAKYGEPTKIVKGTTTDPDLNKEIIFYENAKKVVCMGKTIDFKDVVS CEVADDPEFIPGRTTTYGAEISFFGLGFGGSETRATPDKTIHNFVVHVKIDNLAMPFVHI ATGRDEQKAREIASSFEYIIRHQQDKTERIQPVTPKKSTRQRR >gi|283510502|gb|ACQH01000117.1| GENE 56 73223 - 73477 330 84 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|29348839|ref|NP_812342.1| 30S ribosomal protein S20 [Bacteroides thetaiotaomicron VPI-5482] # 1 83 1 83 84 131 78 1e-29 MANHKSSLKRIRQTKVRTLHNKYYAKTMRNAVRKLRALTNKDEAVKLYPTVQKMLDKLAK TNIIHKNKAANLKSSLALHINKLG >gi|283510502|gb|ACQH01000117.1| GENE 57 73736 - 74467 645 243 aa, chain + ## HITS:1 COG:no KEGG:PRU_2337 NR:ns ## KEGG: PRU_2337 # Name: not_defined # Def: putative DNA repair protein RecO # Organism: P.ruminicola # Pathway: Homologous recombination [PATH:pru03440] # 1 240 1 240 241 279 56.0 8e-74 MLVKTHAIVLRSIKYGDSQLITDMLTETFGRVSFVSNIPRTQKAKIKKQLFQPLTILEIE LDHRPNTRLQRLRDARMAYPFTSIPYSEAKLAITLFLAEFLCHATRSEQQNVPLYRYVET SVRWLDGAQQLFANFHLAFMIRLSRFLGFYPNLENYSPGCLFDLRSGCFVPAIPMHNDYL 
SPADAAQMATLMRITPQTMHRFALSHNERNRCTQLAINFYRLHIPDFPELKSFDVLRQVY DSE >gi|283510502|gb|ACQH01000117.1| GENE 58 75918 - 78146 179 742 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148543994|ref|YP_001271364.1| 30S ribosomal protein S1 [Lactobacillus reuteri DSM 20016] # 634 739 286 387 416 73 35 4e-12 MNVITKTISLPDGRTISIETGKVAKQADGSAVVRMGNTVLLATVCAAKDAVPGTDFMPLQ VDYREQYSAAGRFPGGFTKREGKPGDNEILTSRLVDRVLRPLFPSNYHAEVYVNVMLLSA DGVDQPDALAGLAASSAMACSDIPFDFYISEVRVARVNGEYVVNPTFEQMKEADMDIMVG ATKDNIMMVEGEMDEVTEQDLIQALKVAHDAIKPMCTMQEELAKELGKDVKREYDHEVND EDLRKQMNDELYQPVYDVTKQALAKQERHDAFDKIVTDFLEKYDAENTEKLTAEELEEKH ALAARYYDDVLRDAMRRCILDEGKRLDGRKTDEIRPIWCEVSSLPMPHGSAIFTRGETQS LSTCTLGTKLDEKMVDDVLDKSYMRFLLHYNFPPFSTGEAKAQRGVGRREIGHGHLAWRG LKGQIPADYPYTVRVVSQILESNGSSSMATVCAGTLALMDAGVPLKKPVSGIAMGLIKNP GEEKYAILSDILGDEDHLGDMDFKTTGTKDGLTATQMDIKCDGLSFEILEKALMQAKAGR EHILGKLTETIAEPRPELKPQVPRIVQIEIPKEFIGAVIGPGGKIIQQMQEDTGTTITID EVDGKGKVQVSAPDKASIDAALSKIRAIVAVPEVGEIYEGTVRSVMPYGCFVEIMPGKDG LLHISEIDWKRLETVEEAGIKEGDKMQVKLLDIDPKTGKYKLSRRVLMEKPEGYVERERR PRGDRPERGERRGRRDDRHDRD >gi|283510502|gb|ACQH01000117.1| GENE 59 78339 - 80249 2204 636 aa, chain - ## HITS:1 COG:FN1975 KEGG:ns NR:ns ## COG: FN1975 COG0513 # Protein_GI_number: 19705271 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Fusobacterium nucleatum # 1 544 7 511 528 372 36.0 1e-102 MKTFKELGVSEPICRAIEELGFEQPMPVQEEVIPYLLGNGNDVIALAQTGTGKTAAYGIP VLQKVNPEEKHTQALILSPTRELCLQIADDLADFSKYVQGLHVVPVYGGASIEMQIRQLR KGAQIIVATPGRLIDLMKRGVAQLDNVNNVVLDEADEMLNMGFTESINAIFEGVPADRNT LLFSATMSREIEKIAKNYLNDYKEIVVGSRNEGAEKVNHIYYMVNAKDKYLALKRLVDFY PRIFAIIFCRTKLETQEIADKLIRDGYNAEALHGDLSQQQRDLTMQKFRNHNVQFLVATD VAARGLDVDDLTHVVNYGLPDDIESYTHRSGRTGRAGKKGTSISIIHTRERSKVRAIEKI IQKEFVDGTLPTAKEICSKQLYKAMDAISKVDVNEEEIAPFLEEINRHFEYIDKEDLIKK IVSMTFGRFLDYYANAPEIEKPSLKRNEREGKRAERGGGRDAQRGARSKRTERGARGARK 
AEAGYRRLFVNLGKDDGFYPGEVMQFINRNVQGRVAVGHIDLLGKFSYIEVPEGDASRVM QGLSGASYKGRDVRCNDADEGGHGRSAAAKGGRKGQRFADSAPAAGGKRGRKAQHQFNDD NETGDWRQFFQQPAPKLKGAEPDFSEEGWARRRPKG >gi|283510502|gb|ACQH01000117.1| GENE 60 80910 - 81737 911 275 aa, chain - ## HITS:1 COG:CC2313 KEGG:ns NR:ns ## COG: CC2313 COG0657 # Protein_GI_number: 16126552 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Caulobacter vibrioides # 43 253 72 303 328 120 34.0 2e-27 MKTKRFLLAATLMLTAATSAFAQKTFTLNLWSSAPAVAGSDDKDTAKVQVFLPREKMATG RAVVICPGGGYQTLAMDSEGRDWAPFFNNMGIAAIVLKYRMPNGDKQVPISDAEEAMKLV RRNAAEWRIKPNEVGIMGASAGGHLASVLATKATGDARPDFQILLYPVITMQPGLTHRGS RERFLGKNPSKADEREYSTDQQVTRSTPRAWIALSDDDRTVMPINGVNYYAELYRHDVPA SLHVFPGGRHGWGRKQTFRYNLEMELELKAWLQSF >gi|283510502|gb|ACQH01000117.1| GENE 61 83495 - 85804 1802 769 aa, chain - ## HITS:1 COG:no KEGG:BVU_1454 NR:ns ## KEGG: BVU_1454 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 2 769 4 773 773 429 31.0 1e-118 MVRILTIILAASWSMAIVAQEKVTGHVYDPSHKAIVGATVVMLSSPDSTYICGEVSDAQG RFVMPRPQGEWMIAVSCLGYETAIVRPNAGDSLSIVLSEAQHQLDEVRVQASRPVSQLIR DGFRYVVRGTALARAGNLEDVLAAMPLMRKTESGYDVMGRGQAVFFVDGHRIYNLSELDH LSSDAIKSLEVVTNPGPEYDAAIKAVVRVTTYTSVAQGLSVDARSTWYQNRNTSAIEQMN LRYSARRWTVYDNFEYRLDNYLKWKDLTQTVHVDTLWNEISNEQEHRKHNRVENTVGIDY QIAEQSYVGGRYLLTLDTRNRMDLSSTNRISADGQPYDLLRTDGQERGHSNPSHLFNVYY SGAVGKFKINTDLDYLHSTTETDNAYDETSQTGRSRAFSAVSHINNRLLSLRVSVGHKLW RGDATVGAEYVRTRRNDDYTSTLGEMPTSQSLLRETQRAAFADYSVLTSVGLFGLGIRAE DSRFTFHPGNGEADVSQSVFRVYPNLSWGIRVGQLQAQLLYSTEVSRPTYRQLSKNVLYG SRYTWQMGNPLLRPEYVHELTLQGVWRIVQFQLSYSDTRNAIINWGTQSATHSAVSIMSY RNLPSVKQMRLAVVVSKGFGAWTPQLTAVMNKQFLHLTTGTGRYNLSSPIWVVKLSNNFQ IGRTLSLFLTGDYQSQGDYRNVHLSRDMWSVNFNAVKTLCHDRLSVQLKVNDIFDSKKDR NDIYNDQMVMRLLNRYDFRAVSITLRYKLNSKDYKSHSHSDADQEVKRL >gi|283510502|gb|ACQH01000117.1| GENE 62 85833 - 86057 258 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929807|ref|ZP_06423650.1| ## NR: gi|288929807|ref|ZP_06423650.1| hypothetical protein HMPREF0670_02544 
[Prevotella sp. oral taxon 317 str. F0108] # 1 74 1 74 74 155 100.0 9e-37 MTLNLSIEEFKTLRRLQHSREFRPYWIQITCILMLAHGHDVETISNDLGISPFCVYNYAE AYRTGGISKLTDNR >gi|283510502|gb|ACQH01000117.1| GENE 63 86300 - 87550 688 416 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_3271 NR:ns ## KEGG: Fjoh_3271 # Name: not_defined # Def: SH3 type 3 domain-containing protein # Organism: F.johnsoniae # Pathway: not_defined # 302 407 276 382 384 70 31.0 2e-10 MLRRLCNPLVSNHKLMISKYALMALFSLFAFFCHGRNKKDTAYVNRKLMDEQALIQTAIL EEYGKENGYKAPTYQEFQDRCMEYWGISLHKKAQDHTVNIGLCIEYTINEVGRFIFTEAE GLFYQPEKEEDPTEEDAKRVLFTRENGIGEKFLAYNKLLFNDNAATLSFFLNNPDYAIDI VFNIDYERNKMLMTKSIDFAKNVPNHSFYRDCAESILFYNNKKRGFRRNLIAEIFRTSNR TNETMTPFDDLVYAYFELFTKLKYAQQVKDACLAYLMACLIEYDTKNKDNIGGIKDEKAY QHLSNFMSKDKHLAHRLKKNNYYGNSQLKELVAAVLVMENAIDPNDAYFVDDPDGYANLR ERASSTSKVIKRVATNEMLTVLNNDGQWWKVQTKDGKTGYIHKSRIRCALKTNLLK >gi|283510502|gb|ACQH01000117.1| GENE 64 87813 - 88250 243 145 aa, chain + ## HITS:1 COG:no KEGG:ASA_1312 NR:ns ## KEGG: ASA_1312 # Name: not_defined # Def: hypothetical protein # Organism: A.salmonicida # Pathway: not_defined # 1 142 6 148 148 125 44.0 7e-28 MLSKTIINYLKRNNFYLEKEEEDYTKALLELGIDLESDLAYFCLHTAEISFKGRAGSIEN ICWYLIYSTYTRRIESLQSYLVLPKEYIPLDSFETEGGFFYNRETGEVLELELGQKLIDF QNGKLQPQWKSFNSFLEWFFGLDTN >gi|283510502|gb|ACQH01000117.1| GENE 65 88854 - 89009 223 51 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|150003118|ref|YP_001297862.1| 50S ribosomal protein L34 [Bacteroides vulgatus ATCC 8482] # 1 51 1 51 51 90 82 3e-17 MKRTFQPHNRRRVNKHGFRERMSTKNGRRVLAARRARGRKKLTVSDEHHGK >gi|283510502|gb|ACQH01000117.1| GENE 66 91112 - 91324 277 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929812|ref|ZP_06423655.1| ## NR: gi|288929812|ref|ZP_06423655.1| putative transposase [Prevotella sp. oral taxon 317 str. 
F0108] # 1 70 1 70 70 117 100.0 3e-25 MGVEGEKDYQKLKVTVMAGSFGNKKQHYAVERIKARNMFCETLLLFYGIHTANAAFLAAG QMAKGMKKAA >gi|283510502|gb|ACQH01000117.1| GENE 67 91363 - 92136 1023 257 aa, chain - ## HITS:1 COG:CAC3079 KEGG:ns NR:ns ## COG: CAC3079 COG1624 # Protein_GI_number: 15896330 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 10 251 19 253 281 173 39.0 3e-43 MFFNFGIKDIIDIVLVALMLYYLYRLMKESRSLNIFIGIMFFVLIWLLVSQVFEMRLLGG ILDKLVSVGVIALIILFQEDIRKFLYSLGAHQRVRGVMKYFYKNDNNPKEDKETIIPIVM ACMSMARGKVGALIVIERGEPLDDIVDTGDLIDARINQRLIENIFFKNSPLHDGAMVISK KRIKAAGCILPVSHNQDIPRSLGLRHRAAMGISQSSDAMAVIVSEETGNISVAIGGKFRL RLSAEELESILTEEAFR >gi|283510502|gb|ACQH01000117.1| GENE 68 92201 - 93049 1088 282 aa, chain - ## HITS:1 COG:VC0638 KEGG:ns NR:ns ## COG: VC0638 COG0294 # Protein_GI_number: 15640658 # Func_class: H Coenzyme transport and metabolism # Function: Dihydropteroate synthase and related enzymes # Organism: Vibrio cholerae # 3 272 2 270 278 165 36.0 8e-41 MSFSITVKQKLFDLSTPRVMGILNATPDSFYAGSRARTEKQIALRTNQIIEQGGTMIDVG AFSTRPGAEQVSQEEEMKRLRKTLAVVCKEQPEAIISVDTFRPDVAQMCVEEFGVDIIND VSDGGLTGVANVPLEPEPGELPQIFHTVARLGVPYILMSVQPTLRDVLLGFARKVSQLRT LGVKDIILDPGFGFGKTVEDNYLLLNEMEKMQVLDLPTLVGVSRKSMIYKTVGGNADTAL NGTTVLHTIALMKGANILRVHDVKEAVEVIALYSRTMSATTP >gi|283510502|gb|ACQH01000117.1| GENE 69 93299 - 94447 1447 382 aa, chain - ## HITS:1 COG:BS_resA KEGG:ns NR:ns ## COG: BS_resA COG0526 # Protein_GI_number: 16079372 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Bacillus subtilis # 249 365 44 160 181 82 34.0 2e-15 MKRILTTVLTTAAMLTANAANPAEGVLKVKGTLKNFGDSLILFVAEPGHNPSPRDTFVMK NGAFDFTVKLAKVSDLTVATPEAVRGQSRQYLTFVGVPGETLVLNGDADGKYTYEGSKFY KEFAEMKNALDNSNSELEALIKSLNERMEKGEKQEDLMKEYQEKAPALQAKAGLAYKDFI KAHPDYEANAIIVASLAKLEEMEEAASMMSPAVREGRMKDFYMASINRVKKQKEEEDKAA 
RVQAAGVVAPDFTLNNLNGKPFKMSSLKGKYVVLDFWGSWCGYCIKGFPKMKEYYQKYKG KFEILGIDCNDTPEKWKAAVKKHELPWLNVYNPRESKLLGDYAIQGFPTKILVGPDGKIV KTIVGEDPAFYTMLDELFGKAK >gi|283510502|gb|ACQH01000117.1| GENE 70 94683 - 95249 757 188 aa, chain + ## HITS:1 COG:slr0434 KEGG:ns NR:ns ## COG: slr0434 COG0231 # Protein_GI_number: 16331453 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) # Organism: Synechocystis # 1 185 1 184 187 175 51.0 5e-44 MINSQEIKIGTCIRLDNQIFTCIDFQHVKPGKGNTVMRTKLKNVTTGRVLDRTFQVGFKL EDVRIERRPYQFLYKDGEEYIFMNQETFEQAPIAKDAITGVEFMKENDVVEVVTDTTDGT VLFAEMPVKTVLRITHSEPGIKGDTATNATKPATLETGVVVRVPLFINEGELIQVDTRDG SYLQRVKE >gi|283510502|gb|ACQH01000117.1| GENE 71 95369 - 96406 1071 345 aa, chain + ## HITS:1 COG:Ta1048 KEGG:ns NR:ns ## COG: Ta1048 COG0463 # Protein_GI_number: 16082079 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Thermoplasma acidophilum # 4 232 7 217 256 92 30.0 1e-18 MKYSFIVPVYNRPDEVDELLESLTRQSITDFEVIVVEDGSAVPCHDVCEKYRGLLNLHYY YKENSGPGQSRNYGVERAKGDYVLILDSDVVVPSGYLQAVEDELNADDADAFGGPDAAHP SFTPKQKAISYSMTAFLTTGGIRGGKKKMDKFYPRSFNMGVRRSAYSQLGGFSKMRFGED IDFSIRIFKAQLRCRLFPDAWVWHKRRTDFRKFFRQVYNSGIARINLYKKYPESLKLVHL LPMVFTLGVILLAGVALTGVVAWLLGCVVLGQWLVIAGVSPLLVYSLALFVHALTMFGSL HIALLSVAAAFTQLLGYGFGFLQAWWRRCVKGKDEFSAFEKTFYK >gi|283510502|gb|ACQH01000117.1| GENE 72 96722 - 97402 817 226 aa, chain + ## HITS:1 COG:CAC1241 KEGG:ns NR:ns ## COG: CAC1241 COG2003 # Protein_GI_number: 15894524 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Clostridium acetobutylicum # 1 225 6 229 229 144 36.0 1e-34 MKIIDLAEEDRPREKMARIGAENLTNSELLAILIGSGSVNESAVDLMKRMLNDCGNSLKR LGRLSFNELMNYRGIGEAKAVTIMAACELGKRRQQEQAEERQDLGSATAIYNFMHARMQD LDVEEAWALLMNQNYKLIKAFRISHGGISETAVDVRVIMREALLNNTTILALCHNHPSNN TSPSTADDRLTQTVKKACETMRIHFLDHIIVTDGRYYSYHEEGRLV 
>gi|283510502|gb|ACQH01000117.1| GENE 73 97525 - 98289 850 254 aa, chain - ## HITS:1 COG:NMA0723 KEGG:ns NR:ns ## COG: NMA0723 COG2908 # Protein_GI_number: 15793700 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Neisseria meningitidis Z2491 # 1 228 1 217 240 80 29.0 4e-15 MKNVYFLSDAHLGSLAIEHGRMHERRIVRFLDSIKHKAEAIYLLGDLFDFWNEYKMVVPK GYTRFLGKLSELTDMGVEVHFFVGNHDLWTYGYLEEECGLIVHKKPLTTEIYGKVFYLAH GDGLGDPNNSFKLLRKVFHNRFCQVLFNALHPRWGMALGLNWAKHSRMKRAEGKEPPYMG EDKEYLVLYTKAYMKDHNNIDFFLYGHRHIELDLMLSKKTRMMILGDWIWQFTYAVFDGE HMFMEQYVEGESQP >gi|283510502|gb|ACQH01000117.1| GENE 74 98316 - 98636 579 106 aa, chain - ## HITS:1 COG:CC1859 KEGG:ns NR:ns ## COG: CC1859 COG2151 # Protein_GI_number: 16126102 # Func_class: R General function prediction only # Function: Predicted metal-sulfur cluster biosynthetic enzyme # Organism: Caulobacter vibrioides # 1 103 13 115 118 97 49.0 5e-21 MTQEEKLKIEERIIDVLKTVYDPEIPVNIYDLGLIYKVDLQEDGTLDLDMTFTAPACPAA DFILEDVRTKVESIEGVTSANIELVFEPAWDQSMMSYEARVELGFE >gi|283510502|gb|ACQH01000117.1| GENE 75 98763 - 99650 747 295 aa, chain - ## HITS:1 COG:mlr2757 KEGG:ns NR:ns ## COG: mlr2757 COG3177 # Protein_GI_number: 13472455 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 61 237 14 188 263 77 28.0 4e-14 MQFVSTKEFAEAHGVAERTVRSYCAQGKILGAQRIGRKWCVPTNATLPTRRNANKSISPL LETLRREKDCGMKGGIYHHTQIALTYNSNHIEGSCLTEEQTRYIFETNTIGLTDKAIKVD DIVETTNHFRCVDFIISHAKAALTERMIKHLHALLKSSTSDSAKSWFATGNYKRLPNEVG GIETTAPQKVGEAMRTLLKEYRAIKEVSLTHILDFHHRFETIHPFQDGNGRIGRLIMFKE CLNHHIVPFIITDELKMFYYRGLRNWPSIKEYPLDTCLTAQDSYKARLDYFQIKY >gi|283510502|gb|ACQH01000117.1| GENE 76 99950 - 100870 719 306 aa, chain - ## HITS:1 COG:no KEGG:PG1032 NR:ns ## KEGG: PG1032 # Name: not_defined # Def: ISPg3, transposase # Organism: P.gingivalis # Pathway: not_defined # 7 291 5 285 300 271 46.0 2e-71 MITTDKVTEIFCILDEFCKNLDAELTKNLHIAPIDEGYKRMRNRKGQMSKSEIMTILLCY 
HFGSFRNFKHYYLFFIKEHLASYFPKAVSYTRFVELMPRVFFDLMAFMRIQGFGKCTGIS FVDSTMIPVCHNMRRKFNKVFDGLAKNGKGTMGWCHGFKLHLLCNEMGDVLTFCLTPANV DDRDPRVWRVFTKVLYGKVFADKGYIKQEFFENLFNQGIHLVHGLKSNMKNKLMPLWDKM MLRKRYIIECINELLKNKANLVHSRHRSVHNFLMNLCAALAAYYFFKNKPEALPVRIEKS RQLELF Prediction of potential genes in microbial genomes Time: Sat May 28 02:39:53 2011 Seq name: gi|283510501|gb|ACQH01000118.1| Prevotella sp. oral taxon 317 str. F0108 cont2.118, whole genome shotgun sequence Length of sequence - 75512 bp Number of predicted genes - 54, with homology - 43 Number of transcription units - 32, operones - 14 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 912 - 954 8.0 1 1 Tu 1 . - CDS 986 - 2341 1481 ## COG0526 Thiol-disulfide isomerase and thioredoxins 2 2 Op 1 . - CDS 2450 - 3955 1608 ## Cpin_5144 hypothetical protein 3 2 Op 2 . - CDS 4004 - 4840 951 ## Cpin_5145 hypothetical protein 4 2 Op 3 . - CDS 4853 - 6265 1391 ## Cpin_5146 hypothetical protein 5 2 Op 4 . - CDS 6286 - 9423 3058 ## Cpin_5147 TonB-dependent receptor plug - Prom 9580 - 9639 4.8 6 3 Tu 1 . - CDS 10872 - 11177 56 ## - Prom 11387 - 11446 5.8 7 4 Op 1 . - CDS 11661 - 12962 1363 ## COG0770 UDP-N-acetylmuramyl pentapeptide synthase 8 4 Op 2 . - CDS 13045 - 14796 2035 ## COG3119 Arylsulfatase A and related enzymes - Prom 14876 - 14935 4.6 + Prom 14945 - 15004 4.5 9 5 Tu 1 . + CDS 15168 - 15443 143 ## - TRNA 15362 - 15434 82.1 # Phe GAA 0 0 + TRNA 15626 - 15701 85.9 # Pro CGG 0 0 - Term 15694 - 15747 18.0 10 6 Op 1 . - CDS 15800 - 16231 266 ## gi|288929831|ref|ZP_06423674.1| conserved hypothetical protein 11 6 Op 2 . - CDS 16224 - 16457 285 ## gi|288929832|ref|ZP_06423675.1| hypothetical protein HMPREF0670_02569 - Prom 16479 - 16538 7.2 12 7 Op 1 . - CDS 16835 - 17824 1219 ## COG3176 Putative hemolysin 13 7 Op 2 . - CDS 17850 - 18656 870 ## COG3176 Putative hemolysin 14 8 Op 1 . - CDS 19510 - 19779 222 ## Ppha_1443 addiction module toxin, Txe/YoeB family 15 8 Op 2 . 
- CDS 19772 - 20032 236 ## gi|288929836|ref|ZP_06423679.1| conserved hypothetical protein - Prom 20059 - 20118 4.3 - TRNA 20556 - 20632 83.1 # Asn GTT 0 0 + Prom 20704 - 20763 6.4 16 9 Tu 1 . + CDS 20797 - 22485 2100 ## COG0793 Periplasmic protease + Term 22538 - 22608 16.2 + Prom 25014 - 25073 3.3 17 10 Tu 1 . + CDS 25093 - 25443 399 ## gi|288929839|ref|ZP_06423682.1| hypothetical protein HMPREF0670_02576 + Term 25590 - 25627 8.8 18 11 Op 1 . - CDS 25920 - 26429 619 ## gi|288929840|ref|ZP_06423683.1| hypothetical protein HMPREF0670_02577 19 11 Op 2 . - CDS 26434 - 26658 92 ## - Prom 26737 - 26796 1.7 + Prom 26524 - 26583 7.1 20 12 Tu 1 . + CDS 26648 - 27859 317 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase + Prom 27933 - 27992 4.0 21 13 Tu 1 . + CDS 28013 - 29821 1940 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) 22 14 Tu 1 . + CDS 29994 - 31112 1424 ## COG1186 Protein chain release factor B + Prom 31349 - 31408 3.5 23 15 Op 1 . + CDS 31614 - 31751 119 ## 24 15 Op 2 . + CDS 31848 - 32348 419 ## COG2954 Uncharacterized protein conserved in bacteria 25 16 Tu 1 . - CDS 32299 - 32724 306 ## PRU_2494 hypothetical protein - Prom 32877 - 32936 3.9 + Prom 32826 - 32885 5.0 26 17 Op 1 . + CDS 32909 - 34270 1363 ## COG0733 Na+-dependent transporters of the SNF family 27 17 Op 2 . + CDS 34347 - 35258 835 ## COG1555 DNA uptake protein and related DNA-binding proteins - Term 35295 - 35330 -0.9 28 18 Tu 1 . - CDS 35365 - 36702 1418 ## COG0232 dGTP triphosphohydrolase - Prom 36724 - 36783 5.4 29 19 Tu 1 . + CDS 37728 - 37952 90 ## 30 20 Tu 1 . - CDS 38686 - 38919 63 ## - Prom 38945 - 39004 3.1 + Prom 39668 - 39727 4.3 31 21 Op 1 . + CDS 39770 - 43192 2883 ## COG3291 FOG: PKD repeat 32 21 Op 2 . + CDS 43204 - 43407 288 ## gi|288929850|ref|ZP_06423693.1| hypothetical protein HMPREF0670_02587 + Prom 44118 - 44177 2.1 33 22 Tu 1 . + CDS 44322 - 44507 116 ## + Prom 44897 - 44956 6.7 34 23 Op 1 . 
+ CDS 45143 - 48541 2509 ## COG3291 FOG: PKD repeat 35 23 Op 2 . + CDS 48592 - 48753 136 ## gi|288929853|ref|ZP_06423696.1| hypothetical protein HMPREF0670_02590 + Term 48899 - 48940 7.6 36 24 Tu 1 . - CDS 49293 - 49643 60 ## - Term 51090 - 51148 4.6 37 25 Op 1 . - CDS 51185 - 53059 1836 ## COG3468 Type V secretory pathway, adhesin AidA 38 25 Op 2 . - CDS 53060 - 54022 1153 ## PRU_2918 putative lipoprotein 39 25 Op 3 . - CDS 54082 - 54357 127 ## 40 25 Op 4 . - CDS 54435 - 56210 2204 ## COG0457 FOG: TPR repeat 41 25 Op 5 . - CDS 56219 - 56656 707 ## COG0756 dUTPase - Prom 56891 - 56950 5.8 42 26 Tu 1 . - CDS 57119 - 57331 67 ## - Prom 57514 - 57573 6.6 + Prom 57003 - 57062 5.8 43 27 Op 1 . + CDS 57258 - 57485 121 ## 44 27 Op 2 . + CDS 57529 - 60921 3011 ## COG3291 FOG: PKD repeat 45 27 Op 3 . + CDS 60933 - 61133 270 ## gi|288929859|ref|ZP_06423702.1| hypothetical protein HMPREF0670_02596 + Term 61207 - 61251 13.2 + Prom 61811 - 61870 6.9 46 28 Op 1 . + CDS 61954 - 62214 96 ## gi|288929860|ref|ZP_06423703.1| hypothetical protein HMPREF0670_02597 47 28 Op 2 . + CDS 62288 - 63871 1230 ## gi|288929861|ref|ZP_06423704.1| lipoprotein + Term 63925 - 63971 15.4 48 29 Tu 1 . - CDS 64582 - 66267 1308 ## Dfer_0822 hypothetical protein - Prom 66355 - 66414 3.2 - Term 66360 - 66405 11.5 49 30 Tu 1 . - CDS 66574 - 68133 1938 ## COG0673 Predicted dehydrogenases and related proteins - Prom 68154 - 68213 3.6 - Term 69420 - 69444 -1.0 50 31 Tu 1 . - CDS 69509 - 70261 846 ## gi|288929864|ref|ZP_06423707.1| hypothetical protein HMPREF0670_02601 - Prom 70289 - 70348 1.8 51 32 Op 1 . - CDS 70563 - 71210 505 ## gi|288929865|ref|ZP_06423708.1| hypothetical protein HMPREF0670_02602 52 32 Op 2 . - CDS 71201 - 72229 1068 ## gi|288929866|ref|ZP_06423709.1| hypothetical protein HMPREF0670_02603 53 32 Op 3 . - CDS 72242 - 73438 1275 ## gi|288929867|ref|ZP_06423710.1| hypothetical protein HMPREF0670_02604 54 32 Op 4 . 
- CDS 73452 - 75512 2029 ## COG1629 Outer membrane receptor proteins, mostly Fe transport Predicted protein(s) >gi|283510501|gb|ACQH01000118.1| GENE 1 986 - 2341 1481 451 aa, chain - ## HITS:1 COG:BH1522 KEGG:ns NR:ns ## COG: BH1522 COG0526 # Protein_GI_number: 15614085 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Bacillus halodurans # 309 424 35 150 177 68 33.0 2e-11 MKRTITVLTAACLCLASYAAVVIRGTIAPDLGNKVSLYAINDGLDSLCATAQVTRGKFVL KSAVPYNGLYVLSTDENNKLKHAIYLKDGAQLAVNYGSNLVVQPTNGNAHEKDFALWATK SAPVFYHAYLYNHTPGGKTVKPQQFLKEMEVLKQTAAALKRKANGKGDKNSLLQLKIDAD LAFYMLSYRKNNPDTVDDNFISPAALQDYKQLFSDKSVLRLPYAPEMLALYADYLAEKQN IGKEDYRQRLSLLSSEPLREAYLYQVAQGLNYDEKYTALLRAWGDTPLSSRLQQALKPIE EALAWSKVGQTAIDFRGIRPDSTTLSLSDLRGKVVVIDVWATWCAPCIRMMPYFKQLEKE LSHPDLTFMSVCLGVWAESDRWEKLIQEHGLTGNLIFIDSWTKGFAVDYHVTGVPRFMLI DREGRVLSFAAPAPNKPELKAMILEALNAKQ >gi|283510501|gb|ACQH01000118.1| GENE 2 2450 - 3955 1608 501 aa, chain - ## HITS:1 COG:no KEGG:Cpin_5144 NR:ns ## KEGG: Cpin_5144 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 18 490 18 481 488 99 25.0 3e-19 MKHIAIFSALAMACLLGSCFSDNGNYDYQPPINIRVTGVNEAYTVNPTGDKLQIAPKIYP ENRQYDCFWMVIPAAASWNEQADTISRQRDLDYNVNLNVGNYKLRLCAKDRATGVFAYEQ YDLYVTTDMDTGWWVLKSESDSADIDFFSETKAKPNIVHAANGHHLQGAAQNLYFTINYW NFNTATLRDTRVNAVFVASNKDLAVLDYFTGKIIRGYEELFVEEPARREVRDMFAGPSDV HVYVGDFVYTLFNSKYDIYKQFVLKTLGDYNLSPIHHSGRALPLLFNTKNSSLCSVSRTS PAINYFVDGNGPLSPKNMNMDLVFMGGRTTAAYNEGDEALAIVKRKDTGKYLLLTLNGKP NGMDLNPIRKSEELDGSLNMLQATHRALNQNNRIIYFAKDNRLYACNLDNNTETAQGISW NSGEQVTYMEFLKYAPYGLDSTWFDYLAIATAQNGHYKLYLHPVPAGNVKPAVKVLEGKG EVKRACFMSQTKNGFYTSTLF >gi|283510501|gb|ACQH01000118.1| GENE 3 4004 - 4840 951 278 aa, chain - ## HITS:1 COG:no KEGG:Cpin_5145 NR:ns ## KEGG: Cpin_5145 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 1 275 1 273 273 161 34.0 2e-38 
MKKILYAALLCALFTTGCKEDEKLLFNEKARAELVSENQKAPADHPFSFVWGSDTRVRDT VFIPIRVIGGSAPTDRHVLFEQVSEYIVDYTYDNKGYIIDSTVTERTDKAIPGTHYVPLN DESIRPLFTIKAGRVKDSVGIVVLRDASLKKASVRLRLRLKENENFALGERRLQEQTIII SDKLEMPSNWNYTTKAYLGNYSAPKHRLMMQVVGSKVDDVWIAEVNKSQELIVYWRGKFI EALEAFNSDPANIASGLAPMREDPTNASSPLVTFPSRV >gi|283510501|gb|ACQH01000118.1| GENE 4 4853 - 6265 1391 470 aa, chain - ## HITS:1 COG:no KEGG:Cpin_5146 NR:ns ## KEGG: Cpin_5146 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 18 470 17 499 500 294 37.0 5e-78 MKFGKFIIALLLILGTTTACNDFLNVLPSTEKEKDEMFSTVDGCRSVLIGSYIRMKQTNL YGEEMVCGTVENLAQHWTYNAGSVGEYLNKYDYKAAVVETAMGNIYNNLYKVVADVNGLL SGIEQHRSVLDDTNYNLIKGEALGLRAFCHLDVLRLFGPMPNNVPTSKVLPYVTTVTNRP NAFLSYSEFTTQLLKDLDEAETCLKDDPVRSKSIEALNKDAASEDNFFTARQLRMNYYAV CALKARAYLWMGNKEKALAYAKLVIDAKTPEGNAMFRLGTRDDCSRGDKTLSAEHVFNLK ANDLSKTLGTGRTYQKPKAELTARLYEAGTSDIRFVNLWEEVNISYYNRPFYFQKYTQTE KMPTLAKNVIPLIRLSEMFLIAIESSPLAGANAYYATFCTARDVTPTTMANDSERIDLLV KEYNKEFYGEGQAFYAYKRLGASTILWTSVPGSVETYIVPLPKDEAVYVN >gi|283510501|gb|ACQH01000118.1| GENE 5 6286 - 9423 3058 1045 aa, chain - ## HITS:1 COG:no KEGG:Cpin_5147 NR:ns ## KEGG: Cpin_5147 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: C.pinensis # Pathway: not_defined # 61 1045 178 1164 1164 988 50.0 0 MRRNRQVGRVTRLLLLMLWTTFPQLPCHAESAELASKTSQQATQGSKRTMKGQVKSESGE PLIGVTVYGADGKVSGVTDMDGMYHIDIPTSLTVVRFSYVGMNPVSITIEPGTTAVVRNL SMQSSTTIKDVVVTGIFQKNKEAFTGAVATITNKELKEFGNKNLLTSIANIDPSFNLLAN NQYGSDPNRLPDVQIRGTANLPTLTNLQDNTRTDLNTPLIILDGFEISLARMVDLNEDEV ESITLLKDGSATAIYGSRGANGVIVIKRKTPQTGRLRFSYSGSLNVEVPDLSDYHLLNAA DKLELERRSGYYESTNPGRDFILKKKYSAILENITRGFETDWLSKPLRTGVGQRHNIRIE GGDESFRYAASLQYNGVAGVMKGSNRNNFNGGITLSYRHRSLIFTNDLSIGYTKSAESPY GSFSDYTRLNPYWRPYDDNGRLIKMFDNDVDFYGGFRNLPANPLYNAMLNQKNSQQYTDI TNNFSIEWRPFEGFITRGRVGLTWRNTETDDFKPAKHTMFEADEYQTEEGSLRKGRYNYG TGKLTNYEVALTASYSKHFAEKHLLYAAANWNVTSNFARNYNFVFEGFTDEYLDFPSNAL QYQKGGKPSGGESRTRAVGLVFNANYSFDNRYYTDLAYRIDGSSQFGKNRRFAPFYSVGI 
GWNVHNEKFMKSVRFVDRLKFRASFGQTGSQKFSAYQAIATYSYYLNDRYNSWIGTYQKA LENPNLEWQKTDKWNAGVEINLFDNRLNLVADVYLDKTSNLLSSLKLPLSNGFTSFVENI GEVENRGFEVKATGFLLRNTSQRLAWSVTGSIAHNKDKVVKLSQALKDEYAKRLLTGGTT PNKVIREGESQHTIYAVQSLGIDPSTGNEVFRKKNGELTHTWSAADRVACGISQPKFRGT LSSMIRWKDFTANVVFGYRFGGQIYNSTLANRIENADKHYNVDERVFRDRWQQKDDRTFF KGLNNETITNATTRFVQDERTLSCQNIHLAYMFSNNPWLKRHLGIQVLTLSSDLSDLFYL SSVKQERGLSYPYSRRCSFTLSVSF >gi|283510501|gb|ACQH01000118.1| GENE 6 10872 - 11177 56 101 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLSLPVESVFPVAHNEEKVKSTSLCSLCSYVKKDVVMSKDEKINVFAKPNEQCRACSNIV AARKERLARSLVNFSTRQLVNLSSCQLVNSSTSQLVNFPAR >gi|283510501|gb|ACQH01000118.1| GENE 7 11661 - 12962 1363 433 aa, chain - ## HITS:1 COG:BS_murF KEGG:ns NR:ns ## COG: BS_murF COG0770 # Protein_GI_number: 16077524 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide synthase # Organism: Bacillus subtilis # 16 429 30 447 457 221 34.0 2e-57 MTTEQLYKIFERHPIVTTDSRDCPAGSIFFALKGESFNGNKFAAAALQQGCTFAVVDEAE FCPPNDERYILVDDVLRAFQLLARHHRRTLGTRIVGITGTNGKTTTKELLAAVLGQKYNV LYTLGNFNNDIGVPKTLLRLKPEHEVAVVEMGASHPGDIKTLVELVEPDLALITNVGMAH LQGFGSFEGVVKTKGELYDFMRQSKRGKVFIDANNPHLMGIAEGLDLVTYGTPARPDLFV SGKVVSAAPFLRFAWQREGGAWNEVQTHLVGAYNVLNMLAALSVGLFLGVSAEDANRALA NYVPSNNRSQLEETAHNKLIMDAYNANPTSMSVALNNFHDMEVPHKMAILGDMLELGAVS QSAHQAIVDQLSQLALDEVWLVGSEFARTSCAFRKFNNVEEVIAELKKHRPENHYILVKG SNGIHLDKLPQWL >gi|283510501|gb|ACQH01000118.1| GENE 8 13045 - 14796 2035 583 aa, chain - ## HITS:1 COG:SPBPB10D8.02c KEGG:ns NR:ns ## COG: SPBPB10D8.02c COG3119 # Protein_GI_number: 19111838 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Schizosaccharomyces pombe # 49 575 6 522 554 120 24.0 9e-27 MNLPSQLSLTMLPALALATTADKTVAQTNDTRYPKCPELATVPVNAAAQATAVRPNILYI MCDDHSMQTISAYGSALSKLAPTPNIDRLARRGMLFRSAFVENSLSTPSRACLITGLYSH QNGQRQLGEGIDTTRTFFSELLQKAGYTTAMVGKWHMHCRPKGFDFFHILYNQGSYYNPV FCSNREYGKYKQEKGYATTLITDHAIDFLEHRDKSKPFCLLVHHKAPHRNWMPEEKYFGL 
YSDVEFPLPKTFWDDYATRGSAARTQKMRIDDDLRMIQDLKVPETLDTADVESMDSYYAL LGETSRFTTEQRVAFDKYYMPRNKKFIESKLTGKELVKWKYQNYIRDYAAVIRSVDDNVG RLLDYLEQNGLSDNTIVVYTSDQGFYMGEHGWFDKRFMYEESFHTPLIISYPGHVKEGTE CNQLVQNIDFAPTFLELAGVKKPSEMSGRSLQPLFKGTNVKDWRKDLYYHYYDYPTYHLV RKHDGVRNERYKLIHFYGKGSDRAVQENKYQRTPGTSEYYTYQALKRINYFRDDADIDYY ELYDLQADPDELNNIYGKPGTEKVTKQLLKRLGEYRKELKVDE >gi|283510501|gb|ACQH01000118.1| GENE 9 15168 - 15443 143 91 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNNAELVRTLSWRENALFEEREKNNFPKPIMFDYKQGIWLSLGYKKKSLKSHDFKDPHNL LTKKCGATRNRTGDTRIFSPLLYQLSYGTNF >gi|283510501|gb|ACQH01000118.1| GENE 10 15800 - 16231 266 143 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929831|ref|ZP_06423674.1| ## NR: gi|288929831|ref|ZP_06423674.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. F0108] # 1 143 1 143 143 254 100.0 1e-66 MAKQSISRKETQVSTGDSVGKQLEQTVSVDDNSLPSPQELSQYQKISPNIVEFLIETARK EQEHRHYMEKEKLELIKESEQKDAKMNIRGMRFAFLSLVIFMGITAFALYLDKPWFSGIF GGITILSIMSIFVEVGKKPKKEK >gi|283510501|gb|ACQH01000118.1| GENE 11 16224 - 16457 285 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929832|ref|ZP_06423675.1| ## NR: gi|288929832|ref|ZP_06423675.1| hypothetical protein HMPREF0670_02569 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 77 15 91 91 135 100.0 9e-31 MYIIKSMIQFVARATYVFGRASKGQYHSDSEAIKELEREVLYGKSDRRTDAENLINDRRN VAADIRKSFNKLIMENG >gi|283510501|gb|ACQH01000118.1| GENE 12 16835 - 17824 1219 329 aa, chain - ## HITS:1 COG:VCA0646 KEGG:ns NR:ns ## COG: VCA0646 COG3176 # Protein_GI_number: 15601404 # Func_class: R General function prediction only # Function: Putative hemolysin # Organism: Vibrio cholerae # 5 295 304 583 605 116 28.0 5e-26 MEEKIIQPVERELLKSELTPNKQLRLTNKSNNEIYIVTAQDSPNVMREIGRLREIAFRNA GGGTGKALDIDEFDTMEGGYKQLIVWNPDAEDIIGGYRYIYGTDWKMNDDGQPLLATSHM FKFSNRFLTDYMPYTVELGRSFVAVEYQSIRKGTKSIFALDNLWDGLGALTVINPKLKYF FGKMTMYPSYIRRGRDMILYFLKKHFDDKENLITPMKPLIIETDPKELEALFCENDFKAD YKILNREIRALGYNIPPLVNAYMGLSPTMKLFGTAINYGFGDVEETGILIAVDEILEEKR VRHIDSFIAENRETLELTTGQNKIIYKEK >gi|283510501|gb|ACQH01000118.1| GENE 13 17850 - 18656 870 268 aa, chain - ## HITS:1 COG:VCA0646 KEGG:ns NR:ns ## COG: VCA0646 COG3176 # Protein_GI_number: 15601404 # Func_class: R General function prediction only # Function: Putative hemolysin # Organism: Vibrio cholerae # 35 234 66 264 605 79 30.0 8e-15 MGDKARLVPRPLVRWLKHIVHQDEVNGFLWDNRERVGVDWLEACVSYLDLTLNVRGMENL PAKDDGRLYTFVSNHPLGGIDGVALGAIVGRHYDGNFRYLVNDLLMNLPGLAPLCIPINK TGNQSRAFPAMVEAGFKSQQHMMMFPAGLCSRRTKGEIRDIPWTKTFIAKSVETQRDVVP IHFGGQNSNFFYRLANTCKRLGIKFNVAMLFLVDEMYKNRHKTFNVTIGKPIPWQTFDKS RTSLQWAQAVQNLVYQLPTNNQNQRLNP >gi|283510501|gb|ACQH01000118.1| GENE 14 19510 - 19779 222 89 aa, chain - ## HITS:1 COG:no KEGG:Ppha_1443 NR:ns ## KEGG: Ppha_1443 # Name: not_defined # Def: addiction module toxin, Txe/YoeB family # Organism: P.phaeoclathratiforme # Pathway: not_defined # 2 89 3 92 92 82 48.0 4e-15 MYDLDFTEQSQKEIAKLRKSDAQGYKKLKALLVELSEHPYTGTGHPHQLRYANGVWSRKI DKKNRLRYLVNNTTIVVLVLSVVGHYDDK >gi|283510501|gb|ACQH01000118.1| GENE 15 19772 - 20032 236 86 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929836|ref|ZP_06423679.1| ## NR: gi|288929836|ref|ZP_06423679.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 86 1 86 86 151 100.0 2e-35 MQIISPAELRANQKKYFDLAAHETIFVARRNDYPIRLVVVTDEEEFPTREEVASLQKALE DVKSGRVNRMERGESLDDFLERTGHV >gi|283510501|gb|ACQH01000118.1| GENE 16 20797 - 22485 2100 562 aa, chain + ## HITS:1 COG:aq_797 KEGG:ns NR:ns ## COG: aq_797 COG0793 # Protein_GI_number: 15606169 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Aquifex aeolicus # 2 391 3 390 408 205 34.0 2e-52 MRKTFIALCLAATAYTAAAQNERDHNFDVAKNLNIFNAIYKNLDMMYVDTLNANQTVGTG VNAMLKALDPYTVYYPEDKVKDLKFMITGKYGGIGAIIKYNSQLKRVIIEEPYENMPAAE VGLKKGDIILAIDNEDMTKKDQSYVSDHLRGDPATSFILQIKRPSTGKVMKFKVTRKAIQ SPAVPYYGMQPNNVGYINLSGFTENCSREVRRAFLDLKQKGMKSLVFDLRGNGGGSESEA VSIVNMFVPKGKLIVSNRGKLQRSNHEYRTTVEPIDTVMPIVVLVDGNSASASEITCGSL QDLDRAVVLGTRTFGKGLVQMTMDLPYNTALKLTTAKYYIPSGRCIQAINYKHGNGGTTE QVPDSLTKLFHTANGRPVRDGGGILPDVVVKADSLPNIAFYLAGIRDSNELLLNYQLDYI AKHPTIAQPEDFTLTDAEYNEFKQRVLNSKFTYDRESEKFLKELVKLAKFEGYYDDAKTE FDALEKKLSHNIAKDLDINKDAVKRLIIDGIIPTYYFQKGRIRHSLVDDKQMKEALKLLA DPKRYADLLKPQPKAAQPKAKK >gi|283510501|gb|ACQH01000118.1| GENE 17 25093 - 25443 399 116 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929839|ref|ZP_06423682.1| ## NR: gi|288929839|ref|ZP_06423682.1| hypothetical protein HMPREF0670_02576 [Prevotella sp. oral taxon 317 str. F0108] # 1 116 1 116 116 228 100.0 9e-59 MKKVVLFVFMLLQLWACGQVKYREVLSLADEFVSSLETDYQSYGLLGGVDKIKYTRDGLY QVFPMGRLINVKIDSMASDDDYEQLRQALASHYSADGRVRQVYRCHAGTIMIDCRN >gi|283510501|gb|ACQH01000118.1| GENE 18 25920 - 26429 619 169 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929840|ref|ZP_06423683.1| ## NR: gi|288929840|ref|ZP_06423683.1| hypothetical protein HMPREF0670_02577 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 169 23 191 191 327 100.0 1e-88 MIKSIFATAFALLISATAFAESNASIQGVAFGTNYETALKGIKGQFGTPTSINNKQIEYR NMMFKGVKFEKVTFNFQTDEQGNTYFSEARFTSRPVNKKNALKDVELLAKTMDKDYPGVT IDWEDDDMPFYKGGSSPVENNYLFTVCIYPAGKAYTTVLRYGPLPYVRN >gi|283510501|gb|ACQH01000118.1| GENE 19 26434 - 26658 92 74 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVAIFYTLYFCVMWHRLPPYANLQKISYTSLLINKKKRTVLHFITNFLAMRAKWSAEAHR CLPIFVYLCTHEQI >gi|283510501|gb|ACQH01000118.1| GENE 20 26648 - 27859 317 403 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 88 394 6 316 319 126 25 3e-28 MATMLLNDLLRGTKSAMTHKRIIAYYLNNSTSTIPELAKELNLSVPTIAKVIAEMCEDGY IVNCGKMDTGKGRSPVIYGLAPESGYFVGVDLMSNGLNLGLMNFNGELVKLDMDVPYEYR NSQEGLDELCGYVTDFIDQLDIAKERILNVGVNIPGRVNPEMGHSFSRFNFEERPLVDLM AERIGCRVYIDNDTRSMTYGEMSKGVVKGEKDVIFINLSWGLGCGLIFNGELYMGKSGFS GELGHFPSFDNEILCHCGKRGCLETEISGMALHRNLLQCVKEGRQTLLSEQIMKDEASLT LDDIIEATLKEDLLCIELVEDIGQKLGRYLAGLINLLNPEMVIIGGSLARTGDSVLQPVK SAVRKYSLNMVNRDSAIVLSKLQGKAGVTGACLLARKSLFHFS >gi|283510501|gb|ACQH01000118.1| GENE 21 28013 - 29821 1940 602 aa, chain + ## HITS:1 COG:HI0002 KEGG:ns NR:ns ## COG: HI0002 COG1022 # Protein_GI_number: 16271978 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Haemophilus influenzae # 1 600 9 602 607 458 39.0 1e-128 MQTRCHLSMLIHEQARKYGDAPALTFKPFGEDGWRTVSWNHFSLRVKQVSNALLNLGVKP QENIAVFSQNCLQYLYTDFGAYGVRAVAIPFYATSSEQQVQFVVNDAQVRFIFVGEQEQY DKAHRVFALCHSLERIVVFDRNVRISTHDPNALYFDDFLKLGENLPRQSELERRWRDANE NDLCNILYTSGTTGDSKGVMLTYSQYHAALKANDECVPVGETDRVISFLPYTHIFERAWA YLALSEGAQIIINTNPHDIQDSMRQTHPTCMSSVPRFWEKVYAGVKERMEAASPLQRRLF KHALEVGRRHNVHYLSRGKRPPLPLRMEYKVLNRTVLSLVRKQLGLENANIFPTAGARVS PEVEEFVHAIGLFMMVGYGLTESLATVSCDRVDKPYTLGSVGRVINGLQIKIGENNEVLL KGPTITKGYYKRESLNSKIFDKDGFFRTGDAGYIKNGELFLTERIKDLYKTSNGKYIAPQ MIEALLLVDKYIDQVAVVADERKFVSALVVPEFRVVEEYAHAHGIDFNSREDLCANRQIH QMMLDRIHTLVQQLSPYEQIKRITLIPHHFSMENGELTNTLKLRRPVILANYKELIDKMY EE >gi|283510501|gb|ACQH01000118.1| 
GENE 22 29994 - 31112 1424 372 aa, chain + ## HITS:1 COG:NMB2138 KEGG:ns NR:ns ## COG: NMB2138 COG1186 # Protein_GI_number: 15677951 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Neisseria meningitidis MC58 # 5 366 11 365 367 258 40.0 1e-68 MITADQLKDIQERAEALHRYLDIDHKRVELEEEELRTQAPDFWDDPARAQEQMKKVKGIE KWIVGYKEVRQAADELQLAFDFYHDDMVTEEEVDEDYRKAITAVENLELMNMLRQKEDPM DCVLKINSGAGGTESQDWAQMLMRMYMRWAESHGHRVTISNLQEGDEAGIKSVTMEIEGG EYAYGYLKSENGVHRLVRVSPYNAQGKRMTSFASVFVSPLVDDTIEVYVDPSKVSWDLFR SGGAGGQNVNKVETGVRLRYQYTDPDTGEEEEILIENTESRKQLENRNNAMRLLKSQLYD RAMKKRLEAQAKIEAGKKKIEWGSQIRSYVFDDRRVKDHRTNYQTADVDGVMDGKIDDFI KAYLMEFPVTEE >gi|283510501|gb|ACQH01000118.1| GENE 23 31614 - 31751 119 45 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSEKSKMRRAQREARQEKQAKQVIYWLVAVLALLALILLVYAMNS >gi|283510501|gb|ACQH01000118.1| GENE 24 31848 - 32348 419 166 aa, chain + ## HITS:1 COG:all4694 KEGG:ns NR:ns ## COG: all4694 COG2954 # Protein_GI_number: 17232186 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. 
PCC 7120 # 3 157 2 153 153 141 51.0 6e-34 MSGQEIERKFLVEKGSAYREAAFSSSHIVQGYIPCEGATVRIRKRDNRAYLTIKGRSTNG GLSRYEFETEIPPHEADQLLLLCRGGVVDKRRYLVKSGAHTFEIDEFYGNNEGLLLAEVE LESEDETFEKPHFIGPEVTGDRRFYNSNLLANPFCLWRNTLPEAYR >gi|283510501|gb|ACQH01000118.1| GENE 25 32299 - 32724 306 141 aa, chain - ## HITS:1 COG:no KEGG:PRU_2494 NR:ns ## KEGG: PRU_2494 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 14 131 3 121 128 115 46.0 5e-25 MKPTGLQKVNYGRHIIMAYPFATCIILVIWVICLMPIPETPLSHLTLFDKWMHITMYAVL CVVVWAEYLRRHRELNKMRLFIGIFLAPLLMGGLIELAQATCTGGNRSGDWLDFAANSIG VVLGNLIGMLLVRCFAKGKRD >gi|283510501|gb|ACQH01000118.1| GENE 26 32909 - 34270 1363 453 aa, chain + ## HITS:1 COG:BH1128 KEGG:ns NR:ns ## COG: BH1128 COG0733 # Protein_GI_number: 15613691 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Bacillus halodurans # 5 453 9 451 453 301 40.0 2e-81 MSESRAGFGSKLGLILATAGSAVGLGNVWRFPYMTGADGGAVFILIYMCCILFLGIPCMV SEFIIGRHGASNTARAYTKVAGKGSPWRWVGFLGVATGFLITGYYAVVAGWCLQYMYASA TNQLQGDHAAIANYFSNFSSSAFLPVFWTAIILLFTHYVIIHGVRGGIEKASKLFMPTLF VLLLVIVVASCMLPGAGKGIAFLFNPDFSKVNSNVFLDALGQSFYSLSIAMGCICTYASY FSRQTNLLKSAVQISVIDSVVAILAGLMIFPAAFSVGVSPDSGPSLIFITLPHVFQQAFS GVPIVGYVLSLMFYALLGLAALTSLISLHEVSTAFFFEELHTTRKKAATIVTAGTLTIGV FCSLSLGAMSGLSIAGHTLFDIFDFVTGQIFLPVGGFLTCIFLGWFVPRKLVKDEFTNWG TLRSTLFGTYLFSVRYICPLCILAIFLHQLGLF >gi|283510501|gb|ACQH01000118.1| GENE 27 34347 - 35258 835 303 aa, chain + ## HITS:1 COG:TM1052 KEGG:ns NR:ns ## COG: TM1052 COG1555 # Protein_GI_number: 15643810 # Func_class: L Replication, recombination and repair # Function: DNA uptake protein and related DNA-binding proteins # Organism: Thermotoga maritima # 176 303 47 181 181 78 35.0 2e-14 MSIKELFYFSKSDRKAILFLLVAIVVIFVAIRWSSGLSITADKPAAEQRKASKQAVSSSH ADKDWSLDEPTQMAIHLTPFDPNTAPPEQLLALGLQPWQVKSLLKYRNKGGVFRQPSDFA RLYGLTLKQYRQLEPYIRIADDYRPAADFIASERPRTMHQRDTLHYPMKLKAGMHVFLNT 
ADTTDLKKVPGIGSYFARQIVNYRNRLGGFSSIEQLAEIDDFPIESLTFFAVSADELAKI HRLNVNKLSLNELKRHPYINFYMARAIVDFRRLHGDLQSLEQLKLLKDFSEEKIRRLQPY ITF >gi|283510501|gb|ACQH01000118.1| GENE 28 35365 - 36702 1418 445 aa, chain - ## HITS:1 COG:PA3043 KEGG:ns NR:ns ## COG: PA3043 COG0232 # Protein_GI_number: 15598239 # Func_class: F Nucleotide transport and metabolism # Function: dGTP triphosphohydrolase # Organism: Pseudomonas aeruginosa # 1 443 1 442 443 270 38.0 3e-72 MNWQQLISNMRLGQEGKHPERHDDRSEFKRDYDRLIFSAPFRRLQNKTQVFPLPGSVFVH NRLTHSLEVASVGMSLGNDVARALVKLHPELANTLFEEIGTIVSAACLAHDMGNPPFGHS GEKAIQTFFSEGAGLTLKDKLPPKFWTDLTHFEGNANAFRLLTHRFNGRRAGGFVMTYTM LSSIVKYPFESGASGKKSKFGFFATESDTYCKIADELGVIRLSAPNEPLRFARHPLVYLV EAADDICYEVMDIEDSHKLKILSFEETENLLLGFFDEQEKQRIRNRIEEEGVTDVHEKVV YMRACVIGLLERKCVDAFVANEQKLLDGTLEGCLIDHISPLEREAYRRCEMVSRQKIYNS KPVLDVELSGYNIMETLLTALVEAAAEPQRYYSRQLIQRFSSQYDINSPSLETRIQAVID YISGMTDLFALDVYQKIKGISLPIV >gi|283510501|gb|ACQH01000118.1| GENE 29 37728 - 37952 90 74 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLSSTFNLTPPQTYYPINLHSHTLANSQAHALTFFQPHAPTAPPTHQLTSSHPHQLSSSC SHQPTRHNLINSKS >gi|283510501|gb|ACQH01000118.1| GENE 30 38686 - 38919 63 77 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MREQPISLNRANGLLWFWVFSTGRKLAVKGGCVLHFHDEVSVGHERHTQQTHALPVLTAK TKFQRKSYISCKRSYCQ >gi|283510501|gb|ACQH01000118.1| GENE 31 39770 - 43192 2883 1140 aa, chain + ## HITS:1 COG:MA4289 KEGG:ns NR:ns ## COG: MA4289 COG3291 # Protein_GI_number: 20093078 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 234 692 503 901 1734 189 31.0 2e-47 MNTFRLLLLTLFVAAMPIMGFAYTGGQIVTFNQLWYKVLSPTSHTLAFIGTNGFPSGPLT LPNSVFDGKDATFTITELQYEPGYACYGVTELTIPETVKTIGAFSVLWAPLIKVHIPQSV TNISESAFCCLNKAPKFTVASGSLHFEAGSDGALYSKGKTKLYSVPSDVQLAAGGVYNVD NNVKRICIGAFLKTNGLKKVVLPNGLQSIDMGMFTIAPTSSLEAFEVRPGGSTPFSAVDG VLFKGDELVLYPQGKSTQNYVVPNGIKKLATYSISDNGHLRSIDLNQVTTLRNSAIYNAL ELTSITIPKDIKKYGKVPGQGLMEGCFESCPKVSTYYVASGNTDFVIENGVLFSADKTLL 
YCYPPSKPGSSYAIPNSVTQIGNNAFQNAQFITSIHIPQNVIVIGSSAFRQLSNLSTLTF DANSKVNYISDLAFRGCSKLNVVTLPKGITSLGAVFYECENLEVINVPANSKLTTIRSGA FSTNRKLKAFNFMGECQLNRIGANVFAGLKELTAFSIPKSVAYIDANAFIGCKNLTTVTF DPNAVIQKIGEGAFADCGITNIKIPQKVVEIEREAFLKCEALTTINITKATTSISPEAFK YCSRLTAINVDKDNEVYSSIDGYLLSKNKQELMIFPPGKANERFTLLPPSVIKIGDYAFY DCKKLKNVTIPNKVKHIGKRAFGFCDNLTTITFLCDEIIPAANISQGGNDAAFDADKGAT KKMGKITINVRKNKFVDYQSIPFYKKFALIRPSFTISKEEYILVSEKAAAMLSTENADHT FVLPTDFHYGGKKYEVSFIGDYAFEHVSPNVKEVVVKKNVEYIGAKAFVTTGNTLKNVFF IEGNPTKGMLSTTRFDLDATGDNYNEFAPDTKIYVKKSAYNKYKAEWTKTRYNSVTKHEE QSPYNFTNQIDYRIKDVKINTKYATFAREFDVDFKECVTENGSRVAAFVSGSKVVFGKPD YGTATYRIRMTSIDCNGGVSDSYGYVPANTGVLLKVIDNKSTTNADFYYTIGEKDDVQYS IANNVMKGVTVNATKVVPFGSARYYVMQGGSFRRVTVPVNNFPVHKAYLKVNKPAGAKLI LQFDDEEIAADEGETVTGIGGVTNDENGKEDDAYYNLNGQRVDNPQKGIYIHRGKKVIIK >gi|283510501|gb|ACQH01000118.1| GENE 32 43204 - 43407 288 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929850|ref|ZP_06423693.1| ## NR: gi|288929850|ref|ZP_06423693.1| hypothetical protein HMPREF0670_02587 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 67 1 67 67 124 100.0 2e-27 MKKRTYRKPQAGLVNLEIEQLMIVASPGVGGGYDPGKPIEGKGTNCFEDEEDTSLEPEND QDIGDYW >gi|283510501|gb|ACQH01000118.1| GENE 33 44322 - 44507 116 61 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MILASLLHHVEDENLETLVVAALVCSCQCDFGNRLTSYTKKLTLELVRFQNCDGTYLTFM I >gi|283510501|gb|ACQH01000118.1| GENE 34 45143 - 48541 2509 1132 aa, chain + ## HITS:1 COG:MA4292 KEGG:ns NR:ns ## COG: MA4292 COG3291 # Protein_GI_number: 20093081 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 175 660 717 1191 1995 182 30.0 3e-45 MNTFRFTFLVLLFVTMPIRGFSYTNDQVVVFGDCYYKVVSEEKLTLSFLGTKPTKTGALT LSAKITDNEGYVLTVVGVEFNPLYKSVGITSVQLPETIEFIKMYAFRWAKLSTMNIPKKV KTIEPGAWIALFSTPKFTVDPSNSYFENDEDGALYTTGKRSLLSVPSNVPLKGGTYTVNS DVKYINAAAIQNIEGLTKIVLPKNLEGTYEGYPTIAPISTLVEMEIPSDAPHYKTIDGVL FRDTTLVVYPMAKADKNYTVPNGIKAITSYAISSSRFLESISLNEVTRLVGSAVYNAPKL TTITLPKKLKPFTYETGDGMMEGGFEACPNLTEYKVDPGNADFVAVNGVLFSKDMKTLYF YPAAKAGASYTIPSTVELIARQAFQGVKGITTMHIPAKVKSINAMAFRGMSNLERIIFDA NSQLEDIQYYAFWSCNKLKEVTLPKSLLYLNEIFYVCTGLETINIPAGSKLRGIKIKAFA TNTKLKAFNFLGDCDLEKIERNVFAGLSNLQSFKIPKSVKTIESNAFIGCTNMKTVTFDP NADMVEIGAGAFADCGITSINVPKKVKKIEREAFLRCQALTTIDITKATTYISPEAFKYC SNLTAINVDKDNSVYSSVDGYLLSKSKNTLMIFPPGKANSKFTPLPPSITAIGEYAFYDC KKLENVTVPNKVMSIGKRAFGLCDNLNTITFLCDQIITPMNINQEQNEESFDVDKPGQKK MGKIKINVRKDKLSAYQSDPFYQKFASINSSFQAGTEEYIAVSENAVDLLSTTRNDHTFI LPTEISNGSKKYEVSLIGDYAFENAPNSIKEVVVKQNVKYIGAKAFVKNNSKIQSVFFIE SEPTKEMLGTTRFELDATGKNYNEFDPDTKIYVKKSAYDKYKAEWTKTVYNDTKKREGLS PFNFTSQIDYKIKDTQIETKYVTFAREFDVDFGDCASNGGVNVAAFVYGRLLNGSGDYGG NTPQHIRMKSIDQHGGVSDSYTYVPANTGVLLKILGSKNSTGGLLYYTIGEKDNVTYHIT DNVMNGIEVKSKKVDASTTSPIYVLQKGVFKKATAPILNFPVHKAYLTLPSGAAPSKELK PCFDDDETTNIENATTSEEMKANDVYYNLNGQRVSNPQKGVYVHNGRKVLVK >gi|283510501|gb|ACQH01000118.1| GENE 35 48592 - 48753 136 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929853|ref|ZP_06423696.1| ## NR: gi|288929853|ref|ZP_06423696.1| hypothetical protein HMPREF0670_02590 [Prevotella sp. 
oral taxon 317 str. F0108] # 1 53 1 53 53 94 100.0 3e-18 MFLPIEQLMIIASPGVGGEYDPGKPIDGKGSDFFDENEDELIKQQGYEPTWGY >gi|283510501|gb|ACQH01000118.1| GENE 36 49293 - 49643 60 116 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVKFKFQSVVSLLINSFRPFPSPPIASVTKDVFTRKHEPKLDIKEHIFLFCWKNTVCLQM VRSPLTSLSITGGVSLHFTFNPNQSSLNRLSNILSYSHRCLFIFRLLAGILSVLSS >gi|283510501|gb|ACQH01000118.1| GENE 37 51185 - 53059 1836 624 aa, chain - ## HITS:1 COG:NMB0700 KEGG:ns NR:ns ## COG: NMB0700 COG3468 # Protein_GI_number: 15676598 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Neisseria meningitidis MC58 # 63 467 965 1359 1815 65 24.0 2e-10 MKHVLTILLALMVAFPLGAQKKRTAKKSRPTATTQKRTQAKPKGATKGKTAPPAKKNAKI AKAPAKKTVKPAPKKGAKAANKRGNAAREKNTQAETNKVIKGLQGQREAIKKKIQQQEAA LKANQADVKKRLQELMVINSEIGERQKNIDGIQKDIKHIEGNIGILNSQLATLEQQLADR KAKYIRSVRYMARHRSVQDKMMFVFSAKNLAQVYRRMRFVREYASYQKAQGELVKAKQQQ VTEKHIQLEEVKGQKNTLLYKGRQEHAALQTKQGEQQKAVDGLQQQQKTIQAVIAEQQKK DAALNAEIDRLIAIEVEKARQRAIAEAKRKAEAAEAAKRKAEELARKKAAAEAAARENQR RIAEAKAREERLRAEAKVAAEAERARRVEAERAAREAEERRARAERETADKRSRAEKEAR EAAERKARADQAARESAEKQARAQQAAREAEAARVAAERKAEAERARSEKAIADSKKEVE ETRKLSTVDRMVSGGFEANKGRLPMPITGNYRIVSHFGQYNVEGLKNVKLDNKGINILGG NGCQARSIYDGEVSAVFGYGGSMVVMVRHGAYISVYANLRSANVTRGQHVTTRQTLGTVG ADNILQFQLRKETAKLNPETWLGR >gi|283510501|gb|ACQH01000118.1| GENE 38 53060 - 54022 1153 320 aa, chain - ## HITS:1 COG:no KEGG:PRU_2918 NR:ns ## KEGG: PRU_2918 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 22 320 3 287 287 169 31.0 1e-40 MIKKKKTTTACTRNGLRHITGFKAAVRILSLALVVVLTPLIWTSCKTKSAVVSDSSTPTH KAARDREVLANNQLAFMRRVNDNQVYAKNIVGSISLNVQGAGKDVTVPGQLRMRKDEVIR LQAFVPLLGTEVGRVEFTPDYVLVIDRIHKEYIKADYNQMDFLRKNGLNFYSLQALFWNQ LLLPDRPRITESDLNQFSVTFGGGSSNPITYSTGNFNYAWLADADNGRLRQTDITHQDAT RGTSRLTWKYGDFKGVGVKMFPASQVFSMSSAAVKGGQTLQVSIKMNEVKTDEKWEAQTT VSPKYKRVKAQDVFNKILNM 
>gi|283510501|gb|ACQH01000118.1| GENE 39 54082 - 54357 127 91 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFKELSNALFRALFGKLIQLVRLEKLEKQESRKSRTTRTTRKPRITRITRKPKTTRKPSS TRKISKPKPTNATGAAIFSPFHLFTFSPFHL >gi|283510501|gb|ACQH01000118.1| GENE 40 54435 - 56210 2204 591 aa, chain - ## HITS:1 COG:aq_854 KEGG:ns NR:ns ## COG: aq_854 COG0457 # Protein_GI_number: 15606205 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Aquifex aeolicus # 67 574 65 523 545 63 22.0 9e-10 MKRNLIHIAVATATIATAFTMATPTLWAKKKVQPARVVPTKTLSYNDQRRFDYFFLEAVK QENTGNYASALALMSHCLSINPNAAEVYFMQAPYYVQIKNDSLALACLQKAASLRPDNAT FTERLGQFYISSGNFDKAIETYERLTTNSGERDNDDALNILMQLYNKKKDFAGVLRSLER LEQVNGVSEETTLSKVRAYEMLDNKNAAYKALKQLSDDHPNDVNYRLMLGNWLMQNQKEA AAHKIFTDVLADDPDNSFAQASLYDYYRAKNDEAKASQLRDKMLISSKTDDETKLSIMRQ VVQENERAGGDSTKVLALFDKITEANPNDAGMAELKAAYMSVKKMPDSLVVDALKRVVAI APDNAGARLQLIENLWRKQKWDEILALSKQAQAYNPEEMAFYYFEGIACFQKDKKDEALD AFRRGVTQINANSNKDIVSDFYELMGEILYQKGLAKEAFASYDSCLQWKPNNLGCLNNYA YYLGEKGIELDKAEAMSYRTVKEEPNNGTYLDTYAWLLFLKRRYSEAQVYIDQALKNDSN SLKSKVVVEHAGDIHAMNGDTNKAIEYWKKALELGADKAVINRKIKLKKPF >gi|283510501|gb|ACQH01000118.1| GENE 41 56219 - 56656 707 145 aa, chain - ## HITS:1 COG:FN1028 KEGG:ns NR:ns ## COG: FN1028 COG0756 # Protein_GI_number: 19704363 # Func_class: F Nucleotide transport and metabolism # Function: dUTPase # Organism: Fusobacterium nucleatum # 2 143 4 145 146 171 59.0 6e-43 MLKVKIINRGHQPLPTYATALSAGMDLRANIDEDITLLPMQRRLVPTGLHMALPEGYEAQ IRPRSGLALKHGITVLNTPGTVDADYRGEIMVLLINFSNEPFIVKDGERIAQMIVAKHEQ VSFELTETLDETERGAGGYGHTGLK >gi|283510501|gb|ACQH01000118.1| GENE 42 57119 - 57331 67 70 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTCISCILIIRTYNSVTLNKGAFAISNLYRTLFFMGIRCALRSFQPVVHDVEYERALPVT TFTKQLFASA >gi|283510501|gb|ACQH01000118.1| GENE 43 57258 - 57485 121 75 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MANAPLLRVTELYVRIIRMQLIHVMENERINTVGGYETQVLCKCKTIEYGNAVVRWAFKQ LACCLLLGLIVLLNI >gi|283510501|gb|ACQH01000118.1| GENE 44 57529 - 60921 3011 1130 aa, chain + ## HITS:1 
COG:MA4289 KEGG:ns NR:ns ## COG: MA4289 COG3291 # Protein_GI_number: 20093078 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 255 703 519 908 1734 167 30.0 1e-40 MNTFRLLFFMLFVAMPIRGFSYTDDQIVNFGDCYYKVVSGKKLTLSYLGVKSTKTGHLNL PPTVTDKYGYVLTVIGAEYNPQYRSYGITSVHLPETVEYIWGYAFSGATLTSMNIPKKLE TIRSGAWSSIYSTPRFDVHPQNPKFESDANGVLYSRGRKTLYAVPSDVPLQGGKYTVDGQ VTYIDAAAFQSVHGLTRIALPKNLEKVYEGYPTIVPTSSFTAFEIPAGAPHFKVIDGVLF RDTTLVLYPRGREATNYTVPNGIKAITNYAISSNSKLESINLNGVTKLVKSSMYSVTKLT SITLPQNLKPYDDVTGDGMSEGCFEACSNLTEYKVHPQNKDFVAVGGVLFSKDMKTLYFY PGSKMGTTYTIPQSVEVIAGHAFQSAKYITTMHIPAKVARINAEAFRDVQNLKTITFDAN SQLQEIEYYAFRWCSSLKEVTLPKSLPKLNEIFYMCINLETINVPAGSKLKNIRHGAFST NSKLKAFNFLGDCELERLEKNVFAGLRLLESFNIPKSVRFIESNAFIGCASMKTVTFHPD AEIDVIGAGAFADCGITSINIPKKVTKIEREAFLKCQALTKIDVTKTTTSISPEAFKYCS NLTEINVDRDNNEYSSVDGCLLSKNKAKLMIFPPGKANDKFTLLPPSITIVGDYAFYDCK KLVNVTIPNKVDSIGIRAFGLCDNLNTITFLCDQMITPNRINQAQNETSFDEDKGPIKNM GKIRINVRKDKLAAYQANPFYSKFASISPSFNEKTEEYIAVSDKAVDLLSTSCADHTFIL PKSIEYQGNTYEVSLIGDYAFENAPSSVQEVVVKKNVKYIGARAFVAKNHNIKSVFFIES EPTKDMLSTTRFALDETEVNFNEFAPYTKIYVKKSACQDYKDKWTKTVYDQATKKDKPSP YNFTSQIDYRITDLQISTKYATFAREFDVDFKDCVAANGVRVAAFVAGSKIESGAGDYGT ATSHIKMTSIDKHGGVSDSYAYVPAETGVLLKIIDNKVASNNKFYYTIGEKDNVPYNISN NVMHGVFGKPATVQASTAAPIYVMQGGTFKKCTTPLNNFPVHKAYLKLKKPAGAKLILHF DDDDTTTAIEEVTTDDGNTGNDIFYNLNGQRVSNPQKGIYIRQGKKVIVK >gi|283510501|gb|ACQH01000118.1| GENE 45 60933 - 61133 270 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929859|ref|ZP_06423702.1| ## NR: gi|288929859|ref|ZP_06423702.1| hypothetical protein HMPREF0670_02596 [Prevotella sp. oral taxon 317 str. F0108] # 1 66 1 66 66 100 100.0 3e-20 MKKETYRRPLAMVLLVATEQLMIVASPGVGGGYNPHNPIDGKASDFFEEEEDENFESEGC NAFAEF >gi|283510501|gb|ACQH01000118.1| GENE 46 61954 - 62214 96 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929860|ref|ZP_06423703.1| ## NR: gi|288929860|ref|ZP_06423703.1| hypothetical protein HMPREF0670_02597 [Prevotella sp. oral taxon 317 str. 
F0108] # 31 86 1 56 56 110 100.0 2e-23 MVAFGRMFCLSILRYLSIIITIFRYKANQRMKKNQLERSLYTKPQVEVVGVEAVTLLVDA SVPGQHNKAEKGTGIEDDEDENSKYF >gi|283510501|gb|ACQH01000118.1| GENE 47 62288 - 63871 1230 527 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929861|ref|ZP_06423704.1| ## NR: gi|288929861|ref|ZP_06423704.1| lipoprotein [Prevotella sp. oral taxon 317 str. F0108] # 1 527 1 527 527 988 100.0 0 MQLTKTSIRGFAFAAALVAMVSGCQSNIDELIQTKGTDNGSTETTGLTAFTSNGSTRHAS TATRTSMSYPDGAFYWENADKIYVLDDNNAYRRSRNSVSADKQAGFKFMVPGAFANSNTY MVFYPGVNGTNNNVTIAEKQTQNEPNNTKHFGVAGDCGMAIATKNGGQFEFKLDHKVAYL CFLPKTKHAFVSTYIEKIEVTSDNFIAGNFTLDPTTKKLTGQGTKQTITLNLGGTANPKG FKLESSNAQLNQNGAFMVIKPGAHKLTIKYYLKDAVTKVGGVVTKVVKSKNYEENKFYDV SQDIEVRDYNSQLYYMWDAKKDYWSGGANQPKDNSDPAGTGYATSSPDERWYNEIKGGPA PWNASNKAKDCPNANECKWYVMKGDPHWDNSTLWTIWGHLRSGGIWLKKQSVIAAENNIT PVSDLKKGYGLNGKKYDYTTNFTDGGDANKTVAKGRPSHITNYFFLPATGCYIEGKLEQL GARGYYWTSSGSLFTPKQAFYLHFNETEVRMANLNKRNTGQNLWKAE >gi|283510501|gb|ACQH01000118.1| GENE 48 64582 - 66267 1308 561 aa, chain - ## HITS:1 COG:no KEGG:Dfer_0822 NR:ns ## KEGG: Dfer_0822 # Name: not_defined # Def: hypothetical protein # Organism: D.fermentans # Pathway: not_defined # 190 561 187 561 565 268 36.0 6e-70 MKQQIVKHLLLLSVLVSIPLHGQETYSSLTRAALETMWQSADSATYKKSLAMYEHAFALF PDSIEDIGLYKASVLAGYLNEKDKAFNYLNRLFQLSSKLPQLTKWELVIGEYAQDEYANL LSDPRWDDLKIKALKAKQTFYDGLKALEKEFYDVDKAPLNNAKDGKILYSEIKKKAGFLP KQQQDYSLSFVINDTARTAFLVHLPPNYNPAKRYPLLFFLHGAVRNNRLTDYLPASWVLY DWNRYYTKYAELNDVILVFPQGSKEFNWMTSDKGFFMVPKILTLIKKAINVDDNKVFITG HSNGATGSFSYLMKQPTPFAGFYGFNTQPKVFTGGTFVENIKNRSFINFSTDQDYYYPPN ANDDFTKLMNSINADYKEYRYNGFPHWFPEFDESEPAYKILFADLLQRQRNPFPKEISWE FDDNRYGNIDWLTNMQLDTLANRKAWHKEKNFKINQWLEYNEKDSLITKTVDKMAFDFPR KSGKILAKYANNVFHIETSCIKSFAVNIAPEMVDTNKKVRIFVNGKLRFDKRIGYDTSLM LKNFNATQDRSQVWINQIDIR >gi|283510501|gb|ACQH01000118.1| GENE 49 66574 - 68133 1938 519 aa, chain - ## HITS:1 COG:TM0585 KEGG:ns NR:ns ## COG: TM0585 COG0673 # Protein_GI_number: 15643351 # Func_class: R General function prediction only # Function: Predicted 
dehydrogenases and related proteins # Organism: Thermotoga maritima # 62 220 3 160 360 65 27.0 2e-10 MLRKNLLLQLSFVALALMPQTMKAQFNWPYKKVNGTLVTDVPQRDAGQQSALNLVTPKMK VVRVAFVGLGMRGPGAVERWTHIPGIEIKALCDYEKSRAEGCQKYLKQASMPAADIYYGE NGYKEICKRPDIDLVYIATDWAHHFPVAKEAMTNGKHAAIEVPSAMNMHEIWELINLSEK TRLHCIMLENCCYDFFELNSLHMAQEGLFGDVIYAQGAYRHELSPFWKHYWKNGPNDKLG WRLRYNKEFRGDVYATHGLGPIAQVLNIHRGDRMRTLIAMDTRSFNGKKQVEKFSGEPCD TFRNGDQTTTLIRTEQGKVMEIIHNVMTPQPYNRMYQLTGTKGFANKYPVEGYAVSAKDM KDAGVTPSNDDLSGHDYLKEADMKALVERYTSPIVKKYGDEAKEVGGHGGMDFIMDSRLV YCLQNGLPLDMDVYDLAEWCCLAELGSISMDNGFMPVEVPDFTRGHWNEVKGYKHAYAAP EDEAKALAEAKAFTAQLKAKGAKYWAAADKKAAKKKGKK >gi|283510501|gb|ACQH01000118.1| GENE 50 69509 - 70261 846 250 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929864|ref|ZP_06423707.1| ## NR: gi|288929864|ref|ZP_06423707.1| hypothetical protein HMPREF0670_02601 [Prevotella sp. oral taxon 317 str. F0108] # 1 250 1 250 250 477 100.0 1e-133 MQKKAIRSLLWMQALLCAMLLFACSNDDDSNAPLPKPEPKELKVQVAADTKTNTATVNVA NAPAEASLSLVLDVKRGKALDQAKADKNGTHTFKLPLLAGYEQKLKLVVKSGATSAEVNN LTLPAAEEQYADEQIAQGLQKHKWLSDQKRSRIIVDHTSSNPYHMFVNTAVKHFEFLPNS KFTFTVSSPQSKTFPDGKWSVKNKRVDIDTRILLGPMLLRNSRIQTLTEKEMAMLTEVDG GLFLLWFTAE >gi|283510501|gb|ACQH01000118.1| GENE 51 70563 - 71210 505 215 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929865|ref|ZP_06423708.1| ## NR: gi|288929865|ref|ZP_06423708.1| hypothetical protein HMPREF0670_02602 [Prevotella sp. oral taxon 317 str. F0108] # 1 215 1 215 215 405 100.0 1e-112 MGLDKINKLKRLAYDFTVWPMLPQDIPCAVTSLSLPGSSTPDIKVGTKIDYTKVQGLREL ENQQFITANTIYPENLDSLVLKPLNYINPVKTLDLSHTKLKRCELYFGWTRGMEESRPDM PRIELIKMPTTIEKLDLSSIKTDVLDLTGLDNLRFLKINDDLDNPIKRIIFPKNLKRSNF KKSDDFIIRVDKGKTKLVNYPSWVTQDEFGNDVAK >gi|283510501|gb|ACQH01000118.1| GENE 52 71201 - 72229 1068 342 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929866|ref|ZP_06423709.1| ## NR: gi|288929866|ref|ZP_06423709.1| hypothetical protein HMPREF0670_02603 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 342 1 342 342 652 100.0 0 MNKTYITLAATTALALTLASCQKGDLLNVVQDDVELNENTAQYQEFIKERVADYARAYRF EQARANLPKLTNEANRQEGERIINFYHAKALKDGFAYLLPNGDSLFLKMKNEENLPPEKI EHILQFNQYAEFKGLGQDVTLWGAANFPNTKSIYINEAQITKMLDLDKLTKLEEVRLIFE SGNFDYTLWFPNRPFKPIDVSGYDFSKNDKITWMEFKHCDLTAIKAPTNVFPMFKASYCE YNANTINTPRARKMQFESCNILEPDIKVTNPHLRSLTITTDFDANNRGFRTFDISASRIN CFSINQQNSKQHEVEEIKLNQYLDTLEILGLGNRQKKRISWG >gi|283510501|gb|ACQH01000118.1| GENE 53 72242 - 73438 1275 398 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929867|ref|ZP_06423710.1| ## NR: gi|288929867|ref|ZP_06423710.1| hypothetical protein HMPREF0670_02604 [Prevotella sp. oral taxon 317 str. F0108] # 1 398 1 398 398 802 100.0 0 MKHIVLAALSLATLFAACDQRNVIDVHQEAAVPDLNNPVVQKLTSTTWFKDVNLNIPNDS WSSTNYSKKASGPFEGMLYSMAWLGMELQRDGTSTLIFYPPVTKYQFLLCRGKWKTSPTD PNTVIIDTKTPVGYATMRVKVMDMQTKDNVAMIRTYIDTGNRVMMIDFFNGGSVLEGGSP TEFPDLRNMRQAWFDGMQVSRQPIDKSLFNNTAWEMSPHSRDVEKDFEIPQMAIRTTFIN DLLTKTPSLVYGAKFNFAQDGNVYIDIPSVVKETAFRPWASQVEDKTIMVKGTWRTQGNR IIIETNELPFASVGEAVFQLPVDLQPLDVLFSDNNGDAVRVWKNYYIIIEYIKPANTGAW FRISSPQETYYLFMRKQPIDKLTDVKGVRETAKTLKQP >gi|283510501|gb|ACQH01000118.1| GENE 54 73452 - 75512 2029 686 aa, chain - ## HITS:1 COG:PM0741 KEGG:ns NR:ns ## COG: PM0741 COG1629 # Protein_GI_number: 15602606 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Pasteurella multocida # 2 684 127 782 784 204 26.0 6e-52 ASRIDFDPYLASSVEIQRGANSFVSGNGALGGTVNYNTKNARDLITKGQWGGFAHAGYNG KDNLRSFIGGAAALWKQWEALLMVTRRDGNELRNFAHGARTRNITSTQTDPTSFAQNAWL AKLAYAPNEHHRIDAAFYAMNRKADAEVWTLEPIDAFTAEGKPYYYSHDQSLTRQASAGY RFTSDEALVKKFAARLHYQVSYLDATTWTDYYRPNFDKHNGNMTLLHEGKRDKYRAQEIG DALLRLSADTRKLPLGALGQHSLNLTATTSLRNYNSRNVDVENPIAANNVDGYTVRHGKV YLLGESMGTTAEAYAFVNPIKRFNLALSLLDEATLSPVLNLKLGLRYDLFRTTDDADTQA RNTGYIAYLLQNVQGANMDFSPIRNTQGGLSALAVLAYTPKPWLNLAYKFSTGYRVPNIE EQYFQFYSTWPSFLVMANRDLKPEKSINHELEITGKTNALAYMFSAYYNHYSDFIELRQG LLTVSEPLMQQEKRLAYVTNVNRDRAHLVGFDAALQLYPDAWWPVLRGISLSSAWSYAQG 
ESSSGMSMLSVQPLAGNIGLAYTNANKRWEVNAKMNFHFAKPISQTTFWDRDASGRDIIR RFPAAFLENAYTFDLYAYYKIGGHITLRAGVYNLFDTQYHRWDDLRQLTNPALLGNINLF FKDGRKSLARFTQPRRYLSAAIEINI Prediction of potential genes in microbial genomes Time: Sat May 28 02:43:20 2011 Seq name: gi|283510500|gb|ACQH01000119.1| Prevotella sp. oral taxon 317 str. F0108 cont2.119, whole genome shotgun sequence Length of sequence - 8495 bp Number of predicted genes - 9, with homology - 7 Number of transcription units - 4, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 450 307 ## COG1629 Outer membrane receptor proteins, mostly Fe transport 2 1 Op 2 . - CDS 447 - 1046 686 ## gi|288929870|ref|ZP_06423713.1| hypothetical protein HMPREF0670_02607 3 1 Op 3 . - CDS 1043 - 2089 1276 ## gi|288929871|ref|ZP_06423714.1| hypothetical protein HMPREF0670_02608 - Prom 2157 - 2216 2.1 4 2 Tu 1 . - CDS 2243 - 2824 591 ## gi|288929872|ref|ZP_06423715.1| hypothetical protein HMPREF0670_02609 - Prom 2876 - 2935 1.7 5 3 Op 1 . - CDS 2956 - 3570 741 ## gi|288929873|ref|ZP_06423716.1| conserved hypothetical protein 6 3 Op 2 2/0.000 - CDS 3573 - 4352 857 ## COG0500 SAM-dependent methyltransferases 7 3 Op 3 . - CDS 4412 - 6700 2220 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 6742 - 6801 3.9 - Term 7506 - 7566 11.4 8 4 Op 1 . - CDS 7758 - 7976 67 ## 9 4 Op 2 . 
- CDS 8003 - 8230 57 ## - Prom 8340 - 8399 1.7 Predicted protein(s) >gi|283510500|gb|ACQH01000119.1| GENE 1 3 - 450 307 149 aa, chain - ## HITS:1 COG:NMB1668 KEGG:ns NR:ns ## COG: NMB1668 COG1629 # Protein_GI_number: 15677517 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Neisseria meningitidis MC58 # 41 147 20 123 791 75 40.0 3e-14 MKRHSHTTADALRGQTIRAALAHCKPGVLLAWALLFVCFMPATVSAQAVVNDTTRTRNIK EVVVKANRGARAVGMIKQTAEQLKVEMPADMTDLVRYMPSVGVSISGSRGGMRGFAMRGV EANRVAISVDGILQPDIQDNVVFSSYGLS >gi|283510500|gb|ACQH01000119.1| GENE 2 447 - 1046 686 199 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929870|ref|ZP_06423713.1| ## NR: gi|288929870|ref|ZP_06423713.1| hypothetical protein HMPREF0670_02607 [Prevotella sp. oral taxon 317 str. F0108] # 1 199 1 199 199 394 100.0 1e-108 MSRTPHFKLLAALLTMLFSLSGCIGRELDDNDTFQQAQPSEMDADNALLLSKIFVIDESN TYLWFDLHNEVANFSQPELLLPIDNGGTEGTLRIPLRGLLYEYRANEHTLTFKRVPPRFL NWGETTVSFVFNLTQTDGGSILLPGGKTQKGSKPTFELALKSILIDGVQAPIAVGTPFNA DGRTITMKPFTKKLTLING >gi|283510500|gb|ACQH01000119.1| GENE 3 1043 - 2089 1276 348 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929871|ref|ZP_06423714.1| ## NR: gi|288929871|ref|ZP_06423714.1| hypothetical protein HMPREF0670_02608 [Prevotella sp. oral taxon 317 str. F0108] # 1 348 1 348 348 669 100.0 0 MPNSNKTATRPLRWFAIAFTGLALLITGCTTSELDENERALVAKQQLADKFDQLFNLAFE SPDKAEEQFNKFVLSDLVTDDHNLLADDDVRAFQTLVQIAVRINRDLRDVPTKMNKETIF DRYVARNLLLISSEGSLYELPYRLRSGERYQRLKDLVAQVDKDTQKYIDQLSGKTQAPPK DITGIQWTWTFNIDDIRNDVADLNEGVYKSMKEFTFDFMFNNDNTITAKRFFLLPAMDQY SLRFNGDGDWREANMSETPMRYTTYGNNKLFIRVPMKNTAVSSDRYGDVKREWYYEFTYQ RSGDKLILSQPRIGLFVQIKQFLTSYADKTFDTSYPEPFKQITLTPKK >gi|283510500|gb|ACQH01000119.1| GENE 4 2243 - 2824 591 193 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929872|ref|ZP_06423715.1| ## NR: gi|288929872|ref|ZP_06423715.1| hypothetical protein HMPREF0670_02609 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 193 56 248 248 350 100.0 2e-95 MRTISIKNLAALLLMLVMGMSLTTACSDSDNNDPIDSSIVASIVGPYKATIAPTLGSKKM AEGPHTIYIERVEGNTQQVRLHYEGFNAPFLDEDDKPKKDRMPFDMTVDFTLNITQEKDG TVTLTSVKGYFKASPHNGQSVKPGQAPGGIAIPDPKGFETDRATAKGTWKDGKLEVDIEP NILPVVVKVKATK >gi|283510500|gb|ACQH01000119.1| GENE 5 2956 - 3570 741 204 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929873|ref|ZP_06423716.1| ## NR: gi|288929873|ref|ZP_06423716.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. F0108] # 1 204 1 204 204 380 100.0 1e-104 MVAHNISQLRGAAIGLVLALLTTTLVGACTDDMRSPDLERADQLLPNRNEYADSNLHTQA QAKLVAGLYDSRLDLIYYDADRVTPRLYFTSGEATVPLSAKANGSVQLRVVDFHTYFMPL YMSIDMNLLLTDTPSDTIRLAGKDGAVHTSDHGKTIGLPLPESDDAEMEGFYIKSKGEIY ALIDLMLPVPMKIRWHGKKQTATP >gi|283510500|gb|ACQH01000119.1| GENE 6 3573 - 4352 857 259 aa, chain - ## HITS:1 COG:MT2697 KEGG:ns NR:ns ## COG: MT2697 COG0500 # Protein_GI_number: 15842162 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Mycobacterium tuberculosis CDC1551 # 1 257 14 268 273 199 45.0 3e-51 MTEKKMNMEQGHWVLAKLGKRVLRPGGRELTAQMLEAMNINETDDVVEFAPGLGQTARLT VAHKPHSYTAVELNEEAASRVRHNVNYANMRVVAADAAQTGLPDACATKVYGEAMLTMQS APQKNAIVREAARLLKAGGLYAIHEIVICPDDAGSELQRTIEKDLAYSIKTNVRPLTLSA WRAVLEENGFEVVWTKQSPMHLLEWRRVVADEGWGGFLRIVWRMLRNAPARKRVLHMRSI FRKHDKNIQAISMVARKRA >gi|283510500|gb|ACQH01000119.1| GENE 7 4412 - 6700 2220 762 aa, chain - ## HITS:1 COG:AGl930 KEGG:ns NR:ns ## COG: AGl930 COG1629 # Protein_GI_number: 15890582 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 71 758 50 713 717 335 33.0 2e-91 MDARLKLGLAFWFAMTSCMAARPSYALSNALPSATDNDSTQADKVRNTLPHVVNLKEVNV LAPRQLTQPTVAQALNEQRLIAGSTSLVQMSPLRQRLSTLKDALAMQPGVIIQEFFGLND QPRLNIRGSGIQSNPQRRGVYLLQDGIPVNFADGSYIIGVMDPMTAHFVEVFKGANALNF 
GASTLGGAINFVSPTANKQHGANTLVKVEGGSYGYRAVAASMGYRWQTSDAHISTSYSAQ DGHRLYNHNTRWSIAANYGLQSLNGQVENRTYLHFTWLKFQFPGPLNMQQLMDNPKQVNA GVDLPFSMGPNVLRDKPRRFARMIRVANRTGIRLNANSDLVAAVYYQYADDQFVFPITIS IPHSYHHDGGLTVSYRLNTGKHHLRAGLVASAGQIDRRTYINKDGLESFMFAHDDLQARN LALFAEDLIRLTSKLNFVVDAHLVYNERNSSDRFPQPDLRPWYSHMSKKYRYFRSQSTTL NQHWTAFNPRIGLIYSPFKGEDVQLFGNLSRSYEPPTFDELVGTEVTTNINTSPKRLYAI ALDKQTATTLELGSKGRTTRFSWNVAAYRSWVHNEILEVKDYVRGIKRTENYPTTLHQGV EVGLNAVVADNPWGINAGKLTLGGVYDFSDFRFSGGKYQGKRLAGIPQHYLNLSAEYEHS CGLSLAAQVEWKPGETPIDHTNTMFQPAYRVWNVRAAYALGKNISLYVEMKNVANSRYAS SYVISDEIHNPPIPFPNFSAKQMAFFIPAQPRCIFGGITWQM >gi|283510500|gb|ACQH01000119.1| GENE 8 7758 - 7976 67 72 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVKQTSHANTNTTKTRLAKLTRQTSLARQARLTLHNGEKPTQSTANKANPTKKPHNKRKE MHRNNTTPVALM >gi|283510500|gb|ACQH01000119.1| GENE 9 8003 - 8230 57 75 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNSFLIAPKELRSRKRMQRYTLFRFSQNIHKKNIHFNPIFNILKHLLRSKTPLHIIYIIK YFAKITKLECNQSNK Prediction of potential genes in microbial genomes Time: Sat May 28 02:44:09 2011 Seq name: gi|283510499|gb|ACQH01000120.1| Prevotella sp. oral taxon 317 str. F0108 cont2.120, whole genome shotgun sequence Length of sequence - 6151 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 9/0.000 - CDS 199 - 1620 1022 ## COG1538 Outer membrane protein 2 1 Op 2 27/0.000 - CDS 1607 - 4795 2139 ## COG0841 Cation/multidrug efflux pump - Prom 4846 - 4905 4.5 3 1 Op 3 . 
- CDS 4913 - 5992 589 ## COG0845 Membrane-fusion protein - Prom 6012 - 6071 3.8 Predicted protein(s) >gi|283510499|gb|ACQH01000120.1| GENE 1 199 - 1620 1022 473 aa, chain - ## HITS:1 COG:RSc0009 KEGG:ns NR:ns ## COG: RSc0009 COG1538 # Protein_GI_number: 17544728 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Ralstonia solanacearum # 34 472 45 478 514 170 31.0 7e-42 MKRNNIFLLILLAIGVSSCKTPQATQVNSRTRLQLPTQYNGDTLTAKTTPTSWKMFFTDP QLKALIDSALVNNQDLKITLQQIAVAKSGMIAAKGAMLPSLSVGAEVGVSKAGRYTSEGA GNASTEMTPGKLIPDPLMNYHPGFAFDWELDLFGKLKSSKKAAVERYLATLEGQNAVRAS LISQVASNYYTLLALDNKLALIRKYIDLQRQAEKIAEIQKEADSDTELAVEKFKAELAKA RSQEFALKQEITECENALNLLLGRYPTPIQRDPKQLLNEEVKTIVTGLPSNLLSHRPDIR QAEHELAAAHWDIETARKAFLPSVNISATLGLEAFNPTYLTRMPKSLAYSVVGGLTAPLI NRKAIEANFQQADAAQLQALYEYDKTLLTAYSEVCTLLSKNKNLAAYYALKEEEVKTLEH SVEVSRQLYMNGRATYLDVLGAERDALDAQMELLETRQQQLSCVVDLYRSLGE >gi|283510499|gb|ACQH01000120.1| GENE 2 1607 - 4795 2139 1062 aa, chain - ## HITS:1 COG:SMa1662 KEGG:ns NR:ns ## COG: SMa1662 COG0841 # Protein_GI_number: 16263363 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Sinorhizobium meliloti # 5 1036 7 1032 1044 753 38.0 0 MFQKFIRRPVLAIVVSLVIVFIGILSLSNLPITQFPSISPPKVNVVADYPGANNELMIKS VLIPLEQALNGVPGMKYIESDAGNDGEGEINIVFKLGTDPNVDAINVQNRVSAATNKLPP AVVREGVKISREEPNILMYINLYSDDPKVDQNFLFNYADINISPELARVDGVGDLDILGT RSFAMRVWLKPDKLTAYGLSATDITDALQQQSIEASPGKLGESSGKHAQSFEYVLKYPGR YTTPEEYSNIVVKSTPDGQFIRLKDVADVSLGSEVYDIYSTLNGHPSAAITLKQAYGSNA RQVIQNVKKTMAHLQKDMPKGMHYELSYDISRFLDASIEKVIHTLFEAFILVGFVVFLFL GDWRSALIPCLAVPVSLIGSFAVMNAFGITLNMISLFALVMAIGIVVDDAIVVIEAVNVK IAKEGLEPLEATQKAMKEISGAIVAVTLVMASVFIPVAFMGGPEGMFYRQFSITIASGIV LSGFTALTLTPSLCALILTRERDNHIKKTWLNRVLDKFNAGFDKGIGGYRNVLQRTVSRK IITLPLLLLFCIGAFFINNSLPSGFIPQEDQGMIYAVIETPPGATIERTNKVAHQLASII AKEDGIESVSSLAGYEILSEGTSANSGSCLINLKPWKERSRTAKEIINDLENKCQQITDA 
NIEFFEPPSIPGYGAASGFELRLLDKTGSNDYHAMEKVSKSFVSALNKRPEIGSAFTFYS ASFPQYMLRVDNDIAAQKGITLGTAMDNLSTLIGSDYETGFVRFGKPYKVIVQADPKYRA FPQDLMQLNVKNNQGEMVPYADFLHMEKVYGMSEMTRHNLYNSAEVTGSAAAGYSSGQAI KAIEEVATSTLPHGYDYDFAGITKDEVDQGNQALIIFIVCLTFVYLILSAQYENFLLPLP IITCLPAGICGTFIFLKLTGLENNIYAQVAMVMLIGLLGKNAVLMVEYAVQRHNLGKSIR AAAIEGAAARFRPILMTSFAFIAGLLPLVFAHGAGAIGNRTIGTAAAGGMLFGTCFGLIL VPGLYYIFGRMADHVKMVRYQRNKPLTEEKDKRYKLENDEKK >gi|283510499|gb|ACQH01000120.1| GENE 3 4913 - 5992 589 359 aa, chain - ## HITS:1 COG:XF2093 KEGG:ns NR:ns ## COG: XF2093 COG0845 # Protein_GI_number: 15838684 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Xylella fastidiosa 9a5c # 5 347 19 386 408 110 26.0 4e-24 MKKEMLLMAIITLIVVTGCRKNKENSDQTVYAVTSPWVTNTQVSKDYVANIESRKNIEVR AQQEGILQQVYVTEGQQVRAGQPLFRISIVGAQQNVARAKAEAEQARIELQNTTTLTRGQ IVSPNAQKMAKAKLNAALADYKLAQIQARLCLISAPFTGVIGSLSKKIGSLVQGGDLLTT LSDNSSLNVYFNISEPEYIDYQQHSAERNKLPLTLILANGSAFSAKGYIQNIAGEFDSSS GNIALRARFANPKGLLRNGETGTVRINMPLHNVLVIPQQATYEEEDRRYVFVVDGAGYAH SREIKVAFEQPEVFIIAEGVTANDKILLSGVQKVHDGQHVKTRYYTPSGAMKRVQLNAD Prediction of potential genes in microbial genomes Time: Sat May 28 02:44:36 2011 Seq name: gi|283510498|gb|ACQH01000121.1| Prevotella sp. oral taxon 317 str. F0108 cont2.121, whole genome shotgun sequence Length of sequence - 114830 bp Number of predicted genes - 80, with homology - 75 Number of transcription units - 47, operones - 19 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 140 - 340 117 ## gi|288802887|ref|ZP_06408324.1| LOW QUALITY PROTEIN: transposase, IS4 family protein + Prom 455 - 514 2.8 2 2 Tu 1 . + CDS 542 - 739 278 ## gi|288929881|ref|ZP_06423723.1| hypothetical protein HMPREF0670_02617 3 3 Tu 1 . - CDS 769 - 978 188 ## - Prom 1019 - 1078 2.5 + Prom 2362 - 2421 3.2 4 4 Op 1 . + CDS 2514 - 2819 267 ## gi|288929882|ref|ZP_06423724.1| hypothetical protein HMPREF0670_02618 5 4 Op 2 . 
+ CDS 2803 - 3423 298 ## gi|288929883|ref|ZP_06423725.1| hypothetical protein HMPREF0670_02619 + Term 3503 - 3539 -0.7 + Prom 4470 - 4529 5.0 6 5 Tu 1 . + CDS 4599 - 5813 1038 ## COG3876 Uncharacterized protein conserved in bacteria + Prom 5842 - 5901 3.6 7 6 Op 1 . + CDS 5930 - 7381 1300 ## COG0591 Na+/proline symporter 8 6 Op 2 . + CDS 7394 - 10144 2665 ## BF0436 xanthan lyase + Prom 10333 - 10392 4.5 9 7 Op 1 4/0.000 + CDS 10431 - 11843 1750 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Prom 11851 - 11910 3.1 10 7 Op 2 . + CDS 12084 - 13448 1392 ## COG0477 Permeases of the major facilitator superfamily 11 7 Op 3 . + CDS 13438 - 14688 1404 ## COG3876 Uncharacterized protein conserved in bacteria 12 7 Op 4 . + CDS 14708 - 15907 1221 ## COG4299 Uncharacterized conserved protein + Prom 15909 - 15968 2.4 13 8 Tu 1 . + CDS 16148 - 16987 780 ## BT_3618 hypothetical protein + Prom 17126 - 17185 1.7 14 9 Op 1 . + CDS 17212 - 18033 981 ## COG2103 Predicted sugar phosphate isomerase 15 9 Op 2 1/0.000 + CDS 18055 - 19908 1868 ## COG1680 Beta-lactamase class C and other penicillin binding proteins 16 9 Op 3 . + CDS 20027 - 22246 2122 ## COG3537 Putative alpha-1,2-mannosidase - Term 22021 - 22055 1.8 17 10 Tu 1 . - CDS 22274 - 23413 1013 ## BDI_0229 thiol:disulfide interchange protein - Prom 23497 - 23556 6.1 - Term 24683 - 24730 14.0 18 11 Op 1 . - CDS 24773 - 25783 1170 ## COG0191 Fructose/tagatose bisphosphate aldolase - Prom 25833 - 25892 5.6 19 11 Op 2 . - CDS 25982 - 27817 1777 ## COG0471 Di- and tricarboxylate transporters - Prom 27901 - 27960 5.4 20 12 Tu 1 . - CDS 28083 - 29285 832 ## COG0738 Fucose permease - Prom 29507 - 29566 5.6 21 13 Tu 1 . - CDS 29911 - 31722 1482 ## COG0249 Mismatch repair ATPase (MutS family) - Prom 31791 - 31850 3.4 22 14 Op 1 . - CDS 31926 - 32480 755 ## gi|288929900|ref|ZP_06423742.1| hypothetical protein HMPREF0670_02636 23 14 Op 2 . 
- CDS 32486 - 33778 1270 ## COG0739 Membrane proteins related to metalloendopeptidases 24 14 Op 3 . - CDS 33854 - 34630 680 ## COG1496 Uncharacterized conserved protein 25 14 Op 4 . - CDS 34620 - 35786 710 ## PROTEIN SUPPORTED gi|149915191|ref|ZP_01903719.1| 50S ribosomal protein L27 26 14 Op 5 . - CDS 35794 - 36366 868 ## COG0563 Adenylate kinase and related kinases 27 14 Op 6 . - CDS 36436 - 36975 611 ## COG0634 Hypoxanthine-guanine phosphoribosyltransferase - Prom 37009 - 37068 4.6 - Term 38280 - 38307 -0.8 28 15 Tu 1 . - CDS 38479 - 38685 59 ## - Prom 38812 - 38871 3.9 + Prom 38664 - 38723 3.8 29 16 Tu 1 . + CDS 38779 - 40293 535 ## PROTEIN SUPPORTED gi|225086616|ref|YP_002657886.1| ribosomal protein S15 + Prom 40523 - 40582 4.1 30 17 Op 1 . + CDS 40692 - 41744 1341 ## BVU_1679 hypothetical protein 31 17 Op 2 . + CDS 41910 - 42497 616 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Term 42654 - 42705 18.7 - Term 42637 - 42696 19.1 32 18 Tu 1 . - CDS 42716 - 45226 2023 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 45258 - 45317 4.3 - Term 45632 - 45669 2.0 33 19 Tu 1 . - CDS 45759 - 45926 162 ## COG1145 Ferredoxin - Prom 46006 - 46065 2.8 34 20 Op 1 . - CDS 46280 - 47026 706 ## PRU_1922 PorT protein 35 20 Op 2 1/0.000 - CDS 47095 - 47931 235 ## PROTEIN SUPPORTED gi|145635642|ref|ZP_01791339.1| 30S ribosomal protein S16 36 20 Op 3 . - CDS 48072 - 48836 928 ## COG0737 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases 37 21 Tu 1 . - CDS 49399 - 49797 472 ## BF0592 hypothetical protein - Prom 50042 - 50101 79.6 + TRNA 50025 - 50101 82.5 # Met CAT 0 0 + Prom 50027 - 50086 79.1 38 22 Op 1 . + CDS 50311 - 50679 518 ## PROTEIN SUPPORTED gi|237713894|ref|ZP_04544375.1| 50S ribosomal protein L19 + Term 50699 - 50743 11.0 + Prom 50889 - 50948 4.4 39 22 Op 2 . + CDS 50985 - 51317 288 ## gi|288929919|ref|ZP_06423761.1| hypothetical protein HMPREF0670_02655 40 22 Op 3 . 
+ CDS 51314 - 52015 499 ## Csac_0182 hypothetical protein + Term 52044 - 52098 0.1 41 23 Tu 1 . - CDS 52127 - 52396 376 ## gi|288929921|ref|ZP_06423763.1| hypothetical protein HMPREF0670_02657 - Prom 52487 - 52546 5.6 + Prom 52623 - 52682 4.9 42 24 Op 1 1/0.000 + CDS 52860 - 54296 1622 ## COG3291 FOG: PKD repeat + Term 54308 - 54357 9.6 + Prom 54325 - 54384 1.9 43 24 Op 2 . + CDS 54404 - 55999 1550 ## COG3291 FOG: PKD repeat + Term 56102 - 56145 1.5 44 25 Op 1 . - CDS 56083 - 56355 126 ## gi|288929924|ref|ZP_06423766.1| hypothetical protein HMPREF0670_02660 45 25 Op 2 . - CDS 56243 - 56455 56 ## gi|288929925|ref|ZP_06423767.1| hypothetical protein HMPREF0670_02661 - Prom 56585 - 56644 2.3 + Prom 57600 - 57659 3.3 46 26 Tu 1 . + CDS 57695 - 58027 388 ## Coch_1399 TM2 domain containing protein + Prom 58029 - 58088 3.0 47 27 Tu 1 . + CDS 58154 - 58432 137 ## gi|288929927|ref|ZP_06423769.1| hypothetical protein HMPREF0670_02663 + Term 58497 - 58544 4.4 + Prom 58538 - 58597 4.5 48 28 Op 1 . + CDS 58618 - 59799 1455 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) 49 28 Op 2 . + CDS 59819 - 60013 299 ## + Term 60122 - 60175 16.6 50 29 Op 1 . - CDS 60515 - 62557 2235 ## BF2269 putative lipoprotein 51 29 Op 2 . - CDS 62580 - 65807 3392 ## BF0971 hypothetical protein - Prom 65835 - 65894 4.5 52 30 Tu 1 . - CDS 66558 - 66977 465 ## gi|288929932|ref|ZP_06423774.1| hypothetical protein HMPREF0670_02668 - Prom 67126 - 67185 3.4 + Prom 66943 - 67002 6.3 53 31 Tu 1 . + CDS 67248 - 69092 1863 ## PRU_1912 hypothetical protein + Prom 69135 - 69194 7.5 54 32 Tu 1 . + CDS 69284 - 70492 1554 ## COG0108 3,4-dihydroxy-2-butanone 4-phosphate synthase + Term 70500 - 70527 -0.8 + Prom 73182 - 73241 9.6 55 33 Tu 1 . + CDS 73275 - 74468 1304 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase + Prom 74472 - 74531 4.5 56 34 Op 1 . + CDS 74722 - 75552 732 ## COG0811 Biopolymer transport proteins 57 34 Op 2 . 
+ CDS 75584 - 76195 794 ## BVU_1616 hypothetical protein 58 34 Op 3 . + CDS 76199 - 76858 629 ## PRU_1917 hypothetical protein 59 34 Op 4 . + CDS 76900 - 77733 1000 ## PRU_1918 TonB family protein 60 34 Op 5 . + CDS 77740 - 78702 712 ## COG0226 ABC-type phosphate transport system, periplasmic component + Term 78746 - 78792 9.7 + Prom 78743 - 78802 2.4 61 35 Tu 1 . + CDS 78962 - 80380 1876 ## BF3942 TPR repeat-containing protein + Term 80382 - 80422 -0.7 + Prom 80946 - 81005 9.8 62 36 Tu 1 . + CDS 81117 - 83195 1846 ## BVU_1315 hypothetical protein + Term 83240 - 83293 4.2 63 37 Op 1 . - CDS 83662 - 84162 359 ## gi|288929943|ref|ZP_06423785.1| hypothetical protein HMPREF0670_02679 64 37 Op 2 . - CDS 84167 - 85882 818 ## gi|288929944|ref|ZP_06423786.1| hypothetical protein HMPREF0670_02680 - Prom 86048 - 86107 6.7 - Term 86157 - 86206 4.3 65 38 Op 1 . - CDS 86280 - 87062 1033 ## COG1043 Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase - Prom 87082 - 87141 5.9 66 38 Op 2 9/0.000 - CDS 87150 - 88535 1750 ## COG1538 Outer membrane protein 67 38 Op 3 27/0.000 - CDS 88532 - 91801 3745 ## COG0841 Cation/multidrug efflux pump - Prom 92042 - 92101 2.2 68 38 Op 4 . - CDS 92144 - 93340 1425 ## COG0845 Membrane-fusion protein - Prom 93477 - 93536 4.0 69 39 Tu 1 . + CDS 94370 - 94570 77 ## + Prom 94685 - 94744 6.4 70 40 Op 1 . + CDS 94826 - 95041 129 ## gi|282859332|ref|ZP_06268444.1| conserved domain protein 71 40 Op 2 . + CDS 95179 - 95550 317 ## COG5346 Predicted membrane protein + Term 95587 - 95638 11.4 - TRNA 95801 - 95873 69.9 # Pro GGG 0 0 + Prom 95904 - 95963 4.7 72 41 Tu 1 . + CDS 96068 - 97177 946 ## BF0631 hypothetical protein + Term 97313 - 97370 12.8 + Prom 97677 - 97736 5.4 73 42 Op 1 . + CDS 97978 - 101373 3874 ## BF1894 outer membrane protein Omp121 74 42 Op 2 . + CDS 101387 - 103300 2243 ## BF1957 hypothetical protein + Term 103408 - 103455 12.7 - Term 103400 - 103439 2.9 75 43 Tu 1 . 
- CDS 103617 - 104495 549 ## COG0464 ATPases of the AAA+ class - Prom 104628 - 104687 3.6 76 44 Op 1 . - CDS 104714 - 105592 626 ## Swol_2068 hypothetical protein 77 44 Op 2 . - CDS 105597 - 107306 773 ## Swol_2069 hypothetical protein - Prom 107333 - 107392 1.6 78 45 Tu 1 . - CDS 107417 - 107662 84 ## - Prom 107849 - 107908 3.0 + Prom 109558 - 109617 3.3 79 46 Tu 1 . + CDS 109774 - 112470 2968 ## COG1640 4-alpha-glucanotransferase + Term 112561 - 112604 -0.5 + Prom 112574 - 112633 1.9 80 47 Tu 1 . + CDS 112720 - 114672 2060 ## COG1523 Type II secretory pathway, pullulanase PulA and related glycosidases Predicted protein(s) >gi|283510498|gb|ACQH01000121.1| GENE 1 140 - 340 117 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288802887|ref|ZP_06408324.1| ## NR: gi|288802887|ref|ZP_06408324.1| LOW QUALITY PROTEIN: transposase, IS4 family protein [Prevotella melaninogenica D18] # 1 66 199 264 264 127 98.0 3e-28 MRRDDELLDELLYKERYSIERTNAWMDSYRSVLNRFDTTQTSWEGWNDIAFILIFLKKIR KREKSR >gi|283510498|gb|ACQH01000121.1| GENE 2 542 - 739 278 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929881|ref|ZP_06423723.1| ## NR: gi|288929881|ref|ZP_06423723.1| hypothetical protein HMPREF0670_02617 [Prevotella sp. oral taxon 317 str. F0108] # 1 65 43 107 107 104 100.0 2e-21 MYNYYTKEEISWDTLFRKYTLPKEYRYYYDRYYQEFKNNGRKIDSIFVSRKRGVKLWLIA RESNP >gi|283510498|gb|ACQH01000121.1| GENE 3 769 - 978 188 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MILTICNMEASSVSSKYCPYTDIPYIHFLSLPIYNRISLSRHSFIAIIFSHEISIIRINS FTLEDTVSV >gi|283510498|gb|ACQH01000121.1| GENE 4 2514 - 2819 267 101 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929882|ref|ZP_06423724.1| ## NR: gi|288929882|ref|ZP_06423724.1| hypothetical protein HMPREF0670_02618 [Prevotella sp. oral taxon 317 str. 
F0108] # 24 101 1 78 78 99 100.0 5e-20 MPSRKKNLCELLDQGQALCKRYGMITAFYFSSIAISFSLGRYYEEVKKEKEFNKLTTEQS KELLNQKEKYMNQKEEYVNKYLDLREKYMSERNTLQNENKK >gi|283510498|gb|ACQH01000121.1| GENE 5 2803 - 3423 298 206 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929883|ref|ZP_06423725.1| ## NR: gi|288929883|ref|ZP_06423725.1| hypothetical protein HMPREF0670_02619 [Prevotella sp. oral taxon 317 str. F0108] # 1 206 1 206 206 359 100.0 5e-98 MKTRNRKEILDDIRKLSLDEGIDLDNYKLSISNYIEELEKIEDNERKENKYTLIILLFVS LFTGVFAYITFTSKEDLADNLKEKSNIIDRYEKILPRDTLNQTTTAVKGIVIYTDEKGRK LTVPDLIKENTHLMNELTSALGKLDYIKETYGVYVETDKNGDWHLKADRVDSALLLLPHY KDCINYNPQKKHWTIYISKIDTLKHK >gi|283510498|gb|ACQH01000121.1| GENE 6 4599 - 5813 1038 404 aa, chain + ## HITS:1 COG:BS_ybbC KEGG:ns NR:ns ## COG: BS_ybbC COG3876 # Protein_GI_number: 16077233 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 12 379 32 414 414 211 35.0 3e-54 MLFCLAAVVGKAQKTRVICGDERTDAYLPKLKDKRVALFANHTAVVKGKHILDLLIDNKV NVVGIFAPEHGFRGTADAGEHVKNSTDKKTGVRIFSLYNGKNGTPNIDVLKETDVLVVDI QDVGLRFYTYYISMLQLMNACAKTKTTMMILDRPNPNGCYVDGPVLDMKYKSGVGALPIP VVHGLTLAEMAGMINGEGWLEGGEPCQLDIIACRNYTHSTRYKLPIAPSPNLPNMQAIYL YPSICLFEGTDVSLGRGTGLPFQQYGHPQMTGYKYSFIPRSVPGAKNPPQINQLCFGVNL SHKPQEEIIKRGFDLTYVIDAYRNLNVGERFFTPFFTKLVGVDYVQKMIMEGRSNEEIRA VWQPELEKYKEMRRKYLIYKEEGVSSKGVSSKRSKGRRSKGVKR >gi|283510498|gb|ACQH01000121.1| GENE 7 5930 - 7381 1300 483 aa, chain + ## HITS:1 COG:sll1087 KEGG:ns NR:ns ## COG: sll1087 COG0591 # Protein_GI_number: 16330938 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Synechocystis # 8 422 7 423 512 119 29.0 1e-26 MTPLAILATVLAYFAVLFIVSGLARRNVDNAAFFTGSRQSKWYMVAFAMIGASISGVTYV SVPGMVAQSGFGYLQLVLGFIAGQLIVAFVLTPLFYRLQLTSVYEYLRNRFGQQAYHTGA WFFFISKMLGAAVRLFLVCFTLQLLAFGPLGIPFWVNVALSVGCVWFYTHKGGVKSVIWA DLIKTACLIVSVALCIWFIAGDLQLSFGGVVSHISQSDMSRMFFFDDVNNKQFFFKQFFA 
GIFTTIAMTGLDQDMMQRNLSCRNSRESQKNMVISILLQFIVVAAFLMLGVLLYEEAARH GLTATGDQLFPTIATGGLLPGVVGVLFIVGLSASAYNAAGSALTALTTSFTVDILGAGKR SEDEVTRMRKRVHVGMAVVMGLSIIVFNILNNTSVVDAVYTLASYTYGPILGLFAFGILI KRPVRDRWIPFVALASPLLCWVLDRHSEAWFNGYHFSYELLILNALFTFVGLCLLIKRGT CPQ >gi|283510498|gb|ACQH01000121.1| GENE 8 7394 - 10144 2665 916 aa, chain + ## HITS:1 COG:no KEGG:BF0436 NR:ns ## KEGG: BF0436 # Name: not_defined # Def: xanthan lyase # Organism: B.fragilis # Pathway: not_defined # 27 883 128 1021 1023 1080 57.0 0 MLRPVLTLAALCLTALATRAQQAAPTTFAPHCDQPLVTNMSAPTTPTQGLSGRHIAVWQS HGRYYERTLDRWEWQRARLLQTVEDLYTQSYVLPFLVPMLENAGANVLVPRERDWNTHEV VVDNDASTLSPHTIYKERNGQQAWQQGQGEGFAYRHKVYTDFQNPFRDGTFRTIQSIKKG QESLAEWVPNVPKTGEYAVYVSYKTVQGSTATAHYTVYHQGGESHFRVNQQMGGGTWIYL GKFVFDEKAGQQHRVCLSNNTGKVGEVVTADGVKFGGGMGNIGRPDVSGYPRFTEAARYW LQWAGVPDSVYSESHGENDYTDDYKSRGMWVNWLAGGSQSYPEGQGLNVPIDLSLAFHSD AGVTKDDRTIGTLGIYYTQSYDSVFANGASRYLCKDLTESVQNSILNDIRALYEPLWNSR GSRDASYFEARTPRVPAMLLELLSHQNFADMRYGLDPRFRFTVSRAIYKGMLRFICAQRG QTPIVAPLPVDHLSASLKGRDQVELTWNAVADTLEPSAMPDRYIVYTRLGNGAFDNGKVV KRNRYVARIPADVVCSFRVEAVNKGGKSFPSETMAVARCSASMGKKALVVNGFDRVCAPA DFVAPAPIDTLYAGFLDNIDHGVPYLQDISYTGSQKEFNRTLPWLDDDSGGYGDSYGNEE TKVIAGNTFDYPALHGEAILKVGLSFESCANECLDKAEDGYVFVDYILGKQCQTKMGRGN VRPLMFKTFDAKVQKALAEYAQRGVHLFVSGAYVGSDLWCNPLAKALESDQRFATEVLKY KWRNSRAALTGGVRMVRSPLQLGVGEMNYANTLNSEQYIVESPDAIEAADSTGYTVMRYP ENNLSAAVASQGSYKTFVMGFPFESITQAFCREKLMKTIVDFFMRQPHEDISHDPKGDGK KARHITSFTGEEKRER >gi|283510498|gb|ACQH01000121.1| GENE 9 10431 - 11843 1750 470 aa, chain + ## HITS:1 COG:PAB0772 KEGG:ns NR:ns ## COG: PAB0772 COG0463 # Protein_GI_number: 14521365 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Pyrococcus abyssi # 230 426 7 199 298 63 30.0 1e-09 MKKITCFVPYIDESQAGKTLSALRDSQLVDKVVCLDEPVFKSETIRRIAAESKADYALVY TKTTTLELGYMALERLLQIAQDTNAGLVYADHYQVKGGELVKAPVIDYQKGSLRDDFDFG SVLFFDAAALKESVQRMTESYQHAGLYDLRLKLSQRYALVHANEYLYSEVEEDNRNSGEK 
QFDYVDPRNRERQIEMEKACTRHLKEIGGYLEPHFEDIDFNQGEFEVEASVIIPVRNREA TIGAAIESVLKQQTKFKFNLIVIDNHSTDGTTEAIDAFKADGRVIHLVPERNDLGIGGCW NYGVNSKHCGKFAVQLDSDDLYKDEHTLQTIVDAFYEQKCAMVIGSYMMTDFDLNELPPG VIDHKEWTPDNGRNNALRINGLGAPRAFYTPVLRSIGLPNTSYGEDYAMGLNISRHYQIG RIYDVLYLCRRWGGNSDAALSIEKVNANNLYKDRIRTWELEARIALNKQR >gi|283510498|gb|ACQH01000121.1| GENE 10 12084 - 13448 1392 454 aa, chain + ## HITS:1 COG:NMB0360 KEGG:ns NR:ns ## COG: NMB0360 COG0477 # Protein_GI_number: 15676275 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Neisseria meningitidis MC58 # 19 426 25 402 427 98 24.0 3e-20 MKNVTKNINPWLWIPTLYFAEGIPYFIVNNISVILFTRMGVPNGEMALFTSLIYFPWVIK PLWSPFVDILRTKRWWIIMMQIVMSIAFILLTFSIPHPSPEVISATATPVSLFTFTLILF VITAFASATHDIAADGFYMLALNQQQQSFFVGIRSTFYRLSSIFGQGVLVYVAGVLETRT GDIPWAWTITMAITAVMFSCITIYHTFFVPRAKEDAEKDAEAPAGKETRKRSTTEEIFDE FARTFSLYFKKPGVLLAIVFMLLYRLPEAFLLKMVSPFLLDTRANGGLGLSTESVGFVYG TIGVIFLTVGGIIGGMAASRWGLKKSLWPMAACMTLPCFTFVYLSMAMPSSLTTISVCVA IEQFGYGFGFTAYMLYMMYFSEGEFKTAHYAICTAFMAASMMIPGMAAGYLQELLGYEHF FWMVIACCIATVGVTLTIKVNPEYGRKRITKDEK >gi|283510498|gb|ACQH01000121.1| GENE 11 13438 - 14688 1404 416 aa, chain + ## HITS:1 COG:BS_ybbC KEGG:ns NR:ns ## COG: BS_ybbC COG3876 # Protein_GI_number: 16077233 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 21 415 30 414 414 333 44.0 4e-91 MKNKTLFALVLALATVFFPLETWAKKTVLTGIDVLTQQHFKCLQGKRVGLITNPTGVNAN LVSTVDVLKAAPGVNLVALYGPEHGVRGDIHAGDKVETARDAKTGLPVFSLYGKTRKPTP EMLKDVDVLVYDIQDIGCRSFTFISTMGLAMEAAAENDKEFVVLDRPNPLGGLKIEGNIT DDDCMSFVSQFKIPYIYGLTCGELARMLNDEGMLKGGVRCKLTVVKMKNWKRSMGYADTG LQWIASSPHIPQAVTAYYYPTSGILGELGYVSIGVGYTIPFQMFAAPWIDAVQLADAMNA HQMPGVTFRPIFLKPFYSVGKGELLQGVQMHITNFQRVELTPMQFVFMEEVARLYPKHAV LANADAKRFDMFDKVSGSKQVRIKMAMHNRWPDLKPFWDKDVAQFRTLSKKYYLYK 
>gi|283510498|gb|ACQH01000121.1| GENE 12 14708 - 15907 1221 399 aa, chain + ## HITS:1 COG:all1887 KEGG:ns NR:ns ## COG: all1887 COG4299 # Protein_GI_number: 17229379 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. PCC 7120 # 9 399 2 375 375 184 31.0 2e-46 MEKNKTTSRILSIDILRGLTIAGMITVNNPGSWSYMYAPLEHAEWNGLTPTDLVFPFFMC VMGMCIYIAMRKFDFACNRATVYKIVKRMVLIYLVGLAIGWFAKFCYRWNSPQEGADFFS QLWYMVWSFDKIRLTGVLARLAICYGITALLAITVRHKHLPYIIVGLLLTYFVILMAGNG FAYDETNILSIADRAVLTDAHMYHDNGIDPEGLLSTLPSIAHTLLGFIIGSLLFRKADVG EQQLDARTNITLTKVVPLFVVGTSLLFAGYLLSYGCPINKKVWSPTFVLVTCGLASMLLA LLTWIIDVKGKKSWSKFFEVFGVNPLFLFVLSDFFAIVFGAFTFPVGDKQMNVVGFVYSQ LLSPVFGQYGGSLVYSLLFIALNWVIGYQLYKRKIYIKL >gi|283510498|gb|ACQH01000121.1| GENE 13 16148 - 16987 780 279 aa, chain + ## HITS:1 COG:no KEGG:BT_3618 NR:ns ## KEGG: BT_3618 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 279 2 277 283 355 59.0 1e-96 MQYILIADSGSTKTDWCVVQNGQQQARYETKGMNPFFQSQDQMEDEIRQSLLPQLPNEQP AAVYFYGAGCTPEKSPLVKESLERCFPTATTIEVNSDLLAAAHALCGRQPGIACILGTGS NSCFYDGKGISFNVPALGFILGDEGSGANLGKRLVGDILKNQFPGDLKAAFFEEYDTSMA DIIDHVYRQPFPSRFLAGFSPFLSRHREHPAVHQLLVNAFKSFIRRNVLQYDYNTHAMNC IGSIADVYRNEVAEAAQILGVTLGQIMKSPMEGLVEYHS >gi|283510498|gb|ACQH01000121.1| GENE 14 17212 - 18033 981 273 aa, chain + ## HITS:1 COG:STM2571 KEGG:ns NR:ns ## COG: STM2571 COG2103 # Protein_GI_number: 16765891 # Func_class: R General function prediction only # Function: Predicted sugar phosphate isomerase # Organism: Salmonella typhimurium LT2 # 15 255 17 257 297 253 56.0 2e-67 MSFTKITEQPSLYNNLETKTAEELLRDINAEDRKVAEAVEKTIPQVAKLVELIVPRMKRG GRIFYMGAGTSGRLGVLDASELPPTFGVPKTLVIGLIAGGDTALRNAVENAEDDEERGWD ELTEFNINEKDTVIGIAASGTTPYVVGALRSAREHGILTACITSNPDSPMAAESDVAIEM VVGPEYVTGSSRMKSGTGQKMILNMISTAVMIQLGRVKGNKMVNMQLSNKKLIDRGTRML VEMLGLGYGKAKSLLLLHGSVEKALEEYKRPHE >gi|283510498|gb|ACQH01000121.1| GENE 15 18055 - 19908 1868 617 aa, chain + ## HITS:1 COG:ECs3301 KEGG:ns NR:ns ## COG: 
ECs3301 COG1680 # Protein_GI_number: 15832555 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # Organism: Escherichia coli O157:H7 # 4 392 5 397 434 191 34.0 5e-48 MFRILLLALCAFTFNVWANNVPTASPEEVGMSLARLRAADEVILRAIRDKKTPGAVLAVV RHGKLAYLKAYGNRQTYPSTLPMTTQTVFDMASCTKPMATAISAMLLVERGLLRLSDPVS NYLPEFKNWHGEGKDSATIRVEDLFTHTSGLPAYAPVKTLAEADGKPNPAKLMAYIAGCK REFRPRSDMQYSCLNYITLQNIVERITGQSLRTFAANNIFIPLGMKHTDFLPCTSDKSGK LVNTSQPRWVANGEQAGLTPIAPTERQPNGVVKQGQVHDPLACTLNGGVSGNAGLFSTAE DVATLCAMLQNGGAWGGKRILSPLTVKAMRSVPQNFNSFGRSLGWDVSSAYASNQGDLLS SEAYGHTGYTGTSIVIDPVNDLSIILLCNSVHPVDASNVIRLRAQVANAVAASITNEPAA FPPYYYVRMRTFEDEPPIRSTDIVMLGNSLTEGGKDWAEKLGKPNVRNRGISGDVALGVD ARLYQITPHKPAKIFLLIGINDVSHDVTVDSLMTDIRTLVDHIRAQTPKTKLVLQSLLPI RESTGRWKRLQGKTDMIPQINARIEALAREKGLTFINLYPHFTEPGTNIMRAELTYDGLH LTKAGYDVWVKLLKPHL >gi|283510498|gb|ACQH01000121.1| GENE 16 20027 - 22246 2122 739 aa, chain + ## HITS:1 COG:L135972 KEGG:ns NR:ns ## COG: L135972 COG3537 # Protein_GI_number: 15673483 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Lactococcus lactis # 24 737 3 700 717 393 32.0 1e-109 MKKLSVLAFAFISASAGMAQRQPVDYVSTLVGTQSTFQLSTGNTYPATAMPWGMNFWTPQ TGKMGDGWTYGYQANKIRGLKQTHQPSPWINDYGQFAIMATTGKPIFDEEGRASWFSHKA ETAKPYHYKVYLADYDITAEMAPTERACLMRFTFPKTDSANVIVDAFDKGSFVRLLPDKR TIVGYTTKNSGGVPDNFKNYFVVVFDHDFAAARLADGKKLLAADKQELQANHAGAIVTFS CKRGEQVHARVASSFISEAQAMQNLNELGNNSFETLLQKGRQRWNDVLGRISVEDNNVDN LRTFYSCLYRSTLFPRKFYEISASGDTLHYSPYNGKVLPGTMFTDTGFWDTFRSLFPLVN LMYPSMAREMQEGLVNTYKESGFLPEWASPGHRNCMVGNNSASVVADAYIKGIRGYDVET LWQAVVHGTKAVHPQVPSTGRLGYEYYNKLGYVPCDVGINESAARTLEYAYDDWCIAQLG KALGKPKKEWAEFERRAENYRNLFSKEFNLMRGRKKNGEWQKPYNPLKWGDVFTEGNGWH YTWSVFHNPRGLISLMGGNKAFTAMLDSVFALPPLFDDSYYGQVIHEIREMQIMNMGNYA HGNQPIQHMTYLYDWSGAPWRTQYWVREVMDRLYSAHPDGYCGDEDNGQTSAWYVFSALG FYPVCPASNEYAIGTPLFAKVSLHLENGKTTVITADRPQWRYIKRLTVNGTPEKAHWLNH KTITEGAKIDFYMSETPAL >gi|283510498|gb|ACQH01000121.1| GENE 17 22274 - 23413 1013 379 
aa, chain - ## HITS:1 COG:no KEGG:BDI_0229 NR:ns ## KEGG: BDI_0229 # Name: not_defined # Def: thiol:disulfide interchange protein # Organism: P.distasonis # Pathway: not_defined # 20 346 19 324 327 70 23.0 9e-11 MERIIFTLLCTLCACTSFAQTKNSFVITSDIPDLPDGIEVELETAEGSNYDEVARAVVQD GKFVLTGRLTHPTLCTLSTNNYKLLYAMESKEEPTWTYTPIFVSNTTMTVKADKYKYLAS NLPISKHLCVTGGKAQADFNDFNQRKNDSLTWMKKNPQSPLAIKMATEMVLNNRTLTKRQ IEDITATIFACPEDYPRVNAYRRAVATAKAMAVGESLQDIPMFTQKNRTTSLLVAIPTGK VAFICFWTTWRKPNQKLIPELKNMVAKHPEVAFLCVADDKEDYLWQKFLERTQLSWPQYR LTKRGMKQFTDTFGPDATPLFVMLAPDGTIVRASSYLGDMQKLLDKTAMKLARAYEQQKK KAGEQSEGTRNTKQSKKKT >gi|283510498|gb|ACQH01000121.1| GENE 18 24773 - 25783 1170 336 aa, chain - ## HITS:1 COG:TP0662 KEGG:ns NR:ns ## COG: TP0662 COG0191 # Protein_GI_number: 15639649 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Treponema pallidum # 1 328 1 328 332 478 72.0 1e-135 MVDYKSLGLVNTREMFKRAINGGYAIPAFNFNNMEQLQAIIKAAADLKSPVILQVSKGAR NYANPTLLRYMSQGAVEYAKELGVNHPEIVVHLDHGDSFETCKSCIDMGFSSVMIDGSSL PYDENVALTKKVVDYAHQFDVTVEGELGVLAGVEDDVVAEKSHYTKPEEVIDFVTKTGVD SLAISIGTSHGAYKFTPEQCTRDPKTGRLVPPPLAFDVLDAIMKKLPGFPIVLHGSSSVP QEYVDMINKYGGKLPDAVGIPEEQLRKAAKSAVCKINIDSDSRLAFTAAVRKVFAEKPAE FDPRKYCGPARDLMTELYKHKIKEVLGSDNKLAQLD >gi|283510498|gb|ACQH01000121.1| GENE 19 25982 - 27817 1777 611 aa, chain - ## HITS:1 COG:YPO2561 KEGG:ns NR:ns ## COG: YPO2561 COG0471 # Protein_GI_number: 16122779 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Yersinia pestis # 10 610 14 610 610 390 39.0 1e-108 MITTLVILAVSVTMFIIGRLRSDVVAVCAMAALLIFGILTPTEALAGFSSNVVIMMVGLF VVGGAIFQTGLAKAISQRLMKLAGGNETFMFLLVMFTTAVIGGFVSNTGTIALMMPVVVS MATQKGTHPGRMLMPLAFASSMGGMLTLIGTPPNLVIQDVLTKAGQQPLTFFSFTPVGIV IVLLGVALMLPLSRMFLGKRKQKGRDAANVGKTLDQLVEEYNLQHQLRRYHITHESPIAG QTLAELDLRNRYGLSVMEIRRKAARSGRFIRNVKQSMPMPDSILQTEDVIYITGDNEHMD FFAAEMKLEQLDNTGIDFYDLGIAELVLMPTSQLNGTKLKNSGLREHFNINVLGIRRSHD 
YILNDLSEERLHAGDVLLVQGSWTNIGQLANENDDWVVLGQPKEQAQKVILTNKAPIAGA IMLLMVAMMMFDFIPIAPVTAVIIAGLLMVLTGCFRNVEAAYKTINWESVVLIAAMMPMS TALEKTGVSAEISHTLVNSLGSMSPIVLLAGIYLTTSILTMFISNTATAVLMAPIALSAA KEIGASPYAFLFAVTIAASMCFMSPFSTPPNALVMRAGQYTFMDYVKVGFPLQLIIGIVM IFLLPLFFPLG >gi|283510498|gb|ACQH01000121.1| GENE 20 28083 - 29285 832 400 aa, chain - ## HITS:1 COG:HI0610 KEGG:ns NR:ns ## COG: HI0610 COG0738 # Protein_GI_number: 16272552 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Haemophilus influenzae # 6 396 3 426 428 203 34.0 6e-52 MNRKQASLTERRFLLPFVLITTLFFLWGFARAILDVLNKHFQNELHISIAQSALIQVTTY LGYFIMAIPAGLFINRFGYRRGVVFGLSLFALGAFLFVPGAQIGTFEVFLAALFTIGCGL TFLETAANPYVTELGAPQTATSRLNLSQSFNGLGSIFATFSVGLFLFRSDSEGGNVAIPY VVLGVVVLAIALVFSRVQLPEIQATSDENTSGSGLKNLSELFRHPMFVLGLAALLAYEVA EISINSYFINFVTGQGWMSDKTASMVLTAALAFFMIGRFGGSWVMRRVRAQLVLLVCAIG CVCSMCLVLLNLGTLSLVGLLANYFFEAIMFPTIFSLALTGLGQLTKSASSILMMTPVGG CGFLLMGMIADGGNPVLPFVLPLAGFAVVLAYAWKMASKG >gi|283510498|gb|ACQH01000121.1| GENE 21 29911 - 31722 1482 603 aa, chain - ## HITS:1 COG:CAC3034 KEGG:ns NR:ns ## COG: CAC3034 COG0249 # Protein_GI_number: 15896285 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Clostridium acetobutylicum # 66 600 62 593 598 230 30.0 5e-60 MKTTSLKTAYLAESSLLAALLVRLKRRNLYFVAGEITSFVAMLGFVVLATTLHNATWTLW LAAVCLALYVVIRRFDVQNEARITAVDQLRQAYEDEARYLEGDFTPFDDGARHADTRHPF ALDLDIFGPSSLFHRLNRTVSTGGSRSLAEFLTRLPGHTPQAKAEVERKREAIDELAAKQ TWRMKFIAQGKGKAGGIDSTRLTQTMQAAAKVSIPRFAAARLPFAIACASIVGFFATLFL AVFTPLSANVPVWWGIINFFVVMLICTKPLRMVSKTANLLHAELTACVQLLSLITTEDMQ AALNKQLQTELEGALLAFKQLQQTVASLDRRSNVLGLMFANALFLSDFFLVRKFLKWQDS HLKHFEQWVDCLSKADALVSLATFRYNHPETVQPQIEDAPKIVYQATNLRHPFLGTKAVG NDLCIEQGQYHIVTGANMAGKSTFLRCVGVNYVLAMAGLPVFADEMRVSCFWLFSSMRTT DDLTHGISYFNAELLRLEQLIGYCHNHPNTFIILDEILKGTNSLDKLNGSRLFLQTMSQL PASGIVATHDLELSKLEDENPSRFANYCFEIELGNDVTYSYRITRGVAQNQNATFLLKKI LKA >gi|283510498|gb|ACQH01000121.1| GENE 22 31926 - 32480 755 184 aa, 
chain - ## HITS:1 COG:no KEGG:no NR:gi|288929900|ref|ZP_06423742.1| ## NR: gi|288929900|ref|ZP_06423742.1| hypothetical protein HMPREF0670_02636 [Prevotella sp. oral taxon 317 str. F0108] # 1 184 1 184 184 366 100.0 1e-100 MNIELLFDRYLQLSEVKRPGYIESLGLPEPDSTQQLELAAGTPLAPLLVYIYNKVSGTPR QCKREGLIDFIPGFRLPHLREWPAAYEAFKLVHGEQWLPLLFDGDKGYYAINRETDEVAI SFLDEFAFIDTVSLNALNFLQTLVANYERKVYSVDRYGLLDYDERREGEVARDINPDVDY WMEE >gi|283510498|gb|ACQH01000121.1| GENE 23 32486 - 33778 1270 430 aa, chain - ## HITS:1 COG:SMc00539 KEGG:ns NR:ns ## COG: SMc00539 COG0739 # Protein_GI_number: 15965497 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Sinorhizobium meliloti # 273 385 279 399 413 90 41.0 5e-18 MKQLIALLLVLFFSSGKPAEDHFTPAETQQISIATPGLFDKGNVFFIDLNTIAHKNYAFP LPVGKAKVAKNNALEITTKRGDVVKAMMPGRVRLARKNKQWGNVVVIRHDNGLETVYAHN AQNLVKPNETVEAGQSIAIVGKKDGQGRCLFLTMVNGGRINPQTIVDPKSHKLNKGVIRF EKNGQHVTVRQATEKDLEEKLLADKKGKAEVKKAAKEEEKAKREEAKRLKEEQKLLAEEK KAANKTLLKTQKDVRGSVNNKLFLSQFSPNEWAYPLPGSHVISPYGGKRRHSGVDIKTRP NDKILAAFSGIVVRSGPYFGYGNCIVIRHDNGLETLYSHQSRNFVKVGQAVKAGDVIGLT GRTGRATTEHLHFEVSFKGKRIDPALVFNHAAKTLHQHTLVHASGVVRIDKSARLPIAQA SGKQETTNVQ >gi|283510498|gb|ACQH01000121.1| GENE 24 33854 - 34630 680 258 aa, chain - ## HITS:1 COG:VC0710 KEGG:ns NR:ns ## COG: VC0710 COG1496 # Protein_GI_number: 15640729 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Vibrio cholerae # 13 250 11 231 240 125 34.0 9e-29 MSPKLLEYALGHRIHAFSTMRTDGASQGAYAQFNINPYCGDAPEHVAQNLHSLCRVLGIA PSNLVMPHQTHETTVRQVGSKLLSMPQNVRQMVLEGVDALITNVPGVCIGVSTADCIPIL LYDAEHRATAAIHAGWRGTVKRIAQNAIAQMRAAFDTNPRTLRAVIAPGISLQHFEVGDE VYEAFAKAAFPMNDIAQRHDKWHINLPLCNQLQLQELGVKAENIYATDICTYNRCNDFFS ARRLGINSGRIFTGVTIR >gi|283510498|gb|ACQH01000121.1| GENE 25 34620 - 35786 710 388 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915191|ref|ZP_01903719.1| 50S ribosomal protein L27 [Roseobacter sp. 
AzwK-3b] # 6 342 3 338 345 278 43 1e-73 MPESNFVDYVKIYCRSGKGGRGSMHLRHVKYNPNGGPDGGDGGRGGNIYLRGNHNYWTLL HLKFQRHVFAEHGGNGGRDKCHGTDGKDVYIDVPCGTVVYNAETGKYVCDVMHDGQVVML LKGGRGGLGNFQFRTATNQTPRYAQPGEPMEEMTIILELKLLADVGLVGFPNAGKSTLLS SLSSARPKIANYPFTTLEPSLGIVAYHDHKSFVMADIPGIIEGASEGKGLGLRFLRHIER NSLLLFMVPGDTDDIKKEYEVLLNELNNFNPELNDKHRVLAITKCDLLDDELIEMLRETL PTDLPVVFISAVTGQGLSELKDILWRELNSESNKLQSIMAEDTLVHRDKDIRIFAQELAD EGEDIDVEYIDADIEDLDDFEYEDEDEP >gi|283510498|gb|ACQH01000121.1| GENE 26 35794 - 36366 868 190 aa, chain - ## HITS:1 COG:aq_078 KEGG:ns NR:ns ## COG: aq_078 COG0563 # Protein_GI_number: 15605675 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate kinase and related kinases # Organism: Aquifex aeolicus # 4 187 3 202 206 158 43.0 5e-39 MKNIVIFGAPGSGKGTQSDKMIAKYGLEHISTGDVLRNEIKNGTPLGKTAKEYIDNGQLI PDELMIDILANVYDSFGKEHKGVIFDGFPRTIAQAEALKKMLDKRGHKVAAMIELDVPED ELMKRLVLRGQQSGRTDDNEETIKKRLGVYHNQTAPLIEWYKTENIHNHIDGLGELERIF ADVCAVIDNI >gi|283510498|gb|ACQH01000121.1| GENE 27 36436 - 36975 611 179 aa, chain - ## HITS:1 COG:STM0170 KEGG:ns NR:ns ## COG: STM0170 COG0634 # Protein_GI_number: 16763560 # Func_class: F Nucleotide transport and metabolism # Function: Hypoxanthine-guanine phosphoribosyltransferase # Organism: Salmonella typhimurium LT2 # 12 177 6 174 178 127 41.0 9e-30 MARIKLIDKVFETSITETQIQEHVKAVADRINKDMADKNPLLLAVLNGSFMFAADLMRML TIPCEISFVKLASYQGTTSTGKIKEVIGINEDLAGRTVIIVEDIVESGLTIKRMIESLGT RSPKSIHICTLLIKPDRLTVPLDVEYAAFEIPNDFIVGYGLDYNQQGRNLRDIYTVVEE >gi|283510498|gb|ACQH01000121.1| GENE 28 38479 - 38685 59 68 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHHKDALWQVRRQALPPLNQPSYNSPKPFSDERFRPNALLMFYKSTIIQHKAVLLIKMSI DREGLVAL >gi|283510498|gb|ACQH01000121.1| GENE 29 38779 - 40293 535 504 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225086616|ref|YP_002657886.1| ribosomal protein S15 [gamma proteobacterium NOR5-3] # 1 493 7 489 497 210 31 2e-53 MKIFTGTQIKELDKFTIENEPVASIDLMERAAKAIVRVLREEWDNRTPFVVFAGPGNNGG 
DALAVARLMAEAGYKVAVFLFNVNGKLSDDCATNKQRVMDCKRIKAFTEVVVDFDPPELT AETVVIDGLFGAGLNKTLTGGFASLVKYINQSPAKVVSIDLPSGLMTEDNTHNVKSHIVK ADLTLTLQQKKLAMLFEDNQQFVGRLRVLDIRLSPEYIARTDAKYKVLEEADVRSRMLKR GDFVNKGLMGHALVVAGSYGMAGAAVLAGRACMRSGVGKLTICTPRRNYDVMQISLPEAI LSAGKEDYFFTEPLDTEHYDAVGMGPGLGQHEDTAIALISQIRRTQCPMVIDADALNILA SHKAWMQQLPQNLILTPHVGEFDRLGNGGSEGDYDRLSKALDLAQHLQAYILLKGHYSAL CLPSGKVFFNPTGNAGMATAGSGDVLTGIITGLLARGYNREDACIVGMYLHGLAGDLAAK QLGKESLMAGDIVAYLPQAFEFLA >gi|283510498|gb|ACQH01000121.1| GENE 30 40692 - 41744 1341 350 aa, chain + ## HITS:1 COG:no KEGG:BVU_1679 NR:ns ## KEGG: BVU_1679 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 8 341 6 343 349 259 43.0 1e-67 MKQLFVTIAALAFAALPGFAQGNSQAEGIPYFLPKTGIRLAVMVERTTYTPGELAKYGEK YMKLFNTSMEKSTEYRVVGINITDFGTPDSTKHYVAAMDKKHSISDVKLADNGLLLAINT TPPEPKQPKNFVPARAKRQLNPKDFMNQEILSAGSSAKMAELIAKEIYDIRESRNQLSRG QADAMPKDGEQLRLMLNNLDLQERALLQVFAGTTVKDTTETIINFVPTKAVEREILFRFS RHYGMADKDDLGGKPYYISVEDLHSIPTMQASIDRGKVKDNAGVYVNLPGKVKVSVAQEN NMRAVIELYMAQFGKTEPLSGELFGKKQLTQIVFSPVTGAIESVKTESVK >gi|283510498|gb|ACQH01000121.1| GENE 31 41910 - 42497 616 195 aa, chain + ## HITS:1 COG:BS_resA KEGG:ns NR:ns ## COG: BS_resA COG0526 # Protein_GI_number: 16079372 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Bacillus subtilis # 63 177 46 155 181 75 33.0 4e-14 MKKTILSAWVLAIAFTFASCTCKKQDCNNQDSAAAQTTEQTADTASAGDGAQTAPAEGGR YADFTLPTLDGKQVKLSDVVAKNKITMVDFWASWCGPCRMEMPHVVQAYSKFRGKGLEIV GVSLDEKKEDWENAVKDMGLGWIQASDLKGWECAAARLYQVQGIPACVLINQKGEIVGRD LRGDELLGRLSELLK >gi|283510498|gb|ACQH01000121.1| GENE 32 42716 - 45226 2023 836 aa, chain - ## HITS:1 COG:alr2175 KEGG:ns NR:ns ## COG: alr2175 COG1629 # Protein_GI_number: 17229667 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Nostoc sp. 
PCC 7120 # 126 836 201 863 863 292 30.0 2e-78 MICISHCKHLLLASMIICAAPTAYAQTDANANDVETKHPMELSGHVTGKQGEPLIGVVVQ VKGTNTRTITDEQGAYTLSAPQGAHITLTFHYIGCKPAERQVDLSQNCKQDVQLNEATEM LQGVEVVGRNERSYKNTLSFVGTKTATPIKDVPQSIGYVTKELVLDQGATTVNDVVKNIS GVNQYSFYNDFSIRGFRTTGNRNSGNLINGMRAQTSLWKQQSLANIERVEVIKGPASALF GNASPGGVVNRVTKKPLSENRRQVSTTVGSYSTFNVYSDFTGPLDRKATLLYRLNLGYEN TDSYRDLQGGQNFIAAPSFSYVPNSRTQFNVDVFFQQYNGKIDRGQSIFGDGDLYSVPIS RSLSAANDYLKERLVNITLGLSHKLSNHVSFNSTYLNSSYDEDMQEHNQANAYVLNADST QNNSRVMMQAMVRKRHFRNNSFNNYLNIDFNTGRVRHTVLVGWDYFQTDLLAGTSSMTAK GYLLKNGKTTTRFDKRKINNYVLDADGNPKTNVPYFDLNSTTGNGMKDISKYVFTTDPIN PYRQYSHGIYVQEQLKYGRLQLLLGLRQEFFTDILNRRLANESRSTQHALIPRVGAVVAL NKAINVYGTWVKGFEPQAASIQSDPNTGGPFSPEQSQLLEMGIKSDWFDRRLSLTAALFH LAKNNTLYNAGDAGNPNLMMQVGEEVAKGVEVDVAGEILPNWSVVANYAYTDAKITKTAT DKERDFGTQRPNTPRHAANVWTKYILRHGALRNLGFGIGVNANSERYGQVGKRANTIVYP GYAILNAALYYRVRSMQLQLNADNLLNKTYWVGGYDKLRSFPGAPLFIKATATYRF >gi|283510498|gb|ACQH01000121.1| GENE 33 45759 - 45926 162 55 aa, chain - ## HITS:1 COG:AF0627 KEGG:ns NR:ns ## COG: AF0627 COG1145 # Protein_GI_number: 11498235 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Archaeoglobus fulgidus # 1 54 276 329 340 58 50.0 3e-09 MAYVISDDCIACGTCLPECPVEAISEGDIYKIDADACTECGTCASVCPSEAISLP >gi|283510498|gb|ACQH01000121.1| GENE 34 46280 - 47026 706 248 aa, chain - ## HITS:1 COG:no KEGG:PRU_1922 NR:ns ## KEGG: PRU_1922 # Name: not_defined # Def: PorT protein # Organism: P.ruminicola # Pathway: not_defined # 1 248 1 235 235 210 40.0 4e-53 MKQLIYLIIYMCLLPIASMAQQRRVENRPYTDLRPFHFGVVVGVHMQDVELFNAGPQTLT LDGGQVVETNIAAEQDRYDPGFVVGVTGELRLNRHFQLRAVPAMYFGNRHLVFKNQGQTD ADGQPIVHRQDMKTVYISTAFNLIFAAPRFNNHRPYLVAGINPMLNLSGSDTDILRFKRH DVFAEVGIGCDFYLPYFKLRPELKFMFSLIDCLDAKHVDRLRDKNLVSYARSTSRATSKM IALSFYFE >gi|283510498|gb|ACQH01000121.1| GENE 35 47095 - 47931 235 278 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145635642|ref|ZP_01791339.1| 30S ribosomal protein S16 [Haemophilus influenzae PittAA] # 5 242 17 255 603 95 29 1e-18 
MKKYLLMAVACMALLTAQAQKQLVILHTNDTHSCIMPLSEHLADTAIAGRGGFLRRVEMI KQERQKHPDLLYFDSGDFSQGSPYYSMFKGDVEIELMNRMGCDASTVGNHEFDFGMENMA RIFKKAKFPILCANYDFTGTPLEGVVKPYTIIHRNGLKIGVFGIDPKLEGLVLTKMYKTV KYLDPAATALKVATMLKKKMKCDVVICISHLGWGIGGDDDQKMIAGSRHIDLVLGGHSHT FFEKLRYANDLDGKPVPDDQNGKNAVYVGKIVLTVGKK >gi|283510498|gb|ACQH01000121.1| GENE 36 48072 - 48836 928 254 aa, chain - ## HITS:1 COG:STM4104 KEGG:ns NR:ns ## COG: STM4104 COG0737 # Protein_GI_number: 16767370 # Func_class: F Nucleotide transport and metabolism # Function: 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases # Organism: Salmonella typhimurium LT2 # 23 243 288 503 518 91 28.0 2e-18 MNRRKVFLSLLITAATLCGCGAHYAVTGIERSRVLIDKRYDAANDAELQAFIAPYKQQVD SLVAPIVGRAGCNMSAGRPESTLSNLMTDILVWAGQRYGEQPQFGVYNIGGLRAAIVEGD VTRGDIINVAPFENKVCFITLTGEQVQRLFEQIAKRGGEGVSHSVRLVIDKNRQLKRATI DGKPIDPKAQYRVVTNDYVVQGNDGMPAFKEGTQLVAPQSEENNVRYVMMDYFREQMKAG KTVCGKREGRIIIE >gi|283510498|gb|ACQH01000121.1| GENE 37 49399 - 49797 472 132 aa, chain - ## HITS:1 COG:no KEGG:BF0592 NR:ns ## KEGG: BF0592 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 131 1 127 128 92 38.0 5e-18 MDKQRLCIEHELTSNSTAIIWDLISTDSGLARWMADSVTQEGERLTFVWGELWSHHEVRT ATIVEKVKNQYIKVSWDDEEGPDNFFELRMDKSHITNDYMLTITDFAWDDEMDSLHSIWT DNLARLRNTCGI >gi|283510498|gb|ACQH01000121.1| GENE 38 50311 - 50679 518 122 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237713894|ref|ZP_04544375.1| 50S ribosomal protein L19 [Bacteroides sp. D1] # 1 115 1 115 117 204 86 2e-51 MDLIKVAEEAFATGKKFPEFKSGDTITVAYKIIEGTKERIQLYRGVVIRISGHGDKKRFT VRKMSGTVGVERIFPIESPNIDSIEVNKRGKVRRAKLYYLRALTGKKARIKEKRSGAVAP QQ >gi|283510498|gb|ACQH01000121.1| GENE 39 50985 - 51317 288 110 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929919|ref|ZP_06423761.1| ## NR: gi|288929919|ref|ZP_06423761.1| hypothetical protein HMPREF0670_02655 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 110 1 110 110 206 100.0 3e-52 MQKAEIKRQLLTKLKQEHCFWSYDSSSINNITDEFLIELVMLHLDLKDINKLFLIYPYKQ IKACWIRNLIPQGSYLYTLNKFLAFYYFNAKRPGAYVKAMATRQLNKIST >gi|283510498|gb|ACQH01000121.1| GENE 40 51314 - 52015 499 233 aa, chain + ## HITS:1 COG:no KEGG:Csac_0182 NR:ns ## KEGG: Csac_0182 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticus # Pathway: not_defined # 10 160 18 160 219 73 32.0 4e-12 MKGLAPNVSKIFSSVSKLECIKPFVLVGGTALSLQLGTRLSEDLDFMRWKADKNDRLEID WPGIKRGLETIGKVEHVDILGFDHALFTVDGVKLSFYAAPRKKLSAMREITIQNNLKVAD VVSIGAMKMETMSRRSRFRDYYDLYSILKAGADLREMVAMAISHSGYMLKEKNLLAILSN GERFRKEGKFKQLAPIYDVTAADIQACITEEIKKMRDKGASTAAVGELDSQTS >gi|283510498|gb|ACQH01000121.1| GENE 41 52127 - 52396 376 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929921|ref|ZP_06423763.1| ## NR: gi|288929921|ref|ZP_06423763.1| hypothetical protein HMPREF0670_02657 [Prevotella sp. oral taxon 317 str. F0108] # 1 89 19 107 107 168 100.0 8e-41 MALVKAVEKKLGQSITVATDFEKLALAFARQHVMLQPTSLKRLWAHLSGAEKPSAEVLDK VALFVGFQSWKDFKDALHGDDDGQTNYEV >gi|283510498|gb|ACQH01000121.1| GENE 42 52860 - 54296 1622 478 aa, chain + ## HITS:1 COG:MA4292 KEGG:ns NR:ns ## COG: MA4292 COG3291 # Protein_GI_number: 20093081 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 80 423 760 1118 1995 131 27.0 3e-30 MTQNISQTPNTNDEQTFGYRQDNDTFFNALTINIEEPGTLEDLFHALNPKDMTGIRVVGP INATDIAFLAKLSAGNELDSLHSINLHDAFIERLPDHAFEGLVFLTHFYFPTQLKAVGDF AFANCNALLSIELPQSLESIGEQAFANLHLRTLSLPAGIKHIGEGALAGMKELTQLHIAE GNARYDVRQGLLFDKDTNTLLQCFNYRKGPVDVPQGTQAIGALAFSKAQEVTQVNIPASV TRIGHDAFASTYSLARIEVAADNPRYSSSAEGVLFNKDRTKLIAYPASRNGNKYEVPATV KKLAAGAFQEAGGQNTHAAAKDKAELRLKTVALPEGLEIIGHEAFLFAGVQHVNIPSTVR AIGYNCFYYTDIEEAVLPEGISRLEDCTFYACYSLRKVVLPASLEYVGRGVFDLSDGLKT IEIHATTPPRCHAEAFANIGTNPKLEVPNGDKKPYHDNETWASLTDHKAKSQRKAFVK >gi|283510498|gb|ACQH01000121.1| GENE 43 54404 - 55999 1550 531 aa, chain + ## HITS:1 COG:MA4289 KEGG:ns NR:ns ## COG: MA4289 COG3291 # 
Protein_GI_number: 20093078 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 83 414 607 980 1734 124 26.0 7e-28 MRRIILQLFCIACTLNVMAQKGKYSDKDPYFNQTTLQVTKAGTLEQAFAEANPDKHQGLR VIGPLNDADMRFIAKLAQPSLLDDLHSINLQEAQLERIPAHWLQGFTYVTHVYFPTTLKD VGAYAFAYTNSLCKADLPEGLKSIGQCAFVGTNIRRVNLPASVEKIGEGAFAHLKSLTEV SVPAANNNFDLVDSLLIRNADNTLLQCFKKGTGEVQVPEVVQRVGSLAFGGAKYISAITL PKAVTFVGEDAFASTYALESINVAEGNAQFVSTNGVLFDKGATLLICYPTSKRGNKYTVP ATVKELATGAFQECGGGNAYKLINDEAEKKRARLNEVVLPEGLTKIGKWAFTFTGCQVNI PSTVREIGDSCFYFCEIEEATIPEGVQRIGDGMFAACYRLATITLPSTTTYIGAKFVAYN SLETTLNVYALNPPACHQDAFADFSSSINLHVVKGQKKAYEKSADWTSFIFNGVEDDLKA VVTGINAPAVPAADAVEAARYNLQGVRIYAPQRGVNIIRMSDGRTRKVLVK >gi|283510498|gb|ACQH01000121.1| GENE 44 56083 - 56355 126 90 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929924|ref|ZP_06423766.1| ## NR: gi|288929924|ref|ZP_06423766.1| hypothetical protein HMPREF0670_02660 [Prevotella sp. oral taxon 317 str. F0108] # 26 90 1 65 65 116 100.0 4e-25 MVQQLANSDWSSKAIFFQQQMRNVAMCLNPTNTQTAYDGLLTGASNAAKIETITSCLPNQ VGIFYGCQGMRSLNQAHPAKGRQAQSHTSA >gi|283510498|gb|ACQH01000121.1| GENE 45 56243 - 56455 56 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929925|ref|ZP_06423767.1| ## NR: gi|288929925|ref|ZP_06423767.1| hypothetical protein HMPREF0670_02661 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 70 1 70 70 113 100.0 3e-24 MQPSTAGFQLVENTRNDFNQSPLLMHNHILPHTNGAAISQQRLEFESHFLSATNAQRCHV LKPNKHTNSL >gi|283510498|gb|ACQH01000121.1| GENE 46 57695 - 58027 388 110 aa, chain + ## HITS:1 COG:no KEGG:Coch_1399 NR:ns ## KEGG: Coch_1399 # Name: not_defined # Def: TM2 domain containing protein # Organism: C.ochracea # Pathway: not_defined # 3 110 4 111 113 118 50.0 5e-26 MERDVNIMLATYGKYFPETSLPQVREIFEHMDDSQAATLACIQFKDPITSLIISILAGTL GVDRFYLEQIGIGIAKLLTCGGLGVWALIDLFLIMDATRQQNLQKLMSII >gi|283510498|gb|ACQH01000121.1| GENE 47 58154 - 58432 137 92 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929927|ref|ZP_06423769.1| ## NR: gi|288929927|ref|ZP_06423769.1| hypothetical protein HMPREF0670_02663 [Prevotella sp. oral taxon 317 str. F0108] # 1 92 1 92 92 167 100.0 2e-40 MPCPACGTTRGLVHLLHGEPWQAVVSNPNVLLVAPAALVFTLSLVVGWLCRKPFAQQIYA QVQQVLSRKRVFAAFVAWELCVWAFLLFRHFN >gi|283510498|gb|ACQH01000121.1| GENE 48 58618 - 59799 1455 393 aa, chain + ## HITS:1 COG:CAC2445 KEGG:ns NR:ns ## COG: CAC2445 COG0138 # Protein_GI_number: 15895710 # Func_class: F Nucleotide transport and metabolism # Function: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Clostridium acetobutylicum # 3 393 5 391 391 512 61.0 1e-145 MKELALKYGCNPNQKPSRIFMADGNLPIKVLSGRPGYINFLDALNSWQLVKELKAAIGLP AAASFKHVSPAGAAVGLPLTDTLKRIYFVNDAQMPLSPLACAYARARGADRMSSFGDFIA LSDTCDESTALLIKPEVSDGVIAPDYTPEALAILKEKRKGTYNVLQIDPAYTPEPLERKQ VFGITFEQRRNDIKLDDPKLFENIPTRNKHFTPEALRDLIIALITLKYTQSNSVCYVKDG QAIGIGAGQQSRIHCTRLAGSKADAWWLRQHPKVMNLPFVDDIRRADRDNTIDLYIGDES EDVLADGTWQQFFKTKPEPLTLQEKRQWIAQNHGVCLGSDAFFPFGDNIERAHKSGVDFI AQAGGSVRDDHVVATCDKYNIAMAFTGVRLFHH >gi|283510498|gb|ACQH01000121.1| GENE 49 59819 - 60013 299 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDIDEIISGGGMTFRKGSSGGGKKEDKVKTKAKKKKYITGAHGSGSAKQKAKYREQRANR HKKR >gi|283510498|gb|ACQH01000121.1| GENE 50 60515 - 62557 2235 680 aa, chain - ## HITS:1 COG:no KEGG:BF2269 NR:ns ## KEGG: BF2269 # Name: not_defined # Def: putative 
lipoprotein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 679 1 694 694 650 49.0 0 MKINIKSIGACAALALACFGTTSCNDALDLKPVNQITPDDFYKSADQLAAYLNNYYAGYL SNPYTDMFHVEGRYNDGMAHSDANTDIFVQGNGNTRLYSDKHWEVPSGKVLQGYYGGVRI YNFLLDKINKSLAAKVLAKDAAVKNYLGEAYFFRALSYFNLLANYGDVPVVKEVLPDENS AIVQNSVRTPRNEVARFILTDLDSAINNLYNRSVFKGQRVNKEAALLLKSRIALFEATFE KYHKGSGRVPGDANWPGARMTYNQGKTFNIDGEISFFLQEAMNAAKQVADKVELTANNHQ MNPNDGQTANWNNYFEMFSQPSLAEVPEVLLWRQYSSALSITHDVPMRTKVGCNDGFTRA FTQSFLMANGKPIYAAGSGYKGDNTLDAVKSNRDERLQLFVWGESNKVDTDPAAGKLRKL YAAPGDTLSYITTSVVETRCITGYQPRKYYTYDYAQSMNDELRGTNACPIFRTAEAMLNY MEACYELTGALDATAQGYWKALRKRAGVDTNYQTTIDATDLSQEGDWSVYSGTTPVSTTL YNIRRERMNELFSEGLRYQDLIRWRSFDKMLTTKWIPEGANFWDEMYKSYLDKKGNAPKA DGTADAVVSSKDLSKYLRPYSRSMQASNELKDGYKWHEAYYLAPIGVGDLQTASPDRSVA NSNMYQNANWPTTAGSHAEK >gi|283510498|gb|ACQH01000121.1| GENE 51 62580 - 65807 3392 1075 aa, chain - ## HITS:1 COG:no KEGG:BF0971 NR:ns ## KEGG: BF0971 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 30 1075 56 1089 1089 1154 55.0 0 MNKSSKHFSPRLTITAMLMFFCFILHAQHITVTGTVTDTSGEPAVGATVQVQGTKVGSVT NLDGQYSIQCSPNATLEFKYIGYETQTVAVHGRTSIDVALKSLSTDLNEVVVVGYGTQKK VNVTGAVSMVGSEVIENRPVANATQALQGAVPGLNFSPNNAGGELNNSMGIRIRGTGSIG NGSSDSPLVLIDGIEGDINALNPNDIESVSVLKDAASASIYGTRAAFGVILITTKSGKSG KVRVNYSGDLRFSTATQLPRMANSLEWANYMNVGSENAGAPHQFSDETVERIKKYMNGEY TDPKKPEYYGTIRSTSDSRWANYMGAFANTDWFKEHYKSNVPSTQHNLSMSGGNDKFNWL VSTGYLLQNGLIRHGHDELNRYTINGKMNAQLAEWARMEYSTKWTRKDYERPQYMSELFY HNIARRWPTSPVVDPNGHWMAEMEIEELENGGIHRTNDDQFTQQMKFIFTPLAGWNIYAE GALRVDNYKSTTSRIPIYNYDADNKPMLRDSGYGTVSNYTDYRWRENYYAVNVYTDYARS FGPHNLKGLLGLNFERYNQDHVNGYGENLTMNERAFLSQTQKNFKASDGYWHRATAGYFG RLNYNYNDIYMLEFNMRYDGSSRFTSDKRWAWFPSVSLGWNVAREEFFKKLTDKLSTLKL RASWGQLGNTSSKYDSFWAWYPFYQQQSLASQNSSWLINGQKQNTASLPSIVNRSLTWET IETWDVGLDWAAFNNRLTGSFDWYVRTTKDMIGPAPILGSVLGTDAPKTNNCDMRSTGWE LELGWRDQIAQVKYGVKLNLSDATSKILRYPYEGKFENQNINGYYNGKQLNEIWGYESVG LAQSNEEMAEWLKGNKPNWGSNWGEGDVMYKNLVDRVDENGNKIDEGVVNAGSNTLGDHG 
DKKVIGNSTPRYNFGITLNAEWKGFDFSIFFQGVMKRDWMFGAGQPYFWGATGNVWQSTV FKEHLDYWSKDNKGAYYPKPYVEGGITKNQEPQTRYLQNAAYMRCKNIQLGYSLPTSLIS RAGLSNCRIYFSCDNLFTITKLSSVFDPEGLGGGWGSGKLYPLQRTFAFGVNVSF >gi|283510498|gb|ACQH01000121.1| GENE 52 66558 - 66977 465 139 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929932|ref|ZP_06423774.1| ## NR: gi|288929932|ref|ZP_06423774.1| hypothetical protein HMPREF0670_02668 [Prevotella sp. oral taxon 317 str. F0108] # 1 139 1 139 139 252 100.0 6e-66 MRKLLLTISLFVTALLPITALCQTAKPDSMMPKDTIQTAQTTPKDSVPLLLTNRMLKEVV VFGYRPPIKFKLDFSSVEYQLMSIRPSDLNFNILGFLGYLFKWIGKGQHRETKAERTQRL LREYDLIPPQLPTTKKAAK >gi|283510498|gb|ACQH01000121.1| GENE 53 67248 - 69092 1863 614 aa, chain + ## HITS:1 COG:no KEGG:PRU_1912 NR:ns ## KEGG: PRU_1912 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 612 17 629 662 696 54.0 0 MLFLGTFFICLFVLMMQFLWRYVDELVGKGLTIEVLAQFFWWMGLMMVPQALPLAILLSS LITFGNLGESSELTAIKSAGISLTRTFSSLVVVSCFISATSFVFQNNIGPYSTIKLSQLL VSMKQKNPELEIPEGVFYDGIPNSNIYVQKKDVKTGKFYGIMIYRMSNSYEDSEIILADS GMLQTTAEKQHLLLTLWNGEWFSNQAQEVGRDAAAPFRRETFLEKKTLIDFNGDFDMTDA ALFSGDARGKGLAKLYRDLDSLQHNNDSIGRVFYNEVQMSYYNTSGLSRTDTLAAIKEAG KKTFNVDSAFARLNNDGKRSVLGIARSQVQAVDADLEFKAMVTEDANRMIRQHKIEMYKK FVLSLSCLIFFFIGAPLGTIIRKGGLGIPVIVSVLVFIVYYILDNSGYQMARRGIWAIWF GELLATMVLVPLAVFVTYKANKDSAVFNFDAYRNLLMNLLGMRQKRNIMAKEVIINEPDY GRDAELLSEVTERAQAYAEAHRLKTAPNVKRVFFEYQADNEMEEINRLLETAIDDLGNTR DKTILNLLNEYPIMSVKAHTRPFNRQWLNTAAAILVPVGLVLYIRMWRFRLRLLKDLRVV VHNNNQIIAQIGKL >gi|283510498|gb|ACQH01000121.1| GENE 54 69284 - 70492 1554 402 aa, chain + ## HITS:1 COG:aq_350_1 KEGG:ns NR:ns ## COG: aq_350_1 COG0108 # Protein_GI_number: 15605862 # Func_class: H Coenzyme transport and metabolism # Function: 3,4-dihydroxy-2-butanone 4-phosphate synthase # Organism: Aquifex aeolicus # 3 206 6 211 211 232 54.0 1e-60 MSEIKLNSVEEAIADFKEGKFVIVVDDEDRENEGDLIVAAEKITPEMVNFMLKHARGVLC APITLSRCKELNLPHQVNDNTSVLGTPFTVTVDKLEGCSTGVSAHDRAETIRALADPKST 
PETFGRPGHINPLYAQDNGVLRRSGHTEAAIDLCRMAGLYPAGALMEIMNEDGTMARLPE LKVMADKYGMKLISIKDMIAYRLRTESLIEVGLTVDMPTKYGHFQLTPFRQISNGLEHFA LTKGEWKDDDEVLVRVHSSCATGDILGSLRCDCGDQLHHSLEMIEKEGRGVLIYMQQEGR GIGLMNKIAAYKLQEQGYDTVDANVHLGFKPDERDYGCGAQMLRHLRVHKMRLITNNPVK RVGLEAYGLEITENVPVEVHPNKYDYKYLKTKQERMGHNLHL >gi|283510498|gb|ACQH01000121.1| GENE 55 73275 - 74468 1304 397 aa, chain + ## HITS:1 COG:AGc3991 KEGG:ns NR:ns ## COG: AGc3991 COG0436 # Protein_GI_number: 15889474 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 394 1 397 400 389 49.0 1e-108 MAQLSDRLNRLSPSATLAMSQKSSEMKAQGIDVINLSVGEPDFNTPDHIKDAAKKAVDEN YSRYSPVPGYPELRKAIVTKLQKENNLEYGLNEVMVSNGAKQCVCNAVMALVNNGDEVIV PAPYWVSYPQMVKLAGGTPVYVNAGFEQNFKITPQQLEAAITPKTKMLILCSPSNPTGSV YSKEELKALAEVIRRHDDLYVLADEIYEHINYVGRHESIAQFDGMKERCIIVNGVSKAYA MTGWRIGYMAAPEWIIKGCNKLQGQYTSGPCSVSQKAAEAAYTLDQGCVEDMRLAFERRR NLVVKLAKEIEGLEVNVPEGAFYLFPKCSSLFGKHTDGYVINNATDLAMYLLEVGHVATV SGDAFGDPECIRLSYATSDDNLREAMRRIKETLARLV >gi|283510498|gb|ACQH01000121.1| GENE 56 74722 - 75552 732 276 aa, chain + ## HITS:1 COG:FN1312 KEGG:ns NR:ns ## COG: FN1312 COG0811 # Protein_GI_number: 19704647 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Fusobacterium nucleatum # 66 268 8 200 202 81 27.0 2e-15 MATQKTASPAKKSQGFQGIRAAFWVIVCCFIIAACFFKFYLGSPAHFQGGDNEGHPLDIW GTIYKGGLVVPVIQTLLLTVLALSVERWLALRTAFGTGSLTKFVANIKAALNAKDFNRAQ QLCDKQKGSVANVVSASLQAYKAMETVHGIKLSQKIAKIQQAHEEATQLEMPTLQMNLPI LATLVTLGTLTGLLGTVTGMIKSFQALAAGGGGDSLALSAGISEALINTAFGITTSWFAV ISYGFFTNKIDKLTYALDEVGYSIAQTYEANHADEA >gi|283510498|gb|ACQH01000121.1| GENE 57 75584 - 76195 794 203 aa, chain + ## HITS:1 COG:no KEGG:BVU_1616 NR:ns ## KEGG: BVU_1616 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 203 1 200 201 225 62.0 7e-58 
MGKVKIKKKSTWIDMTPMSDVMVLLLTFFMLTSTFVKNEPVKVVTPGSVSEIKVPEKSVL NILVDKTGKIFMSMDNQNDTQAVLEGMSELYGMKFTPQQVKKFKKDAMWGVPMNSIQQYL SQSETEMAQTLKNYGIPTDSIDGKESEFQLWVKEARKVNPDVKLAIKADEKTPYALVKKV MSELQDLSENRYYLITSYKKVED >gi|283510498|gb|ACQH01000121.1| GENE 58 76199 - 76858 629 219 aa, chain + ## HITS:1 COG:no KEGG:PRU_1917 NR:ns ## KEGG: PRU_1917 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 9 219 4 221 221 202 55.0 6e-51 MAELVEQGKGGGKQKKKDVRVDFTPMVDMMMLLITFFMLCTSLAKPQTMELSMPSNDKNI QDDDKTVTKASQTITIYVAGNDKIYHVDGLPNYNDPSTMKETTWGAKGIRDILINHQTED GTIPVRQIMAAKQKLDKKKLENPKFTQAEYDKELSKLKAGEVDGQKIPTLTIIIKATDKA SYKNLVDVLDEMQICSIGKYVIDKINPQDSELLKKKGIE >gi|283510498|gb|ACQH01000121.1| GENE 59 76900 - 77733 1000 277 aa, chain + ## HITS:1 COG:no KEGG:PRU_1918 NR:ns ## KEGG: PRU_1918 # Name: not_defined # Def: TonB family protein # Organism: P.ruminicola # Pathway: not_defined # 1 277 1 277 277 296 64.0 7e-79 MSKIDLVDGDWVDLVFEGKNQAYGAYKLRKGTTKRNILAMISILAAAVLVFSVVAIKSVI EASRGKVDATQVTELSALNQPKKQAEVKQKPKVNTEPEKVVERVKSSVKFTAPVIKRDDQ VKPEDEIKTQEEIMSTKTAIGALDVKGNDDAAGEVLKIKEAVAQPEPKPEVETKIFEVVE QMPQFPGGDAALMKFLSDNVKYPVVAQENGVQGRVVISFVVERDGSITDVKVARSVDPSL DREAARVVKSMPNWIPGKQNGSAVRVKYNVPVSFRLQ >gi|283510498|gb|ACQH01000121.1| GENE 60 77740 - 78702 712 320 aa, chain + ## HITS:1 COG:MA0887 KEGG:ns NR:ns ## COG: MA0887 COG0226 # Protein_GI_number: 20089771 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Methanosarcina acetivorans str.C2A # 1 307 11 315 317 82 23.0 2e-15 MRKSITSYTIVGLALCSFVLAACGESKRKNGRTDTYSSGVISFASDESFSPIIDEEREVF HAVYPQATVNPIYTTEGEGITLLLKDSVSLVITSRDFKKKEYQSLYDKQFRPQTIKLAYD GLALIVNKNNVDTCISVKDIARVLRGEAKNWSDIFPGSKRGTITVVFDNKRSSTVHFAED SILGGKPITNPNVVATNKTADVVDYVEKTPNAIGIIGSNWLNDKRDSTNLTFNKNIRVMS VSRKDKGTPANSWKPYQYYIYNGNYPLIRTIYALLNDPYHGLPWGFAQFMASPKGQLIIL KSGLLPVQGNITIRDVNVNQ >gi|283510498|gb|ACQH01000121.1| GENE 61 78962 - 80380 1876 472 aa, chain 
+ ## HITS:1 COG:no KEGG:BF3942 NR:ns ## KEGG: BF3942 # Name: not_defined # Def: TPR repeat-containing protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 469 1 473 477 268 38.0 4e-70 MKTVKYLLLGALVVGISAPLMAQDNKSTIDAISKVILANPAAAKDQVKEVFKKNKKNAEV LAGIGRAYFEAKDTANAKQYANLAIKANKNYGQGYILLGDIEVLKDDGGAAAAWYQQAIY FDPKNPEGYFKYANIYRGRSPEEAVAKLNDLRAQRPDVAVDALAGRILYASNRMEQSLQY YDKVTDKSKLEDVDITNYATEAWMLQKRDKSLEMALYGLSRNPRKAAWNRLAFYNLTDME RTAEALKYADALFNASDSSKFTGFDYTYYGTALKNDKQYDKAIEMFNKAMEENKGNEELI NSNRKALSDLYLSKEDFDKATAYYEEYLKHAAKTTASDMAGLATIYMNLAAKQTGDEQKA SFKKADDVYARLGEKFPANIDFANFMRARINSNLDPETKEGLAKPFYEAIVNSLASKTDR DKADNARLTEAYRYLGYYYLVKNDKATADGYWKKVLEIDPENETAKQALGVK >gi|283510498|gb|ACQH01000121.1| GENE 62 81117 - 83195 1846 692 aa, chain + ## HITS:1 COG:no KEGG:BVU_1315 NR:ns ## KEGG: BVU_1315 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 8 690 9 688 690 446 36.0 1e-123 MKSRCSILVLSLFACALSYAQGKVNTSDSLSIDSLYRDLPEVIVKGERPTVKLERGKLVY NMQLLLEKLPADNAYDAITNIPGISAADGSLSLVGANVTLIIDGKASTLSQAQAIAKLKN MPASRLSKAEVMLSAPAQYHVRGAAINIITNDYGGQHHTSGQLQATLNKSKYARGYGQGN LLYANGRFTLDLNYTFTGGKGYAEAEHYAQHPLNGTRVAYNDKTSNISSGTSHDIGIELG YRFAERHKVEVAYTADANFIDSNNTTTGSSVSAQSSEGHNLLHNIGLYYTLPFGLRLNGT YTYYSSPRNQSLNGQLGADRRVVAAKSDQTIRKWLFTADQVHTLGHGWQLSYGLEWQTTN NKSYQTTLNDAGIVIPEATSEVNIDEQRFNGYIGFSKQFGKGVSLETTLAIENYHTPQWN DWRVFPTLNAAWQVNSYHTLNLSFNSNATYPSYWSTMSQIYYSSAYSEIWGNPLLRPSKS YDTSLSWLIKGRYTLVAFAHICNDYAIQLPYQLSNRMAVVMQDVNFNHRNIFGLQAMARF SAGSWLNGNVYVTGMFISDKNNHFFDLPFNRQRFSMRAGGNVSALLSKKTNVRFSVSPTF QSRAIQGVYDINGMFSLNASMRWASANGKWNAVLSGYNLTNRRMVTHSGYGNQDFGLRVV QDWTTLALSVIYKFGNYKAKTKKEVDTSRMRK >gi|283510498|gb|ACQH01000121.1| GENE 63 83662 - 84162 359 166 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929943|ref|ZP_06423785.1| ## NR: gi|288929943|ref|ZP_06423785.1| hypothetical protein HMPREF0670_02679 [Prevotella sp. oral taxon 317 str. 
F0108] # 8 166 1 159 159 323 100.0 2e-87 MKRLFLLMVSLCFGITMYAQNTQTDAERERENKMLEAFLPILDARRNEFVGQPAKKLFDV IRGSAFKIRNIGTESTSPWAEYKGKTYVYGLSLFNKPVQQAMKDKEVYIIRIILDVRWES SAFWDAASHAGPAWMEAIVRKCLDVKVLDVSWEHFYIPERTTSSEK >gi|283510498|gb|ACQH01000121.1| GENE 64 84167 - 85882 818 571 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929944|ref|ZP_06423786.1| ## NR: gi|288929944|ref|ZP_06423786.1| hypothetical protein HMPREF0670_02680 [Prevotella sp. oral taxon 317 str. F0108] # 2 571 1 570 570 1189 100.0 0 MMTRNVKKTNIVSPMIWLCVLLVCACNQLENDAGLSLPDEVTNREAQRFIDALNASTDQQ PVFQSLTKGKQVYFEDVRLCNSSMHGMVFIVPFGTEGTISGALYYPVGFTQLDDERVELN NKLKSPQVVTAETLNNDIPITQRFLYSNDFTILSDKGLQVDDSLKAYEFLKDSMMPLTQE QLPETRYYNPYLNGSRVIITIDYSANYVGQPDGDSHGLYHYTVERWAEEALRQTMINPYV CRIEKGWYLNKIEITIPCEQLYNVSSDPRSYVMYYGNILRGIAITHSFSFFLQATYRVDG ATYWDRRGGTIGIVGESRPETNNGPYTGNLGLGIFPKSKIPEECDSLPYSNEAKRAMKEM LDSLFSLKPYKGQKKNYIDLSQFKKLVKEHGNVEHVCVINVYDDNRHYMLEPEGGGLVDR APYTANFYTKYTLHNHPNLTPPSAKDLLNLCSDAASQECPNMRGSFVFIDENTYYALIIN DRKKAAAMFKEIEYEVAENNDFKMGGKCDRILKQYEHLNTFKSSNETAIARLSTIINHFD GGISIVKCHITDKPDYTTYGSEKVSYKPIKC >gi|283510498|gb|ACQH01000121.1| GENE 65 86280 - 87062 1033 260 aa, chain - ## HITS:1 COG:XF1043 KEGG:ns NR:ns ## COG: XF1043 COG1043 # Protein_GI_number: 15837645 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase # Organism: Xylella fastidiosa 9a5c # 2 253 5 258 263 149 36.0 5e-36 MSSDISSRAEVSPRAKIGDNCKIYPFVYIEDDVEIGDNCVIHPFVSILNGTRMGSGNSVY QCSVLGALPQDFNFVGERSFLIIGNDNTIRENVVINRATHEGCKTVIGNHNFLMEGAHIS HDTQVGNDCVFGYGTKIAGDCEIGNGVIFSTGVIENAKTRVGDRAMIQAGTTFSKDVPPY IILGGKPLAYGGVNTVMLKADGVDPKNIKHIANAYRLVFHGQTSVFDSVLQVKEQVPDGP EIRNLVQFIEATEGGIVTKM >gi|283510498|gb|ACQH01000121.1| GENE 66 87150 - 88535 1750 461 aa, chain - ## HITS:1 COG:ECs0610 KEGG:ns NR:ns ## COG: ECs0610 COG1538 # Protein_GI_number: 15829864 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport 
# Function: Outer membrane protein # Organism: Escherichia coli O157:H7 # 13 460 12 457 460 157 27.0 5e-38 MKIKNFILLALAALALTGCKSLYGKYERPAVKAAGVVRSPLSDTDTLAVSDTASFGNLPW RSVFTDPQLQSHIATALENNYDLLNAALNVKMAEAQLKSAQLSFLPSFAFTPQGTISSWD GHAAVKTYSLPVSASWTIDLFGTLLSAKRSSQMALIGMKDYQLAARTRLISNVANLYYTL LMLDKELELIDEMEVLAKDTWETMKLQKEYGNVRSTGVQSAESNYYSVQAQKVGLKRQLR EAENALSLLLGQSARTIARGKFDGQSLPTNFSTGISLQMLNNRPDVHAAEMSLAQCFYNV QTARSRFYPTLTLSAQGGYTNSGGMGITNPAKLLLSAVGSLTQPIFQRGQLIAGLKVAKI QYERAYNTWQQMVLSAGNEVSNALVLYNASAEKSDIEAKQIVTLRQNVEDTKMLLTESRG SYLEVITAQQTLLKVELSKVQDDFYKMQAVVNLYMALGGGK >gi|283510498|gb|ACQH01000121.1| GENE 67 88532 - 91801 3745 1089 aa, chain - ## HITS:1 COG:BMEI1629 KEGG:ns NR:ns ## COG: BMEI1629 COG0841 # Protein_GI_number: 17987912 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Brucella melitensis # 6 1045 5 1022 1051 741 40.0 0 MTFTRFIQRPVLSTVISVFLVLLGLIGIVSLPIAQYPDIAPPTISVMAHYQGADAQTVLN SVVVPLEESINGVENMTYMESTATNDGTAMITIYFKQGVNADMAAVNVQNRVSQAQALLP AEVTQVGVTTSKRQTSNVVMYTITSSDGRYDEKFITNFNEINIQPAIKRVQGVGNVQSFS SGTYSMRIWLQPDKMKQYGLMPSDITAALAEQNIQAAPGSFGEQSNVAFEYVMRYKGRLK TPEEYGNIIISNNTQGQTLYLKDVAKVELGALGYSVTMKNNGYPSVMGMVQQIAGSNATQ IAQDVKKVLEEQSKTFPPGLEYKINYDVTEFLFAAIEEVVMTLIITFVLVFFVVYVFLQD FRSTLIPLIAVPVSLIGTFLFLWMFGFSINLLTLSALLLAIAIVVDDAIVVVEAVHAKLD MGYKSSKVAAIDAMNEISGAIISITLVMACVFIPVSFISGTSGTFYREFGVTMAVSIIIS AINALTLSPALCAIFLKPHAEGEHGRKTSFVDRFHAGFNVAFNKVLGKYRKGVENIIRHR IVTGVAVLVGIVVLVFTMSTTKTGLVPNEDTGTLFVTISLPPATSQERTQEVVNQVDEML ANNPMIQSREQIAGYNFIAGQGSDQATFVLKLKPFEERTGGFFYKLSGLWQGDGIMRFFI DVRSSNMILGTIYKQISSIKGAQIIAFGAPMIPGYGLANSVTFTVQDKTGGSLDHFFKVT QDYLKALNERPEVMNAMTTYNPNYPQYMVDVDVAKCKQSGISPRVVLSTLQGYYGGMYAS NFNSFGKLYRVMIQGDVASRMKPSDITNIYVRTAQGMVPVSEYCTLTRVYGPSNVRRFNL FTSINVNVTPADGYSSGEVIKAIEEVAAQNLPTGYGYDFSGLTRSEHEQSNTTALIFALC LVFVYLILSAQYESYLLPLSVILSLPFGLAGAFIFTQLFGHQNDIYMQIALIMLIGLLAK NAILIVEFALERRRTGMAIKYSAILGAASRLRPILMTGLAMVIGLLPMMFASGVGKNGNQ TLGAAAVGGMLIGMMLQIFVVPALFVIFQYLQEKFTPLEFGDEENREVKNELKQYARQMA IEKDVKAKQ 
>gi|283510498|gb|ACQH01000121.1| GENE 68 92144 - 93340 1425 398 aa, chain - ## HITS:1 COG:mll6731 KEGG:ns NR:ns ## COG: mll6731 COG0845 # Protein_GI_number: 13475614 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Mesorhizobium loti # 3 365 16 392 402 141 27.0 2e-33 MRQKKWLMLVACVALLAACGGKGNMKMDNNEYPVMTVGTQGSETETTYPASIKGVQDVEI RPKVSGFITKLCVQEGQVVKAGQLLFVIDNTTYQATVRQAQAALNSAKVQLNTTKLTFDN SKKLFERNVIGSYELQTAQNAYESARAAVAQAQASLSSAKDMLGFCFVKSPANGIVGSLP YKVGALVSAQSVEPLTTVSNAATIEVYFSVNEKDVLNMSKQAGGTHAAISDYPEVKLKLA DGTMYKHTGKVVKMSGVISQTTGAVSLIARFPNPEHLLKSGASGTIIVPRVSTNSLVIPQ SATTEIQDKVFVYKLGAENRVKYTEITVDPQNDGNNYVVTSGLKAGDKIVTRGLTTLTDT MKIVPLTEEEYLKKVSDAAKLGKNQTSAKGFIDMMKKK >gi|283510498|gb|ACQH01000121.1| GENE 69 94370 - 94570 77 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTFNHLIWIKRNIHVCTGLRAWVWNSDHRKLDFSGVSGAPGAVLMADRELCGKAASKRNN VFLMLL >gi|283510498|gb|ACQH01000121.1| GENE 70 94826 - 95041 129 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|282859332|ref|ZP_06268444.1| ## NR: gi|282859332|ref|ZP_06268444.1| conserved domain protein [Prevotella bivia JCVIHMP010] # 1 69 1 69 72 82 53.0 8e-15 MNIKANRPLQGLSKHIVRERPGRRSGIETLTNVCGSMRFSGYLFGSDVQDMRADWRVVGN DMRKAMHRYGR >gi|283510498|gb|ACQH01000121.1| GENE 71 95179 - 95550 317 123 aa, chain + ## HITS:1 COG:NMA0456 KEGG:ns NR:ns ## COG: NMA0456 COG5346 # Protein_GI_number: 15793459 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Neisseria meningitidis Z2491 # 1 103 35 128 143 57 33.0 7e-09 MEEHKAYSGPLPAPEDFGAYKDVLSDAPERILAMAEKQIQHRISMEEKIVSSGLSESKKG QNLGAIIVLACIASSVYLGMNGHDILAAALVAIIAAVGTIFVLKKTPRSGGKGISLKEED TEE >gi|283510498|gb|ACQH01000121.1| GENE 72 96068 - 97177 946 369 aa, chain + ## HITS:1 COG:no KEGG:BF0631 NR:ns ## KEGG: BF0631 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 44 351 123 423 436 256 46.0 1e-66 MKQRLFSLFMVAAALVLLNACDKKDDRTTIPKPEPEEPIKLGVLPVVFHVFYADENDATQ 
KVPAERLRQVLDNVNRLYQGVYSGAGNANIRFLPTPVDKAGKRLAESGVEYIKLEKSEYP IDPYVLMNDKSGKYKQYMWDPNRFVNVYVYHFKQTNTNGTTLGISHMAISAKGEHELEGL LAVQARHLTLDNLPTPFCVSVNSKFVNDESNRYKDWPNLHFLRINSADFNVTLAHELGHY LGLFHVFGEEDGEYIDGFKDTDYCKDTPTYNKVAYDKFLAEYIRTHYAQAADLKELLKRT GSDGKTFDSNNIMDYAISLGHTFTQDQTERMRHVLNYSLLIPRPGSRSRSVDAHGKPKGV VDVKFRVIE >gi|283510498|gb|ACQH01000121.1| GENE 73 97978 - 101373 3874 1131 aa, chain + ## HITS:1 COG:no KEGG:BF1894 NR:ns ## KEGG: BF1894 # Name: not_defined # Def: outer membrane protein Omp121 # Organism: B.fragilis # Pathway: not_defined # 19 1131 16 1125 1125 1107 53.0 0 MTVMNEKQKKPQAFWRMCLVALICMLSSTLYAQNITVTGTVKDPMGEPVIGASVSVQGTR AGTVTNIDGHYSIECSPQATLVFSYLGFKAKSVAVNGRQQVDVDFEDDATALNEVVVTAL GIKRQTKALGYAVTELKSDELERANTVSPVTALQGKVAGVEISQSDGGMFGSTKIQIRGA STLNANNQPIFVVDGVILDNATSDSGDADWNTNTSDYGNQLKNLNPDDFESVSVLKGAAA TALYGSRGLNGAVVITTKSGKAGKGVSVRFSQTVGLETVYRSPDLQNKYLLGSFPGGVDY DEYYTKTGNTWGDNMSSFARNSKGDYSFIEQYGNVAWGPEISWAEGKQFEQYDGTMGPAR IFKNNYKDAYDTGINTNTNVSLQGGNDRTQFYASASYKYNKGTTPRNTFNRFSFLGKASQ KVGDIMTVDFSINFTQSQPRNAPLNIGEYFAKGTFPREYDVNRYRHLYKGEHGGLADGKY GDQYRAVPGRDLWWNIYENDYRQTETVFRPVLNLNVQALPWLQLSAGGSLNYYAVSGEYK APGSGYANEGGSYALSHTQTTQENFYLAANANYQINDDWEVHGFLREEYFNQYAQLHSES TNGGLIVPNQYFIKNSKLQPDIDTHKFNTKRIVSTIFMVGTSWKNQLFLDVTGRNDWSSA LVYSYGRGNFSYFYPSVSGSWIITESFKDTLPKWVSFAKVRGSWAQVGNDTSPYYINSGY EIATYQRGDKKVYGMKIPDKMKSTDLKPERKNAWEVGLDWRFLDSRIGLDFTYYKENTRN QIMTIDVPGPSGVKQKLINAGNIQNSGIEVALNTTPYKRGDWQWDLNLTYTRNRNKIVEL SPDVASYINLDGSATYGNYRIASVAKVGSDYGMLMSDSWIKTDEKTGKPIIGYTNNYRTV YYKRGGTVKEVGSMLPDFLGTVNTTLRWKDLSLYVLFDARFGGYVASYNSRYSTAYGFSG ESEKYRKGMTWTSRYDNAKGKTFTDGFIPDAIFDQGTIVTAPDGTQHDVSGKTYQEAYEK GYVEPAHMQGHGYFKNSWGQGVINDDWFKKLNYIALREVTLSYNVPRRLCSYIGAKSLAL SLTGRNIAYLLNTAPNHENPESVRGTGAAQFRMRSFMPYTASYLFTLNATF >gi|283510498|gb|ACQH01000121.1| GENE 74 101387 - 103300 2243 637 aa, chain + ## HITS:1 COG:no KEGG:BF1957 NR:ns ## KEGG: BF1957 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 635 1 627 629 488 43.0 1e-136 
MKQNIFNFKSLAMGLCTACLALTVGCSESDYADINTDPSKVTKGDPIFLFTQAQVEYQPF DYLLWFYDGAYTSKFVQAYTPAASFNDLFNNMAELGGVGSQSIKVKLYENEINSVLSSMP ADKAAQYTHLSAMANALSVYMALFDTDLFGSIPYSEAARARLDGTLTPKYDSQPQLFDQW LKELDADLEKLKTTSPQISIGNNDLAYNGNAEKWVKFINGIKLKIAVRLLHQDKARAIKI AEEVGAADANVMQSITDDYVYNKGTGGDGGNNTYGTDNSVNLGVGSKNVIDFMRRNKDPR MLVMFTKNDFNSEVIQAFFDAQARGDKDCAIPKYILDQVNYTTDAKGHKHFVSWKGDGEP WVRYHGLPIGIKLSDNADYTGENNYFVATRWKVTDGDKSKTYSPLSYFNEELVRGRVDYT FPTAPKGKVVQDVEDNPLYEMTISAAEMNLYLAEFKLLGANLPRTAADYFKTGVEASAQA YSRMATLNKIPYYDKAHCNDPLDKPVTYEADAIKTMMANADYQLTGNRDADLEKVYIQLY LHFYYQPLEQFVTARRSGVPKVGSTLIPWVALKPSKDIPRRFYIAQPEPSDKMRAIIEAA MHEQGFSFTDGQNPDVLNAERVWYDKGAPNFGEGPNY >gi|283510498|gb|ACQH01000121.1| GENE 75 103617 - 104495 549 292 aa, chain - ## HITS:1 COG:all4233 KEGG:ns NR:ns ## COG: all4233 COG0464 # Protein_GI_number: 17231725 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATPases of the AAA+ class # Organism: Nostoc sp. PCC 7120 # 6 292 198 485 490 211 39.0 1e-54 MLHGIRSIEKICKSNVKPSDGETEKTLDELLNDLDHLVGLDNVKSKVSDLICFQKVQKLR NGLGLSSPKSTLHLAFTGNPGTGKTTVARIVGRLYKQIGILSKGHFVEVSRTDLIAGFQG QTAHKVKHVIERARGGVLFIDEAYSITENEHTDSYGRECLTELTKALEDYRDDLVVIVAG YTEPMIKFFESNPGLKSRFNTFIEFEDYTPNELMEIFKLMCNKEDYTITELALTKLFERI NATVDAKDSHFANGRFIRNIFEDMIMNHARRLAKMENPDKTELQELKEQDLS >gi|283510498|gb|ACQH01000121.1| GENE 76 104714 - 105592 626 292 aa, chain - ## HITS:1 COG:no KEGG:Swol_2068 NR:ns ## KEGG: Swol_2068 # Name: not_defined # Def: hypothetical protein # Organism: S.wolfei # Pathway: not_defined # 1 136 1 143 241 71 36.0 4e-11 MENKSIEGHNFKKQLSEIGKLAQTTPRAITLPTFETKGNLIGWNAHNVTGKEANELVSNI QSAFIEANERFRTLYREFHKVYNAFNQLDKEYIRGILGAIESANKASEQAMLAQKDINKT IEGLKATVQVLKNFKESTTSQLDIIGRTIHSLVDVGTLESLKSSLSELQNQLNEIGKQLD KTKDQTQKDLLILQNYHSELQSYKHLADIDSLWEDVERQKASILANKQEQITLTKEFYRN REEQQSSFNDINKRIEQDERIVSSKIKTAYAIAAGAASLSIIQLVLRLVGLL >gi|283510498|gb|ACQH01000121.1| GENE 77 105597 - 107306 773 569 aa, chain - ## HITS:1 COG:no KEGG:Swol_2069 NR:ns ## KEGG: Swol_2069 # Name: 
not_defined # Def: hypothetical protein # Organism: S.wolfei # Pathway: not_defined # 15 569 20 562 564 485 49.0 1e-135 MNDNYNELQQWSNDDEASLYEDETLLERELASHMSELSGLEEDFEKIGSPDTLGETVMNV VWDQFINQIGNIAGEDFIKENRGLTLDLRSSAHIQTTENFANGKIASHNSSIDFQERYDS WQSKLQHDENGNVVTHTTRSGREEANLVKGARAPFDKGRPTGSVEKGTDMDHTIPAGEII CDAAANAHMSEQEQIDFANSESNLHEMKSSHNRSKGRMSTKEWLDYRNSKGQKPSEQFAD PMADDYLSPELEAQYRKDDEEAREEYGKRKSTAEQHSIEAGKQSQKAEAFRIGGKALRSV LMGLLAALIKDIIKELISWFRSGKREISTFTESVKKAIKNFISNMKQHLTTARDTVLSTI ITAICGPIVGMLKKAWTFLKQGYRSVKEAIQYFKNPANKDKSFGTKMLEVGKIVIAGLTA VGAIALGEVIEKALTSIPIFAVQIPLLGSLASLLGIFFGALGSGLVGALALNLIDRLIAK RQKEQNKLQQLEKKNEILNTQTSLLNVSEKNLKTTEHQTFSNIKDRHKEGAKGMRDMVEH ILSNSDSQAEEHLRNNEDIDEIFNTLKSI >gi|283510498|gb|ACQH01000121.1| GENE 78 107417 - 107662 84 81 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPSTPENAVGLLPRAWPLLCGAVIYSASAWGSAVARSTVRAMREPICAGRVTVVSALFVI VCSYRTQQTLKTRPTTLKIIS >gi|283510498|gb|ACQH01000121.1| GENE 79 109774 - 112470 2968 898 aa, chain + ## HITS:1 COG:L94405 KEGG:ns NR:ns ## COG: L94405 COG1640 # Protein_GI_number: 15672678 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Lactococcus lactis # 408 898 3 487 489 434 44.0 1e-121 MNLYFNIDYQTVFGEELVLNIVTNNDKGECLTTQYRMNTVDGLRWICCINKLPDACQNVV QYFYCVDSAGVTRRKEWALQPHRLALGALKANAYHVYDQWNAIPEDAYQYTSAFAQCLKR RNLQPMPASDYERTLRLVVRAPQLRASQRLVLVGDDLALGAWALGGALPMTEVSDCEWAV DLNLDELKGEEIAVKFVVVDADSRVAPLWETGYNRELQLPQVGKNEVCVYQLHQAFFELW DERFAGTLVPVFSLRTKRSFGVGDFGDLKAMIDFVAHTNQRVLQVLPINDSTTTHTWTDS YPYSCISIFALHPQYVDLNALPLLKKEDERMAFEALRQELNALSQIDYERVNNAKTAYLK QLFEQEGKEMMKGEDFKKFFEEECYWLVPYAQYCHLRDLYGTADFAQWPDHNTWDEAERK ALSTPTTKAYKEVAFHYFVQYLLYTQMRHAHEHARAKGVILKGDIPIGVNRHGCDVWTEP HYFNLNGQAGAPPDDFSVKGQNWGFPTYNWDEMLRDDCAWWVRRFQNMAKFFDAYRIDHV LGFFRIWEIPVSCVDGLLGQFSPSLGMSREEIEAYGLPFNEEAFTRPFISDWVLDRMFGE NADNVKATYLQPLHDDIWQLRPEYDTQRKIEQAFEGKNEEADRWLRDGLFALVSDVLFVR DRKDPSKFHPRISVQHDFVYEALWDKDKDVFNRIYNDYYYRRNNQFWYHEAMKKLPKLVE 
ATRMLVCAEDLGMVPDCVAWVMNELRILSLEVQSMPKDPRVRFGNLSTFPYRSVNTFSSH DMPTLRQWWDENWERTQEYYNTMLYNGGPAPHPLSGLLARDIVLRQLLSPSMLCILSIQD WLSTDEDLRLPDADAERINIPANPHHYWRYRMHLNIEDLMANTPFCTAVKDMVQQSGR >gi|283510498|gb|ACQH01000121.1| GENE 80 112720 - 114672 2060 650 aa, chain + ## HITS:1 COG:TM1845 KEGG:ns NR:ns ## COG: TM1845 COG1523 # Protein_GI_number: 15644588 # Func_class: G Carbohydrate transport and metabolism # Function: Type II secretory pathway, pullulanase PulA and related glycosidases # Organism: Thermotoga maritima # 30 638 229 840 843 466 41.0 1e-131 MNIKGFIMAALFVCAQGLDVRAQDMNEMTYSPRETVFKLHAPANAKRVLVRIYAEGQGGK PLQTLRMKPAGKDMWSVTAKGDLKGRFYTFDVPAFNLGETPGIFAKAVGVNGRRGAIINL DETNPEGWKADKRPPLASPADLVIYELHHRDFSVHPSSGIKNKGKFLALTEEKAINHLRQ LGINAVHILPSFDFASVDETRLDKPQYNWGYDPLNYNVPEGSYSTDPYRPEVRIREFKQM VQALHKAGIRVILDVVYNHTYDLSGSAFQRTWPDYFYRKRADGTYSDGSGCGNETASEQP LMRQFMIESMKHWAKEYHIDGFRVDLMGVHDIETMNRIQAEMQKIDPSIYIYGEGWSAGT CAYPQEKLATKANTAQMPGIGAFSDELRDALRGPFSDDRKPAFLAALNGNEESLKFGIVG AIAHPQVDMSQVNYSKKPWANQPTQMISYVSCHDDMCLTDRIRNSIPGITTAELIKLDLL AQTFVFTSQGVPFMLCGEEMLRDKKGVHNSFCSPDSINELNWTNLERYPQVFAYYKGLIA LRKAHPAFRLGNAELVRKHLEFIKAPANVVAYRLKDHAGGDAAKDIVVLLNANREAQVVD VPQGDYRAVVCDGVVNGDGLCTMKGGKVTVDAQSALILMSGLTPNPSPRD Prediction of potential genes in microbial genomes Time: Sat May 28 02:48:45 2011 Seq name: gi|283510497|gb|ACQH01000122.1| Prevotella sp. oral taxon 317 str. F0108 cont2.122, whole genome shotgun sequence Length of sequence - 43663 bp Number of predicted genes - 30, with homology - 29 Number of transcription units - 17, operones - 5 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 113 - 172 2.7 1 1 Tu 1 . + CDS 258 - 2114 1891 ## COG0366 Glycosidases + Prom 2198 - 2257 4.6 2 2 Tu 1 . + CDS 2283 - 2660 331 ## gi|288929961|ref|ZP_06423803.1| hypothetical protein HMPREF0670_02697 + Prom 2680 - 2739 1.9 3 3 Tu 1 . 
+ CDS 2780 - 3034 78 ## gi|288929962|ref|ZP_06423804.1| hypothetical protein HMPREF0670_02698 + Term 3193 - 3242 -0.9 + Prom 4210 - 4269 6.8 4 4 Tu 1 . + CDS 4289 - 4813 350 ## COG1309 Transcriptional regulator + Prom 4894 - 4953 3.0 5 5 Op 1 . + CDS 4973 - 5809 423 ## Coch_0388 abortive infection protein 6 5 Op 2 . + CDS 5844 - 6824 861 ## COG1073 Hydrolases of the alpha/beta superfamily + Term 6948 - 6992 7.0 - Term 6935 - 6979 7.0 7 6 Tu 1 . - CDS 7013 - 8815 2207 ## COG1217 Predicted membrane GTPase involved in stress response - Prom 8849 - 8908 2.5 8 7 Tu 1 . + CDS 9221 - 11491 2613 ## BF0615 hypothetical protein + Term 11558 - 11599 -0.7 9 8 Tu 1 . + CDS 12098 - 12691 475 ## COG3201 Nicotinamide mononucleotide transporter + Term 12862 - 12902 3.2 - Term 12849 - 12891 2.3 10 9 Tu 1 . - CDS 12907 - 15399 2683 ## PRU_0128 hypothetical protein - Prom 15420 - 15479 3.2 - Term 15607 - 15658 13.2 11 10 Op 1 . - CDS 15695 - 17218 1686 ## COG3291 FOG: PKD repeat 12 10 Op 2 . - CDS 17269 - 18198 973 ## BF1999 hypothetical protein 13 10 Op 3 . - CDS 18202 - 19353 1301 ## BF1930 chitinase 14 10 Op 4 . - CDS 19391 - 20560 973 ## BF1997 hypothetical protein 15 10 Op 5 . - CDS 20601 - 21221 619 ## BF1928 hypothetical protein 16 10 Op 6 . - CDS 21250 - 22284 991 ## BF1995 hypothetical protein 17 10 Op 7 . - CDS 22274 - 23242 1057 ## BF1926 hypothetical protein 18 10 Op 8 . - CDS 23274 - 24794 1610 ## BF1993 hypothetical protein 19 10 Op 9 . - CDS 24856 - 28239 3483 ## BF1924 hypothetical protein 20 10 Op 10 . - CDS 28277 - 30535 2447 ## COG4772 Outer membrane receptor for Fe3+-dicitrate - Prom 30750 - 30809 8.8 - Term 31976 - 32021 1.6 21 11 Tu 1 . - CDS 32136 - 32648 755 ## COG1528 Ferritin-like protein - Prom 32774 - 32833 2.7 22 12 Op 1 . - CDS 33049 - 33990 920 ## COG0845 Membrane-fusion protein 23 12 Op 2 . - CDS 34063 - 34935 1094 ## PRU_1259 hypothetical protein - Prom 34956 - 35015 8.1 24 13 Op 1 . 
- CDS 35297 - 36928 1465 ## BVU_3494 hypothetical protein 25 13 Op 2 . - CDS 36954 - 39761 3043 ## BVU_3493 hypothetical protein 26 14 Tu 1 . - CDS 40003 - 40404 441 ## PRU_2471 putative FMN-binding protein - Prom 40492 - 40551 6.9 27 15 Tu 1 . - CDS 40690 - 41043 323 ## gi|288929986|ref|ZP_06423828.1| hypothetical protein HMPREF0670_02722 - Prom 41254 - 41313 3.9 28 16 Op 1 . - CDS 41565 - 41801 188 ## 29 16 Op 2 . - CDS 41893 - 42153 57 ## gi|288929067|ref|ZP_06422913.1| hypothetical protein HMPREF0670_01807 - Prom 42360 - 42419 3.9 - Term 42733 - 42770 3.5 30 17 Tu 1 . - CDS 42797 - 43543 546 ## COG3209 Rhs family protein - Prom 43602 - 43661 1.8 Predicted protein(s) >gi|283510497|gb|ACQH01000122.1| GENE 1 258 - 2114 1891 618 aa, chain + ## HITS:1 COG:BH2927 KEGG:ns NR:ns ## COG: BH2927 COG0366 # Protein_GI_number: 15615490 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus halodurans # 122 615 134 578 578 165 26.0 2e-40 MKKILLALTLFSLTASAAVKVKRIDPTNWFVDMKDPSLQLMVYGEGIRNANVSTTYPGVE ITSIERMDSPNYLLVYLNIGKAKAGKVPLVFAQGKQKTTVTYELKARDKKGEERMGFTNA DVLYMFMPDRFAASGTKQHPLPGLNAYRVDRSQPSLRHGGDLEGIRQHLDYFNQLGVTAL WFTPVLENNTPDEKAQSTYHGYATTDYYRVDPRFGTNADYRKLIDEAHAKGLKVVMDMIF NHCGITHPWIKDVPSHNWFNRPDYKNNYLQTSYKLTPVLDPYASEVDMDETVDGWFVPSM PDLNQRNPHVLRYLIQNSIWWIETVGIDGIRMDTYPYADREAMAKWMARINQEYPNFNTV GETWVTEPAYTAAWQKDSKLSNVNSNLKSVMDFAFFDRINKAKNEETDGWWNGLNRVYND LCYDYLYPNPSSVMAFIENHDTDRFLGNGTDTLALKQALALLLTMNRIPQLYYGTEILMN GTKEKTDGDVRKDFPGGFPGDKQNAFTAEGRTKEQNAMFSWLSRLLHWRKGNEVISKGKQ THFIPFNGVYVLARTLDKRHVLTIMNGTSKPATMPVARYAEVIGEVDEVRDVLTGRTISL LEDLQLSPRETLVLEWSE >gi|283510497|gb|ACQH01000122.1| GENE 2 2283 - 2660 331 125 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929961|ref|ZP_06423803.1| ## NR: gi|288929961|ref|ZP_06423803.1| hypothetical protein HMPREF0670_02697 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 125 1 125 125 213 100.0 4e-54 MEKKHGLAALLLLFLLVACGRSDNYGYPSKIDFGKEGGTKICSGRIIVSSISINDYNGNG KAELTKDENDTIITKCYWLTVKFKRLVGHEIKIIAEPNTTGKKRTLYVYGSIYNDDATIK VTQSK >gi|283510497|gb|ACQH01000122.1| GENE 3 2780 - 3034 78 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929962|ref|ZP_06423804.1| ## NR: gi|288929962|ref|ZP_06423804.1| hypothetical protein HMPREF0670_02698 [Prevotella sp. oral taxon 317 str. F0108] # 1 84 1 84 84 134 100.0 1e-30 MKFCPICLWQNVAMAQNMIGQKLFNAVFTSRKPLVDGRLLLRLHHAVFVVREQHALQSLA LAAASDCVFRAVAMRTYTAQLLFV >gi|283510497|gb|ACQH01000122.1| GENE 4 4289 - 4813 350 174 aa, chain + ## HITS:1 COG:FN1004 KEGG:ns NR:ns ## COG: FN1004 COG1309 # Protein_GI_number: 19704339 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 1 171 1 168 188 87 31.0 1e-17 MPRSTIIAQEVIIEAAFELVRKEGFEVLSARNIAKQIGCSTQPIYWCYKNMDEIKAEVCR KALPYLQNLMLGYVKTGDAFLDLGLGYVRIAYAEPALFKAFYMDNIIKVKLTDIFPESER VVEVMKSSDECRHLSEQEVKNGIAKAWMLAHGIASLVAVGMFVYDEEKIIEILK >gi|283510497|gb|ACQH01000122.1| GENE 5 4973 - 5809 423 278 aa, chain + ## HITS:1 COG:no KEGG:Coch_0388 NR:ns ## KEGG: Coch_0388 # Name: not_defined # Def: abortive infection protein # Organism: C.ochracea # Pathway: not_defined # 1 276 1 276 279 393 72.0 1e-108 MKANKRTLRNVILFSFVAVVCGWVGVGVDKLLGQPSNLESLGALIFISSPILCMVLLRLF GGDGWKDLPLKPNFRRNARWYLFAIVVYPVVIGITLFVGKLFGWVDVSKFSIAAYFPVFI AAFLPQCLKNIFEESVWRGYLTVKVEQLTQNEWLVYLVVALVWQVWHLPYYLILLDDAYL ASFFPFGNVLFVVTSFAVIGVWTIMYTEIFFLSRSLLLVVLMHAMEDSLNPLISDGFAVM SPDKALLVSPSFGLIPLLLYLVIGLWLRRIRKSSARCS >gi|283510497|gb|ACQH01000122.1| GENE 6 5844 - 6824 861 326 aa, chain + ## HITS:1 COG:ECs0310 KEGG:ns NR:ns ## COG: ECs0310 COG1073 # Protein_GI_number: 15829564 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Escherichia coli O157:H7 # 2 326 32 377 378 394 59.0 1e-110 MNTNMEMPKCNDGWDKIFSQSDKVENKKVSFRNRYGITLVGDLYFPKNSENNKLAALAVC 
GAFGAVKEQASGFYAQTMAERGFVTLAFDPSYTGESGGEPRNVASPDINTEDFSAAVDCL GLLPFVDRERIGIIGICGWGGFALNATAVDTRIKAVATMVMYDMSRVFGKGYFDSNTPEM RLEMKRSLNAIRWTDAEKGTPAPGYSMPDTPSDKLPQFLNDYIEFYKTPRGFHERSISNK VWTATTLLSFLNFPILAYANEINVPTLIVAGENAHSRYMSEDAYQAIGTDKKELFVVPGA RHIDFYDNQGGVVPFDKLEMFFNTNL >gi|283510497|gb|ACQH01000122.1| GENE 7 7013 - 8815 2207 600 aa, chain - ## HITS:1 COG:DR1198 KEGG:ns NR:ns ## COG: DR1198 COG1217 # Protein_GI_number: 15806217 # Func_class: T Signal transduction mechanisms # Function: Predicted membrane GTPase involved in stress response # Organism: Deinococcus radiodurans # 3 595 2 593 593 665 55.0 0 MQDIRNIAIIAHVDHGKTTLVDKMMLAGKLFREGQDNSGQVLDANDLERERGITILSKNV SITWKGVKINIIDTPGHSDFGGEVERVLNMADGCILLVDAFEGPMPQTRFVLQKALQLGL KPIVVVNKVDKPNCRPEEVYEMVFDLMFSLNATEDQLDFPVVYGSAKNGWMSSDWKAPSD NIDYLLDEIVEKIPAPRQIEGTPQMLITSLDYSSYTGRIAVGRVHRGTFTDGMNVTVVHR NGEKERTRIKELHTFVGMGHVKTDHVDSGDICAVIGLERFEIGDTICDFEMPEPLPPIAI DEPTMSMLFTINDSPFFGKEGKFVTSRHINDRLEKELEKNLALRVRPFEDYTDRWIVSGR GVLHLSVLIETMRREGYELQVGQPQVIYKEIDGQRCEPIEELTINVPEEFASKMIDMVTR RKGEMTSMETQGERVNIEFDVPSRGIIGLRTNVLTASQGEAIMAHRFKEYQPFKGEITRR TNGSMIAMETGTAFAYSIDKLQDRGKFFIDPGEEVYYGQIVGEHVHDNDLVINVTKSKQL TNVRASGSDDKARVIPKVIMSLEEALEYIKGDEFVEVTPKSMRLRKMTLDHNERKRLNKE >gi|283510497|gb|ACQH01000122.1| GENE 8 9221 - 11491 2613 756 aa, chain + ## HITS:1 COG:no KEGG:BF0615 NR:ns ## KEGG: BF0615 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 24 756 24 739 739 646 48.0 0 MTSLAALISVATAYSANPKQETLDTLRSYQLQDVQVVATRATKKTPMAFTNMSQEQIKRL NTGRDIPFLLATMPSVVTTSDAGNGMGYTAMRIRGTDASRINVTTNGIPMNDAESSLLYW VNMGDLASSLGSIQVQRGVGTSTNGAGAFGATVNMQTENIGLQPWAAFDLSGGSYYSHKQ TVRFSTGLMGGHWGIQGRLSNLGSKGYVDRASSKLNSYFLQAGYFGETTVVKFITFNGVE QTYHAWDFASKYDQEQYGRTYNPSGKMEKDAAGNAQFYKNQNDNYHQQHYQLLWNQLMGE YWNLNLALHYTKGGGYYEQYKKNHELAQYGLDAGGTKAKSSLVREKWLDNYFYGAVAALN YDNKRNFSATLGEGWNRYSNDHFGRVTWVKKPVDDFMPNHEYYNNNTKKTDYNAYLKATY EFVKGLSAFADLQYRYVRIRMVGPADATSAKRSFSYDLNETFNFFNPKFGLNYELSPLHR 
LYASYAIAHREPVRSNYEHNMDAQLEKPRAERLNDLEVGYEFTAKNFTAGLNLYHMNYKD QLVPTGEMKLDKPITRNFDKSYRMGVELSAAWMPVDWFRWDANATFSKNRVKDVTVKLTD GTVYTIAGESQLAFSPNTVFNNVFTFDKAGFRGMVMSHFVGEQNLTNTGFKTMQTKDAGK NTVNDLISLKQYFTTDIDLSYTFSLKALALKDATIGITLYNVFSSKYDNNGWAAPQYRLD GSTLVAENTWGTRDSGAAGFAPSAPFHFMARLAVNF >gi|283510497|gb|ACQH01000122.1| GENE 9 12098 - 12691 475 197 aa, chain + ## HITS:1 COG:PA1958 KEGG:ns NR:ns ## COG: PA1958 COG3201 # Protein_GI_number: 15597154 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinamide mononucleotide transporter # Organism: Pseudomonas aeruginosa # 10 196 4 186 191 85 32.0 7e-17 MLSFLSLHWLDIFTTILGLLYIWLEYRASIALWVVGIVMPALDVWLYWQHGLYGDAGMAV YYTLAAIYGYAVWKFGRKRGQAAGQSMPITHMPRRLFVHTGVFFVVAWGATYYVLITFTD SNVPALDAFTNALSFVGLWTLARKYVEQWLFWIAVDVVCTGLYIYKGIPFKALLYGLYVV IAVAGYRKWRREAEVRS >gi|283510497|gb|ACQH01000122.1| GENE 10 12907 - 15399 2683 830 aa, chain - ## HITS:1 COG:no KEGG:PRU_0128 NR:ns ## KEGG: PRU_0128 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 827 1 827 829 1209 70.0 0 MEILKKYLFDVVAVVAFALLAYAYFVPATIDGRILYQHDSAAGRGAGQEVLKYKEKTGET SRWTNATFSGMPTYQTSPSYPSTSVLSTATKAYHLWMPDYVWYVFAYLLGFYILLRAFDF RQSLAALGAVIWAFSSYFFIIIAAGHIWKVMALAYLPPMIAGIVWAYRGKYVRGLIVTAV FTAFEIYANHVQMTYYYLFVIFFMVIAYLVQAIKEKQLARFFKATAACAIGATLAVCLNL TSLYHTWQYGQESMRGKSELVKKNNANQSNSGLERDYITQWSYGIGETWTLLVPNTKGGA SVPMSANPIVQEKGNPELGYLYQQIGQYWGEQPGTSGPVYVGAFVLMLFILGLFIVKGPM KWALVAATVLSIALSWGKNMMWLTDLFIDYMPLYAKFRTVASILVIAEFTIPLLAIMALK KIIDEPELLTHKIKYVYASFGLTAGMALLFALMPSVFFGSFVSSDELQALSQFPQQQLNP ILADLTQVRQAIFRADCWRSFWIIVIGTTFLLLYKAKKLKSEYLVGALTLLCLIDMWQVN KRYLHDDMFVEASVRDQPQPMDNTDRLILQDKSLDYRVLNLASNTFNENETSYYHKSIGG YHAAKLRRYQELIEAYIAPEMQGLMKSVAEAGGDMTRVKGDSIYPVINMLNAKYFILPLQ NNQKVPLLNPYAYGNAWLVDKVKYVDNANAELDALAKLNLRHEAVADKRFESVLGTSVQQ GTIAKAELKKYAPNQLHYSVESDKGGVLVFSEVYYPGWTATVDGQPTELGRVNYLLRALA VKPGKHEVVLSFYPKSIDRTETIAYISYAALLLLIAATLYLDWRRKKKEA >gi|283510497|gb|ACQH01000122.1| GENE 11 15695 - 17218 1686 507 aa, chain - ## 
HITS:1 COG:MA4289 KEGG:ns NR:ns ## COG: MA4289 COG3291 # Protein_GI_number: 20093078 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 123 465 621 1010 1734 159 31.0 1e-38 MLQIFTSKKQLAVMAFAMLCTTANAQQKSVTLTEAGTLSKQITGDEKLTITDLTISGPIN TADILFIRQMAGADNGDNKPIDPGKLQRLNLRAASFKKSSGAYFFKKKDLSNVGYGISED NLVDQYMFSGCDKLVEVKLPATTQKINNNAFNGCSKLTTCAIPVAVTHIGDYAFAKCKAL KEVNIPAAVSYMGTAAFSECTSIETVNFNEDCQLKEIRANAFNGCSALKNVKLSSGVEEL GSKAFANCASLNTFTLPASFIEKAADTFKDTPIKNYEVEEGIDGFSSVGGVLYNEDQSEL LLYPIAREDKSFIVPQSVGGIAEQAFAGAKNLTHITLSESITNIQSKTFAGSGLTEIVIP AEVTSIGSEAFMGCNALTSVTFKGGPSGIGEKAFFGTRLSTVTFNVMNAAPELGKQAFYL ARTQTTFFVPADKVETFKNALLAANAVSPSRFEVKALNTSSIEGLQQEDTAAEVSRYNAA GQRISAPVRGLNIVKMANGKTVKTIVR >gi|283510497|gb|ACQH01000122.1| GENE 12 17269 - 18198 973 309 aa, chain - ## HITS:1 COG:no KEGG:BF1999 NR:ns ## KEGG: BF1999 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 14 309 9 306 306 232 42.0 1e-59 MTTPIYKTASRLCLLLVFAAMALTSCDNATDTELSDIAFGTNNTDSVALNKQSTRIVLLH GGTGKYQANIADSKVASISISKDTLRINGLLEGETYATILSGDFKRKLKISVVVPPISIS DSEIRLYPRDESKFVSLAGGGDLARLTIDDPDSILNTKWNAKTGILEIAASYEGEANIKA IGEDGQFKTLKVTVRCSGSAQKVGVYGTTSRSIYQQMNTVMAVRRPGIGVWLMNGARPAA ARRVLKIKPAIVAPKLGQKVEVSLGMNYPEEFSGTLLREGKHKLTVEEVREQNVVLRGRG FKLVLPYER >gi|283510497|gb|ACQH01000122.1| GENE 13 18202 - 19353 1301 383 aa, chain - ## HITS:1 COG:no KEGG:BF1930 NR:ns ## KEGG: BF1930 # Name: not_defined # Def: chitinase # Organism: B.fragilis # Pathway: not_defined # 58 378 45 357 367 147 30.0 7e-34 MKQLHILALALAFAQAGTAQTHEELVIDGGFEVYVKPQNGNEGSGEGGTAPSFDPYLKPQ YWYFNSSLSRERTKTAHSGNYAIRVFPNGGSFIARDKDFNTHYIKVQPGAEYELSYWYQG TVTKPNILVYIDWYKGEQLIKKEERKAQKDMANGFTGTWKKKTLTFKAPAGVESAALGFY ILSVYEDADAGRHILIDDISMVKTKEGVVPNKPEPPANLHATPQQREIELSWNAVAETDV TYEITINGKAAATTKATKHVVTGLEPGKSYAFSVKTIKGSVASDASATVSQTTMPLNFGV NDEDRIPYLYMVREQGTAPRTLPLYYHDLANPDAKMLYWVDGQPVTPTSGAITFGSKGEH TLRIEIEETPTRKWEIEYKLNVD 
>gi|283510497|gb|ACQH01000122.1| GENE 14 19391 - 20560 973 389 aa, chain - ## HITS:1 COG:no KEGG:BF1997 NR:ns ## KEGG: BF1997 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 5 387 8 389 395 321 43.0 3e-86 MNHKIFLALLLAPLVFSSCNRWEGETVESAYVMPPYPDPQYKFARNGISSVDYLECTLLR SPLDYIYDSYLKEARIENRTLLETMKGYYNNGEFGLKPRRELSASPTHKADSALVKKDVE DIFNQAATLSGMGKEGSGTARNQKAKPGKGGYVGRNIGDVNLAFANEKGLVVAELFQGIV YGGIYLDKILNTHLNDSVLNSEALRKNHEDNVLVAGHNYTELEHHWDLAYGYYQFWLPYV QAANAPALRQSRITLYNAFAAGRAAITQYRYADMLRQQAIIRAELSKVAAIRAINLLTGE TTMANIDEDAANALTFLSEACGAVYGLQFTVQASGKPHFTYNEVKTLIDQLTAGNGLWDK QRLLADTSTTGSLRNVAWQVAQRYGISLQ >gi|283510497|gb|ACQH01000122.1| GENE 15 20601 - 21221 619 206 aa, chain - ## HITS:1 COG:no KEGG:BF1928 NR:ns ## KEGG: BF1928 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 206 10 223 223 149 36.0 5e-35 MNAMKYMLLALVAMLVSCSRSTADYAEEDYDTLFPFTGVEKPKVSYEDQVVQLGNPDAPV SDFVYPGVEITKNVRTYDVTLTCSFREVDILGNNVPEAELASRYVVRYVAANRSLTTIAT NKTNEDATSFLSNGQQHELHFKAKSGFPMYLLVNGVGPRGSSIKATISAISEDGLTIVKP LKVNEHQNEEGIGKIKGPFCAYIILP >gi|283510497|gb|ACQH01000122.1| GENE 16 21250 - 22284 991 344 aa, chain - ## HITS:1 COG:no KEGG:BF1995 NR:ns ## KEGG: BF1995 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 13 340 7 342 343 296 44.0 8e-79 MHSKTKAWLTAIIVTLLAACDASQDVFDRSPAQRNAESIAALKAELTSAEHGWRVLYFPR TDSLLFSNPSELIPQFGFRGRYGYGGDSFTMRFNADNTMEMKADFTQQTASTPQKSEYRV GRNSFTQLSFITRNYVHQLVNDAFRAASDWLFMGKNEDGDLVFRTASYLEPAREYIVFTK LASEEQWKQTADRAFQNRDFFESMVNPQLSIHRGSRIFFRSDVYVKRNVETNQSLLREIR EKKYYLFLFAQKKNPIPDYPAKEIVGLGSGYAGTEHGITFRAGLRYDSKIMFFDFQRVGQ RFVAELVEVYDPLFRKTRLVSKHLHPEGKPTGLVAEIWDEGDDR >gi|283510497|gb|ACQH01000122.1| GENE 17 22274 - 23242 1057 322 aa, chain - ## HITS:1 COG:no KEGG:BF1926 NR:ns ## KEGG: BF1926 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 4 319 16 333 334 287 45.0 3e-76 
MNIKNIICLLLPCLLQACTNELLSEQSVVDAGAVQNPQTELDRWIADSITAPYNIEVVYR WQKNANTGTTFVSPPRPDNVKAILRAVRELCFETYRQENLGGVDFLQGKTPLRIYLYGGQ NVDENGVELLNNPQLTPSEMCIFRVDDFKAGDANKMYALMRSVHHQFARRLAELVTYDRD KFTAISGHRYTGSTEPLAAPLGYSKKEKDYFGLADYANKRGFYTMQAFLSAEDDFAEIIS STLCSTPKEVNDAINTAQTPDQDSDPEVQQQYNKEAEQAYKEIVAKQAFVEQYMQKSLHI NLKQLQITSLRRINNYIKQHAQ >gi|283510497|gb|ACQH01000122.1| GENE 18 23274 - 24794 1610 506 aa, chain - ## HITS:1 COG:no KEGG:BF1993 NR:ns ## KEGG: BF1993 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 499 9 509 509 669 65.0 0 MKNNILLLALLALFTACNDFLDQSPDSDLDVAIDTEDKIAELLTGAYPQASYIPFLEPRT DNVEERPNGVHSRLNEAMFFWEDYDQEDLDTPLNYWNACYKGIAQANKALELLSTYPKSA RVKALYGEAFLLRAYLHFMLVNIWAQPYSTQSTDPGIPYITKPEKEALVDYKRQTVSEVY ELIEKDLKRGITLVADEYYKHPKFHFNKKAAYAFATRFYLMKGDWEQVVAYADYVLGGDP KTQLRPWRTYANELEFRHQDLFRRYTATDEPANLLLTTTESRLARTLPTEKYGSTFNTVN KVFAQKGIEGGGDYEKMNFIGTYIFTSGVLGVPSSRYLAKFDELSTTESVGSKPRGLYVT NVLFTADEVLLNRMEAHAMLRNYTKSIDDLLQYMQGKFGFMPSIDRDVYTTTNRDNYKTY TPFYGMTLKQLALVKTIVGFRQQEFFQEGLRWFDLRRFHIAVKRSSKSTYYFPLEKNDPR KLLQIPAEAINRGLEPNPRERNEPLH >gi|283510497|gb|ACQH01000122.1| GENE 19 24856 - 28239 3483 1127 aa, chain - ## HITS:1 COG:no KEGG:BF1924 NR:ns ## KEGG: BF1924 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 4 1127 2 1125 1125 1533 65.0 0 MTTNISTPLKKVLTLCLALLACLQVAMAQTANTITGTVKNEQGETLVGALVQAAGTTERA VTDIDGHFTLKAAPGHTLQVSYVGMTTLRTKAQAKPMTIVMKADSRMVDEVVVTGYQNIR NRVYTGAATSVKMNDIKLEGIADLSRMLEGRVAGLSIQNISGTFGSAPRINIRGGASIIG NVQPLWVIDGVVYEDLVHLNLDQLASGDAATLVGSAVAGLNPADIEDIQVLKDASATSVY GARALNGVIVVTTKSGKRETPLRVSYTTENTIRMKPRYSQFDLLNSQETMALYQEMNDKG YFGLHNALYGRRAGVYHELYRGLSTIDPATGQYQIANTAEARNAFLREREYANTDWFDLL FTLNPTTNHSLTFAGGGKNVATYASVGFYHDSGWTIADKVSRLTANIKSTFYINEKMRAT LSAQGNIRSQKAPGTMPQRKNHTLGVFERDFDINPFSYALGTSRTLRPYNADGSLAYYRN NWAPFNILNEYANNSTHVDVLDFKLQAEGSYQLTADLAVKTLFSARRANTVTTHEVDEGS NVVQAYRANDNPTVAQENIYLVRDKDNPLLQPKVGLTHGGLFNRTETSMTSFLARLALDF 
DKRMGQHDFKAFAFTEVRAANRSTQPFQGYGIQYGRGNQVYTNPLIFDKLTNEGNSYFGL HKRNERGVTFSLNGTYGYAGRYVFNFVLNTEGSNSSGRGARALWLPTWNVGAKWNIDQEK FLKNHAWISRLALRASYGLTAKMNEEAINANAVYQSGIVNRYHFNNRENKLNIKHLENRD LTWEKMYELNLGIELGMFNNRISATIDLYQRNTFDLIDLVRTSGVGGQYYKYANFGDMRT RGVELGLHTKNIVADRFSWTTSLTLSAMHQEITRLLNTPNAFDMVAGRGRGNIVGYPKGS LFSYNFQGLDGNGLPTFNFGRYPSSQNAYAKNAGADFLDTQYAKSYLIYHGPIEPQVLGG LNNTFKLGAWELSFFLTMQAGNKIRLNPTFDPDFADLNVFSKEYYNRWLNPGDERHTDVP TLPSQELIAKVGKENIEKAYNTYNFSQNRVADGSFVRMKNISLGYTCTPALLKKLRLHAA SLRLNVTNPFLIYADSKLKGQDPEYYRTGGVSLPTPRQFTATLNLGF >gi|283510497|gb|ACQH01000122.1| GENE 20 28277 - 30535 2447 752 aa, chain - ## HITS:1 COG:SMc02721 KEGG:ns NR:ns ## COG: SMc02721 COG4772 # Protein_GI_number: 15966136 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for Fe3+-dicitrate # Organism: Sinorhizobium meliloti # 65 634 56 633 932 347 39.0 8e-95 MMCQTFRTKFLIALATLALAIFTTAQAQEPTKSKAMTDSTHRIREVVVRSNQMLGSKFEA RNRTGSAYYLSPTELAKFGYTDINRMLKSVPGVNVYEEDGFGLRPNISLRGTKAERSERI SLMEDGVLAAPAPYAAPAAYYFPNAARMYAIEVLKGSSQVQYGPFTTGGAINMVSTPIPQ RLWGKAQVSYGSYNTLKSYAGFGNRHKHVGYMIEYLRYQSKGFRRDEPDNRTGFKRNDLI AKVSAQTDNDEGVNHLFELKFGFANETSDETYVGLSETDFAQRPYFRYAGAQRDQLTTRH TQWVGTYIIEKANKWKITTNLYYNHFFRNWYKLSDVRAGYTKAEKRSIASVLTDPETNKD YFDILTGAKDYLGEALMMRANYRTYHSRGIQSKAEYRTMLFGGYFTAEAGLRYHADSEDR FQQDDAYSMQQGKMNLFLAGMPGSQANRITTAHAWAGYALTKWAKDALTLTLGVRYEDVD LLKRDYTAADPRRTGMVRIETPNHARAFLPGLGVSYRLLPVLSAFGGVHKGFAPPSATLY QKAESSVNLEAGLRLTTNNLKAEAIGFYNNYDNMLGSDLAAQGGQGTLDQFNVGKAVVRG LELMLHYQPLPRHWGVQLPLQLSYTYTDTEMRNTFSSESWGDVVYGDEIPYIYKHAFNAQ LGFEHKRFEANIGVRINGDMRTQPGQGRIAEREKIPAHAIVDASLKGHINRHLTLTLNAI NLTNKVYLVSRHPSGLRAGHPLGIYGGVLWRL >gi|283510497|gb|ACQH01000122.1| GENE 21 32136 - 32648 755 170 aa, chain - ## HITS:1 COG:MTH158 KEGG:ns NR:ns ## COG: MTH158 COG1528 # Protein_GI_number: 15678186 # Func_class: P Inorganic ion transport and metabolism # Function: Ferritin-like protein # Organism: Methanothermobacter thermautotrophicus # 1 165 1 165 171 146 43.0 2e-35 
MLKKKIEDALNAQINAEMWSAYLYLSMAAYCHAQGQPGMGKWFEVQFKEEQDHAKILFNY VISRNGKVDLRPIEAVPTEWNNILNVFESSLRHEQTITESINKLFALCAEENDYATQSML KWFIDEQVEEEENVQTIIDNIKMIKDNGYGIYMIDKELGQRAYNPATPLA >gi|283510497|gb|ACQH01000122.1| GENE 22 33049 - 33990 920 313 aa, chain - ## HITS:1 COG:VC1607 KEGG:ns NR:ns ## COG: VC1607 COG0845 # Protein_GI_number: 15641615 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 35 305 30 322 324 69 24.0 7e-12 MQKKLRLRFKLLIGGTAVVVLAAAVAIAANMLKTPKPRQIDGQVEVLQYSVQVQQRGKVL EIRVKEGDYVEVGDTLALYRTDDTPMPEDATLELLATTLHPYRTGNRNVDRAYGRWQQAK SAEEQAERIYNDVQQKFMNGKAPASLRDNVFTEYKTLQVQALSLKADYEKALHNLKTEKS NSKGPQIIAIVTKVEGEVSEIVVKRGEQLSANNTLMSIALLDNLWGSIAVTRNEKMQFKE GDTIQVYCKPFNMSVPMRVSGFKANFDFLDPNNYKRGNEGAQTWEMQLRPVYKVEGLRPG MQLSIGVRADEET >gi|283510497|gb|ACQH01000122.1| GENE 23 34063 - 34935 1094 290 aa, chain - ## HITS:1 COG:no KEGG:PRU_1259 NR:ns ## KEGG: PRU_1259 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 290 1 290 290 305 57.0 2e-81 MNKKIWIPTIIVVLILAGVTAYLFINLNKQKEENAAIKELAEIDKKEMENEYQQFAQQYS EMKTQINNDSIVAQLTAEQEKTQRLLNELRRVKSTDAIEITRLKKELATVRAVIRSYVME IDSLNKVNANLTQENTRVKGQYEAATRQIEGLSTEKRSLSEKVAIAAQLDATGISLVAKN KRGKATDQIDKATTLQVSFNITRNVTAASGVKDIYVRIMSPTGSLLNGAGSFSYENRTLQ YSMKRSVEYNGEETPVTLFWNVSQALVAGTYQVSIFADGNMIGSRSFAFK >gi|283510497|gb|ACQH01000122.1| GENE 24 35297 - 36928 1465 543 aa, chain - ## HITS:1 COG:no KEGG:BVU_3494 NR:ns ## KEGG: BVU_3494 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 9 540 4 534 538 619 57.0 1e-175 MKPKTHIIHLTLAAIALLGMFVSCLEDDPRDRISEDQAITSPTSLWINSIGTLYNNIGSV EDGSGLQGTNRGIYDYSTFTTDEAIVPIRGGDWYDGGFWKRLYTHEWSANDLELYNLWCY LLQSVVCCNDAIALIDKHKSLLNQSQLMAYNAEARAIRAMYYYYLIDLFGNVPLLEKNQT SVAEVSNTRRSEVFRFAFKELQQVAPYLADAHSNLQGQYYGRLTRPVAHFLLAKLALNAE VYTCDNPVRHARKPGKDILFTVDNEQLNAWQTTIRYCEKLQAAGYTLENSYADNFKVHNE TSRENIFTIPMDKNIYPNQYQYLFRSRHYNHGKALGMGSENGPCATISTVRAFGYGTSEK 
PDTRFKLNFYADTVMVDGDTVRLDNGQPLVYMPLNVAIDLTTSPYIKTAGARMAKYEIDR HSFLDGKLQDNDIVLYRYADVLLMLAEAKVRNGQSGQAEMDAVRARAGMPSRTATLANIL EERRLELVWEGWRRQDLIRFGRFTHSYDLRHALEKEDTGYTTIFPIPSRAIGANGNLRPN SEF >gi|283510497|gb|ACQH01000122.1| GENE 25 36954 - 39761 3043 935 aa, chain - ## HITS:1 COG:no KEGG:BVU_3493 NR:ns ## KEGG: BVU_3493 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 87 935 50 898 898 968 56.0 0 MNIAGTFSYSEQRLRSLGQGKNKPASCARLAAYLLMACLLPFVCAENLFAQVGEDKPKYI SRLVPDSLKGDQPLIQVWQKTGGFGDKGLVVNPLDAIKGRLAGVNVSRGGDQRQASLTSV RVRGTTSLTGGNDPLVVIDGVSSDLATLSTIFPSDILSYQVLKNASETAQYGSRGASGVI IVTTAKGARGKFHISYDGNVGFETAYKNLEMLSGRDYVATAKALGLSVNDGGEDNDYLRA ITQTGFIQNHHLAFSGGTEQSNYRASMSVMNHQMVVRTNNYRNFTAKFDLMQKAFDDWLT VNFGLFGSSLSNKKILDTQRLFYSAASQNPTFSTQQNAAGGWDVNTLASQITHPMALLHQ KDHDRGFNINTHLSFDFPLAKGLTARLFGSYSFTTLENDIFQPTWTEAQGKMYRRENKSE DLLGSVQLNYNNTWGAHKLQLTGTAEYQNNRKQFFFTTVKGLSSNDFGYDNLTAGSIRPQ DGTGSGYESPKLTSFMGSLSYTLLGRYALSLTLRADGSTLVGKNNTWGYFPSAGLEWNVK EEKWLKSVLWLDQLKLRTSVGRSGNLGGIMAYYTLQMLMPNGIVPHYGRPTVTLGQLRNI NPDLKWETRTSFNVGMEAGFFSNRIHVYAEYYHSKTTDMLYNYDVSVPPFAYNKLLANIG SMSNSGMEFGLGLDLIRKKDMELSVNMNLSFQQNKLLSLTGMYNGELLSSPDITPIGGLT GAGMHGGHNFITYQIVGQPLGVFYLPHCTGLTKNADGSYSYACADLDNNGRINLEDGGDR RVAGQAVPKATLGSNISFRYRMWDVTMQLNGAFGHKIYNGTSLTYMNMASFPDYNVLREA PKQNIKDQTATDYWLERGDYLNIDYLTIGCNLPVRSRFVNALRLSLSVNNLATITGYSGL TPIINSYVVDNTLGIDDKRSYPPYRSFSLGLSVQF >gi|283510497|gb|ACQH01000122.1| GENE 26 40003 - 40404 441 133 aa, chain - ## HITS:1 COG:no KEGG:PRU_2471 NR:ns ## KEGG: PRU_2471 # Name: not_defined # Def: putative FMN-binding protein # Organism: P.ruminicola # Pathway: not_defined # 15 132 9 126 126 122 57.0 6e-27 MRKVTSRLLFGIAAIVFAATLTSASSNNNIITKEKGMTVVNTTELTRDIKGFKGSTPVKI FIKKNKVVKVEALPNQETPRFFDKVKPLLKYWEGKPVSKAIEDEPDAITGATYSSDAIMK NVQVGLEYYNAHK >gi|283510497|gb|ACQH01000122.1| GENE 27 40690 - 41043 323 117 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929986|ref|ZP_06423828.1| ## NR: gi|288929986|ref|ZP_06423828.1| hypothetical protein 
HMPREF0670_02722 [Prevotella sp. oral taxon 317 str. F0108] # 1 117 1 117 117 157 100.0 1e-37 MNNFTLFFMYDVVWVTLLALAYAAYAISSKPKGDLAVRLTIGFAFAYYLVFYLLSFATSV ATAKGLFLLALFPVSALWFALAPYRKSRRTKVIAWALATAYCAFYGLIVLLHSQSNM >gi|283510497|gb|ACQH01000122.1| GENE 28 41565 - 41801 188 78 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVENALNNFCQIVLNRKLTFQLIAENALNDLCQTVFWRITTFCHNQLRQQFRPYSLTELD LESHTCSVSQQDTQFYFP >gi|283510497|gb|ACQH01000122.1| GENE 29 41893 - 42153 57 86 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929067|ref|ZP_06422913.1| ## NR: gi|288929067|ref|ZP_06422913.1| hypothetical protein HMPREF0670_01807 [Prevotella sp. oral taxon 317 str. F0108] # 5 82 3 80 84 67 48.0 2e-10 MHNCPLWVPKRTSARIDFLRQGGRLVGRNGSFCVKFLAESLTKSVVPSIHLWATARKSTY AYNHQRERLQSLLPTANGDSIIEMEK >gi|283510497|gb|ACQH01000122.1| GENE 30 42797 - 43543 546 248 aa, chain - ## HITS:1 COG:YPO3615 KEGG:ns NR:ns ## COG: YPO3615 COG3209 # Protein_GI_number: 16123757 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Yersinia pestis # 31 201 1161 1370 1512 113 32.0 3e-25 MALRGGIVAPINKERLETELGIRFRAFGTGWRYDWQSDGMLARVVRPDGKEVSFAYDALG RRIRKSFAGTTTHFVWDGNVPLHEWTEETEENVVNWLFEQDTFVPAAKLVANGECFSIIS DYLGTPLQAYDKQGNNVWEQELDIYGRQRKRPSAFIPFKYQGQYEDAETGLYYNRFRYYD PNAGSYISQDPIGLLVDNPTSHGYAKKIIGMPKRIISIWFKIIMGMPKKIIGMPKMQMAC PIYNIPKI Prediction of potential genes in microbial genomes Time: Sat May 28 02:50:41 2011 Seq name: gi|283510496|gb|ACQH01000123.1| Prevotella sp. oral taxon 317 str. F0108 cont2.123, whole genome shotgun sequence Length of sequence - 6922 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 41 - 96 0.6 1 1 Op 1 . - CDS 106 - 4866 3384 ## COG3209 Rhs family protein 2 1 Op 2 . - CDS 4963 - 6420 734 ## ZPR_3000 protein containing rhs repeat 3 1 Op 3 . 
- CDS 6494 - 6916 167 ## gi|288929991|ref|ZP_06423832.1| hypothetical protein HMPREF0670_02726 Predicted protein(s) >gi|283510496|gb|ACQH01000123.1| GENE 1 106 - 4866 3384 1586 aa, chain - ## HITS:1 COG:MA2045 KEGG:ns NR:ns ## COG: MA2045 COG3209 # Protein_GI_number: 20090892 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Methanosarcina acetivorans str.C2A # 656 1295 1404 2046 2217 141 26.0 1e-32 MGLKAADAVPLLVPFDTDGNGKDDVVCVEQRQKDGYYPCTIIKYMDGDKLDHRDYKLRLP QNPEKMFVGDYNNDGLQDLIFLYDNGYKIYFNAGGKPTDEKFVETKTAEGTNLRNRWRIQ QGDFDGDGLIDFVYNCGDGDYHLWVAYNEGGGTFRIGKGVDMGFSDQGTTNDDDKFALLT WDMDRDGLTDVMVCKARYRYRGFPKFKTEFTATEVRWYRSTGDGFAFNNAASKGREDDAL ENKIFLGDFDGDGQMELANYGSKLTSGDDGFNGTIHAYFNYGEKANAGRVTNLEDGMGTS HRILYAYATNPIVYKRTIANRYPVNTYALPIAVVRYVYANDGSAGHYYTEYGYEDLRLHI AGRGMLGFNKMASTNTSLGIKTSTETTKWDETRWLPTETKTVVEQGGCAATTLSTTTIVG QDKGNYFAYISKQEATDMDGNRTITRSAYDNGKGVPVEEVVENDGDNMYRRVTYGGYVKK QARWLPTTMTKTQKHKDDPSPFSTTTSYKYDDNGNILSTTENYGTEMALTTTNTYNAQGN ILTTVTTGKGVKPITKHFEYDNDGRLVVKTFTTPETSVNTFAYDEWGNLLTQTDITNSAE PLTTTHTYDGWGQETQVVAADGTKTKVERRWTSPFSTEQYDIEVASDGAPTVLTCYDQKG RELSKKSVGPKGVDVNNKTTYNTKGLVSKIEKTTGKLTTKQVLTYDGRGRLVKEVLSSGK TTKYTYGNRTVTSTIAETGQSVTKTFDAWGNVVKVEDPSGIVTYKYASVGKPASVTAHGS TVCMTYDAAGNQTSLADPDAGTSTYTYAADGTLLAQTDGKGTKTTYQFDELGRPFLTCIG KQLLVTNYGAYVNGSPLVLSRINDRCSVQYQYDKLGRMVSEKRDIRGHEKAYKFEYAYND KNQLSQVIYPGNLEVNYQYDLNGFKDQVEVNGDIVYKQEDYDGLVNRFSFLGKLVSTSVR DKDGFETNLKLTNGNSTLEDFTSEYDKATGNLLSRRRNNSSKEEFGYDNLDRLISYKPSQ GGMMKIDYAPNGNVLFKTDVGNYNYDANIRPHAVTEVDNPIAAIPSDPLTTTYNEYGKVD MITDSVKGLSTHFFYGPDQERWCSAQYRDGNLVRSTIYAGDYEEVYENGDTRGYYYLDGN VIVIRDGLFKPYLAFTDNLGSILSVFDEEGRKVFDASYDAWGKQTITQNDIGLYRGYTGH EMLNEYDIINMNGRLYDPVIGRFFSPDNYVQMPFNSQNFNRYSYCLNNPLKYTDPSGEWF GIDDLIVAGAGFVVGYLGNAISSHNWGWSSIKSGLITAGASWLGFNTAGLATGSITSATW RQVGSICLNGIVNKAFPSGNVPLNDHLGLAFSPSFSWTEGGLTAGMLASLSYTDGDFSAS VSGGITNNYVGWNAEASYGGWGAGFGKTYYGEQTVRGNVLGKQIVGAITLLAPGDVSFRL YNDMFGQSGHDRWRTSAAELTIGDFSVGTYVTTNDGKEESGIYGYDPNIVDPYLGANPER 
VIHVNGREQKVGGGWPNGKVYSAPFWVGIKSGSQIYRFGVSARIVQSLTQNLVHRYLVPT PYFIPGKEFYRGLYSSFGHNSPLSLW >gi|283510496|gb|ACQH01000123.1| GENE 2 4963 - 6420 734 485 aa, chain - ## HITS:1 COG:no KEGG:ZPR_3000 NR:ns ## KEGG: ZPR_3000 # Name: not_defined # Def: protein containing rhs repeat # Organism: Z.profunda # Pathway: not_defined # 78 365 72 340 2141 110 32.0 2e-22 MRRFYLITLCLAFLPLANLAQEVSKRNGETPVLNKCVPPHGGGTCPGQEDSSRIPPVIPP DEVPSNDMYVVGTPTCSFNVSNSGAAIYDIKIGVPDAGPLTPKISLTYNSQSAGYGLAGY GVNISGISVITRGNKDLFHDGKQQGVEFNASDNLYLDGKRLIYQSGEMEDSCVVYAVEGD PFTKVFQYGFHPFQVGGMCFKVVTADGLEYEYASDADATLFCGSLDQKKHGCVAWYVSAV RDKNGNFIFYSYDNSGLTVRPVYISYGPKDETGKGKFYNIEFSYAKLGTNARPFAAYVDR GNFDVCLSSIVAACANHTYRKYTLTYDDNSDGSQTKWTRLVQVSEENGKGEKYPPVKLNW SFLPKGDASPLQMDVSTNDDRWYVKESDKGFFATDLNGDGVSDIIRVAPVEVIDWQGYGT THSHYNTYVYVSRSKVSPNGNVSYEKPLVFSLPANISLGDIKTTMGGASVMDFDGDGYND LLIPY >gi|283510496|gb|ACQH01000123.1| GENE 3 6494 - 6916 167 140 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929991|ref|ZP_06423832.1| ## NR: gi|288929991|ref|ZP_06423832.1| hypothetical protein HMPREF0670_02726 [Prevotella sp. oral taxon 317 str. F0108] # 34 140 1 107 107 187 99.0 2e-46 MKKAFLTIVAFATSLLVAAQHKIKFSYDAAGNRVAREAVVPKSKVAQAKAQTQSANERMP VSMEGVTHSYAQGIIRVNVPMLPNGSSCELYVYSAAGALVYRGTAVVGTNEISITHCADG VYLVKAVVGENTSAWKIVKR Prediction of potential genes in microbial genomes Time: Sat May 28 02:50:55 2011 Seq name: gi|283510495|gb|ACQH01000124.1| Prevotella sp. oral taxon 317 str. F0108 cont2.124, whole genome shotgun sequence Length of sequence - 5323 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 256 - 315 5.9 1 1 Op 1 . + CDS 452 - 937 494 ## gi|288929992|ref|ZP_06423833.1| hypothetical protein HMPREF0670_02727 2 1 Op 2 . + CDS 934 - 1854 946 ## gi|288929993|ref|ZP_06423834.1| hypothetical protein HMPREF0670_02728 3 2 Tu 1 . - CDS 1723 - 1980 70 ## - Prom 2003 - 2062 3.5 4 3 Tu 1 . 
+ CDS 2029 - 2823 787 ## BVU_2192 hypothetical protein + Term 2900 - 2941 -0.7 5 4 Tu 1 . + CDS 3235 - 5190 1891 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins + Term 5220 - 5257 5.3 Predicted protein(s) >gi|283510495|gb|ACQH01000124.1| GENE 1 452 - 937 494 161 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929992|ref|ZP_06423833.1| ## NR: gi|288929992|ref|ZP_06423833.1| hypothetical protein HMPREF0670_02727 [Prevotella sp. oral taxon 317 str. F0108] # 1 161 1 161 161 276 100.0 4e-73 MKIRTTIRALVIASALFFGSAWGMAANHVKVSGKSGESVFFALADKPTVTFTTNKLVITA GKQTVEYPLNEFRSFEFADPSSTGISTAPSPNESQAVFSFGQSVHGEGLQPGSRVSIFNV SGQLVVSATVNSDGSVDLPLKGQTGVFIVKSTSRTFKFIQK >gi|283510495|gb|ACQH01000124.1| GENE 2 934 - 1854 946 306 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288929993|ref|ZP_06423834.1| ## NR: gi|288929993|ref|ZP_06423834.1| hypothetical protein HMPREF0670_02728 [Prevotella sp. oral taxon 317 str. F0108] # 1 306 1 306 306 556 100.0 1e-157 MKYLFTLVFAILAGVATTKAQTVTITKSDGSTVTYKASEIKNIQFTNEESQVKPIHTFTG YIVVNSPMFMDTYYGEEAKMEIFAQGKKSICKFTDAKWGTGTFEVTLNDGEIRGSGKISV ADPHKAGQTKEYEALISGPMAAVNISIKGLMGGTTIKWRNGKAPEKVKLSGTYLGDNSVS VTGLTYVAKNTGYSFWVNDDGTYTIMVLGQKLEGTLMGDLTLGAYTINNLVYDKKTETFS KDYSNDGLKLKFKKGAETEYKEYLLTKATIKATFGKDGSLKVENNFTAGHMPFPLQGVFN GKIYKM >gi|283510495|gb|ACQH01000124.1| GENE 3 1723 - 1980 70 85 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVQACPLRRLGWEDTPQAQGGLLCLSLIISTALLGIKACTAALHFVNLTVEHALEWERHV AGSEIVFYLQAAVFAKGCFDGGLGQ >gi|283510495|gb|ACQH01000124.1| GENE 4 2029 - 2823 787 264 aa, chain + ## HITS:1 COG:no KEGG:BVU_2192 NR:ns ## KEGG: BVU_2192 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 13 264 11 218 220 112 30.0 9e-24 MHSIIKLAYKLSMGTLIALVSACTGLFDGIYDSPDKAPKINANQLAIDASDWHNWYYIDF DSLQMLAEAGDADAFLHARTHFTPYPIPTIKDETQANNQTGIYTYWFDVFGKGLSHNEKR DFKPTAKQPEPPAWSIAIHRDNVRTNGGAVLETNYTSMSQLPQSSTQFMGATFKPDEWTT NEVWVDQSQMANSLIGCQGIHINRALSSWLRLDIPPMPPAFKHNNRVFVVKLNNGKFVAL QLESYINPTTGTKCYLTINYRYPY 
>gi|283510495|gb|ACQH01000124.1| GENE 5 3235 - 5190 1891 651 aa, chain + ## HITS:1 COG:ECs3047 KEGG:ns NR:ns ## COG: ECs3047 COG4771 # Protein_GI_number: 15832301 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferrienterochelin and colicins # Organism: Escherichia coli O157:H7 # 34 538 31 563 659 123 24.0 1e-27 MFHRTFIFLSLIISHITANAALPADTLNRTINLEQVVVTGTRSPKLLASTPVLTRVITRK DIEKTDATNLRDLLQTSMPGIEFSYAMNQQTHLNFAGFGGQSLLILVDGERLAGETMDDV DFTRLSMDNVERIEIVRGAASALYGSNAAGGVINIITRNATQPFKLNLNARVARHNDWRQ GLNMQLKKGKWANTLALTHTSIDNYDVTNGPSPISRVVATIYGDNTWNANNQLSFNPTSE LKLTGRAGYFFRETTRTTDTPERYRDFSGGLRALWQPSDKQHLELSYSFDQYDKSTYLQQ ARLDLRSYSNVQNAFRLLYTLNLRQGDVLSLGADWMHDWLLNPNLEGRTRKQRSCDVFAQ YDLNLGKAWEVVGALRLDHFSDHAITRLTPKISLRHTPLPHLNLRLGYGMGFRSPTLKEK YYNFDMAGIWLVTGNANLKPEVSHNFNLSAEYTRLHYIFTLSGHYNLVSDKIATGAPFFA NPSEPIPHLPYINLARYAVLGFEATVKGKWQNGMSAQLSYCFTHERLPKDKDGKQLNNQY QPARKHTFTAHWDYSHRFNKSYQLALNIDGRALSPVDNLEYVDYYDLSKGTNTIHYPAYT LWKVSAVQQIGSIFTLTLSVDNVLNYRPQYHYLNSPLTDGANFMVGAAINL Prediction of potential genes in microbial genomes Time: Sat May 28 02:51:38 2011 Seq name: gi|283510494|gb|ACQH01000125.1| Prevotella sp. oral taxon 317 str. F0108 cont2.125, whole genome shotgun sequence Length of sequence - 71624 bp Number of predicted genes - 39, with homology - 37 Number of transcription units - 24, operones - 7 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1094 - 1561 -201 ## + Term 1675 - 1705 -1.0 2 2 Op 1 . - CDS 1742 - 4378 2385 ## BT_3275 hypothetical protein 3 2 Op 2 . - CDS 4375 - 6921 2345 ## BT_3275 hypothetical protein 4 2 Op 3 . - CDS 6974 - 8413 1371 ## gi|288929999|ref|ZP_06423840.1| hypothetical protein HMPREF0670_02734 5 2 Op 4 . - CDS 8501 - 9217 715 ## gi|288930000|ref|ZP_06423841.1| hypothetical protein HMPREF0670_02735 6 2 Op 5 . - CDS 9234 - 10688 1578 ## Cpin_1098 hypothetical protein - Prom 10709 - 10768 1.8 7 2 Op 6 . 
- CDS 10772 - 14026 3530 ## Cpin_1097 TonB-dependent receptor plug - Prom 14176 - 14235 5.0 8 3 Tu 1 . - CDS 15718 - 17136 1367 ## COG0534 Na+-driven multidrug efflux pump - Prom 17221 - 17280 4.4 + Prom 18381 - 18440 2.5 9 4 Tu 1 . + CDS 18477 - 19649 1433 ## COG0642 Signal transduction histidine kinase - TRNA 19724 - 19808 69.3 # Leu TAG 0 0 10 5 Op 1 . - CDS 19999 - 20982 939 ## COG0793 Periplasmic protease 11 5 Op 2 . - CDS 21015 - 21935 683 ## BVU_0209 hypothetical protein 12 5 Op 3 . - CDS 21925 - 24663 3180 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit - Prom 24782 - 24841 5.0 + Prom 26392 - 26451 3.5 13 6 Op 1 . + CDS 26477 - 28018 2172 ## COG0423 Glycyl-tRNA synthetase (class II) 14 6 Op 2 . + CDS 28021 - 28644 688 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 + Term 28865 - 28924 2.3 + Prom 28864 - 28923 6.5 15 7 Op 1 29/0.000 + CDS 28953 - 29822 1043 ## COG2086 Electron transfer flavoprotein, beta subunit + Prom 29999 - 30058 4.2 16 7 Op 2 . + CDS 30114 - 31130 1325 ## COG2025 Electron transfer flavoprotein, alpha subunit 17 7 Op 3 . + CDS 31174 - 31539 373 ## PROTEIN SUPPORTED gi|189500835|ref|YP_001960305.1| S23 ribosomal protein 18 7 Op 4 . + CDS 31606 - 33315 1882 ## COG1960 Acyl-CoA dehydrogenases + Term 33542 - 33586 11.8 + Prom 33635 - 33694 7.6 19 8 Tu 1 . + CDS 33720 - 34733 754 ## COG3943 Virulence protein + Term 34795 - 34837 1.0 - Term 34776 - 34831 12.0 20 9 Tu 1 . - CDS 34898 - 35797 902 ## PRU_2042 diaminopimelate dehydrogenase (EC:1.4.1.16) - Prom 35881 - 35940 1.6 21 10 Tu 1 . - CDS 36922 - 39480 1590 ## PRU_2417 hypothetical protein - Prom 39528 - 39587 2.4 22 11 Tu 1 . - CDS 39611 - 40378 545 ## PRU_2126 hypothetical protein - Prom 40600 - 40659 7.8 + Prom 40639 - 40698 7.5 23 12 Tu 1 . + CDS 40909 - 42846 1685 ## COG3291 FOG: PKD repeat + Prom 43195 - 43254 2.6 24 13 Tu 1 . + CDS 43306 - 44604 1221 ## COG1570 Exonuclease VII, large subunit + Prom 44650 - 44709 2.8 25 14 Tu 1 . 
+ CDS 44806 - 45003 399 ## gi|288930021|ref|ZP_06423862.1| exodeoxyribonuclease VII, small subunit + Term 45047 - 45087 -0.9 + Prom 45079 - 45138 2.7 26 15 Tu 1 . + CDS 45165 - 45971 294 ## Fjoh_3691 NERD domain-containing protein + Term 46197 - 46228 -0.6 27 16 Tu 1 . - CDS 46888 - 48126 1006 ## COG3344 Retron-type reverse transcriptase - Prom 48308 - 48367 7.0 + Prom 48248 - 48307 6.2 28 17 Op 1 . + CDS 48327 - 48545 74 ## 29 17 Op 2 . + CDS 48558 - 49175 890 ## COG0632 Holliday junction resolvasome, DNA-binding subunit 30 17 Op 3 . + CDS 49219 - 56913 9008 ## PRU_2040 hypothetical protein + Term 57004 - 57054 10.1 31 18 Tu 1 . - CDS 57946 - 59316 811 ## COG1032 Fe-S oxidoreductase - Prom 59360 - 59419 4.4 - Term 59553 - 59590 5.3 32 19 Op 1 . - CDS 59668 - 60066 488 ## gi|288930028|ref|ZP_06423869.1| hypothetical protein HMPREF0670_02763 33 19 Op 2 . - CDS 60098 - 62233 1597 ## PROTEIN SUPPORTED gi|51894064|ref|YP_076755.1| ribosomal protein S1-like protein - Prom 62353 - 62412 2.6 34 20 Tu 1 . - CDS 62620 - 63162 768 ## COG1592 Rubrerythrin - Prom 63307 - 63366 2.5 35 21 Tu 1 . - CDS 63927 - 64697 786 ## COG0220 Predicted S-adenosylmethionine-dependent methyltransferase - Prom 64779 - 64838 1.5 - Term 64753 - 64791 1.0 36 22 Tu 1 . - CDS 64867 - 65889 932 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase - Prom 66020 - 66079 1.9 - Term 66143 - 66173 -0.3 37 23 Tu 1 . - CDS 66190 - 67896 893 ## BF4127 hypothetical protein - Prom 68145 - 68204 1.8 38 24 Op 1 . - CDS 68216 - 69904 1286 ## BF4127 hypothetical protein 39 24 Op 2 . 
- CDS 69918 - 70631 260 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 Predicted protein(s) >gi|283510494|gb|ACQH01000125.1| GENE 1 1094 - 1561 -201 155 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAVPHRLRVSIGLPMHRLSPIIVGCALSSSLFSWFVHAHLIANNCWLCLVVFAFQLVCPC TAYRQLFLAVPSRLRVSVGLPMYLLSLVIHAQTNTNTPRCLIRSVANDNIAAYIRHRGAL NPNRSLDRLPIILGCFRRFLFIFRLLVGILPVLFS >gi|283510494|gb|ACQH01000125.1| GENE 2 1742 - 4378 2385 878 aa, chain - ## HITS:1 COG:no KEGG:BT_3275 NR:ns ## KEGG: BT_3275 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 51 758 30 741 860 490 36.0 1e-136 MKHSLLYSATSRRIFSLSLMLVVLGSSATAFPLFKRKKKAQTTQTEKKSAYERALTEHAT ESCRGAFVSLHKTDGKLLVELPQASLGRDMLIGVTISSVSNPKMGDLGFKNSNLVHVRFI EKDSAVVMQVVNTDLFIPKNSPSASIAARLNYDNLDFFSFPIKARSAGNASVLFDASSFF LKEDRFFPIIAKNVGSYSVSSSLKENLSRITALKVFERNACIKMDRHYLVSLSGKSGTPI SNYPVTIGVNFTLALLPNEQMTPRLSDTRVGMFLINKDVLASDGSVDKATFVKRWRIEPK DTAAYFAGTPTEPVKPIVFYVENTFPPLWKRAIKAGILWWNKAFERIGFKNVMQVADFPT DNPDFDPDNFAYSCIRYLPTDVENAMGPSWTDPRTGEIINATVLVYNDVVNTIDNWRLVQ TAQLDPAARAAQMPDSIVEQTLEYIIAHEVGHTLGFMHNMAASAAIPTDSLRSTTFTRKY GTTASIMDYARFNYVAQPSDSGVRLTPPRLGVYDFYAVEWAYRLFPGSKGYEDDAKQLRN LADKHEGDPFYRYGLQQTNTRYDPSAIEEDLSNDPIKAGDYGMANLRFILSNLDHWISDE DGGKRKAELYNEILQQAMRYVRHVFANVPGIYLYQTSEKSGLPRYKVVPKDQQKASAQWL LRQARTFATLGNDTIERKLPYGANKPFKILARDVQSLAMMANAKLAISYYLDSTSYSPLE YMEDVYADVFGKTIAGNENLSVADLSMQRLYVDLLQSGVADMKQAPNVHNLQADSVQTDL SAYPLRSAGAHHWCGSHACFPTTDGQTDNAATAFLNFGNAYGEPEPLWGTTVNRTSEFQL FYAQKLHSLLLSCIPKVGSPDLKAHYMLLEKRLKSYLK >gi|283510494|gb|ACQH01000125.1| GENE 3 4375 - 6921 2345 848 aa, chain - ## HITS:1 COG:no KEGG:BT_3275 NR:ns ## KEGG: BT_3275 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 46 845 30 860 860 422 33.0 1e-116 MITNKTKATARTIGMGLFLLCFGFQAFADDNKKVDDSKKAKKETKYDRLFKDKKKETARS TFLTLHKVDGKLYLEVPVKLMQREMLLSGAVSSTTDPTYITVGAKNFAPLHFFFELQDSS LVMKTPNGVVYSDGSASEEMQQALNVSYRDPVLMGFRVEAYNNDSSAVVVDATNFLARPN 
TMLPLIPKKSGDLAVNASPKSEMSFVRSIKSFEQNVTVKVDFNYLLTATLMSLPVASEIP TTVGATFSISLLPESKMRQRISDSRVGIAAVSKLTFSNAIAKSKPTFVAQRWNLTPKNAK AYAQGKGSEPVKPIRFYIDDAFPNEWKAAIKQGVELWNKAFEQAGYLKAIEAVDFPTADR AFDPDNISFSCIRYVPSASEKVSSSFMANPQTGEIINASVFVPANVGDQLYRWFFIGTAA ADPAVRTSHLPQSKFNEGLRYLVAREVGHVLGLLDNIGASSAFPVDSLRNAVFTHANGLA ASIMANTPFNYVAQPADKGVVLMPSATGIYDRHAIEWAYRYFDPAKTSVKEEAEILEKMV DKRVTNPRYRFFRTSSLAWDPRVQEGALGNDAIKASEYGLRNLQMVERNLYNWVKKDEDS RIKEKLYLTIAQQRYAFFKRVLSNVGGIYLNDMKLSSGVPRYEVVSKARQRQAMLWCLAQ AKRFKRYADPTFERKSFISVSYYDQLLEFIGYDLFGVRTRLAVSSHLSAQGYSQKEYFDD LFSAIFQSVEQQKAPSQEERVLQRAYLTYSRAVVDKANKQGGNGPAALQGEVSATAMPIA AAYGSPTASLAPTVDAALLDGSAIYFYSSLLRLKPLLEKCIKSNLSPDARSHYEMLLFKV NKALEDGK >gi|283510494|gb|ACQH01000125.1| GENE 4 6974 - 8413 1371 479 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288929999|ref|ZP_06423840.1| ## NR: gi|288929999|ref|ZP_06423840.1| hypothetical protein HMPREF0670_02734 [Prevotella sp. oral taxon 317 str. F0108] # 1 479 1 479 479 952 100.0 0 MMIKKHLLHVLALSVAVLSLQSCITDDSTSPSGNASKLSLSQDLQQKYTLDRWDTLKIAP KVVQTNAQKNVAYEWEINRKVVSTDAQLKYVCEEFGVFPCRLKVSNGDNIQFYEFELNVQ YSYVEGLYILASHGGRAIVSYLPEPQSNKSFDLDVLQKNNPNANFASEPKDIDYVLARDN KTPLIYVAVGNPSTIYELDGNLMTTRFKTTATGNVSYLKRSALTYPKSMLAMVDHVPARL TLSETSLFDLGKFIKDSLKTDMSMADAAVSWKQQDLRYVHGYVLFDNAEGRLVPQKVQAT GKIPAQLLKGTFTGDSLIGMGAVDSERNLVLMTWNKAASKFRCYYVAPGFYPSNITKVEA ATLKHAADVPTSAGFTKNSVVRVSPEKNLVYYATGNKLYAYNVLSNGNFPTSALTTFGDS GETIADMLITEGSDRLFVATNAASGSLVGSIYCFDLNENKLLWQKKNITGTIKRITYRQ >gi|283510494|gb|ACQH01000125.1| GENE 5 8501 - 9217 715 238 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930000|ref|ZP_06423841.1| ## NR: gi|288930000|ref|ZP_06423841.1| hypothetical protein HMPREF0670_02735 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 238 37 274 274 454 100.0 1e-126 MKKTFKHILLASAALTALVACSNIDYTGEYSEDGTYQSGNQAYFYFAEPGDSVLNYSFGV KPLDTLTHVVYVPVNLAGRLSADKQAFRVEIGAGSTAVSGKHYTLDADTLAFLPNSHRAY VPVKLIRENLSEDKNDSINLVLNLVPTAEMGVRFKANNRVKITFNNVLSKPDFWTVLETY WGLGAFTKNKYRKLLSYYDGNEDAIRQILTGTDATAQGYLYQRVQEVIAYFAAHPDEL >gi|283510494|gb|ACQH01000125.1| GENE 6 9234 - 10688 1578 484 aa, chain - ## HITS:1 COG:no KEGG:Cpin_1098 NR:ns ## KEGG: Cpin_1098 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 17 484 9 488 488 179 28.0 2e-43 MKRTTYWINLLAGLVAVVFATSCNDFLDVHPAGEVDESEQFGSIRGYRNAMYGVYGSMAA TNLYGKNLSYGFVDQLGQLFGYDNTSETSYFVSQYNYLRSDVRAIVDGIWEGQYQTIAYA NNVLRNAENPQFDHKELAFMRGECYGLRAFLHFDLVRLFAEDYTRSNASTRGIPYATTFD LNNKPLLTLHETFKAILKDLDTAESLLADDNEVTVEQTPTSDYLKGRATFFNKYAVAATK ARVYYAMGDTANAAKYARLVIAATSNFSLKKLTSMADVKRFPATGELIFGLYNNTLSADI SSTFLSQTARGTFTEGRRDLEELYETSAFSATSADLRYTGYYRQNTTPDGIKTYSFVRLL ESDAQVNSSPLQGLTLIRLPEMYYILSECTYDNNKAEAVRLLNVVRTSRGLEAVADAKVA DRTAFDKEMLRERMREMPGEGQTFFALKHYNKAFGDYRGINTYQPSSAIFVLPWPEREKE YGGQ >gi|283510494|gb|ACQH01000125.1| GENE 7 10772 - 14026 3530 1084 aa, chain - ## HITS:1 COG:no KEGG:Cpin_1097 NR:ns ## KEGG: Cpin_1097 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: C.pinensis # Pathway: not_defined # 120 1084 149 1148 1148 637 37.0 0 MQKTNFVLRAWAVLALVLLLGSVTTALAQTQSVDNMRVTLSMKQVTLKTFLDEVSKQTGL QFDADASQLEKAERVSIEAREQPVRAVLGQVLNQAAFAYDITGNKVKLTRKNAVERRKGV YGQITDADGQPLPGVTIRMQGFSGGYITNLDGYYEIETDQPEVKLTFNYIGYKPLEKTVK NGTAGNFVMHDDADVLGEVVVTGFATKNKNSFTGAQVSIKKDELLAVGTKNVLTSLATFV PGMNILEDNAGGSDPNKMADINIRGRATFTGQANMPVFVVDGSQVKVDYVNDMDMNDIET VTVLKDASASALYGAKASAGVIVITTKALTGGKLKVNYSSTLRLSTPDLSDYNLLSASQK LEYERLAGLYTSTDLTEQYALDKKYAQYYNQIQDGVSTDWISKPLRNALSTQQSLSIDGG DEHARYNLGVRYGNDAGVMKGSNRERLSTNFKLSYNLPGKFFVSNTATISSVKSTQSPYG DFSDWVKQNPYEYPYDETGALKPKLNYDLSNPLYEASLGSFSKGDNFDFLNTTSLQLWLG EKFRIDGDFSIQKSKYDGRTFVSPFSADQLKNVADVSRRGRLNETFTKTTTYQGKLMASY NDYLLKKLFLTAMAGASIESNSVDGSTYGSVGYYTDNIAHTSLAGSYPTGRPSGTDTKYN 
GVGFFVNANAIWDNRYFLDVIYRYEGSSKFGKNTRFAPFWSLGAGWNVHNESFLKGSPFQ LLKLRASVGYLGNISFEPYQALTMYTYLNGYNYIKGIGAVPKGIGNTNLKWERTLSGNVG VDLTVFKGRWDFSADFYVKNTDNLLLDITKAPSVGVRTSRENVGEVENRGLEFQTRVIPI QNKDWQWSLSLNYAYNKNKIKKISNALREQNEKNQAKRGLAPLPIYEEGQSLTALKVVPS AGIDPITGDEVFIKRDGSYTFVYDPNDKVVFGDSTPFGNGSLSSYLTYRQFSMGASLRYS FGGAVYNQTLASKVEGADPRYNADERVFSDRWKQVGDHTAYKRISDSSVPMQTSRFVQTN NYLTLSSLSFAYEVPLAFIQKYGLRRLHLEMLANDLFYLSSVKRERGLSYPYERSVELSV RLGF >gi|283510494|gb|ACQH01000125.1| GENE 8 15718 - 17136 1367 472 aa, chain - ## HITS:1 COG:CAC3444 KEGG:ns NR:ns ## COG: CAC3444 COG0534 # Protein_GI_number: 15896685 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 22 453 11 456 462 200 29.0 7e-51 MFDKKSNDTLLAGIRQGKTLTRNEMLSLIVGLSIPSILAQVTNVLMFYIDAAMVGKLGAA ASASIGLVESATWLFGGLCSAVSLGFSVQVAHFIGANDFVKARQVLRHALVVTLSFSLLV TLCASLIAFKLPIWLRGGDDIAHDAALYFLIYALSVPFLQLGILSSNMLKSAGNMQIPSI MSVLMCVLDVGFNYLFIYVAGLGVPGAALGTVLSIAIVASVEAWFALFRSSILALRLDKV RFVWMWSYVRNAVKISAPIAAQYVLMTGAQVVSTYIVAPLGNFAIAANTFAITAESLCYM PGYGIGDAATTLVGQSIGAGQYGLCRSFAKLTVGMGMAVMALMGVVMYVFAPEMIALMTP VDEIRALGTQILRIEAFAEPMFAAAIVGNSVCVGAGDTLKPSLMNLASMWGVRLTLAAVL AHWYGLQGVWTAMAVELTFRGMLFLARLKWGAWLRAEGDVEKSNGDGGRHAL >gi|283510494|gb|ACQH01000125.1| GENE 9 18477 - 19649 1433 390 aa, chain + ## HITS:1 COG:STM4173 KEGG:ns NR:ns ## COG: STM4173 COG0642 # Protein_GI_number: 16767427 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Salmonella typhimurium LT2 # 169 386 235 457 465 84 31.0 5e-16 MQWTDRIRQVKIILVIAATIIVTASLVVSHVLTRDLETEERHKMEIWAEAMRTLNQADEN TDLNLVLQVINANNTIPVVVLDAKGKPQTARNINLKGKTGEDSVLLVSNMGQRLLAEGRF IRISLNDSIKSEYIDVCYDDSLMLQRLATYPYVQLGIVLVFVMVAIIALRTSKRAEQNKV WVGLSKETAHQLGTPISSLMAWTEVLKETHPDDELIPEMEKDISRLQLIADRFSKIGSLP EPVPSSLNDVLAHVVSYMDKRTPKRITLKTQLPEEDIILNLNASLFEWVIENLCKNAMDA IGESEGCITLNVQKTAQKVLIEVSDTGKGIRKKDIANVFRPGFTTKKRGWGLGLSLAKRI VEEYHKGKISVKSSEVGVGTTFLIELRTGR >gi|283510494|gb|ACQH01000125.1| 
GENE 10 19999 - 20982 939 327 aa, chain - ## HITS:1 COG:CC2028 KEGG:ns NR:ns ## COG: CC2028 COG0793 # Protein_GI_number: 16126271 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Caulobacter vibrioides # 41 313 142 439 462 74 26.0 2e-13 MAAWGLLGACVDVEQYDNNPRGNFEALWRIIDEHYCFFDYKQQEYGLDWHAVYNKYAPQF DDQMTEEQTFEVLTNMLGELKDGHVNLYAPFDVGRDWSWKEDYPTHFSENVLKLYLGKDY KIAAGMKYRILNDNVAYLRCATFVNDFGAGNLDRILLYFAPCNGLIIDLRENGGGMVTSA EALAARFTNEEVLVGYMQHKTGRGHNDFSPRRQQILKPSKGLRWQKRVVVLTNRGVYSAA NEFVKYMKCLPQVTIVGDRTGGGAGLPFSSELPNGWIVRFSACPMYDRDGKDTEFGIAPT HQVGMKPTDQARGRDTLIEYARKLLAK >gi|283510494|gb|ACQH01000125.1| GENE 11 21015 - 21935 683 306 aa, chain - ## HITS:1 COG:no KEGG:BVU_0209 NR:ns ## KEGG: BVU_0209 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 56 303 20 267 281 210 41.0 6e-53 MNLSASPLWRIQTVATHASETFTCANVRQWAWAGWVLCLLLLCLSPTAQAQNELYQDPTL PTNRKITTTARMLGISRAQVLDTYLSPEHYTGPDLRYISQTQREREGRRLSQLITHTGNV AYLKNRAGSGNEIAGMYCFDYALHYGFDWLDHRLQLQVGGRVETHVGFIYNTHNSNNPAQ GRVFLHLAPSAVASYKLRAGSISLLLRYEVAVPLLGVMFSPNYGQSYYEIFSEGNYDNNI VPTTIGSAPSLRQMLTVDFPLLRSTVRLGYMGDIQQSRVNGLKTHVYTHGVVIGLVKRFT LIKLAP >gi|283510494|gb|ACQH01000125.1| GENE 12 21925 - 24663 3180 912 aa, chain - ## HITS:1 COG:BB0035 KEGG:ns NR:ns ## COG: BB0035 COG0188 # Protein_GI_number: 15594381 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Borrelia burgdorferi # 59 688 11 623 626 447 38.0 1e-125 MDDEIKDPNELEEGLDMADEKQQDENEEEGVETHSDYKPADRFDASAVHHLSGMYQNWFL DYASYVILERAVPNIADGLKPVQRRILHSMKRMDDGRYNKVANIVGHTMQFHPHGDASIG DALVQMGQKDLLIDTQGNWGNILTGDRAAAPRYIEARLSKFALDTVFNPKTTQWQASYDG RNKEPIALPVKFPLLLAQGAEGIAVGLSSKILPHNMRELCQAAIHYLKGEPFRLFPDFPT GGSIDASRYNDGQRGGSIKVRAKIEKLDNKTLVIREIPFSKNTTTLIESILKAVDKGKIK AKRVDDNTAAEVEIQVHLAPGISSDKTIDALYAFSDCEINISPNCCVIEDNKPRFLTVSE VLRHNVDSTMGLLRMELQIRKDELLEQLFFSSLEKIFIEERIYKDKKFENAENMDVAVEH 
IDNRLTPFKPNFIRELRREDIMRLMEIKMQRILKFSKDKADELLARIKEEIAGIDHDLAH MTDVTIKWFKFLIDKYGKDHPRLTEIKSFETIEATKVVEANEKLYINRREGFIGTGLKKD EFVCNCSDIDDILVFHRDGKYKIMRVADKIFVGKNVLHVQVHKRNDRRTTYNIVYRDGKE GHYFIKRFNITSATRDREYDLTQGTAGSKVVYFTANPNGEAETIRVILEPEAPPSRKKCI VDFDFSKVIIKARTSRGNIICKRPVKKIGLLEAGHSTLGGRKVWFDPDVKRINYDDHGRL LGEFNENDSILVVLDNCEFYLSDFDVNNHYEDNLKLIEKWDENKVWSAVVRDADNDKFAY VKRFTMEGSKRKQSFLGDNPKSELLFLSDQPYPRIKVVFGGSDKDRAPMEIDIEDFVGVK GFKAKGKRITTSHVDHIEELEPTRFPEPEPEPEDNGEQGNDGAESAPSAVSSSPRIKPTA SPEQPSLFQDEP >gi|283510494|gb|ACQH01000125.1| GENE 13 26477 - 28018 2172 513 aa, chain + ## HITS:1 COG:SA1394 KEGG:ns NR:ns ## COG: SA1394 COG0423 # Protein_GI_number: 15927145 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase (class II) # Organism: Staphylococcus aureus N315 # 10 502 8 460 463 467 49.0 1e-131 MAQEDVFKKIVSHCKEYGFVFPSSDIYDGLGAVYDYGQNGVELKNNIKEYWWKSMVLLHE NIVGIDSAIFMHPTIWKASGHVDAFNDPLIDNRDSKKRYRADVLIEDQIAKYEEKIDKEV AKAAKKFGDSFDEAMFRQTNPRVQEHQAKRDALHERYAQAMQGPNLEELKQIIEDEEIVD PISGTKNWTDVRQFNLMFSTEMGSLSDAASKIYLRPETAQGIFVNYLNVQKTGRMKVPFG IAQIGKAFRNEIVARQFIFRMREFEQMEMQFFVTPGTELDWFHKWKETRMKWHQALGFGA ENYRFHDHEKLAHYANAASDIEFRMPFGFKEVEGIHSRTDFDLSQHEKFSGKSIKYFDPQ TNESYTPYVIETSIGVDRMFLSVMCHAFEEEKLENGETRTVLKLPAALAPVKLAVLPLVK KDGLPEKAREIVNNLKFKFNTHYDEKDTIGKRYRRQDAIGTPYCVTVDHDTLTDNCVTLR FRDNMQQERVNIDQLENIIEDKVSITTLLKKLQ >gi|283510494|gb|ACQH01000125.1| GENE 14 28021 - 28644 688 207 aa, chain + ## HITS:1 COG:ECs5185 KEGG:ns NR:ns ## COG: ECs5185 COG0545 # Protein_GI_number: 15834439 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Escherichia coli O157:H7 # 67 202 147 259 259 63 35.0 2e-10 MTRKSFLLMAAAAFMAFAFTACSETDNAADDFPNWQAANEAYFDSIYNVAKANTAQWQVI PSITLPKEAVKNKTDNVAAEVINAGEGSDRAFQNDTVRVILQGRLKASPSYPQGKVFHQT YVGSDDNTTANAAKLAVSSVAPGLQTALQNMPVGAKWRVYVPYQLGFGSATGSGTIAPSL SKTVSVPPYSTLVYTVEVVSIIRQGHK >gi|283510494|gb|ACQH01000125.1| GENE 15 
28953 - 29822 1043 289 aa, chain + ## HITS:1 COG:FN0784 KEGG:ns NR:ns ## COG: FN0784 COG2086 # Protein_GI_number: 19704119 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, beta subunit # Organism: Fusobacterium nucleatum # 3 288 1 261 262 183 42.0 4e-46 MSLKIVVLAKQVPDTRNVGKDAMTAEGTVNRAALPAIFNPEDLNALEQALRLKEQHPGST VGVLTMGPPRAGEIIRQGLYRGADTGWLLTDKLFAGADTLATSYALATAIKKIGDVDIVI GGRQAIDGDTAQVGPQVAQKLGLNQVTYAEEILKVEDGKLTIRRHIDGGVETVEAPMPCV VTVNGSAAPCRPCNAKLVMKYKNARCPMERNGKEPHPELFDLRPYLTLNQWSVADVDGDA AQCGLSGSPTKVKSVQNIVFQAKESKTLTAADADIEAMMKELLDEKIIG >gi|283510494|gb|ACQH01000125.1| GENE 16 30114 - 31130 1325 338 aa, chain + ## HITS:1 COG:CAC2709 KEGG:ns NR:ns ## COG: CAC2709 COG2025 # Protein_GI_number: 15895966 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, alpha subunit # Organism: Clostridium acetobutylicum # 4 336 9 332 336 265 45.0 1e-70 MNNVFVYCEIEDTTVAEVSQELLTKGRSLANELGVQLHAVVAGTGIKGHVEGQILPYGVD KLFVFDAEGLFPYTSAPHTDILVNLFKQEKPQICLMGATVIGRDLGPRVSSSLTSGLTAD CTQLEIGDYDDKKAGKHYDKLLYQIRPAFGGNIVATIVNPDHRPQMATVRSGVMQKKVLD ENYKGETVYPDVTQYVQPESYVVKVLDRHVEEAKHNLKGSPIVVAGGYGVGSKEGFDLLF KLAKELHGEVGASRAAVDAGWADHDRQIGQTGVTVHPKVYIACGISGQIQHIAGMQDSGI IISINTNPDAPINKIADYVIVGTVEEVVPKLIKYYKNK >gi|283510494|gb|ACQH01000125.1| GENE 17 31174 - 31539 373 121 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|189500835|ref|YP_001960305.1| S23 ribosomal protein [Chlorobium phaeobacteroides BS1] # 1 116 17 131 143 148 59 9e-35 MTRSFKDLIVWQKAHSFVVLVYQATKSYPKFELFGLCSQFQRAAVSIPANIAEGYRRDGK ADKLRFLNFAQGSLEECRYYILLSKDLCYINIETYDLLTTSIEETSKLLNAYYKGVKEHM P >gi|283510494|gb|ACQH01000125.1| GENE 18 31606 - 33315 1882 569 aa, chain + ## HITS:1 COG:CC3393 KEGG:ns NR:ns ## COG: CC3393 COG1960 # Protein_GI_number: 16127623 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: Caulobacter vibrioides # 54 445 56 459 603 196 33.0 2e-49 
MANYFSDSPELEFHLHHPLMKRIVELKERNFEDKDKYDDAPVDYDDAIENYKQLLEITGD ITANVIEPNSESVDEEGPHLVDGRMVYASKTFENLDATRKAGLWGISMPRRYGGLNLPIT PYSMASEMMATADAGFQNIWSLQDCIETLYEFGDEEQRKKYIPRVCQGETMSMDLTEPDA GSDLQSVMLKATYSEKDGCWLINGVKRFITNGDSDIHLVLARSEEGTKDGRGLSMFIYDK RQGGVTVRHIENKLGIHGSPTCELVYKNAKAELCGETRLGLIKYVMALMNGARLGIAAQS VGVSQAAYTEGLNYARERRQFETNIIDFPAVYDMLARMKAKLDAGRSILYQTARYVDVYK ALDDIARERKLTSEERAEQKKYMRLADAFTPLAKGMNSEYANQNAYDAIQIHGGSGFIME YKSQRLYRDARIFSIYEGTTQLQVVAAIRYITNGTYLGIMKEMLEDEVAENLQPLKERVA KLVERYEESVNNVKECGNQDRQDFLARRLYEMTAEIFMSLLIIADASKAPELFEKSANVY VRLTEEAVIGNHAFIVNFKVEDLEYYKSC >gi|283510494|gb|ACQH01000125.1| GENE 19 33720 - 34733 754 337 aa, chain + ## HITS:1 COG:NMA1039 KEGG:ns NR:ns ## COG: NMA1039 COG3943 # Protein_GI_number: 15793995 # Func_class: R General function prediction only # Function: Virulence protein # Organism: Neisseria meningitidis Z2491 # 10 337 3 331 336 288 48.0 1e-77 MKDKKTNNHLIIYQDDNGLVKVNVRFADEDVWLTQGQLADIYDTTQQNIALHIKNVYADA ELEETATHKKYLLVRQEGGRTVQRNIDHYNLDMIIALGYRVQSQVATRFRRWATERLHEY IQKGFSMDDDRLMQGGNRYFRELLQRIRDIRASERNFYQQVTDIYATSVDYDPRTELTHT FFATVQNKLHYAVHEHTAAEIIFDRVDSDKPLVGMTNFKGDYITKDDVKIAKNYLTQKEL QRLNLLVSQFLDFAEFQALDEHPMRMVDWITALDNQILSLQRTVLEGKGRISHNEAIEKA EREFVIYRQREMARLESDFDKMVKRFPKRGQHPTKEE >gi|283510494|gb|ACQH01000125.1| GENE 20 34898 - 35797 902 299 aa, chain - ## HITS:1 COG:no KEGG:PRU_2042 NR:ns ## KEGG: PRU_2042 # Name: not_defined # Def: diaminopimelate dehydrogenase (EC:1.4.1.16) # Organism: P.ruminicola # Pathway: Lysine biosynthesis [PATH:pru00300] # 1 299 1 299 299 482 79.0 1e-135 MKKIRAAVVGYGNIGQYTLQTLEVAPDFEVAGIVRRQGAKERPMELEKYNVVSDIAELHD VDVAILALPTRECPTYAKKYLAMGINTVDSFDIHTNILNYRAELMPVCKEHGRVSVISAG WDPGSDSIVRMLMQSLAPKGLSYTNFGPGMSMGHSVCVRSKKGVKNALSMTIPLGEGIHR RMVYVELEDGARLEDVAAEVKADPYFANDETHVFAVKSVDEVRDMGHGVHLVRKGVSGKT QNQHFSFDMSINNPALTAQVLVNVARASMRLQPGCYTMVEIPVIDMLPGDREELIGQMV >gi|283510494|gb|ACQH01000125.1| GENE 21 36922 - 39480 1590 852 aa, chain - ## HITS:1 COG:no KEGG:PRU_2417 NR:ns ## KEGG: PRU_2417 # Name: 
not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 851 1 849 850 267 25.0 1e-69 MKKACLMAIAILWTFVTCAQVVKVEGRLTDTNGQPMQWVVVKHVDAESNKMLNYCQTDAE GRFSIEAKVGNSLQVSSLGYKARRIEVKEGMPFQKLTMTDDAVTLKEVNVKSEKVKLSGD TIKYLLATYAQAGDRTLADVLRRVPGFEVNKESGQISYGGKPISNFYIEGLDLLGAKYGV ATNTLPQGEVASVEVIKHHQPVRVLEEFTYTNDDAVNIRMKDGAKAHWIVTAKGTVGIKS KGTLWETESFAMRIKPQWQLMATYKSNNLGKPIGKETTNLFNFDEISSRLTDVIQLPVAN SGFLGRRSLFNRSHSFTLNTLKRINDISQVNLQLIYENNEENTWERRVTEYYRAGDTRLT DNQKHGLTRENNLHALLKYENNANNHYLKNKLQGDLEWRKQWLNETGTNAHIVYAKLPKF TVRDDLYLIKRWGKHLLSFNSGSIYETRPQRIQADSALQTLTQHYFETNTSLHGALLIDK VKLSTSLGVNAVVHNLQSNLYGVPDSVWANKIGDGRFGFCQLYANPEAEYRLRDVRFTLS AFLEHNMYWYRGHGGQSHAYVSPGLTIQWNATPRLEFRCSASEYTSQVDINRFFDVLILQ DYEYASRGYNGYAVCRERNVRLRMLYKNALQATHFNVNVTRTQGTNPYTTSREFVGKYIV LSLLPFETRYNMWKVTSMLSKGFSLFNTKLQTMVDYNHTNTYISQDGMRMPLQTDGLGLK TSLGMVLWKGMDFNYALSYRWSKMYMPDWDKRSSLNNWHHEAQATIPLVGAMKLETNLEW YRNQLPDRGYKDMVFVDLALGYVGRHIDCKLKLTNALNKSSYAYSMNNDLVRTTTDMRVR GRELVLTLTYRP >gi|283510494|gb|ACQH01000125.1| GENE 22 39611 - 40378 545 255 aa, chain - ## HITS:1 COG:no KEGG:PRU_2126 NR:ns ## KEGG: PRU_2126 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 80 194 66 181 264 84 36.0 3e-15 MKTMLIKVLLCGLPISLWAQATHVIEPAILEVNYKVVYGKDKDEYALRCGKHTSQYINID KYVSDSMSNSPNGPDYDMERLRRMMDALEHRDDDSKRMPYSPGCGQYLYRNLTQGKITVY TSLFGDYCFYEEPIPQMDWHLVADSTRTIVGQQCLMAKTRFRGREWTVWYAPEIGIPLGP WKLAGLPGLILKAHCENYIDLEANSLQTKGVPPVMFYNFYNHKYTLIAREKYLKASTDPK TYPPKTIIAPPMELE >gi|283510494|gb|ACQH01000125.1| GENE 23 40909 - 42846 1685 645 aa, chain + ## HITS:1 COG:MA4312_6 KEGG:ns NR:ns ## COG: MA4312_6 COG3291 # Protein_GI_number: 20093101 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 212 466 17 275 382 93 29.0 1e-18 MKRTLLLLLCCLCFVAYGQSLQANFYALGGTTPYYRMGWDSVEEANQWKYFALNNTATWQ LYEKPSWKGLKPFSYFNPNSKFSLGIGYSNKKQKETAQSPQITIKPNSTCSFYACFSPGL 
LVFAKWKLLITDVDTKETEELLDAFKWSQDNAFDGPNWNKFNIKLSAYAQKKVQFSFVYE GSGGEDVFIDGFEVSQNDENATSIHVVEGEEVAFYNSSVGKVTSYNWKFEGGTPNVSTDE TPKVVYAKAGTYPVTLTVSNGTEAHSFTREAYVNVSKKAPQAFIGVEPGGYLSPFVGRFI ALGSSLTFKDLSKGNPTQWQWSFNGANIEQSNEQNPTVTYNKQGVFSVGLRATNDAGTSK DALVNYIQVGGQQHIWNITPEENPNLNVIGLGFYGFYAGSNWLGMSKFAEHFDAPQAKAE VKSVDVYFGSTVTVSPNAEISIALTLADDKGMPGQVLATSTLKASELVYDNNAIKPTTFT FNTPATVEKEFFVVIGGLPNNDEGNNSDKIAILCLRRQAKEKSTTYHLLADEDGNNKPTG TYTWYRNDDEPLSMAVCPLLSYDLSTTALPRNEANVGETPLRFDGSNLLTAADYDAIEVY AMHGARVLSATKPPHVLSLRHLPAGVYVVKATRGKVQDTLKVVKE >gi|283510494|gb|ACQH01000125.1| GENE 24 43306 - 44604 1221 432 aa, chain + ## HITS:1 COG:DR0186 KEGG:ns NR:ns ## COG: DR0186 COG1570 # Protein_GI_number: 15805222 # Func_class: L Replication, recombination and repair # Function: Exonuclease VII, large subunit # Organism: Deinococcus radiodurans # 3 278 25 305 416 152 37.0 9e-37 MRKETLTLYELNQMVRDALAITMPDEYWVEAEISEMREIRGHCYMELVQKDAAGNTPVAK ASAKCWKNKWAFLRPHFERNANQILRAGLKVRLLVYADFHEAYGFSWIVADIDPTYTLGD MARKRLEIVQTLKQQGVFDLQKDFRLPLFAQRVAVISSEQAAGYGDFCRHLHENEWHLQF SVQLFAATMQGEGVERSVIAALNAINERLADFDVVVIIRGGGATSDLSGFDSLPLAENVA NFPLPIITGIGHDRDESVLDMVSFRRVKTPTAAAAFLVEHLAETYQRILDAQEEMVHLVQ RRMELERIQLARLTEKVPMLFSLVRTRQEKRLENLALRMQNTITTTLQRQDLRLQRLALA LPTHAERLLTKAAHRLELLQQAVKSHDPSLLLAKGYSITTRGGKVVRNANDLQKGDVIVT QLQKGKVKSVVE >gi|283510494|gb|ACQH01000125.1| GENE 25 44806 - 45003 399 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930021|ref|ZP_06423862.1| ## NR: gi|288930021|ref|ZP_06423862.1| exodeoxyribonuclease VII, small subunit [Prevotella sp. oral taxon 317 str. 
F0108] # 1 65 1 65 65 69 100.0 7e-11 MAKQEIKYEEAVAQLEAIVRRMETGELNLDSLADELKKAQKLIKMCKDKLTKTDEEIKKM LEEEG >gi|283510494|gb|ACQH01000125.1| GENE 26 45165 - 45971 294 268 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_3691 NR:ns ## KEGG: Fjoh_3691 # Name: not_defined # Def: NERD domain-containing protein # Organism: F.johnsoniae # Pathway: not_defined # 27 265 34 258 259 151 36.0 2e-35 MQSFLSILPLVIIISLFSLGWWYSSPKQKGKRGEQRVNDILSHLPKEYHILCDIVLKTKK GTTQIDHIVVSKYGIIAIETKNYRGNIYGDDNRKEWTQLIVTKVTFAKKWWKTYTYVTKK HFYNPVKQSAGHALKIKELLTAYPHVKIVPIVVFTGGAVLNNVKSNHHVIYEDKLLEVIS GYKATYLTDNDVQTILTILNENNIRESVSDRQHIKNIQATAREVNETIKQGVCPKCRGHL IERRGKYGTFYGCSNYPQCRFTVRQRYK >gi|283510494|gb|ACQH01000125.1| GENE 27 46888 - 48126 1006 412 aa, chain - ## HITS:1 COG:MA2102 KEGG:ns NR:ns ## COG: MA2102 COG3344 # Protein_GI_number: 20090946 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Methanosarcina acetivorans str.C2A # 9 286 41 295 563 133 30.0 7e-31 MARTDRDVNLKALIGNQLQYIYSADELTNVLNTLASNIGVFAGFRRNQMLCFADTSKDNR RAKRYRMHKKHGGTRAIVAPHQSLLYMLKGFNALLQQYYEPTTWCYGFVKGRSVVQNAQQ HLGKRYILNIDIKDFFPSITRQMVEEVLLAEPLKCSAEAARLLSGLCTAAEPQGDVLPQG FPTSPTLSNMVCKKMDEELAAAAQRIGATYSRYADDMSFSSDKDVLRPTGSFYLQVSTIV EKYGFRLNDRKTRLQRRGRRQQVTGVVVSHKVNVTREYARQIRSLLYMWERYGYWETARA ALRAYREQHGKTRGHGEYLTLYAVLRGRLNYMKMVKGETDPTYLKLWNKYNQLMQRDVPK RRRGVYGTNAHRPEPSRANDYLGEPQRKGCAGVVALFVALTLAGLLACNLLA >gi|283510494|gb|ACQH01000125.1| GENE 28 48327 - 48545 74 72 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIYLHFGIYAAISLHKKGTNDVSRTANATFEHTISLCKGTYTPFFRLQMQFFMRTHNNAQ RESLYNNVYVHI >gi|283510494|gb|ACQH01000125.1| GENE 29 48558 - 49175 890 205 aa, chain + ## HITS:1 COG:CT501 KEGG:ns NR:ns ## COG: CT501 COG0632 # Protein_GI_number: 15605230 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, DNA-binding subunit # Organism: Chlamydia trachomatis # 1 202 1 198 200 108 36.0 5e-24 MIEYIKGDLAELTPAMAVIEAAGVGYALNISLNTYTAIQGKDKVKLFVHEALVTGGRDDS 
YTLYGFATRQERELYRLLISVSGVGANSARMILSGMSPAELCNVIAAGDDKMLKTVKGIG AKTAQRIIVDLKDKIVTSGVAQELSVPTNANATVMNTAVKDEAVGALTMLGFAPSASAKV VVEILKEQPELPVEQVVKAALKMIK >gi|283510494|gb|ACQH01000125.1| GENE 30 49219 - 56913 9008 2564 aa, chain + ## HITS:1 COG:no KEGG:PRU_2040 NR:ns ## KEGG: PRU_2040 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 166 1937 2 1772 2210 2388 67.0 0 MVVLVICLTGYALALPFVQDNRGRRQNNRTQRNANTAPARNANTPQNNTPQRDVERPTLP EDEQPEDTTLVIDNEDTPETQTNDNPQTAPATPLWRVQPTTPTTYGDLWQNAMDLKRPEN MKQGVEYNDTLDRYLLGTKWGRTYLGTPIMMTPDEYRRWSERQARNSFFRKKNDEIYKAK GKEKFDFTDMHFDLGPAEKIFGPGGVRIKTQGSAEIRFGATLKNIDNPSIPIRNRKTTTM NFDEKININMNGKVGDKVNMNLNYNTDATFDFDAQSLKLKYDGKEDEIIKLVEAGNVSFP ANSSLIKGASSLFGIRTDLQFGKLKLQLVASQKKSSSKSVSSRGGVQLTPFELDAANYEE NRHFFLSHYFRDKYDEWMASLPTVKSGISINRVEVWVTNKTGTTTNTRNIVALTDLGEVS HISNPLWSASGLVPANNANSEYPAMVSTYVAARNIDQTSTTLDGIAGFVGGNDYEKLQNA RLLQPSEYTVNTTMGYISLRQGLQTDQVLAVAYEYTYGGNTYQVGEFAADNTDTNQALFV KSLKNTSNNPRQGNWHLMMKNVYYLATSVEKERFRLDIKYQSDTTGVYLTYIPETQVKDQ PLIRVMGADRLDNNNKVHANGYFDFVEGYTISNGRVFLPKTQPFGKHLYNYLRAKGVPDA VARSYTYDQLYDSTKTIAKQIAEKNKFILTGQFRGTSANVISLGAYNVPQGSVVVTAGGV RLTEGVDYTVDYYAGEVTILNQSILDAGTAVNVSLESNTDYGQQRKTMFGLNWEYNFSKD LQLSGTLQHLTEQALTTKVNMGAEPLNNTLWGLNLNWKKESQWLTNLLDKIPFLHVTQPS QISFTGEFAHLIAGRNKGTQDNASYLDDFETTKSSLDVSTPTSWTLASTPSMFAEHNDKA TLNSGFNRSLLAWYTIDPLFTRRSSSLTPSHIKGDLQQLSNHYVREVPVRELFPKRDQNS YGGVSTLSVLNLAFYPAERGPYNFNPNLNPDGTLDNPDKRWGGMMRKLDTNDFEAANIEY IEFWMLDPFIYSNRQPNANDYGGDFYLNLGEVSEDVLRDGKKFYESGMPVDGSQSFVTTQ WGKIPTQATTTYAFATSRGARALQDVGFNGLNDNEERSFPAYQSFLNAVRGRVNAAVFDS IQADPANDNYHYFRGSDYDRAELPILQRYKRINNPQGNSPDSDQRSESYDTSYKSTPDVE DINQDYTLNEYEKYYQYHVSLRPQDMVVGRNFIVDKREASASLRNGTNENVTWYQFRIPV REYEKRFGSISDFSSIRFMRMFLTGFKRPIVLRFGSLDLVRGEWRSYEQNLSNTASQTGR MSVSAVSYEENSDKTPVNYILPPGIERGADPNQPQMVENNEQALSLVVDNLGQGEAKAVY KNTTIDLRQYRRLQMFVHANALEPNATALANGELSVFIRLGSDYKNNFYEYDIPLSLTPA HRYNDRLWADSRAVWPEENMLDIALSVFTNLKKARNKARATGLASFTSLYSAYDVDKPKN KVSIMGNPTLGEVKTIIIGVRNNAATQKSGEVWVNELRLKEYENEGGWAAQGNLNVQLSD 
FGTLNLQGRYVSEGFGGLEDGVADRAKDNFKAYSVTTNLELGKFFPEKAKVSAPLYYSRT QEESSPKYNPLDTDLRLKDALDAANRQERDSIKSIAIRRTTNTNFSLSNVRIGIQNKRHP MPYDPANFSFSYSNSSRFTSGETTVYERENNWRAAMAYNWTPVYKPLEPFKNIKSKSKWL DILKRFGLNWLPQNLSFNSDINRSYYELQERDMESKGGDRLPLTFNSQFLWNRDFAIRWD LTRNLHANFQSATHAEVEQPYTPVNKDLYAERYQAWKDSVWNSIRNFGTPLDYNQSFTLS YQPPLNLLPIFNWVNADLTYNATYRWVRGTQLEDGSSLGNTIANNRDLNLNGTFDLVRLY NQIPFLKKANERFNREPNTAELSRKREERQAEQRRKAQERKAMEARMKGKGVTPNGQKID NNETLAQQQRNELPKNRNSFEQEITLMPDSTFIISHNKRSKRLMVTARTEDGKRFPLKFK TSGENSIRILSKVDSALRLKLTVTPRQPQEKQSWYETAQTVARVLMMVRNVSISYRNQYA ISLPGFMPRIGDAFGQARLPSALSPGLDFAFGLTDDSYINKARNMGWLLDNDSIATPAAT TKTEDLQLRMTLEPVRNLKIDLHASRTQTTARNIQYMYAGTPTTQNGTFMMTTISIRSAF EGMGSAANGYRSASFERFCNALDAFRNRVESQYVGARYPAGTSMAGAVFNPANGAVDRYG SDVMIPAFLNTYTAMGGSSLNIFPTLARLLPNWTLRYSGLGKLPWLRDHFKSVNINHGYK SIYAVGAYASYNTFVEYMNGLGFISDVTTGNPLPRSLYNVANVSINEAFSPLLGVDMTFN NNLTAKVEYRSTRVLNLSMTSVQINEAVSRDWVVGMGYRINNFKLFGMRAPRAKKTKTKR TDTNESTATSSSSQGNPNHDLNMRLDLSYRSQASISRDIATRTSAASSGNTAIKISFSAD YTLSRLLTMSFFYDRQTNTPLLSSNSYPTTTQDFGLSLKFSLTR >gi|283510494|gb|ACQH01000125.1| GENE 31 57946 - 59316 811 456 aa, chain - ## HITS:1 COG:slr0309 KEGG:ns NR:ns ## COG: slr0309 COG1032 # Protein_GI_number: 16331878 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Synechocystis # 1 404 9 411 473 168 26.0 2e-41 MRRPLRVKMILPSLDEAKSPYWRPIKYSLFPPIGLATLAGYFHDDDEVVLLDQHVEKLDL TDTPDLVCIQVYVTNAYRAYAIADSYRSRGVYVVMGGLHITALPEEAALHADTIIIGPGE EAFPRFINDFRNGCAQKRYSAKWRSIEDIPPVRRDLIKRSKYFVPNSLVVSRGCPHHCDF CYKDAFYEGGKFFYTARVDAALKEIDALPGRHLYFLDDHLLGSKRFAAELFEGMKGMNRV FQSAATVQSILEGDLIEKAAEAGMRSVFIGFETFSPENLKASNKCQNLQRDYSAAVQRLH SLGIMINGSFVFGLDHDDADVFRRTVDWGVDNAITTATYHILTPYPGTRQFQRMETEGRI LTYDWSKYDTRTVVYQTRGLTAEQLKEGYDWACRSFYSWNNILRACSHDTTLTQRITHLM YTGGWKKFEQVWGALVRVGGLNKALPLLEQLLSHTK >gi|283510494|gb|ACQH01000125.1| GENE 32 59668 - 60066 488 132 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930028|ref|ZP_06423869.1| ## NR: gi|288930028|ref|ZP_06423869.1| hypothetical protein 
HMPREF0670_02763 [Prevotella sp. oral taxon 317 str. F0108] # 1 132 10 141 141 250 100.0 2e-65 MALFALVGYGQERKSFKGHLYNSEYNVYMRINFHDKDVVARGQELFGKLPGYLGSKFDTR LWLITDVKLNKDKEAKLFIINDYGSEDLEAKLEKENDSTYTLKQLEGSPIKIVVDRKWVK LPKVMQFKVKSY >gi|283510494|gb|ACQH01000125.1| GENE 33 60098 - 62233 1597 711 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|51894064|ref|YP_076755.1| ribosomal protein S1-like protein [Symbiobacterium thermophilum IAM 14863] # 8 709 10 720 764 619 48 1e-176 MNEYANIIATSLELNQTKVANTLALLDEGCTIPFISRYRKEKTGGLDEVQIANISQWKDK LTELTKRKETICKTIDEQGKLTPELKSRIDNTWDSTTLEDIYLPYKPKRRTRAQVARQQG LEPLAQTILLQREPNPQSVARAYVKGDVKDVEAAIKGAQDIIAETISENEQTRQQVRNAF KREAIISSKVVAAKKDEEGAQKYTDYFDFSEPLRRCNGNRLLAMRRGESEGFLRVSISID GEETTERLQRHYVRGRGACAQLVEEAVADAYKRLIEPSVENEFAAASKEKADEEAIGVFS LNLRQLLLAAPLGQKRVMGVDPGIRTGCKVVCLDAQGNLLYHDVVYPFTPRGNEQAAKTQ FQKIANRFKVDAIAVGNGTASRETADILRQAGNAAEPRIPVYVVSEDGASVYSASKTARD EFPDEDVTVRGAVSIGRRLMDPLAELVKIDPKSIGVGQYQHDVDQGKLKKSLDTTVESCV NLVGVNVNTASVHLLTYISGLGPTLAKNIVDYRRDNGAFTSRAQLKKVPRLGPAAFEQCA GFLRVDGAKNPLDNSAVHPECYTVVESMAKDMGCSVGQLIGNNEKIRQIPLAKYVTAEVG MPTLNDIVAELEKPGRDPREDLEEFAFDERVHTIADLVPGMLLPGIVTNITKFGAFVDVG VHQDGLVHVSQLANRYVADPAEVVKLHQHVQVKVVEVDTRRNRISLTMKGV >gi|283510494|gb|ACQH01000125.1| GENE 34 62620 - 63162 768 180 aa, chain - ## HITS:1 COG:CAC2575 KEGG:ns NR:ns ## COG: CAC2575 COG1592 # Protein_GI_number: 15895835 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 1 180 1 195 195 182 50.0 3e-46 MKDLKGTKTEKNLQEAFAGECMARTKYTFFASKAKKDGFEQIAAIFEETAHNEKEHAELW YKYLHGGEIGNTSENLKIAAEGENYEWTDMYDRMAKEAEEEGFKELAVKFRKVGEIEKTH EARYRKLLKNIEDALVFSRDGDCIWVCRNCGHVVVGKKAPQVCPVCLHPQAFFELKPENF >gi|283510494|gb|ACQH01000125.1| GENE 35 63927 - 64697 786 256 aa, chain - ## HITS:1 COG:CAC2627 KEGG:ns NR:ns ## COG: CAC2627 COG0220 # Protein_GI_number: 15895885 # Func_class: R General function prediction only # Function: Predicted S-adenosylmethionine-dependent methyltransferase # Organism: Clostridium 
acetobutylicum # 30 223 23 210 211 110 33.0 3e-24 MSKGKLEKFAEMETFSNVFQYPYSVIESTPFAMRGQWRCDYFHNDNPIVLELGCGKGEYT VELAKLYPEINFIGVDIKGARMWKGAKMALEQGLKNVAFLRTNIEIIDRFFAPDEVQELW LTFSDPQMKNVHKRLTSTFFLERYRRFLENNGVVHLKTDSNFLFTYTTHVVSANKLNLLF RTEDLYNTPGIDEQTAKILSIQTYYESMWIERGLNIRYMKFELPHEGTLIEPDVEIPLDE YRSYHRNKRSSLDARK >gi|283510494|gb|ACQH01000125.1| GENE 36 64867 - 65889 932 340 aa, chain - ## HITS:1 COG:HI1193 KEGG:ns NR:ns ## COG: HI1193 COG0115 # Protein_GI_number: 16273115 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Haemophilus influenzae # 1 338 1 342 343 321 48.0 1e-87 MKNLDWAGLNFGYIPTDYNVRCYYRDGKWGEIETCSDEYLKLHMAATCLHYGQEAFEGLK AYRCPDGKVRVFRMGENAVRLQETCKGIMMPTVPTALFEEMVTKVIRLNQDYIPTYESKA ALYIRPLLIGTSAQVGVKPAEEYLFLIFVTPVGPYFKGGFSTNPYVIIRSFDRSAPLGTG MYKVGGNYAASLRANALAHEKGYASEFYLDAKEKKYVDECGAANFFAIKGNKYVTPKSTS ILPSITNDSLMQLARDLGMEVERRPIPEEELSTFEEAGACGTAAVISPISYLDDMETGQR YTFSADGKPGPVSTRLYNLLRNIQYGIEEDKHGWTTVVIE >gi|283510494|gb|ACQH01000125.1| GENE 37 66190 - 67896 893 568 aa, chain - ## HITS:1 COG:no KEGG:BF4127 NR:ns ## KEGG: BF4127 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 8 252 7 269 271 112 30.0 4e-23 MSSMNDNFSFNRFWAYFTKLLVERWQTNKMRLAILFGSIAMIELWIAFATYSRNSSSTDK AVDILLVVFVIVLFILGAMSASEMLSGAQRKAERISALTFPVTPFENWLARWIICLPLFL VCFLVCLYLADGLRVLLVGMLYPQIELKFIPLWGDSESTCDFAAAIWLSYFACTAIYSLG SVFFVKRALLKTTLSLFLLYWIVGFLPFFFIFKHSLLTSSYGEAILIWYSWLSVLFVWWL SYLCFKDMEVVDSAALTRTKLWVVGYFVLTFLLLCAGALIPFSQNDKATDIVEDRIFVYL DRTTEKRIAPVSIIMYDATNRADKNIGGSLRLAVEVNVVSDVSQCSVIYPEQLITVKQRG DTLYYALSDSLEDDDFSRLECLNGPDGDANSREAGDIGQMPYLIKGADGKVRTAIDQNYT NDKKVRIIVNTLPGTLWVKQWGRQRTLLGAGDMKNVTVEGGEELILTSKVRVDNLTVKGK SSIDIGDAYINKMVLKSIGDDEYNRPMNVNLKAGNKGLVNLMVINSSASVDGLTDVRCKR IELTPAKDRKMNVEIRGIWHKTVIQGNE >gi|283510494|gb|ACQH01000125.1| GENE 38 68216 - 69904 1286 562 aa, chain - ## HITS:1 COG:no KEGG:BF4127 
NR:ns ## KEGG: BF4127 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 5 249 7 269 271 108 29.0 4e-22 MNDNFNASRFWAYLNKLLVERWRTNLLRVAILFGCMVIIELWIASVSYDRDETSYDRAVG LLFPIFSLILLVSGCFFASEILSGARRKAERIGALTFPVTPFENWLARWTISIPLYLVCF LVCMYVADGLRVAFFSAIYPKTPIEFISIWRREDYDGEVFLNAWLMYFWCTAVYALGSIL FVKRSILKTTLTLFILFWITAFVVLGLVMVSSGNYDNIFESVLVAYGWVSVPFLWWLSYS CFKEMEVVDGFSFNGSKAWVVGYCMVSVLFISIWSMFDKSALDAVGQHSVDETAIFLERT VEKPIAPIAVVQFEDSSSTDLAEEYNLTLAVEVNYVSDASKCSVIYPEQVIQVRQQGDTL YYTYNPELYKRDREDMYFVNGIDGMADASSACYEGKIAYQIKGADGKVRTAVDKHYDDAN KVRIVINTLPGNLAVRQTSRQSTLIGAGKTKGVSVEGGFSLELDSLSHIDALSVVGMRVF KVGTAQVNRLVVTLRKNEDDEGEMETRFNTDYPAVINHVVVKGAGKLSGLSKMACKRIEL LPDTVGSITLEVKDVKRKMVLE >gi|283510494|gb|ACQH01000125.1| GENE 39 69918 - 70631 260 237 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 2 218 5 226 305 104 30 1e-21 MLEIDNLSFTYQKSGNVIFSNFSTRFSPGHIYGLLGPNGAGKTTLLYLMSGLLRPDEGQV TFNDINVRLRKPSTTADIFIVPDEYELPRVSLSEYIRVNSVFYPRFSHDDLRRYLDVFGM TEDTHLGNLSLGQGKKVFMSFAMATHTRVLLMDEPTNGLDIPGKAQFRKFLTEGRTPESI FVVSTHQVKDIEQVLDHIVMFDNSQILANADMKSIFPDGKIDLERYFNERINQQTAE Prediction of potential genes in microbial genomes Time: Sat May 28 02:54:09 2011 Seq name: gi|283510493|gb|ACQH01000126.1| Prevotella sp. oral taxon 317 str. F0108 cont2.126, whole genome shotgun sequence Length of sequence - 16402 bp Number of predicted genes - 15, with homology - 13 Number of transcription units - 9, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 1489 - 1548 3.6 1 1 Tu 1 . + CDS 1603 - 1938 232 ## gi|288930037|ref|ZP_06423877.1| hypothetical protein HMPREF0670_02771 + Term 1948 - 1993 4.6 + Prom 2268 - 2327 1.9 2 2 Tu 1 . + CDS 2347 - 4407 2562 ## BF2030 putative TonB-dependent outer membrane receptor protein 3 3 Tu 1 . 
+ CDS 4512 - 4841 316 ## BT_1092 putative heavy-metal binding protein + Term 4910 - 4941 -0.9 + Prom 5210 - 5269 3.2 4 4 Op 1 14/0.000 + CDS 5293 - 5982 657 ## COG0688 Phosphatidylserine decarboxylase 5 4 Op 2 . + CDS 5998 - 6720 544 ## COG1183 Phosphatidylserine synthase 6 4 Op 3 . + CDS 6723 - 6998 382 ## gi|288930042|ref|ZP_06423882.1| hypothetical protein HMPREF0670_02776 + Term 7032 - 7068 -0.8 7 5 Op 1 . + CDS 7484 - 8497 1063 ## COG0685 5,10-methylenetetrahydrofolate reductase 8 5 Op 2 7/0.000 + CDS 8548 - 9633 1322 ## COG0470 ATPase involved in DNA replication + Prom 9703 - 9762 3.0 9 5 Op 3 . + CDS 9788 - 11116 1275 ## COG1774 Uncharacterized homolog of PSP1 10 5 Op 4 . + CDS 11124 - 11615 463 ## PRU_1703 hypothetical protein + Term 11740 - 11779 10.1 - Term 11897 - 11937 2.8 11 6 Tu 1 . - CDS 11956 - 12954 1158 ## COG1052 Lactate dehydrogenase and related dehydrogenases - Prom 13155 - 13214 2.6 12 7 Op 1 . + CDS 13288 - 14091 1035 ## COG0030 Dimethyladenosine transferase (rRNA methylation) 13 7 Op 2 . + CDS 14088 - 14552 284 ## PROTEIN SUPPORTED gi|15902812|ref|NP_358362.1| hypothetical protein spr0768 + Term 14690 - 14733 1.4 - Term 14463 - 14499 -0.4 14 8 Tu 1 . - CDS 14549 - 14830 68 ## - Prom 14971 - 15030 7.5 + Prom 14670 - 14729 4.3 15 9 Tu 1 . + CDS 14834 - 15034 76 ## Predicted protein(s) >gi|283510493|gb|ACQH01000126.1| GENE 1 1603 - 1938 232 111 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930037|ref|ZP_06423877.1| ## NR: gi|288930037|ref|ZP_06423877.1| hypothetical protein HMPREF0670_02771 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 111 70 180 180 217 100.0 2e-55 MILWIGSGIMTVVCEHTGSVSVAEMPSKMHCKDAAASNCMKMQVKTLSSTNAERADFHQP LPVQSSLLPQLITDCDLLPLPVLAKASERTWQWLWHGPPRQYLRLLTTLLI >gi|283510493|gb|ACQH01000126.1| GENE 2 2347 - 4407 2562 686 aa, chain + ## HITS:1 COG:no KEGG:BF2030 NR:ns ## KEGG: BF2030 # Name: not_defined # Def: putative TonB-dependent outer membrane receptor protein # Organism: B.fragilis # Pathway: not_defined # 29 684 88 737 738 718 55.0 0 MNKIPYILGATLLAVVPAVAQQTHPDSTITLKEQTLANVTVTSRKASVRSLKGAINGKDI LRDELFKAACCNLGESFVTNPSVDVNYADATTGAKQVKLLGLSGTYVQMLTENLPNFRGA ALPYGLGYVPGNWMKSMQVSKGNSSVKNGYEAMTGQINVEYVKPEDEKGVSLNLYGNTMG KFEANADGNVHINGNKNLSTELLVHFENNWSQHDGNGDGFEDDPQVKQYNLQNRWFWKKG GYMLHAGLALLKEDRTSGQVAHAQPAPGSLTPYRIGIGTERYEGYMKHAFILNEAHGTNV AFMGNVSLHKQDASYGIKQYSVNEKNAYASLIFETNFTPEHNLSAGLSLNHDYLHQRLLL PAGVVPADYGNRYPLTRGIESETTPGAYVQYTYNLHNRVIAMAGLRIDHSNVYGTFFTPR FHLKWMPADLLTIRLSAGKGYRSPHALAENNYLMASGRRLIIDNLEQEAAWNYGTSLSFL IPIGKQTLKLNADYYYTHFLSQTLIDYDTNPQELHITNLNGKSYSHTVQIDASYLLFKGL DLTAAYRYNLVKATYGGRLMWKPLQSRYKALLTASYKTPLGLWQFDATAVLNGGGRMPEP YTTASGNLSWSRNFKAYGQLNAQVTRYFRHFSVYVGSENLGNYKQKNPIIGYQNPWDNNF EPTMVYGPVQGASAYIGIRVNLGKRL >gi|283510493|gb|ACQH01000126.1| GENE 3 4512 - 4841 316 109 aa, chain + ## HITS:1 COG:no KEGG:BT_1092 NR:ns ## KEGG: BT_1092 # Name: not_defined # Def: putative heavy-metal binding protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 90 22 92 119 65 51.0 5e-10 MKKSIISMVLFMVAMIATAKDIKTVVFTTTPQMHCANCEAKIKGNLRFEKGVKAIKTDVE AQKVFVSYDSKKTTEEKIQKAFEKFGYKAEKTDKDAKIPVHKDEQCENM >gi|283510493|gb|ACQH01000126.1| GENE 4 5293 - 5982 657 229 aa, chain + ## HITS:1 COG:RSc2074 KEGG:ns NR:ns ## COG: RSc2074 COG0688 # Protein_GI_number: 17546793 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine decarboxylase # Organism: Ralstonia solanacearum # 13 228 10 214 215 135 40.0 5e-32 MVNKIKKLKKIRIHREGTETLALGFIVLAIVAPLLWVFFECKVAFWTFLVIYGIAYGIII NFFRCPIRVFNGDTEKVVVAPADGKIVVIEEVDENTYFHDRRIMVSIFMSLFNVHANWFP 
VDGKVKKVEHKDGNFHKAWLPKASEENEHADVVITTPDGVDILCRQIAGAMARRIVTYAK PGEDCYIDEHLGFIKLGSRVDVFLPLGTEICVKMGQLTTGDQTVLAKLK >gi|283510493|gb|ACQH01000126.1| GENE 5 5998 - 6720 544 240 aa, chain + ## HITS:1 COG:BS_pssA KEGG:ns NR:ns ## COG: BS_pssA COG1183 # Protein_GI_number: 16077296 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine synthase # Organism: Bacillus subtilis # 8 141 3 125 177 89 39.0 5e-18 MANKITRHIPNTITCCNLISGCIAIIAAIYGDLWLALLFIVIGSVFDFFDGMSARLLHVS APIGKELDSLADVITFGLAPSVMLFQQLSVLDYPFGSVYRWQITGYIPFVAFLMTAFSAL RLAKFNLDERQTTSFIGLPTPANALFWGSLLLGLDSQMQAVNWAPFALIALMLLNCWLLV SELPMFALKFKHWGWKGNEVKYIFVLSCIPLLVIFKVLAFAIIIAWYVVLSAYVNATKKQ >gi|283510493|gb|ACQH01000126.1| GENE 6 6723 - 6998 382 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930042|ref|ZP_06423882.1| ## NR: gi|288930042|ref|ZP_06423882.1| hypothetical protein HMPREF0670_02776 [Prevotella sp. oral taxon 317 str. F0108] # 1 91 1 91 91 153 100.0 4e-36 MDFILKSIGVLLLSVLFVALIAFLLTFRRIIKSVRQTMRQFNNATTSSATNGSQRAGKAG EVIIDSRTSERMNQKIFEQNEGEYVDFTEEK >gi|283510493|gb|ACQH01000126.1| GENE 7 7484 - 8497 1063 337 aa, chain + ## HITS:1 COG:aq_1429 KEGG:ns NR:ns ## COG: aq_1429 COG0685 # Protein_GI_number: 15606607 # Func_class: E Amino acid transport and metabolism # Function: 5,10-methylenetetrahydrofolate reductase # Organism: Aquifex aeolicus # 20 332 1 283 296 166 31.0 7e-41 MPNKGEQGAHTTPPKHQIKMKVSDFLNKKAAAKAISFEVLPPLKGNGTAQLFRTIDRLRE FCPSYINITTHNSEYVYRQLDNGQYERDRIRRRPGTIAIAAAIQQRYAIPVIPHIICSGA TIESIEYELLDLQFLGIQNLLLLRGDKARDDRMFTPTPNGHAHTTELIEQVNHFNEGFFA DGTPMKHPGESFEYGVACYPEKHEEAPNLDMDIDYLLRKQQLGAKYAVTQLFYDNQKYFD FVEKARQKGVTIPIVPGIKPFAKLGQLTVIPRTFHCDLPTALASEIVKCKTDDDAKALGV EWATQQCRELYAKGVSSIHFYTVSAVDSICEVAKKLL >gi|283510493|gb|ACQH01000126.1| GENE 8 8548 - 9633 1322 361 aa, chain + ## HITS:1 COG:NMA0980 KEGG:ns NR:ns ## COG: NMA0980 COG0470 # Protein_GI_number: 15793937 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication # Organism: 
Neisseria meningitidis Z2491 # 3 214 16 191 325 103 30.0 6e-22 MVHNNRLPHALMLCGPSGSGKLALGLALASYLLCERHAEGDAPCGECAACAMMRRFEHPD LHFSFPVIRPAGTSSEHKMNSDDFAPQWREMLQTTLYPSIDLWLDQMDAANQQAQMGVGE SDLLMKRLSMKSSQGGYKVAVVWLAERMNQECANKLLKLLEEPPAQTLFILVCQEPELLL ETIKSRTQRIDLPPIAVHDMQQALIQKRNITPEDARRVARLANGSWTNALAELSVDNENK QFLDMFIMLMRLAYQRNIRELRRWSDAVATYGREKQKRMLTYFARLMRENFMFNFGIADL VYMSREEENFARNFARFVNEENIVEISELIDRCIRDISQNANAKVVFFDYAINMILYIKK A >gi|283510493|gb|ACQH01000126.1| GENE 9 9788 - 11116 1275 442 aa, chain + ## HITS:1 COG:CAC0301 KEGG:ns NR:ns ## COG: CAC0301 COG1774 # Protein_GI_number: 15893593 # Func_class: S Function unknown # Function: Uncharacterized homolog of PSP1 # Organism: Clostridium acetobutylicum # 48 270 4 220 303 169 43.0 7e-42 MDYKNMKFKVATGCDHGGCCGGCCRSDHQLNTYDWLADVVNDPDKTDLVEVQFKNTRKGY YHNVNNLQLTKGDMVAVEASPGHDIGVVTLTGRLVALQIKKANLKSADDIKRIYRLAKPV DLDKYEEAKSRENDTMIQSRQIAKDLGLKMKIGDVEYQGDGNKAIFYYIADERVDFRQLI KVLADTFHVRIEMKQIGARQEAGRIGGTGPCGRELCCATWNKNFVSVNTNAARFQDISLN PQKLAGMCGKLKCCLNYEVDNYVEAAKRMPSKEVSLHSGDAEHFVFKTDILAGTVTYSTD KNLAVNLVTITAERARAIIEMNRRGEKPDSLEEERHKQHTDKPIDLLANADLSRFDKSKR PKRPANGKPKGDKRDGSAPQNGAPQQRKEHLRQREQQAERRRQRPQQPRAQEGKQENNGN RHRRDNQRRGGAPSNNQPQKQE >gi|283510493|gb|ACQH01000126.1| GENE 10 11124 - 11615 463 163 aa, chain + ## HITS:1 COG:no KEGG:PRU_1703 NR:ns ## KEGG: PRU_1703 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 16 153 6 143 143 119 42.0 5e-26 MERLVKGALVALAVALVSACSMDTVYHRYSHTSVSGWEKNDTLKFYVPRLHQGGVFAQQL DVRINGAYPFTAATLIVKQRIVPGNQELTDTINCTFSKPDGTRLSRGISYYQYSYPIAKH DLREGDSLVVSVYHDMKRDMLPGISDVGIHISRVSYLHNALYK >gi|283510493|gb|ACQH01000126.1| GENE 11 11956 - 12954 1158 332 aa, chain - ## HITS:1 COG:FN0511 KEGG:ns NR:ns ## COG: FN0511 COG1052 # Protein_GI_number: 19703846 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Fusobacterium 
nucleatum # 3 329 5 331 335 354 57.0 2e-97 MIKVAFFDAKSYDEVSFNKTNADYGFDIRYYQEYLNLKTVPLAKGADVVCIFVNAECNAA VIDELIKNGVKLIALRCAGFNNVDLKAAKDRIKVVRVPAYSPHAVAEYAVSLMLALNRKI FRAVNRTREGNFALKGLMGFDMYGKTAGIVGMGRIAKELIKILHGFGMKVLAYDLYPDMD FAKRYDVRMVSLDELYAESDIISLHCPLTPETTFLINAQSIAKMKPGVMIINTGRGKLVH TEDLIEGLRTKQVGSAGLDVYEEEKNYFYEDRSDKIIDDDVLARLLMMPNVVLTSHQAFF TAEAMHNIALTTLESIKEFSEGKELTNEVMEK >gi|283510493|gb|ACQH01000126.1| GENE 12 13288 - 14091 1035 267 aa, chain + ## HITS:1 COG:PA0592 KEGG:ns NR:ns ## COG: PA0592 COG0030 # Protein_GI_number: 15595789 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Dimethyladenosine transferase (rRNA methylation) # Organism: Pseudomonas aeruginosa # 5 260 8 263 268 152 37.0 9e-37 MKLVRPKKHLGQHFLTDLGIAQRIADTVDACPELPILEVGPGMGVLTQYLATKNRPLRVV EIDTESVAFLYNNFPLLAENVLGEDFLRMDLASVFTGQPFVLTGNYPYDISSQIFFKMLD NKHLIPCCTGMIQREVALRIASQPGTKAYGILSVLIQAWYDVEYLFTVDETVFNPPPKVK SAVIRMARNKVENLGCNEILFKRVVKTVFNQRRKMLRVSLRQLFAGMPASPEFYAQEMFT RRPEQLSVAEFVQLTNMVEQEMNAQKQ >gi|283510493|gb|ACQH01000126.1| GENE 13 14088 - 14552 284 154 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15902812|ref|NP_358362.1| hypothetical protein spr0768 [Streptococcus pneumoniae R6] # 3 151 6 152 165 114 40 6e-25 MNKETTYTALLPQITALIKGESNETSVLANVAAALHQAFDQFFWTGFYLLHPDGMLRLGP FQGPPACYAIAIGKGVCGQAFELGRTLVVPDVEQFPGHIACSTLSRSEIVVPVFSKSGKP VAVLDIDSKQLATFDDTDKLYLEQVAQLLTDTLY >gi|283510493|gb|ACQH01000126.1| GENE 14 14549 - 14830 68 93 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNKPLCGGKGDFLFAYDATVSFQIQFCLTIILFVRTILTKITWLLSKQTQNAERILFTNE YALQVMELKEEKRKSRAKEKQVLSPCVELQNGP >gi|283510493|gb|ACQH01000126.1| GENE 15 14834 - 15034 76 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQRTSCTLPTTIFQMQFFFTIIAILAATPLTHLPNAMYLSLNNLPYYQILNYYLLFYKTI LIFAPF Prediction of potential genes in microbial genomes Time: Sat May 28 02:54:43 2011 Seq name: gi|283510492|gb|ACQH01000127.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.127, whole genome shotgun sequence Length of sequence - 5658 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 552 336 ## BF2845 hypothetical protein 2 1 Op 2 . + CDS 564 - 1397 624 ## gi|288930052|ref|ZP_06423892.1| hypothetical protein HMPREF0670_02786 3 1 Op 3 . + CDS 1409 - 2911 789 ## gi|288930053|ref|ZP_06423893.1| hypothetical protein HMPREF0670_02787 4 1 Op 4 . + CDS 2935 - 3567 521 ## gi|288930054|ref|ZP_06423894.1| hypothetical protein HMPREF0670_02788 + Prom 3724 - 3783 7.5 5 2 Op 1 . + CDS 3937 - 4398 187 ## gi|288930055|ref|ZP_06423895.1| hypothetical protein HMPREF0670_02789 6 2 Op 2 . + CDS 4440 - 5333 533 ## gi|288930056|ref|ZP_06423896.1| hypothetical protein HMPREF0670_02790 Predicted protein(s) >gi|283510492|gb|ACQH01000127.1| GENE 1 1 - 552 336 183 aa, chain + ## HITS:1 COG:no KEGG:BF2845 NR:ns ## KEGG: BF2845 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 3 157 401 544 587 112 46.0 8e-24 CRDGVETNRGFVFIPEVGDHVLVGFRHGDPNRPYVMGSLFNGRTGAGGGKDNKCKSITTR SGCALTLNDAEGSVTLNDKGGANMNFDGGGNASTTTNSAVHVSAGKQVKTDVGNGQSVLT MGKDGIIDLSGHKKITFKVGSSTITITADNIKLESSTVDIDGGGGTIHAAGIVEVDGGDV FIN >gi|283510492|gb|ACQH01000127.1| GENE 2 564 - 1397 624 277 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930052|ref|ZP_06423892.1| ## NR: gi|288930052|ref|ZP_06423892.1| hypothetical protein HMPREF0670_02786 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 277 1 277 277 530 100.0 1e-149 MSSTEIYISKPQIQRRYHIAINGDMLMGGVENNTKVETECLLNVLGIDSNKCAKIELITL DTNVIETGNQAFRELSVIANQLKKVTQDIVCVIDKEGKVLQIINIEQIKGKWEQLKGEMV SICGRSGELTDFFNVNEKLFSEEETLRHYVGEMEFFKIYFNGLYGHRIRDEEHRKTDNAF KTNKISYNLYFDNDEDDELIRIRFEGRDFEINHDWLEKSYGQMPFVDLRNMKPEFAIQGD YVIEKATGLIKGAKFVWDENTSKELRLRTEYIITEKR >gi|283510492|gb|ACQH01000127.1| GENE 3 1409 - 2911 789 500 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930053|ref|ZP_06423893.1| ## NR: gi|288930053|ref|ZP_06423893.1| hypothetical protein HMPREF0670_02787 [Prevotella sp. oral taxon 317 str. F0108] # 1 500 1 500 500 998 100.0 0 MAKVYLTDEHYFLCSGGLFPARIYDGKSTHVSKTGKRYLTHYDTATKTPDFTCRWAVVLA VMAAVFFALLASNPIGWAILAGLIIGLGTGFLLCGVMMASGRKWIGYERHLIIKRTQYAL TSDSYMVCPLGGKISWSPEITSPGIALWTGLRNTGFATFQGIMYGYAAYGGVSLFTTSGW EAVLPNLIKGWAKTYGKAGLLVRSGFAAENVAYRNATRQYEGGEPSDMLKDAGKTFIGDV YQYYNIGTKLANGEKVGAEEFANGVATPLGMVGLNIQLKNADYQIPYALRRISNRGYIGG SKLLFKRDIVAERMDYAREFYEKVFKERVVTGEMTREEAIKAMNNELKGIDYRMPIKVRY IRGEMYQYQKHGLDGTYYEGEYYTPDATAKPTDLGVSSEYNVRDSNMDPTGKTDKIRQIQ VNKEGCIGLESTSSPINDDWSVYQVDTNGNVVIGTNGKPVPVEVPTKGKGTQIYIPRNQG CLNGIEPSSPHPVYPIYPDE >gi|283510492|gb|ACQH01000127.1| GENE 4 2935 - 3567 521 210 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930054|ref|ZP_06423894.1| ## NR: gi|288930054|ref|ZP_06423894.1| hypothetical protein HMPREF0670_02788 [Prevotella sp. oral taxon 317 str. F0108] # 1 210 1 210 210 380 100.0 1e-104 MNTSPKTQKEYDQLCDETMAYFKKIMNDYPQITMYKSIYVQIWMIKKDLSENVRLDRVKD FQKYSLGAIAAKNFDEGDVYGDAIGTIDAMAFIYFYMPEDAINTPDEMLRMINRTIGMMK QPFEEYKLKGKYKEYREEKTPYIEYARQVLNYIKDYLEKRNDGSKLNIIDLEECANKTYL SFRYEVSFLMYEIVPDIINKLINKDTQTGV >gi|283510492|gb|ACQH01000127.1| GENE 5 3937 - 4398 187 153 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930055|ref|ZP_06423895.1| ## NR: gi|288930055|ref|ZP_06423895.1| hypothetical protein HMPREF0670_02789 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 153 1 153 153 263 100.0 2e-69 MLPGETLGDFCERVIKEYLKSQGFDKFYEVQNRSGNGVDIIAEKTKTHEVKIIEVKGTQS ESKWDKGQTKELPLSRDQKAGGETYSESRINRAKNGDDGWKNEPETQANAKQAHAAMEQA KDNGTLLYEKYDVYVDESGAIRNGEQRVRRRSW >gi|283510492|gb|ACQH01000127.1| GENE 6 4440 - 5333 533 297 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930056|ref|ZP_06423896.1| ## NR: gi|288930056|ref|ZP_06423896.1| hypothetical protein HMPREF0670_02790 [Prevotella sp. oral taxon 317 str. F0108] # 1 297 1 297 297 602 100.0 1e-171 MNSLEKSFSYTKEMMAHKIKTMEESHFKRATQKIDNYIKGNAYDTAVGTRLSSLSSWYLW NFINEFTQSGKIEYDLLARSTYYGISANKWAYSIGQCSTHYDGALLFSDSSMHLAQLVCL GLSQEAITYCSLLTKMLNGKQFKMFPDNPTFPWFILDLYLRYNNTSIEESWKYPSEMGAY GRTLENWDTCDKALFEKLLIDLCDFHVSQSDEYEHDGNLLEFSSAQYFLFPPEILMWIRL RHNRGLYVGHPNLHPLLQLEINNLPLNSFEMPLDPLVTRCFNKLHKDNPSVNFEFMA Prediction of potential genes in microbial genomes Time: Sat May 28 02:55:44 2011 Seq name: gi|283510491|gb|ACQH01000128.1| Prevotella sp. oral taxon 317 str. F0108 cont2.128, whole genome shotgun sequence Length of sequence - 1502 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 630 358 ## COG3209 Rhs family protein 2 1 Op 2 . 
+ CDS 642 - 1304 380 ## t2568 hypothetical protein Predicted protein(s) >gi|283510491|gb|ACQH01000128.1| GENE 1 1 - 630 358 209 aa, chain + ## HITS:1 COG:rhsB KEGG:ns NR:ns ## COG: rhsB COG3209 # Protein_GI_number: 16131354 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli K12 # 2 97 1152 1247 1411 84 49.0 1e-16 IGRPVQAYDEHGVVIWQADYDIYGNLLNLKGDRQFVPFRQLGQYEDEETGLYYNRFRYYD PSTGGYISQDPIGLAGNNPTLYGYTRDSNTRIDLWGLDCDVPKKTQTHHIIPMAVYKAFR KDFKKIKGYVQAVNVRTAKNRTNLIDLDTPFHGNHPKYNEYVSKKITKLTNEGDLKLDDI NTLQNNLRNEIGKAQASGKNLNEHFGKLK >gi|283510491|gb|ACQH01000128.1| GENE 2 642 - 1304 380 220 aa, chain + ## HITS:1 COG:no KEGG:t2568 NR:ns ## KEGG: t2568 # Name: not_defined # Def: hypothetical protein # Organism: S.typhi_Ty2 # Pathway: not_defined # 3 217 2 221 225 144 38.0 3e-33 MYKYKRVWIETDPLIMGVRNGVYQVELKEKKSFVSKEERNYYESYFASGINAFLLDDFKR IDEKKITCITYFPLKGALETDFIVAAPHERGIDFLVTEKCLDVLESFKLPTYNKFKVNIE GFSSNYFAVGFPMVPNRFVDYSDSKFVDLTTKEHVQIADSEEYKKYFSIGDRKISVKYRL DYDIIAIQPFGLFFSGKLINAIKMNRLIGLQVEDTEMVLK Prediction of potential genes in microbial genomes Time: Sat May 28 02:55:48 2011 Seq name: gi|283510490|gb|ACQH01000129.1| Prevotella sp. oral taxon 317 str. F0108 cont2.129, whole genome shotgun sequence Length of sequence - 2500 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 669 253 ## COG3209 Rhs family protein 2 1 Op 2 . + CDS 657 - 929 202 ## gi|288930060|ref|ZP_06423900.1| hypothetical protein HMPREF0670_02794 + Prom 1241 - 1300 4.2 3 2 Op 1 . + CDS 1404 - 2048 317 ## ZPR_2169 hypothetical protein 4 2 Op 2 . 
+ CDS 2108 - 2498 389 ## gi|288930062|ref|ZP_06423902.1| BclA protein Predicted protein(s) >gi|283510490|gb|ACQH01000129.1| GENE 1 1 - 669 253 222 aa, chain + ## HITS:1 COG:rhsC KEGG:ns NR:ns ## COG: rhsC COG3209 # Protein_GI_number: 16128676 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli K12 # 2 100 1152 1250 1397 82 48.0 4e-16 VGRPVQAYDEHGTPVWQADYDIYGNQLNLKGDRQFVSFRQLGQYEDEETGLYYNRFRYYD PSTGGYISQDPIRLLSGESNFYAYVRDTNNWADVFGLEELFRGMKQKNNVPLTGNSADKL GVRPNVDIEVIDGKVYPNSGGMSVNKSIDNIPSHRKPIEFGGTQKGSAMFKIESDNLGDN LRFKADKNGTHGVIEPSRPMSLAEYQESLGALQNKFKSVCPS >gi|283510490|gb|ACQH01000129.1| GENE 2 657 - 929 202 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930060|ref|ZP_06423900.1| ## NR: gi|288930060|ref|ZP_06423900.1| hypothetical protein HMPREF0670_02794 [Prevotella sp. oral taxon 317 str. F0108] # 1 90 1 90 90 155 100.0 7e-37 MSKLNTKESFIISSDIKDWQAKYIDVEKLPIEFYLKYTNILSDYKESGKSKEEVLAFFYK ISEDNQDISEFVNEIMDLIEGYCRPDFRVW >gi|283510490|gb|ACQH01000129.1| GENE 3 1404 - 2048 317 214 aa, chain + ## HITS:1 COG:no KEGG:ZPR_2169 NR:ns ## KEGG: ZPR_2169 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 7 212 18 224 237 114 32.0 2e-24 MFNIKKIKKQDYALLRSISLKLVDKYPYLVRQVSTNFILDKKHNEFERHGVYTLILNAKL EQEFINKSLPQLFIIKDILVWNKKENRYEPIELDIMEGMLAGYSLKANVRELDLSRIDVS RVKEQTFENHEQEELANIIGDVDDKIRSYLSLESSFKIIIPEGTFYVIKNLGDGDYISID KLGRVYEMTHDPYEVKCIYKKKEDYFDSIKNNGI >gi|283510490|gb|ACQH01000129.1| GENE 4 2108 - 2498 389 130 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930062|ref|ZP_06423902.1| ## NR: gi|288930062|ref|ZP_06423902.1| BclA protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 130 13 142 143 243 100.0 3e-63 MKVWQRRPDPDTWTVRTAERRTTYRAFPDGEGGTVYRAVRVDHADGGFLAIAYDRDSGLL STLEDHHGRTVVFGQDDRRGLILSANLLCDGRLEMLAEYEYDGRRNLTRACDRFGKAIEF AYDGDNRVVR Prediction of potential genes in microbial genomes Time: Sat May 28 02:56:05 2011 Seq name: gi|283510489|gb|ACQH01000130.1| Prevotella sp. oral taxon 317 str. F0108 cont2.130, whole genome shotgun sequence Length of sequence - 2148 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 947 485 ## COG3209 Rhs family protein 2 1 Op 2 . + CDS 949 - 1182 168 ## gi|288930064|ref|ZP_06423904.1| hypothetical protein HMPREF0670_02798 + Term 1233 - 1276 1.1 + Prom 1189 - 1248 5.0 3 2 Tu 1 . + CDS 1321 - 2146 787 ## COG3209 Rhs family protein Predicted protein(s) >gi|283510489|gb|ACQH01000130.1| GENE 1 3 - 947 485 314 aa, chain + ## HITS:1 COG:YPO3615 KEGG:ns NR:ns ## COG: YPO3615 COG3209 # Protein_GI_number: 16123757 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Yersinia pestis # 4 189 1184 1377 1512 103 34.0 3e-22 VVTFKYDALGRRIEKSFNGRVYRYLWDGDVVLHEWEYDETERPQQIVAENGEVSFDRPEP TDNLVTWVYDPDGYIPTAKLVGGKRYGIVSDYVGRPVQAYDEHGTPVWQADYDIYGNLLN LKGDSQFVPFRQLGQYEDEETGLYYNRFRYYDPSTGGYISQDPIGLAGRNPTLYGYVRDV NIWVDVSGLDCSIKDKKVATPYGDAIQSIEPEALKARTEVENGATLYRIGTMGRSETIGA QFWALEHPFSEGYASRYGIPPENVANSNFIMTGKLKEGSDFITRPAPPVGTNSGGGIEVV TPTDGVEIITFSTL >gi|283510489|gb|ACQH01000130.1| GENE 2 949 - 1182 168 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930064|ref|ZP_06423904.1| ## NR: gi|288930064|ref|ZP_06423904.1| hypothetical protein HMPREF0670_02798 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 77 1 77 77 125 100.0 1e-27 MDIYEESIKLAENLNKFGYQFISQEVLDAINYSSTGTEALMRIRFFLKEFLDNGVDINLL LLERAKNLLNQINAIID >gi|283510489|gb|ACQH01000130.1| GENE 3 1321 - 2146 787 275 aa, chain + ## HITS:1 COG:RSp1137 KEGG:ns NR:ns ## COG: RSp1137 COG3209 # Protein_GI_number: 17549358 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Ralstonia solanacearum # 11 199 845 1024 1517 77 33.0 3e-14 MSKLFETDERGERYTFERDLNGKVIAERGFDGQERRYERDMDGTVLVTHLPNGTTIHHQH DLAGRLTYSRYPDGSWEAWEYDRAGRLCKASNPHGETVFERDALGRIIREIQNGHAIEHC YDSRSRLTHTLSGLGADVSYGYDDAGLPESVKAIVQGMPHPWEARMQHDRLGRETLRTMT GGVACAMLYDGVGRPSRQSVTRGGHSLYSRSYRWDDDFRLSHAHDAISGRIVRYLYDDFG SLAEAEYGDGARQWRNPDVMGNVYDSTDRTDRTYA Prediction of potential genes in microbial genomes Time: Sat May 28 02:56:10 2011 Seq name: gi|283510488|gb|ACQH01000131.1| Prevotella sp. oral taxon 317 str. F0108 cont2.131, whole genome shotgun sequence Length of sequence - 2026 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 823 309 ## COG3209 Rhs family protein 2 1 Op 2 . + CDS 744 - 1406 458 ## PAU_03002 hypothetical protein 3 2 Tu 1 . 
+ CDS 1553 - 2024 410 ## COG3209 Rhs family protein Predicted protein(s) >gi|283510488|gb|ACQH01000131.1| GENE 1 2 - 823 309 273 aa, chain + ## HITS:1 COG:YPO3615 KEGG:ns NR:ns ## COG: YPO3615 COG3209 # Protein_GI_number: 16123757 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Yersinia pestis # 4 146 1237 1377 1512 85 36.0 1e-16 QEIVAENGDVSFDRAEPTDNLVTWVYDTDSYVPTAKLVNGKRYGIVSDYIGRPVQTYDEH GTPVWQADYDIYGNLLNLKGDRQFVPFRQLGQYEDEETGLYYNRFRYYDPSTGGYISQDP IGLTGNNPTLYGYVSDTNVTHDIYGLEISYKGDTSPYQTRYDPTYPGRPDPRYSIDTTTF ESGKPTANGGLRDNKQFWQKWEKLNPSSISKNNRWRIEHGLSPTIDNVWIKEFPEHGGYK GEILIHHHVDQGKYAVPVPASTHVGSGGVWHCK >gi|283510488|gb|ACQH01000131.1| GENE 2 744 - 1406 458 220 aa, chain + ## HITS:1 COG:no KEGG:PAU_03002 NR:ns ## KEGG: PAU_03002 # Name: gene0032 # Def: hypothetical protein # Organism: P.asymbiotica # Pathway: not_defined # 1 184 1 183 211 68 32.0 2e-10 MWIKVSTLYQYQQALMLAQVAYGIVNKIKNNYMKLLTLNEIILTIVEETKGLDPEFIEEI ILKKVDIPSTEIVRLKEELRIDVLNQNFINIILKYSWGNFCFLTYQFGYNDTYGINWLIQ RNLEYEDYSTLNEAGFIIIANGDPYTILLECASGRIYTIDSETEIEERMLIAENFEQLVR GMGTGQYACWCKRETEFIQLMEHITKGIGMPFWHSLVGFY >gi|283510488|gb|ACQH01000131.1| GENE 3 1553 - 2024 410 157 aa, chain + ## HITS:1 COG:RSp1137 KEGG:ns NR:ns ## COG: RSp1137 COG3209 # Protein_GI_number: 17549358 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Ralstonia solanacearum # 2 157 1160 1303 1517 70 29.0 1e-12 MLDSVTRPDGKVVTFKYDALGRRIEKAFNGRVHRYLWDGDVVLHEWEYDETERPQQIVAE NGDVSFDRPEPTDNLVTWVYDTDSYVPTAKLVNGKRYGIASDYIGRPVQGYDEHGTLVWQ ADYDIYGNLLNLKGDSQFVPFRQLGQYEDEETGLYYN Prediction of potential genes in microbial genomes Time: Sat May 28 02:56:15 2011 Seq name: gi|283510487|gb|ACQH01000132.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.132, whole genome shotgun sequence Length of sequence - 3274 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 511 - 999 -47 ## COG3209 Rhs family protein 2 1 Op 2 . + CDS 1006 - 1632 196 ## gi|288930071|ref|ZP_06423911.1| hypothetical protein HMPREF0670_02805 + Prom 1634 - 1693 4.0 3 2 Tu 1 . + CDS 1739 - 2038 145 ## gi|288930072|ref|ZP_06423912.1| hypothetical protein HMPREF0670_02806 + Prom 2286 - 2345 4.0 4 3 Tu 1 . + CDS 2593 - 3027 207 ## BPUM_1759 hypothetical protein Predicted protein(s) >gi|283510487|gb|ACQH01000132.1| GENE 1 511 - 999 -47 162 aa, chain + ## HITS:1 COG:ECs0560 KEGG:ns NR:ns ## COG: ECs0560 COG3209 # Protein_GI_number: 15829814 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli O157:H7 # 2 58 1202 1257 1398 65 59.0 5e-11 MGQYEDEETGLYYNRFRYYDPSTGGYISQDPIGLEGNNSTLYAYVYDHNKEVDIFGLNGT PQLPNETILSGGNTKIVHYYNNLAEHAEPIHFHIEENGNSIGKIKADGTLISGRTNKTSQ NMVKQSKNKLRKAEKKIANYLRKVRKAVAGKPFKYGNRGCKS >gi|283510487|gb|ACQH01000132.1| GENE 2 1006 - 1632 196 208 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930071|ref|ZP_06423911.1| ## NR: gi|288930071|ref|ZP_06423911.1| hypothetical protein HMPREF0670_02805 [Prevotella sp. oral taxon 317 str. F0108] # 1 208 1 208 208 372 100.0 1e-102 MDLYNRILNKGDIESIIELTKEDKKIITRLYSYETIFSKTLNYLLVNKNKSADLEYLFTI FIDMLSGRLINKPSDLLSCIQKVKNKNNQILFLKTIMHHRLVNDDFLISLGENKFVFEHL PYDLSWIEIPVIKYGSKAIVSATEKLSIVQICPLIDCIEDTSLIEYLVGWAFEENKLSDS GIDYFMQNYEKKYNLIKNIKQKENDIIR >gi|283510487|gb|ACQH01000132.1| GENE 3 1739 - 2038 145 99 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930072|ref|ZP_06423912.1| ## NR: gi|288930072|ref|ZP_06423912.1| hypothetical protein HMPREF0670_02806 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 99 1 99 99 165 100.0 7e-40 MLNRQEYQEIFRILNVTIGYIDKIASGFYGTEETALALLLGFKENKTLDQLSQIRYILQI AMEKQLSNQEYDEFIEKEVEIWKPPYDSSKEELLDMIRE >gi|283510487|gb|ACQH01000132.1| GENE 4 2593 - 3027 207 144 aa, chain + ## HITS:1 COG:no KEGG:BPUM_1759 NR:ns ## KEGG: BPUM_1759 # Name: not_defined # Def: hypothetical protein # Organism: B.pumilus # Pathway: not_defined # 1 141 1 140 146 64 31.0 1e-09 MNSYYDNILIKSVEEQKINYTVISNLSYYIENLSKHFLFVGERLDFNNLSNHKFVKSEQR DISFDAVNFIKKLIEEKICSKDDAIIYIGDSLTENGYEFYLNDLLKIVPFLVNEIPQHHY FLSKDFKKIIYISFENEIEFGGVS Prediction of potential genes in microbial genomes Time: Sat May 28 02:56:32 2011 Seq name: gi|283510486|gb|ACQH01000133.1| Prevotella sp. oral taxon 317 str. F0108 cont2.133, whole genome shotgun sequence Length of sequence - 737 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 188 - 247 4.9 1 1 Tu 1 . + CDS 342 - 602 82 ## gi|288930076|ref|ZP_06423916.1| hypothetical protein HMPREF0670_02810 Predicted protein(s) >gi|283510486|gb|ACQH01000133.1| GENE 1 342 - 602 82 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930076|ref|ZP_06423916.1| ## NR: gi|288930076|ref|ZP_06423916.1| hypothetical protein HMPREF0670_02810 [Prevotella sp. oral taxon 317 str. F0108] # 1 86 1 86 86 145 100.0 8e-34 MTSIKLTFIVYGDIFDVDDFSKIIGKSPTDFAYKNDMLKYRRSTETFWEYSFQEVLSPYI EESIRCFENVITPSFETVSSFIKKTI Prediction of potential genes in microbial genomes Time: Sat May 28 02:56:37 2011 Seq name: gi|283510485|gb|ACQH01000134.1| Prevotella sp. oral taxon 317 str. F0108 cont2.134, whole genome shotgun sequence Length of sequence - 1579 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 360 392 ## gi|288930082|ref|ZP_06423921.1| cell well associated RhsD protein - Prom 402 - 461 1.9 2 2 Op 1 . 
- CDS 464 - 919 236 ## gi|288930078|ref|ZP_06423917.1| hypothetical protein HMPREF0670_02811 3 2 Op 2 . - CDS 916 - 1371 241 ## gi|288930079|ref|ZP_06423918.1| hypothetical protein HMPREF0670_02812 - Prom 1435 - 1494 8.5 Predicted protein(s) >gi|283510485|gb|ACQH01000134.1| GENE 1 3 - 360 392 119 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930082|ref|ZP_06423921.1| ## NR: gi|288930082|ref|ZP_06423921.1| cell well associated RhsD protein [Prevotella sp. oral taxon 317 str. F0108] # 1 119 368 486 785 233 94.0 2e-60 MLQIEKQREEYYKSLGTPPGVPTMKGATADPDNTGRGYNDIIGDLGDHSVPEGWVQRPKF NDKGDVPGPPIQWQKPEDPDNAQPGYGYVPADTTNTEEIFFYHSDHLGSTSYITDAKAN >gi|283510485|gb|ACQH01000134.1| GENE 2 464 - 919 236 151 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930078|ref|ZP_06423917.1| ## NR: gi|288930078|ref|ZP_06423917.1| hypothetical protein HMPREF0670_02811 [Prevotella sp. oral taxon 317 str. F0108] # 1 151 1 151 151 283 100.0 3e-75 MRKKLFLLLLICCTYLLSAKNKRELPFYQVLVPKSIATQLDEGYGKEFVNASANVYNLLD KKEKKMVNGVYAFKGQGPHFPRKIFIYRDKKIFFFQSVGAFNPNGIIKEYSTFLSENKLT NAETIMYLRAIYEYLKDENGIQYGAEIKKCK >gi|283510485|gb|ACQH01000134.1| GENE 3 916 - 1371 241 151 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930079|ref|ZP_06423918.1| ## NR: gi|288930079|ref|ZP_06423918.1| hypothetical protein HMPREF0670_02812 [Prevotella sp. oral taxon 317 str. F0108] # 1 151 70 220 220 241 100.0 1e-62 MKSLKRAINSKIRVKMKISSETKITKVEGGNSYTYGETVQGNFNSKDNYGRRVSKNGTFG ITESSITIYEGTLKEDAKTENPKHKGLTIDQAIGAVAGHEIVHGTNRKEINKDLRYELKN KGGTRPNKETAPNKIENKIIEQSRELNNEIK Prediction of potential genes in microbial genomes Time: Sat May 28 02:57:00 2011 Seq name: gi|283510484|gb|ACQH01000135.1| Prevotella sp. oral taxon 317 str. F0108 cont2.135, whole genome shotgun sequence Length of sequence - 2542 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 575 349 ## 2 1 Op 2 . 
- CDS 572 - 2194 1225 ## BT_2927 putative cell wall-associated protein precursor 3 1 Op 3 . - CDS 2272 - 2541 72 ## gi|288930082|ref|ZP_06423921.1| cell well associated RhsD protein Predicted protein(s) >gi|283510484|gb|ACQH01000135.1| GENE 1 2 - 575 349 191 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKMPQIAILLVLLLMLNCTSTKWPFNQSLKNVIDKFIQMYPNGDVIQVQSSKIGTHELLI ITNRKSYDPDMIDGYCIYKSRLITYYQTDSTNRDSIVRIDRLLKYEGKINGYKNAYSTNV SFEPKQEIYKIFNKDSIVQIKDISKLKFSDMPQSENVIANKELNEIINSYIYQLASVLYE LKFIRKGNHSF >gi|283510484|gb|ACQH01000135.1| GENE 2 572 - 2194 1225 540 aa, chain - ## HITS:1 COG:no KEGG:BT_2927 NR:ns ## KEGG: BT_2927 # Name: not_defined # Def: putative cell wall-associated protein precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 200 437 751 1016 1074 109 31.0 3e-22 MYWDEDNRLMVLSDNGKTSRYTYNAAGERIVKSHGDLEGVYVNGAPQGITFHETEDYAIY PAPIITVTKNRFTKHYFIGDKRVASKLGTGKFSNVYGISSNNVTAGQKDYAARMMQIEKQ REDYYRKLGTPPGVPTMKGATADPDNTGYGYNTIIGELGDHSVPEGWVQRPKFNKKGDVP GPPIQWNKPEDPDNAQPGYGYVPTDTTNTEEIFFYHSDHLGSTSYITDAKANITQFDAYL PYGELLVDEHSSSEEMPYKFNGKEFDGETGLYYYGARYMNPRTSLWYGVDPLAEKYPEIG GYINCHCNPIMRVDIDGMDDYALNSEGKMYFWRKTDARTTHRVFSGKKSITVNKTLVNQL VSNATLNGTSATISDSEDAFKFFKFAADNTKVEWHLVGVKDKRNVNFIIRTDFSTGGVGV DEEYYANMIFHLHNHPWGSNDFKKGTRASGDYKAYKRVASGERVANYYTEGDVWTLNNIY IGYNQAHPNTALNKYPKAYIFYSGDKRQRSQLYQFDLNSPRFNPIYNPTYKEIKKYIYKK >gi|283510484|gb|ACQH01000135.1| GENE 3 2272 - 2541 72 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930082|ref|ZP_06423921.1| ## NR: gi|288930082|ref|ZP_06423921.1| cell well associated RhsD protein [Prevotella sp. oral taxon 317 str. F0108] # 13 87 151 225 785 120 82.0 3e-26 QANSNSNNSNNNKAKLGGAFNHTYVYDDLSRLIKANGEAKGAKYEMTMTFGRMSEPLTKV QEVESTKTAQSYDFTYKYETATTPPLPRR Prediction of potential genes in microbial genomes Time: Sat May 28 02:57:21 2011 Seq name: gi|283510483|gb|ACQH01000136.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.136, whole genome shotgun sequence Length of sequence - 3123 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 10 - 597 189 ## BF1750 hypothetical protein 2 1 Op 2 . - CDS 675 - 3032 2032 ## COG3209 Rhs family protein Predicted protein(s) >gi|283510483|gb|ACQH01000136.1| GENE 1 10 - 597 189 195 aa, chain - ## HITS:1 COG:no KEGG:BF1750 NR:ns ## KEGG: BF1750 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 7 193 9 193 193 154 42.0 1e-36 MNKTYRDLIYRFQNQGIEESQSTGAILIGKAPHIAPEAWLNTLYPPLSKNDVQALEKELG MGIPIDYKKFLLDVSNGLDILVGTFCLDGLRRNYKRSTDESRQPFSIITANIRERPRNAA DDHFFIGGYDWDGSYLYIDKRTSTVHYCDRDDATSLFQWETFEQMLISELKRIYSLFDER GRKIDEDLYTTPIKR >gi|283510483|gb|ACQH01000136.1| GENE 2 675 - 3032 2032 785 aa, chain - ## HITS:1 COG:MA2045 KEGG:ns NR:ns ## COG: MA2045 COG3209 # Protein_GI_number: 20090892 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Methanosarcina acetivorans str.C2A # 4 295 1641 1878 2217 70 27.0 1e-11 MVSTADVRTYVYGATYDSWNRVRTMTYPDGEVVTYHYNAAGQIASVKSNKQGKEETIVEK VGYDKDGHTVYTKLGNGAETTYTYDKQRERLQEMNLTAAGTAIMTNKYQYDAVDNILGIV NAVDPTQANSNNNNSNNNSNNNSNNNSNNNKAKLGGAFNHTYAYDDLNRLIRANGEAKGA KYEMTMTVGRMSEPLTKVQKVDPTKTAQSYDFTYKYEDSNHPTAPTQIGHEHYTYDANGN PTLVENDSLNTERRMYWDEDNRLMVLSDNGKTSRYTYNAAGERIVKSHGDLEGVYVNGAP QGITFHETEDYTIYPAPIITVTKNRFTKHYFIGDKRIASKLGTGKFSNVYGVSSNNVTAG QKDYAARMLQIEKQREEYYRKLGTPPGVPTMKGATADPDNTGYGYNTIIGELGDHSVPEG WVQRPKFNDKGDVPGPPIQWQKPEDPDNAQPGYGYVPADTTNTEDIFFYHSDHLGSTSYI TDAKANITQFDAYLPYGELLVDEHTSSEDMPYKFNGKELDEETGLYYYGARYMDPKISMW LGVDKLAEKYPTLGGYVYCAGNPIKLIDTDGNDIVIAGKNNSNITLKTNLIDIRVNATSL GIDFGGAYTLEGEKILSAALDIVGIFDPTGVADGLNATLAAKNGEWLDVGISVLGLLPYA GDLAKVGKIKKDVKIIEDGIEAVKGAKNLKYLRQKAVKDAWKAEKRLVERTGRGSRRWTK KELKELKETGKVKGYEGHHINSVKGHPEDAGDPSNIEFVKKGGEHLSRHNGNYRNPSSGK KINRD Prediction of potential 
genes in microbial genomes Time: Sat May 28 02:57:26 2011 Seq name: gi|283510482|gb|ACQH01000137.1| Prevotella sp. oral taxon 317 str. F0108 cont2.137, whole genome shotgun sequence Length of sequence - 9350 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 5418 4661 ## Hoch_2925 YD repeat protein 2 1 Op 2 . - CDS 5495 - 7318 1183 ## gi|288930084|ref|ZP_06423923.1| hypothetical protein HMPREF0670_02817 3 1 Op 3 . - CDS 7315 - 8397 724 ## gi|288930085|ref|ZP_06423924.1| hypothetical protein HMPREF0670_02818 - Prom 8436 - 8495 2.6 - Term 8440 - 8472 1.4 4 2 Tu 1 . - CDS 8510 - 9235 767 ## gi|288930086|ref|ZP_06423925.1| hypothetical protein HMPREF0670_02819 - Prom 9290 - 9349 5.8 Predicted protein(s) >gi|283510482|gb|ACQH01000137.1| GENE 1 3 - 5418 4661 1805 aa, chain - ## HITS:1 COG:no KEGG:Hoch_2925 NR:ns ## KEGG: Hoch_2925 # Name: not_defined # Def: YD repeat protein # Organism: H.ochraceum # Pathway: not_defined # 116 1003 559 1466 3456 401 32.0 1e-109 MNKSACFSAVILSAFLLFVSCNGNRRTSHVGNKQEIRESPVSDAKEQKTGKDNQSLHAKK YAHAVTGTKEIGRERGTRTNNRNHVSKRLQDFRKRTVSKYFAGTLADSLSVGRAELKVPL GCMEHAKILSITPLRKGELPHLPAGMVNVTADRSNPTVAANSKDSIAGYRFLPHGEHFVH SLASITVPYDSTLIPQGYTAEDIHTYYYDELKAQWVMLRHKVLDKDRELVVAETSHFTDV INGIIKVPENPETQNYVPTGISELKAADPSAGITTVSAPTANQSGTAALSYPFELPKGRA GMQPSVGLQYSSDGGSSYVGYGWGLPLQSIDIETRWGVPRFDTDKESESYLLMGSKLNDR TYRTTDAPARTKDKRFYPLVEGGFAKIIRKGDSPQNYTWEVTSKDGTVSYFGGVDGTVDE GAVLKDGNGNILRWALCKTQDTHGNFVSYKYRKRSDNLYPETYRYTGSKDEEGAYSVQFA FKWDARKDVVKSGRLGVLQTDDALLERVEVKYNSELLRAYALHYKEGVFSKTLLDGIEQL DSKDNHVATQTFSYYNDITKGIFSDTPVIYTAEKDNYGRLVKLHIGNFDERLSLLGGGSA KGNTVGGGVMVGAGWGPASVNAGGSYSYTKTNNEGRIALVDINGDGMPDKVWKGADGRLH YRLNQNKDLRHPSFGDKMTITGIGSFTGGTTTSNSWNANVATGFGPASIGYAVNKTTDQS KNKTYFQDFNADGLIDIAQNGTVYFNHSNGKDVSFSSTSTNTGNPITGNATSIDSTFLPD YKAIRDSLEREFPLHDAIRLWRAPYAGTVKVTGAVNKPSAQGDGVSLSMQHNNVVLWKDT 
LLQSGSVTVPAKTLPVRPGDYLLFRVNARYSGIGDVVEWDPNIVYTSIAVDTYAGEDLKH YRNSSDYVPGEMSTAPLAIDGTVFYEGGFNKQKTSDDVVLSIVKTDKTGTQTEIDHLLLP ADTVLTGKFEGNFHSSVVDSATVNFIIKTSSPINWQHISWTPVFRSDTTKYALAPQRLMF NKNIVAAKDTLVHVVASDSTWGDRLVLVPELRVSRSSGKDTAPATVHLTLKDETGSLLYR KEYTMRGNNRLQGDSTAVNGVSLIPHLSAKRVQVSFSVLNELDNAPIARLHVLRDSILYT VGTDGKKHETGRRLVRVDSIPASVFSSFNRFDHGPLYKGWGQFAWNGNAKGEPIRTDELR ASDHNDYIKDGEIDEEAVEKNTLDINKQKFFTMAYSPTTGKYVSATDSVYVKVAMMRASR LGEDEIVVDSINYNLNGEGLSAPVQMTESKSTGKTYSLGFSMGFSLGVNKSNTKQESYTT VSALDLNGDSYPDWVREDNGKIKVQLTRQIGTLGDGQDYDVEASMSYGESENTGGDAGLS MNPDSYKLSNFQDKLQKAKASFAKSVASLQQTGRLGKNAEGASFSGNVSEGACSASGNFS SGTSESVRDWGDLNGDGLPDMVSKGQVRYNLGYSFTEAMPSGINAVEHSSNNNYGDGLGV CIPILGLFNISAGQNDTGSLTSGKGAFHDVNGDGLPDFVEQDSKGDLKVTLNTGSGFAAP EGLQKKSELSGNMGSSVAIYASTAYTIRIPLPFGFVINITPSIQGSHSESINRTTAALMD MDGDGLPDLVYSDSENSMGVRRNLTGRTNLLKGVTLPFGGHVAVGYKQTEPSYDMPGRRW VMASVETTGGYAENGATRMRNEFEYEGGYRDRRERDFYGFEKVTTKQIDTQNGNAVYRTQ VAEYGHNRNLYMHDLVTAETLYDAAGNKLQGTQNTYELKQQADTTVFFPALASVRQTIYD NAGQG >gi|283510482|gb|ACQH01000137.1| GENE 2 5495 - 7318 1183 607 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930084|ref|ZP_06423923.1| ## NR: gi|288930084|ref|ZP_06423923.1| hypothetical protein HMPREF0670_02817 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 607 1 607 607 1158 100.0 0 MTMKQSTLHLTSILVSLLCAVSITASRAQAQTQAQIQPKPQCQAPGGVAGFVKWASGNDN TPVRLTGAGGLTFIGVGKIQKAGEQLLWNVSTQAGKTERVQTTARAANLVKGTFMNYTGR DTLPQLRLYAYSTSSASGTRGTLNVGGMTKEKLPIKSLKNGMEEYVVYARALTAAERMRV ESALALRHGITLAHSYLNYKGETIRNYYRLKAYNHRVAGIIGDATSKLHRTTGESSENEA VVKVSASAINEGASYLWGDNAKQLSFTADKGNGKWMQRCWAATTTGQPAEHLTLTFDTRS IHQLQPLGKDECYYLAVDNSGTGKFPVGQIRYYKAQDSHADSIVFAGIAADAGESIFTLR AAKDFFATVEIARPRCSTAQKGQLKVLFTGGTAPYHVTISLDGKACYSQSTSDSLITVSA LQQGKYTIAAKDHTGKTLLNEVMVSNADMADVPELQDVRFARGASRDYHLDTKGDYTCRW KSPKGRYLSGESVTLDEDGAYILELTNAEGCSTTRTLNVSTMGKDGFARHAVSPNPTTDG NVDVRVEMSESLPLEFRLYSPDGALLQREARTADTYHATRCYLPMNGTYVLEMSSGESRQ SVKLIRK >gi|283510482|gb|ACQH01000137.1| GENE 3 7315 - 8397 724 360 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930085|ref|ZP_06423924.1| ## NR: gi|288930085|ref|ZP_06423924.1| hypothetical protein HMPREF0670_02818 [Prevotella sp. oral taxon 317 str. F0108] # 1 360 3 362 362 678 100.0 0 MGNSTPNTRRNVISGIVALIVILIILLCLRLCKHDEEDVPTPQATPSASTPQDTIKQGTT TVSPKAWHKVSKPKRKRKVIAASALQPAEKVAVVEKKAEPAVVQETPKEQPKQQTGIEPE SDDVVIDTAEITREPKPRRIFSHKRYYQTRIGIRAGVGYSVIGNLGALVEDGAVRPRYTM EENGAFVPSIGVFALWRHDRLGVELAADYTWLSSTLKEHKQIGNIDEKTVFRYHVIMPQV AARLYLLSDLYMGAGVGMGIPLNPGGIDFSSNRAALYVSVDKLTQEHLRETFRARLHVMP LLKIGYSSFKNGIEASLQYGYGFMDLIKTNENPYGYHKATNNSHLLLLTVGYTIPLTKQK >gi|283510482|gb|ACQH01000137.1| GENE 4 8510 - 9235 767 241 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930086|ref|ZP_06423925.1| ## NR: gi|288930086|ref|ZP_06423925.1| hypothetical protein HMPREF0670_02819 [Prevotella sp. oral taxon 317 str. F0108] # 1 241 1 241 241 469 100.0 1e-131 MGTKFNLLILFIVSCLSLSAQEGAFNAGVRLGGGYSVNSHVDKILVSEGYYANYSFDNKG LFVPSAELFFLYRQPASLWGVEAGIAYYNKTARVRYEDKNELNYTLSTRYHHLGLAAYFN LYPFKAKNSLHVSLGGRIGANLSPSNLTYTGNQEDAKFSALGYPSVKETERILKDKLKGR PDAALGGGIGYDFPFGMTLDLRYHYSLTNSIKTETNTYNWVEHDNHNQQIELTVGYMINI R Prediction of potential genes in microbial genomes Time: Sat May 28 02:58:25 2011 Seq name: gi|283510481|gb|ACQH01000138.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.138, whole genome shotgun sequence Length of sequence - 1182 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Sat May 28 02:58:26 2011 Seq name: gi|283510480|gb|ACQH01000139.1| Prevotella sp. oral taxon 317 str. F0108 cont2.139, whole genome shotgun sequence Length of sequence - 1052 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 79 - 1050 477 ## COG3209 Rhs family protein Predicted protein(s) >gi|283510480|gb|ACQH01000139.1| GENE 1 79 - 1050 477 323 aa, chain - ## HITS:1 COG:RSp1137 KEGG:ns NR:ns ## COG: RSp1137 COG3209 # Protein_GI_number: 17549358 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Ralstonia solanacearum # 4 242 1173 1397 1517 110 31.0 3e-24 VVTFKYDALGRRIEKSFNGRVHRYLWDGDVVLHEWEYDETKRPQPIVAENGDVSFDRPEP TDNLVTWIYDPDSYVPTAKLVNGKRYGIVSDYIGRPVQAYDEHGTPVWQADYDIYGNLRN LRGDRKLVPFRQLGQYEDEETGLYYNRFRYYDPNTGGYISQDPIGLAGENPTLYGYVENM NVDIDILGLHIANAVYTNTNGINGEVGSFASQKGGFHSEPQILNQLGNQKGGHLEITSMG PKGDGKTSFFKGPSGKSFPAGPLPPCGPKSKNCDALLYNHAKNHGMTITYKWNDMHGNSN VRVYSPDGTVMENGIDISCKYCK Prediction of potential genes in microbial genomes Time: Sat May 28 02:58:27 2011 Seq name: gi|283510479|gb|ACQH01000140.1| Prevotella sp. oral taxon 317 str. F0108 cont2.140, whole genome shotgun sequence Length of sequence - 4013 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 154 - 190 1.8 1 1 Op 1 . - CDS 283 - 924 291 ## gi|288930088|ref|ZP_06423927.1| hypothetical protein HMPREF0670_02821 2 1 Op 2 . - CDS 929 - 1897 559 ## COG3209 Rhs family protein - Term 2261 - 2299 3.3 3 2 Op 1 . - CDS 2372 - 2707 311 ## Pecwa_4338 hypothetical protein 4 2 Op 2 . 
- CDS 2704 - 3099 93 ## YpsIP31758_0341 YD repeat-/rhs repeat-containing protein 5 3 Tu 1 . - CDS 3216 - 4013 511 ## COG3209 Rhs family protein Predicted protein(s) >gi|283510479|gb|ACQH01000140.1| GENE 1 283 - 924 291 213 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930088|ref|ZP_06423927.1| ## NR: gi|288930088|ref|ZP_06423927.1| hypothetical protein HMPREF0670_02821 [Prevotella sp. oral taxon 317 str. F0108] # 1 213 1 213 213 389 100.0 1e-107 MGKKEFKQCFKSQIETLVERFNSRQIVIEYSKNIANIKFMNETVGKIEFKFLTGKQLSMG AIASIFITANIFGGEIWKEILESTGFRKKDSNIYTNYLYSFYSSLPNNNLFKGSKTDFFS YDNIEEKLSSFFDFISKELLSKISNILMKRAMAIDDILNTPHYYGQPEISMLLLCRDNPQ ELSLSAIEEDKRFRNSSCINKRIWEENKTVVFL >gi|283510479|gb|ACQH01000140.1| GENE 2 929 - 1897 559 322 aa, chain - ## HITS:1 COG:RSp1137 KEGG:ns NR:ns ## COG: RSp1137 COG3209 # Protein_GI_number: 17549358 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Ralstonia solanacearum # 2 200 1160 1345 1517 101 33.0 2e-21 MLGSVTRPDGKIVTFKYDALGRRIEKAFNGRVHRYLWDGDVVLHEWEYDETERPQQIVGE NGDVSFDRPEPTDNLITWVYDTDSYVPTAKLVNGKRYGIVSDYIGRPVQAYDEHGTLVWQ ADYDIYGNLLNLKGDRKLVPFRQLGQYEDEETRLYYNRFRYYEPSTGGYISQDPIGLEGN NPTLYAFVKDVNMEVDVFGLECSNFNGRRGENLAEFDLERNGFKIVGRQVPMTINGQSIR ADFVATRNNKVYVFEAKMNSGRLTPAQKASGVFSTKKVDMANTSMSGGGTIKTSKGSSGV YIDGNGNTTNATFHVVKYKTRI >gi|283510479|gb|ACQH01000140.1| GENE 3 2372 - 2707 311 111 aa, chain - ## HITS:1 COG:no KEGG:Pecwa_4338 NR:ns ## KEGG: Pecwa_4338 # Name: not_defined # Def: hypothetical protein # Organism: P.wasabiae # Pathway: not_defined # 1 111 1 109 109 74 36.0 1e-12 MSDILNDKIIVVELAGKEIILRNTSNNFKNELIAMGFRNVGSEYFLVLPIEDIDNRVLII NKLIDLGGLFSGGNGWAPSEVVDYYKEKGLISAKYNRISWKAPGKFTISTE >gi|283510479|gb|ACQH01000140.1| GENE 4 2704 - 3099 93 131 aa, chain - ## HITS:1 COG:no KEGG:YpsIP31758_0341 NR:ns ## KEGG: YpsIP31758_0341 # Name: not_defined # Def: YD repeat-/rhs repeat-containing protein # Organism: Y.pseudotuberculosis_IP31758 # Pathway: not_defined # 1 116 1402 1516 1527 157 62.0 9e-38 
MLQDDKGFNISPIDWDSYPNIGLNGTFITDRTGALNNLPNFRKGDTVTISSATASSIERG MGLKPGSLQNGFKVREISDITSMNPRSPLEGNEYFLGPGQHLPNGAPEMVINSVPTIDNE SVTTILTVLVK >gi|283510479|gb|ACQH01000140.1| GENE 5 3216 - 4013 511 265 aa, chain - ## HITS:1 COG:RSp1137 KEGG:ns NR:ns ## COG: RSp1137 COG3209 # Protein_GI_number: 17549358 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Ralstonia solanacearum # 41 244 1111 1299 1517 66 25.0 7e-11 DSADRTDRAYARGGQLREDRRWRYHYDKQGNLVLKTKRKIRADLEHAVPQTKGKGLFGLF SGSNTSKERREQEGLIAWQPGDYAYTWLPNGMLGSVTRPDGKTVTFKYDALGRRIEKSFN GRVHRYLWDGDVILHEWEYAETDRPQPIVAENGEVSFDKPEPTDNLVTWVYDTDSYVPTA KLVNGKRYGIVSDYIGRPVQAYDEHGTLVWQADYDIYGNLFDLKGNREFVPFRQLGQYED EETGCTTIDLGITTRIRVGIYHKTR Prediction of potential genes in microbial genomes Time: Sat May 28 02:58:41 2011 Seq name: gi|283510478|gb|ACQH01000141.1| Prevotella sp. oral taxon 317 str. F0108 cont2.141, whole genome shotgun sequence Length of sequence - 1739 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 198 - 578 84 ## gi|288930091|ref|ZP_06423930.1| hypothetical protein HMPREF0670_02824 2 1 Op 2 . - CDS 581 - 1738 514 ## COG3209 Rhs family protein Predicted protein(s) >gi|283510478|gb|ACQH01000141.1| GENE 1 198 - 578 84 126 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930091|ref|ZP_06423930.1| ## NR: gi|288930091|ref|ZP_06423930.1| hypothetical protein HMPREF0670_02824 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 126 1 126 126 250 100.0 3e-65 MSTNNRNIPGTIKVEGREHLYVSDDGEDFFTKRSRSVKRQSLFDRVTSLFKIAQSGGVIS ENCKLRLPTHELFHGLSYKGDIEGWRKQIEQGAEHLGLLTGKVSNNSIELSDGRIYALSD CEIEFY >gi|283510478|gb|ACQH01000141.1| GENE 2 581 - 1738 514 385 aa, chain - ## HITS:1 COG:YPO3615 KEGG:ns NR:ns ## COG: YPO3615 COG3209 # Protein_GI_number: 16123757 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Yersinia pestis # 28 262 1148 1384 1512 115 32.0 2e-25 RADLEHAVSQTEGKGFFGLFSNNNTSKERREQEELIAWQPGDYAYTWLPNGMLGSVTRPD GKTVTFKYDALGRRVEKNFNGRVHRYLWDGDVVLHEWEYDETKRPQPIVAENGEVSFDRP EPTDNLVTWIYDTDGYVPTAKLVNGKRYGIVSDYIGRPVQAYDEHGTPVWQADYDIYGNL RNLRGDSQFVPFRQLGQYEDEETGLYYNRFRYYDPNTGGYISQDPIGLAGNNPTLYAYVS DANSWVDLFGLDCDKATSKAREYEQQIQDMYGGKLPQSQREYGAIVGGTQVNGIADHVAN INGKNVAIDAKYVKNWSKSIRNPNSSIGNKPFAIAEQQNMVSQAQKYSNAFDEVIYHTNN QDLATHYTRIFQDAGLNNVKIIVTP Prediction of potential genes in microbial genomes Time: Sat May 28 02:58:48 2011 Seq name: gi|283510477|gb|ACQH01000142.1| Prevotella sp. oral taxon 317 str. F0108 cont2.142, whole genome shotgun sequence Length of sequence - 1990 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 1122 915 ## Cpin_6418 YD repeat-containing protein 2 1 Op 2 . 
- CDS 1172 - 1804 521 ## gi|288930094|ref|ZP_06423933.1| hypothetical protein HMPREF0670_02827 - Prom 1908 - 1967 4.0 Predicted protein(s) >gi|283510477|gb|ACQH01000142.1| GENE 1 3 - 1122 915 373 aa, chain - ## HITS:1 COG:no KEGG:Cpin_6418 NR:ns ## KEGG: Cpin_6418 # Name: not_defined # Def: YD repeat-containing protein # Organism: C.pinensis # Pathway: not_defined # 29 365 86 422 1368 108 29.0 3e-22 MVGDVIPIKHITGGLPSYKIFYGDLEGPHDGEVYFGSKSVSADGSECGGSFPAQVLTCWG APFGHQFFPPTWLGLYQHALSLYVPLSFGKPVMVGGTFVPHKYSLSDILMRAVAVETMRF LGCIARKGLTRFNHMLMRKFGQNPVSDALCRFGFEPVNLVTGAMDFAWDDFELGGDHPLS MRCRWFSDVAYSGVMGNGVVCSYDRFIVPDFEAGVAAYNDPEECKALPLPIVEVGAPEEY YRGMKLWQCRPDARTWIVRKADATTTYRAFNDGKGGTVYRAVRVDYAGGGFLNFSYDRDS GLLSTLEDHHGRTVVFGLDYERGLILSANLLRKGRLETLAEYEYDGRRNLTRACDRFGKA IEFTYDGDNRVVR >gi|283510477|gb|ACQH01000142.1| GENE 2 1172 - 1804 521 210 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930094|ref|ZP_06423933.1| ## NR: gi|288930094|ref|ZP_06423933.1| hypothetical protein HMPREF0670_02827 [Prevotella sp. oral taxon 317 str. F0108] # 1 210 1 210 210 389 100.0 1e-107 MKSLVFILAAFIVAGSCGAQRKVKVSKIKGGKQMTTEKIDKQRFHWNKDKNDIYTFVNYK GQKVVQRWMSSGGVYYFYETRRKENELIEEYRRYFNAGKLNVEGFQYKDNGFEVGIWKIY DGDGKLVEVRDYDAPFKNYPWEEVRKFLERERGIDFFDKRTTVSRYVDEKHPAGWGIRYY DKKNQTFKYIGLDCATRKIVEENEFSIVRD Prediction of potential genes in microbial genomes Time: Sat May 28 02:59:05 2011 Seq name: gi|283510476|gb|ACQH01000143.1| Prevotella sp. oral taxon 317 str. F0108 cont2.143, whole genome shotgun sequence Length of sequence - 5682 bp Number of predicted genes - 6, with homology - 2 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 281 - 340 3.7 1 1 Tu 1 . + CDS 393 - 572 104 ## - Term 382 - 420 1.1 2 2 Op 1 . - CDS 634 - 837 195 ## gi|288930097|ref|ZP_06423935.1| hypothetical protein HMPREF0670_02829 3 2 Op 2 . 
- CDS 849 - 3119 1927 ## gi|288930098|ref|ZP_06423936.1| conserved hypothetical protein + Prom 2931 - 2990 3.2 4 3 Tu 1 . + CDS 3073 - 3279 59 ## 5 4 Tu 1 . - CDS 4020 - 4337 70 ## - Prom 4402 - 4461 2.9 6 5 Tu 1 . + CDS 4478 - 4687 92 ## + Term 4898 - 4953 2.7 Predicted protein(s) >gi|283510476|gb|ACQH01000143.1| GENE 1 393 - 572 104 59 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYTLYIYKERLYGLRQLLDLLLTQSHSIIYPGHSLIQYSYSVLSFKYPARYKAQLMDIW >gi|283510476|gb|ACQH01000143.1| GENE 2 634 - 837 195 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930097|ref|ZP_06423935.1| ## NR: gi|288930097|ref|ZP_06423935.1| hypothetical protein HMPREF0670_02829 [Prevotella sp. oral taxon 317 str. F0108] # 1 67 1 67 67 121 100.0 2e-26 MKTHDYLKPDCAAIHFGLDELLIAPGASRFDNDGDGNADQHGEIPHNPDEIGAKPNYGWD WCEEDEG >gi|283510476|gb|ACQH01000143.1| GENE 3 849 - 3119 1927 756 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930098|ref|ZP_06423936.1| ## NR: gi|288930098|ref|ZP_06423936.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str. 
F0108] # 1 756 3 758 758 1519 100.0 0 MKRILFFALLAVYACIPLAVNAQGQDPTTHKWLGNPVESVINNPDENKRIVYLYNVGTGK YLNAGSYWGTSLVGFSTGMTITVKHSTLANHYRMVGPLKTTEGQNIAFGRRRDTPGFDDA ANYNRAYVDRGVTYNTDVTPNPYAVQKKYINGVLDWKFEEVKPGSKTYWISVYNDETTQG MGGKRYLQMTKVLKDKVYPISYPGNVNPNDETCQWRIVTRADLKDVFKDVYASDESPANA TILIDDHNFARGDRDVEKWVTAGGLTWGWADHNAYLLEPANDAYTYYVGNGATSSNSYMA DNASYGTANVRNLGNTAHANGKVSQKVKAIKKGWYRISCNGFYAPATGSNLTAELFVSVV GITDANSNVKTTLNKFGGDFEYTPQEFRKVYTNADRAADKVSPYVKAAKVFEHGMYNNTV FVYVPHDTDVMEIGVRVANSTKPLDWTCWDDFSLAYCGTLDLILDETQNNSTYILEQVKP NRAAIMVLKRTLQKNEWNSIVLPVSLTVGQLKAAFGEDVKLSAYPKQSTDYERRIDFTKV DLDQEDDHVALDAYKLYLIKPTKDPTVMTSLKPYSKLKNNKPWLSVNAPYYVINNVTLDK KPEDQPGYSGGILRNAASWSTTADGKLQFCGSLYRHASAVVPAFSYALGKSSASKHRWLW HYTQSPMPVKGFRCWIATGSATQSKALKFFVDNEEIGNTFNTTGIATTASEGNGDLFAVP CNIYAIDGKLVRPNATSTEGLPKGVYIVNHKKLILK >gi|283510476|gb|ACQH01000143.1| GENE 4 3073 - 3279 59 68 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQAYTANKAKNRIRFIIINYLNSLVMFLMVLYRNDVSFSLSGTFHGAKRFTIAAVAIFTR MFITMHRS >gi|283510476|gb|ACQH01000143.1| GENE 5 4020 - 4337 70 105 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIIHSSQFIEITSVHCHTLEPVQFCLIVGLIVCLIFEGVTDNVKRIRRVTSAANKRKVYR RRWKQPIIIDRCSNGRLGFNVVRFLVVRQSITLSQKEDQYSLRVF >gi|283510476|gb|ACQH01000143.1| GENE 6 4478 - 4687 92 69 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPSLPFENRISMADYFRIFMMLSSLAVISMPRNRSRWRSQVYVLSVAVAMRAYTAQLPFV RFSAKILTL Prediction of potential genes in microbial genomes Time: Sat May 28 02:59:55 2011 Seq name: gi|283510475|gb|ACQH01000144.1| Prevotella sp. oral taxon 317 str. F0108 cont2.144, whole genome shotgun sequence Length of sequence - 13968 bp Number of predicted genes - 13, with homology - 12 Number of transcription units - 9, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 123 - 198 78.1 # Met CAT 0 0 1 1 Tu 1 . 
- CDS 425 - 1252 198 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P + Prom 1301 - 1360 4.1 2 2 Op 1 23/0.000 + CDS 1581 - 2327 857 ## COG0767 ABC-type transport system involved in resistance to organic solvents, permease component 3 2 Op 2 . + CDS 2481 - 3266 308 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Term 3322 - 3353 -0.3 - Term 3156 - 3204 3.0 4 3 Tu 1 . - CDS 3400 - 3786 408 ## PRU_2941 putative lipoprotein - Prom 3818 - 3877 2.9 - Term 4172 - 4231 9.1 5 4 Tu 1 . - CDS 4263 - 4574 191 ## 6 5 Tu 1 . - CDS 4976 - 5818 562 ## gi|288930103|ref|ZP_06423941.1| hypothetical protein HMPREF0670_02835 - Prom 6030 - 6089 6.2 - Term 7600 - 7647 -0.7 7 6 Tu 1 . - CDS 7832 - 8635 554 ## BVU_0849 putative iron uptake factor - Prom 8661 - 8720 5.1 - Term 8916 - 8961 10.1 8 7 Op 1 . - CDS 8965 - 10278 1564 ## COG1160 Predicted GTPases 9 7 Op 2 . - CDS 10365 - 11246 1080 ## COG1159 GTPase - Prom 11305 - 11364 2.0 - Term 11280 - 11334 10.8 10 8 Op 1 . - CDS 11372 - 12394 990 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III 11 8 Op 2 . - CDS 12394 - 12582 268 ## PROTEIN SUPPORTED gi|29349241|ref|NP_812744.1| 50S ribosomal protein L32 12 8 Op 3 . - CDS 12596 - 13123 589 ## PRU_2936 hypothetical protein - Prom 13150 - 13209 2.7 13 9 Tu 1 . 
- CDS 13410 - 13715 77 ## gi|288803514|ref|ZP_06408945.1| hypothetical protein HMPREF0660_01950 - Prom 13816 - 13875 3.6 Predicted protein(s) >gi|283510475|gb|ACQH01000144.1| GENE 1 425 - 1252 198 275 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 35 243 147 355 398 80 27 5e-15 MSTNDFQLNNQNHEDTHMELRAEGLVKRYGKRTVVNDVSINVRQGEIVGLLGPNGAGKTT SFYMTTGLVIPNDGHIFIDNTEITDFPVYKRARAGIGYLPQEASVFRKLSVEDNIMAVLE LTGKPRQYQLDKLESLIREFRLGKVRKNKGDQLSGGERRRTEIARCLAIAPKFIMLDEPF AGVDPIAVEDIQHIVWQLKYRNIGILITDHNVQETLTITDRAYLLFEGRILFQGAPEELA ENKIVREKYLSNSFVLRKKDFQTLDEERRAREGGH >gi|283510475|gb|ACQH01000144.1| GENE 2 1581 - 2327 857 248 aa, chain + ## HITS:1 COG:aq_355 KEGG:ns NR:ns ## COG: aq_355 COG0767 # Protein_GI_number: 15605864 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: ABC-type transport system involved in resistance to organic solvents, permease component # Organism: Aquifex aeolicus # 21 247 23 244 245 110 30.0 2e-24 MLILNYLNTLGQYLILMGRSFSRPERMRMFFKRYVKEMSQLGVDSVGIVLLISFFIGAVI CIQMKLNIQSPWMPRWVSGYTTREIMLLEFSSSIMCLILAGKVGSNIASELGTMRVTQQI DALDIMGVNSASYLILPKILGLVTIMPFLVIFSSAMGILGAYATAYIGHILTPDDLTLGI QHAFKPWFMWMSIVKSLFFAYIIASVSSYFGYTVEGGSVEVGKASTDAVVSSSVLILFSD VFLTQMLS >gi|283510475|gb|ACQH01000144.1| GENE 3 2481 - 3266 308 261 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 228 1 229 245 123 31 8e-28 MIEVKKLTKSFEDKTVLKGIDCVFETGKTNLIIGQSGSGKTVLMKNIVGLLEPTSGEILY DGRNFVTMSKREKVHMRREMGMIFQSAALFDSLSVLENVMFPLDMFSTMNLRERTKRAQE CLERVNLTDAQDKFPGEISGGMQKRVAIARAIVLNPKYLFCDEPNSGLDPKTSLVIDELL SSITKEFGMTTIINTHDMNSVMGIGENICFIYKGKKEWQGTKDDVMSSKNQKLNDLVFAS DLFRKVKEVEEREEANITASK >gi|283510475|gb|ACQH01000144.1| GENE 4 3400 - 3786 408 128 aa, chain - ## HITS:1 COG:no KEGG:PRU_2941 NR:ns ## KEGG: PRU_2941 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # 
Pathway: not_defined # 1 128 1 128 128 119 48.0 5e-26 MKKLFYFFVALLVVGCSMGPDPGEVAAQAAKEYYMQLLAGKYEHYVDGFYRPDSIPPSYR RQLIDNAKMFVGQQKAERRGILDVRVVNAVADTAKRSANVFLLFAYGDSTSEEVVVPMVL HRGVWYMR >gi|283510475|gb|ACQH01000144.1| GENE 5 4263 - 4574 191 103 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSIIFFFSPVILKHMEKYEETKRQNTEIKSSKVKKKGKKSNWQVKGRNKEQPISETSSTE KPEQGVEENYSDVSNEGGVSEEKISVTKQNEMTPINEETEFSE >gi|283510475|gb|ACQH01000144.1| GENE 6 4976 - 5818 562 280 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930103|ref|ZP_06423941.1| ## NR: gi|288930103|ref|ZP_06423941.1| hypothetical protein HMPREF0670_02835 [Prevotella sp. oral taxon 317 str. F0108] # 1 280 1 280 280 581 100.0 1e-164 MTMNNLLLNERLRQLYQGYWNKFCKEVKESGTSIGYPFLLSTCQWEGDVPTEKWYANSDL KVMIFGQEPNKWYGIEGCDDFNSASQFYTGDEPETVETIMSAYENFYATNYLLNGNKWYF DEKGRKNTFHAKGINNFMSQLDEIVHTFNPKLRTVCMWNNISKLSTIDGKPVDNCTHNIE RHFLSSIIAQEIEILQPDIIIFLTGRYDTYIKENLGLTDEDFIPFDGFPVHDVAKVTIPG VKLAYRTCHPNAYPKAKPDYYRYVEAIIYDIHSNFNNLIK >gi|283510475|gb|ACQH01000144.1| GENE 7 7832 - 8635 554 267 aa, chain - ## HITS:1 COG:no KEGG:BVU_0849 NR:ns ## KEGG: BVU_0849 # Name: not_defined # Def: putative iron uptake factor # Organism: B.vulgatus # Pathway: not_defined # 49 260 117 373 377 142 32.0 2e-32 MRKICLKIHKWLALPLGVFMAILCFTGFLLLVIKDISPLLGMEEGVEPFYKAVKQLHRWL FMVPNNPHGGLSVGRVIMAVSSMCMALVLLTGVVVWWPKSKKMLKSRLKVSTNKGFRRFV YDTHVSLGIYAFVFLFLMAITGPVFSFGWYRQGMSKLFGQNTEKMENKMPKAGDKPTAAK ENAFANGKMEQAKEQTQPQAGGANDTKKEGDGKKPGGKKLFKSLHTGKWGGWFSKILHAL AALVGGFLPISGYYLWWKKRQGKHAKC >gi|283510475|gb|ACQH01000144.1| GENE 8 8965 - 10278 1564 437 aa, chain - ## HITS:1 COG:SPy0341 KEGG:ns NR:ns ## COG: SPy0341 COG1160 # Protein_GI_number: 15674498 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Streptococcus pyogenes M1 GAS # 5 437 6 435 436 373 46.0 1e-103 MANLVAIVGRPNVGKSTLFNRLTKTRSAIVSDTAGTTRDRQYGKCDWAGREFSVVDTGGW VVNSDDIFEDAIRRQVLVATEEADLVLFLVDVNTGVTDLDEDVAQILRRTKVPVVLVVNK 
ADNNEQIYEAPAFYSLGLGDPFPISAATGSGTGDLLDAVIAQLKPGENENVEDGIPRFAV VGRPNAGKSSIVNAFIGEDRNIVTEIAGTTRDSIYTRFDKFGFDFYLVDTAGIRRKNKVT EDLEFYSVMRSIRAIEHSDVCILMIDATRGIESQDMNIFQLIQKNQKSLVVVVNKWDLVP DKDQKVIKTFENAIRERMAPFVDFPIIFASALTKQRIFKVLETAKQVYLNRKARVGTSKL NEVMLPLIEAYPPPSTKGKYIKIKYCTQLPNTQIPSFVFYANLPQYVKENYRRFLENKIR ENWNMHGCPINVFIRQK >gi|283510475|gb|ACQH01000144.1| GENE 9 10365 - 11246 1080 293 aa, chain - ## HITS:1 COG:SPy0476 KEGG:ns NR:ns ## COG: SPy0476 COG1159 # Protein_GI_number: 15674592 # Func_class: R General function prediction only # Function: GTPase # Organism: Streptococcus pyogenes M1 GAS # 1 291 1 294 298 237 45.0 2e-62 MHKAGFVNIVGNPNVGKSTLMNQLVGERISIATFKAQTTRHRIMGIVNTPEMQIVFSDTP GVLKPNYKLQESMLAFSVSALSDADVLLYVTDVVEDPEKNADFLEKVRGMKIPVLLLINK IDESDQKTLGDIVERWHALLPKAEILPISAKNKFGTDLLLRRIQELLPESPPYFDKDQLT DKPARFFVSEIIREKILLYYDKEIPYSVEVKVERFKETEKKIHINAVIYVERESQKGIVI GHQGVALKKVSSEARKTLERFFDKEVYLETFVKVDKDWRSSKKELDSFGYNPE >gi|283510475|gb|ACQH01000144.1| GENE 10 11372 - 12394 990 340 aa, chain - ## HITS:1 COG:CAC3578 KEGG:ns NR:ns ## COG: CAC3578 COG0332 # Protein_GI_number: 15896812 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Clostridium acetobutylicum # 8 328 6 323 325 273 43.0 3e-73 MDKINAIISGVAGYVPDYVLNNEELSRIVDTNDEWITTRTGIKERRILTEEGLGTSYLAR KAAKLLMKKTGVDPDSIDAVIVATTTPDYIHPSTASIVLGKLGLKNAFAFDFSAACCGFM YAFDVACNMIQSGRHKRIIVIGADKMSAITDYKDRATCPLFGDGAGAVMVEGTTEENVGL VDSLFLADGKGLPFLHVKAGGSVCPSSQFTVDHRLHYTYQEGRTVYRYAVSAMGDDCTKL IERNGLTPNDINYLVPHQANLRIIEAVGKRIGVQPEQVLVNIQRYGNTSAACMPLVLWDF EKQLKKGDNLIFTGFGAGFVHGASYYKWAYDGAEMAEQGK >gi|283510475|gb|ACQH01000144.1| GENE 11 12394 - 12582 268 62 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|29349241|ref|NP_812744.1| 50S ribosomal protein L32 [Bacteroides thetaiotaomicron VPI-5482] # 1 58 1 58 61 107 81 4e-23 MAHPKRRQSKTRTLKRRTHDKAVAPTLALCPNCGAYYIYHTVCPTCGQYRGKVAIVKEEV AE >gi|283510475|gb|ACQH01000144.1| GENE 12 12596 - 13123 589 175 aa, chain - ## HITS:1 COG:no 
KEGG:PRU_2936 NR:ns ## KEGG: PRU_2936 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 172 1 172 173 184 50.0 9e-46 MCDLETFKLDVKALKQGKTLLEYVLDDSFFEAIDAPEIRKGKVNVELQATRNGDLFELNF NIEGSVSIPCDLCLDDMEQAVSTKERLIVRFGEEYSEEDELVALEEDPGVIDLSWFVYEF IALNIPIKHVHAPGKCNPAMIDVLDAHAANRSGNEDEEEAIDPRWMKLKEIKKQY >gi|283510475|gb|ACQH01000144.1| GENE 13 13410 - 13715 77 101 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288803514|ref|ZP_06408945.1| ## NR: gi|288803514|ref|ZP_06408945.1| hypothetical protein HMPREF0660_01950 [Prevotella melaninogenica D18] # 1 97 147 244 246 75 41.0 9e-13 MKGDTIEVDLYWNYMLRGVLVKRFFKVLDKNHLLLFQQIWVSPDEADEPDGCRILYDFIP AHNLPPSSSFRNKQRRYLWDNKKDWKAYKRKIKEMRKGNNG Prediction of potential genes in microbial genomes Time: Sat May 28 03:00:30 2011 Seq name: gi|283510474|gb|ACQH01000145.1| Prevotella sp. oral taxon 317 str. F0108 cont2.145, whole genome shotgun sequence Length of sequence - 6000 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1423 - 2061 297 ## gi|288930110|ref|ZP_06423948.1| hypothetical protein HMPREF0670_02842 - Prom 2176 - 2235 3.4 - Term 2273 - 2321 5.1 2 2 Tu 1 . - CDS 2422 - 2799 138 ## gi|288930111|ref|ZP_06423949.1| hypothetical protein HMPREF0670_02843 - Prom 2819 - 2878 1.8 3 3 Op 1 . - CDS 2981 - 3319 196 ## gi|288800958|ref|ZP_06406415.1| hypothetical protein HMPREF0669_01355 4 3 Op 2 . - CDS 3346 - 3648 215 ## gi|288800957|ref|ZP_06406414.1| NHL/RHS/YD repeat protein 5 3 Op 3 . - CDS 3734 - 4147 340 ## gi|288930112|ref|ZP_06423950.1| hypothetical protein HMPREF0670_02844 - Prom 4185 - 4244 2.7 6 3 Op 4 . - CDS 4246 - 4620 357 ## gi|260910023|ref|ZP_05916706.1| conserved hypothetical protein - Prom 4758 - 4817 2.8 7 4 Tu 1 . 
- CDS 5009 - 5626 258 ## gi|288930113|ref|ZP_06423951.1| hypothetical protein HMPREF0670_02845 - Prom 5735 - 5794 2.7 Predicted protein(s) >gi|283510474|gb|ACQH01000145.1| GENE 1 1423 - 2061 297 212 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930110|ref|ZP_06423948.1| ## NR: gi|288930110|ref|ZP_06423948.1| hypothetical protein HMPREF0670_02842 [Prevotella sp. oral taxon 317 str. F0108] # 1 212 1 212 212 442 100.0 1e-123 MKKNFVIFMLASICLLSAHAQRSCKDCIQDLYKVVEGAQLDSISIGHSFYSVKSLYQGKG HGLVVGAIAKARVFSYGNPLDSVVMLDLGDKALYFMVNTEPPRNFKCADINCVYDGEGRN LLDKEDYMRFPAVINDPDGFTFIREGPSTTFKVKAKIEKDKIFFYTPILSSDWYRVFLRD GGPCIGYIHRSRILPYDKCPTKIKRKMEKLML >gi|283510474|gb|ACQH01000145.1| GENE 2 2422 - 2799 138 125 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930111|ref|ZP_06423949.1| ## NR: gi|288930111|ref|ZP_06423949.1| hypothetical protein HMPREF0670_02843 [Prevotella sp. oral taxon 317 str. F0108] # 1 125 1 125 125 259 100.0 5e-68 MAIGHKNTIRRPTPEELLLIEFLARKAKYPLAHEWQQGVWAEPTTDDKIGPIAIAMNNGE PVKCKPSHVVSDCMFYDADDKGVAAYLLVDDDGCLCELDLWKGGGPEIFSLLSSTDKFKD IPMGK >gi|283510474|gb|ACQH01000145.1| GENE 3 2981 - 3319 196 112 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288800958|ref|ZP_06406415.1| ## NR: gi|288800958|ref|ZP_06406415.1| hypothetical protein HMPREF0669_01355 [Prevotella sp. oral taxon 299 str. F0039] # 2 111 3 105 109 108 48.0 1e-22 MKAIVISMFVILFACCIASTNPHLNENKNAIVKKESLYGYWSLDGVVWLRITRDRIYFVD EDGQPSVRYSLNKDTLTWYFDGIAPQKDIISILNDTLFTRNDEGVGKYVRVR >gi|283510474|gb|ACQH01000145.1| GENE 4 3346 - 3648 215 100 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288800957|ref|ZP_06406414.1| ## NR: gi|288800957|ref|ZP_06406414.1| NHL/RHS/YD repeat protein [Prevotella sp. oral taxon 299 str. 
F0039] # 48 100 328 380 381 94 79.0 2e-18 MENKGCIRRPTPEELLLVGYLAQKLNMHSKLICNKVYGQSLSWKRGWADFNKLNVIDISP SSVSNPRAFHKSLINNVKISRVLDPWTKTKDPVFHIEIPQ >gi|283510474|gb|ACQH01000145.1| GENE 5 3734 - 4147 340 137 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930112|ref|ZP_06423950.1| ## NR: gi|288930112|ref|ZP_06423950.1| hypothetical protein HMPREF0670_02844 [Prevotella sp. oral taxon 317 str. F0108] # 1 137 1 137 137 254 100.0 1e-66 MIYFKFHFKPFCHDDASKQEKRSFLGRLFSKLFHHLEANPDFEESIDFVAEWLIEYDDVE FHQAVREMGFDTNKNLIVKLPDERNYGYWSDLDCDINFFKKFNYQLITKETFDLLWNSTR HDRELKKFVPINDVNGQ >gi|283510474|gb|ACQH01000145.1| GENE 6 4246 - 4620 357 124 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260910023|ref|ZP_05916706.1| ## NR: gi|260910023|ref|ZP_05916706.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. F0295] # 1 124 1 124 124 217 88.0 2e-55 MTTKELADHYGIKYVSLQEYGVYSVAFLPDDALRMVNILREEITPILGGDVYVKKNDPTR VYPTDDGWYYERKCDELIKEYTDNSCDEAIKYIQIVLDSVTKGCYEEGSDVLFELVILDL NNQS >gi|283510474|gb|ACQH01000145.1| GENE 7 5009 - 5626 258 205 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930113|ref|ZP_06423951.1| ## NR: gi|288930113|ref|ZP_06423951.1| hypothetical protein HMPREF0670_02845 [Prevotella sp. oral taxon 317 str. F0108] # 1 205 4 208 208 400 100.0 1e-110 MIYFKFNCTAFYRKHLAYKTMKAILLSIAVIPLIGCASLPKYTILNRQHSIYYTREQTSE LNKRVCTLIRKSKERRDFDSVYVYHIAICQLTSQLGGTAKSPSEFIKEDFTDSVFLSMSR LGRMKVKALDQGKTFLSTESIVLKKDKSLGGVGWIQFFSAASSGVKTFLPEDLSALVAIL EDDDFIGFFQVIKSIQQGGKRPFSQ Prediction of potential genes in microbial genomes Time: Sat May 28 03:01:20 2011 Seq name: gi|283510473|gb|ACQH01000146.1| Prevotella sp. oral taxon 317 str. F0108 cont2.146, whole genome shotgun sequence Length of sequence - 1259 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 65 - 754 282 ## gi|288930115|ref|ZP_06423952.1| hypothetical protein HMPREF0670_02846 2 1 Op 2 . 
+ CDS 723 - 1232 247 ## gi|288930116|ref|ZP_06423953.1| hypothetical protein HMPREF0670_02847 Predicted protein(s) >gi|283510473|gb|ACQH01000146.1| GENE 1 65 - 754 282 229 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930115|ref|ZP_06423952.1| ## NR: gi|288930115|ref|ZP_06423952.1| hypothetical protein HMPREF0670_02846 [Prevotella sp. oral taxon 317 str. F0108] # 1 229 1 229 229 439 100.0 1e-122 MDSDGAKVHVANAASFSVLLSSLPAGVRSVIVLDKNLYLDLSSVEEACRRYPTSKNLDIL KRIVSDRRTVDFDATSKTYMYMDSSGKIESKDFKEPNRSNAYQDFMSKYSGPEEQRQLYS LTLLKQGIKDEIEVSANLGATLRPADEQKAFPGGEISPSKNFKVFINPFATPEEQAKAVG HEFGGHLYMYLIGKDPRHGGSTGTQDGNIELENQIKEREHESIRNFKEK >gi|283510473|gb|ACQH01000146.1| GENE 2 723 - 1232 247 169 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930116|ref|ZP_06423953.1| ## NR: gi|288930116|ref|ZP_06423953.1| hypothetical protein HMPREF0670_02847 [Prevotella sp. oral taxon 317 str. F0108] # 1 169 1 169 169 325 100.0 5e-88 MNPFVILKRNKIRLCIYMLLFISNASCGQNRISISDKDKESIVLYDVQGLKYKQIWRFNN KVVGYCYFSRKGEIIDSTTWNVEIQTTLKLRDKILSKFRVDDETFTPGTAGVLLLAMPKC NIVELRLIRGLTDSFNKEMLRVLREVEQDILVFSANPIAVIVPIRITID Prediction of potential genes in microbial genomes Time: Sat May 28 03:01:37 2011 Seq name: gi|283510472|gb|ACQH01000147.1| Prevotella sp. oral taxon 317 str. F0108 cont2.147, whole genome shotgun sequence Length of sequence - 1421 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 54 - 911 299 ## Hoch_2925 YD repeat protein 2 1 Op 2 . 
+ CDS 921 - 1349 171 ## gi|288930118|ref|ZP_06423955.1| hypothetical protein HMPREF0670_02849 Predicted protein(s) >gi|283510472|gb|ACQH01000147.1| GENE 1 54 - 911 299 285 aa, chain + ## HITS:1 COG:no KEGG:Hoch_2925 NR:ns ## KEGG: Hoch_2925 # Name: not_defined # Def: YD repeat protein # Organism: H.ochraceum # Pathway: not_defined # 2 81 3203 3294 3456 79 39.0 2e-13 MPYKFNGKEFDQETGLYYYGARYMNPRTSLWYGVDPLMEKYPNVNGYCYTMDNPIKYIDP NGKNMSEYDTGGKKISNLGGDKIDFIHQNNGDVQIIDKTNNHTNIIKNGSKYIKNYTQRN SSVSWNDITKEFMTDSGPVYSLFSDFSNKGEGAFGSLESYNSIYGLAARSDVLRSNKSKN TMAITTLYANPLSAGFDPWEQMIGEANISWYNLGNDVLFMLVDSKSNTSLYYHIPFINNK KRGQDGFGHGNTYQTYIWLETKKDIKQKNDFYLEMLQKRMEAIHR >gi|283510472|gb|ACQH01000147.1| GENE 2 921 - 1349 171 142 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930118|ref|ZP_06423955.1| ## NR: gi|288930118|ref|ZP_06423955.1| hypothetical protein HMPREF0670_02849 [Prevotella sp. oral taxon 317 str. F0108] # 1 142 1 142 142 288 100.0 1e-76 MKKIIAIVILIMLVGFFIWNQRASFEDPIGTYINESNVRDTICLFSMGRFEQVVYDKAGR PIYHCKSKWRKTSHGIAIDSILLYDNLSSLNVDQQENKLYEGMSYDGFQPSYKNGRFIIS WNDYVDTPESAISFYRIRTLSK Prediction of potential genes in microbial genomes Time: Sat May 28 03:01:50 2011 Seq name: gi|283510471|gb|ACQH01000148.1| Prevotella sp. oral taxon 317 str. F0108 cont2.148, whole genome shotgun sequence Length of sequence - 3875 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 2041 1432 ## COG3209 Rhs family protein 2 1 Op 2 . + CDS 2038 - 2955 267 ## Cthe_0003 ankyrin repeat-containing protein 3 1 Op 3 . + CDS 2982 - 3329 133 ## ZPR_3002 hypothetical protein 4 1 Op 4 . 
+ CDS 3396 - 3873 553 ## Hoch_2925 YD repeat protein Predicted protein(s) >gi|283510471|gb|ACQH01000148.1| GENE 1 2 - 2041 1432 679 aa, chain + ## HITS:1 COG:MA2043 KEGG:ns NR:ns ## COG: MA2043 COG3209 # Protein_GI_number: 20090890 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Methanosarcina acetivorans str.C2A # 343 450 124 229 440 63 36.0 9e-10 NILGIVNAVDPTQANSNNNNNSNNNSNNNKAKLGGAFNHTYAYDDLNRLIRANGEAKGAK YEMTMTFGRMSEPLTKVQKVDSTKTAQSYDFTYKYEDSNHPTAPTQIGHEHYTYDANGNP TLVENDSLNTERRMYWDEDNRLMVLSDNGKTSRYTYNAAGERIVKSHGDLEGVYVNGAPQ GMTFHETEDYTIYPAPIITVTKNRFTKHYFIGDKRIASKLGTGKFSNVYGISSNNVTAGQ KDYAARMLQIEKQREEYYKSLGTPPGVPTMKGATADPDNTGYGYNTIIGDLGDHSVPEGW VQRPKFNDRGDVPGPPIQWQKPEDPDNAQPGYGYVPADTAHHEDIFFYHSDHLGSTSYIT DAKANITQFDAYLPYGELLVDEHSSSEEMPYKFNGKELDQETGLYYYGARYMNPVTSLWY GVDPQIEKMPKDGAYSYCFGNPIKLIDPNGERPTEEEAARISAHVYRTEKVTLTGGWHLS KLNNKLKGVIWNDASSGFKSALYERTYKGKTEYVYATAGTDFTEVSDWKNNIKQLVGMSE QYDISMRNAEKISQVTGNSELTFTGHSLGGGLAAANAYKTGRSAMTFNAAWVSPFTVFSR KLARIDAYINWGDELNTVQSAVGLRADGVPHYRFNRRALLGHSIQNFYRTDLELMKDNIK AKNELYQRMYMESFHGFPQ >gi|283510471|gb|ACQH01000148.1| GENE 2 2038 - 2955 267 305 aa, chain + ## HITS:1 COG:no KEGG:Cthe_0003 NR:ns ## KEGG: Cthe_0003 # Name: not_defined # Def: ankyrin repeat-containing protein # Organism: C.thermocellum # Pathway: not_defined # 29 305 27 309 309 102 30.0 1e-20 MNYSKYKYEAYFILSIVCFFSCKEPNVKDMLGDDYRLYKYTPAWNLAKAVEDEDTTEILR QILQKHIPVDYRDPKYKQTLLMLATRTNKIESVKKLLELGAAPNAHDDSTKYFGENAVLL ACRFSRPSNEILALLLKYGGNPNSTSCGVEENGLGEIVPIRTFALSAAVSSSFEKVKLLV DAGANINYSTPTECCAIENCMIHDRMDIMLFLLLKGADFRRNFTEIDLDNPDYPSFKVNI LYKLRKCVYPLNSKEYNGKMKIVDFLKKRGLDYWKSPMPRGIYGVIMRDIDPKSKADFEY YIKHY >gi|283510471|gb|ACQH01000148.1| GENE 3 2982 - 3329 133 115 aa, chain + ## HITS:1 COG:no KEGG:ZPR_3002 NR:ns ## KEGG: ZPR_3002 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 1 83 1 83 85 61 42.0 1e-08 MKLKNRYSFDDEIAEKFCYDNVNRCIQICFSGYFDFIANKHIESKCKFVICNWQQAKSKI 
IDREGYSSLDSNLGIISLILDAEEDGDNLKLIVNTLDDKYIELFFFSVETLVLSL >gi|283510471|gb|ACQH01000148.1| GENE 4 3396 - 3873 553 159 aa, chain + ## HITS:1 COG:no KEGG:Hoch_2925 NR:ns ## KEGG: Hoch_2925 # Name: not_defined # Def: YD repeat protein # Organism: H.ochraceum # Pathway: not_defined # 1 158 2692 2854 3456 106 38.0 3e-22 MAYDAYGNLLTKLTAELRKRISDKAPVTYTYDYERLSEVLYPKNLFNRVTYTYGKPGEKY NRAGRLVLVEDASGGEAYYYGNQGEVVKTVRSVMVSTADVRTYVYGATYDSWNRIRTMTY PDGEVVTYHYNAAGQIVRLSGNKQGRESVIVDRIGYDKD Prediction of potential genes in microbial genomes Time: Sat May 28 03:01:59 2011 Seq name: gi|283510470|gb|ACQH01000149.1| Prevotella sp. oral taxon 317 str. F0108 cont2.149, whole genome shotgun sequence Length of sequence - 2906 bp Number of predicted genes - 4, with homology - 2 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 904 517 ## COG3209 Rhs family protein + Term 1052 - 1089 0.5 - Term 532 - 570 3.2 2 2 Tu 1 . - CDS 809 - 1027 119 ## - Prom 1219 - 1278 9.1 + Prom 975 - 1034 5.2 3 3 Op 1 . + CDS 1212 - 1406 159 ## 4 3 Op 2 . 
+ CDS 1419 - 2654 807 ## COG3209 Rhs family protein Predicted protein(s) >gi|283510470|gb|ACQH01000149.1| GENE 1 2 - 904 517 300 aa, chain + ## HITS:1 COG:YPO2380 KEGG:ns NR:ns ## COG: YPO2380 COG3209 # Protein_GI_number: 16122603 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Yersinia pestis # 5 94 598 688 984 62 38.0 1e-09 SYITDKDANITQFDAYLPYGELMVDEYSSSEDMPYKFNGKELDEETGLYYYGARYMDPKI SMWLGVDPLAEKYPEISPYIYCHNNPINLIDPNGKNDYILNLDGEIKLIKENSNDFNTIF AYDGQGGIDKNTKIDVPKSFNESKTTTQVQGRTSKDADGNKISTYEFDLYTVDDEQIASN IFEFLAKNSDVEWSNTKVSSKDNTSFNLISTSHKEGSEVGMRYTFEKYKNNHNFTIDYAI HSHLPESVGYSPGDIKFAKDAKSIYPNIELKIYNRIEYKSFDQYDIPGVLNEVVVPPQKR >gi|283510470|gb|ACQH01000149.1| GENE 2 809 - 1027 119 72 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIDNELFVAQYNGIETPIPFLPHTKEQEVRQIIIQIIRYIYLSFLRWNNNFIQYPGNIIL VKTFIFYSIVNF >gi|283510470|gb|ACQH01000149.1| GENE 3 1212 - 1406 159 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNGTFFYLLNRNVNNDVIDLFIKPINAQLHLKKESPQLNDDTLFIFKESIAFKYESGKFK RYKR >gi|283510470|gb|ACQH01000149.1| GENE 4 1419 - 2654 807 411 aa, chain + ## HITS:1 COG:MA2043 KEGG:ns NR:ns ## COG: MA2043 COG3209 # Protein_GI_number: 20090890 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Methanosarcina acetivorans str.C2A # 112 222 124 232 440 63 35.0 1e-09 MNDQGTAGQRDYAARMQNIEAQREEYYKSLGTPPGVPTMKGATADPDNTGRGYNKIIGDL GDHSVPEGWVQRPKFNDKGDVPEPPIQWQKPEDPDNAQPGYGYVPADTAHHEDIFFYHSD HLGSTSYITDSKANITQFDAYLPYGELLVDEHSSSEEMPYKFNGKEFDQETGLYYYGARY MNPRTSLWYGVDPLAEKYVSTGSYVYCIDNPIRLIDPDGRKVKADLNSQTNIYNILTKED AKYVKFDKNGILDVHSLQKLDSKSDIIQAIKTLANSDITYSFEVKSEDDHGKKFKDGDFN FYRGVTEIPGAESQPSKDNDVHIIVGINLSEKQQAKTTAHEGMGHAYIYELTRDYKKASH DYQPTVVAVEWDEELQQMVTVMGREDKNVLLAPQIKKVVSEAERNYNENKK Prediction of potential genes in microbial genomes Time: Sat May 28 03:02:09 2011 Seq name: gi|283510469|gb|ACQH01000150.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.150, whole genome shotgun sequence Length of sequence - 1810 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 54 - 857 411 ## BT_1993 putative cell wall-associated protein precursor 2 1 Op 2 . + CDS 784 - 1269 229 ## gi|288930126|ref|ZP_06423963.1| hypothetical protein HMPREF0670_02857 3 1 Op 3 . + CDS 1238 - 1808 504 ## Hoch_2925 YD repeat protein Predicted protein(s) >gi|283510469|gb|ACQH01000150.1| GENE 1 54 - 857 411 267 aa, chain + ## HITS:1 COG:no KEGG:BT_1993 NR:ns ## KEGG: BT_1993 # Name: not_defined # Def: putative cell wall-associated protein precursor # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 65 1043 1108 1316 72 51.0 2e-11 MPYKFNGKEFDQETGLYYYGARYMDPMASIWYGVDKLTEKYVSVSSYTYCNGNPIANIDV MGMFPKGIVVKHVETVVQYQAVGSANTLMGIPHEVTYYTFTESAAHLLSLVANVPITYVK NARLEEVILQPEGNCITIGGSPDNARILVSPYYLDNSKGGQDYDLWFREFSHEVGHIKQI ARDKSLTKYLLKTIAGYIKAGNHDDASREIEAEQGSDTFNTFRRFVKTHFKASVENLFKN DKLDEKDKIEQINKWWNEFKKETNNKK >gi|283510469|gb|ACQH01000150.1| GENE 2 784 - 1269 229 161 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930126|ref|ZP_06423963.1| ## NR: gi|288930126|ref|ZP_06423963.1| hypothetical protein HMPREF0670_02857 [Prevotella sp. oral taxon 317 str. 
F0108] # 28 161 1 134 134 254 99.0 1e-66 MRKTKSSKSINGGMNSKKRQTIKNRIILVTIAFTLCACDNHDFTFDEEKQTFYVYDMLGF YIEPDGEKELFYSYDIKLKEKKKEKGIDTLDLNNISSKYQVEACFPNIDTVINNPKRAVL VPNTRYRVLHMGMGRVYGIKYYHTDSAGRLRNEEKPEPNTK >gi|283510469|gb|ACQH01000150.1| GENE 3 1238 - 1808 504 190 aa, chain + ## HITS:1 COG:no KEGG:Hoch_2925 NR:ns ## KEGG: Hoch_2925 # Name: not_defined # Def: YD repeat protein # Organism: H.ochraceum # Pathway: not_defined # 19 153 2925 3072 3456 76 34.0 4e-13 MKKSQSQIQNNKTTLGGSTYDAQNRLIHANGKAKTASYQLDMAYGIMSEPLTKVQKVDST KTAQSYDFTYKYEDSNHPTAPTQIGHEHYTYDANGNPTLVENDSLNTERRMYWDEDNRLM VLSDNGKTSRYTYNAAGERVVKSHGDLEGVYVNGAPQGITFHETEDYTIYPAPIITVTKN RFTKHYFIGD Prediction of potential genes in microbial genomes Time: Sat May 28 03:02:25 2011 Seq name: gi|283510468|gb|ACQH01000151.1| Prevotella sp. oral taxon 317 str. F0108 cont2.151, whole genome shotgun sequence Length of sequence - 4682 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 1008 115 ## Hoch_2925 YD repeat protein 2 1 Op 2 . + CDS 1040 - 1654 326 ## gi|288930129|ref|ZP_06423966.1| hypothetical protein HMPREF0670_02860 + Prom 1916 - 1975 4.9 3 2 Op 1 . + CDS 2036 - 2344 154 ## gi|288930130|ref|ZP_06423967.1| hypothetical protein HMPREF0670_02861 4 2 Op 2 . 
+ CDS 2338 - 4681 2194 ## Hoch_2925 YD repeat protein Predicted protein(s) >gi|283510468|gb|ACQH01000151.1| GENE 1 1 - 1008 115 335 aa, chain + ## HITS:1 COG:no KEGG:Hoch_2925 NR:ns ## KEGG: Hoch_2925 # Name: not_defined # Def: YD repeat protein # Organism: H.ochraceum # Pathway: not_defined # 1 134 3170 3310 3456 99 34.0 3e-19 SYITDKDANITQFNAYLPYGELLVDEHSSSEEMPYKFNGKELDEETGLYYYGARYMDPKI SMWYGSDPLSEEYENVSAFVYCHGNPICLFDPDGQGDYYTNEGVWLGSDKKKDNFVYTAS GVHQSKDKNGNMVNVFENPQKLSIHHSKFISQSSTVYGESSAYRVRDKKSEPSEDLKKEM YAIASVHQRNSKAYGISSEPAKDFRSKSAKERNDLPLMRTAIAAEINALKNGIDYSYGAT MWDGAEQAQFSENEQRRSNGHFEIHMNTMGWNISPKHYAQWKNNIGKAFKAPMIRAARDS FYNSVTKKSIPNPNAGKMRLQSTAVYGRTIFWKTN >gi|283510468|gb|ACQH01000151.1| GENE 2 1040 - 1654 326 204 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930129|ref|ZP_06423966.1| ## NR: gi|288930129|ref|ZP_06423966.1| hypothetical protein HMPREF0670_02860 [Prevotella sp. oral taxon 317 str. F0108] # 1 204 34 237 237 339 100.0 8e-92 MKVFLLLLLLYFPLNSKVAESHQVDVTLSYESGYTSKSLYINGKKEAVYKSRKKKPIRLT KVEKDMQNIFFEQLKYLESWGVIIKYPEVIFEDLCIISDILSSYDKSEIQSFIQTKDNCK TINILKINKYYTYINARVDYVLTLTIVIEKGLLKEFRYKYMPDFSNETISYRCQYKYDNY GRIIYITILGKEKNEIRITYASLR >gi|283510468|gb|ACQH01000151.1| GENE 3 2036 - 2344 154 102 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930130|ref|ZP_06423967.1| ## NR: gi|288930130|ref|ZP_06423967.1| hypothetical protein HMPREF0670_02861 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 102 29 130 130 185 100.0 9e-46 MIIENKSKVDTVLFSLYEQIKIKKKWVTSNYDLFCDVNNPNTSIFIIYPNQKIDLRAMKS NVIYTDEKIGKCKFLSMKSKRVLLVGNIKGSGTSISIHSDAW >gi|283510468|gb|ACQH01000151.1| GENE 4 2338 - 4681 2194 781 aa, chain + ## HITS:1 COG:no KEGG:Hoch_2925 NR:ns ## KEGG: Hoch_2925 # Name: not_defined # Def: YD repeat protein # Organism: H.ochraceum # Pathway: not_defined # 26 767 1771 2553 3456 300 30.0 2e-79 MVKTEASGNTVSASGKFTAKFLKAGISGSKSWQDSYTNVSCMDINGDGYPDWINERKEEI KAQITTPTGALSSHIIEPNVPNPKFKASSGTVGIEVGAELKDGGNAGDGKAESPTIAEQW SFSKKSLIAAAKANETSSGSVSASGNYTSSKTSTERDWTDLNGDGLPDMLDGGSVRYNLG NSFANAQESGCGGVGTSKSKTWGAGGGVSIPIAGKVNISFGFNGTRTTSESTSSMNDLNG DGLPDIITQEGDDLYVAYNTGNGFLPKIKFIQNAALARSVSSSVSEFANVGVAIPLLFVT LLPRGIASKSDGVSCNSTALIDIDGDGYPDYVSENGPNELKVRRNLTARTNLLKGVTLPF GGHIALTYEKTRPSFAMPGSRYVLKSVETTGGYAENGATRMRNEFEYEGGYRDRRERDFY GFEKVTTKQIDTQNGNAVYRKQVSEYGHNRNLYMHDLVTAETLYDAAGNKLQGTQNTYEL KQQADTTVFFPALASVKQTIYDNNGAGSMSTTVHNTYDAYGNLASYKETATNYELDADIA YHDLQAKYIVSVPKHIAVKDKGGKVYRERSTQINGHGDITSITMHNGTKPSVYDMTYDAY GNLASLTKPENHKGQRMRYDYTYDDVLHMLVTNVRDAYGYTSSTVYDYKWAVPVETSDLN GNKMRYAYDDMGRPSTIVGPKEIAAGKPYTIRFEYHPAMRHARTVHYAPEGDIETYTFAD SLMRAVQTKQTGVVWTGGSNQKVSIVSGRAVVDAFGRTVRAFYPTTESYGSIGLYNKGVG D Prediction of potential genes in microbial genomes Time: Sat May 28 03:02:52 2011 Seq name: gi|283510467|gb|ACQH01000152.1| Prevotella sp. oral taxon 317 str. F0108 cont2.152, whole genome shotgun sequence Length of sequence - 1024 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 640 137 ## gi|288930133|ref|ZP_06423969.1| hypothetical protein HMPREF0670_02863 Predicted protein(s) >gi|283510467|gb|ACQH01000152.1| GENE 1 2 - 640 137 212 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930133|ref|ZP_06423969.1| ## NR: gi|288930133|ref|ZP_06423969.1| hypothetical protein HMPREF0670_02863 [Prevotella sp. oral taxon 317 str. 
F0108] # 54 212 1 159 159 308 100.0 2e-82 NVVFKAEKFADFRRLISRSGKHATGKDFNKIDPVAYSNYIANNSELSNDDKALMDILVNS INSDDFTYKFQYVSENDELPDKSMLAPQLVEEIESTGHTVLTKLLGGSQTKQTKKGSYSL VVEDDRSPRDYYNNLGEQVPNPLGRVLYVFHEVFGHGRSLSLGRGSDNQHGDAIRLENLV LRVAGFSNIQRNGEDHGPKTKIPNNSQLPDYR Prediction of potential genes in microbial genomes Time: Sat May 28 03:03:02 2011 Seq name: gi|283510466|gb|ACQH01000153.1| Prevotella sp. oral taxon 317 str. F0108 cont2.153, whole genome shotgun sequence Length of sequence - 2402 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 110 - 169 8.6 1 1 Op 1 . + CDS 275 - 727 228 ## gi|288930135|ref|ZP_06423971.1| hypothetical protein HMPREF0670_02865 2 1 Op 2 . + CDS 687 - 1280 327 ## gi|288930136|ref|ZP_06423972.1| hypothetical protein HMPREF0670_02866 3 2 Tu 1 . - CDS 1285 - 1488 113 ## - Prom 1576 - 1635 4.5 + Prom 1447 - 1506 3.3 4 3 Tu 1 . + CDS 1708 - 2304 315 ## gi|288930138|ref|ZP_06423974.1| hypothetical protein HMPREF0670_02868 Predicted protein(s) >gi|283510466|gb|ACQH01000153.1| GENE 1 275 - 727 228 150 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930135|ref|ZP_06423971.1| ## NR: gi|288930135|ref|ZP_06423971.1| hypothetical protein HMPREF0670_02865 [Prevotella sp. oral taxon 317 str. F0108] # 1 150 1 150 150 257 100.0 2e-67 MGGSVGREVCIQGKADLNAVFSDGTTRTAATFEFNSGPYGNGPTPNNSYEAFGAVPTNEA GMLNNGYTGWKVLLPNYNGRSGLRVHPDTNSPGTKGCIGIVGCYEELKNLGNFFNKYIGP SGKNRMIFNFNIKGNPNYGNEGKANSRLAQ >gi|283510466|gb|ACQH01000153.1| GENE 2 687 - 1280 327 197 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930136|ref|ZP_06423972.1| ## NR: gi|288930136|ref|ZP_06423972.1| hypothetical protein HMPREF0670_02866 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 197 1 197 197 401 100.0 1e-110 MAMKVKLILVLLNSLFTIQACIRQETKERKKIYVAEHIKKEVPIPLDTSVNGIILDDTVS LNNILGKRLPVIDVSDYGECTVCSNIDGSEYLILCINYGGYINQYDRFVVVPSHSITQSR RIQQTSFKKFCSGLGAYIGCDMKEIEKQMMMMKKVQYKTAEGVEIVYSQSLPDWPYHCEY HFVNNRLTRYEFDYRDP >gi|283510466|gb|ACQH01000153.1| GENE 3 1285 - 1488 113 67 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFIFIGIITFIAITGVVGISRFTRHCRNTGRCTKRLVRLLALPFYLHHAGGVNFLTRRES VSFCLAI >gi|283510466|gb|ACQH01000153.1| GENE 4 1708 - 2304 315 198 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930138|ref|ZP_06423974.1| ## NR: gi|288930138|ref|ZP_06423974.1| hypothetical protein HMPREF0670_02868 [Prevotella sp. oral taxon 317 str. F0108] # 1 198 32 229 229 352 100.0 7e-96 MEKDTLLKSFGISYVDLKGYKIIHINKRTSCVLQPLVTSDKGETNYFRLKLNKDKKTVYL IDSILSVGDILYNSRTTGIIFPIIKYQNEDDFSTTGEIRYFNTDKLLSDCIENNLENSEA VCFDNSGLFCLYMSADTLFAYNISTKEKKSIFTFDNPMMYSVELKLKNNILTLIYYSNFV EDFSDLNSAKVINFDYQE Prediction of potential genes in microbial genomes Time: Sat May 28 03:03:32 2011 Seq name: gi|283510465|gb|ACQH01000154.1| Prevotella sp. oral taxon 317 str. F0108 cont2.154, whole genome shotgun sequence Length of sequence - 7047 bp Number of predicted genes - 8, with homology - 7 Number of transcription units - 5, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 57 - 1340 629 ## COG3209 Rhs family protein + Term 1358 - 1409 2.1 + Prom 1352 - 1411 5.0 2 1 Op 2 . + CDS 1438 - 1980 6 ## gi|288930140|ref|ZP_06423976.1| hypothetical protein HMPREF0670_02870 + Prom 2153 - 2212 4.4 3 2 Op 1 . + CDS 2248 - 2775 214 ## gi|288930141|ref|ZP_06423977.1| hypothetical protein HMPREF0670_02871 4 2 Op 2 . + CDS 2828 - 3103 61 ## gi|288930234|ref|ZP_06424029.1| hypothetical protein HMPREF0670_02923 5 2 Op 3 . + CDS 3103 - 3720 218 ## gi|288930142|ref|ZP_06423978.1| hypothetical protein HMPREF0670_02872 + Prom 5098 - 5157 3.8 6 3 Tu 1 . + CDS 5226 - 5597 137 ## + Prom 5795 - 5854 3.3 7 4 Tu 1 . 
+ CDS 5874 - 6335 234 ## gi|288802781|ref|ZP_06408218.1| hypothetical protein HMPREF0660_01223 + Prom 6374 - 6433 4.0 8 5 Tu 1 . + CDS 6508 - 6897 203 ## gi|260911273|ref|ZP_05917873.1| conserved hypothetical protein Predicted protein(s) >gi|283510465|gb|ACQH01000154.1| GENE 1 57 - 1340 629 427 aa, chain + ## HITS:1 COG:MA2043 KEGG:ns NR:ns ## COG: MA2043 COG3209 # Protein_GI_number: 20090890 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Methanosarcina acetivorans str.C2A # 97 205 123 229 440 60 36.0 6e-09 MLQIEKQREEYYRKLGTSPGVPTMKGATADPDNTGYGYNTIIGELGDHSVPEGWVQRPKR NTTPGTPPGPPIQWNKPEDPDNAQPGYGYVPTDTTNTEDIFFYHSDHLGSTSYITDAKAN ITQFDAYLPYGELLIDEHSSTKEMPYKFNGKELDTETGLYYYGARYMNPVTSLWYGVDRY AEKYPFASCYSYCLGNPLKFIDINGDSTVIDNSGYIKHYNPNDKDLRVFLNGKSIGQLGG KINANGWFNNLLAENVDEAKSIWNPFTFKKNVQQYGKWDYKYRSPANRSNNLRSHILGIA FYRKDGEKGPGDLGETIFRFNGMNDRAEDLNNFHFGVVGKALLFPFAETFMMRMAGDTEM SKWQDDYRAGKRSTPHVPASWRPMTITGYNPSAMGGTPIYELGSPYGDNPTDNRWIRLGF NYYKTRK >gi|283510465|gb|ACQH01000154.1| GENE 2 1438 - 1980 6 180 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930140|ref|ZP_06423976.1| ## NR: gi|288930140|ref|ZP_06423976.1| hypothetical protein HMPREF0670_02870 [Prevotella sp. oral taxon 317 str. F0108] # 1 180 1 180 180 307 100.0 1e-82 MLLLLSASNSCKRSASCSRQYLIENFIRNEECFAQFVSLYQKQIPSVIIQKFQVQIELND YKDDIHIILIPHIIGERKYVLENVRRNHLKYQKKLAALNLKTKDIESLVSCLNSADCHTV RNVNYYKSSVEMIPIQNGNVSHSYLYHNDEISADMVSVIGKPISQSRLGRHFTLSNESML >gi|283510465|gb|ACQH01000154.1| GENE 3 2248 - 2775 214 175 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930141|ref|ZP_06423977.1| ## NR: gi|288930141|ref|ZP_06423977.1| hypothetical protein HMPREF0670_02871 [Prevotella sp. oral taxon 317 str. 
F0108] # 8 175 1 168 168 326 100.0 4e-88 MRSILAIMLLSILLCSCYKSSKKEGVHAQLKDEVYSIVVSYMDSHPQYNTFILTEAMEQT MSGIKTSGYFLGPGYPRLLPPEKSTSYLDINNCRLYIISNLSSLYEFDNDASMWKNTNPT DSVVLDGVVIYDLVFNFLHNGLFIYYSNSKIKIMERPDTFFTPYIKSAVVFENKR >gi|283510465|gb|ACQH01000154.1| GENE 4 2828 - 3103 61 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930234|ref|ZP_06424029.1| ## NR: gi|288930234|ref|ZP_06424029.1| hypothetical protein HMPREF0670_02923 [Prevotella sp. oral taxon 317 str. F0108] # 23 65 134 176 182 77 86.0 3e-13 MCREARRIYQWCPTGNHPLAIPDFRLRLGKTSELLFHSACTEVPRNGRLYHLSRTNYHYD EEPLYNIHQRKGSPRITNENNRYPYIPKKKK >gi|283510465|gb|ACQH01000154.1| GENE 5 3103 - 3720 218 205 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930142|ref|ZP_06423978.1| ## NR: gi|288930142|ref|ZP_06423978.1| hypothetical protein HMPREF0670_02872 [Prevotella sp. oral taxon 317 str. F0108] # 1 205 1 205 205 376 100.0 1e-103 MLYTLKKYSLPGDLSIPIQIQSLQAKAIEVIDYDYIGDAKEENYISFANQFYGFLSKWDI KDIAIIIYSKAMRNTPLLRKTFRFDFDSYKFENDITIFHISIDNAYIVFVGILRIQGEKE VTRIIKFLFSQTYETQNLLVCSTEFIDKEQLKNALHKALFVEYDRYGYIRHTKIALEKLK LYSNNLKVLYPYGGTDFGSFIFFCF >gi|283510465|gb|ACQH01000154.1| GENE 6 5226 - 5597 137 123 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MISLLCCTPKKDKRIVGGDIIGHWRAVDNDYRTATYANLFFYRDSTAVFDCISDTVMFLR YKLVKDTLIIIDENKVCTKAPIQTLNSKELVLESLKDSENKQCFIRNETSGEQFDSSFVR EFM >gi|283510465|gb|ACQH01000154.1| GENE 7 5874 - 6335 234 153 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288802781|ref|ZP_06408218.1| ## NR: gi|288802781|ref|ZP_06408218.1| hypothetical protein HMPREF0660_01223 [Prevotella melaninogenica D18] # 26 146 21 146 147 87 42.0 2e-16 MILFFRISVALTMCIVFTNCTRVKHNYVEKGNWVRLYEGETIDGYTRLGDSIYGGYADSV YLHAYMRALKGVDTKSFKVCKNTGYAKDVQRVYYPLKIVCEDAEDGGGCYFEEYIIEKAN PVSFKYIGNRYATDGEKLFYEGREVDWSLLKQK >gi|283510465|gb|ACQH01000154.1| GENE 8 6508 - 6897 203 129 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260911273|ref|ZP_05917873.1| ## NR: gi|260911273|ref|ZP_05917873.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. 
F0295] # 1 128 11 138 140 218 82.0 1e-55 MVLPFLIACNENHSFSLDESRQLLIVHNFLHMEIESDCLNPSESHLFFITKKNEFDTRSC DTINFKNVSSELSVEEMSSYGITKNLRRIKFRPNTRYVVIHSGMGAQVYIKEYFWADSKG KLRRTRNPK Prediction of potential genes in microbial genomes Time: Sat May 28 03:04:22 2011 Seq name: gi|283510464|gb|ACQH01000155.1| Prevotella sp. oral taxon 317 str. F0108 cont2.155, whole genome shotgun sequence Length of sequence - 3758 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 1344 572 ## COG3209 Rhs family protein 2 1 Op 2 . + CDS 1352 - 1915 331 ## gi|288930145|ref|ZP_06423981.1| hypothetical protein HMPREF0670_02875 3 1 Op 3 . + CDS 1917 - 3756 1991 ## Hoch_2925 YD repeat protein Predicted protein(s) >gi|283510464|gb|ACQH01000155.1| GENE 1 1 - 1344 572 447 aa, chain + ## HITS:1 COG:MA2043 KEGG:ns NR:ns ## COG: MA2043 COG3209 # Protein_GI_number: 20090890 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Methanosarcina acetivorans str.C2A # 133 247 123 235 440 70 38.0 4e-12 IGDKRIASKLGTGKFSNVYGVSSNNVTAGQKDYAARMLQIEKQREEYYKSLGTPPGVPTM KGATADPDNTGYGYNSIIGDLGDHSVPEGWVQHPKRNTTPGTPPGPPIQWGAPEDPDNAQ PGYGYVPTDTTNTEEIFFYHSDHLGSTSYITDAKANITQFDAYLPYGELLVDEHSSSEDM PYKFNGKELDQETGLYYYGARYMNPVTSLWYGVDMLKEKYPNISGYSYTRGNPIRFIDPD GKIVKPGSQEALTVIRNTLTTEDMNFVQLDRNGYIDRKLINSHKSESKNFNNLTTLVNSD ILIEVSLVDGNVSYRDNEGRLQTEQITYTEPDPDFADPKAEYTSGLSTGEGGKFGLTLLP GKGESGVNSPDDNIRVYIQSRLSPEGKAETYSHEGNGHALLYVETKDRRISGHQVQGGKE GNIRLRNKILESRQETIRNMKSRENEK >gi|283510464|gb|ACQH01000155.1| GENE 2 1352 - 1915 331 187 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930145|ref|ZP_06423981.1| ## NR: gi|288930145|ref|ZP_06423981.1| hypothetical protein HMPREF0670_02875 [Prevotella sp. oral taxon 317 str. 
F0108] # 16 187 1 172 172 305 100.0 1e-81 MRYILLIIISFISQRMFPQIKNVGIERTMEYTNYFIDNKPVYLINYKLRNVADEDYWIWF DNTNIENRTSKELIRKHFFSRDGDFILCQIALDFNIESYSPDLLSNFTKEIKPKSQFMIS FYAKDVADTTIVRDYMKEHLVIIEKQKILRYIKNADSFNSKIFFSLDNMLILYEQLYSLV KSSIKDK >gi|283510464|gb|ACQH01000155.1| GENE 3 1917 - 3756 1991 613 aa, chain + ## HITS:1 COG:no KEGG:Hoch_2925 NR:ns ## KEGG: Hoch_2925 # Name: not_defined # Def: YD repeat protein # Organism: H.ochraceum # Pathway: not_defined # 103 607 1741 2257 3456 160 30.0 2e-37 MLDYNGNRAFGTQPIDTAVLKIDKEKYEDIVGGGDPDPEHISEVMTPLNEERFFVMGYNP QRDAYISVTDSTYIGGLWQCSSRMGQQEIDVDSVSYAAGGTSLPAPVLQTKATDNTVAAS GGLSMKFLSIGASGSKSWQDSYTNVACMDLNGDGYPDWIKEHKDKVNAQLTSQTGALSAE VIHPDISRPKYKTSAGTAGMNIGVPLTEKNDSKNAAAKAPSPKEQWNFKTANLIEAAKGN EASATSVSASGNYTRGTTDTERDWTDLNGDGLPDMLDGGSVRYNLGYTFTDTRQSDCGGI GTSKSTTWGAGGGVSIPVAGKVNISFGVNGTRTTSESTSSMLDVNGDGLPDIVTEDDDNL FVAYNTGNGFPPKVNFLQNANLTQNIANSVSTFGNVAVTIPVFIFRLIPRGILATALGVS RTTTALIDIDGDGYPDYVRENGPNELKVRRNLTGRTNLLKGVTLPFGGHIALIYEKTRPN FAMPGSRYVLASVETTGGYAENGATKMRNEFEYEGGYRDRRERDFYGFEKVITKQIDTQN GNAVYRKQVAEYGHNRNLYMHDLVTAETLYDAAGNKLQGTQNTYELKQQADTTVFFPALA SVRQTIYDNAGQG Prediction of potential genes in microbial genomes Time: Sat May 28 03:04:40 2011 Seq name: gi|283510463|gb|ACQH01000156.1| Prevotella sp. oral taxon 317 str. F0108 cont2.156, whole genome shotgun sequence Length of sequence - 13784 bp Number of predicted genes - 9, with homology - 8 Number of transcription units - 5, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 394 - 453 4.4 1 1 Op 1 5/0.000 + CDS 475 - 5055 3668 ## COG3513 Uncharacterized protein conserved in bacteria 2 1 Op 2 4/0.000 + CDS 5055 - 5987 785 ## COG1518 Uncharacterized protein predicted to be involved in DNA repair 3 1 Op 3 . + CDS 5995 - 6336 259 ## COG3512 Uncharacterized protein conserved in bacteria + Term 6359 - 6399 9.1 - Term 7183 - 7213 0.3 4 2 Tu 1 . - CDS 7425 - 7883 -239 ## - Prom 8055 - 8114 2.4 - Term 8281 - 8313 -0.2 5 3 Tu 1 . 
- CDS 8372 - 8737 341 ## gi|288930155|ref|ZP_06423990.1| hypothetical protein HMPREF0670_02884 - Prom 8758 - 8817 6.6 6 4 Tu 1 . - CDS 9012 - 9236 164 ## gi|288930156|ref|ZP_06423991.1| hypothetical protein HMPREF0670_02885 - Prom 9288 - 9347 4.9 7 5 Op 1 9/0.000 - CDS 9610 - 10470 779 ## PROTEIN SUPPORTED gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 8 5 Op 2 10/0.000 - CDS 10489 - 11481 894 ## COG0379 Quinolinate synthase 9 5 Op 3 . - CDS 11554 - 13143 1507 ## COG0029 Aspartate oxidase - Prom 13320 - 13379 4.6 Predicted protein(s) >gi|283510463|gb|ACQH01000156.1| GENE 1 475 - 5055 3668 1526 aa, chain + ## HITS:1 COG:NMA0631 KEGG:ns NR:ns ## COG: NMA0631 COG3513 # Protein_GI_number: 15793618 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Neisseria meningitidis Z2491 # 686 1024 433 715 1082 84 23.0 1e-15 MKNILGLDLGTNSVGWALISVDENGNLVKDIRLGSRIIPMSQDVLGKFDSGVTVSQTAER TEYRSHRRLLERALLRRERLHRVLHLLGFLPEHYAAYIGWDRNDNATYGKFLPETEPKLA WRKADDGGYRFIFMQSFEEMLHDFKASQPSLVANGKRIPLDWTLYYLRTKALTKPIAKEE LAWILLNFNQKRGYYQLRGEEEEETPSKRVEYHELKVVKVEATDEGKGGNVWYNVLLENG WVYRRQSKYSLEKWLGQTLPFIVTTDLEADGVTPKRDKDGNVKRSFRSPKEDDWTLVKKR TEDHLEKSGLTVGAFIYHHLLANPSDKVRGQLVRTIDRKYYKRELQAIIAKQMEFHPELC DAELLKACAEELYRNNEAHRAQLLAKDFKHLLVEDVLFYQRPLKSKKSLIANCPYETRQY INKETGEIKDVNVKCIAKSNPYFQEFRLWQWISNLRLLRQADGQDVTAAYIKGNEDKERL FDLLNNQGEIKQEHILRDFLKLKKPKGGEFPLRWNYVDDKIYPANETRHAINKAIKGMAV DKEEKETLGQLLDQPAVAYMLWNLLYSITDRNELARALRRFALTHGVSSVDAFVESFLKV KPFKKEYGAYSEKATKKLLQVMRLGKYWCGDELPEPAKRTKGNKQAVYDEMAHRGICKQT LQNISNIIEENIDERLMGALLNPERPLTELSHFQGLSVSDACYVVYGRHSEVAEIVKWKD ADELNTFINNFKQHSMRNPIVEQCVLETLRTVRDVWEQEGQIDEIHVELGRNMKRTAEQR ARDTQAIQRNENTNMRISLLLTELKNDPSISDVRPYSPSQRELLKIYEEGALENLTNQDE DIAKISRLAQPSASELRRYKLWLDQKYVSPYTGRPIPLAKLFTPAYEIEHIIPQSRYFDN SFSNKVICEAEVNKLKDNLLGMEFIKKFGGKTKGILDEEAYKLHVTTNYAHNPAKRSKLL MEDLPTDFNSRQLNDTRYISKTVMALLSNVVREEGERETRSKHVVVCTGGVTDTLKKDWG LHDVWHNLVYPRFIRLNNLTGSQNFGYWRYLNEENSKKGEDSKKVFQITLPPECQRGFVK 
KRIDHRHHAMDALVIACASANIVNYLNNQSAANPHIHENLQRLLCDRKREIRKPWPTFTQ DAQAALKDLVVSFKNTVRVVNKASNTYLRYDKDGKKKRFKQEGEGLKAIRKPMHKETYYG EVNLIRKEMVPLKRALDDVNAIVDKELRASIKGLLRDGFNPQQVLANFKAKDFKFDKRDI KKVEVYVSSEKSTPMVATRKSLDTSFNAKRIADITDTGIQKILLNYLKANDNSPEQAFTP EGIAYLNEHIAEYNDGKPHQPILKVRVSEVKGAKFAVGQTGSKKDKFVEAQSGTNLYFAI YEDKEGKRHFETIALKVAAERLKDKQMPVPEINDAGLPLKFYLSPNDLVYVPTDEDRATA HPSIDKMRIYKMVSATDRQCFFLPFTVASVISNKVEFEALNKMERVLDASEYSSAYDSSH VNKVMVKNVCWKLEVDRLGNITKIVR >gi|283510463|gb|ACQH01000156.1| GENE 2 5055 - 5987 785 310 aa, chain + ## HITS:1 COG:PM1126 KEGG:ns NR:ns ## COG: PM1126 COG1518 # Protein_GI_number: 15602991 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Pasteurella multocida # 1 294 40 317 343 165 34.0 1e-40 MIKKTLCFTNPAYLSLHQCQLVIRLPEVEENESLSNVLKEQAERTIPVEDIGVVVLDHRR ITITSGALDALLENNCAVITCNAQGHPVGLLLPLSGNTLQSERFREQIDSSLPLRKQLWQ QTIKQKIANQAAVLRGVTGNDEKCMQVWAEQVRSGDPDNIEARAAAHYWQHLFPELPHFV RAREGEPPNNLLNYGYAILRAVVARALVGSGLLPTLGIHHHNRNNAYCLADDIMEPYRPY VDRLVLNIIRTNGCVAELARELKSQLLVIPTLDVVVNGKRSPLMIAVQQTTASLYKCFSG ELRRVSYPEM >gi|283510463|gb|ACQH01000156.1| GENE 3 5995 - 6336 259 113 aa, chain + ## HITS:1 COG:Cj1521c KEGG:ns NR:ns ## COG: Cj1521c COG3512 # Protein_GI_number: 15792834 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Campylobacter jejuni # 9 97 3 91 143 79 44.0 2e-15 MSNLSRLNEYRVMWILVFFDLPTETKKDKKAYVDFRNLLQRDGFTMFQFSIYLRHCASME NAEVHIKRVKNSLPKYGKVGVLCITDKQFDNIQLFYGTKPHKPNAPGQQLELF >gi|283510463|gb|ACQH01000156.1| GENE 4 7425 - 7883 -239 152 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLENSYFCSIANNPKEHAYARLSVVICLKTRTFAVSQTTEDKEAYTQTCCDLLENSYFCS IANNLPPKRQMPRSVVICLKTRTFAVSQTTCPKGQSTIISCDLLENSYFCSIANNQSISH HRTMWVVICLKTRTFAVSQTTLTNNIILRRLL >gi|283510463|gb|ACQH01000156.1| GENE 5 8372 - 8737 341 121 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930155|ref|ZP_06423990.1| ## NR: gi|288930155|ref|ZP_06423990.1| 
hypothetical protein HMPREF0670_02884 [Prevotella sp. oral taxon 317 str. F0108] # 1 121 19 139 139 192 100.0 4e-48 MPAEQPINSGGSQSPNNKDKDNSAVTQAENLKKEEKETNETILSQKDYVQKRLTEKFTDS PYNTIEEWENNDVEGYTKAYDDMIAEYPSYLRGLTKTEKQQETCCDLLENSYICSIANNM S >gi|283510463|gb|ACQH01000156.1| GENE 6 9012 - 9236 164 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930156|ref|ZP_06423991.1| ## NR: gi|288930156|ref|ZP_06423991.1| hypothetical protein HMPREF0670_02885 [Prevotella sp. oral taxon 317 str. F0108] # 1 74 1 74 74 132 100.0 6e-30 MNGHTDKEIASESELSADVFGGMKYGNYWGWNDEDNKLRGDEICRGLAYGPTKKYRIDPW RTHQGRVWWQEGQD >gi|283510463|gb|ACQH01000156.1| GENE 7 9610 - 10470 779 286 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 [Kordia algicida OT-1] # 11 284 14 285 286 304 57 2e-82 MLSVEELNDRLIELAFSEDIGDGDHTTLCCIPAEATGESRLLIKEEGILAGVNVAKKVFH NFDPELEVEVYIEDGARVKPGDIAMSVKGRTRSLLQTERLMLNILQRMSGIATMTHKYQQ ALIDAGTKTRVLDTRKTTPGMRMLEKEAVKIGGGMNHRIGLFDMILLKDNHIDFCGGVHN AISMAKRYCRENGKDDLKIECEVRNFKELEEALHEGCDRIMFDNFTPEATRKAVEMVGGR CETESSGGITFETMIPYAQAGVDFISFGALTHSVKGLDMSFKAVGK >gi|283510463|gb|ACQH01000156.1| GENE 8 10489 - 11481 894 330 aa, chain - ## HITS:1 COG:all4673 KEGG:ns NR:ns ## COG: all4673 COG0379 # Protein_GI_number: 17232165 # Func_class: H Coenzyme transport and metabolism # Function: Quinolinate synthase # Organism: Nostoc sp. 
PCC 7120 # 14 325 13 323 324 350 56.0 2e-96 MVDKKWLELGFIDEPISEKTDIKAEIRRMCKEKNALIMAHYYTESVIQEVADFIGDSLAL AQKAATTDADIIVMCGVHFMGETNKILCPEKTVLIPDLNASCSLAESCPAEDFEQFVKAH PGHTVISYVNTSAGTKAVTDIVVTSSNAKQIVDALPKDAPIIFGPDRNLGNYINGLTGRQ MVLWDGACHVHEKFSVEKILQLKREHPGAKILAHPECKGPVINIADKVGSTAALLKYSIA DDAQEFIVATESGILTEMQKSAPQKTFIPAPPDDSTCACNECNFMKLITLNKLYNCLKYE WPTIEVQPEVAQKAIKPIEKMLEISKRLGL >gi|283510463|gb|ACQH01000156.1| GENE 9 11554 - 13143 1507 529 aa, chain - ## HITS:1 COG:PA0761 KEGG:ns NR:ns ## COG: PA0761 COG0029 # Protein_GI_number: 15595958 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate oxidase # Organism: Pseudomonas aeruginosa # 4 522 6 520 538 478 48.0 1e-134 MIQKFDFLVIGGGIAGMSYALSVAKSGKGKVALVCKTTLDETNTTKAQGGVAAVTNLDVD NFEKHINDTMVAGDYISDLAAVKHVVRNAPDAIKALVQWGVQFDKNENGEYDLHREGGHS DFRILHHADDTGFEIQRGLMAAVRANKHITVLENHYAVEIITQHHLGVQVTRKTPDIECY GAYILNPETKKVDTFLSKVTLMATGGVGAVYAMTSNPVIATGDGIAMAYRAKATVADMEF VQFHPTVLYNPSETHPAFLITEAMRGYGAVLRLPNGKEFMQKYDERLSLAPRDIVARAID HEMKIHGLDYVCLDLTHKDAEETKRHFPHIYEKCLSMGIDITKQYIPVCPSAHYMCGGIK VDLNGESSISRLYAVGECSCTGLHGGNRLASNSLIEAVVYAQSAAEHALLTVDNYDFNLK VPEWNDEGTLTNEEHVLITQSVREVGEIMSNYVGIVRSNLRLKRAWDRLDLLYEETEELF KRVTATKDICELRNMINVGYLITRQAIERKESRGLHYTVDYPKHAYDQQ Prediction of potential genes in microbial genomes Time: Sat May 28 03:04:59 2011 Seq name: gi|283510462|gb|ACQH01000157.1| Prevotella sp. oral taxon 317 str. F0108 cont2.157, whole genome shotgun sequence Length of sequence - 1570 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - LSU_RRNA 108 - 673 92.0 # CP000139 [R:343479..346279] # 23S ribosomal RNA # Bacteroides vulgatus ATCC 8482 # Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; Bacteroidaceae; Bacteroides. - LSU_RRNA 702 - 1003 91.0 # CP000139 [R:343479..346279] # 23S ribosomal RNA # Bacteroides vulgatus ATCC 8482 # Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; Bacteroidaceae; Bacteroides. 
Prediction of potential genes in microbial genomes
Time: Sat May 28 03:05:02 2011
Seq name: gi|283510461|gb|ACQH01000158.1| Prevotella sp. oral taxon 317 str. F0108 cont2.158, whole genome shotgun sequence
Length of sequence - 1556 bp
Number of predicted genes - 0
Number of transcription units - 0, operones - 0 average op.length - 0

N Tu/Op Conserved   S Start End Score
        pairs(N/Pv)
- SSU_RRNA 13 - 1535 99.0 # FJ983102 [D:1..1522] # 16S ribosomal RNA # uncultured bacterium # Bacteria; environmental samples.

Prediction of potential genes in microbial genomes
Time: Sat May 28 03:05:03 2011
Seq name: gi|283510460|gb|ACQH01000159.1| Prevotella sp. oral taxon 317 str. F0108 cont2.159, whole genome shotgun sequence
Length of sequence - 2138 bp
Number of predicted genes - 3, with homology - 2
Number of transcription units - 1, operones - 1 average op.length - 3.0

N Tu/Op Conserved   S Start End Score
        pairs(N/Pv)
1 1 Op 1 . - CDS 1 - 451 188 ## gi|288927391|ref|ZP_06421238.1| conserved hypothetical protein
2 1 Op 2 . - CDS 520 - 1032 328 ##
3 1 Op 3 . - CDS 1054 - 2136 476 ## COG3209 Rhs family protein

Predicted protein(s)
>gi|283510460|gb|ACQH01000159.1| GENE 1 1 - 451 188 150 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288927391|ref|ZP_06421238.1|
## NR: gi|288927391|ref|ZP_06421238.1| conserved hypothetical protein [Prevotella sp. oral taxon 317 str.
F0108] # 7 150 1279 1422 1422 273 98.0 2e-72 MLPHNKRYASGEGLSAPILQSKSSGHGVSATGNGPLRFGVSGSKSSQTSHTEVAAMDVNG DGYPDWIDEGDSHVRTQYTSPTGTLSQLSVKTDIPMPEFSSGAYSLGIGNDGAIAVSIGN RNRSESGRTTGNSSPGDVGSMNNANENPNK >gi|283510460|gb|ACQH01000159.1| GENE 2 520 - 1032 328 170 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNKIDFCMLSLAFIVLLNGGCSSKHLPERCACQCLIFPTTDFEGCYLIEVDKDGYITTSV GEKGDSVTHAVLNDIHISTQHIRLLKNTCEREKKLLGKDEFLKLQQQISKLEGVSIDNIF LESWENDSWAVILLTEEKQYIFPYWDCNNSSIEELVKSLVRGSPINIELR >gi|283510460|gb|ACQH01000159.1| GENE 3 1054 - 2136 476 360 aa, chain - ## HITS:1 COG:MA2043 KEGG:ns NR:ns ## COG: MA2043 COG3209 # Protein_GI_number: 20090890 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Methanosarcina acetivorans str.C2A # 49 165 119 233 440 63 35.0 6e-10 ELGDHSAPDGWIQRPKRNAVPGTPPGPPVQWDKAEDPDDVQPGYGYVPADTTHEDIFFYH SDHLGSTSYITDAKANVAQFDAYLPYGELLVDEHSSSEDMPYKFNGKELDQETGLYYYGA RYMNPRTSLWYGVDPLAEKYPSVGAYVYCMDRPTKLIDTDGRKVIISGSRNQRIIVLNQL QKLTNYKLGVKQNGEVVILATHGRNKSKKLTVGNSLIESVIKHKRTMTIQTTDPGEKSSE HDIYRRDAFNGKGTDVVVNFDVTDTPKLLTENGKTGKSSEEVMPLFLVLGHEIIHGERSM DGIAIDPDTKSSYKYRSPNGQLKIKNTSKEELETVGIIGKAKRTENALRKEHGLNKRIKY Prediction of potential genes in microbial genomes Time: Sat May 28 03:05:19 2011 Seq name: gi|283510459|gb|ACQH01000160.1| Prevotella sp. oral taxon 317 str. F0108 cont2.160, whole genome shotgun sequence Length of sequence - 1596 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 46 - 1479 612 ## COG3436 Transposase and inactivated derivatives - Prom 1532 - 1591 2.0 Predicted protein(s) >gi|283510459|gb|ACQH01000160.1| GENE 1 46 - 1479 612 477 aa, chain - ## HITS:1 COG:MA2748 KEGG:ns NR:ns ## COG: MA2748 COG3436 # Protein_GI_number: 20091571 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Methanosarcina acetivorans str.C2A # 40 473 22 476 485 163 29.0 7e-40 MKEQVTDISKVLQDMTEEMRLLRETVNQQYAEIVKLNRNINALNLQIRKKDTELTNLRER LAKYEKPDKNSNNSSTPPSKERMKDEVVRRTRSLRKPSGKKPGGQKGHDGHKLSCSSVPD EIIDDAPDYCTRCGESLADAERVLDYVTQVISIPELKPVIKEIRHYVMVCKNCGERIRTA PRRRSNDVVYDSSVKALVVYLSVVQFLPYGRIASFLREVFGLTPSEGSLVNWVNEAKRNA QPVIDKIKEYIKSSAVVGFDESGLYCNKRLDWAWIAQTVYYTLLFRADGRGSKVLAGKFG DSLERMTAVTDRHSAYFALHFLNHQVCLAHLLRELQYLSELNTEQEWSGKVTNLFREAIH ERNTNPNDVIDKVSWTRRLDNLLKQNIEGLGKKFITFKKGLVKCRDYIFNFLVNPMIPSD NNGSERGIRKLKIKLKNSCTFRSDFGADAFLELHSIVETAKKHDKTPYNAIQALFKV Prediction of potential genes in microbial genomes Time: Sat May 28 03:05:20 2011 Seq name: gi|283510458|gb|ACQH01000161.1| Prevotella sp. oral taxon 317 str. F0108 cont2.161, whole genome shotgun sequence Length of sequence - 1449 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 28 - 1212 472 ## Cyan8802_4529 hypothetical protein - Prom 1294 - 1353 1.9 Predicted protein(s) >gi|283510458|gb|ACQH01000161.1| GENE 1 28 - 1212 472 394 aa, chain - ## HITS:1 COG:no KEGG:Cyan8802_4529 NR:ns ## KEGG: Cyan8802_4529 # Name: not_defined # Def: hypothetical protein # Organism: Cyanothece_PCC8802 # Pathway: not_defined # 29 386 192 568 574 93 23.0 1e-17 MRTPCGLRTVVTVLETFAELLGDAFGKVPCYNTVENWVKKLGLSVYQDDQPCKDKKFAMV VDESIAINGQKLLLTLGIPSEHQGRPLRHADVTVLDMSVSKGFNGDDVQGRIEAAEKSAG NAPDYIISDNGHNLVKGITGSGHVRHADISHSMGVILKKVYEKQSDFVELTTLLGKKRLQ YHLTDKAYLLSPNMRAMSRFMNMSSWVSWGNEMLNCYDTLPEKMQEAYAFIKDYECLLRE LQAVLCAVRHVETVCKNEGFSVMTSRRCKLHIVTHVLGDAHSRQARVGMKMLEYFNREET LITANMSINISSDIIESTFGIYKSKKSPNKLHGVTSFVLTIPLYSKVTNQSVTKTINFKE RIVKVKLKDIRAWSTEHLSTNWVTERTKTLRKAS Prediction of potential genes in microbial genomes Time: Sat May 28 03:05:25 2011 Seq name: gi|283510457|gb|ACQH01000162.1| Prevotella sp. oral taxon 317 str. F0108 cont2.162, whole genome shotgun sequence Length of sequence - 1389 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 146 - 187 6.2 1 1 Tu 1 . - CDS 221 - 670 368 ## PRU_0294 hypothetical protein - Prom 900 - 959 6.1 - 5S_RRNA 1273 - 1351 92.0 # CP000685 [R:24140..24245] # 5S ribosomal RNA # Flavobacterium johnsoniae UW101 # Bacteria; Bacteroidetes; Flavobacteria; Flavobacteriales; Flavobacteriaceae; Flavobacterium. 
Predicted protein(s) >gi|283510457|gb|ACQH01000162.1| GENE 1 221 - 670 368 149 aa, chain - ## HITS:1 COG:no KEGG:PRU_0294 NR:ns ## KEGG: PRU_0294 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 37 147 26 136 136 131 60.0 7e-30 MERLKRIVFLCITILSLLPGLALATSNEDNTADNCVNWEPVMDAIIQVESGGNRFARSGK SVGAMQITPICVREVNLYLKQLNIKKAYTLKDRFSVQKSKEIFLLIQKRHNPQNNIERAI RAWNGGLKYSNKGTQRYYEKVTRAMNKTT Prediction of potential genes in microbial genomes Time: Sat May 28 03:05:29 2011 Seq name: gi|283510456|gb|ACQH01000163.1| Prevotella sp. oral taxon 317 str. F0108 cont2.163, whole genome shotgun sequence Length of sequence - 1376 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 1283 688 ## COG3436 Transposase and inactivated derivatives Predicted protein(s) >gi|283510456|gb|ACQH01000163.1| GENE 1 2 - 1283 688 427 aa, chain - ## HITS:1 COG:Z1131 KEGG:ns NR:ns ## COG: Z1131 COG3436 # Protein_GI_number: 15800652 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 EDL933 # 57 426 91 461 512 159 31.0 1e-38 MVFGRRSEKRLPESPDGWVGTLFDEQWAKEGSLLHAETLPVIDEIKQQAKRRREASRTNR PARKGRPYASYVPDDIEREVVVIYPEGYDAERMVVIGHDRTEHLCLRPPCFYVKVEDRVV CRQKEARPTDAKVGILEAPLQRQAVDCFADASLLAEIVTGKFAYHLPEYRQSARWKEHGI NIPTSTINRWVHGAADALHPLYKLQVGLILQSPYLQVDETSVQVADRKGKTRKGYLWGVR DALHSRGVFFHWKEGSRSGAVPDELFKGYHGAIQSDGYEAYGRFENVQGIELLGCMAHVR RKFDHLAADDKNAAHIVDTIAALYELEENLRHGNAGPEQVLAERKSKAYPILKSLEAYFK EVHKQYLPNEAMEKALRYAFSVWIRISRYVQDGRFNIDNNLMEQAIRPITLGRKNYLFCG NNEEAEN Prediction of potential genes in microbial genomes Time: Sat May 28 03:05:29 2011 Seq name: gi|283510455|gb|ACQH01000164.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.164, whole genome shotgun sequence Length of sequence - 1305 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 11 - 1201 343 ## COG3464 Transposase and inactivated derivatives - Prom 1240 - 1299 4.8 Predicted protein(s) >gi|283510455|gb|ACQH01000164.1| GENE 1 11 - 1201 343 396 aa, chain - ## HITS:1 COG:RSc2573 KEGG:ns NR:ns ## COG: RSc2573 COG3464 # Protein_GI_number: 17547292 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Ralstonia solanacearum # 18 388 27 399 406 190 32.0 4e-48 MQNALGVSQQVCQNLRYEGNNLVLEIQTPKEKLCCPVCGSHNVNRNGCHIRRFVSVPIGL SKTYLDMRVCRIQCHDCGCIKQENIDFAKGKRRHTIAFANMVLDLSRFATIQDISWFLGV SWDVVRNIQMEFLQSNYSNPDLSMLRRISIDEFATHKGQVYKTIVVDLDNGHVVYVGDGN GKKALDGFWERLGNDKEHIQAVCTDLSAAYTRAVSEHLPNAALVVDHFHVTKLMNEKLDL LRRQLWHVEKDVNKRKVIKGTRWLLLRNGDDIFDHVHRNRLENALSLNRPLMTAYYLKEG LREIWNQCSKQKAKAVLEEWVKQALESKIQPLMKMASTIRAYKTYILAWYDHCITNGTIE GINNKIKVLKRQVYGFRNEEYFTLRLYALHDRRLRI Prediction of potential genes in microbial genomes Time: Sat May 28 03:05:30 2011 Seq name: gi|283510454|gb|ACQH01000165.1| Prevotella sp. oral taxon 317 str. F0108 cont2.165, whole genome shotgun sequence Length of sequence - 1187 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Sat May 28 03:05:30 2011 Seq name: gi|283510453|gb|ACQH01000166.1| Prevotella sp. oral taxon 317 str. F0108 cont2.166, whole genome shotgun sequence Length of sequence - 1163 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 408 - 1067 456 ## DSY2013 hypothetical protein Predicted protein(s) >gi|283510453|gb|ACQH01000166.1| GENE 1 408 - 1067 456 219 aa, chain + ## HITS:1 COG:no KEGG:DSY2013 NR:ns ## KEGG: DSY2013 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 73 155 32 114 114 68 39.0 1e-10 MHFVGLAKAGKTRYTVSGRKKNAAELIAAYERERGKNCRKYKCRYIQLNGNLGDMPVRIF LIKYGRNSAWNVLLTTDTTMSFVKAFEVYQIRWNIEVMNKETKQYLGLGGYQGCDFNGQI ADATLCYLTYTVMALEKRFTEYQTMGELFSDMEGELMALTLWKRALACIERILRVLGETL GMTPQHLMATICGNDKEMVKILVMAEALKKRDEVCGQSA Prediction of potential genes in microbial genomes Time: Sat May 28 03:05:34 2011 Seq name: gi|283510452|gb|ACQH01000167.1| Prevotella sp. oral taxon 317 str. F0108 cont2.167, whole genome shotgun sequence Length of sequence - 1138 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 153 - 185 0.1 1 1 Tu 1 . - CDS 274 - 1122 722 ## gi|288930177|ref|ZP_06424002.1| putative arginine-specific protease ArgI polyprotein Predicted protein(s) >gi|283510452|gb|ACQH01000167.1| GENE 1 274 - 1122 722 282 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930177|ref|ZP_06424002.1| ## NR: gi|288930177|ref|ZP_06424002.1| putative arginine-specific protease ArgI polyprotein [Prevotella sp. oral taxon 317 str. F0108] # 1 282 1 282 282 569 100.0 1e-161 MKKNVFYYYILVLVAALWQTQARAADYYVEVGGVHITSDNYKNITAAGGFTAVKKGKITF DNSTRTLTLNNVLIKPDEYKPGISMFIDSENKERPLIIKLVGRNVIEPTNSVAVETLENA VVVTGSGSLEVSGNVSLSPGGQLTIQGGCTIEAKSPVWGNAITIDDAQVHAVRLWEDEPC MCAFRGIKLKGGSYLASPQGATCEYRGDMYHAIRGYVFVKGKEICGEVLIKRGKNTGIDN AATPPSTKEDVIYTLDGIRMKLPFERLPKGVYVVNGKKVKKE Prediction of potential genes in microbial genomes Time: Sat May 28 03:05:46 2011 Seq name: gi|283510451|gb|ACQH01000168.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.168, whole genome shotgun sequence Length of sequence - 1011 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 14 - 394 211 ## MAE_46860 transposase - Prom 499 - 558 1.7 - Term 430 - 473 1.6 2 2 Tu 1 . - CDS 613 - 870 102 ## gi|288803252|ref|ZP_06408686.1| putative transposase Predicted protein(s) >gi|283510451|gb|ACQH01000168.1| GENE 1 14 - 394 211 126 aa, chain - ## HITS:1 COG:no KEGG:MAE_46860 NR:ns ## KEGG: MAE_46860 # Name: not_defined # Def: transposase # Organism: M.aeruginosa # Pathway: not_defined # 1 122 80 201 201 79 32.0 3e-14 MIDRDNRYKGFTTSEKMTADKIVSFLDEISFNLRMDTFVVLDNASVHRNKKIRELRPIWE QRGLFLFYLPPYSPQLNIAETLWRILKGKWMKPQDYITSDMLFYTTNRALADIGKGLRIN FSKHVA >gi|283510451|gb|ACQH01000168.1| GENE 2 613 - 870 102 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288803252|ref|ZP_06408686.1| ## NR: gi|288803252|ref|ZP_06408686.1| putative transposase [Prevotella melaninogenica D18] # 1 83 67 149 151 104 67.0 2e-21 MSHVSVNAWVKRFKSEGIAGLKTRSGRGRKPIMDSSDEESVRRAIEQDRQSVSKARAAWE QSSGKEVSDATFKRFLSALAQDISE Prediction of potential genes in microbial genomes Time: Sat May 28 03:05:54 2011 Seq name: gi|283510450|gb|ACQH01000169.1| Prevotella sp. oral taxon 317 str. F0108 cont2.169, whole genome shotgun sequence Length of sequence - 922 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 886 165 ## LLO_1884 transposase (DDE domain) Predicted protein(s) >gi|283510450|gb|ACQH01000169.1| GENE 1 1 - 886 165 295 aa, chain - ## HITS:1 COG:no KEGG:LLO_1884 NR:ns ## KEGG: LLO_1884 # Name: not_defined # Def: transposase (DDE domain) # Organism: L.longbeachae # Pathway: not_defined # 46 289 19 258 297 97 27.0 4e-19 MIYKKLHISLIFSNLVKSKPLSTNPRMCNLYTKLVKILEICKQFSHNLVNEQGNIPRRGP MPKFSDLEVVALSLTAESEIIDSEKWLFDYKLQEYKDKIPNLISRRQFNDRRKNTAGLCE DTRKRIAAQMDGGETQFFVDSKPIAVCRVARGKRCKMGRMGDFSQAPDFGFCASQNMYYF GYKLNALCGLSGVIHSYDLSKASVHDLNYMKDVKLVYHDCNIYGDKGYIGADVQLDLFQT AHIRLECPYRLNQKNWKPKLIPFAKARKRIETLFSQLTDQFLVIRNYAKITNGLF Prediction of potential genes in microbial genomes Time: Sat May 28 03:05:58 2011 Seq name: gi|283510449|gb|ACQH01000170.1| Prevotella sp. oral taxon 317 str. F0108 cont2.170, whole genome shotgun sequence Length of sequence - 914 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 18 - 912 860 ## COG3209 Rhs family protein Predicted protein(s) >gi|283510449|gb|ACQH01000170.1| GENE 1 18 - 912 860 298 aa, chain + ## HITS:1 COG:RSp1137 KEGG:ns NR:ns ## COG: RSp1137 COG3209 # Protein_GI_number: 17549358 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Ralstonia solanacearum # 6 211 826 1022 1517 86 33.0 6e-17 MHTVSAREYGYDAYERLLHVENERGERYTFERDPNGEVIAERGFDGQERRYERDMDGTVL VTHLPDGTTIHHQHDLAGRLTYSRYPDGSWEAWEYDRAGRLCKASNPHGETLFERDALGR IIREIQNGHAIEHCYDSRSRLTHTLSDLGADVSYGYDDAGLPESVKAIVQGMPHPWEARL QHDRLGRETLRTMTGGVACAMQYDGVGRPSRPSVTRGGHSLYNRGYRWDDDFRLSHAHDA ISGRIVRYFYDDFGSLAEAEYGDGARQWRNPDVMGNVYDSTDRTDRTYARGGQLREDR Prediction of potential genes in microbial genomes Time: Sat May 28 03:05:59 2011 Seq name: gi|283510448|gb|ACQH01000171.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.171, whole genome shotgun sequence Length of sequence - 902 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 831 499 ## Dhaf_0692 transposase IS4 family protein Predicted protein(s) >gi|283510448|gb|ACQH01000171.1| GENE 1 1 - 831 499 276 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_0692 NR:ns ## KEGG: Dhaf_0692 # Name: not_defined # Def: transposase IS4 family protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 4 218 191 400 450 91 28.0 4e-17 NPDYERFQECKTSKMEVAMAMLRRGWKMGLRAKYVITDSWFTCEQLMACVRGIGKGAMHF VGLAKMGKTRYTVSGRKKNAAELIATYERERGKNCRKYKCRYIQLNGNLGDIPVRIFLIK YGRNSAWNVLLTTDTTMSFVKAFEVYQIRWNIEVMNKETKQYLGLGGYQGCDFNGQIADA TLCYLTYIVMALEKRFTEYQTMGELFSDMEGELMALTLWKRVLACIERILRVLGETLGVT PQQLMAMISGNDKEMVKILVMAEALEKWDEACGQSA Prediction of potential genes in microbial genomes Time: Sat May 28 03:06:03 2011 Seq name: gi|283510447|gb|ACQH01000172.1| Prevotella sp. oral taxon 317 str. F0108 cont2.172, whole genome shotgun sequence Length of sequence - 890 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 888 948 ## Hoch_2925 YD repeat protein Predicted protein(s) >gi|283510447|gb|ACQH01000172.1| GENE 1 3 - 888 948 295 aa, chain + ## HITS:1 COG:no KEGG:Hoch_2925 NR:ns ## KEGG: Hoch_2925 # Name: not_defined # Def: YD repeat protein # Organism: H.ochraceum # Pathway: not_defined # 1 278 2812 3072 3456 126 30.0 1e-27 RTITYPDGEVVSYGYDAAGQVTSLKSAKQGREETIVAQVGYDKDGHTIYTRMGNGTESTY AYDRQRERLQGMLLTANGDSIMQTQYKYDPVDNILGLTNVITPKAPKKPKGPGGPGRTDG TGGPDAGKEKKEKPLGGAFSHTYAYDELNRLVKASGKAKGIGYAMDMSFGLMGEPLTKVQ RTDSGSVAGSYALAYEYGDADHPTAPSQIGHERYSYDANGNPTLVEDDSLNTERRMAWDE ENRLMALSDNGKTSRYTYNAAGDRVVKSHGYLEGVYVNGAPQGLTFHETEDYTIY Prediction of potential genes in microbial genomes Time: Sat May 28 03:06:08 2011 Seq name: gi|283510446|gb|ACQH01000173.1| Prevotella sp. oral taxon 317 str. F0108 cont2.173, whole genome shotgun sequence Length of sequence - 886 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 87 - 146 6.3 1 1 Tu 1 . + CDS 196 - 405 76 ## gi|288930189|ref|ZP_06424008.1| hypothetical protein HMPREF0670_02902 + Term 637 - 663 -0.6 Predicted protein(s) >gi|283510446|gb|ACQH01000173.1| GENE 1 196 - 405 76 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930189|ref|ZP_06424008.1| ## NR: gi|288930189|ref|ZP_06424008.1| hypothetical protein HMPREF0670_02902 [Prevotella sp. oral taxon 317 str. F0108] # 1 69 1 69 69 132 100.0 6e-30 MQQNDENGGQHCALFPPIFAIYYAFAFLFEENLGQICIRNHKKMLAKMLAFALHLAPKRI AFSTKTQGI Prediction of potential genes in microbial genomes Time: Sat May 28 03:06:13 2011 Seq name: gi|283510445|gb|ACQH01000174.1| Prevotella sp. oral taxon 317 str. F0108 cont2.174, whole genome shotgun sequence Length of sequence - 876 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 195 126 ## gi|260911287|ref|ZP_05917886.1| YD repeat protein 2 2 Tu 1 . + CDS 378 - 876 473 ## Hoch_2925 YD repeat protein Predicted protein(s) >gi|283510445|gb|ACQH01000174.1| GENE 1 1 - 195 126 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260911287|ref|ZP_05917886.1| ## NR: gi|260911287|ref|ZP_05917886.1| YD repeat protein [Prevotella sp. oral taxon 472 str. F0295] # 2 63 46 107 791 114 96.0 2e-24 RESVIVDRIGYDKDGHTVYTKLGNGTETTYTYDKQRERLQAMNLTADGRAVMENKYQYDA VDNS >gi|283510445|gb|ACQH01000174.1| GENE 2 378 - 876 473 166 aa, chain + ## HITS:1 COG:no KEGG:Hoch_2925 NR:ns ## KEGG: Hoch_2925 # Name: not_defined # Def: YD repeat protein # Organism: H.ochraceum # Pathway: not_defined # 17 159 2978 3101 3456 82 35.0 6e-15 MTFGRMSEPLTKVQKVDSSKTAQSYDFTYKYEDSNHPTAPTQIGHEHYTYDANGNPTLVE NDSLNTERRMYWDEDNRLMVLSDNGKTSRYTYNAAGERIVKSHGDLEGVYINGAPQGITF HETEDYTIYPAPIITVTKNRFTKHYFIGDKRIASKLGTGKFSNVYG Prediction of potential genes in microbial genomes Time: Sat May 28 03:06:21 2011 Seq name: gi|283510444|gb|ACQH01000175.1| Prevotella sp. oral taxon 317 str. F0108 cont2.175, whole genome shotgun sequence Length of sequence - 870 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 22 - 81 3.0 1 1 Tu 1 . 
+ CDS 253 - 868 424 ## COG3385 FOG: Transposase and inactivated derivatives Predicted protein(s) >gi|283510444|gb|ACQH01000175.1| GENE 1 253 - 868 424 205 aa, chain + ## HITS:1 COG:SMb20766 KEGG:ns NR:ns ## COG: SMb20766 COG3385 # Protein_GI_number: 16265206 # Func_class: L Replication, recombination and repair # Function: FOG: Transposase and inactivated derivatives # Organism: Sinorhizobium meliloti # 1 205 45 242 387 75 26.0 8e-14 MVFCQFSGCDSVRDISNGLKSATGNLNHLGISRAPSKSTVSYQNTNRDSDVFRDIFYAVY KYFGQQGWGSRKGFRFKMPIKLLDSTLVSLTMSVYDWAHYTSKKGAVKMHTLLDYDCLLP DFVNITDGKGSDNKAAFGIPLQPHSIVVADRGYCDYALLNHWDSTNVFFVVRHKGNIRYK RVRELPLPDHAAQNVLIDEEIELEL Prediction of potential genes in microbial genomes Time: Sat May 28 03:06:22 2011 Seq name: gi|283510443|gb|ACQH01000176.1| Prevotella sp. oral taxon 317 str. F0108 cont2.176, whole genome shotgun sequence Length of sequence - 861 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) Predicted protein(s) >gi|283510443|gb|ACQH01000176.1| GENE 1 290 - 694 56 134 aa, chain - ## HITS:1 COG:no KEGG:BMULJ_05093 NR:ns ## KEGG: BMULJ_05093 # Name: not_defined # Def: hypothetical protein # Organism: B.multivorans_T # Pathway: not_defined # 14 72 17 76 120 68 63.0 5e-11 MVSGTISLFLSKCFSPFPHGTGSLSVSWEYLALPDGPGRFAQDFSCPALLRIPLCPIVLR IRGSHPLRPLFPGEFSSHARYNGAVLQPRRCVATTPVWALPRSLATTGGIINLFSLPRGT KMFQFPRFASPTVR Prediction of potential genes in microbial genomes Time: Sat May 28 03:06:25 2011 Seq name: gi|283510442|gb|ACQH01000177.1| Prevotella sp. oral taxon 317 str. F0108 cont2.177, whole genome shotgun sequence Length of sequence - 851 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Sat May 28 03:06:26 2011 Seq name: gi|283510441|gb|ACQH01000178.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.178, whole genome shotgun sequence Length of sequence - 807 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 7 - 66 2.2 1 1 Tu 1 . + CDS 163 - 805 307 ## gi|288930199|ref|ZP_06424013.1| hypothetical protein HMPREF0670_02907 Predicted protein(s) >gi|283510441|gb|ACQH01000178.1| GENE 1 163 - 805 307 214 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930199|ref|ZP_06424013.1| ## NR: gi|288930199|ref|ZP_06424013.1| hypothetical protein HMPREF0670_02907 [Prevotella sp. oral taxon 317 str. F0108] # 1 214 55 268 269 402 100.0 1e-111 MYIDKPTGKGFDGAYTKNGEYIVDDAKPWKGYPEKTRNSGKQLSQDWVENHLADGAVPNS HTEAMETANKNGTLKRTVTHVDGDGNMQISNYATKGTDDVFSTRKPEKVTKPPTKVKGLI KSVRSKGANLRPVKVISESNFSKAVQSSKAATKANDALWKGTQYIESSPVLRTVGKVAGR ALIVVGIAWDAYCINEAYQEEREFGDKTQQATGA Prediction of potential genes in microbial genomes Time: Sat May 28 03:06:36 2011 Seq name: gi|283510440|gb|ACQH01000179.1| Prevotella sp. oral taxon 317 str. F0108 cont2.179, whole genome shotgun sequence Length of sequence - 787 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 322 - 549 165 ## gi|288930201|ref|ZP_06424014.1| hypothetical protein HMPREF0670_02908 2 1 Op 2 . - CDS 533 - 751 118 ## gi|288930202|ref|ZP_06424015.1| hypothetical protein HMPREF0670_02909 Predicted protein(s) >gi|283510440|gb|ACQH01000179.1| GENE 1 322 - 549 165 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930201|ref|ZP_06424014.1| ## NR: gi|288930201|ref|ZP_06424014.1| hypothetical protein HMPREF0670_02908 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 75 1 75 75 130 100.0 3e-29 MVKIVDLATKSLEDWRTRIYYFVDDRCKDEISHFYSFNKNGHDGIAVFYEGENKVERIRY YEKDEEIWSKVEPEK >gi|283510440|gb|ACQH01000179.1| GENE 2 533 - 751 118 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930202|ref|ZP_06424015.1| ## NR: gi|288930202|ref|ZP_06424015.1| hypothetical protein HMPREF0670_02909 [Prevotella sp. oral taxon 317 str. F0108] # 9 72 1 64 64 123 100.0 4e-27 MKKYIICIMLVHVCISCSNNKNEMPSESFTYECFTYEDSINGIKFTMPNVGEGGLLIVGE VQVMLTPHGQNS Prediction of potential genes in microbial genomes Time: Sat May 28 03:06:46 2011 Seq name: gi|283510439|gb|ACQH01000180.1| Prevotella sp. oral taxon 317 str. F0108 cont2.180, whole genome shotgun sequence Length of sequence - 785 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 736 836 ## Hoch_2925 YD repeat protein Predicted protein(s) >gi|283510439|gb|ACQH01000180.1| GENE 1 1 - 736 836 245 aa, chain - ## HITS:1 COG:no KEGG:Hoch_2925 NR:ns ## KEGG: Hoch_2925 # Name: not_defined # Def: YD repeat protein # Organism: H.ochraceum # Pathway: not_defined # 9 245 2284 2539 3456 148 39.0 2e-34 MSTTVHNTYDAYGNLASYKETSTDYELRADIAYHELRERHIVGLPRHIAVMDKAGRVYRE RSTEVDGKGDVTRITMHNGQLPSVYDMAYDAYGNLAALTKPANHKGQRMRYEYAYDGVLH QLVTGVKDAYGYTSSTDYDCRWGAPLETRDINGNRMRYAYDDAGRATAIVGPKELAAGKP YTVRFEYHPTGRWARTLHYAPEGDVETRTFADSLMRAVQTKRTGVVWKGGAAHKVSIVSG RMVQD Prediction of potential genes in microbial genomes Time: Sat May 28 03:06:50 2011 Seq name: gi|283510438|gb|ACQH01000181.1| Prevotella sp. oral taxon 317 str. F0108 cont2.181, whole genome shotgun sequence Length of sequence - 780 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 727 878 ## COG3209 Rhs family protein Predicted protein(s) >gi|283510438|gb|ACQH01000181.1| GENE 1 1 - 727 878 242 aa, chain - ## HITS:1 COG:PA2458 KEGG:ns NR:ns ## COG: PA2458 COG3209 # Protein_GI_number: 15597654 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Pseudomonas aeruginosa # 84 242 199 350 373 63 31.0 4e-10 MGLYNKGVGDLQATTEYDAYDRPTKVTLPDGATTTTAYAIVDHDGEPMLETRITDALGRH AESYTDRKGRNRETLQHAASGDVRVKYGYDAVGQVLSVHHPNGKTTAYEYDLLGHKLKVN HPDAGEVTCTYDAAGNLLTKLTAELKKTISDKAAISYTYDYERLSEVLYPKNLFNRVTYT YGKPGAKYNRAGRLVLVEDASGGEAYYYGSQGEVVKTVRSVMVSQADVRTYVHAATYDSH NR Prediction of potential genes in microbial genomes Time: Sat May 28 03:06:51 2011 Seq name: gi|283510437|gb|ACQH01000182.1| Prevotella sp. oral taxon 317 str. F0108 cont2.182, whole genome shotgun sequence Length of sequence - 775 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 37 - 372 176 ## gi|288803514|ref|ZP_06408945.1| hypothetical protein HMPREF0660_01950 2 1 Op 2 . + CDS 405 - 596 145 ## gi|288930208|ref|ZP_06424018.1| hypothetical protein HMPREF0670_02912 Predicted protein(s) >gi|283510437|gb|ACQH01000182.1| GENE 1 37 - 372 176 111 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288803514|ref|ZP_06408945.1| ## NR: gi|288803514|ref|ZP_06408945.1| hypothetical protein HMPREF0660_01950 [Prevotella melaninogenica D18] # 6 105 44 166 246 63 35.0 6e-09 MQVGRFRPNINTGIAKKVNINGYYVVTHKRVGTFEPFILYSDGTFGNIVFKNRDSLYEQK KQDADLMQEIISSEKGFCGGGGYYEIKGDTLEVDKVYRYQLRKVLAKIALR >gi|283510437|gb|ACQH01000182.1| GENE 2 405 - 596 145 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288930208|ref|ZP_06424018.1| ## NR: gi|288930208|ref|ZP_06424018.1| hypothetical protein HMPREF0670_02912 [Prevotella sp. oral taxon 317 str. 
F0108] # 1 63 1 63 63 102 100.0 6e-21 MWVVSKKEDVPEPYDVLYEFVPAKNLPPSTSFGCKLNKYMWENKADWKAYKQRMKQQMIM KKW Prediction of potential genes in microbial genomes Time: Sat May 28 03:07:02 2011 Seq name: gi|283510436|gb|ACQH01000183.1| Prevotella sp. oral taxon 317 str. F0108 cont2.183, whole genome shotgun sequence Length of sequence - 756 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 35 - 755 512 ## PG1032 ISPg3, transposase Predicted protein(s) >gi|283510436|gb|ACQH01000183.1| GENE 1 35 - 755 512 240 aa, chain + ## HITS:1 COG:no KEGG:PG1032 NR:ns ## KEGG: PG1032 # Name: not_defined # Def: ISPg3, transposase # Organism: P.gingivalis # Pathway: not_defined # 2 240 36 274 300 247 48.0 3e-64 MRNRKGQMSKSEIMTILLCYHFGSFRNFKHYYLFFIKEHLASYFPKAVSYTRFVELMPRV FFDLMAFMRIQGFGKCTGISFVDSTMIPVCHNMRRKFNKVFDGLAKNGKGTMGWCHGFKL HLLCNEMGDVLTFCLTPANVDDRDPMVWKVFTKVLYGKVFADKGYIKQEFFENLFNQGIH LVHGLKSNMKNKLMPLWDKMMLRKRYIIECINELLKNKANLVHSRHRSVHNFLMNLCAAL Prediction of potential genes in microbial genomes Time: Sat May 28 03:07:06 2011 Seq name: gi|283510435|gb|ACQH01000184.1| Prevotella sp. oral taxon 317 str. F0108 cont2.184, whole genome shotgun sequence Length of sequence - 746 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 699 244 ## BVU_0020 putative transposase Predicted protein(s) >gi|283510435|gb|ACQH01000184.1| GENE 1 3 - 699 244 232 aa, chain - ## HITS:1 COG:no KEGG:BVU_0020 NR:ns ## KEGG: BVU_0020 # Name: not_defined # Def: putative transposase # Organism: B.vulgatus # Pathway: not_defined # 1 232 22 253 273 342 68.0 7e-93 MLIMILFHDSGYRCFKHFYLEKVYKHLRHLFPNVVSYNRLVELEREVAVPLTLFIKKVLL GKCTGISFVDSTPLHVCKNRRIHIHKVFKGIAQRGKCSMGWFFGFKLHLICNEKGELLNF MITPGDVDDRKPLEYKAFIDFIYGKPVGDKGYISKNLFQRLFVDGIQLITKLKSNMKGAL MSVSDRLLLRKRAIIETVNDELKNIAQVEHSGHRCFDNFIVNLLGAIAAYCL Prediction of potential genes in microbial genomes Time: Sat May 28 03:07:10 2011 Seq name: gi|283510434|gb|ACQH01000185.1| Prevotella sp. oral taxon 317 str. F0108 cont2.185, whole genome shotgun sequence Length of sequence - 739 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 532 113 ## gi|288930214|ref|ZP_06424021.1| hypothetical protein HMPREF0670_02915 - Prom 588 - 647 5.7 Predicted protein(s) >gi|283510434|gb|ACQH01000185.1| GENE 1 1 - 532 113 177 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930214|ref|ZP_06424021.1| ## NR: gi|288930214|ref|ZP_06424021.1| hypothetical protein HMPREF0670_02915 [Prevotella sp. oral taxon 317 str. F0108] # 1 177 1 177 177 355 100.0 7e-97 MICTVKHVLLIVIMAVFCSCGSRTSKSSENVRKNVPATKYRRYATEELAIALDISMRPDA VYDDDTPFYDQCRKFCNLSDDEIETIKTILSDKQKWQKVGTKFPTLNISVYYRQYLAYRK HGHVYVLVNLFKYYYMVFLGNDVLGVGAPQKGITIISLANDRGRNKYDNVTILLDLS Prediction of potential genes in microbial genomes Time: Sat May 28 03:07:18 2011 Seq name: gi|283510433|gb|ACQH01000186.1| Prevotella sp. oral taxon 317 str. F0108 cont2.186, whole genome shotgun sequence Length of sequence - 736 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 624 278 ## Dacet_0812 transposase IS4 family protein - Prom 665 - 724 4.9 Predicted protein(s) >gi|283510433|gb|ACQH01000186.1| GENE 1 3 - 624 278 207 aa, chain - ## HITS:1 COG:no KEGG:Dacet_0812 NR:ns ## KEGG: Dacet_0812 # Name: not_defined # Def: transposase IS4 family protein # Organism: D.acetiphilus # Pathway: not_defined # 24 205 20 200 466 92 26.0 1e-17 MEAKIEKISELSKLLSVKSRMSDDLFHLFGKFGIGHLLSRLSLEKHDGVSASELILSLCL FRILGESIHSICKRKIYELSNHGKNCFYRMMIRPQMDWRRLMNHFALRYMCLLRKYGEAP QSDATTCFIIDDTVLEKSGVRMEGISRIFDHVKGRCVLGYKLLLCAFFDGKTTIPFDFSL HQEKGKQGDCGLTRQQRRKAYHTKRND Prediction of potential genes in microbial genomes Time: Sat May 28 03:07:22 2011 Seq name: gi|283510432|gb|ACQH01000187.1| Prevotella sp. oral taxon 317 str. F0108 cont2.187, whole genome shotgun sequence Length of sequence - 735 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 682 859 ## COG3209 Rhs family protein Predicted protein(s) >gi|283510432|gb|ACQH01000187.1| GENE 1 1 - 682 859 227 aa, chain - ## HITS:1 COG:PA2458 KEGG:ns NR:ns ## COG: PA2458 COG3209 # Protein_GI_number: 15597654 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Pseudomonas aeruginosa # 84 225 199 334 373 59 32.0 4e-09 MGLYNKGVGDLQATTEYDAYDRPTKVTLPDGATTTTAYAIVDHDGEPMLETRVTDALGRH AESYTDEKGRNRETVQHAASGDVRVKYGYDAVGQVLTVHHPNGKTATYQYDLLGHKLKVN HPDAGEVTCTYDAAGNLLTKLTAELKKTISDKASITYTYDYERLSEVLYPKNLFNRVTYT YGKPGAKYNRAGRLVLVEDASGGEAYYYGNQGEVVKTVRSVMVSQAD Prediction of potential genes in microbial genomes Time: Sat May 28 03:07:22 2011 Seq name: gi|283510431|gb|ACQH01000188.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.188, whole genome shotgun sequence Length of sequence - 730 bp Number of predicted genes - 2, with homology - 0 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 17 - 76 2.9 1 1 Tu 1 . + CDS 114 - 299 172 ## + Term 331 - 381 -0.8 + Prom 384 - 443 4.2 2 2 Tu 1 . + CDS 486 - 729 129 ## Predicted protein(s) >gi|283510431|gb|ACQH01000188.1| GENE 1 114 - 299 172 61 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDGALVLLNELLRNTPINGLNPCFNGWCTRTRLITDYFAALKDVLILVLMDGALVQCYSC Y >gi|283510431|gb|ACQH01000188.1| GENE 2 486 - 729 129 81 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVHSYSVDLNDGGTQNQQVLILVLMDGALVQIIIYTELQVFCQCLNPCFNGWCTRTSAST FNPLNALGLNPCFNGWCTRTY Prediction of potential genes in microbial genomes Time: Sat May 28 03:07:32 2011 Seq name: gi|283510430|gb|ACQH01000189.1| Prevotella sp. oral taxon 317 str. F0108 cont2.189, whole genome shotgun sequence Length of sequence - 726 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 31 - 510 154 ## Xaut_1719 hypothetical protein + Term 539 - 574 2.3 Predicted protein(s) >gi|283510430|gb|ACQH01000189.1| GENE 1 31 - 510 154 159 aa, chain + ## HITS:1 COG:no KEGG:Xaut_1719 NR:ns ## KEGG: Xaut_1719 # Name: not_defined # Def: hypothetical protein # Organism: X.autotrophicus # Pathway: not_defined # 17 154 57 203 226 138 51.0 8e-32 MDANTLLSKGLVDVSNGASVRKKGSSRIDSDEYYVIELCNEVLKMEALQQYCFEFLLGDT GRKLPVDAFYEGLNLVIEYYESQHTESTPFFDNKKTVSGVSRGEQRRLYDERRRTELPKH GIKLIILRYSDFGTTKRLKRDNREHDIEVVRKKLAEFIP Prediction of potential genes in microbial genomes Time: Sat May 28 03:07:35 2011 Seq name: gi|283510429|gb|ACQH01000190.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.190, whole genome shotgun sequence Length of sequence - 693 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 691 770 ## COG3209 Rhs family protein Predicted protein(s) >gi|283510429|gb|ACQH01000190.1| GENE 1 1 - 691 770 230 aa, chain + ## HITS:1 COG:RSp1137 KEGG:ns NR:ns ## COG: RSp1137 COG3209 # Protein_GI_number: 17549358 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Ralstonia solanacearum # 4 208 829 1024 1517 79 32.0 6e-15 TREYGYDAYDRLAWVENERGERYAFERDLNGEVIAERGFDGRERRYGRDGDGTVLVTHLP DGTTIHHQHDLAGRLTYSRYPDGSWEAWEYDRAGRLCKAGDPHGETVFERDALGRIVREI RNGHAIEHAYDSRSRLTHTLSDLGVDITHSYDDAGLPESVKAIVQGMPHPWEAHLRHDRL GRETLRTMTGGVACAMQYDGVGRPSRQSVIRGGHSLYNRSYRWDDDFRLS Prediction of potential genes in microbial genomes Time: Sat May 28 03:07:36 2011 Seq name: gi|283510428|gb|ACQH01000191.1| Prevotella sp. oral taxon 317 str. F0108 cont2.191, whole genome shotgun sequence Length of sequence - 692 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 485 383 ## gi|288930225|ref|ZP_06424026.1| hypothetical protein HMPREF0670_02920 Predicted protein(s) >gi|283510428|gb|ACQH01000191.1| GENE 1 2 - 485 383 161 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930225|ref|ZP_06424026.1| ## NR: gi|288930225|ref|ZP_06424026.1| hypothetical protein HMPREF0670_02920 [Prevotella sp. oral taxon 317 str. F0108] # 1 161 1 161 161 290 100.0 2e-77 MKSLVFILAAFIVAGSCGAQRKVKVTKIKGGKQMTTEKIDTQRFHWNKDKNNDYTFVNYK GQKVIQSRDMDNGVYYYYETRRKENELIEEYRRYFNDGKLNVEGFQYKDYGFPMGIWRTY DEKGKLIETEDYDAPFKNYPWEEVRKFLERERGIDFFDKRT Prediction of potential genes in microbial genomes Time: Sat May 28 03:07:44 2011 Seq name: gi|283510427|gb|ACQH01000192.1| Prevotella sp. 
oral taxon 317 str. F0108 cont2.192, whole genome shotgun sequence Length of sequence - 687 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Sat May 28 03:07:45 2011 Seq name: gi|283510426|gb|ACQH01000193.1| Prevotella sp. oral taxon 317 str. F0108 cont2.193, whole genome shotgun sequence Length of sequence - 672 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 206 224 ## gi|288930131|ref|ZP_06423968.1| FG-GAP repeat protein Predicted protein(s) >gi|283510426|gb|ACQH01000193.1| GENE 1 2 - 206 224 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930131|ref|ZP_06423968.1| ## NR: gi|288930131|ref|ZP_06423968.1| FG-GAP repeat protein [Prevotella sp. oral taxon 317 str. F0108] # 1 68 664 731 782 134 95.0 2e-30 MRYAYDDMGRPSTIVGPKEIAAGKPYTIKFEYHPAGRYARTVHYAPEGDIETYTFADSLM RAVQTKQT Prediction of potential genes in microbial genomes Time: Sat May 28 03:07:50 2011 Seq name: gi|283510425|gb|ACQH01000194.1| Prevotella sp. oral taxon 317 str. F0108 cont2.194, whole genome shotgun sequence Length of sequence - 643 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 341 - 641 220 ## BVU_3552 putative transposase Predicted protein(s) >gi|283510425|gb|ACQH01000194.1| GENE 1 341 - 641 220 100 aa, chain + ## HITS:1 COG:no KEGG:BVU_3552 NR:ns ## KEGG: BVU_3552 # Name: not_defined # Def: putative transposase # Organism: B.vulgatus # Pathway: not_defined # 1 92 72 163 452 120 57.0 2e-26 MFLKPYTGLSDDGLIELLNGSIHMQMFCGVLIDPANPIKDGKIVSAIRNRLARHLDIDGL QRILYARWEGDLKDKDLCLTDATCYESHLRFPDRRQTALG Prediction of potential genes in microbial genomes Time: Sat May 28 03:07:52 2011 Seq name: gi|283510424|gb|ACQH01000195.1| Prevotella sp. oral taxon 317 str. 
F0108 cont2.195, whole genome shotgun sequence Length of sequence - 640 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 227 - 526 169 ## Predicted protein(s) >gi|283510424|gb|ACQH01000195.1| GENE 1 227 - 526 169 99 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDMAQTSSRTGQKDYPMFMMQIEKQRNYHDGNGGFCITWFSRKAGRTFCTFQIEGLLEVI LHILYKLKVSPLGHLMRKNLRLLQTDKVTNMSNHKLMNC Prediction of potential genes in microbial genomes Time: Sat May 28 03:07:58 2011 Seq name: gi|283510423|gb|ACQH01000196.1| Prevotella sp. oral taxon 317 str. F0108 cont2.196, whole genome shotgun sequence Length of sequence - 625 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 90 - 149 7.6 1 1 Tu 1 . + CDS 260 - 526 154 ## Predicted protein(s) >gi|283510423|gb|ACQH01000196.1| GENE 1 260 - 526 154 88 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEDKTKQVIAHTIYTICKILAREFRLALKPLGIGIIICAIFLTVAEWSDVIFGHNWKEPF ITLGIIGFFLPMIVKYYKRIKAWVLKWK Prediction of potential genes in microbial genomes Time: Sat May 28 03:08:04 2011 Seq name: gi|283510422|gb|ACQH01000197.1| Prevotella sp. oral taxon 317 str. F0108 cont2.197, whole genome shotgun sequence Length of sequence - 615 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 14 - 562 393 ## Hoch_2925 YD repeat protein Predicted protein(s) >gi|283510422|gb|ACQH01000197.1| GENE 1 14 - 562 393 182 aa, chain + ## HITS:1 COG:no KEGG:Hoch_2925 NR:ns ## KEGG: Hoch_2925 # Name: not_defined # Def: YD repeat protein # Organism: H.ochraceum # Pathway: not_defined # 27 122 2978 3072 3456 73 39.0 4e-12 MAKGAKYEMTMTFGRMSEPLTKVQKVDSTKTAQSYDFTYQYEDSNHPTAPTQIGHEHYTY DANGNPTLVENDSLNSERRMYWDEDNRLMVLSDNGKTSRYTYNAASERIVKSHGDLEGVY VNGAPQGITADFVDFRLRLGKTSELLFHSACTEVPRNGRLHHLSCPDYHRNEEPLYQALL HR Prediction of potential genes in microbial genomes Time: Sat May 28 03:08:08 2011 Seq name: gi|283510421|gb|ACQH01000198.1| Prevotella sp. oral taxon 317 str. F0108 cont2.198, whole genome shotgun sequence Length of sequence - 603 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 22 - 53 -0.6 1 1 Tu 1 . - CDS 194 - 472 86 ## - Prom 501 - 560 3.9 Predicted protein(s) >gi|283510421|gb|ACQH01000198.1| GENE 1 194 - 472 86 92 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKESLTPTPWSIYRMNINTSKPFIILSRIIIIIQRVVYHRIKENNRIFDPFFLLSDMTFL KYLFLQHKSLNSSTCYLLSCDKEANKYNYHLL Prediction of potential genes in microbial genomes Time: Sat May 28 03:08:13 2011 Seq name: gi|283510420|gb|ACQH01000199.1| Prevotella sp. oral taxon 317 str. F0108 cont2.199, whole genome shotgun sequence Length of sequence - 598 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 215 130 ## gi|260592356|ref|ZP_05857814.1| hypothetical protein HMPREF0973_01804 + Term 357 - 399 2.2 Predicted protein(s) >gi|283510420|gb|ACQH01000199.1| GENE 1 3 - 215 130 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260592356|ref|ZP_05857814.1| ## NR: gi|260592356|ref|ZP_05857814.1| hypothetical protein HMPREF0973_01804 [Prevotella veroralis F0319] # 2 63 232 293 295 95 72.0 9e-19 RLNVDFEESYTFLKNGTIAQAELYLNTSVSNWYAVATGHRLIQVTARGLEVLKAAEDMVE TLEMNEPNTE Prediction of potential genes in microbial genomes Time: Sat May 28 03:08:18 2011 Seq name: gi|283510419|gb|ACQH01000200.1| Prevotella sp. oral taxon 317 str. F0108 cont2.200, whole genome shotgun sequence Length of sequence - 588 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 588 665 ## Cpin_4580 YD repeat protein Predicted protein(s) >gi|283510419|gb|ACQH01000200.1| GENE 1 3 - 588 665 195 aa, chain - ## HITS:1 COG:no KEGG:Cpin_4580 NR:ns ## KEGG: Cpin_4580 # Name: not_defined # Def: YD repeat protein # Organism: C.pinensis # Pathway: not_defined # 5 195 962 1145 1401 142 37.0 1e-32 VSGRVARYDYDEFDNLISAEYERGGEVERLYRVPDRMGNLFESREKDDRKYDAGGRLAED REHFYHYDCEGNLVFKEFKEMALRGGVIAPINKERLEAELGITFRAFGTGWRYDWQSDGM LARVVRPDGKEVSFAYDALGRRIRKSFAGTTTHFVWDGNVPLHEWTETAESEESVITWLF EQDTFVPAAKLVANG Prediction of potential genes in microbial genomes Time: Sat May 28 03:08:22 2011 Seq name: gi|283510418|gb|ACQH01000201.1| Prevotella sp. oral taxon 317 str. F0108 cont2.201, whole genome shotgun sequence Length of sequence - 568 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 97 - 438 256 ## gi|288930240|ref|ZP_06424031.1| hypothetical protein HMPREF0670_02925 Predicted protein(s) >gi|283510418|gb|ACQH01000201.1| GENE 1 97 - 438 256 113 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930240|ref|ZP_06424031.1| ## NR: gi|288930240|ref|ZP_06424031.1| hypothetical protein HMPREF0670_02925 [Prevotella sp. oral taxon 317 str. F0108] # 1 113 44 156 156 228 100.0 9e-59 MDGGLKELQKIYMDSFDYFLRDNSCDNPVYDDILNRISLGILLNIPDENFMQLVDYVQRL DEEAKPADWTPDLLLWFLLNSRLKDNEKRTHAQKLGFPRECKGAIQSNASHRQ Prediction of potential genes in microbial genomes Time: Sat May 28 03:08:28 2011 Seq name: gi|283510417|gb|ACQH01000202.1| Prevotella sp. oral taxon 317 str. F0108 cont2.202, whole genome shotgun sequence Length of sequence - 551 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 333 - 551 61 ## gi|288930094|ref|ZP_06423933.1| hypothetical protein HMPREF0670_02827 Predicted protein(s) >gi|283510417|gb|ACQH01000202.1| GENE 1 333 - 551 61 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|288930094|ref|ZP_06423933.1| ## NR: gi|288930094|ref|ZP_06423933.1| hypothetical protein HMPREF0670_02827 [Prevotella sp. oral taxon 317 str. F0108] # 2 70 142 210 210 94 72.0 2e-18 GEEVRKFLERERGVDFFDERTTVDRYVDAKNPAGWGIIYYDKKNDRRMYIDLDCATRKIM KEYEVTVDTDSY Prediction of potential genes in microbial genomes Time: Sat May 28 03:08:34 2011 Seq name: gi|283510416|gb|ACQH01000203.1| Prevotella sp. oral taxon 317 str. F0108 cont2.203, whole genome shotgun sequence Length of sequence - 533 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 520 161 ## BF2845 hypothetical protein Predicted protein(s) >gi|283510416|gb|ACQH01000203.1| GENE 1 1 - 520 161 173 aa, chain - ## HITS:1 COG:no KEGG:BF2845 NR:ns ## KEGG: BF2845 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 10 103 7 101 587 65 35.0 1e-09 MNLWTPNNDIAVSIGGKPLFSFKSLRLHQPINDHHRFELALDFEVAAKALEDGTEWLGKR IQIKRREAKSFLFVGVVTKVNACRENLDGGEVRVSGYSTTYLLGERRELPLVARPHAGRD SQGAVRQGGRAGAGGTRTRQVRGVRLPIPGIGLHVHPAAGQEVPGVAVLRRRQ Prediction of potential genes in microbial genomes Time: Sat May 28 03:08:37 2011 Seq name: gi|283510415|gb|ACQH01000204.1| Prevotella sp. oral taxon 317 str. F0108 cont2.204, whole genome shotgun sequence Length of sequence - 526 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Sat May 28 03:08:38 2011 Seq name: gi|283510414|gb|ACQH01000205.1| Prevotella sp. oral taxon 317 str. F0108 cont2.205, whole genome shotgun sequence Length of sequence - 506 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 14 - 262 188 ## 2 1 Op 2 . + CDS 277 - 505 402 ## gi|260911294|ref|ZP_05917893.1| conserved hypothetical protein Predicted protein(s) >gi|283510414|gb|ACQH01000205.1| GENE 1 14 - 262 188 82 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPFYDKKEYDGYFYRDGHLIVLYGVNQSRNILERKWIKRMEGGIPHFRHVTIKRWNYPYP LKMEILSNGKVRILSLEEGFFI >gi|283510414|gb|ACQH01000205.1| GENE 2 277 - 505 402 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260911294|ref|ZP_05917893.1| ## NR: gi|260911294|ref|ZP_05917893.1| conserved hypothetical protein [Prevotella sp. oral taxon 472 str. F0295] # 6 76 2240 2310 3065 136 97.0 4e-31 MSNFITYDYERLSEVLYPKNLFNRVTYTYGKPGEKYNRAGRLVLVEDASGGEAYYYGNQG EVVKTVRSVMVSTADV